* [PATCH 00/10] s390: virtio: support protected virtualization
@ 2019-04-26 18:32 ` Halil Pasic
  0 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-04-26 18:32 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

Enhanced virtualization protection technology may require the use of
bounce buffers for I/O. While support for this was built into the virtio
core, virtio-ccw wasn't changed accordingly.
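
The bounce-buffer idea above can be illustrated with a small, hypothetical userspace model (names like shared_pool, bounce_out and bounce_in are illustrative, not kernel API): the host may only touch a "shared" staging area, so the driver copies data in and out of it around each I/O.

```c
#include <assert.h>
#include <string.h>

/* Toy model of bounce buffering: the host ("device") may only access
 * the shared staging area, never guest-private memory directly. */
enum { SHARED_POOL_SIZE = 4096 };
static unsigned char shared_pool[SHARED_POOL_SIZE]; /* shared with the host */
static unsigned char priv_buf[8];                   /* guest-private memory */

/* Copy private data into the shared area before the host reads it. */
static size_t bounce_out(const void *priv, size_t len)
{
	if (len > SHARED_POOL_SIZE)
		len = SHARED_POOL_SIZE;
	memcpy(shared_pool, priv, len);
	return len;
}

/* Copy data the host wrote into the shared area back to private memory. */
static size_t bounce_in(void *priv, size_t len)
{
	if (len > SHARED_POOL_SIZE)
		len = SHARED_POOL_SIZE;
	memcpy(priv, shared_pool, len);
	return len;
}
```

This is, in spirit, what swiotlb does for the driver transparently behind the DMA API.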

Some background on the technology (not part of this series) and the
terminology used:

* Protected Virtualization (PV):

Protected Virtualization guarantees that non-shared memory of a guest
that operates in PV mode is private to that guest, i.e. any attempt by
the hypervisor or other guests to access it will result in an
exception. If supported by the environment (machine, KVM, guest VM), a
guest can decide to switch into PV mode by issuing the appropriate
ultravisor calls. Unlike some other enhanced virtualization protection
technologies, the protection does not rely on the guest transparently
encrypting its memory; access to guest-private memory is policed
instead (see below).

* Ultravisor:

A hardware/firmware entity that manages PV guests and polices access to
their memory. A prospective PV guest needs to interact with the
ultravisor to enter PV mode, and potentially to share pages (for I/O,
which should be encrypted by the guest). A guest interacts with the
ultravisor via so-called ultravisor calls. A hypervisor needs to
interact with the ultravisor to facilitate interpretation, emulation
and swapping; it does so via ultravisor calls and via the SIE state
description. Generally, the ultravisor sanitizes hypervisor inputs so
that the guest cannot be corrupted (except for denial of service).
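
As a rough illustration of the "polices access" point, here is a toy model (not the actual ultravisor interface; uv_share, uv_unshare and hv_read are made-up names): each guest page is either private or shared, and a hypervisor access to a private page fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: the "ultravisor" tracks which guest pages are shared. */
enum { NPAGES = 8 };
static bool page_shared[NPAGES];                     /* all private by default */
static unsigned char guest_mem[NPAGES] = { 10, 20, 30 };

/* Guest-side "ultravisor calls" to share/unshare a page; return 0 on
 * success, mimicking a return code. */
static int uv_share(size_t page)   { page_shared[page] = true;  return 0; }
static int uv_unshare(size_t page) { page_shared[page] = false; return 0; }

/* Hypervisor access is policed: a private page causes an "exception"
 * (modeled here as a -1 return) instead of leaking data. */
static int hv_read(size_t page)
{
	if (page >= NPAGES || !page_shared[page])
		return -1; /* exception: access to guest-private memory */
	return guest_mem[page];
}
```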


What needs to be done
=====================

Thus, what needs to be done to bring virtio-ccw up to speed with respect
to protected virtualization is:
* use some 'new' common virtio infrastructure (vring_create_virtqueue()
  and the DMA-aware accessors)
* make sure that virtio-ccw specific structures use shared memory when
  talking to the hypervisor (except control/communication blocks like the
  ORB; those are handled by the ultravisor)
* make sure the DMA API does what is necessary to talk through shared
  memory if we are a protected virtualization guest
* make sure the common I/O layer plays along as well (airqs, sense)


Important notes
================

* This patch set is based on Martin's 'features' branch
 (git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git branch
 'features').

* Documentation is still very sketchy. I'm committed to improving this,
  but I'm currently hampered by some dependencies.

* The existing naming in the common infrastructure (kernel internal
  interfaces) is pretty much based on the AMD SEV terminology. Thus the
  names aren't always perfect. There might be merit to changing these
  names to more abstract ones. I did not put much thought into that at
  the current stage.

* Testing: Please use iommu_platform=on for any virtio devices you are
  going to test this code with (so virtio actually uses the DMA API).

Change log
==========

RFC --> v1:
* Fixed bugs found by Connie (may_reduce and handling reduced,  warning,
  split move -- thanks Connie!).
* Fixed console bug found by Sebastian (thanks Sebastian!).
* Removed the completely useless duplicate of dma-mapping.h spotted by
  Christoph (thanks Christoph!).
* Don't use the global DMA pool for subchannel and ccw device
  owned memory as requested by Sebastian. Consequences:
	* Both subchannel and ccw devices have their dma masks
	now (both specifying 31 bit addressable)
	* We require at least 2 DMA pages per ccw device now, most of
	this memory is wasted though.
	* DMA memory allocated by virtio is also 31 bit addressable now
        as virtio uses the parent (which is the ccw device).
* Enabled packed ring.
* Rebased onto Martin's 'features' branch; using the actual uv
  (ultravisor) interface instead of TODO comments.
* Added some explanations to the cover letter (Connie, David).
* Squashed a couple of patches together and fixed some text stuff. 
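
For the 31-bit masks mentioned above, the check the DMA layer effectively performs can be sketched like this (DMA_BIT_MASK mirrors the kernel macro of the same name; the dma_fits helper is a made-up name for illustration):

```c
#include <assert.h>

/* Mirrors the kernel's DMA_BIT_MASK() macro. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* A buffer is usable for DMA iff its last byte is addressable under
 * the device's mask (helper name is made up). */
static int dma_fits(unsigned long long addr, unsigned long long size,
		    unsigned long long mask)
{
	return addr + size - 1 <= mask;
}
```

With a 31-bit mask, anything at or above 2 GiB is out of reach, which is why the subchannel/ccw device allocations are kept below that line.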

Looking forward to your review, or any other type of input.

Halil Pasic (10):
  virtio/s390: use vring_create_virtqueue
  virtio/s390: DMA support for virtio-ccw
  virtio/s390: enable packed ring
  s390/mm: force swiotlb for protected virtualization
  s390/cio: introduce DMA pools to cio
  s390/cio: add basic protected virtualization support
  s390/airq: use DMA memory for adapter interrupts
  virtio/s390: add indirection to indicators access
  virtio/s390: use DMA memory for ccw I/O and classic notifiers
  virtio/s390: make airq summary indicators DMA

 arch/s390/Kconfig                   |   5 +
 arch/s390/include/asm/airq.h        |   2 +
 arch/s390/include/asm/ccwdev.h      |   4 +
 arch/s390/include/asm/cio.h         |  11 ++
 arch/s390/include/asm/mem_encrypt.h |  18 +++
 arch/s390/mm/init.c                 |  50 +++++++
 drivers/s390/cio/airq.c             |  18 ++-
 drivers/s390/cio/ccwreq.c           |   8 +-
 drivers/s390/cio/cio.h              |   1 +
 drivers/s390/cio/css.c              | 101 +++++++++++++
 drivers/s390/cio/device.c           |  65 +++++++--
 drivers/s390/cio/device_fsm.c       |  40 +++---
 drivers/s390/cio/device_id.c        |  18 +--
 drivers/s390/cio/device_ops.c       |  21 ++-
 drivers/s390/cio/device_pgid.c      |  20 +--
 drivers/s390/cio/device_status.c    |  24 ++--
 drivers/s390/cio/io_sch.h           |  21 ++-
 drivers/s390/virtio/virtio_ccw.c    | 275 +++++++++++++++++++-----------------
 include/linux/virtio.h              |  17 ---
 19 files changed, 499 insertions(+), 220 deletions(-)
 create mode 100644 arch/s390/include/asm/mem_encrypt.h

-- 
2.16.4

^ permalink raw reply	[flat|nested] 182+ messages in thread

* [PATCH 01/10] virtio/s390: use vring_create_virtqueue
  2019-04-26 18:32 ` Halil Pasic
@ 2019-04-26 18:32   ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-04-26 18:32 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

Commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
establishes a new way of allocating virtqueues (as part of the effort
that made the virtio rings DMA-API aware).

In the future we will want virtio-ccw to use the DMA API as well.

Let us switch from the legacy method of allocating virtqueues to
vring_create_virtqueue() as the first step in that direction.
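
The may_reduce behavior of vring_create_virtqueue() (visible below in the hunk that re-reads the size via virtqueue_get_vring_size()) can be sketched as follows; the allocator and the 64-bytes-per-entry budget are invented for illustration:

```c
#include <assert.h>

/* Toy stand-in for "does a ring with this many entries fit in the
 * memory we can get?" -- 64 bytes/entry is an illustrative figure. */
static int ring_alloc_ok(unsigned int num, unsigned long budget)
{
	return (unsigned long)num * 64 <= budget;
}

/* Sketch of the reduction loop: halve the queue size until the
 * allocation succeeds, down to a floor of 2 entries. */
static unsigned int create_ring_num(unsigned int num, int may_reduce,
				    unsigned long budget)
{
	while (!ring_alloc_ok(num, budget)) {
		if (!may_reduce || num <= 2)
			return 0; /* creation fails */
		num /= 2;
	}
	return num;
}
```

This is why the driver must not assume it got the size it asked for and re-reads the actual ring size after creation.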

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
 1 file changed, 11 insertions(+), 19 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 74c328321889..2c66941ef3d0 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -108,7 +108,6 @@ struct virtio_rev_info {
 struct virtio_ccw_vq_info {
 	struct virtqueue *vq;
 	int num;
-	void *queue;
 	union {
 		struct vq_info_block s;
 		struct vq_info_block_legacy l;
@@ -423,7 +422,6 @@ static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
 	struct virtio_ccw_device *vcdev = to_vc_device(vq->vdev);
 	struct virtio_ccw_vq_info *info = vq->priv;
 	unsigned long flags;
-	unsigned long size;
 	int ret;
 	unsigned int index = vq->index;
 
@@ -461,8 +459,6 @@ static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
 			 ret, index);
 
 	vring_del_virtqueue(vq);
-	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
-	free_pages_exact(info->queue, size);
 	kfree(info->info_block);
 	kfree(info);
 }
@@ -494,8 +490,9 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
 	int err;
 	struct virtqueue *vq = NULL;
 	struct virtio_ccw_vq_info *info;
-	unsigned long size = 0; /* silence the compiler */
+	u64 queue;
 	unsigned long flags;
+	bool may_reduce;
 
 	/* Allocate queue. */
 	info = kzalloc(sizeof(struct virtio_ccw_vq_info), GFP_KERNEL);
@@ -516,33 +513,30 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
 		err = info->num;
 		goto out_err;
 	}
-	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
-	info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO);
-	if (info->queue == NULL) {
-		dev_warn(&vcdev->cdev->dev, "no queue\n");
-		err = -ENOMEM;
-		goto out_err;
-	}
+	may_reduce = vcdev->revision > 0;
+	vq = vring_create_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN,
+				    vdev, true, may_reduce, ctx,
+				    virtio_ccw_kvm_notify, callback, name);
 
-	vq = vring_new_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev,
-				 true, ctx, info->queue, virtio_ccw_kvm_notify,
-				 callback, name);
 	if (!vq) {
 		/* For now, we fail if we can't get the requested size. */
 		dev_warn(&vcdev->cdev->dev, "no vq\n");
 		err = -ENOMEM;
 		goto out_err;
 	}
+	/* it may have been reduced */
+	info->num = virtqueue_get_vring_size(vq);
 
 	/* Register it with the host. */
+	queue = virtqueue_get_desc_addr(vq);
 	if (vcdev->revision == 0) {
-		info->info_block->l.queue = (__u64)info->queue;
+		info->info_block->l.queue = queue;
 		info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
 		info->info_block->l.index = i;
 		info->info_block->l.num = info->num;
 		ccw->count = sizeof(info->info_block->l);
 	} else {
-		info->info_block->s.desc = (__u64)info->queue;
+		info->info_block->s.desc = queue;
 		info->info_block->s.index = i;
 		info->info_block->s.num = info->num;
 		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
@@ -572,8 +566,6 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
 	if (vq)
 		vring_del_virtqueue(vq);
 	if (info) {
-		if (info->queue)
-			free_pages_exact(info->queue, size);
 		kfree(info->info_block);
 	}
 	kfree(info);
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 182+ messages in thread

* [PATCH 02/10] virtio/s390: DMA support for virtio-ccw
  2019-04-26 18:32 ` Halil Pasic
@ 2019-04-26 18:32   ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-04-26 18:32 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

Currently virtio-ccw devices do not work if the device has
VIRTIO_F_IOMMU_PLATFORM. In the future we want to support the DMA API
with virtio-ccw.

Let us do the plumbing, so that the feature VIRTIO_F_IOMMU_PLATFORM
works with virtio-ccw.

Let us also switch from the legacy avail/used accessors to the DMA
aware ones (even if it isn't strictly necessary), and remove the legacy
accessors (we were the last users).

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 22 +++++++++++++++-------
 include/linux/virtio.h           | 17 -----------------
 2 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 2c66941ef3d0..42832a164546 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -66,6 +66,7 @@ struct virtio_ccw_device {
 	bool device_lost;
 	unsigned int config_ready;
 	void *airq_info;
+	u64 dma_mask;
 };
 
 struct vq_info_block_legacy {
@@ -539,8 +540,8 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
 		info->info_block->s.desc = queue;
 		info->info_block->s.index = i;
 		info->info_block->s.num = info->num;
-		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
-		info->info_block->s.used = (__u64)virtqueue_get_used(vq);
+		info->info_block->s.avail = (__u64)virtqueue_get_avail_addr(vq);
+		info->info_block->s.used = (__u64)virtqueue_get_used_addr(vq);
 		ccw->count = sizeof(info->info_block->s);
 	}
 	ccw->cmd_code = CCW_CMD_SET_VQ;
@@ -772,10 +773,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev)
 static void ccw_transport_features(struct virtio_device *vdev)
 {
 	/*
-	 * Packed ring isn't enabled on virtio_ccw for now,
-	 * because virtio_ccw uses some legacy accessors,
-	 * e.g. virtqueue_get_avail() and virtqueue_get_used()
-	 * which aren't available in packed ring currently.
+	 * There shouldn't be anything that precludes supporting packed.
+	 * TODO: Remove the limitation after having another look into this.
 	 */
 	__virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED);
 }
@@ -1258,6 +1257,16 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 		ret = -ENOMEM;
 		goto out_free;
 	}
+
+	vcdev->vdev.dev.parent = &cdev->dev;
+	cdev->dev.dma_mask = &vcdev->dma_mask;
+	/* we are fine with common virtio infrastructure using 64 bit DMA */
+	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
+	if (ret) {
+		dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n");
+		goto out_free;
+	}
+
 	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
 				   GFP_DMA | GFP_KERNEL);
 	if (!vcdev->config_block) {
@@ -1272,7 +1281,6 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 
 	vcdev->is_thinint = virtio_ccw_use_airq; /* at least try */
 
-	vcdev->vdev.dev.parent = &cdev->dev;
 	vcdev->vdev.dev.release = virtio_ccw_release_dev;
 	vcdev->vdev.config = &virtio_ccw_config_ops;
 	vcdev->cdev = cdev;
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 673fe3ef3607..15f906e4a748 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -90,23 +90,6 @@ dma_addr_t virtqueue_get_desc_addr(struct virtqueue *vq);
 dma_addr_t virtqueue_get_avail_addr(struct virtqueue *vq);
 dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
 
-/*
- * Legacy accessors -- in almost all cases, these are the wrong functions
- * to use.
- */
-static inline void *virtqueue_get_desc(struct virtqueue *vq)
-{
-	return virtqueue_get_vring(vq)->desc;
-}
-static inline void *virtqueue_get_avail(struct virtqueue *vq)
-{
-	return virtqueue_get_vring(vq)->avail;
-}
-static inline void *virtqueue_get_used(struct virtqueue *vq)
-{
-	return virtqueue_get_vring(vq)->used;
-}
-
 /**
  * virtio_device - representation of a device using virtio
  * @index: unique position on the virtio bus
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 182+ messages in thread

* [PATCH 03/10] virtio/s390: enable packed ring
  2019-04-26 18:32 ` Halil Pasic
@ 2019-04-26 18:32   ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-04-26 18:32 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

Nothing precludes accepting VIRTIO_F_RING_PACKED anymore.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 42832a164546..6d989c360f38 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -773,10 +773,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev)
 static void ccw_transport_features(struct virtio_device *vdev)
 {
 	/*
-	 * There shouldn't be anything that precludes supporting packed.
-	 * TODO: Remove the limitation after having another look into this.
+	 * Currently nothing to do here.
 	 */
-	__virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED);
 }
 
 static int virtio_ccw_finalize_features(struct virtio_device *vdev)
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 182+ messages in thread

* [PATCH 04/10] s390/mm: force swiotlb for protected virtualization
  2019-04-26 18:32 ` Halil Pasic
@ 2019-04-26 18:32   ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-04-26 18:32 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On s390, protected virtualization guests have to use bounce I/O
buffers.  That requires some plumbing.

Let us make sure that any device that correctly uses the DMA API with
direct ops is spared the problems that a hypervisor attempting I/O to a
non-shared page would bring.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 arch/s390/Kconfig                   |  4 +++
 arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++
 arch/s390/mm/init.c                 | 50 +++++++++++++++++++++++++++++++++++++
 3 files changed, 72 insertions(+)
 create mode 100644 arch/s390/include/asm/mem_encrypt.h

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 1c3fcf19c3af..5500d05d4d53 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -1,4 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
+config ARCH_HAS_MEM_ENCRYPT
+        def_bool y
+
 config MMU
 	def_bool y
 
@@ -191,6 +194,7 @@ config S390
 	select ARCH_HAS_SCALED_CPUTIME
 	select VIRT_TO_BUS
 	select HAVE_NMI
+	select SWIOTLB
 
 
 config SCHED_OMIT_FRAME_POINTER
diff --git a/arch/s390/include/asm/mem_encrypt.h b/arch/s390/include/asm/mem_encrypt.h
new file mode 100644
index 000000000000..0898c09a888c
--- /dev/null
+++ b/arch/s390/include/asm/mem_encrypt.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef S390_MEM_ENCRYPT_H__
+#define S390_MEM_ENCRYPT_H__
+
+#ifndef __ASSEMBLY__
+
+#define sme_me_mask	0ULL
+
+static inline bool sme_active(void) { return false; }
+extern bool sev_active(void);
+
+int set_memory_encrypted(unsigned long addr, int numpages);
+int set_memory_decrypted(unsigned long addr, int numpages);
+
+#endif	/* __ASSEMBLY__ */
+
+#endif	/* S390_MEM_ENCRYPT_H__ */
+
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 3e82f66d5c61..7e3cbd15dcfa 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -18,6 +18,7 @@
 #include <linux/mman.h>
 #include <linux/mm.h>
 #include <linux/swap.h>
+#include <linux/swiotlb.h>
 #include <linux/smp.h>
 #include <linux/init.h>
 #include <linux/pagemap.h>
@@ -29,6 +30,7 @@
 #include <linux/export.h>
 #include <linux/cma.h>
 #include <linux/gfp.h>
+#include <linux/dma-mapping.h>
 #include <asm/processor.h>
 #include <linux/uaccess.h>
 #include <asm/pgtable.h>
@@ -42,6 +44,8 @@
 #include <asm/sclp.h>
 #include <asm/set_memory.h>
 #include <asm/kasan.h>
+#include <asm/dma-mapping.h>
+#include <asm/uv.h>
 
 pgd_t swapper_pg_dir[PTRS_PER_PGD] __section(.bss..swapper_pg_dir);
 
@@ -126,6 +130,50 @@ void mark_rodata_ro(void)
 	pr_info("Write protected read-only-after-init data: %luk\n", size >> 10);
 }
 
+int set_memory_encrypted(unsigned long addr, int numpages)
+{
+	int i;
+
+	/* make specified pages unshared again (swiotlb, dma_free) */
+	for (i = 0; i < numpages; ++i) {
+		uv_remove_shared(addr);
+		addr += PAGE_SIZE;
+	}
+	return 0;
+}
+EXPORT_SYMBOL_GPL(set_memory_encrypted);
+
+int set_memory_decrypted(unsigned long addr, int numpages)
+{
+	int i;
+	/* make specified pages shared (swiotlb, dma_alloc) */
+	for (i = 0; i < numpages; ++i) {
+		uv_set_shared(addr);
+		addr += PAGE_SIZE;
+	}
+	return 0;
+}
+EXPORT_SYMBOL_GPL(set_memory_decrypted);
+
+/* are we a protected virtualization guest? */
+bool sev_active(void)
+{
+	return is_prot_virt_guest();
+}
+EXPORT_SYMBOL_GPL(sev_active);
+
+/* protected virtualization */
+static void pv_init(void)
+{
+	if (!sev_active())
+		return;
+
+	/* make sure bounce buffers are shared */
+	swiotlb_init(1);
+	swiotlb_update_mem_attributes();
+	swiotlb_force = SWIOTLB_FORCE;
+}
+
 void __init mem_init(void)
 {
 	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
@@ -134,6 +182,8 @@ void __init mem_init(void)
 	set_max_mapnr(max_low_pfn);
         high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
 
+	pv_init();
+
 	/* Setup guest page hinting */
 	cmma_init();
 
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 182+ messages in thread

* [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-04-26 18:32 ` Halil Pasic
@ 2019-04-26 18:32   ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-04-26 18:32 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

To support protected virtualization, cio will need to make sure the
memory used for communication with the hypervisor is DMA memory.

Let us introduce one global pool for cio, and some tools for creating
pools seated at individual devices.

Our DMA pools are implemented as a gen_pool backed with DMA pages. The
idea is to avoid each allocation effectively wasting a page, as we
typically allocate much less than PAGE_SIZE.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 arch/s390/Kconfig           |   1 +
 arch/s390/include/asm/cio.h |  11 +++++
 drivers/s390/cio/cio.h      |   1 +
 drivers/s390/cio/css.c      | 101 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 114 insertions(+)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 5500d05d4d53..5861311d95d9 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -195,6 +195,7 @@ config S390
 	select VIRT_TO_BUS
 	select HAVE_NMI
 	select SWIOTLB
+	select GENERIC_ALLOCATOR
 
 
 config SCHED_OMIT_FRAME_POINTER
diff --git a/arch/s390/include/asm/cio.h b/arch/s390/include/asm/cio.h
index 1727180e8ca1..43c007d2775a 100644
--- a/arch/s390/include/asm/cio.h
+++ b/arch/s390/include/asm/cio.h
@@ -328,6 +328,17 @@ static inline u8 pathmask_to_pos(u8 mask)
 void channel_subsystem_reinit(void);
 extern void css_schedule_reprobe(void);
 
+extern void *cio_dma_zalloc(size_t size);
+extern void cio_dma_free(void *cpu_addr, size_t size);
+extern struct device *cio_get_dma_css_dev(void);
+
+struct gen_pool;
+void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
+			size_t size);
+void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size);
+void cio_gp_dma_destroy(struct gen_pool *gp_dma, struct device *dma_dev);
+struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages);
+
 /* Function from drivers/s390/cio/chsc.c */
 int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta);
 int chsc_sstpi(void *page, void *result, size_t size);
diff --git a/drivers/s390/cio/cio.h b/drivers/s390/cio/cio.h
index 92eabbb5f18d..f23f7e2c33f7 100644
--- a/drivers/s390/cio/cio.h
+++ b/drivers/s390/cio/cio.h
@@ -113,6 +113,7 @@ struct subchannel {
 	enum sch_todo todo;
 	struct work_struct todo_work;
 	struct schib_config config;
+	u64 dma_mask;
 } __attribute__ ((aligned(8)));
 
 DECLARE_PER_CPU_ALIGNED(struct irb, cio_irb);
diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
index aea502922646..7087cc314fe9 100644
--- a/drivers/s390/cio/css.c
+++ b/drivers/s390/cio/css.c
@@ -20,6 +20,8 @@
 #include <linux/reboot.h>
 #include <linux/suspend.h>
 #include <linux/proc_fs.h>
+#include <linux/genalloc.h>
+#include <linux/dma-mapping.h>
 #include <asm/isc.h>
 #include <asm/crw.h>
 
@@ -199,6 +201,8 @@ static int css_validate_subchannel(struct subchannel_id schid,
 	return err;
 }
 
+static u64 css_dev_dma_mask = DMA_BIT_MASK(31);
+
 struct subchannel *css_alloc_subchannel(struct subchannel_id schid,
 					struct schib *schib)
 {
@@ -224,6 +228,9 @@ struct subchannel *css_alloc_subchannel(struct subchannel_id schid,
 	INIT_WORK(&sch->todo_work, css_sch_todo);
 	sch->dev.release = &css_subchannel_release;
 	device_initialize(&sch->dev);
+	sch->dma_mask = css_dev_dma_mask;
+	sch->dev.dma_mask = &sch->dma_mask;
+	sch->dev.coherent_dma_mask = sch->dma_mask;
 	return sch;
 
 err:
@@ -899,6 +906,9 @@ static int __init setup_css(int nr)
 	dev_set_name(&css->device, "css%x", nr);
 	css->device.groups = cssdev_attr_groups;
 	css->device.release = channel_subsystem_release;
+	/* some cio DMA memory needs to be 31 bit addressable */
+	css->device.coherent_dma_mask = css_dev_dma_mask;
+	css->device.dma_mask = &css_dev_dma_mask;
 
 	mutex_init(&css->mutex);
 	css->cssid = chsc_get_cssid(nr);
@@ -1018,6 +1028,96 @@ static struct notifier_block css_power_notifier = {
 	.notifier_call = css_power_event,
 };
 
+#define POOL_INIT_PAGES 1
+static struct gen_pool *cio_dma_pool;
+/* Currently cio supports only a single css */
+#define  CIO_DMA_GFP (GFP_KERNEL | __GFP_ZERO)
+
+
+struct device *cio_get_dma_css_dev(void)
+{
+	return &channel_subsystems[0]->device;
+}
+
+struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages)
+{
+	struct gen_pool *gp_dma;
+	void *cpu_addr;
+	dma_addr_t dma_addr;
+	int i;
+
+	gp_dma = gen_pool_create(3, -1);
+	if (!gp_dma)
+		return NULL;
+	for (i = 0; i < nr_pages; ++i) {
+		cpu_addr = dma_alloc_coherent(dma_dev, PAGE_SIZE, &dma_addr,
+					      CIO_DMA_GFP);
+		if (!cpu_addr)
+			return gp_dma;
+		gen_pool_add_virt(gp_dma, (unsigned long) cpu_addr,
+				  dma_addr, PAGE_SIZE, -1);
+	}
+	return gp_dma;
+}
+
+static void __gp_dma_free_dma(struct gen_pool *pool,
+			      struct gen_pool_chunk *chunk, void *data)
+{
+	dma_free_coherent((struct device *) data, PAGE_SIZE,
+			 (void *) chunk->start_addr,
+			 (dma_addr_t) chunk->phys_addr);
+}
+
+void cio_gp_dma_destroy(struct gen_pool *gp_dma, struct device *dma_dev)
+{
+	if (!gp_dma)
+		return;
+	/* this is quite ugly but we have no better idea */
+	gen_pool_for_each_chunk(gp_dma, __gp_dma_free_dma, dma_dev);
+	gen_pool_destroy(gp_dma);
+}
+
+static void __init cio_dma_pool_init(void)
+{
+	/* No need to free up the resources: compiled in */
+	cio_dma_pool = cio_gp_dma_create(cio_get_dma_css_dev(), POOL_INIT_PAGES);
+}
+
+void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
+			size_t size)
+{
+	dma_addr_t dma_addr;
+	unsigned long addr = gen_pool_alloc(gp_dma, size);
+
+	if (!addr) {
+		addr = (unsigned long) dma_alloc_coherent(dma_dev,
+					PAGE_SIZE, &dma_addr, CIO_DMA_GFP);
+		if (!addr)
+			return NULL;
+		gen_pool_add_virt(gp_dma, addr, dma_addr, PAGE_SIZE, -1);
+		addr = gen_pool_alloc(gp_dma, size);
+	}
+	return (void *) addr;
+}
+
+void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size)
+{
+	if (!cpu_addr)
+		return;
+	memset(cpu_addr, 0, size);
+	gen_pool_free(gp_dma, (unsigned long) cpu_addr, size);
+}
+
+void *cio_dma_zalloc(size_t size)
+{
+	return cio_gp_dma_zalloc(cio_dma_pool, cio_get_dma_css_dev(), size);
+}
+
+void cio_dma_free(void *cpu_addr, size_t size)
+{
+	cio_gp_dma_free(cio_dma_pool, cpu_addr, size);
+}
+
 /*
  * Now that the driver core is running, we can setup our channel subsystem.
  * The struct subchannel's are created during probing.
@@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
 		unregister_reboot_notifier(&css_reboot_notifier);
 		goto out_unregister;
 	}
+	cio_dma_pool_init();
 	css_init_done = 1;
 
 	/* Enable default isc for I/O subchannels. */
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 182+ messages in thread

* [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-04-26 18:32 ` Halil Pasic
@ 2019-04-26 18:32   ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-04-26 18:32 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

As virtio-ccw devices are channel devices, we need to use the dma area
for any communication with the hypervisor.

This patch addresses the most basic stuff (mostly what is required for
virtio-ccw); it does not take care of QDIO or of any other devices.

An interesting side effect is that virtio structures are now going to
get allocated in 31 bit addressable storage.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 arch/s390/include/asm/ccwdev.h   |  4 +++
 drivers/s390/cio/ccwreq.c        |  8 ++---
 drivers/s390/cio/device.c        | 65 +++++++++++++++++++++++++++++++++-------
 drivers/s390/cio/device_fsm.c    | 40 ++++++++++++-------------
 drivers/s390/cio/device_id.c     | 18 +++++------
 drivers/s390/cio/device_ops.c    | 21 +++++++++++--
 drivers/s390/cio/device_pgid.c   | 20 ++++++-------
 drivers/s390/cio/device_status.c | 24 +++++++--------
 drivers/s390/cio/io_sch.h        | 21 +++++++++----
 drivers/s390/virtio/virtio_ccw.c | 10 -------
 10 files changed, 148 insertions(+), 83 deletions(-)

diff --git a/arch/s390/include/asm/ccwdev.h b/arch/s390/include/asm/ccwdev.h
index a29dd430fb40..865ce1cb86d5 100644
--- a/arch/s390/include/asm/ccwdev.h
+++ b/arch/s390/include/asm/ccwdev.h
@@ -226,6 +226,10 @@ extern int ccw_device_enable_console(struct ccw_device *);
 extern void ccw_device_wait_idle(struct ccw_device *);
 extern int ccw_device_force_console(struct ccw_device *);
 
+extern void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size);
+extern void ccw_device_dma_free(struct ccw_device *cdev,
+				void *cpu_addr, size_t size);
+
 int ccw_device_siosl(struct ccw_device *);
 
 extern void ccw_device_get_schid(struct ccw_device *, struct subchannel_id *);
diff --git a/drivers/s390/cio/ccwreq.c b/drivers/s390/cio/ccwreq.c
index 603268a33ea1..dafbceb311b3 100644
--- a/drivers/s390/cio/ccwreq.c
+++ b/drivers/s390/cio/ccwreq.c
@@ -63,7 +63,7 @@ static void ccwreq_stop(struct ccw_device *cdev, int rc)
 		return;
 	req->done = 1;
 	ccw_device_set_timeout(cdev, 0);
-	memset(&cdev->private->irb, 0, sizeof(struct irb));
+	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 	if (rc && rc != -ENODEV && req->drc)
 		rc = req->drc;
 	req->callback(cdev, req->data, rc);
@@ -86,7 +86,7 @@ static void ccwreq_do(struct ccw_device *cdev)
 			continue;
 		}
 		/* Perform start function. */
-		memset(&cdev->private->irb, 0, sizeof(struct irb));
+		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 		rc = cio_start(sch, cp, (u8) req->mask);
 		if (rc == 0) {
 			/* I/O started successfully. */
@@ -169,7 +169,7 @@ int ccw_request_cancel(struct ccw_device *cdev)
  */
 static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb)
 {
-	struct irb *irb = &cdev->private->irb;
+	struct irb *irb = &cdev->private->dma_area->irb;
 	struct cmd_scsw *scsw = &irb->scsw.cmd;
 	enum uc_todo todo;
 
@@ -187,7 +187,7 @@ static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb)
 		CIO_TRACE_EVENT(2, "sensedata");
 		CIO_HEX_EVENT(2, &cdev->private->dev_id,
 			      sizeof(struct ccw_dev_id));
-		CIO_HEX_EVENT(2, &cdev->private->irb.ecw, SENSE_MAX_COUNT);
+		CIO_HEX_EVENT(2, &cdev->private->dma_area->irb.ecw, SENSE_MAX_COUNT);
 		/* Check for command reject. */
 		if (irb->ecw[0] & SNS0_CMD_REJECT)
 			return IO_REJECTED;
diff --git a/drivers/s390/cio/device.c b/drivers/s390/cio/device.c
index 1540229a37bb..a3310ee14a4a 100644
--- a/drivers/s390/cio/device.c
+++ b/drivers/s390/cio/device.c
@@ -24,6 +24,7 @@
 #include <linux/timer.h>
 #include <linux/kernel_stat.h>
 #include <linux/sched/signal.h>
+#include <linux/dma-mapping.h>
 
 #include <asm/ccwdev.h>
 #include <asm/cio.h>
@@ -687,6 +688,9 @@ ccw_device_release(struct device *dev)
 	struct ccw_device *cdev;
 
 	cdev = to_ccwdev(dev);
+	cio_gp_dma_free(cdev->private->dma_pool, cdev->private->dma_area,
+			sizeof(*cdev->private->dma_area));
+	cio_gp_dma_destroy(cdev->private->dma_pool, &cdev->dev);
 	/* Release reference of parent subchannel. */
 	put_device(cdev->dev.parent);
 	kfree(cdev->private);
@@ -696,15 +700,31 @@ ccw_device_release(struct device *dev)
 static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
 {
 	struct ccw_device *cdev;
+	struct gen_pool *dma_pool;
 
 	cdev  = kzalloc(sizeof(*cdev), GFP_KERNEL);
-	if (cdev) {
-		cdev->private = kzalloc(sizeof(struct ccw_device_private),
-					GFP_KERNEL | GFP_DMA);
-		if (cdev->private)
-			return cdev;
-	}
+	if (!cdev)
+		goto err_cdev;
+	cdev->private = kzalloc(sizeof(struct ccw_device_private),
+				GFP_KERNEL | GFP_DMA);
+	if (!cdev->private)
+		goto err_priv;
+	cdev->dev.dma_mask = &cdev->private->dma_mask;
+	*cdev->dev.dma_mask = *sch->dev.dma_mask;
+	cdev->dev.coherent_dma_mask = sch->dev.coherent_dma_mask;
+	dma_pool = cio_gp_dma_create(&cdev->dev, 1);
+	cdev->private->dma_pool = dma_pool;
+	cdev->private->dma_area = cio_gp_dma_zalloc(dma_pool, &cdev->dev,
+					sizeof(*cdev->private->dma_area));
+	if (!cdev->private->dma_area)
+		goto err_dma_area;
+	return cdev;
+err_dma_area:
+	cio_gp_dma_destroy(dma_pool, &cdev->dev);
+	kfree(cdev->private);
+err_priv:
 	kfree(cdev);
+err_cdev:
 	return ERR_PTR(-ENOMEM);
 }
 
@@ -884,7 +904,7 @@ io_subchannel_recog_done(struct ccw_device *cdev)
 			wake_up(&ccw_device_init_wq);
 		break;
 	case DEV_STATE_OFFLINE:
-		/* 
+		/*
 		 * We can't register the device in interrupt context so
 		 * we schedule a work item.
 		 */
@@ -1062,6 +1082,14 @@ static int io_subchannel_probe(struct subchannel *sch)
 	if (!io_priv)
 		goto out_schedule;
 
+	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
+				sizeof(*io_priv->dma_area),
+				&io_priv->dma_area_dma, GFP_KERNEL);
+	if (!io_priv->dma_area) {
+		kfree(io_priv);
+		goto out_schedule;
+	}
+
 	set_io_private(sch, io_priv);
 	css_schedule_eval(sch->schid);
 	return 0;
@@ -1088,6 +1116,8 @@ static int io_subchannel_remove(struct subchannel *sch)
 	set_io_private(sch, NULL);
 	spin_unlock_irq(sch->lock);
 out_free:
+	dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
+			  io_priv->dma_area, io_priv->dma_area_dma);
 	kfree(io_priv);
 	sysfs_remove_group(&sch->dev.kobj, &io_subchannel_attr_group);
 	return 0;
@@ -1593,20 +1623,31 @@ struct ccw_device * __init ccw_device_create_console(struct ccw_driver *drv)
 		return ERR_CAST(sch);
 
 	io_priv = kzalloc(sizeof(*io_priv), GFP_KERNEL | GFP_DMA);
-	if (!io_priv) {
-		put_device(&sch->dev);
-		return ERR_PTR(-ENOMEM);
-	}
+	if (!io_priv)
+		goto err_priv;
+	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
+				sizeof(*io_priv->dma_area),
+				&io_priv->dma_area_dma, GFP_KERNEL);
+	if (!io_priv->dma_area)
+		goto err_dma_area;
 	set_io_private(sch, io_priv);
 	cdev = io_subchannel_create_ccwdev(sch);
 	if (IS_ERR(cdev)) {
 		put_device(&sch->dev);
+		dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
+				  io_priv->dma_area, io_priv->dma_area_dma);
 		kfree(io_priv);
 		return cdev;
 	}
 	cdev->drv = drv;
 	ccw_device_set_int_class(cdev);
 	return cdev;
+
+err_dma_area:
+	kfree(io_priv);
+err_priv:
+	put_device(&sch->dev);
+	return ERR_PTR(-ENOMEM);
 }
 
 void __init ccw_device_destroy_console(struct ccw_device *cdev)
@@ -1617,6 +1658,8 @@ void __init ccw_device_destroy_console(struct ccw_device *cdev)
 	set_io_private(sch, NULL);
 	put_device(&sch->dev);
 	put_device(&cdev->dev);
+	dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
+			  io_priv->dma_area, io_priv->dma_area_dma);
 	kfree(io_priv);
 }
 
diff --git a/drivers/s390/cio/device_fsm.c b/drivers/s390/cio/device_fsm.c
index 9169af7dbb43..fea23b44795b 100644
--- a/drivers/s390/cio/device_fsm.c
+++ b/drivers/s390/cio/device_fsm.c
@@ -67,8 +67,8 @@ static void ccw_timeout_log(struct ccw_device *cdev)
 			       sizeof(struct tcw), 0);
 	} else {
 		printk(KERN_WARNING "cio: orb indicates command mode\n");
-		if ((void *)(addr_t)orb->cmd.cpa == &private->sense_ccw ||
-		    (void *)(addr_t)orb->cmd.cpa == cdev->private->iccws)
+		if ((void *)(addr_t)orb->cmd.cpa == &private->dma_area->sense_ccw ||
+		    (void *)(addr_t)orb->cmd.cpa == cdev->private->dma_area->iccws)
 			printk(KERN_WARNING "cio: last channel program "
 			       "(intern):\n");
 		else
@@ -143,18 +143,18 @@ ccw_device_cancel_halt_clear(struct ccw_device *cdev)
 void ccw_device_update_sense_data(struct ccw_device *cdev)
 {
 	memset(&cdev->id, 0, sizeof(cdev->id));
-	cdev->id.cu_type   = cdev->private->senseid.cu_type;
-	cdev->id.cu_model  = cdev->private->senseid.cu_model;
-	cdev->id.dev_type  = cdev->private->senseid.dev_type;
-	cdev->id.dev_model = cdev->private->senseid.dev_model;
+	cdev->id.cu_type   = cdev->private->dma_area->senseid.cu_type;
+	cdev->id.cu_model  = cdev->private->dma_area->senseid.cu_model;
+	cdev->id.dev_type  = cdev->private->dma_area->senseid.dev_type;
+	cdev->id.dev_model = cdev->private->dma_area->senseid.dev_model;
 }
 
 int ccw_device_test_sense_data(struct ccw_device *cdev)
 {
-	return cdev->id.cu_type == cdev->private->senseid.cu_type &&
-		cdev->id.cu_model == cdev->private->senseid.cu_model &&
-		cdev->id.dev_type == cdev->private->senseid.dev_type &&
-		cdev->id.dev_model == cdev->private->senseid.dev_model;
+	return cdev->id.cu_type == cdev->private->dma_area->senseid.cu_type &&
+		cdev->id.cu_model == cdev->private->dma_area->senseid.cu_model &&
+		cdev->id.dev_type == cdev->private->dma_area->senseid.dev_type &&
+		cdev->id.dev_model == cdev->private->dma_area->senseid.dev_model;
 }
 
 /*
@@ -342,7 +342,7 @@ ccw_device_done(struct ccw_device *cdev, int state)
 		cio_disable_subchannel(sch);
 
 	/* Reset device status. */
-	memset(&cdev->private->irb, 0, sizeof(struct irb));
+	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 
 	cdev->private->state = state;
 
@@ -509,13 +509,13 @@ void ccw_device_verify_done(struct ccw_device *cdev, int err)
 		ccw_device_done(cdev, DEV_STATE_ONLINE);
 		/* Deliver fake irb to device driver, if needed. */
 		if (cdev->private->flags.fake_irb) {
-			create_fake_irb(&cdev->private->irb,
+			create_fake_irb(&cdev->private->dma_area->irb,
 					cdev->private->flags.fake_irb);
 			cdev->private->flags.fake_irb = 0;
 			if (cdev->handler)
 				cdev->handler(cdev, cdev->private->intparm,
-					      &cdev->private->irb);
-			memset(&cdev->private->irb, 0, sizeof(struct irb));
+					      &cdev->private->dma_area->irb);
+			memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 		}
 		ccw_device_report_path_events(cdev);
 		ccw_device_handle_broken_paths(cdev);
@@ -672,7 +672,7 @@ ccw_device_online_verify(struct ccw_device *cdev, enum dev_event dev_event)
 
 	if (scsw_actl(&sch->schib.scsw) != 0 ||
 	    (scsw_stctl(&sch->schib.scsw) & SCSW_STCTL_STATUS_PEND) ||
-	    (scsw_stctl(&cdev->private->irb.scsw) & SCSW_STCTL_STATUS_PEND)) {
+	    (scsw_stctl(&cdev->private->dma_area->irb.scsw) & SCSW_STCTL_STATUS_PEND)) {
 		/*
 		 * No final status yet or final status not yet delivered
 		 * to the device driver. Can't do path verification now,
@@ -719,7 +719,7 @@ static int ccw_device_call_handler(struct ccw_device *cdev)
 	 *  - fast notification was requested (primary status)
 	 *  - unsolicited interrupts
 	 */
-	stctl = scsw_stctl(&cdev->private->irb.scsw);
+	stctl = scsw_stctl(&cdev->private->dma_area->irb.scsw);
 	ending_status = (stctl & SCSW_STCTL_SEC_STATUS) ||
 		(stctl == (SCSW_STCTL_ALERT_STATUS | SCSW_STCTL_STATUS_PEND)) ||
 		(stctl == SCSW_STCTL_STATUS_PEND);
@@ -735,9 +735,9 @@ static int ccw_device_call_handler(struct ccw_device *cdev)
 
 	if (cdev->handler)
 		cdev->handler(cdev, cdev->private->intparm,
-			      &cdev->private->irb);
+			      &cdev->private->dma_area->irb);
 
-	memset(&cdev->private->irb, 0, sizeof(struct irb));
+	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 	return 1;
 }
 
@@ -759,7 +759,7 @@ ccw_device_irq(struct ccw_device *cdev, enum dev_event dev_event)
 			/* Unit check but no sense data. Need basic sense. */
 			if (ccw_device_do_sense(cdev, irb) != 0)
 				goto call_handler_unsol;
-			memcpy(&cdev->private->irb, irb, sizeof(struct irb));
+			memcpy(&cdev->private->dma_area->irb, irb, sizeof(struct irb));
 			cdev->private->state = DEV_STATE_W4SENSE;
 			cdev->private->intparm = 0;
 			return;
@@ -842,7 +842,7 @@ ccw_device_w4sense(struct ccw_device *cdev, enum dev_event dev_event)
 	if (scsw_fctl(&irb->scsw) &
 	    (SCSW_FCTL_CLEAR_FUNC | SCSW_FCTL_HALT_FUNC)) {
 		cdev->private->flags.dosense = 0;
-		memset(&cdev->private->irb, 0, sizeof(struct irb));
+		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 		ccw_device_accumulate_irb(cdev, irb);
 		goto call_handler;
 	}
diff --git a/drivers/s390/cio/device_id.c b/drivers/s390/cio/device_id.c
index f6df83a9dfbb..ea8a0fc6c0b6 100644
--- a/drivers/s390/cio/device_id.c
+++ b/drivers/s390/cio/device_id.c
@@ -99,7 +99,7 @@ static int diag210_to_senseid(struct senseid *senseid, struct diag210 *diag)
 static int diag210_get_dev_info(struct ccw_device *cdev)
 {
 	struct ccw_dev_id *dev_id = &cdev->private->dev_id;
-	struct senseid *senseid = &cdev->private->senseid;
+	struct senseid *senseid = &cdev->private->dma_area->senseid;
 	struct diag210 diag_data;
 	int rc;
 
@@ -134,8 +134,8 @@ static int diag210_get_dev_info(struct ccw_device *cdev)
 static void snsid_init(struct ccw_device *cdev)
 {
 	cdev->private->flags.esid = 0;
-	memset(&cdev->private->senseid, 0, sizeof(cdev->private->senseid));
-	cdev->private->senseid.cu_type = 0xffff;
+	memset(&cdev->private->dma_area->senseid, 0, sizeof(cdev->private->dma_area->senseid));
+	cdev->private->dma_area->senseid.cu_type = 0xffff;
 }
 
 /*
@@ -143,16 +143,16 @@ static void snsid_init(struct ccw_device *cdev)
  */
 static int snsid_check(struct ccw_device *cdev, void *data)
 {
-	struct cmd_scsw *scsw = &cdev->private->irb.scsw.cmd;
+	struct cmd_scsw *scsw = &cdev->private->dma_area->irb.scsw.cmd;
 	int len = sizeof(struct senseid) - scsw->count;
 
 	/* Check for incomplete SENSE ID data. */
 	if (len < SENSE_ID_MIN_LEN)
 		goto out_restart;
-	if (cdev->private->senseid.cu_type == 0xffff)
+	if (cdev->private->dma_area->senseid.cu_type == 0xffff)
 		goto out_restart;
 	/* Check for incompatible SENSE ID data. */
-	if (cdev->private->senseid.reserved != 0xff)
+	if (cdev->private->dma_area->senseid.reserved != 0xff)
 		return -EOPNOTSUPP;
 	/* Check for extended-identification information. */
 	if (len > SENSE_ID_BASIC_LEN)
@@ -170,7 +170,7 @@ static int snsid_check(struct ccw_device *cdev, void *data)
 static void snsid_callback(struct ccw_device *cdev, void *data, int rc)
 {
 	struct ccw_dev_id *id = &cdev->private->dev_id;
-	struct senseid *senseid = &cdev->private->senseid;
+	struct senseid *senseid = &cdev->private->dma_area->senseid;
 	int vm = 0;
 
 	if (rc && MACHINE_IS_VM) {
@@ -200,7 +200,7 @@ void ccw_device_sense_id_start(struct ccw_device *cdev)
 {
 	struct subchannel *sch = to_subchannel(cdev->dev.parent);
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 
 	CIO_TRACE_EVENT(4, "snsid");
 	CIO_HEX_EVENT(4, &cdev->private->dev_id, sizeof(cdev->private->dev_id));
@@ -208,7 +208,7 @@ void ccw_device_sense_id_start(struct ccw_device *cdev)
 	snsid_init(cdev);
 	/* Channel program setup. */
 	cp->cmd_code	= CCW_CMD_SENSE_ID;
-	cp->cda		= (u32) (addr_t) &cdev->private->senseid;
+	cp->cda		= (u32) (addr_t) &cdev->private->dma_area->senseid;
 	cp->count	= sizeof(struct senseid);
 	cp->flags	= CCW_FLAG_SLI;
 	/* Request setup. */
diff --git a/drivers/s390/cio/device_ops.c b/drivers/s390/cio/device_ops.c
index 4435ae0b3027..be4acfa9265a 100644
--- a/drivers/s390/cio/device_ops.c
+++ b/drivers/s390/cio/device_ops.c
@@ -429,8 +429,8 @@ struct ciw *ccw_device_get_ciw(struct ccw_device *cdev, __u32 ct)
 	if (cdev->private->flags.esid == 0)
 		return NULL;
 	for (ciw_cnt = 0; ciw_cnt < MAX_CIWS; ciw_cnt++)
-		if (cdev->private->senseid.ciw[ciw_cnt].ct == ct)
-			return cdev->private->senseid.ciw + ciw_cnt;
+		if (cdev->private->dma_area->senseid.ciw[ciw_cnt].ct == ct)
+			return cdev->private->dma_area->senseid.ciw + ciw_cnt;
 	return NULL;
 }
 
@@ -699,6 +699,23 @@ void ccw_device_get_schid(struct ccw_device *cdev, struct subchannel_id *schid)
 }
 EXPORT_SYMBOL_GPL(ccw_device_get_schid);
 
+/*
+ * Allocate zeroed, coherent, 31-bit addressable DMA memory from the
+ * subchannel's DMA pool. The maximum supported allocation size is
+ * PAGE_SIZE.
+ */
+void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size)
+{
+	return cio_gp_dma_zalloc(cdev->private->dma_pool, &cdev->dev, size);
+}
+EXPORT_SYMBOL(ccw_device_dma_zalloc);
+
+void ccw_device_dma_free(struct ccw_device *cdev, void *cpu_addr, size_t size)
+{
+	cio_gp_dma_free(cdev->private->dma_pool, cpu_addr, size);
+}
+EXPORT_SYMBOL(ccw_device_dma_free);
+
 EXPORT_SYMBOL(ccw_device_set_options_mask);
 EXPORT_SYMBOL(ccw_device_set_options);
 EXPORT_SYMBOL(ccw_device_clear_options);
diff --git a/drivers/s390/cio/device_pgid.c b/drivers/s390/cio/device_pgid.c
index d30a3babf176..e97baa89cbf8 100644
--- a/drivers/s390/cio/device_pgid.c
+++ b/drivers/s390/cio/device_pgid.c
@@ -57,7 +57,7 @@ static void verify_done(struct ccw_device *cdev, int rc)
 static void nop_build_cp(struct ccw_device *cdev)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 
 	cp->cmd_code	= CCW_CMD_NOOP;
 	cp->cda		= 0;
@@ -134,9 +134,9 @@ static void nop_callback(struct ccw_device *cdev, void *data, int rc)
 static void spid_build_cp(struct ccw_device *cdev, u8 fn)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 	int i = pathmask_to_pos(req->lpm);
-	struct pgid *pgid = &cdev->private->pgid[i];
+	struct pgid *pgid = &cdev->private->dma_area->pgid[i];
 
 	pgid->inf.fc	= fn;
 	cp->cmd_code	= CCW_CMD_SET_PGID;
@@ -300,7 +300,7 @@ static int pgid_cmp(struct pgid *p1, struct pgid *p2)
 static void pgid_analyze(struct ccw_device *cdev, struct pgid **p,
 			 int *mismatch, u8 *reserved, u8 *reset)
 {
-	struct pgid *pgid = &cdev->private->pgid[0];
+	struct pgid *pgid = &cdev->private->dma_area->pgid[0];
 	struct pgid *first = NULL;
 	int lpm;
 	int i;
@@ -342,7 +342,7 @@ static u8 pgid_to_donepm(struct ccw_device *cdev)
 		lpm = 0x80 >> i;
 		if ((cdev->private->pgid_valid_mask & lpm) == 0)
 			continue;
-		pgid = &cdev->private->pgid[i];
+		pgid = &cdev->private->dma_area->pgid[i];
 		if (sch->opm & lpm) {
 			if (pgid->inf.ps.state1 != SNID_STATE1_GROUPED)
 				continue;
@@ -368,7 +368,7 @@ static void pgid_fill(struct ccw_device *cdev, struct pgid *pgid)
 	int i;
 
 	for (i = 0; i < 8; i++)
-		memcpy(&cdev->private->pgid[i], pgid, sizeof(struct pgid));
+		memcpy(&cdev->private->dma_area->pgid[i], pgid, sizeof(struct pgid));
 }
 
 /*
@@ -435,12 +435,12 @@ static void snid_done(struct ccw_device *cdev, int rc)
 static void snid_build_cp(struct ccw_device *cdev)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 	int i = pathmask_to_pos(req->lpm);
 
 	/* Channel program setup. */
 	cp->cmd_code	= CCW_CMD_SENSE_PGID;
-	cp->cda		= (u32) (addr_t) &cdev->private->pgid[i];
+	cp->cda		= (u32) (addr_t) &cdev->private->dma_area->pgid[i];
 	cp->count	= sizeof(struct pgid);
 	cp->flags	= CCW_FLAG_SLI;
 	req->cp		= cp;
@@ -516,7 +516,7 @@ static void verify_start(struct ccw_device *cdev)
 	sch->lpm = sch->schib.pmcw.pam;
 
 	/* Initialize PGID data. */
-	memset(cdev->private->pgid, 0, sizeof(cdev->private->pgid));
+	memset(cdev->private->dma_area->pgid, 0, sizeof(cdev->private->dma_area->pgid));
 	cdev->private->pgid_valid_mask = 0;
 	cdev->private->pgid_todo_mask = sch->schib.pmcw.pam;
 	cdev->private->path_notoper_mask = 0;
@@ -626,7 +626,7 @@ struct stlck_data {
 static void stlck_build_cp(struct ccw_device *cdev, void *buf1, void *buf2)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 
 	cp[0].cmd_code = CCW_CMD_STLCK;
 	cp[0].cda = (u32) (addr_t) buf1;
diff --git a/drivers/s390/cio/device_status.c b/drivers/s390/cio/device_status.c
index 7d5c7892b2c4..a9aabde604f4 100644
--- a/drivers/s390/cio/device_status.c
+++ b/drivers/s390/cio/device_status.c
@@ -79,15 +79,15 @@ ccw_device_accumulate_ecw(struct ccw_device *cdev, struct irb *irb)
 	 * are condition that have to be met for the extended control
 	 * bit to have meaning. Sick.
 	 */
-	cdev->private->irb.scsw.cmd.ectl = 0;
+	cdev->private->dma_area->irb.scsw.cmd.ectl = 0;
 	if ((irb->scsw.cmd.stctl & SCSW_STCTL_ALERT_STATUS) &&
 	    !(irb->scsw.cmd.stctl & SCSW_STCTL_INTER_STATUS))
-		cdev->private->irb.scsw.cmd.ectl = irb->scsw.cmd.ectl;
+		cdev->private->dma_area->irb.scsw.cmd.ectl = irb->scsw.cmd.ectl;
 	/* Check if extended control word is valid. */
-	if (!cdev->private->irb.scsw.cmd.ectl)
+	if (!cdev->private->dma_area->irb.scsw.cmd.ectl)
 		return;
 	/* Copy concurrent sense / model dependent information. */
-	memcpy (&cdev->private->irb.ecw, irb->ecw, sizeof (irb->ecw));
+	memcpy (&cdev->private->dma_area->irb.ecw, irb->ecw, sizeof (irb->ecw));
 }
 
 /*
@@ -118,7 +118,7 @@ ccw_device_accumulate_esw(struct ccw_device *cdev, struct irb *irb)
 	if (!ccw_device_accumulate_esw_valid(irb))
 		return;
 
-	cdev_irb = &cdev->private->irb;
+	cdev_irb = &cdev->private->dma_area->irb;
 
 	/* Copy last path used mask. */
 	cdev_irb->esw.esw1.lpum = irb->esw.esw1.lpum;
@@ -210,7 +210,7 @@ ccw_device_accumulate_irb(struct ccw_device *cdev, struct irb *irb)
 		ccw_device_path_notoper(cdev);
 	/* No irb accumulation for transport mode irbs. */
 	if (scsw_is_tm(&irb->scsw)) {
-		memcpy(&cdev->private->irb, irb, sizeof(struct irb));
+		memcpy(&cdev->private->dma_area->irb, irb, sizeof(struct irb));
 		return;
 	}
 	/*
@@ -219,7 +219,7 @@ ccw_device_accumulate_irb(struct ccw_device *cdev, struct irb *irb)
 	if (!scsw_is_solicited(&irb->scsw))
 		return;
 
-	cdev_irb = &cdev->private->irb;
+	cdev_irb = &cdev->private->dma_area->irb;
 
 	/*
 	 * If the clear function had been performed, all formerly pending
@@ -227,7 +227,7 @@ ccw_device_accumulate_irb(struct ccw_device *cdev, struct irb *irb)
 	 * intermediate accumulated status to the device driver.
 	 */
 	if (irb->scsw.cmd.fctl & SCSW_FCTL_CLEAR_FUNC)
-		memset(&cdev->private->irb, 0, sizeof(struct irb));
+		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 
 	/* Copy bits which are valid only for the start function. */
 	if (irb->scsw.cmd.fctl & SCSW_FCTL_START_FUNC) {
@@ -329,9 +329,9 @@ ccw_device_do_sense(struct ccw_device *cdev, struct irb *irb)
 	/*
 	 * We have ending status but no sense information. Do a basic sense.
 	 */
-	sense_ccw = &to_io_private(sch)->sense_ccw;
+	sense_ccw = &to_io_private(sch)->dma_area->sense_ccw;
 	sense_ccw->cmd_code = CCW_CMD_BASIC_SENSE;
-	sense_ccw->cda = (__u32) __pa(cdev->private->irb.ecw);
+	sense_ccw->cda = (__u32) __pa(cdev->private->dma_area->irb.ecw);
 	sense_ccw->count = SENSE_MAX_COUNT;
 	sense_ccw->flags = CCW_FLAG_SLI;
 
@@ -364,7 +364,7 @@ ccw_device_accumulate_basic_sense(struct ccw_device *cdev, struct irb *irb)
 
 	if (!(irb->scsw.cmd.dstat & DEV_STAT_UNIT_CHECK) &&
 	    (irb->scsw.cmd.dstat & DEV_STAT_CHN_END)) {
-		cdev->private->irb.esw.esw0.erw.cons = 1;
+		cdev->private->dma_area->irb.esw.esw0.erw.cons = 1;
 		cdev->private->flags.dosense = 0;
 	}
 	/* Check if path verification is required. */
@@ -386,7 +386,7 @@ ccw_device_accumulate_and_sense(struct ccw_device *cdev, struct irb *irb)
 	/* Check for basic sense. */
 	if (cdev->private->flags.dosense &&
 	    !(irb->scsw.cmd.dstat & DEV_STAT_UNIT_CHECK)) {
-		cdev->private->irb.esw.esw0.erw.cons = 1;
+		cdev->private->dma_area->irb.esw.esw0.erw.cons = 1;
 		cdev->private->flags.dosense = 0;
 		return 0;
 	}
diff --git a/drivers/s390/cio/io_sch.h b/drivers/s390/cio/io_sch.h
index 90e4e3a7841b..4cf38c98cd0b 100644
--- a/drivers/s390/cio/io_sch.h
+++ b/drivers/s390/cio/io_sch.h
@@ -9,15 +9,20 @@
 #include "css.h"
 #include "orb.h"
 
+struct io_subchannel_dma_area {
+	struct ccw1 sense_ccw;	/* static ccw for sense command */
+};
+
 struct io_subchannel_private {
 	union orb orb;		/* operation request block */
-	struct ccw1 sense_ccw;	/* static ccw for sense command */
 	struct ccw_device *cdev;/* pointer to the child ccw device */
 	struct {
 		unsigned int suspend:1;	/* allow suspend */
 		unsigned int prefetch:1;/* deny prefetch */
 		unsigned int inter:1;	/* suppress intermediate interrupts */
 	} __packed options;
+	struct io_subchannel_dma_area *dma_area;
+	dma_addr_t dma_area_dma;
 } __aligned(8);
 
 #define to_io_private(n) ((struct io_subchannel_private *) \
@@ -115,6 +120,13 @@ enum cdev_todo {
 #define FAKE_CMD_IRB	1
 #define FAKE_TM_IRB	2
 
+struct ccw_device_dma_area {
+	struct senseid senseid;	/* SenseID info */
+	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
+	struct irb irb;		/* device status */
+	struct pgid pgid[8];	/* path group IDs per chpid*/
+};
+
 struct ccw_device_private {
 	struct ccw_device *cdev;
 	struct subchannel *sch;
@@ -156,11 +168,7 @@ struct ccw_device_private {
 	} __attribute__((packed)) flags;
 	unsigned long intparm;	/* user interruption parameter */
 	struct qdio_irq *qdio_data;
-	struct irb irb;		/* device status */
 	int async_kill_io_rc;
-	struct senseid senseid;	/* SenseID info */
-	struct pgid pgid[8];	/* path group IDs per chpid*/
-	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
 	struct work_struct todo_work;
 	enum cdev_todo todo;
 	wait_queue_head_t wait_q;
@@ -169,6 +177,9 @@ struct ccw_device_private {
 	struct list_head cmb_list;	/* list of measured devices */
 	u64 cmb_start_time;		/* clock value of cmb reset */
 	void *cmb_wait;			/* deferred cmb enable/disable */
+	u64 dma_mask;
+	struct gen_pool *dma_pool;
+	struct ccw_device_dma_area *dma_area;
 	enum interruption_class int_class;
 };
 
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 6d989c360f38..bb7a92316fc8 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -66,7 +66,6 @@ struct virtio_ccw_device {
 	bool device_lost;
 	unsigned int config_ready;
 	void *airq_info;
-	u64 dma_mask;
 };
 
 struct vq_info_block_legacy {
@@ -1255,16 +1254,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 		ret = -ENOMEM;
 		goto out_free;
 	}
-
 	vcdev->vdev.dev.parent = &cdev->dev;
-	cdev->dev.dma_mask = &vcdev->dma_mask;
-	/* we are fine with common virtio infrastructure using 64 bit DMA */
-	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
-	if (ret) {
-		dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n");
-		goto out_free;
-	}
-
 	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
 				   GFP_DMA | GFP_KERNEL);
 	if (!vcdev->config_block) {
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 182+ messages in thread

* [PATCH 06/10] s390/cio: add basic protected virtualization support
@ 2019-04-26 18:32   ` Halil Pasic
  0 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-04-26 18:32 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Christoph Hellwig, Thomas Huth, Claudio Imbrenda, Janosch Frank,
	Vasily Gorbik, Michael S. Tsirkin, Farhan Ali, Eric Farman,
	virtualization, Halil Pasic, Viktor Mihajlovski

As virtio-ccw devices are channel devices, we need to use the dma area
for any communication with the hypervisor.

This patch addresses the most basic stuff (mostly what is required for
virtio-ccw), and does not take care of QDIO or any other devices.

An interesting side effect is that virtio structures are now going to
get allocated in 31 bit addressable storage.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 arch/s390/include/asm/ccwdev.h   |  4 +++
 drivers/s390/cio/ccwreq.c        |  8 ++---
 drivers/s390/cio/device.c        | 65 +++++++++++++++++++++++++++++++++-------
 drivers/s390/cio/device_fsm.c    | 40 ++++++++++++-------------
 drivers/s390/cio/device_id.c     | 18 +++++------
 drivers/s390/cio/device_ops.c    | 21 +++++++++++--
 drivers/s390/cio/device_pgid.c   | 20 ++++++-------
 drivers/s390/cio/device_status.c | 24 +++++++--------
 drivers/s390/cio/io_sch.h        | 21 +++++++++----
 drivers/s390/virtio/virtio_ccw.c | 10 -------
 10 files changed, 148 insertions(+), 83 deletions(-)

diff --git a/arch/s390/include/asm/ccwdev.h b/arch/s390/include/asm/ccwdev.h
index a29dd430fb40..865ce1cb86d5 100644
--- a/arch/s390/include/asm/ccwdev.h
+++ b/arch/s390/include/asm/ccwdev.h
@@ -226,6 +226,10 @@ extern int ccw_device_enable_console(struct ccw_device *);
 extern void ccw_device_wait_idle(struct ccw_device *);
 extern int ccw_device_force_console(struct ccw_device *);
 
+extern void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size);
+extern void ccw_device_dma_free(struct ccw_device *cdev,
+				void *cpu_addr, size_t size);
+
 int ccw_device_siosl(struct ccw_device *);
 
 extern void ccw_device_get_schid(struct ccw_device *, struct subchannel_id *);
diff --git a/drivers/s390/cio/ccwreq.c b/drivers/s390/cio/ccwreq.c
index 603268a33ea1..dafbceb311b3 100644
--- a/drivers/s390/cio/ccwreq.c
+++ b/drivers/s390/cio/ccwreq.c
@@ -63,7 +63,7 @@ static void ccwreq_stop(struct ccw_device *cdev, int rc)
 		return;
 	req->done = 1;
 	ccw_device_set_timeout(cdev, 0);
-	memset(&cdev->private->irb, 0, sizeof(struct irb));
+	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 	if (rc && rc != -ENODEV && req->drc)
 		rc = req->drc;
 	req->callback(cdev, req->data, rc);
@@ -86,7 +86,7 @@ static void ccwreq_do(struct ccw_device *cdev)
 			continue;
 		}
 		/* Perform start function. */
-		memset(&cdev->private->irb, 0, sizeof(struct irb));
+		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 		rc = cio_start(sch, cp, (u8) req->mask);
 		if (rc == 0) {
 			/* I/O started successfully. */
@@ -169,7 +169,7 @@ int ccw_request_cancel(struct ccw_device *cdev)
  */
 static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb)
 {
-	struct irb *irb = &cdev->private->irb;
+	struct irb *irb = &cdev->private->dma_area->irb;
 	struct cmd_scsw *scsw = &irb->scsw.cmd;
 	enum uc_todo todo;
 
@@ -187,7 +187,7 @@ static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb)
 		CIO_TRACE_EVENT(2, "sensedata");
 		CIO_HEX_EVENT(2, &cdev->private->dev_id,
 			      sizeof(struct ccw_dev_id));
-		CIO_HEX_EVENT(2, &cdev->private->irb.ecw, SENSE_MAX_COUNT);
+		CIO_HEX_EVENT(2, &cdev->private->dma_area->irb.ecw, SENSE_MAX_COUNT);
 		/* Check for command reject. */
 		if (irb->ecw[0] & SNS0_CMD_REJECT)
 			return IO_REJECTED;
diff --git a/drivers/s390/cio/device.c b/drivers/s390/cio/device.c
index 1540229a37bb..a3310ee14a4a 100644
--- a/drivers/s390/cio/device.c
+++ b/drivers/s390/cio/device.c
@@ -24,6 +24,7 @@
 #include <linux/timer.h>
 #include <linux/kernel_stat.h>
 #include <linux/sched/signal.h>
+#include <linux/dma-mapping.h>
 
 #include <asm/ccwdev.h>
 #include <asm/cio.h>
@@ -687,6 +688,9 @@ ccw_device_release(struct device *dev)
 	struct ccw_device *cdev;
 
 	cdev = to_ccwdev(dev);
+	cio_gp_dma_free(cdev->private->dma_pool, cdev->private->dma_area,
+			sizeof(*cdev->private->dma_area));
+	cio_gp_dma_destroy(cdev->private->dma_pool, &cdev->dev);
 	/* Release reference of parent subchannel. */
 	put_device(cdev->dev.parent);
 	kfree(cdev->private);
@@ -696,15 +700,31 @@ ccw_device_release(struct device *dev)
 static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
 {
 	struct ccw_device *cdev;
+	struct gen_pool *dma_pool;
 
 	cdev  = kzalloc(sizeof(*cdev), GFP_KERNEL);
-	if (cdev) {
-		cdev->private = kzalloc(sizeof(struct ccw_device_private),
-					GFP_KERNEL | GFP_DMA);
-		if (cdev->private)
-			return cdev;
-	}
+	if (!cdev)
+		goto err_cdev;
+	cdev->private = kzalloc(sizeof(struct ccw_device_private),
+				GFP_KERNEL | GFP_DMA);
+	if (!cdev->private)
+		goto err_priv;
+	cdev->dev.dma_mask = &cdev->private->dma_mask;
+	*cdev->dev.dma_mask = *sch->dev.dma_mask;
+	cdev->dev.coherent_dma_mask = sch->dev.coherent_dma_mask;
+	dma_pool = cio_gp_dma_create(&cdev->dev, 1);
+	cdev->private->dma_pool = dma_pool;
+	cdev->private->dma_area = cio_gp_dma_zalloc(dma_pool, &cdev->dev,
+					sizeof(*cdev->private->dma_area));
+	if (!cdev->private->dma_area)
+		goto err_dma_area;
+	return cdev;
+err_dma_area:
+	cio_gp_dma_destroy(dma_pool, &cdev->dev);
+	kfree(cdev->private);
+err_priv:
 	kfree(cdev);
+err_cdev:
 	return ERR_PTR(-ENOMEM);
 }
 
@@ -884,7 +904,7 @@ io_subchannel_recog_done(struct ccw_device *cdev)
 			wake_up(&ccw_device_init_wq);
 		break;
 	case DEV_STATE_OFFLINE:
-		/* 
+		/*
 		 * We can't register the device in interrupt context so
 		 * we schedule a work item.
 		 */
@@ -1062,6 +1082,14 @@ static int io_subchannel_probe(struct subchannel *sch)
 	if (!io_priv)
 		goto out_schedule;
 
+	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
+				sizeof(*io_priv->dma_area),
+				&io_priv->dma_area_dma, GFP_KERNEL);
+	if (!io_priv->dma_area) {
+		kfree(io_priv);
+		goto out_schedule;
+	}
+
 	set_io_private(sch, io_priv);
 	css_schedule_eval(sch->schid);
 	return 0;
@@ -1088,6 +1116,8 @@ static int io_subchannel_remove(struct subchannel *sch)
 	set_io_private(sch, NULL);
 	spin_unlock_irq(sch->lock);
 out_free:
+	dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
+			  io_priv->dma_area, io_priv->dma_area_dma);
 	kfree(io_priv);
 	sysfs_remove_group(&sch->dev.kobj, &io_subchannel_attr_group);
 	return 0;
@@ -1593,20 +1623,31 @@ struct ccw_device * __init ccw_device_create_console(struct ccw_driver *drv)
 		return ERR_CAST(sch);
 
 	io_priv = kzalloc(sizeof(*io_priv), GFP_KERNEL | GFP_DMA);
-	if (!io_priv) {
-		put_device(&sch->dev);
-		return ERR_PTR(-ENOMEM);
-	}
+	if (!io_priv)
+		goto err_priv;
+	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
+				sizeof(*io_priv->dma_area),
+				&io_priv->dma_area_dma, GFP_KERNEL);
+	if (!io_priv->dma_area)
+		goto err_dma_area;
 	set_io_private(sch, io_priv);
 	cdev = io_subchannel_create_ccwdev(sch);
 	if (IS_ERR(cdev)) {
 		put_device(&sch->dev);
+		dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
+				  io_priv->dma_area, io_priv->dma_area_dma);
 		kfree(io_priv);
 		return cdev;
 	}
 	cdev->drv = drv;
 	ccw_device_set_int_class(cdev);
 	return cdev;
+
+err_dma_area:
+	kfree(io_priv);
+err_priv:
+	put_device(&sch->dev);
+	return ERR_PTR(-ENOMEM);
 }
 
 void __init ccw_device_destroy_console(struct ccw_device *cdev)
@@ -1617,6 +1658,8 @@ void __init ccw_device_destroy_console(struct ccw_device *cdev)
 	set_io_private(sch, NULL);
 	put_device(&sch->dev);
 	put_device(&cdev->dev);
+	dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
+			  io_priv->dma_area, io_priv->dma_area_dma);
 	kfree(io_priv);
 }
 
diff --git a/drivers/s390/cio/device_fsm.c b/drivers/s390/cio/device_fsm.c
index 9169af7dbb43..fea23b44795b 100644
--- a/drivers/s390/cio/device_fsm.c
+++ b/drivers/s390/cio/device_fsm.c
@@ -67,8 +67,8 @@ static void ccw_timeout_log(struct ccw_device *cdev)
 			       sizeof(struct tcw), 0);
 	} else {
 		printk(KERN_WARNING "cio: orb indicates command mode\n");
-		if ((void *)(addr_t)orb->cmd.cpa == &private->sense_ccw ||
-		    (void *)(addr_t)orb->cmd.cpa == cdev->private->iccws)
+		if ((void *)(addr_t)orb->cmd.cpa == &private->dma_area->sense_ccw ||
+		    (void *)(addr_t)orb->cmd.cpa == cdev->private->dma_area->iccws)
 			printk(KERN_WARNING "cio: last channel program "
 			       "(intern):\n");
 		else
@@ -143,18 +143,18 @@ ccw_device_cancel_halt_clear(struct ccw_device *cdev)
 void ccw_device_update_sense_data(struct ccw_device *cdev)
 {
 	memset(&cdev->id, 0, sizeof(cdev->id));
-	cdev->id.cu_type   = cdev->private->senseid.cu_type;
-	cdev->id.cu_model  = cdev->private->senseid.cu_model;
-	cdev->id.dev_type  = cdev->private->senseid.dev_type;
-	cdev->id.dev_model = cdev->private->senseid.dev_model;
+	cdev->id.cu_type   = cdev->private->dma_area->senseid.cu_type;
+	cdev->id.cu_model  = cdev->private->dma_area->senseid.cu_model;
+	cdev->id.dev_type  = cdev->private->dma_area->senseid.dev_type;
+	cdev->id.dev_model = cdev->private->dma_area->senseid.dev_model;
 }
 
 int ccw_device_test_sense_data(struct ccw_device *cdev)
 {
-	return cdev->id.cu_type == cdev->private->senseid.cu_type &&
-		cdev->id.cu_model == cdev->private->senseid.cu_model &&
-		cdev->id.dev_type == cdev->private->senseid.dev_type &&
-		cdev->id.dev_model == cdev->private->senseid.dev_model;
+	return cdev->id.cu_type == cdev->private->dma_area->senseid.cu_type &&
+		cdev->id.cu_model == cdev->private->dma_area->senseid.cu_model &&
+		cdev->id.dev_type == cdev->private->dma_area->senseid.dev_type &&
+		cdev->id.dev_model == cdev->private->dma_area->senseid.dev_model;
 }
 
 /*
@@ -342,7 +342,7 @@ ccw_device_done(struct ccw_device *cdev, int state)
 		cio_disable_subchannel(sch);
 
 	/* Reset device status. */
-	memset(&cdev->private->irb, 0, sizeof(struct irb));
+	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 
 	cdev->private->state = state;
 
@@ -509,13 +509,13 @@ void ccw_device_verify_done(struct ccw_device *cdev, int err)
 		ccw_device_done(cdev, DEV_STATE_ONLINE);
 		/* Deliver fake irb to device driver, if needed. */
 		if (cdev->private->flags.fake_irb) {
-			create_fake_irb(&cdev->private->irb,
+			create_fake_irb(&cdev->private->dma_area->irb,
 					cdev->private->flags.fake_irb);
 			cdev->private->flags.fake_irb = 0;
 			if (cdev->handler)
 				cdev->handler(cdev, cdev->private->intparm,
-					      &cdev->private->irb);
-			memset(&cdev->private->irb, 0, sizeof(struct irb));
+					      &cdev->private->dma_area->irb);
+			memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 		}
 		ccw_device_report_path_events(cdev);
 		ccw_device_handle_broken_paths(cdev);
@@ -672,7 +672,7 @@ ccw_device_online_verify(struct ccw_device *cdev, enum dev_event dev_event)
 
 	if (scsw_actl(&sch->schib.scsw) != 0 ||
 	    (scsw_stctl(&sch->schib.scsw) & SCSW_STCTL_STATUS_PEND) ||
-	    (scsw_stctl(&cdev->private->irb.scsw) & SCSW_STCTL_STATUS_PEND)) {
+	    (scsw_stctl(&cdev->private->dma_area->irb.scsw) & SCSW_STCTL_STATUS_PEND)) {
 		/*
 		 * No final status yet or final status not yet delivered
 		 * to the device driver. Can't do path verification now,
@@ -719,7 +719,7 @@ static int ccw_device_call_handler(struct ccw_device *cdev)
 	 *  - fast notification was requested (primary status)
 	 *  - unsolicited interrupts
 	 */
-	stctl = scsw_stctl(&cdev->private->irb.scsw);
+	stctl = scsw_stctl(&cdev->private->dma_area->irb.scsw);
 	ending_status = (stctl & SCSW_STCTL_SEC_STATUS) ||
 		(stctl == (SCSW_STCTL_ALERT_STATUS | SCSW_STCTL_STATUS_PEND)) ||
 		(stctl == SCSW_STCTL_STATUS_PEND);
@@ -735,9 +735,9 @@ static int ccw_device_call_handler(struct ccw_device *cdev)
 
 	if (cdev->handler)
 		cdev->handler(cdev, cdev->private->intparm,
-			      &cdev->private->irb);
+			      &cdev->private->dma_area->irb);
 
-	memset(&cdev->private->irb, 0, sizeof(struct irb));
+	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 	return 1;
 }
 
@@ -759,7 +759,7 @@ ccw_device_irq(struct ccw_device *cdev, enum dev_event dev_event)
 			/* Unit check but no sense data. Need basic sense. */
 			if (ccw_device_do_sense(cdev, irb) != 0)
 				goto call_handler_unsol;
-			memcpy(&cdev->private->irb, irb, sizeof(struct irb));
+			memcpy(&cdev->private->dma_area->irb, irb, sizeof(struct irb));
 			cdev->private->state = DEV_STATE_W4SENSE;
 			cdev->private->intparm = 0;
 			return;
@@ -842,7 +842,7 @@ ccw_device_w4sense(struct ccw_device *cdev, enum dev_event dev_event)
 	if (scsw_fctl(&irb->scsw) &
 	    (SCSW_FCTL_CLEAR_FUNC | SCSW_FCTL_HALT_FUNC)) {
 		cdev->private->flags.dosense = 0;
-		memset(&cdev->private->irb, 0, sizeof(struct irb));
+		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 		ccw_device_accumulate_irb(cdev, irb);
 		goto call_handler;
 	}
diff --git a/drivers/s390/cio/device_id.c b/drivers/s390/cio/device_id.c
index f6df83a9dfbb..ea8a0fc6c0b6 100644
--- a/drivers/s390/cio/device_id.c
+++ b/drivers/s390/cio/device_id.c
@@ -99,7 +99,7 @@ static int diag210_to_senseid(struct senseid *senseid, struct diag210 *diag)
 static int diag210_get_dev_info(struct ccw_device *cdev)
 {
 	struct ccw_dev_id *dev_id = &cdev->private->dev_id;
-	struct senseid *senseid = &cdev->private->senseid;
+	struct senseid *senseid = &cdev->private->dma_area->senseid;
 	struct diag210 diag_data;
 	int rc;
 
@@ -134,8 +134,8 @@ static int diag210_get_dev_info(struct ccw_device *cdev)
 static void snsid_init(struct ccw_device *cdev)
 {
 	cdev->private->flags.esid = 0;
-	memset(&cdev->private->senseid, 0, sizeof(cdev->private->senseid));
-	cdev->private->senseid.cu_type = 0xffff;
+	memset(&cdev->private->dma_area->senseid, 0, sizeof(cdev->private->dma_area->senseid));
+	cdev->private->dma_area->senseid.cu_type = 0xffff;
 }
 
 /*
@@ -143,16 +143,16 @@ static void snsid_init(struct ccw_device *cdev)
  */
 static int snsid_check(struct ccw_device *cdev, void *data)
 {
-	struct cmd_scsw *scsw = &cdev->private->irb.scsw.cmd;
+	struct cmd_scsw *scsw = &cdev->private->dma_area->irb.scsw.cmd;
 	int len = sizeof(struct senseid) - scsw->count;
 
 	/* Check for incomplete SENSE ID data. */
 	if (len < SENSE_ID_MIN_LEN)
 		goto out_restart;
-	if (cdev->private->senseid.cu_type == 0xffff)
+	if (cdev->private->dma_area->senseid.cu_type == 0xffff)
 		goto out_restart;
 	/* Check for incompatible SENSE ID data. */
-	if (cdev->private->senseid.reserved != 0xff)
+	if (cdev->private->dma_area->senseid.reserved != 0xff)
 		return -EOPNOTSUPP;
 	/* Check for extended-identification information. */
 	if (len > SENSE_ID_BASIC_LEN)
@@ -170,7 +170,7 @@ static int snsid_check(struct ccw_device *cdev, void *data)
 static void snsid_callback(struct ccw_device *cdev, void *data, int rc)
 {
 	struct ccw_dev_id *id = &cdev->private->dev_id;
-	struct senseid *senseid = &cdev->private->senseid;
+	struct senseid *senseid = &cdev->private->dma_area->senseid;
 	int vm = 0;
 
 	if (rc && MACHINE_IS_VM) {
@@ -200,7 +200,7 @@ void ccw_device_sense_id_start(struct ccw_device *cdev)
 {
 	struct subchannel *sch = to_subchannel(cdev->dev.parent);
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 
 	CIO_TRACE_EVENT(4, "snsid");
 	CIO_HEX_EVENT(4, &cdev->private->dev_id, sizeof(cdev->private->dev_id));
@@ -208,7 +208,7 @@ void ccw_device_sense_id_start(struct ccw_device *cdev)
 	snsid_init(cdev);
 	/* Channel program setup. */
 	cp->cmd_code	= CCW_CMD_SENSE_ID;
-	cp->cda		= (u32) (addr_t) &cdev->private->senseid;
+	cp->cda		= (u32) (addr_t) &cdev->private->dma_area->senseid;
 	cp->count	= sizeof(struct senseid);
 	cp->flags	= CCW_FLAG_SLI;
 	/* Request setup. */
diff --git a/drivers/s390/cio/device_ops.c b/drivers/s390/cio/device_ops.c
index 4435ae0b3027..be4acfa9265a 100644
--- a/drivers/s390/cio/device_ops.c
+++ b/drivers/s390/cio/device_ops.c
@@ -429,8 +429,8 @@ struct ciw *ccw_device_get_ciw(struct ccw_device *cdev, __u32 ct)
 	if (cdev->private->flags.esid == 0)
 		return NULL;
 	for (ciw_cnt = 0; ciw_cnt < MAX_CIWS; ciw_cnt++)
-		if (cdev->private->senseid.ciw[ciw_cnt].ct == ct)
-			return cdev->private->senseid.ciw + ciw_cnt;
+		if (cdev->private->dma_area->senseid.ciw[ciw_cnt].ct == ct)
+			return cdev->private->dma_area->senseid.ciw + ciw_cnt;
 	return NULL;
 }
 
@@ -699,6 +699,23 @@ void ccw_device_get_schid(struct ccw_device *cdev, struct subchannel_id *schid)
 }
 EXPORT_SYMBOL_GPL(ccw_device_get_schid);
 
+/**
+ * Allocate zeroed, DMA-coherent, 31-bit addressable memory using
+ * the subchannel's DMA pool. The maximal supported allocation size
+ * is PAGE_SIZE.
+ */
+void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size)
+{
+	return cio_gp_dma_zalloc(cdev->private->dma_pool, &cdev->dev, size);
+}
+EXPORT_SYMBOL(ccw_device_dma_zalloc);
+
+void ccw_device_dma_free(struct ccw_device *cdev, void *cpu_addr, size_t size)
+{
+	cio_gp_dma_free(cdev->private->dma_pool, cpu_addr, size);
+}
+EXPORT_SYMBOL(ccw_device_dma_free);
+
 EXPORT_SYMBOL(ccw_device_set_options_mask);
 EXPORT_SYMBOL(ccw_device_set_options);
 EXPORT_SYMBOL(ccw_device_clear_options);
diff --git a/drivers/s390/cio/device_pgid.c b/drivers/s390/cio/device_pgid.c
index d30a3babf176..e97baa89cbf8 100644
--- a/drivers/s390/cio/device_pgid.c
+++ b/drivers/s390/cio/device_pgid.c
@@ -57,7 +57,7 @@ static void verify_done(struct ccw_device *cdev, int rc)
 static void nop_build_cp(struct ccw_device *cdev)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 
 	cp->cmd_code	= CCW_CMD_NOOP;
 	cp->cda		= 0;
@@ -134,9 +134,9 @@ static void nop_callback(struct ccw_device *cdev, void *data, int rc)
 static void spid_build_cp(struct ccw_device *cdev, u8 fn)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 	int i = pathmask_to_pos(req->lpm);
-	struct pgid *pgid = &cdev->private->pgid[i];
+	struct pgid *pgid = &cdev->private->dma_area->pgid[i];
 
 	pgid->inf.fc	= fn;
 	cp->cmd_code	= CCW_CMD_SET_PGID;
@@ -300,7 +300,7 @@ static int pgid_cmp(struct pgid *p1, struct pgid *p2)
 static void pgid_analyze(struct ccw_device *cdev, struct pgid **p,
 			 int *mismatch, u8 *reserved, u8 *reset)
 {
-	struct pgid *pgid = &cdev->private->pgid[0];
+	struct pgid *pgid = &cdev->private->dma_area->pgid[0];
 	struct pgid *first = NULL;
 	int lpm;
 	int i;
@@ -342,7 +342,7 @@ static u8 pgid_to_donepm(struct ccw_device *cdev)
 		lpm = 0x80 >> i;
 		if ((cdev->private->pgid_valid_mask & lpm) == 0)
 			continue;
-		pgid = &cdev->private->pgid[i];
+		pgid = &cdev->private->dma_area->pgid[i];
 		if (sch->opm & lpm) {
 			if (pgid->inf.ps.state1 != SNID_STATE1_GROUPED)
 				continue;
@@ -368,7 +368,7 @@ static void pgid_fill(struct ccw_device *cdev, struct pgid *pgid)
 	int i;
 
 	for (i = 0; i < 8; i++)
-		memcpy(&cdev->private->pgid[i], pgid, sizeof(struct pgid));
+		memcpy(&cdev->private->dma_area->pgid[i], pgid, sizeof(struct pgid));
 }
 
 /*
@@ -435,12 +435,12 @@ static void snid_done(struct ccw_device *cdev, int rc)
 static void snid_build_cp(struct ccw_device *cdev)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 	int i = pathmask_to_pos(req->lpm);
 
 	/* Channel program setup. */
 	cp->cmd_code	= CCW_CMD_SENSE_PGID;
-	cp->cda		= (u32) (addr_t) &cdev->private->pgid[i];
+	cp->cda		= (u32) (addr_t) &cdev->private->dma_area->pgid[i];
 	cp->count	= sizeof(struct pgid);
 	cp->flags	= CCW_FLAG_SLI;
 	req->cp		= cp;
@@ -516,7 +516,7 @@ static void verify_start(struct ccw_device *cdev)
 	sch->lpm = sch->schib.pmcw.pam;
 
 	/* Initialize PGID data. */
-	memset(cdev->private->pgid, 0, sizeof(cdev->private->pgid));
+	memset(cdev->private->dma_area->pgid, 0, sizeof(cdev->private->dma_area->pgid));
 	cdev->private->pgid_valid_mask = 0;
 	cdev->private->pgid_todo_mask = sch->schib.pmcw.pam;
 	cdev->private->path_notoper_mask = 0;
@@ -626,7 +626,7 @@ struct stlck_data {
 static void stlck_build_cp(struct ccw_device *cdev, void *buf1, void *buf2)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 
 	cp[0].cmd_code = CCW_CMD_STLCK;
 	cp[0].cda = (u32) (addr_t) buf1;
diff --git a/drivers/s390/cio/device_status.c b/drivers/s390/cio/device_status.c
index 7d5c7892b2c4..a9aabde604f4 100644
--- a/drivers/s390/cio/device_status.c
+++ b/drivers/s390/cio/device_status.c
@@ -79,15 +79,15 @@ ccw_device_accumulate_ecw(struct ccw_device *cdev, struct irb *irb)
 	 * are condition that have to be met for the extended control
 	 * bit to have meaning. Sick.
 	 */
-	cdev->private->irb.scsw.cmd.ectl = 0;
+	cdev->private->dma_area->irb.scsw.cmd.ectl = 0;
 	if ((irb->scsw.cmd.stctl & SCSW_STCTL_ALERT_STATUS) &&
 	    !(irb->scsw.cmd.stctl & SCSW_STCTL_INTER_STATUS))
-		cdev->private->irb.scsw.cmd.ectl = irb->scsw.cmd.ectl;
+		cdev->private->dma_area->irb.scsw.cmd.ectl = irb->scsw.cmd.ectl;
 	/* Check if extended control word is valid. */
-	if (!cdev->private->irb.scsw.cmd.ectl)
+	if (!cdev->private->dma_area->irb.scsw.cmd.ectl)
 		return;
 	/* Copy concurrent sense / model dependent information. */
-	memcpy (&cdev->private->irb.ecw, irb->ecw, sizeof (irb->ecw));
+	memcpy (&cdev->private->dma_area->irb.ecw, irb->ecw, sizeof (irb->ecw));
 }
 
 /*
@@ -118,7 +118,7 @@ ccw_device_accumulate_esw(struct ccw_device *cdev, struct irb *irb)
 	if (!ccw_device_accumulate_esw_valid(irb))
 		return;
 
-	cdev_irb = &cdev->private->irb;
+	cdev_irb = &cdev->private->dma_area->irb;
 
 	/* Copy last path used mask. */
 	cdev_irb->esw.esw1.lpum = irb->esw.esw1.lpum;
@@ -210,7 +210,7 @@ ccw_device_accumulate_irb(struct ccw_device *cdev, struct irb *irb)
 		ccw_device_path_notoper(cdev);
 	/* No irb accumulation for transport mode irbs. */
 	if (scsw_is_tm(&irb->scsw)) {
-		memcpy(&cdev->private->irb, irb, sizeof(struct irb));
+		memcpy(&cdev->private->dma_area->irb, irb, sizeof(struct irb));
 		return;
 	}
 	/*
@@ -219,7 +219,7 @@ ccw_device_accumulate_irb(struct ccw_device *cdev, struct irb *irb)
 	if (!scsw_is_solicited(&irb->scsw))
 		return;
 
-	cdev_irb = &cdev->private->irb;
+	cdev_irb = &cdev->private->dma_area->irb;
 
 	/*
 	 * If the clear function had been performed, all formerly pending
@@ -227,7 +227,7 @@ ccw_device_accumulate_irb(struct ccw_device *cdev, struct irb *irb)
 	 * intermediate accumulated status to the device driver.
 	 */
 	if (irb->scsw.cmd.fctl & SCSW_FCTL_CLEAR_FUNC)
-		memset(&cdev->private->irb, 0, sizeof(struct irb));
+		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 
 	/* Copy bits which are valid only for the start function. */
 	if (irb->scsw.cmd.fctl & SCSW_FCTL_START_FUNC) {
@@ -329,9 +329,9 @@ ccw_device_do_sense(struct ccw_device *cdev, struct irb *irb)
 	/*
 	 * We have ending status but no sense information. Do a basic sense.
 	 */
-	sense_ccw = &to_io_private(sch)->sense_ccw;
+	sense_ccw = &to_io_private(sch)->dma_area->sense_ccw;
 	sense_ccw->cmd_code = CCW_CMD_BASIC_SENSE;
-	sense_ccw->cda = (__u32) __pa(cdev->private->irb.ecw);
+	sense_ccw->cda = (__u32) __pa(cdev->private->dma_area->irb.ecw);
 	sense_ccw->count = SENSE_MAX_COUNT;
 	sense_ccw->flags = CCW_FLAG_SLI;
 
@@ -364,7 +364,7 @@ ccw_device_accumulate_basic_sense(struct ccw_device *cdev, struct irb *irb)
 
 	if (!(irb->scsw.cmd.dstat & DEV_STAT_UNIT_CHECK) &&
 	    (irb->scsw.cmd.dstat & DEV_STAT_CHN_END)) {
-		cdev->private->irb.esw.esw0.erw.cons = 1;
+		cdev->private->dma_area->irb.esw.esw0.erw.cons = 1;
 		cdev->private->flags.dosense = 0;
 	}
 	/* Check if path verification is required. */
@@ -386,7 +386,7 @@ ccw_device_accumulate_and_sense(struct ccw_device *cdev, struct irb *irb)
 	/* Check for basic sense. */
 	if (cdev->private->flags.dosense &&
 	    !(irb->scsw.cmd.dstat & DEV_STAT_UNIT_CHECK)) {
-		cdev->private->irb.esw.esw0.erw.cons = 1;
+		cdev->private->dma_area->irb.esw.esw0.erw.cons = 1;
 		cdev->private->flags.dosense = 0;
 		return 0;
 	}
diff --git a/drivers/s390/cio/io_sch.h b/drivers/s390/cio/io_sch.h
index 90e4e3a7841b..4cf38c98cd0b 100644
--- a/drivers/s390/cio/io_sch.h
+++ b/drivers/s390/cio/io_sch.h
@@ -9,15 +9,20 @@
 #include "css.h"
 #include "orb.h"
 
+struct io_subchannel_dma_area {
+	struct ccw1 sense_ccw;	/* static ccw for sense command */
+};
+
 struct io_subchannel_private {
 	union orb orb;		/* operation request block */
-	struct ccw1 sense_ccw;	/* static ccw for sense command */
 	struct ccw_device *cdev;/* pointer to the child ccw device */
 	struct {
 		unsigned int suspend:1;	/* allow suspend */
 		unsigned int prefetch:1;/* deny prefetch */
 		unsigned int inter:1;	/* suppress intermediate interrupts */
 	} __packed options;
+	struct io_subchannel_dma_area *dma_area;
+	dma_addr_t dma_area_dma;
 } __aligned(8);
 
 #define to_io_private(n) ((struct io_subchannel_private *) \
@@ -115,6 +120,13 @@ enum cdev_todo {
 #define FAKE_CMD_IRB	1
 #define FAKE_TM_IRB	2
 
+struct ccw_device_dma_area {
+	struct senseid senseid;	/* SenseID info */
+	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
+	struct irb irb;		/* device status */
+	struct pgid pgid[8];	/* path group IDs per chpid*/
+};
+
 struct ccw_device_private {
 	struct ccw_device *cdev;
 	struct subchannel *sch;
@@ -156,11 +168,7 @@ struct ccw_device_private {
 	} __attribute__((packed)) flags;
 	unsigned long intparm;	/* user interruption parameter */
 	struct qdio_irq *qdio_data;
-	struct irb irb;		/* device status */
 	int async_kill_io_rc;
-	struct senseid senseid;	/* SenseID info */
-	struct pgid pgid[8];	/* path group IDs per chpid*/
-	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
 	struct work_struct todo_work;
 	enum cdev_todo todo;
 	wait_queue_head_t wait_q;
@@ -169,6 +177,9 @@ struct ccw_device_private {
 	struct list_head cmb_list;	/* list of measured devices */
 	u64 cmb_start_time;		/* clock value of cmb reset */
 	void *cmb_wait;			/* deferred cmb enable/disable */
+	u64 dma_mask;
+	struct gen_pool *dma_pool;
+	struct ccw_device_dma_area *dma_area;
 	enum interruption_class int_class;
 };
 
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 6d989c360f38..bb7a92316fc8 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -66,7 +66,6 @@ struct virtio_ccw_device {
 	bool device_lost;
 	unsigned int config_ready;
 	void *airq_info;
-	u64 dma_mask;
 };
 
 struct vq_info_block_legacy {
@@ -1255,16 +1254,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 		ret = -ENOMEM;
 		goto out_free;
 	}
-
 	vcdev->vdev.dev.parent = &cdev->dev;
-	cdev->dev.dma_mask = &vcdev->dma_mask;
-	/* we are fine with common virtio infrastructure using 64 bit DMA */
-	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
-	if (ret) {
-		dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n");
-		goto out_free;
-	}
-
 	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
 				   GFP_DMA | GFP_KERNEL);
 	if (!vcdev->config_block) {
-- 
2.16.4


* [PATCH 07/10] s390/airq: use DMA memory for adapter interrupts
  2019-04-26 18:32 ` Halil Pasic
@ 2019-04-26 18:32   ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-04-26 18:32 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

Protected virtualization guests have to use shared pages for airq
notifier bit vectors, because the hypervisor needs to write these bits.

Let us make sure we allocate DMA memory for the notifier bit vectors.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 arch/s390/include/asm/airq.h |  2 ++
 drivers/s390/cio/airq.c      | 18 ++++++++++++++----
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
index fcf539efb32f..1492d4856049 100644
--- a/arch/s390/include/asm/airq.h
+++ b/arch/s390/include/asm/airq.h
@@ -11,6 +11,7 @@
 #define _ASM_S390_AIRQ_H
 
 #include <linux/bit_spinlock.h>
+#include <linux/dma-mapping.h>
 
 struct airq_struct {
 	struct hlist_node list;		/* Handler queueing. */
@@ -29,6 +30,7 @@ void unregister_adapter_interrupt(struct airq_struct *airq);
 /* Adapter interrupt bit vector */
 struct airq_iv {
 	unsigned long *vector;	/* Adapter interrupt bit vector */
+	dma_addr_t vector_dma; /* Adapter interrupt bit vector dma */
 	unsigned long *avail;	/* Allocation bit mask for the bit vector */
 	unsigned long *bitlock;	/* Lock bit mask for the bit vector */
 	unsigned long *ptr;	/* Pointer associated with each bit */
diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
index a45011e4529e..7a5c0a08ee09 100644
--- a/drivers/s390/cio/airq.c
+++ b/drivers/s390/cio/airq.c
@@ -19,6 +19,7 @@
 
 #include <asm/airq.h>
 #include <asm/isc.h>
+#include <asm/cio.h>
 
 #include "cio.h"
 #include "cio_debug.h"
@@ -113,6 +114,11 @@ void __init init_airq_interrupts(void)
 	setup_irq(THIN_INTERRUPT, &airq_interrupt);
 }
 
+static inline unsigned long iv_size(unsigned long bits)
+{
+	return BITS_TO_LONGS(bits) * sizeof(unsigned long);
+}
+
 /**
  * airq_iv_create - create an interrupt vector
  * @bits: number of bits in the interrupt vector
@@ -123,14 +129,15 @@ void __init init_airq_interrupts(void)
 struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
 {
 	struct airq_iv *iv;
-	unsigned long size;
+	unsigned long size = 0;
 
 	iv = kzalloc(sizeof(*iv), GFP_KERNEL);
 	if (!iv)
 		goto out;
 	iv->bits = bits;
-	size = BITS_TO_LONGS(bits) * sizeof(unsigned long);
-	iv->vector = kzalloc(size, GFP_KERNEL);
+	size = iv_size(bits);
+	iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size,
+						 &iv->vector_dma, GFP_KERNEL);
 	if (!iv->vector)
 		goto out_free;
 	if (flags & AIRQ_IV_ALLOC) {
@@ -165,7 +172,8 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
 	kfree(iv->ptr);
 	kfree(iv->bitlock);
 	kfree(iv->avail);
-	kfree(iv->vector);
+	dma_free_coherent(cio_get_dma_css_dev(), size, iv->vector,
+			  iv->vector_dma);
 	kfree(iv);
 out:
 	return NULL;
@@ -182,6 +190,8 @@ void airq_iv_release(struct airq_iv *iv)
 	kfree(iv->ptr);
 	kfree(iv->bitlock);
 	kfree(iv->vector);
+	dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits),
+			  iv->vector, iv->vector_dma);
 	kfree(iv->avail);
 	kfree(iv);
 }
-- 
2.16.4


* [PATCH 08/10] virtio/s390: add indirection to indicators access
  2019-04-26 18:32 ` Halil Pasic
@ 2019-04-26 18:32   ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-04-26 18:32 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

This will come in handy soon when we pull out the indicators from
virtio_ccw_device to a memory area that is shared with the hypervisor
(in particular for protected virtualization guests).

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 40 +++++++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index bb7a92316fc8..1f3e7d56924f 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -68,6 +68,16 @@ struct virtio_ccw_device {
 	void *airq_info;
 };
 
+static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
+{
+	return &vcdev->indicators;
+}
+
+static inline unsigned long *indicators2(struct virtio_ccw_device *vcdev)
+{
+	return &vcdev->indicators2;
+}
+
 struct vq_info_block_legacy {
 	__u64 queue;
 	__u32 align;
@@ -337,17 +347,17 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 		ccw->cda = (__u32)(unsigned long) thinint_area;
 	} else {
 		/* payload is the address of the indicators */
-		indicatorp = kmalloc(sizeof(&vcdev->indicators),
+		indicatorp = kmalloc(sizeof(indicators(vcdev)),
 				     GFP_DMA | GFP_KERNEL);
 		if (!indicatorp)
 			return;
 		*indicatorp = 0;
 		ccw->cmd_code = CCW_CMD_SET_IND;
-		ccw->count = sizeof(&vcdev->indicators);
+		ccw->count = sizeof(indicators(vcdev));
 		ccw->cda = (__u32)(unsigned long) indicatorp;
 	}
 	/* Deregister indicators from host. */
-	vcdev->indicators = 0;
+	*indicators(vcdev) = 0;
 	ccw->flags = 0;
 	ret = ccw_io_helper(vcdev, ccw,
 			    vcdev->is_thinint ?
@@ -656,10 +666,10 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 	 * We need a data area under 2G to communicate. Our payload is
 	 * the address of the indicators.
 	*/
-	indicatorp = kmalloc(sizeof(&vcdev->indicators), GFP_DMA | GFP_KERNEL);
+	indicatorp = kmalloc(sizeof(indicators(vcdev)), GFP_DMA | GFP_KERNEL);
 	if (!indicatorp)
 		goto out;
-	*indicatorp = (unsigned long) &vcdev->indicators;
+	*indicatorp = (unsigned long) indicators(vcdev);
 	if (vcdev->is_thinint) {
 		ret = virtio_ccw_register_adapter_ind(vcdev, vqs, nvqs, ccw);
 		if (ret)
@@ -668,21 +678,21 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 	}
 	if (!vcdev->is_thinint) {
 		/* Register queue indicators with host. */
-		vcdev->indicators = 0;
+		*indicators(vcdev) = 0;
 		ccw->cmd_code = CCW_CMD_SET_IND;
 		ccw->flags = 0;
-		ccw->count = sizeof(&vcdev->indicators);
+		ccw->count = sizeof(indicators(vcdev));
 		ccw->cda = (__u32)(unsigned long) indicatorp;
 		ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_SET_IND);
 		if (ret)
 			goto out;
 	}
 	/* Register indicators2 with host for config changes */
-	*indicatorp = (unsigned long) &vcdev->indicators2;
-	vcdev->indicators2 = 0;
+	*indicatorp = (unsigned long) indicators2(vcdev);
+	*indicators2(vcdev) = 0;
 	ccw->cmd_code = CCW_CMD_SET_CONF_IND;
 	ccw->flags = 0;
-	ccw->count = sizeof(&vcdev->indicators2);
+	ccw->count = sizeof(indicators2(vcdev));
 	ccw->cda = (__u32)(unsigned long) indicatorp;
 	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_SET_CONF_IND);
 	if (ret)
@@ -1092,17 +1102,17 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
 			vcdev->err = -EIO;
 	}
 	virtio_ccw_check_activity(vcdev, activity);
-	for_each_set_bit(i, &vcdev->indicators,
-			 sizeof(vcdev->indicators) * BITS_PER_BYTE) {
+	for_each_set_bit(i, indicators(vcdev),
+			 sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
 		/* The bit clear must happen before the vring kick. */
-		clear_bit(i, &vcdev->indicators);
+		clear_bit(i, indicators(vcdev));
 		barrier();
 		vq = virtio_ccw_vq_by_ind(vcdev, i);
 		vring_interrupt(0, vq);
 	}
-	if (test_bit(0, &vcdev->indicators2)) {
+	if (test_bit(0, indicators2(vcdev))) {
 		virtio_config_changed(&vcdev->vdev);
-		clear_bit(0, &vcdev->indicators2);
+		clear_bit(0, indicators2(vcdev));
 	}
 }
 
-- 
2.16.4


* [PATCH 09/10] virtio/s390: use DMA memory for ccw I/O and classic notifiers
  2019-04-26 18:32 ` Halil Pasic
@ 2019-04-26 18:32   ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-04-26 18:32 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

Until now, virtio-ccw could get away with not using the DMA API for the
pieces of memory it does ccw I/O with. With protected virtualization this
has to change, since the hypervisor needs to read, and sometimes also
write, these pieces of memory.

The hypervisor is supposed to poke the classic notifiers, if these are
used, out of band with regard to ccw I/O. So these need to be allocated
as DMA memory (which is shared memory for protected virtualization
guests).

Let us factor out everything from struct virtio_ccw_device that needs to
be DMA memory into a satellite structure that is allocated as such.

Note: The control blocks of I/O instructions do not need to be shared.
These are marshalled by the ultravisor.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 177 +++++++++++++++++++++------------------
 1 file changed, 96 insertions(+), 81 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 1f3e7d56924f..613b18001a0c 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -46,9 +46,15 @@ struct vq_config_block {
 #define VIRTIO_CCW_CONFIG_SIZE 0x100
 /* same as PCI config space size, should be enough for all drivers */
 
+struct vcdev_dma_area {
+	unsigned long indicators;
+	unsigned long indicators2;
+	struct vq_config_block config_block;
+	__u8 status;
+};
+
 struct virtio_ccw_device {
 	struct virtio_device vdev;
-	__u8 *status;
 	__u8 config[VIRTIO_CCW_CONFIG_SIZE];
 	struct ccw_device *cdev;
 	__u32 curr_io;
@@ -58,24 +64,22 @@ struct virtio_ccw_device {
 	spinlock_t lock;
 	struct mutex io_lock; /* Serializes I/O requests */
 	struct list_head virtqueues;
-	unsigned long indicators;
-	unsigned long indicators2;
-	struct vq_config_block *config_block;
 	bool is_thinint;
 	bool going_away;
 	bool device_lost;
 	unsigned int config_ready;
 	void *airq_info;
+	struct vcdev_dma_area *dma_area;
 };
 
 static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
 {
-	return &vcdev->indicators;
+	return &vcdev->dma_area->indicators;
 }
 
 static inline unsigned long *indicators2(struct virtio_ccw_device *vcdev)
 {
-	return &vcdev->indicators2;
+	return &vcdev->dma_area->indicators2;
 }
 
 struct vq_info_block_legacy {
@@ -176,6 +180,22 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
 	return container_of(vdev, struct virtio_ccw_device, vdev);
 }
 
+static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size)
+{
+	return ccw_device_dma_zalloc(to_vc_device(vdev)->cdev, size);
+}
+
+static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
+				 void *cpu_addr)
+{
+	return ccw_device_dma_free(to_vc_device(vdev)->cdev, cpu_addr, size);
+}
+
+#define vc_dma_alloc_struct(vdev, ptr) \
+	({ptr = __vc_dma_alloc(vdev, sizeof(*(ptr))); })
+#define vc_dma_free_struct(vdev, ptr) \
+	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr))
+
 static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
 {
 	unsigned long i, flags;
@@ -335,8 +355,7 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 	struct airq_info *airq_info = vcdev->airq_info;
 
 	if (vcdev->is_thinint) {
-		thinint_area = kzalloc(sizeof(*thinint_area),
-				       GFP_DMA | GFP_KERNEL);
+		vc_dma_alloc_struct(&vcdev->vdev, thinint_area);
 		if (!thinint_area)
 			return;
 		thinint_area->summary_indicator =
@@ -347,8 +366,8 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 		ccw->cda = (__u32)(unsigned long) thinint_area;
 	} else {
 		/* payload is the address of the indicators */
-		indicatorp = kmalloc(sizeof(indicators(vcdev)),
-				     GFP_DMA | GFP_KERNEL);
+		indicatorp = __vc_dma_alloc(&vcdev->vdev,
+					   sizeof(indicators(vcdev)));
 		if (!indicatorp)
 			return;
 		*indicatorp = 0;
@@ -368,8 +387,9 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 			 "Failed to deregister indicators (%d)\n", ret);
 	else if (vcdev->is_thinint)
 		virtio_ccw_drop_indicators(vcdev);
-	kfree(indicatorp);
-	kfree(thinint_area);
+	__vc_dma_free(&vcdev->vdev, sizeof(indicators(vcdev)),
+		      indicatorp);
+	vc_dma_free_struct(&vcdev->vdev, thinint_area);
 }
 
 static inline long __do_kvm_notify(struct subchannel_id schid,
@@ -416,15 +436,15 @@ static int virtio_ccw_read_vq_conf(struct virtio_ccw_device *vcdev,
 {
 	int ret;
 
-	vcdev->config_block->index = index;
+	vcdev->dma_area->config_block.index = index;
 	ccw->cmd_code = CCW_CMD_READ_VQ_CONF;
 	ccw->flags = 0;
 	ccw->count = sizeof(struct vq_config_block);
-	ccw->cda = (__u32)(unsigned long)(vcdev->config_block);
+	ccw->cda = (__u32)(unsigned long)(&vcdev->dma_area->config_block);
 	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_READ_VQ_CONF);
 	if (ret)
 		return ret;
-	return vcdev->config_block->num ?: -ENOENT;
+	return vcdev->dma_area->config_block.num ?: -ENOENT;
 }
 
 static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
@@ -469,7 +489,7 @@ static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
 			 ret, index);
 
 	vring_del_virtqueue(vq);
-	kfree(info->info_block);
+	vc_dma_free_struct(vq->vdev, info->info_block);
 	kfree(info);
 }
 
@@ -479,7 +499,7 @@ static void virtio_ccw_del_vqs(struct virtio_device *vdev)
 	struct ccw1 *ccw;
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
@@ -488,7 +508,7 @@ static void virtio_ccw_del_vqs(struct virtio_device *vdev)
 	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
 		virtio_ccw_del_vq(vq, ccw);
 
-	kfree(ccw);
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
@@ -511,8 +531,7 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
 		err = -ENOMEM;
 		goto out_err;
 	}
-	info->info_block = kzalloc(sizeof(*info->info_block),
-				   GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, info->info_block);
 	if (!info->info_block) {
 		dev_warn(&vcdev->cdev->dev, "no info block\n");
 		err = -ENOMEM;
@@ -576,7 +595,7 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
 	if (vq)
 		vring_del_virtqueue(vq);
 	if (info) {
-		kfree(info->info_block);
+		vc_dma_free_struct(vdev, info->info_block);
 	}
 	kfree(info);
 	return ERR_PTR(err);
@@ -590,7 +609,7 @@ static int virtio_ccw_register_adapter_ind(struct virtio_ccw_device *vcdev,
 	struct virtio_thinint_area *thinint_area = NULL;
 	struct airq_info *info;
 
-	thinint_area = kzalloc(sizeof(*thinint_area), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(&vcdev->vdev, thinint_area);
 	if (!thinint_area) {
 		ret = -ENOMEM;
 		goto out;
@@ -626,7 +645,7 @@ static int virtio_ccw_register_adapter_ind(struct virtio_ccw_device *vcdev,
 		virtio_ccw_drop_indicators(vcdev);
 	}
 out:
-	kfree(thinint_area);
+	vc_dma_free_struct(&vcdev->vdev, thinint_area);
 	return ret;
 }
 
@@ -642,7 +661,7 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 	int ret, i, queue_idx = 0;
 	struct ccw1 *ccw;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return -ENOMEM;
 
@@ -666,7 +685,7 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 	 * We need a data area under 2G to communicate. Our payload is
 	 * the address of the indicators.
 	*/
-	indicatorp = kmalloc(sizeof(indicators(vcdev)), GFP_DMA | GFP_KERNEL);
+	indicatorp = __vc_dma_alloc(&vcdev->vdev, sizeof(indicators(vcdev)));
 	if (!indicatorp)
 		goto out;
 	*indicatorp = (unsigned long) indicators(vcdev);
@@ -698,12 +717,16 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 	if (ret)
 		goto out;
 
-	kfree(indicatorp);
-	kfree(ccw);
+	if (indicatorp)
+		__vc_dma_free(&vcdev->vdev, sizeof(indicators(vcdev)),
+			       indicatorp);
+	vc_dma_free_struct(vdev, ccw);
 	return 0;
 out:
-	kfree(indicatorp);
-	kfree(ccw);
+	if (indicatorp)
+		__vc_dma_free(&vcdev->vdev, sizeof(indicators(vcdev)),
+			       indicatorp);
+	vc_dma_free_struct(vdev, ccw);
 	virtio_ccw_del_vqs(vdev);
 	return ret;
 }
@@ -713,12 +736,12 @@ static void virtio_ccw_reset(struct virtio_device *vdev)
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
 	struct ccw1 *ccw;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
 	/* Zero status bits. */
-	*vcdev->status = 0;
+	vcdev->dma_area->status = 0;
 
 	/* Send a reset ccw on device. */
 	ccw->cmd_code = CCW_CMD_VDEV_RESET;
@@ -726,22 +749,22 @@ static void virtio_ccw_reset(struct virtio_device *vdev)
 	ccw->count = 0;
 	ccw->cda = 0;
 	ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_RESET);
-	kfree(ccw);
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static u64 virtio_ccw_get_features(struct virtio_device *vdev)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
 	struct virtio_feature_desc *features;
+	struct ccw1 *ccw;
 	int ret;
 	u64 rc;
-	struct ccw1 *ccw;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return 0;
 
-	features = kzalloc(sizeof(*features), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, features);
 	if (!features) {
 		rc = 0;
 		goto out_free;
@@ -774,8 +797,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev)
 		rc |= (u64)le32_to_cpu(features->features) << 32;
 
 out_free:
-	kfree(features);
-	kfree(ccw);
+	vc_dma_free_struct(vdev, features);
+	vc_dma_free_struct(vdev, ccw);
 	return rc;
 }
 
@@ -800,11 +823,11 @@ static int virtio_ccw_finalize_features(struct virtio_device *vdev)
 		return -EINVAL;
 	}
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return -ENOMEM;
 
-	features = kzalloc(sizeof(*features), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, features);
 	if (!features) {
 		ret = -ENOMEM;
 		goto out_free;
@@ -839,8 +862,8 @@ static int virtio_ccw_finalize_features(struct virtio_device *vdev)
 	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_FEAT);
 
 out_free:
-	kfree(features);
-	kfree(ccw);
+	vc_dma_free_struct(vdev, features);
+	vc_dma_free_struct(vdev, ccw);
 
 	return ret;
 }
@@ -854,11 +877,11 @@ static void virtio_ccw_get_config(struct virtio_device *vdev,
 	void *config_area;
 	unsigned long flags;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
-	config_area = kzalloc(VIRTIO_CCW_CONFIG_SIZE, GFP_DMA | GFP_KERNEL);
+	config_area = __vc_dma_alloc(vdev, VIRTIO_CCW_CONFIG_SIZE);
 	if (!config_area)
 		goto out_free;
 
@@ -880,8 +903,8 @@ static void virtio_ccw_get_config(struct virtio_device *vdev,
 		memcpy(buf, config_area + offset, len);
 
 out_free:
-	kfree(config_area);
-	kfree(ccw);
+	__vc_dma_free(vdev, VIRTIO_CCW_CONFIG_SIZE, config_area);
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static void virtio_ccw_set_config(struct virtio_device *vdev,
@@ -893,11 +916,11 @@ static void virtio_ccw_set_config(struct virtio_device *vdev,
 	void *config_area;
 	unsigned long flags;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
-	config_area = kzalloc(VIRTIO_CCW_CONFIG_SIZE, GFP_DMA | GFP_KERNEL);
+	config_area = __vc_dma_alloc(vdev, VIRTIO_CCW_CONFIG_SIZE);
 	if (!config_area)
 		goto out_free;
 
@@ -916,61 +939,61 @@ static void virtio_ccw_set_config(struct virtio_device *vdev,
 	ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_CONFIG);
 
 out_free:
-	kfree(config_area);
-	kfree(ccw);
+	__vc_dma_free(vdev, VIRTIO_CCW_CONFIG_SIZE, config_area);
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static u8 virtio_ccw_get_status(struct virtio_device *vdev)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
-	u8 old_status = *vcdev->status;
+	u8 old_status = vcdev->dma_area->status;
 	struct ccw1 *ccw;
 
 	if (vcdev->revision < 1)
-		return *vcdev->status;
+		return vcdev->dma_area->status;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return old_status;
 
 	ccw->cmd_code = CCW_CMD_READ_STATUS;
 	ccw->flags = 0;
-	ccw->count = sizeof(*vcdev->status);
-	ccw->cda = (__u32)(unsigned long)vcdev->status;
+	ccw->count = sizeof(vcdev->dma_area->status);
+	ccw->cda = (__u32)(unsigned long)&vcdev->dma_area->status;
 	ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_READ_STATUS);
 /*
  * If the channel program failed (should only happen if the device
  * was hotunplugged, and then we clean up via the machine check
- * handler anyway), vcdev->status was not overwritten and we just
+ * handler anyway), vcdev->dma_area->status was not overwritten and we just
  * return the old status, which is fine.
 */
-	kfree(ccw);
+	vc_dma_free_struct(vdev, ccw);
 
-	return *vcdev->status;
+	return vcdev->dma_area->status;
 }
 
 static void virtio_ccw_set_status(struct virtio_device *vdev, u8 status)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
-	u8 old_status = *vcdev->status;
+	u8 old_status = vcdev->dma_area->status;
 	struct ccw1 *ccw;
 	int ret;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
 	/* Write the status to the host. */
-	*vcdev->status = status;
+	vcdev->dma_area->status = status;
 	ccw->cmd_code = CCW_CMD_WRITE_STATUS;
 	ccw->flags = 0;
 	ccw->count = sizeof(status);
-	ccw->cda = (__u32)(unsigned long)vcdev->status;
+	ccw->cda = (__u32)(unsigned long)&vcdev->dma_area->status;
 	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_STATUS);
 	/* Write failed? We assume status is unchanged. */
 	if (ret)
-		*vcdev->status = old_status;
-	kfree(ccw);
+		vcdev->dma_area->status = old_status;
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static const char *virtio_ccw_bus_name(struct virtio_device *vdev)
@@ -1003,8 +1026,7 @@ static void virtio_ccw_release_dev(struct device *_d)
 	struct virtio_device *dev = dev_to_virtio(_d);
 	struct virtio_ccw_device *vcdev = to_vc_device(dev);
 
-	kfree(vcdev->status);
-	kfree(vcdev->config_block);
+	vc_dma_free_struct(&vcdev->vdev, vcdev->dma_area);
 	kfree(vcdev);
 }
 
@@ -1212,12 +1234,12 @@ static int virtio_ccw_set_transport_rev(struct virtio_ccw_device *vcdev)
 	struct ccw1 *ccw;
 	int ret;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(&vcdev->vdev, ccw);
 	if (!ccw)
 		return -ENOMEM;
-	rev = kzalloc(sizeof(*rev), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(&vcdev->vdev, rev);
 	if (!rev) {
-		kfree(ccw);
+		vc_dma_free_struct(&vcdev->vdev, ccw);
 		return -ENOMEM;
 	}
 
@@ -1247,8 +1269,8 @@ static int virtio_ccw_set_transport_rev(struct virtio_ccw_device *vcdev)
 		}
 	} while (ret == -EOPNOTSUPP);
 
-	kfree(ccw);
-	kfree(rev);
+	vc_dma_free_struct(&vcdev->vdev, ccw);
+	vc_dma_free_struct(&vcdev->vdev, rev);
 	return ret;
 }
 
@@ -1265,14 +1287,9 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 		goto out_free;
 	}
 	vcdev->vdev.dev.parent = &cdev->dev;
-	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
-				   GFP_DMA | GFP_KERNEL);
-	if (!vcdev->config_block) {
-		ret = -ENOMEM;
-		goto out_free;
-	}
-	vcdev->status = kzalloc(sizeof(*vcdev->status), GFP_DMA | GFP_KERNEL);
-	if (!vcdev->status) {
+	vcdev->cdev = cdev;
+	vc_dma_alloc_struct(&vcdev->vdev, vcdev->dma_area);
+	if (!vcdev->dma_area) {
 		ret = -ENOMEM;
 		goto out_free;
 	}
@@ -1281,7 +1298,6 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 
 	vcdev->vdev.dev.release = virtio_ccw_release_dev;
 	vcdev->vdev.config = &virtio_ccw_config_ops;
-	vcdev->cdev = cdev;
 	init_waitqueue_head(&vcdev->wait_q);
 	INIT_LIST_HEAD(&vcdev->virtqueues);
 	spin_lock_init(&vcdev->lock);
@@ -1312,8 +1328,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 	return ret;
 out_free:
 	if (vcdev) {
-		kfree(vcdev->status);
-		kfree(vcdev->config_block);
+		vc_dma_free_struct(&vcdev->vdev, vcdev->dma_area);
 	}
 	kfree(vcdev);
 	return ret;
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 182+ messages in thread

* [PATCH 10/10] virtio/s390: make airq summary indicators DMA
  2019-04-26 18:32 ` Halil Pasic
@ 2019-04-26 18:32   ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-04-26 18:32 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

The hypervisor needs to interact with the summary indicators, so these
need to be DMA memory as well (at least for protected virtualization
guests).

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 613b18001a0c..6058b07fea08 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -140,11 +140,17 @@ static int virtio_ccw_use_airq = 1;
 
 struct airq_info {
 	rwlock_t lock;
-	u8 summary_indicator;
+	u8 summary_indicator_idx;
 	struct airq_struct airq;
 	struct airq_iv *aiv;
 };
 static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
+static u8 *summary_indicators;
+
+static inline u8 *get_summary_indicator(struct airq_info *info)
+{
+	return summary_indicators + info->summary_indicator_idx;
+}
 
 #define CCW_CMD_SET_VQ 0x13
 #define CCW_CMD_VDEV_RESET 0x33
@@ -225,7 +231,7 @@ static void virtio_airq_handler(struct airq_struct *airq)
 			break;
 		vring_interrupt(0, (void *)airq_iv_get_ptr(info->aiv, ai));
 	}
-	info->summary_indicator = 0;
+	*(get_summary_indicator(info)) = 0;
 	smp_wmb();
 	/* Walk through indicators field, summary indicator not active. */
 	for (ai = 0;;) {
@@ -237,7 +243,8 @@ static void virtio_airq_handler(struct airq_struct *airq)
 	read_unlock(&info->lock);
 }
 
-static struct airq_info *new_airq_info(void)
+/* call with airq_areas_lock held */
+static struct airq_info *new_airq_info(int index)
 {
 	struct airq_info *info;
 	int rc;
@@ -252,7 +259,8 @@ static struct airq_info *new_airq_info(void)
 		return NULL;
 	}
 	info->airq.handler = virtio_airq_handler;
-	info->airq.lsi_ptr = &info->summary_indicator;
+	info->summary_indicator_idx = index;
+	info->airq.lsi_ptr = get_summary_indicator(info);
 	info->airq.lsi_mask = 0xff;
 	info->airq.isc = VIRTIO_AIRQ_ISC;
 	rc = register_adapter_interrupt(&info->airq);
@@ -273,8 +281,9 @@ static unsigned long get_airq_indicator(struct virtqueue *vqs[], int nvqs,
 	unsigned long bit, flags;
 
 	for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
+		/* TODO: this seems to be racy */
 		if (!airq_areas[i])
-			airq_areas[i] = new_airq_info();
+			airq_areas[i] = new_airq_info(i);
 		info = airq_areas[i];
 		if (!info)
 			return 0;
@@ -359,7 +368,7 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 		if (!thinint_area)
 			return;
 		thinint_area->summary_indicator =
-			(unsigned long) &airq_info->summary_indicator;
+			(unsigned long) get_summary_indicator(airq_info);
 		thinint_area->isc = VIRTIO_AIRQ_ISC;
 		ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER;
 		ccw->count = sizeof(*thinint_area);
@@ -624,7 +633,7 @@ static int virtio_ccw_register_adapter_ind(struct virtio_ccw_device *vcdev,
 	}
 	info = vcdev->airq_info;
 	thinint_area->summary_indicator =
-		(unsigned long) &info->summary_indicator;
+		(unsigned long) get_summary_indicator(info);
 	thinint_area->isc = VIRTIO_AIRQ_ISC;
 	ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER;
 	ccw->flags = CCW_FLAG_SLI;
@@ -1500,6 +1509,7 @@ static int __init virtio_ccw_init(void)
 {
 	/* parse no_auto string before we do anything further */
 	no_auto_parse();
+	summary_indicators = cio_dma_zalloc(MAX_AIRQ_AREAS);
 	return ccw_driver_register(&virtio_ccw_driver);
 }
 device_initcall(virtio_ccw_init);
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 182+ messages in thread

* Re: [PATCH 04/10] s390/mm: force swiotlb for protected virtualization
  2019-04-26 18:32   ` Halil Pasic
@ 2019-04-26 19:27     ` Christoph Hellwig
  -1 siblings, 0 replies; 182+ messages in thread
From: Christoph Hellwig @ 2019-04-26 19:27 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On Fri, Apr 26, 2019 at 08:32:39PM +0200, Halil Pasic wrote:
> +EXPORT_SYMBOL_GPL(set_memory_encrypted);

> +EXPORT_SYMBOL_GPL(set_memory_decrypted);

> +EXPORT_SYMBOL_GPL(sev_active);

Why do you export these?  I know x86 exports those as well, but
it shouldn't be needed there either.


* Re: [PATCH 04/10] s390/mm: force swiotlb for protected virtualization
  2019-04-26 19:27     ` Christoph Hellwig
@ 2019-04-29 13:59       ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-04-29 13:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin, Thomas Huth,
	Christian Borntraeger, Viktor Mihajlovski, Vasily Gorbik,
	Janosch Frank, Claudio Imbrenda, Farhan Ali, Eric Farman

On Fri, 26 Apr 2019 12:27:11 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> On Fri, Apr 26, 2019 at 08:32:39PM +0200, Halil Pasic wrote:
> > +EXPORT_SYMBOL_GPL(set_memory_encrypted);
> 
> > +EXPORT_SYMBOL_GPL(set_memory_decrypted);
> 
> > +EXPORT_SYMBOL_GPL(sev_active);
> 
> Why do you export these?  I know x86 exports those as well, but
> it shouldn't be needed there either.
> 

I export these to be in line with the x86 implementation (which
is the original and seems to be the only one at the moment). I assumed
that 'exported or not' is kind of a part of the interface definition.
Honestly, I did not give it too much thought.

For x86, set_memory_(en|de)crypted got exported by commit 95cf9264d5f3 ("x86, drm,
fbdev: Do not specify encrypted memory for video mappings", Tom
Lendacky, 2017-07-17). With CONFIG_FB_VGA16=m the export seems to be necessary on x86.

If the consensus is don't export: I won't. I'm fine one way or the other.
@Christian, what is your take on this?

Thank you very much!

Regards,
Halil


* Re: [PATCH 04/10] s390/mm: force swiotlb for protected virtualization
  2019-04-29 13:59       ` Halil Pasic
@ 2019-04-29 14:05         ` Christian Borntraeger
  -1 siblings, 0 replies; 182+ messages in thread
From: Christian Borntraeger @ 2019-04-29 14:05 UTC (permalink / raw)
  To: Halil Pasic, Christoph Hellwig
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin, Thomas Huth,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman



On 29.04.19 15:59, Halil Pasic wrote:
> On Fri, 26 Apr 2019 12:27:11 -0700
> Christoph Hellwig <hch@infradead.org> wrote:
> 
>> On Fri, Apr 26, 2019 at 08:32:39PM +0200, Halil Pasic wrote:
>>> +EXPORT_SYMBOL_GPL(set_memory_encrypted);
>>
>>> +EXPORT_SYMBOL_GPL(set_memory_decrypted);
>>
>>> +EXPORT_SYMBOL_GPL(sev_active);
>>
>> Why do you export these?  I know x86 exports those as well, but
>> it shouldn't be needed there either.
>>
> 
> I export these to be in line with the x86 implementation (which
> is the original and seems to be the only one at the moment). I assumed
> that 'exported or not' is kind of a part of the interface definition.
> Honestly, I did not give it too much thought.
> 
> For x86, set_memory_(en|de)crypted got exported by commit 95cf9264d5f3 ("x86, drm,
> fbdev: Do not specify encrypted memory for video mappings", Tom
> Lendacky, 2017-07-17). With CONFIG_FB_VGA16=m the export seems to be necessary on x86.
> 
> If the consensus is don't export: I won't. I'm fine one way or the other.
> @Christian, what is your take on this?

If we do not need it today for anything (e.g. virtio-gpu) then we can get rid
of the exports (and introduce them when necessary).
> 
> Thank you very much!
> 
> Regards,
> Halil
> 
> 


* Re: [PATCH 01/10] virtio/s390: use vring_create_virtqueue
  2019-04-26 18:32   ` Halil Pasic
  (?)
@ 2019-05-03  9:17   ` Cornelia Huck
  2019-05-03 20:04       ` Michael S. Tsirkin
  -1 siblings, 1 reply; 182+ messages in thread
From: Cornelia Huck @ 2019-05-03  9:17 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 26 Apr 2019 20:32:36 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> establishes a new way of allocating virtqueues (as a part of the effort
> that taught DMA to virtio rings).
> 
> In the future we will want virtio-ccw to use the DMA API as well.
> 
> Let us switch from the legacy method of allocating virtqueues to
> vring_create_virtqueue() as the first step into that direction.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
>  1 file changed, 11 insertions(+), 19 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>

I'd vote for merging this patch right away for 5.2.


* Re: [PATCH 02/10] virtio/s390: DMA support for virtio-ccw
  2019-04-26 18:32   ` Halil Pasic
  (?)
@ 2019-05-03  9:31   ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-03  9:31 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 26 Apr 2019 20:32:37 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Currently virtio-ccw devices do not work if the device has
> VIRTIO_F_IOMMU_PLATFORM. In future we do want to support DMA API with
> virtio-ccw.
> 
> Let us do the plumbing, so the feature VIRTIO_F_IOMMU_PLATFORM works
> with virtio-ccw.
> 
> Let us also switch from legacy avail/used accessors to the DMA aware
> ones (even if it isn't strictly necessary), and remove the legacy
> accessors (we were the last users).
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 22 +++++++++++++++-------
>  include/linux/virtio.h           | 17 -----------------
>  2 files changed, 15 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index 2c66941ef3d0..42832a164546 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c

(...)

> @@ -1258,6 +1257,16 @@ static int virtio_ccw_online(struct ccw_device *cdev)
>  		ret = -ENOMEM;
>  		goto out_free;
>  	}
> +
> +	vcdev->vdev.dev.parent = &cdev->dev;
> +	cdev->dev.dma_mask = &vcdev->dma_mask;
> +	/* we are fine with common virtio infrastructure using 64 bit DMA */
> +	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
> +	if (ret) {
> +		dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n");

Drop the trailing period?

> +		goto out_free;
> +	}
> +
>  	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
>  				   GFP_DMA | GFP_KERNEL);
>  	if (!vcdev->config_block) {

(...)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>

Also 5.2 material, I think.


* Re: [PATCH 03/10] virtio/s390: enable packed ring
  2019-04-26 18:32   ` Halil Pasic
  (?)
@ 2019-05-03  9:44   ` Cornelia Huck
  2019-05-05 15:13       ` Thomas Huth
  -1 siblings, 1 reply; 182+ messages in thread
From: Cornelia Huck @ 2019-05-03  9:44 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 26 Apr 2019 20:32:38 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Nothing precludes to accepting  VIRTIO_F_RING_PACKED any more.

"precludes us from accepting"

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index 42832a164546..6d989c360f38 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -773,10 +773,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev)
>  static void ccw_transport_features(struct virtio_device *vdev)
>  {
>  	/*
> -	 * There shouldn't be anything that precludes supporting packed.
> -	 * TODO: Remove the limitation after having another look into this.
> +	 * Currently nothing to do here.
>  	 */
> -	__virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED);
>  }
>  
>  static int virtio_ccw_finalize_features(struct virtio_device *vdev)

Not sure whether we should merge this into the previous patch instead.
Anyway, I think we can go ahead with that for 5.2 as well.

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


* Re: [PATCH 00/10] s390: virtio: support protected virtualization
  2019-04-26 18:32 ` Halil Pasic
                   ` (10 preceding siblings ...)
  (?)
@ 2019-05-03  9:55 ` Cornelia Huck
  2019-05-03 10:03   ` Juergen Gross
                     ` (2 more replies)
  -1 siblings, 3 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-03  9:55 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 26 Apr 2019 20:32:35 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Enhanced virtualization protection technology may require the use of
> bounce buffers for I/O. While support for this was built into the virtio
> core, virtio-ccw wasn't changed accordingly.
> 
> Some background on technology (not part of this series) and the
> terminology used.
> 
> * Protected Virtualization (PV):
> 
> Protected Virtualization guarantees, that non-shared memory of a  guest
> that operates in PV mode private to that guest. I.e. any attempts by the
> hypervisor or other guests to access it will result in an exception. If
> supported by the environment (machine, KVM, guest VM) a guest can decide
> to change into PV mode by doing the appropriate ultravisor calls. Unlike
> some other enhanced virtualization protection technology, 

I think that sentence misses its second part?

> 
> * Ultravisor:
> 
> A hardware/firmware entity that manages PV guests, and polices access to
> their memory. A PV guest prospect needs to interact with the ultravisor,
> to enter PV mode, and potentially to share pages (for I/O which should
> be encrypted by the guest). A guest interacts with the ultravisor via so
> called ultravisor calls. A hypervisor needs to interact with the
> ultravisor to facilitate interpretation, emulation and swapping. A
> hypervisor  interacts with the ultravisor via ultravisor calls and via
> the SIE state description. Generally the ultravisor sanitizes hypervisor
> inputs so that the guest can not be corrupted (except for denial of
> service.
> 
> 
> What needs to be done
> =====================
> 
> Thus what needs to be done to bring virtio-ccw up to speed with respect
> to protected virtualization is:
> * use some 'new' common virtio stuff

Doing this makes sense regardless of the protected virtualization use
case, and I think we should go ahead and merge those patches for 5.2.

> * make sure that virtio-ccw specific stuff uses shared memory when
>   talking to the hypervisor (except control/communication blocks like ORB,
>   these are handled by the ultravisor)

TBH, I'm still a bit hazy on what needs to use shared memory and what
doesn't.

> * make sure the DMA API does what is necessary to talk through shared
>   memory if we are a protected virtualization guest.
> * make sure the common IO layer plays along as well (airqs, sense).
> 
> 
> Important notes
> ================
> 
> * This patch set is based on Martins features branch
>  (git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git branch
>  'features').
> 
> * Documentation is still very sketchy. I'm committed to improving this,
>   but I'm currently hampered by some dependencies currently.  

I understand, but I think this really needs more doc; also for people
who want to understand the code in the future.

Unfortunately lack of doc also hampers others in reviewing this :/

> 
> * The existing naming in the common infrastructure (kernel internal
>   interfaces) is pretty much based on the AMD SEV terminology. Thus the
>   names aren't always perfect. There might be merit to changing these
>   names to more abstract ones. I did not put much thought into that at
>   the current stage.
> 
> * Testing: Please use iommu_platform=on for any virtio devices you are
>   going to test this code with (so virtio actually uses the DMA API).
> 
> Change log
> ==========
> 
> RFC --> v1:
> * Fixed bugs found by Connie (may_reduce and handling reduced,  warning,
>   split move -- thanks Connie!).
> * Fixed console bug found by Sebastian (thanks Sebastian!).
> * Removed the completely useless duplicate of dma-mapping.h spotted by
>   Christoph (thanks Christoph!).
> * Don't use the global DMA pool for subchannel and ccw device
>   owned memory as requested by Sebastian. Consequences:
> 	* Both subchannel and ccw devices have their dma masks
> 	now (both specifying 31 bit addressable)
> 	* We require at least 2 DMA pages per ccw device now, most of
> 	this memory is wasted though.
> 	* DMA memory allocated by virtio is also 31 bit addressable now
>         as virtio uses the parent (which is the ccw device).
> * Enabled packed ring.
> * Rebased onto Martins feature branch; using the actual uv (ultravisor)
>   interface instead of TODO comments.
> * Added some explanations to the cover letter (Connie, David).
> * Squashed a couple of patches together and fixed some text stuff. 
> 
> Looking forward to your review, or any other type of input.
> 
> Halil Pasic (10):
>   virtio/s390: use vring_create_virtqueue
>   virtio/s390: DMA support for virtio-ccw
>   virtio/s390: enable packed ring
>   s390/mm: force swiotlb for protected virtualization
>   s390/cio: introduce DMA pools to cio
>   s390/cio: add basic protected virtualization support
>   s390/airq: use DMA memory for adapter interrupts
>   virtio/s390: add indirection to indicators access
>   virtio/s390: use DMA memory for ccw I/O and classic notifiers
>   virtio/s390: make airq summary indicators DMA
> 
>  arch/s390/Kconfig                   |   5 +
>  arch/s390/include/asm/airq.h        |   2 +
>  arch/s390/include/asm/ccwdev.h      |   4 +
>  arch/s390/include/asm/cio.h         |  11 ++
>  arch/s390/include/asm/mem_encrypt.h |  18 +++
>  arch/s390/mm/init.c                 |  50 +++++++
>  drivers/s390/cio/airq.c             |  18 ++-
>  drivers/s390/cio/ccwreq.c           |   8 +-
>  drivers/s390/cio/cio.h              |   1 +
>  drivers/s390/cio/css.c              | 101 +++++++++++++
>  drivers/s390/cio/device.c           |  65 +++++++--
>  drivers/s390/cio/device_fsm.c       |  40 +++---
>  drivers/s390/cio/device_id.c        |  18 +--
>  drivers/s390/cio/device_ops.c       |  21 ++-
>  drivers/s390/cio/device_pgid.c      |  20 +--
>  drivers/s390/cio/device_status.c    |  24 ++--
>  drivers/s390/cio/io_sch.h           |  21 ++-
>  drivers/s390/virtio/virtio_ccw.c    | 275 +++++++++++++++++++-----------------
>  include/linux/virtio.h              |  17 ---
>  19 files changed, 499 insertions(+), 220 deletions(-)
>  create mode 100644 arch/s390/include/asm/mem_encrypt.h
> 


* Re: [PATCH 00/10] s390: virtio: support protected virtualization
  2019-05-03  9:55 ` [PATCH 00/10] s390: virtio: support protected virtualization Cornelia Huck
@ 2019-05-03 10:03   ` Juergen Gross
  2019-05-03 13:33     ` Cornelia Huck
  2019-05-04 13:58     ` Halil Pasic
  2 siblings, 0 replies; 182+ messages in thread
From: Juergen Gross @ 2019-05-03 10:03 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Thomas Huth, Claudio Imbrenda, kvm,
	Sebastian Ott, Michael S. Tsirkin, Farhan Ali, Eric Farman,
	virtualization, Christoph Hellwig, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On 03/05/2019 11:55, Cornelia Huck wrote:
> On Fri, 26 Apr 2019 20:32:35 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
>> Enhanced virtualization protection technology may require the use of
>> bounce buffers for I/O. While support for this was built into the virtio
>> core, virtio-ccw wasn't changed accordingly.
>>
>> Some background on technology (not part of this series) and the
>> terminology used.
>>
>> * Protected Virtualization (PV):

Uuh, you are aware that "PV" in virtualization environments is used for
"Para-Virtualization" (e.g. in Xen) today? I believe you are risking a
major misunderstanding here.


Juergen


* Re: [PATCH 00/10] s390: virtio: support protected virtualization
  2019-05-03  9:55 ` [PATCH 00/10] s390: virtio: support protected virtualization Cornelia Huck
@ 2019-05-03 13:33     ` Cornelia Huck
  2019-05-03 13:33     ` Cornelia Huck
  2019-05-04 13:58     ` Halil Pasic
  2 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-03 13:33 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 3 May 2019 11:55:11 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri, 26 Apr 2019 20:32:35 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 

> > What needs to be done
> > =====================
> > 
> > Thus what needs to be done to bring virtio-ccw up to speed with respect
> > to protected virtualization is:
> > * use some 'new' common virtio stuff
> 
> Doing this makes sense regardless of the protected virtualization use
> case, and I think we should go ahead and merge those patches for 5.2.
> 
> > * make sure that virtio-ccw specific stuff uses shared memory when
> >   talking to the hypervisor (except control/communication blocks like ORB,
> >   these are handled by the ultravisor)
> 
> TBH, I'm still a bit hazy on what needs to use shared memory and what
> doesn't.
> 
> > * make sure the DMA API does what is necessary to talk through shared
> >   memory if we are a protected virtualization guest.
> > * make sure the common IO layer plays along as well (airqs, sense).

I've skimmed some more over the rest of the patches, but I think this
needs review by someone else.

For example, I'm not sure what the semantics of using the main css
device as the dma device are, as I'm not sufficiently familiar with the
intricacies of the dma infrastructure. Combine this with a lack of
documentation of the hardware architecture, and I think that others can
provide much better feedback on this than I am able to.


* Re: [PATCH 01/10] virtio/s390: use vring_create_virtqueue
  2019-05-03  9:17   ` Cornelia Huck
@ 2019-05-03 20:04       ` Michael S. Tsirkin
  0 siblings, 0 replies; 182+ messages in thread
From: Michael S. Tsirkin @ 2019-05-03 20:04 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Halil Pasic, kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christoph Hellwig, Thomas Huth,
	Christian Borntraeger, Viktor Mihajlovski, Vasily Gorbik,
	Janosch Frank, Claudio Imbrenda, Farhan Ali, Eric Farman

On Fri, May 03, 2019 at 11:17:24AM +0200, Cornelia Huck wrote:
> On Fri, 26 Apr 2019 20:32:36 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> > establishes a new way of allocating virtqueues (as a part of the effort
> > that taught DMA to virtio rings).
> > 
> > In the future we will want virtio-ccw to use the DMA API as well.
> > 
> > Let us switch from the legacy method of allocating virtqueues to
> > vring_create_virtqueue() as the first step into that direction.
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
> >  1 file changed, 11 insertions(+), 19 deletions(-)
> 
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> 
> I'd vote for merging this patch right away for 5.2.

So which tree is this going through? mine?

-- 
MST

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 00/10] s390: virtio: support protected virtualization
  2019-05-03  9:55 ` [PATCH 00/10] s390: virtio: support protected virtualization Cornelia Huck
@ 2019-05-04 13:58     ` Halil Pasic
  2019-05-03 13:33     ` Cornelia Huck
  2019-05-04 13:58     ` Halil Pasic
  2 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-04 13:58 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 3 May 2019 11:55:11 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri, 26 Apr 2019 20:32:35 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > Enhanced virtualization protection technology may require the use of
> > bounce buffers for I/O. While support for this was built into the virtio
> > core, virtio-ccw wasn't changed accordingly.
> > 
> > Some background on technology (not part of this series) and the
> > terminology used.
> > 
> > * Protected Virtualization (PV):
> > 
> > Protected Virtualization guarantees that non-shared memory of a guest
> > that operates in PV mode is private to that guest. I.e. any attempts by the
> > hypervisor or other guests to access it will result in an exception. If
> > supported by the environment (machine, KVM, guest VM) a guest can decide
> > to change into PV mode by doing the appropriate ultravisor calls. Unlike
> > some other enhanced virtualization protection technology, 
> 
> I think that sentence misses its second part?
>

I wanted to kill the whole sentence, but killed only a part of
it. :( Sorry. If anything, the sentence was only significant for judging
how well some inherited names fit.
  
> > 
> > * Ultravisor:
> > 
> > A hardware/firmware entity that manages PV guests, and polices access to
> > their memory. A PV guest prospect needs to interact with the ultravisor,
> > to enter PV mode, and potentially to share pages (for I/O which should
> > be encrypted by the guest). A guest interacts with the ultravisor via so
> > called ultravisor calls. A hypervisor needs to interact with the
> > ultravisor to facilitate interpretation, emulation and swapping. A
> > hypervisor  interacts with the ultravisor via ultravisor calls and via
> > the SIE state description. Generally the ultravisor sanitizes hypervisor
> > inputs so that the guest can not be corrupted (except for denial of
> > service).
> > 
> > 
> > What needs to be done
> > =====================
> > 
> > Thus what needs to be done to bring virtio-ccw up to speed with respect
> > to protected virtualization is:
> > * use some 'new' common virtio stuff
> 
> Doing this makes sense regardless of the protected virtualization use
> case, and I think we should go ahead and merge those patches for 5.2.
> 

I agree.

> > * make sure that virtio-ccw specific stuff uses shared memory when
> >   talking to the hypervisor (except control/communication blocks like ORB,
> >   these are handled by the ultravisor)
> 
> TBH, I'm still a bit hazy on what needs to use shared memory and what
> doesn't.
> 

It is all in the code :). To have complete and definitive answers here
we would need some sort of public UV architecture.

> > * make sure the DMA API does what is necessary to talk through shared
> >   memory if we are a protected virtualization guest.
> > * make sure the common IO layer plays along as well (airqs, sense).
> > 
> > 
> > Important notes
> > ================
> > 
> > * This patch set is based on Martins features branch
> >  (git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git branch
> >  'features').
> > 
> > * Documentation is still very sketchy. I'm committed to improving this,
> >   but I'm currently hampered by some dependencies.
> 
> I understand, but I think this really needs more doc; also for people
> who want to understand the code in the future.
> 
> Unfortunately lack of doc also hampers others in reviewing this :/
>

I'm not sure how much we can do on the doc front. Without a complete
architecture, one basically needs to trust the guys with access to the
architecture.

Many thanks for your feedback. Regards,
Halil

[..]

^ permalink raw reply	[flat|nested] 182+ messages in thread
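[Editor's aside: the bounce-buffer requirement discussed in the cover letter can be sketched with a tiny self-contained C model. This is purely illustrative — `shared_pool`, `device_read`, and `guest_send` are invented names, not s390 or kernel APIs: the hypervisor-side device may only touch designated shared memory, so guest-private data has to be copied ("bounced") into it first, which is the copy that swiotlb performs transparently underneath the DMA API for PV guests.]

```c
#include <string.h>
#include <stddef.h>

#define SHARED_SIZE 64

/* The only memory the hypervisor-side "device" may touch in this model;
 * everything else stands in for guest-private (protected) memory. */
static char shared_pool[SHARED_SIZE];

/* Toy device: it can read from the shared pool and nothing else. */
static void device_read(char *dst, size_t len)
{
	memcpy(dst, shared_pool, len);
}

/* Bounce-buffered transmit: the guest first copies its private data
 * into shared memory, then lets the device consume it. Handing the
 * device the private buffer directly would fail, since that memory
 * is inaccessible to the host. */
static void guest_send(const char *private_buf, size_t len)
{
	memcpy(shared_pool, private_buf, len);  /* the bounce copy */
}
```

In the real driver none of this is done by hand: once swiotlb is forced on for PV guests (patch 4), the bounce copy happens inside the `dma_map_*()` path.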

* Re: [PATCH 01/10] virtio/s390: use vring_create_virtqueue
  2019-05-03 20:04       ` Michael S. Tsirkin
@ 2019-05-04 14:03         ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-04 14:03 UTC (permalink / raw)
  To: Michael S. Tsirkin, Christian Borntraeger
  Cc: Cornelia Huck, kvm, linux-s390, Martin Schwidefsky,
	Sebastian Ott, virtualization, Christoph Hellwig, Thomas Huth,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On Fri, 3 May 2019 16:04:48 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Fri, May 03, 2019 at 11:17:24AM +0200, Cornelia Huck wrote:
> > On Fri, 26 Apr 2019 20:32:36 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> > 
> > > The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> > > establishes a new way of allocating virtqueues (as a part of the effort
> > > that taught DMA to virtio rings).
> > > 
> > > In the future we will want virtio-ccw to use the DMA API as well.
> > > 
> > > Let us switch from the legacy method of allocating virtqueues to
> > > vring_create_virtqueue() as the first step into that direction.
> > > 
> > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > > ---
> > >  drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
> > >  1 file changed, 11 insertions(+), 19 deletions(-)
> > 
> > Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> > 
> > I'd vote for merging this patch right away for 5.2.
> 
> So which tree is this going through? mine?
> 

Christian, what do you think? If the whole series is supposed to go in
in one go (which I hope it is), via Martin's tree could be the simplest
route IMHO.

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 01/10] virtio/s390: use vring_create_virtqueue
  2019-05-04 14:03         ` Halil Pasic
@ 2019-05-05 11:15           ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-05 11:15 UTC (permalink / raw)
  To: Halil Pasic, Michael S. Tsirkin
  Cc: Christian Borntraeger, kvm, linux-s390, Martin Schwidefsky,
	Sebastian Ott, virtualization, Christoph Hellwig, Thomas Huth,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On Sat, 4 May 2019 16:03:40 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Fri, 3 May 2019 16:04:48 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Fri, May 03, 2019 at 11:17:24AM +0200, Cornelia Huck wrote:  
> > > On Fri, 26 Apr 2019 20:32:36 +0200
> > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > >   
> > > > The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> > > > establishes a new way of allocating virtqueues (as a part of the effort
> > > > that taught DMA to virtio rings).
> > > > 
> > > > In the future we will want virtio-ccw to use the DMA API as well.
> > > > 
> > > > Let us switch from the legacy method of allocating virtqueues to
> > > > vring_create_virtqueue() as the first step into that direction.
> > > > 
> > > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > > > ---
> > > >  drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
> > > >  1 file changed, 11 insertions(+), 19 deletions(-)  
> > > 
> > > Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> > > 
> > > I'd vote for merging this patch right away for 5.2.  
> > 
> > So which tree is this going through? mine?
> >   
> 
> Christian, what do you think? If the whole series is supposed to go in
> in one go (which I hope it is), via Martin's tree could be the simplest
> route IMHO.


The first three patches are virtio(-ccw) only, and those are the ones
that I think are ready to go.

I'm not feeling comfortable going forward with the remainder as it
stands now; waiting for some other folks to give feedback. (They are
touching/interacting with code parts I'm not so familiar with, and lack
of documentation, while not the developers' fault, does not make it
easier.)

Michael, would you like to pick up 1-3 for your tree directly? That
looks like the easiest way.

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 03/10] virtio/s390: enable packed ring
  2019-05-03  9:44   ` Cornelia Huck
@ 2019-05-05 15:13       ` Thomas Huth
  0 siblings, 0 replies; 182+ messages in thread
From: Thomas Huth @ 2019-05-05 15:13 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Christian Borntraeger, Viktor Mihajlovski, Vasily Gorbik,
	Janosch Frank, Claudio Imbrenda, Farhan Ali, Eric Farman

On 03/05/2019 11.44, Cornelia Huck wrote:
> On Fri, 26 Apr 2019 20:32:38 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
>> Nothing precludes to accepting  VIRTIO_F_RING_PACKED any more.
> 
> "precludes us from accepting"
> 
>>
>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>> ---
>>  drivers/s390/virtio/virtio_ccw.c | 4 +---
>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
>> index 42832a164546..6d989c360f38 100644
>> --- a/drivers/s390/virtio/virtio_ccw.c
>> +++ b/drivers/s390/virtio/virtio_ccw.c
>> @@ -773,10 +773,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev)
>>  static void ccw_transport_features(struct virtio_device *vdev)
>>  {
>>  	/*
>> -	 * There shouldn't be anything that precludes supporting packed.
>> -	 * TODO: Remove the limitation after having another look into this.
>> +	 * Currently nothing to do here.
>>  	 */
>> -	__virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED);
>>  }
>>  
>>  static int virtio_ccw_finalize_features(struct virtio_device *vdev)
> 
> Not sure whether we should merge this into the previous patch instead.

In case you respin, I'd vote for squashing this into the previous patch
instead, especially since you've just added the comment in that patch.

Also, what about removing that function now completely? It's a static
function and only called once in this file, so IMHO it does not make
much sense to keep an empty function around. Or do you plan to add new
code here in the near future?

 Thomas

^ permalink raw reply	[flat|nested] 182+ messages in thread
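[Editor's aside: Thomas's question is the classic empty-hook trade-off. A minimal model of the pattern under discussion — illustrative C only, not the driver's real types or signatures: the transport hook gets a chance to mask feature bits it cannot support, and once nothing is masked its body becomes empty, which is what prompts the "remove it entirely?" question.]

```c
#include <stdint.h>

/* Feature bit number taken from the virtio spec (bit 34). */
#define VIRTIO_F_RING_PACKED (1ULL << 34)

/* Transport hook: clear any feature bits this transport cannot handle.
 * After the patch under discussion the body is empty, because packed
 * rings are no longer filtered out. */
static uint64_t ccw_transport_features(uint64_t features)
{
	/* previously: features &= ~VIRTIO_F_RING_PACKED; */
	return features;
}
```

Keeping the empty function preserves the obvious place to add future transport-specific masking; removing it keeps the file minimal until such masking is actually needed.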

* Re: [PATCH 01/10] virtio/s390: use vring_create_virtqueue
  2019-05-05 11:15           ` Cornelia Huck
@ 2019-05-07 13:58             ` Christian Borntraeger
  -1 siblings, 0 replies; 182+ messages in thread
From: Christian Borntraeger @ 2019-05-07 13:58 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic, Michael S. Tsirkin
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christoph Hellwig, Thomas Huth,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman



On 05.05.19 13:15, Cornelia Huck wrote:
> On Sat, 4 May 2019 16:03:40 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
>> On Fri, 3 May 2019 16:04:48 -0400
>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>
>>> On Fri, May 03, 2019 at 11:17:24AM +0200, Cornelia Huck wrote:  
>>>> On Fri, 26 Apr 2019 20:32:36 +0200
>>>> Halil Pasic <pasic@linux.ibm.com> wrote:
>>>>   
>>>>> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
>>>>> establishes a new way of allocating virtqueues (as a part of the effort
>>>>> that taught DMA to virtio rings).
>>>>>
>>>>> In the future we will want virtio-ccw to use the DMA API as well.
>>>>>
>>>>> Let us switch from the legacy method of allocating virtqueues to
>>>>> vring_create_virtqueue() as the first step into that direction.
>>>>>
>>>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>>>>> ---
>>>>>  drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
>>>>>  1 file changed, 11 insertions(+), 19 deletions(-)  
>>>>
>>>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>>>>
>>>> I'd vote for merging this patch right away for 5.2.  
>>>
>>> So which tree is this going through? mine?
>>>   
>>
>> Christian, what do you think? If the whole series is supposed to go in
>> in one go (which I hope it is), via Martin's tree could be the simplest
>> route IMHO.
> 
> 
> The first three patches are virtio(-ccw) only and the those are the ones
> that I think are ready to go.
> 
> I'm not feeling comfortable going forward with the remainder as it
> stands now; waiting for some other folks to give feedback. (They are
> touching/interacting with code parts I'm not so familiar with, and lack
> of documentation, while not the developers' fault, does not make it
> easier.)
> 
> Michael, would you like to pick up 1-3 for your tree directly? That
> looks like the easiest way.

Agreed. Michael, please pick 1-3.
We will continue to review patches 4 onward first and then see which tree is best.

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 04/10] s390/mm: force swiotlb for protected virtualization
  2019-04-26 18:32   ` Halil Pasic
@ 2019-05-08 13:15     ` Claudio Imbrenda
  -1 siblings, 0 replies; 182+ messages in thread
From: Claudio Imbrenda @ 2019-05-08 13:15 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank, Farhan Ali,
	Eric Farman

On Fri, 26 Apr 2019 20:32:39 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On s390, protected virtualization guests have to use bounced I/O
> buffers.  That requires some plumbing.
> 
> Let us make sure that any device that uses the DMA API with direct ops
> correctly is spared the problems that a hypervisor attempting
> I/O to a non-shared page would bring.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  arch/s390/Kconfig                   |  4 +++
>  arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++
>  arch/s390/mm/init.c                 | 50 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 72 insertions(+)
>  create mode 100644 arch/s390/include/asm/mem_encrypt.h
> 
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index 1c3fcf19c3af..5500d05d4d53 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -1,4 +1,7 @@
>  # SPDX-License-Identifier: GPL-2.0
> +config ARCH_HAS_MEM_ENCRYPT
> +        def_bool y
> +
>  config MMU
>  	def_bool y
>  
> @@ -191,6 +194,7 @@ config S390
>  	select ARCH_HAS_SCALED_CPUTIME
>  	select VIRT_TO_BUS
>  	select HAVE_NMI
> +	select SWIOTLB
>  
>  
>  config SCHED_OMIT_FRAME_POINTER
> diff --git a/arch/s390/include/asm/mem_encrypt.h b/arch/s390/include/asm/mem_encrypt.h
> new file mode 100644
> index 000000000000..0898c09a888c
> --- /dev/null
> +++ b/arch/s390/include/asm/mem_encrypt.h
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef S390_MEM_ENCRYPT_H__
> +#define S390_MEM_ENCRYPT_H__
> +
> +#ifndef __ASSEMBLY__
> +
> +#define sme_me_mask	0ULL

This is rather ugly, but I understand why it's there

> +
> +static inline bool sme_active(void) { return false; }
> +extern bool sev_active(void);
> +
> +int set_memory_encrypted(unsigned long addr, int numpages);
> +int set_memory_decrypted(unsigned long addr, int numpages);
> +
> +#endif	/* __ASSEMBLY__ */
> +
> +#endif	/* S390_MEM_ENCRYPT_H__ */
> +
> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
> index 3e82f66d5c61..7e3cbd15dcfa 100644
> --- a/arch/s390/mm/init.c
> +++ b/arch/s390/mm/init.c
> @@ -18,6 +18,7 @@
>  #include <linux/mman.h>
>  #include <linux/mm.h>
>  #include <linux/swap.h>
> +#include <linux/swiotlb.h>
>  #include <linux/smp.h>
>  #include <linux/init.h>
>  #include <linux/pagemap.h>
> @@ -29,6 +30,7 @@
>  #include <linux/export.h>
>  #include <linux/cma.h>
>  #include <linux/gfp.h>
> +#include <linux/dma-mapping.h>
>  #include <asm/processor.h>
>  #include <linux/uaccess.h>
>  #include <asm/pgtable.h>
> @@ -42,6 +44,8 @@
>  #include <asm/sclp.h>
>  #include <asm/set_memory.h>
>  #include <asm/kasan.h>
> +#include <asm/dma-mapping.h>
> +#include <asm/uv.h>
>  
>  pgd_t swapper_pg_dir[PTRS_PER_PGD] __section(.bss..swapper_pg_dir);
>  
> @@ -126,6 +130,50 @@ void mark_rodata_ro(void)
>  	pr_info("Write protected read-only-after-init data: %luk\n",
> size >> 10); }
>  
> +int set_memory_encrypted(unsigned long addr, int numpages)
> +{
> +	int i;
> +
> +	/* make all pages shared, (swiotlb, dma_free) */

This is a copy-paste typo, I think? (should be UNshared?)
Also, it doesn't make ALL pages unshared, but only those specified by
the parameters.

with this fixed:
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>

> +	for (i = 0; i < numpages; ++i) {
> +		uv_remove_shared(addr);
> +		addr += PAGE_SIZE;
> +	}
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(set_memory_encrypted);
> +
> +int set_memory_decrypted(unsigned long addr, int numpages)
> +{
> +	int i;
> +	/* make all pages shared (swiotlb, dma_alloca) */

same here with ALL

> +	for (i = 0; i < numpages; ++i) {
> +		uv_set_shared(addr);
> +		addr += PAGE_SIZE;
> +	}
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(set_memory_decrypted);
> +
> +/* are we a protected virtualization guest? */
> +bool sev_active(void)

This is also ugly. The correct solution would probably be to refactor
everything, including all the AMD SEV code... let's not go there.

> +{
> +	return is_prot_virt_guest();
> +}
> +EXPORT_SYMBOL_GPL(sev_active);
> +
> +/* protected virtualization */
> +static void pv_init(void)
> +{
> +	if (!sev_active())

can't you just use is_prot_virt_guest here?

> +		return;
> +
> +	/* make sure bounce buffers are shared */
> +	swiotlb_init(1);
> +	swiotlb_update_mem_attributes();
> +	swiotlb_force = SWIOTLB_FORCE;
> +}
> +
>  void __init mem_init(void)
>  {
>  	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
> @@ -134,6 +182,8 @@ void __init mem_init(void)
>  	set_max_mapnr(max_low_pfn);
>          high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
>  
> +	pv_init();
> +
>  	/* Setup guest page hinting */
>  	cmma_init();
>  

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-04-26 18:32   ` Halil Pasic
@ 2019-05-08 13:18     ` Sebastian Ott
  -1 siblings, 0 replies; 182+ messages in thread
From: Sebastian Ott @ 2019-05-08 13:18 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman


On Fri, 26 Apr 2019, Halil Pasic wrote:
> @@ -224,6 +228,9 @@ struct subchannel *css_alloc_subchannel(struct subchannel_id schid,
>  	INIT_WORK(&sch->todo_work, css_sch_todo);
>  	sch->dev.release = &css_subchannel_release;
>  	device_initialize(&sch->dev);
> +	sch->dma_mask = css_dev_dma_mask;
> +	sch->dev.dma_mask = &sch->dma_mask;
> +	sch->dev.coherent_dma_mask = sch->dma_mask;

Could we do:
	sch->dev.dma_mask = &sch->dev.coherent_dma_mask;
	sch->dev.coherent_dma_mask = css_dev_dma_mask;
?

> +#define POOL_INIT_PAGES 1
> +static struct gen_pool *cio_dma_pool;
> +/* Currently cio supports only a single css */
> +#define  CIO_DMA_GFP (GFP_KERNEL | __GFP_ZERO)

__GFP_ZERO has no meaning with the DMA API (since all implementations do
an implicit zero initialization) but let's keep it for the sake of
documentation.

We need GFP_DMA here (which will return addresses < 2G on s390)!

> +void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size)
> +{
> +	if (!cpu_addr)
> +		return;
> +	memset(cpu_addr, 0, size);

Hm, normally I'd do the memset during alloc, not during free - but maybe
this makes more sense here with your use case in mind.

> @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
>  		unregister_reboot_notifier(&css_reboot_notifier);
>  		goto out_unregister;
>  	}
> +	cio_dma_pool_init();

This is too late for early devices (ccw console!).

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-04-26 18:32   ` Halil Pasic
@ 2019-05-08 13:46     ` Sebastian Ott
  -1 siblings, 0 replies; 182+ messages in thread
From: Sebastian Ott @ 2019-05-08 13:46 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman


On Fri, 26 Apr 2019, Halil Pasic wrote:
>  static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
>  {
[..]
> +	cdev->private = kzalloc(sizeof(struct ccw_device_private),
> +				GFP_KERNEL | GFP_DMA);

Do we still need GFP_DMA here (since we now have cdev->private->dma_area)?

> @@ -1062,6 +1082,14 @@ static int io_subchannel_probe(struct subchannel *sch)
>  	if (!io_priv)
>  		goto out_schedule;
>  
> +	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
> +				sizeof(*io_priv->dma_area),
> +				&io_priv->dma_area_dma, GFP_KERNEL);

This needs GFP_DMA.
You use a genpool for ccw_private->dma and not for iopriv->dma - looks
kinda inconsistent.

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-08 13:46     ` Sebastian Ott
@ 2019-05-08 13:54       ` Christoph Hellwig
  -1 siblings, 0 replies; 182+ messages in thread
From: Christoph Hellwig @ 2019-05-08 13:54 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: Halil Pasic, kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, May 08, 2019 at 03:46:42PM +0200, Sebastian Ott wrote:
> > +	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
> > +				sizeof(*io_priv->dma_area),
> > +				&io_priv->dma_area_dma, GFP_KERNEL);
> 
> This needs GFP_DMA.
> You use a genpool for ccw_private->dma and not for iopriv->dma - looks
> kinda inconsistent.

dma_alloc_* never needs GFP_DMA.  It selects the zone to allocate
from based on the coherent_dma_mask of the device.

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 07/10] s390/airq: use DMA memory for adapter interrupts
  2019-04-26 18:32   ` Halil Pasic
@ 2019-05-08 13:58     ` Sebastian Ott
  -1 siblings, 0 replies; 182+ messages in thread
From: Sebastian Ott @ 2019-05-08 13:58 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman


On Fri, 26 Apr 2019, Halil Pasic wrote:
> @@ -182,6 +190,8 @@ void airq_iv_release(struct airq_iv *iv)
>  	kfree(iv->ptr);
>  	kfree(iv->bitlock);
>  	kfree(iv->vector);

-  	kfree(iv->vector);

> +	dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits),
> +			  iv->vector, iv->vector_dma);
>  	kfree(iv->avail);
>  	kfree(iv);
>  }

Looks good to me but needs adaptation to the current code. Probably you can
just revert my changes introducing cacheline-aligned vectors since we now
use a whole page.

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-04-26 18:32   ` Halil Pasic
@ 2019-05-08 14:23     ` Pierre Morel
  -1 siblings, 0 replies; 182+ messages in thread
From: Pierre Morel @ 2019-05-08 14:23 UTC (permalink / raw)
  To: Halil Pasic, kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott
  Cc: Christoph Hellwig, Thomas Huth, Claudio Imbrenda, Janosch Frank,
	Vasily Gorbik, Michael S. Tsirkin, Farhan Ali, Eric Farman,
	virtualization, Viktor Mihajlovski

On 26/04/2019 20:32, Halil Pasic wrote:
> As virtio-ccw devices are channel devices, we need to use the dma area
> for any communication with the hypervisor.
> 
> This patch addresses the most basic stuff (mostly what is required for
> virtio-ccw), and does not take care of QDIO or any other devices.
> 
> An interesting side effect is that virtio structures are now going to
> get allocated in 31 bit addressable storage.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>   arch/s390/include/asm/ccwdev.h   |  4 +++
>   drivers/s390/cio/ccwreq.c        |  8 ++---
>   drivers/s390/cio/device.c        | 65 +++++++++++++++++++++++++++++++++-------
>   drivers/s390/cio/device_fsm.c    | 40 ++++++++++++-------------
>   drivers/s390/cio/device_id.c     | 18 +++++------
>   drivers/s390/cio/device_ops.c    | 21 +++++++++++--
>   drivers/s390/cio/device_pgid.c   | 20 ++++++-------
>   drivers/s390/cio/device_status.c | 24 +++++++--------
>   drivers/s390/cio/io_sch.h        | 21 +++++++++----
>   drivers/s390/virtio/virtio_ccw.c | 10 -------
>   10 files changed, 148 insertions(+), 83 deletions(-)
> 
> diff --git a/arch/s390/include/asm/ccwdev.h b/arch/s390/include/asm/ccwdev.h
> index a29dd430fb40..865ce1cb86d5 100644
> --- a/arch/s390/include/asm/ccwdev.h
> +++ b/arch/s390/include/asm/ccwdev.h
> @@ -226,6 +226,10 @@ extern int ccw_device_enable_console(struct ccw_device *);
>   extern void ccw_device_wait_idle(struct ccw_device *);
>   extern int ccw_device_force_console(struct ccw_device *);
> 
> +extern void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size);
> +extern void ccw_device_dma_free(struct ccw_device *cdev,
> +				void *cpu_addr, size_t size);
> +
>   int ccw_device_siosl(struct ccw_device *);
> 
>   extern void ccw_device_get_schid(struct ccw_device *, struct subchannel_id *);
> diff --git a/drivers/s390/cio/ccwreq.c b/drivers/s390/cio/ccwreq.c
> index 603268a33ea1..dafbceb311b3 100644
> --- a/drivers/s390/cio/ccwreq.c
> +++ b/drivers/s390/cio/ccwreq.c
> @@ -63,7 +63,7 @@ static void ccwreq_stop(struct ccw_device *cdev, int rc)
>   		return;
>   	req->done = 1;
>   	ccw_device_set_timeout(cdev, 0);
> -	memset(&cdev->private->irb, 0, sizeof(struct irb));
> +	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
>   	if (rc && rc != -ENODEV && req->drc)
>   		rc = req->drc;
>   	req->callback(cdev, req->data, rc);
> @@ -86,7 +86,7 @@ static void ccwreq_do(struct ccw_device *cdev)
>   			continue;
>   		}
>   		/* Perform start function. */
> -		memset(&cdev->private->irb, 0, sizeof(struct irb));
> +		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
>   		rc = cio_start(sch, cp, (u8) req->mask);
>   		if (rc == 0) {
>   			/* I/O started successfully. */
> @@ -169,7 +169,7 @@ int ccw_request_cancel(struct ccw_device *cdev)
>    */
>   static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb)
>   {
> -	struct irb *irb = &cdev->private->irb;
> +	struct irb *irb = &cdev->private->dma_area->irb;
>   	struct cmd_scsw *scsw = &irb->scsw.cmd;
>   	enum uc_todo todo;
> 
> @@ -187,7 +187,7 @@ static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb)
>   		CIO_TRACE_EVENT(2, "sensedata");
>   		CIO_HEX_EVENT(2, &cdev->private->dev_id,
>   			      sizeof(struct ccw_dev_id));
> -		CIO_HEX_EVENT(2, &cdev->private->irb.ecw, SENSE_MAX_COUNT);
> +		CIO_HEX_EVENT(2, &cdev->private->dma_area->irb.ecw, SENSE_MAX_COUNT);
>   		/* Check for command reject. */
>   		if (irb->ecw[0] & SNS0_CMD_REJECT)
>   			return IO_REJECTED;
> diff --git a/drivers/s390/cio/device.c b/drivers/s390/cio/device.c
> index 1540229a37bb..a3310ee14a4a 100644
> --- a/drivers/s390/cio/device.c
> +++ b/drivers/s390/cio/device.c
> @@ -24,6 +24,7 @@
>   #include <linux/timer.h>
>   #include <linux/kernel_stat.h>
>   #include <linux/sched/signal.h>
> +#include <linux/dma-mapping.h>
> 
>   #include <asm/ccwdev.h>
>   #include <asm/cio.h>
> @@ -687,6 +688,9 @@ ccw_device_release(struct device *dev)
>   	struct ccw_device *cdev;
> 
>   	cdev = to_ccwdev(dev);
> +	cio_gp_dma_free(cdev->private->dma_pool, cdev->private->dma_area,
> +			sizeof(*cdev->private->dma_area));
> +	cio_gp_dma_destroy(cdev->private->dma_pool, &cdev->dev);
>   	/* Release reference of parent subchannel. */
>   	put_device(cdev->dev.parent);
>   	kfree(cdev->private);
> @@ -696,15 +700,31 @@ ccw_device_release(struct device *dev)
>   static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
>   {
>   	struct ccw_device *cdev;
> +	struct gen_pool *dma_pool;
> 
>   	cdev  = kzalloc(sizeof(*cdev), GFP_KERNEL);
> -	if (cdev) {
> -		cdev->private = kzalloc(sizeof(struct ccw_device_private),
> -					GFP_KERNEL | GFP_DMA);
> -		if (cdev->private)
> -			return cdev;
> -	}
> +	if (!cdev)
> +		goto err_cdev;
> +	cdev->private = kzalloc(sizeof(struct ccw_device_private),
> +				GFP_KERNEL | GFP_DMA);
> +	if (!cdev->private)
> +		goto err_priv;
> +	cdev->dev.dma_mask = &cdev->private->dma_mask;
> +	*cdev->dev.dma_mask = *sch->dev.dma_mask;
> +	cdev->dev.coherent_dma_mask = sch->dev.coherent_dma_mask;
> +	dma_pool = cio_gp_dma_create(&cdev->dev, 1);
> +	cdev->private->dma_pool = dma_pool;
> +	cdev->private->dma_area = cio_gp_dma_zalloc(dma_pool, &cdev->dev,
> +					sizeof(*cdev->private->dma_area));
> +	if (!cdev->private->dma_area)
> +		goto err_dma_area;
> +	return cdev;
> +err_dma_area:
> +	cio_gp_dma_destroy(dma_pool, &cdev->dev);
> +	kfree(cdev->private);
> +err_priv:
>   	kfree(cdev);
> +err_cdev:
>   	return ERR_PTR(-ENOMEM);
>   }
> 
> @@ -884,7 +904,7 @@ io_subchannel_recog_done(struct ccw_device *cdev)
>   			wake_up(&ccw_device_init_wq);
>   		break;
>   	case DEV_STATE_OFFLINE:
> -		/*
> +		/*
>   		 * We can't register the device in interrupt context so
>   		 * we schedule a work item.
>   		 */
> @@ -1062,6 +1082,14 @@ static int io_subchannel_probe(struct subchannel *sch)
>   	if (!io_priv)
>   		goto out_schedule;
> 
> +	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
> +				sizeof(*io_priv->dma_area),
> +				&io_priv->dma_area_dma, GFP_KERNEL);
> +	if (!io_priv->dma_area) {
> +		kfree(io_priv);
> +		goto out_schedule;
> +	}
> +
>   	set_io_private(sch, io_priv);
>   	css_schedule_eval(sch->schid);
>   	return 0;
> @@ -1088,6 +1116,8 @@ static int io_subchannel_remove(struct subchannel *sch)
>   	set_io_private(sch, NULL);
>   	spin_unlock_irq(sch->lock);
>   out_free:
> +	dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
> +			  io_priv->dma_area, io_priv->dma_area_dma);
>   	kfree(io_priv);
>   	sysfs_remove_group(&sch->dev.kobj, &io_subchannel_attr_group);
>   	return 0;
> @@ -1593,20 +1623,31 @@ struct ccw_device * __init ccw_device_create_console(struct ccw_driver *drv)
>   		return ERR_CAST(sch);
> 
>   	io_priv = kzalloc(sizeof(*io_priv), GFP_KERNEL | GFP_DMA);
> -	if (!io_priv) {
> -		put_device(&sch->dev);
> -		return ERR_PTR(-ENOMEM);
> -	}
> +	if (!io_priv)
> +		goto err_priv;
> +	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
> +				sizeof(*io_priv->dma_area),
> +				&io_priv->dma_area_dma, GFP_KERNEL);
> +	if (!io_priv->dma_area)
> +		goto err_dma_area;
>   	set_io_private(sch, io_priv);
>   	cdev = io_subchannel_create_ccwdev(sch);
>   	if (IS_ERR(cdev)) {
>   		put_device(&sch->dev);
> +		dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
> +				  io_priv->dma_area, io_priv->dma_area_dma);
>   		kfree(io_priv);
>   		return cdev;
>   	}
>   	cdev->drv = drv;
>   	ccw_device_set_int_class(cdev);
>   	return cdev;
> +
> +err_dma_area:
> +		kfree(io_priv);
> +err_priv:
> +	put_device(&sch->dev);
> +	return ERR_PTR(-ENOMEM);
>   }
> 
>   void __init ccw_device_destroy_console(struct ccw_device *cdev)
> @@ -1617,6 +1658,8 @@ void __init ccw_device_destroy_console(struct ccw_device *cdev)
>   	set_io_private(sch, NULL);
>   	put_device(&sch->dev);
>   	put_device(&cdev->dev);
> +	dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
> +			  io_priv->dma_area, io_priv->dma_area_dma);
>   	kfree(io_priv);
>   }
> 
> diff --git a/drivers/s390/cio/device_fsm.c b/drivers/s390/cio/device_fsm.c
> index 9169af7dbb43..fea23b44795b 100644
> --- a/drivers/s390/cio/device_fsm.c
> +++ b/drivers/s390/cio/device_fsm.c
> @@ -67,8 +67,8 @@ static void ccw_timeout_log(struct ccw_device *cdev)
>   			       sizeof(struct tcw), 0);
>   	} else {
>   		printk(KERN_WARNING "cio: orb indicates command mode\n");
> -		if ((void *)(addr_t)orb->cmd.cpa == &private->sense_ccw ||
> -		    (void *)(addr_t)orb->cmd.cpa == cdev->private->iccws)
> +		if ((void *)(addr_t)orb->cmd.cpa == &private->dma_area->sense_ccw ||
> +		    (void *)(addr_t)orb->cmd.cpa == cdev->private->dma_area->iccws)
>   			printk(KERN_WARNING "cio: last channel program "
>   			       "(intern):\n");
>   		else
> @@ -143,18 +143,18 @@ ccw_device_cancel_halt_clear(struct ccw_device *cdev)
>   void ccw_device_update_sense_data(struct ccw_device *cdev)
>   {
>   	memset(&cdev->id, 0, sizeof(cdev->id));
> -	cdev->id.cu_type   = cdev->private->senseid.cu_type;
> -	cdev->id.cu_model  = cdev->private->senseid.cu_model;
> -	cdev->id.dev_type  = cdev->private->senseid.dev_type;
> -	cdev->id.dev_model = cdev->private->senseid.dev_model;
> +	cdev->id.cu_type   = cdev->private->dma_area->senseid.cu_type;
> +	cdev->id.cu_model  = cdev->private->dma_area->senseid.cu_model;
> +	cdev->id.dev_type  = cdev->private->dma_area->senseid.dev_type;
> +	cdev->id.dev_model = cdev->private->dma_area->senseid.dev_model;
>   }
> 
>   int ccw_device_test_sense_data(struct ccw_device *cdev)
>   {
> -	return cdev->id.cu_type == cdev->private->senseid.cu_type &&
> -		cdev->id.cu_model == cdev->private->senseid.cu_model &&
> -		cdev->id.dev_type == cdev->private->senseid.dev_type &&
> -		cdev->id.dev_model == cdev->private->senseid.dev_model;
> +	return cdev->id.cu_type == cdev->private->dma_area->senseid.cu_type &&
> +		cdev->id.cu_model == cdev->private->dma_area->senseid.cu_model &&
> +		cdev->id.dev_type == cdev->private->dma_area->senseid.dev_type &&
> +		cdev->id.dev_model == cdev->private->dma_area->senseid.dev_model;
>   }
> 
>   /*
> @@ -342,7 +342,7 @@ ccw_device_done(struct ccw_device *cdev, int state)
>   		cio_disable_subchannel(sch);
> 
>   	/* Reset device status. */
> -	memset(&cdev->private->irb, 0, sizeof(struct irb));
> +	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
> 
>   	cdev->private->state = state;
> 
* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
@ 2019-05-08 14:23     ` Pierre Morel
  0 siblings, 0 replies; 182+ messages in thread
From: Pierre Morel @ 2019-05-08 14:23 UTC (permalink / raw)
  To: Halil Pasic, kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott
  Cc: Thomas Huth, Viktor Mihajlovski, Janosch Frank, Vasily Gorbik,
	Michael S. Tsirkin, Farhan Ali, Eric Farman, virtualization,
	Christoph Hellwig, Claudio Imbrenda

On 26/04/2019 20:32, Halil Pasic wrote:
> As virtio-ccw devices are channel devices, we need to use the dma area
> for any communication with the hypervisor.
> 
> This patch addresses the most basic stuff (mostly what is required for
> virtio-ccw), and does not take care of QDIO or any other devices.
> 
> An interesting side effect is that virtio structures are now going to
> get allocated in 31 bit addressable storage.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>   arch/s390/include/asm/ccwdev.h   |  4 +++
>   drivers/s390/cio/ccwreq.c        |  8 ++---
>   drivers/s390/cio/device.c        | 65 +++++++++++++++++++++++++++++++++-------
>   drivers/s390/cio/device_fsm.c    | 40 ++++++++++++-------------
>   drivers/s390/cio/device_id.c     | 18 +++++------
>   drivers/s390/cio/device_ops.c    | 21 +++++++++++--
>   drivers/s390/cio/device_pgid.c   | 20 ++++++-------
>   drivers/s390/cio/device_status.c | 24 +++++++--------
>   drivers/s390/cio/io_sch.h        | 21 +++++++++----
>   drivers/s390/virtio/virtio_ccw.c | 10 -------
>   10 files changed, 148 insertions(+), 83 deletions(-)
> 
> diff --git a/arch/s390/include/asm/ccwdev.h b/arch/s390/include/asm/ccwdev.h
> index a29dd430fb40..865ce1cb86d5 100644
> --- a/arch/s390/include/asm/ccwdev.h
> +++ b/arch/s390/include/asm/ccwdev.h
> @@ -226,6 +226,10 @@ extern int ccw_device_enable_console(struct ccw_device *);
>   extern void ccw_device_wait_idle(struct ccw_device *);
>   extern int ccw_device_force_console(struct ccw_device *);
> 
> +extern void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size);
> +extern void ccw_device_dma_free(struct ccw_device *cdev,
> +				void *cpu_addr, size_t size);
> +
>   int ccw_device_siosl(struct ccw_device *);
> 
>   extern void ccw_device_get_schid(struct ccw_device *, struct subchannel_id *);
> diff --git a/drivers/s390/cio/ccwreq.c b/drivers/s390/cio/ccwreq.c
> index 603268a33ea1..dafbceb311b3 100644
> --- a/drivers/s390/cio/ccwreq.c
> +++ b/drivers/s390/cio/ccwreq.c
> @@ -63,7 +63,7 @@ static void ccwreq_stop(struct ccw_device *cdev, int rc)
>   		return;
>   	req->done = 1;
>   	ccw_device_set_timeout(cdev, 0);
> -	memset(&cdev->private->irb, 0, sizeof(struct irb));
> +	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
>   	if (rc && rc != -ENODEV && req->drc)
>   		rc = req->drc;
>   	req->callback(cdev, req->data, rc);
> @@ -86,7 +86,7 @@ static void ccwreq_do(struct ccw_device *cdev)
>   			continue;
>   		}
>   		/* Perform start function. */
> -		memset(&cdev->private->irb, 0, sizeof(struct irb));
> +		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
>   		rc = cio_start(sch, cp, (u8) req->mask);
>   		if (rc == 0) {
>   			/* I/O started successfully. */
> @@ -169,7 +169,7 @@ int ccw_request_cancel(struct ccw_device *cdev)
>    */
>   static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb)
>   {
> -	struct irb *irb = &cdev->private->irb;
> +	struct irb *irb = &cdev->private->dma_area->irb;
>   	struct cmd_scsw *scsw = &irb->scsw.cmd;
>   	enum uc_todo todo;
> 
> @@ -187,7 +187,7 @@ static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb)
>   		CIO_TRACE_EVENT(2, "sensedata");
>   		CIO_HEX_EVENT(2, &cdev->private->dev_id,
>   			      sizeof(struct ccw_dev_id));
> -		CIO_HEX_EVENT(2, &cdev->private->irb.ecw, SENSE_MAX_COUNT);
> +		CIO_HEX_EVENT(2, &cdev->private->dma_area->irb.ecw, SENSE_MAX_COUNT);
>   		/* Check for command reject. */
>   		if (irb->ecw[0] & SNS0_CMD_REJECT)
>   			return IO_REJECTED;
> diff --git a/drivers/s390/cio/device.c b/drivers/s390/cio/device.c
> index 1540229a37bb..a3310ee14a4a 100644
> --- a/drivers/s390/cio/device.c
> +++ b/drivers/s390/cio/device.c
> @@ -24,6 +24,7 @@
>   #include <linux/timer.h>
>   #include <linux/kernel_stat.h>
>   #include <linux/sched/signal.h>
> +#include <linux/dma-mapping.h>
> 
>   #include <asm/ccwdev.h>
>   #include <asm/cio.h>
> @@ -687,6 +688,9 @@ ccw_device_release(struct device *dev)
>   	struct ccw_device *cdev;
> 
>   	cdev = to_ccwdev(dev);
> +	cio_gp_dma_free(cdev->private->dma_pool, cdev->private->dma_area,
> +			sizeof(*cdev->private->dma_area));
> +	cio_gp_dma_destroy(cdev->private->dma_pool, &cdev->dev);
>   	/* Release reference of parent subchannel. */
>   	put_device(cdev->dev.parent);
>   	kfree(cdev->private);
> @@ -696,15 +700,31 @@ ccw_device_release(struct device *dev)
>   static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
>   {
>   	struct ccw_device *cdev;
> +	struct gen_pool *dma_pool;
> 
>   	cdev  = kzalloc(sizeof(*cdev), GFP_KERNEL);
> -	if (cdev) {
> -		cdev->private = kzalloc(sizeof(struct ccw_device_private),
> -					GFP_KERNEL | GFP_DMA);
> -		if (cdev->private)
> -			return cdev;
> -	}
> +	if (!cdev)
> +		goto err_cdev;
> +	cdev->private = kzalloc(sizeof(struct ccw_device_private),
> +				GFP_KERNEL | GFP_DMA);
> +	if (!cdev->private)
> +		goto err_priv;
> +	cdev->dev.dma_mask = &cdev->private->dma_mask;
> +	*cdev->dev.dma_mask = *sch->dev.dma_mask;
> +	cdev->dev.coherent_dma_mask = sch->dev.coherent_dma_mask;
> +	dma_pool = cio_gp_dma_create(&cdev->dev, 1);
> +	cdev->private->dma_pool = dma_pool;
> +	cdev->private->dma_area = cio_gp_dma_zalloc(dma_pool, &cdev->dev,
> +					sizeof(*cdev->private->dma_area));
> +	if (!cdev->private->dma_area)
> +		goto err_dma_area;
> +	return cdev;
> +err_dma_area:
> +	cio_gp_dma_destroy(dma_pool, &cdev->dev);
> +	kfree(cdev->private);
> +err_priv:
>   	kfree(cdev);
> +err_cdev:
>   	return ERR_PTR(-ENOMEM);
>   }
> 
> @@ -884,7 +904,7 @@ io_subchannel_recog_done(struct ccw_device *cdev)
>   			wake_up(&ccw_device_init_wq);
>   		break;
>   	case DEV_STATE_OFFLINE:
> -		/*
> +		/*
>   		 * We can't register the device in interrupt context so
>   		 * we schedule a work item.
>   		 */
> @@ -1062,6 +1082,14 @@ static int io_subchannel_probe(struct subchannel *sch)
>   	if (!io_priv)
>   		goto out_schedule;
> 
> +	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
> +				sizeof(*io_priv->dma_area),
> +				&io_priv->dma_area_dma, GFP_KERNEL);
> +	if (!io_priv->dma_area) {
> +		kfree(io_priv);
> +		goto out_schedule;
> +	}
> +
>   	set_io_private(sch, io_priv);
>   	css_schedule_eval(sch->schid);
>   	return 0;
> @@ -1088,6 +1116,8 @@ static int io_subchannel_remove(struct subchannel *sch)
>   	set_io_private(sch, NULL);
>   	spin_unlock_irq(sch->lock);
>   out_free:
> +	dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
> +			  io_priv->dma_area, io_priv->dma_area_dma);
>   	kfree(io_priv);
>   	sysfs_remove_group(&sch->dev.kobj, &io_subchannel_attr_group);
>   	return 0;
> @@ -1593,20 +1623,31 @@ struct ccw_device * __init ccw_device_create_console(struct ccw_driver *drv)
>   		return ERR_CAST(sch);
> 
>   	io_priv = kzalloc(sizeof(*io_priv), GFP_KERNEL | GFP_DMA);
> -	if (!io_priv) {
> -		put_device(&sch->dev);
> -		return ERR_PTR(-ENOMEM);
> -	}
> +	if (!io_priv)
> +		goto err_priv;
> +	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
> +				sizeof(*io_priv->dma_area),
> +				&io_priv->dma_area_dma, GFP_KERNEL);
> +	if (!io_priv->dma_area)
> +		goto err_dma_area;
>   	set_io_private(sch, io_priv);
>   	cdev = io_subchannel_create_ccwdev(sch);
>   	if (IS_ERR(cdev)) {
>   		put_device(&sch->dev);
> +		dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
> +				  io_priv->dma_area, io_priv->dma_area_dma);
>   		kfree(io_priv);
>   		return cdev;
>   	}
>   	cdev->drv = drv;
>   	ccw_device_set_int_class(cdev);
>   	return cdev;
> +
> +err_dma_area:
> +	kfree(io_priv);
> +err_priv:
> +	put_device(&sch->dev);
> +	return ERR_PTR(-ENOMEM);
>   }
> 
>   void __init ccw_device_destroy_console(struct ccw_device *cdev)
> @@ -1617,6 +1658,8 @@ void __init ccw_device_destroy_console(struct ccw_device *cdev)
>   	set_io_private(sch, NULL);
>   	put_device(&sch->dev);
>   	put_device(&cdev->dev);
> +	dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
> +			  io_priv->dma_area, io_priv->dma_area_dma);
>   	kfree(io_priv);
>   }
> 
> diff --git a/drivers/s390/cio/device_fsm.c b/drivers/s390/cio/device_fsm.c
> index 9169af7dbb43..fea23b44795b 100644
> --- a/drivers/s390/cio/device_fsm.c
> +++ b/drivers/s390/cio/device_fsm.c
> @@ -67,8 +67,8 @@ static void ccw_timeout_log(struct ccw_device *cdev)
>   			       sizeof(struct tcw), 0);
>   	} else {
>   		printk(KERN_WARNING "cio: orb indicates command mode\n");
> -		if ((void *)(addr_t)orb->cmd.cpa == &private->sense_ccw ||
> -		    (void *)(addr_t)orb->cmd.cpa == cdev->private->iccws)
> +		if ((void *)(addr_t)orb->cmd.cpa == &private->dma_area->sense_ccw ||
> +		    (void *)(addr_t)orb->cmd.cpa == cdev->private->dma_area->iccws)
>   			printk(KERN_WARNING "cio: last channel program "
>   			       "(intern):\n");
>   		else
> @@ -143,18 +143,18 @@ ccw_device_cancel_halt_clear(struct ccw_device *cdev)
>   void ccw_device_update_sense_data(struct ccw_device *cdev)
>   {
>   	memset(&cdev->id, 0, sizeof(cdev->id));
> -	cdev->id.cu_type   = cdev->private->senseid.cu_type;
> -	cdev->id.cu_model  = cdev->private->senseid.cu_model;
> -	cdev->id.dev_type  = cdev->private->senseid.dev_type;
> -	cdev->id.dev_model = cdev->private->senseid.dev_model;
> +	cdev->id.cu_type   = cdev->private->dma_area->senseid.cu_type;
> +	cdev->id.cu_model  = cdev->private->dma_area->senseid.cu_model;
> +	cdev->id.dev_type  = cdev->private->dma_area->senseid.dev_type;
> +	cdev->id.dev_model = cdev->private->dma_area->senseid.dev_model;
>   }
> 
>   int ccw_device_test_sense_data(struct ccw_device *cdev)
>   {
> -	return cdev->id.cu_type == cdev->private->senseid.cu_type &&
> -		cdev->id.cu_model == cdev->private->senseid.cu_model &&
> -		cdev->id.dev_type == cdev->private->senseid.dev_type &&
> -		cdev->id.dev_model == cdev->private->senseid.dev_model;
> +	return cdev->id.cu_type == cdev->private->dma_area->senseid.cu_type &&
> +		cdev->id.cu_model == cdev->private->dma_area->senseid.cu_model &&
> +		cdev->id.dev_type == cdev->private->dma_area->senseid.dev_type &&
> +		cdev->id.dev_model == cdev->private->dma_area->senseid.dev_model;
>   }
> 
>   /*
> @@ -342,7 +342,7 @@ ccw_device_done(struct ccw_device *cdev, int state)
>   		cio_disable_subchannel(sch);
> 
>   	/* Reset device status. */
> -	memset(&cdev->private->irb, 0, sizeof(struct irb));
> +	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
> 
>   	cdev->private->state = state;
> 
> @@ -509,13 +509,13 @@ void ccw_device_verify_done(struct ccw_device *cdev, int err)
>   		ccw_device_done(cdev, DEV_STATE_ONLINE);
>   		/* Deliver fake irb to device driver, if needed. */
>   		if (cdev->private->flags.fake_irb) {
> -			create_fake_irb(&cdev->private->irb,
> +			create_fake_irb(&cdev->private->dma_area->irb,
>   					cdev->private->flags.fake_irb);
>   			cdev->private->flags.fake_irb = 0;
>   			if (cdev->handler)
>   				cdev->handler(cdev, cdev->private->intparm,
> -					      &cdev->private->irb);
> -			memset(&cdev->private->irb, 0, sizeof(struct irb));
> +					      &cdev->private->dma_area->irb);
> +			memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
>   		}
>   		ccw_device_report_path_events(cdev);
>   		ccw_device_handle_broken_paths(cdev);
> @@ -672,7 +672,7 @@ ccw_device_online_verify(struct ccw_device *cdev, enum dev_event dev_event)
> 
>   	if (scsw_actl(&sch->schib.scsw) != 0 ||
>   	    (scsw_stctl(&sch->schib.scsw) & SCSW_STCTL_STATUS_PEND) ||
> -	    (scsw_stctl(&cdev->private->irb.scsw) & SCSW_STCTL_STATUS_PEND)) {
> +	    (scsw_stctl(&cdev->private->dma_area->irb.scsw) & SCSW_STCTL_STATUS_PEND)) {
>   		/*
>   		 * No final status yet or final status not yet delivered
>   		 * to the device driver. Can't do path verification now,
> @@ -719,7 +719,7 @@ static int ccw_device_call_handler(struct ccw_device *cdev)
>   	 *  - fast notification was requested (primary status)
>   	 *  - unsolicited interrupts
>   	 */
> -	stctl = scsw_stctl(&cdev->private->irb.scsw);
> +	stctl = scsw_stctl(&cdev->private->dma_area->irb.scsw);
>   	ending_status = (stctl & SCSW_STCTL_SEC_STATUS) ||
>   		(stctl == (SCSW_STCTL_ALERT_STATUS | SCSW_STCTL_STATUS_PEND)) ||
>   		(stctl == SCSW_STCTL_STATUS_PEND);
> @@ -735,9 +735,9 @@ static int ccw_device_call_handler(struct ccw_device *cdev)
> 
>   	if (cdev->handler)
>   		cdev->handler(cdev, cdev->private->intparm,
> -			      &cdev->private->irb);
> +			      &cdev->private->dma_area->irb);
> 
> -	memset(&cdev->private->irb, 0, sizeof(struct irb));
> +	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
>   	return 1;
>   }
> 
> @@ -759,7 +759,7 @@ ccw_device_irq(struct ccw_device *cdev, enum dev_event dev_event)
>   			/* Unit check but no sense data. Need basic sense. */
>   			if (ccw_device_do_sense(cdev, irb) != 0)
>   				goto call_handler_unsol;
> -			memcpy(&cdev->private->irb, irb, sizeof(struct irb));
> +			memcpy(&cdev->private->dma_area->irb, irb, sizeof(struct irb));
>   			cdev->private->state = DEV_STATE_W4SENSE;
>   			cdev->private->intparm = 0;
>   			return;
> @@ -842,7 +842,7 @@ ccw_device_w4sense(struct ccw_device *cdev, enum dev_event dev_event)
>   	if (scsw_fctl(&irb->scsw) &
>   	    (SCSW_FCTL_CLEAR_FUNC | SCSW_FCTL_HALT_FUNC)) {
>   		cdev->private->flags.dosense = 0;
> -		memset(&cdev->private->irb, 0, sizeof(struct irb));
> +		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
>   		ccw_device_accumulate_irb(cdev, irb);
>   		goto call_handler;
>   	}
> diff --git a/drivers/s390/cio/device_id.c b/drivers/s390/cio/device_id.c
> index f6df83a9dfbb..ea8a0fc6c0b6 100644
> --- a/drivers/s390/cio/device_id.c
> +++ b/drivers/s390/cio/device_id.c
> @@ -99,7 +99,7 @@ static int diag210_to_senseid(struct senseid *senseid, struct diag210 *diag)
>   static int diag210_get_dev_info(struct ccw_device *cdev)
>   {
>   	struct ccw_dev_id *dev_id = &cdev->private->dev_id;
> -	struct senseid *senseid = &cdev->private->senseid;
> +	struct senseid *senseid = &cdev->private->dma_area->senseid;
>   	struct diag210 diag_data;
>   	int rc;
> 
> @@ -134,8 +134,8 @@ static int diag210_get_dev_info(struct ccw_device *cdev)
>   static void snsid_init(struct ccw_device *cdev)
>   {
>   	cdev->private->flags.esid = 0;
> -	memset(&cdev->private->senseid, 0, sizeof(cdev->private->senseid));
> -	cdev->private->senseid.cu_type = 0xffff;
> +	memset(&cdev->private->dma_area->senseid, 0, sizeof(cdev->private->dma_area->senseid));
> +	cdev->private->dma_area->senseid.cu_type = 0xffff;
>   }
> 
>   /*
> @@ -143,16 +143,16 @@ static void snsid_init(struct ccw_device *cdev)
>    */
>   static int snsid_check(struct ccw_device *cdev, void *data)
>   {
> -	struct cmd_scsw *scsw = &cdev->private->irb.scsw.cmd;
> +	struct cmd_scsw *scsw = &cdev->private->dma_area->irb.scsw.cmd;
>   	int len = sizeof(struct senseid) - scsw->count;
> 
>   	/* Check for incomplete SENSE ID data. */
>   	if (len < SENSE_ID_MIN_LEN)
>   		goto out_restart;
> -	if (cdev->private->senseid.cu_type == 0xffff)
> +	if (cdev->private->dma_area->senseid.cu_type == 0xffff)
>   		goto out_restart;
>   	/* Check for incompatible SENSE ID data. */
> -	if (cdev->private->senseid.reserved != 0xff)
> +	if (cdev->private->dma_area->senseid.reserved != 0xff)
>   		return -EOPNOTSUPP;
>   	/* Check for extended-identification information. */
>   	if (len > SENSE_ID_BASIC_LEN)
> @@ -170,7 +170,7 @@ static int snsid_check(struct ccw_device *cdev, void *data)
>   static void snsid_callback(struct ccw_device *cdev, void *data, int rc)
>   {
>   	struct ccw_dev_id *id = &cdev->private->dev_id;
> -	struct senseid *senseid = &cdev->private->senseid;
> +	struct senseid *senseid = &cdev->private->dma_area->senseid;
>   	int vm = 0;
> 
>   	if (rc && MACHINE_IS_VM) {
> @@ -200,7 +200,7 @@ void ccw_device_sense_id_start(struct ccw_device *cdev)
>   {
>   	struct subchannel *sch = to_subchannel(cdev->dev.parent);
>   	struct ccw_request *req = &cdev->private->req;
> -	struct ccw1 *cp = cdev->private->iccws;
> +	struct ccw1 *cp = cdev->private->dma_area->iccws;
> 
>   	CIO_TRACE_EVENT(4, "snsid");
>   	CIO_HEX_EVENT(4, &cdev->private->dev_id, sizeof(cdev->private->dev_id));
> @@ -208,7 +208,7 @@ void ccw_device_sense_id_start(struct ccw_device *cdev)
>   	snsid_init(cdev);
>   	/* Channel program setup. */
>   	cp->cmd_code	= CCW_CMD_SENSE_ID;
> -	cp->cda		= (u32) (addr_t) &cdev->private->senseid;
> +	cp->cda		= (u32) (addr_t) &cdev->private->dma_area->senseid;
>   	cp->count	= sizeof(struct senseid);
>   	cp->flags	= CCW_FLAG_SLI;
>   	/* Request setup. */
> diff --git a/drivers/s390/cio/device_ops.c b/drivers/s390/cio/device_ops.c
> index 4435ae0b3027..be4acfa9265a 100644
> --- a/drivers/s390/cio/device_ops.c
> +++ b/drivers/s390/cio/device_ops.c
> @@ -429,8 +429,8 @@ struct ciw *ccw_device_get_ciw(struct ccw_device *cdev, __u32 ct)
>   	if (cdev->private->flags.esid == 0)
>   		return NULL;
>   	for (ciw_cnt = 0; ciw_cnt < MAX_CIWS; ciw_cnt++)
> -		if (cdev->private->senseid.ciw[ciw_cnt].ct == ct)
> -			return cdev->private->senseid.ciw + ciw_cnt;
> +		if (cdev->private->dma_area->senseid.ciw[ciw_cnt].ct == ct)
> +			return cdev->private->dma_area->senseid.ciw + ciw_cnt;
>   	return NULL;
>   }
> 
> @@ -699,6 +699,23 @@ void ccw_device_get_schid(struct ccw_device *cdev, struct subchannel_id *schid)
>   }
>   EXPORT_SYMBOL_GPL(ccw_device_get_schid);
> 
> +/**
> + * Allocate zeroed, DMA-coherent, 31-bit addressable memory from
> + * the subchannel's DMA pool. The maximal supported allocation
> + * size is PAGE_SIZE.
> + */
> +void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size)
> +{
> +	return cio_gp_dma_zalloc(cdev->private->dma_pool, &cdev->dev, size);
> +}
> +EXPORT_SYMBOL(ccw_device_dma_zalloc);
> +
> +void ccw_device_dma_free(struct ccw_device *cdev, void *cpu_addr, size_t size)
> +{
> +	cio_gp_dma_free(cdev->private->dma_pool, cpu_addr, size);
> +}
> +EXPORT_SYMBOL(ccw_device_dma_free);
> +
>   EXPORT_SYMBOL(ccw_device_set_options_mask);
>   EXPORT_SYMBOL(ccw_device_set_options);
>   EXPORT_SYMBOL(ccw_device_clear_options);
> diff --git a/drivers/s390/cio/device_pgid.c b/drivers/s390/cio/device_pgid.c
> index d30a3babf176..e97baa89cbf8 100644
> --- a/drivers/s390/cio/device_pgid.c
> +++ b/drivers/s390/cio/device_pgid.c
> @@ -57,7 +57,7 @@ static void verify_done(struct ccw_device *cdev, int rc)
>   static void nop_build_cp(struct ccw_device *cdev)
>   {
>   	struct ccw_request *req = &cdev->private->req;
> -	struct ccw1 *cp = cdev->private->iccws;
> +	struct ccw1 *cp = cdev->private->dma_area->iccws;
> 
>   	cp->cmd_code	= CCW_CMD_NOOP;
>   	cp->cda		= 0;
> @@ -134,9 +134,9 @@ static void nop_callback(struct ccw_device *cdev, void *data, int rc)
>   static void spid_build_cp(struct ccw_device *cdev, u8 fn)
>   {
>   	struct ccw_request *req = &cdev->private->req;
> -	struct ccw1 *cp = cdev->private->iccws;
> +	struct ccw1 *cp = cdev->private->dma_area->iccws;
>   	int i = pathmask_to_pos(req->lpm);
> -	struct pgid *pgid = &cdev->private->pgid[i];
> +	struct pgid *pgid = &cdev->private->dma_area->pgid[i];
> 
>   	pgid->inf.fc	= fn;
>   	cp->cmd_code	= CCW_CMD_SET_PGID;
> @@ -300,7 +300,7 @@ static int pgid_cmp(struct pgid *p1, struct pgid *p2)
>   static void pgid_analyze(struct ccw_device *cdev, struct pgid **p,
>   			 int *mismatch, u8 *reserved, u8 *reset)
>   {
> -	struct pgid *pgid = &cdev->private->pgid[0];
> +	struct pgid *pgid = &cdev->private->dma_area->pgid[0];
>   	struct pgid *first = NULL;
>   	int lpm;
>   	int i;
> @@ -342,7 +342,7 @@ static u8 pgid_to_donepm(struct ccw_device *cdev)
>   		lpm = 0x80 >> i;
>   		if ((cdev->private->pgid_valid_mask & lpm) == 0)
>   			continue;
> -		pgid = &cdev->private->pgid[i];
> +		pgid = &cdev->private->dma_area->pgid[i];
>   		if (sch->opm & lpm) {
>   			if (pgid->inf.ps.state1 != SNID_STATE1_GROUPED)
>   				continue;
> @@ -368,7 +368,7 @@ static void pgid_fill(struct ccw_device *cdev, struct pgid *pgid)
>   	int i;
> 
>   	for (i = 0; i < 8; i++)
> -		memcpy(&cdev->private->pgid[i], pgid, sizeof(struct pgid));
> +		memcpy(&cdev->private->dma_area->pgid[i], pgid, sizeof(struct pgid));
>   }
> 
>   /*
> @@ -435,12 +435,12 @@ static void snid_done(struct ccw_device *cdev, int rc)
>   static void snid_build_cp(struct ccw_device *cdev)
>   {
>   	struct ccw_request *req = &cdev->private->req;
> -	struct ccw1 *cp = cdev->private->iccws;
> +	struct ccw1 *cp = cdev->private->dma_area->iccws;
>   	int i = pathmask_to_pos(req->lpm);
> 
>   	/* Channel program setup. */
>   	cp->cmd_code	= CCW_CMD_SENSE_PGID;
> -	cp->cda		= (u32) (addr_t) &cdev->private->pgid[i];
> +	cp->cda		= (u32) (addr_t) &cdev->private->dma_area->pgid[i];
>   	cp->count	= sizeof(struct pgid);
>   	cp->flags	= CCW_FLAG_SLI;
>   	req->cp		= cp;
> @@ -516,7 +516,7 @@ static void verify_start(struct ccw_device *cdev)
>   	sch->lpm = sch->schib.pmcw.pam;
> 
>   	/* Initialize PGID data. */
> -	memset(cdev->private->pgid, 0, sizeof(cdev->private->pgid));
> +	memset(cdev->private->dma_area->pgid, 0, sizeof(cdev->private->dma_area->pgid));
>   	cdev->private->pgid_valid_mask = 0;
>   	cdev->private->pgid_todo_mask = sch->schib.pmcw.pam;
>   	cdev->private->path_notoper_mask = 0;
> @@ -626,7 +626,7 @@ struct stlck_data {
>   static void stlck_build_cp(struct ccw_device *cdev, void *buf1, void *buf2)
>   {
>   	struct ccw_request *req = &cdev->private->req;
> -	struct ccw1 *cp = cdev->private->iccws;
> +	struct ccw1 *cp = cdev->private->dma_area->iccws;
> 
>   	cp[0].cmd_code = CCW_CMD_STLCK;
>   	cp[0].cda = (u32) (addr_t) buf1;
> diff --git a/drivers/s390/cio/device_status.c b/drivers/s390/cio/device_status.c
> index 7d5c7892b2c4..a9aabde604f4 100644
> --- a/drivers/s390/cio/device_status.c
> +++ b/drivers/s390/cio/device_status.c
> @@ -79,15 +79,15 @@ ccw_device_accumulate_ecw(struct ccw_device *cdev, struct irb *irb)
>   	 * are condition that have to be met for the extended control
>   	 * bit to have meaning. Sick.
>   	 */
> -	cdev->private->irb.scsw.cmd.ectl = 0;
> +	cdev->private->dma_area->irb.scsw.cmd.ectl = 0;
>   	if ((irb->scsw.cmd.stctl & SCSW_STCTL_ALERT_STATUS) &&
>   	    !(irb->scsw.cmd.stctl & SCSW_STCTL_INTER_STATUS))
> -		cdev->private->irb.scsw.cmd.ectl = irb->scsw.cmd.ectl;
> +		cdev->private->dma_area->irb.scsw.cmd.ectl = irb->scsw.cmd.ectl;
>   	/* Check if extended control word is valid. */
> -	if (!cdev->private->irb.scsw.cmd.ectl)
> +	if (!cdev->private->dma_area->irb.scsw.cmd.ectl)
>   		return;
>   	/* Copy concurrent sense / model dependent information. */
> -	memcpy (&cdev->private->irb.ecw, irb->ecw, sizeof (irb->ecw));
> +	memcpy (&cdev->private->dma_area->irb.ecw, irb->ecw, sizeof (irb->ecw));


Nit: maybe you should take the opportunity to remove the blank before
the parenthesis.

Another nit: some lines are over 80 characters, too.

Just a first pass; I will go deeper later.

Regards,
Pierre

-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 08/10] virtio/s390: add indirection to indicators access
  2019-04-26 18:32   ` Halil Pasic
@ 2019-05-08 14:31     ` Pierre Morel
  -1 siblings, 0 replies; 182+ messages in thread
From: Pierre Morel @ 2019-05-08 14:31 UTC (permalink / raw)
  To: Halil Pasic, kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott
  Cc: virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On 26/04/2019 20:32, Halil Pasic wrote:
> This will come in handy soon when we pull out the indicators from
> virtio_ccw_device to a memory area that is shared with the hypervisor
> (in particular for protected virtualization guests).
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>   drivers/s390/virtio/virtio_ccw.c | 40 +++++++++++++++++++++++++---------------
>   1 file changed, 25 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index bb7a92316fc8..1f3e7d56924f 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -68,6 +68,16 @@ struct virtio_ccw_device {
>   	void *airq_info;
>   };
>   
> +static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
> +{
> +	return &vcdev->indicators;
> +}
> +
> +static inline unsigned long *indicators2(struct virtio_ccw_device *vcdev)
> +{
> +	return &vcdev->indicators2;
> +}
> +
>   struct vq_info_block_legacy {
>   	__u64 queue;
>   	__u32 align;
> @@ -337,17 +347,17 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>   		ccw->cda = (__u32)(unsigned long) thinint_area;
>   	} else {
>   		/* payload is the address of the indicators */
> -		indicatorp = kmalloc(sizeof(&vcdev->indicators),
> +		indicatorp = kmalloc(sizeof(indicators(vcdev)),
>   				     GFP_DMA | GFP_KERNEL);
>   		if (!indicatorp)
>   			return;
>   		*indicatorp = 0;
>   		ccw->cmd_code = CCW_CMD_SET_IND;
> -		ccw->count = sizeof(&vcdev->indicators);
> +		ccw->count = sizeof(indicators(vcdev));

This looks strange to me; it was already odd before this patch.
We are lucky that the indicators are unsigned long, so the size of the
pointer happens to match. Maybe just use sizeof(long)?

>   		ccw->cda = (__u32)(unsigned long) indicatorp;
>   	}
>   	/* Deregister indicators from host. */
> -	vcdev->indicators = 0;
> +	*indicators(vcdev) = 0;
>   	ccw->flags = 0;
>   	ret = ccw_io_helper(vcdev, ccw,
>   			    vcdev->is_thinint ?
> @@ -656,10 +666,10 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>   	 * We need a data area under 2G to communicate. Our payload is
>   	 * the address of the indicators.
>   	*/
> -	indicatorp = kmalloc(sizeof(&vcdev->indicators), GFP_DMA | GFP_KERNEL);
> +	indicatorp = kmalloc(sizeof(indicators(vcdev)), GFP_DMA | GFP_KERNEL);
>   	if (!indicatorp)
>   		goto out;
> -	*indicatorp = (unsigned long) &vcdev->indicators;
> +	*indicatorp = (unsigned long) indicators(vcdev);
>   	if (vcdev->is_thinint) {
>   		ret = virtio_ccw_register_adapter_ind(vcdev, vqs, nvqs, ccw);
>   		if (ret)
> @@ -668,21 +678,21 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>   	}
>   	if (!vcdev->is_thinint) {
>   		/* Register queue indicators with host. */
> -		vcdev->indicators = 0;
> +		*indicators(vcdev) = 0;
>   		ccw->cmd_code = CCW_CMD_SET_IND;
>   		ccw->flags = 0;
> -		ccw->count = sizeof(&vcdev->indicators);
> +		ccw->count = sizeof(indicators(vcdev));

same as before

>   		ccw->cda = (__u32)(unsigned long) indicatorp;
>   		ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_SET_IND);
>   		if (ret)
>   			goto out;
>   	}
>   	/* Register indicators2 with host for config changes */
> -	*indicatorp = (unsigned long) &vcdev->indicators2;
> -	vcdev->indicators2 = 0;
> +	*indicatorp = (unsigned long) indicators2(vcdev);
> +	*indicators2(vcdev) = 0;
>   	ccw->cmd_code = CCW_CMD_SET_CONF_IND;
>   	ccw->flags = 0;
> -	ccw->count = sizeof(&vcdev->indicators2);
> +	ccw->count = sizeof(indicators2(vcdev));

here too

>   	ccw->cda = (__u32)(unsigned long) indicatorp;
>   	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_SET_CONF_IND);
>   	if (ret)
> @@ -1092,17 +1102,17 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
>   			vcdev->err = -EIO;
>   	}
>   	virtio_ccw_check_activity(vcdev, activity);
> -	for_each_set_bit(i, &vcdev->indicators,
> -			 sizeof(vcdev->indicators) * BITS_PER_BYTE) {
> +	for_each_set_bit(i, indicators(vcdev),
> +			 sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
>   		/* The bit clear must happen before the vring kick. */
> -		clear_bit(i, &vcdev->indicators);
> +		clear_bit(i, indicators(vcdev));
>   		barrier();
>   		vq = virtio_ccw_vq_by_ind(vcdev, i);
>   		vring_interrupt(0, vq);
>   	}
> -	if (test_bit(0, &vcdev->indicators2)) {
> +	if (test_bit(0, indicators2(vcdev))) {
>   		virtio_config_changed(&vcdev->vdev);
> -		clear_bit(0, &vcdev->indicators2);
> +		clear_bit(0, indicators2(vcdev));
>   	}
>   }
>   
> 

Here again, just a quick check; I will look at the functional aspects later.

Regards,
Pierre


-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 09/10] virtio/s390: use DMA memory for ccw I/O and classic notifiers
  2019-04-26 18:32   ` Halil Pasic
@ 2019-05-08 14:46     ` Pierre Morel
  -1 siblings, 0 replies; 182+ messages in thread
From: Pierre Morel @ 2019-05-08 14:46 UTC (permalink / raw)
  To: Halil Pasic, kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott
  Cc: virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On 26/04/2019 20:32, Halil Pasic wrote:
> Until now, virtio-ccw could get away with not using the DMA API for the
> pieces of memory it does ccw I/O with. With protected virtualization this
> has to change, since the hypervisor needs to read, and sometimes also
> write, these pieces of memory.
> 
> The hypervisor is supposed to poke the classic notifiers, if these are
> used, out of band with regards to ccw I/O. So these need to be allocated
> as DMA memory (which is shared memory for protected virtualization
> guests).
> 
> Let us factor out everything from struct virtio_ccw_device that needs to
> be DMA memory in a satellite that is allocated as such.
> 
> Note: The control blocks of I/O instructions do not need to be shared.
> These are marshalled by the ultravisor.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>   drivers/s390/virtio/virtio_ccw.c | 177 +++++++++++++++++++++------------------
>   1 file changed, 96 insertions(+), 81 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index 1f3e7d56924f..613b18001a0c 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -46,9 +46,15 @@ struct vq_config_block {
>   #define VIRTIO_CCW_CONFIG_SIZE 0x100
>   /* same as PCI config space size, should be enough for all drivers */
>   
> +struct vcdev_dma_area {
> +	unsigned long indicators;
> +	unsigned long indicators2;
> +	struct vq_config_block config_block;
> +	__u8 status;
> +};
> +
>   struct virtio_ccw_device {
>   	struct virtio_device vdev;
> -	__u8 *status;
>   	__u8 config[VIRTIO_CCW_CONFIG_SIZE];
>   	struct ccw_device *cdev;
>   	__u32 curr_io;
> @@ -58,24 +64,22 @@ struct virtio_ccw_device {
>   	spinlock_t lock;
>   	struct mutex io_lock; /* Serializes I/O requests */
>   	struct list_head virtqueues;
> -	unsigned long indicators;
> -	unsigned long indicators2;
> -	struct vq_config_block *config_block;
>   	bool is_thinint;
>   	bool going_away;
>   	bool device_lost;
>   	unsigned int config_ready;
>   	void *airq_info;
> +	struct vcdev_dma_area *dma_area;
>   };
>   
>   static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
>   {
> -	return &vcdev->indicators;
> +	return &vcdev->dma_area->indicators;
>   }
>   
>   static inline unsigned long *indicators2(struct virtio_ccw_device *vcdev)
>   {
> -	return &vcdev->indicators2;
> +	return &vcdev->dma_area->indicators2;
>   }
>   
>   struct vq_info_block_legacy {
> @@ -176,6 +180,22 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
>   	return container_of(vdev, struct virtio_ccw_device, vdev);
>   }
>   
> +static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size)
> +{
> +	return ccw_device_dma_zalloc(to_vc_device(vdev)->cdev, size);
> +}
> +
> +static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
> +				 void *cpu_addr)
> +{
> +	return ccw_device_dma_free(to_vc_device(vdev)->cdev, cpu_addr, size);
> +}
> +
> +#define vc_dma_alloc_struct(vdev, ptr) \
> +	({ptr = __vc_dma_alloc(vdev, sizeof(*(ptr))); })
> +#define vc_dma_free_struct(vdev, ptr) \
> +	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr))
> +
>   static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
>   {
>   	unsigned long i, flags;
> @@ -335,8 +355,7 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>   	struct airq_info *airq_info = vcdev->airq_info;
>   
>   	if (vcdev->is_thinint) {
> -		thinint_area = kzalloc(sizeof(*thinint_area),
> -				       GFP_DMA | GFP_KERNEL);
> +		vc_dma_alloc_struct(&vcdev->vdev, thinint_area);
>   		if (!thinint_area)
>   			return;
>   		thinint_area->summary_indicator =
> @@ -347,8 +366,8 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>   		ccw->cda = (__u32)(unsigned long) thinint_area;
>   	} else {
>   		/* payload is the address of the indicators */
> -		indicatorp = kmalloc(sizeof(indicators(vcdev)),
> -				     GFP_DMA | GFP_KERNEL);
> +		indicatorp = __vc_dma_alloc(&vcdev->vdev,
> +					   sizeof(indicators(vcdev)));

Should this be sizeof(long)?

This is a recurring error, but it is not an issue in practice because the
indicators are unsigned long, which has the same size as the pointer.

Regards,
Pierre

-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 10/10] virtio/s390: make airq summary indicators DMA
  2019-04-26 18:32   ` Halil Pasic
@ 2019-05-08 15:11     ` Pierre Morel
  -1 siblings, 0 replies; 182+ messages in thread
From: Pierre Morel @ 2019-05-08 15:11 UTC (permalink / raw)
  To: Halil Pasic, kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott
  Cc: virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On 26/04/2019 20:32, Halil Pasic wrote:
> Hypervisor needs to interact with the summary indicators, so these
> need to be DMA memory as well (at least for protected virtualization
> guests).
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>   drivers/s390/virtio/virtio_ccw.c | 24 +++++++++++++++++-------
>   1 file changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index 613b18001a0c..6058b07fea08 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -140,11 +140,17 @@ static int virtio_ccw_use_airq = 1;
>   
>   struct airq_info {
>   	rwlock_t lock;
> -	u8 summary_indicator;
> +	u8 summary_indicator_idx;
>   	struct airq_struct airq;
>   	struct airq_iv *aiv;
>   };
>   static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
> +static u8 *summary_indicators;
> +
> +static inline u8 *get_summary_indicator(struct airq_info *info)
> +{
> +	return summary_indicators + info->summary_indicator_idx;
> +}
>   
>   #define CCW_CMD_SET_VQ 0x13
>   #define CCW_CMD_VDEV_RESET 0x33
> @@ -225,7 +231,7 @@ static void virtio_airq_handler(struct airq_struct *airq)
>   			break;
>   		vring_interrupt(0, (void *)airq_iv_get_ptr(info->aiv, ai));
>   	}
> -	info->summary_indicator = 0;
> +	*(get_summary_indicator(info)) = 0;
>   	smp_wmb();
>   	/* Walk through indicators field, summary indicator not active. */
>   	for (ai = 0;;) {
> @@ -237,7 +243,8 @@ static void virtio_airq_handler(struct airq_struct *airq)
>   	read_unlock(&info->lock);
>   }
>   
> -static struct airq_info *new_airq_info(void)
> +/* call with airq_areas_lock held */
> +static struct airq_info *new_airq_info(int index)
>   {
>   	struct airq_info *info;
>   	int rc;
> @@ -252,7 +259,8 @@ static struct airq_info *new_airq_info(void)
>   		return NULL;
>   	}
>   	info->airq.handler = virtio_airq_handler;
> -	info->airq.lsi_ptr = &info->summary_indicator;
> +	info->summary_indicator_idx = index;
> +	info->airq.lsi_ptr = get_summary_indicator(info);
>   	info->airq.lsi_mask = 0xff;
>   	info->airq.isc = VIRTIO_AIRQ_ISC;
>   	rc = register_adapter_interrupt(&info->airq);
> @@ -273,8 +281,9 @@ static unsigned long get_airq_indicator(struct virtqueue *vqs[], int nvqs,
>   	unsigned long bit, flags;
>   
>   	for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
> +		/* TODO: this seems to be racy */

Yes, my opinion too: this was already racy before. In my opinion, we need
another patch in another series to fix this.

However, I am not sure about adding the comment.

>   		if (!airq_areas[i])
> -			airq_areas[i] = new_airq_info();
> +			airq_areas[i] = new_airq_info(i);
>   		info = airq_areas[i];
>   		if (!info)
>   			return 0;
> @@ -359,7 +368,7 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>   		if (!thinint_area)
>   			return;
>   		thinint_area->summary_indicator =
> -			(unsigned long) &airq_info->summary_indicator;
> +			(unsigned long) get_summary_indicator(airq_info);
>   		thinint_area->isc = VIRTIO_AIRQ_ISC;
>   		ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER;
>   		ccw->count = sizeof(*thinint_area);
> @@ -624,7 +633,7 @@ static int virtio_ccw_register_adapter_ind(struct virtio_ccw_device *vcdev,
>   	}
>   	info = vcdev->airq_info;
>   	thinint_area->summary_indicator =
> -		(unsigned long) &info->summary_indicator;
> +		(unsigned long) get_summary_indicator(info);
>   	thinint_area->isc = VIRTIO_AIRQ_ISC;
>   	ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER;
>   	ccw->flags = CCW_FLAG_SLI;
> @@ -1500,6 +1509,7 @@ static int __init virtio_ccw_init(void)
>   {
>   	/* parse no_auto string before we do anything further */
>   	no_auto_parse();
> +	summary_indicators = cio_dma_zalloc(MAX_AIRQ_AREAS);
>   	return ccw_driver_register(&virtio_ccw_driver);
>   }
>   device_initcall(virtio_ccw_init);
> 



-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 01/10] virtio/s390: use vring_create_virtqueue
  2019-05-07 13:58             ` Christian Borntraeger
@ 2019-05-08 20:12               ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-08 20:12 UTC (permalink / raw)
  To: Christian Borntraeger, Michael Mueller
  Cc: Cornelia Huck, Michael S. Tsirkin, kvm, linux-s390,
	Martin Schwidefsky, Sebastian Ott, virtualization,
	Christoph Hellwig, Thomas Huth, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Tue, 7 May 2019 15:58:12 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> 
> 
> On 05.05.19 13:15, Cornelia Huck wrote:
> > On Sat, 4 May 2019 16:03:40 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> > 
> >> On Fri, 3 May 2019 16:04:48 -0400
> >> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >>
> >>> On Fri, May 03, 2019 at 11:17:24AM +0200, Cornelia Huck wrote:  
> >>>> On Fri, 26 Apr 2019 20:32:36 +0200
> >>>> Halil Pasic <pasic@linux.ibm.com> wrote:
> >>>>   
> >>>>> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> >>>>> establishes a new way of allocating virtqueues (as a part of the effort
> >>>>> that taught DMA to virtio rings).
> >>>>>
> >>>>> In the future we will want virtio-ccw to use the DMA API as well.
> >>>>>
> >>>>> Let us switch from the legacy method of allocating virtqueues to
> >>>>> vring_create_virtqueue() as the first step into that direction.
> >>>>>
> >>>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> >>>>> ---
> >>>>>  drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
> >>>>>  1 file changed, 11 insertions(+), 19 deletions(-)  
> >>>>
> >>>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> >>>>
> >>>> I'd vote for merging this patch right away for 5.2.  
> >>>
> >>> So which tree is this going through? mine?
> >>>   
> >>
> >> Christian, what do you think? If the whole series is supposed to go in
> >> in one go (which I hope it is), via Martin's tree could be the simplest
> >> route IMHO.
> > 
> > 
> > The first three patches are virtio(-ccw) only and those are the ones
> > that I think are ready to go.
> > 
> > I'm not feeling comfortable going forward with the remainder as it
> > stands now; waiting for some other folks to give feedback. (They are
> > touching/interacting with code parts I'm not so familiar with, and lack
> > of documentation, while not the developers' fault, does not make it
> > easier.)
> > 
> > Michael, would you like to pick up 1-3 for your tree directly? That
> > looks like the easiest way.
> 
> Agreed. Michael please pick 1-3.
> We will continue to review 4- first and then see which tree is best.

Thanks Christian!

Guys, I broke my right arm last Thursday (2nd May). You may have
noticed that I was not as responsive as I am supposed to be.
Unfortunately this reduced responsiveness is going to persist for a
couple more weeks. Fortunately the folks from IBM, chiefly Michael
Mueller, are going to help me drive this -- thanks Michael! Because of
this, I would generally prefer making as few changes to this series as
necessary, and deferring as much of the beautifying as possible to
patches on top (possibly authored by somebody else).

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 01/10] virtio/s390: use vring_create_virtqueue
@ 2019-05-08 20:12               ` Halil Pasic
  0 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-08 20:12 UTC (permalink / raw)
  To: Christian Borntraeger, Michael Mueller
  Cc: Vasily Gorbik, linux-s390, Thomas Huth, Claudio Imbrenda, kvm,
	Sebastian Ott, Michael S. Tsirkin, Cornelia Huck, Eric Farman,
	virtualization, Christoph Hellwig, Martin Schwidefsky,
	Farhan Ali, Viktor Mihajlovski, Janosch Frank

On Tue, 7 May 2019 15:58:12 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> 
> 
> On 05.05.19 13:15, Cornelia Huck wrote:
> > On Sat, 4 May 2019 16:03:40 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> > 
> >> On Fri, 3 May 2019 16:04:48 -0400
> >> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >>
> >>> On Fri, May 03, 2019 at 11:17:24AM +0200, Cornelia Huck wrote:  
> >>>> On Fri, 26 Apr 2019 20:32:36 +0200
> >>>> Halil Pasic <pasic@linux.ibm.com> wrote:
> >>>>   
> >>>>> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> >>>>> establishes a new way of allocating virtqueues (as a part of the effort
> >>>>> that taught DMA to virtio rings).
> >>>>>
> >>>>> In the future we will want virtio-ccw to use the DMA API as well.
> >>>>>
> >>>>> Let us switch from the legacy method of allocating virtqueues to
> >>>>> vring_create_virtqueue() as the first step into that direction.
> >>>>>
> >>>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> >>>>> ---
> >>>>>  drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
> >>>>>  1 file changed, 11 insertions(+), 19 deletions(-)  
> >>>>
> >>>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> >>>>
> >>>> I'd vote for merging this patch right away for 5.2.  
> >>>
> >>> So which tree is this going through? mine?
> >>>   
> >>
> >> Christian, what do you think? If the whole series is supposed to go in
> >> in one go (which I hope it is), via Martin's tree could be the simplest
> >> route IMHO.
> > 
> > 
> > The first three patches are virtio(-ccw) only and those are the ones
> > that I think are ready to go.
> > 
> > I'm not feeling comfortable going forward with the remainder as it
> > stands now; waiting for some other folks to give feedback. (They are
> > touching/interacting with code parts I'm not so familiar with, and lack
> > of documentation, while not the developers' fault, does not make it
> > easier.)
> > 
> > Michael, would you like to pick up 1-3 for your tree directly? That
> > looks like the easiest way.
> 
> Agreed. Michael please pick 1-3.
> We will continue to review 4- first and then see which tree is best.

Thanks Christian!

Guys, I broke my right arm last Thursday (2nd May). You may have
noticed that I was not as responsive as I am supposed to be.
Unfortunately this reduced responsiveness is going to persist for a
couple more weeks. Fortunately the folks from IBM, chiefly Michael
Mueller, are going to help me drive this -- thanks Michael! Because of
this, I would generally prefer making as few changes to this series as
necessary, and deferring as much of the beautifying as possible to
patches on top (possibly authored by somebody else).

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-08 13:46     ` Sebastian Ott
@ 2019-05-08 21:08       ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-08 21:08 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 8 May 2019 15:46:42 +0200 (CEST)
Sebastian Ott <sebott@linux.ibm.com> wrote:

> 
> On Fri, 26 Apr 2019, Halil Pasic wrote:
> >  static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
> >  {
> [..]
> > +	cdev->private = kzalloc(sizeof(struct ccw_device_private),
> > +				GFP_KERNEL | GFP_DMA);
> 
> Do we still need GFP_DMA here (since we now have cdev->private->dma_area)?
> 

We probably do not. I kept GFP_DMA to keep the changes to a minimum.
Should changing this, in your opinion, be part of this patch?

> > @@ -1062,6 +1082,14 @@ static int io_subchannel_probe(struct subchannel *sch)
> >  	if (!io_priv)
> >  		goto out_schedule;
> >  
> > +	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
> > +				sizeof(*io_priv->dma_area),
> > +				&io_priv->dma_area_dma, GFP_KERNEL);
> 
> This needs GFP_DMA.

Christoph already answered this one. Thanks Christoph!

> You use a genpool for ccw_private->dma and not for iopriv->dma - looks
> kinda inconsistent.
> 

Please have a look at patch #9. A virtio-ccw device uses the genpool of
its ccw device (the pool from which ccw_private->dma is allocated) for
the ccw stuff it needs to do. AFAICT for a subchannel device and its API,
all the DMA memory we need is iopriv->dma. So my thought was that
constructing a genpool for that would be overkill.

Are you comfortable with this answer, or should we change something?

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
@ 2019-05-08 21:08       ` Halil Pasic
  0 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-08 21:08 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: Farhan Ali, linux-s390, Thomas Huth, Claudio Imbrenda,
	Vasily Gorbik, kvm, Michael S. Tsirkin, Cornelia Huck,
	Eric Farman, virtualization, Christoph Hellwig,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank

On Wed, 8 May 2019 15:46:42 +0200 (CEST)
Sebastian Ott <sebott@linux.ibm.com> wrote:

> 
> On Fri, 26 Apr 2019, Halil Pasic wrote:
> >  static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
> >  {
> [..]
> > +	cdev->private = kzalloc(sizeof(struct ccw_device_private),
> > +				GFP_KERNEL | GFP_DMA);
> 
> Do we still need GFP_DMA here (since we now have cdev->private->dma_area)?
> 

We probably do not. I kept GFP_DMA to keep the changes to a minimum.
Should changing this, in your opinion, be part of this patch?

> > @@ -1062,6 +1082,14 @@ static int io_subchannel_probe(struct subchannel *sch)
> >  	if (!io_priv)
> >  		goto out_schedule;
> >  
> > +	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
> > +				sizeof(*io_priv->dma_area),
> > +				&io_priv->dma_area_dma, GFP_KERNEL);
> 
> This needs GFP_DMA.

Christoph already answered this one. Thanks Christoph!

> You use a genpool for ccw_private->dma and not for iopriv->dma - looks
> kinda inconsistent.
> 

Please have a look at patch #9. A virtio-ccw device uses the genpool of
its ccw device (the pool from which ccw_private->dma is allocated) for
the ccw stuff it needs to do. AFAICT for a subchannel device and its API,
all the DMA memory we need is iopriv->dma. So my thought was that
constructing a genpool for that would be overkill.

Are you comfortable with this answer, or should we change something?

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-08 13:18     ` Sebastian Ott
@ 2019-05-08 21:22       ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-08 21:22 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller

On Wed, 8 May 2019 15:18:10 +0200 (CEST)
Sebastian Ott <sebott@linux.ibm.com> wrote:

> > +void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size)
> > +{
> > +	if (!cpu_addr)
> > +		return;
> > +	memset(cpu_addr, 0, size);  
> 
> Hm, normally I'd do the memset during alloc not during free - but maybe
> this makes more sense here with your usecase in mind.

I allocate the backing as zeroed, and zero the memory before putting it
back to the pool. So the stuff in the pool is always zeroed.

> 
> > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> >  		unregister_reboot_notifier(&css_reboot_notifier);
> >  		goto out_unregister;
> >  	}
> > +	cio_dma_pool_init();  
> 
> This is too late for early devices (ccw console!).

You have already raised a concern about this last time (thanks). I think
I've addressed the issue: the cio_dma_pool is only used by the airq
stuff. I don't think the ccw console needs it. Please have another look
at patch #6, and explain your concern in more detail if it persists.

(@Mimu: What do you think, am I wrong here?)

Thanks for having a look!

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
@ 2019-05-08 21:22       ` Halil Pasic
  0 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-08 21:22 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: Farhan Ali, linux-s390, Thomas Huth, Claudio Imbrenda,
	Vasily Gorbik, kvm, Michael S. Tsirkin, Cornelia Huck,
	Eric Farman, virtualization, Christoph Hellwig,
	Martin Schwidefsky, Michael Mueller, Viktor Mihajlovski,
	Janosch Frank

On Wed, 8 May 2019 15:18:10 +0200 (CEST)
Sebastian Ott <sebott@linux.ibm.com> wrote:

> > +void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size)
> > +{
> > +	if (!cpu_addr)
> > +		return;
> > +	memset(cpu_addr, 0, size);  
> 
> Hm, normally I'd do the memset during alloc not during free - but maybe
> this makes more sense here with your usecase in mind.

I allocate the backing as zeroed, and zero the memory before putting it
back to the pool. So the stuff in the pool is always zeroed.

> 
> > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> >  		unregister_reboot_notifier(&css_reboot_notifier);
> >  		goto out_unregister;
> >  	}
> > +	cio_dma_pool_init();  
> 
> This is too late for early devices (ccw console!).

You have already raised a concern about this last time (thanks). I think
I've addressed the issue: the cio_dma_pool is only used by the airq
stuff. I don't think the ccw console needs it. Please have another look
at patch #6, and explain your concern in more detail if it persists.

(@Mimu: What do you think, am I wrong here?)

Thanks for having a look!

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-08 21:22       ` Halil Pasic
@ 2019-05-09  8:40         ` Sebastian Ott
  -1 siblings, 0 replies; 182+ messages in thread
From: Sebastian Ott @ 2019-05-09  8:40 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller


On Wed, 8 May 2019, Halil Pasic wrote:
> On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:
> > > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> > >  		unregister_reboot_notifier(&css_reboot_notifier);
> > >  		goto out_unregister;
> > >  	}
> > > +	cio_dma_pool_init();  
> > 
> > This is too late for early devices (ccw console!).
> 
> You have already raised a concern about this last time (thanks). I think
> I've addressed the issue: the cio_dma_pool is only used by the airq
> stuff. I don't think the ccw console needs it. Please have another look
> at patch #6, and explain your concern in more detail if it persists.

Oh, you don't use the global pool for that - yes that should work.

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
@ 2019-05-09  8:40         ` Sebastian Ott
  0 siblings, 0 replies; 182+ messages in thread
From: Sebastian Ott @ 2019-05-09  8:40 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Farhan Ali, linux-s390, Thomas Huth, Claudio Imbrenda,
	Vasily Gorbik, kvm, Michael S. Tsirkin, Cornelia Huck,
	Eric Farman, virtualization, Christoph Hellwig,
	Martin Schwidefsky, Michael Mueller, Viktor Mihajlovski,
	Janosch Frank


On Wed, 8 May 2019, Halil Pasic wrote:
> On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:
> > > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> > >  		unregister_reboot_notifier(&css_reboot_notifier);
> > >  		goto out_unregister;
> > >  	}
> > > +	cio_dma_pool_init();  
> > 
> > This is too late for early devices (ccw console!).
> 
> You have already raised a concern about this last time (thanks). I think
> I've addressed the issue: the cio_dma_pool is only used by the airq
> stuff. I don't think the ccw console needs it. Please have another look
> at patch #6, and explain your concern in more detail if it persists.

Oh, you don't use the global pool for that - yes that should work.

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-08 21:08       ` Halil Pasic
@ 2019-05-09  8:52         ` Sebastian Ott
  -1 siblings, 0 replies; 182+ messages in thread
From: Sebastian Ott @ 2019-05-09  8:52 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman


On Wed, 8 May 2019, Halil Pasic wrote:
> On Wed, 8 May 2019 15:46:42 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:
> > On Fri, 26 Apr 2019, Halil Pasic wrote:
> > >  static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
> > >  {
> > [..]
> > > +	cdev->private = kzalloc(sizeof(struct ccw_device_private),
> > > +				GFP_KERNEL | GFP_DMA);
> > 
> > Do we still need GFP_DMA here (since we now have cdev->private->dma_area)?
> > 
> 
> We probably do not. I kept it GFP_DMA to keep changes to the
> minimum. Should changing this in your opinion be a part of this patch?

This can be changed on top.

> > > @@ -1062,6 +1082,14 @@ static int io_subchannel_probe(struct subchannel *sch)
> > >  	if (!io_priv)
> > >  		goto out_schedule;
> > >  
> > > +	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
> > > +				sizeof(*io_priv->dma_area),
> > > +				&io_priv->dma_area_dma, GFP_KERNEL);
> > 
> > This needs GFP_DMA.
> 
> Christoph already answered this one. Thanks Christoph!

Yes, I'm still struggling to grasp the whole channel IO + DMA API +
protected virtualization thing..

> 
> > You use a genpool for ccw_private->dma and not for iopriv->dma - looks
> > kinda inconsistent.
> > 
> 
> Please have a look at patch #9. A virtio-ccw device uses the genpool of
> it's ccw device (the pool from which ccw_private->dma is allicated) for
> the ccw stuff it needs to do. AFAICT for a subchannel device and its API
> all the DMA memory we need is iopriv->dma. So my thought was
> constructing a genpool for that would be an overkill.
> 
> Are you comfortable with this answer, or should we change something?

Nope, I'm good with that one - ccw_private->dma has multiple users that
all fit into one page.

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
@ 2019-05-09  8:52         ` Sebastian Ott
  0 siblings, 0 replies; 182+ messages in thread
From: Sebastian Ott @ 2019-05-09  8:52 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Farhan Ali, linux-s390, Thomas Huth, Claudio Imbrenda,
	Vasily Gorbik, kvm, Michael S. Tsirkin, Cornelia Huck,
	Eric Farman, virtualization, Christoph Hellwig,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank


On Wed, 8 May 2019, Halil Pasic wrote:
> On Wed, 8 May 2019 15:46:42 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:
> > On Fri, 26 Apr 2019, Halil Pasic wrote:
> > >  static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
> > >  {
> > [..]
> > > +	cdev->private = kzalloc(sizeof(struct ccw_device_private),
> > > +				GFP_KERNEL | GFP_DMA);
> > 
> > Do we still need GFP_DMA here (since we now have cdev->private->dma_area)?
> > 
> 
> We probably do not. I kept it GFP_DMA to keep changes to the
> minimum. Should changing this in your opinion be a part of this patch?

This can be changed on top.

> > > @@ -1062,6 +1082,14 @@ static int io_subchannel_probe(struct subchannel *sch)
> > >  	if (!io_priv)
> > >  		goto out_schedule;
> > >  
> > > +	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
> > > +				sizeof(*io_priv->dma_area),
> > > +				&io_priv->dma_area_dma, GFP_KERNEL);
> > 
> > This needs GFP_DMA.
> 
> Christoph already answered this one. Thanks Christoph!

Yes, I'm still struggling to grasp the whole channel IO + DMA API +
protected virtualization thing..

> 
> > You use a genpool for ccw_private->dma and not for iopriv->dma - looks
> > kinda inconsistent.
> > 
> 
> Please have a look at patch #9. A virtio-ccw device uses the genpool of
> it's ccw device (the pool from which ccw_private->dma is allicated) for
> the ccw stuff it needs to do. AFAICT for a subchannel device and its API
> all the DMA memory we need is iopriv->dma. So my thought was
> constructing a genpool for that would be an overkill.
> 
> Are you comfortable with this answer, or should we change something?

Nope, I'm good with that one - ccw_private->dma has multiple users that
all fit into one page.

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-08 21:22       ` Halil Pasic
@ 2019-05-09 10:11         ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-09 10:11 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Sebastian Ott, kvm, linux-s390, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller

On Wed, 8 May 2019 23:22:10 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:

> > > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> > >  		unregister_reboot_notifier(&css_reboot_notifier);
> > >  		goto out_unregister;
> > >  	}
> > > +	cio_dma_pool_init();    
> > 
> > This is too late for early devices (ccw console!).  
> 
> You have already raised concern about this last time (thanks). I think,
> I've addressed this issue: tje cio_dma_pool is only used by the airq
> stuff. I don't think the ccw console needs it. Please have an other look
> at patch #6, and explain your concern in more detail if it persists.

What about changing the naming/adding comments here, so that (1) folks
aren't confused by the same thing in the future and (2) folks don't try
to use that pool for something needed for the early ccw consoles?

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
@ 2019-05-09 10:11         ` Cornelia Huck
  0 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-09 10:11 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Thomas Huth, Claudio Imbrenda,
	Sebastian Ott, kvm, Michael S. Tsirkin, Farhan Ali, Eric Farman,
	virtualization, Christoph Hellwig, Martin Schwidefsky,
	Michael Mueller, Viktor Mihajlovski, Janosch Frank

On Wed, 8 May 2019 23:22:10 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:

> > > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> > >  		unregister_reboot_notifier(&css_reboot_notifier);
> > >  		goto out_unregister;
> > >  	}
> > > +	cio_dma_pool_init();    
> > 
> > This is too late for early devices (ccw console!).  
> 
> You have already raised concern about this last time (thanks). I think,
> I've addressed this issue: tje cio_dma_pool is only used by the airq
> stuff. I don't think the ccw console needs it. Please have an other look
> at patch #6, and explain your concern in more detail if it persists.

What about changing the naming/adding comments here, so that (1) folks
aren't confused by the same thing in the future and (2) folks don't try
to use that pool for something needed for the early ccw consoles?

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 07/10] s390/airq: use DMA memory for adapter interrupts
  2019-04-26 18:32   ` Halil Pasic
@ 2019-05-09 11:37     ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-09 11:37 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 26 Apr 2019 20:32:42 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Protected virtualization guests have to use shared pages for airq
> notifier bit vectors, because hypervisor needs to write these bits.
> 
> Let us make sure we allocate DMA memory for the notifier bit vectors.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  arch/s390/include/asm/airq.h |  2 ++
>  drivers/s390/cio/airq.c      | 18 ++++++++++++++----
>  2 files changed, 16 insertions(+), 4 deletions(-)

As an aside, there are some other devices that use adapter interrupts
as well (pci, ap, qdio). How does that interact with their needs? Do
they continue to work on non-protected virt guests (kvm or z/VM), and
can they be accommodated if support for them on protected guests is
added in the future?

(For some of the indicator bit handling, I suspect millicode takes care
of it anyway, but at least for pci, there's some hypervisor action to
be aware of.)

Also, as another aside, has this been tested with a regular guest under
tcg as well? If not, can you provide a branch for quick verification
somewhere?

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 07/10] s390/airq: use DMA memory for adapter interrupts
@ 2019-05-09 11:37     ` Cornelia Huck
  0 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-09 11:37 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Thomas Huth, Claudio Imbrenda, kvm,
	Sebastian Ott, Michael S. Tsirkin, Farhan Ali, Eric Farman,
	virtualization, Christoph Hellwig, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri, 26 Apr 2019 20:32:42 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Protected virtualization guests have to use shared pages for airq
> notifier bit vectors, because hypervisor needs to write these bits.
> 
> Let us make sure we allocate DMA memory for the notifier bit vectors.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  arch/s390/include/asm/airq.h |  2 ++
>  drivers/s390/cio/airq.c      | 18 ++++++++++++++----
>  2 files changed, 16 insertions(+), 4 deletions(-)

As an aside, there are some other devices that use adapter interrupts
as well (pci, ap, qdio). How does that interact with their needs? Do
they continue to work on non-protected virt guests (kvm or z/VM), and
can they be accommodated if support for them on protected guests is
added in the future?

(For some of the indicator bit handling, I suspect millicode takes care
of it anyway, but at least for pci, there's some hypervisor action to
be aware of.)

Also, as another aside, has this been tested with a regular guest under
tcg as well? If not, can you provide a branch for quick verification
somewhere?

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 08/10] virtio/s390: add indirection to indicators access
  2019-05-08 14:31     ` Pierre Morel
@ 2019-05-09 12:01       ` Pierre Morel
  -1 siblings, 0 replies; 182+ messages in thread
From: Pierre Morel @ 2019-05-09 12:01 UTC (permalink / raw)
  To: Halil Pasic, kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott
  Cc: virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On 08/05/2019 16:31, Pierre Morel wrote:
> On 26/04/2019 20:32, Halil Pasic wrote:
>> This will come in handy soon when we pull out the indicators from
>> virtio_ccw_device to a memory area that is shared with the hypervisor
>> (in particular for protected virtualization guests).
>>
>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>> ---
>>   drivers/s390/virtio/virtio_ccw.c | 40 
>> +++++++++++++++++++++++++---------------
>>   1 file changed, 25 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/s390/virtio/virtio_ccw.c 
>> b/drivers/s390/virtio/virtio_ccw.c
>> index bb7a92316fc8..1f3e7d56924f 100644
>> --- a/drivers/s390/virtio/virtio_ccw.c
>> +++ b/drivers/s390/virtio/virtio_ccw.c
>> @@ -68,6 +68,16 @@ struct virtio_ccw_device {
>>       void *airq_info;
>>   };
>> +static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
>> +{
>> +    return &vcdev->indicators;
>> +}
>> +
>> +static inline unsigned long *indicators2(struct virtio_ccw_device 
>> *vcdev)
>> +{
>> +    return &vcdev->indicators2;
>> +}
>> +
>>   struct vq_info_block_legacy {
>>       __u64 queue;
>>       __u32 align;
>> @@ -337,17 +347,17 @@ static void virtio_ccw_drop_indicator(struct 
>> virtio_ccw_device *vcdev,
>>           ccw->cda = (__u32)(unsigned long) thinint_area;
>>       } else {
>>           /* payload is the address of the indicators */
>> -        indicatorp = kmalloc(sizeof(&vcdev->indicators),
>> +        indicatorp = kmalloc(sizeof(indicators(vcdev)),
>>                        GFP_DMA | GFP_KERNEL);
>>           if (!indicatorp)
>>               return;
>>           *indicatorp = 0;
>>           ccw->cmd_code = CCW_CMD_SET_IND;
>> -        ccw->count = sizeof(&vcdev->indicators);
>> +        ccw->count = sizeof(indicators(vcdev));
> 
> This looks strange to me. It was already weird before.
> Luckily the indicators are unsigned longs...
> maybe just use sizeof(long)?


AFAIK the size of the indicators (AIV/AIS) is not restricted by the
architecture. However, we never use more than 64 bits; do we ever have
an adapter with more than 64 different interrupts?

Maybe we can state that we use a maximum of 64 AISBs and therefore use
indicators with the size of an unsigned long, or __u64, or whatever is
appropriate. Please clarify this.

With this cleared:
Reviewed-by: Pierre Morel<pmorel@linux.ibm.com>


Regards,
Pierre


-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 08/10] virtio/s390: add indirection to indicators access
@ 2019-05-09 12:01       ` Pierre Morel
  0 siblings, 0 replies; 182+ messages in thread
From: Pierre Morel @ 2019-05-09 12:01 UTC (permalink / raw)
  To: Halil Pasic, kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott
  Cc: Thomas Huth, Claudio Imbrenda, Janosch Frank, Vasily Gorbik,
	Michael S. Tsirkin, Farhan Ali, Eric Farman, virtualization,
	Christoph Hellwig, Viktor Mihajlovski

On 08/05/2019 16:31, Pierre Morel wrote:
> On 26/04/2019 20:32, Halil Pasic wrote:
>> This will come in handy soon when we pull out the indicators from
>> virtio_ccw_device to a memory area that is shared with the hypervisor
>> (in particular for protected virtualization guests).
>>
>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>> ---
>>   drivers/s390/virtio/virtio_ccw.c | 40 
>> +++++++++++++++++++++++++---------------
>>   1 file changed, 25 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/s390/virtio/virtio_ccw.c 
>> b/drivers/s390/virtio/virtio_ccw.c
>> index bb7a92316fc8..1f3e7d56924f 100644
>> --- a/drivers/s390/virtio/virtio_ccw.c
>> +++ b/drivers/s390/virtio/virtio_ccw.c
>> @@ -68,6 +68,16 @@ struct virtio_ccw_device {
>>       void *airq_info;
>>   };
>> +static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
>> +{
>> +    return &vcdev->indicators;
>> +}
>> +
>> +static inline unsigned long *indicators2(struct virtio_ccw_device 
>> *vcdev)
>> +{
>> +    return &vcdev->indicators2;
>> +}
>> +
>>   struct vq_info_block_legacy {
>>       __u64 queue;
>>       __u32 align;
>> @@ -337,17 +347,17 @@ static void virtio_ccw_drop_indicator(struct 
>> virtio_ccw_device *vcdev,
>>           ccw->cda = (__u32)(unsigned long) thinint_area;
>>       } else {
>>           /* payload is the address of the indicators */
>> -        indicatorp = kmalloc(sizeof(&vcdev->indicators),
>> +        indicatorp = kmalloc(sizeof(indicators(vcdev)),
>>                        GFP_DMA | GFP_KERNEL);
>>           if (!indicatorp)
>>               return;
>>           *indicatorp = 0;
>>           ccw->cmd_code = CCW_CMD_SET_IND;
>> -        ccw->count = sizeof(&vcdev->indicators);
>> +        ccw->count = sizeof(indicators(vcdev));
> 
> This looks strange to me. Was already weird before.
> Lucky for us the indicators are longs...
> maybe just sizeof(long)?


AFAIK the size of the indicators (AIV/AIS) is not restricted by the 
architecture.
However, we never use more than 64 bits; do we ever have an adapter 
with more than 64 different interrupts?

Maybe we can state that we use a maximum of 64 AISBs and therefore 
use indicators with a size of unsigned long, or __u64, or whatever is 
appropriate. Please clarify this.

With this cleared:
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>


Regards,
Pierre


-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 09/10] virtio/s390: use DMA memory for ccw I/O and classic notifiers
  2019-05-08 14:46     ` Pierre Morel
@ 2019-05-09 13:30       ` Pierre Morel
  -1 siblings, 0 replies; 182+ messages in thread
From: Pierre Morel @ 2019-05-09 13:30 UTC (permalink / raw)
  To: Halil Pasic, kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott
  Cc: virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On 08/05/2019 16:46, Pierre Morel wrote:
> On 26/04/2019 20:32, Halil Pasic wrote:
>> Before virtio-ccw could get away with not using DMA API for the pieces of
>> memory it does ccw I/O with. With protected virtualization this has to
>> change, since the hypervisor needs to read and sometimes also write these
>> pieces of memory.
>>
>> The hypervisor is supposed to poke the classic notifiers, if these are
>> used, out of band with regards to ccw I/O. So these need to be allocated
>> as DMA memory (which is shared memory for protected virtualization
>> guests).
>>
>> Let us factor out everything from struct virtio_ccw_device that needs to
>> be DMA memory in a satellite that is allocated as such.
>>
...
>> +                       sizeof(indicators(vcdev)));
> 
> should be sizeof(long) ?
> 
> This is a recurrent error, but it is not an issue because the size of
> the indicators is unsigned long, the same as the size of the pointer.
> 
> Regards,
> Pierre
> 

Here too, with the problem of the indicator size handled:
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>


-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 04/10] s390/mm: force swiotlb for protected virtualization
       [not found]   ` <ad23f5e7-dc78-04af-c892-47bbc65134c6@linux.ibm.com>
  2019-05-09 18:05       ` Jason J. Herne
@ 2019-05-09 18:05       ` Jason J. Herne
  0 siblings, 0 replies; 182+ messages in thread
From: Jason J. Herne @ 2019-05-09 18:05 UTC (permalink / raw)
  To: Halil Pasic, kvm, linux-s390, Cornelia Huck,
	Martin Schwidefsky, Sebastian Ott
  Cc: virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth,
	Christian Borntraeger, Viktor Mihajlovski, Vasily Gorbik,
	Janosch Frank, Claudio Imbrenda, Farhan Ali, Eric Farman

> Subject: [PATCH 04/10] s390/mm: force swiotlb for protected virtualization
> Date: Fri, 26 Apr 2019 20:32:39 +0200
> From: Halil Pasic <pasic@linux.ibm.com>
> To: kvm@vger.kernel.org, linux-s390@vger.kernel.org, Cornelia Huck <cohuck@redhat.com>, 
> Martin Schwidefsky <schwidefsky@de.ibm.com>, Sebastian Ott <sebott@linux.ibm.com>
> CC: Halil Pasic <pasic@linux.ibm.com>, virtualization@lists.linux-foundation.org, Michael 
> S. Tsirkin <mst@redhat.com>, Christoph Hellwig <hch@infradead.org>, Thomas Huth 
> <thuth@redhat.com>, Christian Borntraeger <borntraeger@de.ibm.com>, Viktor Mihajlovski 
> <mihajlov@linux.ibm.com>, Vasily Gorbik <gor@linux.ibm.com>, Janosch Frank 
> <frankja@linux.ibm.com>, Claudio Imbrenda <imbrenda@linux.ibm.com>, Farhan Ali 
> <alifm@linux.ibm.com>, Eric Farman <farman@linux.ibm.com>
> 
> On s390, protected virtualization guests have to use bounced I/O
> buffers.  That requires some plumbing.
> 
> Let us make sure, any device that uses DMA API with direct ops correctly
> is spared from the problems, that a hypervisor attempting I/O to a
> non-shared page would bring.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>   arch/s390/Kconfig                   |  4 +++
>   arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++
>   arch/s390/mm/init.c                 | 50 +++++++++++++++++++++++++++++++++++++
>   3 files changed, 72 insertions(+)
>   create mode 100644 arch/s390/include/asm/mem_encrypt.h
> 
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index 1c3fcf19c3af..5500d05d4d53 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -1,4 +1,7 @@
>   # SPDX-License-Identifier: GPL-2.0
> +config ARCH_HAS_MEM_ENCRYPT
> +        def_bool y
> +
>   config MMU
>       def_bool y
>   @@ -191,6 +194,7 @@ config S390
>       select ARCH_HAS_SCALED_CPUTIME
>       select VIRT_TO_BUS
>       select HAVE_NMI
> +    select SWIOTLB
>     config SCHED_OMIT_FRAME_POINTER
> diff --git a/arch/s390/include/asm/mem_encrypt.h b/arch/s390/include/asm/mem_encrypt.h
> new file mode 100644
> index 000000000000..0898c09a888c
> --- /dev/null
> +++ b/arch/s390/include/asm/mem_encrypt.h
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef S390_MEM_ENCRYPT_H__
> +#define S390_MEM_ENCRYPT_H__
> +
> +#ifndef __ASSEMBLY__
> +
> +#define sme_me_mask    0ULL
> +
> +static inline bool sme_active(void) { return false; }
> +extern bool sev_active(void);
> +

I noticed this patch always returns false for sme_active. Is it safe to assume that 
whatever fixups are required on x86 to deal with sme do not apply to s390?

> +int set_memory_encrypted(unsigned long addr, int numpages);
> +int set_memory_decrypted(unsigned long addr, int numpages);
> +
> +#endif    /* __ASSEMBLY__ */
> +
> +#endif    /* S390_MEM_ENCRYPT_H__ */
> +
> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
> index 3e82f66d5c61..7e3cbd15dcfa 100644
> --- a/arch/s390/mm/init.c
> +++ b/arch/s390/mm/init.c
> @@ -18,6 +18,7 @@
>   #include <linux/mman.h>
>   #include <linux/mm.h>
>   #include <linux/swap.h>
> +#include <linux/swiotlb.h>
>   #include <linux/smp.h>
>   #include <linux/init.h>
>   #include <linux/pagemap.h>
> @@ -29,6 +30,7 @@
>   #include <linux/export.h>
>   #include <linux/cma.h>
>   #include <linux/gfp.h>
> +#include <linux/dma-mapping.h>
>   #include <asm/processor.h>
>   #include <linux/uaccess.h>
>   #include <asm/pgtable.h>
> @@ -42,6 +44,8 @@
>   #include <asm/sclp.h>
>   #include <asm/set_memory.h>
>   #include <asm/kasan.h>
> +#include <asm/dma-mapping.h>
> +#include <asm/uv.h>
>    pgd_t swapper_pg_dir[PTRS_PER_PGD] __section(.bss..swapper_pg_dir);
>   @@ -126,6 +130,50 @@ void mark_rodata_ro(void)
>       pr_info("Write protected read-only-after-init data: %luk\n", size >> 10);
>   }
>   +int set_memory_encrypted(unsigned long addr, int numpages)
> +{
> +    int i;
> +
> +    /* make all pages shared, (swiotlb, dma_free) */

This comment should be "make all pages unshared"?

> +    for (i = 0; i < numpages; ++i) {
> +        uv_remove_shared(addr);
> +        addr += PAGE_SIZE;
> +    }
> +    return 0;
> +}
> +EXPORT_SYMBOL_GPL(set_memory_encrypted);
> +
> +int set_memory_decrypted(unsigned long addr, int numpages)
> +{
> +    int i;
> +    /* make all pages shared (swiotlb, dma_alloca) */
> +    for (i = 0; i < numpages; ++i) {
> +        uv_set_shared(addr);
> +        addr += PAGE_SIZE;
> +    }
> +    return 0;
> +}
> +EXPORT_SYMBOL_GPL(set_memory_decrypted);

The addr arguments for the above functions appear to be referring to virtual addresses. 
Would vaddr be a better name?

> +
> +/* are we a protected virtualization guest? */
> +bool sev_active(void)
> +{
> +    return is_prot_virt_guest();
> +}
> +EXPORT_SYMBOL_GPL(sev_active);
> +
> +/* protected virtualization */
> +static void pv_init(void)
> +{
> +    if (!sev_active())
> +        return;
> +
> +    /* make sure bounce buffers are shared */
> +    swiotlb_init(1);
> +    swiotlb_update_mem_attributes();
> +    swiotlb_force = SWIOTLB_FORCE;
> +}
> +
>   void __init mem_init(void)
>   {
>       cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
> @@ -134,6 +182,8 @@ void __init mem_init(void)
>       set_max_mapnr(max_low_pfn);
>           high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
>   +    pv_init();
> +
>       /* Setup guest page hinting */
>       cmma_init();
>   -- 2.16.4
> 
> 

-- 
-- Jason J. Herne (jjherne@linux.ibm.com)

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 08/10] virtio/s390: add indirection to indicators access
  2019-05-09 12:01       ` Pierre Morel
@ 2019-05-09 18:26         ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-09 18:26 UTC (permalink / raw)
  To: Pierre Morel
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On Thu, 9 May 2019 14:01:01 +0200
Pierre Morel <pmorel@linux.ibm.com> wrote:

> On 08/05/2019 16:31, Pierre Morel wrote:
> > On 26/04/2019 20:32, Halil Pasic wrote:
> >> This will come in handy soon when we pull out the indicators from
> >> virtio_ccw_device to a memory area that is shared with the hypervisor
> >> (in particular for protected virtualization guests).
> >>
> >> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> >> ---
> >>   drivers/s390/virtio/virtio_ccw.c | 40 
> >> +++++++++++++++++++++++++---------------
> >>   1 file changed, 25 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/drivers/s390/virtio/virtio_ccw.c 
> >> b/drivers/s390/virtio/virtio_ccw.c
> >> index bb7a92316fc8..1f3e7d56924f 100644
> >> --- a/drivers/s390/virtio/virtio_ccw.c
> >> +++ b/drivers/s390/virtio/virtio_ccw.c
> >> @@ -68,6 +68,16 @@ struct virtio_ccw_device {
> >>       void *airq_info;
> >>   };
> >> +static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
> >> +{
> >> +    return &vcdev->indicators;
> >> +}
> >> +
> >> +static inline unsigned long *indicators2(struct virtio_ccw_device 
> >> *vcdev)
> >> +{
> >> +    return &vcdev->indicators2;
> >> +}
> >> +
> >>   struct vq_info_block_legacy {
> >>       __u64 queue;
> >>       __u32 align;
> >> @@ -337,17 +347,17 @@ static void virtio_ccw_drop_indicator(struct 
> >> virtio_ccw_device *vcdev,
> >>           ccw->cda = (__u32)(unsigned long) thinint_area;
> >>       } else {
> >>           /* payload is the address of the indicators */
> >> -        indicatorp = kmalloc(sizeof(&vcdev->indicators),
> >> +        indicatorp = kmalloc(sizeof(indicators(vcdev)),
> >>                        GFP_DMA | GFP_KERNEL);
> >>           if (!indicatorp)
> >>               return;
> >>           *indicatorp = 0;
> >>           ccw->cmd_code = CCW_CMD_SET_IND;
> >> -        ccw->count = sizeof(&vcdev->indicators);
> >> +        ccw->count = sizeof(indicators(vcdev));
> > 
> > This looks strange to me. Was already weird before.
> > Lucky we are indicators are long...
> > may be just sizeof(long)
> 

I'm not sure I understand where you are coming from...

With CCW_CMD_SET_IND we tell the hypervisor the guest physical address
at which the so-called classic indicators reside. There is a comment
that makes this obvious. The argument of the sizeof was and remains a
pointer type. AFAIU this is what bothers you.
> 
> AFAIK the size of the indicators (AIV/AIS) is not restricted by the 
> architecture.

The size of vcdev->indicators is restricted or defined by the virtio
specification. Please have a look at '4.3.2.6.1 Setting Up Classic Queue
Indicators' here:
https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-1630002

Since with Linux on s390 only 64 bit is supported, both sizes are in
line with the specification. Using u64 would semantically match the spec
better, modulo pre-virtio-1.0, which is not specified. I did not want to
make changes that are not necessary for what I'm trying to accomplish. If
we want, we can change these to u64 with a patch on top.

> However we never use more than 64 bits, do we ever have an adapter 
> having more than 64 different interrupts?

These are one per queue. The number of queues used to be limited to 64,
but it no longer is. If the driver uses classic notifiers, only the
first 64 can be used.

> 
> May be we can state than we use a maximal number of AISB of 64 and 
> therefor use indicators with a size of unsigned long, or __u64 or 
> whatever is appropriate. Please clear this.
> 

I think you are mixing up adapter interrupts as defined by the
architecture with virtio indicators, which are a special case at
best: the two-stage stuff is modeled after AISB and AIBV.

> With this cleared:

I hope I managed to clear this up a bit. If not, please try to re-state
your concern in different words.

> Reviewed-by: Pierre Morel<pmorel@linux.ibm.com>
> 

Thanks for your review!

Halil
 

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 09/10] virtio/s390: use DMA memory for ccw I/O and classic notifiers
  2019-05-09 13:30       ` Pierre Morel
@ 2019-05-09 18:30         ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-09 18:30 UTC (permalink / raw)
  To: Pierre Morel
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On Thu, 9 May 2019 15:30:08 +0200
Pierre Morel <pmorel@linux.ibm.com> wrote:

> On 08/05/2019 16:46, Pierre Morel wrote:
> > On 26/04/2019 20:32, Halil Pasic wrote:
> >> Before virtio-ccw could get away with not using DMA API for the pieces of
> >> memory it does ccw I/O with. With protected virtualization this has to
> >> change, since the hypervisor needs to read and sometimes also write these
> >> pieces of memory.
> >>
> >> The hypervisor is supposed to poke the classic notifiers, if these are
> >> used, out of band with regards to ccw I/O. So these need to be allocated
> >> as DMA memory (which is shared memory for protected virtualization
> >> guests).
> >>
> >> Let us factor out everything from struct virtio_ccw_device that needs to
> >> be DMA memory in a satellite that is allocated as such.
> >>
> ...
> >> +                       sizeof(indicators(vcdev)));
> > 
> > should be sizeof(long) ?

If anything different, then sizeof(u64) IMHO.
> > 
> > This is a recurrent error, but it is not an issue, because the size of
> > the indicators (unsigned long) is the same as the size of the pointer.

I don't think there is an error, let alone a recurrent one.

> > 
> > Regards,
> > Pierre
> > 
> 
> Here too, with the problem of the indicator size handled:

I've laid out my view in a response to your comment on patch #8.

> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>

Thanks!
Halil



* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-09 10:11         ` Cornelia Huck
@ 2019-05-09 22:11           ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-09 22:11 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Sebastian Ott, kvm, linux-s390, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller

On Thu, 9 May 2019 12:11:06 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Wed, 8 May 2019 23:22:10 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> > Sebastian Ott <sebott@linux.ibm.com> wrote:
> 
> > > > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> > > >  		unregister_reboot_notifier(&css_reboot_notifier);
> > > >  		goto out_unregister;
> > > >  	}
> > > > +	cio_dma_pool_init();    
> > > 
> > > This is too late for early devices (ccw console!).  
> > 
> > You have already raised concern about this last time (thanks). I think
> > I've addressed this issue: the cio_dma_pool is only used by the airq
> > stuff. I don't think the ccw console needs it. Please have another look
> > at patch #6, and explain your concern in more detail if it persists.
> 
> What about changing the naming/adding comments here, so that (1) folks
> aren't confused by the same thing in the future and (2) folks don't try
> to use that pool for something needed for the early ccw consoles?
> 

I'm all for clarity! Suggestions for better names?

Regards,
Halil



* Re: [PATCH 04/10] s390/mm: force swiotlb for protected virtualization
  2019-05-08 13:15     ` Claudio Imbrenda
@ 2019-05-09 22:34       ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-09 22:34 UTC (permalink / raw)
  To: Claudio Imbrenda
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank, Farhan Ali,
	Eric Farman

On Wed, 8 May 2019 15:15:40 +0200
Claudio Imbrenda <imbrenda@linux.ibm.com> wrote:

> On Fri, 26 Apr 2019 20:32:39 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On s390, protected virtualization guests have to use bounced I/O
> > buffers.  That requires some plumbing.
> > 
> > Let us make sure, any device that uses DMA API with direct ops
> > correctly is spared from the problems, that a hypervisor attempting
> > I/O to a non-shared page would bring.
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  arch/s390/Kconfig                   |  4 +++
> >  arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++
> >  arch/s390/mm/init.c                 | 50
> > +++++++++++++++++++++++++++++++++++++ 3 files changed, 72
> > insertions(+) create mode 100644 arch/s390/include/asm/mem_encrypt.h
> > 
> > diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> > index 1c3fcf19c3af..5500d05d4d53 100644
> > --- a/arch/s390/Kconfig
> > +++ b/arch/s390/Kconfig
> > @@ -1,4 +1,7 @@
> >  # SPDX-License-Identifier: GPL-2.0
> > +config ARCH_HAS_MEM_ENCRYPT
> > +        def_bool y
> > +
> >  config MMU
> >  	def_bool y
> >  
> > @@ -191,6 +194,7 @@ config S390
> >  	select ARCH_HAS_SCALED_CPUTIME
> >  	select VIRT_TO_BUS
> >  	select HAVE_NMI
> > +	select SWIOTLB
> >  
> >  
> >  config SCHED_OMIT_FRAME_POINTER
> > diff --git a/arch/s390/include/asm/mem_encrypt.h
> > b/arch/s390/include/asm/mem_encrypt.h new file mode 100644
> > index 000000000000..0898c09a888c
> > --- /dev/null
> > +++ b/arch/s390/include/asm/mem_encrypt.h
> > @@ -0,0 +1,18 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef S390_MEM_ENCRYPT_H__
> > +#define S390_MEM_ENCRYPT_H__
> > +
> > +#ifndef __ASSEMBLY__
> > +
> > +#define sme_me_mask	0ULL
> 
> This is rather ugly, but I understand why it's there
> 

Nod.

> > +
> > +static inline bool sme_active(void) { return false; }
> > +extern bool sev_active(void);
> > +
> > +int set_memory_encrypted(unsigned long addr, int numpages);
> > +int set_memory_decrypted(unsigned long addr, int numpages);
> > +
> > +#endif	/* __ASSEMBLY__ */
> > +
> > +#endif	/* S390_MEM_ENCRYPT_H__ */
> > +
> > diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
> > index 3e82f66d5c61..7e3cbd15dcfa 100644
> > --- a/arch/s390/mm/init.c
> > +++ b/arch/s390/mm/init.c
> > @@ -18,6 +18,7 @@
> >  #include <linux/mman.h>
> >  #include <linux/mm.h>
> >  #include <linux/swap.h>
> > +#include <linux/swiotlb.h>
> >  #include <linux/smp.h>
> >  #include <linux/init.h>
> >  #include <linux/pagemap.h>
> > @@ -29,6 +30,7 @@
> >  #include <linux/export.h>
> >  #include <linux/cma.h>
> >  #include <linux/gfp.h>
> > +#include <linux/dma-mapping.h>
> >  #include <asm/processor.h>
> >  #include <linux/uaccess.h>
> >  #include <asm/pgtable.h>
> > @@ -42,6 +44,8 @@
> >  #include <asm/sclp.h>
> >  #include <asm/set_memory.h>
> >  #include <asm/kasan.h>
> > +#include <asm/dma-mapping.h>
> > +#include <asm/uv.h>
> >  
> >  pgd_t swapper_pg_dir[PTRS_PER_PGD] __section(.bss..swapper_pg_dir);
> >  
> > @@ -126,6 +130,50 @@ void mark_rodata_ro(void)
> >  	pr_info("Write protected read-only-after-init data: %luk\n",
> > size >> 10); }
> >  
> > +int set_memory_encrypted(unsigned long addr, int numpages)
> > +{
> > +	int i;
> > +
> > +	/* make all pages shared, (swiotlb, dma_free) */
> 
> this is a copypaste typo, I think? (should be UNshared?)
> also, it doesn't make ALL pages unshared, but only those specified in
> the parameters

Right, a copy-paste error. Needs correction. The "all" was meant as all
pages in the range specified by the arguments. But it is better to change
it, since it turned out to be confusing.

> 
> with this fixed:
> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> 

Thanks!

> > +	for (i = 0; i < numpages; ++i) {
> > +		uv_remove_shared(addr);
> > +		addr += PAGE_SIZE;
> > +	}
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(set_memory_encrypted);
> > +
> > +int set_memory_decrypted(unsigned long addr, int numpages)
> > +{
> > +	int i;
> > +	/* make all pages shared (swiotlb, dma_alloca) */
> 
> same here with ALL
> 
> > +	for (i = 0; i < numpages; ++i) {
> > +		uv_set_shared(addr);
> > +		addr += PAGE_SIZE;
> > +	}
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(set_memory_decrypted);
> > +
> > +/* are we a protected virtualization guest? */
> > +bool sev_active(void)
> 
> this is also ugly. the correct solution would be probably to refactor
> everything, including all the AMD SEV code.... let's not go there
> 

Nod. Maybe later.

> > +{
> > +	return is_prot_virt_guest();
> > +}
> > +EXPORT_SYMBOL_GPL(sev_active);
> > +
> > +/* protected virtualization */
> > +static void pv_init(void)
> > +{
> > +	if (!sev_active())
> 
> can't you just use is_prot_virt_guest here?
> 

Sure! I guess it would be less confusing. It is something I did not
remember to change when the interface for this, provided by uv.h, went
from sketchy to nice.

Thanks again!

Regards,
Halil

> > +		return;
> > +
> > +	/* make sure bounce buffers are shared */
> > +	swiotlb_init(1);
> > +	swiotlb_update_mem_attributes();
> > +	swiotlb_force = SWIOTLB_FORCE;
> > +}
> > +
> >  void __init mem_init(void)
> >  {
> >  	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
> > @@ -134,6 +182,8 @@ void __init mem_init(void)
> >  	set_max_mapnr(max_low_pfn);
> >          high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
> >  
> > +	pv_init();
> > +
> >  	/* Setup guest page hinting */
> >  	cmma_init();
> >  
> 



* Re: [PATCH 08/10] virtio/s390: add indirection to indicators access
  2019-05-09 18:26         ` Halil Pasic
@ 2019-05-10  7:43           ` Pierre Morel
  -1 siblings, 0 replies; 182+ messages in thread
From: Pierre Morel @ 2019-05-10  7:43 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On 09/05/2019 20:26, Halil Pasic wrote:
> On Thu, 9 May 2019 14:01:01 +0200
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> On 08/05/2019 16:31, Pierre Morel wrote:
>>> On 26/04/2019 20:32, Halil Pasic wrote:
>>>> This will come in handy soon when we pull out the indicators from
>>>> virtio_ccw_device to a memory area that is shared with the hypervisor
>>>> (in particular for protected virtualization guests).
>>>>
>>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>>>> ---
>>>>    drivers/s390/virtio/virtio_ccw.c | 40
>>>> +++++++++++++++++++++++++---------------
>>>>    1 file changed, 25 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/drivers/s390/virtio/virtio_ccw.c
>>>> b/drivers/s390/virtio/virtio_ccw.c
>>>> index bb7a92316fc8..1f3e7d56924f 100644
>>>> --- a/drivers/s390/virtio/virtio_ccw.c
>>>> +++ b/drivers/s390/virtio/virtio_ccw.c
>>>> @@ -68,6 +68,16 @@ struct virtio_ccw_device {
>>>>        void *airq_info;
>>>>    };
>>>> +static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
>>>> +{
>>>> +    return &vcdev->indicators;
>>>> +}
>>>> +
>>>> +static inline unsigned long *indicators2(struct virtio_ccw_device
>>>> *vcdev)
>>>> +{
>>>> +    return &vcdev->indicators2;
>>>> +}
>>>> +
>>>>    struct vq_info_block_legacy {
>>>>        __u64 queue;
>>>>        __u32 align;
>>>> @@ -337,17 +347,17 @@ static void virtio_ccw_drop_indicator(struct
>>>> virtio_ccw_device *vcdev,
>>>>            ccw->cda = (__u32)(unsigned long) thinint_area;
>>>>        } else {
>>>>            /* payload is the address of the indicators */
>>>> -        indicatorp = kmalloc(sizeof(&vcdev->indicators),
>>>> +        indicatorp = kmalloc(sizeof(indicators(vcdev)),
>>>>                         GFP_DMA | GFP_KERNEL);
>>>>            if (!indicatorp)
>>>>                return;
>>>>            *indicatorp = 0;
>>>>            ccw->cmd_code = CCW_CMD_SET_IND;
>>>> -        ccw->count = sizeof(&vcdev->indicators);
>>>> +        ccw->count = sizeof(indicators(vcdev));
>>>
>>> This looks strange to me. It was already weird before.
>>> Lucky for us, the indicators are long...
>>> maybe just sizeof(long)?
>>
> 
> I'm not sure I understand where you are coming from...
> 
> With CCW_CMD_SET_IND we tell the hypervisor the guest physical address
> at which the so-called classic indicators reside. There is a comment that
> makes this obvious. The argument of the sizeof was and remains a
> pointer type. AFAIU this is what bothers you.
>>
>> AFAIK the size of the indicators (AIV/AIS) is not restricted by the
>> architecture.
> 
> The size of vcdev->indicators is restricted or defined by the virtio
> specification. Please have a look at '4.3.2.6.1 Setting Up Classic Queue
> Indicators' here:
> https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-1630002
> 
> Since with Linux on s390 only 64 bit is supported, both the sizes are in
> line with the specification. Using u64 would semantically match the spec
> better, modulo pre virtio 1.0 which ain't specified. I did not want to
> do changes that are not necessary for what I'm trying to accomplish. If
> we want we can change these to u64 with a patch on top.

I mean you are changing these lines already, so why not do it right
while at it?

Pierre

-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany



* Re: [PATCH 04/10] s390/mm: force swiotlb for protected virtualization
  2019-05-09 18:05       ` Jason J. Herne
@ 2019-05-10  7:49         ` Claudio Imbrenda
  -1 siblings, 0 replies; 182+ messages in thread
From: Claudio Imbrenda @ 2019-05-10  7:49 UTC (permalink / raw)
  To: Jason J. Herne
  Cc: Halil Pasic, kvm, linux-s390, Cornelia Huck,
	Martin Schwidefsky, Sebastian Ott, virtualization,
	Michael S. Tsirkin, Christoph Hellwig, Thomas Huth,
	Christian Borntraeger, Viktor Mihajlovski, Vasily Gorbik,
	Janosch Frank, Farhan Ali, Eric Farman

On Thu, 9 May 2019 14:05:20 -0400
"Jason J. Herne" <jjherne@linux.ibm.com> wrote:

[...]

> > +#define sme_me_mask    0ULL
> > +
> > +static inline bool sme_active(void) { return false; }
> > +extern bool sev_active(void);
> > +  
> 
> I noticed this patch always returns false for sme_active. Is it safe
> to assume that whatever fixups are required on x86 to deal with sme
> do not apply to s390?

yes, on x86 sme_active returns false if SEV is enabled. SME is for
host memory encryption. From arch/x86/mm/mem_encrypt.c:

bool sme_active(void)
{
        return sme_me_mask && !sev_enabled;
}

and it makes sense because you can't have both SME and SEV enabled on
the same kernel, because either you're running on bare metal (and then
you can have SME) __or__ you are running as a guest (and then you can
have SEV). The key difference is that DMA operations don't need
bounce buffers with SME, but they do with SEV.
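That mutual exclusion can be sketched as follows; the flags are illustrative stand-ins for the real sme_me_mask/sev_enabled state, not the x86 kernel's actual variables:

```c
#include <stdbool.h>

/* Illustrative state flags: on real x86 these come from early
 * firmware/CPUID setup; here they are plain booleans for the sketch. */
static bool sme_enabled_host;	/* bare-metal memory encryption */
static bool sev_enabled_guest;	/* running as an encrypted guest  */

static bool sme_active(void)
{
	/* Mirrors the quoted logic: SME only counts when we are not
	 * running as a SEV guest. */
	return sme_enabled_host && !sev_enabled_guest;
}

static bool sev_active(void)
{
	return sev_enabled_guest;
}

/* Only the guest case (SEV) forces bounce buffers for DMA. */
static bool dma_needs_bounce(void)
{
	return sev_active();
}
```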

I hope this clarifies your doubts :)

[...]

^ permalink raw reply	[flat|nested] 182+ messages in thread


* Re: [PATCH 08/10] virtio/s390: add indirection to indicators access
  2019-05-10  7:43           ` Pierre Morel
@ 2019-05-10 11:54             ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-10 11:54 UTC (permalink / raw)
  To: Pierre Morel
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On Fri, 10 May 2019 09:43:08 +0200
Pierre Morel <pmorel@linux.ibm.com> wrote:

> On 09/05/2019 20:26, Halil Pasic wrote:
> > On Thu, 9 May 2019 14:01:01 +0200
> > Pierre Morel <pmorel@linux.ibm.com> wrote:
> > 
> >> On 08/05/2019 16:31, Pierre Morel wrote:
> >>> On 26/04/2019 20:32, Halil Pasic wrote:
> >>>> This will come in handy soon when we pull out the indicators from
> >>>> virtio_ccw_device to a memory area that is shared with the hypervisor
> >>>> (in particular for protected virtualization guests).
> >>>>
> >>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> >>>> ---
> >>>>    drivers/s390/virtio/virtio_ccw.c | 40
> >>>> +++++++++++++++++++++++++---------------
> >>>>    1 file changed, 25 insertions(+), 15 deletions(-)
> >>>>
> >>>> diff --git a/drivers/s390/virtio/virtio_ccw.c
> >>>> b/drivers/s390/virtio/virtio_ccw.c
> >>>> index bb7a92316fc8..1f3e7d56924f 100644
> >>>> --- a/drivers/s390/virtio/virtio_ccw.c
> >>>> +++ b/drivers/s390/virtio/virtio_ccw.c
> >>>> @@ -68,6 +68,16 @@ struct virtio_ccw_device {
> >>>>        void *airq_info;
> >>>>    };
> >>>> +static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
> >>>> +{
> >>>> +    return &vcdev->indicators;
> >>>> +}
> >>>> +
> >>>> +static inline unsigned long *indicators2(struct virtio_ccw_device
> >>>> *vcdev)
> >>>> +{
> >>>> +    return &vcdev->indicators2;
> >>>> +}
> >>>> +
> >>>>    struct vq_info_block_legacy {
> >>>>        __u64 queue;
> >>>>        __u32 align;
> >>>> @@ -337,17 +347,17 @@ static void virtio_ccw_drop_indicator(struct
> >>>> virtio_ccw_device *vcdev,
> >>>>            ccw->cda = (__u32)(unsigned long) thinint_area;
> >>>>        } else {
> >>>>            /* payload is the address of the indicators */
> >>>> -        indicatorp = kmalloc(sizeof(&vcdev->indicators),
> >>>> +        indicatorp = kmalloc(sizeof(indicators(vcdev)),
> >>>>                         GFP_DMA | GFP_KERNEL);
> >>>>            if (!indicatorp)
> >>>>                return;
> >>>>            *indicatorp = 0;
> >>>>            ccw->cmd_code = CCW_CMD_SET_IND;
> >>>> -        ccw->count = sizeof(&vcdev->indicators);
> >>>> +        ccw->count = sizeof(indicators(vcdev));
> >>>
> >>> This looks strange to me. It was already weird before.
> >>> Lucky for us the indicators are longs...
> >>> Maybe just sizeof(long)?
> >>
> > 
> > I'm not sure I understand where you are coming from...
> > 
> > With CCW_CMD_SET_IND we tell the hypervisor the guest physical address
> > at which the so-called classic indicators reside. There is a comment
> > that makes this obvious. The argument of the sizeof was and remained a
> > pointer type. AFAIU this is what bothers you.
> >>
> >> AFAIK the size of the indicators (AIV/AIS) is not restricted by the
> >> architecture.
> > 
> > The size of vcdev->indicators is restricted or defined by the virtio
> > specification. Please have a look at '4.3.2.6.1 Setting Up Classic Queue
> > Indicators' here:
> > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-1630002
> > 
> > Since with Linux on s390 only 64 bit is supported, both the sizes are in
> > line with the specification. Using u64 would semantically match the spec
> > better, modulo pre virtio 1.0 which ain't specified. I did not want to
> > do changes that are not necessary for what I'm trying to accomplish. If
> > we want we can change these to u64 with a patch on top.
> 
> I mean you are changing these line already, so why not doing it right 
> while at it?
> 

This patch is about adding the indirection so we can move the member
painlessly. Mixing in different stuff would be a bad practice.

BTW I just explained that it ain't wrong, so I really do not understand
what you mean by 'why not doing it right'. Can you please explain?

Did you agree with the rest of my comment? I mean there was more to it.

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 01/10] virtio/s390: use vring_create_virtqueue
  2019-05-07 13:58             ` Christian Borntraeger
@ 2019-05-10 14:07               ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-10 14:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Christian Borntraeger, Halil Pasic, kvm, linux-s390,
	Martin Schwidefsky, Sebastian Ott, virtualization,
	Christoph Hellwig, Thomas Huth, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Tue, 7 May 2019 15:58:12 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 05.05.19 13:15, Cornelia Huck wrote:
> > On Sat, 4 May 2019 16:03:40 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> >> On Fri, 3 May 2019 16:04:48 -0400
> >> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >>  
> >>> On Fri, May 03, 2019 at 11:17:24AM +0200, Cornelia Huck wrote:    
> >>>> On Fri, 26 Apr 2019 20:32:36 +0200
> >>>> Halil Pasic <pasic@linux.ibm.com> wrote:
> >>>>     
> >>>>> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> >>>>> establishes a new way of allocating virtqueues (as a part of the effort
> >>>>> that taught DMA to virtio rings).
> >>>>>
> >>>>> In the future we will want virtio-ccw to use the DMA API as well.
> >>>>>
> >>>>> Let us switch from the legacy method of allocating virtqueues to
> >>>>> vring_create_virtqueue() as the first step into that direction.
> >>>>>
> >>>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> >>>>> ---
> >>>>>  drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
> >>>>>  1 file changed, 11 insertions(+), 19 deletions(-)    
> >>>>
> >>>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> >>>>
> >>>> I'd vote for merging this patch right away for 5.2.    
> >>>
> >>> So which tree is this going through? mine?
> >>>     
> >>
> >> Christian, what do you think? If the whole series is supposed to go in
> >> in one go (which I hope it is), via Martin's tree could be the simplest
> >> route IMHO.  
> > 
> > 
> > The first three patches are virtio(-ccw) only and the those are the ones
> > that I think are ready to go.
> > 
> > I'm not feeling comfortable going forward with the remainder as it
> > stands now; waiting for some other folks to give feedback. (They are
> > touching/interacting with code parts I'm not so familiar with, and lack
> > of documentation, while not the developers' fault, does not make it
> > easier.)
> > 
> > Michael, would you like to pick up 1-3 for your tree directly? That
> > looks like the easiest way.  
> 
> Agreed. Michael please pick 1-3.
> We will continue to review 4- first and then see which tree is best.

Michael, please let me know if you'll pick directly or whether I should
post a series.

[Given that the patches are from one virtio-ccw maintainer and reviewed
by the other, picking directly would eliminate an unnecessary
indirection :)]

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-09 22:11           ` Halil Pasic
@ 2019-05-10 14:10             ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-10 14:10 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Sebastian Ott, kvm, linux-s390, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller

On Fri, 10 May 2019 00:11:12 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Thu, 9 May 2019 12:11:06 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Wed, 8 May 2019 23:22:10 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> > > Sebastian Ott <sebott@linux.ibm.com> wrote:  
> >   
> > > > > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> > > > >  		unregister_reboot_notifier(&css_reboot_notifier);
> > > > >  		goto out_unregister;
> > > > >  	}
> > > > > +	cio_dma_pool_init();      
> > > > 
> > > > This is too late for early devices (ccw console!).    
> > > 
> > > You already raised this concern last time (thanks). I think
> > > I've addressed this issue: the cio_dma_pool is only used by the airq
> > > stuff. I don't think the ccw console needs it. Please have another look
> > > at patch #6, and explain your concern in more detail if it persists.  
> > 
> > What about changing the naming/adding comments here, so that (1) folks
> > aren't confused by the same thing in the future and (2) folks don't try
> > to use that pool for something needed for the early ccw consoles?
> >   
> 
> I'm all for clarity! Suggestions for better names?

css_aiv_dma_pool, maybe? Or is there other cross-device stuff that may
need it?

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 08/10] virtio/s390: add indirection to indicators access
  2019-05-10 11:54             ` Halil Pasic
@ 2019-05-10 15:36               ` Pierre Morel
  -1 siblings, 0 replies; 182+ messages in thread
From: Pierre Morel @ 2019-05-10 15:36 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On 10/05/2019 13:54, Halil Pasic wrote:
> On Fri, 10 May 2019 09:43:08 +0200
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> On 09/05/2019 20:26, Halil Pasic wrote:
>>> On Thu, 9 May 2019 14:01:01 +0200
>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>
>>>> On 08/05/2019 16:31, Pierre Morel wrote:
>>>>> On 26/04/2019 20:32, Halil Pasic wrote:
>>>>>> This will come in handy soon when we pull out the indicators from
>>>>>> virtio_ccw_device to a memory area that is shared with the hypervisor
>>>>>> (in particular for protected virtualization guests).
>>>>>>
>>>>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>>>>>> ---
>>>>>>     drivers/s390/virtio/virtio_ccw.c | 40
>>>>>> +++++++++++++++++++++++++---------------
>>>>>>     1 file changed, 25 insertions(+), 15 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/s390/virtio/virtio_ccw.c
>>>>>> b/drivers/s390/virtio/virtio_ccw.c
>>>>>> index bb7a92316fc8..1f3e7d56924f 100644
>>>>>> --- a/drivers/s390/virtio/virtio_ccw.c
>>>>>> +++ b/drivers/s390/virtio/virtio_ccw.c
>>>>>> @@ -68,6 +68,16 @@ struct virtio_ccw_device {
>>>>>>         void *airq_info;
>>>>>>     };
>>>>>> +static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
>>>>>> +{
>>>>>> +    return &vcdev->indicators;
>>>>>> +}
>>>>>> +
>>>>>> +static inline unsigned long *indicators2(struct virtio_ccw_device
>>>>>> *vcdev)
>>>>>> +{
>>>>>> +    return &vcdev->indicators2;
>>>>>> +}
>>>>>> +
>>>>>>     struct vq_info_block_legacy {
>>>>>>         __u64 queue;
>>>>>>         __u32 align;
>>>>>> @@ -337,17 +347,17 @@ static void virtio_ccw_drop_indicator(struct
>>>>>> virtio_ccw_device *vcdev,
>>>>>>             ccw->cda = (__u32)(unsigned long) thinint_area;
>>>>>>         } else {
>>>>>>             /* payload is the address of the indicators */
>>>>>> -        indicatorp = kmalloc(sizeof(&vcdev->indicators),
>>>>>> +        indicatorp = kmalloc(sizeof(indicators(vcdev)),
>>>>>>                          GFP_DMA | GFP_KERNEL);
>>>>>>             if (!indicatorp)
>>>>>>                 return;
>>>>>>             *indicatorp = 0;
>>>>>>             ccw->cmd_code = CCW_CMD_SET_IND;
>>>>>> -        ccw->count = sizeof(&vcdev->indicators);
>>>>>> +        ccw->count = sizeof(indicators(vcdev));
>>>>>
>>>>> This looks strange to me. It was already weird before.
>>>>> Lucky for us the indicators are longs...
>>>>> Maybe just sizeof(long)?
>>>>
>>>
>>> I'm not sure I understand where you are coming from...
>>>
>>> With CCW_CMD_SET_IND we tell the hypervisor the guest physical address
>>> at which the so-called classic indicators reside. There is a comment
>>> that makes this obvious. The argument of the sizeof was and remained a
>>> pointer type. AFAIU this is what bothers you.
>>>>
>>>> AFAIK the size of the indicators (AIV/AIS) is not restricted by the
>>>> architecture.
>>>
>>> The size of vcdev->indicators is restricted or defined by the virtio
>>> specification. Please have a look at '4.3.2.6.1 Setting Up Classic Queue
>>> Indicators' here:
>>> https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-1630002
>>>
>>> Since with Linux on s390 only 64 bit is supported, both the sizes are in
>>> line with the specification. Using u64 would semantically match the spec
>>> better, modulo pre virtio 1.0 which ain't specified. I did not want to
>>> do changes that are not necessary for what I'm trying to accomplish. If
>>> we want we can change these to u64 with a patch on top.
>>
>> I mean you are changing these line already, so why not doing it right
>> while at it?
>>
> 
> This patch is about adding the indirection so we can move the member
> painlessly. Mixing in different stuff would be a bad practice.
> 
> BTW I just explained that it ain't wrong, so I really do not understand
> what you mean by 'why not doing it right'. Can you please explain?
> 

I did not want to discuss this at length and gave my R-b, meaning that
I am OK with this patch.

But since you ask, yes I can; it seems quite obvious.
When you build a CCW you put the pointer into CCW->cda and you give the
size of the transfer in CCW->count.

Here the count is initialized with the sizeof of the pointer that
CCW->cda is initialized with.
Luckily we work on a 64-bit machine with 64-bit pointers, and the
pointed-to object is also 64 bits wide, so... the resulting count is
right. But it is not the correct way to do it.
That is all. Not a big concern; you do not need to change it, as you
said it can be done in another patch.

> Did you agree with the rest of my comment? I mean there was more to it.
> 

I understood from your comments that the indicators in Linux are 64
bits wide, so all is OK.

Regards
Pierre

-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 01/10] virtio/s390: use vring_create_virtqueue
  2019-05-10 14:07               ` Cornelia Huck
@ 2019-05-12 16:47                 ` Michael S. Tsirkin
  -1 siblings, 0 replies; 182+ messages in thread
From: Michael S. Tsirkin @ 2019-05-12 16:47 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Christian Borntraeger, Halil Pasic, kvm, linux-s390,
	Martin Schwidefsky, Sebastian Ott, virtualization,
	Christoph Hellwig, Thomas Huth, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, May 10, 2019 at 04:07:44PM +0200, Cornelia Huck wrote:
> On Tue, 7 May 2019 15:58:12 +0200
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> 
> > On 05.05.19 13:15, Cornelia Huck wrote:
> > > On Sat, 4 May 2019 16:03:40 +0200
> > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > >   
> > >> On Fri, 3 May 2019 16:04:48 -0400
> > >> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > >>  
> > >>> On Fri, May 03, 2019 at 11:17:24AM +0200, Cornelia Huck wrote:    
> > >>>> On Fri, 26 Apr 2019 20:32:36 +0200
> > >>>> Halil Pasic <pasic@linux.ibm.com> wrote:
> > >>>>     
> > >>>>> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> > >>>>> establishes a new way of allocating virtqueues (as a part of the effort
> > >>>>> that taught DMA to virtio rings).
> > >>>>>
> > >>>>> In the future we will want virtio-ccw to use the DMA API as well.
> > >>>>>
> > >>>>> Let us switch from the legacy method of allocating virtqueues to
> > >>>>> vring_create_virtqueue() as the first step into that direction.
> > >>>>>
> > >>>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > >>>>> ---
> > >>>>>  drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
> > >>>>>  1 file changed, 11 insertions(+), 19 deletions(-)    
> > >>>>
> > >>>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> > >>>>
> > >>>> I'd vote for merging this patch right away for 5.2.    
> > >>>
> > >>> So which tree is this going through? mine?
> > >>>     
> > >>
> > >> Christian, what do you think? If the whole series is supposed to go in
> > >> in one go (which I hope it is), via Martin's tree could be the simplest
> > >> route IMHO.  
> > > 
> > > 
> > > The first three patches are virtio(-ccw) only and the those are the ones
> > > that I think are ready to go.
> > > 
> > > I'm not feeling comfortable going forward with the remainder as it
> > > stands now; waiting for some other folks to give feedback. (They are
> > > touching/interacting with code parts I'm not so familiar with, and lack
> > > of documentation, while not the developers' fault, does not make it
> > > easier.)
> > > 
> > > Michael, would you like to pick up 1-3 for your tree directly? That
> > > looks like the easiest way.  
> > 
> > Agreed. Michael please pick 1-3.
> > We will continue to review 4- first and then see which tree is best.
> 
> Michael, please let me know if you'll pick directly or whether I should
> post a series.
> 
> [Given that the patches are from one virtio-ccw maintainer and reviewed
> by the other, picking directly would eliminate an unnecessary
> indirection :)]

picked them

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-10 14:10             ` Cornelia Huck
@ 2019-05-12 18:22               ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-12 18:22 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Sebastian Ott, kvm, linux-s390, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller

On Fri, 10 May 2019 16:10:13 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri, 10 May 2019 00:11:12 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Thu, 9 May 2019 12:11:06 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > On Wed, 8 May 2019 23:22:10 +0200
> > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > >   
> > > > On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> > > > Sebastian Ott <sebott@linux.ibm.com> wrote:  
> > >   
> > > > > > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> > > > > >  		unregister_reboot_notifier(&css_reboot_notifier);
> > > > > >  		goto out_unregister;
> > > > > >  	}
> > > > > > +	cio_dma_pool_init();      
> > > > > 
> > > > > This is too late for early devices (ccw console!).    
> > > > 
> > > > You have already raised concern about this last time (thanks). I think,
> > > > I've addressed this issue: the cio_dma_pool is only used by the airq
> > > > stuff. I don't think the ccw console needs it. Please have another look
> > > > at patch #6, and explain your concern in more detail if it persists.  
> > > 
> > > What about changing the naming/adding comments here, so that (1) folks
> > > aren't confused by the same thing in the future and (2) folks don't try
> > > to use that pool for something needed for the early ccw consoles?
> > >   
> > 
> > I'm all for clarity! Suggestions for better names?
> 
> css_aiv_dma_pool, maybe? Or is there other cross-device stuff that may
> need it?
> 

Ouch! I was considering using cio_dma_zalloc() for the adapter
interruption vectors, but in the end I fell between two stools.
So with this series there are no users of the cio_dma_pool.

I don't feel strongly about this going one way or the other.

What speaks against getting rid of the cio_dma_pool and sticking with
dma_alloc_coherent() is that we would waste a DMA page per vector,
which is a non-obvious side effect.

What speaks against the cio_dma_pool is that it is slightly more code,
and that it currently cannot be used for very early stuff, which I
don't think is relevant. What also used to speak against it is that
allocations asking for more than a page would just fail, but I have
addressed that in the patch I hacked up on top of the series, pasted
below. While at it I addressed some other issues as well.

I've also got code that deals with AIRQ_IV_CACHELINE by turning the
kmem_cache into a dma_pool.

Cornelia, Sebastian, which approach do you prefer:
1) get rid of cio_dma_pool and AIRQ_IV_CACHELINE, and waste a page per
vector, or
2) go with the approach taken by the patch below?


Regards,
Halil
-----------------------8<---------------------------------------------
From: Halil Pasic <pasic@linux.ibm.com>
Date: Sun, 12 May 2019 18:08:05 +0200
Subject: [PATCH] WIP: use cio dma pool for airqs

Let's not waste a DMA page per adapter interrupt bit vector.
---
Lightly tested...
---
 arch/s390/include/asm/airq.h |  1 -
 drivers/s390/cio/airq.c      | 10 +++-------
 drivers/s390/cio/css.c       | 18 +++++++++++++++---
 3 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
index 1492d48..981a3eb 100644
--- a/arch/s390/include/asm/airq.h
+++ b/arch/s390/include/asm/airq.h
@@ -30,7 +30,6 @@ void unregister_adapter_interrupt(struct airq_struct *airq);
 /* Adapter interrupt bit vector */
 struct airq_iv {
 	unsigned long *vector;	/* Adapter interrupt bit vector */
-	dma_addr_t vector_dma; /* Adapter interrupt bit vector dma */
 	unsigned long *avail;	/* Allocation bit mask for the bit vector */
 	unsigned long *bitlock;	/* Lock bit mask for the bit vector */
 	unsigned long *ptr;	/* Pointer associated with each bit */
diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
index 7a5c0a0..f11f437 100644
--- a/drivers/s390/cio/airq.c
+++ b/drivers/s390/cio/airq.c
@@ -136,8 +136,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
 		goto out;
 	iv->bits = bits;
 	size = iv_size(bits);
-	iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size,
-						 &iv->vector_dma, GFP_KERNEL);
+	iv->vector = cio_dma_zalloc(size);
 	if (!iv->vector)
 		goto out_free;
 	if (flags & AIRQ_IV_ALLOC) {
@@ -172,8 +171,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
 	kfree(iv->ptr);
 	kfree(iv->bitlock);
 	kfree(iv->avail);
-	dma_free_coherent(cio_get_dma_css_dev(), size, iv->vector,
-			  iv->vector_dma);
+	cio_dma_free(iv->vector, size);
 	kfree(iv);
 out:
 	return NULL;
@@ -189,9 +187,7 @@ void airq_iv_release(struct airq_iv *iv)
 	kfree(iv->data);
 	kfree(iv->ptr);
 	kfree(iv->bitlock);
-	kfree(iv->vector);
-	dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits),
-			  iv->vector, iv->vector_dma);
+	cio_dma_free(iv->vector, iv_size(iv->bits));
 	kfree(iv->avail);
 	kfree(iv);
 }
diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
index 7087cc3..88d9c92 100644
--- a/drivers/s390/cio/css.c
+++ b/drivers/s390/cio/css.c
@@ -1063,7 +1063,10 @@ struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages)
 static void __gp_dma_free_dma(struct gen_pool *pool,
 			      struct gen_pool_chunk *chunk, void *data)
 {
-	dma_free_coherent((struct device *) data, PAGE_SIZE,
+
+	size_t chunk_size = chunk->end_addr - chunk->start_addr + 1;
+
+	dma_free_coherent((struct device *) data, chunk_size,
 			 (void *) chunk->start_addr,
 			 (dma_addr_t) chunk->phys_addr);
 }
@@ -1088,13 +1091,15 @@ void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
 {
 	dma_addr_t dma_addr;
 	unsigned long addr = gen_pool_alloc(gp_dma, size);
+	size_t chunk_size;
 
 	if (!addr) {
+		chunk_size = round_up(size, PAGE_SIZE);
 		addr = (unsigned long) dma_alloc_coherent(dma_dev,
-					PAGE_SIZE, &dma_addr, CIO_DMA_GFP);
+					 chunk_size, &dma_addr, CIO_DMA_GFP);
 		if (!addr)
 			return NULL;
-		gen_pool_add_virt(gp_dma, addr, dma_addr, PAGE_SIZE, -1);
+		gen_pool_add_virt(gp_dma, addr, dma_addr, chunk_size, -1);
 		addr = gen_pool_alloc(gp_dma, size);
 	}
 	return (void *) addr;
@@ -1108,6 +1113,13 @@ void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size)
 	gen_pool_free(gp_dma, (unsigned long) cpu_addr, size);
 }
 
+/**
+ * cio_dma_zalloc() - allocate DMA memory from the css global pool
+ * @size: number of bytes to allocate
+ *
+ * Intended for memory not specific to any single device within the css.
+ * Caution: not suitable for early stuff like the ccw console.
+ */
 void *cio_dma_zalloc(size_t size)
 {
 	return cio_gp_dma_zalloc(cio_dma_pool, cio_get_dma_css_dev(), size);
-- 
2.5.5


* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-04-26 18:32   ` Halil Pasic
@ 2019-05-13  9:41     ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-13  9:41 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 26 Apr 2019 20:32:41 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> As virtio-ccw devices are channel devices, we need to use the dma area
> for any communication with the hypervisor.
> 
> This patch addresses the most basic stuff (mostly what is required for
> virtio-ccw), and does take care of QDIO or any devices.

"does not take care of QDIO", surely? (Also, what does "any devices"
mean? Do you mean "every arbitrary device", perhaps?)

> 
> An interesting side effect is that virtio structures are now going to
> get allocated in 31 bit addressable storage.

Hm...

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  arch/s390/include/asm/ccwdev.h   |  4 +++
>  drivers/s390/cio/ccwreq.c        |  8 ++---
>  drivers/s390/cio/device.c        | 65 +++++++++++++++++++++++++++++++++-------
>  drivers/s390/cio/device_fsm.c    | 40 ++++++++++++-------------
>  drivers/s390/cio/device_id.c     | 18 +++++------
>  drivers/s390/cio/device_ops.c    | 21 +++++++++++--
>  drivers/s390/cio/device_pgid.c   | 20 ++++++-------
>  drivers/s390/cio/device_status.c | 24 +++++++--------
>  drivers/s390/cio/io_sch.h        | 21 +++++++++----
>  drivers/s390/virtio/virtio_ccw.c | 10 -------
>  10 files changed, 148 insertions(+), 83 deletions(-)

(...)

> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index 6d989c360f38..bb7a92316fc8 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -66,7 +66,6 @@ struct virtio_ccw_device {
>  	bool device_lost;
>  	unsigned int config_ready;
>  	void *airq_info;
> -	u64 dma_mask;
>  };
>  
>  struct vq_info_block_legacy {
> @@ -1255,16 +1254,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
>  		ret = -ENOMEM;
>  		goto out_free;
>  	}
> -
>  	vcdev->vdev.dev.parent = &cdev->dev;
> -	cdev->dev.dma_mask = &vcdev->dma_mask;
> -	/* we are fine with common virtio infrastructure using 64 bit DMA */
> -	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
> -	if (ret) {
> -		dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n");
> -		goto out_free;
> -	}

This means that vring structures now need to fit into 31 bits as well,
I think? Is there any way to reserve the 31 bit restriction for channel
subsystem structures and keep vring in the full 64 bit range? (Or am I
fundamentally misunderstanding something?)

> -
>  	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
>  				   GFP_DMA | GFP_KERNEL);
>  	if (!vcdev->config_block) {


* Re: [PATCH 01/10] virtio/s390: use vring_create_virtqueue
  2019-05-12 16:47                 ` Michael S. Tsirkin
@ 2019-05-13  9:52                   ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-13  9:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Christian Borntraeger, Halil Pasic, kvm, linux-s390,
	Martin Schwidefsky, Sebastian Ott, virtualization,
	Christoph Hellwig, Thomas Huth, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Sun, 12 May 2019 12:47:39 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Fri, May 10, 2019 at 04:07:44PM +0200, Cornelia Huck wrote:
> > On Tue, 7 May 2019 15:58:12 +0200
> > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> >   
> > > On 05.05.19 13:15, Cornelia Huck wrote:  
> > > > On Sat, 4 May 2019 16:03:40 +0200
> > > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > > >     
> > > >> On Fri, 3 May 2019 16:04:48 -0400
> > > >> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > >>    
> > > >>> On Fri, May 03, 2019 at 11:17:24AM +0200, Cornelia Huck wrote:      
> > > >>>> On Fri, 26 Apr 2019 20:32:36 +0200
> > > >>>> Halil Pasic <pasic@linux.ibm.com> wrote:
> > > >>>>       
> > > >>>>> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> > > >>>>> establishes a new way of allocating virtqueues (as a part of the effort
> > > >>>>> that taught DMA to virtio rings).
> > > >>>>>
> > > >>>>> In the future we will want virtio-ccw to use the DMA API as well.
> > > >>>>>
> > > >>>>> Let us switch from the legacy method of allocating virtqueues to
> > > >>>>> vring_create_virtqueue() as the first step into that direction.
> > > >>>>>
> > > >>>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > > >>>>> ---
> > > >>>>>  drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
> > > >>>>>  1 file changed, 11 insertions(+), 19 deletions(-)      
> > > >>>>
> > > >>>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> > > >>>>
> > > >>>> I'd vote for merging this patch right away for 5.2.      
> > > >>>
> > > >>> So which tree is this going through? mine?
> > > >>>       
> > > >>
> > > >> Christian, what do you think? If the whole series is supposed to go in
> > > >> in one go (which I hope it is), via Martin's tree could be the simplest
> > > >> route IMHO.    
> > > > 
> > > > 
> > > > The first three patches are virtio(-ccw) only and those are the ones
> > > > that I think are ready to go.
> > > > 
> > > > I'm not feeling comfortable going forward with the remainder as it
> > > > stands now; waiting for some other folks to give feedback. (They are
> > > > touching/interacting with code parts I'm not so familiar with, and lack
> > > > of documentation, while not the developers' fault, does not make it
> > > > easier.)
> > > > 
> > > > Michael, would you like to pick up 1-3 for your tree directly? That
> > > > looks like the easiest way.    
> > > 
> > > Agreed. Michael please pick 1-3.
> > > We will continue to review 4- first and then see which tree is best.  
> > 
> > Michael, please let me know if you'll pick directly or whether I should
> > post a series.
> > 
> > [Given that the patches are from one virtio-ccw maintainer and reviewed
> > by the other, picking directly would eliminate an unnecessary
> > indirection :)]  
> 
> picked them

Thanks!


* Re: [PATCH 08/10] virtio/s390: add indirection to indicators access
  2019-05-10 15:36               ` Pierre Morel
@ 2019-05-13 10:15                 ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-13 10:15 UTC (permalink / raw)
  To: Pierre Morel
  Cc: Halil Pasic, kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 10 May 2019 17:36:05 +0200
Pierre Morel <pmorel@linux.ibm.com> wrote:

> On 10/05/2019 13:54, Halil Pasic wrote:
> > On Fri, 10 May 2019 09:43:08 +0200
> > Pierre Morel <pmorel@linux.ibm.com> wrote:
> >   
> >> On 09/05/2019 20:26, Halil Pasic wrote:  
> >>> On Thu, 9 May 2019 14:01:01 +0200
> >>> Pierre Morel <pmorel@linux.ibm.com> wrote:
> >>>  
> >>>> On 08/05/2019 16:31, Pierre Morel wrote:  
> >>>>> On 26/04/2019 20:32, Halil Pasic wrote:  
> >>>>>> This will come in handy soon when we pull out the indicators from
> >>>>>> virtio_ccw_device to a memory area that is shared with the hypervisor
> >>>>>> (in particular for protected virtualization guests).
> >>>>>>
> >>>>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> >>>>>> ---
> >>>>>>     drivers/s390/virtio/virtio_ccw.c | 40
> >>>>>> +++++++++++++++++++++++++---------------
> >>>>>>     1 file changed, 25 insertions(+), 15 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/s390/virtio/virtio_ccw.c
> >>>>>> b/drivers/s390/virtio/virtio_ccw.c
> >>>>>> index bb7a92316fc8..1f3e7d56924f 100644
> >>>>>> --- a/drivers/s390/virtio/virtio_ccw.c
> >>>>>> +++ b/drivers/s390/virtio/virtio_ccw.c
> >>>>>> @@ -68,6 +68,16 @@ struct virtio_ccw_device {
> >>>>>>         void *airq_info;
> >>>>>>     };
> >>>>>> +static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
> >>>>>> +{
> >>>>>> +    return &vcdev->indicators;
> >>>>>> +}
> >>>>>> +
> >>>>>> +static inline unsigned long *indicators2(struct virtio_ccw_device
> >>>>>> *vcdev)
> >>>>>> +{
> >>>>>> +    return &vcdev->indicators2;
> >>>>>> +}
> >>>>>> +
> >>>>>>     struct vq_info_block_legacy {
> >>>>>>         __u64 queue;
> >>>>>>         __u32 align;
> >>>>>> @@ -337,17 +347,17 @@ static void virtio_ccw_drop_indicator(struct
> >>>>>> virtio_ccw_device *vcdev,
> >>>>>>             ccw->cda = (__u32)(unsigned long) thinint_area;
> >>>>>>         } else {
> >>>>>>             /* payload is the address of the indicators */
> >>>>>> -        indicatorp = kmalloc(sizeof(&vcdev->indicators),
> >>>>>> +        indicatorp = kmalloc(sizeof(indicators(vcdev)),
> >>>>>>                          GFP_DMA | GFP_KERNEL);
> >>>>>>             if (!indicatorp)
> >>>>>>                 return;
> >>>>>>             *indicatorp = 0;
> >>>>>>             ccw->cmd_code = CCW_CMD_SET_IND;
> >>>>>> -        ccw->count = sizeof(&vcdev->indicators);
> >>>>>> +        ccw->count = sizeof(indicators(vcdev));  
> >>>>>
> >>>>> This looks strange to me. It was already weird before.
> >>>>> Lucky for us, the indicators are longs...
> >>>>> maybe just sizeof(long)  
> >>>>  
> >>>
> >>> I'm not sure I understand where you are coming from...
> >>>
> >>> With CCW_CMD_SET_IND we tell the hypervisor the guest physical address
> >>> at which the so-called classic indicators reside. There is a comment that
> >>> makes this obvious. The argument of the sizeof was and remained a
> >>> pointer type. AFAIU this is what bothers you.  
> >>>>
> >>>> AFAIK the size of the indicators (AIV/AIS) is not restricted by the
> >>>> architecture.  
> >>>
> >>> The size of vcdev->indicators is restricted or defined by the virtio
> >>> specification. Please have a look at '4.3.2.6.1 Setting Up Classic Queue
> >>> Indicators' here:
> >>> https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-1630002
> >>>
> >>> Since with Linux on s390 only 64 bit is supported, both the sizes are in
> >>> line with the specification. Using u64 would semantically match the spec
> >>> better, modulo pre virtio 1.0 which ain't specified. I did not want to
> >>> do changes that are not necessary for what I'm trying to accomplish. If
> >>> we want we can change these to u64 with a patch on top.  
> >>
> >> I mean you are changing these lines already, so why not do it right
> >> while at it?
> >>  
> > 
> > This patch is about adding the indirection so we can move the member
> > painlessly. Mixing in different stuff would be a bad practice.
> > 
> > BTW I just explained that it ain't wrong, so I really do not understand
> > what you mean by 'why not doing it right'. Can you please explain?
> >   
> 
> I did not want to discuss this at length and gave my R-B,
> meaning that I am OK with this patch.
> 
> But if you ask, yes I can, it seems quite obvious.
> When you build a CCW you give the pointer to CCW->cda and you give the 
> size of the transfer in CCW->count.
> 
> Here the count is initialized with the sizeof of the pointer that 
> CCW->cda is initialized with.

But the cda points to the pointer address, so the size of the pointer
is actually the correct value here, isn't it?

> Luckily we work on a 64-bit machine with 64-bit pointers, and the size 
> of the pointed-to object is 64 bits wide, so... the resulting count is right.
> But it is not the correct way to do it.

I think it is, but this interface really is confusing.

> That is all. Not a big concern, you do not need to change it, as you 
> said it can be done in another patch.
> 
> > Did you agree with the rest of my comment? I mean there was more to it.
> >   
> 
> I understood from your comments that the indicators in Linux are 64 bits 
> wide, so all OK.
> 
> Regards
> Pierre



* Re: [PATCH 10/10] virtio/s390: make airq summary indicators DMA
  2019-04-26 18:32   ` Halil Pasic
@ 2019-05-13 12:20     ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-13 12:20 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 26 Apr 2019 20:32:45 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Hypervisor needs to interact with the summary indicators, so these
> need to be DMA memory as well (at least for protected virtualization
> guests).
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 24 +++++++++++++++++-------
>  1 file changed, 17 insertions(+), 7 deletions(-)

(...)

> @@ -237,7 +243,8 @@ static void virtio_airq_handler(struct airq_struct *airq)
>  	read_unlock(&info->lock);
>  }
>  
> -static struct airq_info *new_airq_info(void)
> +/* call with airq_areas_lock held */

Hm, where is airq_areas_lock defined? If it was introduced in one of
the previous patches, I have missed it.

> +static struct airq_info *new_airq_info(int index)
>  {
>  	struct airq_info *info;
>  	int rc;



* Re: [PATCH 01/10] virtio/s390: use vring_create_virtqueue
  2019-05-13  9:52                   ` Cornelia Huck
@ 2019-05-13 12:27                     ` Michael Mueller
  -1 siblings, 0 replies; 182+ messages in thread
From: Michael Mueller @ 2019-05-13 12:27 UTC (permalink / raw)
  To: Cornelia Huck, Michael S. Tsirkin
  Cc: Christian Borntraeger, Halil Pasic, kvm, linux-s390,
	Martin Schwidefsky, Sebastian Ott, virtualization,
	Christoph Hellwig, Thomas Huth, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman



On 13.05.19 11:52, Cornelia Huck wrote:
> On Sun, 12 May 2019 12:47:39 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
>> On Fri, May 10, 2019 at 04:07:44PM +0200, Cornelia Huck wrote:
>>> On Tue, 7 May 2019 15:58:12 +0200
>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>>>    
>>>> On 05.05.19 13:15, Cornelia Huck wrote:
>>>>> On Sat, 4 May 2019 16:03:40 +0200
>>>>> Halil Pasic <pasic@linux.ibm.com> wrote:
>>>>>      
>>>>>> On Fri, 3 May 2019 16:04:48 -0400
>>>>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>>>>     
>>>>>>> On Fri, May 03, 2019 at 11:17:24AM +0200, Cornelia Huck wrote:
>>>>>>>> On Fri, 26 Apr 2019 20:32:36 +0200
>>>>>>>> Halil Pasic <pasic@linux.ibm.com> wrote:
>>>>>>>>        
>>>>>>>>> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
>>>>>>>>> establishes a new way of allocating virtqueues (as a part of the effort
>>>>>>>>> that taught DMA to virtio rings).
>>>>>>>>>
>>>>>>>>> In the future we will want virtio-ccw to use the DMA API as well.
>>>>>>>>>
>>>>>>>>> Let us switch from the legacy method of allocating virtqueues to
>>>>>>>>> vring_create_virtqueue() as the first step into that direction.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>>>>>>>>> ---
>>>>>>>>>   drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
>>>>>>>>>   1 file changed, 11 insertions(+), 19 deletions(-)
>>>>>>>>
>>>>>>>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>>>>>>>>
>>>>>>>> I'd vote for merging this patch right away for 5.2.
>>>>>>>
>>>>>>> So which tree is this going through? mine?
>>>>>>>        
>>>>>>
>>>>>> Christian, what do you think? If the whole series is supposed to go in
>>>>>> in one go (which I hope it is), via Martin's tree could be the simplest
>>>>>> route IMHO.
>>>>>
>>>>>
> >>>>> The first three patches are virtio(-ccw) only and those are the ones
>>>>> that I think are ready to go.
>>>>>
>>>>> I'm not feeling comfortable going forward with the remainder as it
>>>>> stands now; waiting for some other folks to give feedback. (They are
>>>>> touching/interacting with code parts I'm not so familiar with, and lack
>>>>> of documentation, while not the developers' fault, does not make it
>>>>> easier.)
>>>>>
>>>>> Michael, would you like to pick up 1-3 for your tree directly? That
>>>>> looks like the easiest way.
>>>>
>>>> Agreed. Michael please pick 1-3.
>>>> We will continue to review 4- first and then see which tree is best.
>>>
>>> Michael, please let me know if you'll pick directly or whether I should
>>> post a series.
>>>
>>> [Given that the patches are from one virtio-ccw maintainer and reviewed
>>> by the other, picking directly would eliminate an unnecessary
>>> indirection :)]
>>
>> picked them
> 
> Thanks!
> 

Connie,

if I get you right here, you don't need a v2 for the
patches 1 through 3?

Thanks,
Michael



* Re: [PATCH 01/10] virtio/s390: use vring_create_virtqueue
  2019-05-13 12:27                     ` Michael Mueller
@ 2019-05-13 12:29                       ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-13 12:29 UTC (permalink / raw)
  To: Michael Mueller
  Cc: Michael S. Tsirkin, Christian Borntraeger, Halil Pasic, kvm,
	linux-s390, Martin Schwidefsky, Sebastian Ott, virtualization,
	Christoph Hellwig, Thomas Huth, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Mon, 13 May 2019 14:27:35 +0200
Michael Mueller <mimu@linux.ibm.com> wrote:

> On 13.05.19 11:52, Cornelia Huck wrote:
> > On Sun, 12 May 2019 12:47:39 -0400
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >   
> >> On Fri, May 10, 2019 at 04:07:44PM +0200, Cornelia Huck wrote:  
> >>> On Tue, 7 May 2019 15:58:12 +0200
> >>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> >>>      
> >>>> On 05.05.19 13:15, Cornelia Huck wrote:  
> >>>>> On Sat, 4 May 2019 16:03:40 +0200
> >>>>> Halil Pasic <pasic@linux.ibm.com> wrote:
> >>>>>        
> >>>>>> On Fri, 3 May 2019 16:04:48 -0400
> >>>>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >>>>>>       
> >>>>>>> On Fri, May 03, 2019 at 11:17:24AM +0200, Cornelia Huck wrote:  
> >>>>>>>> On Fri, 26 Apr 2019 20:32:36 +0200
> >>>>>>>> Halil Pasic <pasic@linux.ibm.com> wrote:
> >>>>>>>>          
> >>>>>>>>> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> >>>>>>>>> establishes a new way of allocating virtqueues (as a part of the effort
> >>>>>>>>> that taught DMA to virtio rings).
> >>>>>>>>>
> >>>>>>>>> In the future we will want virtio-ccw to use the DMA API as well.
> >>>>>>>>>
> >>>>>>>>> Let us switch from the legacy method of allocating virtqueues to
> >>>>>>>>> vring_create_virtqueue() as the first step into that direction.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> >>>>>>>>> ---
> >>>>>>>>>   drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
> >>>>>>>>>   1 file changed, 11 insertions(+), 19 deletions(-)  
> >>>>>>>>
> >>>>>>>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> >>>>>>>>
> >>>>>>>> I'd vote for merging this patch right away for 5.2.  
> >>>>>>>
> >>>>>>> So which tree is this going through? mine?
> >>>>>>>          
> >>>>>>
> >>>>>> Christian, what do you think? If the whole series is supposed to go in
> >>>>>> in one go (which I hope it is), via Martin's tree could be the simplest
> >>>>>> route IMHO.  
> >>>>>
> >>>>>
> >>>>> The first three patches are virtio(-ccw) only and the those are the ones
> >>>>> that I think are ready to go.
> >>>>>
> >>>>> I'm not feeling comfortable going forward with the remainder as it
> >>>>> stands now; waiting for some other folks to give feedback. (They are
> >>>>> touching/interacting with code parts I'm not so familiar with, and lack
> >>>>> of documentation, while not the developers' fault, does not make it
> >>>>> easier.)
> >>>>>
> >>>>> Michael, would you like to pick up 1-3 for your tree directly? That
> >>>>> looks like the easiest way.  
> >>>>
> >>>> Agreed. Michael please pick 1-3.
> >>>> We will continue to review 4- first and then see which tree is best.  
> >>>
> >>> Michael, please let me know if you'll pick directly or whether I should
> >>> post a series.
> >>>
> >>> [Given that the patches are from one virtio-ccw maintainer and reviewed
> >>> by the other, picking directly would eliminate an unnecessary
> >>> indirection :)]  
> >>
> >> picked them  
> > 
> > Thanks!
> >   
> 
> Connie,
> 
> if I get you right here, you don't need a v2 for the
> patches 1 through 3?

Exactly, they are all queued in Michael's tree.

> 
> Thanks,
> Michael
> 



* Re: [PATCH 04/10] s390/mm: force swiotlb for protected virtualization
  2019-04-29 14:05         ` Christian Borntraeger
@ 2019-05-13 12:50           ` Michael Mueller
  -1 siblings, 0 replies; 182+ messages in thread
From: Michael Mueller @ 2019-05-13 12:50 UTC (permalink / raw)
  To: Christian Borntraeger, Halil Pasic, Christoph Hellwig
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin, Thomas Huth,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman



On 29.04.19 16:05, Christian Borntraeger wrote:
> 
> 
> On 29.04.19 15:59, Halil Pasic wrote:
>> On Fri, 26 Apr 2019 12:27:11 -0700
>> Christoph Hellwig <hch@infradead.org> wrote:
>>
>>> On Fri, Apr 26, 2019 at 08:32:39PM +0200, Halil Pasic wrote:
>>>> +EXPORT_SYMBOL_GPL(set_memory_encrypted);
>>>
>>>> +EXPORT_SYMBOL_GPL(set_memory_decrypted);
>>>
>>>> +EXPORT_SYMBOL_GPL(sev_active);
>>>
>>> Why do you export these?  I know x86 exports those as well, but
>>> it shoudn't be needed there either.
>>>
>>
>> I export these to be in line with the x86 implementation (which
>> is the original and seems to be the only one at the moment). I assumed
>> that 'exported or not' is kind of a part of the interface definition.
>> Honestly, I did not give it too much thought.
>>
>> For x86 set_memory(en|de)crypted got exported by 95cf9264d5f3 "x86, drm,
>> fbdev: Do not specify encrypted memory for video mappings" (Tom
>> Lendacky, 2017-07-17). With CONFIG_FB_VGA16=m it seems to be necessary for x86.
>>
>> If the consensus is don't export: I won't. I'm fine one way or the other.
>> @Christian, what is your take on this?
> 
> If we do not need it today for anything (e.g. virtio-gpu) then we can get rid
> of the exports (and introduce them when necessary).

I'll take them out then.

>>
>> Thank you very much!
>>
>> Regards,
>> Halil
>>
>>
> 



* Re: [PATCH 07/10] s390/airq: use DMA memory for adapter interrupts
  2019-04-26 18:32   ` Halil Pasic
@ 2019-05-13 12:59     ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-13 12:59 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 26 Apr 2019 20:32:42 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Protected virtualization guests have to use shared pages for airq
> notifier bit vectors, because hypervisor needs to write these bits.
> 
> Let us make sure we allocate DMA memory for the notifier bit vectors.

[Looking at this first, before I can think about your update in patch
5.]

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  arch/s390/include/asm/airq.h |  2 ++
>  drivers/s390/cio/airq.c      | 18 ++++++++++++++----
>  2 files changed, 16 insertions(+), 4 deletions(-)

(...)

> diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
> index a45011e4529e..7a5c0a08ee09 100644
> --- a/drivers/s390/cio/airq.c
> +++ b/drivers/s390/cio/airq.c
> @@ -19,6 +19,7 @@
>  
>  #include <asm/airq.h>
>  #include <asm/isc.h>
> +#include <asm/cio.h>
>  
>  #include "cio.h"
>  #include "cio_debug.h"
> @@ -113,6 +114,11 @@ void __init init_airq_interrupts(void)
>  	setup_irq(THIN_INTERRUPT, &airq_interrupt);
>  }
>  
> +static inline unsigned long iv_size(unsigned long bits)
> +{
> +	return BITS_TO_LONGS(bits) * sizeof(unsigned long);
> +}
> +
>  /**
>   * airq_iv_create - create an interrupt vector
>   * @bits: number of bits in the interrupt vector
> @@ -123,14 +129,15 @@ void __init init_airq_interrupts(void)
>  struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
>  {
>  	struct airq_iv *iv;
> -	unsigned long size;
> +	unsigned long size = 0;

Why do you need to init this to 0?

>  
>  	iv = kzalloc(sizeof(*iv), GFP_KERNEL);
>  	if (!iv)
>  		goto out;
>  	iv->bits = bits;
> -	size = BITS_TO_LONGS(bits) * sizeof(unsigned long);
> -	iv->vector = kzalloc(size, GFP_KERNEL);
> +	size = iv_size(bits);
> +	iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size,
> +						 &iv->vector_dma, GFP_KERNEL);

Indent is a bit off.

But more importantly, I'm also a bit wary about ap and pci. IIRC, css
support is mandatory, so that should not be a problem; and unless I
remember incorrectly, ap only uses summary indicators. How does this
interact with pci devices? I suppose any of their dma properties do not
come into play with the interrupt code here? (Just want to be sure.)

>  	if (!iv->vector)
>  		goto out_free;
>  	if (flags & AIRQ_IV_ALLOC) {
> @@ -165,7 +172,8 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
>  	kfree(iv->ptr);
>  	kfree(iv->bitlock);
>  	kfree(iv->avail);
> -	kfree(iv->vector);
> +	dma_free_coherent(cio_get_dma_css_dev(), size, iv->vector,
> +			  iv->vector_dma);
>  	kfree(iv);
>  out:
>  	return NULL;
> @@ -182,6 +190,8 @@ void airq_iv_release(struct airq_iv *iv)
>  	kfree(iv->ptr);
>  	kfree(iv->bitlock);
>  	kfree(iv->vector);
> +	dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits),
> +			  iv->vector, iv->vector_dma);
>  	kfree(iv->avail);
>  	kfree(iv);
>  }

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-12 18:22               ` Halil Pasic
@ 2019-05-13 13:29                 ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-13 13:29 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Sebastian Ott, kvm, linux-s390, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller

On Sun, 12 May 2019 20:22:56 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Fri, 10 May 2019 16:10:13 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri, 10 May 2019 00:11:12 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Thu, 9 May 2019 12:11:06 +0200
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >   
> > > > On Wed, 8 May 2019 23:22:10 +0200
> > > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > > >     
> > > > > On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> > > > > Sebastian Ott <sebott@linux.ibm.com> wrote:    
> > > >     
> > > > > > > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> > > > > > >  		unregister_reboot_notifier(&css_reboot_notifier);
> > > > > > >  		goto out_unregister;
> > > > > > >  	}
> > > > > > > +	cio_dma_pool_init();        
> > > > > > 
> > > > > > This is too late for early devices (ccw console!).      
> > > > > 
> > > > > You have already raised concern about this last time (thanks). I think,
> > > > > I've addressed this issue: the cio_dma_pool is only used by the airq
> > > > > stuff. I don't think the ccw console needs it. Please have an other look
> > > > > at patch #6, and explain your concern in more detail if it persists.    
> > > > 
> > > > What about changing the naming/adding comments here, so that (1) folks
> > > > aren't confused by the same thing in the future and (2) folks don't try
> > > > to use that pool for something needed for the early ccw consoles?
> > > >     
> > > 
> > > I'm all for clarity! Suggestions for better names?  
> > 
> > css_aiv_dma_pool, maybe? Or is there other cross-device stuff that may
> > need it?
> >   
> 
> Ouch! I was considering to use cio_dma_zalloc() for the adapter
> interruption vectors but I ended up between the two chairs in the end.
> So with this series there are no uses for cio_dma pool.
> 
> I don't feel strongly about this going one way or the other.
> 
> What speaks against getting rid of the cio_dma_pool and sticking with
> dma_alloc_coherent() is that we waste a DMA page per vector, which is a
> non-obvious side effect.

That would basically mean one DMA page per virtio-ccw device, right?
For single queue devices, this seems like quite a bit of overhead.

Are we expecting many devices in use per guest?

> 
> What speaks against cio_dma_pool is that it is slightly more code, and
> this currently can not be used for very early stuff, which I don't
> think is relevant. 

Unless properly documented, it feels like something you can easily trip
over, however.

I assume that the "very early stuff" is basically only ccw consoles.
Not sure if we can use virtio-serial as an early ccw console -- IIRC
that was only about 3215/3270? While QEMU guests are able to use a 3270
console, this is experimental and I would not count that as a use case.
Anyway, 3215/3270 don't use adapter interrupts, and probably not
anything cross-device, either; so unless early virtio-serial is a
thing, this restriction is fine if properly documented.

> What also used to speak against it is that
> allocations asking for more than a page would just fail, but I addressed
> that in the patch I've hacked up on top of the series, and I'm going to
> paste below. While at it I addressed some other issues as well.

Hm, which "other issues"?

> 
> I've also got code that deals with AIRQ_IV_CACHELINE by turning the
> kmem_cache into a dma_pool.

Isn't that percolating to other airq users again? Or maybe I just don't
understand what you're proposing here...

> 
> Cornelia, Sebastian which approach do you prefer:
> 1) get rid of cio_dma_pool and AIRQ_IV_CACHELINE, and waste a page per
> vector, or
> 2) go with the approach taken by the patch below?

I'm not sure that I properly understand this (yeah, you probably
guessed); so I'm not sure I can make a good call here.

> 
> 
> Regards,
> Halil
> -----------------------8<---------------------------------------------
> From: Halil Pasic <pasic@linux.ibm.com>
> Date: Sun, 12 May 2019 18:08:05 +0200
> Subject: [PATCH] WIP: use cio dma pool for airqs
> 
> Let's not waste a DMA page per adapter interrupt bit vector.
> ---
> Lightly tested...
> ---
>  arch/s390/include/asm/airq.h |  1 -
>  drivers/s390/cio/airq.c      | 10 +++-------
>  drivers/s390/cio/css.c       | 18 +++++++++++++++---
>  3 files changed, 18 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
> index 1492d48..981a3eb 100644
> --- a/arch/s390/include/asm/airq.h
> +++ b/arch/s390/include/asm/airq.h
> @@ -30,7 +30,6 @@ void unregister_adapter_interrupt(struct airq_struct *airq);
>  /* Adapter interrupt bit vector */
>  struct airq_iv {
>  	unsigned long *vector;	/* Adapter interrupt bit vector */
> -	dma_addr_t vector_dma; /* Adapter interrupt bit vector dma */
>  	unsigned long *avail;	/* Allocation bit mask for the bit vector */
>  	unsigned long *bitlock;	/* Lock bit mask for the bit vector */
>  	unsigned long *ptr;	/* Pointer associated with each bit */
> diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
> index 7a5c0a0..f11f437 100644
> --- a/drivers/s390/cio/airq.c
> +++ b/drivers/s390/cio/airq.c
> @@ -136,8 +136,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
>  		goto out;
>  	iv->bits = bits;
>  	size = iv_size(bits);
> -	iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size,
> -						 &iv->vector_dma, GFP_KERNEL);
> +	iv->vector = cio_dma_zalloc(size);
>  	if (!iv->vector)
>  		goto out_free;
>  	if (flags & AIRQ_IV_ALLOC) {
> @@ -172,8 +171,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
>  	kfree(iv->ptr);
>  	kfree(iv->bitlock);
>  	kfree(iv->avail);
> -	dma_free_coherent(cio_get_dma_css_dev(), size, iv->vector,
> -			  iv->vector_dma);
> +	cio_dma_free(iv->vector, size);
>  	kfree(iv);
>  out:
>  	return NULL;
> @@ -189,9 +187,7 @@ void airq_iv_release(struct airq_iv *iv)
>  	kfree(iv->data);
>  	kfree(iv->ptr);
>  	kfree(iv->bitlock);
> -	kfree(iv->vector);
> -	dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits),
> -			  iv->vector, iv->vector_dma);
> +	cio_dma_free(iv->vector, iv_size(iv->bits));
>  	kfree(iv->avail);
>  	kfree(iv);
>  }
> diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
> index 7087cc3..88d9c92 100644
> --- a/drivers/s390/cio/css.c
> +++ b/drivers/s390/cio/css.c
> @@ -1063,7 +1063,10 @@ struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages)
>  static void __gp_dma_free_dma(struct gen_pool *pool,
>  			      struct gen_pool_chunk *chunk, void *data)
>  {
> -	dma_free_coherent((struct device *) data, PAGE_SIZE,
> +
> +	size_t chunk_size = chunk->end_addr - chunk->start_addr + 1;
> +
> +	dma_free_coherent((struct device *) data, chunk_size,
>  			 (void *) chunk->start_addr,
>  			 (dma_addr_t) chunk->phys_addr);
>  }
> @@ -1088,13 +1091,15 @@ void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
>  {
>  	dma_addr_t dma_addr;
>  	unsigned long addr = gen_pool_alloc(gp_dma, size);
> +	size_t chunk_size;
>  
>  	if (!addr) {
> +		chunk_size = round_up(size, PAGE_SIZE);

Doesn't that mean that we still go up to chunks of at least PAGE_SIZE?
Or can vectors now share the same chunk?

>  		addr = (unsigned long) dma_alloc_coherent(dma_dev,
> -					PAGE_SIZE, &dma_addr, CIO_DMA_GFP);
> +					 chunk_size, &dma_addr, CIO_DMA_GFP);
>  		if (!addr)
>  			return NULL;
> -		gen_pool_add_virt(gp_dma, addr, dma_addr, PAGE_SIZE, -1);
> +		gen_pool_add_virt(gp_dma, addr, dma_addr, chunk_size, -1);
>  		addr = gen_pool_alloc(gp_dma, size);
>  	}
>  	return (void *) addr;
> @@ -1108,6 +1113,13 @@ void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size)
>  	gen_pool_free(gp_dma, (unsigned long) cpu_addr, size);
>  }
>  
> +/**
> + * Allocate dma memory from the css global pool. Intended for memory not
> + * specific to any single device within the css.
> + *
> + * Caution: Not suitable for early stuff like console.

Maybe add "Do not use prior to <point in startup>"?

> + *
> + */
>  void *cio_dma_zalloc(size_t size)
>  {
>  	return cio_gp_dma_zalloc(cio_dma_pool, cio_get_dma_css_dev(), size);

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 09/10] virtio/s390: use DMA memory for ccw I/O and classic notifiers
  2019-04-26 18:32   ` Halil Pasic
@ 2019-05-13 13:54     ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-13 13:54 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 26 Apr 2019 20:32:44 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Before this change, virtio-ccw could get away with not using the DMA API
> for the pieces of
> memory it does ccw I/O with. With protected virtualization this has to
> change, since the hypervisor needs to read and sometimes also write these
> pieces of memory.
> 
> The hypervisor is supposed to poke the classic notifiers, if these are
> used, out of band with regards to ccw I/O. So these need to be allocated
> as DMA memory (which is shared memory for protected virtualization
> guests).
> 
> Let us factor out everything from struct virtio_ccw_device that needs to
> be DMA memory into a satellite structure that is allocated as such.
> 
> Note: The control blocks of I/O instructions do not need to be shared.
> These are marshalled by the ultravisor.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 177 +++++++++++++++++++++------------------
>  1 file changed, 96 insertions(+), 81 deletions(-)
> 

> @@ -176,6 +180,22 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
>  	return container_of(vdev, struct virtio_ccw_device, vdev);
>  }
>  
> +static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size)
> +{
> +	return ccw_device_dma_zalloc(to_vc_device(vdev)->cdev, size);
> +}
> +
> +static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
> +				 void *cpu_addr)
> +{
> +	return ccw_device_dma_free(to_vc_device(vdev)->cdev, cpu_addr, size);
> +}

Hm, why do these use leading underscores?

Also, maybe make the _free function safe for NULL to simplify the
cleanup paths?


> +
> +#define vc_dma_alloc_struct(vdev, ptr) \
> +	({ptr = __vc_dma_alloc(vdev, sizeof(*(ptr))); })
> +#define vc_dma_free_struct(vdev, ptr) \
> +	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr))

I find these a bit ugly... does adding a wrapper help that much?

> +
>  static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
>  {
>  	unsigned long i, flags;

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-13  9:41     ` Cornelia Huck
@ 2019-05-14 14:47       ` Jason J. Herne
  -1 siblings, 0 replies; 182+ messages in thread
From: Jason J. Herne @ 2019-05-14 14:47 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On 5/13/19 5:41 AM, Cornelia Huck wrote:
> On Fri, 26 Apr 2019 20:32:41 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
>> As virtio-ccw devices are channel devices, we need to use the dma area
>> for any communication with the hypervisor.
>>
>> This patch addresses the most basic stuff (mostly what is required for
>> virtio-ccw), and does take care of QDIO or any devices.
> 
> "does not take care of QDIO", surely? (Also, what does "any devices"
> mean? Do you mean "every arbitrary device", perhaps?)
> 
>>
>> An interesting side effect is that virtio structures are now going to
>> get allocated in 31 bit addressable storage.
> 
> Hm...
> 
>>
>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>> ---
>>   arch/s390/include/asm/ccwdev.h   |  4 +++
>>   drivers/s390/cio/ccwreq.c        |  8 ++---
>>   drivers/s390/cio/device.c        | 65 +++++++++++++++++++++++++++++++++-------
>>   drivers/s390/cio/device_fsm.c    | 40 ++++++++++++-------------
* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
@ 2019-05-14 14:47       ` Jason J. Herne
  0 siblings, 0 replies; 182+ messages in thread
From: Jason J. Herne @ 2019-05-14 14:47 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Thomas Huth, Claudio Imbrenda, kvm,
	Sebastian Ott, Michael S. Tsirkin, Farhan Ali, Eric Farman,
	virtualization, Christoph Hellwig, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On 5/13/19 5:41 AM, Cornelia Huck wrote:
> On Fri, 26 Apr 2019 20:32:41 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
>> As virtio-ccw devices are channel devices, we need to use the dma area
>> for any communication with the hypervisor.
>>
>> This patch addresses the most basic stuff (mostly what is required for
>> virtio-ccw), and does take care of QDIO or any devices.
> 
> "does not take care of QDIO", surely? (Also, what does "any devices"
> mean? Do you mean "every arbitrary device", perhaps?)
> 
>>
>> An interesting side effect is that virtio structures are now going to
>> get allocated in 31 bit addressable storage.
> 
> Hm...
> 
>>
>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>> ---
>>   arch/s390/include/asm/ccwdev.h   |  4 +++
>>   drivers/s390/cio/ccwreq.c        |  8 ++---
>>   drivers/s390/cio/device.c        | 65 +++++++++++++++++++++++++++++++++-------
>>   drivers/s390/cio/device_fsm.c    | 40 ++++++++++++-------------
>>   drivers/s390/cio/device_id.c     | 18 +++++------
>>   drivers/s390/cio/device_ops.c    | 21 +++++++++++--
>>   drivers/s390/cio/device_pgid.c   | 20 ++++++-------
>>   drivers/s390/cio/device_status.c | 24 +++++++--------
>>   drivers/s390/cio/io_sch.h        | 21 +++++++++----
>>   drivers/s390/virtio/virtio_ccw.c | 10 -------
>>   10 files changed, 148 insertions(+), 83 deletions(-)
> 
> (...)
> 
>> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
>> index 6d989c360f38..bb7a92316fc8 100644
>> --- a/drivers/s390/virtio/virtio_ccw.c
>> +++ b/drivers/s390/virtio/virtio_ccw.c
>> @@ -66,7 +66,6 @@ struct virtio_ccw_device {
>>   	bool device_lost;
>>   	unsigned int config_ready;
>>   	void *airq_info;
>> -	u64 dma_mask;
>>   };
>>   
>>   struct vq_info_block_legacy {
>> @@ -1255,16 +1254,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
>>   		ret = -ENOMEM;
>>   		goto out_free;
>>   	}
>> -
>>   	vcdev->vdev.dev.parent = &cdev->dev;
>> -	cdev->dev.dma_mask = &vcdev->dma_mask;
>> -	/* we are fine with common virtio infrastructure using 64 bit DMA */
>> -	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
>> -	if (ret) {
>> -		dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n");
>> -		goto out_free;
>> -	}
> 
> This means that vring structures now need to fit into 31 bits as well,
> I think? Is there any way to reserve the 31 bit restriction for channel
> subsystem structures and keep vring in the full 64 bit range? (Or am I
> fundamentally misunderstanding something?)
> 

I hope I've understood everything... I'm new to virtio. But from what I'm understanding, 
the vring structure (a.k.a. the VirtQueue) needs to be accessed and modified by both host 
and guest. Therefore the page(s) holding that data need to be marked shared when using 
protected virtualization. This patch set makes use of DMA pages by way of swiotlb (always 
below the 32-bit line, right?) for shared memory. Therefore, a side effect is that all 
shared memory, including VirtQueue data, will be in the DMA zone and in 32-bit memory.

I don't see any restrictions on sharing pages above the 32-bit line. So it seems possible. 
I'm not sure how much more work it would be. I wonder if Halil has considered this? Are we 
worried that virtio data structures are going to be a burden on the 31-bit address space?
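The bouncing described above can be sketched in a few lines of userspace C. This is a
hedged model, not kernel code: bounce_pool stands in for the pre-shared swiotlb area,
and bounce_map()/bounce_unmap() are illustrative names rather than the kernel DMA API.

```c
/* Userspace model of swiotlb-style bouncing: private guest data is
 * copied into a pre-shared pool before the host touches it, and copied
 * back afterwards. All names here are illustrative. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 4096
static unsigned char bounce_pool[POOL_SIZE]; /* stands in for shared DMA memory */
static size_t pool_off;

/* "map": copy a private buffer into the shared pool */
static unsigned char *bounce_map(const void *priv, size_t len)
{
	unsigned char *shared;

	if (pool_off + len > POOL_SIZE)
		return NULL;
	shared = bounce_pool + pool_off;
	pool_off += len;
	memcpy(shared, priv, len);
	return shared;
}

/* "unmap": copy possibly host-modified data back to the private buffer */
static void bounce_unmap(void *priv, const unsigned char *shared, size_t len)
{
	memcpy(priv, shared, len);
}
```

In this model the pool plays the role of the sub-4G swiotlb aperture; whether the pool
itself could sit above the 32-bit line is exactly the open question raised above.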


-- 
-- Jason J. Herne (jjherne@linux.ibm.com)

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 10/10] virtio/s390: make airq summary indicators DMA
  2019-05-08 15:11     ` Pierre Morel
@ 2019-05-15 13:33       ` Michael Mueller
  -1 siblings, 0 replies; 182+ messages in thread
From: Michael Mueller @ 2019-05-15 13:33 UTC (permalink / raw)
  To: pmorel, Halil Pasic, kvm, linux-s390, Cornelia Huck,
	Martin Schwidefsky, Sebastian Ott
  Cc: virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman



On 08.05.19 17:11, Pierre Morel wrote:
> On 26/04/2019 20:32, Halil Pasic wrote:
>> Hypervisor needs to interact with the summary indicators, so these
>> need to be DMA memory as well (at least for protected virtualization
>> guests).
>>
>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>> ---
>>   drivers/s390/virtio/virtio_ccw.c | 24 +++++++++++++++++-------
>>   1 file changed, 17 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/s390/virtio/virtio_ccw.c 
>> b/drivers/s390/virtio/virtio_ccw.c
>> index 613b18001a0c..6058b07fea08 100644
>> --- a/drivers/s390/virtio/virtio_ccw.c
>> +++ b/drivers/s390/virtio/virtio_ccw.c
>> @@ -140,11 +140,17 @@ static int virtio_ccw_use_airq = 1;
>>   struct airq_info {
>>       rwlock_t lock;
>> -    u8 summary_indicator;
>> +    u8 summary_indicator_idx;
>>       struct airq_struct airq;
>>       struct airq_iv *aiv;
>>   };
>>   static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
>> +static u8 *summary_indicators;
>> +
>> +static inline u8 *get_summary_indicator(struct airq_info *info)
>> +{
>> +    return summary_indicators + info->summary_indicator_idx;
>> +}
>>   #define CCW_CMD_SET_VQ 0x13
>>   #define CCW_CMD_VDEV_RESET 0x33
>> @@ -225,7 +231,7 @@ static void virtio_airq_handler(struct airq_struct 
>> *airq)
>>               break;
>>           vring_interrupt(0, (void *)airq_iv_get_ptr(info->aiv, ai));
>>       }
>> -    info->summary_indicator = 0;
>> +    *(get_summary_indicator(info)) = 0;
>>       smp_wmb();
>>       /* Walk through indicators field, summary indicator not active. */
>>       for (ai = 0;;) {
>> @@ -237,7 +243,8 @@ static void virtio_airq_handler(struct airq_struct 
>> *airq)
>>       read_unlock(&info->lock);
>>   }
>> -static struct airq_info *new_airq_info(void)
>> +/* call with airq_areas_lock held */
>> +static struct airq_info *new_airq_info(int index)
>>   {
>>       struct airq_info *info;
>>       int rc;
>> @@ -252,7 +259,8 @@ static struct airq_info *new_airq_info(void)
>>           return NULL;
>>       }
>>       info->airq.handler = virtio_airq_handler;
>> -    info->airq.lsi_ptr = &info->summary_indicator;
>> +    info->summary_indicator_idx = index;
>> +    info->airq.lsi_ptr = get_summary_indicator(info);
>>       info->airq.lsi_mask = 0xff;
>>       info->airq.isc = VIRTIO_AIRQ_ISC;
>>       rc = register_adapter_interrupt(&info->airq);
>> @@ -273,8 +281,9 @@ static unsigned long get_airq_indicator(struct 
>> virtqueue *vqs[], int nvqs,
>>       unsigned long bit, flags;
>>       for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
>> +        /* TODO: this seems to be racy */
> 
> yes, my opinion too; it was already racy. In my opinion we need another 
> patch in another series to fix this.
> 
> However, not sure about the comment.

I will drop this comment for v2 of this patch series.
We shall fix the race with a separate patch.

Michael
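The indexing scheme from the patch quoted above can be sketched as follows. Here
summary_indicators is a plain calloc(), whereas the patch obtains it from
cio_dma_zalloc() so that the whole array sits in one DMA allocation shared with the
hypervisor; each airq_info then carries only an index into that array.

```c
/* Sketch of the patch's indexed summary indicators: one shared array,
 * one index per airq_info. calloc() stands in for cio_dma_zalloc(). */
#include <assert.h>
#include <stdlib.h>

#define MAX_AIRQ_AREAS 8

struct airq_info {
	unsigned char summary_indicator_idx; /* index into the shared array */
};

/* one allocation covers all summary indicators, as in the patch */
static unsigned char *summary_indicators;

static unsigned char *get_summary_indicator(struct airq_info *info)
{
	return summary_indicators + info->summary_indicator_idx;
}
```

Writing through the returned pointer updates the shared array directly, which is why
the real lsi_ptr can simply be set to get_summary_indicator(info).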

> 
>>           if (!airq_areas[i])
>> -            airq_areas[i] = new_airq_info();
>> +            airq_areas[i] = new_airq_info(i);
>>           info = airq_areas[i];
>>           if (!info)
>>               return 0;
>> @@ -359,7 +368,7 @@ static void virtio_ccw_drop_indicator(struct 
>> virtio_ccw_device *vcdev,
>>           if (!thinint_area)
>>               return;
>>           thinint_area->summary_indicator =
>> -            (unsigned long) &airq_info->summary_indicator;
>> +            (unsigned long) get_summary_indicator(airq_info);
>>           thinint_area->isc = VIRTIO_AIRQ_ISC;
>>           ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER;
>>           ccw->count = sizeof(*thinint_area);
>> @@ -624,7 +633,7 @@ static int virtio_ccw_register_adapter_ind(struct 
>> virtio_ccw_device *vcdev,
>>       }
>>       info = vcdev->airq_info;
>>       thinint_area->summary_indicator =
>> -        (unsigned long) &info->summary_indicator;
>> +        (unsigned long) get_summary_indicator(info);
>>       thinint_area->isc = VIRTIO_AIRQ_ISC;
>>       ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER;
>>       ccw->flags = CCW_FLAG_SLI;
>> @@ -1500,6 +1509,7 @@ static int __init virtio_ccw_init(void)
>>   {
>>       /* parse no_auto string before we do anything further */
>>       no_auto_parse();
>> +    summary_indicators = cio_dma_zalloc(MAX_AIRQ_AREAS);
>>       return ccw_driver_register(&virtio_ccw_driver);
>>   }
>>   device_initcall(virtio_ccw_init);
>>
> 
> 
> 

-- 
Mit freundlichen Grüßen / Kind regards
Michael Müller

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 10/10] virtio/s390: make airq summary indicators DMA
  2019-05-13 12:20     ` Cornelia Huck
@ 2019-05-15 13:43       ` Michael Mueller
  -1 siblings, 0 replies; 182+ messages in thread
From: Michael Mueller @ 2019-05-15 13:43 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman



On 13.05.19 14:20, Cornelia Huck wrote:
> On Fri, 26 Apr 2019 20:32:45 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
>> Hypervisor needs to interact with the summary indicators, so these
>> need to be DMA memory as well (at least for protected virtualization
>> guests).
>>
>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>> ---
>>   drivers/s390/virtio/virtio_ccw.c | 24 +++++++++++++++++-------
>>   1 file changed, 17 insertions(+), 7 deletions(-)
> 
> (...)
> 
>> @@ -237,7 +243,8 @@ static void virtio_airq_handler(struct airq_struct *airq)
>>   	read_unlock(&info->lock);
>>   }
>>   
>> -static struct airq_info *new_airq_info(void)
>> +/* call with airq_areas_lock held */
> 
> Hm, where is airq_areas_lock defined? If it was introduced in one of
> the previous patches, I have missed it.

There is no airq_areas_lock defined currently. My assumption is that
this will be used in context with the likely race condition this
part of the patch is talking about.

@@ -273,8 +281,9 @@ static unsigned long get_airq_indicator(struct 
virtqueue *vqs[], int nvqs,
  	unsigned long bit, flags;

  	for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
+		/* TODO: this seems to be racy */
  		if (!airq_areas[i])
-			airq_areas[i] = new_airq_info();
+			airq_areas[i] = new_airq_info(i);


As this shall be handled by a separate patch I will drop the comment
in regard to airq_areas_lock from this patch as well for v2.

Michael
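One hedged sketch of how such a separate patch could close the race, assuming an
airq_areas_lock is introduced: serialize the check-and-allocate so two racing callers
cannot both allocate the same slot. A pthread mutex stands in for a kernel lock, and
get_airq_area() is an illustrative helper name, not from the patch.

```c
/* Userspace model of lock-protected lazy allocation of airq_areas[] */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_AIRQ_AREAS 8

struct airq_info {
	int index;
};

static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
static pthread_mutex_t airq_areas_lock = PTHREAD_MUTEX_INITIALIZER;

static struct airq_info *new_airq_info(int index)
{
	struct airq_info *info = calloc(1, sizeof(*info));

	if (info)
		info->index = index;
	return info;
}

/* check-and-allocate under the lock: racing callers on the same slot
 * cannot both call new_airq_info() and leak one allocation */
static struct airq_info *get_airq_area(int i)
{
	pthread_mutex_lock(&airq_areas_lock);
	if (!airq_areas[i])
		airq_areas[i] = new_airq_info(i);
	pthread_mutex_unlock(&airq_areas_lock);
	return airq_areas[i];
}
```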

> 
>> +static struct airq_info *new_airq_info(int index)
>>   {
>>   	struct airq_info *info;
>>   	int rc;
> 

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 10/10] virtio/s390: make airq summary indicators DMA
  2019-05-15 13:43       ` Michael Mueller
@ 2019-05-15 13:50         ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-15 13:50 UTC (permalink / raw)
  To: Michael Mueller
  Cc: Halil Pasic, kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 15 May 2019 15:43:23 +0200
Michael Mueller <mimu@linux.ibm.com> wrote:

> On 13.05.19 14:20, Cornelia Huck wrote:
> > On Fri, 26 Apr 2019 20:32:45 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> >> Hypervisor needs to interact with the summary indicators, so these
> >> need to be DMA memory as well (at least for protected virtualization
> >> guests).
> >>
> >> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> >> ---
> >>   drivers/s390/virtio/virtio_ccw.c | 24 +++++++++++++++++-------
> >>   1 file changed, 17 insertions(+), 7 deletions(-)  
> > 
> > (...)
> >   
> >> @@ -237,7 +243,8 @@ static void virtio_airq_handler(struct airq_struct *airq)
> >>   	read_unlock(&info->lock);
> >>   }
> >>   
> >> -static struct airq_info *new_airq_info(void)
> >> +/* call with airq_areas_lock held */  
> > 
> > Hm, where is airq_areas_lock defined? If it was introduced in one of
> > the previous patches, I have missed it.  
> 
> There is no airq_areas_lock defined currently. My assumption is that
> this will be used in context with the likely race condition this
> part of the patch is talking about.
> 
> @@ -273,8 +281,9 @@ static unsigned long get_airq_indicator(struct 
> virtqueue *vqs[], int nvqs,
>   	unsigned long bit, flags;
> 
>   	for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
> +		/* TODO: this seems to be racy */
>   		if (!airq_areas[i])
> -			airq_areas[i] = new_airq_info();
> +			airq_areas[i] = new_airq_info(i);
> 
> 
> As this shall be handled by a separate patch I will drop the comment
> in regard to airq_areas_lock from this patch as well for v2.

Ok, that makes sense.

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 04/10] s390/mm: force swiotlb for protected virtualization
  2019-05-09 22:34       ` Halil Pasic
@ 2019-05-15 14:15         ` Michael Mueller
  -1 siblings, 0 replies; 182+ messages in thread
From: Michael Mueller @ 2019-05-15 14:15 UTC (permalink / raw)
  To: Halil Pasic, Claudio Imbrenda
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank, Farhan Ali,
	Eric Farman



On 10.05.19 00:34, Halil Pasic wrote:
> On Wed, 8 May 2019 15:15:40 +0200
> Claudio Imbrenda <imbrenda@linux.ibm.com> wrote:
> 
>> On Fri, 26 Apr 2019 20:32:39 +0200
>> Halil Pasic <pasic@linux.ibm.com> wrote:
>>
>>> On s390, protected virtualization guests have to use bounced I/O
>>> buffers.  That requires some plumbing.
>>>
>>> Let us make sure, any device that uses DMA API with direct ops
>>> correctly is spared from the problems, that a hypervisor attempting
>>> I/O to a non-shared page would bring.
>>>
>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>>> ---
>>>   arch/s390/Kconfig                   |  4 +++
>>>   arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++
>>>   arch/s390/mm/init.c                 | 50
>>> +++++++++++++++++++++++++++++++++++++ 3 files changed, 72
>>> insertions(+) create mode 100644 arch/s390/include/asm/mem_encrypt.h
>>>
>>> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
>>> index 1c3fcf19c3af..5500d05d4d53 100644
>>> --- a/arch/s390/Kconfig
>>> +++ b/arch/s390/Kconfig
>>> @@ -1,4 +1,7 @@
>>>   # SPDX-License-Identifier: GPL-2.0
>>> +config ARCH_HAS_MEM_ENCRYPT
>>> +        def_bool y
>>> +
>>>   config MMU
>>>   	def_bool y
>>>   
>>> @@ -191,6 +194,7 @@ config S390
>>>   	select ARCH_HAS_SCALED_CPUTIME
>>>   	select VIRT_TO_BUS
>>>   	select HAVE_NMI
>>> +	select SWIOTLB
>>>   
>>>   
>>>   config SCHED_OMIT_FRAME_POINTER
>>> diff --git a/arch/s390/include/asm/mem_encrypt.h
>>> b/arch/s390/include/asm/mem_encrypt.h new file mode 100644
>>> index 000000000000..0898c09a888c
>>> --- /dev/null
>>> +++ b/arch/s390/include/asm/mem_encrypt.h
>>> @@ -0,0 +1,18 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> +#ifndef S390_MEM_ENCRYPT_H__
>>> +#define S390_MEM_ENCRYPT_H__
>>> +
>>> +#ifndef __ASSEMBLY__
>>> +
>>> +#define sme_me_mask	0ULL
>>
>> This is rather ugly, but I understand why it's there
>>
> 
> Nod.
> 
>>> +
>>> +static inline bool sme_active(void) { return false; }
>>> +extern bool sev_active(void);
>>> +
>>> +int set_memory_encrypted(unsigned long addr, int numpages);
>>> +int set_memory_decrypted(unsigned long addr, int numpages);
>>> +
>>> +#endif	/* __ASSEMBLY__ */
>>> +
>>> +#endif	/* S390_MEM_ENCRYPT_H__ */
>>> +
>>> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
>>> index 3e82f66d5c61..7e3cbd15dcfa 100644
>>> --- a/arch/s390/mm/init.c
>>> +++ b/arch/s390/mm/init.c
>>> @@ -18,6 +18,7 @@
>>>   #include <linux/mman.h>
>>>   #include <linux/mm.h>
>>>   #include <linux/swap.h>
>>> +#include <linux/swiotlb.h>
>>>   #include <linux/smp.h>
>>>   #include <linux/init.h>
>>>   #include <linux/pagemap.h>
>>> @@ -29,6 +30,7 @@
>>>   #include <linux/export.h>
>>>   #include <linux/cma.h>
>>>   #include <linux/gfp.h>
>>> +#include <linux/dma-mapping.h>
>>>   #include <asm/processor.h>
>>>   #include <linux/uaccess.h>
>>>   #include <asm/pgtable.h>
>>> @@ -42,6 +44,8 @@
>>>   #include <asm/sclp.h>
>>>   #include <asm/set_memory.h>
>>>   #include <asm/kasan.h>
>>> +#include <asm/dma-mapping.h>
>>> +#include <asm/uv.h>
>>>   
>>>   pgd_t swapper_pg_dir[PTRS_PER_PGD] __section(.bss..swapper_pg_dir);
>>>   
>>> @@ -126,6 +130,50 @@ void mark_rodata_ro(void)
>>>   	pr_info("Write protected read-only-after-init data: %luk\n",
>>> size >> 10); }
>>>   
>>> +int set_memory_encrypted(unsigned long addr, int numpages)
>>> +{
>>> +	int i;
>>> +
>>> +	/* make all pages shared, (swiotlb, dma_free) */
>>
>> this is a copypaste typo, I think? (should be UNshared?)
>> also, it doesn't make ALL pages unshared, but only those specified in
>> the parameters
> 
> Right a copy paste error. Needs correction. The all was meant like all
> pages in the range specified by the arguments. But it is better changed
> since it turned out to be confusing.
> 
>>
>> with this fixed:
>> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
>>
> 
> Thanks!
> 
>>> +	for (i = 0; i < numpages; ++i) {
>>> +		uv_remove_shared(addr);
>>> +		addr += PAGE_SIZE;
>>> +	}
>>> +	return 0;
>>> +}
>>> +EXPORT_SYMBOL_GPL(set_memory_encrypted);
>>> +
>>> +int set_memory_decrypted(unsigned long addr, int numpages)
>>> +{
>>> +	int i;
>>> +	/* make all pages shared (swiotlb, dma_alloca) */
>>
>> same here with ALL
>>
>>> +	for (i = 0; i < numpages; ++i) {
>>> +		uv_set_shared(addr);
>>> +		addr += PAGE_SIZE;
>>> +	}
>>> +	return 0;
>>> +}
>>> +EXPORT_SYMBOL_GPL(set_memory_decrypted);
>>> +
>>> +/* are we a protected virtualization guest? */
>>> +bool sev_active(void)
>>
>> this is also ugly. the correct solution would be probably to refactor
>> everything, including all the AMD SEV code.... let's not go there
>>
> 
> Nod. Maybe later.
> 
>>> +{
>>> +	return is_prot_virt_guest();
>>> +}
>>> +EXPORT_SYMBOL_GPL(sev_active);
>>> +
>>> +/* protected virtualization */
>>> +static void pv_init(void)
>>> +{
>>> +	if (!sev_active())
>>
>> can't you just use is_prot_virt_guest here?
>>
> 
> Sure! I guess it would be less confusing. It is something I did not
> remember to change when the interface for this provided by uv.h went
> from sketchy to nice.

integrated in v2

Michael
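The page-wise loop in the quoted set_memory_decrypted() can be modelled in userspace:
walk the range one page at a time and mark each page shared. Here a simple array stands
in for the ultravisor's share state, and all names carrying a _model suffix are
illustrative stand-ins for uv_set_shared() and friends.

```c
/* Userspace mirror of the quoted set_memory_decrypted() page loop */
#include <assert.h>

#define PAGE_SIZE 4096UL
#define NPAGES 16

static int page_shared[NPAGES]; /* stands in for ultravisor share state */

static void uv_set_shared_model(unsigned long addr)
{
	page_shared[addr / PAGE_SIZE] = 1;
}

/* share numpages pages starting at addr, one ultravisor call per page */
static int set_memory_decrypted_model(unsigned long addr, int numpages)
{
	int i;

	for (i = 0; i < numpages; ++i) {
		uv_set_shared_model(addr);
		addr += PAGE_SIZE;
	}
	return 0;
}
```

set_memory_encrypted() is the exact mirror image, un-sharing each page in the range.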

> 
> Thanks again!
> 
> Regards,
> Halil
> 
>>> +		return;
>>> +
>>> +	/* make sure bounce buffers are shared */
>>> +	swiotlb_init(1);
>>> +	swiotlb_update_mem_attributes();
>>> +	swiotlb_force = SWIOTLB_FORCE;
>>> +}
>>> +
>>>   void __init mem_init(void)
>>>   {
>>>   	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
>>> @@ -134,6 +182,8 @@ void __init mem_init(void)
>>>   	set_max_mapnr(max_low_pfn);
>>>           high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
>>>   
>>> +	pv_init();
>>> +
>>>   	/* Setup guest page hinting */
>>>   	cmma_init();
>>>   
>>
> 

-- 
Mit freundlichen Grüßen / Kind regards
Michael Müller

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 182+ messages in thread

>>> +{
>>> +	int i;
>>> +
>>> +	/* make all pages shared, (swiotlb, dma_free) */
>>
>> this is a copypaste typo, I think? (should be UNshared?)
>> also, it doesn't make ALL pages unshared, but only those specified in
>> the parameters
> 
> Right, a copy-paste error. Needs correction. The 'all' was meant as all
> pages in the range specified by the arguments. But it is better changed
> since it turned out to be confusing.
> 
>>
>> with this fixed:
>> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
>>
> 
> Thanks!
> 
>>> +	for (i = 0; i < numpages; ++i) {
>>> +		uv_remove_shared(addr);
>>> +		addr += PAGE_SIZE;
>>> +	}
>>> +	return 0;
>>> +}
>>> +EXPORT_SYMBOL_GPL(set_memory_encrypted);
>>> +
>>> +int set_memory_decrypted(unsigned long addr, int numpages)
>>> +{
>>> +	int i;
>>> +	/* make all pages shared (swiotlb, dma_alloca) */
>>
>> same here with ALL
>>
>>> +	for (i = 0; i < numpages; ++i) {
>>> +		uv_set_shared(addr);
>>> +		addr += PAGE_SIZE;
>>> +	}
>>> +	return 0;
>>> +}
>>> +EXPORT_SYMBOL_GPL(set_memory_decrypted);
>>> +
>>> +/* are we a protected virtualization guest? */
>>> +bool sev_active(void)
>>
>> this is also ugly. the correct solution would be probably to refactor
>> everything, including all the AMD SEV code.... let's not go there
>>
> 
> Nod. Maybe later.
> 
>>> +{
>>> +	return is_prot_virt_guest();
>>> +}
>>> +EXPORT_SYMBOL_GPL(sev_active);
>>> +
>>> +/* protected virtualization */
>>> +static void pv_init(void)
>>> +{
>>> +	if (!sev_active())
>>
>> can't you just use is_prot_virt_guest here?
>>
> 
> Sure! I guess it would be less confusing. It is something I did not
> remember to change when the interface for this provided by uv.h went
> from sketchy to nice.

integrated in v2

Michael

> 
> Thanks again!
> 
> Regards,
> Halil
> 
>>> +		return;
>>> +
>>> +	/* make sure bounce buffers are shared */
>>> +	swiotlb_init(1);
>>> +	swiotlb_update_mem_attributes();
>>> +	swiotlb_force = SWIOTLB_FORCE;
>>> +}
>>> +
>>>   void __init mem_init(void)
>>>   {
>>>   	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
>>> @@ -134,6 +182,8 @@ void __init mem_init(void)
>>>   	set_max_mapnr(max_low_pfn);
>>>           high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
>>>   
>>> +	pv_init();
>>> +
>>>   	/* Setup guest page hinting */
>>>   	cmma_init();
>>>   
>>
> 

-- 
Mit freundlichen Grüßen / Kind regards
Michael Müller

IBM Deutschland Research & Development GmbH

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-13 13:29                 ` Cornelia Huck
@ 2019-05-15 17:12                   ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-15 17:12 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Sebastian Ott, kvm, linux-s390, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller

On Mon, 13 May 2019 15:29:24 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Sun, 12 May 2019 20:22:56 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Fri, 10 May 2019 16:10:13 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > On Fri, 10 May 2019 00:11:12 +0200
> > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > >   
> > > > On Thu, 9 May 2019 12:11:06 +0200
> > > > Cornelia Huck <cohuck@redhat.com> wrote:
> > > >   
> > > > > On Wed, 8 May 2019 23:22:10 +0200
> > > > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > > > >     
> > > > > > On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> > > > > > Sebastian Ott <sebott@linux.ibm.com> wrote:    
> > > > >     
> > > > > > > > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> > > > > > > >  		unregister_reboot_notifier(&css_reboot_notifier);
> > > > > > > >  		goto out_unregister;
> > > > > > > >  	}
> > > > > > > > +	cio_dma_pool_init();        
> > > > > > > 
> > > > > > > This is too late for early devices (ccw console!).      
> > > > > > 
> > > > > > You have already raised concern about this last time (thanks). I think,
> > > > > > I've addressed this issue: the cio_dma_pool is only used by the airq
> > > > > > stuff. I don't think the ccw console needs it. Please have an other look
> > > > > > at patch #6, and explain your concern in more detail if it persists.    
> > > > > 
> > > > > What about changing the naming/adding comments here, so that (1) folks
> > > > > aren't confused by the same thing in the future and (2) folks don't try
> > > > > to use that pool for something needed for the early ccw consoles?
> > > > >     
> > > > 
> > > > I'm all for clarity! Suggestions for better names?  
> > > 
> > > css_aiv_dma_pool, maybe? Or is there other cross-device stuff that may
> > > need it?
> > >   
> > 
> > Ouch! I was considering to use cio_dma_zalloc() for the adapter
> > interruption vectors but I ended up between the two chairs in the end.
> > So with this series there are no uses for cio_dma pool.
> > 
> > I don't feel strongly about this going one way or the other.
> > 
> > Against getting rid of the cio_dma_pool and sticking with
> > dma_alloc_coherent() speaks the fact that we waste a DMA page per
> > vector, which is a non-obvious side effect.
> 
> That would basically mean one DMA page per virtio-ccw device, right?

Not quite: virtio-ccw shares airq vectors among multiple devices. We
allocate 64 bytes at a time and use that as long as we don't run out of
bits.
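The sharing scheme described above can be modeled with a toy sketch (a
hypothetical userspace model, not the actual airq_iv code; the names
toy_iv and toy_iv_alloc are invented for illustration): one 64-byte
(512-bit) indicator chunk serves many devices, one bit per indicator, and
more backing memory is only needed once every bit is taken.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define CHUNK_BITS 512 /* one 64-byte chunk, as in the 64-bytes-at-a-time scheme */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

struct toy_iv {
	unsigned long used[CHUNK_BITS / WORD_BITS]; /* allocation bit mask */
};

/* Find and claim a free bit; return its index, or -1 if the chunk is full. */
static long toy_iv_alloc(struct toy_iv *iv)
{
	for (size_t i = 0; i < CHUNK_BITS; i++) {
		unsigned long *w = &iv->used[i / WORD_BITS];
		unsigned long mask = 1UL << (i % WORD_BITS);

		if (!(*w & mask)) {
			*w |= mask;
			return (long)i;
		}
	}
	return -1; /* only now would another 64-byte chunk be allocated */
}
```

With one bit per queue, a single DMA-backed chunk can serve hundreds of
indicators before a second allocation is needed, which is why burning a
full DMA page per vector would be disproportionate.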
 
> For single queue devices, this seems like quite a bit of overhead.
>

Nod.
 
> Are we expecting many devices in use per guest?

This affects Linux in general, not only guest 2 stuff (i.e. we also
waste in vanilla LPAR mode). And zpci seems to do at least one
airq_iv_create() per PCI function.

> 
> > 
> > What speaks against cio_dma_pool is that it is slightly more code, and
> > this currently can not be used for very early stuff, which I don't
> > think is relevant. 
> 
> Unless properly documented, it feels like something you can easily trip
> over, however.
> 
> I assume that the "very early stuff" is basically only ccw consoles.
> Not sure if we can use virtio-serial as an early ccw console -- IIRC
> that was only about 3215/3270? While QEMU guests are able to use a 3270
> console, this is experimental and I would not count that as a use case.
> Anyway, 3215/3270 don't use adapter interrupts, and probably not
> anything cross-device, either; so unless early virtio-serial is a
> thing, this restriction is fine if properly documented.
> 

Mimu, can you dig into this a bit?

We could also aim for getting rid of this limitation. One idea would be
some sort of lazy initialization (i.e. triggered by first usage).
Another idea would be simply doing the initialization earlier.
Unfortunately I'm not all that familiar with the early stuff. Is there
some doc you can recommend?

Anyway, before investing much more into this, I think we should have a
fair idea which option we prefer...
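The lazy-initialization option could be sketched roughly like this (a
hypothetical userspace model with invented names; the real code would
need proper locking or a once-primitive, or alternatively an init call
moved to an earlier point in the boot sequence):

```c
#include <assert.h>
#include <stdbool.h>

static bool pool_ready; /* would be guarded by a lock or once-primitive */

/*
 * Stands in for cio_dma_pool_init(): idempotent, performs the setup on
 * first use so no caller depends on an explicit early init call.
 */
static bool toy_pool_ensure(void)
{
	if (!pool_ready) {
		/* ... allocate the backing DMA pages here ... */
		pool_ready = true;
	}
	return pool_ready;
}
```

Every allocation entry point would funnel through such a helper before
touching the pool, so even an early user triggers the initialization.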

> > What also used to speak against it is that
> > allocations asking for more than a page would just fail, but I addressed
> > that in the patch I've hacked up on top of the series, and I'm going to
> > paste below. While at it I addressed some other issues as well.
> 
> Hm, which "other issues"?
> 

The kfree() I've forgotten to get rid of, and this 'document does not
work early' (pun intended) business.

> > 
> > I've also got code that deals with AIRQ_IV_CACHELINE by turning the
> > kmem_cache into a dma_pool.
> 
> Isn't that percolating to other airq users again? Or maybe I just don't
> understand what you're proposing here...

Absolutely.

> 
> > 
> > Cornelia, Sebastian which approach do you prefer:
> > 1) get rid of cio_dma_pool and AIRQ_IV_CACHELINE, and waste a page per
> > vector, or
> > 2) go with the approach taken by the patch below?
> 
> I'm not sure that I properly understand this (yeah, you probably
> guessed); so I'm not sure I can make a good call here.
> 

I hope I managed to clarify some of the questions. Please keep asking
if stuff remains unclear. I'm not a great communicator, but quite a
tenacious one.

I hope Sebastian will chime in as well.

> > 
> > 
> > Regards,
> > Halil
> > -----------------------8<---------------------------------------------
> > From: Halil Pasic <pasic@linux.ibm.com>
> > Date: Sun, 12 May 2019 18:08:05 +0200
> > Subject: [PATCH] WIP: use cio dma pool for airqs
> > 
> > Let's not waste a DMA page per adapter interrupt bit vector.
> > ---
> > Lightly tested...
> > ---
> >  arch/s390/include/asm/airq.h |  1 -
> >  drivers/s390/cio/airq.c      | 10 +++-------
> >  drivers/s390/cio/css.c       | 18 +++++++++++++++---
> >  3 files changed, 18 insertions(+), 11 deletions(-)
> > 
> > diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
> > index 1492d48..981a3eb 100644
> > --- a/arch/s390/include/asm/airq.h
> > +++ b/arch/s390/include/asm/airq.h
> > @@ -30,7 +30,6 @@ void unregister_adapter_interrupt(struct airq_struct *airq);
> >  /* Adapter interrupt bit vector */
> >  struct airq_iv {
> >  	unsigned long *vector;	/* Adapter interrupt bit vector */
> > -	dma_addr_t vector_dma; /* Adapter interrupt bit vector dma */
> >  	unsigned long *avail;	/* Allocation bit mask for the bit vector */
> >  	unsigned long *bitlock;	/* Lock bit mask for the bit vector */
> >  	unsigned long *ptr;	/* Pointer associated with each bit */
> > diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
> > index 7a5c0a0..f11f437 100644
> > --- a/drivers/s390/cio/airq.c
> > +++ b/drivers/s390/cio/airq.c
> > @@ -136,8 +136,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
> >  		goto out;
> >  	iv->bits = bits;
> >  	size = iv_size(bits);
> > -	iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size,
> > -						 &iv->vector_dma, GFP_KERNEL);
> > +	iv->vector = cio_dma_zalloc(size);
> >  	if (!iv->vector)
> >  		goto out_free;
> >  	if (flags & AIRQ_IV_ALLOC) {
> > @@ -172,8 +171,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
> >  	kfree(iv->ptr);
> >  	kfree(iv->bitlock);
> >  	kfree(iv->avail);
> > -	dma_free_coherent(cio_get_dma_css_dev(), size, iv->vector,
> > -			  iv->vector_dma);
> > +	cio_dma_free(iv->vector, size);
> >  	kfree(iv);
> >  out:
> >  	return NULL;
> > @@ -189,9 +187,7 @@ void airq_iv_release(struct airq_iv *iv)
> >  	kfree(iv->data);
> >  	kfree(iv->ptr);
> >  	kfree(iv->bitlock);
> > -	kfree(iv->vector);
> > -	dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits),
> > -			  iv->vector, iv->vector_dma);
> > +	cio_dma_free(iv->vector, iv_size(iv->bits));
> >  	kfree(iv->avail);
> >  	kfree(iv);
> >  }
> > diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
> > index 7087cc3..88d9c92 100644
> > --- a/drivers/s390/cio/css.c
> > +++ b/drivers/s390/cio/css.c
> > @@ -1063,7 +1063,10 @@ struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages)
> >  static void __gp_dma_free_dma(struct gen_pool *pool,
> >  			      struct gen_pool_chunk *chunk, void *data)
> >  {
> > -	dma_free_coherent((struct device *) data, PAGE_SIZE,
> > +
> > +	size_t chunk_size = chunk->end_addr - chunk->start_addr + 1;
> > +
> > +	dma_free_coherent((struct device *) data, chunk_size,
> >  			 (void *) chunk->start_addr,
> >  			 (dma_addr_t) chunk->phys_addr);
> >  }
> > @@ -1088,13 +1091,15 @@ void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
> >  {
> >  	dma_addr_t dma_addr;
> >  	unsigned long addr = gen_pool_alloc(gp_dma, size);
> > +	size_t chunk_size;
> >  
> >  	if (!addr) {
> > +		chunk_size = round_up(size, PAGE_SIZE);
> 
> Doesn't that mean that we still go up to chunks of at least PAGE_SIZE?
> Or can vectors now share the same chunk?

Exactly! We put the allocated DMA memory into the genpool. So if the next
request can be served from what is already in the genpool, we don't end
up in this fallback path where we grow the pool.
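The fallback path being discussed can be modeled like this (a hedged
sketch with malloc() standing in for dma_alloc_coherent() and a plain
bump allocator standing in for the genpool; the real genpool keeps all
chunks, while this toy keeps only the newest for brevity): serve from the
pool first, and only on a miss add a page-rounded chunk and retry, so
later small requests are served from the remainder of that chunk.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096UL

struct toy_pool {
	char *chunk;	/* current backing chunk (would be DMA memory) */
	size_t offset;	/* next free byte within the chunk */
	size_t size;	/* size of the current chunk */
};

/* Serve from the pool; on a miss, grow by a page-rounded chunk and retry. */
static void *toy_pool_zalloc(struct toy_pool *p, size_t size)
{
	void *addr;

	if (!p->chunk || p->offset + size > p->size) {
		size_t chunk_size = (size + TOY_PAGE_SIZE - 1) & ~(TOY_PAGE_SIZE - 1);
		char *chunk = malloc(chunk_size); /* dma_alloc_coherent() in cio */

		if (!chunk)
			return NULL;
		p->chunk = chunk;
		p->offset = 0;
		p->size = chunk_size;
	}
	addr = p->chunk + p->offset;
	p->offset += size;
	memset(addr, 0, size);
	return addr;
}
```

Two consecutive 64-byte requests land in the same 4 KiB chunk; only the
first one takes the grow path.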

> 
> >  		addr = (unsigned long) dma_alloc_coherent(dma_dev,
> > -					PAGE_SIZE, &dma_addr, CIO_DMA_GFP);
> > +					 chunk_size, &dma_addr, CIO_DMA_GFP);
> >  		if (!addr)
> >  			return NULL;
> > -		gen_pool_add_virt(gp_dma, addr, dma_addr, PAGE_SIZE, -1);
> > +		gen_pool_add_virt(gp_dma, addr, dma_addr, chunk_size, -1);
> >  		addr = gen_pool_alloc(gp_dma, size);

BTW I think it would be good to recover from this alloc failing due to
a race (quite unlikely with small allocations though...).

> >  	}
> >  	return (void *) addr;
> > @@ -1108,6 +1113,13 @@ void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size)
> >  	gen_pool_free(gp_dma, (unsigned long) cpu_addr, size);
> >  }
> >  
> > +/**
> > + * Allocate dma memory from the css global pool. Intended for memory not
> > + * specific to any single device within the css.
> > + *
> > + * Caution: Not suitable for early stuff like console.
> 
> Maybe add "Do not use prior to <point in startup>"?
> 

I'm not awfully familiar with the well-known 'points in startup'. Can
you recommend some documentation on this topic?


Regards,
Halil

> > + *
> > + */
> >  void *cio_dma_zalloc(size_t size)
> >  {
> >  	return cio_gp_dma_zalloc(cio_dma_pool, cio_get_dma_css_dev(), size);
> 

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 10/10] virtio/s390: make airq summary indicators DMA
  2019-05-15 13:43       ` Michael Mueller
@ 2019-05-15 17:18         ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-15 17:18 UTC (permalink / raw)
  To: Michael Mueller
  Cc: Cornelia Huck, kvm, linux-s390, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On Wed, 15 May 2019 15:43:23 +0200
Michael Mueller <mimu@linux.ibm.com> wrote:

> > Hm, where is airq_areas_lock defined? If it was introduced in one of
> > the previous patches, I have missed it.  
> 
> There is no airq_areas_lock defined currently. My assumption is that
> this will be used in the context of the likely race condition this
> part of the patch is talking about.

Right! I first started resolving the race, but then decided to discuss
the issue first, because if I had just hallucinated that race, it would
be a lot of wasted effort. Unfortunately I forgot to get rid of this
comment.

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 10/10] virtio/s390: make airq summary indicators DMA
  2019-05-15 13:33       ` Michael Mueller
@ 2019-05-15 17:23         ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-15 17:23 UTC (permalink / raw)
  To: Michael Mueller
  Cc: pmorel, kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On Wed, 15 May 2019 15:33:02 +0200
Michael Mueller <mimu@linux.ibm.com> wrote:

> >> @@ -273,8 +281,9 @@ static unsigned long get_airq_indicator(struct 
> >> virtqueue *vqs[], int nvqs,
> >>       unsigned long bit, flags;
> >>       for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
> >> +        /* TODO: this seems to be racy */  
> > 
> > yes, my opinion too, it was already racy; in my opinion we need another 
> > patch in another series to fix this.
> > 
> > However, not sure about the comment.  
> 
> I will drop this comment for v2 of this patch series.
> We shall fix the race with a separate patch.

Unless there is somebody eager to address this real soon, I would prefer
keeping the comment as a reminder.

Thanks for shouldering the v2!

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-13  9:41     ` Cornelia Huck
@ 2019-05-15 20:51       ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-15 20:51 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Mon, 13 May 2019 11:41:36 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri, 26 Apr 2019 20:32:41 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > As virtio-ccw devices are channel devices, we need to use the dma area
> > for any communication with the hypervisor.
> > 
> > This patch addresses the most basic stuff (mostly what is required for
> > virtio-ccw), and does take care of QDIO or any devices.
> 
> "does not take care of QDIO", surely? 

I did not bother making the QDIO library code use dma memory for
anything that is conceptually dma memory. AFAIK QDIO is out of scope for
prot virt for now. If one were to do some emulated qdio with prot virt
guests, one would need to make a bunch of things shared.

> (Also, what does "any devices"
> mean? Do you mean "every arbitrary device", perhaps?)

What I mean is: this patch takes care of the core stuff, but any
particular device is likely to have to do more -- that is, not all
cio devices support prot virt with this patch alone. For example,
virtio-ccw needs to make sure that the ccws constituting the channel
programs, as well as the data pointed to by the ccws, are shared. If one
wanted to make vfio-ccw DASD pass-through work under prot virt, one
would need to make sure that everything that needs to be shared is
shared (data buffers, channel programs).

Does this clarify things?

> 
> > 
> > An interesting side effect is that virtio structures are now going to
> > get allocated in 31 bit addressable storage.
> 
> Hm...
> 
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  arch/s390/include/asm/ccwdev.h   |  4 +++
> >  drivers/s390/cio/ccwreq.c        |  8 ++---
> >  drivers/s390/cio/device.c        | 65 +++++++++++++++++++++++++++++++++-------
> >  drivers/s390/cio/device_fsm.c    | 40 ++++++++++++-------------
> >  drivers/s390/cio/device_id.c     | 18 +++++------
> >  drivers/s390/cio/device_ops.c    | 21 +++++++++++--
> >  drivers/s390/cio/device_pgid.c   | 20 ++++++-------
> >  drivers/s390/cio/device_status.c | 24 +++++++--------
> >  drivers/s390/cio/io_sch.h        | 21 +++++++++----
> >  drivers/s390/virtio/virtio_ccw.c | 10 -------
> >  10 files changed, 148 insertions(+), 83 deletions(-)
> 
> (...)
> 
> > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > index 6d989c360f38..bb7a92316fc8 100644
> > --- a/drivers/s390/virtio/virtio_ccw.c
> > +++ b/drivers/s390/virtio/virtio_ccw.c
> > @@ -66,7 +66,6 @@ struct virtio_ccw_device {
> >  	bool device_lost;
> >  	unsigned int config_ready;
> >  	void *airq_info;
> > -	u64 dma_mask;
> >  };
> >  
> >  struct vq_info_block_legacy {
> > @@ -1255,16 +1254,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> >  		ret = -ENOMEM;
> >  		goto out_free;
> >  	}
> > -
> >  	vcdev->vdev.dev.parent = &cdev->dev;
> > -	cdev->dev.dma_mask = &vcdev->dma_mask;
> > -	/* we are fine with common virtio infrastructure using 64 bit DMA */
> > -	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
> > -	if (ret) {
> > -		dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n");
> > -		goto out_free;
> > -	}
> 
> This means that vring structures now need to fit into 31 bits as well,
> I think?

Nod.

> Is there any way to reserve the 31 bit restriction for channel
> subsystem structures and keep vring in the full 64 bit range? (Or am I
> fundamentally misunderstanding something?)
> 

At the root of this problem is that the DMA API basically says devices
may have addressing limitations expressed by the dma_mask, while our
addressing limitations come not from the device but from the I/O
arch: e.g. orb.cpa and ccw.cda are 31 bit addresses. In our case it
depends on how and for what the device is going to use the memory (e.g.
buffers addressed by MIDA vs IDA vs direct).

Virtio uses the DMA properties of the parent, that is in our case the
struct device embedded in struct ccw_device.

The previous version (RFC) used to allocate all the cio DMA stuff from
a global cio_dma_pool, using css0.dev for the DMA API
interactions. And we set *css0.dev.dma_mask == DMA_BIT_MASK(31) so that
e.g. the allocated ccws are 31 bit addressable.
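To make the 31 bit constraint concrete: anything the channel subsystem dereferences through orb.cpa or ccw.cda must sit below the 2 GiB line. A minimal, hypothetical helper (plain C, not kernel code) expressing that check could look like:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch, not kernel API: orb.cpa and ccw.cda are 31-bit
 * addresses, so a buffer handed to the channel subsystem is only usable
 * if its last byte is still addressable in 31 bits. */
static int addr_is_31bit(uint64_t addr, uint64_t len)
{
	return len > 0 && addr + len - 1 <= 0x7fffffffULL;
}
```

This is exactly the property that *css0.dev.dma_mask == DMA_BIT_MASK(31) guarantees for the pool's allocations.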

But I was asked to change this so that when I allocate DMA memory for a
channel program of a particular ccw device, the struct device of that ccw
device is used as the first argument of dma_alloc_coherent().

Considering

void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
                gfp_t flag, unsigned long attrs)
{
        const struct dma_map_ops *ops = get_dma_ops(dev);
        void *cpu_addr;

        WARN_ON_ONCE(dev && !dev->coherent_dma_mask);

        if (dma_alloc_from_dev_coherent(dev, size, dma_handle, &cpu_addr))
                return cpu_addr;

        /* let the implementation decide on the zone to allocate from: */
        flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);

that is, the GFP zone flags get dropped. This implies that we really want
cdev->dev restricted to 31 bit addressable memory, because we can't tell
(with the current common DMA code) "hey, this piece of DMA memory you
are about to allocate for me must be 31 bit addressable" (using GFP_DMA
as we usually do).

So, as described in the commit message, the vring stuff being forced
into ZONE_DMA is an unfortunate consequence of this all.

A side note: making the subchannel device 'own' the DMA stuff of a ccw
device (something that was discussed in the RFC thread) is tricky
because the ccw device may outlive the subchannel (all that orphan
stuff).

So the answer is: it is technically possible (e.g. see RFC) but it comes
at a price, and I see no obviously brilliant solution.

Regards,
Halil

> > -
> >  	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
> >  				   GFP_DMA | GFP_KERNEL);
> >  	if (!vcdev->config_block) {
> 

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-14 14:47       ` Jason J. Herne
@ 2019-05-15 21:08         ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-15 21:08 UTC (permalink / raw)
  To: Jason J. Herne
  Cc: Cornelia Huck, kvm, linux-s390, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On Tue, 14 May 2019 10:47:34 -0400
"Jason J. Herne" <jjherne@linux.ibm.com> wrote:

> On 5/13/19 5:41 AM, Cornelia Huck wrote:
> > On Fri, 26 Apr 2019 20:32:41 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> > 
> >> As virtio-ccw devices are channel devices, we need to use the dma area
> >> for any communication with the hypervisor.
> >>
> >> This patch addresses the most basic stuff (mostly what is required for
> >> virtio-ccw), and does take care of QDIO or any devices.
> > 
> > "does not take care of QDIO", surely? (Also, what does "any devices"
> > mean? Do you mean "every arbitrary device", perhaps?)
> > 
> >>
> >> An interesting side effect is that virtio structures are now going to
> >> get allocated in 31 bit addressable storage.
> > 
> > Hm...
> > 
> >>
> >> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> >> ---
> >>   arch/s390/include/asm/ccwdev.h   |  4 +++
> >>   drivers/s390/cio/ccwreq.c        |  8 ++---
> >>   drivers/s390/cio/device.c        | 65 +++++++++++++++++++++++++++++++++-------
> >>   drivers/s390/cio/device_fsm.c    | 40 ++++++++++++-------------
> >>   drivers/s390/cio/device_id.c     | 18 +++++------
> >>   drivers/s390/cio/device_ops.c    | 21 +++++++++++--
> >>   drivers/s390/cio/device_pgid.c   | 20 ++++++-------
> >>   drivers/s390/cio/device_status.c | 24 +++++++--------
> >>   drivers/s390/cio/io_sch.h        | 21 +++++++++----
> >>   drivers/s390/virtio/virtio_ccw.c | 10 -------
> >>   10 files changed, 148 insertions(+), 83 deletions(-)
> > 
> > (...)
> > 
> >> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> >> index 6d989c360f38..bb7a92316fc8 100644
> >> --- a/drivers/s390/virtio/virtio_ccw.c
> >> +++ b/drivers/s390/virtio/virtio_ccw.c
> >> @@ -66,7 +66,6 @@ struct virtio_ccw_device {
> >>   	bool device_lost;
> >>   	unsigned int config_ready;
> >>   	void *airq_info;
> >> -	u64 dma_mask;
> >>   };
> >>   
> >>   struct vq_info_block_legacy {
> >> @@ -1255,16 +1254,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> >>   		ret = -ENOMEM;
> >>   		goto out_free;
> >>   	}
> >> -
> >>   	vcdev->vdev.dev.parent = &cdev->dev;
> >> -	cdev->dev.dma_mask = &vcdev->dma_mask;
> >> -	/* we are fine with common virtio infrastructure using 64 bit DMA */
> >> -	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
> >> -	if (ret) {
> >> -		dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n");
> >> -		goto out_free;
> >> -	}
> > 
> > This means that vring structures now need to fit into 31 bits as well,
> > I think? Is there any way to reserve the 31 bit restriction for channel
> > subsystem structures and keep vring in the full 64 bit range? (Or am I
> > fundamentally misunderstanding something?)
> > 
> 
> I hope I've understood everything... I'm new to virtio. But from what I'm understanding, 
> the vring structure (a.k.a. the VirtQueue) needs to be accessed and modified by both host 
> and guest. Therefore the page(s) holding that data need to be marked shared if using 
> protected virtualization. This patch set makes use of DMA pages by way of swiotlb (always 
> below 32-bit line right?) for shared memory.

The last sentence is wrong. You have to differentiate between stuff that
is mapped as DMA and stuff that is allocated as DMA. The mapped stuff is
handled via swiotlb and bouncing. But that cannot work for the vring
stuff, which needs to be allocated as DMA.
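The mapped-versus-allocated distinction can be illustrated with a toy model (plain C, deliberately not the kernel API): a mapped buffer can be bounced through a shared window for the duration of one transfer, while a region like the vring must itself be allocated shared, because both sides dereference it directly and concurrently.

```c
#include <assert.h>
#include <string.h>

/* Toy model of swiotlb-style bouncing: the guest's private buffer is
 * copied into a shared bounce window on "map" and copied back on
 * "unmap". This works for streaming (mapped) DMA, but not for a vring,
 * which the host reads and writes at arbitrary times and which must
 * therefore live in shared memory from the start. */
static char shared_window[64];	/* stands in for the swiotlb bounce buffer */

static void dma_map_bounce(const char *private_buf, size_t len)
{
	memcpy(shared_window, private_buf, len);	/* bounce on map */
}

static void dma_unmap_bounce(char *private_buf, size_t len)
{
	memcpy(private_buf, shared_window, len);	/* bounce back on unmap */
}
```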

> Therefore, a side effect is that all shared 
> memory, including VirtQueue data will be in the DMA zone and in 32-bit memory.
> 

Consequently, that conclusion is wrong as well. I explained the reason in
my reply to Connie (see there).

> I don't see any restrictions on sharing pages above the 32-bit line. So it seems possible. 
> I'm not sure how much more work it would be. I wonder if Halil has considered this?

I did consider this, the RFC was doing this (again see other mail).

> Are we 
> worried that virtio data structures are going to be a burden on the 31-bit address space?
> 
> 

That is a good question I can not answer. Since it is currently at least
a page per queue (because we use dma direct, right Mimu?), I am concerned
about this.

Connie, what is your opinion?

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-15 17:12                   ` Halil Pasic
@ 2019-05-16  6:13                     ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-16  6:13 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Sebastian Ott, kvm, linux-s390, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller

On Wed, 15 May 2019 19:12:57 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Mon, 13 May 2019 15:29:24 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Sun, 12 May 2019 20:22:56 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Fri, 10 May 2019 16:10:13 +0200
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >   
> > > > On Fri, 10 May 2019 00:11:12 +0200
> > > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > > >     
> > > > > On Thu, 9 May 2019 12:11:06 +0200
> > > > > Cornelia Huck <cohuck@redhat.com> wrote:
> > > > >     
> > > > > > On Wed, 8 May 2019 23:22:10 +0200
> > > > > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > > > > >       
> > > > > > > On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> > > > > > > Sebastian Ott <sebott@linux.ibm.com> wrote:      
> > > > > >       
> > > > > > > > > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> > > > > > > > >  		unregister_reboot_notifier(&css_reboot_notifier);
> > > > > > > > >  		goto out_unregister;
> > > > > > > > >  	}
> > > > > > > > > +	cio_dma_pool_init();          
> > > > > > > > 
> > > > > > > > This is too late for early devices (ccw console!).        
> > > > > > > 
> > > > > > > You have already raised concern about this last time (thanks). I think,
> > > > > > > I've addressed this issue: the cio_dma_pool is only used by the airq
> > > > > > > stuff. I don't think the ccw console needs it. Please have an other look
> > > > > > > at patch #6, and explain your concern in more detail if it persists.      
> > > > > > 
> > > > > > What about changing the naming/adding comments here, so that (1) folks
> > > > > > aren't confused by the same thing in the future and (2) folks don't try
> > > > > > to use that pool for something needed for the early ccw consoles?
> > > > > >       
> > > > > 
> > > > > I'm all for clarity! Suggestions for better names?    
> > > > 
> > > > css_aiv_dma_pool, maybe? Or is there other cross-device stuff that may
> > > > need it?
> > > >     
> > > 
> > > Ouch! I was considering to use cio_dma_zalloc() for the adapter
> > > interruption vectors but I ended up between the two chairs in the end.
> > > So with this series there are no uses for cio_dma pool.
> > > 
> > > I don't feel strongly about this going one way the other.
> > > 
> > > What speaks against getting rid of the cio_dma_pool and sticking
> > > with dma_alloc_coherent() is that we waste a DMA page per vector,
> > > which is a non-obvious side effect.
> > 
> > That would basically mean one DMA page per virtio-ccw device, right?  
> 
> Not quite: virtio-ccw shares airq vectors among multiple devices. We
> alloc 64 bytes a time and use that as long as we don't run out of bits.
>  
> > For single queue devices, this seems like quite a bit of overhead.
> >  
> 
> Nod.
>  
> > Are we expecting many devices in use per guest?  
> 
> This affects Linux in general, not only guest 2 stuff (i.e. we
> also waste in vanilla LPAR mode). And zpci seems to do at least one
> airq_iv_create() per PCI function.
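The sharing scheme described above (one 64-byte vector serving many devices until its bits run out) can be sketched as a toy first-free-bit allocator. This is illustrative only; the real allocation logic lives in drivers/s390/cio/airq.c, and the names below are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Toy sketch: each device claims the next free bit of a shared 64-byte
 * summary vector; only when all 512 bits are taken would a new vector
 * (and thus a new DMA allocation) be needed. */
#define VEC_BYTES 64
#define VEC_BITS  (VEC_BYTES * 8)

static int alloc_airq_bit(uint8_t vec[VEC_BYTES])
{
	for (int i = 0; i < VEC_BITS; i++) {
		if (!(vec[i / 8] & (1u << (i % 8)))) {
			vec[i / 8] |= 1u << (i % 8);
			return i;	/* bit number handed to the device */
		}
	}
	return -1;	/* vector exhausted, caller must allocate a new one */
}
```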

Hm, I thought that guests with virtio-ccw devices were the heaviest
users of this.

On LPAR (which would be the environment where I'd expect lots of
devices), how many users of this infrastructure do we typically have?
DASD do not use adapter interrupts, and QDIO devices (qeth/zfcp) use
their own indicator handling (that pre-dates this infrastructure) IIRC,
which basically only leaves the PCI functions you mention above.

> 
> >   
> > > 
> > > What speaks against cio_dma_pool is that it is slightly more code, and
> > > this currently can not be used for very early stuff, which I don't
> > > think is relevant.   
> > 
> > Unless properly documented, it feels like something you can easily trip
> > over, however.
> > 
> > I assume that the "very early stuff" is basically only ccw consoles.
> > Not sure if we can use virtio-serial as an early ccw console -- IIRC
> > that was only about 3215/3270? While QEMU guests are able to use a 3270
> > console, this is experimental and I would not count that as a use case.
> > Anyway, 3215/3270 don't use adapter interrupts, and probably not
> > anything cross-device, either; so unless early virtio-serial is a
> > thing, this restriction is fine if properly documented.
> >   
> 
> Mimu can you dig into this a bit?
> 
> We could also aim for getting rid of this limitation. One idea would be
> some sort of lazy initialization (i.e. triggered by first usage).
> Another idea would be simply doing the initialization earlier.
> Unfortunately I'm not all that familiar with the early stuff. Is there
> some doc you can recommend?

I'd probably look at the code for that.

> 
> Anyway before investing much more into this, I think we should have a
> fair idea which option do we prefer...

Agreed.

> 
> > > What also used to speak against it is that
> > > allocations asking for more than a page would just fail, but I addressed
> > > that in the patch I've hacked up on top of the series, and I'm going to
> > > paste below. While at it I addressed some other issues as well.  
> > 
> > Hm, which "other issues"?
> >   
> 
> The kfree() I've forgotten to get rid of, and this 'document does not
> work early' (pun intended) business.

Ok.

> 
> > > 
> > > I've also got code that deals with AIRQ_IV_CACHELINE by turning the
> > > kmem_cache into a dma_pool.  
> > 
> > Isn't that percolating to other airq users again? Or maybe I just don't
> > understand what you're proposing here...  
> 
> Absolutely.
> 
> >   
> > > 
> > > Cornelia, Sebastian which approach do you prefer:
> > > 1) get rid of cio_dma_pool and AIRQ_IV_CACHELINE, and waste a page per
> > > vector, or
> > > 2) go with the approach taken by the patch below?  
> > 
> > I'm not sure that I properly understand this (yeah, you probably
> > guessed); so I'm not sure I can make a good call here.
> >   
> 
> I hope I managed to clarify some of the questions. Please keep
> asking if stuff remains unclear. I'm not a great communicator, but a
> quite tenacious one.
> 
> I hope Sebastian will chime in as well.

Yes, waiting for his comments as well :)

> 
> > > 
> > > 
> > > Regards,
> > > Halil
> > > -----------------------8<---------------------------------------------
> > > From: Halil Pasic <pasic@linux.ibm.com>
> > > Date: Sun, 12 May 2019 18:08:05 +0200
> > > Subject: [PATCH] WIP: use cio dma pool for airqs
> > > 
> > > Let's not waste a DMA page per adapter interrupt bit vector.
> > > ---
> > > Lightly tested...
> > > ---
> > >  arch/s390/include/asm/airq.h |  1 -
> > >  drivers/s390/cio/airq.c      | 10 +++-------
> > >  drivers/s390/cio/css.c       | 18 +++++++++++++++---
> > >  3 files changed, 18 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
> > > index 1492d48..981a3eb 100644
> > > --- a/arch/s390/include/asm/airq.h
> > > +++ b/arch/s390/include/asm/airq.h
> > > @@ -30,7 +30,6 @@ void unregister_adapter_interrupt(struct airq_struct *airq);
> > >  /* Adapter interrupt bit vector */
> > >  struct airq_iv {
> > >  	unsigned long *vector;	/* Adapter interrupt bit vector */
> > > -	dma_addr_t vector_dma; /* Adapter interrupt bit vector dma */
> > >  	unsigned long *avail;	/* Allocation bit mask for the bit vector */
> > >  	unsigned long *bitlock;	/* Lock bit mask for the bit vector */
> > >  	unsigned long *ptr;	/* Pointer associated with each bit */
* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
@ 2019-05-16  6:13                     ` Cornelia Huck
  0 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-16  6:13 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Thomas Huth, Claudio Imbrenda,
	Sebastian Ott, kvm, Michael S. Tsirkin, Farhan Ali, Eric Farman,
	virtualization, Christoph Hellwig, Martin Schwidefsky,
	Michael Mueller, Viktor Mihajlovski, Janosch Frank

On Wed, 15 May 2019 19:12:57 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Mon, 13 May 2019 15:29:24 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Sun, 12 May 2019 20:22:56 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Fri, 10 May 2019 16:10:13 +0200
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >   
> > > > On Fri, 10 May 2019 00:11:12 +0200
> > > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > > >     
> > > > > On Thu, 9 May 2019 12:11:06 +0200
> > > > > Cornelia Huck <cohuck@redhat.com> wrote:
> > > > >     
> > > > > > On Wed, 8 May 2019 23:22:10 +0200
> > > > > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > > > > >       
> > > > > > > On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> > > > > > > Sebastian Ott <sebott@linux.ibm.com> wrote:      
> > > > > >       
> > > > > > > > > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> > > > > > > > >  		unregister_reboot_notifier(&css_reboot_notifier);
> > > > > > > > >  		goto out_unregister;
> > > > > > > > >  	}
> > > > > > > > > +	cio_dma_pool_init();          
> > > > > > > > 
> > > > > > > > This is too late for early devices (ccw console!).        
> > > > > > > 
> > > > > > > You have already raised concern about this last time (thanks). I think,
> > > > > > > I've addressed this issue: the cio_dma_pool is only used by the airq
> > > > > > > stuff. I don't think the ccw console needs it. Please have an other look
> > > > > > > at patch #6, and explain your concern in more detail if it persists.      
> > > > > > 
> > > > > > What about changing the naming/adding comments here, so that (1) folks
> > > > > > aren't confused by the same thing in the future and (2) folks don't try
> > > > > > to use that pool for something needed for the early ccw consoles?
> > > > > >       
> > > > > 
> > > > > I'm all for clarity! Suggestions for better names?    
> > > > 
> > > > css_aiv_dma_pool, maybe? Or is there other cross-device stuff that may
> > > > need it?
> > > >     
> > > 
> > > Ouch! I was considering using cio_dma_zalloc() for the adapter
> > > interruption vectors but ended up between two chairs in the end.
> > > So with this series there are no users of the cio_dma_pool.
> > > 
> > > I don't feel strongly about this going one way or the other.
> > > 
> > > What speaks against getting rid of the cio_dma_pool and sticking
> > > with dma_alloc_coherent() is that we waste a DMA page per vector,
> > > which is a non-obvious side effect.
> > 
> > That would basically mean one DMA page per virtio-ccw device, right?  
> 
> Not quite: virtio-ccw shares airq vectors among multiple devices. We
> alloc 64 bytes a time and use that as long as we don't run out of bits.
>  
> > For single queue devices, this seems like quite a bit of overhead.
> >  
> 
> Nod.
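As a back-of-envelope illustration of the overhead under discussion (the numbers below are taken from this thread, not from the tree: 64-byte vectors, 4K DMA pages; all toy_* names are invented):

```c
/* Back-of-envelope model of the overhead discussed above.  Adapter
 * interrupt vectors are allocated 64 bytes (512 bits) at a time, and
 * DMA pages are 4K. */

#define TOY_VECTOR_BYTES 64UL
#define TOY_PAGE_SIZE 4096UL

/* One whole DMA page per vector, as with plain dma_alloc_coherent(). */
unsigned long toy_pages_without_pool(unsigned long vectors)
{
	return vectors;
}

/* Vectors packed into shared pages, as with a dma pool. */
unsigned long toy_pages_with_pool(unsigned long vectors)
{
	unsigned long per_page = TOY_PAGE_SIZE / TOY_VECTOR_BYTES; /* 64 */

	return (vectors + per_page - 1) / per_page;
}
```

So 64 vectors cost 64 DMA pages without pooling, but fit into a single page with it.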
>  
> > Are we expecting many devices in use per guest?  
> 
> This affects linux in general, not only guest 2 stuff (i.e. we
> also waste in vanilla lpar mode). And zpci seems to do at least one
> airq_iv_create() per pci function.

Hm, I thought that guests with virtio-ccw devices were the most heavy
user of this.

On LPAR (which would be the environment where I'd expect lots of
devices), how many users of this infrastructure do we typically have?
DASD do not use adapter interrupts, and QDIO devices (qeth/zfcp) use
their own indicator handling (that pre-dates this infrastructure) IIRC,
which basically only leaves the PCI functions you mention above.

> 
> >   
> > > 
> > > What speaks against cio_dma_pool is that it is slightly more code, and
> > > this currently can not be used for very early stuff, which I don't
> > > think is relevant.   
> > 
> > Unless properly documented, it feels like something you can easily trip
> > over, however.
> > 
> > I assume that the "very early stuff" is basically only ccw consoles.
> > Not sure if we can use virtio-serial as an early ccw console -- IIRC
> > that was only about 3215/3270? While QEMU guests are able to use a 3270
> > console, this is experimental and I would not count that as a use case.
> > Anyway, 3215/3270 don't use adapter interrupts, and probably not
> > anything cross-device, either; so unless early virtio-serial is a
> > thing, this restriction is fine if properly documented.
> >   
> 
> Mimu can you dig into this a bit?
> 
> We could also aim for getting rid of this limitation. One idea would be
> some sort of lazy initialization (i.e. triggered by first usage).
> Another idea would be simply doing the initialization earlier.
> Unfortunately I'm not all that familiar with the early stuff. Is there
> some doc you can recommend?

I'd probably look at the code for that.

> 
> Anyway before investing much more into this, I think we should have a
> fair idea which option do we prefer...

Agreed.

> 
> > > What also used to speak against it is that
> > > allocations asking for more than a page would just fail, but I addressed
> > > that in the patch I've hacked up on top of the series, and I'm going to
> > > paste below. While at it I addressed some other issues as well.  
> > 
> > Hm, which "other issues"?
> >   
> 
> The kfree() I've forgotten to get rid of, and this 'document does not
> work early' (pun intended) business.

Ok.

> 
> > > 
> > > I've also got code that deals with AIRQ_IV_CACHELINE by turning the
> > > kmem_cache into a dma_pool.  
> > 
> > Isn't that percolating to other airq users again? Or maybe I just don't
> > understand what you're proposing here...  
> 
> Absolutely.
> 
> >   
> > > 
> > > Cornelia, Sebastian which approach do you prefer:
> > > 1) get rid of cio_dma_pool and AIRQ_IV_CACHELINE, and waste a page per
> > > vector, or
> > > 2) go with the approach taken by the patch below?  
> > 
> > I'm not sure that I properly understand this (yeah, you probably
> > guessed); so I'm not sure I can make a good call here.
> >   
> 
> I hope I managed to clarify some of the questions. Please keep
> asking if stuff remains unclear. I'm not a great communicator, but
> quite a tenacious one.
> 
> I hope Sebastian will chime in as well.

Yes, waiting for his comments as well :)

> 
> > > 
> > > 
> > > Regards,
> > > Halil
> > > -----------------------8<---------------------------------------------
> > > From: Halil Pasic <pasic@linux.ibm.com>
> > > Date: Sun, 12 May 2019 18:08:05 +0200
> > > Subject: [PATCH] WIP: use cio dma pool for airqs
> > > 
> > > Let's not waste a DMA page per adapter interrupt bit vector.
> > > ---
> > > Lightly tested...
> > > ---
> > >  arch/s390/include/asm/airq.h |  1 -
> > >  drivers/s390/cio/airq.c      | 10 +++-------
> > >  drivers/s390/cio/css.c       | 18 +++++++++++++++---
> > >  3 files changed, 18 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
> > > index 1492d48..981a3eb 100644
> > > --- a/arch/s390/include/asm/airq.h
> > > +++ b/arch/s390/include/asm/airq.h
> > > @@ -30,7 +30,6 @@ void unregister_adapter_interrupt(struct airq_struct *airq);
> > >  /* Adapter interrupt bit vector */
> > >  struct airq_iv {
> > >  	unsigned long *vector;	/* Adapter interrupt bit vector */
> > > -	dma_addr_t vector_dma; /* Adapter interrupt bit vector dma */
> > >  	unsigned long *avail;	/* Allocation bit mask for the bit vector */
> > >  	unsigned long *bitlock;	/* Lock bit mask for the bit vector */
> > >  	unsigned long *ptr;	/* Pointer associated with each bit */
> > > diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
> > > index 7a5c0a0..f11f437 100644
> > > --- a/drivers/s390/cio/airq.c
> > > +++ b/drivers/s390/cio/airq.c
> > > @@ -136,8 +136,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
> > >  		goto out;
> > >  	iv->bits = bits;
> > >  	size = iv_size(bits);
> > > -	iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size,
> > > -						 &iv->vector_dma, GFP_KERNEL);
> > > +	iv->vector = cio_dma_zalloc(size);
> > >  	if (!iv->vector)
> > >  		goto out_free;
> > >  	if (flags & AIRQ_IV_ALLOC) {
> > > @@ -172,8 +171,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
> > >  	kfree(iv->ptr);
> > >  	kfree(iv->bitlock);
> > >  	kfree(iv->avail);
> > > -	dma_free_coherent(cio_get_dma_css_dev(), size, iv->vector,
> > > -			  iv->vector_dma);
> > > +	cio_dma_free(iv->vector, size);
> > >  	kfree(iv);
> > >  out:
> > >  	return NULL;
> > > @@ -189,9 +187,7 @@ void airq_iv_release(struct airq_iv *iv)
> > >  	kfree(iv->data);
> > >  	kfree(iv->ptr);
> > >  	kfree(iv->bitlock);
> > > -	kfree(iv->vector);
> > > -	dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits),
> > > -			  iv->vector, iv->vector_dma);
> > > +	cio_dma_free(iv->vector, iv_size(iv->bits));
> > >  	kfree(iv->avail);
> > >  	kfree(iv);
> > >  }
> > > diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
> > > index 7087cc3..88d9c92 100644
> > > --- a/drivers/s390/cio/css.c
> > > +++ b/drivers/s390/cio/css.c
> > > @@ -1063,7 +1063,10 @@ struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages)
> > >  static void __gp_dma_free_dma(struct gen_pool *pool,
> > >  			      struct gen_pool_chunk *chunk, void *data)
> > >  {
> > > -	dma_free_coherent((struct device *) data, PAGE_SIZE,
> > > +
> > > +	size_t chunk_size = chunk->end_addr - chunk->start_addr + 1;
> > > +
> > > +	dma_free_coherent((struct device *) data, chunk_size,
> > >  			 (void *) chunk->start_addr,
> > >  			 (dma_addr_t) chunk->phys_addr);
> > >  }
> > > @@ -1088,13 +1091,15 @@ void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
> > >  {
> > >  	dma_addr_t dma_addr;
> > >  	unsigned long addr = gen_pool_alloc(gp_dma, size);
> > > +	size_t chunk_size;
> > >  
> > >  	if (!addr) {
> > > +		chunk_size = round_up(size, PAGE_SIZE);  
> > 
> > Doesn't that mean that we still go up to chunks of at least PAGE_SIZE?
> > Or can vectors now share the same chunk?  
> 
> Exactly! We put the allocated dma mem into the genpool. So if the next
> request can be served from what is already in the genpool we don't end
> up in this fallback path where we grow the pool. 

Ok, that makes sense.
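To spell out the fallback path we just agreed on, here is a userspace toy model (all toy_* names are invented; the real code uses gen_pool_alloc(), gen_pool_add_virt() and dma_alloc_coherent()): serve the request from the pool if possible; on a miss, allocate a page-rounded chunk, add it to the pool, and allocate again.

```c
#include <stdlib.h>

#define TOY_PAGE_SIZE 4096UL
#define TOY_MAX_CHUNKS 8

struct toy_chunk {
	char *mem;
	size_t size;
	size_t used;
};

struct toy_pool {
	struct toy_chunk chunks[TOY_MAX_CHUNKS];
	int nr;
};

static size_t toy_round_up(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/* First-fit bump allocation from the existing chunks. */
static void *toy_pool_alloc(struct toy_pool *p, size_t size)
{
	int i;

	for (i = 0; i < p->nr; i++) {
		struct toy_chunk *c = &p->chunks[i];

		if (c->used + size <= c->size) {
			void *ret = c->mem + c->used;

			c->used += size;
			return ret;
		}
	}
	return NULL;
}

/* Grow-on-miss wrapper: two small vectors end up sharing one chunk
 * instead of burning a page each. */
static void *toy_zalloc(struct toy_pool *p, size_t size)
{
	void *addr = toy_pool_alloc(p, size);

	if (!addr && p->nr < TOY_MAX_CHUNKS) {
		size_t chunk_size = toy_round_up(size, TOY_PAGE_SIZE);
		char *mem = calloc(1, chunk_size);

		if (!mem)
			return NULL;
		p->chunks[p->nr].mem = mem;
		p->chunks[p->nr].size = chunk_size;
		p->chunks[p->nr].used = 0;
		p->nr++;
		addr = toy_pool_alloc(p, size);
	}
	return addr;
}
```

Two 64-byte requests land in the same chunk; only a request bigger than the remaining pool space triggers the fallback path again.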

> 
> >   
> > >  		addr = (unsigned long) dma_alloc_coherent(dma_dev,
> > > -					PAGE_SIZE, &dma_addr, CIO_DMA_GFP);
> > > +					 chunk_size, &dma_addr, CIO_DMA_GFP);
> > >  		if (!addr)
> > >  			return NULL;
> > > -		gen_pool_add_virt(gp_dma, addr, dma_addr, PAGE_SIZE, -1);
> > > +		gen_pool_add_virt(gp_dma, addr, dma_addr, chunk_size, -1);
> > >  		addr = gen_pool_alloc(gp_dma, size);  
> 
> BTW I think it would be good to recover from this alloc failing due to a
> race (quite unlikely with small allocations though...).

The callers hopefully check the result?
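A sketch of the suggested recovery, for concreteness: if the pool allocation fails even after growing the pool (another CPU raced us and consumed the fresh chunk), grow and retry a bounded number of times rather than give up immediately. Here toy_try_alloc() stands in for "grow the pool and call gen_pool_alloc() again", and toy_fail_budget simulates the racing CPU; nothing below is from the actual patch.

```c
#include <stddef.h>

static int toy_fail_budget;

/* Stand-in for "grow pool, then gen_pool_alloc()"; fails while the
 * injected failure budget lasts, to simulate losing the race. */
static void *toy_try_alloc(size_t size)
{
	static char backing[4096];

	if (toy_fail_budget > 0) {
		toy_fail_budget--;	/* simulated racy failure */
		return NULL;
	}
	return size <= sizeof(backing) ? (void *)backing : NULL;
}

/* Bounded retry instead of returning NULL on the first miss. */
static void *toy_alloc_with_retry(size_t size, int max_tries)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		void *addr = toy_try_alloc(size);

		if (addr)
			return addr;
	}
	return NULL;
}
```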

> 
> > >  	}
> > >  	return (void *) addr;
> > > @@ -1108,6 +1113,13 @@ void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size)
> > >  	gen_pool_free(gp_dma, (unsigned long) cpu_addr, size);
> > >  }
> > >  
> > > +/**
> > > + * Allocate dma memory from the css global pool. Intended for memory not
> > > + * specific to any single device within the css.
> > > + *
> > > + * Caution: Not suitable for early stuff like console.  
> > 
> > Maybe add "Do not use prior to <point in startup>"?
> >   
> 
> I'm not awfully familiar with the well known 'points in startup'. Can
> you recommend some documentation on this topic?

The code O:-) Anyway, that's where I'd start reading; there might be
good stuff under Documentation/ that I've never looked at...

> 
> 
> Regards,
> Halil
> 
> > > + *
> > > + */
> > >  void *cio_dma_zalloc(size_t size)
> > >  {
> > >  	return cio_gp_dma_zalloc(cio_dma_pool, cio_get_dma_css_dev(), size);  
> >   
> 

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-15 20:51       ` Halil Pasic
@ 2019-05-16  6:29         ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-16  6:29 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 15 May 2019 22:51:58 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Mon, 13 May 2019 11:41:36 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri, 26 Apr 2019 20:32:41 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > As virtio-ccw devices are channel devices, we need to use the dma area
> > > for any communication with the hypervisor.
> > > 
> > > This patch addresses the most basic stuff (mostly what is required for
> > > virtio-ccw), and does take care of QDIO or any devices.  
> > 
> > "does not take care of QDIO", surely?   
> 
> I did not bother making the QDIO library code use dma memory for
> anything that is conceptually dma memory. AFAIK QDIO is out of scope for
> prot virt for now. If one were to do some emulated qdio with prot virt
> guests, one would need to make a bunch of things shared.

And unless you wanted to support protected virt under z/VM as well, it
would be wasted effort :)

> 
> > (Also, what does "any devices"
> > mean? Do you mean "every arbitrary device", perhaps?)  
> 
> What I mean is: this patch takes care of the core stuff, but any
> particular device is likely to have to do more -- that is, not all
> cio devices support prot virt with this patch. For example
> virtio-ccw needs to make sure that the ccws constituting the channel
> programs, as well as the data pointed to by the ccws, is shared. If one
> would want to make vfio-ccw DASD pass-through work under prot virt, one
> would need to make sure that everything that needs to be shared is
> shared (data buffers, channel programs).
> 
> Does this clarify things?

That's what I thought, but the sentence was confusing :) What about

"This patch addresses the most basic stuff (mostly what is required to
support virtio-ccw devices). It handles neither QDIO devices, nor
arbitrary non-virtio-ccw devices." ?

> 
> >   
> > > 
> > > An interesting side effect is that virtio structures are now going to
> > > get allocated in 31 bit addressable storage.  
> > 
> > Hm...
> >   
> > > 
> > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > > ---
> > >  arch/s390/include/asm/ccwdev.h   |  4 +++
> > >  drivers/s390/cio/ccwreq.c        |  8 ++---
> > >  drivers/s390/cio/device.c        | 65 +++++++++++++++++++++++++++++++++-------
> > >  drivers/s390/cio/device_fsm.c    | 40 ++++++++++++-------------
> > >  drivers/s390/cio/device_id.c     | 18 +++++------
> > >  drivers/s390/cio/device_ops.c    | 21 +++++++++++--
> > >  drivers/s390/cio/device_pgid.c   | 20 ++++++-------
> > >  drivers/s390/cio/device_status.c | 24 +++++++--------
> > >  drivers/s390/cio/io_sch.h        | 21 +++++++++----
> > >  drivers/s390/virtio/virtio_ccw.c | 10 -------
> > >  10 files changed, 148 insertions(+), 83 deletions(-)  
> > 
> > (...)
> >   
> > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > > index 6d989c360f38..bb7a92316fc8 100644
> > > --- a/drivers/s390/virtio/virtio_ccw.c
> > > +++ b/drivers/s390/virtio/virtio_ccw.c
> > > @@ -66,7 +66,6 @@ struct virtio_ccw_device {
> > >  	bool device_lost;
> > >  	unsigned int config_ready;
> > >  	void *airq_info;
> > > -	u64 dma_mask;
> > >  };
> > >  
> > >  struct vq_info_block_legacy {
> > > @@ -1255,16 +1254,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> > >  		ret = -ENOMEM;
> > >  		goto out_free;
> > >  	}
> > > -
> > >  	vcdev->vdev.dev.parent = &cdev->dev;
> > > -	cdev->dev.dma_mask = &vcdev->dma_mask;
> > > -	/* we are fine with common virtio infrastructure using 64 bit DMA */
> > > -	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
> > > -	if (ret) {
> > > -		dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n");
> > > -		goto out_free;
> > > -	}  
> > 
> > This means that vring structures now need to fit into 31 bits as well,
> > I think?  
> 
> Nod.
> 
> > Is there any way to reserve the 31 bit restriction for channel
> > subsystem structures and keep vring in the full 64 bit range? (Or am I
> > fundamentally misunderstanding something?)
> >   
> 
> At the root of this problem is that the DMA API basically says devices
> may have addressing limitations expressed by the dma_mask, while our
> addressing limitations are not coming from the device but from the IO
> arch: e.g. orb.cpa and ccw.cda are 31 bit addresses. In our case it
> depends on how and for what the device is going to use the memory (e.g.
> buffers addressed by MIDA vs IDA vs direct).
> 
> Virtio uses the DMA properties of the parent, that is in our case the
> struct device embedded in struct ccw_device.
> 
> The previous version (RFC) used to allocate all the cio DMA stuff from
> this global cio_dma_pool using the css0.dev for the DMA API
> interactions. And we set *css0.dev.dma_mask == DMA_BIT_MASK(31) so
> e.g. the allocated ccws are 31 bit addressable.
> 
> But I was asked to change this so that when I allocate DMA memory for a
> channel program of particular ccw device, a struct device of that ccw
> device is used as the first argument of dma_alloc_coherent().
> 
> Considering
> 
> void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
>                 gfp_t flag, unsigned long attrs)
> {
>         const struct dma_map_ops *ops = get_dma_ops(dev);
>         void *cpu_addr;
> 
>         WARN_ON_ONCE(dev && !dev->coherent_dma_mask);
> 
>         if (dma_alloc_from_dev_coherent(dev, size, dma_handle, &cpu_addr))
>                 return cpu_addr;
> 
>         /* let the implementation decide on the zone to allocate from: */
>         flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
> 
> that is, the GFP zone flags get dropped, which implies that we really
> want cdev->dev restricted to 31 bit addressable memory, because we
> can't tell the allocator (with the current common DMA code): this
> piece of DMA mem you are about to allocate for me must be 31 bit
> addressable (as we usually would using GFP_DMA).
> 
> So, as described in the commit message, the vring stuff being forced
> into ZONE_DMA is an unfortunate consequence of this all.

Yeah. I hope 31 bits are enough for that as well.
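The crux is the `flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);` line quoted above: any zone hint the caller passes is discarded before the allocation, so the only zone restriction left is the one expressed by the device's dma mask. In isolation (with made-up flag values, not the real ones from the kernel's gfp headers):

```c
/* Illustrative flag values only, chosen for this sketch. */
#define TOY_GFP_DMA     0x1u
#define TOY_GFP_DMA32   0x2u
#define TOY_GFP_HIGHMEM 0x4u
#define TOY_GFP_KERNEL  0x8u

/* Mirrors the quoted line: zone modifiers from the caller are
 * stripped, so a GFP_DMA hint from a driver never reaches the
 * allocator. */
unsigned int toy_sanitize_dma_flags(unsigned int flags)
{
	return flags & ~(TOY_GFP_DMA | TOY_GFP_DMA32 | TOY_GFP_HIGHMEM);
}
```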

> 
> A side note: making the subchannel device 'own' the DMA stuff of a ccw
> device (something that was discussed in the RFC thread) is tricky
> because the ccw device may outlive the subchannel (all that orphan
> stuff).

Yes, that's... eww. Not really a problem for virtio-ccw devices (which
do not support the disconnected state), but can we make DMA and the
subchannel moving play nice with each other at all?

> 
> So the answer is: it is technically possible (e.g. see RFC) but it comes
> at a price, and I see no obviously brilliant solution.
> 
> Regards,
> Halil
> 
> > > -
> > >  	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
> > >  				   GFP_DMA | GFP_KERNEL);
> > >  	if (!vcdev->config_block) {  
> >   
> 

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-15 21:08         ` Halil Pasic
@ 2019-05-16  6:32           ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-16  6:32 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Jason J. Herne, kvm, linux-s390, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On Wed, 15 May 2019 23:08:17 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 14 May 2019 10:47:34 -0400
> "Jason J. Herne" <jjherne@linux.ibm.com> wrote:

> > Are we 
> > worried that virtio data structures are going to be a burden on the 31-bit address space?
> > 
> >   
> 
> That is a good question I can not answer. Since it is currently at least
> a page per queue (because we use dma direct, right Mimu?), I am concerned
> about this.
> 
> Connie, what is your opinion?

Yes, running into problems there was one of my motivations for my
question. I guess it depends on the number of devices and how many
queues they use. The problem is that it affects not only protected virt
guests, but all guests.

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-16  6:32           ` Cornelia Huck
@ 2019-05-16 13:42             ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-16 13:42 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Jason J. Herne, kvm, linux-s390, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman, Michael Mueller

On Thu, 16 May 2019 08:32:28 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Wed, 15 May 2019 23:08:17 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Tue, 14 May 2019 10:47:34 -0400
> > "Jason J. Herne" <jjherne@linux.ibm.com> wrote:
> 
> > > Are we 
> > > worried that virtio data structures are going to be a burden on the 31-bit address space?
> > > 
> > >   
> > 
> > That is a good question I can not answer. Since it is currently at least
> > a page per queue (because we use dma direct, right Mimu?), I am concerned
> > about this.
> > 
> > Connie, what is your opinion?
> 
> Yes, running into problems there was one of my motivations for my
> question. I guess it depends on the number of devices and how many
> queues they use. The problem is that it affects not only protected virt
> guests, but all guests.
> 

Unless things are about to change, only devices that have
VIRTIO_F_IOMMU_PLATFORM are affected. So it does not necessarily affect
non-protected-virt guests. (With prot virt we have to use
VIRTIO_F_IOMMU_PLATFORM.)

If it were not like this, I would be much more worried.

@Mimu: Could you please discuss this problem with the team? It might be
worth considering going back to the design of the RFC (i.e. cio/ccw stuff
allocated from a common cio dma pool, which gives you 31 bit addressable
memory, and a 64 bit dma mask for the ccw device of a virtio device).

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-16 13:42             ` Halil Pasic
@ 2019-05-16 13:54               ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-16 13:54 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Jason J. Herne, kvm, linux-s390, Martin Schwidefsky,
	Sebastian Ott, virtualization, Michael S. Tsirkin,
	Christoph Hellwig, Thomas Huth, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman, Michael Mueller

On Thu, 16 May 2019 15:42:45 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Thu, 16 May 2019 08:32:28 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Wed, 15 May 2019 23:08:17 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Tue, 14 May 2019 10:47:34 -0400
> > > "Jason J. Herne" <jjherne@linux.ibm.com> wrote:  
> >   
> > > > Are we 
> > > > worried that virtio data structures are going to be a burden on the 31-bit address space?
> > > > 
> > > >     
> > > 
> > > That is a good question I can not answer. Since it is currently at least
> > > a page per queue (because we use dma direct, right Mimu?), I am concerned
> > > about this.
> > > 
> > > Connie, what is your opinion?  
> > 
> > Yes, running into problems there was one of my motivations for my
> > question. I guess it depends on the number of devices and how many
> > queues they use. The problem is that it affects not only protected virt
> > guests, but all guests.
> >   
> 
> Unless things are about to change only devices that have
> VIRTIO_F_IOMMU_PLATFORM are affected. So it does not necessarily affect
> not protected virt guests. (With prot virt we have to use
> VIRTIO_F_IOMMU_PLATFORM.)
> 
> If it were not like this, I would be much more worried.

If we go forward with this approach, documenting this side effect of
VIRTIO_F_IOMMU_PLATFORM is something that needs to happen.

> 
> @Mimu: Could you please discuss this problem with the team? It might be
> worth considering to go back to the design of the RFC (i.e. cio/ccw stuff
> allocated from a common cio dma pool which gives you 31 bit addressable
> memory, and 64 bit dma mask for a ccw device of a virtio device).
> 
> Regards,
> Halil
> 

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-12 18:22               ` Halil Pasic
@ 2019-05-16 13:59                 ` Sebastian Ott
  -1 siblings, 0 replies; 182+ messages in thread
From: Sebastian Ott @ 2019-05-16 13:59 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Cornelia Huck, kvm, linux-s390, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller

On Sun, 12 May 2019, Halil Pasic wrote:
> I've also got code that deals with AIRQ_IV_CACHELINE by turning the
> kmem_cache into a dma_pool.
> 
> Cornelia, Sebastian which approach do you prefer:
> 1) get rid of cio_dma_pool and AIRQ_IV_CACHELINE, and waste a page per
> vector, or
> 2) go with the approach taken by the patch below?

We only have a couple of users for airq_iv:

virtio_ccw.c: 2K bits

pci with floating IRQs: <= 2K (for the per-function bit vectors)
                        1..4K (for the summary bit vector)

pci with CPU directed IRQs: 2K (for the per-CPU bit vectors)
                            1..nr_cpu (for the summary bit vector)


The options are:
* page allocations for everything
* dma_pool for AIRQ_IV_CACHELINE, gen_pool for others
* dma_pool for everything

I think we should do option 3 and use a dma_pool with cache line size
alignment for everything (as a prerequisite we have to limit
config PCI_NR_FUNCTIONS to 2K - but that is not a real constraint).

Sebastian

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 08/10] virtio/s390: add indirection to indicators access
  2019-05-13 10:15                 ` Cornelia Huck
@ 2019-05-16 15:24                   ` Pierre Morel
  -1 siblings, 0 replies; 182+ messages in thread
From: Pierre Morel @ 2019-05-16 15:24 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Halil Pasic, kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On 13/05/2019 12:15, Cornelia Huck wrote:
> On Fri, 10 May 2019 17:36:05 +0200
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> On 10/05/2019 13:54, Halil Pasic wrote:
>>> On Fri, 10 May 2019 09:43:08 +0200
>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>    
>>>> On 09/05/2019 20:26, Halil Pasic wrote:
>>>>> On Thu, 9 May 2019 14:01:01 +0200
>>>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>>>   
>>>>>> On 08/05/2019 16:31, Pierre Morel wrote:
>>>>>>> On 26/04/2019 20:32, Halil Pasic wrote:
>>>>>>>> This will come in handy soon when we pull out the indicators from
>>>>>>>> virtio_ccw_device to a memory area that is shared with the hypervisor
>>>>>>>> (in particular for protected virtualization guests).
>>>>>>>>
>>>>>>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>>>>>>>> ---
>>>>>>>>      drivers/s390/virtio/virtio_ccw.c | 40
>>>>>>>> +++++++++++++++++++++++++---------------
>>>>>>>>      1 file changed, 25 insertions(+), 15 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/s390/virtio/virtio_ccw.c
>>>>>>>> b/drivers/s390/virtio/virtio_ccw.c
>>>>>>>> index bb7a92316fc8..1f3e7d56924f 100644
>>>>>>>> --- a/drivers/s390/virtio/virtio_ccw.c
>>>>>>>> +++ b/drivers/s390/virtio/virtio_ccw.c
>>>>>>>> @@ -68,6 +68,16 @@ struct virtio_ccw_device {
>>>>>>>>          void *airq_info;
>>>>>>>>      };
>>>>>>>> +static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
>>>>>>>> +{
>>>>>>>> +    return &vcdev->indicators;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static inline unsigned long *indicators2(struct virtio_ccw_device
>>>>>>>> *vcdev)
>>>>>>>> +{
>>>>>>>> +    return &vcdev->indicators2;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>      struct vq_info_block_legacy {
>>>>>>>>          __u64 queue;
>>>>>>>>          __u32 align;
>>>>>>>> @@ -337,17 +347,17 @@ static void virtio_ccw_drop_indicator(struct
>>>>>>>> virtio_ccw_device *vcdev,
>>>>>>>>              ccw->cda = (__u32)(unsigned long) thinint_area;
>>>>>>>>          } else {
>>>>>>>>              /* payload is the address of the indicators */
>>>>>>>> -        indicatorp = kmalloc(sizeof(&vcdev->indicators),
>>>>>>>> +        indicatorp = kmalloc(sizeof(indicators(vcdev)),
>>>>>>>>                           GFP_DMA | GFP_KERNEL);
>>>>>>>>              if (!indicatorp)
>>>>>>>>                  return;
>>>>>>>>              *indicatorp = 0;
>>>>>>>>              ccw->cmd_code = CCW_CMD_SET_IND;
>>>>>>>> -        ccw->count = sizeof(&vcdev->indicators);
>>>>>>>> +        ccw->count = sizeof(indicators(vcdev));
>>>>>>>
>>>>>>> This looks strange to me. Was already weird before.
>>>>>>> Lucky for us the indicators are long...
>>>>>>> may be just sizeof(long)
>>>>>>   
>>>>>
>>>>> I'm not sure I understand where are you coming from...
>>>>>
>>>>> With CCW_CMD_SET_IND we tell the hypervisor the guest physical address
>>>>> at which the so called classic indicators. There is a comment that
>>>>> makes this obvious. The argument of the sizeof was and remained a
>>>>> pointer type. AFAIU this is what bothers you.
>>>>>>
>>>>>> AFAIK the size of the indicators (AIV/AIS) is not restricted by the
>>>>>> architecture.
>>>>>
>>>>> The size of vcdev->indicators is restricted or defined by the virtio
>>>>> specification. Please have a look at '4.3.2.6.1 Setting Up Classic Queue
>>>>> Indicators' here:
>>>>> https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-1630002
>>>>>
>>>>> Since with Linux on s390 only 64 bit is supported, both the sizes are in
>>>>> line with the specification. Using u64 would semantically match the spec
>>>>> better, modulo pre virtio 1.0 which ain't specified. I did not want to
>>>>> do changes that are not necessary for what I'm trying to accomplish. If
>>>>> we want we can change these to u64 with a patch on top.
>>>>
>>>> I mean you are changing these line already, so why not doing it right
>>>> while at it?
>>>>   
>>>
>>> This patch is about adding the indirection so we can move the member
>>> painlessly. Mixing in different stuff would be a bad practice.
>>>
>>> BTW I just explained that it ain't wrong, so I really do not understand
>>> what do you mean by  'why not doing it right'. Can you please explain?
>>>    
>>
>> I did not want to discuss this at length and gave my R-b, meaning
>> that I am OK with this patch.
>>
>> But if you ask, yes I can, it seems quite obvious.
>> When you build a CCW you give the pointer to CCW->cda and you give the
>> size of the transfer in CCW->count.
>>
>> Here the count is initialized with the sizeof of the pointer used to
>> initialize CCW->cda with.
> 
> But the cda points to the pointer address, so the size of the pointer
> is actually the correct value here, isn't it?

Oh. Yes, it is correct.
What I do not like is the mixing of (unsigned long), (unsigned long *)
and &. If we had

cda = (__u32)(unsigned long) indicatorp
count = sizeof(*indicatorp)

I would have been completely happy.

It was just an unimportant thing, and I wouldn't have given an R-b if the
functionality was not correct.


> 
>> Luckily we work on a 64 bit machine with 64 bit pointers, and the
>> pointed-to object is 64 bits wide, so... the resulting count is right.
>> But it is not the correct way to do it.
> 
> I think it is, but this interface really is confusing.

Yes, it is what I thought we could do better.

> 
>> That is all. Not a big concern, you do not need to change it, as you
>> said it can be done in another patch.
>>
>>> Did you agree with the rest of my comment? I mean there was more to it.
>>>    
>>
>> I understood from your comments that the indicators in Linux are 64 bits
>> wide, so all OK.
>>
>> Regards
>> Pierre
>>
>>
>>
>>
>>
>>
> 


-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-16  6:29         ` Cornelia Huck
@ 2019-05-18 18:11           ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-18 18:11 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Thu, 16 May 2019 08:29:28 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Wed, 15 May 2019 22:51:58 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Mon, 13 May 2019 11:41:36 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > On Fri, 26 Apr 2019 20:32:41 +0200
> > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > >   
> > > > As virtio-ccw devices are channel devices, we need to use the dma area
> > > > for any communication with the hypervisor.
> > > > 
> > > > This patch addresses the most basic stuff (mostly what is required for
> > > > virtio-ccw), and does take care of QDIO or any devices.  
> > > 
> > > "does not take care of QDIO", surely?   
> > 
> > I did not bother making the QDIO library code use dma memory for
> > anything that is conceptually dma memory. AFAIK QDIO is out of scope for
> > prot virt for now. If one were to do some emulated qdio with prot virt
> > guests, one would need to make a bunch of things shared.
> 
> And unless you wanted to support protected virt under z/VM as well, it
> would be wasted effort :)
> 

:)

> > 
> > > (Also, what does "any devices"
> > > mean? Do you mean "every arbitrary device", perhaps?)  
> > 
> > What I mean is: this patch takes care of the core stuff, but any
> > particular device is likely to have to do more -- that is, not all
> > cio devices support prot virt with this patch alone. For example
> > virtio-ccw needs to make sure that the ccws constituting the channel
> > programs, as well as the data pointed by the ccws is shared. If one
> > would want to make vfio-ccw DASD pass-through work under prot virt, one
> > would need to make sure, that everything that needs to be shared is
> > shared (data buffers, channel programs).
> > 
> > Does this clarify things?
> 
> That's what I thought, but the sentence was confusing :) What about
> 
> "This patch addresses the most basic stuff (mostly what is required to
> support virtio-ccw devices). It handles neither QDIO devices, nor
> arbitrary non-virtio-ccw devices." ?
>

Don't like the second sentence. How about "It handles neither QDIO
in the common code, nor any device type specific stuff (like channel
programs constructed by the DASD driver)."

My problem is that this patch is about the common infrastructure code,
and virtio-ccw specific stuff is handled by subsequent patches of this
series.

[..]

> > > Is there any way to reserve the 31 bit restriction for channel
> > > subsystem structures and keep vring in the full 64 bit range? (Or
> > > am I fundamentally misunderstanding something?)
> > >   
> > 
> > At the root of this problem is that the DMA API basically says
> > devices may have addressing limitations expressed by the dma_mask,
> > while our addressing limitations are not coming from the device but
> > from the IO arch: e.g. orb.cpa and ccw.cda are 31 bit addresses. In
> > our case it depends on how and for what is the device going to use
> > the memory (e.g. buffers addressed by MIDA vs IDA vs direct).
> > 
> > Virtio uses the DMA properties of the parent, that is in our case the
> > struct device embedded in struct ccw_device.
> > 
> > The previous version (RFC) used to allocate all the cio DMA stuff
> > from this global cio_dma_pool using the css0.dev for the DMA API
> > interactions. And we set *css0.dev.dma_mask == DMA_BIT_MASK(31) so
> > e.g. the allocated ccws are 31 bit addressable.
> > 
> > But I was asked to change this so that when I allocate DMA memory
> > for a channel program of particular ccw device, a struct device of
> > that ccw device is used as the first argument of
> > dma_alloc_coherent().
> > 
> > Considering
> > 
> > void *dma_alloc_attrs(struct device *dev, size_t size,
> > 		dma_addr_t *dma_handle, gfp_t flag, unsigned long attrs)
> > {
> >         const struct dma_map_ops *ops = get_dma_ops(dev);
> >         void *cpu_addr;
> > 
> >         WARN_ON_ONCE(dev && !dev->coherent_dma_mask);
> > 
> >         if (dma_alloc_from_dev_coherent(dev, size, dma_handle, &cpu_addr))
> >                 return cpu_addr;
> > 
> >         /* let the implementation decide on the zone to allocate from: */
> >         flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
> > 
> > that is, the GFP flags are dropped, which implies that we really want
> > cdev->dev restricted to 31 bit addressable memory, because we can't
> > tell (with the current common DMA code) "hey, this piece of DMA
> > memory you are about to allocate for me must be 31 bit addressable"
> > (the way we usually do with GFP_DMA).
> > 
> > So, as described in the commit message, the vring stuff being forced
> > into ZONE_DMA is an unfortunate consequence of this all.
> 
> Yeah. I hope 31 bits are enough for that as well.
> 
> > 
> > A side note: making the subchannel device 'own' the DMA stuff of a
> > ccw device (something that was discussed in the RFC thread) is tricky
> > because the ccw device may outlive the subchannel (all that orphan
> > stuff).
> 
> Yes, that's... eww. Not really a problem for virtio-ccw devices (which
> do not support the disconnected state), but can we make DMA and the
> subchannel moving play nice with each other at all?
> 

I don't quite understand the question. This series does not have any
problems with that AFAIU. Can you please clarify?

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-18 18:11           ` Halil Pasic
@ 2019-05-20 10:21             ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-20 10:21 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Sat, 18 May 2019 20:11:00 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Thu, 16 May 2019 08:29:28 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Wed, 15 May 2019 22:51:58 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> Don't like the second sentence. How about "It handles neither QDIO
> in the common code, nor any device type specific stuff (like channel
> programs constructed by the DADS driver)."

Sounds good to me (with s/DADS/DASD/ :)

> > > A side note: making the subchannel device 'own' the DMA stuff of a
> > > ccw device (something that was discussed in the RFC thread) is tricky
> > > because the ccw device may outlive the subchannel (all that orphan
> > > stuff).  
> > 
> > Yes, that's... eww. Not really a problem for virtio-ccw devices (which
> > do not support the disconnected state), but can we make DMA and the
> > subchannel moving play nice with each other at all?
> >   
> 
> I don't quite understand the question. This series does not have any
> problems with that AFAIU. Can you please clarify?

Wait, weren't you saying that there actually is a problem?

We seem to have the following situation:
- the device per se is represented by the ccw device
- the subchannel is the means of communication, and dma is tied to the
  (I/O ?) subchannel
- the machine check handling code may move a ccw device to a different
  subchannel, or even to a fake subchannel (orphanage handling)

The moving won't happen with virtio-ccw devices (as they do not support
the disconnected state, which is a prereq for being moved around), but
at a glance, this looks like it is worth some more thought.

- Are all (I/O) subchannels using e.g. the same dma size? (TBH, that
  question sounds a bit silly: that should be a property belonging to
  the ccw device, shouldn't it?)
- What dma properties does the fake subchannel have? (Probably none, as
  its only purpose is to serve as a parent for otherwise parentless
  disconnected ccw devices, and is therefore not involved in any I/O.)
- There needs to be some kind of handling in the machine check code, I
  guess? We would probably need a different allocation if we end up at
  a different subchannel?

I think we can assume that the dma size is at most 31 bits (since that
is what the common I/O layer needs); but can we also assume that it
will always be at least 31 bits?

My take on this is that we should be sure that we're not digging
ourselves a hole that will be hard to get out of again should we want to
support non-virtio-ccw in the future, not that the current
implementation is necessarily broken.

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-16 13:59                 ` Sebastian Ott
@ 2019-05-20 12:13                   ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-20 12:13 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: Cornelia Huck, kvm, linux-s390, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller

On Thu, 16 May 2019 15:59:22 +0200 (CEST)
Sebastian Ott <sebott@linux.ibm.com> wrote:

> On Sun, 12 May 2019, Halil Pasic wrote:
> > I've also got code that deals with AIRQ_IV_CACHELINE by turning the
> > kmem_cache into a dma_pool.
> > 
> > Cornelia, Sebastian which approach do you prefer:
> > 1) get rid of cio_dma_pool and AIRQ_IV_CACHELINE, and waste a page per
> > vector, or
> > 2) go with the approach taken by the patch below?
> 
> We only have a couple of users for airq_iv:
> 
> virtio_ccw.c: 2K bits


You mean a single allocation is 2k bits (VIRTIO_IV_BITS = 256 * 8)? My
understanding is that the upper bound is more like:
MAX_AIRQ_AREAS * VIRTIO_IV_BITS = 20 * 256 * 8 = 40960 bits.

In practice it is most likely just 2k.

> 
> pci with floating IRQs: <= 2K (for the per-function bit vectors)
>                         1..4K (for the summary bit vector)
>

As far as I can tell with virtio_pci arch_setup_msi_irqs() gets called
once per device and allocates a small number of bits (2 and 3 in my
test, it may depend on #virtqueues, but I did not check).

So for an upper bound we would have to multiply with the upper bound of
pci devices/functions. What is the upper bound on the number of
functions?

> pci with CPU directed IRQs: 2K (for the per-CPU bit vectors)
>                             1..nr_cpu (for the summary bit vector)
> 

I guess this is the same.

> 
> The options are:
> * page allocations for everything

Worst case we need 20 + #max_pci_dev pages. At the moment we allocate
from ZONE_DMA (!) and waste a lot.

> * dma_pool for AIRQ_IV_CACHELINE ,gen_pool for others

I prefer this. Explanation follows.

> * dma_pool for everything
> 

Less waste by a factor of 16.

> I think we should do option 3 and use a dma_pool with cachesize
> alignment for everything (as a prerequisite we have to limit
> config PCI_NR_FUNCTIONS to 2K - but that is not a real constraint).
>

I prefer option 3 because it is conceptually the smallest change, and
provides the behavior which is closest to the current one.

Commit 414cbd1e3d14 ("s390/airq: provide cacheline aligned ivs",
Sebastian Ott, 2019-02-27) could have been smaller had you implemented
'kmem_cache for everything' (and I would have had just to replace
kmem_cache with dma_cache to achieve option 3). For some reason you
decided to keep the iv->vector = kzalloc(size, GFP_KERNEL) code path
and make the client code request
iv->vector = kmem_cache_zalloc(airq_iv_cache, GFP_KERNEL) explicitly,
using a flag which you only decided to use for directed pci irqs
AFAICT.

My understanding of these decisions, and especially of the rationale
behind commit 414cbd1e3d14, is limited. Thus if option 3 is the way to
go, and the choices made by 414cbd1e3d14 were sub-optimal, I would
feel much more comfortable if you provided a patch that revises and
switches everything to kmem_cache. I would then just swap kmem_cache
out with a dma_cache, and my change would end up a straightforward and
relatively clean one.

So Sebastian, what shall we do?

Regards,
Halil




> Sebastian
> 

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-20 10:21             ` Cornelia Huck
@ 2019-05-20 12:34               ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-20 12:34 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Mon, 20 May 2019 12:21:43 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Sat, 18 May 2019 20:11:00 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Thu, 16 May 2019 08:29:28 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > On Wed, 15 May 2019 22:51:58 +0200
> > > Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > Don't like the second sentence. How about "It handles neither QDIO
> > in the common code, nor any device type specific stuff (like channel
> > programs constructed by the DADS driver)."
> 
> Sounds good to me (with s/DADS/DASD/ :)
> 

Of course!

> > > > A side note: making the subchannel device 'own' the DMA stuff of a
> > > > ccw device (something that was discussed in the RFC thread) is tricky
> > > > because the ccw device may outlive the subchannel (all that orphan
> > > > stuff).  
> > > 
> > > Yes, that's... eww. Not really a problem for virtio-ccw devices (which
> > > do not support the disconnected state), but can we make DMA and the
> > > subchannel moving play nice with each other at all?
> > >   
> > 
> > I don't quite understand the question. This series does not have any
> > problems with that AFAIU. Can you please clarify?
> 
> Wait, weren't you saying that there actually is a problem?
>

No, what I tried to say is: if we tried to make all the dma mem belong to
the subchannel device, we would have a problem. It appeared as a
tempting opportunity for consolidation, but I decided to not do it.

> We seem to have the following situation:
> - the device per se is represented by the ccw device
> - the subchannel is the means of communication, and dma is tied to the
>   (I/O ?) subchannel

It is not. When for example a virtio-ccw device talks to the device
using a channel program, the dma mem hosting the channel program belongs
to the ccw device and not to the subchannel.

In fact everything but the stuff in io_priv->dma_area belongs to the ccw
device.

> - the machine check handling code may move a ccw device to a different
>   subchannel, or even to a fake subchannel (orphanage handling)
> 

Right!

> The moving won't happen with virtio-ccw devices (as they do not support
> the disconnected state, which is a prereq for being moved around), but
> at a glance, this looks like it is worth some more thought.
> 
> - Are all (I/O) subchannels using e.g. the same dma size? (TBH, that
>   question sounds a bit silly: that should be a property belonging to
>   the ccw device, shouldn't it?)
> - What dma properties does the fake subchannel have? (Probably none, as
>   its only purpose is to serve as a parent for otherwise parentless
>   disconnected ccw devices, and is therefore not involved in any I/O.)
> - There needs to be some kind of handling in the machine check code, I
>   guess? We would probably need a different allocation if we end up at
>   a different subchannel?
> 

Basically nothing changes with mem ownership, except that some bits are
dma memory now. Should I provide a more detailed answer to the
questions above?

> I think we can assume that the dma size is at most 31 bits (since that
> is what the common I/O layer needs); but can we also assume that it
> will always be at least 31 bits?
> 

You mean dma_mask by dma size?

> My take on this is that we should be sure that we're not digging
> ourselves a hole that will be hard to get out of again should we want to
> support non-virtio-ccw in the future, not that the current
> implementation is necessarily broken.
> 

I agree!

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
  2019-05-20 12:34               ` Halil Pasic
@ 2019-05-20 13:43                 ` Cornelia Huck
  -1 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-20 13:43 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Mon, 20 May 2019 14:34:11 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Mon, 20 May 2019 12:21:43 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Sat, 18 May 2019 20:11:00 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Thu, 16 May 2019 08:29:28 +0200
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >   
> > > > On Wed, 15 May 2019 22:51:58 +0200
> > > > Halil Pasic <pasic@linux.ibm.com> wrote:  

> > > > > A side note: making the subchannel device 'own' the DMA stuff of a
> > > > > ccw device (something that was discussed in the RFC thread) is tricky
> > > > > because the ccw device may outlive the subchannel (all that orphan
> > > > > stuff).    
> > > > 
> > > > Yes, that's... eww. Not really a problem for virtio-ccw devices (which
> > > > do not support the disconnected state), but can we make DMA and the
> > > > subchannel moving play nice with each other at all?
> > > >     
> > > 
> > > I don't quite understand the question. This series does not have any
> > > problems with that AFAIU. Can you please clarify?  
> > 
> > Wait, weren't you saying that there actually is a problem?
> >  
> 
> No, what I tried to say is: if we tried to make all the dma mem belong to
> the subchannel device, we would have a problem. It appeared as a
> tempting opportunity for consolidation, but I decided to not do it.

Ok, that makes sense.

> 
> > We seem to have the following situation:
> > - the device per se is represented by the ccw device
> > - the subchannel is the means of communication, and dma is tied to the
> >   (I/O ?) subchannel  
> 
> It is not. When for example a virtio-ccw device talks to the device
> using a channel program, the dma mem hosting the channel program belongs
> to the ccw device and not to the subchannel.
> 
> In fact everything but the stuff in io_priv->dma_area belongs to the ccw
> device.

Normal machine check handling hopefully should cover this one, then.

> 
> > - the machine check handling code may move a ccw device to a different
> >   subchannel, or even to a fake subchannel (orphanage handling)
> >   
> 
> Right!
> 
> > The moving won't happen with virtio-ccw devices (as they do not support
> > the disconnected state, which is a prereq for being moved around), but
> > at a glance, this looks like it is worth some more thought.
> > 
> > - Are all (I/O) subchannels using e.g. the same dma size? (TBH, that
> >   question sounds a bit silly: that should be a property belonging to
> >   the ccw device, shouldn't it?)
> > - What dma properties does the fake subchannel have? (Probably none, as
> >   its only purpose is to serve as a parent for otherwise parentless
> >   disconnected ccw devices, and is therefore not involved in any I/O.)
> > - There needs to be some kind of handling in the machine check code, I
> >   guess? We would probably need a different allocation if we end up at
> >   a different subchannel?
> >   
> 
> Basically nothing changes with mem ownership, except that some bits are
> dma memory now. Should I provide a more detailed answer to the
> questions above?

No real need, I simply did not understand your initial remark correctly.

> 
> > I think we can assume that the dma size is at most 31 bits (since that
> > is what the common I/O layer needs); but can we also assume that it
> > will always be at least 31 bits?
> >   
> 
> You mean dma_mask by dma size?

Whatever it is called :) IIUC, we need to go with 31 bit for any
channel I/O related structures; I was mainly wondering whether any
devices need a lower limit for some of the memory they use. I would be
surprised if they did, but you never know :)

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 06/10] s390/cio: add basic protected virtualization support
@ 2019-05-20 13:43                 ` Cornelia Huck
  0 siblings, 0 replies; 182+ messages in thread
From: Cornelia Huck @ 2019-05-20 13:43 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Thomas Huth, Claudio Imbrenda, kvm,
	Sebastian Ott, Michael S. Tsirkin, Farhan Ali, Eric Farman,
	virtualization, Christoph Hellwig, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Mon, 20 May 2019 14:34:11 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Mon, 20 May 2019 12:21:43 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Sat, 18 May 2019 20:11:00 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Thu, 16 May 2019 08:29:28 +0200
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >   
> > > > On Wed, 15 May 2019 22:51:58 +0200
> > > > Halil Pasic <pasic@linux.ibm.com> wrote:  

> > > > > A side note: making the subchannel device 'own' the DMA stuff of a
> > > > > ccw device (something that was discussed in the RFC thread) is tricky
> > > > > because the ccw device may outlive the subchannel (all that orphan
> > > > > stuff).    
> > > > 
> > > > Yes, that's... eww. Not really a problem for virtio-ccw devices (which
> > > > do not support the disconnected state), but can we make DMA and the
> > > > subchannel moving play nice with each other at all?
> > > >     
> > > 
> > > I don't quite understand the question. This series does not have any
> > > problems with that AFAIU. Can you please clarify?  
> > 
> > Wait, weren't you saying that there actually is a problem?
> >  
> 
> No, what I tried to say is: if we tried to make all the dma mem belong to
> the subchannel device, we would have a problem. It appeared as a
> tempting opportunity for consolidation, but I decided to not do it.

Ok, that makes sense.

> 
> > We seem to have the following situation:
> > - the device per se is represented by the ccw device
> > - the subchannel is the means of communication, and dma is tied to the
> >   (I/O ?) subchannel  
> 
> It is not. When for example a virtio-ccw device talks to the device
> using a channel program, the dma mem hosting the channel program belongs
> to the ccw device and not to the subchannel.
> 
> In fact everything but the stuff in io_priv->dma_area belongs to the ccw
> device.

Normal machine check handling hopefully should cover this one, then.

> 
> > - the machine check handling code may move a ccw device to a different
> >   subchannel, or even to a fake subchannel (orphanage handling)
> >   
> 
> Right!
> 
> > The moving won't happen with virtio-ccw devices (as they do not support
> > the disconnected state, which is a prereq for being moved around), but
> > at a glance, this looks like it is worth some more thought.
> > 
> > - Are all (I/O) subchannels using e.g. the same dma size? (TBH, that
> >   question sounds a bit silly: that should be a property belonging to
> >   the ccw device, shouldn't it?)
> > - What dma properties does the fake subchannel have? (Probably none, as
> >   its only purpose is to serve as a parent for otherwise parentless
> >   disconnected ccw devices, and is therefore not involved in any I/O.)
> > - There needs to be some kind of handling in the machine check code, I
> >   guess? We would probably need a different allocation if we end up at
> >   a different subchannel?
> >   
> 
> Basically nothing changes with mem ownership, except that some bits are
> dma memory now. Should I provide a more detailed answer to the
> questions above?

No real need, I simply did not understand your initial remark correctly.

> 
> > I think we can assume that the dma size is at most 31 bits (since that
> > is what the common I/O layer needs); but can we also assume that it
> > will always be at least 31 bits?
> >   
> 
> You mean dma_mask by dma size?

Whatever it is called :) IIUC, we need to go with 31 bit for any
channel I/O related structures; I was mainly wondering whether any
devices need a lower limit for some of the memory they use. I would be
surprised if they did, but you never know :)

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-20 12:13                   ` Halil Pasic
@ 2019-05-21  8:46                     ` Michael Mueller
  -1 siblings, 0 replies; 182+ messages in thread
From: Michael Mueller @ 2019-05-21  8:46 UTC (permalink / raw)
  To: Halil Pasic, Sebastian Ott, Christian Borntraeger, Viktor Mihajlovski
  Cc: Cornelia Huck, kvm, linux-s390, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Vasily Gorbik, Janosch Frank, Claudio Imbrenda,
	Farhan Ali, Eric Farman



On 20.05.19 14:13, Halil Pasic wrote:
> On Thu, 16 May 2019 15:59:22 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:
> 
>> On Sun, 12 May 2019, Halil Pasic wrote:
>>> I've also got code that deals with AIRQ_IV_CACHELINE by turning the
>>> kmem_cache into a dma_pool.
>>>
>>> Cornelia, Sebastian which approach do you prefer:
>>> 1) get rid of cio_dma_pool and AIRQ_IV_CACHELINE, and waste a page per
>>> vector, or
>>> 2) go with the approach taken by the patch below?
>>
>> We only have a couple of users for airq_iv:
>>
>> virtio_ccw.c: 2K bits
> 
> 
> You mean a single allocation is 2k bits (VIRTIO_IV_BITS = 256 * 8)? My
> understanding is that the upper bound is more like:
> MAX_AIRQ_AREAS * VIRTIO_IV_BITS = 20 * 256 * 8 = 40960 bits.
> 
> In practice it is most likely just 2k.
> 
>>
>> pci with floating IRQs: <= 2K (for the per-function bit vectors)
>>                          1..4K (for the summary bit vector)
>>
> 
> As far as I can tell with virtio_pci arch_setup_msi_irqs() gets called
> once per device and allocates a small number of bits (2 and 3 in my
> test, it may depend on #virtqueues, but I did not check).
> 
> So for an upper bound we would have to multiply with the upper bound of
> pci devices/functions. What is the upper bound on the number of
> functions?
> 
>> pci with CPU directed IRQs: 2K (for the per-CPU bit vectors)
>>                              1..nr_cpu (for the summary bit vector)
>>
> 
> I guess this is the same.
> 
>>
>> The options are:
>> * page allocations for everything
> 
> Worst case we need 20 + #max_pci_dev pages. At the moment we allocate
> from ZONE_DMA (!) and waste a lot.
> 
>> * dma_pool for AIRQ_IV_CACHELINE, gen_pool for others
> 
> I prefer this. Explanation follows.
> 
>> * dma_pool for everything
>>
> 
> Less waste, by a factor of 16.
> 
>> I think we should do option 3 and use a dma_pool with cachesize
>> alignment for everything (as a prerequisite we have to limit
>> config PCI_NR_FUNCTIONS to 2K - but that is not a real constraint).
>>
> 
> I prefer option 3 because it is conceptually the smallest change, and
> provides the behavior which is closest to the current one.
> 
> Commit  414cbd1e3d14 "s390/airq: provide cacheline aligned
> ivs" (Sebastian Ott, 2019-02-27) could have been smaller had you implemented
> 'kmem_cache for everything' (and I would have had just to replace kmem_cache with
> dma_cache to achieve option 3). For some reason you decided to keep the
> iv->vector = kzalloc(size, GFP_KERNEL) code-path and make the client code request
> iv->vector = kmem_cache_zalloc(airq_iv_cache, GFP_KERNEL) explicitly, using a flag
> which you only decided to use for directed pci irqs AFAICT.
> 
> My understanding of these decisions, and especially of the rationale
> behind commit 414cbd1e3d14 is limited. Thus if option 3 is the way to
> go, and the choices made by 414cbd1e3d14 were sub-optimal, I would feel
> much more comfortable if you provided a patch that revises and switches
> everything to kmem_cache. I would then just swap kmem_cache out with a
> dma_cache and my change would end up a straightforward and relatively
> clean one.
> 
> So Sebastian, what shall we do?
> 
> Regards,
> Halil
> 
> 
> 
> 
>> Sebastian
>>
> 

Folks, I had a version running with slight changes to the initial
v1 patch set together with a revert of 414cbd1e3d14 "s390/airq: provide
cacheline aligned ivs". That of course has the drawback of the wasteful
memory usage pattern.

Now you are discussing some substantial changes. The exercise was to
get initial working code through the door. We really need a decision!


Michael

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-20 12:13                   ` Halil Pasic
@ 2019-05-22 12:07                     ` Sebastian Ott
  -1 siblings, 0 replies; 182+ messages in thread
From: Sebastian Ott @ 2019-05-22 12:07 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Cornelia Huck, kvm, linux-s390, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller

On Mon, 20 May 2019, Halil Pasic wrote:
> On Thu, 16 May 2019 15:59:22 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:
> > We only have a couple of users for airq_iv:
> > 
> > virtio_ccw.c: 2K bits
> 
> You mean a single allocation is 2k bits (VIRTIO_IV_BITS = 256 * 8)? My
> understanding is that the upper bound is more like:
> MAX_AIRQ_AREAS * VIRTIO_IV_BITS = 20 * 256 * 8 = 40960 bits.
> 
> In practice it is most likely just 2k.
> 
> > 
> > pci with floating IRQs: <= 2K (for the per-function bit vectors)
> >                         1..4K (for the summary bit vector)
> >
> 
> As far as I can tell with virtio_pci arch_setup_msi_irqs() gets called
> once per device and allocates a small number of bits (2 and 3 in my
> test, it may depend on #virtqueues, but I did not check).
> 
> So for an upper bound we would have to multiply with the upper bound of
> pci devices/functions. What is the upper bound on the number of
> functions?
> 
> > pci with CPU directed IRQs: 2K (for the per-CPU bit vectors)
> >                             1..nr_cpu (for the summary bit vector)
> > 
> 
> I guess this is the same.
> 
> > 
> > The options are:
> > * page allocations for everything
> 
> Worst case we need 20 + #max_pci_dev pages. At the moment we allocate
> from ZONE_DMA (!) and waste a lot.
> 
> > * dma_pool for AIRQ_IV_CACHELINE, gen_pool for others
> 
> I prefer this. Explanation follows.
> 
> > * dma_pool for everything
> > 
> 
> Less waste, by a factor of 16.
> 
> > I think we should do option 3 and use a dma_pool with cachesize
> > alignment for everything (as a prerequisite we have to limit
> > config PCI_NR_FUNCTIONS to 2K - but that is not a real constraint).
> 
> I prefer option 3 because it is conceptually the smallest change, and
                  ^
                  2
> provides the behavior which is closest to the current one.

I can see that this is the smallest change on top of the current
implementation. I'm good with doing that and looking for further
simplification/unification later.


> Commit  414cbd1e3d14 "s390/airq: provide cacheline aligned
> ivs" (Sebastian Ott, 2019-02-27) could have been smaller had you implemented
> 'kmem_cache for everything' (and I would have had just to replace kmem_cache with
> dma_cache to achieve option 3). For some reason you decided to keep the
> iv->vector = kzalloc(size, GFP_KERNEL) code-path and make the client code request
> iv->vector = kmem_cache_zalloc(airq_iv_cache, GFP_KERNEL) explicitly, using a flag
> which you only decided to use for directed pci irqs AFAICT.
> 
> My understanding of these decisions, and especially of the rationale
> behind commit 414cbd1e3d14 is limited.

I introduced per cpu interrupt vectors and wanted to prevent 2 CPUs from
sharing data from the same cacheline. No other user of the airq stuff had
this need. If I had been aware of the additional complexity we would add
on top of that maybe I would have made a different decision.

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-22 12:07                     ` Sebastian Ott
@ 2019-05-22 22:12                       ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-22 22:12 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: Cornelia Huck, kvm, linux-s390, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller

On Wed, 22 May 2019 14:07:05 +0200 (CEST)
Sebastian Ott <sebott@linux.ibm.com> wrote:

> On Mon, 20 May 2019, Halil Pasic wrote:
> > On Thu, 16 May 2019 15:59:22 +0200 (CEST)
> > Sebastian Ott <sebott@linux.ibm.com> wrote:
> > > We only have a couple of users for airq_iv:
> > > 
> > > virtio_ccw.c: 2K bits
> > 
> > You mean a single allocation is 2k bits (VIRTIO_IV_BITS = 256 * 8)? My
> > understanding is that the upper bound is more like:
> > MAX_AIRQ_AREAS * VIRTIO_IV_BITS = 20 * 256 * 8 = 40960 bits.
> > 
> > In practice it is most likely just 2k.
> > 
> > > 
> > > pci with floating IRQs: <= 2K (for the per-function bit vectors)
> > >                         1..4K (for the summary bit vector)
> > >
> > 
> > As far as I can tell with virtio_pci arch_setup_msi_irqs() gets called
> > once per device and allocates a small number of bits (2 and 3 in my
> > test, it may depend on #virtqueues, but I did not check).
> > 
> > So for an upper bound we would have to multiply with the upper bound of
> > pci devices/functions. What is the upper bound on the number of
> > functions?
> > 
> > > pci with CPU directed IRQs: 2K (for the per-CPU bit vectors)
> > >                             1..nr_cpu (for the summary bit vector)
> > > 
> > 
> > I guess this is the same.
> > 
> > > 
> > > The options are:
> > > * page allocations for everything
> > 
> > Worst case we need 20 + #max_pci_dev pages. At the moment we allocate
> > from ZONE_DMA (!) and waste a lot.
> > 
> > > * dma_pool for AIRQ_IV_CACHELINE, gen_pool for others
> > 
> > I prefer this. Explanation follows.
> > 
> > > * dma_pool for everything
> > > 
> > 
> > Less waste, by a factor of 16.
> > 
> > > I think we should do option 3 and use a dma_pool with cachesize
> > > alignment for everything (as a prerequisite we have to limit
> > > config PCI_NR_FUNCTIONS to 2K - but that is not a real constraint).
> > 
> > I prefer option 3 because it is conceptually the smallest change, and
>                   ^
>                   2
> > provides the behavior which is closest to the current one.
> 
> I can see that this is the smallest change on top of the current
> implementation. I'm good with doing that and looking for further
> simplification/unification later.
> 

Nod. I will go with that for v2.

> 
> > Commit  414cbd1e3d14 "s390/airq: provide cacheline aligned
> > ivs" (Sebastian Ott, 2019-02-27) could have been smaller had you implemented
> > 'kmem_cache for everything' (and I would have had just to replace kmem_cache with
> > dma_cache to achieve option 3). For some reason you decided to keep the
> > iv->vector = kzalloc(size, GFP_KERNEL) code-path and make the client code request
> > iv->vector = kmem_cache_zalloc(airq_iv_cache, GFP_KERNEL) explicitly, using a flag
> > which you only decided to use for directed pci irqs AFAICT.
> > 
> > My understanding of these decisions, and especially of the rationale
> > behind commit 414cbd1e3d14 is limited.
> 
> I introduced per cpu interrupt vectors and wanted to prevent 2 CPUs from
> sharing data from the same cacheline. No other user of the airq stuff had
> this need. If I had been aware of the additional complexity we would add
> on top of that maybe I would have made a different decision.

I understand. Thanks for the explanation!

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

* Re: [PATCH 05/10] s390/cio: introduce DMA pools to cio
  2019-05-08 13:18     ` Sebastian Ott
@ 2019-05-23 15:17       ` Halil Pasic
  -1 siblings, 0 replies; 182+ messages in thread
From: Halil Pasic @ 2019-05-23 15:17 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Michael S. Tsirkin, Christoph Hellwig,
	Thomas Huth, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael Mueller

On Wed, 8 May 2019 15:18:10 +0200 (CEST)
Sebastian Ott <sebott@linux.ibm.com> wrote:

> > @@ -224,6 +228,9 @@ struct subchannel *css_alloc_subchannel(struct subchannel_id schid,
> >  	INIT_WORK(&sch->todo_work, css_sch_todo);
> >  	sch->dev.release = &css_subchannel_release;
> >  	device_initialize(&sch->dev);
> > +	sch->dma_mask = css_dev_dma_mask;
> > +	sch->dev.dma_mask = &sch->dma_mask;
> > +	sch->dev.coherent_dma_mask = sch->dma_mask;  
> 
> Could we do:
> 	sch->dev.dma_mask = &sch->dev.coherent_dma_mask;
> 	sch->dev.coherent_dma_mask = css_dev_dma_mask;
> ?

Looks like a good idea to me. We will do it for all 3 (sch, ccw and
css). Thanks!

Regards,
Halil

^ permalink raw reply	[flat|nested] 182+ messages in thread

end of thread, other threads:[~2019-05-23 15:17 UTC | newest]

Thread overview: 182+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-26 18:32 [PATCH 00/10] s390: virtio: support protected virtualization Halil Pasic
2019-04-26 18:32 ` [PATCH 01/10] virtio/s390: use vring_create_virtqueue Halil Pasic
2019-05-03  9:17   ` Cornelia Huck
2019-05-03 20:04     ` Michael S. Tsirkin
2019-05-04 14:03       ` Halil Pasic
2019-05-05 11:15         ` Cornelia Huck
2019-05-07 13:58           ` Christian Borntraeger
2019-05-08 20:12             ` Halil Pasic
2019-05-10 14:07             ` Cornelia Huck
2019-05-12 16:47               ` Michael S. Tsirkin
2019-05-13  9:52                 ` Cornelia Huck
2019-05-13 12:27                   ` Michael Mueller
2019-05-13 12:29                     ` Cornelia Huck
2019-04-26 18:32 ` [PATCH 02/10] virtio/s390: DMA support for virtio-ccw Halil Pasic
2019-05-03  9:31   ` Cornelia Huck
2019-04-26 18:32 ` [PATCH 03/10] virtio/s390: enable packed ring Halil Pasic
2019-05-03  9:44   ` Cornelia Huck
2019-05-05 15:13     ` Thomas Huth
2019-04-26 18:32 ` [PATCH 04/10] s390/mm: force swiotlb for protected virtualization Halil Pasic
2019-04-26 19:27   ` Christoph Hellwig
2019-04-29 13:59     ` Halil Pasic
2019-04-29 14:05       ` Christian Borntraeger
2019-05-13 12:50         ` Michael Mueller
2019-05-08 13:15   ` Claudio Imbrenda
2019-05-09 22:34     ` Halil Pasic
2019-05-15 14:15       ` Michael Mueller
     [not found]   ` <ad23f5e7-dc78-04af-c892-47bbc65134c6@linux.ibm.com>
2019-05-09 18:05     ` Jason J. Herne
2019-05-10  7:49       ` Claudio Imbrenda
2019-04-26 18:32 ` [PATCH 05/10] s390/cio: introduce DMA pools to cio Halil Pasic
2019-05-08 13:18   ` Sebastian Ott
2019-05-08 21:22     ` Halil Pasic
2019-05-09  8:40       ` Sebastian Ott
2019-05-09 10:11       ` Cornelia Huck
2019-05-09 22:11         ` Halil Pasic
2019-05-10 14:10           ` Cornelia Huck
2019-05-12 18:22             ` Halil Pasic
2019-05-13 13:29               ` Cornelia Huck
2019-05-15 17:12                 ` Halil Pasic
2019-05-16  6:13                   ` Cornelia Huck
2019-05-16 13:59               ` Sebastian Ott
2019-05-20 12:13                 ` Halil Pasic
2019-05-21  8:46                   ` Michael Mueller
2019-05-22 12:07                   ` Sebastian Ott
2019-05-22 22:12                     ` Halil Pasic
2019-05-23 15:17     ` Halil Pasic
2019-04-26 18:32 ` [PATCH 06/10] s390/cio: add basic protected virtualization support Halil Pasic
2019-05-08 13:46   ` Sebastian Ott
2019-05-08 13:54     ` Christoph Hellwig
2019-05-08 21:08     ` Halil Pasic
2019-05-09  8:52       ` Sebastian Ott
2019-05-08 14:23   ` Pierre Morel
2019-05-13  9:41   ` Cornelia Huck
2019-05-14 14:47     ` Jason J. Herne
2019-05-15 21:08       ` Halil Pasic
2019-05-16  6:32         ` Cornelia Huck
2019-05-16 13:42           ` Halil Pasic
2019-05-16 13:54             ` Cornelia Huck
2019-05-15 20:51     ` Halil Pasic
2019-05-16  6:29       ` Cornelia Huck
2019-05-18 18:11         ` Halil Pasic
2019-05-20 10:21           ` Cornelia Huck
2019-05-20 12:34             ` Halil Pasic
2019-05-20 13:43               ` Cornelia Huck
2019-04-26 18:32 ` [PATCH 07/10] s390/airq: use DMA memory for adapter interrupts Halil Pasic
2019-05-08 13:58   ` Sebastian Ott
2019-05-09 11:37   ` Cornelia Huck
2019-05-13 12:59   ` Cornelia Huck
2019-04-26 18:32 ` [PATCH 08/10] virtio/s390: add indirection to indicators access Halil Pasic
2019-04-26 18:32   ` Halil Pasic
2019-05-08 14:31   ` Pierre Morel
2019-05-08 14:31     ` Pierre Morel
2019-05-09 12:01     ` Pierre Morel
2019-05-09 12:01       ` Pierre Morel
2019-05-09 18:26       ` Halil Pasic
2019-05-09 18:26         ` Halil Pasic
2019-05-10  7:43         ` Pierre Morel
2019-05-10  7:43           ` Pierre Morel
2019-05-10 11:54           ` Halil Pasic
2019-05-10 11:54             ` Halil Pasic
2019-05-10 15:36             ` Pierre Morel
2019-05-10 15:36               ` Pierre Morel
2019-05-13 10:15               ` Cornelia Huck
2019-05-13 10:15                 ` Cornelia Huck
2019-05-16 15:24                 ` Pierre Morel
2019-05-16 15:24                   ` Pierre Morel
2019-04-26 18:32 ` [PATCH 09/10] virtio/s390: use DMA memory for ccw I/O and classic notifiers Halil Pasic
2019-04-26 18:32   ` Halil Pasic
2019-05-08 14:46   ` Pierre Morel
2019-05-08 14:46     ` Pierre Morel
2019-05-09 13:30     ` Pierre Morel
2019-05-09 13:30       ` Pierre Morel
2019-05-09 18:30       ` Halil Pasic
2019-05-09 18:30         ` Halil Pasic
2019-05-13 13:54   ` Cornelia Huck
2019-05-13 13:54     ` Cornelia Huck
2019-04-26 18:32 ` [PATCH 10/10] virtio/s390: make airq summary indicators DMA Halil Pasic
2019-04-26 18:32   ` Halil Pasic
2019-05-08 15:11   ` Pierre Morel
2019-05-08 15:11     ` Pierre Morel
2019-05-15 13:33     ` Michael Mueller
2019-05-15 13:33       ` Michael Mueller
2019-05-15 17:23       ` Halil Pasic
2019-05-15 17:23         ` Halil Pasic
2019-05-13 12:20   ` Cornelia Huck
2019-05-13 12:20     ` Cornelia Huck
2019-05-15 13:43     ` Michael Mueller
2019-05-15 13:43       ` Michael Mueller
2019-05-15 13:50       ` Cornelia Huck
2019-05-15 13:50         ` Cornelia Huck
2019-05-15 17:18       ` Halil Pasic
2019-05-15 17:18         ` Halil Pasic
2019-05-03  9:55 ` [PATCH 00/10] s390: virtio: support protected virtualization Cornelia Huck
2019-05-03 10:03   ` Juergen Gross
2019-05-03 13:33   ` Cornelia Huck
2019-05-03 13:33     ` Cornelia Huck
2019-05-04 13:58   ` Halil Pasic
2019-05-04 13:58     ` Halil Pasic
