* [PATCH v3 0/6] vfio-ccw: support hsch/csch (kernel part)
From: Cornelia Huck @ 2019-01-30 13:22 UTC (permalink / raw)
  To: Halil Pasic, Eric Farman, Farhan Ali, Pierre Morel
  Cc: linux-s390, kvm, Cornelia Huck, Alex Williamson, qemu-devel, qemu-s390x

[This is the Linux kernel part, git tree is available at
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw.git vfio-ccw-eagain-caps-v3

The companion QEMU patches are available at
https://github.com/cohuck/qemu vfio-ccw-caps
This is the previously posted v2 version, which should continue to work.]

Currently, vfio-ccw only relays START SUBCHANNEL requests to the real
device. This tends to work well for the most common 'good path' scenarios;
however, as we emulate {HALT,CLEAR} SUBCHANNEL in QEMU, things like
clearing pending requests at the device are currently not supported.
This may be a problem for, e.g., error recovery.

This patch series introduces capabilities (similar to what vfio-pci uses)
and exposes a new async region for handling hsch/csch.
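For readers unfamiliar with the capability mechanism borrowed from vfio-pci: userspace discovers capabilities by walking offset-linked headers inside the buffer returned by VFIO_DEVICE_GET_REGION_INFO. A minimal, self-contained walker might look like the sketch below; the struct mirrors vfio_info_cap_header from <linux/vfio.h> but is redefined locally, and the buffer handling is an illustrative assumption, not driver code:

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct vfio_info_cap_header from <linux/vfio.h>; redefined
 * here so the sketch is self-contained. */
struct cap_header {
	uint16_t id;
	uint16_t version;
	uint32_t next;	/* offset of next capability from buffer start, 0 = end */
};

/* Walk a VFIO capability chain starting at offset 'first' within the
 * 'len'-byte buffer 'buf'; return the header with the requested id,
 * or NULL if the chain does not contain it. */
static struct cap_header *cap_find(void *buf, size_t len,
				   uint32_t first, uint16_t id)
{
	uint32_t off = first;

	while (off != 0 && off + sizeof(struct cap_header) <= len) {
		struct cap_header *hdr =
			(struct cap_header *)((char *)buf + off);

		if (hdr->id == id)
			return hdr;
		off = hdr->next;
	}
	return NULL;
}
```

In real usage, `first` would come from the `cap_offset` field of the region info structure when the VFIO_REGION_INFO_FLAG_CAPS flag is set.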

Lightly tested (I can interact with a DASD as before, and reserve/release
seems to work well). Not sure whether there is a better way to test this;
ideas welcome.

Changes v2->v3:
- Unb0rked patch 1, improved scope
- Split out the new mutex from patch 2 into new patch 3; added missing
  locking and hopefully improved description
- Patch 2 now reworks the state handling by splitting the BUSY state
  into CP_PROCESSING and CP_PENDING
- Patches 3 and 5 adapted on top of the reworked patches; hsch/csch
  are allowed in CP_PENDING, but not in CP_PROCESSING (did not add
  any R-b due to that)
- Added missing free in patch 5
- Probably some small changes I forgot to note down

Changes v1->v2:
- New patch 1: make it safe to use the cp accessors at any time; this
  should avoid problems with unsolicited interrupt handling
- New patch 2: handle concurrent accesses to the io region; the idea is
  to return -EAGAIN to userspace more often (so it can simply retry)
- also handle concurrent accesses to the async io region
- change VFIO_REGION_TYPE_CCW
- merge events for halt and clear to a single async event; this turned out
  to make the code quite a bit simpler
- probably some small changes I forgot to note down

Cornelia Huck (6):
  vfio-ccw: make it safe to access channel programs
  vfio-ccw: rework ssch state handling
  vfio-ccw: protect the I/O region
  vfio-ccw: add capabilities chain
  s390/cio: export hsch to modules
  vfio-ccw: add handling for async channel instructions

 drivers/s390/cio/Makefile           |   3 +-
 drivers/s390/cio/ioasm.c            |   1 +
 drivers/s390/cio/vfio_ccw_async.c   |  88 ++++++++++++
 drivers/s390/cio/vfio_ccw_cp.c      |  20 ++-
 drivers/s390/cio/vfio_ccw_cp.h      |   2 +
 drivers/s390/cio/vfio_ccw_drv.c     |  57 ++++++--
 drivers/s390/cio/vfio_ccw_fsm.c     | 143 ++++++++++++++++++-
 drivers/s390/cio/vfio_ccw_ops.c     | 210 +++++++++++++++++++++++-----
 drivers/s390/cio/vfio_ccw_private.h |  48 ++++++-
 include/uapi/linux/vfio.h           |   4 +
 include/uapi/linux/vfio_ccw.h       |  12 ++
 11 files changed, 531 insertions(+), 57 deletions(-)
 create mode 100644 drivers/s390/cio/vfio_ccw_async.c

-- 
2.17.2

* [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
From: Cornelia Huck @ 2019-01-30 13:22 UTC (permalink / raw)
  To: Halil Pasic, Eric Farman, Farhan Ali, Pierre Morel
  Cc: linux-s390, kvm, Cornelia Huck, Alex Williamson, qemu-devel, qemu-s390x

When we get a solicited interrupt, the start function may have
been cleared by a csch, but we still have a channel program
structure allocated. Make it safe to call the cp accessors in
any case, so we can call them unconditionally.

While at it, also make sure that functions called from other parts
of the code return gracefully if the channel program structure
has not been initialized (even though that is a bug in the caller).
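The guard this patch introduces can be reduced to the following standalone sketch (not the driver code itself; the struct and names here are simplified stand-ins): accessors check an initialized flag, so they may be called unconditionally even after the backing state has been torn down.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct channel_program. */
struct cp_sketch {
	bool initialized;
	int payload;		/* stands in for the ccwchain list */
};

/* Accessor that tolerates an uninitialized (or NULL) structure,
 * mirroring the checks added to cp_prefetch()/cp_get_orb(). */
static int cp_sketch_prefetch(struct cp_sketch *cp)
{
	/* this is an error in the caller */
	if (!cp || !cp->initialized)
		return -EINVAL;
	return cp->payload;
}

/* Freeing an uninitialized structure becomes a no-op, mirroring the
 * cp->initialized check added to cp_free()/cp_unpin_free(). */
static void cp_sketch_free(struct cp_sketch *cp)
{
	if (!cp->initialized)
		return;
	cp->initialized = false;
}
```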

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
---
 drivers/s390/cio/vfio_ccw_cp.c  | 20 +++++++++++++++++++-
 drivers/s390/cio/vfio_ccw_cp.h  |  2 ++
 drivers/s390/cio/vfio_ccw_fsm.c |  5 +++++
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index ba08fe137c2e..0bc0c38edda7 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -335,6 +335,7 @@ static void cp_unpin_free(struct channel_program *cp)
 	struct ccwchain *chain, *temp;
 	int i;
 
+	cp->initialized = false;
 	list_for_each_entry_safe(chain, temp, &cp->ccwchain_list, next) {
 		for (i = 0; i < chain->ch_len; i++) {
 			pfn_array_table_unpin_free(chain->ch_pat + i,
@@ -701,6 +702,8 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)
 	 */
 	cp->orb.cmd.c64 = 1;
 
+	cp->initialized = true;
+
 	return ret;
 }
 
@@ -715,7 +718,8 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)
  */
 void cp_free(struct channel_program *cp)
 {
-	cp_unpin_free(cp);
+	if (cp->initialized)
+		cp_unpin_free(cp);
 }
 
 /**
@@ -760,6 +764,10 @@ int cp_prefetch(struct channel_program *cp)
 	struct ccwchain *chain;
 	int len, idx, ret;
 
+	/* this is an error in the caller */
+	if (!cp || !cp->initialized)
+		return -EINVAL;
+
 	list_for_each_entry(chain, &cp->ccwchain_list, next) {
 		len = chain->ch_len;
 		for (idx = 0; idx < len; idx++) {
@@ -795,6 +803,10 @@ union orb *cp_get_orb(struct channel_program *cp, u32 intparm, u8 lpm)
 	struct ccwchain *chain;
 	struct ccw1 *cpa;
 
+	/* this is an error in the caller */
+	if (!cp || !cp->initialized)
+		return NULL;
+
 	orb = &cp->orb;
 
 	orb->cmd.intparm = intparm;
@@ -831,6 +843,9 @@ void cp_update_scsw(struct channel_program *cp, union scsw *scsw)
 	u32 cpa = scsw->cmd.cpa;
 	u32 ccw_head, ccw_tail;
 
+	if (!cp->initialized)
+		return;
+
 	/*
 	 * LATER:
 	 * For now, only update the cmd.cpa part. We may need to deal with
@@ -869,6 +884,9 @@ bool cp_iova_pinned(struct channel_program *cp, u64 iova)
 	struct ccwchain *chain;
 	int i;
 
+	if (!cp->initialized)
+		return false;
+
 	list_for_each_entry(chain, &cp->ccwchain_list, next) {
 		for (i = 0; i < chain->ch_len; i++)
 			if (pfn_array_table_iova_pinned(chain->ch_pat + i,
diff --git a/drivers/s390/cio/vfio_ccw_cp.h b/drivers/s390/cio/vfio_ccw_cp.h
index a4b74fb1aa57..3c20cd208da5 100644
--- a/drivers/s390/cio/vfio_ccw_cp.h
+++ b/drivers/s390/cio/vfio_ccw_cp.h
@@ -21,6 +21,7 @@
  * @ccwchain_list: list head of ccwchains
  * @orb: orb for the currently processed ssch request
  * @mdev: the mediated device to perform page pinning/unpinning
+ * @initialized: whether this instance is actually initialized
  *
  * @ccwchain_list is the head of a ccwchain list, that contents the
  * translated result of the guest channel program that pointed out by
@@ -30,6 +31,7 @@ struct channel_program {
 	struct list_head ccwchain_list;
 	union orb orb;
 	struct device *mdev;
+	bool initialized;
 };
 
 extern int cp_init(struct channel_program *cp, struct device *mdev,
diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index cab17865aafe..e7c9877c9f1e 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -31,6 +31,10 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
 	private->state = VFIO_CCW_STATE_BUSY;
 
 	orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
+	if (!orb) {
+		ret = -EIO;
+		goto out;
+	}
 
 	/* Issue "Start Subchannel" */
 	ccode = ssch(sch->schid, orb);
@@ -64,6 +68,7 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
 	default:
 		ret = ccode;
 	}
+out:
 	spin_unlock_irqrestore(sch->lock, flags);
 	return ret;
 }
-- 
2.17.2

* [PATCH v3 2/6] vfio-ccw: rework ssch state handling
From: Cornelia Huck @ 2019-01-30 13:22 UTC (permalink / raw)
  To: Halil Pasic, Eric Farman, Farhan Ali, Pierre Morel
  Cc: linux-s390, kvm, Cornelia Huck, Alex Williamson, qemu-devel, qemu-s390x

The flow for processing ssch requests can be improved by splitting
the BUSY state:

- CP_PROCESSING: We reject any user space requests while we are in
  the process of translating a channel program and submitting it to
  the hardware. Use -EAGAIN to signal user space that it should
  retry the request.
- CP_PENDING: We have successfully submitted a request with ssch and
  are now expecting an interrupt. As we can't handle more than one
  channel program being processed, reject any further requests with
  -EBUSY. A final interrupt will move us out of this state; this also
  fixes a latent bug where a non-final interrupt might have freed a
  channel program that was still in progress.
  By making this a separate state, we make it possible to issue a
  halt or a clear while we're still waiting for the final interrupt
  for the ssch (in a follow-on patch).

It also makes a lot of sense not to preemptively filter out writes to
the io_region if we're in an incorrect state: the state machine will
handle this correctly.
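The userspace contract this implies (resubmit on -EAGAIN while translation is in flight, treat -EBUSY as a real conflict) could be sketched roughly as follows; the retry loop, its bound, and the fake submitter are illustrative assumptions, with `submit` standing in for the write to the vfio-ccw I/O region:

```c
#include <errno.h>

/* Resubmit a request while the kernel reports -EAGAIN (CP_PROCESSING),
 * giving up after max_tries attempts or on any other result, e.g.
 * -EBUSY (CP_PENDING) or a hard error. */
static int submit_with_retry(int (*submit)(void *), void *ctx, int max_tries)
{
	int ret = -EAGAIN;

	while (max_tries-- > 0) {
		ret = submit(ctx);
		if (ret != -EAGAIN)
			break;
	}
	return ret;
}

/* Fake submitter for demonstration: fails with -EAGAIN twice, then
 * succeeds, mimicking a request racing with channel program translation. */
static int fake_submit(void *ctx)
{
	int *calls = ctx;

	return (++*calls <= 2) ? -EAGAIN : 0;
}
```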

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
---
 drivers/s390/cio/vfio_ccw_drv.c     |  8 ++++++--
 drivers/s390/cio/vfio_ccw_fsm.c     | 19 ++++++++++++++-----
 drivers/s390/cio/vfio_ccw_ops.c     |  2 --
 drivers/s390/cio/vfio_ccw_private.h |  3 ++-
 4 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index a10cec0e86eb..0b3b9de45c60 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -72,20 +72,24 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
 {
 	struct vfio_ccw_private *private;
 	struct irb *irb;
+	bool is_final;
 
 	private = container_of(work, struct vfio_ccw_private, io_work);
 	irb = &private->irb;
 
+	is_final = !(scsw_actl(&irb->scsw) &
+		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
 	if (scsw_is_solicited(&irb->scsw)) {
 		cp_update_scsw(&private->cp, &irb->scsw);
-		cp_free(&private->cp);
+		if (is_final)
+			cp_free(&private->cp);
 	}
 	memcpy(private->io_region->irb_area, irb, sizeof(*irb));
 
 	if (private->io_trigger)
 		eventfd_signal(private->io_trigger, 1);
 
-	if (private->mdev)
+	if (private->mdev && is_final)
 		private->state = VFIO_CCW_STATE_IDLE;
 }
 
diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index e7c9877c9f1e..b4a141fbd1a8 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -28,7 +28,6 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
 	sch = private->sch;
 
 	spin_lock_irqsave(sch->lock, flags);
-	private->state = VFIO_CCW_STATE_BUSY;
 
 	orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
 	if (!orb) {
@@ -46,6 +45,7 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
 		 */
 		sch->schib.scsw.cmd.actl |= SCSW_ACTL_START_PEND;
 		ret = 0;
+		private->state = VFIO_CCW_STATE_CP_PENDING;
 		break;
 	case 1:		/* Status pending */
 	case 2:		/* Busy */
@@ -107,6 +107,12 @@ static void fsm_io_busy(struct vfio_ccw_private *private,
 	private->io_region->ret_code = -EBUSY;
 }
 
+static void fsm_io_retry(struct vfio_ccw_private *private,
+			 enum vfio_ccw_event event)
+{
+	private->io_region->ret_code = -EAGAIN;
+}
+
 static void fsm_disabled_irq(struct vfio_ccw_private *private,
 			     enum vfio_ccw_event event)
 {
@@ -135,8 +141,7 @@ static void fsm_io_request(struct vfio_ccw_private *private,
 	struct mdev_device *mdev = private->mdev;
 	char *errstr = "request";
 
-	private->state = VFIO_CCW_STATE_BUSY;
-
+	private->state = VFIO_CCW_STATE_CP_PROCESSING;
 	memcpy(scsw, io_region->scsw_area, sizeof(*scsw));
 
 	if (scsw->cmd.fctl & SCSW_FCTL_START_FUNC) {
@@ -181,7 +186,6 @@ static void fsm_io_request(struct vfio_ccw_private *private,
 	}
 
 err_out:
-	private->state = VFIO_CCW_STATE_IDLE;
 	trace_vfio_ccw_io_fctl(scsw->cmd.fctl, get_schid(private),
 			       io_region->ret_code, errstr);
 }
@@ -221,7 +225,12 @@ fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
 		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_request,
 		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
 	},
-	[VFIO_CCW_STATE_BUSY] = {
+	[VFIO_CCW_STATE_CP_PROCESSING] = {
+		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
+		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_retry,
+		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
+	},
+	[VFIO_CCW_STATE_CP_PENDING] = {
 		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
 		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_busy,
 		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index f673e106c041..3fdcc6dfe0bf 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -193,8 +193,6 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
 		return -EINVAL;
 
 	private = dev_get_drvdata(mdev_parent_dev(mdev));
-	if (private->state != VFIO_CCW_STATE_IDLE)
-		return -EACCES;
 
 	region = private->io_region;
 	if (copy_from_user((void *)region + *ppos, buf, count))
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index 08e9a7dc9176..50c52efb4fcb 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -63,7 +63,8 @@ enum vfio_ccw_state {
 	VFIO_CCW_STATE_NOT_OPER,
 	VFIO_CCW_STATE_STANDBY,
 	VFIO_CCW_STATE_IDLE,
-	VFIO_CCW_STATE_BUSY,
+	VFIO_CCW_STATE_CP_PROCESSING,
+	VFIO_CCW_STATE_CP_PENDING,
 	/* last element! */
 	NR_VFIO_CCW_STATES
 };
-- 
2.17.2

* [PATCH v3 3/6] vfio-ccw: protect the I/O region
From: Cornelia Huck @ 2019-01-30 13:22 UTC (permalink / raw)
  To: Halil Pasic, Eric Farman, Farhan Ali, Pierre Morel
  Cc: linux-s390, kvm, Cornelia Huck, Alex Williamson, qemu-devel, qemu-s390x

Introduce a mutex to disallow concurrent reads or writes to the
I/O region. This makes sure that the data the kernel or user
space sees is always consistent.

The same mutex will be used to protect the async region as well.
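Note the asymmetry in the diff below: reads block on the mutex, while writes use mutex_trylock() and fail fast with -EAGAIN so userspace can retry. A rough pthread-based analogue of the write path, purely for illustration:

```c
#include <errno.h>
#include <pthread.h>

/* Illustrative analogue of the io_mutex write-path semantics: a write
 * racing with another region access fails fast with -EAGAIN instead
 * of blocking, matching mutex_trylock() in vfio_ccw_mdev_write(). */
static pthread_mutex_t io_mutex = PTHREAD_MUTEX_INITIALIZER;

static int region_write_sketch(void (*do_copy)(void))
{
	if (pthread_mutex_trylock(&io_mutex))
		return -EAGAIN;	/* region busy; caller may retry */
	do_copy();		/* stands in for copy_from_user() + FSM event */
	pthread_mutex_unlock(&io_mutex);
	return 0;
}

/* No-op payload for demonstration. */
static void noop_copy(void) { }
```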

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
---
 drivers/s390/cio/vfio_ccw_drv.c     |  3 +++
 drivers/s390/cio/vfio_ccw_ops.c     | 28 +++++++++++++++++++---------
 drivers/s390/cio/vfio_ccw_private.h |  2 ++
 3 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 0b3b9de45c60..5ea0da1dd954 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -84,7 +84,9 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
 		if (is_final)
 			cp_free(&private->cp);
 	}
+	mutex_lock(&private->io_mutex);
 	memcpy(private->io_region->irb_area, irb, sizeof(*irb));
+	mutex_unlock(&private->io_mutex);
 
 	if (private->io_trigger)
 		eventfd_signal(private->io_trigger, 1);
@@ -129,6 +131,7 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
 
 	private->sch = sch;
 	dev_set_drvdata(&sch->dev, private);
+	mutex_init(&private->io_mutex);
 
 	spin_lock_irq(sch->lock);
 	private->state = VFIO_CCW_STATE_NOT_OPER;
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 3fdcc6dfe0bf..025c8a832bc8 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -169,16 +169,20 @@ static ssize_t vfio_ccw_mdev_read(struct mdev_device *mdev,
 {
 	struct vfio_ccw_private *private;
 	struct ccw_io_region *region;
+	int ret;
 
 	if (*ppos + count > sizeof(*region))
 		return -EINVAL;
 
 	private = dev_get_drvdata(mdev_parent_dev(mdev));
+	mutex_lock(&private->io_mutex);
 	region = private->io_region;
 	if (copy_to_user(buf, (void *)region + *ppos, count))
-		return -EFAULT;
-
-	return count;
+		ret = -EFAULT;
+	else
+		ret = count;
+	mutex_unlock(&private->io_mutex);
+	return ret;
 }
 
 static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
@@ -188,23 +192,29 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
 {
 	struct vfio_ccw_private *private;
 	struct ccw_io_region *region;
+	int ret;
 
 	if (*ppos + count > sizeof(*region))
 		return -EINVAL;
 
 	private = dev_get_drvdata(mdev_parent_dev(mdev));
+	if (!mutex_trylock(&private->io_mutex))
+		return -EAGAIN;
 
 	region = private->io_region;
-	if (copy_from_user((void *)region + *ppos, buf, count))
-		return -EFAULT;
+	if (copy_from_user((void *)region + *ppos, buf, count)) {
+		ret = -EFAULT;
+		goto out_unlock;
+	}
 
 	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_IO_REQ);
-	if (region->ret_code != 0) {
+	if (region->ret_code != 0)
 		private->state = VFIO_CCW_STATE_IDLE;
-		return region->ret_code;
-	}
+	ret = (region->ret_code != 0) ? region->ret_code : count;
 
-	return count;
+out_unlock:
+	mutex_unlock(&private->io_mutex);
+	return ret;
 }
 
 static int vfio_ccw_mdev_get_device_info(struct vfio_device_info *info)
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index 50c52efb4fcb..32173cbd838d 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -28,6 +28,7 @@
  * @mdev: pointer to the mediated device
  * @nb: notifier for vfio events
  * @io_region: MMIO region to input/output I/O arguments/results
+ * @io_mutex: protect against concurrent update of I/O regions
  * @cp: channel program for the current I/O operation
  * @irb: irb info received from interrupt
  * @scsw: scsw info
@@ -42,6 +43,7 @@ struct vfio_ccw_private {
 	struct mdev_device	*mdev;
 	struct notifier_block	nb;
 	struct ccw_io_region	*io_region;
+	struct mutex		io_mutex;
 
 	struct channel_program	cp;
 	struct irb		irb;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 70+ messages in thread
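[Not part of the patch: the mutex_trylock() in the write handler above means a write to the I/O region can now fail with -EAGAIN while another request holds io_mutex, so userspace is expected to retry. A minimal sketch of that retry pattern follows; the helper and callback type are hypothetical, only the -EAGAIN semantics come from the kernel change.]

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical callback standing in for a pwrite() to the I/O region. */
typedef ssize_t (*region_write_fn)(const void *buf, size_t count);

/*
 * Retry a region write a bounded number of times while the kernel
 * reports -EAGAIN (i.e. io_mutex was contended).
 */
static ssize_t write_io_region_retry(region_write_fn do_write,
				     const void *buf, size_t count,
				     int max_tries)
{
	ssize_t ret = -EAGAIN;

	while (max_tries-- > 0) {
		ret = do_write(buf, count);
		if (ret != -EAGAIN)	/* success or a hard error */
			break;
	}
	return ret;
}
```

In real code the callback would be a pwrite() on the mdev file descriptor; whether and how long to back off between attempts is up to the caller.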

* [PATCH v3 4/6] vfio-ccw: add capabilities chain
  2019-01-30 13:22 ` [Qemu-devel] " Cornelia Huck
@ 2019-01-30 13:22   ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-01-30 13:22 UTC (permalink / raw)
  To: Halil Pasic, Eric Farman, Farhan Ali, Pierre Morel
  Cc: linux-s390, kvm, Cornelia Huck, Alex Williamson, qemu-devel, qemu-s390x

Allow the regions used by vfio-ccw to be extended. The first user will be
the handling of halt and clear subchannel.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
---
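[Not part of the patch: the 40-bit offset scheme added to vfio_ccw_private.h below can be sanity-checked in plain userspace C, which may help when writing the QEMU side. u64 is replaced by uint64_t and extra parentheses added around `off`; otherwise the macros match the hunk.]

```c
#include <stdint.h>

#define VFIO_CCW_OFFSET_SHIFT	40
#define VFIO_CCW_OFFSET_TO_INDEX(off)	((off) >> VFIO_CCW_OFFSET_SHIFT)
#define VFIO_CCW_INDEX_TO_OFFSET(index)	((uint64_t)(index) << VFIO_CCW_OFFSET_SHIFT)
#define VFIO_CCW_OFFSET_MASK	(((uint64_t)(1) << VFIO_CCW_OFFSET_SHIFT) - 1)

/* The region index lives in the high bits of the file offset... */
static unsigned int region_index(uint64_t off)
{
	return (unsigned int)VFIO_CCW_OFFSET_TO_INDEX(off);
}

/* ...and the low 40 bits address bytes within that region. */
static uint64_t region_pos(uint64_t off)
{
	return off & VFIO_CCW_OFFSET_MASK;
}
```

So a read or write at `VFIO_CCW_INDEX_TO_OFFSET(i) + pos` reaches byte `pos` of region `i`, and offsets below 1 << 40 keep hitting the existing config region, preserving the old ABI.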
 drivers/s390/cio/vfio_ccw_ops.c     | 181 ++++++++++++++++++++++++----
 drivers/s390/cio/vfio_ccw_private.h |  38 ++++++
 include/uapi/linux/vfio.h           |   2 +
 3 files changed, 195 insertions(+), 26 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 025c8a832bc8..48b2d7930ea8 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -3,9 +3,11 @@
  * Physical device callbacks for vfio_ccw
  *
  * Copyright IBM Corp. 2017
+ * Copyright Red Hat, Inc. 2019
  *
  * Author(s): Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
  *            Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
+ *            Cornelia Huck <cohuck@redhat.com>
  */
 
 #include <linux/vfio.h>
@@ -157,27 +159,33 @@ static void vfio_ccw_mdev_release(struct mdev_device *mdev)
 {
 	struct vfio_ccw_private *private =
 		dev_get_drvdata(mdev_parent_dev(mdev));
+	int i;
 
 	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
 				 &private->nb);
+
+	for (i = 0; i < private->num_regions; i++)
+		private->region[i].ops->release(private, &private->region[i]);
+
+	private->num_regions = 0;
+	kfree(private->region);
+	private->region = NULL;
 }
 
-static ssize_t vfio_ccw_mdev_read(struct mdev_device *mdev,
-				  char __user *buf,
-				  size_t count,
-				  loff_t *ppos)
+static ssize_t vfio_ccw_mdev_read_io_region(struct vfio_ccw_private *private,
+					    char __user *buf, size_t count,
+					    loff_t *ppos)
 {
-	struct vfio_ccw_private *private;
+	loff_t pos = *ppos & VFIO_CCW_OFFSET_MASK;
 	struct ccw_io_region *region;
 	int ret;
 
-	if (*ppos + count > sizeof(*region))
+	if (pos + count > sizeof(*region))
 		return -EINVAL;
 
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
 	mutex_lock(&private->io_mutex);
 	region = private->io_region;
-	if (copy_to_user(buf, (void *)region + *ppos, count))
+	if (copy_to_user(buf, (void *)region + pos, count))
 		ret = -EFAULT;
 	else
 		ret = count;
@@ -185,24 +193,47 @@ static ssize_t vfio_ccw_mdev_read(struct mdev_device *mdev,
 	return ret;
 }
 
-static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
-				   const char __user *buf,
-				   size_t count,
-				   loff_t *ppos)
+static ssize_t vfio_ccw_mdev_read(struct mdev_device *mdev,
+				  char __user *buf,
+				  size_t count,
+				  loff_t *ppos)
 {
+	unsigned int index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
 	struct vfio_ccw_private *private;
+
+	private = dev_get_drvdata(mdev_parent_dev(mdev));
+
+	if (index >= VFIO_CCW_NUM_REGIONS + private->num_regions)
+		return -EINVAL;
+
+	switch (index) {
+	case VFIO_CCW_CONFIG_REGION_INDEX:
+		return vfio_ccw_mdev_read_io_region(private, buf, count, ppos);
+	default:
+		index -= VFIO_CCW_NUM_REGIONS;
+		return private->region[index].ops->read(private, buf, count,
+							ppos);
+	}
+
+	return -EINVAL;
+}
+
+static ssize_t vfio_ccw_mdev_write_io_region(struct vfio_ccw_private *private,
+					     const char __user *buf,
+					     size_t count, loff_t *ppos)
+{
+	loff_t pos = *ppos & VFIO_CCW_OFFSET_MASK;
 	struct ccw_io_region *region;
 	int ret;
 
-	if (*ppos + count > sizeof(*region))
+	if (pos + count > sizeof(*region))
 		return -EINVAL;
 
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
 	if (!mutex_trylock(&private->io_mutex))
 		return -EAGAIN;
 
 	region = private->io_region;
-	if (copy_from_user((void *)region + *ppos, buf, count)) {
+	if (copy_from_user((void *)region + pos, buf, count)) {
 		ret = -EFAULT;
 		goto out_unlock;
 	}
@@ -217,19 +248,52 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
 	return ret;
 }
 
-static int vfio_ccw_mdev_get_device_info(struct vfio_device_info *info)
+static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
+				   const char __user *buf,
+				   size_t count,
+				   loff_t *ppos)
+{
+	unsigned int index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
+	struct vfio_ccw_private *private;
+
+	private = dev_get_drvdata(mdev_parent_dev(mdev));
+
+	if (index >= VFIO_CCW_NUM_REGIONS + private->num_regions)
+		return -EINVAL;
+
+	switch (index) {
+	case VFIO_CCW_CONFIG_REGION_INDEX:
+		return vfio_ccw_mdev_write_io_region(private, buf, count, ppos);
+	default:
+		index -= VFIO_CCW_NUM_REGIONS;
+		return private->region[index].ops->write(private, buf, count,
+							 ppos);
+	}
+
+	return -EINVAL;
+}
+
+static int vfio_ccw_mdev_get_device_info(struct vfio_device_info *info,
+					 struct mdev_device *mdev)
 {
+	struct vfio_ccw_private *private;
+
+	private = dev_get_drvdata(mdev_parent_dev(mdev));
 	info->flags = VFIO_DEVICE_FLAGS_CCW | VFIO_DEVICE_FLAGS_RESET;
-	info->num_regions = VFIO_CCW_NUM_REGIONS;
+	info->num_regions = VFIO_CCW_NUM_REGIONS + private->num_regions;
 	info->num_irqs = VFIO_CCW_NUM_IRQS;
 
 	return 0;
 }
 
 static int vfio_ccw_mdev_get_region_info(struct vfio_region_info *info,
-					 u16 *cap_type_id,
-					 void **cap_type)
+					 struct mdev_device *mdev,
+					 unsigned long arg)
 {
+	struct vfio_ccw_private *private;
+	int i;
+
+	private = dev_get_drvdata(mdev_parent_dev(mdev));
 	switch (info->index) {
 	case VFIO_CCW_CONFIG_REGION_INDEX:
 		info->offset = 0;
@@ -237,9 +301,51 @@ static int vfio_ccw_mdev_get_region_info(struct vfio_region_info *info,
 		info->flags = VFIO_REGION_INFO_FLAG_READ
 			      | VFIO_REGION_INFO_FLAG_WRITE;
 		return 0;
-	default:
-		return -EINVAL;
+	default: /* all other regions are handled via capability chain */
+	{
+		struct vfio_info_cap caps = { .buf = NULL, .size = 0 };
+		struct vfio_region_info_cap_type cap_type = {
+			.header.id = VFIO_REGION_INFO_CAP_TYPE,
+			.header.version = 1 };
+		int ret;
+
+		if (info->index >=
+		    VFIO_CCW_NUM_REGIONS + private->num_regions)
+			return -EINVAL;
+
+		i = info->index - VFIO_CCW_NUM_REGIONS;
+
+		info->offset = VFIO_CCW_INDEX_TO_OFFSET(info->index);
+		info->size = private->region[i].size;
+		info->flags = private->region[i].flags;
+
+		cap_type.type = private->region[i].type;
+		cap_type.subtype = private->region[i].subtype;
+
+		ret = vfio_info_add_capability(&caps, &cap_type.header,
+					       sizeof(cap_type));
+		if (ret)
+			return ret;
+
+		info->flags |= VFIO_REGION_INFO_FLAG_CAPS;
+		if (info->argsz < sizeof(*info) + caps.size) {
+			info->argsz = sizeof(*info) + caps.size;
+			info->cap_offset = 0;
+		} else {
+			vfio_info_cap_shift(&caps, sizeof(*info));
+			if (copy_to_user((void __user *)arg + sizeof(*info),
+					 caps.buf, caps.size)) {
+				kfree(caps.buf);
+				return -EFAULT;
+			}
+			info->cap_offset = sizeof(*info);
+		}
+
+		kfree(caps.buf);
+
+	}
 	}
+	return 0;
 }
 
 static int vfio_ccw_mdev_get_irq_info(struct vfio_irq_info *info)
@@ -316,6 +422,32 @@ static int vfio_ccw_mdev_set_irqs(struct mdev_device *mdev,
 	}
 }
 
+int vfio_ccw_register_dev_region(struct vfio_ccw_private *private,
+				 unsigned int subtype,
+				 const struct vfio_ccw_regops *ops,
+				 size_t size, u32 flags, void *data)
+{
+	struct vfio_ccw_region *region;
+
+	region = krealloc(private->region,
+			  (private->num_regions + 1) * sizeof(*region),
+			  GFP_KERNEL);
+	if (!region)
+		return -ENOMEM;
+
+	private->region = region;
+	private->region[private->num_regions].type = VFIO_REGION_TYPE_CCW;
+	private->region[private->num_regions].subtype = subtype;
+	private->region[private->num_regions].ops = ops;
+	private->region[private->num_regions].size = size;
+	private->region[private->num_regions].flags = flags;
+	private->region[private->num_regions].data = data;
+
+	private->num_regions++;
+
+	return 0;
+}
+
 static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
 				   unsigned int cmd,
 				   unsigned long arg)
@@ -336,7 +468,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
 		if (info.argsz < minsz)
 			return -EINVAL;
 
-		ret = vfio_ccw_mdev_get_device_info(&info);
+		ret = vfio_ccw_mdev_get_device_info(&info, mdev);
 		if (ret)
 			return ret;
 
@@ -345,8 +477,6 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
 	case VFIO_DEVICE_GET_REGION_INFO:
 	{
 		struct vfio_region_info info;
-		u16 cap_type_id = 0;
-		void *cap_type = NULL;
 
 		minsz = offsetofend(struct vfio_region_info, offset);
 
@@ -356,8 +486,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
 		if (info.argsz < minsz)
 			return -EINVAL;
 
-		ret = vfio_ccw_mdev_get_region_info(&info, &cap_type_id,
-						    &cap_type);
+		ret = vfio_ccw_mdev_get_region_info(&info, mdev, arg);
 		if (ret)
 			return ret;
 
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index 32173cbd838d..c979eb32fb1c 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -3,9 +3,11 @@
  * Private stuff for vfio_ccw driver
  *
  * Copyright IBM Corp. 2017
+ * Copyright Red Hat, Inc. 2019
  *
  * Author(s): Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
  *            Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
+ *            Cornelia Huck <cohuck@redhat.com>
  */
 
 #ifndef _VFIO_CCW_PRIVATE_H_
@@ -19,6 +21,38 @@
 #include "css.h"
 #include "vfio_ccw_cp.h"
 
+#define VFIO_CCW_OFFSET_SHIFT   40
+#define VFIO_CCW_OFFSET_TO_INDEX(off)	(off >> VFIO_CCW_OFFSET_SHIFT)
+#define VFIO_CCW_INDEX_TO_OFFSET(index)	((u64)(index) << VFIO_CCW_OFFSET_SHIFT)
+#define VFIO_CCW_OFFSET_MASK	(((u64)(1) << VFIO_CCW_OFFSET_SHIFT) - 1)
+
+/* capability chain handling similar to vfio-pci */
+struct vfio_ccw_private;
+struct vfio_ccw_region;
+
+struct vfio_ccw_regops {
+	ssize_t	(*read)(struct vfio_ccw_private *private, char __user *buf,
+			size_t count, loff_t *ppos);
+	ssize_t	(*write)(struct vfio_ccw_private *private,
+			 const char __user *buf, size_t count, loff_t *ppos);
+	void	(*release)(struct vfio_ccw_private *private,
+			   struct vfio_ccw_region *region);
+};
+
+struct vfio_ccw_region {
+	u32				type;
+	u32				subtype;
+	const struct vfio_ccw_regops	*ops;
+	void				*data;
+	size_t				size;
+	u32				flags;
+};
+
+int vfio_ccw_register_dev_region(struct vfio_ccw_private *private,
+				 unsigned int subtype,
+				 const struct vfio_ccw_regops *ops,
+				 size_t size, u32 flags, void *data);
+
 /**
  * struct vfio_ccw_private
  * @sch: pointer to the subchannel
@@ -29,6 +63,8 @@
  * @nb: notifier for vfio events
  * @io_region: MMIO region to input/output I/O arguments/results
  * @io_mutex: protect against concurrent update of I/O regions
+ * @region: additional regions for other subchannel operations
+ * @num_regions: number of additional regions
  * @cp: channel program for the current I/O operation
  * @irb: irb info received from interrupt
  * @scsw: scsw info
@@ -44,6 +80,8 @@ struct vfio_ccw_private {
 	struct notifier_block	nb;
 	struct ccw_io_region	*io_region;
 	struct mutex		io_mutex;
+	struct vfio_ccw_region *region;
+	int num_regions;
 
 	struct channel_program	cp;
 	struct irb		irb;
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 02bb7ad6e986..56e2413d3e00 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -353,6 +353,8 @@ struct vfio_region_gfx_edid {
 #define VFIO_DEVICE_GFX_LINK_STATE_DOWN  2
 };
 
+#define VFIO_REGION_TYPE_CCW			(2)
+
 /*
  * 10de vendor sub-type
  *
-- 
2.17.2


* [PATCH v3 5/6] s390/cio: export hsch to modules
  2019-01-30 13:22 ` [Qemu-devel] " Cornelia Huck
@ 2019-01-30 13:22   ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-01-30 13:22 UTC (permalink / raw)
  To: Halil Pasic, Eric Farman, Farhan Ali, Pierre Morel
  Cc: linux-s390, kvm, Cornelia Huck, Alex Williamson, qemu-devel, qemu-s390x

The vfio-ccw code will need this, and it matches the existing treatment
of ssch and csch.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
---
 drivers/s390/cio/ioasm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/s390/cio/ioasm.c b/drivers/s390/cio/ioasm.c
index 14d328338ce2..08eb10283b18 100644
--- a/drivers/s390/cio/ioasm.c
+++ b/drivers/s390/cio/ioasm.c
@@ -233,6 +233,7 @@ int hsch(struct subchannel_id schid)
 
 	return ccode;
 }
+EXPORT_SYMBOL(hsch);
 
 static inline int __xsch(struct subchannel_id schid)
 {
-- 
2.17.2


* [PATCH v3 6/6] vfio-ccw: add handling for async channel instructions
  2019-01-30 13:22 ` [Qemu-devel] " Cornelia Huck
@ 2019-01-30 13:22   ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-01-30 13:22 UTC (permalink / raw)
  To: Halil Pasic, Eric Farman, Farhan Ali, Pierre Morel
  Cc: linux-s390, kvm, Cornelia Huck, Alex Williamson, qemu-devel, qemu-s390x

Add a region to the vfio-ccw device that can be used to submit
asynchronous I/O instructions. ssch continues to be handled by the
existing I/O region; the new region handles hsch and csch.

Interrupt status continues to be reported through the same channels
as for ssch.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
---
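[Not part of the patch: a sketch of the userspace view of the new region. The field names in the command-region struct below are assumptions for illustration (only ret_code is visible in the handlers in this series); the authoritative layout is the vfio_ccw.h addition. VFIO_CCW_NUM_REGIONS is taken to be 1, i.e. only the config region is static.]

```c
#include <stdint.h>

#define VFIO_CCW_OFFSET_SHIFT	40
#define VFIO_CCW_INDEX_TO_OFFSET(index)	((uint64_t)(index) << VFIO_CCW_OFFSET_SHIFT)
#define VFIO_CCW_NUM_REGIONS	1	/* assumption: config region only */

/* Assumed layout of the async command region (hypothetical names). */
struct ccw_cmd_region {
	uint32_t command;	/* requested operation, e.g. halt/clear */
	uint32_t ret_code;	/* filled in by the kernel */
} __attribute__((packed));

/*
 * File offset at which dynamic region 'dyn' (0 would be the async
 * region, once registered) is read/written on the mdev fd.
 */
static uint64_t dyn_region_offset(unsigned int dyn)
{
	return VFIO_CCW_INDEX_TO_OFFSET(VFIO_CCW_NUM_REGIONS + dyn);
}
```

A halt or clear request would then be a pwrite() of this struct at dyn_region_offset(0), after locating the region via the capability chain; -EAGAIN signals contention on io_mutex, as for the I/O region.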
 drivers/s390/cio/Makefile           |   3 +-
 drivers/s390/cio/vfio_ccw_async.c   |  88 ++++++++++++++++++++
 drivers/s390/cio/vfio_ccw_drv.c     |  46 ++++++++---
 drivers/s390/cio/vfio_ccw_fsm.c     | 119 +++++++++++++++++++++++++++-
 drivers/s390/cio/vfio_ccw_ops.c     |  13 ++-
 drivers/s390/cio/vfio_ccw_private.h |   5 ++
 include/uapi/linux/vfio.h           |   2 +
 include/uapi/linux/vfio_ccw.h       |  12 +++
 8 files changed, 270 insertions(+), 18 deletions(-)
 create mode 100644 drivers/s390/cio/vfio_ccw_async.c

diff --git a/drivers/s390/cio/Makefile b/drivers/s390/cio/Makefile
index f230516abb96..f6a8db04177c 100644
--- a/drivers/s390/cio/Makefile
+++ b/drivers/s390/cio/Makefile
@@ -20,5 +20,6 @@ obj-$(CONFIG_CCWGROUP) += ccwgroup.o
 qdio-objs := qdio_main.o qdio_thinint.o qdio_debug.o qdio_setup.o
 obj-$(CONFIG_QDIO) += qdio.o
 
-vfio_ccw-objs += vfio_ccw_drv.o vfio_ccw_cp.o vfio_ccw_ops.o vfio_ccw_fsm.o
+vfio_ccw-objs += vfio_ccw_drv.o vfio_ccw_cp.o vfio_ccw_ops.o vfio_ccw_fsm.o \
+	vfio_ccw_async.o
 obj-$(CONFIG_VFIO_CCW) += vfio_ccw.o
diff --git a/drivers/s390/cio/vfio_ccw_async.c b/drivers/s390/cio/vfio_ccw_async.c
new file mode 100644
index 000000000000..8c1d2357ef5b
--- /dev/null
+++ b/drivers/s390/cio/vfio_ccw_async.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Async I/O region for vfio_ccw
+ *
+ * Copyright Red Hat, Inc. 2019
+ *
+ * Author(s): Cornelia Huck <cohuck@redhat.com>
+ */
+
+#include <linux/vfio.h>
+#include <linux/mdev.h>
+
+#include "vfio_ccw_private.h"
+
+static ssize_t vfio_ccw_async_region_read(struct vfio_ccw_private *private,
+					  char __user *buf, size_t count,
+					  loff_t *ppos)
+{
+	unsigned int i = VFIO_CCW_OFFSET_TO_INDEX(*ppos) - VFIO_CCW_NUM_REGIONS;
+	loff_t pos = *ppos & VFIO_CCW_OFFSET_MASK;
+	struct ccw_cmd_region *region;
+	int ret;
+
+	if (pos + count > sizeof(*region))
+		return -EINVAL;
+
+	mutex_lock(&private->io_mutex);
+	region = private->region[i].data;
+	if (copy_to_user(buf, (void *)region + pos, count))
+		ret = -EFAULT;
+	else
+		ret = count;
+	mutex_unlock(&private->io_mutex);
+	return ret;
+}
+
+static ssize_t vfio_ccw_async_region_write(struct vfio_ccw_private *private,
+					   const char __user *buf, size_t count,
+					   loff_t *ppos)
+{
+	unsigned int i = VFIO_CCW_OFFSET_TO_INDEX(*ppos) - VFIO_CCW_NUM_REGIONS;
+	loff_t pos = *ppos & VFIO_CCW_OFFSET_MASK;
+	struct ccw_cmd_region *region;
+	int ret;
+
+	if (pos + count > sizeof(*region))
+		return -EINVAL;
+
+	if (!mutex_trylock(&private->io_mutex))
+		return -EAGAIN;
+
+	region = private->region[i].data;
+	if (copy_from_user((void *)region + pos, buf, count)) {
+		ret = -EFAULT;
+		goto out_unlock;
+	}
+
+	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_ASYNC_REQ);
+
+	ret = region->ret_code ? region->ret_code : count;
+
+out_unlock:
+	mutex_unlock(&private->io_mutex);
+	return ret;
+}
+
+static void vfio_ccw_async_region_release(struct vfio_ccw_private *private,
+					  struct vfio_ccw_region *region)
+{
+
+}
+
+const struct vfio_ccw_regops vfio_ccw_async_region_ops = {
+	.read = vfio_ccw_async_region_read,
+	.write = vfio_ccw_async_region_write,
+	.release = vfio_ccw_async_region_release,
+};
+
+int vfio_ccw_register_async_dev_regions(struct vfio_ccw_private *private)
+{
+	return vfio_ccw_register_dev_region(private,
+					    VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD,
+					    &vfio_ccw_async_region_ops,
+					    sizeof(struct ccw_cmd_region),
+					    VFIO_REGION_INFO_FLAG_READ |
+					    VFIO_REGION_INFO_FLAG_WRITE,
+					    private->cmd_region);
+}
diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 5ea0da1dd954..c39d01943a6a 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -3,9 +3,11 @@
  * VFIO based Physical Subchannel device driver
  *
  * Copyright IBM Corp. 2017
+ * Copyright Red Hat, Inc. 2019
  *
  * Author(s): Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
  *            Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
+ *            Cornelia Huck <cohuck@redhat.com>
  */
 
 #include <linux/module.h>
@@ -23,6 +25,7 @@
 
 struct workqueue_struct *vfio_ccw_work_q;
 static struct kmem_cache *vfio_ccw_io_region;
+static struct kmem_cache *vfio_ccw_cmd_region;
 
 /*
  * Helpers
@@ -110,7 +113,7 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
 {
 	struct pmcw *pmcw = &sch->schib.pmcw;
 	struct vfio_ccw_private *private;
-	int ret;
+	int ret = -ENOMEM;
 
 	if (pmcw->qf) {
 		dev_warn(&sch->dev, "vfio: ccw: does not support QDIO: %s\n",
@@ -124,10 +127,13 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
 
 	private->io_region = kmem_cache_zalloc(vfio_ccw_io_region,
 					       GFP_KERNEL | GFP_DMA);
-	if (!private->io_region) {
-		kfree(private);
-		return -ENOMEM;
-	}
+	if (!private->io_region)
+		goto out_free;
+
+	private->cmd_region = kmem_cache_zalloc(vfio_ccw_cmd_region,
+						GFP_KERNEL | GFP_DMA);
+	if (!private->cmd_region)
+		goto out_free;
 
 	private->sch = sch;
 	dev_set_drvdata(&sch->dev, private);
@@ -155,7 +161,10 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
 	cio_disable_subchannel(sch);
 out_free:
 	dev_set_drvdata(&sch->dev, NULL);
-	kmem_cache_free(vfio_ccw_io_region, private->io_region);
+	if (private->cmd_region)
+		kmem_cache_free(vfio_ccw_cmd_region, private->cmd_region);
+	if (private->io_region)
+		kmem_cache_free(vfio_ccw_io_region, private->io_region);
 	kfree(private);
 	return ret;
 }
@@ -170,6 +179,7 @@ static int vfio_ccw_sch_remove(struct subchannel *sch)
 
 	dev_set_drvdata(&sch->dev, NULL);
 
+	kmem_cache_free(vfio_ccw_cmd_region, private->cmd_region);
 	kmem_cache_free(vfio_ccw_io_region, private->io_region);
 	kfree(private);
 
@@ -244,7 +254,7 @@ static struct css_driver vfio_ccw_sch_driver = {
 
 static int __init vfio_ccw_sch_init(void)
 {
-	int ret;
+	int ret = -ENOMEM;
 
 	vfio_ccw_work_q = create_singlethread_workqueue("vfio-ccw");
 	if (!vfio_ccw_work_q)
@@ -254,20 +264,30 @@ static int __init vfio_ccw_sch_init(void)
 					sizeof(struct ccw_io_region), 0,
 					SLAB_ACCOUNT, 0,
 					sizeof(struct ccw_io_region), NULL);
-	if (!vfio_ccw_io_region) {
-		destroy_workqueue(vfio_ccw_work_q);
-		return -ENOMEM;
-	}
+	if (!vfio_ccw_io_region)
+		goto out_err;
+
+	vfio_ccw_cmd_region = kmem_cache_create_usercopy("vfio_ccw_cmd_region",
+					sizeof(struct ccw_cmd_region), 0,
+					SLAB_ACCOUNT, 0,
+					sizeof(struct ccw_cmd_region), NULL);
+	if (!vfio_ccw_cmd_region)
+		goto out_err;
 
 	isc_register(VFIO_CCW_ISC);
 	ret = css_driver_register(&vfio_ccw_sch_driver);
 	if (ret) {
 		isc_unregister(VFIO_CCW_ISC);
-		kmem_cache_destroy(vfio_ccw_io_region);
-		destroy_workqueue(vfio_ccw_work_q);
+		goto out_err;
 	}
 
 	return ret;
+
+out_err:
+	kmem_cache_destroy(vfio_ccw_cmd_region);
+	kmem_cache_destroy(vfio_ccw_io_region);
+	destroy_workqueue(vfio_ccw_work_q);
+	return ret;
 }
 
 static void __exit vfio_ccw_sch_exit(void)
diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index b4a141fbd1a8..49d9d3da0282 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -3,8 +3,10 @@
  * Finite state machine for vfio-ccw device handling
  *
  * Copyright IBM Corp. 2017
+ * Copyright Red Hat, Inc. 2019
  *
  * Author(s): Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
+ *            Cornelia Huck <cohuck@redhat.com>
  */
 
 #include <linux/vfio.h>
@@ -73,6 +75,75 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
 	return ret;
 }
 
+static int fsm_do_halt(struct vfio_ccw_private *private)
+{
+	struct subchannel *sch;
+	unsigned long flags;
+	int ccode;
+	int ret;
+
+	sch = private->sch;
+
+	spin_lock_irqsave(sch->lock, flags);
+
+	/* Issue "Halt Subchannel" */
+	ccode = hsch(sch->schid);
+
+	switch (ccode) {
+	case 0:
+		/*
+		 * Initialize device status information
+		 */
+		sch->schib.scsw.cmd.actl |= SCSW_ACTL_HALT_PEND;
+		ret = 0;
+		break;
+	case 1:		/* Status pending */
+	case 2:		/* Busy */
+		ret = -EBUSY;
+		break;
+	case 3:		/* Device not operational */
+		ret = -ENODEV;
+		break;
+	default:
+		ret = ccode;
+	}
+	spin_unlock_irqrestore(sch->lock, flags);
+	return ret;
+}
+
+static int fsm_do_clear(struct vfio_ccw_private *private)
+{
+	struct subchannel *sch;
+	unsigned long flags;
+	int ccode;
+	int ret;
+
+	sch = private->sch;
+
+	spin_lock_irqsave(sch->lock, flags);
+
+	/* Issue "Clear Subchannel" */
+	ccode = csch(sch->schid);
+
+	switch (ccode) {
+	case 0:
+		/*
+		 * Initialize device status information
+		 */
+		sch->schib.scsw.cmd.actl = SCSW_ACTL_CLEAR_PEND;
+		/* TODO: check what else we might need to clear */
+		ret = 0;
+		break;
+	case 3:		/* Device not operational */
+		ret = -ENODEV;
+		break;
+	default:
+		ret = ccode;
+	}
+	spin_unlock_irqrestore(sch->lock, flags);
+	return ret;
+}
+
 static void fsm_notoper(struct vfio_ccw_private *private,
 			enum vfio_ccw_event event)
 {
@@ -113,6 +184,24 @@ static void fsm_io_retry(struct vfio_ccw_private *private,
 	private->io_region->ret_code = -EAGAIN;
 }
 
+static void fsm_async_error(struct vfio_ccw_private *private,
+			    enum vfio_ccw_event event)
+{
+	struct ccw_cmd_region *cmd_region = private->cmd_region;
+
+	pr_err("vfio-ccw: FSM: %s request from state:%d\n",
+	       cmd_region->command == VFIO_CCW_ASYNC_CMD_HSCH ? "halt" :
+	       cmd_region->command == VFIO_CCW_ASYNC_CMD_CSCH ? "clear" :
+	       "<unknown>", private->state);
+	cmd_region->ret_code = -EIO;
+}
+
+static void fsm_async_retry(struct vfio_ccw_private *private,
+			    enum vfio_ccw_event event)
+{
+	private->cmd_region->ret_code = -EAGAIN;
+}
+
 static void fsm_disabled_irq(struct vfio_ccw_private *private,
 			     enum vfio_ccw_event event)
 {
@@ -176,11 +265,11 @@ static void fsm_io_request(struct vfio_ccw_private *private,
 		}
 		return;
 	} else if (scsw->cmd.fctl & SCSW_FCTL_HALT_FUNC) {
-		/* XXX: Handle halt. */
+		/* halt is handled via the async cmd region */
 		io_region->ret_code = -EOPNOTSUPP;
 		goto err_out;
 	} else if (scsw->cmd.fctl & SCSW_FCTL_CLEAR_FUNC) {
-		/* XXX: Handle clear. */
+		/* clear is handled via the async cmd region */
 		io_region->ret_code = -EOPNOTSUPP;
 		goto err_out;
 	}
@@ -190,6 +279,27 @@ static void fsm_io_request(struct vfio_ccw_private *private,
 			       io_region->ret_code, errstr);
 }
 
+/*
+ * Deal with an async request from userspace.
+ */
+static void fsm_async_request(struct vfio_ccw_private *private,
+			      enum vfio_ccw_event event)
+{
+	struct ccw_cmd_region *cmd_region = private->cmd_region;
+
+	switch (cmd_region->command) {
+	case VFIO_CCW_ASYNC_CMD_HSCH:
+		cmd_region->ret_code = fsm_do_halt(private);
+		break;
+	case VFIO_CCW_ASYNC_CMD_CSCH:
+		cmd_region->ret_code = fsm_do_clear(private);
+		break;
+	default:
+		/* should not happen? */
+		cmd_region->ret_code = -EINVAL;
+	}
+}
+
 /*
  * Got an interrupt for a normal io (state busy).
  */
@@ -213,26 +323,31 @@ fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
 	[VFIO_CCW_STATE_NOT_OPER] = {
 		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_nop,
 		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_error,
+		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_error,
 		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_disabled_irq,
 	},
 	[VFIO_CCW_STATE_STANDBY] = {
 		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
 		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_error,
+		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_error,
 		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
 	},
 	[VFIO_CCW_STATE_IDLE] = {
 		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
 		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_request,
+		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_request,
 		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
 	},
 	[VFIO_CCW_STATE_CP_PROCESSING] = {
 		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
 		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_retry,
+		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_retry,
 		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
 	},
 	[VFIO_CCW_STATE_CP_PENDING] = {
 		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
 		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_busy,
+		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_request,
 		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
 	},
 };
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 48b2d7930ea8..2c7bdbd9e402 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -148,11 +148,20 @@ static int vfio_ccw_mdev_open(struct mdev_device *mdev)
 	struct vfio_ccw_private *private =
 		dev_get_drvdata(mdev_parent_dev(mdev));
 	unsigned long events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
+	int ret;
 
 	private->nb.notifier_call = vfio_ccw_mdev_notifier;
 
-	return vfio_register_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
-				      &events, &private->nb);
+	ret = vfio_register_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
+				     &events, &private->nb);
+	if (ret)
+		return ret;
+
+	ret = vfio_ccw_register_async_dev_regions(private);
+	if (ret)
+		vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
+					 &private->nb);
+	return ret;
 }
 
 static void vfio_ccw_mdev_release(struct mdev_device *mdev)
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index c979eb32fb1c..bdcb05dcaf29 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -53,6 +53,8 @@ int vfio_ccw_register_dev_region(struct vfio_ccw_private *private,
 				 const struct vfio_ccw_regops *ops,
 				 size_t size, u32 flags, void *data);
 
+int vfio_ccw_register_async_dev_regions(struct vfio_ccw_private *private);
+
 /**
  * struct vfio_ccw_private
  * @sch: pointer to the subchannel
@@ -64,6 +66,7 @@ int vfio_ccw_register_dev_region(struct vfio_ccw_private *private,
  * @io_region: MMIO region to input/output I/O arguments/results
  * @io_mutex: protect against concurrent update of I/O regions
  * @region: additional regions for other subchannel operations
+ * @cmd_region: MMIO region for asynchronous I/O commands other than START
  * @num_regions: number of additional regions
  * @cp: channel program for the current I/O operation
  * @irb: irb info received from interrupt
@@ -81,6 +84,7 @@ struct vfio_ccw_private {
 	struct ccw_io_region	*io_region;
 	struct mutex		io_mutex;
 	struct vfio_ccw_region *region;
+	struct ccw_cmd_region	*cmd_region;
 	int num_regions;
 
 	struct channel_program	cp;
@@ -116,6 +120,7 @@ enum vfio_ccw_event {
 	VFIO_CCW_EVENT_NOT_OPER,
 	VFIO_CCW_EVENT_IO_REQ,
 	VFIO_CCW_EVENT_INTERRUPT,
+	VFIO_CCW_EVENT_ASYNC_REQ,
 	/* last element! */
 	NR_VFIO_CCW_EVENTS
 };
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 56e2413d3e00..8f10748dac79 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -354,6 +354,8 @@ struct vfio_region_gfx_edid {
 };
 
 #define VFIO_REGION_TYPE_CCW			(2)
+/* ccw sub-types */
+#define VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD	(1)
 
 /*
  * 10de vendor sub-type
diff --git a/include/uapi/linux/vfio_ccw.h b/include/uapi/linux/vfio_ccw.h
index 2ec5f367ff78..cbecbf0cd54f 100644
--- a/include/uapi/linux/vfio_ccw.h
+++ b/include/uapi/linux/vfio_ccw.h
@@ -12,6 +12,7 @@
 
 #include <linux/types.h>
 
+/* used for START SUBCHANNEL, always present */
 struct ccw_io_region {
 #define ORB_AREA_SIZE 12
 	__u8	orb_area[ORB_AREA_SIZE];
@@ -22,4 +23,15 @@ struct ccw_io_region {
 	__u32	ret_code;
 } __packed;
 
+/*
+ * used for processing commands that trigger asynchronous actions
+ * Note: this is controlled by a capability
+ */
+#define VFIO_CCW_ASYNC_CMD_HSCH (1 << 0)
+#define VFIO_CCW_ASYNC_CMD_CSCH (1 << 1)
+struct ccw_cmd_region {
+	__u32 command;
+	__u32 ret_code;
+} __packed;
+
 #endif
-- 
2.17.2



* Re: [PATCH v3 6/6] vfio-ccw: add handling for async channel instructions
  2019-01-30 13:22   ` [Qemu-devel] " Cornelia Huck
@ 2019-01-30 17:00     ` Halil Pasic
  -1 siblings, 0 replies; 70+ messages in thread
From: Halil Pasic @ 2019-01-30 17:00 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: linux-s390, Eric Farman, Alex Williamson, Pierre Morel, kvm,
	Farhan Ali, qemu-devel, qemu-s390x

On Wed, 30 Jan 2019 14:22:12 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> +static void fsm_async_retry(struct vfio_ccw_private *private,
> +			    enum vfio_ccw_event event)
> +{
> +	private->cmd_region->ret_code = -EAGAIN;
> +}
> +

This is essentially dead code at the moment, isn't it? I mean we hold the
io_mutex whenever we are in state VFIO_CCW_STATE_CP_PROCESSING. And we
call vfio_ccw_fsm_event() under the very same mutex.



> @@ -213,26 +323,31 @@ fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
>  	[VFIO_CCW_STATE_NOT_OPER] = {
>  		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_nop,
>  		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_error,
> +		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_error,
>  		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_disabled_irq,
>  	},
>  	[VFIO_CCW_STATE_STANDBY] = {
>  		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
>  		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_error,
> +		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_error,
>  		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
>  	},
>  	[VFIO_CCW_STATE_IDLE] = {
>  		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
>  		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_request,
> +		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_request,
>  		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
>  	},
>  	[VFIO_CCW_STATE_CP_PROCESSING] = {
>  		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
>  		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_retry,
> +		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_retry,

Used here.

Regards,
Halil

>  		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
>  	},

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 6/6] vfio-ccw: add handling for async channel instructions
  2019-01-30 13:22   ` [Qemu-devel] " Cornelia Huck
@ 2019-01-30 17:09     ` Halil Pasic
  -1 siblings, 0 replies; 70+ messages in thread
From: Halil Pasic @ 2019-01-30 17:09 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: linux-s390, Eric Farman, Alex Williamson, Pierre Morel, kvm,
	Farhan Ali, qemu-devel, qemu-s390x

On Wed, 30 Jan 2019 14:22:12 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> +static void fsm_async_retry(struct vfio_ccw_private *private,
> +			    enum vfio_ccw_event event)
> +{
> +	private->cmd_region->ret_code = -EAGAIN;
> +}
> +

This is essentially dead code at the moment, isn't it? I mean we hold the
io_mutex whenever we are in state VFIO_CCW_STATE_CP_PROCESSING, and we
call vfio_ccw_fsm_event() with the very same mutex held.

> @@ -213,26 +323,31 @@ fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
>  	[VFIO_CCW_STATE_NOT_OPER] = {
>  		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_nop,
>  		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_error,
> +		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_error,
>  		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_disabled_irq,
>  	},
>  	[VFIO_CCW_STATE_STANDBY] = {
>  		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
>  		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_error,
> +		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_error,
>  		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
>  	},
>  	[VFIO_CCW_STATE_IDLE] = {
>  		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
>  		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_request,
> +		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_request,
>  		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
>  	},
>  	[VFIO_CCW_STATE_CP_PROCESSING] = {
>  		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
>  		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_retry,
> +		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_retry,

Used here.

Regards,
Halil

>  		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
>  	},

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
  2019-01-30 13:22   ` [Qemu-devel] " Cornelia Huck
@ 2019-01-30 18:51     ` Halil Pasic
  -1 siblings, 0 replies; 70+ messages in thread
From: Halil Pasic @ 2019-01-30 18:51 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: linux-s390, Eric Farman, Alex Williamson, Pierre Morel, kvm,
	Farhan Ali, qemu-devel, qemu-s390x

On Wed, 30 Jan 2019 14:22:07 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> When we get a solicited interrupt, the start function may have
> been cleared by a csch, but we still have a channel program
> structure allocated. Make it safe to call the cp accessors in
> any case, so we can call them unconditionally.

I read this as saying it is supposed to be safe regardless of
parallelism and threads. However, I don't see any explicit
synchronization done for cp->initialized.

I've managed to figure out how that is supposed to be safe
for the cp_free() (which is probably our main concern) in
vfio_ccw_sch_io_todo(), but I fail when it comes to the one
in vfio_ccw_mdev_notifier().

Can you explain to us how the synchronization works?

Regards,
Halil

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
  2019-01-30 18:51     ` [Qemu-devel] " Halil Pasic
@ 2019-01-31 11:52       ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-01-31 11:52 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Eric Farman, Alex Williamson, Pierre Morel, kvm,
	Farhan Ali, qemu-devel, qemu-s390x

On Wed, 30 Jan 2019 19:51:27 +0100
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 30 Jan 2019 14:22:07 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > When we get a solicited interrupt, the start function may have
> > been cleared by a csch, but we still have a channel program
> > structure allocated. Make it safe to call the cp accessors in
> > any case, so we can call them unconditionally.  
> 
> I read this as saying it is supposed to be safe regardless of
> parallelism and threads. However, I don't see any explicit
> synchronization done for cp->initialized.
> 
> I've managed to figure out how that is supposed to be safe
> for the cp_free() (which is probably our main concern) in
> vfio_ccw_sch_io_todo(), but I fail when it comes to the one
> in vfio_ccw_mdev_notifier().
> 
> Can you explain to us how the synchronization works?

You read that wrong, I don't add synchronization, I just add a check.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 6/6] vfio-ccw: add handling for async channel instructions
  2019-01-30 17:09     ` [Qemu-devel] " Halil Pasic
@ 2019-01-31 11:53       ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-01-31 11:53 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Eric Farman, Alex Williamson, Pierre Morel, kvm,
	Farhan Ali, qemu-devel, qemu-s390x

On Wed, 30 Jan 2019 18:09:31 +0100
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 30 Jan 2019 14:22:12 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > +static void fsm_async_retry(struct vfio_ccw_private *private,
> > +			    enum vfio_ccw_event event)
> > +{
> > +	private->cmd_region->ret_code = -EAGAIN;
> > +}
> > +  
> 
> This is essentially dead code at the moment, isn't it? I mean we hold the
> io_mutex whenever we are in state VFIO_CCW_STATE_CP_PROCESSING, and we
> call vfio_ccw_fsm_event() with the very same mutex held.

Well, I did have to put something in the state machine...

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
  2019-01-31 11:52       ` [Qemu-devel] " Cornelia Huck
@ 2019-01-31 12:34         ` Halil Pasic
  -1 siblings, 0 replies; 70+ messages in thread
From: Halil Pasic @ 2019-01-31 12:34 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: linux-s390, Eric Farman, Alex Williamson, Pierre Morel, kvm,
	Farhan Ali, qemu-devel, qemu-s390x

On Thu, 31 Jan 2019 12:52:20 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> On Wed, 30 Jan 2019 19:51:27 +0100
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Wed, 30 Jan 2019 14:22:07 +0100
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > When we get a solicited interrupt, the start function may have
> > > been cleared by a csch, but we still have a channel program
> > > structure allocated. Make it safe to call the cp accessors in
> > > any case, so we can call them unconditionally.  
> > 
> > I read this as saying it is supposed to be safe regardless of
> > parallelism and threads. However, I don't see any explicit
> > synchronization done for cp->initialized.
> > 
> > I've managed to figure out how that is supposed to be safe
> > for the cp_free() (which is probably our main concern) in
> > vfio_ccw_sch_io_todo(), but I fail when it comes to the one
> > in vfio_ccw_mdev_notifier().
> > 
> > Can you explain to us how the synchronization works?
> 
> You read that wrong, I don't add synchronization, I just add a check.
> 

Now I'm confused. Does that mean we don't need synchronization for this?

Regards,
Halil

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
  2019-01-31 12:34         ` [Qemu-devel] " Halil Pasic
@ 2019-02-04 15:31           ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-02-04 15:31 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Eric Farman, Alex Williamson, Pierre Morel, kvm,
	Farhan Ali, qemu-devel, qemu-s390x

On Thu, 31 Jan 2019 13:34:55 +0100
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Thu, 31 Jan 2019 12:52:20 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Wed, 30 Jan 2019 19:51:27 +0100
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Wed, 30 Jan 2019 14:22:07 +0100
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >   
> > > > When we get a solicited interrupt, the start function may have
> > > > been cleared by a csch, but we still have a channel program
> > > > structure allocated. Make it safe to call the cp accessors in
> > > > any case, so we can call them unconditionally.    
> > > 
> > > I read this as saying it is supposed to be safe regardless of
> > > parallelism and threads. However, I don't see any explicit
> > > synchronization done for cp->initialized.
> > > 
> > > I've managed to figure out how that is supposed to be safe
> > > for the cp_free() (which is probably our main concern) in
> > > vfio_ccw_sch_io_todo(), but I fail when it comes to the one
> > > in vfio_ccw_mdev_notifier().
> > > 
> > > Can you explain to us how the synchronization works?  
> > 
> > You read that wrong, I don't add synchronization, I just add a check.
> >   
> 
> Now I'm confused. Does that mean we don't need synchronization for this?

If we lack synchronization (that is not provided by the current state
machine handling, or the rework here), we should do a patch on top
(preferably on top of the whole series, so this does not get even more
tangled up). This is really just about the extra check.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
  2019-01-30 13:22   ` [Qemu-devel] " Cornelia Huck
@ 2019-02-04 19:25     ` Eric Farman
  -1 siblings, 0 replies; 70+ messages in thread
From: Eric Farman @ 2019-02-04 19:25 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic, Farhan Ali, Pierre Morel
  Cc: linux-s390, qemu-s390x, Alex Williamson, qemu-devel, kvm



On 01/30/2019 08:22 AM, Cornelia Huck wrote:
> When we get a solicited interrupt, the start function may have
> been cleared by a csch, but we still have a channel program
> structure allocated. Make it safe to call the cp accessors in
> any case, so we can call them unconditionally.
> 
> While at it, also make sure that functions called from other parts
> of the code return gracefully if the channel program structure
> has not been initialized (even though that is a bug in the caller).
> 
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> ---
>   drivers/s390/cio/vfio_ccw_cp.c  | 20 +++++++++++++++++++-
>   drivers/s390/cio/vfio_ccw_cp.h  |  2 ++
>   drivers/s390/cio/vfio_ccw_fsm.c |  5 +++++
>   3 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index ba08fe137c2e..0bc0c38edda7 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -335,6 +335,7 @@ static void cp_unpin_free(struct channel_program *cp)
>   	struct ccwchain *chain, *temp;
>   	int i;
>   
> +	cp->initialized = false;
>   	list_for_each_entry_safe(chain, temp, &cp->ccwchain_list, next) {
>   		for (i = 0; i < chain->ch_len; i++) {
>   			pfn_array_table_unpin_free(chain->ch_pat + i,
> @@ -701,6 +702,8 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)
>   	 */
>   	cp->orb.cmd.c64 = 1;
>   
> +	cp->initialized = true;
> +

Not seen in this hunk, but we call ccwchain_loop_tic() just prior to 
this point.  If that returns non-zero, we call cp_unpin_free()[1] (and 
set initialized to false), and then fall through to here.  So this is 
going to set initialized to true, even though we're taking an error 
path.  :-(

[1] Wait, why is it calling cp_unpin_free()?  Oh, I had proposed 
squashing cp_free() and cp_unpin_free() back in November[2], got an r-b 
from Pierre but haven't gotten back to tidy up the series for a v2. 
Okay, I'll try to do that again soon.  :-)
[2] https://patchwork.kernel.org/patch/10675261/

>   	return ret;
>   }
>   
> @@ -715,7 +718,8 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)
>    */
>   void cp_free(struct channel_program *cp)
>   {
> -	cp_unpin_free(cp);
> +	if (cp->initialized)
> +		cp_unpin_free(cp);
>   }
>   
>   /**
> @@ -760,6 +764,10 @@ int cp_prefetch(struct channel_program *cp)
>   	struct ccwchain *chain;
>   	int len, idx, ret;
>   
> +	/* this is an error in the caller */
> +	if (!cp || !cp->initialized)
> +		return -EINVAL;
> +
>   	list_for_each_entry(chain, &cp->ccwchain_list, next) {
>   		len = chain->ch_len;
>   		for (idx = 0; idx < len; idx++) {
> @@ -795,6 +803,10 @@ union orb *cp_get_orb(struct channel_program *cp, u32 intparm, u8 lpm)
>   	struct ccwchain *chain;
>   	struct ccw1 *cpa;
>   
> +	/* this is an error in the caller */
> +	if (!cp || !cp->initialized)
> +		return NULL;
> +
>   	orb = &cp->orb;
>   
>   	orb->cmd.intparm = intparm;
> @@ -831,6 +843,9 @@ void cp_update_scsw(struct channel_program *cp, union scsw *scsw)
>   	u32 cpa = scsw->cmd.cpa;
>   	u32 ccw_head, ccw_tail;
>   
> +	if (!cp->initialized)
> +		return;
> +
>   	/*
>   	 * LATER:
>   	 * For now, only update the cmd.cpa part. We may need to deal with
> @@ -869,6 +884,9 @@ bool cp_iova_pinned(struct channel_program *cp, u64 iova)
>   	struct ccwchain *chain;
>   	int i;
>   
> +	if (!cp->initialized)

So, two of the checks added above look for a nonzero cp pointer prior to 
checking initialized, while two don't.  I guess cp can't be NULL, since 
it's embedded in the private struct directly and that's only freed when 
we do vfio_ccw_sch_remove() ... But I guess some consistency in how we 
do these checks would be nice.

> +		return false;
> +
>   	list_for_each_entry(chain, &cp->ccwchain_list, next) {
>   		for (i = 0; i < chain->ch_len; i++)
>   			if (pfn_array_table_iova_pinned(chain->ch_pat + i,
> diff --git a/drivers/s390/cio/vfio_ccw_cp.h b/drivers/s390/cio/vfio_ccw_cp.h
> index a4b74fb1aa57..3c20cd208da5 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.h
> +++ b/drivers/s390/cio/vfio_ccw_cp.h
> @@ -21,6 +21,7 @@
>    * @ccwchain_list: list head of ccwchains
>    * @orb: orb for the currently processed ssch request
>    * @mdev: the mediated device to perform page pinning/unpinning
> + * @initialized: whether this instance is actually initialized
>    *
>    * @ccwchain_list is the head of a ccwchain list, that contents the
>    * translated result of the guest channel program that pointed out by
> @@ -30,6 +31,7 @@ struct channel_program {
>   	struct list_head ccwchain_list;
>   	union orb orb;
>   	struct device *mdev;
> +	bool initialized;
>   };
>   
>   extern int cp_init(struct channel_program *cp, struct device *mdev,
> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> index cab17865aafe..e7c9877c9f1e 100644
> --- a/drivers/s390/cio/vfio_ccw_fsm.c
> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> @@ -31,6 +31,10 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
>   	private->state = VFIO_CCW_STATE_BUSY;
>   
>   	orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
> +	if (!orb) {
> +		ret = -EIO;
> +		goto out;
> +	}
>   
>   	/* Issue "Start Subchannel" */
>   	ccode = ssch(sch->schid, orb);
> @@ -64,6 +68,7 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
>   	default:
>   		ret = ccode;
>   	}
> +out:
>   	spin_unlock_irqrestore(sch->lock, flags);
>   	return ret;
>   }
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/6] vfio-ccw: rework ssch state handling
  2019-01-30 13:22   ` [Qemu-devel] " Cornelia Huck
@ 2019-02-04 21:29     ` Eric Farman
  -1 siblings, 0 replies; 70+ messages in thread
From: Eric Farman @ 2019-02-04 21:29 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic, Farhan Ali, Pierre Morel
  Cc: linux-s390, qemu-s390x, Alex Williamson, qemu-devel, kvm



On 01/30/2019 08:22 AM, Cornelia Huck wrote:
> The flow for processing ssch requests can be improved by splitting
> the BUSY state:
> 
> - CP_PROCESSING: We reject any user space requests while we are in
>    the process of translating a channel program and submitting it to
>    the hardware. Use -EAGAIN to signal user space that it should
>    retry the request.
> - CP_PENDING: We have successfully submitted a request with ssch and
>    are now expecting an interrupt. As we can't handle more than one
>    channel program being processed, reject any further requests with
>    -EBUSY. A final interrupt will move us out of this state; this also
>    fixes a latent bug where a non-final interrupt might have freed up
>    a channel program that still was in progress.
>    By making this a separate state, we make it possible to issue a
>    halt or a clear while we're still waiting for the final interrupt
>    for the ssch (in a follow-on patch).
> 
> It also makes a lot of sense not to preemptively filter out writes to
> the io_region if we're in an incorrect state: the state machine will
> handle this correctly.
> 
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> ---
>   drivers/s390/cio/vfio_ccw_drv.c     |  8 ++++++--
>   drivers/s390/cio/vfio_ccw_fsm.c     | 19 ++++++++++++++-----
>   drivers/s390/cio/vfio_ccw_ops.c     |  2 --
>   drivers/s390/cio/vfio_ccw_private.h |  3 ++-
>   4 files changed, 22 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
> index a10cec0e86eb..0b3b9de45c60 100644
> --- a/drivers/s390/cio/vfio_ccw_drv.c
> +++ b/drivers/s390/cio/vfio_ccw_drv.c
> @@ -72,20 +72,24 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>   {
>   	struct vfio_ccw_private *private;
>   	struct irb *irb;
> +	bool is_final;
>   
>   	private = container_of(work, struct vfio_ccw_private, io_work);
>   	irb = &private->irb;
>   
> +	is_final = !(scsw_actl(&irb->scsw) &
> +		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
>   	if (scsw_is_solicited(&irb->scsw)) {
>   		cp_update_scsw(&private->cp, &irb->scsw);
> -		cp_free(&private->cp);
> +		if (is_final)
> +			cp_free(&private->cp);
>   	}
>   	memcpy(private->io_region->irb_area, irb, sizeof(*irb));
>   
>   	if (private->io_trigger)
>   		eventfd_signal(private->io_trigger, 1);
>   
> -	if (private->mdev)
> +	if (private->mdev && is_final)
>   		private->state = VFIO_CCW_STATE_IDLE;
>   }
>   
> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> index e7c9877c9f1e..b4a141fbd1a8 100644
> --- a/drivers/s390/cio/vfio_ccw_fsm.c
> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> @@ -28,7 +28,6 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
>   	sch = private->sch;
>   
>   	spin_lock_irqsave(sch->lock, flags);
> -	private->state = VFIO_CCW_STATE_BUSY;
>   
>   	orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
>   	if (!orb) {
> @@ -46,6 +45,7 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
>   		 */
>   		sch->schib.scsw.cmd.actl |= SCSW_ACTL_START_PEND;
>   		ret = 0;
> +		private->state = VFIO_CCW_STATE_CP_PENDING;

[1]

>   		break;
>   	case 1:		/* Status pending */
>   	case 2:		/* Busy */
> @@ -107,6 +107,12 @@ static void fsm_io_busy(struct vfio_ccw_private *private,
>   	private->io_region->ret_code = -EBUSY;
>   }
>   
> +static void fsm_io_retry(struct vfio_ccw_private *private,
> +			 enum vfio_ccw_event event)
> +{
> +	private->io_region->ret_code = -EAGAIN;
> +}
> +
>   static void fsm_disabled_irq(struct vfio_ccw_private *private,
>   			     enum vfio_ccw_event event)
>   {
> @@ -135,8 +141,7 @@ static void fsm_io_request(struct vfio_ccw_private *private,
>   	struct mdev_device *mdev = private->mdev;
>   	char *errstr = "request";
>   
> -	private->state = VFIO_CCW_STATE_BUSY;
> -
> +	private->state = VFIO_CCW_STATE_CP_PROCESSING;

[1]

>   	memcpy(scsw, io_region->scsw_area, sizeof(*scsw));
>   
>   	if (scsw->cmd.fctl & SCSW_FCTL_START_FUNC) {
> @@ -181,7 +186,6 @@ static void fsm_io_request(struct vfio_ccw_private *private,
>   	}
>   
>   err_out:
> -	private->state = VFIO_CCW_STATE_IDLE;

[1] Revisiting these locations from an earlier discussion [2]... 
These go IDLE->CP_PROCESSING->CP_PENDING if we get a cc=0 on the SSCH, 
but we stop in CP_PROCESSING if the SSCH gets a nonzero cc.  Shouldn't 
we clean up and go back to IDLE in this scenario, rather than forcing 
userspace to escalate to CSCH/HSCH after some number of retries (via FSM)?

[2] https://patchwork.kernel.org/patch/10773611/#22447997

Besides that, I think this looks good to me.

  - Eric

>   	trace_vfio_ccw_io_fctl(scsw->cmd.fctl, get_schid(private),
>   			       io_region->ret_code, errstr);
>   }
> @@ -221,7 +225,12 @@ fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
>   		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_request,
>   		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
>   	},
> -	[VFIO_CCW_STATE_BUSY] = {
> +	[VFIO_CCW_STATE_CP_PROCESSING] = {
> +		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
> +		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_retry,
> +		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
> +	},
> +	[VFIO_CCW_STATE_CP_PENDING] = {
>   		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
>   		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_busy,
>   		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
> diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
> index f673e106c041..3fdcc6dfe0bf 100644
> --- a/drivers/s390/cio/vfio_ccw_ops.c
> +++ b/drivers/s390/cio/vfio_ccw_ops.c
> @@ -193,8 +193,6 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
>   		return -EINVAL;
>   
>   	private = dev_get_drvdata(mdev_parent_dev(mdev));
> -	if (private->state != VFIO_CCW_STATE_IDLE)
> -		return -EACCES;
>   
>   	region = private->io_region;
>   	if (copy_from_user((void *)region + *ppos, buf, count))
> diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
> index 08e9a7dc9176..50c52efb4fcb 100644
> --- a/drivers/s390/cio/vfio_ccw_private.h
> +++ b/drivers/s390/cio/vfio_ccw_private.h
> @@ -63,7 +63,8 @@ enum vfio_ccw_state {
>   	VFIO_CCW_STATE_NOT_OPER,
>   	VFIO_CCW_STATE_STANDBY,
>   	VFIO_CCW_STATE_IDLE,
> -	VFIO_CCW_STATE_BUSY,
> +	VFIO_CCW_STATE_CP_PROCESSING,
> +	VFIO_CCW_STATE_CP_PENDING,
>   	/* last element! */
>   	NR_VFIO_CCW_STATES
>   };
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
  2019-02-04 15:31           ` [Qemu-devel] " Cornelia Huck
@ 2019-02-05 11:52             ` Halil Pasic
  -1 siblings, 0 replies; 70+ messages in thread
From: Halil Pasic @ 2019-02-05 11:52 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: linux-s390, Eric Farman, kvm, Pierre Morel, qemu-s390x,
	Farhan Ali, qemu-devel, Alex Williamson

On Mon, 4 Feb 2019 16:31:02 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> On Thu, 31 Jan 2019 13:34:55 +0100
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Thu, 31 Jan 2019 12:52:20 +0100
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > On Wed, 30 Jan 2019 19:51:27 +0100
> > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > >   
> > > > On Wed, 30 Jan 2019 14:22:07 +0100
> > > > Cornelia Huck <cohuck@redhat.com> wrote:
> > > >   
> > > > > When we get a solicited interrupt, the start function may have
> > > > > been cleared by a csch, but we still have a channel program
> > > > > structure allocated. Make it safe to call the cp accessors in
> > > > > any case, so we can call them unconditionally.    
> > > > 
> > > > I read this like it is supposed to be safe regardless of
> > > > parallelism and threads. However I don't see any explicit
> > > > synchronization done for cp->initialized.
> > > > 
> > > > I've managed to figure out how that is supposed to be safe
> > > > for the cp_free() (which is probably our main concern) in
> > > > vfio_ccw_sch_io_todo(), but I fail when it comes to the one
> > > > in vfio_ccw_mdev_notifier().
> > > > 
> > > > Can you explain to us how the synchronization works?  
> > > 
> > > You read that wrong, I don't add synchronization, I just add a check.
> > >   
> > 
> > Now I'm confused. Does that mean we don't need synchronization for this?
> 
> If we lack synchronization (that is not provided by the current state
> machine handling, or the rework here), we should do a patch on top
> (preferably on top of the whole series, so this does not get even more
> tangled up.) This is really just about the extra check.
> 

I'm not a huge fan of keeping or introducing races -- it makes things
difficult to reason about, but I do have some understanding of your
position.

This patch-series is AFAICT a big improvement over what we have. I would
like Farhan to confirm that it makes the hiccups he used to hit, where
another ssch request got BUSY, disappear. If it does (I hope it does)
it's definitely a good thing for anybody who wants to use vfio-ccw.

Yet I find it difficult to slap my r-b over racy code, or partial
solutions. In the latter case, when I lack conceptual clarity, I find it
difficult to tell if we are heading in the right direction, or whether
what we build today is going to turn against us tomorrow. Sorry for being a drag.

Regards,
Halil

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
  2019-02-04 19:25     ` [Qemu-devel] " Eric Farman
@ 2019-02-05 12:03       ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-02-05 12:03 UTC (permalink / raw)
  To: Eric Farman
  Cc: linux-s390, Alex Williamson, Pierre Morel, kvm, Farhan Ali,
	qemu-devel, Halil Pasic, qemu-s390x

On Mon, 4 Feb 2019 14:25:34 -0500
Eric Farman <farman@linux.ibm.com> wrote:

> On 01/30/2019 08:22 AM, Cornelia Huck wrote:
> > When we get a solicited interrupt, the start function may have
> > been cleared by a csch, but we still have a channel program
> > structure allocated. Make it safe to call the cp accessors in
> > any case, so we can call them unconditionally.
> > 
> > While at it, also make sure that functions called from other parts
> > of the code return gracefully if the channel program structure
> > has not been initialized (even though that is a bug in the caller).
> > 
> > Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> > ---
> >   drivers/s390/cio/vfio_ccw_cp.c  | 20 +++++++++++++++++++-
> >   drivers/s390/cio/vfio_ccw_cp.h  |  2 ++
> >   drivers/s390/cio/vfio_ccw_fsm.c |  5 +++++
> >   3 files changed, 26 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> > index ba08fe137c2e..0bc0c38edda7 100644
> > --- a/drivers/s390/cio/vfio_ccw_cp.c
> > +++ b/drivers/s390/cio/vfio_ccw_cp.c
> > @@ -335,6 +335,7 @@ static void cp_unpin_free(struct channel_program *cp)
> >   	struct ccwchain *chain, *temp;
> >   	int i;
> >   
> > +	cp->initialized = false;
> >   	list_for_each_entry_safe(chain, temp, &cp->ccwchain_list, next) {
> >   		for (i = 0; i < chain->ch_len; i++) {
> >   			pfn_array_table_unpin_free(chain->ch_pat + i,
> > @@ -701,6 +702,8 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)
> >   	 */
> >   	cp->orb.cmd.c64 = 1;
> >   
> > +	cp->initialized = true;
> > +  
> 
> Not seen in this hunk, but we call ccwchain_loop_tic() just prior to 
> this point.  If that returns non-zero, we call cp_unpin_free()[1] (and 
> set initialized to false), and then fall through to here.  So this is 
> going to set initialized to true, even though we're taking an error 
> path.  :-(

Eek, setting c64 unconditionally threw me off. This needs to check
for !ret, of course.

> 
> [1] Wait, why is it calling cp_unpin_free()?  Oh, I had proposed 
> squashing cp_free() and cp_unpin_free() back in November[2], got an r-b 
> from Pierre but haven't gotten back to tidy up the series for a v2. 
> Okay, I'll try to do that again soon.  :-)

:)

> [2] https://patchwork.kernel.org/patch/10675261/
> 
> >   	return ret;
> >   }
> >   
> > @@ -715,7 +718,8 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)
> >    */
> >   void cp_free(struct channel_program *cp)
> >   {
> > -	cp_unpin_free(cp);
> > +	if (cp->initialized)
> > +		cp_unpin_free(cp);
> >   }
> >   
> >   /**
> > @@ -760,6 +764,10 @@ int cp_prefetch(struct channel_program *cp)
> >   	struct ccwchain *chain;
> >   	int len, idx, ret;
> >   
> > +	/* this is an error in the caller */
> > +	if (!cp || !cp->initialized)
> > +		return -EINVAL;
> > +
> >   	list_for_each_entry(chain, &cp->ccwchain_list, next) {
> >   		len = chain->ch_len;
> >   		for (idx = 0; idx < len; idx++) {
> > @@ -795,6 +803,10 @@ union orb *cp_get_orb(struct channel_program *cp, u32 intparm, u8 lpm)
> >   	struct ccwchain *chain;
> >   	struct ccw1 *cpa;
> >   
> > +	/* this is an error in the caller */
> > +	if (!cp || !cp->initialized)
> > +		return NULL;
> > +
> >   	orb = &cp->orb;
> >   
> >   	orb->cmd.intparm = intparm;
> > @@ -831,6 +843,9 @@ void cp_update_scsw(struct channel_program *cp, union scsw *scsw)
> >   	u32 cpa = scsw->cmd.cpa;
> >   	u32 ccw_head, ccw_tail;
> >   
> > +	if (!cp->initialized)
> > +		return;
> > +
> >   	/*
> >   	 * LATER:
> >   	 * For now, only update the cmd.cpa part. We may need to deal with
> > @@ -869,6 +884,9 @@ bool cp_iova_pinned(struct channel_program *cp, u64 iova)
> >   	struct ccwchain *chain;
> >   	int i;
> >   
> > +	if (!cp->initialized)  
> 
> So, two of the checks added above look for a nonzero cp pointer prior to 
> checking initialized, while two don't.  I guess cp can't be NULL, since 
> it's embedded in the private struct directly and that's only free'd when 
> we do vfio_ccw_sch_remove() ... But I guess some consistency in how we 
> look would be nice.

The idea was: In which context is this called? Is there a legitimate
reason for the caller to pass in an uninitialized cp, or would that
mean the caller had messed up (and we should not trust cp to be !NULL
either?)

But you're right, that does look inconsistent. Always checking for
cp != NULL probably looks least odd, although it is overkill. Opinions?

> 
> > +		return false;
> > +
> >   	list_for_each_entry(chain, &cp->ccwchain_list, next) {
> >   		for (i = 0; i < chain->ch_len; i++)
> >   			if (pfn_array_table_iova_pinned(chain->ch_pat + i,

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/6] vfio-ccw: rework ssch state handling
  2019-02-04 21:29     ` [Qemu-devel] " Eric Farman
@ 2019-02-05 12:10       ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-02-05 12:10 UTC (permalink / raw)
  To: Eric Farman
  Cc: linux-s390, Alex Williamson, Pierre Morel, kvm, Farhan Ali,
	qemu-devel, Halil Pasic, qemu-s390x

On Mon, 4 Feb 2019 16:29:40 -0500
Eric Farman <farman@linux.ibm.com> wrote:

> On 01/30/2019 08:22 AM, Cornelia Huck wrote:
> > The flow for processing ssch requests can be improved by splitting
> > the BUSY state:
> > 
> > - CP_PROCESSING: We reject any user space requests while we are in
> >    the process of translating a channel program and submitting it to
> >    the hardware. Use -EAGAIN to signal user space that it should
> >    retry the request.
> > - CP_PENDING: We have successfully submitted a request with ssch and
> >    are now expecting an interrupt. As we can't handle more than one
> >    channel program being processed, reject any further requests with
> >    -EBUSY. A final interrupt will move us out of this state; this also
> >    fixes a latent bug where a non-final interrupt might have freed up
> >    a channel program that still was in progress.
> >    By making this a separate state, we make it possible to issue a
> >    halt or a clear while we're still waiting for the final interrupt
> >    for the ssch (in a follow-on patch).
> > 
> > It also makes a lot of sense not to preemptively filter out writes to
> > the io_region if we're in an incorrect state: the state machine will
> > handle this correctly.
> > 
> > Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> > ---
> >   drivers/s390/cio/vfio_ccw_drv.c     |  8 ++++++--
> >   drivers/s390/cio/vfio_ccw_fsm.c     | 19 ++++++++++++++-----
> >   drivers/s390/cio/vfio_ccw_ops.c     |  2 --
> >   drivers/s390/cio/vfio_ccw_private.h |  3 ++-
> >   4 files changed, 22 insertions(+), 10 deletions(-)

> > diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> > index e7c9877c9f1e..b4a141fbd1a8 100644
> > --- a/drivers/s390/cio/vfio_ccw_fsm.c
> > +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> > @@ -28,7 +28,6 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
> >   	sch = private->sch;
> >   
> >   	spin_lock_irqsave(sch->lock, flags);
> > -	private->state = VFIO_CCW_STATE_BUSY;
> >   
> >   	orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
> >   	if (!orb) {
> > @@ -46,6 +45,7 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
> >   		 */
> >   		sch->schib.scsw.cmd.actl |= SCSW_ACTL_START_PEND;
> >   		ret = 0;
> > +		private->state = VFIO_CCW_STATE_CP_PENDING;  
> 
> [1]
> 
> >   		break;
> >   	case 1:		/* Status pending */
> >   	case 2:		/* Busy */
> > @@ -107,6 +107,12 @@ static void fsm_io_busy(struct vfio_ccw_private *private,
> >   	private->io_region->ret_code = -EBUSY;
> >   }
> >   
> > +static void fsm_io_retry(struct vfio_ccw_private *private,
> > +			 enum vfio_ccw_event event)
> > +{
> > +	private->io_region->ret_code = -EAGAIN;
> > +}
> > +
> >   static void fsm_disabled_irq(struct vfio_ccw_private *private,
> >   			     enum vfio_ccw_event event)
> >   {
> > @@ -135,8 +141,7 @@ static void fsm_io_request(struct vfio_ccw_private *private,
> >   	struct mdev_device *mdev = private->mdev;
> >   	char *errstr = "request";
> >   
> > -	private->state = VFIO_CCW_STATE_BUSY;
> > -
> > +	private->state = VFIO_CCW_STATE_CP_PROCESSING;  
> 
> [1]
> 
> >   	memcpy(scsw, io_region->scsw_area, sizeof(*scsw));
> >   
> >   	if (scsw->cmd.fctl & SCSW_FCTL_START_FUNC) {
> > @@ -181,7 +186,6 @@ static void fsm_io_request(struct vfio_ccw_private *private,
> >   	}
> >   
> >   err_out:
> > -	private->state = VFIO_CCW_STATE_IDLE;  
> 
> [1] Revisiting these locations as from an earlier discussion [2]... 
> These go IDLE->CP_PROCESSING->CP_PENDING if we get a cc=0 on the SSCH, 
> but we stop in CP_PROCESSING if the SSCH gets a nonzero cc.  Shouldn't 
> we cleanup and go back to IDLE in this scenario, rather than forcing 
> userspace to escalate to CSCH/HSCH after some number of retries (via FSM)?
> 
> [2] https://patchwork.kernel.org/patch/10773611/#22447997

It does do that (in vfio_ccw_mdev_write), it was not needed here. Or do
you think doing it here would be more obvious?

> 
> Besides that, I think this looks good to me.

Thanks!

^ permalink raw reply	[flat|nested] 70+ messages in thread
* Re: [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
  2019-02-05 11:52             ` [Qemu-devel] " Halil Pasic
@ 2019-02-05 12:35               ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-02-05 12:35 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Eric Farman, kvm, Pierre Morel, qemu-s390x,
	Farhan Ali, qemu-devel, Alex Williamson

On Tue, 5 Feb 2019 12:52:29 +0100
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Mon, 4 Feb 2019 16:31:02 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Thu, 31 Jan 2019 13:34:55 +0100
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Thu, 31 Jan 2019 12:52:20 +0100
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >   
> > > > On Wed, 30 Jan 2019 19:51:27 +0100
> > > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > > >     
> > > > > On Wed, 30 Jan 2019 14:22:07 +0100
> > > > > Cornelia Huck <cohuck@redhat.com> wrote:
> > > > >     
> > > > > > When we get a solicited interrupt, the start function may have
> > > > > > been cleared by a csch, but we still have a channel program
> > > > > > structure allocated. Make it safe to call the cp accessors in
> > > > > > any case, so we can call them unconditionally.      
> > > > > 
> > > > > I read this like it is supposed to be safe regardless of
> > > > > parallelism and threads. However I don't see any explicit
> > > > > synchronization done for cp->initialized.
> > > > > 
> > > > > I've managed to figure out how is that supposed to be safe
> > > > > for the cp_free() (which is probably our main concern) in
> > > > > vfio_ccw_sch_io_todo(), but if fail when it comes to the one
> > > > > in vfio_ccw_mdev_notifier().
> > > > > 
> > > > > Can you explain us how does the synchronization work?    
> > > > 
> > > > You read that wrong, I don't add synchronization, I just add a check.
> > > >     
> > > 
> > > Now I'm confused. Does that mean we don't need synchronization for this?  
> > 
> > If we lack synchronization (that is not provided by the current state
> > machine handling, or the rework here), we should do a patch on top
> > (preferably on top of the whole series, so this does not get even more
> > tangled up.) This is really just about the extra check.
> >   
> 
> I'm not a huge fan of keeping or introducing races -- it makes things
> difficult to reason about, but I do have some understanding of your
> position.

The only thing I want to avoid is knowingly making things worse than
before, and I don't think this patch does that.

> 
> This patch series is AFAICT a big improvement over what we have. I would
> like Farhan to confirm that it makes these hiccups disappear where he
> used to hit BUSY with another ssch request. If it does (I hope it does)
> it's definitely a good thing for anybody who wants to use vfio-ccw.

Yep. There remains a lot to be done, but it's a first step.

> 
> Yet I find it difficult to slap my r-b over racy code, or partial
> solutions. In the latter case, when I lack conceptual clarity, I find it
> difficult to tell if we are heading into the right direction, or is what
> we build today going to turn against us tomorrow. Sorry for being a drag.

As long as we don't introduce bad user space interfaces we have to drag
around forever, I think anything is fair game if we think it's a good
idea at that moment. We can rewrite things if it turned out to be a bad
idea (although I'm not arguing for doing random crap, of course :)

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/6] vfio-ccw: rework ssch state handling
  2019-02-05 12:10       ` [Qemu-devel] " Cornelia Huck
@ 2019-02-05 14:31         ` Eric Farman
  -1 siblings, 0 replies; 70+ messages in thread
From: Eric Farman @ 2019-02-05 14:31 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: linux-s390, Alex Williamson, Pierre Morel, kvm, Farhan Ali,
	qemu-devel, Halil Pasic, qemu-s390x



On 02/05/2019 07:10 AM, Cornelia Huck wrote:
> On Mon, 4 Feb 2019 16:29:40 -0500
> Eric Farman <farman@linux.ibm.com> wrote:
> 
>> On 01/30/2019 08:22 AM, Cornelia Huck wrote:
>>> The flow for processing ssch requests can be improved by splitting
>>> the BUSY state:
>>>
>>> - CP_PROCESSING: We reject any user space requests while we are in
>>>     the process of translating a channel program and submitting it to
>>>     the hardware. Use -EAGAIN to signal user space that it should
>>>     retry the request.
>>> - CP_PENDING: We have successfully submitted a request with ssch and
>>>     are now expecting an interrupt. As we can't handle more than one
>>>     channel program being processed, reject any further requests with
>>>     -EBUSY. A final interrupt will move us out of this state; this also
>>>     fixes a latent bug where a non-final interrupt might have freed up
>>>     a channel program that still was in progress.
>>>     By making this a separate state, we make it possible to issue a
>>>     halt or a clear while we're still waiting for the final interrupt
>>>     for the ssch (in a follow-on patch).
>>>
>>> It also makes a lot of sense not to preemptively filter out writes to
>>> the io_region if we're in an incorrect state: the state machine will
>>> handle this correctly.
>>>
>>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>>> ---
>>>    drivers/s390/cio/vfio_ccw_drv.c     |  8 ++++++--
>>>    drivers/s390/cio/vfio_ccw_fsm.c     | 19 ++++++++++++++-----
>>>    drivers/s390/cio/vfio_ccw_ops.c     |  2 --
>>>    drivers/s390/cio/vfio_ccw_private.h |  3 ++-
>>>    4 files changed, 22 insertions(+), 10 deletions(-)
> 
>>> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
>>> index e7c9877c9f1e..b4a141fbd1a8 100644
>>> --- a/drivers/s390/cio/vfio_ccw_fsm.c
>>> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
>>> @@ -28,7 +28,6 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
>>>    	sch = private->sch;
>>>    
>>>    	spin_lock_irqsave(sch->lock, flags);
>>> -	private->state = VFIO_CCW_STATE_BUSY;
>>>    
>>>    	orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
>>>    	if (!orb) {
>>> @@ -46,6 +45,7 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
>>>    		 */
>>>    		sch->schib.scsw.cmd.actl |= SCSW_ACTL_START_PEND;
>>>    		ret = 0;
>>> +		private->state = VFIO_CCW_STATE_CP_PENDING;
>>
>> [1]
>>
>>>    		break;
>>>    	case 1:		/* Status pending */
>>>    	case 2:		/* Busy */
>>> @@ -107,6 +107,12 @@ static void fsm_io_busy(struct vfio_ccw_private *private,
>>>    	private->io_region->ret_code = -EBUSY;
>>>    }
>>>    
>>> +static void fsm_io_retry(struct vfio_ccw_private *private,
>>> +			 enum vfio_ccw_event event)
>>> +{
>>> +	private->io_region->ret_code = -EAGAIN;
>>> +}
>>> +
>>>    static void fsm_disabled_irq(struct vfio_ccw_private *private,
>>>    			     enum vfio_ccw_event event)
>>>    {
>>> @@ -135,8 +141,7 @@ static void fsm_io_request(struct vfio_ccw_private *private,
>>>    	struct mdev_device *mdev = private->mdev;
>>>    	char *errstr = "request";
>>>    
>>> -	private->state = VFIO_CCW_STATE_BUSY;
>>> -
>>> +	private->state = VFIO_CCW_STATE_CP_PROCESSING;
>>
>> [1]
>>
>>>    	memcpy(scsw, io_region->scsw_area, sizeof(*scsw));
>>>    
>>>    	if (scsw->cmd.fctl & SCSW_FCTL_START_FUNC) {
>>> @@ -181,7 +186,6 @@ static void fsm_io_request(struct vfio_ccw_private *private,
>>>    	}
>>>    
>>>    err_out:
>>> -	private->state = VFIO_CCW_STATE_IDLE;
>>
>> [1] Revisiting these locations as from an earlier discussion [2]...
>> These go IDLE->CP_PROCESSING->CP_PENDING if we get a cc=0 on the SSCH,
>> but we stop in CP_PROCESSING if the SSCH gets a nonzero cc.  Shouldn't
>> we cleanup and go back to IDLE in this scenario, rather than forcing
>> userspace to escalate to CSCH/HSCH after some number of retries (via FSM)?
>>
>> [2] https://patchwork.kernel.org/patch/10773611/#22447997
> 
> It does do that (in vfio_ccw_mdev_write), it was not needed here. Or do
> you think doing it here would be more obvious?

Ah, my mistake, I missed that.  (That function is renamed to 
vfio_ccw_mdev_write_io_region in patch 4.)

I don't think keeping it here is necessary then.  I got too focused 
looking at what you ripped out that I lost the things that stayed.  Once 
this series gets in its entirety, and Pierre has a chance to rebase his 
FSM series on top of it all, this should be in great shape.

> 
>>
>> Besides that, I think this looks good to me.
> 
> Thanks!
> 

You're welcome!  Here, have a thing to add to this patch:

Reviewed-by: Eric Farman <farman@linux.ibm.com>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
  2019-02-05 12:03       ` [Qemu-devel] " Cornelia Huck
@ 2019-02-05 14:41         ` Eric Farman
  -1 siblings, 0 replies; 70+ messages in thread
From: Eric Farman @ 2019-02-05 14:41 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: linux-s390, Alex Williamson, Pierre Morel, kvm, Farhan Ali,
	qemu-devel, Halil Pasic, qemu-s390x



On 02/05/2019 07:03 AM, Cornelia Huck wrote:
> On Mon, 4 Feb 2019 14:25:34 -0500
> Eric Farman <farman@linux.ibm.com> wrote:
> 
>> On 01/30/2019 08:22 AM, Cornelia Huck wrote:
>>> When we get a solicited interrupt, the start function may have
>>> been cleared by a csch, but we still have a channel program
>>> structure allocated. Make it safe to call the cp accessors in
>>> any case, so we can call them unconditionally.
>>>
>>> While at it, also make sure that functions called from other parts
>>> of the code return gracefully if the channel program structure
>>> has not been initialized (even though that is a bug in the caller).
>>>
>>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>>> ---
>>>    drivers/s390/cio/vfio_ccw_cp.c  | 20 +++++++++++++++++++-
>>>    drivers/s390/cio/vfio_ccw_cp.h  |  2 ++
>>>    drivers/s390/cio/vfio_ccw_fsm.c |  5 +++++
>>>    3 files changed, 26 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
>>> index ba08fe137c2e..0bc0c38edda7 100644
>>> --- a/drivers/s390/cio/vfio_ccw_cp.c
>>> +++ b/drivers/s390/cio/vfio_ccw_cp.c
>>> @@ -335,6 +335,7 @@ static void cp_unpin_free(struct channel_program *cp)
>>>    	struct ccwchain *chain, *temp;
>>>    	int i;
>>>    
>>> +	cp->initialized = false;
>>>    	list_for_each_entry_safe(chain, temp, &cp->ccwchain_list, next) {
>>>    		for (i = 0; i < chain->ch_len; i++) {
>>>    			pfn_array_table_unpin_free(chain->ch_pat + i,
>>> @@ -701,6 +702,8 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)
>>>    	 */
>>>    	cp->orb.cmd.c64 = 1;
>>>    
>>> +	cp->initialized = true;
>>> +
>>
>> Not seen in this hunk, but we call ccwchain_loop_tic() just prior to
>> this point.  If that returns non-zero, we call cp_unpin_free()[1] (and
>> set initialized to false), and then fall through to here.  So this is
>> going to set initialized to true, even though we're taking an error
>> path.  :-(
> 
> Eek, setting c64 unconditionally threw me off. This needs to check
> for !ret, of course.
> 
>>
>> [1] Wait, why is it calling cp_unpin_free()?  Oh, I had proposed
>> squashing cp_free() and cp_unpin_free() back in November[2], got an r-b
>> from Pierre but haven't gotten back to tidy up the series for a v2.
>> Okay, I'll try to do that again soon.  :-)
> 
> :)
> 
>> [2] https://patchwork.kernel.org/patch/10675261/
>>
>>>    	return ret;
>>>    }
>>>    
>>> @@ -715,7 +718,8 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)
>>>     */
>>>    void cp_free(struct channel_program *cp)
>>>    {
>>> -	cp_unpin_free(cp);
>>> +	if (cp->initialized)
>>> +		cp_unpin_free(cp);
>>>    }
>>>    
>>>    /**
>>> @@ -760,6 +764,10 @@ int cp_prefetch(struct channel_program *cp)
>>>    	struct ccwchain *chain;
>>>    	int len, idx, ret;
>>>    
>>> +	/* this is an error in the caller */
>>> +	if (!cp || !cp->initialized)
>>> +		return -EINVAL;
>>> +
>>>    	list_for_each_entry(chain, &cp->ccwchain_list, next) {
>>>    		len = chain->ch_len;
>>>    		for (idx = 0; idx < len; idx++) {
>>> @@ -795,6 +803,10 @@ union orb *cp_get_orb(struct channel_program *cp, u32 intparm, u8 lpm)
>>>    	struct ccwchain *chain;
>>>    	struct ccw1 *cpa;
>>>    
>>> +	/* this is an error in the caller */
>>> +	if (!cp || !cp->initialized)
>>> +		return NULL;
>>> +
>>>    	orb = &cp->orb;
>>>    
>>>    	orb->cmd.intparm = intparm;
>>> @@ -831,6 +843,9 @@ void cp_update_scsw(struct channel_program *cp, union scsw *scsw)
>>>    	u32 cpa = scsw->cmd.cpa;
>>>    	u32 ccw_head, ccw_tail;
>>>    
>>> +	if (!cp->initialized)
>>> +		return;
>>> +
>>>    	/*
>>>    	 * LATER:
>>>    	 * For now, only update the cmd.cpa part. We may need to deal with
>>> @@ -869,6 +884,9 @@ bool cp_iova_pinned(struct channel_program *cp, u64 iova)
>>>    	struct ccwchain *chain;
>>>    	int i;
>>>    
>>> +	if (!cp->initialized)
>>
>> So, two of the checks added above look for a nonzero cp pointer prior to
>> checking initialized, while two don't.  I guess cp can't be NULL, since
>> it's embedded in the private struct directly and that's only free'd when
>> we do vfio_ccw_sch_remove() ... But I guess some consistency in how we
>> look would be nice.
> 
> The idea was: In which context is this called? Is there a legitimate
> reason for the caller to pass in an uninitialized cp, or would that
> mean the caller had messed up (and we should not trust cp to be !NULL
> either?)
> 
> But you're right, that does look inconsistent. Always checking for
> cp != NULL probably looks least odd, although it is overkill. Opinions?

My opinion?  Since cp is embedded in vfio_ccw_private, rather than a 
pointer to a separately malloc'd struct, we pass &private->cp to those 
functions.  So a check for !cp doesn't really buy us anything because 
what we are actually concerned about is whether or not private is NULL, 
which only changes on the probe/remove boundaries.

> 
>>
>>> +		return false;
>>> +
>>>    	list_for_each_entry(chain, &cp->ccwchain_list, next) {
>>>    		for (i = 0; i < chain->ch_len; i++)
>>>    			if (pfn_array_table_iova_pinned(chain->ch_pat + i,
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread
>> look would be nice.
> 
> The idea was: In which context is this called? Is there a legitimate
> reason for the caller to pass in an uninitialized cp, or would that
> mean the caller had messed up (and we should not trust cp to be !NULL
> either?)
> 
> But you're right, that does look inconsistent. Always checking for
> cp != NULL probably looks least odd, although it is overkill. Opinions?

My opinion?  Since cp is embedded in vfio_ccw_private, rather than a 
pointer to a separately malloc'd struct, we pass &private->cp to those 
functions.  So a check for !cp doesn't really buy us anything because 
what we are actually concerned about is whether or not private is NULL, 
which only changes on the probe/remove boundaries.

> 
>>
>>> +		return false;
>>> +
>>>    	list_for_each_entry(chain, &cp->ccwchain_list, next) {
>>>    		for (i = 0; i < chain->ch_len; i++)
>>>    			if (pfn_array_table_iova_pinned(chain->ch_pat + i,
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
  2019-02-05 12:35               ` [Qemu-devel] " Cornelia Huck
@ 2019-02-05 14:48                 ` Eric Farman
  -1 siblings, 0 replies; 70+ messages in thread
From: Eric Farman @ 2019-02-05 14:48 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic
  Cc: linux-s390, Pierre Morel, kvm, qemu-s390x, Farhan Ali,
	qemu-devel, Alex Williamson



On 02/05/2019 07:35 AM, Cornelia Huck wrote:
> On Tue, 5 Feb 2019 12:52:29 +0100
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
>> On Mon, 4 Feb 2019 16:31:02 +0100
>> Cornelia Huck <cohuck@redhat.com> wrote:
>>
>>> On Thu, 31 Jan 2019 13:34:55 +0100
>>> Halil Pasic <pasic@linux.ibm.com> wrote:
>>>    
>>>> On Thu, 31 Jan 2019 12:52:20 +0100
>>>> Cornelia Huck <cohuck@redhat.com> wrote:
>>>>    
>>>>> On Wed, 30 Jan 2019 19:51:27 +0100
>>>>> Halil Pasic <pasic@linux.ibm.com> wrote:
>>>>>      
>>>>>> On Wed, 30 Jan 2019 14:22:07 +0100
>>>>>> Cornelia Huck <cohuck@redhat.com> wrote:
>>>>>>      
>>>>>>> When we get a solicited interrupt, the start function may have
>>>>>>> been cleared by a csch, but we still have a channel program
>>>>>>> structure allocated. Make it safe to call the cp accessors in
>>>>>>> any case, so we can call them unconditionally.
>>>>>>
>>>>>> I read this like it is supposed to be safe regardless of
>>>>>> parallelism and threads. However I don't see any explicit
>>>>>> synchronization done for cp->initialized.
>>>>>>
>>>>>> I've managed to figure out how that is supposed to be safe
>>>>>> for the cp_free() (which is probably our main concern) in
>>>>>> vfio_ccw_sch_io_todo(), but I fail when it comes to the one
>>>>>> in vfio_ccw_mdev_notifier().
>>>>>>
>>>>>> Can you explain to us how the synchronization works?
>>>>>
>>>>> You read that wrong, I don't add synchronization, I just add a check.
>>>>>      
>>>>
>>>> Now I'm confused. Does that mean we don't need synchronization for this?
>>>
>>> If we lack synchronization (that is not provided by the current state
>>> machine handling, or the rework here), we should do a patch on top
>>> (preferably on top of the whole series, so this does not get even more
>>> tangled up.) This is really just about the extra check.
>>>    
>>
>> I'm not a huge fan of keeping or introducing races -- it makes things
>> difficult to reason about, but I do have some understanding of your
>> position.
> 
> The only thing I want to avoid is knowingly making things worse than
> before, and I don't think this patch does that.
> 
>>
>> This patch series is AFAICT a big improvement over what we have. I would
>> like Farhan to confirm that it makes these hiccups, where he used to hit
>> BUSY with another ssch request, disappear. If it does (I hope it does),
>> it's definitely a good thing for anybody who wants to use vfio-ccw.
> 
> Yep. There remains a lot to be done, but it's a first step.

s/a first step/an excellent first step/  :)

Can't speak for Farhan, but this makes things somewhat better for me. 
I'm still getting some periodic errors, but they happen infrequently 
enough now that debugging them is frustrating.  ;-)

  - Eric

> 
>>
>> Yet I find it difficult to slap my r-b over racy code, or partial
>> solutions. In the latter case, when I lack conceptual clarity, I find it
>> difficult to tell if we are heading into the right direction, or is what
>> we build today going to turn against us tomorrow. Sorry for being a drag.
> 
> As long as we don't introduce bad user space interfaces we have to drag
> around forever, I think anything is fair game if we think it's a good
> idea at that moment. We can rewrite things if it turned out to be a bad
> idea (although I'm not arguing for doing random crap, of course :)
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
  2019-02-05 14:48                 ` [Qemu-devel] " Eric Farman
@ 2019-02-05 15:14                   ` Farhan Ali
  -1 siblings, 0 replies; 70+ messages in thread
From: Farhan Ali @ 2019-02-05 15:14 UTC (permalink / raw)
  To: Eric Farman, Cornelia Huck, Halil Pasic
  Cc: linux-s390, Pierre Morel, kvm, Alex Williamson, qemu-devel, qemu-s390x



On 02/05/2019 09:48 AM, Eric Farman wrote:
> 
> 
> On 02/05/2019 07:35 AM, Cornelia Huck wrote:
>> On Tue, 5 Feb 2019 12:52:29 +0100
>> Halil Pasic <pasic@linux.ibm.com> wrote:
>>
>>> On Mon, 4 Feb 2019 16:31:02 +0100
>>> Cornelia Huck <cohuck@redhat.com> wrote:
>>>
>>>> On Thu, 31 Jan 2019 13:34:55 +0100
>>>> Halil Pasic <pasic@linux.ibm.com> wrote:
>>>>> On Thu, 31 Jan 2019 12:52:20 +0100
>>>>> Cornelia Huck <cohuck@redhat.com> wrote:
>>>>>> On Wed, 30 Jan 2019 19:51:27 +0100
>>>>>> Halil Pasic <pasic@linux.ibm.com> wrote:
>>>>>>> On Wed, 30 Jan 2019 14:22:07 +0100
>>>>>>> Cornelia Huck <cohuck@redhat.com> wrote:
>>>>>>>> When we get a solicited interrupt, the start function may have
>>>>>>>> been cleared by a csch, but we still have a channel program
>>>>>>>> structure allocated. Make it safe to call the cp accessors in
>>>>>>>> any case, so we can call them unconditionally.
>>>>>>>
>>>>>>> I read this like it is supposed to be safe regardless of
>>>>>>> parallelism and threads. However I don't see any explicit
>>>>>>> synchronization done for cp->initialized.
>>>>>>>
>>>>>>> I've managed to figure out how that is supposed to be safe
>>>>>>> for the cp_free() (which is probably our main concern) in
>>>>>>> vfio_ccw_sch_io_todo(), but I fail when it comes to the one
>>>>>>> in vfio_ccw_mdev_notifier().
>>>>>>>
>>>>>>> Can you explain to us how the synchronization works?
>>>>>>
>>>>>> You read that wrong, I don't add synchronization, I just add a check.
>>>>>
>>>>> Now I'm confused. Does that mean we don't need synchronization for 
>>>>> this?
>>>>
>>>> If we lack synchronization (that is not provided by the current state
>>>> machine handling, or the rework here), we should do a patch on top
>>>> (preferably on top of the whole series, so this does not get even more
>>>> tangled up.) This is really just about the extra check.
>>>
>>> I'm not a huge fan of keeping or introducing races -- it makes things
>>> difficult to reason about, but I do have some understanding of your
>>> position.
>>
>> The only thing I want to avoid is knowingly making things worse than
>> before, and I don't think this patch does that.
>>
>>>
>>> This patch series is AFAICT a big improvement over what we have. I would
>>> like Farhan to confirm that it makes these hiccups, where he used to hit
>>> BUSY with another ssch request, disappear. If it does (I hope it does),
>>> it's definitely a good thing for anybody who wants to use vfio-ccw.
>>
>> Yep. There remains a lot to be done, but it's a first step.
> 
> s/a first step/an excellent first step/  :)
> 
> Can't speak for Farhan, but this makes things somewhat better for me. 
> I'm still getting some periodic errors, but they happen infrequently 
> enough now that debugging them is frustrating.  ;-)
> 
>   - Eric
> 

I ran my workloads/tests with the patches and, like Eric, I now hit the
errors I previously saw less frequently.


>>
>>>
>>> Yet I find it difficult to slap my r-b over racy code, or partial
>>> solutions. In the latter case, when I lack conceptual clarity, I find it
>>> difficult to tell if we are heading into the right direction, or is what
>>> we build today going to turn against us tomorrow. Sorry for being a 
>>> drag.
>>
>> As long as we don't introduce bad user space interfaces we have to drag
>> around forever, I think anything is fair game if we think it's a good
>> idea at that moment. We can rewrite things if it turned out to be a bad
>> idea (although I'm not arguing for doing random crap, of course :)
>>
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
  2019-02-05 15:14                   ` [Qemu-devel] " Farhan Ali
@ 2019-02-05 16:13                     ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-02-05 16:13 UTC (permalink / raw)
  To: Farhan Ali
  Cc: linux-s390, Eric Farman, kvm, Pierre Morel, qemu-s390x,
	qemu-devel, Halil Pasic, Alex Williamson

On Tue, 5 Feb 2019 10:14:46 -0500
Farhan Ali <alifm@linux.ibm.com> wrote:

> On 02/05/2019 09:48 AM, Eric Farman wrote:

> >>> This patch series is AFAICT a big improvement over what we have. I would
> >>> like Farhan to confirm that it makes these hiccups, where he used to hit
> >>> BUSY with another ssch request, disappear. If it does (I hope it does),
> >>> it's definitely a good thing for anybody who wants to use vfio-ccw.  
> >>
> >> Yep. There remains a lot to be done, but it's a first step.  
> > 
> > s/a first step/an excellent first step/  :)
> > 
> > Can't speak for Farhan, but this makes things somewhat better for me. 
> > I'm still getting some periodic errors, but they happen infrequently 
> > enough now that debugging them is frustrating.  ;-)
> > 
> >   - Eric
> >   
> 
> I ran my workloads/tests with the patches and, like Eric, I now hit
> the errors I previously saw less frequently.

Great, thanks for testing!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
  2019-02-05 14:41         ` [Qemu-devel] " Eric Farman
@ 2019-02-05 16:29           ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-02-05 16:29 UTC (permalink / raw)
  To: Eric Farman
  Cc: linux-s390, Alex Williamson, Pierre Morel, kvm, Farhan Ali,
	qemu-devel, Halil Pasic, qemu-s390x

On Tue, 5 Feb 2019 09:41:15 -0500
Eric Farman <farman@linux.ibm.com> wrote:

> On 02/05/2019 07:03 AM, Cornelia Huck wrote:
> > On Mon, 4 Feb 2019 14:25:34 -0500
> > Eric Farman <farman@linux.ibm.com> wrote:
> >   
> >> On 01/30/2019 08:22 AM, Cornelia Huck wrote:  

> >>> @@ -760,6 +764,10 @@ int cp_prefetch(struct channel_program *cp)
> >>>    	struct ccwchain *chain;
> >>>    	int len, idx, ret;
> >>>    
> >>> +	/* this is an error in the caller */
> >>> +	if (!cp || !cp->initialized)
> >>> +		return -EINVAL;
> >>> +
> >>>    	list_for_each_entry(chain, &cp->ccwchain_list, next) {
> >>>    		len = chain->ch_len;
> >>>    		for (idx = 0; idx < len; idx++) {
> >>> @@ -795,6 +803,10 @@ union orb *cp_get_orb(struct channel_program *cp, u32 intparm, u8 lpm)
> >>>    	struct ccwchain *chain;
> >>>    	struct ccw1 *cpa;
> >>>    
> >>> +	/* this is an error in the caller */
> >>> +	if (!cp || !cp->initialized)
> >>> +		return NULL;
> >>> +
> >>>    	orb = &cp->orb;
> >>>    
> >>>    	orb->cmd.intparm = intparm;
> >>> @@ -831,6 +843,9 @@ void cp_update_scsw(struct channel_program *cp, union scsw *scsw)
> >>>    	u32 cpa = scsw->cmd.cpa;
> >>>    	u32 ccw_head, ccw_tail;
> >>>    
> >>> +	if (!cp->initialized)
> >>> +		return;
> >>> +
> >>>    	/*
> >>>    	 * LATER:
> >>>    	 * For now, only update the cmd.cpa part. We may need to deal with
> >>> @@ -869,6 +884,9 @@ bool cp_iova_pinned(struct channel_program *cp, u64 iova)
> >>>    	struct ccwchain *chain;
> >>>    	int i;
> >>>    
> >>> +	if (!cp->initialized)  
> >>
> >> So, two of the checks added above look for a nonzero cp pointer prior to
> >> checking initialized, while two don't.  I guess cp can't be NULL, since
> >> it's embedded in the private struct directly and that's only free'd when
> >> we do vfio_ccw_sch_remove() ... But I guess some consistency in how we
> >> look would be nice.  
> > 
> > The idea was: In which context is this called? Is there a legitimate
> > reason for the caller to pass in an uninitialized cp, or would that
> > mean the caller had messed up (and we should not trust cp to be !NULL
> > either?)
> > 
> > But you're right, that does look inconsistent. Always checking for
> > cp != NULL probably looks least odd, although it is overkill. Opinions?  
> 
> My opinion?  Since cp is embedded in vfio_ccw_private, rather than a 
> pointer to a separately malloc'd struct, we pass &private->cp to those 
> functions.  So a check for !cp doesn't really buy us anything because 
> what we are actually concerned about is whether or not private is NULL, 
> which only changes on the probe/remove boundaries.

I guess if we pass in crap (or NULL) instead of &private->cp, it's our
own fault and we can disregard fencing that case. The probe/remove path
does not really bother me, for the reasons you said.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/6] vfio-ccw: rework ssch state handling
  2019-02-05 14:31         ` [Qemu-devel] " Eric Farman
@ 2019-02-05 16:32           ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-02-05 16:32 UTC (permalink / raw)
  To: Eric Farman
  Cc: linux-s390, Alex Williamson, Pierre Morel, kvm, Farhan Ali,
	qemu-devel, Halil Pasic, qemu-s390x

On Tue, 5 Feb 2019 09:31:55 -0500
Eric Farman <farman@linux.ibm.com> wrote:

> On 02/05/2019 07:10 AM, Cornelia Huck wrote:
> > On Mon, 4 Feb 2019 16:29:40 -0500
> > Eric Farman <farman@linux.ibm.com> wrote:
> >   
> >> On 01/30/2019 08:22 AM, Cornelia Huck wrote:  
> >>> The flow for processing ssch requests can be improved by splitting
> >>> the BUSY state:
> >>>
> >>> - CP_PROCESSING: We reject any user space requests while we are in
> >>>     the process of translating a channel program and submitting it to
> >>>     the hardware. Use -EAGAIN to signal user space that it should
> >>>     retry the request.
> >>> - CP_PENDING: We have successfully submitted a request with ssch and
> >>>     are now expecting an interrupt. As we can't handle more than one
> >>>     channel program being processed, reject any further requests with
> >>>     -EBUSY. A final interrupt will move us out of this state; this also
> >>>     fixes a latent bug where a non-final interrupt might have freed up
> >>>     a channel program that still was in progress.
> >>>     By making this a separate state, we make it possible to issue a
> >>>     halt or a clear while we're still waiting for the final interrupt
> >>>     for the ssch (in a follow-on patch).
> >>>
> >>> It also makes a lot of sense not to preemptively filter out writes to
> >>> the io_region if we're in an incorrect state: the state machine will
> >>> handle this correctly.
> >>>
> >>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> >>> ---
> >>>    drivers/s390/cio/vfio_ccw_drv.c     |  8 ++++++--
> >>>    drivers/s390/cio/vfio_ccw_fsm.c     | 19 ++++++++++++++-----
> >>>    drivers/s390/cio/vfio_ccw_ops.c     |  2 --
> >>>    drivers/s390/cio/vfio_ccw_private.h |  3 ++-
> >>>    4 files changed, 22 insertions(+), 10 deletions(-)  
> >   
> >>> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> >>> index e7c9877c9f1e..b4a141fbd1a8 100644
> >>> --- a/drivers/s390/cio/vfio_ccw_fsm.c
> >>> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> >>> @@ -28,7 +28,6 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
> >>>    	sch = private->sch;
> >>>    
> >>>    	spin_lock_irqsave(sch->lock, flags);
> >>> -	private->state = VFIO_CCW_STATE_BUSY;
> >>>    
> >>>    	orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
> >>>    	if (!orb) {
> >>> @@ -46,6 +45,7 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
> >>>    		 */
> >>>    		sch->schib.scsw.cmd.actl |= SCSW_ACTL_START_PEND;
> >>>    		ret = 0;
> >>> +		private->state = VFIO_CCW_STATE_CP_PENDING;  
> >>
> >> [1]
> >>  
> >>>    		break;
> >>>    	case 1:		/* Status pending */
> >>>    	case 2:		/* Busy */
> >>> @@ -107,6 +107,12 @@ static void fsm_io_busy(struct vfio_ccw_private *private,
> >>>    	private->io_region->ret_code = -EBUSY;
> >>>    }
> >>>    
> >>> +static void fsm_io_retry(struct vfio_ccw_private *private,
> >>> +			 enum vfio_ccw_event event)
> >>> +{
> >>> +	private->io_region->ret_code = -EAGAIN;
> >>> +}
> >>> +
> >>>    static void fsm_disabled_irq(struct vfio_ccw_private *private,
> >>>    			     enum vfio_ccw_event event)
> >>>    {
> >>> @@ -135,8 +141,7 @@ static void fsm_io_request(struct vfio_ccw_private *private,
> >>>    	struct mdev_device *mdev = private->mdev;
> >>>    	char *errstr = "request";
> >>>    
> >>> -	private->state = VFIO_CCW_STATE_BUSY;
> >>> -
> >>> +	private->state = VFIO_CCW_STATE_CP_PROCESSING;  
> >>
> >> [1]
> >>  
> >>>    	memcpy(scsw, io_region->scsw_area, sizeof(*scsw));
> >>>    
> >>>    	if (scsw->cmd.fctl & SCSW_FCTL_START_FUNC) {
> >>> @@ -181,7 +186,6 @@ static void fsm_io_request(struct vfio_ccw_private *private,
> >>>    	}
> >>>    
> >>>    err_out:
> >>> -	private->state = VFIO_CCW_STATE_IDLE;  
> >>
> >> [1] Revisiting these locations from an earlier discussion [2]...
> >> These go IDLE->CP_PROCESSING->CP_PENDING if we get a cc=0 on the SSCH,
> >> but we stop in CP_PROCESSING if the SSCH gets a nonzero cc.  Shouldn't
> >> we clean up and go back to IDLE in this scenario, rather than forcing
> >> userspace to escalate to CSCH/HSCH after some number of retries (via FSM)?
> >>
> >> [2] https://patchwork.kernel.org/patch/10773611/#22447997  
> > 
> > It does do that (in vfio_ccw_mdev_write), it was not needed here. Or do
> > you think doing it here would be more obvious?  
> 
> Ah, my mistake, I missed that.  (That function is renamed to 
> vfio_ccw_mdev_write_io_region in patch 4.)
> 
> I don't think keeping it here is necessary then.  I got so focused 
> looking at what you ripped out that I lost track of the things that 
> stayed.  Once this series gets in in its entirety, and Pierre has a 
> chance to rebase his FSM series on top of it all, this should be in 
> great shape.

Yeah, it's probably easier to look at the end result.

> 
> >   
> >>
> >> Besides that, I think this looks good to me.  
> > 
> > Thanks!
> >   
> 
> You're welcome!  Here, have a thing to add to this patch:
> 
> Reviewed-by: Eric Farman <farman@linux.ibm.com>
> 

Thanks a lot!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 0/6] vfio-ccw: support hsch/csch (kernel part)
  2019-01-30 13:22 ` [Qemu-devel] " Cornelia Huck
@ 2019-02-06 14:00   ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-02-06 14:00 UTC (permalink / raw)
  To: Halil Pasic, Eric Farman, Farhan Ali, Pierre Morel
  Cc: linux-s390, qemu-s390x, Alex Williamson, qemu-devel, kvm

On Wed, 30 Jan 2019 14:22:06 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> [This is the Linux kernel part, git tree is available at
> https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw.git vfio-ccw-eagain-caps-v3

I've pushed out the changes I've made so far (patch 1) to
vfio-ccw-eagain-caps-v3.5. I'll wait a bit for more comments before
sending a new version.

> 
> The companion QEMU patches are available at
> https://github.com/cohuck/qemu vfio-ccw-caps
> This is the previously posted v2 version, which should continue to work.]

I would not mind if somebody looked at those as well :)

> 
> Currently, vfio-ccw only relays START SUBCHANNEL requests to the real
> device. This tends to work well for the most common 'good path' scenarios;
> however, as we emulate {HALT,CLEAR} SUBCHANNEL in QEMU, things like
> clearing pending requests at the device is currently not supported.
> This may be a problem for e.g. error recovery.
> 
> This patch series introduces capabilities (similar to what vfio-pci uses)
> and exposes a new async region for handling hsch/csch.
> 
> Lightly tested (I can interact with a dasd as before, and reserve/release
> seems to work well.) Not sure if there is a better way to test this, ideas
> welcome.
> 
> Changes v2->v3:
> - Unb0rked patch 1, improved scope
> - Split out the new mutex from patch 2 into new patch 3; added missing
>   locking and hopefully improved description
> - Patch 2 now reworks the state handling by splitting the BUSY state
>   into CP_PROCESSING and CP_PENDING
> - Patches 3 and 5 adapted on top of the reworked patches; hsch/csch
>   are allowed in CP_PENDING, but not in CP_PROCESSING (did not add
>   any R-b due to that)
> - Added missing free in patch 5
> - Probably some small changes I forgot to note down
> 
> Changes v1->v2:
> - New patch 1: make it safe to use the cp accessors at any time; this
>   should avoid problems with unsolicited interrupt handling
> - New patch 2: handle concurrent accesses to the io region; the idea is
>   to return -EAGAIN to userspace more often (so it can simply retry)
> - also handle concurrent accesses to the async io region
> - change VFIO_REGION_TYPE_CCW
> - merge events for halt and clear to a single async event; this turned out
>   to make the code quite a bit simpler
> - probably some small changes I forgot to note down
> 
> Cornelia Huck (6):
>   vfio-ccw: make it safe to access channel programs
>   vfio-ccw: rework ssch state handling
>   vfio-ccw: protect the I/O region
>   vfio-ccw: add capabilities chain
>   s390/cio: export hsch to modules
>   vfio-ccw: add handling for async channel instructions
> 
>  drivers/s390/cio/Makefile           |   3 +-
>  drivers/s390/cio/ioasm.c            |   1 +
>  drivers/s390/cio/vfio_ccw_async.c   |  88 ++++++++++++
>  drivers/s390/cio/vfio_ccw_cp.c      |  20 ++-
>  drivers/s390/cio/vfio_ccw_cp.h      |   2 +
>  drivers/s390/cio/vfio_ccw_drv.c     |  57 ++++++--
>  drivers/s390/cio/vfio_ccw_fsm.c     | 143 ++++++++++++++++++-
>  drivers/s390/cio/vfio_ccw_ops.c     | 210 +++++++++++++++++++++++-----
>  drivers/s390/cio/vfio_ccw_private.h |  48 ++++++-
>  include/uapi/linux/vfio.h           |   4 +
>  include/uapi/linux/vfio_ccw.h       |  12 ++
>  11 files changed, 531 insertions(+), 57 deletions(-)
>  create mode 100644 drivers/s390/cio/vfio_ccw_async.c
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 0/6] vfio-ccw: support hsch/csch (kernel part)
  2019-02-06 14:00   ` [Qemu-devel] " Cornelia Huck
@ 2019-02-08 21:19     ` Eric Farman
  -1 siblings, 0 replies; 70+ messages in thread
From: Eric Farman @ 2019-02-08 21:19 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic, Farhan Ali, Pierre Morel
  Cc: linux-s390, qemu-s390x, Alex Williamson, qemu-devel, kvm



On 02/06/2019 09:00 AM, Cornelia Huck wrote:
> On Wed, 30 Jan 2019 14:22:06 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
>> [This is the Linux kernel part, git tree is available at
>> https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw.git vfio-ccw-eagain-caps-v3
> 
> I've pushed out the changes I've made so far (patch 1) to
> vfio-ccw-eagain-caps-v3.5. I'll wait a bit for more comments before
> sending a new version.
> 

Thanks for that branch...  For patch 1 in v3.5:

Reviewed-by: Eric Farman <farman@linux.ibm.com>

>>
>> The companion QEMU patches are available at
>> https://github.com/cohuck/qemu vfio-ccw-caps
>> This is the previously posted v2 version, which should continue to work.]
> 
> I would not mind if somebody looked at those as well :)

Not precluding anyone else from doing so :) ... I'd planned on looking 
at them as I get into the meat of patches 4-6 on the kernel side, where 
the overlap occurs.  I'm getting close.  :)

FWIW, I've been running with both series for the last week or two, along 
with some host kernel traces to prove things got executed the way I 
thought, and it's seemed to be working well.  So that makes me 
optimistic for the later patches.

  - Eric

> 
>>
>> Currently, vfio-ccw only relays START SUBCHANNEL requests to the real
>> device. This tends to work well for the most common 'good path' scenarios;
>> however, as we emulate {HALT,CLEAR} SUBCHANNEL in QEMU, things like
>> clearing pending requests at the device is currently not supported.
>> This may be a problem for e.g. error recovery.
>>
>> This patch series introduces capabilities (similar to what vfio-pci uses)
>> and exposes a new async region for handling hsch/csch.
>>
>> Lightly tested (I can interact with a dasd as before, and reserve/release
>> seems to work well.) Not sure if there is a better way to test this, ideas
>> welcome.
>>
>> Changes v2->v3:
>> - Unb0rked patch 1, improved scope
>> - Split out the new mutex from patch 2 into new patch 3; added missing
>>    locking and hopefully improved description
>> - Patch 2 now reworks the state handling by splitting the BUSY state
>>    into CP_PROCESSING and CP_PENDING
>> - Patches 3 and 5 adapted on top of the reworked patches; hsch/csch
>>    are allowed in CP_PENDING, but not in CP_PROCESSING (did not add
>>    any R-b due to that)
>> - Added missing free in patch 5
>> - Probably some small changes I forgot to note down
>>
>> Changes v1->v2:
>> - New patch 1: make it safe to use the cp accessors at any time; this
>>    should avoid problems with unsolicited interrupt handling
>> - New patch 2: handle concurrent accesses to the io region; the idea is
>>    to return -EAGAIN to userspace more often (so it can simply retry)
>> - also handle concurrent accesses to the async io region
>> - change VFIO_REGION_TYPE_CCW
>> - merge events for halt and clear to a single async event; this turned out
>>    to make the code quite a bit simpler
>> - probably some small changes I forgot to note down
>>
>> Cornelia Huck (6):
>>    vfio-ccw: make it safe to access channel programs
>>    vfio-ccw: rework ssch state handling
>>    vfio-ccw: protect the I/O region
>>    vfio-ccw: add capabilities chain
>>    s390/cio: export hsch to modules
>>    vfio-ccw: add handling for async channel instructions
>>
>>   drivers/s390/cio/Makefile           |   3 +-
>>   drivers/s390/cio/ioasm.c            |   1 +
>>   drivers/s390/cio/vfio_ccw_async.c   |  88 ++++++++++++
>>   drivers/s390/cio/vfio_ccw_cp.c      |  20 ++-
>>   drivers/s390/cio/vfio_ccw_cp.h      |   2 +
>>   drivers/s390/cio/vfio_ccw_drv.c     |  57 ++++++--
>>   drivers/s390/cio/vfio_ccw_fsm.c     | 143 ++++++++++++++++++-
>>   drivers/s390/cio/vfio_ccw_ops.c     | 210 +++++++++++++++++++++++-----
>>   drivers/s390/cio/vfio_ccw_private.h |  48 ++++++-
>>   include/uapi/linux/vfio.h           |   4 +
>>   include/uapi/linux/vfio_ccw.h       |  12 ++
>>   11 files changed, 531 insertions(+), 57 deletions(-)
>>   create mode 100644 drivers/s390/cio/vfio_ccw_async.c
>>
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/6] vfio-ccw: protect the I/O region
  2019-01-30 13:22   ` [Qemu-devel] " Cornelia Huck
@ 2019-02-08 21:26     ` Eric Farman
  -1 siblings, 0 replies; 70+ messages in thread
From: Eric Farman @ 2019-02-08 21:26 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic, Farhan Ali, Pierre Morel
  Cc: linux-s390, qemu-s390x, Alex Williamson, qemu-devel, kvm



On 01/30/2019 08:22 AM, Cornelia Huck wrote:
> Introduce a mutex to disallow concurrent reads or writes to the
> I/O region. This makes sure that the data the kernel or user
> space see is always consistent.
> 
> The same mutex will be used to protect the async region as well.
> 
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>

I keep wondering how the FSM could provide this, but I end up getting 
into chicken/egg rabbit holes.  So, until my brain becomes wiser...

Reviewed-by: Eric Farman <farman@linux.ibm.com>

> ---
>   drivers/s390/cio/vfio_ccw_drv.c     |  3 +++
>   drivers/s390/cio/vfio_ccw_ops.c     | 28 +++++++++++++++++++---------
>   drivers/s390/cio/vfio_ccw_private.h |  2 ++
>   3 files changed, 24 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
> index 0b3b9de45c60..5ea0da1dd954 100644
> --- a/drivers/s390/cio/vfio_ccw_drv.c
> +++ b/drivers/s390/cio/vfio_ccw_drv.c
> @@ -84,7 +84,9 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>   		if (is_final)
>   			cp_free(&private->cp);
>   	}
> +	mutex_lock(&private->io_mutex);
>   	memcpy(private->io_region->irb_area, irb, sizeof(*irb));
> +	mutex_unlock(&private->io_mutex);
>   
>   	if (private->io_trigger)
>   		eventfd_signal(private->io_trigger, 1);
> @@ -129,6 +131,7 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
>   
>   	private->sch = sch;
>   	dev_set_drvdata(&sch->dev, private);
> +	mutex_init(&private->io_mutex);
>   
>   	spin_lock_irq(sch->lock);
>   	private->state = VFIO_CCW_STATE_NOT_OPER;
> diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
> index 3fdcc6dfe0bf..025c8a832bc8 100644
> --- a/drivers/s390/cio/vfio_ccw_ops.c
> +++ b/drivers/s390/cio/vfio_ccw_ops.c
> @@ -169,16 +169,20 @@ static ssize_t vfio_ccw_mdev_read(struct mdev_device *mdev,
>   {
>   	struct vfio_ccw_private *private;
>   	struct ccw_io_region *region;
> +	int ret;
>   
>   	if (*ppos + count > sizeof(*region))
>   		return -EINVAL;
>   
>   	private = dev_get_drvdata(mdev_parent_dev(mdev));
> +	mutex_lock(&private->io_mutex);
>   	region = private->io_region;
>   	if (copy_to_user(buf, (void *)region + *ppos, count))
> -		return -EFAULT;
> -
> -	return count;
> +		ret = -EFAULT;
> +	else
> +		ret = count;
> +	mutex_unlock(&private->io_mutex);
> +	return ret;
>   }
>   
>   static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
> @@ -188,23 +192,29 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
>   {
>   	struct vfio_ccw_private *private;
>   	struct ccw_io_region *region;
> +	int ret;
>   
>   	if (*ppos + count > sizeof(*region))
>   		return -EINVAL;
>   
>   	private = dev_get_drvdata(mdev_parent_dev(mdev));
> +	if (!mutex_trylock(&private->io_mutex))
> +		return -EAGAIN;
>   
>   	region = private->io_region;
> -	if (copy_from_user((void *)region + *ppos, buf, count))
> -		return -EFAULT;
> +	if (copy_from_user((void *)region + *ppos, buf, count)) {
> +		ret = -EFAULT;
> +		goto out_unlock;
> +	}
>   
>   	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_IO_REQ);
> -	if (region->ret_code != 0) {
> +	if (region->ret_code != 0)
>   		private->state = VFIO_CCW_STATE_IDLE;
> -		return region->ret_code;
> -	}
> +	ret = (region->ret_code != 0) ? region->ret_code : count;
>   
> -	return count;
> +out_unlock:
> +	mutex_unlock(&private->io_mutex);
> +	return ret;
>   }
>   
>   static int vfio_ccw_mdev_get_device_info(struct vfio_device_info *info)
> diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
> index 50c52efb4fcb..32173cbd838d 100644
> --- a/drivers/s390/cio/vfio_ccw_private.h
> +++ b/drivers/s390/cio/vfio_ccw_private.h
> @@ -28,6 +28,7 @@
>    * @mdev: pointer to the mediated device
>    * @nb: notifier for vfio events
>    * @io_region: MMIO region to input/output I/O arguments/results
> + * @io_mutex: protect against concurrent update of I/O regions
>    * @cp: channel program for the current I/O operation
>    * @irb: irb info received from interrupt
>    * @scsw: scsw info
> @@ -42,6 +43,7 @@ struct vfio_ccw_private {
>   	struct mdev_device	*mdev;
>   	struct notifier_block	nb;
>   	struct ccw_io_region	*io_region;
> +	struct mutex		io_mutex;
>   
>   	struct channel_program	cp;
>   	struct irb		irb;
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/6] vfio-ccw: protect the I/O region
  2019-02-08 21:26     ` [Qemu-devel] " Eric Farman
@ 2019-02-11 15:57       ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-02-11 15:57 UTC (permalink / raw)
  To: Eric Farman
  Cc: linux-s390, Alex Williamson, Pierre Morel, kvm, Farhan Ali,
	qemu-devel, Halil Pasic, qemu-s390x

On Fri, 8 Feb 2019 16:26:06 -0500
Eric Farman <farman@linux.ibm.com> wrote:

> On 01/30/2019 08:22 AM, Cornelia Huck wrote:
> > Introduce a mutex to disallow concurrent reads or writes to the
> > I/O region. This makes sure that the data the kernel or user
> > space sees is always consistent.
> > 
> > The same mutex will be used to protect the async region as well.
> > 
> > Signed-off-by: Cornelia Huck <cohuck@redhat.com>  
> 
> I keep wondering how the FSM could provide this, but I end up getting 
> into chicken/egg rabbit holes.

Yes, if the fsm is able to provide this, it is probably not in an
easy-to-understand way...

> So, until my brain becomes wiser...
> 
> Reviewed-by: Eric Farman <farman@linux.ibm.com>

Thanks!

> 
> > ---
> >   drivers/s390/cio/vfio_ccw_drv.c     |  3 +++
> >   drivers/s390/cio/vfio_ccw_ops.c     | 28 +++++++++++++++++++---------
> >   drivers/s390/cio/vfio_ccw_private.h |  2 ++
> >   3 files changed, 24 insertions(+), 9 deletions(-)


* Re: [PATCH v3 0/6] vfio-ccw: support hsch/csch (kernel part)
  2019-02-08 21:19     ` [Qemu-devel] " Eric Farman
@ 2019-02-11 16:13       ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-02-11 16:13 UTC (permalink / raw)
  To: Eric Farman
  Cc: linux-s390, Alex Williamson, Pierre Morel, kvm, Farhan Ali,
	qemu-devel, Halil Pasic, qemu-s390x

On Fri, 8 Feb 2019 16:19:58 -0500
Eric Farman <farman@linux.ibm.com> wrote:

> On 02/06/2019 09:00 AM, Cornelia Huck wrote:
> > On Wed, 30 Jan 2019 14:22:06 +0100
> > Cornelia Huck <cohuck@redhat.com> wrote:
> >   
> >> [This is the Linux kernel part, git tree is available at
> >> https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw.git vfio-ccw-eagain-caps-v3  
> > 
> > I've pushed out the changes I've made so far (patch 1) to
> > vfio-ccw-eagain-caps-v3.5. I'll wait a bit for more comments before
> > sending a new version.
> >   
> 
> Thanks for that branch...  For patch 1 in v3.5:
> 
> Reviewed-by: Eric Farman <farman@linux.ibm.com>

Thanks!

> 
> >>
> >> The companion QEMU patches are available at
> >> https://github.com/cohuck/qemu vfio-ccw-caps
> >> This is the previously posted v2 version, which should continue to work.]  
> > 
> > I would not mind if somebody looked at those as well :)  
> 
> Not precluding anyone else from doing so :) ... I'd planned on looking 
> at them as I get into the meat of patches 4-6 on the kernel side, where 
> the overlap occurs.  I'm getting close.  :)

Cool :) I'll wait a bit more before resending, then. (I'll probably
rebase the QEMU side as well when I do resend.)

> 
> FWIW, I've been running with both series for the last week or two, along 
> with some host kernel traces to prove things got executed the way I 
> thought, and it's seemed to be working well.  So that makes me 
> optimistic for the later patches.

That's good news, thanks for testing. Do you have a special test load
that you run in the guest that you can share?


* Re: [PATCH v3 0/6] vfio-ccw: support hsch/csch (kernel part)
  2019-02-11 16:13       ` [Qemu-devel] " Cornelia Huck
@ 2019-02-11 17:37         ` Eric Farman
  -1 siblings, 0 replies; 70+ messages in thread
From: Eric Farman @ 2019-02-11 17:37 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: linux-s390, Alex Williamson, Pierre Morel, kvm, Farhan Ali,
	qemu-devel, Halil Pasic, qemu-s390x



On 02/11/2019 11:13 AM, Cornelia Huck wrote:
> On Fri, 8 Feb 2019 16:19:58 -0500
> Eric Farman <farman@linux.ibm.com> wrote:
>>
>> FWIW, I've been running with both series for the last week or two, along
>> with some host kernel traces to prove things got executed the way I
>> thought, and it's seemed to be working well.  So that makes me
>> optimistic for the later patches.
> 
> That's good news, thanks for testing. Do you have a special test load
> that you run in the guest that you can share?
> 

Not really.  Lately it's just fio, run via some ancient scripts which 
randomize the input parameters and distill the output data.  If I get 
some time to make it less hack-y it might be worth sharing, but right 
now there's more things commented out than actual script.  :)

  - Eric


* Re: [PATCH v3 4/6] vfio-ccw: add capabilities chain
  2019-01-30 13:22   ` [Qemu-devel] " Cornelia Huck
@ 2019-02-15 15:46     ` Eric Farman
  -1 siblings, 0 replies; 70+ messages in thread
From: Eric Farman @ 2019-02-15 15:46 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic, Farhan Ali, Pierre Morel
  Cc: linux-s390, qemu-s390x, Alex Williamson, qemu-devel, kvm



On 01/30/2019 08:22 AM, Cornelia Huck wrote:
> Allow to extend the regions used by vfio-ccw. The first user will be
> handling of halt and clear subchannel.
> 
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> ---
>   drivers/s390/cio/vfio_ccw_ops.c     | 181 ++++++++++++++++++++++++----
>   drivers/s390/cio/vfio_ccw_private.h |  38 ++++++
>   include/uapi/linux/vfio.h           |   2 +
>   3 files changed, 195 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
> index 025c8a832bc8..48b2d7930ea8 100644
> --- a/drivers/s390/cio/vfio_ccw_ops.c
> +++ b/drivers/s390/cio/vfio_ccw_ops.c
> @@ -3,9 +3,11 @@
>    * Physical device callbacks for vfio_ccw
>    *
>    * Copyright IBM Corp. 2017
> + * Copyright Red Hat, Inc. 2019
>    *
>    * Author(s): Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
>    *            Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
> + *            Cornelia Huck <cohuck@redhat.com>
>    */
>   
>   #include <linux/vfio.h>
> @@ -157,27 +159,33 @@ static void vfio_ccw_mdev_release(struct mdev_device *mdev)
>   {
>   	struct vfio_ccw_private *private =
>   		dev_get_drvdata(mdev_parent_dev(mdev));
> +	int i;
>   
>   	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
>   				 &private->nb);
> +
> +	for (i = 0; i < private->num_regions; i++)
> +		private->region[i].ops->release(private, &private->region[i]);
> +
> +	private->num_regions = 0;
> +	kfree(private->region);
> +	private->region = NULL;
>   }
>   
> -static ssize_t vfio_ccw_mdev_read(struct mdev_device *mdev,
> -				  char __user *buf,
> -				  size_t count,
> -				  loff_t *ppos)
> +static ssize_t vfio_ccw_mdev_read_io_region(struct vfio_ccw_private *private,
> +					    char __user *buf, size_t count,
> +					    loff_t *ppos)
>   {
> -	struct vfio_ccw_private *private;
> +	loff_t pos = *ppos & VFIO_CCW_OFFSET_MASK;
>   	struct ccw_io_region *region;
>   	int ret;
>   
> -	if (*ppos + count > sizeof(*region))
> +	if (pos + count > sizeof(*region))
>   		return -EINVAL;
>   
> -	private = dev_get_drvdata(mdev_parent_dev(mdev));
>   	mutex_lock(&private->io_mutex);
>   	region = private->io_region;
> -	if (copy_to_user(buf, (void *)region + *ppos, count))
> +	if (copy_to_user(buf, (void *)region + pos, count))
>   		ret = -EFAULT;
>   	else
>   		ret = count;
> @@ -185,24 +193,47 @@ static ssize_t vfio_ccw_mdev_read(struct mdev_device *mdev,
>   	return ret;
>   }
>   
> -static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
> -				   const char __user *buf,
> -				   size_t count,
> -				   loff_t *ppos)
> +static ssize_t vfio_ccw_mdev_read(struct mdev_device *mdev,
> +				  char __user *buf,
> +				  size_t count,
> +				  loff_t *ppos)
>   {
> +	unsigned int index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
>   	struct vfio_ccw_private *private;
> +
> +	private = dev_get_drvdata(mdev_parent_dev(mdev));
> +
> +	if (index >= VFIO_CCW_NUM_REGIONS + private->num_regions)
> +		return -EINVAL;
> +
> +	switch (index) {
> +	case VFIO_CCW_CONFIG_REGION_INDEX:
> +		return vfio_ccw_mdev_read_io_region(private, buf, count, ppos);
> +	default:
> +		index -= VFIO_CCW_NUM_REGIONS;
> +		return private->region[index].ops->read(private, buf, count,
> +							ppos);
> +	}
> +
> +	return -EINVAL;
> +}
> +
> +static ssize_t vfio_ccw_mdev_write_io_region(struct vfio_ccw_private *private,
> +					     const char __user *buf,
> +					     size_t count, loff_t *ppos)
> +{
> +	loff_t pos = *ppos & VFIO_CCW_OFFSET_MASK;
>   	struct ccw_io_region *region;
>   	int ret;
>   
> -	if (*ppos + count > sizeof(*region))
> +	if (pos + count > sizeof(*region))
>   		return -EINVAL;
>   
> -	private = dev_get_drvdata(mdev_parent_dev(mdev));
>   	if (!mutex_trylock(&private->io_mutex))
>   		return -EAGAIN;
>   
>   	region = private->io_region;
> -	if (copy_from_user((void *)region + *ppos, buf, count)) {
> +	if (copy_from_user((void *)region + pos, buf, count)) {
>   		ret = -EFAULT;
>   		goto out_unlock;
>   	}
> @@ -217,19 +248,52 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
>   	return ret;
>   }
>   
> -static int vfio_ccw_mdev_get_device_info(struct vfio_device_info *info)
> +static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
> +				   const char __user *buf,
> +				   size_t count,
> +				   loff_t *ppos)
> +{
> +	unsigned int index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
> +	struct vfio_ccw_private *private;
> +
> +	private = dev_get_drvdata(mdev_parent_dev(mdev));
> +
> +	if (index >= VFIO_CCW_NUM_REGIONS + private->num_regions)
> +		return -EINVAL;
> +
> +	switch (index) {
> +	case VFIO_CCW_CONFIG_REGION_INDEX:
> +		return vfio_ccw_mdev_write_io_region(private, buf, count, ppos);
> +	default:
> +		index -= VFIO_CCW_NUM_REGIONS;
> +		return private->region[index].ops->write(private, buf, count,
> +							 ppos);
> +	}
> +
> +	return -EINVAL;
> +}
> +
> +static int vfio_ccw_mdev_get_device_info(struct vfio_device_info *info,
> +					 struct mdev_device *mdev)
>   {
> +	struct vfio_ccw_private *private;
> +
> +	private = dev_get_drvdata(mdev_parent_dev(mdev));
>   	info->flags = VFIO_DEVICE_FLAGS_CCW | VFIO_DEVICE_FLAGS_RESET;
> -	info->num_regions = VFIO_CCW_NUM_REGIONS;
> +	info->num_regions = VFIO_CCW_NUM_REGIONS + private->num_regions;
>   	info->num_irqs = VFIO_CCW_NUM_IRQS;
>   
>   	return 0;
>   }
>   
>   static int vfio_ccw_mdev_get_region_info(struct vfio_region_info *info,
> -					 u16 *cap_type_id,
> -					 void **cap_type)
> +					 struct mdev_device *mdev,
> +					 unsigned long arg)
>   {
> +	struct vfio_ccw_private *private;
> +	int i;
> +
> +	private = dev_get_drvdata(mdev_parent_dev(mdev));
>   	switch (info->index) {
>   	case VFIO_CCW_CONFIG_REGION_INDEX:
>   		info->offset = 0;
> @@ -237,9 +301,51 @@ static int vfio_ccw_mdev_get_region_info(struct vfio_region_info *info,
>   		info->flags = VFIO_REGION_INFO_FLAG_READ
>   			      | VFIO_REGION_INFO_FLAG_WRITE;
>   		return 0;
> -	default:
> -		return -EINVAL;
> +	default: /* all other regions are handled via capability chain */
> +	{
> +		struct vfio_info_cap caps = { .buf = NULL, .size = 0 };
> +		struct vfio_region_info_cap_type cap_type = {
> +			.header.id = VFIO_REGION_INFO_CAP_TYPE,
> +			.header.version = 1 };
> +		int ret;
> +
> +		if (info->index >=
> +		    VFIO_CCW_NUM_REGIONS + private->num_regions)
> +			return -EINVAL;

I notice the similarity of this hunk to drivers/vfio/pci/vfio_pci.c ... 
While I was trying to discern the likelihood/possibility/usefulness of 
combining the two, I noticed that there is one difference at this point 
in the other file, which was added by commit 0e714d27786c ("vfio/pci: 
Fix potential Spectre v1")

This got me off on a tangent of setting up smatch in my environment, and 
sure enough it flags this point [1] as being problematic:

drivers/s390/cio/vfio_ccw_ops.c:328 vfio_ccw_mdev_get_region_info() 
warn: potential spectre issue 'private->region' [r]

Might need to consider the same?  (And lends credence to my concern 
about the capability chain code being duplicated.)

> +
> +		i = info->index - VFIO_CCW_NUM_REGIONS;
> +
> +		info->offset = VFIO_CCW_INDEX_TO_OFFSET(info->index);
> +		info->size = private->region[i].size;

[1] smatch actually points to this line, though the referenced commit 
inserts a line up there.

> +		info->flags = private->region[i].flags;
> +
> +		cap_type.type = private->region[i].type;
> +		cap_type.subtype = private->region[i].subtype;
> +
> +		ret = vfio_info_add_capability(&caps, &cap_type.header,
> +					       sizeof(cap_type));
> +		if (ret)
> +			return ret;
> +
> +		info->flags |= VFIO_REGION_INFO_FLAG_CAPS;
> +		if (info->argsz < sizeof(*info) + caps.size) {
> +			info->argsz = sizeof(*info) + caps.size;
> +			info->cap_offset = 0;
> +		} else {
> +			vfio_info_cap_shift(&caps, sizeof(*info));
> +			if (copy_to_user((void __user *)arg + sizeof(*info),
> +					 caps.buf, caps.size)) {
> +				kfree(caps.buf);
> +				return -EFAULT;
> +			}
> +			info->cap_offset = sizeof(*info);
> +		}
> +
> +		kfree(caps.buf);
> +
> +	}
>   	}
> +	return 0;
>   }
>   
>   static int vfio_ccw_mdev_get_irq_info(struct vfio_irq_info *info)
> @@ -316,6 +422,32 @@ static int vfio_ccw_mdev_set_irqs(struct mdev_device *mdev,
>   	}
>   }
>   
> +int vfio_ccw_register_dev_region(struct vfio_ccw_private *private,
> +				 unsigned int subtype,
> +				 const struct vfio_ccw_regops *ops,
> +				 size_t size, u32 flags, void *data)
> +{
> +	struct vfio_ccw_region *region;
> +
> +	region = krealloc(private->region,
> +			  (private->num_regions + 1) * sizeof(*region),
> +			  GFP_KERNEL);
> +	if (!region)
> +		return -ENOMEM;
> +
> +	private->region = region;
> +	private->region[private->num_regions].type = VFIO_REGION_TYPE_CCW;
> +	private->region[private->num_regions].subtype = subtype;
> +	private->region[private->num_regions].ops = ops;
> +	private->region[private->num_regions].size = size;
> +	private->region[private->num_regions].flags = flags;
> +	private->region[private->num_regions].data = data;
> +
> +	private->num_regions++;
> +
> +	return 0;
> +}
> +
>   static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
>   				   unsigned int cmd,
>   				   unsigned long arg)
> @@ -336,7 +468,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
>   		if (info.argsz < minsz)
>   			return -EINVAL;
>   
> -		ret = vfio_ccw_mdev_get_device_info(&info);
> +		ret = vfio_ccw_mdev_get_device_info(&info, mdev);
>   		if (ret)
>   			return ret;
>   
> @@ -345,8 +477,6 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
>   	case VFIO_DEVICE_GET_REGION_INFO:
>   	{
>   		struct vfio_region_info info;
> -		u16 cap_type_id = 0;
> -		void *cap_type = NULL;
>   
>   		minsz = offsetofend(struct vfio_region_info, offset);
>   
> @@ -356,8 +486,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
>   		if (info.argsz < minsz)
>   			return -EINVAL;
>   
> -		ret = vfio_ccw_mdev_get_region_info(&info, &cap_type_id,
> -						    &cap_type);
> +		ret = vfio_ccw_mdev_get_region_info(&info, mdev, arg);
>   		if (ret)
>   			return ret;
>   
> diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
> index 32173cbd838d..c979eb32fb1c 100644
> --- a/drivers/s390/cio/vfio_ccw_private.h
> +++ b/drivers/s390/cio/vfio_ccw_private.h
> @@ -3,9 +3,11 @@
>    * Private stuff for vfio_ccw driver
>    *
>    * Copyright IBM Corp. 2017
> + * Copyright Red Hat, Inc. 2019
>    *
>    * Author(s): Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
>    *            Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
> + *            Cornelia Huck <cohuck@redhat.com>
>    */
>   
>   #ifndef _VFIO_CCW_PRIVATE_H_
> @@ -19,6 +21,38 @@
>   #include "css.h"
>   #include "vfio_ccw_cp.h"
>   
> +#define VFIO_CCW_OFFSET_SHIFT   40
> +#define VFIO_CCW_OFFSET_TO_INDEX(off)	(off >> VFIO_CCW_OFFSET_SHIFT)
> +#define VFIO_CCW_INDEX_TO_OFFSET(index)	((u64)(index) << VFIO_CCW_OFFSET_SHIFT)
> +#define VFIO_CCW_OFFSET_MASK	(((u64)(1) << VFIO_CCW_OFFSET_SHIFT) - 1)

I know Farhan asked this back in v1, but I'd still love a better answer 
than "vfio-pci did this" to what this is.  There's a lot more regions 
prior to the capability chain in vfio-pci than here (9 versus 1), so I'd 
like to be certain it's not related to that.

> +
> +/* capability chain handling similar to vfio-pci */
> +struct vfio_ccw_private;
> +struct vfio_ccw_region;
> +
> +struct vfio_ccw_regops {
> +	ssize_t	(*read)(struct vfio_ccw_private *private, char __user *buf,
> +			size_t count, loff_t *ppos);
> +	ssize_t	(*write)(struct vfio_ccw_private *private,
> +			 const char __user *buf, size_t count, loff_t *ppos);
> +	void	(*release)(struct vfio_ccw_private *private,
> +			   struct vfio_ccw_region *region);
> +};
> +
> +struct vfio_ccw_region {
> +	u32				type;
> +	u32				subtype;
> +	const struct vfio_ccw_regops	*ops;
> +	void				*data;
> +	size_t				size;
> +	u32				flags;
> +};
> +
> +int vfio_ccw_register_dev_region(struct vfio_ccw_private *private,
> +				 unsigned int subtype,
> +				 const struct vfio_ccw_regops *ops,
> +				 size_t size, u32 flags, void *data);
> +
>   /**
>    * struct vfio_ccw_private
>    * @sch: pointer to the subchannel
> @@ -29,6 +63,8 @@
>    * @nb: notifier for vfio events
>    * @io_region: MMIO region to input/output I/O arguments/results
>    * @io_mutex: protect against concurrent update of I/O regions
> + * @region: additional regions for other subchannel operations
> + * @num_regions: number of additional regions
>    * @cp: channel program for the current I/O operation
>    * @irb: irb info received from interrupt
>    * @scsw: scsw info
> @@ -44,6 +80,8 @@ struct vfio_ccw_private {
>   	struct notifier_block	nb;
>   	struct ccw_io_region	*io_region;
>   	struct mutex		io_mutex;
> +	struct vfio_ccw_region *region;
> +	int num_regions;
>   
>   	struct channel_program	cp;
>   	struct irb		irb;
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 02bb7ad6e986..56e2413d3e00 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -353,6 +353,8 @@ struct vfio_region_gfx_edid {
>   #define VFIO_DEVICE_GFX_LINK_STATE_DOWN  2
>   };
>   
> +#define VFIO_REGION_TYPE_CCW			(2)
> +

I'm not sure if this should be here to keep it in its own area (esp. for 
when patch 6 comes along), or with VFIO_REGION_TYPE_GFX to make it 
noticeable where we are in the list without grepping for 
VFIO_REGION_TYPE.  I guess it's just what it is, even if I'm not 
thrilled about it.

>   /*
>    * 10de vendor sub-type
>    *
> 

This generally looks sane to me, even though I can't get past the idea 
that there's opportunities for improvement between the two.  Maybe 
that's refactoring for a day when someone is bored.  ;-)

  - Eric


> +		ret = vfio_ccw_mdev_get_device_info(&info, mdev);
>   		if (ret)
>   			return ret;
>   
> @@ -345,8 +477,6 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
>   	case VFIO_DEVICE_GET_REGION_INFO:
>   	{
>   		struct vfio_region_info info;
> -		u16 cap_type_id = 0;
> -		void *cap_type = NULL;
>   
>   		minsz = offsetofend(struct vfio_region_info, offset);
>   
> @@ -356,8 +486,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
>   		if (info.argsz < minsz)
>   			return -EINVAL;
>   
> -		ret = vfio_ccw_mdev_get_region_info(&info, &cap_type_id,
> -						    &cap_type);
> +		ret = vfio_ccw_mdev_get_region_info(&info, mdev, arg);
>   		if (ret)
>   			return ret;
>   
> diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
> index 32173cbd838d..c979eb32fb1c 100644
> --- a/drivers/s390/cio/vfio_ccw_private.h
> +++ b/drivers/s390/cio/vfio_ccw_private.h
> @@ -3,9 +3,11 @@
>    * Private stuff for vfio_ccw driver
>    *
>    * Copyright IBM Corp. 2017
> + * Copyright Red Hat, Inc. 2019
>    *
>    * Author(s): Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
>    *            Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
> + *            Cornelia Huck <cohuck@redhat.com>
>    */
>   
>   #ifndef _VFIO_CCW_PRIVATE_H_
> @@ -19,6 +21,38 @@
>   #include "css.h"
>   #include "vfio_ccw_cp.h"
>   
> +#define VFIO_CCW_OFFSET_SHIFT   40
> +#define VFIO_CCW_OFFSET_TO_INDEX(off)	(off >> VFIO_CCW_OFFSET_SHIFT)
> +#define VFIO_CCW_INDEX_TO_OFFSET(index)	((u64)(index) << VFIO_CCW_OFFSET_SHIFT)
> +#define VFIO_CCW_OFFSET_MASK	(((u64)(1) << VFIO_CCW_OFFSET_SHIFT) - 1)

I know Farhan asked this back in v1, but I'd still love a better answer 
than "vfio-pci did this" to what this is.  There's a lot more regions 
prior to the capability chain in vfio-pci than here (9 versus 1), so I'd 
like to be certain it's not related to that.

> +
> +/* capability chain handling similar to vfio-pci */
> +struct vfio_ccw_private;
> +struct vfio_ccw_region;
> +
> +struct vfio_ccw_regops {
> +	ssize_t	(*read)(struct vfio_ccw_private *private, char __user *buf,
> +			size_t count, loff_t *ppos);
> +	ssize_t	(*write)(struct vfio_ccw_private *private,
> +			 const char __user *buf, size_t count, loff_t *ppos);
> +	void	(*release)(struct vfio_ccw_private *private,
> +			   struct vfio_ccw_region *region);
> +};
> +
> +struct vfio_ccw_region {
> +	u32				type;
> +	u32				subtype;
> +	const struct vfio_ccw_regops	*ops;
> +	void				*data;
> +	size_t				size;
> +	u32				flags;
> +};
> +
> +int vfio_ccw_register_dev_region(struct vfio_ccw_private *private,
> +				 unsigned int subtype,
> +				 const struct vfio_ccw_regops *ops,
> +				 size_t size, u32 flags, void *data);
> +
>   /**
>    * struct vfio_ccw_private
>    * @sch: pointer to the subchannel
> @@ -29,6 +63,8 @@
>    * @nb: notifier for vfio events
>    * @io_region: MMIO region to input/output I/O arguments/results
>    * @io_mutex: protect against concurrent update of I/O regions
> + * @region: additional regions for other subchannel operations
> + * @num_regions: number of additional regions
>    * @cp: channel program for the current I/O operation
>    * @irb: irb info received from interrupt
>    * @scsw: scsw info
> @@ -44,6 +80,8 @@ struct vfio_ccw_private {
>   	struct notifier_block	nb;
>   	struct ccw_io_region	*io_region;
>   	struct mutex		io_mutex;
> +	struct vfio_ccw_region *region;
> +	int num_regions;
>   
>   	struct channel_program	cp;
>   	struct irb		irb;
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 02bb7ad6e986..56e2413d3e00 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -353,6 +353,8 @@ struct vfio_region_gfx_edid {
>   #define VFIO_DEVICE_GFX_LINK_STATE_DOWN  2
>   };
>   
> +#define VFIO_REGION_TYPE_CCW			(2)
> +

I'm not sure if this should be here to keep it in its own area (esp. for 
when patch 6 comes along), or with VFIO_REGION_TYPE_GFX to make it 
noticeable where we are in the list without grepping for 
VFIO_REGION_TYPE.  I guess it's just what it is, even if I'm not 
thrilled about it.

>   /*
>    * 10de vendor sub-type
>    *
> 

This generally looks sane to me, even though I can't get past the idea 
that there's opportunities for improvement between the two.  Maybe 
that's refactoring for a day when someone is bored.  ;-)

  - Eric

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 4/6] vfio-ccw: add capabilities chain
  2019-02-15 15:46     ` [Qemu-devel] " Eric Farman
@ 2019-02-19 11:06       ` Cornelia Huck
  -1 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2019-02-19 11:06 UTC (permalink / raw)
  To: Eric Farman
  Cc: linux-s390, Alex Williamson, Pierre Morel, kvm, Farhan Ali,
	qemu-devel, Halil Pasic, qemu-s390x

On Fri, 15 Feb 2019 10:46:08 -0500
Eric Farman <farman@linux.ibm.com> wrote:

> On 01/30/2019 08:22 AM, Cornelia Huck wrote:
> > Allow to extend the regions used by vfio-ccw. The first user will be
> > handling of halt and clear subchannel.
> > 
> > Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> > ---
> >   drivers/s390/cio/vfio_ccw_ops.c     | 181 ++++++++++++++++++++++++----
> >   drivers/s390/cio/vfio_ccw_private.h |  38 ++++++
> >   include/uapi/linux/vfio.h           |   2 +
> >   3 files changed, 195 insertions(+), 26 deletions(-)

(...)

> > @@ -237,9 +301,51 @@ static int vfio_ccw_mdev_get_region_info(struct vfio_region_info *info,
> >   		info->flags = VFIO_REGION_INFO_FLAG_READ
> >   			      | VFIO_REGION_INFO_FLAG_WRITE;
> >   		return 0;
> > -	default:
> > -		return -EINVAL;
> > +	default: /* all other regions are handled via capability chain */
> > +	{
> > +		struct vfio_info_cap caps = { .buf = NULL, .size = 0 };
> > +		struct vfio_region_info_cap_type cap_type = {
> > +			.header.id = VFIO_REGION_INFO_CAP_TYPE,
> > +			.header.version = 1 };
> > +		int ret;
> > +
> > +		if (info->index >=
> > +		    VFIO_CCW_NUM_REGIONS + private->num_regions)
> > +			return -EINVAL;  
> 
> I notice the similarity of this hunk to drivers/vfio/pci/vfio_pci.c ... 
> While I was trying to discern the likelihood/possibility/usefulness of 
> combining the two, I noticed that there is one difference at this point 
> in the other file, which was added by commit 0e714d27786c ("vfio/pci: 
> Fix potential Spectre v1")
> 
> This got me off on a tangent of setting up smatch in my environment, and 
> sure enough it flags this point [1] as being problematic:
> 
> drivers/s390/cio/vfio_ccw_ops.c:328 vfio_ccw_mdev_get_region_info() 
> warn: potential spectre issue 'private->region' [r]

This makes sense, added.

> 
> Might need to consider the same?  (And lends credence to my concern 
> about the capability chain code being duplicated.)

Yeah, there's definitely duplication there. I initially tried to make
this use some common infrastructure, but I remember that it was harder
than it looked and that I stopped trying (don't remember the details,
sorry).

> 
> > +
> > +		i = info->index - VFIO_CCW_NUM_REGIONS;
> > +
> > +		info->offset = VFIO_CCW_INDEX_TO_OFFSET(info->index);
> > +		info->size = private->region[i].size;  
> 
> [1] smatch actually points to this line, though the referenced commit 
> inserts a line up there.
> 
> > +		info->flags = private->region[i].flags;
> > +
> > +		cap_type.type = private->region[i].type;
> > +		cap_type.subtype = private->region[i].subtype;
> > +
> > +		ret = vfio_info_add_capability(&caps, &cap_type.header,
> > +					       sizeof(cap_type));
> > +		if (ret)
> > +			return ret;
> > +
> > +		info->flags |= VFIO_REGION_INFO_FLAG_CAPS;
> > +		if (info->argsz < sizeof(*info) + caps.size) {
> > +			info->argsz = sizeof(*info) + caps.size;
> > +			info->cap_offset = 0;
> > +		} else {
> > +			vfio_info_cap_shift(&caps, sizeof(*info));
> > +			if (copy_to_user((void __user *)arg + sizeof(*info),
> > +					 caps.buf, caps.size)) {
> > +				kfree(caps.buf);
> > +				return -EFAULT;
> > +			}
> > +			info->cap_offset = sizeof(*info);
> > +		}
> > +
> > +		kfree(caps.buf);
> > +
> > +	}
> >   	}
> > +	return 0;
> >   }

(...)

> > @@ -19,6 +21,38 @@
> >   #include "css.h"
> >   #include "vfio_ccw_cp.h"
> >   
> > +#define VFIO_CCW_OFFSET_SHIFT   40
> > +#define VFIO_CCW_OFFSET_TO_INDEX(off)	(off >> VFIO_CCW_OFFSET_SHIFT)
> > +#define VFIO_CCW_INDEX_TO_OFFSET(index)	((u64)(index) << VFIO_CCW_OFFSET_SHIFT)
> > +#define VFIO_CCW_OFFSET_MASK	(((u64)(1) << VFIO_CCW_OFFSET_SHIFT) - 1)  
> 
> I know Farhan asked this back in v1, but I'd still love a better answer 
> than "vfio-pci did this" to what this is.  There's a lot more regions 
> prior to the capability chain in vfio-pci than here (9 versus 1), so I'd 
> like to be certain it's not related to that.

If we assume that we'll only add new regions via the capability chain
(and I think we can assume that), we can probably change that value. I
tried with a value of 10 (should be enough) and things still seem to
work, so that might be a nice, round value.

(...)

> > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > index 02bb7ad6e986..56e2413d3e00 100644
> > --- a/include/uapi/linux/vfio.h
> > +++ b/include/uapi/linux/vfio.h
> > @@ -353,6 +353,8 @@ struct vfio_region_gfx_edid {
> >   #define VFIO_DEVICE_GFX_LINK_STATE_DOWN  2
> >   };
> >   
> > +#define VFIO_REGION_TYPE_CCW			(2)
> > +  
> 
> I'm not sure if this should be here to keep it in its own area (esp. for 
> when patch 6 comes along), or with VFIO_REGION_TYPE_GFX to make it 
> noticeable where we are in the list without grepping for 
> VFIO_REGION_TYPE.  I guess it's just what it is, even if I'm not 
> thrilled about it.

I'm not really sure where it makes the most sense to put it, TBH. Maybe
it should be moved below the recently added nvlink stuff? My idea was
to keep the subtype (added in patch 6) close to the type; but they can
easily move to a different place in the file.

> 
> >   /*
> >    * 10de vendor sub-type
> >    *
> >   
> 
> This generally looks sane to me, even though I can't get past the idea 
> that there's opportunities for improvement between the two.  Maybe 
> that's refactoring for a day when someone is bored.  ;-)

Yeah, if someone has time, they could try to refactor :) I'm not sure
what made it complicated when I first tried it, maybe I should try
again ;)

Thanks for looking!

^ permalink raw reply	[flat|nested] 70+ messages in thread

end of thread, other threads:[~2019-02-19 11:06 UTC | newest]

Thread overview: 70+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-30 13:22 [PATCH v3 0/6] vfio-ccw: support hsch/csch (kernel part) Cornelia Huck
2019-01-30 13:22 ` [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs Cornelia Huck
2019-01-30 18:51   ` Halil Pasic
2019-01-31 11:52     ` Cornelia Huck
2019-01-31 12:34       ` Halil Pasic
2019-02-04 15:31         ` Cornelia Huck
2019-02-05 11:52           ` Halil Pasic
2019-02-05 12:35             ` Cornelia Huck
2019-02-05 14:48               ` Eric Farman
2019-02-05 15:14                 ` Farhan Ali
2019-02-05 16:13                   ` Cornelia Huck
2019-02-04 19:25   ` Eric Farman
2019-02-05 12:03     ` Cornelia Huck
2019-02-05 14:41       ` Eric Farman
2019-02-05 16:29         ` Cornelia Huck
2019-01-30 13:22 ` [PATCH v3 2/6] vfio-ccw: rework ssch state handling Cornelia Huck
2019-02-04 21:29   ` Eric Farman
2019-02-05 12:10     ` Cornelia Huck
2019-02-05 14:31       ` Eric Farman
2019-02-05 16:32         ` Cornelia Huck
2019-01-30 13:22 ` [PATCH v3 3/6] vfio-ccw: protect the I/O region Cornelia Huck
2019-02-08 21:26   ` Eric Farman
2019-02-11 15:57     ` Cornelia Huck
2019-01-30 13:22 ` [PATCH v3 4/6] vfio-ccw: add capabilities chain Cornelia Huck
2019-02-15 15:46   ` Eric Farman
2019-02-19 11:06     ` Cornelia Huck
2019-01-30 13:22 ` [PATCH v3 5/6] s390/cio: export hsch to modules Cornelia Huck
2019-01-30 13:22 ` [PATCH v3 6/6] vfio-ccw: add handling for async channel instructions Cornelia Huck
2019-01-30 17:00   ` Halil Pasic
2019-01-30 17:09   ` Halil Pasic
2019-01-31 11:53     ` Cornelia Huck
2019-02-06 14:00 ` [PATCH v3 0/6] vfio-ccw: support hsch/csch (kernel part) Cornelia Huck
2019-02-08 21:19   ` Eric Farman
2019-02-11 16:13     ` Cornelia Huck
2019-02-11 17:37       ` Eric Farman
