kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR
@ 2020-06-16 19:50 Eric Farman
  2020-06-16 19:50 ` [RFC PATCH v3 1/3] vfio-ccw: Indicate if a channel_program is started Eric Farman
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Eric Farman @ 2020-06-16 19:50 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Jared Rossi, Halil Pasic, linux-s390, kvm, Eric Farman

Let's continue our discussion of the handling of vfio-ccw interrupts.

The initial fix [1] relied upon the interrupt path's examination of the
FSM state, and freeing all resources if it were CP_PENDING. But the
interface used by HALT/CLEAR SUBCHANNEL doesn't affect the FSM state.
Consider this sequence:

    CPU 1                           CPU 2
    CLEAR (state=IDLE/no change)
                                    START [2]
    INTERRUPT (set state=IDLE)
                                    INTERRUPT (set state=IDLE)

This translates to a couple of possible scenarios:

 A) The START gets a cc2 because of the outstanding CLEAR, -EBUSY is
    returned, resources are freed, and state remains IDLE
 B) The START gets a cc0 because the CLEAR has already presented an
    interrupt, and state is set to CP_PENDING

If the START gets a cc0 before the CLEAR INTERRUPT (stacked onto a
workqueue by the IRQ context) gets a chance to run, then the INTERRUPT
will release the channel program memory prematurely. If the two
operations run concurrently, then the FSM state set to CP_PROCESSING
will prevent the cp_free() from being invoked. But the io_mutex
boundary on that path will pause itself until the START completes,
and then allow the FSM to be reset to IDLE without considering the
outstanding START. Neither scenario would be considered good.

Having said all of that, in v2 Conny suggested [3] the following:

> - Detach the cp from the subchannel (or better, remove the 1:1
>   relationship). By that I mean building the cp as a separately
>   allocated structure (maybe embedding a kref, but that might not be
>   needed), and appending it to a list after SSCH with cc=0. Discard it
>   if cc!=0.
> - Remove the CP_PENDING state. The state is either IDLE after any
>   successful SSCH/HSCH/CSCH, or a new state in that case. But no
>   special state for SSCH.
> - A successful CSCH removes the first queued request, if any.
> - A final interrupt removes the first queued request, if any.

What I have implemented here is basically this, with a few changes:

 - I don't queue cp's. Since there should only be one START in process
   at a time, and HALT/CLEAR doesn't build a cp, I didn't see a pressing
   need to introduce that complexity.
 - Furthermore, while I initially made a separately allocated cp, adding
   an alloc for a cp on each I/O AND moving the guest_cp alloc from the
   probe path to the I/O path seems excessive. So I implemented a
   "started" flag to the cp, set after a cc0 from the START, and examine
   that on the interrupt path to determine whether cp_free() is needed.
 - I opted against a "SOMETHING_PENDING" state if START/HALT/CLEAR
   got a cc0, and just put the FSM back to IDLE. It becomes too unwieldy
   to discern which operation an interrupt is completing, and whether
   more interrupts are expected, to be worth the additional state.
 - A successful CSCH doesn't do anything special, and cp_free()
   is only performed on the interrupt path. Part of me wrestled with
   how a HALT fits into that, but mostly it was that a cc0 on any
   of the instructions indicated the "channel subsystem is signaled
   to asynchronously perform the [START/HALT/CLEAR] function."
   This means that an in-flight START could still receive data from the
   device/subchannel, so not a good idea to release memory at that point.

Separate from all that, I added a small check of the io_work queue to
the FSM START path. Part of the problems I've seen was that an interrupt
is presented by a CPU, but not yet processed by vfio-ccw. Some of the
problems seen thus far is because of this gap, and the above changes
don't address that either. Whether this is appropriate or ridiculous
would be a welcome discussion.

Previous versions:
v2: https://lore.kernel.org/kvm/20200513142934.28788-1-farman@linux.ibm.com/
v1: https://lore.kernel.org/kvm/20200124145455.51181-1-farman@linux.ibm.com/

Footnotes:
[1] https://lore.kernel.org/kvm/62e87bf67b38dc8d5760586e7c96d400db854ebe.1562854091.git.alifm@linux.ibm.com/
[2] Halil has pointed out that QEMU should prohibit this, based on the
    rules set forth by the POPs. This is true, but we should not rely on
    it behaving properly without addressing this scenario that is visible
    today. Once I get this behaving correctly, I'll spend some time
    seeing if QEMU is misbehaving somehow.
[3] https://lore.kernel.org/kvm/20200518180903.7cb21dd8.cohuck@redhat.com/
[4] https://lore.kernel.org/kvm/a52368d3-8cec-7b99-1587-25e055228b62@linux.ibm.com/

Eric Farman (3):
  vfio-ccw: Indicate if a channel_program is started
  vfio-ccw: Remove the CP_PENDING FSM state
  vfio-ccw: Check workqueue before doing START

 drivers/s390/cio/vfio_ccw_cp.c      |  2 ++
 drivers/s390/cio/vfio_ccw_cp.h      |  1 +
 drivers/s390/cio/vfio_ccw_drv.c     |  5 +----
 drivers/s390/cio/vfio_ccw_fsm.c     | 32 +++++++++++++++++------------
 drivers/s390/cio/vfio_ccw_ops.c     |  3 +--
 drivers/s390/cio/vfio_ccw_private.h |  1 -
 6 files changed, 24 insertions(+), 20 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH v3 1/3] vfio-ccw: Indicate if a channel_program is started
  2020-06-16 19:50 [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
@ 2020-06-16 19:50 ` Eric Farman
  2020-06-17 23:11   ` Halil Pasic
  2020-06-16 19:50 ` [RFC PATCH v3 2/3] vfio-ccw: Remove the CP_PENDING FSM state Eric Farman
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Eric Farman @ 2020-06-16 19:50 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Jared Rossi, Halil Pasic, linux-s390, kvm, Eric Farman

The interrupt path checks the FSM state when processing a final interrupt
(an interrupt that is neither subchannel active, nor device active),
to determine whether to call cp_free() and release the associated memory.
But, this does not fully close the window where a START comes in after a
HALT/CLEAR. If the START runs while the CLEAR interrupt is being processed,
the channel program struct will be allocated while the interrupt would be
considering whether or not to free it. If the FSM state is CP_PROCESSING,
then everything is fine. But if the START is able to issue its SSCH and get
a cc0, then the in-flight interrupt would have been for an unrelated
operation (perhaps none, if the subchannel was previously idle).

The channel_program struct has an "initialized" flag that is set early
in the fsm_io_request() flow, to simplify the various cp_*() accessors.
Let's extend this idea to include a "started" flag that announces that the
channel program has successfully been issued to hardware. With this, the
interrupt path can determine whether the final interrupt should also
release the cp resources instead of relying on a transient FSM state.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c  |  2 ++
 drivers/s390/cio/vfio_ccw_cp.h  |  1 +
 drivers/s390/cio/vfio_ccw_drv.c |  2 +-
 drivers/s390/cio/vfio_ccw_fsm.c | 11 +++++++++++
 4 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index b9febc581b1f..7748eeef434e 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -657,6 +657,7 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)
 
 	if (!ret) {
 		cp->initialized = true;
+		cp->started = false;
 
 		/* It is safe to force: if it was not set but idals used
 		 * ccwchain_calc_length would have returned an error.
@@ -685,6 +686,7 @@ void cp_free(struct channel_program *cp)
 		return;
 
 	cp->initialized = false;
+	cp->started = false;
 	list_for_each_entry_safe(chain, temp, &cp->ccwchain_list, next) {
 		for (i = 0; i < chain->ch_len; i++) {
 			pfn_array_unpin_free(chain->ch_pa + i, cp->mdev);
diff --git a/drivers/s390/cio/vfio_ccw_cp.h b/drivers/s390/cio/vfio_ccw_cp.h
index ba31240ce965..7ea14910aaaa 100644
--- a/drivers/s390/cio/vfio_ccw_cp.h
+++ b/drivers/s390/cio/vfio_ccw_cp.h
@@ -39,6 +39,7 @@ struct channel_program {
 	union orb orb;
 	struct device *mdev;
 	bool initialized;
+	bool started;
 	struct ccw1 *guest_cp;
 };
 
diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 8c625b530035..7e2a790dc9a1 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -94,7 +94,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
 		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
 	if (scsw_is_solicited(&irb->scsw)) {
 		cp_update_scsw(&private->cp, &irb->scsw);
-		if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING)
+		if (is_final && private->cp.started)
 			cp_free(&private->cp);
 	}
 	mutex_lock(&private->io_mutex);
diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index 23e61aa638e4..d806f88eba72 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -50,6 +50,7 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
 		sch->schib.scsw.cmd.actl |= SCSW_ACTL_START_PEND;
 		ret = 0;
 		private->state = VFIO_CCW_STATE_CP_PENDING;
+		private->cp.started = true;
 		break;
 	case 1:		/* Status pending */
 	case 2:		/* Busy */
@@ -246,6 +247,16 @@ static void fsm_io_request(struct vfio_ccw_private *private,
 	char *errstr = "request";
 	struct subchannel_id schid = get_schid(private);
 
+	if (private->cp.started) {
+		io_region->ret_code = -EBUSY;
+		VFIO_CCW_MSG_EVENT(2,
+				   "%pUl (%x.%x.%04x): busy\n",
+				   mdev_uuid(mdev), schid.cssid,
+				   schid.ssid, schid.sch_no);
+		errstr = "busy";
+		goto err_out;
+	}
+
 	private->state = VFIO_CCW_STATE_CP_PROCESSING;
 	memcpy(scsw, io_region->scsw_area, sizeof(*scsw));
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH v3 2/3] vfio-ccw: Remove the CP_PENDING FSM state
  2020-06-16 19:50 [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
  2020-06-16 19:50 ` [RFC PATCH v3 1/3] vfio-ccw: Indicate if a channel_program is started Eric Farman
@ 2020-06-16 19:50 ` Eric Farman
  2020-06-16 19:50 ` [RFC PATCH v3 3/3] vfio-ccw: Check workqueue before doing START Eric Farman
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Eric Farman @ 2020-06-16 19:50 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Jared Rossi, Halil Pasic, linux-s390, kvm, Eric Farman

The FSM state is set to CP_PROCESSING when a START operation begins,
is set to CP_PENDING when a START operation is done (the SSCH
instruction gets a cc0), and is set back to IDLE when the final
interrupt is received (or if the START fails somehow). However, it is
categorically impossible to distinguish between interrupts when
other instructions can generate "final" interrupts via the async
region, so using this information for any decision-making at the
completion of an I/O is fraught with peril.

We could replace "CP_PENDING" with a generic "OPERATION_PENDING" state,
but it doesn't appear to buy us anything as far as knowing if we get
one interrupt for a START and another for a CLEAR, or if we just get
one interrupt for the CLEAR and the START never got off the ground.

So let's remove that entirely, and just move the FSM back to IDLE
once the START process completes.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_drv.c     |  3 ---
 drivers/s390/cio/vfio_ccw_fsm.c     | 16 +++-------------
 drivers/s390/cio/vfio_ccw_ops.c     |  3 +--
 drivers/s390/cio/vfio_ccw_private.h |  1 -
 4 files changed, 4 insertions(+), 19 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 7e2a790dc9a1..b76c9008871b 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -101,9 +101,6 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
 	memcpy(private->io_region->irb_area, irb, sizeof(*irb));
 	mutex_unlock(&private->io_mutex);
 
-	if (private->mdev && is_final)
-		private->state = VFIO_CCW_STATE_IDLE;
-
 	if (private->io_trigger)
 		eventfd_signal(private->io_trigger, 1);
 }
diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index d806f88eba72..f0952192480e 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -49,7 +49,6 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
 		 */
 		sch->schib.scsw.cmd.actl |= SCSW_ACTL_START_PEND;
 		ret = 0;
-		private->state = VFIO_CCW_STATE_CP_PENDING;
 		private->cp.started = true;
 		break;
 	case 1:		/* Status pending */
@@ -188,12 +187,6 @@ static void fsm_io_error(struct vfio_ccw_private *private,
 	private->io_region->ret_code = -EIO;
 }
 
-static void fsm_io_busy(struct vfio_ccw_private *private,
-			enum vfio_ccw_event event)
-{
-	private->io_region->ret_code = -EBUSY;
-}
-
 static void fsm_io_retry(struct vfio_ccw_private *private,
 			 enum vfio_ccw_event event)
 {
@@ -309,6 +302,7 @@ static void fsm_io_request(struct vfio_ccw_private *private,
 			cp_free(&private->cp);
 			goto err_out;
 		}
+		private->state = VFIO_CCW_STATE_IDLE;
 		return;
 	} else if (scsw->cmd.fctl & SCSW_FCTL_HALT_FUNC) {
 		VFIO_CCW_MSG_EVENT(2,
@@ -331,6 +325,8 @@ static void fsm_io_request(struct vfio_ccw_private *private,
 err_out:
 	trace_vfio_ccw_fsm_io_request(scsw->cmd.fctl, schid,
 				      io_region->ret_code, errstr);
+
+	private->state = VFIO_CCW_STATE_IDLE;
 }
 
 /*
@@ -405,10 +401,4 @@ fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
 		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_retry,
 		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
 	},
-	[VFIO_CCW_STATE_CP_PENDING] = {
-		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
-		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_busy,
-		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_request,
-		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
-	},
 };
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 8b3ed5b45277..0b67c04decee 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -276,8 +276,7 @@ static ssize_t vfio_ccw_mdev_write_io_region(struct vfio_ccw_private *private,
 	}
 
 	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_IO_REQ);
-	if (region->ret_code != 0)
-		private->state = VFIO_CCW_STATE_IDLE;
+
 	ret = (region->ret_code != 0) ? region->ret_code : count;
 
 out_unlock:
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index 8723156b29ea..f592897f1930 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -125,7 +125,6 @@ enum vfio_ccw_state {
 	VFIO_CCW_STATE_STANDBY,
 	VFIO_CCW_STATE_IDLE,
 	VFIO_CCW_STATE_CP_PROCESSING,
-	VFIO_CCW_STATE_CP_PENDING,
 	/* last element! */
 	NR_VFIO_CCW_STATES
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH v3 3/3] vfio-ccw: Check workqueue before doing START
  2020-06-16 19:50 [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
  2020-06-16 19:50 ` [RFC PATCH v3 1/3] vfio-ccw: Indicate if a channel_program is started Eric Farman
  2020-06-16 19:50 ` [RFC PATCH v3 2/3] vfio-ccw: Remove the CP_PENDING FSM state Eric Farman
@ 2020-06-16 19:50 ` Eric Farman
  2020-06-19 11:40   ` Cornelia Huck
  2020-06-17 11:24 ` [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
  2020-06-19 11:21 ` Cornelia Huck
  4 siblings, 1 reply; 11+ messages in thread
From: Eric Farman @ 2020-06-16 19:50 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Jared Rossi, Halil Pasic, linux-s390, kvm, Eric Farman

When an interrupt is received via the IRQ, the bulk of the work is
stacked on a workqueue for later processing. Which means that concurrent
START or a HALT/CLEAR operation (via the async_region) will race with
this process and require some serialization.

Once we have all our locks acquired, let's just look to see if we're
in a window where the process has been started from the IRQ, but not
yet picked up by vfio-ccw to clean up an I/O. If there is, mark the
request as BUSY so it can be redriven after we have a chance to breathe.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_fsm.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index f0952192480e..9dc5b4d549b3 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -28,6 +28,11 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
 
 	spin_lock_irqsave(sch->lock, flags);
 
+	if (work_pending(&private->io_work)) {
+		ret = -EBUSY;
+		goto out;
+	}
+
 	orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
 	if (!orb) {
 		ret = -EIO;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2020-06-16 19:50 [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
                   ` (2 preceding siblings ...)
  2020-06-16 19:50 ` [RFC PATCH v3 3/3] vfio-ccw: Check workqueue before doing START Eric Farman
@ 2020-06-17 11:24 ` Eric Farman
  2020-06-29 14:56   ` Cornelia Huck
  2020-06-19 11:21 ` Cornelia Huck
  4 siblings, 1 reply; 11+ messages in thread
From: Eric Farman @ 2020-06-17 11:24 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Jared Rossi, Halil Pasic, linux-s390, kvm



On 6/16/20 3:50 PM, Eric Farman wrote:
> Let's continue our discussion of the handling of vfio-ccw interrupts.
> 
> The initial fix [1] relied upon the interrupt path's examination of the
> FSM state, and freeing all resources if it were CP_PENDING. But the
> interface used by HALT/CLEAR SUBCHANNEL doesn't affect the FSM state.
> Consider this sequence:
> 
>     CPU 1                           CPU 2
>     CLEAR (state=IDLE/no change)
>                                     START [2]
>     INTERRUPT (set state=IDLE)
>                                     INTERRUPT (set state=IDLE)
> 
> This translates to a couple of possible scenarios:
> 
>  A) The START gets a cc2 because of the outstanding CLEAR, -EBUSY is
>     returned, resources are freed, and state remains IDLE
>  B) The START gets a cc0 because the CLEAR has already presented an
>     interrupt, and state is set to CP_PENDING
> 
> If the START gets a cc0 before the CLEAR INTERRUPT (stacked onto a
> workqueue by the IRQ context) gets a chance to run, then the INTERRUPT
> will release the channel program memory prematurely. If the two
> operations run concurrently, then the FSM state set to CP_PROCESSING
> will prevent the cp_free() from being invoked. But the io_mutex
> boundary on that path will pause itself until the START completes,
> and then allow the FSM to be reset to IDLE without considering the
> outstanding START. Neither scenario would be considered good.
> 
> Having said all of that, in v2 Conny suggested [3] the following:
> 
>> - Detach the cp from the subchannel (or better, remove the 1:1
>>   relationship). By that I mean building the cp as a separately
>>   allocated structure (maybe embedding a kref, but that might not be
>>   needed), and appending it to a list after SSCH with cc=0. Discard it
>>   if cc!=0.
>> - Remove the CP_PENDING state. The state is either IDLE after any
>>   successful SSCH/HSCH/CSCH, or a new state in that case. But no
>>   special state for SSCH.
>> - A successful CSCH removes the first queued request, if any.
>> - A final interrupt removes the first queued request, if any.
> 
> What I have implemented here is basically this, with a few changes:
> 
>  - I don't queue cp's. Since there should only be one START in process
>    at a time, and HALT/CLEAR doesn't build a cp, I didn't see a pressing
>    need to introduce that complexity.
>  - Furthermore, while I initially made a separately allocated cp, adding
>    an alloc for a cp on each I/O AND moving the guest_cp alloc from the
>    probe path to the I/O path seems excessive. So I implemented a
>    "started" flag to the cp, set after a cc0 from the START, and examine
>    that on the interrupt path to determine whether cp_free() is needed.

FYI... After a day or two of running, I sprung a kernel debug oops for
list corruption in ccwchain_free(). I'm going to blame this piece, since
it was the last thing I changed and I hadn't come across any such damage
since v2. So either "started" is a bad idea, or a broken one. Or both. :)

>  - I opted against a "SOMETHING_PENDING" state if START/HALT/CLEAR
>    got a cc0, and just put the FSM back to IDLE. It becomes too unwieldy
>    to discern which operation an interrupt is completing, and whether
>    more interrupts are expected, to be worth the additional state.
>  - A successful CSCH doesn't do anything special, and cp_free()
>    is only performed on the interrupt path. Part of me wrestled with
>    how a HALT fits into that, but mostly it was that a cc0 on any
>    of the instructions indicated the "channel subsystem is signaled
>    to asynchronously perform the [START/HALT/CLEAR] function."
>    This means that an in-flight START could still receive data from the
>    device/subchannel, so not a good idea to release memory at that point.
> 
> Separate from all that, I added a small check of the io_work queue to
> the FSM START path. Part of the problems I've seen was that an interrupt
> is presented by a CPU, but not yet processed by vfio-ccw. Some of the
> problems seen thus far is because of this gap, and the above changes
> don't address that either. Whether this is appropriate or ridiculous
> would be a welcome discussion.
> 
> Previous versions:
> v2: https://lore.kernel.org/kvm/20200513142934.28788-1-farman@linux.ibm.com/
> v1: https://lore.kernel.org/kvm/20200124145455.51181-1-farman@linux.ibm.com/
> 
> Footnotes:
> [1] https://lore.kernel.org/kvm/62e87bf67b38dc8d5760586e7c96d400db854ebe.1562854091.git.alifm@linux.ibm.com/
> [2] Halil has pointed out that QEMU should prohibit this, based on the
>     rules set forth by the POPs. This is true, but we should not rely on
>     it behaving properly without addressing this scenario that is visible
>     today. Once I get this behaving correctly, I'll spend some time
>     seeing if QEMU is misbehaving somehow.
> [3] https://lore.kernel.org/kvm/20200518180903.7cb21dd8.cohuck@redhat.com/
> [4] https://lore.kernel.org/kvm/a52368d3-8cec-7b99-1587-25e055228b62@linux.ibm.com/
> 
> Eric Farman (3):
>   vfio-ccw: Indicate if a channel_program is started
>   vfio-ccw: Remove the CP_PENDING FSM state
>   vfio-ccw: Check workqueue before doing START
> 
>  drivers/s390/cio/vfio_ccw_cp.c      |  2 ++
>  drivers/s390/cio/vfio_ccw_cp.h      |  1 +
>  drivers/s390/cio/vfio_ccw_drv.c     |  5 +----
>  drivers/s390/cio/vfio_ccw_fsm.c     | 32 +++++++++++++++++------------
>  drivers/s390/cio/vfio_ccw_ops.c     |  3 +--
>  drivers/s390/cio/vfio_ccw_private.h |  1 -
>  6 files changed, 24 insertions(+), 20 deletions(-)
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v3 1/3] vfio-ccw: Indicate if a channel_program is started
  2020-06-16 19:50 ` [RFC PATCH v3 1/3] vfio-ccw: Indicate if a channel_program is started Eric Farman
@ 2020-06-17 23:11   ` Halil Pasic
  2020-06-18 11:47     ` Eric Farman
  0 siblings, 1 reply; 11+ messages in thread
From: Halil Pasic @ 2020-06-17 23:11 UTC (permalink / raw)
  To: Eric Farman; +Cc: Cornelia Huck, Jared Rossi, linux-s390, kvm

On Tue, 16 Jun 2020 21:50:51 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> The interrupt path checks the FSM state when processing a final interrupt
> (an interrupt that is neither subchannel active, nor device active),
> to determine whether to call cp_free() and release the associated memory.
> But, this does not fully close the window where a START comes in after a
> HALT/CLEAR. If the START runs while the CLEAR interrupt is being processed,
> the channel program struct will be allocated while the interrupt would be
> considering whether or not to free it. If the FSM state is CP_PROCESSING,
> then everything is fine. But if the START is able to issue its SSCH and get
> a cc0, then the in-flight interrupt would have been for an unrelated
> operation (perhaps none, if the subchannel was previously idle).
> 
> The channel_program struct has an "initialized" flag that is set early
> in the fsm_io_request() flow, to simplify the various cp_*() accessors.
> Let's extend this idea to include a "started" flag that announces that the
> channel program has successfully been issued to hardware. With this, the
> interrupt path can determine whether the final interrupt should also
> release the cp resources instead of relying on a transient FSM state.

AFAICT cp->started is potentially accessed by multiple threads, form
which at least one writes. Am I right?

Actually AFAICT you want to use cp->sarted for synchronization between
multiple treads (I/O requester(s), IRQ handler(s)). How does the
synchronization work for bool started itself, i.e. don't we have a data
race on 'started'?

A side note: I know, I asked a similar question about 'initialized' back
then.

Regards,
Halil

> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c  |  2 ++
>  drivers/s390/cio/vfio_ccw_cp.h  |  1 +
>  drivers/s390/cio/vfio_ccw_drv.c |  2 +-
>  drivers/s390/cio/vfio_ccw_fsm.c | 11 +++++++++++
>  4 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index b9febc581b1f..7748eeef434e 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -657,6 +657,7 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)
>  
>  	if (!ret) {
>  		cp->initialized = true;
> +		cp->started = false;
>  
>  		/* It is safe to force: if it was not set but idals used
>  		 * ccwchain_calc_length would have returned an error.
> @@ -685,6 +686,7 @@ void cp_free(struct channel_program *cp)
>  		return;
>  
>  	cp->initialized = false;
> +	cp->started = false;
>  	list_for_each_entry_safe(chain, temp, &cp->ccwchain_list, next) {
>  		for (i = 0; i < chain->ch_len; i++) {
>  			pfn_array_unpin_free(chain->ch_pa + i, cp->mdev);
> diff --git a/drivers/s390/cio/vfio_ccw_cp.h b/drivers/s390/cio/vfio_ccw_cp.h
> index ba31240ce965..7ea14910aaaa 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.h
> +++ b/drivers/s390/cio/vfio_ccw_cp.h
> @@ -39,6 +39,7 @@ struct channel_program {
>  	union orb orb;
>  	struct device *mdev;
>  	bool initialized;
> +	bool started;
>  	struct ccw1 *guest_cp;
>  };
>  
> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
> index 8c625b530035..7e2a790dc9a1 100644
> --- a/drivers/s390/cio/vfio_ccw_drv.c
> +++ b/drivers/s390/cio/vfio_ccw_drv.c
> @@ -94,7 +94,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>  		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
>  	if (scsw_is_solicited(&irb->scsw)) {
>  		cp_update_scsw(&private->cp, &irb->scsw);
> -		if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING)
> +		if (is_final && private->cp.started)
>  			cp_free(&private->cp);
>  	}
>  	mutex_lock(&private->io_mutex);
> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> index 23e61aa638e4..d806f88eba72 100644
> --- a/drivers/s390/cio/vfio_ccw_fsm.c
> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> @@ -50,6 +50,7 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
>  		sch->schib.scsw.cmd.actl |= SCSW_ACTL_START_PEND;
>  		ret = 0;
>  		private->state = VFIO_CCW_STATE_CP_PENDING;
> +		private->cp.started = true;
>  		break;
>  	case 1:		/* Status pending */
>  	case 2:		/* Busy */
> @@ -246,6 +247,16 @@ static void fsm_io_request(struct vfio_ccw_private *private,
>  	char *errstr = "request";
>  	struct subchannel_id schid = get_schid(private);
>  
> +	if (private->cp.started) {
> +		io_region->ret_code = -EBUSY;
> +		VFIO_CCW_MSG_EVENT(2,
> +				   "%pUl (%x.%x.%04x): busy\n",
> +				   mdev_uuid(mdev), schid.cssid,
> +				   schid.ssid, schid.sch_no);
> +		errstr = "busy";
> +		goto err_out;
> +	}
> +
>  	private->state = VFIO_CCW_STATE_CP_PROCESSING;
>  	memcpy(scsw, io_region->scsw_area, sizeof(*scsw));
>  


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v3 1/3] vfio-ccw: Indicate if a channel_program is started
  2020-06-17 23:11   ` Halil Pasic
@ 2020-06-18 11:47     ` Eric Farman
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Farman @ 2020-06-18 11:47 UTC (permalink / raw)
  To: Halil Pasic; +Cc: Cornelia Huck, Jared Rossi, linux-s390, kvm



On 6/17/20 7:11 PM, Halil Pasic wrote:
> On Tue, 16 Jun 2020 21:50:51 +0200
> Eric Farman <farman@linux.ibm.com> wrote:
> 
>> The interrupt path checks the FSM state when processing a final interrupt
>> (an interrupt that is neither subchannel active, nor device active),
>> to determine whether to call cp_free() and release the associated memory.
>> But, this does not fully close the window where a START comes in after a
>> HALT/CLEAR. If the START runs while the CLEAR interrupt is being processed,
>> the channel program struct will be allocated while the interrupt would be
>> considering whether or not to free it. If the FSM state is CP_PROCESSING,
>> then everything is fine. But if the START is able to issue its SSCH and get
>> a cc0, then the in-flight interrupt would have been for an unrelated
>> operation (perhaps none, if the subchannel was previously idle).
>>
>> The channel_program struct has an "initialized" flag that is set early
>> in the fsm_io_request() flow, to simplify the various cp_*() accessors.
>> Let's extend this idea to include a "started" flag that announces that the
>> channel program has successfully been issued to hardware. With this, the
>> interrupt path can determine whether the final interrupt should also
>> release the cp resources instead of relying on a transient FSM state.
> 
> AFAICT cp->started is potentially accessed by multiple threads, form
> which at least one writes. Am I right?

Yup. And with the exception of the cp_free() call out of the interrupt
path, every one is accessed under the io_mutex. I'm still measuring
possible behavior at that point.

> 
> Actually AFAICT you want to use cp->sarted for synchronization between
> multiple treads (I/O requester(s), IRQ handler(s)). How does the
> synchronization work for bool started itself, i.e. don't we have a data
> race on 'started'?
> 
> A side note: I know, I asked a similar question about 'initialized' back
> then.
> 
> Regards,
> Halil
> 
>>
>> Signed-off-by: Eric Farman <farman@linux.ibm.com>
>> ---
>>  drivers/s390/cio/vfio_ccw_cp.c  |  2 ++
>>  drivers/s390/cio/vfio_ccw_cp.h  |  1 +
>>  drivers/s390/cio/vfio_ccw_drv.c |  2 +-
>>  drivers/s390/cio/vfio_ccw_fsm.c | 11 +++++++++++
>>  4 files changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
>> index b9febc581b1f..7748eeef434e 100644
>> --- a/drivers/s390/cio/vfio_ccw_cp.c
>> +++ b/drivers/s390/cio/vfio_ccw_cp.c
>> @@ -657,6 +657,7 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)
>>  
>>  	if (!ret) {
>>  		cp->initialized = true;
>> +		cp->started = false;
>>  
>>  		/* It is safe to force: if it was not set but idals used
>>  		 * ccwchain_calc_length would have returned an error.
>> @@ -685,6 +686,7 @@ void cp_free(struct channel_program *cp)
>>  		return;
>>  
>>  	cp->initialized = false;
>> +	cp->started = false;
>>  	list_for_each_entry_safe(chain, temp, &cp->ccwchain_list, next) {
>>  		for (i = 0; i < chain->ch_len; i++) {
>>  			pfn_array_unpin_free(chain->ch_pa + i, cp->mdev);
>> diff --git a/drivers/s390/cio/vfio_ccw_cp.h b/drivers/s390/cio/vfio_ccw_cp.h
>> index ba31240ce965..7ea14910aaaa 100644
>> --- a/drivers/s390/cio/vfio_ccw_cp.h
>> +++ b/drivers/s390/cio/vfio_ccw_cp.h
>> @@ -39,6 +39,7 @@ struct channel_program {
>>  	union orb orb;
>>  	struct device *mdev;
>>  	bool initialized;
>> +	bool started;
>>  	struct ccw1 *guest_cp;
>>  };
>>  
>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
>> index 8c625b530035..7e2a790dc9a1 100644
>> --- a/drivers/s390/cio/vfio_ccw_drv.c
>> +++ b/drivers/s390/cio/vfio_ccw_drv.c
>> @@ -94,7 +94,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>>  		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
>>  	if (scsw_is_solicited(&irb->scsw)) {
>>  		cp_update_scsw(&private->cp, &irb->scsw);
>> -		if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING)
>> +		if (is_final && private->cp.started)
>>  			cp_free(&private->cp);
>>  	}
>>  	mutex_lock(&private->io_mutex);
>> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
>> index 23e61aa638e4..d806f88eba72 100644
>> --- a/drivers/s390/cio/vfio_ccw_fsm.c
>> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
>> @@ -50,6 +50,7 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
>>  		sch->schib.scsw.cmd.actl |= SCSW_ACTL_START_PEND;
>>  		ret = 0;
>>  		private->state = VFIO_CCW_STATE_CP_PENDING;
>> +		private->cp.started = true;
>>  		break;
>>  	case 1:		/* Status pending */
>>  	case 2:		/* Busy */
>> @@ -246,6 +247,16 @@ static void fsm_io_request(struct vfio_ccw_private *private,
>>  	char *errstr = "request";
>>  	struct subchannel_id schid = get_schid(private);
>>  
>> +	if (private->cp.started) {
>> +		io_region->ret_code = -EBUSY;
>> +		VFIO_CCW_MSG_EVENT(2,
>> +				   "%pUl (%x.%x.%04x): busy\n",
>> +				   mdev_uuid(mdev), schid.cssid,
>> +				   schid.ssid, schid.sch_no);
>> +		errstr = "busy";
>> +		goto err_out;
>> +	}
>> +
>>  	private->state = VFIO_CCW_STATE_CP_PROCESSING;
>>  	memcpy(scsw, io_region->scsw_area, sizeof(*scsw));
>>  
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2020-06-16 19:50 [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
                   ` (3 preceding siblings ...)
  2020-06-17 11:24 ` [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
@ 2020-06-19 11:21 ` Cornelia Huck
  4 siblings, 0 replies; 11+ messages in thread
From: Cornelia Huck @ 2020-06-19 11:21 UTC (permalink / raw)
  To: Eric Farman; +Cc: Jared Rossi, Halil Pasic, linux-s390, kvm

On Tue, 16 Jun 2020 21:50:50 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> Let's continue our discussion of the handling of vfio-ccw interrupts.
> 
> The initial fix [1] relied upon the interrupt path's examination of the
> FSM state, and freeing all resources if it were CP_PENDING. But the
> interface used by HALT/CLEAR SUBCHANNEL doesn't affect the FSM state.
> Consider this sequence:
> 
>     CPU 1                           CPU 2
>     CLEAR (state=IDLE/no change)
>                                     START [2]
>     INTERRUPT (set state=IDLE)
>                                     INTERRUPT (set state=IDLE)
> 
> This translates to a couple of possible scenarios:
> 
>  A) The START gets a cc2 because of the outstanding CLEAR, -EBUSY is
>     returned, resources are freed, and state remains IDLE
>  B) The START gets a cc0 because the CLEAR has already presented an
>     interrupt, and state is set to CP_PENDING
> 
> If the START gets a cc0 before the CLEAR INTERRUPT (stacked onto a
> workqueue by the IRQ context) gets a chance to run, then the INTERRUPT
> will release the channel program memory prematurely. If the two
> operations run concurrently, then the FSM state set to CP_PROCESSING
> will prevent the cp_free() from being invoked. But the io_mutex
> boundary on that path will pause itself until the START completes,
> and then allow the FSM to be reset to IDLE without considering the
> outstanding START. Neither scenario would be considered good.
> 
> Having said all of that, in v2 Conny suggested [3] the following:
> 
> > - Detach the cp from the subchannel (or better, remove the 1:1
> >   relationship). By that I mean building the cp as a separately
> >   allocated structure (maybe embedding a kref, but that might not be
> >   needed), and appending it to a list after SSCH with cc=0. Discard it
> >   if cc!=0.
> > - Remove the CP_PENDING state. The state is either IDLE after any
> >   successful SSCH/HSCH/CSCH, or a new state in that case. But no
> >   special state for SSCH.
> > - A successful CSCH removes the first queued request, if any.
> > - A final interrupt removes the first queued request, if any.  
> 
> What I have implemented here is basically this, with a few changes:
> 
>  - I don't queue cp's. Since there should only be one START in process
>    at a time, and HALT/CLEAR doesn't build a cp, I didn't see a pressing
>    need to introduce that complexity.
>  - Furthermore, while I initially made a separately allocated cp, adding
>    an alloc for a cp on each I/O AND moving the guest_cp alloc from the
>    probe path to the I/O path seems excessive. So I implemented a
>    "started" flag to the cp, set after a cc0 from the START, and examine
>    that on the interrupt path to determine whether cp_free() is needed.
>  - I opted against a "SOMETHING_PENDING" state if START/HALT/CLEAR
>    got a cc0, and just put the FSM back to IDLE. It becomes too unwieldy
>    to discern which operation an interrupt is completing, and whether
>    more interrupts are expected, to be worth the additional state.
>  - A successful CSCH doesn't do anything special, and cp_free()
>    is only performed on the interrupt path. Part of me wrestled with
>    how a HALT fits into that, but mostly it was that a cc0 on any
>    of the instructions indicated the "channel subsystem is signaled
>    to asynchronously perform the [START/HALT/CLEAR] function."
>    This means that an in-flight START could still receive data from the
>    device/subchannel, so not a good idea to release memory at that point.

Hm, csch clears any pending status, so it is indeed special in a way.
If we do a csch with cc 0, we already know for sure that we won't get
any further status other than an interrupt indicating that clear has
been done. This was my reasoning behind csch dequeuing the request.

> 
> Separate from all that, I added a small check of the io_work queue to
> the FSM START path. Part of the problems I've seen was that an interrupt
> is presented by a CPU, but not yet processed by vfio-ccw. Some of the
> problems seen thus far is because of this gap, and the above changes
> don't address that either. Whether this is appropriate or ridiculous
> would be a welcome discussion.
> 
> Previous versions:
> v2: https://lore.kernel.org/kvm/20200513142934.28788-1-farman@linux.ibm.com/
> v1: https://lore.kernel.org/kvm/20200124145455.51181-1-farman@linux.ibm.com/
> 
> Footnotes:
> [1] https://lore.kernel.org/kvm/62e87bf67b38dc8d5760586e7c96d400db854ebe.1562854091.git.alifm@linux.ibm.com/
> [2] Halil has pointed out that QEMU should prohibit this, based on the
>     rules set forth by the POPs. This is true, but we should not rely on
>     it behaving properly without addressing this scenario that is visible
>     today. Once I get this behaving correctly, I'll spend some time
>     seeing if QEMU is misbehaving somehow.
> [3] https://lore.kernel.org/kvm/20200518180903.7cb21dd8.cohuck@redhat.com/
> [4] https://lore.kernel.org/kvm/a52368d3-8cec-7b99-1587-25e055228b62@linux.ibm.com/
> 
> Eric Farman (3):
>   vfio-ccw: Indicate if a channel_program is started
>   vfio-ccw: Remove the CP_PENDING FSM state
>   vfio-ccw: Check workqueue before doing START
> 
>  drivers/s390/cio/vfio_ccw_cp.c      |  2 ++
>  drivers/s390/cio/vfio_ccw_cp.h      |  1 +
>  drivers/s390/cio/vfio_ccw_drv.c     |  5 +----
>  drivers/s390/cio/vfio_ccw_fsm.c     | 32 +++++++++++++++++------------
>  drivers/s390/cio/vfio_ccw_ops.c     |  3 +--
>  drivers/s390/cio/vfio_ccw_private.h |  1 -
>  6 files changed, 24 insertions(+), 20 deletions(-)
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v3 3/3] vfio-ccw: Check workqueue before doing START
  2020-06-16 19:50 ` [RFC PATCH v3 3/3] vfio-ccw: Check workqueue before doing START Eric Farman
@ 2020-06-19 11:40   ` Cornelia Huck
  0 siblings, 0 replies; 11+ messages in thread
From: Cornelia Huck @ 2020-06-19 11:40 UTC (permalink / raw)
  To: Eric Farman; +Cc: Jared Rossi, Halil Pasic, linux-s390, kvm

On Tue, 16 Jun 2020 21:50:53 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> When an interrupt is received via the IRQ, the bulk of the work is
> stacked on a workqueue for later processing. Which means that concurrent
> START or a HALT/CLEAR operation (via the async_region) will race with
> this process and require some serialization.
> 
> Once we have all our locks acquired, let's just look to see if we're
> in a window where the process has been started from the IRQ, but not
> yet picked up by vfio-ccw to clean up an I/O. If there is, mark the
> request as BUSY so it can be redriven after we have a chance to breathe.

This change looks reasonable to me. It would be even better if we could
send off I/O requests at any time; but if signaling to retry saves us
from some hairy code elsewhere, it is a good idea to do so.

> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_fsm.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> index f0952192480e..9dc5b4d549b3 100644
> --- a/drivers/s390/cio/vfio_ccw_fsm.c
> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> @@ -28,6 +28,11 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
>  
>  	spin_lock_irqsave(sch->lock, flags);
>  
> +	if (work_pending(&private->io_work)) {
> +		ret = -EBUSY;
> +		goto out;
> +	}
> +
>  	orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
>  	if (!orb) {
>  		ret = -EIO;


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2020-06-17 11:24 ` [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
@ 2020-06-29 14:56   ` Cornelia Huck
  2020-06-30 19:10     ` Eric Farman
  0 siblings, 1 reply; 11+ messages in thread
From: Cornelia Huck @ 2020-06-29 14:56 UTC (permalink / raw)
  To: Eric Farman; +Cc: Jared Rossi, Halil Pasic, linux-s390, kvm

On Wed, 17 Jun 2020 07:24:17 -0400
Eric Farman <farman@linux.ibm.com> wrote:

> On 6/16/20 3:50 PM, Eric Farman wrote:
> > Let's continue our discussion of the handling of vfio-ccw interrupts.
> > 
> > The initial fix [1] relied upon the interrupt path's examination of the
> > FSM state, and freeing all resources if it were CP_PENDING. But the
> > interface used by HALT/CLEAR SUBCHANNEL doesn't affect the FSM state.
> > Consider this sequence:
> > 
> >     CPU 1                           CPU 2
> >     CLEAR (state=IDLE/no change)
> >                                     START [2]
> >     INTERRUPT (set state=IDLE)
> >                                     INTERRUPT (set state=IDLE)
> > 
> > This translates to a couple of possible scenarios:
> > 
> >  A) The START gets a cc2 because of the outstanding CLEAR, -EBUSY is
> >     returned, resources are freed, and state remains IDLE
> >  B) The START gets a cc0 because the CLEAR has already presented an
> >     interrupt, and state is set to CP_PENDING
> > 
> > If the START gets a cc0 before the CLEAR INTERRUPT (stacked onto a
> > workqueue by the IRQ context) gets a chance to run, then the INTERRUPT
> > will release the channel program memory prematurely. If the two
> > operations run concurrently, then the FSM state set to CP_PROCESSING
> > will prevent the cp_free() from being invoked. But the io_mutex
> > boundary on that path will pause itself until the START completes,
> > and then allow the FSM to be reset to IDLE without considering the
> > outstanding START. Neither scenario would be considered good.
> > 
> > Having said all of that, in v2 Conny suggested [3] the following:
> >   
> >> - Detach the cp from the subchannel (or better, remove the 1:1
> >>   relationship). By that I mean building the cp as a separately
> >>   allocated structure (maybe embedding a kref, but that might not be
> >>   needed), and appending it to a list after SSCH with cc=0. Discard it
> >>   if cc!=0.
> >> - Remove the CP_PENDING state. The state is either IDLE after any
> >>   successful SSCH/HSCH/CSCH, or a new state in that case. But no
> >>   special state for SSCH.
> >> - A successful CSCH removes the first queued request, if any.
> >> - A final interrupt removes the first queued request, if any.  
> > 
> > What I have implemented here is basically this, with a few changes:
> > 
> >  - I don't queue cp's. Since there should only be one START in process
> >    at a time, and HALT/CLEAR doesn't build a cp, I didn't see a pressing
> >    need to introduce that complexity.
> >  - Furthermore, while I initially made a separately allocated cp, adding
> >    an alloc for a cp on each I/O AND moving the guest_cp alloc from the
> >    probe path to the I/O path seems excessive. So I implemented a
> >    "started" flag to the cp, set after a cc0 from the START, and examine
> >    that on the interrupt path to determine whether cp_free() is needed.  
> 
> FYI... After a day or two of running, I sprung a kernel debug oops for
> list corruption in ccwchain_free(). I'm going to blame this piece, since
> it was the last thing I changed and I hadn't come across any such damage
> since v2. So either "started" is a bad idea, or a broken one. Or both. :)

Have you come to any conclusion wrt 'started'? Not wanting to generate
stress, just asking :)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2020-06-29 14:56   ` Cornelia Huck
@ 2020-06-30 19:10     ` Eric Farman
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Farman @ 2020-06-30 19:10 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Jared Rossi, Halil Pasic, linux-s390, kvm


[-- Attachment #1.1: Type: text/plain, Size: 3894 bytes --]



On 6/29/20 10:56 AM, Cornelia Huck wrote:
> On Wed, 17 Jun 2020 07:24:17 -0400
> Eric Farman <farman@linux.ibm.com> wrote:
> 
>> On 6/16/20 3:50 PM, Eric Farman wrote:
>>> Let's continue our discussion of the handling of vfio-ccw interrupts.
>>>
>>> The initial fix [1] relied upon the interrupt path's examination of the
>>> FSM state, and freeing all resources if it were CP_PENDING. But the
>>> interface used by HALT/CLEAR SUBCHANNEL doesn't affect the FSM state.
>>> Consider this sequence:
>>>
>>>     CPU 1                           CPU 2
>>>     CLEAR (state=IDLE/no change)
>>>                                     START [2]
>>>     INTERRUPT (set state=IDLE)
>>>                                     INTERRUPT (set state=IDLE)
>>>
>>> This translates to a couple of possible scenarios:
>>>
>>>  A) The START gets a cc2 because of the outstanding CLEAR, -EBUSY is
>>>     returned, resources are freed, and state remains IDLE
>>>  B) The START gets a cc0 because the CLEAR has already presented an
>>>     interrupt, and state is set to CP_PENDING
>>>
>>> If the START gets a cc0 before the CLEAR INTERRUPT (stacked onto a
>>> workqueue by the IRQ context) gets a chance to run, then the INTERRUPT
>>> will release the channel program memory prematurely. If the two
>>> operations run concurrently, then the FSM state set to CP_PROCESSING
>>> will prevent the cp_free() from being invoked. But the io_mutex
>>> boundary on that path will pause itself until the START completes,
>>> and then allow the FSM to be reset to IDLE without considering the
>>> outstanding START. Neither scenario would be considered good.
>>>
>>> Having said all of that, in v2 Conny suggested [3] the following:
>>>   
>>>> - Detach the cp from the subchannel (or better, remove the 1:1
>>>>   relationship). By that I mean building the cp as a separately
>>>>   allocated structure (maybe embedding a kref, but that might not be
>>>>   needed), and appending it to a list after SSCH with cc=0. Discard it
>>>>   if cc!=0.
>>>> - Remove the CP_PENDING state. The state is either IDLE after any
>>>>   successful SSCH/HSCH/CSCH, or a new state in that case. But no
>>>>   special state for SSCH.
>>>> - A successful CSCH removes the first queued request, if any.
>>>> - A final interrupt removes the first queued request, if any.  
>>>
>>> What I have implemented here is basically this, with a few changes:
>>>
>>>  - I don't queue cp's. Since there should only be one START in process
>>>    at a time, and HALT/CLEAR doesn't build a cp, I didn't see a pressing
>>>    need to introduce that complexity.
>>>  - Furthermore, while I initially made a separately allocated cp, adding
>>>    an alloc for a cp on each I/O AND moving the guest_cp alloc from the
>>>    probe path to the I/O path seems excessive. So I implemented a
>>>    "started" flag to the cp, set after a cc0 from the START, and examine
>>>    that on the interrupt path to determine whether cp_free() is needed.  
>>
>> FYI... After a day or two of running, I sprung a kernel debug oops for
>> list corruption in ccwchain_free(). I'm going to blame this piece, since
>> it was the last thing I changed and I hadn't come across any such damage
>> since v2. So either "started" is a bad idea, or a broken one. Or both. :)
> 
> Have you come to any conclusion wrt 'started'? Not wanting to generate
> stress, just asking :)
> 

I've talked myself out of it, and gone back to your original proposal of
a separately allocated cp. (Still no queuing.) Too early to pass
judgement though.

Yesterday, when running with a cp_free() call after a CSCH, I was
getting all sorts of errors very early on, so at the moment I've pulled
that back out again. If it looks good in this form, I'll put that as a
separate patch and write up some doc for a discussion on that point.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-06-30 19:10 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-16 19:50 [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
2020-06-16 19:50 ` [RFC PATCH v3 1/3] vfio-ccw: Indicate if a channel_program is started Eric Farman
2020-06-17 23:11   ` Halil Pasic
2020-06-18 11:47     ` Eric Farman
2020-06-16 19:50 ` [RFC PATCH v3 2/3] vfio-ccw: Remove the CP_PENDING FSM state Eric Farman
2020-06-16 19:50 ` [RFC PATCH v3 3/3] vfio-ccw: Check workqueue before doing START Eric Farman
2020-06-19 11:40   ` Cornelia Huck
2020-06-17 11:24 ` [RFC PATCH v3 0/3] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
2020-06-29 14:56   ` Cornelia Huck
2020-06-30 19:10     ` Eric Farman
2020-06-19 11:21 ` Cornelia Huck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).