KVM Archive on lore.kernel.org
 help / color / Atom feed
* [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR
@ 2021-04-13 18:24 Eric Farman
  2021-04-13 18:24 ` [RFC PATCH v4 1/4] vfio-ccw: Check initialized flag in cp_init() Eric Farman
                   ` (4 more replies)
  0 siblings, 5 replies; 26+ messages in thread
From: Eric Farman @ 2021-04-13 18:24 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic
  Cc: Matthew Rosato, Jared Rossi, linux-s390, kvm, Eric Farman

Hi Conny, Halil,

Let's restart our discussion about the collision between interrupts for
START SUBCHANNEL and HALT/CLEAR SUBCHANNEL. It's been a quarter million
minutes (give or take), so here is the problematic scenario again:

	CPU 1			CPU 2
 1	CLEAR SUBCHANNEL
 2	fsm_irq()
 3				START SUBCHANNEL
 4	vfio_ccw_sch_io_todo()
 5				fsm_irq()
 6				vfio_ccw_sch_io_todo()

From the channel subsystem's point of view the CLEAR SUBCHANNEL (step 1)
is complete once step 2 is called, as the Interrupt Response Block (IRB)
has been presented and the TEST SUBCHANNEL was driven by the cio layer.
Thus, the START SUBCHANNEL (step 3) is submitted [1] and gets a cc=0 to
indicate the I/O was accepted. However, step 2 stacks the bulk of the
actual work onto a workqueue for when the subchannel lock is NOT held,
and is unqueued at step 4. That code misidentifies the data in the IRB
as being associated with the newly active I/O, and may release memory
that is actively in use by the channel subsystem and/or device. Eww.

In this version...

Patch 1 and 2 are defensive checks. Patch 2 was part of v3 [2], but I
would love a better option here to guard between steps 2 and 4.

Patch 3 is a subset of the removal of the CP_PENDING FSM state in v3.
I've obviously gone away from this idea, but I thought this piece is
still valuable.

Patch 4 collapses the code on the interrupt path so that changes to
the FSM state and the channel_program struct are handled at the same
point, rather than separated by a mutex boundary. Because of the
possibility of a START and HALT/CLEAR running concurrently, it does
not make sense to split them here.

With the above patches, maybe it then makes sense to hold the io_mutex
across the entirety of vfio_ccw_sch_io_todo(). But I'm not completely
sure that would be acceptable.

So... Thoughts?

Thanks,
Eric

Previous versions:
v3: https://lore.kernel.org/kvm/20200616195053.99253-1-farman@linux.ibm.com/
v2: https://lore.kernel.org/kvm/20200513142934.28788-1-farman@linux.ibm.com/
v1: https://lore.kernel.org/kvm/20200124145455.51181-1-farman@linux.ibm.com/

Footnotes:
[1] Halil correctly asserts that today's QEMU should prohibit this, but I
    still have not looked into why. The above is the sequence that is
    occurring in the kernel, and we shouldn't rely on a well-behaved
    userspace to enforce things for us. It is still on my list for further
    investigation, but it's lower in priority.
[2] https://lore.kernel.org/kvm/20200619134005.512fc54f.cohuck@redhat.com/

Eric Farman (4):
  vfio-ccw: Check initialized flag in cp_init()
  vfio-ccw: Check workqueue before doing START
  vfio-ccw: Reset FSM state to IDLE inside FSM
  vfio-ccw: Reset FSM state to IDLE before io_mutex

 drivers/s390/cio/vfio_ccw_cp.c  | 4 ++++
 drivers/s390/cio/vfio_ccw_drv.c | 7 +++----
 drivers/s390/cio/vfio_ccw_fsm.c | 6 ++++++
 drivers/s390/cio/vfio_ccw_ops.c | 2 --
 4 files changed, 13 insertions(+), 6 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC PATCH v4 1/4] vfio-ccw: Check initialized flag in cp_init()
  2021-04-13 18:24 [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
@ 2021-04-13 18:24 ` Eric Farman
  2021-04-14 16:30   ` Cornelia Huck
  2021-04-13 18:24 ` [RFC PATCH v4 2/4] vfio-ccw: Check workqueue before doing START Eric Farman
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 26+ messages in thread
From: Eric Farman @ 2021-04-13 18:24 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic
  Cc: Matthew Rosato, Jared Rossi, linux-s390, kvm, Eric Farman

We have a really nice flag in the channel_program struct that
indicates if it had been initialized by cp_init(), and use it
as a guard in the other cp accessor routines, but not for a
duplicate call into cp_init(). The possibility of this occurring
is low, because that flow is protected by the private->io_mutex
and FSM CP_PROCESSING state. But then why bother checking it
in (for example) cp_prefetch() then?

Let's just be consistent and check for that in cp_init() too.

Fixes: 71189f263f8a3 ("vfio-ccw: make it safe to access channel programs")
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_cp.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index b9febc581b1f..8d1b2771c1aa 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -638,6 +638,10 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)
 	static DEFINE_RATELIMIT_STATE(ratelimit_state, 5 * HZ, 1);
 	int ret;
 
+	/* this is an error in the caller */
+	if (cp->initialized)
+		return -EBUSY;
+
 	/*
 	 * We only support prefetching the channel program. We assume all channel
 	 * programs executed by supported guests likewise support prefetching.
-- 
2.25.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC PATCH v4 2/4] vfio-ccw: Check workqueue before doing START
  2021-04-13 18:24 [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
  2021-04-13 18:24 ` [RFC PATCH v4 1/4] vfio-ccw: Check initialized flag in cp_init() Eric Farman
@ 2021-04-13 18:24 ` Eric Farman
  2021-04-15 10:51   ` Cornelia Huck
  2021-04-13 18:24 ` [RFC PATCH v4 3/4] vfio-ccw: Reset FSM state to IDLE inside FSM Eric Farman
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 26+ messages in thread
From: Eric Farman @ 2021-04-13 18:24 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic
  Cc: Matthew Rosato, Jared Rossi, linux-s390, kvm, Eric Farman

When an interrupt is received via the IRQ, the bulk of the work
is stacked on a workqueue for later processing. Which means that
a concurrent START or HALT/CLEAR operation (via the async_region)
will race with this process and require some serialization.

Once we have all our locks acquired, let's just look to see if we're
in a window where the process has been started from the IRQ, but not
yet picked up by vfio-ccw to clean up an I/O. If there is, mark the
request as BUSY so it can be redriven.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_fsm.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index 23e61aa638e4..92d638f10b27 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -28,6 +28,11 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
 
 	spin_lock_irqsave(sch->lock, flags);
 
+	if (work_pending(&private->io_work)) {
+		ret = -EBUSY;
+		goto out;
+	}
+
 	orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
 	if (!orb) {
 		ret = -EIO;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC PATCH v4 3/4] vfio-ccw: Reset FSM state to IDLE inside FSM
  2021-04-13 18:24 [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
  2021-04-13 18:24 ` [RFC PATCH v4 1/4] vfio-ccw: Check initialized flag in cp_init() Eric Farman
  2021-04-13 18:24 ` [RFC PATCH v4 2/4] vfio-ccw: Check workqueue before doing START Eric Farman
@ 2021-04-13 18:24 ` Eric Farman
  2021-04-15 10:54   ` Cornelia Huck
  2021-04-13 18:24 ` [RFC PATCH v4 4/4] vfio-ccw: Reset FSM state to IDLE before io_mutex Eric Farman
  2021-04-22  0:52 ` [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR Halil Pasic
  4 siblings, 1 reply; 26+ messages in thread
From: Eric Farman @ 2021-04-13 18:24 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic
  Cc: Matthew Rosato, Jared Rossi, linux-s390, kvm, Eric Farman

When an I/O request is made, the fsm_io_request() routine
moves the FSM state from IDLE to CP_PROCESSING, and then
fsm_io_helper() moves it to CP_PENDING if the START SUBCHANNEL
received a cc0. Yet, the error case to go from CP_PROCESSING
back to IDLE is done after the FSM call returns.

Let's move this up into the FSM proper, to provide some
better symmetry when unwinding in this case.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_fsm.c | 1 +
 drivers/s390/cio/vfio_ccw_ops.c | 2 --
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index 92d638f10b27..38b4a8256665 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -323,6 +323,7 @@ static void fsm_io_request(struct vfio_ccw_private *private,
 	}
 
 err_out:
+	private->state = VFIO_CCW_STATE_IDLE;
 	trace_vfio_ccw_fsm_io_request(scsw->cmd.fctl, schid,
 				      io_region->ret_code, errstr);
 }
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 767ac41686fe..5971641964c6 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -276,8 +276,6 @@ static ssize_t vfio_ccw_mdev_write_io_region(struct vfio_ccw_private *private,
 	}
 
 	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_IO_REQ);
-	if (region->ret_code != 0)
-		private->state = VFIO_CCW_STATE_IDLE;
 	ret = (region->ret_code != 0) ? region->ret_code : count;
 
 out_unlock:
-- 
2.25.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC PATCH v4 4/4] vfio-ccw: Reset FSM state to IDLE before io_mutex
  2021-04-13 18:24 [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
                   ` (2 preceding siblings ...)
  2021-04-13 18:24 ` [RFC PATCH v4 3/4] vfio-ccw: Reset FSM state to IDLE inside FSM Eric Farman
@ 2021-04-13 18:24 ` Eric Farman
  2021-04-21 10:25   ` Cornelia Huck
  2021-04-22  0:52 ` [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR Halil Pasic
  4 siblings, 1 reply; 26+ messages in thread
From: Eric Farman @ 2021-04-13 18:24 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic
  Cc: Matthew Rosato, Jared Rossi, linux-s390, kvm, Eric Farman

Today, the stacked call to vfio_ccw_sch_io_todo() does three things:

1) Update a solicited IRB with CP information, and release the CP
if the interrupt was the end of a START operation.
2) Copy the IRB data into the io_region, under the protection of
the io_mutex
3) Reset the vfio-ccw FSM state to IDLE to acknowledge that
vfio-ccw can accept more work.

The trouble is that step 3 is (A) invoked for both solicited and
unsolicited interrupts, and (B) sitting after the mutex for step 2.
This second piece becomes a problem if it processes an interrupt
for a CLEAR SUBCHANNEL while another thread initiates a START,
thus allowing the CP and FSM states to get out of sync. That is:

	CPU 1				CPU 2
	fsm_do_clear()
	fsm_irq()
					fsm_io_request()
					fsm_io_helper()
	vfio_ccw_sch_io_todo()
					fsm_irq()
					vfio_ccw_sch_io_todo()

Let's move the reset of the FSM state to the point where the
channel_program struct is cleaned up, which is only done for
solicited interrupts anyway.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/s390/cio/vfio_ccw_drv.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 8c625b530035..e51318f23ca8 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -94,16 +94,15 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
 		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
 	if (scsw_is_solicited(&irb->scsw)) {
 		cp_update_scsw(&private->cp, &irb->scsw);
-		if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING)
+		if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING) {
 			cp_free(&private->cp);
+			private->state = VFIO_CCW_STATE_IDLE;
+		}
 	}
 	mutex_lock(&private->io_mutex);
 	memcpy(private->io_region->irb_area, irb, sizeof(*irb));
 	mutex_unlock(&private->io_mutex);
 
-	if (private->mdev && is_final)
-		private->state = VFIO_CCW_STATE_IDLE;
-
 	if (private->io_trigger)
 		eventfd_signal(private->io_trigger, 1);
 }
-- 
2.25.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 1/4] vfio-ccw: Check initialized flag in cp_init()
  2021-04-13 18:24 ` [RFC PATCH v4 1/4] vfio-ccw: Check initialized flag in cp_init() Eric Farman
@ 2021-04-14 16:30   ` Cornelia Huck
  0 siblings, 0 replies; 26+ messages in thread
From: Cornelia Huck @ 2021-04-14 16:30 UTC (permalink / raw)
  To: Eric Farman; +Cc: Halil Pasic, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Tue, 13 Apr 2021 20:24:07 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> We have a really nice flag in the channel_program struct that
> indicates if it had been initialized by cp_init(), and use it
> as a guard in the other cp accessor routines, but not for a
> duplicate call into cp_init(). The possibility of this occurring
> is low, because that flow is protected by the private->io_mutex
> and FSM CP_PROCESSING state. But then why bother checking it
> in (for example) cp_prefetch() then?
> 
> Let's just be consistent and check for that in cp_init() too.
> 
> Fixes: 71189f263f8a3 ("vfio-ccw: make it safe to access channel programs")
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index b9febc581b1f..8d1b2771c1aa 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -638,6 +638,10 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)
>  	static DEFINE_RATELIMIT_STATE(ratelimit_state, 5 * HZ, 1);
>  	int ret;
>  
> +	/* this is an error in the caller */
> +	if (cp->initialized)
> +		return -EBUSY;
> +
>  	/*
>  	 * We only support prefetching the channel program. We assume all channel
>  	 * programs executed by supported guests likewise support prefetching.

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 2/4] vfio-ccw: Check workqueue before doing START
  2021-04-13 18:24 ` [RFC PATCH v4 2/4] vfio-ccw: Check workqueue before doing START Eric Farman
@ 2021-04-15 10:51   ` Cornelia Huck
  2021-04-15 13:48     ` Eric Farman
  0 siblings, 1 reply; 26+ messages in thread
From: Cornelia Huck @ 2021-04-15 10:51 UTC (permalink / raw)
  To: Eric Farman; +Cc: Halil Pasic, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Tue, 13 Apr 2021 20:24:08 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> When an interrupt is received via the IRQ, the bulk of the work
> is stacked on a workqueue for later processing. Which means that
> a concurrent START or HALT/CLEAR operation (via the async_region)
> will race with this process and require some serialization.
> 
> Once we have all our locks acquired, let's just look to see if we're
> in a window where the process has been started from the IRQ, but not
> yet picked up by vfio-ccw to clean up an I/O. If there is, mark the
> request as BUSY so it can be redriven.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_fsm.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> index 23e61aa638e4..92d638f10b27 100644
> --- a/drivers/s390/cio/vfio_ccw_fsm.c
> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> @@ -28,6 +28,11 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
>  
>  	spin_lock_irqsave(sch->lock, flags);
>  
> +	if (work_pending(&private->io_work)) {
> +		ret = -EBUSY;
> +		goto out;
> +	}
> +
>  	orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
>  	if (!orb) {
>  		ret = -EIO;

I'm wondering what condition we can consider this situation equivalent
to. I'd say that the virtual version of the subchannel is basically
status pending already, even though userspace may not have retrieved
that information yet; so probably cc 1?

Following the code path further along, it seems we return -EBUSY both
for cc 1 and cc 2 conditions we receive from the device (i.e. they are
not distinguishable from userspace). I don't think we can change that,
as it is an existing API (QEMU maps -EBUSY to cc 2.) So this change
looks fine so far.

I'm wondering what we should do for hsch. We probably want to return
-EBUSY for a pending condition as well, if I read the PoP correctly...
the only problem is that QEMU seems to match everything to 0; but that
is arguably not the kernel's problem.

For clear, we obviously don't have busy conditions. Should we clean up
any pending conditions?

[It feels like we have discussed this before, but any information has
vanished from my cache :/]


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 3/4] vfio-ccw: Reset FSM state to IDLE inside FSM
  2021-04-13 18:24 ` [RFC PATCH v4 3/4] vfio-ccw: Reset FSM state to IDLE inside FSM Eric Farman
@ 2021-04-15 10:54   ` Cornelia Huck
  0 siblings, 0 replies; 26+ messages in thread
From: Cornelia Huck @ 2021-04-15 10:54 UTC (permalink / raw)
  To: Eric Farman; +Cc: Halil Pasic, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Tue, 13 Apr 2021 20:24:09 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> When an I/O request is made, the fsm_io_request() routine
> moves the FSM state from IDLE to CP_PROCESSING, and then
> fsm_io_helper() moves it to CP_PENDING if the START SUBCHANNEL
> received a cc0. Yet, the error case to go from CP_PROCESSING
> back to IDLE is done after the FSM call returns.
> 
> Let's move this up into the FSM proper, to provide some
> better symmetry when unwinding in this case.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_fsm.c | 1 +
>  drivers/s390/cio/vfio_ccw_ops.c | 2 --
>  2 files changed, 1 insertion(+), 2 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 2/4] vfio-ccw: Check workqueue before doing START
  2021-04-15 10:51   ` Cornelia Huck
@ 2021-04-15 13:48     ` Eric Farman
  2021-04-15 16:19       ` Cornelia Huck
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Farman @ 2021-04-15 13:48 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Halil Pasic, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Thu, 2021-04-15 at 12:51 +0200, Cornelia Huck wrote:
> On Tue, 13 Apr 2021 20:24:08 +0200
> Eric Farman <farman@linux.ibm.com> wrote:
> 
> > When an interrupt is received via the IRQ, the bulk of the work
> > is stacked on a workqueue for later processing. Which means that
> > a concurrent START or HALT/CLEAR operation (via the async_region)
> > will race with this process and require some serialization.
> > 
> > Once we have all our locks acquired, let's just look to see if
> > we're
> > in a window where the process has been started from the IRQ, but
> > not
> > yet picked up by vfio-ccw to clean up an I/O. If there is, mark the
> > request as BUSY so it can be redriven.
> > 
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> >  drivers/s390/cio/vfio_ccw_fsm.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_fsm.c
> > b/drivers/s390/cio/vfio_ccw_fsm.c
> > index 23e61aa638e4..92d638f10b27 100644
> > --- a/drivers/s390/cio/vfio_ccw_fsm.c
> > +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> > @@ -28,6 +28,11 @@ static int fsm_io_helper(struct vfio_ccw_private
> > *private)
> >  
> >  	spin_lock_irqsave(sch->lock, flags);
> >  
> > +	if (work_pending(&private->io_work)) {
> > +		ret = -EBUSY;
> > +		goto out;
> > +	}
> > +
> >  	orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
> >  	if (!orb) {
> >  		ret = -EIO;
> 
> I'm wondering what condition we can consider this situation
> equivalent
> to. I'd say that the virtual version of the subchannel is basically
> status pending already, even though userspace may not have retrieved
> that information yet; so probably cc 1?

Yes, I guess cc1 is a more natural fit, since there is status pending
rather than an active start/halt/clear that would expect get the cc2.

> 
> Following the code path further along, it seems we return -EBUSY both
> for cc 1 and cc 2 conditions we receive from the device (i.e. they
> are
> not distinguishable from userspace). 

Yeah. :/

> I don't think we can change that,
> as it is an existing API (QEMU maps -EBUSY to cc 2.) So this change
> looks fine so far.
> 
> I'm wondering what we should do for hsch. We probably want to return
> -EBUSY for a pending condition as well, if I read the PoP
> correctly...

Ah, yes...  I agree that to maintain parity with ssch and pops, the
same cc1/-EBUSY would be applicable here. Will make that change in next
version.

> the only problem is that QEMU seems to match everything to 0; but
> that
> is arguably not the kernel's problem.
> 
> For clear, we obviously don't have busy conditions. Should we clean
> up
> any pending conditions?

By doing anything other than issuing the csch to the subchannel?  I
don't think so, that should be more than enough to get the css and
vfio-ccw in sync with each other.

> 
> [It feels like we have discussed this before, but any information has
> vanished from my cache :/]
> 

It has vanished from mine too, and looking over the old threads and
notes doesn't page anything useful in, so here we are. Sorry. :(

Eric



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 2/4] vfio-ccw: Check workqueue before doing START
  2021-04-15 13:48     ` Eric Farman
@ 2021-04-15 16:19       ` Cornelia Huck
  2021-04-15 18:42         ` Eric Farman
  0 siblings, 1 reply; 26+ messages in thread
From: Cornelia Huck @ 2021-04-15 16:19 UTC (permalink / raw)
  To: Eric Farman; +Cc: Halil Pasic, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Thu, 15 Apr 2021 09:48:37 -0400
Eric Farman <farman@linux.ibm.com> wrote:

> On Thu, 2021-04-15 at 12:51 +0200, Cornelia Huck wrote:

> > I'm wondering what we should do for hsch. We probably want to return
> > -EBUSY for a pending condition as well, if I read the PoP
> > correctly...  
> 
> Ah, yes...  I agree that to maintain parity with ssch and pops, the
> same cc1/-EBUSY would be applicable here. Will make that change in next
> version.

Yes, just to handle things in the same fashion consistently.

> 
> > the only problem is that QEMU seems to match everything to 0; but
> > that
> > is arguably not the kernel's problem.
> > 
> > For clear, we obviously don't have busy conditions. Should we clean
> > up
> > any pending conditions?  
> 
> By doing anything other than issuing the csch to the subchannel?  I
> don't think so, that should be more than enough to get the css and
> vfio-ccw in sync with each other.

Hm, doesn't a successful csch clear any status pending? That would mean
that invoking our csch backend implies that we won't deliver the status
pending that is already pending via the workqueue, which therefore
needs to be flushed out in some way? I remember we did some special
csch handling, but I don't immediately see where; might have been only
in QEMU.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 2/4] vfio-ccw: Check workqueue before doing START
  2021-04-15 16:19       ` Cornelia Huck
@ 2021-04-15 18:42         ` Eric Farman
  2021-04-16 14:41           ` Cornelia Huck
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Farman @ 2021-04-15 18:42 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Halil Pasic, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Thu, 2021-04-15 at 18:19 +0200, Cornelia Huck wrote:
> On Thu, 15 Apr 2021 09:48:37 -0400
> Eric Farman <farman@linux.ibm.com> wrote:
> 
> > On Thu, 2021-04-15 at 12:51 +0200, Cornelia Huck wrote:
> > > I'm wondering what we should do for hsch. We probably want to
> > > return
> > > -EBUSY for a pending condition as well, if I read the PoP
> > > correctly...  
> > 
> > Ah, yes...  I agree that to maintain parity with ssch and pops, the
> > same cc1/-EBUSY would be applicable here. Will make that change in
> > next
> > version.
> 
> Yes, just to handle things in the same fashion consistently.
> 
> > > the only problem is that QEMU seems to match everything to 0; but
> > > that
> > > is arguably not the kernel's problem.
> > > 
> > > For clear, we obviously don't have busy conditions. Should we
> > > clean
> > > up
> > > any pending conditions?  
> > 
> > By doing anything other than issuing the csch to the subchannel?  I
> > don't think so, that should be more than enough to get the css and
> > vfio-ccw in sync with each other.
> 
> Hm, doesn't a successful csch clear any status pending? 

Yep.

> That would mean
> that invoking our csch backend implies that we won't deliver the
> status
> pending that is already pending via the workqueue, which therefore
> needs to be flushed out in some way? 

Ah, so I misunderstood the direction you were going... I'm not aware of
a way to "purge" items from a workqueue, as the flush_workqueue()
routine is documented as picking them off and running them.

Perhaps an atomic flag in (private? cp?) that causes
vfio_ccw_sch_io_todo() to just exit rather than doing all its stuff?

> I remember we did some special
> csch handling, but I don't immediately see where; might have been
> only
> in QEMU.
> 

Maybe.  I don't see anything jumping out at me though. :(



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 2/4] vfio-ccw: Check workqueue before doing START
  2021-04-15 18:42         ` Eric Farman
@ 2021-04-16 14:41           ` Cornelia Huck
  0 siblings, 0 replies; 26+ messages in thread
From: Cornelia Huck @ 2021-04-16 14:41 UTC (permalink / raw)
  To: Eric Farman; +Cc: Halil Pasic, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Thu, 15 Apr 2021 14:42:21 -0400
Eric Farman <farman@linux.ibm.com> wrote:

> On Thu, 2021-04-15 at 18:19 +0200, Cornelia Huck wrote:
> > On Thu, 15 Apr 2021 09:48:37 -0400
> > Eric Farman <farman@linux.ibm.com> wrote:
> >   
> > > On Thu, 2021-04-15 at 12:51 +0200, Cornelia Huck wrote:  
> > > > I'm wondering what we should do for hsch. We probably want to
> > > > return
> > > > -EBUSY for a pending condition as well, if I read the PoP
> > > > correctly...    
> > > 
> > > Ah, yes...  I agree that to maintain parity with ssch and pops, the
> > > same cc1/-EBUSY would be applicable here. Will make that change in
> > > next
> > > version.  
> > 
> > Yes, just to handle things in the same fashion consistently.
> >   
> > > > the only problem is that QEMU seems to match everything to 0; but
> > > > that
> > > > is arguably not the kernel's problem.
> > > > 
> > > > For clear, we obviously don't have busy conditions. Should we
> > > > clean
> > > > up
> > > > any pending conditions?    
> > > 
> > > By doing anything other than issuing the csch to the subchannel?  I
> > > don't think so, that should be more than enough to get the css and
> > > vfio-ccw in sync with each other.  
> > 
> > Hm, doesn't a successful csch clear any status pending?   
> 
> Yep.
> 
> > That would mean
> > that invoking our csch backend implies that we won't deliver the
> > status
> > pending that is already pending via the workqueue, which therefore
> > needs to be flushed out in some way?   
> 
> Ah, so I misunderstood the direction you were going... I'm not aware of
> a way to "purge" items from a workqueue, as the flush_workqueue()
> routine is documented as picking them off and running them.
> 
> Perhaps an atomic flag in (private? cp?) that causes
> vfio_ccw_sch_io_todo() to just exit rather than doing all its stuff?

Yes, maybe something like that.

Maybe we should do that on top once we have a good idea, if the current
series already fixes the problems that are actually happening now and
then.

> 
> > I remember we did some special
> > csch handling, but I don't immediately see where; might have been
> > only
> > in QEMU.
> >   
> 
> Maybe.  I don't see anything jumping out at me though. :(

I might have misremembered; it only really applies to passthrough, as
emulated subchannels are handled synchronously anyway.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 4/4] vfio-ccw: Reset FSM state to IDLE before io_mutex
  2021-04-13 18:24 ` [RFC PATCH v4 4/4] vfio-ccw: Reset FSM state to IDLE before io_mutex Eric Farman
@ 2021-04-21 10:25   ` Cornelia Huck
  2021-04-21 12:58     ` Eric Farman
  0 siblings, 1 reply; 26+ messages in thread
From: Cornelia Huck @ 2021-04-21 10:25 UTC (permalink / raw)
  To: Eric Farman; +Cc: Halil Pasic, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Tue, 13 Apr 2021 20:24:10 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> Today, the stacked call to vfio_ccw_sch_io_todo() does three things:
> 
> 1) Update a solicited IRB with CP information, and release the CP
> if the interrupt was the end of a START operation.
> 2) Copy the IRB data into the io_region, under the protection of
> the io_mutex
> 3) Reset the vfio-ccw FSM state to IDLE to acknowledge that
> vfio-ccw can accept more work.
> 
> The trouble is that step 3 is (A) invoked for both solicited and
> unsolicited interrupts, and (B) sitting after the mutex for step 2.
> This second piece becomes a problem if it processes an interrupt
> for a CLEAR SUBCHANNEL while another thread initiates a START,
> thus allowing the CP and FSM states to get out of sync. That is:
> 
> 	CPU 1				CPU 2
> 	fsm_do_clear()
> 	fsm_irq()
> 					fsm_io_request()
> 					fsm_io_helper()
> 	vfio_ccw_sch_io_todo()
> 					fsm_irq()
> 					vfio_ccw_sch_io_todo()
> 
> Let's move the reset of the FSM state to the point where the
> channel_program struct is cleaned up, which is only done for
> solicited interrupts anyway.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_drv.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
> index 8c625b530035..e51318f23ca8 100644
> --- a/drivers/s390/cio/vfio_ccw_drv.c
> +++ b/drivers/s390/cio/vfio_ccw_drv.c
> @@ -94,16 +94,15 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>  		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
>  	if (scsw_is_solicited(&irb->scsw)) {
>  		cp_update_scsw(&private->cp, &irb->scsw);
> -		if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING)
> +		if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING) {
>  			cp_free(&private->cp);
> +			private->state = VFIO_CCW_STATE_IDLE;
> +		}
>  	}
>  	mutex_lock(&private->io_mutex);
>  	memcpy(private->io_region->irb_area, irb, sizeof(*irb));
>  	mutex_unlock(&private->io_mutex);
>  
> -	if (private->mdev && is_final)
> -		private->state = VFIO_CCW_STATE_IDLE;

Isn't that re-allowing new I/O requests a bit too early? Maybe remember
that we had a final I/O interrupt for an I/O request and only change
the state in this case?


> -
>  	if (private->io_trigger)
>  		eventfd_signal(private->io_trigger, 1);
>  }


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 4/4] vfio-ccw: Reset FSM state to IDLE before io_mutex
  2021-04-21 10:25   ` Cornelia Huck
@ 2021-04-21 12:58     ` Eric Farman
  2021-04-22 16:16       ` Eric Farman
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Farman @ 2021-04-21 12:58 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Halil Pasic, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Wed, 2021-04-21 at 12:25 +0200, Cornelia Huck wrote:
> On Tue, 13 Apr 2021 20:24:10 +0200
> Eric Farman <farman@linux.ibm.com> wrote:
> 
> > Today, the stacked call to vfio_ccw_sch_io_todo() does three
> > things:
> > 
> > 1) Update a solicited IRB with CP information, and release the CP
> > if the interrupt was the end of a START operation.
> > 2) Copy the IRB data into the io_region, under the protection of
> > the io_mutex
> > 3) Reset the vfio-ccw FSM state to IDLE to acknowledge that
> > vfio-ccw can accept more work.
> > 
> > The trouble is that step 3 is (A) invoked for both solicited and
> > unsolicited interrupts, and (B) sitting after the mutex for step 2.
> > This second piece becomes a problem if it processes an interrupt
> > for a CLEAR SUBCHANNEL while another thread initiates a START,
> > thus allowing the CP and FSM states to get out of sync. That is:
> > 
> > 	CPU 1				CPU 2
> > 	fsm_do_clear()
> > 	fsm_irq()
> > 					fsm_io_request()
> > 					fsm_io_helper()
> > 	vfio_ccw_sch_io_todo()
> > 					fsm_irq()
> > 					vfio_ccw_sch_io_todo()
> > 
> > Let's move the reset of the FSM state to the point where the
> > channel_program struct is cleaned up, which is only done for
> > solicited interrupts anyway.
> > 
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> >  drivers/s390/cio/vfio_ccw_drv.c | 7 +++----
> >  1 file changed, 3 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_drv.c
> > b/drivers/s390/cio/vfio_ccw_drv.c
> > index 8c625b530035..e51318f23ca8 100644
> > --- a/drivers/s390/cio/vfio_ccw_drv.c
> > +++ b/drivers/s390/cio/vfio_ccw_drv.c
> > @@ -94,16 +94,15 @@ static void vfio_ccw_sch_io_todo(struct
> > work_struct *work)
> >  		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
> >  	if (scsw_is_solicited(&irb->scsw)) {
> >  		cp_update_scsw(&private->cp, &irb->scsw);
> > -		if (is_final && private->state ==
> > VFIO_CCW_STATE_CP_PENDING)
> > +		if (is_final && private->state ==
> > VFIO_CCW_STATE_CP_PENDING) {
> >  			cp_free(&private->cp);
> > +			private->state = VFIO_CCW_STATE_IDLE;
> > +		}
> >  	}
> >  	mutex_lock(&private->io_mutex);
> >  	memcpy(private->io_region->irb_area, irb, sizeof(*irb));
> >  	mutex_unlock(&private->io_mutex);
> >  
> > -	if (private->mdev && is_final)
> > -		private->state = VFIO_CCW_STATE_IDLE;
> 
> Isn't that re-allowing new I/O requests a bit too early?

Hrm... I guess I don't see what work vfio-ccw has left to do that is
presenting it from carrying on. The copying of the IRB data back into
the io_region seems like a flimsy gate to me. But...

It seems you're (rightly) concerned with userspace doing SSCH + SSCH,
whereas I'v been focused on the CSCH + SSCH sequence. So with this
change, we're inviting the possibility of a second SSCH being able to
be submitted/started before the IRB data for the first SSCH is copied
(and presumably before userspace is tapped to read that data back).

Sigh... I guess that's not the greatest behavior either. Gotta ruminate
on this.

>  Maybe remember
> that we had a final I/O interrupt for an I/O request and only change
> the state in this case?

As a local flag within this routine? Hrm... I have entirely too many
"Let's try this" branches that didn't work, but I don't see that one
jumping out at me. Will give it a try.

> 
> 
> > -
> >  	if (private->io_trigger)
> >  		eventfd_signal(private->io_trigger, 1);
> >  }


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2021-04-13 18:24 [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
                   ` (3 preceding siblings ...)
  2021-04-13 18:24 ` [RFC PATCH v4 4/4] vfio-ccw: Reset FSM state to IDLE before io_mutex Eric Farman
@ 2021-04-22  0:52 ` Halil Pasic
  2021-04-22 20:49   ` Eric Farman
  4 siblings, 1 reply; 26+ messages in thread
From: Halil Pasic @ 2021-04-22  0:52 UTC (permalink / raw)
  To: Eric Farman; +Cc: Cornelia Huck, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Tue, 13 Apr 2021 20:24:06 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> Hi Conny, Halil,
> 
> Let's restart our discussion about the collision between interrupts for
> START SUBCHANNEL and HALT/CLEAR SUBCHANNEL. It's been a quarter million
> minutes (give or take), so here is the problematic scenario again:
> 
> 	CPU 1			CPU 2
>  1	CLEAR SUBCHANNEL
>  2	fsm_irq()
>  3				START SUBCHANNEL
>  4	vfio_ccw_sch_io_todo()
>  5				fsm_irq()
>  6				vfio_ccw_sch_io_todo()
> 
> From the channel subsystem's point of view the CLEAR SUBCHANNEL (step 1)
> is complete once step 2 is called, as the Interrupt Response Block (IRB)
> has been presented and the TEST SUBCHANNEL was driven by the cio layer.
> Thus, the START SUBCHANNEL (step 3) is submitted [1] and gets a cc=0 to
> indicate the I/O was accepted. However, step 2 stacks the bulk of the
> actual work onto a workqueue for when the subchannel lock is NOT held,
> and is unqueued at step 4. That code misidentifies the data in the IRB
> as being associated with the newly active I/O, and may release memory
> that is actively in use by the channel subsystem and/or device. Eww.
> 
> In this version...
> 
> Patch 1 and 2 are defensive checks. Patch 2 was part of v3 [2], but I
> would love a better option here to guard between steps 2 and 4.
> 
> Patch 3 is a subset of the removal of the CP_PENDING FSM state in v3.
> I've obviously gone away from this idea, but I thought this piece is
> still valuable.
> 
> Patch 4 collapses the code on the interrupt path so that changes to
> the FSM state and the channel_program struct are handled at the same
> point, rather than separated by a mutex boundary. Because of the
> possibility of a START and HALT/CLEAR running concurrently, it does
> not make sense to split them here.
> 
> With the above patches, maybe it then makes sense to hold the io_mutex
> across the entirety of vfio_ccw_sch_io_todo(). But I'm not completely
> sure that would be acceptable.
> 
> So... Thoughts?

I believe we should address the concurrency, encapsulation and layering
issues in the subchannel/ccw pass-through code (vfio-ccw) by taking a
holistic approach as soon as possible.

I find the current state of art very hard to reason about, and that
adversely  affects my ability to reason about attempts at partial
improvements.

I understand that such a holistic approach needs a lot of work, and we
may have to stop some bleeding first. In the stop the bleeding phase we
can take a pragmatic approach and accept changes that empirically seem to
work towards stopping the bleeding. I.e. if your tests say it's better,
I'm willing to accept that it is better.

I have to admit, I don't understand how synchronization is done in the
vfio-ccw kernel module (in the sense of avoiding data races).

Regarding your patches, I have to admit, I have a hard time figuring out
which one of these (or what combination of them) is supposed to solve
the problem you described above. If I had to guess, I would guess it is
either patch 4, because it has a similar scenario diagram in the
commit message like the one in the problem statement. Is my guess right?

If it is right I don't quite understand the mechanics of the fix,
because what the patch seems to do is changing the content of step 4 in
the above diagram. And I don't see how is change that code
so that it does not "misidentifies the data in the IRB as being
associated with the newly active I/O". Moreover patch 4 seems to rely on
private->state which, AFAIR is still used in a racy fashion.

But if strong empirical evidence shows that it performs better (stops
the bleeding), I think we can go ahead with it.

Regards,
Halil








^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 4/4] vfio-ccw: Reset FSM state to IDLE before io_mutex
  2021-04-21 12:58     ` Eric Farman
@ 2021-04-22 16:16       ` Eric Farman
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Farman @ 2021-04-22 16:16 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Halil Pasic, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Wed, 2021-04-21 at 08:58 -0400, Eric Farman wrote:
> On Wed, 2021-04-21 at 12:25 +0200, Cornelia Huck wrote:
> > On Tue, 13 Apr 2021 20:24:10 +0200
> > Eric Farman <farman@linux.ibm.com> wrote:
> > 
> > > Today, the stacked call to vfio_ccw_sch_io_todo() does three
> > > things:
> > > 
> > > 1) Update a solicited IRB with CP information, and release the CP
> > > if the interrupt was the end of a START operation.
> > > 2) Copy the IRB data into the io_region, under the protection of
> > > the io_mutex
> > > 3) Reset the vfio-ccw FSM state to IDLE to acknowledge that
> > > vfio-ccw can accept more work.
> > > 
> > > The trouble is that step 3 is (A) invoked for both solicited and
> > > unsolicited interrupts, and (B) sitting after the mutex for step
> > > 2.
> > > This second piece becomes a problem if it processes an interrupt
> > > for a CLEAR SUBCHANNEL while another thread initiates a START,
> > > thus allowing the CP and FSM states to get out of sync. That is:
> > > 
> > > 	CPU 1				CPU 2
> > > 	fsm_do_clear()
> > > 	fsm_irq()
> > > 					fsm_io_request()
> > > 					fsm_io_helper()
> > > 	vfio_ccw_sch_io_todo()
> > > 					fsm_irq()
> > > 					vfio_ccw_sch_io_todo()
> > > 
> > > Let's move the reset of the FSM state to the point where the
> > > channel_program struct is cleaned up, which is only done for
> > > solicited interrupts anyway.
> > > 
> > > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > > ---
> > >  drivers/s390/cio/vfio_ccw_drv.c | 7 +++----
> > >  1 file changed, 3 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/s390/cio/vfio_ccw_drv.c
> > > b/drivers/s390/cio/vfio_ccw_drv.c
> > > index 8c625b530035..e51318f23ca8 100644
> > > --- a/drivers/s390/cio/vfio_ccw_drv.c
> > > +++ b/drivers/s390/cio/vfio_ccw_drv.c
> > > @@ -94,16 +94,15 @@ static void vfio_ccw_sch_io_todo(struct
> > > work_struct *work)
> > >  		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
> > >  	if (scsw_is_solicited(&irb->scsw)) {
> > >  		cp_update_scsw(&private->cp, &irb->scsw);
> > > -		if (is_final && private->state ==
> > > VFIO_CCW_STATE_CP_PENDING)
> > > +		if (is_final && private->state ==
> > > VFIO_CCW_STATE_CP_PENDING) {
> > >  			cp_free(&private->cp);
> > > +			private->state = VFIO_CCW_STATE_IDLE;
> > > +		}
> > >  	}
> > >  	mutex_lock(&private->io_mutex);
> > >  	memcpy(private->io_region->irb_area, irb, sizeof(*irb));
> > >  	mutex_unlock(&private->io_mutex);
> > >  
> > > -	if (private->mdev && is_final)
> > > -		private->state = VFIO_CCW_STATE_IDLE;
> > 
> > Isn't that re-allowing new I/O requests a bit too early?
> 
> Hrm... I guess I don't see what work vfio-ccw has left to do that is
> presenting it from carrying on. The copying of the IRB data back into
> the io_region seems like a flimsy gate to me. But...
> 
> It seems you're (rightly) concerned with userspace doing SSCH + SSCH,
> whereas I'v been focused on the CSCH + SSCH sequence. So with this
> change, we're inviting the possibility of a second SSCH being able to
> be submitted/started before the IRB data for the first SSCH is copied
> (and presumably before userspace is tapped to read that data back).
> 
> Sigh... I guess that's not the greatest behavior either. Gotta
> ruminate
> on this.
> 
> >  Maybe remember
> > that we had a final I/O interrupt for an I/O request and only
> > change
> > the state in this case?
> 
> As a local flag within this routine? Hrm... I have entirely too many
> "Let's try this" branches that didn't work, but I don't see that one
> jumping out at me. Will give it a try.

Still going strong, so that bodes really well (knock wood). I need to
spend a little time with patch 2 before I send the next version, but
that shouldn't be too long.

Eric

> 
> > 
> > > -
> > >  	if (private->io_trigger)
> > >  		eventfd_signal(private->io_trigger, 1);
> > >  }


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2021-04-22  0:52 ` [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR Halil Pasic
@ 2021-04-22 20:49   ` Eric Farman
  2021-04-23 11:06     ` Cornelia Huck
  2021-04-23 11:50     ` Halil Pasic
  0 siblings, 2 replies; 26+ messages in thread
From: Eric Farman @ 2021-04-22 20:49 UTC (permalink / raw)
  To: Halil Pasic; +Cc: Cornelia Huck, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Thu, 2021-04-22 at 02:52 +0200, Halil Pasic wrote:
> On Tue, 13 Apr 2021 20:24:06 +0200
> Eric Farman <farman@linux.ibm.com> wrote:
> 
> > Hi Conny, Halil,
> > 
> > Let's restart our discussion about the collision between interrupts
> > for
> > START SUBCHANNEL and HALT/CLEAR SUBCHANNEL. It's been a quarter
> > million
> > minutes (give or take), so here is the problematic scenario again:
> > 
> > 	CPU 1			CPU 2
> >  1	CLEAR SUBCHANNEL
> >  2	fsm_irq()
> >  3				START SUBCHANNEL
> >  4	vfio_ccw_sch_io_todo()
> >  5				fsm_irq()
> >  6				vfio_ccw_sch_io_todo()
> > 
> > From the channel subsystem's point of view the CLEAR SUBCHANNEL
> > (step 1)
> > is complete once step 2 is called, as the Interrupt Response Block
> > (IRB)
> > has been presented and the TEST SUBCHANNEL was driven by the cio
> > layer.
> > Thus, the START SUBCHANNEL (step 3) is submitted [1] and gets a
> > cc=0 to
> > indicate the I/O was accepted. However, step 2 stacks the bulk of
> > the
> > actual work onto a workqueue for when the subchannel lock is NOT
> > held,
> > and is unqueued at step 4. That code misidentifies the data in the
> > IRB
> > as being associated with the newly active I/O, and may release
> > memory
> > that is actively in use by the channel subsystem and/or device.
> > Eww.
> > 
> > In this version...
> > 
> > Patch 1 and 2 are defensive checks. Patch 2 was part of v3 [2], but
> > I
> > would love a better option here to guard between steps 2 and 4.
> > 
> > Patch 3 is a subset of the removal of the CP_PENDING FSM state in
> > v3.
> > I've obviously gone away from this idea, but I thought this piece
> > is
> > still valuable.
> > 
> > Patch 4 collapses the code on the interrupt path so that changes to
> > the FSM state and the channel_program struct are handled at the
> > same
> > point, rather than separated by a mutex boundary. Because of the
> > possibility of a START and HALT/CLEAR running concurrently, it does
> > not make sense to split them here.
> > 
> > With the above patches, maybe it then makes sense to hold the
> > io_mutex
> > across the entirety of vfio_ccw_sch_io_todo(). But I'm not
> > completely
> > sure that would be acceptable.
> > 
> > So... Thoughts?
> 
> I believe we should address

Who is the "we" here?

>  the concurrency, encapsulation and layering
> issues in the subchannel/ccw pass-through code (vfio-ccw) by taking a
> holistic approach as soon as possible.
> 
> I find the current state of art very hard to reason about, and that
> adversely  affects my ability to reason about attempts at partial
> improvements.
> 
> I understand that such a holistic approach needs a lot of work, and
> we
> may have to stop some bleeding first. In the stop the bleeding phase
> we
> can take a pragmatic approach and accept changes that empirically
> seem to
> work towards stopping the bleeding. I.e. if your tests say it's
> better,
> I'm willing to accept that it is better.

So much bleeding!

RE: my tests... I have only been seeing the described problem in
pathological tests, and this series lets those tests run without issue.

> 
> I have to admit, I don't understand how synchronization is done in
> the
> vfio-ccw kernel module (in the sense of avoiding data races).
> 
> Regarding your patches, I have to admit, I have a hard time figuring
> out
> which one of these (or what combination of them) is supposed to solve
> the problem you described above. If I had to guess, I would guess it
> is
> either patch 4, because it has a similar scenario diagram in the
> commit message like the one in the problem statement. Is my guess
> right?

Sort of. It is true that Patch 4 is the last piece of the puzzle, and
the diagram is included in that commit message so it is kept with the
change, instead of being lost with the cover letter.

As I said in the cover letter, "Patch 1 and 2 are defensive checks"
which are simply included to provide a more robust solution. You could
argue that Patch 3 should be held out separately, but as it came from
the previous version of this series it made sense to include here.

> 
> If it is right I don't quite understand the mechanics of the fix,
> because what the patch seems to do is changing the content of step 4
> in
> the above diagram. And I don't see how is change that code
> so that it does not "misidentifies the data in the IRB as being
> associated with the newly active I/O". 

Consider that the cp_update_scsw() and cp_free() routines that get
called here are looking at the cp->initialized flag to determine
whether to perform any work. For a system that is otherwise idle, the
cp->initialized flag will be false when processing an IRB related to a
CSCH, meaning the bulk of this routine will be a NOP.

In the failing scenario, as I describe in the commit message for patch
4, we could be processing an interrupt that is unaffiliated with the CP
that was (or is being) built. It need not even be a solicited
interrupt; it just happened that the CSCH interrupt is what got me
looking at this path. The whole situation boils down to the FSM state
and cp->initialized flag being out of sync from one another after
coming through this function.

> Moreover patch 4 seems to rely on
> private->state which, AFAIR is still used in a racy fashion.
> 
> But if strong empirical evidence shows that it performs better (stops
> the bleeding), I think we can go ahead with it.

Again with the bleeding. Is there a Doctor in the house? :)

Eric

> 
> Regards,
> Halil
> 
> 
> 
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2021-04-22 20:49   ` Eric Farman
@ 2021-04-23 11:06     ` Cornelia Huck
  2021-04-23 13:23       ` Halil Pasic
  2021-04-23 11:50     ` Halil Pasic
  1 sibling, 1 reply; 26+ messages in thread
From: Cornelia Huck @ 2021-04-23 11:06 UTC (permalink / raw)
  To: Eric Farman; +Cc: Halil Pasic, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Thu, 22 Apr 2021 16:49:21 -0400
Eric Farman <farman@linux.ibm.com> wrote:

> On Thu, 2021-04-22 at 02:52 +0200, Halil Pasic wrote:
> > On Tue, 13 Apr 2021 20:24:06 +0200
> > Eric Farman <farman@linux.ibm.com> wrote:
> >   
> > > Hi Conny, Halil,
> > > 
> > > Let's restart our discussion about the collision between interrupts
> > > for
> > > START SUBCHANNEL and HALT/CLEAR SUBCHANNEL. It's been a quarter
> > > million
> > > minutes (give or take), so here is the problematic scenario again:
> > > 
> > > 	CPU 1			CPU 2
> > >  1	CLEAR SUBCHANNEL
> > >  2	fsm_irq()
> > >  3				START SUBCHANNEL
> > >  4	vfio_ccw_sch_io_todo()
> > >  5				fsm_irq()
> > >  6				vfio_ccw_sch_io_todo()
> > > 
> > > From the channel subsystem's point of view the CLEAR SUBCHANNEL
> > > (step 1)
> > > is complete once step 2 is called, as the Interrupt Response Block
> > > (IRB)
> > > has been presented and the TEST SUBCHANNEL was driven by the cio
> > > layer.
> > > Thus, the START SUBCHANNEL (step 3) is submitted [1] and gets a
> > > cc=0 to
> > > indicate the I/O was accepted. However, step 2 stacks the bulk of
> > > the
> > > actual work onto a workqueue for when the subchannel lock is NOT
> > > held,
> > > and is unqueued at step 4. That code misidentifies the data in the
> > > IRB
> > > as being associated with the newly active I/O, and may release
> > > memory
> > > that is actively in use by the channel subsystem and/or device.
> > > Eww.
> > > 
> > > In this version...
> > > 
> > > Patch 1 and 2 are defensive checks. Patch 2 was part of v3 [2], but
> > > I
> > > would love a better option here to guard between steps 2 and 4.
> > > 
> > > Patch 3 is a subset of the removal of the CP_PENDING FSM state in
> > > v3.
> > > I've obviously gone away from this idea, but I thought this piece
> > > is
> > > still valuable.
> > > 
> > > Patch 4 collapses the code on the interrupt path so that changes to
> > > the FSM state and the channel_program struct are handled at the
> > > same
> > > point, rather than separated by a mutex boundary. Because of the
> > > possibility of a START and HALT/CLEAR running concurrently, it does
> > > not make sense to split them here.
> > > 
> > > With the above patches, maybe it then makes sense to hold the
> > > io_mutex
> > > across the entirety of vfio_ccw_sch_io_todo(). But I'm not
> > > completely
> > > sure that would be acceptable.
> > > 
> > > So... Thoughts?  
> > 
> > I believe we should address  
> 
> Who is the "we" here?
> 
> >  the concurrency, encapsulation and layering
> > issues in the subchannel/ccw pass-through code (vfio-ccw) by taking a
> > holistic approach as soon as possible.

Let me also ask: what is "holistic"? If that's a complete rewrite, I
definitely don't have the capacity for that; if others want to take
over the code, feel free.

> > 
> > I find the current state of art very hard to reason about, and that
> > adversely  affects my ability to reason about attempts at partial
> > improvements.
> > 
> > I understand that such a holistic approach needs a lot of work, and
> > we
> > may have to stop some bleeding first. In the stop the bleeding phase
> > we
> > can take a pragmatic approach and accept changes that empirically
> > seem to
> > work towards stopping the bleeding. I.e. if your tests say it's
> > better,
> > I'm willing to accept that it is better.  
> 
> So much bleeding!
> 
> RE: my tests... I have only been seeing the described problem in
> pathological tests, and this series lets those tests run without issue.

FWIW, I haven't been able to reproduce the problem myself, and I don't
remember seeing other reports. It's still a problem, and if we can get
rid of the issue, good. The reasoning about what is happening makes
sense to me.

> 
> > 
> > I have to admit, I don't understand how synchronization is done in
> > the
> > vfio-ccw kernel module (in the sense of avoiding data races).
> > 
> > Regarding your patches, I have to admit, I have a hard time figuring
> > out
> > which one of these (or what combination of them) is supposed to solve
> > the problem you described above. If I had to guess, I would guess it
> > is
> > either patch 4, because it has a similar scenario diagram in the
> > commit message like the one in the problem statement. Is my guess
> > right?  
> 
> Sort of. It is true that Patch 4 is the last piece of the puzzle, and
> the diagram is included in that commit message so it is kept with the
> change, instead of being lost with the cover letter.
> 
> As I said in the cover letter, "Patch 1 and 2 are defensive checks"
> which are simply included to provide a more robust solution. You could
> argue that Patch 3 should be held out separately, but as it came from
> the previous version of this series it made sense to include here.
> 
> > 
> > If it is right I don't quite understand the mechanics of the fix,
> > because what the patch seems to do is changing the content of step 4
> > in
> > the above diagram. And I don't see how is change that code
> > so that it does not "misidentifies the data in the IRB as being
> > associated with the newly active I/O".   
> 
> Consider that the cp_update_scsw() and cp_free() routines that get
> called here are looking at the cp->initialized flag to determine
> whether to perform any work. For a system that is otherwise idle, the
> cp->initialized flag will be false when processing an IRB related to a
> CSCH, meaning the bulk of this routine will be a NOP.
> 
> In the failing scenario, as I describe in the commit message for patch
> 4, we could be processing an interrupt that is unaffiliated with the CP
> that was (or is being) built. It need not even be a solicited
> interrupt; it just happened that the CSCH interrupt is what got me
> looking at this path. The whole situation boils down to the FSM state
> and cp->initialized flag being out of sync from one another after
> coming through this function.

Nod, that's also my understanding.

> 
> > Moreover patch 4 seems to rely on
> > private->state which, AFAIR is still used in a racy fashion.
> > 
> > But if strong empirical evidence shows that it performs better (stops
> > the bleeding), I think we can go ahead with it.  
> 
> Again with the bleeding. Is there a Doctor in the house? :)

No idea, seen any blue boxes around? :)


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2021-04-22 20:49   ` Eric Farman
  2021-04-23 11:06     ` Cornelia Huck
@ 2021-04-23 11:50     ` Halil Pasic
  2021-04-23 15:53       ` Eric Farman
  1 sibling, 1 reply; 26+ messages in thread
From: Halil Pasic @ 2021-04-23 11:50 UTC (permalink / raw)
  To: Eric Farman; +Cc: Cornelia Huck, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Thu, 22 Apr 2021 16:49:21 -0400
Eric Farman <farman@linux.ibm.com> wrote:

> On Thu, 2021-04-22 at 02:52 +0200, Halil Pasic wrote:
> > On Tue, 13 Apr 2021 20:24:06 +0200
> > Eric Farman <farman@linux.ibm.com> wrote:
> >   
> > > Hi Conny, Halil,
> > > 
> > > Let's restart our discussion about the collision between interrupts
> > > for
> > > START SUBCHANNEL and HALT/CLEAR SUBCHANNEL. It's been a quarter
> > > million
> > > minutes (give or take), so here is the problematic scenario again:
> > > 
> > > 	CPU 1			CPU 2
> > >  1	CLEAR SUBCHANNEL
> > >  2	fsm_irq()
> > >  3				START SUBCHANNEL
> > >  4	vfio_ccw_sch_io_todo()
> > >  5				fsm_irq()
> > >  6				vfio_ccw_sch_io_todo()
> > > 
> > > From the channel subsystem's point of view the CLEAR SUBCHANNEL
> > > (step 1)
> > > is complete once step 2 is called, as the Interrupt Response Block
> > > (IRB)
> > > has been presented and the TEST SUBCHANNEL was driven by the cio
> > > layer.
> > > Thus, the START SUBCHANNEL (step 3) is submitted [1] and gets a
> > > cc=0 to
> > > indicate the I/O was accepted. However, step 2 stacks the bulk of
> > > the
> > > actual work onto a workqueue for when the subchannel lock is NOT
> > > held,
> > > and is unqueued at step 4. That code misidentifies the data in the
> > > IRB
> > > as being associated with the newly active I/O, and may release
> > > memory
> > > that is actively in use by the channel subsystem and/or device.
> > > Eww.
> > > 
> > > In this version...
> > > 
> > > Patch 1 and 2 are defensive checks. Patch 2 was part of v3 [2], but
> > > I
> > > would love a better option here to guard between steps 2 and 4.
> > > 
> > > Patch 3 is a subset of the removal of the CP_PENDING FSM state in
> > > v3.
> > > I've obviously gone away from this idea, but I thought this piece
> > > is
> > > still valuable.
> > > 
> > > Patch 4 collapses the code on the interrupt path so that changes to
> > > the FSM state and the channel_program struct are handled at the
> > > same
> > > point, rather than separated by a mutex boundary. Because of the
> > > possibility of a START and HALT/CLEAR running concurrently, it does
> > > not make sense to split them here.
> > > 
> > > With the above patches, maybe it then makes sense to hold the
> > > io_mutex
> > > across the entirety of vfio_ccw_sch_io_todo(). But I'm not
> > > completely
> > > sure that would be acceptable.
> > > 
> > > So... Thoughts?  
> > 
> > I believe we should address  
> 
> Who is the "we" here?
> 

The people that are responsible for vfio-ccw. 

> >  the concurrency, encapsulation and layering
> > issues in the subchannel/ccw pass-through code (vfio-ccw) by taking a
> > holistic approach as soon as possible.
> > 
> > I find the current state of art very hard to reason about, and that
> > adversely  affects my ability to reason about attempts at partial
> > improvements.
> > 
> > I understand that such a holistic approach needs a lot of work, and
> > we
> > may have to stop some bleeding first. In the stop the bleeding phase
> > we
> > can take a pragmatic approach and accept changes that empirically
> > seem to
> > work towards stopping the bleeding. I.e. if your tests say it's
> > better,
> > I'm willing to accept that it is better.  
> 
> So much bleeding!
> 
> RE: my tests... I have only been seeing the described problem in
> pathological tests, and this series lets those tests run without issue.
> 

Good to know.

> > 
> > I have to admit, I don't understand how synchronization is done in
> > the
> > vfio-ccw kernel module (in the sense of avoiding data races).
> > 
> > Regarding your patches, I have to admit, I have a hard time figuring
> > out
> > which one of these (or what combination of them) is supposed to solve
> > the problem you described above. If I had to guess, I would guess it
> > is
> > either patch 4, because it has a similar scenario diagram in the
> > commit message like the one in the problem statement. Is my guess
> > right?  
> 
> Sort of. It is true that Patch 4 is the last piece of the puzzle, and
> the diagram is included in that commit message so it is kept with the
> change, instead of being lost with the cover letter.
> 
> As I said in the cover letter, "Patch 1 and 2 are defensive checks"
> which are simply included to provide a more robust solution. You could
> argue that Patch 3 should be held out separately, but as it came from
> the previous version of this series it made sense to include here.
> 

Does that mean we need patches 1, 2 and 4 to fix the issue or is just
4 sufficient?

> > 
> > If it is right I don't quite understand the mechanics of the fix,
> > because what the patch seems to do is changing the content of step 4
> > in
> > the above diagram. And I don't see how is change that code
> > so that it does not "misidentifies the data in the IRB as being
> > associated with the newly active I/O".   
> 
> Consider that the cp_update_scsw() and cp_free() routines that get
> called here are looking at the cp->initialized flag to determine
> whether to perform any work. For a system that is otherwise idle, the
> cp->initialized flag will be false when processing an IRB related to a
> CSCH, meaning the bulk of this routine will be a NOP.
> 
> In the failing scenario, as I describe in the commit message for patch
> 4, we could be processing an interrupt that is unaffiliated with the CP
> that was (or is being) built. It need not even be a solicited
> interrupt; it just happened that the CSCH interrupt is what got me
> looking at this path. The whole situation boils down to the FSM state
> and cp->initialized flag being out of sync from one another after
> coming through this function.
> 

Thanks for the explanation. Since you are about to send out a new
verison which I understand won't be just about cosmetic fixes, I won't
invest any more in understanding this one. But I hope this will help me
understand that one. 

> > Moreover patch 4 seems to rely on
> > private->state which, AFAIR is still used in a racy fashion.
> > 
> > But if strong empirical evidence shows that it performs better (stops
> > the bleeding), I think we can go ahead with it.  
> 
> Again with the bleeding. Is there a Doctor in the house? :)
> 

Sorry if I expressed myself comically. Was not my intention. I'm puzzled.

Is in your opinion the vfio-ccw kernel module data race free with this
series applied?

Regards,
Halil

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2021-04-23 11:06     ` Cornelia Huck
@ 2021-04-23 13:23       ` Halil Pasic
  2021-04-23 13:28         ` Niklas Schnelle
  2021-04-23 15:53         ` Eric Farman
  0 siblings, 2 replies; 26+ messages in thread
From: Halil Pasic @ 2021-04-23 13:23 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Eric Farman, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Fri, 23 Apr 2021 13:06:16 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Thu, 22 Apr 2021 16:49:21 -0400
> Eric Farman <farman@linux.ibm.com> wrote:
> 
> > On Thu, 2021-04-22 at 02:52 +0200, Halil Pasic wrote:  
> > > On Tue, 13 Apr 2021 20:24:06 +0200
> > > Eric Farman <farman@linux.ibm.com> wrote:
> > >     
> > > > Hi Conny, Halil,
> > > > 
> > > > Let's restart our discussion about the collision between interrupts
> > > > for
> > > > START SUBCHANNEL and HALT/CLEAR SUBCHANNEL. It's been a quarter
> > > > million
> > > > minutes (give or take), so here is the problematic scenario again:
> > > > 
> > > > 	CPU 1			CPU 2
> > > >  1	CLEAR SUBCHANNEL
> > > >  2	fsm_irq()
> > > >  3				START SUBCHANNEL
> > > >  4	vfio_ccw_sch_io_todo()
> > > >  5				fsm_irq()
> > > >  6				vfio_ccw_sch_io_todo()
> > > > 
> > > > From the channel subsystem's point of view the CLEAR SUBCHANNEL
> > > > (step 1)
> > > > is complete once step 2 is called, as the Interrupt Response Block
> > > > (IRB)
> > > > has been presented and the TEST SUBCHANNEL was driven by the cio
> > > > layer.
> > > > Thus, the START SUBCHANNEL (step 3) is submitted [1] and gets a
> > > > cc=0 to
> > > > indicate the I/O was accepted. However, step 2 stacks the bulk of
> > > > the
> > > > actual work onto a workqueue for when the subchannel lock is NOT
> > > > held,
> > > > and is unqueued at step 4. That code misidentifies the data in the
> > > > IRB
> > > > as being associated with the newly active I/O, and may release
> > > > memory
> > > > that is actively in use by the channel subsystem and/or device.
> > > > Eww.
> > > > 
> > > > In this version...
> > > > 
> > > > Patch 1 and 2 are defensive checks. Patch 2 was part of v3 [2], but
> > > > I
> > > > would love a better option here to guard between steps 2 and 4.
> > > > 
> > > > Patch 3 is a subset of the removal of the CP_PENDING FSM state in
> > > > v3.
> > > > I've obviously gone away from this idea, but I thought this piece
> > > > is
> > > > still valuable.
> > > > 
> > > > Patch 4 collapses the code on the interrupt path so that changes to
> > > > the FSM state and the channel_program struct are handled at the
> > > > same
> > > > point, rather than separated by a mutex boundary. Because of the
> > > > possibility of a START and HALT/CLEAR running concurrently, it does
> > > > not make sense to split them here.
> > > > 
> > > > With the above patches, maybe it then makes sense to hold the
> > > > io_mutex
> > > > across the entirety of vfio_ccw_sch_io_todo(). But I'm not
> > > > completely
> > > > sure that would be acceptable.
> > > > 
> > > > So... Thoughts?    
> > > 
> > > I believe we should address    
> > 
> > Who is the "we" here?
> >   
> > >  the concurrency, encapsulation and layering
> > > issues in the subchannel/ccw pass-through code (vfio-ccw) by taking a
> > > holistic approach as soon as possible.  
> 
> Let me also ask: what is "holistic"? If that's a complete rewrite, I
> definitely don't have the capacity for that; if others want to take
> over the code, feel free.
> 

In general: https://en.wikipedia.org/wiki/Holism

In this context I mean:
* Fix all data races in in the vfio-ccw module instead of making the
"race window" smaller. Reasoning about the behavior of racy programs
is very difficult.
* The passed-through subchannel of the VM, as seen by the guest OS is an
overlay of the host subchannel (which we have to assume is within the
specification), the vfio-ccw kernel module, and an userspace emulator.
The interface between the kernel module and the userspace emulator is
something the authors of the vfio-ccw kernel module design, and while
doing so we have to think about the interface the whole solution needs
to implemnet. For example we should ask ourselves: what is the right
response in kernel when we encounter the situation described by the
steps 1-3 of Eric's scenario. Our VMs subchannel needs to reward the
SSC with a cc=2 if the subchannel has the clear FC bit set. If we detect
the described condition, does it mean that the userspace emulator is
broken? Or is the userspace emulator allowed to rely on the vfio-ccw
kernel module to detect this condition and return an -EBUSY (which
corresponds to cc=2 because that is apart of the definition of the
interface between the kernel and the userspace)? When is the FC bit
of our VMs subchannel cleared? I read patch 2 like it is trying to catch
the condition and return an -EBUSY, but I don't see it catching all
the possible cases. I.e. what if another CPU is executing the first
instruction of vfio_ccw_sch_io_todo() when we check 
work_pending(&private->io_work) in fsm_io_helper()?

[..]

> >   
> > > Moreover patch 4 seems to rely on
> > > private->state which, AFAIR is still used in a racy fashion.
> > > 
> > > But if strong empirical evidence shows that it performs better (stops
> > > the bleeding), I think we can go ahead with it.    
> > 
> > Again with the bleeding. Is there a Doctor in the house? :)  
> 
> No idea, seen any blue boxes around? :)
> 

Let me also ask what: blue boxes do you mean? If you mean
https://en.wikipedia.org/wiki/Blue_box
then, I'm not sure I can follow your association. Are you looking for
phone to call a doctor?

Regards,
Halil

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2021-04-23 13:23       ` Halil Pasic
@ 2021-04-23 13:28         ` Niklas Schnelle
  2021-04-23 15:53         ` Eric Farman
  1 sibling, 0 replies; 26+ messages in thread
From: Niklas Schnelle @ 2021-04-23 13:28 UTC (permalink / raw)
  To: Halil Pasic, Cornelia Huck
  Cc: Eric Farman, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Fri, 2021-04-23 at 15:23 +0200, Halil Pasic wrote:
> On Fri, 23 Apr 2021 13:06:16 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Thu, 22 Apr 2021 16:49:21 -0400
> > Eric Farman <farman@linux.ibm.com> wrote:
> > 
> > > On Thu, 2021-04-22 at 02:52 +0200, Halil Pasic wrote:  
> > > > On Tue, 13 Apr 2021 20:24:06 +0200
> > > > Eric Farman <farman@linux.ibm.com> wrote:
> > > >     
> > > > > Hi Conny, Halil,
> > > > > 
> > > > > Let's restart our discussion about the collision between interrupts
> > > > > for
> > > > > START SUBCHANNEL and HALT/CLEAR SUBCHANNEL. It's been a quarter
> > > > > million
> > > > > minutes (give or take), so here is the problematic scenario again:
> > > > > 
> > > > > 	CPU 1			CPU 2
> > > > >  1	CLEAR SUBCHANNEL
> > > > >  2	fsm_irq()
> > > > >  3				START SUBCHANNEL
> > > > >  4	vfio_ccw_sch_io_todo()
> > > > >  5				fsm_irq()
> > > > >  6				vfio_ccw_sch_io_todo()
> > > > > 
> > > > > From the channel subsystem's point of view the CLEAR SUBCHANNEL
> > > > > (step 1)
> > > > > is complete once step 2 is called, as the Interrupt Response Block
> > > > > (IRB)
> > > > > has been presented and the TEST SUBCHANNEL was driven by the cio
> > > > > layer.
> > > > > Thus, the START SUBCHANNEL (step 3) is submitted [1] and gets a
> > > > > cc=0 to
> > > > > indicate the I/O was accepted. However, step 2 stacks the bulk of
> > > > > the
> > > > > actual work onto a workqueue for when the subchannel lock is NOT
> > > > > held,
> > > > > and is unqueued at step 4. That code misidentifies the data in the
> > > > > IRB
> > > > > as being associated with the newly active I/O, and may release
> > > > > memory
> > > > > that is actively in use by the channel subsystem and/or device.
> > > > > Eww.
> > > > > 
> > > > > In this version...
> > > > > 
> > > > > Patch 1 and 2 are defensive checks. Patch 2 was part of v3 [2], but
> > > > > I
> > > > > would love a better option here to guard between steps 2 and 4.
> > > > > 
> > > > > Patch 3 is a subset of the removal of the CP_PENDING FSM state in
> > > > > v3.
> > > > > I've obviously gone away from this idea, but I thought this piece
> > > > > is
> > > > > still valuable.
> > > > > 
> > > > > Patch 4 collapses the code on the interrupt path so that changes to
> > > > > the FSM state and the channel_program struct are handled at the
> > > > > same
> > > > > point, rather than separated by a mutex boundary. Because of the
> > > > > possibility of a START and HALT/CLEAR running concurrently, it does
> > > > > not make sense to split them here.
> > > > > 
> > > > > With the above patches, maybe it then makes sense to hold the
> > > > > io_mutex
> > > > > across the entirety of vfio_ccw_sch_io_todo(). But I'm not
> > > > > completely
> > > > > sure that would be acceptable.
> > > > > 
> > > > > So... Thoughts?    
> > > > 
> > > > I believe we should address    
> > > 
> > > Who is the "we" here?
> > >   
> > > >  the concurrency, encapsulation and layering
> > > > issues in the subchannel/ccw pass-through code (vfio-ccw) by taking a
> > > > holistic approach as soon as possible.  
> > 
> > Let me also ask: what is "holistic"? If that's a complete rewrite, I
> > definitely don't have the capacity for that; if others want to take
> > over the code, feel free.
> > 
> 
> In general: https://en.wikipedia.org/wiki/Holism
> 
> In this context I mean:
> * Fix all data races in in the vfio-ccw module instead of making the
> "race window" smaller. Reasoning about the behavior of racy programs
> is very difficult.
> * The passed-through subchannel of the VM, as seen by the guest OS is an
> overlay of the host subchannel (which we have to assume is within the
> specification), the vfio-ccw kernel module, and an userspace emulator.
> The interface between the kernel module and the userspace emulator is
> something the authors of the vfio-ccw kernel module design, and while
> doing so we have to think about the interface the whole solution needs
> to implemnet. For example we should ask ourselves: what is the right
> response in kernel when we encounter the situation described by the
> steps 1-3 of Eric's scenario. Our VMs subchannel needs to reward the
> SSC with a cc=2 if the subchannel has the clear FC bit set. If we detect
> the described condition, does it mean that the userspace emulator is
> broken? Or is the userspace emulator allowed to rely on the vfio-ccw
> kernel module to detect this condition and return an -EBUSY (which
> corresponds to cc=2 because that is apart of the definition of the
> interface between the kernel and the userspace)? When is the FC bit
> of our VMs subchannel cleared? I read patch 2 like it is trying to catch
> the condition and return an -EBUSY, but I don't see it catching all
> the possible cases. I.e. what if another CPU is executing the first
> instruction of vfio_ccw_sch_io_todo() when we check 
> work_pending(&private->io_work) in fsm_io_helper()?
> 
> [..]
> 
> > >   
> > > > Moreover patch 4 seems to rely on
> > > > private->state which, AFAIR is still used in a racy fashion.
> > > > 
> > > > But if strong empirical evidence shows that it performs better (stops
> > > > the bleeding), I think we can go ahead with it.    
> > > 
> > > Again with the bleeding. Is there a Doctor in the house? :)  
> > 
> > No idea, seen any blue boxes around? :)
> > 
> 
> Let me also ask what: blue boxes do you mean? If you mean
> https://en.wikipedia.org/wiki/Blue_box
> then, I'm not sure I can follow your association. Are you looking for
> phone to call a doctor?

I think he's talking about this phone box: 
https://en.wikipedia.org/wiki/TARDIS

> 
> Regards,
> Halil


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2021-04-23 11:50     ` Halil Pasic
@ 2021-04-23 15:53       ` Eric Farman
  2021-04-23 17:08         ` Halil Pasic
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Farman @ 2021-04-23 15:53 UTC (permalink / raw)
  To: Halil Pasic; +Cc: Cornelia Huck, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Fri, 2021-04-23 at 13:50 +0200, Halil Pasic wrote:
> On Thu, 22 Apr 2021 16:49:21 -0400
> Eric Farman <farman@linux.ibm.com> wrote:
> 
> > On Thu, 2021-04-22 at 02:52 +0200, Halil Pasic wrote:
> > > On Tue, 13 Apr 2021 20:24:06 +0200
> > > Eric Farman <farman@linux.ibm.com> wrote:
> > >   
> > > > Hi Conny, Halil,
> > > > 
> > > > Let's restart our discussion about the collision between
> > > > interrupts
> > > > for
> > > > START SUBCHANNEL and HALT/CLEAR SUBCHANNEL. It's been a quarter
> > > > million
> > > > minutes (give or take), so here is the problematic scenario
> > > > again:
> > > > 
> > > > 	CPU 1			CPU 2
> > > >  1	CLEAR SUBCHANNEL
> > > >  2	fsm_irq()
> > > >  3				START SUBCHANNEL
> > > >  4	vfio_ccw_sch_io_todo()
> > > >  5				fsm_irq()
> > > >  6				vfio_ccw_sch_io_todo()
> > > > 
> > > > From the channel subsystem's point of view the CLEAR SUBCHANNEL
> > > > (step 1)
> > > > is complete once step 2 is called, as the Interrupt Response
> > > > Block
> > > > (IRB)
> > > > has been presented and the TEST SUBCHANNEL was driven by the
> > > > cio
> > > > layer.
> > > > Thus, the START SUBCHANNEL (step 3) is submitted [1] and gets a
> > > > cc=0 to
> > > > indicate the I/O was accepted. However, step 2 stacks the bulk
> > > > of
> > > > the
> > > > actual work onto a workqueue for when the subchannel lock is
> > > > NOT
> > > > held,
> > > > and is unqueued at step 4. That code misidentifies the data in
> > > > the
> > > > IRB
> > > > as being associated with the newly active I/O, and may release
> > > > memory
> > > > that is actively in use by the channel subsystem and/or device.
> > > > Eww.
> > > > 
> > > > In this version...
> > > > 
> > > > Patch 1 and 2 are defensive checks. Patch 2 was part of v3 [2],
> > > > but
> > > > I
> > > > would love a better option here to guard between steps 2 and 4.
> > > > 
> > > > Patch 3 is a subset of the removal of the CP_PENDING FSM state
> > > > in
> > > > v3.
> > > > I've obviously gone away from this idea, but I thought this
> > > > piece
> > > > is
> > > > still valuable.
> > > > 
> > > > Patch 4 collapses the code on the interrupt path so that
> > > > changes to
> > > > the FSM state and the channel_program struct are handled at the
> > > > same
> > > > point, rather than separated by a mutex boundary. Because of
> > > > the
> > > > possibility of a START and HALT/CLEAR running concurrently, it
> > > > does
> > > > not make sense to split them here.
> > > > 
> > > > With the above patches, maybe it then makes sense to hold the
> > > > io_mutex
> > > > across the entirety of vfio_ccw_sch_io_todo(). But I'm not
> > > > completely
> > > > sure that would be acceptable.
> > > > 
> > > > So... Thoughts?  
> > > 
> > > I believe we should address  
> > 
> > Who is the "we" here?
> > 
> 
> The people that are responsible for vfio-ccw. 
> 
> > >  the concurrency, encapsulation and layering
> > > issues in the subchannel/ccw pass-through code (vfio-ccw) by
> > > taking a
> > > holistic approach as soon as possible.
> > > 
> > > I find the current state of art very hard to reason about, and
> > > that
> > > adversely  affects my ability to reason about attempts at partial
> > > improvements.
> > > 
> > > I understand that such a holistic approach needs a lot of work,
> > > and
> > > we
> > > may have to stop some bleeding first. In the stop the bleeding
> > > phase
> > > we
> > > can take a pragmatic approach and accept changes that empirically
> > > seem to
> > > work towards stopping the bleeding. I.e. if your tests say it's
> > > better,
> > > I'm willing to accept that it is better.  
> > 
> > So much bleeding!
> > 
> > RE: my tests... I have only been seeing the described problem in
> > pathological tests, and this series lets those tests run without
> > issue.
> > 
> 
> Good to know.
> 
> > > I have to admit, I don't understand how synchronization is done
> > > in
> > > the
> > > vfio-ccw kernel module (in the sense of avoiding data races).
> > > 
> > > Regarding your patches, I have to admit, I have a hard time
> > > figuring
> > > out
> > > which one of these (or what combination of them) is supposed to
> > > solve
> > > the problem you described above. If I had to guess, I would guess
> > > it
> > > is
> > > either patch 4, because it has a similar scenario diagram in the
> > > commit message like the one in the problem statement. Is my guess
> > > right?  
> > 
> > Sort of. It is true that Patch 4 is the last piece of the puzzle,
> > and
> > the diagram is included in that commit message so it is kept with
> > the
> > change, instead of being lost with the cover letter.
> > 
> > As I said in the cover letter, "Patch 1 and 2 are defensive checks"
> > which are simply included to provide a more robust solution. You
> > could
> > argue that Patch 3 should be held out separately, but as it came
> > from
> > the previous version of this series it made sense to include here.
> > 
> 
> Does that mean we need patches 1, 2 and 4 to fix the issue or is just
> 4 sufficient?

Based on everything I understand, I would not feel comfortable with
only 4.

If you look at the commit message for patch 1, I do explain why its
absence is not exposing any serialization problems today. But as it is
part of the "CCW translation API" (documented in
Documentation/s390/vfio-ccw.rst) I feel it is important to include.

Something needs to be done in the transition described by Patch 2, even
though it still has problems in its current form.

> 
> > > If it is right I don't quite understand the mechanics of the fix,
> > > because what the patch seems to do is changing the content of
> > > step 4
> > > in
> > > the above diagram. And I don't see how is change that code
> > > so that it does not "misidentifies the data in the IRB as being
> > > associated with the newly active I/O".   
> > 
> > Consider that the cp_update_scsw() and cp_free() routines that get
> > called here are looking at the cp->initialized flag to determine
> > whether to perform any work. For a system that is otherwise idle,
> > the
> > cp->initialized flag will be false when processing an IRB related
> > to a
> > CSCH, meaning the bulk of this routine will be a NOP.
> > 
> > In the failing scenario, as I describe in the commit message for
> > patch
> > 4, we could be processing an interrupt that is unaffiliated with
> > the CP
> > that was (or is being) built. It need not even be a solicited
> > interrupt; it just happened that the CSCH interrupt is what got me
> > looking at this path. The whole situation boils down to the FSM
> > state
> > and cp->initialized flag being out of sync from one another after
> > coming through this function.
> > 
> 
> Thanks for the explanation. Since you are about to send out a new
> verison which I understand won't be just about cosmetic fixes, I
> won't
> invest any more in understanding this one. But I hope this will help
> me
> understand that one. 
> 
> > > Moreover patch 4 seems to rely on
> > > private->state which, AFAIR is still used in a racy fashion.
> > > 
> > > But if strong empirical evidence shows that it performs better
> > > (stops
> > > the bleeding), I think we can go ahead with it.  
> > 
> > Again with the bleeding. Is there a Doctor in the house? :)
> > 
> 
> Sorry if I expressed myself comically. Was not my intention. I'm
> puzzled.

No need to be sorry, and I didn't mean to offend you. The repeated use
of such a dramatic phrase struck me as humorous, because it conveyed a
much more prevalant problem than the one being fixed.

> 
> Is in your opinion the vfio-ccw kernel module data race free with
> this
> series applied?

I have no further concerns.

Eric

> 
> Regards,
> Halil


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2021-04-23 13:23       ` Halil Pasic
  2021-04-23 13:28         ` Niklas Schnelle
@ 2021-04-23 15:53         ` Eric Farman
  1 sibling, 0 replies; 26+ messages in thread
From: Eric Farman @ 2021-04-23 15:53 UTC (permalink / raw)
  To: Halil Pasic, Cornelia Huck; +Cc: Matthew Rosato, Jared Rossi, linux-s390, kvm

On Fri, 2021-04-23 at 15:23 +0200, Halil Pasic wrote:
> On Fri, 23 Apr 2021 13:06:16 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Thu, 22 Apr 2021 16:49:21 -0400
> > Eric Farman <farman@linux.ibm.com> wrote:
> > 
> > > On Thu, 2021-04-22 at 02:52 +0200, Halil Pasic wrote:  
> > > > On Tue, 13 Apr 2021 20:24:06 +0200
> > > > Eric Farman <farman@linux.ibm.com> wrote:
> > > >     
> > > > > Hi Conny, Halil,
> > > > > 
> > > > > Let's restart our discussion about the collision between
> > > > > interrupts
> > > > > for
> > > > > START SUBCHANNEL and HALT/CLEAR SUBCHANNEL. It's been a
> > > > > quarter
> > > > > million
> > > > > minutes (give or take), so here is the problematic scenario
> > > > > again:
> > > > > 
> > > > > 	CPU 1			CPU 2
> > > > >  1	CLEAR SUBCHANNEL
> > > > >  2	fsm_irq()
> > > > >  3				START SUBCHANNEL
> > > > >  4	vfio_ccw_sch_io_todo()
> > > > >  5				fsm_irq()
> > > > >  6				vfio_ccw_sch_io_todo()
> > > > > 
> > > > > From the channel subsystem's point of view the CLEAR
> > > > > SUBCHANNEL
> > > > > (step 1)
> > > > > is complete once step 2 is called, as the Interrupt Response
> > > > > Block
> > > > > (IRB)
> > > > > has been presented and the TEST SUBCHANNEL was driven by the
> > > > > cio
> > > > > layer.
> > > > > Thus, the START SUBCHANNEL (step 3) is submitted [1] and gets
> > > > > a
> > > > > cc=0 to
> > > > > indicate the I/O was accepted. However, step 2 stacks the
> > > > > bulk of
> > > > > the
> > > > > actual work onto a workqueue for when the subchannel lock is
> > > > > NOT
> > > > > held,
> > > > > and is unqueued at step 4. That code misidentifies the data
> > > > > in the
> > > > > IRB
> > > > > as being associated with the newly active I/O, and may
> > > > > release
> > > > > memory
> > > > > that is actively in use by the channel subsystem and/or
> > > > > device.
> > > > > Eww.
> > > > > 
> > > > > In this version...
> > > > > 
> > > > > Patch 1 and 2 are defensive checks. Patch 2 was part of v3
> > > > > [2], but
> > > > > I
> > > > > would love a better option here to guard between steps 2 and
> > > > > 4.
> > > > > 
> > > > > Patch 3 is a subset of the removal of the CP_PENDING FSM
> > > > > state in
> > > > > v3.
> > > > > I've obviously gone away from this idea, but I thought this
> > > > > piece
> > > > > is
> > > > > still valuable.
> > > > > 
> > > > > Patch 4 collapses the code on the interrupt path so that
> > > > > changes to
> > > > > the FSM state and the channel_program struct are handled at
> > > > > the
> > > > > same
> > > > > point, rather than separated by a mutex boundary. Because of
> > > > > the
> > > > > possibility of a START and HALT/CLEAR running concurrently,
> > > > > it does
> > > > > not make sense to split them here.
> > > > > 
> > > > > With the above patches, maybe it then makes sense to hold the
> > > > > io_mutex
> > > > > across the entirety of vfio_ccw_sch_io_todo(). But I'm not
> > > > > completely
> > > > > sure that would be acceptable.
> > > > > 
> > > > > So... Thoughts?    
> > > > 
> > > > I believe we should address    
> > > 
> > > Who is the "we" here?
> > >   
> > > >  the concurrency, encapsulation and layering
> > > > issues in the subchannel/ccw pass-through code (vfio-ccw) by
> > > > taking a
> > > > holistic approach as soon as possible.  
> > 
> > Let me also ask: what is "holistic"? If that's a complete rewrite,
> > I
> > definitely don't have the capacity for that; if others want to take
> > over the code, feel free.
> > 

Ditto.

The idea of a rewrite has come up in the past, and I still don't see
how that's a good use of time/resources. Looking at the fixes and
improvements from the last couple of years, I feel good about the
current components, their design, and their handshaking.

> 
> In general: https://en.wikipedia.org/wiki/Holism
> 
> In this context I mean:
> * Fix all data races in in the vfio-ccw module instead of making the
> "race window" smaller. Reasoning about the behavior of racy programs
> is very difficult.
> * The passed-through subchannel of the VM, as seen by the guest OS is
> an
> overlay of the host subchannel (which we have to assume is within the
> specification), the vfio-ccw kernel module, and an userspace
> emulator.
> The interface between the kernel module and the userspace emulator is
> something the authors of the vfio-ccw kernel module design, and while
> doing so we have to think about the interface the whole solution
> needs
> to implemnet. For example we should ask ourselves: what is the right
> response in kernel when we encounter the situation described by the
> steps 1-3 of Eric's scenario. Our VMs subchannel needs to reward the
> SSC with a cc=2 if the subchannel has the clear FC bit set. If we
> detect
> the described condition, does it mean that the userspace emulator is
> broken? Or is the userspace emulator allowed to rely on the vfio-ccw
> kernel module to detect this condition and return an -EBUSY (which
> corresponds to cc=2 because that is apart of the definition of the
> interface between the kernel and the userspace)? When is the FC bit
> of our VMs subchannel cleared? 



> I read patch 2 like it is trying to catch
> the condition and return an -EBUSY, but I don't see it catching all
> the possible cases. I.e. what if another CPU is executing the first
> instruction of vfio_ccw_sch_io_todo() when we check 
> work_pending(&private->io_work) in fsm_io_helper()?

This last hunk is part of the concern that Conny raised in reply to
Patch 2, which we have an idea on how to pursue.

Eric

> 
> [..]
> 
> > >   
> > > > Moreover patch 4 seems to rely on
> > > > private->state which, AFAIR is still used in a racy fashion.
> > > > 
> > > > But if strong empirical evidence shows that it performs better
> > > > (stops
> > > > the bleeding), I think we can go ahead with it.    
> > > 
> > > Again with the bleeding. Is there a Doctor in the house? :)  
> > 
> > No idea, seen any blue boxes around? :)
> > 
> 
> Let me also ask what: blue boxes do you mean? If you mean
> https://en.wikipedia.org/wiki/Blue_box
> then, I'm not sure I can follow your association. Are you looking for
> phone to call a doctor?
> 
> Regards,
> Halil


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2021-04-23 15:53       ` Eric Farman
@ 2021-04-23 17:08         ` Halil Pasic
  2021-04-23 19:07           ` Eric Farman
  0 siblings, 1 reply; 26+ messages in thread
From: Halil Pasic @ 2021-04-23 17:08 UTC (permalink / raw)
  To: Eric Farman; +Cc: Cornelia Huck, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Fri, 23 Apr 2021 11:53:06 -0400
Eric Farman <farman@linux.ibm.com> wrote:

> > 
> > Is in your opinion the vfio-ccw kernel module data race free with
> > this
> > series applied?  
> 
> I have no further concerns.

I take this for a "yes, in my opinion it is data race free".

Please explain me, how do we synchronize access to
a) private->state
b) cp->initialized?

Asuming we both use the definition from the C standard, the multiple
threads potentially concurrently accessing private->state or
cp->initialized are not hard to find.

For the former you can take vfio_ccw_sch_io_todo() (has both a read
and a write, and runs from the workqueue) and fsm_io_request() (has
both read and write). I guess the accesses from fsm_io_request() are
all with the io_mutex held, but the accesses form vfio_ccw_sch_io_todo()
are not with the io_mutex held AFAICT.

And the same is true for the latter with the difference that 
vfio_ccw_sch_io_todo() manipulates cp->initialized via cp_update_scsw()
and cp_free().

These are either data races, or I'm missing something. Let me also
note that C programs data races are bad news, because of undefined
behavior.

Regards,
Halil



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2021-04-23 17:08         ` Halil Pasic
@ 2021-04-23 19:07           ` Eric Farman
  2021-04-24  0:18             ` Halil Pasic
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Farman @ 2021-04-23 19:07 UTC (permalink / raw)
  To: Halil Pasic; +Cc: Cornelia Huck, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Fri, 2021-04-23 at 19:08 +0200, Halil Pasic wrote:
> On Fri, 23 Apr 2021 11:53:06 -0400
> Eric Farman <farman@linux.ibm.com> wrote:
> 
> > > Is in your opinion the vfio-ccw kernel module data race free with
> > > this
> > > series applied?  
> > 
> > I have no further concerns.
> 
> I take this for a "yes, in my opinion it is data race free".

You asked about once this series is applied, which it is not.

> 
> Please explain me, how do we synchronize access to
> a) private->state
> b) cp->initialized?
> 
> Asuming we both use the definition from the C standard, the multiple
> threads potentially concurrently accessing private->state or
> cp->initialized are not hard to find.
> 

Correct. The examples you provide below are the subject of patch 2,
which has already been identified as having issues and will be
reworked.

Thanks,
Eric

> For the former you can take vfio_ccw_sch_io_todo() (has both a read
> and a write, and runs from the workqueue) and fsm_io_request() (has
> both read and write). I guess the accesses from fsm_io_request() are
> all with the io_mutex held, but the accesses form
> vfio_ccw_sch_io_todo()
> are not with the io_mutex held AFAICT.
> 
> And the same is true for the latter with the difference that 
> vfio_ccw_sch_io_todo() manipulates cp->initialized via
> cp_update_scsw()
> and cp_free().
> 
> These are either data races, or I'm missing something. Let me also
> note that C programs data races are bad news, because of undefined
> behavior.
> 
> Regards,
> Halil
> 
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR
  2021-04-23 19:07           ` Eric Farman
@ 2021-04-24  0:18             ` Halil Pasic
  0 siblings, 0 replies; 26+ messages in thread
From: Halil Pasic @ 2021-04-24  0:18 UTC (permalink / raw)
  To: Eric Farman; +Cc: Cornelia Huck, Matthew Rosato, Jared Rossi, linux-s390, kvm

On Fri, 23 Apr 2021 15:07:17 -0400
Eric Farman <farman@linux.ibm.com> wrote:

> > > > Is in your opinion the vfio-ccw kernel module data race free with
> > > > this
> > > > series applied?    
> > > 
> > > I have no further concerns.  
> > 
> > I take this for a "yes, in my opinion it is data race free".  
> 
> You asked about once this series is applied, which it is not.

I applied this series to the then current master several days ago, so
I have no problem reasoning about that state of code.  It seems we had
a misunderstanding. I was talking about code that is available (this
series "[RFC PATCH v4] vfio-ccw: Fix interrupt handling for HALT/CLEAR"),
and you were talking some future non-RFC merged incarnation of it I had
no opportunity to examine at this point in time.

I'm looking forward to that incarnation. And by the way, if an RFC
contains problems that are intentionally not addressed, because the
author wants to get feedback on some other aspect on the problem, I
prefer the things that are not dealt with stated clearly. Otherwise I
tend to invest time into finding the problems the author is already
aware of, and telling him that his solution is not correct because of
those.

Regards,
Halil

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, back to index

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-13 18:24 [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
2021-04-13 18:24 ` [RFC PATCH v4 1/4] vfio-ccw: Check initialized flag in cp_init() Eric Farman
2021-04-14 16:30   ` Cornelia Huck
2021-04-13 18:24 ` [RFC PATCH v4 2/4] vfio-ccw: Check workqueue before doing START Eric Farman
2021-04-15 10:51   ` Cornelia Huck
2021-04-15 13:48     ` Eric Farman
2021-04-15 16:19       ` Cornelia Huck
2021-04-15 18:42         ` Eric Farman
2021-04-16 14:41           ` Cornelia Huck
2021-04-13 18:24 ` [RFC PATCH v4 3/4] vfio-ccw: Reset FSM state to IDLE inside FSM Eric Farman
2021-04-15 10:54   ` Cornelia Huck
2021-04-13 18:24 ` [RFC PATCH v4 4/4] vfio-ccw: Reset FSM state to IDLE before io_mutex Eric Farman
2021-04-21 10:25   ` Cornelia Huck
2021-04-21 12:58     ` Eric Farman
2021-04-22 16:16       ` Eric Farman
2021-04-22  0:52 ` [RFC PATCH v4 0/4] vfio-ccw: Fix interrupt handling for HALT/CLEAR Halil Pasic
2021-04-22 20:49   ` Eric Farman
2021-04-23 11:06     ` Cornelia Huck
2021-04-23 13:23       ` Halil Pasic
2021-04-23 13:28         ` Niklas Schnelle
2021-04-23 15:53         ` Eric Farman
2021-04-23 11:50     ` Halil Pasic
2021-04-23 15:53       ` Eric Farman
2021-04-23 17:08         ` Halil Pasic
2021-04-23 19:07           ` Eric Farman
2021-04-24  0:18             ` Halil Pasic

KVM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \
		kvm@vger.kernel.org
	public-inbox-index kvm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.kvm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git