From: Cornelia Huck <cohuck@redhat.com>
To: Jason Gunthorpe <jgg@nvidia.com>, Eric Farman <farman@linux.ibm.com>
Cc: David Airlie <airlied@linux.ie>,
	Tony Krowiak <akrowiak@linux.ibm.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Daniel Vetter <daniel@ffwll.ch>,
	dri-devel@lists.freedesktop.org,
	Harald Freudenberger <freude@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	intel-gfx@lists.freedesktop.org,
	intel-gvt-dev@lists.freedesktop.org,
	Jani Nikula <jani.nikula@linux.intel.com>,
	Jason Herne <jjherne@linux.ibm.com>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	kvm@vger.kernel.org,
	Kirti Wankhede <kwankhede@nvidia.com>,
	linux-s390@vger.kernel.org,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	Peter Oberparleiter <oberpar@linux.ibm.com>,
	Halil Pasic <pasic@linux.ibm.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Vineeth Vijayan <vneethv@linux.ibm.com>,
	Zhenyu Wang <zhenyuw@linux.intel.com>,
	Zhi Wang <zhi.a.wang@intel.com>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v2 0/9] Move vfio_ccw to the new mdev API
Date: Fri, 17 Sep 2021 13:59:16 +0200
Message-ID: <87h7ejh0q3.fsf@redhat.com>
In-Reply-To: <20210914133618.GD4065468@nvidia.com>

On Tue, Sep 14 2021, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Mon, Sep 13, 2021 at 04:31:54PM -0400, Eric Farman wrote:
>> > I rebased it and fixed it up here:
>> >
>> > https://github.com/jgunthorpe/linux/tree/vfio_ccw
>> >
>> > Can you try again?
>>
>> That does address the crash, but then why is it processing a BROKEN
>> event? Seems problematic.
>
> The stuff related to the NOT_OPER looked really wonky to me. I'm
> guessing this is the issue - not sure about the pmcw.ena either..

[I have still not been able to digest the whole series, sorry.]

>
> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> index 5ea392959c0711..0d4d4f425befac 100644
> --- a/drivers/s390/cio/vfio_ccw_fsm.c
> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> @@ -380,29 +380,19 @@ static void fsm_open(struct vfio_ccw_private *private,
>  	spin_unlock_irq(sch->lock);
>  }
>
> -static void fsm_close(struct vfio_ccw_private *private,
> -		      enum vfio_ccw_event event)
> +static int flush_sch(struct vfio_ccw_private *private)
>  {
>  	struct subchannel *sch = private->sch;
>  	DECLARE_COMPLETION_ONSTACK(completion);
>  	int iretry, ret = 0;
>
> -	spin_lock_irq(sch->lock);
> -	if (!sch->schib.pmcw.ena)
> -		goto err_unlock;
> -	ret = cio_disable_subchannel(sch);
> -	if (ret != -EBUSY)
> -		goto err_unlock;
> -
>  	iretry = 255;
>  	do {
> -
>  		ret = cio_cancel_halt_clear(sch, &iretry);
> -
>  		if (ret == -EIO) {
>  			pr_err("vfio_ccw: could not quiesce subchannel 0.%x.%04x!\n",
>  			       sch->schid.ssid, sch->schid.sch_no);
> -			break;
> +			return ret;

Looking at this, I wonder why we had special-cased -EIO -- for -ENODEV
we should be done as well, as then the device is dead and we do not need
to disable it.

>  		}
>
>  		/*
> @@ -413,13 +403,28 @@ static void fsm_close(struct vfio_ccw_private *private,
>  		spin_unlock_irq(sch->lock);
>
>  		if (ret == -EBUSY)
> -			wait_for_completion_timeout(&completion, 3*HZ);
> +			wait_for_completion_timeout(&completion, 3 * HZ);
>
>  		private->completion = NULL;
>  		flush_workqueue(vfio_ccw_work_q);
>  		spin_lock_irq(sch->lock);
>  		ret = cio_disable_subchannel(sch);
>  	} while (ret == -EBUSY);
> +	return ret;
> +}
> +
> +static void fsm_close(struct vfio_ccw_private *private,
> +		      enum vfio_ccw_event event)
> +{
> +	struct subchannel *sch = private->sch;
> +	int ret;
> +
> +	spin_lock_irq(sch->lock);
> +	if (!sch->schib.pmcw.ena)
> +		goto err_unlock;
> +	ret = cio_disable_subchannel(sch);

cio_disable_subchannel() should be happy to disable an already disabled
subchannel, so I guess we can just walk through this and end up in
CLOSED state... unless entering with !ena actually indicates that we
messed up somewhere else in the state machine. I still need to find time
to read the patches.

> +	if (ret == -EBUSY)
> +		ret = flush_sch(private);
>  	if (ret)
>  		goto err_unlock;
>  	private->state = VFIO_CCW_STATE_CLOSED;
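
The two remarks in this reply (treat -ENODEV from the quiesce step as
"done", and do not bail out early when the subchannel is already
disabled) can be sketched in userspace C. This is only a toy model under
stated assumptions, not the driver code: struct model_sch,
model_disable(), model_quiesce(), model_flush() and model_close() are
hypothetical stand-ins for the subchannel, cio_disable_subchannel(),
cio_cancel_halt_clear(), flush_sch() and fsm_close(). The return codes
each stand-in produces, and staying in the open state on error, are
assumptions made for illustration; locking and the completion wait are
left out entirely.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a subchannel; NOT the kernel's struct subchannel. */
struct model_sch {
	bool enabled;     /* models schib.pmcw.ena */
	bool dead;        /* assumption: quiesce reports -ENODEV */
	bool stuck;       /* assumption: quiesce runs out of retries, -EIO */
	int busy_rounds;  /* quiesce rounds needed before disable succeeds */
};

enum model_state { MODEL_STATE_OPEN, MODEL_STATE_CLOSED };

/* Assumed behaviour: disabling an already disabled subchannel succeeds. */
static int model_disable(struct model_sch *sch)
{
	if (!sch->enabled)
		return 0;
	if (sch->busy_rounds > 0)
		return -EBUSY;
	sch->enabled = false;
	return 0;
}

/* Stand-in for the cancel/halt/clear step. */
static int model_quiesce(struct model_sch *sch)
{
	if (sch->dead)
		return -ENODEV;
	if (sch->stuck)
		return -EIO;
	if (sch->busy_rounds > 0) {
		sch->busy_rounds--;
		return sch->busy_rounds ? -EBUSY : 0;
	}
	return 0;
}

/* Flush loop: -EIO is a hard failure, -ENODEV means nothing is left to do. */
static int model_flush(struct model_sch *sch)
{
	int ret;

	do {
		ret = model_quiesce(sch);
		if (ret == -EIO)
			return ret;
		if (ret == -ENODEV)
			return 0; /* device is dead, no need to disable it */
		ret = model_disable(sch);
	} while (ret == -EBUSY);
	return ret;
}

/* Close without the early exit on !enabled: just walk through. */
static enum model_state model_close(struct model_sch *sch)
{
	int ret = model_disable(sch);

	if (ret == -EBUSY)
		ret = model_flush(sch);
	/* Assumption: on error the model simply stays open. */
	return ret ? MODEL_STATE_OPEN : MODEL_STATE_CLOSED;
}
```

In this model an already disabled subchannel, a busy one that quiesces
after a few rounds, and a dead one all end up closed; only the stuck
(-EIO) case stays open. Whether entering close with !ena should instead
be treated as a state machine bug is exactly the open question above,
and the sketch does not answer it.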