From: Yongji Xie <xieyongji@bytedance.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Jason Wang" <jasowang@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Stefano Garzarella" <sgarzare@redhat.com>,
	"Parav Pandit" <parav@nvidia.com>,
	"Christoph Hellwig" <hch@infradead.org>,
	"Christian Brauner" <christian.brauner@canonical.com>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Al Viro" <viro@zeniv.linux.org.uk>,
	"Jens Axboe" <axboe@kernel.dk>,
	bcrl@kvack.org, "Jonathan Corbet" <corbet@lwn.net>,
	"Mika Penttilä" <mika.penttila@nextfour.com>,
	"Dan Carpenter" <dan.carpenter@oracle.com>,
	joro@8bytes.org, "Greg KH" <gregkh@linuxfoundation.org>,
	"He Zhe" <zhe.he@windriver.com>,
	"Liu Xiaodong" <xiaodong.liu@intel.com>,
	"Joe Perches" <joe@perches.com>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Will Deacon" <will@kernel.org>,
	"John Garry" <john.garry@huawei.com>,
	songmuchun@bytedance.com,
	virtualization <virtualization@lists.linux-foundation.org>,
	netdev@vger.kernel.org, kvm <kvm@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops
Date: Mon, 6 Sep 2021 16:45:55 +0800
Message-ID: <CACycT3vQHRsJ_j5f4T9RoB4MQzBoYO5ts3egVe9K6TcCVfLOFQ@mail.gmail.com>
In-Reply-To: <20210906035338-mutt-send-email-mst@kernel.org>

On Mon, Sep 6, 2021 at 4:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Sep 06, 2021 at 03:06:44PM +0800, Yongji Xie wrote:
> > On Mon, Sep 6, 2021 at 2:37 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote:
> > > > On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote:
> > > > > > This adds a new callback to support device specific reset
> > > > > > behavior. The vdpa bus driver will call the reset function
> > > > > > instead of setting status to zero during resetting.
> > > > > >
> > > > > > Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> > > > >
> > > > >
> > > > > This does gloss over a significant change though:
> > > > >
> > > > >
> > > > > > ---
> > > > > > @@ -348,12 +352,12 @@ static inline struct device *vdpa_get_dma_dev(struct vdpa_device *vdev)
> > > > > >       return vdev->dma_dev;
> > > > > >  }
> > > > > >
> > > > > > -static inline void vdpa_reset(struct vdpa_device *vdev)
> > > > > > +static inline int vdpa_reset(struct vdpa_device *vdev)
> > > > > >  {
> > > > > >       const struct vdpa_config_ops *ops = vdev->config;
> > > > > >
> > > > > >       vdev->features_valid = false;
> > > > > > -     ops->set_status(vdev, 0);
> > > > > > +     return ops->reset(vdev);
> > > > > >  }
> > > > > >
> > > > > >  static inline int vdpa_set_features(struct vdpa_device *vdev, u64 features)
> > > > >
> > > > >
> > > > > Unfortunately this breaks virtio_vdpa:
> > > > >
> > > > >
> > > > > static void virtio_vdpa_reset(struct virtio_device *vdev)
> > > > > {
> > > > >         struct vdpa_device *vdpa = vd_get_vdpa(vdev);
> > > > >
> > > > >         vdpa_reset(vdpa);
> > > > > }
> > > > >
> > > > >
> > > > > and there's no easy way to fix this, kernel can't recover
> > > > > from a reset failure e.g. during driver unbind.
> > > > >
> > > >
> > > > Yes, but it should be safe with the protection of software IOTLB even
> > > > if the reset() fails during driver unbind.
> > > >
> > > > Thanks,
> > > > Yongji
> > >
> > > Hmm. I don't see it.
> > > What exactly will happen? What prevents device from poking at
> > > memory after reset? Note that dma unmap in e.g. del_vqs happens
> > > too late.
> >
> > But I didn't see any problems with touching the memory for virtqueues.
>
> Drivers make the assumption that after reset returns no new
> buffers will be consumed. For example a bunch of drivers
> call virtqueue_detach_unused_buf.

I'm not sure I get your point. It looks like
virtqueue_detach_unused_buf() checks the driver's own metadata first
rather than reading the virtqueue memory.
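
As a rough illustration of that point, here is a simplified userspace
model (hypothetical names, not the code in virtio_ring.c). The detach
helper walks driver-side per-descriptor tokens and never reads the ring
memory shared with the device:

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_RING_SIZE 4

/* Hypothetical driver-side bookkeeping: one token per descriptor slot.
 * A NULL token means the slot holds no outstanding buffer. */
struct model_ring {
	void *token[MODEL_RING_SIZE];
};

/* Return and clear the first outstanding token, or NULL if there is none.
 * Only driver-private state is consulted here. */
static void *model_detach_unused_buf(struct model_ring *ring)
{
	assert(ring != NULL);

	for (size_t i = 0; i < MODEL_RING_SIZE; i++) {
		void *buf = ring->token[i];

		if (buf == NULL)
			continue;
		ring->token[i] = NULL;
		return buf;
	}
	return NULL;
}

/* Usage: drain every outstanding buffer, as a driver might on remove. */
static int model_drain(struct model_ring *ring)
{
	int n = 0;

	while (model_detach_unused_buf(ring) != NULL)
		n++;
	return n;
}
```

This only models the lookup. It says nothing about whether a device
that keeps running after a failed reset could still consume buffers
that were already made available.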

> I can't say whether block makes this assumption anywhere.
> Needs careful auditing.
>
> > The memory should not be freed after dma unmap?
>
> But unmap does not happen until after the reset.
>

I mean the memory is entirely allocated and controlled by the VDUSE
driver. The VDUSE driver will not return it to the buddy system
until userspace unmaps it.

>
> > And the memory for the bounce buffer should also be safe to be
> > accessed by userspace in this case.
> >
> > > And what about e.g. interrupts?
> > > E.g. we have this:
> > >
> > >         /* Virtqueues are stopped, nothing can use vblk->vdev anymore. */
> > >         vblk->vdev = NULL;
> > >
> > > and this is no longer true at this point.
> > >
> >
> > You're right. But I didn't see where the interrupt handler will use
> > the vblk->vdev.
>
> static void virtblk_done(struct virtqueue *vq)
> {
>         struct virtio_blk *vblk = vq->vdev->priv;
>
> vq->vdev is the same as vblk->vdev.
>

The VDUSE driver tests vq->ready (which is set to false in del_vqs())
before injecting an interrupt. So it should be OK?
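
A minimal sketch of the kind of check meant here, written as a
standalone userspace model. The struct and function names are
hypothetical and do not match the actual VDUSE code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VDUSE virtqueue. */
struct model_vq {
	bool ready;              /* cleared when the virtqueue is torn down */
	void (*cb)(void *priv);  /* driver's interrupt callback */
	void *priv;
};

/* Deliver an interrupt only while the queue is ready and has a callback.
 * Returns true if the callback ran, false if the request was dropped. */
static bool model_inject_irq(struct model_vq *vq)
{
	assert(vq != NULL);

	if (!vq->ready || vq->cb == NULL)
		return false;
	vq->cb(vq->priv);
	return true;
}

/* Example callback: count delivered interrupts. */
static void model_count_irq(void *priv)
{
	int *count = priv;

	(*count)++;
}
```

In this model an injection request is dropped once ready is false. The
check is only as good as the point at which ready gets cleared, so any
window between reset and del_vqs() is not covered by it.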

>
> > So it seems to be not too late to fix it:
> >
> > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > index 5c25ff6483ad..ea41a7389a26 100644
> > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > @@ -665,13 +665,13 @@ static void vduse_vdpa_set_config(struct vdpa_device *vdpa, unsigned int offset,
> >  static int vduse_vdpa_reset(struct vdpa_device *vdpa)
> >  {
> >         struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> > +       int ret;
> >
> > -       if (vduse_dev_set_status(dev, 0))
> > -               return -EIO;
> > +       ret = vduse_dev_set_status(dev, 0);
> >
> >         vduse_dev_reset(dev);
> >
> > -       return 0;
> > +       return ret;
> >  }
> >
> >  static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa)
> >
> > Thanks,
> > Yongji
>
> Needs some comments to explain why it's done like this.
>

This is to make sure that userspace can no longer inject interrupts
after reset. vduse_dev_reset() will clear the interrupt callback and
flush the irq kworker.
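
The ordering in the diff above can be sketched as a small userspace
model. All names are hypothetical, and the -EIO returned by the status
helper is an assumption made for the example, not necessarily what
vduse_dev_set_status() returns. The point is that the local reset runs
unconditionally and the status error is still reported to the caller:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state touched during reset. */
struct model_dev {
	bool userspace_ok;       /* does the userspace daemon accept status 0? */
	void (*irq_cb)(void *);  /* interrupt callback, cleared by reset */
	bool reset_done;
};

/* Example callback that does nothing. */
static void model_noop_cb(void *priv)
{
	(void)priv;
}

/* Model of asking userspace to set the status; may fail (assumed -EIO). */
static int model_set_status(struct model_dev *dev, unsigned char status)
{
	(void)status;
	return dev->userspace_ok ? 0 : -EIO;
}

/* Model of the kernel-side reset: drop the callback so that no further
 * interrupt can reach the driver. */
static void model_dev_reset(struct model_dev *dev)
{
	dev->irq_cb = NULL;
	dev->reset_done = true;
}

/* Always perform the local reset, then report the status result. */
static int model_reset(struct model_dev *dev)
{
	int ret;

	assert(dev != NULL);

	ret = model_set_status(dev, 0);
	model_dev_reset(dev);
	return ret;
}
```

With this ordering a failed status update still leaves the callback
cleared, which is the property described above.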

> BTW device is generally wedged at this point right?
> E.g. if reset during initialization fails, userspace
> will still get the reset at some later point and be
> confused ...
>

Sorry, I don't see why userspace would get the reset at some later point?

Thanks,
Yongji
