From: Jason Wang <jasowang@redhat.com>
To: Eli Cohen <elic@nvidia.com>
Cc: Eugenio Perez Martin <eperezma@redhat.com>, mst <mst@redhat.com>,
	virtualization <virtualization@lists.linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Si-Wei Liu <si-wei.liu@oracle.com>,
	Parav Pandit <parav@nvidia.com>
Subject: Re: [PATCH RFC 1/3] vdpa/mlx5: Implement suspend virtqueue callback
Date: Mon, 18 Jul 2022 17:03:51 +0800	[thread overview]
Message-ID: <CACGkMEvCdg8pheddm7iTAJcvPhaWHA0nagASREf8F1zcGZ6AiA@mail.gmail.com> (raw)
In-Reply-To: <DM8PR12MB5400DBB3EBA91C250E703E1EAB899@DM8PR12MB5400.namprd12.prod.outlook.com>

On Wed, Jul 13, 2022 at 1:18 PM Eli Cohen <elic@nvidia.com> wrote:
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Wednesday, July 13, 2022 6:29 AM
> > To: Eugenio Perez Martin <eperezma@redhat.com>
> > Cc: Eli Cohen <elic@nvidia.com>; mst <mst@redhat.com>; virtualization <virtualization@lists.linux-foundation.org>; linux-kernel
> > <linux-kernel@vger.kernel.org>; Si-Wei Liu <si-wei.liu@oracle.com>; Parav Pandit <parav@nvidia.com>
> > Subject: Re: [PATCH RFC 1/3] vdpa/mlx5: Implement suspend virtqueue callback
> >
> > On Tue, Jul 12, 2022 at 5:16 PM Eugenio Perez Martin
> > <eperezma@redhat.com> wrote:
> > >
> > > On Tue, Jul 12, 2022 at 10:14 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Mon, Jul 11, 2022 at 2:14 PM Eli Cohen <elic@nvidia.com> wrote:
> > > > >
> > > > > > From: Jason Wang <jasowang@redhat.com>
> > > > > > Sent: Tuesday, June 21, 2022 6:05 AM
> > > > > > To: Eugenio Perez Martin <eperezma@redhat.com>
> > > > > > Cc: Eli Cohen <elic@nvidia.com>; mst <mst@redhat.com>; virtualization <virtualization@lists.linux-foundation.org>; linux-
> > kernel
> > > > > > <linux-kernel@vger.kernel.org>; Si-Wei Liu <si-wei.liu@oracle.com>; Parav Pandit <parav@nvidia.com>
> > > > > > Subject: Re: [PATCH RFC 1/3] vdpa/mlx5: Implement suspend virtqueue callback
> > > > > >
> > > > > > On Mon, Jun 20, 2022 at 5:59 PM Eugenio Perez Martin
> > > > > > <eperezma@redhat.com> wrote:
> > > > > > >
> > > > > > > On Mon, Jun 20, 2022 at 10:56 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Thu, Jun 16, 2022 at 9:27 PM Eli Cohen <elic@nvidia.com> wrote:
> > > > > > > > >
> > > > > > > > > Implement the suspend callback, allowing the virtqueues to be suspended
> > > > > > > > > so they stop processing descriptors. This is required to allow the shadow
> > > > > > > > > virtqueue to kick in.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Eli Cohen <elic@nvidia.com>
> > > > > > > > > ---
> > > > > > > > >  drivers/vdpa/mlx5/net/mlx5_vnet.c  | 68 +++++++++++++++++++++++++++++-
> > > > > > > > >  include/linux/mlx5/mlx5_ifc_vdpa.h |  8 ++++
> > > > > > > > >  2 files changed, 75 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > index fb0b23e71383..ea4bc8a0cd25 100644
> > > > > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > @@ -895,6 +895,7 @@ static int create_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtque
> > > > > > > > >         if (err)
> > > > > > > > >                 goto err_cmd;
> > > > > > > > >
> > > > > > > > > +       mvq->fw_state = MLX5_VIRTIO_NET_Q_OBJECT_STATE_INIT;
> > > > > > > > >         kfree(in);
> > > > > > > > >         mvq->virtq_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id);
> > > > > > > > >
> > > > > > > > > @@ -922,6 +923,7 @@ static void destroy_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtq
> > > > > > > > >                 mlx5_vdpa_warn(&ndev->mvdev, "destroy virtqueue 0x%x\n", mvq->virtq_id);
> > > > > > > > >                 return;
> > > > > > > > >         }
> > > > > > > > > +       mvq->fw_state = MLX5_VIRTIO_NET_Q_OBJECT_NONE;
> > > > > > > > >         umems_destroy(ndev, mvq);
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > @@ -1121,6 +1123,20 @@ static int query_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueu
> > > > > > > > >         return err;
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > +static bool is_valid_state_change(int oldstate, int newstate)
> > > > > > > > > +{
> > > > > > > > > +       switch (oldstate) {
> > > > > > > > > +       case MLX5_VIRTIO_NET_Q_OBJECT_STATE_INIT:
> > > > > > > > > +               return newstate == MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY;
> > > > > > > > > +       case MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY:
> > > > > > > > > +               return newstate == MLX5_VIRTIO_NET_Q_OBJECT_STATE_SUSPEND;
> > > > > > > > > +       case MLX5_VIRTIO_NET_Q_OBJECT_STATE_SUSPEND:
> > > > > > > > > +       case MLX5_VIRTIO_NET_Q_OBJECT_STATE_ERR:
> > > > > > > > > +       default:
> > > > > > > > > +               return false;
> > > > > > > > > +       }
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > >  static int modify_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq, int state)
> > > > > > > > >  {
> > > > > > > > >         int inlen = MLX5_ST_SZ_BYTES(modify_virtio_net_q_in);
> > > > > > > > > @@ -1130,6 +1146,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtque
> > > > > > > > >         void *in;
> > > > > > > > >         int err;
> > > > > > > > >
> > > > > > > > > +       if (mvq->fw_state == MLX5_VIRTIO_NET_Q_OBJECT_NONE)
> > > > > > > > > +               return 0;
> > > > > > > > > +
> > > > > > > > > +       if (!is_valid_state_change(mvq->fw_state, state))
> > > > > > > > > +               return -EINVAL;
> > > > > > > > > +
> > > > > > > > >         in = kzalloc(inlen, GFP_KERNEL);
> > > > > > > > >         if (!in)
> > > > > > > > >                 return -ENOMEM;
> > > > > > > > > @@ -1991,6 +2013,7 @@ static void mlx5_vdpa_set_vq_ready(struct vdpa_device *vdev, u16 idx, bool ready
> > > > > > > > >         struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> > > > > > > > >         struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> > > > > > > > >         struct mlx5_vdpa_virtqueue *mvq;
> > > > > > > > > +       int err;
> > > > > > > > >
> > > > > > > > >         if (!mvdev->actual_features)
> > > > > > > > >                 return;
> > > > > > > > > @@ -2004,8 +2027,16 @@ static void mlx5_vdpa_set_vq_ready(struct vdpa_device *vdev, u16 idx, bool ready
> > > > > > > > >         }
> > > > > > > > >
> > > > > > > > >         mvq = &ndev->vqs[idx];
> > > > > > > > > -       if (!ready)
> > > > > > > > > +       if (!ready) {
> > > > > > > > >                 suspend_vq(ndev, mvq);
> > > > > > > > > +       } else {
> > > > > > > > > +               err = modify_virtqueue(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY);
> > > > > > > > > +               if (err) {
> > > > > > > > > +                       mlx5_vdpa_warn(mvdev, "modify VQ %d to ready failed (%d)\n", idx, err);
> > > > > > > > > +                       ready = false;
> > > > > > > > > +               }
> > > > > > > > > +       }
> > > > > > > > > +
> > > > > > > > >
> > > > > > > > >         mvq->ready = ready;
> > > > > > > > >  }
> > > > > > > > > @@ -2732,6 +2763,39 @@ static int mlx5_vdpa_get_vendor_vq_stats(struct vdpa_device *vdev, u16 idx,
> > > > > > > > >         return err;
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > +static void mlx5_vdpa_cvq_suspend(struct mlx5_vdpa_dev *mvdev, bool suspend)
> > > > > > > > > +{
> > > > > > > > > +       struct mlx5_control_vq *cvq;
> > > > > > > > > +
> > > > > > > > > +       if (!(mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ)))
> > > > > > > > > +               return;
> > > > > > > > > +
> > > > > > > > > +       cvq = &mvdev->cvq;
> > > > > > > > > +       cvq->ready = !suspend;
> > > > > > > > > +}
> > > > > > > >
> > > > > > > > It looks to me we need to synchronize this with reslock, and this
> > > > > > > > probably deserves a dedicated fix.
> > > > > > > >
> > > > > > > > > +
> > > > > > > > > +static int mlx5_vdpa_suspend(struct vdpa_device *vdev, bool suspend)
> > > > > > > > > +{
> > > > > > > > > +       struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> > > > > > > > > +       struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> > > > > > > > > +       struct mlx5_vdpa_virtqueue *mvq;
> > > > > > > > > +       int i;
> > > > > > > > > +
> > > > > > > > > +       if (!suspend) {
> > > > > > > > > +               mlx5_vdpa_warn(mvdev, "Resume of virtqueues is not supported\n");
> > > > > > > > > +               return -EOPNOTSUPP;
> > > > > > > > > +       }
> > > > > > > > > +
> > > > > > > > > +       down_write(&ndev->reslock);
> > > > > > > > > +       for (i = 0; i < ndev->cur_num_vqs; i++) {
> > > > > > > > > +               mvq = &ndev->vqs[i];
> > > > > > > > > +               suspend_vq(ndev, mvq);
> > > > > > > > > +       }
> > > > > > > > > +       mlx5_vdpa_cvq_suspend(mvdev, suspend);
> > > > > > > >
> > > > > > > > Do we need to synchronize with the carrier work here? Otherwise we may
> > > > > > > > get a config notification after suspending.
> > > > > > > >
> > > > > > > > > +       up_write(&ndev->reslock);
> > > > > > > > > +       return 0;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > >  static const struct vdpa_config_ops mlx5_vdpa_ops = {
> > > > > > > > >         .set_vq_address = mlx5_vdpa_set_vq_address,
> > > > > > > > >         .set_vq_num = mlx5_vdpa_set_vq_num,
> > > > > > > > > @@ -2762,6 +2826,7 @@ static const struct vdpa_config_ops mlx5_vdpa_ops = {
> > > > > > > > >         .get_generation = mlx5_vdpa_get_generation,
> > > > > > > > >         .set_map = mlx5_vdpa_set_map,
> > > > > > > > >         .free = mlx5_vdpa_free,
> > > > > > > > > +       .suspend = mlx5_vdpa_suspend,
> > > > > > > >
> > > > > > > > I don't see the vDPA bus patch to enable this method. Or anything I missed here?
> > > > > > > >
> > > > > > >
> > > > > > > Should we add
> > > > > > > Based-on: <20220526124338.36247-1-eperezma@redhat.com>
> > > > > > >
> > > > > > > To this series?
> > > > > >
> > > > > > Probably, but that series seems to support resume while this series doesn't.
> > > > > >
> > > > > > Any reason for this?
> > > > >
> > > > > I think Eugenio agreed that resume is not really required since we're going
> > > > > to stop using this instance and migrate. In any case, we don't support resume
> > > > > for the hardware object, though it could be simulated should it be absolutely necessary.
> > > >
> > > > This is fine if everything is fine during the live migration. But when
> > > > migration fails due to some reason, management (libvirt) may choose to
> > > > restart the device in the source.
> > > >
> > > > This means we should either
> > > >
> > > > 1) support resume in the parent
> > > > 2) emulate it in the qemu (with a lot of restoring of the states)
> > > >
> > >
> > > I think it should be handled in qemu (at least the POC reset the
> > > device), but I didn't exercise a lot of the failure paths there
> > > because, well, it was a POC :).
> >
> > It looks like a must in the production environment. The failure is not
> > necessarily related to shadow virtqueue itself.
> >
>
> I don't see a specific interface to resume the device after suspend.
> Reset however could do the job and we already have it.

Yes, this is fine as long as we can emulate it via set_vq_state + reset.
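To make the emulation idea concrete, here is a minimal toy sketch of the save/reset/restore sequence. All names here (toy_vdpa, toy_vq, avail_idx) are illustrative stand-ins, not the real vdpa bus API; the point is only the ordering: read the queue state, reset the device (losing all state), then restore each queue via set_vq_state before re-enabling it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model of a vDPA device with two virtqueues. */
#define TOY_NUM_VQS 2

struct toy_vq {
	uint16_t avail_idx;	/* last available index seen by the device */
	int ready;
};

struct toy_vdpa {
	struct toy_vq vqs[TOY_NUM_VQS];
	int suspended;
};

/* Suspend: queues stop processing; their indices stay frozen and readable. */
void toy_suspend(struct toy_vdpa *d)
{
	d->suspended = 1;
}

/*
 * "Resume" emulated from userspace: save each vq state (get_vq_state),
 * reset the device so all state is lost, then restore every queue
 * (set_vq_state) and mark it ready again (set_vq_ready).
 */
void toy_resume_via_reset(struct toy_vdpa *d)
{
	uint16_t saved[TOY_NUM_VQS];
	int i;

	for (i = 0; i < TOY_NUM_VQS; i++)
		saved[i] = d->vqs[i].avail_idx;	/* get_vq_state */

	memset(d, 0, sizeof(*d));		/* device reset: everything lost */

	for (i = 0; i < TOY_NUM_VQS; i++) {
		d->vqs[i].avail_idx = saved[i];	/* set_vq_state */
		d->vqs[i].ready = 1;		/* set_vq_ready */
	}
}
```

The ordering matters: set_vq_state must happen after the reset but before the queue is marked ready, since a ready queue's state can no longer be changed.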

Thanks

>
> > Thanks
> >
> > >
> > > > And it is not only used for live migration, it could be used for vmstop/start.
> > > >
> > >
> > > I think it would be easier if we dedicated a feature flag for resuming
> > > the device in the future. Qemu could take advantage of it in some
> > > error paths of live migration, though less than it seems because it
> > > overrides things like ring addresses. And, obviously, in
> > > vmstop/vmstart.
> > >
> > > Actually, net devices should be ok to restore with a full reset. The
> > > problem would be filesystems etc. that are not part of vdpa at the
> > > moment.
> > >
> > > Thanks!
> > >
>
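For reference, the firmware virtqueue state machine enforced by is_valid_state_change() in the patch under discussion can be exercised in isolation. The sketch below mirrors the patch's transition rules; the enum values are illustrative stand-ins for the MLX5_VIRTIO_NET_Q_OBJECT_* constants defined in mlx5_ifc_vdpa.h.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the MLX5_VIRTIO_NET_Q_OBJECT_* states. */
enum vq_state {
	VQ_STATE_INIT,
	VQ_STATE_RDY,
	VQ_STATE_SUSPEND,
	VQ_STATE_ERR,
};

/*
 * Same transition rules as the patch: INIT -> RDY and RDY -> SUSPEND are
 * the only legal changes; SUSPEND and ERR are terminal.
 */
bool is_valid_state_change(enum vq_state oldstate, enum vq_state newstate)
{
	switch (oldstate) {
	case VQ_STATE_INIT:
		return newstate == VQ_STATE_RDY;
	case VQ_STATE_RDY:
		return newstate == VQ_STATE_SUSPEND;
	case VQ_STATE_SUSPEND:
	case VQ_STATE_ERR:
	default:
		return false;
	}
}
```

Note that under these rules a suspended queue can never transition back to RDY, which is exactly why resume has to be emulated through a full device reset rather than performed as a state change on the hardware object.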

