All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Wang <jasowang@redhat.com>
To: Parav Pandit <parav@nvidia.com>, "Michael S. Tsirkin" <mst@redhat.com>
Cc: Virtio-Dev <virtio-dev@lists.oasis-open.org>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Cornelia Huck <cohuck@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [virtio-dev] [PATCH v2] virtio: Improve queue_reset polarity to match to default reset state
Date: Thu, 28 Apr 2022 11:40:32 +0800	[thread overview]
Message-ID: <9a932876-1a5e-8166-57fd-f41bf4f153f5@redhat.com> (raw)
In-Reply-To: <PH0PR12MB548121654DFF8442593FF26DDCFA9@PH0PR12MB5481.namprd12.prod.outlook.com>


在 2022/4/28 00:15, Parav Pandit 写道:
>> From: Parav Pandit
>> Sent: Wednesday, April 27, 2022 11:58 AM
>>
>> But this is so basic.
>> It's hard to gaze at this spec for coming years and the code to see, Hey
>> sometimes 0 means disabled, sometime 0 means still enabled, sometime 1
>> means enabled, and sometimes 1 means now disabled...
>> And maintain those weird code in device side and extra state bits burning
>> some expensive chip resource.
>>
>> Is removing from 1.2 is equal delay to get is fixed in 1.3?
>> If yes, I make humble request to fix this and have errata.
>> Some of the professional standard bodies release the spec and short after
>> that errata/ratification follows the release that resolve such small issues.
>> May be time for virtio spec to take this opportunity now and be bit agile on it.
>> Your call. :)
>
> I think further, it appears a real bug that requires so special handling in the device for next years to carry.
>
> Imagine this sequence.
> 1. A virtio device is handed over to guest VM
> 2. A virtio device started queue_reset sequence,
> So register values are:
> queue_enable = 1, q_reset=1
> 2. guest VM poled the q_reset register.
> Queue_enable = 1, q_reset = 0 (because device is doing the resetting the queue)
> 3. HV suspend the VM and queried the VQ state


Hypervisor needs to trap the access to q_reset, otherwise there will be 
a race anyhow. So it knows the device is being reset and only after the 
device rest is finished, it can suspend a VM.


> VQ state returned
> q_enable=1, q_reset = 0.
>
> HV doesn't know what q_reset=0 mean here, is it 0 because it was never reset?
> Or it is 0 because GVM Started the reset, but reset didn't finish?


If the device can change any register value (e.g q_reset), it's not 
stopped. We can't do the migration anyhow. No?


>
> When this virtio device is restored on the other side, HV and device doesn't know how to deal with this.
>
> A WA that all devices will implement is, not returning 0, in step_2, but return say q_reset = 0xa to indicate that its other than 1 and other than 0.
> But hey, the destination side needs to treat this special 0xa and convert to internal q_yet_busy stage.
>
> And this answer Jason and myself why queue_enable shouldn't be overloaded for this busy_wait register.


Well, we've already had busy waiting register that is device_status.

"""

After writing 0 todevice_status, the driver MUST wait for a read 
ofdevice_statusto return 0 before reinitializing the device.

"""

static void vp_reset(struct virtio_device *vdev)
{
         struct virtio_pci_device *vp_dev = to_vp_device(vdev);
         struct virtio_pci_modern_device *mdev = &vp_dev->mdev;

         /* 0 status means a reset. */
         vp_modern_set_status(mdev, 0);
         /* After writing 0 to device_status, the driver MUST wait for a 
read of
          * device_status to return 0 before reinitializing the device.
          * This will flush out the status write, and flush in device 
writes,
          * including MSI-X interrupts, if any.
          */
         while (vp_modern_get_status(mdev))
                 msleep(1);
         /* Flush pending VQ/configuration callbacks. */
         vp_synchronize_vectors(vdev);
}

We don't see any live migration issue with this.

Thanks


> Hence queue_reset register seems the right choice with the fix.
>
> Starting to think of workaround even before the spec release is not elegant.
> Lets please fix this, bug effect is expanding beyond the primary virtio device itself.
>


  parent reply	other threads:[~2022-04-28  3:40 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-27 10:25 [PATCH v2] virtio: Improve queue_reset polarity to match to default reset state Parav Pandit
2022-04-27 11:29 ` [virtio-dev] " Jason Wang
2022-04-27 11:44   ` Xuan Zhuo
2022-04-28  3:46     ` Jason Wang
2022-04-27 14:51   ` Parav Pandit
2022-04-27 15:29     ` Michael S. Tsirkin
2022-04-27 15:39       ` Parav Pandit
2022-04-27 15:43         ` Michael S. Tsirkin
2022-04-27 15:57           ` Parav Pandit
2022-04-27 16:15             ` Parav Pandit
2022-04-27 19:32               ` Michael S. Tsirkin
2022-04-28  1:52                 ` Parav Pandit
2022-04-28  3:40               ` Jason Wang [this message]
2022-04-28  4:00                 ` Parav Pandit
2022-04-28  6:13                   ` Jason Wang
2022-04-28  6:30                     ` Michael S. Tsirkin
2022-04-28  6:56                       ` Jason Wang
2022-04-27 19:28             ` Michael S. Tsirkin
2022-04-27 19:29               ` Parav Pandit
2022-04-28  3:15         ` Jason Wang
2022-04-28  3:24           ` Parav Pandit
2022-04-28  3:43             ` Jason Wang
2022-04-28  4:56               ` Michael S. Tsirkin
2022-04-28  6:10                 ` Jason Wang
2022-04-28  6:26                   ` Michael S. Tsirkin
2022-04-28  8:20                     ` Jason Wang
2022-04-27 12:48 ` [virtio-dev] " Cornelia Huck
2022-04-28  1:09   ` [virtio-dev] " Parav Pandit
2022-04-27 20:39 ` [virtio-dev] " Michael S. Tsirkin
2022-04-28  1:49   ` Parav Pandit
2022-04-28  7:33     ` [virtio-comment] " Cornelia Huck
2022-04-28 19:13       ` Parav Pandit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9a932876-1a5e-8166-57fd-f41bf4f153f5@redhat.com \
    --to=jasowang@redhat.com \
    --cc=cohuck@redhat.com \
    --cc=mst@redhat.com \
    --cc=parav@nvidia.com \
    --cc=stefanha@redhat.com \
    --cc=virtio-dev@lists.oasis-open.org \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.