From: Jason Wang <jasowang@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: mst@redhat.com, virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org, mgurtovoy@nvidia.com,
	cohuck@redhat.com, eperezma@redhat.com, oren@nvidia.com,
	shahafs@nvidia.com, parav@nvidia.com, bodong@nvidia.com,
	amikheev@nvidia.com, pasic@linux.ibm.com
Subject: Re: [PATCH V2 0/2] Vitqueue State Synchronization
Date: Tue, 13 Jul 2021 19:56:02 +0800
Message-ID: <74534bb8-5747-4714-12ed-14d6651ea7c9@redhat.com>
In-Reply-To: <YO1rST8Vz3rrqqL4@stefanha-x1.localdomain>


On 2021/7/13 6:30 PM, Stefan Hajnoczi wrote:
> On Tue, Jul 13, 2021 at 11:08:28AM +0800, Jason Wang wrote:
>> On 2021/7/12 6:12 PM, Stefan Hajnoczi wrote:
>>> On Tue, Jul 06, 2021 at 12:33:32PM +0800, Jason Wang wrote:
>>>> Hi All:
>>>>
>>>> This is an updated version to implement virtqueue state
>>>> synchronization which is a must for the migration support.
>>>>
>>>> The first patch introduces virtqueue states as a new basic facility of
>>>> the virtio device. This is used by the driver to save and restore
>>>> virtqueue state. The states were split into available state and used
>>>> state to ease the transport specific implementation. It is also
>>>> allowed for the device to have its own device specific way to save and
>>>> restore extra virtqueue state, such as in-flight requests.
>>>>
>>>> The second patch introduces a new status bit, STOP. This bit is used for
>>>> the driver to stop the device. The major difference from reset is that
>>>> STOP must preserve all the virtqueue state plus the device state.
>>>>
>>>> A driver can then:
>>>>
>>>> - Get the virtqueue state if STOP status bit is set
>>>> - Set the virtqueue state after FEATURES_OK but before DRIVER_OK
>>>>
>>>> Device specific state synchronization could be built on top.
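[Editorial sketch] The two rules in the list above can be written as predicates on the device status byte. This is only an illustration under stated assumptions: the DRIVER_OK and FEATURES_OK values are the ones in the virtio specification, while SKETCH_S_STOP is a placeholder value chosen for this example and is not taken from the patch. The function names are hypothetical as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status bits as defined in the virtio specification. */
#define VIRTIO_CONFIG_S_DRIVER_OK   0x04
#define VIRTIO_CONFIG_S_FEATURES_OK 0x08

/* Assumption: placeholder value for the proposed STOP bit. */
#define SKETCH_S_STOP               0x20

/* The driver may read virtqueue state only once STOP is set. */
bool can_get_vq_state(uint8_t status)
{
    return (status & SKETCH_S_STOP) != 0;
}

/* The driver may write virtqueue state after FEATURES_OK
 * and before DRIVER_OK. */
bool can_set_vq_state(uint8_t status)
{
    return (status & VIRTIO_CONFIG_S_FEATURES_OK) != 0 &&
           (status & VIRTIO_CONFIG_S_DRIVER_OK) == 0;
}
```

A migration source would check can_get_vq_state() before reading the state, and a destination would check can_set_vq_state() before writing it and then setting DRIVER_OK.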
>>> Will you send a proof-of-concept implementation to demonstrate how it
>>> works in practice?
>>
>> Eugenio has implemented a prototype for this. (Note that the code was for a
>> previous version of the proposal, but it's sufficient to demonstrate how it
>> works.)
>>
>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg809332.html
>>
>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg809335.html
>>
>>
>>> You mentioned being able to migrate virtio-net devices using this
>>> interface, but what about state like VIRTIO_NET_S_LINK_UP that is either
>>> per-device or associated with a non-rx/tx virtqueue?
>>
>> Note that the config space will be maintained by QEMU. So QEMU can choose to
>> emulate link down by simply not setting DRIVER_OK on the device.
>>
>>
>>> Basically I'm not sure if the scope of this is just to migrate state
>>> associated with offloaded virtqueues (vDPA, VFIO/mdev, etc) or if it's
>>> really supposed to migrate the entire device?
>>
>> As the subject says, it's the virtqueue state, not the device state. The
>> series tries to introduce the minimal set of functions that could be used to
>> migrate the network device.
>>
>>
>>
>>> Do you have an approach in mind for saving/loading device-specific
>>> state? Here are devices and their state:
>>> - virtio-blk: a list of requests that the destination device can
>>>     re-submit
>>> - virtio-scsi: a list of requests that the destination device can
>>>     re-submit
>>> - virtio-serial: active ports, including the current buffer being
>>>     transferred
>>
>> Actually, we have two types of additional state:
>>
>> - pending (or in-flight) buffers: we can introduce a transport-specific way
>> to specify an auxiliary page which is used to store the in-flight
>> descriptors (as vhost-user does)
>> - other device state: this needs to be done in a device-specific way, and
>> it would be hard to generalize
>>
>>
>>> - virtio-net: MAC address, status, etc
>>
>> The VMM will intercept all the control commands, which means we don't need
>> to query any state that is changed via those control commands.
>>
>> E.g. QEMU is in charge of shadowing the control virtqueue, so we don't even
>> need an interface to query any of the state that is set via the control
>> virtqueue.
>>
>> But all of that device state is out of the scope of this proposal.
>>
>> I can see that one possible gap is that people may think the migration
>> facility is designed for the simple passthrough that Linux provides, which
>> means the device is assigned 'entirely' to the guest. This is not the case
>> for live migration; some kind of mediation must be done in the middle.
>>
>> And that's the work of the VMM through vDPA + QEMU: intercepting control
>> commands but not the datapath.
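[Editorial sketch] To illustrate why a VMM that shadows the control virtqueue never needs to query the device for such state, here is a minimal C example. It assumes a hypothetical hook, shadow_ctrl_command(), that the VMM calls for every control command it intercepts before forwarding it; the struct and the hook are invented for this example and are not QEMU code. The class and command values are those of the virtio-net control virtqueue in the virtio specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Control virtqueue class/command values from the virtio-net spec. */
#define VIRTIO_NET_CTRL_MAC          1
#define VIRTIO_NET_CTRL_MAC_ADDR_SET 1

/* Hypothetical VMM-side copy of state set via the control virtqueue. */
struct shadow_net_state {
    uint8_t mac[6];
    int mac_valid;
};

/* Hypothetical hook run on each intercepted control command.
 * Returns 0 if the command may be forwarded to the device,
 * -1 if the payload is malformed. */
int shadow_ctrl_command(struct shadow_net_state *s, uint8_t cls, uint8_t cmd,
                        const uint8_t *data, size_t len)
{
    if (cls == VIRTIO_NET_CTRL_MAC && cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET) {
        if (len != sizeof(s->mac))
            return -1;
        /* Remember the MAC so migration can read it from the shadow. */
        memcpy(s->mac, data, sizeof(s->mac));
        s->mac_valid = 1;
    }
    return 0;
}
```

At migration time the VMM would serialize shadow_net_state itself, so only the virtqueue state has to come from the device.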
> I thought this was a more general migration mechanism that passthrough
> devices could use. Thanks for explaining. Maybe this can be made clearer
> in the spec - it's not a full save/load mechanism, it can only be used
> in conjunction with another component that is aware of the device's
> state.


Yes, and actually this should be the suggested way of migrating a virtio
device.

The advantage is obvious: by leveraging the mature virtio/vhost software
stack, we don't need to care much about things like migration
compatibility.


>
> There is a gap between this approach and VFIO's migration interface. It
> appears to be impossible to write a VFIO/mdev or vfio-user device that
> passes a physical virtio-pci device through to the guest with migration
> support.


I think mediation (mdev) is a must for supporting live migration in this
case, even for VFIO. If you simply assign the device to the guest, the
VMM will lose all control over the device.

And more importantly, virtio is not PCI specific, so it can work
where VFIO cannot:

1) A physical device that doesn't use PCI as its transport
2) A guest that doesn't use PCI or doesn't even have PCI

That's the reason for introducing all of those as basic facilities
first. Then we can let each transport implement them in a way that suits
it (admin virtqueue or capabilities).
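[Editorial sketch] One way to picture a basic facility with transport-specific implementations is a small table of hooks that each transport fills in. Everything here is hypothetical and only for illustration: the struct layout, the hook names, and the in-memory stand-in transport are assumptions, not part of the proposal.

```c
#include <stdint.h>

#define SKETCH_MAX_VQS 4

/* Hypothetical layout: one available-side and one used-side value. */
struct vq_state {
    uint16_t avail_state;
    uint16_t used_state;
};

/* Hypothetical hooks; a transport could back them with capabilities,
 * an admin virtqueue, or anything else. */
struct vq_state_ops {
    int (*get_state)(void *transport, unsigned int vq, struct vq_state *out);
    int (*set_state)(void *transport, unsigned int vq,
                     const struct vq_state *in);
};

/* Transport-independent helper: copy the state of n queues. */
int copy_vq_states(const struct vq_state_ops *src_ops, void *src,
                   const struct vq_state_ops *dst_ops, void *dst,
                   unsigned int n)
{
    for (unsigned int i = 0; i < n; i++) {
        struct vq_state st;
        int rc = src_ops->get_state(src, i, &st);
        if (rc != 0)
            return rc;
        rc = dst_ops->set_state(dst, i, &st);
        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Stand-in transport that keeps the state in a plain array. */
struct mem_transport {
    struct vq_state vqs[SKETCH_MAX_VQS];
};

static int mem_get(void *t, unsigned int vq, struct vq_state *out)
{
    struct mem_transport *m = t;
    if (vq >= SKETCH_MAX_VQS)
        return -1;
    *out = m->vqs[vq];
    return 0;
}

static int mem_set(void *t, unsigned int vq, const struct vq_state *in)
{
    struct mem_transport *m = t;
    if (vq >= SKETCH_MAX_VQS)
        return -1;
    m->vqs[vq] = *in;
    return 0;
}

const struct vq_state_ops mem_ops = { mem_get, mem_set };
```

The helper never looks inside the transport, so the same save/restore logic would work whichever mechanism a given transport picks.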


> The reason is because VIRTIO lacks an interface to save/load
> device (not virtqueue) state. I guess it will be added sooner or later,
> it's similar to what Max Gurtovoy recently proposed.


So my understanding is:

1) Each device should define its own state that needs to be migrated

then, we can define

2) How to design the device interface

Admin virtqueue is a solution for 2) but not 1). And an obvious drawback
of the admin virtqueue is that it is not easy to use in a nested
environment, where you still need a per-function interface.

Thanks


>
> Stefan

