* [PATCH v3] vfio/pci: migration: Skip config space check for Vendor Specific Information in VSC during restore/load
@ 2024-03-22  6:42 Vinayak Kale
From: Vinayak Kale @ 2024-03-22  6:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, mst, marcel.apfelbaum, avihaih, acurrid, cjia,
	zhiw, targupta, kvm, Vinayak Kale

During a migration restore, QEMU compares the PCI device's config space with
the config space captured in the migration stream during save. If the data do
not match, the restore fails.

The comparison is done in get_pci_config_device(). By default it covers the
VSC (Vendor-Specific Capability) in config space.

Because of this check, live migration is broken across NVIDIA vGPU devices when
the source and destination host drivers differ. In that situation, the Vendor
Specific Information in the VSC varies on the destination so that the vGPU
feature capabilities exposed to the guest driver are compatible with the
destination host.

If a vfio-pci device is migration capable and its vfio-pci vendor driver
tolerates volatile Vendor Specific Info in the VSC, QEMU should exempt the
Vendor Specific Info from the config space check. It is the vendor driver's
responsibility to ensure that the VSC is consistent across migration. Here,
consistency can mean that the VSC format is the same on source and destination
even though the actual Vendor Specific Info may not be byte-for-byte identical.

This patch skips the check of the Vendor Specific Information in the VSC for
VFIO-PCI devices by clearing the corresponding pdev->cmask[] offsets. The check
is still enforced for the 3-byte VSC header. QEMU skips the config space check
for any offset whose cmask[] bits are not set.

Signed-off-by: Vinayak Kale <vkale@nvidia.com>
---
Version History
v2->v3:
    - Config space check skipped only for Vendor Specific Info in VSC, check is
      still enforced for 3 byte VSC header.
    - Updated commit description with live migration failure scenario.
v1->v2:
    - Limited scope of change to vfio-pci devices instead of all pci devices.

 hw/vfio/pci.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index d7fe06715c..1026cdba18 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2132,6 +2132,27 @@ static void vfio_check_af_flr(VFIOPCIDevice *vdev, uint8_t pos)
     }
 }
 
+static int vfio_add_vendor_specific_cap(VFIOPCIDevice *vdev, int pos,
+                                        uint8_t size, Error **errp)
+{
+    PCIDevice *pdev = &vdev->pdev;
+
+    pos = pci_add_capability(pdev, PCI_CAP_ID_VNDR, pos, size, errp);
+    if (pos < 0) {
+        return pos;
+    }
+
+    /*
+     * Exempt config space check for Vendor Specific Information during restore/load.
+     * Config space check is still enforced for 3 byte VSC header.
+     */
+    if (size > 3) {
+        memset(pdev->cmask + pos + 3, 0, size - 3);
+    }
+
+    return pos;
+}
+
 static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp)
 {
     PCIDevice *pdev = &vdev->pdev;
@@ -2199,6 +2220,9 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp)
         vfio_check_af_flr(vdev, pos);
         ret = pci_add_capability(pdev, cap_id, pos, size, errp);
         break;
+    case PCI_CAP_ID_VNDR:
+        ret = vfio_add_vendor_specific_cap(vdev, pos, size, errp);
+        break;
     default:
         ret = pci_add_capability(pdev, cap_id, pos, size, errp);
         break;
-- 
2.34.1
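
The cmask[] mechanism described in the commit message can be modeled outside
QEMU. The sketch below is a simplified stand-in, not QEMU code: struct
model_dev, model_exempt_vsc_body() and model_check_config() are hypothetical
names, the config space is fixed at 256 bytes, and the comparison ignores the
writable-bit masks that the real get_pci_config_device() is understood to take
into account as well. It only illustrates why zeroing cmask[] past the 3-byte
header lets the vendor bytes differ while the header is still compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE    256   /* conventional PCI config space size */
#define VSC_HDR_LEN 3     /* capability ID, next pointer, length */

/* Simplified stand-in for the PCIDevice fields involved (assumption). */
struct model_dev {
    uint8_t config[CFG_SIZE];  /* destination's current config space */
    uint8_t cmask[CFG_SIZE];   /* bits compared against the stream on load */
};

/* Mirrors the patch: keep the 3-byte header checked, exempt the rest. */
static void model_exempt_vsc_body(struct model_dev *d, size_t pos, size_t size)
{
    assert(pos + size <= CFG_SIZE);
    if (size > VSC_HDR_LEN) {
        memset(d->cmask + pos + VSC_HDR_LEN, 0, size - VSC_HDR_LEN);
    }
}

/* Returns the first mismatching offset, or -1 if the stream is accepted. */
static int model_check_config(const struct model_dev *d, const uint8_t *stream)
{
    for (size_t i = 0; i < CFG_SIZE; i++) {
        /* Only bits set in cmask[i] take part in the comparison. */
        if ((stream[i] ^ d->config[i]) & d->cmask[i]) {
            return (int)i;
        }
    }
    return -1;
}
```

With every cmask[] byte set to 0xff, a stream that differs at a vendor byte of
the capability is rejected at that offset. After model_exempt_vsc_body() runs
for the capability, the same stream is accepted, while a difference in the
length byte of the header is still reported.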


* Re: [PATCH v3] vfio/pci: migration: Skip config space check for Vendor Specific Information in VSC during restore/load
From: Alex Williamson @ 2024-03-27 17:39 UTC (permalink / raw)
  To: Vinayak Kale
  Cc: qemu-devel, mst, marcel.apfelbaum, avihaih, acurrid, cjia, zhiw,
	targupta, kvm

On Fri, 22 Mar 2024 12:12:10 +0530
Vinayak Kale <vkale@nvidia.com> wrote:

>  hw/vfio/pci.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)


Acked-by: Alex Williamson <alex.williamson@redhat.com>

 
* Re: [PATCH v3] vfio/pci: migration: Skip config space check for Vendor Specific Information in VSC during restore/load
From: Michael S. Tsirkin @ 2024-03-27 20:11 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Vinayak Kale, qemu-devel, marcel.apfelbaum, avihaih, acurrid,
	cjia, zhiw, targupta, kvm

On Wed, Mar 27, 2024 at 11:39:15AM -0600, Alex Williamson wrote:
> On Fri, 22 Mar 2024 12:12:10 +0530
> Vinayak Kale <vkale@nvidia.com> wrote:
> 
> >  hw/vfio/pci.c | 24 ++++++++++++++++++++++++
> >  1 file changed, 24 insertions(+)
> 
> 
> Acked-by: Alex Williamson <alex.williamson@redhat.com>


A very reasonable way to do it.

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

Merge through the VFIO tree I presume?


* Re: [PATCH v3] vfio/pci: migration: Skip config space check for Vendor Specific Information in VSC during restore/load
From: Alex Williamson @ 2024-03-27 20:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Vinayak Kale, qemu-devel, marcel.apfelbaum, avihaih, acurrid,
	cjia, zhiw, targupta, kvm, Cédric Le Goater

On Wed, 27 Mar 2024 16:11:37 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Mar 27, 2024 at 11:39:15AM -0600, Alex Williamson wrote:
> > On Fri, 22 Mar 2024 12:12:10 +0530
> > Vinayak Kale <vkale@nvidia.com> wrote:
> >   
> > >  hw/vfio/pci.c | 24 ++++++++++++++++++++++++
> > >  1 file changed, 24 insertions(+)  
> > 
> > 
> > Acked-by: Alex Williamson <alex.williamson@redhat.com>  
> 
> 
> A very reasonable way to do it.
> 
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> 
> Merge through the VFIO tree I presume?

Yep, Cédric said he'd grab it for 9.1.  Thanks,

Alex
 
* Re: [PATCH v3] vfio/pci: migration: Skip config space check for Vendor Specific Information in VSC during restore/load
From: Cédric Le Goater @ 2024-03-28  9:30 UTC (permalink / raw)
  To: Alex Williamson, Michael S. Tsirkin
  Cc: Vinayak Kale, qemu-devel, marcel.apfelbaum, avihaih, acurrid,
	cjia, zhiw, targupta, kvm

On 3/27/24 21:52, Alex Williamson wrote:
> On Wed, 27 Mar 2024 16:11:37 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
>> On Wed, Mar 27, 2024 at 11:39:15AM -0600, Alex Williamson wrote:
>>> On Fri, 22 Mar 2024 12:12:10 +0530
>>> Vinayak Kale <vkale@nvidia.com> wrote:
>>>    
>>>>   hw/vfio/pci.c | 24 ++++++++++++++++++++++++
>>>>   1 file changed, 24 insertions(+)
>>>
>>>
>>> Acked-by: Alex Williamson <alex.williamson@redhat.com>
>>
>>
>> A very reasonable way to do it.
>>
>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>
>> Merge through the VFIO tree I presume?
> 
> Yep, Cédric said he'd grab it for 9.1.  Thanks,


Applied to vfio-next.

Thanks,

C.



* Re: [PATCH v3] vfio/pci: migration: Skip config space check for Vendor Specific Information in VSC during restore/load
From: Cédric Le Goater @ 2024-04-29 12:40 UTC (permalink / raw)
  To: Alex Williamson, Michael S. Tsirkin
  Cc: Vinayak Kale, qemu-devel, marcel.apfelbaum, avihaih, acurrid,
	cjia, zhiw, targupta, kvm

Hello Vinayak,

On 3/28/24 10:30, Cédric Le Goater wrote:
> On 3/27/24 21:52, Alex Williamson wrote:
>> On Wed, 27 Mar 2024 16:11:37 -0400
>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>
>>> On Wed, Mar 27, 2024 at 11:39:15AM -0600, Alex Williamson wrote:
>>>> On Fri, 22 Mar 2024 12:12:10 +0530
>>>> Vinayak Kale <vkale@nvidia.com> wrote:
>>>>>   hw/vfio/pci.c | 24 ++++++++++++++++++++++++
>>>>>   1 file changed, 24 insertions(+)
>>>>
>>>>
>>>> Acked-by: Alex Williamson <alex.williamson@redhat.com>
>>>
>>>
>>> A very reasonable way to do it.
>>>
>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>>
>>> Merge through the VFIO tree I presume?
>>
>> Yep, Cédric said he'd grab it for 9.1.  Thanks,

Could you please resend an update of this change adding a machine
compatibility property for migration ?

Thanks,

C.
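
A machine compatibility property, as requested above, generally means gating
the new behavior behind a per-device switch whose default depends on the
machine type, so that older machine types keep the strict check. The sketch
below is a standalone illustration under stated assumptions, not the follow-up
patch: the skip_vsc_check field, the model_* names, and the choice of 9.1 as
the first machine version with the exemption enabled (taken from the release
mentioned in this thread) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE    256   /* conventional PCI config space size */
#define VSC_HDR_LEN 3     /* capability ID, next pointer, length */

/* Hypothetical device model; the field name is an assumption. */
struct model_vfio_dev {
    uint8_t cmask[CFG_SIZE];
    bool skip_vsc_check;   /* would be set from a device property */
};

/* Assumed policy: machine types older than 9.1 keep the strict check. */
static bool model_default_skip_vsc_check(int machine_major, int machine_minor)
{
    return machine_major > 9 || (machine_major == 9 && machine_minor >= 1);
}

/* Clears cmask[] for the vendor bytes only when the switch is enabled. */
static void model_add_vsc(struct model_vfio_dev *d, size_t pos, size_t size)
{
    assert(pos + size <= CFG_SIZE);
    if (d->skip_vsc_check && size > VSC_HDR_LEN) {
        memset(d->cmask + pos + VSC_HDR_LEN, 0, size - VSC_HDR_LEN);
    }
}
```

Under this policy a device on a 9.0 machine type keeps every cmask[] byte of
the capability set, whereas on 9.1 and later only the 3 header bytes remain
checked.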


* Re: [PATCH v3] vfio/pci: migration: Skip config space check for Vendor Specific Information in VSC during restore/load
From: Vinayak Kale @ 2024-04-30 10:10 UTC (permalink / raw)
  To: Cédric Le Goater, Alex Williamson, Michael S. Tsirkin
  Cc: qemu-devel, marcel.apfelbaum, avihaih, acurrid, cjia, zhiw,
	targupta, kvm



On 29/04/24 6:10 pm, Cédric Le Goater wrote:
> 
> Hello Vinayak,
> 
> On 3/28/24 10:30, Cédric Le Goater wrote:
>> On 3/27/24 21:52, Alex Williamson wrote:
>>> On Wed, 27 Mar 2024 16:11:37 -0400
>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>
>>>> On Wed, Mar 27, 2024 at 11:39:15AM -0600, Alex Williamson wrote:
>>>>> On Fri, 22 Mar 2024 12:12:10 +0530
>>>>> Vinayak Kale <vkale@nvidia.com> wrote:
>>>>>>   hw/vfio/pci.c | 24 ++++++++++++++++++++++++
>>>>>>   1 file changed, 24 insertions(+)
>>>>>
>>>>>
>>>>> Acked-by: Alex Williamson <alex.williamson@redhat.com>
>>>>
>>>>
>>>> A very reasonable way to do it.
>>>>
>>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>>>
>>>> Merge through the VFIO tree I presume?
>>>
>>> Yep, Cédric said he'd grab it for 9.1.  Thanks,
> 
> Could you please resend an update of this change adding a machine
> compatibility property for migration ?

Sure, I'll address this in V4.
