* [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
@ 2018-02-01 11:20 Peter Xu
  2018-02-01 12:24 ` Marcel Apfelbaum
  2018-02-01 19:51 ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 13+ messages in thread
From: Peter Xu @ 2018-02-01 11:20 UTC (permalink / raw)
  To: qemu-devel
  Cc: peterx, Alex Williamson, Marcel Apfelbaum, Michael S . Tsirkin,
	Dr . David Alan Gilbert, Juan Quintela, Laurent Vivier

In the past, we prioritized IOMMU migration so that we have such a
priority order:

    IOMMU > PCI Devices

When migrating a guest with both vIOMMU and pcie-root-port, we'll always
migrate vIOMMU first, since pcie-root-port is treated as having the same
priority as general PCI devices.

That's problematic.

The thing is that PCI bus number information is stored in the root port,
and that is needed by vIOMMU during post_load(), e.g., to figure out the
context entry for a device.  If we don't have correct bus numbers for
devices, we won't be able to recover the device state of the DMAR memory
regions, and things will be messed up.

So let's boost the PCIe root ports to an even higher priority:

   PCIe Root Port > IOMMU > PCI Devices

A smoke test shows that this patch fixes bug 1538953.

CC: Alex Williamson <alex.williamson@redhat.com>
CC: Marcel Apfelbaum <marcel@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Laurent Vivier <lvivier@redhat.com>
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
Marcel & all,

I think it's possible that we need a similar thing for other bridge-like
devices, but I'm not that familiar with them.  Would you help confirm?  Thanks,
---
 hw/pci-bridge/gen_pcie_root_port.c | 1 +
 include/migration/vmstate.h        | 1 +
 2 files changed, 2 insertions(+)

diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
index 0e2f2e8bf1..e6ff1effd8 100644
--- a/hw/pci-bridge/gen_pcie_root_port.c
+++ b/hw/pci-bridge/gen_pcie_root_port.c
@@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
 
 static const VMStateDescription vmstate_rp_dev = {
     .name = "pcie-root-port",
+    .priority = MIG_PRI_PCIE_ROOT_PORT,
     .version_id = 1,
     .minimum_version_id = 1,
     .post_load = pcie_cap_slot_post_load,
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 8c3889433c..491449db9f 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -148,6 +148,7 @@ enum VMStateFlags {
 typedef enum {
     MIG_PRI_DEFAULT = 0,
     MIG_PRI_IOMMU,              /* Must happen before PCI devices */
+    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
     MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
     MIG_PRI_GICV3,              /* Must happen before the ITS */
     MIG_PRI_MAX,
-- 
2.14.3


* Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
  2018-02-01 11:20 [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority Peter Xu
@ 2018-02-01 12:24 ` Marcel Apfelbaum
  2018-02-01 19:18   ` Michael S. Tsirkin
                     ` (2 more replies)
  2018-02-01 19:51 ` Dr. David Alan Gilbert
  1 sibling, 3 replies; 13+ messages in thread
From: Marcel Apfelbaum @ 2018-02-01 12:24 UTC (permalink / raw)
  To: Peter Xu, qemu-devel, Dr . David Alan Gilbert
  Cc: Alex Williamson, Michael S . Tsirkin, Juan Quintela, Laurent Vivier

Hi Peter,

On 01/02/2018 13:20, Peter Xu wrote:
> In the past, we prioritized IOMMU migration so that we have such a
> priority order:
> 
>     IOMMU > PCI Devices
> 
> When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> migrate vIOMMU first, since pcie-root-port will be seen to have the same
> priority of general PCI devices.
> 
> That's problematic.
> 
> The thing is that PCI bus number information is stored in the root port,
> and that is needed by vIOMMU during post_load(), e.g., to figure out
> context entry for a device.  If we don't have correct bus numbers for
> devices, we won't be able to recover device state of the DMAR memory
> regions, and things will be messed up.
> 
> So let's boost the PCIe root ports to be even with higher priority:
> 
>    PCIe Root Port > IOMMU > PCI Devices
> 
> A smoke test shows that this patch fixes bug 1538953.
> 
> CC: Alex Williamson <alex.williamson@redhat.com>
> CC: Marcel Apfelbaum <marcel@redhat.com>
> CC: Michael S. Tsirkin <mst@redhat.com>
> CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> CC: Laurent Vivier <lvivier@redhat.com>
> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> Marcel & all,
> 
> I think it's possible that we need similar thing for other bridge-like
> devices, but I'm not that familiar.  Would you help confirm?  Thanks,

It's a pity we don't have a way to mark the migration priority
in a base class. Dave, maybe we do have a way?

In the meantime you would need to add it also to:
- ioh3420 (Intel root port)
- xio3130_downstream (Intel switch downstream port)
- xio3130_upstream (The counterpart of the above, you want the whole
  switch to be migrated before loading the IOMMU device state)
- pcie_pci_bridge (for pci devices)
- pci-pci bridge (if for some reason you have one attached to the pcie_pci_bridge)
- i82801b11 (dmi-pci bridge, we want to deprecate it but it is there for now)
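
To illustrate the list above, here is a minimal, self-contained sketch of
what "adding it" to every bridge-like device amounts to. It assumes a
cut-down stand-in for VMStateDescription (only a name and a priority), and
the names in the table are labels taken from the list, not the real VMSD
names in the tree. The helper functions are hypothetical and exist only
for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Same enum as in the patch; a larger value means "load earlier". */
typedef enum {
    MIG_PRI_DEFAULT = 0,
    MIG_PRI_IOMMU,              /* Must happen before PCI devices */
    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
    MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
    MIG_PRI_GICV3,              /* Must happen before the ITS */
    MIG_PRI_MAX,
} MigrationPriority;

/* Hypothetical, cut-down stand-in for QEMU's VMStateDescription. */
typedef struct {
    const char *name;
    MigrationPriority priority;
} VMStateDescription;

/*
 * One entry per bridge-like device from the list above.  The names are
 * illustrative labels only, not the actual .name strings in QEMU.
 */
static const VMStateDescription bridge_vmsds[] = {
    { .name = "ioh3420",            .priority = MIG_PRI_PCIE_ROOT_PORT },
    { .name = "xio3130_downstream", .priority = MIG_PRI_PCIE_ROOT_PORT },
    { .name = "xio3130_upstream",   .priority = MIG_PRI_PCIE_ROOT_PORT },
    { .name = "pcie_pci_bridge",    .priority = MIG_PRI_PCIE_ROOT_PORT },
    { .name = "pci_bridge",         .priority = MIG_PRI_PCIE_ROOT_PORT },
    { .name = "i82801b11",          .priority = MIG_PRI_PCIE_ROOT_PORT },
};

/* Number of entries in the table above. */
static size_t bridge_vmsd_count(void)
{
    return sizeof(bridge_vmsds) / sizeof(bridge_vmsds[0]);
}

/* Bounds-checked access to one table entry. */
static const VMStateDescription *bridge_vmsd_at(size_t i)
{
    assert(i < bridge_vmsd_count());
    return &bridge_vmsds[i];
}

/* True if this VMSD would be loaded before the IOMMU state. */
static int loads_before_iommu(const VMStateDescription *vmsd)
{
    return vmsd->priority > MIG_PRI_IOMMU;
}
```

The point of the sketch is only that each device needs its own .priority
line, since nothing here is inherited from a common base.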

Thanks,
Marcel

>  hw/pci-bridge/gen_pcie_root_port.c | 1 +
>  include/migration/vmstate.h        | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> index 0e2f2e8bf1..e6ff1effd8 100644
> --- a/hw/pci-bridge/gen_pcie_root_port.c
> +++ b/hw/pci-bridge/gen_pcie_root_port.c
> @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
>  
>  static const VMStateDescription vmstate_rp_dev = {
>      .name = "pcie-root-port",
> +    .priority = MIG_PRI_PCIE_ROOT_PORT,
>      .version_id = 1,
>      .minimum_version_id = 1,
>      .post_load = pcie_cap_slot_post_load,
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 8c3889433c..491449db9f 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -148,6 +148,7 @@ enum VMStateFlags {
>  typedef enum {
>      MIG_PRI_DEFAULT = 0,
>      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
> +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
>      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
>      MIG_PRI_GICV3,              /* Must happen before the ITS */
>      MIG_PRI_MAX,
> 


* Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
  2018-02-01 12:24 ` Marcel Apfelbaum
@ 2018-02-01 19:18   ` Michael S. Tsirkin
  2018-02-01 19:48     ` Dr. David Alan Gilbert
  2018-02-01 19:38   ` Dr. David Alan Gilbert
  2018-02-02 10:19   ` Peter Xu
  2 siblings, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2018-02-01 19:18 UTC (permalink / raw)
  To: Marcel Apfelbaum
  Cc: Peter Xu, qemu-devel, Dr . David Alan Gilbert, Alex Williamson,
	Juan Quintela, Laurent Vivier

On Thu, Feb 01, 2018 at 02:24:15PM +0200, Marcel Apfelbaum wrote:
> Hi Peter,
> 
> On 01/02/2018 13:20, Peter Xu wrote:
> > In the past, we prioritized IOMMU migration so that we have such a
> > priority order:
> > 
> >     IOMMU > PCI Devices
> > 
> > When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> > migrate vIOMMU first, since pcie-root-port will be seen to have the same
> > priority of general PCI devices.
> > 
> > That's problematic.
> > 
> > The thing is that PCI bus number information is stored in the root port,
> > and that is needed by vIOMMU during post_load(), e.g., to figure out
> > context entry for a device.  If we don't have correct bus numbers for
> > devices, we won't be able to recover device state of the DMAR memory
> > regions, and things will be messed up.
> > 
> > So let's boost the PCIe root ports to be even with higher priority:
> > 
> >    PCIe Root Port > IOMMU > PCI Devices
> > 
> > A smoke test shows that this patch fixes bug 1538953.
> > 
> > CC: Alex Williamson <alex.williamson@redhat.com>
> > CC: Marcel Apfelbaum <marcel@redhat.com>
> > CC: Michael S. Tsirkin <mst@redhat.com>
> > CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > CC: Juan Quintela <quintela@redhat.com>
> > CC: Laurent Vivier <lvivier@redhat.com>
> > Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> > Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > Marcel & all,
> > 
> > I think it's possible that we need similar thing for other bridge-like
> > devices, but I'm not that familiar.  Would you help confirm?  Thanks,
> 
> Is a pity we don't have a way to mark the migration priority
> in a base class. Dave, maybe we do have a way?
> 
> In the meantime you would need to add it also to:
> - ioh3420 (Intel root port)
> - xio3130_downstream (Intel switch downstream port)
> - xio3130_upstream (The counterpart of the above, you want the whole
>   switch to be migrated before loading the IOMMU device state)
> - pcie_pci_bridge (for pci devices)
> - pci-pci bridge (if for some reason you have one attached to the pcie_pci_brdge)
> - i82801b11 (dmi-pci bridge, we want to deprecate it bu is there for now)
> 
> Thanks,
> Marcel

It's kind of strange that we need to set the priority manually.
Can't migration figure it out itself? I think a bus
must always be migrated before the devices behind it ...

> >  hw/pci-bridge/gen_pcie_root_port.c | 1 +
> >  include/migration/vmstate.h        | 1 +
> >  2 files changed, 2 insertions(+)
> > 
> > diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> > index 0e2f2e8bf1..e6ff1effd8 100644
> > --- a/hw/pci-bridge/gen_pcie_root_port.c
> > +++ b/hw/pci-bridge/gen_pcie_root_port.c
> > @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
> >  
> >  static const VMStateDescription vmstate_rp_dev = {
> >      .name = "pcie-root-port",
> > +    .priority = MIG_PRI_PCIE_ROOT_PORT,
> >      .version_id = 1,
> >      .minimum_version_id = 1,
> >      .post_load = pcie_cap_slot_post_load,
> > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > index 8c3889433c..491449db9f 100644
> > --- a/include/migration/vmstate.h
> > +++ b/include/migration/vmstate.h
> > @@ -148,6 +148,7 @@ enum VMStateFlags {
> >  typedef enum {
> >      MIG_PRI_DEFAULT = 0,
> >      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
> > +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
> >      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
> >      MIG_PRI_GICV3,              /* Must happen before the ITS */
> >      MIG_PRI_MAX,
> > 


* Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
  2018-02-01 12:24 ` Marcel Apfelbaum
  2018-02-01 19:18   ` Michael S. Tsirkin
@ 2018-02-01 19:38   ` Dr. David Alan Gilbert
  2018-02-02 10:19   ` Peter Xu
  2 siblings, 0 replies; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2018-02-01 19:38 UTC (permalink / raw)
  To: Marcel Apfelbaum
  Cc: Peter Xu, qemu-devel, Alex Williamson, Michael S . Tsirkin,
	Juan Quintela, Laurent Vivier

* Marcel Apfelbaum (marcel@redhat.com) wrote:
> Hi Peter,
> 
> On 01/02/2018 13:20, Peter Xu wrote:
> > In the past, we prioritized IOMMU migration so that we have such a
> > priority order:
> > 
> >     IOMMU > PCI Devices
> > 
> > When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> > migrate vIOMMU first, since pcie-root-port will be seen to have the same
> > priority of general PCI devices.
> > 
> > That's problematic.
> > 
> > The thing is that PCI bus number information is stored in the root port,
> > and that is needed by vIOMMU during post_load(), e.g., to figure out
> > context entry for a device.  If we don't have correct bus numbers for
> > devices, we won't be able to recover device state of the DMAR memory
> > regions, and things will be messed up.
> > 
> > So let's boost the PCIe root ports to be even with higher priority:
> > 
> >    PCIe Root Port > IOMMU > PCI Devices
> > 
> > A smoke test shows that this patch fixes bug 1538953.
> > 
> > CC: Alex Williamson <alex.williamson@redhat.com>
> > CC: Marcel Apfelbaum <marcel@redhat.com>
> > CC: Michael S. Tsirkin <mst@redhat.com>
> > CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > CC: Juan Quintela <quintela@redhat.com>
> > CC: Laurent Vivier <lvivier@redhat.com>
> > Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> > Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > Marcel & all,
> > 
> > I think it's possible that we need similar thing for other bridge-like
> > devices, but I'm not that familiar.  Would you help confirm?  Thanks,
> 
> Is a pity we don't have a way to mark the migration priority
> in a base class. Dave, maybe we do have a way?

Not that I'm aware of; the 'priority' field is associated with the VMSD
and it's not really connected to the class hierarchy at all.

Dave

> In the meantime you would need to add it also to:
> - ioh3420 (Intel root port)
> - xio3130_downstream (Intel switch downstream port)
> - xio3130_upstream (The counterpart of the above, you want the whole
>   switch to be migrated before loading the IOMMU device state)
> - pcie_pci_bridge (for pci devices)
> - pci-pci bridge (if for some reason you have one attached to the pcie_pci_brdge)
> - i82801b11 (dmi-pci bridge, we want to deprecate it bu is there for now)
> 
> Thanks,
> Marcel
> 
> >  hw/pci-bridge/gen_pcie_root_port.c | 1 +
> >  include/migration/vmstate.h        | 1 +
> >  2 files changed, 2 insertions(+)
> > 
> > diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> > index 0e2f2e8bf1..e6ff1effd8 100644
> > --- a/hw/pci-bridge/gen_pcie_root_port.c
> > +++ b/hw/pci-bridge/gen_pcie_root_port.c
> > @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
> >  
> >  static const VMStateDescription vmstate_rp_dev = {
> >      .name = "pcie-root-port",
> > +    .priority = MIG_PRI_PCIE_ROOT_PORT,
> >      .version_id = 1,
> >      .minimum_version_id = 1,
> >      .post_load = pcie_cap_slot_post_load,
> > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > index 8c3889433c..491449db9f 100644
> > --- a/include/migration/vmstate.h
> > +++ b/include/migration/vmstate.h
> > @@ -148,6 +148,7 @@ enum VMStateFlags {
> >  typedef enum {
> >      MIG_PRI_DEFAULT = 0,
> >      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
> > +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
> >      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
> >      MIG_PRI_GICV3,              /* Must happen before the ITS */
> >      MIG_PRI_MAX,
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
  2018-02-01 19:18   ` Michael S. Tsirkin
@ 2018-02-01 19:48     ` Dr. David Alan Gilbert
  2018-02-01 20:01       ` Marcel Apfelbaum
  0 siblings, 1 reply; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2018-02-01 19:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Marcel Apfelbaum, Peter Xu, qemu-devel, Alex Williamson,
	Juan Quintela, Laurent Vivier

* Michael S. Tsirkin (mst@redhat.com) wrote:
> On Thu, Feb 01, 2018 at 02:24:15PM +0200, Marcel Apfelbaum wrote:
> > Hi Peter,
> > 
> > On 01/02/2018 13:20, Peter Xu wrote:
> > > In the past, we prioritized IOMMU migration so that we have such a
> > > priority order:
> > > 
> > >     IOMMU > PCI Devices
> > > 
> > > When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> > > migrate vIOMMU first, since pcie-root-port will be seen to have the same
> > > priority of general PCI devices.
> > > 
> > > That's problematic.
> > > 
> > > The thing is that PCI bus number information is stored in the root port,
> > > and that is needed by vIOMMU during post_load(), e.g., to figure out
> > > context entry for a device.  If we don't have correct bus numbers for
> > > devices, we won't be able to recover device state of the DMAR memory
> > > regions, and things will be messed up.
> > > 
> > > So let's boost the PCIe root ports to be even with higher priority:
> > > 
> > >    PCIe Root Port > IOMMU > PCI Devices
> > > 
> > > A smoke test shows that this patch fixes bug 1538953.
> > > 
> > > CC: Alex Williamson <alex.williamson@redhat.com>
> > > CC: Marcel Apfelbaum <marcel@redhat.com>
> > > CC: Michael S. Tsirkin <mst@redhat.com>
> > > CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > CC: Juan Quintela <quintela@redhat.com>
> > > CC: Laurent Vivier <lvivier@redhat.com>
> > > Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> > > Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > ---
> > > Marcel & all,
> > > 
> > > I think it's possible that we need similar thing for other bridge-like
> > > devices, but I'm not that familiar.  Would you help confirm?  Thanks,
> > 
> > Is a pity we don't have a way to mark the migration priority
> > in a base class. Dave, maybe we do have a way?
> > 
> > In the meantime you would need to add it also to:
> > - ioh3420 (Intel root port)
> > - xio3130_downstream (Intel switch downstream port)
> > - xio3130_upstream (The counterpart of the above, you want the whole
> >   switch to be migrated before loading the IOMMU device state)
> > - pcie_pci_bridge (for pci devices)
> > - pci-pci bridge (if for some reason you have one attached to the pcie_pci_brdge)
> > - i82801b11 (dmi-pci bridge, we want to deprecate it bu is there for now)
> > 
> > Thanks,
> > Marcel
> 
> It's kind of strange that we need to set the priority manually.
> Can't migration figure it out itself?

> I think bus
> must always be migrated before the devices behind it ...

I think that's true; but:
  a) Is the iommu a child of any of the PCI busses?
  b) does anything ensure that the bridge that's a parent of a bus
     gets migrated before the bus it provides?
  c) What happens with more than one root port?

Dave


> > >  hw/pci-bridge/gen_pcie_root_port.c | 1 +
> > >  include/migration/vmstate.h        | 1 +
> > >  2 files changed, 2 insertions(+)
> > > 
> > > diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> > > index 0e2f2e8bf1..e6ff1effd8 100644
> > > --- a/hw/pci-bridge/gen_pcie_root_port.c
> > > +++ b/hw/pci-bridge/gen_pcie_root_port.c
> > > @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
> > >  
> > >  static const VMStateDescription vmstate_rp_dev = {
> > >      .name = "pcie-root-port",
> > > +    .priority = MIG_PRI_PCIE_ROOT_PORT,
> > >      .version_id = 1,
> > >      .minimum_version_id = 1,
> > >      .post_load = pcie_cap_slot_post_load,
> > > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > > index 8c3889433c..491449db9f 100644
> > > --- a/include/migration/vmstate.h
> > > +++ b/include/migration/vmstate.h
> > > @@ -148,6 +148,7 @@ enum VMStateFlags {
> > >  typedef enum {
> > >      MIG_PRI_DEFAULT = 0,
> > >      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
> > > +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
> > >      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
> > >      MIG_PRI_GICV3,              /* Must happen before the ITS */
> > >      MIG_PRI_MAX,
> > > 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
  2018-02-01 11:20 [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority Peter Xu
  2018-02-01 12:24 ` Marcel Apfelbaum
@ 2018-02-01 19:51 ` Dr. David Alan Gilbert
  2018-02-02  9:56   ` Peter Xu
  1 sibling, 1 reply; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2018-02-01 19:51 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Alex Williamson, Marcel Apfelbaum,
	Michael S . Tsirkin, Juan Quintela, Laurent Vivier

* Peter Xu (peterx@redhat.com) wrote:
> In the past, we prioritized IOMMU migration so that we have such a
> priority order:
> 
>     IOMMU > PCI Devices
> 
> When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> migrate vIOMMU first, since pcie-root-port will be seen to have the same
> priority of general PCI devices.
> 
> That's problematic.
> 
> The thing is that PCI bus number information is stored in the root port,
> and that is needed by vIOMMU during post_load(), e.g., to figure out
> context entry for a device.  If we don't have correct bus numbers for
> devices, we won't be able to recover device state of the DMAR memory
> regions, and things will be messed up.
> 
> So let's boost the PCIe root ports to be even with higher priority:
> 
>    PCIe Root Port > IOMMU > PCI Devices
> 
> A smoke test shows that this patch fixes bug 1538953.

Three questions (partially overlapping with my reply to Michael):
  a) What happens with multiple IOMMUs?
  b) What happens with multiple root ports?
  c) How correct is this ordering on different implementations 
    (e.g. ARM/Power/etc)

Dave

> 
> CC: Alex Williamson <alex.williamson@redhat.com>
> CC: Marcel Apfelbaum <marcel@redhat.com>
> CC: Michael S. Tsirkin <mst@redhat.com>
> CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> CC: Laurent Vivier <lvivier@redhat.com>
> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> Marcel & all,
> 
> I think it's possible that we need similar thing for other bridge-like
> devices, but I'm not that familiar.  Would you help confirm?  Thanks,
> ---
>  hw/pci-bridge/gen_pcie_root_port.c | 1 +
>  include/migration/vmstate.h        | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> index 0e2f2e8bf1..e6ff1effd8 100644
> --- a/hw/pci-bridge/gen_pcie_root_port.c
> +++ b/hw/pci-bridge/gen_pcie_root_port.c
> @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
>  
>  static const VMStateDescription vmstate_rp_dev = {
>      .name = "pcie-root-port",
> +    .priority = MIG_PRI_PCIE_ROOT_PORT,
>      .version_id = 1,
>      .minimum_version_id = 1,
>      .post_load = pcie_cap_slot_post_load,
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 8c3889433c..491449db9f 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -148,6 +148,7 @@ enum VMStateFlags {
>  typedef enum {
>      MIG_PRI_DEFAULT = 0,
>      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
> +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
>      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
>      MIG_PRI_GICV3,              /* Must happen before the ITS */
>      MIG_PRI_MAX,
> -- 
> 2.14.3
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
  2018-02-01 19:48     ` Dr. David Alan Gilbert
@ 2018-02-01 20:01       ` Marcel Apfelbaum
  2018-02-01 20:10         ` Dr. David Alan Gilbert
  2018-02-02 10:04         ` Peter Xu
  0 siblings, 2 replies; 13+ messages in thread
From: Marcel Apfelbaum @ 2018-02-01 20:01 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Michael S. Tsirkin
  Cc: Peter Xu, qemu-devel, Alex Williamson, Juan Quintela, Laurent Vivier

On 01/02/2018 21:48, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (mst@redhat.com) wrote:
>> On Thu, Feb 01, 2018 at 02:24:15PM +0200, Marcel Apfelbaum wrote:
>>> Hi Peter,
>>>
>>> On 01/02/2018 13:20, Peter Xu wrote:
>>>> In the past, we prioritized IOMMU migration so that we have such a
>>>> priority order:
>>>>
>>>>     IOMMU > PCI Devices
>>>>
>>>> When migrating a guest with both vIOMMU and pcie-root-port, we'll always
>>>> migrate vIOMMU first, since pcie-root-port will be seen to have the same
>>>> priority of general PCI devices.
>>>>
>>>> That's problematic.
>>>>
>>>> The thing is that PCI bus number information is stored in the root port,
>>>> and that is needed by vIOMMU during post_load(), e.g., to figure out
>>>> context entry for a device.  If we don't have correct bus numbers for
>>>> devices, we won't be able to recover device state of the DMAR memory
>>>> regions, and things will be messed up.
>>>>
>>>> So let's boost the PCIe root ports to be even with higher priority:
>>>>
>>>>    PCIe Root Port > IOMMU > PCI Devices
>>>>
>>>> A smoke test shows that this patch fixes bug 1538953.
>>>>
>>>> CC: Alex Williamson <alex.williamson@redhat.com>
>>>> CC: Marcel Apfelbaum <marcel@redhat.com>
>>>> CC: Michael S. Tsirkin <mst@redhat.com>
>>>> CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>> CC: Juan Quintela <quintela@redhat.com>
>>>> CC: Laurent Vivier <lvivier@redhat.com>
>>>> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
>>>> Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>>> ---
>>>> Marcel & all,
>>>>
>>>> I think it's possible that we need similar thing for other bridge-like
>>>> devices, but I'm not that familiar.  Would you help confirm?  Thanks,
>>>
>>> Is a pity we don't have a way to mark the migration priority
>>> in a base class. Dave, maybe we do have a way?
>>>
>>> In the meantime you would need to add it also to:
>>> - ioh3420 (Intel root port)
>>> - xio3130_downstream (Intel switch downstream port)
>>> - xio3130_upstream (The counterpart of the above, you want the whole
>>>   switch to be migrated before loading the IOMMU device state)
>>> - pcie_pci_bridge (for pci devices)
>>> - pci-pci bridge (if for some reason you have one attached to the pcie_pci_brdge)
>>> - i82801b11 (dmi-pci bridge, we want to deprecate it bu is there for now)
>>>
>>> Thanks,
>>> Marcel
>>
>> It's kind of strange that we need to set the priority manually.
>> Can't migration figure it out itself?
> 
>> I think bus
>> must always be migrated before the devices behind it ...
> 
> I think that's true; but:
>   a) Is the iommu a child of any of the PCI busses?

No, it is a sysbus device.

>   b) does anything ensure that the bridge that's a parent of a bus
>      gets migrated before the bus it provides?

I think this was Michael's question :)

>   c) What happens with more htan one root port?
> 

Root ports can't be nested. Anyway, I suppose the migration should
follow the bus numbering order.

The question now is what happens if the migration happens before
the guest firmware finishes assigning numbers to buses...

Still, QEMU has enough information to decide the right ordering;
the question is whether the current migration mechanism has some ordering
of its own, or whether the only one is the "VMStateFlags".

Thanks,
Marcel

> Dave
> 
> 
>>>>  hw/pci-bridge/gen_pcie_root_port.c | 1 +
>>>>  include/migration/vmstate.h        | 1 +
>>>>  2 files changed, 2 insertions(+)
>>>>
>>>> diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
>>>> index 0e2f2e8bf1..e6ff1effd8 100644
>>>> --- a/hw/pci-bridge/gen_pcie_root_port.c
>>>> +++ b/hw/pci-bridge/gen_pcie_root_port.c
>>>> @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
>>>>  
>>>>  static const VMStateDescription vmstate_rp_dev = {
>>>>      .name = "pcie-root-port",
>>>> +    .priority = MIG_PRI_PCIE_ROOT_PORT,
>>>>      .version_id = 1,
>>>>      .minimum_version_id = 1,
>>>>      .post_load = pcie_cap_slot_post_load,
>>>> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
>>>> index 8c3889433c..491449db9f 100644
>>>> --- a/include/migration/vmstate.h
>>>> +++ b/include/migration/vmstate.h
>>>> @@ -148,6 +148,7 @@ enum VMStateFlags {
>>>>  typedef enum {
>>>>      MIG_PRI_DEFAULT = 0,
>>>>      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
>>>> +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
>>>>      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
>>>>      MIG_PRI_GICV3,              /* Must happen before the ITS */
>>>>      MIG_PRI_MAX,
>>>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 


* Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
  2018-02-01 20:01       ` Marcel Apfelbaum
@ 2018-02-01 20:10         ` Dr. David Alan Gilbert
  2018-02-02 10:04         ` Peter Xu
  1 sibling, 0 replies; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2018-02-01 20:10 UTC (permalink / raw)
  To: Marcel Apfelbaum
  Cc: Michael S. Tsirkin, Peter Xu, qemu-devel, Alex Williamson,
	Juan Quintela, Laurent Vivier

* Marcel Apfelbaum (marcel@redhat.com) wrote:
> On 01/02/2018 21:48, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (mst@redhat.com) wrote:
> >> On Thu, Feb 01, 2018 at 02:24:15PM +0200, Marcel Apfelbaum wrote:
> >>> Hi Peter,
> >>>
> >>> On 01/02/2018 13:20, Peter Xu wrote:
> >>>> In the past, we prioritized IOMMU migration so that we have such a
> >>>> priority order:
> >>>>
> >>>>     IOMMU > PCI Devices
> >>>>
> >>>> When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> >>>> migrate vIOMMU first, since pcie-root-port will be seen to have the same
> >>>> priority of general PCI devices.
> >>>>
> >>>> That's problematic.
> >>>>
> >>>> The thing is that PCI bus number information is stored in the root port,
> >>>> and that is needed by vIOMMU during post_load(), e.g., to figure out
> >>>> context entry for a device.  If we don't have correct bus numbers for
> >>>> devices, we won't be able to recover device state of the DMAR memory
> >>>> regions, and things will be messed up.
> >>>>
> >>>> So let's boost the PCIe root ports to be even with higher priority:
> >>>>
> >>>>    PCIe Root Port > IOMMU > PCI Devices
> >>>>
> >>>> A smoke test shows that this patch fixes bug 1538953.
> >>>>
> >>>> CC: Alex Williamson <alex.williamson@redhat.com>
> >>>> CC: Marcel Apfelbaum <marcel@redhat.com>
> >>>> CC: Michael S. Tsirkin <mst@redhat.com>
> >>>> CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >>>> CC: Juan Quintela <quintela@redhat.com>
> >>>> CC: Laurent Vivier <lvivier@redhat.com>
> >>>> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> >>>> Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>>> Signed-off-by: Peter Xu <peterx@redhat.com>
> >>>> ---
> >>>> Marcel & all,
> >>>>
> >>>> I think it's possible that we need similar thing for other bridge-like
> >>>> devices, but I'm not that familiar.  Would you help confirm?  Thanks,
> >>>
> >>> Is a pity we don't have a way to mark the migration priority
> >>> in a base class. Dave, maybe we do have a way?
> >>>
> >>> In the meantime you would need to add it also to:
> >>> - ioh3420 (Intel root port)
> >>> - xio3130_downstream (Intel switch downstream port)
> >>> - xio3130_upstream (The counterpart of the above, you want the whole
> >>>   switch to be migrated before loading the IOMMU device state)
> >>> - pcie_pci_bridge (for pci devices)
> >>> - pci-pci bridge (if for some reason you have one attached to the pcie_pci_bridge)
> >>> - i82801b11 (dmi-pci bridge, we want to deprecate it but it is there for now)
> >>>
> >>> Thanks,
> >>> Marcel
> >>
> >> It's kind of strange that we need to set the priority manually.
> >> Can't migration figure it out itself?
> > 
> >> I think bus
> >> must always be migrated before the devices behind it ...
> > 
> > I think that's true; but:
> >   a) Is the iommu a child of any of the PCI busses?
> 
> No, it is a sysbus device.

OK, so even if we were arguing about whether PCI busses got migrated in
order, that doesn't help Peter's case, because there's nothing that
orders the iommu relative to the PCI.

> >   b) does anything ensure that the bridge that's a parent of a bus
> >      gets migrated before the bus it provides?
> 
> I think this was Michael's question :)
> 
> >   c) What happens with more than one root port?
> > 
> 
> Root ports can't be nested, anyway, I suppose the migration should
> follow the bus numbering order.
> 
> The question now is what happens if the migration is happening before
> the guest firmware finishes assigning numbers to buses...
> 
> Still, QEMU has enough information to decide the right ordering,
> the question is if the current migration mechanism has some ordering
> or the only one is the "VMStateFlags".

It's ordered on two things:
  a) The priority field that Peter is using
  b) The order of registration of the device.

(b) is of course dangerously unstable.
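
A minimal sketch of what ordering on (a) and then (b) amounts to, assuming
the handlers are kept in a list sorted by priority with ties left in
registration order.  All names here (SketchEntry, sketch_register, the
PRI_* values) are hypothetical; this is not the actual QEMU code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical priority levels; a larger value is migrated earlier. */
typedef enum {
    PRI_DEFAULT = 0,
    PRI_IOMMU,
    PRI_PCIE_ROOT_PORT,
} SketchPriority;

/* Hypothetical list node standing in for one registered device state. */
typedef struct SketchEntry {
    const char *name;
    SketchPriority priority;
    struct SketchEntry *next;
} SketchEntry;

/*
 * Insert e so that higher priorities come first.  Entries with equal
 * priority stay in the order they were registered, which is the
 * "order of registration" part of the ordering.
 */
static void sketch_register(SketchEntry **head, SketchEntry *e)
{
    SketchEntry **p = head;

    assert(head != NULL && e != NULL);
    /* Skip everything that must be migrated before (or was registered before) e. */
    while (*p && (*p)->priority >= e->priority) {
        p = &(*p)->next;
    }
    e->next = *p;
    *p = e;
}
```

Registering a default-priority device, then an IOMMU, then a root port,
then another default-priority device would give: root port, IOMMU, first
device, second device.  Only the relative order of the last two depends
on registration order, which is the unstable part.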

Dave

> Thanks,
> Marcel
> 
> > Dave
> > 
> > 
> >>>>  hw/pci-bridge/gen_pcie_root_port.c | 1 +
> >>>>  include/migration/vmstate.h        | 1 +
> >>>>  2 files changed, 2 insertions(+)
> >>>>
> >>>> diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> >>>> index 0e2f2e8bf1..e6ff1effd8 100644
> >>>> --- a/hw/pci-bridge/gen_pcie_root_port.c
> >>>> +++ b/hw/pci-bridge/gen_pcie_root_port.c
> >>>> @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
> >>>>  
> >>>>  static const VMStateDescription vmstate_rp_dev = {
> >>>>      .name = "pcie-root-port",
> >>>> +    .priority = MIG_PRI_PCIE_ROOT_PORT,
> >>>>      .version_id = 1,
> >>>>      .minimum_version_id = 1,
> >>>>      .post_load = pcie_cap_slot_post_load,
> >>>> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> >>>> index 8c3889433c..491449db9f 100644
> >>>> --- a/include/migration/vmstate.h
> >>>> +++ b/include/migration/vmstate.h
> >>>> @@ -148,6 +148,7 @@ enum VMStateFlags {
> >>>>  typedef enum {
> >>>>      MIG_PRI_DEFAULT = 0,
> >>>>      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
> >>>> +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
> >>>>      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
> >>>>      MIG_PRI_GICV3,              /* Must happen before the ITS */
> >>>>      MIG_PRI_MAX,
> >>>>
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
  2018-02-01 19:51 ` Dr. David Alan Gilbert
@ 2018-02-02  9:56   ` Peter Xu
  2018-02-02 12:39     ` Marcel Apfelbaum
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Xu @ 2018-02-02  9:56 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, Alex Williamson, Marcel Apfelbaum,
	Michael S . Tsirkin, Juan Quintela, Laurent Vivier

On Thu, Feb 01, 2018 at 07:51:31PM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > In the past, we prioritized IOMMU migration so that we have such a
> > priority order:
> > 
> >     IOMMU > PCI Devices
> > 
> > When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> > migrate vIOMMU first, since pcie-root-port will be seen to have the same
> > priority of general PCI devices.
> > 
> > That's problematic.
> > 
> > The thing is that PCI bus number information is stored in the root port,
> > and that is needed by vIOMMU during post_load(), e.g., to figure out
> > context entry for a device.  If we don't have correct bus numbers for
> > devices, we won't be able to recover device state of the DMAR memory
> > regions, and things will be messed up.
> > 
> > So let's boost the PCIe root ports to be even with higher priority:
> > 
> >    PCIe Root Port > IOMMU > PCI Devices
> > 
> > A smoke test shows that this patch fixes bug 1538953.
> 
> Two questions (partially overlapping with what I replied to Michaels):
>   a) What happens with multiple IOMMUs?

If there are more IOMMUs, then the patch will let all the vIOMMUs be
migrated after pcie root ports.

But a more honest answer is: I don't really know. :)

I don't even know how multiple vIOMMUs would cooperate with each
other, especially when nested.  In the nested case there may be
dependencies between vIOMMUs, but I'll avoid thinking about that until
we support more than one vIOMMU.

>   b) What happens with multiple root ports?

Same answer as previous one: all of them will be migrated before any
vIOMMUs.

Note that IMHO we don't care which pcie root port is migrated first -
IMHO they should not depend on each other, but Marcel may correct me.

>   c) How correct is this ordering on different implementations 
>     (e.g. ARM/Power/etc)

Currently it has no effect there, since the Intel IOMMU is the only user
of MIG_PRI_IOMMU.  Once SMMU is merged it may be affected (if it uses
this priority), but IMHO that's fine too as long as PCIe root ports
don't depend on anything related to SMMU.
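
To make the resulting order concrete, here is a minimal sketch.  It copies
the enum as it looks with this patch applied and assumes that a larger
enum value means "migrated earlier", which is what the comments in the
enum imply.  The typedef name SketchMigPriority and the helper
sketch_migrates_before() are made up for illustration and are not QEMU
API.

```c
#include <assert.h>
#include <stdbool.h>

/* The priority enum as it looks with the patch applied. */
typedef enum {
    MIG_PRI_DEFAULT = 0,
    MIG_PRI_IOMMU,              /* Must happen before PCI devices */
    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
    MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
    MIG_PRI_GICV3,              /* Must happen before the ITS */
    MIG_PRI_MAX,
} SketchMigPriority;            /* hypothetical typedef name */

/*
 * Hypothetical helper: true if a device with priority a is migrated
 * strictly before one with priority b, assuming larger values go first.
 * Equal priorities are not ordered by this rule.
 */
static bool sketch_migrates_before(SketchMigPriority a, SketchMigPriority b)
{
    assert(a < MIG_PRI_MAX && b < MIG_PRI_MAX);
    return a > b;
}
```

Under that assumption the root port lands above MIG_PRI_IOMMU but below
the two GICv3 entries, so the GICv3 and ITS state would still go before
the root ports.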

Thanks,

-- 
Peter Xu

* Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
  2018-02-01 20:01       ` Marcel Apfelbaum
  2018-02-01 20:10         ` Dr. David Alan Gilbert
@ 2018-02-02 10:04         ` Peter Xu
  2018-02-02 13:25           ` Marcel Apfelbaum
  1 sibling, 1 reply; 13+ messages in thread
From: Peter Xu @ 2018-02-02 10:04 UTC (permalink / raw)
  To: Marcel Apfelbaum
  Cc: Dr. David Alan Gilbert, Michael S. Tsirkin, qemu-devel,
	Alex Williamson, Juan Quintela, Laurent Vivier

On Thu, Feb 01, 2018 at 10:01:31PM +0200, Marcel Apfelbaum wrote:

[...]

> Root ports can't be nested, anyway, I suppose the migration should
> follow the bus numbering order.

Could I ask whether this is a must?  And if yes, why?

> 
> The question now is what happens if the migration is happening before
> the guest firmware finishes assigning numbers to buses...

Do you mean that vIOMMU may fetch wrong context entries too?

Note that as long as vIOMMU DMAR is off globally, vIOMMU will not
fetch context entries at all.  So IMHO this problem should not happen
during the firmware execution time (assuming that the firmware should
not enable vIOMMU at all).

Thanks,

-- 
Peter Xu

* Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
  2018-02-01 12:24 ` Marcel Apfelbaum
  2018-02-01 19:18   ` Michael S. Tsirkin
  2018-02-01 19:38   ` Dr. David Alan Gilbert
@ 2018-02-02 10:19   ` Peter Xu
  2 siblings, 0 replies; 13+ messages in thread
From: Peter Xu @ 2018-02-02 10:19 UTC (permalink / raw)
  To: Marcel Apfelbaum
  Cc: qemu-devel, Dr . David Alan Gilbert, Alex Williamson,
	Michael S . Tsirkin, Juan Quintela, Laurent Vivier

On Thu, Feb 01, 2018 at 02:24:15PM +0200, Marcel Apfelbaum wrote:
> Hi Peter,
> 
> On 01/02/2018 13:20, Peter Xu wrote:
> > In the past, we prioritized IOMMU migration so that we have such a
> > priority order:
> > 
> >     IOMMU > PCI Devices
> > 
> > When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> > migrate vIOMMU first, since pcie-root-port will be seen to have the same
> > priority of general PCI devices.
> > 
> > That's problematic.
> > 
> > The thing is that PCI bus number information is stored in the root port,
> > and that is needed by vIOMMU during post_load(), e.g., to figure out
> > context entry for a device.  If we don't have correct bus numbers for
> > devices, we won't be able to recover device state of the DMAR memory
> > regions, and things will be messed up.
> > 
> > So let's boost the PCIe root ports to be even with higher priority:
> > 
> >    PCIe Root Port > IOMMU > PCI Devices
> > 
> > A smoke test shows that this patch fixes bug 1538953.
> > 
> > CC: Alex Williamson <alex.williamson@redhat.com>
> > CC: Marcel Apfelbaum <marcel@redhat.com>
> > CC: Michael S. Tsirkin <mst@redhat.com>
> > CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > CC: Juan Quintela <quintela@redhat.com>
> > CC: Laurent Vivier <lvivier@redhat.com>
> > Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> > Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > Marcel & all,
> > 
> > I think it's possible that we need similar thing for other bridge-like
> > devices, but I'm not that familiar.  Would you help confirm?  Thanks,
> 
> It's a pity we don't have a way to mark the migration priority
> in a base class. Dave, maybe we do have a way?
> 
> In the meantime you would need to add it also to:
> - ioh3420 (Intel root port)
> - xio3130_downstream (Intel switch downstream port)
> - xio3130_upstream (The counterpart of the above, you want the whole
>   switch to be migrated before loading the IOMMU device state)
> - pcie_pci_bridge (for pci devices)
> - pci-pci bridge (if for some reason you have one attached to the pcie_pci_bridge)
> - i82801b11 (dmi-pci bridge, we want to deprecate it but it is there for now)

I'll see whether there is any better way to do this instead of
duplicating, but I'm not really sure about it.

Anyway, this list is helpful.  Thanks Marcel.

-- 
Peter Xu

* Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
  2018-02-02  9:56   ` Peter Xu
@ 2018-02-02 12:39     ` Marcel Apfelbaum
  0 siblings, 0 replies; 13+ messages in thread
From: Marcel Apfelbaum @ 2018-02-02 12:39 UTC (permalink / raw)
  To: Peter Xu, Dr. David Alan Gilbert
  Cc: qemu-devel, Alex Williamson, Michael S . Tsirkin, Juan Quintela,
	Laurent Vivier

On 02/02/2018 11:56, Peter Xu wrote:
> On Thu, Feb 01, 2018 at 07:51:31PM +0000, Dr. David Alan Gilbert wrote:
>> * Peter Xu (peterx@redhat.com) wrote:
>>> In the past, we prioritized IOMMU migration so that we have such a
>>> priority order:
>>>
>>>     IOMMU > PCI Devices
>>>
>>> When migrating a guest with both vIOMMU and pcie-root-port, we'll always
>>> migrate vIOMMU first, since pcie-root-port will be seen to have the same
>>> priority of general PCI devices.
>>>
>>> That's problematic.
>>>
>>> The thing is that PCI bus number information is stored in the root port,
>>> and that is needed by vIOMMU during post_load(), e.g., to figure out
>>> context entry for a device.  If we don't have correct bus numbers for
>>> devices, we won't be able to recover device state of the DMAR memory
>>> regions, and things will be messed up.
>>>
>>> So let's boost the PCIe root ports to be even with higher priority:
>>>
>>>    PCIe Root Port > IOMMU > PCI Devices
>>>
>>> A smoke test shows that this patch fixes bug 1538953.
>>
>> Two questions (partially overlapping with what I replied to Michaels):
>>   a) What happens with multiple IOMMUs?
> 
> If there are more IOMMUs, then the patch will let all the vIOMMUs be
> migrated after pcie root ports.
> 
> But a more honest answer is: I don't really know. :)
> 
> I don't even know how multiple vIOMMUs would cooperate with each
> other, especially when nested. 

I am not aware of "nested" IOMMUs. Multiple IOMMUs work together
by dividing the bus ranges: each of them declares in the
corresponding ACPI table the bus/device range it is in charge of.

However, there was a kernel bug some time ago preventing several
IOMMUs from working together; I am not sure the problem is solved yet.
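
As a rough illustration of the "dividing the bus ranges" idea only: the
sketch below assumes each IOMMU is described by one inclusive bus range.
The struct, the function name and the flat range layout are hypothetical
simplifications and do not reflect the real ACPI table format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one entry per IOMMU, covering an inclusive bus range. */
typedef struct {
    int iommu_id;
    uint8_t bus_first;
    uint8_t bus_last;
} SketchIommuScope;

/* Return the id of the IOMMU whose range contains bus, or -1 if none does. */
static int sketch_iommu_for_bus(const SketchIommuScope *scopes, size_t n,
                                uint8_t bus)
{
    assert(scopes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (bus >= scopes[i].bus_first && bus <= scopes[i].bus_last) {
            return scopes[i].iommu_id;
        }
    }
    return -1;
}
```

With non-overlapping ranges every bus maps to at most one IOMMU, so the
units do not need to know about each other.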


> In the nested case there may be
> dependencies between vIOMMUs, but I'll avoid thinking about that until
> we support more than one vIOMMU.
> 
>>   b) What happens with multiple root ports?
> 
> Same answer as previous one: all of them will be migrated before any
> vIOMMUs.
> 
> Note that IMHO we don't care which pcie root port is migrated first -
> IMHO they should not depend on each other, but Marcel may correct me.
> 

Right, the Root Ports are independent of each other.

Thanks,
Marcel

>>   c) How correct is this ordering on different implementations 
>>     (e.g. ARM/Power/etc)
> 
> Currently it has no effect there, since the Intel IOMMU is the only user
> of MIG_PRI_IOMMU.  Once SMMU is merged it may be affected (if it uses
> this priority), but IMHO that's fine too as long as PCIe root ports
> don't depend on anything related to SMMU.
> 
> Thanks,
> 

* Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
  2018-02-02 10:04         ` Peter Xu
@ 2018-02-02 13:25           ` Marcel Apfelbaum
  0 siblings, 0 replies; 13+ messages in thread
From: Marcel Apfelbaum @ 2018-02-02 13:25 UTC (permalink / raw)
  To: Peter Xu
  Cc: Dr. David Alan Gilbert, Michael S. Tsirkin, qemu-devel,
	Alex Williamson, Juan Quintela, Laurent Vivier

On 02/02/2018 12:04, Peter Xu wrote:
> On Thu, Feb 01, 2018 at 10:01:31PM +0200, Marcel Apfelbaum wrote:
> 
> [...]
> 
>> Root ports can't be nested, anyway, I suppose the migration should
>> follow the bus numbering order.
> 
> Could I ask whether this is a must?  And if yes, why?
> 

Not sure. The above will ensure that if a device needs some parent/bus
info at load time, the information will be valid.
But if it worked until now, maybe most of the devices do not need that.

>>
>> The question now is what happens if the migration is happening before
>> the guest firmware finishes assigning numbers to buses...
> 
> Do you mean that vIOMMU may fetch wrong context entries too?
> 

No, only that the bus number will not be available at load time.
In this case it is OK, since the firmware will continue to
assign bus numbers on the target side.

Thanks,
Marcel

> Note that as long as vIOMMU DMAR is off globally, vIOMMU will not
> fetch context entries at all.  So IMHO this problem should not happen
> during the firmware execution time (assuming that the firmware should
> not enable vIOMMU at all).
> 
> Thanks,
> 

Thread overview: 13+ messages
2018-02-01 11:20 [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority Peter Xu
2018-02-01 12:24 ` Marcel Apfelbaum
2018-02-01 19:18   ` Michael S. Tsirkin
2018-02-01 19:48     ` Dr. David Alan Gilbert
2018-02-01 20:01       ` Marcel Apfelbaum
2018-02-01 20:10         ` Dr. David Alan Gilbert
2018-02-02 10:04         ` Peter Xu
2018-02-02 13:25           ` Marcel Apfelbaum
2018-02-01 19:38   ` Dr. David Alan Gilbert
2018-02-02 10:19   ` Peter Xu
2018-02-01 19:51 ` Dr. David Alan Gilbert
2018-02-02  9:56   ` Peter Xu
2018-02-02 12:39     ` Marcel Apfelbaum
