On Tue, 22 Nov 2016 10:15:59 +1100
David Gibson wrote:

> On Mon, Nov 21, 2016 at 05:02:20PM +0100, Greg Kurz wrote:
> > On Mon, 21 Nov 2016 13:02:53 +0100
> > Thomas Huth wrote:
> >
> > > On 21.11.2016 06:31, David Gibson wrote:
> > > > daa2369 "spapr_pci: Add a 64-bit MMIO window" subtly broke migration
> > > > from qemu-2.7 to the current version. It split the device's MMIO
> > > > window into two pieces for 32-bit and 64-bit MMIO.
> > > >
> > > > The patch included backwards compatibility code to convert the old
> > > > property into the new format. However, the property value was also
> > > > transferred in the migration stream and compared with a (probably
> > > > unwise) VMSTATE_EQUAL. So, the "raw" value from 2.7 is compared to
> > > > the new style converted value from (pre-)2.8 giving a mismatch and
> > > > migration failure.
> > > >
> > > > Along with the actual field that caused the breakage, there are
> > > > several other ill-advised VMSTATE_EQUAL()s. To fix forwards
> > > > migration, we read the values in the stream into scratch variables and
> > > > ignore them, instead of comparing for equality. To fix backwards
> > > > migration, we populate those scratch variables in pre_save() with
> > > > adjusted values to match the old behaviour.
> > > >
> > > > To permit the eventual possibility of removing this cruft from the
> > > > stream, we only include these compatibility fields if a new
> > > > 'pre-2.8-migration' property is set. We clear it on the pseries-2.8
> > > > machine type, which obviously can't be migrated backwards, but set it
> > > > on earlier machine type versions.
> > > >
> > > > Signed-off-by: David Gibson
> > > > ---
> > > >  hw/ppc/spapr.c              |  5 +++++
> > > >  hw/ppc/spapr_pci.c          | 33 ++++++++++++++++++++++++++++-----
> > > >  include/hw/pci-host/spapr.h |  6 ++++++
> > > >  3 files changed, 39 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > index 775ad2e..c3269c7 100644
> > > > --- a/hw/ppc/spapr.c
> > > > +++ b/hw/ppc/spapr.c
> > > > @@ -2772,6 +2772,11 @@ DEFINE_SPAPR_MACHINE(2_8, "2.8", true);
> > > >          .driver = TYPE_POWERPC_CPU, \
> > > >          .property = "pre-2.8-migration", \
> > > >          .value = "on", \
> > > > +    }, \
> > > > +    { \
> > > > +        .driver = TYPE_SPAPR_PCI_HOST_BRIDGE, \
> > > > +        .property = "pre-2.8-migration", \
> > > > +        .value = "on", \
> > > >      },
> > > >
> > > >  static void phb_placement_2_7(sPAPRMachineState *spapr, uint32_t index,
> > > > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > > > index e429c94..c62c1cb 100644
> > > > --- a/hw/ppc/spapr_pci.c
> > > > +++ b/hw/ppc/spapr_pci.c
> > > > @@ -1590,6 +1590,8 @@ static Property spapr_phb_properties[] = {
> > > >      DEFINE_PROP_UINT64("pgsz", sPAPRPHBState, page_size_mask,
> > > >                         (1ULL << 12) | (1ULL << 16)),
> > > >      DEFINE_PROP_UINT32("numa_node", sPAPRPHBState, numa_node, -1),
> > > > +    DEFINE_PROP_BOOL("pre-2.8-migration", sPAPRPHBState,
> > > > +                     pre_2_8_migration, false),
> > > >      DEFINE_PROP_END_OF_LIST(),
> > > >  };
> > > >
> > > > @@ -1636,6 +1638,20 @@ static void spapr_pci_pre_save(void *opaque)
> > > >          sphb->msi_devs[i].key = *(uint32_t *) key;
> > > >          sphb->msi_devs[i].value = *(spapr_pci_msi *) value;
> > > >      }
> > > > +
> > > > +    if (sphb->pre_2_8_migration) {
> > > > +        sphb->mig_liobn = sphb->dma_liobn[0];
> > > > +        sphb->mig_mem_win_addr = sphb->mem_win_addr;
> > > > +        sphb->mig_mem_win_size = sphb->mem_win_size;
> > > > +        sphb->mig_io_win_addr = sphb->io_win_addr;
> > > > +        sphb->mig_io_win_size = sphb->io_win_size;
> > > > +
> > > > +        if ((sphb->mem64_win_size != 0)
> > > > +            && (sphb->mem64_win_addr
> > > > +                == (sphb->mem_win_addr + sphb->mem_win_size))) {
> > > > +            sphb->mig_mem_win_size += sphb->mem64_win_size;
> > > > +        }
> > >
> > > Should we maybe print a warning/error message in case
> > >
> > >  sphb->mem64_win_size != 0 &&
> > >  sphb->mem64_win_addr != sphb->mem_win_addr + sphb->mem_win_size
> > >
> > > ... assuming that this means a configuration which can not be migrated
> > > backwards?
> > >
> >
> > Then shouldn't we forbid pre_2_8_migration to be set when we have a
> > non-contiguous window ?
>
> So, yes, we could do either of these, but really I don't think it's
> worth it. It will only happen if you have a custom constructed PHB,
> in which case it's pretty much on you to ensure that there's something
> compatible at the far end. Restricting it here has somewhat the same
> problem as VMSTATE_EQUAL()s did - they make assumptions about what is
> and isn't sane which could be broken by future changes (in this case
> changes in the 2.7 stable tree). They might be unlikely to change,
> but if they do things break, and the only benefit is a marginally
> better error message in cases that won't work anyway.
>

Makes sense indeed, hence:

Reviewed-by: Greg Kurz

Cheers.

--
Greg
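
For reference, the window-folding check from spapr_pci_pre_save() discussed in this thread can be pulled out into a small standalone sketch. This is only an illustration, not code from the patch: struct phb_windows and legacy_mem_win_size() are hypothetical names standing in for the relevant sPAPRPHBState fields and the pre_save logic. It shows the case Thomas and I asked about: when the 64-bit window is not contiguous with the 32-bit one, the legacy size silently stays at the 32-bit size.

```c
#include <stdint.h>

/* Hypothetical stand-in for the MMIO window fields of sPAPRPHBState. */
struct phb_windows {
    uint64_t mem_win_addr;    /* start of the 32-bit MMIO window */
    uint64_t mem_win_size;    /* size of the 32-bit MMIO window */
    uint64_t mem64_win_addr;  /* start of the 64-bit MMIO window */
    uint64_t mem64_win_size;  /* size of the 64-bit MMIO window, 0 if absent */
};

/*
 * Size of the single MMIO window as a 2.7-style stream would describe it.
 * The 64-bit window is folded in only when it exists and starts exactly
 * where the 32-bit window ends; otherwise the 32-bit size is returned
 * unchanged, with no warning.
 */
static uint64_t legacy_mem_win_size(const struct phb_windows *w)
{
    uint64_t size = w->mem_win_size;

    if (w->mem64_win_size != 0
        && w->mem64_win_addr == w->mem_win_addr + w->mem_win_size) {
        size += w->mem64_win_size;
    }
    return size;
}
```

With a 2 GiB window at 0x80000000 followed directly by a 4 GiB window at 0x100000000, the helper reports 6 GiB; moving the 64-bit window anywhere else leaves the result at 2 GiB, which is the "custom constructed PHB" situation David describes.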