* [RFC PATCH 0/1] Removing RAMBlocks during migration
From: Yury Kotov @ 2019-12-09  7:41 UTC
  To: qemu-devel
  Cc: Kevin Wolf, yc-core, Juan Quintela, Michael S. Tsirkin,
	Dr. David Alan Gilbert, Max Reitz, Igor Mammedov,
	Philippe Mathieu-Daudé

Hi,

I found that it's possible to remove a RAMBlock during migration,
e.g. via device hot-unplug initiated by the guest (steps to reproduce are
below). I want to clarify whether removing (or even adding) a RAMBlock
during migration is a valid operation or a bug.

Currently, it can race with the migration thread, and migration may fail
as a result. For instance, the vmstate_unregister_ram function, which is
called during PCIe device removal, does the following:
- Zeroes the idstr -> the target may receive an unknown/zeroed idstr -> migration fails
- Marks the RAMBlock as non-migratable -> migration fails

Removing the RAMBlock itself seems safe for the migration thread because
of RCU. But it seems to me there are other possible race conditions
(untested; a rough sketch follows):
- qemu_put_buffer_async -> saves a pointer to the RAMBlock's memory
   -> the block is freed outside the RCU read section (between ram save iterations)
   -> qemu_fflush -> access to freed memory.
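
To make the suspected race concrete, here is a minimal sketch (the wrapper
function is invented for illustration; qemu_put_buffer_async and qemu_fflush
are the real QEMUFile calls, and block->host is the RAMBlock's backing
memory):

  /* Migration thread, inside an RCU read section: */
  static void save_page_sketch(QEMUFile *f, RAMBlock *block, ram_addr_t offset)
  {
      /* Queues a pointer into block->host -- no copy is made here. */
      qemu_put_buffer_async(f, block->host + offset, TARGET_PAGE_SIZE, false);
  }

  /*
   * Main thread, between save iterations (the migration thread holds no
   * RCU read lock at this point): guest-initiated unplug ->
   * qemu_ram_free(block) -> block->host is eventually freed.
   *
   * Migration thread, later:
   *     qemu_fflush(f);   <- flushes the queued iovecs, reading freed memory
   */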

So, I have the following questions:
1. Is removing/adding a RAMBlock OK during migration?
2. If yes, then what should we do with vmstate_unregister_ram?
   - Just remove vmstate_unregister_ram (my RFC patch)
   - Refcount the RAMBlock's migratable/non-migratable state
   - Something else?
3. If it must not be possible, should we add
   assert(migration_is_idle()) to qemu_ram_free?
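
For question 3, the guard would be a one-liner. An illustrative (untested)
sketch against qemu_ram_free() in exec.c, assuming migration/misc.h is
included there for migration_is_idle():

  void qemu_ram_free(RAMBlock *block)
  {
      /* Hypothetical guard: RAMBlocks must not disappear mid-migration. */
      assert(migration_is_idle());

      /* ... existing body unchanged ... */
  }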

P.S.
I'm working on a fix for the problem below and trying to choose the better
approach: allow device removal and fix all problems like this, or fix the
particular device.

--------
How to reproduce device removal during migration:

1. Source QEMU command line (target is similar)
  $ x86_64-softmmu/qemu-system-x86_64 \
    -nodefaults -no-user-config -m 1024 -M q35 \
    -qmp unix:./src.sock,server,nowait \
    -drive file=./image,format=raw,if=virtio \
    -device ioh3420,id=pcie.1 \
    -device virtio-net,bus=pcie.1
2. Start migration with a low speed limit (to simplify reproducing)
3. In the guest, power off the device on the hotpluggable pcie.1 bus:
  $ echo 0 > /sys/bus/pci/slots/0/power
4. Increase the migration speed and wait until migration fails

Most likely you will get something like this:
  qemu-system-x86_64: get_pci_config_device: Bad config data:
          i=0xaa read: 0 device: 40 cmask: ff wmask: 0 w1cmask:19
  qemu-system-x86_64: Failed to load PCIDevice:config
  qemu-system-x86_64: Failed to load
          ioh-3240-express-root-port:parent_obj.parent_obj.parent_obj
  qemu-system-x86_64: error while loading state for instance 0x0 of device
          '0000:00:03.0/ioh-3240-express-root-port'
  qemu-system-x86_64: load of migration failed: Invalid argument

This error just illustrates that a device can be removed during migration;
it does not actually illustrate the race conditions of removing a RAMBlock.

Regards,
Yury

Yury Kotov (1):
  migration: Remove vmstate_unregister_ram

 hw/block/pflash_cfi01.c     | 1 -
 hw/block/pflash_cfi02.c     | 1 -
 hw/mem/pc-dimm.c            | 5 -----
 hw/misc/ivshmem.c           | 2 --
 hw/pci/pci.c                | 1 -
 include/migration/vmstate.h | 1 -
 migration/savevm.c          | 6 ------
 7 files changed, 17 deletions(-)

-- 
2.24.0




* [RFC PATCH 1/1] migration: Remove vmstate_unregister_ram
From: Yury Kotov @ 2019-12-09  7:41 UTC
  To: qemu-devel
  Cc: Kevin Wolf, yc-core, Juan Quintela, Michael S. Tsirkin,
	Dr. David Alan Gilbert, Max Reitz, Igor Mammedov,
	Philippe Mathieu-Daudé

Disclaimer: This is just an RFC.
It's more an illustration patch than a real patch, and I didn't verify
the possible side effects!

Currently, it's possible to call this function during migration, and
thus to have a race condition between the migration thread and the
main thread.

It seems that all calls to this function happen just before the
corresponding MemoryRegion is deleted. Thus, the function actually
has no effect.

Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
---
 hw/block/pflash_cfi01.c     | 1 -
 hw/block/pflash_cfi02.c     | 1 -
 hw/mem/pc-dimm.c            | 5 -----
 hw/misc/ivshmem.c           | 2 --
 hw/pci/pci.c                | 1 -
 include/migration/vmstate.h | 1 -
 migration/savevm.c          | 6 ------
 7 files changed, 17 deletions(-)

diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
index 54e6ebd385..1d0cc9e576 100644
--- a/hw/block/pflash_cfi01.c
+++ b/hw/block/pflash_cfi01.c
@@ -763,7 +763,6 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp)
     if (pfl->blk) {
         if (!blk_check_size_and_read_all(pfl->blk, pfl->storage, total_len,
                                          errp)) {
-            vmstate_unregister_ram(&pfl->mem, DEVICE(pfl));
             return;
         }
     }
diff --git a/hw/block/pflash_cfi02.c b/hw/block/pflash_cfi02.c
index c7d92c3e79..1abe1120bf 100644
--- a/hw/block/pflash_cfi02.c
+++ b/hw/block/pflash_cfi02.c
@@ -810,7 +810,6 @@ static void pflash_cfi02_realize(DeviceState *dev, Error **errp)
     if (pfl->blk) {
         if (!blk_check_size_and_read_all(pfl->blk, pfl->storage,
                                          pfl->chip_len, errp)) {
-            vmstate_unregister_ram(&pfl->orig_mem, DEVICE(pfl));
             return;
         }
     }
diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index 99e2faf01b..f8cb5233ee 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -76,12 +76,7 @@ void pc_dimm_plug(PCDIMMDevice *dimm, MachineState *machine, Error **errp)
 
 void pc_dimm_unplug(PCDIMMDevice *dimm, MachineState *machine)
 {
-    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
-    MemoryRegion *vmstate_mr = ddc->get_vmstate_memory_region(dimm,
-                                                              &error_abort);
-
     memory_device_unplug(MEMORY_DEVICE(dimm), machine);
-    vmstate_unregister_ram(vmstate_mr, DEVICE(dimm));
 }
 
 static int pc_dimm_slot2bitmap(Object *obj, void *opaque)
diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 5e3b05eae0..f31e313dec 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -973,8 +973,6 @@ static void ivshmem_exit(PCIDevice *dev)
             fd = memory_region_get_fd(s->ivshmem_bar2);
             close(fd);
         }
-
-        vmstate_unregister_ram(s->ivshmem_bar2, DEVICE(dev));
     }
 
     if (s->hostmem) {
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index cbc7a32568..8a719e2bfa 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2371,7 +2371,6 @@ static void pci_del_option_rom(PCIDevice *pdev)
     if (!pdev->has_rom)
         return;
 
-    vmstate_unregister_ram(&pdev->rom, &pdev->qdev);
     pdev->has_rom = false;
 }
 
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index ac4f46a67d..f298de44a7 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -1176,7 +1176,6 @@ void vmstate_unregister(DeviceState *dev, const VMStateDescription *vmsd,
 
 struct MemoryRegion;
 void vmstate_register_ram(struct MemoryRegion *memory, DeviceState *dev);
-void vmstate_unregister_ram(struct MemoryRegion *memory, DeviceState *dev);
 void vmstate_register_ram_global(struct MemoryRegion *memory);
 
 bool vmstate_check_only_migratable(const VMStateDescription *vmsd);
diff --git a/migration/savevm.c b/migration/savevm.c
index a71b930b91..4b3a7b1b76 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2915,12 +2915,6 @@ void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
     qemu_ram_set_migratable(mr->ram_block);
 }
 
-void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
-{
-    qemu_ram_unset_idstr(mr->ram_block);
-    qemu_ram_unset_migratable(mr->ram_block);
-}
-
 void vmstate_register_ram_global(MemoryRegion *mr)
 {
     vmstate_register_ram(mr, NULL);
-- 
2.24.0




* Re: [RFC PATCH 0/1] Removing RAMBlocks during migration
From: Dr. David Alan Gilbert @ 2019-12-11 11:16 UTC
  To: Yury Kotov
  Cc: Kevin Wolf, yc-core, Juan Quintela, Michael S. Tsirkin,
	qemu-devel, Max Reitz, Igor Mammedov, Philippe Mathieu-Daudé

* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> Hi,
> 
> I found that it's possible to remove a RAMBlock during migration,
> e.g. via device hot-unplug initiated by the guest (steps to reproduce are
> below). I want to clarify whether removing (or even adding) a RAMBlock
> during migration is a valid operation or a bug.
> 
> Currently, it can race with the migration thread, and migration may fail
> as a result. For instance, the vmstate_unregister_ram function, which is
> called during PCIe device removal, does the following:
> - Zeroes the idstr -> the target may receive an unknown/zeroed idstr -> migration fails
> - Marks the RAMBlock as non-migratable -> migration fails
> 
> Removing the RAMBlock itself seems safe for the migration thread because
> of RCU. But it seems to me there are other possible race conditions
> (untested):
> - qemu_put_buffer_async -> saves a pointer to the RAMBlock's memory
>    -> the block is freed outside the RCU read section (between ram save iterations)
>    -> qemu_fflush -> access to freed memory.
> 
> So, I have the following questions:
> 1. Is removing/adding a RAMBlock OK during migration?

I don't think that any hot(un)plug is safe during migration.
While it's true that we hold RCU read locks as we walk the lists, we
can't hold the RCU lock around the entire migration.

There are lots of other problems; for example, we call the .save_setup
methods on devices at the start of migration but then call the iterate
on those devices later - if a device is added/removed between stages,
we'll end up either having done a setup without calling the actual save,
or the other way around.

Juan added checks to qdev_device_add/qdev_unplug in commit b06424d
~2.5 years ago.

> 2. If yes, then what should we do with vmstate_unregister_ram?
>    - Just remove vmstate_unregister_ram (my RFC patch)
>    - Refcount the RAMBlock's migratable/non-migratable state
>    - Something else?
> 3. If it must not be possible, should we add
>    assert(migration_is_idle()) to qemu_ram_free?
> 
> P.S.
> I'm working on a fix for the problem below and trying to choose the better
> approach: allow device removal and fix all problems like this, or fix the
> particular device.
> 
> --------
> How to reproduce device removal during migration:
> 
> 1. Source QEMU command line (target is similar)
>   $ x86_64-softmmu/qemu-system-x86_64 \
>     -nodefaults -no-user-config -m 1024 -M q35 \
>     -qmp unix:./src.sock,server,nowait \
>     -drive file=./image,format=raw,if=virtio \
>     -device ioh3420,id=pcie.1 \
>     -device virtio-net,bus=pcie.1
> 2. Start migration with a low speed limit (to simplify reproducing)
> 3. In the guest, power off the device on the hotpluggable pcie.1 bus:
>   $ echo 0 > /sys/bus/pci/slots/0/power
> 4. Increase the migration speed and wait until migration fails
> 
> Most likely you will get something like this:
>   qemu-system-x86_64: get_pci_config_device: Bad config data:
>           i=0xaa read: 0 device: 40 cmask: ff wmask: 0 w1cmask:19
>   qemu-system-x86_64: Failed to load PCIDevice:config
>   qemu-system-x86_64: Failed to load
>           ioh-3240-express-root-port:parent_obj.parent_obj.parent_obj
>   qemu-system-x86_64: error while loading state for instance 0x0 of device
>           '0000:00:03.0/ioh-3240-express-root-port'
>   qemu-system-x86_64: load of migration failed: Invalid argument
> 
> This error just illustrates that a device can be removed during migration;
> it does not actually illustrate the race conditions of removing a RAMBlock.

What path does this actually take - does it end up going via qdev_unplug
or some other way?

Dave

> Regards,
> Yury
> 
> Yury Kotov (1):
>   migration: Remove vmstate_unregister_ram
> 
>  hw/block/pflash_cfi01.c     | 1 -
>  hw/block/pflash_cfi02.c     | 1 -
>  hw/mem/pc-dimm.c            | 5 -----
>  hw/misc/ivshmem.c           | 2 --
>  hw/pci/pci.c                | 1 -
>  include/migration/vmstate.h | 1 -
>  migration/savevm.c          | 6 ------
>  7 files changed, 17 deletions(-)
> 
> -- 
> 2.24.0
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [RFC PATCH 0/1] Removing RAMBlocks during migration
From: Yury Kotov @ 2019-12-23  8:51 UTC
  To: Dr. David Alan Gilbert
  Cc: Kevin Wolf, yc-core, Juan Quintela, Michael S. Tsirkin,
	qemu-devel, Max Reitz, Igor Mammedov, Philippe Mathieu-Daudé

Hi!

11.12.2019, 14:17, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>>  Hi,
>>
>>  I found that it's possible to remove a RAMBlock during migration,
>>  e.g. via device hot-unplug initiated by the guest (steps to reproduce are
>>  below). I want to clarify whether removing (or even adding) a RAMBlock
>>  during migration is a valid operation or a bug.
>>
>>  Currently, it can race with the migration thread, and migration may fail
>>  as a result. For instance, the vmstate_unregister_ram function, which is
>>  called during PCIe device removal, does the following:
>>  - Zeroes the idstr -> the target may receive an unknown/zeroed idstr -> migration fails
>>  - Marks the RAMBlock as non-migratable -> migration fails
>>
>>  Removing the RAMBlock itself seems safe for the migration thread because
>>  of RCU. But it seems to me there are other possible race conditions
>>  (untested):
>>  - qemu_put_buffer_async -> saves a pointer to the RAMBlock's memory
>>     -> the block is freed outside the RCU read section (between ram save iterations)
>>     -> qemu_fflush -> access to freed memory.
>>
>>  So, I have the following questions:
>>  1. Is removing/adding a RAMBlock OK during migration?
>
> I don't think that any hot(un)plug is safe during migration.
> While it's true that we hold RCU read locks as we walk the lists, we
> can't hold the RCU lock around the entire migration.

I agree. Currently, it's unsafe to do any hot(un)plug.
But I thought (and wanted to clarify) that it would be nice to make it safe.
Holding the RCU lock around the entire migration is not the only way, actually.
For example, we can defer RAMBlock deletion: take a reference on each RAMBlock
before migration and unref them after migration.
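
A rough sketch of that deferral - assuming a MemoryRegion reference is
enough to keep the backing RAMBlock alive; the function names are invented:

  static GSList *pinned_mrs;   /* MemoryRegions pinned for the migration */

  static void migration_pin_ram_blocks(void)
  {
      RAMBlock *block;

      rcu_read_lock();
      RAMBLOCK_FOREACH(block) {
          memory_region_ref(block->mr);        /* defer actual deletion */
          pinned_mrs = g_slist_prepend(pinned_mrs, block->mr);
      }
      rcu_read_unlock();
  }

  /* Called once migration completes or fails: */
  static void migration_unpin_ram_blocks(void)
  {
      GSList *l;

      for (l = pinned_mrs; l; l = l->next) {
          memory_region_unref(l->data);        /* may now free the block */
      }
      g_slist_free(pinned_mrs);
      pinned_mrs = NULL;
  }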

>
> There are lots of other problems; for example, we call the .save_setup
> methods on devices at the start of migration but then call the iterate
> on those devices later - if a device is added/removed between stages,
> we'll end up either having done a setup without calling the actual save,
> or the other way around.

Hm... Yeah, that's a problem, thanks for mentioning it!

>
> Juan added checks to qdev_device_add/qdev_unplug in commit b06424d
> ~2.5 years ago.

I see that hot(un)plug during migration has many issues,
but generally they fall into three groups (if I didn't miss something):
1) RAMBlock add/del
2) Device add/del
3) VMState add/del

IIUC, RAMBlocks are not always connected to a device.
So, in theory, it might become possible to hot(un)plug a block
without hot-adding/removing a device. That's why I wanted to clarify
whether it makes sense to fix the RAMBlock-related problems separately.

But if you think there is no point in fixing all the related problems
so that hot(un)plug during migration becomes allowed, I think we can add
an assert(migration_is_idle()) in qemu_ram_free.

>>  2. If yes, then what should we do with vmstate_unregister_ram?
>>     - Just remove vmstate_unregister_ram (my RFC patch)
>>     - Refcount the RAMBlock's migratable/non-migratable state
>>     - Something else?
>>  3. If it must not be possible, should we add
>>     assert(migration_is_idle()) to qemu_ram_free?
>>
>>  P.S.
>>  I'm working on a fix for the problem below and trying to choose the better
>>  approach: allow device removal and fix all problems like this, or fix the
>>  particular device.
>>
>>  --------
>>  How to reproduce device removal during migration:
>>
>>  1. Source QEMU command line (target is similar)
>>    $ x86_64-softmmu/qemu-system-x86_64 \
>>      -nodefaults -no-user-config -m 1024 -M q35 \
>>      -qmp unix:./src.sock,server,nowait \
>>      -drive file=./image,format=raw,if=virtio \
>>      -device ioh3420,id=pcie.1 \
>>      -device virtio-net,bus=pcie.1
>>  2. Start migration with a low speed limit (to simplify reproducing)
>>  3. In the guest, power off the device on the hotpluggable pcie.1 bus:
>>    $ echo 0 > /sys/bus/pci/slots/0/power
>>  4. Increase the migration speed and wait until migration fails
>>
>>  Most likely you will get something like this:
>>    qemu-system-x86_64: get_pci_config_device: Bad config data:
>>            i=0xaa read: 0 device: 40 cmask: ff wmask: 0 w1cmask:19
>>    qemu-system-x86_64: Failed to load PCIDevice:config
>>    qemu-system-x86_64: Failed to load
>>            ioh-3240-express-root-port:parent_obj.parent_obj.parent_obj
>>    qemu-system-x86_64: error while loading state for instance 0x0 of device
>>            '0000:00:03.0/ioh-3240-express-root-port'
>>    qemu-system-x86_64: load of migration failed: Invalid argument
>>
>>  This error just illustrates that a device can be removed during migration;
>>  it does not actually illustrate the race conditions of removing a RAMBlock.
>
> What path does this actually take - does it end up going via qdev_unplug
> or some other way?

1) Guest: writes to the slot's PCI config space
2) QEMU: pcie_cap_slot_write_config -> pcie_unplug_device

So it's a purely guest-driven action, and qdev_unplug doesn't help here.
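
Schematically (a simplification, not the actual hw/pci/pcie.c code):

  /* Reacting to the guest's write to the slot's power-controller bits: */
  static void slot_write_config_sketch(PCIDevice *port, bool guest_powered_off)
  {
      if (guest_powered_off) {
          PCIBus *sec_bus = pci_bridge_get_sec_bus(PCI_BRIDGE(port));

          /* Detaches and frees the child device (and its RAMBlocks)
           * immediately, regardless of migration state. */
          pci_for_each_device(sec_bus, pci_bus_num(sec_bus),
                              pcie_unplug_device, NULL);
      }
  }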

>
> Dave
>
>>  Regards,
>>  Yury
>>
>>  Yury Kotov (1):
>>    migration: Remove vmstate_unregister_ram
>>
>>   hw/block/pflash_cfi01.c | 1 -
>>   hw/block/pflash_cfi02.c | 1 -
>>   hw/mem/pc-dimm.c | 5 -----
>>   hw/misc/ivshmem.c | 2 --
>>   hw/pci/pci.c | 1 -
>>   include/migration/vmstate.h | 1 -
>>   migration/savevm.c | 6 ------
>>   7 files changed, 17 deletions(-)
>>
>>  --
>>  2.24.0
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Regards,
Yury



* Re: [RFC PATCH 0/1] Removing RAMBlocks during migration
From: Dr. David Alan Gilbert @ 2020-01-03 11:44 UTC
  To: Yury Kotov, Michael S. Tsirkin
  Cc: Kevin Wolf, yc-core, Juan Quintela, qemu-devel, Max Reitz,
	Igor Mammedov, Philippe Mathieu-Daudé

* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> Hi!
> 
> 11.12.2019, 14:17, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> >>  Hi,
> >>
> >>  I found that it's possible to remove a RAMBlock during migration,
> >>  e.g. via device hot-unplug initiated by the guest (steps to reproduce are
> >>  below). I want to clarify whether removing (or even adding) a RAMBlock
> >>  during migration is a valid operation or a bug.
> >>
> >>  Currently, it can race with the migration thread, and migration may fail
> >>  as a result. For instance, the vmstate_unregister_ram function, which is
> >>  called during PCIe device removal, does the following:
> >>  - Zeroes the idstr -> the target may receive an unknown/zeroed idstr -> migration fails
> >>  - Marks the RAMBlock as non-migratable -> migration fails
> >>
> >>  Removing the RAMBlock itself seems safe for the migration thread because
> >>  of RCU. But it seems to me there are other possible race conditions
> >>  (untested):
> >>  - qemu_put_buffer_async -> saves a pointer to the RAMBlock's memory
> >>     -> the block is freed outside the RCU read section (between ram save iterations)
> >>     -> qemu_fflush -> access to freed memory.
> >>
> >>  So, I have the following questions:
> >>  1. Is removing/adding a RAMBlock OK during migration?
> >
> > I don't think that any hot(un)plug is safe during migration.
> > While it's true that we hold RCU read locks as we walk the lists, we
> > can't hold the RCU lock around the entire migration.
> 
> I agree. Currently, it's unsafe to do any hot(un)plug.
> But I thought (and wanted to clarify) that it would be nice to make it safe.
> Holding the RCU lock around the entire migration is not the only way, actually.
> For example, we can defer RAMBlock deletion: take a reference on each RAMBlock
> before migration and unref them after migration.

Yes, that might work.

> >
> > There are lots of other problems; for example, we call the .save_setup
> > methods on devices at the start of migration but then call the iterate
> > on those devices later - if a device is added/removed between stages,
> > we'll end up either having done a setup without calling the actual save,
> > or the other way around.
> 
> Hm... Yeah, that's a problem, thanks for mentioning it!
> 
> >
> > Juan added checks to qdev_device_add/qdev_unplug in commit b06424d
> > ~2.5 years ago.
> 
> I see that hot(un)plug during migration has many issues,
> but generally they fall into three groups (if I didn't miss something):
> 1) RAMBlock add/del
> 2) Device add/del
> 3) VMState add/del
> 
> IIUC, RAMBlocks are not always connected to a device.
> So, in theory, it might become possible to hot(un)plug a block
> without hot-adding/removing a device. That's why I wanted to clarify
> whether it makes sense to fix the RAMBlock-related problems separately.

Possibly true.

> But if you think there is no point in fixing all the related problems
> so that hot(un)plug during migration becomes allowed, I think we can add
> an assert(migration_is_idle()) in qemu_ram_free.

is_idle is probably the wrong thing, because of the new WAIT_UNPLUG
state that happens just after setup and is designed to allow VFIO
devices to be unplugged before we actually start the guts of the
migration; however, the idea makes sense.
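
So the check would need to be state-aware. A hypothetical helper
(migration_is_idle(), migrate_get_current() and MIGRATION_STATUS_WAIT_UNPLUG
are real names; the helper itself is invented):

  /* Is it currently safe to free a RAMBlock? */
  static bool ram_block_removal_allowed(void)
  {
      MigrationState *s = migrate_get_current();

      /* Idle, or still waiting for unplugs before any RAM is sent. */
      return migration_is_idle() ||
             s->state == MIGRATION_STATUS_WAIT_UNPLUG;
  }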

> >>  2. If yes, then what should we do with vmstate_unregister_ram?
> >>     - Just remove vmstate_unregister_ram (my RFC patch)
> >>     - Refcount the RAMBlock's migratable/non-migratable state
> >>     - Something else?
> >>  3. If it must not be possible, should we add
> >>     assert(migration_is_idle()) to qemu_ram_free?
> >>
> >>  P.S.
> >>  I'm working on a fix for the problem below and trying to choose the better
> >>  approach: allow device removal and fix all problems like this, or fix the
> >>  particular device.
> >>
> >>  --------
> >>  How to reproduce device removal during migration:
> >>
> >>  1. Source QEMU command line (target is similar)
> >>    $ x86_64-softmmu/qemu-system-x86_64 \
> >>      -nodefaults -no-user-config -m 1024 -M q35 \
> >>      -qmp unix:./src.sock,server,nowait \
> >>      -drive file=./image,format=raw,if=virtio \
> >>      -device ioh3420,id=pcie.1 \
> >>      -device virtio-net,bus=pcie.1
> >>  2. Start migration with a low speed limit (to simplify reproducing)
> >>  3. In the guest, power off the device on the hotpluggable pcie.1 bus:
> >>    $ echo 0 > /sys/bus/pci/slots/0/power
> >>  4. Increase the migration speed and wait until migration fails
> >>
> >>  Most likely you will get something like this:
> >>    qemu-system-x86_64: get_pci_config_device: Bad config data:
> >>            i=0xaa read: 0 device: 40 cmask: ff wmask: 0 w1cmask:19
> >>    qemu-system-x86_64: Failed to load PCIDevice:config
> >>    qemu-system-x86_64: Failed to load
> >>            ioh-3240-express-root-port:parent_obj.parent_obj.parent_obj
> >>    qemu-system-x86_64: error while loading state for instance 0x0 of device
> >>            '0000:00:03.0/ioh-3240-express-root-port'
> >>    qemu-system-x86_64: load of migration failed: Invalid argument
> >>
> >>  This error just illustrates that a device can be removed during migration;
> >>  it does not actually illustrate the race conditions of removing a RAMBlock.
> >
> > What path does this actually take - does it end up going via qdev_unplug
> > or some other way?
> 
> 1) Guest: writes to the slot's PCI config space
> 2) QEMU: pcie_cap_slot_write_config -> pcie_unplug_device
> 
> So it's a purely guest-driven action, and qdev_unplug doesn't help here.

Hmm, we need to find a way to stop that; let's see if Michael Tsirkin has
any ideas (cc'd). I'm thinking that if we could defer the unplug until the
end of the migration we'd be OK, but it feels racy as to whether the
destination is started with the device that the guest is unplugging.

Dave

> >
> > Dave
> >
> >>  Regards,
> >>  Yury
> >>
> >>  Yury Kotov (1):
> >>    migration: Remove vmstate_unregister_ram
> >>
> >>   hw/block/pflash_cfi01.c | 1 -
> >>   hw/block/pflash_cfi02.c | 1 -
> >>   hw/mem/pc-dimm.c | 5 -----
> >>   hw/misc/ivshmem.c | 2 --
> >>   hw/pci/pci.c | 1 -
> >>   include/migration/vmstate.h | 1 -
> >>   migration/savevm.c | 6 ------
> >>   7 files changed, 17 deletions(-)
> >>
> >>  --
> >>  2.24.0
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
> Regards,
> Yury
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [RFC PATCH 0/1] Removing RAMBlocks during migration
From: Michael S. Tsirkin @ 2020-01-07 20:02 UTC
  To: Yury Kotov
  Cc: Kevin Wolf, yc-core, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Igor Mammedov, Max Reitz,
	Philippe Mathieu-Daudé

On Mon, Dec 09, 2019 at 10:41:01AM +0300, Yury Kotov wrote:
> Hi,
> 
> I found that it's possible to remove a RAMBlock during migration,
> e.g. via device hot-unplug initiated by the guest (steps to reproduce are
> below). I want to clarify whether removing (or even adding) a RAMBlock
> during migration is a valid operation or a bug.

There's a very basic problem though: the list of RAMBlocks on the source
and the destination must match, otherwise the destination will be confused.

It is probably fixable: keep a fake RAMBlock around until migration is
complete, and send some kind of "RAMBlock removed" message to the
destination so it knows to remove it there as well.
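
No such message exists in the stream today; a purely hypothetical extension
could look like this (the flag value and helper are invented for
illustration, mimicking how ram.c sends block names):

  /* Hypothetical stream flag -- not part of the current RAM protocol. */
  #define RAM_SAVE_FLAG_BLOCK_REMOVED 0x200

  static void ram_send_block_removed(QEMUFile *f, const char *idstr)
  {
      size_t len = strlen(idstr);

      qemu_put_be64(f, RAM_SAVE_FLAG_BLOCK_REMOVED);
      qemu_put_byte(f, len);
      qemu_put_buffer(f, (const uint8_t *)idstr, len);
  }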

-- 
MST




* Re: [RFC PATCH 0/1] Removing RAMBlocks during migration
From: Michael S. Tsirkin @ 2020-01-07 20:08 UTC
  To: Dr. David Alan Gilbert
  Cc: Kevin Wolf, yc-core, Juan Quintela, jusual, qemu-devel,
	Max Reitz, Yury Kotov, Igor Mammedov, Philippe Mathieu-Daudé

On Fri, Jan 03, 2020 at 11:44:27AM +0000, Dr. David Alan Gilbert wrote:
> > 1) Guest: writes to the slot's PCI config space
> > 2) QEMU: pcie_cap_slot_write_config -> pcie_unplug_device
> > 
> > So it's a purely guest-driven action, and qdev_unplug doesn't help here.
> 
> Hmm, we need to find a way to stop that; let's see if Michael Tsirkin has
> any ideas (cc'd). I'm thinking that if we could defer the unplug until the
> end of the migration we'd be OK, but it feels racy as to whether the
> destination is started with the device that the guest is unplugging.
> 
> Dave

It's true - and the same is possible with PCI: the guest can remove
a device at will.

Also, while libvirt learned not to start device_del while migration
is active, it's annoying to have to wait for device_del
to complete before migration can be started.

I guess we can make the device invisible to the guest - that concept
now exists because of failover, and can maybe be reused here.

Alternatively or additionally - for a while now I have wanted to only
remove the device if the guest powered it off and removal was requested.
Again, it might be easier now that we have a concept of an invisible device.



-- 
MST




* Re: [RFC PATCH 0/1] Removing RAMBlocks during migration
From: Dr. David Alan Gilbert @ 2020-01-08 10:24 UTC
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, yc-core, Juan Quintela, jusual, qemu-devel,
	Max Reitz, Yury Kotov, Igor Mammedov, Philippe Mathieu-Daudé

* Michael S. Tsirkin (mst@redhat.com) wrote:
> On Fri, Jan 03, 2020 at 11:44:27AM +0000, Dr. David Alan Gilbert wrote:
> > > 1) Guest: writes to the slot's PCI config space
> > > 2) QEMU: pcie_cap_slot_write_config -> pcie_unplug_device
> > > 
> > > So it's a purely guest-driven action, and qdev_unplug doesn't help here.
> > 
> > Hmm, we need to find a way to stop that; let's see if Michael Tsirkin has
> > any ideas (cc'd). I'm thinking that if we could defer the unplug until the
> > end of the migration we'd be OK, but it feels racy as to whether the
> > destination is started with the device that the guest is unplugging.
> > 
> > Dave
> 
> It's true - and the same is possible with PCI: the guest can remove
> a device at will.
> 
> Also, while libvirt learned not to start device_del while migration
> is active, it's annoying to have to wait for device_del
> to complete before migration can be started.
> 
> I guess we can make the device invisible to the guest - that concept
> now exists because of failover, and can maybe be reused here.
> 
> Alternatively or additionally - for a while now I have wanted to only
> remove the device if the guest powered it off and removal was requested.
> Again, it might be easier now that we have a concept of an invisible device.

How do invisible devices work? Is that something that each device has
to learn about?
Would we only make them invisible for migration and then do a full
hot-unplug on the destination after the migration finishes?

Dave

> 
> 
> -- 
> MST
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [RFC PATCH 0/1] Removing RAMBlocks during migration
From: Juan Quintela @ 2020-01-08 13:40 UTC
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, yc-core, qemu-devel, Dr. David Alan Gilbert,
	Yury Kotov, Igor Mammedov, Max Reitz, Philippe Mathieu-Daudé

"Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Mon, Dec 09, 2019 at 10:41:01AM +0300, Yury Kotov wrote:
>> Hi,
>> 
>> I found that it's possible to remove a RAMBlock during migration,
>> e.g. via device hot-unplug initiated by the guest (steps to reproduce are
>> below). I want to clarify whether removing (or even adding) a RAMBlock
>> during migration is a valid operation or a bug.
>
> There's a very basic problem though: the list of RAMBlocks on the source
> and the destination must match, otherwise the destination will be confused.
>
> It is probably fixable: keep a fake RAMBlock around until migration is
> complete, and send some kind of "RAMBlock removed" message to the
> destination so it knows to remove it there as well.

Do we have the data to know that a device has been unplugged?
I think it would be easier to just queue that command.  Let the
guest unplug whatever it wants, and on the destination, once we know
that migration has finished, we just do the rest of the stuff.

As stated in the thread, we really want _not_ to have hotplug/unplug
during migration, but if the guest can initiate one, we can only "delay"
the bits that happen inside QEMU.
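
The queueing could be as simple as this sketch (types and helpers are
invented; the flush would run from a migration state notifier):

  typedef struct DeferredUnplug {
      PCIDevice *dev;
      QLIST_ENTRY(DeferredUnplug) next;
  } DeferredUnplug;

  static QLIST_HEAD(, DeferredUnplug) deferred_unplugs =
      QLIST_HEAD_INITIALIZER(deferred_unplugs);

  static void defer_unplug(PCIDevice *dev)
  {
      DeferredUnplug *du = g_new0(DeferredUnplug, 1);

      du->dev = dev;
      QLIST_INSERT_HEAD(&deferred_unplugs, du, next);
  }

  /* Once migration completes or fails: */
  static void flush_deferred_unplugs(void)
  {
      DeferredUnplug *du, *du_next;

      QLIST_FOREACH_SAFE(du, &deferred_unplugs, next, du_next) {
          QLIST_REMOVE(du, next);
          object_unparent(OBJECT(du->dev));   /* perform the real unplug */
          g_free(du);
      }
  }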

Later, Juan.




* Re: [RFC PATCH 0/1] Removing RAMBlocks during migration
From: Yury Kotov @ 2020-01-13 14:18 UTC
  To: Michael S. Tsirkin, Dr. David Alan Gilbert
  Cc: Kevin Wolf, yc-core, Juan Quintela, jusual, qemu-devel,
	Max Reitz, Igor Mammedov, Philippe Mathieu-Daudé

Hi!

07.01.2020, 23:08, "Michael S. Tsirkin" <mst@redhat.com>:
> On Fri, Jan 03, 2020 at 11:44:27AM +0000, Dr. David Alan Gilbert wrote:
>>  > 1) Guest: writes to the slot's PCI config space
>>  > 2) QEMU: pcie_cap_slot_write_config -> pcie_unplug_device
>>  >
>>  > So it's a purely guest-driven action, and qdev_unplug doesn't help here.
>>
>>  Hmm, we need to find a way to stop that; let's see if Michael Tsirkin has
>>  any ideas (cc'd). I'm thinking that if we could defer the unplug until the
>>  end of the migration we'd be OK, but it feels racy as to whether the
>>  destination is started with the device that the guest is unplugging.
>>
>>  Dave
>
> It's true - and the same is possible with PCI: the guest can remove
> a device at will.
>
> Also, while libvirt learned not to start device_del while migration
> is active, it's annoying to have to wait for device_del
> to complete before migration can be started.
>
> I guess we can make the device invisible to the guest - that concept
> now exists because of failover, and can maybe be reused here.
>
> Alternatively or additionally - for a while now I have wanted to only
> remove the device if the guest powered it off and removal was requested.
> Again, it might be easier now that we have a concept of an invisible device.
>

I sent an RFC patch that implements deferred device unplug:
  pcie: Defer hot unplug until migration is complete

Please take a look at it.

Regards,
Yury


