All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Yury Kotov <yury-kotov@yandex-team.ru>
Cc: "Kevin Wolf" <kwolf@redhat.com>,
	yc-core@yandex-team.ru, "Juan Quintela" <quintela@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	qemu-devel@nongnu.org, "Max Reitz" <mreitz@redhat.com>,
	"Igor Mammedov" <imammedo@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>
Subject: Re: [RFC PATCH 0/1] Removing RAMBlocks during migration
Date: Wed, 11 Dec 2019 11:16:55 +0000	[thread overview]
Message-ID: <20191211111655.GC3875@work-vm> (raw)
In-Reply-To: <20191209074102.5926-1-yury-kotov@yandex-team.ru>

* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> Hi,
> 
> I found that it's possible to remove a RAMBlock during migration.
> E.g. device hot-unplugging initiated by a guest (how to reproduce is below).
> And I want to clarify whether RAMBlock removing (or even adding) during
> migration is valid operation or it's a bug.
> 
> Currently, it may cause some race conditions with migration thread and
> migration may fail because of them. For instance, vmstate_unregister_ram
> function which is called during PCIe device removing does these:
> - Memset idstr -> target may receive unknown/zeroed idstr -> migration fail
> - Set RAMBlock flags as non-migratable -> migration fail
> 
> RAMBlock removing itself seems safe for migration thread because of RCU.
> But it seems to me there are other possible race conditions (didn't test it):
> - qemu_put_buffer_async -> saves pointer to RAMBlock's memory
>    -> block will be freed out of RCU (between ram save iterations)
>    -> qemu_fflush -> access to freed memory.
> 
> So, I have the following questions:
> 1. Is RAMBlock removing/adding OK during migration?

I don't think that any hot(un)plug is safe during migration.
While it's true we hold RCUs as we walk lists, we can't hold the RCU
around the entire migration.

There's lots of other problems;  for example we call the .save_setup
methods on devices at the start of migration, but then call the iterate
on those devices later - if the device is added/removed between stages
we'll end up either having done a setup and not calling the actual save,
or the other way around.

Juan added checks to qdev_device_add/qdev_unplug in b06424d ~2.5 years
ago.

> 2. If yes then what should we do with vmstate_unregister_ram?
>    - Just remove vmstate_unregister_ram (my RFC patch)
>    - Refcount RAMBlock's migratable/non-migratable state
>    - Something else?
> 3. If it mustn't be possible, so may be
>    assert(migration_is_idle()) in qemu_ram_free?
> 
> P.S.
> I'm working on a fix of below problem and trying to choose better way:
> allow device removing and fix all problem like this or fix a particular device.
> 
> --------
> How to reproduce device removing during migration:
> 
> 1. Source QEMU command line (target is similar)
>   $ x86_64-softmmu/qemu-system-x86_64 \
>     -nodefaults -no-user-config -m 1024 -M q35 \
>     -qmp unix:./src.sock,server,nowait \
>     -drive file=./image,format=raw,if=virtio \
>     -device ioh3420,id=pcie.1 \
>     -device virtio-net,bus=pcie.1
> 2. Start migration with slow speed (to simplify reproducing)
> 3. Power off a device on the hotplug pcie.1 bus:
>   $ echo 0 > /sys/bus/pci/slots/0/power
> 4. Increase migration speed and wait until fail
> 
> Most likely you will get something like this:
>   qemu-system-x86_64: get_pci_config_device: Bad config data:
>           i=0xaa read: 0 device: 40 cmask: ff wmask: 0 w1cmask:19
>   qemu-system-x86_64: Failed to load PCIDevice:config
>   qemu-system-x86_64: Failed to load
>           ioh-3240-express-root-port:parent_obj.parent_obj.parent_obj
>   qemu-system-x86_64: error while loading state for instance 0x0 of device
>           '0000:00:03.0/ioh-3240-express-root-port'
>   qemu-system-x86_64: load of migration failed: Invalid argument
> 
> This error is just an illustration of the removing device possibility,
> but not actually an illustration of the race conditions for removing RAMBlock.

What path does this actually take - does it end up going via qdev_unplug
or some other way?

Dave

> Regards,
> Yury
> 
> Yury Kotov (1):
>   migration: Remove vmstate_unregister_ram
> 
>  hw/block/pflash_cfi01.c     | 1 -
>  hw/block/pflash_cfi02.c     | 1 -
>  hw/mem/pc-dimm.c            | 5 -----
>  hw/misc/ivshmem.c           | 2 --
>  hw/pci/pci.c                | 1 -
>  include/migration/vmstate.h | 1 -
>  migration/savevm.c          | 6 ------
>  7 files changed, 17 deletions(-)
> 
> -- 
> 2.24.0
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



  parent reply	other threads:[~2019-12-11 13:19 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-09  7:41 [RFC PATCH 0/1] Removing RAMBlocks during migration Yury Kotov
2019-12-09  7:41 ` [RFC PATCH 1/1] migration: Remove vmstate_unregister_ram Yury Kotov
2019-12-11 11:16 ` Dr. David Alan Gilbert [this message]
2019-12-23  8:51   ` [RFC PATCH 0/1] Removing RAMBlocks during migration Yury Kotov
2020-01-03 11:44     ` Dr. David Alan Gilbert
2020-01-07 20:08       ` Michael S. Tsirkin
2020-01-08 10:24         ` Dr. David Alan Gilbert
2020-01-13 14:18         ` Yury Kotov
2020-01-07 20:02 ` Michael S. Tsirkin
2020-01-08 13:40   ` Juan Quintela

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191211111655.GC3875@work-vm \
    --to=dgilbert@redhat.com \
    --cc=imammedo@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=mst@redhat.com \
    --cc=philmd@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=yc-core@yandex-team.ru \
    --cc=yury-kotov@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.