* [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu
@ 2019-05-14 8:05 ` Olaf Hering
0 siblings, 0 replies; 12+ messages in thread
From: Olaf Hering @ 2019-05-14 8:05 UTC (permalink / raw)
To: xen-devel
Cc: Anthony PERARD, Wei Liu, Olaf Hering, Ian Jackson, Roger Pau Monné
If a domU has a qemu-xen instance attached, it is required to call qemus
"xen-save-devices-state" method. Without it, the receiving side of a PV or
PVH migration may be unable to lock the image:
xen be: qdisk-51712: xen be: qdisk-51712: error: Failed to get "write" lock
error: Failed to get "write" lock
xen be: qdisk-51712: xen be: qdisk-51712: initialise() failed
initialise() failed
To fix this bug, libxl__domain_suspend_device_model() and
libxl__domain_resume_device_model() have to be called not only for HVM,
but also if the active device_model is QEMU_XEN.
Unfortunately, libxl__domain_build_info_setdefault() used to hardcode
b_info->device_model_version to QEMU_XEN if it does not know it any
better. As a result libxl__device_model_version_running() will return
incorrect values. This breaks domUs without a device_model.
libxl__qmp_stop() would wait 10 seconds in qmp_open() for a qemu that
will never appear. During this long timeframe the domU remains in state
paused on the sending side. As a result network connections may be
dropped. Once this bug is fixed as well, by just removing the assumption
that every domU has a QEMU_XEN, there is no code to actually initialise
b_info->device_model_version.
There is a helper function libxl__need_xenpv_qemu(), which is used in
various places to decide if a device_model has to be spawned. This
function can not be used as is, just to fill device_model_version,
because store_libxl_entry() was already called earlier.
Introduce LIBXL_DEVICE_MODEL_VERSION_NONE for PV and PVH that have no
need for a device_model to make the state explicit. Indicate this new
state via LIBXL_HAVE macro in libxl.h.
v05:
- move initialization of device_model_version to extra commit
- return error from libxl__need_xenpv_qemu
- add LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE
v04:
- make sure device_model_stubdomain is initialized
v03:
- rearrange code to make sure device_model_version is initialized before
store_libxl_entry() is called
v02:
- update wording in a comment
- remove stale goto in domcreate_launch_dm
- initialize ret in libxl__need_xenpv_qemu
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Anthony PERARD <anthony.perard@citrix.com>
---
tools/libxl/libxl.h | 7 +++++++
tools/libxl/libxl_create.c | 17 ++++++++++++++---
tools/libxl/libxl_dom_suspend.c | 8 ++++++--
tools/libxl/libxl_types.idl | 1 +
4 files changed, 28 insertions(+), 5 deletions(-)
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 482499a6c0..e0f6916b66 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1182,6 +1182,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, const libxl_mac *src);
*/
#define LIBXL_HAVE_PVCALLS 1
+/*
+ * LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE
+ *
+ * If this is defined, libxl will only run a device-model if required.
+ */
+#define LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE 1
+
typedef char **libxl_string_list;
void libxl_string_list_dispose(libxl_string_list *sl);
int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 3f0431cc84..64336b0d29 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -47,9 +47,20 @@ int libxl__domain_set_device_model(libxl__gc *gc, libxl_domain_config *d_config)
}
break;
default:
- b_info->device_model_version =
- LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL;
- break;
+ ret = libxl__need_xenpv_qemu(gc, d_config);
+ switch (ret) {
+ case 1:
+ d_config->b_info.device_model_version =
+ LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN;
+ break;
+ case 0:
+ d_config->b_info.device_model_version =
+ LIBXL_DEVICE_MODEL_VERSION_NONE;
+ break;
+ default:
+ LOGE(ERROR, "Unable to determine QEMU requisite");
+ return ret;
+ }
}
if (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index d1af3a6573..c492fe5dd1 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -379,7 +379,9 @@ static void domain_suspend_common_guest_suspended(libxl__egc *egc,
libxl__ev_xswatch_deregister(gc, &dsps->guest_watch);
libxl__ev_time_deregister(gc, &dsps->guest_timeout);
- if (dsps->type == LIBXL_DOMAIN_TYPE_HVM) {
+ if (dsps->type == LIBXL_DOMAIN_TYPE_HVM ||
+ libxl__device_model_version_running(gc, dsps->domid) ==
+ LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
dsps->callback_device_model_done = domain_suspend_common_done;
libxl__domain_suspend_device_model(egc, dsps); /* must be last */
return;
@@ -459,7 +461,9 @@ int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
goto out;
}
- if (type == LIBXL_DOMAIN_TYPE_HVM) {
+ if (type == LIBXL_DOMAIN_TYPE_HVM ||
+ libxl__device_model_version_running(gc, domid) ==
+ LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
rc = libxl__domain_resume_device_model(gc, domid);
if (rc) {
LOGD(ERROR, domid, "failed to resume device model:%d", rc);
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index cb4702fd7a..75bde095bc 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -106,6 +106,7 @@ libxl_device_model_version = Enumeration("device_model_version", [
(0, "UNKNOWN"),
(1, "QEMU_XEN_TRADITIONAL"), # Historical qemu-xen device model (qemu-dm)
(2, "QEMU_XEN"), # Upstream based qemu-xen device model
+ (3, "NONE"),
])
libxl_console_type = Enumeration("console_type", [
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [Xen-devel] [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu
@ 2019-05-14 8:05 ` Olaf Hering
0 siblings, 0 replies; 12+ messages in thread
From: Olaf Hering @ 2019-05-14 8:05 UTC (permalink / raw)
To: xen-devel
Cc: Anthony PERARD, Wei Liu, Olaf Hering, Ian Jackson, Roger Pau Monné
If a domU has a qemu-xen instance attached, it is required to call qemus
"xen-save-devices-state" method. Without it, the receiving side of a PV or
PVH migration may be unable to lock the image:
xen be: qdisk-51712: xen be: qdisk-51712: error: Failed to get "write" lock
error: Failed to get "write" lock
xen be: qdisk-51712: xen be: qdisk-51712: initialise() failed
initialise() failed
To fix this bug, libxl__domain_suspend_device_model() and
libxl__domain_resume_device_model() have to be called not only for HVM,
but also if the active device_model is QEMU_XEN.
Unfortunately, libxl__domain_build_info_setdefault() used to hardcode
b_info->device_model_version to QEMU_XEN if it does not know it any
better. As a result libxl__device_model_version_running() will return
incorrect values. This breaks domUs without a device_model.
libxl__qmp_stop() would wait 10 seconds in qmp_open() for a qemu that
will never appear. During this long timeframe the domU remains in state
paused on the sending side. As a result network connections may be
dropped. Once this bug is fixed as well, by just removing the assumption
that every domU has a QEMU_XEN, there is no code to actually initialise
b_info->device_model_version.
There is a helper function libxl__need_xenpv_qemu(), which is used in
various places to decide if a device_model has to be spawned. This
function can not be used as is, just to fill device_model_version,
because store_libxl_entry() was already called earlier.
Introduce LIBXL_DEVICE_MODEL_VERSION_NONE for PV and PVH that have no
need for a device_model to make the state explicit. Indicate this new
state via LIBXL_HAVE macro in libxl.h.
v05:
- move initialization of device_model_version to extra commit
- return error from libxl__need_xenpv_qemu
- add LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE
v04:
- make sure device_model_stubdomain is initialized
v03:
- rearrange code to make sure device_model_version is initialized before
store_libxl_entry() is called
v02:
- update wording in a comment
- remove stale goto in domcreate_launch_dm
- initialize ret in libxl__need_xenpv_qemu
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Anthony PERARD <anthony.perard@citrix.com>
---
tools/libxl/libxl.h | 7 +++++++
tools/libxl/libxl_create.c | 17 ++++++++++++++---
tools/libxl/libxl_dom_suspend.c | 8 ++++++--
tools/libxl/libxl_types.idl | 1 +
4 files changed, 28 insertions(+), 5 deletions(-)
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 482499a6c0..e0f6916b66 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1182,6 +1182,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, const libxl_mac *src);
*/
#define LIBXL_HAVE_PVCALLS 1
+/*
+ * LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE
+ *
+ * If this is defined, libxl will only run a device-model if required.
+ */
+#define LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE 1
+
typedef char **libxl_string_list;
void libxl_string_list_dispose(libxl_string_list *sl);
int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 3f0431cc84..64336b0d29 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -47,9 +47,20 @@ int libxl__domain_set_device_model(libxl__gc *gc, libxl_domain_config *d_config)
}
break;
default:
- b_info->device_model_version =
- LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL;
- break;
+ ret = libxl__need_xenpv_qemu(gc, d_config);
+ switch (ret) {
+ case 1:
+ d_config->b_info.device_model_version =
+ LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN;
+ break;
+ case 0:
+ d_config->b_info.device_model_version =
+ LIBXL_DEVICE_MODEL_VERSION_NONE;
+ break;
+ default:
+ LOGE(ERROR, "Unable to determine QEMU requisite");
+ return ret;
+ }
}
if (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index d1af3a6573..c492fe5dd1 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -379,7 +379,9 @@ static void domain_suspend_common_guest_suspended(libxl__egc *egc,
libxl__ev_xswatch_deregister(gc, &dsps->guest_watch);
libxl__ev_time_deregister(gc, &dsps->guest_timeout);
- if (dsps->type == LIBXL_DOMAIN_TYPE_HVM) {
+ if (dsps->type == LIBXL_DOMAIN_TYPE_HVM ||
+ libxl__device_model_version_running(gc, dsps->domid) ==
+ LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
dsps->callback_device_model_done = domain_suspend_common_done;
libxl__domain_suspend_device_model(egc, dsps); /* must be last */
return;
@@ -459,7 +461,9 @@ int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
goto out;
}
- if (type == LIBXL_DOMAIN_TYPE_HVM) {
+ if (type == LIBXL_DOMAIN_TYPE_HVM ||
+ libxl__device_model_version_running(gc, domid) ==
+ LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
rc = libxl__domain_resume_device_model(gc, domid);
if (rc) {
LOGD(ERROR, domid, "failed to resume device model:%d", rc);
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index cb4702fd7a..75bde095bc 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -106,6 +106,7 @@ libxl_device_model_version = Enumeration("device_model_version", [
(0, "UNKNOWN"),
(1, "QEMU_XEN_TRADITIONAL"), # Historical qemu-xen device model (qemu-dm)
(2, "QEMU_XEN"), # Upstream based qemu-xen device model
+ (3, "NONE"),
])
libxl_console_type = Enumeration("console_type", [
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu
@ 2019-05-14 8:14 ` Olaf Hering
0 siblings, 0 replies; 12+ messages in thread
From: Olaf Hering @ 2019-05-14 8:14 UTC (permalink / raw)
To: xen-devel; +Cc: Anthony PERARD, Wei Liu, Ian Jackson, Roger Pau Monné
[-- Attachment #1.1: Type: text/plain, Size: 2044 bytes --]
Am Tue, 14 May 2019 10:05:58 +0200
schrieb Olaf Hering <olaf@aepfle.de>:
> @@ -459,7 +461,9 @@ int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
> goto out;
> }
>
> - if (type == LIBXL_DOMAIN_TYPE_HVM) {
> + if (type == LIBXL_DOMAIN_TYPE_HVM ||
> + libxl__device_model_version_running(gc, domid) ==
> + LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
> rc = libxl__domain_resume_device_model(gc, domid);
> if (rc) {
> LOGD(ERROR, domid, "failed to resume device model:%d", rc);
I think this could be done like that instead, so that libxl__device_model_version_running
is called just once:
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -444,6 +444,8 @@ int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
if (libxl__qmp_resume(gc, domid))
return ERROR_FAIL;
break;
+ case LIBXL_DEVICE_MODEL_VERSION_NONE:
+ break;
default:
return ERROR_INVAL;
}
@@ -461,14 +463,10 @@ int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
goto out;
}
- if (type == LIBXL_DOMAIN_TYPE_HVM ||
- libxl__device_model_version_running(gc, domid) ==
- LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
- rc = libxl__domain_resume_device_model(gc, domid);
- if (rc) {
- LOGD(ERROR, domid, "failed to resume device model:%d", rc);
- goto out;
- }
+ rc = libxl__domain_resume_device_model(gc, domid);
+ if (rc) {
+ LOGD(ERROR, domid, "failed to resume device model:%d", rc);
+ goto out;
}
if (xc_domain_resume(CTX->xch, domid, suspend_cancel)) {
While it is easy for the resume path, doing the same in the suspend path
needs more changes. libxl__domain_suspend_device_model would need to receive
the callback and set it if a device model is active.
Should this be done on top of this change?
Olaf
[-- Attachment #1.2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]
[-- Attachment #2: Type: text/plain, Size: 157 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Xen-devel] [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu
@ 2019-05-14 8:14 ` Olaf Hering
0 siblings, 0 replies; 12+ messages in thread
From: Olaf Hering @ 2019-05-14 8:14 UTC (permalink / raw)
To: xen-devel; +Cc: Anthony PERARD, Wei Liu, Ian Jackson, Roger Pau Monné
[-- Attachment #1.1: Type: text/plain, Size: 2044 bytes --]
Am Tue, 14 May 2019 10:05:58 +0200
schrieb Olaf Hering <olaf@aepfle.de>:
> @@ -459,7 +461,9 @@ int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
> goto out;
> }
>
> - if (type == LIBXL_DOMAIN_TYPE_HVM) {
> + if (type == LIBXL_DOMAIN_TYPE_HVM ||
> + libxl__device_model_version_running(gc, domid) ==
> + LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
> rc = libxl__domain_resume_device_model(gc, domid);
> if (rc) {
> LOGD(ERROR, domid, "failed to resume device model:%d", rc);
I think this could be done like that instead, so that libxl__device_model_version_running
is called just once:
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -444,6 +444,8 @@ int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
if (libxl__qmp_resume(gc, domid))
return ERROR_FAIL;
break;
+ case LIBXL_DEVICE_MODEL_VERSION_NONE:
+ break;
default:
return ERROR_INVAL;
}
@@ -461,14 +463,10 @@ int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
goto out;
}
- if (type == LIBXL_DOMAIN_TYPE_HVM ||
- libxl__device_model_version_running(gc, domid) ==
- LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
- rc = libxl__domain_resume_device_model(gc, domid);
- if (rc) {
- LOGD(ERROR, domid, "failed to resume device model:%d", rc);
- goto out;
- }
+ rc = libxl__domain_resume_device_model(gc, domid);
+ if (rc) {
+ LOGD(ERROR, domid, "failed to resume device model:%d", rc);
+ goto out;
}
if (xc_domain_resume(CTX->xch, domid, suspend_cancel)) {
While it is easy for the resume path, doing the same in the suspend path
needs more changes. libxl__domain_suspend_device_model would need to receive
the callback and set it if a device model is active.
Should this be done on top of this change?
Olaf
[-- Attachment #1.2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]
[-- Attachment #2: Type: text/plain, Size: 157 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu
@ 2019-05-14 11:19 ` Roger Pau Monné
0 siblings, 0 replies; 12+ messages in thread
From: Roger Pau Monné @ 2019-05-14 11:19 UTC (permalink / raw)
To: Olaf Hering; +Cc: Anthony PERARD, xen-devel, Ian Jackson, Wei Liu
On Tue, May 14, 2019 at 10:05:58AM +0200, Olaf Hering wrote:
> If a domU has a qemu-xen instance attached, it is required to call qemus
> "xen-save-devices-state" method. Without it, the receiving side of a PV or
> PVH migration may be unable to lock the image:
>
> xen be: qdisk-51712: xen be: qdisk-51712: error: Failed to get "write" lock
> error: Failed to get "write" lock
> xen be: qdisk-51712: xen be: qdisk-51712: initialise() failed
> initialise() failed
>
> To fix this bug, libxl__domain_suspend_device_model() and
> libxl__domain_resume_device_model() have to be called not only for HVM,
> but also if the active device_model is QEMU_XEN.
>
> Unfortunately, libxl__domain_build_info_setdefault() used to hardcode
> b_info->device_model_version to QEMU_XEN if it does not know it any
> better. As a result libxl__device_model_version_running() will return
> incorrect values. This breaks domUs without a device_model.
> libxl__qmp_stop() would wait 10 seconds in qmp_open() for a qemu that
> will never appear. During this long timeframe the domU remains in state
> paused on the sending side. As a result network connections may be
> dropped. Once this bug is fixed as well, by just removing the assumption
> that every domU has a QEMU_XEN, there is no code to actually initialise
> b_info->device_model_version.
>
> There is a helper function libxl__need_xenpv_qemu(), which is used in
> various places to decide if a device_model has to be spawned. This
> function can not be used as is, just to fill device_model_version,
> because store_libxl_entry() was already called earlier.
>
> Introduce LIBXL_DEVICE_MODEL_VERSION_NONE for PV and PVH that have no
> need for a device_model to make the state explicit. Indicate this new
> state via LIBXL_HAVE macro in libxl.h.
>
> v05:
> - move initialization of device_model_version to extra commit
> - return error from libxl__need_xenpv_qemu
> - add LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE
> v04:
> - make sure device_model_stubdomain is initialized
> v03:
> - rearrange code to make sure device_model_version is initialized before
> store_libxl_entry() is called
> v02:
> - update wording in a comment
> - remove stale goto in domcreate_launch_dm
> - initialize ret in libxl__need_xenpv_qemu
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Thanks, Roger.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Xen-devel] [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu
@ 2019-05-14 11:19 ` Roger Pau Monné
0 siblings, 0 replies; 12+ messages in thread
From: Roger Pau Monné @ 2019-05-14 11:19 UTC (permalink / raw)
To: Olaf Hering; +Cc: Anthony PERARD, xen-devel, Ian Jackson, Wei Liu
On Tue, May 14, 2019 at 10:05:58AM +0200, Olaf Hering wrote:
> If a domU has a qemu-xen instance attached, it is required to call qemus
> "xen-save-devices-state" method. Without it, the receiving side of a PV or
> PVH migration may be unable to lock the image:
>
> xen be: qdisk-51712: xen be: qdisk-51712: error: Failed to get "write" lock
> error: Failed to get "write" lock
> xen be: qdisk-51712: xen be: qdisk-51712: initialise() failed
> initialise() failed
>
> To fix this bug, libxl__domain_suspend_device_model() and
> libxl__domain_resume_device_model() have to be called not only for HVM,
> but also if the active device_model is QEMU_XEN.
>
> Unfortunately, libxl__domain_build_info_setdefault() used to hardcode
> b_info->device_model_version to QEMU_XEN if it does not know it any
> better. As a result libxl__device_model_version_running() will return
> incorrect values. This breaks domUs without a device_model.
> libxl__qmp_stop() would wait 10 seconds in qmp_open() for a qemu that
> will never appear. During this long timeframe the domU remains in state
> paused on the sending side. As a result network connections may be
> dropped. Once this bug is fixed as well, by just removing the assumption
> that every domU has a QEMU_XEN, there is no code to actually initialise
> b_info->device_model_version.
>
> There is a helper function libxl__need_xenpv_qemu(), which is used in
> various places to decide if a device_model has to be spawned. This
> function can not be used as is, just to fill device_model_version,
> because store_libxl_entry() was already called earlier.
>
> Introduce LIBXL_DEVICE_MODEL_VERSION_NONE for PV and PVH that have no
> need for a device_model to make the state explicit. Indicate this new
> state via LIBXL_HAVE macro in libxl.h.
>
> v05:
> - move initialization of device_model_version to extra commit
> - return error from libxl__need_xenpv_qemu
> - add LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE
> v04:
> - make sure device_model_stubdomain is initialized
> v03:
> - rearrange code to make sure device_model_version is initialized before
> store_libxl_entry() is called
> v02:
> - update wording in a comment
> - remove stale goto in domcreate_launch_dm
> - initialize ret in libxl__need_xenpv_qemu
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Thanks, Roger.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu
@ 2019-05-14 14:38 ` Wei Liu
0 siblings, 0 replies; 12+ messages in thread
From: Wei Liu @ 2019-05-14 14:38 UTC (permalink / raw)
To: Olaf Hering
Cc: Anthony PERARD, xen-devel, Ian Jackson, Wei Liu, Roger Pau Monné
On Tue, May 14, 2019 at 10:14:52AM +0200, Olaf Hering wrote:
> Am Tue, 14 May 2019 10:05:58 +0200
> schrieb Olaf Hering <olaf@aepfle.de>:
>
> > @@ -459,7 +461,9 @@ int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
> > goto out;
> > }
> >
> > - if (type == LIBXL_DOMAIN_TYPE_HVM) {
> > + if (type == LIBXL_DOMAIN_TYPE_HVM ||
> > + libxl__device_model_version_running(gc, domid) ==
> > + LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
> > rc = libxl__domain_resume_device_model(gc, domid);
> > if (rc) {
> > LOGD(ERROR, domid, "failed to resume device model:%d", rc);
>
> I think this could be done like that instead, so that libxl__device_model_version_running
> is called just once:
>
> --- a/tools/libxl/libxl_dom_suspend.c
> +++ b/tools/libxl/libxl_dom_suspend.c
> @@ -444,6 +444,8 @@ int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
> if (libxl__qmp_resume(gc, domid))
> return ERROR_FAIL;
> break;
> + case LIBXL_DEVICE_MODEL_VERSION_NONE:
> + break;
> default:
> return ERROR_INVAL;
> }
> @@ -461,14 +463,10 @@ int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
> goto out;
> }
>
> - if (type == LIBXL_DOMAIN_TYPE_HVM ||
> - libxl__device_model_version_running(gc, domid) ==
> - LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
> - rc = libxl__domain_resume_device_model(gc, domid);
> - if (rc) {
> - LOGD(ERROR, domid, "failed to resume device model:%d", rc);
> - goto out;
> - }
> + rc = libxl__domain_resume_device_model(gc, domid);
> + if (rc) {
> + LOGD(ERROR, domid, "failed to resume device model:%d", rc);
> + goto out;
> }
>
> if (xc_domain_resume(CTX->xch, domid, suspend_cancel)) {
>
Yeah, from the look of it this is definitely better.
>
> While it is easy for the resume path, doing the same in the suspend path
> needs more changes. libxl__domain_suspend_device_model would need to receive
> the callback and set it if a device model is active.
What do you mean here? Can't you handle the NONE case just like you do
in the resume function?
Wei.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Xen-devel] [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu
@ 2019-05-14 14:38 ` Wei Liu
0 siblings, 0 replies; 12+ messages in thread
From: Wei Liu @ 2019-05-14 14:38 UTC (permalink / raw)
To: Olaf Hering
Cc: Anthony PERARD, xen-devel, Ian Jackson, Wei Liu, Roger Pau Monné
On Tue, May 14, 2019 at 10:14:52AM +0200, Olaf Hering wrote:
> Am Tue, 14 May 2019 10:05:58 +0200
> schrieb Olaf Hering <olaf@aepfle.de>:
>
> > @@ -459,7 +461,9 @@ int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
> > goto out;
> > }
> >
> > - if (type == LIBXL_DOMAIN_TYPE_HVM) {
> > + if (type == LIBXL_DOMAIN_TYPE_HVM ||
> > + libxl__device_model_version_running(gc, domid) ==
> > + LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
> > rc = libxl__domain_resume_device_model(gc, domid);
> > if (rc) {
> > LOGD(ERROR, domid, "failed to resume device model:%d", rc);
>
> I think this could be done like that instead, so that libxl__device_model_version_running
> is called just once:
>
> --- a/tools/libxl/libxl_dom_suspend.c
> +++ b/tools/libxl/libxl_dom_suspend.c
> @@ -444,6 +444,8 @@ int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
> if (libxl__qmp_resume(gc, domid))
> return ERROR_FAIL;
> break;
> + case LIBXL_DEVICE_MODEL_VERSION_NONE:
> + break;
> default:
> return ERROR_INVAL;
> }
> @@ -461,14 +463,10 @@ int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
> goto out;
> }
>
> - if (type == LIBXL_DOMAIN_TYPE_HVM ||
> - libxl__device_model_version_running(gc, domid) ==
> - LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
> - rc = libxl__domain_resume_device_model(gc, domid);
> - if (rc) {
> - LOGD(ERROR, domid, "failed to resume device model:%d", rc);
> - goto out;
> - }
> + rc = libxl__domain_resume_device_model(gc, domid);
> + if (rc) {
> + LOGD(ERROR, domid, "failed to resume device model:%d", rc);
> + goto out;
> }
>
> if (xc_domain_resume(CTX->xch, domid, suspend_cancel)) {
>
Yeah, from the look of it this is definitely better.
>
> While it is easy for the resume path, doing the same in the suspend path
> needs more changes. libxl__domain_suspend_device_model would need to receive
> the callback and set it if a device model is active.
What do you mean here? Can't you handle the NONE case just like you do
in the resume function?
Wei.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu
@ 2019-05-14 14:44 ` Olaf Hering
0 siblings, 0 replies; 12+ messages in thread
From: Olaf Hering @ 2019-05-14 14:44 UTC (permalink / raw)
To: Wei Liu; +Cc: Anthony PERARD, xen-devel, Ian Jackson, Roger Pau Monné
[-- Attachment #1.1: Type: text/plain, Size: 609 bytes --]
Am Tue, 14 May 2019 15:38:55 +0100
schrieb Wei Liu <wei.liu2@citrix.com>:
> > While it is easy for the resume path, doing the same in the suspend path
> > needs more changes. libxl__domain_suspend_device_model would need to receive
> > the callback and set it if a device model is active.
>
> What do you mean here? Can't you handle the NONE case just like you do
> in the resume function?
It means calling libxl__domain_suspend_device_model unconditionally, and
let that function decide what to do. Maybe there is no downside to set the
callback unconditionally, I will check that.
Olaf
[-- Attachment #1.2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]
[-- Attachment #2: Type: text/plain, Size: 157 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Xen-devel] [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu
@ 2019-05-14 14:44 ` Olaf Hering
0 siblings, 0 replies; 12+ messages in thread
From: Olaf Hering @ 2019-05-14 14:44 UTC (permalink / raw)
To: Wei Liu; +Cc: Anthony PERARD, xen-devel, Ian Jackson, Roger Pau Monné
[-- Attachment #1.1: Type: text/plain, Size: 609 bytes --]
Am Tue, 14 May 2019 15:38:55 +0100
schrieb Wei Liu <wei.liu2@citrix.com>:
> > While it is easy for the resume path, doing the same in the suspend path
> > needs more changes. libxl__domain_suspend_device_model would need to receive
> > the callback and set it if a device model is active.
>
> What do you mean here? Can't you handle the NONE case just like you do
> in the resume function?
It means calling libxl__domain_suspend_device_model unconditionally, and
let that function decide what to do. Maybe there is no downside to set the
callback unconditionally, I will check that.
Olaf
[-- Attachment #1.2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]
[-- Attachment #2: Type: text/plain, Size: 157 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu
@ 2019-05-14 15:42 ` Wei Liu
0 siblings, 0 replies; 12+ messages in thread
From: Wei Liu @ 2019-05-14 15:42 UTC (permalink / raw)
To: Olaf Hering
Cc: Anthony PERARD, xen-devel, Wei Liu, Ian Jackson, Roger Pau Monné
On Tue, May 14, 2019 at 04:44:19PM +0200, Olaf Hering wrote:
> Am Tue, 14 May 2019 15:38:55 +0100
> schrieb Wei Liu <wei.liu2@citrix.com>:
>
> > > While it is easy for the resume path, doing the same in the suspend path
> > > needs more changes. libxl__domain_suspend_device_model would need to receive
> > > the callback and set it if a device model is active.
> >
> > What do you mean here? Can't you handle the NONE case just like you do
> > in the resume function?
>
> It means calling libxl__domain_suspend_device_model unconditionally, and
> let that function decide what to do. Maybe there is no downside to set the
> callback unconditionally, I will check that.
Sure. In the meantime I will apply this patch at some point to fix the
bug first.
Wei.
>
> Olaf
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Xen-devel] [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu
@ 2019-05-14 15:42 ` Wei Liu
0 siblings, 0 replies; 12+ messages in thread
From: Wei Liu @ 2019-05-14 15:42 UTC (permalink / raw)
To: Olaf Hering
Cc: Anthony PERARD, xen-devel, Wei Liu, Ian Jackson, Roger Pau Monné
On Tue, May 14, 2019 at 04:44:19PM +0200, Olaf Hering wrote:
> Am Tue, 14 May 2019 15:38:55 +0100
> schrieb Wei Liu <wei.liu2@citrix.com>:
>
> > > While it is easy for the resume path, doing the same in the suspend path
> > > needs more changes. libxl__domain_suspend_device_model would need to receive
> > > the callback and set it if a device model is active.
> >
> > What do you mean here? Can't you handle the NONE case just like you do
> > in the resume function?
>
> It means calling libxl__domain_suspend_device_model unconditionally, and
> let that function decide what to do. Maybe there is no downside to set the
> callback unconditionally, I will check that.
Sure. In the meantime I will apply this patch at some point to fix the
bug first.
Wei.
>
> Olaf
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2019-05-14 15:42 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-14 8:05 [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu Olaf Hering
2019-05-14 8:05 ` [Xen-devel] " Olaf Hering
2019-05-14 8:14 ` Olaf Hering
2019-05-14 8:14 ` [Xen-devel] " Olaf Hering
2019-05-14 14:38 ` Wei Liu
2019-05-14 14:38 ` [Xen-devel] " Wei Liu
2019-05-14 14:44 ` Olaf Hering
2019-05-14 14:44 ` [Xen-devel] " Olaf Hering
2019-05-14 15:42 ` Wei Liu
2019-05-14 15:42 ` [Xen-devel] " Wei Liu
2019-05-14 11:19 ` Roger Pau Monné
2019-05-14 11:19 ` [Xen-devel] " Roger Pau Monné
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.