* RFC: QEMU bumping memory limit and domain restore
@ 2015-06-02 14:05 Wei Liu
  2015-06-02 14:08 ` Wei Liu
  2015-06-03 13:22 ` George Dunlap
  0 siblings, 2 replies; 18+ messages in thread
From: Wei Liu @ 2015-06-02 14:05 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, Stefano Stabellini, Ian Jackson, dslutz,
	andrew, yanghy

Previous discussion at [0].

For the benefit of discussion, we refer to max_memkb inside the
hypervisor as hv_max_memkb (name subject to improvement). That's the
maximum amount of memory a domain can use.

Libxl doesn't know what hv_max_memkb a domain needs prior to QEMU
start-up, because of optional ROMs etc.

Libxl doesn't know the hv_max_memkb even after QEMU start-up, because
there is no mechanism to communicate between QEMU and libxl. This is an
area that needs improvement; we've encountered problems here before.

QEMU calls xc_domain_setmaxmem to increase hv_max_memkb by N pages. Those
pages are only accounted for in the hypervisor. During migration, libxl
(currently) doesn't extract that value from the hypervisor.
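
For illustration, a minimal sketch of the read-modify-write pattern
involved (this is not the actual QEMU code; the helper name and error
handling are simplified):

    /* Illustrative sketch only -- not QEMU's real code.  Read the
     * current limit as the hypervisor sees it, then raise it by
     * "extra_pages" 4KiB pages. */
    #include <xenctrl.h>

    static int bump_maxmem_by_pages(xc_interface *xch, uint32_t domid,
                                    unsigned long extra_pages)
    {
        xc_dominfo_t info;

        if (xc_domain_getinfo(xch, domid, 1, &info) != 1 ||
            info.domid != domid)
            return -1;

        return xc_domain_setmaxmem(xch, domid,
                                   info.max_memkb + extra_pages * 4);
    }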

So now the problem is on the remote end:

1. Libxl indicates the domain needs X pages.
2. The domain actually needs X + N pages.
3. The remote end tries to write N more pages and fails.

This behaviour currently doesn't affect normal migration (where you
transfer the libxl JSON to the remote end, construct a domain, then
start QEMU) because QEMU won't bump hv_max_memkb again. This is by
design and reflected in QEMU code.

This behaviour affects COLO and becomes a bug in that case, because the
secondary VM's QEMU doesn't go through the same start-of-day
initialisation (Hongyang, correct me if I'm wrong), i.e. there is no
bumping of hv_max_memkb inside QEMU.

Andrew plans to embed JSON inside migration v2 and COLO is based on
migration v2. The bug is fixed if JSON is correct in the first place.

As COLO is not yet upstream, this bug is not a blocker for 4.6. But it
should be fixed for the benefit of COLO.

So here is a proof of concept patch to record and honour that value
during migration.  A new field is added in the IDL. Note that we don't
provide an xl-level config option for it, and we mandate that it be the
default value during domain creation. This is to prevent libxl users
from setting it, so as to avoid unforeseen repercussions.

This patch is compile-tested only. If we agree this is the way to go I
will test and submit a proper patch.

Wei.

[0] <1428941353-18673-1-git-send-email-dslutz@verizon.com>

---8<---
From ab9dc179ea4ee26eb88f61f8dad36dd01b63bb6b Mon Sep 17 00:00:00 2001
From: Wei Liu <wei.liu2@citrix.com>
Date: Tue, 2 Jun 2015 14:53:20 +0100
Subject: [PATCH] libxl: record and honour hv_max_memkb

The new field hv_max_memkb in the IDL is used to record "max_memkb"
inside the hypervisor. That reflects the maximum memory a guest can ask
for.

This field is mandated to be the default value during guest creation to
avoid unforeseen repercussions. It's only honoured when restoring a
guest.

(XXX compiled test only at this stage)

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.c         | 17 +++++++++--------
 tools/libxl/libxl_create.c  |  6 ++++++
 tools/libxl/libxl_dom.c     |  9 +++++++--
 tools/libxl/libxl_types.idl |  1 +
 4 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 9117b01..72fec8b 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -6614,6 +6614,7 @@ int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
     GC_INIT(ctx);
     int rc;
     libxl__domain_userdata_lock *lock = NULL;
+    uint64_t hv_max_memkb;
 
     CTX_LOCK;
 
@@ -6654,6 +6655,7 @@ int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
         }
         libxl_uuid_copy(ctx, &d_config->c_info.uuid, &info.uuid);
         libxl_dominfo_dispose(&info);
+        hv_max_memkb = info.max_memkb; /* store and use later */
     }
 
     /* Memory limits:
@@ -6661,17 +6663,15 @@ int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
      * Currently there are three memory limits:
      *  1. "target" in xenstore (originally memory= in config file)
      *  2. "static-max" in xenstore (originally maxmem= in config file)
-     *  3. "max_memkb" in hypervisor
-     *
-     * The third one is not visible and currently managed by
-     * toolstack. In order to rebuild a domain we only need to have
-     * "target" and "static-max".
+     *  3. "max_memkb" in hypervisor (corresponds to hv_max_memkb in
+     *     idl, not visible to xl level)
      */
     {
-        uint32_t target_memkb = 0, max_memkb = 0;
+        uint32_t target_memkb = 0, static_max_memkb = 0;
 
         /* "target" */
-        rc = libxl__get_memory_target(gc, domid, &target_memkb, &max_memkb);
+        rc = libxl__get_memory_target(gc, domid, &target_memkb,
+                                      &static_max_memkb);
         if (rc) {
             LOG(ERROR, "fail to get memory target for domain %d", domid);
             goto out;
@@ -6683,7 +6683,8 @@ int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
         d_config->b_info.target_memkb = target_memkb +
             d_config->b_info.video_memkb;
 
-        d_config->b_info.max_memkb = max_memkb;
+        d_config->b_info.max_memkb = static_max_memkb;
+        d_config->b_info.hv_max_memkb = hv_max_memkb;
     }
 
     /* Devices: disk, nic, vtpm, pcidev etc. */
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 86384d2..d35a393 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -813,6 +813,12 @@ static void initiate_domain_create(libxl__egc *egc,
 
     domid = 0;
 
+    if (restore_fd <= 0 &&
+        d_config->b_info.hv_max_memkb != LIBXL_MEMKB_DEFAULT) {
+        LOG(ERROR, "Setting hv_max_memkb when creating domain");
+        goto error_out;
+    }
+
     if (d_config->c_info.ssid_label) {
         char *s = d_config->c_info.ssid_label;
         ret = libxl_flask_context_to_sid(ctx, s, strlen(s),
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index a0c9850..5e33611 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -408,8 +408,13 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
         }
     }
 
-    if (xc_domain_setmaxmem(ctx->xch, domid, info->target_memkb +
-        LIBXL_MAXMEM_CONSTANT) < 0) {
+    /* hv_max_memkb is set in restore path, honour it. */
+    if (info->hv_max_memkb == LIBXL_MEMKB_DEFAULT)
+        rc = xc_domain_setmaxmem(ctx->xch, domid, info->target_memkb +
+                                 LIBXL_MAXMEM_CONSTANT);
+    else
+        rc = xc_domain_setmaxmem(ctx->xch, domid, info->hv_max_memkb);
+    if (rc < 0) {
         LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "Couldn't set max memory");
         return ERROR_FAIL;
     }
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 23f27d4..3c306c0 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -377,6 +377,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("tsc_mode",        libxl_tsc_mode),
     ("max_memkb",       MemKB),
     ("target_memkb",    MemKB),
+    ("hv_max_memkb",    MemKB),
     ("video_memkb",     MemKB),
     ("shadow_memkb",    MemKB),
     ("rtc_timeoffset",  uint32),
-- 
1.9.1


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-02 14:05 RFC: QEMU bumping memory limit and domain restore Wei Liu
@ 2015-06-02 14:08 ` Wei Liu
  2015-06-02 15:49   ` Ian Campbell
  2015-06-03 13:22 ` George Dunlap
  1 sibling, 1 reply; 18+ messages in thread
From: Wei Liu @ 2015-06-02 14:08 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, Stefano Stabellini, Andrew Cooper,
	Ian Jackson, dslutz, yanghy

I fat-fingered Andrew's email address. Really CC him this time.

On Tue, Jun 02, 2015 at 03:05:07PM +0100, Wei Liu wrote:
> Previous discussion at [0].
> 
> For the benefit of discussion, we refer to max_memkb inside hypervisor
> as hv_max_memkb (name subject to improvement). That's the maximum number
> of memory a domain can use.
> 
> Libxl doesn't know hv_max_memkb for a domain needs prior to QEMU start-up
> because of optional ROMs etc.
> 
> Libxl doesn't know the hv_max_memkb even after QEMU start-up, because
> there is no mechanism to community between QEMU and libxl. This is an
> area that needs improvement, we've encountered problems in this area
> before.
> 
> QEMU calls xc_domain_setmaxmem to increase hv_max_memkb by N pages. Those
> pages are only accounted in hypervisor. During migration, libxl
> (currently) doesn't extract that value from hypervisor.
> 
> So now the problem is on the remote end:
> 
> 1. Libxl indicates domain needs X pages.
> 2. Domain actually needs X + N pages.
> 3. Remote end tries to write N more pages and fail.
> 
> This behaviour currently doesn't affect normal migration (that you
> transfer libxl JSON to remote, construct a domain, then start QEMU)
> because QEMU won't bump hv_max_memkb again. This is by design and
> reflected in QEMU code.
> 
> This behaviour affects COLO and becomes a bug in that case, because
> secondary VM's QEMU doesn't go through the same start-of-day
> initialisation (Hongyang, correct me if I'm wrong), i.e. no bumping
> hv_max_memkb inside QEMU.
> 
> Andrew plans to embed JSON inside migration v2 and COLO is based on
> migration v2. The bug is fixed if JSON is correct in the first place.
> 
> As COLO is not yet upstream, so this bug is not a blocker for 4.6. But
> it should be fixed for the benefit of COLO.
> 
> So here is a proof of concept patch to record and honour that value
> during migration.  A new field is added in IDL. Note that we don't
> provide xl level config option for it and mandate it to be default value
> during domain creation. This is to prevent libxl user from using it to
> avoid unforeseen repercussions.
> 
> This patch is compiled test only. If we agree this is the way to go I
> will test and submit a proper patch.
> 
> Wei.
> 
> [0] <1428941353-18673-1-git-send-email-dslutz@verizon.com>
> 
> ---8<---
> >From ab9dc179ea4ee26eb88f61f8dad36dd01b63bb6b Mon Sep 17 00:00:00 2001
> From: Wei Liu <wei.liu2@citrix.com>
> Date: Tue, 2 Jun 2015 14:53:20 +0100
> Subject: [PATCH] libxl: record and honour hv_max_memkb
> 
> The new field hv_max_memkb in IDL is used to record "max_memkb" inside
> hypervisor. That reflects the maximum memory a guest can ask for.
> 
> This field is mandated to be default value during guest creation to
> avoid unforeseen repercussions. It's only honour when restoring a guest.
> 
> (XXX compiled test only at this stage)
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/libxl.c         | 17 +++++++++--------
>  tools/libxl/libxl_create.c  |  6 ++++++
>  tools/libxl/libxl_dom.c     |  9 +++++++--
>  tools/libxl/libxl_types.idl |  1 +
>  4 files changed, 23 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 9117b01..72fec8b 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -6614,6 +6614,7 @@ int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
>      GC_INIT(ctx);
>      int rc;
>      libxl__domain_userdata_lock *lock = NULL;
> +    uint64_t hv_max_memkb;
>  
>      CTX_LOCK;
>  
> @@ -6654,6 +6655,7 @@ int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
>          }
>          libxl_uuid_copy(ctx, &d_config->c_info.uuid, &info.uuid);
>          libxl_dominfo_dispose(&info);
> +        hv_max_memkb = info.max_memkb; /* store and use later */
>      }
>  
>      /* Memory limits:
> @@ -6661,17 +6663,15 @@ int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
>       * Currently there are three memory limits:
>       *  1. "target" in xenstore (originally memory= in config file)
>       *  2. "static-max" in xenstore (originally maxmem= in config file)
> -     *  3. "max_memkb" in hypervisor
> -     *
> -     * The third one is not visible and currently managed by
> -     * toolstack. In order to rebuild a domain we only need to have
> -     * "target" and "static-max".
> +     *  3. "max_memkb" in hypervisor (corresponds to hv_max_memkb in
> +     *     idl, not visible to xl level)
>       */
>      {
> -        uint32_t target_memkb = 0, max_memkb = 0;
> +        uint32_t target_memkb = 0, static_max_memkb = 0;
>  
>          /* "target" */
> -        rc = libxl__get_memory_target(gc, domid, &target_memkb, &max_memkb);
> +        rc = libxl__get_memory_target(gc, domid, &target_memkb,
> +                                      &static_max_memkb);
>          if (rc) {
>              LOG(ERROR, "fail to get memory target for domain %d", domid);
>              goto out;
> @@ -6683,7 +6683,8 @@ int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
>          d_config->b_info.target_memkb = target_memkb +
>              d_config->b_info.video_memkb;
>  
> -        d_config->b_info.max_memkb = max_memkb;
> +        d_config->b_info.max_memkb = static_max_memkb;
> +        d_config->b_info.hv_max_memkb = hv_max_memkb;
>      }
>  
>      /* Devices: disk, nic, vtpm, pcidev etc. */
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 86384d2..d35a393 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -813,6 +813,12 @@ static void initiate_domain_create(libxl__egc *egc,
>  
>      domid = 0;
>  
> +    if (restore_fd <= 0 &&
> +        d_config->b_info.hv_max_memkb != LIBXL_MEMKB_DEFAULT) {
> +        LOG(ERROR, "Setting hv_max_memkb when creating domain");
> +        goto error_out;
> +    }
> +
>      if (d_config->c_info.ssid_label) {
>          char *s = d_config->c_info.ssid_label;
>          ret = libxl_flask_context_to_sid(ctx, s, strlen(s),
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index a0c9850..5e33611 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -408,8 +408,13 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
>          }
>      }
>  
> -    if (xc_domain_setmaxmem(ctx->xch, domid, info->target_memkb +
> -        LIBXL_MAXMEM_CONSTANT) < 0) {
> +    /* hv_max_memkb is set in restore path, honour it. */
> +    if (info->hv_max_memkb == LIBXL_MEMKB_DEFAULT)
> +        rc = xc_domain_setmaxmem(ctx->xch, domid, info->target_memkb +
> +                                 LIBXL_MAXMEM_CONSTANT);
> +    else
> +        rc = xc_domain_setmaxmem(ctx->xch, domid, info->hv_max_memkb);
> +    if (rc < 0) {
>          LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "Couldn't set max memory");
>          return ERROR_FAIL;
>      }
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 23f27d4..3c306c0 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -377,6 +377,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>      ("tsc_mode",        libxl_tsc_mode),
>      ("max_memkb",       MemKB),
>      ("target_memkb",    MemKB),
> +    ("hv_max_memkb",    MemKB),
>      ("video_memkb",     MemKB),
>      ("shadow_memkb",    MemKB),
>      ("rtc_timeoffset",  uint32),
> -- 
> 1.9.1


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-02 14:08 ` Wei Liu
@ 2015-06-02 15:49   ` Ian Campbell
  2015-06-02 16:11     ` Andrew Cooper
                       ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Ian Campbell @ 2015-06-02 15:49 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, Andrew Cooper, Ian Jackson, dslutz,
	xen-devel, yanghy

On Tue, 2015-06-02 at 15:08 +0100, Wei Liu wrote:
[...]
> > So here is a proof of concept patch to record and honour that value
> > during migration.  A new field is added in IDL. Note that we don't
> > provide xl level config option for it and mandate it to be default value
> > during domain creation. This is to prevent libxl user from using it to
> > avoid unforeseen repercussions.
> [...]
> > This field is mandated to be default value during guest creation to
> > avoid unforeseen repercussions. It's only honour when restoring a guest.

IMHO this means that the libxl API/IDL is the wrong place for this
value. Only user and/or application serviceable parts belong in the API.

So while I agree that this value needs to be communicated across a
migration, the JSON blob is not the right mechanism for doing so. IOW if
you go down this general path I think you need a new
field/record/whatever in the migration protocol at some layer or other
(if not libxc then at the libxl layer).

To my mind this "actual state" vs "user configured state" is more akin
to the sorts of things which are in the hypervisor save blob or something
like that (nb: This is not a suggestion that it should go there).

IIRC Don also outlined another case, which is
    xl create -p
    xl migrate
    xl unpause

Which might need more thought if any bumping can happen after the
migrate i.e. on unpause?


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-02 15:49   ` Ian Campbell
@ 2015-06-02 16:11     ` Andrew Cooper
  2015-06-02 17:40       ` Wei Liu
  2015-06-02 17:32     ` Wei Liu
  2015-06-03  1:05     ` Yang Hongyang
  2 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2015-06-02 16:11 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: xen-devel, yanghy, Ian Jackson, dslutz, Stefano Stabellini

On 02/06/15 16:49, Ian Campbell wrote:
> On Tue, 2015-06-02 at 15:08 +0100, Wei Liu wrote:
> [...]
>>> So here is a proof of concept patch to record and honour that value
>>> during migration.  A new field is added in IDL. Note that we don't
>>> provide xl level config option for it and mandate it to be default value
>>> during domain creation. This is to prevent libxl user from using it to
>>> avoid unforeseen repercussions.
>> [...]
>>> This field is mandated to be default value during guest creation to
>>> avoid unforeseen repercussions. It's only honour when restoring a guest.
> IMHO this means that the libxl API/IDL is the wrong place for this
> value. Only user and/or application serviceable parts belong in the API.
>
> So while I agree that this value need to be communicated across a
> migration, the JSON blob is not the right mechanism for doing so. IOW if
> you go down this general path I think you need a new
> field/record/whatever in the migration protocol at some layer or other
> (if not libxc then at the libxl layer).
>
> To my mind this "actual state" vs "user configured state" is more akin
> to the sorts of things which is in the hypervisor save blob or something
> like that (nb: This is not a suggestion that it should go there).
>
> IIRC Don also outlined another case, which is
>     xl create -p
>     xl migrate
>     xl unpause
>
> Which might need more thought if any bumping can happen after the
> migrate i.e. on unpause?
>
>

The problem is qemu using set_max_mem.  It should never have done so.

Nothing other than libxl should be using such hypercalls, at which point
libxl's idea of guest memory is accurate and the bug ceases to exist.

~Andrew


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-02 15:49   ` Ian Campbell
  2015-06-02 16:11     ` Andrew Cooper
@ 2015-06-02 17:32     ` Wei Liu
  2015-06-03  1:05     ` Yang Hongyang
  2 siblings, 0 replies; 18+ messages in thread
From: Wei Liu @ 2015-06-02 17:32 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Wei Liu, Stefano Stabellini, Andrew Cooper, Ian Jackson, dslutz,
	xen-devel, yanghy

On Tue, Jun 02, 2015 at 04:49:09PM +0100, Ian Campbell wrote:
> On Tue, 2015-06-02 at 15:08 +0100, Wei Liu wrote:
> [...]
> > > So here is a proof of concept patch to record and honour that value
> > > during migration.  A new field is added in IDL. Note that we don't
> > > provide xl level config option for it and mandate it to be default value
> > > during domain creation. This is to prevent libxl user from using it to
> > > avoid unforeseen repercussions.
> > [...]
> > > This field is mandated to be default value during guest creation to
> > > avoid unforeseen repercussions. It's only honour when restoring a guest.
> 
> IMHO this means that the libxl API/IDL is the wrong place for this
> value. Only user and/or application serviceable parts belong in the API.
> 

I feared the same when I wrote the patch.

> So while I agree that this value need to be communicated across a
> migration, the JSON blob is not the right mechanism for doing so. IOW if
> you go down this general path I think you need a new
> field/record/whatever in the migration protocol at some layer or other
> (if not libxc then at the libxl layer).
> 

There isn't a libxl layer for migration at the moment. The stream is xl
+ libxc. Maybe we should wait till migration v2 is in place.

> To my mind this "actual state" vs "user configured state" is more akin
> to the sorts of things which is in the hypervisor save blob or something
> like that (nb: This is not a suggestion that it should go there).
> 
> IIRC Don also outlined another case, which is
>     xl create -p
>     xl migrate
>     xl unpause
> 
> Which might need more thought if any bumping can happen after the
> migrate i.e. on unpause?
> 

I just tried. It failed at "xl migrate".

Wei.


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-02 16:11     ` Andrew Cooper
@ 2015-06-02 17:40       ` Wei Liu
  2015-06-04 15:28         ` Wei Liu
  0 siblings, 1 reply; 18+ messages in thread
From: Wei Liu @ 2015-06-02 17:40 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Ian Jackson, dslutz,
	xen-devel, yanghy

On Tue, Jun 02, 2015 at 05:11:02PM +0100, Andrew Cooper wrote:
> On 02/06/15 16:49, Ian Campbell wrote:
> > On Tue, 2015-06-02 at 15:08 +0100, Wei Liu wrote:
> > [...]
> >>> So here is a proof of concept patch to record and honour that value
> >>> during migration.  A new field is added in IDL. Note that we don't
> >>> provide xl level config option for it and mandate it to be default value
> >>> during domain creation. This is to prevent libxl user from using it to
> >>> avoid unforeseen repercussions.
> >> [...]
> >>> This field is mandated to be default value during guest creation to
> >>> avoid unforeseen repercussions. It's only honour when restoring a guest.
> > IMHO this means that the libxl API/IDL is the wrong place for this
> > value. Only user and/or application serviceable parts belong in the API.
> >
> > So while I agree that this value need to be communicated across a
> > migration, the JSON blob is not the right mechanism for doing so. IOW if
> > you go down this general path I think you need a new
> > field/record/whatever in the migration protocol at some layer or other
> > (if not libxc then at the libxl layer).
> >
> > To my mind this "actual state" vs "user configured state" is more akin
> > to the sorts of things which is in the hypervisor save blob or something
> > like that (nb: This is not a suggestion that it should go there).
> >
> > IIRC Don also outlined another case, which is
> >     xl create -p
> >     xl migrate
> >     xl unpause
> >
> > Which might need more thought if any bumping can happen after the
> > migrate i.e. on unpause?
> >
> >
> 
> The problem is qemu using set_max_mem.  It should never have done so.
> 
> Nothing other than libxl should be using such hypercalls, at which point
> libxls idea of guest memory is accurate and the bug ceases to exist.
> 

That would be an ideal solution, but it's not going to materialise
any time soon because:

1. Libxl cannot know how much more memory the guest needs prior to QEMU
   start-up.
2. QEMU cannot communicate back the value to libxl because
 2.1 there is no communication channel;
 2.2 there is no libxld to update the config (maybe we can work around
     this by waiting for QEMU during guest creation).

While we work towards that end, a short term solution might be:

1. Make QEMU not call xc_domain_setmaxmem.
2. Provide a field in the IDL and a config option in xl to indicate
   how much more memory is needed. Users can fill in that option as
   they see fit (a rough sketch follows below).

After we get to the ideal solution we can then deprecate that field /
option.
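
As a rough sketch of option 2 (not an implementation; "device_memkb" is
an invented IDL field name), the xc_domain_setmaxmem call in
libxl__build_pre shown in the patch above might then account for the
extra memory itself, so QEMU would no longer need to touch the limit:

    /* Sketch only, assuming a hypothetical b_info field "device_memkb"
     * holding the extra memory the device model is expected to need. */
    uint64_t maxmem_kb = info->target_memkb + LIBXL_MAXMEM_CONSTANT;

    if (info->device_memkb != LIBXL_MEMKB_DEFAULT)
        maxmem_kb += info->device_memkb;

    if (xc_domain_setmaxmem(ctx->xch, domid, maxmem_kb) < 0) {
        LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "Couldn't set max memory");
        return ERROR_FAIL;
    }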

Wei.

> ~Andrew


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-02 15:49   ` Ian Campbell
  2015-06-02 16:11     ` Andrew Cooper
  2015-06-02 17:32     ` Wei Liu
@ 2015-06-03  1:05     ` Yang Hongyang
  2 siblings, 0 replies; 18+ messages in thread
From: Yang Hongyang @ 2015-06-03  1:05 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: xen-devel, Andrew Cooper, Ian Jackson, dslutz, Stefano Stabellini



On 06/02/2015 11:49 PM, Ian Campbell wrote:
> On Tue, 2015-06-02 at 15:08 +0100, Wei Liu wrote:
> [...]
>>> So here is a proof of concept patch to record and honour that value
>>> during migration.  A new field is added in IDL. Note that we don't
>>> provide xl level config option for it and mandate it to be default value
>>> during domain creation. This is to prevent libxl user from using it to
>>> avoid unforeseen repercussions.
>> [...]
>>> This field is mandated to be default value during guest creation to
>>> avoid unforeseen repercussions. It's only honour when restoring a guest.
>
> IMHO this means that the libxl API/IDL is the wrong place for this
> value. Only user and/or application serviceable parts belong in the API.
>
> So while I agree that this value need to be communicated across a
> migration, the JSON blob is not the right mechanism for doing so. IOW if
> you go down this general path I think you need a new
> field/record/whatever in the migration protocol at some layer or other
> (if not libxc then at the libxl layer).
>
> To my mind this "actual state" vs "user configured state" is more akin
> to the sorts of things which is in the hypervisor save blob or something
> like that (nb: This is not a suggestion that it should go there).
>
> IIRC Don also outlined another case, which is
>      xl create -p
>      xl migrate
>      xl unpause

Actually this is what COLO does. On the primary, we must start with -p
and then migrate to ensure the disk is consistent.

>
> Which might need more thought if any bumping can happen after the
> migrate i.e. on unpause?
>
>
> .
>

-- 
Thanks,
Yang.


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-02 14:05 RFC: QEMU bumping memory limit and domain restore Wei Liu
  2015-06-02 14:08 ` Wei Liu
@ 2015-06-03 13:22 ` George Dunlap
  2015-06-03 13:32   ` Andrew Cooper
  2015-06-04  9:14   ` Wei Liu
  1 sibling, 2 replies; 18+ messages in thread
From: George Dunlap @ 2015-06-03 13:22 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ian Campbell, Stefano Stabellini, Ian Jackson, Don Slutz, andrew,
	xen-devel, Yang Hongyang

On Tue, Jun 2, 2015 at 3:05 PM, Wei Liu <wei.liu2@citrix.com> wrote:
> Previous discussion at [0].
>
> For the benefit of discussion, we refer to max_memkb inside hypervisor
> as hv_max_memkb (name subject to improvement). That's the maximum number
> of memory a domain can use.

Why don't we try to use "memory" for virtual RAM that we report to the
guest, and "pages" for what exists inside the hypervisor?  "Pages" is
the term the hypervisor itself uses internally (i.e., set_max_mem()
actually changes a domain's max_pages value).

So in this case both guest memory and option roms are created using
hypervisor pages.

> Libxl doesn't know hv_max_memkb for a domain needs prior to QEMU start-up
> because of optional ROMs etc.

So a translation of this using "memory/pages" terminology would be:

QEMU may need extra pages from Xen to implement option ROMS, and so at
the moment it calls set_max_mem() to increase max_pages so that it can
allocate more pages to the guest.  libxl doesn't know what max_pages a
domain needs prior to qemu start-up.

> Libxl doesn't know the hv_max_memkb even after QEMU start-up, because
> there is no mechanism to community between QEMU and libxl. This is an
> area that needs improvement, we've encountered problems in this area
> before.

[translating]
Libxl doesn't know max_pages  even after qemu start-up, because there
is no mechanism to communicate between qemu and libxl.

> QEMU calls xc_domain_setmaxmem to increase hv_max_memkb by N pages. Those
> pages are only accounted in hypervisor. During migration, libxl
> (currently) doesn't extract that value from hypervisor.

[translating]
qemu calls xc_domain_setmaxmem to increase max_pages by N pages.
Those pages are only accounted for in the hypervisor.  libxl
(currently) does not extract that value from the hypervisor.

> So now the problem is on the remote end:
>
> 1. Libxl indicates domain needs X pages.
> 2. Domain actually needs X + N pages.
> 3. Remote end tries to write N more pages and fail.
>
> This behaviour currently doesn't affect normal migration (that you
> transfer libxl JSON to remote, construct a domain, then start QEMU)
> because QEMU won't bump hv_max_memkb again. This is by design and
> reflected in QEMU code.

I don't understand this paragraph -- does the remote domain actually
need X+N pages or not?  If it does, in what way does this behavior
"not affect normal migration"?

> This behaviour affects COLO and becomes a bug in that case, because
> secondary VM's QEMU doesn't go through the same start-of-day
> initialisation (Hongyang, correct me if I'm wrong), i.e. no bumping
> hv_max_memkb inside QEMU.
>
> Andrew plans to embed JSON inside migration v2 and COLO is based on
> migration v2. The bug is fixed if JSON is correct in the first place.
>
> As COLO is not yet upstream, so this bug is not a blocker for 4.6. But
> it should be fixed for the benefit of COLO.
>
> So here is a proof of concept patch to record and honour that value
> during migration.  A new field is added in IDL. Note that we don't
> provide xl level config option for it and mandate it to be default value
> during domain creation. This is to prevent libxl user from using it to
> avoid unforeseen repercussions.
>
> This patch is compiled test only. If we agree this is the way to go I
> will test and submit a proper patch.

Reading max_pages from Xen and setting it on the far side seems like a
reasonable option.  Is there a reason we can't add a magic XC_SAVE_ID
for v1, like we do for other parameters?
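
For illustration only (the chunk ID value, variable names and placement
are invented here; this is a sketch of the suggestion, not a tested
patch), such a v1 record could look roughly like the other metadata
chunks:

    /* Hypothetical sketch; XC_SAVE_ID_HVM_MAXMEM is not a real ID. */
    #define XC_SAVE_ID_HVM_MAXMEM  -20  /* invented value */

    /* Save side (xc_domain_save): emit the chunk alongside the others. */
    int chunk = XC_SAVE_ID_HVM_MAXMEM;
    uint64_t max_memkb = info.max_memkb;  /* from xc_domain_getinfo() */
    if (write_exact(io_fd, &chunk, sizeof(chunk)) ||
        write_exact(io_fd, &max_memkb, sizeof(max_memkb)))
        goto out;

    /* Restore side (xc_domain_restore), in the chunk-ID switch: */
    case XC_SAVE_ID_HVM_MAXMEM:
        if (read_exact(io_fd, &max_memkb, sizeof(max_memkb)))
            goto out;
        if (xc_domain_setmaxmem(xch, dom, max_memkb) < 0)
            goto out;
        break;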

 -George


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-03 13:22 ` George Dunlap
@ 2015-06-03 13:32   ` Andrew Cooper
  2015-06-03 13:53     ` George Dunlap
  2015-06-04  9:14   ` Wei Liu
  1 sibling, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2015-06-03 13:32 UTC (permalink / raw)
  To: George Dunlap, Wei Liu
  Cc: Ian Campbell, Stefano Stabellini, Ian Jackson, Don Slutz, andrew,
	xen-devel, Yang Hongyang

On 03/06/15 14:22, George Dunlap wrote:
> On Tue, Jun 2, 2015 at 3:05 PM, Wei Liu <wei.liu2@citrix.com> wrote:
>> Previous discussion at [0].
>>
>> For the benefit of discussion, we refer to max_memkb inside hypervisor
>> as hv_max_memkb (name subject to improvement). That's the maximum number
>> of memory a domain can use.
> Why don't we try to use "memory" for virtual RAM that we report to the
> guest, and "pages" for what exists inside the hypervisor?  "Pages" is
> the term the hypervisor itself uses internally (i.e., set_max_mem()
> actually changes a domain's max_pages value).
>
> So in this case both guest memory and option roms are created using
> hypervisor pages.
>
>> Libxl doesn't know hv_max_memkb for a domain needs prior to QEMU start-up
>> because of optional ROMs etc.
> So a translation of this using "memory/pages" terminology would be:
>
> QEMU may need extra pages from Xen to implement option ROMS, and so at
> the moment it calls set_max_mem() to increase max_pages so that it can
> allocate more pages to the guest.  libxl doesn't know what max_pages a
> domain needs prior to qemu start-up.
>
>> Libxl doesn't know the hv_max_memkb even after QEMU start-up, because
>> there is no mechanism to community between QEMU and libxl. This is an
>> area that needs improvement, we've encountered problems in this area
>> before.
> [translating]
> Libxl doesn't know max_pages  even after qemu start-up, because there
> is no mechanism to communicate between qemu and libxl.
>
>> QEMU calls xc_domain_setmaxmem to increase hv_max_memkb by N pages. Those
>> pages are only accounted in hypervisor. During migration, libxl
>> (currently) doesn't extract that value from hypervisor.
> [translating]
> qemu calls xc_domain_setmaxmem to increase max_pages by N pages.
> Those pages are only accounted for in the hypervisor.  libxl
> (currently) does not extract that value from the hypervisor.
>
>> So now the problem is on the remote end:
>>
>> 1. Libxl indicates domain needs X pages.
>> 2. Domain actually needs X + N pages.
>> 3. Remote end tries to write N more pages and fail.
>>
>> This behaviour currently doesn't affect normal migration (that you
>> transfer libxl JSON to remote, construct a domain, then start QEMU)
>> because QEMU won't bump hv_max_memkb again. This is by design and
>> reflected in QEMU code.
> I don't understand this paragraph -- does the remote domain actually
> need X+N pages or not?  If it does, in what way does this behavior
> "not affect normal migration"?
>
>> This behaviour affects COLO and becomes a bug in that case, because
>> secondary VM's QEMU doesn't go through the same start-of-day
>> initialisation (Hongyang, correct me if I'm wrong), i.e. no bumping
>> hv_max_memkb inside QEMU.
>>
>> Andrew plans to embed JSON inside migration v2 and COLO is based on
>> migration v2. The bug is fixed if JSON is correct in the first place.
>>
>> As COLO is not yet upstream, so this bug is not a blocker for 4.6. But
>> it should be fixed for the benefit of COLO.
>>
>> So here is a proof of concept patch to record and honour that value
>> during migration.  A new field is added in IDL. Note that we don't
>> provide xl level config option for it and mandate it to be default value
>> during domain creation. This is to prevent libxl user from using it to
>> avoid unforeseen repercussions.
>>
>> This patch is compiled test only. If we agree this is the way to go I
>> will test and submit a proper patch.
> Reading max_pages from Xen and setting it on the far side seems like a
> reasonable option.

It is the wrong level to fix the bug.  Yes - it will (and does) fix one
manifestation of the bug, but does not solve the problem.

>   Is there a reason we can't add a magic XC_SAVE_ID
> for v1, like we do for other parameters?

Amongst other things, playing with xc_domain_setmaxmem() is liable to
cause a PoD domain to be shot by Xen because the PoD cache was not
adjusted at the same time that maxmem was.

Only libxl is in a position to safely adjust domain memory.

~Andrew


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-03 13:32   ` Andrew Cooper
@ 2015-06-03 13:53     ` George Dunlap
  2015-06-03 16:11       ` Don Slutz
  0 siblings, 1 reply; 18+ messages in thread
From: George Dunlap @ 2015-06-03 13:53 UTC (permalink / raw)
  To: Andrew Cooper, Wei Liu
  Cc: Ian Campbell, Stefano Stabellini, Ian Jackson, Don Slutz, andrew,
	xen-devel, Yang Hongyang

On 06/03/2015 02:32 PM, Andrew Cooper wrote:
> On 03/06/15 14:22, George Dunlap wrote:
>> On Tue, Jun 2, 2015 at 3:05 PM, Wei Liu <wei.liu2@citrix.com> wrote:
>>> Previous discussion at [0].
>>>
>>> For the benefit of discussion, we refer to max_memkb inside hypervisor
>>> as hv_max_memkb (name subject to improvement). That's the maximum number
>>> of memory a domain can use.
>> Why don't we try to use "memory" for virtual RAM that we report to the
>> guest, and "pages" for what exists inside the hypervisor?  "Pages" is
>> the term the hypervisor itself uses internally (i.e., set_max_mem()
>> actually changes a domain's max_pages value).
>>
>> So in this case both guest memory and option roms are created using
>> hypervisor pages.
>>
>>> Libxl doesn't know hv_max_memkb for a domain needs prior to QEMU start-up
>>> because of optional ROMs etc.
>> So a translation of this using "memory/pages" terminology would be:
>>
>> QEMU may need extra pages from Xen to implement option ROMS, and so at
>> the moment it calls set_max_mem() to increase max_pages so that it can
>> allocate more pages to the guest.  libxl doesn't know what max_pages a
>> domain needs prior to qemu start-up.
>>
>>> Libxl doesn't know the hv_max_memkb even after QEMU start-up, because
>>> there is no mechanism to community between QEMU and libxl. This is an
>>> area that needs improvement, we've encountered problems in this area
>>> before.
>> [translating]
>> Libxl doesn't know max_pages  even after qemu start-up, because there
>> is no mechanism to communicate between qemu and libxl.
>>
>>> QEMU calls xc_domain_setmaxmem to increase hv_max_memkb by N pages. Those
>>> pages are only accounted in hypervisor. During migration, libxl
>>> (currently) doesn't extract that value from hypervisor.
>> [translating]
>> qemu calls xc_domain_setmaxmem to increase max_pages by N pages.
>> Those pages are only accounted for in the hypervisor.  libxl
>> (currently) does not extract that value from the hypervisor.
>>
>>> So now the problem is on the remote end:
>>>
>>> 1. Libxl indicates domain needs X pages.
>>> 2. Domain actually needs X + N pages.
>>> 3. Remote end tries to write N more pages and fail.
>>>
>>> This behaviour currently doesn't affect normal migration (that you
>>> transfer libxl JSON to remote, construct a domain, then start QEMU)
>>> because QEMU won't bump hv_max_memkb again. This is by design and
>>> reflected in QEMU code.
>> I don't understand this paragraph -- does the remote domain actually
>> need X+N pages or not?  If it does, in what way does this behavior
>> "not affect normal migration"?
>>
>>> This behaviour affects COLO and becomes a bug in that case, because
>>> secondary VM's QEMU doesn't go through the same start-of-day
>>> initialisation (Hongyang, correct me if I'm wrong), i.e. no bumping
>>> hv_max_memkb inside QEMU.
>>>
>>> Andrew plans to embed JSON inside migration v2 and COLO is based on
>>> migration v2. The bug is fixed if JSON is correct in the first place.
>>>
>>> As COLO is not yet upstream, so this bug is not a blocker for 4.6. But
>>> it should be fixed for the benefit of COLO.
>>>
>>> So here is a proof of concept patch to record and honour that value
>>> during migration.  A new field is added in IDL. Note that we don't
>>> provide xl level config option for it and mandate it to be default value
>>> during domain creation. This is to prevent libxl user from using it to
>>> avoid unforeseen repercussions.
>>>
>>> This patch is compiled test only. If we agree this is the way to go I
>>> will test and submit a proper patch.
>> Reading max_pages from Xen and setting it on the far side seems like a
>> reasonable option.
> 
> It is the wrong level to fix the bug.  Yes - it will (and does) fix one
> manifestation of the bug, but does not solve the problem.
> 
>>   Is there a reason we can't add a magic XC_SAVE_ID
>> for v1, like we do for other parameters?
> 
> Amongst other things, playing with xc_domain_setmaxmem() is liable to
> cause a PoD domain to be shot by Xen because the PoD cache was not
> adjusted at the same time that maxmem was.

As far as I can tell that's completely unrelated.

The PoD target needs to be updated when you change the *balloon driver*
target in xenstore.  libxl_domain_setmaxmem() for instance won't update
the PoD target either.
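
A rough sketch of the distinction, using the public libxc/xenstore
calls (exact libxl internals differ; names below are illustrative):

    #include <stdio.h>
    #include <string.h>
    #include <xenctrl.h>
    #include <xenstore.h>

    /* Raising the cap alone does not touch the PoD target/cache. */
    static void raise_cap(xc_interface *xch, uint32_t domid, uint64_t max_kb)
    {
        xc_domain_setmaxmem(xch, domid, max_kb);
    }

    /* A balloon-target change is what must also move the PoD target
     * (roughly what libxl's memory-target path does for PoD guests). */
    static void set_balloon_target(xc_interface *xch, struct xs_handle *xsh,
                                   uint32_t domid, uint64_t target_kb)
    {
        char path[64], val[32];

        snprintf(path, sizeof(path), "/local/domain/%u/memory/target", domid);
        snprintf(val, sizeof(val), "%llu", (unsigned long long)target_kb);
        xs_write(xsh, XBT_NULL, path, val, strlen(val));
        xc_domain_set_pod_target(xch, domid, target_kb >> 2 /* KiB->pages */,
                                 NULL, NULL, NULL);
    }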

> Only libxl is in a position to safely adjust domain memory.

I was initially of the same mind as you in this matter.  But I think if
you stop thinking about this as "the domain's memory", and start instead
thinking about it as qemu modifying the state of the machine (by, say,
adding option roms), then I think it can be less strict.  The *memory*
(i.e., what is seen by the guest as virtual RAM) is controlled by libxl,
but the *pages* may be manipulated by others (as they are by the balloon
driver, for instance).

Although actually -- I guess that exposes another issue with this: what
if someone calls setmaxmem in libxl?  libxl doesn't know about the extra
N pages that qemu wanted, so it won't factor that into its setmaxmem
calculation.  In particular, qemu's read-modify-write of setmaxmem isn't
idempotent -- there's almost certainly a race if someone were to call
libxl_domain_setmaxmem at the same time qemu was trying to add a few pages.

In any case, as Wei said, you have the problem that by the time qemu
figures out that it wants some extra pages to implement an option rom,
libxl has already determined that its job was complete and walked off.
Changing that requires a more architectural change.

Perhaps joining this with the security deprivileging thing would make
some sense?  I.e., have qemu do no hypercalls at all, but have them done
instead by a "babysitter" process from libxl, which could then update
the domain config state as well.

 -George


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-03 13:53     ` George Dunlap
@ 2015-06-03 16:11       ` Don Slutz
  0 siblings, 0 replies; 18+ messages in thread
From: Don Slutz @ 2015-06-03 16:11 UTC (permalink / raw)
  To: George Dunlap, Andrew Cooper, Wei Liu
  Cc: Ian Campbell, Stefano Stabellini, Ian Jackson, Don Slutz, andrew,
	xen-devel, Yang Hongyang

On 06/03/15 09:53, George Dunlap wrote:
> On 06/03/2015 02:32 PM, Andrew Cooper wrote:
>> On 03/06/15 14:22, George Dunlap wrote:
>>> On Tue, Jun 2, 2015 at 3:05 PM, Wei Liu <wei.liu2@citrix.com> wrote:
>>>> Previous discussion at [0].
>>>>
>>>> For the benefit of discussion, we refer to max_memkb inside hypervisor
>>>> as hv_max_memkb (name subject to improvement). That's the maximum number
>>>> of memory a domain can use.
>>> Why don't we try to use "memory" for virtual RAM that we report to the
>>> guest, and "pages" for what exists inside the hypervisor?  "Pages" is
>>> the term the hypervisor itself uses internally (i.e., set_max_mem()
>>> actually changes a domain's max_pages value).
>>>
>>> So in this case both guest memory and option roms are created using
>>> hypervisor pages.
>>>
>>>> Libxl doesn't know hv_max_memkb for a domain needs prior to QEMU start-up
>>>> because of optional ROMs etc.
>>> So a translation of this using "memory/pages" terminology would be:
>>>
>>> QEMU may need extra pages from Xen to implement option ROMS, and so at
>>> the moment it calls set_max_mem() to increase max_pages so that it can
>>> allocate more pages to the guest.  libxl doesn't know what max_pages a
>>> domain needs prior to qemu start-up.
>>>
>>>> Libxl doesn't know the hv_max_memkb even after QEMU start-up, because
>>>> there is no mechanism to community between QEMU and libxl. This is an
>>>> area that needs improvement, we've encountered problems in this area
>>>> before.
>>> [translating]
>>> Libxl doesn't know max_pages  even after qemu start-up, because there
>>> is no mechanism to communicate between qemu and libxl.

This is not 100% true.  libxl and QEMU do communicate.  Currently QEMU
does not record its changes, but I have not searched all the info you
can get.  Also libxl "knows" what it set maxmem to before QEMU starts,
and it can read (via xc) what it is after QEMU has made all its
changes.  QEMU only responds to commands from libxl at this time.
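
As a sketch of that read-back (the helper name is invented; libxl does
not currently do this), libxl could record the device model's
adjustment once QEMU has finished starting up:

    /* Sketch with an invented helper name: compare the hypervisor's
     * view after QEMU start-up with what libxl set before spawning it. */
    static int record_dm_maxmem_bump(libxl__gc *gc, uint32_t domid,
                                     uint64_t maxmem_before_kb)
    {
        libxl_dominfo info;
        int rc = 0;

        libxl_dominfo_init(&info);
        if (libxl_domain_info(CTX, &info, domid)) {
            rc = ERROR_FAIL;
            goto out;
        }
        LOG(DEBUG, "device model bumped maxmem by %"PRIu64" KiB",
            info.max_memkb - maxmem_before_kb);
        /* A real implementation would store this value, e.g. in the
         * saved libxl_domain_config, under the domain userdata lock. */
    out:
        libxl_dominfo_dispose(&info);
        return rc;
    }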

>>>> QEMU calls xc_domain_setmaxmem to increase hv_max_memkb by N pages. Those
>>>> pages are only accounted in hypervisor. During migration, libxl
>>>> (currently) doesn't extract that value from hypervisor.
>>> [translating]
>>> qemu calls xc_domain_setmaxmem to increase max_pages by N pages.
>>> Those pages are only accounted for in the hypervisor.  libxl
>>> (currently) does not extract that value from the hypervisor.

Yes.

>>>> So now the problem is on the remote end:
>>>>
>>>> 1. Libxl indicates domain needs X pages.
>>>> 2. Domain actually needs X + N pages.
>>>> 3. Remote end tries to write N more pages and fail.
>>>>
>>>> This behaviour currently doesn't affect normal migration (that you
>>>> transfer libxl JSON to remote, construct a domain, then start QEMU)
>>>> because QEMU won't bump hv_max_memkb again. This is by design and
>>>> reflected in QEMU code.
>>> I don't understand this paragraph -- does the remote domain actually
>>> need X+N pages or not?  If it does, in what way does this behavior
>>> "not affect normal migration"?

As far as I know it does.  And it affects "xl save" and "xl restore".

However you need more than 4 e1000 NICs (for example) to see this.

>>>> This behaviour affects COLO and becomes a bug in that case, because
>>>> secondary VM's QEMU doesn't go through the same start-of-day
>>>> initialisation (Hongyang, correct me if I'm wrong), i.e. no bumping
>>>> hv_max_memkb inside QEMU.
>>>>
>>>> Andrew plans to embed JSON inside migration v2 and COLO is based on
>>>> migration v2. The bug is fixed if JSON is correct in the first place.
>>>>
>>>> As COLO is not yet upstream, so this bug is not a blocker for 4.6. But
>>>> it should be fixed for the benefit of COLO.
>>>>
>>>> So here is a proof of concept patch to record and honour that value
>>>> during migration.  A new field is added in IDL. Note that we don't
>>>> provide xl level config option for it and mandate it to be default value
>>>> during domain creation. This is to prevent libxl user from using it to
>>>> avoid unforeseen repercussions.
>>>>
>>>> This patch is compiled test only. If we agree this is the way to go I
>>>> will test and submit a proper patch.
>>> Reading max_pages from Xen and setting it on the far side seems like a
>>> reasonable option.
>> It is the wrong level to fix the bug.  Yes - it will (and does) fix one
>> manifestation of the bug, but does not solve the problem.
>>
>>>   Is there a reason we can't add a magic XC_SAVE_ID
>>> for v1, like we do for other parameters?

Wei Liu failed to refer to:

Date: Tue, 14 Apr 2015 18:06:26 -0400
Subject: [PATCH v2 1/1] tools: Handle xc_maxmem adjustments
Thread-Topic: [PATCH v2 1/1] tools: Handle xc_maxmem adjustments
Thread-Index: AdB2/0pJoAZieMWqTomVEBNbV7iiEA==
Message-ID: <1429049186-27343-1-git-send-email-dslutz@verizon.com>
References: <20150414175437.GB11717@zion.uk.xensource.com>
In-Reply-To: <20150414175437.GB11717@zion.uk.xensource.com>

which is where this was coded up.

>> Amongst other things, playing with xc_domain_setmaxmem() is liable to
>> cause a PoD domain to be shot by Xen because the PoD cache was not
>> adjusted at the same time that maxmem was.
> As far as I can tell that's completely unrelated.
>
> The PoD target needs to be updated when you change the *balloon driver*
> target in xenstore.  libxl_domain_setmaxmem() for instance won't update
> the PoD target either.
>

And xc_maxmem does not change xenstore.

>> Only libxl is in a position to safely adjust domain memory.
> I was initially of the same mind as you in this matter.  But I think if
> you stop thinking about this as "the domain's memory", and start instead
> thinking about it as qemu modifying the state of the machine (by, say,
> adding option roms), then I think it can be less strict.  The *memory*
> (i.e., what is seen by the guest as virtual RAM) is controlled by libxl,
> but the *pages* may be manipulated by others (as they are by the balloon
> driver, for instance).
>
> Although actually -- I guess that exposes another issue with this: what
> if someone calls setmaxmem in libxl?  libxl doesn't know about the extra
> N pages that qemu wanted, so it won't factor that in to its setmaxmem
> calculation.  In particular, qemu's read-modify-write of setmaxmem isn't
> idempotent -- there's almost certainly a race if someone were to call
> libxl_domain_setmaxmem at the same time qemu was trying to add a few pages.

There have been changes in this area.  libxl_setmem now reads,
modifies and writes.

> In any case, as Wei said, you have the problem that by the time qemu
> figures out that it wants some extra pages to implement an option rom,
> libxl has already determined that its job was complete and walked off.

Not in my testing.  libxl wants to get the VNC "display" that QEMU
configured, which happens after QEMU has changed maxmem.

> Changing that requires a more architectural change.
>
> Perhaps joining this with the security deprivileging thing would make
> some sense?  I.e., have qemu do no hypercalls at all, but have them done
> instead by a "babysitter" process from libxl, which could then update
> the domain config state as well.

Not sure.
    -Don Slutz

>  -George
>


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-03 13:22 ` George Dunlap
  2015-06-03 13:32   ` Andrew Cooper
@ 2015-06-04  9:14   ` Wei Liu
  2015-06-04  9:26     ` Ian Campbell
  1 sibling, 1 reply; 18+ messages in thread
From: Wei Liu @ 2015-06-04  9:14 UTC (permalink / raw)
  To: George Dunlap
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Ian Jackson,
	Don Slutz, andrew, xen-devel, Yang Hongyang

On Wed, Jun 03, 2015 at 02:22:25PM +0100, George Dunlap wrote:
> On Tue, Jun 2, 2015 at 3:05 PM, Wei Liu <wei.liu2@citrix.com> wrote:
> > Previous discussion at [0].
> >
> > For the benefit of discussion, we refer to max_memkb inside hypervisor
> > as hv_max_memkb (name subject to improvement). That's the maximum number
> > of memory a domain can use.
> 
> Why don't we try to use "memory" for virtual RAM that we report to the
> guest, and "pages" for what exists inside the hypervisor?  "Pages" is
> the term the hypervisor itself uses internally (i.e., set_max_mem()
> actually changes a domain's max_pages value).
> 
> So in this case both guest memory and option roms are created using
> hypervisor pages.
> 
> > Libxl doesn't know hv_max_memkb for a domain needs prior to QEMU start-up
> > because of optional ROMs etc.
> 
> So a translation of this using "memory/pages" terminology would be:
> 
> QEMU may need extra pages from Xen to implement option ROMS, and so at
> the moment it calls set_max_mem() to increase max_pages so that it can
> allocate more pages to the guest.  libxl doesn't know what max_pages a
> domain needs prior to qemu start-up.
> 
> > Libxl doesn't know the hv_max_memkb even after QEMU start-up, because
> > there is no mechanism to community between QEMU and libxl. This is an
> > area that needs improvement, we've encountered problems in this area
> > before.
> 
> [translating]
> Libxl doesn't know max_pages  even after qemu start-up, because there
> is no mechanism to communicate between qemu and libxl.
> 
> > QEMU calls xc_domain_setmaxmem to increase hv_max_memkb by N pages. Those
> > pages are only accounted in hypervisor. During migration, libxl
> > (currently) doesn't extract that value from hypervisor.
> 
> [translating]
> qemu calls xc_domain_setmaxmem to increase max_pages by N pages.
> Those pages are only accounted for in the hypervisor.  libxl
> (currently) does not extract that value from the hypervisor.
> 
> > So now the problem is on the remote end:
> >
> > 1. Libxl indicates domain needs X pages.
> > 2. Domain actually needs X + N pages.
> > 3. Remote end tries to write N more pages and fail.
> >
> > This behaviour currently doesn't affect normal migration (that you
> > transfer libxl JSON to remote, construct a domain, then start QEMU)
> > because QEMU won't bump hv_max_memkb again. This is by design and
> > reflected in QEMU code.
> 
> I don't understand this paragraph -- does the remote domain actually
> need X+N pages or not?  If it does, in what way does this behavior
> "not affect normal migration"?
> 

I was wrong. I don't recollect how I came to that conclusion. It does
affect normal migration.

> > This behaviour affects COLO and becomes a bug in that case, because
> > secondary VM's QEMU doesn't go through the same start-of-day
> > initialisation (Hongyang, correct me if I'm wrong), i.e. no bumping
> > hv_max_memkb inside QEMU.
> >
> > Andrew plans to embed JSON inside migration v2 and COLO is based on
> > migration v2. The bug is fixed if JSON is correct in the first place.
> >
> > As COLO is not yet upstream, so this bug is not a blocker for 4.6. But
> > it should be fixed for the benefit of COLO.
> >
> > So here is a proof of concept patch to record and honour that value
> > during migration.  A new field is added in IDL. Note that we don't
> > provide xl level config option for it and mandate it to be default value
> > during domain creation. This is to prevent libxl user from using it to
> > avoid unforeseen repercussions.
> >
> > This patch is compiled test only. If we agree this is the way to go I
> > will test and submit a proper patch.
> 
> Reading max_pages from Xen and setting it on the far side seems like a
> reasonable option.  Is there a reason we can't add a magic XC_SAVE_ID

Yes. That's the correct behaviour we want to have. The question is
where we should put that value and when to set it.

> for v1, like we do for other parameters?
> 

The main objection is that we shouldn't call xc_domain_setmaxmem in the
middle of a migration stream.

Wei.

>  -George


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-04  9:14   ` Wei Liu
@ 2015-06-04  9:26     ` Ian Campbell
  2015-06-04  9:32       ` Andrew Cooper
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Campbell @ 2015-06-04  9:26 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Ian Jackson, Don Slutz,
	andrew, xen-devel, Yang Hongyang

On Thu, 2015-06-04 at 10:14 +0100, Wei Liu wrote:
> The main objection is that we shouldn't call xc_domain_setmaxmem in the
> middle of a migration stream.

In the middle of an _xc_ migration stream.

This seems like the sort of thing it would be OK to have in a (to be
introduced) libxl stream (which would itself contain the xc stream as a
data item).

I think we are expecting such a thing to be introduced as part of the
libxl side of migration v2, aren't we?

Ian.


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-04  9:26     ` Ian Campbell
@ 2015-06-04  9:32       ` Andrew Cooper
  2015-06-04  9:46         ` Ian Campbell
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2015-06-04  9:32 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Ian Jackson, Don Slutz,
	andrew, xen-devel, Yang Hongyang

On 04/06/15 10:26, Ian Campbell wrote:
> On Thu, 2015-06-04 at 10:14 +0100, Wei Liu wrote:
>> The main objection is that we shouldn't call xc_domain_setmaxmem in the
>> middle of a migration stream.
> In the middle of an _xc_ migration stream.
>
> This seems like the sort of thing it would be OK to have in a (to be
> introduced) libxl stream (which would itself contain the xc stream as a
> data item).
>
> I think we are expecting such a thing to be introduced as part of the
> libxl side of migration v2, aren't we?

No.  libxl migration v2 will not be affecting the behaviour here.

The only reasonable way I see this being fixed is for the libxl json
blob to contain the correct size of the VM, and for
libxl_domain_create() to make an appropriately sized VM in the first place.

One extension which libxl migration v2 will bring is the ability to send
a new json blob later in the stream, but such a fix still requires the
json blob to have the correct size in it.

~Andrew


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-04  9:32       ` Andrew Cooper
@ 2015-06-04  9:46         ` Ian Campbell
  0 siblings, 0 replies; 18+ messages in thread
From: Ian Campbell @ 2015-06-04  9:46 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Wei Liu, Stefano Stabellini, George Dunlap, Ian Jackson,
	Don Slutz, andrew, xen-devel, Yang Hongyang

On Thu, 2015-06-04 at 10:32 +0100, Andrew Cooper wrote:
> On 04/06/15 10:26, Ian Campbell wrote:
> > On Thu, 2015-06-04 at 10:14 +0100, Wei Liu wrote:
> >> The main objection is that we shouldn't call xc_domain_setmaxmem in the
> >> middle of a migration stream.
> > In the middle of an _xc_ migration stream.
> >
> > This seems like the sort of thing it would be OK to have in a (to be
> > introduced) libxl stream (which would itself contain the xc stream as a
> > data item).
> >
> > I think we are expecting such a thing to be introduced as part of the
> > libxl side of migration v2, aren't we?
> 
> No.  libxl migration v2 will not be affecting the behaviour here.

By "such a thing" I was referring to "a libxl stream format", not this
piece of data specifically.

This stream format however will be extensible (I hope) and therefore
could accommodate state information which differs from the domain
configuration.

> The only reasonable way I see this being fixed is for the libxl json
> blob

For _a_ libxl json blob, not necessarily the existing one, which is the
user's domain configuration information.

IOW it doesn't have to be the existing libxl_domain_config blob. (Nor
does it really need to be json, but whatever)

>  to contain the correct size of the VM, and for
> libxl_domain_create() to make an appropriately sized VM in the first place.
> 
> One extension which libxl migration v2 will bring is the ability to send
> a new json blob later in the stream, but such a fix still requires the
> json blob to have the correct size in it.
> 
> ~Andrew


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-02 17:40       ` Wei Liu
@ 2015-06-04 15:28         ` Wei Liu
  2015-06-04 15:49           ` Ian Campbell
  2015-06-05 13:09           ` Wei Liu
  0 siblings, 2 replies; 18+ messages in thread
From: Wei Liu @ 2015-06-04 15:28 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Ian Jackson, dslutz,
	xen-devel, yanghy

On Tue, Jun 02, 2015 at 06:40:56PM +0100, Wei Liu wrote:
> On Tue, Jun 02, 2015 at 05:11:02PM +0100, Andrew Cooper wrote:
> > On 02/06/15 16:49, Ian Campbell wrote:
> > > On Tue, 2015-06-02 at 15:08 +0100, Wei Liu wrote:
> > > [...]
> > >>> So here is a proof of concept patch to record and honour that value
> > >>> during migration.  A new field is added in IDL. Note that we don't
> > >>> provide xl level config option for it and mandate it to be default value
> > >>> during domain creation. This is to prevent libxl user from using it to
> > >>> avoid unforeseen repercussions.
> > >> [...]
> > >>> This field is mandated to be default value during guest creation to
> > >>> avoid unforeseen repercussions. It's only honour when restoring a guest.
> > > IMHO this means that the libxl API/IDL is the wrong place for this
> > > value. Only user and/or application serviceable parts belong in the API.
> > >
> > > So while I agree that this value need to be communicated across a
> > > migration, the JSON blob is not the right mechanism for doing so. IOW if
> > > you go down this general path I think you need a new
> > > field/record/whatever in the migration protocol at some layer or other
> > > (if not libxc then at the libxl layer).
> > >
> > > To my mind this "actual state" vs "user configured state" is more akin
> > > to the sorts of things which is in the hypervisor save blob or something
> > > like that (nb: This is not a suggestion that it should go there).
> > >
> > > IIRC Don also outlined another case, which is
> > >     xl create -p
> > >     xl migrate
> > >     xl unpause
> > >
> > > Which might need more thought if any bumping can happen after the
> > > migrate i.e. on unpause?
> > >
> > >
> > 
> > The problem is qemu using set_max_mem.  It should never have done so.
> > 
> > Nothing other than libxl should be using such hypercalls, at which point
> > libxl's idea of guest memory is accurate and the bug ceases to exist.
> > 
> 
> That would be an ideal solution, but it's not going to materialise
> any time soon because:
> 
> 1. Libxl cannot know how much more memory the guest needs prior to QEMU
>    start-up.
> 2. QEMU cannot communicate back the value to libxl because
>  2.1 there is no communication channel;
>  2.2 there is no libxld to update the config (maybe we can work around
>      this by waiting for QEMU during guest creation).
> 
> While we work towards that end, a short term solution might be:
> 
> 1. Make QEMU not call xc_domain_setmaxmem.
> 2. Provide a field in IDL and a config option in xl to indicate how
>    much more memory is needed. User can fill in that option as he / she
>    sees fit.
> 
> After we get to the ideal solution we can then deprecate that field /
> option.
> 

Here is an alternative we came up with during our IRL discussion:

1. QEMU writes the size of the additional memory to xenstore.
2. Libxl records that in the toolstack save path.
3. The remote end calls xc_domain_setmaxmem in the toolstack restore path
   (see the sketch below).
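
Roughly, in code (the xenstore node, handle names and exact hook points
are assumptions, not an agreed interface):

    /* (1) In QEMU, after its xc_domain_setmaxmem() calls, record the
     *     delta it added (xsh being QEMU's xenstore handle). */
    char path[80], val[32];
    snprintf(path, sizeof(path),
             "/local/domain/0/device-model/%u/qemu-extra-memkb", domid);
    snprintf(val, sizeof(val), "%"PRIu64, extra_memkb);
    xs_write(xsh, XBT_NULL, path, val, strlen(val));

    /* (2) In libxl's toolstack save path, read the node and put the
     *     value into the toolstack record that already travels with
     *     the stream. */
    const char *s = libxl__xs_read(gc, XBT_NULL,
            GCSPRINTF("/local/domain/0/device-model/%u/qemu-extra-memkb",
                      domid));
    uint64_t extra = s ? strtoull(s, NULL, 10) : 0;

    /* (3) In libxl's toolstack restore path on the remote end, bump
     *     maxmem before the guest runs. */
    if (extra)
        xc_domain_setmaxmem(CTX->xch, domid,
                            d_config->b_info.max_memkb
                            + LIBXL_MAXMEM_CONSTANT + extra);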

Wei.

> Wei.
> 
> > ~Andrew


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-04 15:28         ` Wei Liu
@ 2015-06-04 15:49           ` Ian Campbell
  2015-06-05 13:09           ` Wei Liu
  1 sibling, 0 replies; 18+ messages in thread
From: Ian Campbell @ 2015-06-04 15:49 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, Andrew Cooper, Ian Jackson, dslutz,
	xen-devel, yanghy

On Thu, 2015-06-04 at 16:28 +0100, Wei Liu wrote:
> An alternative we came up with during our IRL discussion.
> 
> 1. QEMU writes the size of additional memory in xenstore.

The above is orthogonal to the below, I think. Your existing patch could be
reimplemented in terms of the below, while the above is potentially a
separate piece of work.

> 2. Libxl record that in toolstack save path.
> 3. Remote end calls xc_domain_setmaxmem in toolstack restore path.
> 
> Wei.
> 
> > Wei.
> > 
> > > ~Andrew


* Re: RFC: QEMU bumping memory limit and domain restore
  2015-06-04 15:28         ` Wei Liu
  2015-06-04 15:49           ` Ian Campbell
@ 2015-06-05 13:09           ` Wei Liu
  1 sibling, 0 replies; 18+ messages in thread
From: Wei Liu @ 2015-06-05 13:09 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Ian Jackson, dslutz,
	xen-devel, yanghy

On Thu, Jun 04, 2015 at 04:28:00PM +0100, Wei Liu wrote:
[...]
> 2. Libxl record that in toolstack save path.
> 3. Remote end calls xc_domain_setmaxmem in toolstack restore path.

Unfortunately, toolstack restore is called after
libxl__xc_domain_restore, so it cannot be used to solve our problem.

Wei.

> 
> Wei.
> 
> > Wei.
> > 
> > > ~Andrew


end of thread

Thread overview: 18+ messages
2015-06-02 14:05 RFC: QEMU bumping memory limit and domain restore Wei Liu
2015-06-02 14:08 ` Wei Liu
2015-06-02 15:49   ` Ian Campbell
2015-06-02 16:11     ` Andrew Cooper
2015-06-02 17:40       ` Wei Liu
2015-06-04 15:28         ` Wei Liu
2015-06-04 15:49           ` Ian Campbell
2015-06-05 13:09           ` Wei Liu
2015-06-02 17:32     ` Wei Liu
2015-06-03  1:05     ` Yang Hongyang
2015-06-03 13:22 ` George Dunlap
2015-06-03 13:32   ` Andrew Cooper
2015-06-03 13:53     ` George Dunlap
2015-06-03 16:11       ` Don Slutz
2015-06-04  9:14   ` Wei Liu
2015-06-04  9:26     ` Ian Campbell
2015-06-04  9:32       ` Andrew Cooper
2015-06-04  9:46         ` Ian Campbell
