* [PATCH 0 of 4] libxc: avoid spurious error logging on restore/migrate
@ 2010-09-03 13:06 Ian Campbell
2010-09-03 13:06 ` [PATCH 1 of 4] libxc: logger: add newline when progress is complete Ian Campbell
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Ian Campbell @ 2010-09-03 13:06 UTC (permalink / raw)
To: xen-devel; +Cc: Brendan Cully
This series replaces "libxc: succeed silently on restore" from yesterday.
As well as adding an explicit "final checkpoint" notification chunk it
also includes some tweaks progress logging to be more pleasing and a
fix to xl so it does not re-run the bootloader on restore.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1 of 4] libxc: logger: add newline when progress is complete
2010-09-03 13:06 [PATCH 0 of 4] libxc: avoid spurious error logging on restore/migrate Ian Campbell
@ 2010-09-03 13:06 ` Ian Campbell
2010-09-03 13:06 ` [PATCH 2 of 4] libxc: use a switch statement in xc_domain_restore.c::pagebuf_get_one Ian Campbell
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2010-09-03 13:06 UTC (permalink / raw)
To: xen-devel; +Cc: Brendan Cully
# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1283519032 -3600
# Node ID 75d86afe717e84ad04b0b16190b182553980e0c7
# Parent 9eb7b1ebaaf813eedac5b1941b03b8c999e12b61
libxc: logger: add newline when progress is complete
In xc_domain_save ensure that we signal completion at the end of each
batch.
This ensures that the next logged line starts on a new line. e.g. instead of:
Savefile contains xl domain config
xc: Saving memory: iter 3 (last sent 0 skipped 0): 0/32768 0%migration target: Transfer complete, requesting permission to start domain.
migration sender: Target has acknowledged transfer.
what is desired is:
Savefile contains xl domain config
xc: Saving memory: iter 0 (last sent 0 skipped 0): 32768/32768 100%
xc: Saving memory: iter 1 (last sent 32576 skipped 192): 32768/32768 100%
xc: Saving memory: iter 2 (last sent 217 skipped 0): 32768/32768 100%
xc: Saving memory: iter 3 (last sent 0 skipped 0): 32768/32768 100%
migration target: Transfer complete, requesting permission to start domain.
migration sender: Target has acknowledged transfer.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
diff -r 9eb7b1ebaaf8 -r 75d86afe717e tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c Fri Sep 03 13:38:36 2010 +0100
+++ b/tools/libxc/xc_domain_save.c Fri Sep 03 14:03:52 2010 +0100
@@ -1461,6 +1461,8 @@ int xc_domain_save(xc_interface *xch, in
skip:
+ xc_report_progress_step(xch, dinfo->p2m_size, dinfo->p2m_size);
+
total_sent += sent_this_iter;
if ( last_iter )
diff -r 9eb7b1ebaaf8 -r 75d86afe717e tools/libxc/xtl_logger_stdio.c
--- a/tools/libxc/xtl_logger_stdio.c Fri Sep 03 13:38:36 2010 +0100
+++ b/tools/libxc/xtl_logger_stdio.c Fri Sep 03 14:03:52 2010 +0100
@@ -108,9 +108,12 @@ static void stdiostream_progress(struct
if (lg->progress_erase_len)
putc('\r', lg->f);
- newpel = fprintf(lg->f, "%s%s" "%s: %lu/%lu %3d%%",
+ lg->progress_last_percent = percent;
+
+ newpel = fprintf(lg->f, "%s%s" "%s: %lu/%lu %3d%%%s",
context?context:"", context?": ":"",
- doing_what, done, total, percent);
+ doing_what, done, total, percent,
+ done == total ? "\n" : "");
extra_erase = lg->progress_erase_len - newpel;
if (extra_erase > 0)
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 2 of 4] libxc: use a switch statement in xc_domain_restore.c::pagebuf_get_one
2010-09-03 13:06 [PATCH 0 of 4] libxc: avoid spurious error logging on restore/migrate Ian Campbell
2010-09-03 13:06 ` [PATCH 1 of 4] libxc: logger: add newline when progress is complete Ian Campbell
@ 2010-09-03 13:06 ` Ian Campbell
2010-09-03 13:06 ` [PATCH 3 of 4] libxc: provide notification of final checkpoint to restore end Ian Campbell
2010-09-03 13:06 ` [PATCH 4 of 4] xl: do not run bootloader on restore Ian Campbell
3 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2010-09-03 13:06 UTC (permalink / raw)
To: xen-devel; +Cc: Brendan Cully
# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1283519033 -3600
# Node ID de66ba50997cd8670fceffa7b01b50704f82edc7
# Parent 75d86afe717e84ad04b0b16190b182553980e0c7
libxc: use a switch statement in xc_domain_restore.c::pagebuf_get_one.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
diff -r 75d86afe717e -r de66ba50997c tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c Fri Sep 03 14:03:52 2010 +0100
+++ b/tools/libxc/xc_domain_restore.c Fri Sep 03 14:03:53 2010 +0100
@@ -680,14 +680,18 @@ static int pagebuf_get_one(xc_interface
// DPRINTF("reading batch of %d pages\n", count);
- if (!count) {
+ switch ( count )
+ {
+ case 0:
// DPRINTF("Last batch read\n");
return 0;
- } else if (count == XC_SAVE_ID_ENABLE_VERIFY_MODE) {
+
+ case XC_SAVE_ID_ENABLE_VERIFY_MODE:
DPRINTF("Entering page verify mode\n");
buf->verify = 1;
return pagebuf_get_one(xch, ctx, buf, fd, dom);
- } else if (count == XC_SAVE_ID_VCPU_INFO) {
+
+ case XC_SAVE_ID_VCPU_INFO:
buf->new_ctxt_format = 1;
if ( RDEXACT(fd, &buf->max_vcpu_id, sizeof(buf->max_vcpu_id)) ||
buf->max_vcpu_id >= 64 || RDEXACT(fd, &buf->vcpumap,
@@ -697,7 +701,8 @@ static int pagebuf_get_one(xc_interface
}
// DPRINTF("Max VCPU ID: %d, vcpumap: %llx\n", buf->max_vcpu_id, buf->vcpumap);
return pagebuf_get_one(xch, ctx, buf, fd, dom);
- } else if (count == XC_SAVE_ID_HVM_IDENT_PT) {
+
+ case XC_SAVE_ID_HVM_IDENT_PT:
/* Skip padding 4 bytes then read the EPT identity PT location. */
if ( RDEXACT(fd, &buf->identpt, sizeof(uint32_t)) ||
RDEXACT(fd, &buf->identpt, sizeof(uint64_t)) )
@@ -707,7 +712,8 @@ static int pagebuf_get_one(xc_interface
}
// DPRINTF("EPT identity map address: %llx\n", buf->identpt);
return pagebuf_get_one(xch, ctx, buf, fd, dom);
- } else if ( count == XC_SAVE_ID_HVM_VM86_TSS ) {
+
+ case XC_SAVE_ID_HVM_VM86_TSS:
/* Skip padding 4 bytes then read the vm86 TSS location. */
if ( RDEXACT(fd, &buf->vm86_tss, sizeof(uint32_t)) ||
RDEXACT(fd, &buf->vm86_tss, sizeof(uint64_t)) )
@@ -717,21 +723,24 @@ static int pagebuf_get_one(xc_interface
}
// DPRINTF("VM86 TSS location: %llx\n", buf->vm86_tss);
return pagebuf_get_one(xch, ctx, buf, fd, dom);
- } else if ( count == XC_SAVE_ID_TMEM ) {
+
+ case XC_SAVE_ID_TMEM:
DPRINTF("xc_domain_restore start tmem\n");
if ( xc_tmem_restore(xch, dom, fd) ) {
PERROR("error reading/restoring tmem");
return -1;
}
return pagebuf_get_one(xch, ctx, buf, fd, dom);
- }
- else if ( count == XC_SAVE_ID_TMEM_EXTRA ) {
+
+ case XC_SAVE_ID_TMEM_EXTRA:
if ( xc_tmem_restore_extra(xch, dom, fd) ) {
PERROR("error reading/restoring tmem extra");
return -1;
}
return pagebuf_get_one(xch, ctx, buf, fd, dom);
- } else if ( count == XC_SAVE_ID_TSC_INFO ) {
+
+ case XC_SAVE_ID_TSC_INFO:
+ {
uint32_t tsc_mode, khz, incarn;
uint64_t nsec;
if ( RDEXACT(fd, &tsc_mode, sizeof(uint32_t)) ||
@@ -743,7 +752,9 @@ static int pagebuf_get_one(xc_interface
return -1;
}
return pagebuf_get_one(xch, ctx, buf, fd, dom);
- } else if (count == XC_SAVE_ID_HVM_CONSOLE_PFN ) {
+ }
+
+ case XC_SAVE_ID_HVM_CONSOLE_PFN :
/* Skip padding 4 bytes then read the console pfn location. */
if ( RDEXACT(fd, &buf->console_pfn, sizeof(uint32_t)) ||
RDEXACT(fd, &buf->console_pfn, sizeof(uint64_t)) )
@@ -753,10 +764,14 @@ static int pagebuf_get_one(xc_interface
}
// DPRINTF("console pfn location: %llx\n", buf->console_pfn);
return pagebuf_get_one(xch, ctx, buf, fd, dom);
- } else if ( (count > MAX_BATCH_SIZE) || (count < 0) ) {
- ERROR("Max batch size exceeded (%d). Giving up.", count);
- errno = EMSGSIZE;
- return -1;
+
+ default:
+ if ( (count > MAX_BATCH_SIZE) || (count < 0) ) {
+ ERROR("Max batch size exceeded (%d). Giving up.", count);
+ errno = EMSGSIZE;
+ return -1;
+ }
+ break;
}
oldcount = buf->nr_pages;
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 3 of 4] libxc: provide notification of final checkpoint to restore end
2010-09-03 13:06 [PATCH 0 of 4] libxc: avoid spurious error logging on restore/migrate Ian Campbell
2010-09-03 13:06 ` [PATCH 1 of 4] libxc: logger: add newline when progress is complete Ian Campbell
2010-09-03 13:06 ` [PATCH 2 of 4] libxc: use a switch statement in xc_domain_restore.c::pagebuf_get_one Ian Campbell
@ 2010-09-03 13:06 ` Ian Campbell
2010-09-03 17:46 ` Ian Jackson
2010-09-03 19:58 ` Brendan Cully
2010-09-03 13:06 ` [PATCH 4 of 4] xl: do not run bootloader on restore Ian Campbell
3 siblings, 2 replies; 8+ messages in thread
From: Ian Campbell @ 2010-09-03 13:06 UTC (permalink / raw)
To: xen-devel; +Cc: Brendan Cully
# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1283519033 -3600
# Node ID 684cfeffdb1b4bacd736bc05ae26211cb91833df
# Parent de66ba50997cd8670fceffa7b01b50704f82edc7
libxc: provide notification of final checkpoint to restore end
When the restore code sees this notification it will restore the
currently in-progress checkpoint when it completes.
This allows the restore end to finish up without waiting for a
spurious timeout on the receive fd and thereby avoids unnecessary
error logging in the case of a successful migration or restore.
In the normal migration or restore case the first checkpoint is always
the last. For a rolling checkpoint (such as Remus) the notification is
currently unused but could be used in the future for example to
provide a controlled failover for reasons other than error
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
diff -r de66ba50997c -r 684cfeffdb1b tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c Fri Sep 03 14:03:53 2010 +0100
+++ b/tools/libxc/xc_domain_restore.c Fri Sep 03 14:03:53 2010 +0100
@@ -42,6 +42,7 @@ struct restore_ctx {
xen_pfn_t *p2m; /* A table mapping each PFN to its new MFN. */
xen_pfn_t *p2m_batch; /* A table of P2M mappings in the current region. */
int completed; /* Set when a consistent image is available */
+ int last_checkpoint; /* Set when we should commit to the current checkpoint when it completes. */
struct domain_info_context dinfo;
};
@@ -765,6 +766,11 @@ static int pagebuf_get_one(xc_interface
// DPRINTF("console pfn location: %llx\n", buf->console_pfn);
return pagebuf_get_one(xch, ctx, buf, fd, dom);
+ case XC_SAVE_ID_LAST_CHECKPOINT:
+ ctx->last_checkpoint = 1;
+ // DPRINTF("last checkpoint indication received");
+ return pagebuf_get_one(xch, ctx, buf, fd, dom);
+
default:
if ( (count > MAX_BATCH_SIZE) || (count < 0) ) {
ERROR("Max batch size exceeded (%d). Giving up.", count);
@@ -1296,10 +1302,23 @@ int xc_domain_restore(xc_interface *xch,
goto out;
}
ctx->completed = 1;
- /* shift into nonblocking mode for the remainder */
- if ( (flags = fcntl(io_fd, F_GETFL,0)) < 0 )
- flags = 0;
- fcntl(io_fd, F_SETFL, flags | O_NONBLOCK);
+
+ /*
+ * If more checkpoints are expected then shift into
+ * nonblocking mode for the remainder.
+ */
+ if ( !ctx->last_checkpoint )
+ {
+ if ( (flags = fcntl(io_fd, F_GETFL,0)) < 0 )
+ flags = 0;
+ fcntl(io_fd, F_SETFL, flags | O_NONBLOCK);
+ }
+ }
+
+ if ( ctx->last_checkpoint )
+ {
+ // DPRINTF("Last checkpoint, finishing\n");
+ goto finish;
}
// DPRINTF("Buffered checkpoint\n");
diff -r de66ba50997c -r 684cfeffdb1b tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c Fri Sep 03 14:03:53 2010 +0100
+++ b/tools/libxc/xc_domain_save.c Fri Sep 03 14:03:53 2010 +0100
@@ -1616,6 +1616,20 @@ int xc_domain_save(xc_interface *xch, in
}
}
+ if ( !callbacks->checkpoint )
+ {
+ /*
+ * If this is not a checkpointed save then this must be the first and
+ * last checkpoint.
+ */
+ i = XC_SAVE_ID_LAST_CHECKPOINT;
+ if ( wrexact(io_fd, &i, sizeof(int)) )
+ {
+ PERROR("Error when writing last checkpoint chunk");
+ goto out;
+ }
+ }
+
/* Zero terminate */
i = 0;
if ( wrexact(io_fd, &i, sizeof(int)) )
diff -r de66ba50997c -r 684cfeffdb1b tools/libxc/xg_save_restore.h
--- a/tools/libxc/xg_save_restore.h Fri Sep 03 14:03:53 2010 +0100
+++ b/tools/libxc/xg_save_restore.h Fri Sep 03 14:03:53 2010 +0100
@@ -131,6 +131,7 @@
#define XC_SAVE_ID_TMEM_EXTRA -6
#define XC_SAVE_ID_TSC_INFO -7
#define XC_SAVE_ID_HVM_CONSOLE_PFN -8 /* (HVM-only) */
+#define XC_SAVE_ID_LAST_CHECKPOINT -9 /* Commit to restoring after completion of current iteration. */
/*
** We process save/restore/migrate in batches of pages; the below
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 4 of 4] xl: do not run bootloader on restore
2010-09-03 13:06 [PATCH 0 of 4] libxc: avoid spurious error logging on restore/migrate Ian Campbell
` (2 preceding siblings ...)
2010-09-03 13:06 ` [PATCH 3 of 4] libxc: provide notification of final checkpoint to restore end Ian Campbell
@ 2010-09-03 13:06 ` Ian Campbell
3 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2010-09-03 13:06 UTC (permalink / raw)
To: xen-devel; +Cc: Brendan Cully
# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1283519045 -3600
# Node ID 772b3847d4bcc2a8e12cd7d022d76528b816c702
# Parent 684cfeffdb1b4bacd736bc05ae26211cb91833df
xl: do not run bootloader on restore.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
diff -r 684cfeffdb1b -r 772b3847d4bc tools/libxl/xl_cmdimpl.c
--- a/tools/libxl/xl_cmdimpl.c Fri Sep 03 14:03:53 2010 +0100
+++ b/tools/libxl/xl_cmdimpl.c Fri Sep 03 14:04:05 2010 +0100
@@ -1411,10 +1411,12 @@ start:
*/
dom_info->console_autoconnect = 0;
- ret = libxl_run_bootloader(&ctx, &d_config.b_info, d_config.num_disks > 0 ? &d_config.disks[0] : NULL, domid);
- if (ret) {
- fprintf(stderr, "failed to run bootloader: %d\n", ret);
- goto error_out;
+ if (!restore_file) {
+ ret = libxl_run_bootloader(&ctx, &d_config.b_info, d_config.num_disks > 0 ? &d_config.disks[0] : NULL, domid);
+ if (ret) {
+ fprintf(stderr, "failed to run bootloader: %d\n", ret);
+ goto error_out;
+ }
}
if (!restore_file || !need_daemon) {
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 3 of 4] libxc: provide notification of final checkpoint to restore end
2010-09-03 13:06 ` [PATCH 3 of 4] libxc: provide notification of final checkpoint to restore end Ian Campbell
@ 2010-09-03 17:46 ` Ian Jackson
2010-09-03 19:58 ` Brendan Cully
1 sibling, 0 replies; 8+ messages in thread
From: Ian Jackson @ 2010-09-03 17:46 UTC (permalink / raw)
To: Ian Campbell; +Cc: Brendan Cully, xen-devel
Ian Campbell writes ("[Xen-devel] [PATCH 3 of 4] libxc: provide notification of final checkpoint to restore end"):
> libxc: provide notification of final checkpoint to restore end
I've applied 1, 2 and 4 of this series as they seemed correct to me.
3 looks good to me too, but I'll hold off on it to give Brendan a
chance to comment.
Ian.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 3 of 4] libxc: provide notification of final checkpoint to restore end
2010-09-03 13:06 ` [PATCH 3 of 4] libxc: provide notification of final checkpoint to restore end Ian Campbell
2010-09-03 17:46 ` Ian Jackson
@ 2010-09-03 19:58 ` Brendan Cully
1 sibling, 0 replies; 8+ messages in thread
From: Brendan Cully @ 2010-09-03 19:58 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel
On Friday, 03 September 2010 at 14:06, Ian Campbell wrote:
> # HG changeset patch
> # User Ian Campbell <ian.campbell@citrix.com>
> # Date 1283519033 -3600
> # Node ID 684cfeffdb1b4bacd736bc05ae26211cb91833df
> # Parent de66ba50997cd8670fceffa7b01b50704f82edc7
> libxc: provide notification of final checkpoint to restore end
>
> When the restore code sees this notification it will restore the
> currently in-progress checkpoint when it completes.
>
> This allows the restore end to finish up without waiting for a
> spurious timeout on the receive fd and thereby avoids unnecessary
> error logging in the case of a successful migration or restore.
>
> In the normal migration or restore case the first checkpoint is always
> the last. For a rolling checkpoint (such as Remus) the notification is
> currently unused but could be used in the future for example to
> provide a controlled failover for reasons other than error
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
This looks sane and passes basic testing here.
Acked-by: Brendan Cully <brendan@cs.ubc.ca>
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 0 of 4] libxc: avoid spurious error logging on restore/migrate
@ 2010-09-06 10:03 Ian Campbell
0 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2010-09-06 10:03 UTC (permalink / raw)
To: xen-devel; +Cc: Brendan Cully
This series replaces "libxc: succeed silently on restore" from yesterday.
As well as adding an explicit "final checkpoint" notification chunk it
also includes some tweaks progress logging to be more pleasing and a
fix to xl so it does not re-run the bootloader on restore.
Changes since previous posting:
- Rebased past fde833c66948 "xl: do not continue in the child and exec
xenconsole in the parent".
- Prepended fix for issue exposed by 22366e13f76d "xl: randomly
generate UUIDs".
- Added Brendan's Ack to patch 3/4.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2010-09-06 10:03 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-03 13:06 [PATCH 0 of 4] libxc: avoid spurious error logging on restore/migrate Ian Campbell
2010-09-03 13:06 ` [PATCH 1 of 4] libxc: logger: add newline when progress is complete Ian Campbell
2010-09-03 13:06 ` [PATCH 2 of 4] libxc: use a switch statement in xc_domain_restore.c::pagebuf_get_one Ian Campbell
2010-09-03 13:06 ` [PATCH 3 of 4] libxc: provide notification of final checkpoint to restore end Ian Campbell
2010-09-03 17:46 ` Ian Jackson
2010-09-03 19:58 ` Brendan Cully
2010-09-03 13:06 ` [PATCH 4 of 4] xl: do not run bootloader on restore Ian Campbell
2010-09-06 10:03 [PATCH 0 of 4] libxc: avoid spurious error logging on restore/migrate Ian Campbell
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.