* [PATCH v20210701 00/40] leftover from 2020
@ 2021-07-01  9:55 Olaf Hering
  2021-07-01  9:55 ` [PATCH v20210701 01/40] hotplug/Linux: fix starting of xenstored with restarting systemd Olaf Hering
                   ` (39 more replies)
  0 siblings, 40 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:55 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering

Various unreviewed changes, rebased onto f95b7b37cf.

Olaf Hering (40):
  hotplug/Linux: fix starting of xenstored with restarting systemd
  tools: add API to work with sevaral bits at once
  xl: fix description of migrate --debug
  tools: use integer division in convert-legacy-stream
  tools: handle libxl__physmap_info.name properly in convert-legacy-stream
  tools: fix Python3.4 TypeError in format string
  tools: create libxensaverestore
  MAINTAINERS: add myself as saverestore maintainer
  tools: add readv_exact to libxenctrl
  tools: add xc_is_known_page_type to libxenctrl
  tools: use sr_is_known_page_type
  tools: unify type checking for data pfns in migration stream
  tools: unify type checking for data pfns in migration stream
  tools: show migration transfer rate in send_dirty_pages
  tools: prepare to allocate saverestore arrays once
  tools: save: move mfns array
  tools: save: move types array
  tools: save: move errors array
  tools: save: move iov array
  tools: save: move rec_pfns array
  tools: save: move guest_data array
  tools: save: move local_pages array
  tools: restore: move types array
  tools: restore: move mfns array
  tools: restore: move map_errs array
  tools: restore: move mfns array in populate_pfns
  tools: restore: move pfns array in populate_pfns
  tools: restore: split record processing
  tools: restore: split handle_page_data
  tools: restore: write data directly into guest
  tools: recognize LIBXL_API_VERSION for 4.16
  tools: adjust libxl_domain_suspend to receive a struct props
  tools: change struct precopy_stats to precopy_stats_t
  tools: add callback to libxl for precopy_policy and precopy_stats_t
  tools: add --max_iters to libxl_domain_suspend
  tools: add --min_remaining to libxl_domain_suspend
  tools: add --abort_if_busy to libxl_domain_suspend
  tools: add API for expandable bitmaps
  tools: use xg_sr_bitmap for populated_pfns
  tools/libxc: use superpages during restore of HVM guest

 .gitignore                                    |   2 +
 MAINTAINERS                                   |   6 +
 docs/man/xl.1.pod.in                          |  22 +-
 tools/hotplug/Linux/init.d/xencommons.in      |   2 +-
 tools/hotplug/Linux/launch-xenstore.in        |  40 +-
 .../Linux/systemd/xenstored.service.in        |   2 +-
 tools/include/libxl.h                         |  32 +-
 tools/include/xenguest.h                      | 186 -----
 tools/include/xensaverestore.h                | 207 ++++++
 tools/libs/Makefile                           |   1 +
 tools/libs/ctrl/xc_bitops.h                   |  28 +
 tools/libs/ctrl/xc_private.c                  |  57 +-
 tools/libs/ctrl/xc_private.h                  |   1 +
 tools/libs/guest/Makefile                     |  11 -
 tools/libs/guest/xg_dom_x86.c                 |   5 -
 tools/libs/guest/xg_offline_page.c            |   1 -
 tools/libs/guest/xg_private.h                 |   5 +
 tools/libs/guest/xg_sr_restore_x86_hvm.c      | 274 --------
 tools/libs/light/Makefile                     |   4 +-
 tools/libs/light/libxl_dom_save.c             |  24 +
 tools/libs/light/libxl_domain.c               |  10 +-
 tools/libs/light/libxl_internal.h             |   7 +
 tools/libs/light/libxl_save_helper.c          |   1 +
 tools/libs/light/libxl_save_msgs_gen.pl       |   5 +-
 tools/libs/light/libxl_stream_write.c         |   9 +-
 tools/libs/light/libxl_types.idl              |   1 +
 tools/libs/saverestore/Makefile               |  38 ++
 .../xg_sr_common.c => saverestore/common.c}   |  75 +-
 .../xg_sr_common.h => saverestore/common.h}   | 271 +++++++-
 .../common_x86.c}                             |   2 +-
 .../common_x86.h}                             |   2 +-
 .../common_x86_pv.c}                          |   2 +-
 .../common_x86_pv.h}                          |   2 +-
 .../nomigrate.c}                              |   2 +-
 .../xg_sr_restore.c => saverestore/restore.c} | 617 +++++++++--------
 tools/libs/saverestore/restore_x86_hvm.c      | 645 ++++++++++++++++++
 .../restore_x86_pv.c}                         |  70 +-
 .../xg_sr_save.c => saverestore/save.c}       | 165 ++---
 .../save_restore.h}                           |   2 -
 .../save_x86_hvm.c}                           |   7 +-
 .../save_x86_pv.c}                            |  33 +-
 .../stream_format.h}                          |   0
 tools/libs/uselibs.mk                         |   4 +-
 tools/ocaml/libs/xl/xenlight_stubs.c          |   3 +-
 tools/python/scripts/convert-legacy-stream    |  24 +-
 tools/xl/xl_cmdtable.c                        |  26 +-
 tools/xl/xl_migrate.c                         |  54 +-
 tools/xl/xl_saverestore.c                     |   3 +-
 48 files changed, 2037 insertions(+), 953 deletions(-)
 create mode 100644 tools/include/xensaverestore.h
 delete mode 100644 tools/libs/guest/xg_sr_restore_x86_hvm.c
 create mode 100644 tools/libs/saverestore/Makefile
 rename tools/libs/{guest/xg_sr_common.c => saverestore/common.c} (72%)
 rename tools/libs/{guest/xg_sr_common.h => saverestore/common.h} (67%)
 rename tools/libs/{guest/xg_sr_common_x86.c => saverestore/common_x86.c} (99%)
 rename tools/libs/{guest/xg_sr_common_x86.h => saverestore/common_x86.h} (98%)
 rename tools/libs/{guest/xg_sr_common_x86_pv.c => saverestore/common_x86_pv.c} (99%)
 rename tools/libs/{guest/xg_sr_common_x86_pv.h => saverestore/common_x86_pv.h} (98%)
 rename tools/libs/{guest/xg_nomigrate.c => saverestore/nomigrate.c} (98%)
 rename tools/libs/{guest/xg_sr_restore.c => saverestore/restore.c} (66%)
 create mode 100644 tools/libs/saverestore/restore_x86_hvm.c
 rename tools/libs/{guest/xg_sr_restore_x86_pv.c => saverestore/restore_x86_pv.c} (94%)
 rename tools/libs/{guest/xg_sr_save.c => saverestore/save.c} (88%)
 rename tools/libs/{guest/xg_save_restore.h => saverestore/save_restore.h} (98%)
 rename tools/libs/{guest/xg_sr_save_x86_hvm.c => saverestore/save_x86_hvm.c} (96%)
 rename tools/libs/{guest/xg_sr_save_x86_pv.c => saverestore/save_x86_pv.c} (97%)
 rename tools/libs/{guest/xg_sr_stream_format.h => saverestore/stream_format.h} (100%)




* [PATCH v20210701 01/40] hotplug/Linux: fix starting of xenstored with restarting systemd
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
@ 2021-07-01  9:55 ` Olaf Hering
  2021-07-01  9:55 ` [PATCH v20210701 02/40] tools: add API to work with sevaral bits at once Olaf Hering
                   ` (38 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:55 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu

A hard-to-trigger race between xenstored.service and another, unrelated
systemd service revealed a bug in the way xenstored is launched with
systemd.

launch-xenstore may start either a daemon or a domain. In the domain
case, systemd-notify was called. If another service triggered a restart
of systemd while xenstored.service was executing, systemd could
temporarily lose track of services with Type=notify. As a result,
xenstored.service would be marked as failed and units that depend on it
would not be started. This breaks the entire Xen toolstack.

The chain of events is basically: xenstored.service sends the
notification to systemd; this is a one-way event. Then systemd may be
restarted by the other unit. During this time, xenstored.service finishes
and exits. Once systemd is done with its restart, it collects the pending
notifications and children. If it does not find the unit which sent the
notification, it declares that unit failed.

A workaround for this scenario is to leave the child processes running
for a short time after sending the "READY=1" notification. If systemd
happens to restart it will still find the unit it launched.

Adjust the callers of launch-xenstore to specify the init system:
do not fork xenstored with systemd, which preserves the pid. In the
daemon case this also avoids the need for a sleep because the process
which sent "READY=1" (the previously forked child) is still alive.

Remove --pid-file in the systemd case because the pid of the child
is known, and the file probably had little effect anyway due to the lack
of PIDFile= and Type=forking in the unit file.

Be verbose about xenstored startup only with sysv to avoid interleaved
output in the systemd journal. Do the same for the domain case, even
though it is not strictly needed because init-xenstore-domain has no output.

The upstream systemd commit that is supposed to fix this:
575b300b795b6 ("pid1: rework how we dispatch SIGCHLD and other signals")

Signed-off-by: Olaf Hering <olaf@aepfle.de>

--
v04:
- do mkdir unconditionally because init-xenstore-domain writes the domid to
  xenstored.pid
v03:
- remove run_xenstored function, follow style of shell built-in test function
v02:
- preserve Type=notify
---
 tools/hotplug/Linux/init.d/xencommons.in      |  2 +-
 tools/hotplug/Linux/launch-xenstore.in        | 40 ++++++++++++++-----
 .../Linux/systemd/xenstored.service.in        |  2 +-
 3 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/tools/hotplug/Linux/init.d/xencommons.in b/tools/hotplug/Linux/init.d/xencommons.in
index 7fd6903b98..dcb0ce4b73 100644
--- a/tools/hotplug/Linux/init.d/xencommons.in
+++ b/tools/hotplug/Linux/init.d/xencommons.in
@@ -60,7 +60,7 @@ do_start () {
 	mkdir -m700 -p ${XEN_LOCK_DIR}
 	mkdir -p ${XEN_LOG_DIR}
 
-	@XEN_SCRIPT_DIR@/launch-xenstore || exit 1
+	@XEN_SCRIPT_DIR@/launch-xenstore 'sysv' || exit 1
 
 	echo Setting domain 0 name, domid and JSON config...
 	${LIBEXEC_BIN}/xen-init-dom0 ${XEN_DOM0_UUID}
diff --git a/tools/hotplug/Linux/launch-xenstore.in b/tools/hotplug/Linux/launch-xenstore.in
index 019f9d6f4d..d40c66482a 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -15,6 +15,17 @@
 # License along with this library; If not, see <http://www.gnu.org/licenses/>.
 #
 
+initd=$1
+
+case "$initd" in
+	sysv) nonl='-n' ;;
+	systemd) nonl= ;;
+	*)
+	echo "first argument must be 'sysv' or 'systemd'"
+	exit 1
+	;;
+esac
+
 XENSTORED=@XENSTORED@
 
 . @XEN_SCRIPT_DIR@/hotplugpath.sh
@@ -44,14 +55,16 @@ timeout_xenstore () {
 	return 0
 }
 
-test_xenstore && exit 0
+mkdir -p @XEN_RUN_DIR@
+
+if test "$initd" = 'sysv' ; then
+	test_xenstore && exit 0
+fi
 
 test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons
 
 [ "$XENSTORETYPE" = "" ] && XENSTORETYPE=daemon
 
-/bin/mkdir -p @XEN_RUN_DIR@
-
 [ "$XENSTORETYPE" = "daemon" ] && {
 	[ -z "$XENSTORED_TRACE" ] || XENSTORED_ARGS="$XENSTORED_ARGS -T @XEN_LOG_DIR@/xenstored-trace.log"
 	[ -z "$XENSTORED" ] && XENSTORED=@XENSTORED@
@@ -59,13 +72,15 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
 		echo "No xenstored found"
 		exit 1
 	}
+	[ "$initd" = 'sysv' ] && {
+		echo $nonl Starting $XENSTORED...
+		$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
+		timeout_xenstore $XENSTORED || exit 1
+		exit 0
+	}
 
-	echo -n Starting $XENSTORED...
-	$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
-
-	systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || exit 1
-
-	exit 0
+	exec $XENSTORED -N $XENSTORED_ARGS
+	exit 1
 }
 
 [ "$XENSTORETYPE" = "domain" ] && {
@@ -75,9 +90,12 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
 	XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --memory $XENSTORE_DOMAIN_SIZE"
 	[ -z "$XENSTORE_MAX_DOMAIN_SIZE" ] || XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --maxmem $XENSTORE_MAX_DOMAIN_SIZE"
 
-	echo -n Starting $XENSTORE_DOMAIN_KERNEL...
+	echo $nonl Starting $XENSTORE_DOMAIN_KERNEL...
 	${LIBEXEC_BIN}/init-xenstore-domain $XENSTORE_DOMAIN_ARGS || exit 1
-	systemd-notify --ready 2>/dev/null
+	[ "$initd" = 'systemd' ] && {
+		systemd-notify --ready
+		sleep 9
+	}
 
 	exit 0
 }
diff --git a/tools/hotplug/Linux/systemd/xenstored.service.in b/tools/hotplug/Linux/systemd/xenstored.service.in
index 80c1d408a5..c226eb3635 100644
--- a/tools/hotplug/Linux/systemd/xenstored.service.in
+++ b/tools/hotplug/Linux/systemd/xenstored.service.in
@@ -11,7 +11,7 @@ Type=notify
 NotifyAccess=all
 RemainAfterExit=true
 ExecStartPre=/bin/grep -q control_d /proc/xen/capabilities
-ExecStart=@XEN_SCRIPT_DIR@/launch-xenstore
+ExecStart=@XEN_SCRIPT_DIR@/launch-xenstore 'systemd'
 
 [Install]
 WantedBy=multi-user.target



* [PATCH v20210701 02/40] tools: add API to work with sevaral bits at once
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
  2021-07-01  9:55 ` [PATCH v20210701 01/40] hotplug/Linux: fix starting of xenstored with restarting systemd Olaf Hering
@ 2021-07-01  9:55 ` Olaf Hering
  2021-07-01  9:55 ` [PATCH v20210701 03/40] xl: fix description of migrate --debug Olaf Hering
                   ` (37 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:55 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu, Juergen Gross

Introduce new API to test if a fixed number of bits is clear or set,
and clear or set them all at once.

The caller has to make sure the input bit number is a multiple of BITS_PER_LONG.

This API avoids the loop over each bit in a known range just to see
if all of them are either clear or set.

Signed-off-by: Olaf Hering <olaf@aepfle.de>

v02:
- change return type from int to bool (jgross)
---
 tools/libs/ctrl/xc_bitops.h | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/tools/libs/ctrl/xc_bitops.h b/tools/libs/ctrl/xc_bitops.h
index f0bac4a071..8e8c6efb45 100644
--- a/tools/libs/ctrl/xc_bitops.h
+++ b/tools/libs/ctrl/xc_bitops.h
@@ -3,6 +3,7 @@
 
 /* bitmap operations for single threaded access */
 
+#include <stdbool.h>
 #include <stdlib.h>
 #include <string.h>
 
@@ -77,4 +78,31 @@ static inline void bitmap_or(void *_dst, const void *_other,
         dst[i] |= other[i];
 }
 
+static inline bool test_bit_long_set(unsigned long nr_base, const void *_addr)
+{
+    const unsigned long *addr = _addr;
+    unsigned long val = addr[nr_base / BITS_PER_LONG];
+
+    return val == ~0;
+}
+
+static inline bool test_bit_long_clear(unsigned long nr_base, const void *_addr)
+{
+    const unsigned long *addr = _addr;
+    unsigned long val = addr[nr_base / BITS_PER_LONG];
+
+    return val == 0;
+}
+
+static inline void clear_bit_long(unsigned long nr_base, void *_addr)
+{
+    unsigned long *addr = _addr;
+    addr[nr_base / BITS_PER_LONG] = 0;
+}
+
+static inline void set_bit_long(unsigned long nr_base, void *_addr)
+{
+    unsigned long *addr = _addr;
+    addr[nr_base / BITS_PER_LONG] = ~0;
+}
 #endif  /* XC_BITOPS_H */



* [PATCH v20210701 03/40] xl: fix description of migrate --debug
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
  2021-07-01  9:55 ` [PATCH v20210701 01/40] hotplug/Linux: fix starting of xenstored with restarting systemd Olaf Hering
  2021-07-01  9:55 ` [PATCH v20210701 02/40] tools: add API to work with sevaral bits at once Olaf Hering
@ 2021-07-01  9:55 ` Olaf Hering
  2021-07-01 14:30   ` Anthony PERARD
  2021-07-01 14:33   ` Andrew Cooper
  2021-07-01  9:55 ` [PATCH v20210701 04/40] tools: use integer division in convert-legacy-stream Olaf Hering
                   ` (36 subsequent siblings)
  39 siblings, 2 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu, Anthony PERARD

xl migrate --debug used to track every pfn in every batch of pages.
Those times are gone. The code in xc_domain_save is the consumer
of this knob, but it considers it only for the Remus and COLO cases.

Adjust the help text to say what --debug does today: nothing.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>

v02:
- the option has no effect anymore
---
 docs/man/xl.1.pod.in   | 2 +-
 tools/xl/xl_cmdtable.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in
index e2176bd696..70a6ebf438 100644
--- a/docs/man/xl.1.pod.in
+++ b/docs/man/xl.1.pod.in
@@ -481,7 +481,7 @@ domain.
 
 =item B<--debug>
 
-Display huge (!) amount of debug information during the migration process.
+This option has no effect. It is preserved for compatibility reasons.
 
 =item B<-p>
 
diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index 661323d488..ca1dfa3525 100644
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -172,7 +172,7 @@ const struct cmd_spec cmd_table[] = {
       "                migrate-receive [-d -e]\n"
       "-e              Do not wait in the background (on <host>) for the death\n"
       "                of the domain.\n"
-      "--debug         Print huge (!) amount of debug during the migration process.\n"
+      "--debug         Ignored.\n"
       "-p              Do not unpause domain after migrating it.\n"
       "-D              Preserve the domain id"
     },



* [PATCH v20210701 04/40] tools: use integer division in convert-legacy-stream
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (2 preceding siblings ...)
  2021-07-01  9:55 ` [PATCH v20210701 03/40] xl: fix description of migrate --debug Olaf Hering
@ 2021-07-01  9:55 ` Olaf Hering
  2021-07-02 15:10   ` Andrew Cooper
  2021-07-01  9:56 ` [PATCH v20210701 05/40] tools: handle libxl__physmap_info.name properly " Olaf Hering
                   ` (35 subsequent siblings)
  39 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: Olaf Hering, Marek Marczykowski-Górecki, Ian Jackson, Wei Liu

In Python 3, a single slash gives a float, a double slash gives an int.

    bitmap = unpack_exact("Q" * ((max_id/64) + 1))
TypeError: can't multiply sequence by non-int of type 'float'
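
The failure and the fix can be reproduced in isolation. This is only a
minimal sketch under Python 3 semantics: the value of max_id is made up
for illustration, and unpack_exact from the script is not called.

```python
# Hypothetical highest vcpu id, chosen only for illustration.
max_id = 130

# Python 3 semantics; Python 2.7 behaves the same way once
# "from __future__ import division" is in effect at the top of the file.
words_float = (max_id / 64) + 1    # true division yields a float
words_int = (max_id // 64) + 1     # floor division yields an int

# Repeating a string by a float raises the TypeError quoted above.
try:
    "Q" * words_float
    failed = False
except TypeError:
    failed = True

# Repeating by an int builds the struct format string as intended.
fmt = "Q" * words_int
```

With "//" the number of 64-bit words stays an int, so the format string
can be built regardless of the Python version.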

Signed-off-by: Olaf Hering <olaf@aepfle.de>

v02:
- import division to remain compatible with python2.7 (andrew)
- white space in max_id chunk (andrew)
---
 tools/python/scripts/convert-legacy-stream | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/python/scripts/convert-legacy-stream b/tools/python/scripts/convert-legacy-stream
index ca93a93848..66ee3d2f5d 100755
--- a/tools/python/scripts/convert-legacy-stream
+++ b/tools/python/scripts/convert-legacy-stream
@@ -6,6 +6,7 @@ Convert a legacy migration stream to a v2 stream.
 """
 
 from __future__ import print_function
+from __future__ import division
 
 import sys
 import os, os.path
@@ -163,7 +164,7 @@ def write_libxc_hvm_params(params):
         raise RuntimeError("Expected even length list of hvm parameters")
 
     write_record(libxc.REC_TYPE_hvm_params,
-                 pack(libxc.HVM_PARAMS_FORMAT, len(params) / 2, 0),
+                 pack(libxc.HVM_PARAMS_FORMAT, len(params) // 2, 0),
                  pack("Q" * len(params), *params))
 
 def write_libxc_static_data_end():
@@ -264,8 +265,8 @@ def read_pv_extended_info(vm):
                           (so_far - total_length, ))
 
 def read_pv_p2m_frames(vm):
-    fpp = 4096 / vm.width
-    p2m_frame_len = (vm.p2m_size - 1) / fpp + 1
+    fpp = 4096 // vm.width
+    p2m_frame_len = (vm.p2m_size - 1) // fpp + 1
 
     info("P2M frames: fpp %d, p2m_frame_len %d" % (fpp, p2m_frame_len))
     write_libxc_pv_p2m_frames(vm, unpack_ulongs(p2m_frame_len))
@@ -405,7 +406,7 @@ def read_chunks(vm):
                                   (max_id, legacy.MAX_VCPU_ID))
 
             vm.max_vcpu_id = max_id
-            bitmap = unpack_exact("Q" * ((max_id/64) + 1))
+            bitmap = unpack_exact("Q" * ((max_id // 64) + 1))
 
             for idx, word in enumerate(bitmap):
                 bit_idx = 0



* [PATCH v20210701 05/40] tools: handle libxl__physmap_info.name properly in convert-legacy-stream
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (3 preceding siblings ...)
  2021-07-01  9:55 ` [PATCH v20210701 04/40] tools: use integer division in convert-legacy-stream Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-02 15:35   ` Andrew Cooper
  2021-07-01  9:56 ` [PATCH v20210701 06/40] tools: fix Python3.4 TypeError in format string Olaf Hering
                   ` (34 subsequent siblings)
  39 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel
  Cc: Olaf Hering, Marek Marczykowski-Górecki, Ian Jackson, Wei Liu

The trailing member name[] in libxl__physmap_info is written as a
C string into the stream. The current code checks that the last byte is
zero. This check fails with python3.4 because name[-1] returns an int.
As a result the comparison with b'\x00' fails:

  File "/usr/lib/xen/bin/convert-legacy-stream", line 347, in read_libxl_toolstack
    raise StreamError("physmap name not NUL terminated")
StreamError: physmap name not NUL terminated

To handle both python variants, the C string is unpacked into the actual
string and the trailing NUL byte.
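
A small sketch of the difference, assuming Python 3 and a made-up
physmap name (the real name comes from the stream):

```python
from struct import unpack

# Hypothetical NUL-terminated physmap name, for illustration only.
c_string = b"vga.vram\x00"

# Indexing bytes yields an int on Python 3, so comparing the last
# element against b'\x00' reports a mismatch even though the NUL is there.
last = c_string[-1]
naive_check_fails = last != b"\x00"

# Splitting into "N bytes of string" plus "one unsigned byte" gives an
# int terminator on both Python 2 and Python 3.
name, nil = unpack("={0}sB".format(len(c_string) - 1), c_string)
```

Here nil is the integer 0 and name holds the bytes without the
terminator, so the later name[:-1] slice is no longer needed.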

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 tools/python/scripts/convert-legacy-stream | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/python/scripts/convert-legacy-stream b/tools/python/scripts/convert-legacy-stream
index 66ee3d2f5d..9003ac4f6d 100755
--- a/tools/python/scripts/convert-legacy-stream
+++ b/tools/python/scripts/convert-legacy-stream
@@ -336,20 +336,21 @@ def read_libxl_toolstack(vm, data):
         if len(data) < namelen:
             raise StreamError("Remaining data too short for physmap name")
 
-        name = data[:namelen]
+        c_string = data[:namelen]
         data = data[namelen:]
 
         # Strip padding off the end of name
         if twidth == 64:
-            name = name[:-4]
+            c_string = c_string[:-4]
 
-        if name[-1] != b'\x00':
+        name, nil = unpack("={0}sB".format(len(c_string) - 1), c_string)
+        if nil != 0:
             raise StreamError("physmap name not NUL terminated")
 
         root = b"physmap/%x" % (phys, )
         kv = [root + b"/start_addr", b"%x" % (start, ),
               root + b"/size",       b"%x" % (size, ),
-              root + b"/name",       name[:-1]]
+              root + b"/name",       name]
 
         for key, val in zip(kv[0::2], kv[1::2]):
             info("    '%s' = '%s'" % (key.decode(), val.decode()))



* [PATCH v20210701 06/40] tools: fix Python3.4 TypeError in format string
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (4 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 05/40] tools: handle libxl__physmap_info.name properly " Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-02 16:19   ` Marek Marczykowski-Górecki
  2021-07-01  9:56 ` [PATCH v20210701 07/40] tools: create libxensaverestore Olaf Hering
                   ` (33 subsequent siblings)
  39 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel
  Cc: Olaf Hering, Marek Marczykowski-Górecki, Ian Jackson, Wei Liu

Applying the % operator to a bytes format string fails with
python3.4 as included in SLE12:
    b = b"string/%x" % (i, )
TypeError: unsupported operand type(s) for %: 'bytes' and 'tuple'

It happens to work with python 2.7 and 3.6.
Use a syntax that is handled by all three variants.
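
A minimal sketch of the portable form, with a made-up value for phys.
The values are formatted as text strings first and encoded afterwards,
which avoids %-formatting on bytes (only added to Python 3 in version
3.5). The bytes() wrapper used in the patch is left out of this sketch,
since encode() already returns a bytes object on Python 3.

```python
# Hypothetical guest physical address, for illustration only.
phys = 0xf0000000

# Format as text, then encode; no "%" operator is applied to bytes.
root = ("physmap/%x" % phys).encode("utf-8")

# The result concatenates with bytes literals as in the script.
key = root + b"/start_addr"
```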

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 tools/python/scripts/convert-legacy-stream | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/python/scripts/convert-legacy-stream b/tools/python/scripts/convert-legacy-stream
index 9003ac4f6d..235b922ff5 100755
--- a/tools/python/scripts/convert-legacy-stream
+++ b/tools/python/scripts/convert-legacy-stream
@@ -347,9 +347,9 @@ def read_libxl_toolstack(vm, data):
         if nil != 0:
             raise StreamError("physmap name not NUL terminated")
 
-        root = b"physmap/%x" % (phys, )
-        kv = [root + b"/start_addr", b"%x" % (start, ),
-              root + b"/size",       b"%x" % (size, ),
+        root = bytes(("physmap/%x" % phys).encode('utf-8'))
+        kv = [root + b"/start_addr", bytes(("%x" % start).encode('utf-8')),
+              root + b"/size",       bytes(("%x" % size).encode('utf-8')),
               root + b"/name",       name]
 
         for key, val in zip(kv[0::2], kv[1::2]):



* [PATCH v20210701 07/40] tools: create libxensaverestore
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (5 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 06/40] tools: fix Python3.4 TypeError in format string Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-09  9:20   ` Olaf Hering
  2021-07-09  9:35   ` Julien Grall
  2021-07-01  9:56 ` [PATCH v20210701 08/40] MAINTAINERS: add myself as saverestore maintainer Olaf Hering
                   ` (32 subsequent siblings)
  39 siblings, 2 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel
  Cc: Olaf Hering, Wei Liu, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Juergen Gross,
	Anthony PERARD

Move all save/restore related code from libxenguest.so into a separate
library libxensaverestore.so. The only consumer is libxl-save-helper.
There is no need to have the moved code mapped all the time in binaries
where libxenguest.so is used.

According to size(1) the change is:
   text	   data	    bss	    dec	    hex	filename
 187183	   4304	     48	 191535	  2ec2f	guest/libxenguest.so.4.15.0

 124106	   3376	     48	 127530	  1f22a	guest/libxenguest.so.4.15.0
  67841	   1872	      8	  69721	  11059	saverestore/libxensaverestore.so.4.15.0

While touching the files anyway, take the opportunity to drop the
redundant xg_sr_ filename prefix.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wl@xen.org>

v6:
- fix build of nomigrate.c
v5:
- fix spelling in description
v4:
- drop xg_ prefix from filenames (jgross)
- drop sr_ prefix from filenames (jbeulich)
v3:
- repost in time for 4.16
v2:
- copy also license header
- move xg_nomigrate.c
- add size(1) output to commit msg
- remove change from libxl_create.c
---
 .gitignore                                    |   2 +
 tools/include/xenguest.h                      | 186 ----------------
 tools/include/xensaverestore.h                | 208 ++++++++++++++++++
 tools/libs/Makefile                           |   1 +
 tools/libs/guest/Makefile                     |  11 -
 tools/libs/guest/xg_offline_page.c            |   1 -
 tools/libs/light/Makefile                     |   4 +-
 tools/libs/light/libxl_internal.h             |   1 +
 tools/libs/light/libxl_save_helper.c          |   1 +
 tools/libs/light/libxl_save_msgs_gen.pl       |   2 +-
 tools/libs/saverestore/Makefile               |  38 ++++
 .../xg_sr_common.c => saverestore/common.c}   |   2 +-
 .../xg_sr_common.h => saverestore/common.h}   |  16 +-
 .../common_x86.c}                             |   2 +-
 .../common_x86.h}                             |   2 +-
 .../common_x86_pv.c}                          |   2 +-
 .../common_x86_pv.h}                          |   2 +-
 .../nomigrate.c}                              |   2 +-
 .../xg_sr_restore.c => saverestore/restore.c} |   2 +-
 .../restore_x86_hvm.c}                        |   2 +-
 .../restore_x86_pv.c}                         |   2 +-
 .../xg_sr_save.c => saverestore/save.c}       |   2 +-
 .../save_restore.h}                           |   2 -
 .../save_x86_hvm.c}                           |   2 +-
 .../save_x86_pv.c}                            |   2 +-
 .../stream_format.h}                          |   0
 tools/libs/uselibs.mk                         |   4 +-
 27 files changed, 283 insertions(+), 218 deletions(-)
 create mode 100644 tools/include/xensaverestore.h
 create mode 100644 tools/libs/saverestore/Makefile
 rename tools/libs/{guest/xg_sr_common.c => saverestore/common.c} (99%)
 rename tools/libs/{guest/xg_sr_common.h => saverestore/common.h} (98%)
 rename tools/libs/{guest/xg_sr_common_x86.c => saverestore/common_x86.c} (99%)
 rename tools/libs/{guest/xg_sr_common_x86.h => saverestore/common_x86.h} (98%)
 rename tools/libs/{guest/xg_sr_common_x86_pv.c => saverestore/common_x86_pv.c} (99%)
 rename tools/libs/{guest/xg_sr_common_x86_pv.h => saverestore/common_x86_pv.h} (98%)
 rename tools/libs/{guest/xg_nomigrate.c => saverestore/nomigrate.c} (98%)
 rename tools/libs/{guest/xg_sr_restore.c => saverestore/restore.c} (99%)
 rename tools/libs/{guest/xg_sr_restore_x86_hvm.c => saverestore/restore_x86_hvm.c} (99%)
 rename tools/libs/{guest/xg_sr_restore_x86_pv.c => saverestore/restore_x86_pv.c} (99%)
 rename tools/libs/{guest/xg_sr_save.c => saverestore/save.c} (99%)
 rename tools/libs/{guest/xg_save_restore.h => saverestore/save_restore.h} (98%)
 rename tools/libs/{guest/xg_sr_save_x86_hvm.c => saverestore/save_x86_hvm.c} (99%)
 rename tools/libs/{guest/xg_sr_save_x86_pv.c => saverestore/save_x86_pv.c} (99%)
 rename tools/libs/{guest/xg_sr_stream_format.h => saverestore/stream_format.h} (100%)

diff --git a/.gitignore b/.gitignore
index 38a085e398..08a321e995 100644
--- a/.gitignore
+++ b/.gitignore
@@ -147,6 +147,8 @@ tools/libs/light/test_timedereg
 tools/libs/light/test_fdderegrace
 tools/libs/light/tmp.*
 tools/libs/light/xenlight.pc
+tools/libs/saverestore/libxensaverestore.map
+tools/libs/saverestore/xensaverestore.pc
 tools/libs/stat/_paths.h
 tools/libs/stat/headers.chk
 tools/libs/stat/libxenstat.map
diff --git a/tools/include/xenguest.h b/tools/include/xenguest.h
index 61d0a82f48..7417675b3b 100644
--- a/tools/include/xenguest.h
+++ b/tools/include/xenguest.h
@@ -24,9 +24,6 @@
 
 #define XC_NUMA_NO_NODE   (~0U)
 
-#define XCFLAGS_LIVE      (1 << 0)
-#define XCFLAGS_DEBUG     (1 << 1)
-
 #define X86_64_B_SIZE   64 
 #define X86_32_B_SIZE   32
 
@@ -433,189 +430,6 @@ static inline xen_pfn_t xc_dom_p2m(struct xc_dom_image *dom, xen_pfn_t pfn)
  */
 struct xenevtchn_handle;
 
-/* For save's precopy_policy(). */
-struct precopy_stats
-{
-    unsigned int iteration;
-    unsigned long total_written;
-    long dirty_count; /* -1 if unknown */
-};
-
-/*
- * A precopy_policy callback may not be running in the same address
- * space as libxc an so precopy_stats is passed by value.
- */
-typedef int (*precopy_policy_t)(struct precopy_stats, void *);
-
-/* callbacks provided by xc_domain_save */
-struct save_callbacks {
-    /*
-     * Called after expiration of checkpoint interval,
-     * to suspend the guest.
-     */
-    int (*suspend)(void *data);
-
-    /*
-     * Called before and after every batch of page data sent during
-     * the precopy phase of a live migration to ask the caller what
-     * to do next based on the current state of the precopy migration.
-     *
-     * Should return one of the values listed below:
-     */
-#define XGS_POLICY_ABORT          (-1) /* Abandon the migration entirely
-                                        * and tidy up. */
-#define XGS_POLICY_CONTINUE_PRECOPY 0  /* Remain in the precopy phase. */
-#define XGS_POLICY_STOP_AND_COPY    1  /* Immediately suspend and transmit the
-                                        * remaining dirty pages. */
-    precopy_policy_t precopy_policy;
-
-    /*
-     * Called after the guest's dirty pages have been
-     *  copied into an output buffer.
-     * Callback function resumes the guest & the device model,
-     *  returns to xc_domain_save.
-     * xc_domain_save then flushes the output buffer, while the
-     *  guest continues to run.
-     */
-    int (*postcopy)(void *data);
-
-    /*
-     * Called after the memory checkpoint has been flushed
-     * out into the network. Typical actions performed in this
-     * callback include:
-     *   (a) send the saved device model state (for HVM guests),
-     *   (b) wait for checkpoint ack
-     *   (c) release the network output buffer pertaining to the acked checkpoint.
-     *   (c) sleep for the checkpoint interval.
-     *
-     * returns:
-     * 0: terminate checkpointing gracefully
-     * 1: take another checkpoint
-     */
-    int (*checkpoint)(void *data);
-
-    /*
-     * Called after the checkpoint callback.
-     *
-     * returns:
-     * 0: terminate checkpointing gracefully
-     * 1: take another checkpoint
-     */
-    int (*wait_checkpoint)(void *data);
-
-    /* Enable qemu-dm logging dirty pages to xen */
-    int (*switch_qemu_logdirty)(uint32_t domid, unsigned enable, void *data); /* HVM only */
-
-    /* to be provided as the last argument to each callback function */
-    void *data;
-};
-
-/* Type of stream.  Plain, or using a continuous replication protocol? */
-typedef enum {
-    XC_STREAM_PLAIN,
-    XC_STREAM_REMUS,
-    XC_STREAM_COLO,
-} xc_stream_type_t;
-
-/**
- * This function will save a running domain.
- *
- * @param xch a handle to an open hypervisor interface
- * @param io_fd the file descriptor to save a domain to
- * @param dom the id of the domain
- * @param flags XCFLAGS_xxx
- * @param stream_type XC_STREAM_PLAIN if the far end of the stream
- *        doesn't use checkpointing
- * @param recv_fd Only used for XC_STREAM_COLO.  Contains backchannel from
- *        the destination side.
- * @return 0 on success, -1 on failure
- */
-int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
-                   uint32_t flags, struct save_callbacks *callbacks,
-                   xc_stream_type_t stream_type, int recv_fd);
-
-/* callbacks provided by xc_domain_restore */
-struct restore_callbacks {
-    /*
-     * Called once the STATIC_DATA_END record has been received/inferred.
-     *
-     * For compatibility with older streams, provides a list of static data
-     * expected to be found in the stream, which was missing.  A higher level
-     * toolstack is responsible for providing any necessary compatibiltiy.
-     */
-#define XGR_SDD_MISSING_CPUID (1 << 0)
-#define XGR_SDD_MISSING_MSR   (1 << 1)
-    int (*static_data_done)(unsigned int missing, void *data);
-
-    /* Called after a new checkpoint to suspend the guest. */
-    int (*suspend)(void *data);
-
-    /*
-     * Called after the secondary vm is ready to resume.
-     * Callback function resumes the guest & the device model,
-     * returns to xc_domain_restore.
-     */
-    int (*postcopy)(void *data);
-
-    /*
-     * A checkpoint record has been found in the stream.
-     * returns:
-     */
-#define XGR_CHECKPOINT_ERROR    0 /* Terminate processing */
-#define XGR_CHECKPOINT_SUCCESS  1 /* Continue reading more data from the stream */
-#define XGR_CHECKPOINT_FAILOVER 2 /* Failover and resume VM */
-    int (*checkpoint)(void *data);
-
-    /*
-     * Called after the checkpoint callback.
-     *
-     * returns:
-     * 0: terminate checkpointing gracefully
-     * 1: take another checkpoint
-     */
-    int (*wait_checkpoint)(void *data);
-
-    /*
-     * callback to send store gfn and console gfn to xl
-     * if we want to resume vm before xc_domain_save()
-     * exits.
-     */
-    void (*restore_results)(xen_pfn_t store_gfn, xen_pfn_t console_gfn,
-                            void *data);
-
-    /* to be provided as the last argument to each callback function */
-    void *data;
-};
-
-/**
- * This function will restore a saved domain.
- *
- * Domain is restored in a suspended state ready to be unpaused.
- *
- * @param xch a handle to an open hypervisor interface
- * @param io_fd the file descriptor to restore a domain from
- * @param dom the id of the domain
- * @param store_evtchn the xenstore event channel for this domain to use
- * @param store_mfn filled with the gfn of the store page
- * @param store_domid the backend domain for xenstore
- * @param console_evtchn the console event channel for this domain to use
- * @param console_mfn filled with the gfn of the console page
- * @param console_domid the backend domain for xenconsole
- * @param stream_type XC_STREAM_PLAIN if the far end of the stream is using
- *        checkpointing
- * @param callbacks non-NULL to receive a callback to restore toolstack
- *        specific data
- * @param send_back_fd Only used for XC_STREAM_COLO.  Contains backchannel to
- *        the source side.
- * @return 0 on success, -1 on failure
- */
-int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
-                      unsigned int store_evtchn, unsigned long *store_mfn,
-                      uint32_t store_domid, unsigned int console_evtchn,
-                      unsigned long *console_mfn, uint32_t console_domid,
-                      xc_stream_type_t stream_type,
-                      struct restore_callbacks *callbacks, int send_back_fd);
-
 /**
  * This function will create a domain for a paravirtualized Linux
  * using file names pointing to kernel and ramdisk
diff --git a/tools/include/xensaverestore.h b/tools/include/xensaverestore.h
new file mode 100644
index 0000000000..0410f0469e
--- /dev/null
+++ b/tools/include/xensaverestore.h
@@ -0,0 +1,208 @@
+/******************************************************************************
+ * A library for guest domain save/restore/migration in Xen.
+ *
+ * Copyright (c) 2003-2004, K A Fraser.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2.1 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef XENSAVERESTORE_H
+#define XENSAVERESTORE_H
+
+#define XCFLAGS_LIVE      (1 << 0)
+#define XCFLAGS_DEBUG     (1 << 1)
+
+/* For save's precopy_policy(). */
+struct precopy_stats
+{
+    unsigned int iteration;
+    unsigned long total_written;
+    long dirty_count; /* -1 if unknown */
+};
+
+/*
+ * A precopy_policy callback may not be running in the same address
+ * space as libxc and so precopy_stats is passed by value.
+ */
+typedef int (*precopy_policy_t)(struct precopy_stats, void *);
+
+/* callbacks provided by xc_domain_save */
+struct save_callbacks {
+    /*
+     * Called after expiration of checkpoint interval,
+     * to suspend the guest.
+     */
+    int (*suspend)(void *data);
+
+    /*
+     * Called before and after every batch of page data sent during
+     * the precopy phase of a live migration to ask the caller what
+     * to do next based on the current state of the precopy migration.
+     *
+     * Should return one of the values listed below:
+     */
+#define XGS_POLICY_ABORT          (-1) /* Abandon the migration entirely
+                                        * and tidy up. */
+#define XGS_POLICY_CONTINUE_PRECOPY 0  /* Remain in the precopy phase. */
+#define XGS_POLICY_STOP_AND_COPY    1  /* Immediately suspend and transmit the
+                                        * remaining dirty pages. */
+    precopy_policy_t precopy_policy;
+
+    /*
+     * Called after the guest's dirty pages have been
+     *  copied into an output buffer.
+     * Callback function resumes the guest & the device model,
+     *  returns to xc_domain_save.
+     * xc_domain_save then flushes the output buffer, while the
+     *  guest continues to run.
+     */
+    int (*postcopy)(void *data);
+
+    /*
+     * Called after the memory checkpoint has been flushed
+     * out into the network. Typical actions performed in this
+     * callback include:
+     *   (a) send the saved device model state (for HVM guests),
+     *   (b) wait for checkpoint ack
+     *   (c) release the network output buffer pertaining to the acked checkpoint.
+     *   (d) sleep for the checkpoint interval.
+     *
+     * returns:
+     * 0: terminate checkpointing gracefully
+     * 1: take another checkpoint
+     */
+    int (*checkpoint)(void *data);
+
+    /*
+     * Called after the checkpoint callback.
+     *
+     * returns:
+     * 0: terminate checkpointing gracefully
+     * 1: take another checkpoint
+     */
+    int (*wait_checkpoint)(void *data);
+
+    /* Enable qemu-dm logging dirty pages to xen */
+    int (*switch_qemu_logdirty)(uint32_t domid, unsigned enable, void *data); /* HVM only */
+
+    /* to be provided as the last argument to each callback function */
+    void *data;
+};
+
+/* Type of stream.  Plain, or using a continuous replication protocol? */
+typedef enum {
+    XC_STREAM_PLAIN,
+    XC_STREAM_REMUS,
+    XC_STREAM_COLO,
+} xc_stream_type_t;
+
+/**
+ * This function will save a running domain.
+ *
+ * @param xch a handle to an open hypervisor interface
+ * @param io_fd the file descriptor to save a domain to
+ * @param dom the id of the domain
+ * @param flags XCFLAGS_xxx
+ * @param stream_type XC_STREAM_PLAIN if the far end of the stream
+ *        doesn't use checkpointing
+ * @param recv_fd Only used for XC_STREAM_COLO.  Contains backchannel from
+ *        the destination side.
+ * @return 0 on success, -1 on failure
+ */
+int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
+                   uint32_t flags, struct save_callbacks *callbacks,
+                   xc_stream_type_t stream_type, int recv_fd);
+
+/* callbacks provided by xc_domain_restore */
+struct restore_callbacks {
+    /*
+     * Called once the STATIC_DATA_END record has been received/inferred.
+     *
+     * For compatibility with older streams, provides a list of static data
+     * expected to be found in the stream, which was missing.  A higher level
+     * toolstack is responsible for providing any necessary compatibility.
+     */
+#define XGR_SDD_MISSING_CPUID (1 << 0)
+#define XGR_SDD_MISSING_MSR   (1 << 1)
+    int (*static_data_done)(unsigned int missing, void *data);
+
+    /* Called after a new checkpoint to suspend the guest. */
+    int (*suspend)(void *data);
+
+    /*
+     * Called after the secondary vm is ready to resume.
+     * Callback function resumes the guest & the device model,
+     * returns to xc_domain_restore.
+     */
+    int (*postcopy)(void *data);
+
+    /*
+     * A checkpoint record has been found in the stream.
+     * returns:
+     */
+#define XGR_CHECKPOINT_ERROR    0 /* Terminate processing */
+#define XGR_CHECKPOINT_SUCCESS  1 /* Continue reading more data from the stream */
+#define XGR_CHECKPOINT_FAILOVER 2 /* Failover and resume VM */
+    int (*checkpoint)(void *data);
+
+    /*
+     * Called after the checkpoint callback.
+     *
+     * returns:
+     * 0: terminate checkpointing gracefully
+     * 1: take another checkpoint
+     */
+    int (*wait_checkpoint)(void *data);
+
+    /*
+     * callback to send store gfn and console gfn to xl
+     * if we want to resume vm before xc_domain_save()
+     * exits.
+     */
+    void (*restore_results)(xen_pfn_t store_gfn, xen_pfn_t console_gfn,
+                            void *data);
+
+    /* to be provided as the last argument to each callback function */
+    void *data;
+};
+
+/**
+ * This function will restore a saved domain.
+ *
+ * Domain is restored in a suspended state ready to be unpaused.
+ *
+ * @param xch a handle to an open hypervisor interface
+ * @param io_fd the file descriptor to restore a domain from
+ * @param dom the id of the domain
+ * @param store_evtchn the xenstore event channel for this domain to use
+ * @param store_mfn filled with the gfn of the store page
+ * @param store_domid the backend domain for xenstore
+ * @param console_evtchn the console event channel for this domain to use
+ * @param console_mfn filled with the gfn of the console page
+ * @param console_domid the backend domain for xenconsole
+ * @param stream_type XC_STREAM_PLAIN if the far end of the stream is using
+ *        checkpointing
+ * @param callbacks non-NULL to receive a callback to restore toolstack
+ *        specific data
+ * @param send_back_fd Only used for XC_STREAM_COLO.  Contains backchannel to
+ *        the source side.
+ * @return 0 on success, -1 on failure
+ */
+int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
+                      unsigned int store_evtchn, unsigned long *store_mfn,
+                      uint32_t store_domid, unsigned int console_evtchn,
+                      unsigned long *console_mfn, uint32_t console_domid,
+                      xc_stream_type_t stream_type,
+                      struct restore_callbacks *callbacks, int send_back_fd);
+
+#endif /* XENSAVERESTORE_H */
diff --git a/tools/libs/Makefile b/tools/libs/Makefile
index 1afcd12e2b..ca43c66777 100644
--- a/tools/libs/Makefile
+++ b/tools/libs/Makefile
@@ -12,6 +12,7 @@ SUBDIRS-y += devicemodel
 SUBDIRS-y += ctrl
 SUBDIRS-y += guest
 SUBDIRS-y += hypfs
+SUBDIRS-y += saverestore
 SUBDIRS-y += store
 SUBDIRS-y += stat
 SUBDIRS-$(CONFIG_Linux) += vchan
diff --git a/tools/libs/guest/Makefile b/tools/libs/guest/Makefile
index 2ce92d247e..4cf5459bb1 100644
--- a/tools/libs/guest/Makefile
+++ b/tools/libs/guest/Makefile
@@ -11,18 +11,7 @@ SRCS-y += xg_domain.c
 SRCS-y += xg_suspend.c
 SRCS-y += xg_resume.c
 ifeq ($(CONFIG_MIGRATE),y)
-SRCS-y += xg_sr_common.c
-SRCS-$(CONFIG_X86) += xg_sr_common_x86.c
-SRCS-$(CONFIG_X86) += xg_sr_common_x86_pv.c
-SRCS-$(CONFIG_X86) += xg_sr_restore_x86_pv.c
-SRCS-$(CONFIG_X86) += xg_sr_restore_x86_hvm.c
-SRCS-$(CONFIG_X86) += xg_sr_save_x86_pv.c
-SRCS-$(CONFIG_X86) += xg_sr_save_x86_hvm.c
-SRCS-y += xg_sr_restore.c
-SRCS-y += xg_sr_save.c
 SRCS-y += xg_offline_page.c
-else
-SRCS-y += xg_nomigrate.c
 endif
 SRCS-y       += xg_core.c
 SRCS-$(CONFIG_X86) += xg_core_x86.c
diff --git a/tools/libs/guest/xg_offline_page.c b/tools/libs/guest/xg_offline_page.c
index cfe0e2d537..92b65243b1 100644
--- a/tools/libs/guest/xg_offline_page.c
+++ b/tools/libs/guest/xg_offline_page.c
@@ -29,7 +29,6 @@
 
 #include "xc_private.h"
 #include "xg_private.h"
-#include "xg_save_restore.h"
 
 struct pte_backup_entry
 {
diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
index 7d8c51d492..68e51dd13c 100644
--- a/tools/libs/light/Makefile
+++ b/tools/libs/light/Makefile
@@ -179,7 +179,7 @@ $(ACPI_OBJS) $(ACPI_PIC_OBJS): CFLAGS += -I. -DLIBACPI_STDUTILS=\"$(CURDIR)/libx
 $(TEST_PROG_OBJS) _libxl.api-for-check: CFLAGS += $(CFLAGS_libxentoollog) $(CFLAGS_libxentoolcore)
 libxl_dom.o libxl_dom.opic: CFLAGS += -I$(XEN_ROOT)/tools  # include libacpi/x86.h
 libxl_x86_acpi.o libxl_x86_acpi.opic: CFLAGS += -I$(XEN_ROOT)/tools
-$(SAVE_HELPER_OBJS): CFLAGS += $(CFLAGS_libxenctrl) $(CFLAGS_libxenevtchn) $(CFLAGS_libxenguest)
+$(SAVE_HELPER_OBJS): CFLAGS += $(CFLAGS_libxenctrl) $(CFLAGS_libxenevtchn) $(CFLAGS_libxensaverestore)
 
 testidl.o: CFLAGS += $(CFLAGS_libxenctrl) $(CFLAGS_libxenlight)
 testidl.c: libxl_types.idl gentest.py $(XEN_INCLUDE)/libxl.h $(AUTOINCS)
@@ -241,7 +241,7 @@ test_%: test_%.o test_common.o libxenlight_test.so
 	$(CC) $(LDFLAGS) -o $@ $^ $(filter-out %libxenlight.so, $(LDLIBS_libxenlight)) $(LDLIBS_libxentoollog) $(LDLIBS_libxentoolcore) -lyajl $(APPEND_LDFLAGS)
 
 libxl-save-helper: $(SAVE_HELPER_OBJS) libxenlight.so
-	$(CC) $(LDFLAGS) -o $@ $(SAVE_HELPER_OBJS) $(LDLIBS_libxentoollog) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(LDLIBS_libxentoolcore) $(APPEND_LDFLAGS)
+	$(CC) $(LDFLAGS) -o $@ $(SAVE_HELPER_OBJS) $(LDLIBS_libxentoollog) $(LDLIBS_libxenctrl) $(LDLIBS_libxensaverestore) $(LDLIBS_libxentoolcore) $(APPEND_LDFLAGS)
 
 testidl: testidl.o libxenlight.so
 	$(CC) $(LDFLAGS) -o $@ testidl.o $(LDLIBS_libxenlight) $(LDLIBS_libxentoollog) $(LDLIBS_libxentoolcore) $(APPEND_LDFLAGS)
diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h
index 0b4671318c..439c654733 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -56,6 +56,7 @@
 #define XC_WANT_COMPAT_MAP_FOREIGN_API
 #include <xenctrl.h>
 #include <xenguest.h>
+#include <xensaverestore.h>
 #include <xenhypfs.h>
 
 #include <xen-tools/libs.h>
diff --git a/tools/libs/light/libxl_save_helper.c b/tools/libs/light/libxl_save_helper.c
index 65dff389bf..896e845a2f 100644
--- a/tools/libs/light/libxl_save_helper.c
+++ b/tools/libs/light/libxl_save_helper.c
@@ -48,6 +48,7 @@
 
 #include "xenctrl.h"
 #include "xenguest.h"
+#include "xensaverestore.h"
 #include "_libxl_save_msgs_helper.h"
 
 /*----- logger -----*/
diff --git a/tools/libs/light/libxl_save_msgs_gen.pl b/tools/libs/light/libxl_save_msgs_gen.pl
index 9d425b1dee..f263ee01bb 100755
--- a/tools/libs/light/libxl_save_msgs_gen.pl
+++ b/tools/libs/light/libxl_save_msgs_gen.pl
@@ -72,7 +72,7 @@ END_BOTH
 END_CALLOUT
 
 #include <xenctrl.h>
-#include <xenguest.h>
+#include <xensaverestore.h>
 #include "_libxl_save_msgs_${ah}.h"
 
 END_HELPER
diff --git a/tools/libs/saverestore/Makefile b/tools/libs/saverestore/Makefile
new file mode 100644
index 0000000000..48728b3be2
--- /dev/null
+++ b/tools/libs/saverestore/Makefile
@@ -0,0 +1,38 @@
+XEN_ROOT = $(CURDIR)/../../..
+include $(XEN_ROOT)/tools/Rules.mk
+
+ifeq ($(CONFIG_MIGRATE),y)
+SRCS-y += common.c
+SRCS-$(CONFIG_X86) += common_x86.c
+SRCS-$(CONFIG_X86) += common_x86_pv.c
+SRCS-$(CONFIG_X86) += restore_x86_pv.c
+SRCS-$(CONFIG_X86) += restore_x86_hvm.c
+SRCS-$(CONFIG_X86) += save_x86_pv.c
+SRCS-$(CONFIG_X86) += save_x86_hvm.c
+SRCS-y += restore.c
+SRCS-y += save.c
+else
+SRCS-y += nomigrate.c
+endif
+
+CFLAGS += -I$(XEN_libxenctrl)
+CFLAGS += -I$(XEN_libxenguest)
+
+-include $(XEN_TARGET_ARCH)/Makefile
+
+CFLAGS   += -Werror -Wmissing-prototypes
+CFLAGS   += -I. -I./include $(CFLAGS_xeninclude)
+CFLAGS   += -D__XEN_TOOLS__
+CFLAGS   += -include $(XEN_ROOT)/tools/config.h
+# Needed for asprintf()
+CFLAGS-$(CONFIG_Linux) += -D_GNU_SOURCE
+
+LIBHEADER := xensaverestore.h
+
+NO_HEADERS_CHK := y
+
+include $(XEN_ROOT)/tools/libs/libs.mk
+
+.PHONY: cleanlocal
+cleanlocal:
+	rm -f libxensaverestore.map
diff --git a/tools/libs/guest/xg_sr_common.c b/tools/libs/saverestore/common.c
similarity index 99%
rename from tools/libs/guest/xg_sr_common.c
rename to tools/libs/saverestore/common.c
index 17567ab133..77128bc747 100644
--- a/tools/libs/guest/xg_sr_common.c
+++ b/tools/libs/saverestore/common.c
@@ -1,6 +1,6 @@
 #include <assert.h>
 
-#include "xg_sr_common.h"
+#include "common.h"
 
 #include <xen-tools/libs.h>
 
diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/saverestore/common.h
similarity index 98%
rename from tools/libs/guest/xg_sr_common.h
rename to tools/libs/saverestore/common.h
index e2994e18ac..ca2eb47a4f 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/saverestore/common.h
@@ -1,13 +1,25 @@
 #ifndef __COMMON__H
 #define __COMMON__H
 
+#include <unistd.h>
+#include <errno.h>
 #include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+
+#include "xc_private.h"
+#include "xenguest.h"
+#include "xensaverestore.h"
 
 #include "xg_private.h"
-#include "xg_save_restore.h"
+#include "save_restore.h"
 #include "xc_bitops.h"
 
-#include "xg_sr_stream_format.h"
+#include "stream_format.h"
 
 /* String representation of Domain Header types. */
 const char *dhdr_type_to_str(uint32_t type);
diff --git a/tools/libs/guest/xg_sr_common_x86.c b/tools/libs/saverestore/common_x86.c
similarity index 99%
rename from tools/libs/guest/xg_sr_common_x86.c
rename to tools/libs/saverestore/common_x86.c
index 563b4f0168..f1beb234ae 100644
--- a/tools/libs/guest/xg_sr_common_x86.c
+++ b/tools/libs/saverestore/common_x86.c
@@ -1,4 +1,4 @@
-#include "xg_sr_common_x86.h"
+#include "common_x86.h"
 
 int write_x86_tsc_info(struct xc_sr_context *ctx)
 {
diff --git a/tools/libs/guest/xg_sr_common_x86.h b/tools/libs/saverestore/common_x86.h
similarity index 98%
rename from tools/libs/guest/xg_sr_common_x86.h
rename to tools/libs/saverestore/common_x86.h
index b55758c96d..3a2d91dcb8 100644
--- a/tools/libs/guest/xg_sr_common_x86.h
+++ b/tools/libs/saverestore/common_x86.h
@@ -1,7 +1,7 @@
 #ifndef __COMMON_X86__H
 #define __COMMON_X86__H
 
-#include "xg_sr_common.h"
+#include "common.h"
 
 /*
  * Obtains a domains TSC information from Xen and writes a X86_TSC_INFO record
diff --git a/tools/libs/guest/xg_sr_common_x86_pv.c b/tools/libs/saverestore/common_x86_pv.c
similarity index 99%
rename from tools/libs/guest/xg_sr_common_x86_pv.c
rename to tools/libs/saverestore/common_x86_pv.c
index c0acf00f90..cfe1b24bed 100644
--- a/tools/libs/guest/xg_sr_common_x86_pv.c
+++ b/tools/libs/saverestore/common_x86_pv.c
@@ -1,6 +1,6 @@
 #include <assert.h>
 
-#include "xg_sr_common_x86_pv.h"
+#include "common_x86_pv.h"
 
 xen_pfn_t mfn_to_pfn(struct xc_sr_context *ctx, xen_pfn_t mfn)
 {
diff --git a/tools/libs/guest/xg_sr_common_x86_pv.h b/tools/libs/saverestore/common_x86_pv.h
similarity index 98%
rename from tools/libs/guest/xg_sr_common_x86_pv.h
rename to tools/libs/saverestore/common_x86_pv.h
index 953b5bfb8d..a9f8c970e3 100644
--- a/tools/libs/guest/xg_sr_common_x86_pv.h
+++ b/tools/libs/saverestore/common_x86_pv.h
@@ -1,7 +1,7 @@
 #ifndef __COMMON_X86_PV_H
 #define __COMMON_X86_PV_H
 
-#include "xg_sr_common_x86.h"
+#include "common_x86.h"
 
 /* Virtual address ranges reserved for hypervisor. */
 #define HYPERVISOR_VIRT_START_X86_64 0xFFFF800000000000ULL
diff --git a/tools/libs/guest/xg_nomigrate.c b/tools/libs/saverestore/nomigrate.c
similarity index 98%
rename from tools/libs/guest/xg_nomigrate.c
rename to tools/libs/saverestore/nomigrate.c
index 6795c62ddc..67e58d353a 100644
--- a/tools/libs/guest/xg_nomigrate.c
+++ b/tools/libs/saverestore/nomigrate.c
@@ -18,7 +18,7 @@
 #include <inttypes.h>
 #include <errno.h>
 #include <xenctrl.h>
-#include <xenguest.h>
+#include <xensaverestore.h>
 
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t flags,
                    struct save_callbacks *callbacks,
diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/saverestore/restore.c
similarity index 99%
rename from tools/libs/guest/xg_sr_restore.c
rename to tools/libs/saverestore/restore.c
index b57a787519..be259a1c6b 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -2,7 +2,7 @@
 
 #include <assert.h>
 
-#include "xg_sr_common.h"
+#include "common.h"
 
 /*
  * Read and validate the Image and Domain headers.
diff --git a/tools/libs/guest/xg_sr_restore_x86_hvm.c b/tools/libs/saverestore/restore_x86_hvm.c
similarity index 99%
rename from tools/libs/guest/xg_sr_restore_x86_hvm.c
rename to tools/libs/saverestore/restore_x86_hvm.c
index d6ea6f3012..bd63bd2818 100644
--- a/tools/libs/guest/xg_sr_restore_x86_hvm.c
+++ b/tools/libs/saverestore/restore_x86_hvm.c
@@ -1,7 +1,7 @@
 #include <assert.h>
 #include <arpa/inet.h>
 
-#include "xg_sr_common_x86.h"
+#include "common_x86.h"
 
 /*
  * Process an HVM_CONTEXT record from the stream.
diff --git a/tools/libs/guest/xg_sr_restore_x86_pv.c b/tools/libs/saverestore/restore_x86_pv.c
similarity index 99%
rename from tools/libs/guest/xg_sr_restore_x86_pv.c
rename to tools/libs/saverestore/restore_x86_pv.c
index dc50b0f5a8..96608e5231 100644
--- a/tools/libs/guest/xg_sr_restore_x86_pv.c
+++ b/tools/libs/saverestore/restore_x86_pv.c
@@ -1,6 +1,6 @@
 #include <assert.h>
 
-#include "xg_sr_common_x86_pv.h"
+#include "common_x86_pv.h"
 
 static xen_pfn_t pfn_to_mfn(const struct xc_sr_context *ctx, xen_pfn_t pfn)
 {
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/saverestore/save.c
similarity index 99%
rename from tools/libs/guest/xg_sr_save.c
rename to tools/libs/saverestore/save.c
index 2ba7c3200c..ae3e8797d0 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/saverestore/save.c
@@ -1,7 +1,7 @@
 #include <assert.h>
 #include <arpa/inet.h>
 
-#include "xg_sr_common.h"
+#include "common.h"
 
 /*
  * Writes an Image header and Domain header into the stream.
diff --git a/tools/libs/guest/xg_save_restore.h b/tools/libs/saverestore/save_restore.h
similarity index 98%
rename from tools/libs/guest/xg_save_restore.h
rename to tools/libs/saverestore/save_restore.h
index 3dbbc8dcd2..20bd3d30a5 100644
--- a/tools/libs/guest/xg_save_restore.h
+++ b/tools/libs/saverestore/save_restore.h
@@ -15,8 +15,6 @@
  * License along with this library; If not, see <http://www.gnu.org/licenses/>.
  */
 
-#include "xc_private.h"
-
 #include <xen/foreign/x86_32.h>
 #include <xen/foreign/x86_64.h>
 
diff --git a/tools/libs/guest/xg_sr_save_x86_hvm.c b/tools/libs/saverestore/save_x86_hvm.c
similarity index 99%
rename from tools/libs/guest/xg_sr_save_x86_hvm.c
rename to tools/libs/saverestore/save_x86_hvm.c
index 1634a7bc43..91c2cb99ab 100644
--- a/tools/libs/guest/xg_sr_save_x86_hvm.c
+++ b/tools/libs/saverestore/save_x86_hvm.c
@@ -1,6 +1,6 @@
 #include <assert.h>
 
-#include "xg_sr_common_x86.h"
+#include "common_x86.h"
 
 #include <xen/hvm/params.h>
 
diff --git a/tools/libs/guest/xg_sr_save_x86_pv.c b/tools/libs/saverestore/save_x86_pv.c
similarity index 99%
rename from tools/libs/guest/xg_sr_save_x86_pv.c
rename to tools/libs/saverestore/save_x86_pv.c
index 4964f1f7b8..92f77fad0f 100644
--- a/tools/libs/guest/xg_sr_save_x86_pv.c
+++ b/tools/libs/saverestore/save_x86_pv.c
@@ -1,7 +1,7 @@
 #include <assert.h>
 #include <limits.h>
 
-#include "xg_sr_common_x86_pv.h"
+#include "common_x86_pv.h"
 
 /* Check a 64 bit virtual address for being canonical. */
 static inline bool is_canonical_address(xen_vaddr_t vaddr)
diff --git a/tools/libs/guest/xg_sr_stream_format.h b/tools/libs/saverestore/stream_format.h
similarity index 100%
rename from tools/libs/guest/xg_sr_stream_format.h
rename to tools/libs/saverestore/stream_format.h
diff --git a/tools/libs/uselibs.mk b/tools/libs/uselibs.mk
index efd7a475ba..62a2990b95 100644
--- a/tools/libs/uselibs.mk
+++ b/tools/libs/uselibs.mk
@@ -20,6 +20,8 @@ LIBS_LIBS += ctrl
 USELIBS_ctrl := toollog call evtchn gnttab foreignmemory devicemodel
 LIBS_LIBS += guest
 USELIBS_guest := evtchn ctrl
+LIBS_LIBS += saverestore
+USELIBS_saverestore := guest ctrl
 LIBS_LIBS += store
 USELIBS_store := toolcore
 LIBS_LIBS += vchan
@@ -27,7 +29,7 @@ USELIBS_vchan := toollog store gnttab evtchn
 LIBS_LIBS += stat
 USELIBS_stat := ctrl store
 LIBS_LIBS += light
-USELIBS_light := toollog evtchn toolcore ctrl store hypfs guest
+USELIBS_light := toollog evtchn toolcore ctrl store hypfs guest saverestore
 LIBS_LIBS += util
 USELIBS_util := light
 FILENAME_util := xlutil
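
For readers following the header move: the precopy_policy hook documented in
xensaverestore.h can be illustrated with a small standalone sketch. The struct
and the XGS_POLICY_* values are copied from the header in this patch; the
limits struct, its thresholds, and the example_* names are hypothetical and
only there to make the sketch compile on its own. It is not taken from libxl.

```c
#include <assert.h>
#include <stddef.h>

/* Copied from the xensaverestore.h declarations in this patch. */
struct precopy_stats
{
    unsigned int iteration;
    unsigned long total_written;
    long dirty_count; /* -1 if unknown */
};

#define XGS_POLICY_ABORT          (-1)
#define XGS_POLICY_CONTINUE_PRECOPY 0
#define XGS_POLICY_STOP_AND_COPY    1

typedef int (*precopy_policy_t)(struct precopy_stats, void *);

/* Hypothetical tunables, passed through the callback's void *data. */
struct example_policy_limits
{
    unsigned int max_iterations; /* give up on precopy after this many rounds */
    long dirty_threshold;        /* few enough dirty pages to stop and copy */
};

/* Stats arrive by value, as the header comment requires. */
static int example_precopy_policy(struct precopy_stats stats, void *data)
{
    const struct example_policy_limits *lim = data;

    if ( lim == NULL )
        return XGS_POLICY_ABORT;

    /* A known, small dirty set: suspend and send the remainder. */
    if ( stats.dirty_count >= 0 && stats.dirty_count < lim->dirty_threshold )
        return XGS_POLICY_STOP_AND_COPY;

    /* Not converging: stop iterating regardless of the dirty count. */
    if ( stats.iteration >= lim->max_iterations )
        return XGS_POLICY_STOP_AND_COPY;

    return XGS_POLICY_CONTINUE_PRECOPY;
}

/* Helper that calls the policy through the typedef with fixed example limits. */
static int example_decide(unsigned int iteration, long dirty_count)
{
    static struct example_policy_limits lim = { 5, 50 };
    const precopy_policy_t policy = example_precopy_policy;
    struct precopy_stats stats = { iteration, 0, dirty_count };

    return policy(stats, &lim);
}
```

A caller would store such a function in save_callbacks.precopy_policy and the
limits pointer in save_callbacks.data before invoking xc_domain_save().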



* [PATCH v20210701 08/40] MAINTAINERS: add myself as saverestore maintainer
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (6 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 07/40] tools: create libxensaverestore Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01 10:39   ` Jan Beulich
  2021-07-01  9:56 ` [PATCH v20210701 09/40] tools: add readv_exact to libxenctrl Olaf Hering
                   ` (31 subsequent siblings)
  39 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel
  Cc: Olaf Hering, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu

I touched it last.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 MAINTAINERS | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8a52a03969..36dc634958 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -381,6 +381,12 @@ R:	Juergen Gross <jgross@suse.com>
 S:	Supported
 F:	tools/libs/
 
+LIBSAVERESTORE:
+M:	Olaf Hering <olaf@aepfle.de>
+S:	Supported
+F:	tools/include/xensaverestore.h
+F:	tools/libs/saverestore/
+
 LIBXENLIGHT
 M:	Ian Jackson <iwj@xenproject.org>
 M:	Wei Liu <wl@xen.org>



* [PATCH v20210701 09/40] tools: add readv_exact to libxenctrl
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (7 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 08/40] MAINTAINERS: add myself as saverestore maintainer Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 10/40] tools: add xc_is_known_page_type " Olaf Hering
                   ` (30 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu, Juergen Gross

Read a batch of iovecs with readv().

Short reads are the common case, so a partially filled trailing iov is
completed with read_exact().

Signed-off-by: Olaf Hering <olaf@aepfle.de>

v2:
- add comment to short-read handling
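
The short-read handling described above can be tried outside of libxenctrl
with the standalone sketch below. It mirrors the approach of the non-MiniOS
variant, but sketch_read_exact(), sketch_readv_exact() and demo_short_read()
are hypothetical names for illustration and are not part of this patch. The
demo forks a writer that sends the data in two pieces so that the first
readv() is likely to return short.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#ifndef IOV_MAX
#define IOV_MAX 1024 /* fallback, assumption for platforms that hide it */
#endif

/* Stand-in for read_exact(): loop until exactly size bytes have arrived. */
static int sketch_read_exact(int fd, void *data, size_t size)
{
    char *p = data;

    while ( size > 0 )
    {
        ssize_t len = read(fd, p, size);

        if ( len == -1 && errno == EINTR )
            continue;
        if ( len == 0 )
            errno = 0; /* EOF */
        if ( len <= 0 )
            return -1;
        p += len;
        size -= (size_t)len;
    }
    return 0;
}

/* Fill every iov completely, or fail with -1. */
static int sketch_readv_exact(int fd, const struct iovec *iov, int iovcnt)
{
    int idx = 0;

    while ( idx < iovcnt )
    {
        int batch = iovcnt - idx > IOV_MAX ? IOV_MAX : iovcnt - idx;
        ssize_t len = readv(fd, &iov[idx], batch);

        if ( len == -1 && errno == EINTR )
            continue;
        if ( len <= 0 )
            return -1;

        /* Skip fully read iovs, finish a partially filled one. */
        while ( len > 0 && idx < iovcnt )
        {
            size_t got = (size_t)len;

            if ( got >= iov[idx].iov_len )
                len -= (ssize_t)iov[idx].iov_len;
            else
            {
                char *p = (char *)iov[idx].iov_base + got;

                if ( sketch_read_exact(fd, p, iov[idx].iov_len - got) )
                    return -1;
                len = 0;
            }
            idx++;
        }
    }
    return 0;
}

/* Returns 0 if ten bytes sent in two writes land correctly in three iovs. */
static int demo_short_read(void)
{
    char a[4], b[3], c[3];
    struct iovec iov[3] = { { a, sizeof(a) }, { b, sizeof(b) }, { c, sizeof(c) } };
    struct timespec pause = { 0, 50 * 1000 * 1000 };
    int fds[2], rc;
    pid_t pid;

    if ( pipe(fds) )
        return -1;
    pid = fork();
    if ( pid < 0 )
        return -1;
    if ( pid == 0 )
    {
        /* Child: the pause makes a short first readv() in the parent likely. */
        close(fds[0]);
        if ( write(fds[1], "01234", 5) != 5 )
            _exit(1);
        nanosleep(&pause, NULL);
        if ( write(fds[1], "56789", 5) != 5 )
            _exit(1);
        _exit(0);
    }

    close(fds[1]);
    rc = sketch_readv_exact(fds[0], iov, 3);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if ( rc )
        return -1;

    return (memcmp(a, "0123", 4) || memcmp(b, "456", 3) ||
            memcmp(c, "789", 3)) ? -1 : 0;
}
```

The result does not depend on timing: whether readv() returns 5 or 10 bytes,
the three buffers end up with the same content.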
---
 tools/libs/ctrl/xc_private.c | 57 +++++++++++++++++++++++++++++++++++-
 tools/libs/ctrl/xc_private.h |  1 +
 2 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/tools/libs/ctrl/xc_private.c b/tools/libs/ctrl/xc_private.c
index c0422662f0..bab9a31a70 100644
--- a/tools/libs/ctrl/xc_private.c
+++ b/tools/libs/ctrl/xc_private.c
@@ -698,8 +698,23 @@ int write_exact(int fd, const void *data, size_t size)
 
 #if defined(__MINIOS__)
 /*
- * MiniOS's libc doesn't know about writev(). Implement it as multiple write()s.
+ * MiniOS's libc doesn't know about readv/writev().
+ * Implement it as multiple read/write()s.
  */
+int readv_exact(int fd, const struct iovec *iov, int iovcnt)
+{
+    int rc, i;
+
+    for ( i = 0; i < iovcnt; ++i )
+    {
+        rc = read_exact(fd, iov[i].iov_base, iov[i].iov_len);
+        if ( rc )
+            return rc;
+    }
+
+    return 0;
+}
+
 int writev_exact(int fd, const struct iovec *iov, int iovcnt)
 {
     int rc, i;
@@ -714,6 +729,46 @@ int writev_exact(int fd, const struct iovec *iov, int iovcnt)
     return 0;
 }
 #else
+int readv_exact(int fd, const struct iovec *iov, int iovcnt)
+{
+    int rc = 0, idx = 0;
+    ssize_t len;
+
+    while ( idx < iovcnt )
+    {
+        len = readv(fd, &iov[idx], min(iovcnt - idx, IOV_MAX));
+        if ( len == -1 && errno == EINTR )
+            continue;
+        if ( len <= 0 )
+        {
+            rc = -1;
+            goto out;
+        }
+
+        /* Finish a potential short read in the last iov */
+        while ( len > 0 && idx < iovcnt )
+        {
+            if ( len >= iov[idx].iov_len )
+            {
+                len -= iov[idx].iov_len;
+            }
+            else
+            {
+                void *p = iov[idx].iov_base + len;
+                size_t l = iov[idx].iov_len - len;
+
+                rc = read_exact(fd, p, l);
+                if ( rc )
+                    goto out;
+                len = 0;
+            }
+            idx++;
+        }
+    }
+out:
+    return rc;
+}
+
 int writev_exact(int fd, const struct iovec *iov, int iovcnt)
 {
     struct iovec *local_iov = NULL;
diff --git a/tools/libs/ctrl/xc_private.h b/tools/libs/ctrl/xc_private.h
index 3e299b943f..66086ef19f 100644
--- a/tools/libs/ctrl/xc_private.h
+++ b/tools/libs/ctrl/xc_private.h
@@ -410,6 +410,7 @@ int xc_flush_mmu_updates(xc_interface *xch, struct xc_mmu *mmu);
 
 /* Return 0 on success; -1 on error setting errno. */
 int read_exact(int fd, void *data, size_t size); /* EOF => -1, errno=0 */
+int readv_exact(int fd, const struct iovec *iov, int iovcnt);
 int write_exact(int fd, const void *data, size_t size);
 int writev_exact(int fd, const struct iovec *iov, int iovcnt);
 



* [PATCH v20210701 10/40] tools: add xc_is_known_page_type to libxenctrl
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (8 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 09/40] tools: add readv_exact to libxenctrl Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-02 19:20   ` Andrew Cooper
  2021-07-01  9:56 ` [PATCH v20210701 11/40] tools: use sr_is_known_page_type Olaf Hering
                   ` (29 subsequent siblings)
  39 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu, Juergen Gross

Users of xc_get_pfn_type_batch may want to sanity check the data
returned by Xen. Add a simple helper for this purpose.

Signed-off-by: Olaf Hering <olaf@aepfle.de>

v02:
- rename xc_is_known_page_type to sr_is_known_page_type
- move from ctrl/xc_private.h to saverestore/common.h (jgross)
---
 tools/libs/saverestore/common.h | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index ca2eb47a4f..07c506360c 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -467,6 +467,39 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
 /* Handle a STATIC_DATA_END record. */
 int handle_static_data_end(struct xc_sr_context *ctx);
 
+/* Sanity check for types returned by Xen */
+static inline bool sr_is_known_page_type(xen_pfn_t type)
+{
+    bool ret;
+
+    switch (type)
+    {
+    case XEN_DOMCTL_PFINFO_NOTAB:
+
+    case XEN_DOMCTL_PFINFO_L1TAB:
+    case XEN_DOMCTL_PFINFO_L1TAB | XEN_DOMCTL_PFINFO_LPINTAB:
+
+    case XEN_DOMCTL_PFINFO_L2TAB:
+    case XEN_DOMCTL_PFINFO_L2TAB | XEN_DOMCTL_PFINFO_LPINTAB:
+
+    case XEN_DOMCTL_PFINFO_L3TAB:
+    case XEN_DOMCTL_PFINFO_L3TAB | XEN_DOMCTL_PFINFO_LPINTAB:
+
+    case XEN_DOMCTL_PFINFO_L4TAB:
+    case XEN_DOMCTL_PFINFO_L4TAB | XEN_DOMCTL_PFINFO_LPINTAB:
+
+    case XEN_DOMCTL_PFINFO_XTAB:
+    case XEN_DOMCTL_PFINFO_XALLOC: /* Synthetic type in Xen 4.2 - 4.5 */
+    case XEN_DOMCTL_PFINFO_BROKEN:
+        ret = true;
+        break;
+    default:
+        ret = false;
+        break;
+    }
+    return ret;
+}
+
 #endif
 /*
  * Local variables:



* [PATCH v20210701 11/40] tools: use sr_is_known_page_type
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (9 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 10/40] tools: add xc_is_known_page_type " Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-02 19:27   ` Andrew Cooper
  2021-07-01  9:56 ` [PATCH v20210701 12/40] tools: unify type checking for data pfns in migration stream Olaf Hering
                   ` (28 subsequent siblings)
  39 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

Verify pfn type on sending side, also verify incoming batch of pfns.
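
On the restore side the type is taken from the top nibble of each
stream entry, so only 16 values are possible. The sketch below checks
that the removed range test and the case list of the new helper reject
the same values. It is an illustration only: the PFINFO_* constants are
local stand-ins assumed to mirror xen/include/public/domctl.h, and the
function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins, assumed to mirror the XEN_DOMCTL_PFINFO_* values. */
#define PFINFO_LTAB_SHIFT 28
#define PFINFO_NOTAB   (0x0U << 28)
#define PFINFO_L1TAB   (0x1U << 28)
#define PFINFO_L2TAB   (0x2U << 28)
#define PFINFO_L3TAB   (0x3U << 28)
#define PFINFO_L4TAB   (0x4U << 28)
#define PFINFO_LPINTAB (0x1U << 31)
#define PFINFO_BROKEN  (0xdU << 28)
#define PFINFO_XALLOC  (0xeU << 28)
#define PFINFO_XTAB    (0xfU << 28)

/* The open-coded test removed from handle_page_data(): true if invalid. */
bool old_check_rejects(uint32_t type)
{
    return (type >> PFINFO_LTAB_SHIFT) >= 5 &&
           (type >> PFINFO_LTAB_SHIFT) <= 8;
}

/* Same case list as sr_is_known_page_type() in this series. */
bool known_page_type(uint32_t type)
{
    switch ( type )
    {
    case PFINFO_NOTAB:
    case PFINFO_L1TAB:
    case PFINFO_L1TAB | PFINFO_LPINTAB:
    case PFINFO_L2TAB:
    case PFINFO_L2TAB | PFINFO_LPINTAB:
    case PFINFO_L3TAB:
    case PFINFO_L3TAB | PFINFO_LPINTAB:
    case PFINFO_L4TAB:
    case PFINFO_L4TAB | PFINFO_LPINTAB:
    case PFINFO_XTAB:
    case PFINFO_XALLOC:
    case PFINFO_BROKEN:
        return true;
    default:
        return false;
    }
}

/* True if both checks agree for all 16 values of the type nibble. */
bool checks_agree(void)
{
    uint32_t n;

    for ( n = 0; n < 16; n++ )
    {
        uint32_t type = n << PFINFO_LTAB_SHIFT;

        /* "rejected by the old test" must equal "not known to the helper" */
        if ( old_check_rejects(type) == known_page_type(type) )
            return false;
    }
    return true;
}
```

The values 5 to 8 are the unused table levels plus a pinned page without
a table type, which is why neither check accepts them.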

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>

v02:
- use sr_is_known_page_type instead of xc_is_known_page_type
---
 tools/libs/saverestore/restore.c | 3 +--
 tools/libs/saverestore/save.c    | 6 ++++++
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index be259a1c6b..324b9050e2 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -406,8 +406,7 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
         }
 
         type = (pages->pfn[i] & PAGE_DATA_TYPE_MASK) >> 32;
-        if ( ((type >> XEN_DOMCTL_PFINFO_LTAB_SHIFT) >= 5) &&
-             ((type >> XEN_DOMCTL_PFINFO_LTAB_SHIFT) <= 8) )
+        if ( sr_is_known_page_type(type) == false )
         {
             ERROR("Invalid type %#"PRIx32" for pfn %#"PRIpfn" (index %u)",
                   type, pfn, i);
diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
index ae3e8797d0..6f820ea432 100644
--- a/tools/libs/saverestore/save.c
+++ b/tools/libs/saverestore/save.c
@@ -147,6 +147,12 @@ static int write_batch(struct xc_sr_context *ctx)
 
     for ( i = 0; i < nr_pfns; ++i )
     {
+        if ( sr_is_known_page_type(types[i]) == false )
+        {
+            ERROR("Wrong type %#"PRIpfn" for pfn %#"PRIpfn, types[i], mfns[i]);
+            goto err;
+        }
+
         switch ( types[i] )
         {
         case XEN_DOMCTL_PFINFO_BROKEN:



* [PATCH v20210701 12/40] tools: unify type checking for data pfns in migration stream
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (10 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 11/40] tools: use sr_is_known_page_type Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-02 19:43   ` Andrew Cooper
  2021-07-05 13:10   ` Andrew Cooper
  2021-07-01  9:56 ` [PATCH v20210701 13/40] " Olaf Hering
                   ` (27 subsequent siblings)
  39 siblings, 2 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu, Juergen Gross

Introduce a helper which decides if a given pfn in the migration
stream is backed by memory.

This specifically deals with type XEN_DOMCTL_PFINFO_XALLOC, which was
a synthetic toolstack-only type used in Xen 4.2 to 4.5. It indicated a
dirty page on the sending side for which no data will be sent in the
initial iteration.

No change in behavior intended.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 tools/libs/saverestore/common.h  | 17 +++++++++++++++++
 tools/libs/saverestore/restore.c |  5 ++---
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 07c506360c..fa242e808d 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -500,6 +500,23 @@ static inline bool sr_is_known_page_type(xen_pfn_t type)
     return ret;
 }
 
+static inline bool page_type_to_populate(uint32_t type)
+{
+    bool ret;
+
+    switch (type)
+    {
+    case XEN_DOMCTL_PFINFO_XTAB:
+    case XEN_DOMCTL_PFINFO_BROKEN:
+        ret = false;
+        break;
+    case XEN_DOMCTL_PFINFO_XALLOC:
+    default:
+        ret = true;
+        break;
+    }
+    return ret;
+}
 #endif
 /*
  * Local variables:
diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index 324b9050e2..477b7527a1 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -152,9 +152,8 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
 
     for ( i = 0; i < count; ++i )
     {
-        if ( (!types || (types &&
-                         (types[i] != XEN_DOMCTL_PFINFO_XTAB &&
-                          types[i] != XEN_DOMCTL_PFINFO_BROKEN))) &&
+        if ( (!types ||
+              (types && page_type_to_populate(types[i]) == true)) &&
              !pfn_is_populated(ctx, original_pfns[i]) )
         {
             rc = pfn_set_populated(ctx, original_pfns[i]);



* [PATCH v20210701 13/40] tools: unify type checking for data pfns in migration stream
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (11 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 12/40] tools: unify type checking for data pfns in migration stream Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-02 19:49   ` Andrew Cooper
  2021-07-01  9:56 ` [PATCH v20210701 14/40] tools: show migration transfer rate in send_dirty_pages Olaf Hering
                   ` (26 subsequent siblings)
  39 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu, Juergen Gross

Introduce a helper which decides if a given pfn type has data
in the migration stream.

No change in behavior intended.
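
The two helpers in this series differ only for XALLOC: such a pfn gets
populated, but carries no page data in the record. A minimal sketch of
that difference, counting both properties over a batch of types. The
PFINFO_* constants are local stand-ins assumed to mirror the public Xen
header, and the counting functions are hypothetical, not part of the
patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins, assumed to mirror the XEN_DOMCTL_PFINFO_* values. */
#define PFINFO_NOTAB   (0x0U << 28)
#define PFINFO_L1TAB   (0x1U << 28)
#define PFINFO_LPINTAB (0x1U << 31)
#define PFINFO_BROKEN  (0xdU << 28)
#define PFINFO_XALLOC  (0xeU << 28)
#define PFINFO_XTAB    (0xfU << 28)

/* Mirrors page_type_to_populate(): is the pfn backed by memory? */
bool type_to_populate(uint32_t type)
{
    return type != PFINFO_XTAB && type != PFINFO_BROKEN;
}

/* Mirrors page_type_has_stream_data(): does the record carry a page? */
bool type_has_stream_data(uint32_t type)
{
    return type != PFINFO_XTAB && type != PFINFO_BROKEN &&
           type != PFINFO_XALLOC;
}

/* Number of pfns in the batch that need to be populated. */
unsigned int count_to_populate(const uint32_t *types, unsigned int count)
{
    unsigned int i, nr = 0;

    for ( i = 0; i < count; ++i )
        if ( type_to_populate(types[i]) )
            nr++;
    return nr;
}

/* Number of pages of data expected in the record for this batch. */
unsigned int count_pages_of_data(const uint32_t *types, unsigned int count)
{
    unsigned int i, nr = 0;

    for ( i = 0; i < count; ++i )
        if ( type_has_stream_data(types[i]) )
            nr++;
    return nr;
}
```

For a batch of NOTAB, XTAB, XALLOC, pinned L1TAB and BROKEN, three pfns
are populated but only two pages of data are expected.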

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 tools/libs/saverestore/common.h  | 18 ++++++++++++++++++
 tools/libs/saverestore/restore.c | 29 +++--------------------------
 tools/libs/saverestore/save.c    | 14 ++------------
 3 files changed, 23 insertions(+), 38 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index fa242e808d..905b4078f6 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -517,6 +517,24 @@ static inline bool page_type_to_populate(uint32_t type)
     }
     return ret;
 }
+
+static inline bool page_type_has_stream_data(uint32_t type)
+{
+    bool ret;
+
+    switch (type)
+    {
+    case XEN_DOMCTL_PFINFO_BROKEN:
+    case XEN_DOMCTL_PFINFO_XALLOC:
+    case XEN_DOMCTL_PFINFO_XTAB:
+        ret = false;
+        break;
+    default:
+        ret = true;
+        break;
+    }
+    return ret;
+}
 #endif
 /*
  * Local variables:
diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index 477b7527a1..799170c7a1 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -232,25 +232,8 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count,
     {
         ctx->restore.ops.set_page_type(ctx, pfns[i], types[i]);
 
-        switch ( types[i] )
-        {
-        case XEN_DOMCTL_PFINFO_NOTAB:
-
-        case XEN_DOMCTL_PFINFO_L1TAB:
-        case XEN_DOMCTL_PFINFO_L1TAB | XEN_DOMCTL_PFINFO_LPINTAB:
-
-        case XEN_DOMCTL_PFINFO_L2TAB:
-        case XEN_DOMCTL_PFINFO_L2TAB | XEN_DOMCTL_PFINFO_LPINTAB:
-
-        case XEN_DOMCTL_PFINFO_L3TAB:
-        case XEN_DOMCTL_PFINFO_L3TAB | XEN_DOMCTL_PFINFO_LPINTAB:
-
-        case XEN_DOMCTL_PFINFO_L4TAB:
-        case XEN_DOMCTL_PFINFO_L4TAB | XEN_DOMCTL_PFINFO_LPINTAB:
-
+        if ( page_type_has_stream_data(types[i]) == true )
             mfns[nr_pages++] = ctx->restore.ops.pfn_to_gfn(ctx, pfns[i]);
-            break;
-        }
     }
 
     /* Nothing to do? */
@@ -270,14 +253,8 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count,
 
     for ( i = 0, j = 0; i < count; ++i )
     {
-        switch ( types[i] )
-        {
-        case XEN_DOMCTL_PFINFO_XTAB:
-        case XEN_DOMCTL_PFINFO_BROKEN:
-        case XEN_DOMCTL_PFINFO_XALLOC:
-            /* No page data to deal with. */
+        if ( page_type_has_stream_data(types[i]) == false )
             continue;
-        }
 
         if ( map_errs[j] )
         {
@@ -412,7 +389,7 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
             goto err;
         }
 
-        if ( type < XEN_DOMCTL_PFINFO_BROKEN )
+        if ( page_type_has_stream_data(type) == true )
             /* NOTAB and all L1 through L4 tables (including pinned) should
              * have a page worth of data in the record. */
             pages_of_data++;
diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
index 6f820ea432..12598bd4e2 100644
--- a/tools/libs/saverestore/save.c
+++ b/tools/libs/saverestore/save.c
@@ -153,13 +153,8 @@ static int write_batch(struct xc_sr_context *ctx)
             goto err;
         }
 
-        switch ( types[i] )
-        {
-        case XEN_DOMCTL_PFINFO_BROKEN:
-        case XEN_DOMCTL_PFINFO_XALLOC:
-        case XEN_DOMCTL_PFINFO_XTAB:
+        if ( page_type_has_stream_data(types[i]) == false )
             continue;
-        }
 
         mfns[nr_pages++] = mfns[i];
     }
@@ -177,13 +172,8 @@ static int write_batch(struct xc_sr_context *ctx)
 
         for ( i = 0, p = 0; i < nr_pfns; ++i )
         {
-            switch ( types[i] )
-            {
-            case XEN_DOMCTL_PFINFO_BROKEN:
-            case XEN_DOMCTL_PFINFO_XALLOC:
-            case XEN_DOMCTL_PFINFO_XTAB:
+            if ( page_type_has_stream_data(types[i]) == false )
                 continue;
-            }
 
             if ( errors[p] )
             {



* [PATCH v20210701 14/40] tools: show migration transfer rate in send_dirty_pages
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (12 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 13/40] " Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 15/40] tools: prepare to allocate saverestore arrays once Olaf Hering
                   ` (25 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu, Juergen Gross

Show how fast domU pages are transferred in each iteration.

The relevant data is how fast the pfns travel, not so much how much
protocol overhead exists. So the reported MiB/sec is just for pfns.
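
As a worked example of the arithmetic, a standalone sketch of the rate
calculation. It assumes 4 KiB pages and uses 64-bit integers; the names
elapsed_ms and mib_per_sec are hypothetical and only mirror what
show_transfer_rate does in the diff below.

```c
#include <stdint.h>
#include <time.h>

#define SKETCH_PAGE_SIZE 4096U /* assumption: 4 KiB pages */

/* Elapsed whole milliseconds between two monotonic timestamps, min 1. */
uint64_t elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    int64_t ns = (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL +
                 (end->tv_nsec - start->tv_nsec);
    uint64_t ms = (uint64_t)(ns / 1000000LL);

    /* Avoid a division by zero for very short intervals. */
    return ms ? ms : 1;
}

/* Page payload only, protocol overhead is not part of the rate. */
uint64_t mib_per_sec(uint64_t pages, uint64_t ms)
{
    return (pages * SKETCH_PAGE_SIZE * 1000U) / ms / (1024U * 1024U);
}
```

With 2097552 pages sent in 24401 ms this gives 335 MiB/sec: the payload
is 2097552 / 256 = 8193.56 MiB, divided by 24.401 sec, rounded down.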

Signed-off-by: Olaf Hering <olaf@aepfle.de>

v02:
- rearrange MiB_sec calculation (jgross)
---
 tools/libs/saverestore/common.h |  2 ++
 tools/libs/saverestore/save.c   | 46 +++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 905b4078f6..252076cf51 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -250,6 +250,8 @@ struct xc_sr_context
             bool debug;
 
             unsigned long p2m_size;
+            size_t pages_sent;
+            size_t overhead_sent;
 
             struct precopy_stats stats;
 
diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
index 12598bd4e2..f8fbe7a742 100644
--- a/tools/libs/saverestore/save.c
+++ b/tools/libs/saverestore/save.c
@@ -1,5 +1,6 @@
 #include <assert.h>
 #include <arpa/inet.h>
+#include <time.h>
 
 #include "common.h"
 
@@ -238,6 +239,8 @@ static int write_batch(struct xc_sr_context *ctx)
     iov[3].iov_len = nr_pfns * sizeof(*rec_pfns);
 
     iovcnt = 4;
+    ctx->save.pages_sent += nr_pages;
+    ctx->save.overhead_sent += sizeof(rec) + sizeof(hdr) + nr_pfns * sizeof(*rec_pfns);
 
     if ( nr_pages )
     {
@@ -357,6 +360,42 @@ static int suspend_domain(struct xc_sr_context *ctx)
     return 0;
 }
 
+static void show_transfer_rate(struct xc_sr_context *ctx, struct timespec *start)
+{
+    xc_interface *xch = ctx->xch;
+    struct timespec end = {}, diff = {};
+    size_t ms, MiB_sec;
+
+    if (!ctx->save.pages_sent)
+        return;
+
+    if ( clock_gettime(CLOCK_MONOTONIC, &end) )
+        PERROR("clock_gettime");
+
+    if ( (end.tv_nsec - start->tv_nsec) < 0 )
+    {
+        diff.tv_sec = end.tv_sec - start->tv_sec - 1;
+        diff.tv_nsec = end.tv_nsec - start->tv_nsec + (1000U*1000U*1000U);
+    }
+    else
+    {
+        diff.tv_sec = end.tv_sec - start->tv_sec;
+        diff.tv_nsec = end.tv_nsec - start->tv_nsec;
+    }
+
+    ms = (diff.tv_nsec / (1000U*1000U));
+    ms += (diff.tv_sec * 1000U);
+    if (!ms)
+        ms = 1;
+
+    MiB_sec = (ctx->save.pages_sent * PAGE_SIZE * 1000U) / ms / (1024U*1024U);
+
+    errno = 0;
+    IPRINTF("%s: %zu bytes + %zu pages in %ld.%09ld sec, %zu MiB/sec", __func__,
+            ctx->save.overhead_sent, ctx->save.pages_sent,
+            diff.tv_sec, diff.tv_nsec, MiB_sec);
+}
+
 /*
  * Send a subset of pages in the guests p2m, according to the dirty bitmap.
  * Used for each subsequent iteration of the live migration loop.
@@ -370,9 +409,15 @@ static int send_dirty_pages(struct xc_sr_context *ctx,
     xen_pfn_t p;
     unsigned long written;
     int rc;
+    struct timespec start = {};
     DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
                                     &ctx->save.dirty_bitmap_hbuf);
 
+    ctx->save.pages_sent = 0;
+    ctx->save.overhead_sent = 0;
+    if ( clock_gettime(CLOCK_MONOTONIC, &start) )
+        PERROR("clock_gettime");
+
     for ( p = 0, written = 0; p < ctx->save.p2m_size; ++p )
     {
         if ( !test_bit(p, dirty_bitmap) )
@@ -396,6 +441,7 @@ static int send_dirty_pages(struct xc_sr_context *ctx,
     if ( written > entries )
         DPRINTF("Bitmap contained more entries than expected...");
 
+    show_transfer_rate(ctx, &start);
     xc_report_progress_step(xch, entries, entries);
 
     return ctx->save.ops.check_vm_state(ctx);



* [PATCH v20210701 15/40] tools: prepare to allocate saverestore arrays once
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (13 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 14/40] tools: show migration transfer rate in send_dirty_pages Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-05 10:44   ` Andrew Cooper
  2021-07-01  9:56 ` [PATCH v20210701 16/40] tools: save: move mfns array Olaf Hering
                   ` (24 subsequent siblings)
  39 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu, Juergen Gross

The hotpath 'send_dirty_pages' is supposed to do just one thing: sending.
The other end 'handle_page_data' is supposed to do just receiving.

Instead, both also do other costly work such as memory allocation and data copying.
Do the allocations once; the array sizes are compile-time constants.
Avoid unneeded copying by receiving data directly into mapped guest memory.

This patch is just preparation; subsequent changes will populate the arrays.

Once all changes are applied, migration of a busy HVM domU changes as follows:

Without this series, from sr650 to sr950 (xen-4.15.20201027T173911.16a20963b3 xen_testing):
2020-10-29 10:23:10.711+0000: xc: show_transfer_rate: 23663128 bytes + 2879563 pages in 55.324905335 sec, 203 MiB/sec: Internal error
2020-10-29 10:23:35.115+0000: xc: show_transfer_rate: 16829632 bytes + 2097552 pages in 24.401179720 sec, 335 MiB/sec: Internal error
2020-10-29 10:23:59.436+0000: xc: show_transfer_rate: 16829032 bytes + 2097478 pages in 24.319025928 sec, 336 MiB/sec: Internal error
2020-10-29 10:24:23.844+0000: xc: show_transfer_rate: 16829024 bytes + 2097477 pages in 24.406992500 sec, 335 MiB/sec: Internal error
2020-10-29 10:24:48.292+0000: xc: show_transfer_rate: 16828912 bytes + 2097463 pages in 24.446489027 sec, 335 MiB/sec: Internal error
2020-10-29 10:25:01.816+0000: xc: show_transfer_rate: 16836080 bytes + 2098356 pages in 13.447091818 sec, 609 MiB/sec: Internal error

With this series, from sr650 to sr950 (xen-4.15.20201027T173911.16a20963b3 xen_unstable):
2020-10-28 21:26:05.074+0000: xc: show_transfer_rate: 23663128 bytes + 2879563 pages in 52.564054368 sec, 213 MiB/sec: Internal error
2020-10-28 21:26:23.527+0000: xc: show_transfer_rate: 16830040 bytes + 2097603 pages in 18.450592015 sec, 444 MiB/sec: Internal error
2020-10-28 21:26:41.926+0000: xc: show_transfer_rate: 16830944 bytes + 2097717 pages in 18.397862306 sec, 445 MiB/sec: Internal error
2020-10-28 21:27:00.339+0000: xc: show_transfer_rate: 16829176 bytes + 2097498 pages in 18.411973339 sec, 445 MiB/sec: Internal error
2020-10-28 21:27:18.643+0000: xc: show_transfer_rate: 16828592 bytes + 2097425 pages in 18.303326695 sec, 447 MiB/sec: Internal error
2020-10-28 21:27:26.289+0000: xc: show_transfer_rate: 16835952 bytes + 2098342 pages in 7.579846749 sec, 1081 MiB/sec: Internal error

Note: the performance improvement depends on the network cards used,
the wire speed and the host:
- No improvement is expected with a 1G link.
- Improvement can be seen as shown above on a 10G link.
- Just a slight improvement can be seen on a 100G link.

This change also populates sr_save_arrays with "batch_pfns", and
sr_restore_arrays with "pfns" to make sure malloc is always called
with a non-zero value.
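
The pattern can be sketched in isolation as follows. This is a
simplified illustration, not the code of this patch: all sketch_* names
and SKETCH_MAX_BATCH are hypothetical, and the real structures hold
xen_pfn_t arrays sized by MAX_BATCH_SIZE.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_BATCH 1024 /* hypothetical stand-in for MAX_BATCH_SIZE */

/* All per-batch arrays live in one struct with compile-time sizes. */
struct sketch_arrays {
    uint64_t batch_pfns[SKETCH_MAX_BATCH];
    uint64_t mfns[SKETCH_MAX_BATCH];
};

struct sketch_ctx {
    struct sketch_arrays *m;
    unsigned int nr_batch_pfns;
};

/* One allocation at setup time, its size is never zero. */
int sketch_setup(struct sketch_ctx *ctx)
{
    ctx->nr_batch_pfns = 0;
    ctx->m = malloc(sizeof(*ctx->m));
    return ctx->m ? 0 : -1;
}

/* Hot path: only stores into preallocated space, no malloc/free. */
int sketch_add_to_batch(struct sketch_ctx *ctx, uint64_t pfn)
{
    if ( ctx->nr_batch_pfns == SKETCH_MAX_BATCH )
        return -1; /* caller is expected to flush the batch first */

    ctx->m->batch_pfns[ctx->nr_batch_pfns++] = pfn;
    return 0;
}

/* One free at cleanup time. */
void sketch_cleanup(struct sketch_ctx *ctx)
{
    free(ctx->m);
    ctx->m = NULL;
}
```

Later patches in the series add further members to the struct, so the
number of allocations stays at one per direction.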

Signed-off-by: Olaf Hering <olaf@aepfle.de>

v02:
- rename xc_sr_save_arrays to sr_save_arrays
- rename xc_sr_restore_arrays to sr_restore_arrays
- merge handling of "batch_pfns" and "pfns" to make sure malloc is
  called with a non-zero size value (jgross)
---
 tools/libs/saverestore/common.h  | 12 +++++++++++-
 tools/libs/saverestore/restore.c | 14 ++++++++++----
 tools/libs/saverestore/save.c    | 27 +++++++++++++--------------
 3 files changed, 34 insertions(+), 19 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 252076cf51..968bb8af13 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -223,6 +223,15 @@ static inline int update_blob(struct xc_sr_blob *blob,
     return 0;
 }
 
+struct sr_save_arrays {
+    xen_pfn_t batch_pfns[MAX_BATCH_SIZE];
+};
+
+struct sr_restore_arrays {
+    /* handle_page_data */
+    xen_pfn_t pfns[MAX_BATCH_SIZE];
+};
+
 struct xc_sr_context
 {
     xc_interface *xch;
@@ -255,11 +264,11 @@ struct xc_sr_context
 
             struct precopy_stats stats;
 
-            xen_pfn_t *batch_pfns;
             unsigned int nr_batch_pfns;
             unsigned long *deferred_pages;
             unsigned long nr_deferred_pages;
             xc_hypercall_buffer_t dirty_bitmap_hbuf;
+            struct sr_save_arrays *m;
         } save;
 
         struct /* Restore data. */
@@ -311,6 +320,7 @@ struct xc_sr_context
 
             /* Sender has invoked verify mode on the stream. */
             bool verify;
+            struct sr_restore_arrays *m;
         } restore;
     };
 
diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index 799170c7a1..c203ce503d 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -315,7 +315,7 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
     unsigned int i, pages_of_data = 0;
     int rc = -1;
 
-    xen_pfn_t *pfns = NULL, pfn;
+    xen_pfn_t *pfns = ctx->restore.m->pfns, pfn;
     uint32_t *types = NULL, type;
 
     /*
@@ -363,9 +363,8 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
         goto err;
     }
 
-    pfns = malloc(pages->count * sizeof(*pfns));
     types = malloc(pages->count * sizeof(*types));
-    if ( !pfns || !types )
+    if ( !types )
     {
         ERROR("Unable to allocate enough memory for %u pfns",
               pages->count);
@@ -412,7 +411,6 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
                            &pages->pfn[pages->count]);
  err:
     free(types);
-    free(pfns);
 
     return rc;
 }
@@ -739,6 +737,13 @@ static int setup(struct xc_sr_context *ctx)
     }
     ctx->restore.allocated_rec_num = DEFAULT_BUF_RECORDS;
 
+    ctx->restore.m = malloc(sizeof(*ctx->restore.m));
+    if ( !ctx->restore.m ) {
+        ERROR("Unable to allocate memory for arrays");
+        rc = -1;
+        goto err;
+    }
+
  err:
     return rc;
 }
@@ -757,6 +762,7 @@ static void cleanup(struct xc_sr_context *ctx)
         xc_hypercall_buffer_free_pages(
             xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->restore.p2m_size)));
 
+    free(ctx->restore.m);
     free(ctx->restore.buffered_records);
     free(ctx->restore.populated_pfns);
 
diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
index f8fbe7a742..e29b6e1d66 100644
--- a/tools/libs/saverestore/save.c
+++ b/tools/libs/saverestore/save.c
@@ -77,7 +77,7 @@ static int write_checkpoint_record(struct xc_sr_context *ctx)
 
 /*
  * Writes a batch of memory as a PAGE_DATA record into the stream.  The batch
- * is constructed in ctx->save.batch_pfns.
+ * is constructed in ctx->save.m->batch_pfns.
  *
  * This function:
  * - gets the types for each pfn in the batch.
@@ -128,12 +128,12 @@ static int write_batch(struct xc_sr_context *ctx)
     for ( i = 0; i < nr_pfns; ++i )
     {
         types[i] = mfns[i] = ctx->save.ops.pfn_to_gfn(ctx,
-                                                      ctx->save.batch_pfns[i]);
+                                                      ctx->save.m->batch_pfns[i]);
 
         /* Likely a ballooned page. */
         if ( mfns[i] == INVALID_MFN )
         {
-            set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
+            set_bit(ctx->save.m->batch_pfns[i], ctx->save.deferred_pages);
             ++ctx->save.nr_deferred_pages;
         }
     }
@@ -179,7 +179,7 @@ static int write_batch(struct xc_sr_context *ctx)
             if ( errors[p] )
             {
                 ERROR("Mapping of pfn %#"PRIpfn" (mfn %#"PRIpfn") failed %d",
-                      ctx->save.batch_pfns[i], mfns[p], errors[p]);
+                      ctx->save.m->batch_pfns[i], mfns[p], errors[p]);
                 goto err;
             }
 
@@ -193,7 +193,7 @@ static int write_batch(struct xc_sr_context *ctx)
             {
                 if ( rc == -1 && errno == EAGAIN )
                 {
-                    set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
+                    set_bit(ctx->save.m->batch_pfns[i], ctx->save.deferred_pages);
                     ++ctx->save.nr_deferred_pages;
                     types[i] = XEN_DOMCTL_PFINFO_XTAB;
                     --nr_pages;
@@ -224,7 +224,7 @@ static int write_batch(struct xc_sr_context *ctx)
     rec.length += nr_pages * PAGE_SIZE;
 
     for ( i = 0; i < nr_pfns; ++i )
-        rec_pfns[i] = ((uint64_t)(types[i]) << 32) | ctx->save.batch_pfns[i];
+        rec_pfns[i] = ((uint64_t)(types[i]) << 32) | ctx->save.m->batch_pfns[i];
 
     iov[0].iov_base = &rec.type;
     iov[0].iov_len = sizeof(rec.type);
@@ -296,9 +296,9 @@ static int flush_batch(struct xc_sr_context *ctx)
 
     if ( !rc )
     {
-        VALGRIND_MAKE_MEM_UNDEFINED(ctx->save.batch_pfns,
+        VALGRIND_MAKE_MEM_UNDEFINED(ctx->save.m->batch_pfns,
                                     MAX_BATCH_SIZE *
-                                    sizeof(*ctx->save.batch_pfns));
+                                    sizeof(*ctx->save.m->batch_pfns));
     }
 
     return rc;
@@ -315,7 +315,7 @@ static int add_to_batch(struct xc_sr_context *ctx, xen_pfn_t pfn)
         rc = flush_batch(ctx);
 
     if ( rc == 0 )
-        ctx->save.batch_pfns[ctx->save.nr_batch_pfns++] = pfn;
+        ctx->save.m->batch_pfns[ctx->save.nr_batch_pfns++] = pfn;
 
     return rc;
 }
@@ -849,13 +849,12 @@ static int setup(struct xc_sr_context *ctx)
 
     dirty_bitmap = xc_hypercall_buffer_alloc_pages(
         xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->save.p2m_size)));
-    ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE *
-                                  sizeof(*ctx->save.batch_pfns));
     ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size);
+    ctx->save.m = malloc(sizeof(*ctx->save.m));
 
-    if ( !ctx->save.batch_pfns || !dirty_bitmap || !ctx->save.deferred_pages )
+    if ( !ctx->save.m || !dirty_bitmap || !ctx->save.deferred_pages )
     {
-        ERROR("Unable to allocate memory for dirty bitmaps, batch pfns and"
+        ERROR("Unable to allocate memory for dirty bitmaps and"
               " deferred pages");
         rc = -1;
         errno = ENOMEM;
@@ -884,7 +883,7 @@ static void cleanup(struct xc_sr_context *ctx)
     xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
                                    NRPAGES(bitmap_size(ctx->save.p2m_size)));
     free(ctx->save.deferred_pages);
-    free(ctx->save.batch_pfns);
+    free(ctx->save.m);
 }
 
 /*



* [PATCH v20210701 16/40] tools: save: move mfns array
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (14 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 15/40] tools: prepare to allocate saverestore arrays once Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 17/40] tools: save: move types array Olaf Hering
                   ` (23 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

Remove allocation from hotpath, move mfns array into preallocated space.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 tools/libs/saverestore/common.h | 2 ++
 tools/libs/saverestore/save.c   | 7 ++-----
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 968bb8af13..1415a182d2 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -225,6 +225,8 @@ static inline int update_blob(struct xc_sr_blob *blob,
 
 struct sr_save_arrays {
     xen_pfn_t batch_pfns[MAX_BATCH_SIZE];
+    /* write_batch: Mfns of the batch pfns. */
+    xen_pfn_t mfns[MAX_BATCH_SIZE];
 };
 
 struct sr_restore_arrays {
diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
index e29b6e1d66..6b09784be8 100644
--- a/tools/libs/saverestore/save.c
+++ b/tools/libs/saverestore/save.c
@@ -88,7 +88,7 @@ static int write_checkpoint_record(struct xc_sr_context *ctx)
 static int write_batch(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
-    xen_pfn_t *mfns = NULL, *types = NULL;
+    xen_pfn_t *mfns = ctx->save.m->mfns, *types = NULL;
     void *guest_mapping = NULL;
     void **guest_data = NULL;
     void **local_pages = NULL;
@@ -105,8 +105,6 @@ static int write_batch(struct xc_sr_context *ctx)
 
     assert(nr_pfns != 0);
 
-    /* Mfns of the batch pfns. */
-    mfns = malloc(nr_pfns * sizeof(*mfns));
     /* Types of the batch pfns. */
     types = malloc(nr_pfns * sizeof(*types));
     /* Errors from attempting to map the gfns. */
@@ -118,7 +116,7 @@ static int write_batch(struct xc_sr_context *ctx)
     /* iovec[] for writev(). */
     iov = malloc((nr_pfns + 4) * sizeof(*iov));
 
-    if ( !mfns || !types || !errors || !guest_data || !local_pages || !iov )
+    if ( !types || !errors || !guest_data || !local_pages || !iov )
     {
         ERROR("Unable to allocate arrays for a batch of %u pages",
               nr_pfns);
@@ -277,7 +275,6 @@ static int write_batch(struct xc_sr_context *ctx)
     free(guest_data);
     free(errors);
     free(types);
-    free(mfns);
 
     return rc;
 }



* [PATCH v20210701 17/40] tools: save: move types array
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (15 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 16/40] tools: save: move mfns array Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 18/40] tools: save: move errors array Olaf Hering
                   ` (22 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

Remove allocation from hotpath, move types array into preallocated space.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 tools/libs/saverestore/common.h | 2 ++
 tools/libs/saverestore/save.c   | 7 ++-----
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 1415a182d2..5bd2913cb6 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -227,6 +227,8 @@ struct sr_save_arrays {
     xen_pfn_t batch_pfns[MAX_BATCH_SIZE];
     /* write_batch: Mfns of the batch pfns. */
     xen_pfn_t mfns[MAX_BATCH_SIZE];
+    /* write_batch: Types of the batch pfns. */
+    xen_pfn_t types[MAX_BATCH_SIZE];
 };
 
 struct sr_restore_arrays {
diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
index 6b09784be8..0883c1fac0 100644
--- a/tools/libs/saverestore/save.c
+++ b/tools/libs/saverestore/save.c
@@ -88,7 +88,7 @@ static int write_checkpoint_record(struct xc_sr_context *ctx)
 static int write_batch(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
-    xen_pfn_t *mfns = ctx->save.m->mfns, *types = NULL;
+    xen_pfn_t *mfns = ctx->save.m->mfns, *types = ctx->save.m->types;
     void *guest_mapping = NULL;
     void **guest_data = NULL;
     void **local_pages = NULL;
@@ -105,8 +105,6 @@ static int write_batch(struct xc_sr_context *ctx)
 
     assert(nr_pfns != 0);
 
-    /* Types of the batch pfns. */
-    types = malloc(nr_pfns * sizeof(*types));
     /* Errors from attempting to map the gfns. */
     errors = malloc(nr_pfns * sizeof(*errors));
     /* Pointers to page data to send.  Mapped gfns or local allocations. */
@@ -116,7 +114,7 @@ static int write_batch(struct xc_sr_context *ctx)
     /* iovec[] for writev(). */
     iov = malloc((nr_pfns + 4) * sizeof(*iov));
 
-    if ( !types || !errors || !guest_data || !local_pages || !iov )
+    if ( !errors || !guest_data || !local_pages || !iov )
     {
         ERROR("Unable to allocate arrays for a batch of %u pages",
               nr_pfns);
@@ -274,7 +272,6 @@ static int write_batch(struct xc_sr_context *ctx)
     free(local_pages);
     free(guest_data);
     free(errors);
-    free(types);
 
     return rc;
 }



* [PATCH v20210701 18/40] tools: save: move errors array
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (16 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 17/40] tools: save: move types array Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 19/40] tools: save: move iov array Olaf Hering
                   ` (21 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

Remove allocation from hotpath, move errors array into preallocated space.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 tools/libs/saverestore/common.h | 2 ++
 tools/libs/saverestore/save.c   | 7 ++-----
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 5bd2913cb6..25ee8fcb0f 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -229,6 +229,8 @@ struct sr_save_arrays {
     xen_pfn_t mfns[MAX_BATCH_SIZE];
     /* write_batch: Types of the batch pfns. */
     xen_pfn_t types[MAX_BATCH_SIZE];
+    /* write_batch: Errors from attempting to map the gfns. */
+    int errors[MAX_BATCH_SIZE];
 };
 
 struct sr_restore_arrays {
diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
index 0883c1fac0..9ebbf00ce7 100644
--- a/tools/libs/saverestore/save.c
+++ b/tools/libs/saverestore/save.c
@@ -92,7 +92,7 @@ static int write_batch(struct xc_sr_context *ctx)
     void *guest_mapping = NULL;
     void **guest_data = NULL;
     void **local_pages = NULL;
-    int *errors = NULL, rc = -1;
+    int *errors = ctx->save.m->errors, rc = -1;
     unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0;
     unsigned int nr_pfns = ctx->save.nr_batch_pfns;
     void *page, *orig_page;
@@ -105,8 +105,6 @@ static int write_batch(struct xc_sr_context *ctx)
 
     assert(nr_pfns != 0);
 
-    /* Errors from attempting to map the gfns. */
-    errors = malloc(nr_pfns * sizeof(*errors));
     /* Pointers to page data to send.  Mapped gfns or local allocations. */
     guest_data = calloc(nr_pfns, sizeof(*guest_data));
     /* Pointers to locally allocated pages.  Need freeing. */
@@ -114,7 +112,7 @@ static int write_batch(struct xc_sr_context *ctx)
     /* iovec[] for writev(). */
     iov = malloc((nr_pfns + 4) * sizeof(*iov));
 
-    if ( !errors || !guest_data || !local_pages || !iov )
+    if ( !guest_data || !local_pages || !iov )
     {
         ERROR("Unable to allocate arrays for a batch of %u pages",
               nr_pfns);
@@ -271,7 +269,6 @@ static int write_batch(struct xc_sr_context *ctx)
     free(iov);
     free(local_pages);
     free(guest_data);
-    free(errors);
 
     return rc;
 }



* [PATCH v20210701 19/40] tools: save: move iov array
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (17 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 18/40] tools: save: move errors array Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 20/40] tools: save: move rec_pfns array Olaf Hering
                   ` (20 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

Remove allocation from hotpath, move iov array into preallocated space.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 tools/libs/saverestore/common.h | 2 ++
 tools/libs/saverestore/save.c   | 7 ++-----
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 25ee8fcb0f..c8a30acf7b 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -231,6 +231,8 @@ struct sr_save_arrays {
     xen_pfn_t types[MAX_BATCH_SIZE];
     /* write_batch: Errors from attempting to map the gfns. */
     int errors[MAX_BATCH_SIZE];
+    /* write_batch: iovec[] for writev(). */
+    struct iovec iov[MAX_BATCH_SIZE + 4];
 };
 
 struct sr_restore_arrays {
diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
index 9ebbf00ce7..1a5f3d29ea 100644
--- a/tools/libs/saverestore/save.c
+++ b/tools/libs/saverestore/save.c
@@ -97,7 +97,7 @@ static int write_batch(struct xc_sr_context *ctx)
     unsigned int nr_pfns = ctx->save.nr_batch_pfns;
     void *page, *orig_page;
     uint64_t *rec_pfns = NULL;
-    struct iovec *iov = NULL; int iovcnt = 0;
+    struct iovec *iov = ctx->save.m->iov; int iovcnt = 0;
     struct xc_sr_rec_page_data_header hdr = { 0 };
     struct xc_sr_record rec = {
         .type = REC_TYPE_PAGE_DATA,
@@ -109,10 +109,8 @@ static int write_batch(struct xc_sr_context *ctx)
     guest_data = calloc(nr_pfns, sizeof(*guest_data));
     /* Pointers to locally allocated pages.  Need freeing. */
     local_pages = calloc(nr_pfns, sizeof(*local_pages));
-    /* iovec[] for writev(). */
-    iov = malloc((nr_pfns + 4) * sizeof(*iov));
 
-    if ( !guest_data || !local_pages || !iov )
+    if ( !guest_data || !local_pages )
     {
         ERROR("Unable to allocate arrays for a batch of %u pages",
               nr_pfns);
@@ -266,7 +264,6 @@ static int write_batch(struct xc_sr_context *ctx)
         xenforeignmemory_unmap(xch->fmem, guest_mapping, nr_pages_mapped);
     for ( i = 0; local_pages && i < nr_pfns; ++i )
         free(local_pages[i]);
-    free(iov);
     free(local_pages);
     free(guest_data);
 



* [PATCH v20210701 20/40] tools: save: move rec_pfns array
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (18 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 19/40] tools: save: move iov array Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 21/40] tools: save: move guest_data array Olaf Hering
                   ` (19 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

Remove allocation from hotpath, move rec_pfns array into preallocated space.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 tools/libs/saverestore/common.h |  2 ++
 tools/libs/saverestore/save.c   | 11 +----------
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index c8a30acf7b..3994ab3844 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -233,6 +233,8 @@ struct sr_save_arrays {
     int errors[MAX_BATCH_SIZE];
     /* write_batch: iovec[] for writev(). */
     struct iovec iov[MAX_BATCH_SIZE + 4];
+    /* write_batch */
+    uint64_t rec_pfns[MAX_BATCH_SIZE];
 };
 
 struct sr_restore_arrays {
diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
index 1a5f3d29ea..0f02988ff9 100644
--- a/tools/libs/saverestore/save.c
+++ b/tools/libs/saverestore/save.c
@@ -96,7 +96,7 @@ static int write_batch(struct xc_sr_context *ctx)
     unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0;
     unsigned int nr_pfns = ctx->save.nr_batch_pfns;
     void *page, *orig_page;
-    uint64_t *rec_pfns = NULL;
+    uint64_t *rec_pfns = ctx->save.m->rec_pfns;
     struct iovec *iov = ctx->save.m->iov; int iovcnt = 0;
     struct xc_sr_rec_page_data_header hdr = { 0 };
     struct xc_sr_record rec = {
@@ -201,14 +201,6 @@ static int write_batch(struct xc_sr_context *ctx)
         }
     }
 
-    rec_pfns = malloc(nr_pfns * sizeof(*rec_pfns));
-    if ( !rec_pfns )
-    {
-        ERROR("Unable to allocate %zu bytes of memory for page data pfn list",
-              nr_pfns * sizeof(*rec_pfns));
-        goto err;
-    }
-
     hdr.count = nr_pfns;
 
     rec.length = sizeof(hdr);
@@ -259,7 +251,6 @@ static int write_batch(struct xc_sr_context *ctx)
     rc = ctx->save.nr_batch_pfns = 0;
 
  err:
-    free(rec_pfns);
     if ( guest_mapping )
         xenforeignmemory_unmap(xch->fmem, guest_mapping, nr_pages_mapped);
     for ( i = 0; local_pages && i < nr_pfns; ++i )



* [PATCH v20210701 21/40] tools: save: move guest_data array
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (19 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 20/40] tools: save: move rec_pfns array Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 22/40] tools: save: move local_pages array Olaf Hering
                   ` (18 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

Remove allocation from hotpath, move guest_data array into preallocated space.

The array was previously allocated with calloc, so every entry started out
as NULL. Adjust the loop to clear unused entries explicitly instead.
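A small sketch of why the explicit clearing is needed once the buffer is reused. This is an illustration only: demo_guest_data, demo_fill and DEMO_BATCH are hypothetical names, and the has_data flags stand in for the page type check done in write_batch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BATCH 4 /* hypothetical batch size for the illustration */

/* Reused across calls, so entries keep whatever the previous batch stored. */
static void *demo_guest_data[DEMO_BATCH];

/*
 * Point each slot that carries data at the next page, and explicitly clear
 * the others. With a fresh calloc() per call the clearing was implicit.
 * Returns the number of slots that received a page.
 */
static unsigned int demo_fill(const bool has_data[DEMO_BATCH], char *pages)
{
    unsigned int i, used = 0;

    for ( i = 0; i < DEMO_BATCH; ++i )
    {
        if ( !has_data[i] )
        {
            demo_guest_data[i] = NULL;
            continue;
        }

        demo_guest_data[i] = pages + used++;
    }

    return used;
}
```

Without the assignment to NULL, a slot skipped in the second batch would still hold the pointer stored by the first one.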

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 tools/libs/saverestore/common.h |  2 ++
 tools/libs/saverestore/save.c   | 11 ++++++-----
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 3994ab3844..c3570e0c9a 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -235,6 +235,8 @@ struct sr_save_arrays {
     struct iovec iov[MAX_BATCH_SIZE + 4];
     /* write_batch */
     uint64_t rec_pfns[MAX_BATCH_SIZE];
+    /* write_batch: Pointers to page data to send. Mapped gfns or local allocations. */
+    void *guest_data[MAX_BATCH_SIZE];
 };
 
 struct sr_restore_arrays {
diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
index 0f02988ff9..ea04cb1a74 100644
--- a/tools/libs/saverestore/save.c
+++ b/tools/libs/saverestore/save.c
@@ -90,7 +90,7 @@ static int write_batch(struct xc_sr_context *ctx)
     xc_interface *xch = ctx->xch;
     xen_pfn_t *mfns = ctx->save.m->mfns, *types = ctx->save.m->types;
     void *guest_mapping = NULL;
-    void **guest_data = NULL;
+    void **guest_data = ctx->save.m->guest_data;
     void **local_pages = NULL;
     int *errors = ctx->save.m->errors, rc = -1;
     unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0;
@@ -105,12 +105,10 @@ static int write_batch(struct xc_sr_context *ctx)
 
     assert(nr_pfns != 0);
 
-    /* Pointers to page data to send.  Mapped gfns or local allocations. */
-    guest_data = calloc(nr_pfns, sizeof(*guest_data));
     /* Pointers to locally allocated pages.  Need freeing. */
     local_pages = calloc(nr_pfns, sizeof(*local_pages));
 
-    if ( !guest_data || !local_pages )
+    if ( !local_pages )
     {
         ERROR("Unable to allocate arrays for a batch of %u pages",
               nr_pfns);
@@ -166,7 +164,10 @@ static int write_batch(struct xc_sr_context *ctx)
         for ( i = 0, p = 0; i < nr_pfns; ++i )
         {
             if ( page_type_has_stream_data(types[i]) == false )
+            {
+                guest_data[i] = NULL;
                 continue;
+            }
 
             if ( errors[p] )
             {
@@ -183,6 +184,7 @@ static int write_batch(struct xc_sr_context *ctx)
 
             if ( rc )
             {
+                guest_data[i] = NULL;
                 if ( rc == -1 && errno == EAGAIN )
                 {
                     set_bit(ctx->save.m->batch_pfns[i], ctx->save.deferred_pages);
@@ -256,7 +258,6 @@ static int write_batch(struct xc_sr_context *ctx)
     for ( i = 0; local_pages && i < nr_pfns; ++i )
         free(local_pages[i]);
     free(local_pages);
-    free(guest_data);
 
     return rc;
 }



* [PATCH v20210701 22/40] tools: save: move local_pages array
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (20 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 21/40] tools: save: move guest_data array Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 23/40] tools: restore: move types array Olaf Hering
                   ` (17 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

Remove allocation from hotpath, move local_pages array into preallocated space.

For HVM, adjust the code to use the src page as is. For PV, the page may
need to be normalised; use a private memory area, allocated once at setup,
for this purpose.
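The changed callback contract can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from this patch: demo_ctx, DEMO_PAGE_SIZE, DEMO_MAX_BATCH and both demo_*_normalise functions are hypothetical, and the "transformation" is a placeholder (copy the page and zero its first byte) rather than real pagetable normalisation.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096 /* hypothetical stand-in for PAGE_SIZE */
#define DEMO_MAX_BATCH 8    /* hypothetical stand-in for MAX_BATCH_SIZE */

struct demo_ctx {
    /* DEMO_MAX_BATCH pages, allocated once at setup, freed at cleanup. */
    char *normalised_pages;
};

/* HVM-like case: no transformation, hand the source page straight back. */
static int demo_hvm_normalise(struct demo_ctx *ctx, void *src,
                              unsigned int idx, void **ptr)
{
    (void)ctx;
    (void)idx;
    *ptr = src;
    return 0;
}

/* PV-like case: write the transformed copy into slot idx of the private
 * buffer. The source page is only read, never modified. */
static int demo_pv_normalise(struct demo_ctx *ctx, void *src,
                             unsigned int idx, void **ptr)
{
    char *dst;

    if ( idx >= DEMO_MAX_BATCH )
    {
        errno = ERANGE;
        return -1;
    }

    dst = ctx->normalised_pages + (size_t)idx * DEMO_PAGE_SIZE;
    memcpy(dst, src, DEMO_PAGE_SIZE);
    dst[0] = 0; /* placeholder for the real transformation */
    *ptr = dst;

    return 0;
}
```

Because each batch index owns a fixed slot in the private buffer, the caller no longer has to track which returned pointers need free().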

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 tools/libs/saverestore/common.h       | 22 ++++++++++---------
 tools/libs/saverestore/save.c         | 25 +++------------------
 tools/libs/saverestore/save_x86_hvm.c |  5 +++--
 tools/libs/saverestore/save_x86_pv.c  | 31 ++++++++++++++++++---------
 4 files changed, 39 insertions(+), 44 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index c3570e0c9a..8089449011 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -45,16 +45,12 @@ struct xc_sr_save_ops
      * Optionally transform the contents of a page from being specific to the
      * sending environment, to being generic for the stream.
      *
-     * The page of data at the end of 'page' may be a read-only mapping of a
-     * running guest; it must not be modified.  If no transformation is
-     * required, the callee should leave '*pages' untouched.
+     * The page of data '*src' may be a read-only mapping of a running guest;
+     * it must not be modified. If no transformation is required, the callee
+     * should leave '*src' untouched, and return it via '**ptr'.
      *
-     * If a transformation is required, the callee should allocate themselves
-     * a local page using malloc() and return it via '*page'.
-     *
-     * The caller shall free() '*page' in all cases.  In the case that the
-     * callee encounters an error, it should *NOT* free() the memory it
-     * allocated for '*page'.
+     * If a transformation is required, the callee should provide the
+     * transformed page in a private buffer and return it via '**ptr'.
      *
      * It is valid to fail with EAGAIN if the transformation is not able to be
      * completed at this point.  The page shall be retried later.
@@ -62,7 +58,7 @@ struct xc_sr_save_ops
      * @returns 0 for success, -1 for failure, with errno appropriately set.
      */
     int (*normalise_page)(struct xc_sr_context *ctx, xen_pfn_t type,
-                          void **page);
+                          void *src, unsigned int idx, void **ptr);
 
     /**
      * Set up local environment to save a domain. (Typically querying
@@ -385,6 +381,12 @@ struct xc_sr_context
 
                 union
                 {
+                    struct
+                    {
+                        /* Used by write_batch for modified pages. */
+                        void *normalised_pages;
+                    } save;
+
                     struct
                     {
                         /* State machine for the order of received records. */
diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
index ea04cb1a74..fa83648f9a 100644
--- a/tools/libs/saverestore/save.c
+++ b/tools/libs/saverestore/save.c
@@ -91,11 +91,10 @@ static int write_batch(struct xc_sr_context *ctx)
     xen_pfn_t *mfns = ctx->save.m->mfns, *types = ctx->save.m->types;
     void *guest_mapping = NULL;
     void **guest_data = ctx->save.m->guest_data;
-    void **local_pages = NULL;
     int *errors = ctx->save.m->errors, rc = -1;
     unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0;
     unsigned int nr_pfns = ctx->save.nr_batch_pfns;
-    void *page, *orig_page;
+    void *src;
     uint64_t *rec_pfns = ctx->save.m->rec_pfns;
     struct iovec *iov = ctx->save.m->iov; int iovcnt = 0;
     struct xc_sr_rec_page_data_header hdr = { 0 };
@@ -105,16 +104,6 @@ static int write_batch(struct xc_sr_context *ctx)
 
     assert(nr_pfns != 0);
 
-    /* Pointers to locally allocated pages.  Need freeing. */
-    local_pages = calloc(nr_pfns, sizeof(*local_pages));
-
-    if ( !local_pages )
-    {
-        ERROR("Unable to allocate arrays for a batch of %u pages",
-              nr_pfns);
-        goto err;
-    }
-
     for ( i = 0; i < nr_pfns; ++i )
     {
         types[i] = mfns[i] = ctx->save.ops.pfn_to_gfn(ctx,
@@ -176,11 +165,8 @@ static int write_batch(struct xc_sr_context *ctx)
                 goto err;
             }
 
-            orig_page = page = guest_mapping + (p * PAGE_SIZE);
-            rc = ctx->save.ops.normalise_page(ctx, types[i], &page);
-
-            if ( orig_page != page )
-                local_pages[i] = page;
+            src = guest_mapping + (p * PAGE_SIZE);
+            rc = ctx->save.ops.normalise_page(ctx, types[i], src, i, &guest_data[i]);
 
             if ( rc )
             {
@@ -195,8 +181,6 @@ static int write_batch(struct xc_sr_context *ctx)
                 else
                     goto err;
             }
-            else
-                guest_data[i] = page;
 
             rc = -1;
             ++p;
@@ -255,9 +239,6 @@ static int write_batch(struct xc_sr_context *ctx)
  err:
     if ( guest_mapping )
         xenforeignmemory_unmap(xch->fmem, guest_mapping, nr_pages_mapped);
-    for ( i = 0; local_pages && i < nr_pfns; ++i )
-        free(local_pages[i]);
-    free(local_pages);
 
     return rc;
 }
diff --git a/tools/libs/saverestore/save_x86_hvm.c b/tools/libs/saverestore/save_x86_hvm.c
index 91c2cb99ab..26f49ee267 100644
--- a/tools/libs/saverestore/save_x86_hvm.c
+++ b/tools/libs/saverestore/save_x86_hvm.c
@@ -129,9 +129,10 @@ static xen_pfn_t x86_hvm_pfn_to_gfn(const struct xc_sr_context *ctx,
     return pfn;
 }
 
-static int x86_hvm_normalise_page(struct xc_sr_context *ctx,
-                                  xen_pfn_t type, void **page)
+static int x86_hvm_normalise_page(struct xc_sr_context *ctx, xen_pfn_t type,
+                                  void *src, unsigned int idx, void **ptr)
 {
+    *ptr = src;
     return 0;
 }
 
diff --git a/tools/libs/saverestore/save_x86_pv.c b/tools/libs/saverestore/save_x86_pv.c
index 92f77fad0f..159ff59480 100644
--- a/tools/libs/saverestore/save_x86_pv.c
+++ b/tools/libs/saverestore/save_x86_pv.c
@@ -999,29 +999,31 @@ static xen_pfn_t x86_pv_pfn_to_gfn(const struct xc_sr_context *ctx,
  * save_ops function.  Performs pagetable normalisation on appropriate pages.
  */
 static int x86_pv_normalise_page(struct xc_sr_context *ctx, xen_pfn_t type,
-                                 void **page)
+                                  void *src, unsigned int idx, void **ptr)
 {
     xc_interface *xch = ctx->xch;
-    void *local_page;
     int rc;
+    void *dst;
 
     type &= XEN_DOMCTL_PFINFO_LTABTYPE_MASK;
 
     if ( type < XEN_DOMCTL_PFINFO_L1TAB || type > XEN_DOMCTL_PFINFO_L4TAB )
+    {
+        *ptr = src;
         return 0;
+    }
 
-    local_page = malloc(PAGE_SIZE);
-    if ( !local_page )
+    if ( idx >= MAX_BATCH_SIZE )
     {
-        ERROR("Unable to allocate scratch page");
-        rc = -1;
-        goto out;
+        ERROR("idx %u out of range", idx);
+        errno = ERANGE;
+        return -1;
     }
 
-    rc = normalise_pagetable(ctx, *page, local_page, type);
-    *page = local_page;
+    dst = ctx->x86.pv.save.normalised_pages + idx * PAGE_SIZE;
+    rc = normalise_pagetable(ctx, src, dst, type);
+    *ptr = dst;
 
- out:
     return rc;
 }
 
@@ -1031,8 +1033,16 @@ static int x86_pv_normalise_page(struct xc_sr_context *ctx, xen_pfn_t type,
  */
 static int x86_pv_setup(struct xc_sr_context *ctx)
 {
+    xc_interface *xch = ctx->xch;
     int rc;
 
+    ctx->x86.pv.save.normalised_pages = malloc(MAX_BATCH_SIZE * PAGE_SIZE);
+    if ( !ctx->x86.pv.save.normalised_pages )
+    {
+        PERROR("Failed to allocate normalised_pages");
+        return -1;
+    }
+
     rc = x86_pv_domain_info(ctx);
     if ( rc )
         return rc;
@@ -1118,6 +1128,7 @@ static int x86_pv_check_vm_state(struct xc_sr_context *ctx)
 
 static int x86_pv_cleanup(struct xc_sr_context *ctx)
 {
+    free(ctx->x86.pv.save.normalised_pages);
     free(ctx->x86.pv.p2m_pfns);
 
     if ( ctx->x86.pv.p2m )



* [PATCH v20210701 23/40] tools: restore: move types array
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (21 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 22/40] tools: save: move local_pages array Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 24/40] tools: restore: move mfns array Olaf Hering
                   ` (16 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

Remove allocation from hotpath, move types array into preallocated space.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 tools/libs/saverestore/common.h  |  1 +
 tools/libs/saverestore/restore.c | 12 +-----------
 2 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 8089449011..d798b79745 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -238,6 +238,7 @@ struct sr_save_arrays {
 struct sr_restore_arrays {
     /* handle_page_data */
     xen_pfn_t pfns[MAX_BATCH_SIZE];
+    uint32_t types[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_context
diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index c203ce503d..8ea125cf73 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -316,7 +316,7 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
     int rc = -1;
 
     xen_pfn_t *pfns = ctx->restore.m->pfns, pfn;
-    uint32_t *types = NULL, type;
+    uint32_t *types = ctx->restore.m->types, type;
 
     /*
      * v2 compatibility only exists for x86 streams.  This is a bit of a
@@ -363,14 +363,6 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
         goto err;
     }
 
-    types = malloc(pages->count * sizeof(*types));
-    if ( !types )
-    {
-        ERROR("Unable to allocate enough memory for %u pfns",
-              pages->count);
-        goto err;
-    }
-
     for ( i = 0; i < pages->count; ++i )
     {
         pfn = pages->pfn[i] & PAGE_DATA_PFN_MASK;
@@ -410,8 +402,6 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
     rc = process_page_data(ctx, pages->count, pfns, types,
                            &pages->pfn[pages->count]);
  err:
-    free(types);
-
     return rc;
 }
 



* [PATCH v20210701 24/40] tools: restore: move mfns array
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (22 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 23/40] tools: restore: move types array Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 25/40] tools: restore: move map_errs array Olaf Hering
                   ` (15 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

Remove allocation from hotpath, move mfns array into preallocated space.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 tools/libs/saverestore/common.h  | 2 ++
 tools/libs/saverestore/restore.c | 5 ++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index d798b79745..9d7efff03d 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -239,6 +239,8 @@ struct sr_restore_arrays {
     /* handle_page_data */
     xen_pfn_t pfns[MAX_BATCH_SIZE];
     uint32_t types[MAX_BATCH_SIZE];
+    /* process_page_data */
+    xen_pfn_t mfns[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_context
diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index 8ea125cf73..d7ea52b89e 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -205,7 +205,7 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count,
                              xen_pfn_t *pfns, uint32_t *types, void *page_data)
 {
     xc_interface *xch = ctx->xch;
-    xen_pfn_t *mfns = malloc(count * sizeof(*mfns));
+    xen_pfn_t *mfns = ctx->restore.m->mfns;
     int *map_errs = malloc(count * sizeof(*map_errs));
     int rc;
     void *mapping = NULL, *guest_page = NULL;
@@ -213,7 +213,7 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count,
         j,          /* j indexes the subset of pfns we decide to map. */
         nr_pages = 0;
 
-    if ( !mfns || !map_errs )
+    if ( !map_errs )
     {
         rc = -1;
         ERROR("Failed to allocate %zu bytes to process page data",
@@ -299,7 +299,6 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count,
         xenforeignmemory_unmap(xch->fmem, mapping, nr_pages);
 
     free(map_errs);
-    free(mfns);
 
     return rc;
 }



* [PATCH v20210701 25/40] tools: restore: move map_errs array
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (23 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 24/40] tools: restore: move mfns array Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 26/40] tools: restore: move mfns array in populate_pfns Olaf Hering
                   ` (14 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

Remove allocation from hotpath, move map_errs array into preallocated space.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 tools/libs/saverestore/common.h  |  1 +
 tools/libs/saverestore/restore.c | 12 +-----------
 2 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 9d7efff03d..7684c35e22 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -241,6 +241,7 @@ struct sr_restore_arrays {
     uint32_t types[MAX_BATCH_SIZE];
     /* process_page_data */
     xen_pfn_t mfns[MAX_BATCH_SIZE];
+    int map_errs[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_context
diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index d7ea52b89e..578ee1accb 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -206,21 +206,13 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count,
 {
     xc_interface *xch = ctx->xch;
     xen_pfn_t *mfns = ctx->restore.m->mfns;
-    int *map_errs = malloc(count * sizeof(*map_errs));
+    int *map_errs = ctx->restore.m->map_errs;
     int rc;
     void *mapping = NULL, *guest_page = NULL;
     unsigned int i, /* i indexes the pfns from the record. */
         j,          /* j indexes the subset of pfns we decide to map. */
         nr_pages = 0;
 
-    if ( !map_errs )
-    {
-        rc = -1;
-        ERROR("Failed to allocate %zu bytes to process page data",
-              count * (sizeof(*mfns) + sizeof(*map_errs)));
-        goto err;
-    }
-
     rc = populate_pfns(ctx, count, pfns, types);
     if ( rc )
     {
@@ -298,8 +290,6 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count,
     if ( mapping )
         xenforeignmemory_unmap(xch->fmem, mapping, nr_pages);
 
-    free(map_errs);
-
     return rc;
 }
 



* [PATCH v20210701 26/40] tools: restore: move mfns array in populate_pfns
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (24 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 25/40] tools: restore: move map_errs array Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 27/40] tools: restore: move pfns " Olaf Hering
                   ` (13 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

Remove allocation from hotpath, move populate_pfns mfns array into preallocated space.
Use a pp_ prefix to avoid a conflict with the mfns array used in process_page_data.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 tools/libs/saverestore/common.h  | 2 ++
 tools/libs/saverestore/restore.c | 5 ++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 7684c35e22..9d2ea96583 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -242,6 +242,8 @@ struct sr_restore_arrays {
     /* process_page_data */
     xen_pfn_t mfns[MAX_BATCH_SIZE];
     int map_errs[MAX_BATCH_SIZE];
+    /* populate_pfns */
+    xen_pfn_t pp_mfns[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_context
diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index 578ee1accb..7418abf1c5 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -138,12 +138,12 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
                   const xen_pfn_t *original_pfns, const uint32_t *types)
 {
     xc_interface *xch = ctx->xch;
-    xen_pfn_t *mfns = malloc(count * sizeof(*mfns)),
+    xen_pfn_t *mfns = ctx->restore.m->pp_mfns,
         *pfns = malloc(count * sizeof(*pfns));
     unsigned int i, nr_pfns = 0;
     int rc = -1;
 
-    if ( !mfns || !pfns )
+    if ( !pfns )
     {
         ERROR("Failed to allocate %zu bytes for populating the physmap",
               2 * count * sizeof(*mfns));
@@ -191,7 +191,6 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
 
  err:
     free(pfns);
-    free(mfns);
 
     return rc;
 }



* [PATCH v20210701 27/40] tools: restore: move pfns array in populate_pfns
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (25 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 26/40] tools: restore: move mfns array in populate_pfns Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 28/40] tools: restore: split record processing Olaf Hering
                   ` (12 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

Remove allocation from hotpath, move populate_pfns' pfns array into preallocated space.
Use some prefix to avoid conflict with an array used in handle_page_data.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 tools/libs/saverestore/common.h  |  1 +
 tools/libs/saverestore/restore.c | 11 +----------
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 9d2ea96583..c319148f8f 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -244,6 +244,7 @@ struct sr_restore_arrays {
     int map_errs[MAX_BATCH_SIZE];
     /* populate_pfns */
     xen_pfn_t pp_mfns[MAX_BATCH_SIZE];
+    xen_pfn_t pp_pfns[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_context
diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index 7418abf1c5..2a6ccce847 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -139,17 +139,10 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
 {
     xc_interface *xch = ctx->xch;
     xen_pfn_t *mfns = ctx->restore.m->pp_mfns,
-        *pfns = malloc(count * sizeof(*pfns));
+        *pfns = ctx->restore.m->pp_pfns;
     unsigned int i, nr_pfns = 0;
     int rc = -1;
 
-    if ( !pfns )
-    {
-        ERROR("Failed to allocate %zu bytes for populating the physmap",
-              2 * count * sizeof(*mfns));
-        goto err;
-    }
-
     for ( i = 0; i < count; ++i )
     {
         if ( (!types ||
@@ -190,8 +183,6 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
     rc = 0;
 
  err:
-    free(pfns);
-
     return rc;
 }
 



* [PATCH v20210701 28/40] tools: restore: split record processing
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (26 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 27/40] tools: restore: move pfns " Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 29/40] tools: restore: split handle_page_data Olaf Hering
                   ` (11 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Juergen Gross, Ian Jackson, Wei Liu

handle_page_data must be able to read directly into mapped guest memory.
This will avoid unnecessary memcpy calls for data which can be consumed verbatim.

Rearrange the code to allow decisions based on the incoming record.

This change is preparation for future changes in handle_page_data;
no change in behavior is intended.
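
As a rough standalone illustration of the header/body split (not the actual implementation): struct rec_hdr, REC_LEN_MAX, read_all, read_hdr and read_body are hypothetical names, and the sketch ignores the record padding that the real stream format applies to the body length.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical record header: a type followed by the body length. */
struct rec_hdr {
    uint32_t type;
    uint32_t length;
};

#define REC_LEN_MAX (1u << 20)  /* hypothetical upper bound for a body */

/* Read exactly len bytes; returns 0 on success, -1 on error or early EOF. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;

    while ( len )
    {
        ssize_t n = read(fd, p, len);

        if ( n <= 0 )
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Step 1: fetch and sanity check only the header. */
static int read_hdr(int fd, struct rec_hdr *hdr)
{
    if ( read_all(fd, hdr, sizeof(*hdr)) )
        return -1;
    return hdr->length > REC_LEN_MAX ? -1 : 0;
}

/*
 * Step 2: the generic consumer, reads the body into a fresh buffer.
 * A caller that has inspected hdr->type is free to skip this function
 * and read the body into some other destination instead.
 */
static int read_body(int fd, const struct rec_hdr *hdr, void **data)
{
    *data = NULL;
    if ( hdr->length == 0 )
        return 0;

    *data = malloc(hdr->length);
    if ( !*data )
        return -1;

    if ( read_all(fd, *data, hdr->length) )
    {
        free(*data);
        *data = NULL;
        return -1;
    }
    return 0;
}
```

Once the header is available on its own, the dispatch on the record type can happen before any buffer for the body exists, which is what allows a later handler to choose its own destination.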

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 tools/libs/saverestore/common.c  | 33 ++++++++++++---------
 tools/libs/saverestore/common.h  |  4 ++-
 tools/libs/saverestore/restore.c | 49 ++++++++++++++++++++++----------
 tools/libs/saverestore/save.c    |  7 ++++-
 4 files changed, 63 insertions(+), 30 deletions(-)

diff --git a/tools/libs/saverestore/common.c b/tools/libs/saverestore/common.c
index 77128bc747..7da7fa4e2c 100644
--- a/tools/libs/saverestore/common.c
+++ b/tools/libs/saverestore/common.c
@@ -91,26 +91,33 @@ int write_split_record(struct xc_sr_context *ctx, struct xc_sr_record *rec,
     return -1;
 }
 
-int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec)
+int read_record_header(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr)
 {
     xc_interface *xch = ctx->xch;
-    struct xc_sr_rhdr rhdr;
-    size_t datasz;
 
-    if ( read_exact(fd, &rhdr, sizeof(rhdr)) )
+    if ( read_exact(fd, rhdr, sizeof(*rhdr)) )
     {
         PERROR("Failed to read Record Header from stream");
         return -1;
     }
 
-    if ( rhdr.length > REC_LENGTH_MAX )
+    if ( rhdr->length > REC_LENGTH_MAX )
     {
-        ERROR("Record (0x%08x, %s) length %#x exceeds max (%#x)", rhdr.type,
-              rec_type_to_str(rhdr.type), rhdr.length, REC_LENGTH_MAX);
+        ERROR("Record (0x%08x, %s) length %#x exceeds max (%#x)", rhdr->type,
+              rec_type_to_str(rhdr->type), rhdr->length, REC_LENGTH_MAX);
         return -1;
     }
 
-    datasz = ROUNDUP(rhdr.length, REC_ALIGN_ORDER);
+    return 0;
+}
+
+int read_record_data(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr,
+                     struct xc_sr_record *rec)
+{
+    xc_interface *xch = ctx->xch;
+    size_t datasz;
+
+    datasz = ROUNDUP(rhdr->length, REC_ALIGN_ORDER);
 
     if ( datasz )
     {
@@ -119,7 +126,7 @@ int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec)
         if ( !rec->data )
         {
             ERROR("Unable to allocate %zu bytes for record data (0x%08x, %s)",
-                  datasz, rhdr.type, rec_type_to_str(rhdr.type));
+                  datasz, rhdr->type, rec_type_to_str(rhdr->type));
             return -1;
         }
 
@@ -128,18 +135,18 @@ int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec)
             free(rec->data);
             rec->data = NULL;
             PERROR("Failed to read %zu bytes of data for record (0x%08x, %s)",
-                   datasz, rhdr.type, rec_type_to_str(rhdr.type));
+                   datasz, rhdr->type, rec_type_to_str(rhdr->type));
             return -1;
         }
     }
     else
         rec->data = NULL;
 
-    rec->type   = rhdr.type;
-    rec->length = rhdr.length;
+    rec->type   = rhdr->type;
+    rec->length = rhdr->length;
 
     return 0;
-};
+}
 
 static void __attribute__((unused)) build_assertions(void)
 {
diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index c319148f8f..580eafacc8 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -487,7 +487,9 @@ static inline int write_record(struct xc_sr_context *ctx,
  *
  * On failure, the contents of the record structure are undefined.
  */
-int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec);
+int read_record_header(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr);
+int read_record_data(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr,
+                     struct xc_sr_record *rec);
 
 /*
  * This would ideally be private in restore.c, but is needed by
diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index 2a6ccce847..e75380155d 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -471,7 +471,7 @@ static int send_checkpoint_dirty_pfn_list(struct xc_sr_context *ctx)
     return rc;
 }
 
-static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec);
+static int process_buffered_record(struct xc_sr_context *ctx, struct xc_sr_record *rec);
 static int handle_checkpoint(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
@@ -510,7 +510,7 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
 
         for ( i = 0; i < ctx->restore.buffered_rec_num; i++ )
         {
-            rc = process_record(ctx, &ctx->restore.buffered_records[i]);
+            rc = process_buffered_record(ctx, &ctx->restore.buffered_records[i]);
             if ( rc )
                 goto err;
         }
@@ -571,10 +571,11 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
     return rc;
 }
 
-static int buffer_record(struct xc_sr_context *ctx, struct xc_sr_record *rec)
+static int buffer_record(struct xc_sr_context *ctx, struct xc_sr_rhdr *rhdr)
 {
     xc_interface *xch = ctx->xch;
     unsigned int new_alloc_num;
+    struct xc_sr_record rec;
     struct xc_sr_record *p;
 
     if ( ctx->restore.buffered_rec_num >= ctx->restore.allocated_rec_num )
@@ -592,8 +593,13 @@ static int buffer_record(struct xc_sr_context *ctx, struct xc_sr_record *rec)
         ctx->restore.allocated_rec_num = new_alloc_num;
     }
 
+    if ( read_record_data(ctx, ctx->fd, rhdr, &rec) )
+    {
+        return -1;
+    }
+
     memcpy(&ctx->restore.buffered_records[ctx->restore.buffered_rec_num++],
-           rec, sizeof(*rec));
+           &rec, sizeof(rec));
 
     return 0;
 }
@@ -624,7 +630,7 @@ int handle_static_data_end(struct xc_sr_context *ctx)
     return rc;
 }
 
-static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec)
+static int process_buffered_record(struct xc_sr_context *ctx, struct xc_sr_record *rec)
 {
     xc_interface *xch = ctx->xch;
     int rc = 0;
@@ -662,6 +668,19 @@ static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec)
     return rc;
 }
 
+static int process_incoming_record_header(struct xc_sr_context *ctx, struct xc_sr_rhdr *rhdr)
+{
+    struct xc_sr_record rec;
+    int rc;
+
+    rc = read_record_data(ctx, ctx->fd, rhdr, &rec);
+    if ( rc )
+        return rc;
+
+    return process_buffered_record(ctx, &rec);
+}
+
+
 static int setup(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
@@ -745,7 +764,7 @@ static void cleanup(struct xc_sr_context *ctx)
 static int restore(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
-    struct xc_sr_record rec;
+    struct xc_sr_rhdr rhdr;
     int rc, saved_rc = 0, saved_errno = 0;
 
     IPRINTF("Restoring domain");
@@ -756,7 +775,7 @@ static int restore(struct xc_sr_context *ctx)
 
     do
     {
-        rc = read_record(ctx, ctx->fd, &rec);
+        rc = read_record_header(ctx, ctx->fd, &rhdr);
         if ( rc )
         {
             if ( ctx->restore.buffer_all_records )
@@ -766,25 +785,25 @@ static int restore(struct xc_sr_context *ctx)
         }
 
         if ( ctx->restore.buffer_all_records &&
-             rec.type != REC_TYPE_END &&
-             rec.type != REC_TYPE_CHECKPOINT )
+             rhdr.type != REC_TYPE_END &&
+             rhdr.type != REC_TYPE_CHECKPOINT )
         {
-            rc = buffer_record(ctx, &rec);
+            rc = buffer_record(ctx, &rhdr);
             if ( rc )
                 goto err;
         }
         else
         {
-            rc = process_record(ctx, &rec);
+            rc = process_incoming_record_header(ctx, &rhdr);
             if ( rc == RECORD_NOT_PROCESSED )
             {
-                if ( rec.type & REC_TYPE_OPTIONAL )
+                if ( rhdr.type & REC_TYPE_OPTIONAL )
                     DPRINTF("Ignoring optional record %#x (%s)",
-                            rec.type, rec_type_to_str(rec.type));
+                            rhdr.type, rec_type_to_str(rhdr.type));
                 else
                 {
                     ERROR("Mandatory record %#x (%s) not handled",
-                          rec.type, rec_type_to_str(rec.type));
+                          rhdr.type, rec_type_to_str(rhdr.type));
                     rc = -1;
                     goto err;
                 }
@@ -795,7 +814,7 @@ static int restore(struct xc_sr_context *ctx)
                 goto err;
         }
 
-    } while ( rec.type != REC_TYPE_END );
+    } while ( rhdr.type != REC_TYPE_END );
 
  remus_failover:
     if ( ctx->stream_type == XC_STREAM_COLO )
diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
index fa83648f9a..e486bce96f 100644
--- a/tools/libs/saverestore/save.c
+++ b/tools/libs/saverestore/save.c
@@ -589,6 +589,7 @@ static int send_memory_live(struct xc_sr_context *ctx)
 static int colo_merge_secondary_dirty_bitmap(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
+    struct xc_sr_rhdr rhdr;
     struct xc_sr_record rec;
     uint64_t *pfns = NULL;
     uint64_t pfn;
@@ -597,7 +598,11 @@ static int colo_merge_secondary_dirty_bitmap(struct xc_sr_context *ctx)
     DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
                                     &ctx->save.dirty_bitmap_hbuf);
 
-    rc = read_record(ctx, ctx->save.recv_fd, &rec);
+    rc = read_record_header(ctx, ctx->save.recv_fd, &rhdr);
+    if ( rc )
+        goto err;
+
+    rc = read_record_data(ctx, ctx->save.recv_fd, &rhdr, &rec);
     if ( rc )
         goto err;
 



* [PATCH v20210701 29/40] tools: restore: split handle_page_data
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (27 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 28/40] tools: restore: split record processing Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 30/40] tools: restore: write data directly into guest Olaf Hering
                   ` (10 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu, Juergen Gross

handle_page_data must be able to read directly into mapped guest memory.
This will avoid unnecessary memcpy calls for data that can be consumed verbatim.

Split the various steps of record processing:
- move processing to handle_buffered_page_data
- adjust xenforeignmemory_map to set errno in case of failure
- adjust verify mode to set errno in case of failure

This change is preparation for future changes in handle_page_data;
no change in behavior is intended.
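
One detail of the split that may be easier to see in isolation: only the pfns that carry page data are mapped, so the record index and the index into the mapping diverge. A minimal sketch of that bookkeeping, with hypothetical names (PG_SIZE, assign_guest_data, has_data) and a plain buffer standing in for the foreign mapping:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PG_SIZE 4096u  /* hypothetical page size */

/*
 * For each of the count entries of a record, store either the address of
 * its page inside the contiguous mapping, or NULL if the entry carries no
 * page data. i walks the record, p walks the mapped subset.
 * Returns the number of mapped pages that were handed out.
 */
static unsigned int assign_guest_data(char *mapping, unsigned int count,
                                      const bool *has_data, void **guest_data)
{
    unsigned int i, p = 0;

    assert(mapping != NULL || count == 0);

    for ( i = 0; i < count; i++ )
    {
        if ( !has_data[i] )
        {
            guest_data[i] = NULL;
            continue;
        }

        guest_data[i] = mapping + (size_t)p * PG_SIZE;
        p++;
    }

    return p;
}
```

With a per-entry pointer array the later steps can simply test guest_data[i] for NULL and no longer need to re-evaluate the page type or keep a second counter in sync with the mapping.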

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 tools/libs/saverestore/common.h  |   9 +
 tools/libs/saverestore/restore.c | 343 ++++++++++++++++++++-----------
 2 files changed, 231 insertions(+), 121 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 580eafacc8..96bd0ab80e 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -242,9 +242,14 @@ struct sr_restore_arrays {
     /* process_page_data */
     xen_pfn_t mfns[MAX_BATCH_SIZE];
     int map_errs[MAX_BATCH_SIZE];
+    void *guest_data[MAX_BATCH_SIZE];
+
     /* populate_pfns */
     xen_pfn_t pp_mfns[MAX_BATCH_SIZE];
     xen_pfn_t pp_pfns[MAX_BATCH_SIZE];
+
+    /* Must be the last member */
+    struct xc_sr_rec_page_data_header pages;
 };
 
 struct xc_sr_context
@@ -335,7 +340,11 @@ struct xc_sr_context
 
             /* Sender has invoked verify mode on the stream. */
             bool verify;
+            void *verify_buf;
+
             struct sr_restore_arrays *m;
+            void *guest_mapping;
+            uint32_t nr_mapped_pages;
         } restore;
     };
 
diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index e75380155d..7643de58e0 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -186,123 +186,18 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
     return rc;
 }
 
-/*
- * Given a list of pfns, their types, and a block of page data from the
- * stream, populate and record their types, map the relevant subset and copy
- * the data into the guest.
- */
-static int process_page_data(struct xc_sr_context *ctx, unsigned int count,
-                             xen_pfn_t *pfns, uint32_t *types, void *page_data)
+static int handle_static_data_end_v2(struct xc_sr_context *ctx)
 {
-    xc_interface *xch = ctx->xch;
-    xen_pfn_t *mfns = ctx->restore.m->mfns;
-    int *map_errs = ctx->restore.m->map_errs;
-    int rc;
-    void *mapping = NULL, *guest_page = NULL;
-    unsigned int i, /* i indexes the pfns from the record. */
-        j,          /* j indexes the subset of pfns we decide to map. */
-        nr_pages = 0;
-
-    rc = populate_pfns(ctx, count, pfns, types);
-    if ( rc )
-    {
-        ERROR("Failed to populate pfns for batch of %u pages", count);
-        goto err;
-    }
-
-    for ( i = 0; i < count; ++i )
-    {
-        ctx->restore.ops.set_page_type(ctx, pfns[i], types[i]);
-
-        if ( page_type_has_stream_data(types[i]) == true )
-            mfns[nr_pages++] = ctx->restore.ops.pfn_to_gfn(ctx, pfns[i]);
-    }
-
-    /* Nothing to do? */
-    if ( nr_pages == 0 )
-        goto done;
-
-    mapping = guest_page = xenforeignmemory_map(
-        xch->fmem, ctx->domid, PROT_READ | PROT_WRITE,
-        nr_pages, mfns, map_errs);
-    if ( !mapping )
-    {
-        rc = -1;
-        PERROR("Unable to map %u mfns for %u pages of data",
-               nr_pages, count);
-        goto err;
-    }
-
-    for ( i = 0, j = 0; i < count; ++i )
-    {
-        if ( page_type_has_stream_data(types[i]) == false )
-            continue;
-
-        if ( map_errs[j] )
-        {
-            rc = -1;
-            ERROR("Mapping pfn %#"PRIpfn" (mfn %#"PRIpfn", type %#"PRIx32") failed with %d",
-                  pfns[i], mfns[j], types[i], map_errs[j]);
-            goto err;
-        }
-
-        /* Undo page normalisation done by the saver. */
-        rc = ctx->restore.ops.localise_page(ctx, types[i], page_data);
-        if ( rc )
-        {
-            ERROR("Failed to localise pfn %#"PRIpfn" (type %#"PRIx32")",
-                  pfns[i], types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
-            goto err;
-        }
-
-        if ( ctx->restore.verify )
-        {
-            /* Verify mode - compare incoming data to what we already have. */
-            if ( memcmp(guest_page, page_data, PAGE_SIZE) )
-                ERROR("verify pfn %#"PRIpfn" failed (type %#"PRIx32")",
-                      pfns[i], types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
-        }
-        else
-        {
-            /* Regular mode - copy incoming data into place. */
-            memcpy(guest_page, page_data, PAGE_SIZE);
-        }
-
-        ++j;
-        guest_page += PAGE_SIZE;
-        page_data += PAGE_SIZE;
-    }
-
- done:
-    rc = 0;
-
- err:
-    if ( mapping )
-        xenforeignmemory_unmap(xch->fmem, mapping, nr_pages);
-
-    return rc;
-}
+    int rc = 0;
 
-/*
- * Validate a PAGE_DATA record from the stream, and pass the results to
- * process_page_data() to actually perform the legwork.
- */
-static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
-{
+#if defined(__i386__) || defined(__x86_64__)
     xc_interface *xch = ctx->xch;
-    struct xc_sr_rec_page_data_header *pages = rec->data;
-    unsigned int i, pages_of_data = 0;
-    int rc = -1;
-
-    xen_pfn_t *pfns = ctx->restore.m->pfns, pfn;
-    uint32_t *types = ctx->restore.m->types, type;
-
     /*
      * v2 compatibility only exists for x86 streams.  This is a bit of a
      * bodge, but it is less bad than duplicating handle_page_data() between
      * different architectures.
      */
-#if defined(__i386__) || defined(__x86_64__)
+
     /* v2 compat.  Infer the position of STATIC_DATA_END. */
     if ( ctx->restore.format_version < 3 && !ctx->restore.seen_static_data_end )
     {
@@ -320,12 +215,26 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
         ERROR("No STATIC_DATA_END seen");
         goto err;
     }
+
+    rc = 0;
+err:
 #endif
 
-    if ( rec->length < sizeof(*pages) )
+    return rc;
+}
+
+static bool verify_rec_page_hdr(struct xc_sr_context *ctx, uint32_t rec_length,
+                                 struct xc_sr_rec_page_data_header *pages)
+{
+    xc_interface *xch = ctx->xch;
+    bool ret = false;
+
+    errno = EINVAL;
+
+    if ( rec_length < sizeof(*pages) )
     {
         ERROR("PAGE_DATA record truncated: length %u, min %zu",
-              rec->length, sizeof(*pages));
+              rec_length, sizeof(*pages));
         goto err;
     }
 
@@ -335,13 +244,35 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
         goto err;
     }
 
-    if ( rec->length < sizeof(*pages) + (pages->count * sizeof(uint64_t)) )
+    if ( pages->count > MAX_BATCH_SIZE )
+    {
+        ERROR("pfn count %u in PAGE_DATA record too large", pages->count);
+        errno = E2BIG;
+        goto err;
+    }
+
+    if ( rec_length < sizeof(*pages) + (pages->count * sizeof(uint64_t)) )
     {
         ERROR("PAGE_DATA record (length %u) too short to contain %u"
-              " pfns worth of information", rec->length, pages->count);
+              " pfns worth of information", rec_length, pages->count);
         goto err;
     }
 
+    ret = true;
+
+err:
+    return ret;
+}
+
+static bool verify_rec_page_pfns(struct xc_sr_context *ctx, uint32_t rec_length,
+                                 struct xc_sr_rec_page_data_header *pages)
+{
+    xc_interface *xch = ctx->xch;
+    uint32_t i, pages_of_data = 0;
+    xen_pfn_t pfn;
+    uint32_t type;
+    bool ret = false;
+
     for ( i = 0; i < pages->count; ++i )
     {
         pfn = pages->pfn[i] & PAGE_DATA_PFN_MASK;
@@ -364,23 +295,183 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
              * have a page worth of data in the record. */
             pages_of_data++;
 
-        pfns[i] = pfn;
-        types[i] = type;
+        ctx->restore.m->pfns[i] = pfn;
+        ctx->restore.m->types[i] = type;
     }
 
-    if ( rec->length != (sizeof(*pages) +
+    if ( rec_length != (sizeof(*pages) +
                          (sizeof(uint64_t) * pages->count) +
                          (PAGE_SIZE * pages_of_data)) )
     {
         ERROR("PAGE_DATA record wrong size: length %u, expected "
-              "%zu + %zu + %lu", rec->length, sizeof(*pages),
+              "%zu + %zu + %lu", rec_length, sizeof(*pages),
               (sizeof(uint64_t) * pages->count), (PAGE_SIZE * pages_of_data));
         goto err;
     }
 
-    rc = process_page_data(ctx, pages->count, pfns, types,
-                           &pages->pfn[pages->count]);
+    ret = true;
+
+err:
+    return ret;
+}
+
+/*
+ * Populate pfns, if required
+ * Fill m->guest_data with either mapped address or NULL
+ * The caller must unmap guest_mapping
+ */
+static int map_guest_pages(struct xc_sr_context *ctx,
+                           struct xc_sr_rec_page_data_header *pages)
+{
+    xc_interface *xch = ctx->xch;
+    struct sr_restore_arrays *m = ctx->restore.m;
+    uint32_t i, p;
+    int rc;
+
+    rc = populate_pfns(ctx, pages->count, m->pfns, m->types);
+    if ( rc )
+    {
+        ERROR("Failed to populate pfns for batch of %u pages", pages->count);
+        goto err;
+    }
+
+    ctx->restore.nr_mapped_pages = 0;
+
+    for ( i = 0; i < pages->count; i++ )
+    {
+        ctx->restore.ops.set_page_type(ctx, m->pfns[i], m->types[i]);
+
+        if ( page_type_has_stream_data(m->types[i]) == false )
+        {
+            m->guest_data[i] = NULL;
+            continue;
+        }
+
+        m->mfns[ctx->restore.nr_mapped_pages++] = ctx->restore.ops.pfn_to_gfn(ctx, m->pfns[i]);
+    }
+
+    /* Nothing to do? */
+    if ( ctx->restore.nr_mapped_pages == 0 )
+        goto done;
+
+    ctx->restore.guest_mapping = xenforeignmemory_map(xch->fmem, ctx->domid,
+            PROT_READ | PROT_WRITE, ctx->restore.nr_mapped_pages,
+            m->mfns, m->map_errs);
+    if ( !ctx->restore.guest_mapping )
+    {
+        rc = -1;
+        PERROR("Unable to map %u mfns for %u pages of data",
+               ctx->restore.nr_mapped_pages, pages->count);
+        goto err;
+    }
+
+    /* Verify mapping, and assign address to pfn data */
+    for ( i = 0, p = 0; i < pages->count; i++ )
+    {
+        if ( page_type_has_stream_data(m->types[i]) == false )
+            continue;
+
+        if ( m->map_errs[p] == 0 )
+        {
+            m->guest_data[i] = ctx->restore.guest_mapping + (p * PAGE_SIZE);
+            p++;
+            continue;
+        }
+
+        errno = m->map_errs[p];
+        rc = -1;
+        PERROR("Mapping pfn %#"PRIpfn" (mfn %#"PRIpfn", type %#"PRIx32") failed",
+              m->pfns[i], m->mfns[p], m->types[i]);
+        goto err;
+    }
+
+done:
+    rc = 0;
+
+err:
+    return rc;
+}
+
+/*
+ * Handle PAGE_DATA record from an existing buffer
+ * Given a list of pfns, their types, and a block of page data from the
+ * stream, populate and record their types, map the relevant subset and copy
+ * the data into the guest.
+ */
+static int handle_buffered_page_data(struct xc_sr_context *ctx,
+                                     struct xc_sr_record *rec)
+{
+    xc_interface *xch = ctx->xch;
+    struct xc_sr_rec_page_data_header *pages = rec->data;
+    struct sr_restore_arrays *m = ctx->restore.m;
+    void *p;
+    uint32_t i;
+    int rc = -1, idx;
+
+    rc = handle_static_data_end_v2(ctx);
+    if ( rc )
+        goto err;
+
+    /* First read and verify the header */
+    if ( verify_rec_page_hdr(ctx, rec->length, pages) == false )
+    {
+        rc = -1;
+        goto err;
+    }
+
+    /* Then read and verify the pfn numbers */
+    if ( verify_rec_page_pfns(ctx, rec->length, pages) == false )
+    {
+        rc = -1;
+        goto err;
+    }
+
+    /* Map the target pfn */
+    rc = map_guest_pages(ctx, pages);
+    if ( rc )
+        goto err;
+
+    for ( i = 0, idx = 0; i < pages->count; i++ )
+    {
+        if ( !m->guest_data[i] )
+            continue;
+
+        p = &pages->pfn[pages->count] + (idx * PAGE_SIZE);
+        rc = ctx->restore.ops.localise_page(ctx, m->types[i], p);
+        if ( rc )
+        {
+            ERROR("Failed to localise pfn %#"PRIpfn" (type %#"PRIx32")",
+                  m->pfns[i], m->types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
+            goto err;
+
+        }
+
+        if ( ctx->restore.verify )
+        {
+            if ( memcmp(m->guest_data[i], p, PAGE_SIZE) )
+            {
+                errno = EIO;
+                ERROR("verify pfn %#"PRIpfn" failed (type %#"PRIx32")",
+                      m->pfns[i], m->types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
+                goto err;
+            }
+        }
+        else
+        {
+            memcpy(m->guest_data[i], p, PAGE_SIZE);
+        }
+
+        idx++;
+    }
+
+    rc = 0;
+
  err:
+    if ( ctx->restore.guest_mapping )
+    {
+        xenforeignmemory_unmap(xch->fmem, ctx->restore.guest_mapping, ctx->restore.nr_mapped_pages);
+        ctx->restore.guest_mapping = NULL;
+    }
     return rc;
 }
 
@@ -641,12 +732,21 @@ static int process_buffered_record(struct xc_sr_context *ctx, struct xc_sr_recor
         break;
 
     case REC_TYPE_PAGE_DATA:
-        rc = handle_page_data(ctx, rec);
+        rc = handle_buffered_page_data(ctx, rec);
         break;
 
     case REC_TYPE_VERIFY:
         DPRINTF("Verify mode enabled");
         ctx->restore.verify = true;
+        if ( !ctx->restore.verify_buf )
+        {
+            ctx->restore.verify_buf = malloc(MAX_BATCH_SIZE * PAGE_SIZE);
+            if ( !ctx->restore.verify_buf )
+            {
+                rc = -1;
+                PERROR("Unable to allocate verify_buf");
+            }
+        }
         break;
 
     case REC_TYPE_CHECKPOINT:
@@ -725,7 +825,8 @@ static int setup(struct xc_sr_context *ctx)
     }
     ctx->restore.allocated_rec_num = DEFAULT_BUF_RECORDS;
 
-    ctx->restore.m = malloc(sizeof(*ctx->restore.m));
+    ctx->restore.m = malloc(sizeof(*ctx->restore.m) +
+            (sizeof(*ctx->restore.m->pages.pfn) * MAX_BATCH_SIZE));
     if ( !ctx->restore.m ) {
         ERROR("Unable to allocate memory for arrays");
         rc = -1;



* [PATCH v20210701 30/40] tools: restore: write data directly into guest
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (28 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 29/40] tools: restore: split handle_page_data Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 31/40] tools: recognize LIBXL_API_VERSION for 4.16 Olaf Hering
                   ` (9 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu, Juergen Gross

Read incoming migration stream directly into the guest memory.
This avoids the memory allocation and copying, and the resulting
performance penalty.
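
The underlying technique is a scatter read: one iovec entry per destination page, so the kernel places the stream data where it is needed. The sketch below is an assumption about how such a helper could look, not the readv_exact added earlier in this series; readv_all is a hypothetical name, it expects every entry to have a non-zero iov_len, and it does not retry on EINTR.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Read from fd until every iovec entry is filled completely.
 * The iov array is modified in place to track partial reads.
 * Returns 0 on success, -1 on error or premature end of stream.
 */
static int readv_all(int fd, struct iovec *iov, int iovcnt)
{
    int idx = 0;

    assert(iovcnt >= 0);

    while ( idx < iovcnt )
    {
        ssize_t n = readv(fd, &iov[idx], iovcnt - idx);

        if ( n <= 0 )
            return -1;

        /* Skip the entries which were filled completely. */
        while ( idx < iovcnt && (size_t)n >= iov[idx].iov_len )
        {
            n -= (ssize_t)iov[idx].iov_len;
            idx++;
        }

        /* Advance into an entry which was filled only partially. */
        if ( idx < iovcnt && n > 0 )
        {
            iov[idx].iov_base = (char *)iov[idx].iov_base + n;
            iov[idx].iov_len -= (size_t)n;
        }
    }

    return 0;
}
```

In verify mode the entries would point into a scratch buffer instead of the guest pages, so the existing guest content stays intact for the comparison.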

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 tools/libs/saverestore/common.h  |   1 +
 tools/libs/saverestore/restore.c | 132 ++++++++++++++++++++++++++++++-
 2 files changed, 129 insertions(+), 4 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 96bd0ab80e..3adcf2f83f 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -243,6 +243,7 @@ struct sr_restore_arrays {
     xen_pfn_t mfns[MAX_BATCH_SIZE];
     int map_errs[MAX_BATCH_SIZE];
     void *guest_data[MAX_BATCH_SIZE];
+    struct iovec iov[MAX_BATCH_SIZE];
 
     /* populate_pfns */
     xen_pfn_t pp_mfns[MAX_BATCH_SIZE];
diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index 7643de58e0..53f05f1b65 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -392,6 +392,122 @@ err:
     return rc;
 }
 
+/*
+ * Handle PAGE_DATA record from the stream.
+ * Given a list of pfns, their types, and a block of page data from the
+ * stream, populate and record their types, map the relevant subset and copy
+ * the data into the guest.
+ */
+static int handle_incoming_page_data(struct xc_sr_context *ctx,
+                                     struct xc_sr_rhdr *rhdr)
+{
+    xc_interface *xch = ctx->xch;
+    struct sr_restore_arrays *m = ctx->restore.m;
+    struct xc_sr_rec_page_data_header *pages = &m->pages;
+    uint64_t *pfn_nums = m->pages.pfn;
+    uint32_t i;
+    int rc, iov_idx;
+
+    rc = handle_static_data_end_v2(ctx);
+    if ( rc )
+        goto err;
+
+    /* First read and verify the header */
+    rc = read_exact(ctx->fd, pages, sizeof(*pages));
+    if ( rc )
+    {
+        PERROR("Could not read rec_pfn header");
+        goto err;
+    }
+
+    if ( verify_rec_page_hdr(ctx, rhdr->length, pages) == false )
+    {
+        rc = -1;
+        goto err;
+    }
+
+    /* Then read and verify the incoming pfn numbers */
+    rc = read_exact(ctx->fd, pfn_nums, sizeof(*pfn_nums) * pages->count);
+    if ( rc )
+    {
+        PERROR("Could not read rec_pfn data");
+        goto err;
+    }
+
+    if ( verify_rec_page_pfns(ctx, rhdr->length, pages) == false )
+    {
+        rc = -1;
+        goto err;
+    }
+
+    /* Finally read and verify the incoming pfn data */
+    rc = map_guest_pages(ctx, pages);
+    if ( rc )
+        goto err;
+
+    /* Prepare read buffers, either guest or throw away memory */
+    for ( i = 0, iov_idx = 0; i < pages->count; i++ )
+    {
+        if ( !m->guest_data[i] )
+            continue;
+
+        m->iov[iov_idx].iov_len = PAGE_SIZE;
+        if ( ctx->restore.verify )
+            m->iov[iov_idx].iov_base = ctx->restore.verify_buf + i * PAGE_SIZE;
+        else
+            m->iov[iov_idx].iov_base = m->guest_data[i];
+        iov_idx++;
+    }
+
+    if ( !iov_idx )
+        goto done;
+
+    rc = readv_exact(ctx->fd, m->iov, iov_idx);
+    if ( rc )
+    {
+        PERROR("read of %d pages failed", iov_idx);
+        goto err;
+    }
+
+    /* Post-processing of pfn data */
+    for ( i = 0, iov_idx = 0; i < pages->count; i++ )
+    {
+        if ( !m->guest_data[i] )
+            continue;
+
+        rc = ctx->restore.ops.localise_page(ctx, m->types[i], m->iov[iov_idx].iov_base);
+        if ( rc )
+        {
+            ERROR("Failed to localise pfn %#"PRIpfn" (type %#"PRIx32")",
+                  m->pfns[i], m->types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
+            goto err;
+
+        }
+
+        if ( ctx->restore.verify )
+        {
+            if ( memcmp(m->guest_data[i], m->iov[iov_idx].iov_base, PAGE_SIZE) )
+            {
+                ERROR("verify pfn %#"PRIpfn" failed (type %#"PRIx32")",
+                      m->pfns[i], m->types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
+            }
+        }
+
+        iov_idx++;
+    }
+
+done:
+    rc = 0;
+
+err:
+    if ( ctx->restore.guest_mapping )
+    {
+        xenforeignmemory_unmap(xch->fmem, ctx->restore.guest_mapping, ctx->restore.nr_mapped_pages);
+        ctx->restore.guest_mapping = NULL;
+    }
+    return rc;
+}
+
 /*
  * Handle PAGE_DATA record from an existing buffer
  * Given a list of pfns, their types, and a block of page data from the
@@ -773,11 +889,19 @@ static int process_incoming_record_header(struct xc_sr_context *ctx, struct xc_s
     struct xc_sr_record rec;
     int rc;
 
-    rc = read_record_data(ctx, ctx->fd, rhdr, &rec);
-    if ( rc )
-        return rc;
+    switch ( rhdr->type )
+    {
+    case REC_TYPE_PAGE_DATA:
+        rc = handle_incoming_page_data(ctx, rhdr);
+        break;
+    default:
+        rc = read_record_data(ctx, ctx->fd, rhdr, &rec);
+        if ( rc == 0 )
+            rc = process_buffered_record(ctx, &rec);
+        break;
+    }
 
-    return process_buffered_record(ctx, &rec);
+    return rc;
 }
 
 



* [PATCH v20210701 31/40] tools: recognize LIBXL_API_VERSION for 4.16
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (29 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 30/40] tools: restore: write data directly into guest Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 32/40] tools: adjust libxl_domain_suspend to receive a struct props Olaf Hering
                   ` (8 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu

This is required by upcoming API changes.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 tools/include/libxl.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index ae7fe27c1f..29931626a2 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -729,7 +729,8 @@ typedef struct libxl__ctx libxl_ctx;
 #if LIBXL_API_VERSION != 0x040200 && LIBXL_API_VERSION != 0x040300 && \
     LIBXL_API_VERSION != 0x040400 && LIBXL_API_VERSION != 0x040500 && \
     LIBXL_API_VERSION != 0x040700 && LIBXL_API_VERSION != 0x040800 && \
-    LIBXL_API_VERSION != 0x041300 && LIBXL_API_VERSION != 0x041400
+    LIBXL_API_VERSION != 0x041300 && LIBXL_API_VERSION != 0x041400 && \
+    LIBXL_API_VERSION != 0x041600
 #error Unknown LIBXL_API_VERSION
 #endif
 #endif



* [PATCH v20210701 32/40] tools: adjust libxl_domain_suspend to receive a struct props
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (30 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 31/40] tools: recognize LIBXL_API_VERSION for 4.16 Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 33/40] tools: change struct precopy_stats to precopy_stats_t Olaf Hering
                   ` (7 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel
  Cc: Olaf Hering, Christian Lindig, Ian Jackson, Wei Liu,
	Anthony PERARD, Juergen Gross, David Scott

Upcoming changes will pass more knobs down to xc_domain_save.
Adjust the libxl_domain_suspend API to allow easy adding of additional knobs.

No change in behavior intended.
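
The shape of the change can be sketched in isolation. This is only an
illustration of the pattern (one props struct plus an inline wrapper for
callers built against the old signature); every demo_* and DEMO_* name
below is hypothetical and is not a libxl declaration.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real libxl names. */
#define DEMO_SUSPEND_DEBUG 1u
#define DEMO_SUSPEND_LIVE  2u

typedef struct {
    uint32_t flags; /* DEMO_SUSPEND_* */
} demo_suspend_props;

/* New-style entry point: all knobs travel in a single struct. */
static int demo_suspend(uint32_t domid, const demo_suspend_props *props)
{
    (void)domid;
    /* Return the recognised flag bits so the effect is observable. */
    return (int)(props->flags & (DEMO_SUSPEND_DEBUG | DEMO_SUSPEND_LIVE));
}

/* Old-style wrapper: packs the plain flags argument into a props struct. */
static inline int demo_suspend_legacy(uint32_t domid, int flags)
{
    demo_suspend_props props = { .flags = (uint32_t)flags };

    return demo_suspend(domid, &props);
}
```

New knobs then become additional struct members, and callers that zero
initialize the struct keep compiling unchanged.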

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
---
 tools/include/libxl.h                | 26 +++++++++++++++++++++++---
 tools/libs/light/libxl_domain.c      |  7 ++++---
 tools/ocaml/libs/xl/xenlight_stubs.c |  3 ++-
 tools/xl/xl_migrate.c                |  9 ++++++---
 tools/xl/xl_saverestore.c            |  3 ++-
 5 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 29931626a2..9a4d7514ed 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -1706,12 +1706,32 @@ static inline int libxl_retrieve_domain_configuration_0x041200(
     libxl_retrieve_domain_configuration_0x041200
 #endif
 
+/*
+ * LIBXL_HAVE_DOMAIN_SUSPEND_PROPS indicates that the
+ * libxl_domain_suspend() function takes a props struct.
+ */
+#define LIBXL_HAVE_DOMAIN_SUSPEND_PROPS 1
+
+typedef struct {
+    uint32_t flags; /* LIBXL_SUSPEND_* */
+} libxl_domain_suspend_props;
+#define LIBXL_SUSPEND_DEBUG 1
+#define LIBXL_SUSPEND_LIVE 2
+
 int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd,
-                         int flags, /* LIBXL_SUSPEND_* */
+                         libxl_domain_suspend_props *props,
                          const libxl_asyncop_how *ao_how)
                          LIBXL_EXTERNAL_CALLERS_ONLY;
-#define LIBXL_SUSPEND_DEBUG 1
-#define LIBXL_SUSPEND_LIVE 2
+#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION < 0x041600
+static inline int libxl_domain_suspend_0x041500(libxl_ctx *ctx, uint32_t domid,
+                         int fd, int flags, /* LIBXL_SUSPEND_* */
+                         const libxl_asyncop_how *ao_how)
+{
+    libxl_domain_suspend_props props = { .flags = flags, };
+    return libxl_domain_suspend(ctx, domid, fd, &props, ao_how);
+}
+#define libxl_domain_suspend libxl_domain_suspend_0x041500
+#endif
 
 /*
  * Only suspend domain, do not save its state to file, do not destroy it.
diff --git a/tools/libs/light/libxl_domain.c b/tools/libs/light/libxl_domain.c
index c00c36c928..5dbd27900f 100644
--- a/tools/libs/light/libxl_domain.c
+++ b/tools/libs/light/libxl_domain.c
@@ -505,7 +505,8 @@ static void domain_suspend_cb(libxl__egc *egc,
 
 }
 
-int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
+int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd,
+                         libxl_domain_suspend_props *props,
                          const libxl_asyncop_how *ao_how)
 {
     AO_CREATE(ctx, domid, ao_how);
@@ -526,8 +527,8 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
     dss->domid = domid;
     dss->fd = fd;
     dss->type = type;
-    dss->live = flags & LIBXL_SUSPEND_LIVE;
-    dss->debug = flags & LIBXL_SUSPEND_DEBUG;
+    dss->live = props->flags & LIBXL_SUSPEND_LIVE;
+    dss->debug = props->flags & LIBXL_SUSPEND_DEBUG;
     dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_NONE;
 
     rc = libxl__fd_flags_modify_save(gc, dss->fd,
diff --git a/tools/ocaml/libs/xl/xenlight_stubs.c b/tools/ocaml/libs/xl/xenlight_stubs.c
index 352a00134d..eaf7bce35a 100644
--- a/tools/ocaml/libs/xl/xenlight_stubs.c
+++ b/tools/ocaml/libs/xl/xenlight_stubs.c
@@ -614,10 +614,11 @@ value stub_libxl_domain_suspend(value ctx, value domid, value fd, value async, v
 	int ret;
 	uint32_t c_domid = Int_val(domid);
 	int c_fd = Int_val(fd);
+	libxl_domain_suspend_props props = {};
 	libxl_asyncop_how *ao_how = aohow_val(async);
 
 	caml_enter_blocking_section();
-	ret = libxl_domain_suspend(CTX, c_domid, c_fd, 0, ao_how);
+	ret = libxl_domain_suspend(CTX, c_domid, c_fd, &props, ao_how);
 	caml_leave_blocking_section();
 
 	free(ao_how);
diff --git a/tools/xl/xl_migrate.c b/tools/xl/xl_migrate.c
index b8594f44a5..144890924f 100644
--- a/tools/xl/xl_migrate.c
+++ b/tools/xl/xl_migrate.c
@@ -186,7 +186,10 @@ static void migrate_domain(uint32_t domid, int preserve_domid,
     char *away_domname;
     char rc_buf;
     uint8_t *config_data;
-    int config_len, flags = LIBXL_SUSPEND_LIVE;
+    int config_len;
+    libxl_domain_suspend_props props = {
+        .flags = LIBXL_SUSPEND_LIVE,
+        };
 
     save_domain_core_begin(domid, preserve_domid, override_config_file,
                            &config_data, &config_len);
@@ -205,8 +208,8 @@ static void migrate_domain(uint32_t domid, int preserve_domid,
     xtl_stdiostream_adjust_flags(logger, XTL_STDIOSTREAM_HIDE_PROGRESS, 0);
 
     if (debug)
-        flags |= LIBXL_SUSPEND_DEBUG;
-    rc = libxl_domain_suspend(ctx, domid, send_fd, flags, NULL);
+        props.flags |= LIBXL_SUSPEND_DEBUG;
+    rc = libxl_domain_suspend(ctx, domid, send_fd, &props, NULL);
     if (rc) {
         fprintf(stderr, "migration sender: libxl_domain_suspend failed"
                 " (rc=%d)\n", rc);
diff --git a/tools/xl/xl_saverestore.c b/tools/xl/xl_saverestore.c
index 953d791d1a..476d4d9a6a 100644
--- a/tools/xl/xl_saverestore.c
+++ b/tools/xl/xl_saverestore.c
@@ -130,6 +130,7 @@ static int save_domain(uint32_t domid, int preserve_domid,
     int fd;
     uint8_t *config_data;
     int config_len;
+    libxl_domain_suspend_props props = {};
 
     save_domain_core_begin(domid, preserve_domid, override_config_file,
                            &config_data, &config_len);
@@ -146,7 +147,7 @@ static int save_domain(uint32_t domid, int preserve_domid,
 
     save_domain_core_writeconfig(fd, filename, config_data, config_len);
 
-    int rc = libxl_domain_suspend(ctx, domid, fd, 0, NULL);
+    int rc = libxl_domain_suspend(ctx, domid, fd, &props, NULL);
     close(fd);
 
     if (rc < 0) {



* [PATCH v20210701 33/40] tools: change struct precopy_stats to precopy_stats_t
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (31 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 32/40] tools: adjust libxl_domain_suspend to receive a struct props Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01 16:45   ` Anthony PERARD
  2021-07-01  9:56 ` [PATCH v20210701 34/40] tools: add callback to libxl for precopy_policy and precopy_stats_t Olaf Hering
                   ` (6 subsequent siblings)
  39 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu, Juergen Gross

This will help libxl_save_msgs_gen.pl to copy the struct as a region of memory.

No change in behavior intended.
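
As a rough sketch of what "copy the struct as a region of memory" could
mean: the struct is copied byte for byte into and out of a message
buffer. The demo_* helpers are hypothetical and are not the code that
libxl_save_msgs_gen.pl generates; the struct only mirrors the fields
shown in this patch. Such a copy is only meaningful between processes
that share the same ABI.

```c
#include <string.h>

/* Mirrors the fields of precopy_stats_t; the name is hypothetical. */
typedef struct {
    unsigned int iteration;
    unsigned long total_written;
    long dirty_count; /* -1 if unknown */
} demo_stats_t;

/* Append the raw bytes of *s to buf and advance *len. */
static void demo_stats_put(unsigned char *buf, size_t *len,
                           const demo_stats_t *s)
{
    memcpy(buf + *len, s, sizeof(*s));
    *len += sizeof(*s);
}

/* Read one struct from *msg; returns 1 on success, 0 if too few bytes. */
static int demo_stats_get(const unsigned char **msg,
                          const unsigned char *endmsg,
                          demo_stats_t *out)
{
    if ((size_t)(endmsg - *msg) < sizeof(*out))
        return 0;
    memcpy(out, *msg, sizeof(*out));
    *msg += sizeof(*out);
    return 1;
}
```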

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 tools/include/xensaverestore.h  | 7 +++----
 tools/libs/saverestore/common.h | 2 +-
 tools/libs/saverestore/save.c   | 6 +++---
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/tools/include/xensaverestore.h b/tools/include/xensaverestore.h
index 0410f0469e..dca0134605 100644
--- a/tools/include/xensaverestore.h
+++ b/tools/include/xensaverestore.h
@@ -23,18 +23,17 @@
 #define XCFLAGS_DEBUG     (1 << 1)
 
 /* For save's precopy_policy(). */
-struct precopy_stats
-{
+typedef struct {
     unsigned int iteration;
     unsigned long total_written;
     long dirty_count; /* -1 if unknown */
-};
+} precopy_stats_t;
 
 /*
  * A precopy_policy callback may not be running in the same address
  * space as libxc an so precopy_stats is passed by value.
  */
-typedef int (*precopy_policy_t)(struct precopy_stats, void *);
+typedef int (*precopy_policy_t)(precopy_stats_t, void *);
 
 /* callbacks provided by xc_domain_save */
 struct save_callbacks {
diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index 3adcf2f83f..bb7e437291 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -283,7 +283,7 @@ struct xc_sr_context
             size_t pages_sent;
             size_t overhead_sent;
 
-            struct precopy_stats stats;
+            precopy_stats_t stats;
 
             unsigned int nr_batch_pfns;
             unsigned long *deferred_pages;
diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
index e486bce96f..537b977ba8 100644
--- a/tools/libs/saverestore/save.c
+++ b/tools/libs/saverestore/save.c
@@ -488,7 +488,7 @@ static int update_progress_string(struct xc_sr_context *ctx, char **str)
 #define SPP_MAX_ITERATIONS      5
 #define SPP_TARGET_DIRTY_COUNT 50
 
-static int simple_precopy_policy(struct precopy_stats stats, void *user)
+static int simple_precopy_policy(precopy_stats_t stats, void *user)
 {
     return ((stats.dirty_count >= 0 &&
              stats.dirty_count < SPP_TARGET_DIRTY_COUNT) ||
@@ -515,13 +515,13 @@ static int send_memory_live(struct xc_sr_context *ctx)
     precopy_policy_t precopy_policy = ctx->save.callbacks->precopy_policy;
     void *data = ctx->save.callbacks->data;
 
-    struct precopy_stats *policy_stats;
+    precopy_stats_t *policy_stats;
 
     rc = update_progress_string(ctx, &progress_str);
     if ( rc )
         goto out;
 
-    ctx->save.stats = (struct precopy_stats){
+    ctx->save.stats = (precopy_stats_t){
         .dirty_count = ctx->save.p2m_size,
     };
     policy_stats = &ctx->save.stats;



* [PATCH v20210701 34/40] tools: add callback to libxl for precopy_policy and precopy_stats_t
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (32 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 33/40] tools: change struct precopy_stats to precopy_stats_t Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 35/40] tools: add --max_iters to libxl_domain_suspend Olaf Hering
                   ` (5 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel
  Cc: Olaf Hering, Ian Jackson, Wei Liu, Anthony PERARD, Juergen Gross

This duplicates simple_precopy_policy. To recap its purpose:
- do up to 5 iterations of copying dirty domU memory to target,
  including the initial copying of all domU memory, excluding
  the final copying while the domU is suspended
- do fewer iterations in case the domU dirtied less than 50 pages
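
The two rules above can be sketched as a standalone decision function.
The limits 5 and 50 are taken from this patch; the demo_* and DEMO_*
names are hypothetical stand-ins for the libxl and libxc identifiers.

```c
/* Limits as used in this patch; the names are hypothetical. */
#define DEMO_MAX_ITERATIONS      5
#define DEMO_TARGET_DIRTY_COUNT 50

enum demo_policy { DEMO_CONTINUE_PRECOPY, DEMO_STOP_AND_COPY };

typedef struct {
    unsigned int iteration;
    unsigned long total_written;
    long dirty_count; /* -1 if unknown */
} demo_pstats;

/* Decide whether to run another precopy iteration or suspend and finish. */
static enum demo_policy demo_precopy_policy(demo_pstats stats)
{
    /* Few enough dirty pages left: stop early. */
    if (stats.dirty_count >= 0 &&
        stats.dirty_count < DEMO_TARGET_DIRTY_COUNT)
        return DEMO_STOP_AND_COPY;
    /* Iteration budget used up: stop regardless of the dirty count. */
    if (stats.iteration >= DEMO_MAX_ITERATIONS)
        return DEMO_STOP_AND_COPY;
    return DEMO_CONTINUE_PRECOPY;
}
```

An unknown dirty count (-1) never triggers the early stop; only the
iteration limit ends the precopy phase in that case.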

Take the opportunity to also move xen_pfn_t into qw().

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 tools/libs/light/libxl_dom_save.c       | 19 +++++++++++++++++++
 tools/libs/light/libxl_internal.h       |  2 ++
 tools/libs/light/libxl_save_msgs_gen.pl |  3 ++-
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/tools/libs/light/libxl_dom_save.c b/tools/libs/light/libxl_dom_save.c
index 32e3cb5a13..3f3cff0342 100644
--- a/tools/libs/light/libxl_dom_save.c
+++ b/tools/libs/light/libxl_dom_save.c
@@ -373,6 +373,24 @@ int libxl__save_emulator_xenstore_data(libxl__domain_save_state *dss,
     return rc;
 }
 
+static int libxl__domain_save_precopy_policy(precopy_stats_t stats, void *user)
+{
+    libxl__save_helper_state *shs = user;
+    libxl__domain_save_state *dss = shs->caller_state;
+    STATE_AO_GC(dss->ao);
+
+    LOGD(DEBUG, shs->domid, "iteration %u dirty_count %ld total_written %lu",
+         stats.iteration, stats.dirty_count, stats.total_written);
+    if (stats.dirty_count >= 0 && stats.dirty_count < LIBXL_XGS_POLICY_TARGET_DIRTY_COUNT)
+        goto stop_copy;
+    if (stats.iteration >= LIBXL_XGS_POLICY_MAX_ITERATIONS)
+        goto stop_copy;
+    return XGS_POLICY_CONTINUE_PRECOPY;
+
+stop_copy:
+    return XGS_POLICY_STOP_AND_COPY;
+}
+
 /*----- main code for saving, in order of execution -----*/
 
 void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
@@ -430,6 +448,7 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
         callbacks->suspend = libxl__domain_suspend_callback;
 
     callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
+    callbacks->precopy_policy = libxl__domain_save_precopy_policy;
 
     dss->sws.ao  = dss->ao;
     dss->sws.dss = dss;
diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h
index 439c654733..57d7e4b4b8 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -125,6 +125,8 @@
 #define DOMID_XS_PATH "domid"
 #define PVSHIM_BASENAME "xen-shim"
 #define PVSHIM_CMDLINE "pv-shim console=xen,pv"
+#define LIBXL_XGS_POLICY_MAX_ITERATIONS 5
+#define LIBXL_XGS_POLICY_TARGET_DIRTY_COUNT 50
 
 /* Size macros. */
 #define __AC(X,Y)   (X##Y)
diff --git a/tools/libs/light/libxl_save_msgs_gen.pl b/tools/libs/light/libxl_save_msgs_gen.pl
index f263ee01bb..ab55c81644 100755
--- a/tools/libs/light/libxl_save_msgs_gen.pl
+++ b/tools/libs/light/libxl_save_msgs_gen.pl
@@ -23,6 +23,7 @@ our @msgs = (
                                              STRING doing_what),
                                             'unsigned long', 'done',
                                             'unsigned long', 'total'] ],
+    [ 'scxW',   "precopy_policy", ['precopy_stats_t', 'stats'] ],
     [ 'srcxA',  "suspend", [] ],
     [ 'srcxA',  "postcopy", [] ],
     [ 'srcxA',  "checkpoint", [] ],
@@ -142,7 +143,7 @@ static void bytes_put(unsigned char *const buf, int *len,
 
 END
 
-foreach my $simpletype (qw(int uint16_t uint32_t unsigned), 'unsigned long', 'xen_pfn_t') {
+foreach my $simpletype (qw(int uint16_t uint32_t unsigned precopy_stats_t xen_pfn_t), 'unsigned long') {
     my $typeid = typeid($simpletype);
     $out_body{'callout'} .= <<END;
 static int ${typeid}_get(const unsigned char **msg,



* [PATCH v20210701 35/40] tools: add --max_iters to libxl_domain_suspend
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (33 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 34/40] tools: add callback to libxl for precopy_policy and precopy_stats_t Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 36/40] tools: add --min_remaining " Olaf Hering
                   ` (4 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel
  Cc: Olaf Hering, Ian Jackson, Wei Liu, Anthony PERARD, Juergen Gross

Migrating a large, and potentially busy, domU will take more
time than necessary due to an excessive number of copying iterations.

Allow the host admin to control the number of iterations which
copy accumulated domU dirty pages to the target host.

The default remains 5, which means one initial iteration to copy the
entire domU memory, and up to 4 additional iterations to copy dirty
memory from the still running domU. After the given number of iterations
the domU is suspended, remaining dirty memory is copied and the domU is
finally moved to the target host.

This patch adjusts xl(1) and the libxl API.
External users check LIBXL_HAVE_DOMAIN_SUSPEND_PROPS for the availability
of the new .max_iters property.
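
The option value is parsed with atoi() in this patch, which does not
report malformed or negative input. A stricter parse might look like the
sketch below; demo_parse_u32 is a hypothetical helper and not part of xl.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a plain decimal uint32_t.
 * Returns 0 on success, -1 on empty, signed, trailing-garbage or
 * out-of-range input.
 */
static int demo_parse_u32(const char *arg, uint32_t *out)
{
    char *end;
    unsigned long val;

    /* Require a leading digit: rejects "", "-1", "+1" and whitespace. */
    if (!arg || arg[0] < '0' || arg[0] > '9')
        return -1;

    errno = 0;
    val = strtoul(arg, &end, 10);
    if (errno || *end != '\0' || val > UINT32_MAX)
        return -1;

    *out = (uint32_t)val;
    return 0;
}
```

A caller would then treat a -1 return as a usage error instead of
silently passing 0 (which selects the default) down to libxl.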

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 docs/man/xl.1.pod.in              |  4 ++++
 tools/include/libxl.h             |  1 +
 tools/libs/light/libxl_dom_save.c |  2 +-
 tools/libs/light/libxl_domain.c   |  1 +
 tools/libs/light/libxl_internal.h |  1 +
 tools/xl/xl_cmdtable.c            |  3 ++-
 tools/xl/xl_migrate.c             | 10 +++++++++-
 7 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in
index 70a6ebf438..594387bcf4 100644
--- a/docs/man/xl.1.pod.in
+++ b/docs/man/xl.1.pod.in
@@ -494,6 +494,10 @@ such that it will be identical on the destination host, unless that
 configuration is overridden using the B<-C> option. Note that it is not
 possible to use this option for a 'localhost' migration.
 
+=item B<--max_iters> I<iterations>
+
+Number of copy iterations before final suspend+move (default: 5)
+
 =back
 
 =item B<remus> [I<OPTIONS>] I<domain-id> I<host>
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 9a4d7514ed..bf77da0524 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -1714,6 +1714,7 @@ static inline int libxl_retrieve_domain_configuration_0x041200(
 
 typedef struct {
     uint32_t flags; /* LIBXL_SUSPEND_* */
+    uint32_t max_iters;
 } libxl_domain_suspend_props;
 #define LIBXL_SUSPEND_DEBUG 1
 #define LIBXL_SUSPEND_LIVE 2
diff --git a/tools/libs/light/libxl_dom_save.c b/tools/libs/light/libxl_dom_save.c
index 3f3cff0342..938c0127f3 100644
--- a/tools/libs/light/libxl_dom_save.c
+++ b/tools/libs/light/libxl_dom_save.c
@@ -383,7 +383,7 @@ static int libxl__domain_save_precopy_policy(precopy_stats_t stats, void *user)
          stats.iteration, stats.dirty_count, stats.total_written);
     if (stats.dirty_count >= 0 && stats.dirty_count < LIBXL_XGS_POLICY_TARGET_DIRTY_COUNT)
         goto stop_copy;
-    if (stats.iteration >= LIBXL_XGS_POLICY_MAX_ITERATIONS)
+    if (stats.iteration >= dss->max_iters)
         goto stop_copy;
     return XGS_POLICY_CONTINUE_PRECOPY;
 
diff --git a/tools/libs/light/libxl_domain.c b/tools/libs/light/libxl_domain.c
index 5dbd27900f..9f98cd7f2b 100644
--- a/tools/libs/light/libxl_domain.c
+++ b/tools/libs/light/libxl_domain.c
@@ -527,6 +527,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd,
     dss->domid = domid;
     dss->fd = fd;
     dss->type = type;
+    dss->max_iters = props->max_iters ?: LIBXL_XGS_POLICY_MAX_ITERATIONS;
     dss->live = props->flags & LIBXL_SUSPEND_LIVE;
     dss->debug = props->flags & LIBXL_SUSPEND_DEBUG;
     dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_NONE;
diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h
index 57d7e4b4b8..8cbcc5282c 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -3649,6 +3649,7 @@ struct libxl__domain_save_state {
     int live;
     int debug;
     int checkpointed_stream;
+    uint32_t max_iters;
     const libxl_domain_remus_info *remus;
     /* private */
     int rc;
diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index ca1dfa3525..9b6b3c99aa 100644
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -174,7 +174,8 @@ const struct cmd_spec cmd_table[] = {
       "                of the domain.\n"
       "--debug         Ignored.\n"
       "-p              Do not unpause domain after migrating it.\n"
-      "-D              Preserve the domain id"
+      "-D              Preserve the domain id\n"
+      "--max_iters N   Number of copy iterations before final stop+move"
     },
     { "restore",
       &main_restore, 0, 1,
diff --git a/tools/xl/xl_migrate.c b/tools/xl/xl_migrate.c
index 144890924f..af117d4d56 100644
--- a/tools/xl/xl_migrate.c
+++ b/tools/xl/xl_migrate.c
@@ -178,6 +178,7 @@ static void migrate_do_preamble(int send_fd, int recv_fd, pid_t child,
 
 static void migrate_domain(uint32_t domid, int preserve_domid,
                            const char *rune, int debug,
+                           uint32_t max_iters,
                            const char *override_config_file)
 {
     pid_t child = -1;
@@ -189,6 +190,7 @@ static void migrate_domain(uint32_t domid, int preserve_domid,
     int config_len;
     libxl_domain_suspend_props props = {
         .flags = LIBXL_SUSPEND_LIVE,
+        .max_iters = max_iters,
         };
 
     save_domain_core_begin(domid, preserve_domid, override_config_file,
@@ -542,8 +544,10 @@ int main_migrate(int argc, char **argv)
     char *host;
     int opt, daemonize = 1, monitor = 1, debug = 0, pause_after_migration = 0;
     int preserve_domid = 0;
+    uint32_t max_iters = 0;
     static struct option opts[] = {
         {"debug", 0, 0, 0x100},
+        {"max_iters", 1, 0, 0x101},
         {"live", 0, 0, 0x200},
         COMMON_LONG_OPTS
     };
@@ -571,6 +575,9 @@ int main_migrate(int argc, char **argv)
     case 0x100: /* --debug */
         debug = 1;
         break;
+    case 0x101: /* --max_iters */
+        max_iters = atoi(optarg);
+        break;
     case 0x200: /* --live */
         /* ignored for compatibility with xm */
         break;
@@ -605,7 +612,8 @@ int main_migrate(int argc, char **argv)
                   pause_after_migration ? " -p" : "");
     }
 
-    migrate_domain(domid, preserve_domid, rune, debug, config_filename);
+    migrate_domain(domid, preserve_domid, rune, debug,
+                   max_iters, config_filename);
     return EXIT_SUCCESS;
 }
 



* [PATCH v20210701 36/40] tools: add --min_remaining to libxl_domain_suspend
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (34 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 35/40] tools: add --max_iters to libxl_domain_suspend Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 37/40] tools: add --abort_if_busy " Olaf Hering
                   ` (3 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel
  Cc: Olaf Hering, Ian Jackson, Wei Liu, Anthony PERARD, Juergen Gross

The decision to stop+move a domU to the new host must be based on two factors:
- the available network bandwidth for the migration stream
- the maximum time a workload within a domU can be safely suspended

Both values define how many dirty pages a workload may produce prior to the
final stop+move.

The default value of 50 pages is much too low with today's network bandwidths.
On an idle 1 Gbit/s link these 200 KiB will be transferred within ~2ms.
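
The arithmetic behind that estimate, and the reverse calculation an
admin might do to pick a threshold, can be sketched as follows. The
sketch assumes 4 KiB pages and a link rate given in bits per second, and
it ignores protocol overhead; the demo_* helpers are hypothetical.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u /* assumed 4 KiB pages */

/* Pages that fit into downtime_ms on a link of bits_per_sec. */
static uint64_t demo_pages_for_downtime(uint64_t bits_per_sec,
                                        uint64_t downtime_ms)
{
    uint64_t bytes = bits_per_sec / 8 * downtime_ms / 1000;

    return bytes / DEMO_PAGE_SIZE;
}

/* Milliseconds (rounded up) to send pages; bits_per_sec must be non-zero. */
static uint64_t demo_ms_for_pages(uint64_t pages, uint64_t bits_per_sec)
{
    uint64_t bits = pages * DEMO_PAGE_SIZE * 8;

    return (bits * 1000 + bits_per_sec - 1) / bits_per_sec;
}
```

With these assumptions, 50 pages on a 1 Gbit/s link take about 1.6ms,
which rounds up to the ~2ms quoted above, while a tolerated downtime of
100ms on the same link would allow roughly 3000 remaining pages.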

Give the admin a knob to adjust the point when the final stop+move will
be done, so they can base this decision on their own needs.

This patch adjusts xl(1) and the libxl API.
External users check LIBXL_HAVE_DOMAIN_SUSPEND_PROPS for the availability
of the new .min_remaining property.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 docs/man/xl.1.pod.in              |  8 ++++++++
 tools/include/libxl.h             |  1 +
 tools/libs/light/libxl_dom_save.c |  2 +-
 tools/libs/light/libxl_domain.c   |  1 +
 tools/libs/light/libxl_internal.h |  1 +
 tools/xl/xl_cmdtable.c            | 23 ++++++++++++-----------
 tools/xl/xl_migrate.c             |  9 ++++++++-
 7 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in
index 594387bcf4..09e866ad87 100644
--- a/docs/man/xl.1.pod.in
+++ b/docs/man/xl.1.pod.in
@@ -498,6 +498,14 @@ possible to use this option for a 'localhost' migration.
 
 Number of copy iterations before final suspend+move (default: 5)
 
+=item B<--min_remaining> I<pages>
+
+Number of remaining dirty pages. If the number of dirty pages drops that
+low, the guest is suspended and the domU will finally be moved to I<host>.
+
+This allows the host admin to control for how long the domU will likely
+be suspended during transit.
+
 =back
 
 =item B<remus> [I<OPTIONS>] I<domain-id> I<host>
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index bf77da0524..28d70b1078 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -1715,6 +1715,7 @@ static inline int libxl_retrieve_domain_configuration_0x041200(
 typedef struct {
     uint32_t flags; /* LIBXL_SUSPEND_* */
     uint32_t max_iters;
+    uint32_t min_remaining;
 } libxl_domain_suspend_props;
 #define LIBXL_SUSPEND_DEBUG 1
 #define LIBXL_SUSPEND_LIVE 2
diff --git a/tools/libs/light/libxl_dom_save.c b/tools/libs/light/libxl_dom_save.c
index 938c0127f3..ad5df89b2c 100644
--- a/tools/libs/light/libxl_dom_save.c
+++ b/tools/libs/light/libxl_dom_save.c
@@ -381,7 +381,7 @@ static int libxl__domain_save_precopy_policy(precopy_stats_t stats, void *user)
 
     LOGD(DEBUG, shs->domid, "iteration %u dirty_count %ld total_written %lu",
          stats.iteration, stats.dirty_count, stats.total_written);
-    if (stats.dirty_count >= 0 && stats.dirty_count < LIBXL_XGS_POLICY_TARGET_DIRTY_COUNT)
+    if (stats.dirty_count >= 0 && stats.dirty_count < dss->min_remaining)
         goto stop_copy;
     if (stats.iteration >= dss->max_iters)
         goto stop_copy;
diff --git a/tools/libs/light/libxl_domain.c b/tools/libs/light/libxl_domain.c
index 9f98cd7f2b..06ca7a7df6 100644
--- a/tools/libs/light/libxl_domain.c
+++ b/tools/libs/light/libxl_domain.c
@@ -528,6 +528,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd,
     dss->fd = fd;
     dss->type = type;
     dss->max_iters = props->max_iters ?: LIBXL_XGS_POLICY_MAX_ITERATIONS;
+    dss->min_remaining = props->min_remaining ?: LIBXL_XGS_POLICY_TARGET_DIRTY_COUNT;
     dss->live = props->flags & LIBXL_SUSPEND_LIVE;
     dss->debug = props->flags & LIBXL_SUSPEND_DEBUG;
     dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_NONE;
diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h
index 8cbcc5282c..e4bfb34085 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -3650,6 +3650,7 @@ struct libxl__domain_save_state {
     int debug;
     int checkpointed_stream;
     uint32_t max_iters;
+    uint32_t min_remaining;
     const libxl_domain_remus_info *remus;
     /* private */
     int rc;
diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index 9b6b3c99aa..2cb4980c80 100644
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -165,17 +165,18 @@ const struct cmd_spec cmd_table[] = {
       &main_migrate, 0, 1,
       "Migrate a domain to another host",
       "[options] <Domain> <host>",
-      "-h              Print this help.\n"
-      "-C <config>     Send <config> instead of config file from creation.\n"
-      "-s <sshcommand> Use <sshcommand> instead of ssh.  String will be passed\n"
-      "                to sh. If empty, run <host> instead of ssh <host> xl\n"
-      "                migrate-receive [-d -e]\n"
-      "-e              Do not wait in the background (on <host>) for the death\n"
-      "                of the domain.\n"
-      "--debug         Ignored.\n"
-      "-p              Do not unpause domain after migrating it.\n"
-      "-D              Preserve the domain id\n"
-      "--max_iters N   Number of copy iterations before final stop+move"
+      "-h                Print this help.\n"
+      "-C <config>       Send <config> instead of config file from creation.\n"
+      "-s <sshcommand>   Use <sshcommand> instead of ssh.  String will be passed\n"
+      "                  to sh. If empty, run <host> instead of ssh <host> xl\n"
+      "                  migrate-receive [-d -e]\n"
+      "-e                Do not wait in the background (on <host>) for the death\n"
+      "                  of the domain.\n"
+      "--debug           Ignored.\n"
+      "-p                Do not unpause domain after migrating it.\n"
+      "-D                Preserve the domain id\n"
+      "--max_iters N     Number of copy iterations before final stop+move\n"
+      "--min_remaining N Number of remaining dirty pages before final stop+move"
     },
     { "restore",
       &main_restore, 0, 1,
diff --git a/tools/xl/xl_migrate.c b/tools/xl/xl_migrate.c
index af117d4d56..14feb2b7ec 100644
--- a/tools/xl/xl_migrate.c
+++ b/tools/xl/xl_migrate.c
@@ -179,6 +179,7 @@ static void migrate_do_preamble(int send_fd, int recv_fd, pid_t child,
 static void migrate_domain(uint32_t domid, int preserve_domid,
                            const char *rune, int debug,
                            uint32_t max_iters,
+                           uint32_t min_remaining,
                            const char *override_config_file)
 {
     pid_t child = -1;
@@ -191,6 +192,7 @@ static void migrate_domain(uint32_t domid, int preserve_domid,
     libxl_domain_suspend_props props = {
         .flags = LIBXL_SUSPEND_LIVE,
         .max_iters = max_iters,
+        .min_remaining = min_remaining,
         };
 
     save_domain_core_begin(domid, preserve_domid, override_config_file,
@@ -545,9 +547,11 @@ int main_migrate(int argc, char **argv)
     int opt, daemonize = 1, monitor = 1, debug = 0, pause_after_migration = 0;
     int preserve_domid = 0;
     uint32_t max_iters = 0;
+    uint32_t min_remaining = 0;
     static struct option opts[] = {
         {"debug", 0, 0, 0x100},
         {"max_iters", 1, 0, 0x101},
+        {"min_remaining", 1, 0, 0x102},
         {"live", 0, 0, 0x200},
         COMMON_LONG_OPTS
     };
@@ -578,6 +582,9 @@ int main_migrate(int argc, char **argv)
     case 0x101: /* --max_iters */
         max_iters = atoi(optarg);
         break;
+    case 0x102: /* --min_remaining */
+        min_remaining = atoi(optarg);
+        break;
     case 0x200: /* --live */
         /* ignored for compatibility with xm */
         break;
@@ -613,7 +620,7 @@ int main_migrate(int argc, char **argv)
     }
 
     migrate_domain(domid, preserve_domid, rune, debug,
-                   max_iters, config_filename);
+                   max_iters, min_remaining, config_filename);
     return EXIT_SUCCESS;
 }
 



* [PATCH v20210701 37/40] tools: add --abort_if_busy to libxl_domain_suspend
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (35 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 36/40] tools: add --min_remaining " Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 38/40] tools: add API for expandable bitmaps Olaf Hering
                   ` (2 subsequent siblings)
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel
  Cc: Olaf Hering, Ian Jackson, Wei Liu, Anthony PERARD, Juergen Gross

Provide a knob to the host admin to abort the live migration of a
running domU if the downtime during final transit will be too long
for the workload within domU.

Adjust error reporting. Add ERROR_MIGRATION_ABORTED to allow callers of
libxl_domain_suspend to distinguish between errors and the requested
constraint.

Adjust precopy_policy to simplify reporting of remaining dirty pages.
The loop in send_memory_live populates ->dirty_count in a different
place than ->iteration. Let it proceed one more time to provide the
desired information before leaving the loop.
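
The decision described above can be modelled in isolation. The following
is only a sketch of the behaviour the option text describes, not a copy of
libxl__domain_save_precopy_policy(): the enum, the struct and the function
name are hypothetical stand-ins, and a negative dirty_count is assumed to
mean "not known yet for this iteration".

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the XGS_POLICY_* return values. */
enum policy_decision {
    POLICY_CONTINUE_PRECOPY,
    POLICY_STOP_AND_COPY,
    POLICY_ABORT,
};

/* Hypothetical container for the three knobs exposed by "xl migrate". */
struct policy_knobs {
    unsigned int max_iters;      /* --max_iters N */
    unsigned int min_remaining;  /* --min_remaining N */
    bool abort_if_busy;          /* --abort_if_busy */
};

/*
 * Decide what to do after one precopy iteration.
 * A negative dirty_count means the count is not available yet, so the
 * loop is allowed to run once more before any decision is made.
 */
enum policy_decision precopy_decide(const struct policy_knobs *k,
                                    unsigned int iteration, long dirty_count)
{
    if (dirty_count < 0)
        return POLICY_CONTINUE_PRECOPY;

    /* Few enough dirty pages left: the final stop+move will be short. */
    if (dirty_count < (long)k->min_remaining)
        return POLICY_STOP_AND_COPY;

    /* Iteration budget used up while the guest is still busy. */
    if (iteration >= k->max_iters)
        return k->abort_if_busy ? POLICY_ABORT : POLICY_STOP_AND_COPY;

    return POLICY_CONTINUE_PRECOPY;
}
```

With max_iters = 5 and min_remaining = 50, a guest that still has 1000
dirty pages after the fifth iteration is aborted when abort_if_busy is
set, and moved regardless when it is not.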

This patch adjusts xl(1) and the libxl API.
External users check LIBXL_HAVE_DOMAIN_SUSPEND_PROPS for the availability
of the new LIBXL_SUSPEND_ABORT_IF_BUSY flag.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 docs/man/xl.1.pod.in                  |  8 +++++++
 tools/include/libxl.h                 |  1 +
 tools/libs/light/libxl_dom_save.c     |  7 ++++++-
 tools/libs/light/libxl_domain.c       |  1 +
 tools/libs/light/libxl_internal.h     |  2 ++
 tools/libs/light/libxl_stream_write.c |  9 +++++++-
 tools/libs/light/libxl_types.idl      |  1 +
 tools/xl/xl_cmdtable.c                |  6 +++++-
 tools/xl/xl_migrate.c                 | 30 ++++++++++++++++++++-------
 9 files changed, 55 insertions(+), 10 deletions(-)

diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in
index 09e866ad87..37267c9171 100644
--- a/docs/man/xl.1.pod.in
+++ b/docs/man/xl.1.pod.in
@@ -506,6 +506,14 @@ low, the guest is suspended and the domU will finally be moved to I<host>.
 This allows the host admin to control for how long the domU will likely
 be suspended during transit.
 
+=item B<--abort_if_busy>
+
+Abort migration instead of doing final suspend/move/resume if the
+guest produced more than I<min_remaining> dirty pages during the number
+of I<max_iters> iterations.
+This avoids long periods of time where the guest is suspended, which
+may confuse the workload within domU.
+
 =back
 
 =item B<remus> [I<OPTIONS>] I<domain-id> I<host>
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 28d70b1078..cc056ed627 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -1719,6 +1719,7 @@ typedef struct {
 } libxl_domain_suspend_props;
 #define LIBXL_SUSPEND_DEBUG 1
 #define LIBXL_SUSPEND_LIVE 2
+#define LIBXL_SUSPEND_ABORT_IF_BUSY 4
 
 int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd,
                          libxl_domain_suspend_props *props,
diff --git a/tools/libs/light/libxl_dom_save.c b/tools/libs/light/libxl_dom_save.c
index ad5df89b2c..1999a8997f 100644
--- a/tools/libs/light/libxl_dom_save.c
+++ b/tools/libs/light/libxl_dom_save.c
@@ -383,11 +383,16 @@ static int libxl__domain_save_precopy_policy(precopy_stats_t stats, void *user)
          stats.iteration, stats.dirty_count, stats.total_written);
     if (stats.dirty_count >= 0 && stats.dirty_count < dss->min_remaining)
         goto stop_copy;
-    if (stats.iteration >= dss->max_iters)
+    if (stats.dirty_count >= 0 && stats.iteration >= dss->max_iters)
         goto stop_copy;
     return XGS_POLICY_CONTINUE_PRECOPY;
 
 stop_copy:
+    if (dss->abort_if_busy)
+    {
+        dss->remaining_dirty_pages = stats.dirty_count;
+        return XGS_POLICY_ABORT;
+    }
     return XGS_POLICY_STOP_AND_COPY;
 }
 
diff --git a/tools/libs/light/libxl_domain.c b/tools/libs/light/libxl_domain.c
index 06ca7a7df6..e4740b063e 100644
--- a/tools/libs/light/libxl_domain.c
+++ b/tools/libs/light/libxl_domain.c
@@ -529,6 +529,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd,
     dss->type = type;
     dss->max_iters = props->max_iters ?: LIBXL_XGS_POLICY_MAX_ITERATIONS;
     dss->min_remaining = props->min_remaining ?: LIBXL_XGS_POLICY_TARGET_DIRTY_COUNT;
+    dss->abort_if_busy = props->flags & LIBXL_SUSPEND_ABORT_IF_BUSY;
     dss->live = props->flags & LIBXL_SUSPEND_LIVE;
     dss->debug = props->flags & LIBXL_SUSPEND_DEBUG;
     dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_NONE;
diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h
index e4bfb34085..905d5179ba 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -3648,9 +3648,11 @@ struct libxl__domain_save_state {
     libxl_domain_type type;
     int live;
     int debug;
+    int abort_if_busy;
     int checkpointed_stream;
     uint32_t max_iters;
     uint32_t min_remaining;
+    long remaining_dirty_pages;
     const libxl_domain_remus_info *remus;
     /* private */
     int rc;
diff --git a/tools/libs/light/libxl_stream_write.c b/tools/libs/light/libxl_stream_write.c
index 634f3240d1..1ab3943f3e 100644
--- a/tools/libs/light/libxl_stream_write.c
+++ b/tools/libs/light/libxl_stream_write.c
@@ -344,11 +344,18 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
         goto err;
 
     if (retval) {
+        if (dss->remaining_dirty_pages) {
+            LOGD(NOTICE, dss->domid, "saving domain: aborted,"
+                 " %ld remaining dirty pages.", dss->remaining_dirty_pages);
+        } else {
         LOGEVD(ERROR, errnoval, dss->domid, "saving domain: %s",
               dss->dsps.guest_responded ?
               "domain responded to suspend request" :
               "domain did not respond to suspend request");
-        if (!dss->dsps.guest_responded)
+        }
+        if (dss->remaining_dirty_pages)
+           rc = ERROR_MIGRATION_ABORTED;
+        else if(!dss->dsps.guest_responded)
             rc = ERROR_GUEST_TIMEDOUT;
         else if (dss->rc)
             rc = dss->rc;
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index f45adddab0..b91769ee10 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -76,6 +76,7 @@ libxl_error = Enumeration("error", [
     (-30, "QMP_DEVICE_NOT_ACTIVE"), # a device has failed to be become active
     (-31, "QMP_DEVICE_NOT_FOUND"), # the requested device has not been found
     (-32, "QEMU_API"), # QEMU's replies don't contains expected members
+    (-33, "MIGRATION_ABORTED"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index 2cb4980c80..322a47c2bc 100644
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -176,7 +176,11 @@ const struct cmd_spec cmd_table[] = {
       "-p                Do not unpause domain after migrating it.\n"
       "-D                Preserve the domain id\n"
       "--max_iters N     Number of copy iterations before final stop+move\n"
-      "--min_remaining N Number of remaining dirty pages before final stop+move"
+      "--min_remaining N Number of remaining dirty pages before final stop+move\n"
+      "--abort_if_busy   Abort migration instead of doing final stop+move,\n"
+      "                  if the number of dirty pages is higher than <min_remaining>\n"
+      "                  after <max_iters> iterations. Otherwise the amount of memory\n"
+      "                  to be transferred would exceed maximum allowed domU downtime."
     },
     { "restore",
       &main_restore, 0, 1,
diff --git a/tools/xl/xl_migrate.c b/tools/xl/xl_migrate.c
index 14feb2b7ec..f523746e5b 100644
--- a/tools/xl/xl_migrate.c
+++ b/tools/xl/xl_migrate.c
@@ -177,7 +177,7 @@ static void migrate_do_preamble(int send_fd, int recv_fd, pid_t child,
 }
 
 static void migrate_domain(uint32_t domid, int preserve_domid,
-                           const char *rune, int debug,
+                           const char *rune, int debug, int abort_if_busy,
                            uint32_t max_iters,
                            uint32_t min_remaining,
                            const char *override_config_file)
@@ -213,14 +213,20 @@ static void migrate_domain(uint32_t domid, int preserve_domid,
 
     if (debug)
         props.flags |= LIBXL_SUSPEND_DEBUG;
+    if (abort_if_busy)
+        props.flags |= LIBXL_SUSPEND_ABORT_IF_BUSY;
     rc = libxl_domain_suspend(ctx, domid, send_fd, &props, NULL);
     if (rc) {
         fprintf(stderr, "migration sender: libxl_domain_suspend failed"
                 " (rc=%d)\n", rc);
-        if (rc == ERROR_GUEST_TIMEDOUT)
-            goto failed_suspend;
-        else
-            goto failed_resume;
+        switch (rc) {
+            case ERROR_GUEST_TIMEDOUT:
+                goto failed_suspend;
+            case ERROR_MIGRATION_ABORTED:
+                goto failed_busy;
+            default:
+                goto failed_resume;
+        }
     }
 
     //fprintf(stderr, "migration sender: Transfer complete.\n");
@@ -302,6 +308,12 @@ static void migrate_domain(uint32_t domid, int preserve_domid,
     fprintf(stderr, "Migration failed, failed to suspend at sender.\n");
     exit(EXIT_FAILURE);
 
+ failed_busy:
+    close(send_fd);
+    migration_child_report(recv_fd);
+    fprintf(stderr, "Migration aborted as requested, domain is too busy.\n");
+    exit(EXIT_FAILURE);
+
  failed_resume:
     close(send_fd);
     migration_child_report(recv_fd);
@@ -545,13 +557,14 @@ int main_migrate(int argc, char **argv)
     char *rune = NULL;
     char *host;
     int opt, daemonize = 1, monitor = 1, debug = 0, pause_after_migration = 0;
-    int preserve_domid = 0;
+    int preserve_domid = 0, abort_if_busy = 0;
     uint32_t max_iters = 0;
     uint32_t min_remaining = 0;
     static struct option opts[] = {
         {"debug", 0, 0, 0x100},
         {"max_iters", 1, 0, 0x101},
         {"min_remaining", 1, 0, 0x102},
+        {"abort_if_busy", 0, 0, 0x103},
         {"live", 0, 0, 0x200},
         COMMON_LONG_OPTS
     };
@@ -585,6 +598,9 @@ int main_migrate(int argc, char **argv)
     case 0x102: /* --min_remaining */
         min_remaining = atoi(optarg);
         break;
+    case 0x103: /* --abort_if_busy */
+        abort_if_busy = 1;
+        break;
     case 0x200: /* --live */
         /* ignored for compatibility with xm */
         break;
@@ -619,7 +635,7 @@ int main_migrate(int argc, char **argv)
                   pause_after_migration ? " -p" : "");
     }
 
-    migrate_domain(domid, preserve_domid, rune, debug,
+    migrate_domain(domid, preserve_domid, rune, debug, abort_if_busy,
                    max_iters, min_remaining, config_filename);
     return EXIT_SUCCESS;
 }



* [PATCH v20210701 38/40] tools: add API for expandable bitmaps
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (36 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 37/40] tools: add --abort_if_busy " Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 39/40] tools: use xg_sr_bitmap for populated_pfns Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 40/40] tools/libxc: use superpages during restore of HVM guest Olaf Hering
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu, Juergen Gross

Since the incoming migration stream lacks info about what the highest pfn
will be, some data structures cannot be allocated up front.

Add an API for expandable bitmaps, loosely based on pfn_set_populated.
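
As an illustration of the idea only, a grow-on-demand bitmap can be
sketched in standalone C. All demo_* names are hypothetical and not part
of this patch. The sketch assumes that "bits" is a capacity, so storing
bit index i requires a capacity of at least i + 1, and that growing in
powers of two is an acceptable way to keep realloc() calls rare.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for struct sr_bitmap; "bits" is the capacity. */
struct demo_bitmap {
    unsigned long *words;
    size_t bits;
};

static size_t words_for(size_t bits)
{
    return (bits + WORD_BITS - 1) / WORD_BITS;
}

/* Grow the bitmap to hold at least "bits" bits; the new area is zeroed. */
bool demo_bitmap_expand(struct demo_bitmap *bm, size_t bits)
{
    size_t new_bits = WORD_BITS, old_words, new_words;
    unsigned long *p;

    assert(bm != NULL);
    if (bits <= bm->bits)
        return true;

    /* Double until large enough, refusing sizes that would overflow. */
    while (new_bits < bits) {
        if (new_bits > SIZE_MAX / 2)
            return false;
        new_bits *= 2;
    }

    old_words = words_for(bm->bits);
    new_words = words_for(new_bits);
    p = realloc(bm->words, new_words * sizeof(*p));
    if (!p)
        return false;

    memset(p + old_words, 0, (new_words - old_words) * sizeof(*p));
    bm->words = p;
    bm->bits = new_bits;
    return true;
}

/* Set a bit, expanding first; returns false on allocation failure. */
bool demo_set_bit(struct demo_bitmap *bm, size_t bit)
{
    if (bit == SIZE_MAX || !demo_bitmap_expand(bm, bit + 1))
        return false;
    bm->words[bit / WORD_BITS] |= 1UL << (bit % WORD_BITS);
    return true;
}

/* Bits beyond the current capacity read as clear. */
bool demo_test_bit(const struct demo_bitmap *bm, size_t bit)
{
    if (bit >= bm->bits)
        return false;
    return (bm->words[bit / WORD_BITS] >> (bit % WORD_BITS)) & 1UL;
}

void demo_bitmap_free(struct demo_bitmap *bm)
{
    free(bm->words);
    bm->words = NULL;
    bm->bits = 0;
}
```

A zero-initialised struct demo_bitmap is a valid empty bitmap, so callers
need no separate init step; only the setter can fail, because it is the
only operation that may allocate.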

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 tools/libs/saverestore/common.c | 39 +++++++++++++++++++
 tools/libs/saverestore/common.h | 67 +++++++++++++++++++++++++++++++++
 2 files changed, 106 insertions(+)

diff --git a/tools/libs/saverestore/common.c b/tools/libs/saverestore/common.c
index 7da7fa4e2c..e96173eea2 100644
--- a/tools/libs/saverestore/common.c
+++ b/tools/libs/saverestore/common.c
@@ -163,6 +163,45 @@ static void __attribute__((unused)) build_assertions(void)
     BUILD_BUG_ON(sizeof(struct xc_sr_rec_hvm_params)        != 8);
 }
 
+/*
+ * Expand the tracking structures as needed.
+ * To avoid realloc()ing excessively, the size is increased to the nearest
+ * power of two large enough to contain the required number of bits.
+ */
+bool _sr_bitmap_expand(struct sr_bitmap *bm, unsigned long bits)
+{
+    size_t new_max;
+    size_t old_sz, new_sz;
+    void *p;
+
+    if (bits <= bm->bits)
+        return true;
+
+    /* Round up to the nearest power of two larger than bit, less 1. */
+    new_max = bits;
+    new_max |= new_max >> 1;
+    new_max |= new_max >> 2;
+    new_max |= new_max >> 4;
+    new_max |= new_max >> 8;
+    new_max |= new_max >> 16;
+    new_max |= sizeof(unsigned long) > 4 ? new_max >> 32 : 0;
+
+    /* Allocate units of unsigned long */
+    new_max = (new_max + BITS_PER_LONG - 1) & ~(BITS_PER_LONG - 1);
+
+    old_sz = bitmap_size(bm->bits);
+    new_sz = bitmap_size(new_max);
+    p = realloc(bm->p, new_sz);
+    if (!p)
+        return false;
+
+    memset(p + old_sz, 0, new_sz - old_sz);
+    bm->p = p;
+    bm->bits = new_max;
+
+    return true;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index bb7e437291..e6a269c482 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -30,6 +30,73 @@ const char *rec_type_to_str(uint32_t type);
 struct xc_sr_context;
 struct xc_sr_record;
 
+struct sr_bitmap
+{
+    void *p;
+    unsigned long bits;
+};
+
+extern bool _sr_bitmap_expand(struct sr_bitmap *bm, unsigned long bits);
+
+static inline bool sr_bitmap_expand(struct sr_bitmap *bm, unsigned long bits)
+{
+    if (bits > bm->bits)
+        return _sr_bitmap_expand(bm, bits);
+    return true;
+}
+
+static inline void sr_bitmap_free(struct sr_bitmap *bm)
+{
+    free(bm->p);
+    bm->p = NULL;
+}
+
+static inline bool sr_set_bit(unsigned long bit, struct sr_bitmap *bm)
+{
+    if (sr_bitmap_expand(bm, bit) == false)
+        return false;
+
+    set_bit(bit, bm->p);
+    return true;
+}
+
+static inline bool sr_test_bit(unsigned long bit, struct sr_bitmap *bm)
+{
+    if (bit > bm->bits)
+        return false;
+    return !!test_bit(bit, bm->p);
+}
+
+static inline void sr_clear_bit(unsigned long bit, struct sr_bitmap *bm)
+{
+    if (bit <= bm->bits)
+        clear_bit(bit, bm->p);
+}
+
+static inline bool sr_test_and_clear_bit(unsigned long bit, struct sr_bitmap *bm)
+{
+    if (bit > bm->bits)
+        return false;
+    return !!test_and_clear_bit(bit, bm->p);
+}
+
+/* No way to report potential allocation error, bitmap must be expanded prior to use */
+static inline bool sr_test_and_set_bit(unsigned long bit, struct sr_bitmap *bm)
+{
+    if (bit > bm->bits)
+        return false;
+    return !!test_and_set_bit(bit, bm->p);
+}
+
+static inline bool sr_set_long_bit(unsigned long base_bit, struct sr_bitmap *bm)
+{
+    if (sr_bitmap_expand(bm, base_bit + BITS_PER_LONG) == false)
+        return false;
+
+    set_bit_long(base_bit, bm->p);
+    return true;
+}
+
 /**
  * Save operations.  To be implemented for each type of guest, for use by the
  * common save algorithm.



* [PATCH v20210701 39/40] tools: use xg_sr_bitmap for populated_pfns
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (37 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 38/40] tools: add API for expandable bitmaps Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  2021-07-01  9:56 ` [PATCH v20210701 40/40] tools/libxc: use superpages during restore of HVM guest Olaf Hering
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu, Juergen Gross

Signed-off-by: Olaf Hering <olaf@aepfle.de>

v02:
- remove xg_ prefix from called functions
---
 tools/libs/saverestore/common.h          | 21 +++++++-
 tools/libs/saverestore/restore.c         | 69 ------------------------
 tools/libs/saverestore/restore_x86_hvm.c |  9 ++++
 tools/libs/saverestore/restore_x86_pv.c  |  7 +++
 4 files changed, 35 insertions(+), 71 deletions(-)

diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index e6a269c482..a610483fe7 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -403,8 +403,7 @@ struct xc_sr_context
             uint32_t     xenstore_domid,  console_domid;
 
             /* Bitmap of currently populated PFNs during restore. */
-            unsigned long *populated_pfns;
-            xen_pfn_t max_populated_pfn;
+            struct sr_bitmap populated_pfns;
 
             /* Sender has invoked verify mode on the stream. */
             bool verify;
@@ -647,6 +646,24 @@ static inline bool page_type_has_stream_data(uint32_t type)
     }
     return ret;
 }
+
+static inline bool pfn_is_populated(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+    return sr_test_bit(pfn, &ctx->restore.populated_pfns);
+}
+
+static inline int pfn_set_populated(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+    xc_interface *xch = ctx->xch;
+
+    if ( sr_set_bit(pfn, &ctx->restore.populated_pfns) == false )
+    {
+        PERROR("Failed to realloc populated_pfns bitmap");
+        errno = ENOMEM;
+        return -1;
+    }
+    return 0;
+}
 #endif
 /*
  * Local variables:
diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index 53f05f1b65..baf8ea44e5 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -71,64 +71,6 @@ static int read_headers(struct xc_sr_context *ctx)
     return 0;
 }
 
-/*
- * Is a pfn populated?
- */
-static bool pfn_is_populated(const struct xc_sr_context *ctx, xen_pfn_t pfn)
-{
-    if ( pfn > ctx->restore.max_populated_pfn )
-        return false;
-    return test_bit(pfn, ctx->restore.populated_pfns);
-}
-
-/*
- * Set a pfn as populated, expanding the tracking structures if needed. To
- * avoid realloc()ing too excessively, the size increased to the nearest power
- * of two large enough to contain the required pfn.
- */
-static int pfn_set_populated(struct xc_sr_context *ctx, xen_pfn_t pfn)
-{
-    xc_interface *xch = ctx->xch;
-
-    if ( pfn > ctx->restore.max_populated_pfn )
-    {
-        xen_pfn_t new_max;
-        size_t old_sz, new_sz;
-        unsigned long *p;
-
-        /* Round up to the nearest power of two larger than pfn, less 1. */
-        new_max = pfn;
-        new_max |= new_max >> 1;
-        new_max |= new_max >> 2;
-        new_max |= new_max >> 4;
-        new_max |= new_max >> 8;
-        new_max |= new_max >> 16;
-#ifdef __x86_64__
-        new_max |= new_max >> 32;
-#endif
-
-        old_sz = bitmap_size(ctx->restore.max_populated_pfn + 1);
-        new_sz = bitmap_size(new_max + 1);
-        p = realloc(ctx->restore.populated_pfns, new_sz);
-        if ( !p )
-        {
-            ERROR("Failed to realloc populated bitmap");
-            errno = ENOMEM;
-            return -1;
-        }
-
-        memset((uint8_t *)p + old_sz, 0x00, new_sz - old_sz);
-
-        ctx->restore.populated_pfns    = p;
-        ctx->restore.max_populated_pfn = new_max;
-    }
-
-    assert(!test_bit(pfn, ctx->restore.populated_pfns));
-    set_bit(pfn, ctx->restore.populated_pfns);
-
-    return 0;
-}
-
 /*
  * Given a set of pfns, obtain memory from Xen to fill the physmap for the
  * unpopulated subset.  If types is NULL, no page type checking is performed
@@ -929,16 +871,6 @@ static int setup(struct xc_sr_context *ctx)
     if ( rc )
         goto err;
 
-    ctx->restore.max_populated_pfn = (32 * 1024 / 4) - 1;
-    ctx->restore.populated_pfns = bitmap_alloc(
-        ctx->restore.max_populated_pfn + 1);
-    if ( !ctx->restore.populated_pfns )
-    {
-        ERROR("Unable to allocate memory for populated_pfns bitmap");
-        rc = -1;
-        goto err;
-    }
-
     ctx->restore.buffered_records = malloc(
         DEFAULT_BUF_RECORDS * sizeof(struct xc_sr_record));
     if ( !ctx->restore.buffered_records )
@@ -977,7 +909,6 @@ static void cleanup(struct xc_sr_context *ctx)
 
     free(ctx->restore.m);
     free(ctx->restore.buffered_records);
-    free(ctx->restore.populated_pfns);
 
     if ( ctx->restore.ops.cleanup(ctx) )
         PERROR("Failed to clean up");
diff --git a/tools/libs/saverestore/restore_x86_hvm.c b/tools/libs/saverestore/restore_x86_hvm.c
index bd63bd2818..97e7e0f48c 100644
--- a/tools/libs/saverestore/restore_x86_hvm.c
+++ b/tools/libs/saverestore/restore_x86_hvm.c
@@ -136,6 +136,7 @@ static int x86_hvm_localise_page(struct xc_sr_context *ctx,
 static int x86_hvm_setup(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
+    unsigned long max_pfn;
 
     if ( ctx->restore.guest_type != DHDR_TYPE_X86_HVM )
     {
@@ -161,6 +162,13 @@ static int x86_hvm_setup(struct xc_sr_context *ctx)
     }
 #endif
 
+    max_pfn = max(ctx->restore.p2m_size, ctx->dominfo.max_memkb >> (PAGE_SHIFT-10));
+    if ( !sr_bitmap_expand(&ctx->restore.populated_pfns, max_pfn) )
+    {
+        PERROR("Unable to allocate memory for populated_pfns bitmap");
+        return -1;
+    }
+
     return 0;
 }
 
@@ -241,6 +249,7 @@ static int x86_hvm_stream_complete(struct xc_sr_context *ctx)
 
 static int x86_hvm_cleanup(struct xc_sr_context *ctx)
 {
+    sr_bitmap_free(&ctx->restore.populated_pfns);
     free(ctx->x86.hvm.restore.context.ptr);
 
     free(ctx->x86.restore.cpuid.ptr);
diff --git a/tools/libs/saverestore/restore_x86_pv.c b/tools/libs/saverestore/restore_x86_pv.c
index 96608e5231..c73a3cd99f 100644
--- a/tools/libs/saverestore/restore_x86_pv.c
+++ b/tools/libs/saverestore/restore_x86_pv.c
@@ -1060,6 +1060,12 @@ static int x86_pv_setup(struct xc_sr_context *ctx)
     if ( rc )
         return rc;
 
+    if ( !sr_bitmap_expand(&ctx->restore.populated_pfns, 32 * 1024 / 4) )
+    {
+        PERROR("Unable to allocate memory for populated_pfns bitmap");
+        return -1;
+    }
+
     ctx->x86.pv.restore.nr_vcpus = ctx->dominfo.max_vcpu_id + 1;
     ctx->x86.pv.restore.vcpus = calloc(sizeof(struct xc_sr_x86_pv_restore_vcpu),
                                        ctx->x86.pv.restore.nr_vcpus);
@@ -1153,6 +1159,7 @@ static int x86_pv_stream_complete(struct xc_sr_context *ctx)
  */
 static int x86_pv_cleanup(struct xc_sr_context *ctx)
 {
+    sr_bitmap_free(&ctx->restore.populated_pfns);
     free(ctx->x86.pv.p2m);
     free(ctx->x86.pv.p2m_pfns);
 



* [PATCH v20210701 40/40] tools/libxc: use superpages during restore of HVM guest
  2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
                   ` (38 preceding siblings ...)
  2021-07-01  9:56 ` [PATCH v20210701 39/40] tools: use xg_sr_bitmap for populated_pfns Olaf Hering
@ 2021-07-01  9:56 ` Olaf Hering
  39 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01  9:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Ian Jackson, Wei Liu, Juergen Gross

During creation of an HVM domU, meminit_hvm() tries to map superpages.
After save/restore or migration this mapping is lost and everything is
allocated in single pages. This causes a performance degradation after
migration.

Add the necessary code to preallocate a superpage for an incoming chunk of
pfns. In case a pfn was not populated on the sending side, it must be
freed on the receiving side to avoid over-allocation.
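
The arithmetic behind this can be sketched as follows. The two shift
values are the ones used in this patch. The sp_* helpers are hypothetical
and only illustrate how a pfn maps to its superpage, and how many pfns of
a preallocated superpage would have to be released again because the
stream never delivered them. The pfns in the array are assumed to be
distinct.

```c
#define SUPERPAGE_2MB_SHIFT   9
#define SUPERPAGE_1GB_SHIFT   18

/* First pfn of the superpage (of the given order) that contains pfn. */
unsigned long sp_base_pfn(unsigned long pfn, unsigned int shift)
{
    return pfn & ~((1UL << shift) - 1);
}

/* Index of that superpage, usable as a bit number in a tracking bitmap. */
unsigned long sp_index(unsigned long pfn, unsigned int shift)
{
    return pfn >> shift;
}

/*
 * Number of pfns inside the superpage starting at base that do not appear
 * in pfns[]. These are the candidates for being freed again on the
 * receiving side. Assumes pfns[] holds no duplicates.
 */
unsigned long sp_holes(const unsigned long *pfns, unsigned long count,
                       unsigned long base, unsigned int shift)
{
    unsigned long i, inside = 0, nr = 1UL << shift;

    for (i = 0; i < count; i++)
        if (pfns[i] >= base && pfns[i] - base < nr)
            inside++;

    return nr - inside;
}
```

For example, pfn 0x2a5 lies in the 2MB superpage with index 1, which
starts at pfn 0x200. If only three pfns of that range arrive in the
stream, 509 of its 512 pfns remain as holes.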

The existing code for x86_pv is moved unmodified into its own file.

Signed-off-by: Olaf Hering <olaf@aepfle.de>

v02:
- remove xg_ prefix from called functions
---
 tools/libs/guest/xg_dom_x86.c            |   5 -
 tools/libs/guest/xg_private.h            |   5 +
 tools/libs/saverestore/common.c          |   1 -
 tools/libs/saverestore/common.h          |  28 +-
 tools/libs/saverestore/restore.c         |  62 +---
 tools/libs/saverestore/restore_x86_hvm.c | 370 ++++++++++++++++++++++-
 tools/libs/saverestore/restore_x86_pv.c  |  61 +++-
 7 files changed, 455 insertions(+), 77 deletions(-)

diff --git a/tools/libs/guest/xg_dom_x86.c b/tools/libs/guest/xg_dom_x86.c
index d2eb89ce01..ec0d18fd60 100644
--- a/tools/libs/guest/xg_dom_x86.c
+++ b/tools/libs/guest/xg_dom_x86.c
@@ -44,11 +44,6 @@
 
 #define SUPERPAGE_BATCH_SIZE 512
 
-#define SUPERPAGE_2MB_SHIFT   9
-#define SUPERPAGE_2MB_NR_PFNS (1UL << SUPERPAGE_2MB_SHIFT)
-#define SUPERPAGE_1GB_SHIFT   18
-#define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)
-
 #define X86_CR0_PE 0x01
 #define X86_CR0_ET 0x10
 
diff --git a/tools/libs/guest/xg_private.h b/tools/libs/guest/xg_private.h
index 28441ee13f..b7372e6bd5 100644
--- a/tools/libs/guest/xg_private.h
+++ b/tools/libs/guest/xg_private.h
@@ -179,4 +179,9 @@ struct xc_cpu_policy {
 };
 #endif /* x86 */
 
+#define SUPERPAGE_2MB_SHIFT   9
+#define SUPERPAGE_2MB_NR_PFNS (1UL << SUPERPAGE_2MB_SHIFT)
+#define SUPERPAGE_1GB_SHIFT   18
+#define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)
+
 #endif /* XG_PRIVATE_H */
diff --git a/tools/libs/saverestore/common.c b/tools/libs/saverestore/common.c
index e96173eea2..8dbd516b1b 100644
--- a/tools/libs/saverestore/common.c
+++ b/tools/libs/saverestore/common.c
@@ -1,5 +1,4 @@
 #include <assert.h>
-
 #include "common.h"
 
 #include <xen-tools/libs.h>
diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
index a610483fe7..3d392f1ac9 100644
--- a/tools/libs/saverestore/common.h
+++ b/tools/libs/saverestore/common.h
@@ -219,6 +219,16 @@ struct xc_sr_restore_ops
      */
     int (*setup)(struct xc_sr_context *ctx);
 
+    /**
+     * Populate PFNs
+     *
+     * Given a set of pfns, obtain memory from Xen to fill the physmap for the
+     * unpopulated subset.
+     */
+    int (*populate_pfns)(struct xc_sr_context *ctx, unsigned count,
+                         const xen_pfn_t *original_pfns, const uint32_t *types);
+
+
     /**
      * Process an individual record from the stream.  The caller shall take
      * care of processing common records (e.g. END, PAGE_DATA).
@@ -366,6 +376,8 @@ struct xc_sr_context
 
             int send_back_fd;
             unsigned long p2m_size;
+            unsigned long max_pages;
+            unsigned long tot_pages;
             xc_hypercall_buffer_t dirty_bitmap_hbuf;
 
             /* From Image Header. */
@@ -503,6 +515,14 @@ struct xc_sr_context
                     {
                         /* HVM context blob. */
                         struct xc_sr_blob context;
+
+                        /* Bitmap of currently allocated PFNs during restore. */
+                        struct sr_bitmap attempted_1g;
+                        struct sr_bitmap attempted_2m;
+                        struct sr_bitmap allocated_pfns;
+                        xen_pfn_t prev_populated_pfn;
+                        xen_pfn_t iteration_tracker_pfn;
+                        unsigned long iteration;
                     } restore;
                 };
             } hvm;
@@ -567,14 +587,6 @@ int read_record_header(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhd
 int read_record_data(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr,
                      struct xc_sr_record *rec);
 
-/*
- * This would ideally be private in restore.c, but is needed by
- * x86_pv_localise_page() if we receive pagetables frames ahead of the
- * contents of the frames they point at.
- */
-int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
-                  const xen_pfn_t *original_pfns, const uint32_t *types);
-
 /* Handle a STATIC_DATA_END record. */
 int handle_static_data_end(struct xc_sr_context *ctx);
 
diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
index baf8ea44e5..5ad3df49ba 100644
--- a/tools/libs/saverestore/restore.c
+++ b/tools/libs/saverestore/restore.c
@@ -71,63 +71,6 @@ static int read_headers(struct xc_sr_context *ctx)
     return 0;
 }
 
-/*
- * Given a set of pfns, obtain memory from Xen to fill the physmap for the
- * unpopulated subset.  If types is NULL, no page type checking is performed
- * and all unpopulated pfns are populated.
- */
-int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
-                  const xen_pfn_t *original_pfns, const uint32_t *types)
-{
-    xc_interface *xch = ctx->xch;
-    xen_pfn_t *mfns = ctx->restore.m->pp_mfns,
-        *pfns = ctx->restore.m->pp_pfns;
-    unsigned int i, nr_pfns = 0;
-    int rc = -1;
-
-    for ( i = 0; i < count; ++i )
-    {
-        if ( (!types ||
-              (types && page_type_to_populate(types[i]) == true)) &&
-             !pfn_is_populated(ctx, original_pfns[i]) )
-        {
-            rc = pfn_set_populated(ctx, original_pfns[i]);
-            if ( rc )
-                goto err;
-            pfns[nr_pfns] = mfns[nr_pfns] = original_pfns[i];
-            ++nr_pfns;
-        }
-    }
-
-    if ( nr_pfns )
-    {
-        rc = xc_domain_populate_physmap_exact(
-            xch, ctx->domid, nr_pfns, 0, 0, mfns);
-        if ( rc )
-        {
-            PERROR("Failed to populate physmap");
-            goto err;
-        }
-
-        for ( i = 0; i < nr_pfns; ++i )
-        {
-            if ( mfns[i] == INVALID_MFN )
-            {
-                ERROR("Populate physmap failed for pfn %u", i);
-                rc = -1;
-                goto err;
-            }
-
-            ctx->restore.ops.set_gfn(ctx, pfns[i], mfns[i]);
-        }
-    }
-
-    rc = 0;
-
- err:
-    return rc;
-}
-
 static int handle_static_data_end_v2(struct xc_sr_context *ctx)
 {
     int rc = 0;
@@ -270,7 +213,7 @@ static int map_guest_pages(struct xc_sr_context *ctx,
     uint32_t i, p;
     int rc;
 
-    rc = populate_pfns(ctx, pages->count, m->pfns, m->types);
+    rc = ctx->restore.ops.populate_pfns(ctx, pages->count, m->pfns, m->types);
     if ( rc )
     {
         ERROR("Failed to populate pfns for batch of %u pages", pages->count);
@@ -1077,6 +1020,9 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
         return -1;
     }
 
+    /* See xc_domain_getinfo */
+    ctx.restore.max_pages = ctx.dominfo.max_memkb >> (PAGE_SHIFT-10);
+    ctx.restore.tot_pages = ctx.dominfo.nr_pages;
     ctx.restore.p2m_size = nr_pfns;
     ctx.restore.ops = ctx.dominfo.hvm
         ? restore_ops_x86_hvm : restore_ops_x86_pv;
diff --git a/tools/libs/saverestore/restore_x86_hvm.c b/tools/libs/saverestore/restore_x86_hvm.c
index 97e7e0f48c..f45635613f 100644
--- a/tools/libs/saverestore/restore_x86_hvm.c
+++ b/tools/libs/saverestore/restore_x86_hvm.c
@@ -130,6 +130,25 @@ static int x86_hvm_localise_page(struct xc_sr_context *ctx,
     return 0;
 }
 
+static bool x86_hvm_expand_sp_bitmaps(struct xc_sr_context *ctx, unsigned long max_pfn)
+{
+    struct sr_bitmap *bm;
+
+    bm = &ctx->x86.hvm.restore.attempted_1g;
+    if ( !sr_bitmap_expand(bm, max_pfn >> SUPERPAGE_1GB_SHIFT) )
+        return false;
+
+    bm = &ctx->x86.hvm.restore.attempted_2m;
+    if ( !sr_bitmap_expand(bm, max_pfn >> SUPERPAGE_2MB_SHIFT) )
+        return false;
+
+    bm = &ctx->x86.hvm.restore.allocated_pfns;
+    if ( !sr_bitmap_expand(bm, max_pfn) )
+        return false;
+
+    return true;
+}
+
 /*
  * restore_ops function. Confirms the stream matches the domain.
  */
@@ -164,12 +183,21 @@ static int x86_hvm_setup(struct xc_sr_context *ctx)
 
     max_pfn = max(ctx->restore.p2m_size, ctx->dominfo.max_memkb >> (PAGE_SHIFT-10));
     if ( !sr_bitmap_expand(&ctx->restore.populated_pfns, max_pfn) )
-    {
-        PERROR("Unable to allocate memory for populated_pfns bitmap");
-        return -1;
-    }
+        goto out;
+
+    if ( !x86_hvm_expand_sp_bitmaps(ctx, max_pfn) )
+        goto out;
+
+    /* FIXME: distinguish between PVH and HVM */
+    /* No superpage in 1st 2MB due to VGA hole */
+    sr_set_bit(0, &ctx->x86.hvm.restore.attempted_1g);
+    sr_set_bit(0, &ctx->x86.hvm.restore.attempted_2m);
 
     return 0;
+
+out:
+    PERROR("Unable to allocate memory for pfn bitmaps");
+    return -1;
 }
 
 /*
@@ -250,6 +278,9 @@ static int x86_hvm_stream_complete(struct xc_sr_context *ctx)
 static int x86_hvm_cleanup(struct xc_sr_context *ctx)
 {
     sr_bitmap_free(&ctx->restore.populated_pfns);
+    sr_bitmap_free(&ctx->x86.hvm.restore.attempted_1g);
+    sr_bitmap_free(&ctx->x86.hvm.restore.attempted_2m);
+    sr_bitmap_free(&ctx->x86.hvm.restore.allocated_pfns);
     free(ctx->x86.hvm.restore.context.ptr);
 
     free(ctx->x86.restore.cpuid.ptr);
@@ -258,6 +289,336 @@ static int x86_hvm_cleanup(struct xc_sr_context *ctx)
     return 0;
 }
 
+/*
+ * Set a range of pfns as allocated
+ */
+static void pfn_set_long_allocated(struct xc_sr_context *ctx, xen_pfn_t base_pfn)
+{
+    sr_set_long_bit(base_pfn, &ctx->x86.hvm.restore.allocated_pfns);
+}
+
+static void pfn_set_allocated(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+    sr_set_bit(pfn, &ctx->x86.hvm.restore.allocated_pfns);
+}
+
+struct x86_hvm_sp {
+    xen_pfn_t pfn;
+    xen_pfn_t base_pfn;
+    unsigned long index;
+    unsigned long count;
+};
+
+/*
+ * Try to allocate a 1GB page for this pfn, but avoid over-allocation.
+ * If this succeeds, mark the range of 2MB pages as busy.
+ */
+static bool x86_hvm_alloc_1g(struct xc_sr_context *ctx, struct x86_hvm_sp *sp)
+{
+    xc_interface *xch = ctx->xch;
+    unsigned int order;
+    int i, done;
+    xen_pfn_t extent;
+
+    /* Only one attempt to avoid overlapping allocation */
+    if ( sr_test_and_set_bit(sp->index, &ctx->x86.hvm.restore.attempted_1g) )
+        return false;
+
+    order = SUPERPAGE_1GB_SHIFT;
+    sp->count = SUPERPAGE_1GB_NR_PFNS;
+
+    /* Allocate only if there is room for another superpage */
+    if ( ctx->restore.tot_pages + sp->count > ctx->restore.max_pages )
+        return false;
+
+    extent = sp->base_pfn = (sp->pfn >> order) << order;
+    done = xc_domain_populate_physmap(xch, ctx->domid, 1, order, 0, &extent);
+    if ( done < 0 ) {
+        PERROR("populate_physmap failed.");
+        return false;
+    }
+    if ( done == 0 )
+        return false;
+
+    DPRINTF("1G %" PRI_xen_pfn "\n", sp->base_pfn);
+
+    /* Mark all 2MB pages as done to avoid overlapping allocation */
+    for ( i = 0; i < (SUPERPAGE_1GB_NR_PFNS/SUPERPAGE_2MB_NR_PFNS); i++ )
+        sr_set_bit((sp->base_pfn >> SUPERPAGE_2MB_SHIFT) + i, &ctx->x86.hvm.restore.attempted_2m);
+
+    return true;
+}
+
+/* Allocate a 2MB page if x86_hvm_alloc_1g failed, avoid over-allocation. */
+static bool x86_hvm_alloc_2m(struct xc_sr_context *ctx, struct x86_hvm_sp *sp)
+{
+    xc_interface *xch = ctx->xch;
+    unsigned int order;
+    int done;
+    xen_pfn_t extent;
+
+    /* Only one attempt to avoid overlapping allocation */
+    if ( sr_test_and_set_bit(sp->index, &ctx->x86.hvm.restore.attempted_2m) )
+        return false;
+
+    order = SUPERPAGE_2MB_SHIFT;
+    sp->count = SUPERPAGE_2MB_NR_PFNS;
+
+    /* Allocate only if there is room for another superpage */
+    if ( ctx->restore.tot_pages + sp->count > ctx->restore.max_pages )
+        return false;
+
+    extent = sp->base_pfn = (sp->pfn >> order) << order;
+    done = xc_domain_populate_physmap(xch, ctx->domid, 1, order, 0, &extent);
+    if ( done < 0 ) {
+        PERROR("populate_physmap failed.");
+        return false;
+    }
+    if ( done == 0 )
+        return false;
+
+    DPRINTF("2M %" PRI_xen_pfn "\n", sp->base_pfn);
+    return true;
+}
+
+/* Allocate a single page if x86_hvm_alloc_2m failed. */
+static bool x86_hvm_alloc_4k(struct xc_sr_context *ctx, struct x86_hvm_sp *sp)
+{
+    xc_interface *xch = ctx->xch;
+    unsigned int order;
+    int done;
+    xen_pfn_t extent;
+
+    order = 0;
+    sp->count = 1UL;
+
+    /* Allocate only if there is room for another page */
+    if ( ctx->restore.tot_pages + sp->count > ctx->restore.max_pages ) {
+        errno = E2BIG;
+        return false;
+    }
+
+    extent = sp->base_pfn = (sp->pfn >> order) << order;
+    done = xc_domain_populate_physmap(xch, ctx->domid, 1, order, 0, &extent);
+    if ( done < 0 ) {
+        PERROR("populate_physmap failed.");
+        return false;
+    }
+    if ( done == 0 ) {
+        errno = ENOMEM;
+        return false;
+    }
+
+    DPRINTF("4K %" PRI_xen_pfn "\n", sp->base_pfn);
+    return true;
+}
+/*
+ * Attempt to allocate a superpage where the pfn resides.
+ */
+static int x86_hvm_allocate_pfn(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+    bool success;
+    unsigned long idx_1g, idx_2m;
+    struct x86_hvm_sp sp = {
+        .pfn = pfn
+    };
+
+    if ( sr_test_bit(pfn, &ctx->x86.hvm.restore.allocated_pfns) )
+        return 0;
+
+    idx_1g = pfn >> SUPERPAGE_1GB_SHIFT;
+    idx_2m = pfn >> SUPERPAGE_2MB_SHIFT;
+
+    sp.index = idx_1g;
+    success = x86_hvm_alloc_1g(ctx, &sp);
+
+    if ( success == false ) {
+        sp.index = idx_2m;
+        success = x86_hvm_alloc_2m(ctx, &sp);
+    }
+
+    if ( success == false ) {
+        sp.index = 0;
+        success = x86_hvm_alloc_4k(ctx, &sp);
+    }
+
+    if ( success == false )
+        return -1;
+
+    do {
+        if ( sp.count >= BITS_PER_LONG ) {
+            sp.count -= BITS_PER_LONG;
+            ctx->restore.tot_pages += BITS_PER_LONG;
+            pfn_set_long_allocated(ctx, sp.base_pfn + sp.count);
+        } else {
+            sp.count--;
+            ctx->restore.tot_pages++;
+            pfn_set_allocated(ctx, sp.base_pfn + sp.count);
+        }
+    } while ( sp.count );
+
+    return 0;
+}
+
+/*
+ * Deallocate memory.
+ * There was likely an optimistic superpage allocation.
+ * This means more pages may have been allocated past gap_end.
+ * This range is not freed now. Incoming higher pfns will release it.
+ */
+static int x86_hvm_punch_hole(struct xc_sr_context *ctx,
+                               xen_pfn_t gap_start, xen_pfn_t gap_end)
+{
+    xc_interface *xch = ctx->xch;
+    xen_pfn_t _pfn, pfn;
+    uint32_t domid, freed = 0;
+    int rc;
+
+    pfn = gap_start >> SUPERPAGE_1GB_SHIFT;
+    do
+    {
+        sr_set_bit(pfn, &ctx->x86.hvm.restore.attempted_1g);
+    } while (++pfn <= gap_end >> SUPERPAGE_1GB_SHIFT);
+
+    pfn = gap_start >> SUPERPAGE_2MB_SHIFT;
+    do
+    {
+        sr_set_bit(pfn, &ctx->x86.hvm.restore.attempted_2m);
+    } while (++pfn <= gap_end >> SUPERPAGE_2MB_SHIFT);
+
+    pfn = gap_start;
+
+    while ( pfn <= gap_end )
+    {
+        if ( sr_test_and_clear_bit(pfn, &ctx->x86.hvm.restore.allocated_pfns) )
+        {
+            domid = ctx->domid;
+            _pfn = pfn;
+            rc = xc_domain_decrease_reservation_exact(xch, domid, 1, 0, &_pfn);
+            if ( rc )
+            {
+                PERROR("Failed to release pfn %" PRI_xen_pfn, pfn);
+                return -1;
+            }
+            ctx->restore.tot_pages--;
+            freed++;
+        }
+        pfn++;
+    }
+    if ( freed )
+        DPRINTF("freed %u between %" PRI_xen_pfn " %" PRI_xen_pfn "\n",
+                freed, gap_start, gap_end);
+    return 0;
+}
+
+static int x86_hvm_unpopulate_page(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+    sr_clear_bit(pfn, &ctx->restore.populated_pfns);
+    return x86_hvm_punch_hole(ctx, pfn, pfn);
+}
+
+static int x86_hvm_populate_page(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+    xen_pfn_t gap_start, gap_end;
+    bool has_gap, first_iteration;
+    int rc;
+
+    /*
+     * Check for a gap between the previous populated pfn and this pfn.
+     * In case a gap exists, it is required to punch a hole to release memory,
+     * starting after the previous pfn and before this pfn.
+     *
+     * But: this can be done only during the first iteration, which is the
+     * only place where superpage allocations are attempted. All following
+     * iterations lack the info to properly maintain prev_populated_pfn.
+     */
+    has_gap = ctx->x86.hvm.restore.prev_populated_pfn + 1 < pfn;
+    first_iteration = ctx->x86.hvm.restore.iteration == 0;
+    if ( has_gap && first_iteration )
+    {
+        gap_start = ctx->x86.hvm.restore.prev_populated_pfn + 1;
+        gap_end = pfn - 1;
+
+        rc = x86_hvm_punch_hole(ctx, gap_start, gap_end);
+        if ( rc )
+            goto err;
+    }
+
+    rc = x86_hvm_allocate_pfn(ctx, pfn);
+    if ( rc )
+        goto err;
+    pfn_set_populated(ctx, pfn);
+    ctx->x86.hvm.restore.prev_populated_pfn = pfn;
+
+    rc = 0;
+err:
+    return rc;
+}
+
+/*
+ * Try to allocate superpages.
+ * This works without memory map because the pfns arrive in incremental order.
+ * All pfn numbers and their type are submitted.
+ * Only pfns with data will also have their page content transmitted.
+ */
+static int x86_hvm_populate_pfns(struct xc_sr_context *ctx, unsigned count,
+                                 const xen_pfn_t *original_pfns,
+                                 const uint32_t *types)
+{
+    xc_interface *xch = ctx->xch;
+    xen_pfn_t pfn, min_pfn, max_pfn;
+    bool to_populate, populated;
+    unsigned i = count;
+    int rc = 0;
+
+    min_pfn = count ? original_pfns[0] : 0;
+    max_pfn = count ? original_pfns[count - 1] : 0;
+    DPRINTF("batch of %u pfns between %" PRI_xen_pfn " %" PRI_xen_pfn "\n",
+            count, min_pfn, max_pfn);
+
+    if ( !x86_hvm_expand_sp_bitmaps(ctx, max_pfn) )
+    {
+        ERROR("Unable to allocate memory for pfn bitmaps");
+        return -1;
+    }
+
+    /*
+     * There is no indicator for a new iteration.
+     * Simulate it by checking if a lower pfn is coming in.
+     * In the end it matters only to know if this iteration is the first one.
+     */
+    if ( min_pfn < ctx->x86.hvm.restore.iteration_tracker_pfn )
+        ctx->x86.hvm.restore.iteration++;
+    ctx->x86.hvm.restore.iteration_tracker_pfn = min_pfn;
+
+    for ( i = 0; i < count; ++i )
+    {
+        pfn = original_pfns[i];
+
+        to_populate = page_type_to_populate(types[i]);
+        populated = pfn_is_populated(ctx, pfn);
+
+        /*
+         * page has data, pfn populated: nothing to do
+         * page has data, pfn not populated: likely never seen before
+         * page has no data, pfn populated: likely ballooned out during migration
+         * page has no data, pfn not populated: nothing to do
+         */
+        if ( to_populate && !populated )
+        {
+            rc = x86_hvm_populate_page(ctx, pfn);
+        } else if ( !to_populate && populated )
+        {
+            rc = x86_hvm_unpopulate_page(ctx, pfn);
+        }
+        if ( rc )
+            break;
+    }
+
+    return rc;
+}
+
+
 struct xc_sr_restore_ops restore_ops_x86_hvm =
 {
     .pfn_is_valid    = x86_hvm_pfn_is_valid,
@@ -266,6 +627,7 @@ struct xc_sr_restore_ops restore_ops_x86_hvm =
     .set_page_type   = x86_hvm_set_page_type,
     .localise_page   = x86_hvm_localise_page,
     .setup           = x86_hvm_setup,
+    .populate_pfns   = x86_hvm_populate_pfns,
     .process_record  = x86_hvm_process_record,
     .static_data_complete = x86_static_data_complete,
     .stream_complete = x86_hvm_stream_complete,
diff --git a/tools/libs/saverestore/restore_x86_pv.c b/tools/libs/saverestore/restore_x86_pv.c
index c73a3cd99f..244f1da218 100644
--- a/tools/libs/saverestore/restore_x86_pv.c
+++ b/tools/libs/saverestore/restore_x86_pv.c
@@ -959,6 +959,64 @@ static void x86_pv_set_gfn(struct xc_sr_context *ctx, xen_pfn_t pfn,
         ((uint32_t *)ctx->x86.pv.p2m)[pfn] = mfn;
 }
 
+/*
+ * Given a set of pfns, obtain memory from Xen to fill the physmap for the
+ * unpopulated subset.  If types is NULL, no page type checking is performed
+ * and all unpopulated pfns are populated.
+ */
+static int x86_pv_populate_pfns(struct xc_sr_context *ctx, unsigned count,
+                                const xen_pfn_t *original_pfns,
+                                const uint32_t *types)
+{
+    xc_interface *xch = ctx->xch;
+    xen_pfn_t *mfns = ctx->restore.m->pp_mfns,
+        *pfns = ctx->restore.m->pp_pfns;
+    unsigned int i, nr_pfns = 0;
+    int rc = -1;
+
+    for ( i = 0; i < count; ++i )
+    {
+        if ( (!types ||
+              (types && page_type_has_stream_data(types[i]) == true)) &&
+             !pfn_is_populated(ctx, original_pfns[i]) )
+        {
+            rc = pfn_set_populated(ctx, original_pfns[i]);
+            if ( rc )
+                goto err;
+            pfns[nr_pfns] = mfns[nr_pfns] = original_pfns[i];
+            ++nr_pfns;
+        }
+    }
+
+    if ( nr_pfns )
+    {
+        rc = xc_domain_populate_physmap_exact(
+            xch, ctx->domid, nr_pfns, 0, 0, mfns);
+        if ( rc )
+        {
+            PERROR("Failed to populate physmap");
+            goto err;
+        }
+
+        for ( i = 0; i < nr_pfns; ++i )
+        {
+            if ( mfns[i] == INVALID_MFN )
+            {
+                ERROR("Populate physmap failed for pfn %u", i);
+                rc = -1;
+                goto err;
+            }
+
+            ctx->restore.ops.set_gfn(ctx, pfns[i], mfns[i]);
+        }
+    }
+
+    rc = 0;
+
+ err:
+    return rc;
+}
+
 /*
  * restore_ops function.  Convert pfns back to mfns in pagetables.  Possibly
  * needs to populate new frames if a PTE is found referring to a frame which
@@ -1003,7 +1061,7 @@ static int x86_pv_localise_page(struct xc_sr_context *ctx,
         }
     }
 
-    if ( to_populate && populate_pfns(ctx, to_populate, pfns, NULL) )
+    if ( to_populate && x86_pv_populate_pfns(ctx, to_populate, pfns, NULL) )
         return -1;
 
     for ( i = 0; i < (PAGE_SIZE / sizeof(uint64_t)); ++i )
@@ -1200,6 +1258,7 @@ struct xc_sr_restore_ops restore_ops_x86_pv =
     .set_gfn         = x86_pv_set_gfn,
     .localise_page   = x86_pv_localise_page,
     .setup           = x86_pv_setup,
+    .populate_pfns   = x86_pv_populate_pfns,
     .process_record  = x86_pv_process_record,
     .static_data_complete = x86_static_data_complete,
     .stream_complete = x86_pv_stream_complete,
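
The HVM path in this patch rounds each pfn down to a superpage boundary
and falls back from 1GB to 2MB to 4K when there is no room below
max_pages. The arithmetic can be sketched on its own as below. This is
only an illustration under stated assumptions, not the patch's code:
pfn_t stands in for xen_pfn_t, the shift values 18 and 9 assume 4K
pages, sp_base(), has_room() and pick_order() are hypothetical helpers,
and the populate_physmap hypercall and the attempted_1g/attempted_2m
bitmaps are left out entirely.

```c
#include <stdbool.h>

/* Assumed values in units of 4K pfns; the patch takes them from Xen headers. */
#define SUPERPAGE_2MB_SHIFT 9
#define SUPERPAGE_1GB_SHIFT 18

typedef unsigned long pfn_t; /* stand-in for xen_pfn_t */

/* Round pfn down to the start of the 2^order extent that contains it. */
static pfn_t sp_base(pfn_t pfn, unsigned int order)
{
    return (pfn >> order) << order;
}

/* Same test as in the patch: would 2^order more pages exceed the limit? */
static bool has_room(unsigned long tot_pages, unsigned long max_pages,
                     unsigned int order)
{
    return tot_pages + (1UL << order) <= max_pages;
}

/*
 * Pick the largest order that still fits. Order 0 is returned as the
 * last resort without a room check; a caller has to check that itself,
 * as x86_hvm_alloc_4k does.
 */
static unsigned int pick_order(unsigned long tot_pages,
                               unsigned long max_pages)
{
    if ( has_room(tot_pages, max_pages, SUPERPAGE_1GB_SHIFT) )
        return SUPERPAGE_1GB_SHIFT;
    if ( has_room(tot_pages, max_pages, SUPERPAGE_2MB_SHIFT) )
        return SUPERPAGE_2MB_SHIFT;
    return 0;
}
```

Because the whole extent is allocated at once, pages past the requested
pfn may be populated too, which is why the patch needs
x86_hvm_punch_hole() to release the unused part later.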


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 08/40] MAINTAINERS: add myself as saverestore maintainer
  2021-07-01  9:56 ` [PATCH v20210701 08/40] MAINTAINERS: add myself as saverestore maintainer Olaf Hering
@ 2021-07-01 10:39   ` Jan Beulich
  2021-07-01 11:01     ` Olaf Hering
  0 siblings, 1 reply; 86+ messages in thread
From: Jan Beulich @ 2021-07-01 10:39 UTC (permalink / raw)
  To: Olaf Hering
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu, xen-devel

On 01.07.2021 11:56, Olaf Hering wrote:
> I touched it last.

For my taste, this is too little as a justification.

> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -381,6 +381,12 @@ R:	Juergen Gross <jgross@suse.com>
>  S:	Supported
>  F:	tools/libs/
>  
> +LIBSAVERESTORE:

Nit: DYM LIBXENSAVERESTORE (and hence for the new entry to be below
LIBXENLIGHT)?

> +M:	Olaf Hering <olaf@aepfle.de>
> +S:	Supported
> +F:	tools/include/xensaverestore.h
> +F:	tools/libs/saverestore/

I'm afraid this goes too far: This way you remove all prior
(direct) maintainers (see "The meaning of nesting" in
./MAINTAINERS). And I'm sure Andrew, who has written much of
this, ought to be considered to become the maintainer of this
code then as well.

Personally I think you may want to take a smaller step first and
insert yourself as reviewer for this library. See e.g. what we
had done a while back for "VM EVENT, MEM ACCESS and MONITOR" when
new maintainers had been proposed. I may not have a sufficiently
good picture of reviews you've done in the past for this part of
the tree, so I'm sorry if I'm missing significant work you've
done there, but surely my recent series fixing code in this area
could have been a good opportunity to actually do a full round of
review, when you have the intention expressed here.

Jan

>  LIBXENLIGHT
>  M:	Ian Jackson <iwj@xenproject.org>
>  M:	Wei Liu <wl@xen.org>
> 



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 08/40] MAINTAINERS: add myself as saverestore maintainer
  2021-07-01 10:39   ` Jan Beulich
@ 2021-07-01 11:01     ` Olaf Hering
  2021-07-01 11:40       ` Julien Grall
  0 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-01 11:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu, xen-devel

[-- Attachment #1: Type: text/plain, Size: 627 bytes --]

On Thu, 1 Jul 2021 12:39:06 +0200,
Jan Beulich <jbeulich@suse.com> wrote:

> I'm afraid this goes too far: This way you remove all prior
> (direct) maintainers (see "The meaning of nesting" in
> ./MAINTAINERS). And I'm sure Andrew, who has written much of
> this, ought to be considered to become the maintainer of this
> code then as well.

I think this was copy&paste from some other entry, which would still include the tools/ maintainers when using get_maintainer.pl. I do not remember which one it was.

Also, I think if Andrew had a desire to be in there, he would have added himself already.


Olaf

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 08/40] MAINTAINERS: add myself as saverestore maintainer
  2021-07-01 11:01     ` Olaf Hering
@ 2021-07-01 11:40       ` Julien Grall
  2021-07-01 12:00         ` Olaf Hering
  0 siblings, 1 reply; 86+ messages in thread
From: Julien Grall @ 2021-07-01 11:40 UTC (permalink / raw)
  To: Olaf Hering, Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, xen-devel

Hi Olaf,

On 01/07/2021 12:01, Olaf Hering wrote:
> On Thu, 1 Jul 2021 12:39:06 +0200,
> Jan Beulich <jbeulich@suse.com> wrote:
> 
>> I'm afraid this goes too far: This way you remove all prior
>> (direct) maintainers (see "The meaning of nesting" in
>> ./MAINTAINERS). And I'm sure Andrew, who has written much of
>> this, ought to be considered to become the maintainer of this
>> code then as well.
> 
> I think this was copy&paste from some other entry, which would still include the tools/ maintainers when using get_maintainer.pl. I do not remember which one it was.

You are mixing up being CCed with actually being a maintainer. You can
be CCed without maintaining a directory.

Jan's point is that the tools/ maintainers would no longer directly
maintain the library. You would be the sole maintainer of the directory,
and Jan was referring to the following paragraph:

1. Under normal circumstances, the Ack of the most specific maintainer
is both necessary and sufficient to get a change to a given file
committed.  So a change to xen/arch/x86/mm/shadow/multi.c requires
the Ack of the xen/arch/x86/mm/shadow maintainer for that part of the
patch, but would not require the Ack of the xen/arch/x86 maintainer or
the xen/arch/x86/mm maintainer.
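
The "most specific maintainer" rule quoted above amounts to a longest
prefix match on the path. A minimal sketch of that idea follows; it is
not how get_maintainer.pl is implemented, the struct and function names
are hypothetical, and the table entries are placeholders rather than the
real MAINTAINERS content.

```c
#include <stddef.h>
#include <string.h>

/* One MAINTAINERS-like section: a path prefix and who must Ack it. */
struct section {
    const char *prefix;
    const char *owner;
};

/* Placeholder entries, for illustration only. */
static const struct section example_sections[] = {
    { "tools/",                  "tools maintainers" },
    { "tools/libs/",             "libs maintainers" },
    { "tools/libs/saverestore/", "saverestore maintainer" },
};

/*
 * Return the section with the longest prefix matching path, or NULL if
 * none matches. Only that section's Ack is required under the rule.
 */
static const struct section *most_specific(const struct section *s,
                                           size_t n, const char *path)
{
    const struct section *best = NULL;
    size_t best_len = 0;

    for ( size_t i = 0; i < n; i++ )
    {
        size_t len = strlen(s[i].prefix);

        if ( strncmp(path, s[i].prefix, len) == 0 &&
             (!best || len > best_len) )
        {
            best = &s[i];
            best_len = len;
        }
    }

    return best;
}
```

With such a table, a file below tools/libs/saverestore/ resolves to the
saverestore entry alone, which is why adding a nested entry takes the
broader sections out of the required Ack path.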

Regarding your proposal to maintain the directory: I don't follow the
tools side much and therefore can't judge the merit of the proposal.

However... this is not new code per se, and therefore the fact that you
touched it last is not sufficient (otherwise I could claim the same
tomorrow if I send a patch to the directory ;)).

For the commit message, I would suggest providing some information
about your contributions (including reviews) to the area. Also, was this
discussed with the tools maintainers?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 08/40] MAINTAINERS: add myself as saverestore maintainer
  2021-07-01 11:40       ` Julien Grall
@ 2021-07-01 12:00         ` Olaf Hering
  2021-07-01 12:09           ` Julien Grall
  0 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-01 12:00 UTC (permalink / raw)
  To: Julien Grall
  Cc: Jan Beulich, Andrew Cooper, George Dunlap, Ian Jackson,
	Stefano Stabellini, Wei Liu, xen-devel

[-- Attachment #1: Type: text/plain, Size: 203 bytes --]

On Thu, 1 Jul 2021 12:40:09 +0100,
Julien Grall <julien@xen.org> wrote:

> You would be the sole maintainer of the directory

Yes that is the point, it changes the count from zero to one.

Olaf

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 08/40] MAINTAINERS: add myself as saverestore maintainer
  2021-07-01 12:00         ` Olaf Hering
@ 2021-07-01 12:09           ` Julien Grall
  0 siblings, 0 replies; 86+ messages in thread
From: Julien Grall @ 2021-07-01 12:09 UTC (permalink / raw)
  To: Olaf Hering
  Cc: Jan Beulich, Andrew Cooper, George Dunlap, Ian Jackson,
	Stefano Stabellini, Wei Liu, xen-devel

Hi Olaf,

On 01/07/2021 13:00, Olaf Hering wrote:
> On Thu, 1 Jul 2021 12:40:09 +0100,
> Julien Grall <julien@xen.org> wrote:
> 
>> You would be the sole maintainer of the directory
> 
> Yes that is the point, it changes the count from zero to one.

The code you are moving is already maintained by the tools maintainers.
So I am guessing you are saying they are unresponsive...

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 03/40] xl: fix description of migrate --debug
  2021-07-01  9:55 ` [PATCH v20210701 03/40] xl: fix description of migrate --debug Olaf Hering
@ 2021-07-01 14:30   ` Anthony PERARD
  2021-07-01 14:33   ` Andrew Cooper
  1 sibling, 0 replies; 86+ messages in thread
From: Anthony PERARD @ 2021-07-01 14:30 UTC (permalink / raw)
  To: Olaf Hering; +Cc: xen-devel, Juergen Gross, Ian Jackson, Wei Liu

On Thu, Jul 01, 2021 at 11:55:58AM +0200, Olaf Hering wrote:
> xl migrate --debug used to track every pfn in every batch of pages.
> But these times are gone. The code in xc_domain_save is the consumer
> of this knob, but it considers it only for the remus and colo case.
> 
> Adjust the help text to tell what --debug does today: Nothing.
> 
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
> Reviewed-by: Juergen Gross <jgross@suse.com>
> 
> v02:
> - the option has no effect anymore

Acked-by: Anthony PERARD <anthony.perard@citrix.com>

Thanks,

-- 
Anthony PERARD


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 03/40] xl: fix description of migrate --debug
  2021-07-01  9:55 ` [PATCH v20210701 03/40] xl: fix description of migrate --debug Olaf Hering
  2021-07-01 14:30   ` Anthony PERARD
@ 2021-07-01 14:33   ` Andrew Cooper
  2021-07-01 14:40     ` Olaf Hering
  1 sibling, 1 reply; 86+ messages in thread
From: Andrew Cooper @ 2021-07-01 14:33 UTC (permalink / raw)
  To: Olaf Hering, xen-devel
  Cc: Juergen Gross, Ian Jackson, Wei Liu, Anthony PERARD

On 01/07/2021 10:55, Olaf Hering wrote:
> xl migrate --debug used to track every pfn in every batch of pages.
> But these times are gone. The code in xc_domain_save is the consumer
> of this knob, but it considers it only for the remus and colo case.
>
> Adjust the help text to tell what --debug does today: Nothing.
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
> Reviewed-by: Juergen Gross <jgross@suse.com>
>
> v02:
> - the option has no effect anymore

Since when?  It was absolutely critical to debugging issues during the
development of migration v2.

~Andrew


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 03/40] xl: fix description of migrate --debug
  2021-07-01 14:33   ` Andrew Cooper
@ 2021-07-01 14:40     ` Olaf Hering
  2021-07-01 14:41       ` Olaf Hering
  0 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-01 14:40 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: xen-devel, Juergen Gross, Ian Jackson, Wei Liu, Anthony PERARD

[-- Attachment #1: Type: text/plain, Size: 310 bytes --]

On Thu, 1 Jul 2021 15:33:47 +0100,
Andrew Cooper <andrew.cooper3@citrix.com> wrote:

> Since when?  It was absolutely critical to debugging issues during the
> development of migration v2.

Well, I can find out if needed.
What could would it enable today?
Last time I looked there was none.

Olaf

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 03/40] xl: fix description of migrate --debug
  2021-07-01 14:40     ` Olaf Hering
@ 2021-07-01 14:41       ` Olaf Hering
  2021-07-01 14:49         ` Andrew Cooper
  0 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-01 14:41 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: xen-devel, Juergen Gross, Ian Jackson, Wei Liu, Anthony PERARD

[-- Attachment #1: Type: text/plain, Size: 111 bytes --]

On Thu, 1 Jul 2021 16:40:55 +0200,
Olaf Hering <olaf@aepfle.de> wrote:

> could

"code", sorry.

Olaf

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 03/40] xl: fix description of migrate --debug
  2021-07-01 14:41       ` Olaf Hering
@ 2021-07-01 14:49         ` Andrew Cooper
  2021-07-01 15:08           ` Olaf Hering
  0 siblings, 1 reply; 86+ messages in thread
From: Andrew Cooper @ 2021-07-01 14:49 UTC (permalink / raw)
  To: Olaf Hering
  Cc: xen-devel, Juergen Gross, Ian Jackson, Wei Liu, Anthony PERARD

On 01/07/2021 15:41, Olaf Hering wrote:
> On Thu, 1 Jul 2021 16:40:55 +0200,
> Olaf Hering <olaf@aepfle.de> wrote:
>
>> could
> "code", sorry.

c/s 7449fb36c6c81d2ba10a40b59e61a9f420cd8450 was buggy and inverted the
condition while making the code transformation.

--debug makes no sense at all in a checkpointed stream.

~Andrew


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 03/40] xl: fix description of migrate --debug
  2021-07-01 14:49         ` Andrew Cooper
@ 2021-07-01 15:08           ` Olaf Hering
  0 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01 15:08 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: xen-devel, Juergen Gross, Ian Jackson, Wei Liu, Anthony PERARD

[-- Attachment #1: Type: text/plain, Size: 447 bytes --]

On Thu, 1 Jul 2021 15:49:08 +0100,
Andrew Cooper <andrew.cooper3@citrix.com> wrote:

> --debug makes no sense at all in a checkpointed stream.

This should probably have become "ctx->save.checkpointed == MIG_STREAM_NONE".

But this still leaves the question of what value this code branch has when verify_frames does not, and most likely cannot, work.
I think fixing it requires exposing details such as which pages are grant pages.
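
The condition suggested above could be sketched roughly as follows.
This is an assumption-laden illustration, not the code in xc_domain_save:
the enum is a stand-in that only borrows the MIG_STREAM_NONE name from
this mail, MIG_STREAM_REMUS and MIG_STREAM_COLO are illustrative names
for the two checkpointed cases, debug_verify_enabled() is a hypothetical
helper, and XCFLAGS_DEBUG uses the value from xensaverestore.h.

```c
#include <stdbool.h>

#define XCFLAGS_DEBUG (1 << 1) /* value as defined in xensaverestore.h */

/* Stand-in for the stream type kept in ctx->save.checkpointed. */
enum stream_type {
    MIG_STREAM_NONE,  /* plain migration, no checkpointing */
    MIG_STREAM_REMUS, /* illustrative name */
    MIG_STREAM_COLO,  /* illustrative name */
};

/*
 * Verification is only meaningful for a plain stream: the debug flag
 * must be set and the stream must not be checkpointed.
 */
static bool debug_verify_enabled(unsigned int flags,
                                 enum stream_type checkpointed)
{
    return (flags & XCFLAGS_DEBUG) && checkpointed == MIG_STREAM_NONE;
}
```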

Olaf

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 33/40] tools: change struct precopy_stats to precopy_stats_t
  2021-07-01  9:56 ` [PATCH v20210701 33/40] tools: change struct precopy_stats to precopy_stats_t Olaf Hering
@ 2021-07-01 16:45   ` Anthony PERARD
  2021-07-01 17:08     ` Olaf Hering
  0 siblings, 1 reply; 86+ messages in thread
From: Anthony PERARD @ 2021-07-01 16:45 UTC (permalink / raw)
  To: Olaf Hering; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

On Thu, Jul 01, 2021 at 11:56:28AM +0200, Olaf Hering wrote:
> This will help libxl_save_msgs_gen.pl to copy the struct as a region of memory.
> 
> No change in behavior intended.
> 
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
> ---
>  tools/include/xensaverestore.h  | 7 +++----
>  tools/libs/saverestore/common.h | 2 +-
>  tools/libs/saverestore/save.c   | 6 +++---
>  3 files changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/include/xensaverestore.h b/tools/include/xensaverestore.h
> index 0410f0469e..dca0134605 100644
> --- a/tools/include/xensaverestore.h
> +++ b/tools/include/xensaverestore.h
> @@ -23,18 +23,17 @@
>  #define XCFLAGS_DEBUG     (1 << 1)
>  
>  /* For save's precopy_policy(). */
> -struct precopy_stats

I don't think changing the existing API is a good idea. It's probably ok
to add a typedef. But shouldn't libxl_save_msgs_gen.pl be able to deal
with things like 'struct precopy_stats'? It seems to be able to deal with
'unsigned long'.
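
A typedef can sit next to the existing struct tag, so that both
spellings name the same type and current users of 'struct precopy_stats'
keep building. A minimal sketch of that option, assuming the field
layout shown in the hunk below; precopy_dirty_known() is a hypothetical
helper added only to show that the two names are interchangeable.

```c
#include <stdbool.h>

/* Existing public definition, left untouched. */
struct precopy_stats
{
    unsigned int iteration;
    unsigned long total_written;
    long dirty_count; /* -1 if unknown */
};

/* Additional alias; it does not replace the struct tag. */
typedef struct precopy_stats precopy_stats_t;

/* Hypothetical helper: accepts the alias, works with the struct tag too. */
static bool precopy_dirty_known(const precopy_stats_t *stats)
{
    return stats->dirty_count >= 0;
}
```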

> -{
> +typedef struct {
>      unsigned int iteration;
>      unsigned long total_written;
>      long dirty_count; /* -1 if unknown */
> -};
> +} precopy_stats_t;
>  

Thanks,

-- 
Anthony PERARD


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 33/40] tools: change struct precopy_stats to precopy_stats_t
  2021-07-01 16:45   ` Anthony PERARD
@ 2021-07-01 17:08     ` Olaf Hering
  0 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-01 17:08 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

[-- Attachment #1: Type: text/plain, Size: 419 bytes --]

On Thu, 1 Jul 2021 17:45:12 +0100,
Anthony PERARD <anthony.perard@citrix.com> wrote:

> But shouldn't libxl_save_msgs_gen.pl be able to deal
> with things like 'struct precopy_stats'? It seems to be able to deal with
> 'unsigned long'.

Yes, this is apparently possible.
I have to check why I thought it was required to turn this into a typedef.
Right now I do not see the reason in the code comments.

Olaf

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 04/40] tools: use integer division in convert-legacy-stream
  2021-07-01  9:55 ` [PATCH v20210701 04/40] tools: use integer division in convert-legacy-stream Olaf Hering
@ 2021-07-02 15:10   ` Andrew Cooper
  0 siblings, 0 replies; 86+ messages in thread
From: Andrew Cooper @ 2021-07-02 15:10 UTC (permalink / raw)
  To: Olaf Hering, xen-devel
  Cc: Marek Marczykowski-Górecki, Ian Jackson, Wei Liu

On 01/07/2021 10:55, Olaf Hering wrote:
> A single slash gives a float, a double slash gives an int.
>
>     bitmap = unpack_exact("Q" * ((max_id/64) + 1))
> TypeError: can't multiply sequence by non-int of type 'float'
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
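
For reference, the failure quoted above can be reproduced outside the
script. This is a minimal sketch, assuming Python 3 semantics for "/".
max_id = 130 is an arbitrary illustrative value, and nr_words,
bitmap_format and bitmap_size are names used only here:

```python
import struct

max_id = 130  # arbitrary example value, not taken from a real stream

# Floor division keeps the word count an int on both Python 2 and Python 3.
nr_words = (max_id // 64) + 1
bitmap_format = "Q" * nr_words

# True division yields a float on Python 3, which cannot repeat a string.
try:
    "Q" * ((max_id / 64) + 1)
    float_repeat_failed = False
except TypeError:
    float_repeat_failed = True

# Each "Q" is one unsigned 64-bit word, so the bitmap takes 8 bytes per word.
bitmap_size = struct.calcsize("=" + bitmap_format)
```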


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 05/40] tools: handle libxl__physmap_info.name properly in convert-legacy-stream
  2021-07-01  9:56 ` [PATCH v20210701 05/40] tools: handle libxl__physmap_info.name properly " Olaf Hering
@ 2021-07-02 15:35   ` Andrew Cooper
  0 siblings, 0 replies; 86+ messages in thread
From: Andrew Cooper @ 2021-07-02 15:35 UTC (permalink / raw)
  To: Olaf Hering, xen-devel
  Cc: Marek Marczykowski-Górecki, Ian Jackson, Wei Liu

On 01/07/2021 10:56, Olaf Hering wrote:
> diff --git a/tools/python/scripts/convert-legacy-stream b/tools/python/scripts/convert-legacy-stream
> index 66ee3d2f5d..9003ac4f6d 100755
> --- a/tools/python/scripts/convert-legacy-stream
> +++ b/tools/python/scripts/convert-legacy-stream
> @@ -336,20 +336,21 @@ def read_libxl_toolstack(vm, data):
>          if len(data) < namelen:
>              raise StreamError("Remaining data too short for physmap name")
>  
> -        name = data[:namelen]
> +        c_string = data[:namelen]
>          data = data[namelen:]
>  
>          # Strip padding off the end of name
>          if twidth == 64:
> -            name = name[:-4]
> +            c_string = c_string[:-4]
>  
> -        if name[-1] != b'\x00':
> +        name, nil = unpack("={0}sB".format(len(c_string) - 1), c_string)

This is rather invasive.  How about simply:

diff --git a/tools/python/scripts/convert-legacy-stream b/tools/python/scripts/convert-legacy-stream
index ca93a93848ec..d4ae94c02f21 100755
--- a/tools/python/scripts/convert-legacy-stream
+++ b/tools/python/scripts/convert-legacy-stream
@@ -342,7 +342,7 @@ def read_libxl_toolstack(vm, data):
         if twidth == 64:
             name = name[:-4]
 
-        if name[-1] != b'\x00':
+        if bytearray(name)[-1] != 0:
             raise StreamError("physmap name not NUL terminated")
 
         root = b"physmap/%x" % (phys, )

which is rather more contained, and looks to work from Py2.6 and later?
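
A minimal sketch of why the bytearray() form behaves the same on both
major Python versions. is_nul_terminated is a hypothetical helper name,
and the sample name is made up for illustration:

```python
def is_nul_terminated(name):
    """Return True if the byte string ends in a NUL byte (hypothetical helper)."""
    # Indexing bytes gives a 1-char str on Python 2 but an int on Python 3,
    # so comparing name[-1] with b'\x00' never matches on Python 3.
    # Indexing a bytearray gives an int on both versions.
    return len(name) > 0 and bytearray(name)[-1] == 0


padded = b"vga.vram\x00"  # illustrative physmap-style name
terminated = is_nul_terminated(padded)
unterminated = is_nul_terminated(b"vga.vram")
```

The length check is an addition for the sketch only; the script has
already rejected short data before this point.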

~Andrew


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 06/40] tools: fix Python3.4 TypeError in format string
  2021-07-01  9:56 ` [PATCH v20210701 06/40] tools: fix Python3.4 TypeError in format string Olaf Hering
@ 2021-07-02 16:19   ` Marek Marczykowski-Górecki
  2021-07-02 16:39     ` Andrew Cooper
  2021-07-05  8:07     ` Olaf Hering
  0 siblings, 2 replies; 86+ messages in thread
From: Marek Marczykowski-Górecki @ 2021-07-02 16:19 UTC (permalink / raw)
  To: Olaf Hering; +Cc: xen-devel, Ian Jackson, Wei Liu

[-- Attachment #1: Type: text/plain, Size: 1574 bytes --]

On Thu, Jul 01, 2021 at 11:56:01AM +0200, Olaf Hering wrote:
> Using the first element of a tuple for a format specifier fails with
> python3.4 as included in SLE12:
>     b = b"string/%x" % (i, )
> TypeError: unsupported operand type(s) for %: 'bytes' and 'tuple'
> 
> It happens to work with python 2.7 and 3.6.
> Use a syntax that is handled by all three variants.
> 
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
> ---
>  tools/python/scripts/convert-legacy-stream | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/python/scripts/convert-legacy-stream b/tools/python/scripts/convert-legacy-stream
> index 9003ac4f6d..235b922ff5 100755
> --- a/tools/python/scripts/convert-legacy-stream
> +++ b/tools/python/scripts/convert-legacy-stream
> @@ -347,9 +347,9 @@ def read_libxl_toolstack(vm, data):
>          if nil != 0:
>              raise StreamError("physmap name not NUL terminated")
>  
> -        root = b"physmap/%x" % (phys, )
> -        kv = [root + b"/start_addr", b"%x" % (start, ),
> -              root + b"/size",       b"%x" % (size, ),
> +        root = bytes(("physmap/%x" % phys).encode('utf-8'))
> +        kv = [root + b"/start_addr", bytes(("%x" % start).encode('utf-8')),
> +              root + b"/size",       bytes(("%x" % size).encode('utf-8')),

Why bytes()? Encode does already return bytes type.

>                root + b"/name",       name]
>  
>          for key, val in zip(kv[0::2], kv[1::2]):

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 06/40] tools: fix Python3.4 TypeError in format string
  2021-07-02 16:19   ` Marek Marczykowski-Górecki
@ 2021-07-02 16:39     ` Andrew Cooper
  2021-07-05  8:18       ` Olaf Hering
  2021-07-05  8:07     ` Olaf Hering
  1 sibling, 1 reply; 86+ messages in thread
From: Andrew Cooper @ 2021-07-02 16:39 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki, Olaf Hering
  Cc: xen-devel, Ian Jackson, Wei Liu

On 02/07/2021 17:19, Marek Marczykowski-Górecki wrote:
> On Thu, Jul 01, 2021 at 11:56:01AM +0200, Olaf Hering wrote:
>> Using the first element of a tuple for a format specifier fails with
>> python3.4 as included in SLE12:
>>     b = b"string/%x" % (i, )
>> TypeError: unsupported operand type(s) for %: 'bytes' and 'tuple'
>>
>> It happens to work with python 2.7 and 3.6.
>> Use a syntax that is handled by all three variants.
>>
>> Signed-off-by: Olaf Hering <olaf@aepfle.de>
>> ---
>>  tools/python/scripts/convert-legacy-stream | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/python/scripts/convert-legacy-stream b/tools/python/scripts/convert-legacy-stream
>> index 9003ac4f6d..235b922ff5 100755
>> --- a/tools/python/scripts/convert-legacy-stream
>> +++ b/tools/python/scripts/convert-legacy-stream
>> @@ -347,9 +347,9 @@ def read_libxl_toolstack(vm, data):
>>          if nil != 0:
>>              raise StreamError("physmap name not NUL terminated")
>>  
>> -        root = b"physmap/%x" % (phys, )
>> -        kv = [root + b"/start_addr", b"%x" % (start, ),
>> -              root + b"/size",       b"%x" % (size, ),
>> +        root = bytes(("physmap/%x" % phys).encode('utf-8'))
>> +        kv = [root + b"/start_addr", bytes(("%x" % start).encode('utf-8')),
>> +              root + b"/size",       bytes(("%x" % size).encode('utf-8')),
> Why bytes()? Encode does already return bytes type.

Yes - I've just tried this out on various versions of Python (including
https://www.onlinegdb.com/online_python_interpreter which is the only
place I can find Python 3.4 easily available).

.encode() does return bytes (Py3) or str (Py2) so doesn't need the
surrounding bytes().

However, the % (phys, ) with the trailing comma is deliberate to work
around a common python error, so wants to remain if you're keeping the
%-formatting.

~Andrew


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 10/40] tools: add xc_is_known_page_type to libxenctrl
  2021-07-01  9:56 ` [PATCH v20210701 10/40] tools: add xc_is_known_page_type " Olaf Hering
@ 2021-07-02 19:20   ` Andrew Cooper
  2021-07-05  8:22     ` Olaf Hering
  0 siblings, 1 reply; 86+ messages in thread
From: Andrew Cooper @ 2021-07-02 19:20 UTC (permalink / raw)
  To: Olaf Hering, xen-devel; +Cc: Ian Jackson, Wei Liu, Juergen Gross

On 01/07/2021 10:56, Olaf Hering wrote:
> Users of xc_get_pfn_type_batch may want to sanity check the data
> returned by Xen. Add a simple helper for this purpose.
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>

Subject needs correcting after v2.

However, given that this is in the save/restore common header, does it
really need a prefix?  Simply is_known_page_type() seems good enough.

>
> v02:
> - rename xc_is_known_page_type to sr_is_known_page_type
> - move from ctrl/xc_private.h to saverestore/common.h (jgross)
> ---
>  tools/libs/saverestore/common.h | 33 +++++++++++++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
>
> diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
> index ca2eb47a4f..07c506360c 100644
> --- a/tools/libs/saverestore/common.h
> +++ b/tools/libs/saverestore/common.h
> @@ -467,6 +467,39 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
>  /* Handle a STATIC_DATA_END record. */
>  int handle_static_data_end(struct xc_sr_context *ctx);
>  
> +/* Sanity check for types returned by Xen */
> +static inline bool sr_is_known_page_type(xen_pfn_t type)

uint32_t

> +{
> +    bool ret;

The logic will be rather shorter and cleaner to read by dropping ret and
using return directly out of the switch.

> +
> +    switch (type)

Spaces.

I can fix up everything on commit if you're happy with the suggestions.

~Andrew


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 11/40] tools: use sr_is_known_page_type
  2021-07-01  9:56 ` [PATCH v20210701 11/40] tools: use sr_is_known_page_type Olaf Hering
@ 2021-07-02 19:27   ` Andrew Cooper
  2021-07-05  8:25     ` Olaf Hering
  0 siblings, 1 reply; 86+ messages in thread
From: Andrew Cooper @ 2021-07-02 19:27 UTC (permalink / raw)
  To: Olaf Hering, xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu

On 01/07/2021 10:56, Olaf Hering wrote:
> Verify pfn type on sending side, also verify incoming batch of pfns.
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
> Reviewed-by: Juergen Gross <jgross@suse.com>

Any reason this isn't folded into the previous patch, like your
subsequent two page type helper patches are?

> diff --git a/tools/libs/saverestore/save.c b/tools/libs/saverestore/save.c
> index ae3e8797d0..6f820ea432 100644
> --- a/tools/libs/saverestore/save.c
> +++ b/tools/libs/saverestore/save.c
> @@ -147,6 +147,12 @@ static int write_batch(struct xc_sr_context *ctx)
>  
>      for ( i = 0; i < nr_pfns; ++i )
>      {
> +        if ( sr_is_known_page_type(types[i]) == false )
> +        {
> +            ERROR("Wrong type %#"PRIpfn" for pfn %#"PRIpfn, types[i], mfns[i]);

"Unknown type" would be more accurate.

~Andrew



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 12/40] tools: unify type checking for data pfns in migration stream
  2021-07-01  9:56 ` [PATCH v20210701 12/40] tools: unify type checking for data pfns in migration stream Olaf Hering
@ 2021-07-02 19:43   ` Andrew Cooper
  2021-07-05  8:59     ` Olaf Hering
  2021-07-05 13:10   ` Andrew Cooper
  1 sibling, 1 reply; 86+ messages in thread
From: Andrew Cooper @ 2021-07-02 19:43 UTC (permalink / raw)
  To: Olaf Hering, xen-devel; +Cc: Ian Jackson, Wei Liu, Juergen Gross

On 01/07/2021 10:56, Olaf Hering wrote:
> Introduce a helper which decides if a given pfn in the migration
> stream is backed by memory.
>
> This specifically deals with type XEN_DOMCTL_PFINFO_XALLOC, which was
> a synthetic toolstack-only type used in Xen 4.2 to 4.5. It indicated a
> dirty page on the sending side for which no data will be sent in the
> initial iteration.
>
> No change in behavior intended.
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
> ---
>  tools/libs/saverestore/common.h  | 17 +++++++++++++++++
>  tools/libs/saverestore/restore.c |  5 ++---
>  2 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
> index 07c506360c..fa242e808d 100644
> --- a/tools/libs/saverestore/common.h
> +++ b/tools/libs/saverestore/common.h
> @@ -500,6 +500,23 @@ static inline bool sr_is_known_page_type(xen_pfn_t type)
>      return ret;
>  }
>  
> +static inline bool page_type_to_populate(uint32_t type)
> +{
> +    bool ret;
> +
> +    switch (type)
> +    {

Same style comments as before.

> +    case XEN_DOMCTL_PFINFO_XTAB:
> +    case XEN_DOMCTL_PFINFO_BROKEN:
> +        ret = false;
> +        break;
> +    case XEN_DOMCTL_PFINFO_XALLOC:
> +    default:
> +        ret = true;
> +        break;

I know you're replacing the logic as-was, but in hindsight, I'm not sure
it was great to begin with.  It defaults the unallocated types to being
considered populated, which isn't a clever idea.

Anyone adding a new page type is going to have to audit/edit each of
these helpers.  I think it would be better to write all the true cases
explicitly.

> +    }
> +    return ret;
> +}
>  #endif
>  /*
>   * Local variables:
> diff --git a/tools/libs/saverestore/restore.c b/tools/libs/saverestore/restore.c
> index 324b9050e2..477b7527a1 100644
> --- a/tools/libs/saverestore/restore.c
> +++ b/tools/libs/saverestore/restore.c
> @@ -152,9 +152,8 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
>  
>      for ( i = 0; i < count; ++i )
>      {
> -        if ( (!types || (types &&
> -                         (types[i] != XEN_DOMCTL_PFINFO_XTAB &&
> -                          types[i] != XEN_DOMCTL_PFINFO_BROKEN))) &&
> +        if ( (!types ||
> +              (types && page_type_to_populate(types[i]) == true)) &&

I'm surprised not to have seen a compiler or static analysis complaint
about this.

!A || (A && B) is redundant, and simplifies to !A || B.
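
The equivalence can be checked exhaustively. This is a small sketch in
Python rather than C, with original and simplified as hypothetical names
for the two shapes of the condition:

```python
from itertools import product


def original(a, b):
    # Mirrors the shape !A || (A && B).
    return (not a) or (a and b)


def simplified(a, b):
    # Mirrors the shape !A || B.
    return (not a) or b


# Collect every input combination on which the two forms disagree.
mismatches = [(a, b) for a, b in product([False, True], repeat=2)
              if original(a, b) != simplified(a, b)]
```

An empty mismatches list means the two forms agree on all four inputs.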

Clearly need to blame whichever numpty wrote this code to begin with.

~Andrew



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 13/40] tools: unify type checking for data pfns in migration stream
  2021-07-01  9:56 ` [PATCH v20210701 13/40] " Olaf Hering
@ 2021-07-02 19:49   ` Andrew Cooper
  0 siblings, 0 replies; 86+ messages in thread
From: Andrew Cooper @ 2021-07-02 19:49 UTC (permalink / raw)
  To: Olaf Hering, xen-devel; +Cc: Ian Jackson, Wei Liu, Juergen Gross

On 01/07/2021 10:56, Olaf Hering wrote:
> diff --git a/tools/libs/saverestore/common.h b/tools/libs/saverestore/common.h
> index fa242e808d..905b4078f6 100644
> --- a/tools/libs/saverestore/common.h
> +++ b/tools/libs/saverestore/common.h
> @@ -517,6 +517,24 @@ static inline bool page_type_to_populate(uint32_t type)
>      }
>      return ret;
>  }
> +
> +static inline bool page_type_has_stream_data(uint32_t type)
> +{
> +    bool ret;
> +
> +    switch (type)
> +    {
> +    case XEN_DOMCTL_PFINFO_BROKEN:
> +    case XEN_DOMCTL_PFINFO_XALLOC:
> +    case XEN_DOMCTL_PFINFO_XTAB:
> +        ret = false;
> +        break;
> +    default:
> +        ret = true;
> +        break;

As with page_type_to_populate(), we shouldn't really default the
unallocated types to having stream data.

Subject to this and the other style concerns,
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

I'm happy to fix up all the issues for the page type helpers on commit,
if you're happy.

~Andrew



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 06/40] tools: fix Python3.4 TypeError in format string
  2021-07-02 16:19   ` Marek Marczykowski-Górecki
  2021-07-02 16:39     ` Andrew Cooper
@ 2021-07-05  8:07     ` Olaf Hering
  2021-07-05 10:10       ` Andrew Cooper
  1 sibling, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-05  8:07 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: xen-devel, Ian Jackson, Wei Liu

[-- Attachment #1: Type: text/plain, Size: 406 bytes --]

Am Fri, 2 Jul 2021 18:19:39 +0200
schrieb Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>:

> Why bytes()? Encode does already return bytes type.

You are right, this works as well:
  i = 123
  b = ("str/%x" % (i, )).encode('utf-8')

Any preference regarding the "encoding"? I picked UTF8, but 'ascii' might be more correct in this context. In practice it may not matter.


Olaf

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 06/40] tools: fix Python3.4 TypeError in format string
  2021-07-02 16:39     ` Andrew Cooper
@ 2021-07-05  8:18       ` Olaf Hering
  2021-07-05  9:47         ` Andrew Cooper
  0 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-05  8:18 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Marek Marczykowski-Górecki, xen-devel, Ian Jackson, Wei Liu

[-- Attachment #1: Type: text/plain, Size: 284 bytes --]

Am Fri, 2 Jul 2021 17:39:54 +0100
schrieb Andrew Cooper <andrew.cooper3@citrix.com>:

> However, the % (phys, ) with the trailing comma is deliberate to work
> around a common python error, so wants to remain if you're keeping the
> %-formatting.

What error is that?

Olaf

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 10/40] tools: add xc_is_known_page_type to libxenctrl
  2021-07-02 19:20   ` Andrew Cooper
@ 2021-07-05  8:22     ` Olaf Hering
  2021-07-05  9:51       ` Andrew Cooper
  0 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-05  8:22 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

[-- Attachment #1: Type: text/plain, Size: 703 bytes --]

Am Fri, 2 Jul 2021 20:20:08 +0100
schrieb Andrew Cooper <andrew.cooper3@citrix.com>:

> Subject needs correcting after v2.

Apparently I missed some places while removing the old "xc_" prefix.

> However, given that this is in the save/restore common header, does it
> really need a prefix?  Simply is_known_page_type() seems good enough.

Sure, the possibility of clashes is probably low.


> > +/* Sanity check for types returned by Xen */
> > +static inline bool sr_is_known_page_type(xen_pfn_t type)  
> uint32_t

Why is this better than returning 'bool'?

> I can fix up everything on commit if you're happy with the suggestions.

Yes, I'm certainly fine with it.


Olaf

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 11/40] tools: use sr_is_known_page_type
  2021-07-02 19:27   ` Andrew Cooper
@ 2021-07-05  8:25     ` Olaf Hering
  2021-07-05  9:53       ` Andrew Cooper
  0 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-05  8:25 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Juergen Gross, Ian Jackson, Wei Liu

[-- Attachment #1: Type: text/plain, Size: 484 bytes --]

Am Fri, 2 Jul 2021 20:27:21 +0100
schrieb Andrew Cooper <andrew.cooper3@citrix.com>:

> Any reason this isn't folded into the previous patch, like your
> subsequent two page type helper patches are?

I think I wanted to separate this for simpler review, but I forgot to split the followup change as well.

> > +            ERROR("Wrong type %#"PRIpfn" for pfn %#"PRIpfn, types[i], mfns[i]);  
> "Unknown type" would be more accurate.

Yes, this is better. Thanks.

Olaf

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 12/40] tools: unify type checking for data pfns in migration stream
  2021-07-02 19:43   ` Andrew Cooper
@ 2021-07-05  8:59     ` Olaf Hering
  2021-07-05  9:53       ` Andrew Cooper
  0 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-05  8:59 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

[-- Attachment #1: Type: text/plain, Size: 401 bytes --]

Am Fri, 2 Jul 2021 20:43:13 +0100
schrieb Andrew Cooper <andrew.cooper3@citrix.com>:

> Anyone adding a new page type is going to have to audit/edit each of
> these helpers.  I think it would be better to write all the true cases
> explicitly.

You mean the check if a page has data or needs to be populated should look like sr_is_known_page_type, where each known variant is listed?

Olaf

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 06/40] tools: fix Python3.4 TypeError in format string
  2021-07-05  8:18       ` Olaf Hering
@ 2021-07-05  9:47         ` Andrew Cooper
  0 siblings, 0 replies; 86+ messages in thread
From: Andrew Cooper @ 2021-07-05  9:47 UTC (permalink / raw)
  To: Olaf Hering
  Cc: Marek Marczykowski-Górecki, xen-devel, Ian Jackson, Wei Liu

On 05/07/2021 09:18, Olaf Hering wrote:
> Am Fri, 2 Jul 2021 17:39:54 +0100
> schrieb Andrew Cooper <andrew.cooper3@citrix.com>:
>
>> However, the % (phys, ) with the trailing comma is deliberate to work
>> around a common python error, so wants to remain if you're keeping the
>> %-formatting.
> What error is that?

>>> def p1(arg):
...     print("%s" % arg)
>>> def p2(arg):
...     print("%s" % (arg, ))

>>> p1("foo")
foo
>>> p2("foo")
foo

>>> p1(("foo", "bar"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in p1
TypeError: not all arguments converted during string formatting

>>> p2(("foo", "bar"))
('foo', 'bar')


The % operator has some type ambiguity with how it works.  (foo, )
forces arg to be a 1-tuple as far as formatting is concerned.

~Andrew



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 10/40] tools: add xc_is_known_page_type to libxenctrl
  2021-07-05  8:22     ` Olaf Hering
@ 2021-07-05  9:51       ` Andrew Cooper
  2021-07-05 14:24         ` Olaf Hering
  0 siblings, 1 reply; 86+ messages in thread
From: Andrew Cooper @ 2021-07-05  9:51 UTC (permalink / raw)
  To: Olaf Hering; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

On 05/07/2021 09:22, Olaf Hering wrote:
> Am Fri, 2 Jul 2021 20:20:08 +0100
> schrieb Andrew Cooper <andrew.cooper3@citrix.com>:
>>> +/* Sanity check for types returned by Xen */
>>> +static inline bool sr_is_known_page_type(xen_pfn_t type)  
>> uint32_t
> Why is this better than returning 'bool'?

For the parameter, sorry, not the return type.

All type fields are uniformly uint32_t elsewhere.

>
>> I can fix up everything on commit if you're happy with the suggestions.
> Yes, I'm certainly fine with it.

Ok - I'll put together a branch.

~Andrew


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 11/40] tools: use sr_is_known_page_type
  2021-07-05  8:25     ` Olaf Hering
@ 2021-07-05  9:53       ` Andrew Cooper
  0 siblings, 0 replies; 86+ messages in thread
From: Andrew Cooper @ 2021-07-05  9:53 UTC (permalink / raw)
  To: Olaf Hering; +Cc: xen-devel, Juergen Gross, Ian Jackson, Wei Liu

On 05/07/2021 09:25, Olaf Hering wrote:
> Am Fri, 2 Jul 2021 20:27:21 +0100
> schrieb Andrew Cooper <andrew.cooper3@citrix.com>:
>
>> Any reason this isn't folded into the previous patch, like your
>> subsequent two page type helper patches are?
> I think I wanted to separate this for simpler review, but I forgot to split the followup change as well.

All patches are largely mechanical changes.  It's easier to review
together, rather than split, because you can only judge the correctness
of the new helper in terms of the code it is replacing.

~Andrew


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 12/40] tools: unify type checking for data pfns in migration stream
  2021-07-05  8:59     ` Olaf Hering
@ 2021-07-05  9:53       ` Andrew Cooper
  0 siblings, 0 replies; 86+ messages in thread
From: Andrew Cooper @ 2021-07-05  9:53 UTC (permalink / raw)
  To: Olaf Hering; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

On 05/07/2021 09:59, Olaf Hering wrote:
> Am Fri, 2 Jul 2021 20:43:13 +0100
> schrieb Andrew Cooper <andrew.cooper3@citrix.com>:
>
>> Anyone adding a new page type is going to have to audit/edit each of
>> these helpers.  I think it would be better to write all the true cases
>> explicitly.
> You mean the check if a page has data or needs to be populated should look like sr_is_known_page_type, where each known variant is listed?

Yes.  I think that is a safer approach overall.
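
The allow-list shape discussed above can be sketched as follows. This is
Python for brevity rather than the C of the real helper. The PFINFO_*
names are placeholders with made-up values, and the set of populated
types is deliberately incomplete:

```python
# Placeholder values for illustration only; the real XEN_DOMCTL_PFINFO_*
# constants live in Xen's public headers and are not reproduced here.
PFINFO_NOTAB = 0x0
PFINFO_BROKEN = 0xD
PFINFO_XALLOC = 0xE
PFINFO_XTAB = 0xF

# Every type considered populated is listed explicitly (incomplete on purpose).
TYPES_TO_POPULATE = frozenset([PFINFO_NOTAB, PFINFO_XALLOC])


def page_type_to_populate(page_type):
    # Anything not listed, including types added later, defaults to False.
    return page_type in TYPES_TO_POPULATE
```

With this shape a newly introduced type is rejected until someone adds
it to the list, instead of silently being treated as populated.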

~Andrew


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 06/40] tools: fix Python3.4 TypeError in format string
  2021-07-05  8:07     ` Olaf Hering
@ 2021-07-05 10:10       ` Andrew Cooper
  0 siblings, 0 replies; 86+ messages in thread
From: Andrew Cooper @ 2021-07-05 10:10 UTC (permalink / raw)
  To: Olaf Hering, Marek Marczykowski-Górecki
  Cc: xen-devel, Ian Jackson, Wei Liu

On 05/07/2021 09:07, Olaf Hering wrote:
> Am Fri, 2 Jul 2021 18:19:39 +0200
> schrieb Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>:
>
>> Why bytes()? Encode does already return bytes type.
> You are right, this works as well:
>   i = 123
>   b = ("str/%x" % (i, )).encode('utf-8')
>
> Any preference regarding the "encoding"? I picked UTF8, but 'ascii' might be more correct in this context. In practice it may not matter.

I suspect you're right and it doesn't matter, but ascii feels like a
safer option.
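
A minimal sketch of the resulting form, with hex_bytes as a hypothetical
helper name and phys set to an arbitrary illustrative value:

```python
def hex_bytes(value):
    """Format an integer as lowercase hex, returned as bytes (hypothetical helper)."""
    # Format as text first, then encode; the 1-tuple fixes the argument count.
    return ("%x" % (value, )).encode("ascii")


phys = 0xF0000000  # illustrative address, not from a real stream
root = b"physmap/" + hex_bytes(phys)

# Hex digits are plain ASCII, so both codecs give identical bytes here.
same_encoding = ("%x" % (phys, )).encode("utf-8") == hex_bytes(phys)
```

The practical difference is only in failure mode: 'ascii' would raise
if a non-ASCII character ever ended up in the formatted string.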

~Andrew



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 15/40] tools: prepare to allocate saverestore arrays once
  2021-07-01  9:56 ` [PATCH v20210701 15/40] tools: prepare to allocate saverestore arrays once Olaf Hering
@ 2021-07-05 10:44   ` Andrew Cooper
  2021-07-05 11:27     ` Olaf Hering
  0 siblings, 1 reply; 86+ messages in thread
From: Andrew Cooper @ 2021-07-05 10:44 UTC (permalink / raw)
  To: Olaf Hering, xen-devel; +Cc: Ian Jackson, Wei Liu, Juergen Gross

On 01/07/2021 10:56, Olaf Hering wrote:
> The hotpath 'send_dirty_pages' is supposed to do just one thing: sending.
> The other end 'handle_page_data' is supposed to do just receiving.
>
> But instead both do other costly work like memory allocations and data moving.
> Do the allocations once, the array sizes are a compiletime constant.
> Avoid unneeded copying of data by receiving data directly into mapped guest memory.

There is a reason why the logic was written that way, which was good at
the time.  It was so valgrind could spot bugs with the memory handling
in these functions (and it did indeed find many bugs during development).

These days, ASAN is perhaps the preferred tool, but both depend on
dynamic allocations to figure out the actual size of various objects.


I agree that the repeated alloc/free of same-sized memory regions on
each iteration is a waste.  However, if we are going to fix this by
using one-off allocations, then we want to compensate with logic such as
the call to VALGRIND_MAKE_MEM_UNDEFINED() in flush_batch(), and I think
we still need individual allocations to let the tools work properly.

>
> This patch is just preparation, subsequent changes will populate the arrays.
>
> Once all changes are applied, migration of a busy HVM domU changes like that:
>
> Without this series, from sr650 to sr950 (xen-4.15.20201027T173911.16a20963b3 xen_testing):
> 2020-10-29 10:23:10.711+0000: xc: show_transfer_rate: 23663128 bytes + 2879563 pages in 55.324905335 sec, 203 MiB/sec: Internal error
> 2020-10-29 10:23:35.115+0000: xc: show_transfer_rate: 16829632 bytes + 2097552 pages in 24.401179720 sec, 335 MiB/sec: Internal error
> 2020-10-29 10:23:59.436+0000: xc: show_transfer_rate: 16829032 bytes + 2097478 pages in 24.319025928 sec, 336 MiB/sec: Internal error
> 2020-10-29 10:24:23.844+0000: xc: show_transfer_rate: 16829024 bytes + 2097477 pages in 24.406992500 sec, 335 MiB/sec: Internal error
> 2020-10-29 10:24:48.292+0000: xc: show_transfer_rate: 16828912 bytes + 2097463 pages in 24.446489027 sec, 335 MiB/sec: Internal error
> 2020-10-29 10:25:01.816+0000: xc: show_transfer_rate: 16836080 bytes + 2098356 pages in 13.447091818 sec, 609 MiB/sec: Internal error
>
> With this series, from sr650 to sr950 (xen-4.15.20201027T173911.16a20963b3 xen_unstable):
> 2020-10-28 21:26:05.074+0000: xc: show_transfer_rate: 23663128 bytes + 2879563 pages in 52.564054368 sec, 213 MiB/sec: Internal error
> 2020-10-28 21:26:23.527+0000: xc: show_transfer_rate: 16830040 bytes + 2097603 pages in 18.450592015 sec, 444 MiB/sec: Internal error
> 2020-10-28 21:26:41.926+0000: xc: show_transfer_rate: 16830944 bytes + 2097717 pages in 18.397862306 sec, 445 MiB/sec: Internal error
> 2020-10-28 21:27:00.339+0000: xc: show_transfer_rate: 16829176 bytes + 2097498 pages in 18.411973339 sec, 445 MiB/sec: Internal error
> 2020-10-28 21:27:18.643+0000: xc: show_transfer_rate: 16828592 bytes + 2097425 pages in 18.303326695 sec, 447 MiB/sec: Internal error
> 2020-10-28 21:27:26.289+0000: xc: show_transfer_rate: 16835952 bytes + 2098342 pages in 7.579846749 sec, 1081 MiB/sec: Internal error

These are good numbers, and clearly show that there is some value here,
but shouldn't they be in the series header?  They're not terribly
relevant to this patch specifically.

Also, while I can believe that the first sample is slower than the later
ones (in particular, during the first round, we've got to deal with the
non-RAM regions too and therefore spend more time making hypercalls),
I'm not sure I believe the final sample.  Given the byte/page count, the
substantially smaller elapsed time looks suspicious.
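
As a rough cross-check of the quoted log lines, the rate can be
recomputed from the byte and page counts. This is a minimal sketch,
assuming 4 KiB pages and that the logged bytes and pages simply add up.
mib_per_sec is a hypothetical helper and not the actual
show_transfer_rate() computation:

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


def mib_per_sec(extra_bytes, pages, seconds):
    """Approximate throughput of one logged iteration (hypothetical helper)."""
    total_bytes = extra_bytes + pages * PAGE_SIZE
    return total_bytes / (1024.0 * 1024.0) / seconds


# Figures copied from the second and the final "with this series" samples.
steady = mib_per_sec(16830040, 2097603, 18.450592015)
final = mib_per_sec(16835952, 2098342, 7.579846749)
```

Under these assumptions both values land within a few MiB/sec of the
logged 444 and 1081, so the reported rate is consistent with the logged
elapsed time. Whether that elapsed time is plausible is a separate
question.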

> Note: the performance improvement depends on the used network cards,
> wirespeed and the host:
> - No improvement is expected with a 1G link.
> - Improvement can be seen as shown above on a 10G link.
> - Just a slight improvement can be seen on a 100G link.

Are these observations with an otherwise idle dom0?

Even if CPU time in dom0 wasn't the bottleneck with a 1G link, the
reduction in CPU time you observe at higher link speeds will still be
making a difference at 1G, and will probably be visible if you perform
multiple concurrent migrations.

~Andrew



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH v20210701 15/40] tools: prepare to allocate saverestore arrays once
  2021-07-05 10:44   ` Andrew Cooper
@ 2021-07-05 11:27     ` Olaf Hering
  2021-07-05 13:01       ` Andrew Cooper
  0 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-05 11:27 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

[-- Attachment #1: Type: text/plain, Size: 4622 bytes --]

Am Mon, 5 Jul 2021 11:44:30 +0100
schrieb Andrew Cooper <andrew.cooper3@citrix.com>:

> On 01/07/2021 10:56, Olaf Hering wrote:
> I agree that the repeated alloc/free of same-sized memory regions on
> each iteration is a waste.  However, if we are going to fix this by
> using one-off allocations, then we want to compensate with logic such as
> the call to VALGRIND_MAKE_MEM_UNDEFINED() in flush_batch(), and I think
> we still need individual allocations to let the tools work properly.

If this is a concern, let's just do a few individual arrays.

> > This patch is just preparation, subsequent changes will populate the arrays.
> >
> > Once all changes are applied, migration of a busy HVM domU changes like that:
> >
> > Without this series, from sr650 to sr950 (xen-4.15.20201027T173911.16a20963b3 xen_testing):
> > 2020-10-29 10:23:10.711+0000: xc: show_transfer_rate: 23663128 bytes + 2879563 pages in 55.324905335 sec, 203 MiB/sec: Internal error
> > 2020-10-29 10:23:35.115+0000: xc: show_transfer_rate: 16829632 bytes + 2097552 pages in 24.401179720 sec, 335 MiB/sec: Internal error
> > 2020-10-29 10:23:59.436+0000: xc: show_transfer_rate: 16829032 bytes + 2097478 pages in 24.319025928 sec, 336 MiB/sec: Internal error
> > 2020-10-29 10:24:23.844+0000: xc: show_transfer_rate: 16829024 bytes + 2097477 pages in 24.406992500 sec, 335 MiB/sec: Internal error
> > 2020-10-29 10:24:48.292+0000: xc: show_transfer_rate: 16828912 bytes + 2097463 pages in 24.446489027 sec, 335 MiB/sec: Internal error
> > 2020-10-29 10:25:01.816+0000: xc: show_transfer_rate: 16836080 bytes + 2098356 pages in 13.447091818 sec, 609 MiB/sec: Internal error
> >
> > With this series, from sr650 to sr950 (xen-4.15.20201027T173911.16a20963b3 xen_unstable):
> > 2020-10-28 21:26:05.074+0000: xc: show_transfer_rate: 23663128 bytes + 2879563 pages in 52.564054368 sec, 213 MiB/sec: Internal error
> > 2020-10-28 21:26:23.527+0000: xc: show_transfer_rate: 16830040 bytes + 2097603 pages in 18.450592015 sec, 444 MiB/sec: Internal error
> > 2020-10-28 21:26:41.926+0000: xc: show_transfer_rate: 16830944 bytes + 2097717 pages in 18.397862306 sec, 445 MiB/sec: Internal error
> > 2020-10-28 21:27:00.339+0000: xc: show_transfer_rate: 16829176 bytes + 2097498 pages in 18.411973339 sec, 445 MiB/sec: Internal error
> > 2020-10-28 21:27:18.643+0000: xc: show_transfer_rate: 16828592 bytes + 2097425 pages in 18.303326695 sec, 447 MiB/sec: Internal error
> > 2020-10-28 21:27:26.289+0000: xc: show_transfer_rate: 16835952 bytes + 2098342 pages in 7.579846749 sec, 1081 MiB/sec: Internal error  
> 
> These are good numbers, and clearly show that there is some value here,
> but shouldn't they be in the series header?  They're not terribly
> relevant to this patch specifically.

The cover letter is unfortunately not under version control.
Perhaps git notes could help with that, but I have never used them.

> Also, while I can believe that the first sample is slower than the later
> ones (in particular, during the first round, we've got to deal with the
> non-RAM regions too and therefore spend more time making hypercalls),
> I'm not sure I believe the final sample.  Given the byte/page count, the
> substantially smaller elapsed time looks suspicious.

The first one is slower because it has to wait for the receiver to allocate pages.
But maybe as you said there are other aspects as well.
The last one is always way faster because apparently map/unmap is less costly with a stopped guest.
Right now the code may reach up to 15Gbit/s. The next step is to map the domU just once to reach wirespeed.
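
As a cross-check of the quoted show_transfer_rate lines, the MiB/sec figures appear to follow from the page count alone, assuming 4 KiB pages and truncation of the fractional part. The sketch below is an assumption derived from the logged numbers, not the actual libxc code, and the `sketch_*` names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed guest page size in bytes */

/* Page payload throughput in MiB/s, before any truncation. */
double sketch_mib_per_sec(uint64_t pages, double seconds)
{
    double mib = (double)pages * SKETCH_PAGE_SIZE / (1024.0 * 1024.0);

    return mib / seconds;
}

/* The same payload expressed in Gbit/s (10^9 bits per second). */
double sketch_gbit_per_sec(uint64_t pages, double seconds)
{
    return (double)pages * SKETCH_PAGE_SIZE * 8.0 / seconds / 1e9;
}
```

With the values from the logs, 2879563 pages in 55.324905335 sec gives about 203.3 MiB/s (logged as 203), and 2098342 pages in 7.579846749 sec gives about 1081.4 MiB/s (logged as 1081), which corresponds to roughly 9 Gbit/s for the final iteration.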

> Are these observations with an otherwise idle dom0?

Yes. Idle dom0 and a domU busy with touching its memory.

Unfortunately, I'm not able to confirm the reported gain with the systems I have today.
I'm waiting for different hardware to be prepared; right now I have only a pair of CoyotePass and WilsonCity systems.

I'm sure there were NUMA effects involved. Last year's libvirt was unable to properly pin vcpus. If I pin all the involved memory to node#0 there is some jitter in the logged numbers, but no obvious improvement. The first iteration is slightly faster, but that is it.

Meanwhile I think this commit message needs to be redone.

> Even if CPU time in dom0 wasn't the bottleneck with a 1G link, the
> reduction in CPU time you observe at higher link speeds will still be
> making a difference at 1G, and will probably be visible if you perform
> multiple concurrent migrations.

Yes, I will see what numbers I get with two or more migrations running in parallel.

Olaf


* Re: [PATCH v20210701 15/40] tools: prepare to allocate saverestore arrays once
  2021-07-05 11:27     ` Olaf Hering
@ 2021-07-05 13:01       ` Andrew Cooper
  2021-07-05 14:11         ` Olaf Hering
  2021-07-13 17:50         ` Olaf Hering
  0 siblings, 2 replies; 86+ messages in thread
From: Andrew Cooper @ 2021-07-05 13:01 UTC (permalink / raw)
  To: Olaf Hering; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

On 05/07/2021 12:27, Olaf Hering wrote:
> On Mon, 5 Jul 2021 11:44:30 +0100,
> Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
>>> This patch is just preparation, subsequent changes will populate the arrays.
>>>
>>> Once all changes are applied, migration of a busy HVM domU changes as follows:
>>>
>>> Without this series, from sr650 to sr950 (xen-4.15.20201027T173911.16a20963b3 xen_testing):
>>> 2020-10-29 10:23:10.711+0000: xc: show_transfer_rate: 23663128 bytes + 2879563 pages in 55.324905335 sec, 203 MiB/sec: Internal error
>>> 2020-10-29 10:23:35.115+0000: xc: show_transfer_rate: 16829632 bytes + 2097552 pages in 24.401179720 sec, 335 MiB/sec: Internal error
>>> 2020-10-29 10:23:59.436+0000: xc: show_transfer_rate: 16829032 bytes + 2097478 pages in 24.319025928 sec, 336 MiB/sec: Internal error
>>> 2020-10-29 10:24:23.844+0000: xc: show_transfer_rate: 16829024 bytes + 2097477 pages in 24.406992500 sec, 335 MiB/sec: Internal error
>>> 2020-10-29 10:24:48.292+0000: xc: show_transfer_rate: 16828912 bytes + 2097463 pages in 24.446489027 sec, 335 MiB/sec: Internal error
>>> 2020-10-29 10:25:01.816+0000: xc: show_transfer_rate: 16836080 bytes + 2098356 pages in 13.447091818 sec, 609 MiB/sec: Internal error
>>>
>>> With this series, from sr650 to sr950 (xen-4.15.20201027T173911.16a20963b3 xen_unstable):
>>> 2020-10-28 21:26:05.074+0000: xc: show_transfer_rate: 23663128 bytes + 2879563 pages in 52.564054368 sec, 213 MiB/sec: Internal error
>>> 2020-10-28 21:26:23.527+0000: xc: show_transfer_rate: 16830040 bytes + 2097603 pages in 18.450592015 sec, 444 MiB/sec: Internal error
>>> 2020-10-28 21:26:41.926+0000: xc: show_transfer_rate: 16830944 bytes + 2097717 pages in 18.397862306 sec, 445 MiB/sec: Internal error
>>> 2020-10-28 21:27:00.339+0000: xc: show_transfer_rate: 16829176 bytes + 2097498 pages in 18.411973339 sec, 445 MiB/sec: Internal error
>>> 2020-10-28 21:27:18.643+0000: xc: show_transfer_rate: 16828592 bytes + 2097425 pages in 18.303326695 sec, 447 MiB/sec: Internal error
>>> 2020-10-28 21:27:26.289+0000: xc: show_transfer_rate: 16835952 bytes + 2098342 pages in 7.579846749 sec, 1081 MiB/sec: Internal error  
>> These are good numbers, and clearly show that there is some value here,
>> but shouldn't they be in the series header?  They're not terribly
>> relevant to this patch specifically.
> The cover letter is unfortunately not under version control.
> Perhaps there are ways with git notes, I never use it.

In the end, we'll want some kind of note in the changelog, but that
wants to be a single line.  It's probably fine to say "Improve migration
performance.  25% better bandwidth when NIC link speed is the
bottleneck, due to optimising the data handling logic."

>> Also, while I can believe that the first sample is slower than the later
>> ones (in particular, during the first round, we've got to deal with the
>> non-RAM regions too and therefore spend more time making hypercalls),
>> I'm not sure I believe the final sample.  Given the byte/page count, the
>> substantially smaller elapsed time looks suspicious.
> The first one is slower because it has to wait for the receiver to allocate pages.
> But maybe as you said there are other aspects as well.
> The last one is always way faster because apparently map/unmap is less costly with a stopped guest.

That's suspicious.  If true, we've got some very wonky behaviour in the
hypervisor...

> Right now the code may reach up to 15Gbit/s. The next step is to map the domU just once to reach wirespeed.

We can in principle do that in 64bit toolstacks, for HVM guests.  But
not usefully until we've fixed the fact that Xen has no idea what the
guest physmap is supposed to look like.

At the moment, the current scheme is a little more resilient to bugs
caused by the guest attempting to balloon during the live phase.

Another area to improve, which can be started now, is to avoid bounce
buffering hypercall data.  Now that we have /dev/xen/hypercall which you
can mmap() regular kernel pages from, what we want is a simple memory
allocator which we can allocate permanent hypercall buffers from, rather
than the internals of every xc_*() hypercall wrapper bouncing the data
in (potentially) both directions.
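
One possible shape for such a "simple memory allocator" is a bump allocator over a single long-lived mapping. The sketch below is only an illustration under stated assumptions: it uses an anonymous mmap() as a stand-in for a mapping obtained from /dev/xen/hypercall, the `sketch_arena*` names are hypothetical, and nothing here reflects an existing libxenctrl interface.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical arena: one mapping, handed out piecewise, never recycled. */
struct sketch_arena {
    unsigned char *base;
    size_t size;
    size_t used;
};

/* Stand-in for mapping hypercall-safe pages; returns 0 or -1. */
int sketch_arena_init(struct sketch_arena *a, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;

    a->base = p;
    a->size = size;
    a->used = 0;
    return 0;
}

/* Hand out a permanent buffer, aligned to 64 bytes; NULL when exhausted. */
void *sketch_arena_alloc(struct sketch_arena *a, size_t len)
{
    size_t start = (a->used + 63) & ~(size_t)63;

    if (len == 0 || start > a->size || len > a->size - start)
        return NULL;

    a->used = start + len;
    return a->base + start;
}

/* Release the whole mapping at once, e.g. when migration has finished. */
void sketch_arena_destroy(struct sketch_arena *a)
{
    if (a->base)
        munmap(a->base, a->size);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

Because the buffers live for the duration of the operation, there is no per-buffer free; a caller would allocate each hypercall argument buffer once and reuse it for every batch, which is what removes the bouncing.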

>
>> Are these observations with an otherwise idle dom0?
> Yes. Idle dom0 and a domU busy with touching its memory.
>
> Unfortunately, I'm not able to confirm the reported gain with the systems I have today.
> I'm waiting for different hardware to be prepared; right now I have only a pair of CoyotePass and WilsonCity systems.
>
> I'm sure there were NUMA effects involved. Last year's libvirt was unable to properly pin vcpus. If I pin all the involved memory to node#0 there is some jitter in the logged numbers, but no obvious improvement. The first iteration is slightly faster, but that is it.

Oh - so the speedup might not be from reduced data handling?

Avoiding unnecessary data copies is clearly going to improve things,
even if it isn't 25%.

~Andrew




* Re: [PATCH v20210701 12/40] tools: unify type checking for data pfns in migration stream
  2021-07-01  9:56 ` [PATCH v20210701 12/40] tools: unify type checking for data pfns in migration stream Olaf Hering
  2021-07-02 19:43   ` Andrew Cooper
@ 2021-07-05 13:10   ` Andrew Cooper
  2021-07-05 13:53     ` Olaf Hering
  1 sibling, 1 reply; 86+ messages in thread
From: Andrew Cooper @ 2021-07-05 13:10 UTC (permalink / raw)
  To: Olaf Hering, xen-devel; +Cc: Ian Jackson, Wei Liu, Juergen Gross

On 01/07/2021 10:56, Olaf Hering wrote:
> Introduce a helper which decides if a given pfn in the migration
> stream is backed by memory.
>
> This specifically deals with type XEN_DOMCTL_PFINFO_XALLOC, which was
> a synthetic toolstack-only type used in Xen 4.2 to 4.5. It indicated a
> dirty page on the sending side for which no data will be sent in the
> initial iteration.

What do you mean by "This specifically deals with"?

AFAICT, the code before was correct.

~Andrew



* Re: [PATCH v20210701 12/40] tools: unify type checking for data pfns in migration stream
  2021-07-05 13:10   ` Andrew Cooper
@ 2021-07-05 13:53     ` Olaf Hering
  2021-07-05 18:54       ` Andrew Cooper
  0 siblings, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-05 13:53 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

On Mon, Jul 05, Andrew Cooper wrote:

> What do you mean "This specifically deals with" ?

This was a result of Jürgen pointing out that XEN_DOMCTL_PFINFO_XALLOC
is not handled. If all the type checking changes go into a single
commit, the commit message has to be reworded.

Olaf



* Re: [PATCH v20210701 15/40] tools: prepare to allocate saverestore arrays once
  2021-07-05 13:01       ` Andrew Cooper
@ 2021-07-05 14:11         ` Olaf Hering
  2021-07-13 17:50         ` Olaf Hering
  1 sibling, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-05 14:11 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

On Mon, 5 Jul 2021 14:01:07 +0100,
Andrew Cooper <andrew.cooper3@citrix.com> wrote:

> > The last one is always way faster because apparently map/unmap is less costly with a stopped guest.  
> That's suspicious.  If true, we've got some very wonky behaviour in the
> hypervisor...

At least the transfer rate of this last iteration is consistent.
Since the only difference I can see is that the domU is suspended, I suspect the mapping.
I have not investigated where the time is spent. I should probably do that one day to better understand this specific difference.

> > Right now the code may reach up to 15Gbit/s. The next step is to map the domU just once to reach wirespeed.  
> 
> We can in principle do that in 64bit toolstacks, for HVM guests.  But
> not usefully until we've fixed the fact that Xen has no idea what the
> guest physmap is supposed to look like.

Why would Xen care?
My attempt last year with a new save/restore code did just 'map' the memory on both sides. The 'unmap' was done in exit().

With this approach I got wirespeed in all iterations with a 10G link.

> At the moment, the current scheme is a little more resilient to bugs
> caused by the guest attempting to balloon during the live phase.

I did not specifically test how a domU behaves when it claims and releases pages while being migrated.
I think this series would handle at least parts of that:
If a page appears or disappears it will be recognized by getpageframeinfo.
If a page disappears between getpageframeinfo and MMAPBATCH I expect an error.
This error is fatal right now, perhaps the code could catch this and move on.
If a page disappears after MMAPBATCH it will be caught by later iterations.


> Another area to improve, which can be started now, is to avoid bounce
> buffering hypercall data.  Now that we have /dev/xen/hypercall which you
> can mmap() regular kernel pages from, what we want is a simple memory
> allocator which we can allocate permanent hypercall buffers from, rather
> than the internals of every xc_*() hypercall wrapper bouncing the data
> in (potentially) both directions.

That sounds like a good idea. Not sure how costly the current approach is.

> Oh - so the speedup might not be from reduced data handling?

At least not on the systems I have now.

Perhaps I should test what the numbers look like with the NIC and the toolstack in node#0, and the domU in node#1.


Olaf


* Re: [PATCH v20210701 10/40] tools: add xc_is_known_page_type to libxenctrl
  2021-07-05  9:51       ` Andrew Cooper
@ 2021-07-05 14:24         ` Olaf Hering
  0 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-05 14:24 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

On Mon, 5 Jul 2021 10:51:50 +0100,
Andrew Cooper <andrew.cooper3@citrix.com> wrote:

> All type fields are uniformly uint32_t elsewhere.

To me it looks like xc_get_pfn_type_batch writes to an array of xen_pfn_t.

Olaf


* Re: [PATCH v20210701 12/40] tools: unify type checking for data pfns in migration stream
  2021-07-05 13:53     ` Olaf Hering
@ 2021-07-05 18:54       ` Andrew Cooper
  2021-07-05 19:06         ` Olaf Hering
  0 siblings, 1 reply; 86+ messages in thread
From: Andrew Cooper @ 2021-07-05 18:54 UTC (permalink / raw)
  To: Olaf Hering; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

On 05/07/2021 14:53, Olaf Hering wrote:
> On Mon, Jul 05, Andrew Cooper wrote:
>
>> What do you mean "This specifically deals with" ?
> This was a result of Jürgen pointing out that XEN_DOMCTL_PFINFO_XALLOC
> is not handled.

But it is...

Before the patch, we only don't populate for XTAB or BROKEN.  We
populate for every other type, including the unknown/invalid ones.

There is no change in behaviour (for the non-invalid cases) that I can see.
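
The pre-patch rule described above (populate for everything except XTAB and BROKEN, which therefore includes XALLOC) can be written down as a small predicate. This is a sketch for illustration, not the helper from the patch: the `SKETCH_*` constants are local stand-ins whose values are assumed to mirror the XEN_DOMCTL_PFINFO_* definitions in the public domctl header and should be checked against it, and `sketch_should_populate` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; values are assumptions, verify against public/domctl.h. */
#define SKETCH_PFINFO_NOTAB     (0x0u << 28)
#define SKETCH_PFINFO_BROKEN    (0xdu << 28)
#define SKETCH_PFINFO_XALLOC    (0xeu << 28)
#define SKETCH_PFINFO_XTAB      (0xfu << 28)
#define SKETCH_PFINFO_LTAB_MASK (0xfu << 28)

/*
 * Rule as described in the review: only XTAB and BROKEN are skipped,
 * every other type (including XALLOC and unknown ones) gets populated.
 */
bool sketch_should_populate(uint32_t type)
{
    switch (type & SKETCH_PFINFO_LTAB_MASK) {
    case SKETCH_PFINFO_XTAB:
    case SKETCH_PFINFO_BROKEN:
        return false;
    default:
        return true;
    }
}
```

Under this rule an XALLOC entry is populated even though, per the commit message, no page data accompanies it in the initial iteration.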

~Andrew



* Re: [PATCH v20210701 12/40] tools: unify type checking for data pfns in migration stream
  2021-07-05 18:54       ` Andrew Cooper
@ 2021-07-05 19:06         ` Olaf Hering
  0 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-05 19:06 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

On Mon, 5 Jul 2021 19:54:21 +0100,
Andrew Cooper <andrew.cooper3@citrix.com> wrote:

> But it is...

It was not handled in an earlier variant of the patch. Meanwhile the original behavior is restored with the current variant.

Olaf


* Re: [PATCH v20210701 07/40] tools: create libxensaverestore
  2021-07-01  9:56 ` [PATCH v20210701 07/40] tools: create libxensaverestore Olaf Hering
@ 2021-07-09  9:20   ` Olaf Hering
  2021-07-09  9:31     ` Julien Grall
  2021-07-09  9:35   ` Julien Grall
  1 sibling, 1 reply; 86+ messages in thread
From: Olaf Hering @ 2021-07-09  9:20 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Julien Grall, Stefano Stabellini, Juergen Gross, Anthony PERARD

On Thu, 1 Jul 2021 11:56:02 +0200,
Olaf Hering <olaf@aepfle.de> wrote:

> Move all save/restore related code

This is now 6 months old.

What is blocking approval?


Olaf


* Re: [PATCH v20210701 07/40] tools: create libxensaverestore
  2021-07-09  9:20   ` Olaf Hering
@ 2021-07-09  9:31     ` Julien Grall
  2021-07-09  9:33       ` Olaf Hering
  0 siblings, 1 reply; 86+ messages in thread
From: Julien Grall @ 2021-07-09  9:31 UTC (permalink / raw)
  To: Olaf Hering, xen-devel
  Cc: Wei Liu, Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Juergen Gross, Anthony PERARD

Hi Olaf,

On 09/07/2021 10:20, Olaf Hering wrote:
> On Thu, 1 Jul 2021 11:56:02 +0200,
> Olaf Hering <olaf@aepfle.de> wrote:
> 
>> Move all save/restore related code
> 
> This is now 6 months old.
> 
> What is blocking approval?

There is already an ack from Wei which should be sufficient for the 
tools part.


But looking at the history of the patch, there seem to have been concerns 
from Andrew. Were they resolved?

Cheers,

-- 
Julien Grall



* Re: [PATCH v20210701 07/40] tools: create libxensaverestore
  2021-07-09  9:31     ` Julien Grall
@ 2021-07-09  9:33       ` Olaf Hering
  0 siblings, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-09  9:33 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei Liu, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Stefano Stabellini, Juergen Gross, Anthony PERARD

On Fri, 9 Jul 2021 10:31:53 +0100,
Julien Grall <julien@xen.org> wrote:

> Were they resolved?

I think so, yes.

Olaf


* Re: [PATCH v20210701 07/40] tools: create libxensaverestore
  2021-07-01  9:56 ` [PATCH v20210701 07/40] tools: create libxensaverestore Olaf Hering
  2021-07-09  9:20   ` Olaf Hering
@ 2021-07-09  9:35   ` Julien Grall
  1 sibling, 0 replies; 86+ messages in thread
From: Julien Grall @ 2021-07-09  9:35 UTC (permalink / raw)
  To: Olaf Hering, xen-devel
  Cc: Wei Liu, Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Juergen Gross, Anthony PERARD

Hi Olaf,

On 01/07/2021 10:56, Olaf Hering wrote:
> Move all save/restore related code from libxenguest.so into a separate
> library libxensaverestore.so. The only consumer is libxl-save-helper.
> There is no need to have the moved code mapped all the time in binaries
> where libxenguest.so is used.
> 
> According to size(1) the change is:
>     text	   data	    bss	    dec	    hex	filename
>   187183	   4304	     48	 191535	  2ec2f	guest/libxenguest.so.4.15.0
> 
>   124106	   3376	     48	 127530	  1f22a	guest/libxenguest.so.4.15.0
>    67841	   1872	      8	  69721	  11059	saverestore/libxensaverestore.so.4.15.0
> 
> While touching the files anyway, take the opportunity to drop the
> redundant xg_sr_ filename prefix.
> 
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
> Acked-by: Wei Liu <wl@xen.org>

The changelog is not very useful to keep in the commit message. We 
usually add --- on its own line before it so that it gets stripped when the 
patch is applied.

This comment applies for the full series. But I can deal with this patch 
if I happen to commit it.

Cheers,

-- 
Julien Grall



* Re: [PATCH v20210701 15/40] tools: prepare to allocate saverestore arrays once
  2021-07-05 13:01       ` Andrew Cooper
  2021-07-05 14:11         ` Olaf Hering
@ 2021-07-13 17:50         ` Olaf Hering
  1 sibling, 0 replies; 86+ messages in thread
From: Olaf Hering @ 2021-07-13 17:50 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Ian Jackson, Wei Liu, Juergen Gross

On Mon, 5 Jul 2021 14:01:07 +0100,
Andrew Cooper <andrew.cooper3@citrix.com> wrote:

> > Unfortunately, I'm not able to confirm the reported gain with the systems I have today.
> > I'm waiting for different hardware to be prepared; right now I have only a pair of CoyotePass and WilsonCity systems.
> >
> > I'm sure there were NUMA effects involved. Last year's libvirt was unable to properly pin vcpus. If I pin all the involved memory to node#0 there is some jitter in the logged numbers, but no obvious improvement. The first iteration is slightly faster, but that is it.
> 
> Oh - so the speedup might not be from reduced data handling?
> 
> Avoiding unnecessary data copies is clearly going to improve things,
> even if it isn't 25%.


For HVM the only notable improvement is the initial iteration.

On average with 4 migrations of a single domU from A to B and back from B to A, transfer rate goes up from ~490MiB/s to ~677MiB/s. The initial transfer time for the 4194299 domU pages:

with plain staging:
36.800582009
32.145531727
31.827540709
33.009956041
34.951513466
33.416769973
32.128985762
33.201786076

with the series applied:
24.266428156
24.632898175
24.112660134
23.603475994
24.418323859
23.841875914
25.087779229
23.493812677


Migration of a PV domU is much faster, but transfer rate for each iteration varies with or without the patches being applied.
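
The averages quoted above can be reproduced from the eight timings per configuration. The sketch below assumes 4 KiB pages and uses hypothetical `sketch_*` helper names; it is arithmetic on the posted numbers only, not part of the series.

```c
#include <stddef.h>
#include <stdint.h>

/* Arithmetic mean of n samples; returns 0.0 for an empty set. */
double sketch_mean(const double *v, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += v[i];

    return n ? sum / (double)n : 0.0;
}

/* MiB/s for a number of 4 KiB pages moved in the given time. */
double sketch_rate_mib(uint64_t pages, double seconds)
{
    return (double)pages * 4096.0 / (1024.0 * 1024.0) / seconds;
}
```

Feeding in the listed times gives a mean of about 33.44 sec for plain staging and about 24.18 sec with the series, so the 4194299 pages move at roughly 490 MiB/s and 677 MiB/s respectively. That is about 28% less time for the initial iteration.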


Olaf


Thread overview: 86+ messages
2021-07-01  9:55 [PATCH v20210701 00/40] leftover from 2020 Olaf Hering
2021-07-01  9:55 ` [PATCH v20210701 01/40] hotplug/Linux: fix starting of xenstored with restarting systemd Olaf Hering
2021-07-01  9:55 ` [PATCH v20210701 02/40] tools: add API to work with sevaral bits at once Olaf Hering
2021-07-01  9:55 ` [PATCH v20210701 03/40] xl: fix description of migrate --debug Olaf Hering
2021-07-01 14:30   ` Anthony PERARD
2021-07-01 14:33   ` Andrew Cooper
2021-07-01 14:40     ` Olaf Hering
2021-07-01 14:41       ` Olaf Hering
2021-07-01 14:49         ` Andrew Cooper
2021-07-01 15:08           ` Olaf Hering
2021-07-01  9:55 ` [PATCH v20210701 04/40] tools: use integer division in convert-legacy-stream Olaf Hering
2021-07-02 15:10   ` Andrew Cooper
2021-07-01  9:56 ` [PATCH v20210701 05/40] tools: handle libxl__physmap_info.name properly " Olaf Hering
2021-07-02 15:35   ` Andrew Cooper
2021-07-01  9:56 ` [PATCH v20210701 06/40] tools: fix Python3.4 TypeError in format string Olaf Hering
2021-07-02 16:19   ` Marek Marczykowski-Górecki
2021-07-02 16:39     ` Andrew Cooper
2021-07-05  8:18       ` Olaf Hering
2021-07-05  9:47         ` Andrew Cooper
2021-07-05  8:07     ` Olaf Hering
2021-07-05 10:10       ` Andrew Cooper
2021-07-01  9:56 ` [PATCH v20210701 07/40] tools: create libxensaverestore Olaf Hering
2021-07-09  9:20   ` Olaf Hering
2021-07-09  9:31     ` Julien Grall
2021-07-09  9:33       ` Olaf Hering
2021-07-09  9:35   ` Julien Grall
2021-07-01  9:56 ` [PATCH v20210701 08/40] MAINTAINERS: add myself as saverestore maintainer Olaf Hering
2021-07-01 10:39   ` Jan Beulich
2021-07-01 11:01     ` Olaf Hering
2021-07-01 11:40       ` Julien Grall
2021-07-01 12:00         ` Olaf Hering
2021-07-01 12:09           ` Julien Grall
2021-07-01  9:56 ` [PATCH v20210701 09/40] tools: add readv_exact to libxenctrl Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 10/40] tools: add xc_is_known_page_type " Olaf Hering
2021-07-02 19:20   ` Andrew Cooper
2021-07-05  8:22     ` Olaf Hering
2021-07-05  9:51       ` Andrew Cooper
2021-07-05 14:24         ` Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 11/40] tools: use sr_is_known_page_type Olaf Hering
2021-07-02 19:27   ` Andrew Cooper
2021-07-05  8:25     ` Olaf Hering
2021-07-05  9:53       ` Andrew Cooper
2021-07-01  9:56 ` [PATCH v20210701 12/40] tools: unify type checking for data pfns in migration stream Olaf Hering
2021-07-02 19:43   ` Andrew Cooper
2021-07-05  8:59     ` Olaf Hering
2021-07-05  9:53       ` Andrew Cooper
2021-07-05 13:10   ` Andrew Cooper
2021-07-05 13:53     ` Olaf Hering
2021-07-05 18:54       ` Andrew Cooper
2021-07-05 19:06         ` Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 13/40] " Olaf Hering
2021-07-02 19:49   ` Andrew Cooper
2021-07-01  9:56 ` [PATCH v20210701 14/40] tools: show migration transfer rate in send_dirty_pages Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 15/40] tools: prepare to allocate saverestore arrays once Olaf Hering
2021-07-05 10:44   ` Andrew Cooper
2021-07-05 11:27     ` Olaf Hering
2021-07-05 13:01       ` Andrew Cooper
2021-07-05 14:11         ` Olaf Hering
2021-07-13 17:50         ` Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 16/40] tools: save: move mfns array Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 17/40] tools: save: move types array Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 18/40] tools: save: move errors array Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 19/40] tools: save: move iov array Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 20/40] tools: save: move rec_pfns array Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 21/40] tools: save: move guest_data array Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 22/40] tools: save: move local_pages array Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 23/40] tools: restore: move types array Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 24/40] tools: restore: move mfns array Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 25/40] tools: restore: move map_errs array Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 26/40] tools: restore: move mfns array in populate_pfns Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 27/40] tools: restore: move pfns " Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 28/40] tools: restore: split record processing Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 29/40] tools: restore: split handle_page_data Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 30/40] tools: restore: write data directly into guest Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 31/40] tools: recognize LIBXL_API_VERSION for 4.16 Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 32/40] tools: adjust libxl_domain_suspend to receive a struct props Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 33/40] tools: change struct precopy_stats to precopy_stats_t Olaf Hering
2021-07-01 16:45   ` Anthony PERARD
2021-07-01 17:08     ` Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 34/40] tools: add callback to libxl for precopy_policy and precopy_stats_t Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 35/40] tools: add --max_iters to libxl_domain_suspend Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 36/40] tools: add --min_remaining " Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 37/40] tools: add --abort_if_busy " Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 38/40] tools: add API for expandable bitmaps Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 39/40] tools: use xg_sr_bitmap for populated_pfns Olaf Hering
2021-07-01  9:56 ` [PATCH v20210701 40/40] tools/libxc: use superpages during restore of HVM guest Olaf Hering
