All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] tools/migration: Fixes in send_checkpoint_dirty_pfn_list()
@ 2021-07-06 11:23 Andrew Cooper
  2021-07-06 11:23 ` [PATCH 1/2] tools/migration: Fix iovec handling " Andrew Cooper
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Andrew Cooper @ 2021-07-06 11:23 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Ian Jackson, Wei Liu, Juergen Gross, Jan Beulich,
	Olaf Hering

These are a prerequisite to all currently on-list patches touching the
function.

Andrew Cooper (2):
  tools/migration: Fix iovec handling in send_checkpoint_dirty_pfn_list()
  tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()

 tools/libs/guest/xg_sr_restore.c | 48 ++++++++++++++--------------------------
 1 file changed, 17 insertions(+), 31 deletions(-)

-- 
2.11.0



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/2] tools/migration: Fix iovec handling in send_checkpoint_dirty_pfn_list()
  2021-07-06 11:23 [PATCH 0/2] tools/migration: Fixes in send_checkpoint_dirty_pfn_list() Andrew Cooper
@ 2021-07-06 11:23 ` Andrew Cooper
  2021-07-06 11:54   ` Jan Beulich
  2021-07-06 12:20   ` Olaf Hering
  2021-07-06 11:23 ` [PATCH 2/2] tools/migration: Fix potential overflow " Andrew Cooper
  2021-07-06 12:07 ` [PATCH 0/2] tools/migration: Fixes " Jan Beulich
  2 siblings, 2 replies; 20+ messages in thread
From: Andrew Cooper @ 2021-07-06 11:23 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Ian Jackson, Wei Liu, Juergen Gross, Jan Beulich,
	Olaf Hering

We shouldn't be using two struct iovec's to write half of 'rec' each, and
there is no need to malloc() for two struct iovec's at all.

Simplify down to just two - one covering the whole of 'rec', and one covering
the pfns array.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Ian Jackson <iwj@xenproject.org>
CC: Wei Liu <wl@xen.org>
CC: Juergen Gross <jgross@suse.com>
CC: Jan Beulich <JBeulich@suse.com>
CC: Olaf Hering <olaf@aepfle.de>
---
 tools/libs/guest/xg_sr_restore.c | 25 ++++++-------------------
 1 file changed, 6 insertions(+), 19 deletions(-)

diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index aa4113d7f679..07c9e291610b 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -425,11 +425,13 @@ static int send_checkpoint_dirty_pfn_list(struct xc_sr_context *ctx)
     int rc = -1;
     unsigned int count, written;
     uint64_t i, *pfns = NULL;
-    struct iovec *iov = NULL;
     xc_shadow_op_stats_t stats = { 0, ctx->restore.p2m_size };
     struct xc_sr_record rec = {
         .type = REC_TYPE_CHECKPOINT_DIRTY_PFN_LIST,
     };
+    struct iovec iov[2] = {
+        { &rec, sizeof(rec) },
+    };
     DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
                                     &ctx->restore.dirty_bitmap_hbuf);
 
@@ -471,26 +473,12 @@ static int send_checkpoint_dirty_pfn_list(struct xc_sr_context *ctx)
         pfns[written++] = i;
     }
 
-    /* iovec[] for writev(). */
-    iov = malloc(3 * sizeof(*iov));
-    if ( !iov )
-    {
-        ERROR("Unable to allocate memory for sending dirty bitmap");
-        goto err;
-    }
-
     rec.length = count * sizeof(*pfns);
 
-    iov[0].iov_base = &rec.type;
-    iov[0].iov_len = sizeof(rec.type);
-
-    iov[1].iov_base = &rec.length;
-    iov[1].iov_len = sizeof(rec.length);
-
-    iov[2].iov_base = pfns;
-    iov[2].iov_len = count * sizeof(*pfns);
+    iov[1].iov_base = pfns;
+    iov[1].iov_len = rec.length;
 
-    if ( writev_exact(ctx->restore.send_back_fd, iov, 3) )
+    if ( writev_exact(ctx->restore.send_back_fd, iov, ARRAY_SIZE(iov)) )
     {
         PERROR("Failed to write dirty bitmap to stream");
         goto err;
@@ -499,7 +487,6 @@ static int send_checkpoint_dirty_pfn_list(struct xc_sr_context *ctx)
     rc = 0;
  err:
     free(pfns);
-    free(iov);
     return rc;
 }
 
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 11:23 [PATCH 0/2] tools/migration: Fixes in send_checkpoint_dirty_pfn_list() Andrew Cooper
  2021-07-06 11:23 ` [PATCH 1/2] tools/migration: Fix iovec handling " Andrew Cooper
@ 2021-07-06 11:23 ` Andrew Cooper
  2021-07-06 12:03   ` Jan Beulich
  2021-07-06 12:58   ` Olaf Hering
  2021-07-06 12:07 ` [PATCH 0/2] tools/migration: Fixes " Jan Beulich
  2 siblings, 2 replies; 20+ messages in thread
From: Andrew Cooper @ 2021-07-06 11:23 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Ian Jackson, Wei Liu, Juergen Gross, Jan Beulich,
	Olaf Hering

'count * sizeof(*pfns)' can in principle overflow, but is implausible in
practice as the time between checkpoints is typically sub-second.
Nevertheless, simplify the code and remove the risk.

There is no need to loop over the bitmap to calculate count.  The number of
set bits is returned in xc_shadow_op_stats_t which is already collected (and
ignored).

Bounds check the count against what will fit in REC_LENGTH_MAX.  At the time
of writing, this allows up to 0xffffff pfns.  Rearrange the pfns loop to check
for errors both ways, not simply that there were more pfns than expected.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Ian Jackson <iwj@xenproject.org>
CC: Wei Liu <wl@xen.org>
CC: Juergen Gross <jgross@suse.com>
CC: Jan Beulich <JBeulich@suse.com>
CC: Olaf Hering <olaf@aepfle.de>
---
 tools/libs/guest/xg_sr_restore.c | 33 ++++++++++++++++-----------------
 1 file changed, 16 insertions(+), 17 deletions(-)

diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index 07c9e291610b..bda04ee42e3f 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -425,7 +425,7 @@ static int send_checkpoint_dirty_pfn_list(struct xc_sr_context *ctx)
     int rc = -1;
     unsigned int count, written;
     uint64_t i, *pfns = NULL;
-    xc_shadow_op_stats_t stats = { 0, ctx->restore.p2m_size };
+    xc_shadow_op_stats_t stats;
     struct xc_sr_record rec = {
         .type = REC_TYPE_CHECKPOINT_DIRTY_PFN_LIST,
     };
@@ -444,14 +444,17 @@ static int send_checkpoint_dirty_pfn_list(struct xc_sr_context *ctx)
         goto err;
     }
 
-    for ( i = 0, count = 0; i < ctx->restore.p2m_size; i++ )
+    count = stats.dirty_count;
+
+    if ( ((REC_LENGTH_MAX - sizeof(rec)) / sizeof(*pfns)) < count )
     {
-        if ( test_bit(i, dirty_bitmap) )
-            count++;
+        ERROR("Too many PFNs (%u) to fit in record (limit %zu)", count,
+              ((REC_LENGTH_MAX - sizeof(rec)) / sizeof(*pfns)));
+        goto err;
     }
 
-
-    pfns = malloc(count * sizeof(*pfns));
+    iov[1].iov_len  = rec.length = count * sizeof(*pfns);
+    iov[1].iov_base = pfns       = malloc(rec.length);
     if ( !pfns )
     {
         ERROR("Unable to allocate %zu bytes of memory for dirty pfn list",
@@ -459,24 +462,20 @@ static int send_checkpoint_dirty_pfn_list(struct xc_sr_context *ctx)
         goto err;
     }
 
-    for ( i = 0, written = 0; i < ctx->restore.p2m_size; ++i )
+    for ( i = 0, written = 0; count && i < ctx->restore.p2m_size; ++i, --count )
     {
         if ( !test_bit(i, dirty_bitmap) )
             continue;
 
-        if ( written > count )
-        {
-            ERROR("Dirty pfn list exceed");
-            goto err;
-        }
-
         pfns[written++] = i;
     }
 
-    rec.length = count * sizeof(*pfns);
-
-    iov[1].iov_base = pfns;
-    iov[1].iov_len = rec.length;
+    if ( written != stats.dirty_count )
+    {
+        ERROR("Mismatch between dirty bitmap bits (%u), and dirty_count (%u)",
+              written, stats.dirty_count);
+        goto err;
+    }
 
     if ( writev_exact(ctx->restore.send_back_fd, iov, ARRAY_SIZE(iov)) )
     {
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] tools/migration: Fix iovec handling in send_checkpoint_dirty_pfn_list()
  2021-07-06 11:23 ` [PATCH 1/2] tools/migration: Fix iovec handling " Andrew Cooper
@ 2021-07-06 11:54   ` Jan Beulich
  2021-07-06 12:20   ` Olaf Hering
  1 sibling, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2021-07-06 11:54 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Wei Liu, Juergen Gross, Olaf Hering, Xen-devel

On 06.07.2021 13:23, Andrew Cooper wrote:
> We shouldn't be using two struct iovec's to write half of 'rec' each, and
> there is no need to malloc() for two struct iovec's at all.

I was indeed wondering about all of these while also touching the
function. But I was guessing that there might be a deeper reason I
don't see.

> Simplify down to just two - one covering the whole of 'rec', and one covering
> the pfns array.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 11:23 ` [PATCH 2/2] tools/migration: Fix potential overflow " Andrew Cooper
@ 2021-07-06 12:03   ` Jan Beulich
  2021-07-06 13:34     ` Andrew Cooper
  2021-07-06 12:58   ` Olaf Hering
  1 sibling, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2021-07-06 12:03 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Wei Liu, Juergen Gross, Olaf Hering, Xen-devel

On 06.07.2021 13:23, Andrew Cooper wrote:
> 'count * sizeof(*pfns)' can in principle overflow, but is implausible in
> practice as the time between checkpoints is typically sub-second.
> Nevertheless, simplify the code and remove the risk.
> 
> There is no need to loop over the bitmap to calculate count.  The number of
> set bits is returned in xc_shadow_op_stats_t which is already collected (and
> ignored).
> 
> Bounds check the count against what will fit in REC_LENGTH_MAX.  At the time
> of writing, this allows up to 0xffffff pfns.

Well, okay, this then means that an overflow in the reporting of
dirty_count doesn't matter for now, because the limit is lower
anyway.

>  Rearrange the pfns loop to check
> for errors both ways, not simply that there were more pfns than expected.

Hmm, "both ways" to me would mean ...

> @@ -459,24 +462,20 @@ static int send_checkpoint_dirty_pfn_list(struct xc_sr_context *ctx)
>          goto err;
>      }
>  
> -    for ( i = 0, written = 0; i < ctx->restore.p2m_size; ++i )
> +    for ( i = 0, written = 0; count && i < ctx->restore.p2m_size; ++i, --count )
>      {
>          if ( !test_bit(i, dirty_bitmap) )
>              continue;
>  
> -        if ( written > count )
> -        {
> -            ERROR("Dirty pfn list exceed");
> -            goto err;
> -        }
> -
>          pfns[written++] = i;
>      }
>  
> -    rec.length = count * sizeof(*pfns);
> -
> -    iov[1].iov_base = pfns;
> -    iov[1].iov_len = rec.length;
> +    if ( written != stats.dirty_count )
> +    {
> +        ERROR("Mismatch between dirty bitmap bits (%u), and dirty_count (%u)",
> +              written, stats.dirty_count);
> +        goto err;
> +    }

... you then also check that there are no further bit set in the
bitmap. As said elsewhere, I'm not convinced using statistics as
a basis for actual operation (rather than just reporting) is
appropriate. I'm unaware of there being any spelled out guarantee
that the numbers reported back from the hypercall are accurate.

Jan



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/2] tools/migration: Fixes in send_checkpoint_dirty_pfn_list()
  2021-07-06 11:23 [PATCH 0/2] tools/migration: Fixes in send_checkpoint_dirty_pfn_list() Andrew Cooper
  2021-07-06 11:23 ` [PATCH 1/2] tools/migration: Fix iovec handling " Andrew Cooper
  2021-07-06 11:23 ` [PATCH 2/2] tools/migration: Fix potential overflow " Andrew Cooper
@ 2021-07-06 12:07 ` Jan Beulich
  2 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2021-07-06 12:07 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Wei Liu, Juergen Gross, Olaf Hering, Xen-devel

On 06.07.2021 13:23, Andrew Cooper wrote:
> These are a prerequisite to all currently on-list patches touching the
> function.

Just as an observation, while I can see how from your pov (judging from
your not-exactly-review-comments) patch 2 may be a prereq to one of my
changes, I don't think I'd call patch 1 a prereq to anything - this is
cleanup which can be done at any point. Of course it'll allow me to
shrink that one of my patches, which is certainly nice.

Jan



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] tools/migration: Fix iovec handling in send_checkpoint_dirty_pfn_list()
  2021-07-06 11:23 ` [PATCH 1/2] tools/migration: Fix iovec handling " Andrew Cooper
  2021-07-06 11:54   ` Jan Beulich
@ 2021-07-06 12:20   ` Olaf Hering
  1 sibling, 0 replies; 20+ messages in thread
From: Olaf Hering @ 2021-07-06 12:20 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Ian Jackson, Wei Liu, Juergen Gross, Jan Beulich

[-- Attachment #1: Type: text/plain, Size: 385 bytes --]

Am Tue, 6 Jul 2021 12:23:31 +0100
schrieb Andrew Cooper <andrew.cooper3@citrix.com>:

> We shouldn't be using two struct iovec's to write half of 'rec' each, and
> there is no need to malloc() for two struct iovec's at all.
> 
> Simplify down to just two - one covering the whole of 'rec', and one covering
> the pfns array.

Acked-by: Olaf Hering <olaf@aepfle.de>


Olaf

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 11:23 ` [PATCH 2/2] tools/migration: Fix potential overflow " Andrew Cooper
  2021-07-06 12:03   ` Jan Beulich
@ 2021-07-06 12:58   ` Olaf Hering
  2021-07-06 13:19     ` Andrew Cooper
  2021-07-06 14:11     ` Olaf Hering
  1 sibling, 2 replies; 20+ messages in thread
From: Olaf Hering @ 2021-07-06 12:58 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Ian Jackson, Wei Liu, Juergen Gross, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 883 bytes --]

Am Tue, 6 Jul 2021 12:23:32 +0100
schrieb Andrew Cooper <andrew.cooper3@citrix.com>:

> +    count = stats.dirty_count;

Is this accurate?

I remember the reporting is broken since a while, and testing a busy domU indicates it is still the case.

# xen-logdirty `xl domid domU`
0: faults= 0 dirty= 258050
1: faults= 0 dirty= 257817
2: faults= 0 dirty= 253713
3: faults= 0 dirty= 253197
4: faults= 0 dirty= 255154
5: faults= 0 dirty= 260876
6: faults= 0 dirty= 259083
7: faults= 0 dirty= 253163
8: faults= 0 dirty= 258349
9: faults= 0 dirty= 260330
10: faults= 0 dirty= 259754
11: faults= 0 dirty= 257273
12: faults= 0 dirty= 255756
13: faults= 0 dirty= 258209
14: faults= 0 dirty= 257847
15: faults= 0 dirty= 59639
16: faults= 0 dirty= 81
17: faults= 0 dirty= 86
18: faults= 0 dirty= 111
19: faults= 0 dirty= 81
20: faults= 0 dirty= 80
....


Olaf

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.2: logdirty.c --]
[-- Type: text/x-c++src, Size: 3186 bytes --]

/* gcc -Wall -o logdirty -O -lxenctrl logdirty.c */
#include <unistd.h>
#include <errno.h>
#include <inttypes.h>
#include <stdlib.h>
#include <stdio.h>
#include <xenctrl.h>
#include <signal.h>

#define PAGE_SHIFT              XC_PAGE_SHIFT
#define ROUNDUP(_x,_w) (((unsigned long)(_x)+(1UL<<(_w))-1) & ~((1UL<<(_w))-1))
#define NRPAGES(x) (ROUNDUP(x, PAGE_SHIFT) >> PAGE_SHIFT)
#define BITS_PER_LONG (sizeof(unsigned long) * 8)
#define ORDER_LONG (sizeof(unsigned long) == 4 ? 5 : 6)


static unsigned int domid = 0;
static xc_interface *xch;

static unsigned long xdmg(xc_interface *c, unsigned int d)
{
	unsigned long r;
#if XEN_DOMCTL_INTERFACE_VERSION < 0x0b
	r = xc_domain_maximum_gpfn(c, d);
#else
	xen_pfn_t gpfns = 0;
	r = xc_domain_maximum_gpfn(c, d, &gpfns);
#endif
	return r + 1;
}
static void sigint_handler(int sig)
{
	int rc;
	fprintf(stderr, "User aborted\n");
	rc = xc_shadow_control(xch, domid, XEN_DOMCTL_SHADOW_OP_OFF, NULL, 0, NULL, 0, NULL);
        if (rc < 0)
                perror("XEN_DOMCTL_SHADOW_OP_OFF hypercall failed\n");
	exit(1);
}


static inline int bitmap_size(int nr_bits)
{
	int nr_long, nr_bytes;
	nr_long = (nr_bits + BITS_PER_LONG - 1) >> ORDER_LONG;
	nr_bytes = nr_long * sizeof(unsigned long);
	return nr_bytes;
}

int main(int argc, char *argv[])
{
	int rc, ret = 1, i, runs = 42;
	unsigned int lflags;
	xentoollog_level lvl;
	xentoollog_logger *l;
	DECLARE_HYPERCALL_BUFFER(unsigned long, to_skip);
	unsigned long p2m_size;
	xc_shadow_op_stats_t stats;

	errno = EINVAL;
	if (argc > 1)
		domid = atoi(argv[1]);
	if (!domid)
		goto out;
	if (argc > 2)
		runs = atoi(argv[2]);
	if (!runs)
		goto out;

	errno = 0;
	lvl = XTL_DEBUG;
	lflags = XTL_STDIOSTREAM_SHOW_PID | XTL_STDIOSTREAM_HIDE_PROGRESS;
	l = (xentoollog_logger *) xtl_createlogger_stdiostream(stderr, lvl, lflags);
	if (!l)
		goto out;

	xch = xc_interface_open(l, 0, 0);
	if (!xch)
		goto out;

	p2m_size = xdmg(xch, domid);
	if (!p2m_size)
		goto out;

	to_skip = xc_hypercall_buffer_alloc_pages(xch, to_skip, NRPAGES(bitmap_size(p2m_size)));
	if (!to_skip)
		goto out;

	signal(SIGINT, sigint_handler);

	rc = xc_shadow_control(xch, domid, XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY, NULL, 0, NULL, 0, NULL);
	if (rc < 0) {
		rc = xc_shadow_control(xch, domid, XEN_DOMCTL_SHADOW_OP_OFF, NULL, 0, NULL, 0, NULL);
		if (rc < 0)
			goto out;

		rc = xc_shadow_control(xch, domid, XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY, NULL, 0, NULL, 0, NULL);
		if (rc < 0)
			goto out;
	}

	rc = xc_shadow_control(xch, domid, XEN_DOMCTL_SHADOW_OP_CLEAN, NULL, 0, NULL, 0, NULL);
	if (rc < 0)
		goto out;

	for (i = 0; i < runs; i++) {
		sleep(1);
		if (1)
			rc = xc_shadow_control(xch, domid, XEN_DOMCTL_SHADOW_OP_CLEAN, HYPERCALL_BUFFER(to_skip), p2m_size, NULL, 0, &stats);
		if (rc < 0)
			goto out;
		printf("%d: faults= %" PRIu32 " dirty= %" PRIu32 "\n", i, stats.fault_count, stats.dirty_count);
	}
	rc = xc_shadow_control(xch, domid, XEN_DOMCTL_SHADOW_OP_OFF, NULL, 0, NULL, 0, NULL);
	if (rc < 0)
		goto out;
	errno = ret = 0;
      out:
	perror(argv[0]);
	return ret;
}

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 12:58   ` Olaf Hering
@ 2021-07-06 13:19     ` Andrew Cooper
  2021-07-06 13:22       ` Andrew Cooper
                         ` (2 more replies)
  2021-07-06 14:11     ` Olaf Hering
  1 sibling, 3 replies; 20+ messages in thread
From: Andrew Cooper @ 2021-07-06 13:19 UTC (permalink / raw)
  To: Olaf Hering; +Cc: Xen-devel, Ian Jackson, Wei Liu, Juergen Gross, Jan Beulich

On 06/07/2021 13:58, Olaf Hering wrote:
> Am Tue, 6 Jul 2021 12:23:32 +0100
> schrieb Andrew Cooper <andrew.cooper3@citrix.com>:
>
>> +    count = stats.dirty_count;
> Is this accurate?

The live loop relies on it, and it worked correctly the last time I
tested it.

> I remember the reporting is broken since a while, and testing a busy domU indicates it is still the case.

If its broken, it needs fixing.

> # xen-logdirty `xl domid domU`
> 0: faults= 0 dirty= 258050
> 1: faults= 0 dirty= 257817
> 2: faults= 0 dirty= 253713
> 3: faults= 0 dirty= 253197
> 4: faults= 0 dirty= 255154
> 5: faults= 0 dirty= 260876
> 6: faults= 0 dirty= 259083
> 7: faults= 0 dirty= 253163
> 8: faults= 0 dirty= 258349
> 9: faults= 0 dirty= 260330
> 10: faults= 0 dirty= 259754
> 11: faults= 0 dirty= 257273
> 12: faults= 0 dirty= 255756
> 13: faults= 0 dirty= 258209
> 14: faults= 0 dirty= 257847
> 15: faults= 0 dirty= 59639
> 16: faults= 0 dirty= 81
> 17: faults= 0 dirty= 86
> 18: faults= 0 dirty= 111
> 19: faults= 0 dirty= 81
> 20: faults= 0 dirty= 80

What is this showing, other than (unsurprisingly) faults doesn't work
for an HVM guest?

~Andrew



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 13:19     ` Andrew Cooper
@ 2021-07-06 13:22       ` Andrew Cooper
  2021-07-06 13:39         ` Olaf Hering
  2021-07-06 13:28       ` Olaf Hering
  2021-07-06 13:56       ` Jan Beulich
  2 siblings, 1 reply; 20+ messages in thread
From: Andrew Cooper @ 2021-07-06 13:22 UTC (permalink / raw)
  To: Olaf Hering; +Cc: Xen-devel, Ian Jackson, Wei Liu, Juergen Gross, Jan Beulich

On 06/07/2021 14:19, Andrew Cooper wrote:
> On 06/07/2021 13:58, Olaf Hering wrote:
>> Am Tue, 6 Jul 2021 12:23:32 +0100
>> schrieb Andrew Cooper <andrew.cooper3@citrix.com>:
>>
>>> +    count = stats.dirty_count;
>> Is this accurate?
> The live loop relies on it, and it worked correctly the last time I
> tested it.
>
>> I remember the reporting is broken since a while, and testing a busy domU indicates it is still the case.
> If its broken, it needs fixing.
>
>> # xen-logdirty `xl domid domU`
>> 0: faults= 0 dirty= 258050
>> 1: faults= 0 dirty= 257817
>> 2: faults= 0 dirty= 253713
>> 3: faults= 0 dirty= 253197
>> 4: faults= 0 dirty= 255154
>> 5: faults= 0 dirty= 260876
>> 6: faults= 0 dirty= 259083
>> 7: faults= 0 dirty= 253163
>> 8: faults= 0 dirty= 258349
>> 9: faults= 0 dirty= 260330
>> 10: faults= 0 dirty= 259754
>> 11: faults= 0 dirty= 257273
>> 12: faults= 0 dirty= 255756
>> 13: faults= 0 dirty= 258209
>> 14: faults= 0 dirty= 257847
>> 15: faults= 0 dirty= 59639
>> 16: faults= 0 dirty= 81
>> 17: faults= 0 dirty= 86
>> 18: faults= 0 dirty= 111
>> 19: faults= 0 dirty= 81
>> 20: faults= 0 dirty= 80
> What is this showing, other than (unsurprisingly) faults doesn't work
> for an HVM guest?

Sorry - I meant HAP guest.

What hardware is this on?  i.e. is the Page Modification Logging feature
in use?

~Andrew


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 13:19     ` Andrew Cooper
  2021-07-06 13:22       ` Andrew Cooper
@ 2021-07-06 13:28       ` Olaf Hering
  2021-07-06 13:56       ` Jan Beulich
  2 siblings, 0 replies; 20+ messages in thread
From: Olaf Hering @ 2021-07-06 13:28 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Ian Jackson, Wei Liu, Juergen Gross, Jan Beulich

[-- Attachment #1: Type: text/plain, Size: 601 bytes --]

Am Tue, 6 Jul 2021 14:19:21 +0100
schrieb Andrew Cooper <andrew.cooper3@citrix.com>:

> > 20: faults= 0 dirty= 80  
> 
> What is this showing, other than (unsurprisingly) faults doesn't work
> for an HVM guest?

The dirty count goes down after a while for a domU that constantly touches as many pages as it can.
But yes, the current code uses stats.dirty_count in a number of places.
It seems all of the usage is just for reporting, so if these code paths would get wrong input nothing bad happens.
precopy_policy may indicate more or less iterations, this is not critical either.

Olaf

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 12:03   ` Jan Beulich
@ 2021-07-06 13:34     ` Andrew Cooper
  2021-07-06 14:00       ` Jan Beulich
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Cooper @ 2021-07-06 13:34 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Ian Jackson, Wei Liu, Juergen Gross, Olaf Hering, Xen-devel

On 06/07/2021 13:03, Jan Beulich wrote:
> On 06.07.2021 13:23, Andrew Cooper wrote:
>> 'count * sizeof(*pfns)' can in principle overflow, but is implausible in
>> practice as the time between checkpoints is typically sub-second.
>> Nevertheless, simplify the code and remove the risk.
>>
>> There is no need to loop over the bitmap to calculate count.  The number of
>> set bits is returned in xc_shadow_op_stats_t which is already collected (and
>> ignored).
>>
>> Bounds check the count against what will fit in REC_LENGTH_MAX.  At the time
>> of writing, this allows up to 0xffffff pfns.
> Well, okay, this then means that an overflow in the reporting of
> dirty_count doesn't matter for now, because the limit is lower
> anyway.
>
>>  Rearrange the pfns loop to check
>> for errors both ways, not simply that there were more pfns than expected.
> Hmm, "both ways" to me would mean ...
>
>> @@ -459,24 +462,20 @@ static int send_checkpoint_dirty_pfn_list(struct xc_sr_context *ctx)
>>          goto err;
>>      }
>>  
>> -    for ( i = 0, written = 0; i < ctx->restore.p2m_size; ++i )
>> +    for ( i = 0, written = 0; count && i < ctx->restore.p2m_size; ++i, --count )
>>      {
>>          if ( !test_bit(i, dirty_bitmap) )
>>              continue;
>>  
>> -        if ( written > count )
>> -        {
>> -            ERROR("Dirty pfn list exceed");
>> -            goto err;
>> -        }
>> -
>>          pfns[written++] = i;
>>      }
>>  
>> -    rec.length = count * sizeof(*pfns);
>> -
>> -    iov[1].iov_base = pfns;
>> -    iov[1].iov_len = rec.length;
>> +    if ( written != stats.dirty_count )
>> +    {
>> +        ERROR("Mismatch between dirty bitmap bits (%u), and dirty_count (%u)",
>> +              written, stats.dirty_count);
>> +        goto err;
>> +    }
> ... you then also check that there are no further bit set in the
> bitmap. As said elsewhere, I'm not convinced using statistics as
> a basis for actual operation (rather than just reporting) is
> appropriate.

I'm not interested in inference based on the name of the structure.

>  I'm unaware of there being any spelled out guarantee
> that the numbers reported back from the hypercall are accurate.

The live loop uses this information already for this purpose.  If it is
wrong, we've got bigger problems that this.

~Andrew



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 13:22       ` Andrew Cooper
@ 2021-07-06 13:39         ` Olaf Hering
  2021-07-06 13:43           ` Andrew Cooper
  0 siblings, 1 reply; 20+ messages in thread
From: Olaf Hering @ 2021-07-06 13:39 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Ian Jackson, Wei Liu, Juergen Gross, Jan Beulich

[-- Attachment #1: Type: text/plain, Size: 271 bytes --]

Am Tue, 6 Jul 2021 14:22:58 +0100
schrieb Andrew Cooper <andrew.cooper3@citrix.com>:

> What hardware is this on?  i.e. is the Page Modification Logging feature
> in use?

At least it gets reported as VMX feature during boot, this is a CoyotePass system.

Olaf

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 13:39         ` Olaf Hering
@ 2021-07-06 13:43           ` Andrew Cooper
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Cooper @ 2021-07-06 13:43 UTC (permalink / raw)
  To: Olaf Hering; +Cc: Xen-devel, Ian Jackson, Wei Liu, Juergen Gross, Jan Beulich

On 06/07/2021 14:39, Olaf Hering wrote:
> Am Tue, 6 Jul 2021 14:22:58 +0100
> schrieb Andrew Cooper <andrew.cooper3@citrix.com>:
>
>> What hardware is this on?  i.e. is the Page Modification Logging feature
>> in use?
> At least it gets reported as VMX feature during boot, this is a CoyotePass system.

That logging is problematic (that means "this hardware has the
feature"), but yes - PML will be used by default when available.

~Andrew


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 13:19     ` Andrew Cooper
  2021-07-06 13:22       ` Andrew Cooper
  2021-07-06 13:28       ` Olaf Hering
@ 2021-07-06 13:56       ` Jan Beulich
  2 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2021-07-06 13:56 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Ian Jackson, Wei Liu, Juergen Gross, Olaf Hering

On 06.07.2021 15:19, Andrew Cooper wrote:
> On 06/07/2021 13:58, Olaf Hering wrote:
>> Am Tue, 6 Jul 2021 12:23:32 +0100
>> schrieb Andrew Cooper <andrew.cooper3@citrix.com>:
>>
>>> +    count = stats.dirty_count;
>> Is this accurate?
> 
> The live loop relies on it, and it worked correctly the last time I
> tested it.

When still merely investigating in preparation of my recent series,
i.e. without having made changes yet except to add some logging, I
did observe "Bitmap contained more entries than expected..." a
couple of times, with "written" and "entries" typically apart by
just 1 (as determined by extra logging; to be honest I don't recall
if they were farther apart at any point). So the number is _not_
accurate in any event, and cannot be used for other than reporting
purposes (as also expressed elsewhere on this thread).

This also underlines that, unlike you did say in a reply to one of
my patches, this is only a "detail" message, not an error, because
migration happily went on and succeeded.

Jan



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 13:34     ` Andrew Cooper
@ 2021-07-06 14:00       ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2021-07-06 14:00 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Wei Liu, Juergen Gross, Olaf Hering, Xen-devel

On 06.07.2021 15:34, Andrew Cooper wrote:
> On 06/07/2021 13:03, Jan Beulich wrote:
>> On 06.07.2021 13:23, Andrew Cooper wrote:
>>> 'count * sizeof(*pfns)' can in principle overflow, but is implausible in
>>> practice as the time between checkpoints is typically sub-second.
>>> Nevertheless, simplify the code and remove the risk.
>>>
>>> There is no need to loop over the bitmap to calculate count.  The number of
>>> set bits is returned in xc_shadow_op_stats_t which is already collected (and
>>> ignored).
>>>
>>> Bounds check the count against what will fit in REC_LENGTH_MAX.  At the time
>>> of writing, this allows up to 0xffffff pfns.
>> Well, okay, this then means that an overflow in the reporting of
>> dirty_count doesn't matter for now, because the limit is lower
>> anyway.
>>
>>>  Rearrange the pfns loop to check
>>> for errors both ways, not simply that there were more pfns than expected.
>> Hmm, "both ways" to me would mean ...
>>
>>> @@ -459,24 +462,20 @@ static int send_checkpoint_dirty_pfn_list(struct xc_sr_context *ctx)
>>>          goto err;
>>>      }
>>>  
>>> -    for ( i = 0, written = 0; i < ctx->restore.p2m_size; ++i )
>>> +    for ( i = 0, written = 0; count && i < ctx->restore.p2m_size; ++i, --count )
>>>      {
>>>          if ( !test_bit(i, dirty_bitmap) )
>>>              continue;
>>>  
>>> -        if ( written > count )
>>> -        {
>>> -            ERROR("Dirty pfn list exceed");
>>> -            goto err;
>>> -        }
>>> -
>>>          pfns[written++] = i;
>>>      }
>>>  
>>> -    rec.length = count * sizeof(*pfns);
>>> -
>>> -    iov[1].iov_base = pfns;
>>> -    iov[1].iov_len = rec.length;
>>> +    if ( written != stats.dirty_count )
>>> +    {
>>> +        ERROR("Mismatch between dirty bitmap bits (%u), and dirty_count (%u)",
>>> +              written, stats.dirty_count);
>>> +        goto err;
>>> +    }
>> ... you then also check that there are no further bit set in the
>> bitmap. As said elsewhere, I'm not convinced using statistics as
>> a basis for actual operation (rather than just reporting) is
>> appropriate.
> 
> I'm not interested in inference based on the name of the structure.

I'm afraid that's the problem: Because you started using it for
something it wasn't meant to be used for, you now think it's the
name that's misleading, and the use is okay. I remain of the
opinion that it's the other way around (but see below for there
not being an real dependency at least in this particular case).

>>  I'm unaware of there being any spelled out guarantee
>> that the numbers reported back from the hypercall are accurate.
> 
> The live loop uses this information already for this purpose.  If it is
> wrong, we've got bigger problems that this.

send_memory_live() passes the value to send_dirty_pages(), which
in turn passes it only to xc_report_progress_step() and uses it
in

    if ( written > entries )
        DPRINTF("Bitmap contained more entries than expected...");

There's no relying on this number at all afaics.

Jan



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 12:58   ` Olaf Hering
  2021-07-06 13:19     ` Andrew Cooper
@ 2021-07-06 14:11     ` Olaf Hering
  2021-07-06 15:13       ` Jan Beulich
  1 sibling, 1 reply; 20+ messages in thread
From: Olaf Hering @ 2021-07-06 14:11 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Ian Jackson, Wei Liu, Juergen Gross, Jan Beulich

[-- Attachment #1: Type: text/plain, Size: 307 bytes --]

Am Tue, 6 Jul 2021 14:58:04 +0200
schrieb Olaf Hering <olaf@aepfle.de>:

> the reporting is broken since a while

A quick check on a Dell T320 with E5-2430L, which does not have "Page Modification Logging", indicates that staging-4.5 appears to work, but reporting in staging-4.6 is broken.


Olaf

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 14:11     ` Olaf Hering
@ 2021-07-06 15:13       ` Jan Beulich
  2021-07-06 15:22         ` Jan Beulich
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2021-07-06 15:13 UTC (permalink / raw)
  To: Olaf Hering; +Cc: Xen-devel, Ian Jackson, Wei Liu, Juergen Gross, Andrew Cooper

On 06.07.2021 16:11, Olaf Hering wrote:
> Am Tue, 6 Jul 2021 14:58:04 +0200
> schrieb Olaf Hering <olaf@aepfle.de>:
> 
>> the reporting is broken since a while
> 
> A quick check on a Dell T320 with E5-2430L, which does not have "Page Modification Logging", indicates that staging-4.5 appears to work, but reporting in staging-4.6 is broken.

Not surprising at all, considering PML support was added in 4.6.

Jan



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 15:13       ` Jan Beulich
@ 2021-07-06 15:22         ` Jan Beulich
  2021-07-06 16:08           ` Andrew Cooper
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2021-07-06 15:22 UTC (permalink / raw)
  To: Olaf Hering; +Cc: Xen-devel, Ian Jackson, Wei Liu, Juergen Gross, Andrew Cooper

On 06.07.2021 17:13, Jan Beulich wrote:
> On 06.07.2021 16:11, Olaf Hering wrote:
>> Am Tue, 6 Jul 2021 14:58:04 +0200
>> schrieb Olaf Hering <olaf@aepfle.de>:
>>
>>> the reporting is broken since a while
>>
>> A quick check on a Dell T320 with E5-2430L, which does not have "Page Modification Logging", indicates that staging-4.5 appears to work, but reporting in staging-4.6 is broken.
> 
> Not surprising at all, considering PML support was added in 4.6.

Or perhaps still surprising, as the functions involved there don't
appear to be bypassing any core logic, so after some looking around
I can't say I see anything that's obviously wrong.

Oh, and I notice only now the "not" in your original reply, which
renders my earlier statement completely pointless anyway. I'm sorry.

Jan



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()
  2021-07-06 15:22         ` Jan Beulich
@ 2021-07-06 16:08           ` Andrew Cooper
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Cooper @ 2021-07-06 16:08 UTC (permalink / raw)
  To: Jan Beulich, Olaf Hering; +Cc: Xen-devel, Ian Jackson, Wei Liu, Juergen Gross

On 06/07/2021 16:22, Jan Beulich wrote:
> On 06.07.2021 17:13, Jan Beulich wrote:
>> On 06.07.2021 16:11, Olaf Hering wrote:
>>> Am Tue, 6 Jul 2021 14:58:04 +0200
>>> schrieb Olaf Hering <olaf@aepfle.de>:
>>>
>>>> the reporting is broken since a while
>>> A quick check on a Dell T320 with E5-2430L, which does not have "Page Modification Logging", indicates that staging-4.5 appears to work, but reporting in staging-4.6 is broken.
>> Not surprising at all, considering PML support was added in 4.6.
> Or perhaps still surprising, as the functions involved there don't
> appear to be bypassing any core logic, so after some looking around
> I can't say I see anything that's obviously wrong.
>
> Oh, and I notice only now the "not" in your original reply, which
> renders my earlier statement completely pointless anyway. I'm sorry.

Ok, so my observations of dirty_count being reliable during development
of migration v2 appears to be correct, and something regressed in the
dev window where it got committed upstream...

That is suspicious.

~Andrew


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-07-06 16:09 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-06 11:23 [PATCH 0/2] tools/migration: Fixes in send_checkpoint_dirty_pfn_list() Andrew Cooper
2021-07-06 11:23 ` [PATCH 1/2] tools/migration: Fix iovec handling " Andrew Cooper
2021-07-06 11:54   ` Jan Beulich
2021-07-06 12:20   ` Olaf Hering
2021-07-06 11:23 ` [PATCH 2/2] tools/migration: Fix potential overflow " Andrew Cooper
2021-07-06 12:03   ` Jan Beulich
2021-07-06 13:34     ` Andrew Cooper
2021-07-06 14:00       ` Jan Beulich
2021-07-06 12:58   ` Olaf Hering
2021-07-06 13:19     ` Andrew Cooper
2021-07-06 13:22       ` Andrew Cooper
2021-07-06 13:39         ` Olaf Hering
2021-07-06 13:43           ` Andrew Cooper
2021-07-06 13:28       ` Olaf Hering
2021-07-06 13:56       ` Jan Beulich
2021-07-06 14:11     ` Olaf Hering
2021-07-06 15:13       ` Jan Beulich
2021-07-06 15:22         ` Jan Beulich
2021-07-06 16:08           ` Andrew Cooper
2021-07-06 12:07 ` [PATCH 0/2] tools/migration: Fixes " Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.