xen-devel.lists.xenproject.org archive mirror
From: Jan Beulich <jbeulich@suse.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "Wei Liu" <wl@xen.org>, "Roger Pau Monné" <roger.pau@citrix.com>,
	"Juergen Gross" <jgross@suse.com>,
	"George Dunlap" <george.dunlap@citrix.com>,
	"Ian Jackson" <iwj@xenproject.org>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH 02/12] libxenguest: deal with log-dirty op stats overflow
Date: Mon, 28 Jun 2021 09:48:26 +0200
Message-ID: <4e3afc8e-1ed8-2e27-b583-476d35352efd@suse.com>
In-Reply-To: <5e725a42-953a-c96f-3e72-f0c741b0ce16@citrix.com>

On 25.06.2021 18:36, Andrew Cooper wrote:
> On 25/06/2021 14:18, Jan Beulich wrote:
>> In send_memory_live() the precise value the dirty_count struct field
>> gets initialized to doesn't matter much (apart from the triggering of
>> the log message in send_dirty_pages(), see below), but it is important
>> that it not be zero on the first iteration (or else send_dirty_pages()
>> won't get called at all). Saturate the initializer value at the maximum
>> value the field can hold.
> 
> I don't follow.  Migration would be extremely broken if the first
> iteration didn't work correctly, so something else is going on here.

As per the title, we're talking about an overflow situation here. In particular
the field did end up zero when ctx->save.p2m_size was 0x100000000. I'm not
claiming in any way that the first iteration would generally not work.
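
To make the overflow case concrete, here is a minimal sketch. It assumes
the stats field is 32 bits wide (which the 0x100000000 case implies); the
type alias and both helper names are hypothetical and exist only for
illustration, they are not part of libxenguest.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the field is 32 bits wide, as the 0x100000000 case implies. */
typedef uint32_t dirty_count_t;

/* Plain assignment: the value is reduced modulo 2^32, so 2^32 becomes 0. */
static dirty_count_t truncated_count(uint64_t p2m_size)
{
    return (dirty_count_t)p2m_size;
}

/* Saturating assignment: clip at the largest value the field can hold. */
static dirty_count_t saturated_count(uint64_t p2m_size)
{
    const dirty_count_t max = (dirty_count_t)~0;

    return p2m_size < max ? (dirty_count_t)p2m_size : max;
}
```

With p2m_size == 0x100000000 the plain assignment yields 0, whereas the
saturating variant yields 0xffffffff, which is non-zero and hence still
lets the first iteration proceed.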

>> While there also initialize struct precopy_stats' respective field to a
>> more sane value: We don't really know how many dirty pages there are at
>> that point.
>>
>> In suspend_and_send_dirty() and verify_frames() the local variables
>> don't need initializing at all, as they're only an output from the
>> hypercall which gets invoked first thing.
>>
>> In send_checkpoint_dirty_pfn_list() the local variable can be dropped
>> altogether: It's optional to xc_logdirty_control() and not used anywhere
>> else.
>>
>> Note that in case the clipping actually takes effect, the "Bitmap
>> contained more entries than expected..." log message will trigger. This
>> being just an informational message, I don't think this is overly
>> concerning.
> 
> That message is currently an error, confirming that the VM will crash on
> the resuming side.

An error? All I see is

    if ( written > entries )
        DPRINTF("Bitmap contained more entries than expected...");

> This is a consequence of it attempting to balloon during the live phase
> of migration, and discussed in docs/features/migration.pandoc (well - at
> least mentioned on the "noone has fixed this yet" list).
> 
>> --- a/tools/libs/guest/xg_sr_save.c
>> +++ b/tools/libs/guest/xg_sr_save.c
>> @@ -500,7 +500,9 @@ static int simple_precopy_policy(struct
>>  static int send_memory_live(struct xc_sr_context *ctx)
>>  {
>>      xc_interface *xch = ctx->xch;
>> -    xc_shadow_op_stats_t stats = { 0, ctx->save.p2m_size };
>> +    xc_shadow_op_stats_t stats = {
>> +        .dirty_count = MIN(ctx->save.p2m_size, (typeof(stats.dirty_count))~0)
>> +    };
>>      char *progress_str = NULL;
>>      unsigned int x = 0;
>>      int rc;
>> @@ -519,7 +521,7 @@ static int send_memory_live(struct xc_sr
>>          goto out;
>>  
>>      ctx->save.stats = (struct precopy_stats){
>> -        .dirty_count = ctx->save.p2m_size,
>> +        .dirty_count = -1,
> 
> This is an external interface, and I'm not sure it will tolerate finding
> more than p2m_size allegedly dirty.

But you do realize that a few lines down from here there already was

        policy_stats->dirty_count   = -1;

? Or are you trying to tell me that -1 (documented as indicating
"unknown") is okay on subsequent iterations, but not on the first one?
That's not being said anywhere ...
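
For illustration, a consumer of such stats could treat -1 the same way on
every iteration. The sketch below is not the actual libxenguest policy
code: the struct, the thresholds, and the function name are hypothetical,
and it assumes the count is held in a signed field so that -1 remains
distinguishable from any real count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the stats handed to a precopy policy. */
struct example_stats {
    unsigned int iteration;
    long dirty_count;               /* -1 means "unknown" */
};

/* Illustrative thresholds, not taken from this thread. */
#define EXAMPLE_TARGET_DIRTY   50
#define EXAMPLE_MAX_ITERATIONS 5

/* Returns true when the policy would stop iterating and do the final copy. */
static bool example_should_stop(struct example_stats s)
{
    /* An unknown (negative) count never qualifies as "few pages dirty". */
    bool few_dirty = s.dirty_count >= 0 &&
                     s.dirty_count < EXAMPLE_TARGET_DIRTY;

    return few_dirty || s.iteration >= EXAMPLE_MAX_ITERATIONS;
}
```

A policy written like this continues when handed -1 on the first
iteration, exactly as it would on any later one, until the iteration
limit is reached.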

Jan



