All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Tian Kevin <kevin.tian@intel.com>,
	qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>
Subject: Re: [PATCH RFC 3/4] vl: Sync dirty bits for system resets during precopy
Date: Wed, 6 May 2020 11:38:45 -0400	[thread overview]
Message-ID: <20200506153845.GL6299@xz-x1> (raw)
In-Reply-To: <20200506105340.GH2743@work-vm>

On Wed, May 06, 2020 at 11:53:40AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > System resets will also reset system memory layout.  Although the memory layout
> > after the reset should probably the same as before the reset, we still need to
> > do frequent memory section removals and additions during the reset process.
> > Those operations could accidentally lose per-mem-section information like KVM
> > memslot dirty bitmaps.
> > 
> > Previously we keep those dirty bitmaps by sync it during memory removal.
> > However that's hard to make it right after all [1].  Instead, we sync dirty
> > pages before system reset if we know we're during a precopy migration.  This
> > should solve the same problem explicitly.
> > 
> > [1] https://lore.kernel.org/qemu-devel/20200327150425.GJ422390@xz-x1/
> 
> I think the problem is knowing whether this is sufficient or whether you
> definitely need to do it for other cases of removal/readd.

Right that's the tricky part.  I planned to look into the other potential cases
after I get a feedback from Paolo or anyone that think this approach is
acceptable, but of course I can even start to look into them now.

When I said this should mostly be for reset, it comes from this commit message:

    commit 4cc856fabae1447d53890e707c70f257a7691174
    Author: zhanghailiang <zhang.zhanghailiang@huawei.com>
    Date:   Thu Apr 2 19:26:31 2015 +0000

    kvm-all: Sync dirty-bitmap from kvm before kvm destroy the corresponding dirty_bitmap
    
    Sometimes, we destroy the dirty_bitmap in kvm_memory_slot before any sync action
    occur, this bit in dirty_bitmap will be missed, and which will lead the corresponding
    dirty pages to be missed in migration.
    
    This usually happens when do migration during VM's Start-up or Reboot.

That's where the get-dirty-log was missing and Hailiang reported an issue with
that probably on system reboot (but, again, I know nothing about the use case
behind this...).

Today I even tried to dig deeper into the initial place that this is
introduced, and it's:

    commit 3fbffb628c001bd540dc9c1805bdf7aa8591da4d
    Author: Avi Kivity <avi@redhat.com>
    Date:   Sun Jan 15 16:13:59 2012 +0200

    kvm: flush the dirty log when unregistering a slot
    
    Otherwise, the dirty log information is lost in the kernel forever.
    
    Fixes opensuse-12.1 boot screen, which changes the vga windows rapidly.
    
    Signed-off-by: Avi Kivity <avi@redhat.com>

So it's Avi's 8 years ago patch probably trying to fix a display issue since
VGA tracks dirty bit without migration.  IIUC this should belong to the case
you raised the concern about "removing a device MMIO region with RAM backed".
So I'll definitely look into that too, which I planned to, after all.

> 
> However, assuming we need to do it during reset, how do we know this is
> the right point to do it, and not something further inside the reset
> process?  Is this just on the basis that vcpus are stopped?

"Stopped vcpus" is not the hint for "this is the right place", but definitely a
good thing because we don't even need to consider multiple layers of dirty bit
caches when they're stopped (for which the last layer is the PML, which is
unluckily per-vcpu rather than per-memslot).  It's just something nice and
something extra we got.  Even if vcpus are not stopped at this point (I believe
the above VGA MMIO region removal could be the case where I need to take care
of later, that we will have to collect dirty bits during the vcpus running),
however as I mentioned in the other thread it should be a stack high enough to
do some global dirty sync (because both dirty log and dirty ring need some
global sync, and I believe that's one of the reasons why current code is still
racy - we always need to flush the PML on every vcpu even for dirty log!).

Sorry I went too far... should probably go back to the topic on "the right
place" - I think this should be the right place for reset because AFAIK all the
memslot add/remove is done inside device reset hooks, qemu_devices_reset() (or
MachineClass.reset(), which could also change stuff, but finally calls
qemu_devices_reset() too).  I tried to verify this by enable the tracepoint at
trace_kvm_set_user_memory() and also dump something at qemu_system_reset()
before qemu_devices_reset(), I think it proves my understanding that all
trace_kvm_set_user_memory() triggers after that point.

Thanks,

-- 
Peter Xu



  reply	other threads:[~2020-05-06 15:39 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-28 19:42 [PATCH RFC 0/4] vl: Sync dirty bitmap when system resets Peter Xu
2020-04-28 19:42 ` [PATCH RFC 1/4] migration: Export migration_bitmap_sync_precopy() Peter Xu
2020-05-06  9:58   ` Dr. David Alan Gilbert
2020-04-28 19:42 ` [PATCH RFC 2/4] migration: Introduce migrate_is_precopy() Peter Xu
2020-05-06 10:05   ` Dr. David Alan Gilbert
2020-05-06 14:52     ` Peter Xu
2020-04-28 19:42 ` [PATCH RFC 3/4] vl: Sync dirty bits for system resets during precopy Peter Xu
2020-05-06 10:53   ` Dr. David Alan Gilbert
2020-05-06 15:38     ` Peter Xu [this message]
2020-04-28 19:42 ` [PATCH RFC 4/4] kvm: No need to sync dirty bitmap before memslot removal any more Peter Xu
2020-04-29 13:26 ` [PATCH RFC 0/4] vl: Sync dirty bitmap when system resets Dr. David Alan Gilbert
2020-04-29 14:32   ` Peter Xu
2020-05-02 21:04     ` Peter Xu
2020-05-23 22:49       ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200506153845.GL6299@xz-x1 \
    --to=peterx@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=kevin.tian@intel.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.