linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Tian, Kevin" <kevin.tian@intel.com>
To: Peter Xu <peterx@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Cc: "Michael S . Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	"Christopherson, Sean J" <sean.j.christopherson@intel.com>,
	"Christophe de Dinechin" <dinechin@redhat.com>,
	"Zhao, Yan Y" <yan.y.zhao@intel.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: RE: [PATCH v8 00/14] KVM: Dirty ring interface
Date: Thu, 23 Apr 2020 06:28:43 +0000	[thread overview]
Message-ID: <AADFC41AFE54684AB9EE6CBC0274A5D19D877A3B@SHSMSX104.ccr.corp.intel.com> (raw)
In-Reply-To: <20200422185155.GA3596@xz-x1>

> From: Peter Xu <peterx@redhat.com>
> Sent: Thursday, April 23, 2020 2:52 AM
> 
> Hi,
> 
> TL;DR: I'm thinking whether we should record pure GPA/GFN instead of
> (slot_id,
> slot_offset) tuple for dirty pages in kvm dirty ring to unbind kvm_dirty_gfn
> with memslots.
> 
> (A slightly longer version starts...)
> 
> The problem is that binding dirty tracking operations to KVM memslots is a
> restriction that needs synchronization to memslot changes, which further
> needs
> synchronization across all the vcpus because they're the consumers of
> memslots.
> E.g., when we remove a memory slot, we need to flush all the dirty bits
> correctly before we do the removal of the memslot.  That's actually an
> known
> defect for QEMU/KVM [1] (I bet it could be a defect for many other
> hypervisors...) right now with current dirty logging.  Meanwhile, even if we
> fix it, that procedure is not scale at all, and error prone to dead locks.
> 
> Here memory removal is really an (still corner-cased but relatively) important
> scenario to think about for dirty logging comparing to memory additions &
> movings.  Because memory addition will always have no initial dirty page,
> and
> we don't really move RAM a lot (or do we ever?!) for a general VM use case.
> 
> Then I went a step back to think about why we need these dirty bit
> information
> after all if the memslot is going to be removed?
> 
> There're two cases:
> 
>   - When the memslot is going to be removed forever, then the dirty
> information
>     is indeed meaningless and can be dropped, and,
> 
>   - When the memslot is going to be removed but quickly added back with
> changed
>     size, then we need to keep those dirty bits because it's just a commmon
> way
>     to e.g. punch an MMIO hole in an existing RAM region (here I'd confess I
>     feel like using "slot_id" to identify memslot is really unfriendly syscall
>     design for things like "hole punchings" in the RAM address space...
>     However such "punch hold" operation is really needed even for a common
>     guest for either system reboots or device hotplugs, etc.).

why would device hotplug punch a hole in an existing RAM region? 

> 
> The real scenario we want to cover for dirty tracking is the 2nd one.
> 
> If we can track dirty using raw GPA, the 2nd scenario is solved itself.
> Because we know we'll add those memslots back (though it might be with a
> different slot ID), then the GPA value will still make sense, which means we
> should be able to avoid any kind of synchronization for things like memory
> removals, as long as the userspace is aware of that.

A curious question. What about the backing storage of the affected GPA 
is changed after adding back? Is recorded dirty info for previous backing 
storage still making sense for the newer one?

Thanks
Kevin

> 
> With that, when we fetch the dirty bits, we lookup the memslot dynamically,
> drop bits if the memslot does not exist on that address (e.g., permanent
> removals), and use whatever memslot is there for that guest physical
> address.
> Though we for sure still need to handle memory move, that the userspace
> needs
> to still take care of dirty bit flushing and sync for a memory move, however
> that's merely not happening so nothing to take care about either.
> 
> Does this makes sense?  Comments greatly welcomed..
> 
> Thanks,
> 
> [1] https://lists.gnu.org/archive/html/qemu-devel/2020-03/msg08361.html
> 
> --
> Peter Xu


  reply	other threads:[~2020-04-23  6:28 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-31 18:59 [PATCH v8 00/14] KVM: Dirty ring interface Peter Xu
2020-03-31 18:59 ` [PATCH v8 01/14] KVM: X86: Change parameter for fast_page_fault tracepoint Peter Xu
2020-03-31 18:59 ` [PATCH v8 02/14] KVM: Cache as_id in kvm_memory_slot Peter Xu
2020-03-31 18:59 ` [PATCH v8 03/14] KVM: X86: Don't track dirty for KVM_SET_[TSS_ADDR|IDENTITY_MAP_ADDR] Peter Xu
2020-04-23 20:39   ` Sean Christopherson
2020-04-24 15:21     ` Peter Xu
2020-04-27 18:10       ` Sean Christopherson
2020-04-28 20:22         ` Peter Xu
2020-03-31 18:59 ` [PATCH v8 04/14] KVM: Pass in kvm pointer into mark_page_dirty_in_slot() Peter Xu
2020-03-31 18:59 ` [PATCH v8 05/14] KVM: X86: Implement ring-based dirty memory tracking Peter Xu
2020-03-31 18:59 ` [PATCH v8 06/14] KVM: Make dirty ring exclusive to dirty bitmap log Peter Xu
2020-03-31 18:59 ` [PATCH v8 07/14] KVM: Don't allocate dirty bitmap if dirty ring is enabled Peter Xu
2020-03-31 18:59 ` [PATCH v8 08/14] KVM: selftests: Always clear dirty bitmap after iteration Peter Xu
2020-04-01  7:04   ` Andrew Jones
2020-03-31 18:59 ` [PATCH v8 09/14] KVM: selftests: Sync uapi/linux/kvm.h to tools/ Peter Xu
2020-03-31 18:59 ` [PATCH v8 10/14] KVM: selftests: Use a single binary for dirty/clear log test Peter Xu
2020-03-31 18:59 ` [PATCH v8 11/14] KVM: selftests: Introduce after_vcpu_run hook for dirty " Peter Xu
2020-04-01  7:03   ` Andrew Jones
2020-04-01 23:24     ` Peter Xu
2020-03-31 18:59 ` [PATCH v8 12/14] KVM: selftests: Add dirty ring buffer test Peter Xu
2020-03-31 18:59 ` [PATCH v8 13/14] KVM: selftests: Let dirty_log_test async for dirty ring test Peter Xu
2020-04-01  7:48   ` Andrew Jones
2020-03-31 19:00 ` [PATCH v8 14/14] KVM: selftests: Add "-c" parameter to dirty log test Peter Xu
2020-04-22 18:51 ` [PATCH v8 00/14] KVM: Dirty ring interface Peter Xu
2020-04-23  6:28   ` Tian, Kevin [this message]
2020-04-23 15:22     ` Peter Xu
2020-04-24  6:01       ` Tian, Kevin
2020-04-24 14:19         ` Peter Xu
2020-04-26 10:29           ` Tian, Kevin
2020-04-27 14:27             ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AADFC41AFE54684AB9EE6CBC0274A5D19D877A3B@SHSMSX104.ccr.corp.intel.com \
    --to=kevin.tian@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=dinechin@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=sean.j.christopherson@intel.com \
    --cc=vkuznets@redhat.com \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).