From: Paolo Bonzini <pbonzini@redhat.com>
To: Peter Xu <peterx@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>,
Andrew Jones <drjones@redhat.com>,
Sean Christopherson <sean.j.christopherson@intel.com>
Subject: Re: [PATCH v13 00/14] KVM: Dirty ring interface
Date: Fri, 6 Nov 2020 12:28:17 +0100 [thread overview]
Message-ID: <8b3f68dd-c61c-16a0-2077-0a5d3d57a357@redhat.com> (raw)
In-Reply-To: <20201001012044.5151-1-peterx@redhat.com>
On 01/10/20 03:20, Peter Xu wrote:
> KVM branch:
> https://github.com/xzpeter/linux/tree/kvm-dirty-ring
>
> QEMU branch for testing:
> https://github.com/xzpeter/qemu/tree/kvm-dirty-ring
>
> v13:
> - rebase to kvm/queue rather than 5.9-rc7. I think, kvm/queue is broken. I
> can only test the dirty ring after I revert 3eb900173c71 ("KVM: x86: VMX:
> Prevent MSR passthrough when MSR access is denied", 2020-09-28), otherwise
> the guest will hang on vcpu0 looping forever during boot Linux.
> - added another trivial patch "KVM: Documentation: Update entry for
> KVM_X86_SET_MSR_FILTER". It should be squashed into 1a155254ff93 ("KVM: x86:
> Introduce MSR filtering", 2020-09-28) directly.
>
> v12:
> - add r-bs for Sean
> - rebase
>
> v11:
> - rebased to kvm/queue (seems the newest)
> - removed kvm_dirty_ring_waitqueue() tracepoint since not used
> - set memslot->as_id in kvm_delete_memslot() [Sean]
> - let __copy_to_user() always return -EFAULT [Sean]
> - rename 'r' in alloc_apic_access_page into 'hva' [Sean]
>
> v10:
> - remove unused identity_map_pfn in init_rmode_identity_map [syzbot]
> - add "static" to kvm_dirty_ring_full [syzbot]
> - kvm_page_in_dirty_ring() use "#if" macros for KVM_DIRTY_LOG_PAGE_OFFSET to
> quiesce syzbot [syzbot]
> - s/false/null/ in gfn_to_memslot_dirty_bitmap() [syzbot]
>
> v9:
> - patch 3: __x86_set_memory_region: squash another trivial change to return
> (0xdeadull << 48) always for slot removal [Sean]
> - pick r-bs for Drew
>
> For previous versions, please refer to:
>
> V1: https://lore.kernel.org/kvm/20191129213505.18472-1-peterx@redhat.com
> V2: https://lore.kernel.org/kvm/20191221014938.58831-1-peterx@redhat.com
> V3: https://lore.kernel.org/kvm/20200109145729.32898-1-peterx@redhat.com
> V4: https://lore.kernel.org/kvm/20200205025105.367213-1-peterx@redhat.com
> V5: https://lore.kernel.org/kvm/20200304174947.69595-1-peterx@redhat.com
> V6: https://lore.kernel.org/kvm/20200309214424.330363-1-peterx@redhat.com
> V7: https://lore.kernel.org/kvm/20200318163720.93929-1-peterx@redhat.com
> V8: https://lore.kernel.org/kvm/20200331190000.659614-1-peterx@redhat.com
> V9: https://lore.kernel.org/kvm/20200523225659.1027044-1-peterx@redhat.com
> V10: https://lore.kernel.org/kvm/20200601115957.1581250-1-peterx@redhat.com/
>
> Overview
> ============
>
> This is a continued work from Lei Cao <lei.cao@stratus.com> and Paolo
> Bonzini on the KVM dirty ring interface.
>
> The new dirty ring interface is another way to collect dirty pages for
> the virtual machines. It is different from the existing dirty logging
> interface in a few ways, majorly:
>
> - Data format: The dirty data was in a ring format rather than a
> bitmap format, so dirty bits to sync for dirty logging does not
> depend on the size of guest memory any more, but speed of
> dirtying. Also, the dirty ring is per-vcpu, while the dirty
> bitmap is per-vm.
>
> - Data copy: The sync of dirty pages does not need data copy any more,
> but instead the ring is shared between the userspace and kernel by
> page sharings (mmap() on vcpu fd)
>
> - Interface: Instead of using the old KVM_GET_DIRTY_LOG,
> KVM_CLEAR_DIRTY_LOG interfaces, the new ring uses the new
> KVM_RESET_DIRTY_RINGS ioctl when we want to reset the collected
> dirty pages to protected mode again (works like
> KVM_CLEAR_DIRTY_LOG, but ring based). To collecting dirty bits,
> we only need to read the ring data, no ioctl is needed.
>
> Ring Layout
> ===========
>
> KVM dirty ring is per-vcpu. Each ring is an array of kvm_dirty_gfn
> defined as:
>
> struct kvm_dirty_gfn {
> __u32 flags;
> __u32 slot; /* as_id | slot_id */
> __u64 offset;
> };
>
> Each GFN is a state machine itself. The state is embeded in the flags
> field, as defined in the uapi header:
>
> /*
> * KVM dirty GFN flags, defined as:
> *
> * |---------------+---------------+--------------|
> * | bit 1 (reset) | bit 0 (dirty) | Status |
> * |---------------+---------------+--------------|
> * | 0 | 0 | Invalid GFN |
> * | 0 | 1 | Dirty GFN |
> * | 1 | X | GFN to reset |
> * |---------------+---------------+--------------|
> *
> * Lifecycle of a dirty GFN goes like:
> *
> * dirtied collected reset
> * 00 -----------> 01 -------------> 1X -------+
> * ^ |
> * | |
> * +------------------------------------------+
> *
> * The userspace program is only responsible for the 01->1X state
> * conversion (to collect dirty bits). Also, it must not skip any
> * dirty bits so that dirty bits are always collected in sequence.
> */
>
> Testing
> =======
>
> This series provided both the implementation of the KVM dirty ring and
> the test case. Also I've implemented the QEMU counterpart that can
> run with the new KVM, link can be found at the top of the cover
> letter. However that's still a very initial version which is prone to
> change and future optimizations.
>
> I did some measurement with the new method with 24G guest running some
> dirty workload, I don't see any speedup so far, even in some heavy
> dirty load it'll be slower (e.g., when 800MB/s random dirty rate, kvm
> dirty ring takes average of ~73s to complete migration while dirty
> logging only needs average of ~55s). However that's understandable
> because 24G guest means only 1M dirty bitmap, that's still a suitable
> case for dirty logging. Meanwhile heavier workload means worst case
> for dirty ring.
>
> More tests are welcomed if there's bigger host/guest, especially on
> COLO-like workload.
>
> Please review, thanks.
>
> Peter Xu (14):
> KVM: Documentation: Update entry for KVM_X86_SET_MSR_FILTER
> KVM: Cache as_id in kvm_memory_slot
> KVM: X86: Don't track dirty for KVM_SET_[TSS_ADDR|IDENTITY_MAP_ADDR]
> KVM: Pass in kvm pointer into mark_page_dirty_in_slot()
> KVM: X86: Implement ring-based dirty memory tracking
> KVM: Make dirty ring exclusive to dirty bitmap log
> KVM: Don't allocate dirty bitmap if dirty ring is enabled
> KVM: selftests: Always clear dirty bitmap after iteration
> KVM: selftests: Sync uapi/linux/kvm.h to tools/
> KVM: selftests: Use a single binary for dirty/clear log test
> KVM: selftests: Introduce after_vcpu_run hook for dirty log test
> KVM: selftests: Add dirty ring buffer test
> KVM: selftests: Let dirty_log_test async for dirty ring test
> KVM: selftests: Add "-c" parameter to dirty log test
>
> Documentation/virt/kvm/api.rst | 126 ++++-
> arch/x86/include/asm/kvm_host.h | 6 +-
> arch/x86/include/uapi/asm/kvm.h | 1 +
> arch/x86/kvm/Makefile | 3 +-
> arch/x86/kvm/mmu/mmu.c | 10 +-
> arch/x86/kvm/svm/avic.c | 9 +-
> arch/x86/kvm/vmx/vmx.c | 96 ++--
> arch/x86/kvm/x86.c | 46 +-
> include/linux/kvm_dirty_ring.h | 103 ++++
> include/linux/kvm_host.h | 19 +
> include/trace/events/kvm.h | 63 +++
> include/uapi/linux/kvm.h | 53 ++
> tools/include/uapi/linux/kvm.h | 77 ++-
> tools/testing/selftests/kvm/Makefile | 2 -
> .../selftests/kvm/clear_dirty_log_test.c | 6 -
> tools/testing/selftests/kvm/dirty_log_test.c | 505 ++++++++++++++++--
> .../testing/selftests/kvm/include/kvm_util.h | 4 +
> tools/testing/selftests/kvm/lib/kvm_util.c | 72 ++-
> .../selftests/kvm/lib/kvm_util_internal.h | 4 +
> virt/kvm/dirty_ring.c | 197 +++++++
> virt/kvm/kvm_main.c | 168 +++++-
> 21 files changed, 1432 insertions(+), 138 deletions(-)
> create mode 100644 include/linux/kvm_dirty_ring.h
> delete mode 100644 tools/testing/selftests/kvm/clear_dirty_log_test.c
> create mode 100644 virt/kvm/dirty_ring.c
>
Queued, thanks! Now on to QEMU...
Paolo
prev parent reply other threads:[~2020-11-06 11:28 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-01 1:20 [PATCH v13 00/14] KVM: Dirty ring interface Peter Xu
2020-10-01 1:20 ` [PATCH v13 01/14] KVM: Documentation: Update entry for KVM_X86_SET_MSR_FILTER Peter Xu
2020-10-01 1:20 ` [PATCH v13 02/14] KVM: Cache as_id in kvm_memory_slot Peter Xu
2020-10-01 1:20 ` [PATCH v13 03/14] KVM: X86: Don't track dirty for KVM_SET_[TSS_ADDR|IDENTITY_MAP_ADDR] Peter Xu
2020-10-01 1:20 ` [PATCH v13 04/14] KVM: Pass in kvm pointer into mark_page_dirty_in_slot() Peter Xu
2020-10-01 1:22 ` [PATCH v13 05/14] KVM: X86: Implement ring-based dirty memory tracking Peter Xu
2020-11-06 10:56 ` Paolo Bonzini
2020-10-01 1:22 ` [PATCH v13 06/14] KVM: Make dirty ring exclusive to dirty bitmap log Peter Xu
2020-11-06 11:00 ` Paolo Bonzini
2020-10-01 1:22 ` [PATCH v13 07/14] KVM: Don't allocate dirty bitmap if dirty ring is enabled Peter Xu
2020-10-01 1:22 ` [PATCH v13 08/14] KVM: selftests: Always clear dirty bitmap after iteration Peter Xu
2020-10-01 1:22 ` [PATCH v13 09/14] KVM: selftests: Sync uapi/linux/kvm.h to tools/ Peter Xu
2020-10-01 1:22 ` [PATCH v13 10/14] KVM: selftests: Use a single binary for dirty/clear log test Peter Xu
2020-10-01 1:22 ` [PATCH v13 11/14] KVM: selftests: Introduce after_vcpu_run hook for dirty " Peter Xu
2020-10-01 1:22 ` [PATCH v13 12/14] KVM: selftests: Add dirty ring buffer test Peter Xu
2020-10-01 1:22 ` [PATCH v13 13/14] KVM: selftests: Let dirty_log_test async for dirty ring test Peter Xu
2020-11-06 11:27 ` Paolo Bonzini
2020-11-06 18:06 ` Peter Xu
2020-11-06 18:15 ` Paolo Bonzini
2020-11-06 18:27 ` Peter Xu
2020-10-01 1:22 ` [PATCH v13 14/14] KVM: selftests: Add "-c" parameter to dirty log test Peter Xu
2020-11-06 11:28 ` Paolo Bonzini [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8b3f68dd-c61c-16a0-2077-0a5d3d57a357@redhat.com \
--to=pbonzini@redhat.com \
--cc=dgilbert@redhat.com \
--cc=drjones@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=peterx@redhat.com \
--cc=sean.j.christopherson@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).