* [PATCH 0/4] support dirtyrate measurement with dirty bitmap
From: huangy81 @ 2021-06-27  5:38 UTC
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
	Peter Xu, Chuan Zheng, Paolo Bonzini

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

The dirty rate measurement was originally implemented by page sampling,
which is not accurate in some scenarios, so we have introduced dirty-ring
based dirty rate measurement (it may be merged soon); it fixes the
accuracy problems of page sampling and, more importantly, it works at the
granularity of a vCPU.

The dirty-ring method can be used when the dirty ring is enabled. As a
supplement, we introduce a dirty-bitmap method that calculates the dirty
rate when dirty logging is enabled, so that we can also get an accurate
dirty rate if needed in the absence of the dirty ring.

Three things were done to implement the measurement:
- introduce fresh new dirty bits named DIRTY_MEMORY_DIRTY_RATE, which
  are used to store the dirty bitmap after fetching it from KVM. We do
  not reuse the existing DIRTY_MEMORY_MIGRATION dirty bits because we do
  not want to interfere with migration and want to keep the
  implementation clear; this is also why dirty_memory is split.

  The DIRTY_MEMORY_DIRTY_RATE dirty bits are filled when
  memory_global_dirty_log_sync executes, if the GLOBAL_DIRTY_DIRTY_RATE
  bit is set in the global_dirty_tracking flags.

- introduce the kvm_get_manual_dirty_log_protect function so that we can
  probe KVM's manual dirty log protect capabilities when calculating.

- implement the dirty-bitmap measurement with the following steps
  (sketched below):
  1. start dirty logging.

  2. probe the protect capabilities; if KVM_DIRTY_LOG_INITIALLY_SET is
     enabled, skip the first round (which returns all 1s) and reset the
     page protection manually, since KVM fills the bitmap with 1 bits
     when this capability is enabled.

  3. clear the DIRTY_MEMORY_DIRTY_RATE dirty bits, preparing to store
     the dirty bitmap.

  4. run memory_global_dirty_log_sync and fetch the dirty bitmap from
     KVM.

  5. reap the DIRTY_MEMORY_DIRTY_RATE dirty bits and do the calculation.
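
To illustrate, here is a condensed sketch of how the steps map onto the
helpers used in patch 4 (simplified; error handling and the
concurrent-migration check are omitted):

    static uint64_t calc_dirtyrate_bitmap_sketch(int64_t seconds)
    {
        uint64_t flags = kvm_get_manual_dirty_log_protect();
        uint64_t dirty_pages;

        dirtyrate_global_dirty_log_start();              /* step 1 */

        if (flags & KVM_DIRTY_LOG_INITIALLY_SET) {       /* step 2 */
            memory_global_dirty_log_sync();    /* 1st round: all 1s */
            dirtyrate_manual_reset_protect();  /* re-protect manually */
        }

        dirtyrate_dirty_bits_clear();                    /* step 3 */

        sleep(seconds);                       /* measurement window */

        memory_global_dirty_log_sync();                  /* step 4 */
        dirty_pages = dirtyrate_dirty_bits_reap();       /* step 5 */

        dirtyrate_global_dirty_log_stop();

        /* dirty rate in MB/s over the window */
        return ((dirty_pages * TARGET_PAGE_SIZE) >> 20) / seconds;
    }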

This patchset is based on the commit
"migration/dirtyrate: implement dirty-ring dirtyrate calculation".
Since that feature has not been merged yet, we post this patchset as an
RFC; ideally, it can be merged after that series.

Please review, thanks!

Best regards!

Hyman Huang(黄勇) (4):
  memory: introduce DIRTY_MEMORY_DIRTY_RATE dirty bits
  KVM: introduce kvm_get_manual_dirty_log_protect
  memory: introduce DIRTY_MEMORY_DIRTY_RATE dirty bits functions
  migration/dirtyrate: implement dirty-bitmap dirtyrate calculation

 accel/kvm/kvm-all.c     |   6 ++
 hmp-commands.hx         |   9 +--
 include/exec/ram_addr.h | 140 +++++++++++++++++++++++++++++++++++++++++++++-
 include/exec/ramlist.h  |   9 +--
 include/sysemu/kvm.h    |   1 +
 migration/dirtyrate.c   | 146 +++++++++++++++++++++++++++++++++++++++++++++---
 migration/trace-events  |   2 +
 qapi/migration.json     |   6 +-
 softmmu/physmem.c       |  60 ++++++++++++++++++++
 9 files changed, 358 insertions(+), 21 deletions(-)

-- 
1.8.3.1




* [PATCH 1/4] memory: introduce DIRTY_MEMORY_DIRTY_RATE dirty bits
From: huangy81 @ 2021-06-27  5:38 UTC
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
	Peter Xu, Chuan Zheng, Paolo Bonzini

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Introduce DIRTY_MEMORY_DIRTY_RATE dirty bits to track VM
dirty pages for calculating the dirty rate.

Since dirty rate measurement and migration may be triggered
concurrently, reusing the existing DIRTY_MEMORY_MIGRATION dirty
bits does not seem like a good choice. Introduce fresh new dirty
bits for the dirty rate to ensure both dirty rate measurement and
migration work fine.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 include/exec/ram_addr.h | 15 ++++++++++++++-
 include/exec/ramlist.h  |  9 +++++----
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 45c9132..6070a52 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -308,6 +308,10 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
         while (page < end) {
             unsigned long next = MIN(end, base + DIRTY_MEMORY_BLOCK_SIZE);
 
+            if (unlikely(mask & (1 << DIRTY_MEMORY_DIRTY_RATE))) {
+                bitmap_set_atomic(blocks[DIRTY_MEMORY_DIRTY_RATE]->blocks[idx],
+                                  offset, next - page);
+            }
             if (likely(mask & (1 << DIRTY_MEMORY_MIGRATION))) {
                 bitmap_set_atomic(blocks[DIRTY_MEMORY_MIGRATION]->blocks[idx],
                                   offset, next - page);
@@ -370,9 +374,17 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
                     qatomic_or(&blocks[DIRTY_MEMORY_VGA][idx][offset], temp);
 
                     if (global_dirty_tracking) {
-                        qatomic_or(
+                        if (global_dirty_tracking & GLOBAL_DIRTY_MIGRATION) {
+                            qatomic_or(
                                 &blocks[DIRTY_MEMORY_MIGRATION][idx][offset],
                                 temp);
+                        }
+
+                        if (global_dirty_tracking & GLOBAL_DIRTY_DIRTY_RATE) {
+                            qatomic_or(
+                                &blocks[DIRTY_MEMORY_DIRTY_RATE][idx][offset],
+                                temp);
+                        }
                     }
 
                     if (tcg_enabled()) {
@@ -394,6 +406,7 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
 
         if (!global_dirty_tracking) {
             clients &= ~(1 << DIRTY_MEMORY_MIGRATION);
+            clients &= ~(1 << DIRTY_MEMORY_DIRTY_RATE);
         }
 
         /*
diff --git a/include/exec/ramlist.h b/include/exec/ramlist.h
index ece6497..8585f00 100644
--- a/include/exec/ramlist.h
+++ b/include/exec/ramlist.h
@@ -8,10 +8,11 @@
 
 typedef struct RAMBlockNotifier RAMBlockNotifier;
 
-#define DIRTY_MEMORY_VGA       0
-#define DIRTY_MEMORY_CODE      1
-#define DIRTY_MEMORY_MIGRATION 2
-#define DIRTY_MEMORY_NUM       3        /* num of dirty bits */
+#define DIRTY_MEMORY_VGA        0
+#define DIRTY_MEMORY_CODE       1
+#define DIRTY_MEMORY_MIGRATION  2
+#define DIRTY_MEMORY_DIRTY_RATE 3
+#define DIRTY_MEMORY_NUM        4        /* num of dirty bits */
 
 /* The dirty memory bitmap is split into fixed-size blocks to allow growth
  * under RCU.  The bitmap for a block can be accessed as follows:
-- 
1.8.3.1
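
For illustration, the new client is consumed the same way as the existing
ones; e.g. patch 3 tests a page's dirty-rate bit like this (a sketch of
the slow path there):

    if (cpu_physical_memory_get_dirty(offset + addr, TARGET_PAGE_SIZE,
                                      DIRTY_MEMORY_DIRTY_RATE)) {
        dirty_pages++;
    }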




* [PATCH 2/4] KVM: introduce kvm_get_manual_dirty_log_protect
From: huangy81 @ 2021-06-27  5:38 UTC
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
	Peter Xu, Chuan Zheng, Paolo Bonzini

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Introduce kvm_get_manual_dirty_log_protect for measuring the
dirty rate via the dirty bitmap. The dirty rate calculation needs
to sync the dirty log and depends on the dirty log features.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 accel/kvm/kvm-all.c  | 6 ++++++
 include/sysemu/kvm.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index e0e88a2..f7d9ae0 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -245,6 +245,12 @@ int kvm_get_max_memslots(void)
     return s->nr_slots;
 }
 
+uint64_t kvm_get_manual_dirty_log_protect(void)
+{
+    KVMState *s = KVM_STATE(current_accel());
+    return s->manual_dirty_log_protect;
+}
+
 /* Called with KVMMemoryListener.slots_lock held */
 static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
 {
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 7b22aeb..b668d49 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -533,6 +533,7 @@ int kvm_set_one_reg(CPUState *cs, uint64_t id, void *source);
 int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target);
 struct ppc_radix_page_info *kvm_get_radix_page_info(void);
 int kvm_get_max_memslots(void);
+uint64_t kvm_get_manual_dirty_log_protect(void);
 
 /* Notify resamplefd for EOI of specific interrupts. */
 void kvm_resample_fd_notify(int gsi);
-- 
1.8.3.1
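
For illustration, a caller can probe the capabilities like this (a sketch
of how patch 4 uses the helper):

    uint64_t flags = kvm_get_manual_dirty_log_protect();

    if ((flags & (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
                  KVM_DIRTY_LOG_INITIALLY_SET)) ==
        (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
         KVM_DIRTY_LOG_INITIALLY_SET)) {
        /*
         * The first dirty log sync returns all-1 bits; page
         * protection must be reset manually before dirty
         * tracking really starts.
         */
    }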




* [PATCH 3/4] memory: introduce DIRTY_MEMORY_DIRTY_RATE dirty bits functions
From: huangy81 @ 2021-06-27  5:38 UTC
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
	Peter Xu, Chuan Zheng, Paolo Bonzini

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Introduce utility functions to set up the DIRTY_MEMORY_DIRTY_RATE
dirty bits, for the convenience of tracking the dirty bitmap when
calculating the dirty rate.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 include/exec/ram_addr.h | 121 ++++++++++++++++++++++++++++++++++++++++++++++++
 softmmu/physmem.c       |  61 ++++++++++++++++++++++++
 2 files changed, 182 insertions(+)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 6070a52..57dc96b 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -435,6 +435,12 @@ bool cpu_physical_memory_test_and_clear_dirty(ram_addr_t start,
                                               ram_addr_t length,
                                               unsigned client);
 
+void cpu_physical_memory_dirtyrate_clear_bit(ram_addr_t start,
+                                             ram_addr_t length);
+
+void cpu_physical_memory_dirtyrate_reprotect_bit(ram_addr_t start,
+                                                 ram_addr_t length);
+
 DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty
     (MemoryRegion *mr, hwaddr offset, hwaddr length, unsigned client);
 
@@ -523,5 +529,120 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
 
     return num_dirty;
 }
+
+/* Called with RCU critical section */
+static inline
+void cpu_physical_memory_dirtyrate_clear_dirty_bits(RAMBlock *rb)
+{
+    ram_addr_t addr;
+    ram_addr_t length = rb->used_length;
+    unsigned long word = BIT_WORD(rb->offset >> TARGET_PAGE_BITS);
+
+    /* start address and length are aligned at the start of a word? */
+    if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) == rb->offset &&
+        !(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
+        int k;
+        int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
+        unsigned long * const *src;
+        unsigned long idx = (word * BITS_PER_LONG) / DIRTY_MEMORY_BLOCK_SIZE;
+        unsigned long offset = BIT_WORD((word * BITS_PER_LONG) %
+                                        DIRTY_MEMORY_BLOCK_SIZE);
+
+        src = qatomic_rcu_read(
+                &ram_list.dirty_memory[DIRTY_MEMORY_DIRTY_RATE])->blocks;
+
+        for (k = 0; k < nr; k++) {
+            if (src[idx][offset]) {
+                qatomic_set(&src[idx][offset], 0);
+            }
+            if (++offset >= BITS_TO_LONGS(DIRTY_MEMORY_BLOCK_SIZE)) {
+                offset = 0;
+                idx++;
+            }
+        }
+    } else {
+        ram_addr_t offset = rb->offset;
+
+        for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
+            cpu_physical_memory_dirtyrate_clear_bit(addr + offset,
+                                                    TARGET_PAGE_SIZE);
+        }
+    }
+
+    return;
+}
+
+/* Called with RCU critical section */
+static inline
+uint64_t cpu_physical_memory_dirtyrate_stat_dirty_bits(RAMBlock *rb)
+{
+    uint64_t dirty_pages = 0;
+    ram_addr_t addr;
+    ram_addr_t length = rb->used_length;
+    unsigned long word = BIT_WORD(rb->offset >> TARGET_PAGE_BITS);
+    unsigned long bits;
+
+    /* start address and length are aligned at the start of a word? */
+    if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) == rb->offset &&
+        !(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
+        int k;
+        int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
+        unsigned long * const *src;
+        unsigned long idx = (word * BITS_PER_LONG) / DIRTY_MEMORY_BLOCK_SIZE;
+        unsigned long offset = BIT_WORD((word * BITS_PER_LONG) %
+                                        DIRTY_MEMORY_BLOCK_SIZE);
+
+        src = qatomic_rcu_read(
+                &ram_list.dirty_memory[DIRTY_MEMORY_DIRTY_RATE])->blocks;
+
+        for (k = 0; k < nr; k++) {
+            if (src[idx][offset]) {
+                bits = qatomic_read(&src[idx][offset]);
+                dirty_pages += ctpopl(bits);
+            }
+
+            if (++offset >= BITS_TO_LONGS(DIRTY_MEMORY_BLOCK_SIZE)) {
+                offset = 0;
+                idx++;
+            }
+        }
+    } else {
+        ram_addr_t offset = rb->offset;
+
+        for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
+            if (cpu_physical_memory_get_dirty(offset + addr,
+                                              TARGET_PAGE_SIZE,
+                                              DIRTY_MEMORY_DIRTY_RATE)) {
+                dirty_pages++;
+            }
+        }
+    }
+
+    return dirty_pages;
+}
+
+static inline
+void cpu_physical_memory_dirtyrate_reset_protect(RAMBlock *rb)
+{
+    ram_addr_t addr;
+    ram_addr_t length = rb->used_length;
+    unsigned long word = BIT_WORD(rb->offset >> TARGET_PAGE_BITS);
+
+    /* start address and length are aligned at the start of a word? */
+    if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) == rb->offset &&
+        !(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
+        memory_region_clear_dirty_bitmap(rb->mr, 0, length);
+    } else {
+        ram_addr_t offset = rb->offset;
+
+        for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
+            cpu_physical_memory_dirtyrate_reprotect_bit(offset + addr,
+                                                        TARGET_PAGE_SIZE);
+        }
+    }
+
+    return;
+}
+
 #endif
 #endif
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 9b171c9..d68649a 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1068,6 +1068,67 @@ bool cpu_physical_memory_test_and_clear_dirty(ram_addr_t start,
     return dirty;
 }
 
+void cpu_physical_memory_dirtyrate_clear_bit(ram_addr_t start,
+                                             ram_addr_t length)
+{
+    DirtyMemoryBlocks *blocks;
+    unsigned long end, page;
+    RAMBlock *ramblock;
+
+    if (length == 0) {
+        return;
+    }
+
+    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+    page = start >> TARGET_PAGE_BITS;
+
+    WITH_RCU_READ_LOCK_GUARD() {
+        blocks =
+            qatomic_rcu_read(&ram_list.dirty_memory[DIRTY_MEMORY_DIRTY_RATE]);
+        ramblock = qemu_get_ram_block(start);
+        /* Range sanity check on the ramblock */
+        assert(start >= ramblock->offset &&
+               start + length <= ramblock->offset + ramblock->used_length);
+        while (page < end) {
+            unsigned long idx = page / DIRTY_MEMORY_BLOCK_SIZE;
+            unsigned long offset = page % DIRTY_MEMORY_BLOCK_SIZE;
+            unsigned long num = MIN(end - page,
+                                    DIRTY_MEMORY_BLOCK_SIZE - offset);
+
+            bitmap_clear(blocks->blocks[idx], offset, num);
+            page += num;
+        }
+    }
+
+    return;
+}
+
+void cpu_physical_memory_dirtyrate_reprotect_bit(ram_addr_t start,
+                                                 ram_addr_t length)
+{
+    unsigned long end, start_page;
+    RAMBlock *ramblock;
+    uint64_t mr_offset, mr_size;
+
+    if (length == 0) {
+        return;
+    }
+
+    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+    start_page = start >> TARGET_PAGE_BITS;
+
+    ramblock = qemu_get_ram_block(start);
+    /* Range sanity check on the ramblock */
+    assert(start >= ramblock->offset &&
+        start + length <= ramblock->offset + ramblock->used_length);
+
+    mr_offset = (ram_addr_t)(start_page << TARGET_PAGE_BITS) - ramblock->offset;
+    mr_size = (end - start_page) << TARGET_PAGE_BITS;
+    memory_region_clear_dirty_bitmap(ramblock->mr, mr_offset, mr_size);
+
+    return;
+}
+
 DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty
     (MemoryRegion *mr, hwaddr offset, hwaddr length, unsigned client)
 {
-- 
1.8.3.1




* [PATCH 4/4] migration/dirtyrate: implement dirty-bitmap dirtyrate calculation
From: huangy81 @ 2021-06-27  5:38 UTC
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
	Peter Xu, Chuan Zheng, Paolo Bonzini

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Introduce dirty-bitmap mode as the third method of calc-dirty-rate.
Implement the dirty-bitmap dirty rate calculation, which can be used
to measure the dirty rate in the absence of the dirty ring.

Introduce a "dirty_bitmap:-b" option in the HMP calc_dirty_rate
command to indicate that the dirty bitmap method should be used for
the calculation.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 hmp-commands.hx        |   9 +--
 migration/dirtyrate.c  | 147 ++++++++++++++++++++++++++++++++++++++++++++++---
 migration/trace-events |   2 +
 qapi/migration.json    |   6 +-
 4 files changed, 150 insertions(+), 14 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index f7fc9d7..605973c 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1738,9 +1738,10 @@ ERST
 
     {
         .name       = "calc_dirty_rate",
-        .args_type  = "dirty_ring:-r,second:l,sample_pages_per_GB:l?",
-        .params     = "[-r] second [sample_pages_per_GB]",
-        .help       = "start a round of guest dirty rate measurement (using -d to"
-                      "\n\t\t\t specify dirty ring as the method of calculation)",
+        .args_type  = "dirty_ring:-r,dirty_bitmap:-b,second:l,sample_pages_per_GB:l?",
+        .params     = "[-r] [-b] second [sample_pages_per_GB]",
+        .help       = "start a round of guest dirty rate measurement (using -r to"
+                      "\n\t\t\t specify dirty ring as the method of calculation and"
+                      "\n\t\t\t -b to specify dirty bitmap as method of calculation)",
         .cmd        = hmp_calc_dirty_rate,
     },
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index 3c8c5e2..bf465ce 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -15,6 +15,7 @@
 #include "qapi/error.h"
 #include "cpu.h"
 #include "exec/ramblock.h"
+#include "exec/ram_addr.h"
 #include "qemu/rcu_queue.h"
 #include "qemu/main-loop.h"
 #include "qapi/qapi-commands-migration.h"
@@ -111,6 +112,10 @@ static struct DirtyRateInfo *query_dirty_rate_info(void)
             }
             info->vcpu_dirty_rate = head;
         }
+
+        if (dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_DIRTY_BITMAP) {
+            info->sample_pages = 0;
+        }
     }
 
     trace_query_dirty_rate_info(DirtyRateStatus_str(CalculatingState));
@@ -421,6 +426,114 @@ static int64_t do_calculate_dirtyrate_vcpu(DirtyPageRecord dirty_pages)
     return memory_size_MB / time_s;
 }
 
+static inline void dirtyrate_dirty_bits_clear(void)
+{
+    RAMBlock *block = NULL;
+
+    WITH_RCU_READ_LOCK_GUARD() {
+        RAMBLOCK_FOREACH_MIGRATABLE(block) {
+            cpu_physical_memory_dirtyrate_clear_dirty_bits(block);
+        }
+    }
+}
+
+static inline uint64_t dirtyrate_dirty_bits_reap(void)
+{
+    RAMBlock *block = NULL;
+    uint64_t dirty_pages = 0;
+    uint64_t total_pages = 0;
+
+    WITH_RCU_READ_LOCK_GUARD() {
+        RAMBLOCK_FOREACH_MIGRATABLE(block) {
+            dirty_pages = cpu_physical_memory_dirtyrate_stat_dirty_bits(block);
+            total_pages += dirty_pages;
+
+            trace_dirtyrate_bitmap_reap(block->idstr, dirty_pages);
+        }
+    }
+
+    return total_pages;
+}
+
+static inline void dirtyrate_manual_reset_protect(void)
+{
+    RAMBlock *block = NULL;
+
+    WITH_RCU_READ_LOCK_GUARD() {
+        RAMBLOCK_FOREACH_MIGRATABLE(block) {
+            cpu_physical_memory_dirtyrate_reset_protect(block);
+        }
+    }
+}
+
+static void do_calculate_dirtyrate_bitmap(void)
+{
+    uint64_t memory_size_MB;
+    int64_t time_s;
+    uint64_t increased_dirty_pages = 0;
+    uint64_t dirtyrate = 0;
+
+    increased_dirty_pages = dirtyrate_dirty_bits_reap();
+
+    memory_size_MB = (increased_dirty_pages * TARGET_PAGE_SIZE) >> 20;
+    time_s = DirtyStat.calc_time;
+
+    dirtyrate = memory_size_MB / time_s;
+    DirtyStat.dirty_rate = dirtyrate;
+
+    trace_dirtyrate_do_calculate_bitmap(increased_dirty_pages,
+                                        time_s, dirtyrate);
+}
+
+static void calculate_dirtyrate_dirty_bitmap(struct DirtyRateConfig config)
+{
+    int64_t msec = 0;
+    int64_t start_time;
+    uint64_t protect_flags = 0;
+    uint64_t initially_set = 0;
+    uint64_t protect_only = 0;
+
+    protect_flags = kvm_get_manual_dirty_log_protect();
+    protect_only = KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE;
+    initially_set =
+        (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | KVM_DIRTY_LOG_INITIALLY_SET);
+
+    dirtyrate_global_dirty_log_start();
+
+    /* absence of migration */
+    if (!(global_dirty_tracking & GLOBAL_DIRTY_MIGRATION)) {
+        if ((protect_flags & initially_set) == initially_set) {
+            /* skip the first round, which returns all-1 bits */
+            memory_global_dirty_log_sync();
+            /*
+             * reset page protection manually and
+             * start dirty tracking from now on
+             */
+            dirtyrate_manual_reset_protect();
+        }
+    }
+
+    dirtyrate_dirty_bits_clear();
+
+    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    DirtyStat.start_time = start_time / 1000;
+
+    msec = config.sample_period_seconds * 1000;
+    msec = set_sample_page_period(msec, start_time);
+    DirtyStat.calc_time = msec / 1000;
+
+    /* fetch dirty bitmap from kvm */
+    memory_global_dirty_log_sync();
+
+    do_calculate_dirtyrate_bitmap();
+
+    if (protect_flags & protect_only) {
+        dirtyrate_manual_reset_protect();
+    }
+
+    dirtyrate_global_dirty_log_stop();
+}
+
 static void calculate_dirtyrate_dirty_ring(struct DirtyRateConfig config)
 {
     CPUState *cpu;
@@ -506,7 +619,9 @@ out:
 
 static void calculate_dirtyrate(struct DirtyRateConfig config)
 {
-    if (config.mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) {
+    if (config.mode == DIRTY_RATE_MEASURE_MODE_DIRTY_BITMAP) {
+        calculate_dirtyrate_dirty_bitmap(config);
+    } else if (config.mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) {
         calculate_dirtyrate_dirty_ring(config);
     } else {
         calculate_dirtyrate_sample_vm(config);
@@ -589,11 +704,14 @@ void qmp_calc_dirty_rate(int64_t calc_time,
 
     /*
      * dirty ring mode only works when kvm dirty ring is enabled.
+     * on the contrary, dirty bitmap mode only works when it is disabled.
      */
-    if ((mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) &&
-        !kvm_dirty_ring_enabled()) {
-        error_setg(errp, "dirty ring is disabled, use sample-pages method "
-                         "or remeasure later.");
+    if (((mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) &&
+        !kvm_dirty_ring_enabled()) ||
+        ((mode == DIRTY_RATE_MEASURE_MODE_DIRTY_BITMAP) &&
+         kvm_dirty_ring_enabled())) {
+        error_setg(errp, "mode %s is not enabled, use another method instead.",
+                         DirtyRateMeasureMode_str(mode));
         return;
     }
 
@@ -670,9 +788,8 @@ void hmp_calc_dirty_rate(Monitor *mon, const QDict *qdict)
     int64_t sample_pages = qdict_get_try_int(qdict, "sample_pages_per_GB", -1);
     bool has_sample_pages = (sample_pages != -1);
     bool dirty_ring = qdict_get_try_bool(qdict, "dirty_ring", false);
-    DirtyRateMeasureMode mode =
-        (dirty_ring ? DIRTY_RATE_MEASURE_MODE_DIRTY_RING :
-         DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING);
+    bool dirty_bitmap = qdict_get_try_bool(qdict, "dirty_bitmap", false);
+    DirtyRateMeasureMode mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
     Error *err = NULL;
 
     if (!sec) {
@@ -680,6 +797,20 @@ void hmp_calc_dirty_rate(Monitor *mon, const QDict *qdict)
         return;
     }
 
+    if (dirty_ring && dirty_bitmap) {
+        monitor_printf(mon, "Only one of dirty ring and dirty bitmap "
+                       "can be specified!\n");
+        return;
+    }
+
+    if (dirty_bitmap) {
+        mode = DIRTY_RATE_MEASURE_MODE_DIRTY_BITMAP;
+    }
+
+    if (dirty_ring) {
+        mode = DIRTY_RATE_MEASURE_MODE_DIRTY_RING;
+    }
+
     qmp_calc_dirty_rate(sec, has_sample_pages, sample_pages, true,
                         mode, &err);
     if (err) {
diff --git a/migration/trace-events b/migration/trace-events
index ce0b5e6..e19e3e1 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -333,6 +333,8 @@ skip_sample_ramblock(const char *idstr, uint64_t ramblock_size) "ramblock name:
 find_page_matched(const char *idstr) "ramblock %s addr or size changed"
 dirtyrate_calculate(int64_t dirtyrate) "dirty rate: %" PRIi64 " MB/s"
 dirtyrate_do_calculate_vcpu(int idx, uint64_t rate) "vcpu[%d]: %"PRIu64 " MB/s"
+dirtyrate_do_calculate_bitmap(uint64_t pages, int64_t time, uint64_t rate) "%"PRIu64 " pages have increased in %"PRIi64 " s, rate %"PRIu64 " MB/s"
+dirtyrate_bitmap_reap(const char *idstr, uint64_t pages) "ramblock name: %s, has %"PRIu64 " dirty pages"
 
 # block.c
 migration_block_init_shared(const char *blk_device_name) "Start migration for %s with shared base image"
diff --git a/qapi/migration.json b/qapi/migration.json
index de35528..0b00976 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1747,13 +1747,15 @@
 #
 # @page-sampling: calculate dirtyrate by sampling pages.
 #
-# @dirty-ring: calculate dirtyrate by via dirty ring.
+# @dirty-ring: calculate dirtyrate by dirty ring.
+#
+# @dirty-bitmap: calculate dirtyrate by dirty bitmap.
 #
 # Since: 6.1
 #
 ##
 { 'enum': 'DirtyRateMeasureMode',
-  'data': ['page-sampling', 'dirty-ring'] }
+  'data': ['page-sampling', 'dirty-ring', 'dirty-bitmap'] }
 
 ##
 # @DirtyRateInfo:
-- 
1.8.3.1
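
For illustration, with this patch applied the new mode can be selected
from HMP like this (measuring for 1 second):

    (qemu) calc_dirty_rate -b 1
    (qemu) info dirty_rate

and, as an assumption based on the qmp_calc_dirty_rate() signature above,
the QMP equivalent would look like:

    { "execute": "calc-dirty-rate",
      "arguments": { "calc-time": 1, "mode": "dirty-bitmap" } }
    { "execute": "query-dirty-rate" }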




* Re: [PATCH 0/4] support dirtyrate measurement with dirty bitmap
From: Peter Xu @ 2021-07-09 18:20 UTC
  To: huangy81
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini

Yong,

On Sun, Jun 27, 2021 at 01:38:13PM +0800, huangy81@chinatelecom.cn wrote:
> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> 
> The dirty rate measurement was originally implemented by page sampling,
> which is not accurate in some scenarios, so we have introduced dirty-ring
> based dirty rate measurement (it may be merged soon); it fixes the
> accuracy problems of page sampling and, more importantly, it works at the
> granularity of a vCPU.
> 
> The dirty-ring method can be used when the dirty ring is enabled. As a
> supplement, we introduce a dirty-bitmap method that calculates the dirty
> rate when dirty logging is enabled, so that we can also get an accurate
> dirty rate if needed in the absence of the dirty ring.
> 
> Three things were done to implement the measurement:
> - introduce fresh new dirty bits named DIRTY_MEMORY_DIRTY_RATE, which
>   are used to store the dirty bitmap after fetching it from KVM. We do
>   not reuse the existing DIRTY_MEMORY_MIGRATION dirty bits because we do
>   not want to interfere with migration and want to keep the
>   implementation clear; this is also why dirty_memory is split.
> 
>   The DIRTY_MEMORY_DIRTY_RATE dirty bits are filled when
>   memory_global_dirty_log_sync executes, if the GLOBAL_DIRTY_DIRTY_RATE
>   bit is set in the global_dirty_tracking flags.

I'm not 100% sure this is needed.

Dirty rate measurement does not care about which page is dirtied; it looks
like overkill to introduce a new bitmap for it.

IMHO we can directly do the calculation when synchronizing the dirty bits in
the functions below:

        cpu_physical_memory_set_dirty_range
        cpu_physical_memory_set_dirty_lebitmap
        cpu_physical_memory_sync_dirty_bitmap

Maybe we can define a global statistic for that?
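
Something along these lines (hypothetical counter name, only a sketch of
the idea; "new_dirty_bits" stands for the bitmap word of freshly set bits):

    /* hypothetical global counter, bumped wherever dirty bits are
     * newly set, e.g. in cpu_physical_memory_set_dirty_range() */
    static uint64_t dirty_rate_new_pages;

    /* the measurement would then be the delta of this counter
     * across the calculation period */
    qatomic_add(&dirty_rate_new_pages, ctpopl(new_dirty_bits));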

> 
> - introduce the kvm_get_manual_dirty_log_protect function so that we can
>   probe KVM's manual dirty log protect capabilities when calculating.
> 
> - implement the dirty-bitmap measurement with the following steps
>   (sketched below):
>   1. start dirty logging.
> 
>   2. probe the protect capabilities; if KVM_DIRTY_LOG_INITIALLY_SET is
>      enabled, skip the first round (which returns all 1s) and reset the
>      page protection manually, since KVM fills the bitmap with 1 bits
>      when this capability is enabled.
> 
>   3. clear the DIRTY_MEMORY_DIRTY_RATE dirty bits, preparing to store
>      the dirty bitmap.
> 
>   4. run memory_global_dirty_log_sync and fetch the dirty bitmap from
>      KVM.
> 
>   5. reap the DIRTY_MEMORY_DIRTY_RATE dirty bits and do the calculation.
> 
> This patchset is based on the commit
> "migration/dirtyrate: implement dirty-ring dirtyrate calculation".
> Since that feature has not been merged yet, we post this patchset as an
> RFC; ideally, it can be merged after that series.

I gave it a shot with some dirty workloads set up; it runs well so far, and I
also get accurate numbers (200MB/s measured as 201MB/s, 300MB/s measured as
301MB/s, and so on).  Looks good to me in general.

But as I mentioned above, I feel like the changeset can be shrunk quite a bit
if we can drop the extra bitmap; maybe it means we can drop half of the whole
series.  But it's also possible I missed something, let's see.

It'll slightly differ from dirty ring in that the same page written will
always be counted only once between two dirty bitmap syncs, but that's
expected.  Dirty ring "syncs" more frequently (either on ring full, or the
current 1-sec timeout in the reaper), so it re-protects more frequently too.

I still have some other small comments, I'll go into the patches.

Thanks,

-- 
Peter Xu




* Re: [PATCH 3/4] memory: introduce DIRTY_MEMORY_DIRTY_RATE dirty bits functions
From: Peter Xu @ 2021-07-09 18:26 UTC
  To: huangy81
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini

On Sun, Jun 27, 2021 at 01:38:16PM +0800, huangy81@chinatelecom.cn wrote:
> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> 
> Introduce utility functions to set up the DIRTY_MEMORY_DIRTY_RATE
> dirty bits, for the convenience of tracking the dirty bitmap when
> calculating the dirty rate.
> 
> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> ---
>  include/exec/ram_addr.h | 121 ++++++++++++++++++++++++++++++++++++++++++++++++
>  softmmu/physmem.c       |  61 ++++++++++++++++++++++++
>  2 files changed, 182 insertions(+)
> 
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index 6070a52..57dc96b 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -435,6 +435,12 @@ bool cpu_physical_memory_test_and_clear_dirty(ram_addr_t start,
>                                                ram_addr_t length,
>                                                unsigned client);
>  
> +void cpu_physical_memory_dirtyrate_clear_bit(ram_addr_t start,
> +                                             ram_addr_t length);
> +
> +void cpu_physical_memory_dirtyrate_reprotect_bit(ram_addr_t start,
> +                                                 ram_addr_t length);
> +
>  DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty
>      (MemoryRegion *mr, hwaddr offset, hwaddr length, unsigned client);
>  
> @@ -523,5 +529,120 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
>  
>      return num_dirty;
>  }
> +
> +/* Called with RCU critical section */
> +static inline
> +void cpu_physical_memory_dirtyrate_clear_dirty_bits(RAMBlock *rb)
> +{
> +    ram_addr_t addr;
> +    ram_addr_t length = rb->used_length;
> +    unsigned long word = BIT_WORD(rb->offset >> TARGET_PAGE_BITS);
> +
> +    /* start address and length are aligned at the start of a word? */
> +    if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) == rb->offset &&
> +        !(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
> +        int k;
> +        int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
> +        unsigned long * const *src;
> +        unsigned long idx = (word * BITS_PER_LONG) / DIRTY_MEMORY_BLOCK_SIZE;
> +        unsigned long offset = BIT_WORD((word * BITS_PER_LONG) %
> +                                        DIRTY_MEMORY_BLOCK_SIZE);
> +
> +        src = qatomic_rcu_read(
> +                &ram_list.dirty_memory[DIRTY_MEMORY_DIRTY_RATE])->blocks;
> +
> +        for (k = 0; k < nr; k++) {
> +            if (src[idx][offset]) {
> +                qatomic_set(&src[idx][offset], 0);
> +            }
> +            if (++offset >= BITS_TO_LONGS(DIRTY_MEMORY_BLOCK_SIZE)) {
> +                offset = 0;
> +                idx++;
> +            }
> +        }
> +    } else {
> +        ram_addr_t offset = rb->offset;
> +
> +        for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
> +            cpu_physical_memory_dirtyrate_clear_bit(addr + offset,
> +                                                    TARGET_PAGE_SIZE);
> +        }
> +    }
> +
> +    return;
> +}
> +
> +/* Called with RCU critical section */
> +static inline
> +uint64_t cpu_physical_memory_dirtyrate_stat_dirty_bits(RAMBlock *rb)
> +{
> +    uint64_t dirty_pages = 0;
> +    ram_addr_t addr;
> +    ram_addr_t length = rb->used_length;
> +    unsigned long word = BIT_WORD(rb->offset >> TARGET_PAGE_BITS);
> +    unsigned long bits;
> +
> +    /* start address and length are aligned at the start of a word? */
> +    if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) == rb->offset &&
> +        !(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
> +        int k;
> +        int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
> +        unsigned long * const *src;
> +        unsigned long idx = (word * BITS_PER_LONG) / DIRTY_MEMORY_BLOCK_SIZE;
> +        unsigned long offset = BIT_WORD((word * BITS_PER_LONG) %
> +                                        DIRTY_MEMORY_BLOCK_SIZE);
> +
> +        src = qatomic_rcu_read(
> +                &ram_list.dirty_memory[DIRTY_MEMORY_DIRTY_RATE])->blocks;
> +
> +        for (k = 0; k < nr; k++) {
> +            if (src[idx][offset]) {
> +                bits = qatomic_read(&src[idx][offset]);
> +                dirty_pages += ctpopl(bits);
> +            }
> +
> +            if (++offset >= BITS_TO_LONGS(DIRTY_MEMORY_BLOCK_SIZE)) {
> +                offset = 0;
> +                idx++;
> +            }
> +        }
> +    } else {
> +        ram_addr_t offset = rb->offset;
> +
> +        for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
> +            if (cpu_physical_memory_get_dirty(offset + addr,
> +                                              TARGET_PAGE_SIZE,
> +                                              DIRTY_MEMORY_DIRTY_RATE)) {
> +                dirty_pages++;
> +            }
> +        }
> +    }
> +
> +    return dirty_pages;
> +}

If my understanding in the cover letter was correct, all the code up to
here can be dropped if we go without the extra bitmap.

> +
> +static inline
> +void cpu_physical_memory_dirtyrate_reset_protect(RAMBlock *rb)
> +{
> +    ram_addr_t addr;
> +    ram_addr_t length = rb->used_length;
> +    unsigned long word = BIT_WORD(rb->offset >> TARGET_PAGE_BITS);
> +
> +    /* start address and length are aligned at the start of a word? */
> +    if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) == rb->offset &&
> +        !(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
> +        memory_region_clear_dirty_bitmap(rb->mr, 0, length);
> +    } else {
> +        ram_addr_t offset = rb->offset;
> +
> +        for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
> +            cpu_physical_memory_dirtyrate_reprotect_bit(offset + addr,
> +                                                        TARGET_PAGE_SIZE);
> +        }
> +    }
> +
> +    return;
> +}

Confused why we need this complexity.  Can we unconditionally do:

static inline
void cpu_physical_memory_dirtyrate_reset_protect(RAMBlock *rb)
{
    memory_region_clear_dirty_bitmap(rb->mr, 0, rb->used_length);
}

?

Then we can even drop the helper, maybe?

The functions below seem like they can be dropped too if we go without the
dirty rate bitmap.  Then, maybe, this patch is not needed.

> +
>  #endif
>  #endif
> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
> index 9b171c9..d68649a 100644
> --- a/softmmu/physmem.c
> +++ b/softmmu/physmem.c
> @@ -1068,6 +1068,67 @@ bool cpu_physical_memory_test_and_clear_dirty(ram_addr_t start,
>      return dirty;
>  }
>  
> +void cpu_physical_memory_dirtyrate_clear_bit(ram_addr_t start,
> +                                             ram_addr_t length)
> +{
> +    DirtyMemoryBlocks *blocks;
> +    unsigned long end, page;
> +    RAMBlock *ramblock;
> +
> +    if (length == 0) {
> +        return;
> +    }
> +
> +    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
> +    page = start >> TARGET_PAGE_BITS;
> +
> +    WITH_RCU_READ_LOCK_GUARD() {
> +        blocks =
> +            qatomic_rcu_read(&ram_list.dirty_memory[DIRTY_MEMORY_DIRTY_RATE]);
> +        ramblock = qemu_get_ram_block(start);
> +        /* Range sanity check on the ramblock */
> +        assert(start >= ramblock->offset &&
> +               start + length <= ramblock->offset + ramblock->used_length);
> +        while (page < end) {
> +            unsigned long idx = page / DIRTY_MEMORY_BLOCK_SIZE;
> +            unsigned long offset = page % DIRTY_MEMORY_BLOCK_SIZE;
> +            unsigned long num = MIN(end - page,
> +                                    DIRTY_MEMORY_BLOCK_SIZE - offset);
> +
> +            bitmap_clear(blocks->blocks[idx], offset, num);
> +            page += num;
> +        }
> +    }
> +
> +    return;
> +}
> +
> +void cpu_physical_memory_dirtyrate_reprotect_bit(ram_addr_t start,
> +                                                 ram_addr_t length)
> +{
> +    unsigned long end, start_page;
> +    RAMBlock *ramblock;
> +    uint64_t mr_offset, mr_size;
> +
> +    if (length == 0) {
> +        return;
> +    }
> +
> +    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
> +    start_page = start >> TARGET_PAGE_BITS;
> +
> +    ramblock = qemu_get_ram_block(start);
> +    /* Range sanity check on the ramblock */
> +    assert(start >= ramblock->offset &&
> +        start + length <= ramblock->offset + ramblock->used_length);
> +
> +    mr_offset = (ram_addr_t)(start_page << TARGET_PAGE_BITS) - ramblock->offset;
> +    mr_size = (end - start_page) << TARGET_PAGE_BITS;
> +    memory_region_clear_dirty_bitmap(ramblock->mr, mr_offset, mr_size);
> +
> +    return;
> +}
> +
>  DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty
>      (MemoryRegion *mr, hwaddr offset, hwaddr length, unsigned client)
>  {
> -- 
> 1.8.3.1
> 

-- 
Peter Xu




* Re: [PATCH 4/4] migration/dirtyrate: implement dirty-bitmap dirtyrate calculation
From: Peter Xu @ 2021-07-09 18:32 UTC
  To: huangy81
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini

On Sun, Jun 27, 2021 at 01:38:17PM +0800, huangy81@chinatelecom.cn wrote:
> +static void calculate_dirtyrate_dirty_bitmap(struct DirtyRateConfig config)
> +{
> +    int64_t msec = 0;
> +    int64_t start_time;
> +    uint64_t protect_flags = 0;
> +    uint64_t initially_set = 0;
> +    uint64_t protect_only = 0;
> +
> +    protect_flags = kvm_get_manual_dirty_log_protect();
> +    protect_only = KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE;
> +    initially_set =
> +        (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | KVM_DIRTY_LOG_INITIALLY_SET);
> +
> +    dirtyrate_global_dirty_log_start();
> +
> +    /* absence of migration */
> +    if (!(global_dirty_tracking & GLOBAL_DIRTY_MIGRATION)) {
> +        if ((protect_flags & initially_set) == initially_set) {
> +            /* skip the first round, which returns all-1 bits */
> +            memory_global_dirty_log_sync();
> +            /*
> +             * reset page protection manually and
> +             * start dirty tracking from now on
> +             */
> +            dirtyrate_manual_reset_protect();
> +        }
> +    }

Right, clearing the dirty log is a bit tricky.

Wondering whether we can simplify this into something like:

  1. dirty_log_sync()
  2. if (manual_protect) reset_protect()
  3. record initial total dirty stats (total dirty stats updated when sync)
  4. sleep(SECONDS)
  5. dirty_log_sync()
  6. record final total dirty stats

Then I think it's not related to initial-all-set anymore. Do you think this
would work?

Thanks,

-- 
Peter Xu




* Re: [PATCH 1/4] memory: introduce DIRTY_MEMORY_DIRTY_RATE dirty bits
From: Peter Xu @ 2021-07-09 18:37 UTC
  To: huangy81
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini

On Sun, Jun 27, 2021 at 01:38:14PM +0800, huangy81@chinatelecom.cn wrote:
> @@ -370,9 +374,17 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
>                      qatomic_or(&blocks[DIRTY_MEMORY_VGA][idx][offset], temp);
>  
>                      if (global_dirty_tracking) {
> -                        qatomic_or(
> +                        if (global_dirty_tracking & GLOBAL_DIRTY_MIGRATION) {
> +                            qatomic_or(
>                                  &blocks[DIRTY_MEMORY_MIGRATION][idx][offset],
>                                  temp);
> +                        }
> +
> +                        if (global_dirty_tracking & GLOBAL_DIRTY_DIRTY_RATE) {
> +                            qatomic_or(
> +                                &blocks[DIRTY_MEMORY_DIRTY_RATE][idx][offset],
> +                                temp);

So what I meant in the other thread is that instead of operating on this
bitmap we just record the total number of dirty pages, just like we used to
do with rings.

PS. IIUC maybe this can even work for dirty rings.. because either dirty ring
or dirty logging collects dirty bits into the slot bitmap, and it's further
aggregated here from the slot bitmaps.  However, merging them won't help much
because dirty ring can provide finer-granularity per-vcpu dirty info, which
can be further used for per-vcpu throttling in the future.  So just raising
this up.

> +                        }
>                      }

-- 
Peter Xu




* Re: [PATCH 0/4] support dirtyrate measurement with dirty bitmap
From: Hyman Huang @ 2021-07-11 15:27 UTC
  To: Peter Xu
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini



On 2021/7/10 2:20, Peter Xu wrote:
> Yong,
> 
> On Sun, Jun 27, 2021 at 01:38:13PM +0800, huangy81@chinatelecom.cn wrote:
>> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>>
>> The dirty rate measurement was originally implemented by page sampling,
>> which is not accurate in some scenarios, so we have introduced dirty-ring
>> based dirty rate measurement (it may be merged soon); it fixes the
>> accuracy problems of page sampling and, more importantly, it works at the
>> granularity of a vCPU.
>>
>> The dirty-ring method can be used when the dirty ring is enabled. As a
>> supplement, we introduce a dirty-bitmap method that calculates the dirty
>> rate when dirty logging is enabled, so that we can also get an accurate
>> dirty rate if needed in the absence of the dirty ring.
>>
>> Three things were done to implement the measurement:
>> - introduce fresh new dirty bits named DIRTY_MEMORY_DIRTY_RATE, which
>>    are used to store the dirty bitmap after fetching it from KVM. We do
>>    not reuse the existing DIRTY_MEMORY_MIGRATION dirty bits because we do
>>    not want to interfere with migration and want to keep the
>>    implementation clear; this is also why dirty_memory is split.
>>
>>    The DIRTY_MEMORY_DIRTY_RATE dirty bits are filled when
>>    memory_global_dirty_log_sync executes, if the GLOBAL_DIRTY_DIRTY_RATE
>>    bit is set in the global_dirty_tracking flags.
> 
> I'm not 100% sure this is needed.
> 
> Dirty rate measurement does not care about which page is dirtied; it looks
> like overkill to introduce a new bitmap for it.
Indeed, dirty rate measurement only cares about the number of dirty pages
that have increased during the calculation time.
> 
> IMHO we can directly do the calculation when synchronizing the dirty bits in
> the functions below:
> 
>          cpu_physical_memory_set_dirty_range
>          cpu_physical_memory_set_dirty_lebitmap
>          cpu_physical_memory_sync_dirty_bitmap
> 
> Maybe we can define a global statistic for that?
uhhh... Do you mean that we can reuse the DIRTY_MEMORY_MIGRATION dirty
bits to count the new dirty pages, and just define a global variable to
count the dirty pages that have increased during the calculation time?
Or do we still use the bitmap, but defined as a global variable instead
of dirty bits?
> 
>>
>> - introduce the kvm_get_manual_dirty_log_protect function so that we can
>>    probe KVM's manual dirty log protect capabilities when calculating.
>>
>> - implement the dirty-bitmap measurement with the following steps
>>    (sketched below):
>>    1. start dirty logging.
>>
>>    2. probe the protect capabilities; if KVM_DIRTY_LOG_INITIALLY_SET is
>>       enabled, skip the first round (which returns all 1s) and reset the
>>       page protection manually, since KVM fills the bitmap with 1 bits
>>       when this capability is enabled.
>>
>>    3. clear the DIRTY_MEMORY_DIRTY_RATE dirty bits, preparing to store
>>       the dirty bitmap.
>>
>>    4. run memory_global_dirty_log_sync and fetch the dirty bitmap from
>>       KVM.
>>
>>    5. reap the DIRTY_MEMORY_DIRTY_RATE dirty bits and do the calculation.
>>
>> This patchset is based on the commit
>> "migration/dirtyrate: implement dirty-ring dirtyrate calculation".
>> Since that feature has not been merged yet, we post this patchset as an
>> RFC; ideally, it can be merged after that series.
> 
> I gave it a shot with some dirty workloads set up; it runs well so far, and I
> also get accurate numbers (200MB/s measured as 201MB/s, 300MB/s measured as
> 301MB/s, and so on).  Looks good to me in general.
> 
> But as I mentioned above, I feel like the changeset can be shrunk quite a bit
> if we can drop the extra bitmap; maybe it means we can drop half of the whole
> series.  But it's also possible I missed something, let's see.
> 
> It'll slightly differ from dirty ring in that the same page written will
> always be counted only once between two dirty bitmap syncs, but that's
> expected.  Dirty ring "syncs" more frequently (either on ring full, or the
> current 1-sec timeout in the reaper), so it re-protects more frequently too.
> 
> I still have some other small comments, I'll go into the patches.
> 
> Thanks,
> 

-- 
Best regard

Hyman Huang(黄勇)



* Re: [PATCH 0/4] support dirtyrate measurement with dirty bitmap
From: Peter Xu @ 2021-07-13 17:45 UTC
  To: Hyman Huang
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini

On Sun, Jul 11, 2021 at 11:27:13PM +0800, Hyman Huang wrote:
> > IMHO we can directly do the calculation when synchronizing the dirty bits in
> > the functions below:
> > 
> >          cpu_physical_memory_set_dirty_range
> >          cpu_physical_memory_set_dirty_lebitmap
> >          cpu_physical_memory_sync_dirty_bitmap
> > 
> > Maybe we can define a global statistic for that?
> uhhh... Do you mean that we can reuse the DIRTY_MEMORY_MIGRATION dirty
> bits to count the new dirty pages, and just define a global variable to
> count the dirty pages that have increased during the calculation time?

I think I misguided you.. Sorry :)

cpu_physical_memory_sync_dirty_bitmap() should not really be in the list above,
as it's fetching the bitmap in ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION].

If you look at the other two functions, they both apply dirty bits to the
same bitmap (actually ram_list.dirty_memory[*], not migration-only).  It's
used by e.g. a memory region's log_sync() to deliver lower-level dirty bits
upward; see kvm's log_sync[_global]() and kvm_slot_sync_dirty_pages().

Using cpu_physical_memory_sync_dirty_bitmap() does not seem like a good idea
to me (I saw you used it in your latest version), as it could affect
migration.  See its only caller now at ramblock_sync_dirty_bitmap(): when
migration calls it, it'll start to count less than it should for
rs->migration_dirty_pages.

So what I wanted to suggest is that we do some general counting in both
cpu_physical_memory_set_dirty_range and cpu_physical_memory_set_dirty_lebitmap.
Then, to sync for dirty rate measuring, we use memory_global_dirty_log_sync().
That'll sync all dirty bits, e.g. from the kernel, to ram_list.dirty_memory[*],
and we do the accounting along the way.

Would that work?
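
For instance, something like this in cpu_physical_memory_set_dirty_lebitmap()
(hypothetical counter name, only to sketch the idea; "temp" is the word of
newly reported dirty bits in that function):

    /* global counter; sample it before and after the measurement
     * window, the delta is the number of dirtied pages */
    extern uint64_t total_dirty_pages;

    if (global_dirty_tracking & GLOBAL_DIRTY_DIRTY_RATE) {
        total_dirty_pages += ctpopl(temp);
    }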

-- 
Peter Xu




* Re: [PATCH 0/4] support dirtyrate measurement with dirty bitmap
From: Hyman @ 2021-07-14 15:59 UTC
  To: Peter Xu
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini



On 2021/7/14 1:45, Peter Xu wrote:
> On Sun, Jul 11, 2021 at 11:27:13PM +0800, Hyman Huang wrote:
>>> IMHO we can directly do the calculation when synchronizing the dirty bits in
>>> the functions below:
>>>
>>>           cpu_physical_memory_set_dirty_range
>>>           cpu_physical_memory_set_dirty_lebitmap
>>>           cpu_physical_memory_sync_dirty_bitmap
>>>
>>> Maybe we can define a global statistic for that?
>> uhhh... Do you mean that we can reuse the DIRTY_MEMORY_MIGRATION dirty
>> bits to count the new dirty pages, and just define a global variable to
>> count the dirty pages that have increased during the calculation time?
> 
> I think I misguided you.. Sorry :)
Never mind. The other version of the implementation is what you said;
I'll post it later.
> 
> cpu_physical_memory_sync_dirty_bitmap() should not really be in the list above,
> as it's fetching the bitmap in ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION].
> 
> If you look at the other two functions, they both apply dirty bits to the
> same bitmap (actually ram_list.dirty_memory[*], not migration-only).  It's
> used by e.g. a memory region's log_sync() to deliver lower-level dirty bits
> upward; see kvm's log_sync[_global]() and kvm_slot_sync_dirty_pages().
> 
> Using cpu_physical_memory_sync_dirty_bitmap() does not seem like a good idea
> to me (I saw you used it in your latest version), as it could affect
> migration.  See its only caller now at ramblock_sync_dirty_bitmap(): when
> migration calls it, it'll start to count less than it should for
> rs->migration_dirty_pages.
> 
> So what I wanted to suggest is that we do some general counting in both
> cpu_physical_memory_set_dirty_range and cpu_physical_memory_set_dirty_lebitmap.
> Then, to sync for dirty rate measuring, we use memory_global_dirty_log_sync().
> That'll sync all dirty bits, e.g. from the kernel, to ram_list.dirty_memory[*],
> and we do the accounting along the way.
> 
> Would that work?
Yes, this works.
> 

