* [PATCH v4 0/6] support dirtyrate at the granularity of vcpu
@ 2021-06-16  1:12 huangy81
  2021-06-16  1:12 ` [PATCH v4 1/6] KVM: introduce dirty_pages and kvm_dirty_ring_enabled huangy81
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: huangy81 @ 2021-06-16  1:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
	Peter Xu, Chuan Zheng, Paolo Bonzini

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

v4:
- make global_dirty_log a bitmask:
  1. add comments about dirty log bitmask
  2. use assert statement to check validity of flags 
  3. add trace to log bitmask changes

- introduce a mode option to select which calculation method is used,
  and export the mode option in the last commit

- split the cleanup and init of the dirty rate stat and move them into
  the main thread

- change the fields of DirtyPageRecord to uint64_t so that we can
  calculate the increase in dirty pages with the formula Peter
  suggested: dirty pages = end_pages - start_pages

- introduce mutex to protect dirty rate stat info

- adjust order of registering thread

- drop the memory free callback 

this version modifies some code on Peter's advice, for reference see:
https://lore.kernel.org/qemu-devel/YL5nNYXmrqMlXF3v@t490s/

thanks again.

v3:
- pick up "migration/dirtyrate: make sample page count configurable" so
  that the patchset applies to master cleanly

v2:
- rebase to "migration/dirtyrate: make sample page count configurable"

- rename "vcpu" to "per_vcpu" to show the per-vcpu method

- squash patch 5/6 into a single one, squash patch 1/2 also 

- pick up "hmp: Add "calc_dirty_rate" and "info dirty_rate" cmds"

- make global_dirty_log a bitmask so that migration and dirty rate
  measurement cannot interfere with each other

- add memory free callback to prevent memory leaks

The main difference between v2 and v1 is that global_dirty_log is now a
bitmask. The reason is that dirty rate measurement may start or stop
dirty logging during calculation. This conflicts with migration: if
dirty logging is stopped while a migration is in progress, migration
will miss dirty pages.

Making global_dirty_log a bitmask lets migration and dirty rate
measurement work side by side. Introduce GLOBAL_DIRTY_MIGRATION and
GLOBAL_DIRTY_DIRTY_RATE to distinguish what the current dirty log is
for, migration or dirty rate.
    
All existing readers of global_dirty_log can stay untouched, because any
bit set there means that global dirty logging is enabled.
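
As a rough illustration of the bitmask idea (a standalone sketch, not
the patch itself): the two flag names come from this series, while
dirty_log_start(), dirty_log_stop() and dirty_log_enabled() are
hypothetical stand-ins for the real memory_global_dirty_log_start/stop
functions, and the listener calls are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names as introduced by this series. */
#define GLOBAL_DIRTY_MIGRATION  (1U << 0)
#define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)
#define GLOBAL_DIRTY_MASK \
    (GLOBAL_DIRTY_MIGRATION | GLOBAL_DIRTY_DIRTY_RATE)

static unsigned int global_dirty_log;

/* Hypothetical helper: record that one more user wants dirty logging. */
static void dirty_log_start(unsigned int flags)
{
    assert(flags && !(flags & ~GLOBAL_DIRTY_MASK));
    global_dirty_log |= flags;
}

/* Hypothetical helper: drop one user, other users stay enabled. */
static void dirty_log_stop(unsigned int flags)
{
    assert(flags && !(flags & ~GLOBAL_DIRTY_MASK));
    global_dirty_log &= ~flags;
}

/* Existing readers only care whether any bit is set. */
static bool dirty_log_enabled(void)
{
    return global_dirty_log != 0;
}
```

With this model, stopping the dirty rate user while migration is
running clears only GLOBAL_DIRTY_DIRTY_RATE, so dirty_log_enabled()
stays true until migration stops as well.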

Please review, thanks !

v1:

Since the Dirty Ring on QEMU part has been merged recently, how to use
this feature is under consideration.

In the migration scenario, it is valuable to provide a more accurate
interface for tracking dirty memory than the existing one, so that the
upper-layer application can make better decisions. More importantly,
dirtyrate info at the granularity of vcpu makes it possible to make
migration converge by imposing restrictions on vcpus. With Dirty
Ring, we can calculate dirtyrate efficiently and cheaply.

The old interface is implemented by sampling pages. It consumes cpu
resources, and the larger the guest memory, the more cpu it consumes,
so it is hard to scale. The new interface has no such drawback.
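
For context, the page-sampling estimate extrapolates from the sampled
pages to the whole of guest memory. The sketch below mirrors the
arithmetic in update_dirtyrate() as it appears in patch 3; the function
name, the parameter names and the -1 return for empty input are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: estimate the dirty rate in MB/s from sampling.
 * dirty_samples / sample_count is the fraction of sampled pages found
 * dirty; scaling by mem_mb extrapolates that to all of guest memory.
 */
static int64_t sampled_dirty_rate_mb_per_sec(uint64_t dirty_samples,
                                             uint64_t sample_count,
                                             uint64_t mem_mb,
                                             uint64_t msec)
{
    assert(dirty_samples <= sample_count);

    if (sample_count == 0 || msec == 0) {
        return -1; /* nothing sampled, no estimate available */
    }
    return (int64_t)(dirty_samples * mem_mb * 1000 /
                     (sample_count * msec));
}
```

The number of pages that have to be hashed grows with guest memory,
which is the scaling cost described above.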

Please review, thanks !

Best Regards !

Hyman Huang(黄勇) (6):
  KVM: introduce dirty_pages and kvm_dirty_ring_enabled
  memory: make global_dirty_log a bitmask
  migration/dirtyrate: introduce struct and adjust DirtyRateStat
  migration/dirtyrate: adjust order of registering thread
  migration/dirtyrate: move init step of calculation to main thread
  migration/dirtyrate: implement dirty-ring dirtyrate calculation

 accel/kvm/kvm-all.c    |   7 ++
 hmp-commands.hx        |   7 +-
 include/exec/memory.h  |  14 ++-
 include/hw/core/cpu.h  |   1 +
 include/sysemu/kvm.h   |   1 +
 migration/dirtyrate.c  | 259 +++++++++++++++++++++++++++++++++++++++++++------
 migration/dirtyrate.h  |  19 +++-
 migration/ram.c        |   8 +-
 migration/trace-events |   2 +
 qapi/migration.json    |  46 ++++++++-
 softmmu/memory.c       |  31 ++++--
 softmmu/trace-events   |   1 +
 12 files changed, 343 insertions(+), 53 deletions(-)

-- 
1.8.3.1




* [PATCH v4 1/6] KVM: introduce dirty_pages and kvm_dirty_ring_enabled
From: huangy81 @ 2021-06-16  1:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
	Peter Xu, Chuan Zheng, Paolo Bonzini

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

dirty_pages is used to calculate dirtyrate via dirty ring. When dirty
ring is enabled, kvm-reaper increases dirty_pages as it reaps the
dirtied gfns.

kvm_dirty_ring_enabled shows whether kvm-reaper is working. The
dirtyrate thread can use it to check whether measurement can be based
on the dirty ring feature.
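
A minimal sketch of how such a per-vcpu counter could be turned into a
rate, assuming the counter is read once at the start and once at the
end of the measurement window. DirtyPageRecord is the struct from
patch 6; dirty_rate_mb_per_sec() and its page_size parameter are
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct DirtyPageRecord {
    uint64_t start_pages; /* counter value when measurement starts */
    uint64_t end_pages;   /* counter value when measurement ends */
} DirtyPageRecord;

/* Hypothetical helper: dirty rate of one vcpu in MB/s. */
static uint64_t dirty_rate_mb_per_sec(const DirtyPageRecord *rec,
                                      uint64_t page_size, uint64_t msec)
{
    assert(msec > 0);

    /* dirty pages = end_pages - start_pages */
    uint64_t pages = rec->end_pages - rec->start_pages;
    uint64_t bytes = pages * page_size;

    return (bytes >> 20) * 1000 / msec;
}
```

Because the fields are unsigned 64-bit values, the subtraction still
gives the right delta if the counter wraps once during the window.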

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 accel/kvm/kvm-all.c   | 7 +++++++
 include/hw/core/cpu.h | 1 +
 include/sysemu/kvm.h  | 1 +
 3 files changed, 9 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index e5b10dd..e0e88a2 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -469,6 +469,7 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
     cpu->kvm_fd = ret;
     cpu->kvm_state = s;
     cpu->vcpu_dirty = true;
+    cpu->dirty_pages = 0;
 
     mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
     if (mmap_size < 0) {
@@ -743,6 +744,7 @@ static uint32_t kvm_dirty_ring_reap_one(KVMState *s, CPUState *cpu)
         count++;
     }
     cpu->kvm_fetch_index = fetch;
+    cpu->dirty_pages += count;
 
     return count;
 }
@@ -2293,6 +2295,11 @@ bool kvm_vcpu_id_is_valid(int vcpu_id)
     return vcpu_id >= 0 && vcpu_id < kvm_max_vcpu_id(s);
 }
 
+bool kvm_dirty_ring_enabled(void)
+{
+    return kvm_state->kvm_dirty_ring_size ? true : false;
+}
+
 static int kvm_init(MachineState *ms)
 {
     MachineClass *mc = MACHINE_GET_CLASS(ms);
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 4e0ea68..80fcb1d 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -374,6 +374,7 @@ struct CPUState {
     struct kvm_run *kvm_run;
     struct kvm_dirty_gfn *kvm_dirty_gfns;
     uint32_t kvm_fetch_index;
+    uint64_t dirty_pages;
 
     /* Used for events with 'vcpu' and *without* the 'disabled' properties */
     DECLARE_BITMAP(trace_dstate_delayed, CPU_TRACE_DSTATE_MAX_EVENTS);
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index a1ab1ee..7b22aeb 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -547,4 +547,5 @@ bool kvm_cpu_check_are_resettable(void);
 
 bool kvm_arch_cpu_check_are_resettable(void);
 
+bool kvm_dirty_ring_enabled(void);
 #endif
-- 
1.8.3.1




* [PATCH v4 2/6] memory: make global_dirty_log a bitmask
From: huangy81 @ 2021-06-16  1:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
	Peter Xu, Chuan Zheng, Paolo Bonzini

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Dirty rate measurement may start or stop dirty logging during
calculation. This conflicts with migration: if dirty logging is stopped
while a migration is in progress, migration will miss dirty pages.

Making global_dirty_log a bitmask lets migration and dirty rate
measurement work side by side. Introduce GLOBAL_DIRTY_MIGRATION and
GLOBAL_DIRTY_DIRTY_RATE to distinguish what the current dirty log is
for, migration or dirty rate.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 include/exec/memory.h | 14 +++++++++++---
 migration/ram.c       |  8 ++++----
 softmmu/memory.c      | 31 ++++++++++++++++++++++---------
 softmmu/trace-events  |  1 +
 4 files changed, 38 insertions(+), 16 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index b114f54..e31eef2 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -55,7 +55,11 @@ static inline void fuzz_dma_read_cb(size_t addr,
 }
 #endif
 
-extern bool global_dirty_log;
+/* what is the purpose of current dirty log, migration or dirty rate ? */
+#define GLOBAL_DIRTY_MIGRATION  (1U << 0)
+#define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)
+
+extern unsigned int global_dirty_log;
 
 typedef struct MemoryRegionOps MemoryRegionOps;
 
@@ -2099,13 +2103,17 @@ void memory_listener_unregister(MemoryListener *listener);
 
 /**
  * memory_global_dirty_log_start: begin dirty logging for all regions
+ *
+ * @flags: purpose of starting dirty log, migration or dirty rate
  */
-void memory_global_dirty_log_start(void);
+void memory_global_dirty_log_start(unsigned int flags);
 
 /**
  * memory_global_dirty_log_stop: end dirty logging for all regions
+ *
+ * @flags: purpose of stopping dirty log, migration or dirty rate
  */
-void memory_global_dirty_log_stop(void);
+void memory_global_dirty_log_stop(unsigned int flags);
 
 void mtree_info(bool flatview, bool dispatch_tree, bool owner, bool disabled);
 
diff --git a/migration/ram.c b/migration/ram.c
index 60ea913..9ce31af 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2190,7 +2190,7 @@ static void ram_save_cleanup(void *opaque)
         /* caller have hold iothread lock or is in a bh, so there is
          * no writing race against the migration bitmap
          */
-        memory_global_dirty_log_stop();
+        memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
     }
 
     RAMBLOCK_FOREACH_NOT_IGNORED(block) {
@@ -2652,7 +2652,7 @@ static void ram_init_bitmaps(RAMState *rs)
         ram_list_init_bitmaps();
         /* We don't use dirty log with background snapshots */
         if (!migrate_background_snapshot()) {
-            memory_global_dirty_log_start();
+            memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
             migration_bitmap_sync_precopy(rs);
         }
     }
@@ -3393,7 +3393,7 @@ void colo_incoming_start_dirty_log(void)
             /* Discard this dirty bitmap record */
             bitmap_zero(block->bmap, block->max_length >> TARGET_PAGE_BITS);
         }
-        memory_global_dirty_log_start();
+        memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
     }
     ram_state->migration_dirty_pages = 0;
     qemu_mutex_unlock_ramlist();
@@ -3405,7 +3405,7 @@ void colo_release_ram_cache(void)
 {
     RAMBlock *block;
 
-    memory_global_dirty_log_stop();
+    memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
     RAMBLOCK_FOREACH_NOT_IGNORED(block) {
         g_free(block->bmap);
         block->bmap = NULL;
diff --git a/softmmu/memory.c b/softmmu/memory.c
index c19b0be..d74172f 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -39,7 +39,7 @@
 static unsigned memory_region_transaction_depth;
 static bool memory_region_update_pending;
 static bool ioeventfd_update_pending;
-bool global_dirty_log;
+unsigned int global_dirty_log;
 
 static QTAILQ_HEAD(, MemoryListener) memory_listeners
     = QTAILQ_HEAD_INITIALIZER(memory_listeners);
@@ -2659,14 +2659,19 @@ void memory_global_after_dirty_log_sync(void)
 
 static VMChangeStateEntry *vmstate_change;
 
-void memory_global_dirty_log_start(void)
+void memory_global_dirty_log_start(unsigned int flags)
 {
     if (vmstate_change) {
         qemu_del_vm_change_state_handler(vmstate_change);
         vmstate_change = NULL;
     }
 
-    global_dirty_log = true;
+#define  GLOBAL_DIRTY_MASK  (0x3)
+    assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
+    assert(global_dirty_log ^ flags);
+    global_dirty_log |= flags;
+
+    trace_global_dirty_changed(global_dirty_log);
 
     MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
 
@@ -2676,9 +2681,12 @@ void memory_global_dirty_log_start(void)
     memory_region_transaction_commit();
 }
 
-static void memory_global_dirty_log_do_stop(void)
+static void memory_global_dirty_log_do_stop(unsigned int flags)
 {
-    global_dirty_log = false;
+    assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
+    global_dirty_log &= ~flags;
+
+    trace_global_dirty_changed(global_dirty_log);
 
     /* Refresh DIRTY_MEMORY_MIGRATION bit.  */
     memory_region_transaction_begin();
@@ -2691,8 +2699,10 @@ static void memory_global_dirty_log_do_stop(void)
 static void memory_vm_change_state_handler(void *opaque, bool running,
                                            RunState state)
 {
+    int *flags = opaque;
     if (running) {
-        memory_global_dirty_log_do_stop();
+        memory_global_dirty_log_do_stop(*flags);
+        g_free(opaque);
 
         if (vmstate_change) {
             qemu_del_vm_change_state_handler(vmstate_change);
@@ -2701,18 +2711,21 @@ static void memory_vm_change_state_handler(void *opaque, bool running,
     }
 }
 
-void memory_global_dirty_log_stop(void)
+void memory_global_dirty_log_stop(unsigned int flags)
 {
+    int *opaque = NULL;
     if (!runstate_is_running()) {
         if (vmstate_change) {
             return;
         }
+        opaque = g_malloc0(sizeof(*opaque));
+        *opaque = flags;
         vmstate_change = qemu_add_vm_change_state_handler(
-                                memory_vm_change_state_handler, NULL);
+                         memory_vm_change_state_handler, opaque);
         return;
     }
 
-    memory_global_dirty_log_do_stop();
+    memory_global_dirty_log_do_stop(flags);
 }
 
 static void listener_add_address_space(MemoryListener *listener,
diff --git a/softmmu/trace-events b/softmmu/trace-events
index 5262828..4431f7f 100644
--- a/softmmu/trace-events
+++ b/softmmu/trace-events
@@ -18,6 +18,7 @@ memory_region_ram_device_write(int cpu_index, void *mr, uint64_t addr, uint64_t
 flatview_new(void *view, void *root) "%p (root %p)"
 flatview_destroy(void *view, void *root) "%p (root %p)"
 flatview_destroy_rcu(void *view, void *root) "%p (root %p)"
+global_dirty_changed(unsigned int bitmask) "bitmask 0x%"PRIx32
 
 # vl.c
 vm_state_notify(int running, int reason, const char *reason_str) "running %d reason %d (%s)"
-- 
1.8.3.1




* [PATCH v4 3/6] migration/dirtyrate: introduce struct and adjust DirtyRateStat
From: huangy81 @ 2021-06-16  1:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
	Peter Xu, Chuan Zheng, Paolo Bonzini

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Introduce "DirtyRateMeasureMode" to specify which method should be
used to calculate the dirty rate, and introduce "DirtyRateVcpu" to
store the dirty rate for each vcpu.

Use a union to store the stat data of the specific mode.
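
The union layout can be pictured with the standalone sketch below. It
is a simplified model, not the code in this patch: MeasureMode,
VcpuRate, StatModel and stat_model_init() are hypothetical names, and
keeping the mode tag inside the struct is an assumption made here for
clarity (the series tracks the mode separately).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    MODE_NONE,
    MODE_PAGE_SAMPLING,
    MODE_DIRTY_RING,
} MeasureMode;

typedef struct {
    int64_t id;         /* vcpu index */
    int64_t dirty_rate; /* dirty rate of this vcpu */
} VcpuRate;

typedef struct {
    int64_t dirty_rate; /* overall dirty rate, -1 until measured */
    MeasureMode mode;   /* tells which union member is valid */
    union {
        struct {
            uint64_t total_dirty_samples;
            uint64_t total_sample_count;
            uint64_t total_block_mem_mb;
        } page_sampling;
        struct {
            int nvcpu;
            VcpuRate *rates;
        } dirty_ring;
    };
} StatModel;

/* Reset the common fields and only the member used by @mode. */
static void stat_model_init(StatModel *s, MeasureMode mode)
{
    assert(s != NULL);
    s->dirty_rate = -1;
    s->mode = mode;

    switch (mode) {
    case MODE_PAGE_SAMPLING:
        s->page_sampling.total_dirty_samples = 0;
        s->page_sampling.total_sample_count = 0;
        s->page_sampling.total_block_mem_mb = 0;
        break;
    case MODE_DIRTY_RING:
        s->dirty_ring.nvcpu = -1;
        s->dirty_ring.rates = NULL;
        break;
    default:
        break;
    }
}
```

Only one member is meaningful at a time, so a reader has to check the
mode before touching either page_sampling or dirty_ring.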

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 migration/dirtyrate.c | 47 +++++++++++++++++++++++++++--------------------
 migration/dirtyrate.h | 19 ++++++++++++++++---
 qapi/migration.json   | 30 ++++++++++++++++++++++++++++++
 3 files changed, 73 insertions(+), 23 deletions(-)

diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index 320c56b..14ffac9 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -88,33 +88,43 @@ static struct DirtyRateInfo *query_dirty_rate_info(void)
     return info;
 }
 
-static void init_dirtyrate_stat(int64_t start_time, int64_t calc_time,
-                                uint64_t sample_pages)
+static void init_dirtyrate_stat(int64_t start_time,
+                                struct DirtyRateConfig config)
 {
-    DirtyStat.total_dirty_samples = 0;
-    DirtyStat.total_sample_count = 0;
-    DirtyStat.total_block_mem_MB = 0;
     DirtyStat.dirty_rate = -1;
     DirtyStat.start_time = start_time;
-    DirtyStat.calc_time = calc_time;
-    DirtyStat.sample_pages = sample_pages;
+    DirtyStat.calc_time = config.sample_period_seconds;
+    DirtyStat.sample_pages = config.sample_pages_per_gigabytes;
+
+    switch (config.mode) {
+    case DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING:
+        DirtyStat.page_sampling.total_dirty_samples = 0;
+        DirtyStat.page_sampling.total_sample_count = 0;
+        DirtyStat.page_sampling.total_block_mem_MB = 0;
+        break;
+    case DIRTY_RATE_MEASURE_MODE_DIRTY_RING:
+        DirtyStat.dirty_ring.nvcpu = -1;
+        DirtyStat.dirty_ring.rates = NULL;
+    default:
+        break;
+    }
 }
 
 static void update_dirtyrate_stat(struct RamblockDirtyInfo *info)
 {
-    DirtyStat.total_dirty_samples += info->sample_dirty_count;
-    DirtyStat.total_sample_count += info->sample_pages_count;
+    DirtyStat.page_sampling.total_dirty_samples += info->sample_dirty_count;
+    DirtyStat.page_sampling.total_sample_count += info->sample_pages_count;
     /* size of total pages in MB */
-    DirtyStat.total_block_mem_MB += (info->ramblock_pages *
-                                     TARGET_PAGE_SIZE) >> 20;
+    DirtyStat.page_sampling.total_block_mem_MB += (info->ramblock_pages *
+                                                   TARGET_PAGE_SIZE) >> 20;
 }
 
 static void update_dirtyrate(uint64_t msec)
 {
     uint64_t dirtyrate;
-    uint64_t total_dirty_samples = DirtyStat.total_dirty_samples;
-    uint64_t total_sample_count = DirtyStat.total_sample_count;
-    uint64_t total_block_mem_MB = DirtyStat.total_block_mem_MB;
+    uint64_t total_dirty_samples = DirtyStat.page_sampling.total_dirty_samples;
+    uint64_t total_sample_count = DirtyStat.page_sampling.total_sample_count;
+    uint64_t total_block_mem_MB = DirtyStat.page_sampling.total_block_mem_MB;
 
     dirtyrate = total_dirty_samples * total_block_mem_MB *
                 1000 / (total_sample_count * msec);
@@ -327,7 +337,7 @@ static bool compare_page_hash_info(struct RamblockDirtyInfo *info,
         update_dirtyrate_stat(block_dinfo);
     }
 
-    if (DirtyStat.total_sample_count == 0) {
+    if (DirtyStat.page_sampling.total_sample_count == 0) {
         return false;
     }
 
@@ -372,8 +382,6 @@ void *get_dirtyrate_thread(void *arg)
     struct DirtyRateConfig config = *(struct DirtyRateConfig *)arg;
     int ret;
     int64_t start_time;
-    int64_t calc_time;
-    uint64_t sample_pages;
 
     ret = dirtyrate_set_state(&CalculatingState, DIRTY_RATE_STATUS_UNSTARTED,
                               DIRTY_RATE_STATUS_MEASURING);
@@ -383,9 +391,7 @@ void *get_dirtyrate_thread(void *arg)
     }
 
     start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
-    calc_time = config.sample_period_seconds;
-    sample_pages = config.sample_pages_per_gigabytes;
-    init_dirtyrate_stat(start_time, calc_time, sample_pages);
+    init_dirtyrate_stat(start_time, config);
 
     calculate_dirtyrate(config);
 
@@ -442,6 +448,7 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
 
     config.sample_period_seconds = calc_time;
     config.sample_pages_per_gigabytes = sample_pages;
+    config.mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
     qemu_thread_create(&thread, "get_dirtyrate", get_dirtyrate_thread,
                        (void *)&config, QEMU_THREAD_DETACHED);
 }
diff --git a/migration/dirtyrate.h b/migration/dirtyrate.h
index e1fd290..69d4c5b 100644
--- a/migration/dirtyrate.h
+++ b/migration/dirtyrate.h
@@ -43,6 +43,7 @@
 struct DirtyRateConfig {
     uint64_t sample_pages_per_gigabytes; /* sample pages per GB */
     int64_t sample_period_seconds; /* time duration between two sampling */
+    DirtyRateMeasureMode mode; /* mode of dirtyrate measurement */
 };
 
 /*
@@ -58,17 +59,29 @@ struct RamblockDirtyInfo {
     uint32_t *hash_result; /* array of hash result for sampled pages */
 };
 
+typedef struct SampleVMStat {
+    uint64_t total_dirty_samples; /* total dirty sampled page */
+    uint64_t total_sample_count; /* total sampled pages */
+    uint64_t total_block_mem_MB; /* size of total sampled pages in MB */
+} SampleVMStat;
+
+typedef struct VcpuStat {
+    int nvcpu; /* number of vcpu */
+    DirtyRateVcpu *rates; /* array of dirty rate for each vcpu */
+} VcpuStat;
+
 /*
  * Store calculation statistics for each measure.
  */
 struct DirtyRateStat {
-    uint64_t total_dirty_samples; /* total dirty sampled page */
-    uint64_t total_sample_count; /* total sampled pages */
-    uint64_t total_block_mem_MB; /* size of total sampled pages in MB */
     int64_t dirty_rate; /* dirty rate in MB/s */
     int64_t start_time; /* calculation start time in units of second */
     int64_t calc_time; /* time duration of two sampling in units of second */
     uint64_t sample_pages; /* sample pages per GB */
+    union {
+        SampleVMStat page_sampling;
+        VcpuStat dirty_ring;
+    };
 };
 
 void *get_dirtyrate_thread(void *arg);
diff --git a/qapi/migration.json b/qapi/migration.json
index 1124a2d..7395305 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1709,6 +1709,21 @@
   'data': { 'device-id': 'str' } }
 
 ##
+# @DirtyRateVcpu:
+#
+# Dirty rate of vcpu.
+#
+# @id: vcpu index.
+#
+# @dirty-rate: dirty rate.
+#
+# Since: 6.1
+#
+##
+{ 'struct': 'DirtyRateVcpu',
+  'data': { 'id': 'int', 'dirty-rate': 'int64' } }
+
+##
 # @DirtyRateStatus:
 #
 # An enumeration of dirtyrate status.
@@ -1726,6 +1741,21 @@
   'data': [ 'unstarted', 'measuring', 'measured'] }
 
 ##
+# @DirtyRateMeasureMode:
+#
+# An enumeration of mode of measuring dirtyrate.
+#
+# @page-sampling: calculate dirtyrate by sampling pages.
+#
+# @dirty-ring: calculate dirtyrate via dirty ring.
+#
+# Since: 6.1
+#
+##
+{ 'enum': 'DirtyRateMeasureMode',
+  'data': [ 'none', 'page-sampling', 'dirty-ring'] }
+
+##
 # @DirtyRateInfo:
 #
 # Information about current dirty page rate of vm.
-- 
1.8.3.1




* [PATCH v4 4/6] migration/dirtyrate: adjust order of registering thread
From: huangy81 @ 2021-06-16  1:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
	Peter Xu, Chuan Zheng, Paolo Bonzini

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Register the get_dirtyrate thread with RCU earlier, in
get_dirtyrate_thread(), so that both page-sampling and dirty-ring mode
are covered.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 migration/dirtyrate.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index 14ffac9..b97f6a5 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -351,7 +351,6 @@ static void calculate_dirtyrate(struct DirtyRateConfig config)
     int64_t msec = 0;
     int64_t initial_time;
 
-    rcu_register_thread();
     rcu_read_lock();
     initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     if (!record_ramblock_hash_info(&block_dinfo, config, &block_count)) {
@@ -374,7 +373,6 @@ static void calculate_dirtyrate(struct DirtyRateConfig config)
 out:
     rcu_read_unlock();
     free_ramblock_dirty_info(block_dinfo, block_count);
-    rcu_unregister_thread();
 }
 
 void *get_dirtyrate_thread(void *arg)
@@ -382,6 +380,7 @@ void *get_dirtyrate_thread(void *arg)
     struct DirtyRateConfig config = *(struct DirtyRateConfig *)arg;
     int ret;
     int64_t start_time;
+    rcu_register_thread();
 
     ret = dirtyrate_set_state(&CalculatingState, DIRTY_RATE_STATUS_UNSTARTED,
                               DIRTY_RATE_STATUS_MEASURING);
@@ -400,6 +399,8 @@ void *get_dirtyrate_thread(void *arg)
     if (ret == -1) {
         error_report("change dirtyrate state failed.");
     }
+
+    rcu_unregister_thread();
     return NULL;
 }
 
-- 
1.8.3.1




* [PATCH v4 5/6] migration/dirtyrate: move init step of calculation to main thread
From: huangy81 @ 2021-06-16  1:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
	Peter Xu, Chuan Zheng, Paolo Bonzini

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Since the main thread may "query dirty rate" at any time, it is better
to move the init step into the main thread so that the synchronization
overhead of the dirty stat can be reduced.

Since it is not certain that "only one QMP iothread" will hold forever,
introduce a mutex anyway to protect the dirty stat.
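
The locking pattern can be sketched outside QEMU as follows. This is
an illustration under stated assumptions: it uses a POSIX
pthread_mutex_t where the patch uses QemuMutex, and DirtyStatModel,
stat_init(), stat_publish() and stat_query() are hypothetical names.
The lock is statically initialized here, which is a design choice of
the sketch and differs from the lazy qemu_mutex_init() call in the
diff.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

struct DirtyStatModel {
    int64_t dirty_rate; /* -1 until a measurement has finished */
    int64_t start_time;
    int64_t calc_time;
};

static pthread_mutex_t stat_lock = PTHREAD_MUTEX_INITIALIZER;
static struct DirtyStatModel stat_model;

/* Called from the main thread before the worker thread is created. */
static void stat_init(int64_t start_time, int64_t calc_time)
{
    assert(calc_time > 0);
    pthread_mutex_lock(&stat_lock);
    stat_model.dirty_rate = -1;
    stat_model.start_time = start_time;
    stat_model.calc_time = calc_time;
    pthread_mutex_unlock(&stat_lock);
}

/* Called from the worker thread when the result is ready. */
static void stat_publish(int64_t dirty_rate)
{
    pthread_mutex_lock(&stat_lock);
    stat_model.dirty_rate = dirty_rate;
    pthread_mutex_unlock(&stat_lock);
}

/* Called from any thread: return a consistent snapshot by value. */
static struct DirtyStatModel stat_query(void)
{
    struct DirtyStatModel copy;

    pthread_mutex_lock(&stat_lock);
    copy = stat_model;
    pthread_mutex_unlock(&stat_lock);
    return copy;
}
```

Returning a copy keeps the critical section short and means callers
never hold a pointer into the shared struct.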

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 migration/dirtyrate.c | 34 ++++++++++++++++++++++++++++++----
 1 file changed, 30 insertions(+), 4 deletions(-)

diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index b97f6a5..d7b41bd 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -26,6 +26,8 @@
 
 static int CalculatingState = DIRTY_RATE_STATUS_UNSTARTED;
 static struct DirtyRateStat DirtyStat;
+static QemuMutex dirtyrate_lock;
+static DirtyRateMeasureMode dirtyrate_mode = DIRTY_RATE_MEASURE_MODE_NONE;
 
 static int64_t set_sample_page_period(int64_t msec, int64_t initial_time)
 {
@@ -70,6 +72,7 @@ static int dirtyrate_set_state(int *state, int old_state, int new_state)
 
 static struct DirtyRateInfo *query_dirty_rate_info(void)
 {
+    qemu_mutex_lock(&dirtyrate_lock);
     int64_t dirty_rate = DirtyStat.dirty_rate;
     struct DirtyRateInfo *info = g_malloc0(sizeof(DirtyRateInfo));
 
@@ -83,6 +86,8 @@ static struct DirtyRateInfo *query_dirty_rate_info(void)
     info->calc_time = DirtyStat.calc_time;
     info->sample_pages = DirtyStat.sample_pages;
 
+    qemu_mutex_unlock(&dirtyrate_lock);
+
     trace_query_dirty_rate_info(DirtyRateStatus_str(CalculatingState));
 
     return info;
@@ -91,6 +96,7 @@ static struct DirtyRateInfo *query_dirty_rate_info(void)
 static void init_dirtyrate_stat(int64_t start_time,
                                 struct DirtyRateConfig config)
 {
+    qemu_mutex_lock(&dirtyrate_lock);
     DirtyStat.dirty_rate = -1;
     DirtyStat.start_time = start_time;
     DirtyStat.calc_time = config.sample_period_seconds;
@@ -108,6 +114,12 @@ static void init_dirtyrate_stat(int64_t start_time,
     default:
         break;
     }
+    qemu_mutex_unlock(&dirtyrate_lock);
+}
+
+static void cleanup_dirtyrate_stat(struct DirtyRateConfig config)
+{
+    /* TODO */
 }
 
 static void update_dirtyrate_stat(struct RamblockDirtyInfo *info)
@@ -379,7 +391,6 @@ void *get_dirtyrate_thread(void *arg)
 {
     struct DirtyRateConfig config = *(struct DirtyRateConfig *)arg;
     int ret;
-    int64_t start_time;
     rcu_register_thread();
 
     ret = dirtyrate_set_state(&CalculatingState, DIRTY_RATE_STATUS_UNSTARTED,
@@ -389,9 +400,6 @@ void *get_dirtyrate_thread(void *arg)
         return NULL;
     }
 
-    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
-    init_dirtyrate_stat(start_time, config);
-
     calculate_dirtyrate(config);
 
     ret = dirtyrate_set_state(&CalculatingState, DIRTY_RATE_STATUS_MEASURING,
@@ -410,6 +418,7 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
     static struct DirtyRateConfig config;
     QemuThread thread;
     int ret;
+    int64_t start_time;
 
     /*
      * If the dirty rate is already being measured, don't attempt to start.
@@ -450,6 +459,23 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
     config.sample_period_seconds = calc_time;
     config.sample_pages_per_gigabytes = sample_pages;
     config.mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
+
+    if (unlikely(dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_NONE)) {
+        /* first time to calculate dirty rate */
+        qemu_mutex_init(&dirtyrate_lock);
+    }
+
+    cleanup_dirtyrate_stat(config);
+
+    /*
+     * update dirty rate mode so that we can figure out what mode has
+     * been used in last calculation
+     **/
+    dirtyrate_mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
+
+    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
+    init_dirtyrate_stat(start_time, config);
+
     qemu_thread_create(&thread, "get_dirtyrate", get_dirtyrate_thread,
                        (void *)&config, QEMU_THREAD_DETACHED);
 }
-- 
1.8.3.1




* [PATCH v4 6/6] migration/dirtyrate: implement dirty-ring dirtyrate calculation
From: huangy81 @ 2021-06-16  1:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
	Peter Xu, Chuan Zheng, Paolo Bonzini

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Use the dirty ring feature to implement dirtyrate calculation.

Introduce a mode option in qmp calc_dirty_rate to specify which
method should be used when calculating dirtyrate; either
page-sampling or dirty-ring should be passed.

Introduce the "dirty_ring:-r" option in hmp calc_dirty_rate to
indicate that the dirty ring method should be used for calculation.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 hmp-commands.hx        |   7 +-
 migration/dirtyrate.c  | 183 ++++++++++++++++++++++++++++++++++++++++++++++---
 migration/trace-events |   2 +
 qapi/migration.json    |  16 ++++-
 4 files changed, 195 insertions(+), 13 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 8e45bce..f7fc9d7 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1738,8 +1738,9 @@ ERST
 
     {
         .name       = "calc_dirty_rate",
-        .args_type  = "second:l,sample_pages_per_GB:l?",
-        .params     = "second [sample_pages_per_GB]",
-        .help       = "start a round of guest dirty rate measurement",
+        .args_type  = "dirty_ring:-r,second:l,sample_pages_per_GB:l?",
+        .params     = "[-r] second [sample_pages_per_GB]",
+        .help       = "start a round of guest dirty rate measurement (using -r to"
+                      "\n\t\t\t specify dirty ring as the method of calculation)",
         .cmd        = hmp_calc_dirty_rate,
     },
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index d7b41bd..7c9515b 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -16,6 +16,7 @@
 #include "cpu.h"
 #include "exec/ramblock.h"
 #include "qemu/rcu_queue.h"
+#include "qemu/main-loop.h"
 #include "qapi/qapi-commands-migration.h"
 #include "ram.h"
 #include "trace.h"
@@ -23,11 +24,20 @@
 #include "monitor/hmp.h"
 #include "monitor/monitor.h"
 #include "qapi/qmp/qdict.h"
+#include "sysemu/kvm.h"
+#include "sysemu/runstate.h"
+#include "exec/memory.h"
+
+typedef struct DirtyPageRecord {
+    uint64_t start_pages;
+    uint64_t end_pages;
+} DirtyPageRecord;
 
 static int CalculatingState = DIRTY_RATE_STATUS_UNSTARTED;
 static struct DirtyRateStat DirtyStat;
 static QemuMutex dirtyrate_lock;
 static DirtyRateMeasureMode dirtyrate_mode = DIRTY_RATE_MEASURE_MODE_NONE;
+static DirtyPageRecord *dirty_pages;
 
 static int64_t set_sample_page_period(int64_t msec, int64_t initial_time)
 {
@@ -72,9 +82,11 @@ static int dirtyrate_set_state(int *state, int old_state, int new_state)
 
 static struct DirtyRateInfo *query_dirty_rate_info(void)
 {
+    int i;
     qemu_mutex_lock(&dirtyrate_lock);
     int64_t dirty_rate = DirtyStat.dirty_rate;
     struct DirtyRateInfo *info = g_malloc0(sizeof(DirtyRateInfo));
+    DirtyRateVcpuList *head = NULL, **tail = &head;
 
     if (qatomic_read(&CalculatingState) == DIRTY_RATE_STATUS_MEASURED) {
         info->has_dirty_rate = true;
@@ -85,9 +97,22 @@ static struct DirtyRateInfo *query_dirty_rate_info(void)
     info->start_time = DirtyStat.start_time;
     info->calc_time = DirtyStat.calc_time;
     info->sample_pages = DirtyStat.sample_pages;
+    info->mode = dirtyrate_mode;
+
+    if (dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) {
+        /* set sample_pages with 0 to indicate page sampling isn't enabled */
+        info->sample_pages = 0;
+        info->has_vcpu_dirty_rate = true;
+        for (i = 0; i < DirtyStat.dirty_ring.nvcpu; i++) {
+            DirtyRateVcpu *rate = g_malloc0(sizeof(DirtyRateVcpu));
+            rate->id = DirtyStat.dirty_ring.rates[i].id;
+            rate->dirty_rate = DirtyStat.dirty_ring.rates[i].dirty_rate;
+            QAPI_LIST_APPEND(tail, rate);
+        }
+        info->vcpu_dirty_rate = head;
+    }
 
     qemu_mutex_unlock(&dirtyrate_lock);
-
     trace_query_dirty_rate_info(DirtyRateStatus_str(CalculatingState));
 
     return info;
@@ -119,7 +144,11 @@ static void init_dirtyrate_stat(int64_t start_time,
 
 static void cleanup_dirtyrate_stat(struct DirtyRateConfig config)
 {
-    /* TODO */
+    /* last calc-dirty-rate qmp use dirty ring mode */
+    if (dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) {
+        free(DirtyStat.dirty_ring.rates);
+        DirtyStat.dirty_ring.rates = NULL;
+    }
 }
 
 static void update_dirtyrate_stat(struct RamblockDirtyInfo *info)
@@ -356,7 +385,97 @@ static bool compare_page_hash_info(struct RamblockDirtyInfo *info,
     return true;
 }
 
-static void calculate_dirtyrate(struct DirtyRateConfig config)
+static void record_dirtypages(CPUState *cpu, bool start)
+{
+    if (start) {
+        dirty_pages[cpu->cpu_index].start_pages = cpu->dirty_pages;
+    } else {
+        dirty_pages[cpu->cpu_index].end_pages = cpu->dirty_pages;
+    }
+}
+
+static void dirtyrate_global_dirty_log_start(void)
+{
+    qemu_mutex_lock_iothread();
+    memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE);
+    qemu_mutex_unlock_iothread();
+}
+
+static void dirtyrate_global_dirty_log_stop(void)
+{
+    qemu_mutex_lock_iothread();
+    memory_global_dirty_log_stop(GLOBAL_DIRTY_DIRTY_RATE);
+    qemu_mutex_unlock_iothread();
+}
+
+static int64_t do_calculate_dirtyrate_vcpu(int idx)
+{
+    uint64_t memory_size_MB;
+    int64_t time_s;
+    uint64_t start_pages = dirty_pages[idx].start_pages;
+    uint64_t end_pages = dirty_pages[idx].end_pages;
+    uint64_t dirty_pages = 0;
+
+    dirty_pages = end_pages - start_pages;
+
+    memory_size_MB = (dirty_pages * TARGET_PAGE_SIZE) >> 20;
+    time_s = DirtyStat.calc_time;
+
+    trace_dirtyrate_do_calculate_vcpu(idx, dirty_pages, time_s);
+
+    return memory_size_MB / time_s;
+}
+
+static void calculate_dirtyrate_dirty_ring(struct DirtyRateConfig config)
+{
+    CPUState *cpu;
+    int64_t msec = 0;
+    int64_t start_time;
+    uint64_t dirtyrate = 0;
+    uint64_t dirtyrate_sum = 0;
+    int nvcpu = 0;
+    int i = 0;
+
+    CPU_FOREACH(cpu) {
+        nvcpu++;
+    }
+
+    dirty_pages = malloc(sizeof(*dirty_pages) * nvcpu);
+
+    DirtyStat.dirty_ring.nvcpu = nvcpu;
+    DirtyStat.dirty_ring.rates = malloc(sizeof(DirtyRateVcpu) * nvcpu);
+
+    dirtyrate_global_dirty_log_start();
+
+    CPU_FOREACH(cpu) {
+        record_dirtypages(cpu, true);
+    }
+
+    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    DirtyStat.start_time = start_time / 1000;
+
+    msec = config.sample_period_seconds * 1000;
+    msec = set_sample_page_period(msec, start_time);
+    DirtyStat.calc_time = msec / 1000;
+
+    CPU_FOREACH(cpu) {
+        record_dirtypages(cpu, false);
+    }
+
+    dirtyrate_global_dirty_log_stop();
+
+    for (i = 0; i < DirtyStat.dirty_ring.nvcpu; i++) {
+        dirtyrate = do_calculate_dirtyrate_vcpu(i);
+        DirtyStat.dirty_ring.rates[i].id = i;
+        DirtyStat.dirty_ring.rates[i].dirty_rate = dirtyrate;
+        dirtyrate_sum += dirtyrate;
+    }
+
+    DirtyStat.dirty_rate = dirtyrate_sum;
+    free(dirty_pages);
+}
+
+static void calculate_dirtyrate_sample_vm(struct DirtyRateConfig config)
 {
     struct RamblockDirtyInfo *block_dinfo = NULL;
     int block_count = 0;
@@ -387,6 +506,17 @@ out:
     free_ramblock_dirty_info(block_dinfo, block_count);
 }
 
+static void calculate_dirtyrate(struct DirtyRateConfig config)
+{
+    if (config.mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) {
+        calculate_dirtyrate_dirty_ring(config);
+    } else {
+        calculate_dirtyrate_sample_vm(config);
+    }
+
+    trace_dirtyrate_calculate(DirtyStat.dirty_rate);
+}
+
 void *get_dirtyrate_thread(void *arg)
 {
     struct DirtyRateConfig config = *(struct DirtyRateConfig *)arg;
@@ -412,8 +542,12 @@ void *get_dirtyrate_thread(void *arg)
     return NULL;
 }
 
-void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
-                         int64_t sample_pages, Error **errp)
+void qmp_calc_dirty_rate(int64_t calc_time,
+                         bool has_sample_pages,
+                         int64_t sample_pages,
+                         bool has_mode,
+                         DirtyRateMeasureMode mode,
+                         Error **errp)
 {
     static struct DirtyRateConfig config;
     QemuThread thread;
@@ -435,6 +569,15 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
         return;
     }
 
+    if (!has_mode) {
+        mode =  DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
+    }
+
+    if (has_sample_pages && mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) {
+        error_setg(errp, "either sample-pages or dirty-ring can be specified.");
+        return;
+    }
+
     if (has_sample_pages) {
         if (!is_sample_pages_valid(sample_pages)) {
             error_setg(errp, "sample-pages is out of range[%d, %d].",
@@ -447,6 +590,16 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
     }
 
     /*
+     * dirty ring mode only works when kvm dirty ring is enabled.
+     */
+    if ((mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) &&
+        !kvm_dirty_ring_enabled()) {
+        error_setg(errp, "dirty ring is disabled, use sample-pages method "
+                         "or remeasure later.");
+        return;
+    }
+
+    /*
      * Init calculation state as unstarted.
      */
     ret = dirtyrate_set_state(&CalculatingState, CalculatingState,
@@ -458,7 +611,7 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
 
     config.sample_period_seconds = calc_time;
     config.sample_pages_per_gigabytes = sample_pages;
-    config.mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
+    config.mode = mode;
 
     if (unlikely(dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_NONE)) {
         /* first time to calculate dirty rate */
@@ -471,7 +624,7 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
      * update dirty rate mode so that we can figure out what mode has
      * been used in last calculation
      **/
-    dirtyrate_mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
+    dirtyrate_mode = mode;
 
     start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
     init_dirtyrate_stat(start_time, config);
@@ -497,9 +650,18 @@ void hmp_info_dirty_rate(Monitor *mon, const QDict *qdict)
                    info->sample_pages);
     monitor_printf(mon, "Period: %"PRIi64" (sec)\n",
                    info->calc_time);
+    monitor_printf(mon, "Mode: %s\n",
+                   DirtyRateMeasureMode_str(info->mode));
     monitor_printf(mon, "Dirty rate: ");
     if (info->has_dirty_rate) {
         monitor_printf(mon, "%"PRIi64" (MB/s)\n", info->dirty_rate);
+        if (info->has_vcpu_dirty_rate) {
+            DirtyRateVcpuList *rate, *head = info->vcpu_dirty_rate;
+            for (rate = head; rate != NULL; rate = rate->next) {
+                monitor_printf(mon, "vcpu[%"PRIi64"], Dirty rate: %"PRIi64"\n",
+                               rate->value->id, rate->value->dirty_rate);
+            }
+        }
     } else {
         monitor_printf(mon, "(not ready)\n");
     }
@@ -511,6 +673,10 @@ void hmp_calc_dirty_rate(Monitor *mon, const QDict *qdict)
     int64_t sec = qdict_get_try_int(qdict, "second", 0);
     int64_t sample_pages = qdict_get_try_int(qdict, "sample_pages_per_GB", -1);
     bool has_sample_pages = (sample_pages != -1);
+    bool dirty_ring = qdict_get_try_bool(qdict, "dirty_ring", false);
+    DirtyRateMeasureMode mode =
+        (dirty_ring ? DIRTY_RATE_MEASURE_MODE_DIRTY_RING :
+         DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING);
     Error *err = NULL;
 
     if (!sec) {
@@ -518,7 +684,8 @@ void hmp_calc_dirty_rate(Monitor *mon, const QDict *qdict)
         return;
     }
 
-    qmp_calc_dirty_rate(sec, has_sample_pages, sample_pages, &err);
+    qmp_calc_dirty_rate(sec, has_sample_pages, sample_pages, true,
+                        mode, &err);
     if (err) {
         hmp_handle_error(mon, err);
         return;
diff --git a/migration/trace-events b/migration/trace-events
index 860c4f4..e51ebe1 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -330,6 +330,8 @@ get_ramblock_vfn_hash(const char *idstr, uint64_t vfn, uint32_t crc) "ramblock n
 calc_page_dirty_rate(const char *idstr, uint32_t new_crc, uint32_t old_crc) "ramblock name: %s, new crc: %" PRIu32 ", old crc: %" PRIu32
 skip_sample_ramblock(const char *idstr, uint64_t ramblock_size) "ramblock name: %s, ramblock size: %" PRIu64
 find_page_matched(const char *idstr) "ramblock %s addr or size changed"
+dirtyrate_calculate(int64_t dirtyrate) "dirty rate: %" PRIi64
+dirtyrate_do_calculate_vcpu(int idx, uint64_t pages, int64_t seconds) "vcpu[%d]: dirty %"PRIu64 " pages in %"PRIi64 " seconds"
 
 # block.c
 migration_block_init_shared(const char *blk_device_name) "Start migration for %s with shared base image"
diff --git a/qapi/migration.json b/qapi/migration.json
index 7395305..e3d21a8 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1773,6 +1773,12 @@
 # @sample-pages: page count per GB for sample dirty pages
 #                the default value is 512 (since 6.1)
 #
+# @mode: method used to calculate the dirty rate, either
+#        'page-sampling' or 'dirty-ring' (Since 6.1)
+#
+# @vcpu-dirty-rate: dirty rate for each vcpu if dirty-ring
+#                   mode is specified (Since 6.1)
+#
 # Since: 5.2
 #
 ##
@@ -1781,7 +1787,9 @@
            'status': 'DirtyRateStatus',
            'start-time': 'int64',
            'calc-time': 'int64',
-           'sample-pages': 'uint64'} }
+           'sample-pages': 'uint64',
+           'mode': 'DirtyRateMeasureMode',
+           '*vcpu-dirty-rate': [ 'DirtyRateVcpu' ] } }
 
 ##
 # @calc-dirty-rate:
@@ -1793,6 +1801,9 @@
 # @sample-pages: page count per GB for sample dirty pages
 #                the default value is 512 (since 6.1)
 #
+# @mode: method used to calculate the dirty rate, either
+#        'page-sampling' or 'dirty-ring' (Since 6.1)
+#
 # Since: 5.2
 #
 # Example:
@@ -1801,7 +1812,8 @@
 #
 ##
 { 'command': 'calc-dirty-rate', 'data': {'calc-time': 'int64',
-                                         '*sample-pages': 'int'} }
+                                         '*sample-pages': 'int',
+                                         '*mode': 'DirtyRateMeasureMode'} }
 
 ##
 # @query-dirty-rate:
-- 
1.8.3.1




* Re: [PATCH v4 2/6] memory: make global_dirty_log a bitmask
  2021-06-16  1:12 ` [PATCH v4 2/6] memory: make global_dirty_log a bitmask huangy81
@ 2021-06-16 15:22   ` Peter Xu
  2021-06-17  4:49     ` Hyman Huang
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Xu @ 2021-06-16 15:22 UTC (permalink / raw)
  To: huangy81
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini

On Wed, Jun 16, 2021 at 09:12:28AM +0800, huangy81@chinatelecom.cn wrote:
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index b114f54..e31eef2 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -55,7 +55,11 @@ static inline void fuzz_dma_read_cb(size_t addr,
>  }
>  #endif
>  
> -extern bool global_dirty_log;
> +/* what is the purpose of current dirty log, migration or dirty rate ? */

Nitpick: I'd make it:

  /* Possible bits for global_dirty_log */

  /* Dirty tracking enabled because migration is running */
  #define GLOBAL_DIRTY_MIGRATION  (1U << 0)

  /* Dirty tracking enabled because measuring dirty rate */
  #define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)

> +#define GLOBAL_DIRTY_MIGRATION  (1U << 0)
> +#define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)
> +
> +extern unsigned int global_dirty_log;
>  
>  typedef struct MemoryRegionOps MemoryRegionOps;
>  

[...]

> @@ -39,7 +39,7 @@
>  static unsigned memory_region_transaction_depth;
>  static bool memory_region_update_pending;
>  static bool ioeventfd_update_pending;
> -bool global_dirty_log;
> +unsigned int global_dirty_log;

I'm wondering whether this is a good opportunity to rename it to
global_dirty_tracking, because "logging" hints at one particular method, and
it is no longer the only one.

>  
>  static QTAILQ_HEAD(, MemoryListener) memory_listeners
>      = QTAILQ_HEAD_INITIALIZER(memory_listeners);
> @@ -2659,14 +2659,19 @@ void memory_global_after_dirty_log_sync(void)
>  
>  static VMChangeStateEntry *vmstate_change;
>  
> -void memory_global_dirty_log_start(void)
> +void memory_global_dirty_log_start(unsigned int flags)
>  {
>      if (vmstate_change) {
>          qemu_del_vm_change_state_handler(vmstate_change);
>          vmstate_change = NULL;
>      }
>  
> -    global_dirty_log = true;
> +#define  GLOBAL_DIRTY_MASK  (0x3)
> +    assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
> +    assert(global_dirty_log ^ flags);

Heh, this is probably my fault... I think what I wanted to suggest is actually:

       assert(!(global_dirty_log & flags));

Then for stop() below...

> +    global_dirty_log |= flags;
> +
> +    trace_global_dirty_changed(global_dirty_log);
>  
>      MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
>  
> @@ -2676,9 +2681,12 @@ void memory_global_dirty_log_start(void)
>      memory_region_transaction_commit();
>  }
>  
> -static void memory_global_dirty_log_do_stop(void)
> +static void memory_global_dirty_log_do_stop(unsigned int flags)
>  {
> -    global_dirty_log = false;
> +    assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));

... it should probably be:

       assert((global_dirty_log & flags) == flags);

Sorry about the confusion.
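
A minimal sketch of how the two suggested assertions behave, assuming
simplified stand-ins (dirty_log_start/dirty_log_stop are hypothetical names,
not the QEMU functions, and the listener calls are left out).  Note that the
XOR form only fires when the two values are identical, so it would not catch
starting a user whose bit is already set alongside another one:

```c
#include <assert.h>

/* Bit values as proposed in the patch under review. */
#define GLOBAL_DIRTY_MIGRATION  (1U << 0)
#define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)
/* Equivalent to the 0x3 mask in the patch. */
#define GLOBAL_DIRTY_MASK \
    (GLOBAL_DIRTY_MIGRATION | GLOBAL_DIRTY_DIRTY_RATE)

static unsigned int global_dirty_log;

/* Hypothetical stand-in for memory_global_dirty_log_start(). */
static void dirty_log_start(unsigned int flags)
{
    /* flags must be non-empty and contain only known bits */
    assert(flags && !(flags & ~GLOBAL_DIRTY_MASK));
    /* none of the requested users may already be tracking */
    assert(!(global_dirty_log & flags));
    global_dirty_log |= flags;
}

/* Hypothetical stand-in for memory_global_dirty_log_do_stop(). */
static void dirty_log_stop(unsigned int flags)
{
    assert(flags && !(flags & ~GLOBAL_DIRTY_MASK));
    /* every user being stopped must currently be tracking */
    assert((global_dirty_log & flags) == flags);
    global_dirty_log &= ~flags;
}
```

With this shape, migration and dirty rate measurement can each start and stop
independently without clearing the other's bit.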

-- 
Peter Xu




* Re: [PATCH v4 1/6] KVM: introduce dirty_pages and kvm_dirty_ring_enabled
  2021-06-16  1:12 ` [PATCH v4 1/6] KVM: introduce dirty_pages and kvm_dirty_ring_enabled huangy81
@ 2021-06-16 15:23   ` Peter Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2021-06-16 15:23 UTC (permalink / raw)
  To: huangy81
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini

On Wed, Jun 16, 2021 at 09:12:27AM +0800, huangy81@chinatelecom.cn wrote:
> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> 
> dirty_pages is used to calculate the dirty rate via the dirty ring. When
> it is enabled, kvm-reaper increases dirty_pages after gfns have been
> dirtied.
> 
> kvm_dirty_ring_enabled shows whether kvm-reaper is working. The dirtyrate
> thread can use it to check whether measurement can be based on the dirty
> ring feature.
> 
> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu




* Re: [PATCH v4 3/6] migration/dirtyrate: introduce struct and adjust DirtyRateStat
  2021-06-16  1:12 ` [PATCH v4 3/6] migration/dirtyrate: introduce struct and adjust DirtyRateStat huangy81
@ 2021-06-16 15:30   ` Peter Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2021-06-16 15:30 UTC (permalink / raw)
  To: huangy81
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini

On Wed, Jun 16, 2021 at 09:12:29AM +0800, huangy81@chinatelecom.cn wrote:
> -static void init_dirtyrate_stat(int64_t start_time, int64_t calc_time,
> -                                uint64_t sample_pages)
> +static void init_dirtyrate_stat(int64_t start_time,
> +                                struct DirtyRateConfig config)
>  {
> -    DirtyStat.total_dirty_samples = 0;
> -    DirtyStat.total_sample_count = 0;
> -    DirtyStat.total_block_mem_MB = 0;
>      DirtyStat.dirty_rate = -1;
>      DirtyStat.start_time = start_time;
> -    DirtyStat.calc_time = calc_time;
> -    DirtyStat.sample_pages = sample_pages;
> +    DirtyStat.calc_time = config.sample_period_seconds;
> +    DirtyStat.sample_pages = config.sample_pages_per_gigabytes;
> +
> +    switch (config.mode) {
> +    case DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING:
> +        DirtyStat.page_sampling.total_dirty_samples = 0;
> +        DirtyStat.page_sampling.total_sample_count = 0;
> +        DirtyStat.page_sampling.total_block_mem_MB = 0;
> +        break;
> +    case DIRTY_RATE_MEASURE_MODE_DIRTY_RING:
> +        DirtyStat.dirty_ring.nvcpu = -1;
> +        DirtyStat.dirty_ring.rates = NULL;

Missing "break"?

> +    default:

Assert here instead?

> +        break;
> +    }
>  }
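
A minimal sketch of both points (explicit break, assert on the default
branch), assuming simplified stand-in types; MeasureMode, Stat and init_stat
are hypothetical names used only for illustration, not the QEMU definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the types in the patch. */
typedef enum {
    MODE_NONE,
    MODE_PAGE_SAMPLING,
    MODE_DIRTY_RING,
} MeasureMode;

typedef struct {
    long total_dirty_samples;
    long total_sample_count;
    int nvcpu;
    void *rates;
} Stat;

static void init_stat(Stat *s, MeasureMode mode)
{
    switch (mode) {
    case MODE_PAGE_SAMPLING:
        s->total_dirty_samples = 0;
        s->total_sample_count = 0;
        break;
    case MODE_DIRTY_RING:
        s->nvcpu = -1;
        s->rates = NULL;
        break;  /* explicit break: no fall-through into default */
    default:
        /* an unknown mode is a programming error, so fail loudly */
        assert(0 && "unexpected measure mode");
    }
}
```

In QEMU code the default branch would more likely use
g_assert_not_reached() from GLib rather than a bare assert(0).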

-- 
Peter Xu




* Re: [PATCH v4 4/6] migration/dirtyrate: adjust order of registering thread
  2021-06-16  1:12 ` [PATCH v4 4/6] migration/dirtyrate: adjust order of registering thread huangy81
@ 2021-06-16 15:32   ` Peter Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2021-06-16 15:32 UTC (permalink / raw)
  To: huangy81
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini

On Wed, Jun 16, 2021 at 09:12:30AM +0800, huangy81@chinatelecom.cn wrote:
> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> 
> Register the get_dirtyrate thread in advance so that both
> page-sampling and dirty-ring modes are covered.
> 
> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu




* Re: [PATCH v4 5/6] migration/dirtyrate: move init step of calculation to main thread
  2021-06-16  1:12 ` [PATCH v4 5/6] migration/dirtyrate: move init step of calculation to main thread huangy81
@ 2021-06-16 16:47   ` Peter Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2021-06-16 16:47 UTC (permalink / raw)
  To: huangy81
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini

On Wed, Jun 16, 2021 at 09:12:31AM +0800, huangy81@chinatelecom.cn wrote:
> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> 
> Since the main thread could "query dirty rate" at any time, it is
> better to move the init step into the main thread so that the
> synchronization overhead of dirty stat can be reduced.
> 
> Since it is not certain that "only one QMP iothread" will hold true
> forever, introduce a mutex to protect dirty stat.

Sorry to have misguided you with that "only one QMP iothread" statement - that
was partly a joke.. I still think it's possible, but let's not worry too much
about that now. :)

What I really wanted to suggest is moving the init data phase into the main
thread (which you did in this patch, thanks!); then it's already very safe even
without the mutex, afaict.. so we never do a partial read of DirtyStat anymore,
which was already a "safe race" since it doesn't crash anything anyway.

Btw, I think the mutex will lose most of its usefulness too if you don't take
it in the dirty rate thread (which I think is missing in this patch).  But
before looking into that, please see below..

> 
> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> ---
>  migration/dirtyrate.c | 34 ++++++++++++++++++++++++++++++----
>  1 file changed, 30 insertions(+), 4 deletions(-)
> 
> diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
> index b97f6a5..d7b41bd 100644
> --- a/migration/dirtyrate.c
> +++ b/migration/dirtyrate.c
> @@ -26,6 +26,8 @@
>  
>  static int CalculatingState = DIRTY_RATE_STATUS_UNSTARTED;
>  static struct DirtyRateStat DirtyStat;
> +static QemuMutex dirtyrate_lock;
> +static DirtyRateMeasureMode dirtyrate_mode = DIRTY_RATE_MEASURE_MODE_NONE;
>  
>  static int64_t set_sample_page_period(int64_t msec, int64_t initial_time)
>  {
> @@ -70,6 +72,7 @@ static int dirtyrate_set_state(int *state, int old_state, int new_state)
>  
>  static struct DirtyRateInfo *query_dirty_rate_info(void)
>  {
> +    qemu_mutex_lock(&dirtyrate_lock);
>      int64_t dirty_rate = DirtyStat.dirty_rate;
>      struct DirtyRateInfo *info = g_malloc0(sizeof(DirtyRateInfo));
>  
> @@ -83,6 +86,8 @@ static struct DirtyRateInfo *query_dirty_rate_info(void)
>      info->calc_time = DirtyStat.calc_time;
>      info->sample_pages = DirtyStat.sample_pages;
>  
> +    qemu_mutex_unlock(&dirtyrate_lock);
> +
>      trace_query_dirty_rate_info(DirtyRateStatus_str(CalculatingState));
>  
>      return info;
> @@ -91,6 +96,7 @@ static struct DirtyRateInfo *query_dirty_rate_info(void)
>  static void init_dirtyrate_stat(int64_t start_time,
>                                  struct DirtyRateConfig config)
>  {
> +    qemu_mutex_lock(&dirtyrate_lock);
>      DirtyStat.dirty_rate = -1;
>      DirtyStat.start_time = start_time;
>      DirtyStat.calc_time = config.sample_period_seconds;
> @@ -108,6 +114,12 @@ static void init_dirtyrate_stat(int64_t start_time,
>      default:
>          break;
>      }
> +    qemu_mutex_unlock(&dirtyrate_lock);
> +}
> +
> +static void cleanup_dirtyrate_stat(struct DirtyRateConfig config)
> +{
> +    /* TODO */
>  }
>  
>  static void update_dirtyrate_stat(struct RamblockDirtyInfo *info)
> @@ -379,7 +391,6 @@ void *get_dirtyrate_thread(void *arg)
>  {
>      struct DirtyRateConfig config = *(struct DirtyRateConfig *)arg;
>      int ret;
> -    int64_t start_time;
>      rcu_register_thread();
>  
>      ret = dirtyrate_set_state(&CalculatingState, DIRTY_RATE_STATUS_UNSTARTED,
> @@ -389,9 +400,6 @@ void *get_dirtyrate_thread(void *arg)
>          return NULL;
>      }
>  
> -    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
> -    init_dirtyrate_stat(start_time, config);
> -
>      calculate_dirtyrate(config);
>  
>      ret = dirtyrate_set_state(&CalculatingState, DIRTY_RATE_STATUS_MEASURING,
> @@ -410,6 +418,7 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
>      static struct DirtyRateConfig config;
>      QemuThread thread;
>      int ret;
> +    int64_t start_time;
>  
>      /*
>       * If the dirty rate is already being measured, don't attempt to start.
> @@ -450,6 +459,23 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
>      config.sample_period_seconds = calc_time;
>      config.sample_pages_per_gigabytes = sample_pages;
>      config.mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
> +
> +    if (unlikely(dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_NONE)) {
> +        /* first time to calculate dirty rate */
> +        qemu_mutex_init(&dirtyrate_lock);
> +    }

Is the 'none' mode only for initializing the mutex?  If so, I'd suggest we drop
the "none" mode.  A side note is that if you want to init a mutex, AFAIU the
best way is to define this:

static void __attribute__((__constructor__)) dirty_rate_init(void)
{
        qemu_mutex_init(...);
}

But hold on..

The mutex seems to have brought more trouble than benefit already, so maybe
let's drop the mutex too, along with the "none" mode?  Let's keep this patch to
"moving init to main thread" only; IMHO that's good enough.

There's one thing that needs special care with dirty ring measurements: we
need to make sure not to reference the *vcpu pointer unless the state is
DIRTY_RATE_STATUS_MEASURED.  I'll comment on that in the next patch soon.
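
A rough sketch of that rule, assuming C11 atomics as a stand-in for QEMU's
qatomic helpers; calc_state, vcpu_rates, publish_rates and query_rate are
hypothetical names for illustration only.  The worker fills in the per-vcpu
array first and only then publishes the MEASURED state, and the query side
touches the pointer only after observing that state:

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

enum { STATUS_UNSTARTED, STATUS_MEASURING, STATUS_MEASURED };

static atomic_int calc_state = STATUS_UNSTARTED;
static int64_t *vcpu_rates;  /* only valid once state == MEASURED */
static int nvcpu;

/* Worker side: store the results, then publish them (single publish). */
static void publish_rates(const int64_t *src, int n)
{
    int64_t *buf = malloc(sizeof(*buf) * n);

    if (!buf) {
        return;  /* state stays unpublished, readers see "not ready" */
    }
    for (int i = 0; i < n; i++) {
        buf[i] = src[i];
    }
    vcpu_rates = buf;
    nvcpu = n;
    /* release: the writes above become visible before the new state */
    atomic_store_explicit(&calc_state, STATUS_MEASURED,
                          memory_order_release);
}

/* Query side: returns -1 unless results have been published. */
static int64_t query_rate(int idx)
{
    /* acquire: pairs with the release store in publish_rates() */
    if (atomic_load_explicit(&calc_state, memory_order_acquire) !=
        STATUS_MEASURED) {
        return -1;
    }
    if (idx < 0 || idx >= nvcpu) {
        return -1;
    }
    return vcpu_rates[idx];
}
```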

> +
> +    cleanup_dirtyrate_stat(config);
> +
> +    /*
> +     * update dirty rate mode so that we can figure out what mode has
> +     * been used in last calculation
> +     **/
> +    dirtyrate_mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
> +
> +    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
> +    init_dirtyrate_stat(start_time, config);
> +
>      qemu_thread_create(&thread, "get_dirtyrate", get_dirtyrate_thread,
>                         (void *)&config, QEMU_THREAD_DETACHED);
>  }
> -- 
> 1.8.3.1
> 

-- 
Peter Xu




* Re: [PATCH v4 6/6] migration/dirtyrate: implement dirty-ring dirtyrate calculation
  2021-06-16  1:12 ` [PATCH v4 6/6] migration/dirtyrate: implement dirty-ring dirtyrate calculation huangy81
@ 2021-06-16 16:56   ` Peter Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2021-06-16 16:56 UTC (permalink / raw)
  To: huangy81
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini

On Wed, Jun 16, 2021 at 09:12:32AM +0800, huangy81@chinatelecom.cn wrote:
> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> 
> Use the dirty ring feature to implement dirty rate calculation.
> 
> Introduce a mode option in qmp calc_dirty_rate to specify which
> method should be used when calculating the dirty rate; either
> page-sampling or dirty-ring can be passed.
> 
> Introduce the "dirty_ring:-r" option in hmp calc_dirty_rate to
> indicate that the dirty ring method should be used for calculation.
> 
> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Mostly looks good to me, thanks; still some more comments below.

> ---
>  hmp-commands.hx        |   7 +-
>  migration/dirtyrate.c  | 183 ++++++++++++++++++++++++++++++++++++++++++++++---
>  migration/trace-events |   2 +
>  qapi/migration.json    |  16 ++++-
>  4 files changed, 195 insertions(+), 13 deletions(-)
> 
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index 8e45bce..f7fc9d7 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1738,8 +1738,9 @@ ERST
>  
>      {
>          .name       = "calc_dirty_rate",
> -        .args_type  = "second:l,sample_pages_per_GB:l?",
> -        .params     = "second [sample_pages_per_GB]",
> -        .help       = "start a round of guest dirty rate measurement",
> +        .args_type  = "dirty_ring:-r,second:l,sample_pages_per_GB:l?",
> +        .params     = "[-r] second [sample_pages_per_GB]",
> +        .help       = "start a round of guest dirty rate measurement (using -r to"
> +                      "\n\t\t\t specify dirty ring as the method of calculation)",
>          .cmd        = hmp_calc_dirty_rate,
>      },
> diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
> index d7b41bd..7c9515b 100644
> --- a/migration/dirtyrate.c
> +++ b/migration/dirtyrate.c
> @@ -16,6 +16,7 @@
>  #include "cpu.h"
>  #include "exec/ramblock.h"
>  #include "qemu/rcu_queue.h"
> +#include "qemu/main-loop.h"
>  #include "qapi/qapi-commands-migration.h"
>  #include "ram.h"
>  #include "trace.h"
> @@ -23,11 +24,20 @@
>  #include "monitor/hmp.h"
>  #include "monitor/monitor.h"
>  #include "qapi/qmp/qdict.h"
> +#include "sysemu/kvm.h"
> +#include "sysemu/runstate.h"
> +#include "exec/memory.h"
> +
> +typedef struct DirtyPageRecord {
> +    uint64_t start_pages;
> +    uint64_t end_pages;
> +} DirtyPageRecord;
>  
>  static int CalculatingState = DIRTY_RATE_STATUS_UNSTARTED;
>  static struct DirtyRateStat DirtyStat;
>  static QemuMutex dirtyrate_lock;
>  static DirtyRateMeasureMode dirtyrate_mode = DIRTY_RATE_MEASURE_MODE_NONE;
> +static DirtyPageRecord *dirty_pages;

I think this can be a local var.  See below.

>  
>  static int64_t set_sample_page_period(int64_t msec, int64_t initial_time)
>  {
> @@ -72,9 +82,11 @@ static int dirtyrate_set_state(int *state, int old_state, int new_state)
>  
>  static struct DirtyRateInfo *query_dirty_rate_info(void)
>  {
> +    int i;
>      qemu_mutex_lock(&dirtyrate_lock);
>      int64_t dirty_rate = DirtyStat.dirty_rate;
>      struct DirtyRateInfo *info = g_malloc0(sizeof(DirtyRateInfo));
> +    DirtyRateVcpuList *head = NULL, **tail = &head;
>  
>      if (qatomic_read(&CalculatingState) == DIRTY_RATE_STATUS_MEASURED) {
>          info->has_dirty_rate = true;
> @@ -85,9 +97,22 @@ static struct DirtyRateInfo *query_dirty_rate_info(void)
>      info->start_time = DirtyStat.start_time;
>      info->calc_time = DirtyStat.calc_time;
>      info->sample_pages = DirtyStat.sample_pages;
> +    info->mode = dirtyrate_mode;
> +
> +    if (dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) {
> +        /* set sample_pages with 0 to indicate page sampling isn't enabled */
> +        info->sample_pages = 0;
> +        info->has_vcpu_dirty_rate = true;
> +        for (i = 0; i < DirtyStat.dirty_ring.nvcpu; i++) {
> +            DirtyRateVcpu *rate = g_malloc0(sizeof(DirtyRateVcpu));
> +            rate->id = DirtyStat.dirty_ring.rates[i].id;
> +            rate->dirty_rate = DirtyStat.dirty_ring.rates[i].dirty_rate;
> +            QAPI_LIST_APPEND(tail, rate);
> +        }
> +        info->vcpu_dirty_rate = head;
> +    }

I think it's nicer to move this chunk into the previous block:

    if (qatomic_read(&CalculatingState) == DIRTY_RATE_STATUS_MEASURED) {
        ...
    }

Then, as mentioned previously, I think we can drop the mutex from the previous
patch.

>  
>      qemu_mutex_unlock(&dirtyrate_lock);
> -
>      trace_query_dirty_rate_info(DirtyRateStatus_str(CalculatingState));
>  
>      return info;
> @@ -119,7 +144,11 @@ static void init_dirtyrate_stat(int64_t start_time,
>  
>  static void cleanup_dirtyrate_stat(struct DirtyRateConfig config)
>  {
> -    /* TODO */
> +    /* last calc-dirty-rate qmp use dirty ring mode */
> +    if (dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) {
> +        free(DirtyStat.dirty_ring.rates);
> +        DirtyStat.dirty_ring.rates = NULL;
> +    }
>  }
>  
>  static void update_dirtyrate_stat(struct RamblockDirtyInfo *info)
> @@ -356,7 +385,97 @@ static bool compare_page_hash_info(struct RamblockDirtyInfo *info,
>      return true;
>  }
>  
> -static void calculate_dirtyrate(struct DirtyRateConfig config)
> +static void record_dirtypages(CPUState *cpu, bool start)
> +{
> +    if (start) {
> +        dirty_pages[cpu->cpu_index].start_pages = cpu->dirty_pages;
> +    } else {
> +        dirty_pages[cpu->cpu_index].end_pages = cpu->dirty_pages;
> +    }
> +}

I suggest dropping this helper and inlining its two call sites.  More below.

> +
> +static void dirtyrate_global_dirty_log_start(void)
> +{
> +    qemu_mutex_lock_iothread();
> +    memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE);
> +    qemu_mutex_unlock_iothread();
> +}
> +
> +static void dirtyrate_global_dirty_log_stop(void)
> +{
> +    qemu_mutex_lock_iothread();
> +    memory_global_dirty_log_stop(GLOBAL_DIRTY_DIRTY_RATE);
> +    qemu_mutex_unlock_iothread();
> +}
> +
> +static int64_t do_calculate_dirtyrate_vcpu(int idx)
> +{
> +    uint64_t memory_size_MB;
> +    int64_t time_s;
> +    uint64_t start_pages = dirty_pages[idx].start_pages;
> +    uint64_t end_pages = dirty_pages[idx].end_pages;
> +    uint64_t dirty_pages = 0;
> +
> +    dirty_pages = end_pages - start_pages;
> +
> +    memory_size_MB = (dirty_pages * TARGET_PAGE_SIZE) >> 20;
> +    time_s = DirtyStat.calc_time;
> +
> +    trace_dirtyrate_do_calculate_vcpu(idx, dirty_pages, time_s);
> +
> +    return memory_size_MB / time_s;
> +}
> +
> +static void calculate_dirtyrate_dirty_ring(struct DirtyRateConfig config)
> +{
> +    CPUState *cpu;
> +    int64_t msec = 0;
> +    int64_t start_time;
> +    uint64_t dirtyrate = 0;
> +    uint64_t dirtyrate_sum = 0;
> +    int nvcpu = 0;
> +    int i = 0;
> +
> +    CPU_FOREACH(cpu) {
> +        nvcpu++;
> +    }
> +
> +    dirty_pages = malloc(sizeof(*dirty_pages) * nvcpu);

I think dirty_pages can be a local variable in this function; that should be
enough.

> +
> +    DirtyStat.dirty_ring.nvcpu = nvcpu;
> +    DirtyStat.dirty_ring.rates = malloc(sizeof(DirtyRateVcpu) * nvcpu);
> +
> +    dirtyrate_global_dirty_log_start();
> +
> +    CPU_FOREACH(cpu) {
> +        record_dirtypages(cpu, true);

If we expand the helper inline here, referencing the local dirty_pages will be
no problem.

> +    }
> +
> +    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> +    DirtyStat.start_time = start_time / 1000;
> +
> +    msec = config.sample_period_seconds * 1000;
> +    msec = set_sample_page_period(msec, start_time);
> +    DirtyStat.calc_time = msec / 1000;
> +
> +    CPU_FOREACH(cpu) {
> +        record_dirtypages(cpu, false);

Same here.

> +    }
> +
> +    dirtyrate_global_dirty_log_stop();
> +
> +    for (i = 0; i < DirtyStat.dirty_ring.nvcpu; i++) {
> +        dirtyrate = do_calculate_dirtyrate_vcpu(i);

We may need to pass dirty_pages in here too, but that should be the last change
needed to make it local.
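
A minimal sketch of the shape suggested above, assuming a local record array
with the start/end recording inlined and the array passed to the per-vcpu
helper.  DirtyPageRecord, PAGE_SIZE_BYTES and both function names are
hypothetical stand-ins, the two counter arrays stand in for reading
cpu->dirty_pages before and after the sample period, and the time_s guard is an
addition that is not in the patch:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed page size; stands in for TARGET_PAGE_SIZE. */
#define PAGE_SIZE_BYTES 4096u

/* Hypothetical stand-in for the per-vcpu record used by the patch. */
typedef struct {
    uint64_t start_pages;
    uint64_t end_pages;
} DirtyPageRecord;

/* Takes the records as a parameter instead of reading a global array. */
static int64_t vcpu_dirty_rate_mbps(const DirtyPageRecord *records, int idx,
                                    int64_t time_s)
{
    uint64_t pages = records[idx].end_pages - records[idx].start_pages;
    uint64_t size_mb = (pages * PAGE_SIZE_BYTES) >> 20;

    /* Guard against a zero period (not present in the patch). */
    return time_s > 0 ? (int64_t)(size_mb / (uint64_t)time_s) : 0;
}

/* Returns the sum of the per-vcpu rates in MB/s, or -1 if allocation fails. */
static int64_t total_dirty_rate_mbps(const uint64_t *counters_before,
                                     const uint64_t *counters_after,
                                     int nvcpu, int64_t time_s)
{
    DirtyPageRecord *records;
    int64_t sum = 0;
    int i;

    assert(nvcpu > 0);
    records = calloc((size_t)nvcpu, sizeof(*records));
    if (!records) {
        return -1;
    }

    /* Inlined "record start": one snapshot per vcpu. */
    for (i = 0; i < nvcpu; i++) {
        records[i].start_pages = counters_before[i];
    }

    /* ... the sample period would elapse here ... */

    /* Inlined "record end". */
    for (i = 0; i < nvcpu; i++) {
        records[i].end_pages = counters_after[i];
    }

    for (i = 0; i < nvcpu; i++) {
        sum += vcpu_dirty_rate_mbps(records, i, time_s);
    }

    free(records);
    return sum;
}
```

Both divisions are integer divisions, as in the patch, so a vcpu dirtying less
than 1 MB per second is reported as 0.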

> +        DirtyStat.dirty_ring.rates[i].id = i;
> +        DirtyStat.dirty_ring.rates[i].dirty_rate = dirtyrate;
> +        dirtyrate_sum += dirtyrate;
> +    }
> +
> +    DirtyStat.dirty_rate = dirtyrate_sum;
> +    free(dirty_pages);
> +}
> +
> +static void calculate_dirtyrate_sample_vm(struct DirtyRateConfig config)
>  {
>      struct RamblockDirtyInfo *block_dinfo = NULL;
>      int block_count = 0;
> @@ -387,6 +506,17 @@ out:
>      free_ramblock_dirty_info(block_dinfo, block_count);
>  }
>  
> +static void calculate_dirtyrate(struct DirtyRateConfig config)
> +{
> +    if (config.mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) {
> +        calculate_dirtyrate_dirty_ring(config);
> +    } else {
> +        calculate_dirtyrate_sample_vm(config);
> +    }
> +
> +    trace_dirtyrate_calculate(DirtyStat.dirty_rate);
> +}
> +
>  void *get_dirtyrate_thread(void *arg)
>  {
>      struct DirtyRateConfig config = *(struct DirtyRateConfig *)arg;
> @@ -412,8 +542,12 @@ void *get_dirtyrate_thread(void *arg)
>      return NULL;
>  }
>  
> -void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
> -                         int64_t sample_pages, Error **errp)
> +void qmp_calc_dirty_rate(int64_t calc_time,
> +                         bool has_sample_pages,
> +                         int64_t sample_pages,
> +                         bool has_mode,
> +                         DirtyRateMeasureMode mode,
> +                         Error **errp)
>  {
>      static struct DirtyRateConfig config;
>      QemuThread thread;
> @@ -435,6 +569,15 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
>          return;
>      }
>  
> +    if (!has_mode) {
> +        mode =  DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
> +    }
> +
> +    if (has_sample_pages && mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) {
> +        error_setg(errp, "either sample-pages or dirty-ring can be specified.");
> +        return;
> +    }
> +
>      if (has_sample_pages) {
>          if (!is_sample_pages_valid(sample_pages)) {
>              error_setg(errp, "sample-pages is out of range[%d, %d].",
> @@ -447,6 +590,16 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
>      }
>  
>      /*
> +     * dirty ring mode only works when kvm dirty ring is enabled.
> +     */
> +    if ((mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) &&
> +        !kvm_dirty_ring_enabled()) {
> +        error_setg(errp, "dirty ring is disabled, use sample-pages method "
> +                         "or remeasure later.");
> +        return;
> +    }
> +
> +    /*
>       * Init calculation state as unstarted.
>       */
>      ret = dirtyrate_set_state(&CalculatingState, CalculatingState,
> @@ -458,7 +611,7 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
>  
>      config.sample_period_seconds = calc_time;
>      config.sample_pages_per_gigabytes = sample_pages;
> -    config.mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
> +    config.mode = mode;
>  
>      if (unlikely(dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_NONE)) {
>          /* first time to calculate dirty rate */
> @@ -471,7 +624,7 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
>       * update dirty rate mode so that we can figure out what mode has
>       * been used in last calculation
>       **/
> -    dirtyrate_mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
> +    dirtyrate_mode = mode;
>  
>      start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
>      init_dirtyrate_stat(start_time, config);
> @@ -497,9 +650,18 @@ void hmp_info_dirty_rate(Monitor *mon, const QDict *qdict)
>                     info->sample_pages);
>      monitor_printf(mon, "Period: %"PRIi64" (sec)\n",
>                     info->calc_time);
> +    monitor_printf(mon, "Mode: %s\n",
> +                   DirtyRateMeasureMode_str(info->mode));
>      monitor_printf(mon, "Dirty rate: ");
>      if (info->has_dirty_rate) {
>          monitor_printf(mon, "%"PRIi64" (MB/s)\n", info->dirty_rate);
> +        if (info->has_vcpu_dirty_rate) {
> +            DirtyRateVcpuList *rate, *head = info->vcpu_dirty_rate;
> +            for (rate = head; rate != NULL; rate = rate->next) {
> +                monitor_printf(mon, "vcpu[%"PRIi64"], Dirty rate: %"PRIi64"\n",
> +                               rate->value->id, rate->value->dirty_rate);
> +            }
> +        }
>      } else {
>          monitor_printf(mon, "(not ready)\n");
>      }

Please be careful not to leak the list of vcpu results.  I think we need
something like qapi_free_DirtyRateVcpuList().
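
To illustrate the ownership concern, here is a self-contained sketch in which
every node and every value appended to the list is released by one walk over
it.  The types below only mirror the shape of the list in the patch, and
list_append() and list_free() are hypothetical helpers, not QEMU or QAPI
functions:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins mirroring the shape of the list in the patch. */
typedef struct DirtyRateVcpu {
    int64_t id;
    int64_t dirty_rate;
} DirtyRateVcpu;

typedef struct DirtyRateVcpuList {
    struct DirtyRateVcpuList *next;
    DirtyRateVcpu *value;
} DirtyRateVcpuList;

/* Appends one entry at *tail and returns the new tail pointer. */
static DirtyRateVcpuList **list_append(DirtyRateVcpuList **tail,
                                       int64_t id, int64_t rate)
{
    DirtyRateVcpuList *node;

    assert(tail != NULL);
    node = calloc(1, sizeof(*node));
    if (!node) {
        abort();
    }
    node->value = calloc(1, sizeof(*node->value));
    if (!node->value) {
        abort();
    }
    node->value->id = id;
    node->value->dirty_rate = rate;
    *tail = node;
    return &node->next;
}

/* Frees every node together with its value; returns the number freed. */
static size_t list_free(DirtyRateVcpuList *head)
{
    size_t freed = 0;

    while (head) {
        DirtyRateVcpuList *next = head->next;

        free(head->value);
        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```

Whoever ends up owning the list has to make exactly one such call once the
results have been printed or returned.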

Thanks,

-- 
Peter Xu




* Re: [PATCH v4 2/6] memory: make global_dirty_log a bitmask
  2021-06-16 15:22   ` Peter Xu
@ 2021-06-17  4:49     ` Hyman Huang
  0 siblings, 0 replies; 14+ messages in thread
From: Hyman Huang @ 2021-06-17  4:49 UTC (permalink / raw)
  To: Peter Xu
  Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini



On 2021/6/16 23:22, Peter Xu wrote:
> On Wed, Jun 16, 2021 at 09:12:28AM +0800, huangy81@chinatelecom.cn wrote:
>> diff --git a/include/exec/memory.h b/include/exec/memory.h
>> index b114f54..e31eef2 100644
>> --- a/include/exec/memory.h
>> +++ b/include/exec/memory.h
>> @@ -55,7 +55,11 @@ static inline void fuzz_dma_read_cb(size_t addr,
>>   }
>>   #endif
>>   
>> -extern bool global_dirty_log;
>> +/* what is the purpose of current dirty log, migration or dirty rate ? */
> 
> Nitpick: I'll make it:
> 
>    /* Possible bits for global_dirty_log */
> 
>    /* Dirty tracking enabled because migration is running */
>    #define GLOBAL_DIRTY_MIGRATION  (1U << 0)
> 
>    /* Dirty tracking enabled because measuring dirty rate */
>    #define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)
> 
>> +#define GLOBAL_DIRTY_MIGRATION  (1U << 0)
>> +#define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)
>> +
>> +extern unsigned int global_dirty_log;
>>   
>>   typedef struct MemoryRegionOps MemoryRegionOps;
>>   
> 
> [...]
> 
>> @@ -39,7 +39,7 @@
>>   static unsigned memory_region_transaction_depth;
>>   static bool memory_region_update_pending;
>>   static bool ioeventfd_update_pending;
>> -bool global_dirty_log;
>> +unsigned int global_dirty_log;
> 
> I'm wondering whether it's a good chance to rename it to global_dirty_tracking,
> because "logging" has a hint on the method while it's not the only one now.
Yeah, all references to global_dirty_log would have to be modified. Can this
be done in a single patch ahead of this patchset?
> 
>>   
>>   static QTAILQ_HEAD(, MemoryListener) memory_listeners
>>       = QTAILQ_HEAD_INITIALIZER(memory_listeners);
>> @@ -2659,14 +2659,19 @@ void memory_global_after_dirty_log_sync(void)
>>   
>>   static VMChangeStateEntry *vmstate_change;
>>   
>> -void memory_global_dirty_log_start(void)
>> +void memory_global_dirty_log_start(unsigned int flags)
>>   {
>>       if (vmstate_change) {
>>           qemu_del_vm_change_state_handler(vmstate_change);
>>           vmstate_change = NULL;
>>       }
>>   
>> -    global_dirty_log = true;
>> +#define  GLOBAL_DIRTY_MASK  (0x3)
>> +    assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
>> +    assert(global_dirty_log ^ flags);
> 
> Heh, this is probably my fault... I think what I wanted to suggest is actually:
> 
>         assert(!(global_dirty_log & flags));
This is more graceful if the concern is that a given reason can start dirty
tracking only once at a time. I'll pick it up in the next version.
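
A small sketch of why the two checks differ, using the flag values from the
patch.  xor_check() and and_check() are hypothetical helper names that wrap the
two assert conditions so they can be compared side by side:

```c
#include <assert.h>
#include <stdbool.h>

#define GLOBAL_DIRTY_MIGRATION  (1U << 0)
#define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)

/* Condition from the v4 patch: true when the values differ in any bit. */
static bool xor_check(unsigned int current, unsigned int flags)
{
    return (current ^ flags) != 0;
}

/* Suggested condition: true only when no requested bit is already set. */
static bool and_check(unsigned int current, unsigned int flags)
{
    return !(current & flags);
}
```

With both reasons already active (current == 0x3), a second start for
GLOBAL_DIRTY_MIGRATION passes the XOR condition because 0x3 ^ 0x1 is non-zero,
while the AND condition rejects it.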
> 
> Then for stop() below...
> 
>> +    global_dirty_log |= flags;
>> +
>> +    trace_global_dirty_changed(global_dirty_log);
>>   
>>       MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
>>   
>> @@ -2676,9 +2681,12 @@ void memory_global_dirty_log_start(void)
>>       memory_region_transaction_commit();
>>   }
>>   
>> -static void memory_global_dirty_log_do_stop(void)
>> +static void memory_global_dirty_log_do_stop(unsigned int flags)
>>   {
>> -    global_dirty_log = false;
>> +    assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
> 
> ... it should probably be:
> 
>         assert((global_dirty_log & flags) == flags);
> 
> Sorry about the confusion.
Not at all. I had not fully figured out how this bitmask works, so thanks a
lot for your guidance.
> 
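
Putting the two suggested asserts together, a minimal sketch of the start/stop
bookkeeping could look like the following.  The dirty_tracking variable and the
tracking_*() names are hypothetical, and clearing the bits with "&= ~flags" in
stop is an assumption, since that part of the patch is not quoted here:

```c
#include <assert.h>

#define GLOBAL_DIRTY_MIGRATION  (1U << 0)
#define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)
#define GLOBAL_DIRTY_MASK       (0x3)

/* Hypothetical stand-in for global_dirty_log. */
static unsigned int dirty_tracking;

static void tracking_start(unsigned int flags)
{
    /* Only known bits, and at least one of them. */
    assert(flags && !(flags & ~GLOBAL_DIRTY_MASK));
    /* None of the requested reasons may be active already. */
    assert(!(dirty_tracking & flags));

    dirty_tracking |= flags;
}

static void tracking_stop(unsigned int flags)
{
    assert(flags && !(flags & ~GLOBAL_DIRTY_MASK));
    /* Every reason being stopped must currently be active. */
    assert((dirty_tracking & flags) == flags);

    /* Assumed: stop clears only the bits of the stopping reason. */
    dirty_tracking &= ~flags;
}

/* Dirty tracking stays on while any reason is still active. */
static int tracking_active(void)
{
    return dirty_tracking != 0;
}
```

In this sketch, stopping migration while a dirty rate measurement is running
leaves tracking enabled, which is the point of turning the bool into a bitmask.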

-- 
Best regards

Hyman Huang(黄勇)



end of thread, other threads:[~2021-06-17  4:51 UTC | newest]

Thread overview: 14+ messages
2021-06-16  1:12 [PATCH v4 0/6] support dirtyrate at the granualrity of vcpu huangy81
2021-06-16  1:12 ` [PATCH v4 1/6] KVM: introduce dirty_pages and kvm_dirty_ring_enabled huangy81
2021-06-16 15:23   ` Peter Xu
2021-06-16  1:12 ` [PATCH v4 2/6] memory: make global_dirty_log a bitmask huangy81
2021-06-16 15:22   ` Peter Xu
2021-06-17  4:49     ` Hyman Huang
2021-06-16  1:12 ` [PATCH v4 3/6] migration/dirtyrate: introduce struct and adjust DirtyRateStat huangy81
2021-06-16 15:30   ` Peter Xu
2021-06-16  1:12 ` [PATCH v4 4/6] migration/dirtyrate: adjust order of registering thread huangy81
2021-06-16 15:32   ` Peter Xu
2021-06-16  1:12 ` [PATCH v4 5/6] migration/dirtyrate: move init step of calculation to main thread huangy81
2021-06-16 16:47   ` Peter Xu
2021-06-16  1:12 ` [PATCH v4 6/6] migration/dirtyrate: implement dirty-ring dirtyrate calculation huangy81
2021-06-16 16:56   ` Peter Xu
