* [PATCH v1 0/3] support dirty restraint on vCPU
From: huangy81 @ 2021-11-18  6:07 UTC
  To: qemu-devel
  Cc: David Hildenbrand, Hyman, Juan Quintela, Richard Henderson,
	Dr. David Alan Gilbert, Peter Xu, Paolo Bonzini,
	Philippe Mathieu-Daudé

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

This patchset introduces a mechanism to impose a dirty restraint
on a vCPU, aiming to keep the vCPU running at or below a dirty rate
given by the user. Dirty restraint on a vCPU may be an alternative
method to implement the convergence logic for live migration,
which in theory could improve guest memory performance during
migration compared with the traditional method.

In the current live migration implementation, the convergence
logic throttles all vCPUs of the VM, which has some side effects:
- 'read processes' on a vCPU are penalized unnecessarily
- the throttle percentage increases step by step, which seems to
  struggle to find the optimal throttle percentage when the
  dirty rate is high
- it is hard to predict the remaining migration time once the
  throttling percentage reaches 99%

To a certain extent, the dirty restraint mechanism can mitigate these
side effects by throttling at vCPU granularity during migration.

The implementation is rather straightforward. We calculate the
vCPU dirty rate periodically via the Dirty Ring mechanism, as
commit 0e21bf246 "implement dirty-ring dirtyrate calculation"
does. For each vCPU that is specified for dirty restraint,
we throttle it periodically as auto-converge does. After each round
of throttling, we compare the quota dirty rate with the current dirty
rate; if the current dirty rate is not under the quota, we increase the
throttling percentage until the current dirty rate is under the quota.
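
For illustration only, the adjustment step described above can be
sketched as a standalone function. This is a simplified reading of
dirtyrestraint_pct() in patch 2/3, not the patch code itself: the
constants are copied from the patch with shortened names, the helper
names next_throttle_pct() and min_u64() are hypothetical, and the
early-exit special cases of the patch are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Constants taken from patch 2/3; the names are shortened for this sketch. */
#define TOLERANCE_MBPS    15   /* allowed gap between quota and current, MB/s */
#define HEAVY_WATERMARK   75
#define SLIGHT_WATERMARK  90
#define HEAVY_STEP        5
#define SLIGHT_STEP       2
#define PCT_MAX           99

static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/* Hypothetical helper: derive the next throttle percentage from the last one. */
static unsigned int next_throttle_pct(unsigned int last_pct,
                                      uint64_t quota, uint64_t current)
{
    int relax = quota > current;   /* vCPU is under quota: ease off */
    uint64_t gap = relax ? quota - current : current - quota;
    uint64_t step;

    assert(last_pct <= PCT_MAX);

    if (gap <= TOLERANCE_MBPS) {
        return last_pct;                       /* close enough: keep */
    }
    if (last_pct >= SLIGHT_WATERMARK) {        /* [90, 99]: small steps */
        return relax ? last_pct - SLIGHT_STEP
                     : (unsigned int)min_u64(last_pct + SLIGHT_STEP, PCT_MAX);
    }
    if (last_pct >= HEAVY_WATERMARK) {         /* [75, 90): larger steps */
        return relax ? last_pct - HEAVY_STEP
                     : (unsigned int)min_u64(last_pct + HEAVY_STEP,
                                             SLIGHT_WATERMARK);
    }
    /* [0, 75): move by half of the relative gap */
    step = (gap * 100 / (relax ? quota : current)) / 2;
    if (relax) {
        return last_pct > step ? (unsigned int)(last_pct - step) : 0;
    }
    return (unsigned int)min_u64(last_pct + step, HEAVY_WATERMARK);
}
```

With a quota of 100MB/s and a measured rate of 400MB/s, an unthrottled
vCPU would start at 37% in this sketch, and later rounds would move in
steps of 5 and then 2 once the percentage passes 75 and 90.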

This patchset is the basis for implementing a new auto-converge method
for live migration. We introduce two QMP commands to impose/cancel
the dirty restraint on a specified vCPU, so it can also serve as an
independent API for upper-layer applications such as libvirt, which can
use it to implement the convergence logic during live migration,
supplemented with the QMP 'calc-dirty-rate' command or similar.

We post this patchset as an RFC; any corrections and suggestions about
the implementation, API, throttling algorithm or anything else are very
much appreciated!

Please review, thanks!

Best Regards!

Hyman Huang (3):
  migration/dirtyrate: implement vCPU dirtyrate calculation periodically
  cpu-throttle: implement vCPU throttle
  cpus-common: implement dirty restraint on vCPU

 cpus-common.c                   |  45 ++++++
 include/exec/memory.h           |   5 +-
 include/hw/core/cpu.h           |   7 +
 include/sysemu/cpu-throttle.h   |  21 +++
 include/sysemu/dirtyrestraint.h |  22 +++
 migration/dirtyrate.c           | 125 +++++++++++++++++
 migration/dirtyrate.h           |   2 +
 qapi/misc.json                  |  44 ++++++
 softmmu/cpu-throttle.c          | 304 ++++++++++++++++++++++++++++++++++++++++
 softmmu/trace-events            |   5 +
 softmmu/vl.c                    |   1 +
 11 files changed, 580 insertions(+), 1 deletion(-)
 create mode 100644 include/sysemu/dirtyrestraint.h

-- 
1.8.3.1




* [PATCH v1 1/3] migration/dirtyrate: implement vCPU dirtyrate calculation periodically
From: huangy81 @ 2021-11-18  6:07 UTC
  To: qemu-devel
  Cc: David Hildenbrand, Hyman, Juan Quintela, Richard Henderson,
	Dr. David Alan Gilbert, Peter Xu, Paolo Bonzini,
	Philippe Mathieu-Daudé

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Introduce a third dirty tracking method, GLOBAL_DIRTY_RESTRAINT,
to calculate the dirty rate periodically for dirty restraint.

Implement a thread that calculates the dirty rate periodically, which
will be used for dirty restraint.

Add dirtyrestraint.h to introduce the utility functions for dirty
restraint.
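
For illustration only, the per-vCPU arithmetic used by the calculation
thread can be sketched standalone as below. The helper name
dirty_rate_mbps() and the fixed 4 KiB SKETCH_PAGE_SIZE are assumptions
made for this sketch; the patch itself uses TARGET_PAGE_SIZE and the
dirty page counters recorded by record_dirtypages().

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages (the patch uses TARGET_PAGE_SIZE). */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Hypothetical helper: dirty rate in MB/s from two dirty page counters
 * sampled elapsed_ms milliseconds apart.
 */
static uint64_t dirty_rate_mbps(uint64_t start_pages, uint64_t end_pages,
                                int64_t elapsed_ms)
{
    uint64_t dirtied_mb;

    assert(end_pages >= start_pages);
    if (elapsed_ms <= 0) {
        return 0;   /* avoid dividing by zero on a degenerate interval */
    }
    /* pages -> bytes -> MB; anything below 1 MB is truncated to 0 */
    dirtied_mb = ((end_pages - start_pages) * SKETCH_PAGE_SIZE) >> 20;
    return dirtied_mb * 1000 / (uint64_t)elapsed_ms;
}
```

Note that the shift truncates: fewer than 256 dirtied 4 KiB pages in
the sampling window is reported as 0MB/s.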

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 include/exec/memory.h           |   5 +-
 include/sysemu/dirtyrestraint.h |  20 +++++++
 migration/dirtyrate.c           | 118 ++++++++++++++++++++++++++++++++++++++++
 migration/dirtyrate.h           |   2 +
 4 files changed, 144 insertions(+), 1 deletion(-)
 create mode 100644 include/sysemu/dirtyrestraint.h

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 20f1b27..565d06b 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -69,7 +69,10 @@ static inline void fuzz_dma_read_cb(size_t addr,
 /* Dirty tracking enabled because measuring dirty rate */
 #define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)
 
-#define GLOBAL_DIRTY_MASK  (0x3)
+/* Dirty tracking enabled because dirty restraint */
+#define GLOBAL_DIRTY_RESTRAINT  (1U << 2)
+
+#define GLOBAL_DIRTY_MASK  (0x7)
 
 extern unsigned int global_dirty_tracking;
 
diff --git a/include/sysemu/dirtyrestraint.h b/include/sysemu/dirtyrestraint.h
new file mode 100644
index 0000000..ca744af
--- /dev/null
+++ b/include/sysemu/dirtyrestraint.h
@@ -0,0 +1,20 @@
+/*
+ * dirty restraint helper functions
+ *
+ * Copyright (c) 2021 CHINA TELECOM CO.,LTD.
+ *
+ * Authors:
+ *  Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#ifndef QEMU_DIRTYRESTRAINT_H
+#define QEMU_DIRTYRESTRAINT_H
+
+#define DIRTYRESTRAINT_CALC_PERIOD_TIME_S   15      /* 15s */
+
+void dirtyrestraint_calc_start(void);
+
+void dirtyrestraint_calc_state_init(int max_cpus);
+#endif
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index d65e744..b453b3a 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -27,6 +27,7 @@
 #include "qapi/qmp/qdict.h"
 #include "sysemu/kvm.h"
 #include "sysemu/runstate.h"
+#include "sysemu/dirtyrestraint.h"
 #include "exec/memory.h"
 
 /*
@@ -46,6 +47,123 @@ static struct DirtyRateStat DirtyStat;
 static DirtyRateMeasureMode dirtyrate_mode =
                 DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
 
+#define DIRTYRESTRAINT_CALC_TIME_MS         1000    /* 1000ms */
+
+struct {
+    DirtyRatesData data;
+    int64_t period;
+    bool enable;
+    QemuCond ready_cond;
+    QemuMutex ready_mtx;
+    bool ready;
+} *dirtyrestraint_calc_state;
+
+static inline void record_dirtypages(DirtyPageRecord *dirty_pages,
+                                     CPUState *cpu, bool start);
+
+static void dirtyrestraint_global_dirty_log_start(void)
+{
+    qemu_mutex_lock_iothread();
+    memory_global_dirty_log_start(GLOBAL_DIRTY_RESTRAINT);
+    qemu_mutex_unlock_iothread();
+}
+
+static void dirtyrestraint_global_dirty_log_stop(void)
+{
+    qemu_mutex_lock_iothread();
+    memory_global_dirty_log_sync();
+    memory_global_dirty_log_stop(GLOBAL_DIRTY_RESTRAINT);
+    qemu_mutex_unlock_iothread();
+}
+
+static void dirtyrestraint_calc_func(void)
+{
+    CPUState *cpu;
+    DirtyPageRecord *dirty_pages;
+    int64_t start_time, end_time, calc_time;
+    DirtyRateVcpu rate;
+    int i = 0;
+
+    dirty_pages = g_malloc0(sizeof(*dirty_pages) *
+        dirtyrestraint_calc_state->data.nvcpu);
+
+    dirtyrestraint_global_dirty_log_start();
+
+    CPU_FOREACH(cpu) {
+        record_dirtypages(dirty_pages, cpu, true);
+    }
+
+    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    g_usleep(DIRTYRESTRAINT_CALC_TIME_MS * 1000);
+    end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    calc_time = end_time - start_time;
+
+    dirtyrestraint_global_dirty_log_stop();
+
+    CPU_FOREACH(cpu) {
+        record_dirtypages(dirty_pages, cpu, false);
+    }
+
+    for (i = 0; i < dirtyrestraint_calc_state->data.nvcpu; i++) {
+        uint64_t increased_dirty_pages =
+            dirty_pages[i].end_pages - dirty_pages[i].start_pages;
+        uint64_t memory_size_MB =
+            (increased_dirty_pages * TARGET_PAGE_SIZE) >> 20;
+        int64_t dirtyrate = (memory_size_MB * 1000) / calc_time;
+
+        rate.id = i;
+        rate.dirty_rate  = dirtyrate;
+        dirtyrestraint_calc_state->data.rates[i] = rate;
+
+        trace_dirtyrate_do_calculate_vcpu(i,
+            dirtyrestraint_calc_state->data.rates[i].dirty_rate);
+    }
+
+    return;
+}
+
+static void *dirtyrestraint_calc_thread(void *opaque)
+{
+    rcu_register_thread();
+
+    while (qatomic_read(&dirtyrestraint_calc_state->enable)) {
+        dirtyrestraint_calc_func();
+        dirtyrestraint_calc_state->ready = true;
+        qemu_cond_signal(&dirtyrestraint_calc_state->ready_cond);
+        sleep(dirtyrestraint_calc_state->period);
+    }
+
+    rcu_unregister_thread();
+    return NULL;
+}
+
+void dirtyrestraint_calc_start(void)
+{
+    if (likely(!qatomic_read(&dirtyrestraint_calc_state->enable))) {
+        qatomic_set(&dirtyrestraint_calc_state->enable, 1);
+        QemuThread thread;
+        qemu_thread_create(&thread, "dirtyrestraint-calc",
+            dirtyrestraint_calc_thread,
+            NULL, QEMU_THREAD_DETACHED);
+    }
+}
+
+void dirtyrestraint_calc_state_init(int max_cpus)
+{
+    dirtyrestraint_calc_state =
+        g_malloc0(sizeof(*dirtyrestraint_calc_state));
+
+    dirtyrestraint_calc_state->data.nvcpu = max_cpus;
+    dirtyrestraint_calc_state->data.rates =
+        g_malloc0(sizeof(DirtyRateVcpu) * max_cpus);
+
+    dirtyrestraint_calc_state->period =
+        DIRTYRESTRAINT_CALC_PERIOD_TIME_S;
+
+    qemu_cond_init(&dirtyrestraint_calc_state->ready_cond);
+    qemu_mutex_init(&dirtyrestraint_calc_state->ready_mtx);
+}
+
 static int64_t set_sample_page_period(int64_t msec, int64_t initial_time)
 {
     int64_t current_time;
diff --git a/migration/dirtyrate.h b/migration/dirtyrate.h
index 69d4c5b..e96acdc 100644
--- a/migration/dirtyrate.h
+++ b/migration/dirtyrate.h
@@ -70,6 +70,8 @@ typedef struct VcpuStat {
     DirtyRateVcpu *rates; /* array of dirty rate for each vcpu */
 } VcpuStat;
 
+typedef struct VcpuStat DirtyRatesData;
+
 /*
  * Store calculation statistics for each measure.
  */
-- 
1.8.3.1




* [PATCH v1 2/3] cpu-throttle: implement vCPU throttle
From: huangy81 @ 2021-11-18  6:07 UTC
  To: qemu-devel
  Cc: David Hildenbrand, Hyman, Juan Quintela, Richard Henderson,
	Dr. David Alan Gilbert, Peter Xu, Paolo Bonzini,
	Philippe Mathieu-Daudé

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Implement dirty restraint by kicking the vCPU as
auto-converge does during migration, but kick only the
specified vCPU instead of all vCPUs of the VM.

Start a thread to track the dirty restraint status
and adjust the throttle percentage dynamically depending
on the current and quota dirty rates.

Introduce the utility functions in the header for the dirty
restraint implementation.
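
For illustration only, the timing arithmetic behind a throttle
percentage can be sketched standalone as below. The two constants come
from cpu-throttle.c; the helper names throttle_sleep_ns() and
throttle_period_ns() are hypothetical and exist only in this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define TIMESLICE_NS 10000000LL   /* CPU_THROTTLE_TIMESLICE_NS: 10 ms */
#define PCT_MAX      99           /* CPU_THROTTLE_PCT_MAX */

/* Hypothetical helper: how long the vCPU sleeps each time it is kicked. */
static int64_t throttle_sleep_ns(int percentage)
{
    double pct, ratio;

    assert(percentage >= 0 && percentage <= PCT_MAX);
    pct = (double)percentage / 100;
    ratio = pct / (1 - pct);
    /* +1 ns guards against double rounding, as the patch does */
    return (int64_t)(ratio * TIMESLICE_NS + 1);
}

/* Hypothetical helper: interval between two kicks of the same vCPU. */
static int64_t throttle_period_ns(int percentage)
{
    double pct;

    assert(percentage >= 0 && percentage <= PCT_MAX);
    pct = (double)percentage / 100;
    return (int64_t)(TIMESLICE_NS / (1 - pct));
}
```

At 50% the vCPU sleeps about 10 ms out of every 20 ms period; at 75%
it sleeps about 30 ms out of every 40 ms, so the run time per period
stays at one 10 ms timeslice.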

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 include/sysemu/cpu-throttle.h   |  21 +++
 include/sysemu/dirtyrestraint.h |   2 +
 migration/dirtyrate.c           |   7 +
 softmmu/cpu-throttle.c          | 304 ++++++++++++++++++++++++++++++++++++++++
 softmmu/trace-events            |   5 +
 5 files changed, 339 insertions(+)

diff --git a/include/sysemu/cpu-throttle.h b/include/sysemu/cpu-throttle.h
index d65bdef..48215d2 100644
--- a/include/sysemu/cpu-throttle.h
+++ b/include/sysemu/cpu-throttle.h
@@ -65,4 +65,25 @@ bool cpu_throttle_active(void);
  */
 int cpu_throttle_get_percentage(void);
 
+/**
+ * dirtyrestraint_state_init:
+ *
+ * initialize global state for dirty restraint
+ */
+void dirtyrestraint_state_init(int max_cpus);
+
+/**
+ * dirtyrestraint_vcpu:
+ *
+ * impose dirty restraint on vcpu until reaching the quota dirtyrate
+ */
+void dirtyrestraint_vcpu(int cpu_index,
+                         uint64_t quota);
+/**
+ * dirtyrestraint_cancel_vcpu:
+ *
+ * cancel dirty restraint for the specified vcpu
+ */
+void dirtyrestraint_cancel_vcpu(int cpu_index);
+
 #endif /* SYSEMU_CPU_THROTTLE_H */
diff --git a/include/sysemu/dirtyrestraint.h b/include/sysemu/dirtyrestraint.h
index ca744af..b84a5c0 100644
--- a/include/sysemu/dirtyrestraint.h
+++ b/include/sysemu/dirtyrestraint.h
@@ -14,6 +14,8 @@
 
 #define DIRTYRESTRAINT_CALC_PERIOD_TIME_S   15      /* 15s */
 
+int64_t dirtyrestraint_calc_current(int cpu_index);
+
 void dirtyrestraint_calc_start(void);
 
 void dirtyrestraint_calc_state_init(int max_cpus);
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index b453b3a..26919ff 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -137,6 +137,13 @@ static void *dirtyrestraint_calc_thread(void *opaque)
     return NULL;
 }
 
+int64_t dirtyrestraint_calc_current(int cpu_index)
+{
+    DirtyRateVcpu *rates = dirtyrestraint_calc_state->data.rates;
+
+    return qatomic_read(&rates[cpu_index].dirty_rate);
+}
+
 void dirtyrestraint_calc_start(void)
 {
     if (likely(!qatomic_read(&dirtyrestraint_calc_state->enable))) {
diff --git a/softmmu/cpu-throttle.c b/softmmu/cpu-throttle.c
index 8c2144a..7a127a0 100644
--- a/softmmu/cpu-throttle.c
+++ b/softmmu/cpu-throttle.c
@@ -29,6 +29,8 @@
 #include "qemu/main-loop.h"
 #include "sysemu/cpus.h"
 #include "sysemu/cpu-throttle.h"
+#include "sysemu/dirtyrestraint.h"
+#include "trace.h"
 
 /* vcpu throttling controls */
 static QEMUTimer *throttle_timer;
@@ -38,6 +40,308 @@ static unsigned int throttle_percentage;
 #define CPU_THROTTLE_PCT_MAX 99
 #define CPU_THROTTLE_TIMESLICE_NS 10000000
 
+#define DIRTYRESTRAINT_TOLERANCE_RANGE  15      /* 15MB/s */
+
+#define DIRTYRESTRAINT_THROTTLE_HEAVY_WATERMARK     75
+#define DIRTYRESTRAINT_THROTTLE_SLIGHT_WATERMARK    90
+
+#define DIRTYRESTRAINT_THROTTLE_HEAVY_STEP_SIZE     5
+#define DIRTYRESTRAINT_THROTTLE_SLIGHT_STEP_SIZE    2
+
+typedef enum {
+    RESTRAIN_KEEP,
+    RESTRAIN_RATIO,
+    RESTRAIN_HEAVY,
+    RESTRAIN_SLIGHT,
+} RestrainPolicy;
+
+typedef struct DirtyRestraintState {
+    int cpu_index;
+    bool enabled;
+    uint64_t quota;     /* quota dirtyrate MB/s */
+    QemuThread thread;
+    char *name;         /* thread name */
+} DirtyRestraintState;
+
+struct {
+    DirtyRestraintState *states;
+    int max_cpus;
+} *dirtyrestraint_state;
+
+static inline bool dirtyrestraint_enabled(int cpu_index)
+{
+    return qatomic_read(&dirtyrestraint_state->states[cpu_index].enabled);
+}
+
+static inline void dirtyrestraint_set_quota(int cpu_index, uint64_t quota)
+{
+    qatomic_set(&dirtyrestraint_state->states[cpu_index].quota, quota);
+}
+
+static inline uint64_t dirtyrestraint_quota(int cpu_index)
+{
+    return qatomic_read(&dirtyrestraint_state->states[cpu_index].quota);
+}
+
+static int64_t dirtyrestraint_current(int cpu_index)
+{
+    return dirtyrestraint_calc_current(cpu_index);
+}
+
+static void dirtyrestraint_vcpu_thread(CPUState *cpu, run_on_cpu_data data)
+{
+    double pct;
+    double throttle_ratio;
+    int64_t sleeptime_ns, endtime_ns;
+    int *percentage = (int *)data.host_ptr;
+
+    pct = (double)(*percentage) / 100;
+    throttle_ratio = pct / (1 - pct);
+    /* Add 1ns to fix double's rounding error (like 0.9999999...) */
+    sleeptime_ns = (int64_t)(throttle_ratio * CPU_THROTTLE_TIMESLICE_NS + 1);
+    endtime_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + sleeptime_ns;
+    while (sleeptime_ns > 0 && !cpu->stop) {
+        if (sleeptime_ns > SCALE_MS) {
+            qemu_cond_timedwait_iothread(cpu->halt_cond,
+                                         sleeptime_ns / SCALE_MS);
+        } else {
+            qemu_mutex_unlock_iothread();
+            g_usleep(sleeptime_ns / SCALE_US);
+            qemu_mutex_lock_iothread();
+        }
+        sleeptime_ns = endtime_ns - qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
+    }
+    qatomic_set(&cpu->throttle_thread_scheduled, 0);
+
+    free(percentage);
+}
+
+static void do_dirtyrestraint(int cpu_index,
+                              int percentage)
+{
+    CPUState *cpu;
+    int64_t sleeptime_ns, starttime_ms, currenttime_ms;
+    int *pct_parameter;
+    double pct;
+
+    pct = (double) percentage / 100;
+
+    starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+
+    while (true) {
+        CPU_FOREACH(cpu) {
+            if ((cpu_index == cpu->cpu_index) &&
+                (!qatomic_xchg(&cpu->throttle_thread_scheduled, 1))) {
+                pct_parameter = malloc(sizeof(*pct_parameter));
+                *pct_parameter = percentage;
+                async_run_on_cpu(cpu, dirtyrestraint_vcpu_thread,
+                                 RUN_ON_CPU_HOST_PTR(pct_parameter));
+                break;
+            }
+        }
+
+        sleeptime_ns = CPU_THROTTLE_TIMESLICE_NS / (1 - pct);
+        g_usleep(sleeptime_ns / SCALE_US);
+
+        currenttime_ms = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+        if (unlikely((currenttime_ms - starttime_ms) >
+                     (DIRTYRESTRAINT_CALC_PERIOD_TIME_S * 1000))) {
+            break;
+        }
+    }
+}
+
+static uint64_t dirtyrestraint_init_pct(uint64_t quota,
+                                        uint64_t current)
+{
+    uint64_t restraint_pct = 0;
+
+    if (quota >= current || (current == 0) ||
+        ((current - quota) <= DIRTYRESTRAINT_TOLERANCE_RANGE)) {
+        restraint_pct = 0;
+    } else {
+        restraint_pct = (current - quota) * 100 / current;
+
+        restraint_pct = MIN(restraint_pct,
+            DIRTYRESTRAINT_THROTTLE_HEAVY_WATERMARK);
+    }
+
+    return restraint_pct;
+}
+
+static RestrainPolicy dirtyrestraint_policy(unsigned int last_pct,
+                                            uint64_t quota,
+                                            uint64_t current)
+{
+    uint64_t max, min;
+
+    max = MAX(quota, current);
+    min = MIN(quota, current);
+    if ((max - min) <= DIRTYRESTRAINT_TOLERANCE_RANGE) {
+        return RESTRAIN_KEEP;
+    }
+    if (last_pct < DIRTYRESTRAINT_THROTTLE_HEAVY_WATERMARK) {
+        /* last percentage locates in [0, 75)*/
+        return RESTRAIN_RATIO;
+    } else if (last_pct < DIRTYRESTRAINT_THROTTLE_SLIGHT_WATERMARK) {
+        /* last percentage locates in [75, 90)*/
+        return RESTRAIN_HEAVY;
+    } else {
+        /* last percentage locates in [90, 99]*/
+        return RESTRAIN_SLIGHT;
+    }
+}
+
+static uint64_t dirtyrestraint_pct(unsigned int last_pct,
+                                   uint64_t quota,
+                                   uint64_t current)
+{
+    uint64_t restraint_pct = 0;
+    RestrainPolicy policy;
+    bool mitigate = (quota > current) ? true : false;
+
+    if (mitigate && ((current == 0) ||
+        (last_pct <= DIRTYRESTRAINT_THROTTLE_SLIGHT_STEP_SIZE))) {
+        return 0;
+    }
+
+    policy = dirtyrestraint_policy(last_pct, quota, current);
+    switch (policy) {
+    case RESTRAIN_SLIGHT:
+        /* [90, 99] */
+        if (mitigate) {
+            restraint_pct =
+                last_pct - DIRTYRESTRAINT_THROTTLE_SLIGHT_STEP_SIZE;
+        } else {
+            restraint_pct =
+                last_pct + DIRTYRESTRAINT_THROTTLE_SLIGHT_STEP_SIZE;
+
+            restraint_pct = MIN(restraint_pct, CPU_THROTTLE_PCT_MAX);
+        }
+       break;
+    case RESTRAIN_HEAVY:
+        /* [75, 90) */
+        if (mitigate) {
+            restraint_pct =
+                last_pct - DIRTYRESTRAINT_THROTTLE_HEAVY_STEP_SIZE;
+        } else {
+            restraint_pct =
+                last_pct + DIRTYRESTRAINT_THROTTLE_HEAVY_STEP_SIZE;
+
+            restraint_pct = MIN(restraint_pct,
+                DIRTYRESTRAINT_THROTTLE_SLIGHT_WATERMARK);
+        }
+       break;
+    case RESTRAIN_RATIO:
+        /* [0, 75) */
+        if (mitigate) {
+            if (last_pct <= (((quota - current) * 100 / quota) / 2)) {
+                restraint_pct = 0;
+            } else {
+                restraint_pct = last_pct -
+                    ((quota - current) * 100 / quota) / 2;
+                restraint_pct = MAX(restraint_pct, CPU_THROTTLE_PCT_MIN);
+            }
+        } else {
+            /*
+             * increase linearly with dirtyrate
+             * but tune a little by divide it by 2
+             */
+            restraint_pct = last_pct +
+                ((current - quota) * 100 / current) / 2;
+
+            restraint_pct = MIN(restraint_pct,
+                DIRTYRESTRAINT_THROTTLE_HEAVY_WATERMARK);
+        }
+       break;
+    case RESTRAIN_KEEP:
+    default:
+       restraint_pct = last_pct;
+       break;
+    }
+
+    return restraint_pct;
+}
+
+static void *dirtyrestraint_thread(void *opaque)
+{
+    int cpu_index = *(int *)opaque;
+    uint64_t quota_dirtyrate, current_dirtyrate;
+    unsigned int last_pct = 0;
+    unsigned int pct = 0;
+
+    rcu_register_thread();
+
+    quota_dirtyrate = dirtyrestraint_quota(cpu_index);
+    current_dirtyrate = dirtyrestraint_current(cpu_index);
+
+    pct = dirtyrestraint_init_pct(quota_dirtyrate, current_dirtyrate);
+
+    do {
+        trace_dirtyrestraint_impose(cpu_index,
+            quota_dirtyrate, current_dirtyrate, pct);
+        if (pct == 0) {
+            sleep(DIRTYRESTRAINT_CALC_PERIOD_TIME_S);
+        } else {
+            last_pct = pct;
+            do_dirtyrestraint(cpu_index, pct);
+        }
+
+        quota_dirtyrate = dirtyrestraint_quota(cpu_index);
+        current_dirtyrate = dirtyrestraint_current(cpu_index);
+
+        pct = dirtyrestraint_pct(last_pct, quota_dirtyrate, current_dirtyrate);
+    } while (dirtyrestraint_enabled(cpu_index));
+
+    rcu_unregister_thread();
+
+    return NULL;
+}
+
+void dirtyrestraint_cancel_vcpu(int cpu_index)
+{
+    qatomic_set(&dirtyrestraint_state->states[cpu_index].enabled, 0);
+}
+
+void dirtyrestraint_vcpu(int cpu_index,
+                         uint64_t quota)
+{
+    trace_dirtyrestraint_vcpu(cpu_index, quota);
+
+    dirtyrestraint_set_quota(cpu_index, quota);
+
+    if (unlikely(!dirtyrestraint_enabled(cpu_index))) {
+        qatomic_set(&dirtyrestraint_state->states[cpu_index].enabled, 1);
+        dirtyrestraint_state->states[cpu_index].name =
+            g_strdup_printf("dirtyrestraint-%d", cpu_index);
+        qemu_thread_create(&dirtyrestraint_state->states[cpu_index].thread,
+            dirtyrestraint_state->states[cpu_index].name,
+            dirtyrestraint_thread,
+            (void *)&dirtyrestraint_state->states[cpu_index].cpu_index,
+            QEMU_THREAD_DETACHED);
+    }
+
+    return;
+}
+
+void dirtyrestraint_state_init(int max_cpus)
+{
+    int i;
+
+    dirtyrestraint_state = g_malloc0(sizeof(*dirtyrestraint_state));
+
+    dirtyrestraint_state->states =
+            g_malloc0(sizeof(DirtyRestraintState) * max_cpus);
+
+    for (i = 0; i < max_cpus; i++) {
+        dirtyrestraint_state->states[i].cpu_index = i;
+    }
+
+    dirtyrestraint_state->max_cpus = max_cpus;
+
+    trace_dirtyrestraint_state_init(max_cpus);
+}
+
 static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
 {
     double pct;
diff --git a/softmmu/trace-events b/softmmu/trace-events
index 9c88887..0307567 100644
--- a/softmmu/trace-events
+++ b/softmmu/trace-events
@@ -31,3 +31,8 @@ runstate_set(int current_state, const char *current_state_str, int new_state, co
 system_wakeup_request(int reason) "reason=%d"
 qemu_system_shutdown_request(int reason) "reason=%d"
 qemu_system_powerdown_request(void) ""
+
+#cpu-throttle.c
+dirtyrestraint_state_init(int max_cpus) "dirtyrate restraint init: max cpus %d"
+dirtyrestraint_impose(int cpu_index, uint64_t quota, uint64_t current, int pct) "CPU[%d] impose dirtyrate restraint: quota %" PRIu64 ", current %" PRIu64 ", percentage %d"
+dirtyrestraint_vcpu(int cpu_index, uint64_t quota) "CPU[%d] dirtyrate restraint, quota dirtyrate %"PRIu64
-- 
1.8.3.1




* [PATCH v1 3/3] cpus-common: implement dirty restraint on vCPU
From: huangy81 @ 2021-11-18  6:07 UTC
  To: qemu-devel
  Cc: David Hildenbrand, Hyman, Juan Quintela, Richard Henderson,
	Dr. David Alan Gilbert, Peter Xu, Paolo Bonzini,
	Philippe Mathieu-Daudé

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Implement periodic dirty rate calculation based on dirty-ring
and throttle the vCPU until it reaches the quota dirty rate given by
the user.

Introduce the QMP commands dirty-restraint/dirty-restraint-cancel to
impose/cancel dirty restraint on a vCPU.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
 cpus-common.c         | 45 +++++++++++++++++++++++++++++++++++++++++++++
 include/hw/core/cpu.h |  7 +++++++
 qapi/misc.json        | 44 ++++++++++++++++++++++++++++++++++++++++++++
 softmmu/vl.c          |  1 +
 4 files changed, 97 insertions(+)

diff --git a/cpus-common.c b/cpus-common.c
index 6e73d3e..3c4dbbb 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -23,6 +23,11 @@
 #include "hw/core/cpu.h"
 #include "sysemu/cpus.h"
 #include "qemu/lockable.h"
+#include "sysemu/dirtyrestraint.h"
+#include "sysemu/cpu-throttle.h"
+#include "sysemu/kvm.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-misc.h"
 
 static QemuMutex qemu_cpu_list_lock;
 static QemuCond exclusive_cond;
@@ -352,3 +357,43 @@ void process_queued_cpu_work(CPUState *cpu)
     qemu_mutex_unlock(&cpu->work_mutex);
     qemu_cond_broadcast(&qemu_work_cond);
 }
+
+void qmp_dirty_restraint(int64_t idx,
+                         uint64_t dirtyrate,
+                         Error **errp)
+{
+    if (!kvm_dirty_ring_enabled()) {
+        error_setg(errp, "dirty ring not enabled, needed by dirty restraint");
+        return;
+    }
+
+    dirtyrestraint_calc_start();
+    dirtyrestraint_vcpu(idx, dirtyrate);
+
+    return;
+}
+
+void qmp_dirty_restraint_cancel(int64_t idx,
+                                Error **errp)
+{
+    if (!kvm_dirty_ring_enabled()) {
+        error_setg(errp, "dirty ring not enabled, needed by dirty restraint");
+        return;
+    }
+
+    dirtyrestraint_cancel_vcpu(idx);
+
+    return;
+}
+
+void dirtyrestraint_setup(int max_cpus)
+{
+    if (!kvm_dirty_ring_enabled()) {
+        return;
+    }
+
+    dirtyrestraint_calc_state_init(max_cpus);
+    dirtyrestraint_state_init(max_cpus);
+
+    return;
+}
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index e948e81..d2a3978 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -881,6 +881,13 @@ void end_exclusive(void);
  */
 void qemu_init_vcpu(CPUState *cpu);
 
+/**
+ * dirtyrestraint_setup:
+ *
+ * dirtyrestraint setup.
+ */
+void dirtyrestraint_setup(int max_cpus);
+
 #define SSTEP_ENABLE  0x1  /* Enable simulated HW single stepping */
 #define SSTEP_NOIRQ   0x2  /* Do not use IRQ while single stepping */
 #define SSTEP_NOTIMER 0x4  /* Do not Timers while single stepping */
diff --git a/qapi/misc.json b/qapi/misc.json
index 358548a..6a60b2e 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -527,3 +527,47 @@
  'data': { '*option': 'str' },
  'returns': ['CommandLineOptionInfo'],
  'allow-preconfig': true }
+
+##
+# @DirtyRateQuotaVcpu:
+#
+# Dirty rate of vcpu.
+#
+# @idx: vcpu index.
+#
+# @dirtyrate: dirty rate.
+#
+# Since: 6.3
+#
+##
+{ 'struct': 'DirtyRateQuotaVcpu',
+  'data': { 'idx': 'int', 'dirtyrate': 'uint64' } }
+
+##
+# @dirty-restraint:
+#
+# Since: 6.3
+#
+# Example:
+#   {"execute": "dirty-restraint",
+#    "arguments": { "idx": "cpu-index",
+#                   "dirtyrate": "quota-dirtyrate" } }
+#
+##
+{ 'command': 'dirty-restraint',
+  'data': 'DirtyRateQuotaVcpu' }
+
+##
+# @dirty-restraint-cancel:
+#
+# @idx: vcpu index
+#
+# Since: 6.3
+#
+# Example:
+#   {"execute": "dirty-restraint-cancel",
+#    "arguments": { "idx": "cpu-index" } }
+#
+##
+{ 'command': 'dirty-restraint-cancel',
+  'data': { 'idx': 'int' } }
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 1159a64..ab914cb 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -3776,5 +3776,6 @@ void qemu_init(int argc, char **argv, char **envp)
     qemu_init_displays();
     accel_setup_post(current_machine);
     os_setup_post();
+    dirtyrestraint_setup(current_machine->smp.max_cpus);
     resume_mux_open();
 }
-- 
1.8.3.1




* Re: [PATCH v1 1/3] migration/dirtyrate: implement vCPU dirtyrate calculation periodically
From: Juan Quintela @ 2021-11-18  9:26 UTC
  To: huangy81
  Cc: David Hildenbrand, Richard Henderson, qemu-devel, Peter Xu,
	Dr. David Alan Gilbert, Paolo Bonzini,
	Philippe Mathieu-Daudé

huangy81@chinatelecom.cn wrote:
> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>
> introduce the third method GLOBAL_DIRTY_RESTRAINT of dirty
> tracking for calculate dirtyrate periodly for dirty restraint.
>
> implement thread for calculate dirtyrate periodly, which will
> be used for dirty restraint.
>
> add dirtyrestraint.h to introduce the util function for dirty
> restrain.
>
> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Some comments:

> +void dirtyrestraint_calc_start(void);
> +
> +void dirtyrestraint_calc_state_init(int max_cpus);

dirtylimit_? instead of restraint.

We have a start function, but I can't see a finish/end/stop functions.

> +#define DIRTYRESTRAINT_CALC_TIME_MS         1000    /* 1000ms */
> +
> +struct {
> +    DirtyRatesData data;
> +    int64_t period;
> +    bool enable;

Related to the previous comment.  I can see where we set enable to 1,
but nowhere do we set it back to 0, so this never finishes.

> +    QemuCond ready_cond;
> +    QemuMutex ready_mtx;

This is a question of style, but when you only have a mutex and a cond
in one struct, you can use the "cond" and "mutex" names.

But as said, it is a question of style; do it this way if you prefer.

> +static inline void record_dirtypages(DirtyPageRecord *dirty_pages,
> +                                     CPUState *cpu, bool start);

You have put the code at the beginning of the file; if you put it at the
end of it, I think you can avoid this forward declaration.

> +static void dirtyrestraint_calc_func(void)
> +{
> +    CPUState *cpu;
> +    DirtyPageRecord *dirty_pages;
> +    int64_t start_time, end_time, calc_time;
> +    DirtyRateVcpu rate;
> +    int i = 0;
> +
> +    dirty_pages = g_malloc0(sizeof(*dirty_pages) *
> +        dirtyrestraint_calc_state->data.nvcpu);
> +
> +    dirtyrestraint_global_dirty_log_start();
> +
> +    CPU_FOREACH(cpu) {
> +        record_dirtypages(dirty_pages, cpu, true);
> +    }
> +
> +    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> +    g_usleep(DIRTYRESTRAINT_CALC_TIME_MS * 1000);
> +    end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> +    calc_time = end_time - start_time;
> +
> +    dirtyrestraint_global_dirty_log_stop();
> +
> +    CPU_FOREACH(cpu) {
> +        record_dirtypages(dirty_pages, cpu, false);
> +    }
> +
> +    for (i = 0; i < dirtyrestraint_calc_state->data.nvcpu; i++) {
> +        uint64_t increased_dirty_pages =
> +            dirty_pages[i].end_pages - dirty_pages[i].start_pages;
> +        uint64_t memory_size_MB =
> +            (increased_dirty_pages * TARGET_PAGE_SIZE) >> 20;
> +        int64_t dirtyrate = (memory_size_MB * 1000) / calc_time;
> +
> +        rate.id = i;
> +        rate.dirty_rate  = dirtyrate;
> +        dirtyrestraint_calc_state->data.rates[i] = rate;
> +
> +        trace_dirtyrate_do_calculate_vcpu(i,
> +            dirtyrestraint_calc_state->data.rates[i].dirty_rate);
> +    }
> +
> +    return;

Unnecessary return;

> +}
> +
> +static void *dirtyrestraint_calc_thread(void *opaque)
> +{
> +    rcu_register_thread();
> +
> +    while (qatomic_read(&dirtyrestraint_calc_state->enable)) {
> +        dirtyrestraint_calc_func();
> +        dirtyrestraint_calc_state->ready = true;

           Do you really need this to be a global variable?  You can
           pass it via the opaque, no?

Later, Juan.



