* [PATCH 1/4] memory: introduce DIRTY_MEMORY_DIRTY_RATE dirty bits
From: huangy81 @ 2021-06-27 5:38 UTC (permalink / raw)
To: qemu-devel
Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
Peter Xu, Chuan Zheng, Paolo Bonzini
From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Introduce DIRTY_MEMORY_DIRTY_RATE dirty bits to track VM
dirty pages for calculating the dirty rate.
Since dirty rate calculation and migration may be triggered
concurrently, reusing the existing DIRTY_MEMORY_MIGRATION dirty
bits is not a good choice. Introduce fresh dirty bits for
dirty rate calculation so that both it and migration work fine.
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
include/exec/ram_addr.h | 15 ++++++++++++++-
include/exec/ramlist.h | 9 +++++----
2 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 45c9132..6070a52 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -308,6 +308,10 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
while (page < end) {
unsigned long next = MIN(end, base + DIRTY_MEMORY_BLOCK_SIZE);
+ if (unlikely(mask & (1 << DIRTY_MEMORY_DIRTY_RATE))) {
+ bitmap_set_atomic(blocks[DIRTY_MEMORY_DIRTY_RATE]->blocks[idx],
+ offset, next - page);
+ }
if (likely(mask & (1 << DIRTY_MEMORY_MIGRATION))) {
bitmap_set_atomic(blocks[DIRTY_MEMORY_MIGRATION]->blocks[idx],
offset, next - page);
@@ -370,9 +374,17 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
qatomic_or(&blocks[DIRTY_MEMORY_VGA][idx][offset], temp);
if (global_dirty_tracking) {
- qatomic_or(
+ if (global_dirty_tracking & GLOBAL_DIRTY_MIGRATION) {
+ qatomic_or(
&blocks[DIRTY_MEMORY_MIGRATION][idx][offset],
temp);
+ }
+
+ if (global_dirty_tracking & GLOBAL_DIRTY_DIRTY_RATE) {
+ qatomic_or(
+ &blocks[DIRTY_MEMORY_DIRTY_RATE][idx][offset],
+ temp);
+ }
}
if (tcg_enabled()) {
@@ -394,6 +406,7 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
if (!global_dirty_tracking) {
clients &= ~(1 << DIRTY_MEMORY_MIGRATION);
+ clients &= ~(1 << DIRTY_MEMORY_DIRTY_RATE);
}
/*
diff --git a/include/exec/ramlist.h b/include/exec/ramlist.h
index ece6497..8585f00 100644
--- a/include/exec/ramlist.h
+++ b/include/exec/ramlist.h
@@ -8,10 +8,11 @@
typedef struct RAMBlockNotifier RAMBlockNotifier;
-#define DIRTY_MEMORY_VGA 0
-#define DIRTY_MEMORY_CODE 1
-#define DIRTY_MEMORY_MIGRATION 2
-#define DIRTY_MEMORY_NUM 3 /* num of dirty bits */
+#define DIRTY_MEMORY_VGA 0
+#define DIRTY_MEMORY_CODE 1
+#define DIRTY_MEMORY_MIGRATION 2
+#define DIRTY_MEMORY_DIRTY_RATE 3
+#define DIRTY_MEMORY_NUM 4 /* num of dirty bits */
/* The dirty memory bitmap is split into fixed-size blocks to allow growth
* under RCU. The bitmap for a block can be accessed as follows:
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/4] memory: introduce DIRTY_MEMORY_DIRTY_RATE dirty bits
From: Peter Xu @ 2021-07-09 18:37 UTC (permalink / raw)
To: huangy81
Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini
On Sun, Jun 27, 2021 at 01:38:14PM +0800, huangy81@chinatelecom.cn wrote:
> @@ -370,9 +374,17 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
> qatomic_or(&blocks[DIRTY_MEMORY_VGA][idx][offset], temp);
>
> if (global_dirty_tracking) {
> - qatomic_or(
> + if (global_dirty_tracking & GLOBAL_DIRTY_MIGRATION) {
> + qatomic_or(
> &blocks[DIRTY_MEMORY_MIGRATION][idx][offset],
> temp);
> + }
> +
> + if (global_dirty_tracking & GLOBAL_DIRTY_DIRTY_RATE) {
> + qatomic_or(
> + &blocks[DIRTY_MEMORY_DIRTY_RATE][idx][offset],
> + temp);
So what I meant in the other thread is that instead of operating on this
bitmap we could just record the number of total dirty pages, just like we used
to do with rings.
PS: IIUC maybe this can even work for dirty rings, because either dirty ring
or dirty logging collects dirty bits into the slot bitmap, and it's further
aggregated here from the slot bitmaps. However merging them won't help much
because dirty ring can provide finer-grained per-vcpu dirty info, which can be
further used for per-vcpu throttling in the future. So I'm just raising this up.
> + }
> }
--
Peter Xu
* [PATCH 2/4] KVM: introduce kvm_get_manual_dirty_log_protect
From: huangy81 @ 2021-06-27 5:38 UTC (permalink / raw)
To: qemu-devel
Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
Peter Xu, Chuan Zheng, Paolo Bonzini
From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Introduce kvm_get_manual_dirty_log_protect for measuring
the dirty rate via dirty bitmap. Dirty rate calculation needs
to sync the dirty log and depends on the dirty log features.
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
accel/kvm/kvm-all.c | 6 ++++++
include/sysemu/kvm.h | 1 +
2 files changed, 7 insertions(+)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index e0e88a2..f7d9ae0 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -245,6 +245,12 @@ int kvm_get_max_memslots(void)
return s->nr_slots;
}
+uint64_t kvm_get_manual_dirty_log_protect(void)
+{
+ KVMState *s = KVM_STATE(current_accel());
+ return s->manual_dirty_log_protect;
+}
+
/* Called with KVMMemoryListener.slots_lock held */
static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
{
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 7b22aeb..b668d49 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -533,6 +533,7 @@ int kvm_set_one_reg(CPUState *cs, uint64_t id, void *source);
int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target);
struct ppc_radix_page_info *kvm_get_radix_page_info(void);
int kvm_get_max_memslots(void);
+uint64_t kvm_get_manual_dirty_log_protect(void);
/* Notify resamplefd for EOI of specific interrupts. */
void kvm_resample_fd_notify(int gsi);
--
1.8.3.1
* [PATCH 3/4] memory: introduce DIRTY_MEMORY_DIRTY_RATE dirty bits functions
From: huangy81 @ 2021-06-27 5:38 UTC (permalink / raw)
To: qemu-devel
Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
Peter Xu, Chuan Zheng, Paolo Bonzini
From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Introduce utility functions to set up the DIRTY_MEMORY_DIRTY_RATE
dirty bits, for the convenience of tracking the dirty bitmap when
calculating the dirty rate.
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
include/exec/ram_addr.h | 121 ++++++++++++++++++++++++++++++++++++++++++++++++
softmmu/physmem.c | 61 ++++++++++++++++++++++++
2 files changed, 182 insertions(+)
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 6070a52..57dc96b 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -435,6 +435,12 @@ bool cpu_physical_memory_test_and_clear_dirty(ram_addr_t start,
ram_addr_t length,
unsigned client);
+void cpu_physical_memory_dirtyrate_clear_bit(ram_addr_t start,
+ ram_addr_t length);
+
+void cpu_physical_memory_dirtyrate_reprotect_bit(ram_addr_t start,
+ ram_addr_t length);
+
DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty
(MemoryRegion *mr, hwaddr offset, hwaddr length, unsigned client);
@@ -523,5 +529,120 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
return num_dirty;
}
+
+/* Called with RCU critical section */
+static inline
+void cpu_physical_memory_dirtyrate_clear_dirty_bits(RAMBlock *rb)
+{
+ ram_addr_t addr;
+ ram_addr_t length = rb->used_length;
+ unsigned long word = BIT_WORD(rb->offset >> TARGET_PAGE_BITS);
+
+ /* start address and length is aligned at the start of a word? */
+ if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) == rb->offset &&
+ !(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
+ int k;
+ int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
+ unsigned long * const *src;
+ unsigned long idx = (word * BITS_PER_LONG) / DIRTY_MEMORY_BLOCK_SIZE;
+ unsigned long offset = BIT_WORD((word * BITS_PER_LONG) %
+ DIRTY_MEMORY_BLOCK_SIZE);
+
+ src = qatomic_rcu_read(
+ &ram_list.dirty_memory[DIRTY_MEMORY_DIRTY_RATE])->blocks;
+
+ for (k = 0; k < nr; k++) {
+ if (src[idx][offset]) {
+ qatomic_set(&src[idx][offset], 0);
+ }
+ if (++offset >= BITS_TO_LONGS(DIRTY_MEMORY_BLOCK_SIZE)) {
+ offset = 0;
+ idx++;
+ }
+ }
+ } else {
+ ram_addr_t offset = rb->offset;
+
+ for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
+ cpu_physical_memory_dirtyrate_clear_bit(addr + offset,
+ TARGET_PAGE_SIZE);
+ }
+ }
+
+ return;
+}
+
+/* Called with RCU critical section */
+static inline
+uint64_t cpu_physical_memory_dirtyrate_stat_dirty_bits(RAMBlock *rb)
+{
+ uint64_t dirty_pages = 0;
+ ram_addr_t addr;
+ ram_addr_t length = rb->used_length;
+ unsigned long word = BIT_WORD(rb->offset >> TARGET_PAGE_BITS);
+ unsigned long bits;
+
+ /* start address and length is aligned at the start of a word? */
+ if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) == rb->offset &&
+ !(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
+ int k;
+ int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
+ unsigned long * const *src;
+ unsigned long idx = (word * BITS_PER_LONG) / DIRTY_MEMORY_BLOCK_SIZE;
+ unsigned long offset = BIT_WORD((word * BITS_PER_LONG) %
+ DIRTY_MEMORY_BLOCK_SIZE);
+
+ src = qatomic_rcu_read(
+ &ram_list.dirty_memory[DIRTY_MEMORY_DIRTY_RATE])->blocks;
+
+ for (k = 0; k < nr; k++) {
+ if (src[idx][offset]) {
+ bits = qatomic_read(&src[idx][offset]);
+ dirty_pages += ctpopl(bits);
+ }
+
+ if (++offset >= BITS_TO_LONGS(DIRTY_MEMORY_BLOCK_SIZE)) {
+ offset = 0;
+ idx++;
+ }
+ }
+ } else {
+ ram_addr_t offset = rb->offset;
+
+ for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
+ if (cpu_physical_memory_get_dirty(offset + addr,
+ TARGET_PAGE_SIZE,
+ DIRTY_MEMORY_DIRTY_RATE)) {
+ dirty_pages++;
+ }
+ }
+ }
+
+ return dirty_pages;
+}
+
+static inline
+void cpu_physical_memory_dirtyrate_reset_protect(RAMBlock *rb)
+{
+ ram_addr_t addr;
+ ram_addr_t length = rb->used_length;
+ unsigned long word = BIT_WORD(rb->offset >> TARGET_PAGE_BITS);
+
+ /* start address and length is aligned at the start of a word? */
+ if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) == rb->offset &&
+ !(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
+ memory_region_clear_dirty_bitmap(rb->mr, 0, length);
+ } else {
+ ram_addr_t offset = rb->offset;
+
+ for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
+ cpu_physical_memory_dirtyrate_reprotect_bit(offset + addr,
+ TARGET_PAGE_SIZE);
+ }
+ }
+
+ return;
+}
+
#endif
#endif
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 9b171c9..d68649a 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1068,6 +1068,67 @@ bool cpu_physical_memory_test_and_clear_dirty(ram_addr_t start,
return dirty;
}
+void cpu_physical_memory_dirtyrate_clear_bit(ram_addr_t start,
+ ram_addr_t length)
+{
+ DirtyMemoryBlocks *blocks;
+ unsigned long end, page;
+ RAMBlock *ramblock;
+
+ if (length == 0) {
+ return;
+ }
+
+ end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+ page = start >> TARGET_PAGE_BITS;
+
+ WITH_RCU_READ_LOCK_GUARD() {
+ blocks =
+ qatomic_rcu_read(&ram_list.dirty_memory[DIRTY_MEMORY_DIRTY_RATE]);
+ ramblock = qemu_get_ram_block(start);
+ /* Range sanity check on the ramblock */
+ assert(start >= ramblock->offset &&
+ start + length <= ramblock->offset + ramblock->used_length);
+ while (page < end) {
+ unsigned long idx = page / DIRTY_MEMORY_BLOCK_SIZE;
+ unsigned long offset = page % DIRTY_MEMORY_BLOCK_SIZE;
+ unsigned long num = MIN(end - page,
+ DIRTY_MEMORY_BLOCK_SIZE - offset);
+
+ bitmap_clear(blocks->blocks[idx], offset, num);
+ page += num;
+ }
+ }
+
+ return;
+}
+
+void cpu_physical_memory_dirtyrate_reprotect_bit(ram_addr_t start,
+ ram_addr_t length)
+{
+ unsigned long end, start_page;
+ RAMBlock *ramblock;
+ uint64_t mr_offset, mr_size;
+
+ if (length == 0) {
+ return;
+ }
+
+ end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+ start_page = start >> TARGET_PAGE_BITS;
+
+ ramblock = qemu_get_ram_block(start);
+ /* Range sanity check on the ramblock */
+ assert(start >= ramblock->offset &&
+ start + length <= ramblock->offset + ramblock->used_length);
+
+ mr_offset = (ram_addr_t)(start_page << TARGET_PAGE_BITS) - ramblock->offset;
+ mr_size = (end - start_page) << TARGET_PAGE_BITS;
+ memory_region_clear_dirty_bitmap(ramblock->mr, mr_offset, mr_size);
+
+ return;
+}
+
DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty
(MemoryRegion *mr, hwaddr offset, hwaddr length, unsigned client)
{
--
1.8.3.1
* Re: [PATCH 3/4] memory: introduce DIRTY_MEMORY_DIRTY_RATE dirty bits functions
From: Peter Xu @ 2021-07-09 18:26 UTC (permalink / raw)
To: huangy81
Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini
On Sun, Jun 27, 2021 at 01:38:16PM +0800, huangy81@chinatelecom.cn wrote:
> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>
> Introduce utility functions to set up the DIRTY_MEMORY_DIRTY_RATE
> dirty bits, for the convenience of tracking the dirty bitmap when
> calculating the dirty rate.
>
> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> ---
> include/exec/ram_addr.h | 121 ++++++++++++++++++++++++++++++++++++++++++++++++
> softmmu/physmem.c | 61 ++++++++++++++++++++++++
> 2 files changed, 182 insertions(+)
>
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index 6070a52..57dc96b 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -435,6 +435,12 @@ bool cpu_physical_memory_test_and_clear_dirty(ram_addr_t start,
> ram_addr_t length,
> unsigned client);
>
> +void cpu_physical_memory_dirtyrate_clear_bit(ram_addr_t start,
> + ram_addr_t length);
> +
> +void cpu_physical_memory_dirtyrate_reprotect_bit(ram_addr_t start,
> + ram_addr_t length);
> +
> DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty
> (MemoryRegion *mr, hwaddr offset, hwaddr length, unsigned client);
>
> @@ -523,5 +529,120 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
>
> return num_dirty;
> }
> +
> +/* Called with RCU critical section */
> +static inline
> +void cpu_physical_memory_dirtyrate_clear_dirty_bits(RAMBlock *rb)
> +{
> + ram_addr_t addr;
> + ram_addr_t length = rb->used_length;
> + unsigned long word = BIT_WORD(rb->offset >> TARGET_PAGE_BITS);
> +
> + /* start address and length is aligned at the start of a word? */
> + if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) == rb->offset &&
> + !(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
> + int k;
> + int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
> + unsigned long * const *src;
> + unsigned long idx = (word * BITS_PER_LONG) / DIRTY_MEMORY_BLOCK_SIZE;
> + unsigned long offset = BIT_WORD((word * BITS_PER_LONG) %
> + DIRTY_MEMORY_BLOCK_SIZE);
> +
> + src = qatomic_rcu_read(
> + &ram_list.dirty_memory[DIRTY_MEMORY_DIRTY_RATE])->blocks;
> +
> + for (k = 0; k < nr; k++) {
> + if (src[idx][offset]) {
> + qatomic_set(&src[idx][offset], 0);
> + }
> + if (++offset >= BITS_TO_LONGS(DIRTY_MEMORY_BLOCK_SIZE)) {
> + offset = 0;
> + idx++;
> + }
> + }
> + } else {
> + ram_addr_t offset = rb->offset;
> +
> + for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
> + cpu_physical_memory_dirtyrate_clear_bit(addr + offset,
> + TARGET_PAGE_SIZE);
> + }
> + }
> +
> + return;
> +}
> +
> +/* Called with RCU critical section */
> +static inline
> +uint64_t cpu_physical_memory_dirtyrate_stat_dirty_bits(RAMBlock *rb)
> +{
> + uint64_t dirty_pages = 0;
> + ram_addr_t addr;
> + ram_addr_t length = rb->used_length;
> + unsigned long word = BIT_WORD(rb->offset >> TARGET_PAGE_BITS);
> + unsigned long bits;
> +
> + /* start address and length is aligned at the start of a word? */
> + if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) == rb->offset &&
> + !(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
> + int k;
> + int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
> + unsigned long * const *src;
> + unsigned long idx = (word * BITS_PER_LONG) / DIRTY_MEMORY_BLOCK_SIZE;
> + unsigned long offset = BIT_WORD((word * BITS_PER_LONG) %
> + DIRTY_MEMORY_BLOCK_SIZE);
> +
> + src = qatomic_rcu_read(
> + &ram_list.dirty_memory[DIRTY_MEMORY_DIRTY_RATE])->blocks;
> +
> + for (k = 0; k < nr; k++) {
> + if (src[idx][offset]) {
> + bits = qatomic_read(&src[idx][offset]);
> + dirty_pages += ctpopl(bits);
> + }
> +
> + if (++offset >= BITS_TO_LONGS(DIRTY_MEMORY_BLOCK_SIZE)) {
> + offset = 0;
> + idx++;
> + }
> + }
> + } else {
> + ram_addr_t offset = rb->offset;
> +
> + for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
> + if (cpu_physical_memory_get_dirty(offset + addr,
> + TARGET_PAGE_SIZE,
> + DIRTY_MEMORY_DIRTY_RATE)) {
> + dirty_pages++;
> + }
> + }
> + }
> +
> + return dirty_pages;
> +}
If my understanding in the cover letter was correct, all the code up to here
can be dropped if we go without the extra bitmap.
> +
> +static inline
> +void cpu_physical_memory_dirtyrate_reset_protect(RAMBlock *rb)
> +{
> + ram_addr_t addr;
> + ram_addr_t length = rb->used_length;
> + unsigned long word = BIT_WORD(rb->offset >> TARGET_PAGE_BITS);
> +
> + /* start address and length is aligned at the start of a word? */
> + if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) == rb->offset &&
> + !(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
> + memory_region_clear_dirty_bitmap(rb->mr, 0, length);
> + } else {
> + ram_addr_t offset = rb->offset;
> +
> + for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
> + cpu_physical_memory_dirtyrate_reprotect_bit(offset + addr,
> + TARGET_PAGE_SIZE);
> + }
> + }
> +
> + return;
> +}
Confused why we need this complexity. Can we unconditionally do:
static inline
void cpu_physical_memory_dirtyrate_reset_protect(RAMBlock *rb)
{
memory_region_clear_dirty_bitmap(rb->mr, 0, rb->used_length);
}
?
Then we can even drop the helper, maybe?
The functions below also seem droppable without the dirty rate
bitmap. Then, maybe, this patch is not needed at all.
> +
> #endif
> #endif
> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
> index 9b171c9..d68649a 100644
> --- a/softmmu/physmem.c
> +++ b/softmmu/physmem.c
> @@ -1068,6 +1068,67 @@ bool cpu_physical_memory_test_and_clear_dirty(ram_addr_t start,
> return dirty;
> }
>
> +void cpu_physical_memory_dirtyrate_clear_bit(ram_addr_t start,
> + ram_addr_t length)
> +{
> + DirtyMemoryBlocks *blocks;
> + unsigned long end, page;
> + RAMBlock *ramblock;
> +
> + if (length == 0) {
> + return;
> + }
> +
> + end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
> + page = start >> TARGET_PAGE_BITS;
> +
> + WITH_RCU_READ_LOCK_GUARD() {
> + blocks =
> + qatomic_rcu_read(&ram_list.dirty_memory[DIRTY_MEMORY_DIRTY_RATE]);
> + ramblock = qemu_get_ram_block(start);
> + /* Range sanity check on the ramblock */
> + assert(start >= ramblock->offset &&
> + start + length <= ramblock->offset + ramblock->used_length);
> + while (page < end) {
> + unsigned long idx = page / DIRTY_MEMORY_BLOCK_SIZE;
> + unsigned long offset = page % DIRTY_MEMORY_BLOCK_SIZE;
> + unsigned long num = MIN(end - page,
> + DIRTY_MEMORY_BLOCK_SIZE - offset);
> +
> + bitmap_clear(blocks->blocks[idx], offset, num);
> + page += num;
> + }
> + }
> +
> + return;
> +}
> +
> +void cpu_physical_memory_dirtyrate_reprotect_bit(ram_addr_t start,
> + ram_addr_t length)
> +{
> + unsigned long end, start_page;
> + RAMBlock *ramblock;
> + uint64_t mr_offset, mr_size;
> +
> + if (length == 0) {
> + return;
> + }
> +
> + end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
> + start_page = start >> TARGET_PAGE_BITS;
> +
> + ramblock = qemu_get_ram_block(start);
> + /* Range sanity check on the ramblock */
> + assert(start >= ramblock->offset &&
> + start + length <= ramblock->offset + ramblock->used_length);
> +
> + mr_offset = (ram_addr_t)(start_page << TARGET_PAGE_BITS) - ramblock->offset;
> + mr_size = (end - start_page) << TARGET_PAGE_BITS;
> + memory_region_clear_dirty_bitmap(ramblock->mr, mr_offset, mr_size);
> +
> + return;
> +}
> +
> DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty
> (MemoryRegion *mr, hwaddr offset, hwaddr length, unsigned client)
> {
> --
> 1.8.3.1
>
--
Peter Xu
* [PATCH 4/4] migration/dirtyrate: implement dirty-bitmap dirtyrate calculation
From: huangy81 @ 2021-06-27 5:38 UTC (permalink / raw)
To: qemu-devel
Cc: Eduardo Habkost, Juan Quintela, Hyman, Dr. David Alan Gilbert,
Peter Xu, Chuan Zheng, Paolo Bonzini
From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Introduce dirty-bitmap mode as the third method of calc-dirty-rate.
Implement dirty-bitmap dirty rate calculation, which can be used
to measure the dirty rate in the absence of dirty ring.
Introduce the "dirty_bitmap:-b" option in hmp calc_dirty_rate to
indicate that the dirty bitmap method should be used for calculation.
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
---
hmp-commands.hx | 9 +--
migration/dirtyrate.c | 147 ++++++++++++++++++++++++++++++++++++++++++++++---
migration/trace-events | 2 +
qapi/migration.json | 6 +-
4 files changed, 150 insertions(+), 14 deletions(-)
diff --git a/hmp-commands.hx b/hmp-commands.hx
index f7fc9d7..605973c 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1738,9 +1738,10 @@ ERST
{
.name = "calc_dirty_rate",
- .args_type = "dirty_ring:-r,second:l,sample_pages_per_GB:l?",
- .params = "[-r] second [sample_pages_per_GB]",
- .help = "start a round of guest dirty rate measurement (using -d to"
- "\n\t\t\t specify dirty ring as the method of calculation)",
+ .args_type = "dirty_ring:-r,dirty_bitmap:-b,second:l,sample_pages_per_GB:l?",
+ .params = "[-r] [-b] second [sample_pages_per_GB]",
+ .help = "start a round of guest dirty rate measurement (using -r to"
+ "\n\t\t\t specify dirty ring as the method of calculation and"
+ "\n\t\t\t -b to specify dirty bitmap as method of calculation)",
.cmd = hmp_calc_dirty_rate,
},
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index 3c8c5e2..bf465ce 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -15,6 +15,7 @@
#include "qapi/error.h"
#include "cpu.h"
#include "exec/ramblock.h"
+#include "exec/ram_addr.h"
#include "qemu/rcu_queue.h"
#include "qemu/main-loop.h"
#include "qapi/qapi-commands-migration.h"
@@ -111,6 +112,10 @@ static struct DirtyRateInfo *query_dirty_rate_info(void)
}
info->vcpu_dirty_rate = head;
}
+
+ if (dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_DIRTY_BITMAP) {
+ info->sample_pages = 0;
+ }
}
trace_query_dirty_rate_info(DirtyRateStatus_str(CalculatingState));
@@ -421,6 +426,114 @@ static int64_t do_calculate_dirtyrate_vcpu(DirtyPageRecord dirty_pages)
return memory_size_MB / time_s;
}
+static inline void dirtyrate_dirty_bits_clear(void)
+{
+ RAMBlock *block = NULL;
+
+ WITH_RCU_READ_LOCK_GUARD() {
+ RAMBLOCK_FOREACH_MIGRATABLE(block) {
+ cpu_physical_memory_dirtyrate_clear_dirty_bits(block);
+ }
+ }
+}
+
+static inline uint64_t dirtyrate_dirty_bits_reap(void)
+{
+ RAMBlock *block = NULL;
+ uint64_t dirty_pages = 0;
+ uint64_t total_pages = 0;
+
+ WITH_RCU_READ_LOCK_GUARD() {
+ RAMBLOCK_FOREACH_MIGRATABLE(block) {
+ dirty_pages = cpu_physical_memory_dirtyrate_stat_dirty_bits(block);
+ total_pages += dirty_pages;
+
+ trace_dirtyrate_bitmap_reap(block->idstr, dirty_pages);
+ }
+ }
+
+ return total_pages;
+}
+
+static inline void dirtyrate_manual_reset_protect(void)
+{
+ RAMBlock *block = NULL;
+
+ WITH_RCU_READ_LOCK_GUARD() {
+ RAMBLOCK_FOREACH_MIGRATABLE(block) {
+ cpu_physical_memory_dirtyrate_reset_protect(block);
+ }
+ }
+}
+
+static void do_calculate_dirtyrate_bitmap(void)
+{
+ uint64_t memory_size_MB;
+ int64_t time_s;
+ uint64_t increased_dirty_pages = 0;
+ uint64_t dirtyrate = 0;
+
+ increased_dirty_pages = dirtyrate_dirty_bits_reap();
+
+ memory_size_MB = (increased_dirty_pages * TARGET_PAGE_SIZE) >> 20;
+ time_s = DirtyStat.calc_time;
+
+ dirtyrate = memory_size_MB / time_s;
+ DirtyStat.dirty_rate = dirtyrate;
+
+ trace_dirtyrate_do_calculate_bitmap(increased_dirty_pages,
+ time_s, dirtyrate);
+}
+
+static void calculate_dirtyrate_dirty_bitmap(struct DirtyRateConfig config)
+{
+ int64_t msec = 0;
+ int64_t start_time;
+ uint64_t protect_flags = 0;
+ uint64_t initially_set = 0;
+ uint64_t protect_only = 0;
+
+ protect_flags = kvm_get_manual_dirty_log_protect();
+ protect_only = KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE;
+ initially_set =
+ (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | KVM_DIRTY_LOG_INITIALLY_SET);
+
+ dirtyrate_global_dirty_log_start();
+
+ /* absence of migration */
+ if (!(global_dirty_tracking & GLOBAL_DIRTY_MIGRATION)) {
+ if ((protect_flags & initially_set) == initially_set) {
+ /* skip the first round, which returns all-ones bits */
+ memory_global_dirty_log_sync();
+ /*
+ * reset page protection manually and
+ * start dirty tracking from now on
+ */
+ dirtyrate_manual_reset_protect();
+ }
+ }
+
+ dirtyrate_dirty_bits_clear();
+
+ start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+ DirtyStat.start_time = start_time / 1000;
+
+ msec = config.sample_period_seconds * 1000;
+ msec = set_sample_page_period(msec, start_time);
+ DirtyStat.calc_time = msec / 1000;
+
+ /* fetch dirty bitmap from kvm */
+ memory_global_dirty_log_sync();
+
+ do_calculate_dirtyrate_bitmap();
+
+ if (protect_flags & protect_only) {
+ dirtyrate_manual_reset_protect();
+ }
+
+ dirtyrate_global_dirty_log_stop();
+}
+
static void calculate_dirtyrate_dirty_ring(struct DirtyRateConfig config)
{
CPUState *cpu;
@@ -506,7 +619,9 @@ out:
static void calculate_dirtyrate(struct DirtyRateConfig config)
{
- if (config.mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) {
+ if (config.mode == DIRTY_RATE_MEASURE_MODE_DIRTY_BITMAP) {
+ calculate_dirtyrate_dirty_bitmap(config);
+ } else if (config.mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) {
calculate_dirtyrate_dirty_ring(config);
} else {
calculate_dirtyrate_sample_vm(config);
@@ -589,11 +704,14 @@ void qmp_calc_dirty_rate(int64_t calc_time,
/*
* dirty ring mode only works when kvm dirty ring is enabled.
+ * conversely, dirty bitmap mode only works when it is disabled.
*/
- if ((mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) &&
- !kvm_dirty_ring_enabled()) {
- error_setg(errp, "dirty ring is disabled, use sample-pages method "
- "or remeasure later.");
+ if (((mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) &&
+ !kvm_dirty_ring_enabled()) ||
+ ((mode == DIRTY_RATE_MEASURE_MODE_DIRTY_BITMAP) &&
+ kvm_dirty_ring_enabled())) {
+ error_setg(errp, "mode %s is not enabled, use other method instead.",
+ DirtyRateMeasureMode_str(mode));
return;
}
@@ -670,9 +788,8 @@ void hmp_calc_dirty_rate(Monitor *mon, const QDict *qdict)
int64_t sample_pages = qdict_get_try_int(qdict, "sample_pages_per_GB", -1);
bool has_sample_pages = (sample_pages != -1);
bool dirty_ring = qdict_get_try_bool(qdict, "dirty_ring", false);
- DirtyRateMeasureMode mode =
- (dirty_ring ? DIRTY_RATE_MEASURE_MODE_DIRTY_RING :
- DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING);
+ bool dirty_bitmap = qdict_get_try_bool(qdict, "dirty_bitmap", false);
+ DirtyRateMeasureMode mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
Error *err = NULL;
if (!sec) {
@@ -680,6 +797,20 @@ void hmp_calc_dirty_rate(Monitor *mon, const QDict *qdict)
return;
}
+ if (dirty_ring && dirty_bitmap) {
+ monitor_printf(mon, "Either dirty ring or dirty bitmap "
+ "can be specified!\n");
+ return;
+ }
+
+ if (dirty_bitmap) {
+ mode = DIRTY_RATE_MEASURE_MODE_DIRTY_BITMAP;
+ }
+
+ if (dirty_ring) {
+ mode = DIRTY_RATE_MEASURE_MODE_DIRTY_RING;
+ }
+
qmp_calc_dirty_rate(sec, has_sample_pages, sample_pages, true,
mode, &err);
if (err) {
diff --git a/migration/trace-events b/migration/trace-events
index ce0b5e6..e19e3e1 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -333,6 +333,8 @@ skip_sample_ramblock(const char *idstr, uint64_t ramblock_size) "ramblock name:
find_page_matched(const char *idstr) "ramblock %s addr or size changed"
dirtyrate_calculate(int64_t dirtyrate) "dirty rate: %" PRIi64 " MB/s"
dirtyrate_do_calculate_vcpu(int idx, uint64_t rate) "vcpu[%d]: %"PRIu64 " MB/s"
+dirtyrate_do_calculate_bitmap(uint64_t pages, int64_t time, uint64_t rate) "%"PRIu64 " pages have increased in %"PRIi64 " s, rate %"PRIu64 " MB/s"
+dirtyrate_bitmap_reap(const char *idstr, uint64_t pages) "ramblock name: %s, has %"PRIu64 " dirty pages"
# block.c
migration_block_init_shared(const char *blk_device_name) "Start migration for %s with shared base image"
diff --git a/qapi/migration.json b/qapi/migration.json
index de35528..0b00976 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1747,13 +1747,15 @@
#
# @page-sampling: calculate dirtyrate by sampling pages.
#
-# @dirty-ring: calculate dirtyrate by via dirty ring.
+# @dirty-ring: calculate dirtyrate by dirty ring.
+#
+# @dirty-bitmap: calculate dirtyrate by dirty bitmap.
#
# Since: 6.1
#
##
{ 'enum': 'DirtyRateMeasureMode',
- 'data': ['page-sampling', 'dirty-ring'] }
+ 'data': ['page-sampling', 'dirty-ring', 'dirty-bitmap'] }
##
# @DirtyRateInfo:
--
1.8.3.1
* Re: [PATCH 4/4] migration/dirtyrate: implement dirty-bitmap dirtyrate calculation
From: Peter Xu @ 2021-07-09 18:32 UTC (permalink / raw)
To: huangy81
Cc: Eduardo Habkost, Juan Quintela, qemu-devel,
Dr. David Alan Gilbert, Chuan Zheng, Paolo Bonzini
On Sun, Jun 27, 2021 at 01:38:17PM +0800, huangy81@chinatelecom.cn wrote:
> +static void calculate_dirtyrate_dirty_bitmap(struct DirtyRateConfig config)
> +{
> + int64_t msec = 0;
> + int64_t start_time;
> + uint64_t protect_flags = 0;
> + uint64_t initially_set = 0;
> + uint64_t protect_only = 0;
> +
> + protect_flags = kvm_get_manual_dirty_log_protect();
> + protect_only = KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE;
> + initially_set =
> + (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | KVM_DIRTY_LOG_INITIALLY_SET);
> +
> + dirtyrate_global_dirty_log_start();
> +
> + /* absence of migration */
> + if (!(global_dirty_tracking & GLOBAL_DIRTY_MIGRATION)) {
> + if ((protect_flags & initially_set) == initially_set) {
> + /* skip the first round, which returns all-ones bits */
> + memory_global_dirty_log_sync();
> + /*
> + * reset page protection manually and
> + * start dirty tracking from now on
> + */
> + dirtyrate_manual_reset_protect();
> + }
> + }
Right, the clear-dirty-log interface is a bit tricky.
Wondering whether we can simplify this into something like:
1. dirty_log_sync()
2. if (manual_protect) reset_protect()
3. record initial total dirty stats (total dirty stats updated when sync)
4. sleep(SECONDS)
5. dirty_log_sync()
6. record final total dirty stats
Then I think it's not related to initial-all-set anymore. Do you think this
would work?
Thanks,
--
Peter Xu