All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/6] Introduce bit-based phys_ram_dirty, and bit-based dirty page checker.
@ 2010-04-06  0:51 ` Yoshiaki Tamura
  0 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel; +Cc: avi, anthony, aliguori, mtosatti, ohmura.kei

The dirty and non-dirty pages are checked one by one.  When most of the memory
is not dirty, checking the dirty and non-dirty pages by multiple page size
should be much faster than checking them one by one.  We introduced bit-based
phys_ram_dirty for VGA, CODE, MIGRATION, MASTER, and
cpu_physical_memory_get_dirty_range() for this purpose.

This patch set is v2 based on the following discussion.

http://www.mail-archive.com/kvm@vger.kernel.org/msg30722.html

I also rebased to qemu.git, but it needs the folloing patch set to be applied.

http://article.gmane.org/gmane.comp.emulators.qemu/66007

Yoshiaki Tamura (6):
  Modify DIRTY_FLAG value to use as indexes of bit-based
    phys_ram_dirty.
  Introduce bit-based phys_ram_dirty for VGA, CODE, MIGRATION and
    MASTER.
  Modifies wrapper functions for byte-based phys_ram_dirty bitmap to   
     bit-based phys_ram_dirty bitmap.
  Introduce cpu_physical_memory_get_dirty_range().
  Use cpu_physical_memory_set_dirty_range() to update phys_ram_dirty.
  Use cpu_physical_memory_get_dirty_range() to check multiple dirty
    pages.

 arch_init.c |   54 ++++++++++++++++++++------------
 bswap.h     |    2 +
 cpu-all.h   |   95 +++++++++++++++++++++++++++++++++++++++++++++++---------
 exec.c      |   98 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 kvm-all.c   |   33 +++++++++-----------
 5 files changed, 222 insertions(+), 60 deletions(-)


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Qemu-devel] [PATCH v2 0/6] Introduce bit-based phys_ram_dirty, and bit-based dirty page checker.
@ 2010-04-06  0:51 ` Yoshiaki Tamura
  0 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel; +Cc: aliguori, mtosatti, avi, ohmura.kei

The dirty and non-dirty pages are checked one by one.  When most of the memory
is not dirty, checking the dirty and non-dirty pages by multiple page size
should be much faster than checking them one by one.  We introduced bit-based
phys_ram_dirty for VGA, CODE, MIGRATION, MASTER, and
cpu_physical_memory_get_dirty_range() for this purpose.

This patch set is v2 based on the following discussion.

http://www.mail-archive.com/kvm@vger.kernel.org/msg30722.html

I also rebased to qemu.git, but it needs the folloing patch set to be applied.

http://article.gmane.org/gmane.comp.emulators.qemu/66007

Yoshiaki Tamura (6):
  Modify DIRTY_FLAG value to use as indexes of bit-based
    phys_ram_dirty.
  Introduce bit-based phys_ram_dirty for VGA, CODE, MIGRATION and
    MASTER.
  Modifies wrapper functions for byte-based phys_ram_dirty bitmap to   
     bit-based phys_ram_dirty bitmap.
  Introduce cpu_physical_memory_get_dirty_range().
  Use cpu_physical_memory_set_dirty_range() to update phys_ram_dirty.
  Use cpu_physical_memory_get_dirty_range() to check multiple dirty
    pages.

 arch_init.c |   54 ++++++++++++++++++++------------
 bswap.h     |    2 +
 cpu-all.h   |   95 +++++++++++++++++++++++++++++++++++++++++++++++---------
 exec.c      |   98 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 kvm-all.c   |   33 +++++++++-----------
 5 files changed, 222 insertions(+), 60 deletions(-)

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v2 1/6] Modify DIRTY_FLAG value to use as indexes of bit-based phys_ram_dirty.
  2010-04-06  0:51 ` [Qemu-devel] " Yoshiaki Tamura
@ 2010-04-06  0:51   ` Yoshiaki Tamura
  -1 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel
  Cc: avi, anthony, aliguori, mtosatti, ohmura.kei, Yoshiaki Tamura

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
 cpu-all.h |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index f8bfa66..c409fad 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -853,7 +853,6 @@ target_phys_addr_t cpu_get_phys_page_debug(CPUState *env, target_ulong addr);
 /* memory API */
 
 extern int phys_ram_fd;
-extern uint8_t *phys_ram_dirty;
 extern ram_addr_t ram_size;
 extern ram_addr_t last_ram_offset;
 
@@ -878,9 +877,16 @@ extern int mem_prealloc;
 /* Set if TLB entry is an IO callback.  */
 #define TLB_MMIO        (1 << 5)
 
+/* Use DIRTY_FLAG as indexes of bit-based phys_ram_dirty.
+   0x03 is empty to make it compatible with byte-based bitmap. */
+#define MASTER_DIRTY_FLAG    0x00
 #define VGA_DIRTY_FLAG       0x01
 #define CODE_DIRTY_FLAG      0x02
-#define MIGRATION_DIRTY_FLAG 0x08
+#define MIGRATION_DIRTY_FLAG 0x04
+
+#define NUM_DIRTY_FLAGS      5
+
+extern unsigned long *phys_ram_dirty[NUM_DIRTY_FLAGS];
 
 /* read dirty bit (return 0 or 1) */
 static inline int cpu_physical_memory_is_dirty(ram_addr_t addr)
-- 
1.7.0.31.g1df487


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [PATCH v2 1/6] Modify DIRTY_FLAG value to use as indexes of bit-based phys_ram_dirty.
@ 2010-04-06  0:51   ` Yoshiaki Tamura
  0 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel; +Cc: aliguori, ohmura.kei, mtosatti, Yoshiaki Tamura, avi

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
 cpu-all.h |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index f8bfa66..c409fad 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -853,7 +853,6 @@ target_phys_addr_t cpu_get_phys_page_debug(CPUState *env, target_ulong addr);
 /* memory API */
 
 extern int phys_ram_fd;
-extern uint8_t *phys_ram_dirty;
 extern ram_addr_t ram_size;
 extern ram_addr_t last_ram_offset;
 
@@ -878,9 +877,16 @@ extern int mem_prealloc;
 /* Set if TLB entry is an IO callback.  */
 #define TLB_MMIO        (1 << 5)
 
+/* Use DIRTY_FLAG as indexes of bit-based phys_ram_dirty.
+   0x03 is empty to make it compatible with byte-based bitmap. */
+#define MASTER_DIRTY_FLAG    0x00
 #define VGA_DIRTY_FLAG       0x01
 #define CODE_DIRTY_FLAG      0x02
-#define MIGRATION_DIRTY_FLAG 0x08
+#define MIGRATION_DIRTY_FLAG 0x04
+
+#define NUM_DIRTY_FLAGS      5
+
+extern unsigned long *phys_ram_dirty[NUM_DIRTY_FLAGS];
 
 /* read dirty bit (return 0 or 1) */
 static inline int cpu_physical_memory_is_dirty(ram_addr_t addr)
-- 
1.7.0.31.g1df487

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v2 2/6] Introduce bit-based phys_ram_dirty for VGA, CODE, MIGRATION and MASTER.
  2010-04-06  0:51 ` [Qemu-devel] " Yoshiaki Tamura
@ 2010-04-06  0:51   ` Yoshiaki Tamura
  -1 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel
  Cc: avi, anthony, aliguori, mtosatti, ohmura.kei, Yoshiaki Tamura

Replaces byte-based phys_ram_dirty bitmap with three bit-based phys_ram_dirty
bitmap.  On allocation, it sets all bits in the bitmap.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
 exec.c |   32 +++++++++++++++++++++++++++-----
 1 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/exec.c b/exec.c
index c74b0a4..9733892 100644
--- a/exec.c
+++ b/exec.c
@@ -110,7 +110,7 @@ uint8_t *code_gen_ptr;
 
 #if !defined(CONFIG_USER_ONLY)
 int phys_ram_fd;
-uint8_t *phys_ram_dirty;
+unsigned long *phys_ram_dirty[NUM_DIRTY_FLAGS];
 static int in_migration;
 
 typedef struct RAMBlock {
@@ -2825,10 +2825,32 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
     new_block->next = ram_blocks;
     ram_blocks = new_block;
 
-    phys_ram_dirty = qemu_realloc(phys_ram_dirty,
-        (last_ram_offset + size) >> TARGET_PAGE_BITS);
-    memset(phys_ram_dirty + (last_ram_offset >> TARGET_PAGE_BITS),
-           0xff, size >> TARGET_PAGE_BITS);
+/* temporarily copy from qemu-kvm.git/qemu-kvm.h */
+#define ALIGN(x, y)  (((x)+(y)-1) & ~((y)-1))
+#define BITMAP_SIZE(m) (ALIGN(((m)>>TARGET_PAGE_BITS), HOST_LONG_BITS) / 8)
+
+    if (BITMAP_SIZE(last_ram_offset + size) != BITMAP_SIZE(last_ram_offset)) {
+        phys_ram_dirty[MASTER_DIRTY_FLAG] =
+            qemu_realloc(phys_ram_dirty[MASTER_DIRTY_FLAG],
+                         BITMAP_SIZE(last_ram_offset + size));
+        phys_ram_dirty[VGA_DIRTY_FLAG] 
+            = qemu_realloc(phys_ram_dirty[VGA_DIRTY_FLAG],
+                           BITMAP_SIZE(last_ram_offset + size));
+        phys_ram_dirty[CODE_DIRTY_FLAG] =
+            qemu_realloc(phys_ram_dirty[CODE_DIRTY_FLAG],
+                         BITMAP_SIZE(last_ram_offset + size));
+        phys_ram_dirty[MIGRATION_DIRTY_FLAG] =
+            qemu_realloc(phys_ram_dirty[MIGRATION_DIRTY_FLAG],
+                         BITMAP_SIZE(last_ram_offset + size));
+        memset((uint8_t *)phys_ram_dirty[MASTER_DIRTY_FLAG] +
+               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
+        memset((uint8_t *)phys_ram_dirty[VGA_DIRTY_FLAG] +
+               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
+        memset((uint8_t *)phys_ram_dirty[CODE_DIRTY_FLAG] +
+               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
+        memset((uint8_t *)phys_ram_dirty[MIGRATION_DIRTY_FLAG] +
+               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
+    }
 
     last_ram_offset += size;
 
-- 
1.7.0.31.g1df487


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [PATCH v2 2/6] Introduce bit-based phys_ram_dirty for VGA, CODE, MIGRATION and MASTER.
@ 2010-04-06  0:51   ` Yoshiaki Tamura
  0 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel; +Cc: aliguori, ohmura.kei, mtosatti, Yoshiaki Tamura, avi

Replaces byte-based phys_ram_dirty bitmap with three bit-based phys_ram_dirty
bitmap.  On allocation, it sets all bits in the bitmap.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
 exec.c |   32 +++++++++++++++++++++++++++-----
 1 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/exec.c b/exec.c
index c74b0a4..9733892 100644
--- a/exec.c
+++ b/exec.c
@@ -110,7 +110,7 @@ uint8_t *code_gen_ptr;
 
 #if !defined(CONFIG_USER_ONLY)
 int phys_ram_fd;
-uint8_t *phys_ram_dirty;
+unsigned long *phys_ram_dirty[NUM_DIRTY_FLAGS];
 static int in_migration;
 
 typedef struct RAMBlock {
@@ -2825,10 +2825,32 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
     new_block->next = ram_blocks;
     ram_blocks = new_block;
 
-    phys_ram_dirty = qemu_realloc(phys_ram_dirty,
-        (last_ram_offset + size) >> TARGET_PAGE_BITS);
-    memset(phys_ram_dirty + (last_ram_offset >> TARGET_PAGE_BITS),
-           0xff, size >> TARGET_PAGE_BITS);
+/* temporarily copy from qemu-kvm.git/qemu-kvm.h */
+#define ALIGN(x, y)  (((x)+(y)-1) & ~((y)-1))
+#define BITMAP_SIZE(m) (ALIGN(((m)>>TARGET_PAGE_BITS), HOST_LONG_BITS) / 8)
+
+    if (BITMAP_SIZE(last_ram_offset + size) != BITMAP_SIZE(last_ram_offset)) {
+        phys_ram_dirty[MASTER_DIRTY_FLAG] =
+            qemu_realloc(phys_ram_dirty[MASTER_DIRTY_FLAG],
+                         BITMAP_SIZE(last_ram_offset + size));
+        phys_ram_dirty[VGA_DIRTY_FLAG] 
+            = qemu_realloc(phys_ram_dirty[VGA_DIRTY_FLAG],
+                           BITMAP_SIZE(last_ram_offset + size));
+        phys_ram_dirty[CODE_DIRTY_FLAG] =
+            qemu_realloc(phys_ram_dirty[CODE_DIRTY_FLAG],
+                         BITMAP_SIZE(last_ram_offset + size));
+        phys_ram_dirty[MIGRATION_DIRTY_FLAG] =
+            qemu_realloc(phys_ram_dirty[MIGRATION_DIRTY_FLAG],
+                         BITMAP_SIZE(last_ram_offset + size));
+        memset((uint8_t *)phys_ram_dirty[MASTER_DIRTY_FLAG] +
+               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
+        memset((uint8_t *)phys_ram_dirty[VGA_DIRTY_FLAG] +
+               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
+        memset((uint8_t *)phys_ram_dirty[CODE_DIRTY_FLAG] +
+               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
+        memset((uint8_t *)phys_ram_dirty[MIGRATION_DIRTY_FLAG] +
+               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
+    }
 
     last_ram_offset += size;
 
-- 
1.7.0.31.g1df487

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
  2010-04-06  0:51 ` [Qemu-devel] " Yoshiaki Tamura
@ 2010-04-06  0:51   ` Yoshiaki Tamura
  -1 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel
  Cc: avi, anthony, aliguori, mtosatti, ohmura.kei, Yoshiaki Tamura

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
 cpu-all.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++----------
 1 files changed, 67 insertions(+), 14 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index c409fad..0f5bfbe 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -891,43 +891,96 @@ extern unsigned long *phys_ram_dirty[NUM_DIRTY_FLAGS];
 /* read dirty bit (return 0 or 1) */
 static inline int cpu_physical_memory_is_dirty(ram_addr_t addr)
 {
-    return phys_ram_dirty[addr >> TARGET_PAGE_BITS] == 0xff;
+    unsigned long mask;
+    int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+    int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+ 
+    mask = 1UL << offset;
+    return (phys_ram_dirty[MASTER_DIRTY_FLAG][index] & mask) == mask;
 }
 
 static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
 {
-    return phys_ram_dirty[addr >> TARGET_PAGE_BITS];
+     unsigned long mask;
+     int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+     int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+     int ret = 0;
+ 
+     mask = 1UL << offset;
+     if (phys_ram_dirty[MASTER_DIRTY_FLAG][index] & mask)
+         return 0xff;
+     if (phys_ram_dirty[VGA_DIRTY_FLAG][index] & mask)
+         ret |= VGA_DIRTY_FLAG;
+     if (phys_ram_dirty[CODE_DIRTY_FLAG][index] & mask)
+         ret |=  CODE_DIRTY_FLAG;
+     if (phys_ram_dirty[MIGRATION_DIRTY_FLAG][index] & mask)
+         ret |= MIGRATION_DIRTY_FLAG;
+ 
+     return ret;
 }
 
 static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
                                                 int dirty_flags)
 {
-    return phys_ram_dirty[addr >> TARGET_PAGE_BITS] & dirty_flags;
+    unsigned long mask;
+    int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+    int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+
+    mask = 1UL << offset;
+    return (phys_ram_dirty[MASTER_DIRTY_FLAG][index] & mask) ||
+        (phys_ram_dirty[dirty_flags][index] & mask);
 }
 
 static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
 {
-    phys_ram_dirty[addr >> TARGET_PAGE_BITS] = 0xff;
+    unsigned long mask;
+    int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+    int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+
+    mask = 1UL << offset;
+    phys_ram_dirty[MASTER_DIRTY_FLAG][index] |= mask;
+}
+
+static inline void cpu_physical_memory_set_dirty_range(ram_addr_t addr,
+                                                       unsigned long mask)
+{
+    int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+
+    phys_ram_dirty[MASTER_DIRTY_FLAG][index] |= mask;
 }
 
-static inline int cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
-                                                      int dirty_flags)
+static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
+                                                       int dirty_flags)
 {
-    return phys_ram_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
+    unsigned long mask;
+    int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+    int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+
+    mask = 1UL << offset;
+    if (dirty_flags & VGA_DIRTY_FLAG) 
+        phys_ram_dirty[VGA_DIRTY_FLAG][index] |= mask;
+    if (dirty_flags & CODE_DIRTY_FLAG) 
+        phys_ram_dirty[CODE_DIRTY_FLAG][index] |= mask;
+    if (dirty_flags & MIGRATION_DIRTY_FLAG) 
+        phys_ram_dirty[MIGRATION_DIRTY_FLAG][index] |= mask;
 }
 
 static inline void cpu_physical_memory_mask_dirty_range(ram_addr_t start,
                                                         int length,
                                                         int dirty_flags)
 {
-    int i, mask, len;
-    uint8_t *p;
+    ram_addr_t addr = start;
+    unsigned long mask;
+    int index, offset, i;
+
+    for (i = 0;  i < length; i += TARGET_PAGE_SIZE) {
+        index = ((addr + i) >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+        offset = ((addr + i) >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+        mask = ~(1UL << offset);
 
-    len = length >> TARGET_PAGE_BITS;
-    mask = ~dirty_flags;
-    p = phys_ram_dirty + (start >> TARGET_PAGE_BITS);
-    for (i = 0; i < len; i++)
-        p[i] &= mask;
+        phys_ram_dirty[MASTER_DIRTY_FLAG][index] &= mask;
+        phys_ram_dirty[dirty_flags][index] &= mask;
+     }
 }
 
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
-- 
1.7.0.31.g1df487


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
@ 2010-04-06  0:51   ` Yoshiaki Tamura
  0 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel; +Cc: aliguori, ohmura.kei, mtosatti, Yoshiaki Tamura, avi

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
 cpu-all.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++----------
 1 files changed, 67 insertions(+), 14 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index c409fad..0f5bfbe 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -891,43 +891,96 @@ extern unsigned long *phys_ram_dirty[NUM_DIRTY_FLAGS];
 /* read dirty bit (return 0 or 1) */
 static inline int cpu_physical_memory_is_dirty(ram_addr_t addr)
 {
-    return phys_ram_dirty[addr >> TARGET_PAGE_BITS] == 0xff;
+    unsigned long mask;
+    int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+    int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+ 
+    mask = 1UL << offset;
+    return (phys_ram_dirty[MASTER_DIRTY_FLAG][index] & mask) == mask;
 }
 
 static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
 {
-    return phys_ram_dirty[addr >> TARGET_PAGE_BITS];
+     unsigned long mask;
+     int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+     int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+     int ret = 0;
+ 
+     mask = 1UL << offset;
+     if (phys_ram_dirty[MASTER_DIRTY_FLAG][index] & mask)
+         return 0xff;
+     if (phys_ram_dirty[VGA_DIRTY_FLAG][index] & mask)
+         ret |= VGA_DIRTY_FLAG;
+     if (phys_ram_dirty[CODE_DIRTY_FLAG][index] & mask)
+         ret |=  CODE_DIRTY_FLAG;
+     if (phys_ram_dirty[MIGRATION_DIRTY_FLAG][index] & mask)
+         ret |= MIGRATION_DIRTY_FLAG;
+ 
+     return ret;
 }
 
 static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
                                                 int dirty_flags)
 {
-    return phys_ram_dirty[addr >> TARGET_PAGE_BITS] & dirty_flags;
+    unsigned long mask;
+    int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+    int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+
+    mask = 1UL << offset;
+    return (phys_ram_dirty[MASTER_DIRTY_FLAG][index] & mask) ||
+        (phys_ram_dirty[dirty_flags][index] & mask);
 }
 
 static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
 {
-    phys_ram_dirty[addr >> TARGET_PAGE_BITS] = 0xff;
+    unsigned long mask;
+    int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+    int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+
+    mask = 1UL << offset;
+    phys_ram_dirty[MASTER_DIRTY_FLAG][index] |= mask;
+}
+
+static inline void cpu_physical_memory_set_dirty_range(ram_addr_t addr,
+                                                       unsigned long mask)
+{
+    int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+
+    phys_ram_dirty[MASTER_DIRTY_FLAG][index] |= mask;
 }
 
-static inline int cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
-                                                      int dirty_flags)
+static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
+                                                       int dirty_flags)
 {
-    return phys_ram_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
+    unsigned long mask;
+    int index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+    int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+
+    mask = 1UL << offset;
+    if (dirty_flags & VGA_DIRTY_FLAG) 
+        phys_ram_dirty[VGA_DIRTY_FLAG][index] |= mask;
+    if (dirty_flags & CODE_DIRTY_FLAG) 
+        phys_ram_dirty[CODE_DIRTY_FLAG][index] |= mask;
+    if (dirty_flags & MIGRATION_DIRTY_FLAG) 
+        phys_ram_dirty[MIGRATION_DIRTY_FLAG][index] |= mask;
 }
 
 static inline void cpu_physical_memory_mask_dirty_range(ram_addr_t start,
                                                         int length,
                                                         int dirty_flags)
 {
-    int i, mask, len;
-    uint8_t *p;
+    ram_addr_t addr = start;
+    unsigned long mask;
+    int index, offset, i;
+
+    for (i = 0;  i < length; i += TARGET_PAGE_SIZE) {
+        index = ((addr + i) >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+        offset = ((addr + i) >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+        mask = ~(1UL << offset);
 
-    len = length >> TARGET_PAGE_BITS;
-    mask = ~dirty_flags;
-    p = phys_ram_dirty + (start >> TARGET_PAGE_BITS);
-    for (i = 0; i < len; i++)
-        p[i] &= mask;
+        phys_ram_dirty[MASTER_DIRTY_FLAG][index] &= mask;
+        phys_ram_dirty[dirty_flags][index] &= mask;
+     }
 }
 
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
-- 
1.7.0.31.g1df487

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v2 4/6] Introduce cpu_physical_memory_get_dirty_range().
  2010-04-06  0:51 ` [Qemu-devel] " Yoshiaki Tamura
@ 2010-04-06  0:51   ` Yoshiaki Tamura
  -1 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel
  Cc: avi, anthony, aliguori, mtosatti, ohmura.kei, Yoshiaki Tamura

Introduces cpu_physical_memory_get_dirty_range().
It checks the first row and puts dirty addr in the array.
If the first row is empty, it skips to the first non-dirty row
or the end addr, and put the length in the first entry of the array.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
 cpu-all.h |    4 +++
 exec.c    |   66 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+), 0 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 0f5bfbe..791d91f 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -983,6 +983,10 @@ static inline void cpu_physical_memory_mask_dirty_range(ram_addr_t start,
      }
 }
 
+int cpu_physical_memory_get_dirty_range(ram_addr_t start, ram_addr_t end, 
+                                        ram_addr_t *dirty_rams, int length,
+                                        int dirty_flags);
+
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
                                      int dirty_flags);
 void cpu_tlb_update_dirty(CPUState *env);
diff --git a/exec.c b/exec.c
index 9733892..5d4171a 100644
--- a/exec.c
+++ b/exec.c
@@ -2045,6 +2045,72 @@ static inline void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry,
     }
 }
 
+/* It checks the first row and puts dirty addrs in the array.
+   If the first row is empty, it skips to the first non-dirty row
+   or the end addr, and put the length in the first entry of the array. */
+int cpu_physical_memory_get_dirty_range(ram_addr_t start, ram_addr_t end, 
+                                        ram_addr_t *dirty_rams, int length,
+                                        int dirty_flag)
+{
+    unsigned long p = 0, page_number;
+    ram_addr_t addr;
+    int s_idx = (start >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+    int e_idx = (end >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+    int i, j, offset;
+
+    /* mask bits before the start addr */
+    offset = (start >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+    p |= phys_ram_dirty[MASTER_DIRTY_FLAG][s_idx] & ~((1UL << offset) - 1);
+    p |= phys_ram_dirty[dirty_flag][s_idx] & ~((1UL << offset) - 1);
+
+    if (s_idx == e_idx) {
+        /* mask bits after the end addr */
+        offset = (end >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+        p &= (1UL << offset) - 1;
+    }
+
+    if (p == 0) {
+        /* when the row is empty */
+        ram_addr_t skip;
+        if (s_idx == e_idx)
+            skip = end;
+        else {
+            /* skip empty rows */
+            while (s_idx < e_idx) {
+                s_idx++;
+                if ((phys_ram_dirty[MASTER_DIRTY_FLAG][s_idx] |
+                     phys_ram_dirty[dirty_flag][s_idx]) != 0) {
+                    break;
+                }
+            }
+            skip = (s_idx * HOST_LONG_BITS * TARGET_PAGE_SIZE);
+        }
+        dirty_rams[0] = skip - start;
+        i = 0;
+
+    } else if (p == ~0UL) {
+        /* when the row is fully dirtied */
+        addr = start;
+        for (i = 0; i < length; i++) {
+            dirty_rams[i] = addr;
+            addr += TARGET_PAGE_SIZE;
+        }
+    } else {
+        /* when the row is partially dirtied */
+        i = 0;
+        do {
+            j = ffsl(p) - 1;
+            p &= ~(1UL << j);
+            page_number = s_idx * HOST_LONG_BITS + j;
+            addr = page_number * TARGET_PAGE_SIZE;
+            dirty_rams[i] = addr;
+            i++;
+        } while (p != 0 && i < length);
+    }
+
+    return i;
+}
+
 /* Note: start and end must be within the same ram block.  */
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
                                      int dirty_flags)
-- 
1.7.0.31.g1df487


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [PATCH v2 4/6] Introduce cpu_physical_memory_get_dirty_range().
@ 2010-04-06  0:51   ` Yoshiaki Tamura
  0 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel; +Cc: aliguori, ohmura.kei, mtosatti, Yoshiaki Tamura, avi

Introduces cpu_physical_memory_get_dirty_range().
It checks the first row and puts dirty addr in the array.
If the first row is empty, it skips to the first non-dirty row
or the end addr, and put the length in the first entry of the array.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
 cpu-all.h |    4 +++
 exec.c    |   66 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+), 0 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 0f5bfbe..791d91f 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -983,6 +983,10 @@ static inline void cpu_physical_memory_mask_dirty_range(ram_addr_t start,
      }
 }
 
+int cpu_physical_memory_get_dirty_range(ram_addr_t start, ram_addr_t end, 
+                                        ram_addr_t *dirty_rams, int length,
+                                        int dirty_flags);
+
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
                                      int dirty_flags);
 void cpu_tlb_update_dirty(CPUState *env);
diff --git a/exec.c b/exec.c
index 9733892..5d4171a 100644
--- a/exec.c
+++ b/exec.c
@@ -2045,6 +2045,72 @@ static inline void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry,
     }
 }
 
+/* It checks the first row and puts dirty addrs in the array.
+   If the first row is empty, it skips to the first non-dirty row
+   or the end addr, and put the length in the first entry of the array. */
+int cpu_physical_memory_get_dirty_range(ram_addr_t start, ram_addr_t end, 
+                                        ram_addr_t *dirty_rams, int length,
+                                        int dirty_flag)
+{
+    unsigned long p = 0, page_number;
+    ram_addr_t addr;
+    int s_idx = (start >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+    int e_idx = (end >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+    int i, j, offset;
+
+    /* mask bits before the start addr */
+    offset = (start >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+    p |= phys_ram_dirty[MASTER_DIRTY_FLAG][s_idx] & ~((1UL << offset) - 1);
+    p |= phys_ram_dirty[dirty_flag][s_idx] & ~((1UL << offset) - 1);
+
+    if (s_idx == e_idx) {
+        /* mask bits after the end addr */
+        offset = (end >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+        p &= (1UL << offset) - 1;
+    }
+
+    if (p == 0) {
+        /* when the row is empty */
+        ram_addr_t skip;
+        if (s_idx == e_idx)
+            skip = end;
+        else {
+            /* skip empty rows */
+            while (s_idx < e_idx) {
+                s_idx++;
+                if ((phys_ram_dirty[MASTER_DIRTY_FLAG][s_idx] |
+                     phys_ram_dirty[dirty_flag][s_idx]) != 0) {
+                    break;
+                }
+            }
+            skip = (s_idx * HOST_LONG_BITS * TARGET_PAGE_SIZE);
+        }
+        dirty_rams[0] = skip - start;
+        i = 0;
+
+    } else if (p == ~0UL) {
+        /* when the row is fully dirtied */
+        addr = start;
+        for (i = 0; i < length; i++) {
+            dirty_rams[i] = addr;
+            addr += TARGET_PAGE_SIZE;
+        }
+    } else {
+        /* when the row is partially dirtied */
+        i = 0;
+        do {
+            j = ffsl(p) - 1;
+            p &= ~(1UL << j);
+            page_number = s_idx * HOST_LONG_BITS + j;
+            addr = page_number * TARGET_PAGE_SIZE;
+            dirty_rams[i] = addr;
+            i++;
+        } while (p != 0 && i < length);
+    }
+
+    return i;
+}
+
 /* Note: start and end must be within the same ram block.  */
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
                                      int dirty_flags)
-- 
1.7.0.31.g1df487

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v2 5/6] Use cpu_physical_memory_set_dirty_range() to update phys_ram_dirty.
  2010-04-06  0:51 ` [Qemu-devel] " Yoshiaki Tamura
@ 2010-04-06  0:51   ` Yoshiaki Tamura
  -1 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel
  Cc: avi, anthony, aliguori, mtosatti, ohmura.kei, Yoshiaki Tamura

Modifies kvm_physical_sync_dirty_bitmap to use
cpu_physical_memory_set_dirty_range() to update the row of
the bit-based phys_ram_dirty bitmap at once.

Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
---
 bswap.h   |    2 ++
 kvm-all.c |   33 +++++++++++++++------------------
 2 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/bswap.h b/bswap.h
index aace9b7..956f3fa 100644
--- a/bswap.h
+++ b/bswap.h
@@ -205,8 +205,10 @@ static inline void cpu_to_be32wu(uint32_t *p, uint32_t v)
 
 #ifdef HOST_WORDS_BIGENDIAN
 #define cpu_to_32wu cpu_to_be32wu
+#define leul_to_cpu(v) le ## HOST_LONG_BITS ## _to_cpu(v)
 #else
 #define cpu_to_32wu cpu_to_le32wu
+#define leul_to_cpu(v) (v)
 #endif
 
 #undef le_bswap
diff --git a/kvm-all.c b/kvm-all.c
index 7aa5e57..db762ff 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -282,11 +282,6 @@ static int kvm_set_migration_log(int enable)
     return 0;
 }
 
-static int test_le_bit(unsigned long nr, unsigned char *addr)
-{
-    return (addr[nr >> 3] >> (nr & 7)) & 1;
-}
-
 /**
  * kvm_physical_sync_dirty_bitmap - Grab dirty bitmap from kernel space
  * This function updates qemu's dirty bitmap using cpu_physical_memory_set_dirty().
@@ -299,9 +294,9 @@ static int kvm_physical_sync_dirty_bitmap(target_phys_addr_t start_addr,
 					  target_phys_addr_t end_addr)
 {
     KVMState *s = kvm_state;
-    unsigned long size, allocated_size = 0;
-    target_phys_addr_t phys_addr;
-    ram_addr_t addr;
+    unsigned long size, page_number, addr, addr1, *bitmap,
+        allocated_size = 0;
+    unsigned int i, len;
     KVMDirtyLog d;
     KVMSlot *mem;
     int ret = 0;
@@ -313,7 +308,8 @@ static int kvm_physical_sync_dirty_bitmap(target_phys_addr_t start_addr,
             break;
         }
 
-        size = ((mem->memory_size >> TARGET_PAGE_BITS) + 7) / 8;
+        size = ((mem->memory_size >> TARGET_PAGE_BITS) + HOST_LONG_BITS - 1) /
+            HOST_LONG_BITS * HOST_LONG_SIZE;
         if (!d.dirty_bitmap) {
             d.dirty_bitmap = qemu_malloc(size);
         } else if (size > allocated_size) {
@@ -330,17 +326,18 @@ static int kvm_physical_sync_dirty_bitmap(target_phys_addr_t start_addr,
             break;
         }
 
-        for (phys_addr = mem->start_addr, addr = mem->phys_offset;
-             phys_addr < mem->start_addr + mem->memory_size;
-             phys_addr += TARGET_PAGE_SIZE, addr += TARGET_PAGE_SIZE) {
-            unsigned char *bitmap = (unsigned char *)d.dirty_bitmap;
-            unsigned nr = (phys_addr - mem->start_addr) >> TARGET_PAGE_BITS;
-
-            if (test_le_bit(nr, bitmap)) {
-                cpu_physical_memory_set_dirty(addr);
+        bitmap = (unsigned long *)d.dirty_bitmap;
+        len = size / HOST_LONG_SIZE;
+        for (i = 0; i < len; i++) {
+            if (bitmap[i] != 0) {
+                page_number = i * HOST_LONG_BITS;
+                addr1 = page_number * TARGET_PAGE_SIZE;
+                addr = mem->phys_offset + addr1;
+                cpu_physical_memory_set_dirty_range(addr, 
+                    leul_to_cpu(bitmap[i]));
             }
         }
-        start_addr = phys_addr;
+        start_addr = mem->start_addr + mem->memory_size;
     }
     qemu_free(d.dirty_bitmap);
 
-- 
1.7.0.31.g1df487


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [PATCH v2 5/6] Use cpu_physical_memory_set_dirty_range() to update phys_ram_dirty.
@ 2010-04-06  0:51   ` Yoshiaki Tamura
  0 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel; +Cc: aliguori, ohmura.kei, mtosatti, Yoshiaki Tamura, avi

Modifies kvm_physical_sync_dirty_bitmap to use
cpu_physical_memory_set_dirty_range() to update the row of
the bit-based phys_ram_dirty bitmap at once.

Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
---
 bswap.h   |    2 ++
 kvm-all.c |   33 +++++++++++++++------------------
 2 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/bswap.h b/bswap.h
index aace9b7..956f3fa 100644
--- a/bswap.h
+++ b/bswap.h
@@ -205,8 +205,10 @@ static inline void cpu_to_be32wu(uint32_t *p, uint32_t v)
 
 #ifdef HOST_WORDS_BIGENDIAN
 #define cpu_to_32wu cpu_to_be32wu
+#define leul_to_cpu(v) le ## HOST_LONG_BITS ## _to_cpu(v)
 #else
 #define cpu_to_32wu cpu_to_le32wu
+#define leul_to_cpu(v) (v)
 #endif
 
 #undef le_bswap
diff --git a/kvm-all.c b/kvm-all.c
index 7aa5e57..db762ff 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -282,11 +282,6 @@ static int kvm_set_migration_log(int enable)
     return 0;
 }
 
-static int test_le_bit(unsigned long nr, unsigned char *addr)
-{
-    return (addr[nr >> 3] >> (nr & 7)) & 1;
-}
-
 /**
  * kvm_physical_sync_dirty_bitmap - Grab dirty bitmap from kernel space
  * This function updates qemu's dirty bitmap using cpu_physical_memory_set_dirty().
@@ -299,9 +294,9 @@ static int kvm_physical_sync_dirty_bitmap(target_phys_addr_t start_addr,
 					  target_phys_addr_t end_addr)
 {
     KVMState *s = kvm_state;
-    unsigned long size, allocated_size = 0;
-    target_phys_addr_t phys_addr;
-    ram_addr_t addr;
+    unsigned long size, page_number, addr, addr1, *bitmap,
+        allocated_size = 0;
+    unsigned int i, len;
     KVMDirtyLog d;
     KVMSlot *mem;
     int ret = 0;
@@ -313,7 +308,8 @@ static int kvm_physical_sync_dirty_bitmap(target_phys_addr_t start_addr,
             break;
         }
 
-        size = ((mem->memory_size >> TARGET_PAGE_BITS) + 7) / 8;
+        size = ((mem->memory_size >> TARGET_PAGE_BITS) + HOST_LONG_BITS - 1) /
+            HOST_LONG_BITS * HOST_LONG_SIZE;
         if (!d.dirty_bitmap) {
             d.dirty_bitmap = qemu_malloc(size);
         } else if (size > allocated_size) {
@@ -330,17 +326,18 @@ static int kvm_physical_sync_dirty_bitmap(target_phys_addr_t start_addr,
             break;
         }
 
-        for (phys_addr = mem->start_addr, addr = mem->phys_offset;
-             phys_addr < mem->start_addr + mem->memory_size;
-             phys_addr += TARGET_PAGE_SIZE, addr += TARGET_PAGE_SIZE) {
-            unsigned char *bitmap = (unsigned char *)d.dirty_bitmap;
-            unsigned nr = (phys_addr - mem->start_addr) >> TARGET_PAGE_BITS;
-
-            if (test_le_bit(nr, bitmap)) {
-                cpu_physical_memory_set_dirty(addr);
+        bitmap = (unsigned long *)d.dirty_bitmap;
+        len = size / HOST_LONG_SIZE;
+        for (i = 0; i < len; i++) {
+            if (bitmap[i] != 0) {
+                page_number = i * HOST_LONG_BITS;
+                addr1 = page_number * TARGET_PAGE_SIZE;
+                addr = mem->phys_offset + addr1;
+                cpu_physical_memory_set_dirty_range(addr, 
+                    leul_to_cpu(bitmap[i]));
             }
         }
-        start_addr = phys_addr;
+        start_addr = mem->start_addr + mem->memory_size;
     }
     qemu_free(d.dirty_bitmap);
 
-- 
1.7.0.31.g1df487

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v2 6/6] Use cpu_physical_memory_get_dirty_range() to check multiple dirty pages.
  2010-04-06  0:51 ` [Qemu-devel] " Yoshiaki Tamura
@ 2010-04-06  0:51   ` Yoshiaki Tamura
  -1 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel
  Cc: avi, anthony, aliguori, mtosatti, ohmura.kei, Yoshiaki Tamura

Modifies ram_save_block() and ram_save_remaining() to use
cpu_physical_memory_get_dirty_range() to check multiple dirty and non-dirty
pages at once.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
 arch_init.c |   54 +++++++++++++++++++++++++++++++++---------------------
 1 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index cfc03ea..245a082 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -108,31 +108,37 @@ static int ram_save_block(QEMUFile *f)
     static ram_addr_t current_addr = 0;
     ram_addr_t saved_addr = current_addr;
     ram_addr_t addr = 0;
-    int found = 0;
+    ram_addr_t dirty_rams[HOST_LONG_BITS];
+    int i, found = 0;
 
     while (addr < last_ram_offset) {
-        if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) {
+        if ((found = cpu_physical_memory_get_dirty_range(
+                 current_addr, last_ram_offset, dirty_rams, HOST_LONG_BITS,
+                 MIGRATION_DIRTY_FLAG))) {
             uint8_t *p;
 
-            cpu_physical_memory_reset_dirty(current_addr,
-                                            current_addr + TARGET_PAGE_SIZE,
-                                            MIGRATION_DIRTY_FLAG);
+            for (i = 0; i < found; i++) {
+                ram_addr_t page_addr = dirty_rams[i];
+                cpu_physical_memory_reset_dirty(page_addr,
+                                                page_addr + TARGET_PAGE_SIZE,
+                                                MIGRATION_DIRTY_FLAG);
 
-            p = qemu_get_ram_ptr(current_addr);
+                p = qemu_get_ram_ptr(page_addr);
 
-            if (is_dup_page(p, *p)) {
-                qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_COMPRESS);
-                qemu_put_byte(f, *p);
-            } else {
-                qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_PAGE);
-                qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+                if (is_dup_page(p, *p)) {
+                    qemu_put_be64(f, page_addr | RAM_SAVE_FLAG_COMPRESS);
+                    qemu_put_byte(f, *p);
+                } else {
+                    qemu_put_be64(f, page_addr | RAM_SAVE_FLAG_PAGE);
+                    qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+                }
             }
 
-            found = 1;
             break;
+        } else {
+            addr += dirty_rams[0];
+            current_addr = (saved_addr + addr) % last_ram_offset;
         }
-        addr += TARGET_PAGE_SIZE;
-        current_addr = (saved_addr + addr) % last_ram_offset;
     }
 
     return found;
@@ -142,12 +148,18 @@ static uint64_t bytes_transferred;
 
 static ram_addr_t ram_save_remaining(void)
 {
-    ram_addr_t addr;
+    ram_addr_t addr = 0;
     ram_addr_t count = 0;
+    ram_addr_t dirty_rams[HOST_LONG_BITS];
+    int found = 0;
 
-    for (addr = 0; addr < last_ram_offset; addr += TARGET_PAGE_SIZE) {
-        if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
-            count++;
+    while (addr < last_ram_offset) {
+        if ((found = cpu_physical_memory_get_dirty_range(addr, last_ram_offset,
+            dirty_rams, HOST_LONG_BITS, MIGRATION_DIRTY_FLAG))) {
+            count += found;
+            addr = dirty_rams[found - 1] + TARGET_PAGE_SIZE;
+        } else {
+            addr += dirty_rams[0];
         }
     }
 
-- 
1.7.0.31.g1df487


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [PATCH v2 6/6] Use cpu_physical_memory_get_dirty_range() to check multiple dirty pages.
@ 2010-04-06  0:51   ` Yoshiaki Tamura
  0 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-06  0:51 UTC (permalink / raw)
  To: kvm, qemu-devel; +Cc: aliguori, ohmura.kei, mtosatti, Yoshiaki Tamura, avi

Modifies ram_save_block() and ram_save_remaining() to use
cpu_physical_memory_get_dirty_range() to check multiple dirty and non-dirty
pages at once.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
 arch_init.c |   54 +++++++++++++++++++++++++++++++++---------------------
 1 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index cfc03ea..245a082 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -108,31 +108,37 @@ static int ram_save_block(QEMUFile *f)
     static ram_addr_t current_addr = 0;
     ram_addr_t saved_addr = current_addr;
     ram_addr_t addr = 0;
-    int found = 0;
+    ram_addr_t dirty_rams[HOST_LONG_BITS];
+    int i, found = 0;
 
     while (addr < last_ram_offset) {
-        if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) {
+        if ((found = cpu_physical_memory_get_dirty_range(
+                 current_addr, last_ram_offset, dirty_rams, HOST_LONG_BITS,
+                 MIGRATION_DIRTY_FLAG))) {
             uint8_t *p;
 
-            cpu_physical_memory_reset_dirty(current_addr,
-                                            current_addr + TARGET_PAGE_SIZE,
-                                            MIGRATION_DIRTY_FLAG);
+            for (i = 0; i < found; i++) {
+                ram_addr_t page_addr = dirty_rams[i];
+                cpu_physical_memory_reset_dirty(page_addr,
+                                                page_addr + TARGET_PAGE_SIZE,
+                                                MIGRATION_DIRTY_FLAG);
 
-            p = qemu_get_ram_ptr(current_addr);
+                p = qemu_get_ram_ptr(page_addr);
 
-            if (is_dup_page(p, *p)) {
-                qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_COMPRESS);
-                qemu_put_byte(f, *p);
-            } else {
-                qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_PAGE);
-                qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+                if (is_dup_page(p, *p)) {
+                    qemu_put_be64(f, page_addr | RAM_SAVE_FLAG_COMPRESS);
+                    qemu_put_byte(f, *p);
+                } else {
+                    qemu_put_be64(f, page_addr | RAM_SAVE_FLAG_PAGE);
+                    qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+                }
             }
 
-            found = 1;
             break;
+        } else {
+            addr += dirty_rams[0];
+            current_addr = (saved_addr + addr) % last_ram_offset;
         }
-        addr += TARGET_PAGE_SIZE;
-        current_addr = (saved_addr + addr) % last_ram_offset;
     }
 
     return found;
@@ -142,12 +148,18 @@ static uint64_t bytes_transferred;
 
 static ram_addr_t ram_save_remaining(void)
 {
-    ram_addr_t addr;
+    ram_addr_t addr = 0;
     ram_addr_t count = 0;
+    ram_addr_t dirty_rams[HOST_LONG_BITS];
+    int found = 0;
 
-    for (addr = 0; addr < last_ram_offset; addr += TARGET_PAGE_SIZE) {
-        if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
-            count++;
+    while (addr < last_ram_offset) {
+        if ((found = cpu_physical_memory_get_dirty_range(addr, last_ram_offset,
+            dirty_rams, HOST_LONG_BITS, MIGRATION_DIRTY_FLAG))) {
+            count += found;
+            addr = dirty_rams[found - 1] + TARGET_PAGE_SIZE;
+        } else {
+            addr += dirty_rams[0];
         }
     }
 
-- 
1.7.0.31.g1df487

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 2/6] Introduce bit-based phys_ram_dirty for VGA, CODE, MIGRATION and MASTER.
  2010-04-06  0:51   ` [Qemu-devel] " Yoshiaki Tamura
@ 2010-04-12  8:01     ` Avi Kivity
  -1 siblings, 0 replies; 32+ messages in thread
From: Avi Kivity @ 2010-04-12  8:01 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: kvm, qemu-devel, anthony, aliguori, mtosatti, ohmura.kei

On 04/06/2010 03:51 AM, Yoshiaki Tamura wrote:
> Replaces byte-based phys_ram_dirty bitmap with three bit-based phys_ram_dirty
> bitmap.  On allocation, it sets all bits in the bitmap.
>
>
>    

> index c74b0a4..9733892 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -110,7 +110,7 @@ uint8_t *code_gen_ptr;
>
>   #if !defined(CONFIG_USER_ONLY)
>   int phys_ram_fd;
> -uint8_t *phys_ram_dirty;
> +unsigned long *phys_ram_dirty[NUM_DIRTY_FLAGS];
>   static int in_migration;
>
>   typedef struct RAMBlock {
> @@ -2825,10 +2825,32 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
>       new_block->next = ram_blocks;
>       ram_blocks = new_block;
>
> -    phys_ram_dirty = qemu_realloc(phys_ram_dirty,
> -        (last_ram_offset + size)>>  TARGET_PAGE_BITS);
> -    memset(phys_ram_dirty + (last_ram_offset>>  TARGET_PAGE_BITS),
> -           0xff, size>>  TARGET_PAGE_BITS);
> +/* temporarily copy from qemu-kvm.git/qemu-kvm.h */
> +#define ALIGN(x, y)  (((x)+(y)-1)&  ~((y)-1))
> +#define BITMAP_SIZE(m) (ALIGN(((m)>>TARGET_PAGE_BITS), HOST_LONG_BITS) / 8)
>    

Please put in some header file, maybe qemu-common.h.

> +
> +    if (BITMAP_SIZE(last_ram_offset + size) != BITMAP_SIZE(last_ram_offset)) {
> +        phys_ram_dirty[MASTER_DIRTY_FLAG] =
> +            qemu_realloc(phys_ram_dirty[MASTER_DIRTY_FLAG],
> +                         BITMAP_SIZE(last_ram_offset + size));
> +        phys_ram_dirty[VGA_DIRTY_FLAG]
> +            = qemu_realloc(phys_ram_dirty[VGA_DIRTY_FLAG],
> +                           BITMAP_SIZE(last_ram_offset + size));
> +        phys_ram_dirty[CODE_DIRTY_FLAG] =
> +            qemu_realloc(phys_ram_dirty[CODE_DIRTY_FLAG],
> +                         BITMAP_SIZE(last_ram_offset + size));
> +        phys_ram_dirty[MIGRATION_DIRTY_FLAG] =
> +            qemu_realloc(phys_ram_dirty[MIGRATION_DIRTY_FLAG],
> +                         BITMAP_SIZE(last_ram_offset + size));
> +        memset((uint8_t *)phys_ram_dirty[MASTER_DIRTY_FLAG] +
> +               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
> +        memset((uint8_t *)phys_ram_dirty[VGA_DIRTY_FLAG] +
> +               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
> +        memset((uint8_t *)phys_ram_dirty[CODE_DIRTY_FLAG] +
> +               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
> +        memset((uint8_t *)phys_ram_dirty[MIGRATION_DIRTY_FLAG] +
> +               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
> +    }
>
>    

Should be nicer as a loop calling a helper to allocate each bitmap.  
This patch won't work by itself, will it?  I think you need to merge it 
with the succeeding patches.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Qemu-devel] Re: [PATCH v2 2/6] Introduce bit-based phys_ram_dirty for VGA, CODE, MIGRATION and MASTER.
@ 2010-04-12  8:01     ` Avi Kivity
  0 siblings, 0 replies; 32+ messages in thread
From: Avi Kivity @ 2010-04-12  8:01 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: aliguori, kvm, ohmura.kei, mtosatti, qemu-devel

On 04/06/2010 03:51 AM, Yoshiaki Tamura wrote:
> Replaces byte-based phys_ram_dirty bitmap with three bit-based phys_ram_dirty
> bitmap.  On allocation, it sets all bits in the bitmap.
>
>
>    

> index c74b0a4..9733892 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -110,7 +110,7 @@ uint8_t *code_gen_ptr;
>
>   #if !defined(CONFIG_USER_ONLY)
>   int phys_ram_fd;
> -uint8_t *phys_ram_dirty;
> +unsigned long *phys_ram_dirty[NUM_DIRTY_FLAGS];
>   static int in_migration;
>
>   typedef struct RAMBlock {
> @@ -2825,10 +2825,32 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
>       new_block->next = ram_blocks;
>       ram_blocks = new_block;
>
> -    phys_ram_dirty = qemu_realloc(phys_ram_dirty,
> -        (last_ram_offset + size)>>  TARGET_PAGE_BITS);
> -    memset(phys_ram_dirty + (last_ram_offset>>  TARGET_PAGE_BITS),
> -           0xff, size>>  TARGET_PAGE_BITS);
> +/* temporarily copy from qemu-kvm.git/qemu-kvm.h */
> +#define ALIGN(x, y)  (((x)+(y)-1)&  ~((y)-1))
> +#define BITMAP_SIZE(m) (ALIGN(((m)>>TARGET_PAGE_BITS), HOST_LONG_BITS) / 8)
>    

Please put in some header file, maybe qemu-common.h.

> +
> +    if (BITMAP_SIZE(last_ram_offset + size) != BITMAP_SIZE(last_ram_offset)) {
> +        phys_ram_dirty[MASTER_DIRTY_FLAG] =
> +            qemu_realloc(phys_ram_dirty[MASTER_DIRTY_FLAG],
> +                         BITMAP_SIZE(last_ram_offset + size));
> +        phys_ram_dirty[VGA_DIRTY_FLAG]
> +            = qemu_realloc(phys_ram_dirty[VGA_DIRTY_FLAG],
> +                           BITMAP_SIZE(last_ram_offset + size));
> +        phys_ram_dirty[CODE_DIRTY_FLAG] =
> +            qemu_realloc(phys_ram_dirty[CODE_DIRTY_FLAG],
> +                         BITMAP_SIZE(last_ram_offset + size));
> +        phys_ram_dirty[MIGRATION_DIRTY_FLAG] =
> +            qemu_realloc(phys_ram_dirty[MIGRATION_DIRTY_FLAG],
> +                         BITMAP_SIZE(last_ram_offset + size));
> +        memset((uint8_t *)phys_ram_dirty[MASTER_DIRTY_FLAG] +
> +               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
> +        memset((uint8_t *)phys_ram_dirty[VGA_DIRTY_FLAG] +
> +               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
> +        memset((uint8_t *)phys_ram_dirty[CODE_DIRTY_FLAG] +
> +               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
> +        memset((uint8_t *)phys_ram_dirty[MIGRATION_DIRTY_FLAG] +
> +               BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
> +    }
>
>    

Should be nicer as a loop calling a helper to allocate each bitmap.  
This patch won't work by itself, will it?  I think you need to merge it 
with the succeeding patches.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
  2010-04-06  0:51   ` [Qemu-devel] " Yoshiaki Tamura
@ 2010-04-12  8:10     ` Avi Kivity
  -1 siblings, 0 replies; 32+ messages in thread
From: Avi Kivity @ 2010-04-12  8:10 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: kvm, qemu-devel, anthony, aliguori, mtosatti, ohmura.kei

On 04/06/2010 03:51 AM, Yoshiaki Tamura wrote:
> Signed-off-by: Yoshiaki Tamura<tamura.yoshiaki@lab.ntt.co.jp>
> Signed-off-by: OHMURA Kei<ohmura.kei@lab.ntt.co.jp>
> ---
>
>   static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
>   {
> -    return phys_ram_dirty[addr>>  TARGET_PAGE_BITS];
> +     unsigned long mask;
> +     int index = (addr>>  TARGET_PAGE_BITS) / HOST_LONG_BITS;
> +     int offset = (addr>>  TARGET_PAGE_BITS)&  (HOST_LONG_BITS - 1);
> +     int ret = 0;
> +
> +     mask = 1UL<<  offset;
> +     if (phys_ram_dirty[MASTER_DIRTY_FLAG][index]&  mask)
> +         return 0xff;
> +     if (phys_ram_dirty[VGA_DIRTY_FLAG][index]&  mask)
> +         ret |= VGA_DIRTY_FLAG;
> +     if (phys_ram_dirty[CODE_DIRTY_FLAG][index]&  mask)
> +         ret |=  CODE_DIRTY_FLAG;
> +     if (phys_ram_dirty[MIGRATION_DIRTY_FLAG][index]&  mask)
> +         ret |= MIGRATION_DIRTY_FLAG;
> +
> +     return ret;
>   }
>    

Again, nicer as a loop.

I think if you define both *_DIRTY_FLAG and *_DIRTY_IDX the transition 
patches can be nicer.

Coding style: use braces after if (), even for single statements.

>
>   static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
>                                                   int dirty_flags)
>   {
> -    return phys_ram_dirty[addr>>  TARGET_PAGE_BITS]&  dirty_flags;
> +    unsigned long mask;
> +    int index = (addr>>  TARGET_PAGE_BITS) / HOST_LONG_BITS;
> +    int offset = (addr>>  TARGET_PAGE_BITS)&  (HOST_LONG_BITS - 1);
> +
> +    mask = 1UL<<  offset;
> +    return (phys_ram_dirty[MASTER_DIRTY_FLAG][index]&  mask) ||
> +        (phys_ram_dirty[dirty_flags][index]&  mask);
>   }
>    

A helper that also accepts the DIRTY_IDX index can increase reuse.

>
>   static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
>   {
> -    phys_ram_dirty[addr>>  TARGET_PAGE_BITS] = 0xff;
> +    unsigned long mask;
> +    int index = (addr>>  TARGET_PAGE_BITS) / HOST_LONG_BITS;
> +    int offset = (addr>>  TARGET_PAGE_BITS)&  (HOST_LONG_BITS - 1);
> +
> +    mask = 1UL<<  offset;
> +    phys_ram_dirty[MASTER_DIRTY_FLAG][index] |= mask;
>    

>
> -static inline int cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
> -                                                      int dirty_flags)
> +static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
> +                                                       int dirty_flags)
>   {
> -    return phys_ram_dirty[addr>>  TARGET_PAGE_BITS] |= dirty_flags;
> +    unsigned long mask;
> +    int index = (addr>>  TARGET_PAGE_BITS) / HOST_LONG_BITS;
> +    int offset = (addr>>  TARGET_PAGE_BITS)&  (HOST_LONG_BITS - 1);
> +
> +    mask = 1UL<<  offset;
> +    if (dirty_flags&  VGA_DIRTY_FLAG)
> +        phys_ram_dirty[VGA_DIRTY_FLAG][index] |= mask;
> +    if (dirty_flags&  CODE_DIRTY_FLAG)
> +        phys_ram_dirty[CODE_DIRTY_FLAG][index] |= mask;
> +    if (dirty_flags&  MIGRATION_DIRTY_FLAG)
> +        phys_ram_dirty[MIGRATION_DIRTY_FLAG][index] |= mask;
>   }
>    

Is it necessary to update migration and vga bitmaps?

We can simply update the master bitmap, and update the migration and vga 
bitmaps only when they need it.  That can be done in a different patch.

Note that we should only allocate the migration and vga bitmaps when 
migration or vga is active.


-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Qemu-devel] Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
@ 2010-04-12  8:10     ` Avi Kivity
  0 siblings, 0 replies; 32+ messages in thread
From: Avi Kivity @ 2010-04-12  8:10 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: aliguori, kvm, ohmura.kei, mtosatti, qemu-devel

On 04/06/2010 03:51 AM, Yoshiaki Tamura wrote:
> Signed-off-by: Yoshiaki Tamura<tamura.yoshiaki@lab.ntt.co.jp>
> Signed-off-by: OHMURA Kei<ohmura.kei@lab.ntt.co.jp>
> ---
>
>   static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
>   {
> -    return phys_ram_dirty[addr>>  TARGET_PAGE_BITS];
> +     unsigned long mask;
> +     int index = (addr>>  TARGET_PAGE_BITS) / HOST_LONG_BITS;
> +     int offset = (addr>>  TARGET_PAGE_BITS)&  (HOST_LONG_BITS - 1);
> +     int ret = 0;
> +
> +     mask = 1UL<<  offset;
> +     if (phys_ram_dirty[MASTER_DIRTY_FLAG][index]&  mask)
> +         return 0xff;
> +     if (phys_ram_dirty[VGA_DIRTY_FLAG][index]&  mask)
> +         ret |= VGA_DIRTY_FLAG;
> +     if (phys_ram_dirty[CODE_DIRTY_FLAG][index]&  mask)
> +         ret |=  CODE_DIRTY_FLAG;
> +     if (phys_ram_dirty[MIGRATION_DIRTY_FLAG][index]&  mask)
> +         ret |= MIGRATION_DIRTY_FLAG;
> +
> +     return ret;
>   }
>    

Again, nicer as a loop.

I think if you define both *_DIRTY_FLAG and *_DIRTY_IDX the transition 
patches can be nicer.

Coding style: use braces after if (), even for single statements.

>
>   static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
>                                                   int dirty_flags)
>   {
> -    return phys_ram_dirty[addr>>  TARGET_PAGE_BITS]&  dirty_flags;
> +    unsigned long mask;
> +    int index = (addr>>  TARGET_PAGE_BITS) / HOST_LONG_BITS;
> +    int offset = (addr>>  TARGET_PAGE_BITS)&  (HOST_LONG_BITS - 1);
> +
> +    mask = 1UL<<  offset;
> +    return (phys_ram_dirty[MASTER_DIRTY_FLAG][index]&  mask) ||
> +        (phys_ram_dirty[dirty_flags][index]&  mask);
>   }
>    

A helper that also accepts the DIRTY_IDX index can increase reuse.

>
>   static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
>   {
> -    phys_ram_dirty[addr>>  TARGET_PAGE_BITS] = 0xff;
> +    unsigned long mask;
> +    int index = (addr>>  TARGET_PAGE_BITS) / HOST_LONG_BITS;
> +    int offset = (addr>>  TARGET_PAGE_BITS)&  (HOST_LONG_BITS - 1);
> +
> +    mask = 1UL<<  offset;
> +    phys_ram_dirty[MASTER_DIRTY_FLAG][index] |= mask;
>    

>
> -static inline int cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
> -                                                      int dirty_flags)
> +static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
> +                                                       int dirty_flags)
>   {
> -    return phys_ram_dirty[addr>>  TARGET_PAGE_BITS] |= dirty_flags;
> +    unsigned long mask;
> +    int index = (addr>>  TARGET_PAGE_BITS) / HOST_LONG_BITS;
> +    int offset = (addr>>  TARGET_PAGE_BITS)&  (HOST_LONG_BITS - 1);
> +
> +    mask = 1UL<<  offset;
> +    if (dirty_flags&  VGA_DIRTY_FLAG)
> +        phys_ram_dirty[VGA_DIRTY_FLAG][index] |= mask;
> +    if (dirty_flags&  CODE_DIRTY_FLAG)
> +        phys_ram_dirty[CODE_DIRTY_FLAG][index] |= mask;
> +    if (dirty_flags&  MIGRATION_DIRTY_FLAG)
> +        phys_ram_dirty[MIGRATION_DIRTY_FLAG][index] |= mask;
>   }
>    

Is it necessary to update migration and vga bitmaps?

We can simply update the master bitmap, and update the migration and vga 
bitmaps only when they need it.  That can be done in a different patch.

Note that we should only allocate the migration and vga bitmaps when 
migration or vga is active.


-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 2/6] Introduce bit-based phys_ram_dirty for VGA, CODE, MIGRATION and MASTER.
  2010-04-12  8:01     ` [Qemu-devel] " Avi Kivity
@ 2010-04-12  9:39       ` Yoshiaki Tamura
  -1 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-12  9:39 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, qemu-devel, anthony, aliguori, mtosatti, ohmura.kei

Avi Kivity wrote:
> On 04/06/2010 03:51 AM, Yoshiaki Tamura wrote:
>> Replaces byte-based phys_ram_dirty bitmap with three bit-based
>> phys_ram_dirty
>> bitmap. On allocation, it sets all bits in the bitmap.
>>
>>
>
>> index c74b0a4..9733892 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -110,7 +110,7 @@ uint8_t *code_gen_ptr;
>>
>> #if !defined(CONFIG_USER_ONLY)
>> int phys_ram_fd;
>> -uint8_t *phys_ram_dirty;
>> +unsigned long *phys_ram_dirty[NUM_DIRTY_FLAGS];
>> static int in_migration;
>>
>> typedef struct RAMBlock {
>> @@ -2825,10 +2825,32 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
>> new_block->next = ram_blocks;
>> ram_blocks = new_block;
>>
>> - phys_ram_dirty = qemu_realloc(phys_ram_dirty,
>> - (last_ram_offset + size)>> TARGET_PAGE_BITS);
>> - memset(phys_ram_dirty + (last_ram_offset>> TARGET_PAGE_BITS),
>> - 0xff, size>> TARGET_PAGE_BITS);
>> +/* temporarily copy from qemu-kvm.git/qemu-kvm.h */
>> +#define ALIGN(x, y) (((x)+(y)-1)& ~((y)-1))
>> +#define BITMAP_SIZE(m) (ALIGN(((m)>>TARGET_PAGE_BITS),
>> HOST_LONG_BITS) / 8)
>
> Please put in some header file, maybe qemu-common.h.

OK.  BTW, is qemu-kvm.h planned to go upstream?

>> +
>> + if (BITMAP_SIZE(last_ram_offset + size) !=
>> BITMAP_SIZE(last_ram_offset)) {
>> + phys_ram_dirty[MASTER_DIRTY_FLAG] =
>> + qemu_realloc(phys_ram_dirty[MASTER_DIRTY_FLAG],
>> + BITMAP_SIZE(last_ram_offset + size));
>> + phys_ram_dirty[VGA_DIRTY_FLAG]
>> + = qemu_realloc(phys_ram_dirty[VGA_DIRTY_FLAG],
>> + BITMAP_SIZE(last_ram_offset + size));
>> + phys_ram_dirty[CODE_DIRTY_FLAG] =
>> + qemu_realloc(phys_ram_dirty[CODE_DIRTY_FLAG],
>> + BITMAP_SIZE(last_ram_offset + size));
>> + phys_ram_dirty[MIGRATION_DIRTY_FLAG] =
>> + qemu_realloc(phys_ram_dirty[MIGRATION_DIRTY_FLAG],
>> + BITMAP_SIZE(last_ram_offset + size));
>> + memset((uint8_t *)phys_ram_dirty[MASTER_DIRTY_FLAG] +
>> + BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
>> + memset((uint8_t *)phys_ram_dirty[VGA_DIRTY_FLAG] +
>> + BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
>> + memset((uint8_t *)phys_ram_dirty[CODE_DIRTY_FLAG] +
>> + BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
>> + memset((uint8_t *)phys_ram_dirty[MIGRATION_DIRTY_FLAG] +
>> + BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
>> + }
>>
>
> Should be nicer as a loop calling a helper to allocate each bitmap. This
> patch won't work by itself, will it? I think you need to merge it with
> the succeeding patches.

Yeah.  I originally wrote as you suggested, but I needed to skip 0x03 because of 
the reason written below.

> +/* Use DIRTY_FLAG as indexes of bit-based phys_ram_dirty.
> +   0x03 is empty to make it compatible with byte-based bitmap. */
> +#define MASTER_DIRTY_FLAG    0x00
>  #define VGA_DIRTY_FLAG       0x01
>  #define CODE_DIRTY_FLAG      0x02
> -#define MIGRATION_DIRTY_FLAG 0x08
> +#define MIGRATION_DIRTY_FLAG 0x04

If you think if (i != 0x03) is better, I would modify it to a loop.

And sorry, you need to merge the succeeding patches to make it work.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Qemu-devel] Re: [PATCH v2 2/6] Introduce bit-based phys_ram_dirty for VGA, CODE, MIGRATION and MASTER.
@ 2010-04-12  9:39       ` Yoshiaki Tamura
  0 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-12  9:39 UTC (permalink / raw)
  To: Avi Kivity; +Cc: aliguori, kvm, ohmura.kei, mtosatti, qemu-devel

Avi Kivity wrote:
> On 04/06/2010 03:51 AM, Yoshiaki Tamura wrote:
>> Replaces byte-based phys_ram_dirty bitmap with three bit-based
>> phys_ram_dirty
>> bitmap. On allocation, it sets all bits in the bitmap.
>>
>>
>
>> index c74b0a4..9733892 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -110,7 +110,7 @@ uint8_t *code_gen_ptr;
>>
>> #if !defined(CONFIG_USER_ONLY)
>> int phys_ram_fd;
>> -uint8_t *phys_ram_dirty;
>> +unsigned long *phys_ram_dirty[NUM_DIRTY_FLAGS];
>> static int in_migration;
>>
>> typedef struct RAMBlock {
>> @@ -2825,10 +2825,32 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
>> new_block->next = ram_blocks;
>> ram_blocks = new_block;
>>
>> - phys_ram_dirty = qemu_realloc(phys_ram_dirty,
>> - (last_ram_offset + size)>> TARGET_PAGE_BITS);
>> - memset(phys_ram_dirty + (last_ram_offset>> TARGET_PAGE_BITS),
>> - 0xff, size>> TARGET_PAGE_BITS);
>> +/* temporarily copy from qemu-kvm.git/qemu-kvm.h */
>> +#define ALIGN(x, y) (((x)+(y)-1)& ~((y)-1))
>> +#define BITMAP_SIZE(m) (ALIGN(((m)>>TARGET_PAGE_BITS),
>> HOST_LONG_BITS) / 8)
>
> Please put in some header file, maybe qemu-common.h.

OK.  BTW, is qemu-kvm.h planned to go upstream?

>> +
>> + if (BITMAP_SIZE(last_ram_offset + size) !=
>> BITMAP_SIZE(last_ram_offset)) {
>> + phys_ram_dirty[MASTER_DIRTY_FLAG] =
>> + qemu_realloc(phys_ram_dirty[MASTER_DIRTY_FLAG],
>> + BITMAP_SIZE(last_ram_offset + size));
>> + phys_ram_dirty[VGA_DIRTY_FLAG]
>> + = qemu_realloc(phys_ram_dirty[VGA_DIRTY_FLAG],
>> + BITMAP_SIZE(last_ram_offset + size));
>> + phys_ram_dirty[CODE_DIRTY_FLAG] =
>> + qemu_realloc(phys_ram_dirty[CODE_DIRTY_FLAG],
>> + BITMAP_SIZE(last_ram_offset + size));
>> + phys_ram_dirty[MIGRATION_DIRTY_FLAG] =
>> + qemu_realloc(phys_ram_dirty[MIGRATION_DIRTY_FLAG],
>> + BITMAP_SIZE(last_ram_offset + size));
>> + memset((uint8_t *)phys_ram_dirty[MASTER_DIRTY_FLAG] +
>> + BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
>> + memset((uint8_t *)phys_ram_dirty[VGA_DIRTY_FLAG] +
>> + BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
>> + memset((uint8_t *)phys_ram_dirty[CODE_DIRTY_FLAG] +
>> + BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
>> + memset((uint8_t *)phys_ram_dirty[MIGRATION_DIRTY_FLAG] +
>> + BITMAP_SIZE(last_ram_offset), 0xff, BITMAP_SIZE(size));
>> + }
>>
>
> Should be nicer as a loop calling a helper to allocate each bitmap. This
> patch won't work by itself, will it? I think you need to merge it with
> the succeeding patches.

Yeah.  I originally wrote as you suggested, but I needed to skip 0x03 because of 
the reason written below.

> +/* Use DIRTY_FLAG as indexes of bit-based phys_ram_dirty.
> +   0x03 is empty to make it compatible with byte-based bitmap. */
> +#define MASTER_DIRTY_FLAG    0x00
>  #define VGA_DIRTY_FLAG       0x01
>  #define CODE_DIRTY_FLAG      0x02
> -#define MIGRATION_DIRTY_FLAG 0x08
> +#define MIGRATION_DIRTY_FLAG 0x04

If you think if (i != 0x03) is better, I would modify it to a loop.

And sorry, you need to merge the succeeding patches to make it work.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 2/6] Introduce bit-based phys_ram_dirty for VGA, CODE, MIGRATION and MASTER.
  2010-04-12  9:39       ` [Qemu-devel] " Yoshiaki Tamura
@ 2010-04-12 10:17         ` Avi Kivity
  -1 siblings, 0 replies; 32+ messages in thread
From: Avi Kivity @ 2010-04-12 10:17 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: kvm, qemu-devel, anthony, aliguori, mtosatti, ohmura.kei

On 04/12/2010 12:39 PM, Yoshiaki Tamura wrote:
>> Please put in some header file, maybe qemu-common.h.
>
>
> OK.  BTW, is qemu-kvm.h planned to go upstream?

No.  Use kvm.h for kvm specific symbols (qemu-kvm.h includes it).

>>
>> Should be nicer as a loop calling a helper to allocate each bitmap. This
>> patch won't work by itself, will it? I think you need to merge it with
>> the succeeding patches.
>
> Yeah.  I originally wrote as you suggested, but I needed to skip 0x03 
> because of the reason written below.
>
>> +/* Use DIRTY_FLAG as indexes of bit-based phys_ram_dirty.
>> +   0x03 is empty to make it compatible with byte-based bitmap. */
>> +#define MASTER_DIRTY_FLAG    0x00
>>  #define VGA_DIRTY_FLAG       0x01
>>  #define CODE_DIRTY_FLAG      0x02
>> -#define MIGRATION_DIRTY_FLAG 0x08
>> +#define MIGRATION_DIRTY_FLAG 0x04
>
> If you think if (i != 0x03) is better, I would modify it to a loop.

You can have *_DIRTY_FLAG (1, 2, 4, 8) and *_DIRTY_IDX (0, 1, 2, 3); use 
_FLAG for compatibility with byte-based maps, and _IDX for multiple bitmaps.

Alternatively, have only _IDX, and modify the callers to shift when 
necessary.  At the end of the patchset, we'll only use bitmaps, so the 
shifts will be all gone.


-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Qemu-devel] Re: [PATCH v2 2/6] Introduce bit-based phys_ram_dirty for VGA, CODE, MIGRATION and MASTER.
@ 2010-04-12 10:17         ` Avi Kivity
  0 siblings, 0 replies; 32+ messages in thread
From: Avi Kivity @ 2010-04-12 10:17 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: aliguori, kvm, ohmura.kei, mtosatti, qemu-devel

On 04/12/2010 12:39 PM, Yoshiaki Tamura wrote:
>> Please put in some header file, maybe qemu-common.h.
>
>
> OK.  BTW, is qemu-kvm.h planned to go upstream?

No.  Use kvm.h for kvm specific symbols (qemu-kvm.h includes it).

>>
>> Should be nicer as a loop calling a helper to allocate each bitmap. This
>> patch won't work by itself, will it? I think you need to merge it with
>> the succeeding patches.
>
> Yeah.  I originally wrote as you suggested, but I needed to skip 0x03 
> because of the reason written below.
>
>> +/* Use DIRTY_FLAG as indexes of bit-based phys_ram_dirty.
>> +   0x03 is empty to make it compatible with byte-based bitmap. */
>> +#define MASTER_DIRTY_FLAG    0x00
>>  #define VGA_DIRTY_FLAG       0x01
>>  #define CODE_DIRTY_FLAG      0x02
>> -#define MIGRATION_DIRTY_FLAG 0x08
>> +#define MIGRATION_DIRTY_FLAG 0x04
>
> If you think if (i != 0x03) is better, I would modify it to a loop.

You can have *_DIRTY_FLAG (1, 2, 4, 8) and *_DIRTY_IDX (0, 1, 2, 3); use 
_FLAG for compatibility with byte-based maps, and _IDX for multiple bitmaps.

Alternatively, have only _IDX, and modify the callers to shift when 
necessary.  At the end of the patchset, we'll only use bitmaps, so the 
shifts will be all gone.


-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
  2010-04-12  8:10     ` [Qemu-devel] " Avi Kivity
@ 2010-04-12 10:58       ` Yoshiaki Tamura
  -1 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-12 10:58 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, qemu-devel, anthony, aliguori, mtosatti, ohmura.kei

Avi Kivity wrote:
> On 04/06/2010 03:51 AM, Yoshiaki Tamura wrote:
>> Signed-off-by: Yoshiaki Tamura<tamura.yoshiaki@lab.ntt.co.jp>
>> Signed-off-by: OHMURA Kei<ohmura.kei@lab.ntt.co.jp>
>> ---
>>
>> static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
>> {
>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS];
>> + unsigned long mask;
>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>> + int ret = 0;
>> +
>> + mask = 1UL<< offset;
>> + if (phys_ram_dirty[MASTER_DIRTY_FLAG][index]& mask)
>> + return 0xff;
>> + if (phys_ram_dirty[VGA_DIRTY_FLAG][index]& mask)
>> + ret |= VGA_DIRTY_FLAG;
>> + if (phys_ram_dirty[CODE_DIRTY_FLAG][index]& mask)
>> + ret |= CODE_DIRTY_FLAG;
>> + if (phys_ram_dirty[MIGRATION_DIRTY_FLAG][index]& mask)
>> + ret |= MIGRATION_DIRTY_FLAG;
>> +
>> + return ret;
>> }
>
> Again, nicer as a loop.
> I think if you define both *_DIRTY_FLAG and *_DIRTY_IDX the transition
> patches can be nicer.

Got your suggestion from the other message.
I agree.

> Coding style: use braces after if (), even for single statements.

Thanks for pointing out.

>> static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
>> int dirty_flags)
>> {
>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags;
>> + unsigned long mask;
>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>> +
>> + mask = 1UL<< offset;
>> + return (phys_ram_dirty[MASTER_DIRTY_FLAG][index]& mask) ||
>> + (phys_ram_dirty[dirty_flags][index]& mask);
>> }
>
> A helper that also accepts the DIRTY_IDX index can increase reuse.

I would ffsl to map DIRTY_FLAG to DIRTY_IDX.

>> static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
>> {
>> - phys_ram_dirty[addr>> TARGET_PAGE_BITS] = 0xff;
>> + unsigned long mask;
>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>> +
>> + mask = 1UL<< offset;
>> + phys_ram_dirty[MASTER_DIRTY_FLAG][index] |= mask;
>
>>
>> -static inline int cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
>> - int dirty_flags)
>> +static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
>> + int dirty_flags)
>> {
>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS] |= dirty_flags;
>> + unsigned long mask;
>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>> +
>> + mask = 1UL<< offset;
>> + if (dirty_flags& VGA_DIRTY_FLAG)
>> + phys_ram_dirty[VGA_DIRTY_FLAG][index] |= mask;
>> + if (dirty_flags& CODE_DIRTY_FLAG)
>> + phys_ram_dirty[CODE_DIRTY_FLAG][index] |= mask;
>> + if (dirty_flags& MIGRATION_DIRTY_FLAG)
>> + phys_ram_dirty[MIGRATION_DIRTY_FLAG][index] |= mask;
>> }
>
> Is it necessary to update migration and vga bitmaps?
>
> We can simply update the master bitmap, and update the migration and vga
> bitmaps only when they need it. That can be done in a different patch.

Let me explain the role of the master bitmap here.

In the previous discussion, the master bitmap represented at least one dirty 
type is actually dirty.  While implementing this approach, I though there is one 
downside that upon resetting the bitmap, it needs to check all dirty types 
whether they are already cleared to unset the master bitmap, and this might hurt 
the performance in general.

In this patch, master bitmap represents all types are dirty, similar to existing 
0xff.  With this approach, resetting the master bitmap can be done without 
checking the other types.  set_dirty_flags is actually taking the burden in this 
case though.  Anyway, IIUC somebody would be unhappy depending on the role of 
the master bitmap.

> Note that we should only allocate the migration and vga bitmaps when
> migration or vga is active.

Right.  May I do it in a different patch?
I think this is about optimization.  In this patch, I focused on not breaking 
the existing function.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Qemu-devel] Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
@ 2010-04-12 10:58       ` Yoshiaki Tamura
  0 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-12 10:58 UTC (permalink / raw)
  To: Avi Kivity; +Cc: aliguori, kvm, ohmura.kei, mtosatti, qemu-devel

Avi Kivity wrote:
> On 04/06/2010 03:51 AM, Yoshiaki Tamura wrote:
>> Signed-off-by: Yoshiaki Tamura<tamura.yoshiaki@lab.ntt.co.jp>
>> Signed-off-by: OHMURA Kei<ohmura.kei@lab.ntt.co.jp>
>> ---
>>
>> static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
>> {
>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS];
>> + unsigned long mask;
>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>> + int ret = 0;
>> +
>> + mask = 1UL<< offset;
>> + if (phys_ram_dirty[MASTER_DIRTY_FLAG][index]& mask)
>> + return 0xff;
>> + if (phys_ram_dirty[VGA_DIRTY_FLAG][index]& mask)
>> + ret |= VGA_DIRTY_FLAG;
>> + if (phys_ram_dirty[CODE_DIRTY_FLAG][index]& mask)
>> + ret |= CODE_DIRTY_FLAG;
>> + if (phys_ram_dirty[MIGRATION_DIRTY_FLAG][index]& mask)
>> + ret |= MIGRATION_DIRTY_FLAG;
>> +
>> + return ret;
>> }
>
> Again, nicer as a loop.
> I think if you define both *_DIRTY_FLAG and *_DIRTY_IDX the transition
> patches can be nicer.

Got your suggestion from the other message.
I agree.

> Coding style: use braces after if (), even for single statements.

Thanks for pointing out.

>> static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
>> int dirty_flags)
>> {
>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags;
>> + unsigned long mask;
>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>> +
>> + mask = 1UL<< offset;
>> + return (phys_ram_dirty[MASTER_DIRTY_FLAG][index]& mask) ||
>> + (phys_ram_dirty[dirty_flags][index]& mask);
>> }
>
> A helper that also accepts the DIRTY_IDX index can increase reuse.

I would ffsl to map DIRTY_FLAG to DIRTY_IDX.

>> static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
>> {
>> - phys_ram_dirty[addr>> TARGET_PAGE_BITS] = 0xff;
>> + unsigned long mask;
>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>> +
>> + mask = 1UL<< offset;
>> + phys_ram_dirty[MASTER_DIRTY_FLAG][index] |= mask;
>
>>
>> -static inline int cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
>> - int dirty_flags)
>> +static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
>> + int dirty_flags)
>> {
>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS] |= dirty_flags;
>> + unsigned long mask;
>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>> +
>> + mask = 1UL<< offset;
>> + if (dirty_flags& VGA_DIRTY_FLAG)
>> + phys_ram_dirty[VGA_DIRTY_FLAG][index] |= mask;
>> + if (dirty_flags& CODE_DIRTY_FLAG)
>> + phys_ram_dirty[CODE_DIRTY_FLAG][index] |= mask;
>> + if (dirty_flags& MIGRATION_DIRTY_FLAG)
>> + phys_ram_dirty[MIGRATION_DIRTY_FLAG][index] |= mask;
>> }
>
> Is it necessary to update migration and vga bitmaps?
>
> We can simply update the master bitmap, and update the migration and vga
> bitmaps only when they need it. That can be done in a different patch.

Let me explain the role of the master bitmap here.

In the previous discussion, the master bitmap represented at least one dirty 
type is actually dirty.  While implementing this approach, I though there is one 
downside that upon resetting the bitmap, it needs to check all dirty types 
whether they are already cleared to unset the master bitmap, and this might hurt 
the performance in general.

In this patch, master bitmap represents all types are dirty, similar to existing 
0xff.  With this approach, resetting the master bitmap can be done without 
checking the other types.  set_dirty_flags is actually taking the burden in this 
case though.  Anyway, IIUC somebody would be unhappy depending on the role of 
the master bitmap.

> Note that we should only allocate the migration and vga bitmaps when
> migration or vga is active.

Right.  May I do it in a different patch?
I think this is about optimization.  In this patch, I focused on not breaking 
the existing function.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
  2010-04-12 10:58       ` [Qemu-devel] " Yoshiaki Tamura
@ 2010-04-12 11:09         ` Avi Kivity
  -1 siblings, 0 replies; 32+ messages in thread
From: Avi Kivity @ 2010-04-12 11:09 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: kvm, qemu-devel, anthony, aliguori, mtosatti, ohmura.kei

On 04/12/2010 01:58 PM, Yoshiaki Tamura wrote:
>> Is it necessary to update migration and vga bitmaps?
>>
>> We can simply update the master bitmap, and update the migration and vga
>> bitmaps only when they need it. That can be done in a different patch.
>
>
> Let me explain the role of the master bitmap here.
>
> In the previous discussion, the master bitmap represented at least one 
> dirty type is actually dirty.  While implementing this approach, I 
> though there is one downside that upon resetting the bitmap, it needs 
> to check all dirty types whether they are already cleared to unset the 
> master bitmap, and this might hurt the performance in general.

The way I see it, the master bitmap is a buffer between the guest and 
all the other client bitmaps (migration and vga).  So, a bit can be set 
in vga and cleared in master.  Let's look at the following scenario:

1. guest touches address 0xa0000 (page 0xa0)
2. qemu updates master bitmap bit 0xa0
3. vga timer fires
4. vga syncs the dirty bitmap for addresses 0xa0000-0xc0000
4.1 vga_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
4.2 migration_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
4.3 master_bitmap[0xa0-0xc0] = 0
5. vga draws based on vga_bitmap

the nice thing is that step 4.2 can be omitted if migration is not active.

so, client bitmaps are always updated when the client requests them, 
master bitmap is only a buffer.

>
> In this patch, master bitmap represents all types are dirty, similar 
> to existing 0xff.  With this approach, resetting the master bitmap can 
> be done without checking the other types.  set_dirty_flags is actually 
> taking the burden in this case though.  Anyway, IIUC somebody would be 
> unhappy depending on the role of the master bitmap.

Yes, the problem is that set_dirty_flags is the common case and also 
uses random access, we'd want to make it touch only a single bit.  
client access is rare, and also sequential, and therefore efficient.

>
>> Note that we should only allocate the migration and vga bitmaps when
>> migration or vga is active.
>
> Right.  May I do it in a different patch?

Sure, more patches is usually better.

> I think this is about optimization.  In this patch, I focused on not 
> breaking the existing function.

Yes, that's the best approach.  But we do need to see how all the 
incremental steps end up.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Qemu-devel] Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
@ 2010-04-12 11:09         ` Avi Kivity
  0 siblings, 0 replies; 32+ messages in thread
From: Avi Kivity @ 2010-04-12 11:09 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: aliguori, kvm, ohmura.kei, mtosatti, qemu-devel

On 04/12/2010 01:58 PM, Yoshiaki Tamura wrote:
>> Is it necessary to update migration and vga bitmaps?
>>
>> We can simply update the master bitmap, and update the migration and vga
>> bitmaps only when they need it. That can be done in a different patch.
>
>
> Let me explain the role of the master bitmap here.
>
> In the previous discussion, the master bitmap represented at least one 
> dirty type is actually dirty.  While implementing this approach, I 
> though there is one downside that upon resetting the bitmap, it needs 
> to check all dirty types whether they are already cleared to unset the 
> master bitmap, and this might hurt the performance in general.

The way I see it, the master bitmap is a buffer between the guest and 
all the other client bitmaps (migration and vga).  So, a bit can be set 
in vga and cleared in master.  Let's look at the following scenario:

1. guest touches address 0xa0000 (page 0xa0)
2. qemu updates master bitmap bit 0xa0
3. vga timer fires
4. vga syncs the dirty bitmap for addresses 0xa0000-0xc0000
4.1 vga_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
4.2 migration_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
4.3 master_bitmap[0xa0-0xc0] = 0
5. vga draws based on vga_bitmap

the nice thing is that step 4.2 can be omitted if migration is not active.

so, client bitmaps are always updated when the client requests them, 
master bitmap is only a buffer.

>
> In this patch, master bitmap represents all types are dirty, similar 
> to existing 0xff.  With this approach, resetting the master bitmap can 
> be done without checking the other types.  set_dirty_flags is actually 
> taking the burden in this case though.  Anyway, IIUC somebody would be 
> unhappy depending on the role of the master bitmap.

Yes, the problem is that set_dirty_flags is the common case and also 
uses random access, we'd want to make it touch only a single bit.  
client access is rare, and also sequential, and therefore efficient.

>
>> Note that we should only allocate the migration and vga bitmaps when
>> migration or vga is active.
>
> Right.  May I do it in a different patch?

Sure, more patches is usually better.

> I think this is about optimization.  In this patch, I focused on not 
> breaking the existing function.

Yes, that's the best approach.  But we do need to see how all the 
incremental steps end up.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
  2010-04-12 11:09         ` [Qemu-devel] " Avi Kivity
@ 2010-04-13  8:01           ` Yoshiaki Tamura
  -1 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-13  8:01 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, qemu-devel, anthony, aliguori, mtosatti, ohmura.kei

Avi Kivity wrote:
> On 04/12/2010 01:58 PM, Yoshiaki Tamura wrote:
>>> Is it necessary to update migration and vga bitmaps?
>>>
>>> We can simply update the master bitmap, and update the migration and vga
>>> bitmaps only when they need it. That can be done in a different patch.
>>
>>
>> Let me explain the role of the master bitmap here.
>>
>> In the previous discussion, the master bitmap represented at least one
>> dirty type is actually dirty. While implementing this approach, I
>> though there is one downside that upon resetting the bitmap, it needs
>> to check all dirty types whether they are already cleared to unset the
>> master bitmap, and this might hurt the performance in general.
>
> The way I see it, the master bitmap is a buffer between the guest and
> all the other client bitmaps (migration and vga). So, a bit can be set
> in vga and cleared in master. Let's look at the following scenario:
>
> 1. guest touches address 0xa0000 (page 0xa0)
> 2. qemu updates master bitmap bit 0xa0
> 3. vga timer fires
> 4. vga syncs the dirty bitmap for addresses 0xa0000-0xc0000
> 4.1 vga_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
> 4.2 migration_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
> 4.3 master_bitmap[0xa0-0xc0] = 0
> 5. vga draws based on vga_bitmap
>
> the nice thing is that step 4.2 can be omitted if migration is not active.
>
> so, client bitmaps are always updated when the client requests them,
> master bitmap is only a buffer.

Thanks.  I think I'm finally getting your point.
At 4.2 and 4.3, is syncing to client necessary?
Simply doing OR at get_dirty like in this patch not enough?

>> In this patch, master bitmap represents all types are dirty, similar
>> to existing 0xff. With this approach, resetting the master bitmap can
>> be done without checking the other types. set_dirty_flags is actually
>> taking the burden in this case though. Anyway, IIUC somebody would be
>> unhappy depending on the role of the master bitmap.
>
> Yes, the problem is that set_dirty_flags is the common case and also
> uses random access, we'd want to make it touch only a single bit. client
> access is rare, and also sequential, and therefore efficient.

I see.
So set_dirty_flags would be like,

1. set master first regard less of dirty type.
2. if dirty type was CODE, set the client.

>>> Note that we should only allocate the migration and vga bitmaps when
>>> migration or vga is active.
>>
>> Right. May I do it in a different patch?
>
> Sure, more patches is usually better.
>
>> I think this is about optimization. In this patch, I focused on not
>> breaking the existing function.
>
> Yes, that's the best approach. But we do need to see how all the
> incremental steps end up.

I totally agree.  Thanks for helping.

In summary, I'll prepare two patch sets.

1. v3 of this patch set.
- Change FLAGS value to (1,2,4,8), and add IDX (0,1,2,3)
- Use ffsl to convert FLAGS to IDX.
- Add help function which takes IDX.
- Change the behavior of set_dirty_flags as above.
- Change dirty bitmap access to a loop.
- Add brace after if ()
- Move some macros to qemu-common.h.

2. Allocate vga and migration dirty bitmap dynamically.

Please modify or add items if I were missing.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Qemu-devel] Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
@ 2010-04-13  8:01           ` Yoshiaki Tamura
  0 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-13  8:01 UTC (permalink / raw)
  To: Avi Kivity; +Cc: aliguori, kvm, ohmura.kei, mtosatti, qemu-devel

Avi Kivity wrote:
> On 04/12/2010 01:58 PM, Yoshiaki Tamura wrote:
>>> Is it necessary to update migration and vga bitmaps?
>>>
>>> We can simply update the master bitmap, and update the migration and vga
>>> bitmaps only when they need it. That can be done in a different patch.
>>
>>
>> Let me explain the role of the master bitmap here.
>>
>> In the previous discussion, the master bitmap represented at least one
>> dirty type is actually dirty. While implementing this approach, I
>> though there is one downside that upon resetting the bitmap, it needs
>> to check all dirty types whether they are already cleared to unset the
>> master bitmap, and this might hurt the performance in general.
>
> The way I see it, the master bitmap is a buffer between the guest and
> all the other client bitmaps (migration and vga). So, a bit can be set
> in vga and cleared in master. Let's look at the following scenario:
>
> 1. guest touches address 0xa0000 (page 0xa0)
> 2. qemu updates master bitmap bit 0xa0
> 3. vga timer fires
> 4. vga syncs the dirty bitmap for addresses 0xa0000-0xc0000
> 4.1 vga_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
> 4.2 migration_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
> 4.3 master_bitmap[0xa0-0xc0] = 0
> 5. vga draws based on vga_bitmap
>
> the nice thing is that step 4.2 can be omitted if migration is not active.
>
> so, client bitmaps are always updated when the client requests them,
> master bitmap is only a buffer.

Thanks.  I think I'm finally getting your point.
At 4.2 and 4.3, is syncing to client necessary?
Simply doing OR at get_dirty like in this patch not enough?

>> In this patch, master bitmap represents all types are dirty, similar
>> to existing 0xff. With this approach, resetting the master bitmap can
>> be done without checking the other types. set_dirty_flags is actually
>> taking the burden in this case though. Anyway, IIUC somebody would be
>> unhappy depending on the role of the master bitmap.
>
> Yes, the problem is that set_dirty_flags is the common case and also
> uses random access, we'd want to make it touch only a single bit. client
> access is rare, and also sequential, and therefore efficient.

I see.
So set_dirty_flags would be like,

1. set master first regard less of dirty type.
2. if dirty type was CODE, set the client.

>>> Note that we should only allocate the migration and vga bitmaps when
>>> migration or vga is active.
>>
>> Right. May I do it in a different patch?
>
> Sure, more patches is usually better.
>
>> I think this is about optimization. In this patch, I focused on not
>> breaking the existing function.
>
> Yes, that's the best approach. But we do need to see how all the
> incremental steps end up.

I totally agree.  Thanks for helping.

In summary, I'll prepare two patch sets.

1. v3 of this patch set.
- Change FLAGS value to (1,2,4,8), and add IDX (0,1,2,3)
- Use ffsl to convert FLAGS to IDX.
- Add help function which takes IDX.
- Change the behavior of set_dirty_flags as above.
- Change dirty bitmap access to a loop.
- Add brace after if ()
- Move some macros to qemu-common.h.

2. Allocate vga and migration dirty bitmap dynamically.

Please modify or add items if I were missing.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
  2010-04-13  8:01           ` [Qemu-devel] " Yoshiaki Tamura
@ 2010-04-13  9:20             ` Avi Kivity
  -1 siblings, 0 replies; 32+ messages in thread
From: Avi Kivity @ 2010-04-13  9:20 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: kvm, qemu-devel, anthony, aliguori, mtosatti, ohmura.kei

On 04/13/2010 11:01 AM, Yoshiaki Tamura wrote:
> Avi Kivity wrote:
>> On 04/12/2010 01:58 PM, Yoshiaki Tamura wrote:
>>>> Is it necessary to update migration and vga bitmaps?
>>>>
>>>> We can simply update the master bitmap, and update the migration 
>>>> and vga
>>>> bitmaps only when they need it. That can be done in a different patch.
>>>
>>>
>>> Let me explain the role of the master bitmap here.
>>>
>>> In the previous discussion, the master bitmap represented at least one
>>> dirty type is actually dirty. While implementing this approach, I
>>> though there is one downside that upon resetting the bitmap, it needs
>>> to check all dirty types whether they are already cleared to unset the
>>> master bitmap, and this might hurt the performance in general.
>>
>> The way I see it, the master bitmap is a buffer between the guest and
>> all the other client bitmaps (migration and vga). So, a bit can be set
>> in vga and cleared in master. Let's look at the following scenario:
>>
>> 1. guest touches address 0xa0000 (page 0xa0)
>> 2. qemu updates master bitmap bit 0xa0
>> 3. vga timer fires
>> 4. vga syncs the dirty bitmap for addresses 0xa0000-0xc0000
>> 4.1 vga_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
>> 4.2 migration_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
>> 4.3 master_bitmap[0xa0-0xc0] = 0
>> 5. vga draws based on vga_bitmap
>>
>> the nice thing is that step 4.2 can be omitted if migration is not 
>> active.
>>
>> so, client bitmaps are always updated when the client requests them,
>> master bitmap is only a buffer.
>
> Thanks.  I think I'm finally getting your point.
> At 4.2 and 4.3, is syncing to client necessary?

Yes.  We want to clear the master bitmap, otherwise on the next 
iteration we will get dirty bits that may be spurious.  But if we clear 
the dirty bitmap, we must copy it to all client bitmaps, not just vga, 
otherwise we lose data.

> Simply doing OR at get_dirty like in this patch not enough?

If you don't clear the master bitmap, then you will get the same bits at 
the next iteration.

>
>>> In this patch, master bitmap represents all types are dirty, similar
>>> to existing 0xff. With this approach, resetting the master bitmap can
>>> be done without checking the other types. set_dirty_flags is actually
>>> taking the burden in this case though. Anyway, IIUC somebody would be
>>> unhappy depending on the role of the master bitmap.
>>
>> Yes, the problem is that set_dirty_flags is the common case and also
>> uses random access, we'd want to make it touch only a single bit. client
>> access is rare, and also sequential, and therefore efficient.
>
> I see.
> So set_dirty_flags would be like,
>
> 1. set master first regard less of dirty type.
> 2. if dirty type was CODE, set the client.

There really should be two APIs, one for general dirtyness and one for 
code dirty.  As Anthony said, the code dirty bitmap is completely 
separate from the other uses.

Consider starting by separating the two uses, it may clear up the code 
and make things easier later on.

>
> In summary, I'll prepare two patch sets.
>
> 1. v3 of this patch set.
> - Change FLAGS value to (1,2,4,8), and add IDX (0,1,2,3)
> - Use ffsl to convert FLAGS to IDX.
> - Add help function which takes IDX.
> - Change the behavior of set_dirty_flags as above.
> - Change dirty bitmap access to a loop.
> - Add brace after if ()
> - Move some macros to qemu-common.h.
>
> 2. Allocate vga and migration dirty bitmap dynamically.
>
> Please modify or add items if I were missing.

I would add separation of CODE and the other dirty bitmaps.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Qemu-devel] Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
@ 2010-04-13  9:20             ` Avi Kivity
  0 siblings, 0 replies; 32+ messages in thread
From: Avi Kivity @ 2010-04-13  9:20 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: aliguori, kvm, ohmura.kei, mtosatti, qemu-devel

On 04/13/2010 11:01 AM, Yoshiaki Tamura wrote:
> Avi Kivity wrote:
>> On 04/12/2010 01:58 PM, Yoshiaki Tamura wrote:
>>>> Is it necessary to update migration and vga bitmaps?
>>>>
>>>> We can simply update the master bitmap, and update the migration 
>>>> and vga
>>>> bitmaps only when they need it. That can be done in a different patch.
>>>
>>>
>>> Let me explain the role of the master bitmap here.
>>>
>>> In the previous discussion, the master bitmap represented at least one
>>> dirty type is actually dirty. While implementing this approach, I
>>> though there is one downside that upon resetting the bitmap, it needs
>>> to check all dirty types whether they are already cleared to unset the
>>> master bitmap, and this might hurt the performance in general.
>>
>> The way I see it, the master bitmap is a buffer between the guest and
>> all the other client bitmaps (migration and vga). So, a bit can be set
>> in vga and cleared in master. Let's look at the following scenario:
>>
>> 1. guest touches address 0xa0000 (page 0xa0)
>> 2. qemu updates master bitmap bit 0xa0
>> 3. vga timer fires
>> 4. vga syncs the dirty bitmap for addresses 0xa0000-0xc0000
>> 4.1 vga_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
>> 4.2 migration_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
>> 4.3 master_bitmap[0xa0-0xc0] = 0
>> 5. vga draws based on vga_bitmap
>>
>> the nice thing is that step 4.2 can be omitted if migration is not 
>> active.
>>
>> so, client bitmaps are always updated when the client requests them,
>> master bitmap is only a buffer.
>
> Thanks.  I think I'm finally getting your point.
> At 4.2 and 4.3, is syncing to client necessary?

Yes.  We want to clear the master bitmap, otherwise on the next 
iteration we will get dirty bits that may be spurious.  But if we clear 
the dirty bitmap, we must copy it to all client bitmaps, not just vga, 
otherwise we lose data.

> Simply doing OR at get_dirty like in this patch not enough?

If you don't clear the master bitmap, then you will get the same bits at 
the next iteration.

>
>>> In this patch, master bitmap represents all types are dirty, similar
>>> to existing 0xff. With this approach, resetting the master bitmap can
>>> be done without checking the other types. set_dirty_flags is actually
>>> taking the burden in this case though. Anyway, IIUC somebody would be
>>> unhappy depending on the role of the master bitmap.
>>
>> Yes, the problem is that set_dirty_flags is the common case and also
>> uses random access, we'd want to make it touch only a single bit. client
>> access is rare, and also sequential, and therefore efficient.
>
> I see.
> So set_dirty_flags would be like,
>
> 1. set master first regard less of dirty type.
> 2. if dirty type was CODE, set the client.

There really should be two APIs, one for general dirtyness and one for 
code dirty.  As Anthony said, the code dirty bitmap is completely 
separate from the other uses.

Consider starting by separating the two uses, it may clear up the code 
and make things easier later on.

>
> In summary, I'll prepare two patch sets.
>
> 1. v3 of this patch set.
> - Change FLAGS value to (1,2,4,8), and add IDX (0,1,2,3)
> - Use ffsl to convert FLAGS to IDX.
> - Add help function which takes IDX.
> - Change the behavior of set_dirty_flags as above.
> - Change dirty bitmap access to a loop.
> - Add brace after if ()
> - Move some macros to qemu-common.h.
>
> 2. Allocate vga and migration dirty bitmap dynamically.
>
> Please modify or add items if I were missing.

I would add separation of CODE and the other dirty bitmaps.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
  2010-04-13  9:20             ` [Qemu-devel] " Avi Kivity
@ 2010-04-13 10:49               ` Yoshiaki Tamura
  -1 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-13 10:49 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, qemu-devel, anthony, aliguori, mtosatti, ohmura.kei

Avi Kivity wrote:
> On 04/13/2010 11:01 AM, Yoshiaki Tamura wrote:
>> Avi Kivity wrote:
>>> On 04/12/2010 01:58 PM, Yoshiaki Tamura wrote:
>>>>> Is it necessary to update migration and vga bitmaps?
>>>>>
>>>>> We can simply update the master bitmap, and update the migration
>>>>> and vga
>>>>> bitmaps only when they need it. That can be done in a different patch.
>>>>
>>>>
>>>> Let me explain the role of the master bitmap here.
>>>>
>>>> In the previous discussion, the master bitmap represented at least one
>>>> dirty type is actually dirty. While implementing this approach, I
>>>> though there is one downside that upon resetting the bitmap, it needs
>>>> to check all dirty types whether they are already cleared to unset the
>>>> master bitmap, and this might hurt the performance in general.
>>>
>>> The way I see it, the master bitmap is a buffer between the guest and
>>> all the other client bitmaps (migration and vga). So, a bit can be set
>>> in vga and cleared in master. Let's look at the following scenario:
>>>
>>> 1. guest touches address 0xa0000 (page 0xa0)
>>> 2. qemu updates master bitmap bit 0xa0
>>> 3. vga timer fires
>>> 4. vga syncs the dirty bitmap for addresses 0xa0000-0xc0000
>>> 4.1 vga_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
>>> 4.2 migration_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
>>> 4.3 master_bitmap[0xa0-0xc0] = 0
>>> 5. vga draws based on vga_bitmap
>>>
>>> the nice thing is that step 4.2 can be omitted if migration is not
>>> active.
>>>
>>> so, client bitmaps are always updated when the client requests them,
>>> master bitmap is only a buffer.
>>
>> Thanks. I think I'm finally getting your point.
>> At 4.2 and 4.3, is syncing to client necessary?
>
> Yes. We want to clear the master bitmap, otherwise on the next iteration
> we will get dirty bits that may be spurious. But if we clear the dirty
> bitmap, we must copy it to all client bitmaps, not just vga, otherwise
> we lose data.
>
>> Simply doing OR at get_dirty like in this patch not enough?
>
> If you don't clear the master bitmap, then you will get the same bits at
> the next iteration.

OK.
Master is just a cache/buffer, and upon get_dirty, we copy it to client bitmaps 
with for loop, clear the master and finally check actual client bitmap.  If 
we're going to allocate client bitmaps dynamically, we need to check whether it 
isn't null too.

>>>> In this patch, master bitmap represents all types are dirty, similar
>>>> to existing 0xff. With this approach, resetting the master bitmap can
>>>> be done without checking the other types. set_dirty_flags is actually
>>>> taking the burden in this case though. Anyway, IIUC somebody would be
>>>> unhappy depending on the role of the master bitmap.
>>>
>>> Yes, the problem is that set_dirty_flags is the common case and also
>>> uses random access, we'd want to make it touch only a single bit. client
>>> access is rare, and also sequential, and therefore efficient.
>>
>> I see.
>> So set_dirty_flags would be like,
>>
>> 1. set master first regard less of dirty type.
>> 2. if dirty type was CODE, set the client.
>
> There really should be two APIs, one for general dirtyness and one for
> code dirty. As Anthony said, the code dirty bitmap is completely
> separate from the other uses.
>
> Consider starting by separating the two uses, it may clear up the code
> and make things easier later on.

That really make sense.  A little bit hesitant because the code might look a bit 
duplicated.

>> In summary, I'll prepare two patch sets.
>>
>> 1. v3 of this patch set.
>> - Change FLAGS value to (1,2,4,8), and add IDX (0,1,2,3)
>> - Use ffsl to convert FLAGS to IDX.
>> - Add help function which takes IDX.
>> - Change the behavior of set_dirty_flags as above.
>> - Change dirty bitmap access to a loop.
>> - Add brace after if ()
>> - Move some macros to qemu-common.h.
>>
>> 2. Allocate vga and migration dirty bitmap dynamically.
>>
>> Please modify or add items if I were missing.
>
> I would add separation of CODE and the other dirty bitmaps.

I'm not sure I should starting separating first or later at this moment, but 
will consider it.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Qemu-devel] Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.
@ 2010-04-13 10:49               ` Yoshiaki Tamura
  0 siblings, 0 replies; 32+ messages in thread
From: Yoshiaki Tamura @ 2010-04-13 10:49 UTC (permalink / raw)
  To: Avi Kivity; +Cc: aliguori, kvm, ohmura.kei, mtosatti, qemu-devel

Avi Kivity wrote:
> On 04/13/2010 11:01 AM, Yoshiaki Tamura wrote:
>> Avi Kivity wrote:
>>> On 04/12/2010 01:58 PM, Yoshiaki Tamura wrote:
>>>>> Is it necessary to update migration and vga bitmaps?
>>>>>
>>>>> We can simply update the master bitmap, and update the migration
>>>>> and vga
>>>>> bitmaps only when they need it. That can be done in a different patch.
>>>>
>>>>
>>>> Let me explain the role of the master bitmap here.
>>>>
>>>> In the previous discussion, the master bitmap represented at least one
>>>> dirty type is actually dirty. While implementing this approach, I
>>>> though there is one downside that upon resetting the bitmap, it needs
>>>> to check all dirty types whether they are already cleared to unset the
>>>> master bitmap, and this might hurt the performance in general.
>>>
>>> The way I see it, the master bitmap is a buffer between the guest and
>>> all the other client bitmaps (migration and vga). So, a bit can be set
>>> in vga and cleared in master. Let's look at the following scenario:
>>>
>>> 1. guest touches address 0xa0000 (page 0xa0)
>>> 2. qemu updates master bitmap bit 0xa0
>>> 3. vga timer fires
>>> 4. vga syncs the dirty bitmap for addresses 0xa0000-0xc0000
>>> 4.1 vga_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
>>> 4.2 migration_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
>>> 4.3 master_bitmap[0xa0-0xc0] = 0
>>> 5. vga draws based on vga_bitmap
>>>
>>> the nice thing is that step 4.2 can be omitted if migration is not
>>> active.
>>>
>>> so, client bitmaps are always updated when the client requests them,
>>> master bitmap is only a buffer.
>>
>> Thanks. I think I'm finally getting your point.
>> At 4.2 and 4.3, is syncing to client necessary?
>
> Yes. We want to clear the master bitmap, otherwise on the next iteration
> we will get dirty bits that may be spurious. But if we clear the dirty
> bitmap, we must copy it to all client bitmaps, not just vga, otherwise
> we lose data.
>
>> Simply doing OR at get_dirty like in this patch not enough?
>
> If you don't clear the master bitmap, then you will get the same bits at
> the next iteration.

OK.
Master is just a cache/buffer, and upon get_dirty, we copy it to client bitmaps 
with for loop, clear the master and finally check actual client bitmap.  If 
we're going to allocate client bitmaps dynamically, we need to check whether it 
isn't null too.

>>>> In this patch, master bitmap represents all types are dirty, similar
>>>> to existing 0xff. With this approach, resetting the master bitmap can
>>>> be done without checking the other types. set_dirty_flags is actually
>>>> taking the burden in this case though. Anyway, IIUC somebody would be
>>>> unhappy depending on the role of the master bitmap.
>>>
>>> Yes, the problem is that set_dirty_flags is the common case and also
>>> uses random access, we'd want to make it touch only a single bit. client
>>> access is rare, and also sequential, and therefore efficient.
>>
>> I see.
>> So set_dirty_flags would be like,
>>
>> 1. set master first regard less of dirty type.
>> 2. if dirty type was CODE, set the client.
>
> There really should be two APIs, one for general dirtyness and one for
> code dirty. As Anthony said, the code dirty bitmap is completely
> separate from the other uses.
>
> Consider starting by separating the two uses, it may clear up the code
> and make things easier later on.

That really make sense.  A little bit hesitant because the code might look a bit 
duplicated.

>> In summary, I'll prepare two patch sets.
>>
>> 1. v3 of this patch set.
>> - Change FLAGS value to (1,2,4,8), and add IDX (0,1,2,3)
>> - Use ffsl to convert FLAGS to IDX.
>> - Add help function which takes IDX.
>> - Change the behavior of set_dirty_flags as above.
>> - Change dirty bitmap access to a loop.
>> - Add brace after if ()
>> - Move some macros to qemu-common.h.
>>
>> 2. Allocate vga and migration dirty bitmap dynamically.
>>
>> Please modify or add items if I were missing.
>
> I would add separation of CODE and the other dirty bitmaps.

I'm not sure I should starting separating first or later at this moment, but 
will consider it.

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2010-04-13 10:50 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-04-06  0:51 [PATCH v2 0/6] Introduce bit-based phys_ram_dirty, and bit-based dirty page checker Yoshiaki Tamura
2010-04-06  0:51 ` [Qemu-devel] " Yoshiaki Tamura
2010-04-06  0:51 ` [PATCH v2 1/6] Modify DIRTY_FLAG value to use as indexes of bit-based phys_ram_dirty Yoshiaki Tamura
2010-04-06  0:51   ` [Qemu-devel] " Yoshiaki Tamura
2010-04-06  0:51 ` [PATCH v2 2/6] Introduce bit-based phys_ram_dirty for VGA, CODE, MIGRATION and MASTER Yoshiaki Tamura
2010-04-06  0:51   ` [Qemu-devel] " Yoshiaki Tamura
2010-04-12  8:01   ` Avi Kivity
2010-04-12  8:01     ` [Qemu-devel] " Avi Kivity
2010-04-12  9:39     ` Yoshiaki Tamura
2010-04-12  9:39       ` [Qemu-devel] " Yoshiaki Tamura
2010-04-12 10:17       ` Avi Kivity
2010-04-12 10:17         ` [Qemu-devel] " Avi Kivity
2010-04-06  0:51 ` [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap Yoshiaki Tamura
2010-04-06  0:51   ` [Qemu-devel] " Yoshiaki Tamura
2010-04-12  8:10   ` Avi Kivity
2010-04-12  8:10     ` [Qemu-devel] " Avi Kivity
2010-04-12 10:58     ` Yoshiaki Tamura
2010-04-12 10:58       ` [Qemu-devel] " Yoshiaki Tamura
2010-04-12 11:09       ` Avi Kivity
2010-04-12 11:09         ` [Qemu-devel] " Avi Kivity
2010-04-13  8:01         ` Yoshiaki Tamura
2010-04-13  8:01           ` [Qemu-devel] " Yoshiaki Tamura
2010-04-13  9:20           ` Avi Kivity
2010-04-13  9:20             ` [Qemu-devel] " Avi Kivity
2010-04-13 10:49             ` Yoshiaki Tamura
2010-04-13 10:49               ` [Qemu-devel] " Yoshiaki Tamura
2010-04-06  0:51 ` [PATCH v2 4/6] Introduce cpu_physical_memory_get_dirty_range() Yoshiaki Tamura
2010-04-06  0:51   ` [Qemu-devel] " Yoshiaki Tamura
2010-04-06  0:51 ` [PATCH v2 5/6] Use cpu_physical_memory_set_dirty_range() to update phys_ram_dirty Yoshiaki Tamura
2010-04-06  0:51   ` [Qemu-devel] " Yoshiaki Tamura
2010-04-06  0:51 ` [PATCH v2 6/6] Use cpu_physical_memory_get_dirty_range() to check multiple dirty pages Yoshiaki Tamura
2010-04-06  0:51   ` [Qemu-devel] " Yoshiaki Tamura

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.