All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
@ 2013-11-06 13:04 Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 01/39] Move prototypes to memory.h Juan Quintela
                   ` (41 more replies)
  0 siblings, 42 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Hi

[v2]
In this version:
- fixed all the comments from last versions (thanks Eric)
- kvm migration bitmap is synchronized using bitmap operations
- qemu bitmap -> migration bitmap is synchronized using bitmap operations
If bitmaps are not properly aligned, we fall back to old code.
Code survives virt-tests, so should be in quite good shape.

ToDo list:

- vga ram by default is not aligned in a page number multiple of 64,

  it could be optimized.  Kraxel?  It syncs the kvm bitmap at least 1
  a second or so? bitmap is only 2048 pages (16MB by default).
  We need to change the ram_addr only

- vga: still more, after we finish migration, vga code continues
  synchronizing the kvm bitmap on source machine.  Notice that there
  is no graphics client connected to the VGA.  Worth investigating?

- I haven't yet meassure speed differences on big hosts.  Vinod?

- Depending of performance, more optimizations to do.

- debugging printf's still on the code, just to see if we are taking
  (or not) the optimized paths.

And that is all.  Please test & comment.

Thanks, Juan.

[v1]
This series split the dirty bitmap (8 bits per page, only three used)
into 3 individual bitmaps.  Once the conversion is done, operations
are handled by bitmap operations, not bit by bit.

- *_DIRTY_FLAG flags are gone, now we use memory.h DIRTY_MEMORY_*
   everywhere.

- We set/reset each flag individually
  (set_dirty_flags(0xff&~CODE_DIRTY_FLAG)) are gone.

- Rename several functions to clarify/make consistent things.

- I know it dont't pass checkpatch for long lines, propper submission
  should pass it. We have to have long lines, short variable names, or
  ugly line splitting :p

- DIRTY_MEMORY_NUM: how can one include exec/memory.h into cpu-all.h?
  #include it don't work, as a workaround, I have copied its value, but
  any better idea?  I can always create "exec/migration-flags.h", though.

- The meat of the code is patch 19.  Rest of patches are quite easy
(even that one is not too complex).

Only optimizations done so far are
set_dirty_range()/clear_dirty_range() that now operates with
bitmap_set/clear.

Note for Xen: cpu_physical_memory_set_dirty_range() was wrong for xen,
see comment on patch.

It passes virt-test migration tests, so it should be perfect.

I post it to ask for comments.

ToDo list:

- create a lock for the bitmaps and fold migration bitmap into this
  one.  This would avoid a copy and make things easier?

- As this code uses/abuses bitmaps, we need to change the type of the
  index from int to long.  With an int index, we can only access a
  maximum of 8TB guest (yes, this is not urgent, we have a couple of
  years to do it).

- merging KVM <-> QEMU bitmap as a bitmap and not bit-by-bit.

- spliting the KVM bitmap synchronization into chunks, i.e. not
  synchronize all memory, just enough to continue with migration.

Any further ideas/needs?

Thanks, Juan.

PD.  Why it took so long?

     Because I was trying to integrate the bitmap on the MemoryRegion
     abstraction.  Would have make the code cleaner, but hit dead-end
     after dead-end.  As practical terms, TCG don't know about
     MemoryRegions, it has been ported to run on top of them, but
     don't use them effective


The following changes since commit c2d30667760e3d7b81290d801e567d4f758825ca:

  rtc: remove dead SQW IRQ code (2013-11-05 20:04:03 -0800)

are available in the git repository at:

  git://github.com/juanquintela/qemu.git bitmap-v2.next

for you to fetch changes up to d91eff97e6f36612eb22d57c2b6c2623f73d3997:

  migration: synchronize memory bitmap 64bits at a time (2013-11-06 13:54:56 +0100)

----------------------------------------------------------------
Juan Quintela (39):
      Move prototypes to memory.h
      memory: cpu_physical_memory_set_dirty_flags() result is never used
      memory: cpu_physical_memory_set_dirty_range() return void
      exec: use accessor function to know if memory is dirty
      memory: create function to set a single dirty bit
      exec: create function to get a single dirty bit
      memory: make cpu_physical_memory_is_dirty return bool
      exec: simplify notdirty_mem_write()
      memory: all users of cpu_physical_memory_get_dirty used only one flag
      memory: set single dirty flags when possible
      memory: cpu_physical_memory_set_dirty_range() allways dirty all flags
      memory: cpu_physical_memory_mask_dirty_range() always clear a single flag
      memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG
      memory: use bit 2 for migration
      memory: make sure that client is always inside range
      memory: only resize dirty bitmap when memory size increases
      memory: cpu_physical_memory_clear_dirty_flag() result is never used
      bitmap: Add bitmap_zero_extend operation
      memory: split dirty bitmap into three
      memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user
      memory: unfold cpu_physical_memory_set_dirty() in its only user
      memory: unfold cpu_physical_memory_set_dirty_flag()
      memory: make cpu_physical_memory_get_dirty() the main function
      memory: cpu_physical_memory_get_dirty() is used as returning a bool
      memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range
      memory: use find_next_bit() to find dirty bits
      memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations
      memory: cpu_physical_memory_clear_dirty_range() now uses bitmap operations
      memory: s/dirty/clean/ in cpu_physical_memory_is_dirty()
      memory: make cpu_physical_memory_reset_dirty() take a length parameter
      memory: cpu_physical_memory_set_dirty_tracking() should return void
      memory: split cpu_physical_memory_* functions to its own include
      memory: unfold memory_region_test_and_clear()
      kvm: use directly cpu_physical_memory_* api for tracking dirty pages
      kvm: refactor start address calculation
      memory: move bitmap synchronization to its own function
      memory: syncronize kvm bitmap using bitmaps operations
      ram: split function that synchronizes a range
      migration: synchronize memory bitmap 64bits at a time

 arch_init.c                    |  57 ++++++++++++----
 cputlb.c                       |  10 +--
 exec.c                         |  75 ++++++++++-----------
 include/exec/cpu-all.h         |   4 +-
 include/exec/cpu-common.h      |   4 --
 include/exec/memory-internal.h |  84 ------------------------
 include/exec/memory-physical.h | 143 +++++++++++++++++++++++++++++++++++++++++
 include/exec/memory.h          |  10 +--
 include/qemu/bitmap.h          |   9 +++
 kvm-all.c                      |  28 ++------
 memory.c                       |  17 ++---
 11 files changed, 260 insertions(+), 181 deletions(-)
 create mode 100644 include/exec/memory-physical.h

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 01/39] Move prototypes to memory.h
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-07  9:36   ` Orit Wasserman
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 02/39] memory: cpu_physical_memory_set_dirty_flags() result is never used Juan Quintela
                   ` (40 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

As the comment says, it should only be used on "core" memory files.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/cpu-common.h | 4 ----
 include/exec/memory.h     | 4 +++-
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index e4996e1..8110ef0 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -51,10 +51,6 @@ typedef void CPUWriteMemoryFunc(void *opaque, hwaddr addr, uint32_t value);
 typedef uint32_t CPUReadMemoryFunc(void *opaque, hwaddr addr);

 void qemu_ram_remap(ram_addr_t addr, ram_addr_t length);
-/* This should not be used by devices.  */
-MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
-void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
-
 void cpu_physical_memory_rw(hwaddr addr, uint8_t *buf,
                             int len, int is_write);
 static inline void cpu_physical_memory_read(hwaddr addr,
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 480dfbf..c7e151f 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1055,7 +1055,9 @@ void *address_space_map(AddressSpace *as, hwaddr addr,
 void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
                          int is_write, hwaddr access_len);

-
+/* This should not be used by devices.  */
+MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
+void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
 #endif

 #endif
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 02/39] memory: cpu_physical_memory_set_dirty_flags() result is never used
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 01/39] Move prototypes to memory.h Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-07  9:42   ` Orit Wasserman
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 03/39] memory: cpu_physical_memory_set_dirty_range() return void Juan Quintela
                   ` (39 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

So return void.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-internal.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index d0e0633..c71a5e6 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -70,10 +70,10 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
     return ret;
 }

-static inline int cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
+static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
                                                       int dirty_flags)
 {
-    return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
+    ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
 }

 static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 03/39] memory: cpu_physical_memory_set_dirty_range() return void
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 01/39] Move prototypes to memory.h Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 02/39] memory: cpu_physical_memory_set_dirty_flags() result is never used Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-07  9:51   ` Orit Wasserman
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 04/39] exec: use accessor function to know if memory is dirty Juan Quintela
                   ` (38 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/memory.c b/memory.c
index 28f6449..9722f23 100644
--- a/memory.c
+++ b/memory.c
@@ -1182,7 +1182,7 @@ void memory_region_set_dirty(MemoryRegion *mr, hwaddr addr,
                              hwaddr size)
 {
     assert(mr->terminates);
-    return cpu_physical_memory_set_dirty_range(mr->ram_addr + addr, size, -1);
+    cpu_physical_memory_set_dirty_range(mr->ram_addr + addr, size, -1);
 }

 bool memory_region_test_and_clear_dirty(MemoryRegion *mr, hwaddr addr,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 04/39] exec: use accessor function to know if memory is dirty
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (2 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 03/39] memory: cpu_physical_memory_set_dirty_range() return void Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-07 10:00   ` Orit Wasserman
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 05/39] memory: create function to set a single dirty bit Juan Quintela
                   ` (37 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 exec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index 79610ce..7cf5634 100644
--- a/exec.c
+++ b/exec.c
@@ -1397,7 +1397,7 @@ static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
     cpu_physical_memory_set_dirty_flags(ram_addr, dirty_flags);
     /* we remove the notdirty callback only if the code has been
        flushed */
-    if (dirty_flags == 0xff) {
+    if (cpu_physical_memory_is_dirty(ram_addr)) {
         CPUArchState *env = current_cpu->env_ptr;
         tlb_set_dirty(env, env->mem_io_vaddr);
     }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 05/39] memory: create function to set a single dirty bit
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (3 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 04/39] exec: use accessor function to know if memory is dirty Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-07 11:37   ` Orit Wasserman
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 06/39] exec: create function to get " Juan Quintela
                   ` (36 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 cputlb.c                       | 2 +-
 include/exec/memory-internal.h | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/cputlb.c b/cputlb.c
index fff0afb..72452e5 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -137,7 +137,7 @@ void tlb_protect_code(ram_addr_t ram_addr)
 void tlb_unprotect_code_phys(CPUArchState *env, ram_addr_t ram_addr,
                              target_ulong vaddr)
 {
-    cpu_physical_memory_set_dirty_flags(ram_addr, CODE_DIRTY_FLAG);
+    cpu_physical_memory_set_dirty_flag(ram_addr, CODE_DIRTY_FLAG);
 }

 static bool tlb_is_dirty_ram(CPUTLBEntry *tlbe)
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index c71a5e6..4ebab80 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -76,6 +76,12 @@ static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
     ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
 }

+static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
+                                                      int dirty_flag)
+{
+    ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flag;
+}
+
 static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
 {
     cpu_physical_memory_set_dirty_flags(addr, 0xff);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 06/39] exec: create function to get a single dirty bit
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (4 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 05/39] memory: create function to set a single dirty bit Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 23:11   ` Eric Blake
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 07/39] memory: make cpu_physical_memory_is_dirty return bool Juan Quintela
                   ` (35 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 exec.c                         | 3 ++-
 include/exec/memory-internal.h | 6 ++++++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index 7cf5634..36b0a66 100644
--- a/exec.c
+++ b/exec.c
@@ -1374,9 +1374,10 @@ found:
 static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
                                uint64_t val, unsigned size)
 {
+
     int dirty_flags;
     dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
-    if (!(dirty_flags & CODE_DIRTY_FLAG)) {
+    if (!cpu_physical_memory_get_dirty_flag(ram_addr, CODE_DIRTY_FLAG)) {
         tb_invalidate_phys_page_fast(ram_addr, size);
         dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
     }
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 4ebab80..9cd2f53 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -49,6 +49,12 @@ static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
     return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS];
 }

+static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
+                                                     int dirty_flag)
+{
+    return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] & dirty_flag;
+}
+
 /* read dirty bit (return 0 or 1) */
 static inline int cpu_physical_memory_is_dirty(ram_addr_t addr)
 {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 07/39] memory: make cpu_physical_memory_is_dirty return bool
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (5 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 06/39] exec: create function to get " Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 08/39] exec: simplify notdirty_mem_write() Juan Quintela
                   ` (34 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-internal.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 9cd2f53..eefe501 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -56,9 +56,12 @@ static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
 }

 /* read dirty bit (return 0 or 1) */
-static inline int cpu_physical_memory_is_dirty(ram_addr_t addr)
+static inline bool cpu_physical_memory_is_dirty(ram_addr_t addr)
 {
-    return cpu_physical_memory_get_dirty_flags(addr) == 0xff;
+    bool vga = cpu_physical_memory_get_dirty_flag(addr, VGA_DIRTY_FLAG);
+    bool code = cpu_physical_memory_get_dirty_flag(addr, CODE_DIRTY_FLAG);
+    bool migration = cpu_physical_memory_get_dirty_flag(addr, MIGRATION_DIRTY_FLAG);
+    return vga && code && migration;
 }

 static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 08/39] exec: simplify notdirty_mem_write()
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (6 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 07/39] memory: make cpu_physical_memory_is_dirty return bool Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 23:12   ` Eric Blake
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 09/39] memory: all users of cpu_physical_memory_get_dirty used only one flag Juan Quintela
                   ` (33 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

We don't need to make special things for CODE, just set the other two bits

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 exec.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index 36b0a66..27e4540 100644
--- a/exec.c
+++ b/exec.c
@@ -1374,12 +1374,8 @@ found:
 static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
                                uint64_t val, unsigned size)
 {
-
-    int dirty_flags;
-    dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
     if (!cpu_physical_memory_get_dirty_flag(ram_addr, CODE_DIRTY_FLAG)) {
         tb_invalidate_phys_page_fast(ram_addr, size);
-        dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
     }
     switch (size) {
     case 1:
@@ -1394,8 +1390,8 @@ static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
     default:
         abort();
     }
-    dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
-    cpu_physical_memory_set_dirty_flags(ram_addr, dirty_flags);
+    cpu_physical_memory_set_dirty_flag(ram_addr, MIGRATION_DIRTY_FLAG);
+    cpu_physical_memory_set_dirty_flag(ram_addr, VGA_DIRTY_FLAG);
     /* we remove the notdirty callback only if the code has been
        flushed */
     if (cpu_physical_memory_is_dirty(ram_addr)) {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 09/39] memory: all users of cpu_physical_memory_get_dirty used only one flag
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (7 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 08/39] exec: simplify notdirty_mem_write() Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 10/39] memory: set single dirty flags when possible Juan Quintela
                   ` (32 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

So cpu_physical_memory_get_dirty_flags is not needed anymore

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-internal.h | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index eefe501..b72c14a 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -44,11 +44,6 @@ void qemu_ram_free_from_ptr(ram_addr_t addr);
 #define CODE_DIRTY_FLAG      0x02
 #define MIGRATION_DIRTY_FLAG 0x08

-static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
-{
-    return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS];
-}
-
 static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
                                                      int dirty_flag)
 {
@@ -66,7 +61,7 @@ static inline bool cpu_physical_memory_is_dirty(ram_addr_t addr)

 static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
                                                 ram_addr_t length,
-                                                int dirty_flags)
+                                                int dirty_flag)
 {
     int ret = 0;
     ram_addr_t addr, end;
@@ -74,7 +69,7 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
     end = TARGET_PAGE_ALIGN(start + length);
     start &= TARGET_PAGE_MASK;
     for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-        ret |= cpu_physical_memory_get_dirty_flags(addr) & dirty_flags;
+        ret |= cpu_physical_memory_get_dirty_flag(addr, dirty_flag);
     }
     return ret;
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 10/39] memory: set single dirty flags when possible
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (8 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 09/39] memory: all users of cpu_physical_memory_get_dirty used only one flag Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 11/39] memory: cpu_physical_memory_set_dirty_range() allways dirty all flags Juan Quintela
                   ` (31 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 exec.c                         | 7 ++++---
 include/exec/memory-internal.h | 4 +++-
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/exec.c b/exec.c
index 27e4540..0cc7877 100644
--- a/exec.c
+++ b/exec.c
@@ -1819,7 +1819,8 @@ static void invalidate_and_set_dirty(hwaddr addr,
         /* invalidate code */
         tb_invalidate_phys_page_range(addr, addr + length, 0);
         /* set dirty bit */
-        cpu_physical_memory_set_dirty_flags(addr, (0xff & ~CODE_DIRTY_FLAG));
+        cpu_physical_memory_set_dirty_flag(addr, VGA_DIRTY_FLAG);
+        cpu_physical_memory_set_dirty_flag(addr, MIGRATION_DIRTY_FLAG);
     }
     xen_modified_memory(addr, length);
 }
@@ -2401,8 +2402,8 @@ void stl_phys_notdirty(hwaddr addr, uint32_t val)
                 /* invalidate code */
                 tb_invalidate_phys_page_range(addr1, addr1 + 4, 0);
                 /* set dirty bit */
-                cpu_physical_memory_set_dirty_flags(
-                    addr1, (0xff & ~CODE_DIRTY_FLAG));
+                cpu_physical_memory_set_dirty_flag(addr1, MIGRATION_DIRTY_FLAG);
+                cpu_physical_memory_set_dirty_flag(addr1, VGA_DIRTY_FLAG);
             }
         }
     }
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index b72c14a..bfbfc48 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -88,7 +88,9 @@ static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,

 static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
 {
-    cpu_physical_memory_set_dirty_flags(addr, 0xff);
+    cpu_physical_memory_set_dirty_flag(addr, MIGRATION_DIRTY_FLAG);
+    cpu_physical_memory_set_dirty_flag(addr, VGA_DIRTY_FLAG);
+    cpu_physical_memory_set_dirty_flag(addr, CODE_DIRTY_FLAG);
 }

 static inline int cpu_physical_memory_clear_dirty_flags(ram_addr_t addr,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 11/39] memory: cpu_physical_memory_set_dirty_range() allways dirty all flags
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (9 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 10/39] memory: set single dirty flags when possible Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 23:15   ` Eric Blake
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 12/39] memory: cpu_physical_memory_mask_dirty_range() always clear a single flag Juan Quintela
                   ` (30 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

So remove the flag argument and do it directly.  After this change, there is nothing else using cpu_physical_memory_set_dirty_flags() so remove it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 exec.c                         |  2 +-
 include/exec/memory-internal.h | 11 ++---------
 memory.c                       |  2 +-
 3 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/exec.c b/exec.c
index 0cc7877..3fd7616 100644
--- a/exec.c
+++ b/exec.c
@@ -1164,7 +1164,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
                                        last_ram_offset() >> TARGET_PAGE_BITS);
     memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
            0, size >> TARGET_PAGE_BITS);
-    cpu_physical_memory_set_dirty_range(new_block->offset, size, 0xff);
+    cpu_physical_memory_set_dirty_range(new_block->offset, size);

     qemu_ram_setup_dump(new_block->host, size);
     qemu_madvise(new_block->host, size, QEMU_MADV_HUGEPAGE);
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index bfbfc48..e3b5834 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -74,12 +74,6 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
     return ret;
 }

-static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
-                                                      int dirty_flags)
-{
-    ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
-}
-
 static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
                                                       int dirty_flag)
 {
@@ -102,15 +96,14 @@ static inline int cpu_physical_memory_clear_dirty_flags(ram_addr_t addr,
 }

 static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
-                                                       ram_addr_t length,
-                                                       int dirty_flags)
+                                                       ram_addr_t length)
 {
     ram_addr_t addr, end;

     end = TARGET_PAGE_ALIGN(start + length);
     start &= TARGET_PAGE_MASK;
     for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-        cpu_physical_memory_set_dirty_flags(addr, dirty_flags);
+        cpu_physical_memory_set_dirty(addr);
     }
     xen_modified_memory(addr, length);
 }
diff --git a/memory.c b/memory.c
index 9722f23..08cd07e 100644
--- a/memory.c
+++ b/memory.c
@@ -1182,7 +1182,7 @@ void memory_region_set_dirty(MemoryRegion *mr, hwaddr addr,
                              hwaddr size)
 {
     assert(mr->terminates);
-    cpu_physical_memory_set_dirty_range(mr->ram_addr + addr, size, -1);
+    cpu_physical_memory_set_dirty_range(mr->ram_addr + addr, size);
 }

 bool memory_region_test_and_clear_dirty(MemoryRegion *mr, hwaddr addr,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 12/39] memory: cpu_physical_memory_mask_dirty_range() always clear a single flag
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (10 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 11/39] memory: cpu_physical_memory_set_dirty_range() allways dirty all flags Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 23:18   ` Eric Blake
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 13/39] memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG Juan Quintela
                   ` (29 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Document it

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 exec.c                         |  4 ++--
 include/exec/memory-internal.h | 12 ++++++------
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/exec.c b/exec.c
index 3fd7616..94076ae 100644
--- a/exec.c
+++ b/exec.c
@@ -662,7 +662,7 @@ static void tlb_reset_dirty_range_all(ram_addr_t start, ram_addr_t end,

 /* Note: start and end must be within the same ram block.  */
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
-                                     int dirty_flags)
+                                     int dirty_flag)
 {
     uintptr_t length;

@@ -672,7 +672,7 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
     length = end - start;
     if (length == 0)
         return;
-    cpu_physical_memory_mask_dirty_range(start, length, dirty_flags);
+    cpu_physical_memory_mask_dirty_range(start, length, dirty_flag);

     if (tcg_enabled()) {
         tlb_reset_dirty_range_all(start, end, length);
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index e3b5834..cfe1a24 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -87,10 +87,10 @@ static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
     cpu_physical_memory_set_dirty_flag(addr, CODE_DIRTY_FLAG);
 }

-static inline int cpu_physical_memory_clear_dirty_flags(ram_addr_t addr,
-                                                        int dirty_flags)
+static inline int cpu_physical_memory_clear_dirty_flag(ram_addr_t addr,
+                                                       int dirty_flag)
 {
-    int mask = ~dirty_flags;
+    int mask = ~dirty_flag;

     return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] &= mask;
 }
@@ -110,19 +110,19 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,

 static inline void cpu_physical_memory_mask_dirty_range(ram_addr_t start,
                                                         ram_addr_t length,
-                                                        int dirty_flags)
+                                                        int dirty_flag)
 {
     ram_addr_t addr, end;

     end = TARGET_PAGE_ALIGN(start + length);
     start &= TARGET_PAGE_MASK;
     for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-        cpu_physical_memory_clear_dirty_flags(addr, dirty_flags);
+        cpu_physical_memory_clear_dirty_flag(addr, dirty_flag);
     }
 }

 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
-                                     int dirty_flags);
+                                     int dirty_flag);

 #endif

-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 13/39] memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (11 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 12/39] memory: cpu_physical_memory_mask_dirty_range() always clear a single flag Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 23:24   ` Eric Blake
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 14/39] memory: use bit 2 for migration Juan Quintela
                   ` (28 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Instead of the bitmap, we use the bitmap number.  Once this is done,
we change all names from dirty_flag to memory regions naming of
client.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 cputlb.c                       |  4 ++--
 exec.c                         | 18 +++++++++---------
 include/exec/memory-internal.h | 38 +++++++++++++++++---------------------
 include/exec/memory.h          |  3 ---
 memory.c                       | 10 ++++------
 5 files changed, 32 insertions(+), 41 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 72452e5..dfd747a 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -129,7 +129,7 @@ void tlb_protect_code(ram_addr_t ram_addr)
 {
     cpu_physical_memory_reset_dirty(ram_addr,
                                     ram_addr + TARGET_PAGE_SIZE,
-                                    CODE_DIRTY_FLAG);
+                                    DIRTY_MEMORY_CODE);
 }

 /* update the TLB so that writes in physical page 'phys_addr' are no longer
@@ -137,7 +137,7 @@ void tlb_protect_code(ram_addr_t ram_addr)
 void tlb_unprotect_code_phys(CPUArchState *env, ram_addr_t ram_addr,
                              target_ulong vaddr)
 {
-    cpu_physical_memory_set_dirty_flag(ram_addr, CODE_DIRTY_FLAG);
+    cpu_physical_memory_set_dirty_flag(ram_addr, DIRTY_MEMORY_CODE);
 }

 static bool tlb_is_dirty_ram(CPUTLBEntry *tlbe)
diff --git a/exec.c b/exec.c
index 94076ae..64a621f 100644
--- a/exec.c
+++ b/exec.c
@@ -662,7 +662,7 @@ static void tlb_reset_dirty_range_all(ram_addr_t start, ram_addr_t end,

 /* Note: start and end must be within the same ram block.  */
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
-                                     int dirty_flag)
+                                     unsigned client)
 {
     uintptr_t length;

@@ -672,7 +672,7 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
     length = end - start;
     if (length == 0)
         return;
-    cpu_physical_memory_mask_dirty_range(start, length, dirty_flag);
+    cpu_physical_memory_mask_dirty_range(start, length, client);

     if (tcg_enabled()) {
         tlb_reset_dirty_range_all(start, end, length);
@@ -1374,7 +1374,7 @@ found:
 static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
                                uint64_t val, unsigned size)
 {
-    if (!cpu_physical_memory_get_dirty_flag(ram_addr, CODE_DIRTY_FLAG)) {
+    if (!cpu_physical_memory_get_dirty_flag(ram_addr, DIRTY_MEMORY_CODE)) {
         tb_invalidate_phys_page_fast(ram_addr, size);
     }
     switch (size) {
@@ -1390,8 +1390,8 @@ static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
     default:
         abort();
     }
-    cpu_physical_memory_set_dirty_flag(ram_addr, MIGRATION_DIRTY_FLAG);
-    cpu_physical_memory_set_dirty_flag(ram_addr, VGA_DIRTY_FLAG);
+    cpu_physical_memory_set_dirty_flag(ram_addr, DIRTY_MEMORY_MIGRATION);
+    cpu_physical_memory_set_dirty_flag(ram_addr, DIRTY_MEMORY_VGA);
     /* we remove the notdirty callback only if the code has been
        flushed */
     if (cpu_physical_memory_is_dirty(ram_addr)) {
@@ -1819,8 +1819,8 @@ static void invalidate_and_set_dirty(hwaddr addr,
         /* invalidate code */
         tb_invalidate_phys_page_range(addr, addr + length, 0);
         /* set dirty bit */
-        cpu_physical_memory_set_dirty_flag(addr, VGA_DIRTY_FLAG);
-        cpu_physical_memory_set_dirty_flag(addr, MIGRATION_DIRTY_FLAG);
+        cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_VGA);
+        cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
     }
     xen_modified_memory(addr, length);
 }
@@ -2402,8 +2402,8 @@ void stl_phys_notdirty(hwaddr addr, uint32_t val)
                 /* invalidate code */
                 tb_invalidate_phys_page_range(addr1, addr1 + 4, 0);
                 /* set dirty bit */
-                cpu_physical_memory_set_dirty_flag(addr1, MIGRATION_DIRTY_FLAG);
-                cpu_physical_memory_set_dirty_flag(addr1, VGA_DIRTY_FLAG);
+                cpu_physical_memory_set_dirty_flag(addr1, DIRTY_MEMORY_MIGRATION);
+                cpu_physical_memory_set_dirty_flag(addr1, DIRTY_MEMORY_VGA);
             }
         }
     }
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index cfe1a24..3947caa 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -40,28 +40,24 @@ void *qemu_get_ram_ptr(ram_addr_t addr);
 void qemu_ram_free(ram_addr_t addr);
 void qemu_ram_free_from_ptr(ram_addr_t addr);

-#define VGA_DIRTY_FLAG       0x01
-#define CODE_DIRTY_FLAG      0x02
-#define MIGRATION_DIRTY_FLAG 0x08
-
 static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
-                                                     int dirty_flag)
+                                                      unsigned client)
 {
-    return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] & dirty_flag;
+    return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] & (1 << client);
 }

 /* read dirty bit (return 0 or 1) */
 static inline bool cpu_physical_memory_is_dirty(ram_addr_t addr)
 {
-    bool vga = cpu_physical_memory_get_dirty_flag(addr, VGA_DIRTY_FLAG);
-    bool code = cpu_physical_memory_get_dirty_flag(addr, CODE_DIRTY_FLAG);
-    bool migration = cpu_physical_memory_get_dirty_flag(addr, MIGRATION_DIRTY_FLAG);
+    bool vga = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_VGA);
+    bool code = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_CODE);
+    bool migration = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
     return vga && code && migration;
 }

 static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
                                                 ram_addr_t length,
-                                                int dirty_flag)
+                                                unsigned client)
 {
     int ret = 0;
     ram_addr_t addr, end;
@@ -69,28 +65,28 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
     end = TARGET_PAGE_ALIGN(start + length);
     start &= TARGET_PAGE_MASK;
     for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-        ret |= cpu_physical_memory_get_dirty_flag(addr, dirty_flag);
+        ret |= cpu_physical_memory_get_dirty_flag(addr, client);
     }
     return ret;
 }

 static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
-                                                      int dirty_flag)
+                                                      unsigned client)
 {
-    ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flag;
+    ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= (1 << client);
 }

 static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
 {
-    cpu_physical_memory_set_dirty_flag(addr, MIGRATION_DIRTY_FLAG);
-    cpu_physical_memory_set_dirty_flag(addr, VGA_DIRTY_FLAG);
-    cpu_physical_memory_set_dirty_flag(addr, CODE_DIRTY_FLAG);
+    cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
+    cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_VGA);
+    cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_CODE);
 }

 static inline int cpu_physical_memory_clear_dirty_flag(ram_addr_t addr,
-                                                       int dirty_flag)
+                                                       unsigned client)
 {
-    int mask = ~dirty_flag;
+    int mask = ~(1 << client);

     return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] &= mask;
 }
@@ -110,19 +106,19 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,

 static inline void cpu_physical_memory_mask_dirty_range(ram_addr_t start,
                                                         ram_addr_t length,
-                                                        int dirty_flag)
+                                                        unsigned client)
 {
     ram_addr_t addr, end;

     end = TARGET_PAGE_ALIGN(start + length);
     start &= TARGET_PAGE_MASK;
     for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-        cpu_physical_memory_clear_dirty_flag(addr, dirty_flag);
+        cpu_physical_memory_clear_dirty_flag(addr, client);
     }
 }

 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
-                                     int dirty_flag);
+                                     unsigned client);

 #endif

diff --git a/include/exec/memory.h b/include/exec/memory.h
index c7e151f..04b5071 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -33,9 +33,6 @@
 typedef struct MemoryRegionOps MemoryRegionOps;
 typedef struct MemoryRegionMmio MemoryRegionMmio;

-/* Must match *_DIRTY_FLAGS in cpu-all.h.  To be replaced with dynamic
- * registration.
- */
 #define DIRTY_MEMORY_VGA       0
 #define DIRTY_MEMORY_CODE      1
 #define DIRTY_MEMORY_MIGRATION 3
diff --git a/memory.c b/memory.c
index 08cd07e..f93c1e8 100644
--- a/memory.c
+++ b/memory.c
@@ -1174,8 +1174,7 @@ bool memory_region_get_dirty(MemoryRegion *mr, hwaddr addr,
                              hwaddr size, unsigned client)
 {
     assert(mr->terminates);
-    return cpu_physical_memory_get_dirty(mr->ram_addr + addr, size,
-                                         1 << client);
+    return cpu_physical_memory_get_dirty(mr->ram_addr + addr, size, client);
 }

 void memory_region_set_dirty(MemoryRegion *mr, hwaddr addr,
@@ -1190,12 +1189,11 @@ bool memory_region_test_and_clear_dirty(MemoryRegion *mr, hwaddr addr,
 {
     bool ret;
     assert(mr->terminates);
-    ret = cpu_physical_memory_get_dirty(mr->ram_addr + addr, size,
-                                        1 << client);
+    ret = cpu_physical_memory_get_dirty(mr->ram_addr + addr, size, client);
     if (ret) {
         cpu_physical_memory_reset_dirty(mr->ram_addr + addr,
                                         mr->ram_addr + addr + size,
-                                        1 << client);
+                                        client);
     }
     return ret;
 }
@@ -1243,7 +1241,7 @@ void memory_region_reset_dirty(MemoryRegion *mr, hwaddr addr,
     assert(mr->terminates);
     cpu_physical_memory_reset_dirty(mr->ram_addr + addr,
                                     mr->ram_addr + addr + size,
-                                    1 << client);
+                                    client);
 }

 void *memory_region_get_ram_ptr(MemoryRegion *mr)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 14/39] memory: use bit 2 for migration
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (12 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 13/39] memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 15/39] memory: make sure that client is always inside range Juan Quintela
                   ` (27 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

For historical reasons it was bit 3.  Once there, create a constant to
know the number of clients.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 04b5071..4bf7cd0 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -35,7 +35,8 @@ typedef struct MemoryRegionMmio MemoryRegionMmio;

 #define DIRTY_MEMORY_VGA       0
 #define DIRTY_MEMORY_CODE      1
-#define DIRTY_MEMORY_MIGRATION 3
+#define DIRTY_MEMORY_MIGRATION 2
+#define DIRTY_MEMORY_NUM       3	/* num of dirty bits */

 struct MemoryRegionMmio {
     CPUReadMemoryFunc *read[3];
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 15/39] memory: make sure that client is always inside range
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (13 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 14/39] memory: use bit 2 for migration Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 16/39] memory: only resize dirty bitmap when memory size increases Juan Quintela
                   ` (26 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-internal.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 3947caa..e08ac42 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -43,6 +43,7 @@ void qemu_ram_free_from_ptr(ram_addr_t addr);
 static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
                                                       unsigned client)
 {
+    assert(client < DIRTY_MEMORY_NUM);
     return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] & (1 << client);
 }

@@ -73,6 +74,7 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
 static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
                                                       unsigned client)
 {
+    assert(client < DIRTY_MEMORY_NUM);
     ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= (1 << client);
 }

@@ -88,6 +90,8 @@ static inline int cpu_physical_memory_clear_dirty_flag(ram_addr_t addr,
 {
     int mask = ~(1 << client);

+    assert(client < DIRTY_MEMORY_NUM);
+
     return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] &= mask;
 }

-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 16/39] memory: only resize dirty bitmap when memory size increases
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (14 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 15/39] memory: make sure that client is always inside range Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 17/39] memory: cpu_physical_memory_clear_dirty_flag() result is never used Juan Quintela
                   ` (25 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 exec.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/exec.c b/exec.c
index 64a621f..38b5c1c 100644
--- a/exec.c
+++ b/exec.c
@@ -1100,6 +1100,9 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
                                    MemoryRegion *mr)
 {
     RAMBlock *block, *new_block;
+    ram_addr_t old_ram_size, new_ram_size;
+
+    old_ram_size = last_ram_offset() >> TARGET_PAGE_BITS;

     size = TARGET_PAGE_ALIGN(size);
     new_block = g_malloc0(sizeof(*new_block));
@@ -1160,10 +1163,13 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
     ram_list.version++;
     qemu_mutex_unlock_ramlist();

-    ram_list.phys_dirty = g_realloc(ram_list.phys_dirty,
-                                       last_ram_offset() >> TARGET_PAGE_BITS);
-    memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
+    new_ram_size = last_ram_offset() >> TARGET_PAGE_BITS;
+
+    if (new_ram_size > old_ram_size) {
+        ram_list.phys_dirty = g_realloc(ram_list.phys_dirty, new_ram_size);
+        memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
            0, size >> TARGET_PAGE_BITS);
+    }
     cpu_physical_memory_set_dirty_range(new_block->offset, size);

     qemu_ram_setup_dump(new_block->host, size);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 17/39] memory: cpu_physical_memory_clear_dirty_flag() result is never used
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (15 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 16/39] memory: only resize dirty bitmap when memory size increases Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 18/39] bitmap: Add bitmap_zero_extend operation Juan Quintela
                   ` (24 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-internal.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index e08ac42..3f885a6 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -85,14 +85,14 @@ static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
     cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_CODE);
 }

-static inline int cpu_physical_memory_clear_dirty_flag(ram_addr_t addr,
+static inline void cpu_physical_memory_clear_dirty_flag(ram_addr_t addr,
                                                        unsigned client)
 {
     int mask = ~(1 << client);

     assert(client < DIRTY_MEMORY_NUM);

-    return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] &= mask;
+    ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] &= mask;
 }

 static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 18/39] bitmap: Add bitmap_zero_extend operation
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (16 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 17/39] memory: cpu_physical_memory_clear_dirty_flag() result is never used Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 19/39] memory: split dirty bitmap into three Juan Quintela
                   ` (23 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/qemu/bitmap.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index 308bbb7..53f5f1f 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -219,4 +219,13 @@ unsigned long bitmap_find_next_zero_area(unsigned long *map,
 					 unsigned int nr,
 					 unsigned long align_mask);

+static inline unsigned long *bitmap_zero_extend(unsigned long *old,
+                                                int old_nbits, int new_nbits)
+{
+    int new_len = BITS_TO_LONGS(new_nbits) * sizeof(unsigned long);
+    unsigned long *new = g_realloc(old, new_len);
+    bitmap_clear(new, old_nbits, new_nbits - old_nbits);
+    return new;
+}
+
 #endif /* BITMAP_H */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 19/39] memory: split dirty bitmap into three
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (17 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 18/39] bitmap: Add bitmap_zero_extend operation Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 23:39   ` Eric Blake
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 20/39] memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user Juan Quintela
                   ` (22 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

After all the previous patches, spliting the bitmap gets direct.

ToDo: Why can't i include "exec/memory.h" into cpu-all.h?  This is the
      reason that I have duplicated DIRTY_MEMORY_NUM.

ToDo2: current bitmaps have one int as index, this limit us to 8TB RAM
       guest, Should we move to longs?

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 exec.c                         |  9 ++++++---
 include/exec/cpu-all.h         |  4 +++-
 include/exec/memory-internal.h | 11 ++++-------
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/exec.c b/exec.c
index 38b5c1c..98700d3 100644
--- a/exec.c
+++ b/exec.c
@@ -1166,9 +1166,12 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
     new_ram_size = last_ram_offset() >> TARGET_PAGE_BITS;

     if (new_ram_size > old_ram_size) {
-        ram_list.phys_dirty = g_realloc(ram_list.phys_dirty, new_ram_size);
-        memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
-           0, size >> TARGET_PAGE_BITS);
+        int i;
+        for (i = 0; i < DIRTY_MEMORY_NUM; i++) {
+            ram_list.dirty_memory[i] =
+                bitmap_zero_extend(ram_list.dirty_memory[i],
+                                   old_ram_size, new_ram_size);
+       }
     }
     cpu_physical_memory_set_dirty_range(new_block->offset, size);

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index b6998f0..019dc20 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -456,10 +456,12 @@ typedef struct RAMBlock {
     int fd;
 } RAMBlock;

+#define DIRTY_MEMORY_NUM       3
+
 typedef struct RAMList {
     QemuMutex mutex;
     /* Protected by the iothread lock.  */
-    uint8_t *phys_dirty;
+    unsigned long *dirty_memory[DIRTY_MEMORY_NUM];
     RAMBlock *mru_block;
     /* Protected by the ramlist lock.  */
     QTAILQ_HEAD(, RAMBlock) blocks;
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 3f885a6..71f198e 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -44,7 +44,7 @@ static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
                                                       unsigned client)
 {
     assert(client < DIRTY_MEMORY_NUM);
-    return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] & (1 << client);
+    return test_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
 }

 /* read dirty bit (return 0 or 1) */
@@ -75,7 +75,8 @@ static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
                                                       unsigned client)
 {
     assert(client < DIRTY_MEMORY_NUM);
-    ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= (1 << client);
+
+    set_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
 }

 static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
@@ -88,11 +89,7 @@ static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
 static inline void cpu_physical_memory_clear_dirty_flag(ram_addr_t addr,
                                                        unsigned client)
 {
-    int mask = ~(1 << client);
-
-    assert(client < DIRTY_MEMORY_NUM);
-
-    ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] &= mask;
+    clear_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
 }

 static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 20/39] memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (18 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 19/39] memory: split dirty bitmap into three Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 21/39] memory: unfold cpu_physical_memory_set_dirty() " Juan Quintela
                   ` (21 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-internal.h | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 71f198e..d6d3537 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -86,12 +86,6 @@ static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
     cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_CODE);
 }

-static inline void cpu_physical_memory_clear_dirty_flag(ram_addr_t addr,
-                                                       unsigned client)
-{
-    clear_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
-}
-
 static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
                                                        ram_addr_t length)
 {
@@ -114,7 +108,7 @@ static inline void cpu_physical_memory_mask_dirty_range(ram_addr_t start,
     end = TARGET_PAGE_ALIGN(start + length);
     start &= TARGET_PAGE_MASK;
     for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-        cpu_physical_memory_clear_dirty_flag(addr, client);
+        clear_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
     }
 }

-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 21/39] memory: unfold cpu_physical_memory_set_dirty() in its only user
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (19 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 20/39] memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 22/39] memory: unfold cpu_physical_memory_set_dirty_flag() Juan Quintela
                   ` (20 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-internal.h | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index d6d3537..5fc4eb6 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -79,13 +79,6 @@ static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
     set_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
 }

-static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
-{
-    cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
-    cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_VGA);
-    cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_CODE);
-}
-
 static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
                                                        ram_addr_t length)
 {
@@ -94,7 +87,9 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
     end = TARGET_PAGE_ALIGN(start + length);
     start &= TARGET_PAGE_MASK;
     for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-        cpu_physical_memory_set_dirty(addr);
+        cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
+        cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_VGA);
+        cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_CODE);
     }
     xen_modified_memory(addr, length);
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 22/39] memory: unfold cpu_physical_memory_set_dirty_flag()
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (20 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 21/39] memory: unfold cpu_physical_memory_set_dirty() " Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 23/39] memory: make cpu_physical_memory_get_dirty() the main function Juan Quintela
                   ` (19 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-internal.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 5fc4eb6..e56f43b 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -87,9 +87,12 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
     end = TARGET_PAGE_ALIGN(start + length);
     start &= TARGET_PAGE_MASK;
     for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-        cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
-        cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_VGA);
-        cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_CODE);
+        set_bit(addr >> TARGET_PAGE_BITS,
+                ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]);
+        set_bit(addr >> TARGET_PAGE_BITS,
+                ram_list.dirty_memory[DIRTY_MEMORY_VGA]);
+        set_bit(addr >> TARGET_PAGE_BITS,
+                ram_list.dirty_memory[DIRTY_MEMORY_CODE]);
     }
     xen_modified_memory(addr, length);
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 23/39] memory: make cpu_physical_memory_get_dirty() the main function
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (21 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 22/39] memory: unfold cpu_physical_memory_set_dirty_flag() Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 23:44   ` Eric Blake
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 24/39] memory: cpu_physical_memory_get_dirty() is used as returning a bool Juan Quintela
                   ` (18 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

And make cpu_physical_memory_get_dirty_flag() to use it.  It used to
be the other way around.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-internal.h | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index e56f43b..f66d2ce 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -40,22 +40,6 @@ void *qemu_get_ram_ptr(ram_addr_t addr);
 void qemu_ram_free(ram_addr_t addr);
 void qemu_ram_free_from_ptr(ram_addr_t addr);

-static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
-                                                      unsigned client)
-{
-    assert(client < DIRTY_MEMORY_NUM);
-    return test_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
-}
-
-/* read dirty bit (return 0 or 1) */
-static inline bool cpu_physical_memory_is_dirty(ram_addr_t addr)
-{
-    bool vga = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_VGA);
-    bool code = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_CODE);
-    bool migration = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
-    return vga && code && migration;
-}
-
 static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
                                                 ram_addr_t length,
                                                 unsigned client)
@@ -63,14 +47,31 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
     int ret = 0;
     ram_addr_t addr, end;

+    assert(client < DIRTY_MEMORY_NUM);
+
     end = TARGET_PAGE_ALIGN(start + length);
     start &= TARGET_PAGE_MASK;
     for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-        ret |= cpu_physical_memory_get_dirty_flag(addr, client);
+        ret |= test_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
     }
     return ret;
 }

+static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
+                                                      unsigned client)
+{
+    return cpu_physical_memory_get_dirty(addr, 1, client);
+}
+
+/* read dirty bit (return 0 or 1) */
+static inline bool cpu_physical_memory_is_dirty(ram_addr_t addr)
+{
+    bool vga = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_VGA);
+    bool code = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_CODE);
+    bool migration = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
+    return vga && code && migration;
+}
+
 static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
                                                       unsigned client)
 {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 24/39] memory: cpu_physical_memory_get_dirty() is used as returning a bool
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (22 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 23/39] memory: make cpu_physical_memory_get_dirty() the main function Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 25/39] memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range Juan Quintela
                   ` (17 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-internal.h | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index f66d2ce..de8f279 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -40,11 +40,10 @@ void *qemu_get_ram_ptr(ram_addr_t addr);
 void qemu_ram_free(ram_addr_t addr);
 void qemu_ram_free_from_ptr(ram_addr_t addr);

-static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
-                                                ram_addr_t length,
-                                                unsigned client)
+static inline bool cpu_physical_memory_get_dirty(ram_addr_t start,
+                                                 ram_addr_t length,
+                                                 unsigned client)
 {
-    int ret = 0;
     ram_addr_t addr, end;

     assert(client < DIRTY_MEMORY_NUM);
@@ -52,9 +51,11 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
     end = TARGET_PAGE_ALIGN(start + length);
     start &= TARGET_PAGE_MASK;
     for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-        ret |= test_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
+        if (test_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client])) {
+            return true;
+        }
     }
-    return ret;
+    return false;
 }

 static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 25/39] memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (23 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 24/39] memory: cpu_physical_memory_get_dirty() is used as returning a bool Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 26/39] memory: use find_next_bit() to find dirty bits Juan Quintela
                   ` (16 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Now all functions use the same wording that bitops/bitmap operations

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 exec.c                         | 2 +-
 include/exec/memory-internal.h | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/exec.c b/exec.c
index 98700d3..b95c584 100644
--- a/exec.c
+++ b/exec.c
@@ -672,7 +672,7 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
     length = end - start;
     if (length == 0)
         return;
-    cpu_physical_memory_mask_dirty_range(start, length, client);
+    cpu_physical_memory_clear_dirty_range(start, length, client);

     if (tcg_enabled()) {
         tlb_reset_dirty_range_all(start, end, length);
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index de8f279..0c1dbfa 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -99,9 +99,9 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
     xen_modified_memory(addr, length);
 }

-static inline void cpu_physical_memory_mask_dirty_range(ram_addr_t start,
-                                                        ram_addr_t length,
-                                                        unsigned client)
+static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
+                                                         ram_addr_t length,
+                                                         unsigned client)
 {
     ram_addr_t addr, end;

-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 26/39] memory: use find_next_bit() to find dirty bits
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (24 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 25/39] memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 27/39] memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations Juan Quintela
                   ` (15 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

This operation is way faster than doing it bit by bit.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-internal.h | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 0c1dbfa..5a5bc0d 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -44,18 +44,15 @@ static inline bool cpu_physical_memory_get_dirty(ram_addr_t start,
                                                  ram_addr_t length,
                                                  unsigned client)
 {
-    ram_addr_t addr, end;
+    unsigned long end, page, next;

     assert(client < DIRTY_MEMORY_NUM);

-    end = TARGET_PAGE_ALIGN(start + length);
-    start &= TARGET_PAGE_MASK;
-    for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-        if (test_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client])) {
-            return true;
-        }
-    }
-    return false;
+    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+    page = start >> TARGET_PAGE_BITS;
+    next = find_next_bit(ram_list.dirty_memory[client], end, page);
+
+    return next < end;
 }

 static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 27/39] memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (25 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 26/39] memory: use find_next_bit() to find dirty bits Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 23:49   ` Eric Blake
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 28/39] memory: cpu_physical_memory_clear_dirty_range() " Juan Quintela
                   ` (14 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

We were setting a range of bits, so use bitmap_set().

Note: xen has always been wrong, and should have used start instead
of addr from the beggining.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-internal.h | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 5a5bc0d..2f704e8 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -81,19 +81,14 @@ static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
 static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
                                                        ram_addr_t length)
 {
-    ram_addr_t addr, end;
+    unsigned long end, page;

-    end = TARGET_PAGE_ALIGN(start + length);
-    start &= TARGET_PAGE_MASK;
-    for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-        set_bit(addr >> TARGET_PAGE_BITS,
-                ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]);
-        set_bit(addr >> TARGET_PAGE_BITS,
-                ram_list.dirty_memory[DIRTY_MEMORY_VGA]);
-        set_bit(addr >> TARGET_PAGE_BITS,
-                ram_list.dirty_memory[DIRTY_MEMORY_CODE]);
-    }
-    xen_modified_memory(addr, length);
+    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+    page = start >> TARGET_PAGE_BITS;
+    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION], page, end - page);
+    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_VGA], page, end - page);
+    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_CODE], page, end - page);
+    xen_modified_memory(start, length);
 }

 static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 28/39] memory: cpu_physical_memory_clear_dirty_range() now uses bitmap operations
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (26 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 27/39] memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 29/39] memory: s/dirty/clean/ in cpu_physical_memory_is_dirty() Juan Quintela
                   ` (13 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

We were clearing a range of bits, so use bitmap_clear().

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-internal.h | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 2f704e8..d46570e 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -95,13 +95,11 @@ static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
                                                          ram_addr_t length,
                                                          unsigned client)
 {
-    ram_addr_t addr, end;
+    unsigned long end, page;

-    end = TARGET_PAGE_ALIGN(start + length);
-    start &= TARGET_PAGE_MASK;
-    for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-        clear_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
-    }
+    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+    page = start >> TARGET_PAGE_BITS;
+    bitmap_clear(ram_list.dirty_memory[client], page, end - page);
 }

 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 29/39] memory: s/dirty/clean/ in cpu_physical_memory_is_dirty()
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (27 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 28/39] memory: cpu_physical_memory_clear_dirty_range() " Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 30/39] memory: make cpu_physical_memory_reset_dirty() take a length parameter Juan Quintela
                   ` (12 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

All uses except one really want the other meaning.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 cputlb.c                       | 2 +-
 exec.c                         | 6 +++---
 include/exec/memory-internal.h | 5 ++---
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index dfd747a..a3a957b 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -299,7 +299,7 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
             /* Write access calls the I/O callback.  */
             te->addr_write = address | TLB_MMIO;
         } else if (memory_region_is_ram(section->mr)
-                   && !cpu_physical_memory_is_dirty(section->mr->ram_addr + xlat)) {
+                   && cpu_physical_memory_is_clean(section->mr->ram_addr + xlat)) {
             te->addr_write = address | TLB_NOTDIRTY;
         } else {
             te->addr_write = address;
diff --git a/exec.c b/exec.c
index b95c584..a0d431a 100644
--- a/exec.c
+++ b/exec.c
@@ -1403,7 +1403,7 @@ static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
     cpu_physical_memory_set_dirty_flag(ram_addr, DIRTY_MEMORY_VGA);
     /* we remove the notdirty callback only if the code has been
        flushed */
-    if (cpu_physical_memory_is_dirty(ram_addr)) {
+    if (!cpu_physical_memory_is_clean(ram_addr)) {
         CPUArchState *env = current_cpu->env_ptr;
         tlb_set_dirty(env, env->mem_io_vaddr);
     }
@@ -1824,7 +1824,7 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
 static void invalidate_and_set_dirty(hwaddr addr,
                                      hwaddr length)
 {
-    if (!cpu_physical_memory_is_dirty(addr)) {
+    if (cpu_physical_memory_is_clean(addr)) {
         /* invalidate code */
         tb_invalidate_phys_page_range(addr, addr + length, 0);
         /* set dirty bit */
@@ -2407,7 +2407,7 @@ void stl_phys_notdirty(hwaddr addr, uint32_t val)
         stl_p(ptr, val);

         if (unlikely(in_migration)) {
-            if (!cpu_physical_memory_is_dirty(addr1)) {
+            if (cpu_physical_memory_is_clean(addr1)) {
                 /* invalidate code */
                 tb_invalidate_phys_page_range(addr1, addr1 + 4, 0);
                 /* set dirty bit */
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index d46570e..4bca95f 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -61,13 +61,12 @@ static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
     return cpu_physical_memory_get_dirty(addr, 1, client);
 }

-/* read dirty bit (return 0 or 1) */
-static inline bool cpu_physical_memory_is_dirty(ram_addr_t addr)
+static inline bool cpu_physical_memory_is_clean(ram_addr_t addr)
 {
     bool vga = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_VGA);
     bool code = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_CODE);
     bool migration = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
-    return vga && code && migration;
+    return !(vga && code && migration);
 }

 static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 30/39] memory: make cpu_physical_memory_reset_dirty() take a length parameter
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (28 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 29/39] memory: s/dirty/clean/ in cpu_physical_memory_is_dirty() Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 31/39] memory: cpu_physical_memory_set_dirty_tracking() should return void Juan Quintela
                   ` (11 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

We have an end parameter in all the callers, and this make it coherent
with the rest of cpu_physical_memory_* functions, that also take a
length parameter.

Once here, move the start/end calculation to
tlb_reset_dirty_range_all() as we don't need it here anymore.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 cputlb.c                       |  3 +--
 exec.c                         | 19 ++++++++-----------
 include/exec/memory-internal.h |  2 +-
 memory.c                       |  8 ++------
 4 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index a3a957b..865430c 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -127,8 +127,7 @@ void tlb_flush_page(CPUArchState *env, target_ulong addr)
    can be detected */
 void tlb_protect_code(ram_addr_t ram_addr)
 {
-    cpu_physical_memory_reset_dirty(ram_addr,
-                                    ram_addr + TARGET_PAGE_SIZE,
+    cpu_physical_memory_reset_dirty(ram_addr, TARGET_PAGE_SIZE,
                                     DIRTY_MEMORY_CODE);
 }

diff --git a/exec.c b/exec.c
index a0d431a..984b417 100644
--- a/exec.c
+++ b/exec.c
@@ -648,11 +648,14 @@ found:
     return block;
 }

-static void tlb_reset_dirty_range_all(ram_addr_t start, ram_addr_t end,
-                                      uintptr_t length)
+static void tlb_reset_dirty_range_all(ram_addr_t start, ram_addr_t length)
 {
-    RAMBlock *block;
     ram_addr_t start1;
+    RAMBlock *block;
+    ram_addr_t end;
+
+    end = TARGET_PAGE_ALIGN(start + length);
+    start &= TARGET_PAGE_MASK;

     block = qemu_get_ram_block(start);
     assert(block == qemu_get_ram_block(end - 1));
@@ -661,21 +664,15 @@ static void tlb_reset_dirty_range_all(ram_addr_t start, ram_addr_t end,
 }

 /* Note: start and end must be within the same ram block.  */
-void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
+void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t length,
                                      unsigned client)
 {
-    uintptr_t length;
-
-    start &= TARGET_PAGE_MASK;
-    end = TARGET_PAGE_ALIGN(end);
-
-    length = end - start;
     if (length == 0)
         return;
     cpu_physical_memory_clear_dirty_range(start, length, client);

     if (tcg_enabled()) {
-        tlb_reset_dirty_range_all(start, end, length);
+        tlb_reset_dirty_range_all(start, length);
     }
 }

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 4bca95f..c790ab8 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -101,7 +101,7 @@ static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
     bitmap_clear(ram_list.dirty_memory[client], page, end - page);
 }

-void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
+void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t length,
                                      unsigned client);

 #endif
diff --git a/memory.c b/memory.c
index f93c1e8..c58661b 100644
--- a/memory.c
+++ b/memory.c
@@ -1191,9 +1191,7 @@ bool memory_region_test_and_clear_dirty(MemoryRegion *mr, hwaddr addr,
     assert(mr->terminates);
     ret = cpu_physical_memory_get_dirty(mr->ram_addr + addr, size, client);
     if (ret) {
-        cpu_physical_memory_reset_dirty(mr->ram_addr + addr,
-                                        mr->ram_addr + addr + size,
-                                        client);
+        cpu_physical_memory_reset_dirty(mr->ram_addr + addr, size, client);
     }
     return ret;
 }
@@ -1239,9 +1237,7 @@ void memory_region_reset_dirty(MemoryRegion *mr, hwaddr addr,
                                hwaddr size, unsigned client)
 {
     assert(mr->terminates);
-    cpu_physical_memory_reset_dirty(mr->ram_addr + addr,
-                                    mr->ram_addr + addr + size,
-                                    client);
+    cpu_physical_memory_reset_dirty(mr->ram_addr + addr, size, client);
 }

 void *memory_region_get_ram_ptr(MemoryRegion *mr)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 31/39] memory: cpu_physical_memory_set_dirty_tracking() should return void
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (29 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 30/39] memory: make cpu_physical_memory_reset_dirty() take a length parameter Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 23:55   ` Eric Blake
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 32/39] memory: split cpu_physical_memory_* functions to its own include Juan Quintela
                   ` (10 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Result was always 0, and not used anywhere.  Once there, use bool type
for the parameter.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 exec.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index 984b417..11b434b 100644
--- a/exec.c
+++ b/exec.c
@@ -54,7 +54,7 @@
 //#define DEBUG_SUBPAGE

 #if !defined(CONFIG_USER_ONLY)
-static int in_migration;
+static bool in_migration;

 RAMList ram_list = { .blocks = QTAILQ_HEAD_INITIALIZER(ram_list.blocks) };

@@ -676,11 +676,9 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t length,
     }
 }

-static int cpu_physical_memory_set_dirty_tracking(int enable)
+static void cpu_physical_memory_set_dirty_tracking(bool enable)
 {
-    int ret = 0;
     in_migration = enable;
-    return ret;
 }

 hwaddr memory_region_section_get_iotlb(CPUArchState *env,
@@ -1699,12 +1697,12 @@ static void tcg_commit(MemoryListener *listener)

 static void core_log_global_start(MemoryListener *listener)
 {
-    cpu_physical_memory_set_dirty_tracking(1);
+    cpu_physical_memory_set_dirty_tracking(true);
 }

 static void core_log_global_stop(MemoryListener *listener)
 {
-    cpu_physical_memory_set_dirty_tracking(0);
+    cpu_physical_memory_set_dirty_tracking(false);
 }

 static MemoryListener core_memory_listener = {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 32/39] memory: split cpu_physical_memory_* functions to its own include
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (30 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 31/39] memory: cpu_physical_memory_set_dirty_tracking() should return void Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 15:54   ` Paolo Bonzini
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 33/39] memory: unfold memory_region_test_and_clear() Juan Quintela
                   ` (9 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

So we know who is using the bitmap directly

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 cputlb.c                       |  1 +
 exec.c                         |  1 +
 include/exec/memory-internal.h | 69 --------------------------------
 include/exec/memory-physical.h | 90 ++++++++++++++++++++++++++++++++++++++++++
 memory.c                       |  1 +
 5 files changed, 93 insertions(+), 69 deletions(-)
 create mode 100644 include/exec/memory-physical.h

diff --git a/cputlb.c b/cputlb.c
index 865430c..96aa143 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -26,6 +26,7 @@
 #include "exec/cputlb.h"

 #include "exec/memory-internal.h"
+#include "exec/memory-physical.h"

 //#define DEBUG_TLB
 //#define DEBUG_TLB_CHECK
diff --git a/exec.c b/exec.c
index 11b434b..04eb2b4 100644
--- a/exec.c
+++ b/exec.c
@@ -50,6 +50,7 @@
 #include "translate-all.h"

 #include "exec/memory-internal.h"
+#include "exec/memory-physical.h"

 //#define DEBUG_SUBPAGE

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index c790ab8..69a92cf 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -20,9 +20,6 @@
 #define MEMORY_INTERNAL_H

 #ifndef CONFIG_USER_ONLY
-#include "hw/xen/xen.h"
-
-
 typedef struct AddressSpaceDispatch AddressSpaceDispatch;

 void address_space_init_dispatch(AddressSpace *as);
@@ -39,71 +36,5 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr);
 void *qemu_get_ram_ptr(ram_addr_t addr);
 void qemu_ram_free(ram_addr_t addr);
 void qemu_ram_free_from_ptr(ram_addr_t addr);
-
-static inline bool cpu_physical_memory_get_dirty(ram_addr_t start,
-                                                 ram_addr_t length,
-                                                 unsigned client)
-{
-    unsigned long end, page, next;
-
-    assert(client < DIRTY_MEMORY_NUM);
-
-    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
-    page = start >> TARGET_PAGE_BITS;
-    next = find_next_bit(ram_list.dirty_memory[client], end, page);
-
-    return next < end;
-}
-
-static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
-                                                      unsigned client)
-{
-    return cpu_physical_memory_get_dirty(addr, 1, client);
-}
-
-static inline bool cpu_physical_memory_is_clean(ram_addr_t addr)
-{
-    bool vga = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_VGA);
-    bool code = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_CODE);
-    bool migration = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
-    return !(vga && code && migration);
-}
-
-static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
-                                                      unsigned client)
-{
-    assert(client < DIRTY_MEMORY_NUM);
-
-    set_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
-}
-
-static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
-                                                       ram_addr_t length)
-{
-    unsigned long end, page;
-
-    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
-    page = start >> TARGET_PAGE_BITS;
-    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION], page, end - page);
-    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_VGA], page, end - page);
-    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_CODE], page, end - page);
-    xen_modified_memory(start, length);
-}
-
-static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
-                                                         ram_addr_t length,
-                                                         unsigned client)
-{
-    unsigned long end, page;
-
-    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
-    page = start >> TARGET_PAGE_BITS;
-    bitmap_clear(ram_list.dirty_memory[client], page, end - page);
-}
-
-void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t length,
-                                     unsigned client);
-
 #endif
-
 #endif
diff --git a/include/exec/memory-physical.h b/include/exec/memory-physical.h
new file mode 100644
index 0000000..610c55f
--- /dev/null
+++ b/include/exec/memory-physical.h
@@ -0,0 +1,90 @@
+/*
+ * Declarations for cpu physical memory functions
+ *
+ * Copyright 2011 Red Hat, Inc. and/or its affiliates
+ *
+ * Authors:
+ *  Avi Kivity <avi@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ *
+ */
+
+/*
+ * This header is for use by exec.c and memory.c ONLY.  Do not include it.
+ * The functions declared here will be removed soon.
+ */
+
+#ifndef MEMORY_PHYSICAL_H
+#define MEMORY_PHYSICAL_H
+
+#ifndef CONFIG_USER_ONLY
+#include "hw/xen/xen.h"
+
+static inline bool cpu_physical_memory_get_dirty(ram_addr_t start,
+                                                 ram_addr_t length,
+                                                 unsigned client)
+{
+    unsigned long end, page, next;
+
+    assert(client < DIRTY_MEMORY_NUM);
+
+    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+    page = start >> TARGET_PAGE_BITS;
+    next = find_next_bit(ram_list.dirty_memory[client], end, page);
+
+    return next < end;
+}
+
+static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
+                                                      unsigned client)
+{
+    return cpu_physical_memory_get_dirty(addr, 1, client);
+}
+
+static inline bool cpu_physical_memory_is_clean(ram_addr_t addr)
+{
+    bool vga = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_VGA);
+    bool code = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_CODE);
+    bool migration = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
+    return !(vga && code && migration);
+}
+
+static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
+                                                      unsigned client)
+{
+    assert(client < DIRTY_MEMORY_NUM);
+
+    set_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
+}
+
+static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
+                                                       ram_addr_t length)
+{
+    unsigned long end, page;
+
+    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+    page = start >> TARGET_PAGE_BITS;
+    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION], page, end - page);
+    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_VGA], page, end - page);
+    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_CODE], page, end - page);
+    xen_modified_memory(start, length);
+}
+
+static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
+                                                         ram_addr_t length,
+                                                         unsigned client)
+{
+    unsigned long end, page;
+
+    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+    page = start >> TARGET_PAGE_BITS;
+    bitmap_clear(ram_list.dirty_memory[client], page, end - page);
+}
+
+void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t length,
+                                     unsigned client);
+
+#endif
+#endif
diff --git a/memory.c b/memory.c
index c58661b..c5efc8a 100644
--- a/memory.c
+++ b/memory.c
@@ -22,6 +22,7 @@
 #include <assert.h>

 #include "exec/memory-internal.h"
+#include "exec/memory-physical.h"

 //#define DEBUG_UNASSIGNED

-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 33/39] memory: unfold memory_region_test_and_clear()
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (31 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 32/39] memory: split cpu_physical_memory_* functions to its own include Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 34/39] kvm: use directly cpu_physical_memory_* api for tracking dirty pages Juan Quintela
                   ` (8 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

We are going to update the bitmap directly

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 arch_init.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 7545d96..72ef993 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -48,6 +48,7 @@
 #include "qmp-commands.h"
 #include "trace.h"
 #include "exec/cpu-all.h"
+#include "exec/memory-physical.h"
 #include "hw/acpi/acpi.h"

 #ifdef DEBUG_ARCH_INIT
@@ -400,9 +401,12 @@ static void migration_bitmap_sync(void)

     QTAILQ_FOREACH(block, &ram_list.blocks, next) {
         for (addr = 0; addr < block->length; addr += TARGET_PAGE_SIZE) {
-            if (memory_region_test_and_clear_dirty(block->mr,
-                                                   addr, TARGET_PAGE_SIZE,
-                                                   DIRTY_MEMORY_MIGRATION)) {
+            if (cpu_physical_memory_get_dirty(block->mr->ram_addr + addr,
+                                              TARGET_PAGE_SIZE,
+                                              DIRTY_MEMORY_MIGRATION)) {
+                cpu_physical_memory_reset_dirty(block->mr->ram_addr + addr,
+                                                TARGET_PAGE_SIZE,
+                                                DIRTY_MEMORY_MIGRATION);
                 migration_bitmap_set_dirty(block->mr, addr);
             }
         }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 34/39] kvm: use directly cpu_physical_memory_* api for tracking dirty pages
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (32 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 33/39] memory: unfold memory_region_test_and_clear() Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 35/39] kvm: refactor start address calculation Juan Quintela
                   ` (7 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Performance is important in this function, and we want to optimize even further.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 kvm-all.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 4478969..963c2d1 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -31,6 +31,7 @@
 #include "sysemu/kvm.h"
 #include "qemu/bswap.h"
 #include "exec/memory.h"
+#include "exec/memory-physical.h"
 #include "exec/address-spaces.h"
 #include "qemu/event_notifier.h"
 #include "trace.h"
@@ -381,6 +382,7 @@ static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
     unsigned int i, j;
     unsigned long page_number, c;
     hwaddr addr, addr1;
+    ram_addr_t ram_addr;
     unsigned int pages = int128_get64(section->size) / getpagesize();
     unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
     unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
@@ -398,8 +400,9 @@ static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
                 page_number = (i * HOST_LONG_BITS + j) * hpratio;
                 addr1 = page_number * TARGET_PAGE_SIZE;
                 addr = section->offset_within_region + addr1;
-                memory_region_set_dirty(section->mr, addr,
-                                        TARGET_PAGE_SIZE * hpratio);
+                ram_addr = section->mr->ram_addr + addr;
+                cpu_physical_memory_set_dirty_range(ram_addr,
+                                                    TARGET_PAGE_SIZE * hpratio);
             } while (c != 0);
         }
     }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 35/39] kvm: refactor start address calculation
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (33 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 34/39] kvm: use directly cpu_physical_memory_* api for tracking dirty pages Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 36/39] memory: move bitmap synchronization to its own function Juan Quintela
                   ` (6 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 kvm-all.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 963c2d1..7609242 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -381,7 +381,8 @@ static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
 {
     unsigned int i, j;
     unsigned long page_number, c;
-    hwaddr addr, addr1;
+    hwaddr addr;
+    ram_addr_t start = section->offset_within_region + section->mr->ram_addr;
     ram_addr_t ram_addr;
     unsigned int pages = int128_get64(section->size) / getpagesize();
     unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
@@ -398,9 +399,8 @@ static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
                 j = ffsl(c) - 1;
                 c &= ~(1ul << j);
                 page_number = (i * HOST_LONG_BITS + j) * hpratio;
-                addr1 = page_number * TARGET_PAGE_SIZE;
-                addr = section->offset_within_region + addr1;
-                ram_addr = section->mr->ram_addr + addr;
+                addr = page_number * TARGET_PAGE_SIZE;
+                ram_addr = start + addr;
                 cpu_physical_memory_set_dirty_range(ram_addr,
                                                     TARGET_PAGE_SIZE * hpratio);
             } while (c != 0);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 36/39] memory: move bitmap synchronization to its own function
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (34 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 35/39] kvm: refactor start address calculation Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 15:56   ` Paolo Bonzini
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 37/39] memory: syncronize kvm bitmap using bitmaps operations Juan Quintela
                   ` (5 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

We want to have all the functions that handle directly the dirty
bitmap near.  We will change it later.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-physical.h | 31 +++++++++++++++++++++++++++++++
 kvm-all.c                      | 27 ++-------------------------
 2 files changed, 33 insertions(+), 25 deletions(-)

diff --git a/include/exec/memory-physical.h b/include/exec/memory-physical.h
index 610c55f..72faf06 100644
--- a/include/exec/memory-physical.h
+++ b/include/exec/memory-physical.h
@@ -72,6 +72,37 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
     xen_modified_memory(start, length);
 }

+static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
+                                                          ram_addr_t start,
+                                                          ram_addr_t pages)
+{
+    unsigned int i, j;
+    unsigned long page_number, c;
+    hwaddr addr;
+    ram_addr_t ram_addr;
+    unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
+    unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
+
+    /*
+     * bitmap-traveling is faster than memory-traveling (for addr...)
+     * especially when most of the memory is not dirty.
+     */
+    for (i = 0; i < len; i++) {
+        if (bitmap[i] != 0) {
+            c = leul_to_cpu(bitmap[i]);
+            do {
+                j = ffsl(c) - 1;
+                c &= ~(1ul << j);
+                page_number = (i * HOST_LONG_BITS + j) * hpratio;
+                addr = page_number * TARGET_PAGE_SIZE;
+                ram_addr = start + addr;
+                cpu_physical_memory_set_dirty_range(ram_addr,
+                                                    TARGET_PAGE_SIZE * hpratio);
+            } while (c != 0);
+        }
+    }
+}
+
 static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
                                                          ram_addr_t length,
                                                          unsigned client)
diff --git a/kvm-all.c b/kvm-all.c
index 7609242..25c0864 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -379,33 +379,10 @@ static int kvm_set_migration_log(int enable)
 static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
                                          unsigned long *bitmap)
 {
-    unsigned int i, j;
-    unsigned long page_number, c;
-    hwaddr addr;
     ram_addr_t start = section->offset_within_region + section->mr->ram_addr;
-    ram_addr_t ram_addr;
-    unsigned int pages = int128_get64(section->size) / getpagesize();
-    unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
-    unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
+    ram_addr_t pages = int128_get64(section->size) / getpagesize();

-    /*
-     * bitmap-traveling is faster than memory-traveling (for addr...)
-     * especially when most of the memory is not dirty.
-     */
-    for (i = 0; i < len; i++) {
-        if (bitmap[i] != 0) {
-            c = leul_to_cpu(bitmap[i]);
-            do {
-                j = ffsl(c) - 1;
-                c &= ~(1ul << j);
-                page_number = (i * HOST_LONG_BITS + j) * hpratio;
-                addr = page_number * TARGET_PAGE_SIZE;
-                ram_addr = start + addr;
-                cpu_physical_memory_set_dirty_range(ram_addr,
-                                                    TARGET_PAGE_SIZE * hpratio);
-            } while (c != 0);
-        }
-    }
+    cpu_physical_memory_set_dirty_lebitmap(bitmap, start, pages);
     return 0;
 }

-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 37/39] memory: syncronize kvm bitmap using bitmaps operations
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (35 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 36/39] memory: move bitmap synchronization to its own function Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 15:58   ` Paolo Bonzini
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 38/39] ram: split function that synchronizes a range Juan Quintela
                   ` (4 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

If bitmaps are aligned properly, use bitmap operations.  If they are
not, just use old bit at a time code.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory-physical.h | 54 +++++++++++++++++++++++++++++-------------
 1 file changed, 38 insertions(+), 16 deletions(-)

diff --git a/include/exec/memory-physical.h b/include/exec/memory-physical.h
index 72faf06..9057714 100644
--- a/include/exec/memory-physical.h
+++ b/include/exec/memory-physical.h
@@ -82,23 +82,45 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
     ram_addr_t ram_addr;
     unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
     unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
+    unsigned long page = BIT_WORD(start >> TARGET_PAGE_BITS);

-    /*
-     * bitmap-traveling is faster than memory-traveling (for addr...)
-     * especially when most of the memory is not dirty.
-     */
-    for (i = 0; i < len; i++) {
-        if (bitmap[i] != 0) {
-            c = leul_to_cpu(bitmap[i]);
-            do {
-                j = ffsl(c) - 1;
-                c &= ~(1ul << j);
-                page_number = (i * HOST_LONG_BITS + j) * hpratio;
-                addr = page_number * TARGET_PAGE_SIZE;
-                ram_addr = start + addr;
-                cpu_physical_memory_set_dirty_range(ram_addr,
-                                                    TARGET_PAGE_SIZE * hpratio);
-            } while (c != 0);
+    /* start address is aligned at the start of a word? */
+    if (((page * BITS_PER_LONG) << TARGET_PAGE_BITS) == start) {
+        int k;
+        int nr = BITS_TO_LONGS(pages);
+
+        printf("XXX: aligned start %lx page %lx\n", start, page);
+        assert(start == ((start >> TARGET_PAGE_BITS) << TARGET_PAGE_BITS));
+
+        for (k = 0; k < nr; k++) {
+            if (bitmap[k]) {
+                unsigned long temp = leul_to_cpu(bitmap[k]);
+
+                ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION][page + k] |= temp;
+                ram_list.dirty_memory[DIRTY_MEMORY_VGA][page + k] |= temp;
+                ram_list.dirty_memory[DIRTY_MEMORY_CODE][page + k] |= temp;
+            }
+        }
+        xen_modified_memory(start, pages);
+    } else {
+        printf("XXX: not aligned start %lx pages %lu\n", start, pages);
+        /*
+         * bitmap-traveling is faster than memory-traveling (for addr...)
+         * especially when most of the memory is not dirty.
+         */
+        for (i = 0; i < len; i++) {
+            if (bitmap[i] != 0) {
+                c = leul_to_cpu(bitmap[i]);
+                do {
+                    j = ffsl(c) - 1;
+                    c &= ~(1ul << j);
+                    page_number = (i * HOST_LONG_BITS + j) * hpratio;
+                    addr = page_number * TARGET_PAGE_SIZE;
+                    ram_addr = start + addr;
+                    cpu_physical_memory_set_dirty_range(ram_addr,
+                                                        TARGET_PAGE_SIZE * hpratio);
+                } while (c != 0);
+            }
         }
     }
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 38/39] ram: split function that synchronizes a range
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (36 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 37/39] memory: syncronize kvm bitmap using bitmaps operations Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 39/39] migration: synchronize memory bitmap 64bits at a time Juan Quintela
                   ` (3 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

This function is the only bit where we care about speed.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 arch_init.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 72ef993..fe88d64 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -360,11 +360,10 @@ ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
     return (next - base) << TARGET_PAGE_BITS;
 }

-static inline bool migration_bitmap_set_dirty(MemoryRegion *mr,
-                                              ram_addr_t offset)
+static inline bool migration_bitmap_set_dirty(ram_addr_t addr)
 {
     bool ret;
-    int nr = (mr->ram_addr + offset) >> TARGET_PAGE_BITS;
+    int nr = addr >> TARGET_PAGE_BITS;

     ret = test_and_set_bit(nr, migration_bitmap);

@@ -374,12 +373,28 @@ static inline bool migration_bitmap_set_dirty(MemoryRegion *mr,
     return ret;
 }

+static void migration_bitmap_sync_range(ram_addr_t start, ram_addr_t length)
+{
+    ram_addr_t addr;
+
+    for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
+        if (cpu_physical_memory_get_dirty(start + addr,
+                                          TARGET_PAGE_SIZE,
+                                          DIRTY_MEMORY_MIGRATION)) {
+            cpu_physical_memory_reset_dirty(start + addr,
+                                            TARGET_PAGE_SIZE,
+                                            DIRTY_MEMORY_MIGRATION);
+            migration_bitmap_set_dirty(start + addr);
+        }
+    }
+}
+
+
 /* Needs iothread lock! */

 static void migration_bitmap_sync(void)
 {
     RAMBlock *block;
-    ram_addr_t addr;
     uint64_t num_dirty_pages_init = migration_dirty_pages;
     MigrationState *s = migrate_get_current();
     static int64_t start_time;
@@ -400,16 +415,7 @@ static void migration_bitmap_sync(void)
     address_space_sync_dirty_bitmap(&address_space_memory);

     QTAILQ_FOREACH(block, &ram_list.blocks, next) {
-        for (addr = 0; addr < block->length; addr += TARGET_PAGE_SIZE) {
-            if (cpu_physical_memory_get_dirty(block->mr->ram_addr + addr,
-                                              TARGET_PAGE_SIZE,
-                                              DIRTY_MEMORY_MIGRATION)) {
-                cpu_physical_memory_reset_dirty(block->mr->ram_addr + addr,
-                                                TARGET_PAGE_SIZE,
-                                                DIRTY_MEMORY_MIGRATION);
-                migration_bitmap_set_dirty(block->mr, addr);
-            }
-        }
+        migration_bitmap_sync_range(block->mr->ram_addr, block->length);
     }
     trace_migration_bitmap_sync_end(migration_dirty_pages
                                     - num_dirty_pages_init);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [Qemu-devel] [PATCH 39/39] migration: synchronize memory bitmap 64bits at a time
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (37 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 38/39] ram: split function that synchronizes a range Juan Quintela
@ 2013-11-06 13:04 ` Juan Quintela
  2013-11-06 14:37 ` [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Gerd Hoffmann
                   ` (2 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: chegu_vinod

We use the old code if the bitmaps are not aligned

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 arch_init.c | 43 ++++++++++++++++++++++++++++++++++---------
 1 file changed, 34 insertions(+), 9 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index fe88d64..0909531 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -50,6 +50,7 @@
 #include "exec/cpu-all.h"
 #include "exec/memory-physical.h"
 #include "hw/acpi/acpi.h"
+#include "qemu/host-utils.h"

 #ifdef DEBUG_ARCH_INIT
 #define DPRINTF(fmt, ...) \
@@ -376,15 +377,37 @@ static inline bool migration_bitmap_set_dirty(ram_addr_t addr)
 static void migration_bitmap_sync_range(ram_addr_t start, ram_addr_t length)
 {
     ram_addr_t addr;
-
-    for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
-        if (cpu_physical_memory_get_dirty(start + addr,
-                                          TARGET_PAGE_SIZE,
-                                          DIRTY_MEMORY_MIGRATION)) {
-            cpu_physical_memory_reset_dirty(start + addr,
-                                            TARGET_PAGE_SIZE,
-                                            DIRTY_MEMORY_MIGRATION);
-            migration_bitmap_set_dirty(start + addr);
+    unsigned long page = BIT_WORD(start >> TARGET_PAGE_BITS);
+
+    /* start address is aligned at the start of a word? */
+    if (((page * BITS_PER_LONG) << TARGET_PAGE_BITS) == start) {
+        int k;
+        int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
+        unsigned long *src = ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION];
+
+        printf("XXXX:optimized start %lx page %lx length %lu\n", start, page, length);
+
+        for (k = page; k < page + nr; k++) {
+            if (src[k]) {
+                unsigned long new_dirty;
+                new_dirty = ~migration_bitmap[k];
+                migration_bitmap[k] |= src[k];
+                new_dirty &= src[k];
+                migration_dirty_pages += ctpopl(new_dirty);
+                src[k] = 0;
+            }
+        }
+    } else {
+        printf("XXXX:not optimized start %lx length %lu\n", start, length);
+        for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
+            if (cpu_physical_memory_get_dirty(start + addr,
+                                              TARGET_PAGE_SIZE,
+                                              DIRTY_MEMORY_MIGRATION)) {
+                cpu_physical_memory_reset_dirty(start + addr,
+                                                TARGET_PAGE_SIZE,
+                                                DIRTY_MEMORY_MIGRATION);
+                migration_bitmap_set_dirty(start + addr);
+            }
         }
     }
 }
@@ -415,6 +438,8 @@ static void migration_bitmap_sync(void)
     address_space_sync_dirty_bitmap(&address_space_memory);

     QTAILQ_FOREACH(block, &ram_list.blocks, next) {
+        printf("XXXX: name %s addr %lx length %lu\n",
+               block->idstr, block->mr->ram_addr, block->length);
         migration_bitmap_sync_range(block->mr->ram_addr, block->length);
     }
     trace_migration_bitmap_sync_end(migration_dirty_pages
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (38 preceding siblings ...)
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 39/39] migration: synchronize memory bitmap 64bits at a time Juan Quintela
@ 2013-11-06 14:37 ` Gerd Hoffmann
  2013-11-06 15:38   ` Juan Quintela
  2013-11-06 15:49   ` Paolo Bonzini
  2013-11-08 15:18 ` Chegu Vinod
  2013-11-25  6:15 ` Michael R. Hines
  41 siblings, 2 replies; 69+ messages in thread
From: Gerd Hoffmann @ 2013-11-06 14:37 UTC (permalink / raw)
  To: Juan Quintela; +Cc: chegu_vinod, qemu-devel

  Hi,

> - vga ram by default is not aligned in a page number multiple of 64,
> 
>   it could be optimized.  Kraxel?  It syncs the kvm bitmap at least 1
>   a second or so? bitmap is only 2048 pages (16MB by default).
>   We need to change the ram_addr only

It is created using memory_region_init_ram(), in vga_common_init().
Nothing special is done to avoid/force any alignment.
Dunno why it ends up on a odd page number.

> - vga: still more, after we finish migration, vga code continues
>   synchronizing the kvm bitmap on source machine.  Notice that there
>   is no graphics client connected to the VGA.  Worth investigating?

Yep, needed to figure updated screen areas.  It never stops completely,
but the refresh rate should be alot lower with no vnc client connected,
something like once every few seconds without client vs. 30x per second
with client.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
  2013-11-06 14:37 ` [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Gerd Hoffmann
@ 2013-11-06 15:38   ` Juan Quintela
  2013-11-07 11:23     ` Gerd Hoffmann
  2013-11-06 15:49   ` Paolo Bonzini
  1 sibling, 1 reply; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 15:38 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: chegu_vinod, qemu-devel

Gerd Hoffmann <kraxel@redhat.com> wrote:
>   Hi,
>
>> - vga ram by default is not aligned in a page number multiple of 64,
>> 
>>   it could be optimized.  Kraxel?  It syncs the kvm bitmap at least 1
>>   a second or so? bitmap is only 2048 pages (16MB by default).
>>   We need to change the ram_addr only
>
> It is created using memory_region_init_ram(), in vga_common_init().
> Nothing special is done to avoid/force any alignment.
> Dunno why it ends up on a odd page number.
>
>> - vga: still more, after we finish migration, vga code continues
>>   synchronizing the kvm bitmap on source machine.  Notice that there
>>   is no graphics client connected to the VGA.  Worth investigating?
>
> Yep, needed to figure updated screen areas.  It never stops completely,
> but the refresh rate should be alot lower with no vnc client connected,
> something like once every few seconds without client vs. 30x per second
> with client.

When the guest is stopped you really need to ask kvm for changes?
strange,  no?

Later,  Juan.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
  2013-11-06 14:37 ` [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Gerd Hoffmann
  2013-11-06 15:38   ` Juan Quintela
@ 2013-11-06 15:49   ` Paolo Bonzini
  2013-11-25  6:39     ` Michael R. Hines
  1 sibling, 1 reply; 69+ messages in thread
From: Paolo Bonzini @ 2013-11-06 15:49 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: chegu_vinod, qemu-devel, Michael R. Hines, Juan Quintela

Il 06/11/2013 15:37, Gerd Hoffmann ha scritto:
> 
>>> - vga ram by default is not aligned in a page number multiple of
>>> 64,
>>> 
>>> it could be optimized.  Kraxel?  It syncs the kvm bitmap at least
>>> 1 a second or so? bitmap is only 2048 pages (16MB by default). We
>>> need to change the ram_addr only
> It is created using memory_region_init_ram(), in vga_common_init(). 
> Nothing special is done to avoid/force any alignment. Dunno why it
> ends up on a odd page number.

Because some random option ROM is loaded before the RAM region is
created, and thus the ram_addr_t's become misaligned.  Keeping the
ram_addr_t's aligned in find_ram_offset is easy:

diff --git a/exec.c b/exec.c
index 79610ce..1b82e81 100644
--- a/exec.c
+++ b/exec.c
@@ -1002,7 +1002,7 @@ static ram_addr_t find_ram_offset(ram_addr_t size)
     QTAILQ_FOREACH(block, &ram_list.blocks, next) {
         ram_addr_t end, next = RAM_ADDR_MAX;
 
-        end = block->offset + block->length;
+        end = ROUND_UP(block->offset + block->length, TARGET_PAGE_SIZE * 64);
 
         QTAILQ_FOREACH(next_block, &ram_list.blocks, next) {
             if (next_block->offset >= end) {

but I'm not sure if we're allowed to change the ram_addr_t's.  On one hand they
are not part of guest ABI, on the other hand RDMA migration uses them
in the protocol.

Michael, can you check if RDMA migration works from a QEMU *without* 
this patch to one *with*:

diff --git a/exec.c b/exec.c
index 79610ce..21b8b96 100644
--- a/exec.c
+++ b/exec.c
@@ -997,7 +997,7 @@ static ram_addr_t find_ram_offset(ram_addr_t size)
     assert(size != 0); /* it would hand out same offset multiple times */
 
     if (QTAILQ_EMPTY(&ram_list.blocks))
-        return 0;
+        return TARGET_PAGE_SIZE * 100;
 
     QTAILQ_FOREACH(block, &ram_list.blocks, next) {
         ram_addr_t end, next = RAM_ADDR_MAX;

I suspect it doesn't.  But perhaps we still have time before RDMA
becomes non-experimental.

Paolo

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 32/39] memory: split cpu_physical_memory_* functions to its own include
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 32/39] memory: split cpu_physical_memory_* functions to its own include Juan Quintela
@ 2013-11-06 15:54   ` Paolo Bonzini
  0 siblings, 0 replies; 69+ messages in thread
From: Paolo Bonzini @ 2013-11-06 15:54 UTC (permalink / raw)
  To: Juan Quintela; +Cc: chegu_vinod, qemu-devel

Il 06/11/2013 14:04, Juan Quintela ha scritto:
> So we know who is using the bitmap directly

At least all other functions that take a ram_addr_t should be moved
there too:

ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
                                   MemoryRegion *mr);
ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr);
void *qemu_get_ram_ptr(ram_addr_t addr);
void qemu_ram_free(ram_addr_t addr);
void qemu_ram_free_from_ptr(ram_addr_t addr);

So just call it include/exec/ram-addr.h.

Paolo

> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  cputlb.c                       |  1 +
>  exec.c                         |  1 +
>  include/exec/memory-internal.h | 69 --------------------------------
>  include/exec/memory-physical.h | 90 ++++++++++++++++++++++++++++++++++++++++++
>  memory.c                       |  1 +
>  5 files changed, 93 insertions(+), 69 deletions(-)
>  create mode 100644 include/exec/memory-physical.h
> 
> diff --git a/cputlb.c b/cputlb.c
> index 865430c..96aa143 100644
> --- a/cputlb.c
> +++ b/cputlb.c
> @@ -26,6 +26,7 @@
>  #include "exec/cputlb.h"
> 
>  #include "exec/memory-internal.h"
> +#include "exec/memory-physical.h"
> 
>  //#define DEBUG_TLB
>  //#define DEBUG_TLB_CHECK
> diff --git a/exec.c b/exec.c
> index 11b434b..04eb2b4 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -50,6 +50,7 @@
>  #include "translate-all.h"
> 
>  #include "exec/memory-internal.h"
> +#include "exec/memory-physical.h"
> 
>  //#define DEBUG_SUBPAGE
> 
> diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
> index c790ab8..69a92cf 100644
> --- a/include/exec/memory-internal.h
> +++ b/include/exec/memory-internal.h
> @@ -20,9 +20,6 @@
>  #define MEMORY_INTERNAL_H
> 
>  #ifndef CONFIG_USER_ONLY
> -#include "hw/xen/xen.h"
> -
> -
>  typedef struct AddressSpaceDispatch AddressSpaceDispatch;
> 
>  void address_space_init_dispatch(AddressSpace *as);
> @@ -39,71 +36,5 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr);
>  void *qemu_get_ram_ptr(ram_addr_t addr);
>  void qemu_ram_free(ram_addr_t addr);
>  void qemu_ram_free_from_ptr(ram_addr_t addr);
> -
> -static inline bool cpu_physical_memory_get_dirty(ram_addr_t start,
> -                                                 ram_addr_t length,
> -                                                 unsigned client)
> -{
> -    unsigned long end, page, next;
> -
> -    assert(client < DIRTY_MEMORY_NUM);
> -
> -    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
> -    page = start >> TARGET_PAGE_BITS;
> -    next = find_next_bit(ram_list.dirty_memory[client], end, page);
> -
> -    return next < end;
> -}
> -
> -static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
> -                                                      unsigned client)
> -{
> -    return cpu_physical_memory_get_dirty(addr, 1, client);
> -}
> -
> -static inline bool cpu_physical_memory_is_clean(ram_addr_t addr)
> -{
> -    bool vga = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_VGA);
> -    bool code = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_CODE);
> -    bool migration = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
> -    return !(vga && code && migration);
> -}
> -
> -static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
> -                                                      unsigned client)
> -{
> -    assert(client < DIRTY_MEMORY_NUM);
> -
> -    set_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
> -}
> -
> -static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
> -                                                       ram_addr_t length)
> -{
> -    unsigned long end, page;
> -
> -    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
> -    page = start >> TARGET_PAGE_BITS;
> -    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION], page, end - page);
> -    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_VGA], page, end - page);
> -    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_CODE], page, end - page);
> -    xen_modified_memory(start, length);
> -}
> -
> -static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
> -                                                         ram_addr_t length,
> -                                                         unsigned client)
> -{
> -    unsigned long end, page;
> -
> -    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
> -    page = start >> TARGET_PAGE_BITS;
> -    bitmap_clear(ram_list.dirty_memory[client], page, end - page);
> -}
> -
> -void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t length,
> -                                     unsigned client);
> -
>  #endif
> -
>  #endif
> diff --git a/include/exec/memory-physical.h b/include/exec/memory-physical.h
> new file mode 100644
> index 0000000..610c55f
> --- /dev/null
> +++ b/include/exec/memory-physical.h
> @@ -0,0 +1,90 @@
> +/*
> + * Declarations for cpu physical memory functions
> + *
> + * Copyright 2011 Red Hat, Inc. and/or its affiliates
> + *
> + * Authors:
> + *  Avi Kivity <avi@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + *
> + */
> +
> +/*
> + * This header is for use by exec.c and memory.c ONLY.  Do not include it.
> + * The functions declared here will be removed soon.
> + */
> +
> +#ifndef MEMORY_PHYSICAL_H
> +#define MEMORY_PHYSICAL_H
> +
> +#ifndef CONFIG_USER_ONLY
> +#include "hw/xen/xen.h"
> +
> +static inline bool cpu_physical_memory_get_dirty(ram_addr_t start,
> +                                                 ram_addr_t length,
> +                                                 unsigned client)
> +{
> +    unsigned long end, page, next;
> +
> +    assert(client < DIRTY_MEMORY_NUM);
> +
> +    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
> +    page = start >> TARGET_PAGE_BITS;
> +    next = find_next_bit(ram_list.dirty_memory[client], end, page);
> +
> +    return next < end;
> +}
> +
> +static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
> +                                                      unsigned client)
> +{
> +    return cpu_physical_memory_get_dirty(addr, 1, client);
> +}
> +
> +static inline bool cpu_physical_memory_is_clean(ram_addr_t addr)
> +{
> +    bool vga = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_VGA);
> +    bool code = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_CODE);
> +    bool migration = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
> +    return !(vga && code && migration);
> +}
> +
> +static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
> +                                                      unsigned client)
> +{
> +    assert(client < DIRTY_MEMORY_NUM);
> +
> +    set_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
> +}
> +
> +static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
> +                                                       ram_addr_t length)
> +{
> +    unsigned long end, page;
> +
> +    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
> +    page = start >> TARGET_PAGE_BITS;
> +    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION], page, end - page);
> +    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_VGA], page, end - page);
> +    bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_CODE], page, end - page);
> +    xen_modified_memory(start, length);
> +}
> +
> +static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
> +                                                         ram_addr_t length,
> +                                                         unsigned client)
> +{
> +    unsigned long end, page;
> +
> +    end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
> +    page = start >> TARGET_PAGE_BITS;
> +    bitmap_clear(ram_list.dirty_memory[client], page, end - page);
> +}
> +
> +void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t length,
> +                                     unsigned client);
> +
> +#endif
> +#endif
> diff --git a/memory.c b/memory.c
> index c58661b..c5efc8a 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -22,6 +22,7 @@
>  #include <assert.h>
> 
>  #include "exec/memory-internal.h"
> +#include "exec/memory-physical.h"
> 
>  //#define DEBUG_UNASSIGNED
> 

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 36/39] memory: move bitmap synchronization to its own function
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 36/39] memory: move bitmap synchronization to its own function Juan Quintela
@ 2013-11-06 15:56   ` Paolo Bonzini
  2013-11-06 16:22     ` Juan Quintela
  0 siblings, 1 reply; 69+ messages in thread
From: Paolo Bonzini @ 2013-11-06 15:56 UTC (permalink / raw)
  To: Juan Quintela; +Cc: chegu_vinod, qemu-devel

Il 06/11/2013 14:04, Juan Quintela ha scritto:
> We want to have all the functions that handle directly the dirty
> bitmap near.  We will change it later.

Please move it to exec.c instead.

> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  include/exec/memory-physical.h | 31 +++++++++++++++++++++++++++++++
>  kvm-all.c                      | 27 ++-------------------------
>  2 files changed, 33 insertions(+), 25 deletions(-)
> 
> diff --git a/include/exec/memory-physical.h b/include/exec/memory-physical.h
> index 610c55f..72faf06 100644
> --- a/include/exec/memory-physical.h
> +++ b/include/exec/memory-physical.h
> @@ -72,6 +72,37 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
>      xen_modified_memory(start, length);
>  }
> 
> +static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,

Why "le"?

Paolo

> +                                                          ram_addr_t start,
> +                                                          ram_addr_t pages)
> +{
> +    unsigned int i, j;
> +    unsigned long page_number, c;
> +    hwaddr addr;
> +    ram_addr_t ram_addr;
> +    unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
> +    unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
> +
> +    /*
> +     * bitmap-traveling is faster than memory-traveling (for addr...)
> +     * especially when most of the memory is not dirty.
> +     */
> +    for (i = 0; i < len; i++) {
> +        if (bitmap[i] != 0) {
> +            c = leul_to_cpu(bitmap[i]);
> +            do {
> +                j = ffsl(c) - 1;
> +                c &= ~(1ul << j);
> +                page_number = (i * HOST_LONG_BITS + j) * hpratio;
> +                addr = page_number * TARGET_PAGE_SIZE;
> +                ram_addr = start + addr;
> +                cpu_physical_memory_set_dirty_range(ram_addr,
> +                                                    TARGET_PAGE_SIZE * hpratio);
> +            } while (c != 0);
> +        }
> +    }
> +}
> +
>  static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
>                                                           ram_addr_t length,
>                                                           unsigned client)
> diff --git a/kvm-all.c b/kvm-all.c
> index 7609242..25c0864 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -379,33 +379,10 @@ static int kvm_set_migration_log(int enable)
>  static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
>                                           unsigned long *bitmap)
>  {
> -    unsigned int i, j;
> -    unsigned long page_number, c;
> -    hwaddr addr;
>      ram_addr_t start = section->offset_within_region + section->mr->ram_addr;
> -    ram_addr_t ram_addr;
> -    unsigned int pages = int128_get64(section->size) / getpagesize();
> -    unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
> -    unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
> +    ram_addr_t pages = int128_get64(section->size) / getpagesize();
> 
> -    /*
> -     * bitmap-traveling is faster than memory-traveling (for addr...)
> -     * especially when most of the memory is not dirty.
> -     */
> -    for (i = 0; i < len; i++) {
> -        if (bitmap[i] != 0) {
> -            c = leul_to_cpu(bitmap[i]);
> -            do {
> -                j = ffsl(c) - 1;
> -                c &= ~(1ul << j);
> -                page_number = (i * HOST_LONG_BITS + j) * hpratio;
> -                addr = page_number * TARGET_PAGE_SIZE;
> -                ram_addr = start + addr;
> -                cpu_physical_memory_set_dirty_range(ram_addr,
> -                                                    TARGET_PAGE_SIZE * hpratio);
> -            } while (c != 0);
> -        }
> -    }
> +    cpu_physical_memory_set_dirty_lebitmap(bitmap, start, pages);
>      return 0;
>  }
> 

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 37/39] memory: syncronize kvm bitmap using bitmaps operations
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 37/39] memory: syncronize kvm bitmap using bitmaps operations Juan Quintela
@ 2013-11-06 15:58   ` Paolo Bonzini
  0 siblings, 0 replies; 69+ messages in thread
From: Paolo Bonzini @ 2013-11-06 15:58 UTC (permalink / raw)
  To: Juan Quintela; +Cc: chegu_vinod, qemu-devel

Il 06/11/2013 14:04, Juan Quintela ha scritto:
> +    /* start address is aligned at the start of a word? */
> +    if (((page * BITS_PER_LONG) << TARGET_PAGE_BITS) == start) {
> +        int k;
> +        int nr = BITS_TO_LONGS(pages);
> +
> +        printf("XXX: aligned start %lx page %lx\n", start, page);
> +        assert(start == ((start >> TARGET_PAGE_BITS) << TARGET_PAGE_BITS));

This test is useless, since you have already tested the zero-ness of the
low bits in the "if" condition: the low bits are 0 in ((page *
BITS_PER_LONG) << TARGET_PAGE_BITS).

Paolo

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 36/39] memory: move bitmap synchronization to its own function
  2013-11-06 15:56   ` Paolo Bonzini
@ 2013-11-06 16:22     ` Juan Quintela
  2013-11-06 16:23       ` Paolo Bonzini
  2013-12-18  8:54       ` Alexander Graf
  0 siblings, 2 replies; 69+ messages in thread
From: Juan Quintela @ 2013-11-06 16:22 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: chegu_vinod, qemu-devel, Alexander Graf

Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 06/11/2013 14:04, Juan Quintela ha scritto:
>> We want to have all the functions that handle directly the dirty
>> bitmap near.  We will change it later.
>
> Please move it to exec.c instead.
>
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  include/exec/memory-physical.h | 31 +++++++++++++++++++++++++++++++
>>  kvm-all.c                      | 27 ++-------------------------
>>  2 files changed, 33 insertions(+), 25 deletions(-)
>> 
>> diff --git a/include/exec/memory-physical.h b/include/exec/memory-physical.h
>> index 610c55f..72faf06 100644
>> --- a/include/exec/memory-physical.h
>> +++ b/include/exec/memory-physical.h
>> @@ -72,6 +72,37 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
>>      xen_modified_memory(start, length);
>>  }
>> 
>> +static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
>
> Why "le"?

little endian,  name is already too long (cpu_physical_memory_ is too
big as a preffix.)

>
> Paolo
>
>> +                                                          ram_addr_t start,
>> +                                                          ram_addr_t pages)
>> +{
>> +    unsigned int i, j;
>> +    unsigned long page_number, c;
>> +    hwaddr addr;
>> +    ram_addr_t ram_addr;
>> +    unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
>> +    unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
>> +
>> +    /*
>> +     * bitmap-traveling is faster than memory-traveling (for addr...)
>> +     * especially when most of the memory is not dirty.
>> +     */
>> +    for (i = 0; i < len; i++) {
>> +        if (bitmap[i] != 0) {
>> +            c = leul_to_cpu(bitmap[i]);

this is the important part.  KVM maintains the bitmap is Little Endian
mode,  and PPC?  I assme needs this.


>> +            do {
>> +                j = ffsl(c) - 1;
>> +                c &= ~(1ul << j);
>> +                page_number = (i * HOST_LONG_BITS + j) * hpratio;
>> +                addr = page_number * TARGET_PAGE_SIZE;
>> +                ram_addr = start + addr;
>> +                cpu_physical_memory_set_dirty_range(ram_addr,
>> +                                                    TARGET_PAGE_SIZE
>> * hpratio);

I checked,  and it appears that ppc really use the hpratio thing.  I
hope that the bitmap optimization also works for ppc,  but its
architecture is weird^Winteresting O:-)

Alex?

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 36/39] memory: move bitmap synchronization to its own function
  2013-11-06 16:22     ` Juan Quintela
@ 2013-11-06 16:23       ` Paolo Bonzini
  2013-12-18  8:54       ` Alexander Graf
  1 sibling, 0 replies; 69+ messages in thread
From: Paolo Bonzini @ 2013-11-06 16:23 UTC (permalink / raw)
  To: quintela; +Cc: chegu_vinod, qemu-devel, Alexander Graf

Il 06/11/2013 17:22, Juan Quintela ha scritto:
>>> >> 
>>> >> +static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
>> >
>> > Why "le"?
> little endian,  name is already too long (cpu_physical_memory_ is too
> big as a preffix.)
> 

Yeah, I just found we have a matching "__set_bit_le" in the kernel.

Paolo

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 06/39] exec: create function to get a single dirty bit
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 06/39] exec: create function to get " Juan Quintela
@ 2013-11-06 23:11   ` Eric Blake
  0 siblings, 0 replies; 69+ messages in thread
From: Eric Blake @ 2013-11-06 23:11 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: chegu_vinod

[-- Attachment #1: Type: text/plain, Size: 1009 bytes --]

On 11/06/2013 06:04 AM, Juan Quintela wrote:
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  exec.c                         | 3 ++-
>  include/exec/memory-internal.h | 6 ++++++
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/exec.c b/exec.c
> index 7cf5634..36b0a66 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1374,9 +1374,10 @@ found:
>  static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
>                                 uint64_t val, unsigned size)
>  {
> +
>      int dirty_flags;

Is this whitespace addition spurious?

...
> 
> +static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
> +                                                     int dirty_flag)

Indentation off by one.

But whitespace doesn't impact the patch, so you can mark 1 through 6 as:

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 08/39] exec: simplify notdirty_mem_write()
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 08/39] exec: simplify notdirty_mem_write() Juan Quintela
@ 2013-11-06 23:12   ` Eric Blake
  0 siblings, 0 replies; 69+ messages in thread
From: Eric Blake @ 2013-11-06 23:12 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: chegu_vinod

[-- Attachment #1: Type: text/plain, Size: 808 bytes --]

On 11/06/2013 06:04 AM, Juan Quintela wrote:
> We don't need to make special things for CODE, just set the other two bits
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  exec.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 36b0a66..27e4540 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1374,12 +1374,8 @@ found:
>  static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
>                                 uint64_t val, unsigned size)
>  {
> -
> -    int dirty_flags;

Huh, you just undid the whitespace addition I called out in 6/39.

For 7 and 8,

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 11/39] memory: cpu_physical_memory_set_dirty_range() allways dirty all flags
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 11/39] memory: cpu_physical_memory_set_dirty_range() allways dirty all flags Juan Quintela
@ 2013-11-06 23:15   ` Eric Blake
  0 siblings, 0 replies; 69+ messages in thread
From: Eric Blake @ 2013-11-06 23:15 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: chegu_vinod

[-- Attachment #1: Type: text/plain, Size: 775 bytes --]

On 11/06/2013 06:04 AM, Juan Quintela wrote:

s/allways dirty/always dirties/ in the subject

> So remove the flag argument and do it directly.  After this change, there is nothing else using cpu_physical_memory_set_dirty_flags() so remove it.

Any reason you didn't line wrap the commit message?

> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  exec.c                         |  2 +-
>  include/exec/memory-internal.h | 11 ++---------
>  memory.c                       |  2 +-
>  3 files changed, 4 insertions(+), 11 deletions(-)

But the code itself is correct.  For 9 through 11:

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 12/39] memory: cpu_physical_memory_mask_dirty_range() always clear a single flag
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 12/39] memory: cpu_physical_memory_mask_dirty_range() always clear a single flag Juan Quintela
@ 2013-11-06 23:18   ` Eric Blake
  0 siblings, 0 replies; 69+ messages in thread
From: Eric Blake @ 2013-11-06 23:18 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: chegu_vinod

[-- Attachment #1: Type: text/plain, Size: 499 bytes --]

On 11/06/2013 06:04 AM, Juan Quintela wrote:

s/clear/clears/ in the subject

> Document it
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  exec.c                         |  4 ++--
>  include/exec/memory-internal.h | 12 ++++++------
>  2 files changed, 8 insertions(+), 8 deletions(-)

But the code is correct.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 13/39] memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 13/39] memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG Juan Quintela
@ 2013-11-06 23:24   ` Eric Blake
  0 siblings, 0 replies; 69+ messages in thread
From: Eric Blake @ 2013-11-06 23:24 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: chegu_vinod

[-- Attachment #1: Type: text/plain, Size: 1326 bytes --]

On 11/06/2013 06:04 AM, Juan Quintela wrote:
> Instead of the bitmap, we use the bitmap number.  Once this is done,
> we change all names from dirty_flag to memory regions naming of
> client.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  cputlb.c                       |  4 ++--
>  exec.c                         | 18 +++++++++---------
>  include/exec/memory-internal.h | 38 +++++++++++++++++---------------------
>  include/exec/memory.h          |  3 ---
>  memory.c                       | 10 ++++------
>  5 files changed, 32 insertions(+), 41 deletions(-)
> 
> -
>  static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
> -                                                     int dirty_flag)
> +                                                      unsigned client)

Aha - you fixed the other whitespace nit I pointed out on 6/39.  Since
both nits were transient, it's not worth respinning that patch just for
whitespace.  And since my subsequent nits in commit headers could be
touched up by a maintainer when adding my Reviewed-by, you may have
escaped a respin if the rest of the series is clean :)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 19/39] memory: split dirty bitmap into three
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 19/39] memory: split dirty bitmap into three Juan Quintela
@ 2013-11-06 23:39   ` Eric Blake
  0 siblings, 0 replies; 69+ messages in thread
From: Eric Blake @ 2013-11-06 23:39 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: chegu_vinod

[-- Attachment #1: Type: text/plain, Size: 889 bytes --]

On 11/06/2013 06:04 AM, Juan Quintela wrote:
> After all the previous patches, spliting the bitmap gets direct.
> 
> ToDo: Why can't i include "exec/memory.h" into cpu-all.h?  This is the
>       reason that I have duplicated DIRTY_MEMORY_NUM.
> 
> ToDo2: current bitmaps have one int as index, this limit us to 8TB RAM
>        guest, Should we move to longs?

Do we still want "ToDo" in the commit message?

> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  exec.c                         |  9 ++++++---
>  include/exec/cpu-all.h         |  4 +++-
>  include/exec/memory-internal.h | 11 ++++-------
>  3 files changed, 13 insertions(+), 11 deletions(-)

But the code looks correct.  For 14-19,

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 23/39] memory: make cpu_physical_memory_get_dirty() the main function
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 23/39] memory: make cpu_physical_memory_get_dirty() the main function Juan Quintela
@ 2013-11-06 23:44   ` Eric Blake
  0 siblings, 0 replies; 69+ messages in thread
From: Eric Blake @ 2013-11-06 23:44 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: chegu_vinod

[-- Attachment #1: Type: text/plain, Size: 532 bytes --]

On 11/06/2013 06:04 AM, Juan Quintela wrote:
> And make cpu_physical_memory_get_dirty_flag() to use it.  It used to

s/to use/use/

> be the other way around.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  include/exec/memory-internal.h | 35 ++++++++++++++++++-----------------
>  1 file changed, 18 insertions(+), 17 deletions(-)

For 19-23,

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 27/39] memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 27/39] memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations Juan Quintela
@ 2013-11-06 23:49   ` Eric Blake
  0 siblings, 0 replies; 69+ messages in thread
From: Eric Blake @ 2013-11-06 23:49 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: chegu_vinod

[-- Attachment #1: Type: text/plain, Size: 587 bytes --]

On 11/06/2013 06:04 AM, Juan Quintela wrote:
> We were setting a range of bits, so use bitmap_set().
> 
> Note: xen has always been wrong, and should have used start instead
> of addr from the beggining.

s/beggining/beginning/

> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  include/exec/memory-internal.h | 19 +++++++------------
>  1 file changed, 7 insertions(+), 12 deletions(-)

For 24-27,

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 31/39] memory: cpu_physical_memory_set_dirty_tracking() should return void
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 31/39] memory: cpu_physical_memory_set_dirty_tracking() should return void Juan Quintela
@ 2013-11-06 23:55   ` Eric Blake
  0 siblings, 0 replies; 69+ messages in thread
From: Eric Blake @ 2013-11-06 23:55 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: chegu_vinod

[-- Attachment #1: Type: text/plain, Size: 456 bytes --]

On 11/06/2013 06:04 AM, Juan Quintela wrote:
> Result was always 0, and not used anywhere.  Once there, use bool type
> for the parameter.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  exec.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)

For 28-31,
Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 01/39] Move prototypes to memory.h
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 01/39] Move prototypes to memory.h Juan Quintela
@ 2013-11-07  9:36   ` Orit Wasserman
  0 siblings, 0 replies; 69+ messages in thread
From: Orit Wasserman @ 2013-11-07  9:36 UTC (permalink / raw)
  To: Juan Quintela; +Cc: chegu_vinod, qemu-devel

On 11/06/2013 03:04 PM, Juan Quintela wrote:
> As the comment says, it should only be used on "core" memory files.
>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>   include/exec/cpu-common.h | 4 ----
>   include/exec/memory.h     | 4 +++-
>   2 files changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
> index e4996e1..8110ef0 100644
> --- a/include/exec/cpu-common.h
> +++ b/include/exec/cpu-common.h
> @@ -51,10 +51,6 @@ typedef void CPUWriteMemoryFunc(void *opaque, hwaddr addr, uint32_t value);
>   typedef uint32_t CPUReadMemoryFunc(void *opaque, hwaddr addr);
>
>   void qemu_ram_remap(ram_addr_t addr, ram_addr_t length);
> -/* This should not be used by devices.  */
> -MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
> -void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
> -
>   void cpu_physical_memory_rw(hwaddr addr, uint8_t *buf,
>                               int len, int is_write);
>   static inline void cpu_physical_memory_read(hwaddr addr,
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 480dfbf..c7e151f 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -1055,7 +1055,9 @@ void *address_space_map(AddressSpace *as, hwaddr addr,
>   void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
>                            int is_write, hwaddr access_len);
>
> -
> +/* This should not be used by devices.  */
> +MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
> +void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
>   #endif
>
>   #endif
>

Reviewed-by: Orit Wasserman <owasserm@redhat.com>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 02/39] memory: cpu_physical_memory_set_dirty_flags() result is never used
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 02/39] memory: cpu_physical_memory_set_dirty_flags() result is never used Juan Quintela
@ 2013-11-07  9:42   ` Orit Wasserman
  0 siblings, 0 replies; 69+ messages in thread
From: Orit Wasserman @ 2013-11-07  9:42 UTC (permalink / raw)
  To: Juan Quintela; +Cc: chegu_vinod, qemu-devel

On 11/06/2013 03:04 PM, Juan Quintela wrote:
> So return void.
>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>   include/exec/memory-internal.h | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
> index d0e0633..c71a5e6 100644
> --- a/include/exec/memory-internal.h
> +++ b/include/exec/memory-internal.h
> @@ -70,10 +70,10 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
>       return ret;
>   }
>
> -static inline int cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
> +static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
>                                                         int dirty_flags)
>   {
> -    return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
> +    ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
>   }
>
>   static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
>

Reviewed-by: Orit Wasserman <owasserm@redhat.com>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 03/39] memory: cpu_physical_memory_set_dirty_range() return void
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 03/39] memory: cpu_physical_memory_set_dirty_range() return void Juan Quintela
@ 2013-11-07  9:51   ` Orit Wasserman
  0 siblings, 0 replies; 69+ messages in thread
From: Orit Wasserman @ 2013-11-07  9:51 UTC (permalink / raw)
  To: Juan Quintela; +Cc: chegu_vinod, qemu-devel

On 11/06/2013 03:04 PM, Juan Quintela wrote:
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>   memory.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/memory.c b/memory.c
> index 28f6449..9722f23 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1182,7 +1182,7 @@ void memory_region_set_dirty(MemoryRegion *mr, hwaddr addr,
>                                hwaddr size)
>   {
>       assert(mr->terminates);
> -    return cpu_physical_memory_set_dirty_range(mr->ram_addr + addr, size, -1);
> +    cpu_physical_memory_set_dirty_range(mr->ram_addr + addr, size, -1);
>   }
>
>   bool memory_region_test_and_clear_dirty(MemoryRegion *mr, hwaddr addr,
>

Reviewed-by: Orit Wasserman <owasserm@redhat.com>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 04/39] exec: use accessor function to know if memory is dirty
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 04/39] exec: use accessor function to know if memory is dirty Juan Quintela
@ 2013-11-07 10:00   ` Orit Wasserman
  0 siblings, 0 replies; 69+ messages in thread
From: Orit Wasserman @ 2013-11-07 10:00 UTC (permalink / raw)
  To: Juan Quintela; +Cc: chegu_vinod, qemu-devel

On 11/06/2013 03:04 PM, Juan Quintela wrote:
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>   exec.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/exec.c b/exec.c
> index 79610ce..7cf5634 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1397,7 +1397,7 @@ static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
>       cpu_physical_memory_set_dirty_flags(ram_addr, dirty_flags);
>       /* we remove the notdirty callback only if the code has been
>          flushed */
> -    if (dirty_flags == 0xff) {
> +    if (cpu_physical_memory_is_dirty(ram_addr)) {
>           CPUArchState *env = current_cpu->env_ptr;
>           tlb_set_dirty(env, env->mem_io_vaddr);
>       }
>

Reviewed-by: Orit Wasserman <owasserm@redhat.com>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
  2013-11-06 15:38   ` Juan Quintela
@ 2013-11-07 11:23     ` Gerd Hoffmann
  0 siblings, 0 replies; 69+ messages in thread
From: Gerd Hoffmann @ 2013-11-07 11:23 UTC (permalink / raw)
  To: quintela; +Cc: chegu_vinod, qemu-devel

  Hi,

> > Yep, needed to figure updated screen areas.  It never stops completely,
> > but the refresh rate should be alot lower with no vnc client connected,
> > something like once every few seconds without client vs. 30x per second
> > with client.
> 
> When the guest is stopped you really need to ask kvm for changes?
> strange,  no?

Ahem, no.  Could certainly be improved.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 05/39] memory: create function to set a single dirty bit
  2013-11-06 13:04 ` [Qemu-devel] [PATCH 05/39] memory: create function to set a single dirty bit Juan Quintela
@ 2013-11-07 11:37   ` Orit Wasserman
  0 siblings, 0 replies; 69+ messages in thread
From: Orit Wasserman @ 2013-11-07 11:37 UTC (permalink / raw)
  To: Juan Quintela; +Cc: chegu_vinod, qemu-devel

On 11/06/2013 03:04 PM, Juan Quintela wrote:
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>   cputlb.c                       | 2 +-
>   include/exec/memory-internal.h | 6 ++++++
>   2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/cputlb.c b/cputlb.c
> index fff0afb..72452e5 100644
> --- a/cputlb.c
> +++ b/cputlb.c
> @@ -137,7 +137,7 @@ void tlb_protect_code(ram_addr_t ram_addr)
>   void tlb_unprotect_code_phys(CPUArchState *env, ram_addr_t ram_addr,
>                                target_ulong vaddr)
>   {
> -    cpu_physical_memory_set_dirty_flags(ram_addr, CODE_DIRTY_FLAG);
> +    cpu_physical_memory_set_dirty_flag(ram_addr, CODE_DIRTY_FLAG);
>   }
>
>   static bool tlb_is_dirty_ram(CPUTLBEntry *tlbe)
> diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
> index c71a5e6..4ebab80 100644
> --- a/include/exec/memory-internal.h
> +++ b/include/exec/memory-internal.h
> @@ -76,6 +76,12 @@ static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
>       ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
>   }
>
> +static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
> +                                                      int dirty_flag)
> +{
> +    ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flag;
> +}
> +
>   static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
>   {
>       cpu_physical_memory_set_dirty_flags(addr, 0xff);
>

Reviewed-by: Orit Wasserman <owasserm@redhat.com>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (39 preceding siblings ...)
  2013-11-06 14:37 ` [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Gerd Hoffmann
@ 2013-11-08 15:18 ` Chegu Vinod
  2013-11-25  6:15 ` Michael R. Hines
  41 siblings, 0 replies; 69+ messages in thread
From: Chegu Vinod @ 2013-11-08 15:18 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel, Paolo Bonzini, Karen Noel,
	Orit Wasserman, Eric Blake

[-- Attachment #1: Type: text/plain, Size: 10919 bytes --]

On 11/6/2013 5:04 AM, Juan Quintela wrote:
> Hi
>
> [v2]
> In this version:
> - fixed all the comments from last versions (thanks Eric)
> - kvm migration bitmap is synchronized using bitmap operations
> - qemu bitmap -> migration bitmap is synchronized using bitmap operations
> If bitmaps are not properly aligned, we fall back to old code.
> Code survives virt-tests, so should be in quite good shape.
>
> ToDo list:
>
> - vga ram by default is not aligned in a page number multiple of 64,
>
>    it could be optimized.  Kraxel?  It syncs the kvm bitmap at least 1
>    a second or so? bitmap is only 2048 pages (16MB by default).
>    We need to change the ram_addr only
>
> - vga: still more, after we finish migration, vga code continues
>    synchronizing the kvm bitmap on source machine.  Notice that there
>    is no graphics client connected to the VGA.  Worth investigating?
>
> - I haven't yet meassure speed differences on big hosts.  Vinod?
>
> - Depending of performance, more optimizations to do.
>
> - debugging printf's still on the code, just to see if we are taking
>    (or not) the optimized paths.
>
> And that is all.  Please test & comment.
>
> Thanks, Juan.
>
> [v1]
> This series split the dirty bitmap (8 bits per page, only three used)
> into 3 individual bitmaps.  Once the conversion is done, operations
> are handled by bitmap operations, not bit by bit.
>
> - *_DIRTY_FLAG flags are gone, now we use memory.h DIRTY_MEMORY_*
>     everywhere.
>
> - We set/reset each flag individually
>    (set_dirty_flags(0xff&~CODE_DIRTY_FLAG)) are gone.
>
> - Rename several functions to clarify/make consistent things.
>
> - I know it dont't pass checkpatch for long lines, propper submission
>    should pass it. We have to have long lines, short variable names, or
>    ugly line splitting :p
>
> - DIRTY_MEMORY_NUM: how can one include exec/memory.h into cpu-all.h?
>    #include it don't work, as a workaround, I have copied its value, but
>    any better idea?  I can always create "exec/migration-flags.h", though.
>
> - The meat of the code is patch 19.  Rest of patches are quite easy
> (even that one is not too complex).
>
> Only optimizations done so far are
> set_dirty_range()/clear_dirty_range() that now operates with
> bitmap_set/clear.
>
> Note for Xen: cpu_physical_memory_set_dirty_range() was wrong for xen,
> see comment on patch.
>
> It passes virt-test migration tests, so it should be perfect.
>
> I post it to ask for comments.
>
> ToDo list:
>
> - create a lock for the bitmaps and fold migration bitmap into this
>    one.  This would avoid a copy and make things easier?
>
> - As this code uses/abuses bitmaps, we need to change the type of the
>    index from int to long.  With an int index, we can only access a
>    maximum of 8TB guest (yes, this is not urgent, we have a couple of
>    years to do it).
>
> - merging KVM <-> QEMU bitmap as a bitmap and not bit-by-bit.
>
> - spliting the KVM bitmap synchronization into chunks, i.e. not
>    synchronize all memory, just enough to continue with migration.
>
> Any further ideas/needs?
>
> Thanks, Juan.
>
> PD.  Why it took so long?
>
>       Because I was trying to integrate the bitmap on the MemoryRegion
>       abstraction.  Would have make the code cleaner, but hit dead-end
>       after dead-end.  As practical terms, TCG don't know about
>       MemoryRegions, it has been ported to run on top of them, but
>       don't use them effective
>
>
> The following changes since commit c2d30667760e3d7b81290d801e567d4f758825ca:
>
>    rtc: remove dead SQW IRQ code (2013-11-05 20:04:03 -0800)
>
> are available in the git repository at:
>
>    git://github.com/juanquintela/qemu.git bitmap-v2.next
>
> for you to fetch changes up to d91eff97e6f36612eb22d57c2b6c2623f73d3997:
>
>    migration: synchronize memory bitmap 64bits at a time (2013-11-06 13:54:56 +0100)
>
> ----------------------------------------------------------------
> Juan Quintela (39):
>        Move prototypes to memory.h
>        memory: cpu_physical_memory_set_dirty_flags() result is never used
>        memory: cpu_physical_memory_set_dirty_range() return void
>        exec: use accessor function to know if memory is dirty
>        memory: create function to set a single dirty bit
>        exec: create function to get a single dirty bit
>        memory: make cpu_physical_memory_is_dirty return bool
>        exec: simplify notdirty_mem_write()
>        memory: all users of cpu_physical_memory_get_dirty used only one flag
>        memory: set single dirty flags when possible
>        memory: cpu_physical_memory_set_dirty_range() allways dirty all flags
>        memory: cpu_physical_memory_mask_dirty_range() always clear a single flag
>        memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG
>        memory: use bit 2 for migration
>        memory: make sure that client is always inside range
>        memory: only resize dirty bitmap when memory size increases
>        memory: cpu_physical_memory_clear_dirty_flag() result is never used
>        bitmap: Add bitmap_zero_extend operation
>        memory: split dirty bitmap into three
>        memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user
>        memory: unfold cpu_physical_memory_set_dirty() in its only user
>        memory: unfold cpu_physical_memory_set_dirty_flag()
>        memory: make cpu_physical_memory_get_dirty() the main function
>        memory: cpu_physical_memory_get_dirty() is used as returning a bool
>        memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range
>        memory: use find_next_bit() to find dirty bits
>        memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations
>        memory: cpu_physical_memory_clear_dirty_range() now uses bitmap operations
>        memory: s/dirty/clean/ in cpu_physical_memory_is_dirty()
>        memory: make cpu_physical_memory_reset_dirty() take a length parameter
>        memory: cpu_physical_memory_set_dirty_tracking() should return void
>        memory: split cpu_physical_memory_* functions to its own include
>        memory: unfold memory_region_test_and_clear()
>        kvm: use directly cpu_physical_memory_* api for tracking dirty pages
>        kvm: refactor start address calculation
>        memory: move bitmap synchronization to its own function
>        memory: syncronize kvm bitmap using bitmaps operations
>        ram: split function that synchronizes a range
>        migration: synchronize memory bitmap 64bits at a time
>
>   arch_init.c                    |  57 ++++++++++++----
>   cputlb.c                       |  10 +--
>   exec.c                         |  75 ++++++++++-----------
>   include/exec/cpu-all.h         |   4 +-
>   include/exec/cpu-common.h      |   4 --
>   include/exec/memory-internal.h |  84 ------------------------
>   include/exec/memory-physical.h | 143 +++++++++++++++++++++++++++++++++++++++++
>   include/exec/memory.h          |  10 +--
>   include/qemu/bitmap.h          |   9 +++
>   kvm-all.c                      |  28 ++------
>   memory.c                       |  17 ++---
>   11 files changed, 260 insertions(+), 181 deletions(-)
>   create mode 100644 include/exec/memory-physical.h
> .
>


Tested-by: Chegu Vinod<chegu_vinod@hp.com>

-------

Hi Juan,


Here are some results from migrating couple of*big fat*  guests using TCP migration and RDMA migration and the last one was with a workload. As one would expect there were noticeable improvements.

Pl. see below.

FYI
Vinod


------

Migrate speed : 20G
Migrate downtime : 2s

I) Without Juan's bitmap optimization patches : (i.e. current upstream)

Freezes observed during the start and at times during the pre-copy phase.
Longer than expected downtime.

a) 20VCPU/256GB (TCP migration)

A freeze of ~1 second in the guest. (as measured by Juan's timer script)

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 97048 milliseconds
downtime: 3740 milliseconds
setup: 6912 milliseconds
transferred ram: 5734321 kbytes
throughput: 4243.94 mbps
remaining ram: 0 kbytes
total ram: 268444252 kbytes
duplicate: 65856255 pages
skipped: 0 pages
normal: 1286361 pages
normal bytes: 5145444 kbytes


b) 40VCPU/512GB (TCP migration)

A freeze of ~7 seconds in the guest. (as measured by Juan's timer script)

info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 238957 milliseconds
downtime: 5700 milliseconds
setup: 14062 milliseconds
transferred ram: 10461990 kbytes
throughput: 4223.74 mbps
remaining ram: 0 kbytes
total ram: 536879712 kbytes
duplicate: 131953694 pages
skipped: 0 pages
normal: 2321019 pages
normal bytes: 9284076 kbytes


------

II) With Juan's v2 bitmap optimization patches :

The actual downtime is lesser/closer to expected value...and that's good !

No multi-second freezes inside the guest during the start of migration or
during the pre-copy phase !

a) 20VCPU/256GB (TCP migration)

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 84626 milliseconds
downtime: 1893 milliseconds
setup: 296 milliseconds
transferred ram: 5791133 kbytes
throughput: 4841.76 mbps
remaining ram: 0 kbytes
total ram: 268444252 kbytes
duplicate: 65841383 pages
skipped: 0 pages
normal: 1300569 pages
normal bytes: 5202276 kbytes


b) 40VCPU/512GB (TCP migration)


(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 239477 milliseconds
downtime: 1508 milliseconds
setup: 1171 milliseconds
transferred ram: 10584740 kbytes
throughput: 3570.72 mbps
remaining ram: 0 kbytes
total ram: 536879712 kbytes
duplicate: 131934489 pages
skipped: 0 pages
normal: 2351688 pages
normal bytes: 9406752 kbytes


c) 40VCPU/512GB (RDMA migration)

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 174542 milliseconds
downtime: 1697 milliseconds
setup: 1140 milliseconds
transferred ram: 11739842 kbytes
throughput: 9987.07 mbps
remaining ram: 0 kbytes
total ram: 536879712 kbytes
duplicate: 131722040 pages
skipped: 0 pages
normal: 2902087 pages
normal bytes: 11608348 kbytes


d) 40VCPU/512GB (RDMA migration)
  (Guest running SpecJBB2005 24 warehouse threads (guest was ~60% loaded)).

info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 154236 milliseconds
downtime: 2314 milliseconds
setup: 575 milliseconds
transferred ram: 19198318 kbytes
throughput: 19404.17 mbps
remaining ram: 0 kbytes
total ram: 536879712 kbytes
duplicate: 130647356 pages
skipped: 0 pages
normal: 4766514 pages
normal bytes: 19066056 kbytes


[-- Attachment #2: Type: text/html, Size: 11158 bytes --]

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
  2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
                   ` (40 preceding siblings ...)
  2013-11-08 15:18 ` Chegu Vinod
@ 2013-11-25  6:15 ` Michael R. Hines
  2013-11-25  6:58   ` Michael R. Hines
  41 siblings, 1 reply; 69+ messages in thread
From: Michael R. Hines @ 2013-11-25  6:15 UTC (permalink / raw)
  To: Juan Quintela; +Cc: chegu_vinod, qemu-devel

On 11/06/2013 09:04 PM, Juan Quintela wrote:
> Hi
>
> [v2]
> In this version:
> - fixed all the comments from last versions (thanks Eric)
> - kvm migration bitmap is synchronized using bitmap operations
> - qemu bitmap -> migration bitmap is synchronized using bitmap operations
> If bitmaps are not properly aligned, we fall back to old code.
> Code survives virt-tests, so should be in quite good shape.
>
> ToDo list:
>
> - vga ram by default is not aligned in a page number multiple of 64,
>
>    it could be optimized.  Kraxel?  It syncs the kvm bitmap at least 1
>    a second or so? bitmap is only 2048 pages (16MB by default).
>    We need to change the ram_addr only
>
> - vga: still more, after we finish migration, vga code continues
>    synchronizing the kvm bitmap on source machine.  Notice that there
>    is no graphics client connected to the VGA.  Worth investigating?
>
> - I haven't yet meassure speed differences on big hosts.  Vinod?
>
> - Depending of performance, more optimizations to do.
>
> - debugging printf's still on the code, just to see if we are taking
>    (or not) the optimized paths.
>
> And that is all.  Please test & comment.
>
> Thanks, Juan.
>
> [v1]
> This series split the dirty bitmap (8 bits per page, only three used)
> into 3 individual bitmaps.  Once the conversion is done, operations
> are handled by bitmap operations, not bit by bit.
>
> - *_DIRTY_FLAG flags are gone, now we use memory.h DIRTY_MEMORY_*
>     everywhere.
>
> - We set/reset each flag individually
>    (set_dirty_flags(0xff&~CODE_DIRTY_FLAG)) are gone.
>
> - Rename several functions to clarify/make consistent things.
>
> - I know it dont't pass checkpatch for long lines, propper submission
>    should pass it. We have to have long lines, short variable names, or
>    ugly line splitting :p
>
> - DIRTY_MEMORY_NUM: how can one include exec/memory.h into cpu-all.h?
>    #include it don't work, as a workaround, I have copied its value, but
>    any better idea?  I can always create "exec/migration-flags.h", though.
>
> - The meat of the code is patch 19.  Rest of patches are quite easy
> (even that one is not too complex).
>
> Only optimizations done so far are
> set_dirty_range()/clear_dirty_range() that now operates with
> bitmap_set/clear.
>
> Note for Xen: cpu_physical_memory_set_dirty_range() was wrong for xen,
> see comment on patch.
>
> It passes virt-test migration tests, so it should be perfect.
>
> I post it to ask for comments.
>
> ToDo list:
>
> - create a lock for the bitmaps and fold migration bitmap into this
>    one.  This would avoid a copy and make things easier?
>
> - As this code uses/abuses bitmaps, we need to change the type of the
>    index from int to long.  With an int index, we can only access a
>    maximum of 8TB guest (yes, this is not urgent, we have a couple of
>    years to do it).
>
> - merging KVM <-> QEMU bitmap as a bitmap and not bit-by-bit.
>
> - spliting the KVM bitmap synchronization into chunks, i.e. not
>    synchronize all memory, just enough to continue with migration.
>
> Any further ideas/needs?
>
> Thanks, Juan.
>
> PD.  Why it took so long?
>
>       Because I was trying to integrate the bitmap on the MemoryRegion
>       abstraction.  Would have make the code cleaner, but hit dead-end
>       after dead-end.  As practical terms, TCG don't know about
>       MemoryRegions, it has been ported to run on top of them, but
>       don't use them effective
>
>
> The following changes since commit c2d30667760e3d7b81290d801e567d4f758825ca:
>
>    rtc: remove dead SQW IRQ code (2013-11-05 20:04:03 -0800)
>
> are available in the git repository at:
>
>    git://github.com/juanquintela/qemu.git bitmap-v2.next
>
> for you to fetch changes up to d91eff97e6f36612eb22d57c2b6c2623f73d3997:
>
>    migration: synchronize memory bitmap 64bits at a time (2013-11-06 13:54:56 +0100)
>
> ----------------------------------------------------------------
> Juan Quintela (39):
>        Move prototypes to memory.h
>        memory: cpu_physical_memory_set_dirty_flags() result is never used
>        memory: cpu_physical_memory_set_dirty_range() return void
>        exec: use accessor function to know if memory is dirty
>        memory: create function to set a single dirty bit
>        exec: create function to get a single dirty bit
>        memory: make cpu_physical_memory_is_dirty return bool
>        exec: simplify notdirty_mem_write()
>        memory: all users of cpu_physical_memory_get_dirty used only one flag
>        memory: set single dirty flags when possible
>        memory: cpu_physical_memory_set_dirty_range() allways dirty all flags
>        memory: cpu_physical_memory_mask_dirty_range() always clear a single flag
>        memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG
>        memory: use bit 2 for migration
>        memory: make sure that client is always inside range
>        memory: only resize dirty bitmap when memory size increases
>        memory: cpu_physical_memory_clear_dirty_flag() result is never used
>        bitmap: Add bitmap_zero_extend operation
>        memory: split dirty bitmap into three
>        memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user
>        memory: unfold cpu_physical_memory_set_dirty() in its only user
>        memory: unfold cpu_physical_memory_set_dirty_flag()
>        memory: make cpu_physical_memory_get_dirty() the main function
>        memory: cpu_physical_memory_get_dirty() is used as returning a bool
>        memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range
>        memory: use find_next_bit() to find dirty bits
>        memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations
>        memory: cpu_physical_memory_clear_dirty_range() now uses bitmap operations
>        memory: s/dirty/clean/ in cpu_physical_memory_is_dirty()
>        memory: make cpu_physical_memory_reset_dirty() take a length parameter
>        memory: cpu_physical_memory_set_dirty_tracking() should return void
>        memory: split cpu_physical_memory_* functions to its own include
>        memory: unfold memory_region_test_and_clear()
>        kvm: use directly cpu_physical_memory_* api for tracking dirty pages
>        kvm: refactor start address calculation
>        memory: move bitmap synchronization to its own function
>        memory: syncronize kvm bitmap using bitmaps operations
>        ram: split function that synchronizes a range
>        migration: synchronize memory bitmap 64bits at a time
>
>   arch_init.c                    |  57 ++++++++++++----
>   cputlb.c                       |  10 +--
>   exec.c                         |  75 ++++++++++-----------
>   include/exec/cpu-all.h         |   4 +-
>   include/exec/cpu-common.h      |   4 --
>   include/exec/memory-internal.h |  84 ------------------------
>   include/exec/memory-physical.h | 143 +++++++++++++++++++++++++++++++++++++++++
>   include/exec/memory.h          |  10 +--
>   include/qemu/bitmap.h          |   9 +++
>   kvm-all.c                      |  28 ++------
>   memory.c                       |  17 ++---
>   11 files changed, 260 insertions(+), 181 deletions(-)
>   create mode 100644 include/exec/memory-physical.h
>

Well done! The performance gains are very impressive.
I also just completed a measurement of this patch for the MC fault 
tolerance implementation that I posted last month:

Here is an example using a 12GB virtual machine (not as big as Vinod's, 
but not small either),
with most of the RAM being aggressively dirtied by a small C program.

With MC (micro checkpointing enabled, checkpointing 10 times per second),
here is the *per-checkpointing* overheads of the dirty bitmap:

1) BEFORE Juan's Patch:
a) Time to retrieve LOGDIRTY from KVM:  33 milliseconds
b) Time to synchronize with migration_bitmap from LOGDIRTY bitmap: 86 
milliseconds

2) AFTER Juan's Patch:
a) Time to retrieve LOGDIRTY from KVM:  17 milliseconds
b) Time to synchronize with migration_bitmap from LOGDIRTY bitmap: < 1 
millisecond

Enormous improvements - very nice.

- Michael

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
  2013-11-06 15:49   ` Paolo Bonzini
@ 2013-11-25  6:39     ` Michael R. Hines
  2013-11-25  9:45       ` Paolo Bonzini
  0 siblings, 1 reply; 69+ messages in thread
From: Michael R. Hines @ 2013-11-25  6:39 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, chegu_vinod, Gerd Hoffmann, Juan Quintela

On 11/06/2013 11:49 PM, Paolo Bonzini wrote:
> Il 06/11/2013 15:37, Gerd Hoffmann ha scritto:
>>>> - vga ram by default is not aligned in a page number multiple of
>>>> 64,
>>>>
>>>> it could be optimized.  Kraxel?  It syncs the kvm bitmap at least
>>>> 1 a second or so? bitmap is only 2048 pages (16MB by default). We
>>>> need to change the ram_addr only
>> It is created using memory_region_init_ram(), in vga_common_init().
>> Nothing special is done to avoid/force any alignment. Dunno why it
>> ends up on a odd page number.
> Because some random option ROM is loaded before the RAM region is
> created, and thus the ram_addr_t's become misaligned.  Keeping the
> ram_addr_t's aligned in find_ram_offset is easy:
>
> diff --git a/exec.c b/exec.c
> index 79610ce..1b82e81 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1002,7 +1002,7 @@ static ram_addr_t find_ram_offset(ram_addr_t size)
>       QTAILQ_FOREACH(block, &ram_list.blocks, next) {
>           ram_addr_t end, next = RAM_ADDR_MAX;
>
> -        end = block->offset + block->length;
> +        end = ROUND_UP(block->offset + block->length, TARGET_PAGE_SIZE * 64);
>
>           QTAILQ_FOREACH(next_block, &ram_list.blocks, next) {
>               if (next_block->offset >= end) {
>
> but I'm not sure if we're allowed to change the ram_addr_t's.  On one hand they
> are not part of guest ABI, on the other hand RDMA migration uses them
> in the protocol.
>
> Michael, can you check if RDMA migration works from a QEMU *without*
> this patch to one *with*:
>
> diff --git a/exec.c b/exec.c
> index 79610ce..21b8b96 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -997,7 +997,7 @@ static ram_addr_t find_ram_offset(ram_addr_t size)
>       assert(size != 0); /* it would hand out same offset multiple times */
>
>       if (QTAILQ_EMPTY(&ram_list.blocks))
> -        return 0;
> +        return TARGET_PAGE_SIZE * 100;
>
>       QTAILQ_FOREACH(block, &ram_list.blocks, next) {
>           ram_addr_t end, next = RAM_ADDR_MAX;
>
> I suspect it doesn't.  But perhaps we still have time before RDMA
> becomes non-experimental.
>
> Paolo
>

I tested the patch as requested. It doesn't seem to break the migration.
I compiled and looped the migration a couple of times and it runs through.

Sorry for the late reply.

- Michael

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
  2013-11-25  6:15 ` Michael R. Hines
@ 2013-11-25  6:58   ` Michael R. Hines
  0 siblings, 0 replies; 69+ messages in thread
From: Michael R. Hines @ 2013-11-25  6:58 UTC (permalink / raw)
  To: Juan Quintela; +Cc: chegu_vinod, qemu-devel

On 11/25/2013 02:15 PM, Michael R. Hines wrote:
> On 11/06/2013 09:04 PM, Juan Quintela wrote:
>> Hi
>>
>> [v2]
>> In this version:
>> - fixed all the comments from last versions (thanks Eric)
>> - kvm migration bitmap is synchronized using bitmap operations
>> - qemu bitmap -> migration bitmap is synchronized using bitmap 
>> operations
>> If bitmaps are not properly aligned, we fall back to old code.
>> Code survives virt-tests, so should be in quite good shape.
>>
>> ToDo list:
>>
>> - vga ram by default is not aligned in a page number multiple of 64,
>>
>>    it could be optimized.  Kraxel?  It syncs the kvm bitmap at least 1
>>    a second or so? bitmap is only 2048 pages (16MB by default).
>>    We need to change the ram_addr only
>>
>> - vga: still more, after we finish migration, vga code continues
>>    synchronizing the kvm bitmap on source machine.  Notice that there
>>    is no graphics client connected to the VGA.  Worth investigating?
>>
>> - I haven't yet meassure speed differences on big hosts.  Vinod?
>>
>> - Depending of performance, more optimizations to do.
>>
>> - debugging printf's still on the code, just to see if we are taking
>>    (or not) the optimized paths.
>>
>> And that is all.  Please test & comment.
>>
>> Thanks, Juan.
>>
>> [v1]
>> This series split the dirty bitmap (8 bits per page, only three used)
>> into 3 individual bitmaps.  Once the conversion is done, operations
>> are handled by bitmap operations, not bit by bit.
>>
>> - *_DIRTY_FLAG flags are gone, now we use memory.h DIRTY_MEMORY_*
>>     everywhere.
>>
>> - We set/reset each flag individually
>>    (set_dirty_flags(0xff&~CODE_DIRTY_FLAG)) are gone.
>>
>> - Rename several functions to clarify/make consistent things.
>>
>> - I know it dont't pass checkpatch for long lines, propper submission
>>    should pass it. We have to have long lines, short variable names, or
>>    ugly line splitting :p
>>
>> - DIRTY_MEMORY_NUM: how can one include exec/memory.h into cpu-all.h?
>>    #include it don't work, as a workaround, I have copied its value, but
>>    any better idea?  I can always create "exec/migration-flags.h", 
>> though.
>>
>> - The meat of the code is patch 19.  Rest of patches are quite easy
>> (even that one is not too complex).
>>
>> Only optimizations done so far are
>> set_dirty_range()/clear_dirty_range() that now operates with
>> bitmap_set/clear.
>>
>> Note for Xen: cpu_physical_memory_set_dirty_range() was wrong for xen,
>> see comment on patch.
>>
>> It passes virt-test migration tests, so it should be perfect.
>>
>> I post it to ask for comments.
>>
>> ToDo list:
>>
>> - create a lock for the bitmaps and fold migration bitmap into this
>>    one.  This would avoid a copy and make things easier?
>>
>> - As this code uses/abuses bitmaps, we need to change the type of the
>>    index from int to long.  With an int index, we can only access a
>>    maximum of 8TB guest (yes, this is not urgent, we have a couple of
>>    years to do it).
>>
>> - merging KVM <-> QEMU bitmap as a bitmap and not bit-by-bit.
>>
>> - spliting the KVM bitmap synchronization into chunks, i.e. not
>>    synchronize all memory, just enough to continue with migration.
>>
>> Any further ideas/needs?
>>
>> Thanks, Juan.
>>
>> PD.  Why it took so long?
>>
>>       Because I was trying to integrate the bitmap on the MemoryRegion
>>       abstraction.  Would have make the code cleaner, but hit dead-end
>>       after dead-end.  As practical terms, TCG don't know about
>>       MemoryRegions, it has been ported to run on top of them, but
>>       don't use them effective
>>
>>
>> The following changes since commit 
>> c2d30667760e3d7b81290d801e567d4f758825ca:
>>
>>    rtc: remove dead SQW IRQ code (2013-11-05 20:04:03 -0800)
>>
>> are available in the git repository at:
>>
>>    git://github.com/juanquintela/qemu.git bitmap-v2.next
>>
>> for you to fetch changes up to d91eff97e6f36612eb22d57c2b6c2623f73d3997:
>>
>>    migration: synchronize memory bitmap 64bits at a time (2013-11-06 
>> 13:54:56 +0100)
>>
>> ----------------------------------------------------------------
>> Juan Quintela (39):
>>        Move prototypes to memory.h
>>        memory: cpu_physical_memory_set_dirty_flags() result is never 
>> used
>>        memory: cpu_physical_memory_set_dirty_range() return void
>>        exec: use accessor function to know if memory is dirty
>>        memory: create function to set a single dirty bit
>>        exec: create function to get a single dirty bit
>>        memory: make cpu_physical_memory_is_dirty return bool
>>        exec: simplify notdirty_mem_write()
>>        memory: all users of cpu_physical_memory_get_dirty used only 
>> one flag
>>        memory: set single dirty flags when possible
>>        memory: cpu_physical_memory_set_dirty_range() allways dirty 
>> all flags
>>        memory: cpu_physical_memory_mask_dirty_range() always clear a 
>> single flag
>>        memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG
>>        memory: use bit 2 for migration
>>        memory: make sure that client is always inside range
>>        memory: only resize dirty bitmap when memory size increases
>>        memory: cpu_physical_memory_clear_dirty_flag() result is never 
>> used
>>        bitmap: Add bitmap_zero_extend operation
>>        memory: split dirty bitmap into three
>>        memory: unfold cpu_physical_memory_clear_dirty_flag() in its 
>> only user
>>        memory: unfold cpu_physical_memory_set_dirty() in its only user
>>        memory: unfold cpu_physical_memory_set_dirty_flag()
>>        memory: make cpu_physical_memory_get_dirty() the main function
>>        memory: cpu_physical_memory_get_dirty() is used as returning a 
>> bool
>>        memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range
>>        memory: use find_next_bit() to find dirty bits
>>        memory: cpu_physical_memory_set_dirty_range() now uses bitmap 
>> operations
>>        memory: cpu_physical_memory_clear_dirty_range() now uses 
>> bitmap operations
>>        memory: s/dirty/clean/ in cpu_physical_memory_is_dirty()
>>        memory: make cpu_physical_memory_reset_dirty() take a length 
>> parameter
>>        memory: cpu_physical_memory_set_dirty_tracking() should return 
>> void
>>        memory: split cpu_physical_memory_* functions to its own include
>>        memory: unfold memory_region_test_and_clear()
>>        kvm: use directly cpu_physical_memory_* api for tracking dirty 
>> pages
>>        kvm: refactor start address calculation
>>        memory: move bitmap synchronization to its own function
>>        memory: syncronize kvm bitmap using bitmaps operations
>>        ram: split function that synchronizes a range
>>        migration: synchronize memory bitmap 64bits at a time
>>
>>   arch_init.c                    |  57 ++++++++++++----
>>   cputlb.c                       |  10 +--
>>   exec.c                         |  75 ++++++++++-----------
>>   include/exec/cpu-all.h         |   4 +-
>>   include/exec/cpu-common.h      |   4 --
>>   include/exec/memory-internal.h |  84 ------------------------
>>   include/exec/memory-physical.h | 143 
>> +++++++++++++++++++++++++++++++++++++++++
>>   include/exec/memory.h          |  10 +--
>>   include/qemu/bitmap.h          |   9 +++
>>   kvm-all.c                      |  28 ++------
>>   memory.c                       |  17 ++---
>>   11 files changed, 260 insertions(+), 181 deletions(-)
>>   create mode 100644 include/exec/memory-physical.h
>>
>
> Well done! The performance gains are very impressive.
> I also just completed a measurement of this patch for the MC fault 
> tolerance implementation that I posted last month:
>
> Here is an example using a 12GB virtual machine (not as big as 
> Vinod's, but not small either),
> with most of the RAM being aggressively dirtied by a small C program.
>
> With MC (micro checkpointing enabled, checkpointing 10 times per second),
> here is the *per-checkpointing* overheads of the dirty bitmap:
>
> 1) BEFORE Juan's Patch:
> a) Time to retrieve LOGDIRTY from KVM:  33 milliseconds
> b) Time to synchronize with migration_bitmap from LOGDIRTY bitmap: 86 
> milliseconds
>
> 2) AFTER Juan's Patch:
> a) Time to retrieve LOGDIRTY from KVM:  17 milliseconds
> b) Time to synchronize with migration_bitmap from LOGDIRTY bitmap: < 1 
> millisecond
>
> Enormous improvements - very nice.
>
> - Michael
Tested-by: Michael R. Hines <mrhines@us.ibm.com>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
  2013-11-25  6:39     ` Michael R. Hines
@ 2013-11-25  9:45       ` Paolo Bonzini
  0 siblings, 0 replies; 69+ messages in thread
From: Paolo Bonzini @ 2013-11-25  9:45 UTC (permalink / raw)
  To: Michael R. Hines; +Cc: qemu-devel, chegu_vinod, Gerd Hoffmann, Juan Quintela

Il 25/11/2013 07:39, Michael R. Hines ha scritto:
>> Because some random option ROM is loaded before the RAM region is
>> created, and thus the ram_addr_t's become misaligned.  Keeping the
>> ram_addr_t's aligned in find_ram_offset is easy:
>>
>> diff --git a/exec.c b/exec.c
>> index 79610ce..1b82e81 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -1002,7 +1002,7 @@ static ram_addr_t find_ram_offset(ram_addr_t size)
>>       QTAILQ_FOREACH(block, &ram_list.blocks, next) {
>>           ram_addr_t end, next = RAM_ADDR_MAX;
>>
>> -        end = block->offset + block->length;
>> +        end = ROUND_UP(block->offset + block->length,
>> TARGET_PAGE_SIZE * 64);
>>
>>           QTAILQ_FOREACH(next_block, &ram_list.blocks, next) {
>>               if (next_block->offset >= end) {
>>
>> but I'm not sure if we're allowed to change the ram_addr_t's.  On one
>> hand they
>> are not part of guest ABI, on the other hand RDMA migration uses them
>> in the protocol.
>>
>> Michael, can you check if RDMA migration works from a QEMU *without*
>> this patch to one *with*:
> 
> I tested the patch as requested. It doesn't seem to break the migration.
> I compiled and looped the migration a couple of times and it runs through.
> 
> Sorry for the late reply.

Thanks, no problem since Juan's patch is 1.8 material anyway.

Juan, looks like I was wrong about the ram_addr_t being part of the
protocol (more likely, it is part of the protocol but it doesn't depend
on equal ram_addr_t).  You could then add the above one-liner as a
separate patch.

Paolo

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 36/39] memory: move bitmap synchronization to its own function
  2013-11-06 16:22     ` Juan Quintela
  2013-11-06 16:23       ` Paolo Bonzini
@ 2013-12-18  8:54       ` Alexander Graf
  1 sibling, 0 replies; 69+ messages in thread
From: Alexander Graf @ 2013-12-18  8:54 UTC (permalink / raw)
  To: Juan Quintela; +Cc: Paolo Bonzini, chegu_vinod, QEMU Developers


On 06.11.2013, at 17:22, Juan Quintela <quintela@redhat.com> wrote:

> Paolo Bonzini <pbonzini@redhat.com> wrote:
>> Il 06/11/2013 14:04, Juan Quintela ha scritto:
>>> We want to have all the functions that handle directly the dirty
>>> bitmap near.  We will change it later.
>> 
>> Please move it to exec.c instead.
>> 
>>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>>> ---
>>> include/exec/memory-physical.h | 31 +++++++++++++++++++++++++++++++
>>> kvm-all.c                      | 27 ++-------------------------
>>> 2 files changed, 33 insertions(+), 25 deletions(-)
>>> 
>>> diff --git a/include/exec/memory-physical.h b/include/exec/memory-physical.h
>>> index 610c55f..72faf06 100644
>>> --- a/include/exec/memory-physical.h
>>> +++ b/include/exec/memory-physical.h
>>> @@ -72,6 +72,37 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
>>>     xen_modified_memory(start, length);
>>> }
>>> 
>>> +static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
>> 
>> Why "le"?
> 
> little endian,  name is already too long (cpu_physical_memory_ is too
> big as a preffix.)
> 
>> 
>> Paolo
>> 
>>> +                                                          ram_addr_t start,
>>> +                                                          ram_addr_t pages)
>>> +{
>>> +    unsigned int i, j;
>>> +    unsigned long page_number, c;
>>> +    hwaddr addr;
>>> +    ram_addr_t ram_addr;
>>> +    unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
>>> +    unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
>>> +
>>> +    /*
>>> +     * bitmap-traveling is faster than memory-traveling (for addr...)
>>> +     * especially when most of the memory is not dirty.
>>> +     */
>>> +    for (i = 0; i < len; i++) {
>>> +        if (bitmap[i] != 0) {
>>> +            c = leul_to_cpu(bitmap[i]);
> 
> this is the important part.  KVM maintains the bitmap is Little Endian
> mode,  and PPC?  I assme needs this.

Yes, there's a good chance you're running 32bit user space on a 64bit kernel on PPC. With big endian bitmaps that completely blows my mind, so I've settled for little endian bitmaps back in the day (which make a lot more sense anyway) :).

> 
> 
>>> +            do {
>>> +                j = ffsl(c) - 1;
>>> +                c &= ~(1ul << j);
>>> +                page_number = (i * HOST_LONG_BITS + j) * hpratio;
>>> +                addr = page_number * TARGET_PAGE_SIZE;
>>> +                ram_addr = start + addr;
>>> +                cpu_physical_memory_set_dirty_range(ram_addr,
>>> +                                                    TARGET_PAGE_SIZE
>>> * hpratio);
> 
> I checked,  and it appears that ppc really use the hpratio thing.  I
> hope that the bitmap optimization also works for ppc,  but its
> architecture is weird^Winteresting O:-)

Well, it's a ration between host and target page size.

TARGET_PAGE_SIZE defines the smallest chunk that the MMU can possibly split things down to.
Host page size however is the smallest chunk that the Linux page allocation can give us.

That's a very important distinction. All server PPC hardware can use 4k page sizes. But you can configure the kernel to use 64k pages on 64bit. For older hardware, that means that we're using the full 16 HTAB group entries for a single page (16*4 = 64) which gives measurable speedups. For newer hardware we can even have real 64k pages.

So while QEMU always needs to keep the guest's view of the page table world at 4k to stay compatible with what the guest is trying to do, the host kernel may have itself configured to 64k pages.

This is not really PPC specific. AArch64 has a similar thing with native support for 64k page size. There are a few x86 cores with 64k page size too, but I don't think any of them run Linux :). Either way, x86 as we have it today is really a special case with TARGET_PAGE_SIZE == host_page_size.


Alex

^ permalink raw reply	[flat|nested] 69+ messages in thread

end of thread, other threads:[~2013-12-18  8:54 UTC | newest]

Thread overview: 69+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 01/39] Move prototypes to memory.h Juan Quintela
2013-11-07  9:36   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 02/39] memory: cpu_physical_memory_set_dirty_flags() result is never used Juan Quintela
2013-11-07  9:42   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 03/39] memory: cpu_physical_memory_set_dirty_range() return void Juan Quintela
2013-11-07  9:51   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 04/39] exec: use accessor function to know if memory is dirty Juan Quintela
2013-11-07 10:00   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 05/39] memory: create function to set a single dirty bit Juan Quintela
2013-11-07 11:37   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 06/39] exec: create function to get " Juan Quintela
2013-11-06 23:11   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 07/39] memory: make cpu_physical_memory_is_dirty return bool Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 08/39] exec: simplify notdirty_mem_write() Juan Quintela
2013-11-06 23:12   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 09/39] memory: all users of cpu_physical_memory_get_dirty used only one flag Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 10/39] memory: set single dirty flags when possible Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 11/39] memory: cpu_physical_memory_set_dirty_range() allways dirty all flags Juan Quintela
2013-11-06 23:15   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 12/39] memory: cpu_physical_memory_mask_dirty_range() always clear a single flag Juan Quintela
2013-11-06 23:18   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 13/39] memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG Juan Quintela
2013-11-06 23:24   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 14/39] memory: use bit 2 for migration Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 15/39] memory: make sure that client is always inside range Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 16/39] memory: only resize dirty bitmap when memory size increases Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 17/39] memory: cpu_physical_memory_clear_dirty_flag() result is never used Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 18/39] bitmap: Add bitmap_zero_extend operation Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 19/39] memory: split dirty bitmap into three Juan Quintela
2013-11-06 23:39   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 20/39] memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 21/39] memory: unfold cpu_physical_memory_set_dirty() " Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 22/39] memory: unfold cpu_physical_memory_set_dirty_flag() Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 23/39] memory: make cpu_physical_memory_get_dirty() the main function Juan Quintela
2013-11-06 23:44   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 24/39] memory: cpu_physical_memory_get_dirty() is used as returning a bool Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 25/39] memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 26/39] memory: use find_next_bit() to find dirty bits Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 27/39] memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations Juan Quintela
2013-11-06 23:49   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 28/39] memory: cpu_physical_memory_clear_dirty_range() " Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 29/39] memory: s/dirty/clean/ in cpu_physical_memory_is_dirty() Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 30/39] memory: make cpu_physical_memory_reset_dirty() take a length parameter Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 31/39] memory: cpu_physical_memory_set_dirty_tracking() should return void Juan Quintela
2013-11-06 23:55   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 32/39] memory: split cpu_physical_memory_* functions to its own include Juan Quintela
2013-11-06 15:54   ` Paolo Bonzini
2013-11-06 13:04 ` [Qemu-devel] [PATCH 33/39] memory: unfold memory_region_test_and_clear() Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 34/39] kvm: use directly cpu_physical_memory_* api for tracking dirty pages Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 35/39] kvm: refactor start address calculation Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 36/39] memory: move bitmap synchronization to its own function Juan Quintela
2013-11-06 15:56   ` Paolo Bonzini
2013-11-06 16:22     ` Juan Quintela
2013-11-06 16:23       ` Paolo Bonzini
2013-12-18  8:54       ` Alexander Graf
2013-11-06 13:04 ` [Qemu-devel] [PATCH 37/39] memory: syncronize kvm bitmap using bitmaps operations Juan Quintela
2013-11-06 15:58   ` Paolo Bonzini
2013-11-06 13:04 ` [Qemu-devel] [PATCH 38/39] ram: split function that synchronizes a range Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 39/39] migration: synchronize memory bitmap 64bits at a time Juan Quintela
2013-11-06 14:37 ` [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Gerd Hoffmann
2013-11-06 15:38   ` Juan Quintela
2013-11-07 11:23     ` Gerd Hoffmann
2013-11-06 15:49   ` Paolo Bonzini
2013-11-25  6:39     ` Michael R. Hines
2013-11-25  9:45       ` Paolo Bonzini
2013-11-08 15:18 ` Chegu Vinod
2013-11-25  6:15 ` Michael R. Hines
2013-11-25  6:58   ` Michael R. Hines

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.