* [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism
@ 2012-03-02  9:59 Wen Congyang
  2012-03-02 10:02 ` [Qemu-devel] [RFC][PATCH 01/16 v8] Add API to create memory mapping list Wen Congyang
                   ` (16 more replies)
  0 siblings, 17 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02  9:59 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

Hi all,

'virsh dump' cannot work when a host PCI device is assigned to the guest.
We have discussed this issue here:
http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html

The previous version is here:
http://lists.nongnu.org/archive/html/qemu-devel/2012-02/msg04228.html

We have decided to introduce a new command, dump, to dump guest memory. The
core file's format is ELF.

Note:
1. The guest should be x86 or x86_64; other architectures are not supported
   yet.
2. An old gdb may crash when reading the vmcore; gdb-7.3.1 does not crash
   for me.
3. If the OS is running in the second (kdump) kernel, gdb may not work well,
   but crash can work if you specify '--machdep phys_addr=xxx' on the
   command line. The reason is that the second kernel updates the page
   tables, so we cannot get the first kernel's page tables.
4. The CPU's state is stored in a QEMU note. You need to modify crash to
   use it to calculate phys_base.
5. If the guest OS is 32 bit and the memory size is larger than 4 GB, the
   vmcore is in ELF64 format. You should use a gdb built with
   --enable-64-bit-bfd.
6. This patchset is based on the upstream tree and applies one patch that
   is still in Luiz Capitulino's tree, because this patchset uses the API
   qemu_get_fd().

Changes from v7 to v8:
1. addressed Hatayama's comments

Changes from v6 to v7:
1. addressed Jan's comments
2. fixed some bugs
3. stored the CPU's state in the vmcore

Changes from v5 to v6:
1. allowed the user to dump a fraction of the memory
2. fixed some bugs

Changes from v4 to v5:
1. converted the new dump command to QAPI

Changes from v3 to v4:
1. made the dump run asynchronously
2. added APIs to cancel dumping and query dumping progress
3. added an API to control dumping speed
4. automatically cancel dumping when the user resumes the VM, and set the
   status to failed

Changes from v2 to v3:
1. addressed Jan Kiszka's comments

Changes from v1 to v2:
1. fixed the virtual addresses in the vmcore

Wen Congyang (16):
  Add API to create memory mapping list
  Add API to check whether a physical address is I/O address
  implement cpu_get_memory_mapping()
  Add API to check whether paging mode is enabled
  Add API to get memory mapping
  Add API to get memory mapping without doing paging
  target-i386: Add API to write elf notes to core file
  target-i386: Add API to write cpu status to core file
  target-i386: add API to get dump info
  make gdb_id() generally available
  introduce a new monitor command 'dump' to dump guest's memory
  support cancelling the current dump
  support querying dump status
  run dump in the background
  support detached dump
  allow user to dump a fraction of the memory

 Makefile.target                   |    3 +
 configure                         |    8 +
 cpu-all.h                         |   66 +++
 cpu-common.h                      |    2 +
 dump.c                            |  980 +++++++++++++++++++++++++++++++++++++
 dump.h                            |   23 +
 elf.h                             |    5 +
 exec.c                            |   11 +
 gdbstub.c                         |    9 -
 gdbstub.h                         |    9 +
 hmp-commands.hx                   |   44 ++
 hmp.c                             |   89 ++++
 hmp.h                             |    3 +
 memory_mapping.c                  |  290 +++++++++++
 memory_mapping.h                  |   60 +++
 monitor.c                         |    7 +
 qapi-schema.json                  |   58 +++
 qmp-commands.hx                   |  110 +++++
 target-i386/arch_dump.c           |  433 ++++++++++++++++
 target-i386/arch_memory_mapping.c |  271 ++++++++++
 vl.c                              |    5 +-
 21 files changed, 2475 insertions(+), 11 deletions(-)
 create mode 100644 dump.c
 create mode 100644 dump.h
 create mode 100644 memory_mapping.c
 create mode 100644 memory_mapping.h
 create mode 100644 target-i386/arch_dump.c
 create mode 100644 target-i386/arch_memory_mapping.c

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC][PATCH 01/16 v8] Add API to create memory mapping list
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
@ 2012-03-02 10:02 ` Wen Congyang
  2012-03-02 10:06 ` [Qemu-devel] [RFC][PATCH 02/16 v8] Add API to check whether a physical address is I/O address Wen Congyang
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:02 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

The memory mapping list stores mappings from virtual addresses to physical
addresses. Within each mapping, the virtual and physical addresses are
contiguous. The following patch will use this information to create PT_LOAD
segments in the vmcore.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target  |    1 +
 memory_mapping.c |  166 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 memory_mapping.h |   47 +++++++++++++++
 3 files changed, 214 insertions(+), 0 deletions(-)
 create mode 100644 memory_mapping.c
 create mode 100644 memory_mapping.h

diff --git a/Makefile.target b/Makefile.target
index 68a5641..9227e4e 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -208,6 +208,7 @@ obj-$(CONFIG_KVM) += kvm.o kvm-all.o
 obj-$(CONFIG_NO_KVM) += kvm-stub.o
 obj-$(CONFIG_VGA) += vga.o
 obj-y += memory.o savevm.o
+obj-y += memory_mapping.o
 LIBS+=-lz
 
 obj-i386-$(CONFIG_KVM) += hyperv.o
diff --git a/memory_mapping.c b/memory_mapping.c
new file mode 100644
index 0000000..718f271
--- /dev/null
+++ b/memory_mapping.c
@@ -0,0 +1,166 @@
+/*
+ * QEMU memory mapping
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "cpu.h"
+#include "cpu-all.h"
+#include "memory_mapping.h"
+
+static void memory_mapping_list_add_mapping_sorted(MemoryMappingList *list,
+                                                   MemoryMapping *mapping)
+{
+    MemoryMapping *p;
+
+    QTAILQ_FOREACH(p, &list->head, next) {
+        if (p->phys_addr >= mapping->phys_addr) {
+            QTAILQ_INSERT_BEFORE(p, mapping, next);
+            return;
+        }
+    }
+    QTAILQ_INSERT_TAIL(&list->head, mapping, next);
+}
+
+static void create_new_memory_mapping(MemoryMappingList *list,
+                                      target_phys_addr_t phys_addr,
+                                      target_phys_addr_t virt_addr,
+                                      ram_addr_t length)
+{
+    MemoryMapping *memory_mapping;
+
+    memory_mapping = g_malloc(sizeof(MemoryMapping));
+    memory_mapping->phys_addr = phys_addr;
+    memory_mapping->virt_addr = virt_addr;
+    memory_mapping->length = length;
+    list->last_mapping = memory_mapping;
+    list->num++;
+    memory_mapping_list_add_mapping_sorted(list, memory_mapping);
+}
+
+static inline bool mapping_contiguous(MemoryMapping *map,
+                                      target_phys_addr_t phys_addr,
+                                      target_phys_addr_t virt_addr)
+{
+    return phys_addr == map->phys_addr + map->length &&
+           virt_addr == map->virt_addr + map->length;
+}
+
+/*
+ * Return whether [map->phys_addr, map->phys_addr + map->length) and
+ * [phys_addr, phys_addr + length) intersect.
+ */
+static inline bool mapping_have_same_region(MemoryMapping *map,
+                                            target_phys_addr_t phys_addr,
+                                            ram_addr_t length)
+{
+    return !(phys_addr + length < map->phys_addr ||
+             phys_addr >= map->phys_addr + map->length);
+}
+
+/*
+ * The physical ranges [map->phys_addr, map->phys_addr + map->length) and
+ * [phys_addr, phys_addr + length) intersect. Return whether the virtual
+ * addresses in the intersection differ (i.e. the mappings conflict).
+ */
+static inline bool mapping_conflict(MemoryMapping *map,
+                                    target_phys_addr_t phys_addr,
+                                    target_phys_addr_t virt_addr)
+{
+    return virt_addr - map->virt_addr != phys_addr - map->phys_addr;
+}
+
+/*
+ * The virtual ranges [map->virt_addr, map->virt_addr + map->length) and
+ * [virt_addr, virt_addr + length) intersect, and the physical addresses
+ * in the intersection are the same. Extend map to cover both ranges.
+ */
+static inline void mapping_merge(MemoryMapping *map,
+                                 target_phys_addr_t virt_addr,
+                                 ram_addr_t length)
+{
+    if (virt_addr < map->virt_addr) {
+        map->length += map->virt_addr - virt_addr;
+        map->virt_addr = virt_addr;
+    }
+
+    if ((virt_addr + length) >
+        (map->virt_addr + map->length)) {
+        map->length = virt_addr + length - map->virt_addr;
+    }
+}
+
+void memory_mapping_list_add_merge_sorted(MemoryMappingList *list,
+                                          target_phys_addr_t phys_addr,
+                                          target_phys_addr_t virt_addr,
+                                          ram_addr_t length)
+{
+    MemoryMapping *memory_mapping, *last_mapping;
+
+    if (QTAILQ_EMPTY(&list->head)) {
+        create_new_memory_mapping(list, phys_addr, virt_addr, length);
+        return;
+    }
+
+    last_mapping = list->last_mapping;
+    if (last_mapping) {
+        if (mapping_contiguous(last_mapping, phys_addr, virt_addr)) {
+            last_mapping->length += length;
+            return;
+        }
+    }
+
+    QTAILQ_FOREACH(memory_mapping, &list->head, next) {
+        if (mapping_contiguous(memory_mapping, phys_addr, virt_addr)) {
+            memory_mapping->length += length;
+            list->last_mapping = memory_mapping;
+            return;
+        }
+
+        if (phys_addr + length < memory_mapping->phys_addr) {
+            /* create a new region before memory_mapping */
+            break;
+        }
+
+        if (mapping_have_same_region(memory_mapping, phys_addr, length)) {
+            if (mapping_conflict(memory_mapping, phys_addr, virt_addr)) {
+                continue;
+            }
+
+            /* merge this region into memory_mapping */
+            mapping_merge(memory_mapping, virt_addr, length);
+            list->last_mapping = memory_mapping;
+            return;
+        }
+    }
+
+    /* this region cannot be merged into any existing memory mapping. */
+    create_new_memory_mapping(list, phys_addr, virt_addr, length);
+}
+
+void memory_mapping_list_free(MemoryMappingList *list)
+{
+    MemoryMapping *p, *q;
+
+    QTAILQ_FOREACH_SAFE(p, &list->head, next, q) {
+        QTAILQ_REMOVE(&list->head, p, next);
+        g_free(p);
+    }
+
+    list->num = 0;
+    list->last_mapping = NULL;
+}
+
+void memory_mapping_list_init(MemoryMappingList *list)
+{
+    list->num = 0;
+    list->last_mapping = NULL;
+    QTAILQ_INIT(&list->head);
+}
diff --git a/memory_mapping.h b/memory_mapping.h
new file mode 100644
index 0000000..836b047
--- /dev/null
+++ b/memory_mapping.h
@@ -0,0 +1,47 @@
+/*
+ * QEMU memory mapping
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef MEMORY_MAPPING_H
+#define MEMORY_MAPPING_H
+
+#include "qemu-queue.h"
+
+/* The physical and virtual address in the memory mapping are contiguous. */
+typedef struct MemoryMapping {
+    target_phys_addr_t phys_addr;
+    target_ulong virt_addr;
+    ram_addr_t length;
+    QTAILQ_ENTRY(MemoryMapping) next;
+} MemoryMapping;
+
+typedef struct MemoryMappingList {
+    unsigned int num;
+    MemoryMapping *last_mapping;
+    QTAILQ_HEAD(, MemoryMapping) head;
+} MemoryMappingList;
+
+/*
+ * Add or merge the memory region [phys_addr, phys_addr + length) into the
+ * memory mapping list. The region's virtual addresses start at virt_addr
+ * and are contiguous. The list is sorted by phys_addr.
+ */
+void memory_mapping_list_add_merge_sorted(MemoryMappingList *list,
+                                          target_phys_addr_t phys_addr,
+                                          target_phys_addr_t virt_addr,
+                                          ram_addr_t length);
+
+void memory_mapping_list_free(MemoryMappingList *list);
+
+void memory_mapping_list_init(MemoryMappingList *list);
+
+#endif
-- 
1.7.1


* [Qemu-devel] [RFC][PATCH 02/16 v8] Add API to check whether a physical address is I/O address
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
  2012-03-02 10:02 ` [Qemu-devel] [RFC][PATCH 01/16 v8] Add API to create memory mapping list Wen Congyang
@ 2012-03-02 10:06 ` Wen Congyang
  2012-03-02 10:08 ` [Qemu-devel] [RFC][PATCH 03/16 v8] implement cpu_get_memory_mapping() Wen Congyang
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:06 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

This API will be used in the following patches.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 cpu-common.h |    2 ++
 exec.c       |   11 +++++++++++
 2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/cpu-common.h b/cpu-common.h
index a40c57d..fde3e5d 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -71,6 +71,8 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
 void *cpu_register_map_client(void *opaque, void (*callback)(void *opaque));
 void cpu_unregister_map_client(void *cookie);
 
+bool cpu_physical_memory_is_io(target_phys_addr_t phys_addr);
+
 /* Coalesced MMIO regions are areas where write operations can be reordered.
  * This usually implies that write operations are side-effect free.  This allows
  * batching which can make a major impact on performance when using
diff --git a/exec.c b/exec.c
index b81677a..2114dd5 100644
--- a/exec.c
+++ b/exec.c
@@ -4435,3 +4435,14 @@ bool virtio_is_big_endian(void)
 #undef env
 
 #endif
+
+bool cpu_physical_memory_is_io(target_phys_addr_t phys_addr)
+{
+    ram_addr_t pd;
+    PhysPageDesc p;
+
+    p = phys_page_find(phys_addr >> TARGET_PAGE_BITS);
+    pd = p.phys_offset;
+
+    return !is_ram_rom_romd(pd);
+}
-- 
1.7.1


* [Qemu-devel] [RFC][PATCH 03/16 v8] implement cpu_get_memory_mapping()
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
  2012-03-02 10:02 ` [Qemu-devel] [RFC][PATCH 01/16 v8] Add API to create memory mapping list Wen Congyang
  2012-03-02 10:06 ` [Qemu-devel] [RFC][PATCH 02/16 v8] Add API to check whether a physical address is I/O address Wen Congyang
@ 2012-03-02 10:08 ` Wen Congyang
  2012-03-02 10:12 ` [Qemu-devel] [RFC][PATCH 04/16 v8] Add API to check whether paging mode is enabled Wen Congyang
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:08 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

Walk the CPU's page tables and collect all virtual-to-physical address
mappings, then add these mappings to the memory mapping list. If the guest
does not use paging, do nothing.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target                   |    1 +
 configure                         |    4 +
 cpu-all.h                         |   10 ++
 target-i386/arch_memory_mapping.c |  266 +++++++++++++++++++++++++++++++++++++
 4 files changed, 281 insertions(+), 0 deletions(-)
 create mode 100644 target-i386/arch_memory_mapping.c

diff --git a/Makefile.target b/Makefile.target
index 9227e4e..a87e678 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -84,6 +84,7 @@ libobj-y += op_helper.o helper.o
 ifeq ($(TARGET_BASE_ARCH), i386)
 libobj-y += cpuid.o
 endif
+libobj-$(CONFIG_HAVE_GET_MEMORY_MAPPING) += arch_memory_mapping.o
 libobj-$(TARGET_SPARC64) += vis_helper.o
 libobj-$(CONFIG_NEED_MMU) += mmu.o
 libobj-$(TARGET_ARM) += neon_helper.o iwmmxt_helper.o
diff --git a/configure b/configure
index fb0e18e..61821fc 100755
--- a/configure
+++ b/configure
@@ -3643,6 +3643,10 @@ case "$target_arch2" in
       fi
     fi
 esac
+case "$target_arch2" in
+  i386|x86_64)
+    echo "CONFIG_HAVE_GET_MEMORY_MAPPING=y" >> $config_target_mak
+esac
 if test "$target_arch2" = "ppc64" -a "$fdt" = "yes"; then
   echo "CONFIG_PSERIES=y" >> $config_target_mak
 fi
diff --git a/cpu-all.h b/cpu-all.h
index e2c3c49..cb72680 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -22,6 +22,7 @@
 #include "qemu-common.h"
 #include "qemu-tls.h"
 #include "cpu-common.h"
+#include "memory_mapping.h"
 
 /* some important defines:
  *
@@ -523,4 +524,13 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf);
 int cpu_memory_rw_debug(CPUState *env, target_ulong addr,
                         uint8_t *buf, int len, int is_write);
 
+#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
+int cpu_get_memory_mapping(MemoryMappingList *list, CPUState *env);
+#else
+static inline int cpu_get_memory_mapping(MemoryMappingList *list, CPUState *env)
+{
+    return -1;
+}
+#endif
+
 #endif /* CPU_ALL_H */
diff --git a/target-i386/arch_memory_mapping.c b/target-i386/arch_memory_mapping.c
new file mode 100644
index 0000000..10d9b2c
--- /dev/null
+++ b/target-i386/arch_memory_mapping.c
@@ -0,0 +1,266 @@
+/*
+ * i386 memory mapping
+ *
+ * Copyright Fujitsu, Corp. 2011
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "cpu.h"
+#include "cpu-all.h"
+
+/* PAE Paging or IA-32e Paging */
+static void walk_pte(MemoryMappingList *list, target_phys_addr_t pte_start_addr,
+                     int32_t a20_mask, target_ulong start_line_addr)
+{
+    target_phys_addr_t pte_addr, start_paddr;
+    uint64_t pte;
+    target_ulong start_vaddr;
+    int i;
+
+    for (i = 0; i < 512; i++) {
+        pte_addr = (pte_start_addr + i * 8) & a20_mask;
+        pte = ldq_phys(pte_addr);
+        if (!(pte & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        start_paddr = (pte & ~0xfff) & ~(0x1ULL << 63);
+        if (cpu_physical_memory_is_io(start_paddr)) {
+            /* I/O region */
+            continue;
+        }
+
+        start_vaddr = start_line_addr | ((i & 0x1ff) << 12);
+        memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                             start_vaddr, 1 << 12);
+    }
+}
+
+/* 32-bit Paging */
+static void walk_pte2(MemoryMappingList *list,
+                      target_phys_addr_t pte_start_addr, int32_t a20_mask,
+                      target_ulong start_line_addr)
+{
+    target_phys_addr_t pte_addr, start_paddr;
+    uint32_t pte;
+    target_ulong start_vaddr;
+    int i;
+
+    for (i = 0; i < 1024; i++) {
+        pte_addr = (pte_start_addr + i * 4) & a20_mask;
+        pte = ldl_phys(pte_addr);
+        if (!(pte & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        start_paddr = pte & ~0xfff;
+        if (cpu_physical_memory_is_io(start_paddr)) {
+            /* I/O region */
+            continue;
+        }
+
+        start_vaddr = start_line_addr | ((i & 0x3ff) << 12);
+        memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                             start_vaddr, 1 << 12);
+    }
+}
+
+/* PAE Paging or IA-32e Paging */
+static void walk_pde(MemoryMappingList *list, target_phys_addr_t pde_start_addr,
+                     int32_t a20_mask, target_ulong start_line_addr)
+{
+    target_phys_addr_t pde_addr, pte_start_addr, start_paddr;
+    uint64_t pde;
+    target_ulong line_addr, start_vaddr;
+    int i;
+
+    for (i = 0; i < 512; i++) {
+        pde_addr = (pde_start_addr + i * 8) & a20_mask;
+        pde = ldq_phys(pde_addr);
+        if (!(pde & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = start_line_addr | ((i & 0x1ff) << 21);
+        if (pde & PG_PSE_MASK) {
+            /* 2 MB page */
+            start_paddr = (pde & ~0x1fffff) & ~(0x1ULL << 63);
+            if (cpu_physical_memory_is_io(start_paddr)) {
+                /* I/O region */
+                continue;
+            }
+            start_vaddr = line_addr;
+            memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                                 start_vaddr, 1 << 21);
+            continue;
+        }
+
+        pte_start_addr = (pde & ~0xfff) & a20_mask;
+        walk_pte(list, pte_start_addr, a20_mask, line_addr);
+    }
+}
+
+/* 32-bit Paging */
+static void walk_pde2(MemoryMappingList *list,
+                      target_phys_addr_t pde_start_addr, int32_t a20_mask,
+                      bool pse)
+{
+    target_phys_addr_t pde_addr, pte_start_addr, start_paddr;
+    uint32_t pde;
+    target_ulong line_addr, start_vaddr;
+    int i;
+
+    for (i = 0; i < 1024; i++) {
+        pde_addr = (pde_start_addr + i * 4) & a20_mask;
+        pde = ldl_phys(pde_addr);
+        if (!(pde & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = (((unsigned int)i & 0x3ff) << 22);
+        if ((pde & PG_PSE_MASK) && pse) {
+            /* 4 MB page */
+            start_paddr = (pde & ~0x3fffff) | ((pde & 0x1fe000) << 19);
+            if (cpu_physical_memory_is_io(start_paddr)) {
+                /* I/O region */
+                continue;
+            }
+            start_vaddr = line_addr;
+            memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                                 start_vaddr, 1 << 22);
+            continue;
+        }
+
+        pte_start_addr = (pde & ~0xfff) & a20_mask;
+        walk_pte2(list, pte_start_addr, a20_mask, line_addr);
+    }
+}
+
+/* PAE Paging */
+static void walk_pdpe2(MemoryMappingList *list,
+                       target_phys_addr_t pdpe_start_addr, int32_t a20_mask)
+{
+    target_phys_addr_t pdpe_addr, pde_start_addr;
+    uint64_t pdpe;
+    target_ulong line_addr;
+    int i;
+
+    for (i = 0; i < 4; i++) {
+        pdpe_addr = (pdpe_start_addr + i * 8) & a20_mask;
+        pdpe = ldq_phys(pdpe_addr);
+        if (!(pdpe & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = (((unsigned int)i & 0x3) << 30);
+        pde_start_addr = (pdpe & ~0xfff) & a20_mask;
+        walk_pde(list, pde_start_addr, a20_mask, line_addr);
+    }
+}
+
+#ifdef TARGET_X86_64
+/* IA-32e Paging */
+static void walk_pdpe(MemoryMappingList *list,
+                      target_phys_addr_t pdpe_start_addr, int32_t a20_mask,
+                      target_ulong start_line_addr)
+{
+    target_phys_addr_t pdpe_addr, pde_start_addr, start_paddr;
+    uint64_t pdpe;
+    target_ulong line_addr, start_vaddr;
+    int i;
+
+    for (i = 0; i < 512; i++) {
+        pdpe_addr = (pdpe_start_addr + i * 8) & a20_mask;
+        pdpe = ldq_phys(pdpe_addr);
+        if (!(pdpe & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = start_line_addr | ((i & 0x1ffULL) << 30);
+        if (pdpe & PG_PSE_MASK) {
+            /* 1 GB page */
+            start_paddr = (pdpe & ~0x3fffffff) & ~(0x1ULL << 63);
+            if (cpu_physical_memory_is_io(start_paddr)) {
+                /* I/O region */
+                continue;
+            }
+            start_vaddr = line_addr;
+            memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                                 start_vaddr, 1 << 30);
+            continue;
+        }
+
+        pde_start_addr = (pdpe & ~0xfff) & a20_mask;
+        walk_pde(list, pde_start_addr, a20_mask, line_addr);
+    }
+}
+
+/* IA-32e Paging */
+static void walk_pml4e(MemoryMappingList *list,
+                       target_phys_addr_t pml4e_start_addr, int32_t a20_mask)
+{
+    target_phys_addr_t pml4e_addr, pdpe_start_addr;
+    uint64_t pml4e;
+    target_ulong line_addr;
+    int i;
+
+    for (i = 0; i < 512; i++) {
+        pml4e_addr = (pml4e_start_addr + i * 8) & a20_mask;
+        pml4e = ldq_phys(pml4e_addr);
+        if (!(pml4e & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = ((i & 0x1ffULL) << 39) | (0xffffULL << 48);
+        pdpe_start_addr = (pml4e & ~0xfff) & a20_mask;
+        walk_pdpe(list, pdpe_start_addr, a20_mask, line_addr);
+    }
+}
+#endif
+
+int cpu_get_memory_mapping(MemoryMappingList *list, CPUState *env)
+{
+    if (!(env->cr[0] & CR0_PG_MASK)) {
+        /* paging is disabled */
+        return 0;
+    }
+
+    if (env->cr[4] & CR4_PAE_MASK) {
+#ifdef TARGET_X86_64
+        if (env->hflags & HF_LMA_MASK) {
+            target_phys_addr_t pml4e_addr;
+
+            pml4e_addr = (env->cr[3] & ~0xfff) & env->a20_mask;
+            walk_pml4e(list, pml4e_addr, env->a20_mask);
+        } else
+#endif
+        {
+            target_phys_addr_t pdpe_addr;
+
+            pdpe_addr = (env->cr[3] & ~0x1f) & env->a20_mask;
+            walk_pdpe2(list, pdpe_addr, env->a20_mask);
+        }
+    } else {
+        target_phys_addr_t pde_addr;
+        bool pse;
+
+        pde_addr = (env->cr[3] & ~0xfff) & env->a20_mask;
+        pse = !!(env->cr[4] & CR4_PSE_MASK);
+        walk_pde2(list, pde_addr, env->a20_mask, pse);
+    }
+
+    return 0;
+}
-- 
1.7.1


* [Qemu-devel] [RFC][PATCH 04/16 v8] Add API to check whether paging mode is enabled
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (2 preceding siblings ...)
  2012-03-02 10:08 ` [Qemu-devel] [RFC][PATCH 03/16 v8] implement cpu_get_memory_mapping() Wen Congyang
@ 2012-03-02 10:12 ` Wen Congyang
  2012-03-02 10:18 ` [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping Wen Congyang
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:12 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake


Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 cpu-all.h                         |    6 ++++++
 target-i386/arch_memory_mapping.c |    7 ++++++-
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index cb72680..01c3c23 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -526,11 +526,17 @@ int cpu_memory_rw_debug(CPUState *env, target_ulong addr,
 
 #if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
 int cpu_get_memory_mapping(MemoryMappingList *list, CPUState *env);
+bool cpu_paging_enabled(CPUState *env);
 #else
 static inline int cpu_get_memory_mapping(MemoryMappingList *list, CPUState *env)
 {
     return -1;
 }
+
+static inline bool cpu_paging_enabled(CPUState *env)
+{
+    return true;
+}
 #endif
 
 #endif /* CPU_ALL_H */
diff --git a/target-i386/arch_memory_mapping.c b/target-i386/arch_memory_mapping.c
index 10d9b2c..824f293 100644
--- a/target-i386/arch_memory_mapping.c
+++ b/target-i386/arch_memory_mapping.c
@@ -233,7 +233,7 @@ static void walk_pml4e(MemoryMappingList *list,
 
 int cpu_get_memory_mapping(MemoryMappingList *list, CPUState *env)
 {
-    if (!(env->cr[0] & CR0_PG_MASK)) {
+    if (!cpu_paging_enabled(env)) {
         /* paging is disabled */
         return 0;
     }
@@ -264,3 +264,8 @@ int cpu_get_memory_mapping(MemoryMappingList *list, CPUState *env)
 
     return 0;
 }
+
+bool cpu_paging_enabled(CPUState *env)
+{
+    return env->cr[0] & CR0_PG_MASK;
+}
-- 
1.7.1


* [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (3 preceding siblings ...)
  2012-03-02 10:12 ` [Qemu-devel] [RFC][PATCH 04/16 v8] Add API to check whether paging mode is enabled Wen Congyang
@ 2012-03-02 10:18 ` Wen Congyang
  2012-03-07 15:27   ` HATAYAMA Daisuke
  2012-03-02 10:23 ` [Qemu-devel] [RFC][PATCH 06/16 v8] Add API to get memory mapping without doing paging Wen Congyang
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:18 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

Add an API to get all virtual-to-physical address mappings. If a physical
address has no corresponding virtual address, the virtual address is
recorded as 0.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 memory_mapping.c |   88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 memory_mapping.h |    8 +++++
 2 files changed, 96 insertions(+), 0 deletions(-)

diff --git a/memory_mapping.c b/memory_mapping.c
index 718f271..f74c5d0 100644
--- a/memory_mapping.c
+++ b/memory_mapping.c
@@ -164,3 +164,91 @@ void memory_mapping_list_init(MemoryMappingList *list)
     list->last_mapping = NULL;
     QTAILQ_INIT(&list->head);
 }
+
+int qemu_get_guest_memory_mapping(MemoryMappingList *list)
+{
+    CPUState *env;
+    MemoryMapping *memory_mapping;
+    RAMBlock *block;
+    ram_addr_t offset, length, m_length;
+    target_phys_addr_t m_phys_addr;
+    int ret;
+    bool paging_mode;
+
+#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
+    paging_mode = cpu_paging_enabled(first_cpu);
+    if (paging_mode) {
+        for (env = first_cpu; env != NULL; env = env->next_cpu) {
+            ret = cpu_get_memory_mapping(list, env);
+            if (ret < 0) {
+                return -1;
+            }
+        }
+    }
+#else
+    return -2;
+#endif
+
+    /*
+     * Some memory may not be in the memory mapping list:
+     * 1. the guest doesn't use paging
+     * 2. the guest is in the 2nd kernel, and the memory used by the 1st
+     *    kernel is not in its page tables
+     * Add such memory to the memory mapping list.
+     */
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        offset = block->offset;
+        length = block->length;
+
+        if (!paging_mode) {
+            create_new_memory_mapping(list, offset, offset, length);
+            continue;
+        }
+
+        QTAILQ_FOREACH(memory_mapping, &list->head, next) {
+            m_phys_addr = memory_mapping->phys_addr;
+            m_length = memory_mapping->length;
+
+            if (offset + length <= m_phys_addr) {
+                /*
+                 * memory_mapping's list does not contain the region
+                 * [offset, offset + length)
+                 */
+                create_new_memory_mapping(list, offset, 0, length);
+                length = 0;
+                break;
+            }
+
+            if (m_phys_addr + m_length <= offset) {
+                continue;
+            }
+
+            if (m_phys_addr > offset) {
+                /*
+                 * memory_mapping's list does not contain the region
+                 * [offset, memory_mapping->phys_addr)
+                 */
+                create_new_memory_mapping(list, offset, 0,
+                                          m_phys_addr - offset);
+            }
+
+            if (offset + length <= m_phys_addr + m_length) {
+                length = 0;
+                break;
+            }
+
+            length -= m_phys_addr + m_length - offset;
+            offset = m_phys_addr + m_length;
+        }
+
+        if (length > 0) {
+            /*
+             * memory_mapping's list does not contain the region
+             * [offset, offset + length)
+             */
+            create_new_memory_mapping(list, offset, 0, length);
+        }
+    }
+
+    return 0;
+}
diff --git a/memory_mapping.h b/memory_mapping.h
index 836b047..ebd7cf6 100644
--- a/memory_mapping.h
+++ b/memory_mapping.h
@@ -44,4 +44,12 @@ void memory_mapping_list_free(MemoryMappingList *list);
 
 void memory_mapping_list_init(MemoryMappingList *list);
 
+/*
+ * Return value:
+ *    0: success
+ *   -1: failed
+ *   -2: unsupported
+ */
+int qemu_get_guest_memory_mapping(MemoryMappingList *list);
+
 #endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC][PATCH 06/16 v8] Add API to get memory mapping without doing paging
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (4 preceding siblings ...)
  2012-03-02 10:18 ` [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping Wen Congyang
@ 2012-03-02 10:23 ` Wen Congyang
  2012-03-02 10:27 ` [Qemu-devel] [RFC][PATCH 07/16 v8] target-i386: Add API to write elf notes to core file Wen Congyang
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:23 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

Getting the memory mapping by walking the page tables is only needed for gdb;
crash does not need this information.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 memory_mapping.c |    9 +++++++++
 memory_mapping.h |    3 +++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/memory_mapping.c b/memory_mapping.c
index f74c5d0..7f4193d 100644
--- a/memory_mapping.c
+++ b/memory_mapping.c
@@ -252,3 +252,12 @@ int qemu_get_guest_memory_mapping(MemoryMappingList *list)
 
     return 0;
 }
+
+void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list)
+{
+    RAMBlock *block;
+
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        create_new_memory_mapping(list, block->offset, 0, block->length);
+    }
+}
diff --git a/memory_mapping.h b/memory_mapping.h
index ebd7cf6..50b1f25 100644
--- a/memory_mapping.h
+++ b/memory_mapping.h
@@ -52,4 +52,7 @@ void memory_mapping_list_init(MemoryMappingList *list);
  */
 int qemu_get_guest_memory_mapping(MemoryMappingList *list);
 
+/* Get the guest's memory mapping without doing paging (virtual address is 0). */
+void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list);
+
 #endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC][PATCH 07/16 v8] target-i386: Add API to write elf notes to core file
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (5 preceding siblings ...)
  2012-03-02 10:23 ` [Qemu-devel] [RFC][PATCH 06/16 v8] Add API to get memory mapping without doing paging Wen Congyang
@ 2012-03-02 10:27 ` Wen Congyang
  2012-03-02 10:31 ` [Qemu-devel] [RFC][PATCH 08/16 v8] target-i386: Add API to write cpu status " Wen Congyang
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:27 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

The core file contains the registers' values. These APIs write the registers
to the core file; they will be called in a following patch.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target         |    1 +
 configure               |    4 +
 cpu-all.h               |   23 +++++
 target-i386/arch_dump.c |  249 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 277 insertions(+), 0 deletions(-)
 create mode 100644 target-i386/arch_dump.c

diff --git a/Makefile.target b/Makefile.target
index a87e678..cfd3113 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -210,6 +210,7 @@ obj-$(CONFIG_NO_KVM) += kvm-stub.o
 obj-$(CONFIG_VGA) += vga.o
 obj-y += memory.o savevm.o
 obj-y += memory_mapping.o
+obj-$(CONFIG_HAVE_CORE_DUMP) += arch_dump.o
 LIBS+=-lz
 
 obj-i386-$(CONFIG_KVM) += hyperv.o
diff --git a/configure b/configure
index 61821fc..bcbd5d1 100755
--- a/configure
+++ b/configure
@@ -3662,6 +3662,10 @@ if test "$target_softmmu" = "yes" ; then
   if test "$smartcard_nss" = "yes" ; then
     echo "subdir-$target: subdir-libcacard" >> $config_host_mak
   fi
+  case "$target_arch2" in
+    i386|x86_64)
+      echo "CONFIG_HAVE_CORE_DUMP=y" >> $config_target_mak
+  esac
 fi
 if test "$target_user_only" = "yes" ; then
   echo "CONFIG_USER_ONLY=y" >> $config_target_mak
diff --git a/cpu-all.h b/cpu-all.h
index 01c3c23..e476401 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -539,4 +539,27 @@ static inline bool cpu_paging_enabled(CPUState *env)
 }
 #endif
 
+typedef int (*write_core_dump_function)
+    (target_phys_addr_t offset, void *buf, size_t size, void *opaque);
+#if defined(CONFIG_HAVE_CORE_DUMP)
+int cpu_write_elf64_note(write_core_dump_function f, CPUState *env, int cpuid,
+                         target_phys_addr_t *offset, void *opaque);
+int cpu_write_elf32_note(write_core_dump_function f, CPUState *env, int cpuid,
+                         target_phys_addr_t *offset, void *opaque);
+#else
+static inline int cpu_write_elf64_note(write_core_dump_function f,
+                                       CPUState *env, int cpuid,
+                                       target_phys_addr_t *offset, void *opaque)
+{
+    return -1;
+}
+
+static inline int cpu_write_elf32_note(write_core_dump_function f,
+                                       CPUState *env, int cpuid,
+                                       target_phys_addr_t *offset, void *opaque)
+{
+    return -1;
+}
+#endif
+
 #endif /* CPU_ALL_H */
diff --git a/target-i386/arch_dump.c b/target-i386/arch_dump.c
new file mode 100644
index 0000000..3239c40
--- /dev/null
+++ b/target-i386/arch_dump.c
@@ -0,0 +1,249 @@
+/*
+ * i386 memory mapping
+ *
+ * Copyright Fujitsu, Corp. 2011
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "cpu.h"
+#include "cpu-all.h"
+#include "elf.h"
+
+#ifdef TARGET_X86_64
+typedef struct {
+    target_ulong r15, r14, r13, r12, rbp, rbx, r11, r10;
+    target_ulong r9, r8, rax, rcx, rdx, rsi, rdi, orig_rax;
+    target_ulong rip, cs, eflags;
+    target_ulong rsp, ss;
+    target_ulong fs_base, gs_base;
+    target_ulong ds, es, fs, gs;
+} x86_64_user_regs_struct;
+
+static int x86_64_write_elf64_note(write_core_dump_function f, CPUState *env,
+                                   int id, target_phys_addr_t *offset,
+                                   void *opaque)
+{
+    x86_64_user_regs_struct regs;
+    Elf64_Nhdr *note;
+    char *buf;
+    int descsz, note_size, name_size = 5;
+    const char *name = "CORE";
+    int ret;
+
+    regs.r15 = env->regs[15];
+    regs.r14 = env->regs[14];
+    regs.r13 = env->regs[13];
+    regs.r12 = env->regs[12];
+    regs.r11 = env->regs[11];
+    regs.r10 = env->regs[10];
+    regs.r9  = env->regs[9];
+    regs.r8  = env->regs[8];
+    regs.rbp = env->regs[R_EBP];
+    regs.rsp = env->regs[R_ESP];
+    regs.rdi = env->regs[R_EDI];
+    regs.rsi = env->regs[R_ESI];
+    regs.rdx = env->regs[R_EDX];
+    regs.rcx = env->regs[R_ECX];
+    regs.rbx = env->regs[R_EBX];
+    regs.rax = env->regs[R_EAX];
+    regs.rip = env->eip;
+    regs.eflags = env->eflags;
+
+    regs.orig_rax = 0; /* FIXME */
+    regs.cs = env->segs[R_CS].selector;
+    regs.ss = env->segs[R_SS].selector;
+    regs.fs_base = env->segs[R_FS].base;
+    regs.gs_base = env->segs[R_GS].base;
+    regs.ds = env->segs[R_DS].selector;
+    regs.es = env->segs[R_ES].selector;
+    regs.fs = env->segs[R_FS].selector;
+    regs.gs = env->segs[R_GS].selector;
+
+    descsz = 336; /* sizeof(prstatus_t) is 336 on x86_64 box */
+    note_size = ((sizeof(Elf64_Nhdr) + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    note->n_namesz = cpu_to_le32(name_size);
+    note->n_descsz = cpu_to_le32(descsz);
+    note->n_type = cpu_to_le32(NT_PRSTATUS);
+    buf = (char *)note;
+    buf += ((sizeof(Elf64_Nhdr) + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf + 32, &id, 4); /* pr_pid */
+    buf += descsz - sizeof(x86_64_user_regs_struct)-sizeof(target_ulong);
+    memcpy(buf, &regs, sizeof(x86_64_user_regs_struct));
+
+    ret = f(*offset, note, note_size, opaque);
+    g_free(note);
+    if (ret < 0) {
+        return -1;
+    }
+
+    *offset += note_size;
+
+    return 0;
+}
+#endif
+
+typedef struct {
+    uint32_t ebx, ecx, edx, esi, edi, ebp, eax;
+    unsigned short ds, __ds, es, __es;
+    unsigned short fs, __fs, gs, __gs;
+    uint32_t orig_eax, eip;
+    unsigned short cs, __cs;
+    uint32_t eflags, esp;
+    unsigned short ss, __ss;
+} x86_user_regs_struct;
+
+static int x86_write_elf64_note(write_core_dump_function f, CPUState *env,
+                                int id, target_phys_addr_t *offset,
+                                void *opaque)
+{
+    x86_user_regs_struct regs;
+    Elf64_Nhdr *note;
+    char *buf;
+    int descsz, note_size, name_size = 5;
+    const char *name = "CORE";
+    int ret;
+
+    regs.ebp = env->regs[R_EBP] & 0xffffffff;
+    regs.esp = env->regs[R_ESP] & 0xffffffff;
+    regs.edi = env->regs[R_EDI] & 0xffffffff;
+    regs.esi = env->regs[R_ESI] & 0xffffffff;
+    regs.edx = env->regs[R_EDX] & 0xffffffff;
+    regs.ecx = env->regs[R_ECX] & 0xffffffff;
+    regs.ebx = env->regs[R_EBX] & 0xffffffff;
+    regs.eax = env->regs[R_EAX] & 0xffffffff;
+    regs.eip = env->eip & 0xffffffff;
+    regs.eflags = env->eflags & 0xffffffff;
+
+    regs.cs = env->segs[R_CS].selector;
+    regs.__cs = 0;
+    regs.ss = env->segs[R_SS].selector;
+    regs.__ss = 0;
+    regs.ds = env->segs[R_DS].selector;
+    regs.__ds = 0;
+    regs.es = env->segs[R_ES].selector;
+    regs.__es = 0;
+    regs.fs = env->segs[R_FS].selector;
+    regs.__fs = 0;
+    regs.gs = env->segs[R_GS].selector;
+    regs.__gs = 0;
+
+    descsz = 144; /* sizeof(prstatus_t) is 144 on x86 box */
+    note_size = ((sizeof(Elf64_Nhdr) + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    note->n_namesz = cpu_to_le32(name_size);
+    note->n_descsz = cpu_to_le32(descsz);
+    note->n_type = cpu_to_le32(NT_PRSTATUS);
+    buf = (char *)note;
+    buf += ((sizeof(Elf64_Nhdr) + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf + 24, &id, 4); /* pr_pid */
+    buf += descsz - sizeof(x86_user_regs_struct)-4;
+    memcpy(buf, &regs, sizeof(x86_user_regs_struct));
+
+    ret = f(*offset, note, note_size, opaque);
+    g_free(note);
+    if (ret < 0) {
+        return -1;
+    }
+
+    *offset += note_size;
+
+    return 0;
+}
+
+int cpu_write_elf64_note(write_core_dump_function f, CPUState *env, int cpuid,
+                         target_phys_addr_t *offset, void *opaque)
+{
+    int ret;
+#ifdef TARGET_X86_64
+    bool lma = !!(first_cpu->hflags & HF_LMA_MASK);
+
+    if (lma) {
+        ret = x86_64_write_elf64_note(f, env, cpuid, offset, opaque);
+    } else {
+#endif
+        ret = x86_write_elf64_note(f, env, cpuid, offset, opaque);
+#ifdef TARGET_X86_64
+    }
+#endif
+
+    return ret;
+}
+
+int cpu_write_elf32_note(write_core_dump_function f, CPUState *env, int cpuid,
+                         target_phys_addr_t *offset, void *opaque)
+{
+    x86_user_regs_struct regs;
+    Elf32_Nhdr *note;
+    char *buf;
+    int descsz, note_size, name_size = 5;
+    const char *name = "CORE";
+    int ret;
+
+    regs.ebp = env->regs[R_EBP] & 0xffffffff;
+    regs.esp = env->regs[R_ESP] & 0xffffffff;
+    regs.edi = env->regs[R_EDI] & 0xffffffff;
+    regs.esi = env->regs[R_ESI] & 0xffffffff;
+    regs.edx = env->regs[R_EDX] & 0xffffffff;
+    regs.ecx = env->regs[R_ECX] & 0xffffffff;
+    regs.ebx = env->regs[R_EBX] & 0xffffffff;
+    regs.eax = env->regs[R_EAX] & 0xffffffff;
+    regs.eip = env->eip & 0xffffffff;
+    regs.eflags = env->eflags & 0xffffffff;
+
+    regs.cs = env->segs[R_CS].selector;
+    regs.__cs = 0;
+    regs.ss = env->segs[R_SS].selector;
+    regs.__ss = 0;
+    regs.ds = env->segs[R_DS].selector;
+    regs.__ds = 0;
+    regs.es = env->segs[R_ES].selector;
+    regs.__es = 0;
+    regs.fs = env->segs[R_FS].selector;
+    regs.__fs = 0;
+    regs.gs = env->segs[R_GS].selector;
+    regs.__gs = 0;
+
+    descsz = 144; /* sizeof(prstatus_t) is 144 on x86 box */
+    note_size = ((sizeof(Elf32_Nhdr) + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    note->n_namesz = cpu_to_le32(name_size);
+    note->n_descsz = cpu_to_le32(descsz);
+    note->n_type = cpu_to_le32(NT_PRSTATUS);
+    buf = (char *)note;
+    buf += ((sizeof(Elf32_Nhdr) + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf + 24, &cpuid, 4); /* pr_pid */
+    buf += descsz - sizeof(x86_user_regs_struct)-4;
+    memcpy(buf, &regs, sizeof(x86_user_regs_struct));
+
+    ret = f(*offset, note, note_size, opaque);
+    g_free(note);
+    if (ret < 0) {
+        return -1;
+    }
+
+    *offset += note_size;
+
+    return 0;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC][PATCH 08/16 v8] target-i386: Add API to write cpu status to core file
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (6 preceding siblings ...)
  2012-03-02 10:27 ` [Qemu-devel] [RFC][PATCH 07/16 v8] target-i386: Add API to write elf notes to core file Wen Congyang
@ 2012-03-02 10:31 ` Wen Congyang
  2012-03-02 10:33 ` [Qemu-devel] [RFC][PATCH 09/16 v8] target-i386: add API to get dump info Wen Congyang
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:31 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

The core file contains the registers' values, but it does not include all
registers. Store the cpu state into a QEMU note, so the user can get more
information from the vmcore. If you change QEMUCPUState, please bump
QEMUCPUSTATE_VERSION.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 cpu-all.h               |   20 ++++++
 target-i386/arch_dump.c |  150 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 170 insertions(+), 0 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index e476401..6c36d73 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -546,6 +546,10 @@ int cpu_write_elf64_note(write_core_dump_function f, CPUState *env, int cpuid,
                          target_phys_addr_t *offset, void *opaque);
 int cpu_write_elf32_note(write_core_dump_function f, CPUState *env, int cpuid,
                          target_phys_addr_t *offset, void *opaque);
+int cpu_write_elf64_qemunote(write_core_dump_function f, CPUState *env,
+                             target_phys_addr_t *offset, void *opaque);
+int cpu_write_elf32_qemunote(write_core_dump_function f, CPUState *env,
+                             target_phys_addr_t *offset, void *opaque);
 #else
 static inline int cpu_write_elf64_note(write_core_dump_function f,
                                        CPUState *env, int cpuid,
@@ -560,6 +564,22 @@ static inline int cpu_write_elf32_note(write_core_dump_function f,
 {
     return -1;
 }
+
+static inline int cpu_write_elf64_qemunote(write_core_dump_function f,
+                                           CPUState *env,
+                                           target_phys_addr_t *offset,
+                                           void *opaque)
+{
+    return -1;
+}
+
+static inline int cpu_write_elf32_qemunote(write_core_dump_function f,
+                                           CPUState *env,
+                                           target_phys_addr_t *offset,
+                                           void *opaque)
+{
+    return -1;
+}
 #endif
 
 #endif /* CPU_ALL_H */
diff --git a/target-i386/arch_dump.c b/target-i386/arch_dump.c
index 3239c40..274bbec 100644
--- a/target-i386/arch_dump.c
+++ b/target-i386/arch_dump.c
@@ -247,3 +247,153 @@ int cpu_write_elf32_note(write_core_dump_function f, CPUState *env, int cpuid,
 
     return 0;
 }
+
+/*
+ * Please bump QEMUCPUSTATE_VERSION if you have changed the definition of
+ * QEMUCPUState, and modify the tools using this information accordingly.
+ */
+#define QEMUCPUSTATE_VERSION (1)
+
+struct QEMUCPUSegment {
+    uint32_t selector;
+    uint32_t limit;
+    uint32_t flags;
+    uint32_t pad;
+    uint64_t base;
+};
+
+typedef struct QEMUCPUSegment QEMUCPUSegment;
+
+struct QEMUCPUState {
+    uint32_t version;
+    uint32_t size;
+    uint64_t rax, rbx, rcx, rdx, rsi, rdi, rsp, rbp;
+    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
+    uint64_t rip, rflags;
+    QEMUCPUSegment cs, ds, es, fs, gs, ss;
+    QEMUCPUSegment ldt, tr, gdt, idt;
+    uint64_t cr[5];
+};
+
+typedef struct QEMUCPUState QEMUCPUState;
+
+static void copy_segment(QEMUCPUSegment *d, SegmentCache *s)
+{
+    d->pad = 0;
+    d->selector = s->selector;
+    d->limit = s->limit;
+    d->flags = s->flags;
+    d->base = s->base;
+}
+
+static void qemu_get_cpustate(QEMUCPUState *s, CPUState *env)
+{
+    memset(s, 0, sizeof(QEMUCPUState));
+
+    s->version = QEMUCPUSTATE_VERSION;
+    s->size = sizeof(QEMUCPUState);
+
+    s->rax = env->regs[R_EAX];
+    s->rbx = env->regs[R_EBX];
+    s->rcx = env->regs[R_ECX];
+    s->rdx = env->regs[R_EDX];
+    s->rsi = env->regs[R_ESI];
+    s->rdi = env->regs[R_EDI];
+    s->rsp = env->regs[R_ESP];
+    s->rbp = env->regs[R_EBP];
+#ifdef TARGET_X86_64
+    s->r8  = env->regs[8];
+    s->r9  = env->regs[9];
+    s->r10 = env->regs[10];
+    s->r11 = env->regs[11];
+    s->r12 = env->regs[12];
+    s->r13 = env->regs[13];
+    s->r14 = env->regs[14];
+    s->r15 = env->regs[15];
+#endif
+    s->rip = env->eip;
+    s->rflags = env->eflags;
+
+    copy_segment(&s->cs, &env->segs[R_CS]);
+    copy_segment(&s->ds, &env->segs[R_DS]);
+    copy_segment(&s->es, &env->segs[R_ES]);
+    copy_segment(&s->fs, &env->segs[R_FS]);
+    copy_segment(&s->gs, &env->segs[R_GS]);
+    copy_segment(&s->ss, &env->segs[R_SS]);
+    copy_segment(&s->ldt, &env->ldt);
+    copy_segment(&s->tr, &env->tr);
+    copy_segment(&s->gdt, &env->gdt);
+    copy_segment(&s->idt, &env->idt);
+
+    s->cr[0] = env->cr[0];
+    s->cr[1] = env->cr[1];
+    s->cr[2] = env->cr[2];
+    s->cr[3] = env->cr[3];
+    s->cr[4] = env->cr[4];
+}
+
+static inline int cpu_write_qemu_note(write_core_dump_function f, CPUState *env,
+                                      target_phys_addr_t *offset, void *opaque,
+                                      int type)
+{
+    QEMUCPUState state;
+    Elf64_Nhdr *note64;
+    Elf32_Nhdr *note32;
+    void *note;
+    char *buf;
+    int descsz, note_size, name_size = 5, note_head_size;
+    const char *name = "QEMU";
+    int ret;
+
+    qemu_get_cpustate(&state, env);
+
+    descsz = sizeof(state);
+    if (type == 0) {
+        note_head_size = sizeof(Elf32_Nhdr);
+    } else {
+        note_head_size = sizeof(Elf64_Nhdr);
+    }
+    note_size = ((note_head_size + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    if (type == 0) {
+        note32 = note;
+        note32->n_namesz = cpu_to_le32(name_size);
+        note32->n_descsz = cpu_to_le32(descsz);
+        note32->n_type = 0;
+    } else {
+        note64 = note;
+        note64->n_namesz = cpu_to_le32(name_size);
+        note64->n_descsz = cpu_to_le32(descsz);
+        note64->n_type = 0;
+    }
+    buf = note;
+    buf += ((note_head_size + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf, &state, sizeof(state));
+
+    ret = f(*offset, note, note_size, opaque);
+    g_free(note);
+    if (ret < 0) {
+        return -1;
+    }
+
+    *offset += note_size;
+
+    return 0;
+}
+
+int cpu_write_elf64_qemunote(write_core_dump_function f, CPUState *env,
+                             target_phys_addr_t *offset, void *opaque)
+{
+    return cpu_write_qemu_note(f, env, offset, opaque, 1);
+}
+
+int cpu_write_elf32_qemunote(write_core_dump_function f, CPUState *env,
+                             target_phys_addr_t *offset, void *opaque)
+{
+    return cpu_write_qemu_note(f, env, offset, opaque, 0);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC][PATCH 09/16 v8] target-i386: add API to get dump info
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (7 preceding siblings ...)
  2012-03-02 10:31 ` [Qemu-devel] [RFC][PATCH 08/16 v8] target-i386: Add API to write cpu status " Wen Congyang
@ 2012-03-02 10:33 ` Wen Congyang
  2012-03-02 10:38 ` [Qemu-devel] [RFC][PATCH 10/16 v8] make gdb_id() generally available Wen Congyang
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:33 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

The dump info contains the endianness, ELF class, and architecture. The next
patch will use this information to create the vmcore.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 cpu-all.h               |    7 +++++++
 dump.h                  |   23 +++++++++++++++++++++++
 target-i386/arch_dump.c |   34 ++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 0 deletions(-)
 create mode 100644 dump.h

diff --git a/cpu-all.h b/cpu-all.h
index 6c36d73..a8566a9 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -23,6 +23,7 @@
 #include "qemu-tls.h"
 #include "cpu-common.h"
 #include "memory_mapping.h"
+#include "dump.h"
 
 /* some important defines:
  *
@@ -550,6 +551,7 @@ int cpu_write_elf64_qemunote(write_core_dump_function f, CPUState *env,
                              target_phys_addr_t *offset, void *opaque);
 int cpu_write_elf32_qemunote(write_core_dump_function f, CPUState *env,
                              target_phys_addr_t *offset, void *opaque);
+int cpu_get_dump_info(ArchDumpInfo *info);
 #else
 static inline int cpu_write_elf64_note(write_core_dump_function f,
                                        CPUState *env, int cpuid,
@@ -580,6 +582,11 @@ static inline int cpu_write_elf32_qemunote(write_core_dump_function f,
 {
     return -1;
 }
+
+static inline int cpu_get_dump_info(ArchDumpInfo *info)
+{
+    return -1;
+}
 #endif
 
 #endif /* CPU_ALL_H */
diff --git a/dump.h b/dump.h
new file mode 100644
index 0000000..28340cf
--- /dev/null
+++ b/dump.h
@@ -0,0 +1,23 @@
+/*
+ * QEMU dump
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef DUMP_H
+#define DUMP_H
+
+typedef struct ArchDumpInfo {
+    int d_machine;  /* Architecture */
+    int d_endian;   /* ELFDATA2LSB or ELFDATA2MSB */
+    int d_class;    /* ELFCLASS32 or ELFCLASS64 */
+} ArchDumpInfo;
+
+#endif
diff --git a/target-i386/arch_dump.c b/target-i386/arch_dump.c
index 274bbec..1518df7 100644
--- a/target-i386/arch_dump.c
+++ b/target-i386/arch_dump.c
@@ -13,6 +13,7 @@
 
 #include "cpu.h"
 #include "cpu-all.h"
+#include "dump.h"
 #include "elf.h"
 
 #ifdef TARGET_X86_64
@@ -397,3 +398,36 @@ int cpu_write_elf32_qemunote(write_core_dump_function f, CPUState *env,
 {
     return cpu_write_qemu_note(f, env, offset, opaque, 0);
 }
+
+int cpu_get_dump_info(ArchDumpInfo *info)
+{
+    bool lma = false;
+    RAMBlock *block;
+
+#ifdef TARGET_X86_64
+    lma = !!(first_cpu->hflags & HF_LMA_MASK);
+#endif
+
+    if (lma) {
+        info->d_machine = EM_X86_64;
+    } else {
+        info->d_machine = EM_386;
+    }
+    info->d_endian = ELFDATA2LSB;
+
+    if (lma) {
+        info->d_class = ELFCLASS64;
+    } else {
+        info->d_class = ELFCLASS32;
+
+        QLIST_FOREACH(block, &ram_list.blocks, next) {
+            if (block->offset + block->length > UINT_MAX) {
+                /* The memory size is greater than 4G */
+                info->d_class = ELFCLASS64;
+                break;
+            }
+        }
+    }
+
+    return 0;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC][PATCH 10/16 v8] make gdb_id() generally available
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (8 preceding siblings ...)
  2012-03-02 10:33 ` [Qemu-devel] [RFC][PATCH 09/16 v8] target-i386: add API to get dump info Wen Congyang
@ 2012-03-02 10:38 ` Wen Congyang
  2012-03-02 10:42 ` [Qemu-devel] [RFC][PATCH 11/16 v8] introduce a new monitor command 'dump' to dump guest's memory Wen Congyang
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:38 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

A following patch also needs this API, so make it generally available.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 gdbstub.c |    9 ---------
 gdbstub.h |    9 +++++++++
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index 7d470b6..046b036 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -1939,15 +1939,6 @@ static void gdb_set_cpu_pc(GDBState *s, target_ulong pc)
 #endif
 }
 
-static inline int gdb_id(CPUState *env)
-{
-#if defined(CONFIG_USER_ONLY) && defined(CONFIG_USE_NPTL)
-    return env->host_tid;
-#else
-    return env->cpu_index + 1;
-#endif
-}
-
 static CPUState *find_cpu(uint32_t thread_id)
 {
     CPUState *env;
diff --git a/gdbstub.h b/gdbstub.h
index d82334f..f30bfe8 100644
--- a/gdbstub.h
+++ b/gdbstub.h
@@ -30,6 +30,15 @@ void gdb_register_coprocessor(CPUState *env,
                               gdb_reg_cb get_reg, gdb_reg_cb set_reg,
                               int num_regs, const char *xml, int g_pos);
 
+static inline int gdb_id(CPUState *env)
+{
+#if defined(CONFIG_USER_ONLY) && defined(CONFIG_USE_NPTL)
+    return env->host_tid;
+#else
+    return env->cpu_index + 1;
+#endif
+}
+
 #endif
 
 #ifdef CONFIG_USER_ONLY
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC][PATCH 11/16 v8] introduce a new monitor command 'dump' to dump guest's memory
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (9 preceding siblings ...)
  2012-03-02 10:38 ` [Qemu-devel] [RFC][PATCH 10/16 v8] make gdb_id() generally available Wen Congyang
@ 2012-03-02 10:42 ` Wen Congyang
  2012-03-02 10:43 ` [Qemu-devel] [RFC][PATCH 12/16 v8] support to cancel the current dumping Wen Congyang
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:42 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

The command's usage:
   dump [-p] file
file should start with "file:" (followed by the file's path) or "fd:"
(followed by the fd's name). If you want to use gdb to analyse the core,
please specify the -p option.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target  |    2 +-
 dump.c           |  714 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 elf.h            |    5 +
 hmp-commands.hx  |   21 ++
 hmp.c            |   10 +
 hmp.h            |    1 +
 qapi-schema.json |   14 +
 qmp-commands.hx  |   34 +++
 8 files changed, 800 insertions(+), 1 deletions(-)
 create mode 100644 dump.c

diff --git a/Makefile.target b/Makefile.target
index cfd3113..4ae59f5 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -210,7 +210,7 @@ obj-$(CONFIG_NO_KVM) += kvm-stub.o
 obj-$(CONFIG_VGA) += vga.o
 obj-y += memory.o savevm.o
 obj-y += memory_mapping.o
-obj-$(CONFIG_HAVE_CORE_DUMP) += arch_dump.o
+obj-$(CONFIG_HAVE_CORE_DUMP) += arch_dump.o dump.o
 LIBS+=-lz
 
 obj-i386-$(CONFIG_KVM) += hyperv.o
diff --git a/dump.c b/dump.c
new file mode 100644
index 0000000..42e1681
--- /dev/null
+++ b/dump.c
@@ -0,0 +1,714 @@
+/*
+ * QEMU dump
+ *
+ * Copyright Fujitsu, Corp. 2011
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include <unistd.h>
+#include "elf.h"
+#include <sys/procfs.h>
+#include <glib.h>
+#include "cpu.h"
+#include "cpu-all.h"
+#include "targphys.h"
+#include "monitor.h"
+#include "kvm.h"
+#include "dump.h"
+#include "sysemu.h"
+#include "bswap.h"
+#include "memory_mapping.h"
+#include "error.h"
+#include "qmp-commands.h"
+#include "gdbstub.h"
+
+static inline uint16_t cpu_convert_to_target16(uint16_t val, int endian)
+{
+    if (endian == ELFDATA2LSB) {
+        val = cpu_to_le16(val);
+    } else {
+        val = cpu_to_be16(val);
+    }
+
+    return val;
+}
+
+static inline uint32_t cpu_convert_to_target32(uint32_t val, int endian)
+{
+    if (endian == ELFDATA2LSB) {
+        val = cpu_to_le32(val);
+    } else {
+        val = cpu_to_be32(val);
+    }
+
+    return val;
+}
+
+static inline uint64_t cpu_convert_to_target64(uint64_t val, int endian)
+{
+    if (endian == ELFDATA2LSB) {
+        val = cpu_to_le64(val);
+    } else {
+        val = cpu_to_be64(val);
+    }
+
+    return val;
+}
+
+enum {
+    DUMP_STATE_ERROR,
+    DUMP_STATE_SETUP,
+    DUMP_STATE_CANCELLED,
+    DUMP_STATE_ACTIVE,
+    DUMP_STATE_COMPLETED,
+};
+
+typedef struct DumpState {
+    ArchDumpInfo dump_info;
+    MemoryMappingList list;
+    uint16_t phdr_num;
+    uint32_t sh_info;
+    bool have_section;
+    int state;
+    bool resume;
+    char *error;
+    target_phys_addr_t memory_offset;
+    write_core_dump_function f;
+    void (*cleanup)(void *opaque);
+    void *opaque;
+} DumpState;
+
+static DumpState *dump_get_current(void)
+{
+    static DumpState current_dump = {
+        .state = DUMP_STATE_SETUP,
+    };
+
+    return &current_dump;
+}
+
+static int dump_cleanup(DumpState *s)
+{
+    int ret = 0;
+
+    memory_mapping_list_free(&s->list);
+    s->cleanup(s->opaque);
+    if (s->resume) {
+        vm_start();
+    }
+
+    return ret;
+}
+
+static void dump_error(DumpState *s, const char *reason)
+{
+    s->state = DUMP_STATE_ERROR;
+    s->error = g_strdup(reason);
+    dump_cleanup(s);
+}
+
+static int write_elf64_header(DumpState *s)
+{
+    Elf64_Ehdr elf_header;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&elf_header, 0, sizeof(Elf64_Ehdr));
+    memcpy(&elf_header, ELFMAG, 4);
+    elf_header.e_ident[EI_CLASS] = ELFCLASS64;
+    elf_header.e_ident[EI_DATA] = s->dump_info.d_endian;
+    elf_header.e_ident[EI_VERSION] = EV_CURRENT;
+    elf_header.e_type = cpu_convert_to_target16(ET_CORE, endian);
+    elf_header.e_machine = cpu_convert_to_target16(s->dump_info.d_machine,
+                                                   endian);
+    elf_header.e_version = cpu_convert_to_target32(EV_CURRENT, endian);
+    elf_header.e_ehsize = cpu_convert_to_target16(sizeof(elf_header), endian);
+    elf_header.e_phoff = cpu_convert_to_target64(sizeof(Elf64_Ehdr), endian);
+    elf_header.e_phentsize = cpu_convert_to_target16(sizeof(Elf64_Phdr),
+                                                     endian);
+    elf_header.e_phnum = cpu_convert_to_target16(s->phdr_num, endian);
+    if (s->have_section) {
+        uint64_t shoff = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr) * s->sh_info;
+
+        elf_header.e_shoff = cpu_convert_to_target64(shoff, endian);
+        elf_header.e_shentsize = cpu_convert_to_target16(sizeof(Elf64_Shdr),
+                                                         endian);
+        elf_header.e_shnum = cpu_convert_to_target16(1, endian);
+    }
+
+    ret = s->f(0, &elf_header, sizeof(elf_header), s->opaque);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write elf header.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_header(DumpState *s)
+{
+    Elf32_Ehdr elf_header;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&elf_header, 0, sizeof(Elf32_Ehdr));
+    memcpy(&elf_header, ELFMAG, 4);
+    elf_header.e_ident[EI_CLASS] = ELFCLASS32;
+    elf_header.e_ident[EI_DATA] = endian;
+    elf_header.e_ident[EI_VERSION] = EV_CURRENT;
+    elf_header.e_type = cpu_convert_to_target16(ET_CORE, endian);
+    elf_header.e_machine = cpu_convert_to_target16(s->dump_info.d_machine,
+                                                   endian);
+    elf_header.e_version = cpu_convert_to_target32(EV_CURRENT, endian);
+    elf_header.e_ehsize = cpu_convert_to_target16(sizeof(elf_header), endian);
+    elf_header.e_phoff = cpu_convert_to_target32(sizeof(Elf32_Ehdr), endian);
+    elf_header.e_phentsize = cpu_convert_to_target16(sizeof(Elf32_Phdr),
+                                                     endian);
+    elf_header.e_phnum = cpu_convert_to_target16(s->phdr_num, endian);
+    if (s->have_section) {
+        uint32_t shoff = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr) * s->sh_info;
+
+        elf_header.e_shoff = cpu_convert_to_target32(shoff, endian);
+        elf_header.e_shentsize = cpu_convert_to_target16(sizeof(Elf32_Shdr),
+                                                         endian);
+        elf_header.e_shnum = cpu_convert_to_target16(1, endian);
+    }
+
+    ret = s->f(0, &elf_header, sizeof(elf_header), s->opaque);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write elf header.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf64_load(DumpState *s, MemoryMapping *memory_mapping,
+                            int phdr_index, target_phys_addr_t offset)
+{
+    Elf64_Phdr phdr;
+    off_t phdr_offset;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&phdr, 0, sizeof(Elf64_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_LOAD, endian);
+    phdr.p_offset = cpu_convert_to_target64(offset, endian);
+    phdr.p_paddr = cpu_convert_to_target64(memory_mapping->phys_addr, endian);
+    if (offset == -1) {
+        phdr.p_filesz = 0;
+    } else {
+        phdr.p_filesz = cpu_convert_to_target64(memory_mapping->length, endian);
+    }
+    phdr.p_memsz = cpu_convert_to_target64(memory_mapping->length, endian);
+    phdr.p_vaddr = cpu_convert_to_target64(memory_mapping->virt_addr, endian);
+
+    phdr_offset = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr)*phdr_index;
+    ret = s->f(phdr_offset, &phdr, sizeof(Elf64_Phdr), s->opaque);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_load(DumpState *s, MemoryMapping *memory_mapping,
+                            int phdr_index, target_phys_addr_t offset)
+{
+    Elf32_Phdr phdr;
+    off_t phdr_offset;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&phdr, 0, sizeof(Elf32_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_LOAD, endian);
+    phdr.p_offset = cpu_convert_to_target32(offset, endian);
+    phdr.p_paddr = cpu_convert_to_target32(memory_mapping->phys_addr, endian);
+    if (offset == -1) {
+        phdr.p_filesz = 0;
+    } else {
+        phdr.p_filesz = cpu_convert_to_target32(memory_mapping->length, endian);
+    }
+    phdr.p_memsz = cpu_convert_to_target32(memory_mapping->length, endian);
+    phdr.p_vaddr = cpu_convert_to_target32(memory_mapping->virt_addr, endian);
+
+    phdr_offset = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr)*phdr_index;
+    ret = s->f(phdr_offset, &phdr, sizeof(Elf32_Phdr), s->opaque);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf64_notes(DumpState *s, int phdr_index,
+                             target_phys_addr_t *offset)
+{
+    CPUState *env;
+    int ret;
+    target_phys_addr_t begin = *offset;
+    Elf64_Phdr phdr;
+    off_t phdr_offset;
+    int id;
+    int endian = s->dump_info.d_endian;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        id = gdb_id(env);
+        ret = cpu_write_elf64_note(s->f, env, id, offset, s->opaque);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write elf notes.\n");
+            return -1;
+        }
+    }
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        ret = cpu_write_elf64_qemunote(s->f, env, offset, s->opaque);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write CPU status.\n");
+            return -1;
+        }
+    }
+
+    memset(&phdr, 0, sizeof(Elf64_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_NOTE, endian);
+    phdr.p_offset = cpu_convert_to_target64(begin, endian);
+    phdr.p_paddr = 0;
+    phdr.p_filesz = cpu_convert_to_target64(*offset - begin, endian);
+    phdr.p_memsz = cpu_convert_to_target64(*offset - begin, endian);
+    phdr.p_vaddr = 0;
+
+    phdr_offset = sizeof(Elf64_Ehdr);
+    ret = s->f(phdr_offset, &phdr, sizeof(Elf64_Phdr), s->opaque);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_notes(DumpState *s, int phdr_index,
+                             target_phys_addr_t *offset)
+{
+    CPUState *env;
+    int ret;
+    target_phys_addr_t begin = *offset;
+    Elf32_Phdr phdr;
+    off_t phdr_offset;
+    int id;
+    int endian = s->dump_info.d_endian;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        id = gdb_id(env);
+        ret = cpu_write_elf32_note(s->f, env, id, offset, s->opaque);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write elf notes.\n");
+            return -1;
+        }
+    }
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        ret = cpu_write_elf32_qemunote(s->f, env, offset, s->opaque);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write CPU status.\n");
+            return -1;
+        }
+    }
+
+    memset(&phdr, 0, sizeof(Elf32_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_NOTE, endian);
+    phdr.p_offset = cpu_convert_to_target32(begin, endian);
+    phdr.p_paddr = 0;
+    phdr.p_filesz = cpu_convert_to_target32(*offset - begin, endian);
+    phdr.p_memsz = cpu_convert_to_target32(*offset - begin, endian);
+    phdr.p_vaddr = 0;
+
+    phdr_offset = sizeof(Elf32_Ehdr);
+    ret = s->f(phdr_offset, &phdr, sizeof(Elf32_Phdr), s->opaque);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf_section(DumpState *s, target_phys_addr_t *offset, int type)
+{
+    Elf32_Shdr shdr32;
+    Elf64_Shdr shdr64;
+    int endian = s->dump_info.d_endian;
+    int shdr_size;
+    void *shdr;
+    int ret;
+
+    if (type == 0) {
+        shdr_size = sizeof(Elf32_Shdr);
+        memset(&shdr32, 0, shdr_size);
+        shdr32.sh_info = cpu_convert_to_target32(s->sh_info, endian);
+        shdr = &shdr32;
+    } else {
+        shdr_size = sizeof(Elf64_Shdr);
+        memset(&shdr64, 0, shdr_size);
+        shdr64.sh_info = cpu_convert_to_target32(s->sh_info, endian);
+        shdr = &shdr64;
+    }
+
+    ret = s->f(*offset, shdr, shdr_size, s->opaque);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write section header table.\n");
+        return -1;
+    }
+
+    *offset += shdr_size;
+    return 0;
+}
+
+static int write_data(DumpState *s, void *buf, int length,
+                      target_phys_addr_t *offset)
+{
+    int ret;
+
+    ret = s->f(*offset, buf, length, s->opaque);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to save memory.\n");
+        return -1;
+    }
+
+    *offset += length;
+    return 0;
+}
+
+/* write the memory to vmcore. 1 page per I/O. */
+static int write_memory(DumpState *s, RAMBlock *block,
+                        target_phys_addr_t *offset)
+{
+    int i, ret;
+
+    for (i = 0; i < block->length / TARGET_PAGE_SIZE; i++) {
+        ret = write_data(s, block->host + i * TARGET_PAGE_SIZE,
+                         TARGET_PAGE_SIZE, offset);
+        if (ret < 0) {
+            return -1;
+        }
+    }
+
+    if ((block->length % TARGET_PAGE_SIZE) != 0) {
+        ret = write_data(s, block->host + i * TARGET_PAGE_SIZE,
+                         block->length % TARGET_PAGE_SIZE, offset);
+        if (ret < 0) {
+            return -1;
+        }
+    }
+
+    return 0;
+}
+
+/* get the memory's offset in the vmcore */
+static target_phys_addr_t get_offset(target_phys_addr_t phys_addr,
+                                     target_phys_addr_t memory_offset)
+{
+    RAMBlock *block;
+    target_phys_addr_t offset = memory_offset;
+
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        if (phys_addr >= block->offset &&
+            phys_addr < block->offset + block->length) {
+            return phys_addr - block->offset + offset;
+        }
+        offset += block->length;
+    }
+
+    return -1;
+}
+
+/* write elf header, PT_NOTE and elf note to vmcore. */
+static int dump_begin(DumpState *s)
+{
+    target_phys_addr_t offset;
+    int ret;
+
+    s->state = DUMP_STATE_ACTIVE;
+
+    /*
+     * the vmcore's format is:
+     *   --------------
+     *   |  elf header |
+     *   --------------
+     *   |  PT_NOTE    |
+     *   --------------
+     *   |  PT_LOAD    |
+     *   --------------
+     *   |  ......     |
+     *   --------------
+     *   |  PT_LOAD    |
+     *   --------------
+     *   |  sec_hdr    |
+     *   --------------
+     *   |  elf note   |
+     *   --------------
+     *   |  memory     |
+     *   --------------
+     *
+     * we only know where the memory is saved after we write elf note into
+     * vmcore.
+     */
+
+    /* write elf header to vmcore */
+    if (s->dump_info.d_class == ELFCLASS64) {
+        ret = write_elf64_header(s);
+    } else {
+        ret = write_elf32_header(s);
+    }
+    if (ret < 0) {
+        return -1;
+    }
+
+    /* write elf section and notes to vmcore */
+    if (s->dump_info.d_class == ELFCLASS64) {
+        if (s->have_section) {
+            offset = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr)*s->sh_info;
+            if (write_elf_section(s, &offset, 1) < 0) {
+                return -1;
+            }
+        } else {
+            offset = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr)*s->phdr_num;
+        }
+        ret = write_elf64_notes(s, 0, &offset);
+    } else {
+        if (s->have_section) {
+            offset = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr)*s->sh_info;
+            if (write_elf_section(s, &offset, 0) < 0) {
+                return -1;
+            }
+        } else {
+            offset = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr)*s->phdr_num;
+        }
+        ret = write_elf32_notes(s, 0, &offset);
+    }
+
+    if (ret < 0) {
+        return -1;
+    }
+
+    s->memory_offset = offset;
+    return 0;
+}
+
+/* write PT_LOAD to vmcore */
+static int dump_completed(DumpState *s)
+{
+    target_phys_addr_t offset;
+    MemoryMapping *memory_mapping;
+    int phdr_index = 1, ret;
+
+    QTAILQ_FOREACH(memory_mapping, &s->list.head, next) {
+        offset = get_offset(memory_mapping->phys_addr, s->memory_offset);
+        if (s->dump_info.d_class == ELFCLASS64) {
+            ret = write_elf64_load(s, memory_mapping, phdr_index++, offset);
+        } else {
+            ret = write_elf32_load(s, memory_mapping, phdr_index++, offset);
+        }
+        if (ret < 0) {
+            return -1;
+        }
+    }
+
+    s->state = DUMP_STATE_COMPLETED;
+    dump_cleanup(s);
+    return 0;
+}
+
+/* write all memory to vmcore */
+static int dump_iterate(DumpState *s)
+{
+    RAMBlock *block;
+    target_phys_addr_t offset = s->memory_offset;
+    int ret;
+
+    /* write all memory to vmcore */
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        ret = write_memory(s, block, &offset);
+        if (ret < 0) {
+            return -1;
+        }
+    }
+
+    return dump_completed(s);
+}
+
+static int create_vmcore(DumpState *s)
+{
+    int ret;
+
+    ret = dump_begin(s);
+    if (ret < 0) {
+        return -1;
+    }
+
+    ret = dump_iterate(s);
+    if (ret < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
+static DumpState *dump_init(bool paging, Error **errp)
+{
+    CPUState *env;
+    DumpState *s = dump_get_current();
+    int ret;
+
+    if (runstate_is_running()) {
+        vm_stop(RUN_STATE_PAUSED);
+        s->resume = true;
+    } else {
+        s->resume = false;
+    }
+    s->state = DUMP_STATE_SETUP;
+    if (s->error) {
+        g_free(s->error);
+        s->error = NULL;
+    }
+
+    /*
+     * get dump info: endian, class and architecture.
+     * If the target architecture is not supported, cpu_get_dump_info() will
+     * return -1.
+     *
+     * if we use KVM, we should synchronize the registers before we get dump
+     * info.
+     */
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        cpu_synchronize_state(env);
+    }
+
+    ret = cpu_get_dump_info(&s->dump_info);
+    if (ret < 0) {
+        error_set(errp, QERR_UNSUPPORTED);
+        return NULL;
+    }
+
+    /* get memory mapping */
+    memory_mapping_list_init(&s->list);
+    if (paging) {
+        qemu_get_guest_memory_mapping(&s->list);
+    } else {
+        qemu_get_guest_simple_memory_mapping(&s->list);
+    }
+
+    /*
+     * calculate phdr_num
+     *
+     * the type of ehdr->e_phnum is uint16_t, so we should avoid overflow
+     */
+    s->phdr_num = 1; /* PT_NOTE */
+    if (s->list.num < (1 << 16) - 2) {
+        s->phdr_num += s->list.num;
+        s->have_section = false;
+    } else {
+        s->have_section = true;
+        s->phdr_num = PN_XNUM;
+
+        /* the type of shdr->sh_info is uint32_t, so we should avoid overflow */
+        if (s->list.num > (1ULL << 32) - 2) {
+            s->sh_info = 0xffffffff;
+        } else {
+            s->sh_info = 1 + s->list.num; /* PT_NOTE + PT_LOAD segments */
+        }
+    }
+
+    return s;
+}
+
+static int fd_write_vmcore(target_phys_addr_t offset, void *buf, size_t size,
+                           void *opaque)
+{
+    int fd = (int)(intptr_t)opaque;
+    int ret;
+
+    ret = lseek(fd, offset, SEEK_SET);
+    if (ret < 0) {
+        return -1;
+    }
+
+    ret = write(fd, buf, size);
+    if (ret != size) {
+        return -1;
+    }
+
+    return 0;
+}
+
+static void fd_cleanup(void *opaque)
+{
+    int fd = (int)(intptr_t)opaque;
+
+    if (fd != -1) {
+        close(fd);
+    }
+}
+
+static DumpState *dump_init_fd(int fd, bool paging, Error **errp)
+{
+    DumpState *s = dump_init(paging, errp);
+
+    if (s == NULL) {
+        return NULL;
+    }
+
+    s->f = fd_write_vmcore;
+    s->cleanup = fd_cleanup;
+    s->opaque = (void *)(intptr_t)fd;
+
+    return s;
+}
+
+void qmp_dump(bool paging, const char *file, Error **errp)
+{
+    const char *p;
+    int fd = -1;
+    DumpState *s;
+
+#if !defined(WIN32)
+    if (strstart(file, "fd:", &p)) {
+        fd = qemu_get_fd(p);
+        if (fd == -1) {
+            error_set(errp, QERR_FD_NOT_FOUND, p);
+            return;
+        }
+    }
+#endif
+
+    if  (strstart(file, "file:", &p)) {
+        fd = open(p, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR);
+        if (fd < 0) {
+            error_set(errp, QERR_OPEN_FILE_FAILED, p);
+            return;
+        }
+    }
+
+    if (fd == -1) {
+        error_set(errp, QERR_INVALID_PARAMETER, "file");
+        return;
+    }
+
+    s = dump_init_fd(fd, paging, errp);
+    if (!s) {
+        return;
+    }
+
+    if (create_vmcore(s) < 0) {
+        error_set(errp, QERR_IO_ERROR);
+    }
+}
diff --git a/elf.h b/elf.h
index 2e05d34..6a10657 100644
--- a/elf.h
+++ b/elf.h
@@ -1000,6 +1000,11 @@ typedef struct elf64_sym {
 
 #define EI_NIDENT	16
 
+/* Special value for e_phnum.  This indicates that the real number of
+   program headers is too large to fit into e_phnum.  Instead the real
+   value is in the field sh_info of section 0.  */
+#define PN_XNUM         0xffff
+
 typedef struct elf32_hdr{
   unsigned char	e_ident[EI_NIDENT];
   Elf32_Half	e_type;
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 64b3656..9a1e696 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -880,6 +880,27 @@ server will ask the spice/vnc client to automatically reconnect using the
 new parameters (if specified) once the vm migration finished successfully.
 ETEXI
 
+#if defined(CONFIG_HAVE_CORE_DUMP)
+    {
+        .name       = "dump",
+        .args_type  = "paging:-p,file:s",
+        .params     = "[-p] file",
+        .help       = "dump to file",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd = hmp_dump,
+    },
+
+
+STEXI
+@item dump [-p] @var{file}
+@findex dump
Dump guest memory to @var{file}. The file can be processed with crash or gdb.
    file: destination file (begins with "file:") or destination file
          descriptor (begins with "fd:")
  paging: do paging to get the guest's memory mapping
+ETEXI
+#endif
+
     {
         .name       = "snapshot_blkdev",
         .args_type  = "device:B,snapshot-file:s?,format:s?",
diff --git a/hmp.c b/hmp.c
index 3a54455..a27f6c5 100644
--- a/hmp.c
+++ b/hmp.c
@@ -856,3 +856,13 @@ void hmp_block_job_cancel(Monitor *mon, const QDict *qdict)
 
     hmp_handle_error(mon, &error);
 }
+
+void hmp_dump(Monitor *mon, const QDict *qdict)
+{
+    Error *errp = NULL;
+    int paging = qdict_get_try_bool(qdict, "paging", 0);
+    const char *file = qdict_get_str(qdict, "file");
+
+    qmp_dump(!!paging, file, &errp);
+    hmp_handle_error(mon, &errp);
+}
diff --git a/hmp.h b/hmp.h
index 5409464..b055e50 100644
--- a/hmp.h
+++ b/hmp.h
@@ -59,5 +59,6 @@ void hmp_block_set_io_throttle(Monitor *mon, const QDict *qdict);
 void hmp_block_stream(Monitor *mon, const QDict *qdict);
 void hmp_block_job_set_speed(Monitor *mon, const QDict *qdict);
 void hmp_block_job_cancel(Monitor *mon, const QDict *qdict);
+void hmp_dump(Monitor *mon, const QDict *qdict);
 
 #endif
diff --git a/qapi-schema.json b/qapi-schema.json
index 5f293c4..8b51b1d 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1631,3 +1631,17 @@
 { 'command': 'qom-list-types',
   'data': { '*implements': 'str', '*abstract': 'bool' },
   'returns': [ 'ObjectTypeInfo' ] }
+
+##
+# @dump
+#
+# Dump guest's memory to vmcore.
+#
+# @paging: if true, do paging to get guest's memory mapping
+# @file: the filename or file descriptor of the vmcore.
+#
+# Returns: nothing on success
+#
+# Since: 1.1
+##
+{ 'command': 'dump', 'data': { 'paging': 'bool', 'file': 'str' } }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 0c9bfac..c877987 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -586,6 +586,40 @@ Example:
 
 EQMP
 
+#if defined(CONFIG_HAVE_CORE_DUMP)
+    {
+        .name       = "dump",
+        .args_type  = "paging:-p,file:s",
+        .params     = "[-p] file",
+        .help       = "dump to file",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = qmp_marshal_input_dump,
+    },
+
+SQMP
+dump
+----
+
+Dump guest memory to file. The file can be processed with crash or gdb.
+
+Arguments:
+
+- "paging": do paging to get guest's memory mapping (json-bool)
+- "file": destination file (begins with "file:") or destination file
+          descriptor (begins with "fd:") (json-string)
+
+Example:
+
+-> { "execute": "dump", "arguments": { "file": "fd:dump" } }
+<- { "return": {} }
+
+Notes:
+
+(1) All boolean arguments default to false
+
+EQMP
+#endif
+
     {
         .name       = "netdev_add",
         .args_type  = "netdev:O",
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC][PATCH 12/16 v8] support to cancel the current dumping
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (10 preceding siblings ...)
  2012-03-02 10:42 ` [Qemu-devel] [RFC][PATCH 11/16 v8] introduce a new monitor command 'dump' to dump guest's memory Wen Congyang
@ 2012-03-02 10:43 ` Wen Congyang
  2012-03-02 10:44 ` [Qemu-devel] [RFC][PATCH 13/16 v8] support to query dumping status Wen Congyang
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:43 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

Add API to allow the user to cancel the current dumping.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 dump.c           |   12 ++++++++++++
 hmp-commands.hx  |   14 ++++++++++++++
 hmp.c            |    5 +++++
 hmp.h            |    1 +
 qapi-schema.json |   13 +++++++++++++
 qmp-commands.hx  |   21 +++++++++++++++++++++
 6 files changed, 66 insertions(+), 0 deletions(-)

diff --git a/dump.c b/dump.c
index 42e1681..dab0c84 100644
--- a/dump.c
+++ b/dump.c
@@ -712,3 +712,15 @@ void qmp_dump(bool paging, const char *file, Error **errp)
         error_set(errp, QERR_IO_ERROR);
     }
 }
+
+void qmp_dump_cancel(Error **errp)
+{
+    DumpState *s = dump_get_current();
+
+    if (s->state != DUMP_STATE_ACTIVE) {
+        return;
+    }
+
+    s->state = DUMP_STATE_CANCELLED;
+    dump_cleanup(s);
+}
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 9a1e696..63193ec 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -902,6 +902,20 @@ ETEXI
 #endif
 
     {
+        .name       = "dump_cancel",
+        .args_type  = "",
+        .params     = "",
+        .help       = "cancel the current VM dumping",
+        .mhandler.cmd = hmp_dump_cancel,
+    },
+
+STEXI
+@item dump_cancel
+@findex dump_cancel
+Cancel the current VM dumping.
+ETEXI
+
+    {
         .name       = "snapshot_blkdev",
         .args_type  = "device:B,snapshot-file:s?,format:s?",
         .params     = "device [new-image-file] [format]",
diff --git a/hmp.c b/hmp.c
index a27f6c5..d427e49 100644
--- a/hmp.c
+++ b/hmp.c
@@ -866,3 +866,8 @@ void hmp_dump(Monitor *mon, const QDict *qdict)
     qmp_dump(!!paging, file, &errp);
     hmp_handle_error(mon, &errp);
 }
+
+void hmp_dump_cancel(Monitor *mon, const QDict *qdict)
+{
+    qmp_dump_cancel(NULL);
+}
diff --git a/hmp.h b/hmp.h
index b055e50..75c6c1d 100644
--- a/hmp.h
+++ b/hmp.h
@@ -60,5 +60,6 @@ void hmp_block_stream(Monitor *mon, const QDict *qdict);
 void hmp_block_job_set_speed(Monitor *mon, const QDict *qdict);
 void hmp_block_job_cancel(Monitor *mon, const QDict *qdict);
 void hmp_dump(Monitor *mon, const QDict *qdict);
+void hmp_dump_cancel(Monitor *mon, const QDict *qdict);
 
 #endif
diff --git a/qapi-schema.json b/qapi-schema.json
index 8b51b1d..d40ba69 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1645,3 +1645,16 @@
 # Since: 1.1
 ##
 { 'command': 'dump', 'data': { 'paging': 'bool', 'file': 'str' } }
+
+##
+# @dump_cancel
+#
+# Cancel the currently executing dump process.
+#
+# Returns: nothing on success
+#
+# Notes: This command succeeds even if there is no dumping process running.
+#
+# Since: 1.1
+##
+{ 'command': 'dump_cancel' }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index c877987..1b36262 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -621,6 +621,27 @@ EQMP
 #endif
 
     {
+        .name       = "dump_cancel",
+        .args_type  = "",
+        .mhandler.cmd_new = qmp_marshal_input_dump_cancel,
+    },
+
+SQMP
+dump_cancel
+-----------
+
+Cancel the current dump.
+
+Arguments: None.
+
+Example:
+
+-> { "execute": "dump_cancel" }
+<- { "return": {} }
+
+EQMP
+
+    {
         .name       = "netdev_add",
         .args_type  = "netdev:O",
         .params     = "[user|tap|socket],id=str[,prop=value][,...]",
-- 
1.7.1
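The cancel path above only transitions the state machine when a dump is actually in flight; in every other state `qmp_dump_cancel` returns without touching anything (and still reports success, per the schema note). A small self-contained model of that rule, using the state enum from patch 11 (`dump_cancel_model` is a hypothetical simplification, not the QEMU function):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Dump state enum as introduced in patch 11. */
enum {
    DUMP_STATE_ERROR,
    DUMP_STATE_SETUP,
    DUMP_STATE_CANCELLED,
    DUMP_STATE_ACTIVE,
    DUMP_STATE_COMPLETED,
};

/* Models qmp_dump_cancel: only an ACTIVE dump moves to CANCELLED.
 * Returns true if the state actually changed. */
static bool dump_cancel_model(int *state)
{
    if (*state != DUMP_STATE_ACTIVE) {
        return false; /* nothing to cancel; the command still succeeds */
    }
    *state = DUMP_STATE_CANCELLED;
    return true;
}

int main(void)
{
    int st = DUMP_STATE_ACTIVE;
    assert(dump_cancel_model(&st) && st == DUMP_STATE_CANCELLED);

    /* A completed dump is left untouched by a late cancel request. */
    st = DUMP_STATE_COMPLETED;
    assert(!dump_cancel_model(&st) && st == DUMP_STATE_COMPLETED);
    printf("ok\n");
    return 0;
}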


* [Qemu-devel] [RFC][PATCH 13/16 v8] support to query dumping status
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (11 preceding siblings ...)
  2012-03-02 10:43 ` [Qemu-devel] [RFC][PATCH 12/16 v8] support to cancel the current dumping Wen Congyang
@ 2012-03-02 10:44 ` Wen Congyang
  2012-03-02 10:44 ` [Qemu-devel] [RFC][PATCH 14/16 v8] run dump at the background Wen Congyang
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:44 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

Add API to allow the user to query dumping status.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 dump.c           |   32 ++++++++++++++++++++++++++++++++
 hmp-commands.hx  |    2 ++
 hmp.c            |   17 +++++++++++++++++
 hmp.h            |    1 +
 monitor.c        |    7 +++++++
 qapi-schema.json |   26 ++++++++++++++++++++++++++
 qmp-commands.hx  |   49 +++++++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 134 insertions(+), 0 deletions(-)

diff --git a/dump.c b/dump.c
index dab0c84..d569867 100644
--- a/dump.c
+++ b/dump.c
@@ -724,3 +724,35 @@ void qmp_dump_cancel(Error **errp)
     s->state = DUMP_STATE_CANCELLED;
     dump_cleanup(s);
 }
+
+DumpInfo *qmp_query_dump(Error **errp)
+{
+    DumpInfo *info = g_malloc0(sizeof(*info));
+    DumpState *s = dump_get_current();
+
+    switch (s->state) {
+    case DUMP_STATE_SETUP:
+        /* no migration has happened ever */
+        break;
+    case DUMP_STATE_ACTIVE:
+        info->has_status = true;
+        info->status = g_strdup("active");
+        break;
+    case DUMP_STATE_COMPLETED:
+        info->has_status = true;
+        info->status = g_strdup("completed");
+        break;
+    case DUMP_STATE_ERROR:
+        info->has_status = true;
+        info->status = g_strdup("failed");
+        info->has_error = true;
+        info->error = g_strdup(s->error);
+        break;
+    case DUMP_STATE_CANCELLED:
+        info->has_status = true;
+        info->status = g_strdup("cancelled");
+        break;
+    }
+
+    return info;
+}
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 63193ec..b936bb7 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1435,6 +1435,8 @@ show device tree
 show qdev device model list
 @item info roms
 show roms
+@item info dump
+show dumping status
 @end table
 ETEXI
 
diff --git a/hmp.c b/hmp.c
index d427e49..81dd23d 100644
--- a/hmp.c
+++ b/hmp.c
@@ -871,3 +871,20 @@ void hmp_dump_cancel(Monitor *mon, const QDict *qdict)
 {
     qmp_dump_cancel(NULL);
 }
+
+void hmp_info_dump(Monitor *mon)
+{
+    DumpInfo *info;
+
+    info = qmp_query_dump(NULL);
+
+    if (info->has_status) {
+        monitor_printf(mon, "Dumping status: %s\n", info->status);
+    }
+
+    if (info->has_error) {
+        monitor_printf(mon, "Dumping failed reason: %s\n", info->error);
+    }
+
+    qapi_free_DumpInfo(info);
+}
diff --git a/hmp.h b/hmp.h
index 75c6c1d..3d105a9 100644
--- a/hmp.h
+++ b/hmp.h
@@ -61,5 +61,6 @@ void hmp_block_job_set_speed(Monitor *mon, const QDict *qdict);
 void hmp_block_job_cancel(Monitor *mon, const QDict *qdict);
 void hmp_dump(Monitor *mon, const QDict *qdict);
 void hmp_dump_cancel(Monitor *mon, const QDict *qdict);
+void hmp_info_dump(Monitor *mon);
 
 #endif
diff --git a/monitor.c b/monitor.c
index 96af5e0..f240895 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2603,6 +2603,13 @@ static mon_cmd_t info_cmds[] = {
         .mhandler.info = do_trace_print_events,
     },
     {
+        .name       = "dump",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show dumping status",
+        .mhandler.info = hmp_info_dump,
+    },
+    {
         .name       = NULL,
     },
 };
diff --git a/qapi-schema.json b/qapi-schema.json
index d40ba69..41428ed 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1658,3 +1658,29 @@
 # Since: 1.1
 ##
 { 'command': 'dump_cancel' }
+
+##
+# @DumpInfo
+#
+# Information about the current dump process.
+#
+# @status: #optional string describing the current dump status.
+#          As of 1.1 this can be 'active', 'completed', 'failed' or
+#          'cancelled'. If this field is not returned, no dump process
+#          has been initiated.
+#
+# Since: 1.1
+##
+{ 'type': 'DumpInfo',
+  'data': { '*status': 'str', '*error': 'str' } }
+
+##
+# @query-dump
+#
+# Returns information about current dumping process.
+#
+# Returns: @DumpInfo
+#
+# Since: 1.1
+##
+{ 'command': 'query-dump', 'returns': 'DumpInfo' }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 1b36262..c532561 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -616,6 +616,8 @@ Example:
 Notes:
 
 (1) All boolean arguments default to false
+(2) The 'query-dump' command should be used to check the dump's progress
+    and final result (this information is provided by the 'status' member)
 
 EQMP
 #endif
@@ -2090,6 +2092,53 @@ EQMP
     },
 
 SQMP
+query-dump
+----------
+
+Dumping status.
+
+Return a json-object.
+
+The main json-object contains the following:
+
+- "status": dump status (json-string)
+     - Possible values: "active", "completed", "failed", "cancelled"
+
+Examples:
+
+1. Before the first dump
+
+-> { "execute": "query-dump" }
+<- { "return": {} }
+
+2. Dump is done and has succeeded
+
+-> { "execute": "query-dump" }
+<- { "return": { "status": "completed" } }
+
+3. Dump is done and has failed
+
+-> { "execute": "query-dump" }
+<- { "return": { "status": "failed" } }
+
+4. Dump is being performed:
+
+-> { "execute": "query-dump" }
+<- {
+      "return":{
+         "status":"active"
+      }
+   }
+
+EQMP
+
+    {
+        .name       = "query-dump",
+        .args_type  = "",
+        .mhandler.cmd_new = qmp_marshal_input_query_dump,
+    },
+
+SQMP
 query-balloon
 -------------
 
-- 
1.7.1


* [Qemu-devel] [RFC][PATCH 14/16 v8] run dump at the background
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (12 preceding siblings ...)
  2012-03-02 10:44 ` [Qemu-devel] [RFC][PATCH 13/16 v8] support to query dumping status Wen Congyang
@ 2012-03-02 10:44 ` Wen Congyang
  2012-03-02 10:45 ` [Qemu-devel] [RFC][PATCH 15/16 v8] support detached dump Wen Congyang
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:44 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

The new monitor command dump may take a long time to finish, so we need to
run it in the background.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 dump.c |  168 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
 vl.c   |    5 +-
 2 files changed, 150 insertions(+), 23 deletions(-)

diff --git a/dump.c b/dump.c
index d569867..4c8038d 100644
--- a/dump.c
+++ b/dump.c
@@ -80,9 +80,21 @@ typedef struct DumpState {
     bool resume;
     char *error;
     target_phys_addr_t memory_offset;
+
+    /*
+     * Return value:
+     * -2: EAGAIN
+     * -1: error
+     *  0: success
+     */
     write_core_dump_function f;
     void (*cleanup)(void *opaque);
+    int (*dump_begin_iterate)(struct DumpState *, void *opaque);
     void *opaque;
+    RAMBlock *block;
+    ram_addr_t start;
+    target_phys_addr_t offset;
+    VMChangeStateEntry *handler;
 } DumpState;
 
 static DumpState *dump_get_current(void)
@@ -100,6 +112,12 @@ static int dump_cleanup(DumpState *s)
 
     memory_mapping_list_free(&s->list);
     s->cleanup(s->opaque);
+
+    if (s->handler) {
+        qemu_del_vm_change_state_handler(s->handler);
+        s->handler = NULL;
+    }
+
     if (s->resume) {
         vm_start();
     }
@@ -373,40 +391,70 @@ static int write_elf_section(DumpState *s, target_phys_addr_t *offset, int type)
     return 0;
 }
 
+/*
+ * Return value:
+ *     -2: blocked
+ *     -1: failed
+ *      0: success
+ */
 static int write_data(DumpState *s, void *buf, int length,
                       target_phys_addr_t *offset)
 {
     int ret;
 
     ret = s->f(*offset, buf, length, s->opaque);
-    if (ret < 0) {
+    if (ret == -1) {
         dump_error(s, "dump: failed to save memory.\n");
         return -1;
     }
 
+    if (ret == -2) {
+        return -2;
+    }
+
     *offset += length;
     return 0;
 }
 
 /* write the memory to vmcore. 1 page per I/O. */
-static int write_memory(DumpState *s, RAMBlock *block,
-                        target_phys_addr_t *offset)
+static int write_memory(DumpState *s, RAMBlock *block, ram_addr_t start,
+                        target_phys_addr_t *offset, int64_t *size,
+                        int64_t deadline)
 {
     int i, ret;
+    int64_t writen_size = 0;
+    int64_t time;
 
-    for (i = 0; i < block->length / TARGET_PAGE_SIZE; i++) {
-        ret = write_data(s, block->host + i * TARGET_PAGE_SIZE,
+    *size = block->length - start;
+    for (i = 0; i < *size / TARGET_PAGE_SIZE; i++) {
+        ret = write_data(s, block->host + start + i * TARGET_PAGE_SIZE,
                          TARGET_PAGE_SIZE, offset);
         if (ret < 0) {
-            return -1;
+            *size = writen_size;
+            return ret;
+        }
+
+        writen_size += TARGET_PAGE_SIZE;
+        time = qemu_get_clock_ms(rt_clock);
+        if (time >= deadline) {
+            /* time out */
+            *size = writen_size;
+            return -2;
         }
     }
 
-    if ((block->length % TARGET_PAGE_SIZE) != 0) {
-        ret = write_data(s, block->host + i * TARGET_PAGE_SIZE,
-                         block->length % TARGET_PAGE_SIZE, offset);
+    if ((*size % TARGET_PAGE_SIZE) != 0) {
+        ret = write_data(s, block->host + start + i * TARGET_PAGE_SIZE,
+                         *size % TARGET_PAGE_SIZE, offset);
         if (ret < 0) {
-            return -1;
+            *size = writen_size;
+            return ret;
+        }
+
+        time = qemu_get_clock_ms(rt_clock);
+        if (time >= deadline) {
+            /* time out */
+            return -2;
         }
     }
 
@@ -501,6 +549,7 @@ static int dump_begin(DumpState *s)
     }
 
     s->memory_offset = offset;
+    s->offset = offset;
     return 0;
 }
 
@@ -528,22 +577,65 @@ static int dump_completed(DumpState *s)
     return 0;
 }
 
-/* write all memory to vmcore */
-static int dump_iterate(DumpState *s)
+static int get_next_block(DumpState *s, RAMBlock *block)
+{
+    while (1) {
+        block = QLIST_NEXT(block, next);
+        if (!block) {
+            /* no more block */
+            return 1;
+        }
+
+        s->start = 0;
+        s->block = block;
+
+        return 0;
+    }
+}
+
+/* write memory to vmcore */
+static void dump_iterate(void *opaque)
 {
+    DumpState *s = opaque;
     RAMBlock *block;
-    target_phys_addr_t offset = s->memory_offset;
+    target_phys_addr_t offset = s->offset;
+    int64_t size;
+    int64_t deadline, now;
     int ret;
 
-    /* write all memory to vmcore */
-    QLIST_FOREACH(block, &ram_list.blocks, next) {
-        ret = write_memory(s, block, &offset);
-        if (ret < 0) {
-            return -1;
+    now = qemu_get_clock_ms(rt_clock);
+    deadline = now + 5;
+    while (1) {
+        block = s->block;
+        ret = write_memory(s, block, s->start, &offset, &size, deadline);
+        if (ret == -1) {
+            return;
+        }
+
+        if (ret == -2) {
+            break;
+        }
+
+        ret = get_next_block(s, block);
+        if (ret == 1) {
+            dump_completed(s);
+            return;
         }
     }
 
-    return dump_completed(s);
+    if (size == block->length - s->start) {
+        ret = get_next_block(s, block);
+        if (ret == 1) {
+            dump_completed(s);
+            return;
+        }
+    } else {
+        s->start += size;
+    }
+
+    s->offset = offset;
+
+    return;
 }
 
 static int create_vmcore(DumpState *s)
@@ -555,7 +647,7 @@ static int create_vmcore(DumpState *s)
         return -1;
     }
 
-    ret = dump_iterate(s);
+    ret = s->dump_begin_iterate(s, s->opaque);
     if (ret < 0) {
         return -1;
     }
@@ -563,6 +655,17 @@ static int create_vmcore(DumpState *s)
     return 0;
 }
 
+static void dump_vm_state_change(void *opaque, int running, RunState state)
+{
+    DumpState *s = opaque;
+
+    if (running) {
+        qmp_dump_cancel(NULL);
+        s->state = DUMP_STATE_ERROR;
+        s->error = g_strdup("vm state changed to running\n");
+    }
+}
+
 static DumpState *dump_init(bool paging, Error **errp)
 {
     CPUState *env;
@@ -580,6 +683,9 @@ static DumpState *dump_init(bool paging, Error **errp)
         g_free(s->error);
         s->error = NULL;
     }
+    s->block = QLIST_FIRST(&ram_list.blocks);
+    s->start = 0;
+    s->handler = qemu_add_vm_change_state_handler(dump_vm_state_change, s);
 
     /*
      * get dump info: endian, class and architecture.
@@ -639,14 +745,24 @@ static int fd_write_vmcore(target_phys_addr_t offset, void *buf, size_t size,
 
     ret = lseek(fd, offset, SEEK_SET);
     if (ret < 0) {
+        if (errno == EAGAIN || errno == EWOULDBLOCK) {
+            return -2;
+        }
         return -1;
     }
 
     ret = write(fd, buf, size);
-    if (ret != size) {
+    if (ret < 0) {
+        if (errno == EAGAIN || errno == EWOULDBLOCK) {
+            return -2;
+        }
         return -1;
     }
 
+    if (ret != size) {
+        return -2;
+    }
+
     return 0;
 }
 
@@ -655,10 +771,18 @@ static void fd_cleanup(void *opaque)
     int fd = (int)(intptr_t)opaque;
 
     if (fd != -1) {
+        qemu_set_fd_handler(fd, NULL, NULL, NULL);
         close(fd);
     }
 }
 
+static int fd_dump_begin_iterate(DumpState *s, void *opaque)
+{
+    int fd = (int)(intptr_t)opaque;
+
+    return qemu_set_fd_handler(fd, NULL, dump_iterate, s);
+}
+
 static DumpState *dump_init_fd(int fd, bool paging, Error **errp)
 {
     DumpState *s = dump_init(paging, errp);
@@ -669,7 +793,9 @@ static DumpState *dump_init_fd(int fd, bool paging, Error **errp)
 
     s->f = fd_write_vmcore;
     s->cleanup = fd_cleanup;
+    s->dump_begin_iterate = fd_dump_begin_iterate;
     s->opaque = (void *)(intptr_t)fd;
+    fcntl(fd, F_SETFL, O_NONBLOCK);
 
     return s;
 }
diff --git a/vl.c b/vl.c
index 4a77696..0f31906 100644
--- a/vl.c
+++ b/vl.c
@@ -1249,11 +1249,12 @@ void qemu_del_vm_change_state_handler(VMChangeStateEntry *e)
 
 void vm_state_notify(int running, RunState state)
 {
-    VMChangeStateEntry *e;
+    VMChangeStateEntry *e, *next;
 
     trace_vm_state_notify(running, state);
 
-    for (e = vm_change_state_head.lh_first; e; e = e->entries.le_next) {
+    /* e->cb() may remove itself */
+    QLIST_FOREACH_SAFE(e, &vm_change_state_head, entries, next) {
         e->cb(e->opaque, running, state);
     }
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread
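
The core idea of this patch is that write_memory() copies pages only until a small deadline, returns -2, and dump_iterate() resumes from the recorded block/start on the next callback. The sketch below captures that shape in isolation; a page-count budget stands in for the 5 ms rt_clock deadline, and none of the names are QEMU's:

```c
/*
 * Sketch of the time-sliced loop in dump_iterate()/write_memory():
 * each invocation copies pages only until a small budget is exhausted,
 * then returns -2 so the main loop regains control; progress is kept
 * in the state and the copy resumes on the next call.  A page-count
 * budget stands in for the 5 ms deadline; hypothetical names.
 */
#include <stdint.h>

enum { SLICE_DONE = 0, SLICE_AGAIN = -2 };

typedef struct {
    int64_t total_pages;  /* pages to dump in all                      */
    int64_t copied;       /* progress across calls, like s->start      */
} CopyState;

static int copy_slice(CopyState *s, int64_t budget)
{
    while (s->copied < s->total_pages) {
        if (budget-- <= 0) {
            return SLICE_AGAIN;  /* "deadline" hit: resume next time   */
        }
        s->copied++;             /* stands in for one page-sized write */
    }
    return SLICE_DONE;           /* all memory written                 */
}
```

Each SLICE_AGAIN return corresponds to dump_iterate() giving control back to the main loop and being re-invoked by the fd handler.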

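The fd_write_vmcore() hunk above also establishes a three-way return convention once the fd is switched to O_NONBLOCK: 0 on success, -1 on a hard error, -2 when the write would block and must be retried from the same vmcore offset. A minimal standalone sketch of that convention (hypothetical helper, not the QEMU code):

```c
/*
 * Sketch of the three-way return convention fd_write_vmcore() adopts
 * once the fd is O_NONBLOCK: 0 on success, -1 on a hard error, -2 when
 * the write would block and should be retried later from the same
 * offset.  Hypothetical helper, not QEMU code.
 */
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

enum { WRITE_OK = 0, WRITE_ERR = -1, WRITE_AGAIN = -2 };

static int write_chunk(int fd, const void *buf, size_t size)
{
    ssize_t ret = write(fd, buf, size);

    if (ret < 0) {
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return WRITE_AGAIN;  /* not an error: re-arm the fd handler */
        }
        return WRITE_ERR;
    }
    if ((size_t)ret != size) {
        /* short write: caller keeps its offset and retries the chunk */
        return WRITE_AGAIN;
    }
    return WRITE_OK;
}
```

As in the patch, a short write is reported as retryable; that is safe there because the caller seeks to the recorded offset before every attempt, so a retry rewrites the same region of a seekable file.
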
* [Qemu-devel] [RFC][PATCH 15/16 v8] support detached dump
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (13 preceding siblings ...)
  2012-03-02 10:44 ` [Qemu-devel] [RFC][PATCH 14/16 v8] run dump at the background Wen Congyang
@ 2012-03-02 10:45 ` Wen Congyang
  2012-03-02 10:46 ` [Qemu-devel] [RFC][PATCH 16/16 v8] allow user to dump a fraction of the memory Wen Congyang
  2012-03-05  9:12 ` [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:45 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

Let the user choose whether to block other monitor commands while dumping.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 dump.c           |    2 +-
 hmp-commands.hx  |    9 +++++----
 hmp.c            |   49 +++++++++++++++++++++++++++++++++++++++++++++++--
 qapi-schema.json |    4 +++-
 qmp-commands.hx  |    6 ++++--
 5 files changed, 60 insertions(+), 10 deletions(-)

diff --git a/dump.c b/dump.c
index 4c8038d..40eefb9 100644
--- a/dump.c
+++ b/dump.c
@@ -800,7 +800,7 @@ static DumpState *dump_init_fd(int fd, bool paging, Error **errp)
     return s;
 }
 
-void qmp_dump(bool paging, const char *file, Error **errp)
+void qmp_dump(bool detach, bool paging, const char *file, Error **errp)
 {
     const char *p;
     int fd = -1;
diff --git a/hmp-commands.hx b/hmp-commands.hx
index b936bb7..fd6f9f1 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -883,21 +883,22 @@ ETEXI
 #if defined(CONFIG_HAVE_CORE_DUMP)
     {
         .name       = "dump",
-        .args_type  = "paging:-p,file:s",
-        .params     = "[-p] file",
-        .help       = "dump to file",
+        .args_type  = "detach:-d,paging:-p,file:s",
+        .params     = "[-d] [-p] file",
+        .help       = "dump to file (using -d to not wait for completion)",
         .user_print = monitor_user_noop,
         .mhandler.cmd = hmp_dump,
     },
 
 
 STEXI
-@item dump [-p] @var{file}
+@item dump [-d] [-p] @var{file}
 @findex dump
 Dump to @var{file}. The file can be processed with crash or gdb.
     file: destination file (starting with "file:") or destination file descriptor
           (starting with "fd:")
   paging: do paging to get guest's memory mapping
+      -d: do not wait for completion.
 ETEXI
 #endif
 
diff --git a/hmp.c b/hmp.c
index 81dd23d..8652b20 100644
--- a/hmp.c
+++ b/hmp.c
@@ -857,14 +857,59 @@ void hmp_block_job_cancel(Monitor *mon, const QDict *qdict)
     hmp_handle_error(mon, &error);
 }
 
+typedef struct DumpingStatus
+{
+    QEMUTimer *timer;
+    Monitor *mon;
+} DumpingStatus;
+
+static void hmp_dumping_status_cb(void *opaque)
+{
+    DumpingStatus *status = opaque;
+    DumpInfo *info;
+
+    info = qmp_query_dump(NULL);
+    if (!info->has_status || strcmp(info->status, "active") == 0) {
+        qemu_mod_timer(status->timer, qemu_get_clock_ms(rt_clock) + 1000);
+    } else {
+        monitor_resume(status->mon);
+        qemu_del_timer(status->timer);
+        g_free(status);
+    }
+
+    qapi_free_DumpInfo(info);
+}
+
 void hmp_dump(Monitor *mon, const QDict *qdict)
 {
     Error *errp = NULL;
     int paging = qdict_get_try_bool(qdict, "paging", 0);
+    int detach = qdict_get_try_bool(qdict, "detach", 0);
     const char *file = qdict_get_str(qdict, "file");
 
-    qmp_dump(!!paging, file, &errp);
-    hmp_handle_error(mon, &errp);
+    qmp_dump(!!detach, !!paging, file, &errp);
+    if (errp) {
+        hmp_handle_error(mon, &errp);
+        return;
+    }
+
+    if (!detach) {
+        DumpingStatus *status;
+        int ret;
+
+        ret = monitor_suspend(mon);
+        if (ret < 0) {
+            monitor_printf(mon, "terminal does not allow synchronous "
+                           "dump, continuing detached\n");
+            return;
+        }
+
+        status = g_malloc0(sizeof(*status));
+        status->mon = mon;
+        status->timer = qemu_new_timer_ms(rt_clock, hmp_dumping_status_cb,
+                                          status);
+        qemu_mod_timer(status->timer, qemu_get_clock_ms(rt_clock));
+    }
 }
 
 void hmp_dump_cancel(Monitor *mon, const QDict *qdict)
diff --git a/qapi-schema.json b/qapi-schema.json
index 41428ed..bd258cb 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1637,6 +1637,7 @@
 #
 # Dump guest's memory to vmcore.
 #
+# @detach: if true, do not wait for the dump to complete.
 # @paging: if true, do paging to get guest's memory mapping
 # @file: the filename or file descriptor of the vmcore.
 #
@@ -1644,7 +1645,8 @@
 #
 # Since: 1.1
 ##
-{ 'command': 'dump', 'data': { 'paging': 'bool', 'file': 'str' } }
+{ 'command': 'dump',
+  'data': { 'detach': 'bool', 'paging': 'bool', 'file': 'str' } }
 
 ##
 # @dump_cancel
diff --git a/qmp-commands.hx b/qmp-commands.hx
index c532561..e2accb2 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -589,8 +589,8 @@ EQMP
 #if defined(CONFIG_HAVE_CORE_DUMP)
     {
         .name       = "dump",
-        .args_type  = "paging:-p,file:s",
-        .params     = "[-p] file",
+        .args_type  = "detach:-d,paging:-p,file:s",
+        .params     = "[-d] [-p] file",
         .help       = "dump to file",
         .user_print = monitor_user_noop,
         .mhandler.cmd_new = qmp_marshal_input_dump,
@@ -618,6 +618,8 @@ Notes:
 (1) All boolean arguments default to false
 (2) The 'query-dump' command should be used to check the dump's progress
     and final result (this information is provided by the 'status' member)
+(3) The user monitor's "detach" argument is not valid in QMP and should not
+    be used
 
 EQMP
 #endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread
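
In the synchronous (non-detached) case above, the monitor stays suspended while a 1000 ms timer re-checks the dump status; a missing status or "active" re-arms the timer, any other status ends the wait. A callback-free simulation of that polling loop, with hypothetical names:

```c
/*
 * Sketch of the HMP polling in hmp_dumping_status_cb(): keep the
 * monitor suspended and re-check once per tick while there is no
 * status yet or the status is "active"; any final status ends the
 * wait.  One tick stands in for re-arming the 1000 ms timer.
 */
#include <string.h>

typedef const char *(*query_status_fn)(void *opaque);

/* Returns the number of ticks spent waiting (capped at max_ticks). */
static int poll_until_done(query_status_fn query, void *opaque, int max_ticks)
{
    int ticks = 0;

    while (ticks < max_ticks) {
        const char *status = query(opaque);
        if (status && strcmp(status, "active") != 0) {
            break;               /* final status: resume the monitor */
        }
        ticks++;                 /* stands in for qemu_mod_timer(now + 1000) */
    }
    return ticks;
}
```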

* [Qemu-devel] [RFC][PATCH 16/16 v8] allow user to dump a fraction of the memory
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (14 preceding siblings ...)
  2012-03-02 10:45 ` [Qemu-devel] [RFC][PATCH 15/16 v8] support detached dump Wen Congyang
@ 2012-03-02 10:46 ` Wen Congyang
  2012-03-05  9:12 ` [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
  16 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-02 10:46 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 dump.c           |  124 +++++++++++++++++++++++++++++++++++++++++++++++------
 hmp-commands.hx  |   14 ++++--
 hmp.c            |   14 ++++++-
 memory_mapping.c |   27 ++++++++++++
 memory_mapping.h |    2 +
 qapi-schema.json |    5 ++-
 qmp-commands.hx  |    8 +++-
 7 files changed, 172 insertions(+), 22 deletions(-)

diff --git a/dump.c b/dump.c
index 40eefb9..52df041 100644
--- a/dump.c
+++ b/dump.c
@@ -93,6 +93,9 @@ typedef struct DumpState {
     void *opaque;
     RAMBlock *block;
     ram_addr_t start;
+    bool has_filter;
+    int64_t begin;
+    int64_t length;
     target_phys_addr_t offset;
     VMChangeStateEntry *handler;
 } DumpState;
@@ -463,17 +466,47 @@ static int write_memory(DumpState *s, RAMBlock *block, ram_addr_t start,
 
 /* get the memory's offset in the vmcore */
 static target_phys_addr_t get_offset(target_phys_addr_t phys_addr,
-                                     target_phys_addr_t memory_offset)
+                                     DumpState *s)
 {
     RAMBlock *block;
-    target_phys_addr_t offset = memory_offset;
+    target_phys_addr_t offset = s->memory_offset;
+    int64_t size_in_block, start;
+
+    if (s->has_filter) {
+        if (phys_addr < s->begin || phys_addr >= s->begin + s->length) {
+            return -1;
+        }
+    }
 
     QLIST_FOREACH(block, &ram_list.blocks, next) {
-        if (phys_addr >= block->offset &&
-            phys_addr < block->offset + block->length) {
-            return phys_addr - block->offset + offset;
+        if (s->has_filter) {
+            if (block->offset >= s->begin + s->length ||
+                block->offset + block->length <= s->begin) {
+                /* This block is out of the range */
+                continue;
+            }
+
+            if (s->begin <= block->offset) {
+                start = block->offset;
+            } else {
+                start = s->begin;
+            }
+
+            size_in_block = block->length - (start - block->offset);
+            if (s->begin + s->length < block->offset + block->length) {
+                size_in_block -= block->offset + block->length -
+                                 (s->begin + s->length);
+            }
+        } else {
+            start = block->offset;
+            size_in_block = block->length;
+        }
+
+        if (phys_addr >= start && phys_addr < start + size_in_block) {
+            return phys_addr - start + offset;
         }
-        offset += block->length;
+
+        offset += size_in_block;
     }
 
     return -1;
@@ -561,7 +594,7 @@ static int dump_completed(DumpState *s)
     int phdr_index = 1, ret;
 
     QTAILQ_FOREACH(memory_mapping, &s->list.head, next) {
-        offset = get_offset(memory_mapping->phys_addr, s->memory_offset);
+        offset = get_offset(memory_mapping->phys_addr, s);
         if (s->dump_info.d_class == ELFCLASS64) {
             ret = write_elf64_load(s, memory_mapping, phdr_index++, offset);
         } else {
@@ -588,6 +621,17 @@ static int get_next_block(DumpState *s, RAMBlock *block)
 
         s->start = 0;
         s->block = block;
+        if (s->has_filter) {
+            if (block->offset >= s->begin + s->length ||
+                block->offset + block->length <= s->begin) {
+                /* This block is out of the range */
+                continue;
+            }
+
+            if (s->begin > block->offset) {
+                s->start = s->begin - block->offset;
+            }
+        }
 
         return 0;
     }
@@ -666,7 +710,36 @@ static void dump_vm_state_change(void *opaque, int running, RunState state)
     }
 }
 
-static DumpState *dump_init(bool paging, Error **errp)
+static ram_addr_t get_start_block(DumpState *s)
+{
+    RAMBlock *block;
+
+    if (!s->has_filter) {
+        s->block = QLIST_FIRST(&ram_list.blocks);
+        return 0;
+    }
+
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        if (block->offset >= s->begin + s->length ||
+            block->offset + block->length <= s->begin) {
+            /* This block is out of the range */
+            continue;
+        }
+
+        s->block = block;
+        if (s->begin > block->offset) {
+            s->start = s->begin - block->offset;
+        } else {
+            s->start = 0;
+        }
+        return s->start;
+    }
+
+    return -1;
+}
+
+static DumpState *dump_init(bool paging, bool has_filter, int64_t begin,
+                            int64_t length, Error **errp)
 {
     CPUState *env;
     DumpState *s = dump_get_current();
@@ -683,8 +756,16 @@ static DumpState *dump_init(bool paging, Error **errp)
         g_free(s->error);
         s->error = NULL;
     }
-    s->block = QLIST_FIRST(&ram_list.blocks);
-    s->start = 0;
+
+    s->has_filter = has_filter;
+    s->begin = begin;
+    s->length = length;
+    s->start = get_start_block(s);
+    if (s->start == -1) {
+        error_set(errp, QERR_INVALID_PARAMETER, "begin");
+        return NULL;
+    }
+
     s->handler = qemu_add_vm_change_state_handler(dump_vm_state_change, s);
 
     /*
@@ -713,6 +794,10 @@ static DumpState *dump_init(bool paging, Error **errp)
         qemu_get_guest_simple_memory_mapping(&s->list);
     }
 
+    if (s->has_filter) {
+        memory_mapping_filter(&s->list, s->begin, s->length);
+    }
+
     /*
      * calculate phdr_num
      *
@@ -783,9 +868,10 @@ static int fd_dump_begin_iterate(DumpState *s, void *opaque)
     return qemu_set_fd_handler(fd, NULL, dump_iterate, s);
 }
 
-static DumpState *dump_init_fd(int fd, bool paging, Error **errp)
+static DumpState *dump_init_fd(int fd, bool paging, bool has_filter,
+                               int64_t begin, int64_t length, Error **errp)
 {
-    DumpState *s = dump_init(paging, errp);
+    DumpState *s = dump_init(paging, has_filter, begin, length, errp);
 
     if (s == NULL) {
         return NULL;
@@ -800,12 +886,22 @@ static DumpState *dump_init_fd(int fd, bool paging, Error **errp)
     return s;
 }
 
-void qmp_dump(bool detach, bool paging, const char *file, Error **errp)
+void qmp_dump(bool detach, bool paging, const char *file, bool has_begin,
+              int64_t begin, bool has_length, int64_t length, Error **errp)
 {
     const char *p;
     int fd = -1;
     DumpState *s;
 
+    if (has_begin && !has_length) {
+        error_set(errp, QERR_MISSING_PARAMETER, "length");
+        return;
+    }
+    if (!has_begin && has_length) {
+        error_set(errp, QERR_MISSING_PARAMETER, "begin");
+        return;
+    }
+
 #if !defined(WIN32)
     if (strstart(file, "fd:", &p)) {
         fd = qemu_get_fd(p);
@@ -829,7 +925,7 @@ void qmp_dump(bool detach, bool paging, const char *file, Error **errp)
         return;
     }
 
-    s = dump_init_fd(fd, paging, errp);
+    s = dump_init_fd(fd, paging, has_begin, begin, length, errp);
     if (!s) {
         return;
     }
diff --git a/hmp-commands.hx b/hmp-commands.hx
index fd6f9f1..e108780 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -883,22 +883,28 @@ ETEXI
 #if defined(CONFIG_HAVE_CORE_DUMP)
     {
         .name       = "dump",
-        .args_type  = "detach:-d,paging:-p,file:s",
-        .params     = "[-d] [-p] file",
-        .help       = "dump to file (using -d to not wait for completion)",
+        .args_type  = "detach:-d,paging:-p,file:s,begin:i?,length:i?",
+        .params     = "[-d] [-p] file [begin] [length]",
+        .help       = "dump to file (using -d to not wait for completion)"
+                      "\n\t\t\t begin (optional): the starting physical address"
+                      "\n\t\t\t length (optional): the memory size, in bytes",
         .user_print = monitor_user_noop,
         .mhandler.cmd = hmp_dump,
     },
 
 
 STEXI
-@item dump [-d] [-p] @var{file}
+@item dump [-d] [-p] @var{file} @var{begin} @var{length}
 @findex dump
 Dump to @var{file}. The file can be processed with crash or gdb.
     file: destination file (starting with "file:") or destination file descriptor
           (starting with "fd:")
   paging: do paging to get guest's memory mapping
       -d: do not wait for completion.
+   begin: the starting physical address. It's optional, and should be specified
+          together with length.
+  length: the memory size, in bytes. It's optional, and should be specified
+          together with begin.
 ETEXI
 #endif
 
diff --git a/hmp.c b/hmp.c
index 8652b20..05e30e1 100644
--- a/hmp.c
+++ b/hmp.c
@@ -886,8 +886,20 @@ void hmp_dump(Monitor *mon, const QDict *qdict)
     int paging = qdict_get_try_bool(qdict, "paging", 0);
     int detach = qdict_get_try_bool(qdict, "detach", 0);
     const char *file = qdict_get_str(qdict, "file");
+    bool has_begin = qdict_haskey(qdict, "begin");
+    bool has_length = qdict_haskey(qdict, "length");
+    int64_t begin = 0;
+    int64_t length = 0;
 
-    qmp_dump(!!detach, !!paging, file, &errp);
+    if (has_begin) {
+        begin = qdict_get_int(qdict, "begin");
+    }
+    if (has_length) {
+        length = qdict_get_int(qdict, "length");
+    }
+
+    qmp_dump(!!detach, !!paging, file, has_begin, begin, has_length, length,
+             &errp);
     if (errp) {
         hmp_handle_error(mon, &errp);
         return;
diff --git a/memory_mapping.c b/memory_mapping.c
index 7f4193d..2d17a9a 100644
--- a/memory_mapping.c
+++ b/memory_mapping.c
@@ -261,3 +261,30 @@ void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list)
         create_new_memory_mapping(list, block->offset, 0, block->length);
     }
 }
+
+void memory_mapping_filter(MemoryMappingList *list, int64_t begin,
+                           int64_t length)
+{
+    MemoryMapping *cur, *next;
+
+    QTAILQ_FOREACH_SAFE(cur, &list->head, next, next) {
+        if (cur->phys_addr >= begin + length ||
+            cur->phys_addr + cur->length <= begin) {
+            QTAILQ_REMOVE(&list->head, cur, next);
+            list->num--;
+            continue;
+        }
+
+        if (cur->phys_addr < begin) {
+            cur->length -= begin - cur->phys_addr;
+            if (cur->virt_addr) {
+                cur->virt_addr += begin - cur->phys_addr;
+            }
+            cur->phys_addr = begin;
+        }
+
+        if (cur->phys_addr + cur->length > begin + length) {
+            cur->length -= cur->phys_addr + cur->length - begin - length;
+        }
+    }
+}
diff --git a/memory_mapping.h b/memory_mapping.h
index 50b1f25..6b454e1 100644
--- a/memory_mapping.h
+++ b/memory_mapping.h
@@ -55,4 +55,6 @@ int qemu_get_guest_memory_mapping(MemoryMappingList *list);
 /* get guest's memory mapping without do paging(virtual address is 0). */
 void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list);
 
+void memory_mapping_filter(MemoryMappingList *list, int64_t begin,
+                           int64_t length);
 #endif
diff --git a/qapi-schema.json b/qapi-schema.json
index bd258cb..f981e8d 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1640,13 +1640,16 @@
 # @detach: detached dumping.
 # @paging: if true, do paging to get guest's memory mapping
 # @file: the filename or file descriptor of the vmcore.
+# @begin: if specified, the starting physical address.
+# @length: if specified, the memory size, in bytes.
 #
 # Returns: nothing on success
 #
 # Since: 1.1
 ##
 { 'command': 'dump',
-  'data': { 'detach': 'bool', 'paging': 'bool', 'file': 'str' } }
+  'data': { 'detach': 'bool', 'paging': 'bool', 'file': 'str', '*begin': 'int',
+            '*length': 'int' } }
 
 ##
 # @dump_cancel
diff --git a/qmp-commands.hx b/qmp-commands.hx
index e2accb2..cb2a3f1 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -589,8 +589,8 @@ EQMP
 #if defined(CONFIG_HAVE_CORE_DUMP)
     {
         .name       = "dump",
-        .args_type  = "detach:-d,paging:-p,file:s",
-        .params     = "[-d] [-p] file",
+        .args_type  = "detach:-d,paging:-p,file:s,begin:i?,length:i?",
+        .params     = "[-d] [-p] file [begin] [length]",
         .help       = "dump to file",
         .user_print = monitor_user_noop,
         .mhandler.cmd_new = qmp_marshal_input_dump,
@@ -607,6 +607,10 @@ Arguments:
 - "paging": do paging to get guest's memory mapping (json-bool)
 - "file": destination file(started with "file:") or destination file descriptor
           (started with "fd:") (json-string)
+- "begin": the starting physical address. It's optional, and should be
+           specified together with length (json-int)
+- "length": the memory size, in bytes. It's optional, and should be specified
+            together with begin (json-int)
 
 Example:
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread
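
Both get_start_block() and get_offset() in this patch clamp each RAM block against the dump window [begin, begin + length). The patch tracks the intersection inline; below it is factored into one hypothetical helper that returns the absolute sub-range of a block to dump (a sketch of the same arithmetic, not the patch's code):

```c
/*
 * Sketch of the clamping that get_start_block()/get_offset() apply to
 * each RAM block against the dump window [begin, begin + length).
 * Returns false if the block lies entirely outside the window,
 * otherwise yields the absolute sub-range [*start, *start + *size).
 */
#include <stdbool.h>
#include <stdint.h>

static bool intersect_block(int64_t begin, int64_t length,
                            int64_t offset, int64_t blen,
                            int64_t *start, int64_t *size)
{
    int64_t end  = begin + length;         /* exclusive end of the window */
    int64_t bend = offset + blen;          /* exclusive end of the block  */

    if (offset >= end || bend <= begin) {
        return false;                      /* block entirely outside      */
    }
    *start = begin > offset ? begin : offset;
    *size  = (end < bend ? end : bend) - *start;
    return true;
}
```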

* Re: [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism
  2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
                   ` (15 preceding siblings ...)
  2012-03-02 10:46 ` [Qemu-devel] [RFC][PATCH 16/16 v8] allow user to dump a fraction of the memory Wen Congyang
@ 2012-03-05  9:12 ` Wen Congyang
  2012-03-06  0:41   ` Luiz Capitulino
  16 siblings, 1 reply; 38+ messages in thread
From: Wen Congyang @ 2012-03-05  9:12 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake

At 03/02/2012 05:59 PM, Wen Congyang Wrote:
> Hi, all
> 
> 'virsh dump' can not work when host pci device is used by guest. We have
> discussed this issue here:
> http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html
> 
> The last version is here:
> http://lists.nongnu.org/archive/html/qemu-devel/2012-02/msg04228.html
> 
> We have determined to introduce a new command dump to dump memory. The core
> file's format can be elf.
> 
> Note:
> 1. The guest should be x86 or x86_64. The other arch is not supported now.
> 2. If you use old gdb, gdb may crash. I use gdb-7.3.1, and it does not crash.
> 3. If the OS is in the second kernel, gdb may not work well, and crash can
>    work by specifying '--machdep phys_addr=xxx' in the command line. The
>    reason is that the second kernel will update the page table, and we can
>    not get the page table for the first kernel.
> 4. The cpu's state is stored in QEMU note. You neet to modify crash to use
>    it to calculate phys_base.
> 5. If the guest OS is 32 bit and the memory size is larger than 4G, the vmcore
>    is elf64 format. You should use the gdb which is built with --enable-64-bit-bfd.
> 6. This patchset is based on the upstream tree, and apply one patch that is still
>    in Luiz Capitulino's tree, because I use the API qemu_get_fd() in this patchset.
> 

Hi, Jan, Luiz Capitulino
Do you have any comments?

Thanks
Wen Congyang

> Changes from v7 to v8:
> 1. addressed Hatayama's comments
> 
> Changes from v6 to v7:
> 1. addressed Jan's comments
> 2. fix some bugs
> 3. store cpu's state into the vmcore
> 
> Changes from v5 to v6:
> 1. allow user to dump a fraction of the memory
> 2. fix some bugs
> 
> Changes from v4 to v5:
> 1. convert the new command dump to QAPI 
> 
> Changes from v3 to v4:
> 1. support it to run asynchronously
> 2. add API to cancel dumping and query dumping progress
> 3. add API to control dumping speed
> 4. auto cancel dumping when the user resumes vm, and the status is failed.
> 
> Changes from v2 to v3:
> 1. address Jan Kiszka's comment
> 
> Changes from v1 to v2:
> 1. fix virt addr in the vmcore.
> 
> Wen Congyang (16):
>   Add API to create memory mapping list
>   Add API to check whether a physical address is I/O address
>   implement cpu_get_memory_mapping()
>   Add API to check whether paging mode is enabled
>   Add API to get memory mapping
>   Add API to get memory mapping without do paging
>   target-i386: Add API to write elf notes to core file
>   target-i386: Add API to write cpu status to core file
>   target-i386: add API to get dump info
>   make gdb_id() generally available
>   introduce a new monitor command 'dump' to dump guest's memory
>   support to cancel the current dumping
>   support to query dumping status
>   run dump at the background
>   support detached dump
>   allow user to dump a fraction of the memory
> 
>  Makefile.target                   |    3 +
>  configure                         |    8 +
>  cpu-all.h                         |   66 +++
>  cpu-common.h                      |    2 +
>  dump.c                            |  980 +++++++++++++++++++++++++++++++++++++
>  dump.h                            |   23 +
>  elf.h                             |    5 +
>  exec.c                            |   11 +
>  gdbstub.c                         |    9 -
>  gdbstub.h                         |    9 +
>  hmp-commands.hx                   |   44 ++
>  hmp.c                             |   89 ++++
>  hmp.h                             |    3 +
>  memory_mapping.c                  |  290 +++++++++++
>  memory_mapping.h                  |   60 +++
>  monitor.c                         |    7 +
>  qapi-schema.json                  |   58 +++
>  qmp-commands.hx                   |  110 +++++
>  target-i386/arch_dump.c           |  433 ++++++++++++++++
>  target-i386/arch_memory_mapping.c |  271 ++++++++++
>  vl.c                              |    5 +-
>  21 files changed, 2475 insertions(+), 11 deletions(-)
>  create mode 100644 dump.c
>  create mode 100644 dump.h
>  create mode 100644 memory_mapping.c
>  create mode 100644 memory_mapping.h
>  create mode 100644 target-i386/arch_dump.c
>  create mode 100644 target-i386/arch_memory_mapping.c
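The patch list above introduces a 'dump' monitor command plus cancel, query, and partial-dump controls. As a purely illustrative sketch (Python; the QMP command and argument names below — 'dump', 'protocol', 'detach', 'begin', 'length' — are assumptions inferred from the patch titles, not the series' actual schema):

```python
import json

def qmp_cmd(name, **args):
    """Build a QMP command object as a JSON string (client side only)."""
    cmd = {"execute": name}
    if args:
        cmd["arguments"] = args
    return json.dumps(cmd)

# Hypothetical: dump the full guest memory to a file without blocking.
dump = qmp_cmd("dump", protocol="file:/tmp/vmcore", detach=True)

# Hypothetical: dump only a fraction of the memory (patch 16 in the list).
partial = qmp_cmd("dump", protocol="file:/tmp/vmcore",
                  begin=0x100000, length=0x200000)

assert json.loads(partial)["arguments"]["begin"] == 0x100000
```

A real client would send these strings over the QMP socket after the capabilities handshake; the sketch only shows the shape of the request objects.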

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism
  2012-03-05  9:12 ` [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
@ 2012-03-06  0:41   ` Luiz Capitulino
  2012-03-07 17:38     ` Luiz Capitulino
  0 siblings, 1 reply; 38+ messages in thread
From: Luiz Capitulino @ 2012-03-06  0:41 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Jan Kiszka, HATAYAMA Daisuke, Dave Anderson, qemu-devel, Eric Blake

On Mon, 05 Mar 2012 17:12:00 +0800
Wen Congyang <wency@cn.fujitsu.com> wrote:

> At 03/02/2012 05:59 PM, Wen Congyang Wrote:
> > Hi, all
> > 
> > 'virsh dump' cannot work when a host PCI device is used by the guest. We have
> > discussed this issue here:
> > http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html
> > 
> > The last version is here:
> > http://lists.nongnu.org/archive/html/qemu-devel/2012-02/msg04228.html
> > 
> > We have decided to introduce a new command, dump, to dump memory. The core
> > file's format can be ELF.
> > 
> > Note:
> > 1. The guest should be x86 or x86_64. Other architectures are not supported yet.
> > 2. If you use an old gdb, gdb may crash. I use gdb-7.3.1, and it does not crash.
> > 3. If the OS is in the second kernel, gdb may not work well, but crash can
> >    work by specifying '--machdep phys_addr=xxx' on the command line. The
> >    reason is that the second kernel updates the page table, and we cannot
> >    get the page table of the first kernel.
> > 4. The CPU's state is stored in a QEMU note. You need to modify crash to use
> >    it to calculate phys_base.
> > 5. If the guest OS is 32 bit and the memory size is larger than 4G, the vmcore
> >    is in ELF64 format. You should use a gdb built with --enable-64-bit-bfd.
> > 6. This patchset is based on the upstream tree and applies one patch that is still
> >    in Luiz Capitulino's tree, because I use the API qemu_get_fd() in this patchset.
> > 
> 
> Hi, Jan, Luiz Capitulino
> Do you have any comments?

I haven't had a chance to review it yet, will do in the next few days.


* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-02 10:18 ` [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping Wen Congyang
@ 2012-03-07 15:27   ` HATAYAMA Daisuke
  2012-03-08  8:52     ` Wen Congyang
  0 siblings, 1 reply; 38+ messages in thread
From: HATAYAMA Daisuke @ 2012-03-07 15:27 UTC (permalink / raw)
  To: wency; +Cc: jan.kiszka, anderson, qemu-devel, eblake, lcapitulino

From: Wen Congyang <wency@cn.fujitsu.com>
Subject: [RFC][PATCH 05/16 v8] Add API to get memory mapping
Date: Fri, 02 Mar 2012 18:18:23 +0800

> Add an API to get all virtual-to-physical address mappings.
> If there is no virtual address for some physical address, the virtual
> address is 0.
> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
>  memory_mapping.c |   88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  memory_mapping.h |    8 +++++
>  2 files changed, 96 insertions(+), 0 deletions(-)
> 
> diff --git a/memory_mapping.c b/memory_mapping.c
> index 718f271..f74c5d0 100644
> --- a/memory_mapping.c
> +++ b/memory_mapping.c
> @@ -164,3 +164,91 @@ void memory_mapping_list_init(MemoryMappingList *list)
>      list->last_mapping = NULL;
>      QTAILQ_INIT(&list->head);
>  }
> +
> +int qemu_get_guest_memory_mapping(MemoryMappingList *list)
> +{
> +    CPUState *env;
> +    MemoryMapping *memory_mapping;
> +    RAMBlock *block;
> +    ram_addr_t offset, length, m_length;
> +    target_phys_addr_t m_phys_addr;
> +    int ret;
> +    bool paging_mode;
> +
> +#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
> +    paging_mode = cpu_paging_enabled(first_cpu);
> +    if (paging_mode) {
> +        for (env = first_cpu; env != NULL; env = env->next_cpu) {
> +            ret = cpu_get_memory_mapping(list, env);
> +            if (ret < 0) {
> +                return -1;
> +            }
> +        }
> +    }
> +#else
> +    return -2;
> +#endif
> +
> +    /*
> +     * some memory may be not in the memory mapping's list:
> +     * 1. the guest doesn't use paging
> +     * 2. the guest is in 2nd kernel, and the memory used by 1st kernel is not
> +     *    in paging table
> +     * add them into memory mapping's list
> +     */
> +    QLIST_FOREACH(block, &ram_list.blocks, next) {

How does the memory portion referenced by PT_LOAD program headers with
p_vaddr == 0 look through gdb? If we cannot access such portions, the
part not referenced by the page table that CR3 points to is unnecessary,
isn't it?

Thanks.
HATAYAMA, Daisuke


* Re: [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism
  2012-03-06  0:41   ` Luiz Capitulino
@ 2012-03-07 17:38     ` Luiz Capitulino
  2012-03-08  8:55       ` Wen Congyang
  0 siblings, 1 reply; 38+ messages in thread
From: Luiz Capitulino @ 2012-03-07 17:38 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Jan Kiszka, HATAYAMA Daisuke, Dave Anderson, qemu-devel, Eric Blake

On Mon, 5 Mar 2012 21:41:02 -0300
Luiz Capitulino <lcapitulino@redhat.com> wrote:

> On Mon, 05 Mar 2012 17:12:00 +0800
> Wen Congyang <wency@cn.fujitsu.com> wrote:
> 
> > At 03/02/2012 05:59 PM, Wen Congyang Wrote:
> > > Hi, all
> > > 
> > > [...]
> > > 
> > 
> > Hi, Jan, Luiz Capitulino
> > Do you have any comments?
> 
> I haven't had a chance to review it yet, will do in the next few days.

Wen, I've started reviewing this but before I ask you to make more changes
to this series, it's better to wait for a conclusion of the asynchronous
command discussion thread:

 http://lists.gnu.org/archive/html/qemu-devel/2012-03/msg01067.html


* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-07 15:27   ` HATAYAMA Daisuke
@ 2012-03-08  8:52     ` Wen Congyang
  2012-03-09  0:40       ` HATAYAMA Daisuke
  0 siblings, 1 reply; 38+ messages in thread
From: Wen Congyang @ 2012-03-08  8:52 UTC (permalink / raw)
  To: HATAYAMA Daisuke; +Cc: jan.kiszka, anderson, qemu-devel, eblake, lcapitulino

At 03/07/2012 11:27 PM, HATAYAMA Daisuke Wrote:
> From: Wen Congyang <wency@cn.fujitsu.com>
> Subject: [RFC][PATCH 05/16 v8] Add API to get memory mapping
> Date: Fri, 02 Mar 2012 18:18:23 +0800
> 
>> Add an API to get all virtual-to-physical address mappings.
>> If there is no virtual address for some physical address, the virtual
>> address is 0.
>>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> [...]
> 
> How does the memory portion referenced by PT_LOAD program headers with
> p_vaddr == 0 look through gdb? If we cannot access such portions, the
> part not referenced by the page table that CR3 points to is unnecessary,
> isn't it?

That part is unnecessary if you use gdb, but it is necessary if you use crash.

Thanks
Wen Congyang

> 
> Thanks.
> HATAYAMA, Daisuke
> 
> 
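The exchange above turns on how a consumer treats PT_LOAD entries whose p_vaddr is 0. A minimal, self-contained sketch (Python; the addresses are invented, and the header layout follows the ELF64 specification) of such program headers and the filtering a virtual-address consumer like gdb would effectively apply:

```python
import struct

PT_LOAD = 1
# Little-endian Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
# p_filesz, p_memsz, p_align (56 bytes per the ELF64 spec).
PHDR = struct.Struct("<IIQQQQQQ")

def make_phdr(offset, vaddr, paddr, size):
    """Pack a PT_LOAD program header (flags/align left at 0 for brevity)."""
    return PHDR.pack(PT_LOAD, 0, offset, vaddr, paddr, size, size, 0)

# One segment with a guest-virtual address, one reachable only physically
# (p_vaddr == 0), as the dump code produces for RAM not in the page table.
phdrs = [
    make_phdr(0x1000, 0xFFFFFFFF81000000, 0x1000000, 0x200000),
    make_phdr(0x201000, 0, 0x7F000000, 0x100000),
]

# A virtual-address consumer (gdb) keys on p_vaddr, so the second segment
# is of no use to it; a physical-address consumer (crash) reads p_paddr.
P_VADDR, P_PADDR = 3, 4
gdb_view = [PHDR.unpack(p) for p in phdrs if PHDR.unpack(p)[P_VADDR] != 0]
crash_view = [PHDR.unpack(p)[P_PADDR] for p in phdrs]
assert len(gdb_view) == 1 and len(crash_view) == 2
```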


* Re: [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism
  2012-03-07 17:38     ` Luiz Capitulino
@ 2012-03-08  8:55       ` Wen Congyang
  0 siblings, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-08  8:55 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Jan Kiszka, HATAYAMA Daisuke, Dave Anderson, qemu-devel, Eric Blake

At 03/08/2012 01:38 AM, Luiz Capitulino Wrote:
> On Mon, 5 Mar 2012 21:41:02 -0300
> Luiz Capitulino <lcapitulino@redhat.com> wrote:
> 
>> On Mon, 05 Mar 2012 17:12:00 +0800
>> Wen Congyang <wency@cn.fujitsu.com> wrote:
>>
>>> At 03/02/2012 05:59 PM, Wen Congyang Wrote:
>>>> Hi, all
>>>>
>>>> [...]
>>>
>>> Hi, Jan, Luiz Capitulino
>>> Do you have any comments?
>>
>> I haven't had a chance to review it yet, will do in the next few days.
> 
> Wen, I've started reviewing this but before I ask you to make more changes
> to this series, it's better to wait for a conclusion of the asynchronous
> command discussion thread:
> 
>  http://lists.gnu.org/archive/html/qemu-devel/2012-03/msg01067.html

OK. I see this thread.

Thanks
Wen Congyang



* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-08  8:52     ` Wen Congyang
@ 2012-03-09  0:40       ` HATAYAMA Daisuke
  2012-03-09  1:46         ` Wen Congyang
  0 siblings, 1 reply; 38+ messages in thread
From: HATAYAMA Daisuke @ 2012-03-09  0:40 UTC (permalink / raw)
  To: wency; +Cc: jan.kiszka, anderson, qemu-devel, eblake, lcapitulino

From: Wen Congyang <wency@cn.fujitsu.com>
Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
Date: Thu, 08 Mar 2012 16:52:29 +0800

> At 03/07/2012 11:27 PM, HATAYAMA Daisuke Wrote:
>> From: Wen Congyang <wency@cn.fujitsu.com>
>> Subject: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>> Date: Fri, 02 Mar 2012 18:18:23 +0800
>> 
>>> [...]
>> 
>> How does the memory portion referenced by PT_LOAD program headers with
>> p_vaddr == 0 look through gdb? If we cannot access such portions, the
>> part not referenced by the page table that CR3 points to is unnecessary,
>> isn't it?
> 
> That part is unnecessary if you use gdb, but it is necessary if you use crash.
> 

crash users would not use the paging option because we can see all of
memory even without it, so the paging option is only for gdb users.

It looks to me that the latter part only complicates the logic. If,
instead, we collect only virtual addresses, handling PT_LOAD entries
becomes simpler: for example, they no longer need to be physically
contiguous within a single entry, and reviewing and maintenance become
easier.

Thanks.
HATAYAMA, Daisuke
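To make the simplification argument concrete, here is a small sketch (Python, not QEMU code; the tuple layout and merge rule are illustrative) of the contiguity requirement under discussion: with physical contiguity required, virtually contiguous pages backed by scattered physical pages cannot share one PT_LOAD entry, while a virtual-only rule merges them:

```python
# Each mapping is (virt, phys, length). A new range can extend the last
# entry only if it is virtually contiguous with it; the current scheme
# additionally requires physical contiguity, which is the requirement
# this message argues could be dropped for a gdb-only (virtual) view.
def merge(mappings, require_phys_contig):
    out = []
    for virt, phys, length in sorted(mappings):
        if out:
            v, p, l = out[-1]
            virt_ok = v + l == virt
            phys_ok = p + l == phys
            if virt_ok and (phys_ok or not require_phys_contig):
                out[-1] = (v, p, l + length)  # extend the last entry
                continue
        out.append((virt, phys, length))
    return out

# Two virtually contiguous pages backed by scattered physical pages:
pages = [(0x1000, 0x8000, 0x1000), (0x2000, 0x3000, 0x1000)]
assert len(merge(pages, require_phys_contig=True)) == 2   # cannot merge
assert len(merge(pages, require_phys_contig=False)) == 1  # one entry
```

Note the trade-off: a merged entry with scattered backing pages no longer carries a meaningful single p_paddr, which is exactly why it only suits consumers that ignore physical addresses.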


* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-09  0:40       ` HATAYAMA Daisuke
@ 2012-03-09  1:46         ` Wen Congyang
  2012-03-09  2:05           ` HATAYAMA Daisuke
  0 siblings, 1 reply; 38+ messages in thread
From: Wen Congyang @ 2012-03-09  1:46 UTC (permalink / raw)
  To: HATAYAMA Daisuke; +Cc: jan.kiszka, anderson, qemu-devel, eblake, lcapitulino

At 03/09/2012 08:40 AM, HATAYAMA Daisuke Wrote:
> From: Wen Congyang <wency@cn.fujitsu.com>
> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
> Date: Thu, 08 Mar 2012 16:52:29 +0800
> 
>> At 03/07/2012 11:27 PM, HATAYAMA Daisuke Wrote:
>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>> Subject: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>> Date: Fri, 02 Mar 2012 18:18:23 +0800
>>>
>>>> [...]
>>>
>>> How does the memory portion referenced by PT_LOAD program headers with
>>> p_vaddr == 0 look through gdb? If we cannot access such portions, the
>>> part not referenced by the page table that CR3 points to is unnecessary,
>>> isn't it?
>>
>> That part is unnecessary if you use gdb, but it is necessary if you use crash.
>>
> 
> crash users would not use the paging option because we can see all of
> memory even without it, so the paging option is only for gdb users.

Yes, the paging option is only for gdb users. The default value is off.

> 
> It looks to me that the latter part only complicates the logic. If,
> instead, we collect only virtual addresses, handling PT_LOAD entries
> becomes simpler: for example, they no longer need to be physically
> contiguous within a single entry, and reviewing and maintenance become
> easier.

Sorry, I do not understand what you want to say.

Thanks
Wen Congyang

> 
> Thanks.
> HATAYAMA, Daisuke
> 
> 


* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-09  1:46         ` Wen Congyang
@ 2012-03-09  2:05           ` HATAYAMA Daisuke
  2012-03-09  2:26             ` Wen Congyang
  0 siblings, 1 reply; 38+ messages in thread
From: HATAYAMA Daisuke @ 2012-03-09  2:05 UTC (permalink / raw)
  To: wency; +Cc: jan.kiszka, anderson, qemu-devel, eblake, lcapitulino

From: Wen Congyang <wency@cn.fujitsu.com>
Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
Date: Fri, 09 Mar 2012 09:46:31 +0800

> At 03/09/2012 08:40 AM, HATAYAMA Daisuke Wrote:
>> From: Wen Congyang <wency@cn.fujitsu.com>
>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>> Date: Thu, 08 Mar 2012 16:52:29 +0800
>> 
>>> [...]
>> 
>> crash users would not use the paging option because we can see all of
>> memory even without it, so the paging option is only for gdb users.
> 
> Yes, the paging option is only for gdb users. The default value is off.
> 
>> 
>> It looks to me that the latter part only complicates the logic. If,
>> instead, we collect only virtual addresses, handling PT_LOAD entries
>> becomes simpler: for example, they no longer need to be physically
>> contiguous within a single entry, and reviewing and maintenance become
>> easier.
> 
> Sorry, I do not understand what you want to say.
> 

The processing that adds the parts not referenced by the page table to
the vmcore is meaningless for gdb, and crash doesn't require it. So it
only complicates the current logic.

Thanks.
HATAYAMA, Daisuke


* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-09  2:05           ` HATAYAMA Daisuke
@ 2012-03-09  2:26             ` Wen Congyang
  2012-03-09  2:53               ` HATAYAMA Daisuke
  0 siblings, 1 reply; 38+ messages in thread
From: Wen Congyang @ 2012-03-09  2:26 UTC (permalink / raw)
  To: HATAYAMA Daisuke; +Cc: jan.kiszka, anderson, qemu-devel, eblake, lcapitulino

At 03/09/2012 10:05 AM, HATAYAMA Daisuke Wrote:
> From: Wen Congyang <wency@cn.fujitsu.com>
> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
> Date: Fri, 09 Mar 2012 09:46:31 +0800
> 
>> [...]
>>
>> Sorry, I do not understand what you want to say.
>>
> 
> The processing that adds the parts not referenced by the page table to
> the vmcore is meaningless for gdb, and crash doesn't require it. So it
> only complicates the current logic.

If the paging mode is on, we can also use crash to analyze the vmcore.
As the comment mentioned, the memory used by the 1st kernel may not be
referenced by the page table, so we need this logic.

Thanks
Wen Congyang

> 
> Thanks.
> HATAYAMA, Daisuke
> 
> 
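A rough model of the fallback being defended here (Python; the real code walks ram_list.blocks in C, and the function name and flat single-RAM-block layout below are simplifications): any RAM not covered by the page-table-derived mappings is appended with virtual address 0 so that it still reaches the vmcore:

```python
# Mappings are (virt, phys, length) tuples derived from the page table.
def add_unmapped_ram(mappings, ram_start, ram_len):
    """Append (virt=0, phys, len) entries for RAM gaps the page table misses."""
    covered = sorted((phys, length) for _virt, phys, length in mappings)
    result = list(mappings)
    cursor = ram_start
    for phys, length in covered:
        if phys > cursor:                      # gap before this mapping
            result.append((0, cursor, phys - cursor))
        cursor = max(cursor, phys + length)
    if cursor < ram_start + ram_len:           # tail gap after last mapping
        result.append((0, cursor, ram_start + ram_len - cursor))
    return result

# One page-table-visible region inside a 64 KiB RAM block:
mapped = [(0xFFFF880000000000, 0x2000, 0x2000)]      # virt, phys, len
full = add_unmapped_ram(mapped, ram_start=0x0, ram_len=0x10000)
gaps = [m for m in full if m[0] == 0]
assert gaps == [(0, 0x0, 0x2000), (0, 0x4000, 0xC000)]
```

The gap entries are exactly the PT_LOAD segments with p_vaddr == 0 discussed above: useless to gdb, but required so crash can see the 1st kernel's memory after a kdump switch.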


* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-09  2:26             ` Wen Congyang
@ 2012-03-09  2:53               ` HATAYAMA Daisuke
  2012-03-09  9:41                 ` Jan Kiszka
  2012-03-09  9:43                 ` Jan Kiszka
  0 siblings, 2 replies; 38+ messages in thread
From: HATAYAMA Daisuke @ 2012-03-09  2:53 UTC (permalink / raw)
  To: wency; +Cc: jan.kiszka, anderson, qemu-devel, eblake, lcapitulino

From: Wen Congyang <wency@cn.fujitsu.com>
Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
Date: Fri, 09 Mar 2012 10:26:56 +0800

> At 03/09/2012 10:05 AM, HATAYAMA Daisuke Wrote:
>> From: Wen Congyang <wency@cn.fujitsu.com>
>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>> Date: Fri, 09 Mar 2012 09:46:31 +0800
>> 
>>> [...]
>> 
>> The processing that adds the parts not referenced by the page table to
>> the vmcore is meaningless for gdb, and crash doesn't require it. So it
>> only complicates the current logic.
> 
> If the paging mode is on, we can also use crash to analyze the vmcore.
> As the comment mentioned, the memory used by the 1st kernel may not be
> referenced by the page table, so we need this logic.
> 

As I said several times, crash users don't use the paging mode. The
only users of the paging mode are gdb users, just as you say. So the
paging path needs to collect only the parts referenced by the page
table, since the other parts are invisible to gdb.

Thanks.
HATAYAMA, Daisuke


* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-09  2:53               ` HATAYAMA Daisuke
@ 2012-03-09  9:41                 ` Jan Kiszka
  2012-03-09  9:57                   ` Wen Congyang
  2012-03-09  9:43                 ` Jan Kiszka
  1 sibling, 1 reply; 38+ messages in thread
From: Jan Kiszka @ 2012-03-09  9:41 UTC (permalink / raw)
  To: HATAYAMA Daisuke; +Cc: eblake, anderson, qemu-devel, lcapitulino

On 2012-03-09 03:53, HATAYAMA Daisuke wrote:
> From: Wen Congyang <wency@cn.fujitsu.com>
> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
> Date: Fri, 09 Mar 2012 10:26:56 +0800
> 
>> [...]
>> If the paging mode is on, we can also use crash to analyze the vmcore.
>> As the comment mentioned, the memory used by the 1st kernel may not be
>> referenced by the page table, so we need this logic.
>>
> 
> As I said several times, crash users don't use the paging mode. The
> only users of the paging mode are gdb users, just as you say. So the
> paging path needs to collect only the parts referenced by the page
> table, since the other parts are invisible to gdb.

If crash can work both with and without paging, the option should
default to *on*, to avoid writing cores that can later only be analyzed
with that tool. I am still not sure, though, whether that changes the
requirement on which memory regions should be written in that mode.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-09  2:53               ` HATAYAMA Daisuke
  2012-03-09  9:41                 ` Jan Kiszka
@ 2012-03-09  9:43                 ` Jan Kiszka
  1 sibling, 0 replies; 38+ messages in thread
From: Jan Kiszka @ 2012-03-09  9:43 UTC (permalink / raw)
  To: HATAYAMA Daisuke; +Cc: eblake, anderson, qemu-devel, lcapitulino

On 2012-03-09 03:53, HATAYAMA Daisuke wrote:
> From: Wen Congyang <wency@cn.fujitsu.com>
> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
> Date: Fri, 09 Mar 2012 10:26:56 +0800
> 
>> At 03/09/2012 10:05 AM, HATAYAMA Daisuke Wrote:
>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>> Date: Fri, 09 Mar 2012 09:46:31 +0800
>>>
>>>> At 03/09/2012 08:40 AM, HATAYAMA Daisuke Wrote:
>>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>>>> Date: Thu, 08 Mar 2012 16:52:29 +0800
>>>>>
>>>>>> At 03/07/2012 11:27 PM, HATAYAMA Daisuke Wrote:
>>>>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>>>>> Subject: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>>>>>> Date: Fri, 02 Mar 2012 18:18:23 +0800
>>>>>>>
>>>>>>>
>>>
>>>>>>> How does the memory portion referenced by PT_LOAD program headers with
>>>>>>> p_vaddr == 0 look through gdb? If we cannot access such portions, the
>>>>>>> part not referenced by the page table that CR3 holds is unnecessary,
>>>>>>> isn't it?
>>>>>>
>>>>>> The part is unnecessary if you use gdb. But it is necessary if you use crash.
>>>>>>
>>>>>
>>>>> crash users would not use the paging option because even without
>>>>> using it we can see all memory well, so the paging option is only for
>>>>> gdb users.
>>>>
>>>> Yes, the paging option is only for gdb users. The default value is off.
>>>>
>>>>>
>>>>> It looks to me that the latter part only complicates the logic. If,
>>>>> instead, we collect virtual addresses only, the handling of PT_LOAD
>>>>> entries becomes simpler; for example, they no longer need to be
>>>>> physically contiguous in a single entry, and reviewing and
>>>>> maintenance become easy.
>>>>
>>>> Sorry, I do not understand what you want to say.
>>>>
>>>
>>> The processing that adds the part not referenced by the page table to
>>> the vmcore is meaningless for gdb, and crash doesn't require it. So it
>>> only complicates the current logic.
>>
>> If the paging mode is on, we can also use crash to analyze the vmcore.
>> As the comment mentioned, the memory used by the 1st kernel may not be
>> referenced by the page table, so we need this logic.
>>
> 
> As I said several times, crash users don't use paging mode. The only
> users of the paging mode are gdb users, just as you say. So the paging
> path needs to collect only the part referenced by the page table, since
> the other part is invisible to gdb.

If crash can work both with and without paging, it should be default
*on* to avoid writing cores that can later on only be analyzed with that
tool. Still not sure, though, if that changes the requirement on what
memory regions should be written in that mode.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-09  9:41                 ` Jan Kiszka
@ 2012-03-09  9:57                   ` Wen Congyang
  2012-03-09 10:05                     ` Jan Kiszka
  0 siblings, 1 reply; 38+ messages in thread
From: Wen Congyang @ 2012-03-09  9:57 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: eblake, HATAYAMA Daisuke, anderson, qemu-devel, lcapitulino

At 03/09/2012 05:41 PM, Jan Kiszka Wrote:
> On 2012-03-09 03:53, HATAYAMA Daisuke wrote:
>> From: Wen Congyang <wency@cn.fujitsu.com>
>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>> Date: Fri, 09 Mar 2012 10:26:56 +0800
>>
>>> At 03/09/2012 10:05 AM, HATAYAMA Daisuke Wrote:
>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>>> Date: Fri, 09 Mar 2012 09:46:31 +0800
>>>>
>>>>> At 03/09/2012 08:40 AM, HATAYAMA Daisuke Wrote:
>>>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>>>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>>>>> Date: Thu, 08 Mar 2012 16:52:29 +0800
>>>>>>
>>>>>>> At 03/07/2012 11:27 PM, HATAYAMA Daisuke Wrote:
>>>>>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>>>>>> Subject: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>>>>>>> Date: Fri, 02 Mar 2012 18:18:23 +0800
>>>>>>>>
>>>>>>>>
>>>>
>>>>>>>> How does the memory portion referenced by PT_LOAD program headers with
>>>>>>>> p_vaddr == 0 look through gdb? If we cannot access such portions, the
>>>>>>>> part not referenced by the page table that CR3 holds is unnecessary,
>>>>>>>> isn't it?
>>>>>>>
>>>>>>> The part is unnecessary if you use gdb. But it is necessary if you use crash.
>>>>>>>
>>>>>>
>>>>>> crash users would not use the paging option because even without
>>>>>> using it we can see all memory well, so the paging option is only for
>>>>>> gdb users.
>>>>>
>>>>> Yes, the paging option is only for gdb users. The default value is off.
>>>>>
>>>>>>
>>>>>> It looks to me that the latter part only complicates the logic. If,
>>>>>> instead, we collect virtual addresses only, the handling of PT_LOAD
>>>>>> entries becomes simpler; for example, they no longer need to be
>>>>>> physically contiguous in a single entry, and reviewing and
>>>>>> maintenance become easy.
>>>>>
>>>>> Sorry, I do not understand what you want to say.
>>>>>
>>>>
>>>> The processing that adds the part not referenced by the page table to
>>>> the vmcore is meaningless for gdb, and crash doesn't require it. So it
>>>> only complicates the current logic.
>>>
>>> If the paging mode is on, we can also use crash to analyze the vmcore.
>>> As the comment mentioned, the memory used by the 1st kernel may not be
>>> referenced by the page table, so we need this logic.
>>>
>>
>> As I said several times, crash users don't use paging mode. The only
>> users of the paging mode are gdb users, just as you say. So the paging
>> path needs to collect only the part referenced by the page table, since
>> the other part is invisible to gdb.
> 
> If crash can work both with and without paging, it should be default
> *on* to avoid writing cores that can later on only be analyzed with that
> tool. Still not sure, though, if that changes the requirement on what
> memory regions should be written in that mode.

If this logic is not removed, crash can work both with and without paging.
But the default value is 'off' now, because the option is '-p'.
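As an illustration only (the type and helper below are hypothetical, not the patch's code), the policy under discussion could be sketched like this: without '-p' the vmcore carries physically addressed PT_LOAD entries that crash can use; with '-p' the page-table-derived entries for gdb are written, plus the physical ranges the page table does not reference, so crash still works.

```c
#include <stdbool.h>

/* Hypothetical names; the real patch's types and functions differ. */
typedef enum {
    DUMP_PHYS_ONLY,       /* default (no -p): physical PT_LOADs for crash */
    DUMP_PAGING_PLUS_PHYS /* -p: page-table PT_LOADs for gdb, plus the
                           * unreferenced physical ranges for crash */
} DumpMode;

static DumpMode select_dump_mode(bool paging_requested)
{
    /* '-p' defaults to off, so a plain "dump" takes the physical path. */
    return paging_requested ? DUMP_PAGING_PLUS_PHYS : DUMP_PHYS_ONLY;
}
```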

Thanks
Wen Congyang

> 
> Jan
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-09  9:57                   ` Wen Congyang
@ 2012-03-09 10:05                     ` Jan Kiszka
  2012-03-09 10:06                       ` Jan Kiszka
  2012-03-12  1:52                       ` Wen Congyang
  0 siblings, 2 replies; 38+ messages in thread
From: Jan Kiszka @ 2012-03-09 10:05 UTC (permalink / raw)
  To: Wen Congyang; +Cc: eblake, HATAYAMA Daisuke, anderson, qemu-devel, lcapitulino

On 2012-03-09 10:57, Wen Congyang wrote:
> At 03/09/2012 05:41 PM, Jan Kiszka Wrote:
>> On 2012-03-09 03:53, HATAYAMA Daisuke wrote:
>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>> Date: Fri, 09 Mar 2012 10:26:56 +0800
>>>
>>>> At 03/09/2012 10:05 AM, HATAYAMA Daisuke Wrote:
>>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>>>> Date: Fri, 09 Mar 2012 09:46:31 +0800
>>>>>
>>>>>> At 03/09/2012 08:40 AM, HATAYAMA Daisuke Wrote:
>>>>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>>>>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>>>>>> Date: Thu, 08 Mar 2012 16:52:29 +0800
>>>>>>>
>>>>>>>> At 03/07/2012 11:27 PM, HATAYAMA Daisuke Wrote:
>>>>>>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>>>>>>> Subject: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>>>>>>>> Date: Fri, 02 Mar 2012 18:18:23 +0800
>>>>>>>>>
>>>>>>>>>
>>>>>
>>>>>>>>> How does the memory portion referenced by PT_LOAD program headers with
>>>>>>>>> p_vaddr == 0 look through gdb? If we cannot access such portions, the
>>>>>>>>> part not referenced by the page table that CR3 holds is unnecessary,
>>>>>>>>> isn't it?
>>>>>>>>
>>>>>>>> The part is unnecessary if you use gdb. But it is necessary if you use crash.
>>>>>>>>
>>>>>>>
>>>>>>> crash users would not use the paging option because even without
>>>>>>> using it we can see all memory well, so the paging option is only for
>>>>>>> gdb users.
>>>>>>
>>>>>> Yes, the paging option is only for gdb users. The default value is off.
>>>>>>
>>>>>>>
>>>>>>> It looks to me that the latter part only complicates the logic. If,
>>>>>>> instead, we collect virtual addresses only, the handling of PT_LOAD
>>>>>>> entries becomes simpler; for example, they no longer need to be
>>>>>>> physically contiguous in a single entry, and reviewing and
>>>>>>> maintenance become easy.
>>>>>>
>>>>>> Sorry, I do not understand what you want to say.
>>>>>>
>>>>>
>>>>> The processing that adds the part not referenced by the page table to
>>>>> the vmcore is meaningless for gdb, and crash doesn't require it. So it
>>>>> only complicates the current logic.
>>>>
>>>> If the paging mode is on, we can also use crash to analyze the vmcore.
>>>> As the comment mentioned, the memory used by the 1st kernel may not be
>>>> referenced by the page table, so we need this logic.
>>>>
>>>
>>> As I said several times, crash users don't use paging mode. The only
>>> users of the paging mode are gdb users, just as you say. So the paging
>>> path needs to collect only the part referenced by the page table, since
>>> the other part is invisible to gdb.
>>
>> If crash can work both with and without paging, it should be default
>> *on* to avoid writing cores that can later on only be analyzed with that
>> tool. Still not sure, though, if that changes the requirement on what
>> memory regions should be written in that mode.
> 
> If this logic is not remvoed, crash can work both with and without paging.
> But the default value is 'off' now, because the option is '-p'.

And this would be unfortunate if you do not want to use crash for
analyzing (I'm working on gdb python scripts which will make gdb - one
day - at least as powerful as crash). If paging mode has the same
information that non-paging mode has, I would even suggest to drop it.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-09 10:05                     ` Jan Kiszka
@ 2012-03-09 10:06                       ` Jan Kiszka
  2012-03-09 12:53                         ` HATAYAMA Daisuke
  2012-03-12  1:52                       ` Wen Congyang
  1 sibling, 1 reply; 38+ messages in thread
From: Jan Kiszka @ 2012-03-09 10:06 UTC (permalink / raw)
  To: Wen Congyang; +Cc: eblake, HATAYAMA Daisuke, anderson, qemu-devel, lcapitulino

On 2012-03-09 11:05, Jan Kiszka wrote:
>>> If crash can work both with and without paging, it should be default
>>> *on* to avoid writing cores that can later on only be analyzed with that
>>> tool. Still not sure, though, if that changes the requirement on what
>>> memory regions should be written in that mode.
>>
>> If this logic is not removed, crash can work both with and without paging.
>> But the default value is 'off' now, because the option is '-p'.
> 
> And this would be unfortunate if you do not want to use crash for
> analyzing (I'm working on gdb python scripts which will make gdb - one
> day - at least as powerful as crash). If paging mode has the same
> information that non-paging mode has, I would even suggest to drop it.

Err, with "it" = "non-paging mode".

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-09 10:06                       ` Jan Kiszka
@ 2012-03-09 12:53                         ` HATAYAMA Daisuke
  2012-03-09 13:24                           ` Jan Kiszka
  0 siblings, 1 reply; 38+ messages in thread
From: HATAYAMA Daisuke @ 2012-03-09 12:53 UTC (permalink / raw)
  To: jan.kiszka; +Cc: eblake, anderson, qemu-devel, lcapitulino

From: Jan Kiszka <jan.kiszka@siemens.com>
Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
Date: Fri, 09 Mar 2012 11:06:30 +0100

> On 2012-03-09 11:05, Jan Kiszka wrote:
>>>> If crash can work both with and without paging, it should be default
>>>> *on* to avoid writing cores that can later on only be analyzed with that
>>>> tool. Still not sure, though, if that changes the requirement on what
>>>> memory regions should be written in that mode.
>>>
>>> If this logic is not removed, crash can work both with and without paging.
>>> But the default value is 'off' now, because the option is '-p'.
>> 
>> And this would be unfortunate if you do not want to use crash for
>> analyzing (I'm working on gdb python scripts which will make gdb - one
>> day - at least as powerful as crash). If paging mode has the same
>> information that non-paging mode has, I would even suggest to drop it.
> 
> Err, with "it" = "non-paging mode".
> 
> Jan

Paging by default is not a good idea. Performing paging in qemu is risky.

  - The guest machine is not always in a state where paging mode is
    enabled. Also, CR3 does not always refer to a page table.

  - If the guest machine is in a catastrophic state, its memory data
    could be corrupted. Then we cannot trust such a corrupted page table.

    # On this point, managing PT_LOAD program headers based on such
    # potentially corrupted data is risky.

Your idea of performing paging on the debugger side is better than
doing it in qemu.

Thanks.
HATAYAMA, Daisuke

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-09 12:53                         ` HATAYAMA Daisuke
@ 2012-03-09 13:24                           ` Jan Kiszka
  2012-03-12  6:16                             ` HATAYAMA Daisuke
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Kiszka @ 2012-03-09 13:24 UTC (permalink / raw)
  To: HATAYAMA Daisuke; +Cc: eblake, anderson, qemu-devel, lcapitulino

On 2012-03-09 13:53, HATAYAMA Daisuke wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
> Date: Fri, 09 Mar 2012 11:06:30 +0100
> 
>> On 2012-03-09 11:05, Jan Kiszka wrote:
>>>>> If crash can work both with and without paging, it should be default
>>>>> *on* to avoid writing cores that can later on only be analyzed with that
>>>>> tool. Still not sure, though, if that changes the requirement on what
>>>>> memory regions should be written in that mode.
>>>>
>>>> If this logic is not removed, crash can work both with and without paging.
>>>> But the default value is 'off' now, because the option is '-p'.
>>>
>>> And this would be unfortunate if you do not want to use crash for
>>> analyzing (I'm working on gdb python scripts which will make gdb - one
>>> day - at least as powerful as crash). If paging mode has the same
>>> information that non-paging mode has, I would even suggest to drop it.
>>
>> Err, with "it" = "non-paging mode".
>>
>> Jan
> 
> Paging by default is not a good idea. Performing paging in qemu is risky.
> 
>   - The guest machine is not always in a state where paging mode is
>     enabled. Also, CR3 does not always refer to a page table.

That's detectable and means physical == linear address.
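For reference, that detection is just CR0 bit 31 (PG); a minimal sketch, with function and macro names of my own choosing:

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PG_MASK (1ULL << 31)   /* CR0.PG: paging enabled */

/* If PG is clear, the guest does no paging and linear == physical. */
static bool guest_paging_enabled(uint64_t cr0)
{
    return (cr0 & CR0_PG_MASK) != 0;
}
```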

> 
>   - If the guest machine is in a catastrophic state, its memory data
>     could be corrupted. Then we cannot trust such a corrupted page table.

OK, here I agree.

> 
>     # On this point, managing PT_LOAD program headers based on such
>     # potentially corrupted data is risky.
> 
> Your idea of performing paging on the debugger side is better than
> doing it in qemu.

Another alternative is to add guest-awareness to the dump code. If we
detect (or get told) that the target is a Linux kernel, at least the
linear kernel mapping can be written reliably.
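A sketch of what that guest-awareness would mean for x86-64 Linux of this era, where the direct mapping started at __PAGE_OFFSET = 0xffff880000000000. The constant is kernel- and configuration-dependent, which is exactly the OS knowledge the dump code would have to carry:

```c
#include <stdint.h>

/* Assumed __PAGE_OFFSET for contemporary x86-64 Linux; other kernels
 * and configurations use different layouts, so a real implementation
 * would need to detect or be told this value. */
#define LINUX_PAGE_OFFSET 0xffff880000000000ULL

/* Translate a guest-physical address into the kernel's linear mapping. */
static uint64_t linear_map_vaddr(uint64_t paddr)
{
    return paddr + LINUX_PAGE_OFFSET;
}
```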

But also the fact that there can be as many different page tables as
active processors means that paging likely needs a second thought and
more awareness of the debugger.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-09 10:05                     ` Jan Kiszka
  2012-03-09 10:06                       ` Jan Kiszka
@ 2012-03-12  1:52                       ` Wen Congyang
  1 sibling, 0 replies; 38+ messages in thread
From: Wen Congyang @ 2012-03-12  1:52 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: eblake, HATAYAMA Daisuke, anderson, qemu-devel, lcapitulino

At 03/09/2012 06:05 PM, Jan Kiszka Wrote:
> On 2012-03-09 10:57, Wen Congyang wrote:
>> At 03/09/2012 05:41 PM, Jan Kiszka Wrote:
>>> On 2012-03-09 03:53, HATAYAMA Daisuke wrote:
>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>>> Date: Fri, 09 Mar 2012 10:26:56 +0800
>>>>
>>>>> At 03/09/2012 10:05 AM, HATAYAMA Daisuke Wrote:
>>>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>>>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>>>>> Date: Fri, 09 Mar 2012 09:46:31 +0800
>>>>>>
>>>>>>> At 03/09/2012 08:40 AM, HATAYAMA Daisuke Wrote:
>>>>>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>>>>>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>>>>>>> Date: Thu, 08 Mar 2012 16:52:29 +0800
>>>>>>>>
>>>>>>>>> At 03/07/2012 11:27 PM, HATAYAMA Daisuke Wrote:
>>>>>>>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>>>>>>>> Subject: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>>>>>>>>>> Date: Fri, 02 Mar 2012 18:18:23 +0800
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>>>>>>> How does the memory portion referenced by PT_LOAD program headers with
>>>>>>>>>> p_vaddr == 0 look through gdb? If we cannot access such portions, the
>>>>>>>>>> part not referenced by the page table that CR3 holds is unnecessary,
>>>>>>>>>> isn't it?
>>>>>>>>>
>>>>>>>>> The part is unnecessary if you use gdb. But it is necessary if you use crash.
>>>>>>>>>
>>>>>>>>
>>>>>>>> crash users would not use the paging option because even without
>>>>>>>> using it we can see all memory well, so the paging option is only for
>>>>>>>> gdb users.
>>>>>>>
>>>>>>> Yes, the paging option is only for gdb users. The default value is off.
>>>>>>>
>>>>>>>>
>>>>>>>> It looks to me that the latter part only complicates the logic. If,
>>>>>>>> instead, we collect virtual addresses only, the handling of PT_LOAD
>>>>>>>> entries becomes simpler; for example, they no longer need to be
>>>>>>>> physically contiguous in a single entry, and reviewing and
>>>>>>>> maintenance become easy.
>>>>>>>
>>>>>>> Sorry, I do not understand what you want to say.
>>>>>>>
>>>>>>
>>>>>> The processing that adds the part not referenced by the page table to
>>>>>> the vmcore is meaningless for gdb, and crash doesn't require it. So it
>>>>>> only complicates the current logic.
>>>>>
>>>>> If the paging mode is on, we can also use crash to analyze the vmcore.
>>>>> As the comment mentioned, the memory used by the 1st kernel may not be
>>>>> referenced by the page table, so we need this logic.
>>>>>
>>>>
>>>> As I said several times, crash users don't use paging mode. The only
>>>> users of the paging mode are gdb users, just as you say. So the paging
>>>> path needs to collect only the part referenced by the page table, since
>>>> the other part is invisible to gdb.
>>>
>>> If crash can work both with and without paging, it should be default
>>> *on* to avoid writing cores that can later on only be analyzed with that
>>> tool. Still not sure, though, if that changes the requirement on what
>>> memory regions should be written in that mode.
>>
>> If this logic is not removed, crash can work both with and without paging.
>> But the default value is 'off' now, because the option is '-p'.
> 
> And this would be unfortunate if you do not want to use crash for
> analyzing (I'm working on gdb python scripts which will make gdb - one
> day - at least as powerful as crash). If paging mode has the same
> information that non-paging mode has, I would even suggest to drop it.

I do not have any knowledge about gdb python scripts. But is it OK for
gdb to work without virtual addresses in PT_LOAD?
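To make the question concrete, here is a sketch of a PT_LOAD entry that carries only a physical mapping. gdb resolves memory reads through p_vaddr, so such a segment is effectively invisible to it, while crash can still use p_paddr. The helper name is mine, not the patch's:

```c
#include <elf.h>
#include <string.h>

/* Build a PT_LOAD entry that records only a physical mapping. */
static Elf64_Phdr phys_only_phdr(Elf64_Addr paddr, uint64_t size,
                                 Elf64_Off file_off)
{
    Elf64_Phdr ph;

    memset(&ph, 0, sizeof(ph));
    ph.p_type   = PT_LOAD;
    ph.p_offset = file_off;
    ph.p_paddr  = paddr;
    ph.p_vaddr  = 0;        /* no (known) virtual mapping for gdb */
    ph.p_filesz = size;
    ph.p_memsz  = size;
    return ph;
}
```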

Thanks
Wen Congyang

> 
> Jan
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-09 13:24                           ` Jan Kiszka
@ 2012-03-12  6:16                             ` HATAYAMA Daisuke
  2012-03-12  6:26                               ` HATAYAMA Daisuke
  0 siblings, 1 reply; 38+ messages in thread
From: HATAYAMA Daisuke @ 2012-03-12  6:16 UTC (permalink / raw)
  To: jan.kiszka; +Cc: eblake, anderson, qemu-devel, lcapitulino

From: Jan Kiszka <jan.kiszka@siemens.com>
Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
Date: Fri, 09 Mar 2012 14:24:41 +0100

> On 2012-03-09 13:53, HATAYAMA Daisuke wrote:
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>> Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping
>> Date: Fri, 09 Mar 2012 11:06:30 +0100
>> 
>>> On 2012-03-09 11:05, Jan Kiszka wrote:
>>>>>> If crash can work both with and without paging, it should be default
>>>>>> *on* to avoid writing cores that can later on only be analyzed with that
>>>>>> tool. Still not sure, though, if that changes the requirement on what
>>>>>> memory regions should be written in that mode.
>>>>>
>>>>> If this logic is not removed, crash can work both with and without paging.
>>>>> But the default value is 'off' now, because the option is '-p'.
>>>>
>>>> And this would be unfortunate if you do not want to use crash for
>>>> analyzing (I'm working on gdb python scripts which will make gdb - one
>>>> day - at least as powerful as crash). If paging mode has the same
>>>> information that non-paging mode has, I would even suggest to drop it.
>>>
>>> Err, with "it" = "non-paging mode".
>>>
>>> Jan
>> 
>> Paging by default is not a good idea. Performing paging in qemu is risky.
>> 
>>   - The guest machine is not always in a state where paging mode is
>>     enabled. Also, CR3 does not always refer to a page table.
> 
> That's detectable and means physical == linear address.
> 

CR0 itself is one of the guest's resources. There's still the issue of
whether to trust CR0 or not.

The assumption behind my idea is that the host is running in a good
condition but the guest is in a bad condition. So we can use qemu dump,
which is a feature of the host.

Even just checking whether CR3 refers to a page table correctly is
considerably complicated. CR3 can hold any physical address, and there
is no hint, such as a magic number, to tell that it is invalid. So the
only way to check whether CR3 correctly points at a page table is to
check whether we can see sensible data by actually performing paging
for some virtual address. That virtual address had better be a typical
one, but such addresses tend to be guest-specific, and I think that is
not suitable for qemu due to the OS dependency.
# Of course, this discussion is based on the assumption that the data
# could be corrupted.

>> 
>>   - If the guest machine is in a catastrophic state, its memory data
>>     could be corrupted. Then we cannot trust such a corrupted page table.
> 
> OK, here I agree.
> 
>> 
>>     # On this point, managing PT_LOAD program headers based on such
>>     # potentially corrupted data is risky.
>> 
>> Your idea of performing paging on the debugger side is better than
>> doing it in qemu.
> 
> Another alternative is to add guest-awareness to the dump code. If we
> detect (or get told) that the target is a Linux kernel, at least the
> linear kernel mapping can be written reliably.
> 
> But also the fact that there can be as many different page tables as
> active processors means that paging likely needs a second thought and
> more awareness of the debugger.

Also, it seems to me that the only feature gdb lacks when used on a
dump file, compared with attaching to a running guest, is paging.
Paging is an architectural feature, which is unlikely to change in the
future. Supporting it is better for maintainability than introducing
an OS-level dependency.

Thanks.
HATAYAMA, Daisuke

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
  2012-03-12  6:16                             ` HATAYAMA Daisuke
@ 2012-03-12  6:26                               ` HATAYAMA Daisuke
  0 siblings, 0 replies; 38+ messages in thread
From: HATAYAMA Daisuke @ 2012-03-12  6:26 UTC (permalink / raw)
  To: jan.kiszka; +Cc: anderson, eblake, qemu-devel, lcapitulino

From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Subject: Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
Date: Mon, 12 Mar 2012 15:16:55 +0900

> 
> The assumption behind my idea is the host is running in a good
> condition but the quest in a bad condition. So we can use qemu dump,
> which is the host's feature.

There is also the situation where a guest is running in a good
condition and users can tell that its data is obviously not corrupted.
For such a situation, it is natural for qemu dump to have a paging
mode to some extent.

Thanks.
HATAYAMA, Daisuke

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2012-03-12  6:26 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-02  9:59 [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
2012-03-02 10:02 ` [Qemu-devel] [RFC][PATCH 01/16 v8] Add API to create memory mapping list Wen Congyang
2012-03-02 10:06 ` [Qemu-devel] [RFC][PATCH 02/16 v8] Add API to check whether a physical address is I/O address Wen Congyang
2012-03-02 10:08 ` [Qemu-devel] [RFC][PATCH 03/16 v8] implement cpu_get_memory_mapping() Wen Congyang
2012-03-02 10:12 ` [Qemu-devel] [RFC][PATCH 04/16 v8] Add API to check whether paging mode is enabled Wen Congyang
2012-03-02 10:18 ` [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping Wen Congyang
2012-03-07 15:27   ` HATAYAMA Daisuke
2012-03-08  8:52     ` Wen Congyang
2012-03-09  0:40       ` HATAYAMA Daisuke
2012-03-09  1:46         ` Wen Congyang
2012-03-09  2:05           ` HATAYAMA Daisuke
2012-03-09  2:26             ` Wen Congyang
2012-03-09  2:53               ` HATAYAMA Daisuke
2012-03-09  9:41                 ` Jan Kiszka
2012-03-09  9:57                   ` Wen Congyang
2012-03-09 10:05                     ` Jan Kiszka
2012-03-09 10:06                       ` Jan Kiszka
2012-03-09 12:53                         ` HATAYAMA Daisuke
2012-03-09 13:24                           ` Jan Kiszka
2012-03-12  6:16                             ` HATAYAMA Daisuke
2012-03-12  6:26                               ` HATAYAMA Daisuke
2012-03-12  1:52                       ` Wen Congyang
2012-03-09  9:43                 ` Jan Kiszka
2012-03-02 10:23 ` [Qemu-devel] [RFC][PATCH 06/16 v8] Add API to get memory mapping without doing paging Wen Congyang
2012-03-02 10:27 ` [Qemu-devel] [RFC][PATCH 07/16 v8] target-i386: Add API to write elf notes to core file Wen Congyang
2012-03-02 10:31 ` [Qemu-devel] [RFC][PATCH 08/16 v8] target-i386: Add API to write cpu status " Wen Congyang
2012-03-02 10:33 ` [Qemu-devel] [RFC][PATCH 09/16 v8] target-i386: add API to get dump info Wen Congyang
2012-03-02 10:38 ` [Qemu-devel] [RFC][PATCH 10/16 v8] make gdb_id() generally avialable Wen Congyang
2012-03-02 10:42 ` [Qemu-devel] [RFC][PATCH 11/16 v8] introduce a new monitor command 'dump' to dump guest's memory Wen Congyang
2012-03-02 10:43 ` [Qemu-devel] [RFC][PATCH 12/16 v8] support to cancel the current dumping Wen Congyang
2012-03-02 10:44 ` [Qemu-devel] [RFC][PATCH 13/16 v8] support to query dumping status Wen Congyang
2012-03-02 10:44 ` [Qemu-devel] [RFC][PATCH 14/16 v8] run dump at the background Wen Congyang
2012-03-02 10:45 ` [Qemu-devel] [RFC][PATCH 15/16 v8] support detached dump Wen Congyang
2012-03-02 10:46 ` [Qemu-devel] [RFC][PATCH 16/16 v8] allow user to dump a fraction of the memory Wen Congyang
2012-03-05  9:12 ` [Qemu-devel] [RFC][PATCH 00/16 v8] introducing a new, dedicated memory dump mechanism Wen Congyang
2012-03-06  0:41   ` Luiz Capitulino
2012-03-07 17:38     ` Luiz Capitulino
2012-03-08  8:55       ` Wen Congyang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.