All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism
@ 2012-03-26  9:58 Wen Congyang
  2012-03-26 10:00 ` [Qemu-devel] [PATCH 01/12 v11] Add API to create memory mapping list Wen Congyang
                   ` (12 more replies)
  0 siblings, 13 replies; 21+ messages in thread
From: Wen Congyang @ 2012-03-26  9:58 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

Hi, all

'virsh dump' can not work when host pci device is used by guest. We have
discussed this issue here:
http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html

The last version is here:
http://lists.nongnu.org/archive/html/qemu-devel/2012-03/msg03866.html

We have determined to introduce a new command dump-guest-memory to dump
guest's memory. The core file's format is elf32 or elf64.

Note:
1. The guest should be x86 or x86_64. The other arch is not supported now.
2. If you use old gdb, gdb may crash. I use gdb-7.3.1, and it does not crash.
3. If the OS is in the second kernel, gdb may not work well, and crash can
   work by specifying '--machdep phys_addr=xxx' in the command line. The
   reason is that the second kernel will update the page table, and we can
   not get the page table for the first kernel.
4. The cpu's state is stored in QEMU note. You neet to modify crash to use
   it to calculate phys_base.
5. If the guest OS is 32 bit and the memory size is larger than 4G, the vmcore
   is elf64 format. You should use the gdb which is built with --enable-64-bit-bfd.
6. This patchset is based on the upstream tree, and apply one patch that is still
   in Luiz Capitulino's tree, because I use the API qemu_get_fd() in this patchset.

Changes from v10 to v11:
1. addressed Luiz's and Hatayam's comment
2. fix a bug about filtering feature

Changes from v9 to v10:
1. fix some bug
2. addressed Luiz's and Hatayam's comment
3. remove cancel and query command

Changes from v8 to v9:
1. remove async support(it will be reimplemented after QAPI async commands support
   is finished)
2. fix some typo error

Changes from v7 to v8:
1. addressed Hatayama's comments

Changes from v6 to v7:
1. addressed Jan's comments
2. fix some bugs
3. store cpu's state into the vmcore

Changes from v5 to v6:
1. allow user to dump a fraction of the memory
2. fix some bugs

Changes from v4 to v5:
1. convert the new command dump to QAPI 

Changes from v3 to v4:
1. support it to run asynchronously
2. add API to cancel dumping and query dumping progress
3. add API to control dumping speed
4. auto cancel dumping when the user resumes vm, and the status is failed.

Changes from v2 to v3:
1. address Jan Kiszka's comment

Changes from v1 to v2:
1. fix virt addr in the vmcore.

Wen Congyang (12):
  Add API to create memory mapping list
  Add API to check whether a physical address is I/O address
  implement cpu_get_memory_mapping()
  Add API to check whether paging mode is enabled
  Add API to get memory mapping
  Add API to get memory mapping without do paging
  target-i386: Add API to write elf notes to core file
  target-i386: Add API to write cpu status to core file
  target-i386: add API to get dump info
  make gdb_id() generally avialable and rename it to cpu_index()
  QError: Introduce new error for the dump-guest-memory command
  introduce a new monitor command 'dump-guest-memory' to dump guest's
    memory

 Makefile.target                   |    3 +
 configure                         |    8 +
 cpu-all.h                         |   67 +++
 cpu-common.h                      |    2 +
 dump.c                            |  827 +++++++++++++++++++++++++++++++++++++
 dump.h                            |   23 +
 elf.h                             |    5 +
 exec.c                            |    9 +
 gdbstub.c                         |   19 +-
 gdbstub.h                         |    9 +
 hmp-commands.hx                   |   28 ++
 hmp.c                             |   22 +
 hmp.h                             |    1 +
 memory_mapping.c                  |  236 +++++++++++
 memory_mapping.h                  |   68 +++
 qapi-schema.json                  |   34 ++
 qerror.c                          |    4 +
 qerror.h                          |    3 +
 qmp-commands.hx                   |   38 ++
 target-i386/arch_dump.c           |  426 +++++++++++++++++++
 target-i386/arch_memory_mapping.c |  271 ++++++++++++
 21 files changed, 2089 insertions(+), 14 deletions(-)
 create mode 100644 dump.c
 create mode 100644 dump.h
 create mode 100644 memory_mapping.c
 create mode 100644 memory_mapping.h
 create mode 100644 target-i386/arch_dump.c
 create mode 100644 target-i386/arch_memory_mapping.c

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 01/12 v11] Add API to create memory mapping list
  2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
@ 2012-03-26 10:00 ` Wen Congyang
  2012-03-26 10:01 ` [Qemu-devel] [PATCH 02/12 v12] Add API to check whether a physical address is I/O address Wen Congyang
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Wen Congyang @ 2012-03-26 10:00 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

The memory mapping list stores virtual address and physical address mapping.
The virtual address and physical address are contiguous in the mapping.
The folloing patch will use this information to create PT_LOAD in the vmcore.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target  |    1 +
 memory_mapping.c |  166 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 memory_mapping.h |   47 +++++++++++++++
 3 files changed, 214 insertions(+), 0 deletions(-)
 create mode 100644 memory_mapping.c
 create mode 100644 memory_mapping.h

diff --git a/Makefile.target b/Makefile.target
index 44b2e83..33680ae 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -220,6 +220,7 @@ obj-$(CONFIG_KVM) += kvm.o kvm-all.o
 obj-$(CONFIG_NO_KVM) += kvm-stub.o
 obj-$(CONFIG_VGA) += vga.o
 obj-y += memory.o savevm.o
+obj-y += memory_mapping.o
 LIBS+=-lz
 
 obj-i386-$(CONFIG_KVM) += hyperv.o
diff --git a/memory_mapping.c b/memory_mapping.c
new file mode 100644
index 0000000..718f271
--- /dev/null
+++ b/memory_mapping.c
@@ -0,0 +1,166 @@
+/*
+ * QEMU memory mapping
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "cpu.h"
+#include "cpu-all.h"
+#include "memory_mapping.h"
+
+static void memory_mapping_list_add_mapping_sorted(MemoryMappingList *list,
+                                                   MemoryMapping *mapping)
+{
+    MemoryMapping *p;
+
+    QTAILQ_FOREACH(p, &list->head, next) {
+        if (p->phys_addr >= mapping->phys_addr) {
+            QTAILQ_INSERT_BEFORE(p, mapping, next);
+            return;
+        }
+    }
+    QTAILQ_INSERT_TAIL(&list->head, mapping, next);
+}
+
+static void create_new_memory_mapping(MemoryMappingList *list,
+                                      target_phys_addr_t phys_addr,
+                                      target_phys_addr_t virt_addr,
+                                      ram_addr_t length)
+{
+    MemoryMapping *memory_mapping;
+
+    memory_mapping = g_malloc(sizeof(MemoryMapping));
+    memory_mapping->phys_addr = phys_addr;
+    memory_mapping->virt_addr = virt_addr;
+    memory_mapping->length = length;
+    list->last_mapping = memory_mapping;
+    list->num++;
+    memory_mapping_list_add_mapping_sorted(list, memory_mapping);
+}
+
+static inline bool mapping_contiguous(MemoryMapping *map,
+                                      target_phys_addr_t phys_addr,
+                                      target_phys_addr_t virt_addr)
+{
+    return phys_addr == map->phys_addr + map->length &&
+           virt_addr == map->virt_addr + map->length;
+}
+
+/*
+ * [map->phys_addr, map->phys_addr + map->length) and
+ * [phys_addr, phys_addr + length) have intersection?
+ */
+static inline bool mapping_have_same_region(MemoryMapping *map,
+                                            target_phys_addr_t phys_addr,
+                                            ram_addr_t length)
+{
+    return !(phys_addr + length < map->phys_addr ||
+             phys_addr >= map->phys_addr + map->length);
+}
+
+/*
+ * [map->phys_addr, map->phys_addr + map->length) and
+ * [phys_addr, phys_addr + length) have intersection. The virtual address in the
+ * intersection are the same?
+ */
+static inline bool mapping_conflict(MemoryMapping *map,
+                                    target_phys_addr_t phys_addr,
+                                    target_phys_addr_t virt_addr)
+{
+    return virt_addr - map->virt_addr != phys_addr - map->phys_addr;
+}
+
+/*
+ * [map->virt_addr, map->virt_addr + map->length) and
+ * [virt_addr, virt_addr + length) have intersection. And the physical address
+ * in the intersection are the same.
+ */
+static inline void mapping_merge(MemoryMapping *map,
+                                 target_phys_addr_t virt_addr,
+                                 ram_addr_t length)
+{
+    if (virt_addr < map->virt_addr) {
+        map->length += map->virt_addr - virt_addr;
+        map->virt_addr = virt_addr;
+    }
+
+    if ((virt_addr + length) >
+        (map->virt_addr + map->length)) {
+        map->length = virt_addr + length - map->virt_addr;
+    }
+}
+
+void memory_mapping_list_add_merge_sorted(MemoryMappingList *list,
+                                          target_phys_addr_t phys_addr,
+                                          target_phys_addr_t virt_addr,
+                                          ram_addr_t length)
+{
+    MemoryMapping *memory_mapping, *last_mapping;
+
+    if (QTAILQ_EMPTY(&list->head)) {
+        create_new_memory_mapping(list, phys_addr, virt_addr, length);
+        return;
+    }
+
+    last_mapping = list->last_mapping;
+    if (last_mapping) {
+        if (mapping_contiguous(last_mapping, phys_addr, virt_addr)) {
+            last_mapping->length += length;
+            return;
+        }
+    }
+
+    QTAILQ_FOREACH(memory_mapping, &list->head, next) {
+        if (mapping_contiguous(memory_mapping, phys_addr, virt_addr)) {
+            memory_mapping->length += length;
+            list->last_mapping = memory_mapping;
+            return;
+        }
+
+        if (phys_addr + length < memory_mapping->phys_addr) {
+            /* create a new region before memory_mapping */
+            break;
+        }
+
+        if (mapping_have_same_region(memory_mapping, phys_addr, length)) {
+            if (mapping_conflict(memory_mapping, phys_addr, virt_addr)) {
+                continue;
+            }
+
+            /* merge this region into memory_mapping */
+            mapping_merge(memory_mapping, virt_addr, length);
+            list->last_mapping = memory_mapping;
+            return;
+        }
+    }
+
+    /* this region can not be merged into any existed memory mapping. */
+    create_new_memory_mapping(list, phys_addr, virt_addr, length);
+}
+
+void memory_mapping_list_free(MemoryMappingList *list)
+{
+    MemoryMapping *p, *q;
+
+    QTAILQ_FOREACH_SAFE(p, &list->head, next, q) {
+        QTAILQ_REMOVE(&list->head, p, next);
+        g_free(p);
+    }
+
+    list->num = 0;
+    list->last_mapping = NULL;
+}
+
+void memory_mapping_list_init(MemoryMappingList *list)
+{
+    list->num = 0;
+    list->last_mapping = NULL;
+    QTAILQ_INIT(&list->head);
+}
diff --git a/memory_mapping.h b/memory_mapping.h
new file mode 100644
index 0000000..836b047
--- /dev/null
+++ b/memory_mapping.h
@@ -0,0 +1,47 @@
+/*
+ * QEMU memory mapping
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef MEMORY_MAPPING_H
+#define MEMORY_MAPPING_H
+
+#include "qemu-queue.h"
+
+/* The physical and virtual address in the memory mapping are contiguous. */
+typedef struct MemoryMapping {
+    target_phys_addr_t phys_addr;
+    target_ulong virt_addr;
+    ram_addr_t length;
+    QTAILQ_ENTRY(MemoryMapping) next;
+} MemoryMapping;
+
+typedef struct MemoryMappingList {
+    unsigned int num;
+    MemoryMapping *last_mapping;
+    QTAILQ_HEAD(, MemoryMapping) head;
+} MemoryMappingList;
+
+/*
+ * add or merge the memory region [phys_addr, phys_addr + length) into the
+ * memory mapping's list. The region's virtual address starts with virt_addr,
+ * and is contiguous. The list is sorted by phys_addr.
+ */
+void memory_mapping_list_add_merge_sorted(MemoryMappingList *list,
+                                          target_phys_addr_t phys_addr,
+                                          target_phys_addr_t virt_addr,
+                                          ram_addr_t length);
+
+void memory_mapping_list_free(MemoryMappingList *list);
+
+void memory_mapping_list_init(MemoryMappingList *list);
+
+#endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 02/12 v12] Add API to check whether a physical address is I/O address
  2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
  2012-03-26 10:00 ` [Qemu-devel] [PATCH 01/12 v11] Add API to create memory mapping list Wen Congyang
@ 2012-03-26 10:01 ` Wen Congyang
  2012-03-26 10:02 ` [Qemu-devel] [PATCH 03/12 v11] implement cpu_get_memory_mapping() Wen Congyang
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Wen Congyang @ 2012-03-26 10:01 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

This API will be used in the following patch.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 cpu-common.h |    2 ++
 exec.c       |    9 +++++++++
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/cpu-common.h b/cpu-common.h
index dca5175..fcd50dc 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -71,6 +71,8 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
 void *cpu_register_map_client(void *opaque, void (*callback)(void *opaque));
 void cpu_unregister_map_client(void *cookie);
 
+bool cpu_physical_memory_is_io(target_phys_addr_t phys_addr);
+
 /* Coalesced MMIO regions are areas where write operations can be reordered.
  * This usually implies that write operations are side-effect free.  This allows
  * batching which can make a major impact on performance when using
diff --git a/exec.c b/exec.c
index 6731ab8..b3fd8cb 100644
--- a/exec.c
+++ b/exec.c
@@ -4657,3 +4657,12 @@ bool virtio_is_big_endian(void)
 #undef env
 
 #endif
+
+bool cpu_physical_memory_is_io(target_phys_addr_t phys_addr)
+{
+    MemoryRegionSection *section;
+
+    section = phys_page_find(phys_addr >> TARGET_PAGE_BITS);
+
+    return !is_ram_rom_romd(section);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 03/12 v11] implement cpu_get_memory_mapping()
  2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
  2012-03-26 10:00 ` [Qemu-devel] [PATCH 01/12 v11] Add API to create memory mapping list Wen Congyang
  2012-03-26 10:01 ` [Qemu-devel] [PATCH 02/12 v12] Add API to check whether a physical address is I/O address Wen Congyang
@ 2012-03-26 10:02 ` Wen Congyang
  2012-03-26 10:02 ` [Qemu-devel] [PATCH 04/12 v11] Add API to check whether paging mode is enabled Wen Congyang
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Wen Congyang @ 2012-03-26 10:02 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

Walk cpu's page table and collect all virtual address and physical address mapping.
Then, add these mapping into memory mapping list. If the guest does not use paging,
it will do nothing. Note: the I/O memory will be skipped.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target                   |    1 +
 configure                         |    4 +
 cpu-all.h                         |   11 ++
 target-i386/arch_memory_mapping.c |  266 +++++++++++++++++++++++++++++++++++++
 4 files changed, 282 insertions(+), 0 deletions(-)
 create mode 100644 target-i386/arch_memory_mapping.c

diff --git a/Makefile.target b/Makefile.target
index 33680ae..e88bfcf 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -89,6 +89,7 @@ libobj-y += helper.o
 ifeq ($(TARGET_BASE_ARCH), i386)
 libobj-y += cpuid.o
 endif
+libobj-$(CONFIG_HAVE_GET_MEMORY_MAPPING) += arch_memory_mapping.o
 libobj-$(TARGET_SPARC64) += vis_helper.o
 libobj-$(CONFIG_NEED_MMU) += mmu.o
 libobj-$(TARGET_ARM) += neon_helper.o iwmmxt_helper.o
diff --git a/configure b/configure
index 14ef738..0e05b84 100755
--- a/configure
+++ b/configure
@@ -3659,6 +3659,10 @@ case "$target_arch2" in
       fi
     fi
 esac
+case "$target_arch2" in
+  i386|x86_64)
+    echo "CONFIG_HAVE_GET_MEMORY_MAPPING=y" >> $config_target_mak
+esac
 if test "$target_arch2" = "ppc64" -a "$fdt" = "yes"; then
   echo "CONFIG_PSERIES=y" >> $config_target_mak
 fi
diff --git a/cpu-all.h b/cpu-all.h
index 9621c3c..390b55b 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -22,6 +22,7 @@
 #include "qemu-common.h"
 #include "qemu-tls.h"
 #include "cpu-common.h"
+#include "memory_mapping.h"
 
 /* some important defines:
  *
@@ -525,4 +526,14 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf);
 int cpu_memory_rw_debug(CPUArchState *env, target_ulong addr,
                         uint8_t *buf, int len, int is_write);
 
+#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
+int cpu_get_memory_mapping(MemoryMappingList *list, CPUArchState *env);
+#else
+static inline int cpu_get_memory_mapping(MemoryMappingList *list,
+                                         CPUArchState *env)
+{
+    return -1;
+}
+#endif
+
 #endif /* CPU_ALL_H */
diff --git a/target-i386/arch_memory_mapping.c b/target-i386/arch_memory_mapping.c
new file mode 100644
index 0000000..dd64bec
--- /dev/null
+++ b/target-i386/arch_memory_mapping.c
@@ -0,0 +1,266 @@
+/*
+ * i386 memory mapping
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "cpu.h"
+#include "cpu-all.h"
+
+/* PAE Paging or IA-32e Paging */
+static void walk_pte(MemoryMappingList *list, target_phys_addr_t pte_start_addr,
+                     int32_t a20_mask, target_ulong start_line_addr)
+{
+    target_phys_addr_t pte_addr, start_paddr;
+    uint64_t pte;
+    target_ulong start_vaddr;
+    int i;
+
+    for (i = 0; i < 512; i++) {
+        pte_addr = (pte_start_addr + i * 8) & a20_mask;
+        pte = ldq_phys(pte_addr);
+        if (!(pte & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        start_paddr = (pte & ~0xfff) & ~(0x1ULL << 63);
+        if (cpu_physical_memory_is_io(start_paddr)) {
+            /* I/O region */
+            continue;
+        }
+
+        start_vaddr = start_line_addr | ((i & 0x1fff) << 12);
+        memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                             start_vaddr, 1 << 12);
+    }
+}
+
+/* 32-bit Paging */
+static void walk_pte2(MemoryMappingList *list,
+                      target_phys_addr_t pte_start_addr, int32_t a20_mask,
+                      target_ulong start_line_addr)
+{
+    target_phys_addr_t pte_addr, start_paddr;
+    uint32_t pte;
+    target_ulong start_vaddr;
+    int i;
+
+    for (i = 0; i < 1024; i++) {
+        pte_addr = (pte_start_addr + i * 4) & a20_mask;
+        pte = ldl_phys(pte_addr);
+        if (!(pte & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        start_paddr = pte & ~0xfff;
+        if (cpu_physical_memory_is_io(start_paddr)) {
+            /* I/O region */
+            continue;
+        }
+
+        start_vaddr = start_line_addr | ((i & 0x3ff) << 12);
+        memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                             start_vaddr, 1 << 12);
+    }
+}
+
+/* PAE Paging or IA-32e Paging */
+static void walk_pde(MemoryMappingList *list, target_phys_addr_t pde_start_addr,
+                     int32_t a20_mask, target_ulong start_line_addr)
+{
+    target_phys_addr_t pde_addr, pte_start_addr, start_paddr;
+    uint64_t pde;
+    target_ulong line_addr, start_vaddr;
+    int i;
+
+    for (i = 0; i < 512; i++) {
+        pde_addr = (pde_start_addr + i * 8) & a20_mask;
+        pde = ldq_phys(pde_addr);
+        if (!(pde & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = start_line_addr | ((i & 0x1ff) << 21);
+        if (pde & PG_PSE_MASK) {
+            /* 2 MB page */
+            start_paddr = (pde & ~0x1fffff) & ~(0x1ULL << 63);
+            if (cpu_physical_memory_is_io(start_paddr)) {
+                /* I/O region */
+                continue;
+            }
+            start_vaddr = line_addr;
+            memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                                 start_vaddr, 1 << 21);
+            continue;
+        }
+
+        pte_start_addr = (pde & ~0xfff) & a20_mask;
+        walk_pte(list, pte_start_addr, a20_mask, line_addr);
+    }
+}
+
+/* 32-bit Paging */
+static void walk_pde2(MemoryMappingList *list,
+                      target_phys_addr_t pde_start_addr, int32_t a20_mask,
+                      bool pse)
+{
+    target_phys_addr_t pde_addr, pte_start_addr, start_paddr;
+    uint32_t pde;
+    target_ulong line_addr, start_vaddr;
+    int i;
+
+    for (i = 0; i < 1024; i++) {
+        pde_addr = (pde_start_addr + i * 4) & a20_mask;
+        pde = ldl_phys(pde_addr);
+        if (!(pde & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = (((unsigned int)i & 0x3ff) << 22);
+        if ((pde & PG_PSE_MASK) && pse) {
+            /* 4 MB page */
+            start_paddr = (pde & ~0x3fffff) | ((pde & 0x1fe000) << 19);
+            if (cpu_physical_memory_is_io(start_paddr)) {
+                /* I/O region */
+                continue;
+            }
+            start_vaddr = line_addr;
+            memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                                 start_vaddr, 1 << 22);
+            continue;
+        }
+
+        pte_start_addr = (pde & ~0xfff) & a20_mask;
+        walk_pte2(list, pte_start_addr, a20_mask, line_addr);
+    }
+}
+
+/* PAE Paging */
+static void walk_pdpe2(MemoryMappingList *list,
+                       target_phys_addr_t pdpe_start_addr, int32_t a20_mask)
+{
+    target_phys_addr_t pdpe_addr, pde_start_addr;
+    uint64_t pdpe;
+    target_ulong line_addr;
+    int i;
+
+    for (i = 0; i < 4; i++) {
+        pdpe_addr = (pdpe_start_addr + i * 8) & a20_mask;
+        pdpe = ldq_phys(pdpe_addr);
+        if (!(pdpe & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = (((unsigned int)i & 0x3) << 30);
+        pde_start_addr = (pdpe & ~0xfff) & a20_mask;
+        walk_pde(list, pde_start_addr, a20_mask, line_addr);
+    }
+}
+
+#ifdef TARGET_X86_64
+/* IA-32e Paging */
+static void walk_pdpe(MemoryMappingList *list,
+                      target_phys_addr_t pdpe_start_addr, int32_t a20_mask,
+                      target_ulong start_line_addr)
+{
+    target_phys_addr_t pdpe_addr, pde_start_addr, start_paddr;
+    uint64_t pdpe;
+    target_ulong line_addr, start_vaddr;
+    int i;
+
+    for (i = 0; i < 512; i++) {
+        pdpe_addr = (pdpe_start_addr + i * 8) & a20_mask;
+        pdpe = ldq_phys(pdpe_addr);
+        if (!(pdpe & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = start_line_addr | ((i & 0x1ffULL) << 30);
+        if (pdpe & PG_PSE_MASK) {
+            /* 1 GB page */
+            start_paddr = (pdpe & ~0x3fffffff) & ~(0x1ULL << 63);
+            if (cpu_physical_memory_is_io(start_paddr)) {
+                /* I/O region */
+                continue;
+            }
+            start_vaddr = line_addr;
+            memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                                 start_vaddr, 1 << 30);
+            continue;
+        }
+
+        pde_start_addr = (pdpe & ~0xfff) & a20_mask;
+        walk_pde(list, pde_start_addr, a20_mask, line_addr);
+    }
+}
+
+/* IA-32e Paging */
+static void walk_pml4e(MemoryMappingList *list,
+                       target_phys_addr_t pml4e_start_addr, int32_t a20_mask)
+{
+    target_phys_addr_t pml4e_addr, pdpe_start_addr;
+    uint64_t pml4e;
+    target_ulong line_addr;
+    int i;
+
+    for (i = 0; i < 512; i++) {
+        pml4e_addr = (pml4e_start_addr + i * 8) & a20_mask;
+        pml4e = ldq_phys(pml4e_addr);
+        if (!(pml4e & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = ((i & 0x1ffULL) << 39) | (0xffffULL << 48);
+        pdpe_start_addr = (pml4e & ~0xfff) & a20_mask;
+        walk_pdpe(list, pdpe_start_addr, a20_mask, line_addr);
+    }
+}
+#endif
+
+int cpu_get_memory_mapping(MemoryMappingList *list, CPUArchState *env)
+{
+    if (!(env->cr[0] & CR0_PG_MASK)) {
+        /* paging is disabled */
+        return 0;
+    }
+
+    if (env->cr[4] & CR4_PAE_MASK) {
+#ifdef TARGET_X86_64
+        if (env->hflags & HF_LMA_MASK) {
+            target_phys_addr_t pml4e_addr;
+
+            pml4e_addr = (env->cr[3] & ~0xfff) & env->a20_mask;
+            walk_pml4e(list, pml4e_addr, env->a20_mask);
+        } else
+#endif
+        {
+            target_phys_addr_t pdpe_addr;
+
+            pdpe_addr = (env->cr[3] & ~0x1f) & env->a20_mask;
+            walk_pdpe2(list, pdpe_addr, env->a20_mask);
+        }
+    } else {
+        target_phys_addr_t pde_addr;
+        bool pse;
+
+        pde_addr = (env->cr[3] & ~0xfff) & env->a20_mask;
+        pse = !!(env->cr[4] & CR4_PSE_MASK);
+        walk_pde2(list, pde_addr, env->a20_mask, pse);
+    }
+
+    return 0;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 04/12 v11] Add API to check whether paging mode is enabled
  2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (2 preceding siblings ...)
  2012-03-26 10:02 ` [Qemu-devel] [PATCH 03/12 v11] implement cpu_get_memory_mapping() Wen Congyang
@ 2012-03-26 10:02 ` Wen Congyang
  2012-03-26 10:03 ` [Qemu-devel] [PATCH 05/12 v11] Add API to get memory mapping Wen Congyang
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Wen Congyang @ 2012-03-26 10:02 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

This API will be used in the following patch.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 cpu-all.h                         |    6 ++++++
 target-i386/arch_memory_mapping.c |    7 ++++++-
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 390b55b..24c7f77 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -528,12 +528,18 @@ int cpu_memory_rw_debug(CPUArchState *env, target_ulong addr,
 
 #if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
 int cpu_get_memory_mapping(MemoryMappingList *list, CPUArchState *env);
+bool cpu_paging_enabled(CPUArchState *env);
 #else
 static inline int cpu_get_memory_mapping(MemoryMappingList *list,
                                          CPUArchState *env)
 {
     return -1;
 }
+
+static inline bool cpu_paging_enabled(CPUArchState *env)
+{
+    return true;
+}
 #endif
 
 #endif /* CPU_ALL_H */
diff --git a/target-i386/arch_memory_mapping.c b/target-i386/arch_memory_mapping.c
index dd64bec..bd50e11 100644
--- a/target-i386/arch_memory_mapping.c
+++ b/target-i386/arch_memory_mapping.c
@@ -233,7 +233,7 @@ static void walk_pml4e(MemoryMappingList *list,
 
 int cpu_get_memory_mapping(MemoryMappingList *list, CPUArchState *env)
 {
-    if (!(env->cr[0] & CR0_PG_MASK)) {
+    if (!cpu_paging_enabled(env)) {
         /* paging is disabled */
         return 0;
     }
@@ -264,3 +264,8 @@ int cpu_get_memory_mapping(MemoryMappingList *list, CPUArchState *env)
 
     return 0;
 }
+
+bool cpu_paging_enabled(CPUArchState *env)
+{
+    return env->cr[0] & CR0_PG_MASK;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 05/12 v11] Add API to get memory mapping
  2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (3 preceding siblings ...)
  2012-03-26 10:02 ` [Qemu-devel] [PATCH 04/12 v11] Add API to check whether paging mode is enabled Wen Congyang
@ 2012-03-26 10:03 ` Wen Congyang
  2012-03-27  2:27   ` Wen Congyang
  2012-03-26 10:03 ` [Qemu-devel] [PATCH 06/12 v11] Add API to get memory mapping without do paging Wen Congyang
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 21+ messages in thread
From: Wen Congyang @ 2012-03-26 10:03 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

Add API to get all virtual address and physical address mapping.
If the guest doesn't use paging, the virtual address is equal to the phyical
address. The virtual address and physical address mapping is for gdb's user, and
it does not include the memory that is not referenced by the page table. So if
you want to use crash to anaylze the vmcore, please do not specify -p option.
the reason why the -p option is not default explicitly: guest machine in a
catastrophic state can have corrupted memory, which we cannot trust.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 memory_mapping.c |   34 ++++++++++++++++++++++++++++++++++
 memory_mapping.h |   15 +++++++++++++++
 2 files changed, 49 insertions(+), 0 deletions(-)

diff --git a/memory_mapping.c b/memory_mapping.c
index 718f271..b92e2f6 100644
--- a/memory_mapping.c
+++ b/memory_mapping.c
@@ -164,3 +164,37 @@ void memory_mapping_list_init(MemoryMappingList *list)
     list->last_mapping = NULL;
     QTAILQ_INIT(&list->head);
 }
+
+#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
+int qemu_get_guest_memory_mapping(MemoryMappingList *list)
+{
+    CPUArchState *env;
+    RAMBlock *block;
+    ram_addr_t offset, length;
+    int ret;
+    bool paging_mode;
+
+    paging_mode = cpu_paging_enabled(first_cpu);
+    if (paging_mode) {
+        for (env = first_cpu; env != NULL; env = env->next_cpu) {
+            ret = cpu_get_memory_mapping(list, env);
+            if (ret < 0) {
+                return -1;
+            }
+        }
+        return 0;
+    }
+
+    /*
+     * If the guest doesn't use paging, the virtual address is equal to physical
+     * address.
+     */
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        offset = block->offset;
+        length = block->length;
+        create_new_memory_mapping(list, offset, offset, length);
+    }
+
+    return 0;
+}
+#endif
diff --git a/memory_mapping.h b/memory_mapping.h
index 836b047..4d44641 100644
--- a/memory_mapping.h
+++ b/memory_mapping.h
@@ -44,4 +44,19 @@ void memory_mapping_list_free(MemoryMappingList *list);
 
 void memory_mapping_list_init(MemoryMappingList *list);
 
+/*
+ * Return value:
+ *    0: success
+ *   -1: failed
+ *   -2: unsupported
+ */
+#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
+int qemu_get_guest_memory_mapping(MemoryMappingList *list);
+#else
+static inline int qemu_get_guest_memory_mapping(MemoryMappingList *list)
+{
+    return -2;
+}
+#endif
+
 #endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 06/12 v11] Add API to get memory mapping without do paging
  2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (4 preceding siblings ...)
  2012-03-26 10:03 ` [Qemu-devel] [PATCH 05/12 v11] Add API to get memory mapping Wen Congyang
@ 2012-03-26 10:03 ` Wen Congyang
  2012-03-26 10:04 ` [Qemu-devel] [PATCH 07/12 v11] target-i386: Add API to write elf notes to core file Wen Congyang
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Wen Congyang @ 2012-03-26 10:03 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

crash does not need the virtual address and physical address mapping, and the
mapping does not include the memory that is not referenced by the page table.
crash does not use the virtual address, so we can create the mapping for all
physical memory(virtual address is always 0). This patch provides a API to do
this thing, and it will be used in the following patch.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 memory_mapping.c |    9 +++++++++
 memory_mapping.h |    3 +++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/memory_mapping.c b/memory_mapping.c
index b92e2f6..9a2ffe6 100644
--- a/memory_mapping.c
+++ b/memory_mapping.c
@@ -198,3 +198,12 @@ int qemu_get_guest_memory_mapping(MemoryMappingList *list)
     return 0;
 }
 #endif
+
+void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list)
+{
+    RAMBlock *block;
+
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        create_new_memory_mapping(list, block->offset, 0, block->length);
+    }
+}
diff --git a/memory_mapping.h b/memory_mapping.h
index 4d44641..a583e44 100644
--- a/memory_mapping.h
+++ b/memory_mapping.h
@@ -59,4 +59,7 @@ static inline int qemu_get_guest_memory_mapping(MemoryMappingList *list)
 }
 #endif
 
+/* get guest's memory mapping without do paging(virtual address is 0). */
+void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list);
+
 #endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 07/12 v11] target-i386: Add API to write elf notes to core file
  2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (5 preceding siblings ...)
  2012-03-26 10:03 ` [Qemu-devel] [PATCH 06/12 v11] Add API to get memory mapping without do paging Wen Congyang
@ 2012-03-26 10:04 ` Wen Congyang
  2012-03-26 10:04 ` [Qemu-devel] [PATCH 08/12 v11] target-i386: Add API to write cpu status " Wen Congyang
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Wen Congyang @ 2012-03-26 10:04 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

The core file contains register's value. These APIs write registers to
core file, and them will be called in the following patch.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target         |    1 +
 configure               |    4 +
 cpu-all.h               |   23 +++++
 target-i386/arch_dump.c |  240 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 268 insertions(+), 0 deletions(-)
 create mode 100644 target-i386/arch_dump.c

diff --git a/Makefile.target b/Makefile.target
index e88bfcf..2e20ed2 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -222,6 +222,7 @@ obj-$(CONFIG_NO_KVM) += kvm-stub.o
 obj-$(CONFIG_VGA) += vga.o
 obj-y += memory.o savevm.o
 obj-y += memory_mapping.o
+obj-$(CONFIG_HAVE_CORE_DUMP) += arch_dump.o
 LIBS+=-lz
 
 obj-i386-$(CONFIG_KVM) += hyperv.o
diff --git a/configure b/configure
index 0e05b84..afb294c 100755
--- a/configure
+++ b/configure
@@ -3678,6 +3678,10 @@ if test "$target_softmmu" = "yes" ; then
   if test "$smartcard_nss" = "yes" ; then
     echo "subdir-$target: subdir-libcacard" >> $config_host_mak
   fi
+  case "$target_arch2" in
+    i386|x86_64)
+      echo "CONFIG_HAVE_CORE_DUMP=y" >> $config_target_mak
+  esac
 fi
 if test "$target_user_only" = "yes" ; then
   echo "CONFIG_USER_ONLY=y" >> $config_target_mak
diff --git a/cpu-all.h b/cpu-all.h
index 24c7f77..bc29b43 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -542,4 +542,27 @@ static inline bool cpu_paging_enabled(CPUArchState *env)
 }
 #endif
 
+typedef int (*write_core_dump_function)
+    (target_phys_addr_t offset, void *buf, size_t size, void *opaque);
+#if defined(CONFIG_HAVE_CORE_DUMP)
+int cpu_write_elf64_note(write_core_dump_function f, CPUArchState *env,
+                         int cpuid, target_phys_addr_t *offset, void *opaque);
+int cpu_write_elf32_note(write_core_dump_function f, CPUArchState *env,
+                         int cpuid, target_phys_addr_t *offset, void *opaque);
+#else
+static inline int cpu_write_elf64_note(write_core_dump_function f,
+                                       CPUArchState *env, int cpuid,
+                                       target_phys_addr_t *offset, void *opaque)
+{
+    return -1;
+}
+
+static inline int cpu_write_elf32_note(write_core_dump_function f,
+                                       CPUArchState *env, int cpuid,
+                                       target_phys_addr_t *offset, void *opaque)
+{
+    return -1;
+}
+#endif
+
 #endif /* CPU_ALL_H */
diff --git a/target-i386/arch_dump.c b/target-i386/arch_dump.c
new file mode 100644
index 0000000..285c6e1
--- /dev/null
+++ b/target-i386/arch_dump.c
@@ -0,0 +1,240 @@
+/*
+ * i386 memory mapping
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "cpu.h"
+#include "cpu-all.h"
+#include "elf.h"
+
+#ifdef TARGET_X86_64
+typedef struct {
+    target_ulong r15, r14, r13, r12, rbp, rbx, r11, r10;
+    target_ulong r9, r8, rax, rcx, rdx, rsi, rdi, orig_rax;
+    target_ulong rip, cs, eflags;
+    target_ulong rsp, ss;
+    target_ulong fs_base, gs_base;
+    target_ulong ds, es, fs, gs;
+} x86_64_user_regs_struct;
+
+typedef struct {
+    char pad1[32];
+    uint32_t pid;
+    char pad2[76];
+    x86_64_user_regs_struct regs;
+    char pad3[8];
+} x86_64_elf_prstatus;
+
+static int x86_64_write_elf64_note(write_core_dump_function f,
+                                   CPUArchState *env, int id,
+                                   target_phys_addr_t *offset, void *opaque)
+{
+    x86_64_user_regs_struct regs;
+    Elf64_Nhdr *note;
+    char *buf;
+    int descsz, note_size, name_size = 5;
+    const char *name = "CORE";
+    int ret;
+
+    regs.r15 = env->regs[15];
+    regs.r14 = env->regs[14];
+    regs.r13 = env->regs[13];
+    regs.r12 = env->regs[12];
+    regs.r11 = env->regs[11];
+    regs.r10 = env->regs[10];
+    regs.r9  = env->regs[9];
+    regs.r8  = env->regs[8];
+    regs.rbp = env->regs[R_EBP];
+    regs.rsp = env->regs[R_ESP];
+    regs.rdi = env->regs[R_EDI];
+    regs.rsi = env->regs[R_ESI];
+    regs.rdx = env->regs[R_EDX];
+    regs.rcx = env->regs[R_ECX];
+    regs.rbx = env->regs[R_EBX];
+    regs.rax = env->regs[R_EAX];
+    regs.rip = env->eip;
+    regs.eflags = env->eflags;
+
+    regs.orig_rax = 0; /* FIXME */
+    regs.cs = env->segs[R_CS].selector;
+    regs.ss = env->segs[R_SS].selector;
+    regs.fs_base = env->segs[R_FS].base;
+    regs.gs_base = env->segs[R_GS].base;
+    regs.ds = env->segs[R_DS].selector;
+    regs.es = env->segs[R_ES].selector;
+    regs.fs = env->segs[R_FS].selector;
+    regs.gs = env->segs[R_GS].selector;
+
+    descsz = sizeof(x86_64_elf_prstatus);
+    note_size = ((sizeof(Elf64_Nhdr) + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    note->n_namesz = cpu_to_le32(name_size);
+    note->n_descsz = cpu_to_le32(descsz);
+    note->n_type = cpu_to_le32(NT_PRSTATUS);
+    buf = (char *)note;
+    buf += ((sizeof(Elf64_Nhdr) + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf + 32, &id, 4); /* pr_pid */
+    buf += descsz - sizeof(x86_64_user_regs_struct)-sizeof(target_ulong);
+    memcpy(buf, &regs, sizeof(x86_64_user_regs_struct));
+
+    ret = f(*offset, note, note_size, opaque);
+    g_free(note);
+    if (ret < 0) {
+        return -1;
+    }
+
+    *offset += note_size;
+
+    return 0;
+}
+#endif
+
+typedef struct {
+    uint32_t ebx, ecx, edx, esi, edi, ebp, eax;
+    unsigned short ds, __ds, es, __es;
+    unsigned short fs, __fs, gs, __gs;
+    uint32_t orig_eax, eip;
+    unsigned short cs, __cs;
+    uint32_t eflags, esp;
+    unsigned short ss, __ss;
+} x86_user_regs_struct;
+
+typedef struct {
+    char pad1[24];
+    uint32_t pid;
+    char pad2[44];
+    x86_user_regs_struct regs;
+    char pad3[4];
+} x86_elf_prstatus;
+
+static void x86_fill_elf_prstatus(x86_elf_prstatus *prstatus, CPUArchState *env,
+                                  int id)
+{
+    memset(prstatus, 0, sizeof(x86_elf_prstatus));
+    prstatus->regs.ebp = env->regs[R_EBP] & 0xffffffff;
+    prstatus->regs.esp = env->regs[R_ESP] & 0xffffffff;
+    prstatus->regs.edi = env->regs[R_EDI] & 0xffffffff;
+    prstatus->regs.esi = env->regs[R_ESI] & 0xffffffff;
+    prstatus->regs.edx = env->regs[R_EDX] & 0xffffffff;
+    prstatus->regs.ecx = env->regs[R_ECX] & 0xffffffff;
+    prstatus->regs.ebx = env->regs[R_EBX] & 0xffffffff;
+    prstatus->regs.eax = env->regs[R_EAX] & 0xffffffff;
+    prstatus->regs.eip = env->eip & 0xffffffff;
+    prstatus->regs.eflags = env->eflags & 0xffffffff;
+
+    prstatus->regs.cs = env->segs[R_CS].selector;
+    prstatus->regs.ss = env->segs[R_SS].selector;
+    prstatus->regs.ds = env->segs[R_DS].selector;
+    prstatus->regs.es = env->segs[R_ES].selector;
+    prstatus->regs.fs = env->segs[R_FS].selector;
+    prstatus->regs.gs = env->segs[R_GS].selector;
+
+    prstatus->pid = id;
+}
+
+static int x86_write_elf64_note(write_core_dump_function f, CPUArchState *env,
+                                int id, target_phys_addr_t *offset,
+                                void *opaque)
+{
+    x86_elf_prstatus prstatus;
+    Elf64_Nhdr *note;
+    char *buf;
+    int descsz, note_size, name_size = 5;
+    const char *name = "CORE";
+    int ret;
+
+    x86_fill_elf_prstatus(&prstatus, env, id);
+    descsz = sizeof(x86_elf_prstatus);
+    note_size = ((sizeof(Elf64_Nhdr) + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    note->n_namesz = cpu_to_le32(name_size);
+    note->n_descsz = cpu_to_le32(descsz);
+    note->n_type = cpu_to_le32(NT_PRSTATUS);
+    buf = (char *)note;
+    buf += ((sizeof(Elf64_Nhdr) + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf, &prstatus, sizeof(prstatus));
+
+    ret = f(*offset, note, note_size, opaque);
+    g_free(note);
+    if (ret < 0) {
+        return -1;
+    }
+
+    *offset += note_size;
+
+    return 0;
+}
+
+int cpu_write_elf64_note(write_core_dump_function f, CPUArchState *env,
+                         int cpuid, target_phys_addr_t *offset, void *opaque)
+{
+    int ret;
+#ifdef TARGET_X86_64
+    bool lma = !!(first_cpu->hflags & HF_LMA_MASK);
+
+    if (lma) {
+        ret = x86_64_write_elf64_note(f, env, cpuid, offset, opaque);
+    } else {
+#endif
+        ret = x86_write_elf64_note(f, env, cpuid, offset, opaque);
+#ifdef TARGET_X86_64
+    }
+#endif
+
+    return ret;
+}
+
+int cpu_write_elf32_note(write_core_dump_function f, CPUArchState *env,
+                         int cpuid, target_phys_addr_t *offset, void *opaque)
+{
+    x86_elf_prstatus prstatus;
+    Elf32_Nhdr *note;
+    char *buf;
+    int descsz, note_size, name_size = 5;
+    const char *name = "CORE";
+    int ret;
+
+    x86_fill_elf_prstatus(&prstatus, env, cpuid);
+    descsz = sizeof(x86_elf_prstatus);
+    note_size = ((sizeof(Elf32_Nhdr) + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    note->n_namesz = cpu_to_le32(name_size);
+    note->n_descsz = cpu_to_le32(descsz);
+    note->n_type = cpu_to_le32(NT_PRSTATUS);
+    buf = (char *)note;
+    buf += ((sizeof(Elf32_Nhdr) + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf, &prstatus, sizeof(prstatus));
+
+    ret = f(*offset, note, note_size, opaque);
+    g_free(note);
+    if (ret < 0) {
+        return -1;
+    }
+
+    *offset += note_size;
+
+    return 0;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 08/12 v11] target-i386: Add API to write cpu status to core file
  2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (6 preceding siblings ...)
  2012-03-26 10:04 ` [Qemu-devel] [PATCH 07/12 v11] target-i386: Add API to write elf notes to core file Wen Congyang
@ 2012-03-26 10:04 ` Wen Congyang
  2012-03-26 10:05 ` [Qemu-devel] [PATCH 09/12 v11] target-i386: add API to get dump info Wen Congyang
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Wen Congyang @ 2012-03-26 10:04 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

The core file has register's value. But it does not include all registers value.
Store the cpu status into QEMU note, and the user can get more information
from vmcore. If you change QEMUCPUState, please count up QEMUCPUSTATE_VERSION.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 cpu-all.h               |   20 ++++++
 target-i386/arch_dump.c |  152 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 172 insertions(+), 0 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index bc29b43..cd9b013 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -549,6 +549,10 @@ int cpu_write_elf64_note(write_core_dump_function f, CPUArchState *env,
                          int cpuid, target_phys_addr_t *offset, void *opaque);
 int cpu_write_elf32_note(write_core_dump_function f, CPUArchState *env,
                          int cpuid, target_phys_addr_t *offset, void *opaque);
+int cpu_write_elf64_qemunote(write_core_dump_function f, CPUArchState *env,
+                             target_phys_addr_t *offset, void *opaque);
+int cpu_write_elf32_qemunote(write_core_dump_function f, CPUArchState *env,
+                             target_phys_addr_t *offset, void *opaque);
 #else
 static inline int cpu_write_elf64_note(write_core_dump_function f,
                                        CPUArchState *env, int cpuid,
@@ -563,6 +567,22 @@ static inline int cpu_write_elf32_note(write_core_dump_function f,
 {
     return -1;
 }
+
+static inline int cpu_write_elf64_qemunote(write_core_dump_function f,
+                                           CPUArchState *env,
+                                           target_phys_addr_t *offset,
+                                           void *opaque);
+{
+    return -1;
+}
+
+static inline int cpu_write_elf32_qemunote(write_core_dump_function f,
+                                           CPUArchState *env,
+                                           target_phys_addr_t *offset,
+                                           void *opaque)
+{
+    return -1;
+}
 #endif
 
 #endif /* CPU_ALL_H */
diff --git a/target-i386/arch_dump.c b/target-i386/arch_dump.c
index 285c6e1..0f61dae 100644
--- a/target-i386/arch_dump.c
+++ b/target-i386/arch_dump.c
@@ -238,3 +238,155 @@ int cpu_write_elf32_note(write_core_dump_function f, CPUArchState *env,
 
     return 0;
 }
+
+/*
+ * please count up QEMUCPUSTATE_VERSION if you have changed definition of
+ * QEMUCPUState, and modify the tools using this information accordingly.
+ */
+#define QEMUCPUSTATE_VERSION (1)
+
+struct QEMUCPUSegment {
+    uint32_t selector;
+    uint32_t limit;
+    uint32_t flags;
+    uint32_t pad;
+    uint64_t base;
+};
+
+typedef struct QEMUCPUSegment QEMUCPUSegment;
+
+struct QEMUCPUState {
+    uint32_t version;
+    uint32_t size;
+    uint64_t rax, rbx, rcx, rdx, rsi, rdi, rsp, rbp;
+    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
+    uint64_t rip, rflags;
+    QEMUCPUSegment cs, ds, es, fs, gs, ss;
+    QEMUCPUSegment ldt, tr, gdt, idt;
+    uint64_t cr[5];
+};
+
+typedef struct QEMUCPUState QEMUCPUState;
+
+static void copy_segment(QEMUCPUSegment *d, SegmentCache *s)
+{
+    d->pad = 0;
+    d->selector = s->selector;
+    d->limit = s->limit;
+    d->flags = s->flags;
+    d->base = s->base;
+}
+
+static void qemu_get_cpustate(QEMUCPUState *s, CPUArchState *env)
+{
+    memset(s, 0, sizeof(QEMUCPUState));
+
+    s->version = QEMUCPUSTATE_VERSION;
+    s->size = sizeof(QEMUCPUState);
+
+    s->rax = env->regs[R_EAX];
+    s->rbx = env->regs[R_EBX];
+    s->rcx = env->regs[R_ECX];
+    s->rdx = env->regs[R_EDX];
+    s->rsi = env->regs[R_ESI];
+    s->rdi = env->regs[R_EDI];
+    s->rsp = env->regs[R_ESP];
+    s->rbp = env->regs[R_EBP];
+#ifdef TARGET_X86_64
+    s->r8  = env->regs[8];
+    s->r9  = env->regs[9];
+    s->r10 = env->regs[10];
+    s->r11 = env->regs[11];
+    s->r12 = env->regs[12];
+    s->r13 = env->regs[13];
+    s->r14 = env->regs[14];
+    s->r15 = env->regs[15];
+#endif
+    s->rip = env->eip;
+    s->rflags = env->eflags;
+
+    copy_segment(&s->cs, &env->segs[R_CS]);
+    copy_segment(&s->ds, &env->segs[R_DS]);
+    copy_segment(&s->es, &env->segs[R_ES]);
+    copy_segment(&s->fs, &env->segs[R_FS]);
+    copy_segment(&s->gs, &env->segs[R_GS]);
+    copy_segment(&s->ss, &env->segs[R_SS]);
+    copy_segment(&s->ldt, &env->ldt);
+    copy_segment(&s->tr, &env->tr);
+    copy_segment(&s->gdt, &env->gdt);
+    copy_segment(&s->idt, &env->idt);
+
+    s->cr[0] = env->cr[0];
+    s->cr[1] = env->cr[1];
+    s->cr[2] = env->cr[2];
+    s->cr[3] = env->cr[3];
+    s->cr[4] = env->cr[4];
+}
+
+static inline int cpu_write_qemu_note(write_core_dump_function f,
+                                      CPUArchState *env,
+                                      target_phys_addr_t *offset,
+                                      void *opaque,
+                                      int type)
+{
+    QEMUCPUState state;
+    Elf64_Nhdr *note64;
+    Elf32_Nhdr *note32;
+    void *note;
+    char *buf;
+    int descsz, note_size, name_size = 5, note_head_size;
+    const char *name = "QEMU";
+    int ret;
+
+    qemu_get_cpustate(&state, env);
+
+    descsz = sizeof(state);
+    if (type == 0) {
+        note_head_size = sizeof(Elf32_Nhdr);
+    } else {
+        note_head_size = sizeof(Elf64_Nhdr);
+    }
+    note_size = ((note_head_size + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    if (type == 0) {
+        note32 = note;
+        note32->n_namesz = cpu_to_le32(name_size);
+        note32->n_descsz = cpu_to_le32(descsz);
+        note32->n_type = 0;
+    } else {
+        note64 = note;
+        note64->n_namesz = cpu_to_le32(name_size);
+        note64->n_descsz = cpu_to_le32(descsz);
+        note64->n_type = 0;
+    }
+    buf = note;
+    buf += ((note_head_size + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf, &state, sizeof(state));
+
+    ret = f(*offset, note, note_size, opaque);
+    g_free(note);
+    if (ret < 0) {
+        return -1;
+    }
+
+    *offset += note_size;
+
+    return 0;
+}
+
+int cpu_write_elf64_qemunote(write_core_dump_function f, CPUArchState *env,
+                             target_phys_addr_t *offset, void *opaque)
+{
+    return cpu_write_qemu_note(f, env, offset, opaque, 1);
+}
+
+int cpu_write_elf32_qemunote(write_core_dump_function f, CPUArchState *env,
+                             target_phys_addr_t *offset, void *opaque)
+{
+    return cpu_write_qemu_note(f, env, offset, opaque, 0);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 09/12 v11] target-i386: add API to get dump info
  2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (7 preceding siblings ...)
  2012-03-26 10:04 ` [Qemu-devel] [PATCH 08/12 v11] target-i386: Add API to write cpu status " Wen Congyang
@ 2012-03-26 10:05 ` Wen Congyang
  2012-03-26 10:05 ` [Qemu-devel] [PATCH 10/12 v11] make gdb_id() generally avialable and rename it to cpu_index() Wen Congyang
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Wen Congyang @ 2012-03-26 10:05 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

Dump info contains: endian, class and architecture. The next
patch will use these information to create vmcore. Note: on
x86 box, the  class is ELFCLASS64 if the memory is larger than 4G.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 cpu-all.h               |    7 +++++++
 dump.h                  |   23 +++++++++++++++++++++++
 target-i386/arch_dump.c |   34 ++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 0 deletions(-)
 create mode 100644 dump.h

diff --git a/cpu-all.h b/cpu-all.h
index cd9b013..d7c5a00 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -23,6 +23,7 @@
 #include "qemu-tls.h"
 #include "cpu-common.h"
 #include "memory_mapping.h"
+#include "dump.h"
 
 /* some important defines:
  *
@@ -553,6 +554,7 @@ int cpu_write_elf64_qemunote(write_core_dump_function f, CPUArchState *env,
                              target_phys_addr_t *offset, void *opaque);
 int cpu_write_elf32_qemunote(write_core_dump_function f, CPUArchState *env,
                              target_phys_addr_t *offset, void *opaque);
+int cpu_get_dump_info(ArchDumpInfo *info);
 #else
 static inline int cpu_write_elf64_note(write_core_dump_function f,
                                        CPUArchState *env, int cpuid,
@@ -583,6 +585,11 @@ static inline int cpu_write_elf32_qemunote(write_core_dump_function f,
 {
     return -1;
 }
+
+static inline int cpu_get_dump_info(ArchDumpInfo *info)
+{
+    return -1;
+}
 #endif
 
 #endif /* CPU_ALL_H */
diff --git a/dump.h b/dump.h
new file mode 100644
index 0000000..28340cf
--- /dev/null
+++ b/dump.h
@@ -0,0 +1,23 @@
+/*
+ * QEMU dump
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef DUMP_H
+#define DUMP_H
+
+typedef struct ArchDumpInfo {
+    int d_machine;  /* Architecture */
+    int d_endian;   /* ELFDATA2LSB or ELFDATA2MSB */
+    int d_class;    /* ELFCLASS32 or ELFCLASS64 */
+} ArchDumpInfo;
+
+#endif
diff --git a/target-i386/arch_dump.c b/target-i386/arch_dump.c
index 0f61dae..1a75dea 100644
--- a/target-i386/arch_dump.c
+++ b/target-i386/arch_dump.c
@@ -13,6 +13,7 @@
 
 #include "cpu.h"
 #include "cpu-all.h"
+#include "dump.h"
 #include "elf.h"
 
 #ifdef TARGET_X86_64
@@ -390,3 +391,36 @@ int cpu_write_elf32_qemunote(write_core_dump_function f, CPUArchState *env,
 {
     return cpu_write_qemu_note(f, env, offset, opaque, 0);
 }
+
+int cpu_get_dump_info(ArchDumpInfo *info)
+{
+    bool lma = false;
+    RAMBlock *block;
+
+#ifdef TARGET_X86_64
+    lma = !!(first_cpu->hflags & HF_LMA_MASK);
+#endif
+
+    if (lma) {
+        info->d_machine = EM_X86_64;
+    } else {
+        info->d_machine = EM_386;
+    }
+    info->d_endian = ELFDATA2LSB;
+
+    if (lma) {
+        info->d_class = ELFCLASS64;
+    } else {
+        info->d_class = ELFCLASS32;
+
+        QLIST_FOREACH(block, &ram_list.blocks, next) {
+            if (block->offset + block->length > UINT_MAX) {
+                /* The memory size is greater than 4G */
+                info->d_class = ELFCLASS64;
+                break;
+            }
+        }
+    }
+
+    return 0;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 10/12 v11] make gdb_id() generally avialable and rename it to cpu_index()
  2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (8 preceding siblings ...)
  2012-03-26 10:05 ` [Qemu-devel] [PATCH 09/12 v11] target-i386: add API to get dump info Wen Congyang
@ 2012-03-26 10:05 ` Wen Congyang
  2012-03-27  3:38   ` HATAYAMA Daisuke
  2012-03-26 10:06 ` [Qemu-devel] [PATCH 11/12 v11] QError: Introduce new error for the dump-guest-memory command Wen Congyang
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 21+ messages in thread
From: Wen Congyang @ 2012-03-26 10:05 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

The following patch also needs this API, so make it generally avialable.
The function gdb_id() will not be used in gdbstub.c now, so its name is
not suitable, and rename it to cpu_index()

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 gdbstub.c |   19 +++++--------------
 gdbstub.h |    9 +++++++++
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index 6a7e2c4..423ffec 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -1938,21 +1938,12 @@ static void gdb_set_cpu_pc(GDBState *s, target_ulong pc)
 #endif
 }
 
-static inline int gdb_id(CPUArchState *env)
-{
-#if defined(CONFIG_USER_ONLY) && defined(CONFIG_USE_NPTL)
-    return env->host_tid;
-#else
-    return env->cpu_index + 1;
-#endif
-}
-
 static CPUArchState *find_cpu(uint32_t thread_id)
 {
     CPUArchState *env;
 
     for (env = first_cpu; env != NULL; env = env->next_cpu) {
-        if (gdb_id(env) == thread_id) {
+        if (cpu_index(env) == thread_id) {
             return env;
         }
     }
@@ -1980,7 +1971,7 @@ static int gdb_handle_packet(GDBState *s, const char *line_buf)
     case '?':
         /* TODO: Make this return the correct value for user-mode.  */
         snprintf(buf, sizeof(buf), "T%02xthread:%02x;", GDB_SIGNAL_TRAP,
-                 gdb_id(s->c_cpu));
+                 cpu_index(s->c_cpu));
         put_packet(s, buf);
         /* Remove all the breakpoints when this query is issued,
          * because gdb is doing and initial connect and the state
@@ -2275,7 +2266,7 @@ static int gdb_handle_packet(GDBState *s, const char *line_buf)
         } else if (strcmp(p,"sThreadInfo") == 0) {
         report_cpuinfo:
             if (s->query_cpu) {
-                snprintf(buf, sizeof(buf), "m%x", gdb_id(s->query_cpu));
+                snprintf(buf, sizeof(buf), "m%x", cpu_index(s->query_cpu));
                 put_packet(s, buf);
                 s->query_cpu = s->query_cpu->next_cpu;
             } else
@@ -2423,7 +2414,7 @@ static void gdb_vm_state_change(void *opaque, int running, RunState state)
             }
             snprintf(buf, sizeof(buf),
                      "T%02xthread:%02x;%swatch:" TARGET_FMT_lx ";",
-                     GDB_SIGNAL_TRAP, gdb_id(env), type,
+                     GDB_SIGNAL_TRAP, cpu_index(env), type,
                      env->watchpoint_hit->vaddr);
             env->watchpoint_hit = NULL;
             goto send_packet;
@@ -2456,7 +2447,7 @@ static void gdb_vm_state_change(void *opaque, int running, RunState state)
         ret = GDB_SIGNAL_UNKNOWN;
         break;
     }
-    snprintf(buf, sizeof(buf), "T%02xthread:%02x;", ret, gdb_id(env));
+    snprintf(buf, sizeof(buf), "T%02xthread:%02x;", ret, cpu_index(env));
 
 send_packet:
     put_packet(s, buf);
diff --git a/gdbstub.h b/gdbstub.h
index b44e275..668de66 100644
--- a/gdbstub.h
+++ b/gdbstub.h
@@ -30,6 +30,15 @@ void gdb_register_coprocessor(CPUArchState *env,
                               gdb_reg_cb get_reg, gdb_reg_cb set_reg,
                               int num_regs, const char *xml, int g_pos);
 
+static inline int cpu_index(CPUArchState *env)
+{
+#if defined(CONFIG_USER_ONLY) && defined(CONFIG_USE_NPTL)
+    return env->host_tid;
+#else
+    return env->cpu_index + 1;
+#endif
+}
+
 #endif
 
 #ifdef CONFIG_USER_ONLY
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 11/12 v11] QError: Introduce new error for the dump-guest-memory command
  2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (9 preceding siblings ...)
  2012-03-26 10:05 ` [Qemu-devel] [PATCH 10/12 v11] make gdb_id() generally avialable and rename it to cpu_index() Wen Congyang
@ 2012-03-26 10:06 ` Wen Congyang
  2012-03-26 10:06 ` [Qemu-devel] [PATCH 12/12 v11] introduce a new monitor command 'dump-guest-memory' to dump guest's memory Wen Congyang
  2012-03-28  5:17 ` [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
  12 siblings, 0 replies; 21+ messages in thread
From: Wen Congyang @ 2012-03-26 10:06 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

The new error is QERR_PIPE_OR_SOCKET_FD, which is going to be
used by the QAPI dump-guest-memory command.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 qerror.c |    4 ++++
 qerror.h |    3 +++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/qerror.c b/qerror.c
index 41c729a..d5b3a8c 100644
--- a/qerror.c
+++ b/qerror.c
@@ -299,6 +299,10 @@ static const QErrorStringTable qerror_table[] = {
         .error_fmt = QERR_VNC_SERVER_FAILED,
         .desc      = "Could not start VNC server on %(target)",
     },
+    {
+        .error_fmt = QERR_PIPE_OR_SOCKET_FD,
+        .desc      = "lseek() failed: the fd is associated with a pipe, socket",
+    },
     {}
 };
 
diff --git a/qerror.h b/qerror.h
index e16f9c2..6b5e8f7 100644
--- a/qerror.h
+++ b/qerror.h
@@ -244,4 +244,7 @@ QError *qobject_to_qerror(const QObject *obj);
 #define QERR_VNC_SERVER_FAILED \
     "{ 'class': 'VNCServerFailed', 'data': { 'target': %s } }"
 
+#define QERR_PIPE_OR_SOCKET_FD \
+    "{ 'class': 'PipeOrSocketFD', 'data': {} }"
+
 #endif /* QERROR_H */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 12/12 v11] introduce a new monitor command 'dump-guest-memory' to dump guest's memory
  2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (10 preceding siblings ...)
  2012-03-26 10:06 ` [Qemu-devel] [PATCH 11/12 v11] QError: Introduce new error for the dump-guest-memory command Wen Congyang
@ 2012-03-26 10:06 ` Wen Congyang
  2012-04-02  2:54   ` Wen Congyang
  2012-04-02  3:16   ` [Qemu-devel] [PATCH 12/12 v11.5] " Wen Congyang
  2012-03-28  5:17 ` [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
  12 siblings, 2 replies; 21+ messages in thread
From: Wen Congyang @ 2012-03-26 10:06 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

The command's usage:
   dump [-p] protocol [begin] [length]
The supported protocol can be file or fd:
1. file: the protocol starts with "file:", and the following string is
   the file's path.
2. fd: the protocol starts with "fd:", and the following string is the
   fd's name.

Note:
  1. If you want to use gdb to process the core, please specify -p option.
     The reason why the -p option is not default is:
       a. guest machine in a catastrophic state can have corrupted memory,
          which we cannot trust.
       b. The guest machine can be in read-mode even if paging is enabled.
          For example: the guest machine uses ACPI to sleep, and ACPI sleep
          state goes in real-mode.
  2. This command doesn't support the fd that is is associated with a pipe,
     socket, or FIFO(lseek will fail with such fd).
  3. If you don't want to dump all guest's memory, please specify the start
     physical address and the length.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target  |    2 +-
 dump.c           |  827 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 elf.h            |    5 +
 hmp-commands.hx  |   28 ++
 hmp.c            |   22 ++
 hmp.h            |    1 +
 memory_mapping.c |   27 ++
 memory_mapping.h |    3 +
 qapi-schema.json |   34 +++
 qmp-commands.hx  |   38 +++
 10 files changed, 986 insertions(+), 1 deletions(-)
 create mode 100644 dump.c

diff --git a/Makefile.target b/Makefile.target
index 2e20ed2..43004bc 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -222,7 +222,7 @@ obj-$(CONFIG_NO_KVM) += kvm-stub.o
 obj-$(CONFIG_VGA) += vga.o
 obj-y += memory.o savevm.o
 obj-y += memory_mapping.o
-obj-$(CONFIG_HAVE_CORE_DUMP) += arch_dump.o
+obj-$(CONFIG_HAVE_CORE_DUMP) += arch_dump.o dump.o
 LIBS+=-lz
 
 obj-i386-$(CONFIG_KVM) += hyperv.o
diff --git a/dump.c b/dump.c
new file mode 100644
index 0000000..d1b3933
--- /dev/null
+++ b/dump.c
@@ -0,0 +1,827 @@
+/*
+ * QEMU dump
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include <unistd.h>
+#include "elf.h"
+#include <sys/procfs.h>
+#include <glib.h>
+#include "cpu.h"
+#include "cpu-all.h"
+#include "targphys.h"
+#include "monitor.h"
+#include "kvm.h"
+#include "dump.h"
+#include "sysemu.h"
+#include "bswap.h"
+#include "memory_mapping.h"
+#include "error.h"
+#include "qmp-commands.h"
+#include "gdbstub.h"
+
+static uint16_t cpu_convert_to_target16(uint16_t val, int endian)
+{
+    if (endian == ELFDATA2LSB) {
+        val = cpu_to_le16(val);
+    } else {
+        val = cpu_to_be16(val);
+    }
+
+    return val;
+}
+
+static uint32_t cpu_convert_to_target32(uint32_t val, int endian)
+{
+    if (endian == ELFDATA2LSB) {
+        val = cpu_to_le32(val);
+    } else {
+        val = cpu_to_be32(val);
+    }
+
+    return val;
+}
+
+static uint64_t cpu_convert_to_target64(uint64_t val, int endian)
+{
+    if (endian == ELFDATA2LSB) {
+        val = cpu_to_le64(val);
+    } else {
+        val = cpu_to_be64(val);
+    }
+
+    return val;
+}
+
+typedef struct DumpState {
+    ArchDumpInfo dump_info;
+    MemoryMappingList list;
+    uint16_t phdr_num;
+    uint32_t sh_info;
+    bool have_section;
+    bool resume;
+    target_phys_addr_t memory_offset;
+    int fd;
+
+    RAMBlock *block;
+    ram_addr_t start;
+    bool has_filter;
+    int64_t begin;
+    int64_t length;
+    Error **errp;
+} DumpState;
+
+static int dump_cleanup(DumpState *s)
+{
+    int ret = 0;
+
+    memory_mapping_list_free(&s->list);
+    if (s->fd != -1) {
+        close(s->fd);
+    }
+    if (s->resume) {
+        vm_start();
+    }
+
+    return ret;
+}
+
+static void dump_error(DumpState *s, const char *reason)
+{
+    dump_cleanup(s);
+}
+
+static int fd_write_vmcore(target_phys_addr_t offset, void *buf, size_t size,
+                           void *opaque)
+{
+    DumpState *s = opaque;
+    int fd = s->fd;
+    off_t ret;
+    size_t writen_size;
+
+    while (1) {
+        ret = lseek(fd, offset, SEEK_SET);
+        if (ret < 0) {
+            if (errno == ESPIPE) {
+                error_set(s->errp, QERR_PIPE_OR_SOCKET_FD);
+                return -1;
+            }
+
+            if (errno != EINTR && errno != EAGAIN) {
+                return -1;
+            }
+            continue;
+        }
+        break;
+    }
+
+    /* The fd may be passed from user, and it can be non-blocked */
+    while (size) {
+        writen_size = qemu_write_full(fd, buf, size);
+        if (writen_size != size && errno != EAGAIN) {
+            return -1;
+        }
+
+        buf += writen_size;
+        size -= writen_size;
+    }
+
+    return 0;
+}
+
+static int write_elf64_header(DumpState *s)
+{
+    Elf64_Ehdr elf_header;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&elf_header, 0, sizeof(Elf64_Ehdr));
+    memcpy(&elf_header, ELFMAG, SELFMAG);
+    elf_header.e_ident[EI_CLASS] = ELFCLASS64;
+    elf_header.e_ident[EI_DATA] = s->dump_info.d_endian;
+    elf_header.e_ident[EI_VERSION] = EV_CURRENT;
+    elf_header.e_type = cpu_convert_to_target16(ET_CORE, endian);
+    elf_header.e_machine = cpu_convert_to_target16(s->dump_info.d_machine,
+                                                   endian);
+    elf_header.e_version = cpu_convert_to_target32(EV_CURRENT, endian);
+    elf_header.e_ehsize = cpu_convert_to_target16(sizeof(elf_header), endian);
+    elf_header.e_phoff = cpu_convert_to_target64(sizeof(Elf64_Ehdr), endian);
+    elf_header.e_phentsize = cpu_convert_to_target16(sizeof(Elf64_Phdr),
+                                                     endian);
+    elf_header.e_phnum = cpu_convert_to_target16(s->phdr_num, endian);
+    if (s->have_section) {
+        uint64_t shoff = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr) * s->sh_info;
+
+        elf_header.e_shoff = cpu_convert_to_target64(shoff, endian);
+        elf_header.e_shentsize = cpu_convert_to_target16(sizeof(Elf64_Shdr),
+                                                         endian);
+        elf_header.e_shnum = cpu_convert_to_target16(1, endian);
+    }
+
+    ret = fd_write_vmcore(0, &elf_header, sizeof(elf_header), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write elf header.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_header(DumpState *s)
+{
+    Elf32_Ehdr elf_header;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&elf_header, 0, sizeof(Elf32_Ehdr));
+    memcpy(&elf_header, ELFMAG, SELFMAG);
+    elf_header.e_ident[EI_CLASS] = ELFCLASS32;
+    elf_header.e_ident[EI_DATA] = endian;
+    elf_header.e_ident[EI_VERSION] = EV_CURRENT;
+    elf_header.e_type = cpu_convert_to_target16(ET_CORE, endian);
+    elf_header.e_machine = cpu_convert_to_target16(s->dump_info.d_machine,
+                                                   endian);
+    elf_header.e_version = cpu_convert_to_target32(EV_CURRENT, endian);
+    elf_header.e_ehsize = cpu_convert_to_target16(sizeof(elf_header), endian);
+    elf_header.e_phoff = cpu_convert_to_target32(sizeof(Elf32_Ehdr), endian);
+    elf_header.e_phentsize = cpu_convert_to_target16(sizeof(Elf32_Phdr),
+                                                     endian);
+    elf_header.e_phnum = cpu_convert_to_target16(s->phdr_num, endian);
+    if (s->have_section) {
+        uint32_t shoff = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr) * s->sh_info;
+
+        elf_header.e_shoff = cpu_convert_to_target32(shoff, endian);
+        elf_header.e_shentsize = cpu_convert_to_target16(sizeof(Elf32_Shdr),
+                                                         endian);
+        elf_header.e_shnum = cpu_convert_to_target16(1, endian);
+    }
+
+    ret = fd_write_vmcore(0, &elf_header, sizeof(elf_header), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write elf header.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf64_load(DumpState *s, MemoryMapping *memory_mapping,
+                            int phdr_index, target_phys_addr_t offset)
+{
+    Elf64_Phdr phdr;
+    off_t phdr_offset;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&phdr, 0, sizeof(Elf64_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_LOAD, endian);
+    phdr.p_offset = cpu_convert_to_target64(offset, endian);
+    phdr.p_paddr = cpu_convert_to_target64(memory_mapping->phys_addr, endian);
+    if (offset == -1) {
+        /* When the memory is not stored into vmcore, offset will be -1 */
+        phdr.p_filesz = 0;
+    } else {
+        phdr.p_filesz = cpu_convert_to_target64(memory_mapping->length, endian);
+    }
+    phdr.p_memsz = cpu_convert_to_target64(memory_mapping->length, endian);
+    phdr.p_vaddr = cpu_convert_to_target64(memory_mapping->virt_addr, endian);
+
+    phdr_offset = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr)*phdr_index;
+    ret = fd_write_vmcore(phdr_offset, &phdr, sizeof(Elf64_Phdr), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_load(DumpState *s, MemoryMapping *memory_mapping,
+                            int phdr_index, target_phys_addr_t offset)
+{
+    Elf32_Phdr phdr;
+    off_t phdr_offset;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&phdr, 0, sizeof(Elf32_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_LOAD, endian);
+    phdr.p_offset = cpu_convert_to_target32(offset, endian);
+    phdr.p_paddr = cpu_convert_to_target32(memory_mapping->phys_addr, endian);
+    if (offset == -1) {
+        /* When the memory is not stored into vmcore, offset will be -1 */
+        phdr.p_filesz = 0;
+    } else {
+        phdr.p_filesz = cpu_convert_to_target32(memory_mapping->length, endian);
+    }
+    phdr.p_memsz = cpu_convert_to_target32(memory_mapping->length, endian);
+    phdr.p_vaddr = cpu_convert_to_target32(memory_mapping->virt_addr, endian);
+
+    phdr_offset = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr)*phdr_index;
+    ret = fd_write_vmcore(phdr_offset, &phdr, sizeof(Elf32_Phdr), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf64_notes(DumpState *s, int phdr_index,
+                             target_phys_addr_t *offset)
+{
+    CPUArchState *env;
+    int ret;
+    target_phys_addr_t begin = *offset;
+    Elf64_Phdr phdr;
+    off_t phdr_offset;
+    int id;
+    int endian = s->dump_info.d_endian;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        id = cpu_index(env);
+        ret = cpu_write_elf64_note(fd_write_vmcore, env, id, offset, s);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write elf notes.\n");
+            return -1;
+        }
+    }
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        ret = cpu_write_elf64_qemunote(fd_write_vmcore, env, offset, s);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write CPU status.\n");
+            return -1;
+        }
+    }
+
+    memset(&phdr, 0, sizeof(Elf64_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_NOTE, endian);
+    phdr.p_offset = cpu_convert_to_target64(begin, endian);
+    phdr.p_paddr = 0;
+    phdr.p_filesz = cpu_convert_to_target64(*offset - begin, endian);
+    phdr.p_memsz = cpu_convert_to_target64(*offset - begin, endian);
+    phdr.p_vaddr = 0;
+
+    phdr_offset = sizeof(Elf64_Ehdr);
+    ret = fd_write_vmcore(phdr_offset, &phdr, sizeof(Elf64_Phdr), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_notes(DumpState *s, int phdr_index,
+                             target_phys_addr_t *offset)
+{
+    CPUArchState *env;
+    int ret;
+    target_phys_addr_t begin = *offset;
+    Elf32_Phdr phdr;
+    off_t phdr_offset;
+    int id;
+    int endian = s->dump_info.d_endian;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        id = cpu_index(env);
+        ret = cpu_write_elf32_note(fd_write_vmcore, env, id, offset, s);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write elf notes.\n");
+            return -1;
+        }
+    }
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        ret = cpu_write_elf32_qemunote(fd_write_vmcore, env, offset, s);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write CPU status.\n");
+            return -1;
+        }
+    }
+
+    memset(&phdr, 0, sizeof(Elf32_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_NOTE, endian);
+    phdr.p_offset = cpu_convert_to_target32(begin, endian);
+    phdr.p_paddr = 0;
+    phdr.p_filesz = cpu_convert_to_target32(*offset - begin, endian);
+    phdr.p_memsz = cpu_convert_to_target32(*offset - begin, endian);
+    phdr.p_vaddr = 0;
+
+    phdr_offset = sizeof(Elf32_Ehdr);
+    ret = fd_write_vmcore(phdr_offset, &phdr, sizeof(Elf32_Phdr), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf_section(DumpState *s, target_phys_addr_t *offset, int type)
+{
+    Elf32_Shdr shdr32;
+    Elf64_Shdr shdr64;
+    int endian = s->dump_info.d_endian;
+    int shdr_size;
+    void *shdr;
+    int ret;
+
+    if (type == 0) {
+        shdr_size = sizeof(Elf32_Shdr);
+        memset(&shdr32, 0, shdr_size);
+        shdr32.sh_info = cpu_convert_to_target32(s->sh_info, endian);
+        shdr = &shdr32;
+    } else {
+        shdr_size = sizeof(Elf64_Shdr);
+        memset(&shdr64, 0, shdr_size);
+        shdr64.sh_info = cpu_convert_to_target32(s->sh_info, endian);
+        shdr = &shdr64;
+    }
+
+    ret = fd_write_vmcore(*offset, &shdr, shdr_size, s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write section header table.\n");
+        return -1;
+    }
+
+    *offset += shdr_size;
+    return 0;
+}
+
+static int write_data(DumpState *s, void *buf, int length,
+                      target_phys_addr_t *offset)
+{
+    int ret;
+
+    ret = fd_write_vmcore(*offset, buf, length, s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to save memory.\n");
+        return -1;
+    }
+
+    *offset += length;
+    return 0;
+}
+
+/* write the memroy to vmcore. 1 page per I/O. */
+static int write_memory(DumpState *s, RAMBlock *block, ram_addr_t start,
+                        target_phys_addr_t *offset, int64_t size)
+{
+    int i, ret;
+
+    for (i = 0; i < size / TARGET_PAGE_SIZE; i++) {
+        ret = write_data(s, block->host + start + i * TARGET_PAGE_SIZE,
+                         TARGET_PAGE_SIZE, offset);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    if ((size % TARGET_PAGE_SIZE) != 0) {
+        ret = write_data(s, block->host + start + i * TARGET_PAGE_SIZE,
+                         size % TARGET_PAGE_SIZE, offset);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
+/* get the memory's offset in the vmcore */
+static target_phys_addr_t get_offset(target_phys_addr_t phys_addr,
+                                     DumpState *s)
+{
+    RAMBlock *block;
+    target_phys_addr_t offset = s->memory_offset;
+    int64_t size_in_block, start;
+
+    if (s->has_filter) {
+        if (phys_addr < s->begin || phys_addr >= s->begin + s->length) {
+            return -1;
+        }
+    }
+
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        if (s->has_filter) {
+            if (block->offset >= s->begin + s->length ||
+                block->offset + block->length <= s->begin) {
+                /* This block is out of the range */
+                continue;
+            }
+
+            if (s->begin <= block->offset) {
+                start = block->offset;
+            } else {
+                start = s->begin;
+            }
+
+            size_in_block = block->length - (start - block->offset);
+            if (s->begin + s->length < block->offset + block->length) {
+                size_in_block -= block->offset + block->length -
+                                 (s->begin + s->length);
+            }
+        } else {
+            start = block->offset;
+            size_in_block = block->length;
+        }
+
+        if (phys_addr >= start && phys_addr < start + size_in_block) {
+            return phys_addr - start + offset;
+        }
+
+        offset += size_in_block;
+    }
+
+    return -1;
+}
+
+/* write elf header, PT_NOTE and elf note to vmcore. */
+static int dump_begin(DumpState *s)
+{
+    target_phys_addr_t offset;
+    int ret;
+
+    /*
+     * the vmcore's format is:
+     *   --------------
+     *   |  elf header |
+     *   --------------
+     *   |  PT_NOTE    |
+     *   --------------
+     *   |  PT_LOAD    |
+     *   --------------
+     *   |  ......     |
+     *   --------------
+     *   |  PT_LOAD    |
+     *   --------------
+     *   |  sec_hdr    |
+     *   --------------
+     *   |  elf note   |
+     *   --------------
+     *   |  memory     |
+     *   --------------
+     *
+     * we only know where the memory is saved after we write elf note into
+     * vmcore.
+     */
+
+    /* write elf header to vmcore */
+    if (s->dump_info.d_class == ELFCLASS64) {
+        ret = write_elf64_header(s);
+    } else {
+        ret = write_elf32_header(s);
+    }
+    if (ret < 0) {
+        return -1;
+    }
+
+    /* write elf section and notes to vmcore */
+    if (s->dump_info.d_class == ELFCLASS64) {
+        if (s->have_section) {
+            offset = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr)*s->sh_info;
+            if (write_elf_section(s, &offset, 1) < 0) {
+                return -1;
+            }
+        } else {
+            offset = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr)*s->phdr_num;
+        }
+        ret = write_elf64_notes(s, 0, &offset);
+    } else {
+        if (s->have_section) {
+            offset = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr)*s->sh_info;
+            if (write_elf_section(s, &offset, 0) < 0) {
+                return -1;
+            }
+        } else {
+            offset = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr)*s->phdr_num;
+        }
+        ret = write_elf32_notes(s, 0, &offset);
+    }
+
+    if (ret < 0) {
+        return -1;
+    }
+
+    s->memory_offset = offset;
+    return 0;
+}
+
+/* write PT_LOAD to vmcore */
+static int dump_completed(DumpState *s)
+{
+    target_phys_addr_t offset;
+    MemoryMapping *memory_mapping;
+    int phdr_index = 1, ret;
+
+    QTAILQ_FOREACH(memory_mapping, &s->list.head, next) {
+        offset = get_offset(memory_mapping->phys_addr, s);
+        if (s->dump_info.d_class == ELFCLASS64) {
+            ret = write_elf64_load(s, memory_mapping, phdr_index++, offset);
+        } else {
+            ret = write_elf32_load(s, memory_mapping, phdr_index++, offset);
+        }
+        if (ret < 0) {
+            return -1;
+        }
+    }
+
+    dump_cleanup(s);
+    return 0;
+}
+
+static int get_next_block(DumpState *s, RAMBlock *block)
+{
+    while (1) {
+        block = QLIST_NEXT(block, next);
+        if (!block) {
+            /* no more block */
+            return 1;
+        }
+
+        s->start = 0;
+        s->block = block;
+        if (s->has_filter) {
+            if (block->offset >= s->begin + s->length ||
+                block->offset + block->length <= s->begin) {
+                /* This block is out of the range */
+                continue;
+            }
+
+            if (s->begin > block->offset) {
+                s->start = s->begin - block->offset;
+            }
+        }
+
+        return 0;
+    }
+}
+
+/* write all memory to vmcore */
+static int dump_iterate(DumpState *s)
+{
+    RAMBlock *block;
+    target_phys_addr_t offset = s->memory_offset;
+    int64_t size;
+    int ret;
+
+    while (1) {
+        block = s->block;
+
+        size = block->length;
+        if (s->has_filter) {
+            size -= s->start;
+            if (s->begin + s->length < block->offset + block->length) {
+                size -= block->offset + block->length - (s->begin + s->length);
+            }
+        }
+        ret = write_memory(s, block, s->start, &offset, size);
+        if (ret == -1) {
+            return ret;
+        }
+
+        ret = get_next_block(s, block);
+        if (ret == 1) {
+            dump_completed(s);
+            return 0;
+        }
+    }
+}
+
+static int create_vmcore(DumpState *s)
+{
+    int ret;
+
+    ret = dump_begin(s);
+    if (ret < 0) {
+        return -1;
+    }
+
+    ret = dump_iterate(s);
+    if (ret < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
+static ram_addr_t get_start_block(DumpState *s)
+{
+    RAMBlock *block;
+
+    if (!s->has_filter) {
+        s->block = QLIST_FIRST(&ram_list.blocks);
+        return 0;
+    }
+
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        if (block->offset >= s->begin + s->length ||
+            block->offset + block->length <= s->begin) {
+            /* This block is out of the range */
+            continue;
+        }
+
+        s->block = block;
+        if (s->begin > block->offset) {
+            s->start = s->begin - block->offset;
+        } else {
+            s->start = 0;
+        }
+        return s->start;
+    }
+
+    return -1;
+}
+
+static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
+                     int64_t begin, int64_t length, Error **errp)
+{
+    CPUArchState *env;
+    int ret;
+
+    if (runstate_is_running()) {
+        vm_stop(RUN_STATE_SAVE_VM);
+        s->resume = true;
+    } else {
+        s->resume = false;
+    }
+
+    s->errp = errp;
+    s->fd = fd;
+    s->has_filter = has_filter;
+    s->begin = begin;
+    s->length = length;
+    s->start = get_start_block(s);
+    if (s->start == -1) {
+        error_set(errp, QERR_INVALID_PARAMETER, "begin");
+        goto cleanup;
+    }
+
+    /*
+     * get dump info: endian, class and architecture.
+     * If the target architecture is not supported, cpu_get_dump_info() will
+     * return -1.
+     *
+     * if we use kvm, we should synchronize the register before we get dump
+     * info.
+     */
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        cpu_synchronize_state(env);
+    }
+
+    ret = cpu_get_dump_info(&s->dump_info);
+    if (ret < 0) {
+        error_set(errp, QERR_UNSUPPORTED);
+        goto cleanup;
+    }
+
+    /* get memory mapping */
+    memory_mapping_list_init(&s->list);
+    if (paging) {
+        qemu_get_guest_memory_mapping(&s->list);
+    } else {
+        qemu_get_guest_simple_memory_mapping(&s->list);
+    }
+
+    if (s->has_filter) {
+        memory_mapping_filter(&s->list, s->begin, s->length);
+    }
+
+    /*
+     * calculate phdr_num
+     *
+     * the type of ehdr->e_phnum is uint16_t, so we should avoid overflow
+     */
+    s->phdr_num = 1; /* PT_NOTE */
+    if (s->list.num < UINT16_MAX - 2) {
+        s->phdr_num += s->list.num;
+        s->have_section = false;
+    } else {
+        s->have_section = true;
+        s->phdr_num = PN_XNUM;
+        s->sh_info = 1; /* PT_NOTE */
+
+        /* the type of shdr->sh_info is uint32_t, so we should avoid overflow */
+        if (s->list.num <= UINT32_MAX - 1) {
+            s->sh_info += s->list.num;
+        } else {
+            s->sh_info = UINT32_MAX;
+        }
+    }
+
+    return 0;
+
+cleanup:
+    if (s->resume) {
+        vm_start();
+    }
+
+    return -1;
+}
+
+void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
+                           int64_t begin, bool has_length, int64_t length,
+                           Error **errp)
+{
+    const char *p;
+    int fd = -1;
+    DumpState *s;
+    int ret;
+
+    if (has_begin && !has_length) {
+        error_set(errp, QERR_MISSING_PARAMETER, "length");
+        return;
+    }
+    if (!has_begin && has_length) {
+        error_set(errp, QERR_MISSING_PARAMETER, "begin");
+        return;
+    }
+
+#if !defined(WIN32)
+    if (strstart(file, "fd:", &p)) {
+        fd = monitor_get_fd(cur_mon, p);
+        if (fd == -1) {
+            error_set(errp, QERR_FD_NOT_FOUND, p);
+            return;
+        }
+    }
+#endif
+
+    if  (strstart(file, "file:", &p)) {
+        fd = qemu_open(p, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR);
+        if (fd < 0) {
+            error_set(errp, QERR_OPEN_FILE_FAILED, p);
+            return;
+        }
+    }
+
+    if (fd == -1) {
+        error_set(errp, QERR_INVALID_PARAMETER, "protocol");
+        return;
+    }
+
+    s = g_malloc(sizeof(DumpState));
+
+    ret = dump_init(s, fd, paging, has_begin, begin, length, errp);
+    if (ret < 0) {
+        g_free(s);
+        return;
+    }
+
+    if (create_vmcore(s) < 0 && !error_is_set(s->errp)) {
+        error_set(errp, QERR_IO_ERROR);
+    }
+
+    g_free(s);
+}
diff --git a/elf.h b/elf.h
index 36bcac4..7560eac 100644
--- a/elf.h
+++ b/elf.h
@@ -1016,6 +1016,11 @@ typedef struct elf64_sym {
 
 #define EI_NIDENT	16
 
+/* Special value for e_phnum.  This indicates that the real number of
+   program headers is too large to fit into e_phnum.  Instead the real
+   value is in the field sh_info of section 0.  */
+#define PN_XNUM         0xffff
+
 typedef struct elf32_hdr{
   unsigned char	e_ident[EI_NIDENT];
   Elf32_Half	e_type;
diff --git a/hmp-commands.hx b/hmp-commands.hx
index bd35a3e..d270fd1 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -879,6 +879,34 @@ server will ask the spice/vnc client to automatically reconnect using the
 new parameters (if specified) once the vm migration finished successfully.
 ETEXI
 
+#if defined(CONFIG_HAVE_CORE_DUMP)
+    {
+        .name       = "dump-guest-memory",
+        .args_type  = "paging:-p,protocol:s,begin:i?,length:i?",
+        .params     = "[-p] protocol [begin] [length]",
+        .help       = "dump guest memory to file"
+                      "\n\t\t\t begin(optional): the starting physical address"
+                      "\n\t\t\t length(optional): the memory size, in bytes",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd = hmp_dump_guest_memory,
+    },
+
+
+STEXI
+@item dump-guest-memory [-p] @var{protocol} @var{begin} @var{length}
+@findex dump-guest-memory
+Dump guest memory to @var{protocol}. The file can be processed with crash or
+gdb.
+  protocol: destination file(started with "file:") or destination file
+            descriptor (started with "fd:")
+    paging: do paging to get guest's memory mapping
+     begin: the starting physical address. It's optional, and should be
+            specified with length together.
+    length: the memory size, in bytes. It's optional, and should be specified
+            with begin together.
+ETEXI
+#endif
+
     {
         .name       = "snapshot_blkdev",
         .args_type  = "reuse:-n,device:B,snapshot-file:s?,format:s?",
diff --git a/hmp.c b/hmp.c
index 9cf2d13..75ec9a9 100644
--- a/hmp.c
+++ b/hmp.c
@@ -934,3 +934,25 @@ void hmp_migrate(Monitor *mon, const QDict *qdict)
         qemu_mod_timer(status->timer, qemu_get_clock_ms(rt_clock));
     }
 }
+
+void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict)
+{
+    Error *errp = NULL;
+    int paging = qdict_get_try_bool(qdict, "paging", 0);
+    const char *file = qdict_get_str(qdict, "protocol");
+    bool has_begin = qdict_haskey(qdict, "begin");
+    bool has_length = qdict_haskey(qdict, "length");
+    int64_t begin = 0;
+    int64_t length = 0;
+
+    if (has_begin) {
+        begin = qdict_get_int(qdict, "begin");
+    }
+    if (has_length) {
+        length = qdict_get_int(qdict, "length");
+    }
+
+    qmp_dump_guest_memory(paging, file, has_begin, begin, has_length, length,
+                          &errp);
+    hmp_handle_error(mon, &errp);
+}
diff --git a/hmp.h b/hmp.h
index 8807853..36d69c6 100644
--- a/hmp.h
+++ b/hmp.h
@@ -60,5 +60,6 @@ void hmp_block_stream(Monitor *mon, const QDict *qdict);
 void hmp_block_job_set_speed(Monitor *mon, const QDict *qdict);
 void hmp_block_job_cancel(Monitor *mon, const QDict *qdict);
 void hmp_migrate(Monitor *mon, const QDict *qdict);
+void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict);
 
 #endif
diff --git a/memory_mapping.c b/memory_mapping.c
index 9a2ffe6..ff52c9d 100644
--- a/memory_mapping.c
+++ b/memory_mapping.c
@@ -207,3 +207,30 @@ void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list)
         create_new_memory_mapping(list, block->offset, 0, block->length);
     }
 }
+
+void memory_mapping_filter(MemoryMappingList *list, int64_t begin,
+                           int64_t length)
+{
+    MemoryMapping *cur, *next;
+
+    QTAILQ_FOREACH_SAFE(cur, &list->head, next, next) {
+        if (cur->phys_addr >= begin + length ||
+            cur->phys_addr + cur->length <= begin) {
+            QTAILQ_REMOVE(&list->head, cur, next);
+            list->num--;
+            continue;
+        }
+
+        if (cur->phys_addr < begin) {
+            cur->length -= begin - cur->phys_addr;
+            if (cur->virt_addr) {
+                cur->virt_addr += begin - cur->phys_addr;
+            }
+            cur->phys_addr = begin;
+        }
+
+        if (cur->phys_addr + cur->length > begin + length) {
+            cur->length -= cur->phys_addr + cur->length - begin - length;
+        }
+    }
+}
diff --git a/memory_mapping.h b/memory_mapping.h
index a583e44..b795678 100644
--- a/memory_mapping.h
+++ b/memory_mapping.h
@@ -62,4 +62,7 @@ static inline int qemu_get_guest_memory_mapping(MemoryMappingList *list)
 /* get guest's memory mapping without do paging(virtual address is 0). */
 void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list);
 
+void memory_mapping_filter(MemoryMappingList *list, int64_t begin,
+                           int64_t length);
+
 #endif
diff --git a/qapi-schema.json b/qapi-schema.json
index 0d11d6e..c2f5866 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1701,3 +1701,37 @@
 # Since: 1.1
 ##
 { 'command': 'xen-save-devices-state', 'data': {'filename': 'str'} }
+
+##
+# @dump-guest-memory
+#
+# Dump guest's memory to vmcore.
+#
+# @paging: if true, do paging to get guest's memory mapping. The @paging's
+# default value of @paging is false, If you want to use gdb to process the
+# core, please set @paging to true. The reason why the @paging's value is
+# false:
+#   1. guest machine in a catastrophic state can have corrupted memory,
+#      which we cannot trust.
+#   2. The guest machine can be in read-mode even if paging is enabled.
+#      For example: the guest machine uses ACPI to sleep, and ACPI sleep
+#      state goes in real-mode
+# @protocol: the filename or file descriptor of the vmcore. The supported
+# protocol can be file or fd:
+#   1. file: the protocol starts with "file:", and the following string is
+#      the file's path.
+#   2. fd: the protocol starts with "fd:", and the following string is the
+#      fd's name. This command doesn't support the fd that is is associated
+#      with a pipe, socket, or FIFO(lseek will fail with such fd).
+# @begin: #optional if specified, the starting physical address.
+# @length: #optional if specified, the memory size, in bytes. If you don't
+# want to dump all guest's memory, please specify the start @begin and
+# @length
+#
+# Returns: nothing on success
+#
+# Since: 1.1
+##
+{ 'command': 'dump-guest-memory',
+  'data': { 'paging': 'bool', 'protocol': 'str', '*begin': 'int',
+            '*length': 'int' } }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index c626ba8..999c9b2 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -606,6 +606,44 @@ Example:
 
 EQMP
 
+#if defined(CONFIG_HAVE_CORE_DUMP)
+    {
+        .name       = "dump-guest-memory",
+        .args_type  = "paging:b,protocol:s,begin:i?,end:i?",
+        .params     = "[-p] protocol [begin] [length]",
+        .help       = "dump guest memory to file",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = qmp_marshal_input_dump_guest_memory,
+    },
+
+SQMP
+dump
+
+
+Dump guest memory to file. The file can be processed with crash or gdb.
+
+Arguments:
+
+- "paging": do paging to get guest's memory mapping (json-bool)
+- "protocol": destination file(started with "file:") or destination file
+              descriptor (started with "fd:") (json-string)
+- "begin": the starting physical address. It's optional, and should be specified
+           with length together (json-int)
+- "length": the memory size, in bytes. It's optional, and should be specified
+            with begin together (json-int)
+
+Example:
+
+-> { "execute": "dump-guest-memory", "arguments": { "protocol": "fd:dump" } }
+<- { "return": {} }
+
+Notes:
+
+(1) All boolean arguments default to false
+
+EQMP
+#endif
+
     {
         .name       = "netdev_add",
         .args_type  = "netdev:O",
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 05/12 v11] Add API to get memory mapping
  2012-03-26 10:03 ` [Qemu-devel] [PATCH 05/12 v11] Add API to get memory mapping Wen Congyang
@ 2012-03-27  2:27   ` Wen Congyang
  0 siblings, 0 replies; 21+ messages in thread
From: Wen Congyang @ 2012-03-27  2:27 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

Add API to get all virtual address and physical address mapping.
If the guest doesn't use paging, the virtual address is equal to the phyical
address. The virtual address and physical address mapping is for gdb's user, and
it does not include the memory that is not referenced by the page table. So if
you want to use crash to anaylze the vmcore, please do not specify -p option.
the reason why the -p option is not default explicitly: guest machine in a
catastrophic state can have corrupted memory, which we cannot trust.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
address hatayama's comment

 memory_mapping.c |   47 +++++++++++++++++++++++++++++++++++++++++++++++
 memory_mapping.h |   15 +++++++++++++++
 2 files changed, 62 insertions(+), 0 deletions(-)

diff --git a/memory_mapping.c b/memory_mapping.c
index 718f271..627397a 100644
--- a/memory_mapping.c
+++ b/memory_mapping.c
@@ -164,3 +164,50 @@ void memory_mapping_list_init(MemoryMappingList *list)
     list->last_mapping = NULL;
     QTAILQ_INIT(&list->head);
 }
+
+#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
+
+static CPUArchState *find_paging_enabled_cpu(CPUArchState *start_cpu)
+{
+    CPUArchState *env;
+
+    for (env = start_cpu; env != NULL; env = env->next_cpu) {
+        if (cpu_paging_enabled(env)) {
+            return env;
+        }
+    }
+
+    return NULL;
+}
+
+int qemu_get_guest_memory_mapping(MemoryMappingList *list)
+{
+    CPUArchState *env, *first_paging_enabled_cpu;
+    RAMBlock *block;
+    ram_addr_t offset, length;
+    int ret;
+
+    first_paging_enabled_cpu = find_paging_enabled_cpu(first_cpu);
+    if (first_paging_enabled_cpu) {
+        for (env = first_paging_enabled_cpu; env != NULL; env = env->next_cpu) {
+            ret = cpu_get_memory_mapping(list, env);
+            if (ret < 0) {
+                return -1;
+            }
+        }
+        return 0;
+    }
+
+    /*
+     * If the guest doesn't use paging, the virtual address is equal to physical
+     * address.
+     */
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        offset = block->offset;
+        length = block->length;
+        create_new_memory_mapping(list, offset, offset, length);
+    }
+
+    return 0;
+}
+#endif
diff --git a/memory_mapping.h b/memory_mapping.h
index 836b047..4d44641 100644
--- a/memory_mapping.h
+++ b/memory_mapping.h
@@ -44,4 +44,19 @@ void memory_mapping_list_free(MemoryMappingList *list);
 
 void memory_mapping_list_init(MemoryMappingList *list);
 
+/*
+ * Return value:
+ *    0: success
+ *   -1: failed
+ *   -2: unsupported
+ */
+#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
+int qemu_get_guest_memory_mapping(MemoryMappingList *list);
+#else
+static inline int qemu_get_guest_memory_mapping(MemoryMappingList *list)
+{
+    return -2;
+}
+#endif
+
 #endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 10/12 v11] make gdb_id() generally avialable and rename it to cpu_index()
  2012-03-26 10:05 ` [Qemu-devel] [PATCH 10/12 v11] make gdb_id() generally avialable and rename it to cpu_index() Wen Congyang
@ 2012-03-27  3:38   ` HATAYAMA Daisuke
  0 siblings, 0 replies; 21+ messages in thread
From: HATAYAMA Daisuke @ 2012-03-27  3:38 UTC (permalink / raw)
  To: wency; +Cc: aliguori, jan.kiszka, qemu-devel, lcapitulino, anderson, eblake

From: Wen Congyang <wency@cn.fujitsu.com>
Subject: [PATCH 10/12 v11] make gdb_id() generally avialable and rename it to cpu_index()
Date: Mon, 26 Mar 2012 18:05:39 +0800

<cut>

>  
> -static inline int gdb_id(CPUArchState *env)
> -{
> -#if defined(CONFIG_USER_ONLY) && defined(CONFIG_USE_NPTL)
> -    return env->host_tid;
> -#else
> -    return env->cpu_index + 1;
> -#endif
> -}
> -

<cut>


I meant in gdbstub.c:

static inline int gdb_id(CPUArchState *env)
{
	return cpu_index(env);
}

Thanks.
HATAYAMA, Daisuke

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism
  2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (11 preceding siblings ...)
  2012-03-26 10:06 ` [Qemu-devel] [PATCH 12/12 v11] introduce a new monitor command 'dump-guest-memory' to dump guest's memory Wen Congyang
@ 2012-03-28  5:17 ` Wen Congyang
  2012-03-28 12:44   ` Luiz Capitulino
  12 siblings, 1 reply; 21+ messages in thread
From: Wen Congyang @ 2012-03-28  5:17 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

Hi, Luiz, Anthony, Jan

do you have any comments about this patchset?

Thanks
Wen Congyang

At 03/26/2012 05:58 PM, Wen Congyang Wrote:
> Hi, all
> 
> 'virsh dump' can not work when host pci device is used by guest. We have
> discussed this issue here:
> http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html
> 
> The last version is here:
> http://lists.nongnu.org/archive/html/qemu-devel/2012-03/msg03866.html
> 
> We have determined to introduce a new command dump-guest-memory to dump
> guest's memory. The core file's format is elf32 or elf64.
> 
> Note:
> 1. The guest should be x86 or x86_64. The other arch is not supported now.
> 2. If you use old gdb, gdb may crash. I use gdb-7.3.1, and it does not crash.
> 3. If the OS is in the second kernel, gdb may not work well, and crash can
>    work by specifying '--machdep phys_addr=xxx' in the command line. The
>    reason is that the second kernel will update the page table, and we can
>    not get the page table for the first kernel.
> 4. The cpu's state is stored in QEMU note. You neet to modify crash to use
>    it to calculate phys_base.
> 5. If the guest OS is 32 bit and the memory size is larger than 4G, the vmcore
>    is elf64 format. You should use the gdb which is built with --enable-64-bit-bfd.
> 6. This patchset is based on the upstream tree, and apply one patch that is still
>    in Luiz Capitulino's tree, because I use the API qemu_get_fd() in this patchset.
> 
> Changes from v10 to v11:
> 1. addressed Luiz's and Hatayam's comment
> 2. fix a bug about filtering feature
> 
> Changes from v9 to v10:
> 1. fix some bug
> 2. addressed Luiz's and Hatayam's comment
> 3. remove cancel and query command
> 
> Changes from v8 to v9:
> 1. remove async support(it will be reimplemented after QAPI async commands support
>    is finished)
> 2. fix some typo error
> 
> Changes from v7 to v8:
> 1. addressed Hatayama's comments
> 
> Changes from v6 to v7:
> 1. addressed Jan's comments
> 2. fix some bugs
> 3. store cpu's state into the vmcore
> 
> Changes from v5 to v6:
> 1. allow user to dump a fraction of the memory
> 2. fix some bugs
> 
> Changes from v4 to v5:
> 1. convert the new command dump to QAPI 
> 
> Changes from v3 to v4:
> 1. support it to run asynchronously
> 2. add API to cancel dumping and query dumping progress
> 3. add API to control dumping speed
> 4. auto cancel dumping when the user resumes vm, and the status is failed.
> 
> Changes from v2 to v3:
> 1. address Jan Kiszka's comment
> 
> Changes from v1 to v2:
> 1. fix virt addr in the vmcore.
> 
> Wen Congyang (12):
>   Add API to create memory mapping list
>   Add API to check whether a physical address is I/O address
>   implement cpu_get_memory_mapping()
>   Add API to check whether paging mode is enabled
>   Add API to get memory mapping
>   Add API to get memory mapping without do paging
>   target-i386: Add API to write elf notes to core file
>   target-i386: Add API to write cpu status to core file
>   target-i386: add API to get dump info
>   make gdb_id() generally avialable and rename it to cpu_index()
>   QError: Introduce new error for the dump-guest-memory command
>   introduce a new monitor command 'dump-guest-memory' to dump guest's
>     memory
> 
>  Makefile.target                   |    3 +
>  configure                         |    8 +
>  cpu-all.h                         |   67 +++
>  cpu-common.h                      |    2 +
>  dump.c                            |  827 +++++++++++++++++++++++++++++++++++++
>  dump.h                            |   23 +
>  elf.h                             |    5 +
>  exec.c                            |    9 +
>  gdbstub.c                         |   19 +-
>  gdbstub.h                         |    9 +
>  hmp-commands.hx                   |   28 ++
>  hmp.c                             |   22 +
>  hmp.h                             |    1 +
>  memory_mapping.c                  |  236 +++++++++++
>  memory_mapping.h                  |   68 +++
>  qapi-schema.json                  |   34 ++
>  qerror.c                          |    4 +
>  qerror.h                          |    3 +
>  qmp-commands.hx                   |   38 ++
>  target-i386/arch_dump.c           |  426 +++++++++++++++++++
>  target-i386/arch_memory_mapping.c |  271 ++++++++++++
>  21 files changed, 2089 insertions(+), 14 deletions(-)
>  create mode 100644 dump.c
>  create mode 100644 dump.h
>  create mode 100644 memory_mapping.c
>  create mode 100644 memory_mapping.h
>  create mode 100644 target-i386/arch_dump.c
>  create mode 100644 target-i386/arch_memory_mapping.c
> 
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism
  2012-03-28  5:17 ` [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
@ 2012-03-28 12:44   ` Luiz Capitulino
  2012-04-02  3:19     ` Wen Congyang
  0 siblings, 1 reply; 21+ messages in thread
From: Luiz Capitulino @ 2012-03-28 12:44 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Anthony Liguori, Jan Kiszka, qemu-devel, HATAYAMA Daisuke,
	Dave Anderson, Eric Blake

On Wed, 28 Mar 2012 13:17:43 +0800
Wen Congyang <wency@cn.fujitsu.com> wrote:

> Hi, Luiz, Anthony, Jan
> 
> do you have any comments about this patchset?

As far as QMP is concerned:

 Acked-by: Luiz Capitulino <lcapitulino@redhat.com>

This can go through me tree, but I'd like an ACK from Jan and/or Anthony.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 12/12 v11] introduce a new monitor command 'dump-guest-memory' to dump guest's memory
  2012-03-26 10:06 ` [Qemu-devel] [PATCH 12/12 v11] introduce a new monitor command 'dump-guest-memory' to dump guest's memory Wen Congyang
@ 2012-04-02  2:54   ` Wen Congyang
  2012-04-02  3:16   ` [Qemu-devel] [PATCH 12/12 v11.5] " Wen Congyang
  1 sibling, 0 replies; 21+ messages in thread
From: Wen Congyang @ 2012-04-02  2:54 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

At 03/26/2012 06:06 PM, Wen Congyang Wrote:
> The command's usage:
>    dump [-p] protocol [begin] [length]
> The supported protocol can be file or fd:
> 1. file: the protocol starts with "file:", and the following string is
>    the file's path.
> 2. fd: the protocol starts with "fd:", and the following string is the
>    fd's name.
> 
> Note:
>   1. If you want to use gdb to process the core, please specify -p option.
>      The reason why the -p option is not default is:
>        a. guest machine in a catastrophic state can have corrupted memory,
>           which we cannot trust.
>        b. The guest machine can be in read-mode even if paging is enabled.
>           For example: the guest machine uses ACPI to sleep, and ACPI sleep
>           state goes in real-mode.
>   2. This command doesn't support the fd that is is associated with a pipe,
>      socket, or FIFO(lseek will fail with such fd).
>   3. If you don't want to dump all guest's memory, please specify the start
>      physical address and the length.
> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
>  Makefile.target  |    2 +-
>  dump.c           |  827 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  elf.h            |    5 +
>  hmp-commands.hx  |   28 ++
>  hmp.c            |   22 ++
>  hmp.h            |    1 +
>  memory_mapping.c |   27 ++
>  memory_mapping.h |    3 +
>  qapi-schema.json |   34 +++
>  qmp-commands.hx  |   38 +++
>  10 files changed, 986 insertions(+), 1 deletions(-)
>  create mode 100644 dump.c

<cut>

> +/* write the memroy to vmcore. 1 page per I/O. */
> +static int write_memory(DumpState *s, RAMBlock *block, ram_addr_t start,
> +                        target_phys_addr_t *offset, int64_t size)
> +{
> +    int i, ret;

The type of i should be int64_t. Otherwise,  i * TARGET_PAGE_SIZE
may be overflow.

I will resend this patch.

Thanks
Wen Congyang

> +
> +    for (i = 0; i < size / TARGET_PAGE_SIZE; i++) {
> +        ret = write_data(s, block->host + start + i * TARGET_PAGE_SIZE,
> +                         TARGET_PAGE_SIZE, offset);
> +        if (ret < 0) {
> +            return ret;
> +        }
> +    }
> +
> +    if ((size % TARGET_PAGE_SIZE) != 0) {
> +        ret = write_data(s, block->host + start + i * TARGET_PAGE_SIZE,
> +                         size % TARGET_PAGE_SIZE, offset);
> +        if (ret < 0) {
> +            return ret;
> +        }
> +    }
> +
> +    return 0;
> +}
> +

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 12/12 v11.5] introduce a new monitor command 'dump-guest-memory' to dump guest's memory
  2012-03-26 10:06 ` [Qemu-devel] [PATCH 12/12 v11] introduce a new monitor command 'dump-guest-memory' to dump guest's memory Wen Congyang
  2012-04-02  2:54   ` Wen Congyang
@ 2012-04-02  3:16   ` Wen Congyang
  1 sibling, 0 replies; 21+ messages in thread
From: Wen Congyang @ 2012-04-02  3:16 UTC (permalink / raw)
  To: qemu-devel, Jan Kiszka, Dave Anderson, HATAYAMA Daisuke,
	Luiz Capitulino, Eric Blake, Anthony Liguori

The command's usage:
   dump [-p] protocol [begin] [length]
The supported protocol can be file or fd:
1. file: the protocol starts with "file:", and the following string is
   the file's path.
2. fd: the protocol starts with "fd:", and the following string is the
   fd's name.

Note:
  1. If you want to use gdb to process the core, please specify -p option.
     The reason why the -p option is not default is:
       a. guest machine in a catastrophic state can have corrupted memory,
          which we cannot trust.
       b. The guest machine can be in read-mode even if paging is enabled.
          For example: the guest machine uses ACPI to sleep, and ACPI sleep
          state goes in real-mode.
  2. This command doesn't support the fd that is is associated with a pipe,
     socket, or FIFO(lseek will fail with such fd).
  3. If you don't want to dump all guest's memory, please specify the start
     physical address and the length.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target  |    2 +-
 dump.c           |  828 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 elf.h            |    5 +
 hmp-commands.hx  |   28 ++
 hmp.c            |   22 ++
 hmp.h            |    1 +
 memory_mapping.c |   27 ++
 memory_mapping.h |    3 +
 qapi-schema.json |   34 +++
 qmp-commands.hx  |   38 +++
 10 files changed, 987 insertions(+), 1 deletions(-)
 create mode 100644 dump.c

diff --git a/Makefile.target b/Makefile.target
index fa4e698..de61c70 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -224,7 +224,7 @@ obj-$(CONFIG_NO_KVM) += kvm-stub.o
 obj-$(CONFIG_VGA) += vga.o
 obj-y += memory.o savevm.o
 obj-y += memory_mapping.o
-obj-$(CONFIG_HAVE_CORE_DUMP) += arch_dump.o
+obj-$(CONFIG_HAVE_CORE_DUMP) += arch_dump.o dump.o
 LIBS+=-lz
 
 obj-i386-$(CONFIG_KVM) += hyperv.o
diff --git a/dump.c b/dump.c
new file mode 100644
index 0000000..bd22ab2
--- /dev/null
+++ b/dump.c
@@ -0,0 +1,828 @@
+/*
+ * QEMU dump
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include <unistd.h>
+#include "elf.h"
+#include <sys/procfs.h>
+#include <glib.h>
+#include "cpu.h"
+#include "cpu-all.h"
+#include "targphys.h"
+#include "monitor.h"
+#include "kvm.h"
+#include "dump.h"
+#include "sysemu.h"
+#include "bswap.h"
+#include "memory_mapping.h"
+#include "error.h"
+#include "qmp-commands.h"
+#include "gdbstub.h"
+
+static uint16_t cpu_convert_to_target16(uint16_t val, int endian)
+{
+    if (endian == ELFDATA2LSB) {
+        val = cpu_to_le16(val);
+    } else {
+        val = cpu_to_be16(val);
+    }
+
+    return val;
+}
+
+static uint32_t cpu_convert_to_target32(uint32_t val, int endian)
+{
+    if (endian == ELFDATA2LSB) {
+        val = cpu_to_le32(val);
+    } else {
+        val = cpu_to_be32(val);
+    }
+
+    return val;
+}
+
+static uint64_t cpu_convert_to_target64(uint64_t val, int endian)
+{
+    if (endian == ELFDATA2LSB) {
+        val = cpu_to_le64(val);
+    } else {
+        val = cpu_to_be64(val);
+    }
+
+    return val;
+}
+
+typedef struct DumpState {
+    ArchDumpInfo dump_info;
+    MemoryMappingList list;
+    uint16_t phdr_num;
+    uint32_t sh_info;
+    bool have_section;
+    bool resume;
+    target_phys_addr_t memory_offset;
+    int fd;
+
+    RAMBlock *block;
+    ram_addr_t start;
+    bool has_filter;
+    int64_t begin;
+    int64_t length;
+    Error **errp;
+} DumpState;
+
+static int dump_cleanup(DumpState *s)
+{
+    int ret = 0;
+
+    memory_mapping_list_free(&s->list);
+    if (s->fd != -1) {
+        close(s->fd);
+    }
+    if (s->resume) {
+        vm_start();
+    }
+
+    return ret;
+}
+
+static void dump_error(DumpState *s, const char *reason)
+{
+    dump_cleanup(s);
+}
+
+static int fd_write_vmcore(target_phys_addr_t offset, void *buf, size_t size,
+                           void *opaque)
+{
+    DumpState *s = opaque;
+    int fd = s->fd;
+    off_t ret;
+    size_t writen_size;
+
+    while (1) {
+        ret = lseek(fd, offset, SEEK_SET);
+        if (ret < 0) {
+            if (errno == ESPIPE) {
+                error_set(s->errp, QERR_PIPE_OR_SOCKET_FD);
+                return -1;
+            }
+
+            if (errno != EINTR && errno != EAGAIN) {
+                return -1;
+            }
+            continue;
+        }
+        break;
+    }
+
+    /* The fd may be passed from user, and it can be non-blocked */
+    while (size) {
+        writen_size = qemu_write_full(fd, buf, size);
+        if (writen_size != size && errno != EAGAIN) {
+            return -1;
+        }
+
+        buf += writen_size;
+        size -= writen_size;
+    }
+
+    return 0;
+}
+
+static int write_elf64_header(DumpState *s)
+{
+    Elf64_Ehdr elf_header;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&elf_header, 0, sizeof(Elf64_Ehdr));
+    memcpy(&elf_header, ELFMAG, SELFMAG);
+    elf_header.e_ident[EI_CLASS] = ELFCLASS64;
+    elf_header.e_ident[EI_DATA] = s->dump_info.d_endian;
+    elf_header.e_ident[EI_VERSION] = EV_CURRENT;
+    elf_header.e_type = cpu_convert_to_target16(ET_CORE, endian);
+    elf_header.e_machine = cpu_convert_to_target16(s->dump_info.d_machine,
+                                                   endian);
+    elf_header.e_version = cpu_convert_to_target32(EV_CURRENT, endian);
+    elf_header.e_ehsize = cpu_convert_to_target16(sizeof(elf_header), endian);
+    elf_header.e_phoff = cpu_convert_to_target64(sizeof(Elf64_Ehdr), endian);
+    elf_header.e_phentsize = cpu_convert_to_target16(sizeof(Elf64_Phdr),
+                                                     endian);
+    elf_header.e_phnum = cpu_convert_to_target16(s->phdr_num, endian);
+    if (s->have_section) {
+        uint64_t shoff = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr) * s->sh_info;
+
+        elf_header.e_shoff = cpu_convert_to_target64(shoff, endian);
+        elf_header.e_shentsize = cpu_convert_to_target16(sizeof(Elf64_Shdr),
+                                                         endian);
+        elf_header.e_shnum = cpu_convert_to_target16(1, endian);
+    }
+
+    ret = fd_write_vmcore(0, &elf_header, sizeof(elf_header), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write elf header.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_header(DumpState *s)
+{
+    Elf32_Ehdr elf_header;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&elf_header, 0, sizeof(Elf32_Ehdr));
+    memcpy(&elf_header, ELFMAG, SELFMAG);
+    elf_header.e_ident[EI_CLASS] = ELFCLASS32;
+    elf_header.e_ident[EI_DATA] = endian;
+    elf_header.e_ident[EI_VERSION] = EV_CURRENT;
+    elf_header.e_type = cpu_convert_to_target16(ET_CORE, endian);
+    elf_header.e_machine = cpu_convert_to_target16(s->dump_info.d_machine,
+                                                   endian);
+    elf_header.e_version = cpu_convert_to_target32(EV_CURRENT, endian);
+    elf_header.e_ehsize = cpu_convert_to_target16(sizeof(elf_header), endian);
+    elf_header.e_phoff = cpu_convert_to_target32(sizeof(Elf32_Ehdr), endian);
+    elf_header.e_phentsize = cpu_convert_to_target16(sizeof(Elf32_Phdr),
+                                                     endian);
+    elf_header.e_phnum = cpu_convert_to_target16(s->phdr_num, endian);
+    if (s->have_section) {
+        uint32_t shoff = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr) * s->sh_info;
+
+        elf_header.e_shoff = cpu_convert_to_target32(shoff, endian);
+        elf_header.e_shentsize = cpu_convert_to_target16(sizeof(Elf32_Shdr),
+                                                         endian);
+        elf_header.e_shnum = cpu_convert_to_target16(1, endian);
+    }
+
+    ret = fd_write_vmcore(0, &elf_header, sizeof(elf_header), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write elf header.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf64_load(DumpState *s, MemoryMapping *memory_mapping,
+                            int phdr_index, target_phys_addr_t offset)
+{
+    Elf64_Phdr phdr;
+    off_t phdr_offset;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&phdr, 0, sizeof(Elf64_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_LOAD, endian);
+    phdr.p_offset = cpu_convert_to_target64(offset, endian);
+    phdr.p_paddr = cpu_convert_to_target64(memory_mapping->phys_addr, endian);
+    if (offset == -1) {
+        /* When the memory is not stored into vmcore, offset will be -1 */
+        phdr.p_filesz = 0;
+    } else {
+        phdr.p_filesz = cpu_convert_to_target64(memory_mapping->length, endian);
+    }
+    phdr.p_memsz = cpu_convert_to_target64(memory_mapping->length, endian);
+    phdr.p_vaddr = cpu_convert_to_target64(memory_mapping->virt_addr, endian);
+
+    phdr_offset = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr)*phdr_index;
+    ret = fd_write_vmcore(phdr_offset, &phdr, sizeof(Elf64_Phdr), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_load(DumpState *s, MemoryMapping *memory_mapping,
+                            int phdr_index, target_phys_addr_t offset)
+{
+    Elf32_Phdr phdr;
+    off_t phdr_offset;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&phdr, 0, sizeof(Elf32_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_LOAD, endian);
+    phdr.p_offset = cpu_convert_to_target32(offset, endian);
+    phdr.p_paddr = cpu_convert_to_target32(memory_mapping->phys_addr, endian);
+    if (offset == -1) {
+        /* When the memory is not stored into vmcore, offset will be -1 */
+        phdr.p_filesz = 0;
+    } else {
+        phdr.p_filesz = cpu_convert_to_target32(memory_mapping->length, endian);
+    }
+    phdr.p_memsz = cpu_convert_to_target32(memory_mapping->length, endian);
+    phdr.p_vaddr = cpu_convert_to_target32(memory_mapping->virt_addr, endian);
+
+    phdr_offset = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr)*phdr_index;
+    ret = fd_write_vmcore(phdr_offset, &phdr, sizeof(Elf32_Phdr), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf64_notes(DumpState *s, int phdr_index,
+                             target_phys_addr_t *offset)
+{
+    CPUArchState *env;
+    int ret;
+    target_phys_addr_t begin = *offset;
+    Elf64_Phdr phdr;
+    off_t phdr_offset;
+    int id;
+    int endian = s->dump_info.d_endian;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        id = cpu_index(env);
+        ret = cpu_write_elf64_note(fd_write_vmcore, env, id, offset, s);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write elf notes.\n");
+            return -1;
+        }
+    }
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        ret = cpu_write_elf64_qemunote(fd_write_vmcore, env, offset, s);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write CPU status.\n");
+            return -1;
+        }
+    }
+
+    memset(&phdr, 0, sizeof(Elf64_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_NOTE, endian);
+    phdr.p_offset = cpu_convert_to_target64(begin, endian);
+    phdr.p_paddr = 0;
+    phdr.p_filesz = cpu_convert_to_target64(*offset - begin, endian);
+    phdr.p_memsz = cpu_convert_to_target64(*offset - begin, endian);
+    phdr.p_vaddr = 0;
+
+    phdr_offset = sizeof(Elf64_Ehdr);
+    ret = fd_write_vmcore(phdr_offset, &phdr, sizeof(Elf64_Phdr), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_notes(DumpState *s, int phdr_index,
+                             target_phys_addr_t *offset)
+{
+    CPUArchState *env;
+    int ret;
+    target_phys_addr_t begin = *offset;
+    Elf32_Phdr phdr;
+    off_t phdr_offset;
+    int id;
+    int endian = s->dump_info.d_endian;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        id = cpu_index(env);
+        ret = cpu_write_elf32_note(fd_write_vmcore, env, id, offset, s);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write elf notes.\n");
+            return -1;
+        }
+    }
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        ret = cpu_write_elf32_qemunote(fd_write_vmcore, env, offset, s);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write CPU status.\n");
+            return -1;
+        }
+    }
+
+    memset(&phdr, 0, sizeof(Elf32_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_NOTE, endian);
+    phdr.p_offset = cpu_convert_to_target32(begin, endian);
+    phdr.p_paddr = 0;
+    phdr.p_filesz = cpu_convert_to_target32(*offset - begin, endian);
+    phdr.p_memsz = cpu_convert_to_target32(*offset - begin, endian);
+    phdr.p_vaddr = 0;
+
+    phdr_offset = sizeof(Elf32_Ehdr);
+    ret = fd_write_vmcore(phdr_offset, &phdr, sizeof(Elf32_Phdr), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf_section(DumpState *s, target_phys_addr_t *offset, int type)
+{
+    Elf32_Shdr shdr32;
+    Elf64_Shdr shdr64;
+    int endian = s->dump_info.d_endian;
+    int shdr_size;
+    void *shdr;
+    int ret;
+
+    if (type == 0) {
+        shdr_size = sizeof(Elf32_Shdr);
+        memset(&shdr32, 0, shdr_size);
+        shdr32.sh_info = cpu_convert_to_target32(s->sh_info, endian);
+        shdr = &shdr32;
+    } else {
+        shdr_size = sizeof(Elf64_Shdr);
+        memset(&shdr64, 0, shdr_size);
+        shdr64.sh_info = cpu_convert_to_target32(s->sh_info, endian);
+        shdr = &shdr64;
+    }
+
+    ret = fd_write_vmcore(*offset, &shdr, shdr_size, s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write section header table.\n");
+        return -1;
+    }
+
+    *offset += shdr_size;
+    return 0;
+}
+
+static int write_data(DumpState *s, void *buf, int length,
+                      target_phys_addr_t *offset)
+{
+    int ret;
+
+    ret = fd_write_vmcore(*offset, buf, length, s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to save memory.\n");
+        return -1;
+    }
+
+    *offset += length;
+    return 0;
+}
+
+/* write the memroy to vmcore. 1 page per I/O. */
+static int write_memory(DumpState *s, RAMBlock *block, ram_addr_t start,
+                        target_phys_addr_t *offset, int64_t size)
+{
+    int64_t i;
+    int ret;
+
+    for (i = 0; i < size / TARGET_PAGE_SIZE; i++) {
+        ret = write_data(s, block->host + start + i * TARGET_PAGE_SIZE,
+                         TARGET_PAGE_SIZE, offset);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    if ((size % TARGET_PAGE_SIZE) != 0) {
+        ret = write_data(s, block->host + start + i * TARGET_PAGE_SIZE,
+                         size % TARGET_PAGE_SIZE, offset);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
+/* get the memory's offset in the vmcore */
+static target_phys_addr_t get_offset(target_phys_addr_t phys_addr,
+                                     DumpState *s)
+{
+    RAMBlock *block;
+    target_phys_addr_t offset = s->memory_offset;
+    int64_t size_in_block, start;
+
+    if (s->has_filter) {
+        if (phys_addr < s->begin || phys_addr >= s->begin + s->length) {
+            return -1;
+        }
+    }
+
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        if (s->has_filter) {
+            if (block->offset >= s->begin + s->length ||
+                block->offset + block->length <= s->begin) {
+                /* This block is out of the range */
+                continue;
+            }
+
+            if (s->begin <= block->offset) {
+                start = block->offset;
+            } else {
+                start = s->begin;
+            }
+
+            size_in_block = block->length - (start - block->offset);
+            if (s->begin + s->length < block->offset + block->length) {
+                size_in_block -= block->offset + block->length -
+                                 (s->begin + s->length);
+            }
+        } else {
+            start = block->offset;
+            size_in_block = block->length;
+        }
+
+        if (phys_addr >= start && phys_addr < start + size_in_block) {
+            return phys_addr - start + offset;
+        }
+
+        offset += size_in_block;
+    }
+
+    return -1;
+}
+
+/* write elf header, PT_NOTE and elf note to vmcore. */
+static int dump_begin(DumpState *s)
+{
+    target_phys_addr_t offset;
+    int ret;
+
+    /*
+     * the vmcore's format is:
+     *   --------------
+     *   |  elf header |
+     *   --------------
+     *   |  PT_NOTE    |
+     *   --------------
+     *   |  PT_LOAD    |
+     *   --------------
+     *   |  ......     |
+     *   --------------
+     *   |  PT_LOAD    |
+     *   --------------
+     *   |  sec_hdr    |
+     *   --------------
+     *   |  elf note   |
+     *   --------------
+     *   |  memory     |
+     *   --------------
+     *
+     * we only know where the memory is saved after we write elf note into
+     * vmcore.
+     */
+
+    /* write elf header to vmcore */
+    if (s->dump_info.d_class == ELFCLASS64) {
+        ret = write_elf64_header(s);
+    } else {
+        ret = write_elf32_header(s);
+    }
+    if (ret < 0) {
+        return -1;
+    }
+
+    /* write elf section and notes to vmcore */
+    if (s->dump_info.d_class == ELFCLASS64) {
+        if (s->have_section) {
+            offset = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr)*s->sh_info;
+            if (write_elf_section(s, &offset, 1) < 0) {
+                return -1;
+            }
+        } else {
+            offset = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr)*s->phdr_num;
+        }
+        ret = write_elf64_notes(s, 0, &offset);
+    } else {
+        if (s->have_section) {
+            offset = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr)*s->sh_info;
+            if (write_elf_section(s, &offset, 0) < 0) {
+                return -1;
+            }
+        } else {
+            offset = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr)*s->phdr_num;
+        }
+        ret = write_elf32_notes(s, 0, &offset);
+    }
+
+    if (ret < 0) {
+        return -1;
+    }
+
+    s->memory_offset = offset;
+    return 0;
+}
+
+/* write PT_LOAD to vmcore */
+static int dump_completed(DumpState *s)
+{
+    target_phys_addr_t offset;
+    MemoryMapping *memory_mapping;
+    int phdr_index = 1, ret;
+
+    QTAILQ_FOREACH(memory_mapping, &s->list.head, next) {
+        offset = get_offset(memory_mapping->phys_addr, s);
+        if (s->dump_info.d_class == ELFCLASS64) {
+            ret = write_elf64_load(s, memory_mapping, phdr_index++, offset);
+        } else {
+            ret = write_elf32_load(s, memory_mapping, phdr_index++, offset);
+        }
+        if (ret < 0) {
+            return -1;
+        }
+    }
+
+    dump_cleanup(s);
+    return 0;
+}
+
+static int get_next_block(DumpState *s, RAMBlock *block)
+{
+    while (1) {
+        block = QLIST_NEXT(block, next);
+        if (!block) {
+            /* no more block */
+            return 1;
+        }
+
+        s->start = 0;
+        s->block = block;
+        if (s->has_filter) {
+            if (block->offset >= s->begin + s->length ||
+                block->offset + block->length <= s->begin) {
+                /* This block is out of the range */
+                continue;
+            }
+
+            if (s->begin > block->offset) {
+                s->start = s->begin - block->offset;
+            }
+        }
+
+        return 0;
+    }
+}
+
+/* write all memory to vmcore */
+static int dump_iterate(DumpState *s)
+{
+    RAMBlock *block;
+    target_phys_addr_t offset = s->memory_offset;
+    int64_t size;
+    int ret;
+
+    while (1) {
+        block = s->block;
+
+        size = block->length;
+        if (s->has_filter) {
+            size -= s->start;
+            if (s->begin + s->length < block->offset + block->length) {
+                size -= block->offset + block->length - (s->begin + s->length);
+            }
+        }
+        ret = write_memory(s, block, s->start, &offset, size);
+        if (ret == -1) {
+            return ret;
+        }
+
+        ret = get_next_block(s, block);
+        if (ret == 1) {
+            dump_completed(s);
+            return 0;
+        }
+    }
+}
+
+static int create_vmcore(DumpState *s)
+{
+    int ret;
+
+    ret = dump_begin(s);
+    if (ret < 0) {
+        return -1;
+    }
+
+    ret = dump_iterate(s);
+    if (ret < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
+static ram_addr_t get_start_block(DumpState *s)
+{
+    RAMBlock *block;
+
+    if (!s->has_filter) {
+        s->block = QLIST_FIRST(&ram_list.blocks);
+        return 0;
+    }
+
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        if (block->offset >= s->begin + s->length ||
+            block->offset + block->length <= s->begin) {
+            /* This block is out of the range */
+            continue;
+        }
+
+        s->block = block;
+        if (s->begin > block->offset) {
+            s->start = s->begin - block->offset;
+        } else {
+            s->start = 0;
+        }
+        return s->start;
+    }
+
+    return -1;
+}
+
+static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
+                     int64_t begin, int64_t length, Error **errp)
+{
+    CPUArchState *env;
+    int ret;
+
+    if (runstate_is_running()) {
+        vm_stop(RUN_STATE_SAVE_VM);
+        s->resume = true;
+    } else {
+        s->resume = false;
+    }
+
+    s->errp = errp;
+    s->fd = fd;
+    s->has_filter = has_filter;
+    s->begin = begin;
+    s->length = length;
+    s->start = get_start_block(s);
+    if (s->start == -1) {
+        error_set(errp, QERR_INVALID_PARAMETER, "begin");
+        goto cleanup;
+    }
+
+    /*
+     * get dump info: endian, class and architecture.
+     * If the target architecture is not supported, cpu_get_dump_info() will
+     * return -1.
+     *
+     * if we use kvm, we should synchronize the register before we get dump
+     * info.
+     */
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        cpu_synchronize_state(env);
+    }
+
+    ret = cpu_get_dump_info(&s->dump_info);
+    if (ret < 0) {
+        error_set(errp, QERR_UNSUPPORTED);
+        goto cleanup;
+    }
+
+    /* get memory mapping */
+    memory_mapping_list_init(&s->list);
+    if (paging) {
+        qemu_get_guest_memory_mapping(&s->list);
+    } else {
+        qemu_get_guest_simple_memory_mapping(&s->list);
+    }
+
+    if (s->has_filter) {
+        memory_mapping_filter(&s->list, s->begin, s->length);
+    }
+
+    /*
+     * calculate phdr_num
+     *
+     * the type of ehdr->e_phnum is uint16_t, so we should avoid overflow
+     */
+    s->phdr_num = 1; /* PT_NOTE */
+    if (s->list.num < UINT16_MAX - 2) {
+        s->phdr_num += s->list.num;
+        s->have_section = false;
+    } else {
+        s->have_section = true;
+        s->phdr_num = PN_XNUM;
+        s->sh_info = 1; /* PT_NOTE */
+
+        /* the type of shdr->sh_info is uint32_t, so we should avoid overflow */
+        if (s->list.num <= UINT32_MAX - 1) {
+            s->sh_info += s->list.num;
+        } else {
+            s->sh_info = UINT32_MAX;
+        }
+    }
+
+    return 0;
+
+cleanup:
+    if (s->resume) {
+        vm_start();
+    }
+
+    return -1;
+}
+
+void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
+                           int64_t begin, bool has_length, int64_t length,
+                           Error **errp)
+{
+    const char *p;
+    int fd = -1;
+    DumpState *s;
+    int ret;
+
+    if (has_begin && !has_length) {
+        error_set(errp, QERR_MISSING_PARAMETER, "length");
+        return;
+    }
+    if (!has_begin && has_length) {
+        error_set(errp, QERR_MISSING_PARAMETER, "begin");
+        return;
+    }
+
+#if !defined(WIN32)
+    if (strstart(file, "fd:", &p)) {
+        fd = monitor_get_fd(cur_mon, p);
+        if (fd == -1) {
+            error_set(errp, QERR_FD_NOT_FOUND, p);
+            return;
+        }
+    }
+#endif
+
+    if  (strstart(file, "file:", &p)) {
+        fd = qemu_open(p, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR);
+        if (fd < 0) {
+            error_set(errp, QERR_OPEN_FILE_FAILED, p);
+            return;
+        }
+    }
+
+    if (fd == -1) {
+        error_set(errp, QERR_INVALID_PARAMETER, "protocol");
+        return;
+    }
+
+    s = g_malloc(sizeof(DumpState));
+
+    ret = dump_init(s, fd, paging, has_begin, begin, length, errp);
+    if (ret < 0) {
+        g_free(s);
+        return;
+    }
+
+    if (create_vmcore(s) < 0 && !error_is_set(s->errp)) {
+        error_set(errp, QERR_IO_ERROR);
+    }
+
+    g_free(s);
+}
diff --git a/elf.h b/elf.h
index 36bcac4..7560eac 100644
--- a/elf.h
+++ b/elf.h
@@ -1016,6 +1016,11 @@ typedef struct elf64_sym {
 
 #define EI_NIDENT	16
 
+/* Special value for e_phnum.  This indicates that the real number of
+   program headers is too large to fit into e_phnum.  Instead the real
+   value is in the field sh_info of section 0.  */
+#define PN_XNUM         0xffff
+
 typedef struct elf32_hdr{
   unsigned char	e_ident[EI_NIDENT];
   Elf32_Half	e_type;
diff --git a/hmp-commands.hx b/hmp-commands.hx
index bd35a3e..d270fd1 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -879,6 +879,34 @@ server will ask the spice/vnc client to automatically reconnect using the
 new parameters (if specified) once the vm migration finished successfully.
 ETEXI
 
+#if defined(CONFIG_HAVE_CORE_DUMP)
+    {
+        .name       = "dump-guest-memory",
+        .args_type  = "paging:-p,protocol:s,begin:i?,length:i?",
+        .params     = "[-p] protocol [begin] [length]",
+        .help       = "dump guest memory to file"
+                      "\n\t\t\t begin(optional): the starting physical address"
+                      "\n\t\t\t length(optional): the memory size, in bytes",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd = hmp_dump_guest_memory,
+    },
+
+
+STEXI
+@item dump-guest-memory [-p] @var{protocol} @var{begin} @var{length}
+@findex dump-guest-memory
+Dump guest memory to @var{protocol}. The file can be processed with crash or
+gdb.
+  protocol: destination file(started with "file:") or destination file
+            descriptor (started with "fd:")
+    paging: do paging to get guest's memory mapping
+     begin: the starting physical address. It's optional, and should be
+            specified with length together.
+    length: the memory size, in bytes. It's optional, and should be specified
+            with begin together.
+ETEXI
+#endif
+
     {
         .name       = "snapshot_blkdev",
         .args_type  = "reuse:-n,device:B,snapshot-file:s?,format:s?",
diff --git a/hmp.c b/hmp.c
index 9cf2d13..75ec9a9 100644
--- a/hmp.c
+++ b/hmp.c
@@ -934,3 +934,25 @@ void hmp_migrate(Monitor *mon, const QDict *qdict)
         qemu_mod_timer(status->timer, qemu_get_clock_ms(rt_clock));
     }
 }
+
+void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict)
+{
+    Error *errp = NULL;
+    int paging = qdict_get_try_bool(qdict, "paging", 0);
+    const char *file = qdict_get_str(qdict, "protocol");
+    bool has_begin = qdict_haskey(qdict, "begin");
+    bool has_length = qdict_haskey(qdict, "length");
+    int64_t begin = 0;
+    int64_t length = 0;
+
+    if (has_begin) {
+        begin = qdict_get_int(qdict, "begin");
+    }
+    if (has_length) {
+        length = qdict_get_int(qdict, "length");
+    }
+
+    qmp_dump_guest_memory(paging, file, has_begin, begin, has_length, length,
+                          &errp);
+    hmp_handle_error(mon, &errp);
+}
diff --git a/hmp.h b/hmp.h
index 8807853..36d69c6 100644
--- a/hmp.h
+++ b/hmp.h
@@ -60,5 +60,6 @@ void hmp_block_stream(Monitor *mon, const QDict *qdict);
 void hmp_block_job_set_speed(Monitor *mon, const QDict *qdict);
 void hmp_block_job_cancel(Monitor *mon, const QDict *qdict);
 void hmp_migrate(Monitor *mon, const QDict *qdict);
+void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict);
 
 #endif
diff --git a/memory_mapping.c b/memory_mapping.c
index adb1595..8810bb0 100644
--- a/memory_mapping.c
+++ b/memory_mapping.c
@@ -220,3 +220,30 @@ void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list)
         create_new_memory_mapping(list, block->offset, 0, block->length);
     }
 }
+
+void memory_mapping_filter(MemoryMappingList *list, int64_t begin,
+                           int64_t length)
+{
+    MemoryMapping *cur, *next;
+
+    QTAILQ_FOREACH_SAFE(cur, &list->head, next, next) {
+        if (cur->phys_addr >= begin + length ||
+            cur->phys_addr + cur->length <= begin) {
+            QTAILQ_REMOVE(&list->head, cur, next);
+            list->num--;
+            continue;
+        }
+
+        if (cur->phys_addr < begin) {
+            cur->length -= begin - cur->phys_addr;
+            if (cur->virt_addr) {
+                cur->virt_addr += begin - cur->phys_addr;
+            }
+            cur->phys_addr = begin;
+        }
+
+        if (cur->phys_addr + cur->length > begin + length) {
+            cur->length -= cur->phys_addr + cur->length - begin - length;
+        }
+    }
+}
diff --git a/memory_mapping.h b/memory_mapping.h
index a583e44..b795678 100644
--- a/memory_mapping.h
+++ b/memory_mapping.h
@@ -62,4 +62,7 @@ static inline int qemu_get_guest_memory_mapping(MemoryMappingList *list)
 /* get guest's memory mapping without do paging(virtual address is 0). */
 void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list);
 
+void memory_mapping_filter(MemoryMappingList *list, int64_t begin,
+                           int64_t length);
+
 #endif
diff --git a/qapi-schema.json b/qapi-schema.json
index 0d11d6e..c2f5866 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1701,3 +1701,37 @@
 # Since: 1.1
 ##
 { 'command': 'xen-save-devices-state', 'data': {'filename': 'str'} }
+
+##
+# @dump-guest-memory
+#
+# Dump guest's memory to vmcore.
+#
+# @paging: if true, do paging to get guest's memory mapping. The @paging's
+# default value of @paging is false, If you want to use gdb to process the
+# core, please set @paging to true. The reason why the @paging's value is
+# false:
+#   1. guest machine in a catastrophic state can have corrupted memory,
+#      which we cannot trust.
+#   2. The guest machine can be in read-mode even if paging is enabled.
+#      For example: the guest machine uses ACPI to sleep, and ACPI sleep
+#      state goes in real-mode
+# @protocol: the filename or file descriptor of the vmcore. The supported
+# protocol can be file or fd:
+#   1. file: the protocol starts with "file:", and the following string is
+#      the file's path.
+#   2. fd: the protocol starts with "fd:", and the following string is the
+#      fd's name. This command doesn't support the fd that is is associated
+#      with a pipe, socket, or FIFO(lseek will fail with such fd).
+# @begin: #optional if specified, the starting physical address.
+# @length: #optional if specified, the memory size, in bytes. If you don't
+# want to dump all guest's memory, please specify the start @begin and
+# @length
+#
+# Returns: nothing on success
+#
+# Since: 1.1
+##
+{ 'command': 'dump-guest-memory',
+  'data': { 'paging': 'bool', 'protocol': 'str', '*begin': 'int',
+            '*length': 'int' } }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 9447871..037486a 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -606,6 +606,44 @@ Example:
 
 EQMP
 
+#if defined(CONFIG_HAVE_CORE_DUMP)
+    {
+        .name       = "dump-guest-memory",
+        .args_type  = "paging:b,protocol:s,begin:i?,end:i?",
+        .params     = "[-p] protocol [begin] [length]",
+        .help       = "dump guest memory to file",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = qmp_marshal_input_dump_guest_memory,
+    },
+
+SQMP
+dump
+
+
+Dump guest memory to file. The file can be processed with crash or gdb.
+
+Arguments:
+
+- "paging": do paging to get guest's memory mapping (json-bool)
+- "protocol": destination file(started with "file:") or destination file
+              descriptor (started with "fd:") (json-string)
+- "begin": the starting physical address. It's optional, and should be specified
+           with length together (json-int)
+- "length": the memory size, in bytes. It's optional, and should be specified
+            with begin together (json-int)
+
+Example:
+
+-> { "execute": "dump-guest-memory", "arguments": { "protocol": "fd:dump" } }
+<- { "return": {} }
+
+Notes:
+
+(1) All boolean arguments default to false
+
+EQMP
+#endif
+
     {
         .name       = "netdev_add",
         .args_type  = "netdev:O",
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism
  2012-03-28 12:44   ` Luiz Capitulino
@ 2012-04-02  3:19     ` Wen Congyang
  2012-04-03  7:35       ` Jan Kiszka
  0 siblings, 1 reply; 21+ messages in thread
From: Wen Congyang @ 2012-04-02  3:19 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Anthony Liguori, Jan Kiszka, qemu-devel, HATAYAMA Daisuke,
	Dave Anderson, Eric Blake

At 03/28/2012 08:44 PM, Luiz Capitulino Wrote:
> On Wed, 28 Mar 2012 13:17:43 +0800
> Wen Congyang <wency@cn.fujitsu.com> wrote:
> 
>> Hi, Luiz, Anthony, Jan
>>
>> do you have any comments about this patchset?
> 
> As far as QMP is concerned:
> 
>  Acked-by: Luiz Capitulino <lcapitulino@redhat.com>

I fix a overflow bug and resend patch 12/12

> 
> This can go through me tree, but I'd like an ACK from Jan and/or Anthony.
> 

Hi, Jan, Anthony

when do you have tiem to review this patchset?

Thanks
Wen Congyang

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism
  2012-04-02  3:19     ` Wen Congyang
@ 2012-04-03  7:35       ` Jan Kiszka
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Kiszka @ 2012-04-03  7:35 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Anthony Liguori, qemu-devel, Luiz Capitulino, HATAYAMA Daisuke,
	Dave Anderson, Eric Blake

On 2012-04-02 05:19, Wen Congyang wrote:
> At 03/28/2012 08:44 PM, Luiz Capitulino Wrote:
>> On Wed, 28 Mar 2012 13:17:43 +0800
>> Wen Congyang <wency@cn.fujitsu.com> wrote:
>>
>>> Hi, Luiz, Anthony, Jan
>>>
>>> do you have any comments about this patchset?
>>
>> As far as QMP is concerned:
>>
>>  Acked-by: Luiz Capitulino <lcapitulino@redhat.com>
> 
> I fix a overflow bug and resend patch 12/12
> 
>>
>> This can go through me tree, but I'd like an ACK from Jan and/or Anthony.
>>
> 
> Hi, Jan, Anthony
> 
> when do you have tiem to review this patchset?

High on my to-do list. I hope to have a look and a try the next days.
Sorry for the long delay.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2012-04-03  7:35 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-26  9:58 [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
2012-03-26 10:00 ` [Qemu-devel] [PATCH 01/12 v11] Add API to create memory mapping list Wen Congyang
2012-03-26 10:01 ` [Qemu-devel] [PATCH 02/12 v12] Add API to check whether a physical address is I/O address Wen Congyang
2012-03-26 10:02 ` [Qemu-devel] [PATCH 03/12 v11] implement cpu_get_memory_mapping() Wen Congyang
2012-03-26 10:02 ` [Qemu-devel] [PATCH 04/12 v11] Add API to check whether paging mode is enabled Wen Congyang
2012-03-26 10:03 ` [Qemu-devel] [PATCH 05/12 v11] Add API to get memory mapping Wen Congyang
2012-03-27  2:27   ` Wen Congyang
2012-03-26 10:03 ` [Qemu-devel] [PATCH 06/12 v11] Add API to get memory mapping without do paging Wen Congyang
2012-03-26 10:04 ` [Qemu-devel] [PATCH 07/12 v11] target-i386: Add API to write elf notes to core file Wen Congyang
2012-03-26 10:04 ` [Qemu-devel] [PATCH 08/12 v11] target-i386: Add API to write cpu status " Wen Congyang
2012-03-26 10:05 ` [Qemu-devel] [PATCH 09/12 v11] target-i386: add API to get dump info Wen Congyang
2012-03-26 10:05 ` [Qemu-devel] [PATCH 10/12 v11] make gdb_id() generally avialable and rename it to cpu_index() Wen Congyang
2012-03-27  3:38   ` HATAYAMA Daisuke
2012-03-26 10:06 ` [Qemu-devel] [PATCH 11/12 v11] QError: Introduce new error for the dump-guest-memory command Wen Congyang
2012-03-26 10:06 ` [Qemu-devel] [PATCH 12/12 v11] introduce a new monitor command 'dump-guest-memory' to dump guest's memory Wen Congyang
2012-04-02  2:54   ` Wen Congyang
2012-04-02  3:16   ` [Qemu-devel] [PATCH 12/12 v11.5] " Wen Congyang
2012-03-28  5:17 ` [Qemu-devel] [PATCH 00/12 v11] introducing a new, dedicated guest memory dump mechanism Wen Congyang
2012-03-28 12:44   ` Luiz Capitulino
2012-04-02  3:19     ` Wen Congyang
2012-04-03  7:35       ` Jan Kiszka

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.