All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism
@ 2012-05-07  4:01 Wen Congyang
  2012-05-07  4:03 ` [Qemu-devel] [PATCH 01/12 v15] Add API to create memory mapping list Wen Congyang
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: Wen Congyang @ 2012-05-07  4:01 UTC (permalink / raw)
  To: qemu-devel, HATAYAMA Daisuke, Luiz Capitulino,
	Daniel P. Berrange, Anthony Liguori

Hi, all

'virsh dump' can not work when host pci device is used by guest. We have
discussed this issue here:
http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html

The last version is here:
http://lists.nongnu.org/archive/html/qemu-devel/2012-04/msg03379.html

We have determined to introduce a new command dump-guest-memory to dump
guest's memory. The core file's format is elf32 or elf64.

Note:
1. The guest should be x86 or x86_64. The other arch is not supported now.
2. If you use old gdb, gdb may crash. I use gdb-7.3.1, and it does not crash.
3. If the OS is in the second kernel, gdb may not work well, and crash can
   work by specifying '--machdep phys_addr=xxx' in the command line. The
   reason is that the second kernel will update the page table, and we can
   not get the page table for the first kernel.
4. The cpu's state is stored in QEMU note. You neet to modify crash to use
   it to calculate phys_base.
5. If the guest OS is 32 bit and the memory size is larger than 4G, the vmcore
   is elf64 format. You should use the gdb which is built with --enable-64-bit-bfd.

Changes from v14 to v15:
1. rebase to newest qemu
2. address Luiz's comment

Changes from v13 to v14:
1. fix building error

Changes from v12 to v13:
1. Support the fd that is is associated with a pipe, socket, or FIFO

Changes from v11 to v12:
1. rebase and resend

Changes from v10 to v11:
1. addressed Luiz's and Hatayam's comment
2. fix a bug about filtering feature

Changes from v9 to v10:
1. fix some bug
2. addressed Luiz's and Hatayam's comment
3. remove cancel and query command

Changes from v8 to v9:
1. remove async support(it will be reimplemented after QAPI async commands support
   is finished)
2. fix some typo error

Changes from v7 to v8:
1. addressed Hatayama's comments

Changes from v6 to v7:
1. addressed Jan's comments
2. fix some bugs
3. store cpu's state into the vmcore

Changes from v5 to v6:
1. allow user to dump a fraction of the memory
2. fix some bugs

Changes from v4 to v5:
1. convert the new command dump to QAPI 

Changes from v3 to v4:
1. support it to run asynchronously
2. add API to cancel dumping and query dumping progress
3. add API to control dumping speed
4. auto cancel dumping when the user resumes vm, and the status is failed.

Changes from v2 to v3:
1. address Jan Kiszka's comment

Changes from v1 to v2:
1. fix virt addr in the vmcore.

Wen Congyang (12):
  Add API to create memory mapping list
  Add API to check whether a physical address is I/O address
  implement cpu_get_memory_mapping()
  Add API to check whether paging mode is enabled
  Add API to get memory mapping
  Add API to get memory mapping without do paging
  target-i386: Add API to write elf notes to core file
  target-i386: Add API to write cpu status to core file
  target-i386: add API to get dump info
  target-i386: Add API to get note's size
  make gdb_id() generally avialable and rename it to cpu_index()
  introduce a new monitor command 'dump-guest-memory' to dump guest's
    memory

 Makefile.target                   |    5 +
 configure                         |    8 +
 cpu-all.h                         |   70 +++
 cpu-common.h                      |    4 +
 dump.c                            |  883 +++++++++++++++++++++++++++++++++++++
 dump.h                            |   23 +
 elf.h                             |    5 +
 exec.c                            |   12 +
 gdbstub.c                         |   19 +-
 gdbstub.h                         |    9 +
 hmp-commands.hx                   |   28 ++
 hmp.c                             |   22 +
 hmp.h                             |    1 +
 memory_mapping.c                  |  249 +++++++++++
 memory_mapping.h                  |   74 +++
 qapi-schema.json                  |   43 ++
 qmp-commands.hx                   |   36 ++
 target-i386/arch_dump.c           |  449 +++++++++++++++++++
 target-i386/arch_memory_mapping.c |  271 ++++++++++++
 19 files changed, 2197 insertions(+), 14 deletions(-)
 create mode 100644 dump.c
 create mode 100644 dump.h
 create mode 100644 memory_mapping.c
 create mode 100644 memory_mapping.h
 create mode 100644 target-i386/arch_dump.c
 create mode 100644 target-i386/arch_memory_mapping.c

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 01/12 v15] Add API to create memory mapping list
  2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
@ 2012-05-07  4:03 ` Wen Congyang
  2012-05-07  4:04 ` [Qemu-devel] [PATCH 02/12 v15] Add API to check whether a physical address is I/O address Wen Congyang
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2012-05-07  4:03 UTC (permalink / raw)
  To: qemu-devel, HATAYAMA Daisuke, Luiz Capitulino,
	Daniel P. Berrange, Anthony Liguori

The memory mapping list stores virtual address and physical address mapping.
The virtual address and physical address are contiguous in the mapping.
The folloing patch will use this information to create PT_LOAD in the vmcore.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target  |    1 +
 memory_mapping.c |  166 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 memory_mapping.h |   47 +++++++++++++++
 3 files changed, 214 insertions(+), 0 deletions(-)
 create mode 100644 memory_mapping.c
 create mode 100644 memory_mapping.h

diff --git a/Makefile.target b/Makefile.target
index 1582904..005fc49 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -192,6 +192,7 @@ obj-$(CONFIG_KVM) += kvm.o kvm-all.o
 obj-$(CONFIG_NO_KVM) += kvm-stub.o
 obj-$(CONFIG_VGA) += vga.o
 obj-y += memory.o savevm.o cputlb.o
+obj-y += memory_mapping.o
 LIBS+=-lz
 
 obj-i386-$(CONFIG_KVM) += hyperv.o
diff --git a/memory_mapping.c b/memory_mapping.c
new file mode 100644
index 0000000..718f271
--- /dev/null
+++ b/memory_mapping.c
@@ -0,0 +1,166 @@
+/*
+ * QEMU memory mapping
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "cpu.h"
+#include "cpu-all.h"
+#include "memory_mapping.h"
+
+static void memory_mapping_list_add_mapping_sorted(MemoryMappingList *list,
+                                                   MemoryMapping *mapping)
+{
+    MemoryMapping *p;
+
+    QTAILQ_FOREACH(p, &list->head, next) {
+        if (p->phys_addr >= mapping->phys_addr) {
+            QTAILQ_INSERT_BEFORE(p, mapping, next);
+            return;
+        }
+    }
+    QTAILQ_INSERT_TAIL(&list->head, mapping, next);
+}
+
+static void create_new_memory_mapping(MemoryMappingList *list,
+                                      target_phys_addr_t phys_addr,
+                                      target_phys_addr_t virt_addr,
+                                      ram_addr_t length)
+{
+    MemoryMapping *memory_mapping;
+
+    memory_mapping = g_malloc(sizeof(MemoryMapping));
+    memory_mapping->phys_addr = phys_addr;
+    memory_mapping->virt_addr = virt_addr;
+    memory_mapping->length = length;
+    list->last_mapping = memory_mapping;
+    list->num++;
+    memory_mapping_list_add_mapping_sorted(list, memory_mapping);
+}
+
+static inline bool mapping_contiguous(MemoryMapping *map,
+                                      target_phys_addr_t phys_addr,
+                                      target_phys_addr_t virt_addr)
+{
+    return phys_addr == map->phys_addr + map->length &&
+           virt_addr == map->virt_addr + map->length;
+}
+
+/*
+ * [map->phys_addr, map->phys_addr + map->length) and
+ * [phys_addr, phys_addr + length) have intersection?
+ */
+static inline bool mapping_have_same_region(MemoryMapping *map,
+                                            target_phys_addr_t phys_addr,
+                                            ram_addr_t length)
+{
+    return !(phys_addr + length < map->phys_addr ||
+             phys_addr >= map->phys_addr + map->length);
+}
+
+/*
+ * [map->phys_addr, map->phys_addr + map->length) and
+ * [phys_addr, phys_addr + length) have intersection. The virtual address in the
+ * intersection are the same?
+ */
+static inline bool mapping_conflict(MemoryMapping *map,
+                                    target_phys_addr_t phys_addr,
+                                    target_phys_addr_t virt_addr)
+{
+    return virt_addr - map->virt_addr != phys_addr - map->phys_addr;
+}
+
+/*
+ * [map->virt_addr, map->virt_addr + map->length) and
+ * [virt_addr, virt_addr + length) have intersection. And the physical address
+ * in the intersection are the same.
+ */
+static inline void mapping_merge(MemoryMapping *map,
+                                 target_phys_addr_t virt_addr,
+                                 ram_addr_t length)
+{
+    if (virt_addr < map->virt_addr) {
+        map->length += map->virt_addr - virt_addr;
+        map->virt_addr = virt_addr;
+    }
+
+    if ((virt_addr + length) >
+        (map->virt_addr + map->length)) {
+        map->length = virt_addr + length - map->virt_addr;
+    }
+}
+
+void memory_mapping_list_add_merge_sorted(MemoryMappingList *list,
+                                          target_phys_addr_t phys_addr,
+                                          target_phys_addr_t virt_addr,
+                                          ram_addr_t length)
+{
+    MemoryMapping *memory_mapping, *last_mapping;
+
+    if (QTAILQ_EMPTY(&list->head)) {
+        create_new_memory_mapping(list, phys_addr, virt_addr, length);
+        return;
+    }
+
+    last_mapping = list->last_mapping;
+    if (last_mapping) {
+        if (mapping_contiguous(last_mapping, phys_addr, virt_addr)) {
+            last_mapping->length += length;
+            return;
+        }
+    }
+
+    QTAILQ_FOREACH(memory_mapping, &list->head, next) {
+        if (mapping_contiguous(memory_mapping, phys_addr, virt_addr)) {
+            memory_mapping->length += length;
+            list->last_mapping = memory_mapping;
+            return;
+        }
+
+        if (phys_addr + length < memory_mapping->phys_addr) {
+            /* create a new region before memory_mapping */
+            break;
+        }
+
+        if (mapping_have_same_region(memory_mapping, phys_addr, length)) {
+            if (mapping_conflict(memory_mapping, phys_addr, virt_addr)) {
+                continue;
+            }
+
+            /* merge this region into memory_mapping */
+            mapping_merge(memory_mapping, virt_addr, length);
+            list->last_mapping = memory_mapping;
+            return;
+        }
+    }
+
+    /* this region can not be merged into any existed memory mapping. */
+    create_new_memory_mapping(list, phys_addr, virt_addr, length);
+}
+
+void memory_mapping_list_free(MemoryMappingList *list)
+{
+    MemoryMapping *p, *q;
+
+    QTAILQ_FOREACH_SAFE(p, &list->head, next, q) {
+        QTAILQ_REMOVE(&list->head, p, next);
+        g_free(p);
+    }
+
+    list->num = 0;
+    list->last_mapping = NULL;
+}
+
+void memory_mapping_list_init(MemoryMappingList *list)
+{
+    list->num = 0;
+    list->last_mapping = NULL;
+    QTAILQ_INIT(&list->head);
+}
diff --git a/memory_mapping.h b/memory_mapping.h
new file mode 100644
index 0000000..836b047
--- /dev/null
+++ b/memory_mapping.h
@@ -0,0 +1,47 @@
+/*
+ * QEMU memory mapping
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef MEMORY_MAPPING_H
+#define MEMORY_MAPPING_H
+
+#include "qemu-queue.h"
+
+/* The physical and virtual address in the memory mapping are contiguous. */
+typedef struct MemoryMapping {
+    target_phys_addr_t phys_addr;
+    target_ulong virt_addr;
+    ram_addr_t length;
+    QTAILQ_ENTRY(MemoryMapping) next;
+} MemoryMapping;
+
+typedef struct MemoryMappingList {
+    unsigned int num;
+    MemoryMapping *last_mapping;
+    QTAILQ_HEAD(, MemoryMapping) head;
+} MemoryMappingList;
+
+/*
+ * add or merge the memory region [phys_addr, phys_addr + length) into the
+ * memory mapping's list. The region's virtual address starts with virt_addr,
+ * and is contiguous. The list is sorted by phys_addr.
+ */
+void memory_mapping_list_add_merge_sorted(MemoryMappingList *list,
+                                          target_phys_addr_t phys_addr,
+                                          target_phys_addr_t virt_addr,
+                                          ram_addr_t length);
+
+void memory_mapping_list_free(MemoryMappingList *list);
+
+void memory_mapping_list_init(MemoryMappingList *list);
+
+#endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 02/12 v15] Add API to check whether a physical address is I/O address
  2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
  2012-05-07  4:03 ` [Qemu-devel] [PATCH 01/12 v15] Add API to create memory mapping list Wen Congyang
@ 2012-05-07  4:04 ` Wen Congyang
  2012-05-07  4:04 ` [Qemu-devel] [PATCH 03/12 v15] implement cpu_get_memory_mapping() Wen Congyang
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2012-05-07  4:04 UTC (permalink / raw)
  To: qemu-devel, HATAYAMA Daisuke, Luiz Capitulino,
	Daniel P. Berrange, Anthony Liguori

This API will be used in the following patch.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 cpu-common.h |    4 ++++
 exec.c       |   12 ++++++++++++
 2 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/cpu-common.h b/cpu-common.h
index dca5175..1fe3280 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -71,6 +71,10 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
 void *cpu_register_map_client(void *opaque, void (*callback)(void *opaque));
 void cpu_unregister_map_client(void *cookie);
 
+#ifndef CONFIG_USER_ONLY
+bool cpu_physical_memory_is_io(target_phys_addr_t phys_addr);
+#endif
+
 /* Coalesced MMIO regions are areas where write operations can be reordered.
  * This usually implies that write operations are side-effect free.  This allows
  * batching which can make a major impact on performance when using
diff --git a/exec.c b/exec.c
index 0607c9b..df8c152 100644
--- a/exec.c
+++ b/exec.c
@@ -4319,3 +4319,15 @@ bool virtio_is_big_endian(void)
 }
 
 #endif
+
+#ifndef CONFIG_USER_ONLY
+bool cpu_physical_memory_is_io(target_phys_addr_t phys_addr)
+{
+    MemoryRegionSection *section;
+
+    section = phys_page_find(phys_addr >> TARGET_PAGE_BITS);
+
+    return !(memory_region_is_ram(section->mr) ||
+             memory_region_is_romd(section->mr));
+}
+#endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 03/12 v15] implement cpu_get_memory_mapping()
  2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
  2012-05-07  4:03 ` [Qemu-devel] [PATCH 01/12 v15] Add API to create memory mapping list Wen Congyang
  2012-05-07  4:04 ` [Qemu-devel] [PATCH 02/12 v15] Add API to check whether a physical address is I/O address Wen Congyang
@ 2012-05-07  4:04 ` Wen Congyang
  2012-05-07  4:05 ` [Qemu-devel] [PATCH 04/12 v15] Add API to check whether paging mode is enabled Wen Congyang
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2012-05-07  4:04 UTC (permalink / raw)
  To: qemu-devel, HATAYAMA Daisuke, Luiz Capitulino,
	Daniel P. Berrange, Anthony Liguori

Walk cpu's page table and collect all virtual address and physical address mapping.
Then, add these mapping into memory mapping list. If the guest does not use paging,
it will do nothing. Note: the I/O memory will be skipped.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target                   |    1 +
 configure                         |    4 +
 cpu-all.h                         |   11 ++
 memory_mapping.h                  |    6 +
 target-i386/arch_memory_mapping.c |  266 +++++++++++++++++++++++++++++++++++++
 5 files changed, 288 insertions(+), 0 deletions(-)
 create mode 100644 target-i386/arch_memory_mapping.c

diff --git a/Makefile.target b/Makefile.target
index 005fc49..18ffaef 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -193,6 +193,7 @@ obj-$(CONFIG_NO_KVM) += kvm-stub.o
 obj-$(CONFIG_VGA) += vga.o
 obj-y += memory.o savevm.o cputlb.o
 obj-y += memory_mapping.o
+obj-$(CONFIG_HAVE_GET_MEMORY_MAPPING) += arch_memory_mapping.o
 LIBS+=-lz
 
 obj-i386-$(CONFIG_KVM) += hyperv.o
diff --git a/configure b/configure
index 0111774..971fa09 100755
--- a/configure
+++ b/configure
@@ -3705,6 +3705,10 @@ case "$target_arch2" in
       fi
     fi
 esac
+case "$target_arch2" in
+  i386|x86_64)
+    echo "CONFIG_HAVE_GET_MEMORY_MAPPING=y" >> $config_target_mak
+esac
 if test "$target_arch2" = "ppc64" -a "$fdt" = "yes"; then
   echo "CONFIG_PSERIES=y" >> $config_target_mak
 fi
diff --git a/cpu-all.h b/cpu-all.h
index 028528f..2688bac 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -22,6 +22,7 @@
 #include "qemu-common.h"
 #include "qemu-tls.h"
 #include "cpu-common.h"
+#include "memory_mapping.h"
 
 /* some important defines:
  *
@@ -524,4 +525,14 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf);
 int cpu_memory_rw_debug(CPUArchState *env, target_ulong addr,
                         uint8_t *buf, int len, int is_write);
 
+#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
+int cpu_get_memory_mapping(MemoryMappingList *list, CPUArchState *env);
+#else
+static inline int cpu_get_memory_mapping(MemoryMappingList *list,
+                                         CPUArchState *env)
+{
+    return -1;
+}
+#endif
+
 #endif /* CPU_ALL_H */
diff --git a/memory_mapping.h b/memory_mapping.h
index 836b047..e486d10 100644
--- a/memory_mapping.h
+++ b/memory_mapping.h
@@ -16,6 +16,7 @@
 
 #include "qemu-queue.h"
 
+#ifndef CONFIG_USER_ONLY
 /* The physical and virtual address in the memory mapping are contiguous. */
 typedef struct MemoryMapping {
     target_phys_addr_t phys_addr;
@@ -44,4 +45,9 @@ void memory_mapping_list_free(MemoryMappingList *list);
 
 void memory_mapping_list_init(MemoryMappingList *list);
 
+#else
+
+/* We use MemoryMappingList* in cpu-all.h */
+typedef struct MemoryMappingList MemoryMappingList;
+#endif
 #endif
diff --git a/target-i386/arch_memory_mapping.c b/target-i386/arch_memory_mapping.c
new file mode 100644
index 0000000..dd64bec
--- /dev/null
+++ b/target-i386/arch_memory_mapping.c
@@ -0,0 +1,266 @@
+/*
+ * i386 memory mapping
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "cpu.h"
+#include "cpu-all.h"
+
+/* PAE Paging or IA-32e Paging */
+static void walk_pte(MemoryMappingList *list, target_phys_addr_t pte_start_addr,
+                     int32_t a20_mask, target_ulong start_line_addr)
+{
+    target_phys_addr_t pte_addr, start_paddr;
+    uint64_t pte;
+    target_ulong start_vaddr;
+    int i;
+
+    for (i = 0; i < 512; i++) {
+        pte_addr = (pte_start_addr + i * 8) & a20_mask;
+        pte = ldq_phys(pte_addr);
+        if (!(pte & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        start_paddr = (pte & ~0xfff) & ~(0x1ULL << 63);
+        if (cpu_physical_memory_is_io(start_paddr)) {
+            /* I/O region */
+            continue;
+        }
+
+        start_vaddr = start_line_addr | ((i & 0x1fff) << 12);
+        memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                             start_vaddr, 1 << 12);
+    }
+}
+
+/* 32-bit Paging */
+static void walk_pte2(MemoryMappingList *list,
+                      target_phys_addr_t pte_start_addr, int32_t a20_mask,
+                      target_ulong start_line_addr)
+{
+    target_phys_addr_t pte_addr, start_paddr;
+    uint32_t pte;
+    target_ulong start_vaddr;
+    int i;
+
+    for (i = 0; i < 1024; i++) {
+        pte_addr = (pte_start_addr + i * 4) & a20_mask;
+        pte = ldl_phys(pte_addr);
+        if (!(pte & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        start_paddr = pte & ~0xfff;
+        if (cpu_physical_memory_is_io(start_paddr)) {
+            /* I/O region */
+            continue;
+        }
+
+        start_vaddr = start_line_addr | ((i & 0x3ff) << 12);
+        memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                             start_vaddr, 1 << 12);
+    }
+}
+
+/* PAE Paging or IA-32e Paging */
+static void walk_pde(MemoryMappingList *list, target_phys_addr_t pde_start_addr,
+                     int32_t a20_mask, target_ulong start_line_addr)
+{
+    target_phys_addr_t pde_addr, pte_start_addr, start_paddr;
+    uint64_t pde;
+    target_ulong line_addr, start_vaddr;
+    int i;
+
+    for (i = 0; i < 512; i++) {
+        pde_addr = (pde_start_addr + i * 8) & a20_mask;
+        pde = ldq_phys(pde_addr);
+        if (!(pde & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = start_line_addr | ((i & 0x1ff) << 21);
+        if (pde & PG_PSE_MASK) {
+            /* 2 MB page */
+            start_paddr = (pde & ~0x1fffff) & ~(0x1ULL << 63);
+            if (cpu_physical_memory_is_io(start_paddr)) {
+                /* I/O region */
+                continue;
+            }
+            start_vaddr = line_addr;
+            memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                                 start_vaddr, 1 << 21);
+            continue;
+        }
+
+        pte_start_addr = (pde & ~0xfff) & a20_mask;
+        walk_pte(list, pte_start_addr, a20_mask, line_addr);
+    }
+}
+
+/* 32-bit Paging */
+static void walk_pde2(MemoryMappingList *list,
+                      target_phys_addr_t pde_start_addr, int32_t a20_mask,
+                      bool pse)
+{
+    target_phys_addr_t pde_addr, pte_start_addr, start_paddr;
+    uint32_t pde;
+    target_ulong line_addr, start_vaddr;
+    int i;
+
+    for (i = 0; i < 1024; i++) {
+        pde_addr = (pde_start_addr + i * 4) & a20_mask;
+        pde = ldl_phys(pde_addr);
+        if (!(pde & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = (((unsigned int)i & 0x3ff) << 22);
+        if ((pde & PG_PSE_MASK) && pse) {
+            /* 4 MB page */
+            start_paddr = (pde & ~0x3fffff) | ((pde & 0x1fe000) << 19);
+            if (cpu_physical_memory_is_io(start_paddr)) {
+                /* I/O region */
+                continue;
+            }
+            start_vaddr = line_addr;
+            memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                                 start_vaddr, 1 << 22);
+            continue;
+        }
+
+        pte_start_addr = (pde & ~0xfff) & a20_mask;
+        walk_pte2(list, pte_start_addr, a20_mask, line_addr);
+    }
+}
+
+/* PAE Paging */
+static void walk_pdpe2(MemoryMappingList *list,
+                       target_phys_addr_t pdpe_start_addr, int32_t a20_mask)
+{
+    target_phys_addr_t pdpe_addr, pde_start_addr;
+    uint64_t pdpe;
+    target_ulong line_addr;
+    int i;
+
+    for (i = 0; i < 4; i++) {
+        pdpe_addr = (pdpe_start_addr + i * 8) & a20_mask;
+        pdpe = ldq_phys(pdpe_addr);
+        if (!(pdpe & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = (((unsigned int)i & 0x3) << 30);
+        pde_start_addr = (pdpe & ~0xfff) & a20_mask;
+        walk_pde(list, pde_start_addr, a20_mask, line_addr);
+    }
+}
+
+#ifdef TARGET_X86_64
+/* IA-32e Paging */
+static void walk_pdpe(MemoryMappingList *list,
+                      target_phys_addr_t pdpe_start_addr, int32_t a20_mask,
+                      target_ulong start_line_addr)
+{
+    target_phys_addr_t pdpe_addr, pde_start_addr, start_paddr;
+    uint64_t pdpe;
+    target_ulong line_addr, start_vaddr;
+    int i;
+
+    for (i = 0; i < 512; i++) {
+        pdpe_addr = (pdpe_start_addr + i * 8) & a20_mask;
+        pdpe = ldq_phys(pdpe_addr);
+        if (!(pdpe & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = start_line_addr | ((i & 0x1ffULL) << 30);
+        if (pdpe & PG_PSE_MASK) {
+            /* 1 GB page */
+            start_paddr = (pdpe & ~0x3fffffff) & ~(0x1ULL << 63);
+            if (cpu_physical_memory_is_io(start_paddr)) {
+                /* I/O region */
+                continue;
+            }
+            start_vaddr = line_addr;
+            memory_mapping_list_add_merge_sorted(list, start_paddr,
+                                                 start_vaddr, 1 << 30);
+            continue;
+        }
+
+        pde_start_addr = (pdpe & ~0xfff) & a20_mask;
+        walk_pde(list, pde_start_addr, a20_mask, line_addr);
+    }
+}
+
+/* IA-32e Paging */
+static void walk_pml4e(MemoryMappingList *list,
+                       target_phys_addr_t pml4e_start_addr, int32_t a20_mask)
+{
+    target_phys_addr_t pml4e_addr, pdpe_start_addr;
+    uint64_t pml4e;
+    target_ulong line_addr;
+    int i;
+
+    for (i = 0; i < 512; i++) {
+        pml4e_addr = (pml4e_start_addr + i * 8) & a20_mask;
+        pml4e = ldq_phys(pml4e_addr);
+        if (!(pml4e & PG_PRESENT_MASK)) {
+            /* not present */
+            continue;
+        }
+
+        line_addr = ((i & 0x1ffULL) << 39) | (0xffffULL << 48);
+        pdpe_start_addr = (pml4e & ~0xfff) & a20_mask;
+        walk_pdpe(list, pdpe_start_addr, a20_mask, line_addr);
+    }
+}
+#endif
+
+int cpu_get_memory_mapping(MemoryMappingList *list, CPUArchState *env)
+{
+    if (!(env->cr[0] & CR0_PG_MASK)) {
+        /* paging is disabled */
+        return 0;
+    }
+
+    if (env->cr[4] & CR4_PAE_MASK) {
+#ifdef TARGET_X86_64
+        if (env->hflags & HF_LMA_MASK) {
+            target_phys_addr_t pml4e_addr;
+
+            pml4e_addr = (env->cr[3] & ~0xfff) & env->a20_mask;
+            walk_pml4e(list, pml4e_addr, env->a20_mask);
+        } else
+#endif
+        {
+            target_phys_addr_t pdpe_addr;
+
+            pdpe_addr = (env->cr[3] & ~0x1f) & env->a20_mask;
+            walk_pdpe2(list, pdpe_addr, env->a20_mask);
+        }
+    } else {
+        target_phys_addr_t pde_addr;
+        bool pse;
+
+        pde_addr = (env->cr[3] & ~0xfff) & env->a20_mask;
+        pse = !!(env->cr[4] & CR4_PSE_MASK);
+        walk_pde2(list, pde_addr, env->a20_mask, pse);
+    }
+
+    return 0;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 04/12 v15] Add API to check whether paging mode is enabled
  2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (2 preceding siblings ...)
  2012-05-07  4:04 ` [Qemu-devel] [PATCH 03/12 v15] implement cpu_get_memory_mapping() Wen Congyang
@ 2012-05-07  4:05 ` Wen Congyang
  2012-05-07  4:06 ` [Qemu-devel] [PATCH 05/12 v15] Add API to get memory mapping Wen Congyang
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2012-05-07  4:05 UTC (permalink / raw)
  To: qemu-devel, HATAYAMA Daisuke, Luiz Capitulino,
	Daniel P. Berrange, Anthony Liguori

This API will be used in the following patch.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 cpu-all.h                         |    6 ++++++
 target-i386/arch_memory_mapping.c |    7 ++++++-
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 2688bac..76439b4 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -527,12 +527,18 @@ int cpu_memory_rw_debug(CPUArchState *env, target_ulong addr,
 
 #if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
 int cpu_get_memory_mapping(MemoryMappingList *list, CPUArchState *env);
+bool cpu_paging_enabled(CPUArchState *env);
 #else
 static inline int cpu_get_memory_mapping(MemoryMappingList *list,
                                          CPUArchState *env)
 {
     return -1;
 }
+
+static inline bool cpu_paging_enabled(CPUArchState *env)
+{
+    return true;
+}
 #endif
 
 #endif /* CPU_ALL_H */
diff --git a/target-i386/arch_memory_mapping.c b/target-i386/arch_memory_mapping.c
index dd64bec..bd50e11 100644
--- a/target-i386/arch_memory_mapping.c
+++ b/target-i386/arch_memory_mapping.c
@@ -233,7 +233,7 @@ static void walk_pml4e(MemoryMappingList *list,
 
 int cpu_get_memory_mapping(MemoryMappingList *list, CPUArchState *env)
 {
-    if (!(env->cr[0] & CR0_PG_MASK)) {
+    if (!cpu_paging_enabled(env)) {
         /* paging is disabled */
         return 0;
     }
@@ -264,3 +264,8 @@ int cpu_get_memory_mapping(MemoryMappingList *list, CPUArchState *env)
 
     return 0;
 }
+
+bool cpu_paging_enabled(CPUArchState *env)
+{
+    return env->cr[0] & CR0_PG_MASK;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 05/12 v15] Add API to get memory mapping
  2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (3 preceding siblings ...)
  2012-05-07  4:05 ` [Qemu-devel] [PATCH 04/12 v15] Add API to check whether paging mode is enabled Wen Congyang
@ 2012-05-07  4:06 ` Wen Congyang
  2012-05-07  4:07 ` [Qemu-devel] [PATCH 06/12 v15] Add API to get memory mapping without do paging Wen Congyang
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2012-05-07  4:06 UTC (permalink / raw)
  To: qemu-devel, HATAYAMA Daisuke, Luiz Capitulino,
	Daniel P. Berrange, Anthony Liguori

Add API to get all virtual address and physical address mapping.
If the guest doesn't use paging, the virtual address is equal to the phyical
address. The virtual address and physical address mapping is for gdb's user, and
it does not include the memory that is not referenced by the page table. So if
you want to use crash to anaylze the vmcore, please do not specify -p option.
the reason why the -p option is not default explicitly: guest machine in a
catastrophic state can have corrupted memory, which we cannot trust.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 memory_mapping.c |   47 +++++++++++++++++++++++++++++++++++++++++++++++
 memory_mapping.h |   15 +++++++++++++++
 2 files changed, 62 insertions(+), 0 deletions(-)

diff --git a/memory_mapping.c b/memory_mapping.c
index 718f271..627397a 100644
--- a/memory_mapping.c
+++ b/memory_mapping.c
@@ -164,3 +164,50 @@ void memory_mapping_list_init(MemoryMappingList *list)
     list->last_mapping = NULL;
     QTAILQ_INIT(&list->head);
 }
+
+#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
+
+static CPUArchState *find_paging_enabled_cpu(CPUArchState *start_cpu)
+{
+    CPUArchState *env;
+
+    for (env = start_cpu; env != NULL; env = env->next_cpu) {
+        if (cpu_paging_enabled(env)) {
+            return env;
+        }
+    }
+
+    return NULL;
+}
+
+int qemu_get_guest_memory_mapping(MemoryMappingList *list)
+{
+    CPUArchState *env, *first_paging_enabled_cpu;
+    RAMBlock *block;
+    ram_addr_t offset, length;
+    int ret;
+
+    first_paging_enabled_cpu = find_paging_enabled_cpu(first_cpu);
+    if (first_paging_enabled_cpu) {
+        for (env = first_paging_enabled_cpu; env != NULL; env = env->next_cpu) {
+            ret = cpu_get_memory_mapping(list, env);
+            if (ret < 0) {
+                return -1;
+            }
+        }
+        return 0;
+    }
+
+    /*
+     * If the guest doesn't use paging, the virtual address is equal to physical
+     * address.
+     */
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        offset = block->offset;
+        length = block->length;
+        create_new_memory_mapping(list, offset, offset, length);
+    }
+
+    return 0;
+}
+#endif
diff --git a/memory_mapping.h b/memory_mapping.h
index e486d10..7f3c256 100644
--- a/memory_mapping.h
+++ b/memory_mapping.h
@@ -45,6 +45,21 @@ void memory_mapping_list_free(MemoryMappingList *list);
 
 void memory_mapping_list_init(MemoryMappingList *list);
 
+/*
+ * Return value:
+ *    0: success
+ *   -1: failed
+ *   -2: unsupported
+ */
+#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
+int qemu_get_guest_memory_mapping(MemoryMappingList *list);
+#else
+static inline int qemu_get_guest_memory_mapping(MemoryMappingList *list)
+{
+    return -2;
+}
+#endif
+
 #else
 
 /* We use MemoryMappingList* in cpu-all.h */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 06/12 v15] Add API to get memory mapping without do paging
  2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (4 preceding siblings ...)
  2012-05-07  4:06 ` [Qemu-devel] [PATCH 05/12 v15] Add API to get memory mapping Wen Congyang
@ 2012-05-07  4:07 ` Wen Congyang
  2012-05-07  4:07 ` [Qemu-devel] [PATCH 07/12 v15] target-i386: Add API to write elf notes to core file Wen Congyang
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2012-05-07  4:07 UTC (permalink / raw)
  To: qemu-devel, HATAYAMA Daisuke, Luiz Capitulino,
	Daniel P. Berrange, Anthony Liguori

crash does not need the virtual address and physical address mapping, and the
mapping does not include the memory that is not referenced by the page table.
crash does not use the virtual address, so we can create the mapping for all
physical memory(virtual address is always 0). This patch provides a API to do
this thing, and it will be used in the following patch.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 memory_mapping.c |    9 +++++++++
 memory_mapping.h |    3 +++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/memory_mapping.c b/memory_mapping.c
index 627397a..adb1595 100644
--- a/memory_mapping.c
+++ b/memory_mapping.c
@@ -211,3 +211,12 @@ int qemu_get_guest_memory_mapping(MemoryMappingList *list)
     return 0;
 }
 #endif
+
+void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list)
+{
+    RAMBlock *block;
+
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        create_new_memory_mapping(list, block->offset, 0, block->length);
+    }
+}
diff --git a/memory_mapping.h b/memory_mapping.h
index 7f3c256..190de12 100644
--- a/memory_mapping.h
+++ b/memory_mapping.h
@@ -60,6 +60,9 @@ static inline int qemu_get_guest_memory_mapping(MemoryMappingList *list)
 }
 #endif
 
+/* get guest's memory mapping without do paging(virtual address is 0). */
+void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list);
+
 #else
 
 /* We use MemoryMappingList* in cpu-all.h */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 07/12 v15] target-i386: Add API to write elf notes to core file
  2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (5 preceding siblings ...)
  2012-05-07  4:07 ` [Qemu-devel] [PATCH 06/12 v15] Add API to get memory mapping without do paging Wen Congyang
@ 2012-05-07  4:07 ` Wen Congyang
  2012-05-07  4:08 ` [Qemu-devel] [PATCH 08/12 v15] target-i386: Add API to write cpu status " Wen Congyang
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2012-05-07  4:07 UTC (permalink / raw)
  To: qemu-devel, HATAYAMA Daisuke, Luiz Capitulino,
	Daniel P. Berrange, Anthony Liguori

The core file contains register's value. These APIs write registers to
core file, and them will be called in the following patch.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target         |    1 +
 configure               |    4 +
 cpu-all.h               |   22 +++++
 target-i386/arch_dump.c |  233 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 260 insertions(+), 0 deletions(-)
 create mode 100644 target-i386/arch_dump.c

diff --git a/Makefile.target b/Makefile.target
index 18ffaef..cf05831 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -194,6 +194,7 @@ obj-$(CONFIG_VGA) += vga.o
 obj-y += memory.o savevm.o cputlb.o
 obj-y += memory_mapping.o
 obj-$(CONFIG_HAVE_GET_MEMORY_MAPPING) += arch_memory_mapping.o
+obj-$(CONFIG_HAVE_CORE_DUMP) += arch_dump.o
 LIBS+=-lz
 
 obj-i386-$(CONFIG_KVM) += hyperv.o
diff --git a/configure b/configure
index 971fa09..e8c2dfa 100755
--- a/configure
+++ b/configure
@@ -3724,6 +3724,10 @@ if test "$target_softmmu" = "yes" ; then
   if test "$smartcard_nss" = "yes" ; then
     echo "subdir-$target: subdir-libcacard" >> $config_host_mak
   fi
+  case "$target_arch2" in
+    i386|x86_64)
+      echo "CONFIG_HAVE_CORE_DUMP=y" >> $config_target_mak
+  esac
 fi
 if test "$target_user_only" = "yes" ; then
   echo "CONFIG_USER_ONLY=y" >> $config_target_mak
diff --git a/cpu-all.h b/cpu-all.h
index 76439b4..a71e887 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -541,4 +541,26 @@ static inline bool cpu_paging_enabled(CPUArchState *env)
 }
 #endif
 
+typedef int (*write_core_dump_function)(void *buf, size_t size, void *opaque);
+#if defined(CONFIG_HAVE_CORE_DUMP)
+int cpu_write_elf64_note(write_core_dump_function f, CPUArchState *env,
+                         int cpuid, void *opaque);
+int cpu_write_elf32_note(write_core_dump_function f, CPUArchState *env,
+                         int cpuid, void *opaque);
+#else
+static inline int cpu_write_elf64_note(write_core_dump_function f,
+                                       CPUArchState *env, int cpuid,
+                                       void *opaque)
+{
+    return -1;
+}
+
+static inline int cpu_write_elf32_note(write_core_dump_function f,
+                                       CPUArchState *env, int cpuid,
+                                       void *opaque)
+{
+    return -1;
+}
+#endif
+
 #endif /* CPU_ALL_H */
diff --git a/target-i386/arch_dump.c b/target-i386/arch_dump.c
new file mode 100644
index 0000000..d55c2ce
--- /dev/null
+++ b/target-i386/arch_dump.c
@@ -0,0 +1,233 @@
+/*
+ * i386 memory mapping
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "cpu.h"
+#include "cpu-all.h"
+#include "elf.h"
+
+#ifdef TARGET_X86_64
+typedef struct {
+    target_ulong r15, r14, r13, r12, rbp, rbx, r11, r10;
+    target_ulong r9, r8, rax, rcx, rdx, rsi, rdi, orig_rax;
+    target_ulong rip, cs, eflags;
+    target_ulong rsp, ss;
+    target_ulong fs_base, gs_base;
+    target_ulong ds, es, fs, gs;
+} x86_64_user_regs_struct;
+
+typedef struct {
+    char pad1[32];
+    uint32_t pid;
+    char pad2[76];
+    x86_64_user_regs_struct regs;
+    char pad3[8];
+} x86_64_elf_prstatus;
+
+static int x86_64_write_elf64_note(write_core_dump_function f,
+                                   CPUArchState *env, int id,
+                                   void *opaque)
+{
+    x86_64_user_regs_struct regs;
+    Elf64_Nhdr *note;
+    char *buf;
+    int descsz, note_size, name_size = 5;
+    const char *name = "CORE";
+    int ret;
+
+    regs.r15 = env->regs[15];
+    regs.r14 = env->regs[14];
+    regs.r13 = env->regs[13];
+    regs.r12 = env->regs[12];
+    regs.r11 = env->regs[11];
+    regs.r10 = env->regs[10];
+    regs.r9  = env->regs[9];
+    regs.r8  = env->regs[8];
+    regs.rbp = env->regs[R_EBP];
+    regs.rsp = env->regs[R_ESP];
+    regs.rdi = env->regs[R_EDI];
+    regs.rsi = env->regs[R_ESI];
+    regs.rdx = env->regs[R_EDX];
+    regs.rcx = env->regs[R_ECX];
+    regs.rbx = env->regs[R_EBX];
+    regs.rax = env->regs[R_EAX];
+    regs.rip = env->eip;
+    regs.eflags = env->eflags;
+
+    regs.orig_rax = 0; /* FIXME */
+    regs.cs = env->segs[R_CS].selector;
+    regs.ss = env->segs[R_SS].selector;
+    regs.fs_base = env->segs[R_FS].base;
+    regs.gs_base = env->segs[R_GS].base;
+    regs.ds = env->segs[R_DS].selector;
+    regs.es = env->segs[R_ES].selector;
+    regs.fs = env->segs[R_FS].selector;
+    regs.gs = env->segs[R_GS].selector;
+
+    descsz = sizeof(x86_64_elf_prstatus);
+    note_size = ((sizeof(Elf64_Nhdr) + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    note->n_namesz = cpu_to_le32(name_size);
+    note->n_descsz = cpu_to_le32(descsz);
+    note->n_type = cpu_to_le32(NT_PRSTATUS);
+    buf = (char *)note;
+    buf += ((sizeof(Elf64_Nhdr) + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf + 32, &id, 4); /* pr_pid */
+    buf += descsz - sizeof(x86_64_user_regs_struct)-sizeof(target_ulong);
+    memcpy(buf, &regs, sizeof(x86_64_user_regs_struct));
+
+    ret = f(note, note_size, opaque);
+    g_free(note);
+    if (ret < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+#endif
+
+typedef struct {
+    uint32_t ebx, ecx, edx, esi, edi, ebp, eax;
+    unsigned short ds, __ds, es, __es;
+    unsigned short fs, __fs, gs, __gs;
+    uint32_t orig_eax, eip;
+    unsigned short cs, __cs;
+    uint32_t eflags, esp;
+    unsigned short ss, __ss;
+} x86_user_regs_struct;
+
+typedef struct {
+    char pad1[24];
+    uint32_t pid;
+    char pad2[44];
+    x86_user_regs_struct regs;
+    char pad3[4];
+} x86_elf_prstatus;
+
+static void x86_fill_elf_prstatus(x86_elf_prstatus *prstatus, CPUArchState *env,
+                                  int id)
+{
+    memset(prstatus, 0, sizeof(x86_elf_prstatus));
+    prstatus->regs.ebp = env->regs[R_EBP] & 0xffffffff;
+    prstatus->regs.esp = env->regs[R_ESP] & 0xffffffff;
+    prstatus->regs.edi = env->regs[R_EDI] & 0xffffffff;
+    prstatus->regs.esi = env->regs[R_ESI] & 0xffffffff;
+    prstatus->regs.edx = env->regs[R_EDX] & 0xffffffff;
+    prstatus->regs.ecx = env->regs[R_ECX] & 0xffffffff;
+    prstatus->regs.ebx = env->regs[R_EBX] & 0xffffffff;
+    prstatus->regs.eax = env->regs[R_EAX] & 0xffffffff;
+    prstatus->regs.eip = env->eip & 0xffffffff;
+    prstatus->regs.eflags = env->eflags & 0xffffffff;
+
+    prstatus->regs.cs = env->segs[R_CS].selector;
+    prstatus->regs.ss = env->segs[R_SS].selector;
+    prstatus->regs.ds = env->segs[R_DS].selector;
+    prstatus->regs.es = env->segs[R_ES].selector;
+    prstatus->regs.fs = env->segs[R_FS].selector;
+    prstatus->regs.gs = env->segs[R_GS].selector;
+
+    prstatus->pid = id;
+}
+
+static int x86_write_elf64_note(write_core_dump_function f, CPUArchState *env,
+                                int id, void *opaque)
+{
+    x86_elf_prstatus prstatus;
+    Elf64_Nhdr *note;
+    char *buf;
+    int descsz, note_size, name_size = 5;
+    const char *name = "CORE";
+    int ret;
+
+    x86_fill_elf_prstatus(&prstatus, env, id);
+    descsz = sizeof(x86_elf_prstatus);
+    note_size = ((sizeof(Elf64_Nhdr) + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    note->n_namesz = cpu_to_le32(name_size);
+    note->n_descsz = cpu_to_le32(descsz);
+    note->n_type = cpu_to_le32(NT_PRSTATUS);
+    buf = (char *)note;
+    buf += ((sizeof(Elf64_Nhdr) + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf, &prstatus, sizeof(prstatus));
+
+    ret = f(note, note_size, opaque);
+    g_free(note);
+    if (ret < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
+int cpu_write_elf64_note(write_core_dump_function f, CPUArchState *env,
+                         int cpuid, void *opaque)
+{
+    int ret;
+#ifdef TARGET_X86_64
+    bool lma = !!(first_cpu->hflags & HF_LMA_MASK);
+
+    if (lma) {
+        ret = x86_64_write_elf64_note(f, env, cpuid, opaque);
+    } else {
+#endif
+        ret = x86_write_elf64_note(f, env, cpuid, opaque);
+#ifdef TARGET_X86_64
+    }
+#endif
+
+    return ret;
+}
+
+int cpu_write_elf32_note(write_core_dump_function f, CPUArchState *env,
+                         int cpuid, void *opaque)
+{
+    x86_elf_prstatus prstatus;
+    Elf32_Nhdr *note;
+    char *buf;
+    int descsz, note_size, name_size = 5;
+    const char *name = "CORE";
+    int ret;
+
+    x86_fill_elf_prstatus(&prstatus, env, cpuid);
+    descsz = sizeof(x86_elf_prstatus);
+    note_size = ((sizeof(Elf32_Nhdr) + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    note->n_namesz = cpu_to_le32(name_size);
+    note->n_descsz = cpu_to_le32(descsz);
+    note->n_type = cpu_to_le32(NT_PRSTATUS);
+    buf = (char *)note;
+    buf += ((sizeof(Elf32_Nhdr) + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf, &prstatus, sizeof(prstatus));
+
+    ret = f(note, note_size, opaque);
+    g_free(note);
+    if (ret < 0) {
+        return -1;
+    }
+
+    return 0;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 08/12 v15] target-i386: Add API to write cpu status to core file
  2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (6 preceding siblings ...)
  2012-05-07  4:07 ` [Qemu-devel] [PATCH 07/12 v15] target-i386: Add API to write elf notes to core file Wen Congyang
@ 2012-05-07  4:08 ` Wen Congyang
  2012-05-07  4:08 ` [Qemu-devel] [PATCH 09/12 v15] target-i386: add API to get dump info Wen Congyang
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2012-05-07  4:08 UTC (permalink / raw)
  To: qemu-devel, HATAYAMA Daisuke, Luiz Capitulino,
	Daniel P. Berrange, Anthony Liguori

The core file has register's value. But it does not include all registers value.
Store the cpu status into QEMU note, and the user can get more information
from vmcore. If you change QEMUCPUState, please count up QEMUCPUSTATE_VERSION.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 cpu-all.h               |   18 ++++++
 target-i386/arch_dump.c |  149 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 167 insertions(+), 0 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index a71e887..a167bac 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -547,6 +547,10 @@ int cpu_write_elf64_note(write_core_dump_function f, CPUArchState *env,
                          int cpuid, void *opaque);
 int cpu_write_elf32_note(write_core_dump_function f, CPUArchState *env,
                          int cpuid, void *opaque);
+int cpu_write_elf64_qemunote(write_core_dump_function f, CPUArchState *env,
+                             void *opaque);
+int cpu_write_elf32_qemunote(write_core_dump_function f, CPUArchState *env,
+                             void *opaque);
 #else
 static inline int cpu_write_elf64_note(write_core_dump_function f,
                                        CPUArchState *env, int cpuid,
@@ -561,6 +565,20 @@ static inline int cpu_write_elf32_note(write_core_dump_function f,
 {
     return -1;
 }
+
+static inline int cpu_write_elf64_qemunote(write_core_dump_function f,
+                                           CPUArchState *env,
+                                           void *opaque)
+{
+    return -1;
+}
+
+static inline int cpu_write_elf32_qemunote(write_core_dump_function f,
+                                           CPUArchState *env,
+                                           void *opaque)
+{
+    return -1;
+}
 #endif
 
 #endif /* CPU_ALL_H */
diff --git a/target-i386/arch_dump.c b/target-i386/arch_dump.c
index d55c2ce..ddbe20c 100644
--- a/target-i386/arch_dump.c
+++ b/target-i386/arch_dump.c
@@ -231,3 +231,152 @@ int cpu_write_elf32_note(write_core_dump_function f, CPUArchState *env,
 
     return 0;
 }
+
+/*
+ * please count up QEMUCPUSTATE_VERSION if you have changed definition of
+ * QEMUCPUState, and modify the tools using this information accordingly.
+ */
+#define QEMUCPUSTATE_VERSION (1)
+
+struct QEMUCPUSegment {
+    uint32_t selector;
+    uint32_t limit;
+    uint32_t flags;
+    uint32_t pad;
+    uint64_t base;
+};
+
+typedef struct QEMUCPUSegment QEMUCPUSegment;
+
+struct QEMUCPUState {
+    uint32_t version;
+    uint32_t size;
+    uint64_t rax, rbx, rcx, rdx, rsi, rdi, rsp, rbp;
+    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
+    uint64_t rip, rflags;
+    QEMUCPUSegment cs, ds, es, fs, gs, ss;
+    QEMUCPUSegment ldt, tr, gdt, idt;
+    uint64_t cr[5];
+};
+
+typedef struct QEMUCPUState QEMUCPUState;
+
+static void copy_segment(QEMUCPUSegment *d, SegmentCache *s)
+{
+    d->pad = 0;
+    d->selector = s->selector;
+    d->limit = s->limit;
+    d->flags = s->flags;
+    d->base = s->base;
+}
+
+static void qemu_get_cpustate(QEMUCPUState *s, CPUArchState *env)
+{
+    memset(s, 0, sizeof(QEMUCPUState));
+
+    s->version = QEMUCPUSTATE_VERSION;
+    s->size = sizeof(QEMUCPUState);
+
+    s->rax = env->regs[R_EAX];
+    s->rbx = env->regs[R_EBX];
+    s->rcx = env->regs[R_ECX];
+    s->rdx = env->regs[R_EDX];
+    s->rsi = env->regs[R_ESI];
+    s->rdi = env->regs[R_EDI];
+    s->rsp = env->regs[R_ESP];
+    s->rbp = env->regs[R_EBP];
+#ifdef TARGET_X86_64
+    s->r8  = env->regs[8];
+    s->r9  = env->regs[9];
+    s->r10 = env->regs[10];
+    s->r11 = env->regs[11];
+    s->r12 = env->regs[12];
+    s->r13 = env->regs[13];
+    s->r14 = env->regs[14];
+    s->r15 = env->regs[15];
+#endif
+    s->rip = env->eip;
+    s->rflags = env->eflags;
+
+    copy_segment(&s->cs, &env->segs[R_CS]);
+    copy_segment(&s->ds, &env->segs[R_DS]);
+    copy_segment(&s->es, &env->segs[R_ES]);
+    copy_segment(&s->fs, &env->segs[R_FS]);
+    copy_segment(&s->gs, &env->segs[R_GS]);
+    copy_segment(&s->ss, &env->segs[R_SS]);
+    copy_segment(&s->ldt, &env->ldt);
+    copy_segment(&s->tr, &env->tr);
+    copy_segment(&s->gdt, &env->gdt);
+    copy_segment(&s->idt, &env->idt);
+
+    s->cr[0] = env->cr[0];
+    s->cr[1] = env->cr[1];
+    s->cr[2] = env->cr[2];
+    s->cr[3] = env->cr[3];
+    s->cr[4] = env->cr[4];
+}
+
+static inline int cpu_write_qemu_note(write_core_dump_function f,
+                                      CPUArchState *env,
+                                      void *opaque,
+                                      int type)
+{
+    QEMUCPUState state;
+    Elf64_Nhdr *note64;
+    Elf32_Nhdr *note32;
+    void *note;
+    char *buf;
+    int descsz, note_size, name_size = 5, note_head_size;
+    const char *name = "QEMU";
+    int ret;
+
+    qemu_get_cpustate(&state, env);
+
+    descsz = sizeof(state);
+    if (type == 0) {
+        note_head_size = sizeof(Elf32_Nhdr);
+    } else {
+        note_head_size = sizeof(Elf64_Nhdr);
+    }
+    note_size = ((note_head_size + 3) / 4 + (name_size + 3) / 4 +
+                (descsz + 3) / 4) * 4;
+    note = g_malloc(note_size);
+
+    memset(note, 0, note_size);
+    if (type == 0) {
+        note32 = note;
+        note32->n_namesz = cpu_to_le32(name_size);
+        note32->n_descsz = cpu_to_le32(descsz);
+        note32->n_type = 0;
+    } else {
+        note64 = note;
+        note64->n_namesz = cpu_to_le32(name_size);
+        note64->n_descsz = cpu_to_le32(descsz);
+        note64->n_type = 0;
+    }
+    buf = note;
+    buf += ((note_head_size + 3) / 4) * 4;
+    memcpy(buf, name, name_size);
+    buf += ((name_size + 3) / 4) * 4;
+    memcpy(buf, &state, sizeof(state));
+
+    ret = f(note, note_size, opaque);
+    g_free(note);
+    if (ret < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
+int cpu_write_elf64_qemunote(write_core_dump_function f, CPUArchState *env,
+                             void *opaque)
+{
+    return cpu_write_qemu_note(f, env, opaque, 1);
+}
+
+int cpu_write_elf32_qemunote(write_core_dump_function f, CPUArchState *env,
+                             void *opaque)
+{
+    return cpu_write_qemu_note(f, env, opaque, 0);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 09/12 v15] target-i386: add API to get dump info
  2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (7 preceding siblings ...)
  2012-05-07  4:08 ` [Qemu-devel] [PATCH 08/12 v15] target-i386: Add API to write cpu status " Wen Congyang
@ 2012-05-07  4:08 ` Wen Congyang
  2012-05-07  4:09 ` [Qemu-devel] [PATCH 10/12 v15] target-i386: Add API to get note's size Wen Congyang
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2012-05-07  4:08 UTC (permalink / raw)
  To: qemu-devel, HATAYAMA Daisuke, Luiz Capitulino,
	Daniel P. Berrange, Anthony Liguori

Dump info contains: endian, class and architecture. The next
patch will use these information to create vmcore. Note: on
x86 box, the  class is ELFCLASS64 if the memory is larger than 4G.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 cpu-all.h               |    7 +++++++
 dump.h                  |   23 +++++++++++++++++++++++
 target-i386/arch_dump.c |   34 ++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 0 deletions(-)
 create mode 100644 dump.h

diff --git a/cpu-all.h b/cpu-all.h
index a167bac..7e07c91 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -23,6 +23,7 @@
 #include "qemu-tls.h"
 #include "cpu-common.h"
 #include "memory_mapping.h"
+#include "dump.h"
 
 /* some important defines:
  *
@@ -551,6 +552,7 @@ int cpu_write_elf64_qemunote(write_core_dump_function f, CPUArchState *env,
                              void *opaque);
 int cpu_write_elf32_qemunote(write_core_dump_function f, CPUArchState *env,
                              void *opaque);
+int cpu_get_dump_info(ArchDumpInfo *info);
 #else
 static inline int cpu_write_elf64_note(write_core_dump_function f,
                                        CPUArchState *env, int cpuid,
@@ -579,6 +581,11 @@ static inline int cpu_write_elf32_qemunote(write_core_dump_function f,
 {
     return -1;
 }
+
+static inline int cpu_get_dump_info(ArchDumpInfo *info)
+{
+    return -1;
+}
 #endif
 
 #endif /* CPU_ALL_H */
diff --git a/dump.h b/dump.h
new file mode 100644
index 0000000..28340cf
--- /dev/null
+++ b/dump.h
@@ -0,0 +1,23 @@
+/*
+ * QEMU dump
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef DUMP_H
+#define DUMP_H
+
+typedef struct ArchDumpInfo {
+    int d_machine;  /* Architecture */
+    int d_endian;   /* ELFDATA2LSB or ELFDATA2MSB */
+    int d_class;    /* ELFCLASS32 or ELFCLASS64 */
+} ArchDumpInfo;
+
+#endif
diff --git a/target-i386/arch_dump.c b/target-i386/arch_dump.c
index ddbe20c..e378579 100644
--- a/target-i386/arch_dump.c
+++ b/target-i386/arch_dump.c
@@ -13,6 +13,7 @@
 
 #include "cpu.h"
 #include "cpu-all.h"
+#include "dump.h"
 #include "elf.h"
 
 #ifdef TARGET_X86_64
@@ -380,3 +381,36 @@ int cpu_write_elf32_qemunote(write_core_dump_function f, CPUArchState *env,
 {
     return cpu_write_qemu_note(f, env, opaque, 0);
 }
+
+int cpu_get_dump_info(ArchDumpInfo *info)
+{
+    bool lma = false;
+    RAMBlock *block;
+
+#ifdef TARGET_X86_64
+    lma = !!(first_cpu->hflags & HF_LMA_MASK);
+#endif
+
+    if (lma) {
+        info->d_machine = EM_X86_64;
+    } else {
+        info->d_machine = EM_386;
+    }
+    info->d_endian = ELFDATA2LSB;
+
+    if (lma) {
+        info->d_class = ELFCLASS64;
+    } else {
+        info->d_class = ELFCLASS32;
+
+        QLIST_FOREACH(block, &ram_list.blocks, next) {
+            if (block->offset + block->length > UINT_MAX) {
+                /* The memory size is greater than 4G */
+                info->d_class = ELFCLASS64;
+                break;
+            }
+        }
+    }
+
+    return 0;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 10/12 v15] target-i386: Add API to get note's size
  2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (8 preceding siblings ...)
  2012-05-07  4:08 ` [Qemu-devel] [PATCH 09/12 v15] target-i386: add API to get dump info Wen Congyang
@ 2012-05-07  4:09 ` Wen Congyang
  2012-05-07  4:10 ` [Qemu-devel] [PATCH 11/12 v15] make gdb_id() generally avialable and rename it to cpu_index() Wen Congyang
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2012-05-07  4:09 UTC (permalink / raw)
  To: qemu-devel, HATAYAMA Daisuke, Luiz Capitulino,
	Daniel P. Berrange, Anthony Liguori

We should know where the note and memory is stored before writing
them to vmcore. If we know this, we can avoid using lseek() when
creating vmcore.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 cpu-all.h               |    6 ++++++
 target-i386/arch_dump.c |   33 +++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 7e07c91..2b77f66 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -553,6 +553,7 @@ int cpu_write_elf64_qemunote(write_core_dump_function f, CPUArchState *env,
 int cpu_write_elf32_qemunote(write_core_dump_function f, CPUArchState *env,
                              void *opaque);
 int cpu_get_dump_info(ArchDumpInfo *info);
+size_t cpu_get_note_size(int class, int machine, int nr_cpus);
 #else
 static inline int cpu_write_elf64_note(write_core_dump_function f,
                                        CPUArchState *env, int cpuid,
@@ -586,6 +587,11 @@ static inline int cpu_get_dump_info(ArchDumpInfo *info)
 {
     return -1;
 }
+
+static inline int cpu_get_note_size(int class, int machine, int nr_cpus)
+{
+    return -1;
+}
 #endif
 
 #endif /* CPU_ALL_H */
diff --git a/target-i386/arch_dump.c b/target-i386/arch_dump.c
index e378579..135d855 100644
--- a/target-i386/arch_dump.c
+++ b/target-i386/arch_dump.c
@@ -414,3 +414,36 @@ int cpu_get_dump_info(ArchDumpInfo *info)
 
     return 0;
 }
+
+size_t cpu_get_note_size(int class, int machine, int nr_cpus)
+{
+    int name_size = 5; /* "CORE" or "QEMU" */
+    size_t elf_note_size = 0;
+    size_t qemu_note_size = 0;
+    int elf_desc_size = 0;
+    int qemu_desc_size = 0;
+    int note_head_size;
+
+    if (class == ELFCLASS32) {
+        note_head_size = sizeof(Elf32_Nhdr);
+    } else {
+        note_head_size = sizeof(Elf64_Nhdr);
+    }
+
+    if (machine == EM_386) {
+        elf_desc_size = sizeof(x86_elf_prstatus);
+    }
+#ifdef TARGET_X86_64
+    else {
+        elf_desc_size = sizeof(x86_64_elf_prstatus);
+    }
+#endif
+    qemu_desc_size = sizeof(QEMUCPUState);
+
+    elf_note_size = ((note_head_size + 3) / 4 + (name_size + 3) / 4 +
+                     (elf_desc_size + 3) / 4) * 4;
+    qemu_note_size = ((note_head_size + 3) / 4 + (name_size + 3) / 4 +
+                      (qemu_desc_size + 3) / 4) * 4;
+
+    return (elf_note_size + qemu_note_size) * nr_cpus;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 11/12 v15] make gdb_id() generally avialable and rename it to cpu_index()
  2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (9 preceding siblings ...)
  2012-05-07  4:09 ` [Qemu-devel] [PATCH 10/12 v15] target-i386: Add API to get note's size Wen Congyang
@ 2012-05-07  4:10 ` Wen Congyang
  2012-05-07  4:10 ` [Qemu-devel] [PATCH 12/12 v15] introduce a new monitor command 'dump-guest-memory' to dump guest's memory Wen Congyang
  2012-05-16 17:59 ` [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Luiz Capitulino
  12 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2012-05-07  4:10 UTC (permalink / raw)
  To: qemu-devel, HATAYAMA Daisuke, Luiz Capitulino,
	Daniel P. Berrange, Anthony Liguori

The following patch also needs this API, so make it generally avialable.
The function gdb_id() will not be used in gdbstub.c now, so its name is
not suitable, and rename it to cpu_index()

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 gdbstub.c |   19 +++++--------------
 gdbstub.h |    9 +++++++++
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index 6a77a66..08cf864 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -1937,21 +1937,12 @@ static void gdb_set_cpu_pc(GDBState *s, target_ulong pc)
 #endif
 }
 
-static inline int gdb_id(CPUArchState *env)
-{
-#if defined(CONFIG_USER_ONLY) && defined(CONFIG_USE_NPTL)
-    return env->host_tid;
-#else
-    return env->cpu_index + 1;
-#endif
-}
-
 static CPUArchState *find_cpu(uint32_t thread_id)
 {
     CPUArchState *env;
 
     for (env = first_cpu; env != NULL; env = env->next_cpu) {
-        if (gdb_id(env) == thread_id) {
+        if (cpu_index(env) == thread_id) {
             return env;
         }
     }
@@ -1979,7 +1970,7 @@ static int gdb_handle_packet(GDBState *s, const char *line_buf)
     case '?':
         /* TODO: Make this return the correct value for user-mode.  */
         snprintf(buf, sizeof(buf), "T%02xthread:%02x;", GDB_SIGNAL_TRAP,
-                 gdb_id(s->c_cpu));
+                 cpu_index(s->c_cpu));
         put_packet(s, buf);
         /* Remove all the breakpoints when this query is issued,
          * because gdb is doing and initial connect and the state
@@ -2274,7 +2265,7 @@ static int gdb_handle_packet(GDBState *s, const char *line_buf)
         } else if (strcmp(p,"sThreadInfo") == 0) {
         report_cpuinfo:
             if (s->query_cpu) {
-                snprintf(buf, sizeof(buf), "m%x", gdb_id(s->query_cpu));
+                snprintf(buf, sizeof(buf), "m%x", cpu_index(s->query_cpu));
                 put_packet(s, buf);
                 s->query_cpu = s->query_cpu->next_cpu;
             } else
@@ -2422,7 +2413,7 @@ static void gdb_vm_state_change(void *opaque, int running, RunState state)
             }
             snprintf(buf, sizeof(buf),
                      "T%02xthread:%02x;%swatch:" TARGET_FMT_lx ";",
-                     GDB_SIGNAL_TRAP, gdb_id(env), type,
+                     GDB_SIGNAL_TRAP, cpu_index(env), type,
                      env->watchpoint_hit->vaddr);
             env->watchpoint_hit = NULL;
             goto send_packet;
@@ -2455,7 +2446,7 @@ static void gdb_vm_state_change(void *opaque, int running, RunState state)
         ret = GDB_SIGNAL_UNKNOWN;
         break;
     }
-    snprintf(buf, sizeof(buf), "T%02xthread:%02x;", ret, gdb_id(env));
+    snprintf(buf, sizeof(buf), "T%02xthread:%02x;", ret, cpu_index(env));
 
 send_packet:
     put_packet(s, buf);
diff --git a/gdbstub.h b/gdbstub.h
index b44e275..668de66 100644
--- a/gdbstub.h
+++ b/gdbstub.h
@@ -30,6 +30,15 @@ void gdb_register_coprocessor(CPUArchState *env,
                               gdb_reg_cb get_reg, gdb_reg_cb set_reg,
                               int num_regs, const char *xml, int g_pos);
 
+static inline int cpu_index(CPUArchState *env)
+{
+#if defined(CONFIG_USER_ONLY) && defined(CONFIG_USE_NPTL)
+    return env->host_tid;
+#else
+    return env->cpu_index + 1;
+#endif
+}
+
 #endif
 
 #ifdef CONFIG_USER_ONLY
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 12/12 v15] introduce a new monitor command 'dump-guest-memory' to dump guest's memory
  2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (10 preceding siblings ...)
  2012-05-07  4:10 ` [Qemu-devel] [PATCH 11/12 v15] make gdb_id() generally avialable and rename it to cpu_index() Wen Congyang
@ 2012-05-07  4:10 ` Wen Congyang
  2012-05-16 17:59 ` [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Luiz Capitulino
  12 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2012-05-07  4:10 UTC (permalink / raw)
  To: qemu-devel, HATAYAMA Daisuke, Luiz Capitulino,
	Daniel P. Berrange, Anthony Liguori

The command's usage:
   dump-guest-memory [-p] protocol [begin] [length]
The supported protocol can be file or fd:
1. file: the protocol starts with "file:", and the following string is
   the file's path.
2. fd: the protocol starts with "fd:", and the following string is the
   fd's name.

Note:
  1. If you want to use gdb to process the core, please specify -p option.
     The reason why the -p option is not default is:
       a. guest machine in a catastrophic state can have corrupted memory,
          which we cannot trust.
       b. The guest machine can be in read-mode even if paging is enabled.
          For example: the guest machine uses ACPI to sleep, and ACPI sleep
          state goes in real-mode.
  2. If you don't want to dump all guest's memory, please specify the start
     physical address and the length.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 Makefile.target  |    2 +
 dump.c           |  883 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 elf.h            |    5 +
 hmp-commands.hx  |   28 ++
 hmp.c            |   22 ++
 hmp.h            |    1 +
 memory_mapping.c |   27 ++
 memory_mapping.h |    3 +
 qapi-schema.json |   43 +++
 qmp-commands.hx  |   36 +++
 10 files changed, 1050 insertions(+), 0 deletions(-)
 create mode 100644 dump.c

diff --git a/Makefile.target b/Makefile.target
index cf05831..c94abfd 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -405,6 +405,8 @@ obj-y += $(addprefix ../, $(trace-obj-y))
 
 endif # CONFIG_SOFTMMU
 
+obj-y += dump.o
+
 ifndef CONFIG_LINUX_USER
 ifndef CONFIG_BSD_USER
 # libcacard needs qemu-thread support, and besides is only needed by devices
diff --git a/dump.c b/dump.c
new file mode 100644
index 0000000..0ca14f8
--- /dev/null
+++ b/dump.c
@@ -0,0 +1,883 @@
+/*
+ * QEMU dump
+ *
+ * Copyright Fujitsu, Corp. 2011, 2012
+ *
+ * Authors:
+ *     Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include <unistd.h>
+#include "elf.h"
+#include <sys/procfs.h>
+#include <glib.h>
+#include "cpu.h"
+#include "cpu-all.h"
+#include "targphys.h"
+#include "monitor.h"
+#include "kvm.h"
+#include "dump.h"
+#include "sysemu.h"
+#include "bswap.h"
+#include "memory_mapping.h"
+#include "error.h"
+#include "qmp-commands.h"
+#include "gdbstub.h"
+
+#if defined(CONFIG_HAVE_CORE_DUMP)
+static uint16_t cpu_convert_to_target16(uint16_t val, int endian)
+{
+    if (endian == ELFDATA2LSB) {
+        val = cpu_to_le16(val);
+    } else {
+        val = cpu_to_be16(val);
+    }
+
+    return val;
+}
+
+static uint32_t cpu_convert_to_target32(uint32_t val, int endian)
+{
+    if (endian == ELFDATA2LSB) {
+        val = cpu_to_le32(val);
+    } else {
+        val = cpu_to_be32(val);
+    }
+
+    return val;
+}
+
+static uint64_t cpu_convert_to_target64(uint64_t val, int endian)
+{
+    if (endian == ELFDATA2LSB) {
+        val = cpu_to_le64(val);
+    } else {
+        val = cpu_to_be64(val);
+    }
+
+    return val;
+}
+
+typedef struct DumpState {
+    ArchDumpInfo dump_info;
+    MemoryMappingList list;
+    uint16_t phdr_num;
+    uint32_t sh_info;
+    bool have_section;
+    bool resume;
+    size_t note_size;
+    target_phys_addr_t memory_offset;
+    int fd;
+
+    RAMBlock *block;
+    ram_addr_t start;
+    bool has_filter;
+    int64_t begin;
+    int64_t length;
+    Error **errp;
+} DumpState;
+
+static int dump_cleanup(DumpState *s)
+{
+    int ret = 0;
+
+    memory_mapping_list_free(&s->list);
+    if (s->fd != -1) {
+        close(s->fd);
+    }
+    if (s->resume) {
+        vm_start();
+    }
+
+    return ret;
+}
+
+static void dump_error(DumpState *s, const char *reason)
+{
+    dump_cleanup(s);
+}
+
+static int fd_write_vmcore(void *buf, size_t size, void *opaque)
+{
+    DumpState *s = opaque;
+    int fd = s->fd;
+    size_t writen_size;
+
+    /* The fd may be passed from user, and it can be non-blocked */
+    while (size) {
+        writen_size = qemu_write_full(fd, buf, size);
+        if (writen_size != size && errno != EAGAIN) {
+            return -1;
+        }
+
+        buf += writen_size;
+        size -= writen_size;
+    }
+
+    return 0;
+}
+
+static int write_elf64_header(DumpState *s)
+{
+    Elf64_Ehdr elf_header;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&elf_header, 0, sizeof(Elf64_Ehdr));
+    memcpy(&elf_header, ELFMAG, SELFMAG);
+    elf_header.e_ident[EI_CLASS] = ELFCLASS64;
+    elf_header.e_ident[EI_DATA] = s->dump_info.d_endian;
+    elf_header.e_ident[EI_VERSION] = EV_CURRENT;
+    elf_header.e_type = cpu_convert_to_target16(ET_CORE, endian);
+    elf_header.e_machine = cpu_convert_to_target16(s->dump_info.d_machine,
+                                                   endian);
+    elf_header.e_version = cpu_convert_to_target32(EV_CURRENT, endian);
+    elf_header.e_ehsize = cpu_convert_to_target16(sizeof(elf_header), endian);
+    elf_header.e_phoff = cpu_convert_to_target64(sizeof(Elf64_Ehdr), endian);
+    elf_header.e_phentsize = cpu_convert_to_target16(sizeof(Elf64_Phdr),
+                                                     endian);
+    elf_header.e_phnum = cpu_convert_to_target16(s->phdr_num, endian);
+    if (s->have_section) {
+        uint64_t shoff = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr) * s->sh_info;
+
+        elf_header.e_shoff = cpu_convert_to_target64(shoff, endian);
+        elf_header.e_shentsize = cpu_convert_to_target16(sizeof(Elf64_Shdr),
+                                                         endian);
+        elf_header.e_shnum = cpu_convert_to_target16(1, endian);
+    }
+
+    ret = fd_write_vmcore(&elf_header, sizeof(elf_header), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write elf header.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_header(DumpState *s)
+{
+    Elf32_Ehdr elf_header;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&elf_header, 0, sizeof(Elf32_Ehdr));
+    memcpy(&elf_header, ELFMAG, SELFMAG);
+    elf_header.e_ident[EI_CLASS] = ELFCLASS32;
+    elf_header.e_ident[EI_DATA] = endian;
+    elf_header.e_ident[EI_VERSION] = EV_CURRENT;
+    elf_header.e_type = cpu_convert_to_target16(ET_CORE, endian);
+    elf_header.e_machine = cpu_convert_to_target16(s->dump_info.d_machine,
+                                                   endian);
+    elf_header.e_version = cpu_convert_to_target32(EV_CURRENT, endian);
+    elf_header.e_ehsize = cpu_convert_to_target16(sizeof(elf_header), endian);
+    elf_header.e_phoff = cpu_convert_to_target32(sizeof(Elf32_Ehdr), endian);
+    elf_header.e_phentsize = cpu_convert_to_target16(sizeof(Elf32_Phdr),
+                                                     endian);
+    elf_header.e_phnum = cpu_convert_to_target16(s->phdr_num, endian);
+    if (s->have_section) {
+        uint32_t shoff = sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr) * s->sh_info;
+
+        elf_header.e_shoff = cpu_convert_to_target32(shoff, endian);
+        elf_header.e_shentsize = cpu_convert_to_target16(sizeof(Elf32_Shdr),
+                                                         endian);
+        elf_header.e_shnum = cpu_convert_to_target16(1, endian);
+    }
+
+    ret = fd_write_vmcore(&elf_header, sizeof(elf_header), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write elf header.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf64_load(DumpState *s, MemoryMapping *memory_mapping,
+                            int phdr_index, target_phys_addr_t offset)
+{
+    Elf64_Phdr phdr;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&phdr, 0, sizeof(Elf64_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_LOAD, endian);
+    phdr.p_offset = cpu_convert_to_target64(offset, endian);
+    phdr.p_paddr = cpu_convert_to_target64(memory_mapping->phys_addr, endian);
+    if (offset == -1) {
+        /* When the memory is not stored into vmcore, offset will be -1 */
+        phdr.p_filesz = 0;
+    } else {
+        phdr.p_filesz = cpu_convert_to_target64(memory_mapping->length, endian);
+    }
+    phdr.p_memsz = cpu_convert_to_target64(memory_mapping->length, endian);
+    phdr.p_vaddr = cpu_convert_to_target64(memory_mapping->virt_addr, endian);
+
+    ret = fd_write_vmcore(&phdr, sizeof(Elf64_Phdr), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_load(DumpState *s, MemoryMapping *memory_mapping,
+                            int phdr_index, target_phys_addr_t offset)
+{
+    Elf32_Phdr phdr;
+    int ret;
+    int endian = s->dump_info.d_endian;
+
+    memset(&phdr, 0, sizeof(Elf32_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_LOAD, endian);
+    phdr.p_offset = cpu_convert_to_target32(offset, endian);
+    phdr.p_paddr = cpu_convert_to_target32(memory_mapping->phys_addr, endian);
+    if (offset == -1) {
+        /* When the memory is not stored into vmcore, offset will be -1 */
+        phdr.p_filesz = 0;
+    } else {
+        phdr.p_filesz = cpu_convert_to_target32(memory_mapping->length, endian);
+    }
+    phdr.p_memsz = cpu_convert_to_target32(memory_mapping->length, endian);
+    phdr.p_vaddr = cpu_convert_to_target32(memory_mapping->virt_addr, endian);
+
+    ret = fd_write_vmcore(&phdr, sizeof(Elf32_Phdr), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf64_note(DumpState *s)
+{
+    Elf64_Phdr phdr;
+    int endian = s->dump_info.d_endian;
+    target_phys_addr_t begin = s->memory_offset - s->note_size;
+    int ret;
+
+    memset(&phdr, 0, sizeof(Elf64_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_NOTE, endian);
+    phdr.p_offset = cpu_convert_to_target64(begin, endian);
+    phdr.p_paddr = 0;
+    phdr.p_filesz = cpu_convert_to_target64(s->note_size, endian);
+    phdr.p_memsz = cpu_convert_to_target64(s->note_size, endian);
+    phdr.p_vaddr = 0;
+
+    ret = fd_write_vmcore(&phdr, sizeof(Elf64_Phdr), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf64_notes(DumpState *s)
+{
+    CPUArchState *env;
+    int ret;
+    int id;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        id = cpu_index(env);
+        ret = cpu_write_elf64_note(fd_write_vmcore, env, id, s);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write elf notes.\n");
+            return -1;
+        }
+    }
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        ret = cpu_write_elf64_qemunote(fd_write_vmcore, env, s);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write CPU status.\n");
+            return -1;
+        }
+    }
+
+    return 0;
+}
+
+static int write_elf32_note(DumpState *s)
+{
+    target_phys_addr_t begin = s->memory_offset - s->note_size;
+    Elf32_Phdr phdr;
+    int endian = s->dump_info.d_endian;
+    int ret;
+
+    memset(&phdr, 0, sizeof(Elf32_Phdr));
+    phdr.p_type = cpu_convert_to_target32(PT_NOTE, endian);
+    phdr.p_offset = cpu_convert_to_target32(begin, endian);
+    phdr.p_paddr = 0;
+    phdr.p_filesz = cpu_convert_to_target32(s->note_size, endian);
+    phdr.p_memsz = cpu_convert_to_target32(s->note_size, endian);
+    phdr.p_vaddr = 0;
+
+    ret = fd_write_vmcore(&phdr, sizeof(Elf32_Phdr), s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write program header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_elf32_notes(DumpState *s)
+{
+    CPUArchState *env;
+    int ret;
+    int id;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        id = cpu_index(env);
+        ret = cpu_write_elf32_note(fd_write_vmcore, env, id, s);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write elf notes.\n");
+            return -1;
+        }
+    }
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        ret = cpu_write_elf32_qemunote(fd_write_vmcore, env, s);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to write CPU status.\n");
+            return -1;
+        }
+    }
+
+    return 0;
+}
+
+static int write_elf_section(DumpState *s, int type)
+{
+    Elf32_Shdr shdr32;
+    Elf64_Shdr shdr64;
+    int endian = s->dump_info.d_endian;
+    int shdr_size;
+    void *shdr;
+    int ret;
+
+    if (type == 0) {
+        shdr_size = sizeof(Elf32_Shdr);
+        memset(&shdr32, 0, shdr_size);
+        shdr32.sh_info = cpu_convert_to_target32(s->sh_info, endian);
+        shdr = &shdr32;
+    } else {
+        shdr_size = sizeof(Elf64_Shdr);
+        memset(&shdr64, 0, shdr_size);
+        shdr64.sh_info = cpu_convert_to_target32(s->sh_info, endian);
+        shdr = &shdr64;
+    }
+
+    ret = fd_write_vmcore(&shdr, shdr_size, s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write section header table.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_data(DumpState *s, void *buf, int length)
+{
+    int ret;
+
+    ret = fd_write_vmcore(buf, length, s);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to save memory.\n");
+        return -1;
+    }
+
+    return 0;
+}
+
+/* write the memroy to vmcore. 1 page per I/O. */
+static int write_memory(DumpState *s, RAMBlock *block, ram_addr_t start,
+                        int64_t size)
+{
+    int64_t i;
+    int ret;
+
+    for (i = 0; i < size / TARGET_PAGE_SIZE; i++) {
+        ret = write_data(s, block->host + start + i * TARGET_PAGE_SIZE,
+                         TARGET_PAGE_SIZE);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    if ((size % TARGET_PAGE_SIZE) != 0) {
+        ret = write_data(s, block->host + start + i * TARGET_PAGE_SIZE,
+                         size % TARGET_PAGE_SIZE);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
+/* get the memory's offset in the vmcore */
+static target_phys_addr_t get_offset(target_phys_addr_t phys_addr,
+                                     DumpState *s)
+{
+    RAMBlock *block;
+    target_phys_addr_t offset = s->memory_offset;
+    int64_t size_in_block, start;
+
+    if (s->has_filter) {
+        if (phys_addr < s->begin || phys_addr >= s->begin + s->length) {
+            return -1;
+        }
+    }
+
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        if (s->has_filter) {
+            if (block->offset >= s->begin + s->length ||
+                block->offset + block->length <= s->begin) {
+                /* This block is out of the range */
+                continue;
+            }
+
+            if (s->begin <= block->offset) {
+                start = block->offset;
+            } else {
+                start = s->begin;
+            }
+
+            size_in_block = block->length - (start - block->offset);
+            if (s->begin + s->length < block->offset + block->length) {
+                size_in_block -= block->offset + block->length -
+                                 (s->begin + s->length);
+            }
+        } else {
+            start = block->offset;
+            size_in_block = block->length;
+        }
+
+        if (phys_addr >= start && phys_addr < start + size_in_block) {
+            return phys_addr - start + offset;
+        }
+
+        offset += size_in_block;
+    }
+
+    return -1;
+}
+
+static int write_elf_loads(DumpState *s)
+{
+    target_phys_addr_t offset;
+    MemoryMapping *memory_mapping;
+    uint32_t phdr_index = 1;
+    int ret;
+    uint32_t max_index;
+
+    if (s->have_section) {
+        max_index = s->sh_info;
+    } else {
+        max_index = s->phdr_num;
+    }
+
+    QTAILQ_FOREACH(memory_mapping, &s->list.head, next) {
+        offset = get_offset(memory_mapping->phys_addr, s);
+        if (s->dump_info.d_class == ELFCLASS64) {
+            ret = write_elf64_load(s, memory_mapping, phdr_index++, offset);
+        } else {
+            ret = write_elf32_load(s, memory_mapping, phdr_index++, offset);
+        }
+
+        if (ret < 0) {
+            return -1;
+        }
+
+        if (phdr_index >= max_index) {
+            break;
+        }
+    }
+
+    return 0;
+}
+
+/* write elf header, PT_NOTE and elf note to vmcore. */
+static int dump_begin(DumpState *s)
+{
+    int ret;
+
+    /*
+     * the vmcore's format is:
+     *   --------------
+     *   |  elf header |
+     *   --------------
+     *   |  PT_NOTE    |
+     *   --------------
+     *   |  PT_LOAD    |
+     *   --------------
+     *   |  ......     |
+     *   --------------
+     *   |  PT_LOAD    |
+     *   --------------
+     *   |  sec_hdr    |
+     *   --------------
+     *   |  elf note   |
+     *   --------------
+     *   |  memory     |
+     *   --------------
+     *
+     * we only know where the memory is saved after we write elf note into
+     * vmcore.
+     */
+
+    /* write elf header to vmcore */
+    if (s->dump_info.d_class == ELFCLASS64) {
+        ret = write_elf64_header(s);
+    } else {
+        ret = write_elf32_header(s);
+    }
+    if (ret < 0) {
+        return -1;
+    }
+
+    if (s->dump_info.d_class == ELFCLASS64) {
+        /* write PT_NOTE to vmcore */
+        if (write_elf64_note(s) < 0) {
+            return -1;
+        }
+
+        /* write all PT_LOAD to vmcore */
+        if (write_elf_loads(s) < 0) {
+            return -1;
+        }
+
+        /* write section to vmcore */
+        if (s->have_section) {
+            if (write_elf_section(s, 1) < 0) {
+                return -1;
+            }
+        }
+
+        /* write notes to vmcore */
+        if (write_elf64_notes(s) < 0) {
+            return -1;
+        }
+
+    } else {
+        /* write PT_NOTE to vmcore */
+        if (write_elf32_note(s) < 0) {
+            return -1;
+        }
+
+        /* write all PT_LOAD to vmcore */
+        if (write_elf_loads(s) < 0) {
+            return -1;
+        }
+
+        /* write section to vmcore */
+        if (s->have_section) {
+            if (write_elf_section(s, 0) < 0) {
+                return -1;
+            }
+        }
+
+        /* write notes to vmcore */
+        if (write_elf32_notes(s) < 0) {
+            return -1;
+        }
+    }
+
+    return 0;
+}
+
+/* write PT_LOAD to vmcore */
+static int dump_completed(DumpState *s)
+{
+    dump_cleanup(s);
+    return 0;
+}
+
+static int get_next_block(DumpState *s, RAMBlock *block)
+{
+    while (1) {
+        block = QLIST_NEXT(block, next);
+        if (!block) {
+            /* no more block */
+            return 1;
+        }
+
+        s->start = 0;
+        s->block = block;
+        if (s->has_filter) {
+            if (block->offset >= s->begin + s->length ||
+                block->offset + block->length <= s->begin) {
+                /* This block is out of the range */
+                continue;
+            }
+
+            if (s->begin > block->offset) {
+                s->start = s->begin - block->offset;
+            }
+        }
+
+        return 0;
+    }
+}
+
+/* write all memory to vmcore */
+static int dump_iterate(DumpState *s)
+{
+    RAMBlock *block;
+    int64_t size;
+    int ret;
+
+    while (1) {
+        block = s->block;
+
+        size = block->length;
+        if (s->has_filter) {
+            size -= s->start;
+            if (s->begin + s->length < block->offset + block->length) {
+                size -= block->offset + block->length - (s->begin + s->length);
+            }
+        }
+        ret = write_memory(s, block, s->start, size);
+        if (ret == -1) {
+            return ret;
+        }
+
+        ret = get_next_block(s, block);
+        if (ret == 1) {
+            dump_completed(s);
+            return 0;
+        }
+    }
+}
+
+static int create_vmcore(DumpState *s)
+{
+    int ret;
+
+    ret = dump_begin(s);
+    if (ret < 0) {
+        return -1;
+    }
+
+    ret = dump_iterate(s);
+    if (ret < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
+static ram_addr_t get_start_block(DumpState *s)
+{
+    RAMBlock *block;
+
+    if (!s->has_filter) {
+        s->block = QLIST_FIRST(&ram_list.blocks);
+        return 0;
+    }
+
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        if (block->offset >= s->begin + s->length ||
+            block->offset + block->length <= s->begin) {
+            /* This block is out of the range */
+            continue;
+        }
+
+        s->block = block;
+        if (s->begin > block->offset) {
+            s->start = s->begin - block->offset;
+        } else {
+            s->start = 0;
+        }
+        return s->start;
+    }
+
+    return -1;
+}
+
+static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
+                     int64_t begin, int64_t length, Error **errp)
+{
+    CPUArchState *env;
+    int nr_cpus;
+    int ret;
+
+    if (runstate_is_running()) {
+        vm_stop(RUN_STATE_SAVE_VM);
+        s->resume = true;
+    } else {
+        s->resume = false;
+    }
+
+    s->errp = errp;
+    s->fd = fd;
+    s->has_filter = has_filter;
+    s->begin = begin;
+    s->length = length;
+    s->start = get_start_block(s);
+    if (s->start == -1) {
+        error_set(errp, QERR_INVALID_PARAMETER, "begin");
+        goto cleanup;
+    }
+
+    /*
+     * get dump info: endian, class and architecture.
+     * If the target architecture is not supported, cpu_get_dump_info() will
+     * return -1.
+     *
+     * if we use kvm, we should synchronize the register before we get dump
+     * info.
+     */
+    nr_cpus = 0;
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        cpu_synchronize_state(env);
+        nr_cpus++;
+    }
+
+    ret = cpu_get_dump_info(&s->dump_info);
+    if (ret < 0) {
+        error_set(errp, QERR_UNSUPPORTED);
+        goto cleanup;
+    }
+
+    /* get memory mapping */
+    memory_mapping_list_init(&s->list);
+    if (paging) {
+        qemu_get_guest_memory_mapping(&s->list);
+    } else {
+        qemu_get_guest_simple_memory_mapping(&s->list);
+    }
+
+    if (s->has_filter) {
+        memory_mapping_filter(&s->list, s->begin, s->length);
+    }
+
+    /*
+     * calculate phdr_num
+     *
+     * the type of ehdr->e_phnum is uint16_t, so we should avoid overflow
+     */
+    s->phdr_num = 1; /* PT_NOTE */
+    if (s->list.num < UINT16_MAX - 2) {
+        s->phdr_num += s->list.num;
+        s->have_section = false;
+    } else {
+        s->have_section = true;
+        s->phdr_num = PN_XNUM;
+        s->sh_info = 1; /* PT_NOTE */
+
+        /* the type of shdr->sh_info is uint32_t, so we should avoid overflow */
+        if (s->list.num <= UINT32_MAX - 1) {
+            s->sh_info += s->list.num;
+        } else {
+            s->sh_info = UINT32_MAX;
+        }
+    }
+
+    s->note_size = cpu_get_note_size(s->dump_info.d_class,
+                                     s->dump_info.d_machine, nr_cpus);
+    if (s->dump_info.d_class == ELFCLASS64) {
+        if (s->have_section) {
+            s->memory_offset = sizeof(Elf64_Ehdr) +
+                               sizeof(Elf64_Phdr) * s->sh_info +
+                               sizeof(Elf64_Shdr) + s->note_size;
+        } else {
+            s->memory_offset = sizeof(Elf64_Ehdr) +
+                               sizeof(Elf64_Phdr) * s->phdr_num + s->note_size;
+        }
+    } else {
+        if (s->have_section) {
+            s->memory_offset = sizeof(Elf32_Ehdr) +
+                               sizeof(Elf32_Phdr) * s->sh_info +
+                               sizeof(Elf32_Shdr) + s->note_size;
+        } else {
+            s->memory_offset = sizeof(Elf32_Ehdr) +
+                               sizeof(Elf32_Phdr) * s->phdr_num + s->note_size;
+        }
+    }
+
+    return 0;
+
+cleanup:
+    if (s->resume) {
+        vm_start();
+    }
+
+    return -1;
+}
+
+void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
+                           int64_t begin, bool has_length, int64_t length,
+                           Error **errp)
+{
+    const char *p;
+    int fd = -1;
+    DumpState *s;
+    int ret;
+
+    if (has_begin && !has_length) {
+        error_set(errp, QERR_MISSING_PARAMETER, "length");
+        return;
+    }
+    if (!has_begin && has_length) {
+        error_set(errp, QERR_MISSING_PARAMETER, "begin");
+        return;
+    }
+
+#if !defined(WIN32)
+    if (strstart(file, "fd:", &p)) {
+        fd = monitor_get_fd(cur_mon, p);
+        if (fd == -1) {
+            error_set(errp, QERR_FD_NOT_FOUND, p);
+            return;
+        }
+    }
+#endif
+
+    if  (strstart(file, "file:", &p)) {
+        fd = qemu_open(p, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR);
+        if (fd < 0) {
+            error_set(errp, QERR_OPEN_FILE_FAILED, p);
+            return;
+        }
+    }
+
+    if (fd == -1) {
+        error_set(errp, QERR_INVALID_PARAMETER, "protocol");
+        return;
+    }
+
+    s = g_malloc(sizeof(DumpState));
+
+    ret = dump_init(s, fd, paging, has_begin, begin, length, errp);
+    if (ret < 0) {
+        g_free(s);
+        return;
+    }
+
+    if (create_vmcore(s) < 0 && !error_is_set(s->errp)) {
+        error_set(errp, QERR_IO_ERROR);
+    }
+
+    g_free(s);
+}
+
+#else
+/* we need this function in hmp.c */
+void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
+                           int64_t begin, bool has_length, int64_t length,
+                           Error **errp)
+{
+    error_set(errp, QERR_UNSUPPORTED);
+}
+#endif
diff --git a/elf.h b/elf.h
index e1422b8..9c9acfa 100644
--- a/elf.h
+++ b/elf.h
@@ -1037,6 +1037,11 @@ typedef struct elf64_sym {
 
 #define EI_NIDENT	16
 
+/* Special value for e_phnum.  This indicates that the real number of
+   program headers is too large to fit into e_phnum.  Instead the real
+   value is in the field sh_info of section 0.  */
+#define PN_XNUM         0xffff
+
 typedef struct elf32_hdr{
   unsigned char	e_ident[EI_NIDENT];
   Elf32_Half	e_type;
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 18cb415..81723c8 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -878,6 +878,34 @@ server will ask the spice/vnc client to automatically reconnect using the
 new parameters (if specified) once the vm migration finished successfully.
 ETEXI
 
+#if defined(CONFIG_HAVE_CORE_DUMP)
+    {
+        .name       = "dump-guest-memory",
+        .args_type  = "paging:-p,protocol:s,begin:i?,length:i?",
+        .params     = "[-p] protocol [begin] [length]",
+        .help       = "dump guest memory to file"
+                      "\n\t\t\t begin(optional): the starting physical address"
+                      "\n\t\t\t length(optional): the memory size, in bytes",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd = hmp_dump_guest_memory,
+    },
+
+
+STEXI
+@item dump-guest-memory [-p] @var{protocol} @var{begin} @var{length}
+@findex dump-guest-memory
+Dump guest memory to @var{protocol}. The file can be processed with crash or
+gdb.
+  protocol: destination file(started with "file:") or destination file
+            descriptor (started with "fd:")
+    paging: do paging to get guest's memory mapping
+     begin: the starting physical address. It's optional, and should be
+            specified with length together.
+    length: the memory size, in bytes. It's optional, and should be specified
+            with begin together.
+ETEXI
+#endif
+
     {
         .name       = "snapshot_blkdev",
         .args_type  = "reuse:-n,device:B,snapshot-file:s?,format:s?",
diff --git a/hmp.c b/hmp.c
index eb96618..af4abee 100644
--- a/hmp.c
+++ b/hmp.c
@@ -945,3 +945,25 @@ void hmp_device_del(Monitor *mon, const QDict *qdict)
     qmp_device_del(id, &err);
     hmp_handle_error(mon, &err);
 }
+
+void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict)
+{
+    Error *errp = NULL;
+    int paging = qdict_get_try_bool(qdict, "paging", 0);
+    const char *file = qdict_get_str(qdict, "protocol");
+    bool has_begin = qdict_haskey(qdict, "begin");
+    bool has_length = qdict_haskey(qdict, "length");
+    int64_t begin = 0;
+    int64_t length = 0;
+
+    if (has_begin) {
+        begin = qdict_get_int(qdict, "begin");
+    }
+    if (has_length) {
+        length = qdict_get_int(qdict, "length");
+    }
+
+    qmp_dump_guest_memory(paging, file, has_begin, begin, has_length, length,
+                          &errp);
+    hmp_handle_error(mon, &errp);
+}
diff --git a/hmp.h b/hmp.h
index 443b812..5cf3241 100644
--- a/hmp.h
+++ b/hmp.h
@@ -61,5 +61,6 @@ void hmp_block_job_set_speed(Monitor *mon, const QDict *qdict);
 void hmp_block_job_cancel(Monitor *mon, const QDict *qdict);
 void hmp_migrate(Monitor *mon, const QDict *qdict);
 void hmp_device_del(Monitor *mon, const QDict *qdict);
+void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict);
 
 #endif
diff --git a/memory_mapping.c b/memory_mapping.c
index adb1595..8810bb0 100644
--- a/memory_mapping.c
+++ b/memory_mapping.c
@@ -220,3 +220,30 @@ void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list)
         create_new_memory_mapping(list, block->offset, 0, block->length);
     }
 }
+
+void memory_mapping_filter(MemoryMappingList *list, int64_t begin,
+                           int64_t length)
+{
+    MemoryMapping *cur, *next;
+
+    QTAILQ_FOREACH_SAFE(cur, &list->head, next, next) {
+        if (cur->phys_addr >= begin + length ||
+            cur->phys_addr + cur->length <= begin) {
+            QTAILQ_REMOVE(&list->head, cur, next);
+            list->num--;
+            continue;
+        }
+
+        if (cur->phys_addr < begin) {
+            cur->length -= begin - cur->phys_addr;
+            if (cur->virt_addr) {
+                cur->virt_addr += begin - cur->phys_addr;
+            }
+            cur->phys_addr = begin;
+        }
+
+        if (cur->phys_addr + cur->length > begin + length) {
+            cur->length -= cur->phys_addr + cur->length - begin - length;
+        }
+    }
+}
diff --git a/memory_mapping.h b/memory_mapping.h
index 190de12..a1aa64f 100644
--- a/memory_mapping.h
+++ b/memory_mapping.h
@@ -63,6 +63,9 @@ static inline int qemu_get_guest_memory_mapping(MemoryMappingList *list)
 /* get guest's memory mapping without do paging(virtual address is 0). */
 void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list);
 
+void memory_mapping_filter(MemoryMappingList *list, int64_t begin,
+                           int64_t length);
+
 #else
 
 /* We use MemoryMappingList* in cpu-all.h */
diff --git a/qapi-schema.json b/qapi-schema.json
index 9193fb9..4d6b798 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1728,3 +1728,46 @@
 # Since: 0.14.0
 ##
 { 'command': 'device_del', 'data': {'id': 'str'} }
+
+##
+# @dump-guest-memory
+#
+# Dump guest's memory to vmcore. It is a synchronous operation that can take
+# very long depending on the amount of guest memory. This command is only
+# supported only on i386 and x86_64
+#
+# @paging: if true, do paging to get guest's memory mapping. The @paging's
+# default value of @paging is false, If you want to use gdb to process the
+# core, please set @paging to true. The reason why the @paging's value is
+# false:
+#   1. guest machine in a catastrophic state can have corrupted memory,
+#      which we cannot trust.
+#   2. The guest machine can be in read-mode even if paging is enabled.
+#      For example: the guest machine uses ACPI to sleep, and ACPI sleep
+#      state goes in real-mode
+# @protocol: the filename or file descriptor of the vmcore. The supported
+# protocol can be file or fd:
+#   1. file: the protocol starts with "file:", and the following string is
+#      the file's path.
+#   2. fd: the protocol starts with "fd:", and the following string is the
+#      fd's name.
+# @begin: #optional if specified, the starting physical address.
+# @length: #optional if specified, the memory size, in bytes. If you don't
+# want to dump all guest's memory, please specify the start @begin and
+# @length
+#
+# Returns: nothing on success
+#          If @begin contains an invalid address, InvalidParameter
+#          If only one of @begin and @length is specified, MissingParameter
+#          If @protocol stats with "fd:", and the fd cannot be found, FdNotFound
+#          If @protocol starts with "file:", and the file cannot be
+#          opened, OpenFileFailed
+#          If @protocol does not start with "fd:" or "file:", InvalidParameter
+#          If an I/O error occurs while writing the file, IOError
+#          If the target does not support this command, Unsupported
+#
+# Since: 1.1
+##
+{ 'command': 'dump-guest-memory',
+  'data': { 'paging': 'bool', 'protocol': 'str', '*begin': 'int',
+            '*length': 'int' } }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index c810c74..dea2d91 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -604,6 +604,42 @@ Example:
 EQMP
 
     {
+        .name       = "dump-guest-memory",
+        .args_type  = "paging:b,protocol:s,begin:i?,end:i?",
+        .params     = "-p protocol [begin] [length]",
+        .help       = "dump guest memory to file",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = qmp_marshal_input_dump_guest_memory,
+    },
+
+SQMP
+dump
+
+
+Dump guest memory to file. The file can be processed with crash or gdb.
+
+Arguments:
+
+- "paging": do paging to get guest's memory mapping (json-bool)
+- "protocol": destination file(started with "file:") or destination file
+              descriptor (started with "fd:") (json-string)
+- "begin": the starting physical address. It's optional, and should be specified
+           with length together (json-int)
+- "length": the memory size, in bytes. It's optional, and should be specified
+            with begin together (json-int)
+
+Example:
+
+-> { "execute": "dump-guest-memory", "arguments": { "protocol": "fd:dump" } }
+<- { "return": {} }
+
+Notes:
+
+(1) All boolean arguments default to false
+
+EQMP
+
+    {
         .name       = "netdev_add",
         .args_type  = "netdev:O",
         .params     = "[user|tap|socket],id=str[,prop=value][,...]",
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism
  2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
                   ` (11 preceding siblings ...)
  2012-05-07  4:10 ` [Qemu-devel] [PATCH 12/12 v15] introduce a new monitor command 'dump-guest-memory' to dump guest's memory Wen Congyang
@ 2012-05-16 17:59 ` Luiz Capitulino
  12 siblings, 0 replies; 14+ messages in thread
From: Luiz Capitulino @ 2012-05-16 17:59 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Anthony Liguori, qemu-devel, HATAYAMA Daisuke, Jan Kiszka, Eric Blake

On Mon, 07 May 2012 12:01:43 +0800
Wen Congyang <wency@cn.fujitsu.com> wrote:

> Hi, all
> 
> 'virsh dump' can not work when host pci device is used by guest. We have
> discussed this issue here:
> http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html
> 
> The last version is here:
> http://lists.nongnu.org/archive/html/qemu-devel/2012-04/msg03379.html
> 
> We have determined to introduce a new command dump-guest-memory to dump
> guest's memory. The core file's format is elf32 or elf64.

This looks good to me now and I've applied it to the qmp-next branch.

Actually, there are some nitpicks, but I won't ask you to respin because of
that, I'll fix them myself on follow up patches.

If anyone wants to give this a review, you still have time as I'll queue
this until 1.1 is released.

PS: I've updated the 'Since' field of the schema to 1.2.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-05-16 18:09 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-07  4:01 [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Wen Congyang
2012-05-07  4:03 ` [Qemu-devel] [PATCH 01/12 v15] Add API to create memory mapping list Wen Congyang
2012-05-07  4:04 ` [Qemu-devel] [PATCH 02/12 v15] Add API to check whether a physical address is I/O address Wen Congyang
2012-05-07  4:04 ` [Qemu-devel] [PATCH 03/12 v15] implement cpu_get_memory_mapping() Wen Congyang
2012-05-07  4:05 ` [Qemu-devel] [PATCH 04/12 v15] Add API to check whether paging mode is enabled Wen Congyang
2012-05-07  4:06 ` [Qemu-devel] [PATCH 05/12 v15] Add API to get memory mapping Wen Congyang
2012-05-07  4:07 ` [Qemu-devel] [PATCH 06/12 v15] Add API to get memory mapping without do paging Wen Congyang
2012-05-07  4:07 ` [Qemu-devel] [PATCH 07/12 v15] target-i386: Add API to write elf notes to core file Wen Congyang
2012-05-07  4:08 ` [Qemu-devel] [PATCH 08/12 v15] target-i386: Add API to write cpu status " Wen Congyang
2012-05-07  4:08 ` [Qemu-devel] [PATCH 09/12 v15] target-i386: add API to get dump info Wen Congyang
2012-05-07  4:09 ` [Qemu-devel] [PATCH 10/12 v15] target-i386: Add API to get note's size Wen Congyang
2012-05-07  4:10 ` [Qemu-devel] [PATCH 11/12 v15] make gdb_id() generally avialable and rename it to cpu_index() Wen Congyang
2012-05-07  4:10 ` [Qemu-devel] [PATCH 12/12 v15] introduce a new monitor command 'dump-guest-memory' to dump guest's memory Wen Congyang
2012-05-16 17:59 ` [Qemu-devel] [PATCH 00/12 v15] introducing a new, dedicated guest memory dump mechanism Luiz Capitulino

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.