[Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format
@ 2014-01-05  7:27 Qiao Nuohan
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 01/11] dump: Add argument to write_elfxx_notes Qiao Nuohan
                   ` (11 more replies)
  0 siblings, 12 replies; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-05  7:27 UTC (permalink / raw)
  To: stefanha, lcapitulino, afaerber, eblake
  Cc: qemu-devel, qiaonuohan, kumagai-atsushi, anderson, akong, lersek

Hi, all

The last version is here:
http://lists.nongnu.org/archive/html/qemu-devel/2013-07/msg01376.html

Command 'dump-guest-memory' was introduced to dump guest's memory. But the
vmcore's format is only elf32 or elf64. The message is here:
http://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03379.html

Compared with migration, the missing of compression feature means regression
to 'dump-guest-memory'. So we post these patches to make 'dump-guest-memory' be
able to dump guest's in kdump-compressed format. Then vmcore can be much
smaller, and easily to be delivered.

The kdump-compressed format is *linux specific* *linux standard* crash dump
format used in kdump framework. The kdump-compressed format is readable only
with the crash utility, and it can be smaller than the ELF format because of
the compression support. To get more detailed information about
kdump-compressed format, please refer to the following URL:
http://sourceforge.net/projects/makedumpfile/

Note, similar to 'dump-guest-memory':
1. The guest should be x86 or x86_64. The other arch is not supported now.
2. If the OS is in the second kernel, gdb may not work well, and crash can
   work by specifying '--machdep phys_addr=xxx' in the command line. The
   reason is that the second kernel will update the page table, and we can
   not get the page table for the first kernel.
3. The cpu's state is stored in QEMU note.
4. The vmcore are able to be compressed with zlib, lzo or snappy. zlib is
   available by default, and option '--enable-lzo' or '--enable-snappy'
   should be specified with 'configure' to make lzo or snappy available.

Changelog:
Changes from v5 to v6:
1. add run-time check for compression format(lzo/snappy)
2. address Stefan's comments about reusing code and coding style
3. update the version of kdump-compressed format to 6th
4. resplit the patches
5. Add 'query-dump-guest-memory-capability' command

Changes from v4 to v5:
1. using flatten format to avoid using temporary files according to Stefan's
   comments
2. Address Andreas's comments about coding style

Changes from v3 to v4:
1. change to avoid conflict with Andreas's patches
2. rebase

Changes from v2 to v3:
1. Address Eric's comment

Changes from v1 to v2:
1. Address Eric & Daniel's comment: fix manner of string copy.
2. Address Eric's comment: replace reinventing new constants by using the
   ready-made ones accoring.
3. Address Andreas's comment: remove useless include.

Qiao Nuohan (11):
  dump: Add argument to write_elfxx_notes
  dump: Add API to write header of flatten format
  dump: Add API to write vmcore
  dump: Add API to write elf notes to buffer
  dump: add support for lzo/snappy
  dump: add API to write dump header
  dump: Add API to write dump_bitmap
  dump: Add APIs to operate DataCache
  dump: Add API to write dump pages
  dump: Make kdump-compressed format available for 'dump-guest-memory'
  Add 'query-dump-guest-memory-capability' command

 configure             |   50 +++
 dump.c                |  929 ++++++++++++++++++++++++++++++++++++++++++++++++-
 hmp-commands.hx       |   12 +-
 hmp.c                 |   23 ++-
 include/sysemu/dump.h |  138 ++++++++
 qapi-schema.json      |   35 ++-
 qmp-commands.hx       |   13 +-
 7 files changed, 1178 insertions(+), 22 deletions(-)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v6 01/11] dump: Add argument to write_elfxx_notes
  2014-01-05  7:27 [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
@ 2014-01-05  7:27 ` Qiao Nuohan
  2014-01-06 17:03   ` Laszlo Ersek
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 02/11] dump: Add API to write header of flatten format Qiao Nuohan
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-05  7:27 UTC (permalink / raw)
  To: stefanha, lcapitulino, afaerber, eblake
  Cc: qemu-devel, qiaonuohan, kumagai-atsushi, anderson, akong, lersek

write_elf32_notes/wirte_elf64_notes use fd_write_vmcore to write elf notes to
vmcore. Adding parameter "WriteCoreDumpFunction f" makes it available to choose
the method of writing elf notes

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
 dump.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/dump.c b/dump.c
index 80a9116..1fa12a2 100644
--- a/dump.c
+++ b/dump.c
@@ -271,7 +271,7 @@ static inline int cpu_index(CPUState *cpu)
     return cpu->cpu_index + 1;
 }
 
-static int write_elf64_notes(DumpState *s)
+static int write_elf64_notes(WriteCoreDumpFunction f, DumpState *s)
 {
     CPUState *cpu;
     int ret;
@@ -279,7 +279,7 @@ static int write_elf64_notes(DumpState *s)
 
     CPU_FOREACH(cpu) {
         id = cpu_index(cpu);
-        ret = cpu_write_elf64_note(fd_write_vmcore, cpu, id, s);
+        ret = cpu_write_elf64_note(f, cpu, id, s);
         if (ret < 0) {
             dump_error(s, "dump: failed to write elf notes.\n");
             return -1;
@@ -287,7 +287,7 @@ static int write_elf64_notes(DumpState *s)
     }
 
     CPU_FOREACH(cpu) {
-        ret = cpu_write_elf64_qemunote(fd_write_vmcore, cpu, s);
+        ret = cpu_write_elf64_qemunote(f, cpu, s);
         if (ret < 0) {
             dump_error(s, "dump: failed to write CPU status.\n");
             return -1;
@@ -321,7 +321,7 @@ static int write_elf32_note(DumpState *s)
     return 0;
 }
 
-static int write_elf32_notes(DumpState *s)
+static int write_elf32_notes(WriteCoreDumpFunction f, DumpState *s)
 {
     CPUState *cpu;
     int ret;
@@ -329,7 +329,7 @@ static int write_elf32_notes(DumpState *s)
 
     CPU_FOREACH(cpu) {
         id = cpu_index(cpu);
-        ret = cpu_write_elf32_note(fd_write_vmcore, cpu, id, s);
+        ret = cpu_write_elf32_note(f, cpu, id, s);
         if (ret < 0) {
             dump_error(s, "dump: failed to write elf notes.\n");
             return -1;
@@ -337,7 +337,7 @@ static int write_elf32_notes(DumpState *s)
     }
 
     CPU_FOREACH(cpu) {
-        ret = cpu_write_elf32_qemunote(fd_write_vmcore, cpu, s);
+        ret = cpu_write_elf32_qemunote(f, cpu, s);
         if (ret < 0) {
             dump_error(s, "dump: failed to write CPU status.\n");
             return -1;
@@ -574,7 +574,7 @@ static int dump_begin(DumpState *s)
         }
 
         /* write notes to vmcore */
-        if (write_elf64_notes(s) < 0) {
+        if (write_elf64_notes(fd_write_vmcore, s) < 0) {
             return -1;
         }
 
@@ -597,7 +597,7 @@ static int dump_begin(DumpState *s)
         }
 
         /* write notes to vmcore */
-        if (write_elf32_notes(s) < 0) {
+        if (write_elf32_notes(fd_write_vmcore, s) < 0) {
             return -1;
         }
     }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v6 02/11] dump: Add API to write header of flatten format
  2014-01-05  7:27 [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 01/11] dump: Add argument to write_elfxx_notes Qiao Nuohan
@ 2014-01-05  7:27 ` Qiao Nuohan
  2014-01-06 17:15   ` Laszlo Ersek
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 03/11] dump: Add API to write vmcore Qiao Nuohan
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-05  7:27 UTC (permalink / raw)
  To: stefanha, lcapitulino, afaerber, eblake
  Cc: qemu-devel, qiaonuohan, kumagai-atsushi, anderson, akong, lersek

flatten format may be used when writing kdump-compressed format. The format is
also used by makedumpfile, you can refer to the following URL to get more
detailed information about flatten format of kdump-compressed format:
http://sourceforge.net/projects/makedumpfile/

The two functions here are used to write start flat header and end flat header
to vmcore, and they will be called later if flatten format is used.

struct MakedumpfileHeader stored at the head of vmcore is used to indicate the
vmcore is in flatten format.

struct MakedumpfileHeader {
    char signature[16];     /* = "makedumpfile" */
    int64_t type;           /* = 1 */
    int64_t version;        /* = 1 */
};

And struct MakedumpfileDataHeader, with offset and buf_size set to -1, is used
to indicate the end of vmcore in flatten format.

struct MakedumpfileDataHeader {
    int64_t offset;         /* = -1 */
    int64_t buf_size;       /* = -1 */
};

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
 dump.c                |   40 ++++++++++++++++++++++++++++++++++++++++
 include/sysemu/dump.h |   17 +++++++++++++++++
 2 files changed, 57 insertions(+), 0 deletions(-)

diff --git a/dump.c b/dump.c
index 1fa12a2..89baeab 100644
--- a/dump.c
+++ b/dump.c
@@ -686,6 +686,46 @@ static int create_vmcore(DumpState *s)
     return 0;
 }
 
+static int write_start_flat_header(int fd)
+{
+    char buf[MAX_SIZE_MDF_HEADER];
+    MakedumpfileHeader mh;
+
+    memset(&mh, 0, sizeof(mh));
+    strncpy(mh.signature, MAKEDUMPFILE_SIGNATURE,
+            strlen(MAKEDUMPFILE_SIGNATURE));
+
+    mh.type = cpu_to_be64(TYPE_FLAT_HEADER);
+    mh.version = cpu_to_be64(VERSION_FLAT_HEADER);
+
+    memset(buf, 0, sizeof(buf));
+    memcpy(buf, &mh, sizeof(mh));
+
+    size_t written_size;
+    written_size = qemu_write_full(fd, buf, sizeof(buf));
+    if (written_size != sizeof(buf)) {
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_end_flat_header(int fd)
+{
+    MakedumpfileDataHeader mdh;
+
+    mdh.offset = END_FLAG_FLAT_HEADER;
+    mdh.buf_size = END_FLAG_FLAT_HEADER;
+
+    size_t written_size;
+    written_size = qemu_write_full(fd, &mdh, sizeof(mdh));
+    if (written_size != sizeof(mdh)) {
+        return -1;
+    }
+
+    return 0;
+}
+
 static ram_addr_t get_start_block(DumpState *s)
 {
     GuestPhysBlock *block;
diff --git a/include/sysemu/dump.h b/include/sysemu/dump.h
index 19fafb2..b32b390 100644
--- a/include/sysemu/dump.h
+++ b/include/sysemu/dump.h
@@ -14,12 +14,29 @@
 #ifndef DUMP_H
 #define DUMP_H
 
+#define MAKEDUMPFILE_SIGNATURE      "makedumpfile"
+#define MAX_SIZE_MDF_HEADER         (4096) /* max size of makedumpfile_header */
+#define TYPE_FLAT_HEADER            (1)    /* type of flattened format */
+#define VERSION_FLAT_HEADER         (1)    /* version of flattened format */
+#define END_FLAG_FLAT_HEADER        (-1)
+
 typedef struct ArchDumpInfo {
     int d_machine;  /* Architecture */
     int d_endian;   /* ELFDATA2LSB or ELFDATA2MSB */
     int d_class;    /* ELFCLASS32 or ELFCLASS64 */
 } ArchDumpInfo;
 
+typedef struct QEMU_PACKED MakedumpfileHeader {
+    char signature[16];     /* = "makedumpfile" */
+    int64_t type;
+    int64_t version;
+} MakedumpfileHeader;
+
+typedef struct QEMU_PACKED MakedumpfileDataHeader {
+    int64_t offset;
+    int64_t buf_size;
+} MakedumpfileDataHeader;
+
 struct GuestPhysBlockList; /* memory_mapping.h */
 int cpu_get_dump_info(ArchDumpInfo *info,
                       const struct GuestPhysBlockList *guest_phys_blocks);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v6 03/11] dump: Add API to write vmcore
  2014-01-05  7:27 [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 01/11] dump: Add argument to write_elfxx_notes Qiao Nuohan
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 02/11] dump: Add API to write header of flatten format Qiao Nuohan
@ 2014-01-05  7:27 ` Qiao Nuohan
  2014-01-06 18:12   ` Laszlo Ersek
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 04/11] dump: Add API to write elf notes to buffer Qiao Nuohan
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-05  7:27 UTC (permalink / raw)
  To: stefanha, lcapitulino, afaerber, eblake
  Cc: qemu-devel, qiaonuohan, kumagai-atsushi, anderson, akong, lersek

Function is used to write vmcore. If flag_flatten is specified, flatten format
will be used. In flatten format, data is written block by block in vmcore.
struct MakedumpfileDataHeader is used to indicate the offset and size of a data
block.

struct MakedumpfileDataHeader {
    int64_t offset;
    int64_t buf_size;
};

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
 dump.c |   28 ++++++++++++++++++++++++++++
 1 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/dump.c b/dump.c
index 89baeab..764db39 100644
--- a/dump.c
+++ b/dump.c
@@ -726,6 +726,34 @@ static int write_end_flat_header(int fd)
     return 0;
 }
 
+static int write_buffer(int fd, bool flag_flatten, off_t offset, void *buf,
+                        size_t size)
+{
+    size_t written_size;
+    MakedumpfileDataHeader mdh;
+
+    if (flag_flatten) {
+        mdh.offset = cpu_to_be64(offset);
+        mdh.buf_size = cpu_to_be64(size);
+
+        written_size = qemu_write_full(fd, &mdh, sizeof(mdh));
+        if (written_size != sizeof(mdh)) {
+            return -1;
+        }
+    } else {
+        if (lseek(fd, offset, SEEK_SET) < 0) {
+            return -1;
+        }
+    }
+
+    written_size = qemu_write_full(fd, buf, size);
+    if (written_size != size) {
+        return -1;
+    }
+
+    return 0;
+}
+
 static ram_addr_t get_start_block(DumpState *s)
 {
     GuestPhysBlock *block;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v6 04/11] dump: Add API to write elf notes to buffer
  2014-01-05  7:27 [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
                   ` (2 preceding siblings ...)
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 03/11] dump: Add API to write vmcore Qiao Nuohan
@ 2014-01-05  7:27 ` Qiao Nuohan
  2014-01-06 18:46   ` Laszlo Ersek
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 05/11] dump: add support for lzo/snappy Qiao Nuohan
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-05  7:27 UTC (permalink / raw)
  To: stefanha, lcapitulino, afaerber, eblake
  Cc: qemu-devel, qiaonuohan, kumagai-atsushi, anderson, akong, lersek

the function can be used by write_elf32_notes/write_elf64_notes to write notes
to a buffer. If fd_write_vmcore is used, write_elf32_notes/write_elf64_notes
will write elf notes to vmcore directly. Instead, if buf_write_note is used,
elf notes will be written to opaque->note_buf at first.

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
 dump.c |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/dump.c b/dump.c
index 764db39..3b9cf00 100644
--- a/dump.c
+++ b/dump.c
@@ -76,6 +76,9 @@ typedef struct DumpState {
     int64_t begin;
     int64_t length;
     Error **errp;
+
+    void *note_buf;
+    size_t note_buf_offset;
 } DumpState;
 
 static int dump_cleanup(DumpState *s)
@@ -754,6 +757,22 @@ static int write_buffer(int fd, bool flag_flatten, off_t offset, void *buf,
     return 0;
 }
 
+static int buf_write_note(void *buf, size_t size, void *opaque)
+{
+    DumpState *s = opaque;
+
+    /* note_buf is not enough */
+    if (s->note_buf_offset + size > s->note_size) {
+        return -1;
+    }
+
+    memcpy(s->note_buf + s->note_buf_offset, buf, size);
+
+    s->note_buf_offset += size;
+
+    return 0;
+}
+
 static ram_addr_t get_start_block(DumpState *s)
 {
     GuestPhysBlock *block;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v6 05/11] dump: add support for lzo/snappy
  2014-01-05  7:27 [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
                   ` (3 preceding siblings ...)
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 04/11] dump: Add API to write elf notes to buffer Qiao Nuohan
@ 2014-01-05  7:27 ` Qiao Nuohan
  2014-01-06 19:25   ` Laszlo Ersek
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 06/11] dump: add API to write dump header Qiao Nuohan
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-05  7:27 UTC (permalink / raw)
  To: stefanha, lcapitulino, afaerber, eblake
  Cc: qemu-devel, qiaonuohan, kumagai-atsushi, anderson, akong, lersek

kdump-compressed format supports three compression format, zlib/lzo/snappy.
Currently, only zlib is available. This patch is used to support lzo/snappy.
'--enable-lzo/--enable-snappy' is needed to be specified with configure to make
lzo/snappy available for qemu

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
 configure |   50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/configure b/configure
index edfea95..b7a28b7 100755
--- a/configure
+++ b/configure
@@ -245,6 +245,8 @@ libusb=""
 usb_redir=""
 glx=""
 zlib="yes"
+lzo="no"
+snappy="no"
 guest_agent=""
 guest_agent_with_vss="no"
 vss_win32_sdk=""
@@ -947,6 +949,10 @@ for opt do
   ;;
   --disable-zlib-test) zlib="no"
   ;;
+  --enable-lzo) lzo="yes"
+  ;;
+  --enable-snappy) snappy="yes"
+  ;;
   --enable-guest-agent) guest_agent="yes"
   ;;
   --disable-guest-agent) guest_agent="no"
@@ -1538,6 +1544,42 @@ fi
 libs_softmmu="$libs_softmmu -lz"
 
 ##########################################
+# lzo check
+
+if test "$lzo" != "no" ; then
+    cat > $TMPC << EOF
+#include <lzo/lzo1x.h>
+int main(void) { lzo_version(); return 0; }
+EOF
+    if compile_prog "" "-llzo2" ; then
+        :
+    else
+        error_exit "lzo check failed" \
+            "Make sure to have the lzo libs and headers installed."
+    fi
+
+    libs_softmmu="$libs_softmmu -llzo2"
+fi
+
+##########################################
+# snappy check
+
+if test "$snappy" != "no" ; then
+    cat > $TMPC << EOF
+#include <snappy-c.h>
+int main(void) { snappy_max_compressed_length(4096); return 0; }
+EOF
+    if compile_prog "" "-lsnappy" ; then
+        :
+    else
+        error_exit "snappy check failed" \
+            "Make sure to have the snappy libs and headers installed."
+    fi
+
+    libs_softmmu="$libs_softmmu -lsnappy"
+fi
+
+##########################################
 # libseccomp check
 
 if test "$seccomp" != "no" ; then
@@ -4135,6 +4177,14 @@ if test "$glx" = "yes" ; then
   echo "GLX_LIBS=$glx_libs" >> $config_host_mak
 fi
 
+if test "$lzo" = "yes" ; then
+  echo "CONFIG_LZO=y" >> $config_host_mak
+fi
+
+if test "$snappy" = "yes" ; then
+  echo "CONFIG_SNAPPY=y" >> $config_host_mak
+fi
+
 if test "$libiscsi" = "yes" ; then
   echo "CONFIG_LIBISCSI=y" >> $config_host_mak
 fi
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v6 06/11] dump: add API to write dump header
  2014-01-05  7:27 [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
                   ` (4 preceding siblings ...)
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 05/11] dump: add support for lzo/snappy Qiao Nuohan
@ 2014-01-05  7:27 ` Qiao Nuohan
  2014-01-07 11:38   ` Laszlo Ersek
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 07/11] dump: Add API to write dump_bitmap Qiao Nuohan
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-05  7:27 UTC (permalink / raw)
  To: stefanha, lcapitulino, afaerber, eblake
  Cc: qemu-devel, qiaonuohan, kumagai-atsushi, anderson, akong, lersek

the functions are used to write header of kdump-compressed format to vmcore.
Header of kdump-compressed format includes:
1. common header: DiskDumpHeader32 / DiskDumpHeader64
2. sub header: KdumpSubHeader32 / KdumpSubHeader64
3. extra information: only elf notes here

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
 dump.c                |  199 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/sysemu/dump.h |   96 ++++++++++++++++++++++++
 2 files changed, 295 insertions(+), 0 deletions(-)

diff --git a/dump.c b/dump.c
index 3b9cf00..e3623b9 100644
--- a/dump.c
+++ b/dump.c
@@ -77,8 +77,16 @@ typedef struct DumpState {
     int64_t length;
     Error **errp;
 
+    bool flag_flatten;
+    uint32_t nr_cpus;
+    size_t page_size;
+    uint64_t max_mapnr;
+    size_t len_dump_bitmap;
     void *note_buf;
     size_t note_buf_offset;
+    off_t offset_dump_bitmap;
+    off_t offset_page;
+    uint32_t flag_compress;
 } DumpState;
 
 static int dump_cleanup(DumpState *s)
@@ -773,6 +781,197 @@ static int buf_write_note(void *buf, size_t size, void *opaque)
     return 0;
 }
 
+/* write common header, sub header and elf note to vmcore */
+static int create_header32(DumpState *s)
+{
+    int ret = 0;
+    DiskDumpHeader32 *dh = NULL;
+    KdumpSubHeader32 *kh = NULL;
+    size_t size;
+
+    /* write common header, the version of kdump-compressed format is 5th */
+    size = sizeof(DiskDumpHeader32);
+    dh = g_malloc0(size);
+
+    strncpy(dh->signature, KDUMP_SIGNATURE, strlen(KDUMP_SIGNATURE));
+    dh->header_version = 6;
+    dh->block_size = s->page_size;
+    dh->sub_hdr_size = sizeof(struct KdumpSubHeader32) + s->note_size;
+    dh->sub_hdr_size = DIV_ROUND_UP(dh->sub_hdr_size, dh->block_size);
+    /* dh->max_mapnr may be truncated, full 64bit is in kh.max_mapnr_64 */
+    dh->max_mapnr = MIN(s->max_mapnr, UINT_MAX);
+    dh->nr_cpus = s->nr_cpus;
+    dh->bitmap_blocks = DIV_ROUND_UP(s->len_dump_bitmap, dh->block_size) * 2;
+    memcpy(&(dh->utsname.machine), "i686", 4);
+
+    if (s->flag_compress & DUMP_DH_COMPRESSED_ZLIB) {
+        dh->status |= DUMP_DH_COMPRESSED_ZLIB;
+    }
+#ifdef CONFIG_LZO
+    if (s->flag_compress & DUMP_DH_COMPRESSED_LZO) {
+        dh->status |= DUMP_DH_COMPRESSED_LZO;
+    }
+#endif
+#ifdef CONFIG_SNAPPY
+    if (s->flag_compress & DUMP_DH_COMPRESSED_SNAPPY) {
+        dh->status |= DUMP_DH_COMPRESSED_SNAPPY;
+    }
+#endif
+
+    if (write_buffer(s->fd, s->flag_flatten, 0, dh, size) < 0) {
+        ret = -1;
+        goto out;
+    }
+
+    /* write sub header */
+    size = sizeof(KdumpSubHeader32);
+    kh = g_malloc0(size);
+
+    /* 64bit max_mapnr_64 */
+    kh->max_mapnr_64 = s->max_mapnr;
+    kh->phys_base = PHYS_BASE;
+    kh->dump_level = DUMP_LEVEL;
+
+    kh->offset_note = DISKDUMP_HEADER_BLOCKS * dh->block_size + size;
+    kh->note_size = s->note_size;
+
+    if (write_buffer(s->fd, s->flag_flatten, dh->block_size, kh, size) < 0) {
+        ret = -1;
+        goto out;
+    }
+
+    /* write note */
+    s->note_buf = g_malloc(s->note_size);
+    s->note_buf_offset = 0;
+
+    /* use s->note_buf to store notes temporarily */
+    if (write_elf32_notes(buf_write_note, s) < 0) {
+        ret = -1;
+        goto out;
+    }
+
+    if (write_buffer(s->fd, s->flag_flatten, kh->offset_note, s->note_buf,
+                     s->note_size) < 0) {
+        ret = -1;
+        goto out;
+    }
+
+    /* get offset of dump_bitmap */
+    s->offset_dump_bitmap = (DISKDUMP_HEADER_BLOCKS + dh->sub_hdr_size) *
+                             dh->block_size;
+
+    /* get offset of page */
+    s->offset_page = (DISKDUMP_HEADER_BLOCKS + dh->sub_hdr_size +
+                      dh->bitmap_blocks) * dh->block_size;
+
+out:
+    g_free(dh);
+    g_free(kh);
+    g_free(s->note_buf);
+
+    return ret;
+}
+
+/* write common header, sub header and elf note to vmcore */
+static int create_header64(DumpState *s)
+{
+    int ret = 0;
+    DiskDumpHeader64 *dh = NULL;
+    KdumpSubHeader64 *kh = NULL;
+    size_t size;
+
+    /* write common header, the version of kdump-compressed format is 5th */
+    size = sizeof(DiskDumpHeader64);
+    dh = g_malloc0(size);
+
+    strncpy(dh->signature, KDUMP_SIGNATURE, strlen(KDUMP_SIGNATURE));
+    dh->header_version = 6;
+    dh->block_size = s->page_size;
+    dh->sub_hdr_size = sizeof(struct KdumpSubHeader64) + s->note_size;
+    dh->sub_hdr_size = DIV_ROUND_UP(dh->sub_hdr_size, dh->block_size);
+    /* dh->max_mapnr may be truncated, full 64bit is in kh.max_mapnr_64 */
+    dh->max_mapnr = MIN(s->max_mapnr, UINT_MAX);
+    dh->nr_cpus = s->nr_cpus;
+    dh->bitmap_blocks = DIV_ROUND_UP(s->len_dump_bitmap, dh->block_size) * 2;
+    memcpy(&(dh->utsname.machine), "x86_64", 6);
+
+    if (s->flag_compress & DUMP_DH_COMPRESSED_ZLIB) {
+        dh->status |= DUMP_DH_COMPRESSED_ZLIB;
+    }
+#ifdef CONFIG_LZO
+    if (s->flag_compress & DUMP_DH_COMPRESSED_LZO) {
+        dh->status |= DUMP_DH_COMPRESSED_LZO;
+    }
+#endif
+#ifdef CONFIG_SNAPPY
+    if (s->flag_compress & DUMP_DH_COMPRESSED_SNAPPY) {
+        dh->status |= DUMP_DH_COMPRESSED_SNAPPY;
+    }
+#endif
+
+    if (write_buffer(s->fd, s->flag_flatten, 0, dh, size) < 0) {
+        ret = -1;
+        goto out;
+    }
+
+    /* write sub header */
+    size = sizeof(KdumpSubHeader64);
+    kh = g_malloc0(size);
+
+    /* 64bit max_mapnr_64 */
+    kh->max_mapnr_64 = s->max_mapnr;
+    kh->phys_base = PHYS_BASE;
+    kh->dump_level = DUMP_LEVEL;
+
+    kh->offset_note = DISKDUMP_HEADER_BLOCKS * dh->block_size + size;
+    kh->note_size = s->note_size;
+
+    if (write_buffer(s->fd, s->flag_flatten, dh->block_size, kh, size) < 0) {
+        ret = -1;
+        goto out;
+    }
+
+    /* write note */
+    s->note_buf = g_malloc0(s->note_size);
+    s->note_buf_offset = 0;
+
+    /* use s->note_buf to store notes temporarily */
+    if (write_elf64_notes(buf_write_note, s) < 0) {
+        ret = -1;
+        goto out;
+    }
+
+    if (write_buffer(s->fd, s->flag_flatten, kh->offset_note, s->note_buf,
+                     s->note_size) < 0) {
+        ret = -1;
+        goto out;
+    }
+
+    /* get offset of dump_bitmap */
+    s->offset_dump_bitmap = (DISKDUMP_HEADER_BLOCKS + dh->sub_hdr_size) *
+                             dh->block_size;
+
+    /* get offset of page */
+    s->offset_page = (DISKDUMP_HEADER_BLOCKS + dh->sub_hdr_size +
+                      dh->bitmap_blocks) * dh->block_size;
+
+out:
+    g_free(dh);
+    g_free(kh);
+    g_free(s->note_buf);
+
+    return ret;
+}
+
+static int write_dump_header(DumpState *s)
+{
+    if (s->dump_info.d_machine == EM_386) {
+        return create_header32(s);
+    } else {
+        return create_header64(s);
+    }
+}
+
 static ram_addr_t get_start_block(DumpState *s)
 {
     GuestPhysBlock *block;
diff --git a/include/sysemu/dump.h b/include/sysemu/dump.h
index b32b390..9e47b4c 100644
--- a/include/sysemu/dump.h
+++ b/include/sysemu/dump.h
@@ -20,6 +20,19 @@
 #define VERSION_FLAT_HEADER         (1)    /* version of flattened format */
 #define END_FLAG_FLAT_HEADER        (-1)
 
+/*
+ * flag for compressed format
+ */
+#define DUMP_DH_COMPRESSED_ZLIB     (0x1)
+#define DUMP_DH_COMPRESSED_LZO      (0x2)
+#define DUMP_DH_COMPRESSED_SNAPPY   (0x4)
+
+#define KDUMP_SIGNATURE             "KDUMP   "
+#define SIG_LEN                     (sizeof(KDUMP_SIGNATURE) - 1)
+#define PHYS_BASE                   (0)
+#define DUMP_LEVEL                  (1)
+#define DISKDUMP_HEADER_BLOCKS      (1)
+
 typedef struct ArchDumpInfo {
     int d_machine;  /* Architecture */
     int d_endian;   /* ELFDATA2LSB or ELFDATA2MSB */
@@ -37,6 +50,89 @@ typedef struct QEMU_PACKED MakedumpfileDataHeader {
     int64_t buf_size;
 } MakedumpfileDataHeader;
 
+typedef struct QEMU_PACKED NewUtsname {
+    char sysname[65];
+    char nodename[65];
+    char release[65];
+    char version[65];
+    char machine[65];
+    char domainname[65];
+} NewUtsname;
+
+typedef struct QEMU_PACKED DiskDumpHeader32 {
+    char signature[SIG_LEN];        /* = "KDUMP   " */
+    uint32_t header_version;        /* Dump header version */
+    NewUtsname utsname;             /* copy of system_utsname */
+    char timestamp[10];             /* Time stamp */
+    uint32_t status;                /* Above flags */
+    uint32_t block_size;            /* Size of a block in byte */
+    uint32_t sub_hdr_size;          /* Size of arch dependent header in block */
+    uint32_t bitmap_blocks;         /* Size of Memory bitmap in block */
+    uint32_t max_mapnr;             /* = max_mapnr ,
+                                       obsoleted in header_version 6 */
+    uint32_t total_ram_blocks;      /* Number of blocks should be written */
+    uint32_t device_blocks;         /* Number of total blocks in dump device */
+    uint32_t written_blocks;        /* Number of written blocks */
+    uint32_t current_cpu;           /* CPU# which handles dump */
+    uint32_t nr_cpus;               /* Number of CPUs */
+} DiskDumpHeader32;
+
+typedef struct QEMU_PACKED DiskDumpHeader64 {
+    char signature[SIG_LEN];        /* = "KDUMP   " */
+    uint32_t header_version;        /* Dump header version */
+    NewUtsname utsname;             /* copy of system_utsname */
+    char timestamp[22];             /* Time stamp */
+    uint32_t status;                /* Above flags */
+    uint32_t block_size;            /* Size of a block in byte */
+    uint32_t sub_hdr_size;          /* Size of arch dependent header in block */
+    uint32_t bitmap_blocks;         /* Size of Memory bitmap in block */
+    uint32_t max_mapnr;             /* = max_mapnr,
+                                       obsoleted in header_version 6 */
+    uint32_t total_ram_blocks;      /* Number of blocks should be written */
+    uint32_t device_blocks;         /* Number of total blocks in dump device */
+    uint32_t written_blocks;        /* Number of written blocks */
+    uint32_t current_cpu;           /* CPU# which handles dump */
+    uint32_t nr_cpus;               /* Number of CPUs */
+} DiskDumpHeader64;
+
+typedef struct QEMU_PACKED KdumpSubHeader32 {
+    uint32_t phys_base;
+    uint32_t dump_level;            /* header_version 1 and later */
+    uint32_t split;                 /* header_version 2 and later */
+    uint32_t start_pfn;             /* header_version 2 and later,
+                                       obsoleted in header_version 6 */
+    uint32_t end_pfn;               /* header_version 2 and later,
+                                       obsoleted in header_version 6 */
+    uint64_t offset_vmcoreinfo;     /* header_version 3 and later */
+    uint32_t size_vmcoreinfo;       /* header_version 3 and later */
+    uint64_t offset_note;           /* header_version 4 and later */
+    uint32_t note_size;             /* header_version 4 and later */
+    uint64_t offset_eraseinfo;      /* header_version 5 and later */
+    uint32_t size_eraseinfo;        /* header_version 5 and later */
+    uint64_t start_pfn_64;          /* header_version 6 and later */
+    uint64_t end_pfn_64;            /* header_version 6 and later */
+    uint64_t max_mapnr_64;          /* header_version 6 and later */
+} KdumpSubHeader32;
+
+typedef struct QEMU_PACKED KdumpSubHeader64 {
+    uint64_t phys_base;
+    uint32_t dump_level;            /* header_version 1 and later */
+    uint32_t split;                 /* header_version 2 and later */
+    uint64_t start_pfn;             /* header_version 2 and later,
+                                       obsoleted in header_version 6 */
+    uint64_t end_pfn;               /* header_version 2 and later,
+                                       obsoleted in header_version 6 */
+    uint64_t offset_vmcoreinfo;     /* header_version 3 and later */
+    uint64_t size_vmcoreinfo;       /* header_version 3 and later */
+    uint64_t offset_note;           /* header_version 4 and later */
+    uint64_t note_size;             /* header_version 4 and later */
+    uint64_t offset_eraseinfo;      /* header_version 5 and later */
+    uint64_t size_eraseinfo;        /* header_version 5 and later */
+    uint64_t start_pfn_64;          /* header_version 6 and later */
+    uint64_t end_pfn_64;            /* header_version 6 and later */
+    uint64_t max_mapnr_64;          /* header_version 6 and later */
+} KdumpSubHeader64;
+
 struct GuestPhysBlockList; /* memory_mapping.h */
 int cpu_get_dump_info(ArchDumpInfo *info,
                       const struct GuestPhysBlockList *guest_phys_blocks);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v6 07/11] dump: Add API to write dump_bitmap
  2014-01-05  7:27 [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
                   ` (5 preceding siblings ...)
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 06/11] dump: add API to write dump header Qiao Nuohan
@ 2014-01-05  7:27 ` Qiao Nuohan
  2014-01-07 14:49   ` Laszlo Ersek
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 08/11] dump: Add APIs to operate DataCache Qiao Nuohan
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-05  7:27 UTC (permalink / raw)
  To: stefanha, lcapitulino, afaerber, eblake
  Cc: qemu-devel, qiaonuohan, kumagai-atsushi, anderson, akong, lersek

functions are used to write 1st and 2nd dump_bitmap of kdump-compressed format,
which is used to indicate whether the corresponded page is existed in vmcore.
1st and 2nd dump_bitmap are same, because dump level is specified to 1 here.

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
 dump.c                |  116 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/sysemu/dump.h |    7 +++
 2 files changed, 123 insertions(+), 0 deletions(-)

diff --git a/dump.c b/dump.c
index e3623b9..1fae152 100644
--- a/dump.c
+++ b/dump.c
@@ -80,12 +80,14 @@ typedef struct DumpState {
     bool flag_flatten;
     uint32_t nr_cpus;
     size_t page_size;
+    uint32_t page_shift;
     uint64_t max_mapnr;
     size_t len_dump_bitmap;
     void *note_buf;
     size_t note_buf_offset;
     off_t offset_dump_bitmap;
     off_t offset_page;
+    size_t num_dumpable;
     uint32_t flag_compress;
 } DumpState;
 
@@ -972,6 +974,120 @@ static int write_dump_header(DumpState *s)
     }
 }
 
+/* set dump_bitmap sequencely. bitmap is not allowed to be rewritten. */
+static int set_dump_bitmap(int64_t last_pfn, int64_t pfn, uint32_t value,
+                           void *buf, DumpState *s)
+{
+    off_t old_offset, new_offset;
+    off_t offset_bitmap1, offset_bitmap2;
+    uint32_t byte, bit;
+
+    /* should not set the previous place */
+    if (last_pfn > pfn) {
+        return -1;
+    }
+
+    /*
+     * if the bit needed to be set is not cached in buf, flush the data in buf
+     * to vmcore firstly.
+     * making new_offset be bigger than old_offset can also sync remained data
+     * into vmcore.
+     */
+    old_offset = BUFSIZE_BITMAP * (last_pfn / PFN_BUFBITMAP);
+    new_offset = BUFSIZE_BITMAP * (pfn / PFN_BUFBITMAP);
+
+    while (old_offset < new_offset) {
+        /* calculate the offset and write dump_bitmap */
+        offset_bitmap1 = s->offset_dump_bitmap + old_offset;
+        if (write_buffer(s->fd, s->flag_flatten, offset_bitmap1, buf,
+                         BUFSIZE_BITMAP) < 0) {
+            return -1;
+        }
+
+        /* dump level 1 is chosen, so 1st and 2nd bitmap are same */
+        offset_bitmap2 = s->offset_dump_bitmap + s->len_dump_bitmap +
+                         old_offset;
+        if (write_buffer(s->fd, s->flag_flatten, offset_bitmap2, buf,
+                         BUFSIZE_BITMAP) < 0) {
+            return -1;
+        }
+
+        memset(buf, 0, BUFSIZE_BITMAP);
+        old_offset += BUFSIZE_BITMAP;
+    }
+
+    /* get the exact place of the bit in the buf, and set it */
+    byte = (pfn % PFN_BUFBITMAP) >> 3;
+    bit = (pfn % PFN_BUFBITMAP) & 7;
+    if (value) {
+        ((char *)buf)[byte] |= 1<<bit;
+    } else {
+        ((char *)buf)[byte] &= ~(1<<bit);
+    }
+
+    return 0;
+}
+
+static int write_dump_bitmap(DumpState *s)
+{
+    int ret = 0;
+    int64_t pfn_start, pfn_end, pfn;
+    int64_t last_pfn;
+    void *dump_bitmap_buf;
+    size_t num_dumpable;
+    MemoryMapping *memory_mapping;
+
+    /* dump_bitmap_buf is used to store dump_bitmap temporarily */
+    dump_bitmap_buf = g_malloc0(BUFSIZE_BITMAP);
+
+    num_dumpable = 0;
+    last_pfn = -1;
+
+    /*
+     * exam memory page by page, and set the bit in dump_bitmap corresponded
+     * to the existing page
+     */
+    QTAILQ_FOREACH(memory_mapping, &s->list.head, next) {
+        pfn_start = paddr_to_pfn(memory_mapping->phys_addr, s->page_shift);
+        pfn_end = paddr_to_pfn(memory_mapping->phys_addr +
+                               memory_mapping->length, s->page_shift);
+
+        for (pfn = pfn_start; pfn < pfn_end; pfn++) {
+            ret = set_dump_bitmap(last_pfn, pfn, 1, dump_bitmap_buf, s);
+            if (ret < 0) {
+                dump_error(s, "dump: failed to set dump_bitmap.\n");
+                ret = -1;
+                goto out;
+            }
+
+            last_pfn = pfn;
+            num_dumpable++;
+        }
+    }
+
+    /*
+     * last_pfn > -1 means bitmap is set, then remained data in buf should be
+     * synchronized into vmcore
+     */
+    if (last_pfn > -1) {
+        ret = set_dump_bitmap(last_pfn, last_pfn + PFN_BUFBITMAP, 0,
+                              dump_bitmap_buf, s);
+        if (ret < 0) {
+            dump_error(s, "dump: failed to sync dump_bitmap.\n");
+            ret = -1;
+            goto out;
+        }
+    }
+
+    /* number of dumpable pages that will be dumped later */
+    s->num_dumpable = num_dumpable;
+
+out:
+    g_free(dump_bitmap_buf);
+
+    return ret;
+}
+
 static ram_addr_t get_start_block(DumpState *s)
 {
     GuestPhysBlock *block;
diff --git a/include/sysemu/dump.h b/include/sysemu/dump.h
index 9e47b4c..b5eaf8d 100644
--- a/include/sysemu/dump.h
+++ b/include/sysemu/dump.h
@@ -27,11 +27,18 @@
 #define DUMP_DH_COMPRESSED_LZO      (0x2)
 #define DUMP_DH_COMPRESSED_SNAPPY   (0x4)
 
+#define PAGE_SIZE                   (4096)
 #define KDUMP_SIGNATURE             "KDUMP   "
 #define SIG_LEN                     (sizeof(KDUMP_SIGNATURE) - 1)
 #define PHYS_BASE                   (0)
 #define DUMP_LEVEL                  (1)
 #define DISKDUMP_HEADER_BLOCKS      (1)
+#define BUFSIZE_BITMAP              (PAGE_SIZE)
+#define PFN_BUFBITMAP               (CHAR_BIT * BUFSIZE_BITMAP)
+#define ARCH_PFN_OFFSET             (0)
+
+#define paddr_to_pfn(X, page_shift) \
+    (((unsigned long long)(X) >> (page_shift)) - ARCH_PFN_OFFSET)
 
 typedef struct ArchDumpInfo {
     int d_machine;  /* Architecture */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v6 08/11] dump: Add APIs to operate DataCache
  2014-01-05  7:27 [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
                   ` (6 preceding siblings ...)
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 07/11] dump: Add API to write dump_bitmap Qiao Nuohan
@ 2014-01-05  7:27 ` Qiao Nuohan
  2014-01-07 15:22   ` Laszlo Ersek
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 09/11] dump: Add API to write dump pages Qiao Nuohan
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-05  7:27 UTC (permalink / raw)
  To: stefanha, lcapitulino, afaerber, eblake
  Cc: qemu-devel, qiaonuohan, kumagai-atsushi, anderson, akong, lersek

DataCache is used to store data temporarily, then the data will be written to
vmcore. These functions will be called later when writing data of page to
vmcore.

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
 dump.c                |   52 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/sysemu/dump.h |    9 ++++++++
 2 files changed, 61 insertions(+), 0 deletions(-)

diff --git a/dump.c b/dump.c
index 1fae152..b4c40f2 100644
--- a/dump.c
+++ b/dump.c
@@ -1088,6 +1088,58 @@ out:
     return ret;
 }
 
+static void prepare_data_cache(DataCache *data_cache, DumpState *s)
+{
+    data_cache->fd = s->fd;
+    data_cache->data_size = 0;
+    data_cache->buf_size = BUFSIZE_DATA_CACHE;
+    data_cache->buf = g_malloc0(BUFSIZE_DATA_CACHE);
+}
+
+static int write_cache(DataCache *dc, bool flag_flatten, void *buf, size_t size)
+{
+    /*
+     * check if the space is enough to cache data, if not, write the cached
+     * data to dc->fd and reset the buf
+     */
+    if (dc->data_size + size > dc->buf_size) {
+        if (write_buffer(dc->fd, flag_flatten, dc->offset, dc->buf,
+                         dc->data_size) < 0) {
+            return -1;
+        }
+
+        dc->offset += dc->data_size;
+        dc->data_size = 0;
+    }
+
+    memcpy(dc->buf + dc->data_size, buf, size);
+    dc->data_size += size;
+
+    return 0;
+}
+
+/* write the remaining data in dc->buf to dc->fd */
+static int sync_data_cache(DataCache *dc, bool flag_flatten)
+{
+    if (dc->data_size == 0) {
+        return 0;
+    }
+
+    if (write_buffer(dc->fd, flag_flatten, dc->offset, dc->buf,
+                     dc->data_size) < 0) {
+        return -1;
+    }
+
+    dc->offset += dc->data_size;
+
+    return 0;
+}
+
+static void free_data_cache(DataCache *data_cache)
+{
+    g_free(data_cache->buf);
+}
+
 static ram_addr_t get_start_block(DumpState *s)
 {
     GuestPhysBlock *block;
diff --git a/include/sysemu/dump.h b/include/sysemu/dump.h
index b5eaf8d..ab44af8 100644
--- a/include/sysemu/dump.h
+++ b/include/sysemu/dump.h
@@ -36,6 +36,7 @@
 #define BUFSIZE_BITMAP              (PAGE_SIZE)
 #define PFN_BUFBITMAP               (CHAR_BIT * BUFSIZE_BITMAP)
 #define ARCH_PFN_OFFSET             (0)
+#define BUFSIZE_DATA_CACHE          (PAGE_SIZE * 4)
 
 #define paddr_to_pfn(X, page_shift) \
     (((unsigned long long)(X) >> (page_shift)) - ARCH_PFN_OFFSET)
@@ -140,6 +141,14 @@ typedef struct QEMU_PACKED KdumpSubHeader64 {
     uint64_t max_mapnr_64;          /* header_version 6 and later */
 } KdumpSubHeader64;
 
+typedef struct DataCache {
+    int fd;             /* fd of the file where to write the cached data */
+    char *buf;          /* buffer for cached data */
+    size_t buf_size;    /* size of the buf */
+    size_t data_size;   /* size of cached data in buf */
+    off_t offset;       /* offset of the file */
+} DataCache;
+
 struct GuestPhysBlockList; /* memory_mapping.h */
 int cpu_get_dump_info(ArchDumpInfo *info,
                       const struct GuestPhysBlockList *guest_phys_blocks);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v6 09/11] dump: Add API to write dump pages
  2014-01-05  7:27 [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
                   ` (7 preceding siblings ...)
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 08/11] dump: Add APIs to operate DataCache Qiao Nuohan
@ 2014-01-05  7:27 ` Qiao Nuohan
  2014-01-07 22:37   ` Laszlo Ersek
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 10/11] dump: Make kdump-compressed format available for 'dump-guest-memory' Qiao Nuohan
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-05  7:27 UTC (permalink / raw)
  To: stefanha, lcapitulino, afaerber, eblake
  Cc: qemu-devel, qiaonuohan, kumagai-atsushi, anderson, akong, lersek

functions are used to write page to vmcore. vmcore is written page by page.
page desc is used to store the information of a page, including a page's size,
offset, compression format, etc.

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
 dump.c                |  258 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/sysemu/dump.h |    9 ++
 2 files changed, 267 insertions(+), 0 deletions(-)

diff --git a/dump.c b/dump.c
index b4c40f2..848957c 100644
--- a/dump.c
+++ b/dump.c
@@ -25,6 +25,14 @@
 #include "qapi/error.h"
 #include "qmp-commands.h"
 
+#include <zlib.h>
+#ifdef CONFIG_LZO
+#include <lzo/lzo1x.h>
+#endif
+#ifdef CONFIG_SNAPPY
+#include <snappy-c.h>
+#endif
+
 static uint16_t cpu_convert_to_target16(uint16_t val, int endian)
 {
     if (endian == ELFDATA2LSB) {
@@ -1140,6 +1148,256 @@ static void free_data_cache(DataCache *data_cache)
     g_free(data_cache->buf);
 }
 
+static size_t get_len_buf_out(size_t page_size, uint32_t flag_compress)
+{
+    size_t len_buf_out_zlib, len_buf_out_lzo, len_buf_out_snappy;
+    size_t len_buf_out;
+
+    /* init buf_out */
+    len_buf_out_zlib = len_buf_out_lzo = len_buf_out_snappy = 0;
+
+    /* buf size for zlib */
+    len_buf_out_zlib = compressBound(page_size);
+
+    /* buf size for lzo */
+#ifdef CONFIG_LZO
+    if (flag_compress & DUMP_DH_COMPRESSED_LZO) {
+        if (lzo_init() != LZO_E_OK) {
+            /* return 0 to indicate lzo is unavailable */
+            return 0;
+        }
+    }
+
+    /*
+     * LZO will expand incompressible data by a little amount. please check the
+     * following URL to see the expansion calculation:
+     * http://www.oberhumer.com/opensource/lzo/lzofaq.php
+     */
+    len_buf_out_lzo = page_size + page_size / 16 + 64 + 3;
+#endif
+
+#ifdef CONFIG_SNAPPY
+    /* buf size for snappy */
+    len_buf_out_snappy = snappy_max_compressed_length(page_size);
+#endif
+
+    /* get the biggest that can store all kinds of compressed page */
+    len_buf_out = MAX(len_buf_out_zlib,
+                      MAX(len_buf_out_lzo, len_buf_out_snappy));
+
+    return len_buf_out;
+}
+
+/*
+ * search memory blocks to find the exact place of the specified page, then
+ * dump the page into buf. memory should be read page by page, or it may exceed
+ * the boundary and fail to read
+ */
+static int readmem(void *bufptr, ram_addr_t addr, size_t size, DumpState *s)
+{
+    GuestPhysBlock *block;
+
+    QTAILQ_FOREACH(block, &s->guest_phys_blocks.head, next) {
+        if ((addr >= block->target_start) &&
+            (addr + size <= block->target_end)) {
+            memcpy(bufptr, block->host_addr + (addr - block->target_start),
+                    size);
+            return 0;
+        }
+    }
+
+    return -1;
+}
+
+/*
+ * check if the page is all 0
+ */
+static inline bool is_zero_page(unsigned char *buf, long page_size)
+{
+    return buffer_is_zero(buf, page_size);
+}
+
+static int write_dump_pages(DumpState *s)
+{
+    int ret = 0;
+    DataCache page_desc, page_data;
+    size_t len_buf_out, size_out;
+#ifdef CONFIG_LZO
+    lzo_bytep wrkmem = NULL;
+#endif
+    unsigned char *buf_out = NULL;
+    off_t offset_desc, offset_data;
+    PageDesc pd, pd_zero;
+    uint64_t pfn_start, pfn_end, pfn;
+    unsigned char buf[s->page_size];
+    MemoryMapping *memory_mapping;
+    bool zero_page;
+
+    prepare_data_cache(&page_desc, s);
+    prepare_data_cache(&page_data, s);
+
+    /* prepare buffer to store compressed data */
+    len_buf_out = get_len_buf_out(s->page_size, s->flag_compress);
+    if (len_buf_out == 0) {
+        dump_error(s, "dump: failed to get length of output buffer.\n");
+        goto out;
+    }
+
+#ifdef CONFIG_LZO
+    wrkmem = g_malloc(LZO1X_1_MEM_COMPRESS);
+#endif
+
+    buf_out = g_malloc(len_buf_out);
+
+    /* get offset of page_desc and page_data in dump file */
+    offset_desc = s->offset_page;
+    offset_data = offset_desc + sizeof(PageDesc) * s->num_dumpable;
+    page_desc.offset = offset_desc;
+    page_data.offset = offset_data;
+
+    /*
+     * init zero page's page_desc and page_data, because every zero page
+     * uses the same page_data
+     */
+    pd_zero.size = s->page_size;
+    pd_zero.flags = 0;
+    pd_zero.offset = offset_data;
+    pd_zero.page_flags = 0;
+    memset(buf, 0, pd_zero.size);
+    ret = write_cache(&page_data, s->flag_flatten, buf, pd_zero.size);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to write page data(zero page).\n");
+        goto out;
+    }
+
+    offset_data += pd_zero.size;
+
+    /* dump memory to vmcore page by page */
+    QTAILQ_FOREACH(memory_mapping, &s->list.head, next) {
+        pfn_start = paddr_to_pfn(memory_mapping->phys_addr, s->page_shift);
+        pfn_end = paddr_to_pfn(memory_mapping->phys_addr +
+                               memory_mapping->length, s->page_shift);
+
+        for (pfn = pfn_start; pfn < pfn_end; pfn++) {
+            memset(buf, 0, s->page_size);
+            ret = readmem(buf, pfn_to_paddr(pfn, s->page_shift), s->page_size,
+                          s);
+            if (ret < 0) {
+                dump_error(s, "dump: failed to read memory.\n");
+                goto out;
+            }
+
+            /* check zero page */
+            zero_page = is_zero_page(buf, s->page_size);
+            if (zero_page) {
+                ret = write_cache(&page_desc, s->flag_flatten, &pd_zero,
+                                  sizeof(PageDesc));
+                if (ret < 0) {
+                    dump_error(s, "dump: failed to write page desc.\n");
+                    goto out;
+                }
+            } else {
+                /*
+                 * not zero page, then:
+                 * 1. compress the page
+                 * 2. write the compressed page into the cache of page_data
+                 * 3. get page desc of the compressed page and write it into the
+                 *    cache of page_desc
+                 */
+                size_out = len_buf_out;
+                if ((s->flag_compress & DUMP_DH_COMPRESSED_ZLIB) &&
+                        (compress2(buf_out, &size_out, buf, s->page_size,
+                        Z_BEST_SPEED) == Z_OK) && (size_out < s->page_size)) {
+                    pd.flags = DUMP_DH_COMPRESSED_ZLIB;
+                    pd.size  = size_out;
+
+                    ret = write_cache(&page_data, s->flag_flatten, buf_out,
+                                      pd.size);
+                    if (ret < 0) {
+                        dump_error(s, "dump: failed to write page data.\n");
+                        goto out;
+                    }
+#ifdef CONFIG_LZO
+                } else if ((s->flag_compress & DUMP_DH_COMPRESSED_LZO) &&
+                        (lzo1x_1_compress(buf, s->page_size, buf_out,
+                        &size_out, wrkmem) == LZO_E_OK) &&
+                        (size_out < s->page_size)) {
+                    pd.flags = DUMP_DH_COMPRESSED_LZO;
+                    pd.size  = size_out;
+
+                    ret = write_cache(&page_data, s->flag_flatten, buf_out,
+                                      pd.size);
+                    if (ret < 0) {
+                        dump_error(s, "dump: failed to write page data.\n");
+                        goto out;
+                    }
+#endif
+#ifdef CONFIG_SNAPPY
+                } else if ((s->flag_compress & DUMP_DH_COMPRESSED_SNAPPY) &&
+                        (snappy_compress((char *)buf, s->page_size,
+                        (char *)buf_out, (size_t *)&size_out) == SNAPPY_OK) &&
+                        (size_out < s->page_size)) {
+                    pd.flags = DUMP_DH_COMPRESSED_SNAPPY;
+                    pd.size  = size_out;
+
+                    ret = write_cache(&page_data, s->flag_flatten, buf_out,
+                                      pd.size);
+                    if (ret < 0) {
+                        dump_error(s, "dump: failed to write page data.\n");
+                        goto out;
+                    }
+#endif
+                } else {
+                    pd.flags = 0;
+                    pd.size = s->page_size;
+
+                    ret = write_cache(&page_data, s->flag_flatten, buf,
+                                      pd.size);
+                    if (ret < 0) {
+                        dump_error(s, "dump: failed to write page data.\n");
+                        goto out;
+                    }
+                }
+
+                /* get and write page desc here */
+                pd.page_flags = 0;
+                pd.offset = offset_data;
+                offset_data += pd.size;
+
+                ret = write_cache(&page_desc, s->flag_flatten, &pd,
+                                  sizeof(PageDesc));
+                if (ret < 0) {
+                    dump_error(s, "dump: failed to write page desc.\n");
+                    goto out;
+                }
+            }
+        }
+    }
+
+    ret = sync_data_cache(&page_desc, s->flag_flatten);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to sync cache for page_desc.\n");
+        goto out;
+    }
+    ret = sync_data_cache(&page_data, s->flag_flatten);
+    if (ret < 0) {
+        dump_error(s, "dump: failed to sync cache for page_data.\n");
+        goto out;
+    }
+
+out:
+    free_data_cache(&page_desc);
+    free_data_cache(&page_data);
+
+#ifdef CONFIG_LZO
+    g_free(wrkmem);
+#endif
+
+    g_free(buf_out);
+
+    return ret;
+}
+
 static ram_addr_t get_start_block(DumpState *s)
 {
     GuestPhysBlock *block;
diff --git a/include/sysemu/dump.h b/include/sysemu/dump.h
index ab44af8..382a3c3 100644
--- a/include/sysemu/dump.h
+++ b/include/sysemu/dump.h
@@ -40,6 +40,8 @@
 
 #define paddr_to_pfn(X, page_shift) \
     (((unsigned long long)(X) >> (page_shift)) - ARCH_PFN_OFFSET)
+#define pfn_to_paddr(X, page_shift) \
+    (((unsigned long long)(X) + ARCH_PFN_OFFSET) << (page_shift))
 
 typedef struct ArchDumpInfo {
     int d_machine;  /* Architecture */
@@ -149,6 +151,13 @@ typedef struct DataCache {
     off_t offset;       /* offset of the file */
 } DataCache;
 
+typedef struct PageDesc {
+    uint64_t offset;                   /* the offset of the page data*/
+    uint32_t size;                  /* the size of this dump page */
+    uint32_t flags;                 /* flags */
+    uint64_t page_flags;            /* page flags */
+} PageDesc;
+
 struct GuestPhysBlockList; /* memory_mapping.h */
 int cpu_get_dump_info(ArchDumpInfo *info,
                       const struct GuestPhysBlockList *guest_phys_blocks);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v6 10/11] dump: Make kdump-compressed format available for 'dump-guest-memory'
  2014-01-05  7:27 [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
                   ` (8 preceding siblings ...)
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 09/11] dump: Add API to write dump pages Qiao Nuohan
@ 2014-01-05  7:27 ` Qiao Nuohan
  2014-01-09 15:46   ` Laszlo Ersek
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 11/11] Add 'query-dump-guest-memory-capability' command Qiao Nuohan
  2014-01-07  6:32 ` [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
  11 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-05  7:27 UTC (permalink / raw)
  To: stefanha, lcapitulino, afaerber, eblake
  Cc: qemu-devel, qiaonuohan, kumagai-atsushi, anderson, akong, lersek

Make monitor command 'dump-guest-memory' be able to dump in kdump-compressed
format. The command's usage:

  dump [-p] protocol [begin] [length] [format]

'format' is used to specified the format of vmcore and can be:
1. 'elf': ELF format, without compression
2. 'kdump-zlib': kdump-compressed format, with zlib-compressed
3. 'kdump-lzo': kdump-compressed format, with lzo-compressed
4. 'kdump-snappy': kdump-compressed format, with snappy-compressed
And without 'format' being set, it is same as 'elf'.

Note:
  1. The kdump-compressed format is readable only with the crash utility and
     makedumpfile, and it can be smaller than the ELF format because of the
     compression support.
  2. The kdump-compressed format is the 6th edition.

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
 dump.c           |  163 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
 hmp-commands.hx  |   12 +++-
 hmp.c            |   23 +++++++-
 qapi-schema.json |   22 +++++++-
 qmp-commands.hx  |    6 +-
 5 files changed, 212 insertions(+), 14 deletions(-)

diff --git a/dump.c b/dump.c
index 848957c..b4e79ff 100644
--- a/dump.c
+++ b/dump.c
@@ -1398,6 +1398,70 @@ out:
     return ret;
 }
 
+static int create_kdump_vmcore(DumpState *s)
+{
+    int ret;
+
+    /*
+     * the kdump-compressed format is:
+     *                                               File offset
+     *  +------------------------------------------+ 0x0
+     *  |    main header (struct disk_dump_header) |
+     *  |------------------------------------------+ block 1
+     *  |    sub header (struct kdump_sub_header)  |
+     *  |------------------------------------------+ block 2
+     *  |            1st-dump_bitmap               |
+     *  |------------------------------------------+ block 2 + X blocks
+     *  |            2nd-dump_bitmap               | (aligned by block)
+     *  |------------------------------------------+ block 2 + 2 * X blocks
+     *  |  page desc for pfn 0 (struct page_desc)  | (aligned by block)
+     *  |  page desc for pfn 1 (struct page_desc)  |
+     *  |                    :                     |
+     *  |  page desc for pfn Z (struct page_desc)  |
+     *  |------------------------------------------| (not aligned by block)
+     *  |         page data (pfn 0)                |
+     *  |         page data (pfn 1)                |
+     *  |                        :                 |
+     *  |         page data (pfn Z)                |
+     *  +------------------------------------------+ offset_eraseinfo
+     *  |                    :                     |
+     *  +------------------------------------------+
+     */
+
+    if (s->flag_flatten) {
+        ret = write_start_flat_header(s->fd);
+        if (ret < 0) {
+            return -1;
+        }
+    }
+
+    ret = write_dump_header(s);
+    if (ret < 0) {
+        return -1;
+    }
+
+    ret = write_dump_bitmap(s);
+    if (ret < 0) {
+        return -1;
+    }
+
+    ret = write_dump_pages(s);
+    if (ret < 0) {
+        return -1;
+    }
+
+    if (s->flag_flatten) {
+        ret = write_end_flat_header(s->fd);
+        if (ret < 0) {
+            return -1;
+        }
+    }
+
+    dump_completed(s);
+
+    return 0;
+}
+
 static ram_addr_t get_start_block(DumpState *s)
 {
     GuestPhysBlock *block;
@@ -1426,7 +1490,27 @@ static ram_addr_t get_start_block(DumpState *s)
     return -1;
 }
 
-static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
+static bool use_flatten_format(int fd)
+{
+    if (lseek(fd, 0, SEEK_SET) < 0) {
+        return true;
+    }
+
+    return false;
+}
+
+static void get_max_mapnr(DumpState *s)
+{
+    MemoryMapping *memory_mapping;
+
+    QTAILQ_FOREACH(memory_mapping, &s->list.head, next) {
+        s->max_mapnr = paddr_to_pfn(memory_mapping->phys_addr +
+                        memory_mapping->length, s->page_shift);
+    }
+}
+
+static int dump_init(DumpState *s, int fd, bool has_format,
+                     DumpGuestMemoryFormat format, bool paging, bool has_filter,
                      int64_t begin, int64_t length, Error **errp)
 {
     CPUState *cpu;
@@ -1494,6 +1578,44 @@ static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
         qemu_get_guest_simple_memory_mapping(&s->list, &s->guest_phys_blocks);
     }
 
+    /* init for kdump-compressed format */
+    if (has_format && format != DUMP_GUEST_MEMORY_FORMAT_ELF) {
+        switch (format) {
+        case DUMP_GUEST_MEMORY_FORMAT_KDUMP_ZLIB:
+            s->flag_compress = DUMP_DH_COMPRESSED_ZLIB;
+            break;
+
+        case DUMP_GUEST_MEMORY_FORMAT_KDUMP_LZO:
+            s->flag_compress = DUMP_DH_COMPRESSED_LZO;
+            break;
+
+        case DUMP_GUEST_MEMORY_FORMAT_KDUMP_SNAPPY:
+            s->flag_compress = DUMP_DH_COMPRESSED_SNAPPY;
+            break;
+
+        default:
+            s->flag_compress = 0;
+        }
+
+        /*
+         * check to see if fd is available to seek.
+         * if not, flatten format is used to avoid seek
+         */
+        s->flag_flatten = use_flatten_format(fd);
+
+        s->nr_cpus = nr_cpus;
+        s->page_size = PAGE_SIZE;
+        s->page_shift = ffs(s->page_size) - 1;
+
+        get_max_mapnr(s);
+
+        size_t tmp;
+        tmp = DIV_ROUND_UP(DIV_ROUND_UP(s->max_mapnr, CHAR_BIT), s->page_size);
+        s->len_dump_bitmap = tmp * s->page_size;
+
+        return 0;
+    }
+
     if (s->has_filter) {
         memory_mapping_filter(&s->list, s->begin, s->length);
     }
@@ -1553,8 +1675,9 @@ cleanup:
 }
 
 void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
-                           int64_t begin, bool has_length, int64_t length,
-                           Error **errp)
+                           int64_t begin, bool has_length,
+                           int64_t length, bool has_format,
+                           DumpGuestMemoryFormat format, Error **errp)
 {
     const char *p;
     int fd = -1;
@@ -1569,6 +1692,27 @@ void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
         error_set(errp, QERR_MISSING_PARAMETER, "begin");
         return;
     }
+    /* kdump-compressed format doesn't support paging or filter */
+    if ((has_format && format != DUMP_GUEST_MEMORY_FORMAT_ELF) &&
+        (paging || has_begin || has_length)) {
+        error_set(errp, QERR_INVALID_PARAMETER_COMBINATION);
+        return;
+    }
+
+    /* check whether lzo/snappy is supported */
+#ifndef CONFIG_LZO
+    if (format == DUMP_GUEST_MEMORY_FORMAT_KDUMP_LZO) {
+        error_set(errp, QERR_INVALID_PARAMETER_VALUE, "format",
+                  "supported format(kdump-lzo is not available now)");
+    }
+#endif
+
+#ifndef CONFIG_SNAPPY
+    if (format == DUMP_GUEST_MEMORY_FORMAT_KDUMP_SNAPPY) {
+        error_set(errp, QERR_INVALID_PARAMETER_VALUE, "format",
+                  "supported format(kdump-snappy is not available now)");
+    }
+#endif
 
 #if !defined(WIN32)
     if (strstart(file, "fd:", &p)) {
@@ -1594,14 +1738,21 @@ void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
 
     s = g_malloc0(sizeof(DumpState));
 
-    ret = dump_init(s, fd, paging, has_begin, begin, length, errp);
+    ret = dump_init(s, fd, has_format, format, paging, has_begin,
+                    begin, length, errp);
     if (ret < 0) {
         g_free(s);
         return;
     }
 
-    if (create_vmcore(s) < 0 && !error_is_set(s->errp)) {
-        error_set(errp, QERR_IO_ERROR);
+    if (has_format && format != DUMP_GUEST_MEMORY_FORMAT_ELF) {
+        if (create_kdump_vmcore(s) < 0 && !error_is_set(s->errp)) {
+            error_set(errp, QERR_IO_ERROR);
+        }
+    } else {
+        if (create_vmcore(s) < 0 && !error_is_set(s->errp)) {
+            error_set(errp, QERR_IO_ERROR);
+        }
     }
 
     g_free(s);
diff --git a/hmp-commands.hx b/hmp-commands.hx
index ebe8e78..3856bb4 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -993,17 +993,19 @@ ETEXI
 
     {
         .name       = "dump-guest-memory",
-        .args_type  = "paging:-p,filename:F,begin:i?,length:i?",
-        .params     = "[-p] filename [begin] [length]",
+        .args_type  = "paging:-p,filename:F,begin:i?,length:i?,format:s?",
+        .params     = "[-p] filename [begin] [length] [format]",
         .help       = "dump guest memory to file"
                       "\n\t\t\t begin(optional): the starting physical address"
-                      "\n\t\t\t length(optional): the memory size, in bytes",
+                      "\n\t\t\t length(optional): the memory size, in bytes"
+                      "\n\t\t\t format(optional): the format of guest memory dump,"
+                      "\n\t\t\t it can be elf|kdump-zlib|kdump-lzo|kdump-snappy",
         .mhandler.cmd = hmp_dump_guest_memory,
     },
 
 
 STEXI
-@item dump-guest-memory [-p] @var{protocol} @var{begin} @var{length}
+@item dump-guest-memory [-p] @var{protocol} @var{begin} @var{length} @var{format}
 @findex dump-guest-memory
 Dump guest memory to @var{protocol}. The file can be processed with crash or
 gdb.
@@ -1013,6 +1015,8 @@ gdb.
             specified with length together.
     length: the memory size, in bytes. It's optional, and should be specified
             with begin together.
+    format: the format of guest memory dump. It's optional, and can be
+            elf|kdump-zlib|kdump-lzo|kdump-snappy
 ETEXI
 
     {
diff --git a/hmp.c b/hmp.c
index 32ee285..9bd62b8 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1307,9 +1307,12 @@ void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict)
     const char *file = qdict_get_str(qdict, "filename");
     bool has_begin = qdict_haskey(qdict, "begin");
     bool has_length = qdict_haskey(qdict, "length");
+    bool has_format = qdict_haskey(qdict, "format");
     int64_t begin = 0;
     int64_t length = 0;
+    const char *format = NULL;
     char *prot;
+    enum DumpGuestMemoryFormat dump_format;
 
     if (has_begin) {
         begin = qdict_get_int(qdict, "begin");
@@ -1317,11 +1320,29 @@ void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict)
     if (has_length) {
         length = qdict_get_int(qdict, "length");
     }
+    if (has_format) {
+        format = qdict_get_str(qdict, "format");
+    }
+
+    if (strcmp(format, "elf") == 0) {
+        dump_format = DUMP_GUEST_MEMORY_FORMAT_ELF;
+    } else if (strcmp(format, "kdump-zlib") == 0) {
+        dump_format = DUMP_GUEST_MEMORY_FORMAT_KDUMP_ZLIB;
+    } else if (strcmp(format, "kdump-lzo") == 0) {
+        dump_format = DUMP_GUEST_MEMORY_FORMAT_KDUMP_LZO;
+    } else if (strcmp(format, "kdump-snappy") == 0) {
+        dump_format = DUMP_GUEST_MEMORY_FORMAT_KDUMP_SNAPPY;
+    } else {
+        error_set(&errp, QERR_INVALID_PARAMETER_VALUE,
+                  "format", "elf|kdump-zlib|kdump-lzo|kdump-snappy");
+        hmp_handle_error(mon, &errp);
+        return;
+    }
 
     prot = g_strconcat("file:", file, NULL);
 
     qmp_dump_guest_memory(paging, prot, has_begin, begin, has_length, length,
-                          &errp);
+                          has_format, dump_format, &errp);
     hmp_handle_error(mon, &errp);
     g_free(prot);
 }
diff --git a/qapi-schema.json b/qapi-schema.json
index c3c939c..19b2b23 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2676,6 +2676,24 @@
 { 'command': 'device_del', 'data': {'id': 'str'} }
 
 ##
+# @DumpGuestMemoryFormat:
+#
+# An enumeration of guest-memory-dump's format.
+#
+# @elf: elf format
+#
+# @kdump-zlib: kdump-compressed format with zlib-compressed
+#
+# @kdump-lzo: kdump-compressed format with zlib-compressed
+#
+# @kdump-snappy: kdump-compressed format with zlib-compressed
+#
+# Since: 1.8
+##
+{ 'enum': 'DumpGuestMemoryFormat',
+  'data': [ 'elf', 'kdump-zlib', 'kdump-lzo', 'kdump-snappy' ] }
+
+##
 # @dump-guest-memory
 #
 # Dump guest's memory to vmcore. It is a synchronous operation that can take
@@ -2711,13 +2729,15 @@
 #          want to dump all guest's memory, please specify the start @begin
 #          and @length
 #
+# @format: #optional if specified, the format of guest memory dump. (since 1.8)
+#
 # Returns: nothing on success
 #
 # Since: 1.2
 ##
 { 'command': 'dump-guest-memory',
   'data': { 'paging': 'bool', 'protocol': 'str', '*begin': 'int',
-            '*length': 'int' } }
+            '*length': 'int', '*format': 'DumpGuestMemoryFormat' } }
 
 ##
 # @netdev_add:
diff --git a/qmp-commands.hx b/qmp-commands.hx
index fba15cd..3de9e7d 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -791,8 +791,8 @@ EQMP
 
     {
         .name       = "dump-guest-memory",
-        .args_type  = "paging:b,protocol:s,begin:i?,end:i?",
-        .params     = "-p protocol [begin] [length]",
+        .args_type  = "paging:b,protocol:s,begin:i?,length:i?,format:s?",
+        .params     = "-p protocol [begin] [length] [format]",
         .help       = "dump guest memory to file",
         .user_print = monitor_user_noop,
         .mhandler.cmd_new = qmp_marshal_input_dump_guest_memory,
@@ -813,6 +813,8 @@ Arguments:
            with length together (json-int)
 - "length": the memory size, in bytes. It's optional, and should be specified
             with begin together (json-int)
+- "format": the format of guest memory dump. It's optional, and can be
+            elf|kdump-zlib|kdump-lzo|kdump-snappy (json-string)
 
 Example:
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v6 11/11] Add 'query-dump-guest-memory-capability' command
  2014-01-05  7:27 [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
                   ` (9 preceding siblings ...)
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 10/11] dump: Make kdump-compressed format available for 'dump-guest-memory' Qiao Nuohan
@ 2014-01-05  7:27 ` Qiao Nuohan
  2014-01-09 16:34   ` Laszlo Ersek
  2014-01-07  6:32 ` [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
  11 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-05  7:27 UTC (permalink / raw)
  To: stefanha, lcapitulino, afaerber, eblake
  Cc: qemu-devel, qiaonuohan, kumagai-atsushi, anderson, akong, lersek

'query-dump-guest-memory-capability' is used to query whether option 'format'
is available for 'dump-guest-memory' and the available format. The output
of the command will be like:

-> { "execute": "query-dump-guest-memory-capability" }
<- { "return": {
        "format-option": "optional",
        "capabilities": [
            {"available": true, "format": "elf"},
            {"available": true, "format": "kdump-zlib"},
            {"available": true, "format": "kdump-lzo"},
            {"available": true, "format": "kdump-snappy"}
        ]
    }

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
 dump.c           |   38 ++++++++++++++++++++++++++++++++++++++
 qapi-schema.json |   13 +++++++++++++
 qmp-commands.hx  |    7 +++++++
 3 files changed, 58 insertions(+), 0 deletions(-)

diff --git a/dump.c b/dump.c
index b4e79ff..51fe23a 100644
--- a/dump.c
+++ b/dump.c
@@ -1757,3 +1757,41 @@ void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
 
     g_free(s);
 }
+
+DumpGuestMemoryCapability *qmp_query_dump_guest_memory_capability(Error **errp)
+{
+    int i;
+    DumpGuestMemoryCapabilityStatusList *item;
+    DumpGuestMemoryCapability *cap =
+                                  g_malloc0(sizeof(DumpGuestMemoryCapability));
+
+    cap->format_option = g_strdup_printf("optional");
+
+    for (i = 0; i < DUMP_GUEST_MEMORY_FORMAT_MAX; i++) {
+        if (cap->capabilities == NULL) {
+            item = g_malloc0(sizeof(DumpGuestMemoryCapabilityStatusList));
+            cap->capabilities = item;
+        } else {
+            item->next = g_malloc0(sizeof(DumpGuestMemoryCapabilityStatusList));
+            item = item->next;
+        }
+
+        item->value = g_malloc0(sizeof(struct DumpGuestMemoryCapabilityStatus));
+        item->value->format = i;
+        item->value->available = true;
+
+        if (i == DUMP_GUEST_MEMORY_FORMAT_KDUMP_LZO) {
+#ifndef CONFIG_LZO
+            item->value->available = false;
+#endif
+        }
+
+        if (i == DUMP_GUEST_MEMORY_FORMAT_KDUMP_SNAPPY) {
+#ifndef CONFIG_SNAPPY
+            item->value->available = false;
+#endif
+        }
+    }
+
+    return cap;
+}
diff --git a/qapi-schema.json b/qapi-schema.json
index 19b2b23..e545e0f 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2740,6 +2740,19 @@
             '*length': 'int', '*format': 'DumpGuestMemoryFormat' } }
 
 ##
+# Since: 1.8
+##
+{ 'type':  'DumpGuestMemoryCapabilityStatus',
+  'data': { 'format': 'DumpGuestMemoryFormat', 'available': 'bool' } }
+
+{ 'type': 'DumpGuestMemoryCapability',
+  'data': {
+      'format-option': 'str',
+      'capabilities': ['DumpGuestMemoryCapabilityStatus'] } }
+
+{ 'command': 'query-dump-guest-memory-capability', 'returns': 'DumpGuestMemoryCapability' }
+
+##
 # @netdev_add:
 #
 # Add a network backend.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 3de9e7d..3379c75 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -828,6 +828,13 @@ Notes:
 EQMP
 
     {
+        .name       = "query-dump-guest-memory-capability",
+        .args_type  = "",
+        .mhandler.cmd_new = qmp_marshal_input_query_dump_guest_memory_capability,
+    },
+
+
+    {
         .name       = "netdev_add",
         .args_type  = "netdev:O",
         .mhandler.cmd_new = qmp_netdev_add,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 01/11] dump: Add argument to write_elfxx_notes
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 01/11] dump: Add argument to write_elfxx_notes Qiao Nuohan
@ 2014-01-06 17:03   ` Laszlo Ersek
  2014-01-07  6:00     ` Qiao Nuohan
  0 siblings, 1 reply; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-06 17:03 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

On 01/05/14 08:27, Qiao Nuohan wrote:
> write_elf32_notes/wirte_elf64_notes use fd_write_vmcore to write elf notes to
> vmcore. Adding parameter "WriteCoreDumpFunction f" makes it available to choose
> the method of writing elf notes
> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> ---
>  dump.c |   16 ++++++++--------
>  1 files changed, 8 insertions(+), 8 deletions(-)

I assume the direct calls to fd_write_vmcore() (which we're not
replacing here) will be substituted / abstracted later on in the series.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 02/11] dump: Add API to write header of flatten format
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 02/11] dump: Add API to write header of flatten format Qiao Nuohan
@ 2014-01-06 17:15   ` Laszlo Ersek
  0 siblings, 0 replies; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-06 17:15 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

On 01/05/14 08:27, Qiao Nuohan wrote:
> flatten format may be used when writing kdump-compressed format. The format is
> also used by makedumpfile, you can refer to the following URL to get more
> detailed information about flatten format of kdump-compressed format:
> http://sourceforge.net/projects/makedumpfile/
> 
> The two functions here are used to write start flat header and end flat header
> to vmcore, and they will be called later if flatten format is used.
> 
> struct MakedumpfileHeader stored at the head of vmcore is used to indicate the
> vmcore is in flatten format.
> 
> struct MakedumpfileHeader {
>     char signature[16];     /* = "makedumpfile" */
>     int64_t type;           /* = 1 */
>     int64_t version;        /* = 1 */
> };
> 
> And struct MakedumpfileDataHeader, with offset and buf_size set to -1, is used
> to indicate the end of vmcore in flatten format.
> 
> struct MakedumpfileDataHeader {
>     int64_t offset;         /* = -1 */
>     int64_t buf_size;       /* = -1 */
> };
> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> ---
>  dump.c                |   40 ++++++++++++++++++++++++++++++++++++++++
>  include/sysemu/dump.h |   17 +++++++++++++++++
>  2 files changed, 57 insertions(+), 0 deletions(-)
> 
> diff --git a/dump.c b/dump.c
> index 1fa12a2..89baeab 100644
> --- a/dump.c
> +++ b/dump.c
> @@ -686,6 +686,46 @@ static int create_vmcore(DumpState *s)
>      return 0;
>  }
>  
> +static int write_start_flat_header(int fd)
> +{
> +    char buf[MAX_SIZE_MDF_HEADER];
> +    MakedumpfileHeader mh;
> +
> +    memset(&mh, 0, sizeof(mh));
> +    strncpy(mh.signature, MAKEDUMPFILE_SIGNATURE,
> +            strlen(MAKEDUMPFILE_SIGNATURE));
> +
> +    mh.type = cpu_to_be64(TYPE_FLAT_HEADER);
> +    mh.version = cpu_to_be64(VERSION_FLAT_HEADER);
> +
> +    memset(buf, 0, sizeof(buf));
> +    memcpy(buf, &mh, sizeof(mh));
> +
> +    size_t written_size;
> +    written_size = qemu_write_full(fd, buf, sizeof(buf));
> +    if (written_size != sizeof(buf)) {
> +        return -1;
> +    }
> +
> +    return 0;
> +}

I might have coded this in different style (using a union etc), but
that's not your problem :)

> [...]

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 03/11] dump: Add API to write vmcore
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 03/11] dump: Add API to write vmcore Qiao Nuohan
@ 2014-01-06 18:12   ` Laszlo Ersek
  2014-01-07  6:15     ` Qiao Nuohan
  0 siblings, 1 reply; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-06 18:12 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

On 01/05/14 08:27, Qiao Nuohan wrote:
> Function is used to write vmcore. If flag_flatten is specified, flatten format
> will be used. In flatten format, data is written block by block in vmcore.
> struct MakedumpfileDataHeader is used to indicate the offset and size of a data
> block.
> 
> struct MakedumpfileDataHeader {
>     int64_t offset;
>     int64_t buf_size;
> };
> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> ---
>  dump.c |   28 ++++++++++++++++++++++++++++
>  1 files changed, 28 insertions(+), 0 deletions(-)
> 
> diff --git a/dump.c b/dump.c
> index 89baeab..764db39 100644
> --- a/dump.c
> +++ b/dump.c
> @@ -726,6 +726,34 @@ static int write_end_flat_header(int fd)
>      return 0;
>  }
>  
> +static int write_buffer(int fd, bool flag_flatten, off_t offset, void *buf,
> +                        size_t size)
> +{

You might have wanted to const-qualify "*buf" here, but it certainly
doesn't warrant a respin.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 04/11] dump: Add API to write elf notes to buffer
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 04/11] dump: Add API to write elf notes to buffer Qiao Nuohan
@ 2014-01-06 18:46   ` Laszlo Ersek
  2014-01-07  6:17     ` Qiao Nuohan
  0 siblings, 1 reply; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-06 18:46 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

some tangential comments:

On 01/05/14 08:27, Qiao Nuohan wrote:
> the function can be used by write_elf32_notes/write_elf64_notes to write notes
> to a buffer. If fd_write_vmcore is used, write_elf32_notes/write_elf64_notes
> will write elf notes to vmcore directly. Instead, if buf_write_note is used,
> elf notes will be written to opaque->note_buf at first.
> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> ---
>  dump.c |   19 +++++++++++++++++++
>  1 files changed, 19 insertions(+), 0 deletions(-)
> 
> diff --git a/dump.c b/dump.c
> index 764db39..3b9cf00 100644
> --- a/dump.c
> +++ b/dump.c
> @@ -76,6 +76,9 @@ typedef struct DumpState {
>      int64_t begin;
>      int64_t length;
>      Error **errp;
> +
> +    void *note_buf;
> +    size_t note_buf_offset;
>  } DumpState;
>  
>  static int dump_cleanup(DumpState *s)
> @@ -754,6 +757,22 @@ static int write_buffer(int fd, bool flag_flatten, off_t offset, void *buf,
>      return 0;
>  }
>  
> +static int buf_write_note(void *buf, size_t size, void *opaque)
> +{

"const void *buf" would have been more "elegant".

> +    DumpState *s = opaque;
> +
> +    /* note_buf is not enough */
> +    if (s->note_buf_offset + size > s->note_size) {
> +        return -1;
> +    }
> +
> +    memcpy(s->note_buf + s->note_buf_offset, buf, size);

Giving type "uint8_t" to "*note_buf" would have been preferable.
Addition to a pointer-to-void is a constraint violation in standard C
("... operand shall be a pointer to an object type ..."), ie. it's a gcc
extension here, but I guess we can live with it.

Using s->note_size as limit seems correct.

> +
> +    s->note_buf_offset += size;
> +
> +    return 0;
> +}
> +
>  static ram_addr_t get_start_block(DumpState *s)
>  {
>      GuestPhysBlock *block;
> 

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 05/11] dump: add support for lzo/snappy
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 05/11] dump: add support for lzo/snappy Qiao Nuohan
@ 2014-01-06 19:25   ` Laszlo Ersek
  2014-01-07  6:25     ` Qiao Nuohan
  0 siblings, 1 reply; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-06 19:25 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

On 01/05/14 08:27, Qiao Nuohan wrote:
> kdump-compressed format supports three compression format, zlib/lzo/snappy.
> Currently, only zlib is available. This patch is used to support lzo/snappy.
> '--enable-lzo/--enable-snappy' is needed to be specified with configure to make
> lzo/snappy available for qemu
> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> ---
>  configure |   50 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 50 insertions(+), 0 deletions(-)
> 
> diff --git a/configure b/configure
> index edfea95..b7a28b7 100755
> --- a/configure
> +++ b/configure
> @@ -245,6 +245,8 @@ libusb=""
>  usb_redir=""
>  glx=""
>  zlib="yes"
> +lzo="no"
> +snappy="no"
>  guest_agent=""
>  guest_agent_with_vss="no"
>  vss_win32_sdk=""
> @@ -947,6 +949,10 @@ for opt do
>    ;;
>    --disable-zlib-test) zlib="no"
>    ;;
> +  --enable-lzo) lzo="yes"
> +  ;;
> +  --enable-snappy) snappy="yes"
> +  ;;
>    --enable-guest-agent) guest_agent="yes"
>    ;;
>    --disable-guest-agent) guest_agent="no"
> @@ -1538,6 +1544,42 @@ fi
>  libs_softmmu="$libs_softmmu -lz"
>  
>  ##########################################
> +# lzo check
> +
> +if test "$lzo" != "no" ; then
> +    cat > $TMPC << EOF
> +#include <lzo/lzo1x.h>
> +int main(void) { lzo_version(); return 0; }
> +EOF
> +    if compile_prog "" "-llzo2" ; then
> +        :
> +    else
> +        error_exit "lzo check failed" \
> +            "Make sure to have the lzo libs and headers installed."
> +    fi
> +
> +    libs_softmmu="$libs_softmmu -llzo2"
> +fi
> +
> +##########################################
> +# snappy check
> +
> +if test "$snappy" != "no" ; then
> +    cat > $TMPC << EOF
> +#include <snappy-c.h>
> +int main(void) { snappy_max_compressed_length(4096); return 0; }
> +EOF
> +    if compile_prog "" "-lsnappy" ; then
> +        :
> +    else
> +        error_exit "snappy check failed" \
> +            "Make sure to have the snappy libs and headers installed."
> +    fi
> +
> +    libs_softmmu="$libs_softmmu -lsnappy"
> +fi
> +
> +##########################################
>  # libseccomp check
>  
>  if test "$seccomp" != "no" ; then
> @@ -4135,6 +4177,14 @@ if test "$glx" = "yes" ; then
>    echo "GLX_LIBS=$glx_libs" >> $config_host_mak
>  fi
>  
> +if test "$lzo" = "yes" ; then
> +  echo "CONFIG_LZO=y" >> $config_host_mak
> +fi
> +
> +if test "$snappy" = "yes" ; then
> +  echo "CONFIG_SNAPPY=y" >> $config_host_mak
> +fi
> +
>  if test "$libiscsi" = "yes" ; then
>    echo "CONFIG_LIBISCSI=y" >> $config_host_mak
>  fi
> 

You could have displayed the lzo / snappy settings along with the other
settings in the "big echo block". But it's not too important; if you
want you can add it later.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 01/11] dump: Add argument to write_elfxx_notes
  2014-01-06 17:03   ` Laszlo Ersek
@ 2014-01-07  6:00     ` Qiao Nuohan
  0 siblings, 0 replies; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-07  6:00 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

On 01/07/2014 01:03 AM, Laszlo Ersek wrote:
> I assume the direct calls to fd_write_vmcore() (which we're not
> replacing here) will be substituted / abstracted later on in the series.

Yes, patch 6 will reuse write_elf32_notes/wirte_elf64_notes, but write notes
into buffer instead of vmcore directly.

-- 
Regards
Qiao Nuohan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 03/11] dump: Add API to write vmcore
  2014-01-06 18:12   ` Laszlo Ersek
@ 2014-01-07  6:15     ` Qiao Nuohan
  0 siblings, 0 replies; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-07  6:15 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

On 01/07/2014 02:12 AM, Laszlo Ersek wrote:
>> @@ -726,6 +726,34 @@ static int write_end_flat_header(int fd)
>> >        return 0;
>> >    }
>> >
>> >  +static int write_buffer(int fd, bool flag_flatten, off_t offset, void *buf,
>> >  +                        size_t size)
>> >  +{
> You might have wanted to const-qualify "*buf" here, but it certainly
> doesn't warrant a respin.

Acked, I will reflect it in later version.

-- 
Regards
Qiao Nuohan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 04/11] dump: Add API to write elf notes to buffer
  2014-01-06 18:46   ` Laszlo Ersek
@ 2014-01-07  6:17     ` Qiao Nuohan
  0 siblings, 0 replies; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-07  6:17 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

On 01/07/2014 02:46 AM, Laszlo Ersek wrote:
>>   static int dump_cleanup(DumpState *s)
>> >  @@ -754,6 +757,22 @@ static int write_buffer(int fd, bool flag_flatten, off_t offset, void *buf,
>> >        return 0;
>> >    }
>> >
>> >  +static int buf_write_note(void *buf, size_t size, void *opaque)
>> >  +{
> "const void *buf" would have been more "elegant".
>
>> >  +    DumpState *s = opaque;
>> >  +
>> >  +    /* note_buf is not enough */
>> >  +    if (s->note_buf_offset + size>  s->note_size) {
>> >  +        return -1;
>> >  +    }
>> >  +
>> >  +    memcpy(s->note_buf + s->note_buf_offset, buf, size);
> Giving type "uint8_t" to "*note_buf" would have been preferable.
> Addition to a pointer-to-void is a constraint violation in standard C
> ("... operand shall be a pointer to an object type ..."), ie. it's a gcc
> extension here, but I guess we can live with it.
>
> Using s->note_size as limit seems correct.
>

Acked.

-- 
Regards
Qiao Nuohan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 05/11] dump: add support for lzo/snappy
  2014-01-06 19:25   ` Laszlo Ersek
@ 2014-01-07  6:25     ` Qiao Nuohan
  2014-01-07  7:24       ` Laszlo Ersek
  0 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-07  6:25 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

On 01/07/2014 03:25 AM, Laszlo Ersek wrote:
> You could have displayed the lzo / snappy settings along with the other
> settings in the "big echo block". But it's not too important; if you
> want you can add it later.

You mean the following part? Thanks for pointing it out.

<cut>
echo "Standard options:"
echo "  --help                   print this message"
echo "  --prefix=PREFIX          install in PREFIX [$prefix]"
echo "  --interp-prefix=PREFIX   where to find shared libraries, etc."
echo "                           use %M for cpu name [$interp_prefix]"
echo "  --target-list=LIST       set target list (default: build everything)"
echo "Available targets: $default_target_list" | \
     fold -s -w 53 | sed -e 's/^/                           /'
echo ""
echo "Advanced options (experts only):"
echo "  --source-path=PATH       path of source code [$source_path]"
echo "  --cross-prefix=PREFIX    use PREFIX for compile tools [$cross_prefix]"
echo "  --cc=CC                  use C compiler CC [$cc]"
...
<cut>

-- 
Regards
Qiao Nuohan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format
  2014-01-05  7:27 [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
                   ` (10 preceding siblings ...)
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 11/11] Add 'query-dump-guest-memory-capability' command Qiao Nuohan
@ 2014-01-07  6:32 ` Qiao Nuohan
  2014-01-07  7:27   ` Laszlo Ersek
  11 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-07  6:32 UTC (permalink / raw)
  To: lcapitulino, eblake, lersek
  Cc: stefanha, qemu-devel, Qiao Nuohan, kumagai-atsushi, anderson,
	akong, afaerber

Hello Eric, Luiz and Laszlo,

What do you think about my series? And I have add the light-weight
introspection in the last patch, do you have some comments on it?



On 01/05/2014 03:27 PM, Qiao Nuohan wrote:
> Hi, all
>
> The last version is here:
> http://lists.nongnu.org/archive/html/qemu-devel/2013-07/msg01376.html
>
> Command 'dump-guest-memory' was introduced to dump guest's memory. But the
> vmcore's format is only elf32 or elf64. The message is here:
> http://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03379.html
>
> Compared with migration, the missing of compression feature means regression
> to 'dump-guest-memory'. So we post these patches to make 'dump-guest-memory' be
> able to dump guest's in kdump-compressed format. Then vmcore can be much
> smaller, and easily to be delivered.
>
> The kdump-compressed format is *linux specific* *linux standard* crash dump
> format used in kdump framework. The kdump-compressed format is readable only
> with the crash utility, and it can be smaller than the ELF format because of
> the compression support. To get more detailed information about
> kdump-compressed format, please refer to the following URL:
> http://sourceforge.net/projects/makedumpfile/
>
> Note, similar to 'dump-guest-memory':
> 1. The guest should be x86 or x86_64. The other arch is not supported now.
> 2. If the OS is in the second kernel, gdb may not work well, and crash can
>     work by specifying '--machdep phys_addr=xxx' in the command line. The
>     reason is that the second kernel will update the page table, and we can
>     not get the page table for the first kernel.
> 3. The cpu's state is stored in QEMU note.
> 4. The vmcore are able to be compressed with zlib, lzo or snappy. zlib is
>     available by default, and option '--enable-lzo' or '--enable-snappy'
>     should be specified with 'configure' to make lzo or snappy available.
>
> Changelog:
> Changes from v5 to v6:
> 1. add run-time check for compression format(lzo/snappy)
> 2. address Stefan's comments about reusing code and coding style
> 3. update the version of kdump-compressed format to 6th
> 4. resplit the patches
> 5. Add 'query-dump-guest-memory-capability' command
>
> Changes from v4 to v5:
> 1. using flatten format to avoid using temporary files according to Stefan's
>     comments
> 2. Address Andreas's comments about coding style
>
> Changes from v3 to v4:
> 1. change to avoid conflict with Andreas's patches
> 2. rebase
>
> Changes from v2 to v3:
> 1. Address Eric's comment
>
> Changes from v1 to v2:
> 1. Address Eric&  Daniel's comment: fix manner of string copy.
> 2. Address Eric's comment: replace reinventing new constants by using the
>     ready-made ones accoring.
> 3. Address Andreas's comment: remove useless include.
>
> Qiao Nuohan (11):
>    dump: Add argument to write_elfxx_notes
>    dump: Add API to write header of flatten format
>    dump: Add API to write vmcore
>    dump: Add API to write elf notes to buffer
>    dump: add support for lzo/snappy
>    dump: add API to write dump header
>    dump: Add API to write dump_bitmap
>    dump: Add APIs to operate DataCache
>    dump: Add API to write dump pages
>    dump: Make kdump-compressed format available for 'dump-guest-memory'
>    Add 'query-dump-guest-memory-capability' command
>
>   configure             |   50 +++
>   dump.c                |  929 ++++++++++++++++++++++++++++++++++++++++++++++++-
>   hmp-commands.hx       |   12 +-
>   hmp.c                 |   23 ++-
>   include/sysemu/dump.h |  138 ++++++++
>   qapi-schema.json      |   35 ++-
>   qmp-commands.hx       |   13 +-
>   7 files changed, 1178 insertions(+), 22 deletions(-)
>
>


-- 
Regards
Qiao Nuohan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 05/11] dump: add support for lzo/snappy
  2014-01-07  6:25     ` Qiao Nuohan
@ 2014-01-07  7:24       ` Laszlo Ersek
  0 siblings, 0 replies; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-07  7:24 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

On 01/07/14 07:25, Qiao Nuohan wrote:
> On 01/07/2014 03:25 AM, Laszlo Ersek wrote:
>> You could have displayed the lzo / snappy settings along with the other
>> settings in the "big echo block". But it's not too important; if you
>> want you can add it later.
> 
> You mean the following part? Thanks for pointing it out.
> 
> <cut>
> echo "Standard options:"
> echo "  --help                   print this message"
> echo "  --prefix=PREFIX          install in PREFIX [$prefix]"
> echo "  --interp-prefix=PREFIX   where to find shared libraries, etc."
> echo "                           use %M for cpu name [$interp_prefix]"
> echo "  --target-list=LIST       set target list (default: build
> everything)"
> echo "Available targets: $default_target_list" | \
>     fold -s -w 53 | sed -e 's/^/                           /'
> echo ""
> echo "Advanced options (experts only):"
> echo "  --source-path=PATH       path of source code [$source_path]"
> echo "  --cross-prefix=PREFIX    use PREFIX for compile tools
> [$cross_prefix]"
> echo "  --cc=CC                  use C compiler CC [$cc]"
> ...
> <cut>
> 

This is the help text, I didn't mean that.

I meant this output, printed after configure finishes:

-------------------
Install prefix    /opt/qemu-installed
BIOS directory    /opt/qemu-installed/share/qemu
binary directory  /opt/qemu-installed/bin
[...]
libssh2 support   no
TPM passthrough   no
QOM debugging     yes
vhdx              yes
-------------------

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format
  2014-01-07  6:32 ` [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
@ 2014-01-07  7:27   ` Laszlo Ersek
  2014-01-07  7:30     ` Qiao Nuohan
  0 siblings, 1 reply; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-07  7:27 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, anderson, kumagai-atsushi,
	akong, afaerber

On 01/07/14 07:32, Qiao Nuohan wrote:
> Hello Eric, Luiz and Laszlo,
> 
> What do you think about my series? And I have add the light-weight
> introspection in the last patch, do you have some comments on it?

I haven't finished reviewing it yet, but thus far (up to & including
patch 05) I'm OK with it.

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format
  2014-01-07  7:27   ` Laszlo Ersek
@ 2014-01-07  7:30     ` Qiao Nuohan
  0 siblings, 0 replies; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-07  7:30 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: stefanha, qemu-devel, lcapitulino, anderson, kumagai-atsushi,
	akong, afaerber

On 01/07/2014 03:27 PM, Laszlo Ersek wrote:
> I haven't finished reviewing it yet, but thus far (up to&  including
> patch 05) I'm OK with it.

I see, thanks for your work.

-- 
Regards
Qiao Nuohan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 06/11] dump: add API to write dump header
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 06/11] dump: add API to write dump header Qiao Nuohan
@ 2014-01-07 11:38   ` Laszlo Ersek
  2014-01-07 11:49     ` Andreas Färber
  2014-01-13 10:03     ` Qiao Nuohan
  0 siblings, 2 replies; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-07 11:38 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

comments below

On 01/05/14 08:27, Qiao Nuohan wrote:
> the functions are used to write header of kdump-compressed format to vmcore.
> Header of kdump-compressed format includes:
> 1. common header: DiskDumpHeader32 / DiskDumpHeader64
> 2. sub header: KdumpSubHeader32 / KdumpSubHeader64
> 3. extra information: only elf notes here
> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> ---
>  dump.c                |  199 +++++++++++++++++++++++++++++++++++++++++++++++++
>  include/sysemu/dump.h |   96 ++++++++++++++++++++++++
>  2 files changed, 295 insertions(+), 0 deletions(-)
> 
> diff --git a/dump.c b/dump.c
> index 3b9cf00..e3623b9 100644
> --- a/dump.c
> +++ b/dump.c
> @@ -77,8 +77,16 @@ typedef struct DumpState {
>      int64_t length;
>      Error **errp;
>  
> +    bool flag_flatten;
> +    uint32_t nr_cpus;
> +    size_t page_size;
> +    uint64_t max_mapnr;
> +    size_t len_dump_bitmap;
>      void *note_buf;
>      size_t note_buf_offset;
> +    off_t offset_dump_bitmap;
> +    off_t offset_page;
> +    uint32_t flag_compress;
>  } DumpState;
>  
>  static int dump_cleanup(DumpState *s)
> @@ -773,6 +781,197 @@ static int buf_write_note(void *buf, size_t size, void *opaque)
>      return 0;
>  }
>  
> +/* write common header, sub header and elf note to vmcore */
> +static int create_header32(DumpState *s)
> +{
> +    int ret = 0;
> +    DiskDumpHeader32 *dh = NULL;
> +    KdumpSubHeader32 *kh = NULL;
> +    size_t size;
> +
> +    /* write common header, the version of kdump-compressed format is 5th */

I think this is a typo (it should say 6th, shouldn't it?), but it's not
critical.

> +    size = sizeof(DiskDumpHeader32);
> +    dh = g_malloc0(size);
> +
> +    strncpy(dh->signature, KDUMP_SIGNATURE, strlen(KDUMP_SIGNATURE));
> +    dh->header_version = 6;
> +    dh->block_size = s->page_size;
> +    dh->sub_hdr_size = sizeof(struct KdumpSubHeader32) + s->note_size;
> +    dh->sub_hdr_size = DIV_ROUND_UP(dh->sub_hdr_size, dh->block_size);
> +    /* dh->max_mapnr may be truncated, full 64bit is in kh.max_mapnr_64 */
> +    dh->max_mapnr = MIN(s->max_mapnr, UINT_MAX);

You could have simply converted / truncated "s->max_mapnr" to uint32_t
as part of the assignment, but the MIN(..., UINT_MAX) doesn't hurt either.

> +    dh->nr_cpus = s->nr_cpus;
> +    dh->bitmap_blocks = DIV_ROUND_UP(s->len_dump_bitmap, dh->block_size) * 2;
> +    memcpy(&(dh->utsname.machine), "i686", 4);
> +
> +    if (s->flag_compress & DUMP_DH_COMPRESSED_ZLIB) {
> +        dh->status |= DUMP_DH_COMPRESSED_ZLIB;
> +    }
> +#ifdef CONFIG_LZO
> +    if (s->flag_compress & DUMP_DH_COMPRESSED_LZO) {
> +        dh->status |= DUMP_DH_COMPRESSED_LZO;
> +    }
> +#endif
> +#ifdef CONFIG_SNAPPY
> +    if (s->flag_compress & DUMP_DH_COMPRESSED_SNAPPY) {
> +        dh->status |= DUMP_DH_COMPRESSED_SNAPPY;
> +    }
> +#endif
> +
> +    if (write_buffer(s->fd, s->flag_flatten, 0, dh, size) < 0) {
> +        ret = -1;
> +        goto out;
> +    }

The following fields in "dh" are left zero-filled:
- timestamp
- total_ram_blocks
- device_blocks
- written_blocks
- current_cpu

I guess we'll either overwrite them later or it's OK to leave them all
zeroed.

Also... is it OK to write these fields to the file in host native byte
order? What happens if an i686 / x86_64 target is emulated on a BE host?

> +
> +    /* write sub header */
> +    size = sizeof(KdumpSubHeader32);
> +    kh = g_malloc0(size);
> +
> +    /* 64bit max_mapnr_64 */
> +    kh->max_mapnr_64 = s->max_mapnr;
> +    kh->phys_base = PHYS_BASE;
> +    kh->dump_level = DUMP_LEVEL;
> +
> +    kh->offset_note = DISKDUMP_HEADER_BLOCKS * dh->block_size + size;
> +    kh->note_size = s->note_size;
> +
> +    if (write_buffer(s->fd, s->flag_flatten, dh->block_size, kh, size) < 0) {
> +        ret = -1;
> +        goto out;
> +    }

- Same question about endianness as above.

- Again, many fields left zeroed in "kh", but I guess that's OK.

- I would prefer if you repeated the multiplication by
DISKDUMP_HEADER_BLOCKS verbatim in the "offset" write_buffer() argument.

- When this write_buffer() is directed to a regular file in non-flat
mode, then the file might become sparse (you jump over a range of
offsets with lseek() in write_buffer()). If the output has been opened
by qemu itself (ie. "file:....", in qmp_dump_guest_memory()), then due
to the O_TRUNC we can't seek over preexistent data (and keep garbage in
the file). When libvirt pre-opens the file (to send over the fd later),
in doCoreDump(), it also passes O_TRUNC. OK.

> +
> +    /* write note */
> +    s->note_buf = g_malloc(s->note_size);
> +    s->note_buf_offset = 0;
> +
> +    /* use s->note_buf to store notes temporarily */
> +    if (write_elf32_notes(buf_write_note, s) < 0) {
> +        ret = -1;
> +        goto out;
> +    }
> +
> +    if (write_buffer(s->fd, s->flag_flatten, kh->offset_note, s->note_buf,
> +                     s->note_size) < 0) {
> +        ret = -1;
> +        goto out;
> +    }

Right before the write_buffer() call, we know that

  s->note_buf_offset <= s->note_size

because buf_write_note() ensures it.

We know even that

  s->note_buf_offset == s->note_size

there, because write_elf32_notes() produces exactly as many bytes as
we've calculated in advance, in cpu_get_note_size().

However, this is not very easy to see, hence I'd prefer adding *one* of
the following three:
- an assert() before write_buffer() that states the above equality, or
- passing "s->note_buf_offset" to write_buffer() as "size" argument, or
- allocating "s->note_buf" with g_malloc0().

(The 64-bit version below goes with the g_malloc0() choice, which BTW
makes the two versions a bit inconsistent currently.)

> +
> +    /* get offset of dump_bitmap */
> +    s->offset_dump_bitmap = (DISKDUMP_HEADER_BLOCKS + dh->sub_hdr_size) *
> +                             dh->block_size;

So, DISKDUMP_HEADER_BLOCKS covers "dh", and "dh->sub_hdr_size" covers
KdumpSubHeader32 plus the note. Seems OK.

> +
> +    /* get offset of page */
> +    s->offset_page = (DISKDUMP_HEADER_BLOCKS + dh->sub_hdr_size +
> +                      dh->bitmap_blocks) * dh->block_size;

Seems OK too. I guess we'll use these fields later.

Both of these multiplications are done in "unsigned int" (uint32_t)
before converting the product to "off_t".

Are you sure even the second one will fit? "dh->bitmap_blocks" is the
interesting addend. 1 byte in one bitmap covers to 8*4K = 32K bytes
guest RAM. We have two bitmaps. So, if we had close to 4G bytes in the
two bitmaps together (close to 2G bytes in each), we could cover close
to 64 TB guest RAM without overflowing the multiplication. Seems sufficient.


> +
> +out:
> +    g_free(dh);
> +    g_free(kh);
> +    g_free(s->note_buf);
> +
> +    return ret;
> +}
> +
> +/* write common header, sub header and elf note to vmcore */
> +static int create_header64(DumpState *s)
> +{
> +    int ret = 0;
> +    DiskDumpHeader64 *dh = NULL;
> +    KdumpSubHeader64 *kh = NULL;
> +    size_t size;
> +
> +    /* write common header, the version of kdump-compressed format is 5th */
> +    size = sizeof(DiskDumpHeader64);
> +    dh = g_malloc0(size);
> +
> +    strncpy(dh->signature, KDUMP_SIGNATURE, strlen(KDUMP_SIGNATURE));
> +    dh->header_version = 6;
> +    dh->block_size = s->page_size;
> +    dh->sub_hdr_size = sizeof(struct KdumpSubHeader64) + s->note_size;
> +    dh->sub_hdr_size = DIV_ROUND_UP(dh->sub_hdr_size, dh->block_size);
> +    /* dh->max_mapnr may be truncated, full 64bit is in kh.max_mapnr_64 */
> +    dh->max_mapnr = MIN(s->max_mapnr, UINT_MAX);
> +    dh->nr_cpus = s->nr_cpus;
> +    dh->bitmap_blocks = DIV_ROUND_UP(s->len_dump_bitmap, dh->block_size) * 2;
> +    memcpy(&(dh->utsname.machine), "x86_64", 6);
> +
> +    if (s->flag_compress & DUMP_DH_COMPRESSED_ZLIB) {
> +        dh->status |= DUMP_DH_COMPRESSED_ZLIB;
> +    }
> +#ifdef CONFIG_LZO
> +    if (s->flag_compress & DUMP_DH_COMPRESSED_LZO) {
> +        dh->status |= DUMP_DH_COMPRESSED_LZO;
> +    }
> +#endif
> +#ifdef CONFIG_SNAPPY
> +    if (s->flag_compress & DUMP_DH_COMPRESSED_SNAPPY) {
> +        dh->status |= DUMP_DH_COMPRESSED_SNAPPY;
> +    }
> +#endif
> +
> +    if (write_buffer(s->fd, s->flag_flatten, 0, dh, size) < 0) {
> +        ret = -1;
> +        goto out;
> +    }
> +
> +    /* write sub header */
> +    size = sizeof(KdumpSubHeader64);
> +    kh = g_malloc0(size);
> +
> +    /* 64bit max_mapnr_64 */
> +    kh->max_mapnr_64 = s->max_mapnr;
> +    kh->phys_base = PHYS_BASE;
> +    kh->dump_level = DUMP_LEVEL;
> +
> +    kh->offset_note = DISKDUMP_HEADER_BLOCKS * dh->block_size + size;
> +    kh->note_size = s->note_size;
> +
> +    if (write_buffer(s->fd, s->flag_flatten, dh->block_size, kh, size) < 0) {
> +        ret = -1;
> +        goto out;
> +    }
> +
> +    /* write note */
> +    s->note_buf = g_malloc0(s->note_size);
> +    s->note_buf_offset = 0;
> +
> +    /* use s->note_buf to store notes temporarily */
> +    if (write_elf64_notes(buf_write_note, s) < 0) {
> +        ret = -1;
> +        goto out;
> +    }
> +
> +    if (write_buffer(s->fd, s->flag_flatten, kh->offset_note, s->note_buf,
> +                     s->note_size) < 0) {
> +        ret = -1;
> +        goto out;
> +    }
> +
> +    /* get offset of dump_bitmap */
> +    s->offset_dump_bitmap = (DISKDUMP_HEADER_BLOCKS + dh->sub_hdr_size) *
> +                             dh->block_size;
> +
> +    /* get offset of page */
> +    s->offset_page = (DISKDUMP_HEADER_BLOCKS + dh->sub_hdr_size +
> +                      dh->bitmap_blocks) * dh->block_size;
> +
> +out:
> +    g_free(dh);
> +    g_free(kh);
> +    g_free(s->note_buf);
> +
> +    return ret;
> +}

I diffed this against the 32-bit version of the function, and the only
"suprising" difference is

-+    s->note_buf = g_malloc(s->note_size);
++    s->note_buf = g_malloc0(s->note_size);

which I already mentioned above.


I couldn't find anything in this patch that I'd call a direct bug. I
think you can address what you want from the above later too.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 06/11] dump: add API to write dump header
  2014-01-07 11:38   ` Laszlo Ersek
@ 2014-01-07 11:49     ` Andreas Färber
  2014-01-13 10:03     ` Qiao Nuohan
  1 sibling, 0 replies; 39+ messages in thread
From: Andreas Färber @ 2014-01-07 11:49 UTC (permalink / raw)
  To: Laszlo Ersek, Qiao Nuohan, Luiz Capitulino
  Cc: stefanha, qemu-devel, Alexander Graf, kumagai-atsushi, anderson, akong

Am 07.01.2014 12:38, schrieb Laszlo Ersek:
> Also... is it OK to write these fields to the file in host native byte
> order? What happens if an i686 / x86_64 target is emulated on a BE host?

For the target-s390x implementation Alex required to take care of
endianness for the s390x-on-x86 case, so it would probably make sense to
handle it for kdump format as well. Thanks for bringing it up!

BTW my original concerns related to CPU seem to have been addressed, so
I'm hoping this can go via the QMP queue once ready.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 07/11] dump: Add API to write dump_bitmap
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 07/11] dump: Add API to write dump_bitmap Qiao Nuohan
@ 2014-01-07 14:49   ` Laszlo Ersek
  2014-01-07 21:41     ` Laszlo Ersek
  0 siblings, 1 reply; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-07 14:49 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

comments below

On 01/05/14 08:27, Qiao Nuohan wrote:
> functions are used to write 1st and 2nd dump_bitmap of kdump-compressed format,
> which is used to indicate whether the corresponded page is existed in vmcore.
> 1st and 2nd dump_bitmap are same, because dump level is specified to 1 here.
> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> ---
>  dump.c                |  116 +++++++++++++++++++++++++++++++++++++++++++++++++
>  include/sysemu/dump.h |    7 +++
>  2 files changed, 123 insertions(+), 0 deletions(-)
> 
> diff --git a/dump.c b/dump.c
> index e3623b9..1fae152 100644
> --- a/dump.c
> +++ b/dump.c
> @@ -80,12 +80,14 @@ typedef struct DumpState {
>      bool flag_flatten;
>      uint32_t nr_cpus;
>      size_t page_size;
> +    uint32_t page_shift;
>      uint64_t max_mapnr;
>      size_t len_dump_bitmap;
>      void *note_buf;
>      size_t note_buf_offset;
>      off_t offset_dump_bitmap;
>      off_t offset_page;
> +    size_t num_dumpable;
>      uint32_t flag_compress;
>  } DumpState;
>  
> @@ -972,6 +974,120 @@ static int write_dump_header(DumpState *s)
>      }
>  }
>  
> +/* set dump_bitmap sequencely. bitmap is not allowed to be rewritten. */
> +static int set_dump_bitmap(int64_t last_pfn, int64_t pfn, uint32_t value,
> +                           void *buf, DumpState *s)
> +{

I'd recommend
- "uint8_t *buf", and
- "bool value", and
- maybe an assert() that neither of pfn and last_pfn is negative.

> +    off_t old_offset, new_offset;
> +    off_t offset_bitmap1, offset_bitmap2;
> +    uint32_t byte, bit;
> +
> +    /* should not set the previous place */
> +    if (last_pfn > pfn) {
> +        return -1;
> +    }
> +
> +    /*
> +     * if the bit needed to be set is not cached in buf, flush the data in buf
> +     * to vmcore firstly.
> +     * making new_offset be bigger than old_offset can also sync remained data
> +     * into vmcore.
> +     */
> +    old_offset = BUFSIZE_BITMAP * (last_pfn / PFN_BUFBITMAP);
> +    new_offset = BUFSIZE_BITMAP * (pfn / PFN_BUFBITMAP);
> +
> +    while (old_offset < new_offset) {
> +        /* calculate the offset and write dump_bitmap */
> +        offset_bitmap1 = s->offset_dump_bitmap + old_offset;
> +        if (write_buffer(s->fd, s->flag_flatten, offset_bitmap1, buf,
> +                         BUFSIZE_BITMAP) < 0) {
> +            return -1;
> +        }
> +
> +        /* dump level 1 is chosen, so 1st and 2nd bitmap are same */
> +        offset_bitmap2 = s->offset_dump_bitmap + s->len_dump_bitmap +
> +                         old_offset;
> +        if (write_buffer(s->fd, s->flag_flatten, offset_bitmap2, buf,
> +                         BUFSIZE_BITMAP) < 0) {
> +            return -1;
> +        }
> +
> +        memset(buf, 0, BUFSIZE_BITMAP);
> +        old_offset += BUFSIZE_BITMAP;
> +    }

Seems sane to me.

> +
> +    /* get the exact place of the bit in the buf, and set it */
> +    byte = (pfn % PFN_BUFBITMAP) >> 3;

(dividing by CHAR_BIT would be more consistent with the connection
between PFN_BUFBITMAP and BUFSIZE_BITMAP)

> +    bit = (pfn % PFN_BUFBITMAP) & 7;
> +    if (value) {
> +        ((char *)buf)[byte] |= 1<<bit;
> +    } else {
> +        ((char *)buf)[byte] &= ~(1<<bit);
> +    }

Shudder. I would much prefer (with "uint8_t *buf" included):

    if (value) {
        buf[byte] |=   1u << bit;
    } else {
        buf[byte] &= ~(1u << bit);
    }

When you interpret the expressions in question against the C standard,
my proposal is simpler to understand, and relies on fewer
platform-specifics. The current code looks simple, but it's actually
quite complex.

In any case, I think it will work in practice.


> +
> +    return 0;
> +}
> +
> +static int write_dump_bitmap(DumpState *s)
> +{
> +    int ret = 0;
> +    int64_t pfn_start, pfn_end, pfn;
> +    int64_t last_pfn;
> +    void *dump_bitmap_buf;
> +    size_t num_dumpable;
> +    MemoryMapping *memory_mapping;
> +
> +    /* dump_bitmap_buf is used to store dump_bitmap temporarily */
> +    dump_bitmap_buf = g_malloc0(BUFSIZE_BITMAP);
> +
> +    num_dumpable = 0;
> +    last_pfn = -1;

This would trip up the assert() that I proposed above. It also exploits
that (last_pfn == -1) will result in (old_offset = 0) in
set_dump_bitmap(), due to the division. I think it's fairly ugly, but
should work in practice.

> +
> +    /*
> +     * exam memory page by page, and set the bit in dump_bitmap corresponded
> +     * to the existing page
> +     */
> +    QTAILQ_FOREACH(memory_mapping, &s->list.head, next) {
> +        pfn_start = paddr_to_pfn(memory_mapping->phys_addr, s->page_shift);
> +        pfn_end = paddr_to_pfn(memory_mapping->phys_addr +
> +                               memory_mapping->length, s->page_shift);

OK, the intent seems to be to make "pfn_end" exclusive (see the loop
below). That however depends on "memory_mapping->length" being an
integral multiple of the target page size.

I propose an assert() here for that reason. Looking at patch v6 10/11,
for a compressed dump both paging and filtering are rejected. This
implies that in dump_init(),
- qemu_get_guest_simple_memory_mapping() is called -- ie. the memory
mapping list will directly reflect the guest-phys blocks,
- memory_mapping_filter() will *not* be called.

These should ensure that pfn_end is exclusive here (and that "phys_addr"
falls to the beginning of a page) , but an assert() with a comment would
help.

> +
> +        for (pfn = pfn_start; pfn < pfn_end; pfn++) {
> +            ret = set_dump_bitmap(last_pfn, pfn, 1, dump_bitmap_buf, s);

(1 --> "true" after replacing that param type with bool)

> +            if (ret < 0) {
> +                dump_error(s, "dump: failed to set dump_bitmap.\n");

Alas, dump_error() ignores the reason, but maybe we can fix that at a
different time.

> +                ret = -1;
> +                goto out;
> +            }
> +
> +            last_pfn = pfn;
> +            num_dumpable++;
> +        }
> +    }
> +
> +    /*
> +     * last_pfn > -1 means bitmap is set, then remained data in buf should be
> +     * synchronized into vmcore
> +     */

As far as I can see, (num_dumpable > 0) could work just as well, and it
would eliminate the super-ugly (last_pfn == -1) case.

> +    if (last_pfn > -1) {
> +        ret = set_dump_bitmap(last_pfn, last_pfn + PFN_BUFBITMAP, 0,
> +                              dump_bitmap_buf, s);

Results in

  new_offset == old_offset + BUFSIZE_BITMAP

in set_dump_bitmap(), that is, one iteration of the loop. It seems
correct to call this if we have set at least one bit, because
set_dump_bitmap() *always* leaves the most recently set bit un-synced.


> +        if (ret < 0) {
> +            dump_error(s, "dump: failed to sync dump_bitmap.\n");
> +            ret = -1;
> +            goto out;
> +        }
> +    }
> +
> +    /* number of dumpable pages that will be dumped later */
> +    s->num_dumpable = num_dumpable;
> +
> +out:
> +    g_free(dump_bitmap_buf);
> +
> +    return ret;
> +}
> +
>  static ram_addr_t get_start_block(DumpState *s)
>  {
>      GuestPhysBlock *block;
> diff --git a/include/sysemu/dump.h b/include/sysemu/dump.h
> index 9e47b4c..b5eaf8d 100644
> --- a/include/sysemu/dump.h
> +++ b/include/sysemu/dump.h
> @@ -27,11 +27,18 @@
>  #define DUMP_DH_COMPRESSED_LZO      (0x2)
>  #define DUMP_DH_COMPRESSED_SNAPPY   (0x4)
>  
> +#define PAGE_SIZE                   (4096)
>  #define KDUMP_SIGNATURE             "KDUMP   "
>  #define SIG_LEN                     (sizeof(KDUMP_SIGNATURE) - 1)
>  #define PHYS_BASE                   (0)
>  #define DUMP_LEVEL                  (1)
>  #define DISKDUMP_HEADER_BLOCKS      (1)
> +#define BUFSIZE_BITMAP              (PAGE_SIZE)
> +#define PFN_BUFBITMAP               (CHAR_BIT * BUFSIZE_BITMAP)
> +#define ARCH_PFN_OFFSET             (0)
> +
> +#define paddr_to_pfn(X, page_shift) \
> +    (((unsigned long long)(X) >> (page_shift)) - ARCH_PFN_OFFSET)
>  
>  typedef struct ArchDumpInfo {
>      int d_machine;  /* Architecture */
> 

I think these magic constants are somewhat tied to x86, and therefore
should be in an arch-specific file rather than a common file, but
whoever wants to extend this to another architecture can do that.

I think I haven't found anything that I'd call a bug.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 08/11] dump: Add APIs to operate DataCache
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 08/11] dump: Add APIs to operate DataCache Qiao Nuohan
@ 2014-01-07 15:22   ` Laszlo Ersek
  0 siblings, 0 replies; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-07 15:22 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

comments below

On 01/05/14 08:27, Qiao Nuohan wrote:
> DataCache is used to store data temporarily, then the data will be written to
> vmcore. These functions will be called later when writing data of page to
> vmcore.
> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> ---
>  dump.c                |   52 +++++++++++++++++++++++++++++++++++++++++++++++++
>  include/sysemu/dump.h |    9 ++++++++
>  2 files changed, 61 insertions(+), 0 deletions(-)
> 
> diff --git a/dump.c b/dump.c
> index 1fae152..b4c40f2 100644
> --- a/dump.c
> +++ b/dump.c
> @@ -1088,6 +1088,58 @@ out:
>      return ret;
>  }
>  
> +static void prepare_data_cache(DataCache *data_cache, DumpState *s)
> +{
> +    data_cache->fd = s->fd;
> +    data_cache->data_size = 0;
> +    data_cache->buf_size = BUFSIZE_DATA_CACHE;
> +    data_cache->buf = g_malloc0(BUFSIZE_DATA_CACHE);
> +}
> +
> +static int write_cache(DataCache *dc, bool flag_flatten, void *buf, size_t size)
> +{
> +    /*
> +     * check if the space is enough to cache data, if not, write the cached
> +     * data to dc->fd and reset the buf
> +     */
> +    if (dc->data_size + size > dc->buf_size) {
> +        if (write_buffer(dc->fd, flag_flatten, dc->offset, dc->buf,
> +                         dc->data_size) < 0) {
> +            return -1;
> +        }
> +
> +        dc->offset += dc->data_size;
> +        dc->data_size = 0;
> +    }
> +
> +    memcpy(dc->buf + dc->data_size, buf, size);
> +    dc->data_size += size;
> +
> +    return 0;
> +}

I think we should at least add a check ("if" or assert()) because the
passed-in "size" could be greater than all of the room we have in the
buffer (ie. size > dc->buf_size), and in that case we'd overflow the buffer.

Then,

> +
> +/* write the remaining data in dc->buf to dc->fd */
> +static int sync_data_cache(DataCache *dc, bool flag_flatten)
> +{
> +    if (dc->data_size == 0) {
> +        return 0;
> +    }
> +
> +    if (write_buffer(dc->fd, flag_flatten, dc->offset, dc->buf,
> +                     dc->data_size) < 0) {
> +        return -1;
> +    }
> +
> +    dc->offset += dc->data_size;
> +
> +    return 0;
> +}

Incrementing the offset here, but not resetting dc->data_size, seems
inconsistent. Do both or neither. Ideally, do both, and rebase
write_cache() on top of this (ie. when syncing is necessary, call
sync_data_cache() from write_cache()).

It doesn't cause problems as-is in the current patchset though.


> +
> +static void free_data_cache(DataCache *data_cache)
> +{
> +    g_free(data_cache->buf);
> +}
> +
>  static ram_addr_t get_start_block(DumpState *s)
>  {
>      GuestPhysBlock *block;
> diff --git a/include/sysemu/dump.h b/include/sysemu/dump.h
> index b5eaf8d..ab44af8 100644
> --- a/include/sysemu/dump.h
> +++ b/include/sysemu/dump.h
> @@ -36,6 +36,7 @@
>  #define BUFSIZE_BITMAP              (PAGE_SIZE)
>  #define PFN_BUFBITMAP               (CHAR_BIT * BUFSIZE_BITMAP)
>  #define ARCH_PFN_OFFSET             (0)
> +#define BUFSIZE_DATA_CACHE          (PAGE_SIZE * 4)
>  
>  #define paddr_to_pfn(X, page_shift) \
>      (((unsigned long long)(X) >> (page_shift)) - ARCH_PFN_OFFSET)
> @@ -140,6 +141,14 @@ typedef struct QEMU_PACKED KdumpSubHeader64 {
>      uint64_t max_mapnr_64;          /* header_version 6 and later */
>  } KdumpSubHeader64;
>  
> +typedef struct DataCache {
> +    int fd;             /* fd of the file where to write the cached data */
> +    char *buf;          /* buffer for cached data */
> +    size_t buf_size;    /* size of the buf */
> +    size_t data_size;   /* size of cached data in buf */
> +    off_t offset;       /* offset of the file */
> +} DataCache;
> +
>  struct GuestPhysBlockList; /* memory_mapping.h */
>  int cpu_get_dump_info(ArchDumpInfo *info,
>                        const struct GuestPhysBlockList *guest_phys_blocks);
> 

I feel that stuff that depends on page size should be centralized
somehow. I can't describe it very well now, but I feel that having a
bunch of macros that open-code the page size as 4096, and using struct
members elsewhere (with dynamically set values) for the same purpose, is
a mess.

However that could be refactored in a separate series, *if* you think it
would be worthwhile.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 07/11] dump: Add API to write dump_bitmap
  2014-01-07 14:49   ` Laszlo Ersek
@ 2014-01-07 21:41     ` Laszlo Ersek
  0 siblings, 0 replies; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-07 21:41 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, anderson, kumagai-atsushi,
	akong, afaerber

On 01/07/14 15:49, Laszlo Ersek wrote:
> 
> On 01/05/14 08:27, Qiao Nuohan wrote:

>> diff --git a/include/sysemu/dump.h b/include/sysemu/dump.h
>> index 9e47b4c..b5eaf8d 100644
>> --- a/include/sysemu/dump.h
>> +++ b/include/sysemu/dump.h
>> @@ -27,11 +27,18 @@
>>  #define DUMP_DH_COMPRESSED_LZO      (0x2)
>>  #define DUMP_DH_COMPRESSED_SNAPPY   (0x4)
>>  
>> +#define PAGE_SIZE                   (4096)
>>  #define KDUMP_SIGNATURE             "KDUMP   "
>>  #define SIG_LEN                     (sizeof(KDUMP_SIGNATURE) - 1)
>>  #define PHYS_BASE                   (0)
>>  #define DUMP_LEVEL                  (1)
>>  #define DISKDUMP_HEADER_BLOCKS      (1)
>> +#define BUFSIZE_BITMAP              (PAGE_SIZE)
>> +#define PFN_BUFBITMAP               (CHAR_BIT * BUFSIZE_BITMAP)
>> +#define ARCH_PFN_OFFSET             (0)
>> +
>> +#define paddr_to_pfn(X, page_shift) \
>> +    (((unsigned long long)(X) >> (page_shift)) - ARCH_PFN_OFFSET)
>>  
>>  typedef struct ArchDumpInfo {
>>      int d_machine;  /* Architecture */
>>
> 
> I think these magic constants are somewhat tied to x86, and therefore
> should be in an arch-specific file rather than a common file, but
> whoever wants to extend this to another architecture can do that.

Stressing the argument a bit more for PAGE_SIZE specifically:
- we already have TARGET_PAGE_SIZE, maybe that would be a better choice,
- PAGE_SIZE is defined *as* TARGET_PAGE_SIZE in kvm-all.c. There's no
actual conflict, but the mental conflict is bad enough.

Anyway my R-b stands.

Laszlo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 09/11] dump: Add API to write dump pages
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 09/11] dump: Add API to write dump pages Qiao Nuohan
@ 2014-01-07 22:37   ` Laszlo Ersek
  2014-01-07 23:12     ` Eric Blake
  0 siblings, 1 reply; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-07 22:37 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

comments below

On 01/05/14 08:27, Qiao Nuohan wrote:
> functions are used to write page to vmcore. vmcore is written page by page.
> page desc is used to store the information of a page, including a page's size,
> offset, compression format, etc.
> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> ---
>  dump.c                |  258 +++++++++++++++++++++++++++++++++++++++++++++++++
>  include/sysemu/dump.h |    9 ++
>  2 files changed, 267 insertions(+), 0 deletions(-)
> 
> diff --git a/dump.c b/dump.c
> index b4c40f2..848957c 100644
> --- a/dump.c
> +++ b/dump.c
> @@ -25,6 +25,14 @@
>  #include "qapi/error.h"
>  #include "qmp-commands.h"
>  
> +#include <zlib.h>
> +#ifdef CONFIG_LZO
> +#include <lzo/lzo1x.h>
> +#endif
> +#ifdef CONFIG_SNAPPY
> +#include <snappy-c.h>
> +#endif
> +
>  static uint16_t cpu_convert_to_target16(uint16_t val, int endian)
>  {
>      if (endian == ELFDATA2LSB) {
> @@ -1140,6 +1148,256 @@ static void free_data_cache(DataCache *data_cache)
>      g_free(data_cache->buf);
>  }
>  
> +static size_t get_len_buf_out(size_t page_size, uint32_t flag_compress)
> +{
> +    size_t len_buf_out_zlib, len_buf_out_lzo, len_buf_out_snappy;
> +    size_t len_buf_out;
> +
> +    /* init buf_out */
> +    len_buf_out_zlib = len_buf_out_lzo = len_buf_out_snappy = 0;
> +
> +    /* buf size for zlib */
> +    len_buf_out_zlib = compressBound(page_size);
> +
> +    /* buf size for lzo */
> +#ifdef CONFIG_LZO
> +    if (flag_compress & DUMP_DH_COMPRESSED_LZO) {
> +        if (lzo_init() != LZO_E_OK) {
> +            /* return 0 to indicate lzo is unavailable */
> +            return 0;
> +        }
> +    }
> +
> +    /*
> +     * LZO will expand incompressible data by a little amount. please check the
> +     * following URL to see the expansion calculation:
> +     * http://www.oberhumer.com/opensource/lzo/lzofaq.php
> +     */
> +    len_buf_out_lzo = page_size + page_size / 16 + 64 + 3;
> +#endif
> +
> +#ifdef CONFIG_SNAPPY
> +    /* buf size for snappy */
> +    len_buf_out_snappy = snappy_max_compressed_length(page_size);
> +#endif
> +
> +    /* get the biggest that can store all kinds of compressed page */
> +    len_buf_out = MAX(len_buf_out_zlib,
> +                      MAX(len_buf_out_lzo, len_buf_out_snappy));
> +
> +    return len_buf_out;
> +}

seems OK

> +
> +/*
> + * search memory blocks to find the exact place of the specified page, then
> + * dump the page into buf. memory should be read page by page, or it may exceed
> + * the boundary and fail to read
> + */
> +static int readmem(void *bufptr, ram_addr_t addr, size_t size, DumpState *s)

Apologies, but I think this does warrant a respin, even though what I'm
going to say is kind of ridiculous...

The type of "addr" needs to be "hwaddr" here, not "ram_addr_t". The
latter is for subscripting RAMBlocks. The former (hwaddr) is what you
need when poking in guest-phys address space. See also "2.1. Scalars" in
HACKING.

I'm not worried about overflowing ram_addr_t or anything like that, it's
about the message ram_addr_t sends -- it's precisely the wrong type
here. (I guess ram_addr_t here could be a remnant from an early version
of your series, one that predated the introduction of guest_phys_blocks
in dump.)

> +{
> +    GuestPhysBlock *block;
> +
> +    QTAILQ_FOREACH(block, &s->guest_phys_blocks.head, next) {
> +        if ((addr >= block->target_start) &&
> +            (addr + size <= block->target_end)) {
> +            memcpy(bufptr, block->host_addr + (addr - block->target_start),
> +                    size);
> +            return 0;
> +        }
> +    }
> +
> +    return -1;
> +}

I also wonder if you could speed this up by returning a pointer instead
of copying out the data. But that's tangential.

> +
> +/*
> + * check if the page is all 0
> + */
> +static inline bool is_zero_page(unsigned char *buf, long page_size)
> +{
> +    return buffer_is_zero(buf, page_size);
> +}

nice (except "page_size" should have type size_t, and *buf should be
const-qualified)

> +
> +static int write_dump_pages(DumpState *s)
> +{
> +    int ret = 0;
> +    DataCache page_desc, page_data;
> +    size_t len_buf_out, size_out;
> +#ifdef CONFIG_LZO
> +    lzo_bytep wrkmem = NULL;
> +#endif
> +    unsigned char *buf_out = NULL;
> +    off_t offset_desc, offset_data;
> +    PageDesc pd, pd_zero;
> +    uint64_t pfn_start, pfn_end, pfn;
> +    unsigned char buf[s->page_size];

Whoa, a VLA! :) I believe it's *very* non-idiomatic in the qemu source.
Please consider allocating it dynamically. (Of course others might point
out that I'm wrong.)

Even better, if readmem() could return a pointer into host memory
(rather than a deep copy), then you could drop "buf" completely (as an
array).

> +    MemoryMapping *memory_mapping;
> +    bool zero_page;
> +
> +    prepare_data_cache(&page_desc, s);
> +    prepare_data_cache(&page_data, s);
> +
> +    /* prepare buffer to store compressed data */
> +    len_buf_out = get_len_buf_out(s->page_size, s->flag_compress);
> +    if (len_buf_out == 0) {
> +        dump_error(s, "dump: failed to get length of output buffer.\n");
> +        goto out;
> +    }
> +
> +#ifdef CONFIG_LZO
> +    wrkmem = g_malloc(LZO1X_1_MEM_COMPRESS);
> +#endif

This could be further restricted with DUMP_DH_COMPRESSED_LZO.

> +
> +    buf_out = g_malloc(len_buf_out);
> +
> +    /* get offset of page_desc and page_data in dump file */
> +    offset_desc = s->offset_page;
> +    offset_data = offset_desc + sizeof(PageDesc) * s->num_dumpable;
> +    page_desc.offset = offset_desc;
> +    page_data.offset = offset_data;

I think that these offset initializations should happen inside
prepare_data_cache(). prepare_data_cache() should take the initial value
of "dc->offset" as a parameter. Sorry for not noticing it in the earlier
patch.

> +
> +    /*
> +     * init zero page's page_desc and page_data, because every zero page
> +     * uses the same page_data
> +     */
> +    pd_zero.size = s->page_size;
> +    pd_zero.flags = 0;
> +    pd_zero.offset = offset_data;
> +    pd_zero.page_flags = 0;

(cpu_to_le?)

> +    memset(buf, 0, pd_zero.size);
> +    ret = write_cache(&page_data, s->flag_flatten, buf, pd_zero.size);
> +    if (ret < 0) {
> +        dump_error(s, "dump: failed to write page data(zero page).\n");
> +        goto out;
> +    }

We're producing a zero page (to be referenced by zero, one, or more,
page descs). OK.

Also, I understand now how write_cache() + write_buffer() will operate.
For a regular (seekable) file, the caching saves a number of seeks, but
still the file is written through two parallel "offset streams". For a
flattened file, the cache size has a more visible consequence, namely
the number of MakedumpfileDataHeader's produced. OK.

pd_zero.size should be fine for write_cache(), but it's kind of hard to see:
- s->page_size is set to PAGE_SIZE in 10/11,
- data_cache->buf_size is set to BUFSIZE_DATA_CACHE (== PAGE_SIZE * 4)
in prepare_data_cache().

Hence my suggestion to add an assert to write_cache(), with a comment.

> +
> +    offset_data += pd_zero.size;
> +
> +    /* dump memory to vmcore page by page */
> +    QTAILQ_FOREACH(memory_mapping, &s->list.head, next) {
> +        pfn_start = paddr_to_pfn(memory_mapping->phys_addr, s->page_shift);
> +        pfn_end = paddr_to_pfn(memory_mapping->phys_addr +
> +                               memory_mapping->length, s->page_shift);

Again I think we're exploiting that both paging and filtering modes are
disabled for compressed dumps. (Same as in patch v6 07/11.) As I said
there, this seems to cause the memory mapping list to match the
guest_phys_block list identically. In that case though:

> +
> +        for (pfn = pfn_start; pfn < pfn_end; pfn++) {
> +            memset(buf, 0, s->page_size);
> +            ret = readmem(buf, pfn_to_paddr(pfn, s->page_shift), s->page_size,
> +                          s);

This nested loop seems a bit wasteful -- we iterate over the pages in
the guest-phys blocks, and we look up each page again in the list of
guest-phys blocks. I think these two loops could be folded into one, but
I'm not suggesting it at v6 :) The current code does appear OK.

(Also, if this loop is changed, then all other similar loops should be
synchronized with it in the series, including the one in patch 7.)


> +            if (ret < 0) {
> +                dump_error(s, "dump: failed to read memory.\n");
> +                goto out;
> +            }

Given this error check, I think at least the memset() before readmem()
is unnecessary.

> +
> +            /* check zero page */
> +            zero_page = is_zero_page(buf, s->page_size);
> +            if (zero_page) {

(The bool is never reused, I think you could simply call is_zero_page()
inside the "if".)

> +                ret = write_cache(&page_desc, s->flag_flatten, &pd_zero,
> +                                  sizeof(PageDesc));
> +                if (ret < 0) {
> +                    dump_error(s, "dump: failed to write page desc.\n");
> +                    goto out;
> +                }
> +            } else {
> +                /*
> +                 * not zero page, then:
> +                 * 1. compress the page
> +                 * 2. write the compressed page into the cache of page_data
> +                 * 3. get page desc of the compressed page and write it into the
> +                 *    cache of page_desc
> +                 */
> +                size_out = len_buf_out;
> +                if ((s->flag_compress & DUMP_DH_COMPRESSED_ZLIB) &&
> +                        (compress2(buf_out, &size_out, buf, s->page_size,
> +                        Z_BEST_SPEED) == Z_OK) && (size_out < s->page_size)) {
> +                    pd.flags = DUMP_DH_COMPRESSED_ZLIB;
> +                    pd.size  = size_out;

(cpu_to_le?)

> +
> +                    ret = write_cache(&page_data, s->flag_flatten, buf_out,
> +                                      pd.size);
> +                    if (ret < 0) {
> +                        dump_error(s, "dump: failed to write page data.\n");
> +                        goto out;
> +                    }

OK I believe I understand the strategy. We have a priority list of
compression algorithms. For each page, the first one of:
- zlib
- lzo1x
- snappy
will be used that simultaneously:
- achieves actual compression,
- is enabled at build time,
- is selected at runtime.

Given that the runtime selection allows at most one compression
algorithm, this priority list is actually not a list, it's an exact choice.

And for that reason only, I won't say that "we have a small bug here,
namely compress2() can change 'size_out', and then we may call
lzo1x_1_compress() with the non-pristine size_out". :) Because if one of
the algorithms runs, we never try any other algorithm.

Both compression failure and compression without size reduction cause us
to fall back to saving the plaintext.

> +#ifdef CONFIG_LZO
> +                } else if ((s->flag_compress & DUMP_DH_COMPRESSED_LZO) &&
> +                        (lzo1x_1_compress(buf, s->page_size, buf_out,
> +                        &size_out, wrkmem) == LZO_E_OK) &&
> +                        (size_out < s->page_size)) {
> +                    pd.flags = DUMP_DH_COMPRESSED_LZO;
> +                    pd.size  = size_out;

(cpu_to_le?)

> +
> +                    ret = write_cache(&page_data, s->flag_flatten, buf_out,
> +                                      pd.size);
> +                    if (ret < 0) {
> +                        dump_error(s, "dump: failed to write page data.\n");
> +                        goto out;
> +                    }
> +#endif
> +#ifdef CONFIG_SNAPPY
> +                } else if ((s->flag_compress & DUMP_DH_COMPRESSED_SNAPPY) &&
> +                        (snappy_compress((char *)buf, s->page_size,
> +                        (char *)buf_out, (size_t *)&size_out) == SNAPPY_OK) &&

"&size_out" already has type pointer-to-size_t, no need to cast it.

> +                        (size_out < s->page_size)) {
> +                    pd.flags = DUMP_DH_COMPRESSED_SNAPPY;
> +                    pd.size  = size_out;

(cpu_to_le?)

> +
> +                    ret = write_cache(&page_data, s->flag_flatten, buf_out,
> +                                      pd.size);
> +                    if (ret < 0) {
> +                        dump_error(s, "dump: failed to write page data.\n");
> +                        goto out;
> +                    }
> +#endif
> +                } else {
> +                    pd.flags = 0;
> +                    pd.size = s->page_size;

(cpu_to_le?)

> +
> +                    ret = write_cache(&page_data, s->flag_flatten, buf,
> +                                      pd.size);
> +                    if (ret < 0) {
> +                        dump_error(s, "dump: failed to write page data.\n");
> +                        goto out;
> +                    }
> +                }
> +
> +                /* get and write page desc here */
> +                pd.page_flags = 0;
> +                pd.offset = offset_data;

(cpu_to_le?)


> +                offset_data += pd.size;

I wonder if you could drop "offset_data" completely, and use
"page_data.offset" instead. Hm, no, you can't, because
"page_data.offset" is updated only when flushing the page data cache,
but you need an up-to-date "offset_data" value at each page. OK.

> +
> +                ret = write_cache(&page_desc, s->flag_flatten, &pd,
> +                                  sizeof(PageDesc));
> +                if (ret < 0) {
> +                    dump_error(s, "dump: failed to write page desc.\n");
> +                    goto out;
> +                }
> +            }
> +        }
> +    }
> +
> +    ret = sync_data_cache(&page_desc, s->flag_flatten);
> +    if (ret < 0) {
> +        dump_error(s, "dump: failed to sync cache for page_desc.\n");
> +        goto out;
> +    }
> +    ret = sync_data_cache(&page_data, s->flag_flatten);
> +    if (ret < 0) {
> +        dump_error(s, "dump: failed to sync cache for page_data.\n");
> +        goto out;
> +    }
> +
> +out:
> +    free_data_cache(&page_desc);
> +    free_data_cache(&page_data);
> +
> +#ifdef CONFIG_LZO
> +    g_free(wrkmem);
> +#endif
> +
> +    g_free(buf_out);
> +
> +    return ret;
> +}
> +
>  static ram_addr_t get_start_block(DumpState *s)
>  {
>      GuestPhysBlock *block;
> diff --git a/include/sysemu/dump.h b/include/sysemu/dump.h
> index ab44af8..382a3c3 100644
> --- a/include/sysemu/dump.h
> +++ b/include/sysemu/dump.h
> @@ -40,6 +40,8 @@
>  
>  #define paddr_to_pfn(X, page_shift) \
>      (((unsigned long long)(X) >> (page_shift)) - ARCH_PFN_OFFSET)
> +#define pfn_to_paddr(X, page_shift) \
> +    (((unsigned long long)(X) + ARCH_PFN_OFFSET) << (page_shift))
>  
>  typedef struct ArchDumpInfo {
>      int d_machine;  /* Architecture */
> @@ -149,6 +151,13 @@ typedef struct DataCache {
>      off_t offset;       /* offset of the file */
>  } DataCache;
>  
> +typedef struct PageDesc {
> +    uint64_t offset;                   /* the offset of the page data*/
> +    uint32_t size;                  /* the size of this dump page */
> +    uint32_t flags;                 /* flags */
> +    uint64_t page_flags;            /* page flags */
> +} PageDesc;

- some whitespace damage near the first comment
- QEMU_PACKED probably missing from the type definition (although it
would be likely cosmetic given this specific structure)

> +
>  struct GuestPhysBlockList; /* memory_mapping.h */
>  int cpu_get_dump_info(ArchDumpInfo *info,
>                        const struct GuestPhysBlockList *guest_phys_blocks);
> 

I'm withholding my R-b due to the ram_addr_t <-> hwaddr "problem" only.

The rest would be just nice to have (maybe in another series :), so) I
don't insist.

Nice work!

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 09/11] dump: Add API to write dump pages
  2014-01-07 22:37   ` Laszlo Ersek
@ 2014-01-07 23:12     ` Eric Blake
  0 siblings, 0 replies; 39+ messages in thread
From: Eric Blake @ 2014-01-07 23:12 UTC (permalink / raw)
  To: Laszlo Ersek, Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, anderson, kumagai-atsushi,
	akong, afaerber

[-- Attachment #1: Type: text/plain, Size: 1233 bytes --]

On 01/07/2014 03:37 PM, Laszlo Ersek wrote:
> comments below
> 
> On 01/05/14 08:27, Qiao Nuohan wrote:
>> functions are used to write page to vmcore. vmcore is written page by page.
>> page desc is used to store the information of a page, including a page's size,
>> offset, compression format, etc.
>>
>> +    uint64_t pfn_start, pfn_end, pfn;
>> +    unsigned char buf[s->page_size];
> 
> Whoa, a VLA! :) I believe it's *very* non-idiomatic in the qemu source.
> Please consider allocating it dynamically. (Of course others might point
> out that I'm wrong.)

Worse, a stack allocation greater than the size of a stack frame.  Any
time you have a function taking more than a page size of local storage,
you risk nasty behavior on some platforms (Windows in particular is
notorious for giving you only a single guard page, and if you overflow
the stack by more than the guard page, your program is unceremoniously
terminated with no message, compared to the usual desirable behavior of
getting a SIGSEGV that your program can at least react to in order to
diagnose that you had a stack overflow).

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 10/11] dump: Make kdump-compressed format available for 'dump-guest-memory'
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 10/11] dump: Make kdump-compressed format available for 'dump-guest-memory' Qiao Nuohan
@ 2014-01-09 15:46   ` Laszlo Ersek
  0 siblings, 0 replies; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-09 15:46 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

comments below

On 01/05/14 08:27, Qiao Nuohan wrote:
> Make monitor command 'dump-guest-memory' be able to dump in kdump-compressed
> format. The command's usage:
> 
>   dump [-p] protocol [begin] [length] [format]
> 
> 'format' is used to specified the format of vmcore and can be:
> 1. 'elf': ELF format, without compression
> 2. 'kdump-zlib': kdump-compressed format, with zlib-compressed
> 3. 'kdump-lzo': kdump-compressed format, with lzo-compressed
> 4. 'kdump-snappy': kdump-compressed format, with snappy-compressed
> And without 'format' being set, it is same as 'elf'.
> 
> Note:
>   1. The kdump-compressed format is readable only with the crash utility and
>      makedumpfile, and it can be smaller than the ELF format because of the
>      compression support.
>   2. The kdump-compressed format is the 6th edition.
> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> ---
>  dump.c           |  163 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  hmp-commands.hx  |   12 +++-
>  hmp.c            |   23 +++++++-
>  qapi-schema.json |   22 +++++++-
>  qmp-commands.hx  |    6 +-
>  5 files changed, 212 insertions(+), 14 deletions(-)
> 
> diff --git a/dump.c b/dump.c
> index 848957c..b4e79ff 100644
> --- a/dump.c
> +++ b/dump.c
> @@ -1398,6 +1398,70 @@ out:
>      return ret;
>  }
>  
> +static int create_kdump_vmcore(DumpState *s)
> +{
> +    int ret;
> +
> +    /*
> +     * the kdump-compressed format is:
> +     *                                               File offset
> +     *  +------------------------------------------+ 0x0
> +     *  |    main header (struct disk_dump_header) |
> +     *  |------------------------------------------+ block 1
> +     *  |    sub header (struct kdump_sub_header)  |

Also including the "note".

... Come to think of it, "note_buf" could actually be local to
create_headerXX(), no need to add it to DumpState.

> +     *  |------------------------------------------+ block 2
> +     *  |            1st-dump_bitmap               |
> +     *  |------------------------------------------+ block 2 + X blocks
> +     *  |            2nd-dump_bitmap               | (aligned by block)
> +     *  |------------------------------------------+ block 2 + 2 * X blocks
> +     *  |  page desc for pfn 0 (struct page_desc)  | (aligned by block)
> +     *  |  page desc for pfn 1 (struct page_desc)  |
> +     *  |                    :                     |
> +     *  |  page desc for pfn Z (struct page_desc)  |
> +     *  |------------------------------------------| (not aligned by block)
> +     *  |         page data (pfn 0)                |
> +     *  |         page data (pfn 1)                |
> +     *  |                        :                 |
> +     *  |         page data (pfn Z)                |
> +     *  +------------------------------------------+ offset_eraseinfo

I think this "offset_eraseinfo" is a bit misleading, because we never
set that field in KdumpSubHeaderXX to nonzero, and we also never write
the trailing portion here.

Maybe a zero value of "offset_eraseinfo" implies presicely that the
portion marked below as ":" is never written, but still
"offset_eraseinfo" (== 0) doesn't match the end of the page data.

Also, using the same Z as final subscript for "page data" as for "page
desc" is misleading. There can be fewer, equal number of, and more page
data entries than page desc entries. We'll have fewer page data entries
if the guest has at least two zero pages. We'll have more page data
entries if the guest has no zero pages (because we dump exactly one zero
page data entry).

> +     *  |                    :                     |
> +     *  +------------------------------------------+
> +     */
> +
> +    if (s->flag_flatten) {
> +        ret = write_start_flat_header(s->fd);
> +        if (ret < 0) {
> +            return -1;
> +        }
> +    }
> +
> +    ret = write_dump_header(s);
> +    if (ret < 0) {
> +        return -1;
> +    }
> +
> +    ret = write_dump_bitmap(s);
> +    if (ret < 0) {
> +        return -1;
> +    }
> +
> +    ret = write_dump_pages(s);
> +    if (ret < 0) {
> +        return -1;
> +    }
> +
> +    if (s->flag_flatten) {
> +        ret = write_end_flat_header(s->fd);
> +        if (ret < 0) {
> +            return -1;
> +        }
> +    }
> +
> +    dump_completed(s);
> +
> +    return 0;
> +}
> +
>  static ram_addr_t get_start_block(DumpState *s)
>  {
>      GuestPhysBlock *block;
> @@ -1426,7 +1490,27 @@ static ram_addr_t get_start_block(DumpState *s)
>      return -1;
>  }
>  
> -static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
> +static bool use_flatten_format(int fd)
> +{
> +    if (lseek(fd, 0, SEEK_SET) < 0) {
> +        return true;
> +    }
> +
> +    return false;
> +}
> +
> +static void get_max_mapnr(DumpState *s)
> +{
> +    MemoryMapping *memory_mapping;
> +
> +    QTAILQ_FOREACH(memory_mapping, &s->list.head, next) {
> +        s->max_mapnr = paddr_to_pfn(memory_mapping->phys_addr +
> +                        memory_mapping->length, s->page_shift);
> +    }
> +}

I'm a bit confused about the difference between DumpState.num_dumpable
and DumpState.max_mapnr.

DumpState.max_mapnr:
- defined as the highest gpfn, plus 1
- used to fill the "DiskDumpHeaderXX.max_mapnr" field (truncated to 32
  bits),
- used to fill the "KdumpSubHeaderXX.max_mapnr_64" field,
- used to determine "DumpState.len_dump_bitmap", which in turn
  determines:
  - start offset of the 2nd bitmap dump
  - DiskDumpHeaderXX.bitmap_blocks, which in turn determines:
    - "DumpState.offset_page", ie. start offset of page_descs

DumpState.num_dumpable:
- number of existent guest page frames (<= DumpState.max_mapnr)
- number of entries in the page desc area,
- number of entries in the page data area

OK. I got confused for a second but these two sets of things are
independent.

"crash" can bit-index the bitmap with the absolute gpfn. By summing the
1-bits up to there, it can derive the subscript into the page desc
array. Seems good.

> +
> +static int dump_init(DumpState *s, int fd, bool has_format,
> +                     DumpGuestMemoryFormat format, bool paging, bool has_filter,
>                       int64_t begin, int64_t length, Error **errp)
>  {
>      CPUState *cpu;
> @@ -1494,6 +1578,44 @@ static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
>          qemu_get_guest_simple_memory_mapping(&s->list, &s->guest_phys_blocks);
>      }
>  
> +    /* init for kdump-compressed format */
> +    if (has_format && format != DUMP_GUEST_MEMORY_FORMAT_ELF) {
> +        switch (format) {
> +        case DUMP_GUEST_MEMORY_FORMAT_KDUMP_ZLIB:
> +            s->flag_compress = DUMP_DH_COMPRESSED_ZLIB;
> +            break;
> +
> +        case DUMP_GUEST_MEMORY_FORMAT_KDUMP_LZO:
> +            s->flag_compress = DUMP_DH_COMPRESSED_LZO;
> +            break;
> +
> +        case DUMP_GUEST_MEMORY_FORMAT_KDUMP_SNAPPY:
> +            s->flag_compress = DUMP_DH_COMPRESSED_SNAPPY;
> +            break;
> +
> +        default:
> +            s->flag_compress = 0;

Not really necessary since DumpState is allocated with g_malloc0(), but
doesn't hurt.

> +        }
> +
> +        /*
> +         * check to see if fd is available to seek.
> +         * if not, flatten format is used to avoid seek
> +         */
> +        s->flag_flatten = use_flatten_format(fd);

use_flatten_format() will mis-report an lseek() IO error on an otherwise
seekable regular file or block device, but such IO errors should be
detected later on anyway. fstat() + S_ISREG()/S_ISBLK() would be
cleaner, but it's not very important.

> +
> +        s->nr_cpus = nr_cpus;
> +        s->page_size = PAGE_SIZE;
> +        s->page_shift = ffs(s->page_size) - 1;
> +
> +        get_max_mapnr(s);
> +
> +        size_t tmp;
> +        tmp = DIV_ROUND_UP(DIV_ROUND_UP(s->max_mapnr, CHAR_BIT), s->page_size);
> +        s->len_dump_bitmap = tmp * s->page_size;

"tmp" and "s->len_dump_bitmap" are "size_t", which can be 32-bit, and
"s->max_mapnr" is "uint64_t". But we discussed earlier how much guest
memory it would take to overflow this, so I think it's fine.

> +
> +        return 0;

OK I think this covers all new fields in DumpState.

> +    }
> +
>      if (s->has_filter) {
>          memory_mapping_filter(&s->list, s->begin, s->length);
>      }
> @@ -1553,8 +1675,9 @@ cleanup:
>  }
>  
>  void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
> -                           int64_t begin, bool has_length, int64_t length,
> -                           Error **errp)
> +                           int64_t begin, bool has_length,
> +                           int64_t length, bool has_format,
> +                           DumpGuestMemoryFormat format, Error **errp)
>  {
>      const char *p;
>      int fd = -1;
> @@ -1569,6 +1692,27 @@ void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
>          error_set(errp, QERR_MISSING_PARAMETER, "begin");
>          return;
>      }
> +    /* kdump-compressed format doesn't support paging or filter */
> +    if ((has_format && format != DUMP_GUEST_MEMORY_FORMAT_ELF) &&
> +        (paging || has_begin || has_length)) {
> +        error_set(errp, QERR_INVALID_PARAMETER_COMBINATION);
> +        return;
> +    }

I'd prefer a more verbose explanation with error_setg() -- the error
should state what the comment says -- but it's not a blocker for me.

And, this check here is *really* important. I wish it were reflected in
a handful of asserts() across the rest of the code (I sought to call out
those spots).

> +
> +    /* check whether lzo/snappy is supported */
> +#ifndef CONFIG_LZO
> +    if (format == DUMP_GUEST_MEMORY_FORMAT_KDUMP_LZO) {
> +        error_set(errp, QERR_INVALID_PARAMETER_VALUE, "format",
> +                  "supported format(kdump-lzo is not available now)");
> +    }
> +#endif

In theory you should look at "has_format" first. In practice I think if
the format is absent then "format" will have value 0
(DUMP_GUEST_MEMORY_FORMAT_ELF), so the check is OK as-is.

The error message needs at least a space character before the left
paren, but error_setg() would be preferable I think. (Others should
chime in too of course...)

> +
> +#ifndef CONFIG_SNAPPY
> +    if (format == DUMP_GUEST_MEMORY_FORMAT_KDUMP_SNAPPY) {
> +        error_set(errp, QERR_INVALID_PARAMETER_VALUE, "format",
> +                  "supported format(kdump-snappy is not available now)");
> +    }
> +#endif

Ditto.

>  
>  #if !defined(WIN32)
>      if (strstart(file, "fd:", &p)) {
> @@ -1594,14 +1738,21 @@ void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
>  
>      s = g_malloc0(sizeof(DumpState));
>  
> -    ret = dump_init(s, fd, paging, has_begin, begin, length, errp);
> +    ret = dump_init(s, fd, has_format, format, paging, has_begin,
> +                    begin, length, errp);
>      if (ret < 0) {
>          g_free(s);
>          return;
>      }
>  
> -    if (create_vmcore(s) < 0 && !error_is_set(s->errp)) {
> -        error_set(errp, QERR_IO_ERROR);
> +    if (has_format && format != DUMP_GUEST_MEMORY_FORMAT_ELF) {
> +        if (create_kdump_vmcore(s) < 0 && !error_is_set(s->errp)) {
> +            error_set(errp, QERR_IO_ERROR);
> +        }
> +    } else {
> +        if (create_vmcore(s) < 0 && !error_is_set(s->errp)) {
> +            error_set(errp, QERR_IO_ERROR);
> +        }
>      }

On error some stuff may be leaked -- there are error paths when
dump_cleanup() is never called, for example:

dump_init()
  guest_phys_blocks_append()
create_kdump_vmcore()
  write_start_flat_header()
    return -1

I'm not sure if this has been possible in the create_vmcore() branch
too. It seems unlikely, because that branch tends to call dump_error()
on all (most?) error paths.

>  
>      g_free(s);
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index ebe8e78..3856bb4 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -993,17 +993,19 @@ ETEXI
>  
>      {
>          .name       = "dump-guest-memory",
> -        .args_type  = "paging:-p,filename:F,begin:i?,length:i?",
> -        .params     = "[-p] filename [begin] [length]",
> +        .args_type  = "paging:-p,filename:F,begin:i?,length:i?,format:s?",
> +        .params     = "[-p] filename [begin] [length] [format]",
>          .help       = "dump guest memory to file"
>                        "\n\t\t\t begin(optional): the starting physical address"
> -                      "\n\t\t\t length(optional): the memory size, in bytes",
> +                      "\n\t\t\t length(optional): the memory size, in bytes"
> +                      "\n\t\t\t format(optional): the format of guest memory dump,"
> +                      "\n\t\t\t it can be elf|kdump-zlib|kdump-lzo|kdump-snappy",
>          .mhandler.cmd = hmp_dump_guest_memory,
>      },
>  
>  
>  STEXI
> -@item dump-guest-memory [-p] @var{protocol} @var{begin} @var{length}
> +@item dump-guest-memory [-p] @var{protocol} @var{begin} @var{length} @var{format}
>  @findex dump-guest-memory
>  Dump guest memory to @var{protocol}. The file can be processed with crash or
>  gdb.
> @@ -1013,6 +1015,8 @@ gdb.
>              specified with length together.
>      length: the memory size, in bytes. It's optional, and should be specified
>              with begin together.
> +    format: the format of guest memory dump. It's optional, and can be
> +            elf|kdump-zlib|kdump-lzo|kdump-snappy
>  ETEXI
>  
>      {

I would document that non-elf formats conflict with both paging and
filtering (ie. begin/length)

However... I'm afraid the compression facility is simply unaccessible
via HMP. Namely:

- If you specify both "begin" and "length", then the parameter
combination check will (correctly!) reject compressed formats.

- If you omit "begin" and/or "length", then the HMP parser will choke on
the compression format, trying to parse it (eg. "kdump-zlib") as the
first integer (begin or length) that you would have wanted to omit.

This is a limitation of the HMP parser, see monitor_parse_command():

        case 'i':
        case 'l':
        case 'M':
            {
                int64_t val;

                while (qemu_isspace(*p))
                    p++;
                if (*typestr == '?' || *typestr == '.') {
                    if (*typestr == '?') {
                        if (*p == '\0') {
                            typestr++;
                            break;
                        }
                    } else {
                        if (*p == '.') {
                            p++;
                            while (qemu_isspace(*p))
                                p++;
                        } else {
                            typestr++;
                            break;
                        }
                    }
                    typestr++;
                }
                if (get_expr(mon, &val, &p))
                    goto fail;

There are only two possibilities to break out of the 'i' branch and
avoid blowing up in get_expr():

- (*typestr == '?' && *p == '\0'), that is, the parameter (ie. "begin",
"length") is optional, and there's nothing else (except whitespace) on
the HMP command line. This is not our case, because the command line
still has "kdump-zlib".

- (*typestr == '.' && *p != '.') -- This would require replacing the
"i?" specifiers in "args_type" with "i.", and passing such arguments for
"begin" and "length" on the HMP command line that don't start with a
dot. For example:

  (qemu) dump-guest-memory 0 1000000 kdump-zlib

In this case "begin" and "length" would be ignored due to the "i."
specifier. They would only be parsed with

  (qemu) dump-guest-memory .0 .1000000 kdump-zlib

However, "i." would break existing HMP command lines, like

  (qemu) dump-guest-memory 0 1000000

so "i." won't fly.

("i." is BTW only ever used with do_ioport_read(), for the case when the
port to read works as an index-register on write.)


I think we simply shouldn't try to make this functionality available
over HMP.

> diff --git a/hmp.c b/hmp.c
> index 32ee285..9bd62b8 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -1307,9 +1307,12 @@ void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict)
>      const char *file = qdict_get_str(qdict, "filename");
>      bool has_begin = qdict_haskey(qdict, "begin");
>      bool has_length = qdict_haskey(qdict, "length");
> +    bool has_format = qdict_haskey(qdict, "format");
>      int64_t begin = 0;
>      int64_t length = 0;
> +    const char *format = NULL;
>      char *prot;
> +    enum DumpGuestMemoryFormat dump_format;
>  
>      if (has_begin) {
>          begin = qdict_get_int(qdict, "begin");
> @@ -1317,11 +1320,29 @@ void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict)
>      if (has_length) {
>          length = qdict_get_int(qdict, "length");
>      }
> +    if (has_format) {
> +        format = qdict_get_str(qdict, "format");
> +    }
> +
> +    if (strcmp(format, "elf") == 0) {

This results in a SIGSEGV when the format has not been specified
("format" is initialized to NULL in its definition, and when "qdict" has
no "format" key (--> has_format==false) then "format" is left at NULL).

But, if we drop HMP support for the feature, this goes away too.

> +        dump_format = DUMP_GUEST_MEMORY_FORMAT_ELF;
> +    } else if (strcmp(format, "kdump-zlib") == 0) {
> +        dump_format = DUMP_GUEST_MEMORY_FORMAT_KDUMP_ZLIB;
> +    } else if (strcmp(format, "kdump-lzo") == 0) {
> +        dump_format = DUMP_GUEST_MEMORY_FORMAT_KDUMP_LZO;
> +    } else if (strcmp(format, "kdump-snappy") == 0) {
> +        dump_format = DUMP_GUEST_MEMORY_FORMAT_KDUMP_SNAPPY;
> +    } else {
> +        error_set(&errp, QERR_INVALID_PARAMETER_VALUE,
> +                  "format", "elf|kdump-zlib|kdump-lzo|kdump-snappy");
> +        hmp_handle_error(mon, &errp);
> +        return;
> +    }
>  
>      prot = g_strconcat("file:", file, NULL);
>  
>      qmp_dump_guest_memory(paging, prot, has_begin, begin, has_length, length,
> -                          &errp);
> +                          has_format, dump_format, &errp);
>      hmp_handle_error(mon, &errp);
>      g_free(prot);
>  }
> diff --git a/qapi-schema.json b/qapi-schema.json
> index c3c939c..19b2b23 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -2676,6 +2676,24 @@
>  { 'command': 'device_del', 'data': {'id': 'str'} }
>  
>  ##
> +# @DumpGuestMemoryFormat:
> +#
> +# An enumeration of guest-memory-dump's format.
> +#
> +# @elf: elf format
> +#
> +# @kdump-zlib: kdump-compressed format with zlib-compressed
> +#
> +# @kdump-lzo: kdump-compressed format with zlib-compressed

should say "lzo-compressed" rather than "zlib-compressed"

> +#
> +# @kdump-snappy: kdump-compressed format with zlib-compressed

should say "snappy-compressed" rather than "zlib-compressed"

> +#
> +# Since: 1.8
> +##

1.8 will never exist, we should say 2.0

> +{ 'enum': 'DumpGuestMemoryFormat',
> +  'data': [ 'elf', 'kdump-zlib', 'kdump-lzo', 'kdump-snappy' ] }
> +
> +##
>  # @dump-guest-memory
>  #
>  # Dump guest's memory to vmcore. It is a synchronous operation that can take
> @@ -2711,13 +2729,15 @@
>  #          want to dump all guest's memory, please specify the start @begin
>  #          and @length
>  #
> +# @format: #optional if specified, the format of guest memory dump. (since 1.8)
> +#

since 2.0

>  # Returns: nothing on success
>  #
>  # Since: 1.2
>  ##
>  { 'command': 'dump-guest-memory',
>    'data': { 'paging': 'bool', 'protocol': 'str', '*begin': 'int',
> -            '*length': 'int' } }
> +            '*length': 'int', '*format': 'DumpGuestMemoryFormat' } }
>  
>  ##
>  # @netdev_add:
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index fba15cd..3de9e7d 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -791,8 +791,8 @@ EQMP
>  
>      {
>          .name       = "dump-guest-memory",
> -        .args_type  = "paging:b,protocol:s,begin:i?,end:i?",
> -        .params     = "-p protocol [begin] [length]",
> +        .args_type  = "paging:b,protocol:s,begin:i?,length:i?,format:s?",

Adding "format' is OK, but we should not rename "end" to "length".

> +        .params     = "-p protocol [begin] [length] [format]",
>          .help       = "dump guest memory to file",
>          .user_print = monitor_user_noop,
>          .mhandler.cmd_new = qmp_marshal_input_dump_guest_memory,
> @@ -813,6 +813,8 @@ Arguments:
>             with length together (json-int)
>  - "length": the memory size, in bytes. It's optional, and should be specified
>              with begin together (json-int)
> +- "format": the format of guest memory dump. It's optional, and can be
> +            elf|kdump-zlib|kdump-lzo|kdump-snappy (json-string)
>  
>  Example:
>  
> 

I think this patch should be respun in order to drop HMP support
completely. The rest is of varying importance; most of it shouldn't be
too hard to address.

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 11/11] Add 'query-dump-guest-memory-capability' command
  2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 11/11] Add 'query-dump-guest-memory-capability' command Qiao Nuohan
@ 2014-01-09 16:34   ` Laszlo Ersek
  0 siblings, 0 replies; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-09 16:34 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, kumagai-atsushi, anderson,
	akong, afaerber

comments below

On 01/05/14 08:27, Qiao Nuohan wrote:
> 'query-dump-guest-memory-capability' is used to query whether option 'format'
> is available for 'dump-guest-memory' and the available format. The output
> of the command will be like:
> 
> -> { "execute": "query-dump-guest-memory-capability" }
> <- { "return": {
>         "format-option": "optional",
>         "capabilities": [
>             {"available": true, "format": "elf"},
>             {"available": true, "format": "kdump-zlib"},
>             {"available": true, "format": "kdump-lzo"},
>             {"available": true, "format": "kdump-snappy"}
>         ]
>     }

The order of "available" and "format" in the commit message is reverse
in relation to the json. Not too important.

> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> ---
>  dump.c           |   38 ++++++++++++++++++++++++++++++++++++++
>  qapi-schema.json |   13 +++++++++++++
>  qmp-commands.hx  |    7 +++++++
>  3 files changed, 58 insertions(+), 0 deletions(-)
> 
> diff --git a/dump.c b/dump.c
> index b4e79ff..51fe23a 100644
> --- a/dump.c
> +++ b/dump.c
> @@ -1757,3 +1757,41 @@ void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
>  
>      g_free(s);
>  }
> +
> +DumpGuestMemoryCapability *qmp_query_dump_guest_memory_capability(Error **errp)
> +{
> +    int i;
> +    DumpGuestMemoryCapabilityStatusList *item;
> +    DumpGuestMemoryCapability *cap =
> +                                  g_malloc0(sizeof(DumpGuestMemoryCapability));
> +
> +    cap->format_option = g_strdup_printf("optional");
> +
> +    for (i = 0; i < DUMP_GUEST_MEMORY_FORMAT_MAX; i++) {
> +        if (cap->capabilities == NULL) {
> +            item = g_malloc0(sizeof(DumpGuestMemoryCapabilityStatusList));
> +            cap->capabilities = item;
> +        } else {
> +            item->next = g_malloc0(sizeof(DumpGuestMemoryCapabilityStatusList));
> +            item = item->next;
> +        }
> +
> +        item->value = g_malloc0(sizeof(struct DumpGuestMemoryCapabilityStatus));
> +        item->value->format = i;
> +        item->value->available = true;
> +
> +        if (i == DUMP_GUEST_MEMORY_FORMAT_KDUMP_LZO) {
> +#ifndef CONFIG_LZO
> +            item->value->available = false;
> +#endif
> +        }
> +
> +        if (i == DUMP_GUEST_MEMORY_FORMAT_KDUMP_SNAPPY) {
> +#ifndef CONFIG_SNAPPY
> +            item->value->available = false;
> +#endif
> +        }
> +    }

You could move the preprocessor directives outside.

Also, since the "available" field is false by nature of g_malloc0(),
maybe the code could be reorganized
- from "false to true to maybe false"
- to   "false to maybe true".

Not overly important.

> +
> +    return cap;
> +}
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 19b2b23..e545e0f 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -2740,6 +2740,19 @@
>              '*length': 'int', '*format': 'DumpGuestMemoryFormat' } }
>  
>  ##
> +# Since: 1.8

2.0

> +##
> +{ 'type':  'DumpGuestMemoryCapabilityStatus',
> +  'data': { 'format': 'DumpGuestMemoryFormat', 'available': 'bool' } }
> +
> +{ 'type': 'DumpGuestMemoryCapability',
> +  'data': {
> +      'format-option': 'str',
> +      'capabilities': ['DumpGuestMemoryCapabilityStatus'] } }
> +
> +{ 'command': 'query-dump-guest-memory-capability', 'returns': 'DumpGuestMemoryCapability' }
> +
> +##
>  # @netdev_add:
>  #
>  # Add a network backend.
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index 3de9e7d..3379c75 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -828,6 +828,13 @@ Notes:
>  EQMP
>  
>      {
> +        .name       = "query-dump-guest-memory-capability",
> +        .args_type  = "",
> +        .mhandler.cmd_new = qmp_marshal_input_query_dump_guest_memory_capability,
> +    },
> +
> +
> +    {
>          .name       = "netdev_add",
>          .args_type  = "netdev:O",
>          .mhandler.cmd_new = qmp_netdev_add,
> 

I think some more documentation is usual in the qmp-commands.hx file for
new commands; an SQMP/EQMP section should be likely added. However I'm
not giving my R-b only because of the 1.8 vs. 2.0 comment.

I'll let Eric decide if this query command serves libvirt sufficiently.

Thanks!
Laszlo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 06/11] dump: add API to write dump header
  2014-01-07 11:38   ` Laszlo Ersek
  2014-01-07 11:49     ` Andreas Färber
@ 2014-01-13 10:03     ` Qiao Nuohan
  2014-01-13 10:39       ` Laszlo Ersek
  1 sibling, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-13 10:03 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: stefanha, qemu-devel, lcapitulino, anderson, kumagai-atsushi,
	akong, afaerber

Sorry for responsing late.

On 01/07/2014 07:38 PM, Laszlo Ersek wrote:
> The following fields in "dh" are left zero-filled:
> - timestamp
> - total_ram_blocks
> - device_blocks
> - written_blocks
> - current_cpu
>
> I guess we'll either overwrite them later or it's OK to leave them all
> zeroed.

Yes, they are leaved all zeroed here. Tools, like crash will get exact data
from dumped memory.

>
> Also... is it OK to write these fields to the file in host native byte
> order? What happens if an i686 / x86_64 target is emulated on a BE host?

I will add convert work in v7.

>
>> >  +
>> >  +    /* write sub header */
>> >  +    size = sizeof(KdumpSubHeader32);
>> >  +    kh = g_malloc0(size);
>> >  +
>> >  +    /* 64bit max_mapnr_64 */
>> >  +    kh->max_mapnr_64 = s->max_mapnr;
>> >  +    kh->phys_base = PHYS_BASE;
>> >  +    kh->dump_level = DUMP_LEVEL;
>> >  +
>> >  +    kh->offset_note = DISKDUMP_HEADER_BLOCKS * dh->block_size + size;
>> >  +    kh->note_size = s->note_size;
>> >  +
>> >  +    if (write_buffer(s->fd, s->flag_flatten, dh->block_size, kh, size)<  0) {
>> >  +        ret = -1;
>> >  +        goto out;
>> >  +    }
> - Same question about endianness as above.
>
> - Again, many fields left zeroed in "kh", but I guess that's OK.
>
> - I would prefer if you repeated the multiplication by
> DISKDUMP_HEADER_BLOCKS verbatim in the "offset" write_buffer() argument.

write_buffer(s->fd, s->flag_flatten, DISKDUMP_HEADER_BLOCKS * dh->block_size,
kh, size) ?

Yes, I should change it.

>
> - When this write_buffer() is directed to a regular file in non-flat
> mode, then the file might become sparse (you jump over a range of
> offsets with lseek() in write_buffer()). If the output has been opened
> by qemu itself (ie."file:....", in qmp_dump_guest_memory()), then due
> to the O_TRUNC we can't seek over preexistent data (and keep garbage in
> the file). When libvirt pre-opens the file (to send over the fd later),
> in doCoreDump(), it also passes O_TRUNC. OK.
>

Do you mean because of O_TRUNC，seek will exceed the end of the file that may 
cause some problem?

-- 
Regards
Qiao Nuohan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 06/11] dump: add API to write dump header
  2014-01-13 10:03     ` Qiao Nuohan
@ 2014-01-13 10:39       ` Laszlo Ersek
  2014-01-14  2:07         ` Qiao Nuohan
  0 siblings, 1 reply; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-13 10:39 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, anderson, kumagai-atsushi,
	akong, afaerber

On 01/13/14 11:03, Qiao Nuohan wrote:
> Sorry for responsing late.
> 
> On 01/07/2014 07:38 PM, Laszlo Ersek wrote:

>>> >  +    kh->offset_note = DISKDUMP_HEADER_BLOCKS * dh->block_size +
>>> size;
>>> >  +    kh->note_size = s->note_size;
>>> >  +
>>> >  +    if (write_buffer(s->fd, s->flag_flatten, dh->block_size, kh,
>>> size)<  0) {
>>> >  +        ret = -1;
>>> >  +        goto out;
>>> >  +    }

>> - I would prefer if you repeated the multiplication by
>> DISKDUMP_HEADER_BLOCKS verbatim in the "offset" write_buffer() argument.
> 
> write_buffer(s->fd, s->flag_flatten, DISKDUMP_HEADER_BLOCKS *
> dh->block_size,
> kh, size) ?
> 
> Yes, I should change it.

Yes that's what I meant.

> 
>>
>> - When this write_buffer() is directed to a regular file in non-flat
>> mode, then the file might become sparse (you jump over a range of
>> offsets with lseek() in write_buffer()). If the output has been opened
>> by qemu itself (ie."file:....", in qmp_dump_guest_memory()), then due
>> to the O_TRUNC we can't seek over preexistent data (and keep garbage in
>> the file). When libvirt pre-opens the file (to send over the fd later),
>> in doCoreDump(), it also passes O_TRUNC. OK.
>>
> 
> Do you mean because of O_TRUNC，seek will exceed the end of the file
> that may cause some problem?

I meant that lseek() would seek over an unwritten portion of the file.
If that portion had any kind of data written into it earlier, then that
data would now likely turn into garbage (lose meaning, become truncated
etc.) It wouldn't be corrupted or anything like that, it would just
become a leftover with potential to cause misinterpretation.

But, since we have O_TRUNC at open() time, we're seeking past the end of
the file, and this sought-over portion will read back as zeroes (and the
file might become "sparse", dependent on the filesystem and the size of
the range sought-over).

Seeking past the end of the file is explicitly allowed by POSIX:

    The lseek() function shall allow the file offset to be set beyond
    the end of the existing data in the file. If data is later written
    at this point, subsequent reads of data in the gap shall return
    bytes with the value 0 until data is actually written into the gap.

http://pubs.opengroup.org/onlinepubs/9699919799/functions/lseek.html

So this is fine.

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 06/11] dump: add API to write dump header
  2014-01-13 10:39       ` Laszlo Ersek
@ 2014-01-14  2:07         ` Qiao Nuohan
  2014-01-14  2:29           ` Laszlo Ersek
  0 siblings, 1 reply; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-14  2:07 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: stefanha, qemu-devel, lcapitulino, anderson, kumagai-atsushi,
	akong, afaerber

On 01/13/2014 06:39 PM, Laszlo Ersek wrote:
>>> >>
>>> >>  - When this write_buffer() is directed to a regular file in non-flat
>>> >>  mode, then the file might become sparse (you jump over a range of
>>> >>  offsets with lseek() in write_buffer()). If the output has been opened
>>> >>  by qemu itself (ie."file:....", in qmp_dump_guest_memory()), then due
>>> >>  to the O_TRUNC we can't seek over preexistent data (and keep garbage in
>>> >>  the file). When libvirt pre-opens the file (to send over the fd later),
>>> >>  in doCoreDump(), it also passes O_TRUNC. OK.
>>> >>
>> >
>> >  Do you mean because of O_TRUNC，seek will exceed the end of the file
>> >  that may cause some problem?
> I meant that lseek() would seek over an unwritten portion of the file.
> If that portion had any kind of data written into it earlier, then that
> data would now likely turn into garbage (lose meaning, become truncated
> etc.) It wouldn't be corrupted or anything like that, it would just
> become a leftover with potential to cause misinterpretation.
>
> But, since we have O_TRUNC at open() time, we're seeking past the end of
> the file, and this sought-over portion will read back as zeroes (and the
> file might become "sparse", dependent on the filesystem and the size of
> the range sought-over).
>
> Seeking past the end of the file is explicitly allowed by POSIX:
>
>      The lseek() function shall allow the file offset to be set beyond
>      the end of the existing data in the file. If data is later written
>      at this point, subsequent reads of data in the gap shall return
>      bytes with the value 0 until data is actually written into the gap.
>
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/lseek.html
>
> So this is fine.

Thanks for your explanation. I think it would be better to abandon the non-flat
mode to avoid potential risk.

-- 
Regards
Qiao Nuohan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 06/11] dump: add API to write dump header
  2014-01-14  2:07         ` Qiao Nuohan
@ 2014-01-14  2:29           ` Laszlo Ersek
  2014-01-14  2:42             ` Qiao Nuohan
  0 siblings, 1 reply; 39+ messages in thread
From: Laszlo Ersek @ 2014-01-14  2:29 UTC (permalink / raw)
  To: Qiao Nuohan
  Cc: stefanha, qemu-devel, lcapitulino, anderson, kumagai-atsushi,
	akong, afaerber

On 01/14/14 03:07, Qiao Nuohan wrote:
> On 01/13/2014 06:39 PM, Laszlo Ersek wrote:
>>>> >>
>>>> >>  - When this write_buffer() is directed to a regular file in
>>>> non-flat
>>>> >>  mode, then the file might become sparse (you jump over a range of
>>>> >>  offsets with lseek() in write_buffer()). If the output has been
>>>> opened
>>>> >>  by qemu itself (ie."file:....", in qmp_dump_guest_memory()),
>>>> then due
>>>> >>  to the O_TRUNC we can't seek over preexistent data (and keep
>>>> garbage in
>>>> >>  the file). When libvirt pre-opens the file (to send over the fd
>>>> later),
>>>> >>  in doCoreDump(), it also passes O_TRUNC. OK.
>>>> >>
>>> >
>>> >  Do you mean because of O_TRUNC，seek will exceed the end of the file
>>> >  that may cause some problem?
>> I meant that lseek() would seek over an unwritten portion of the file.
>> If that portion had any kind of data written into it earlier, then that
>> data would now likely turn into garbage (lose meaning, become truncated
>> etc.) It wouldn't be corrupted or anything like that, it would just
>> become a leftover with potential to cause misinterpretation.
>>
>> But, since we have O_TRUNC at open() time, we're seeking past the end of
>> the file, and this sought-over portion will read back as zeroes (and the
>> file might become "sparse", dependent on the filesystem and the size of
>> the range sought-over).
>>
>> Seeking past the end of the file is explicitly allowed by POSIX:
>>
>>      The lseek() function shall allow the file offset to be set beyond
>>      the end of the existing data in the file. If data is later written
>>      at this point, subsequent reads of data in the gap shall return
>>      bytes with the value 0 until data is actually written into the gap.
>>
>> http://pubs.opengroup.org/onlinepubs/9699919799/functions/lseek.html
>>
>> So this is fine.
> 
> Thanks for your explanation. I think it would be better to abandon the
> non-flat
> mode to avoid potential risk.

I can't really provide any input to that decision -- I have no clue
which tools support which format. The non-flat (ie. random-access,
regular file) format appears more space- and computation-efficient, and
I thought that would be the "natural" choice. The flat (non-seekable)
format was a surprise to me -- I wouldn't have thought that any debugger
could directly consume that format.

So it's really your call. Again, the lseek()s seemed fine to me on POSIX
platforms.

Thanks,
Laszlo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v6 06/11] dump: add API to write dump header
  2014-01-14  2:29           ` Laszlo Ersek
@ 2014-01-14  2:42             ` Qiao Nuohan
  0 siblings, 0 replies; 39+ messages in thread
From: Qiao Nuohan @ 2014-01-14  2:42 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: stefanha, qemu-devel, lcapitulino, anderson, kumagai-atsushi,
	akong, afaerber

On 01/14/2014 10:29 AM, Laszlo Ersek wrote:
> I can't really provide any input to that decision -- I have no clue
> which tools support which format. The non-flat (ie. random-access,
> regular file) format appears more space- and computation-efficient, and
> I thought that would be the "natural" choice. The flat (non-seekable)
> format was a surprise to me -- I wouldn't have thought that any debugger
> could directly consume that format.

The flat-mode comes from makedumpfile as kdump-compressed format, and crash
utility has already supported it.

-- 
Regards
Qiao Nuohan

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2014-01-14  2:43 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-05  7:27 [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 01/11] dump: Add argument to write_elfxx_notes Qiao Nuohan
2014-01-06 17:03   ` Laszlo Ersek
2014-01-07  6:00     ` Qiao Nuohan
2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 02/11] dump: Add API to write header of flatten format Qiao Nuohan
2014-01-06 17:15   ` Laszlo Ersek
2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 03/11] dump: Add API to write vmcore Qiao Nuohan
2014-01-06 18:12   ` Laszlo Ersek
2014-01-07  6:15     ` Qiao Nuohan
2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 04/11] dump: Add API to write elf notes to buffer Qiao Nuohan
2014-01-06 18:46   ` Laszlo Ersek
2014-01-07  6:17     ` Qiao Nuohan
2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 05/11] dump: add support for lzo/snappy Qiao Nuohan
2014-01-06 19:25   ` Laszlo Ersek
2014-01-07  6:25     ` Qiao Nuohan
2014-01-07  7:24       ` Laszlo Ersek
2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 06/11] dump: add API to write dump header Qiao Nuohan
2014-01-07 11:38   ` Laszlo Ersek
2014-01-07 11:49     ` Andreas Färber
2014-01-13 10:03     ` Qiao Nuohan
2014-01-13 10:39       ` Laszlo Ersek
2014-01-14  2:07         ` Qiao Nuohan
2014-01-14  2:29           ` Laszlo Ersek
2014-01-14  2:42             ` Qiao Nuohan
2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 07/11] dump: Add API to write dump_bitmap Qiao Nuohan
2014-01-07 14:49   ` Laszlo Ersek
2014-01-07 21:41     ` Laszlo Ersek
2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 08/11] dump: Add APIs to operate DataCache Qiao Nuohan
2014-01-07 15:22   ` Laszlo Ersek
2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 09/11] dump: Add API to write dump pages Qiao Nuohan
2014-01-07 22:37   ` Laszlo Ersek
2014-01-07 23:12     ` Eric Blake
2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 10/11] dump: Make kdump-compressed format available for 'dump-guest-memory' Qiao Nuohan
2014-01-09 15:46   ` Laszlo Ersek
2014-01-05  7:27 ` [Qemu-devel] [PATCH v6 11/11] Add 'query-dump-guest-memory-capability' command Qiao Nuohan
2014-01-09 16:34   ` Laszlo Ersek
2014-01-07  6:32 ` [Qemu-devel] [PATCH v6 00/11] Make 'dump-guest-memory' dump in kdump-compressed format Qiao Nuohan
2014-01-07  7:27   ` Laszlo Ersek
2014-01-07  7:30     ` Qiao Nuohan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.