* [RFC 0/3] add snapshot/restore fuzzing device
@ 2022-07-22 19:20 Richard Liu
  2022-07-22 19:20 ` [RFC 1/3] create skeleton snapshot device and add docs Richard Liu
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Richard Liu @ 2022-07-22 19:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: alxndr, bsd, darren.kenny, Richard Liu

This RFC adds a virtual device for snapshot/restore within QEMU. I am working
on this as part of QEMU Google Summer of Code 2022. Fast snapshot/restore
within QEMU is helpful for code fuzzing.

I reused the migration code for saving and restoring virtual device and CPU
state. For the RAM, I am using a simple copy-on-write (CoW) mmapped file to do
restores.

The loadvm migration function I used for doing restores only worked after I
called it from a qemu_bh. I'm not sure whether I should run the migration code
in a separate thread (see patch 3); currently it runs as part of the device
code in the vCPU thread.

This is a rough first revision, and feedback on the CPU and device state
restores is appreciated.

To test locally, boot any Linux distro. I used the following C program to
interact with the PCI snapshot device:

    #include <stdio.h>
    #include <stdint.h>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main() {
        int fd = open("/sys/bus/pci/devices/0000:00:04.0/resource0", O_RDWR | O_SYNC);
        size_t size = 1024 * 1024;
        volatile uint32_t* memory = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

        printf("%x\n", memory[0]);

        int a = 0;
        memory[0] = 0x101; // save snapshot
        printf("before: value of a = %d\n", a);
        a = 1;
        printf("middle: value of a = %d\n", a);
        memory[0] = 0x102; // load snapshot
        printf("after: value of a = %d\n", a);

        return 0;
    }

Richard Liu (3):
  create skeleton snapshot device and add docs
  implement ram save/restore
  use migration code for cpu and device save/restore

 docs/devel/snapshot.rst |  26 +++++++
 hw/i386/Kconfig         |   1 +
 hw/misc/Kconfig         |   3 +
 hw/misc/meson.build     |   1 +
 hw/misc/snapshot.c      | 164 ++++++++++++++++++++++++++++++++++++++++
 migration/savevm.c      |  84 ++++++++++++++++++++
 migration/savevm.h      |   3 +
 7 files changed, 282 insertions(+)
 create mode 100644 docs/devel/snapshot.rst
 create mode 100644 hw/misc/snapshot.c

-- 
2.35.1




* [RFC 1/3] create skeleton snapshot device and add docs
  2022-07-22 19:20 [RFC 0/3] add snapshot/restore fuzzing device Richard Liu
@ 2022-07-22 19:20 ` Richard Liu
  2022-07-22 19:20 ` [RFC 2/3] implement ram save/restore Richard Liu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Richard Liu @ 2022-07-22 19:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: alxndr, bsd, darren.kenny, Richard Liu

Add a simple skeleton PCI device for snapshot/restore, and add
documentation describing the snapshot/restore functionality.

Signed-off-by: Richard Liu <richy.liu.2002@gmail.com>
---
 docs/devel/snapshot.rst | 26 +++++++++++++
 hw/i386/Kconfig         |  1 +
 hw/misc/Kconfig         |  3 ++
 hw/misc/meson.build     |  1 +
 hw/misc/snapshot.c      | 86 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 117 insertions(+)
 create mode 100644 docs/devel/snapshot.rst
 create mode 100644 hw/misc/snapshot.c

diff --git a/docs/devel/snapshot.rst b/docs/devel/snapshot.rst
new file mode 100644
index 0000000000..a333de69b6
--- /dev/null
+++ b/docs/devel/snapshot.rst
@@ -0,0 +1,26 @@
+================
+Snapshot/restore
+================
+
+The ability to rapidly snapshot and restore guest VM state is a
+crucial component of fuzzing applications with QEMU. Fuzzers can use
+a special virtual device to issue snapshot/restore commands to QEMU.
+The virtual device should support the following commands, which can
+be called by the guest:
+
+- snapshot: save a copy of the guest VM memory, registers, and virtual
+  device state
+- restore: restore the saved copy of the guest VM state
+- coverage_location: specify the location in guest memory where
+  coverage data should be placed for the fuzzer
+- input_location: specify where in guest memory the fuzzing input
+  should be stored
+- done: indicate that the run finished and that the coverage data
+  has been populated
+
+The first version of the virtual device will only accept snapshot and
+restore commands from the guest. Coverage data will be collected by
+code on the guest with source-based coverage tracking.
+
+Further expansions could include controlling snapshot/restore from the
+host and gathering code coverage information directly from TCG.
diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index d22ac4a4b9..55656eddf5 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -46,6 +46,7 @@ config PC
     select ACPI_VMGENID
     select VIRTIO_PMEM_SUPPORTED
     select VIRTIO_MEM_SUPPORTED
+    select SNAPSHOT
 
 config PC_PCI
     bool
diff --git a/hw/misc/Kconfig b/hw/misc/Kconfig
index cbabe9f78c..fe84f812f2 100644
--- a/hw/misc/Kconfig
+++ b/hw/misc/Kconfig
@@ -174,4 +174,7 @@ config VIRT_CTRL
 config LASI
     bool
 
+config SNAPSHOT
+    bool
+
 source macio/Kconfig
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index 95268eddc0..ac8fcc5f0b 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -10,6 +10,7 @@ softmmu_ss.add(when: 'CONFIG_UNIMP', if_true: files('unimp.c'))
 softmmu_ss.add(when: 'CONFIG_EMPTY_SLOT', if_true: files('empty_slot.c'))
 softmmu_ss.add(when: 'CONFIG_LED', if_true: files('led.c'))
 softmmu_ss.add(when: 'CONFIG_PVPANIC_COMMON', if_true: files('pvpanic.c'))
+softmmu_ss.add(when: 'CONFIG_SNAPSHOT', if_true: files('snapshot.c'))
 
 # ARM devices
 softmmu_ss.add(when: 'CONFIG_PL310', if_true: files('arm_l2x0.c'))
diff --git a/hw/misc/snapshot.c b/hw/misc/snapshot.c
new file mode 100644
index 0000000000..2690b331fd
--- /dev/null
+++ b/hw/misc/snapshot.c
@@ -0,0 +1,86 @@
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "hw/pci/pci.h"
+#include "hw/hw.h"
+#include "hw/boards.h"
+#include "exec/ramblock.h"
+#include "qom/object.h"
+#include "qemu/module.h"
+#include "qapi/visitor.h"
+#include "io/channel-buffer.h"
+#include "migration/savevm.h"
+
+#define TYPE_PCI_SNAPSHOT_DEVICE "snapshot"
+typedef struct SnapshotState SnapshotState;
+DECLARE_INSTANCE_CHECKER(SnapshotState, SNAPSHOT,
+                         TYPE_PCI_SNAPSHOT_DEVICE)
+
+struct SnapshotState {
+    PCIDevice pdev;
+    MemoryRegion mmio;
+};
+
+static uint64_t snapshot_mmio_read(void *opaque, hwaddr addr, unsigned size)
+{
+    return 0;
+}
+
+static void snapshot_mmio_write(void *opaque, hwaddr addr, uint64_t val,
+                unsigned size)
+{
+}
+
+static const MemoryRegionOps snapshot_mmio_ops = {
+    .read = snapshot_mmio_read,
+    .write = snapshot_mmio_write,
+    .endianness = DEVICE_NATIVE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+
+};
+
+static void pci_snapshot_realize(PCIDevice *pdev, Error **errp)
+{
+    SnapshotState *snapshot = SNAPSHOT(pdev);
+
+    memory_region_init_io(&snapshot->mmio, OBJECT(snapshot), &snapshot_mmio_ops, snapshot,
+                    "snapshot-mmio", 1 * MiB);
+    pci_register_bar(pdev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &snapshot->mmio);
+}
+
+static void snapshot_class_init(ObjectClass *class, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(class);
+    PCIDeviceClass *k = PCI_DEVICE_CLASS(class);
+
+    k->realize = pci_snapshot_realize;
+    k->vendor_id = PCI_VENDOR_ID_QEMU;
+    k->device_id = 0xf987;
+    k->revision = 0x10;
+    k->class_id = PCI_CLASS_OTHERS;
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+}
+
+static void pci_snapshot_register_types(void)
+{
+    static InterfaceInfo interfaces[] = {
+        { INTERFACE_CONVENTIONAL_PCI_DEVICE },
+        { },
+    };
+    static const TypeInfo snapshot_info = {
+        .name          = TYPE_PCI_SNAPSHOT_DEVICE,
+        .parent        = TYPE_PCI_DEVICE,
+        .instance_size = sizeof(SnapshotState),
+        .class_init    = snapshot_class_init,
+        .interfaces = interfaces,
+    };
+
+    type_register_static(&snapshot_info);
+}
+type_init(pci_snapshot_register_types)
-- 
2.35.1




* [RFC 2/3] implement ram save/restore
  2022-07-22 19:20 [RFC 0/3] add snapshot/restore fuzzing device Richard Liu
  2022-07-22 19:20 ` [RFC 1/3] create skeleton snapshot device and add docs Richard Liu
@ 2022-07-22 19:20 ` Richard Liu
  2022-07-22 19:20 ` [RFC 3/3] use migration code for cpu and device save/restore Richard Liu
  2022-07-22 20:10 ` [RFC 0/3] add snapshot/restore fuzzing device Claudio Fontana
  3 siblings, 0 replies; 6+ messages in thread
From: Richard Liu @ 2022-07-22 19:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: alxndr, bsd, darren.kenny, Richard Liu

Use a file-backed copy-on-write mmap region for snapshots. Restores are
handled by remapping the fixed region. Currently, the snapshot file save
path (`filepath`) is hardcoded (to a path that is memory-backed on my
machine).

Signed-off-by: Richard Liu <richy.liu.2002@gmail.com>
---
 hw/misc/snapshot.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/hw/misc/snapshot.c b/hw/misc/snapshot.c
index 2690b331fd..510bf59dce 100644
--- a/hw/misc/snapshot.c
+++ b/hw/misc/snapshot.c
@@ -18,8 +18,63 @@ DECLARE_INSTANCE_CHECKER(SnapshotState, SNAPSHOT,
 struct SnapshotState {
     PCIDevice pdev;
     MemoryRegion mmio;
+
+    // track saved state to prevent re-saving
+    bool is_saved;
+
+    // saved cpu and devices state
+    QIOChannelBuffer *ioc;
 };
 
+// memory save location (for better performance, use tmpfs)
+const char *filepath = "/Volumes/RAMDisk/snapshot_0";
+
+static void save_snapshot(struct SnapshotState *s) {
+    if (s->is_saved) {
+        return;
+    }
+    s->is_saved = true;
+
+    // save memory state to file
+    int fd = -1;
+    uint8_t *guest_mem = current_machine->ram->ram_block->host;
+    size_t guest_size = current_machine->ram->ram_block->max_length;
+
+    fd = open(filepath, O_RDWR | O_CREAT | O_TRUNC, (mode_t)0600);
+    ftruncate(fd, guest_size);
+
+    char *map = mmap(0, guest_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+    memcpy(map, guest_mem, guest_size);
+    msync(map, guest_size, MS_SYNC);
+    munmap(map, guest_size);
+    close(fd);
+
+    // unmap the guest RAM; we will now map it MAP_PRIVATE
+    munmap(guest_mem, guest_size);
+
+    // map as MAP_PRIVATE to avoid carrying writes back to the saved file
+    fd = open(filepath, O_RDONLY);
+    mmap(guest_mem, guest_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_FIXED, fd, 0);
+}
+
+static void restore_snapshot(struct SnapshotState *s) {
+    int fd = -1;
+    uint8_t *guest_mem = current_machine->ram->ram_block->host;
+    size_t guest_size = current_machine->ram->ram_block->max_length;
+
+    if (!s->is_saved) {
+        fprintf(stderr, "[QEMU] ERROR: attempting to restore but state has not been saved!\n");
+        return;
+    }
+
+    munmap(guest_mem, guest_size);
+
+    // remap the snapshot at the same location
+    fd = open(filepath, O_RDONLY);
+    mmap(guest_mem, guest_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_FIXED, fd, 0);
+    close(fd);
+}
+
 static uint64_t snapshot_mmio_read(void *opaque, hwaddr addr, unsigned size)
 {
     return 0;
@@ -28,6 +83,21 @@ static uint64_t snapshot_mmio_read(void *opaque, hwaddr addr, unsigned size)
 static void snapshot_mmio_write(void *opaque, hwaddr addr, uint64_t val,
                 unsigned size)
 {
+    SnapshotState *snapshot = opaque;
+    (void)snapshot;
+
+    switch (addr) {
+    case 0x00:
+        switch (val) {
+        case 0x101:
+            save_snapshot(snapshot);
+            break;
+        case 0x102:
+            restore_snapshot(snapshot);
+            break;
+        }
+        break;
+    }
 }
 
 static const MemoryRegionOps snapshot_mmio_ops = {
@@ -48,6 +118,8 @@ static const MemoryRegionOps snapshot_mmio_ops = {
 static void pci_snapshot_realize(PCIDevice *pdev, Error **errp)
 {
     SnapshotState *snapshot = SNAPSHOT(pdev);
+    snapshot->is_saved = false;
+    snapshot->ioc = NULL;
 
     memory_region_init_io(&snapshot->mmio, OBJECT(snapshot), &snapshot_mmio_ops, snapshot,
                     "snapshot-mmio", 1 * MiB);
-- 
2.35.1




* [RFC 3/3] use migration code for cpu and device save/restore
  2022-07-22 19:20 [RFC 0/3] add snapshot/restore fuzzing device Richard Liu
  2022-07-22 19:20 ` [RFC 1/3] create skeleton snapshot device and add docs Richard Liu
  2022-07-22 19:20 ` [RFC 2/3] implement ram save/restore Richard Liu
@ 2022-07-22 19:20 ` Richard Liu
  2022-07-22 20:10 ` [RFC 0/3] add snapshot/restore fuzzing device Claudio Fontana
  3 siblings, 0 replies; 6+ messages in thread
From: Richard Liu @ 2022-07-22 19:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: alxndr, bsd, darren.kenny, Richard Liu

Reuse the device migration code for CPU and device state snapshots. In
this initial version, I used several hacks to get the device code working.

vm_stop() doesn't have the intended effect (for qemu_save_device_state())
unless it is called outside the vCPU thread. I trick the function into
thinking it is outside the vCPU thread by temporarily setting
`current_cpu` to NULL.

The restore code (qemu_loadvm_state() in particular) needs to be called
in a bottom half or a coroutine; I am not sure why.

Signed-off-by: Richard Liu <richy.liu.2002@gmail.com>
---
 hw/misc/snapshot.c |  6 ++++
 migration/savevm.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++
 migration/savevm.h |  3 ++
 3 files changed, 93 insertions(+)

diff --git a/hw/misc/snapshot.c b/hw/misc/snapshot.c
index 510bf59dce..afdc5b7f15 100644
--- a/hw/misc/snapshot.c
+++ b/hw/misc/snapshot.c
@@ -55,6 +55,9 @@ static void save_snapshot(struct SnapshotState *s) {
     // map as MAP_PRIVATE to avoid carrying writes back to the saved file
     fd = open(filepath, O_RDONLY);
     mmap(guest_mem, guest_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_FIXED, fd, 0);
+
+    // save cpu and device state
+    s->ioc = qemu_snapshot_save_cpu_state();
 }
 
 static void restore_snapshot(struct SnapshotState *s) {
@@ -73,6 +76,9 @@ static void restore_snapshot(struct SnapshotState *s) {
     fd = open(filepath, O_RDONLY);
     mmap(guest_mem, guest_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_FIXED, fd, 0);
     close(fd);
+
+    // restore cpu and device state
+    qemu_snapshot_load_cpu_state(s->ioc);
 }
 
 static uint64_t snapshot_mmio_read(void *opaque, hwaddr addr, unsigned size)
diff --git a/migration/savevm.c b/migration/savevm.c
index 48e85c052c..62e5e5b564 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -3309,3 +3309,87 @@ void qmp_snapshot_delete(const char *job_id,
 
     job_start(&s->common);
 }
+
+// saves the cpu and devices state
+QIOChannelBuffer* qemu_snapshot_save_cpu_state(void)
+{
+    QEMUFile *f;
+    QIOChannelBuffer *ioc;
+    MigrationState *ms = migrate_get_current();
+    int ret;
+
+    /* This is a hack to trick vm_stop() into thinking it is not in vcpu thread.
+     * This is needed to properly stop the VM for a snapshot.
+     */
+    CPUState *cpu = current_cpu;
+    current_cpu = NULL;
+    vm_stop(RUN_STATE_SAVE_VM);
+    current_cpu = cpu;
+
+    global_state_store_running();
+
+    ioc = qio_channel_buffer_new(0x10000);
+    qio_channel_set_name(QIO_CHANNEL(ioc), "snapshot-buffer");
+    f = qemu_file_new_output(QIO_CHANNEL(ioc));
+
+    /* We need to initialize migration otherwise qemu_save_device_state() will
+     * complain.
+     */
+    migrate_init(ms);
+    ms->state = MIGRATION_STATUS_NONE;
+    ms->send_configuration = false;
+
+    cpu_synchronize_all_states();
+
+    ret = qemu_save_device_state(f);
+    if (ret < 0) {
+        fprintf(stderr, "[QEMU] save device err: %d\n", ret);
+    }
+
+    // clean up and restart vm
+    qemu_fflush(f);
+    g_free(f);
+
+    vm_start();
+
+    /* Needed so qemu_loadvm_state does not error with:
+     * qemu-system-x86_64: Expected vmdescription section, but got 0
+     */
+    ms->state = MIGRATION_STATUS_POSTCOPY_ACTIVE;
+
+    return ioc;
+}
+
+// loads the cpu and devices state
+static void do_snapshot_load(void* opaque) {
+    QIOChannelBuffer *ioc = opaque;
+    QEMUFile *f;
+    int ret;
+
+    vm_stop(RUN_STATE_RESTORE_VM);
+
+    // seek back to beginning of file
+    qio_channel_io_seek(QIO_CHANNEL(ioc), 0, 0, NULL);
+    f = qemu_file_new_input(QIO_CHANNEL(ioc));
+
+    ret = qemu_loadvm_state(f);
+    if (ret < 0) {
+        fprintf(stderr, "[QEMU] loadvm err: %d\n", ret);
+    }
+
+    vm_start();
+
+    g_free(f);
+
+    // print time to debug speed
+    struct timespec ts;
+    clock_gettime(CLOCK_MONOTONIC, &ts);
+    fprintf(stderr, "loaded snapshot at %ld.%09ld\n", ts.tv_sec, ts.tv_nsec);
+}
+
+void qemu_snapshot_load_cpu_state(QIOChannelBuffer *ioc) {
+    /* Run in a bh because otherwise qemu_loadvm_state won't work
+     */
+    QEMUBH *bh = qemu_bh_new(do_snapshot_load, ioc);
+    qemu_bh_schedule(bh);
+}
diff --git a/migration/savevm.h b/migration/savevm.h
index 6461342cb4..990bcddd2f 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -68,4 +68,7 @@ int qemu_load_device_state(QEMUFile *f);
 int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
         bool in_postcopy, bool inactivate_disks);
 
+QIOChannelBuffer* qemu_snapshot_save_cpu_state(void);
+void qemu_snapshot_load_cpu_state(QIOChannelBuffer *ioc);
+
 #endif
-- 
2.35.1




* Re: [RFC 0/3] add snapshot/restore fuzzing device
  2022-07-22 19:20 [RFC 0/3] add snapshot/restore fuzzing device Richard Liu
                   ` (2 preceding siblings ...)
  2022-07-22 19:20 ` [RFC 3/3] use migration code for cpu and device save/restore Richard Liu
@ 2022-07-22 20:10 ` Claudio Fontana
  2022-07-23 15:52   ` Alexander Bulekov
  3 siblings, 1 reply; 6+ messages in thread
From: Claudio Fontana @ 2022-07-22 20:10 UTC (permalink / raw)
  To: Richard Liu, qemu-devel
  Cc: alxndr, bsd, darren.kenny, Dr . David Alan Gilbert, nborisov, Het Gala

Hi Richard,

On 7/22/22 21:20, Richard Liu wrote:
> This RFC adds a virtual device for snapshot/restores within QEMU. I am working
> on this as a part of QEMU Google Summer of Code 2022. Fast snapshot/restores
> within QEMU is helpful for code fuzzing.
> 
> I reused the migration code for saving and restoring virtual device and CPU
> state. As for the RAM, I am using a simple COW mmaped file to do restores.
> 
> The loadvm migration function I used for doing restores only worked after I
> called it from a qemu_bh. I'm not sure if I should run the migration code in a
> separate thread (see patch 3), since currently it is running as a part of the
> device code in the vCPU thread.
> 
> This is a rough first revision and feedback on the cpu and device state restores
> is appreciated.

As I understand it, the save and restore of VM state in QEMU is usually best
managed through the libvirt APIs, for example using the libvirt command-line tool virsh:

$ virsh save (or managedsave)

$ virsh restore (or start)

These commands start a QEMU migration using the QMP protocol to a file descriptor,
previously opened by libvirt to contain the state file.

(getfd QMP command):
https://qemu-project.gitlab.io/qemu/interop/qemu-qmp-ref.html#qapidoc-2811

(migrate QMP command):
https://qemu-project.gitlab.io/qemu/interop/qemu-qmp-ref.html#qapidoc-1947

This is unfortunately currently very slow.

Maybe you could help think through, or help implement, a solution?
I tried to push an approach that only involves libvirt, using the existing QEMU multifd migration to a socket:

https://listman.redhat.com/archives/libvir-list/2022-June/232252.html

Performance is very good compared with what is possible today, but it won't be upstreamable: it is not deemed optimal, and libvirt wants the code to be in QEMU.

What about helping to think out what the QEMU-based solution could look like?

The requirements for now in my view seem to be:

* avoiding kernel page-cache thrashing for large transfers,
  which in my view currently requires changing QEMU to be able to migrate a stream to an fd that is open with O_DIRECT.
  In practice this means somehow making all QEMU migration stream writes block-friendly (adding some buffering?).

* allow concurrent parallel transfers
  to be able to use extra cpu resources to speed up the transfer if such resources are available.

* we should be able to transfer multiple GB/s with modern NVMe drives for super-fast VM state save and restore (a few seconds even for a 30 GB VM),
  and we should do no worse than the prototype fully implemented in libvirt; otherwise it would not make sense to implement it in QEMU.

What do you think?

Ciao,

Claudio

> 
> To test locally, boot up any linux distro. I used the following C file to
> interact with the PCI snapshot device:
> 
>     #include <stdio.h>
>     #include <stdint.h>
>     #include <fcntl.h>
>     #include <sys/mman.h>
>     #include <unistd.h>
> 
>     int main() {
>         int fd = open("/sys/bus/pci/devices/0000:00:04.0/resource0", O_RDWR | O_SYNC);
>         size_t size = 1024 * 1024;
>         uint32_t* memory = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
> 
>         printf("%x\n", memory[0]);
> 
>         int a = 0;
>         memory[0] = 0x101; // save snapshot
>         printf("before: value of a = %d\n", a);
>         a = 1;
>         printf("middle: value of a = %d\n", a);
>         memory[0] = 0x102; // load snapshot
>         printf("after: value of a = %d\n", a);
> 
>         return 0;
>     }
> 
> Richard Liu (3):
>   create skeleton snapshot device and add docs
>   implement ram save/restore
>   use migration code for cpu and device save/restore
> 
>  docs/devel/snapshot.rst |  26 +++++++
>  hw/i386/Kconfig         |   1 +
>  hw/misc/Kconfig         |   3 +
>  hw/misc/meson.build     |   1 +
>  hw/misc/snapshot.c      | 164 ++++++++++++++++++++++++++++++++++++++++
>  migration/savevm.c      |  84 ++++++++++++++++++++
>  migration/savevm.h      |   3 +
>  7 files changed, 282 insertions(+)
>  create mode 100644 docs/devel/snapshot.rst
>  create mode 100644 hw/misc/snapshot.c
> 




* Re: [RFC 0/3] add snapshot/restore fuzzing device
  2022-07-22 20:10 ` [RFC 0/3] add snapshot/restore fuzzing device Claudio Fontana
@ 2022-07-23 15:52   ` Alexander Bulekov
  0 siblings, 0 replies; 6+ messages in thread
From: Alexander Bulekov @ 2022-07-23 15:52 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Richard Liu, qemu-devel, bsd, darren.kenny,
	Dr . David Alan Gilbert, nborisov, Het Gala

On 220722 2210, Claudio Fontana wrote:
> Hi Richard,
> 
> On 7/22/22 21:20, Richard Liu wrote:
> > This RFC adds a virtual device for snapshot/restores within QEMU. I am working
> > on this as a part of QEMU Google Summer of Code 2022. Fast snapshot/restores
> > within QEMU is helpful for code fuzzing.
> > 
> > I reused the migration code for saving and restoring virtual device and CPU
> > state. As for the RAM, I am using a simple COW mmaped file to do restores.
> > 
> > The loadvm migration function I used for doing restores only worked after I
> > called it from a qemu_bh. I'm not sure if I should run the migration code in a
> > separate thread (see patch 3), since currently it is running as a part of the
> > device code in the vCPU thread.
> > 
> > This is a rough first revision and feedback on the cpu and device state restores
> > is appreciated.
> 
> As I understand it, usually the save and restore of VM state in QEMU can best be
> managed by libvirt APIs, and for example using the libvirt command line tool virsh:
> 
> $ virsh save (or managedsave)
> 
> $ virsh restore (or start)
> 
> These commands start a QEMU migration using the QMP protocol to a file descriptor,
> previously opened by libvirt to contain the state file.
> 
> (getfd QMP command):
> https://qemu-project.gitlab.io/qemu/interop/qemu-qmp-ref.html#qapidoc-2811
> 
> (migrate QMP command):
> https://qemu-project.gitlab.io/qemu/interop/qemu-qmp-ref.html#qapidoc-1947
> 
> This is unfortunately currently very slow.
> 
> Maybe you could help thinking out or with the implementation of the solution?
> I tried to push this approach that only involves libvirt, using the existing QEMU multifd migration to a socket:
> 
> https://listman.redhat.com/archives/libvir-list/2022-June/232252.html
> 
> performance is very good compared with what is possible today, but it won't be upstreamable because it is not deemed optimal, and libvirt wants the code to be in QEMU.
> 
> What about helping in thinking out how the QEMU-based solution could look like?
> 
> The requirements for now in my view seem to be:
> 
> * avoiding the kernel file page trashing for large transfers
>   which currently requires in my view changing QEMU to be able to migrate a stream to an fd that is open with O_DIRECT.
>   In practice this means somehow making all QEMU migration stream writes block-friendly (adding some buffering?).
> 
> * allow concurrent parallel transfers
>   to be able to use extra cpu resources to speed up the transfer if such resources are available.
> 
> * we should be able to transfer multiple GB/s with modern nvmes for super fast VM state save and restore (few seconds even for a 30GB VM),
>   and we should do no worse than the prototype fully implemented in libvirt, otherwise it would not make sense to implement it in QEMU.
> 
> What do you think?

Hi Claudio,
These changes aim to restore a VM hundreds to thousands of times per second
within the same process. Do you think that is achievable with the design
of QMP migrate? We want to avoid serializing/transferring all of
memory over the fd. So right now, this series only uses migration code
for device state. Right now (in 3/3), the memory is "restored" simply by
re-mmapping MAP_PRIVATE from file-backed memory. However, future
versions might use dirty-page tracking with a shadow memory snapshot, to
avoid the page faults that result from the mmap + MAP_PRIVATE approach.

In terms of the way the guest initiates snapshots/restores, maybe there
is a neater way to do this with QMP, by providing the guest with access
to QMP via a serial device. That way, we avoid the need for a custom
virtual device. Right now, the snapshots are requested/restored over
MMIO, since we need to take snapshots at precise points in the
guest's execution (i.e. a specific program counter in a process running
in the guest). I wonder if there is a way to achieve that with QMP
forwarded to the guest.

-Alex

> 
> Ciao,
> 
> Claudio
> 
> > 
> > To test locally, boot up any linux distro. I used the following C file to
> > interact with the PCI snapshot device:
> > 
> >     #include <stdio.h>
> >     #include <stdint.h>
> >     #include <fcntl.h>
> >     #include <sys/mman.h>
> >     #include <unistd.h>
> > 
> >     int main() {
> >         int fd = open("/sys/bus/pci/devices/0000:00:04.0/resource0", O_RDWR | O_SYNC);
> >         size_t size = 1024 * 1024;
> >         uint32_t* memory = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
> > 
> >         printf("%x\n", memory[0]);
> > 
> >         int a = 0;
> >         memory[0] = 0x101; // save snapshot
> >         printf("before: value of a = %d\n", a);
> >         a = 1;
> >         printf("middle: value of a = %d\n", a);
> >         memory[0] = 0x102; // load snapshot
> >         printf("after: value of a = %d\n", a);
> > 
> >         return 0;
> >     }
> > 
> > Richard Liu (3):
> >   create skeleton snapshot device and add docs
> >   implement ram save/restore
> >   use migration code for cpu and device save/restore
> > 
> >  docs/devel/snapshot.rst |  26 +++++++
> >  hw/i386/Kconfig         |   1 +
> >  hw/misc/Kconfig         |   3 +
> >  hw/misc/meson.build     |   1 +
> >  hw/misc/snapshot.c      | 164 ++++++++++++++++++++++++++++++++++++++++
> >  migration/savevm.c      |  84 ++++++++++++++++++++
> >  migration/savevm.h      |   3 +
> >  7 files changed, 282 insertions(+)
> >  create mode 100644 docs/devel/snapshot.rst
> >  create mode 100644 hw/misc/snapshot.c
> > 
> 



