All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nicholas Piggin <npiggin@gmail.com>
To: qemu-devel@nongnu.org
Cc: "Nicholas Piggin" <npiggin@gmail.com>,
	"Pavel Dovgalyuk" <Pavel.Dovgalyuk@ispras.ru>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"John Snow" <jsnow@redhat.com>, "Cleber Rosa" <crosa@redhat.com>,
	"Wainer dos Santos Moschetta" <wainersm@redhat.com>,
	"Beraldo Leal" <bleal@redhat.com>,
	"Michael Tokarev" <mjt@tls.msk.ru>
Subject: [PATCH v5 20/24] replay: simple auto-snapshot mode for record
Date: Tue, 19 Mar 2024 01:46:17 +1000	[thread overview]
Message-ID: <20240318154621.2361161-21-npiggin@gmail.com> (raw)
In-Reply-To: <20240318154621.2361161-1-npiggin@gmail.com>

record makes an initial snapshot when the machine is created, to enable
reverse-debugging. Often the issue being debugged appears near the end of
the trace, so it is important for performance to keep snapshots close to
the end.

This implements a periodic snapshot mode that keeps a rolling set of
recent snapshots. This could be done by the debugger or other program
that talks QMP, but for setting up simple scenarios and tests, this is
more convenient.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 docs/system/replay.rst   |  5 ++++
 include/sysemu/replay.h  | 11 ++++++++
 replay/replay-snapshot.c | 57 ++++++++++++++++++++++++++++++++++++++++
 replay/replay.c          | 27 +++++++++++++++++--
 system/vl.c              |  9 +++++++
 qemu-options.hx          |  9 +++++--
 6 files changed, 114 insertions(+), 4 deletions(-)

diff --git a/docs/system/replay.rst b/docs/system/replay.rst
index ca7c17c63d..1ae8614475 100644
--- a/docs/system/replay.rst
+++ b/docs/system/replay.rst
@@ -156,6 +156,11 @@ for storing VM snapshots. Here is the example of the command line for this:
 ``empty.qcow2`` drive does not connected to any virtual block device and used
 for VM snapshots only.
 
+``rrsnapmode`` can be used to select just an initial snapshot or periodic
+snapshots, with ``rrsnapcount`` specifying the number of periodic snapshots
+to maintain, and ``rrsnaptime`` the amount of run time in seconds between
+periodic snapshots.
+
 .. _network-label:
 
 Network devices
diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h
index 8102fa54f0..92fa82842b 100644
--- a/include/sysemu/replay.h
+++ b/include/sysemu/replay.h
@@ -48,6 +48,17 @@ typedef enum ReplayCheckpoint ReplayCheckpoint;
 
 typedef struct ReplayNetState ReplayNetState;
 
+enum ReplaySnapshotMode {
+    REPLAY_SNAPSHOT_MODE_INITIAL,
+    REPLAY_SNAPSHOT_MODE_PERIODIC,
+};
+typedef enum ReplaySnapshotMode ReplaySnapshotMode;
+
+extern ReplaySnapshotMode replay_snapshot_mode;
+
+extern uint64_t replay_snapshot_periodic_delay;
+extern int replay_snapshot_periodic_nr_keep;
+
 /* Name of the initial VM snapshot */
 extern char *replay_snapshot;
 
diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c
index ccb4d89dda..762555feaa 100644
--- a/replay/replay-snapshot.c
+++ b/replay/replay-snapshot.c
@@ -70,6 +70,53 @@ void replay_vmstate_register(void)
     vmstate_register(NULL, 0, &vmstate_replay, &replay_state);
 }
 
+static QEMUTimer *replay_snapshot_timer;
+static int replay_snapshot_count;
+
+static void replay_snapshot_timer_cb(void *opaque)
+{
+    Error *err = NULL;
+    char *name;
+
+    if (!replay_can_snapshot()) {
+        /* Try again soon */
+        timer_mod(replay_snapshot_timer,
+                  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
+                  replay_snapshot_periodic_delay / 10);
+        return;
+    }
+
+    name = g_strdup_printf("%s-%d", replay_snapshot, replay_snapshot_count);
+    if (!save_snapshot(name,
+                       true, NULL, false, NULL, &err)) {
+        error_report_err(err);
+        error_report("Could not create periodic snapshot "
+                     "for icount record, disabling");
+        g_free(name);
+        return;
+    }
+    g_free(name);
+    replay_snapshot_count++;
+
+    if (replay_snapshot_periodic_nr_keep >= 1 &&
+        replay_snapshot_count > replay_snapshot_periodic_nr_keep) {
+        int del_nr;
+
+        del_nr = replay_snapshot_count - replay_snapshot_periodic_nr_keep - 1;
+        name = g_strdup_printf("%s-%d", replay_snapshot, del_nr);
+        if (!delete_snapshot(name, false, NULL, &err)) {
+            error_report_err(err);
+            error_report("Could not delete periodic snapshot "
+                         "for icount record");
+        }
+        g_free(name);
+    }
+
+    timer_mod(replay_snapshot_timer,
+              qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
+              replay_snapshot_periodic_delay);
+}
+
 void replay_vmstate_init(void)
 {
     Error *err = NULL;
@@ -82,6 +129,16 @@ void replay_vmstate_init(void)
                 error_report("Could not create snapshot for icount record");
                 exit(1);
             }
+
+            if (replay_snapshot_mode == REPLAY_SNAPSHOT_MODE_PERIODIC) {
+                replay_snapshot_timer = timer_new_ms(QEMU_CLOCK_REALTIME,
+                                                     replay_snapshot_timer_cb,
+                                                     NULL);
+                timer_mod(replay_snapshot_timer,
+                          qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
+                          replay_snapshot_periodic_delay);
+            }
+
         } else if (replay_mode == REPLAY_MODE_PLAY) {
             if (!load_snapshot(replay_snapshot, NULL, false, NULL, &err)) {
                 error_report_err(err);
diff --git a/replay/replay.c b/replay/replay.c
index 895fa6b67a..c916e71d30 100644
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -29,6 +29,10 @@
 ReplayMode replay_mode = REPLAY_MODE_NONE;
 char *replay_snapshot;
 
+ReplaySnapshotMode replay_snapshot_mode;
+uint64_t replay_snapshot_periodic_delay;
+int replay_snapshot_periodic_nr_keep;
+
 /* Name of replay file  */
 static char *replay_filename;
 ReplayState replay_state;
@@ -424,6 +428,27 @@ void replay_configure(QemuOpts *opts)
     }
 
     replay_snapshot = g_strdup(qemu_opt_get(opts, "rrsnapshot"));
+    if (replay_snapshot && mode == REPLAY_MODE_RECORD) {
+        const char *snapmode;
+
+        snapmode = qemu_opt_get(opts, "rrsnapmode");
+        if (!snapmode || !strcmp(snapmode, "initial")) {
+            replay_snapshot_mode = REPLAY_SNAPSHOT_MODE_INITIAL;
+        } else if (!strcmp(snapmode, "periodic")) {
+            replay_snapshot_mode = REPLAY_SNAPSHOT_MODE_PERIODIC;
+        } else {
+            error_report("Invalid rrsnapmode option: %s", snapmode);
+            exit(1);
+        }
+
+        /* Default 10 host seconds of machine runtime per snapshot. */
+        replay_snapshot_periodic_delay =
+                           qemu_opt_get_number(opts, "rrsnaptime", 10) * 1000;
+
+        /* Default 2, to cover at least the last 10 host seconds of runtime. */
+        replay_snapshot_periodic_nr_keep =
+                           qemu_opt_get_number(opts, "rrsnapcount", 2);
+    }
     replay_vmstate_register();
     replay_enable(fname, mode);
 
@@ -446,8 +471,6 @@ void replay_start(void)
         exit(1);
     }
 
-    /* Timer for snapshotting will be set up here. */
-
     replay_enable_events();
 }
 
diff --git a/system/vl.c b/system/vl.c
index 70f4cece7f..8070cee6da 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -447,6 +447,15 @@ static QemuOptsList qemu_icount_opts = {
         }, {
             .name = "rrsnapshot",
             .type = QEMU_OPT_STRING,
+        }, {
+            .name = "rrsnapmode",
+            .type = QEMU_OPT_STRING,
+        }, {
+            .name = "rrsnaptime",
+            .type = QEMU_OPT_NUMBER,
+        }, {
+            .name = "rrsnapcount",
+            .type = QEMU_OPT_NUMBER,
         },
         { /* end of list */ }
     },
diff --git a/qemu-options.hx b/qemu-options.hx
index 7fd1713fa8..51f2498f12 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4665,13 +4665,13 @@ SRST
 ERST
 
 DEF("icount", HAS_ARG, QEMU_OPTION_icount, \
-    "-icount [shift=N|auto][,align=on|off][,sleep=on|off][,rr=record|replay,rrfile=<filename>[,rrsnapshot=<snapshot>]]\n" \
+    "-icount [shift=N|auto][,align=on|off][,sleep=on|off][,rr=record|replay,rrfile=<filename>[,rrsnapshot=<snapshot>][,rrsnapmode=initial|periodic][,rrsnaptime=secs][,rrsnapcount=N]\n" \
     "                enable virtual instruction counter with 2^N clock ticks per\n" \
     "                instruction, enable aligning the host and virtual clocks\n" \
     "                or disable real time cpu sleeping, and optionally enable\n" \
     "                record-and-replay mode\n", QEMU_ARCH_ALL)
 SRST
-``-icount [shift=N|auto][,align=on|off][,sleep=on|off][,rr=record|replay,rrfile=filename[,rrsnapshot=snapshot]]``
+``-icount [shift=N|auto][,align=on|off][,sleep=on|off][,rr=record|replay,rrfile=filename[,rrsnapshot=snapshot][,rrsnapmode=initial|periodic][,rrsnaptime=secs][,rrsnapcount=N]]``
     Enable virtual instruction counter. The virtual cpu will execute one
     instruction every 2^N ns of virtual time. If ``auto`` is specified
     then the virtual cpu speed will be automatically adjusted to keep
@@ -4713,6 +4713,11 @@ SRST
     name. In record mode, a new VM snapshot with the given name is created
     at the start of execution recording. In replay mode this option
     specifies the snapshot name used to load the initial VM state.
+    ``rrsnapmode=periodic`` will additionally cause a periodic snapshot to
+    be created after ``rrsnaptime=secs`` seconds of real runtime. The last
+    ``rrsnapcount=N`` periodic snapshots (not including the initial) will
+    be kept (0 for infinite). Periodic snapshots are useful to speed
+    reverse debugging operations near the end of the recorded trace.
 ERST
 
 DEF("watchdog-action", HAS_ARG, QEMU_OPTION_watchdog_action, \
-- 
2.42.0



  parent reply	other threads:[~2024-03-18 15:48 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-18 15:45 [PATCH v5 00/24] replay: fixes and new test cases Nicholas Piggin
2024-03-18 15:45 ` [PATCH v5 01/24] scripts/replay-dump.py: Update to current rr record format Nicholas Piggin
2024-03-18 15:45 ` [PATCH v5 02/24] scripts/replay-dump.py: rejig decoders in event number order Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 03/24] tests/avocado: excercise scripts/replay-dump.py in replay tests Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 04/24] replay: allow runstate shutdown->running when replaying trace Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 05/24] Revert "replay: stop us hanging in rr_wait_io_event" Nicholas Piggin
2024-03-19 17:58   ` Alex Bennée
2024-03-18 15:46 ` [PATCH v5 06/24] tests/avocado: replay_kernel.py add x86-64 q35 machine test Nicholas Piggin
2024-03-19 17:58   ` Alex Bennée
2024-03-18 15:46 ` [PATCH v5 07/24] chardev: set record/replay on the base device of a muxed device Nicholas Piggin
2024-03-19 19:17   ` Alex Bennée
2024-03-18 15:46 ` [PATCH v5 08/24] replay: Fix migration use of clock Nicholas Piggin
2024-03-19 20:40   ` Alex Bennée
2024-03-20  4:48     ` Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 09/24] replay: Fix migration replay_mutex locking Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 10/24] virtio-net: Use replay_schedule_bh_event for bhs that affect machine state Nicholas Piggin
2024-03-19 20:41   ` Alex Bennée
2024-04-05  6:53   ` Pavel Dovgalyuk
2024-03-18 15:46 ` [PATCH v5 11/24] virtio-net: Use virtual time for RSC timers Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 12/24] savevm: Fix load_snapshot error path crash Nicholas Piggin
2024-03-19 20:49   ` Alex Bennée
2024-03-18 15:46 ` [PATCH v5 13/24] tests/avocado: replay_linux.py remove the timeout expected guards Nicholas Piggin
2024-03-19 17:57   ` Alex Bennée
2024-03-20  4:34     ` Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 14/24] tests/avocado: reverse_debugging.py mark aarch64 and pseries as not flaky Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 15/24] tests/avocado: reverse_debugging.py add test for x86-64 q35 machine Nicholas Piggin
2024-03-19 20:50   ` Alex Bennée
2024-03-18 15:46 ` [PATCH v5 16/24] tests/avocado: reverse_debugging.py verify addresses between record and replay Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 17/24] tests/avocado: reverse_debugging.py stop VM before sampling icount Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 18/24] tests/avocado: reverse_debugging reverse-step at the end of the trace Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 19/24] tests/avocado: reverse_debugging.py add snapshot testing Nicholas Piggin
2024-03-18 15:46 ` Nicholas Piggin [this message]
2024-04-05  6:56   ` [PATCH v5 20/24] replay: simple auto-snapshot mode for record Pavel Dovgalyuk
2024-03-18 15:46 ` [PATCH v5 21/24] tests/avocado: reverse_debugging.py test auto-snapshot mode Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 22/24] target/ppc: fix timebase register reset state Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 23/24] spapr: Fix vpa dispatch count for record-replay Nicholas Piggin
2024-03-18 15:46 ` [PATCH v5 24/24] tests/avocado: replay_linux.py add ppc64 pseries test Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240318154621.2361161-21-npiggin@gmail.com \
    --to=npiggin@gmail.com \
    --cc=Pavel.Dovgalyuk@ispras.ru \
    --cc=alex.bennee@linaro.org \
    --cc=bleal@redhat.com \
    --cc=crosa@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=mjt@tls.msk.ru \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=wainersm@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.