All of lore.kernel.org
 help / color / mirror / Atom feed
* [PULL 0/5] tcg patch queue
@ 2023-01-16 22:36 Richard Henderson
  2023-01-16 22:36 ` [PULL 1/5] linux-user: Clean up when exiting due to a signal Richard Henderson
                   ` (6 more replies)
  0 siblings, 7 replies; 29+ messages in thread
From: Richard Henderson @ 2023-01-16 22:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

The following changes since commit fb7e7990342e59cf67dbd895c1a1e3fb1741df7a:

  tests/qtest/qom-test: Do not print tested properties by default (2023-01-16 15:00:57 +0000)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230116

for you to fetch changes up to 61710a7e23a63546da0071ea32adb96476fa5d07:

  accel/tcg: Split out cpu_exec_{setjmp,loop} (2023-01-16 10:14:12 -1000)

----------------------------------------------------------------
- Reorg cpu_tb_exec around setjmp.
- Use __attribute__((target)) for buffer_is_zero.
- Add perfmap and jitdump for perf support.

----------------------------------------------------------------
Ilya Leoshkevich (3):
      linux-user: Clean up when exiting due to a signal
      accel/tcg: Add debuginfo support
      tcg: add perfmap and jitdump

Richard Henderson (2):
      util/bufferiszero: Use __attribute__((target)) for avx2/avx512
      accel/tcg: Split out cpu_exec_{setjmp,loop}

 docs/devel/tcg.rst        |  23 +++
 meson.build               |  16 +-
 accel/tcg/debuginfo.h     |  77 ++++++++++
 accel/tcg/perf.h          |  49 ++++++
 accel/tcg/cpu-exec.c      | 111 +++++++-------
 accel/tcg/debuginfo.c     |  96 ++++++++++++
 accel/tcg/perf.c          | 375 ++++++++++++++++++++++++++++++++++++++++++++++
 accel/tcg/translate-all.c |   7 +
 hw/core/loader.c          |   5 +
 linux-user/elfload.c      |   3 +
 linux-user/exit.c         |   2 +
 linux-user/main.c         |  15 ++
 linux-user/signal.c       |   8 +-
 softmmu/vl.c              |  11 ++
 tcg/tcg.c                 |   2 +
 util/bufferiszero.c       |  41 +----
 accel/tcg/meson.build     |   2 +
 linux-user/meson.build    |   1 +
 qemu-options.hx           |  20 +++
 19 files changed, 763 insertions(+), 101 deletions(-)
 create mode 100644 accel/tcg/debuginfo.h
 create mode 100644 accel/tcg/perf.h
 create mode 100644 accel/tcg/debuginfo.c
 create mode 100644 accel/tcg/perf.c


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PULL 1/5] linux-user: Clean up when exiting due to a signal
  2023-01-16 22:36 [PULL 0/5] tcg patch queue Richard Henderson
@ 2023-01-16 22:36 ` Richard Henderson
  2023-01-16 22:36 ` [PULL 2/5] accel/tcg: Add debuginfo support Richard Henderson
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Richard Henderson @ 2023-01-16 22:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Ilya Leoshkevich, Alex Bennée

From: Ilya Leoshkevich <iii@linux.ibm.com>

When exiting due to an exit() syscall, qemu-user calls
preexit_cleanup(), but this is currently not the case when exiting due
to a signal. This leads to various buffers not being flushed (e.g.,
for gprof, for gcov, and for the upcoming perf support).

Add the missing call.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230112152013.125680-2-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/signal.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 61c6fa3fcf..098f3a787d 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -695,7 +695,7 @@ void cpu_loop_exit_sigbus(CPUState *cpu, target_ulong addr,
 
 /* abort execution with signal */
 static G_NORETURN
-void dump_core_and_abort(int target_sig)
+void dump_core_and_abort(CPUArchState *cpu_env, int target_sig)
 {
     CPUState *cpu = thread_cpu;
     CPUArchState *env = cpu->env_ptr;
@@ -724,6 +724,8 @@ void dump_core_and_abort(int target_sig)
             target_sig, strsignal(host_sig), "core dumped" );
     }
 
+    preexit_cleanup(cpu_env, 128 + target_sig);
+
     /* The proper exit code for dying from an uncaught signal is
      * -<signal>.  The kernel doesn't allow exit() or _exit() to pass
      * a negative value.  To get the proper exit code we need to
@@ -1058,12 +1060,12 @@ static void handle_pending_signal(CPUArchState *cpu_env, int sig,
                    sig != TARGET_SIGURG &&
                    sig != TARGET_SIGWINCH &&
                    sig != TARGET_SIGCONT) {
-            dump_core_and_abort(sig);
+            dump_core_and_abort(cpu_env, sig);
         }
     } else if (handler == TARGET_SIG_IGN) {
         /* ignore sig */
     } else if (handler == TARGET_SIG_ERR) {
-        dump_core_and_abort(sig);
+        dump_core_and_abort(cpu_env, sig);
     } else {
         /* compute the blocked signals during the handler execution */
         sigset_t *blocked_set;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 2/5] accel/tcg: Add debuginfo support
  2023-01-16 22:36 [PULL 0/5] tcg patch queue Richard Henderson
  2023-01-16 22:36 ` [PULL 1/5] linux-user: Clean up when exiting due to a signal Richard Henderson
@ 2023-01-16 22:36 ` Richard Henderson
  2023-01-16 22:36 ` [PULL 3/5] tcg: add perfmap and jitdump Richard Henderson
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Richard Henderson @ 2023-01-16 22:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Ilya Leoshkevich

From: Ilya Leoshkevich <iii@linux.ibm.com>

Add libdw-based functions for loading and querying debuginfo. Load
debuginfo from the system and the linux-user loaders.

This is useful for the upcoming perf support, which can then put
human-readable guest symbols instead of raw guest PCs into perfmap and
jitdump files.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230112152013.125680-3-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 meson.build            |  8 ++++
 accel/tcg/debuginfo.h  | 77 +++++++++++++++++++++++++++++++++
 accel/tcg/debuginfo.c  | 96 ++++++++++++++++++++++++++++++++++++++++++
 hw/core/loader.c       |  5 +++
 linux-user/elfload.c   |  3 ++
 accel/tcg/meson.build  |  1 +
 linux-user/meson.build |  1 +
 7 files changed, 191 insertions(+)
 create mode 100644 accel/tcg/debuginfo.h
 create mode 100644 accel/tcg/debuginfo.c

diff --git a/meson.build b/meson.build
index 5d68a8fd23..6d212f6c8e 100644
--- a/meson.build
+++ b/meson.build
@@ -1648,6 +1648,12 @@ if libbpf.found() and not cc.links('''
   endif
 endif
 
+# libdw
+libdw = dependency('libdw',
+                   method: 'pkg-config',
+                   kwargs: static_kwargs,
+                   required: false)
+
 #################
 # config-host.h #
 #################
@@ -1923,6 +1929,7 @@ config_host_data.set('CONFIG_DBUS_DISPLAY', dbus_display)
 config_host_data.set('CONFIG_CFI', get_option('cfi'))
 config_host_data.set('CONFIG_SELINUX', selinux.found())
 config_host_data.set('CONFIG_XEN_BACKEND', xen.found())
+config_host_data.set('CONFIG_LIBDW', libdw.found())
 if xen.found()
   # protect from xen.version() having less than three components
   xen_version = xen.version().split('.') + ['0', '0']
@@ -3976,6 +3983,7 @@ summary_info += {'libudev':           libudev}
 # Dummy dependency, keep .found()
 summary_info += {'FUSE lseek':        fuse_lseek.found()}
 summary_info += {'selinux':           selinux}
+summary_info += {'libdw':             libdw}
 summary(summary_info, bool_yn: true, section: 'Dependencies')
 
 if not supported_cpus.contains(cpu)
diff --git a/accel/tcg/debuginfo.h b/accel/tcg/debuginfo.h
new file mode 100644
index 0000000000..7542cfe6e0
--- /dev/null
+++ b/accel/tcg/debuginfo.h
@@ -0,0 +1,77 @@
+/*
+ * Debug information support.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef ACCEL_TCG_DEBUGINFO_H
+#define ACCEL_TCG_DEBUGINFO_H
+
+/*
+ * Debuginfo describing a certain address.
+ */
+struct debuginfo_query {
+    uint64_t address;    /* Input: address. */
+    int flags;           /* Input: debuginfo subset. */
+    const char *symbol;  /* Symbol that the address is part of. */
+    uint64_t offset;     /* Offset from the symbol. */
+    const char *file;    /* Source file associated with the address. */
+    int line;            /* Line number in the source file. */
+};
+
+/*
+ * Debuginfo subsets.
+ */
+#define DEBUGINFO_SYMBOL BIT(1)
+#define DEBUGINFO_LINE   BIT(2)
+
+#if defined(CONFIG_TCG) && defined(CONFIG_LIBDW)
+/*
+ * Load debuginfo for the specified guest ELF image.
+ * Return true on success, false on failure.
+ */
+void debuginfo_report_elf(const char *name, int fd, uint64_t bias);
+
+/*
+ * Take the debuginfo lock.
+ */
+void debuginfo_lock(void);
+
+/*
+ * Fill each on N Qs with the debuginfo about Q->ADDRESS as specified by
+ * Q->FLAGS:
+ *
+ * - DEBUGINFO_SYMBOL: update Q->SYMBOL and Q->OFFSET. If symbol debuginfo is
+ *                     missing, then leave them as is.
+ * - DEBUINFO_LINE: update Q->FILE and Q->LINE. If line debuginfo is missing,
+ *                  then leave them as is.
+ *
+ * This function must be called under the debuginfo lock. The results can be
+ * accessed only until the debuginfo lock is released.
+ */
+void debuginfo_query(struct debuginfo_query *q, size_t n);
+
+/*
+ * Release the debuginfo lock.
+ */
+void debuginfo_unlock(void);
+#else
+static inline void debuginfo_report_elf(const char *image_name, int image_fd,
+                                        uint64_t load_bias)
+{
+}
+
+static inline void debuginfo_lock(void)
+{
+}
+
+static inline void debuginfo_query(struct debuginfo_query *q, size_t n)
+{
+}
+
+static inline void debuginfo_unlock(void)
+{
+}
+#endif
+
+#endif
diff --git a/accel/tcg/debuginfo.c b/accel/tcg/debuginfo.c
new file mode 100644
index 0000000000..71c66d04d1
--- /dev/null
+++ b/accel/tcg/debuginfo.c
@@ -0,0 +1,96 @@
+/*
+ * Debug information support.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/lockable.h"
+
+#include <elfutils/libdwfl.h>
+
+#include "debuginfo.h"
+
+static QemuMutex lock;
+static Dwfl *dwfl;
+static const Dwfl_Callbacks dwfl_callbacks = {
+    .find_elf = NULL,
+    .find_debuginfo = dwfl_standard_find_debuginfo,
+    .section_address = NULL,
+    .debuginfo_path = NULL,
+};
+
+__attribute__((constructor))
+static void debuginfo_init(void)
+{
+    qemu_mutex_init(&lock);
+}
+
+void debuginfo_report_elf(const char *name, int fd, uint64_t bias)
+{
+    QEMU_LOCK_GUARD(&lock);
+
+    if (dwfl) {
+        dwfl_report_begin_add(dwfl);
+    } else {
+        dwfl = dwfl_begin(&dwfl_callbacks);
+    }
+
+    if (dwfl) {
+        dwfl_report_elf(dwfl, name, name, fd, bias, true);
+        dwfl_report_end(dwfl, NULL, NULL);
+    }
+}
+
+void debuginfo_lock(void)
+{
+    qemu_mutex_lock(&lock);
+}
+
+void debuginfo_query(struct debuginfo_query *q, size_t n)
+{
+    const char *symbol, *file;
+    Dwfl_Module *dwfl_module;
+    Dwfl_Line *dwfl_line;
+    GElf_Off dwfl_offset;
+    GElf_Sym dwfl_sym;
+    size_t i;
+    int line;
+
+    if (!dwfl) {
+        return;
+    }
+
+    for (i = 0; i < n; i++) {
+        dwfl_module = dwfl_addrmodule(dwfl, q[i].address);
+        if (!dwfl_module) {
+            continue;
+        }
+
+        if (q[i].flags & DEBUGINFO_SYMBOL) {
+            symbol = dwfl_module_addrinfo(dwfl_module, q[i].address,
+                                          &dwfl_offset, &dwfl_sym,
+                                          NULL, NULL, NULL);
+            if (symbol) {
+                q[i].symbol = symbol;
+                q[i].offset = dwfl_offset;
+            }
+        }
+
+        if (q[i].flags & DEBUGINFO_LINE) {
+            dwfl_line = dwfl_module_getsrc(dwfl_module, q[i].address);
+            if (dwfl_line) {
+                file = dwfl_lineinfo(dwfl_line, NULL, &line, 0, NULL, NULL);
+                if (file) {
+                    q[i].file = file;
+                    q[i].line = line;
+                }
+            }
+        }
+    }
+}
+
+void debuginfo_unlock(void)
+{
+    qemu_mutex_unlock(&lock);
+}
diff --git a/hw/core/loader.c b/hw/core/loader.c
index 0548830733..55dbe2e199 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -61,6 +61,7 @@
 #include "hw/boards.h"
 #include "qemu/cutils.h"
 #include "sysemu/runstate.h"
+#include "accel/tcg/debuginfo.h"
 
 #include <zlib.h>
 
@@ -503,6 +504,10 @@ ssize_t load_elf_ram_sym(const char *filename,
                          clear_lsb, data_swab, as, load_rom, sym_cb);
     }
 
+    if (ret != ELF_LOAD_FAILED) {
+        debuginfo_report_elf(filename, fd, 0);
+    }
+
  fail:
     close(fd);
     return ret;
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 20894b633f..5928c14dfc 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -19,6 +19,7 @@
 #include "qemu/selfmap.h"
 #include "qapi/error.h"
 #include "target_signal.h"
+#include "accel/tcg/debuginfo.h"
 
 #ifdef _ARCH_PPC64
 #undef ARCH_DLINFO
@@ -3261,6 +3262,8 @@ static void load_elf_image(const char *image_name, int image_fd,
         load_symbols(ehdr, image_fd, load_bias);
     }
 
+    debuginfo_report_elf(image_name, image_fd, load_bias);
+
     mmap_unlock();
 
     close(image_fd);
diff --git a/accel/tcg/meson.build b/accel/tcg/meson.build
index 75e1dffb4d..55b3b4dd7e 100644
--- a/accel/tcg/meson.build
+++ b/accel/tcg/meson.build
@@ -12,6 +12,7 @@ tcg_ss.add(files(
 tcg_ss.add(when: 'CONFIG_USER_ONLY', if_true: files('user-exec.c'))
 tcg_ss.add(when: 'CONFIG_SOFTMMU', if_false: files('user-exec-stub.c'))
 tcg_ss.add(when: 'CONFIG_PLUGIN', if_true: [files('plugin-gen.c')])
+tcg_ss.add(when: libdw, if_true: files('debuginfo.c'))
 specific_ss.add_all(when: 'CONFIG_TCG', if_true: tcg_ss)
 
 specific_ss.add(when: ['CONFIG_SOFTMMU', 'CONFIG_TCG'], if_true: files(
diff --git a/linux-user/meson.build b/linux-user/meson.build
index de4320af05..7171dc60be 100644
--- a/linux-user/meson.build
+++ b/linux-user/meson.build
@@ -22,6 +22,7 @@ linux_user_ss.add(files(
   'uname.c',
 ))
 linux_user_ss.add(rt)
+linux_user_ss.add(libdw)
 
 linux_user_ss.add(when: 'TARGET_HAS_BFLT', if_true: files('flatload.c'))
 linux_user_ss.add(when: 'TARGET_I386', if_true: files('vm86.c'))
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 3/5] tcg: add perfmap and jitdump
  2023-01-16 22:36 [PULL 0/5] tcg patch queue Richard Henderson
  2023-01-16 22:36 ` [PULL 1/5] linux-user: Clean up when exiting due to a signal Richard Henderson
  2023-01-16 22:36 ` [PULL 2/5] accel/tcg: Add debuginfo support Richard Henderson
@ 2023-01-16 22:36 ` Richard Henderson
  2023-06-02 17:21   ` Peter Maydell
  2023-06-29 11:31   ` Philippe Mathieu-Daudé
  2023-01-16 22:36 ` [PULL 4/5] util/bufferiszero: Use __attribute__((target)) for avx2/avx512 Richard Henderson
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 29+ messages in thread
From: Richard Henderson @ 2023-01-16 22:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, Ilya Leoshkevich, Vanderson M . do Rosario,
	Alex Bennée

From: Ilya Leoshkevich <iii@linux.ibm.com>

Add ability to dump /tmp/perf-<pid>.map and jit-<pid>.dump.
The first one allows the perf tool to map samples to each individual
translation block. The second one adds the ability to resolve symbol
names, line numbers and inspect JITed code.

Example of use:

    perf record qemu-x86_64 -perfmap ./a.out
    perf report

or

    perf record -k 1 qemu-x86_64 -jitdump ./a.out
    DEBUGINFOD_URLS= perf inject -j -i perf.data -o perf.data.jitted
    perf report -i perf.data.jitted

Co-developed-by: Vanderson M. do Rosario <vandersonmr2@gmail.com>
Co-developed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230112152013.125680-4-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 docs/devel/tcg.rst        |  23 +++
 accel/tcg/perf.h          |  49 +++++
 accel/tcg/perf.c          | 375 ++++++++++++++++++++++++++++++++++++++
 accel/tcg/translate-all.c |   7 +
 linux-user/exit.c         |   2 +
 linux-user/main.c         |  15 ++
 softmmu/vl.c              |  11 ++
 tcg/tcg.c                 |   2 +
 accel/tcg/meson.build     |   1 +
 qemu-options.hx           |  20 ++
 10 files changed, 505 insertions(+)
 create mode 100644 accel/tcg/perf.h
 create mode 100644 accel/tcg/perf.c

diff --git a/docs/devel/tcg.rst b/docs/devel/tcg.rst
index 136a7a0d96..b4096a17df 100644
--- a/docs/devel/tcg.rst
+++ b/docs/devel/tcg.rst
@@ -188,3 +188,26 @@ memory areas instead calls out to C code for device emulation.
 Finally, the MMU helps tracking dirty pages and pages pointed to by
 translation blocks.
 
+Profiling JITted code
+---------------------
+
+The Linux ``perf`` tool will treat all JITted code as a single block as
+unlike the main code it can't use debug information to link individual
+program counter samples with larger functions. To overcome this
+limitation you can use the ``-perfmap`` or the ``-jitdump`` option to generate
+map files. ``-perfmap`` is lightweight and produces only guest-host mappings.
+``-jitdump`` additionally saves JITed code and guest debug information (if
+available); its output needs to be integrated with the ``perf.data`` file
+before the final report can be viewed.
+
+.. code::
+
+  perf record $QEMU -perfmap $REMAINING_ARGS
+  perf report
+
+  perf record -k 1 $QEMU -jitdump $REMAINING_ARGS
+  DEBUGINFOD_URLS= perf inject -j -i perf.data -o perf.data.jitted
+  perf report -i perf.data.jitted
+
+Note that qemu-system generates mappings only for ``-kernel`` files in ELF
+format.
diff --git a/accel/tcg/perf.h b/accel/tcg/perf.h
new file mode 100644
index 0000000000..f92dd52c69
--- /dev/null
+++ b/accel/tcg/perf.h
@@ -0,0 +1,49 @@
+/*
+ * Linux perf perf-<pid>.map and jit-<pid>.dump integration.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef ACCEL_TCG_PERF_H
+#define ACCEL_TCG_PERF_H
+
+#if defined(CONFIG_TCG) && defined(CONFIG_LINUX)
+/* Start writing perf-<pid>.map. */
+void perf_enable_perfmap(void);
+
+/* Start writing jit-<pid>.dump. */
+void perf_enable_jitdump(void);
+
+/* Add information about TCG prologue to profiler maps. */
+void perf_report_prologue(const void *start, size_t size);
+
+/* Add information about JITted guest code to profiler maps. */
+void perf_report_code(uint64_t guest_pc, TranslationBlock *tb,
+                      const void *start);
+
+/* Stop writing perf-<pid>.map and/or jit-<pid>.dump. */
+void perf_exit(void);
+#else
+static inline void perf_enable_perfmap(void)
+{
+}
+
+static inline void perf_enable_jitdump(void)
+{
+}
+
+static inline void perf_report_prologue(const void *start, size_t size)
+{
+}
+
+static inline void perf_report_code(uint64_t guest_pc, TranslationBlock *tb,
+                                    const void *start)
+{
+}
+
+static inline void perf_exit(void)
+{
+}
+#endif
+
+#endif
diff --git a/accel/tcg/perf.c b/accel/tcg/perf.c
new file mode 100644
index 0000000000..ae19f6e28f
--- /dev/null
+++ b/accel/tcg/perf.c
@@ -0,0 +1,375 @@
+/*
+ * Linux perf perf-<pid>.map and jit-<pid>.dump integration.
+ *
+ * The jitdump spec can be found at [1].
+ *
+ * [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/tools/perf/Documentation/jitdump-specification.txt
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "elf.h"
+#include "exec/exec-all.h"
+#include "qemu/timer.h"
+#include "tcg/tcg.h"
+
+#include "debuginfo.h"
+#include "perf.h"
+
+static FILE *safe_fopen_w(const char *path)
+{
+    int saved_errno;
+    FILE *f;
+    int fd;
+
+    /* Delete the old file, if any. */
+    unlink(path);
+
+    /* Avoid symlink attacks by using O_CREAT | O_EXCL. */
+    fd = open(path, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
+    if (fd == -1) {
+        return NULL;
+    }
+
+    /* Convert fd to FILE*. */
+    f = fdopen(fd, "w");
+    if (f == NULL) {
+        saved_errno = errno;
+        close(fd);
+        errno = saved_errno;
+        return NULL;
+    }
+
+    return f;
+}
+
+static FILE *perfmap;
+
+void perf_enable_perfmap(void)
+{
+    char map_file[32];
+
+    snprintf(map_file, sizeof(map_file), "/tmp/perf-%d.map", getpid());
+    perfmap = safe_fopen_w(map_file);
+    if (perfmap == NULL) {
+        warn_report("Could not open %s: %s, proceeding without perfmap",
+                    map_file, strerror(errno));
+    }
+}
+
+/* Get PC and size of code JITed for guest instruction #INSN. */
+static void get_host_pc_size(uintptr_t *host_pc, uint16_t *host_size,
+                             const void *start, size_t insn)
+{
+    uint16_t start_off = insn ? tcg_ctx->gen_insn_end_off[insn - 1] : 0;
+
+    if (host_pc) {
+        *host_pc = (uintptr_t)start + start_off;
+    }
+    if (host_size) {
+        *host_size = tcg_ctx->gen_insn_end_off[insn] - start_off;
+    }
+}
+
+static const char *pretty_symbol(const struct debuginfo_query *q, size_t *len)
+{
+    static __thread char buf[64];
+    int tmp;
+
+    if (!q->symbol) {
+        tmp = snprintf(buf, sizeof(buf), "guest-0x%"PRIx64, q->address);
+        if (len) {
+            *len = MIN(tmp + 1, sizeof(buf));
+        }
+        return buf;
+    }
+
+    if (!q->offset) {
+        if (len) {
+            *len = strlen(q->symbol) + 1;
+        }
+        return q->symbol;
+    }
+
+    tmp = snprintf(buf, sizeof(buf), "%s+0x%"PRIx64, q->symbol, q->offset);
+    if (len) {
+        *len = MIN(tmp + 1, sizeof(buf));
+    }
+    return buf;
+}
+
+static void write_perfmap_entry(const void *start, size_t insn,
+                                const struct debuginfo_query *q)
+{
+    uint16_t host_size;
+    uintptr_t host_pc;
+
+    get_host_pc_size(&host_pc, &host_size, start, insn);
+    fprintf(perfmap, "%"PRIxPTR" %"PRIx16" %s\n",
+            host_pc, host_size, pretty_symbol(q, NULL));
+}
+
+static FILE *jitdump;
+
+#define JITHEADER_MAGIC 0x4A695444
+#define JITHEADER_VERSION 1
+
+struct jitheader {
+    uint32_t magic;
+    uint32_t version;
+    uint32_t total_size;
+    uint32_t elf_mach;
+    uint32_t pad1;
+    uint32_t pid;
+    uint64_t timestamp;
+    uint64_t flags;
+};
+
+enum jit_record_type {
+    JIT_CODE_LOAD = 0,
+    JIT_CODE_DEBUG_INFO = 2,
+};
+
+struct jr_prefix {
+    uint32_t id;
+    uint32_t total_size;
+    uint64_t timestamp;
+};
+
+struct jr_code_load {
+    struct jr_prefix p;
+
+    uint32_t pid;
+    uint32_t tid;
+    uint64_t vma;
+    uint64_t code_addr;
+    uint64_t code_size;
+    uint64_t code_index;
+};
+
+struct debug_entry {
+    uint64_t addr;
+    int lineno;
+    int discrim;
+    const char name[];
+};
+
+struct jr_code_debug_info {
+    struct jr_prefix p;
+
+    uint64_t code_addr;
+    uint64_t nr_entry;
+    struct debug_entry entries[];
+};
+
+static uint32_t get_e_machine(void)
+{
+    Elf64_Ehdr elf_header;
+    FILE *exe;
+    size_t n;
+
+    QEMU_BUILD_BUG_ON(offsetof(Elf32_Ehdr, e_machine) !=
+                      offsetof(Elf64_Ehdr, e_machine));
+
+    exe = fopen("/proc/self/exe", "r");
+    if (exe == NULL) {
+        return EM_NONE;
+    }
+
+    n = fread(&elf_header, sizeof(elf_header), 1, exe);
+    fclose(exe);
+    if (n != 1) {
+        return EM_NONE;
+    }
+
+    return elf_header.e_machine;
+}
+
+void perf_enable_jitdump(void)
+{
+    struct jitheader header;
+    char jitdump_file[32];
+    void *perf_marker;
+
+    if (!use_rt_clock) {
+        warn_report("CLOCK_MONOTONIC is not available, proceeding without jitdump");
+        return;
+    }
+
+    snprintf(jitdump_file, sizeof(jitdump_file), "jit-%d.dump", getpid());
+    jitdump = safe_fopen_w(jitdump_file);
+    if (jitdump == NULL) {
+        warn_report("Could not open %s: %s, proceeding without jitdump",
+                    jitdump_file, strerror(errno));
+        return;
+    }
+
+    /*
+     * `perf inject` will see that the mapped file name in the corresponding
+     * PERF_RECORD_MMAP or PERF_RECORD_MMAP2 event is of the form jit-%d.dump
+     * and will process it as a jitdump file.
+     */
+    perf_marker = mmap(NULL, qemu_real_host_page_size(), PROT_READ | PROT_EXEC,
+                       MAP_PRIVATE, fileno(jitdump), 0);
+    if (perf_marker == MAP_FAILED) {
+        warn_report("Could not map %s: %s, proceeding without jitdump",
+                    jitdump_file, strerror(errno));
+        fclose(jitdump);
+        jitdump = NULL;
+        return;
+    }
+
+    header.magic = JITHEADER_MAGIC;
+    header.version = JITHEADER_VERSION;
+    header.total_size = sizeof(header);
+    header.elf_mach = get_e_machine();
+    header.pad1 = 0;
+    header.pid = getpid();
+    header.timestamp = get_clock();
+    header.flags = 0;
+    fwrite(&header, sizeof(header), 1, jitdump);
+}
+
+void perf_report_prologue(const void *start, size_t size)
+{
+    if (perfmap) {
+        fprintf(perfmap, "%"PRIxPTR" %zx tcg-prologue-buffer\n",
+                (uintptr_t)start, size);
+    }
+}
+
+/* Write a JIT_CODE_DEBUG_INFO jitdump entry. */
+static void write_jr_code_debug_info(const void *start,
+                                     const struct debuginfo_query *q,
+                                     size_t icount)
+{
+    struct jr_code_debug_info rec;
+    struct debug_entry ent;
+    uintptr_t host_pc;
+    int insn;
+
+    /* Write the header. */
+    rec.p.id = JIT_CODE_DEBUG_INFO;
+    rec.p.total_size = sizeof(rec) + sizeof(ent) + 1;
+    rec.p.timestamp = get_clock();
+    rec.code_addr = (uintptr_t)start;
+    rec.nr_entry = 1;
+    for (insn = 0; insn < icount; insn++) {
+        if (q[insn].file) {
+            rec.p.total_size += sizeof(ent) + strlen(q[insn].file) + 1;
+            rec.nr_entry++;
+        }
+    }
+    fwrite(&rec, sizeof(rec), 1, jitdump);
+
+    /* Write the main debug entries. */
+    for (insn = 0; insn < icount; insn++) {
+        if (q[insn].file) {
+            get_host_pc_size(&host_pc, NULL, start, insn);
+            ent.addr = host_pc;
+            ent.lineno = q[insn].line;
+            ent.discrim = 0;
+            fwrite(&ent, sizeof(ent), 1, jitdump);
+            fwrite(q[insn].file, strlen(q[insn].file) + 1, 1, jitdump);
+        }
+    }
+
+    /* Write the trailing debug_entry. */
+    ent.addr = (uintptr_t)start + tcg_ctx->gen_insn_end_off[icount - 1];
+    ent.lineno = 0;
+    ent.discrim = 0;
+    fwrite(&ent, sizeof(ent), 1, jitdump);
+    fwrite("", 1, 1, jitdump);
+}
+
+/* Write a JIT_CODE_LOAD jitdump entry. */
+static void write_jr_code_load(const void *start, uint16_t host_size,
+                               const struct debuginfo_query *q)
+{
+    static uint64_t code_index;
+    struct jr_code_load rec;
+    const char *symbol;
+    size_t symbol_size;
+
+    symbol = pretty_symbol(q, &symbol_size);
+    rec.p.id = JIT_CODE_LOAD;
+    rec.p.total_size = sizeof(rec) + symbol_size + host_size;
+    rec.p.timestamp = get_clock();
+    rec.pid = getpid();
+    rec.tid = qemu_get_thread_id();
+    rec.vma = (uintptr_t)start;
+    rec.code_addr = (uintptr_t)start;
+    rec.code_size = host_size;
+    rec.code_index = code_index++;
+    fwrite(&rec, sizeof(rec), 1, jitdump);
+    fwrite(symbol, symbol_size, 1, jitdump);
+    fwrite(start, host_size, 1, jitdump);
+}
+
+void perf_report_code(uint64_t guest_pc, TranslationBlock *tb,
+                      const void *start)
+{
+    struct debuginfo_query *q;
+    size_t insn;
+
+    if (!perfmap && !jitdump) {
+        return;
+    }
+
+    q = g_try_malloc0_n(tb->icount, sizeof(*q));
+    if (!q) {
+        return;
+    }
+
+    debuginfo_lock();
+
+    /* Query debuginfo for each guest instruction. */
+    for (insn = 0; insn < tb->icount; insn++) {
+        /* FIXME: This replicates the restore_state_to_opc() logic. */
+        q[insn].address = tcg_ctx->gen_insn_data[insn][0];
+        if (TARGET_TB_PCREL) {
+            q[insn].address |= (guest_pc & TARGET_PAGE_MASK);
+        } else {
+#if defined(TARGET_I386)
+            q[insn].address -= tb->cs_base;
+#endif
+        }
+        q[insn].flags = DEBUGINFO_SYMBOL | (jitdump ? DEBUGINFO_LINE : 0);
+    }
+    debuginfo_query(q, tb->icount);
+
+    /* Emit perfmap entries if needed. */
+    if (perfmap) {
+        flockfile(perfmap);
+        for (insn = 0; insn < tb->icount; insn++) {
+            write_perfmap_entry(start, insn, &q[insn]);
+        }
+        funlockfile(perfmap);
+    }
+
+    /* Emit jitdump entries if needed. */
+    if (jitdump) {
+        flockfile(jitdump);
+        write_jr_code_debug_info(start, q, tb->icount);
+        write_jr_code_load(start, tcg_ctx->gen_insn_end_off[tb->icount - 1],
+                           q);
+        funlockfile(jitdump);
+    }
+
+    debuginfo_unlock();
+    g_free(q);
+}
+
+void perf_exit(void)
+{
+    if (perfmap) {
+        fclose(perfmap);
+        perfmap = NULL;
+    }
+
+    if (jitdump) {
+        fclose(jitdump);
+        jitdump = NULL;
+    }
+}
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 51ac1f6c84..979f8e1107 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -62,6 +62,7 @@
 #include "tb-hash.h"
 #include "tb-context.h"
 #include "internal.h"
+#include "perf.h"
 
 /* Make sure all possible CPU event bits fit in tb->trace_vcpu_dstate */
 QEMU_BUILD_BUG_ON(CPU_TRACE_DSTATE_MAX_EVENTS >
@@ -406,6 +407,12 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     }
     tb->tc.size = gen_code_size;
 
+    /*
+     * For TARGET_TB_PCREL, attribute all executions of the generated
+     * code to its first mapping.
+     */
+    perf_report_code(pc, tb, tcg_splitwx_to_rx(gen_code_buf));
+
 #ifdef CONFIG_PROFILER
     qatomic_set(&prof->code_time, prof->code_time + profile_getclock() - ti);
     qatomic_set(&prof->code_in_len, prof->code_in_len + tb->size);
diff --git a/linux-user/exit.c b/linux-user/exit.c
index fa6ef0b9b4..607b6da9fc 100644
--- a/linux-user/exit.c
+++ b/linux-user/exit.c
@@ -17,6 +17,7 @@
  *  along with this program; if not, see <http://www.gnu.org/licenses/>.
  */
 #include "qemu/osdep.h"
+#include "accel/tcg/perf.h"
 #include "exec/gdbstub.h"
 #include "qemu.h"
 #include "user-internals.h"
@@ -38,4 +39,5 @@ void preexit_cleanup(CPUArchState *env, int code)
 #endif
         gdb_exit(code);
         qemu_plugin_user_exit();
+        perf_exit();
 }
diff --git a/linux-user/main.c b/linux-user/main.c
index a17fed045b..4290651c3c 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -53,6 +53,7 @@
 #include "signal-common.h"
 #include "loader.h"
 #include "user-mmap.h"
+#include "accel/tcg/perf.h"
 
 #ifdef CONFIG_SEMIHOSTING
 #include "semihosting/semihost.h"
@@ -423,6 +424,16 @@ static void handle_arg_abi_call0(const char *arg)
 }
 #endif
 
+static void handle_arg_perfmap(const char *arg)
+{
+    perf_enable_perfmap();
+}
+
+static void handle_arg_jitdump(const char *arg)
+{
+    perf_enable_jitdump();
+}
+
 static QemuPluginList plugins = QTAILQ_HEAD_INITIALIZER(plugins);
 
 #ifdef CONFIG_PLUGIN
@@ -493,6 +504,10 @@ static const struct qemu_argument arg_table[] = {
     {"xtensa-abi-call0", "QEMU_XTENSA_ABI_CALL0", false, handle_arg_abi_call0,
      "",           "assume CALL0 Xtensa ABI"},
 #endif
+    {"perfmap",    "QEMU_PERFMAP",     false, handle_arg_perfmap,
+     "",           "Generate a /tmp/perf-${pid}.map file for perf"},
+    {"jitdump",    "QEMU_JITDUMP",     false, handle_arg_jitdump,
+     "",           "Generate a jit-${pid}.dump file for perf"},
     {NULL, NULL, false, NULL, NULL, NULL}
 };
 
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 9bd0e52d01..9177d95d4e 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -96,6 +96,9 @@
 #include "fsdev/qemu-fsdev.h"
 #endif
 #include "sysemu/qtest.h"
+#ifdef CONFIG_TCG
+#include "accel/tcg/perf.h"
+#endif
 
 #include "disas/disas.h"
 
@@ -2926,6 +2929,14 @@ void qemu_init(int argc, char **argv)
             case QEMU_OPTION_DFILTER:
                 qemu_set_dfilter_ranges(optarg, &error_fatal);
                 break;
+#if defined(CONFIG_TCG) && defined(CONFIG_LINUX)
+            case QEMU_OPTION_perfmap:
+                perf_enable_perfmap();
+                break;
+            case QEMU_OPTION_jitdump:
+                perf_enable_jitdump();
+                break;
+#endif
             case QEMU_OPTION_seed:
                 qemu_guest_random_seed_main(optarg, &error_fatal);
                 break;
diff --git a/tcg/tcg.c b/tcg/tcg.c
index da91779890..9b7df71e7a 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -61,6 +61,7 @@
 #include "exec/log.h"
 #include "tcg/tcg-ldst.h"
 #include "tcg-internal.h"
+#include "accel/tcg/perf.h"
 
 /* Forward declarations for functions declared in tcg-target.c.inc and
    used here. */
@@ -913,6 +914,7 @@ void tcg_prologue_init(TCGContext *s)
 #endif
 
     prologue_size = tcg_current_code_size(s);
+    perf_report_prologue(s->code_gen_ptr, prologue_size);
 
 #ifndef CONFIG_TCG_INTERPRETER
     flush_idcache_range((uintptr_t)tcg_splitwx_to_rx(s->code_buf),
diff --git a/accel/tcg/meson.build b/accel/tcg/meson.build
index 55b3b4dd7e..77740b1a0d 100644
--- a/accel/tcg/meson.build
+++ b/accel/tcg/meson.build
@@ -13,6 +13,7 @@ tcg_ss.add(when: 'CONFIG_USER_ONLY', if_true: files('user-exec.c'))
 tcg_ss.add(when: 'CONFIG_SOFTMMU', if_false: files('user-exec-stub.c'))
 tcg_ss.add(when: 'CONFIG_PLUGIN', if_true: [files('plugin-gen.c')])
 tcg_ss.add(when: libdw, if_true: files('debuginfo.c'))
+tcg_ss.add(when: 'CONFIG_LINUX', if_true: files('perf.c'))
 specific_ss.add_all(when: 'CONFIG_TCG', if_true: tcg_ss)
 
 specific_ss.add(when: ['CONFIG_SOFTMMU', 'CONFIG_TCG'], if_true: files(
diff --git a/qemu-options.hx b/qemu-options.hx
index 3aa3a2f5a3..d59d19704b 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4838,6 +4838,26 @@ SRST
     Enable synchronization profiling.
 ERST
 
+#if defined(CONFIG_TCG) && defined(CONFIG_LINUX)
+DEF("perfmap", 0, QEMU_OPTION_perfmap,
+    "-perfmap        generate a /tmp/perf-${pid}.map file for perf\n",
+    QEMU_ARCH_ALL)
+SRST
+``-perfmap``
+    Generate a map file for Linux perf tools that will allow basic profiling
+    information to be broken down into basic blocks.
+ERST
+
+DEF("jitdump", 0, QEMU_OPTION_jitdump,
+    "-jitdump        generate a jit-${pid}.dump file for perf\n",
+    QEMU_ARCH_ALL)
+SRST
+``-jitdump``
+    Generate a dump file for Linux perf tools that maps basic blocks to symbol
+    names, line numbers and JITted code.
+ERST
+#endif
+
 DEFHEADING()
 
 DEFHEADING(Generic object creation:)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 4/5] util/bufferiszero: Use __attribute__((target)) for avx2/avx512
  2023-01-16 22:36 [PULL 0/5] tcg patch queue Richard Henderson
                   ` (2 preceding siblings ...)
  2023-01-16 22:36 ` [PULL 3/5] tcg: add perfmap and jitdump Richard Henderson
@ 2023-01-16 22:36 ` Richard Henderson
  2023-01-16 22:36 ` [PULL 5/5] accel/tcg: Split out cpu_exec_{setjmp,loop} Richard Henderson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Richard Henderson @ 2023-01-16 22:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Daniel P . Berrangé

Use the attribute, which is supported by clang, instead of
the #pragma, which is not supported and, for some reason,
also not detected by the meson probe, so we fail by -Werror.

Include only <immintrin.h> as that is the outermost "official"
header for these intrinsics -- emmintrin.h and smmintrin -- are
older SSE2 and SSE4 specific headers, while the immintrin.h
includes all of the Intel intrinsics.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 meson.build         |  8 ++------
 util/bufferiszero.c | 41 ++++++-----------------------------------
 2 files changed, 8 insertions(+), 41 deletions(-)

diff --git a/meson.build b/meson.build
index 6d212f6c8e..58d8cd68a6 100644
--- a/meson.build
+++ b/meson.build
@@ -2338,11 +2338,9 @@ config_host_data.set('CONFIG_CPUID_H', have_cpuid_h)
 config_host_data.set('CONFIG_AVX2_OPT', get_option('avx2') \
   .require(have_cpuid_h, error_message: 'cpuid.h not available, cannot enable AVX2') \
   .require(cc.links('''
-    #pragma GCC push_options
-    #pragma GCC target("avx2")
     #include <cpuid.h>
     #include <immintrin.h>
-    static int bar(void *a) {
+    static int __attribute__((target("avx2"))) bar(void *a) {
       __m256i x = *(__m256i *)a;
       return _mm256_testz_si256(x, x);
     }
@@ -2352,11 +2350,9 @@ config_host_data.set('CONFIG_AVX2_OPT', get_option('avx2') \
 config_host_data.set('CONFIG_AVX512F_OPT', get_option('avx512f') \
   .require(have_cpuid_h, error_message: 'cpuid.h not available, cannot enable AVX512F') \
   .require(cc.links('''
-    #pragma GCC push_options
-    #pragma GCC target("avx512f")
     #include <cpuid.h>
     #include <immintrin.h>
-    static int bar(void *a) {
+    static int __attribute__((target("avx512f"))) bar(void *a) {
       __m512i x = *(__m512i *)a;
       return _mm512_test_epi64_mask(x, x);
     }
diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index ec3cd4ca15..1790ded7d4 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -64,18 +64,11 @@ buffer_zero_int(const void *buf, size_t len)
 }
 
 #if defined(CONFIG_AVX512F_OPT) || defined(CONFIG_AVX2_OPT) || defined(__SSE2__)
-/* Do not use push_options pragmas unnecessarily, because clang
- * does not support them.
- */
-#if defined(CONFIG_AVX512F_OPT) || defined(CONFIG_AVX2_OPT)
-#pragma GCC push_options
-#pragma GCC target("sse2")
-#endif
-#include <emmintrin.h>
+#include <immintrin.h>
 
 /* Note that each of these vectorized functions require len >= 64.  */
 
-static bool
+static bool __attribute__((target("sse2")))
 buffer_zero_sse2(const void *buf, size_t len)
 {
     __m128i t = _mm_loadu_si128(buf);
@@ -104,20 +97,9 @@ buffer_zero_sse2(const void *buf, size_t len)
 
     return _mm_movemask_epi8(_mm_cmpeq_epi8(t, zero)) == 0xFFFF;
 }
-#if defined(CONFIG_AVX512F_OPT) || defined(CONFIG_AVX2_OPT)
-#pragma GCC pop_options
-#endif
 
 #ifdef CONFIG_AVX2_OPT
-/* Note that due to restrictions/bugs wrt __builtin functions in gcc <= 4.8,
- * the includes have to be within the corresponding push_options region, and
- * therefore the regions themselves have to be ordered with increasing ISA.
- */
-#pragma GCC push_options
-#pragma GCC target("sse4")
-#include <smmintrin.h>
-
-static bool
+static bool __attribute__((target("sse4")))
 buffer_zero_sse4(const void *buf, size_t len)
 {
     __m128i t = _mm_loadu_si128(buf);
@@ -145,12 +127,7 @@ buffer_zero_sse4(const void *buf, size_t len)
     return _mm_testz_si128(t, t);
 }
 
-#pragma GCC pop_options
-#pragma GCC push_options
-#pragma GCC target("avx2")
-#include <immintrin.h>
-
-static bool
+static bool __attribute__((target("avx2")))
 buffer_zero_avx2(const void *buf, size_t len)
 {
     /* Begin with an unaligned head of 32 bytes.  */
@@ -176,15 +153,10 @@ buffer_zero_avx2(const void *buf, size_t len)
 
     return _mm256_testz_si256(t, t);
 }
-#pragma GCC pop_options
 #endif /* CONFIG_AVX2_OPT */
 
 #ifdef CONFIG_AVX512F_OPT
-#pragma GCC push_options
-#pragma GCC target("avx512f")
-#include <immintrin.h>
-
-static bool
+static bool __attribute__((target("avx512f")))
 buffer_zero_avx512(const void *buf, size_t len)
 {
     /* Begin with an unaligned head of 64 bytes.  */
@@ -210,8 +182,7 @@ buffer_zero_avx512(const void *buf, size_t len)
     return !_mm512_test_epi64_mask(t, t);
 
 }
-#pragma GCC pop_options
-#endif
+#endif /* CONFIG_AVX512F_OPT */
 
 
 /* Note that for test_buffer_is_zero_next_accel, the most preferred
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 5/5] accel/tcg: Split out cpu_exec_{setjmp,loop}
  2023-01-16 22:36 [PULL 0/5] tcg patch queue Richard Henderson
                   ` (3 preceding siblings ...)
  2023-01-16 22:36 ` [PULL 4/5] util/bufferiszero: Use __attribute__((target)) for avx2/avx512 Richard Henderson
@ 2023-01-16 22:36 ` Richard Henderson
  2023-01-17 15:47 ` [PULL 0/5] tcg patch queue Peter Maydell
  2023-01-20  9:41 ` Thomas Huth
  6 siblings, 0 replies; 29+ messages in thread
From: Richard Henderson @ 2023-01-16 22:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

Recently the g_assert(cpu == current_cpu) test has been
intermittently failing with gcc.  Reorg the code around
the setjmp to minimize the lifetime of the cpu variable
affected by the setjmp.

This appears to fix the existing issue with clang as well.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1147
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c | 111 +++++++++++++++++++++----------------------
 1 file changed, 54 insertions(+), 57 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 356fe348de..8927092537 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -909,64 +909,10 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
 
 /* main execution loop */
 
-int cpu_exec(CPUState *cpu)
+static int __attribute__((noinline))
+cpu_exec_loop(CPUState *cpu, SyncClocks *sc)
 {
     int ret;
-    SyncClocks sc = { 0 };
-
-    /* replay_interrupt may need current_cpu */
-    current_cpu = cpu;
-
-    if (cpu_handle_halt(cpu)) {
-        return EXCP_HALTED;
-    }
-
-    rcu_read_lock();
-
-    cpu_exec_enter(cpu);
-
-    /* Calculate difference between guest clock and host clock.
-     * This delay includes the delay of the last cycle, so
-     * what we have to do is sleep until it is 0. As for the
-     * advance/delay we gain here, we try to fix it next time.
-     */
-    init_delay_params(&sc, cpu);
-
-    /* prepare setjmp context for exception handling */
-    if (sigsetjmp(cpu->jmp_env, 0) != 0) {
-#if defined(__clang__)
-        /*
-         * Some compilers wrongly smash all local variables after
-         * siglongjmp (the spec requires that only non-volatile locals
-         * which are changed between the sigsetjmp and siglongjmp are
-         * permitted to be trashed). There were bug reports for gcc
-         * 4.5.0 and clang.  The bug is fixed in all versions of gcc
-         * that we support, but is still unfixed in clang:
-         *   https://bugs.llvm.org/show_bug.cgi?id=21183
-         *
-         * Reload an essential local variable here for those compilers.
-         * Newer versions of gcc would complain about this code (-Wclobbered),
-         * so we only perform the workaround for clang.
-         */
-        cpu = current_cpu;
-#else
-        /* Non-buggy compilers preserve this; assert the correct value. */
-        g_assert(cpu == current_cpu);
-#endif
-
-#ifndef CONFIG_SOFTMMU
-        clear_helper_retaddr();
-        if (have_mmap_lock()) {
-            mmap_unlock();
-        }
-#endif
-        if (qemu_mutex_iothread_locked()) {
-            qemu_mutex_unlock_iothread();
-        }
-        qemu_plugin_disable_mem_helpers(cpu);
-
-        assert_no_pages_locked();
-    }
 
     /* if an exception is pending, we execute it here */
     while (!cpu_handle_exception(cpu, &ret)) {
@@ -1033,9 +979,60 @@ int cpu_exec(CPUState *cpu)
 
             /* Try to align the host and virtual clocks
                if the guest is in advance */
-            align_clocks(&sc, cpu);
+            align_clocks(sc, cpu);
         }
     }
+    return ret;
+}
+
+static int cpu_exec_setjmp(CPUState *cpu, SyncClocks *sc)
+{
+    /* Prepare setjmp context for exception handling. */
+    if (unlikely(sigsetjmp(cpu->jmp_env, 0) != 0)) {
+        /* Non-buggy compilers preserve this; assert the correct value. */
+        g_assert(cpu == current_cpu);
+
+#ifndef CONFIG_SOFTMMU
+        clear_helper_retaddr();
+        if (have_mmap_lock()) {
+            mmap_unlock();
+        }
+#endif
+        if (qemu_mutex_iothread_locked()) {
+            qemu_mutex_unlock_iothread();
+        }
+        qemu_plugin_disable_mem_helpers(cpu);
+
+        assert_no_pages_locked();
+    }
+
+    return cpu_exec_loop(cpu, sc);
+}
+
+int cpu_exec(CPUState *cpu)
+{
+    int ret;
+    SyncClocks sc = { 0 };
+
+    /* replay_interrupt may need current_cpu */
+    current_cpu = cpu;
+
+    if (cpu_handle_halt(cpu)) {
+        return EXCP_HALTED;
+    }
+
+    rcu_read_lock();
+    cpu_exec_enter(cpu);
+
+    /*
+     * Calculate difference between guest clock and host clock.
+     * This delay includes the delay of the last cycle, so
+     * what we have to do is sleep until it is 0. As for the
+     * advance/delay we gain here, we try to fix it next time.
+     */
+    init_delay_params(&sc, cpu);
+
+    ret = cpu_exec_setjmp(cpu, &sc);
 
     cpu_exec_exit(cpu);
     rcu_read_unlock();
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PULL 0/5] tcg patch queue
  2023-01-16 22:36 [PULL 0/5] tcg patch queue Richard Henderson
                   ` (4 preceding siblings ...)
  2023-01-16 22:36 ` [PULL 5/5] accel/tcg: Split out cpu_exec_{setjmp,loop} Richard Henderson
@ 2023-01-17 15:47 ` Peter Maydell
  2023-01-20  9:41 ` Thomas Huth
  6 siblings, 0 replies; 29+ messages in thread
From: Peter Maydell @ 2023-01-17 15:47 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Mon, 16 Jan 2023 at 22:36, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The following changes since commit fb7e7990342e59cf67dbd895c1a1e3fb1741df7a:
>
>   tests/qtest/qom-test: Do not print tested properties by default (2023-01-16 15:00:57 +0000)
>
> are available in the Git repository at:
>
>   https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230116
>
> for you to fetch changes up to 61710a7e23a63546da0071ea32adb96476fa5d07:
>
>   accel/tcg: Split out cpu_exec_{setjmp,loop} (2023-01-16 10:14:12 -1000)
>
> ----------------------------------------------------------------
> - Reorg cpu_tb_exec around setjmp.
> - Use __attribute__((target)) for buffer_is_zero.
> - Add perfmap and jitdump for perf support.
>
> ----------------------------------------------------------------


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.0
for any user-visible changes.

-- PMM


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 0/5] tcg patch queue
  2023-01-16 22:36 [PULL 0/5] tcg patch queue Richard Henderson
                   ` (5 preceding siblings ...)
  2023-01-17 15:47 ` [PULL 0/5] tcg patch queue Peter Maydell
@ 2023-01-20  9:41 ` Thomas Huth
  2023-01-20 10:50   ` Alex Bennée
                     ` (2 more replies)
  6 siblings, 3 replies; 29+ messages in thread
From: Thomas Huth @ 2023-01-20  9:41 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, Ilya Leoshkevich; +Cc: peter.maydell

On 16/01/2023 23.36, Richard Henderson wrote:
> The following changes since commit fb7e7990342e59cf67dbd895c1a1e3fb1741df7a:
> 
>    tests/qtest/qom-test: Do not print tested properties by default (2023-01-16 15:00:57 +0000)
> 
> are available in the Git repository at:
> 
>    https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230116
> 
> for you to fetch changes up to 61710a7e23a63546da0071ea32adb96476fa5d07:
> 
>    accel/tcg: Split out cpu_exec_{setjmp,loop} (2023-01-16 10:14:12 -1000)
> 
> ----------------------------------------------------------------
> - Reorg cpu_tb_exec around setjmp.
> - Use __attribute__((target)) for buffer_is_zero.
> - Add perfmap and jitdump for perf support.
> 
> ----------------------------------------------------------------
> Ilya Leoshkevich (3):
>        linux-user: Clean up when exiting due to a signal
>        accel/tcg: Add debuginfo support
>        tcg: add perfmap and jitdump
> 
> Richard Henderson (2):
>        util/bufferiszero: Use __attribute__((target)) for avx2/avx512
>        accel/tcg: Split out cpu_exec_{setjmp,loop}

  Hi Richard, hi Ilya,

with the recent QEMU master branch (commit 701ed34), I'm now seeing failures 
in Travis:

  https://app.travis-ci.com/github/huth/qemu/jobs/593786529#L14411

Everything was still fine a couple of days ago (commit fb7e799):

  https://app.travis-ci.com/github/huth/qemu/builds/259755664

... so it seems this is likely related to this pull request. Could you 
please have a look?

  Thanks,
   Thomas



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 0/5] tcg patch queue
  2023-01-20  9:41 ` Thomas Huth
@ 2023-01-20 10:50   ` Alex Bennée
  2023-01-20 10:53   ` Ilya Leoshkevich
  2023-01-21  6:07   ` Richard Henderson
  2 siblings, 0 replies; 29+ messages in thread
From: Alex Bennée @ 2023-01-20 10:50 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Richard Henderson, Ilya Leoshkevich, peter.maydell, qemu-devel


Thomas Huth <thuth@redhat.com> writes:

> On 16/01/2023 23.36, Richard Henderson wrote:
>> The following changes since commit fb7e7990342e59cf67dbd895c1a1e3fb1741df7a:
>>    tests/qtest/qom-test: Do not print tested properties by default
>> (2023-01-16 15:00:57 +0000)
>> are available in the Git repository at:
>>    https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230116
>> for you to fetch changes up to
>> 61710a7e23a63546da0071ea32adb96476fa5d07:
>>    accel/tcg: Split out cpu_exec_{setjmp,loop} (2023-01-16 10:14:12
>> -1000)
>> ----------------------------------------------------------------
>> - Reorg cpu_tb_exec around setjmp.
>> - Use __attribute__((target)) for buffer_is_zero.
>> - Add perfmap and jitdump for perf support.
>> ----------------------------------------------------------------
>> Ilya Leoshkevich (3):
>>        linux-user: Clean up when exiting due to a signal
>>        accel/tcg: Add debuginfo support
>>        tcg: add perfmap and jitdump
>> Richard Henderson (2):
>>        util/bufferiszero: Use __attribute__((target)) for avx2/avx512
>>        accel/tcg: Split out cpu_exec_{setjmp,loop}
>
>  Hi Richard, hi Ilya,
>
> with the recent QEMU master branch (commit 701ed34), I'm now seeing
> failures in Travis:
>
>  https://app.travis-ci.com/github/huth/qemu/jobs/593786529#L14411
>
> Everything was still fine a couple of days ago (commit fb7e799):
>
>  https://app.travis-ci.com/github/huth/qemu/builds/259755664
>
> ... so it seems this is likely related to this pull request. Could you
> please have a look?

Hmm maybe the code motion has revealed another form of the compiler bug.
I guess these bugs don't die, they just refract.

>
>  Thanks,
>   Thomas


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 0/5] tcg patch queue
  2023-01-20  9:41 ` Thomas Huth
  2023-01-20 10:50   ` Alex Bennée
@ 2023-01-20 10:53   ` Ilya Leoshkevich
  2023-01-20 12:51     ` Thomas Huth
  2023-01-21  6:07   ` Richard Henderson
  2 siblings, 1 reply; 29+ messages in thread
From: Ilya Leoshkevich @ 2023-01-20 10:53 UTC (permalink / raw)
  To: Thomas Huth, Richard Henderson, Alex Bennée, qemu-devel
  Cc: peter.maydell

On Fri, 2023-01-20 at 10:41 +0100, Thomas Huth wrote:
> On 16/01/2023 23.36, Richard Henderson wrote:
> > The following changes since commit
> > fb7e7990342e59cf67dbd895c1a1e3fb1741df7a:
> > 
> >    tests/qtest/qom-test: Do not print tested properties by default
> > (2023-01-16 15:00:57 +0000)
> > 
> > are available in the Git repository at:
> > 
> >    https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230116
> > 
> > for you to fetch changes up to
> > 61710a7e23a63546da0071ea32adb96476fa5d07:
> > 
> >    accel/tcg: Split out cpu_exec_{setjmp,loop} (2023-01-16 10:14:12
> > -1000)
> > 
> > ----------------------------------------------------------------
> > - Reorg cpu_tb_exec around setjmp.
> > - Use __attribute__((target)) for buffer_is_zero.
> > - Add perfmap and jitdump for perf support.
> > 
> > ----------------------------------------------------------------
> > Ilya Leoshkevich (3):
> >        linux-user: Clean up when exiting due to a signal
> >        accel/tcg: Add debuginfo support
> >        tcg: add perfmap and jitdump
> > 
> > Richard Henderson (2):
> >        util/bufferiszero: Use __attribute__((target)) for
> > avx2/avx512
> >        accel/tcg: Split out cpu_exec_{setjmp,loop}
> 
>   Hi Richard, hi Ilya,
> 
> with the recent QEMU master branch (commit 701ed34), I'm now seeing
> failures 
> in Travis:
> 
>   https://app.travis-ci.com/github/huth/qemu/jobs/593786529#L14411
> 
> Everything was still fine a couple of days ago (commit fb7e799):
> 
>   https://app.travis-ci.com/github/huth/qemu/builds/259755664
> 
> ... so it seems this is likely related to this pull request. Could
> you 
> please have a look?
> 
>   Thanks,
>    Thomas
> 

I would expect this to be (temporarily) fixed by [1], but we probably
don't set GITLAB_CI in Travis. Would it make sense to set it? It looks
as if this variable is currently used only to skip certain tests.

If not, then maybe split it into QEMU_CI, GITLAB_CI and TRAVIS_CI?

https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg04438.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 0/5] tcg patch queue
  2023-01-20 10:53   ` Ilya Leoshkevich
@ 2023-01-20 12:51     ` Thomas Huth
  2023-01-20 16:49       ` Alex Bennée
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Huth @ 2023-01-20 12:51 UTC (permalink / raw)
  To: Ilya Leoshkevich, Richard Henderson, Alex Bennée, qemu-devel
  Cc: peter.maydell

On 20/01/2023 11.53, Ilya Leoshkevich wrote:
> On Fri, 2023-01-20 at 10:41 +0100, Thomas Huth wrote:
>> On 16/01/2023 23.36, Richard Henderson wrote:
>>> The following changes since commit
>>> fb7e7990342e59cf67dbd895c1a1e3fb1741df7a:
>>>
>>>     tests/qtest/qom-test: Do not print tested properties by default
>>> (2023-01-16 15:00:57 +0000)
>>>
>>> are available in the Git repository at:
>>>
>>>     https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230116
>>>
>>> for you to fetch changes up to
>>> 61710a7e23a63546da0071ea32adb96476fa5d07:
>>>
>>>     accel/tcg: Split out cpu_exec_{setjmp,loop} (2023-01-16 10:14:12
>>> -1000)
>>>
>>> ----------------------------------------------------------------
>>> - Reorg cpu_tb_exec around setjmp.
>>> - Use __attribute__((target)) for buffer_is_zero.
>>> - Add perfmap and jitdump for perf support.
>>>
>>> ----------------------------------------------------------------
>>> Ilya Leoshkevich (3):
>>>         linux-user: Clean up when exiting due to a signal
>>>         accel/tcg: Add debuginfo support
>>>         tcg: add perfmap and jitdump
>>>
>>> Richard Henderson (2):
>>>         util/bufferiszero: Use __attribute__((target)) for
>>> avx2/avx512
>>>         accel/tcg: Split out cpu_exec_{setjmp,loop}
>>
>>    Hi Richard, hi Ilya,
>>
>> with the recent QEMU master branch (commit 701ed34), I'm now seeing
>> failures
>> in Travis:
>>
>>    https://app.travis-ci.com/github/huth/qemu/jobs/593786529#L14411
>>
>> Everything was still fine a couple of days ago (commit fb7e799):
>>
>>    https://app.travis-ci.com/github/huth/qemu/builds/259755664
>>
>> ... so it seems this is likely related to this pull request. Could
>> you
>> please have a look?
>>
>>    Thanks,
>>     Thomas
>>
> 
> I would expect this to be (temporarily) fixed by [1], but we probably
> don't set GITLAB_CI in Travis. Would it make sense to set it? It looks
> as if this variable is currently used only to skip certain tests.
> 
> If not, then maybe split it into QEMU_CI, GITLAB_CI and TRAVIS_CI?
> 
> https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg04438.html

Ah, ok, so this test has issues in gitlab, too!

For Travis, I think we should either check the CI or TRAVIS environment 
variables:

 
https://docs.travis-ci.com/user/environment-variables/#default-environment-variables

  Thomas




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 0/5] tcg patch queue
  2023-01-20 12:51     ` Thomas Huth
@ 2023-01-20 16:49       ` Alex Bennée
  0 siblings, 0 replies; 29+ messages in thread
From: Alex Bennée @ 2023-01-20 16:49 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Ilya Leoshkevich, Richard Henderson, qemu-devel, peter.maydell


Thomas Huth <thuth@redhat.com> writes:

> On 20/01/2023 11.53, Ilya Leoshkevich wrote:
>> On Fri, 2023-01-20 at 10:41 +0100, Thomas Huth wrote:
>>> On 16/01/2023 23.36, Richard Henderson wrote:
>>>> The following changes since commit
>>>> fb7e7990342e59cf67dbd895c1a1e3fb1741df7a:
>>>>
>>>>     tests/qtest/qom-test: Do not print tested properties by default
>>>> (2023-01-16 15:00:57 +0000)
>>>>
>>>> are available in the Git repository at:
>>>>
>>>>     https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230116
>>>>
>>>> for you to fetch changes up to
>>>> 61710a7e23a63546da0071ea32adb96476fa5d07:
>>>>
>>>>     accel/tcg: Split out cpu_exec_{setjmp,loop} (2023-01-16 10:14:12
>>>> -1000)
>>>>
>>>> ----------------------------------------------------------------
>>>> - Reorg cpu_tb_exec around setjmp.
>>>> - Use __attribute__((target)) for buffer_is_zero.
>>>> - Add perfmap and jitdump for perf support.
>>>>
>>>> ----------------------------------------------------------------
>>>> Ilya Leoshkevich (3):
>>>>         linux-user: Clean up when exiting due to a signal
>>>>         accel/tcg: Add debuginfo support
>>>>         tcg: add perfmap and jitdump
>>>>
>>>> Richard Henderson (2):
>>>>         util/bufferiszero: Use __attribute__((target)) for
>>>> avx2/avx512
>>>>         accel/tcg: Split out cpu_exec_{setjmp,loop}
>>>
>>>    Hi Richard, hi Ilya,
>>>
>>> with the recent QEMU master branch (commit 701ed34), I'm now seeing
>>> failures
>>> in Travis:
>>>
>>>    https://app.travis-ci.com/github/huth/qemu/jobs/593786529#L14411
>>>
>>> Everything was still fine a couple of days ago (commit fb7e799):
>>>
>>>    https://app.travis-ci.com/github/huth/qemu/builds/259755664
>>>
>>> ... so it seems this is likely related to this pull request. Could
>>> you
>>> please have a look?
>>>
>>>    Thanks,
>>>     Thomas
>>>
>> I would expect this to be (temporarily) fixed by [1], but we
>> probably
>> don't set GITLAB_CI in Travis. Would it make sense to set it? It looks
>> as if this variable is currently used only to skip certain tests.
>> If not, then maybe split it into QEMU_CI, GITLAB_CI and TRAVIS_CI?
>> https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg04438.html
>
> Ah, ok, so this test has issues in gitlab, too!

*sigh* yeah the test is flaky but this is a subtly different failure
 mode. All the gitlab failures I saw where the test triggering the abort
 rather than the assert catch we have here.


>
> For Travis, I think we should either check the CI or TRAVIS
> environment variables:
>
>
> https://docs.travis-ci.com/user/environment-variables/#default-environment-variables
>
>  Thomas


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 0/5] tcg patch queue
  2023-01-20  9:41 ` Thomas Huth
  2023-01-20 10:50   ` Alex Bennée
  2023-01-20 10:53   ` Ilya Leoshkevich
@ 2023-01-21  6:07   ` Richard Henderson
  2 siblings, 0 replies; 29+ messages in thread
From: Richard Henderson @ 2023-01-21  6:07 UTC (permalink / raw)
  To: Thomas Huth, qemu-devel, Ilya Leoshkevich; +Cc: peter.maydell

On 1/19/23 23:41, Thomas Huth wrote:
> On 16/01/2023 23.36, Richard Henderson wrote:
>> The following changes since commit fb7e7990342e59cf67dbd895c1a1e3fb1741df7a:
>>
>>    tests/qtest/qom-test: Do not print tested properties by default (2023-01-16 15:00:57 
>> +0000)
>>
>> are available in the Git repository at:
>>
>>    https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230116
>>
>> for you to fetch changes up to 61710a7e23a63546da0071ea32adb96476fa5d07:
>>
>>    accel/tcg: Split out cpu_exec_{setjmp,loop} (2023-01-16 10:14:12 -1000)
>>
>> ----------------------------------------------------------------
>> - Reorg cpu_tb_exec around setjmp.
>> - Use __attribute__((target)) for buffer_is_zero.
>> - Add perfmap and jitdump for perf support.
>>
>> ----------------------------------------------------------------
>> Ilya Leoshkevich (3):
>>        linux-user: Clean up when exiting due to a signal
>>        accel/tcg: Add debuginfo support
>>        tcg: add perfmap and jitdump
>>
>> Richard Henderson (2):
>>        util/bufferiszero: Use __attribute__((target)) for avx2/avx512
>>        accel/tcg: Split out cpu_exec_{setjmp,loop}
> 
>   Hi Richard, hi Ilya,
> 
> with the recent QEMU master branch (commit 701ed34), I'm now seeing failures in Travis:
> 
>   https://app.travis-ci.com/github/huth/qemu/jobs/593786529#L14411
> 
> Everything was still fine a couple of days ago (commit fb7e799):
> 
>   https://app.travis-ci.com/github/huth/qemu/builds/259755664
> 
> ... so it seems this is likely related to this pull request. Could you please have a look?

Thankfully our s390x.ci.qemu.org has the same version gcc installed, and I was able to 
reproduce this.  But only once -- it's irregular and very low frequency.

The code generated by gcc is correct and easy to inspect, since cpu_exec_setjmp is now 
quite small:

00000000000f3250 <cpu_exec_setjmp.isra.0>:
    f3250:       eb 6f f0 30 00 24       stmg    %r6,%r15,48(%r15)
    f3256:       a7 39 00 00             lghi    %r3,0
    f325a:       e3 f0 ff 58 ff 71       lay     %r15,-168(%r15)

                                         // Save cpu to stack+160.
    f3260:       e3 20 f0 a0 00 24       stg     %r2,160(%r15)
    f3266:       41 20 20 f0             la      %r2,240(%r2)
    f326a:       c0 e5 ff fb 10 eb       brasl   %r14,55440 <__sigsetjmp@plt>
    f3270:       ec 26 00 0d 00 7e       cijne   %r2,0,f328a <cpu_exec_setjmp.isra.0+0x3a>

                                         // Reload cpu for cpu_exec_loop().
    f3276:       e3 20 f0 a0 00 04       lg      %r2,160(%r15)
    f327c:       c0 e5 ff ff fb ee       brasl   %r14,f2a58 <cpu_exec_loop.isra.0>
    f3282:       eb 6f f0 d8 00 04       lmg     %r6,%r15,216(%r15)
    f3288:       07 fe                   br      %r14

                                         // Load tls pointer and current_cpu address.
    f328a:       b2 4f 00 10             ear     %r1,%a0
    f328e:       c0 20 00 0a 35 9d       larl    %r2,239dc8 <current_cpu@@Base+0x239dc8>
    f3294:       eb 11 00 20 00 0d       sllg    %r1,%r1,32
    f329a:       e3 20 20 00 00 04       lg      %r2,0(%r2)
    f32a0:       b2 4f 00 11             ear     %r1,%a1

                                         // Reload cpu for comparison
    f32a4:       e3 30 f0 a0 00 04       lg      %r3,160(%r15)
                                         // cpu == current_cpu
    f32aa:       e3 32 10 00 00 20       cg      %r3,0(%r2,%r1)
    f32b0:       a7 84 00 12             je      f32d4 <cpu_exec_setjmp.isra.0+0x84>
    ...

The only way I can imagine that this comparison fails is if we have corrupted the stack in 
some way.  I have not been able to induce failure under any sort of debugging, and I can't 
imagine where irregular corruption would have come from.


r~

r~


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 3/5] tcg: add perfmap and jitdump
  2023-01-16 22:36 ` [PULL 3/5] tcg: add perfmap and jitdump Richard Henderson
@ 2023-06-02 17:21   ` Peter Maydell
  2023-06-03 20:35     ` Ilya Leoshkevich
  2023-06-29 11:31   ` Philippe Mathieu-Daudé
  1 sibling, 1 reply; 29+ messages in thread
From: Peter Maydell @ 2023-06-02 17:21 UTC (permalink / raw)
  To: Richard Henderson
  Cc: qemu-devel, Ilya Leoshkevich, Vanderson M . do Rosario, Alex Bennée

On Mon, 16 Jan 2023 at 22:36, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> From: Ilya Leoshkevich <iii@linux.ibm.com>
>
> Add ability to dump /tmp/perf-<pid>.map and jit-<pid>.dump.
> The first one allows the perf tool to map samples to each individual
> translation block. The second one adds the ability to resolve symbol
> names, line numbers and inspect JITed code.
>
> Example of use:
>
>     perf record qemu-x86_64 -perfmap ./a.out
>     perf report
>
> or
>
>     perf record -k 1 qemu-x86_64 -jitdump ./a.out
>     DEBUGINFOD_URLS= perf inject -j -i perf.data -o perf.data.jitted
>     perf report -i perf.data.jitted
>
> Co-developed-by: Vanderson M. do Rosario <vandersonmr2@gmail.com>
> Co-developed-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> Message-Id: <20230112152013.125680-4-iii@linux.ibm.com>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Hi; Coverity thinks (CID 1507521) that there's a memory leak
in this code:

> +void perf_enable_jitdump(void)
> +{
> +    struct jitheader header;
> +    char jitdump_file[32];
> +    void *perf_marker;
> +
> +    if (!use_rt_clock) {
> +        warn_report("CLOCK_MONOTONIC is not available, proceeding without jitdump");
> +        return;
> +    }
> +
> +    snprintf(jitdump_file, sizeof(jitdump_file), "jit-%d.dump", getpid());
> +    jitdump = safe_fopen_w(jitdump_file);
> +    if (jitdump == NULL) {
> +        warn_report("Could not open %s: %s, proceeding without jitdump",
> +                    jitdump_file, strerror(errno));
> +        return;
> +    }
> +
> +    /*
> +     * `perf inject` will see that the mapped file name in the corresponding
> +     * PERF_RECORD_MMAP or PERF_RECORD_MMAP2 event is of the form jit-%d.dump
> +     * and will process it as a jitdump file.
> +     */
> +    perf_marker = mmap(NULL, qemu_real_host_page_size(), PROT_READ | PROT_EXEC,
> +                       MAP_PRIVATE, fileno(jitdump), 0);

Here we mmap() something...

> +    if (perf_marker == MAP_FAILED) {
> +        warn_report("Could not map %s: %s, proceeding without jitdump",
> +                    jitdump_file, strerror(errno));
> +        fclose(jitdump);
> +        jitdump = NULL;
> +        return;
> +    }
> +
> +    header.magic = JITHEADER_MAGIC;
> +    header.version = JITHEADER_VERSION;
> +    header.total_size = sizeof(header);
> +    header.elf_mach = get_e_machine();
> +    header.pad1 = 0;
> +    header.pid = getpid();
> +    header.timestamp = get_clock();
> +    header.flags = 0;
> +    fwrite(&header, sizeof(header), 1, jitdump);

...but we never do anything with that pointer, so the memory
we just mmap()ed is never going to be freed.

Is this doing something particularly magical, or should that
pointer be kept track of somewhere ?

thanks
-- PMM


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 3/5] tcg: add perfmap and jitdump
  2023-06-02 17:21   ` Peter Maydell
@ 2023-06-03 20:35     ` Ilya Leoshkevich
  0 siblings, 0 replies; 29+ messages in thread
From: Ilya Leoshkevich @ 2023-06-03 20:35 UTC (permalink / raw)
  To: Peter Maydell, Richard Henderson
  Cc: qemu-devel, Vanderson M . do Rosario, Alex Bennée

On Fri, 2023-06-02 at 18:21 +0100, Peter Maydell wrote:
> On Mon, 16 Jan 2023 at 22:36, Richard Henderson
> <richard.henderson@linaro.org> wrote:
> > 
> > From: Ilya Leoshkevich <iii@linux.ibm.com>
> > 
> > Add ability to dump /tmp/perf-<pid>.map and jit-<pid>.dump.
> > The first one allows the perf tool to map samples to each
> > individual
> > translation block. The second one adds the ability to resolve
> > symbol
> > names, line numbers and inspect JITed code.
> > 
> > Example of use:
> > 
> >     perf record qemu-x86_64 -perfmap ./a.out
> >     perf report
> > 
> > or
> > 
> >     perf record -k 1 qemu-x86_64 -jitdump ./a.out
> >     DEBUGINFOD_URLS= perf inject -j -i perf.data -o
> > perf.data.jitted
> >     perf report -i perf.data.jitted
> > 
> > Co-developed-by: Vanderson M. do Rosario <vandersonmr2@gmail.com>
> > Co-developed-by: Alex Bennée <alex.bennee@linaro.org>
> > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > Message-Id: <20230112152013.125680-4-iii@linux.ibm.com>
> > Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> 
> Hi; Coverity thinks (CID 1507521) that there's a memory leak
> in this code:
> 
> > +void perf_enable_jitdump(void)
> > +{
> > +    struct jitheader header;
> > +    char jitdump_file[32];
> > +    void *perf_marker;
> > +
> > +    if (!use_rt_clock) {
> > +        warn_report("CLOCK_MONOTONIC is not available, proceeding
> > without jitdump");
> > +        return;
> > +    }
> > +
> > +    snprintf(jitdump_file, sizeof(jitdump_file), "jit-%d.dump",
> > getpid());
> > +    jitdump = safe_fopen_w(jitdump_file);
> > +    if (jitdump == NULL) {
> > +        warn_report("Could not open %s: %s, proceeding without
> > jitdump",
> > +                    jitdump_file, strerror(errno));
> > +        return;
> > +    }
> > +
> > +    /*
> > +     * `perf inject` will see that the mapped file name in the
> > corresponding
> > +     * PERF_RECORD_MMAP or PERF_RECORD_MMAP2 event is of the form
> > jit-%d.dump
> > +     * and will process it as a jitdump file.
> > +     */
> > +    perf_marker = mmap(NULL, qemu_real_host_page_size(), PROT_READ
> > | PROT_EXEC,
> > +                       MAP_PRIVATE, fileno(jitdump), 0);
> 
> Here we mmap() something...
> 
> > +    if (perf_marker == MAP_FAILED) {
> > +        warn_report("Could not map %s: %s, proceeding without
> > jitdump",
> > +                    jitdump_file, strerror(errno));
> > +        fclose(jitdump);
> > +        jitdump = NULL;
> > +        return;
> > +    }
> > +
> > +    header.magic = JITHEADER_MAGIC;
> > +    header.version = JITHEADER_VERSION;
> > +    header.total_size = sizeof(header);
> > +    header.elf_mach = get_e_machine();
> > +    header.pad1 = 0;
> > +    header.pid = getpid();
> > +    header.timestamp = get_clock();
> > +    header.flags = 0;
> > +    fwrite(&header, sizeof(header), 1, jitdump);
> 
> ...but we never do anything with that pointer, so the memory
> we just mmap()ed is never going to be freed.
> 
> Is this doing something particularly magical, or should that
> pointer be kept track of somewhere ?
> 
> thanks
> -- PMM

It's magic that points perf to the location of the jitdump file,
but it won't hurt munmap()ping it in perf_exit(). I'll send a patch.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 3/5] tcg: add perfmap and jitdump
  2023-01-16 22:36 ` [PULL 3/5] tcg: add perfmap and jitdump Richard Henderson
  2023-06-02 17:21   ` Peter Maydell
@ 2023-06-29 11:31   ` Philippe Mathieu-Daudé
  2023-06-29 12:59     ` Richard Henderson
  1 sibling, 1 reply; 29+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-06-29 11:31 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: peter.maydell, Ilya Leoshkevich, Vanderson M . do Rosario,
	Alex Bennée

Hi Richard, Alex,

On 16/1/23 23:36, Richard Henderson wrote:
> From: Ilya Leoshkevich <iii@linux.ibm.com>
> 
> Add ability to dump /tmp/perf-<pid>.map and jit-<pid>.dump.
> The first one allows the perf tool to map samples to each individual
> translation block. The second one adds the ability to resolve symbol
> names, line numbers and inspect JITed code.
> 
> Example of use:
> 
>      perf record qemu-x86_64 -perfmap ./a.out
>      perf report
> 
> or
> 
>      perf record -k 1 qemu-x86_64 -jitdump ./a.out
>      DEBUGINFOD_URLS= perf inject -j -i perf.data -o perf.data.jitted
>      perf report -i perf.data.jitted
> 
> Co-developed-by: Vanderson M. do Rosario <vandersonmr2@gmail.com>
> Co-developed-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> Message-Id: <20230112152013.125680-4-iii@linux.ibm.com>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   docs/devel/tcg.rst        |  23 +++
>   accel/tcg/perf.h          |  49 +++++
>   accel/tcg/perf.c          | 375 ++++++++++++++++++++++++++++++++++++++
>   accel/tcg/translate-all.c |   7 +
>   linux-user/exit.c         |   2 +
>   linux-user/main.c         |  15 ++
>   softmmu/vl.c              |  11 ++
>   tcg/tcg.c                 |   2 +
>   accel/tcg/meson.build     |   1 +
>   qemu-options.hx           |  20 ++
>   10 files changed, 505 insertions(+)
>   create mode 100644 accel/tcg/perf.h
>   create mode 100644 accel/tcg/perf.c


> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index da91779890..9b7df71e7a 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -61,6 +61,7 @@
>   #include "exec/log.h"
>   #include "tcg/tcg-ldst.h"
>   #include "tcg-internal.h"
> +#include "accel/tcg/perf.h"

Is it OK to include an header from QEMU's accel/tcg/ here?
I thought we wanted to keep tcg/ kinda independant (or maybe
this is already too late and this isn't a concern anymore).

>   /* Forward declarations for functions declared in tcg-target.c.inc and
>      used here. */
> @@ -913,6 +914,7 @@ void tcg_prologue_init(TCGContext *s)
>   #endif
>   
>       prologue_size = tcg_current_code_size(s);
> +    perf_report_prologue(s->code_gen_ptr, prologue_size);



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 3/5] tcg: add perfmap and jitdump
  2023-06-29 11:31   ` Philippe Mathieu-Daudé
@ 2023-06-29 12:59     ` Richard Henderson
  2023-06-30 12:39       ` Ilya Leoshkevich
  0 siblings, 1 reply; 29+ messages in thread
From: Richard Henderson @ 2023-06-29 12:59 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, qemu-devel
  Cc: peter.maydell, Ilya Leoshkevich, Vanderson M . do Rosario,
	Alex Bennée

On 6/29/23 13:31, Philippe Mathieu-Daudé wrote:
>> diff --git a/tcg/tcg.c b/tcg/tcg.c
>> index da91779890..9b7df71e7a 100644
>> --- a/tcg/tcg.c
>> +++ b/tcg/tcg.c
>> @@ -61,6 +61,7 @@
>>   #include "exec/log.h"
>>   #include "tcg/tcg-ldst.h"
>>   #include "tcg-internal.h"
>> +#include "accel/tcg/perf.h"
> 
> Is it OK to include an header from QEMU's accel/tcg/ here?
> I thought we wanted to keep tcg/ kinda independant (or maybe
> this is already too late and this isn't a concern anymore).

It's not ideal, no.  Perf really should live in tcg/.


r~


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 3/5] tcg: add perfmap and jitdump
  2023-06-29 12:59     ` Richard Henderson
@ 2023-06-30 12:39       ` Ilya Leoshkevich
  2023-06-30 13:24         ` Richard Henderson
  0 siblings, 1 reply; 29+ messages in thread
From: Ilya Leoshkevich @ 2023-06-30 12:39 UTC (permalink / raw)
  To: Richard Henderson, Philippe Mathieu-Daudé, qemu-devel
  Cc: peter.maydell, Vanderson M . do Rosario, Alex Bennée

On Thu, 2023-06-29 at 14:59 +0200, Richard Henderson wrote:
> On 6/29/23 13:31, Philippe Mathieu-Daudé wrote:
> > > diff --git a/tcg/tcg.c b/tcg/tcg.c
> > > index da91779890..9b7df71e7a 100644
> > > --- a/tcg/tcg.c
> > > +++ b/tcg/tcg.c
> > > @@ -61,6 +61,7 @@
> > >   #include "exec/log.h"
> > >   #include "tcg/tcg-ldst.h"
> > >   #include "tcg-internal.h"
> > > +#include "accel/tcg/perf.h"
> > 
> > Is it OK to include an header from QEMU's accel/tcg/ here?
> > I thought we wanted to keep tcg/ kinda independant (or maybe
> > this is already too late and this isn't a concern anymore).
> 
> It's not ideal, no.  Perf really should live in tcg/.
> 
> 
> r~

This would require to somehow get rid of this:

#if defined(TARGET_I386)
            q[insn].address -= tb->cs_base;
#endif

I'll try to come up with a patch.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 3/5] tcg: add perfmap and jitdump
  2023-06-30 12:39       ` Ilya Leoshkevich
@ 2023-06-30 13:24         ` Richard Henderson
  2023-12-07  9:51           ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 29+ messages in thread
From: Richard Henderson @ 2023-06-30 13:24 UTC (permalink / raw)
  To: Ilya Leoshkevich, Philippe Mathieu-Daudé, qemu-devel
  Cc: peter.maydell, Vanderson M . do Rosario, Alex Bennée

On 6/30/23 14:39, Ilya Leoshkevich wrote:
> On Thu, 2023-06-29 at 14:59 +0200, Richard Henderson wrote:
>> On 6/29/23 13:31, Philippe Mathieu-Daudé wrote:
>>>> diff --git a/tcg/tcg.c b/tcg/tcg.c
>>>> index da91779890..9b7df71e7a 100644
>>>> --- a/tcg/tcg.c
>>>> +++ b/tcg/tcg.c
>>>> @@ -61,6 +61,7 @@
>>>>    #include "exec/log.h"
>>>>    #include "tcg/tcg-ldst.h"
>>>>    #include "tcg-internal.h"
>>>> +#include "accel/tcg/perf.h"
>>>
>>> Is it OK to include an header from QEMU's accel/tcg/ here?
>>> I thought we wanted to keep tcg/ kinda independant (or maybe
>>> this is already too late and this isn't a concern anymore).
>>
>> It's not ideal, no.  Perf really should live in tcg/.
>>
>>
>> r~
> 
> This would require to somehow get rid of this:
> 
> #if defined(TARGET_I386)
>              q[insn].address -= tb->cs_base;
> #endif
> 
> I'll try to come up with a patch.

Just drop it?  Did you really want EIP instead of the full virtual address?
It only makes a difference for 16-bit mode anyway.


r~


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 3/5] tcg: add perfmap and jitdump
  2023-06-30 13:24         ` Richard Henderson
@ 2023-12-07  9:51           ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 29+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-12-07  9:51 UTC (permalink / raw)
  To: Richard Henderson, Ilya Leoshkevich, qemu-devel
  Cc: peter.maydell, Vanderson M . do Rosario, Alex Bennée

Hi Ilya,

On 30/6/23 15:24, Richard Henderson wrote:
> On 6/30/23 14:39, Ilya Leoshkevich wrote:
>> On Thu, 2023-06-29 at 14:59 +0200, Richard Henderson wrote:
>>> On 6/29/23 13:31, Philippe Mathieu-Daudé wrote:
>>>>> diff --git a/tcg/tcg.c b/tcg/tcg.c
>>>>> index da91779890..9b7df71e7a 100644
>>>>> --- a/tcg/tcg.c
>>>>> +++ b/tcg/tcg.c
>>>>> @@ -61,6 +61,7 @@
>>>>>    #include "exec/log.h"
>>>>>    #include "tcg/tcg-ldst.h"
>>>>>    #include "tcg-internal.h"
>>>>> +#include "accel/tcg/perf.h"
>>>>
>>>> Is it OK to include an header from QEMU's accel/tcg/ here?
>>>> I thought we wanted to keep tcg/ kinda independant (or maybe
>>>> this is already too late and this isn't a concern anymore).
>>>
>>> It's not ideal, no.  Perf really should live in tcg/.
>>>
>>>
>>> r~
>>
>> This would require to somehow get rid of this:
>>
>> #if defined(TARGET_I386)
>>              q[insn].address -= tb->cs_base;
>> #endif
>>
>> I'll try to come up with a patch.
> 
> Just drop it?  Did you really want EIP instead of the full virtual address?
> It only makes a difference for 16-bit mode anyway.

Any update here? Have you tried 'perf' on non-x86 hosts?
There is a "This replicates the restore_state_to_opc() logic"
comment but only x86 restore_state_to_opc() logic is used,
so I'm rather confused here.

Thanks,

Phil.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PULL 0/5] tcg patch queue
@ 2023-01-16 22:36 Richard Henderson
  0 siblings, 0 replies; 29+ messages in thread
From: Richard Henderson @ 2023-01-16 22:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

The following changes since commit fb7e7990342e59cf67dbd895c1a1e3fb1741df7a:

  tests/qtest/qom-test: Do not print tested properties by default (2023-01-16 15:00:57 +0000)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230116

for you to fetch changes up to 61710a7e23a63546da0071ea32adb96476fa5d07:

  accel/tcg: Split out cpu_exec_{setjmp,loop} (2023-01-16 10:14:12 -1000)

----------------------------------------------------------------
- Reorg cpu_tb_exec around setjmp.
- Use __attribute__((target)) for buffer_is_zero.
- Add perfmap and jitdump for perf support.

----------------------------------------------------------------
Ilya Leoshkevich (3):
      linux-user: Clean up when exiting due to a signal
      accel/tcg: Add debuginfo support
      tcg: add perfmap and jitdump

Richard Henderson (2):
      util/bufferiszero: Use __attribute__((target)) for avx2/avx512
      accel/tcg: Split out cpu_exec_{setjmp,loop}

 docs/devel/tcg.rst        |  23 +++
 meson.build               |  16 +-
 accel/tcg/debuginfo.h     |  77 ++++++++++
 accel/tcg/perf.h          |  49 ++++++
 accel/tcg/cpu-exec.c      | 111 +++++++-------
 accel/tcg/debuginfo.c     |  96 ++++++++++++
 accel/tcg/perf.c          | 375 ++++++++++++++++++++++++++++++++++++++++++++++
 accel/tcg/translate-all.c |   7 +
 hw/core/loader.c          |   5 +
 linux-user/elfload.c      |   3 +
 linux-user/exit.c         |   2 +
 linux-user/main.c         |  15 ++
 linux-user/signal.c       |   8 +-
 softmmu/vl.c              |  11 ++
 tcg/tcg.c                 |   2 +
 util/bufferiszero.c       |  41 +----
 accel/tcg/meson.build     |   2 +
 linux-user/meson.build    |   1 +
 qemu-options.hx           |  20 +++
 19 files changed, 763 insertions(+), 101 deletions(-)
 create mode 100644 accel/tcg/debuginfo.h
 create mode 100644 accel/tcg/perf.h
 create mode 100644 accel/tcg/debuginfo.c
 create mode 100644 accel/tcg/perf.c


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 0/5] tcg patch queue
  2021-05-01 18:51 Richard Henderson
  2021-05-01 19:27 ` no-reply
@ 2021-05-02 13:32 ` Peter Maydell
  1 sibling, 0 replies; 29+ messages in thread
From: Peter Maydell @ 2021-05-02 13:32 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Sat, 1 May 2021 at 19:51, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The following changes since commit 8f860d2633baf9c2b6261f703f86e394c6bc22ca:
>
>   Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2021-04-30' into staging (2021-04-30 16:02:00 +0100)
>
> are available in the Git repository at:
>
>   https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20210501
>
> for you to fetch changes up to af93ccacc772019298be4c3e47251cdaa60d0c21:
>
>   decodetree: Extend argument set syntax to allow types (2021-05-01 11:45:35 -0700)
>
> ----------------------------------------------------------------
> Include cleanups.
> Decodetree enhancements for power10.


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/6.1
for any user-visible changes.

-- PMM


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 0/5] tcg patch queue
  2021-05-01 18:51 Richard Henderson
@ 2021-05-01 19:27 ` no-reply
  2021-05-02 13:32 ` Peter Maydell
  1 sibling, 0 replies; 29+ messages in thread
From: no-reply @ 2021-05-01 19:27 UTC (permalink / raw)
  To: richard.henderson; +Cc: peter.maydell, qemu-devel

Patchew URL: https://patchew.org/QEMU/20210501185116.1338875-1-richard.henderson@linaro.org/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210501185116.1338875-1-richard.henderson@linaro.org
Subject: [PULL 0/5] tcg patch queue

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20210501185116.1338875-1-richard.henderson@linaro.org -> patchew/20210501185116.1338875-1-richard.henderson@linaro.org
Switched to a new branch 'test'
3f52d0d decodetree: Extend argument set syntax to allow types
2f170a4 decodetree: Add support for 64-bit instructions
6567eed decodetree: More use of f-strings
95caca8 decodetree: Introduce whex and whexC helpers
b9a64b1 exec: Remove accel/tcg/ from include paths

=== OUTPUT BEGIN ===
1/5 Checking commit b9a64b13ad89 (exec: Remove accel/tcg/ from include paths)
2/5 Checking commit 95caca818e08 (decodetree: Introduce whex and whexC helpers)
ERROR: line over 90 characters
#52: FILE: scripts/decodetree.py:495:
+                output(ind, f'if ((insn & {whexC(innermask)}) == {whexC(innerbits)}) {{\n')

WARNING: line over 80 characters
#53: FILE: scripts/decodetree.py:496:
+                output(ind, f'    /* {str_match_bits(p.fixedbits, p.fixedmask)} */\n')

total: 1 errors, 1 warnings, 136 lines checked

Patch 2/5 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

3/5 Checking commit 6567eed04461 (decodetree: More use of f-strings)
4/5 Checking commit 2f170a408195 (decodetree: Add support for 64-bit instructions)
WARNING: line over 80 characters
#75: FILE: scripts/decodetree.py:236:
+                ret = f'deposit{bitop_width}({ret}, {pos}, {bitop_width - pos}, {ext})'

total: 0 errors, 1 warnings, 63 lines checked

Patch 4/5 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
5/5 Checking commit 3f52d0d96c42 (decodetree: Extend argument set syntax to allow types)
Use of uninitialized value $acpi_testexpected in string eq at ./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#156: 
new file mode 100644

total: 0 errors, 1 warnings, 121 lines checked

Patch 5/5 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20210501185116.1338875-1-richard.henderson@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PULL 0/5] tcg patch queue
@ 2021-05-01 18:51 Richard Henderson
  2021-05-01 19:27 ` no-reply
  2021-05-02 13:32 ` Peter Maydell
  0 siblings, 2 replies; 29+ messages in thread
From: Richard Henderson @ 2021-05-01 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

The following changes since commit 8f860d2633baf9c2b6261f703f86e394c6bc22ca:

  Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2021-04-30' into staging (2021-04-30 16:02:00 +0100)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20210501

for you to fetch changes up to af93ccacc772019298be4c3e47251cdaa60d0c21:

  decodetree: Extend argument set syntax to allow types (2021-05-01 11:45:35 -0700)

----------------------------------------------------------------
Include cleanups.
Decodetree enhancements for power10.

----------------------------------------------------------------
Luis Fernando Fujita Pires (1):
      decodetree: Add support for 64-bit instructions

Philippe Mathieu-Daudé (1):
      exec: Remove accel/tcg/ from include paths

Richard Henderson (3):
      decodetree: Introduce whex and whexC helpers
      decodetree: More use of f-strings
      decodetree: Extend argument set syntax to allow types

 docs/devel/decodetree.rst             |  11 ++-
 meson.build                           |   1 -
 include/exec/helper-gen.h             |   4 +-
 include/exec/helper-proto.h           |   4 +-
 include/exec/helper-tcg.h             |   4 +-
 tests/decode/succ_argset_type1.decode |   1 +
 scripts/decodetree.py                 | 172 +++++++++++++++++++---------------
 7 files changed, 112 insertions(+), 85 deletions(-)
 create mode 100644 tests/decode/succ_argset_type1.decode


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 0/5] tcg patch queue
  2020-09-03 21:40 Richard Henderson
@ 2020-09-06 13:07 ` Peter Maydell
  0 siblings, 0 replies; 29+ messages in thread
From: Peter Maydell @ 2020-09-06 13:07 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Thu, 3 Sep 2020 at 22:41, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The following changes since commit 3dd23a4fb8fd72d2220a90a809f213999ffe7f3a:
>
>   Merge remote-tracking branch 'remotes/legoater/tags/pull-aspeed-20200901' into staging (2020-09-03 14:12:48 +0100)
>
> are available in the Git repository at:
>
>   https://github.com/rth7680/qemu.git tags/pull-tcg-20200903
>
> for you to fetch changes up to fe4b0b5bfa96c38ad1cad0689a86cca9f307e353:
>
>   tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem (2020-09-03 13:13:58 -0700)
>
> ----------------------------------------------------------------
> Improve inlining in cputlb.c.
> Fix vector abs fallback.
> Only set parallel_cpus for SMP.
> Add vector dupm for 256-bit elements.
>
> ----------------------------------------------------------------


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.2
for any user-visible changes.

-- PMM


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PULL 0/5] tcg patch queue
@ 2020-09-03 21:40 Richard Henderson
  2020-09-06 13:07 ` Peter Maydell
  0 siblings, 1 reply; 29+ messages in thread
From: Richard Henderson @ 2020-09-03 21:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

The following changes since commit 3dd23a4fb8fd72d2220a90a809f213999ffe7f3a:

  Merge remote-tracking branch 'remotes/legoater/tags/pull-aspeed-20200901' into staging (2020-09-03 14:12:48 +0100)

are available in the Git repository at:

  https://github.com/rth7680/qemu.git tags/pull-tcg-20200903

for you to fetch changes up to fe4b0b5bfa96c38ad1cad0689a86cca9f307e353:

  tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem (2020-09-03 13:13:58 -0700)

----------------------------------------------------------------
Improve inlining in cputlb.c.
Fix vector abs fallback.
Only set parallel_cpus for SMP.
Add vector dupm for 256-bit elements.

----------------------------------------------------------------
Richard Henderson (4):
      cputlb: Make store_helper less fragile to compiler optimizations
      softmmu/cpus: Only set parallel_cpus for SMP
      tcg: Eliminate one store for in-place 128-bit dup_mem
      tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem

Stephen Long (1):
      tcg: Fix tcg gen for vectorized absolute value

 accel/tcg/cputlb.c | 138 ++++++++++++++++++++++++++++++-----------------------
 softmmu/cpus.c     |  11 ++++-
 tcg/tcg-op-gvec.c  |  61 ++++++++++++++++++++---
 3 files changed, 143 insertions(+), 67 deletions(-)


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 0/5] tcg patch queue
  2020-03-17 19:00 Richard Henderson
  2020-03-17 23:34 ` no-reply
@ 2020-03-19 10:17 ` Peter Maydell
  1 sibling, 0 replies; 29+ messages in thread
From: Peter Maydell @ 2020-03-19 10:17 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Tue, 17 Mar 2020 at 19:00, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The following changes since commit 40c67636f67c2a89745f2e698522fe917326a952:
>
>   Merge remote-tracking branch 'remotes/kraxel/tags/usb-20200317-pull-request' into staging (2020-03-17 14:00:56 +0000)
>
> are available in the Git repository at:
>
>   https://github.com/rth7680/qemu.git tags/pull-tcg-20200317
>
> for you to fetch changes up to 0270bd503e3699b7202200a2d693ad1feb57473f:
>
>   tcg: Remove tcg-runtime-gvec.c DO_CMP0 (2020-03-17 08:41:07 -0700)
>
> ----------------------------------------------------------------
> Fix tcg/i386 bug vs sari_vec.
> Fix tcg-runtime-gvec.c vs i386 without avx.
>
> ----------------------------------------------------------------



Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.0
for any user-visible changes.

-- PMM


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 0/5] tcg patch queue
  2020-03-17 19:00 Richard Henderson
@ 2020-03-17 23:34 ` no-reply
  2020-03-19 10:17 ` Peter Maydell
  1 sibling, 0 replies; 29+ messages in thread
From: no-reply @ 2020-03-17 23:34 UTC (permalink / raw)
  To: richard.henderson; +Cc: peter.maydell, qemu-devel

Patchew URL: https://patchew.org/QEMU/20200317190013.25036-1-richard.henderson@linaro.org/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PULL 0/5] tcg patch queue
Message-id: 20200317190013.25036-1-richard.henderson@linaro.org
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
83eaadd tcg: Remove tcg-runtime-gvec.c DO_CMP0
e0008a5 tcg: Tidy tcg-runtime-gvec.c DUP*
8e7d6d3 tcg: Tidy tcg-runtime-gvec.c types
44bd3c5 tcg: Remove CONFIG_VECTOR16
f410c29 tcg/i386: Bound shift count expanding sari_vec

=== OUTPUT BEGIN ===
1/5 Checking commit f410c296b774 (tcg/i386: Bound shift count expanding sari_vec)
2/5 Checking commit 44bd3c5fbbdb (tcg: Remove CONFIG_VECTOR16)
3/5 Checking commit 8e7d6d39c529 (tcg: Tidy tcg-runtime-gvec.c types)
ERROR: spaces required around that '&' (ctx:WxO)
#442: FILE: accel/tcg/tcg-runtime-gvec.c:510:
+        *(uint64_t *)(d + i) = *(uint64_t *)(a + i) &~ *(uint64_t *)(b + i);
                                                     ^

ERROR: space prohibited after that '~' (ctx:OxW)
#442: FILE: accel/tcg/tcg-runtime-gvec.c:510:
+        *(uint64_t *)(d + i) = *(uint64_t *)(a + i) &~ *(uint64_t *)(b + i);
                                                      ^

ERROR: spaces required around that '|' (ctx:WxO)
#453: FILE: accel/tcg/tcg-runtime-gvec.c:521:
+        *(uint64_t *)(d + i) = *(uint64_t *)(a + i) |~ *(uint64_t *)(b + i);
                                                     ^

ERROR: space prohibited after that '~' (ctx:OxW)
#453: FILE: accel/tcg/tcg-runtime-gvec.c:521:
+        *(uint64_t *)(d + i) = *(uint64_t *)(a + i) |~ *(uint64_t *)(b + i);
                                                      ^

ERROR: spaces required around that '==' (ctx:WxB)
#677: FILE: accel/tcg/tcg-runtime-gvec.c:897:
+    DO_CMP1(gvec_eq##SZ, uint##SZ##_t, ==)    \
                                        ^

ERROR: spaces required around that '!=' (ctx:WxB)
#678: FILE: accel/tcg/tcg-runtime-gvec.c:898:
+    DO_CMP1(gvec_ne##SZ, uint##SZ##_t, !=)    \
                                        ^

ERROR: spaces required around that '<' (ctx:WxB)
#679: FILE: accel/tcg/tcg-runtime-gvec.c:899:
+    DO_CMP1(gvec_lt##SZ, int##SZ##_t, <)      \
                                       ^

ERROR: spaces required around that '<=' (ctx:WxB)
#680: FILE: accel/tcg/tcg-runtime-gvec.c:900:
+    DO_CMP1(gvec_le##SZ, int##SZ##_t, <=)     \
                                       ^

ERROR: spaces required around that '<' (ctx:WxB)
#681: FILE: accel/tcg/tcg-runtime-gvec.c:901:
+    DO_CMP1(gvec_ltu##SZ, uint##SZ##_t, <)    \
                                         ^

ERROR: spaces required around that '<=' (ctx:WxB)
#682: FILE: accel/tcg/tcg-runtime-gvec.c:902:
+    DO_CMP1(gvec_leu##SZ, uint##SZ##_t, <=)
                                         ^

total: 10 errors, 0 warnings, 630 lines checked

Patch 3/5 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

4/5 Checking commit e0008a500fbb (tcg: Tidy tcg-runtime-gvec.c DUP*)
5/5 Checking commit 83eaadd6af23 (tcg: Remove tcg-runtime-gvec.c DO_CMP0)
ERROR: spaces required around that '*' (ctx:WxV)
#30: FILE: accel/tcg/tcg-runtime-gvec.c:869:
+        *(TYPE *)(d + i) = -(*(TYPE *)(a + i) OP *(TYPE *)(b + i));        \
                                                  ^

total: 1 errors, 0 warnings, 23 lines checked

Patch 5/5 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200317190013.25036-1-richard.henderson@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PULL 0/5] tcg patch queue
@ 2020-03-17 19:00 Richard Henderson
  2020-03-17 23:34 ` no-reply
  2020-03-19 10:17 ` Peter Maydell
  0 siblings, 2 replies; 29+ messages in thread
From: Richard Henderson @ 2020-03-17 19:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

The following changes since commit 40c67636f67c2a89745f2e698522fe917326a952:

  Merge remote-tracking branch 'remotes/kraxel/tags/usb-20200317-pull-request' into staging (2020-03-17 14:00:56 +0000)

are available in the Git repository at:

  https://github.com/rth7680/qemu.git tags/pull-tcg-20200317

for you to fetch changes up to 0270bd503e3699b7202200a2d693ad1feb57473f:

  tcg: Remove tcg-runtime-gvec.c DO_CMP0 (2020-03-17 08:41:07 -0700)

----------------------------------------------------------------
Fix tcg/i386 bug vs sari_vec.
Fix tcg-runtime-gvec.c vs i386 without avx.

----------------------------------------------------------------
Richard Henderson (5):
      tcg/i386: Bound shift count expanding sari_vec
      tcg: Remove CONFIG_VECTOR16
      tcg: Tidy tcg-runtime-gvec.c types
      tcg: Tidy tcg-runtime-gvec.c DUP*
      tcg: Remove tcg-runtime-gvec.c DO_CMP0

 configure                    |  56 --------
 accel/tcg/tcg-runtime-gvec.c | 298 +++++++++++++++++--------------------------
 tcg/i386/tcg-target.inc.c    |   9 +-
 3 files changed, 122 insertions(+), 241 deletions(-)


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2023-12-07  9:52 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-16 22:36 [PULL 0/5] tcg patch queue Richard Henderson
2023-01-16 22:36 ` [PULL 1/5] linux-user: Clean up when exiting due to a signal Richard Henderson
2023-01-16 22:36 ` [PULL 2/5] accel/tcg: Add debuginfo support Richard Henderson
2023-01-16 22:36 ` [PULL 3/5] tcg: add perfmap and jitdump Richard Henderson
2023-06-02 17:21   ` Peter Maydell
2023-06-03 20:35     ` Ilya Leoshkevich
2023-06-29 11:31   ` Philippe Mathieu-Daudé
2023-06-29 12:59     ` Richard Henderson
2023-06-30 12:39       ` Ilya Leoshkevich
2023-06-30 13:24         ` Richard Henderson
2023-12-07  9:51           ` Philippe Mathieu-Daudé
2023-01-16 22:36 ` [PULL 4/5] util/bufferiszero: Use __attribute__((target)) for avx2/avx512 Richard Henderson
2023-01-16 22:36 ` [PULL 5/5] accel/tcg: Split out cpu_exec_{setjmp,loop} Richard Henderson
2023-01-17 15:47 ` [PULL 0/5] tcg patch queue Peter Maydell
2023-01-20  9:41 ` Thomas Huth
2023-01-20 10:50   ` Alex Bennée
2023-01-20 10:53   ` Ilya Leoshkevich
2023-01-20 12:51     ` Thomas Huth
2023-01-20 16:49       ` Alex Bennée
2023-01-21  6:07   ` Richard Henderson
  -- strict thread matches above, loose matches on Subject: below --
2023-01-16 22:36 Richard Henderson
2021-05-01 18:51 Richard Henderson
2021-05-01 19:27 ` no-reply
2021-05-02 13:32 ` Peter Maydell
2020-09-03 21:40 Richard Henderson
2020-09-06 13:07 ` Peter Maydell
2020-03-17 19:00 Richard Henderson
2020-03-17 23:34 ` no-reply
2020-03-19 10:17 ` Peter Maydell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.