All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] accel/tcg/plugin: host insn size for plugin
@ 2023-04-06  2:27 Fei Wu
  2023-04-06  2:27 ` [PATCH 1/2] accel/tcg/plugin: export host insn size Fei Wu
  2023-04-06  2:27 ` [PATCH 2/2] plugins/hotblocks: add " Fei Wu
  0 siblings, 2 replies; 18+ messages in thread
From: Fei Wu @ 2023-04-06  2:27 UTC (permalink / raw)
  To: richard.henderson, pbonzini, alex.bennee, erdnaxe, ma.mandourr,
	qemu-devel
  Cc: Fei Wu

The translation ratio of host to guest instruction count is one of the
key performance factor of binary translation. It's better to have this
kind of information exported to plugin for analysis. As the host insn
size is not determined at guest->IR time, its address is recorded for
later dereference, and plugin inline mode is not supported.

Here is an example of the output with modified plugin hotblocks:

    pc, tcount, icount, ecount, host isize
    0xffffffff8041ad6c, 1, 9, 130450345, 456
    0xffffffff800084f0, 1, 9, 88273714, 264
    0xffffffff800084e4, 1, 3, 88264146, 135
    0xffffffff8041abd0, 1, 1, 46032689, 123
    0xffffffff8041ab3c, 1, 1, 46021650, 123
    0xffffffff8045ffe8, 1, 5, 40927215, 328

Fei Wu (2):
  accel/tcg/plugin: export host insn size
  plugins/hotblocks: add host insn size

 accel/tcg/plugin-gen.c       |  1 +
 contrib/plugins/hotblocks.c  | 24 +++++++++++++++---------
 include/qemu/plugin.h        |  2 ++
 include/qemu/qemu-plugin.h   |  8 ++++++++
 plugins/api.c                |  5 +++++
 plugins/qemu-plugins.symbols |  1 +
 6 files changed, 32 insertions(+), 9 deletions(-)

-- 
2.25.1



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-06  2:27 [PATCH 0/2] accel/tcg/plugin: host insn size for plugin Fei Wu
@ 2023-04-06  2:27 ` Fei Wu
  2023-04-06  7:46   ` Alex Bennée
  2023-04-06  2:27 ` [PATCH 2/2] plugins/hotblocks: add " Fei Wu
  1 sibling, 1 reply; 18+ messages in thread
From: Fei Wu @ 2023-04-06  2:27 UTC (permalink / raw)
  To: richard.henderson, pbonzini, alex.bennee, erdnaxe, ma.mandourr,
	qemu-devel
  Cc: Fei Wu

The translation ratio of host to guest instruction count is one of the
key performance factor of binary translation. TCG doesn't collect host
instruction count at present, it does collect host instruction size
instead, although they are not the same thing as instruction size might
not be fixed, instruction size is still a valid estimation.

Signed-off-by: Fei Wu <fei2.wu@intel.com>
---
 accel/tcg/plugin-gen.c       | 1 +
 include/qemu/plugin.h        | 2 ++
 include/qemu/qemu-plugin.h   | 8 ++++++++
 plugins/api.c                | 5 +++++
 plugins/qemu-plugins.symbols | 1 +
 5 files changed, 17 insertions(+)

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 5efb8db258..4a3ca8fa2f 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -881,6 +881,7 @@ bool plugin_gen_tb_start(CPUState *cpu, const DisasContextBase *db,
         ptb->haddr2 = NULL;
         ptb->mem_only = mem_only;
         ptb->mem_helper = false;
+        ptb->host_insn_size = &db->tb->tc.size;
 
         plugin_gen_empty_callback(PLUGIN_GEN_FROM_TB);
     }
diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index bc0781cab8..b38fd139e1 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -151,6 +151,8 @@ struct qemu_plugin_tb {
     /* if set, the TB calls helpers that might access guest memory */
     bool mem_helper;
 
+    uint64_t *host_insn_size;
+
     GArray *cbs[PLUGIN_N_CB_SUBTYPES];
 };
 
diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
index 50a9957279..2397574a21 100644
--- a/include/qemu/qemu-plugin.h
+++ b/include/qemu/qemu-plugin.h
@@ -336,6 +336,14 @@ void qemu_plugin_register_vcpu_insn_exec_inline(struct qemu_plugin_insn *insn,
  */
 size_t qemu_plugin_tb_n_insns(const struct qemu_plugin_tb *tb);
 
+/**
+ * qemu_plugin_tb_n_insns() - query helper for host insns size in TB
+ * @tb: opaque handle to TB passed to callback
+ *
+ * Returns: address of host insns size of this block
+ */
+void *qemu_plugin_tb_host_insn_size(const struct qemu_plugin_tb *tb);
+
 /**
  * qemu_plugin_tb_vaddr() - query helper for vaddr of TB start
  * @tb: opaque handle to TB passed to callback
diff --git a/plugins/api.c b/plugins/api.c
index 2078b16edb..0d70cb1f0f 100644
--- a/plugins/api.c
+++ b/plugins/api.c
@@ -188,6 +188,11 @@ size_t qemu_plugin_tb_n_insns(const struct qemu_plugin_tb *tb)
     return tb->n;
 }
 
+void *qemu_plugin_tb_host_insn_size(const struct qemu_plugin_tb *tb)
+{
+    return tb->host_insn_size;
+}
+
 uint64_t qemu_plugin_tb_vaddr(const struct qemu_plugin_tb *tb)
 {
     return tb->vaddr;
diff --git a/plugins/qemu-plugins.symbols b/plugins/qemu-plugins.symbols
index 71f6c90549..3e92c3b8ba 100644
--- a/plugins/qemu-plugins.symbols
+++ b/plugins/qemu-plugins.symbols
@@ -39,6 +39,7 @@
   qemu_plugin_start_code;
   qemu_plugin_tb_get_insn;
   qemu_plugin_tb_n_insns;
+  qemu_plugin_tb_host_insn_size;
   qemu_plugin_tb_vaddr;
   qemu_plugin_uninstall;
   qemu_plugin_vcpu_for_each;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/2] plugins/hotblocks: add host insn size
  2023-04-06  2:27 [PATCH 0/2] accel/tcg/plugin: host insn size for plugin Fei Wu
  2023-04-06  2:27 ` [PATCH 1/2] accel/tcg/plugin: export host insn size Fei Wu
@ 2023-04-06  2:27 ` Fei Wu
  2023-04-06  7:54   ` Alex Bennée
  1 sibling, 1 reply; 18+ messages in thread
From: Fei Wu @ 2023-04-06  2:27 UTC (permalink / raw)
  To: richard.henderson, pbonzini, alex.bennee, erdnaxe, ma.mandourr,
	qemu-devel
  Cc: Fei Wu

It's only valid when inline=false, otherwise it's default to 0.

Signed-off-by: Fei Wu <fei2.wu@intel.com>
---
 contrib/plugins/hotblocks.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/contrib/plugins/hotblocks.c b/contrib/plugins/hotblocks.c
index 062200a7a4..c9716da7fe 100644
--- a/contrib/plugins/hotblocks.c
+++ b/contrib/plugins/hotblocks.c
@@ -37,6 +37,8 @@ typedef struct {
     uint64_t exec_count;
     int      trans_count;
     unsigned long insns;
+    void    *p_host_insn_size;
+    uint64_t host_insn_size;
 } ExecCount;
 
 static gint cmp_exec_count(gconstpointer a, gconstpointer b)
@@ -59,13 +61,17 @@ static void plugin_exit(qemu_plugin_id_t id, void *p)
     it = g_list_sort(counts, cmp_exec_count);
 
     if (it) {
-        g_string_append_printf(report, "pc, tcount, icount, ecount\n");
+        g_string_append_printf(report,
+                               "host isize is only valid when inline=false\n"
+                               "pc, tcount, icount, ecount, host isize\n");
 
         for (i = 0; i < limit && it->next; i++, it = it->next) {
             ExecCount *rec = (ExecCount *) it->data;
-            g_string_append_printf(report, "0x%016"PRIx64", %d, %ld, %"PRId64"\n",
+            g_string_append_printf(report, "0x%016"PRIx64", %d, %ld, %"PRId64
+                                   ", %"PRIu64"\n",
                                    rec->start_addr, rec->trans_count,
-                                   rec->insns, rec->exec_count);
+                                   rec->insns, rec->exec_count,
+                                   rec->host_insn_size);
         }
 
         g_list_free(it);
@@ -82,14 +88,13 @@ static void plugin_init(void)
 
 static void vcpu_tb_exec(unsigned int cpu_index, void *udata)
 {
-    ExecCount *cnt;
-    uint64_t hash = (uint64_t) udata;
+    ExecCount *cnt = (ExecCount *) udata;
 
     g_mutex_lock(&lock);
-    cnt = (ExecCount *) g_hash_table_lookup(hotblocks, (gconstpointer) hash);
-    /* should always succeed */
-    g_assert(cnt);
     cnt->exec_count++;
+    if (cnt->host_insn_size == 0) {
+        cnt->host_insn_size = *((uint64_t *)cnt->p_host_insn_size);
+    }
     g_mutex_unlock(&lock);
 }
 
@@ -114,6 +119,7 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
         cnt->start_addr = pc;
         cnt->trans_count = 1;
         cnt->insns = insns;
+        cnt->p_host_insn_size = qemu_plugin_tb_host_insn_size(tb);
         g_hash_table_insert(hotblocks, (gpointer) hash, (gpointer) cnt);
     }
 
@@ -125,7 +131,7 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
     } else {
         qemu_plugin_register_vcpu_tb_exec_cb(tb, vcpu_tb_exec,
                                              QEMU_PLUGIN_CB_NO_REGS,
-                                             (void *)hash);
+                                             (void *)cnt);
     }
 }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-06  2:27 ` [PATCH 1/2] accel/tcg/plugin: export host insn size Fei Wu
@ 2023-04-06  7:46   ` Alex Bennée
  2023-04-07  1:31     ` Wu, Fei
  2023-04-08  3:34     ` Richard Henderson
  0 siblings, 2 replies; 18+ messages in thread
From: Alex Bennée @ 2023-04-06  7:46 UTC (permalink / raw)
  To: Fei Wu; +Cc: richard.henderson, pbonzini, erdnaxe, ma.mandourr, qemu-devel


Fei Wu <fei2.wu@intel.com> writes:

> The translation ratio of host to guest instruction count is one of the
> key performance factor of binary translation. TCG doesn't collect host
> instruction count at present, it does collect host instruction size
> instead, although they are not the same thing as instruction size might
> not be fixed, instruction size is still a valid estimation.

I'm not so sure about exposing this information to plugins because we
try to avoid leaking internal implementation details to plugins. Aside
from that the very act of instrumenting will increase the size of the
target buffer.

If your aim is to examine JIT efficiency what is wrong with the current
"info jit" that you can access via the HMP? Also I'm wondering if its
time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
extra data it collects is that expensive.

Richard, what do you think?

>
> Signed-off-by: Fei Wu <fei2.wu@intel.com>
> ---
>  accel/tcg/plugin-gen.c       | 1 +
>  include/qemu/plugin.h        | 2 ++
>  include/qemu/qemu-plugin.h   | 8 ++++++++
>  plugins/api.c                | 5 +++++
>  plugins/qemu-plugins.symbols | 1 +
>  5 files changed, 17 insertions(+)
>
> diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
> index 5efb8db258..4a3ca8fa2f 100644
> --- a/accel/tcg/plugin-gen.c
> +++ b/accel/tcg/plugin-gen.c
> @@ -881,6 +881,7 @@ bool plugin_gen_tb_start(CPUState *cpu, const DisasContextBase *db,
>          ptb->haddr2 = NULL;
>          ptb->mem_only = mem_only;
>          ptb->mem_helper = false;
> +        ptb->host_insn_size = &db->tb->tc.size;
>  
>          plugin_gen_empty_callback(PLUGIN_GEN_FROM_TB);
>      }
> diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
> index bc0781cab8..b38fd139e1 100644
> --- a/include/qemu/plugin.h
> +++ b/include/qemu/plugin.h
> @@ -151,6 +151,8 @@ struct qemu_plugin_tb {
>      /* if set, the TB calls helpers that might access guest memory */
>      bool mem_helper;
>  
> +    uint64_t *host_insn_size;
> +
>      GArray *cbs[PLUGIN_N_CB_SUBTYPES];
>  };
>  
> diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
> index 50a9957279..2397574a21 100644
> --- a/include/qemu/qemu-plugin.h
> +++ b/include/qemu/qemu-plugin.h
> @@ -336,6 +336,14 @@ void qemu_plugin_register_vcpu_insn_exec_inline(struct qemu_plugin_insn *insn,
>   */
>  size_t qemu_plugin_tb_n_insns(const struct qemu_plugin_tb *tb);
>  
> +/**
> + * qemu_plugin_tb_n_insns() - query helper for host insns size in TB
> + * @tb: opaque handle to TB passed to callback
> + *
> + * Returns: address of host insns size of this block

If we went ahead with this we need to be very clear when you can call
this helper because the data will only be valid at certain points (which
is another argument against this).

> + */
> +void *qemu_plugin_tb_host_insn_size(const struct qemu_plugin_tb *tb);
> +
>  /**
>   * qemu_plugin_tb_vaddr() - query helper for vaddr of TB start
>   * @tb: opaque handle to TB passed to callback
> diff --git a/plugins/api.c b/plugins/api.c
> index 2078b16edb..0d70cb1f0f 100644
> --- a/plugins/api.c
> +++ b/plugins/api.c
> @@ -188,6 +188,11 @@ size_t qemu_plugin_tb_n_insns(const struct qemu_plugin_tb *tb)
>      return tb->n;
>  }
>  
> +void *qemu_plugin_tb_host_insn_size(const struct qemu_plugin_tb *tb)
> +{
> +    return tb->host_insn_size;
> +}
> +
>  uint64_t qemu_plugin_tb_vaddr(const struct qemu_plugin_tb *tb)
>  {
>      return tb->vaddr;
> diff --git a/plugins/qemu-plugins.symbols b/plugins/qemu-plugins.symbols
> index 71f6c90549..3e92c3b8ba 100644
> --- a/plugins/qemu-plugins.symbols
> +++ b/plugins/qemu-plugins.symbols
> @@ -39,6 +39,7 @@
>    qemu_plugin_start_code;
>    qemu_plugin_tb_get_insn;
>    qemu_plugin_tb_n_insns;
> +  qemu_plugin_tb_host_insn_size;
>    qemu_plugin_tb_vaddr;
>    qemu_plugin_uninstall;
>    qemu_plugin_vcpu_for_each;


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] plugins/hotblocks: add host insn size
  2023-04-06  2:27 ` [PATCH 2/2] plugins/hotblocks: add " Fei Wu
@ 2023-04-06  7:54   ` Alex Bennée
  0 siblings, 0 replies; 18+ messages in thread
From: Alex Bennée @ 2023-04-06  7:54 UTC (permalink / raw)
  To: Fei Wu; +Cc: richard.henderson, pbonzini, erdnaxe, ma.mandourr, qemu-devel


Fei Wu <fei2.wu@intel.com> writes:

> It's only valid when inline=false, otherwise it's default to 0.
>
> Signed-off-by: Fei Wu <fei2.wu@intel.com>
> ---
>  contrib/plugins/hotblocks.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/contrib/plugins/hotblocks.c b/contrib/plugins/hotblocks.c
> index 062200a7a4..c9716da7fe 100644
> --- a/contrib/plugins/hotblocks.c
> +++ b/contrib/plugins/hotblocks.c
> @@ -37,6 +37,8 @@ typedef struct {
>      uint64_t exec_count;
>      int      trans_count;
>      unsigned long insns;
> +    void    *p_host_insn_size;
> +    uint64_t host_insn_size;
>  } ExecCount;
>  
>  static gint cmp_exec_count(gconstpointer a, gconstpointer b)
> @@ -59,13 +61,17 @@ static void plugin_exit(qemu_plugin_id_t id, void *p)
>      it = g_list_sort(counts, cmp_exec_count);
>  
>      if (it) {
> -        g_string_append_printf(report, "pc, tcount, icount, ecount\n");
> +        g_string_append_printf(report,
> +                               "host isize is only valid when inline=false\n"
> +                               "pc, tcount, icount, ecount, host isize\n");
>  
>          for (i = 0; i < limit && it->next; i++, it = it->next) {
>              ExecCount *rec = (ExecCount *) it->data;
> -            g_string_append_printf(report, "0x%016"PRIx64", %d, %ld, %"PRId64"\n",
> +            g_string_append_printf(report, "0x%016"PRIx64", %d, %ld, %"PRId64
> +                                   ", %"PRIu64"\n",
>                                     rec->start_addr, rec->trans_count,
> -                                   rec->insns, rec->exec_count);
> +                                   rec->insns, rec->exec_count,
> +                                   rec->host_insn_size);
>          }
>  
>          g_list_free(it);
> @@ -82,14 +88,13 @@ static void plugin_init(void)
>  
>  static void vcpu_tb_exec(unsigned int cpu_index, void *udata)
>  {
> -    ExecCount *cnt;
> -    uint64_t hash = (uint64_t) udata;
> +    ExecCount *cnt = (ExecCount *) udata;
>  
>      g_mutex_lock(&lock);
> -    cnt = (ExecCount *) g_hash_table_lookup(hotblocks, (gconstpointer) hash);
> -    /* should always succeed */
> -    g_assert(cnt);
>      cnt->exec_count++;
> +    if (cnt->host_insn_size == 0) {
> +        cnt->host_insn_size = *((uint64_t *)cnt->p_host_insn_size);

No - passing an internal TCG pointer across different phases of
translation/execution is a definite no no. We explicitly state that
handles are only valid for callbacks in the docs:

  Lifetime of the query handle
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Each callback provides an opaque anonymous information handle which
  can usually be further queried to find out information about a
  translation, instruction or operation. The handles themselves are only
  valid during the lifetime of the callback so it is important that any
  information that is needed is extracted during the callback and saved
  by the plugin.

to avoid this sort of tangling of implementation details into the
plugins.


> +    }
>      g_mutex_unlock(&lock);
>  }
>  
> @@ -114,6 +119,7 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
>          cnt->start_addr = pc;
>          cnt->trans_count = 1;
>          cnt->insns = insns;
> +        cnt->p_host_insn_size = qemu_plugin_tb_host_insn_size(tb);
>          g_hash_table_insert(hotblocks, (gpointer) hash, (gpointer) cnt);
>      }
>  
> @@ -125,7 +131,7 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
>      } else {
>          qemu_plugin_register_vcpu_tb_exec_cb(tb, vcpu_tb_exec,
>                                               QEMU_PLUGIN_CB_NO_REGS,
> -                                             (void *)hash);
> +                                             (void *)cnt);
>      }
>  }


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-06  7:46   ` Alex Bennée
@ 2023-04-07  1:31     ` Wu, Fei
  2023-04-10 10:46       ` Alex Bennée
  2023-04-08  3:34     ` Richard Henderson
  1 sibling, 1 reply; 18+ messages in thread
From: Wu, Fei @ 2023-04-07  1:31 UTC (permalink / raw)
  To: Alex Bennée
  Cc: richard.henderson, pbonzini, erdnaxe, ma.mandourr, qemu-devel

On 4/6/2023 3:46 PM, Alex Bennée wrote:
> 
> Fei Wu <fei2.wu@intel.com> writes:
> 
>> The translation ratio of host to guest instruction count is one of the
>> key performance factor of binary translation. TCG doesn't collect host
>> instruction count at present, it does collect host instruction size
>> instead, although they are not the same thing as instruction size might
>> not be fixed, instruction size is still a valid estimation.
> 
> I'm not so sure about exposing this information to plugins because we
> try to avoid leaking internal implementation details to plugins. Aside
> from that the very act of instrumenting will increase the size of the
> target buffer.
> 
> If your aim is to examine JIT efficiency what is wrong with the current
> "info jit" that you can access via the HMP? Also I'm wondering if its
> time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
> extra data it collects is that expensive.
> 
"info jit" collects the translation time expansion ratio, it doesn't
distinguish between hot and cold blocks:
    TB avg target size  14 max=1918 bytes
    TB avg host size    287 bytes (expansion ratio: 19.7)

My primary aim is to collect the runtime expansion ratio, so hot blocks
weigh more than cold blocks. My concern is this series might not be the
proper way to implement it, just as you mentioned in another reply.

Thanks,
Fei.

> Richard, what do you think?
> 
>>
>> Signed-off-by: Fei Wu <fei2.wu@intel.com>
>> ---
>>  accel/tcg/plugin-gen.c       | 1 +
>>  include/qemu/plugin.h        | 2 ++
>>  include/qemu/qemu-plugin.h   | 8 ++++++++
>>  plugins/api.c                | 5 +++++
>>  plugins/qemu-plugins.symbols | 1 +
>>  5 files changed, 17 insertions(+)
>>
>> diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
>> index 5efb8db258..4a3ca8fa2f 100644
>> --- a/accel/tcg/plugin-gen.c
>> +++ b/accel/tcg/plugin-gen.c
>> @@ -881,6 +881,7 @@ bool plugin_gen_tb_start(CPUState *cpu, const DisasContextBase *db,
>>          ptb->haddr2 = NULL;
>>          ptb->mem_only = mem_only;
>>          ptb->mem_helper = false;
>> +        ptb->host_insn_size = &db->tb->tc.size;
>>  
>>          plugin_gen_empty_callback(PLUGIN_GEN_FROM_TB);
>>      }
>> diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
>> index bc0781cab8..b38fd139e1 100644
>> --- a/include/qemu/plugin.h
>> +++ b/include/qemu/plugin.h
>> @@ -151,6 +151,8 @@ struct qemu_plugin_tb {
>>      /* if set, the TB calls helpers that might access guest memory */
>>      bool mem_helper;
>>  
>> +    uint64_t *host_insn_size;
>> +
>>      GArray *cbs[PLUGIN_N_CB_SUBTYPES];
>>  };
>>  
>> diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
>> index 50a9957279..2397574a21 100644
>> --- a/include/qemu/qemu-plugin.h
>> +++ b/include/qemu/qemu-plugin.h
>> @@ -336,6 +336,14 @@ void qemu_plugin_register_vcpu_insn_exec_inline(struct qemu_plugin_insn *insn,
>>   */
>>  size_t qemu_plugin_tb_n_insns(const struct qemu_plugin_tb *tb);
>>  
>> +/**
>> + * qemu_plugin_tb_n_insns() - query helper for host insns size in TB
>> + * @tb: opaque handle to TB passed to callback
>> + *
>> + * Returns: address of host insns size of this block
> 
> If we went ahead with this we need to be very clear when you can call
> this helper because the data will only be valid at certain points (which
> is another argument against this).
> 
>> + */
>> +void *qemu_plugin_tb_host_insn_size(const struct qemu_plugin_tb *tb);
>> +
>>  /**
>>   * qemu_plugin_tb_vaddr() - query helper for vaddr of TB start
>>   * @tb: opaque handle to TB passed to callback
>> diff --git a/plugins/api.c b/plugins/api.c
>> index 2078b16edb..0d70cb1f0f 100644
>> --- a/plugins/api.c
>> +++ b/plugins/api.c
>> @@ -188,6 +188,11 @@ size_t qemu_plugin_tb_n_insns(const struct qemu_plugin_tb *tb)
>>      return tb->n;
>>  }
>>  
>> +void *qemu_plugin_tb_host_insn_size(const struct qemu_plugin_tb *tb)
>> +{
>> +    return tb->host_insn_size;
>> +}
>> +
>>  uint64_t qemu_plugin_tb_vaddr(const struct qemu_plugin_tb *tb)
>>  {
>>      return tb->vaddr;
>> diff --git a/plugins/qemu-plugins.symbols b/plugins/qemu-plugins.symbols
>> index 71f6c90549..3e92c3b8ba 100644
>> --- a/plugins/qemu-plugins.symbols
>> +++ b/plugins/qemu-plugins.symbols
>> @@ -39,6 +39,7 @@
>>    qemu_plugin_start_code;
>>    qemu_plugin_tb_get_insn;
>>    qemu_plugin_tb_n_insns;
>> +  qemu_plugin_tb_host_insn_size;
>>    qemu_plugin_tb_vaddr;
>>    qemu_plugin_uninstall;
>>    qemu_plugin_vcpu_for_each;
> 
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-06  7:46   ` Alex Bennée
  2023-04-07  1:31     ` Wu, Fei
@ 2023-04-08  3:34     ` Richard Henderson
  2023-04-10 10:36       ` Alex Bennée
  1 sibling, 1 reply; 18+ messages in thread
From: Richard Henderson @ 2023-04-08  3:34 UTC (permalink / raw)
  To: Alex Bennée, Fei Wu; +Cc: pbonzini, erdnaxe, ma.mandourr, qemu-devel

On 4/6/23 00:46, Alex Bennée wrote:
> If your aim is to examine JIT efficiency what is wrong with the current
> "info jit" that you can access via the HMP? Also I'm wondering if its
> time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
> extra data it collects is that expensive.
> 
> Richard, what do you think?

What is it that you want from CONFIG_PROFILER that you can't get from perf?
I've been tempted to remove CONFIG_PROFILER entirely.


r~


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-08  3:34     ` Richard Henderson
@ 2023-04-10 10:36       ` Alex Bennée
  2023-04-10 13:02         ` Wu, Fei
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Bennée @ 2023-04-10 10:36 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Fei Wu, pbonzini, erdnaxe, ma.mandourr, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> On 4/6/23 00:46, Alex Bennée wrote:
>> If your aim is to examine JIT efficiency what is wrong with the current
>> "info jit" that you can access via the HMP? Also I'm wondering if its
>> time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
>> extra data it collects is that expensive.
>> Richard, what do you think?
>
> What is it that you want from CONFIG_PROFILER that you can't get from perf?
> I've been tempted to remove CONFIG_PROFILER entirely.

I think perf is pretty good at getting the hot paths in the translator
and pretty much all of the timer related stuff in CONFIG_PROFILER could
be dropped. However some of the additional information about TCG ops
usage and distribution is useful. That said last time I had a tilt at
this on the back of a GSoC project:

  Subject: [PATCH  v9 00/13] TCG code quality tracking and perf integration
  Date: Mon,  7 Oct 2019 16:28:26 +0100
  Message-Id: <20191007152839.30804-1-alex.bennee@linaro.org>

The series ended up moving all the useful bits of CONFIG_PROFILER into
tb stats which was dynamically controlled on a per TB basis. Now that
the perf integration stuff was merged maybe there is a simpler series to
be picked out of the remains?

Fei Wu,

Have you looked at the above series? Is that gathering the sort of
things you need? Is this all in service of examining the translation
quality of hot code?

>
>
> r~


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-07  1:31     ` Wu, Fei
@ 2023-04-10 10:46       ` Alex Bennée
  0 siblings, 0 replies; 18+ messages in thread
From: Alex Bennée @ 2023-04-10 10:46 UTC (permalink / raw)
  To: Wu, Fei; +Cc: richard.henderson, pbonzini, erdnaxe, ma.mandourr, qemu-devel


"Wu, Fei" <fei2.wu@intel.com> writes:

> On 4/6/2023 3:46 PM, Alex Bennée wrote:
>> 
>> Fei Wu <fei2.wu@intel.com> writes:
>> 
>>> The translation ratio of host to guest instruction count is one of the
>>> key performance factor of binary translation. TCG doesn't collect host
>>> instruction count at present, it does collect host instruction size
>>> instead, although they are not the same thing as instruction size might
>>> not be fixed, instruction size is still a valid estimation.
>> 
>> I'm not so sure about exposing this information to plugins because we
>> try to avoid leaking internal implementation details to plugins. Aside
>> from that the very act of instrumenting will increase the size of the
>> target buffer.
>> 
>> If your aim is to examine JIT efficiency what is wrong with the current
>> "info jit" that you can access via the HMP? Also I'm wondering if its
>> time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
>> extra data it collects is that expensive.
>> 
> "info jit" collects the translation time expansion ratio, it doesn't
> distinguish between hot and cold blocks:
>     TB avg target size  14 max=1918 bytes
>     TB avg host size    287 bytes (expansion ratio: 19.7)
>
> My primary aim is to collect the runtime expansion ratio, so hot blocks
> weigh more than cold blocks. My concern is this series might not be the
> proper way to implement it, just as you mentioned in another reply.

See my reply to Richard but:

  Subject: [PATCH  v9 00/13] TCG code quality tracking and perf integration
  Date: Mon,  7 Oct 2019 16:28:26 +0100
  Message-Id: <20191007152839.30804-1-alex.bennee@linaro.org>

may be of interest?

>
> Thanks,
> Fei.
>
>> Richard, what do you think?
>> 
>>>
>>> Signed-off-by: Fei Wu <fei2.wu@intel.com>
>>> ---
>>>  accel/tcg/plugin-gen.c       | 1 +
>>>  include/qemu/plugin.h        | 2 ++
>>>  include/qemu/qemu-plugin.h   | 8 ++++++++
>>>  plugins/api.c                | 5 +++++
>>>  plugins/qemu-plugins.symbols | 1 +
>>>  5 files changed, 17 insertions(+)
>>>
>>> diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
>>> index 5efb8db258..4a3ca8fa2f 100644
>>> --- a/accel/tcg/plugin-gen.c
>>> +++ b/accel/tcg/plugin-gen.c
>>> @@ -881,6 +881,7 @@ bool plugin_gen_tb_start(CPUState *cpu, const DisasContextBase *db,
>>>          ptb->haddr2 = NULL;
>>>          ptb->mem_only = mem_only;
>>>          ptb->mem_helper = false;
>>> +        ptb->host_insn_size = &db->tb->tc.size;
>>>  
>>>          plugin_gen_empty_callback(PLUGIN_GEN_FROM_TB);
>>>      }
>>> diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
>>> index bc0781cab8..b38fd139e1 100644
>>> --- a/include/qemu/plugin.h
>>> +++ b/include/qemu/plugin.h
>>> @@ -151,6 +151,8 @@ struct qemu_plugin_tb {
>>>      /* if set, the TB calls helpers that might access guest memory */
>>>      bool mem_helper;
>>>  
>>> +    uint64_t *host_insn_size;
>>> +
>>>      GArray *cbs[PLUGIN_N_CB_SUBTYPES];
>>>  };
>>>  
>>> diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
>>> index 50a9957279..2397574a21 100644
>>> --- a/include/qemu/qemu-plugin.h
>>> +++ b/include/qemu/qemu-plugin.h
>>> @@ -336,6 +336,14 @@ void qemu_plugin_register_vcpu_insn_exec_inline(struct qemu_plugin_insn *insn,
>>>   */
>>>  size_t qemu_plugin_tb_n_insns(const struct qemu_plugin_tb *tb);
>>>  
>>> +/**
>>> + * qemu_plugin_tb_n_insns() - query helper for host insns size in TB
>>> + * @tb: opaque handle to TB passed to callback
>>> + *
>>> + * Returns: address of host insns size of this block
>> 
>> If we went ahead with this we need to be very clear when you can call
>> this helper because the data will only be valid at certain points (which
>> is another argument against this).
>> 
>>> + */
>>> +void *qemu_plugin_tb_host_insn_size(const struct qemu_plugin_tb *tb);
>>> +
>>>  /**
>>>   * qemu_plugin_tb_vaddr() - query helper for vaddr of TB start
>>>   * @tb: opaque handle to TB passed to callback
>>> diff --git a/plugins/api.c b/plugins/api.c
>>> index 2078b16edb..0d70cb1f0f 100644
>>> --- a/plugins/api.c
>>> +++ b/plugins/api.c
>>> @@ -188,6 +188,11 @@ size_t qemu_plugin_tb_n_insns(const struct qemu_plugin_tb *tb)
>>>      return tb->n;
>>>  }
>>>  
>>> +void *qemu_plugin_tb_host_insn_size(const struct qemu_plugin_tb *tb)
>>> +{
>>> +    return tb->host_insn_size;
>>> +}
>>> +
>>>  uint64_t qemu_plugin_tb_vaddr(const struct qemu_plugin_tb *tb)
>>>  {
>>>      return tb->vaddr;
>>> diff --git a/plugins/qemu-plugins.symbols b/plugins/qemu-plugins.symbols
>>> index 71f6c90549..3e92c3b8ba 100644
>>> --- a/plugins/qemu-plugins.symbols
>>> +++ b/plugins/qemu-plugins.symbols
>>> @@ -39,6 +39,7 @@
>>>    qemu_plugin_start_code;
>>>    qemu_plugin_tb_get_insn;
>>>    qemu_plugin_tb_n_insns;
>>> +  qemu_plugin_tb_host_insn_size;
>>>    qemu_plugin_tb_vaddr;
>>>    qemu_plugin_uninstall;
>>>    qemu_plugin_vcpu_for_each;
>> 
>> 


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-10 10:36       ` Alex Bennée
@ 2023-04-10 13:02         ` Wu, Fei
  2023-04-11  7:27           ` Alex Bennée
  0 siblings, 1 reply; 18+ messages in thread
From: Wu, Fei @ 2023-04-10 13:02 UTC (permalink / raw)
  To: Alex Bennée, Richard Henderson
  Cc: pbonzini, erdnaxe, ma.mandourr, qemu-devel

On 4/10/2023 6:36 PM, Alex Bennée wrote:
> 
> Richard Henderson <richard.henderson@linaro.org> writes:
> 
>> On 4/6/23 00:46, Alex Bennée wrote:
>>> If your aim is to examine JIT efficiency what is wrong with the current
>>> "info jit" that you can access via the HMP? Also I'm wondering if its
>>> time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
>>> extra data it collects is that expensive.
>>> Richard, what do you think?
>>
>> What is it that you want from CONFIG_PROFILER that you can't get from perf?
>> I've been tempted to remove CONFIG_PROFILER entirely.
> 
> I think perf is pretty good at getting the hot paths in the translator
> and pretty much all of the timer related stuff in CONFIG_PROFILER could
> be dropped. However some of the additional information about TCG ops
> usage and distribution is useful. That said last time I had a tilt at
> this on the back of a GSoC project:
> 
>   Subject: [PATCH  v9 00/13] TCG code quality tracking and perf integration
>   Date: Mon,  7 Oct 2019 16:28:26 +0100
>   Message-Id: <20191007152839.30804-1-alex.bennee@linaro.org>
> 
> The series ended up moving all the useful bits of CONFIG_PROFILER into
> tb stats which was dynamically controlled on a per TB basis. Now that
> the perf integration stuff was merged maybe there is a simpler series to
> be picked out of the remains?
> 
> Fei Wu,
> 
> Have you looked at the above series? Is that gathering the sort of
> things you need? Is this all in service of examining the translation
> quality of hot code?
> 
Yes, it does have what I want, I suppose this wiki is for the series:
    https://wiki.qemu.org/Features/TCGCodeQuality

btw, the archive seems broken and cannot show the whole series:
    https://www.mail-archive.com/qemu-devel@nongnu.org/msg650258.html

Thanks,
Fei.

>>
>>
>> r~
> 
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-10 13:02         ` Wu, Fei
@ 2023-04-11  7:27           ` Alex Bennée
  2023-04-12 12:50             ` Wu, Fei
  2023-04-17 11:11             ` Wu, Fei
  0 siblings, 2 replies; 18+ messages in thread
From: Alex Bennée @ 2023-04-11  7:27 UTC (permalink / raw)
  To: Wu, Fei; +Cc: Richard Henderson, pbonzini, erdnaxe, ma.mandourr, qemu-devel


"Wu, Fei" <fei2.wu@intel.com> writes:

> On 4/10/2023 6:36 PM, Alex Bennée wrote:
>> 
>> Richard Henderson <richard.henderson@linaro.org> writes:
>> 
>>> On 4/6/23 00:46, Alex Bennée wrote:
>>>> If your aim is to examine JIT efficiency what is wrong with the current
>>>> "info jit" that you can access via the HMP? Also I'm wondering if its
>>>> time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
>>>> extra data it collects is that expensive.
>>>> Richard, what do you think?
>>>
>>> What is it that you want from CONFIG_PROFILER that you can't get from perf?
>>> I've been tempted to remove CONFIG_PROFILER entirely.
>> 
>> I think perf is pretty good at getting the hot paths in the translator
>> and pretty much all of the timer related stuff in CONFIG_PROFILER could
>> be dropped. However some of the additional information about TCG ops
>> usage and distribution is useful. That said last time I had a tilt at
>> this on the back of a GSoC project:
>> 
>>   Subject: [PATCH  v9 00/13] TCG code quality tracking and perf integration
>>   Date: Mon,  7 Oct 2019 16:28:26 +0100
>>   Message-Id: <20191007152839.30804-1-alex.bennee@linaro.org>
>> 
>> The series ended up moving all the useful bits of CONFIG_PROFILER into
>> tb stats which was dynamically controlled on a per TB basis. Now that
>> the perf integration stuff was merged maybe there is a simpler series to
>> be picked out of the remains?
>> 
>> Fei Wu,
>> 
>> Have you looked at the above series? Is that gathering the sort of
>> things you need? Is this all in service of examining the translation
>> quality of hot code?
>> 
> Yes, it does have what I want, I suppose this wiki is for the series:
>     https://wiki.qemu.org/Features/TCGCodeQuality

Yes.

>
> btw, the archive seems broken and cannot show the whole series:
>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg650258.html

I have a v10 branch here:

  https://github.com/stsquad/qemu/tree/tcg/tbstats-and-perf-v10

I think the top two patches can be dropped on a re-base as the JIT/perf
integration is already merged. It might be a tricky re-base though.
Depends on how much churn there has been in the tree since.

>
> Thanks,
> Fei.
>
>>>
>>>
>>> r~
>> 
>> 


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-11  7:27           ` Alex Bennée
@ 2023-04-12 12:50             ` Wu, Fei
  2023-04-12 13:28               ` Alex Bennée
  2023-04-17 11:11             ` Wu, Fei
  1 sibling, 1 reply; 18+ messages in thread
From: Wu, Fei @ 2023-04-12 12:50 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Richard Henderson, pbonzini, erdnaxe, ma.mandourr, qemu-devel

On 4/11/2023 3:27 PM, Alex Bennée wrote:
> 
> "Wu, Fei" <fei2.wu@intel.com> writes:
> 
>> On 4/10/2023 6:36 PM, Alex Bennée wrote:
>>>
>>> Richard Henderson <richard.henderson@linaro.org> writes:
>>>
>>>> On 4/6/23 00:46, Alex Bennée wrote:
>>>>> If your aim is to examine JIT efficiency what is wrong with the current
>>>>> "info jit" that you can access via the HMP? Also I'm wondering if its
>>>>> time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
>>>>> extra data it collects is that expensive.
>>>>> Richard, what do you think?
>>>>
>>>> What is it that you want from CONFIG_PROFILER that you can't get from perf?
>>>> I've been tempted to remove CONFIG_PROFILER entirely.
>>>
>>> I think perf is pretty good at getting the hot paths in the translator
>>> and pretty much all of the timer related stuff in CONFIG_PROFILER could
>>> be dropped. However some of the additional information about TCG ops
>>> usage and distribution is useful. That said last time I had a tilt at
>>> this on the back of a GSoC project:
>>>
>>>   Subject: [PATCH  v9 00/13] TCG code quality tracking and perf integration
>>>   Date: Mon,  7 Oct 2019 16:28:26 +0100
>>>   Message-Id: <20191007152839.30804-1-alex.bennee@linaro.org>
>>>
>>> The series ended up moving all the useful bits of CONFIG_PROFILER into
>>> tb stats which was dynamically controlled on a per TB basis. Now that
>>> the perf integration stuff was merged maybe there is a simpler series to
>>> be picked out of the remains?
>>>
>>> Fei Wu,
>>>
>>> Have you looked at the above series? Is that gathering the sort of
>>> things you need? Is this all in service of examining the translation
>>> quality of hot code?
>>>
>> Yes, it does have what I want, I suppose this wiki is for the series:
>>     https://wiki.qemu.org/Features/TCGCodeQuality
> 
> Yes.
> 
>>
>> btw, the archive seems broken and cannot show the whole series:
>>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg650258.html
> 
> I have a v10 branch here:
> 
>   https://github.com/stsquad/qemu/tree/tcg/tbstats-and-perf-v10
> 
> I think the top two patches can be dropped on a re-base as the JIT/perf
> integration is already merged. It might be a tricky re-base though.
> Depends on how much churn there has been in the tree since.
> 
I'd like to try it. Why has it not been merged upstream?

Thanks,
Fei.

>>
>> Thanks,
>> Fei.
>>
>>>>
>>>>
>>>> r~
>>>
>>>
> 
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-12 12:50             ` Wu, Fei
@ 2023-04-12 13:28               ` Alex Bennée
  2023-04-12 13:47                 ` Wu, Fei
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Bennée @ 2023-04-12 13:28 UTC (permalink / raw)
  To: Wu, Fei; +Cc: Richard Henderson, pbonzini, erdnaxe, ma.mandourr, qemu-devel


"Wu, Fei" <fei2.wu@intel.com> writes:

> On 4/11/2023 3:27 PM, Alex Bennée wrote:
>> 
>> "Wu, Fei" <fei2.wu@intel.com> writes:
>> 
>>> On 4/10/2023 6:36 PM, Alex Bennée wrote:
>>>>
>>>> Richard Henderson <richard.henderson@linaro.org> writes:
>>>>
>>>>> On 4/6/23 00:46, Alex Bennée wrote:
>>>>>> If your aim is to examine JIT efficiency what is wrong with the current
>>>>>> "info jit" that you can access via the HMP? Also I'm wondering if its
>>>>>> time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
>>>>>> extra data it collects is that expensive.
>>>>>> Richard, what do you think?
>>>>>
>>>>> What is it that you want from CONFIG_PROFILER that you can't get from perf?
>>>>> I've been tempted to remove CONFIG_PROFILER entirely.
>>>>
>>>> I think perf is pretty good at getting the hot paths in the translator
>>>> and pretty much all of the timer related stuff in CONFIG_PROFILER could
>>>> be dropped. However some of the additional information about TCG ops
>>>> usage and distribution is useful. That said last time I had a tilt at
>>>> this on the back of a GSoC project:
>>>>
>>>>   Subject: [PATCH  v9 00/13] TCG code quality tracking and perf integration
>>>>   Date: Mon,  7 Oct 2019 16:28:26 +0100
>>>>   Message-Id: <20191007152839.30804-1-alex.bennee@linaro.org>
>>>>
>>>> The series ended up moving all the useful bits of CONFIG_PROFILER into
>>>> tb stats which was dynamically controlled on a per TB basis. Now that
>>>> the perf integration stuff was merged maybe there is a simpler series to
>>>> be picked out of the remains?
>>>>
>>>> Fei Wu,
>>>>
>>>> Have you looked at the above series? Is that gathering the sort of
>>>> things you need? Is this all in service of examining the translation
>>>> quality of hot code?
>>>>
>>> Yes, it does have what I want, I suppose this wiki is for the series:
>>>     https://wiki.qemu.org/Features/TCGCodeQuality
>> 
>> Yes.
>> 
>>>
>>> btw, the archive seems broken and cannot show the whole series:
>>>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg650258.html
>> 
>> I have a v10 branch here:
>> 
>>   https://github.com/stsquad/qemu/tree/tcg/tbstats-and-perf-v10
>> 
>> I think the top two patches can be dropped on a re-base as the JIT/perf
>> integration is already merged. It might be a tricky re-base though.
>> Depends on how much churn there has been in the tree since.
>> 
> I'd like to try it. Why has it not been merged upstream?

Bits have been merged (the perf jit support) but the original GSoC
student moved on and I ran out of time to work on it. It became yet another
back burner series that awaits some spare hacking time.

>
> Thanks,
> Fei.
>
>>>
>>> Thanks,
>>> Fei.
>>>
>>>>>
>>>>>
>>>>> r~
>>>>
>>>>
>> 
>> 


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-12 13:28               ` Alex Bennée
@ 2023-04-12 13:47                 ` Wu, Fei
  0 siblings, 0 replies; 18+ messages in thread
From: Wu, Fei @ 2023-04-12 13:47 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Richard Henderson, pbonzini, erdnaxe, ma.mandourr, qemu-devel

On 4/12/2023 9:28 PM, Alex Bennée wrote:
> 
> "Wu, Fei" <fei2.wu@intel.com> writes:
> 
>> On 4/11/2023 3:27 PM, Alex Bennée wrote:
>>>
>>> "Wu, Fei" <fei2.wu@intel.com> writes:
>>>
>>>> On 4/10/2023 6:36 PM, Alex Bennée wrote:
>>>>>
>>>>> Richard Henderson <richard.henderson@linaro.org> writes:
>>>>>
>>>>>> On 4/6/23 00:46, Alex Bennée wrote:
>>>>>>> If your aim is to examine JIT efficiency what is wrong with the current
>>>>>>> "info jit" that you can access via the HMP? Also I'm wondering if its
>>>>>>> time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
>>>>>>> extra data it collects is that expensive.
>>>>>>> Richard, what do you think?
>>>>>>
>>>>>> What is it that you want from CONFIG_PROFILER that you can't get from perf?
>>>>>> I've been tempted to remove CONFIG_PROFILER entirely.
>>>>>
>>>>> I think perf is pretty good at getting the hot paths in the translator
>>>>> and pretty much all of the timer related stuff in CONFIG_PROFILER could
>>>>> be dropped. However some of the additional information about TCG ops
>>>>> usage and distribution is useful. That said last time I had a tilt at
>>>>> this on the back of a GSoC project:
>>>>>
>>>>>   Subject: [PATCH  v9 00/13] TCG code quality tracking and perf integration
>>>>>   Date: Mon,  7 Oct 2019 16:28:26 +0100
>>>>>   Message-Id: <20191007152839.30804-1-alex.bennee@linaro.org>
>>>>>
>>>>> The series ended up moving all the useful bits of CONFIG_PROFILER into
>>>>> tb stats which was dynamically controlled on a per TB basis. Now that
>>>>> the perf integration stuff was merged maybe there is a simpler series to
>>>>> be picked out of the remains?
>>>>>
>>>>> Fei Wu,
>>>>>
>>>>> Have you looked at the above series? Is that gathering the sort of
>>>>> things you need? Is this all in service of examining the translation
>>>>> quality of hot code?
>>>>>
>>>> Yes, it does have what I want, I suppose this wiki is for the series:
>>>>     https://wiki.qemu.org/Features/TCGCodeQuality
>>>
>>> Yes.
>>>
>>>>
>>>> btw, the archive seems broken and cannot show the whole series:
>>>>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg650258.html
>>>
>>> I have a v10 branch here:
>>>
>>>   https://github.com/stsquad/qemu/tree/tcg/tbstats-and-perf-v10
>>>
>>> I think the top two patches can be dropped on a re-base as the JIT/perf
>>> integration is already merged. It might be a tricky re-base though.
>>> Depends on how much churn there has been in the tree since.
>>>
>> I'd like to try it. Why has it not been merged upstream?
> 
> Bits have been merged (the perf jit support) but the original GSoC
> student moved on and I ran out of time to work on it. It became yet another
> back burner series that awaits some spare hacking time.
> 
Got it, let's see if I can help.

Thanks,
Fei.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-11  7:27           ` Alex Bennée
  2023-04-12 12:50             ` Wu, Fei
@ 2023-04-17 11:11             ` Wu, Fei
  2023-04-17 12:11               ` Alex Bennée
  1 sibling, 1 reply; 18+ messages in thread
From: Wu, Fei @ 2023-04-17 11:11 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Richard Henderson, pbonzini, erdnaxe, ma.mandourr, qemu-devel

On 4/11/2023 3:27 PM, Alex Bennée wrote:
> 
> "Wu, Fei" <fei2.wu@intel.com> writes:
> 
>> On 4/10/2023 6:36 PM, Alex Bennée wrote:
>>>
>>> Richard Henderson <richard.henderson@linaro.org> writes:
>>>
>>>> On 4/6/23 00:46, Alex Bennée wrote:
>>>>> If your aim is to examine JIT efficiency what is wrong with the current
>>>>> "info jit" that you can access via the HMP? Also I'm wondering if its
>>>>> time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
>>>>> extra data it collects is that expensive.
>>>>> Richard, what do you think?
>>>>
>>>> What is it that you want from CONFIG_PROFILER that you can't get from perf?
>>>> I've been tempted to remove CONFIG_PROFILER entirely.
>>>
>>> I think perf is pretty good at getting the hot paths in the translator
>>> and pretty much all of the timer related stuff in CONFIG_PROFILER could
>>> be dropped. However some of the additional information about TCG ops
>>> usage and distribution is useful. That said last time I had a tilt at
>>> this on the back of a GSoC project:
>>>
>>>   Subject: [PATCH  v9 00/13] TCG code quality tracking and perf integration
>>>   Date: Mon,  7 Oct 2019 16:28:26 +0100
>>>   Message-Id: <20191007152839.30804-1-alex.bennee@linaro.org>
>>>
>>> The series ended up moving all the useful bits of CONFIG_PROFILER into
>>> tb stats which was dynamically controlled on a per TB basis. Now that
>>> the perf integration stuff was merged maybe there is a simpler series to
>>> be picked out of the remains?
>>>
>>> Fei Wu,
>>>
>>> Have you looked at the above series? Is that gathering the sort of
>>> things you need? Is this all in service of examining the translation
>>> quality of hot code?
>>>
>> Yes, it does have what I want, I suppose this wiki is for the series:
>>     https://wiki.qemu.org/Features/TCGCodeQuality
> 
> Yes.
> 
>>
>> btw, the archive seems broken and cannot show the whole series:
>>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg650258.html
> 
> I have a v10 branch here:
> 
>   https://github.com/stsquad/qemu/tree/tcg/tbstats-and-perf-v10
> 
> I think the top two patches can be dropped on a re-base as the JIT/perf
> integration is already merged. It might be a tricky re-base though.
> Depends on how much churn there has been in the tree since.
> 
I have rebased the patches to upstream here:
    https://github.com/atwufei/qemu/tree/tbstats

I try to keep the patches as possible as they are, but there are lots of
changes since then, so changes are inevitable, e.g. CF_NOCACHE has been
removed from upstream, I just removed its usage in the corresponding
patch, which might not be preferred.

I did some basic tests and they worked (the output of info goes to qemu
console, instead of telnet terminal), including:
    * tb_stats start
    * info tb-list
    * info tb 10

Alex, would you please take a look?

Thanks,
Fei.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-17 11:11             ` Wu, Fei
@ 2023-04-17 12:11               ` Alex Bennée
  2023-04-17 13:01                 ` Wu, Fei
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Bennée @ 2023-04-17 12:11 UTC (permalink / raw)
  To: Wu, Fei; +Cc: Richard Henderson, pbonzini, erdnaxe, ma.mandourr, qemu-devel


"Wu, Fei" <fei2.wu@intel.com> writes:

> On 4/11/2023 3:27 PM, Alex Bennée wrote:
>> 
>> "Wu, Fei" <fei2.wu@intel.com> writes:
>> 
>>> On 4/10/2023 6:36 PM, Alex Bennée wrote:
>>>>
>>>> Richard Henderson <richard.henderson@linaro.org> writes:
>>>>
>>>>> On 4/6/23 00:46, Alex Bennée wrote:
>>>>>> If your aim is to examine JIT efficiency what is wrong with the current
>>>>>> "info jit" that you can access via the HMP? Also I'm wondering if its
>>>>>> time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
>>>>>> extra data it collects is that expensive.
>>>>>> Richard, what do you think?
>>>>>
>>>>> What is it that you want from CONFIG_PROFILER that you can't get from perf?
>>>>> I've been tempted to remove CONFIG_PROFILER entirely.
>>>>
>>>> I think perf is pretty good at getting the hot paths in the translator
>>>> and pretty much all of the timer related stuff in CONFIG_PROFILER could
>>>> be dropped. However some of the additional information about TCG ops
>>>> usage and distribution is useful. That said last time I had a tilt at
>>>> this on the back of a GSoC project:
>>>>
>>>>   Subject: [PATCH  v9 00/13] TCG code quality tracking and perf integration
>>>>   Date: Mon,  7 Oct 2019 16:28:26 +0100
>>>>   Message-Id: <20191007152839.30804-1-alex.bennee@linaro.org>
>>>>
>>>> The series ended up moving all the useful bits of CONFIG_PROFILER into
>>>> tb stats which was dynamically controlled on a per TB basis. Now that
>>>> the perf integration stuff was merged maybe there is a simpler series to
>>>> be picked out of the remains?
>>>>
>>>> Fei Wu,
>>>>
>>>> Have you looked at the above series? Is that gathering the sort of
>>>> things you need? Is this all in service of examining the translation
>>>> quality of hot code?
>>>>
>>> Yes, it does have what I want, I suppose this wiki is for the series:
>>>     https://wiki.qemu.org/Features/TCGCodeQuality
>> 
>> Yes.
>> 
>>>
>>> btw, the archive seems broken and cannot show the whole series:
>>>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg650258.html
>> 
>> I have a v10 branch here:
>> 
>>   https://github.com/stsquad/qemu/tree/tcg/tbstats-and-perf-v10
>> 
>> I think the top two patches can be dropped on a re-base as the JIT/perf
>> integration is already merged. It might be a tricky re-base though.
>> Depends on how much churn there has been in the tree since.
>> 
> I have rebased the patches to upstream here:
>     https://github.com/atwufei/qemu/tree/tbstats
>
> I try to keep the patches as possible as they are, but there are lots of
> changes since then, so changes are inevitable, e.g. CF_NOCACHE has been
> removed from upstream, I just removed its usage in the corresponding
> patch, which might not be preferred.

Yeah that fine. CF_NOCACHE was removed to avoid special cases in the
generation code - we simply don't link or store the TBs in the QHT
anymore. As long as the guest isn't executing a lot of non-RAM code we
won't run out of translation buffer too quickly.

>
> I did some basic tests and they worked (the output of info goes to qemu
> console, instead of telnet terminal), including:
>     * tb_stats start
>     * info tb-list
>     * info tb 10
>
> Alex, would you please take a look?

That looks pretty good, glad it wasn't too painful a re-base.

The next question is do you want to pick up the series and put through a
review cycle or two to get merged? It would probably be worth checking
the last posting thread to see if their are any outstanding review
comments.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-17 12:11               ` Alex Bennée
@ 2023-04-17 13:01                 ` Wu, Fei
  2023-04-21 13:46                   ` Wu, Fei
  0 siblings, 1 reply; 18+ messages in thread
From: Wu, Fei @ 2023-04-17 13:01 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Richard Henderson, pbonzini, erdnaxe, ma.mandourr, qemu-devel

On 4/17/2023 8:11 PM, Alex Bennée wrote:
> 
> "Wu, Fei" <fei2.wu@intel.com> writes:
> 
>> On 4/11/2023 3:27 PM, Alex Bennée wrote:
>>>
>>> "Wu, Fei" <fei2.wu@intel.com> writes:
>>>
>>>> On 4/10/2023 6:36 PM, Alex Bennée wrote:
>>>>>
>>>>> Richard Henderson <richard.henderson@linaro.org> writes:
>>>>>
>>>>>> On 4/6/23 00:46, Alex Bennée wrote:
>>>>>>> If your aim is to examine JIT efficiency what is wrong with the current
>>>>>>> "info jit" that you can access via the HMP? Also I'm wondering if its
>>>>>>> time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
>>>>>>> extra data it collects is that expensive.
>>>>>>> Richard, what do you think?
>>>>>>
>>>>>> What is it that you want from CONFIG_PROFILER that you can't get from perf?
>>>>>> I've been tempted to remove CONFIG_PROFILER entirely.
>>>>>
>>>>> I think perf is pretty good at getting the hot paths in the translator
>>>>> and pretty much all of the timer related stuff in CONFIG_PROFILER could
>>>>> be dropped. However some of the additional information about TCG ops
>>>>> usage and distribution is useful. That said last time I had a tilt at
>>>>> this on the back of a GSoC project:
>>>>>
>>>>>   Subject: [PATCH  v9 00/13] TCG code quality tracking and perf integration
>>>>>   Date: Mon,  7 Oct 2019 16:28:26 +0100
>>>>>   Message-Id: <20191007152839.30804-1-alex.bennee@linaro.org>
>>>>>
>>>>> The series ended up moving all the useful bits of CONFIG_PROFILER into
>>>>> tb stats which was dynamically controlled on a per TB basis. Now that
>>>>> the perf integration stuff was merged maybe there is a simpler series to
>>>>> be picked out of the remains?
>>>>>
>>>>> Fei Wu,
>>>>>
>>>>> Have you looked at the above series? Is that gathering the sort of
>>>>> things you need? Is this all in service of examining the translation
>>>>> quality of hot code?
>>>>>
>>>> Yes, it does have what I want, I suppose this wiki is for the series:
>>>>     https://wiki.qemu.org/Features/TCGCodeQuality
>>>
>>> Yes.
>>>
>>>>
>>>> btw, the archive seems broken and cannot show the whole series:
>>>>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg650258.html
>>>
>>> I have a v10 branch here:
>>>
>>>   https://github.com/stsquad/qemu/tree/tcg/tbstats-and-perf-v10
>>>
>>> I think the top two patches can be dropped on a re-base as the JIT/perf
>>> integration is already merged. It might be a tricky re-base though.
>>> Depends on how much churn there has been in the tree since.
>>>
>> I have rebased the patches to upstream here:
>>     https://github.com/atwufei/qemu/tree/tbstats
>>
>> I try to keep the patches as possible as they are, but there are lots of
>> changes since then, so changes are inevitable, e.g. CF_NOCACHE has been
>> removed from upstream, I just removed its usage in the corresponding
>> patch, which might not be preferred.
> 
> Yeah that fine. CF_NOCACHE was removed to avoid special cases in the
> generation code - we simply don't link or store the TBs in the QHT
> anymore. As long as the guest isn't executing a lot of non-RAM code we
> won't run out of translation buffer too quickly.
> 
>>
>> I did some basic tests and they worked (the output of info goes to qemu
>> console, instead of telnet terminal), including:
>>     * tb_stats start
>>     * info tb-list
>>     * info tb 10
>>
>> Alex, would you please take a look?
> 
> That looks pretty good, glad it wasn't too painful a re-base.
> 
> The next question is do you want to pick up the series and put through a
> review cycle or two to get merged? It would probably be worth checking
> the last posting thread to see if their are any outstanding review
> comments.
> 
Yes, I can do it. I have something else in hand right now, so the review
request may be sent out in a few days.

Thanks,
Fei.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] accel/tcg/plugin: export host insn size
  2023-04-17 13:01                 ` Wu, Fei
@ 2023-04-21 13:46                   ` Wu, Fei
  0 siblings, 0 replies; 18+ messages in thread
From: Wu, Fei @ 2023-04-21 13:46 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Richard Henderson, pbonzini, erdnaxe, ma.mandourr, qemu-devel

On 4/17/2023 9:01 PM, Wu, Fei wrote:
> On 4/17/2023 8:11 PM, Alex Bennée wrote:
>>
>> "Wu, Fei" <fei2.wu@intel.com> writes:
>>
>>> On 4/11/2023 3:27 PM, Alex Bennée wrote:
>>>>
>>>> "Wu, Fei" <fei2.wu@intel.com> writes:
>>>>
>>>>> On 4/10/2023 6:36 PM, Alex Bennée wrote:
>>>>>>
>>>>>> Richard Henderson <richard.henderson@linaro.org> writes:
>>>>>>
>>>>>>> On 4/6/23 00:46, Alex Bennée wrote:
>>>>>>>> If your aim is to examine JIT efficiency what is wrong with the current
>>>>>>>> "info jit" that you can access via the HMP? Also I'm wondering if its
>>>>>>>> time to remove the #ifdefs from CONFIG_PROFILER because I doubt the
>>>>>>>> extra data it collects is that expensive.
>>>>>>>> Richard, what do you think?
>>>>>>>
>>>>>>> What is it that you want from CONFIG_PROFILER that you can't get from perf?
>>>>>>> I've been tempted to remove CONFIG_PROFILER entirely.
>>>>>>
>>>>>> I think perf is pretty good at getting the hot paths in the translator
>>>>>> and pretty much all of the timer related stuff in CONFIG_PROFILER could
>>>>>> be dropped. However some of the additional information about TCG ops
>>>>>> usage and distribution is useful. That said last time I had a tilt at
>>>>>> this on the back of a GSoC project:
>>>>>>
>>>>>>   Subject: [PATCH  v9 00/13] TCG code quality tracking and perf integration
>>>>>>   Date: Mon,  7 Oct 2019 16:28:26 +0100
>>>>>>   Message-Id: <20191007152839.30804-1-alex.bennee@linaro.org>
>>>>>>
>>>>>> The series ended up moving all the useful bits of CONFIG_PROFILER into
>>>>>> tb stats which was dynamically controlled on a per TB basis. Now that
>>>>>> the perf integration stuff was merged maybe there is a simpler series to
>>>>>> be picked out of the remains?
>>>>>>
>>>>>> Fei Wu,
>>>>>>
>>>>>> Have you looked at the above series? Is that gathering the sort of
>>>>>> things you need? Is this all in service of examining the translation
>>>>>> quality of hot code?
>>>>>>
>>>>> Yes, it does have what I want, I suppose this wiki is for the series:
>>>>>     https://wiki.qemu.org/Features/TCGCodeQuality
>>>>
>>>> Yes.
>>>>
>>>>>
>>>>> btw, the archive seems broken and cannot show the whole series:
>>>>>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg650258.html
>>>>
>>>> I have a v10 branch here:
>>>>
>>>>   https://github.com/stsquad/qemu/tree/tcg/tbstats-and-perf-v10
>>>>
>>>> I think the top two patches can be dropped on a re-base as the JIT/perf
>>>> integration is already merged. It might be a tricky re-base though.
>>>> Depends on how much churn there has been in the tree since.
>>>>
>>> I have rebased the patches to upstream here:
>>>     https://github.com/atwufei/qemu/tree/tbstats
>>>
>>> I try to keep the patches as possible as they are, but there are lots of
>>> changes since then, so changes are inevitable, e.g. CF_NOCACHE has been
>>> removed from upstream, I just removed its usage in the corresponding
>>> patch, which might not be preferred.
>>
>> Yeah that fine. CF_NOCACHE was removed to avoid special cases in the
>> generation code - we simply don't link or store the TBs in the QHT
>> anymore. As long as the guest isn't executing a lot of non-RAM code we
>> won't run out of translation buffer too quickly.
>>
>>>
>>> I did some basic tests and they worked (the output of info goes to qemu
>>> console, instead of telnet terminal), including:
>>>     * tb_stats start
>>>     * info tb-list
>>>     * info tb 10
>>>
>>> Alex, would you please take a look?
>>
>> That looks pretty good, glad it wasn't too painful a re-base.
>>
>> The next question is do you want to pick up the series and put through a
>> review cycle or two to get merged? It would probably be worth checking
>> the last posting thread to see if their are any outstanding review
>> comments.
>>
> Yes, I can do it. I have something else in hand right now, so the review
> request may be sent out in a few days.
> 
I have sent the review out here, hope you have received it:
   https://www.mail-archive.com/qemu-devel@nongnu.org/msg955889.html

I just received the cover letter w/o the following patches, I do
subscribe to qemu-devel@nongnu.org, not sure why.

Thanks,
Fei.


> Thanks,
> Fei.



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-04-21 13:46 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-06  2:27 [PATCH 0/2] accel/tcg/plugin: host insn size for plugin Fei Wu
2023-04-06  2:27 ` [PATCH 1/2] accel/tcg/plugin: export host insn size Fei Wu
2023-04-06  7:46   ` Alex Bennée
2023-04-07  1:31     ` Wu, Fei
2023-04-10 10:46       ` Alex Bennée
2023-04-08  3:34     ` Richard Henderson
2023-04-10 10:36       ` Alex Bennée
2023-04-10 13:02         ` Wu, Fei
2023-04-11  7:27           ` Alex Bennée
2023-04-12 12:50             ` Wu, Fei
2023-04-12 13:28               ` Alex Bennée
2023-04-12 13:47                 ` Wu, Fei
2023-04-17 11:11             ` Wu, Fei
2023-04-17 12:11               ` Alex Bennée
2023-04-17 13:01                 ` Wu, Fei
2023-04-21 13:46                   ` Wu, Fei
2023-04-06  2:27 ` [PATCH 2/2] plugins/hotblocks: add " Fei Wu
2023-04-06  7:54   ` Alex Bennée

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.