All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/5] tcg: Misc clean-up patches from Paolo
@ 2016-03-17 13:46 sergey.fedorov
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 1/5] tcg: code_bitmap is not used by user-mode emulation sergey.fedorov
                   ` (4 more replies)
  0 siblings, 5 replies; 41+ messages in thread
From: sergey.fedorov @ 2016-03-17 13:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Richard Henderson, Peter Crosthwaite,
	Sergey Fedorov, Paolo Bonzini

From: Sergey Fedorov <sergey.fedorov@linaro.org>

This patch series consists of various general TCG clean-up patches
extracted from Paolo's MTTCG tree [1]. The idea is to review and merge
these patches separately from the MTTCG series to cut the latter and
make it easier to review.

[1] https://github.com/bonzini/qemu/tree/mttcg

Paolo Bonzini (5):
  tcg: code_bitmap is not used by user-mode emulation
  tcg: reorganize tb_find_physical loop
  tcg: always keep jump target and tb->jmp_next consistent
  tcg: reorder removal from lists in tb_phys_invalidate
  tcg: move tb_invalidated_flag to CPUState

 cpu-exec.c              | 61 +++++++++++++++++-----------------
 include/exec/exec-all.h |  2 --
 include/qom/cpu.h       |  2 ++
 translate-all.c         | 87 +++++++++++++++++++++++++------------------------
 4 files changed, 77 insertions(+), 75 deletions(-)

-- 
2.7.3

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 1/5] tcg: code_bitmap is not used by user-mode emulation
  2016-03-17 13:46 [Qemu-devel] [PATCH 0/5] tcg: Misc clean-up patches from Paolo sergey.fedorov
@ 2016-03-17 13:46 ` sergey.fedorov
  2016-03-17 14:56   ` Peter Maydell
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop sergey.fedorov
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 41+ messages in thread
From: sergey.fedorov @ 2016-03-17 13:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Richard Henderson, Peter Crosthwaite,
	Sergey Fedorov, Paolo Bonzini

From: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 translate-all.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/translate-all.c b/translate-all.c
index e9f409b762ab..f17ace1ae899 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -784,6 +784,9 @@ void tb_free(TranslationBlock *tb)
 
 static inline void invalidate_page_bitmap(PageDesc *p)
 {
+#ifndef CONFIG_SOFTMMU
+    assert(p->code_bitmap == NULL);
+#endif
     g_free(p->code_bitmap);
     p->code_bitmap = NULL;
     p->code_write_count = 0;
@@ -1018,6 +1021,7 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
     tcg_ctx.tb_ctx.tb_phys_invalidate_count++;
 }
 
+#ifdef CONFIG_SOFTMMU
 static void build_page_bitmap(PageDesc *p)
 {
     int n, tb_start, tb_end;
@@ -1046,6 +1050,7 @@ static void build_page_bitmap(PageDesc *p)
         tb = tb->page_next[n];
     }
 }
+#endif
 
 /* Called with mmap_lock held for user mode emulation.  */
 TranslationBlock *tb_gen_code(CPUState *cpu,
@@ -1294,6 +1299,7 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
 #endif
 }
 
+#ifdef CONFIG_SOFTMMU
 /* len must be <= 8 and start must be a multiple of len */
 void tb_invalidate_phys_page_fast(tb_page_addr_t start, int len)
 {
@@ -1331,8 +1337,7 @@ void tb_invalidate_phys_page_fast(tb_page_addr_t start, int len)
         tb_invalidate_phys_page_range(start, start + len, 1);
     }
 }
-
-#if !defined(CONFIG_SOFTMMU)
+#else
 /* Called with mmap_lock held.  */
 static void tb_invalidate_phys_page(tb_page_addr_t addr,
                                     uintptr_t pc, void *puc,
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop
  2016-03-17 13:46 [Qemu-devel] [PATCH 0/5] tcg: Misc clean-up patches from Paolo sergey.fedorov
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 1/5] tcg: code_bitmap is not used by user-mode emulation sergey.fedorov
@ 2016-03-17 13:46 ` sergey.fedorov
  2016-03-17 14:59   ` Peter Maydell
  2016-03-22 14:59   ` Alex Bennée
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 3/5] tcg: always keep jump target and tb->jmp_next consistent sergey.fedorov
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 41+ messages in thread
From: sergey.fedorov @ 2016-03-17 13:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Richard Henderson, Peter Crosthwaite,
	Sergey Fedorov, Paolo Bonzini

From: Paolo Bonzini <pbonzini@redhat.com>

Use a continue statement.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Sergey Fedorov: Fix moving to list head in case of no TB]
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 cpu-exec.c | 50 +++++++++++++++++++++++++-------------------------
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index fd92452f16f6..f90482eff778 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -225,37 +225,37 @@ static TranslationBlock *tb_find_physical(CPUState *cpu,
     phys_pc = get_page_addr_code(env, pc);
     phys_page1 = phys_pc & TARGET_PAGE_MASK;
     h = tb_phys_hash_func(phys_pc);
-    ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
-    for(;;) {
-        tb = *ptb1;
-        if (!tb) {
-            return NULL;
+    for (ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
+         (tb = *ptb1) != NULL;
+         ptb1 = &tb->phys_hash_next) {
+        if (tb->pc != pc ||
+            tb->page_addr[0] != phys_page1 ||
+            tb->cs_base != cs_base ||
+            tb->flags != flags) {
+            continue;
         }
-        if (tb->pc == pc &&
-            tb->page_addr[0] == phys_page1 &&
-            tb->cs_base == cs_base &&
-            tb->flags == flags) {
-            /* check next page if needed */
-            if (tb->page_addr[1] != -1) {
-                tb_page_addr_t phys_page2;
-
-                virt_page2 = (pc & TARGET_PAGE_MASK) +
-                    TARGET_PAGE_SIZE;
-                phys_page2 = get_page_addr_code(env, virt_page2);
-                if (tb->page_addr[1] == phys_page2) {
-                    break;
-                }
-            } else {
+
+        /* check next page if needed */
+        if (tb->page_addr[1] != -1) {
+            tb_page_addr_t phys_page2;
+
+            virt_page2 = (pc & TARGET_PAGE_MASK) +
+                TARGET_PAGE_SIZE;
+            phys_page2 = get_page_addr_code(env, virt_page2);
+            if (tb->page_addr[1] == phys_page2) {
                 break;
             }
+        } else {
+            break;
         }
-        ptb1 = &tb->phys_hash_next;
     }
 
-    /* Move the TB to the head of the list */
-    *ptb1 = tb->phys_hash_next;
-    tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
-    tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
+    if (tb) {
+        /* Move the TB to the head of the list */
+        *ptb1 = tb->phys_hash_next;
+        tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
+        tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
+    }
     return tb;
 }
 
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 3/5] tcg: always keep jump target and tb->jmp_next consistent
  2016-03-17 13:46 [Qemu-devel] [PATCH 0/5] tcg: Misc clean-up patches from Paolo sergey.fedorov
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 1/5] tcg: code_bitmap is not used by user-mode emulation sergey.fedorov
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop sergey.fedorov
@ 2016-03-17 13:46 ` sergey.fedorov
  2016-03-17 17:57   ` Richard Henderson
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate sergey.fedorov
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 5/5] tcg: move tb_invalidated_flag to CPUState sergey.fedorov
  4 siblings, 1 reply; 41+ messages in thread
From: sergey.fedorov @ 2016-03-17 13:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Richard Henderson, Peter Crosthwaite,
	Sergey Fedorov, Paolo Bonzini

From: Paolo Bonzini <pbonzini@redhat.com>

Simple code simplification.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 translate-all.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/translate-all.c b/translate-all.c
index f17ace1ae899..a1ac9841de48 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -927,6 +927,14 @@ static inline void tb_page_remove(TranslationBlock **ptb, TranslationBlock *tb)
     }
 }
 
+/* reset the jump entry 'n' of a TB so that it is not chained to
+   another TB */
+static inline void tb_reset_jump(TranslationBlock *tb, int n)
+{
+    tb_set_jmp_target(tb, n, (uintptr_t)(tb->tc_ptr + tb->tb_next_offset[n]));
+    tb->jmp_next[n] = NULL;
+}
+
 static inline void tb_jmp_remove(TranslationBlock *tb, int n)
 {
     TranslationBlock *tb1, **ptb;
@@ -951,18 +959,10 @@ static inline void tb_jmp_remove(TranslationBlock *tb, int n)
         }
         /* now we can suppress tb(n) from the list */
         *ptb = tb->jmp_next[n];
-
-        tb->jmp_next[n] = NULL;
+        tb_reset_jump(tb, n);
     }
 }
 
-/* reset the jump entry 'n' of a TB so that it is not chained to
-   another TB */
-static inline void tb_reset_jump(TranslationBlock *tb, int n)
-{
-    tb_set_jmp_target(tb, n, (uintptr_t)(tb->tc_ptr + tb->tb_next_offset[n]));
-}
-
 /* invalidate one TB */
 void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
 {
@@ -1013,7 +1013,6 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
         tb1 = (TranslationBlock *)((uintptr_t)tb1 & ~3);
         tb2 = tb1->jmp_next[n1];
         tb_reset_jump(tb1, n1);
-        tb1->jmp_next[n1] = NULL;
         tb1 = tb2;
     }
     tb->jmp_first = (TranslationBlock *)((uintptr_t)tb | 2); /* fail safe */
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-03-17 13:46 [Qemu-devel] [PATCH 0/5] tcg: Misc clean-up patches from Paolo sergey.fedorov
                   ` (2 preceding siblings ...)
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 3/5] tcg: always keep jump target and tb->jmp_next consistent sergey.fedorov
@ 2016-03-17 13:46 ` sergey.fedorov
  2016-03-17 15:09   ` Paolo Bonzini
  2016-03-28 18:42   ` Sergey Fedorov
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 5/5] tcg: move tb_invalidated_flag to CPUState sergey.fedorov
  4 siblings, 2 replies; 41+ messages in thread
From: sergey.fedorov @ 2016-03-17 13:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Richard Henderson, Peter Crosthwaite,
	Sergey Fedorov, Paolo Bonzini

From: Paolo Bonzini <pbonzini@redhat.com>

First the translation block is invalidated, for which a simple write
to tb->pc is enough.  This means that cpu-exec will not pick up anymore
the block, though it may still execute it through chained jumps.  This
also replaces the NULLing out of the pointer in the CPUs' local cache.

Then the chained jumps are removed, meaning that CPUs will only execute
the translation block once after this point.

Finally, the TB is removed from the per-page list and the phys-hash
bucket to clean up the data structure.

This has no effect for now, but it will be the right order when tb_find_fast
is moved outside the tb_lock.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 translate-all.c | 56 ++++++++++++++++++++++++++++----------------------------
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/translate-all.c b/translate-all.c
index a1ac9841de48..1db5a914d9a3 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -966,40 +966,21 @@ static inline void tb_jmp_remove(TranslationBlock *tb, int n)
 /* invalidate one TB */
 void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
 {
-    CPUState *cpu;
     PageDesc *p;
     unsigned int h, n1;
+    tb_page_addr_t pc;
     tb_page_addr_t phys_pc;
     TranslationBlock *tb1, *tb2;
 
-    /* remove the TB from the hash list */
-    phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
-    h = tb_phys_hash_func(phys_pc);
-    tb_hash_remove(&tcg_ctx.tb_ctx.tb_phys_hash[h], tb);
-
-    /* remove the TB from the page list */
-    if (tb->page_addr[0] != page_addr) {
-        p = page_find(tb->page_addr[0] >> TARGET_PAGE_BITS);
-        tb_page_remove(&p->first_tb, tb);
-        invalidate_page_bitmap(p);
-    }
-    if (tb->page_addr[1] != -1 && tb->page_addr[1] != page_addr) {
-        p = page_find(tb->page_addr[1] >> TARGET_PAGE_BITS);
-        tb_page_remove(&p->first_tb, tb);
-        invalidate_page_bitmap(p);
-    }
-
-    tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
-
-    /* remove the TB from the hash list */
-    h = tb_jmp_cache_hash_func(tb->pc);
-    CPU_FOREACH(cpu) {
-        if (cpu->tb_jmp_cache[h] == tb) {
-            cpu->tb_jmp_cache[h] = NULL;
-        }
-    }
+    /* First invalidate the translation block.  CPUs will not use it anymore
+     * from their local caches.
+     */
+    pc = tb->pc;
+    tb->pc = -1;
 
-    /* suppress this TB from the two jump lists */
+    /* Then suppress this TB from the two jump lists.  CPUs will not jump
+     * anymore into this translation block.
+     */
     tb_jmp_remove(tb, 0);
     tb_jmp_remove(tb, 1);
 
@@ -1017,6 +998,25 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
     }
     tb->jmp_first = (TranslationBlock *)((uintptr_t)tb | 2); /* fail safe */
 
+    /* Now remove the TB from the hash list, so that tb_find_slow
+     * cannot find it anymore.
+     */
+    phys_pc = tb->page_addr[0] + (pc & ~TARGET_PAGE_MASK);
+    h = tb_phys_hash_func(phys_pc);
+    tb_hash_remove(&tcg_ctx.tb_ctx.tb_phys_hash[h], tb);
+
+    /* remove the TB from the page list */
+    if (tb->page_addr[0] != page_addr) {
+        p = page_find(tb->page_addr[0] >> TARGET_PAGE_BITS);
+        tb_page_remove(&p->first_tb, tb);
+        invalidate_page_bitmap(p);
+    }
+    if (tb->page_addr[1] != -1 && tb->page_addr[1] != page_addr) {
+        p = page_find(tb->page_addr[1] >> TARGET_PAGE_BITS);
+        tb_page_remove(&p->first_tb, tb);
+        invalidate_page_bitmap(p);
+    }
+
     tcg_ctx.tb_ctx.tb_phys_invalidate_count++;
 }
 
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 5/5] tcg: move tb_invalidated_flag to CPUState
  2016-03-17 13:46 [Qemu-devel] [PATCH 0/5] tcg: Misc clean-up patches from Paolo sergey.fedorov
                   ` (3 preceding siblings ...)
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate sergey.fedorov
@ 2016-03-17 13:46 ` sergey.fedorov
  2016-03-22 15:07   ` Alex Bennée
  4 siblings, 1 reply; 41+ messages in thread
From: sergey.fedorov @ 2016-03-17 13:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Peter Crosthwaite, Paolo Bonzini, Sergey Fedorov,
	Andreas Färber, Richard Henderson

From: Paolo Bonzini <pbonzini@redhat.com>

This is a baby step towards making tb_flush thread safe.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 cpu-exec.c              | 11 +++++------
 include/exec/exec-all.h |  2 --
 include/qom/cpu.h       |  2 ++
 translate-all.c         |  3 +--
 4 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index f90482eff778..07545aa91082 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -195,10 +195,11 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
     if (max_cycles > CF_COUNT_MASK)
         max_cycles = CF_COUNT_MASK;
 
+    cpu->tb_invalidated_flag = 0;
     tb = tb_gen_code(cpu, orig_tb->pc, orig_tb->cs_base, orig_tb->flags,
                      max_cycles | CF_NOCACHE
                          | (ignore_icount ? CF_IGNORE_ICOUNT : 0));
-    tb->orig_tb = tcg_ctx.tb_ctx.tb_invalidated_flag ? NULL : orig_tb;
+    tb->orig_tb = cpu->tb_invalidated_flag ? NULL : orig_tb;
     cpu->current_tb = tb;
     /* execute the generated code */
     trace_exec_tb_nocache(tb, tb->pc);
@@ -219,8 +220,6 @@ static TranslationBlock *tb_find_physical(CPUState *cpu,
     tb_page_addr_t phys_pc, phys_page1;
     target_ulong virt_page2;
 
-    tcg_ctx.tb_ctx.tb_invalidated_flag = 0;
-
     /* find translated block using physical mappings */
     phys_pc = get_page_addr_code(env, pc);
     phys_page1 = phys_pc & TARGET_PAGE_MASK;
@@ -288,6 +287,7 @@ static TranslationBlock *tb_find_slow(CPUState *cpu,
 #endif
 
     /* if no translated code available, then translate it now */
+    cpu->tb_invalidated_flag = 0;
     tb = tb_gen_code(cpu, pc, cs_base, flags, 0);
 
 #ifdef CONFIG_USER_ONLY
@@ -493,12 +493,11 @@ int cpu_exec(CPUState *cpu)
                 tb = tb_find_fast(cpu);
                 /* Note: we do it here to avoid a gcc bug on Mac OS X when
                    doing it in tb_find_slow */
-                if (tcg_ctx.tb_ctx.tb_invalidated_flag) {
+                if (cpu->tb_invalidated_flag) {
                     /* as some TB could have been invalidated because
-                       of memory exceptions while generating the code, we
+                       of a tb_flush while generating the code, we
                        must recompute the hash index here */
                     next_tb = 0;
-                    tcg_ctx.tb_ctx.tb_invalidated_flag = 0;
                 }
                 if (qemu_loglevel_mask(CPU_LOG_EXEC)) {
                     qemu_log("Trace %p [" TARGET_FMT_lx "] %s\n",
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 05a151da4a54..0ef6ea5cf6dc 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -288,8 +288,6 @@ struct TBContext {
     /* statistics */
     int tb_flush_count;
     int tb_phys_invalidate_count;
-
-    int tb_invalidated_flag;
 };
 
 void tb_free(TranslationBlock *tb);
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 7052eee7b78a..9538f9cc2af3 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -240,6 +240,7 @@ struct kvm_run;
  * @crash_occurred: Indicates the OS reported a crash (panic) for this CPU
  * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this
  *           CPU and return to its top level loop.
+ * @tb_invalidated_flag: Set to tell TCG that tb_flush has been called.
  * @singlestep_enabled: Flags for single-stepping.
  * @icount_extra: Instructions until next timer event.
  * @icount_decr: Number of cycles left, with interrupt flag in high bit.
@@ -291,6 +292,7 @@ struct CPUState {
     bool stopped;
     bool crash_occurred;
     bool exit_request;
+    bool tb_invalidated_flag;
     uint32_t interrupt_request;
     int singlestep_enabled;
     int64_t icount_extra;
diff --git a/translate-all.c b/translate-all.c
index 1db5a914d9a3..8e1edd6bb633 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -843,6 +843,7 @@ void tb_flush(CPUState *cpu)
     tcg_ctx.tb_ctx.nb_tbs = 0;
 
     CPU_FOREACH(cpu) {
+        cpu->tb_invalidated_flag = 1;
         memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
     }
 
@@ -1079,8 +1080,6 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
         /* cannot fail at this point */
         tb = tb_alloc(pc);
         assert(tb != NULL);
-        /* Don't forget to invalidate previous TB info.  */
-        tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
     }
 
     gen_code_buf = tcg_ctx.code_gen_ptr;
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 1/5] tcg: code_bitmap is not used by user-mode emulation
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 1/5] tcg: code_bitmap is not used by user-mode emulation sergey.fedorov
@ 2016-03-17 14:56   ` Peter Maydell
  2016-03-17 15:03     ` Sergey Fedorov
  0 siblings, 1 reply; 41+ messages in thread
From: Peter Maydell @ 2016-03-17 14:56 UTC (permalink / raw)
  To: sergey.fedorov
  Cc: Sergey Fedorov, Peter Crosthwaite, QEMU Developers,
	Paolo Bonzini, Richard Henderson

On 17 March 2016 at 13:46,  <sergey.fedorov@linaro.org> wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  translate-all.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/translate-all.c b/translate-all.c
> index e9f409b762ab..f17ace1ae899 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -784,6 +784,9 @@ void tb_free(TranslationBlock *tb)
>
>  static inline void invalidate_page_bitmap(PageDesc *p)
>  {
> +#ifndef CONFIG_SOFTMMU
> +    assert(p->code_bitmap == NULL);
> +#endif
>      g_free(p->code_bitmap);
>      p->code_bitmap = NULL;
>      p->code_write_count = 0;
> @@ -1018,6 +1021,7 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
>      tcg_ctx.tb_ctx.tb_phys_invalidate_count++;
>  }

You could pretty much take this another step and actually
make the field in the struct be only present ifdef CONFIG_SOFTMMU...

thanks
-- PMM

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop sergey.fedorov
@ 2016-03-17 14:59   ` Peter Maydell
  2016-03-22 14:59   ` Alex Bennée
  1 sibling, 0 replies; 41+ messages in thread
From: Peter Maydell @ 2016-03-17 14:59 UTC (permalink / raw)
  To: sergey.fedorov
  Cc: Sergey Fedorov, Peter Crosthwaite, QEMU Developers,
	Paolo Bonzini, Richard Henderson

On 17 March 2016 at 13:46,  <sergey.fedorov@linaro.org> wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
>
> Use a continue statement.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> [Sergey Fedorov: Fix moving to list head in case of no TB]
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

(though it would be nice if the commit message gave some
motivation for the change)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 1/5] tcg: code_bitmap is not used by user-mode emulation
  2016-03-17 14:56   ` Peter Maydell
@ 2016-03-17 15:03     ` Sergey Fedorov
  0 siblings, 0 replies; 41+ messages in thread
From: Sergey Fedorov @ 2016-03-17 15:03 UTC (permalink / raw)
  To: Peter Maydell, sergey.fedorov
  Cc: Paolo Bonzini, Peter Crosthwaite, QEMU Developers, Richard Henderson

On 17/03/16 17:56, Peter Maydell wrote:
> On 17 March 2016 at 13:46,  <sergey.fedorov@linaro.org> wrote:
>> From: Paolo Bonzini <pbonzini@redhat.com>
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>> ---
>>   translate-all.c | 9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/translate-all.c b/translate-all.c
>> index e9f409b762ab..f17ace1ae899 100644
>> --- a/translate-all.c
>> +++ b/translate-all.c
>> @@ -784,6 +784,9 @@ void tb_free(TranslationBlock *tb)
>>
>>   static inline void invalidate_page_bitmap(PageDesc *p)
>>   {
>> +#ifndef CONFIG_SOFTMMU
>> +    assert(p->code_bitmap == NULL);
>> +#endif
>>       g_free(p->code_bitmap);
>>       p->code_bitmap = NULL;
>>       p->code_write_count = 0;
>> @@ -1018,6 +1021,7 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
>>       tcg_ctx.tb_ctx.tb_phys_invalidate_count++;
>>   }
> You could pretty much take this another step and actually
> make the field in the struct be only present ifdef CONFIG_SOFTMMU...

Sonds a good idea.

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate sergey.fedorov
@ 2016-03-17 15:09   ` Paolo Bonzini
  2016-03-17 15:14     ` Sergey Fedorov
  2016-03-28 18:42   ` Sergey Fedorov
  1 sibling, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2016-03-17 15:09 UTC (permalink / raw)
  To: sergey.fedorov, qemu-devel
  Cc: Sergey Fedorov, Richard Henderson, Peter Crosthwaite



On 17/03/2016 14:46, sergey.fedorov@linaro.org wrote:
>  void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
>  {
> -    CPUState *cpu;
>      PageDesc *p;
>      unsigned int h, n1;
> +    tb_page_addr_t pc;
>      tb_page_addr_t phys_pc;
>      TranslationBlock *tb1, *tb2;
>  
> -    /* remove the TB from the hash list */
> -    phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
> -    h = tb_phys_hash_func(phys_pc);
> -    tb_hash_remove(&tcg_ctx.tb_ctx.tb_phys_hash[h], tb);
> -
> -    /* remove the TB from the page list */
> -    if (tb->page_addr[0] != page_addr) {
> -        p = page_find(tb->page_addr[0] >> TARGET_PAGE_BITS);
> -        tb_page_remove(&p->first_tb, tb);
> -        invalidate_page_bitmap(p);
> -    }
> -    if (tb->page_addr[1] != -1 && tb->page_addr[1] != page_addr) {
> -        p = page_find(tb->page_addr[1] >> TARGET_PAGE_BITS);
> -        tb_page_remove(&p->first_tb, tb);
> -        invalidate_page_bitmap(p);
> -    }
> -
> -    tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
> -

Did you investigate the removal of this setting of tb_invalidated_flag?

My recollection is that it is okay to remove it because at worse it
would cause a tb_add_jump from an invalidated source to a valid
destination.  This should be harmless as long as the source has been
tb_phys_invalidated and not tb_flushed.  But this needs to be checked.

Paolo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-03-17 15:09   ` Paolo Bonzini
@ 2016-03-17 15:14     ` Sergey Fedorov
  2016-03-28 15:18       ` Sergey Fedorov
  0 siblings, 1 reply; 41+ messages in thread
From: Sergey Fedorov @ 2016-03-17 15:14 UTC (permalink / raw)
  To: Paolo Bonzini, sergey.fedorov, qemu-devel
  Cc: Richard Henderson, Peter Crosthwaite

On 17/03/16 18:09, Paolo Bonzini wrote:
>
> On 17/03/2016 14:46, sergey.fedorov@linaro.org wrote:
>>   void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
>>   {
>> -    CPUState *cpu;
>>       PageDesc *p;
>>       unsigned int h, n1;
>> +    tb_page_addr_t pc;
>>       tb_page_addr_t phys_pc;
>>       TranslationBlock *tb1, *tb2;
>>   
>> -    /* remove the TB from the hash list */
>> -    phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
>> -    h = tb_phys_hash_func(phys_pc);
>> -    tb_hash_remove(&tcg_ctx.tb_ctx.tb_phys_hash[h], tb);
>> -
>> -    /* remove the TB from the page list */
>> -    if (tb->page_addr[0] != page_addr) {
>> -        p = page_find(tb->page_addr[0] >> TARGET_PAGE_BITS);
>> -        tb_page_remove(&p->first_tb, tb);
>> -        invalidate_page_bitmap(p);
>> -    }
>> -    if (tb->page_addr[1] != -1 && tb->page_addr[1] != page_addr) {
>> -        p = page_find(tb->page_addr[1] >> TARGET_PAGE_BITS);
>> -        tb_page_remove(&p->first_tb, tb);
>> -        invalidate_page_bitmap(p);
>> -    }
>> -
>> -    tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
>> -
> Did you investigate the removal of this setting of tb_invalidated_flag?
>
> My recollection is that it is okay to remove it because at worse it
> would cause a tb_add_jump from an invalidated source to a valid
> destination.  This should be harmless as long as the source has been
> tb_phys_invalidated and not tb_flushed.  But this needs to be checked.

Thanks for pointing that. I should investigate it to make sure, although 
arm32/arm64/x86_64 Linux boots fine as well as the latest Alex's 
kmv-unit-tests pass.

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] tcg: always keep jump target and tb->jmp_next consistent
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 3/5] tcg: always keep jump target and tb->jmp_next consistent sergey.fedorov
@ 2016-03-17 17:57   ` Richard Henderson
  2016-03-17 19:31     ` Paolo Bonzini
  0 siblings, 1 reply; 41+ messages in thread
From: Richard Henderson @ 2016-03-17 17:57 UTC (permalink / raw)
  To: sergey.fedorov, qemu-devel
  Cc: Sergey Fedorov, Peter Crosthwaite, Paolo Bonzini

On 03/17/2016 06:46 AM, sergey.fedorov@linaro.org wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
> 
> Simple code simplification.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  translate-all.c | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/translate-all.c b/translate-all.c
> index f17ace1ae899..a1ac9841de48 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -927,6 +927,14 @@ static inline void tb_page_remove(TranslationBlock **ptb, TranslationBlock *tb)
>      }
>  }
>  
> +/* reset the jump entry 'n' of a TB so that it is not chained to
> +   another TB */
> +static inline void tb_reset_jump(TranslationBlock *tb, int n)
> +{
> +    tb_set_jmp_target(tb, n, (uintptr_t)(tb->tc_ptr + tb->tb_next_offset[n]));
> +    tb->jmp_next[n] = NULL;
> +}
> +
>  static inline void tb_jmp_remove(TranslationBlock *tb, int n)
>  {
>      TranslationBlock *tb1, **ptb;
> @@ -951,18 +959,10 @@ static inline void tb_jmp_remove(TranslationBlock *tb, int n)
>          }
>          /* now we can suppress tb(n) from the list */
>          *ptb = tb->jmp_next[n];
> -
> -        tb->jmp_next[n] = NULL;
> +        tb_reset_jump(tb, n);

What's the motivation here?  This implies an extra cache flush.
Where were we resetting the jump previously?  Or is this a bug
in that we *weren't* resetting the jump previously?


r~

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] tcg: always keep jump target and tb->jmp_next consistent
  2016-03-17 17:57   ` Richard Henderson
@ 2016-03-17 19:31     ` Paolo Bonzini
  2016-03-17 20:45       ` Sergey Fedorov
  0 siblings, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2016-03-17 19:31 UTC (permalink / raw)
  To: Richard Henderson, sergey.fedorov, qemu-devel
  Cc: Sergey Fedorov, Peter Crosthwaite



On 17/03/2016 18:57, Richard Henderson wrote:
> > @@ -951,18 +959,10 @@ static inline void tb_jmp_remove(TranslationBlock *tb, int n)
> >          }
> >          /* now we can suppress tb(n) from the list */
> >          *ptb = tb->jmp_next[n];
> > -
> > -        tb->jmp_next[n] = NULL;
> > +        tb_reset_jump(tb, n);
>
> What's the motivation here?  This implies an extra cache flush.
> Where were we resetting the jump previously?  Or is this a bug
> in that we *weren't* resetting the jump previously?

Indeed I think this patch can be removed if it has a performance effect
on machines that require icache invalidation.  If it doesn't, it would
be just a small code simplification.

Paolo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] tcg: always keep jump target and tb->jmp_next consistent
  2016-03-17 19:31     ` Paolo Bonzini
@ 2016-03-17 20:45       ` Sergey Fedorov
  2016-03-17 20:46         ` Richard Henderson
  2016-03-18 10:32         ` Sergey Fedorov
  0 siblings, 2 replies; 41+ messages in thread
From: Sergey Fedorov @ 2016-03-17 20:45 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, sergey.fedorov, qemu-devel
  Cc: Peter Crosthwaite

On 17/03/16 22:31, Paolo Bonzini wrote:
>
> On 17/03/2016 18:57, Richard Henderson wrote:
>>> @@ -951,18 +959,10 @@ static inline void tb_jmp_remove(TranslationBlock *tb, int n)
>>>           }
>>>           /* now we can suppress tb(n) from the list */
>>>           *ptb = tb->jmp_next[n];
>>> -
>>> -        tb->jmp_next[n] = NULL;
>>> +        tb_reset_jump(tb, n);
>> What's the motivation here?  This implies an extra cache flush.
>> Where were we resetting the jump previously?  Or is this a bug
>> in that we *weren't* resetting the jump previously?
> Indeed I think this patch can be removed if it has a performance effect
> on machines that require icache invalidation.  If it doesn't, it would
> be just a small code simplification.

In fact, tb_jmp_remove() is only supposed to remove the TB from a list 
of all TB's jumping to the same TB which is n-th jump destination of the 
given TB. This function is only called in tb_phys_invalidate() for the 
TB being invalidated. Thus we don't have to patch that TB anymore. We 
don't even have to do "tb->jmp_next[n] = NULL" here.

Probably it's time to audit the code that handles direct jumping and 
clean-up/document/rename things to make it more easy to understand? :)

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] tcg: always keep jump target and tb->jmp_next consistent
  2016-03-17 20:45       ` Sergey Fedorov
@ 2016-03-17 20:46         ` Richard Henderson
  2016-03-18 10:29           ` Sergey Fedorov
  2016-03-18 10:32         ` Sergey Fedorov
  1 sibling, 1 reply; 41+ messages in thread
From: Richard Henderson @ 2016-03-17 20:46 UTC (permalink / raw)
  To: Sergey Fedorov, Paolo Bonzini, sergey.fedorov, qemu-devel
  Cc: Peter Crosthwaite

On 03/17/2016 01:45 PM, Sergey Fedorov wrote:
> Probably it's time to audit the code that handles direct jumping and
> clean-up/document/rename things to make it more easy to understand? :)

Seconded!


r~

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] tcg: always keep jump target and tb->jmp_next consistent
  2016-03-17 20:46         ` Richard Henderson
@ 2016-03-18 10:29           ` Sergey Fedorov
  0 siblings, 0 replies; 41+ messages in thread
From: Sergey Fedorov @ 2016-03-18 10:29 UTC (permalink / raw)
  To: Richard Henderson, Paolo Bonzini, sergey.fedorov, qemu-devel
  Cc: Peter Crosthwaite

On 17/03/16 23:46, Richard Henderson wrote:
> On 03/17/2016 01:45 PM, Sergey Fedorov wrote:
>> Probably it's time to audit the code that handles direct jumping and
>> clean-up/document/rename things to make it more easy to understand? :)
> Seconded!

I'll go for it, then :)

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] tcg: always keep jump target and tb->jmp_next consistent
  2016-03-17 20:45       ` Sergey Fedorov
  2016-03-17 20:46         ` Richard Henderson
@ 2016-03-18 10:32         ` Sergey Fedorov
  1 sibling, 0 replies; 41+ messages in thread
From: Sergey Fedorov @ 2016-03-18 10:32 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, sergey.fedorov, qemu-devel
  Cc: Peter Crosthwaite

[-- Attachment #1: Type: text/plain, Size: 1234 bytes --]

On 17/03/16 23:45, Sergey Fedorov wrote:
> On 17/03/16 22:31, Paolo Bonzini wrote:
>>
>> On 17/03/2016 18:57, Richard Henderson wrote:
>>>> @@ -951,18 +959,10 @@ static inline void 
>>>> tb_jmp_remove(TranslationBlock *tb, int n)
>>>>           }
>>>>           /* now we can suppress tb(n) from the list */
>>>>           *ptb = tb->jmp_next[n];
>>>> -
>>>> -        tb->jmp_next[n] = NULL;
>>>> +        tb_reset_jump(tb, n);
>>> What's the motivation here?  This implies an extra cache flush.
>>> Where were we resetting the jump previously?  Or is this a bug
>>> in that we *weren't* resetting the jump previously?
>> Indeed I think this patch can be removed if it has a performance effect
>> on machines that require icache invalidation.  If it doesn't, it would
>> be just a small code simplification.
>
> In fact, tb_jmp_remove() is only supposed to remove the TB from a list 
> of all TB's jumping to the same TB which is n-th jump destination of 
> the given TB. This function is only called in tb_phys_invalidate() for 
> the TB being invalidated. Thus we don't have to patch that TB anymore. 
> We don't even have to do "tb->jmp_next[n] = NULL" here. 

I'll drop the patch from the series in v2 then.

Kind regards,
Sergey

[-- Attachment #2: Type: text/html, Size: 2352 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop sergey.fedorov
  2016-03-17 14:59   ` Peter Maydell
@ 2016-03-22 14:59   ` Alex Bennée
  2016-03-22 15:00     ` Paolo Bonzini
  2016-03-29 13:19     ` Sergey Fedorov
  1 sibling, 2 replies; 41+ messages in thread
From: Alex Bennée @ 2016-03-22 14:59 UTC (permalink / raw)
  To: sergey.fedorov
  Cc: Sergey Fedorov, Peter Crosthwaite, qemu-devel, Paolo Bonzini,
	Richard Henderson


sergey.fedorov@linaro.org writes:

> From: Paolo Bonzini <pbonzini@redhat.com>
>
> Use a continue statement.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> [Sergey Fedorov: Fix moving to list head in case of no TB]
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  cpu-exec.c | 50 +++++++++++++++++++++++++-------------------------
>  1 file changed, 25 insertions(+), 25 deletions(-)
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index fd92452f16f6..f90482eff778 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -225,37 +225,37 @@ static TranslationBlock *tb_find_physical(CPUState *cpu,
>      phys_pc = get_page_addr_code(env, pc);
>      phys_page1 = phys_pc & TARGET_PAGE_MASK;
>      h = tb_phys_hash_func(phys_pc);
> -    ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
> -    for(;;) {
> -        tb = *ptb1;
> -        if (!tb) {
> -            return NULL;
> +    for (ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
> +         (tb = *ptb1) != NULL;
> +         ptb1 = &tb->phys_hash_next) {

I'm not sure I'm keen on the assignment in the for condition clause. I
appreciate the cleansing of the if !tb return exit though. Could we be
cleaner maybe? Here is my attempt:

    static TranslationBlock *tb_find_physical(CPUState *cpu,
                                              target_ulong pc,
                                              target_ulong cs_base,
                                              uint64_t flags)
    {
        CPUArchState *env = (CPUArchState *)cpu->env_ptr;
        TranslationBlock *tb, **tb_hash_head, **ptb1;
        unsigned int h;
        tb_page_addr_t phys_pc, phys_page1;

        /* find translated block using physical mappings */
        phys_pc = get_page_addr_code(env, pc);
        phys_page1 = phys_pc & TARGET_PAGE_MASK;
        h = tb_phys_hash_func(phys_pc);

        /* Start at head of the hash entry */
        ptb1 = tb_hash_head = &tcg_ctx.tb_ctx.tb_phys_hash[h];
        tb = *ptb1;

        while (tb) {

            if (tb->pc == pc &&
                tb->page_addr[0] == phys_page1 &&
                tb->cs_base == cs_base &&
                tb->flags == flags) {

                if (tb->page_addr[1] == -1) {
                    /* done, we have a match */
                    break;
                } else {
                    /* check next page if needed */
                    target_ulong virt_page2 = (pc & TARGET_PAGE_MASK)
                        + TARGET_PAGE_SIZE;
                    tb_page_addr_t phys_page2 = get_page_addr_code(env, virt_page2);

                    if (tb->page_addr[1] == phys_page2) {
                        break;
                    }
                }
            }

            ptb1 = &tb->phys_hash_next;
            tb = *ptb1;
        }

        if (tb) {
            /* Move the TB to the head of the list */
            *ptb1 = tb->phys_hash_next;
            tb->phys_hash_next = *tb_hash_head;
            *tb_hash_head = tb;
        }
        return tb;
    }

FWIW the compiled code is 9 bytes shorter on my machine.

> +        if (tb->pc != pc ||
> +            tb->page_addr[0] != phys_page1 ||
> +            tb->cs_base != cs_base ||
> +            tb->flags != flags) {
> +            continue;
>          }
> -        if (tb->pc == pc &&
> -            tb->page_addr[0] == phys_page1 &&
> -            tb->cs_base == cs_base &&
> -            tb->flags == flags) {
> -            /* check next page if needed */
> -            if (tb->page_addr[1] != -1) {
> -                tb_page_addr_t phys_page2;
> -
> -                virt_page2 = (pc & TARGET_PAGE_MASK) +
> -                    TARGET_PAGE_SIZE;
> -                phys_page2 = get_page_addr_code(env, virt_page2);
> -                if (tb->page_addr[1] == phys_page2) {
> -                    break;
> -                }
> -            } else {
> +
> +        /* check next page if needed */
> +        if (tb->page_addr[1] != -1) {
> +            tb_page_addr_t phys_page2;
> +
> +            virt_page2 = (pc & TARGET_PAGE_MASK) +
> +                TARGET_PAGE_SIZE;
> +            phys_page2 = get_page_addr_code(env, virt_page2);
> +            if (tb->page_addr[1] == phys_page2) {
>                  break;
>              }
> +        } else {
> +            break;
>          }
> -        ptb1 = &tb->phys_hash_next;
>      }
>
> -    /* Move the TB to the head of the list */
> -    *ptb1 = tb->phys_hash_next;
> -    tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
> -    tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
> +    if (tb) {
> +        /* Move the TB to the head of the list */
> +        *ptb1 = tb->phys_hash_next;
> +        tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
> +        tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
> +    }
>      return tb;
>  }


--
Alex Bennée

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop
  2016-03-22 14:59   ` Alex Bennée
@ 2016-03-22 15:00     ` Paolo Bonzini
  2016-03-29 13:19     ` Sergey Fedorov
  1 sibling, 0 replies; 41+ messages in thread
From: Paolo Bonzini @ 2016-03-22 15:00 UTC (permalink / raw)
  To: Alex Bennée, sergey.fedorov
  Cc: Sergey Fedorov, Peter Crosthwaite, qemu-devel, Richard Henderson



On 22/03/2016 15:59, Alex Bennée wrote:
>> > +    for (ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
>> > +         (tb = *ptb1) != NULL;
>> > +         ptb1 = &tb->phys_hash_next) {
> I'm not sure I'm keen on the assignment in the for condition clause. I
> appreciate the cleansing of the if !tb return exit though. Could we be
> cleaner maybe? Here is my attempt:

Sure, that would be just fine.

Paolo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] tcg: move tb_invalidated_flag to CPUState
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 5/5] tcg: move tb_invalidated_flag to CPUState sergey.fedorov
@ 2016-03-22 15:07   ` Alex Bennée
  2016-03-22 15:11     ` Sergey Fedorov
  0 siblings, 1 reply; 41+ messages in thread
From: Alex Bennée @ 2016-03-22 15:07 UTC (permalink / raw)
  To: sergey.fedorov
  Cc: Peter Crosthwaite, qemu-devel, Sergey Fedorov, Paolo Bonzini,
	Andreas Färber, Richard Henderson


sergey.fedorov@linaro.org writes:

> From: Paolo Bonzini <pbonzini@redhat.com>
>
> This is a baby step towards making tb_flush thread safe.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  cpu-exec.c              | 11 +++++------
>  include/exec/exec-all.h |  2 --
>  include/qom/cpu.h       |  2 ++
>  translate-all.c         |  3 +--
>  4 files changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index f90482eff778..07545aa91082 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -195,10 +195,11 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
>      if (max_cycles > CF_COUNT_MASK)
>          max_cycles = CF_COUNT_MASK;
>
> +    cpu->tb_invalidated_flag = 0;

We've declared as bool so lets use true/false instead of 1/0's

>      tb = tb_gen_code(cpu, orig_tb->pc, orig_tb->cs_base, orig_tb->flags,
>                       max_cycles | CF_NOCACHE
>                           | (ignore_icount ? CF_IGNORE_ICOUNT : 0));
> -    tb->orig_tb = tcg_ctx.tb_ctx.tb_invalidated_flag ? NULL : orig_tb;
> +    tb->orig_tb = cpu->tb_invalidated_flag ? NULL : orig_tb;
>      cpu->current_tb = tb;
>      /* execute the generated code */
>      trace_exec_tb_nocache(tb, tb->pc);
> @@ -219,8 +220,6 @@ static TranslationBlock *tb_find_physical(CPUState *cpu,
>      tb_page_addr_t phys_pc, phys_page1;
>      target_ulong virt_page2;
>
> -    tcg_ctx.tb_ctx.tb_invalidated_flag = 0;
> -
>      /* find translated block using physical mappings */
>      phys_pc = get_page_addr_code(env, pc);
>      phys_page1 = phys_pc & TARGET_PAGE_MASK;
> @@ -288,6 +287,7 @@ static TranslationBlock *tb_find_slow(CPUState *cpu,
>  #endif
>
>      /* if no translated code available, then translate it now */
> +    cpu->tb_invalidated_flag = 0;
>      tb = tb_gen_code(cpu, pc, cs_base, flags, 0);
>
>  #ifdef CONFIG_USER_ONLY
> @@ -493,12 +493,11 @@ int cpu_exec(CPUState *cpu)
>                  tb = tb_find_fast(cpu);
>                  /* Note: we do it here to avoid a gcc bug on Mac OS X when
>                     doing it in tb_find_slow */
> -                if (tcg_ctx.tb_ctx.tb_invalidated_flag) {
> +                if (cpu->tb_invalidated_flag) {
>                      /* as some TB could have been invalidated because
> -                       of memory exceptions while generating the code, we
> +                       of a tb_flush while generating the code, we
>                         must recompute the hash index here */
>                      next_tb = 0;
> -                    tcg_ctx.tb_ctx.tb_invalidated_flag = 0;
>                  }
>                  if (qemu_loglevel_mask(CPU_LOG_EXEC)) {
>                      qemu_log("Trace %p [" TARGET_FMT_lx "] %s\n",
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 05a151da4a54..0ef6ea5cf6dc 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -288,8 +288,6 @@ struct TBContext {
>      /* statistics */
>      int tb_flush_count;
>      int tb_phys_invalidate_count;
> -
> -    int tb_invalidated_flag;
>  };
>
>  void tb_free(TranslationBlock *tb);
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index 7052eee7b78a..9538f9cc2af3 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -240,6 +240,7 @@ struct kvm_run;
>   * @crash_occurred: Indicates the OS reported a crash (panic) for this CPU
>   * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this
>   *           CPU and return to its top level loop.
> + * @tb_invalidated_flag: Set to tell TCG that tb_flush has been called.
>   * @singlestep_enabled: Flags for single-stepping.
>   * @icount_extra: Instructions until next timer event.
>   * @icount_decr: Number of cycles left, with interrupt flag in high bit.
> @@ -291,6 +292,7 @@ struct CPUState {
>      bool stopped;
>      bool crash_occurred;
>      bool exit_request;
> +    bool tb_invalidated_flag;

s/_flag// would save a few characters given it should be obvious it is a
flag from the setting of true and false?

>      uint32_t interrupt_request;
>      int singlestep_enabled;
>      int64_t icount_extra;
> diff --git a/translate-all.c b/translate-all.c
> index 1db5a914d9a3..8e1edd6bb633 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -843,6 +843,7 @@ void tb_flush(CPUState *cpu)
>      tcg_ctx.tb_ctx.nb_tbs = 0;
>
>      CPU_FOREACH(cpu) {
> +        cpu->tb_invalidated_flag = 1;
>          memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
>      }
>
> @@ -1079,8 +1080,6 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>          /* cannot fail at this point */
>          tb = tb_alloc(pc);
>          assert(tb != NULL);
> -        /* Don't forget to invalidate previous TB info.  */
> -        tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
>      }
>
>      gen_code_buf = tcg_ctx.code_gen_ptr;

I also note there is some code motion about where these flags are set
and cleared which should probably be mentioned in the commit message.

--
Alex Bennée

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] tcg: move tb_invalidated_flag to CPUState
  2016-03-22 15:07   ` Alex Bennée
@ 2016-03-22 15:11     ` Sergey Fedorov
  0 siblings, 0 replies; 41+ messages in thread
From: Sergey Fedorov @ 2016-03-22 15:11 UTC (permalink / raw)
  To: Alex Bennée, sergey.fedorov
  Cc: Paolo Bonzini, Richard Henderson, qemu-devel,
	Andreas Färber, Peter Crosthwaite

On 22/03/16 18:07, Alex Bennée wrote:
> sergey.fedorov@linaro.org writes:
(snip)
>> diff --git a/cpu-exec.c b/cpu-exec.c
>> index f90482eff778..07545aa91082 100644
>> --- a/cpu-exec.c
>> +++ b/cpu-exec.c
>> @@ -195,10 +195,11 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
>>      if (max_cycles > CF_COUNT_MASK)
>>          max_cycles = CF_COUNT_MASK;
>>
>> +    cpu->tb_invalidated_flag = 0;
> We've declared as bool so lets use true/false instead of 1/0's
>
(snip)
>> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
>> index 7052eee7b78a..9538f9cc2af3 100644
>> --- a/include/qom/cpu.h
>> +++ b/include/qom/cpu.h
>> @@ -240,6 +240,7 @@ struct kvm_run;
>>   * @crash_occurred: Indicates the OS reported a crash (panic) for this CPU
>>   * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this
>>   *           CPU and return to its top level loop.
>> + * @tb_invalidated_flag: Set to tell TCG that tb_flush has been called.
>>   * @singlestep_enabled: Flags for single-stepping.
>>   * @icount_extra: Instructions until next timer event.
>>   * @icount_decr: Number of cycles left, with interrupt flag in high bit.
>> @@ -291,6 +292,7 @@ struct CPUState {
>>      bool stopped;
>>      bool crash_occurred;
>>      bool exit_request;
>> +    bool tb_invalidated_flag;
> s/_flag// would save a few characters given it should be obvious it is a
> flag from the setting of true and false?
>
>>    
(snip)
>> diff --git a/translate-all.c b/translate-all.c
>> index 1db5a914d9a3..8e1edd6bb633 100644
>> --- a/translate-all.c
>> +++ b/translate-all.c
>> @@ -843,6 +843,7 @@ void tb_flush(CPUState *cpu)
>>      tcg_ctx.tb_ctx.nb_tbs = 0;
>>
>>      CPU_FOREACH(cpu) {
>> +        cpu->tb_invalidated_flag = 1;
>>          memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
>>      }
>>
>> @@ -1079,8 +1080,6 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>>          /* cannot fail at this point */
>>          tb = tb_alloc(pc);
>>          assert(tb != NULL);
>> -        /* Don't forget to invalidate previous TB info.  */
>> -        tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
>>      }
>>
>>      gen_code_buf = tcg_ctx.code_gen_ptr;
> I also note there is some code motion about where these flags are set
> and cleared which should probably be mentioned in the commit message.
>

Agree with all the comments.

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-03-17 15:14     ` Sergey Fedorov
@ 2016-03-28 15:18       ` Sergey Fedorov
  2016-03-28 21:21         ` Paolo Bonzini
  0 siblings, 1 reply; 41+ messages in thread
From: Sergey Fedorov @ 2016-03-28 15:18 UTC (permalink / raw)
  To: Paolo Bonzini, sergey.fedorov, qemu-devel
  Cc: Richard Henderson, Peter Crosthwaite

On 17/03/16 18:14, Sergey Fedorov wrote:
> On 17/03/16 18:09, Paolo Bonzini wrote:
>>
>> On 17/03/2016 14:46, sergey.fedorov@linaro.org wrote:
>>>   void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t
>>> page_addr)
>>>   {
>>> -    CPUState *cpu;
>>>       PageDesc *p;
>>>       unsigned int h, n1;
>>> +    tb_page_addr_t pc;
>>>       tb_page_addr_t phys_pc;
>>>       TranslationBlock *tb1, *tb2;
>>>   -    /* remove the TB from the hash list */
>>> -    phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
>>> -    h = tb_phys_hash_func(phys_pc);
>>> -    tb_hash_remove(&tcg_ctx.tb_ctx.tb_phys_hash[h], tb);
>>> -
>>> -    /* remove the TB from the page list */
>>> -    if (tb->page_addr[0] != page_addr) {
>>> -        p = page_find(tb->page_addr[0] >> TARGET_PAGE_BITS);
>>> -        tb_page_remove(&p->first_tb, tb);
>>> -        invalidate_page_bitmap(p);
>>> -    }
>>> -    if (tb->page_addr[1] != -1 && tb->page_addr[1] != page_addr) {
>>> -        p = page_find(tb->page_addr[1] >> TARGET_PAGE_BITS);
>>> -        tb_page_remove(&p->first_tb, tb);
>>> -        invalidate_page_bitmap(p);
>>> -    }
>>> -
>>> -    tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
>>> -
>> Did you investigate the removal of this setting of tb_invalidated_flag?
>>
>> My recollection is that it is okay to remove it because at worse it
>> would cause a tb_add_jump from an invalidated source to a valid
>> destination.  This should be harmless as long as the source has been
>> tb_phys_invalidated and not tb_flushed.  But this needs to be checked.
>
> Thanks for pointing that. I should investigate it to make sure,
> although arm32/arm64/x86_64 Linux boots fine as well as the latest
> Alex's kmv-unit-tests pass.

The use pattern of 'tb_invalidated_flag' is a bit intricate; correct me,
if I'm wrong about the following. Basically, 'tb_invalidated_flag' was
meant to catch two events:
 * some TB has been invalidated by tb_phys_invalidate();
 * the whole translation buffer has been flushed by tb_flush().

Then it is checked to ensure:
 * the last executed TB can be safely patched to directly call the next
   one in cpu_exec();
 * the original TB should be provided for further possible invalidation
   along with the temporarily generated TB when in cpu_exec_nocache().

What, I think, we couldn't be sure in is the cpu_exec_nocache() case. It
could look like a kind of corner case, but it could result in stalls, if
the original TB isn't invalidated properly, see b4ac20b4df0d ("cpu-exec:
fix cpu_exec_nocache").

So I would suggest the following solution:s
 (1) Use 'tb->pc' as an indicator of whether TB is valid; check for it
     in cpu_exec() when deciding on whether to patch the last executed
     TB or not
 (2) Use 'tcg_ctx.tb_ctx.tb_flush_count' to check for translation buffer
     flushes; capture it before calling tb_gen_code() and compare to it
     afterwards to check if tb_flush() has been called in between

What do you think?

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-03-17 13:46 ` [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate sergey.fedorov
  2016-03-17 15:09   ` Paolo Bonzini
@ 2016-03-28 18:42   ` Sergey Fedorov
  2016-03-28 20:58     ` Paolo Bonzini
  1 sibling, 1 reply; 41+ messages in thread
From: Sergey Fedorov @ 2016-03-28 18:42 UTC (permalink / raw)
  To: sergey.fedorov, qemu-devel
  Cc: Paolo Bonzini, Richard Henderson, Peter Crosthwaite

On 17/03/16 16:46, sergey.fedorov@linaro.org wrote:
> First the translation block is invalidated, for which a simple write
> to tb->pc is enough.  This means that cpu-exec will not pick up anymore
> the block, though it may still execute it through chained jumps.  This
> also replaces the NULLing out of the pointer in the CPUs' local cache.

Although, using 'tb->pc' to mark a TB as invalid is probably not such a
good idea. There may be some cases when PC could become equal to -1. For
example, ARMv6-M uses PC >= 0xFFFFFFF0 to perform exception return. So
we'd better introduce a separate 'tb->valid' or 'tb->invalid' flag.

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-03-28 18:42   ` Sergey Fedorov
@ 2016-03-28 20:58     ` Paolo Bonzini
  2016-03-29  0:17       ` Richard Henderson
  0 siblings, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2016-03-28 20:58 UTC (permalink / raw)
  To: Sergey Fedorov, sergey.fedorov, qemu-devel
  Cc: Peter Crosthwaite, Richard Henderson



On 28/03/2016 20:42, Sergey Fedorov wrote:
> On 17/03/16 16:46, sergey.fedorov@linaro.org wrote:
>> First the translation block is invalidated, for which a simple write
>> to tb->pc is enough.  This means that cpu-exec will not pick up anymore
>> the block, though it may still execute it through chained jumps.  This
>> also replaces the NULLing out of the pointer in the CPUs' local cache.
> 
> Although, using 'tb->pc' to mark a TB as invalid is probably not such a
> good idea. There may be some cases when PC could become equal to -1. For
> example, ARMv6-M uses PC >= 0xFFFFFFF0 to perform exception return. So
> we'd better introduce a separate 'tb->valid' or 'tb->invalid' flag.

It is also possible to use tb->flags for that.  I suspect that all-ones
tb flags is never valid, but it could also be a #define.

Paolo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-03-28 15:18       ` Sergey Fedorov
@ 2016-03-28 21:21         ` Paolo Bonzini
  2016-03-29 10:03           ` Sergey Fedorov
  0 siblings, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2016-03-28 21:21 UTC (permalink / raw)
  To: Sergey Fedorov, sergey.fedorov, qemu-devel
  Cc: Peter Crosthwaite, Richard Henderson



On 28/03/2016 17:18, Sergey Fedorov wrote:
> The use pattern of 'tb_invalidated_flag' is a bit intricate; correct me,
> if I'm wrong about the following. Basically, 'tb_invalidated_flag' was
> meant to catch two events:
>  * some TB has been invalidated by tb_phys_invalidate();

This is patch 4.

>  * the whole translation buffer has been flushed by tb_flush().

This is patch 5.

> Then it is checked to ensure:
>  * the last executed TB can be safely patched to directly call the next
>    one in cpu_exec();
>  * the original TB should be provided for further possible invalidation
>    along with the temporarily generated TB when in cpu_exec_nocache().
>
> [...] I would suggest the following solution:
>  (1) Use 'tb->pc' as an indicator of whether TB is valid; check for it
>      in cpu_exec() when deciding on whether to patch the last executed
>      TB or not
>  (2) Use 'tcg_ctx.tb_ctx.tb_flush_count' to check for translation buffer
>      flushes; capture it before calling tb_gen_code() and compare to it
>      afterwards to check if tb_flush() has been called in between

Of course that would work, but it would be slower.  I think it is
unnecessary for two reasons:

1) There are two calls to cpu_exec_nocache.  One exits immediately with
"break;", the other always sets "next_tb = 0;".  Therefore it is safe in
both cases for cpu_exec_nocache to hijack cpu->tb_invalidated_flag.

2) if it were broken, it would _also_ be broken before these patches
because cpu_exec_nocache always runs with tb_lock taken.  So I think
documenting the assumptions is better than changing them at the same
time as doing other changes.


Your observation that tb->pc==-1 is not necessarily safe still holds of
course.  Probably the best thing is an inline that can do one of:

1) set cs_base to an invalid value (anything nonzero is enough except on
x86 and SPARC; SPARC can use all-ones)

2) sets the flags to an invalid combination (x86 can use all ones)

3) sets the PC to an invalid value (no one really needs it)

Paolo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-03-28 20:58     ` Paolo Bonzini
@ 2016-03-29  0:17       ` Richard Henderson
  0 siblings, 0 replies; 41+ messages in thread
From: Richard Henderson @ 2016-03-29  0:17 UTC (permalink / raw)
  To: Paolo Bonzini, Sergey Fedorov, sergey.fedorov, qemu-devel
  Cc: Peter Crosthwaite

On 03/28/2016 01:58 PM, Paolo Bonzini wrote:
>
>
> On 28/03/2016 20:42, Sergey Fedorov wrote:
>> On 17/03/16 16:46, sergey.fedorov@linaro.org wrote:
>>> First the translation block is invalidated, for which a simple write
>>> to tb->pc is enough.  This means that cpu-exec will not pick up anymore
>>> the block, though it may still execute it through chained jumps.  This
>>> also replaces the NULLing out of the pointer in the CPUs' local cache.
>>
>> Although, using 'tb->pc' to mark a TB as invalid is probably not such a
>> good idea. There may be some cases when PC could become equal to -1. For
>> example, ARMv6-M uses PC >= 0xFFFFFFF0 to perform exception return. So
>> we'd better introduce a separate 'tb->valid' or 'tb->invalid' flag.
>
> It is also possible to use tb->flags for that.  I suspect that all-ones
> tb flags is never valid, but it could also be a #define.

That might work by accident, but it might not.  You'd need to reserve a bit 
across all of the targets.


r~

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-03-28 21:21         ` Paolo Bonzini
@ 2016-03-29 10:03           ` Sergey Fedorov
  2016-03-29 10:37             ` Paolo Bonzini
  0 siblings, 1 reply; 41+ messages in thread
From: Sergey Fedorov @ 2016-03-29 10:03 UTC (permalink / raw)
  To: Paolo Bonzini, sergey.fedorov, qemu-devel
  Cc: Peter Crosthwaite, Richard Henderson

On 29/03/16 00:21, Paolo Bonzini wrote:
>
> On 28/03/2016 17:18, Sergey Fedorov wrote:
>> The use pattern of 'tb_invalidated_flag' is a bit intricate; correct me,
>> if I'm wrong about the following. Basically, 'tb_invalidated_flag' was
>> meant to catch two events:
>>  * some TB has been invalidated by tb_phys_invalidate();
> This is patch 4.
>
>>  * the whole translation buffer has been flushed by tb_flush().
> This is patch 5.
>
>> Then it is checked to ensure:
>>  * the last executed TB can be safely patched to directly call the next
>>    one in cpu_exec();
>>  * the original TB should be provided for further possible invalidation
>>    along with the temporarily generated TB when in cpu_exec_nocache().
>>
>> [...] I would suggest the following solution:
>>  (1) Use 'tb->pc' as an indicator of whether TB is valid; check for it
>>      in cpu_exec() when deciding on whether to patch the last executed
>>      TB or not
>>  (2) Use 'tcg_ctx.tb_ctx.tb_flush_count' to check for translation buffer
>>      flushes; capture it before calling tb_gen_code() and compare to it
>>      afterwards to check if tb_flush() has been called in between
> Of course that would work, but it would be slower.  

What's going to be slower?

> I think it is
> unnecessary for two reasons:
>
> 1) There are two calls to cpu_exec_nocache.  One exits immediately with
> "break;", the other always sets "next_tb = 0;".  Therefore it is safe in
> both cases for cpu_exec_nocache to hijack cpu->tb_invalidated_flag.
>
> 2) if it were broken, it would _also_ be broken before these patches
> because cpu_exec_nocache always runs with tb_lock taken.  

I can't see how cpu_exec_nocache() always runs with tb_lock taken.

> So I think
> documenting the assumptions is better than changing them at the same
> time as doing other changes.

I'm not sure I understand you here exactly, but if implementing my
proposal, it'd rather be a separate patch/series, I think.

> Your observation that tb->pc==-1 is not necessarily safe still holds of
> course.  Probably the best thing is an inline that can do one of:
>
> 1) set cs_base to an invalid value (anything nonzero is enough except on
> x86 and SPARC; SPARC can use all-ones)
>
> 2) sets the flags to an invalid combination (x86 can use all ones)
>
> 3) sets the PC to an invalid value (no one really needs it)

It's a bit tricky. Does it really worth doing so instead of using a
separate dedicated flag? Mainly, it should cost one extra compare on TB
look-up. I suppose it's a kind of trade-off between performance and code
clarity.

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-03-29 10:03           ` Sergey Fedorov
@ 2016-03-29 10:37             ` Paolo Bonzini
  2016-03-29 12:31               ` Sergey Fedorov
  2016-04-14 14:45               ` Sergey Fedorov
  0 siblings, 2 replies; 41+ messages in thread
From: Paolo Bonzini @ 2016-03-29 10:37 UTC (permalink / raw)
  To: Sergey Fedorov, sergey.fedorov, qemu-devel
  Cc: Peter Crosthwaite, Richard Henderson



On 29/03/2016 12:03, Sergey Fedorov wrote:
>>> >> [...] I would suggest the following solution:
>>> >>  (1) Use 'tb->pc' as an indicator of whether TB is valid; check for it
>>> >>      in cpu_exec() when deciding on whether to patch the last executed
>>> >>      TB or not
>>> >>  (2) Use 'tcg_ctx.tb_ctx.tb_flush_count' to check for translation buffer
>>> >>      flushes; capture it before calling tb_gen_code() and compare to it
>>> >>      afterwards to check if tb_flush() has been called in between
>> > Of course that would work, but it would be slower.  
> What's going to be slower?

Checking tb->pc (or something similar) *and* tb_flush_count in 
tb_find_physical.

> > I think it is
> > unnecessary for two reasons:
> >
> > 1) There are two calls to cpu_exec_nocache.  One exits immediately with
> > "break;", the other always sets "next_tb = 0;".  Therefore it is safe in
> > both cases for cpu_exec_nocache to hijack cpu->tb_invalidated_flag.
> >
> > 2) if it were broken, it would _also_ be broken before these patches
> > because cpu_exec_nocache always runs with tb_lock taken.  
> 
> I can't see how cpu_exec_nocache() always runs with tb_lock taken.

It takes the lock itself. :)

>From Fred's "tcg: protect TBContext with tb_lock", as it is in my mttcg 
branch:

@@ -194,17 +194,23 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
     if (max_cycles > CF_COUNT_MASK)
         max_cycles = CF_COUNT_MASK;
 
+    tb_lock();
     cpu->tb_invalidated_flag = 0;
     tb = tb_gen_code(cpu, orig_tb->pc, orig_tb->cs_base, orig_tb->flags,
                      max_cycles | CF_NOCACHE);
     tb->orig_tb = cpu->tb_invalidated_flag ? NULL : orig_tb;
     cpu->current_tb = tb;
+    tb_unlock();
+
     /* execute the generated code */
     trace_exec_tb_nocache(tb, tb->pc);
     cpu_tb_exec(cpu, tb->tc_ptr);
+
+    tb_lock();
     cpu->current_tb = NULL;
     tb_phys_invalidate(tb, -1);
     tb_free(tb);
+    tb_unlock();
 }
 #endif
 

It takes the lock before resetting tb_invalidated_flag.

cpu_exec_nocache is not used in user-mode emulation, so it's okay if
qemu.git doesn't take the lock yet.  (This kind of misunderstanding
about which code is thread-safe is going to be common until we have
MTTCG.  This was the reason for the patch "cpu-exec: elide more icount
code if CONFIG_USER_ONLY").

> > So I think
> > documenting the assumptions is better than changing them at the same
> > time as doing other changes.
>
> I'm not sure I understand you here exactly, but if implementing my
> proposal, it'd rather be a separate patch/series, I think.

Exactly.  For the purpose of these 5 patches, I would just document above
cpu_exec_nocache that callers should ensure that next_tb is zero.

Alternatively, you could add a patch that does

    old_tb_invalidated_flag = cpu->tb_invalidated_flag;
    cpu->tb_invalidated_flag = 0;
    ...
    cpu->tb_invalidated_flag |= old_tb_invalidated_flag;

it could use the single global flag (and then it would be between patch 4
and patch 5) or it could use the CPU-specific one (and then it would be
after patch 5).

However, I think documenting the requirements is fine.

>> > Your observation that tb->pc==-1 is not necessarily safe still holds of
>> > course.  Probably the best thing is an inline that can do one of:
>> >
>> > 1) set cs_base to an invalid value (anything nonzero is enough except on
>> > x86 and SPARC; SPARC can use all-ones)
>> >
>> > 2) sets the flags to an invalid combination (x86 can use all ones)
>> >
>> > 3) sets the PC to an invalid value (no one really needs it)
>
> It's a bit tricky. Does it really worth doing so instead of using a
> separate dedicated flag? Mainly, it should cost one extra compare on TB
> look-up. I suppose it's a kind of trade-off between performance and code
> clarity.

I think a new inline function cpu_make_tb_invalid would not be too tricky.
Just setting "tb->cs_base = -1;" is pretty much obvious for all the targets
that do not use cs_base at all and for SPARC which sets it to a PC (and
thus a multiple of four).  x86 is the odd one out.

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-03-29 10:37             ` Paolo Bonzini
@ 2016-03-29 12:31               ` Sergey Fedorov
  2016-03-29 13:43                 ` Alex Bennée
  2016-04-14 14:45               ` Sergey Fedorov
  1 sibling, 1 reply; 41+ messages in thread
From: Sergey Fedorov @ 2016-03-29 12:31 UTC (permalink / raw)
  To: Paolo Bonzini, sergey.fedorov, qemu-devel
  Cc: Peter Crosthwaite, Richard Henderson

On 29/03/16 13:37, Paolo Bonzini wrote:
> cpu_exec_nocache is not used in user-mode emulation, so it's okay if
> qemu.git doesn't take the lock yet.  (This kind of misunderstanding
> about which code is thread-safe is going to be common until we have
> MTTCG.  This was the reason for the patch "cpu-exec: elide more icount
> code if CONFIG_USER_ONLY").

So I'd better pull in here the rebased version of this patch from Alex's
"Base enabling patches for MTTCG" series.

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop
  2016-03-22 14:59   ` Alex Bennée
  2016-03-22 15:00     ` Paolo Bonzini
@ 2016-03-29 13:19     ` Sergey Fedorov
  2016-03-29 13:26       ` Paolo Bonzini
  1 sibling, 1 reply; 41+ messages in thread
From: Sergey Fedorov @ 2016-03-29 13:19 UTC (permalink / raw)
  To: Alex Bennée, sergey.fedorov
  Cc: Paolo Bonzini, Peter Crosthwaite, qemu-devel, Richard Henderson

On 22/03/16 17:59, Alex Bennée wrote:
> sergey.fedorov@linaro.org writes:
>
>> From: Paolo Bonzini <pbonzini@redhat.com>
>>
>> Use a continue statement.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> [Sergey Fedorov: Fix moving to list head in case of no TB]
>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>> ---
>>  cpu-exec.c | 50 +++++++++++++++++++++++++-------------------------
>>  1 file changed, 25 insertions(+), 25 deletions(-)
>>
>> diff --git a/cpu-exec.c b/cpu-exec.c
>> index fd92452f16f6..f90482eff778 100644
>> --- a/cpu-exec.c
>> +++ b/cpu-exec.c
>> @@ -225,37 +225,37 @@ static TranslationBlock *tb_find_physical(CPUState *cpu,
>>      phys_pc = get_page_addr_code(env, pc);
>>      phys_page1 = phys_pc & TARGET_PAGE_MASK;
>>      h = tb_phys_hash_func(phys_pc);
>> -    ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
>> -    for(;;) {
>> -        tb = *ptb1;
>> -        if (!tb) {
>> -            return NULL;
>> +    for (ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
>> +         (tb = *ptb1) != NULL;
>> +         ptb1 = &tb->phys_hash_next) {
> I'm not sure I'm keen on the assignment in the for condition clause. I
> appreciate the cleansing of the if !tb return exit though. Could we be
> cleaner maybe? Here is my attempt:
>
>     static TranslationBlock *tb_find_physical(CPUState *cpu,
>                                               target_ulong pc,
>                                               target_ulong cs_base,
>                                               uint64_t flags)
>     {
>         CPUArchState *env = (CPUArchState *)cpu->env_ptr;
>         TranslationBlock *tb, **tb_hash_head, **ptb1;
>         unsigned int h;
>         tb_page_addr_t phys_pc, phys_page1;
>
>         /* find translated block using physical mappings */
>         phys_pc = get_page_addr_code(env, pc);
>         phys_page1 = phys_pc & TARGET_PAGE_MASK;
>         h = tb_phys_hash_func(phys_pc);
>
>         /* Start at head of the hash entry */
>         ptb1 = tb_hash_head = &tcg_ctx.tb_ctx.tb_phys_hash[h];
>         tb = *ptb1;
>
>         while (tb) {
>
>             if (tb->pc == pc &&
>                 tb->page_addr[0] == phys_page1 &&
>                 tb->cs_base == cs_base &&
>                 tb->flags == flags) {
>
>                 if (tb->page_addr[1] == -1) {
>                     /* done, we have a match */
>                     break;
>                 } else {
>                     /* check next page if needed */
>                     target_ulong virt_page2 = (pc & TARGET_PAGE_MASK)
>                         + TARGET_PAGE_SIZE;
>                     tb_page_addr_t phys_page2 = get_page_addr_code(env, virt_page2);
>
>                     if (tb->page_addr[1] == phys_page2) {
>                         break;
>                     }
>                 }
>             }
>
>             ptb1 = &tb->phys_hash_next;
>             tb = *ptb1;
>         }
>
>         if (tb) {
>             /* Move the TB to the head of the list */
>             *ptb1 = tb->phys_hash_next;
>             tb->phys_hash_next = *tb_hash_head;
>             *tb_hash_head = tb;
>         }
>         return tb;
>     }
>
> FWIW the compiled code is 9 bytes shorter on my machine.

Looks like another attempt to rewrite it. I am wondering whom to
attribute as an author, then? :)

Kind regards,
Sergey

>
>> +        if (tb->pc != pc ||
>> +            tb->page_addr[0] != phys_page1 ||
>> +            tb->cs_base != cs_base ||
>> +            tb->flags != flags) {
>> +            continue;
>>          }
>> -        if (tb->pc == pc &&
>> -            tb->page_addr[0] == phys_page1 &&
>> -            tb->cs_base == cs_base &&
>> -            tb->flags == flags) {
>> -            /* check next page if needed */
>> -            if (tb->page_addr[1] != -1) {
>> -                tb_page_addr_t phys_page2;
>> -
>> -                virt_page2 = (pc & TARGET_PAGE_MASK) +
>> -                    TARGET_PAGE_SIZE;
>> -                phys_page2 = get_page_addr_code(env, virt_page2);
>> -                if (tb->page_addr[1] == phys_page2) {
>> -                    break;
>> -                }
>> -            } else {
>> +
>> +        /* check next page if needed */
>> +        if (tb->page_addr[1] != -1) {
>> +            tb_page_addr_t phys_page2;
>> +
>> +            virt_page2 = (pc & TARGET_PAGE_MASK) +
>> +                TARGET_PAGE_SIZE;
>> +            phys_page2 = get_page_addr_code(env, virt_page2);
>> +            if (tb->page_addr[1] == phys_page2) {
>>                  break;
>>              }
>> +        } else {
>> +            break;
>>          }
>> -        ptb1 = &tb->phys_hash_next;
>>      }
>>
>> -    /* Move the TB to the head of the list */
>> -    *ptb1 = tb->phys_hash_next;
>> -    tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
>> -    tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
>> +    if (tb) {
>> +        /* Move the TB to the head of the list */
>> +        *ptb1 = tb->phys_hash_next;
>> +        tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
>> +        tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
>> +    }
>>      return tb;
>>  }

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop
  2016-03-29 13:19     ` Sergey Fedorov
@ 2016-03-29 13:26       ` Paolo Bonzini
  2016-03-29 14:05         ` Sergey Fedorov
  0 siblings, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2016-03-29 13:26 UTC (permalink / raw)
  To: Sergey Fedorov, Alex Bennée, sergey.fedorov
  Cc: Peter Crosthwaite, qemu-devel, Richard Henderson



On 29/03/2016 15:19, Sergey Fedorov wrote:
> On 22/03/16 17:59, Alex Bennée wrote:
>> sergey.fedorov@linaro.org writes:
>>
>>> From: Paolo Bonzini <pbonzini@redhat.com>
>>>
>>> Use a continue statement.
>>>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> [Sergey Fedorov: Fix moving to list head in case of no TB]
>>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>>> ---
>>>  cpu-exec.c | 50 +++++++++++++++++++++++++-------------------------
>>>  1 file changed, 25 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/cpu-exec.c b/cpu-exec.c
>>> index fd92452f16f6..f90482eff778 100644
>>> --- a/cpu-exec.c
>>> +++ b/cpu-exec.c
>>> @@ -225,37 +225,37 @@ static TranslationBlock *tb_find_physical(CPUState *cpu,
>>>      phys_pc = get_page_addr_code(env, pc);
>>>      phys_page1 = phys_pc & TARGET_PAGE_MASK;
>>>      h = tb_phys_hash_func(phys_pc);
>>> -    ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
>>> -    for(;;) {
>>> -        tb = *ptb1;
>>> -        if (!tb) {
>>> -            return NULL;
>>> +    for (ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
>>> +         (tb = *ptb1) != NULL;
>>> +         ptb1 = &tb->phys_hash_next) {
>> I'm not sure I'm keen on the assignment in the for condition clause. I
>> appreciate the cleansing of the if !tb return exit though. Could we be
>> cleaner maybe? Here is my attempt:
>>
>>     static TranslationBlock *tb_find_physical(CPUState *cpu,
>>                                               target_ulong pc,
>>                                               target_ulong cs_base,
>>                                               uint64_t flags)
>>     {
>>         CPUArchState *env = (CPUArchState *)cpu->env_ptr;
>>         TranslationBlock *tb, **tb_hash_head, **ptb1;
>>         unsigned int h;
>>         tb_page_addr_t phys_pc, phys_page1;
>>
>>         /* find translated block using physical mappings */
>>         phys_pc = get_page_addr_code(env, pc);
>>         phys_page1 = phys_pc & TARGET_PAGE_MASK;
>>         h = tb_phys_hash_func(phys_pc);
>>
>>         /* Start at head of the hash entry */
>>         ptb1 = tb_hash_head = &tcg_ctx.tb_ctx.tb_phys_hash[h];
>>         tb = *ptb1;
>>
>>         while (tb) {
>>
>>             if (tb->pc == pc &&
>>                 tb->page_addr[0] == phys_page1 &&
>>                 tb->cs_base == cs_base &&
>>                 tb->flags == flags) {
>>
>>                 if (tb->page_addr[1] == -1) {
>>                     /* done, we have a match */
>>                     break;
>>                 } else {
>>                     /* check next page if needed */
>>                     target_ulong virt_page2 = (pc & TARGET_PAGE_MASK)
>>                         + TARGET_PAGE_SIZE;
>>                     tb_page_addr_t phys_page2 = get_page_addr_code(env, virt_page2);
>>
>>                     if (tb->page_addr[1] == phys_page2) {
>>                         break;
>>                     }
>>                 }
>>             }
>>
>>             ptb1 = &tb->phys_hash_next;
>>             tb = *ptb1;
>>         }
>>
>>         if (tb) {
>>             /* Move the TB to the head of the list */
>>             *ptb1 = tb->phys_hash_next;
>>             tb->phys_hash_next = *tb_hash_head;
>>             *tb_hash_head = tb;
>>         }
>>         return tb;
>>     }
>>
>> FWIW the compiled code is 9 bytes shorter on my machine.
> 
> Looks like another attempt to rewrite it. I am wondering whom to
> attribute as an author, then? :)

I don't really care. :)

Paolo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-03-29 12:31               ` Sergey Fedorov
@ 2016-03-29 13:43                 ` Alex Bennée
  0 siblings, 0 replies; 41+ messages in thread
From: Alex Bennée @ 2016-03-29 13:43 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: Paolo Bonzini, Richard Henderson, qemu-devel, sergey.fedorov,
	Peter Crosthwaite


Sergey Fedorov <serge.fdrv@gmail.com> writes:

> On 29/03/16 13:37, Paolo Bonzini wrote:
>> cpu_exec_nocache is not used in user-mode emulation, so it's okay if
>> qemu.git doesn't take the lock yet.  (This kind of misunderstanding
>> about which code is thread-safe is going to be common until we have
>> MTTCG.  This was the reason for the patch "cpu-exec: elide more icount
>> code if CONFIG_USER_ONLY").
>
> So I'd better pull in here the rebased version of this patch from Alex's
> "Base enabling patches for MTTCG" series.

OK, sounds good.

--
Alex Bennée

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop
  2016-03-29 13:26       ` Paolo Bonzini
@ 2016-03-29 14:05         ` Sergey Fedorov
  2016-03-29 14:26           ` Alex Bennée
  0 siblings, 1 reply; 41+ messages in thread
From: Sergey Fedorov @ 2016-03-29 14:05 UTC (permalink / raw)
  To: Paolo Bonzini, Alex Bennée, sergey.fedorov
  Cc: Peter Crosthwaite, qemu-devel, Richard Henderson

On 29/03/16 16:26, Paolo Bonzini wrote:
>
> On 29/03/2016 15:19, Sergey Fedorov wrote:
>> On 22/03/16 17:59, Alex Bennée wrote:
>>> sergey.fedorov@linaro.org writes:
>>>
>>>> From: Paolo Bonzini <pbonzini@redhat.com>
>>>>
>>>> Use a continue statement.
>>>>
>>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>>> [Sergey Fedorov: Fix moving to list head in case of no TB]
>>>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>>>> ---
>>>>  cpu-exec.c | 50 +++++++++++++++++++++++++-------------------------
>>>>  1 file changed, 25 insertions(+), 25 deletions(-)
>>>>
>>>> diff --git a/cpu-exec.c b/cpu-exec.c
>>>> index fd92452f16f6..f90482eff778 100644
>>>> --- a/cpu-exec.c
>>>> +++ b/cpu-exec.c
>>>> @@ -225,37 +225,37 @@ static TranslationBlock *tb_find_physical(CPUState *cpu,
>>>>      phys_pc = get_page_addr_code(env, pc);
>>>>      phys_page1 = phys_pc & TARGET_PAGE_MASK;
>>>>      h = tb_phys_hash_func(phys_pc);
>>>> -    ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
>>>> -    for(;;) {
>>>> -        tb = *ptb1;
>>>> -        if (!tb) {
>>>> -            return NULL;
>>>> +    for (ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
>>>> +         (tb = *ptb1) != NULL;
>>>> +         ptb1 = &tb->phys_hash_next) {
>>> I'm not sure I'm keen on the assignment in the for condition clause. I
>>> appreciate the cleansing of the if !tb return exit though. Could we be
>>> cleaner maybe? Here is my attempt:
>>>
>>>     static TranslationBlock *tb_find_physical(CPUState *cpu,
>>>                                               target_ulong pc,
>>>                                               target_ulong cs_base,
>>>                                               uint64_t flags)
>>>     {
>>>         CPUArchState *env = (CPUArchState *)cpu->env_ptr;
>>>         TranslationBlock *tb, **tb_hash_head, **ptb1;
>>>         unsigned int h;
>>>         tb_page_addr_t phys_pc, phys_page1;
>>>
>>>         /* find translated block using physical mappings */
>>>         phys_pc = get_page_addr_code(env, pc);
>>>         phys_page1 = phys_pc & TARGET_PAGE_MASK;
>>>         h = tb_phys_hash_func(phys_pc);
>>>
>>>         /* Start at head of the hash entry */
>>>         ptb1 = tb_hash_head = &tcg_ctx.tb_ctx.tb_phys_hash[h];
>>>         tb = *ptb1;
>>>
>>>         while (tb) {
>>>
>>>             if (tb->pc == pc &&
>>>                 tb->page_addr[0] == phys_page1 &&
>>>                 tb->cs_base == cs_base &&
>>>                 tb->flags == flags) {
>>>
>>>                 if (tb->page_addr[1] == -1) {
>>>                     /* done, we have a match */
>>>                     break;
>>>                 } else {
>>>                     /* check next page if needed */
>>>                     target_ulong virt_page2 = (pc & TARGET_PAGE_MASK)
>>>                         + TARGET_PAGE_SIZE;
>>>                     tb_page_addr_t phys_page2 = get_page_addr_code(env, virt_page2);
>>>
>>>                     if (tb->page_addr[1] == phys_page2) {
>>>                         break;
>>>                     }
>>>                 }
>>>             }
>>>
>>>             ptb1 = &tb->phys_hash_next;
>>>             tb = *ptb1;
>>>         }
>>>
>>>         if (tb) {
>>>             /* Move the TB to the head of the list */
>>>             *ptb1 = tb->phys_hash_next;
>>>             tb->phys_hash_next = *tb_hash_head;
>>>             *tb_hash_head = tb;
>>>         }
>>>         return tb;
>>>     }
>>>
>>> FWIW the compiled code is 9 bytes shorter on my machine.
>> Looks like another attempt to rewrite it. I am wondering whom to
>> attribute as an author, then? :)
> I don't really care. :)

Alex, could you give your s-o-b for your variant of code? Or would you
like to make a patch by yourself?

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop
  2016-03-29 14:05         ` Sergey Fedorov
@ 2016-03-29 14:26           ` Alex Bennée
  2016-03-29 14:37             ` Sergey Fedorov
  0 siblings, 1 reply; 41+ messages in thread
From: Alex Bennée @ 2016-03-29 14:26 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: Paolo Bonzini, Peter Crosthwaite, qemu-devel, sergey.fedorov,
	Richard Henderson


Sergey Fedorov <serge.fdrv@gmail.com> writes:

> On 29/03/16 16:26, Paolo Bonzini wrote:
>>
>> On 29/03/2016 15:19, Sergey Fedorov wrote:
>>> On 22/03/16 17:59, Alex Bennée wrote:
>>>> sergey.fedorov@linaro.org writes:
>>>>
>>>>> From: Paolo Bonzini <pbonzini@redhat.com>
>>>>>
>>>>> Use a continue statement.
>>>>>
>>>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>>>> [Sergey Fedorov: Fix moving to list head in case of no TB]
>>>>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>>>>> ---
>>>>>  cpu-exec.c | 50 +++++++++++++++++++++++++-------------------------
>>>>>  1 file changed, 25 insertions(+), 25 deletions(-)
>>>>>
>>>>> diff --git a/cpu-exec.c b/cpu-exec.c
>>>>> index fd92452f16f6..f90482eff778 100644
>>>>> --- a/cpu-exec.c
>>>>> +++ b/cpu-exec.c
>>>>> @@ -225,37 +225,37 @@ static TranslationBlock *tb_find_physical(CPUState *cpu,
>>>>>      phys_pc = get_page_addr_code(env, pc);
>>>>>      phys_page1 = phys_pc & TARGET_PAGE_MASK;
>>>>>      h = tb_phys_hash_func(phys_pc);
>>>>> -    ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
>>>>> -    for(;;) {
>>>>> -        tb = *ptb1;
>>>>> -        if (!tb) {
>>>>> -            return NULL;
>>>>> +    for (ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
>>>>> +         (tb = *ptb1) != NULL;
>>>>> +         ptb1 = &tb->phys_hash_next) {
>>>> I'm not sure I'm keen on the assignment in the for condition clause. I
>>>> appreciate the cleansing of the if !tb return exit though. Could we be
>>>> cleaner maybe? Here is my attempt:
>>>>
>>>>     static TranslationBlock *tb_find_physical(CPUState *cpu,
>>>>                                               target_ulong pc,
>>>>                                               target_ulong cs_base,
>>>>                                               uint64_t flags)
>>>>     {
>>>>         CPUArchState *env = (CPUArchState *)cpu->env_ptr;
>>>>         TranslationBlock *tb, **tb_hash_head, **ptb1;
>>>>         unsigned int h;
>>>>         tb_page_addr_t phys_pc, phys_page1;
>>>>
>>>>         /* find translated block using physical mappings */
>>>>         phys_pc = get_page_addr_code(env, pc);
>>>>         phys_page1 = phys_pc & TARGET_PAGE_MASK;
>>>>         h = tb_phys_hash_func(phys_pc);
>>>>
>>>>         /* Start at head of the hash entry */
>>>>         ptb1 = tb_hash_head = &tcg_ctx.tb_ctx.tb_phys_hash[h];
>>>>         tb = *ptb1;
>>>>
>>>>         while (tb) {
>>>>
>>>>             if (tb->pc == pc &&
>>>>                 tb->page_addr[0] == phys_page1 &&
>>>>                 tb->cs_base == cs_base &&
>>>>                 tb->flags == flags) {
>>>>
>>>>                 if (tb->page_addr[1] == -1) {
>>>>                     /* done, we have a match */
>>>>                     break;
>>>>                 } else {
>>>>                     /* check next page if needed */
>>>>                     target_ulong virt_page2 = (pc & TARGET_PAGE_MASK)
>>>>                         + TARGET_PAGE_SIZE;
>>>>                     tb_page_addr_t phys_page2 = get_page_addr_code(env, virt_page2);
>>>>
>>>>                     if (tb->page_addr[1] == phys_page2) {
>>>>                         break;
>>>>                     }
>>>>                 }
>>>>             }
>>>>
>>>>             ptb1 = &tb->phys_hash_next;
>>>>             tb = *ptb1;
>>>>         }
>>>>
>>>>         if (tb) {
>>>>             /* Move the TB to the head of the list */
>>>>             *ptb1 = tb->phys_hash_next;
>>>>             tb->phys_hash_next = *tb_hash_head;
>>>>             *tb_hash_head = tb;
>>>>         }
>>>>         return tb;
>>>>     }
>>>>
>>>> FWIW the compiled code is 9 bytes shorter on my machine.
>>> Looks like another attempt to rewrite it. I am wondering whom to
>>> attribute as an author, then? :)
>> I don't really care. :)
>
> Alex, could you give your s-o-b for your variant of code? Or would you
> like to make a patch by yourself?

Sure here:

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

>
> Kind regards,
> Sergey


--
Alex Bennée

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop
  2016-03-29 14:26           ` Alex Bennée
@ 2016-03-29 14:37             ` Sergey Fedorov
  0 siblings, 0 replies; 41+ messages in thread
From: Sergey Fedorov @ 2016-03-29 14:37 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Paolo Bonzini, Peter Crosthwaite, qemu-devel, sergey.fedorov,
	Richard Henderson

On 29/03/16 17:26, Alex Bennée wrote:
>> Alex, could you give your s-o-b for your variant of code? Or would you
>> > like to make a patch by yourself?
> Sure here:
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>

So I'm going to respin the series quickly dropping the last two
contentious patches so far. It would give this series a chance for to
get in 2.6.

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-03-29 10:37             ` Paolo Bonzini
  2016-03-29 12:31               ` Sergey Fedorov
@ 2016-04-14 14:45               ` Sergey Fedorov
  2016-04-14 15:13                 ` Paolo Bonzini
  1 sibling, 1 reply; 41+ messages in thread
From: Sergey Fedorov @ 2016-04-14 14:45 UTC (permalink / raw)
  To: Paolo Bonzini, sergey.fedorov, qemu-devel
  Cc: Richard Henderson, Peter Crosthwaite, Alex Bennée

On 29/03/16 13:37, Paolo Bonzini wrote:
>>>> Your observation that tb->pc==-1 is not necessarily safe still holds of
>>>> >> > course.  Probably the best thing is an inline that can do one of:
>>>> >> >
>>>> >> > 1) set cs_base to an invalid value (anything nonzero is enough except on
>>>> >> > x86 and SPARC; SPARC can use all-ones)
>>>> >> >
>>>> >> > 2) sets the flags to an invalid combination (x86 can use all ones)
>>>> >> >
>>>> >> > 3) sets the PC to an invalid value (no one really needs it)
>> >
>> > It's a bit tricky. Does it really worth doing so instead of using a
>> > separate dedicated flag? Mainly, it should cost one extra compare on TB
>> > look-up. I suppose it's a kind of trade-off between performance and code
>> > clarity.
> I think a new inline function cpu_make_tb_invalid would not be too tricky.
> Just setting "tb->cs_base = -1;" is pretty much obvious for all the targets
> that do not use cs_base at all and for SPARC which sets it to a PC (and
> thus a multiple of four).  x86 is the odd one out.

So what would you suggest to use for x86? I can't think of something
that looks like a really compelling combination when I look at
cpu_get_tb_cpu_state() in target-i386/cpu.h. Personally, I'm not so
happy trying to use pc/cs_base/flags to mark an invalid TB. Are my
worries unreasonable? :) Some other thoughts?

Anyway, I am wondering if there is still a way to clear tb_phys_hash and
tb_jmp_cache safely. Maybe something like this:
 * Remove the TB from physical hash list
 * Memory barrier
 * Remove the TB from each vCPU's virtual address hash cache

Would that work?

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-04-14 14:45               ` Sergey Fedorov
@ 2016-04-14 15:13                 ` Paolo Bonzini
  2016-04-14 15:36                   ` Sergey Fedorov
                                     ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Paolo Bonzini @ 2016-04-14 15:13 UTC (permalink / raw)
  To: Sergey Fedorov, sergey.fedorov, qemu-devel
  Cc: Richard Henderson, Peter Crosthwaite, Alex Bennée



On 14/04/2016 16:45, Sergey Fedorov wrote:
> So what would you suggest to use for x86? I can't think of something
> that looks like a really compelling combination when I look at
> cpu_get_tb_cpu_state() in target-i386/cpu.h.

On x86 I think we should define HF_INVALID_TB to an invalid flag
combination. I can think of several solutions:

- #defining HF_INVALID_TB to an invalid combination (e.g. HF_CS64_MASK,
because it always appears together with HF_LMA_MASK; see code that
updates those hflags in cpu_x86_load_seg_cache, cpu_x86_update_cr0,
kvm_get_sregs). Advantage: doesn't waste a bit, reasonably self
documenting. Disadvantage: a bit tricky, but still my favorite.

- rename HF_SOFTMMU_MASK to HF_INVALID_MASK (it's always the same as
CONFIG_SOFTMMU so we can remove it), then #define HF_INVALID_TB
HF_INVALID_MASK. Advantage: obviously correct. Disadvantage: wastes a
bit. My second favorite.

- #defining HF_INVALID_TB to -1. Advantage: ?!? Disadvantage:
everything. Looks tame, actually a huge hack

- #defining HF_INVALID_TB to the "wrong" direction HF_SOFTMMU_MASK (i.e.
to 0 if CONFIG_SOFTMMU, . Advantage: obviously correct. Disadvantage:
huge hack, HF_SOFTMMU_MASK is unused anyway.

Choose your own favorite. :)

(Setting cs_base to -1 actually would work on 64-bit x86, but not on
qemu-system-i386).

> Personally, I'm not so
> happy trying to use pc/cs_base/flags to mark an invalid TB. Are my
> worries unreasonable? :)

Can you explain your worries?

The advantages are that it's O(1) and it obviously doesn't affect other
TBs than the invalidated one.

> Anyway, I am wondering if there is still a way to clear tb_phys_hash and
> tb_jmp_cache safely.
>
> Maybe something like this:
>  * Remove the TB from physical hash list

So at this point tb_find_slow cannot find it.

>  * Memory barrier
>  * Remove the TB from each vCPU's virtual address hash cache

tb_find_fast then cannot find it either.

> Would that work?

This is very similar to the current code.  From 10,000 feet, because
tb_find_fast calls tb_find_slow, this could indeed work, but I'm a bit
concerned about how to order the removal of the jump lists.  The usage
of "tcg_ctx.tb_ctx.tb_invalidated_flag = 1" in the existing code was
what worries me.  Indeed the motivation of this patch was removing that
single line of code to prepare for the move of tb_invalidated_flag to
CPUState.

Also, this loop will not be thread-safe anymore as soon as Fred's
"tb_jmp_cache lookup outside tb_lock" goes in:

    CPU_FOREACH(cpu) {
        if (cpu->tb_jmp_cache[h] == tb) {
            cpu->tb_jmp_cache[h] = NULL;
        }
    }

It should use atomic_cmpxchg (slow!) or to unconditionally NULL out
cpu->tb_jmp_cache (a bit hacky).  Preparing for that change is an added
bonus of the tb-hacking approach.

Paolo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-04-14 15:13                 ` Paolo Bonzini
@ 2016-04-14 15:36                   ` Sergey Fedorov
  2016-04-14 17:27                     ` Paolo Bonzini
  2016-04-14 18:29                   ` Sergey Fedorov
  2016-04-14 18:37                   ` Sergey Fedorov
  2 siblings, 1 reply; 41+ messages in thread
From: Sergey Fedorov @ 2016-04-14 15:36 UTC (permalink / raw)
  To: Paolo Bonzini, sergey.fedorov, qemu-devel
  Cc: Richard Henderson, Peter Crosthwaite, Alex Bennée

On 14/04/16 18:13, Paolo Bonzini wrote:
> This is very similar to the current code.  From 10,000 feet, because
> tb_find_fast calls tb_find_slow, this could indeed work, but I'm a bit
> concerned about how to order the removal of the jump lists.  The usage
> of "tcg_ctx.tb_ctx.tb_invalidated_flag = 1" in the existing code was
> what worries me.  Indeed the motivation of this patch was removing that
> single line of code to prepare for the move of tb_invalidated_flag to
> CPUState.

I'm just preparing a patch with a long commit message which describes
and removes this setting of tb_invalidated_flag :) I think we can do
this safely and even benefit in performance.

>
> Also, this loop will not be thread-safe anymore as soon as Fred's
> "tb_jmp_cache lookup outside tb_lock" goes in:
>
>     CPU_FOREACH(cpu) {
>         if (cpu->tb_jmp_cache[h] == tb) {
>             cpu->tb_jmp_cache[h] = NULL;
>         }
>     }
>
> It should use atomic_cmpxchg (slow!) or to unconditionally NULL out
> cpu->tb_jmp_cache (a bit hacky).  Preparing for that change is an added
> bonus of the tb-hacking approach.

IIUC we always modify tb_jmp_cache/tb_phys_hash under tb_lock. We're
just gonna do tb_jmp_cache lookup outside of tb_lock. I think write
memory barrier in tb_phys_invalidate() paired with read memory memory
barrier in tb_find_fast()/tb_find_slow() would be enough to make it
safe. Also we need to use atomic read/set to tb_jmp_cache but cmpxchg is
not necessary. So I'm going to give it a thought then.

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-04-14 15:36                   ` Sergey Fedorov
@ 2016-04-14 17:27                     ` Paolo Bonzini
  0 siblings, 0 replies; 41+ messages in thread
From: Paolo Bonzini @ 2016-04-14 17:27 UTC (permalink / raw)
  To: Sergey Fedorov, sergey.fedorov, qemu-devel
  Cc: Peter Crosthwaite, Alex Bennée, Richard Henderson



On 14/04/2016 17:36, Sergey Fedorov wrote:
> IIUC we always modify tb_jmp_cache/tb_phys_hash under tb_lock. We're
> just gonna do tb_jmp_cache lookup outside of tb_lock. I think write
> memory barrier in tb_phys_invalidate() paired with read memory memory
> barrier in tb_find_fast()/tb_find_slow() would be enough to make it
> safe. Also we need to use atomic read/set to tb_jmp_cache but cmpxchg is
> not necessary. So I'm going to give it a thought then.

Sounds good!

Paolo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-04-14 15:13                 ` Paolo Bonzini
  2016-04-14 15:36                   ` Sergey Fedorov
@ 2016-04-14 18:29                   ` Sergey Fedorov
  2016-04-14 18:37                   ` Sergey Fedorov
  2 siblings, 0 replies; 41+ messages in thread
From: Sergey Fedorov @ 2016-04-14 18:29 UTC (permalink / raw)
  To: Paolo Bonzini, sergey.fedorov, qemu-devel
  Cc: Richard Henderson, Peter Crosthwaite, Alex Bennée

On 14/04/16 18:13, Paolo Bonzini wrote:
> This is very similar to the current code.  From 10,000 feet, because
> tb_find_fast calls tb_find_slow, this could indeed work, but I'm a bit
> concerned about how to order the removal of the jump lists.

As long as we always link/unlink TBs under tb_lock the exact order
doesn't seem to be important.

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate
  2016-04-14 15:13                 ` Paolo Bonzini
  2016-04-14 15:36                   ` Sergey Fedorov
  2016-04-14 18:29                   ` Sergey Fedorov
@ 2016-04-14 18:37                   ` Sergey Fedorov
  2 siblings, 0 replies; 41+ messages in thread
From: Sergey Fedorov @ 2016-04-14 18:37 UTC (permalink / raw)
  To: Paolo Bonzini, sergey.fedorov, qemu-devel
  Cc: Richard Henderson, Peter Crosthwaite, Alex Bennée

On 14/04/16 18:13, Paolo Bonzini wrote:
> On 14/04/2016 16:45, Sergey Fedorov wrote:
>> Personally, I'm not so
>> happy trying to use pc/cs_base/flags to mark an invalid TB. Are my
>> worries unreasonable? :)
> Can you explain your worries?
>
> The advantages are that it's O(1) and it obviously doesn't affect other
> TBs than the invalidated one.

Well, it looks a bit hucky, I don't like hucky code because it is each
time hard to read, modify and be confident in it. I would like to
introduce a dedicated boolean field to mark a TB as valid/invalid. But
this approach adds one more compare in tb_find_fast() and
tb_find_slow(). Luckily, it may be not an issue if my proposal with
memory barriers and atomic access to 'tb_jmp_cache' solves the problem.

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2016-04-14 18:37 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-17 13:46 [Qemu-devel] [PATCH 0/5] tcg: Misc clean-up patches from Paolo sergey.fedorov
2016-03-17 13:46 ` [Qemu-devel] [PATCH 1/5] tcg: code_bitmap is not used by user-mode emulation sergey.fedorov
2016-03-17 14:56   ` Peter Maydell
2016-03-17 15:03     ` Sergey Fedorov
2016-03-17 13:46 ` [Qemu-devel] [PATCH 2/5] tcg: reorganize tb_find_physical loop sergey.fedorov
2016-03-17 14:59   ` Peter Maydell
2016-03-22 14:59   ` Alex Bennée
2016-03-22 15:00     ` Paolo Bonzini
2016-03-29 13:19     ` Sergey Fedorov
2016-03-29 13:26       ` Paolo Bonzini
2016-03-29 14:05         ` Sergey Fedorov
2016-03-29 14:26           ` Alex Bennée
2016-03-29 14:37             ` Sergey Fedorov
2016-03-17 13:46 ` [Qemu-devel] [PATCH 3/5] tcg: always keep jump target and tb->jmp_next consistent sergey.fedorov
2016-03-17 17:57   ` Richard Henderson
2016-03-17 19:31     ` Paolo Bonzini
2016-03-17 20:45       ` Sergey Fedorov
2016-03-17 20:46         ` Richard Henderson
2016-03-18 10:29           ` Sergey Fedorov
2016-03-18 10:32         ` Sergey Fedorov
2016-03-17 13:46 ` [Qemu-devel] [PATCH 4/5] tcg: reorder removal from lists in tb_phys_invalidate sergey.fedorov
2016-03-17 15:09   ` Paolo Bonzini
2016-03-17 15:14     ` Sergey Fedorov
2016-03-28 15:18       ` Sergey Fedorov
2016-03-28 21:21         ` Paolo Bonzini
2016-03-29 10:03           ` Sergey Fedorov
2016-03-29 10:37             ` Paolo Bonzini
2016-03-29 12:31               ` Sergey Fedorov
2016-03-29 13:43                 ` Alex Bennée
2016-04-14 14:45               ` Sergey Fedorov
2016-04-14 15:13                 ` Paolo Bonzini
2016-04-14 15:36                   ` Sergey Fedorov
2016-04-14 17:27                     ` Paolo Bonzini
2016-04-14 18:29                   ` Sergey Fedorov
2016-04-14 18:37                   ` Sergey Fedorov
2016-03-28 18:42   ` Sergey Fedorov
2016-03-28 20:58     ` Paolo Bonzini
2016-03-29  0:17       ` Richard Henderson
2016-03-17 13:46 ` [Qemu-devel] [PATCH 5/5] tcg: move tb_invalidated_flag to CPUState sergey.fedorov
2016-03-22 15:07   ` Alex Bennée
2016-03-22 15:11     ` Sergey Fedorov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.