[Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe
@ 2016-04-07 15:53 Sergey Fedorov
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 01/11] tci: Fix build regression Sergey Fedorov
                   ` (12 more replies)
  0 siblings, 13 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 15:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Richard Henderson, Peter Crosthwaite,
	Alex Bennée, Paolo Bonzini

From: Sergey Fedorov <serge.fdrv@gmail.com>

When patching translated code for direct block chaining/unchaining,
modification of concurrently executing code can happen in multi-threaded
execution.  Currently only user-mode is affected. To make direct block patching
safe, some care must be taken to make sure that the code modification is made
atomic and concurrently executed code is guaranteed to be consistent.

This patch series fixes all supported TCG targets using direct patching and
documents the requirement for direct jump patching be atomic and thread-safe.

The series' tree can be found in a public git repository [1].

[1] https://github.com/sergefdrv/qemu/tree/atomic-tb-patching

Sergey Fedorov (10):
  pc-bios/s390-ccw: Use correct strip when cross-compiling
  tci: Make direct jump patching thread-safe
  tcg/ppc: Make direct jump patching thread-safe
  tcg/i386: Make direct jump patching thread-safe
  tcg/s390: Make direct jump patching thread-safe
  tcg/arm: Make direct jump patching thread-safe
  tcg/aarch64: Make direct jump patching thread-safe
  tcg/sparc: Make direct jump patching thread-safe
  tcg/mips: Make direct jump patching thread-safe
  tcg: Note requirement on atomic direct jump patching

Stefan Weil (1):
  tci: Fix build regression

 include/exec/exec-all.h      | 32 ++++++------------------------
 pc-bios/s390-ccw/Makefile    |  2 +-
 tcg/aarch64/tcg-target.inc.c | 14 +++++++++++++-
 tcg/arm/tcg-target.inc.c     | 17 ++++++++++++++++
 tcg/i386/tcg-target.inc.c    | 17 ++++++++++++++++
 tcg/mips/tcg-target.inc.c    |  3 ++-
 tcg/ppc/tcg-target.inc.c     | 22 +++++++++++++++++----
 tcg/s390/tcg-target.inc.c    |  6 ++++++
 tcg/sparc/tcg-target.inc.c   |  2 +-
 tcg/tci/tcg-target.inc.c     |  2 ++
 tci.c                        | 46 +++++++++++++++++++++++++-------------------
 translate-all.c              |  2 ++
 12 files changed, 111 insertions(+), 54 deletions(-)

-- 
2.8.1

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [Qemu-devel] [PATCH 01/11] tci: Fix build regression
  2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
@ 2016-04-07 15:53 ` Sergey Fedorov
  2016-04-07 18:15   ` Richard Henderson
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling Sergey Fedorov
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 15:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Crosthwaite, Stefan Weil, Paolo Bonzini, Sergey Fedorov,
	Alex Bennée, Richard Henderson

From: Stefan Weil <sw@weilnetz.de>

Commit d38ea87ac54af64ef611de434d07c12dc0399216 cleaned the include
statements which resulted in a wrong order of assert.h and the definition
of NDEBUG in tci.c. Normally NDEBUG modifies the definition of the assert
macro, but here this definition comes too late which results in a failing
build.

To fix this, a new macro tci_assert which depends on CONFIG_DEBUG_TCG
is introduced. Only builds with CONFIG_DEBUG_TCG will use assertions.
Even in this case, it is still possible to disable assertions by
defining NDEBUG via compiler settings.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
---
 tci.c | 41 ++++++++++++++++++++++-------------------
 1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/tci.c b/tci.c
index 7cbb39ed4b6a..82705fe77295 100644
--- a/tci.c
+++ b/tci.c
@@ -1,7 +1,7 @@
 /*
  * Tiny Code Interpreter for QEMU
  *
- * Copyright (c) 2009, 2011 Stefan Weil
+ * Copyright (c) 2009, 2011, 2016 Stefan Weil
  *
  * This program is free software: you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -19,9 +19,12 @@
 
 #include "qemu/osdep.h"
 
-/* Defining NDEBUG disables assertions (which makes the code faster). */
-#if !defined(CONFIG_DEBUG_TCG) && !defined(NDEBUG)
-# define NDEBUG
+/* Enable TCI assertions only when debugging TCG (and without NDEBUG defined).
+ * Without assertions, the interpreter runs much faster. */
+#if defined(CONFIG_DEBUG_TCG)
+# define tci_assert(cond) assert(cond)
+#else
+# define tci_assert(cond) ((void)0)
 #endif
 
 #include "qemu-common.h"
@@ -56,7 +59,7 @@ static tcg_target_ulong tci_reg[TCG_TARGET_NB_REGS];
 
 static tcg_target_ulong tci_read_reg(TCGReg index)
 {
-    assert(index < ARRAY_SIZE(tci_reg));
+    tci_assert(index < ARRAY_SIZE(tci_reg));
     return tci_reg[index];
 }
 
@@ -105,9 +108,9 @@ static uint64_t tci_read_reg64(TCGReg index)
 
 static void tci_write_reg(TCGReg index, tcg_target_ulong value)
 {
-    assert(index < ARRAY_SIZE(tci_reg));
-    assert(index != TCG_AREG0);
-    assert(index != TCG_REG_CALL_STACK);
+    tci_assert(index < ARRAY_SIZE(tci_reg));
+    tci_assert(index != TCG_AREG0);
+    tci_assert(index != TCG_REG_CALL_STACK);
     tci_reg[index] = value;
 }
 
@@ -325,7 +328,7 @@ static uint64_t tci_read_ri64(uint8_t **tb_ptr)
 static tcg_target_ulong tci_read_label(uint8_t **tb_ptr)
 {
     tcg_target_ulong label = tci_read_i(tb_ptr);
-    assert(label != 0);
+    tci_assert(label != 0);
     return label;
 }
 
@@ -468,11 +471,11 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
 
     tci_reg[TCG_AREG0] = (tcg_target_ulong)env;
     tci_reg[TCG_REG_CALL_STACK] = sp_value;
-    assert(tb_ptr);
+    tci_assert(tb_ptr);
 
     for (;;) {
         TCGOpcode opc = tb_ptr[0];
-#if !defined(NDEBUG)
+#if defined(CONFIG_DEBUG_TCG) && !defined(NDEBUG)
         uint8_t op_size = tb_ptr[1];
         uint8_t *old_code_ptr = tb_ptr;
 #endif
@@ -525,7 +528,7 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
             break;
         case INDEX_op_br:
             label = tci_read_label(&tb_ptr);
-            assert(tb_ptr == old_code_ptr + op_size);
+            tci_assert(tb_ptr == old_code_ptr + op_size);
             tb_ptr = (uint8_t *)label;
             continue;
         case INDEX_op_setcond_i32:
@@ -600,7 +603,7 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
             t0 = tci_read_r32(&tb_ptr);
             t1 = tci_read_r(&tb_ptr);
             t2 = tci_read_s32(&tb_ptr);
-            assert(t1 != sp_value || (int32_t)t2 < 0);
+            tci_assert(t1 != sp_value || (int32_t)t2 < 0);
             *(uint32_t *)(t1 + t2) = t0;
             break;
 
@@ -725,7 +728,7 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
             condition = *tb_ptr++;
             label = tci_read_label(&tb_ptr);
             if (tci_compare32(t0, t1, condition)) {
-                assert(tb_ptr == old_code_ptr + op_size);
+                tci_assert(tb_ptr == old_code_ptr + op_size);
                 tb_ptr = (uint8_t *)label;
                 continue;
             }
@@ -751,7 +754,7 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
             condition = *tb_ptr++;
             label = tci_read_label(&tb_ptr);
             if (tci_compare64(tmp64, v64, condition)) {
-                assert(tb_ptr == old_code_ptr + op_size);
+                tci_assert(tb_ptr == old_code_ptr + op_size);
                 tb_ptr = (uint8_t *)label;
                 continue;
             }
@@ -885,7 +888,7 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
             t0 = tci_read_r64(&tb_ptr);
             t1 = tci_read_r(&tb_ptr);
             t2 = tci_read_s32(&tb_ptr);
-            assert(t1 != sp_value || (int32_t)t2 < 0);
+            tci_assert(t1 != sp_value || (int32_t)t2 < 0);
             *(uint64_t *)(t1 + t2) = t0;
             break;
 
@@ -992,7 +995,7 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
             condition = *tb_ptr++;
             label = tci_read_label(&tb_ptr);
             if (tci_compare64(t0, t1, condition)) {
-                assert(tb_ptr == old_code_ptr + op_size);
+                tci_assert(tb_ptr == old_code_ptr + op_size);
                 tb_ptr = (uint8_t *)label;
                 continue;
             }
@@ -1087,7 +1090,7 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
             break;
         case INDEX_op_goto_tb:
             t0 = tci_read_i32(&tb_ptr);
-            assert(tb_ptr == old_code_ptr + op_size);
+            tci_assert(tb_ptr == old_code_ptr + op_size);
             tb_ptr += (int32_t)t0;
             continue;
         case INDEX_op_qemu_ld_i32:
@@ -1234,7 +1237,7 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
             TODO();
             break;
         }
-        assert(tb_ptr == old_code_ptr + op_size);
+        tci_assert(tb_ptr == old_code_ptr + op_size);
     }
 exit:
     return next_tb;
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling
  2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 01/11] tci: Fix build regression Sergey Fedorov
@ 2016-04-07 15:53 ` Sergey Fedorov
  2016-04-07 16:18   ` Cornelia Huck
  2016-04-18 14:51   ` Cornelia Huck
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 03/11] tci: Make direct jump patching thread-safe Sergey Fedorov
                   ` (10 subsequent siblings)
  12 siblings, 2 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 15:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Peter Crosthwaite, Alexander Graf,
	Christian Borntraeger, Paolo Bonzini, Cornelia Huck,
	Sergey Fedorov, Alex Bennée, Richard Henderson

From: Sergey Fedorov <serge.fdrv@gmail.com>

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 pc-bios/s390-ccw/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
index 4208cb429593..5ce6d4ccbaf5 100644
--- a/pc-bios/s390-ccw/Makefile
+++ b/pc-bios/s390-ccw/Makefile
@@ -20,7 +20,7 @@ s390-ccw.elf: $(OBJECTS)
 	$(call quiet-command,$(CC) $(LDFLAGS) -o $@ $(OBJECTS),"  Building $(TARGET_DIR)$@")
 
 s390-ccw.img: s390-ccw.elf
-	$(call quiet-command,strip --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
+	$(call quiet-command,$(STRIP) --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
 
 $(OBJECTS): Makefile
 
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Qemu-devel] [PATCH 03/11] tci: Make direct jump patching thread-safe
  2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 01/11] tci: Fix build regression Sergey Fedorov
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling Sergey Fedorov
@ 2016-04-07 15:53 ` Sergey Fedorov
  2016-04-20  9:42   ` Alex Bennée
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 04/11] tcg/ppc: " Sergey Fedorov
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 15:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Peter Crosthwaite, Stefan Weil, Paolo Bonzini,
	Sergey Fedorov, Alex Bennée, Richard Henderson

From: Sergey Fedorov <serge.fdrv@gmail.com>

Ensure direct jump patching in TCI is atomic by:
 * naturally aligning a location of direct jump address;
 * using atomic_read()/atomic_set() to load/store the address.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 include/exec/exec-all.h  | 2 +-
 tcg/tci/tcg-target.inc.c | 2 ++
 tci.c                    | 5 ++++-
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 736209505a68..59709c9dd5c9 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -302,7 +302,7 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr);
 static inline void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
 {
     /* patch the branch destination */
-    *(uint32_t *)jmp_addr = addr - (jmp_addr + 4);
+    atomic_set((int32_t *)jmp_addr, addr - (jmp_addr + 4));
     /* no need to flush icache explicitly */
 }
 #elif defined(_ARCH_PPC)
diff --git a/tcg/tci/tcg-target.inc.c b/tcg/tci/tcg-target.inc.c
index 4afe4d7a8d59..7e6180e62898 100644
--- a/tcg/tci/tcg-target.inc.c
+++ b/tcg/tci/tcg-target.inc.c
@@ -556,6 +556,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
         if (s->tb_jmp_offset) {
             /* Direct jump method. */
             assert(args[0] < ARRAY_SIZE(s->tb_jmp_offset));
+            /* Align for atomic patching and thread safety */
+            s->code_ptr = (uint8_t *)(((uintptr_t)s->code_ptr + 3) & ~3);
             s->tb_jmp_offset[args[0]] = tcg_current_code_size(s);
             tcg_out32(s, 0);
         } else {
diff --git a/tci.c b/tci.c
index 82705fe77295..531f5ebf2c2e 100644
--- a/tci.c
+++ b/tci.c
@@ -1089,7 +1089,10 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
             goto exit;
             break;
         case INDEX_op_goto_tb:
-            t0 = tci_read_i32(&tb_ptr);
+            /* Jump address is aligned */
+            tb_ptr = (uint8_t *)(((uintptr_t)tb_ptr + 3) & ~3);
+            t0 = atomic_read((int32_t *)tb_ptr);
+            tb_ptr += sizeof(int32_t);
             tci_assert(tb_ptr == old_code_ptr + op_size);
             tb_ptr += (int32_t)t0;
             continue;
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Qemu-devel] [PATCH 04/11] tcg/ppc: Make direct jump patching thread-safe
  2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
                   ` (2 preceding siblings ...)
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 03/11] tci: Make direct jump patching thread-safe Sergey Fedorov
@ 2016-04-07 15:53 ` Sergey Fedorov
  2016-04-20  9:49   ` Alex Bennée
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 05/11] tcg/i386: " Sergey Fedorov
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 15:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Peter Crosthwaite, Vassili Karpov (malc),
	Paolo Bonzini, Sergey Fedorov, Alex Bennée,
	Richard Henderson

From: Sergey Fedorov <serge.fdrv@gmail.com>

Ensure direct jump patching in PPC is atomic by:
 * limiting translation buffer size in 32-bit mode to be addressable by
   Branch I-form instruction;
 * using atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 tcg/ppc/tcg-target.inc.c | 22 ++++++++++++++++++----
 translate-all.c          |  2 ++
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c
index 8c1c2dfa9b22..a7c65fd08854 100644
--- a/tcg/ppc/tcg-target.inc.c
+++ b/tcg/ppc/tcg-target.inc.c
@@ -1237,6 +1237,7 @@ static void tcg_out_brcond2 (TCGContext *s, const TCGArg *args,
     tcg_out_bc(s, BC | BI(7, CR_EQ) | BO_COND_TRUE, arg_label(args[5]));
 }
 
+#ifdef __powerpc64__
 void ppc_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
 {
     tcg_insn_unit i1, i2;
@@ -1265,11 +1266,18 @@ void ppc_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
     pair = (uint64_t)i2 << 32 | i1;
 #endif
 
-    /* ??? __atomic_store_8, presuming there's some way to do that
-       for 32-bit, otherwise this is good enough for 64-bit.  */
-    *(uint64_t *)jmp_addr = pair;
+    atomic_set((uint64_t *)jmp_addr, pair);
     flush_icache_range(jmp_addr, jmp_addr + 8);
 }
+#else
+void ppc_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
+{
+    intptr_t diff = addr - jmp_addr;
+    tcg_debug_assert(in_range_b(diff));
+    atomic_set((uint32_t *)jmp_addr, B | (diff & 0x3fffffc));
+    flush_icache_range(jmp_addr, jmp_addr + 4);
+}
+#endif
 
 static void tcg_out_call(TCGContext *s, tcg_insn_unit *target)
 {
@@ -1895,7 +1903,9 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
         break;
     case INDEX_op_goto_tb:
         tcg_debug_assert(s->tb_jmp_offset);
-        /* Direct jump.  Ensure the next insns are 8-byte aligned. */
+        /* Direct jump. */
+#ifdef __powerpc64__
+        /* Ensure the next insns are 8-byte aligned. */
         if ((uintptr_t)s->code_ptr & 7) {
             tcg_out32(s, NOP);
         }
@@ -1904,6 +1914,10 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
         s->code_ptr += 2;
         tcg_out32(s, MTSPR | RS(TCG_REG_TMP1) | CTR);
         tcg_out32(s, BCCTR | BO_ALWAYS);
+#else
+        /* To be replaced by a branch.  */
+        s->code_ptr++;
+#endif
         s->tb_next_offset[args[0]] = tcg_current_code_size(s);
         break;
     case INDEX_op_br:
diff --git a/translate-all.c b/translate-all.c
index b4df1ec68fa9..9b98a4a36967 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -464,6 +464,8 @@ static inline PageDesc *page_find(tb_page_addr_t index)
 # define MAX_CODE_GEN_BUFFER_SIZE  (2ul * 1024 * 1024 * 1024)
 #elif defined(__powerpc64__)
 # define MAX_CODE_GEN_BUFFER_SIZE  (2ul * 1024 * 1024 * 1024)
+#elif defined(__powerpc__)
+# define MAX_CODE_GEN_BUFFER_SIZE  (32u * 1024 * 1024)
 #elif defined(__aarch64__)
 # define MAX_CODE_GEN_BUFFER_SIZE  (128ul * 1024 * 1024)
 #elif defined(__arm__)
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Qemu-devel] [PATCH 05/11] tcg/i386: Make direct jump patching thread-safe
  2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
                   ` (3 preceding siblings ...)
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 04/11] tcg/ppc: " Sergey Fedorov
@ 2016-04-07 15:53 ` Sergey Fedorov
  2016-04-20  9:55   ` Alex Bennée
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 06/11] tcg/s390: " Sergey Fedorov
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 15:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Peter Crosthwaite, Paolo Bonzini, Sergey Fedorov,
	Alex Bennée, Richard Henderson

From: Sergey Fedorov <serge.fdrv@gmail.com>

Ensure direct jump patching in i386 is atomic by:
 * naturally aligning a location of direct jump address;
 * using atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 include/exec/exec-all.h   |  2 +-
 tcg/i386/tcg-target.inc.c | 17 +++++++++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 59709c9dd5c9..82399175fe80 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -312,7 +312,7 @@ void ppc_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr);
 static inline void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
 {
     /* patch the branch destination */
-    stl_le_p((void*)jmp_addr, addr - (jmp_addr + 4));
+    atomic_set((int32_t *)jmp_addr, addr - (jmp_addr + 4));
     /* no need to flush icache explicitly */
 }
 #elif defined(__s390x__)
diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index 9187d34caf6d..3ffb7b3124d8 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -1123,6 +1123,19 @@ static void tcg_out_jmp(TCGContext *s, tcg_insn_unit *dest)
     tcg_out_branch(s, 0, dest);
 }
 
+static void tcg_out_nopn(TCGContext *s, int n)
+{
+    static const uint8_t nop1[] = { 0x90 };
+    static const uint8_t nop2[] = { 0x66, 0x90 };
+    static const uint8_t nop3[] = { 0x8d, 0x76, 0x00 };
+    static const uint8_t *const nopn[] = { nop1, nop2, nop3 };
+    int i;
+    assert(n <= ARRAY_SIZE(nopn));
+    for (i = 0; i < n; ++i) {
+        tcg_out8(s, nopn[n - 1][i]);
+    }
+}
+
 #if defined(CONFIG_SOFTMMU)
 /* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
  *                                     int mmu_idx, uintptr_t ra)
@@ -1777,6 +1790,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_goto_tb:
         if (s->tb_jmp_offset) {
             /* direct jump method */
+            /* align jump displacement for atomic pathing */
+            if (((uintptr_t)s->code_ptr & 3) != 3) {
+                tcg_out_nopn(s, 3 - ((uintptr_t)s->code_ptr & 3));
+            }
             tcg_out8(s, OPC_JMP_long); /* jmp im */
             s->tb_jmp_offset[args[0]] = tcg_current_code_size(s);
             tcg_out32(s, 0);
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Qemu-devel] [PATCH 06/11] tcg/s390: Make direct jump patching thread-safe
  2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
                   ` (4 preceding siblings ...)
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 05/11] tcg/i386: " Sergey Fedorov
@ 2016-04-07 15:53 ` Sergey Fedorov
  2016-04-20 10:01   ` Alex Bennée
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 07/11] tcg/arm: " Sergey Fedorov
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 15:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Peter Crosthwaite, Alexander Graf, Paolo Bonzini,
	Sergey Fedorov, Alex Bennée, Richard Henderson

From: Sergey Fedorov <serge.fdrv@gmail.com>

Ensure direct jump patching in s390 is atomic by:
 * naturally aligning a location of direct jump address;
 * using atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 include/exec/exec-all.h   | 2 +-
 tcg/s390/tcg-target.inc.c | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 82399175fe80..e18cc24e50f0 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -320,7 +320,7 @@ static inline void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
 {
     /* patch the branch destination */
     intptr_t disp = addr - (jmp_addr - 2);
-    stl_be_p((void*)jmp_addr, disp / 2);
+    atomic_set((int32_t *)jmp_addr, disp / 2);
     /* no need to flush icache explicitly */
 }
 #elif defined(__aarch64__)
diff --git a/tcg/s390/tcg-target.inc.c b/tcg/s390/tcg-target.inc.c
index fbf97bb2e15d..84221dfb68c9 100644
--- a/tcg/s390/tcg-target.inc.c
+++ b/tcg/s390/tcg-target.inc.c
@@ -219,6 +219,8 @@ typedef enum S390Opcode {
     RX_ST       = 0x50,
     RX_STC      = 0x42,
     RX_STH      = 0x40,
+
+    NOP         = 0x0707,
 } S390Opcode;
 
 #ifndef NDEBUG
@@ -1716,6 +1718,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_goto_tb:
         if (s->tb_jmp_offset) {
+            /* align branch displacement for atomic pathing */
+            if (((uintptr_t)s->code_ptr & 3) == 0) {
+                tcg_out16(s, NOP);
+            }
             tcg_out16(s, RIL_BRCL | (S390_CC_ALWAYS << 4));
             s->tb_jmp_offset[args[0]] = tcg_current_code_size(s);
             s->code_ptr += 2;
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Qemu-devel] [PATCH 07/11] tcg/arm: Make direct jump patching thread-safe
  2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
                   ` (5 preceding siblings ...)
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 06/11] tcg/s390: " Sergey Fedorov
@ 2016-04-07 15:53 ` Sergey Fedorov
  2016-04-20 13:33   ` Alex Bennée
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 08/11] tcg/aarch64: " Sergey Fedorov
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 15:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Peter Crosthwaite, qemu-arm, Paolo Bonzini,
	Sergey Fedorov, Alex Bennée, Richard Henderson

From: Sergey Fedorov <serge.fdrv@gmail.com>

Ensure direct jump patching in ARM is atomic by using
atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 include/exec/exec-all.h  | 25 ++-----------------------
 tcg/arm/tcg-target.inc.c | 17 +++++++++++++++++
 2 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index e18cc24e50f0..6a054ee720a8 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -327,29 +327,8 @@ static inline void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
 void aarch64_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr);
 #define tb_set_jmp_target1 aarch64_tb_set_jmp_target
 #elif defined(__arm__)
-static inline void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
-{
-#if !QEMU_GNUC_PREREQ(4, 1)
-    register unsigned long _beg __asm ("a1");
-    register unsigned long _end __asm ("a2");
-    register unsigned long _flg __asm ("a3");
-#endif
-
-    /* we could use a ldr pc, [pc, #-4] kind of branch and avoid the flush */
-    *(uint32_t *)jmp_addr =
-        (*(uint32_t *)jmp_addr & ~0xffffff)
-        | (((addr - (jmp_addr + 8)) >> 2) & 0xffffff);
-
-#if QEMU_GNUC_PREREQ(4, 1)
-    __builtin___clear_cache((char *) jmp_addr, (char *) jmp_addr + 4);
-#else
-    /* flush icache */
-    _beg = jmp_addr;
-    _end = jmp_addr + 4;
-    _flg = 0;
-    __asm __volatile__ ("swi 0x9f0002" : : "r" (_beg), "r" (_end), "r" (_flg));
-#endif
-}
+void arm_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr);
+#define tb_set_jmp_target1 arm_tb_set_jmp_target
 #elif defined(__sparc__) || defined(__mips__)
 void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr);
 #else
diff --git a/tcg/arm/tcg-target.inc.c b/tcg/arm/tcg-target.inc.c
index 3edf6a6f971c..5c69de20bc69 100644
--- a/tcg/arm/tcg-target.inc.c
+++ b/tcg/arm/tcg-target.inc.c
@@ -121,6 +121,13 @@ static inline void reloc_pc24(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
     *code_ptr = (*code_ptr & ~0xffffff) | (offset & 0xffffff);
 }
 
+static inline void reloc_pc24_atomic(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
+{
+    ptrdiff_t offset = (tcg_ptr_byte_diff(target, code_ptr) - 8) >> 2;
+    tcg_insn_unit insn = atomic_read(code_ptr);
+    atomic_set(code_ptr, (insn & ~0xffffff) | (offset & 0xffffff));
+}
+
 static void patch_reloc(tcg_insn_unit *code_ptr, int type,
                         intptr_t value, intptr_t addend)
 {
@@ -1038,6 +1045,16 @@ static void tcg_out_call(TCGContext *s, tcg_insn_unit *addr)
     }
 }
 
+void arm_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
+{
+    tcg_insn_unit *code_ptr = (tcg_insn_unit *)jmp_addr;
+    tcg_insn_unit *target = (tcg_insn_unit *)addr;
+
+    /* we could use a ldr pc, [pc, #-4] kind of branch and avoid the flush */
+    reloc_pc24_atomic(code_ptr, target);
+    flush_icache_range(jmp_addr, jmp_addr + 4);
+}
+
 static inline void tcg_out_goto_label(TCGContext *s, int cond, TCGLabel *l)
 {
     if (l->has_value) {
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Qemu-devel] [PATCH 08/11] tcg/aarch64: Make direct jump patching thread-safe
  2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
                   ` (6 preceding siblings ...)
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 07/11] tcg/arm: " Sergey Fedorov
@ 2016-04-07 15:53 ` Sergey Fedorov
  2016-04-20 14:01   ` Alex Bennée
  2016-04-21 15:47   ` Sergey Fedorov
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 09/11] tcg/sparc: " Sergey Fedorov
                   ` (4 subsequent siblings)
  12 siblings, 2 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 15:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Peter Crosthwaite, Claudio Fontana, qemu-arm,
	Paolo Bonzini, Sergey Fedorov, Alex Bennée,
	Richard Henderson

From: Sergey Fedorov <serge.fdrv@gmail.com>

Ensure direct jump patching in AArch64 is atomic by using
atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 tcg/aarch64/tcg-target.inc.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
index 0ed10a974121..15fdebec921f 100644
--- a/tcg/aarch64/tcg-target.inc.c
+++ b/tcg/aarch64/tcg-target.inc.c
@@ -73,6 +73,18 @@ static inline void reloc_pc26(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
     *code_ptr = deposit32(*code_ptr, 0, 26, offset);
 }
 
+static inline void reloc_pc26_atomic(tcg_insn_unit *code_ptr,
+                                     tcg_insn_unit *target)
+{
+    ptrdiff_t offset = target - code_ptr;
+    tcg_insn_unit insn;
+    assert(offset == sextract64(offset, 0, 26));
+    /* read instruction, mask away previous PC_REL26 parameter contents,
+       set the proper offset, then write back the instruction. */
+    insn = atomic_read(code_ptr);
+    atomic_set(code_ptr, deposit32(insn, 0, 26, offset));
+}
+
 static inline void reloc_pc19(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
 {
     ptrdiff_t offset = target - code_ptr;
@@ -835,7 +847,7 @@ void aarch64_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
     tcg_insn_unit *code_ptr = (tcg_insn_unit *)jmp_addr;
     tcg_insn_unit *target = (tcg_insn_unit *)addr;
 
-    reloc_pc26(code_ptr, target);
+    reloc_pc26_atomic(code_ptr, target);
     flush_icache_range(jmp_addr, jmp_addr + 4);
 }
 
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Qemu-devel] [PATCH 09/11] tcg/sparc: Make direct jump patching thread-safe
  2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
                   ` (7 preceding siblings ...)
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 08/11] tcg/aarch64: " Sergey Fedorov
@ 2016-04-07 15:53 ` Sergey Fedorov
  2016-04-20 14:23   ` Alex Bennée
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 10/11] tcg/mips: " Sergey Fedorov
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 15:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Peter Crosthwaite, Blue Swirl, Paolo Bonzini,
	Sergey Fedorov, Alex Bennée, Richard Henderson

From: Sergey Fedorov <serge.fdrv@gmail.com>

Ensure direct jump patching in SPARC is atomic by using
atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 tcg/sparc/tcg-target.inc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcg/sparc/tcg-target.inc.c b/tcg/sparc/tcg-target.inc.c
index 54df1bc42432..629dd7bdafff 100644
--- a/tcg/sparc/tcg-target.inc.c
+++ b/tcg/sparc/tcg-target.inc.c
@@ -1647,6 +1647,6 @@ void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
        the code_gen_buffer can't be larger than 2GB.  */
     assert(disp == (int32_t)disp);
 
-    *ptr = CALL | (uint32_t)disp >> 2;
+    atomic_set(ptr, CALL | (uint32_t)disp >> 2);
     flush_icache_range(jmp_addr, jmp_addr + 4);
 }
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Qemu-devel] [PATCH 10/11] tcg/mips: Make direct jump patching thread-safe
  2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
                   ` (8 preceding siblings ...)
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 09/11] tcg/sparc: " Sergey Fedorov
@ 2016-04-07 15:53 ` Sergey Fedorov
  2016-04-07 16:01   ` Paolo Bonzini
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 11/11] tcg: Note requirement on atomic direct jump patching Sergey Fedorov
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 15:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Peter Crosthwaite, Paolo Bonzini, Sergey Fedorov,
	Alex Bennée, Aurelien Jarno, Richard Henderson

From: Sergey Fedorov <serge.fdrv@gmail.com>

Ensure direct jump patching in MIPS is atomic by using
atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 tcg/mips/tcg-target.inc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tcg/mips/tcg-target.inc.c b/tcg/mips/tcg-target.inc.c
index 682e19897db0..e65835d2096b 100644
--- a/tcg/mips/tcg-target.inc.c
+++ b/tcg/mips/tcg-target.inc.c
@@ -1886,6 +1886,7 @@ static void tcg_target_init(TCGContext *s)
 void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
 {
     uint32_t *ptr = (uint32_t *)jmp_addr;
-    *ptr = deposit32(*ptr, 0, 26, addr >> 2);
+    uint32_t insn = atomic_read(ptr);
+    atomic_write(ptr, deposit32(insn, 0, 26, addr >> 2));
     flush_icache_range(jmp_addr, jmp_addr + 4);
 }
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Qemu-devel] [PATCH 11/11] tcg: Note requirement on atomic direct jump patching
  2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
                   ` (9 preceding siblings ...)
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 10/11] tcg/mips: " Sergey Fedorov
@ 2016-04-07 15:53 ` Sergey Fedorov
  2016-04-20 14:25   ` Alex Bennée
  2016-04-07 15:56 ` [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
  2016-04-20  8:44 ` Alex Bennée
  12 siblings, 1 reply; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 15:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Sergey Fedorov, Peter Crosthwaite, Paolo Bonzini, Sergey Fedorov,
	Alex Bennée, Richard Henderson

From: Sergey Fedorov <serge.fdrv@gmail.com>

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
---
 include/exec/exec-all.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 6a054ee720a8..1878e7b7856c 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -229,6 +229,7 @@ static inline void tlb_flush_by_mmuidx(CPUState *cpu, ...)
     || defined(__sparc__) || defined(__aarch64__) \
     || defined(__s390x__) || defined(__mips__) \
     || defined(CONFIG_TCG_INTERPRETER)
+/* NOTE: Direct jump patching must be atomic and thread-safe. */
 #define USE_DIRECT_JUMP
 #endif
 
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe
  2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
                   ` (10 preceding siblings ...)
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 11/11] tcg: Note requirement on atomic direct jump patching Sergey Fedorov
@ 2016-04-07 15:56 ` Sergey Fedorov
  2016-04-20  8:44 ` Alex Bennée
  12 siblings, 0 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 15:56 UTC (permalink / raw)
  To: Sergey Fedorov, qemu-devel
  Cc: Paolo Bonzini, Richard Henderson, Alex Bennée, Peter Crosthwaite

On 07/04/16 18:53, Sergey Fedorov wrote:
> From: Sergey Fedorov <serge.fdrv@gmail.com>
>
> When patching translated code for direct block chaining/unchaining,
> modification of concurrently executing code can happen in multi-threaded
> execution.  Currently only user-mode is affected. To make direct block patching
> safe, some care must be taken to make sure that the code modification is made
> atomic and concurrently executed code is guaranteed to be consistent.
>
> This patch series fixes all supported TCG targets using direct patching and
> documents the requirement for direct jump patching be atomic and thread-safe.
>
> The series' tree can be found in a public git repository [1].
>
> [1] https://github.com/sergefdrv/qemu/tree/atomic-tb-patching
>

Sorry, the first two patches got there by mistake...

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 10/11] tcg/mips: Make direct jump patching thread-safe
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 10/11] tcg/mips: " Sergey Fedorov
@ 2016-04-07 16:01   ` Paolo Bonzini
  2016-04-07 16:09     ` Sergey Fedorov
  0 siblings, 1 reply; 57+ messages in thread
From: Paolo Bonzini @ 2016-04-07 16:01 UTC (permalink / raw)
  To: Sergey Fedorov, qemu-devel
  Cc: Sergey Fedorov, Richard Henderson, Alex Bennée,
	Aurelien Jarno, Peter Crosthwaite



On 07/04/2016 17:53, Sergey Fedorov wrote:
> Ensure direct jump patching in MIPS is atomic by using
> atomic_read()/atomic_set() for code patching.
> 
> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  tcg/mips/tcg-target.inc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tcg/mips/tcg-target.inc.c b/tcg/mips/tcg-target.inc.c
> index 682e19897db0..e65835d2096b 100644
> --- a/tcg/mips/tcg-target.inc.c
> +++ b/tcg/mips/tcg-target.inc.c
> @@ -1886,6 +1886,7 @@ static void tcg_target_init(TCGContext *s)
>  void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
>  {
>      uint32_t *ptr = (uint32_t *)jmp_addr;
> -    *ptr = deposit32(*ptr, 0, 26, addr >> 2);
> +    uint32_t insn = atomic_read(ptr);
> +    atomic_write(ptr, deposit32(insn, 0, 26, addr >> 2));

Oops. :)

Paolo

>      flush_icache_range(jmp_addr, jmp_addr + 4);
>  }

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 10/11] tcg/mips: Make direct jump patching thread-safe
  2016-04-07 16:01   ` Paolo Bonzini
@ 2016-04-07 16:09     ` Sergey Fedorov
  0 siblings, 0 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 16:09 UTC (permalink / raw)
  To: Paolo Bonzini, Sergey Fedorov, qemu-devel
  Cc: Richard Henderson, Alex Bennée, Aurelien Jarno, Peter Crosthwaite

On 07/04/16 19:01, Paolo Bonzini wrote:
>
> On 07/04/2016 17:53, Sergey Fedorov wrote:
>> Ensure direct jump patching in MIPS is atomic by using
>> atomic_read()/atomic_set() for code patching.
>>
>> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>> ---
>>  tcg/mips/tcg-target.inc.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/tcg/mips/tcg-target.inc.c b/tcg/mips/tcg-target.inc.c
>> index 682e19897db0..e65835d2096b 100644
>> --- a/tcg/mips/tcg-target.inc.c
>> +++ b/tcg/mips/tcg-target.inc.c
>> @@ -1886,6 +1886,7 @@ static void tcg_target_init(TCGContext *s)
>>  void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
>>  {
>>      uint32_t *ptr = (uint32_t *)jmp_addr;
>> -    *ptr = deposit32(*ptr, 0, 26, addr >> 2);
>> +    uint32_t insn = atomic_read(ptr);
>> +    atomic_write(ptr, deposit32(insn, 0, 26, addr >> 2));
> Oops. :)
>
> Paolo

Nice catch! I think I just didn't cross-compile it for MIPS...

Thanks,
Sergey

>
>>      flush_icache_range(jmp_addr, jmp_addr + 4);
>>  }

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling Sergey Fedorov
@ 2016-04-07 16:18   ` Cornelia Huck
  2016-04-07 16:22     ` Sergey Fedorov
  2016-04-18 13:15     ` Sergey Fedorov
  2016-04-18 14:51   ` Cornelia Huck
  1 sibling, 2 replies; 57+ messages in thread
From: Cornelia Huck @ 2016-04-07 16:18 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: Peter Crosthwaite, qemu-devel, Alexander Graf,
	Christian Borntraeger, Paolo Bonzini, Sergey Fedorov,
	Alex Bennée, Richard Henderson

On Thu,  7 Apr 2016 18:53:44 +0300
Sergey Fedorov <sergey.fedorov@linaro.org> wrote:

> From: Sergey Fedorov <serge.fdrv@gmail.com>
> 
> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>

Any reason you use two different email addresses here? :)

> ---
>  pc-bios/s390-ccw/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
> index 4208cb429593..5ce6d4ccbaf5 100644
> --- a/pc-bios/s390-ccw/Makefile
> +++ b/pc-bios/s390-ccw/Makefile
> @@ -20,7 +20,7 @@ s390-ccw.elf: $(OBJECTS)
>  	$(call quiet-command,$(CC) $(LDFLAGS) -o $@ $(OBJECTS),"  Building $(TARGET_DIR)$@")
> 
>  s390-ccw.img: s390-ccw.elf
> -	$(call quiet-command,strip --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
> +	$(call quiet-command,$(STRIP) --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
> 
>  $(OBJECTS): Makefile
> 

Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>

I can take this through my queue for 2.7, if nothing else depends on it.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling
  2016-04-07 16:18   ` Cornelia Huck
@ 2016-04-07 16:22     ` Sergey Fedorov
  2016-04-18 13:15     ` Sergey Fedorov
  1 sibling, 0 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-07 16:22 UTC (permalink / raw)
  To: Cornelia Huck, Sergey Fedorov
  Cc: Peter Crosthwaite, qemu-devel, Alexander Graf,
	Christian Borntraeger, Paolo Bonzini, Alex Bennée,
	Richard Henderson

On 07/04/16 19:18, Cornelia Huck wrote:
> On Thu,  7 Apr 2016 18:53:44 +0300
> Sergey Fedorov <sergey.fedorov@linaro.org> wrote:
>
>> From: Sergey Fedorov <serge.fdrv@gmail.com>
>>
>> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> Any reason you use two different email addresses here? :)

I'd like people could reach me after my @linaro.org account get
disabled. That's why I put my private address. On the other hand, I'd
like to signify I did this work under Linaro. That's why I put another
s-o-b tag.

Thanks,
Sergey

>
>> ---
>>  pc-bios/s390-ccw/Makefile | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
>> index 4208cb429593..5ce6d4ccbaf5 100644
>> --- a/pc-bios/s390-ccw/Makefile
>> +++ b/pc-bios/s390-ccw/Makefile
>> @@ -20,7 +20,7 @@ s390-ccw.elf: $(OBJECTS)
>>  	$(call quiet-command,$(CC) $(LDFLAGS) -o $@ $(OBJECTS),"  Building $(TARGET_DIR)$@")
>>
>>  s390-ccw.img: s390-ccw.elf
>> -	$(call quiet-command,strip --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
>> +	$(call quiet-command,$(STRIP) --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
>>
>>  $(OBJECTS): Makefile
>>
> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
>
> I can take this through my queue for 2.7, if nothing else depends on it.
>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 01/11] tci: Fix build regression
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 01/11] tci: Fix build regression Sergey Fedorov
@ 2016-04-07 18:15   ` Richard Henderson
  2016-04-07 19:16     ` Stefan Weil
  0 siblings, 1 reply; 57+ messages in thread
From: Richard Henderson @ 2016-04-07 18:15 UTC (permalink / raw)
  To: Sergey Fedorov, qemu-devel
  Cc: Alex Bennée, Sergey Fedorov, Paolo Bonzini,
	Peter Crosthwaite, Stefan Weil

On 04/07/2016 08:53 AM, Sergey Fedorov wrote:
> +/* Enable TCI assertions only when debugging TCG (and without NDEBUG defined).
> + * Without assertions, the interpreter runs much faster. */
> +#if defined(CONFIG_DEBUG_TCG)
> +# define tci_assert(cond) assert(cond)
> +#else
> +# define tci_assert(cond) ((void)0)
>   #endif
>

Please just use tcg_debug_assert.


r~

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 01/11] tci: Fix build regression
  2016-04-07 18:15   ` Richard Henderson
@ 2016-04-07 19:16     ` Stefan Weil
  2016-04-07 20:37       ` Stefan Weil
  0 siblings, 1 reply; 57+ messages in thread
From: Stefan Weil @ 2016-04-07 19:16 UTC (permalink / raw)
  To: Richard Henderson, Sergey Fedorov, qemu-devel
  Cc: Alex Bennée, Sergey Fedorov, Paolo Bonzini,
	Peter Crosthwaite, Michael Roth, Peter Maydell

Am 07.04.2016 um 20:15 schrieb Richard Henderson:
> On 04/07/2016 08:53 AM, Sergey Fedorov wrote:
>> +/* Enable TCI assertions only when debugging TCG (and without NDEBUG
>> defined).
>> + * Without assertions, the interpreter runs much faster. */
>> +#if defined(CONFIG_DEBUG_TCG)
>> +# define tci_assert(cond) assert(cond)
>> +#else
>> +# define tci_assert(cond) ((void)0)
>>   #endif
>>
> 
> Please just use tcg_debug_assert.
> 
> 
> r~


Hi Richard,

that's a good suggestion, but maybe a little late for 2.6-rc2.
I already sent a pull request an hour ago after Michael had added
his tested-by.

My first priority is fixing the build regression in 2.6. I can
try to prepare a new patch, wait for reviews and send a pull
request, but I am afraid this might not be finished in time
for 2.6.

Regards
Stefan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 01/11] tci: Fix build regression
  2016-04-07 19:16     ` Stefan Weil
@ 2016-04-07 20:37       ` Stefan Weil
  2016-04-08  3:40         ` Richard Henderson
  0 siblings, 1 reply; 57+ messages in thread
From: Stefan Weil @ 2016-04-07 20:37 UTC (permalink / raw)
  To: Richard Henderson, Sergey Fedorov, qemu-devel
  Cc: Alex Bennée, Sergey Fedorov, Paolo Bonzini,
	Peter Crosthwaite, Michael Roth, Peter Maydell

Am 07.04.2016 um 21:16 schrieb Stefan Weil:
> Am 07.04.2016 um 20:15 schrieb Richard Henderson:
>> On 04/07/2016 08:53 AM, Sergey Fedorov wrote:
>>> +/* Enable TCI assertions only when debugging TCG (and without NDEBUG
>>> defined).
>>> + * Without assertions, the interpreter runs much faster. */
>>> +#if defined(CONFIG_DEBUG_TCG)
>>> +# define tci_assert(cond) assert(cond)
>>> +#else
>>> +# define tci_assert(cond) ((void)0)
>>>   #endif
>>>
>> Please just use tcg_debug_assert.
>>
>>
>> r~
>
> Hi Richard,
>
> that's a good suggestion, but maybe a little late for 2.6-rc2.
> I already sent a pull request an hour ago after Michael had added
> his tested-by.
>
> My first priority is fixing the build regression in 2.6. I can
> try to prepare a new patch, wait for reviews and send a pull
> request, but I am afraid this might not be finished in time
> for 2.6.
>
> Regards
> Stefan

I just tested a variant with tcg_debug_assert. It creates less efficient
code when debugging is disabled. Here is the code size for x86_64:

with normal tcg_debug_assert:

   text       data        bss        dec        hex    filename
   8293          0        128       8421       20e5   
bin/ndebug/x86_64-linux-gnu,tci/x86_64-softmmu/tci.o

In gdb I can also see assembler code for op_size at the beginning
of the for loop in tcg_qemu_tb_exec. This slows down the interpreter.

with ((void)0):

   text       data        bss        dec        hex    filename
   8135          0        128       8263       2047   
bin/ndebug/x86_64-linux-gnu,tci/x86_64-softmmu/tci.o

Therefore I'd prefer using my pull request for 2.6.

Stefan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 01/11] tci: Fix build regression
  2016-04-07 20:37       ` Stefan Weil
@ 2016-04-08  3:40         ` Richard Henderson
  0 siblings, 0 replies; 57+ messages in thread
From: Richard Henderson @ 2016-04-08  3:40 UTC (permalink / raw)
  To: Stefan Weil, Sergey Fedorov, qemu-devel
  Cc: Alex Bennée, Sergey Fedorov, Paolo Bonzini,
	Peter Crosthwaite, Michael Roth, Peter Maydell

On 04/07/2016 01:37 PM, Stefan Weil wrote:
> I just tested a variant with tcg_debug_assert. It creates less efficient
> code when debugging is disabled. Here is the code size for x86_64:
>
> with normal tcg_debug_assert:
>
>     text       data        bss        dec        hex    filename
>     8293          0        128       8421       20e5
> bin/ndebug/x86_64-linux-gnu,tci/x86_64-softmmu/tci.o
>
> In gdb I can also see assembler code for op_size at the beginning
> of the for loop in tcg_qemu_tb_exec. This slows down the interpreter.

Please file a gcc bug.

A branch across __builtin_unreachable is supposed to fold to a simple 
compile-time assert, providing extra information to the compiler but not 
generating code.


r~

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling
  2016-04-07 16:18   ` Cornelia Huck
  2016-04-07 16:22     ` Sergey Fedorov
@ 2016-04-18 13:15     ` Sergey Fedorov
  1 sibling, 0 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-18 13:15 UTC (permalink / raw)
  To: Cornelia Huck, Sergey Fedorov
  Cc: qemu-devel, Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Christian Borntraeger, Alexander Graf

On 07/04/16 19:18, Cornelia Huck wrote:
> On Thu,  7 Apr 2016 18:53:44 +0300
> Sergey Fedorov <sergey.fedorov@linaro.org> wrote:
(snip)
>> ---
>>  pc-bios/s390-ccw/Makefile | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
>> index 4208cb429593..5ce6d4ccbaf5 100644
>> --- a/pc-bios/s390-ccw/Makefile
>> +++ b/pc-bios/s390-ccw/Makefile
>> @@ -20,7 +20,7 @@ s390-ccw.elf: $(OBJECTS)
>>  	$(call quiet-command,$(CC) $(LDFLAGS) -o $@ $(OBJECTS),"  Building $(TARGET_DIR)$@")
>>
>>  s390-ccw.img: s390-ccw.elf
>> -	$(call quiet-command,strip --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
>> +	$(call quiet-command,$(STRIP) --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
>>
>>  $(OBJECTS): Makefile
>>
> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
>
> I can take this through my queue for 2.7, if nothing else depends on it.
>

Yes, please :)

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling Sergey Fedorov
  2016-04-07 16:18   ` Cornelia Huck
@ 2016-04-18 14:51   ` Cornelia Huck
  2016-04-18 15:34     ` Cornelia Huck
  1 sibling, 1 reply; 57+ messages in thread
From: Cornelia Huck @ 2016-04-18 14:51 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: qemu-devel, Alex Bennée, Sergey Fedorov, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Christian Borntraeger,
	Alexander Graf

On Thu,  7 Apr 2016 18:53:44 +0300
Sergey Fedorov <sergey.fedorov@linaro.org> wrote:

> From: Sergey Fedorov <serge.fdrv@gmail.com>
> 
> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  pc-bios/s390-ccw/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
> index 4208cb429593..5ce6d4ccbaf5 100644
> --- a/pc-bios/s390-ccw/Makefile
> +++ b/pc-bios/s390-ccw/Makefile
> @@ -20,7 +20,7 @@ s390-ccw.elf: $(OBJECTS)
>  	$(call quiet-command,$(CC) $(LDFLAGS) -o $@ $(OBJECTS),"  Building $(TARGET_DIR)$@")
> 
>  s390-ccw.img: s390-ccw.elf
> -	$(call quiet-command,strip --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
> +	$(call quiet-command,$(STRIP) --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
> 
>  $(OBJECTS): Makefile
> 

Thanks, applied to s390-next.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling
  2016-04-18 14:51   ` Cornelia Huck
@ 2016-04-18 15:34     ` Cornelia Huck
  2016-04-18 15:47       ` Sergey Fedorov
  0 siblings, 1 reply; 57+ messages in thread
From: Cornelia Huck @ 2016-04-18 15:34 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: qemu-devel, Alex Bennée, Sergey Fedorov, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Christian Borntraeger,
	Alexander Graf

On Mon, 18 Apr 2016 16:51:16 +0200
Cornelia Huck <cornelia.huck@de.ibm.com> wrote:

> On Thu,  7 Apr 2016 18:53:44 +0300
> Sergey Fedorov <sergey.fedorov@linaro.org> wrote:
> 
> > From: Sergey Fedorov <serge.fdrv@gmail.com>
> > 
> > Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> > Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> > ---
> >  pc-bios/s390-ccw/Makefile | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
> > index 4208cb429593..5ce6d4ccbaf5 100644
> > --- a/pc-bios/s390-ccw/Makefile
> > +++ b/pc-bios/s390-ccw/Makefile
> > @@ -20,7 +20,7 @@ s390-ccw.elf: $(OBJECTS)
> >  	$(call quiet-command,$(CC) $(LDFLAGS) -o $@ $(OBJECTS),"  Building $(TARGET_DIR)$@")
> > 
> >  s390-ccw.img: s390-ccw.elf
> > -	$(call quiet-command,strip --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
> > +	$(call quiet-command,$(STRIP) --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
> > 
> >  $(OBJECTS): Makefile
> > 
> 
> Thanks, applied to s390-next.

Uhm, scratch that.

This fails to build with --disable-strip, as $STRIP is unset in that
case:

  Building s390-ccw/s390-ccw.elf
  Stripping s390-ccw/s390-ccw.img
/bin/sh: --strip-unneeded: command not found
make[1]: *** [s390-ccw.img] Error 127


The catch is that we always want to strip that binary. Care to send a
patch that deals with that?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling
  2016-04-18 15:34     ` Cornelia Huck
@ 2016-04-18 15:47       ` Sergey Fedorov
  2016-04-21 17:36         ` Sergey Fedorov
  0 siblings, 1 reply; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-18 15:47 UTC (permalink / raw)
  To: Cornelia Huck, Sergey Fedorov
  Cc: qemu-devel, Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Christian Borntraeger, Alexander Graf

On 18/04/16 18:34, Cornelia Huck wrote:
> On Mon, 18 Apr 2016 16:51:16 +0200
> Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
>
>> On Thu,  7 Apr 2016 18:53:44 +0300
>> Sergey Fedorov <sergey.fedorov@linaro.org> wrote:
>>
>>> From: Sergey Fedorov <serge.fdrv@gmail.com>
>>>
>>> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
>>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>>> ---
>>>  pc-bios/s390-ccw/Makefile | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
>>> index 4208cb429593..5ce6d4ccbaf5 100644
>>> --- a/pc-bios/s390-ccw/Makefile
>>> +++ b/pc-bios/s390-ccw/Makefile
>>> @@ -20,7 +20,7 @@ s390-ccw.elf: $(OBJECTS)
>>>  	$(call quiet-command,$(CC) $(LDFLAGS) -o $@ $(OBJECTS),"  Building $(TARGET_DIR)$@")
>>>
>>>  s390-ccw.img: s390-ccw.elf
>>> -	$(call quiet-command,strip --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
>>> +	$(call quiet-command,$(STRIP) --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
>>>
>>>  $(OBJECTS): Makefile
>>>
>> Thanks, applied to s390-next.
> Uhm, scratch that.
>
> This fails to build with --disable-strip, as $STRIP is unset in that
> case:
>
>   Building s390-ccw/s390-ccw.elf
>   Stripping s390-ccw/s390-ccw.img
> /bin/sh: --strip-unneeded: command not found
> make[1]: *** [s390-ccw.img] Error 127
>
>
> The catch is that we always want to strip that binary. Care to send a
> patch that deals with that?
>

I see the problem. I don't promise to fix it soon, but I could try
dealing with this as I have time. I don't mind if someone else can just
fix it with their own patch :)

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe
  2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
                   ` (11 preceding siblings ...)
  2016-04-07 15:56 ` [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
@ 2016-04-20  8:44 ` Alex Bennée
  12 siblings, 0 replies; 57+ messages in thread
From: Alex Bennée @ 2016-04-20  8:44 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: qemu-devel, Sergey Fedorov, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson


Sergey Fedorov <sergey.fedorov@linaro.org> writes:

> From: Sergey Fedorov <serge.fdrv@gmail.com>
>
> When patching translated code for direct block chaining/unchaining,
> modification of concurrently executing code can happen in multi-threaded
> execution.  Currently only user-mode is affected. To make direct block patching
> safe, some care must be taken to make sure that the code modification is made
> atomic and concurrently executed code is guaranteed to be consistent.
>
> This patch series fixes all supported TCG targets using direct patching and
> documents the requirement for direct jump patching be atomic and thread-safe.
>
> The series' tree can be found in a public git repository [1].
>
> [1] https://github.com/sergefdrv/qemu/tree/atomic-tb-patching

So I already know this will need a re-base as the patches don't apply
cleanly to the current master. The git tree re-based without issue so I
guess that is just git am being a bit crap.

>
> Sergey Fedorov (10):
>   pc-bios/s390-ccw: Use correct strip when cross-compiling
>   tci: Make direct jump patching thread-safe
>   tcg/ppc: Make direct jump patching thread-safe
>   tcg/i386: Make direct jump patching thread-safe
>   tcg/s390: Make direct jump patching thread-safe
>   tcg/arm: Make direct jump patching thread-safe
>   tcg/aarch64: Make direct jump patching thread-safe
>   tcg/sparc: Make direct jump patching thread-safe
>   tcg/mips: Make direct jump patching thread-safe
>   tcg: Note requirement on atomic direct jump patching
>
> Stefan Weil (1):
>   tci: Fix build regression
>
>  include/exec/exec-all.h      | 32 ++++++------------------------
>  pc-bios/s390-ccw/Makefile    |  2 +-
>  tcg/aarch64/tcg-target.inc.c | 14 +++++++++++++-
>  tcg/arm/tcg-target.inc.c     | 17 ++++++++++++++++
>  tcg/i386/tcg-target.inc.c    | 17 ++++++++++++++++
>  tcg/mips/tcg-target.inc.c    |  3 ++-
>  tcg/ppc/tcg-target.inc.c     | 22 +++++++++++++++++----
>  tcg/s390/tcg-target.inc.c    |  6 ++++++
>  tcg/sparc/tcg-target.inc.c   |  2 +-
>  tcg/tci/tcg-target.inc.c     |  2 ++
>  tci.c                        | 46 +++++++++++++++++++++++++-------------------
>  translate-all.c              |  2 ++
>  12 files changed, 111 insertions(+), 54 deletions(-)


--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 03/11] tci: Make direct jump patching thread-safe
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 03/11] tci: Make direct jump patching thread-safe Sergey Fedorov
@ 2016-04-20  9:42   ` Alex Bennée
  2016-04-20 11:40     ` Sergey Fedorov
  0 siblings, 1 reply; 57+ messages in thread
From: Alex Bennée @ 2016-04-20  9:42 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: qemu-devel, Sergey Fedorov, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Stefan Weil


Sergey Fedorov <sergey.fedorov@linaro.org> writes:

> From: Sergey Fedorov <serge.fdrv@gmail.com>
>
> Ensure direct jump patching in TCI is atomic by:
>  * naturally aligning a location of direct jump address;
>  * using atomic_read()/atomic_set() to load/store the address.
>
> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  include/exec/exec-all.h  | 2 +-
>  tcg/tci/tcg-target.inc.c | 2 ++
>  tci.c                    | 5 ++++-
>  3 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 736209505a68..59709c9dd5c9 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -302,7 +302,7 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr);
>  static inline void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
>  {
>      /* patch the branch destination */
> -    *(uint32_t *)jmp_addr = addr - (jmp_addr + 4);
> +    atomic_set((int32_t *)jmp_addr, addr - (jmp_addr + 4));
>      /* no need to flush icache explicitly */
>  }
>  #elif defined(_ARCH_PPC)
> diff --git a/tcg/tci/tcg-target.inc.c b/tcg/tci/tcg-target.inc.c
> index 4afe4d7a8d59..7e6180e62898 100644
> --- a/tcg/tci/tcg-target.inc.c
> +++ b/tcg/tci/tcg-target.inc.c
> @@ -556,6 +556,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
>          if (s->tb_jmp_offset) {
>              /* Direct jump method. */
>              assert(args[0] < ARRAY_SIZE(s->tb_jmp_offset));
> +            /* Align for atomic patching and thread safety */
> +            s->code_ptr = (uint8_t *)(((uintptr_t)s->code_ptr + 3) &
> ~3);

Seeing this pattern is being used over and over again I wonder if we
should have some utility helper functions for this? Perhaps we should
steal the kernels ALIGN macros?

>              s->tb_jmp_offset[args[0]] = tcg_current_code_size(s);
>              tcg_out32(s, 0);
>          } else {
> diff --git a/tci.c b/tci.c
> index 82705fe77295..531f5ebf2c2e 100644
> --- a/tci.c
> +++ b/tci.c
> @@ -1089,7 +1089,10 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
>              goto exit;
>              break;
>          case INDEX_op_goto_tb:
> -            t0 = tci_read_i32(&tb_ptr);
> +            /* Jump address is aligned */
> +            tb_ptr = (uint8_t *)(((uintptr_t)tb_ptr + 3) & ~3);
> +            t0 = atomic_read((int32_t *)tb_ptr);
> +            tb_ptr += sizeof(int32_t);
>              tci_assert(tb_ptr == old_code_ptr + op_size);
>              tb_ptr += (int32_t)t0;
>              continue;


--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 04/11] tcg/ppc: Make direct jump patching thread-safe
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 04/11] tcg/ppc: " Sergey Fedorov
@ 2016-04-20  9:49   ` Alex Bennée
  0 siblings, 0 replies; 57+ messages in thread
From: Alex Bennée @ 2016-04-20  9:49 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: qemu-devel, Sergey Fedorov, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Vassili Karpov (malc)


Sergey Fedorov <sergey.fedorov@linaro.org> writes:

> From: Sergey Fedorov <serge.fdrv@gmail.com>
>
> Ensure direct jump patching in PPC is atomic by:
>  * limiting translation buffer size in 32-bit mode to be addressable by
>    Branch I-form instruction;
>  * using atomic_read()/atomic_set() for code patching.
>
> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  tcg/ppc/tcg-target.inc.c | 22 ++++++++++++++++++----
>  translate-all.c          |  2 ++
>  2 files changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c
> index 8c1c2dfa9b22..a7c65fd08854 100644
> --- a/tcg/ppc/tcg-target.inc.c
> +++ b/tcg/ppc/tcg-target.inc.c
> @@ -1237,6 +1237,7 @@ static void tcg_out_brcond2 (TCGContext *s, const TCGArg *args,
>      tcg_out_bc(s, BC | BI(7, CR_EQ) | BO_COND_TRUE, arg_label(args[5]));
>  }
>
> +#ifdef __powerpc64__
>  void ppc_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
>  {
>      tcg_insn_unit i1, i2;
> @@ -1265,11 +1266,18 @@ void ppc_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
>      pair = (uint64_t)i2 << 32 | i1;
>  #endif
>
> -    /* ??? __atomic_store_8, presuming there's some way to do that
> -       for 32-bit, otherwise this is good enough for 64-bit.  */
> -    *(uint64_t *)jmp_addr = pair;
> +    atomic_set((uint64_t *)jmp_addr, pair);
>      flush_icache_range(jmp_addr, jmp_addr + 8);
>  }
> +#else
> +void ppc_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
> +{
> +    intptr_t diff = addr - jmp_addr;
> +    tcg_debug_assert(in_range_b(diff));
> +    atomic_set((uint32_t *)jmp_addr, B | (diff & 0x3fffffc));

I'll let that slide because the rest of the code sprinkles this magic
number about. However I suspect 0x3fffffc should be a clean #define.

> +    flush_icache_range(jmp_addr, jmp_addr + 4);
> +}
> +#endif
>
>  static void tcg_out_call(TCGContext *s, tcg_insn_unit *target)
>  {
> @@ -1895,7 +1903,9 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
>          break;
>      case INDEX_op_goto_tb:
>          tcg_debug_assert(s->tb_jmp_offset);
> -        /* Direct jump.  Ensure the next insns are 8-byte aligned. */
> +        /* Direct jump. */
> +#ifdef __powerpc64__
> +        /* Ensure the next insns are 8-byte aligned. */
>          if ((uintptr_t)s->code_ptr & 7) {
>              tcg_out32(s, NOP);
>          }
> @@ -1904,6 +1914,10 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
>          s->code_ptr += 2;
>          tcg_out32(s, MTSPR | RS(TCG_REG_TMP1) | CTR);
>          tcg_out32(s, BCCTR | BO_ALWAYS);
> +#else
> +        /* To be replaced by a branch.  */
> +        s->code_ptr++;
> +#endif
>          s->tb_next_offset[args[0]] = tcg_current_code_size(s);
>          break;
>      case INDEX_op_br:
> diff --git a/translate-all.c b/translate-all.c
> index b4df1ec68fa9..9b98a4a36967 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -464,6 +464,8 @@ static inline PageDesc *page_find(tb_page_addr_t index)
>  # define MAX_CODE_GEN_BUFFER_SIZE  (2ul * 1024 * 1024 * 1024)
>  #elif defined(__powerpc64__)
>  # define MAX_CODE_GEN_BUFFER_SIZE  (2ul * 1024 * 1024 * 1024)
> +#elif defined(__powerpc__)
> +# define MAX_CODE_GEN_BUFFER_SIZE  (32u * 1024 * 1024)
>  #elif defined(__aarch64__)
>  # define MAX_CODE_GEN_BUFFER_SIZE  (128ul * 1024 * 1024)
>  #elif defined(__arm__)


Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 05/11] tcg/i386: Make direct jump patching thread-safe
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 05/11] tcg/i386: " Sergey Fedorov
@ 2016-04-20  9:55   ` Alex Bennée
  2016-04-20 11:43     ` Sergey Fedorov
  2016-04-20 15:04     ` Richard Henderson
  0 siblings, 2 replies; 57+ messages in thread
From: Alex Bennée @ 2016-04-20  9:55 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: qemu-devel, Sergey Fedorov, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson


Sergey Fedorov <sergey.fedorov@linaro.org> writes:

> From: Sergey Fedorov <serge.fdrv@gmail.com>
>
> Ensure direct jump patching in i386 is atomic by:
>  * naturally aligning a location of direct jump address;
>  * using atomic_read()/atomic_set() for code patching.
>
> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  include/exec/exec-all.h   |  2 +-
>  tcg/i386/tcg-target.inc.c | 17 +++++++++++++++++
>  2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 59709c9dd5c9..82399175fe80 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -312,7 +312,7 @@ void ppc_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr);
>  static inline void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
>  {
>      /* patch the branch destination */
> -    stl_le_p((void*)jmp_addr, addr - (jmp_addr + 4));
> +    atomic_set((int32_t *)jmp_addr, addr - (jmp_addr + 4));
>      /* no need to flush icache explicitly */
>  }
>  #elif defined(__s390x__)
> diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
> index 9187d34caf6d..3ffb7b3124d8 100644
> --- a/tcg/i386/tcg-target.inc.c
> +++ b/tcg/i386/tcg-target.inc.c
> @@ -1123,6 +1123,19 @@ static void tcg_out_jmp(TCGContext *s, tcg_insn_unit *dest)
>      tcg_out_branch(s, 0, dest);
>  }
>
> +static void tcg_out_nopn(TCGContext *s, int n)
> +{
> +    static const uint8_t nop1[] = { 0x90 };
> +    static const uint8_t nop2[] = { 0x66, 0x90 };
> +    static const uint8_t nop3[] = { 0x8d, 0x76, 0x00 };
> +    static const uint8_t *const nopn[] = { nop1, nop2, nop3 };
> +    int i;
> +    assert(n <= ARRAY_SIZE(nopn));
> +    for (i = 0; i < n; ++i) {
> +        tcg_out8(s, nopn[n - 1][i]);
> +    }
> +}

*shudder* I recall x86 instruction encoding is weird. Maybe a comment
 for the function to describe the 3 forms of NOP we have here?

> +
>  #if defined(CONFIG_SOFTMMU)
>  /* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
>   *                                     int mmu_idx, uintptr_t ra)
> @@ -1777,6 +1790,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>      case INDEX_op_goto_tb:
>          if (s->tb_jmp_offset) {
>              /* direct jump method */
> +            /* align jump displacement for atomic pathing */

s/pathing/patching/

> +            if (((uintptr_t)s->code_ptr & 3) != 3) {
> +                tcg_out_nopn(s, 3 - ((uintptr_t)s->code_ptr & 3));
> +            }

apropos my previous comments. I think the intention could be made
clearer the use of well named helper functions to check alignment and
calculate number elements until next alignment.

>              tcg_out8(s, OPC_JMP_long); /* jmp im */
>              s->tb_jmp_offset[args[0]] = tcg_current_code_size(s);
>              tcg_out32(s, 0);


--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 06/11] tcg/s390: Make direct jump patching thread-safe
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 06/11] tcg/s390: " Sergey Fedorov
@ 2016-04-20 10:01   ` Alex Bennée
  2016-04-20 11:45     ` Sergey Fedorov
  0 siblings, 1 reply; 57+ messages in thread
From: Alex Bennée @ 2016-04-20 10:01 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: qemu-devel, Sergey Fedorov, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Alexander Graf


Sergey Fedorov <sergey.fedorov@linaro.org> writes:

> From: Sergey Fedorov <serge.fdrv@gmail.com>
>
> Ensure direct jump patching in s390 is atomic by:
>  * naturally aligning a location of direct jump address;
>  * using atomic_read()/atomic_set() for code patching.
>
> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  include/exec/exec-all.h   | 2 +-
>  tcg/s390/tcg-target.inc.c | 6 ++++++
>  2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 82399175fe80..e18cc24e50f0 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -320,7 +320,7 @@ static inline void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
>  {
>      /* patch the branch destination */
>      intptr_t disp = addr - (jmp_addr - 2);
> -    stl_be_p((void*)jmp_addr, disp / 2);
> +    atomic_set((int32_t *)jmp_addr, disp / 2);
>      /* no need to flush icache explicitly */
>  }
>  #elif defined(__aarch64__)
> diff --git a/tcg/s390/tcg-target.inc.c b/tcg/s390/tcg-target.inc.c
> index fbf97bb2e15d..84221dfb68c9 100644
> --- a/tcg/s390/tcg-target.inc.c
> +++ b/tcg/s390/tcg-target.inc.c
> @@ -219,6 +219,8 @@ typedef enum S390Opcode {
>      RX_ST       = 0x50,
>      RX_STC      = 0x42,
>      RX_STH      = 0x40,
> +
> +    NOP         = 0x0707,
>  } S390Opcode;
>
>  #ifndef NDEBUG
> @@ -1716,6 +1718,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>
>      case INDEX_op_goto_tb:
>          if (s->tb_jmp_offset) {
> +            /* align branch displacement for atomic pathing */

s/pathing/patching/

> +            if (((uintptr_t)s->code_ptr & 3) == 0) {
> +                tcg_out16(s, NOP);
> +            }

Isn't this the wrong way around? Shouldn't we insert the NOP is code_ptr & 3
== 2 (I assume 1 & 3 are impossible). Or is it that we need to be
unaligned when we out the jmp so the offset itself is aligned.

>              tcg_out16(s, RIL_BRCL | (S390_CC_ALWAYS << 4));
>              s->tb_jmp_offset[args[0]] = tcg_current_code_size(s);
>              s->code_ptr += 2;


--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 03/11] tci: Make direct jump patching thread-safe
  2016-04-20  9:42   ` Alex Bennée
@ 2016-04-20 11:40     ` Sergey Fedorov
  2016-04-20 13:14       ` Alex Bennée
  0 siblings, 1 reply; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-20 11:40 UTC (permalink / raw)
  To: Alex Bennée, Sergey Fedorov
  Cc: qemu-devel, Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	Stefan Weil

On 20/04/16 12:42, Alex Bennée wrote:
> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>> diff --git a/tcg/tci/tcg-target.inc.c b/tcg/tci/tcg-target.inc.c
>> index 4afe4d7a8d59..7e6180e62898 100644
>> --- a/tcg/tci/tcg-target.inc.c
>> +++ b/tcg/tci/tcg-target.inc.c
>> @@ -556,6 +556,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
>>          if (s->tb_jmp_offset) {
>>              /* Direct jump method. */
>>              assert(args[0] < ARRAY_SIZE(s->tb_jmp_offset));
>> +            /* Align for atomic patching and thread safety */
>> +            s->code_ptr = (uint8_t *)(((uintptr_t)s->code_ptr + 3) &
>> ~3);
> Seeing this pattern is being used over and over again I wonder if we
> should have some utility helper functions for this? Perhaps we should
> steal the kernels ALIGN macros?

Good point, really. I see such a macro in hw/display/qxl.c and
kvm-all.c. It'd be better a common definition. Any idea of where to put it?

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 05/11] tcg/i386: Make direct jump patching thread-safe
  2016-04-20  9:55   ` Alex Bennée
@ 2016-04-20 11:43     ` Sergey Fedorov
  2016-04-20 15:04     ` Richard Henderson
  1 sibling, 0 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-20 11:43 UTC (permalink / raw)
  To: Alex Bennée, Sergey Fedorov
  Cc: qemu-devel, Paolo Bonzini, Peter Crosthwaite, Richard Henderson

On 20/04/16 12:55, Alex Bennée wrote:
> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>> diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
>> index 9187d34caf6d..3ffb7b3124d8 100644
>> --- a/tcg/i386/tcg-target.inc.c
>> +++ b/tcg/i386/tcg-target.inc.c
>> @@ -1123,6 +1123,19 @@ static void tcg_out_jmp(TCGContext *s, tcg_insn_unit *dest)
>>      tcg_out_branch(s, 0, dest);
>>  }
>>
>> +static void tcg_out_nopn(TCGContext *s, int n)
>> +{
>> +    static const uint8_t nop1[] = { 0x90 };
>> +    static const uint8_t nop2[] = { 0x66, 0x90 };
>> +    static const uint8_t nop3[] = { 0x8d, 0x76, 0x00 };
>> +    static const uint8_t *const nopn[] = { nop1, nop2, nop3 };
>> +    int i;
>> +    assert(n <= ARRAY_SIZE(nopn));
>> +    for (i = 0; i < n; ++i) {
>> +        tcg_out8(s, nopn[n - 1][i]);
>> +    }
>> +}
> *shudder* I recall x86 instruction encoding is weird. Maybe a comment
>  for the function to describe the 3 forms of NOP we have here?

Okay.

>
>> +
>>  #if defined(CONFIG_SOFTMMU)
>>  /* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
>>   *                                     int mmu_idx, uintptr_t ra)
>> @@ -1777,6 +1790,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>>      case INDEX_op_goto_tb:
>>          if (s->tb_jmp_offset) {
>>              /* direct jump method */
>> +            /* align jump displacement for atomic pathing */
> s/pathing/patching/

Nice catch, thanks :)
 
Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 06/11] tcg/s390: Make direct jump patching thread-safe
  2016-04-20 10:01   ` Alex Bennée
@ 2016-04-20 11:45     ` Sergey Fedorov
  0 siblings, 0 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-20 11:45 UTC (permalink / raw)
  To: Alex Bennée, Sergey Fedorov
  Cc: qemu-devel, Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	Alexander Graf

On 20/04/16 13:01, Alex Bennée wrote:
> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
>> index 82399175fe80..e18cc24e50f0 100644
>> --- a/include/exec/exec-all.h
>> +++ b/include/exec/exec-all.h
(snip)
>> @@ -1716,6 +1718,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>>
>>      case INDEX_op_goto_tb:
>>          if (s->tb_jmp_offset) {
>> +            /* align branch displacement for atomic pathing */
> s/pathing/patching/
>
>> +            if (((uintptr_t)s->code_ptr & 3) == 0) {
>> +                tcg_out16(s, NOP);
>> +            }
> Isn't this the wrong way around? Shouldn't we insert the NOP is code_ptr & 3
> == 2 (I assume 1 & 3 are impossible). Or is it that we need to be
> unaligned when we out the jmp so the offset itself is aligned.

Yes, it is the offset itself should be aligned to patch in atomically.

Kind regards,
Sergey

>
>>              tcg_out16(s, RIL_BRCL | (S390_CC_ALWAYS << 4));
>>              s->tb_jmp_offset[args[0]] = tcg_current_code_size(s);
>>              s->code_ptr += 2;

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 03/11] tci: Make direct jump patching thread-safe
  2016-04-20 11:40     ` Sergey Fedorov
@ 2016-04-20 13:14       ` Alex Bennée
  2016-04-22 11:31         ` Sergey Fedorov
  0 siblings, 1 reply; 57+ messages in thread
From: Alex Bennée @ 2016-04-20 13:14 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: Sergey Fedorov, qemu-devel, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Stefan Weil


Sergey Fedorov <serge.fdrv@gmail.com> writes:

> On 20/04/16 12:42, Alex Bennée wrote:
>> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>>> diff --git a/tcg/tci/tcg-target.inc.c b/tcg/tci/tcg-target.inc.c
>>> index 4afe4d7a8d59..7e6180e62898 100644
>>> --- a/tcg/tci/tcg-target.inc.c
>>> +++ b/tcg/tci/tcg-target.inc.c
>>> @@ -556,6 +556,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
>>>          if (s->tb_jmp_offset) {
>>>              /* Direct jump method. */
>>>              assert(args[0] < ARRAY_SIZE(s->tb_jmp_offset));
>>> +            /* Align for atomic patching and thread safety */
>>> +            s->code_ptr = (uint8_t *)(((uintptr_t)s->code_ptr + 3) &
>>> ~3);
>> Seeing this pattern is being used over and over again I wonder if we
>> should have some utility helper functions for this? Perhaps we should
>> steal the kernels ALIGN macros?
>
> Good point, really. I see such a macro in hw/display/qxl.c and
> kvm-all.c. It'd be better a common definition. Any idea of where to
> put it?

Somewhere inside include/qemu. osdep.h has ROUND_UP/DOWN functions maybe
there makes the most sense?
>
> Kind regards,
> Sergey


--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 07/11] tcg/arm: Make direct jump patching thread-safe
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 07/11] tcg/arm: " Sergey Fedorov
@ 2016-04-20 13:33   ` Alex Bennée
  2016-04-20 14:29     ` Sergey Fedorov
  0 siblings, 1 reply; 57+ messages in thread
From: Alex Bennée @ 2016-04-20 13:33 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: qemu-devel, Sergey Fedorov, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Andrzej Zaborowski, qemu-arm


Sergey Fedorov <sergey.fedorov@linaro.org> writes:

> From: Sergey Fedorov <serge.fdrv@gmail.com>
>
> Ensure direct jump patching in ARM is atomic by using
> atomic_read()/atomic_set() for code patching.
>
> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  include/exec/exec-all.h  | 25 ++-----------------------
>  tcg/arm/tcg-target.inc.c | 17 +++++++++++++++++
>  2 files changed, 19 insertions(+), 23 deletions(-)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index e18cc24e50f0..6a054ee720a8 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -327,29 +327,8 @@ static inline void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
>  void aarch64_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr);
>  #define tb_set_jmp_target1 aarch64_tb_set_jmp_target
>  #elif defined(__arm__)
> -static inline void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
> -{
> -#if !QEMU_GNUC_PREREQ(4, 1)
> -    register unsigned long _beg __asm ("a1");
> -    register unsigned long _end __asm ("a2");
> -    register unsigned long _flg __asm ("a3");
> -#endif
> -
> -    /* we could use a ldr pc, [pc, #-4] kind of branch and avoid the flush */
> -    *(uint32_t *)jmp_addr =
> -        (*(uint32_t *)jmp_addr & ~0xffffff)
> -        | (((addr - (jmp_addr + 8)) >> 2) & 0xffffff);
> -
> -#if QEMU_GNUC_PREREQ(4, 1)
> -    __builtin___clear_cache((char *) jmp_addr, (char *) jmp_addr + 4);
> -#else
> -    /* flush icache */
> -    _beg = jmp_addr;
> -    _end = jmp_addr + 4;
> -    _flg = 0;
> -    __asm __volatile__ ("swi 0x9f0002" : : "r" (_beg), "r" (_end), "r" (_flg));
> -#endif
> -}
> +void arm_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr);
> +#define tb_set_jmp_target1 arm_tb_set_jmp_target
>  #elif defined(__sparc__) || defined(__mips__)
>  void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr);
>  #else
> diff --git a/tcg/arm/tcg-target.inc.c b/tcg/arm/tcg-target.inc.c
> index 3edf6a6f971c..5c69de20bc69 100644
> --- a/tcg/arm/tcg-target.inc.c
> +++ b/tcg/arm/tcg-target.inc.c
> @@ -121,6 +121,13 @@ static inline void reloc_pc24(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>      *code_ptr = (*code_ptr & ~0xffffff) | (offset & 0xffffff);
>  }
>
> +static inline void reloc_pc24_atomic(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
> +{
> +    ptrdiff_t offset = (tcg_ptr_byte_diff(target, code_ptr) - 8) >>
> 2;

This seems like something a tcg_debug_assert should be ensuring we don't overflow.

> +    tcg_insn_unit insn = atomic_read(code_ptr);

Don't we already know what the instruction should be or could there be
multiple ones?

> +    atomic_set(code_ptr, (insn & ~0xffffff) | (offset & 0xffffff));

Please use deposit32 to set the offset like the aarch64 code.

> +}
> +
>  static void patch_reloc(tcg_insn_unit *code_ptr, int type,
>                          intptr_t value, intptr_t addend)
>  {
> @@ -1038,6 +1045,16 @@ static void tcg_out_call(TCGContext *s, tcg_insn_unit *addr)
>      }
>  }
>
> +void arm_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
> +{
> +    tcg_insn_unit *code_ptr = (tcg_insn_unit *)jmp_addr;
> +    tcg_insn_unit *target = (tcg_insn_unit *)addr;
> +
> +    /* we could use a ldr pc, [pc, #-4] kind of branch and avoid the
> flush */

So why don't we?

> +    reloc_pc24_atomic(code_ptr, target);
> +    flush_icache_range(jmp_addr, jmp_addr + 4);
> +}
> +
>  static inline void tcg_out_goto_label(TCGContext *s, int cond, TCGLabel *l)
>  {
>      if (l->has_value) {


--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 08/11] tcg/aarch64: Make direct jump patching thread-safe
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 08/11] tcg/aarch64: " Sergey Fedorov
@ 2016-04-20 14:01   ` Alex Bennée
  2016-04-20 15:08     ` Richard Henderson
  2016-04-21 15:47   ` Sergey Fedorov
  1 sibling, 1 reply; 57+ messages in thread
From: Alex Bennée @ 2016-04-20 14:01 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: qemu-devel, Sergey Fedorov, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Claudio Fontana, qemu-arm


Sergey Fedorov <sergey.fedorov@linaro.org> writes:

> From: Sergey Fedorov <serge.fdrv@gmail.com>
>
> Ensure direct jump patching in AArch64 is atomic by using
> atomic_read()/atomic_set() for code patching.
>
> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  tcg/aarch64/tcg-target.inc.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
> index 0ed10a974121..15fdebec921f 100644
> --- a/tcg/aarch64/tcg-target.inc.c
> +++ b/tcg/aarch64/tcg-target.inc.c
> @@ -73,6 +73,18 @@ static inline void reloc_pc26(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>      *code_ptr = deposit32(*code_ptr, 0, 26, offset);
>  }
>
> +static inline void reloc_pc26_atomic(tcg_insn_unit *code_ptr,
> +                                     tcg_insn_unit *target)
> +{
> +    ptrdiff_t offset = target - code_ptr;
> +    tcg_insn_unit insn;
> +    assert(offset == sextract64(offset, 0, 26));
> +    /* read instruction, mask away previous PC_REL26 parameter contents,
> +       set the proper offset, then write back the instruction. */

This comment could be moved from here and reloc_pc26 and made common for
the two following functions.

> +    insn = atomic_read(code_ptr);
> +    atomic_set(code_ptr, deposit32(insn, 0, 26, offset));
> +}
> +
>  static inline void reloc_pc19(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>  {
>      ptrdiff_t offset = target - code_ptr;
> @@ -835,7 +847,7 @@ void aarch64_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
>      tcg_insn_unit *code_ptr = (tcg_insn_unit *)jmp_addr;
>      tcg_insn_unit *target = (tcg_insn_unit *)addr;
>
> -    reloc_pc26(code_ptr, target);
> +    reloc_pc26_atomic(code_ptr, target);
>      flush_icache_range(jmp_addr, jmp_addr + 4);
>  }

Otherwise:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>



--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 09/11] tcg/sparc: Make direct jump patching thread-safe
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 09/11] tcg/sparc: " Sergey Fedorov
@ 2016-04-20 14:23   ` Alex Bennée
  0 siblings, 0 replies; 57+ messages in thread
From: Alex Bennée @ 2016-04-20 14:23 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: qemu-devel, Sergey Fedorov, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Blue Swirl


Sergey Fedorov <sergey.fedorov@linaro.org> writes:

> From: Sergey Fedorov <serge.fdrv@gmail.com>
>
> Ensure direct jump patching in SPARC is atomic by using
> atomic_read()/atomic_set() for code patching.
>
> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  tcg/sparc/tcg-target.inc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tcg/sparc/tcg-target.inc.c b/tcg/sparc/tcg-target.inc.c
> index 54df1bc42432..629dd7bdafff 100644
> --- a/tcg/sparc/tcg-target.inc.c
> +++ b/tcg/sparc/tcg-target.inc.c
> @@ -1647,6 +1647,6 @@ void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
>         the code_gen_buffer can't be larger than 2GB.  */
>      assert(disp == (int32_t)disp);
>
> -    *ptr = CALL | (uint32_t)disp >> 2;
> +    atomic_set(ptr, CALL | (uint32_t)disp >> 2);

atomic_set(ptr, deposit32(CALL, 0, 30, disp >> 2))

might be a little more explicit.

Otherwise:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>


>      flush_icache_range(jmp_addr, jmp_addr + 4);
>  }


--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 11/11] tcg: Note requirement on atomic direct jump patching
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 11/11] tcg: Note requirement on atomic direct jump patching Sergey Fedorov
@ 2016-04-20 14:25   ` Alex Bennée
  0 siblings, 0 replies; 57+ messages in thread
From: Alex Bennée @ 2016-04-20 14:25 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: qemu-devel, Sergey Fedorov, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson


Sergey Fedorov <sergey.fedorov@linaro.org> writes:

> From: Sergey Fedorov <serge.fdrv@gmail.com>
>
> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  include/exec/exec-all.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 6a054ee720a8..1878e7b7856c 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -229,6 +229,7 @@ static inline void tlb_flush_by_mmuidx(CPUState *cpu, ...)
>      || defined(__sparc__) || defined(__aarch64__) \
>      || defined(__s390x__) || defined(__mips__) \
>      || defined(CONFIG_TCG_INTERPRETER)
> +/* NOTE: Direct jump patching must be atomic and thread-safe. */

Maybe:

/* NOTE: Direct jump patching must be atomic to be thread-safe. */

Otherwise:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

>  #define USE_DIRECT_JUMP
>  #endif


--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 07/11] tcg/arm: Make direct jump patching thread-safe
  2016-04-20 13:33   ` Alex Bennée
@ 2016-04-20 14:29     ` Sergey Fedorov
  2016-04-20 14:40       ` Alex Bennée
  0 siblings, 1 reply; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-20 14:29 UTC (permalink / raw)
  To: Alex Bennée, Sergey Fedorov
  Cc: qemu-devel, Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	Andrzej Zaborowski, qemu-arm

On 20/04/16 16:33, Alex Bennée wrote:
> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>> diff --git a/tcg/arm/tcg-target.inc.c b/tcg/arm/tcg-target.inc.c
>> index 3edf6a6f971c..5c69de20bc69 100644
>> --- a/tcg/arm/tcg-target.inc.c
>> +++ b/tcg/arm/tcg-target.inc.c
>> @@ -121,6 +121,13 @@ static inline void reloc_pc24(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>>      *code_ptr = (*code_ptr & ~0xffffff) | (offset & 0xffffff);
>>  }
>>
>> +static inline void reloc_pc24_atomic(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>> +{
>> +    ptrdiff_t offset = (tcg_ptr_byte_diff(target, code_ptr) - 8) >>
>> 2;
> This seems like something a tcg_debug_assert should be ensuring we don't overflow.

Then we should probably put the same assert into reloc_pc24() as well.

>
>> +    tcg_insn_unit insn = atomic_read(code_ptr);
> Don't we already know what the instruction should be or could there be
> multiple ones?

Actually, it is always what tcg_out_b_noaddr() generates. I'm not sure
it's worthwhile to introduce tcg_out_b_atomic() or something similar here.

>
>> +    atomic_set(code_ptr, (insn & ~0xffffff) | (offset & 0xffffff));
> Please use deposit32 to set the offset like the aarch64 code.

Will do.

>
>> +}
>> +
>>  static void patch_reloc(tcg_insn_unit *code_ptr, int type,
>>                          intptr_t value, intptr_t addend)
>>  {
>> @@ -1038,6 +1045,16 @@ static void tcg_out_call(TCGContext *s, tcg_insn_unit *addr)
>>      }
>>  }
>>
>> +void arm_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
>> +{
>> +    tcg_insn_unit *code_ptr = (tcg_insn_unit *)jmp_addr;
>> +    tcg_insn_unit *target = (tcg_insn_unit *)addr;
>> +
>> +    /* we could use a ldr pc, [pc, #-4] kind of branch and avoid the
>> flush */
> So why don't we?

I think because it's a bit more expensive to take this kind of branch.
If we assume direct jumps are taken much more times than TB patching
then it's preferable to optimize for direct jumps instead of for patching.

Kind regards,
Sergey

>
>> +    reloc_pc24_atomic(code_ptr, target);
>> +    flush_icache_range(jmp_addr, jmp_addr + 4);
>> +}
>> +
>>  static inline void tcg_out_goto_label(TCGContext *s, int cond, TCGLabel *l)
>>  {
>>      if (l->has_value) {
>>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 07/11] tcg/arm: Make direct jump patching thread-safe
  2016-04-20 14:29     ` Sergey Fedorov
@ 2016-04-20 14:40       ` Alex Bennée
  2016-04-20 16:12         ` Sergey Fedorov
  0 siblings, 1 reply; 57+ messages in thread
From: Alex Bennée @ 2016-04-20 14:40 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: Sergey Fedorov, qemu-devel, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Andrzej Zaborowski, qemu-arm


Sergey Fedorov <serge.fdrv@gmail.com> writes:

> On 20/04/16 16:33, Alex Bennée wrote:
>> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>>> diff --git a/tcg/arm/tcg-target.inc.c b/tcg/arm/tcg-target.inc.c
>>> index 3edf6a6f971c..5c69de20bc69 100644
>>> --- a/tcg/arm/tcg-target.inc.c
>>> +++ b/tcg/arm/tcg-target.inc.c
>>> @@ -121,6 +121,13 @@ static inline void reloc_pc24(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>>>      *code_ptr = (*code_ptr & ~0xffffff) | (offset & 0xffffff);
>>>  }
>>>
>>> +static inline void reloc_pc24_atomic(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>>> +{
>>> +    ptrdiff_t offset = (tcg_ptr_byte_diff(target, code_ptr) - 8) >>
>>> 2;
>> This seems like something a tcg_debug_assert should be ensuring we don't overflow.
>
> Then we should probably put the same assert into reloc_pc24() as well.
>
>>
>>> +    tcg_insn_unit insn = atomic_read(code_ptr);
>> Don't we already know what the instruction should be or could there be
>> multiple ones?
>
> Actually, it is always what tcg_out_b_noaddr() generates. I'm not sure
> it's worthwhile to introduce tcg_out_b_atomic() or something similar
> here.

No I don't think so. It depends if the branch instruction is going to
have multiple potential conditions (so requiring the read) or always be
the same.

>
>>
>>> +    atomic_set(code_ptr, (insn & ~0xffffff) | (offset & 0xffffff));
>> Please use deposit32 to set the offset like the aarch64 code.
>
> Will do.
>
>>
>>> +}
>>> +
>>>  static void patch_reloc(tcg_insn_unit *code_ptr, int type,
>>>                          intptr_t value, intptr_t addend)
>>>  {
>>> @@ -1038,6 +1045,16 @@ static void tcg_out_call(TCGContext *s, tcg_insn_unit *addr)
>>>      }
>>>  }
>>>
>>> +void arm_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
>>> +{
>>> +    tcg_insn_unit *code_ptr = (tcg_insn_unit *)jmp_addr;
>>> +    tcg_insn_unit *target = (tcg_insn_unit *)addr;
>>> +
>>> +    /* we could use a ldr pc, [pc, #-4] kind of branch and avoid the
>>> flush */
>> So why don't we?
>
> I think because it's a bit more expensive to take this kind of branch.
> If we assume direct jumps are taken much more times than TB patching
> then it's preferable to optimize for direct jumps instead of for
> patching.

Looking again I see the comment came from code motion so I'm not overly
fussed its just comments like that always leave me hanging. "We could
...." and I want to know why we don't then ;-)

>
> Kind regards,
> Sergey
>
>>
>>> +    reloc_pc24_atomic(code_ptr, target);
>>> +    flush_icache_range(jmp_addr, jmp_addr + 4);
>>> +}
>>> +
>>>  static inline void tcg_out_goto_label(TCGContext *s, int cond, TCGLabel *l)
>>>  {
>>>      if (l->has_value) {
>>>


--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 05/11] tcg/i386: Make direct jump patching thread-safe
  2016-04-20  9:55   ` Alex Bennée
  2016-04-20 11:43     ` Sergey Fedorov
@ 2016-04-20 15:04     ` Richard Henderson
  2016-04-20 16:15       ` Sergey Fedorov
  1 sibling, 1 reply; 57+ messages in thread
From: Richard Henderson @ 2016-04-20 15:04 UTC (permalink / raw)
  To: Alex Bennée, Sergey Fedorov
  Cc: qemu-devel, Sergey Fedorov, Paolo Bonzini, Peter Crosthwaite

On 04/20/2016 02:55 AM, Alex Bennée wrote:
>> +static void tcg_out_nopn(TCGContext *s, int n)
>> +{
>> +    static const uint8_t nop1[] = { 0x90 };
>> +    static const uint8_t nop2[] = { 0x66, 0x90 };
>> +    static const uint8_t nop3[] = { 0x8d, 0x76, 0x00 };
>> +    static const uint8_t *const nopn[] = { nop1, nop2, nop3 };
>> +    int i;
>> +    assert(n <= ARRAY_SIZE(nopn));
>> +    for (i = 0; i < n; ++i) {
>> +        tcg_out8(s, nopn[n - 1][i]);
>> +    }
>> +}
>
> *shudder* I recall x86 instruction encoding is weird. Maybe a comment
>   for the function to describe the 3 forms of NOP we have here?

I think I'd prefer to drop the tables and do

   /* Emit 1 or 2 operand size prefixes for the standard one byte nop,
      xchg %eax,%eax, forming xchg %ax,%ax.  All cores accept the
      duplicate prefix, and all of the interesting recent cores can
      decode and discard the duplicates in a single cycle.  */
   for (i = 1; i < n; ++i) {
     tcg_out8(s, 0x66);
   }
   tcg_out8(s, 0x90);


r~

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 08/11] tcg/aarch64: Make direct jump patching thread-safe
  2016-04-20 14:01   ` Alex Bennée
@ 2016-04-20 15:08     ` Richard Henderson
  2016-04-20 18:22       ` Alex Bennée
  2016-04-20 18:44       ` Sergey Fedorov
  0 siblings, 2 replies; 57+ messages in thread
From: Richard Henderson @ 2016-04-20 15:08 UTC (permalink / raw)
  To: Alex Bennée, Sergey Fedorov
  Cc: qemu-devel, Sergey Fedorov, Paolo Bonzini, Peter Crosthwaite,
	Claudio Fontana, qemu-arm

On 04/20/2016 07:01 AM, Alex Bennée wrote:
>
> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>
>> From: Sergey Fedorov <serge.fdrv@gmail.com>
>>
>> Ensure direct jump patching in AArch64 is atomic by using
>> atomic_read()/atomic_set() for code patching.
>>
>> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>> ---
>>   tcg/aarch64/tcg-target.inc.c | 14 +++++++++++++-
>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
>> index 0ed10a974121..15fdebec921f 100644
>> --- a/tcg/aarch64/tcg-target.inc.c
>> +++ b/tcg/aarch64/tcg-target.inc.c
>> @@ -73,6 +73,18 @@ static inline void reloc_pc26(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>>       *code_ptr = deposit32(*code_ptr, 0, 26, offset);
>>   }
>>
>> +static inline void reloc_pc26_atomic(tcg_insn_unit *code_ptr,
>> +                                     tcg_insn_unit *target)
>> +{
>> +    ptrdiff_t offset = target - code_ptr;
>> +    tcg_insn_unit insn;
>> +    assert(offset == sextract64(offset, 0, 26));
>> +    /* read instruction, mask away previous PC_REL26 parameter contents,
>> +       set the proper offset, then write back the instruction. */
>
> This comment could be moved from here and reloc_pc26 and made common for
> the two following functions.

There's a significant amount of cleanup that ought to happen here, now that 
we're not re-translating TBs.  I don't know if Sergey should be gated on that.


r~

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 07/11] tcg/arm: Make direct jump patching thread-safe
  2016-04-20 14:40       ` Alex Bennée
@ 2016-04-20 16:12         ` Sergey Fedorov
  0 siblings, 0 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-20 16:12 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Sergey Fedorov, qemu-devel, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Andrzej Zaborowski, qemu-arm

On 20/04/16 17:40, Alex Bennée wrote:
> Sergey Fedorov <serge.fdrv@gmail.com> writes:
>> On 20/04/16 16:33, Alex Bennée wrote:
>>> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>>>> diff --git a/tcg/arm/tcg-target.inc.c b/tcg/arm/tcg-target.inc.c
>>>> index 3edf6a6f971c..5c69de20bc69 100644
>>>> --- a/tcg/arm/tcg-target.inc.c
>>>> +++ b/tcg/arm/tcg-target.inc.c
>>>> @@ -121,6 +121,13 @@ static inline void reloc_pc24(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>>>>      *code_ptr = (*code_ptr & ~0xffffff) | (offset & 0xffffff);
>>>>  }
>>>>
>>>> +static inline void reloc_pc24_atomic(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>>>> +{
>>>> +    ptrdiff_t offset = (tcg_ptr_byte_diff(target, code_ptr) - 8) >>
>>>> 2;
>>> This seems like something a tcg_debug_assert should be ensuring we don't overflow.
>> Then we should probably put the same assert into reloc_pc24() as well.
>>
>>>> +    tcg_insn_unit insn = atomic_read(code_ptr);
>>> Don't we already know what the instruction should be or could there be
>>> multiple ones?
>> Actually, it is always what tcg_out_b_noaddr() generates. I'm not sure
>> it's worthwhile to introduce tcg_out_b_atomic() or something similar
>> here.
> No I don't think so. It depends if the branch instruction is going to
> have multiple potential conditions (so requiring the read) or always be
> the same.

So I mean "regarding goto_tb, it's always branch immediate,
unconditional". But concerning the function name, it should only patch
the immediate offset of the instruction.

>
>>>> +    atomic_set(code_ptr, (insn & ~0xffffff) | (offset & 0xffffff));
>>> Please use deposit32 to set the offset like the aarch64 code.
>> Will do.
>>
>>>> +}
>>>> +
>>>>  static void patch_reloc(tcg_insn_unit *code_ptr, int type,
>>>>                          intptr_t value, intptr_t addend)
>>>>  {
>>>> @@ -1038,6 +1045,16 @@ static void tcg_out_call(TCGContext *s, tcg_insn_unit *addr)
>>>>      }
>>>>  }
>>>>
>>>> +void arm_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
>>>> +{
>>>> +    tcg_insn_unit *code_ptr = (tcg_insn_unit *)jmp_addr;
>>>> +    tcg_insn_unit *target = (tcg_insn_unit *)addr;
>>>> +
>>>> +    /* we could use a ldr pc, [pc, #-4] kind of branch and avoid the
>>>> flush */
>>> So why don't we?
>> I think because it's a bit more expensive to take this kind of branch.
>> If we assume direct jumps are taken much more times than TB patching
>> then it's preferable to optimize for direct jumps instead of for
>> patching.
> Looking again I see the comment came from code motion so I'm not overly
> fussed its just comments like that always leave me hanging. "We could
> ...." and I want to know why we don't then ;-)

So let's leave it as is?

Kind regards,
Sergey

>>>> +    reloc_pc24_atomic(code_ptr, target);
>>>> +    flush_icache_range(jmp_addr, jmp_addr + 4);
>>>> +}
>>>> +
>>>>  static inline void tcg_out_goto_label(TCGContext *s, int cond, TCGLabel *l)
>>>>  {
>>>>      if (l->has_value) {

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 05/11] tcg/i386: Make direct jump patching thread-safe
  2016-04-20 15:04     ` Richard Henderson
@ 2016-04-20 16:15       ` Sergey Fedorov
  0 siblings, 0 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-20 16:15 UTC (permalink / raw)
  To: Richard Henderson, Alex Bennée, Sergey Fedorov
  Cc: qemu-devel, Paolo Bonzini, Peter Crosthwaite

On 20/04/16 18:04, Richard Henderson wrote:
> On 04/20/2016 02:55 AM, Alex Bennée wrote:
>>> +static void tcg_out_nopn(TCGContext *s, int n)
>>> +{
>>> +    static const uint8_t nop1[] = { 0x90 };
>>> +    static const uint8_t nop2[] = { 0x66, 0x90 };
>>> +    static const uint8_t nop3[] = { 0x8d, 0x76, 0x00 };
>>> +    static const uint8_t *const nopn[] = { nop1, nop2, nop3 };
>>> +    int i;
>>> +    assert(n <= ARRAY_SIZE(nopn));
>>> +    for (i = 0; i < n; ++i) {
>>> +        tcg_out8(s, nopn[n - 1][i]);
>>> +    }
>>> +}
>>
>> *shudder* I recall x86 instruction encoding is weird. Maybe a comment
>>   for the function to describe the 3 forms of NOP we have here?
>
> I think I'd prefer to drop the tables and do
>
>   /* Emit 1 or 2 operand size prefixes for the standard one byte nop,
>      xchg %eax,%eax, forming xchg %ax,%ax.  All cores accept the
>      duplicate prefix, and all of the interesting recent cores can
>      decode and discard the duplicates in a single cycle.  */
>   for (i = 1; i < n; ++i) {
>     tcg_out8(s, 0x66);
>   }
>   tcg_out8(s, 0x90);

It's fine if you are sure about that :)

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 08/11] tcg/aarch64: Make direct jump patching thread-safe
  2016-04-20 15:08     ` Richard Henderson
@ 2016-04-20 18:22       ` Alex Bennée
  2016-04-20 18:57         ` Richard Henderson
  2016-04-20 18:44       ` Sergey Fedorov
  1 sibling, 1 reply; 57+ messages in thread
From: Alex Bennée @ 2016-04-20 18:22 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Sergey Fedorov, qemu-devel, Sergey Fedorov, Paolo Bonzini,
	Peter Crosthwaite, Claudio Fontana, qemu-arm


Richard Henderson <rth@twiddle.net> writes:

> On 04/20/2016 07:01 AM, Alex Bennée wrote:
>>
>> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>>
>>> From: Sergey Fedorov <serge.fdrv@gmail.com>
>>>
>>> Ensure direct jump patching in AArch64 is atomic by using
>>> atomic_read()/atomic_set() for code patching.
>>>
>>> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
>>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>>> ---
>>>   tcg/aarch64/tcg-target.inc.c | 14 +++++++++++++-
>>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
>>> index 0ed10a974121..15fdebec921f 100644
>>> --- a/tcg/aarch64/tcg-target.inc.c
>>> +++ b/tcg/aarch64/tcg-target.inc.c
>>> @@ -73,6 +73,18 @@ static inline void reloc_pc26(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>>>       *code_ptr = deposit32(*code_ptr, 0, 26, offset);
>>>   }
>>>
>>> +static inline void reloc_pc26_atomic(tcg_insn_unit *code_ptr,
>>> +                                     tcg_insn_unit *target)
>>> +{
>>> +    ptrdiff_t offset = target - code_ptr;
>>> +    tcg_insn_unit insn;
>>> +    assert(offset == sextract64(offset, 0, 26));
>>> +    /* read instruction, mask away previous PC_REL26 parameter contents,
>>> +       set the proper offset, then write back the instruction. */
>>
>> This comment could be moved from here and reloc_pc26 and made common for
>> the two following functions.
>
> There's a significant amount of cleanup that ought to happen here, now that
> we're not re-translating TBs.  I don't know if Sergey should be gated
> on that.

Is this stuff already in the works? Otherwise we are trying to get
pre-cursors to MTTCG into the code (once the tree re-opens) to keep the
main diff down. This also is beneficial for linux-user stuff.

>
>
> r~


--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 08/11] tcg/aarch64: Make direct jump patching thread-safe
  2016-04-20 15:08     ` Richard Henderson
  2016-04-20 18:22       ` Alex Bennée
@ 2016-04-20 18:44       ` Sergey Fedorov
  1 sibling, 0 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-20 18:44 UTC (permalink / raw)
  To: Richard Henderson, Alex Bennée, Sergey Fedorov
  Cc: qemu-devel, Paolo Bonzini, Peter Crosthwaite, Claudio Fontana, qemu-arm

On 20/04/16 18:08, Richard Henderson wrote:
> On 04/20/2016 07:01 AM, Alex Bennée wrote:
>>
>> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>>
>>> From: Sergey Fedorov <serge.fdrv@gmail.com>
>>>
>>> Ensure direct jump patching in AArch64 is atomic by using
>>> atomic_read()/atomic_set() for code patching.
>>>
>>> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
>>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>>> ---
>>>   tcg/aarch64/tcg-target.inc.c | 14 +++++++++++++-
>>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tcg/aarch64/tcg-target.inc.c
>>> b/tcg/aarch64/tcg-target.inc.c
>>> index 0ed10a974121..15fdebec921f 100644
>>> --- a/tcg/aarch64/tcg-target.inc.c
>>> +++ b/tcg/aarch64/tcg-target.inc.c
>>> @@ -73,6 +73,18 @@ static inline void reloc_pc26(tcg_insn_unit
>>> *code_ptr, tcg_insn_unit *target)
>>>       *code_ptr = deposit32(*code_ptr, 0, 26, offset);
>>>   }
>>>
>>> +static inline void reloc_pc26_atomic(tcg_insn_unit *code_ptr,
>>> +                                     tcg_insn_unit *target)
>>> +{
>>> +    ptrdiff_t offset = target - code_ptr;
>>> +    tcg_insn_unit insn;
>>> +    assert(offset == sextract64(offset, 0, 26));
>>> +    /* read instruction, mask away previous PC_REL26 parameter
>>> contents,
>>> +       set the proper offset, then write back the instruction. */
>>
>> This comment could be moved from here and reloc_pc26 and made common for
>> the two following functions.
>
> There's a significant amount of cleanup that ought to happen here, now
> that we're not re-translating TBs.  I don't know if Sergey should be
> gated on that.

Do you mean I'd better avoid using stuff like reloc_pc26()?

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 08/11] tcg/aarch64: Make direct jump patching thread-safe
  2016-04-20 18:22       ` Alex Bennée
@ 2016-04-20 18:57         ` Richard Henderson
  2016-04-20 19:51           ` Alex Bennée
  0 siblings, 1 reply; 57+ messages in thread
From: Richard Henderson @ 2016-04-20 18:57 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Sergey Fedorov, qemu-devel, Sergey Fedorov, Paolo Bonzini,
	Peter Crosthwaite, Claudio Fontana, qemu-arm

On 04/20/2016 11:22 AM, Alex Bennée wrote:
>
> Richard Henderson <rth@twiddle.net> writes:
>
>> On 04/20/2016 07:01 AM, Alex Bennée wrote:
>>>
>>> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>>>
>>>> From: Sergey Fedorov <serge.fdrv@gmail.com>
>>>>
>>>> Ensure direct jump patching in AArch64 is atomic by using
>>>> atomic_read()/atomic_set() for code patching.
>>>>
>>>> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
>>>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>>>> ---
>>>>    tcg/aarch64/tcg-target.inc.c | 14 +++++++++++++-
>>>>    1 file changed, 13 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
>>>> index 0ed10a974121..15fdebec921f 100644
>>>> --- a/tcg/aarch64/tcg-target.inc.c
>>>> +++ b/tcg/aarch64/tcg-target.inc.c
>>>> @@ -73,6 +73,18 @@ static inline void reloc_pc26(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>>>>        *code_ptr = deposit32(*code_ptr, 0, 26, offset);
>>>>    }
>>>>
>>>> +static inline void reloc_pc26_atomic(tcg_insn_unit *code_ptr,
>>>> +                                     tcg_insn_unit *target)
>>>> +{
>>>> +    ptrdiff_t offset = target - code_ptr;
>>>> +    tcg_insn_unit insn;
>>>> +    assert(offset == sextract64(offset, 0, 26));
>>>> +    /* read instruction, mask away previous PC_REL26 parameter contents,
>>>> +       set the proper offset, then write back the instruction. */
>>>
>>> This comment could be moved from here and reloc_pc26 and made common for
>>> the two following functions.
>>
>> There's a significant amount of cleanup that ought to happen here, now that
>> we're not re-translating TBs.  I don't know if Sergey should be gated
>> on that.
>
> Is this stuff already in the works? Otherwise we are trying to get
> pre-cursors to MTTCG into the code (once the tree re-opens) to keep the
> main diff down. This also is beneficial for linux-user stuff.

We're talking past one another.

No, it's not yet in the works.  I'm saying that this patch set should not wait 
for it.  Thus I don't care if what he adds here is a little messy; we'll clean 
it all up at once later.  Thus don't bother refactoring reloc_pc26 and 
reloc_pc26_atomic.


r~

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 08/11] tcg/aarch64: Make direct jump patching thread-safe
  2016-04-20 18:57         ` Richard Henderson
@ 2016-04-20 19:51           ` Alex Bennée
  0 siblings, 0 replies; 57+ messages in thread
From: Alex Bennée @ 2016-04-20 19:51 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Sergey Fedorov, qemu-devel, Sergey Fedorov, Paolo Bonzini,
	Peter Crosthwaite, Claudio Fontana, qemu-arm


Richard Henderson <rth@twiddle.net> writes:

> On 04/20/2016 11:22 AM, Alex Bennée wrote:
>>
>> Richard Henderson <rth@twiddle.net> writes:
>>
>>> On 04/20/2016 07:01 AM, Alex Bennée wrote:
>>>>
>>>> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>>>>
>>>>> From: Sergey Fedorov <serge.fdrv@gmail.com>
>>>>>
>>>>> Ensure direct jump patching in AArch64 is atomic by using
>>>>> atomic_read()/atomic_set() for code patching.
>>>>>
>>>>> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
>>>>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>>>>> ---
>>>>>    tcg/aarch64/tcg-target.inc.c | 14 +++++++++++++-
>>>>>    1 file changed, 13 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
>>>>> index 0ed10a974121..15fdebec921f 100644
>>>>> --- a/tcg/aarch64/tcg-target.inc.c
>>>>> +++ b/tcg/aarch64/tcg-target.inc.c
>>>>> @@ -73,6 +73,18 @@ static inline void reloc_pc26(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>>>>>        *code_ptr = deposit32(*code_ptr, 0, 26, offset);
>>>>>    }
>>>>>
>>>>> +static inline void reloc_pc26_atomic(tcg_insn_unit *code_ptr,
>>>>> +                                     tcg_insn_unit *target)
>>>>> +{
>>>>> +    ptrdiff_t offset = target - code_ptr;
>>>>> +    tcg_insn_unit insn;
>>>>> +    assert(offset == sextract64(offset, 0, 26));
>>>>> +    /* read instruction, mask away previous PC_REL26 parameter contents,
>>>>> +       set the proper offset, then write back the instruction. */
>>>>
>>>> This comment could be moved from here and reloc_pc26 and made common for
>>>> the two following functions.
>>>
>>> There's a significant amount of cleanup that ought to happen here, now that
>>> we're not re-translating TBs.  I don't know if Sergey should be gated
>>> on that.
>>
>> Is this stuff already in the works? Otherwise we are trying to get
>> pre-cursors to MTTCG into the code (once the tree re-opens) to keep the
>> main diff down. This also is beneficial for linux-user stuff.
>
> We're talking past one another.
>
> No, it's not yet in the works.  I'm saying that this patch set should not wait
> for it.  Thus I don't care if what he adds here is a little messy; we'll clean
> it all up at once later.  Thus don't bother refactoring reloc_pc26 and
> reloc_pc26_atomic.

Ahh OK, cool ;-)

>
>
> r~


--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 08/11] tcg/aarch64: Make direct jump patching thread-safe
  2016-04-07 15:53 ` [Qemu-devel] [PATCH 08/11] tcg/aarch64: " Sergey Fedorov
  2016-04-20 14:01   ` Alex Bennée
@ 2016-04-21 15:47   ` Sergey Fedorov
  1 sibling, 0 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-21 15:47 UTC (permalink / raw)
  To: Sergey Fedorov, qemu-devel
  Cc: Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Claudio Fontana, qemu-arm

On 07/04/16 18:53, Sergey Fedorov wrote:
> From: Sergey Fedorov <serge.fdrv@gmail.com>
>
> Ensure direct jump patching in AArch64 is atomic by using
> atomic_read()/atomic_set() for code patching.
>
> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> ---
>  tcg/aarch64/tcg-target.inc.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
> index 0ed10a974121..15fdebec921f 100644
> --- a/tcg/aarch64/tcg-target.inc.c
> +++ b/tcg/aarch64/tcg-target.inc.c
> @@ -73,6 +73,18 @@ static inline void reloc_pc26(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>      *code_ptr = deposit32(*code_ptr, 0, 26, offset);
>  }
>  
> +static inline void reloc_pc26_atomic(tcg_insn_unit *code_ptr,
> +                                     tcg_insn_unit *target)
> +{
> +    ptrdiff_t offset = target - code_ptr;
> +    tcg_insn_unit insn;
> +    assert(offset == sextract64(offset, 0, 26));

I'd better use tcg_debug_assert() here as in this patch:

http://patchwork.ozlabs.org/patch/613020/

Kind regards,
Sergey

> +    /* read instruction, mask away previous PC_REL26 parameter contents,
> +       set the proper offset, then write back the instruction. */
> +    insn = atomic_read(code_ptr);
> +    atomic_set(code_ptr, deposit32(insn, 0, 26, offset));
> +}
> +
>  static inline void reloc_pc19(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
>  {
>      ptrdiff_t offset = target - code_ptr;
> @@ -835,7 +847,7 @@ void aarch64_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
>      tcg_insn_unit *code_ptr = (tcg_insn_unit *)jmp_addr;
>      tcg_insn_unit *target = (tcg_insn_unit *)addr;
>  
> -    reloc_pc26(code_ptr, target);
> +    reloc_pc26_atomic(code_ptr, target);
>      flush_icache_range(jmp_addr, jmp_addr + 4);
>  }
>  

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling
  2016-04-18 15:47       ` Sergey Fedorov
@ 2016-04-21 17:36         ` Sergey Fedorov
  2016-04-21 17:49           ` Alex Bennée
  2016-04-22  8:08           ` Cornelia Huck
  0 siblings, 2 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-21 17:36 UTC (permalink / raw)
  To: Cornelia Huck, Sergey Fedorov
  Cc: qemu-devel, Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Christian Borntraeger, Alexander Graf

[-- Attachment #1: Type: text/plain, Size: 2826 bytes --]

On 18/04/16 18:47, Sergey Fedorov wrote:
> On 18/04/16 18:34, Cornelia Huck wrote:
>> On Mon, 18 Apr 2016 16:51:16 +0200
>> Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
>>
>>> On Thu,  7 Apr 2016 18:53:44 +0300
>>> Sergey Fedorov <sergey.fedorov@linaro.org> wrote:
>>>
>>>> From: Sergey Fedorov <serge.fdrv@gmail.com>
>>>>
>>>> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
>>>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>>>> ---
>>>>  pc-bios/s390-ccw/Makefile | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
>>>> index 4208cb429593..5ce6d4ccbaf5 100644
>>>> --- a/pc-bios/s390-ccw/Makefile
>>>> +++ b/pc-bios/s390-ccw/Makefile
>>>> @@ -20,7 +20,7 @@ s390-ccw.elf: $(OBJECTS)
>>>>  	$(call quiet-command,$(CC) $(LDFLAGS) -o $@ $(OBJECTS),"  Building $(TARGET_DIR)$@")
>>>>
>>>>  s390-ccw.img: s390-ccw.elf
>>>> -	$(call quiet-command,strip --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
>>>> +	$(call quiet-command,$(STRIP) --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
>>>>
>>>>  $(OBJECTS): Makefile
>>>>
>>> Thanks, applied to s390-next.
>> Uhm, scratch that.
>>
>> This fails to build with --disable-strip, as $STRIP is unset in that
>> case:
>>
>>   Building s390-ccw/s390-ccw.elf
>>   Stripping s390-ccw/s390-ccw.img
>> /bin/sh: --strip-unneeded: command not found
>> make[1]: *** [s390-ccw.img] Error 127
>>
>>
>> The catch is that we always want to strip that binary. Care to send a
>> patch that deals with that?
>>
> I see the problem. I don't promise to fix it soon, but I could try
> dealing with this as I have time. I don't mind if someone else can just
> fix it with their own patch :)

It's not straightforward to fix. We need to detect a correct
cross-prefix for strip. There is something in roms/Makefile:

    #
    # cross compiler auto detection
    #
    path := $(subst :, ,$(PATH))
    system := $(shell uname -s | tr "A-Z" "a-z")

    # first find cross binutils in path
    find-cross-ld = $(firstword $(wildcard $(patsubst
    %,%/$(1)-*$(system)*-ld,$(path))))
    # then check we have cross gcc too
    find-cross-gcc = $(firstword $(wildcard $(patsubst %ld,%gcc,$(call
    find-cross-ld,$(1)))))
    # finally strip off path + toolname so we get the prefix
    find-cross-prefix = $(subst gcc,,$(notdir $(call find-cross-gcc,$(1))))

    powerpc64_cross_prefix := $(call find-cross-prefix,powerpc64)
    powerpc_cross_prefix := $(call find-cross-prefix,powerpc)
    x86_64_cross_prefix := $(call find-cross-prefix,x86_64)


and then:

    $(powerpc_cross_prefix)strip <...>


However, to solve this problem, it would be enough to export
${cross_prefix} from configure to config-host.mak and do like this:

    $(CROSS_PREFIX)strip <...>


Kind regards,
Sergey

[-- Attachment #2: Type: text/html, Size: 4065 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling
  2016-04-21 17:36         ` Sergey Fedorov
@ 2016-04-21 17:49           ` Alex Bennée
  2016-04-21 18:56             ` Sergey Fedorov
  2016-04-22  8:08           ` Cornelia Huck
  1 sibling, 1 reply; 57+ messages in thread
From: Alex Bennée @ 2016-04-21 17:49 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: Cornelia Huck, Sergey Fedorov, qemu-devel, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Christian Borntraeger,
	Alexander Graf


Sergey Fedorov <serge.fdrv@gmail.com> writes:

> On 18/04/16 18:47, Sergey Fedorov wrote:
>> On 18/04/16 18:34, Cornelia Huck wrote:
>>> On Mon, 18 Apr 2016 16:51:16 +0200
>>> Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
>>>
>>>> On Thu,  7 Apr 2016 18:53:44 +0300
>>>> Sergey Fedorov <sergey.fedorov@linaro.org> wrote:
>>>>
>>>>> From: Sergey Fedorov <serge.fdrv@gmail.com>
>>>>>
>>>>> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
>>>>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>>>>> ---
>>>>>  pc-bios/s390-ccw/Makefile | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
>>>>> index 4208cb429593..5ce6d4ccbaf5 100644
>>>>> --- a/pc-bios/s390-ccw/Makefile
>>>>> +++ b/pc-bios/s390-ccw/Makefile
>>>>> @@ -20,7 +20,7 @@ s390-ccw.elf: $(OBJECTS)
>>>>>  	$(call quiet-command,$(CC) $(LDFLAGS) -o $@ $(OBJECTS),"  Building $(TARGET_DIR)$@")
>>>>>
>>>>>  s390-ccw.img: s390-ccw.elf
>>>>> -	$(call quiet-command,strip --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
>>>>> +	$(call quiet-command,$(STRIP) --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
>>>>>
>>>>>  $(OBJECTS): Makefile
>>>>>
>>>> Thanks, applied to s390-next.
>>> Uhm, scratch that.
>>>
>>> This fails to build with --disable-strip, as $STRIP is unset in that
>>> case:
>>>
>>>   Building s390-ccw/s390-ccw.elf
>>>   Stripping s390-ccw/s390-ccw.img
>>> /bin/sh: --strip-unneeded: command not found
>>> make[1]: *** [s390-ccw.img] Error 127
>>>
>>>
>>> The catch is that we always want to strip that binary. Care to send a
>>> patch that deals with that?
>>>
>> I see the problem. I don't promise to fix it soon, but I could try
>> dealing with this as I have time. I don't mind if someone else can just
>> fix it with their own patch :)
>
> It's not straightforward to fix. We need to detect a correct
> cross-prefix for strip. There is something in roms/Makefile:
>
>     #
>     # cross compiler auto detection
>     #
>     path := $(subst :, ,$(PATH))
>     system := $(shell uname -s | tr "A-Z" "a-z")
>
>     # first find cross binutils in path
>     find-cross-ld = $(firstword $(wildcard $(patsubst
>     %,%/$(1)-*$(system)*-ld,$(path))))
>     # then check we have cross gcc too
>     find-cross-gcc = $(firstword $(wildcard $(patsubst %ld,%gcc,$(call
>     find-cross-ld,$(1)))))
>     # finally strip off path + toolname so we get the prefix
>     find-cross-prefix = $(subst gcc,,$(notdir $(call find-cross-gcc,$(1))))
>
>     powerpc64_cross_prefix := $(call find-cross-prefix,powerpc64)
>     powerpc_cross_prefix := $(call find-cross-prefix,powerpc)
>     x86_64_cross_prefix := $(call find-cross-prefix,x86_64)

Ohh that's cool. I wonder if it could be used to solve the tcg/tests
problem?

>
>
> and then:
>
>     $(powerpc_cross_prefix)strip <...>
>
>
> However, to solve this problem, it would be enough to export
> ${cross_prefix} from configure to config-host.mak and do like this:
>
>     $(CROSS_PREFIX)strip <...>
>
>
> Kind regards,
> Sergey


--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling
  2016-04-21 17:49           ` Alex Bennée
@ 2016-04-21 18:56             ` Sergey Fedorov
  0 siblings, 0 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-21 18:56 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Cornelia Huck, Sergey Fedorov, qemu-devel, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Christian Borntraeger,
	Alexander Graf

On 21/04/16 20:49, Alex Bennée wrote:
> Sergey Fedorov <serge.fdrv@gmail.com> writes:
>
>> On 18/04/16 18:47, Sergey Fedorov wrote:
>>> On 18/04/16 18:34, Cornelia Huck wrote:
>>>> On Mon, 18 Apr 2016 16:51:16 +0200
>>>> Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
>>>>
>>>>> On Thu,  7 Apr 2016 18:53:44 +0300
>>>>> Sergey Fedorov <sergey.fedorov@linaro.org> wrote:
>>>>>
>>>>>> From: Sergey Fedorov <serge.fdrv@gmail.com>
>>>>>>
>>>>>> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
>>>>>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
>>>>>> ---
>>>>>>  pc-bios/s390-ccw/Makefile | 2 +-
>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
>>>>>> index 4208cb429593..5ce6d4ccbaf5 100644
>>>>>> --- a/pc-bios/s390-ccw/Makefile
>>>>>> +++ b/pc-bios/s390-ccw/Makefile
>>>>>> @@ -20,7 +20,7 @@ s390-ccw.elf: $(OBJECTS)
>>>>>>  	$(call quiet-command,$(CC) $(LDFLAGS) -o $@ $(OBJECTS),"  Building $(TARGET_DIR)$@")
>>>>>>
>>>>>>  s390-ccw.img: s390-ccw.elf
>>>>>> -	$(call quiet-command,strip --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
>>>>>> +	$(call quiet-command,$(STRIP) --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
>>>>>>
>>>>>>  $(OBJECTS): Makefile
>>>>>>
>>>>> Thanks, applied to s390-next.
>>>> Uhm, scratch that.
>>>>
>>>> This fails to build with --disable-strip, as $STRIP is unset in that
>>>> case:
>>>>
>>>>   Building s390-ccw/s390-ccw.elf
>>>>   Stripping s390-ccw/s390-ccw.img
>>>> /bin/sh: --strip-unneeded: command not found
>>>> make[1]: *** [s390-ccw.img] Error 127
>>>>
>>>>
>>>> The catch is that we always want to strip that binary. Care to send a
>>>> patch that deals with that?
>>>>
>>> I see the problem. I don't promise to fix it soon, but I could try
>>> dealing with this as I have time. I don't mind if someone else can just
>>> fix it with their own patch :)
>> It's not straightforward to fix. We need to detect a correct
>> cross-prefix for strip. There is something in roms/Makefile:
>>
>>     #
>>     # cross compiler auto detection
>>     #
>>     path := $(subst :, ,$(PATH))
>>     system := $(shell uname -s | tr "A-Z" "a-z")
>>
>>     # first find cross binutils in path
>>     find-cross-ld = $(firstword $(wildcard $(patsubst
>>     %,%/$(1)-*$(system)*-ld,$(path))))
>>     # then check we have cross gcc too
>>     find-cross-gcc = $(firstword $(wildcard $(patsubst %ld,%gcc,$(call
>>     find-cross-ld,$(1)))))
>>     # finally strip off path + toolname so we get the prefix
>>     find-cross-prefix = $(subst gcc,,$(notdir $(call find-cross-gcc,$(1))))
>>
>>     powerpc64_cross_prefix := $(call find-cross-prefix,powerpc64)
>>     powerpc_cross_prefix := $(call find-cross-prefix,powerpc)
>>     x86_64_cross_prefix := $(call find-cross-prefix,x86_64)
> Ohh that's cool. I wonder if it could be used to solve the tcg/tests
> problem?

Which problem? :)

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling
  2016-04-21 17:36         ` Sergey Fedorov
  2016-04-21 17:49           ` Alex Bennée
@ 2016-04-22  8:08           ` Cornelia Huck
  2016-05-09 12:49             ` Paolo Bonzini
  1 sibling, 1 reply; 57+ messages in thread
From: Cornelia Huck @ 2016-04-22  8:08 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: Sergey Fedorov, qemu-devel, Alex Bennée, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Christian Borntraeger,
	Alexander Graf

On Thu, 21 Apr 2016 20:36:49 +0300
Sergey Fedorov <serge.fdrv@gmail.com> wrote:

> On 18/04/16 18:47, Sergey Fedorov wrote:
> > On 18/04/16 18:34, Cornelia Huck wrote:
> >> On Mon, 18 Apr 2016 16:51:16 +0200
> >> Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> >>
> >>> On Thu,  7 Apr 2016 18:53:44 +0300
> >>> Sergey Fedorov <sergey.fedorov@linaro.org> wrote:
> >>>
> >>>> From: Sergey Fedorov <serge.fdrv@gmail.com>
> >>>>
> >>>> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
> >>>> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
> >>>> ---
> >>>>  pc-bios/s390-ccw/Makefile | 2 +-
> >>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
> >>>> index 4208cb429593..5ce6d4ccbaf5 100644
> >>>> --- a/pc-bios/s390-ccw/Makefile
> >>>> +++ b/pc-bios/s390-ccw/Makefile
> >>>> @@ -20,7 +20,7 @@ s390-ccw.elf: $(OBJECTS)
> >>>>  	$(call quiet-command,$(CC) $(LDFLAGS) -o $@ $(OBJECTS),"  Building $(TARGET_DIR)$@")
> >>>>
> >>>>  s390-ccw.img: s390-ccw.elf
> >>>> -	$(call quiet-command,strip --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
> >>>> +	$(call quiet-command,$(STRIP) --strip-unneeded $< -o $@,"  Stripping $(TARGET_DIR)$@")
> >>>>
> >>>>  $(OBJECTS): Makefile
> >>>>
> >>> Thanks, applied to s390-next.
> >> Uhm, scratch that.
> >>
> >> This fails to build with --disable-strip, as $STRIP is unset in that
> >> case:
> >>
> >>   Building s390-ccw/s390-ccw.elf
> >>   Stripping s390-ccw/s390-ccw.img
> >> /bin/sh: --strip-unneeded: command not found
> >> make[1]: *** [s390-ccw.img] Error 127
> >>
> >>
> >> The catch is that we always want to strip that binary. Care to send a
> >> patch that deals with that?
> >>
> > I see the problem. I don't promise to fix it soon, but I could try
> > dealing with this as I have time. I don't mind if someone else can just
> > fix it with their own patch :)
> 
> It's not straightforward to fix. We need to detect a correct
> cross-prefix for strip. There is something in roms/Makefile:
> 
>     #
>     # cross compiler auto detection
>     #
>     path := $(subst :, ,$(PATH))
>     system := $(shell uname -s | tr "A-Z" "a-z")
> 
>     # first find cross binutils in path
>     find-cross-ld = $(firstword $(wildcard $(patsubst
>     %,%/$(1)-*$(system)*-ld,$(path))))
>     # then check we have cross gcc too
>     find-cross-gcc = $(firstword $(wildcard $(patsubst %ld,%gcc,$(call
>     find-cross-ld,$(1)))))
>     # finally strip off path + toolname so we get the prefix
>     find-cross-prefix = $(subst gcc,,$(notdir $(call find-cross-gcc,$(1))))
> 
>     powerpc64_cross_prefix := $(call find-cross-prefix,powerpc64)
>     powerpc_cross_prefix := $(call find-cross-prefix,powerpc)
>     x86_64_cross_prefix := $(call find-cross-prefix,x86_64)
> 
> 
> and then:
> 
>     $(powerpc_cross_prefix)strip <...>

This one has the drawback of having to add architectures manually,
though.

> 
> 
> However, to solve this problem, it would be enough to export
> ${cross_prefix} from configure to config-host.mak and do like this:
> 
>     $(CROSS_PREFIX)strip <...>

I agree. This would probably be the easiest way.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 03/11] tci: Make direct jump patching thread-safe
  2016-04-20 13:14       ` Alex Bennée
@ 2016-04-22 11:31         ` Sergey Fedorov
  2016-04-22 12:49           ` Alex Bennée
  0 siblings, 1 reply; 57+ messages in thread
From: Sergey Fedorov @ 2016-04-22 11:31 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Sergey Fedorov, qemu-devel, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Stefan Weil

On 20/04/16 16:14, Alex Bennée wrote:
> Sergey Fedorov <serge.fdrv@gmail.com> writes:
>
>> On 20/04/16 12:42, Alex Bennée wrote:
>>> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>>>> diff --git a/tcg/tci/tcg-target.inc.c b/tcg/tci/tcg-target.inc.c
>>>> index 4afe4d7a8d59..7e6180e62898 100644
>>>> --- a/tcg/tci/tcg-target.inc.c
>>>> +++ b/tcg/tci/tcg-target.inc.c
>>>> @@ -556,6 +556,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
>>>>          if (s->tb_jmp_offset) {
>>>>              /* Direct jump method. */
>>>>              assert(args[0] < ARRAY_SIZE(s->tb_jmp_offset));
>>>> +            /* Align for atomic patching and thread safety */
>>>> +            s->code_ptr = (uint8_t *)(((uintptr_t)s->code_ptr + 3) &
>>>> ~3);
>>> Seeing this pattern is being used over and over again I wonder if we
>>> should have some utility helper functions for this? Perhaps we should
>>> steal the kernels ALIGN macros?
>> Good point, really. I see such a macro in hw/display/qxl.c and
>> kvm-all.c. It'd be better a common definition. Any idea of where to
>> put it?
> Somewhere inside include/qemu. osdep.h has ROUND_UP/DOWN functions maybe
> there makes the most sense?

Hmm, ROUND_UP() seems to be exactly what we need here. Though I think
compiler could be smart enough to give the same code with
QEMU_ALIGN_UP() as well. But we'd benefit from something like:

/* n-byte align pointer down */
#define QEMU_ALIGN_PTR_DOWN(p, n) \
    ((typeof(p))QEMU_ALIGN_DOWN((uintptr_t)(p), (n)))

/* n-byte align pointer up */
#define QEMU_ALIGN_PTR_UP(p, n) \
    ((typeof(p))QEMU_ALIGN_UP((uintptr_t)(p), (n)))


Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 03/11] tci: Make direct jump patching thread-safe
  2016-04-22 11:31         ` Sergey Fedorov
@ 2016-04-22 12:49           ` Alex Bennée
  0 siblings, 0 replies; 57+ messages in thread
From: Alex Bennée @ 2016-04-22 12:49 UTC (permalink / raw)
  To: Sergey Fedorov
  Cc: Sergey Fedorov, qemu-devel, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Stefan Weil


Sergey Fedorov <serge.fdrv@gmail.com> writes:

> On 20/04/16 16:14, Alex Bennée wrote:
>> Sergey Fedorov <serge.fdrv@gmail.com> writes:
>>
>>> On 20/04/16 12:42, Alex Bennée wrote:
>>>> Sergey Fedorov <sergey.fedorov@linaro.org> writes:
>>>>> diff --git a/tcg/tci/tcg-target.inc.c b/tcg/tci/tcg-target.inc.c
>>>>> index 4afe4d7a8d59..7e6180e62898 100644
>>>>> --- a/tcg/tci/tcg-target.inc.c
>>>>> +++ b/tcg/tci/tcg-target.inc.c
>>>>> @@ -556,6 +556,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
>>>>>          if (s->tb_jmp_offset) {
>>>>>              /* Direct jump method. */
>>>>>              assert(args[0] < ARRAY_SIZE(s->tb_jmp_offset));
>>>>> +            /* Align for atomic patching and thread safety */
>>>>> +            s->code_ptr = (uint8_t *)(((uintptr_t)s->code_ptr + 3) &
>>>>> ~3);
>>>> Seeing this pattern is being used over and over again I wonder if we
>>>> should have some utility helper functions for this? Perhaps we should
>>>> steal the kernels ALIGN macros?
>>> Good point, really. I see such a macro in hw/display/qxl.c and
>>> kvm-all.c. It'd be better a common definition. Any idea of where to
>>> put it?
>> Somewhere inside include/qemu. osdep.h has ROUND_UP/DOWN functions maybe
>> there makes the most sense?
>
> Hmm, ROUND_UP() seems to be exactly what we need here. Though I think
> compiler could be smart enough to give the same code with
> QEMU_ALIGN_UP() as well. But we'd benefit from something like:
>
> /* n-byte align pointer down */
> #define QEMU_ALIGN_PTR_DOWN(p, n) \
>     ((typeof(p))QEMU_ALIGN_DOWN((uintptr_t)(p), (n)))
>
> /* n-byte align pointer up */
> #define QEMU_ALIGN_PTR_UP(p, n) \
>     ((typeof(p))QEMU_ALIGN_UP((uintptr_t)(p), (n)))

Sounds good.

--
Alex Bennée

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling
  2016-04-22  8:08           ` Cornelia Huck
@ 2016-05-09 12:49             ` Paolo Bonzini
  2016-05-10 10:47               ` Sergey Fedorov
  0 siblings, 1 reply; 57+ messages in thread
From: Paolo Bonzini @ 2016-05-09 12:49 UTC (permalink / raw)
  To: Cornelia Huck, Sergey Fedorov
  Cc: Sergey Fedorov, qemu-devel, Alex Bennée, Peter Crosthwaite,
	Richard Henderson, Christian Borntraeger, Alexander Graf



On 22/04/2016 10:08, Cornelia Huck wrote:
>> > However, to solve this problem, it would be enough to export
>> > ${cross_prefix} from configure to config-host.mak and do like this:
>> > 
>> >     $(CROSS_PREFIX)strip <...>
> I agree. This would probably be the easiest way.
> 

$(CROSS_PREFIX) is a build->host cross prefix (tool running on build
machine, producing output for the machine that QEMU runs on).  The ones
in roms/Makefile are build->target cross prefix (tool running on build
machine, producing output for the machine that QEMU emulates).

So roms/Makefile is the way to go.

Paolo

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling
  2016-05-09 12:49             ` Paolo Bonzini
@ 2016-05-10 10:47               ` Sergey Fedorov
  0 siblings, 0 replies; 57+ messages in thread
From: Sergey Fedorov @ 2016-05-10 10:47 UTC (permalink / raw)
  To: Paolo Bonzini, Cornelia Huck
  Cc: Sergey Fedorov, qemu-devel, Alex Bennée, Peter Crosthwaite,
	Richard Henderson, Christian Borntraeger, Alexander Graf

On 09/05/16 15:49, Paolo Bonzini wrote:
>
> On 22/04/2016 10:08, Cornelia Huck wrote:
>>>> However, to solve this problem, it would be enough to export
>>>> ${cross_prefix} from configure to config-host.mak and do like this:
>>>>
>>>>     $(CROSS_PREFIX)strip <...>
>> I agree. This would probably be the easiest way.
>>
> $(CROSS_PREFIX) is a build->host cross prefix (tool running on build
> machine, producing output for the machine that QEMU runs on).  The ones
> in roms/Makefile are build->target cross prefix (tool running on build
> machine, producing output for the machine that QEMU emulates).
>
> So roms/Makefile is the way to go.

You're right, thanks. I suspect there should be some way to move this
stuff from roms/Makefile.

Kind regards,
Sergey

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2016-05-10 10:47 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-07 15:53 [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
2016-04-07 15:53 ` [Qemu-devel] [PATCH 01/11] tci: Fix build regression Sergey Fedorov
2016-04-07 18:15   ` Richard Henderson
2016-04-07 19:16     ` Stefan Weil
2016-04-07 20:37       ` Stefan Weil
2016-04-08  3:40         ` Richard Henderson
2016-04-07 15:53 ` [Qemu-devel] [PATCH 02/11] pc-bios/s390-ccw: Use correct strip when cross-compiling Sergey Fedorov
2016-04-07 16:18   ` Cornelia Huck
2016-04-07 16:22     ` Sergey Fedorov
2016-04-18 13:15     ` Sergey Fedorov
2016-04-18 14:51   ` Cornelia Huck
2016-04-18 15:34     ` Cornelia Huck
2016-04-18 15:47       ` Sergey Fedorov
2016-04-21 17:36         ` Sergey Fedorov
2016-04-21 17:49           ` Alex Bennée
2016-04-21 18:56             ` Sergey Fedorov
2016-04-22  8:08           ` Cornelia Huck
2016-05-09 12:49             ` Paolo Bonzini
2016-05-10 10:47               ` Sergey Fedorov
2016-04-07 15:53 ` [Qemu-devel] [PATCH 03/11] tci: Make direct jump patching thread-safe Sergey Fedorov
2016-04-20  9:42   ` Alex Bennée
2016-04-20 11:40     ` Sergey Fedorov
2016-04-20 13:14       ` Alex Bennée
2016-04-22 11:31         ` Sergey Fedorov
2016-04-22 12:49           ` Alex Bennée
2016-04-07 15:53 ` [Qemu-devel] [PATCH 04/11] tcg/ppc: " Sergey Fedorov
2016-04-20  9:49   ` Alex Bennée
2016-04-07 15:53 ` [Qemu-devel] [PATCH 05/11] tcg/i386: " Sergey Fedorov
2016-04-20  9:55   ` Alex Bennée
2016-04-20 11:43     ` Sergey Fedorov
2016-04-20 15:04     ` Richard Henderson
2016-04-20 16:15       ` Sergey Fedorov
2016-04-07 15:53 ` [Qemu-devel] [PATCH 06/11] tcg/s390: " Sergey Fedorov
2016-04-20 10:01   ` Alex Bennée
2016-04-20 11:45     ` Sergey Fedorov
2016-04-07 15:53 ` [Qemu-devel] [PATCH 07/11] tcg/arm: " Sergey Fedorov
2016-04-20 13:33   ` Alex Bennée
2016-04-20 14:29     ` Sergey Fedorov
2016-04-20 14:40       ` Alex Bennée
2016-04-20 16:12         ` Sergey Fedorov
2016-04-07 15:53 ` [Qemu-devel] [PATCH 08/11] tcg/aarch64: " Sergey Fedorov
2016-04-20 14:01   ` Alex Bennée
2016-04-20 15:08     ` Richard Henderson
2016-04-20 18:22       ` Alex Bennée
2016-04-20 18:57         ` Richard Henderson
2016-04-20 19:51           ` Alex Bennée
2016-04-20 18:44       ` Sergey Fedorov
2016-04-21 15:47   ` Sergey Fedorov
2016-04-07 15:53 ` [Qemu-devel] [PATCH 09/11] tcg/sparc: " Sergey Fedorov
2016-04-20 14:23   ` Alex Bennée
2016-04-07 15:53 ` [Qemu-devel] [PATCH 10/11] tcg/mips: " Sergey Fedorov
2016-04-07 16:01   ` Paolo Bonzini
2016-04-07 16:09     ` Sergey Fedorov
2016-04-07 15:53 ` [Qemu-devel] [PATCH 11/11] tcg: Note requirement on atomic direct jump patching Sergey Fedorov
2016-04-20 14:25   ` Alex Bennée
2016-04-07 15:56 ` [Qemu-devel] [PATCH 00/11] tcg: Make direct jump patching thread-safe Sergey Fedorov
2016-04-20  8:44 ` Alex Bennée

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.