* [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2
@ 2010-06-04 19:14 Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 01/35] tcg-s390: Adjust compilation flags Richard Henderson
                   ` (35 more replies)
  0 siblings, 36 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Changes v1->v2
  * Disassembler doesn't include GPLv3 code.

  * Smashed about 20 commits into the "New TCG Target" patch,
    which is fully functional from the beginning.

  * Merged many of the follow-on patches so that introducing the
    use of an instruction and conditionalizing that use on the
    active ISA features are no longer split into two separate
    patches.

This patch series is available at

  git://repo.or.cz/qemu/rth.git tcg-s390-3


r~



Richard Henderson (35):
  tcg-s390: Adjust compilation flags.
  s390x: Avoid _llseek.
  s390x: Don't use a linker script for user-only.
  tcg-s390: Compute is_write in cpu_signal_handler.
  tcg-s390: Icache flush is a no-op.
  tcg-s390: Allocate the code_gen_buffer near the main program.
  tcg: Optionally sign-extend 32-bit arguments for 64-bit host.
  s390: Update disassembler to the last GPLv2 from binutils.
  s390: Disassemble some general-instruction-extension insns.
  tcg-s390: New TCG target
  tcg-s390: Tidy unimplemented opcodes.
  tcg-s390: Define TCG_TMP0.
  tcg-s390: Tidy regset initialization; use R14 as temporary.
  tcg-s390: Rearrange register allocation order.
  tcg-s390: Query instruction extensions that are installed.
  tcg-s390: Re-implement tcg_out_movi.
  tcg-s390: Implement sign and zero-extension operations.
  tcg-s390: Implement bswap operations.
  tcg-s390: Implement rotates.
  tcg-s390: Use LOAD COMPLEMENT for negate.
  tcg-s390: Use the ADD IMMEDIATE instructions.
  tcg-s390: Use the AND IMMEDIATE instructions.
  tcg-s390: Use the OR IMMEDIATE instructions.
  tcg-s390: Use the XOR IMMEDIATE instructions.
  tcg-s390: Use the MULTIPLY IMMEDIATE instructions.
  tcg-s390: Tidy goto_tb.
  tcg-s390: Rearrange qemu_ld/st to avoid register copy.
  tcg-s390: Tidy tcg_prepare_qemu_ldst.
  tcg-s390: Tidy user qemu_ld/st.
  tcg-s390: Implement GUEST_BASE.
  tcg-s390: Use 16-bit branches for forward jumps.
  tcg-s390: Use the LOAD AND TEST instruction for compares.
  tcg-s390: Use the COMPARE IMMEDIATE instructions for compares.
  tcg-s390: Use COMPARE AND BRANCH instructions.
  tcg-s390: Enable compile in 32-bit mode.

 configure                    |   12 +-
 cpu-exec.c                   |   42 +-
 def-helper.h                 |   38 +-
 exec.c                       |    7 +
 linux-user/syscall.c         |    4 +-
 s390-dis.c                   |  168 +++-
 target-i386/ops_sse_header.h |    3 +
 target-ppc/helper.h          |    1 +
 tcg/s390/tcg-target.c        | 2248 +++++++++++++++++++++++++++++++++++++++++-
 tcg/s390/tcg-target.h        |   63 +-
 tcg/tcg-op.h                 |   42 +-
 tcg/tcg.c                    |   41 +-
 12 files changed, 2536 insertions(+), 133 deletions(-)


* [Qemu-devel] [PATCH 01/35] tcg-s390: Adjust compilation flags.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-09 22:53   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 02/35] s390x: Avoid _llseek Richard Henderson
                   ` (34 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Force -m31/-m64 based on s390/s390x target.

Force -march=z990.  The TCG backend will always require the
long-displacement facility, so the compiler may as well make
use of that as well.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 configure |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index 653c8d2..65f87a2 100755
--- a/configure
+++ b/configure
@@ -697,7 +697,12 @@ case "$cpu" in
            fi
            ;;
     s390)
-           QEMU_CFLAGS="-march=z900 $QEMU_CFLAGS"
+           QEMU_CFLAGS="-m31 -march=z990 $QEMU_CFLAGS"
+           LDFLAGS="-m31 $LDFLAGS"
+           ;;
+    s390x)
+           QEMU_CFLAGS="-m64 -march=z990 $QEMU_CFLAGS"
+           LDFLAGS="-m64 $LDFLAGS"
            ;;
     i386)
            QEMU_CFLAGS="-m32 $QEMU_CFLAGS"
-- 
1.7.0.1


* [Qemu-devel] [PATCH 02/35] s390x: Avoid _llseek.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 01/35] tcg-s390: Adjust compilation flags Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-09 22:54   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 03/35] s390x: Don't use a linker script for user-only Richard Henderson
                   ` (33 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

There's no _llseek on s390x either.  Replace the existing
test for __x86_64__ with a functional test for __NR_llseek.
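
For illustration only (not part of the patch, and the helper name is made
up): when the host has no __NR_llseek, the fallback path below folds the
guest's hi/lo halves into a single 64-bit offset for plain lseek().

    #include <stdint.h>
    #include <sys/types.h>

    /* Combine the two 32-bit halves the guest passes to _llseek into
       the single 64-bit offset that the host's lseek() expects.  */
    static off_t llseek_offset(uint32_t hi, uint32_t lo)
    {
        return (off_t)(((uint64_t)hi << 32) | lo);
    }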

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 linux-user/syscall.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 8222cb9..e94f1ee 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -208,7 +208,7 @@ _syscall3(int, sys_getdents, uint, fd, struct linux_dirent *, dirp, uint, count)
 _syscall3(int, sys_getdents64, uint, fd, struct linux_dirent64 *, dirp, uint, count);
 #endif
 _syscall2(int, sys_getpriority, int, which, int, who);
-#if defined(TARGET_NR__llseek) && !defined (__x86_64__)
+#if defined(TARGET_NR__llseek) && defined(__NR_llseek)
 _syscall5(int, _llseek,  uint,  fd, ulong, hi, ulong, lo,
           loff_t *, res, uint, wh);
 #endif
@@ -5933,7 +5933,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #ifdef TARGET_NR__llseek /* Not on alpha */
     case TARGET_NR__llseek:
         {
-#if defined (__x86_64__)
+#if !defined(__NR_llseek)
             ret = get_errno(lseek(arg1, ((uint64_t )arg2 << 32) | arg3, arg5));
             if (put_user_s64(ret, arg4))
                 goto efault;
-- 
1.7.0.1


* [Qemu-devel] [PATCH 03/35] s390x: Don't use a linker script for user-only.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 01/35] tcg-s390: Adjust compilation flags Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 02/35] s390x: Avoid _llseek Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-09 22:54   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 04/35] tcg-s390: Compute is_write in cpu_signal_handler Richard Henderson
                   ` (32 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The default placement of the application at 0x80000000 is fine,
and keeps it clear of the default placement used by most other guests.
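
A quick way to confirm where a default-linked binary actually lands
(illustration only, not part of the patch; __executable_start is the
symbol the default GNU ld script provides for the start of the image):

    #include <stdio.h>

    extern char __executable_start[];   /* from the default linker script */

    int main(void)
    {
        /* On s390x this should print 0x80000000, matching the
           placement described above.  */
        printf("%p\n", (void *)__executable_start);
        return 0;
    }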

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 configure |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/configure b/configure
index 65f87a2..7f5b5b2 100755
--- a/configure
+++ b/configure
@@ -2758,6 +2758,9 @@ if test "$target_linux_user" = "yes" -o "$target_bsd_user" = "yes" ; then
     # -static is used to avoid g1/g3 usage by the dynamic linker
     ldflags="$linker_script -static $ldflags"
     ;;
+  alpha | s390x)
+    # The default placement of the application is fine.
+    ;;
   *)
     ldflags="$linker_script $ldflags"
     ;;
-- 
1.7.0.1


* [Qemu-devel] [PATCH 04/35] tcg-s390: Compute is_write in cpu_signal_handler.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (2 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 03/35] s390x: Don't use a linker script for user-only Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-09 22:54   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 05/35] tcg-s390: Icache flush is a no-op Richard Henderson
                   ` (31 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 cpu-exec.c |   42 +++++++++++++++++++++++++++++++++++++++---
 1 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index c776605..026980a 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -1156,11 +1156,47 @@ int cpu_signal_handler(int host_signum, void *pinfo,
     siginfo_t *info = pinfo;
     struct ucontext *uc = puc;
     unsigned long pc;
-    int is_write;
+    uint16_t *pinsn;
+    int is_write = 0;
 
     pc = uc->uc_mcontext.psw.addr;
-    /* XXX: compute is_write */
-    is_write = 0;
+
+    /* ??? On linux, the non-rt signal handler has 4 (!) arguments instead
+       of the normal 2 arguments.  The 3rd argument contains the "int_code"
+       from the hardware which does in fact contain the is_write value.
+       The rt signal handler, as far as I can tell, does not give this value
+       at all.  Not that we could get to it from here even if it were.  */
+    /* ??? This is not even close to complete, since it ignores all
+       of the read-modify-write instructions.  */
+    pinsn = (uint16_t *)pc;
+    switch (pinsn[0] >> 8) {
+    case 0x50: /* ST */
+    case 0x42: /* STC */
+    case 0x40: /* STH */
+        is_write = 1;
+        break;
+    case 0xc4: /* RIL format insns */
+        switch (pinsn[0] & 0xf) {
+        case 0xf: /* STRL */
+        case 0xb: /* STGRL */
+        case 0x7: /* STHRL */
+            is_write = 1;
+        }
+        break;
+    case 0xe3: /* RXY format insns */
+        switch (pinsn[2] & 0xff) {
+        case 0x50: /* STY */
+        case 0x24: /* STG */
+        case 0x72: /* STCY */
+        case 0x70: /* STHY */
+        case 0x8e: /* STPQ */
+        case 0x3f: /* STRVH */
+        case 0x3e: /* STRV */
+        case 0x2f: /* STRVG */
+            is_write = 1;
+        }
+        break;
+    }
     return handle_cpu_signal(pc, (unsigned long)info->si_addr,
                              is_write, &uc->uc_sigmask, puc);
 }
-- 
1.7.0.1


* [Qemu-devel] [PATCH 05/35] tcg-s390: Icache flush is a no-op.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (3 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 04/35] tcg-s390: Compute is_write in cpu_signal_handler Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-09 22:55   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 06/35] tcg-s390: Allocate the code_gen_buffer near the main program Richard Henderson
                   ` (30 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Before gcc 4.2, __builtin___clear_cache doesn't exist, and
afterward the gcc s390 backend implements it as nothing.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.h |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index d8a2955..d7fe0c7 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -94,9 +94,4 @@ enum {
 
 static inline void flush_icache_range(unsigned long start, unsigned long stop)
 {
-#if QEMU_GNUC_PREREQ(4, 1)
-    __builtin___clear_cache((char *) start, (char *) stop);
-#else
-#error not implemented
-#endif
 }
-- 
1.7.0.1


* [Qemu-devel] [PATCH 06/35] tcg-s390: Allocate the code_gen_buffer near the main program.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (4 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 05/35] tcg-s390: Icache flush is a no-op Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-09 22:59   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 07/35] tcg: Optionally sign-extend 32-bit arguments for 64-bit host Richard Henderson
                   ` (29 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

This allows the use of direct calls to the helpers,
and a direct branch back to the epilogue.
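
The "direct" part relies on BRASL/BRCL taking a signed 32-bit offset
counted in halfwords, i.e. a +-4GB reach from the call site.  A minimal
sketch of that range check (illustration only, not code from the patch;
the function name is made up):

    #include <stdint.h>
    #include <stdbool.h>

    /* True if 'to' is reachable from 'from' with BRASL/BRCL, whose
       displacement is a signed 32-bit count of halfwords.  */
    static bool in_brasl_range(uintptr_t from, uintptr_t to)
    {
        intptr_t disp = (intptr_t)(to - from) >> 1;   /* halfwords */
        return disp == (int32_t)disp;
    }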

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 exec.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/exec.c b/exec.c
index bb3dcad..7bbfe60 100644
--- a/exec.c
+++ b/exec.c
@@ -519,6 +519,13 @@ static void code_gen_alloc(unsigned long tb_size)
         start = (void *) 0x01000000UL;
         if (code_gen_buffer_size > 16 * 1024 * 1024)
             code_gen_buffer_size = 16 * 1024 * 1024;
+#elif defined(__s390x__)
+        /* Map the buffer so that we can use direct calls and branches.  */
+        /* We have a +- 4GB range on the branches; leave some slop.  */
+        if (code_gen_buffer_size > (3ul * 1024 * 1024 * 1024)) {
+            code_gen_buffer_size = 3ul * 1024 * 1024 * 1024;
+        }
+        start = (void *)0x90000000UL;
 #endif
         code_gen_buffer = mmap(start, code_gen_buffer_size,
                                PROT_WRITE | PROT_READ | PROT_EXEC,
-- 
1.7.0.1


* [Qemu-devel] [PATCH 07/35] tcg: Optionally sign-extend 32-bit arguments for 64-bit host.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (5 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 06/35] tcg-s390: Allocate the code_gen_buffer near the main program Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-10 10:22   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 08/35] s390: Update disassembler to the last GPLv2 from binutils Richard Henderson
                   ` (28 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Some hosts (amd64, ia64) have an ABI that ignores the high bits
of the 64-bit register when passing 32-bit arguments.  Others,
like s390x, require the value to be properly extended according
to its type.  That is, "int32_t" must be sign-extended and
"uint32_t" must be zero-extended to 64 bits.

To effect this, extend the "sizemask" parameter to tcg_gen_callN
to include the signedness of the type of each parameter.  If the
tcg target requires it, extend each 32-bit argument into a 64-bit
temp and pass that to the function call.
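
For reference, the resulting packing, spelled out stand-alone (the macro
names here are made up for illustration; the bit layout is the one the
patch defines): two bits per slot, slot 0 being the return value and
slot N being argument N.

    /* bit 2N   -> slot N holds a 64-bit value
       bit 2N+1 -> slot N holds a signed value  */
    #define SLOT_64BIT(n)   (1 << ((n) * 2))
    #define SLOT_SIGNED(n)  (2 << ((n) * 2))

    /* e.g. the 0x2a used for the signed 32-bit division helpers:
       return value and both arguments signed, nothing 64-bit.  */
    _Static_assert(0x2a == (SLOT_SIGNED(0) | SLOT_SIGNED(1) | SLOT_SIGNED(2)),
                   "sizemask for a signed (i32, i32) -> i32 helper");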

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 def-helper.h                 |   38 +++++++++++++++++++++++++++++---------
 target-i386/ops_sse_header.h |    3 +++
 target-ppc/helper.h          |    1 +
 tcg/s390/tcg-target.h        |    2 ++
 tcg/tcg-op.h                 |   42 +++++++++++++++++++++---------------------
 tcg/tcg.c                    |   41 +++++++++++++++++++++++++++++++++++------
 6 files changed, 91 insertions(+), 36 deletions(-)

diff --git a/def-helper.h b/def-helper.h
index 8a88c5b..8a822c7 100644
--- a/def-helper.h
+++ b/def-helper.h
@@ -81,9 +81,29 @@
 #define dh_is_64bit_ptr (TCG_TARGET_REG_BITS == 64)
 #define dh_is_64bit(t) glue(dh_is_64bit_, dh_alias(t))
 
+#define dh_is_signed_void 0
+#define dh_is_signed_i32 0
+#define dh_is_signed_s32 1
+#define dh_is_signed_i64 0
+#define dh_is_signed_s64 1
+#define dh_is_signed_f32 0
+#define dh_is_signed_f64 0
+#define dh_is_signed_tl  0
+#define dh_is_signed_int 1
+/* ??? This is highly specific to the host cpu.  There are even special
+   extension instructions that may be required, e.g. ia64's addp4.  But
+   for now we don't support any 64-bit targets with 32-bit pointers.  */
+#define dh_is_signed_ptr 0
+#define dh_is_signed_env dh_is_signed_ptr
+#define dh_is_signed(t) dh_is_signed_##t
+
+#define dh_sizemask(t, n) \
+  sizemask |= dh_is_64bit(t) << (n*2); \
+  sizemask |= dh_is_signed(t) << (n*2+1)
+
 #define dh_arg(t, n) \
   args[n - 1] = glue(GET_TCGV_, dh_alias(t))(glue(arg, n)); \
-  sizemask |= dh_is_64bit(t) << n
+  dh_sizemask(t, n)
 
 #define dh_arg_decl(t, n) glue(TCGv_, dh_alias(t)) glue(arg, n)
 
@@ -138,8 +158,8 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl0(ret)) \
 static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1)) \
 { \
   TCGArg args[1]; \
-  int sizemask; \
-  sizemask = dh_is_64bit(ret); \
+  int sizemask = 0; \
+  dh_sizemask(ret, 0); \
   dh_arg(t1, 1); \
   tcg_gen_helperN(HELPER(name), flags, sizemask, dh_retvar(ret), 1, args); \
 }
@@ -149,8 +169,8 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1
     dh_arg_decl(t2, 2)) \
 { \
   TCGArg args[2]; \
-  int sizemask; \
-  sizemask = dh_is_64bit(ret); \
+  int sizemask = 0; \
+  dh_sizemask(ret, 0); \
   dh_arg(t1, 1); \
   dh_arg(t2, 2); \
   tcg_gen_helperN(HELPER(name), flags, sizemask, dh_retvar(ret), 2, args); \
@@ -161,8 +181,8 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1
     dh_arg_decl(t2, 2), dh_arg_decl(t3, 3)) \
 { \
   TCGArg args[3]; \
-  int sizemask; \
-  sizemask = dh_is_64bit(ret); \
+  int sizemask = 0; \
+  dh_sizemask(ret, 0); \
   dh_arg(t1, 1); \
   dh_arg(t2, 2); \
   dh_arg(t3, 3); \
@@ -174,8 +194,8 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1
     dh_arg_decl(t2, 2), dh_arg_decl(t3, 3), dh_arg_decl(t4, 4)) \
 { \
   TCGArg args[4]; \
-  int sizemask; \
-  sizemask = dh_is_64bit(ret); \
+  int sizemask = 0; \
+  dh_sizemask(ret, 0); \
   dh_arg(t1, 1); \
   dh_arg(t2, 2); \
   dh_arg(t3, 3); \
diff --git a/target-i386/ops_sse_header.h b/target-i386/ops_sse_header.h
index a0a6361..8d4b2b7 100644
--- a/target-i386/ops_sse_header.h
+++ b/target-i386/ops_sse_header.h
@@ -30,6 +30,9 @@
 #define dh_ctype_Reg Reg *
 #define dh_ctype_XMMReg XMMReg *
 #define dh_ctype_MMXReg MMXReg *
+#define dh_is_signed_Reg dh_is_signed_ptr
+#define dh_is_signed_XMMReg dh_is_signed_ptr
+#define dh_is_signed_MMXReg dh_is_signed_ptr
 
 DEF_HELPER_2(glue(psrlw, SUFFIX), void, Reg, Reg)
 DEF_HELPER_2(glue(psraw, SUFFIX), void, Reg, Reg)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 5cf6cd4..c025a2f 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -95,6 +95,7 @@ DEF_HELPER_3(fsel, i64, i64, i64, i64)
 
 #define dh_alias_avr ptr
 #define dh_ctype_avr ppc_avr_t *
+#define dh_is_signed_avr dh_is_signed_ptr
 
 DEF_HELPER_3(vaddubm, void, avr, avr, avr)
 DEF_HELPER_3(vadduhm, void, avr, avr, avr)
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index d7fe0c7..8c19262 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -87,6 +87,8 @@ enum {
 #define TCG_TARGET_STACK_ALIGN		8
 #define TCG_TARGET_CALL_STACK_OFFSET	0
 
+#define TCG_TARGET_EXTEND_ARGS 1
+
 enum {
     /* Note: must be synced with dyngen-exec.h */
     TCG_AREG0 = TCG_REG_R10,
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index aa436de..4220e3d 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -369,8 +369,8 @@ static inline void tcg_gen_helperN(void *func, int flags, int sizemask,
    and pure, hence the call to tcg_gen_callN() with TCG_CALL_CONST |
    TCG_CALL_PURE. This may need to be adjusted if these functions
    start to be used with other helpers. */
-static inline void tcg_gen_helper32(void *func, TCGv_i32 ret,
-                                    TCGv_i32 a, TCGv_i32 b)
+static inline void tcg_gen_helper32(void *func, TCGv_i32 ret, TCGv_i32 a,
+                                    TCGv_i32 b, _Bool is_signed)
 {
     TCGv_ptr fn;
     TCGArg args[2];
@@ -378,12 +378,12 @@ static inline void tcg_gen_helper32(void *func, TCGv_i32 ret,
     args[0] = GET_TCGV_I32(a);
     args[1] = GET_TCGV_I32(b);
     tcg_gen_callN(&tcg_ctx, fn, TCG_CALL_CONST | TCG_CALL_PURE,
-                  0, GET_TCGV_I32(ret), 2, args);
+                  (is_signed ? 0x2a : 0x00), GET_TCGV_I32(ret), 2, args);
     tcg_temp_free_ptr(fn);
 }
 
-static inline void tcg_gen_helper64(void *func, TCGv_i64 ret,
-                                    TCGv_i64 a, TCGv_i64 b)
+static inline void tcg_gen_helper64(void *func, TCGv_i64 ret, TCGv_i64 a,
+                                    TCGv_i64 b, _Bool is_signed)
 {
     TCGv_ptr fn;
     TCGArg args[2];
@@ -391,7 +391,7 @@ static inline void tcg_gen_helper64(void *func, TCGv_i64 ret,
     args[0] = GET_TCGV_I64(a);
     args[1] = GET_TCGV_I64(b);
     tcg_gen_callN(&tcg_ctx, fn, TCG_CALL_CONST | TCG_CALL_PURE,
-                  7, GET_TCGV_I64(ret), 2, args);
+                  (is_signed ? 0x3f : 0x15), GET_TCGV_I64(ret), 2, args);
     tcg_temp_free_ptr(fn);
 }
 
@@ -692,22 +692,22 @@ static inline void tcg_gen_remu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 #else
 static inline void tcg_gen_div_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
-    tcg_gen_helper32(tcg_helper_div_i32, ret, arg1, arg2);
+    tcg_gen_helper32(tcg_helper_div_i32, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_rem_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
-    tcg_gen_helper32(tcg_helper_rem_i32, ret, arg1, arg2);
+    tcg_gen_helper32(tcg_helper_rem_i32, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_divu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
-    tcg_gen_helper32(tcg_helper_divu_i32, ret, arg1, arg2);
+    tcg_gen_helper32(tcg_helper_divu_i32, ret, arg1, arg2, 0);
 }
 
 static inline void tcg_gen_remu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
-    tcg_gen_helper32(tcg_helper_remu_i32, ret, arg1, arg2);
+    tcg_gen_helper32(tcg_helper_remu_i32, ret, arg1, arg2, 0);
 }
 #endif
 
@@ -867,7 +867,7 @@ static inline void tcg_gen_xori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
    specific code (x86) */
 static inline void tcg_gen_shl_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_shl_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_shl_i64, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
@@ -877,7 +877,7 @@ static inline void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
 
 static inline void tcg_gen_shr_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_shr_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_shr_i64, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
@@ -887,7 +887,7 @@ static inline void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
 
 static inline void tcg_gen_sar_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_sar_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_sar_i64, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_sari_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
@@ -935,22 +935,22 @@ static inline void tcg_gen_mul_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 
 static inline void tcg_gen_div_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_div_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_div_i64, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_rem_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_rem_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_rem_i64, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_divu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_divu_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_divu_i64, ret, arg1, arg2, 0);
 }
 
 static inline void tcg_gen_remu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_remu_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_remu_i64, ret, arg1, arg2, 0);
 }
 
 #else
@@ -1212,22 +1212,22 @@ static inline void tcg_gen_remu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 #else
 static inline void tcg_gen_div_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_div_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_div_i64, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_rem_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_rem_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_rem_i64, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_divu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_divu_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_divu_i64, ret, arg1, arg2, 0);
 }
 
 static inline void tcg_gen_remu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_remu_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_remu_i64, ret, arg1, arg2, 0);
 }
 #endif
 
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 880e7ce..d8ddd1f 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -560,6 +560,24 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
     int real_args;
     int nb_rets;
     TCGArg *nparam;
+
+#if defined(TCG_TARGET_EXTEND_ARGS) && TCG_TARGET_REG_BITS == 64
+    for (i = 0; i < nargs; ++i) {
+        int is_64bit = sizemask & (1 << (i+1)*2);
+        int is_signed = sizemask & (2 << (i+1)*2);
+        if (!is_64bit) {
+            TCGv_i64 temp = tcg_temp_new_i64();
+            TCGv_i64 orig = MAKE_TCGV_I64(args[i]);
+            if (is_signed) {
+                tcg_gen_ext32s_i64(temp, orig);
+            } else {
+                tcg_gen_ext32u_i64(temp, orig);
+            }
+            args[i] = GET_TCGV_I64(temp);
+        }
+    }
+#endif /* TCG_TARGET_EXTEND_ARGS */
+
     *gen_opc_ptr++ = INDEX_op_call;
     nparam = gen_opparam_ptr++;
 #ifdef TCG_TARGET_I386
@@ -588,7 +606,8 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
     real_args = 0;
     for (i = 0; i < nargs; i++) {
 #if TCG_TARGET_REG_BITS < 64
-        if (sizemask & (2 << i)) {
+        int is_64bit = sizemask & (1 << (i+1)*2);
+        if (is_64bit) {
 #ifdef TCG_TARGET_I386
             /* REGPARM case: if the third parameter is 64 bit, it is
                allocated on the stack */
@@ -622,12 +641,12 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
             *gen_opparam_ptr++ = args[i] + 1;
 #endif
             real_args += 2;
-        } else
-#endif
-        {
-            *gen_opparam_ptr++ = args[i];
-            real_args++;
+            continue;
         }
+#endif /* TCG_TARGET_REG_BITS < 64 */
+
+        *gen_opparam_ptr++ = args[i];
+        real_args++;
     }
     *gen_opparam_ptr++ = GET_TCGV_PTR(func);
 
@@ -637,6 +656,16 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
 
     /* total parameters, needed to go backward in the instruction stream */
     *gen_opparam_ptr++ = 1 + nb_rets + real_args + 3;
+
+#if defined(TCG_TARGET_EXTEND_ARGS) && TCG_TARGET_REG_BITS == 64
+    for (i = 0; i < nargs; ++i) {
+        int is_64bit = sizemask & (1 << (i+1)*2);
+        if (!is_64bit) {
+            TCGv_i64 temp = MAKE_TCGV_I64(args[i]);
+            tcg_temp_free_i64(temp);
+        }
+    }
+#endif /* TCG_TARGET_EXTEND_ARGS */
 }
 
 #if TCG_TARGET_REG_BITS == 32
-- 
1.7.0.1


* [Qemu-devel] [PATCH 08/35] s390: Update disassembler to the last GPLv2 from binutils.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (6 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 07/35] tcg: Optionally sign-extend 32-bit arguments for 64-bit host Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-09 22:47   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 09/35] s390: Disassemble some general-instruction-extension insns Richard Henderson
                   ` (27 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 s390-dis.c |   81 +++++++++++++++++++++++++++++++++++++-----------------------
 1 files changed, 50 insertions(+), 31 deletions(-)

diff --git a/s390-dis.c b/s390-dis.c
index 86dd84f..3d96be0 100644
--- a/s390-dis.c
+++ b/s390-dis.c
@@ -1,3 +1,4 @@
+/* opcodes/s390-dis.c revision 1.12 */
 /* s390-dis.c -- Disassemble S390 instructions
    Copyright 2000, 2001, 2002, 2003, 2005 Free Software Foundation, Inc.
    Contributed by Martin Schwidefsky (schwidefsky@de.ibm.com).
@@ -15,11 +16,14 @@
    GNU General Public License for more details.
 
    You should have received a copy of the GNU General Public License
-   along with this program; if not, see <http://www.gnu.org/licenses/>.  */
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
+   02110-1301, USA.  */
 
-#include <stdio.h>
+#include "qemu-common.h"
 #include "dis-asm.h"
 
+/* include/opcode/s390.h revision 1.9 */
 /* s390.h -- Header file for S390 opcode table
    Copyright 2000, 2001, 2003 Free Software Foundation, Inc.
    Contributed by Martin Schwidefsky (schwidefsky@de.ibm.com).
@@ -37,7 +41,9 @@
    GNU General Public License for more details.
 
    You should have received a copy of the GNU General Public License
-   along with this program; if not, see <http://www.gnu.org/licenses/>.  */
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
+   02110-1301, USA.  */
 
 #ifndef S390_H
 #define S390_H
@@ -57,7 +63,8 @@ enum s390_opcode_cpu_val
     S390_OPCODE_Z900,
     S390_OPCODE_Z990,
     S390_OPCODE_Z9_109,
-    S390_OPCODE_Z9_EC
+    S390_OPCODE_Z9_EC,
+    S390_OPCODE_Z10
   };
 
 /* The opcode table is an array of struct s390_opcode.  */
@@ -95,12 +102,13 @@ struct s390_opcode
 /* The table itself is sorted by major opcode number, and is otherwise
    in the order in which the disassembler should consider
    instructions.  */
-extern const struct s390_opcode s390_opcodes[];
-extern const int                s390_num_opcodes;
+/* QEMU: Mark these static.  */
+static const struct s390_opcode s390_opcodes[];
+static const int                s390_num_opcodes;
 
 /* A opcode format table for the .insn pseudo mnemonic.  */
-extern const struct s390_opcode s390_opformats[];
-extern const int                s390_num_opformats;
+static const struct s390_opcode s390_opformats[];
+static const int                s390_num_opformats;
 
 /* Values defined for the flags field of a struct powerpc_opcode.  */
 
@@ -121,7 +129,7 @@ struct s390_operand
 /* Elements in the table are retrieved by indexing with values from
    the operands field of the powerpc_opcodes table.  */
 
-extern const struct s390_operand s390_operands[];
+static const struct s390_operand s390_operands[];
 
 /* Values defined for the flags field of a struct s390_operand.  */
 
@@ -164,12 +172,13 @@ extern const struct s390_operand s390_operands[];
    the instruction may be optional.  */
 #define S390_OPERAND_OPTIONAL 0x400
 
-	#endif /* S390_H */
-
+#endif /* S390_H */
 
 static int init_flag = 0;
 static int opc_index[256];
-static int current_arch_mask = 0;
+
+/* QEMU: We've disabled the architecture check below.  */
+/* static int current_arch_mask = 0; */
 
 /* Set up index table for first opcode byte.  */
 
@@ -188,17 +197,21 @@ init_disasm (struct disassemble_info *info)
 	     (opcode[1].opcode[0] == opcode->opcode[0]))
 	opcode++;
     }
-//  switch (info->mach)
-//    {
-//    case bfd_mach_s390_31:
+
+#ifdef QEMU_DISABLE
+  switch (info->mach)
+    {
+    case bfd_mach_s390_31:
       current_arch_mask = 1 << S390_OPCODE_ESA;
-//      break;
-//    case bfd_mach_s390_64:
-//      current_arch_mask = 1 << S390_OPCODE_ZARCH;
-//      break;
-//    default:
-//      abort ();
-//    }
+      break;
+    case bfd_mach_s390_64:
+      current_arch_mask = 1 << S390_OPCODE_ZARCH;
+      break;
+    default:
+      abort ();
+    }
+#endif /* QEMU_DISABLE */
+
   init_flag = 1;
 }
 
@@ -297,9 +310,12 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
 	  const struct s390_operand *operand;
 	  const unsigned char *opindex;
 
+#ifdef QEMU_DISABLE
 	  /* Check architecture.  */
 	  if (!(opcode->modes & current_arch_mask))
 	    continue;
+#endif /* QEMU_DISABLE */
+
 	  /* Check signature of the opcode.  */
 	  if ((buffer[1] & opcode->mask[1]) != opcode->opcode[1]
 	      || (buffer[2] & opcode->mask[2]) != opcode->opcode[2]
@@ -392,6 +408,8 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
       return 1;
     }
 }
+
+/* opcodes/s390-opc.c revision 1.16 */
 /* s390-opc.c -- S390 opcode list
    Copyright 2000, 2001, 2003 Free Software Foundation, Inc.
    Contributed by Martin Schwidefsky (schwidefsky@de.ibm.com).
@@ -409,9 +427,9 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
    GNU General Public License for more details.
 
    You should have received a copy of the GNU General Public License
-   along with this program; if not, see <http://www.gnu.org/licenses/>.  */
-
-#include <stdio.h>
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
+   02110-1301, USA.  */
 
 /* This file holds the S390 opcode table.  The opcode table
    includes almost all of the extended instruction mnemonics.  This
@@ -427,7 +445,7 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
 /* The operands table.
    The fields are bits, shift, insert, extract, flags.  */
 
-const struct s390_operand s390_operands[] =
+static const struct s390_operand s390_operands[] =
 {
 #define UNUSED 0
   { 0, 0, 0 },                    /* Indicates the end of the operand list */
@@ -563,7 +581,7 @@ const struct s390_operand s390_operands[] =
       quite close.
 
       For example the instruction "mvo" is defined in the PoP as follows:
-
+      
       MVO  D1(L1,B1),D2(L2,B2)   [SS]
 
       --------------------------------------
@@ -739,7 +757,7 @@ const struct s390_operand s390_operands[] =
 
 /* The opcode formats table (blueprints for .insn pseudo mnemonic).  */
 
-const struct s390_opcode s390_opformats[] =
+static const struct s390_opcode s390_opformats[] =
   {
   { "e",	OP8(0x00LL),	MASK_E,		INSTR_E,	3, 0 },
   { "ri",	OP8(0x00LL),	MASK_RI_RI,	INSTR_RI_RI,	3, 0 },
@@ -765,9 +783,10 @@ const struct s390_opcode s390_opformats[] =
   { "ssf",	OP8(0x00LL),	MASK_SSF_RRDRD,	INSTR_SSF_RRDRD,3, 0 },
 };
 
-const int s390_num_opformats =
+static const int s390_num_opformats =
   sizeof (s390_opformats) / sizeof (s390_opformats[0]);
 
+/* include "s390-opc.tab" generated from opcodes/s390-opc.txt rev 1.17 */
 /* The opcode table. This file was generated by s390-mkopc.
 
    The format of the opcode table is:
@@ -783,7 +802,7 @@ const int s390_num_opformats =
    The disassembler reads the table in order and prints the first
    instruction which matches.  */
 
-const struct s390_opcode s390_opcodes[] =
+static const struct s390_opcode s390_opcodes[] =
   {
   { "dp", OP8(0xfdLL), MASK_SS_LLRDRD, INSTR_SS_LLRDRD, 3, 0},
   { "mp", OP8(0xfcLL), MASK_SS_LLRDRD, INSTR_SS_LLRDRD, 3, 0},
@@ -1700,5 +1719,5 @@ const struct s390_opcode s390_opcodes[] =
   { "pr", OP16(0x0101LL), MASK_E, INSTR_E, 3, 0}
 };
 
-const int s390_num_opcodes =
+static const int s390_num_opcodes =
   sizeof (s390_opcodes) / sizeof (s390_opcodes[0]);
-- 
1.7.0.1


* [Qemu-devel] [PATCH 09/35] s390: Disassemble some general-instruction-extension insns.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (7 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 08/35] s390: Update disassembler to the last GPLv2 from binutils Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-09 22:47   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 10/35] tcg-s390: New TCG target Richard Henderson
                   ` (26 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The full general-instruction-extension facility was added to binutils
after the change to GPLv3.  This is not the entire extension, just
what we're using in TCG.
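
The new table entries describe 6-byte RIE-format instructions.  As an
illustration (not part of the patch; the function name is made up),
pulling the fields back out of a "crj" by hand, using the bit positions
the operand table assigns (R at 8 and 12, 16-bit relative offset at 16,
mask at 32):

    #include <stdint.h>
    #include <stdio.h>

    /* Decode the 6-byte COMPARE AND BRANCH RELATIVE (crj) laid out by
       INSTR_RIE_MRRP: 0xec, R1, R2, RI4 (in halfwords), M4, ..., 0x76.  */
    static void decode_crj(const uint8_t insn[6])
    {
        unsigned r1  = insn[1] >> 4;                         /* bits  8-11 */
        unsigned r2  = insn[1] & 0xf;                        /* bits 12-15 */
        int16_t  ri4 = (int16_t)((insn[2] << 8) | insn[3]);  /* bits 16-31 */
        unsigned m4  = insn[4] >> 4;                         /* bits 32-35 */
        printf("crj %%r%u,%%r%u,mask=%u,offset=%d bytes\n",
               r1, r2, m4, ri4 * 2);
    }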

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 s390-dis.c |   89 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 81 insertions(+), 8 deletions(-)

diff --git a/s390-dis.c b/s390-dis.c
index 3d96be0..2eed69b 100644
--- a/s390-dis.c
+++ b/s390-dis.c
@@ -172,6 +172,31 @@ static const struct s390_operand s390_operands[];
    the instruction may be optional.  */
 #define S390_OPERAND_OPTIONAL 0x400
 
+/* QEMU-ADD */
+/* ??? Not quite the format the assembler takes, but easy to implement
+   without recourse to the table generator.  */
+#define S390_OPERAND_CCODE  0x800
+
+static const char s390_ccode_name[16][4] = {
+    "n",    /* 0000 */
+    "o",    /* 0001 */
+    "h",    /* 0010 */
+    "nle",  /* 0011 */
+    "l",    /* 0100 */
+    "nhe",  /* 0101 */
+    "lh",   /* 0110 */
+    "ne",   /* 0111 */
+    "e",    /* 1000 */
+    "nlh",  /* 1001 */
+    "he",   /* 1010 */
+    "nl",   /* 1011 */
+    "le",   /* 1100 */
+    "nh",   /* 1101 */
+    "no",   /* 1110 */
+    "a"     /* 1111 */
+};
+/* QEMU-END */
+
 #endif /* S390_H */
 
 static int init_flag = 0;
@@ -325,13 +350,16 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
 	    continue;
 
 	  /* The instruction is valid.  */
-	  if (opcode->operands[0] != 0)
-	    (*info->fprintf_func) (info->stream, "%s\t", opcode->name);
-	  else
-	    (*info->fprintf_func) (info->stream, "%s", opcode->name);
+/* QEMU-MOD */
+         (*info->fprintf_func) (info->stream, "%s", opcode->name);
+
+         if (s390_operands[opcode->operands[0]].flags & S390_OPERAND_CCODE)
+           separator = 0;
+         else
+           separator = '\t';
+/* QEMU-END */
 
 	  /* Extract the operands.  */
-	  separator = 0;
 	  for (opindex = opcode->operands; *opindex != 0; opindex++)
 	    {
 	      unsigned int value;
@@ -363,6 +391,15 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
 		(*info->print_address_func) (memaddr + (int) value, info);
 	      else if (operand->flags & S390_OPERAND_SIGNED)
 		(*info->fprintf_func) (info->stream, "%i", (int) value);
+/* QEMU-ADD */
+              else if (operand->flags & S390_OPERAND_CCODE)
+                {
+		  (*info->fprintf_func) (info->stream, "%s",
+                                         s390_ccode_name[(int) value]);
+                  separator = '\t';
+                  continue;
+                }
+/* QEMU-END */
 	      else
 		(*info->fprintf_func) (info->stream, "%u", value);
 
@@ -543,8 +580,16 @@ static const struct s390_operand s390_operands[] =
 #define M_16   42                 /* 4 bit optional mask starting at 16 */
   { 4, 16, S390_OPERAND_OPTIONAL },
 #define RO_28  43                 /* optional GPR starting at position 28 */
-  { 4, 28, (S390_OPERAND_GPR | S390_OPERAND_OPTIONAL) }
-
+  { 4, 28, (S390_OPERAND_GPR | S390_OPERAND_OPTIONAL) },
+
+/* QEMU-ADD: */
+#define M4_12 44                  /* 4-bit condition-code starting at 12 */
+  { 4, 12, S390_OPERAND_CCODE },
+#define M4_32 45                  /* 4-bit condition-code starting at 32 */
+  { 4, 32, S390_OPERAND_CCODE },
+#define I8_32 46                  /* 8 bit signed value starting at 32 */
+  { 8, 32, S390_OPERAND_SIGNED },
+/* QEMU-END */
 };
 
 
@@ -755,6 +800,14 @@ static const struct s390_operand s390_operands[] =
 #define MASK_S_RD        { 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_SSF_RRDRD   { 0xff, 0x0f, 0x00, 0x00, 0x00, 0x00 }
 
+/* QEMU-ADD: */
+#define INSTR_RIE_MRRP   6, { M4_32,R_8,R_12,J16_16,0,0 }	/* e.g. crj */
+#define MASK_RIE_MRRP    { 0xff, 0x00, 0x00, 0x00, 0x0f, 0xff }
+
+#define INSTR_RIE_MRIP   6, { M4_12,R_8,I8_32,J16_16,0,0 }      /* e.g. cij */
+#define MASK_RIE_MRIP    { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
+/* QEMU-END */
+
 /* The opcode formats table (blueprints for .insn pseudo mnemonic).  */
 
 static const struct s390_opcode s390_opformats[] =
@@ -1092,6 +1145,10 @@ static const struct s390_opcode s390_opcodes[] =
   { "agfi", OP16(0xc208LL), MASK_RIL_RI, INSTR_RIL_RI, 2, 4},
   { "slfi", OP16(0xc205LL), MASK_RIL_RU, INSTR_RIL_RU, 2, 4},
   { "slgfi", OP16(0xc204LL), MASK_RIL_RU, INSTR_RIL_RU, 2, 4},
+/* QEMU-ADD: */
+  { "msfi",  OP16(0xc201ll), MASK_RIL_RI, INSTR_RIL_RI, 3, 6},
+  { "msgfi", OP16(0xc200ll), MASK_RIL_RI, INSTR_RIL_RI, 3, 6},
+/* QEMU-END */
   { "jg", OP16(0xc0f4LL), MASK_RIL_0P, INSTR_RIL_0P, 3, 2},
   { "jgno", OP16(0xc0e4LL), MASK_RIL_0P, INSTR_RIL_0P, 3, 2},
   { "jgnh", OP16(0xc0d4LL), MASK_RIL_0P, INSTR_RIL_0P, 3, 2},
@@ -1716,7 +1773,23 @@ static const struct s390_opcode s390_opcodes[] =
   { "pfpo", OP16(0x010aLL), MASK_E, INSTR_E, 2, 5},
   { "sckpf", OP16(0x0107LL), MASK_E, INSTR_E, 3, 0},
   { "upt", OP16(0x0102LL), MASK_E, INSTR_E, 3, 0},
-  { "pr", OP16(0x0101LL), MASK_E, INSTR_E, 3, 0}
+  { "pr", OP16(0x0101LL), MASK_E, INSTR_E, 3, 0},
+
+/* QEMU-ADD: */
+  { "crj",   OP48(0xec0000000076LL), MASK_RIE_MRRP, INSTR_RIE_MRRP, 3, 6},
+  { "cgrj",  OP48(0xec0000000064LL), MASK_RIE_MRRP, INSTR_RIE_MRRP, 3, 6},
+  { "clrj",  OP48(0xec0000000077LL), MASK_RIE_MRRP, INSTR_RIE_MRRP, 3, 6},
+  { "clgrj", OP48(0xec0000000065LL), MASK_RIE_MRRP, INSTR_RIE_MRRP, 3, 6},
+
+  { "cij",   OP48(0xec000000007eLL), MASK_RIE_MRIP, INSTR_RIE_MRIP, 3, 6},
+  { "cgij",  OP48(0xec000000007cLL), MASK_RIE_MRIP, INSTR_RIE_MRIP, 3, 6},
+  { "clij",  OP48(0xec000000007fLL), MASK_RIE_MRIP, INSTR_RIE_MRIP, 3, 6},
+  { "clgij", OP48(0xec000000007dLL), MASK_RIE_MRIP, INSTR_RIE_MRIP, 3, 6},
+
+  { "lrl",   OP16(0xc40dll), MASK_RIL_RP, INSTR_RIL_RP, 3, 6},
+  { "lgrl",  OP16(0xc408ll), MASK_RIL_RP, INSTR_RIL_RP, 3, 6},
+  { "lgfrl", OP16(0xc40cll), MASK_RIL_RP, INSTR_RIL_RP, 3, 6},
+/* QEMU-END */
 };
 
 static const int s390_num_opcodes =
-- 
1.7.0.1


* [Qemu-devel] [PATCH 10/35] tcg-s390: New TCG target
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (8 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 09/35] s390: Disassemble some general-instruction-extension insns Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-10 10:24   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 11/35] tcg-s390: Tidy unimplemented opcodes Richard Henderson
                   ` (25 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

We already have stubs for a TCG target on S390, but were missing code that
would actually generate instructions.

So I took Uli's patch, cleaned it up and present it to you again :-).

I hope I found all odd coding style and unprettiness issues, but if you
still spot one feel free to nag about it.
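
Most of the new code below is a family of small instruction encoders, one
per instruction format.  Purely for orientation (this mirrors
tcg_out_insn_RR from the patch and is not an addition to it), the simplest
case stand-alone: an RR-format instruction is a single halfword holding
the opcode and two 4-bit register numbers.

    #include <stdint.h>

    /* e.g. LR %r1,%r2 encodes as 0x1812.  */
    static uint16_t encode_rr(uint8_t op, unsigned r1, unsigned r2)
    {
        return (uint16_t)((op << 8) | (r1 << 4) | r2);
    }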

Signed-off-by: Alexander Graf <agraf@suse.de>
CC: Uli Hecht <uli@suse.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c | 1171 ++++++++++++++++++++++++++++++++++++++++++++++++-
 tcg/s390/tcg-target.h |   13 +-
 2 files changed, 1157 insertions(+), 27 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 265194a..55f0fa9 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -2,6 +2,7 @@
  * Tiny Code Generator for QEMU
  *
  * Copyright (c) 2009 Ulrich Hecht <uli@suse.de>
+ * Copyright (c) 2009 Alexander Graf <agraf@suse.de>
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
@@ -22,81 +23,1209 @@
  * THE SOFTWARE.
  */
 
+/* #define DEBUG_S390_TCG */
+
+#ifdef DEBUG_S390_TCG
+#define dprintf(fmt, ...) \
+    do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
+#else
+#define dprintf(fmt, ...) \
+    do { } while (0)
+#endif
+
+#define TCG_CT_CONST_S16                0x100
+#define TCG_CT_CONST_U12                0x200
+
+/* Several places within the instruction set 0 means "no register"
+   rather than TCG_REG_R0.  */
+#define TCG_REG_NONE    0
+
+/* All of the following instructions are prefixed with their instruction
+   format, and are defined as 8- or 16-bit quantities, even when the two
+   halves of the 16-bit quantity may appear 32 bits apart in the insn.
+   This makes it easy to copy the values from the tables in Appendix B.  */
+typedef enum S390Opcode {
+    RIL_BRASL   = 0xc005,
+    RIL_BRCL    = 0xc004,
+    RIL_LARL    = 0xc000,
+
+    RI_AGHI     = 0xa70b,
+    RI_AHI      = 0xa70a,
+    RI_BRC      = 0xa704,
+    RI_IILH     = 0xa502,
+    RI_LGHI     = 0xa709,
+    RI_LLILL    = 0xa50f,
+
+    RRE_AGR     = 0xb908,
+    RRE_CGR     = 0xb920,
+    RRE_CLGR    = 0xb921,
+    RRE_DLGR    = 0xb987,
+    RRE_DLR     = 0xb997,
+    RRE_DSGFR   = 0xb91d,
+    RRE_DSGR    = 0xb90d,
+    RRE_LCGR    = 0xb903,
+    RRE_LGFR    = 0xb914,
+    RRE_LGR     = 0xb904,
+    RRE_LLGFR   = 0xb916,
+    RRE_MSGR    = 0xb90c,
+    RRE_MSR     = 0xb252,
+    RRE_NGR     = 0xb980,
+    RRE_OGR     = 0xb981,
+    RRE_SGR     = 0xb909,
+    RRE_XGR     = 0xb982,
+
+    RR_AR       = 0x1a,
+    RR_BASR     = 0x0d,
+    RR_BCR      = 0x07,
+    RR_CLR      = 0x15,
+    RR_CR       = 0x19,
+    RR_DR       = 0x1d,
+    RR_LCR      = 0x13,
+    RR_LR       = 0x18,
+    RR_NR       = 0x14,
+    RR_OR       = 0x16,
+    RR_SR       = 0x1b,
+    RR_XR       = 0x17,
+
+    RSY_SLLG    = 0xeb0d,
+    RSY_SRAG    = 0xeb0a,
+    RSY_SRLG    = 0xeb0c,
+
+    RS_SLL      = 0x89,
+    RS_SRA      = 0x8a,
+    RS_SRL      = 0x88,
+
+    RXY_CG      = 0xe320,
+    RXY_LB      = 0xe376,
+    RXY_LG      = 0xe304,
+    RXY_LGB     = 0xe377,
+    RXY_LGF     = 0xe314,
+    RXY_LGH     = 0xe315,
+    RXY_LHY     = 0xe378,
+    RXY_LLC     = 0xe394,
+    RXY_LLGC    = 0xe390,
+    RXY_LLGF    = 0xe316,
+    RXY_LLGH    = 0xe391,
+    RXY_LLH     = 0xe395,
+    RXY_LMG     = 0xeb04,
+    RXY_LRV     = 0xe31e,
+    RXY_LRVG    = 0xe30f,
+    RXY_LRVH    = 0xe31f,
+    RXY_LY      = 0xe358,
+    RXY_STCY    = 0xe372,
+    RXY_STG     = 0xe324,
+    RXY_STHY    = 0xe370,
+    RXY_STMG    = 0xeb24,
+    RXY_STRV    = 0xe33e,
+    RXY_STRVG   = 0xe32f,
+    RXY_STRVH   = 0xe33f,
+    RXY_STY     = 0xe350,
+
+    RX_L        = 0x58,
+    RX_LH       = 0x48,
+    RX_ST       = 0x50,
+    RX_STC      = 0x42,
+    RX_STH      = 0x40,
+} S390Opcode;
+
+#define LD_SIGNED      0x04
+#define LD_UINT8       0x00
+#define LD_INT8        (LD_UINT8 | LD_SIGNED)
+#define LD_UINT16      0x01
+#define LD_INT16       (LD_UINT16 | LD_SIGNED)
+#define LD_UINT32      0x02
+#define LD_INT32       (LD_UINT32 | LD_SIGNED)
+#define LD_UINT64      0x03
+#define LD_INT64       (LD_UINT64 | LD_SIGNED)
+
+#ifndef NDEBUG
+static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
+    "%r0", "%r1", "%r2", "%r3", "%r4", "%r5", "%r6", "%r7",
+    "%r8", "%r9", "%r10" "%r11" "%r12" "%r13" "%r14" "%r15"
+};
+#endif
+
 static const int tcg_target_reg_alloc_order[] = {
+    TCG_REG_R6,
+    TCG_REG_R7,
+    TCG_REG_R8,
+    TCG_REG_R9,
+    TCG_REG_R10,
+    TCG_REG_R11,
+    TCG_REG_R12,
+    TCG_REG_R13,
+    TCG_REG_R14,
+    TCG_REG_R0,
+    TCG_REG_R1,
+    TCG_REG_R2,
+    TCG_REG_R3,
+    TCG_REG_R4,
+    TCG_REG_R5,
 };
 
 static const int tcg_target_call_iarg_regs[] = {
+    TCG_REG_R2,
+    TCG_REG_R3,
+    TCG_REG_R4,
+    TCG_REG_R5,
+    TCG_REG_R6,
 };
 
 static const int tcg_target_call_oarg_regs[] = {
+    TCG_REG_R2,
+    TCG_REG_R3,
+};
+
+/* signed/unsigned is handled by using COMPARE and COMPARE LOGICAL,
+   respectively */
+
+#define S390_CC_EQ      8
+#define S390_CC_LT      4
+#define S390_CC_GT      2
+#define S390_CC_OV      1
+#define S390_CC_NE      (S390_CC_LT | S390_CC_GT)
+#define S390_CC_LE      (S390_CC_LT | S390_CC_EQ)
+#define S390_CC_GE      (S390_CC_GT | S390_CC_EQ)
+#define S390_CC_ALWAYS  15
+
+static const uint8_t tcg_cond_to_s390_cond[10] = {
+    [TCG_COND_EQ]  = S390_CC_EQ,
+    [TCG_COND_LT]  = S390_CC_LT,
+    [TCG_COND_LTU] = S390_CC_LT,
+    [TCG_COND_LE]  = S390_CC_LE,
+    [TCG_COND_LEU] = S390_CC_LE,
+    [TCG_COND_GT]  = S390_CC_GT,
+    [TCG_COND_GTU] = S390_CC_GT,
+    [TCG_COND_GE]  = S390_CC_GE,
+    [TCG_COND_GEU] = S390_CC_GE,
+    [TCG_COND_NE]  = S390_CC_NE,
+};
+
+#ifdef CONFIG_SOFTMMU
+
+#include "../../softmmu_defs.h"
+
+static void *qemu_ld_helpers[4] = {
+    __ldb_mmu,
+    __ldw_mmu,
+    __ldl_mmu,
+    __ldq_mmu,
+};
+
+static void *qemu_st_helpers[4] = {
+    __stb_mmu,
+    __stw_mmu,
+    __stl_mmu,
+    __stq_mmu,
 };
+#endif
+
+static uint8_t *tb_ret_addr;
 
 static void patch_reloc(uint8_t *code_ptr, int type,
                 tcg_target_long value, tcg_target_long addend)
 {
-    tcg_abort();
+    uint32_t *code_ptr_32 = (uint32_t*)code_ptr;
+    tcg_target_long code_ptr_tlong = (tcg_target_long)code_ptr;
+
+    switch (type) {
+    case R_390_PC32DBL:
+        *code_ptr_32 = (value - (code_ptr_tlong + addend)) >> 1;
+        break;
+    default:
+        tcg_abort();
+        break;
+    }
 }
 
-static inline int tcg_target_get_call_iarg_regs_count(int flags)
+static int tcg_target_get_call_iarg_regs_count(int flags)
 {
-    tcg_abort();
-    return 0;
+    return sizeof(tcg_target_call_iarg_regs) / sizeof(int);
 }
 
 /* parse target specific constraints */
 static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
 {
-    tcg_abort();
+    const char *ct_str;
+
+    ct->ct |= TCG_CT_REG;
+    tcg_regset_set32(ct->u.regs, 0, 0xffff);
+    ct_str = *pct_str;
+
+    switch (ct_str[0]) {
+    case 'L':                   /* qemu_ld/st constraint */
+        tcg_regset_reset_reg (ct->u.regs, TCG_REG_R2);
+        tcg_regset_reset_reg (ct->u.regs, TCG_REG_R3);
+        break;
+    case 'R':                        /* not R0 */
+        tcg_regset_reset_reg(ct->u.regs, TCG_REG_R0);
+        break;
+    case 'a':                  /* force R2 for division */
+        tcg_regset_clear(ct->u.regs);
+        tcg_regset_set_reg(ct->u.regs, TCG_REG_R2);
+        break;
+    case 'b':                  /* force R3 for division */
+        tcg_regset_clear(ct->u.regs);
+        tcg_regset_set_reg(ct->u.regs, TCG_REG_R3);
+        break;
+    case 'I':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_S16;
+        break;
+    default:
+        break;
+    }
+    ct_str++;
+    *pct_str = ct_str;
+
     return 0;
 }
 
 /* Test if a constant matches the constraint. */
 static inline int tcg_target_const_match(tcg_target_long val,
-                const TCGArgConstraint *arg_ct)
+                                         const TCGArgConstraint *arg_ct)
 {
-    tcg_abort();
+    int ct = arg_ct->ct;
+
+    if ((ct & TCG_CT_CONST) ||
+       ((ct & TCG_CT_CONST_S16) && val == (int16_t)val) ||
+       ((ct & TCG_CT_CONST_U12) && val == (val & 0xfff))) {
+        return 1;
+    }
+
     return 0;
 }
 
+/* Emit instructions according to the given instruction format.  */
+
+static void tcg_out_insn_RR(TCGContext *s, S390Opcode op, TCGReg r1, TCGReg r2)
+{
+    tcg_out16(s, (op << 8) | (r1 << 4) | r2);
+}
+
+static void tcg_out_insn_RRE(TCGContext *s, S390Opcode op,
+                             TCGReg r1, TCGReg r2)
+{
+    tcg_out32(s, (op << 16) | (r1 << 4) | r2);
+}
+
+static void tcg_out_insn_RI(TCGContext *s, S390Opcode op, TCGReg r1, int i2)
+{
+    tcg_out32(s, (op << 16) | (r1 << 20) | (i2 & 0xffff));
+}
+
+static void tcg_out_insn_RIL(TCGContext *s, S390Opcode op, TCGReg r1, int i2)
+{
+    tcg_out16(s, op | (r1 << 4));
+    tcg_out32(s, i2);
+}
+
+static void tcg_out_insn_RS(TCGContext *s, S390Opcode op, TCGReg r1,
+                            TCGReg b2, TCGReg r3, int disp)
+{
+    tcg_out32(s, (op << 24) | (r1 << 20) | (r3 << 16) | (b2 << 12)
+              | (disp & 0xfff));
+}
+
+static void tcg_out_insn_RSY(TCGContext *s, S390Opcode op, TCGReg r1,
+                             TCGReg b2, TCGReg r3, int disp)
+{
+    tcg_out16(s, (op & 0xff00) | (r1 << 4) | r3);
+    tcg_out32(s, (op & 0xff) | (b2 << 28)
+              | ((disp & 0xfff) << 16) | ((disp & 0xff000) >> 4));
+}
+
+#define tcg_out_insn_RX   tcg_out_insn_RS
+#define tcg_out_insn_RXY  tcg_out_insn_RSY
+
+/* Emit an opcode with "type-checking" of the format.  */
+#define tcg_out_insn(S, FMT, OP, ...) \
+    glue(tcg_out_insn_,FMT)(S, glue(glue(FMT,_),OP), ## __VA_ARGS__)
+
+
+/* emit 64-bit shifts */
+static void tcg_out_sh64(TCGContext* s, S390Opcode op, TCGReg dest,
+                         TCGReg src, TCGReg sh_reg, int sh_imm)
+{
+    tcg_out_insn_RSY(s, op, dest, sh_reg, src, sh_imm);
+}
+
+/* emit 32-bit shifts */
+static void tcg_out_sh32(TCGContext* s, S390Opcode op, TCGReg dest,
+                         TCGReg sh_reg, int sh_imm)
+{
+    tcg_out_insn_RS(s, op, dest, sh_reg, 0, sh_imm);
+}
+
+static inline void tcg_out_mov(TCGContext *s, int ret, int arg)
+{
+    /* ??? With a TCGType argument, we could emit the smaller LR insn.  */
+    tcg_out_insn(s, RRE, LGR, ret, arg);
+}
+
 /* load a register with an immediate value */
 static inline void tcg_out_movi(TCGContext *s, TCGType type,
                 int ret, tcg_target_long arg)
 {
-    tcg_abort();
+    if (arg >= -0x8000 && arg < 0x8000) { /* signed immediate load */
+        tcg_out_insn(s, RI, LGHI, ret, arg);
+    } else if (!(arg & 0xffffffffffff0000UL)) {
+        tcg_out_insn(s, RI, LLILL, ret, arg);
+    } else if (!(arg & 0xffffffff00000000UL) || type == TCG_TYPE_I32) {
+        tcg_out_insn(s, RI, LLILL, ret, arg);
+        tcg_out_insn(s, RI, IILH, ret, arg >> 16);
+    } else {
+        /* branch over constant and store its address in R13 */
+        tcg_out_insn(s, RIL, BRASL, TCG_REG_R13, (6 + 8) >> 1);
+        /* 64-bit constant */
+        tcg_out32(s, arg >> 32);
+        tcg_out32(s, arg);
+        /* load constant to ret */
+        tcg_out_insn(s, RXY, LG, ret, TCG_REG_R13, 0, 0);
+    }
 }
 
+
+/* Emit a load/store type instruction.  Inputs are:
+   DATA:     The register to be loaded or stored.
+   BASE+OFS: The effective address.
+   OPC_RX:   If the operation has an RX format opcode (e.g. STC), otherwise 0.
+   OPC_RXY:  The RXY format opcode for the operation (e.g. STCY).  */
+
+static void tcg_out_mem(TCGContext *s, S390Opcode opc_rx, S390Opcode opc_rxy,
+                        TCGReg data, TCGReg base, TCGReg index,
+                        tcg_target_long ofs)
+{
+    if (ofs < -0x80000 || ofs >= 0x80000) {
+        /* Combine the low 16 bits of the offset with the actual load insn;
+           the high 48 bits must come from an immediate load.  */
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, ofs & ~0xffff);
+        ofs &= 0xffff;
+
+        /* If we were already given an index register, add it in.  */
+        if (index != TCG_REG_NONE) {
+            tcg_out_insn(s, RRE, AGR, TCG_REG_R13, index);
+        }
+        index = TCG_REG_R13;
+    }
+
+    if (opc_rx && ofs >= 0 && ofs < 0x1000) {
+        tcg_out_insn_RX(s, opc_rx, data, base, index, ofs);
+    } else {
+        tcg_out_insn_RXY(s, opc_rxy, data, base, index, ofs);
+    }
+}
+
+
 /* load data without address translation or endianness conversion */
-static inline void tcg_out_ld(TCGContext *s, TCGType type, int arg,
-                int arg1, tcg_target_long arg2)
+static inline void tcg_out_ld(TCGContext *s, TCGType type, TCGReg data,
+                              TCGReg base, tcg_target_long ofs)
 {
-    tcg_abort();
+    if (type == TCG_TYPE_I32) {
+        tcg_out_mem(s, RX_L, RXY_LY, data, base, TCG_REG_NONE, ofs);
+    } else {
+        tcg_out_mem(s, 0, RXY_LG, data, base, TCG_REG_NONE, ofs);
+    }
 }
 
-static inline void tcg_out_st(TCGContext *s, TCGType type, int arg,
-                              int arg1, tcg_target_long arg2)
+static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg data,
+                              TCGReg base, tcg_target_long ofs)
 {
-    tcg_abort();
+    if (type == TCG_TYPE_I32) {
+        tcg_out_mem(s, RX_ST, RXY_STY, data, base, TCG_REG_NONE, ofs);
+    } else {
+        tcg_out_mem(s, 0, RXY_STG, data, base, TCG_REG_NONE, ofs);
+    }
+}
+
+static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
+{
+    if (c > TCG_COND_GT) {
+        /* unsigned */
+        tcg_out_insn(s, RR, CLR, r1, r2);
+    } else {
+        /* signed */
+        tcg_out_insn(s, RR, CR, r1, r2);
+    }
+}
+
+static void tgen64_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
+{
+    if (c > TCG_COND_GT) {
+        /* unsigned */
+        tcg_out_insn(s, RRE, CLGR, r1, r2);
+    } else {
+        /* signed */
+        tcg_out_insn(s, RRE, CGR, r1, r2);
+    }
+}
+
+static void tgen_setcond(TCGContext *s, TCGType type, TCGCond c,
+                         TCGReg dest, TCGReg r1, TCGReg r2)
+{
+    if (type == TCG_TYPE_I32) {
+        tgen32_cmp(s, c, r1, r2);
+    } else {
+        tgen64_cmp(s, c, r1, r2);
+    }
+    /* Emit: r1 = 1; if (cc) goto over; r1 = 0; over:  */
+    tcg_out_movi(s, type, dest, 1);
+    tcg_out_insn(s, RI, BRC, tcg_cond_to_s390_cond[c], (4 + 4) >> 1);
+    tcg_out_movi(s, type, dest, 0);
+}
+
+static void tgen_gotoi(TCGContext *s, int cc, tcg_target_long dest)
+{
+    tcg_target_long off = (dest - (tcg_target_long)s->code_ptr) >> 1;
+    if (off > -0x8000 && off < 0x7fff) {
+        tcg_out_insn(s, RI, BRC, cc, off);
+    } else if (off == (int32_t)off) {
+        tcg_out_insn(s, RIL, BRCL, cc, off);
+    } else {
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, dest);
+        tcg_out_insn(s, RR, BCR, cc, TCG_REG_R13);
+    }
+}
+
+static void tgen_branch(TCGContext *s, int cc, int labelno)
+{
+    TCGLabel* l = &s->labels[labelno];
+    if (l->has_value) {
+        tgen_gotoi(s, cc, l->u.value);
+    } else {
+        tcg_out16(s, RIL_BRCL | (cc << 4));
+        tcg_out_reloc(s, s->code_ptr, R_390_PC32DBL, labelno, -2);
+        s->code_ptr += 4;
+    }
+}
+
+static void tgen_calli(TCGContext *s, tcg_target_long dest)
+{
+    tcg_target_long off = (dest - (tcg_target_long)s->code_ptr) >> 1;
+    if (off == (int32_t)off) {
+        tcg_out_insn(s, RIL, BRASL, TCG_REG_R14, off);
+    } else {
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, dest);
+        tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_REG_R13);
+    }
+}
+
+#if defined(CONFIG_SOFTMMU)
+static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
+                                  int mem_index, int opc,
+                                  uint16_t **label2_ptr_p, int is_store)
+{
+    int arg0 = TCG_REG_R2;
+    int arg1 = TCG_REG_R3;
+    int arg2 = TCG_REG_R4;
+    int s_bits;
+    uint16_t *label1_ptr;
+
+    if (is_store) {
+        s_bits = opc;
+    } else {
+        s_bits = opc & 3;
+    }
+
+#if TARGET_LONG_BITS == 32
+    tcg_out_insn(s, RRE, LLGFR, arg1, addr_reg);
+    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
+#else
+    tcg_out_mov(s, arg1, addr_reg);
+    tcg_out_mov(s, arg0, addr_reg);
+#endif
+
+    tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, TCG_REG_NONE,
+                 TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+
+    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                 TARGET_PAGE_MASK | ((1 << s_bits) - 1));
+    tcg_out_insn(s, RRE, NGR, arg0, TCG_REG_R13);
+
+    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                 (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
+    tcg_out_insn(s, RRE, NGR, arg1, TCG_REG_R13);
+
+    if (is_store) {
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                     offsetof(CPUState, tlb_table[mem_index][0].addr_write));
+    } else {
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                     offsetof(CPUState, tlb_table[mem_index][0].addr_read));
+    }
+    tcg_out_insn(s, RRE, AGR, arg1, TCG_REG_R13);
+
+    tcg_out_insn(s, RRE, AGR, arg1, TCG_AREG0);
+
+    tcg_out_insn(s, RXY, CG, arg0, arg1, 0, 0);
+
+    label1_ptr = (uint16_t*)s->code_ptr;
+
+    /* je label1 (offset will be patched in later) */
+    tcg_out_insn(s, RI, BRC, S390_CC_EQ, 0);
+
+    /* call load/store helper */
+#if TARGET_LONG_BITS == 32
+    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
+#else
+    tcg_out_mov(s, arg0, addr_reg);
+#endif
+
+    if (is_store) {
+        tcg_out_mov(s, arg1, data_reg);
+        tcg_out_movi(s, TCG_TYPE_I32, arg2, mem_index);
+        tgen_calli(s, (tcg_target_ulong)qemu_st_helpers[s_bits]);
+    } else {
+        tcg_out_movi(s, TCG_TYPE_I32, arg1, mem_index);
+        tgen_calli(s, (tcg_target_ulong)qemu_ld_helpers[s_bits]);
+
+        /* sign extension */
+        switch (opc) {
+        case LD_INT8:
+            tcg_out_insn(s, RSY, SLLG, data_reg, arg0, TCG_REG_NONE, 56);
+            tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, TCG_REG_NONE, 56);
+            break;
+        case LD_INT16:
+            tcg_out_insn(s, RSY, SLLG, data_reg, arg0, TCG_REG_NONE, 48);
+            tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, TCG_REG_NONE, 48);
+            break;
+        case LD_INT32:
+            tcg_out_insn(s, RRE, LGFR, data_reg, arg0);
+            break;
+        default:
+            /* unsigned -> just copy */
+            tcg_out_mov(s, data_reg, arg0);
+            break;
+        }
+    }
+
+    /* jump to label2 (end) */
+    *label2_ptr_p = (uint16_t*)s->code_ptr;
+
+    tcg_out_insn(s, RI, BRC, S390_CC_ALWAYS, 0);
+
+    /* this is label1, patch branch */
+    *(label1_ptr + 1) = ((unsigned long)s->code_ptr -
+                         (unsigned long)label1_ptr) >> 1;
+
+    if (is_store) {
+        tcg_out_insn(s, RXY, LG, arg1, arg1, 0,
+                     offsetof(CPUTLBEntry, addend)
+                     - offsetof(CPUTLBEntry, addr_write));
+    } else {
+        tcg_out_insn(s, RXY, LG, arg1, arg1, 0,
+                     offsetof(CPUTLBEntry, addend)
+                     - offsetof(CPUTLBEntry, addr_read));
+    }
+
+#if TARGET_LONG_BITS == 32
+    /* zero upper 32 bits */
+    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
+#else
+    /* just copy */
+    tcg_out_mov(s, arg0, addr_reg);
+#endif
+    tcg_out_insn(s, RRE, AGR, arg0, arg1);
+}
+
+static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
+{
+    /* patch branch */
+    *(label2_ptr + 1) = ((unsigned long)s->code_ptr -
+                         (unsigned long)label2_ptr) >> 1;
+}
+
+#else /* CONFIG_SOFTMMU */
+
+static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
+                                int mem_index, int opc,
+                                uint16_t **label2_ptr_p, int is_store)
+{
+    int arg0 = TCG_REG_R2;
+
+    /* user mode, no address translation required */
+    if (TARGET_LONG_BITS == 32) {
+        tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
+    } else {
+        tcg_out_mov(s, arg0, addr_reg);
+    }
+}
+
+static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
+{
+}
+
+#endif /* CONFIG_SOFTMMU */
+
+/* load data with address translation (if applicable)
+   and endianness conversion */
+static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
+{
+    int addr_reg, data_reg, mem_index;
+    int arg0 = TCG_REG_R2;
+    uint16_t *label2_ptr;
+
+    data_reg = *args++;
+    addr_reg = *args++;
+    mem_index = *args;
+
+    dprintf("tcg_out_qemu_ld opc %d data_reg %d addr_reg %d mem_index %d\n"
+            opc, data_reg, addr_reg, mem_index);
+
+    tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
+                          opc, &label2_ptr, 0);
+
+    switch (opc) {
+    case LD_UINT8:
+        tcg_out_insn(s, RXY, LLGC, data_reg, arg0, 0, 0);
+        break;
+    case LD_INT8:
+        tcg_out_insn(s, RXY, LGB, data_reg, arg0, 0, 0);
+        break;
+    case LD_UINT16:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_insn(s, RXY, LLGH, data_reg, arg0, 0, 0);
+#else
+        /* swapped unsigned halfword load with upper bits zeroed */
+        tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, 0xffffL);
+        tcg_out_insn(s, RRE, NGR, data_reg, 13);
+#endif
+        break;
+    case LD_INT16:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_insn(s, RXY, LGH, data_reg, arg0, 0, 0);
+#else
+        /* swapped sign-extended halfword load */
+        tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
+        tcg_out_insn(s, RSY, SLLG, data_reg, data_reg, TCG_REG_NONE, 48);
+        tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, TCG_REG_NONE, 48);
+#endif
+        break;
+    case LD_UINT32:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_insn(s, RXY, LLGF, data_reg, arg0, 0, 0);
+#else
+        /* swapped unsigned int load with upper bits zeroed */
+        tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
+        tcg_out_insn(s, RRE, LLGFR, data_reg, data_reg);
+#endif
+        break;
+    case LD_INT32:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_insn(s, RXY, LGF, data_reg, arg0, 0, 0);
+#else
+        /* swapped sign-extended int load */
+        tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
+        tcg_out_insn(s, RRE, LGFR, data_reg, data_reg);
+#endif
+        break;
+    case LD_UINT64:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_insn(s, RXY, LG, data_reg, arg0, 0, 0);
+#else
+        tcg_out_insn(s, RXY, LRVG, data_reg, arg0, 0, 0);
+#endif
+        break;
+    default:
+        tcg_abort();
+    }
+
+    tcg_finish_qemu_ldst(s, label2_ptr);
+}
+
+static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
+{
+    int addr_reg, data_reg, mem_index;
+    uint16_t *label2_ptr;
+    int arg0 = TCG_REG_R2;
+
+    data_reg = *args++;
+    addr_reg = *args++;
+    mem_index = *args;
+
+    dprintf("tcg_out_qemu_st opc %d data_reg %d addr_reg %d mem_index %d\n"
+            opc, data_reg, addr_reg, mem_index);
+
+    tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
+                          opc, &label2_ptr, 1);
+
+    switch (opc) {
+    case LD_UINT8:
+        tcg_out_insn(s, RX, STC, data_reg, arg0, 0, 0);
+        break;
+    case LD_UINT16:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_insn(s, RX, STH, data_reg, arg0, 0, 0);
+#else
+        tcg_out_insn(s, RXY, STRVH, data_reg, arg0, 0, 0);
+#endif
+        break;
+    case LD_UINT32:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_insn(s, RX, ST, data_reg, arg0, 0, 0);
+#else
+        tcg_out_insn(s, RXY, STRV, data_reg, arg0, 0, 0);
+#endif
+        break;
+    case LD_UINT64:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_insn(s, RXY, STG, data_reg, arg0, 0, 0);
+#else
+        tcg_out_insn(s, RXY, STRVG, data_reg, arg0, 0, 0);
+#endif
+        break;
+    default:
+        tcg_abort();
+    }
+
+    tcg_finish_qemu_ldst(s, label2_ptr);
 }
 
 static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                 const TCGArg *args, const int *const_args)
 {
-    tcg_abort();
+    S390Opcode op;
+
+    switch (opc) {
+    case INDEX_op_exit_tb:
+        /* return value */
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R2, args[0]);
+        tgen_gotoi(s, S390_CC_ALWAYS, (unsigned long)tb_ret_addr);
+        break;
+
+    case INDEX_op_goto_tb:
+        if (s->tb_jmp_offset) {
+            tcg_abort();
+        } else {
+            tcg_target_long off = ((tcg_target_long)(s->tb_next + args[0]) -
+                                   (tcg_target_long)s->code_ptr) >> 1;
+            if (off == (int32_t)off) {
+                /* load address relative to PC */
+                tcg_out_insn(s, RIL, LARL, TCG_REG_R13, off);
+            } else {
+                /* too far for larl */
+                tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                             (tcg_target_long)(s->tb_next + args[0]));
+            }
+            /* load address stored at s->tb_next + args[0] */
+            tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R13, TCG_REG_R13, 0);
+            /* and go there */
+            tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R13);
+        }
+        s->tb_next_offset[args[0]] = s->code_ptr - s->code_buf;
+        break;
+
+    case INDEX_op_call:
+        if (const_args[0]) {
+            tgen_calli(s, args[0]);
+        } else {
+            tcg_out_insn(s, RR, BASR, TCG_REG_R14, args[0]);
+        }
+        break;
+
+    case INDEX_op_jmp:
+        /* XXX */
+        tcg_abort();
+        break;
+
+    case INDEX_op_ld8u_i32:
+    case INDEX_op_ld8u_i64:
+        /* ??? LLC (RXY format) is only present with the extended-immediate
+           facility, whereas LLGC is always present.  */
+        tcg_out_mem(s, 0, RXY_LLGC, args[0], args[1], TCG_REG_NONE, args[2]);
+        break;
+
+    case INDEX_op_ld8s_i32:
+    case INDEX_op_ld8s_i64:
+        /* ??? LB is no smaller than LGB, so no point to using it.  */
+        tcg_out_mem(s, 0, RXY_LGB, args[0], args[1], TCG_REG_NONE, args[2]);
+        break;
+
+    case INDEX_op_ld16u_i32:
+    case INDEX_op_ld16u_i64:
+        /* ??? LLH (RXY format) is only present with the extended-immediate
+           facility, whereas LLGH is always present.  */
+        tcg_out_mem(s, 0, RXY_LLGH, args[0], args[1], TCG_REG_NONE, args[2]);
+        break;
+
+    case INDEX_op_ld16s_i32:
+        tcg_out_mem(s, RX_LH, RXY_LHY, args[0], args[1], TCG_REG_NONE, args[2]);
+        break;
+    case INDEX_op_ld16s_i64:
+        tcg_out_mem(s, 0, RXY_LGH, args[0], args[1], TCG_REG_NONE, args[2]);
+        break;
+
+    case INDEX_op_ld_i32:
+        tcg_out_ld(s, TCG_TYPE_I32, args[0], args[1], args[2]);
+        break;
+    case INDEX_op_ld32u_i64:
+        tcg_out_mem(s, 0, RXY_LLGF, args[0], args[1], TCG_REG_NONE, args[2]);
+        break;
+    case INDEX_op_ld32s_i64:
+        tcg_out_mem(s, 0, RXY_LGF, args[0], args[1], TCG_REG_NONE, args[2]);
+        break;
+
+    case INDEX_op_ld_i64:
+        tcg_out_ld(s, TCG_TYPE_I64, args[0], args[1], args[2]);
+        break;
+
+    case INDEX_op_st8_i32:
+    case INDEX_op_st8_i64:
+        tcg_out_mem(s, RX_STC, RXY_STCY, args[0], args[1],
+                    TCG_REG_NONE, args[2]);
+        break;
+
+    case INDEX_op_st16_i32:
+    case INDEX_op_st16_i64:
+        tcg_out_mem(s, RX_STH, RXY_STHY, args[0], args[1],
+                    TCG_REG_NONE, args[2]);
+        break;
+
+    case INDEX_op_st_i32:
+    case INDEX_op_st32_i64:
+        tcg_out_st(s, TCG_TYPE_I32, args[0], args[1], args[2]);
+        break;
+
+    case INDEX_op_st_i64:
+        tcg_out_st(s, TCG_TYPE_I64, args[0], args[1], args[2]);
+        break;
+
+    case INDEX_op_mov_i32:
+        /* XXX */
+        tcg_abort();
+        break;
+
+    case INDEX_op_movi_i32:
+        /* XXX */
+        tcg_abort();
+        break;
+
+    case INDEX_op_add_i32:
+        if (const_args[2]) {
+            tcg_out_insn(s, RI, AHI, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RR, AR, args[0], args[2]);
+        }
+        break;
+
+    case INDEX_op_add_i64:
+        tcg_out_insn(s, RRE, AGR, args[0], args[2]);
+        break;
+
+    case INDEX_op_sub_i32:
+        tcg_out_insn(s, RR, SR, args[0], args[2]);
+        break;
+
+    case INDEX_op_sub_i64:
+        tcg_out_insn(s, RRE, SGR, args[0], args[2]);
+        break;
+
+    case INDEX_op_and_i32:
+        tcg_out_insn(s, RR, NR, args[0], args[2]);
+        break;
+    case INDEX_op_or_i32:
+        tcg_out_insn(s, RR, OR, args[0], args[2]);
+        break;
+    case INDEX_op_xor_i32:
+        tcg_out_insn(s, RR, XR, args[0], args[2]);
+        break;
+
+    case INDEX_op_and_i64:
+        tcg_out_insn(s, RRE, NGR, args[0], args[2]);
+        break;
+    case INDEX_op_or_i64:
+        tcg_out_insn(s, RRE, OGR, args[0], args[2]);
+        break;
+    case INDEX_op_xor_i64:
+        tcg_out_insn(s, RRE, XGR, args[0], args[2]);
+        break;
+
+    case INDEX_op_neg_i32:
+        /* FIXME: optimize args[0] != args[1] case */
+        tcg_out_insn(s, RR, LR, 13, args[1]);
+        tcg_out_movi(s, TCG_TYPE_I32, args[0], 0);
+        tcg_out_insn(s, RR, SR, args[0], 13);
+        break;
+    case INDEX_op_neg_i64:
+        /* FIXME: optimize args[0] != args[1] case */
+        tcg_out_mov(s, TCG_REG_R13, args[1]);
+        tcg_out_movi(s, TCG_TYPE_I64, args[0], 0);
+        tcg_out_insn(s, RRE, SGR, args[0], TCG_REG_R13);
+        break;
+
+    case INDEX_op_mul_i32:
+        tcg_out_insn(s, RRE, MSR, args[0], args[2]);
+        break;
+    case INDEX_op_mul_i64:
+        tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
+        break;
+
+    case INDEX_op_div2_i32:
+        tcg_out_insn(s, RR, DR, TCG_REG_R2, args[4]);
+        break;
+    case INDEX_op_divu2_i32:
+        tcg_out_insn(s, RRE, DLR, TCG_REG_R2, args[4]);
+        break;
+
+    case INDEX_op_div2_i64:
+        /* ??? We get an unnecessary sign-extension of the dividend
+           into R3 with this definition, but as we do in fact always
+           produce both quotient and remainder, using INDEX_op_div_i64
+           instead requires jumping through even more hoops.  */
+        tcg_out_insn(s, RRE, DSGR, TCG_REG_R2, args[4]);
+        break;
+    case INDEX_op_divu2_i64:
+        tcg_out_insn(s, RRE, DLGR, TCG_REG_R2, args[4]);
+        break;
+
+    case INDEX_op_shl_i32:
+        op = RS_SLL;
+    do_shift32:
+        if (const_args[2]) {
+            tcg_out_sh32(s, op, args[0], TCG_REG_NONE, args[2]);
+        } else {
+            tcg_out_sh32(s, op, args[0], args[2], 0);
+        }
+        break;
+    case INDEX_op_shr_i32:
+        op = RS_SRL;
+        goto do_shift32;
+    case INDEX_op_sar_i32:
+        op = RS_SRA;
+        goto do_shift32;
+
+    case INDEX_op_shl_i64:
+        op = RSY_SLLG;
+    do_shift64:
+        if (const_args[2]) {
+            tcg_out_sh64(s, op, args[0], args[1], TCG_REG_NONE, args[2]);
+        } else {
+            tcg_out_sh64(s, op, args[0], args[1], args[2], 0);
+        }
+        break;
+    case INDEX_op_shr_i64:
+        op = RSY_SRLG;
+        goto do_shift64;
+    case INDEX_op_sar_i64:
+        op = RSY_SRAG;
+        goto do_shift64;
+
+    case INDEX_op_br:
+        tgen_branch(s, S390_CC_ALWAYS, args[0]);
+        break;
+
+    case INDEX_op_brcond_i64:
+        tgen64_cmp(s, args[2], args[0], args[1]);
+        goto do_brcond;
+    case INDEX_op_brcond_i32:
+        tgen32_cmp(s, args[2], args[0], args[1]);
+    do_brcond:
+        tgen_branch(s, tcg_cond_to_s390_cond[args[2]], args[3]);
+        break;
+
+    case INDEX_op_setcond_i32:
+        tgen_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1], args[2]);
+        break;
+    case INDEX_op_setcond_i64:
+        tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1], args[2]);
+        break;
+
+    case INDEX_op_qemu_ld8u:
+        tcg_out_qemu_ld(s, args, LD_UINT8);
+        break;
+
+    case INDEX_op_qemu_ld8s:
+        tcg_out_qemu_ld(s, args, LD_INT8);
+        break;
+
+    case INDEX_op_qemu_ld16u:
+        tcg_out_qemu_ld(s, args, LD_UINT16);
+        break;
+
+    case INDEX_op_qemu_ld16s:
+        tcg_out_qemu_ld(s, args, LD_INT16);
+        break;
+
+    case INDEX_op_qemu_ld32:
+        /* ??? Technically we can use a non-extending instruction.  */
+    case INDEX_op_qemu_ld32u:
+        tcg_out_qemu_ld(s, args, LD_UINT32);
+        break;
+
+    case INDEX_op_qemu_ld32s:
+        tcg_out_qemu_ld(s, args, LD_INT32);
+        break;
+
+    case INDEX_op_qemu_ld64:
+        tcg_out_qemu_ld(s, args, LD_UINT64);
+        break;
+
+    case INDEX_op_qemu_st8:
+        tcg_out_qemu_st(s, args, LD_UINT8);
+        break;
+
+    case INDEX_op_qemu_st16:
+        tcg_out_qemu_st(s, args, LD_UINT16);
+        break;
+
+    case INDEX_op_qemu_st32:
+        tcg_out_qemu_st(s, args, LD_UINT32);
+        break;
+
+    case INDEX_op_qemu_st64:
+        tcg_out_qemu_st(s, args, LD_UINT64);
+        break;
+
+    default:
+        fprintf(stderr,"unimplemented opc 0x%x\n",opc);
+        tcg_abort();
+    }
 }
 
+static const TCGTargetOpDef s390_op_defs[] = {
+    { INDEX_op_exit_tb, { } },
+    { INDEX_op_goto_tb, { } },
+    { INDEX_op_call, { "ri" } },
+    { INDEX_op_jmp, { "ri" } },
+    { INDEX_op_br, { } },
+
+    { INDEX_op_mov_i32, { "r", "r" } },
+    { INDEX_op_movi_i32, { "r" } },
+
+    { INDEX_op_ld8u_i32, { "r", "r" } },
+    { INDEX_op_ld8s_i32, { "r", "r" } },
+    { INDEX_op_ld16u_i32, { "r", "r" } },
+    { INDEX_op_ld16s_i32, { "r", "r" } },
+    { INDEX_op_ld_i32, { "r", "r" } },
+    { INDEX_op_st8_i32, { "r", "r" } },
+    { INDEX_op_st16_i32, { "r", "r" } },
+    { INDEX_op_st_i32, { "r", "r" } },
+
+    { INDEX_op_add_i32, { "r", "0", "rI" } },
+    { INDEX_op_sub_i32, { "r", "0", "r" } },
+    { INDEX_op_mul_i32, { "r", "0", "r" } },
+
+    { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
+    { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
+
+    { INDEX_op_and_i32, { "r", "0", "r" } },
+    { INDEX_op_or_i32, { "r", "0", "r" } },
+    { INDEX_op_xor_i32, { "r", "0", "r" } },
+    { INDEX_op_neg_i32, { "r", "r" } },
+
+    { INDEX_op_shl_i32, { "r", "0", "Ri" } },
+    { INDEX_op_shr_i32, { "r", "0", "Ri" } },
+    { INDEX_op_sar_i32, { "r", "0", "Ri" } },
+
+    { INDEX_op_brcond_i32, { "r", "r" } },
+    { INDEX_op_setcond_i32, { "r", "r", "r" } },
+
+    { INDEX_op_qemu_ld8u, { "r", "L" } },
+    { INDEX_op_qemu_ld8s, { "r", "L" } },
+    { INDEX_op_qemu_ld16u, { "r", "L" } },
+    { INDEX_op_qemu_ld16s, { "r", "L" } },
+    { INDEX_op_qemu_ld32u, { "r", "L" } },
+    { INDEX_op_qemu_ld32s, { "r", "L" } },
+    { INDEX_op_qemu_ld32, { "r", "L" } },
+    { INDEX_op_qemu_ld64, { "r", "L" } },
+
+    { INDEX_op_qemu_st8, { "L", "L" } },
+    { INDEX_op_qemu_st16, { "L", "L" } },
+    { INDEX_op_qemu_st32, { "L", "L" } },
+    { INDEX_op_qemu_st64, { "L", "L" } },
+
+#if defined(__s390x__)
+    { INDEX_op_mov_i64, { "r", "r" } },
+    { INDEX_op_movi_i64, { "r" } },
+
+    { INDEX_op_ld8u_i64, { "r", "r" } },
+    { INDEX_op_ld8s_i64, { "r", "r" } },
+    { INDEX_op_ld16u_i64, { "r", "r" } },
+    { INDEX_op_ld16s_i64, { "r", "r" } },
+    { INDEX_op_ld32u_i64, { "r", "r" } },
+    { INDEX_op_ld32s_i64, { "r", "r" } },
+    { INDEX_op_ld_i64, { "r", "r" } },
+
+    { INDEX_op_st8_i64, { "r", "r" } },
+    { INDEX_op_st16_i64, { "r", "r" } },
+    { INDEX_op_st32_i64, { "r", "r" } },
+    { INDEX_op_st_i64, { "r", "r" } },
+
+    { INDEX_op_add_i64, { "r", "0", "r" } },
+    { INDEX_op_sub_i64, { "r", "0", "r" } },
+    { INDEX_op_mul_i64, { "r", "0", "r" } },
+
+    { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
+    { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
+
+    { INDEX_op_and_i64, { "r", "0", "r" } },
+    { INDEX_op_or_i64, { "r", "0", "r" } },
+    { INDEX_op_xor_i64, { "r", "0", "r" } },
+    { INDEX_op_neg_i64, { "r", "r" } },
+
+    { INDEX_op_shl_i64, { "r", "r", "Ri" } },
+    { INDEX_op_shr_i64, { "r", "r", "Ri" } },
+    { INDEX_op_sar_i64, { "r", "r", "Ri" } },
+
+    { INDEX_op_brcond_i64, { "r", "r" } },
+    { INDEX_op_setcond_i64, { "r", "r", "r" } },
+#endif
+
+    { -1 },
+};
+
 void tcg_target_init(TCGContext *s)
 {
-    /* gets called with KVM */
+#if !defined(CONFIG_USER_ONLY)
+    /* fail safe */
+    if ((1 << CPU_TLB_ENTRY_BITS) != sizeof(CPUTLBEntry)) {
+        tcg_abort();
+    }
+#endif
+
+    tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffff);
+    tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffff);
+    tcg_regset_set32(tcg_target_call_clobber_regs, 0,
+                     (1 << TCG_REG_R0) |
+                     (1 << TCG_REG_R1) |
+                     (1 << TCG_REG_R2) |
+                     (1 << TCG_REG_R3) |
+                     (1 << TCG_REG_R4) |
+                     (1 << TCG_REG_R5) |
+                     (1 << TCG_REG_R14)); /* link register */
+
+    tcg_regset_clear(s->reserved_regs);
+    /* frequently used as a temporary */
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13);
+    /* another temporary */
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R12);
+    /* XXX many insns can't be used with R0, so we better avoid it for now */
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R0);
+    /* The stack pointer.  */
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R15);
+
+    tcg_add_target_add_op_defs(s390_op_defs);
 }
 
 void tcg_target_qemu_prologue(TCGContext *s)
 {
-    /* gets called with KVM */
-}
+    /* stmg %r6,%r15,48(%r15) (save registers) */
+    tcg_out_insn(s, RXY, STMG, TCG_REG_R6, TCG_REG_R15, TCG_REG_R15, 48);
 
-static inline void tcg_out_mov(TCGContext *s, int ret, int arg)
-{
-    tcg_abort();
+    /* aghi %r15,-160 (stack frame) */
+    tcg_out_insn(s, RI, AGHI, TCG_REG_R15, -160);
+
+    /* br %r2 (go to TB) */
+    tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R2);
+
+    tb_ret_addr = s->code_ptr;
+
+    /* lmg %r6,%r15,208(%r15) (restore registers) */
+    tcg_out_insn(s, RXY, LMG, TCG_REG_R6, TCG_REG_R15, TCG_REG_R15, 208);
+
+    /* br %r14 (return) */
+    tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R14);
 }
 
 static inline void tcg_out_addi(TCGContext *s, int reg, tcg_target_long val)
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 8c19262..26dafae 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -26,7 +26,7 @@
 #define TCG_TARGET_REG_BITS 64
 #define TCG_TARGET_WORDS_BIGENDIAN
 
-enum {
+typedef enum TCGReg {
     TCG_REG_R0 = 0,
     TCG_REG_R1,
     TCG_REG_R2,
@@ -43,11 +43,12 @@ enum {
     TCG_REG_R13,
     TCG_REG_R14,
     TCG_REG_R15
-};
+} TCGReg;
+
 #define TCG_TARGET_NB_REGS 16
 
 /* optional instructions */
-// #define TCG_TARGET_HAS_div_i32
+#define TCG_TARGET_HAS_div2_i32
 // #define TCG_TARGET_HAS_rot_i32
 // #define TCG_TARGET_HAS_ext8s_i32
 // #define TCG_TARGET_HAS_ext16s_i32
@@ -56,14 +57,14 @@ enum {
 // #define TCG_TARGET_HAS_bswap16_i32
 // #define TCG_TARGET_HAS_bswap32_i32
 // #define TCG_TARGET_HAS_not_i32
-// #define TCG_TARGET_HAS_neg_i32
+#define TCG_TARGET_HAS_neg_i32
 // #define TCG_TARGET_HAS_andc_i32
 // #define TCG_TARGET_HAS_orc_i32
 // #define TCG_TARGET_HAS_eqv_i32
 // #define TCG_TARGET_HAS_nand_i32
 // #define TCG_TARGET_HAS_nor_i32
 
-// #define TCG_TARGET_HAS_div_i64
+#define TCG_TARGET_HAS_div2_i64
 // #define TCG_TARGET_HAS_rot_i64
 // #define TCG_TARGET_HAS_ext8s_i64
 // #define TCG_TARGET_HAS_ext16s_i64
@@ -75,7 +76,7 @@ enum {
 // #define TCG_TARGET_HAS_bswap32_i64
 // #define TCG_TARGET_HAS_bswap64_i64
 // #define TCG_TARGET_HAS_not_i64
-// #define TCG_TARGET_HAS_neg_i64
+#define TCG_TARGET_HAS_neg_i64
 // #define TCG_TARGET_HAS_andc_i64
 // #define TCG_TARGET_HAS_orc_i64
 // #define TCG_TARGET_HAS_eqv_i64
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread
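
The softmmu fast path that tcg_prepare_qemu_ldst emits above (SRLG/NGR to
index the TLB, NGR to build the tag, CG to compare it) corresponds to the
following C-level sketch.  This is not part of the patch; the page size,
TLB size and entry layout are stand-in assumptions chosen only to keep the
sketch self-contained.

#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for QEMU's TARGET_PAGE_*, CPU_TLB_* and CPUTLBEntry (assumed). */
#define PAGE_BITS       12
#define PAGE_MASK       (~(((uint64_t)1 << PAGE_BITS) - 1))
#define TLB_SIZE        256
#define TLB_ENTRY_BITS  5                /* log2 of the entry size below */

typedef struct {
    uint64_t addr_read;
    uint64_t addr_write;
    uint64_t addr_code;
    uint64_t addend;
} tlb_entry;

static bool tlb_hit(const tlb_entry *table, uint64_t addr,
                    int size_bits, bool is_store)
{
    /* Entry offset: the page number, pre-scaled by the entry size,
       which is what the SRLG + NGR pair computes.  */
    uint64_t ofs = (addr >> (PAGE_BITS - TLB_ENTRY_BITS))
                   & ((uint64_t)(TLB_SIZE - 1) << TLB_ENTRY_BITS);
    const tlb_entry *e = (const tlb_entry *)((const char *)table + ofs);
    /* Tag: the page bits plus the low bits of the access, so a misaligned
       access never matches and falls through to the helper call.  */
    uint64_t tag = addr & (PAGE_MASK | (((uint64_t)1 << size_bits) - 1));

    return tag == (is_store ? e->addr_write : e->addr_read);
}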

* [Qemu-devel] [PATCH 11/35] tcg-s390: Tidy unimplemented opcodes.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (9 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 10/35] tcg-s390: New TCG target Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-10 10:24   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 12/35] tcg-s390: Define TCG_TMP0 Richard Henderson
                   ` (24 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   25 ++++++++++---------------
 1 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 55f0fa9..5b2134b 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -824,11 +824,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
-    case INDEX_op_jmp:
-        /* XXX */
-        tcg_abort();
-        break;
-
     case INDEX_op_ld8u_i32:
     case INDEX_op_ld8u_i64:
         /* ??? LLC (RXY format) is only present with the extended-immediate
@@ -891,16 +886,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_st(s, TCG_TYPE_I64, args[0], args[1], args[2]);
         break;
 
-    case INDEX_op_mov_i32:
-        /* XXX */
-        tcg_abort();
-        break;
-
-    case INDEX_op_movi_i32:
-        /* XXX */
-        tcg_abort();
-        break;
-
     case INDEX_op_add_i32:
         if (const_args[2]) {
             tcg_out_insn(s, RI, AHI, args[0], args[2]);
@@ -1077,6 +1062,16 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_qemu_st(s, args, LD_UINT64);
         break;
 
+    case INDEX_op_mov_i32:
+    case INDEX_op_mov_i64:
+    case INDEX_op_movi_i32:
+    case INDEX_op_movi_i64:
+        /* These are always emitted by TCG directly.  */
+    case INDEX_op_jmp:
+        /* This one is obsolete and never emitted.  */
+        tcg_abort();
+        break;
+
     default:
         fprintf(stderr,"unimplemented opc 0x%x\n",opc);
         tcg_abort();
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 12/35] tcg-s390: Define TCG_TMP0.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (10 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 11/35] tcg-s390: Tidy unimplemented opcodes Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-10 10:25   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 13/35] tcg-s390: Tidy regset initialization; use R14 as temporary Richard Henderson
                   ` (23 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Use a define for the temp register instead of hard-coding it.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   54 ++++++++++++++++++++++++++----------------------
 1 files changed, 29 insertions(+), 25 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 5b2134b..2b80c02 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -40,6 +40,10 @@
    rather than TCG_REG_R0.  */
 #define TCG_REG_NONE    0
 
+/* A scratch register that may be used throughout the backend.  */
+#define TCG_TMP0        TCG_REG_R13
+
+
 /* All of the following instructions are prefixed with their instruction
    format, and are defined as 8- or 16-bit quantities, even when the two
    halves of the 16-bit quantity may appear 32 bits apart in the insn.
@@ -376,12 +380,12 @@ static inline void tcg_out_movi(TCGContext *s, TCGType type,
         tcg_out_insn(s, RI, IILH, ret, arg >> 16);
     } else {
         /* branch over constant and store its address in R13 */
-        tcg_out_insn(s, RIL, BRASL, TCG_REG_R13, (6 + 8) >> 1);
+        tcg_out_insn(s, RIL, BRASL, TCG_TMP0, (6 + 8) >> 1);
         /* 64-bit constant */
         tcg_out32(s, arg >> 32);
         tcg_out32(s, arg);
         /* load constant to ret */
-        tcg_out_insn(s, RXY, LG, ret, TCG_REG_R13, 0, 0);
+        tcg_out_insn(s, RXY, LG, ret, TCG_TMP0, 0, 0);
     }
 }
 
@@ -399,14 +403,14 @@ static void tcg_out_mem(TCGContext *s, S390Opcode opc_rx, S390Opcode opc_rxy,
     if (ofs < -0x80000 || ofs >= 0x80000) {
         /* Combine the low 16 bits of the offset with the actual load insn;
            the high 48 bits must come from an immediate load.  */
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, ofs & ~0xffff);
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, ofs & ~0xffff);
         ofs &= 0xffff;
 
         /* If we were already given an index register, add it in.  */
         if (index != TCG_REG_NONE) {
-            tcg_out_insn(s, RRE, AGR, TCG_REG_R13, index);
+            tcg_out_insn(s, RRE, AGR, TCG_TMP0, index);
         }
-        index = TCG_REG_R13;
+        index = TCG_TMP0;
     }
 
     if (opc_rx && ofs >= 0 && ofs < 0x1000) {
@@ -482,8 +486,8 @@ static void tgen_gotoi(TCGContext *s, int cc, tcg_target_long dest)
     } else if (off == (int32_t)off) {
         tcg_out_insn(s, RIL, BRCL, cc, off);
     } else {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, dest);
-        tcg_out_insn(s, RR, BCR, cc, TCG_REG_R13);
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, dest);
+        tcg_out_insn(s, RR, BCR, cc, TCG_TMP0);
     }
 }
 
@@ -505,8 +509,8 @@ static void tgen_calli(TCGContext *s, tcg_target_long dest)
     if (off == (int32_t)off) {
         tcg_out_insn(s, RIL, BRASL, TCG_REG_R14, off);
     } else {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, dest);
-        tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_REG_R13);
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, dest);
+        tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_TMP0);
     }
 }
 
@@ -538,22 +542,22 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, TCG_REG_NONE,
                  TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
 
-    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+    tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
                  TARGET_PAGE_MASK | ((1 << s_bits) - 1));
-    tcg_out_insn(s, RRE, NGR, arg0, TCG_REG_R13);
+    tcg_out_insn(s, RRE, NGR, arg0, TCG_TMP0);
 
-    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+    tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
                  (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
-    tcg_out_insn(s, RRE, NGR, arg1, TCG_REG_R13);
+    tcg_out_insn(s, RRE, NGR, arg1, TCG_TMP0);
 
     if (is_store) {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
                      offsetof(CPUState, tlb_table[mem_index][0].addr_write));
     } else {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
                      offsetof(CPUState, tlb_table[mem_index][0].addr_read));
     }
-    tcg_out_insn(s, RRE, AGR, arg1, TCG_REG_R13);
+    tcg_out_insn(s, RRE, AGR, arg1, TCG_TMP0);
 
     tcg_out_insn(s, RRE, AGR, arg1, TCG_AREG0);
 
@@ -688,8 +692,8 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 #else
         /* swapped unsigned halfword load with upper bits zeroed */
         tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, 0xffffL);
-        tcg_out_insn(s, RRE, NGR, data_reg, 13);
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, 0xffffL);
+        tcg_out_insn(s, RRE, NGR, data_reg, TCG_TMP0);
 #endif
         break;
     case LD_INT16:
@@ -802,16 +806,16 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                                    (tcg_target_long)s->code_ptr) >> 1;
             if (off == (int32_t)off) {
                 /* load address relative to PC */
-                tcg_out_insn(s, RIL, LARL, TCG_REG_R13, off);
+                tcg_out_insn(s, RIL, LARL, TCG_TMP0, off);
             } else {
                 /* too far for larl */
-                tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
                              (tcg_target_long)(s->tb_next + args[0]));
             }
             /* load address stored at s->tb_next + args[0] */
-            tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R13, TCG_REG_R13, 0);
+            tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP0, TCG_TMP0, 0);
             /* and go there */
-            tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R13);
+            tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_TMP0);
         }
         s->tb_next_offset[args[0]] = s->code_ptr - s->code_buf;
         break;
@@ -934,9 +938,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
     case INDEX_op_neg_i64:
         /* FIXME: optimize args[0] != args[1] case */
-        tcg_out_mov(s, TCG_REG_R13, args[1]);
+        tcg_out_mov(s, TCG_TMP0, args[1]);
         tcg_out_movi(s, TCG_TYPE_I64, args[0], 0);
-        tcg_out_insn(s, RRE, SGR, args[0], TCG_REG_R13);
+        tcg_out_insn(s, RRE, SGR, args[0], TCG_TMP0);
         break;
 
     case INDEX_op_mul_i32:
@@ -1192,7 +1196,7 @@ void tcg_target_init(TCGContext *s)
 
     tcg_regset_clear(s->reserved_regs);
     /* frequently used as a temporary */
-    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13);
+    tcg_regset_set_reg(s->reserved_regs, TCG_TMP0);
     /* another temporary */
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_R12);
     /* XXX many insns can't be used with R0, so we better avoid it for now */
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 13/35] tcg-s390: Tidy regset initialization; use R14 as temporary.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (11 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 12/35] tcg-s390: Define TCG_TMP0 Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-10 10:26   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 14/35] tcg-s390: Rearrange register allocation order Richard Henderson
                   ` (22 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   26 ++++++++++++--------------
 1 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 2b80c02..95ea3c8 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -41,7 +41,7 @@
 #define TCG_REG_NONE    0
 
 /* A scratch register that may be used throughout the backend.  */
-#define TCG_TMP0        TCG_REG_R13
+#define TCG_TMP0        TCG_REG_R14
 
 
 /* All of the following instructions are prefixed with their instruction
@@ -1185,24 +1185,22 @@ void tcg_target_init(TCGContext *s)
 
     tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffff);
     tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffff);
-    tcg_regset_set32(tcg_target_call_clobber_regs, 0,
-                     (1 << TCG_REG_R0) |
-                     (1 << TCG_REG_R1) |
-                     (1 << TCG_REG_R2) |
-                     (1 << TCG_REG_R3) |
-                     (1 << TCG_REG_R4) |
-                     (1 << TCG_REG_R5) |
-                     (1 << TCG_REG_R14)); /* link register */
+
+    tcg_regset_clear(tcg_target_call_clobber_regs);
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R0);
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R1);
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R2);
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R3);
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R4);
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R5);
+    /* The return register can be considered call-clobbered.  */
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R14);
 
     tcg_regset_clear(s->reserved_regs);
-    /* frequently used as a temporary */
     tcg_regset_set_reg(s->reserved_regs, TCG_TMP0);
-    /* another temporary */
-    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R12);
     /* XXX many insns can't be used with R0, so we better avoid it for now */
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_R0);
-    /* The stack pointer.  */
-    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R15);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_CALL_STACK);
 
     tcg_add_target_add_op_defs(s390_op_defs);
 }
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 14/35] tcg-s390: Rearrange register allocation order.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (12 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 13/35] tcg-s390: Tidy regset initialization; use R14 as temporary Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-10 10:26   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed Richard Henderson
                   ` (21 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Try to avoid conflicting with the outgoing function call arguments.
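
For reference, the s390x C calling convention that this ordering accommodates
assigns the roles below (an informational sketch of the ABI, not part of the
patch; the PIC and literal-pool notes are common conventions rather than hard
requirements).  R2-R5 carry arguments and are therefore allocated last, and
R6, which is both an argument register and call-saved, goes last among the
call-saved set.

/* Rough s390x C ABI role of each GPR (informational sketch). */
static const char *const s390x_abi_role[16] = {
    "volatile scratch",                              /* r0  */
    "volatile scratch",                              /* r1  */
    "1st argument / return value (call-clobbered)",  /* r2  */
    "2nd argument (call-clobbered)",                 /* r3  */
    "3rd argument (call-clobbered)",                 /* r4  */
    "4th argument (call-clobbered)",                 /* r5  */
    "5th argument, but call-saved",                  /* r6  */
    "call-saved",                                    /* r7  */
    "call-saved",                                    /* r8  */
    "call-saved",                                    /* r9  */
    "call-saved",                                    /* r10 */
    "call-saved",                                    /* r11 */
    "call-saved (GOT pointer in PIC code)",          /* r12 */
    "call-saved (often literal pool base)",          /* r13 */
    "return address (call-clobbered)",               /* r14 */
    "stack pointer",                                 /* r15 */
};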

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   23 +++++++++++++----------
 1 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 95ea3c8..3944cb1 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -149,22 +149,25 @@ static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
 };
 #endif
 
+/* Since R6 is a potential argument register, choose it last of the
+   call-saved registers.  Likewise prefer the call-clobbered registers
+   in reverse order to maximize the chance of avoiding the arguments.  */
 static const int tcg_target_reg_alloc_order[] = {
-    TCG_REG_R6,
-    TCG_REG_R7,
-    TCG_REG_R8,
-    TCG_REG_R9,
-    TCG_REG_R10,
-    TCG_REG_R11,
-    TCG_REG_R12,
     TCG_REG_R13,
+    TCG_REG_R12,
+    TCG_REG_R11,
+    TCG_REG_R10,
+    TCG_REG_R9,
+    TCG_REG_R8,
+    TCG_REG_R7,
+    TCG_REG_R6,
     TCG_REG_R14,
     TCG_REG_R0,
     TCG_REG_R1,
-    TCG_REG_R2,
-    TCG_REG_R3,
-    TCG_REG_R4,
     TCG_REG_R5,
+    TCG_REG_R4,
+    TCG_REG_R3,
+    TCG_REG_R2,
 };
 
 static const int tcg_target_call_iarg_regs[] = {
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (13 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 14/35] tcg-s390: Rearrange register allocation order Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-10 10:28   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 16/35] tcg-s390: Re-implement tcg_out_movi Richard Henderson
                   ` (20 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Verify at startup that the instruction extensions we generate code for
are actually installed.  Future patches can tailor code generation to
the set of extensions that are present.
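
For example (a sketch, not part of the patch), the FACILITY_* masks added
below follow the STFLE numbering: facility N is bit N of a big-endian bit
string, i.e. bit (63 - N) of the first doubleword when read as a 64-bit
integer on the (big-endian) host, so a check helper could look like:

#include <stdbool.h>
#include <stdint.h>

/* Facility N is bit (63 - N) of the first STFLE doubleword, matching the
   FACILITY_* masks (e.g. nr == 18 for the long-displacement facility).  */
static inline bool have_facility(uint64_t facilities_dword0, int nr)
{
    return nr < 64 && (facilities_dword0 & ((uint64_t)1 << (63 - nr))) != 0;
}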

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  113 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 113 insertions(+), 0 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 3944cb1..d99bb5c 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -229,6 +229,17 @@ static void *qemu_st_helpers[4] = {
 
 static uint8_t *tb_ret_addr;
 
+/* A list of relevant facilities used by this translator.  Some of these
+   are required for proper operation, and these are checked at startup.  */
+
+#define FACILITY_ZARCH		(1ULL << (63 - 1))
+#define FACILITY_ZARCH_ACTIVE	(1ULL << (63 - 2))
+#define FACILITY_LONG_DISP	(1ULL << (63 - 18))
+#define FACILITY_EXT_IMM	(1ULL << (63 - 21))
+#define FACILITY_GEN_INST_EXT	(1ULL << (63 - 34))
+
+static uint64_t facilities;
+
 static void patch_reloc(uint8_t *code_ptr, int type,
                 tcg_target_long value, tcg_target_long addend)
 {
@@ -1177,6 +1188,106 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { -1 },
 };
 
+/* ??? Linux kernels provide an AUXV entry AT_HWCAP that provides most of
+   this information.  However, getting at that entry is not easy this far
+   away from main.  Our options are: start searching from environ, but
+   that fails as soon as someone does a setenv in between.  Read the data
+   from /proc/self/auxv.  Or do the probing ourselves.  The only thing
+   extra that AT_HWCAP gives us is HWCAP_S390_HIGH_GPRS, which indicates
+   that the kernel saves all 64-bits of the registers around traps while
+   in 31-bit mode.  But this is true of all "recent" kernels (ought to dig
+   back and see from when this might not be true).  */
+
+#include <signal.h>
+
+static volatile sig_atomic_t got_sigill;
+
+static void sigill_handler(int sig)
+{
+    got_sigill = 1;
+}
+
+static void query_facilities(void)
+{
+    struct sigaction sa_old, sa_new;
+    register int r0 __asm__("0");
+    register void *r1 __asm__("1");
+    int fail;
+
+    memset(&sa_new, 0, sizeof(sa_new));
+    sa_new.sa_handler = sigill_handler;
+    sigaction(SIGILL, &sa_new, &sa_old);
+
+    /* First, try STORE FACILITY LIST EXTENDED.  If this is present, then
+       we need not do any more probing.  Unfortunately, this itself is an
+       extension and the original STORE FACILITY LIST instruction is
+       kernel-only, storing its results at absolute address 200.  */
+    /* stfle 0(%r1) */
+    r1 = &facilities;
+    asm volatile(".word 0xb2b0,0x1000"
+                 : "=r"(r0) : "0"(0), "r"(r1) : "memory", "cc");
+
+    if (got_sigill) {
+        /* STORE FACILITY LIST EXTENDED is not available.  Probe for one of each
+           kind of instruction that we're interested in.  */
+        /* ??? Possibly some of these are in practice never present unless
+           the store-facility-extended facility is also present.  But since
+           that isn't documented it's just better to probe for each.  */
+
+        /* Test for z/Architecture.  Required even in 31-bit mode.  */
+        got_sigill = 0;
+        /* agr %r0,%r0 */
+        asm volatile(".word 0xb908,0x0000" : "=r"(r0) : : "cc");
+        if (!got_sigill) {
+            facilities |= FACILITY_ZARCH | FACILITY_ZARCH_ACTIVE;
+        }
+
+        /* Test for long displacement.  */
+        got_sigill = 0;
+        /* ly %r0,0(%r1) */
+        r1 = &facilities;
+        asm volatile(".word 0xe300,0x1000,0x0058"
+                     : "=r"(r0) : "r"(r1) : "cc");
+        if (!got_sigill) {
+            facilities |= FACILITY_LONG_DISP;
+        }
+
+        /* Test for extended immediates.  */
+        got_sigill = 0;
+        /* afi %r0,0 */
+        asm volatile(".word 0xc209,0x0000,0x0000" : : : "cc");
+        if (!got_sigill) {
+            facilities |= FACILITY_EXT_IMM;
+        }
+
+        /* Test for general-instructions-extension.  */
+        got_sigill = 0;
+        /* msfi %r0,1 */
+        asm volatile(".word 0xc201,0x0000,0x0001");
+        if (!got_sigill) {
+            facilities |= FACILITY_GEN_INST_EXT;
+        }
+    }
+
+    sigaction(SIGILL, &sa_old, NULL);
+
+    /* The translator currently uses these extensions unconditionally.
+       Pruning this back to the base ESA/390 architecture doesn't seem
+       worthwhile, since even the KVM target requires z/Arch.  */
+    fail = 0;
+    if ((facilities & FACILITY_ZARCH_ACTIVE) == 0) {
+        fprintf(stderr, "TCG: z/Arch facility is required\n");
+        fail = 1;
+    }
+    if ((facilities & FACILITY_LONG_DISP) == 0) {
+        fprintf(stderr, "TCG: long-displacement facility is required\n");
+        fail = 1;
+    }
+    if (fail) {
+        exit(-1);
+    }
+}
+
 void tcg_target_init(TCGContext *s)
 {
 #if !defined(CONFIG_USER_ONLY)
@@ -1186,6 +1297,8 @@ void tcg_target_init(TCGContext *s)
     }
 #endif
 
+    query_facilities();
+
     tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffff);
     tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffff);
 
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread
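
The long comment in this patch weighs reading AT_HWCAP from /proc/self/auxv
instead of probing.  Purely as an illustration of that alternative (not
something the patch does), a minimal Linux auxv reader could look like the
sketch below; it assumes the usual (type, value) pairs of unsigned long and
only reports the raw HWCAP word.

#include <elf.h>      /* AT_HWCAP, AT_NULL */
#include <stdio.h>

/* Scan /proc/self/auxv for AT_HWCAP; returns 0 if not found or unreadable. */
static unsigned long hwcap_from_auxv(void)
{
    unsigned long pair[2], hwcap = 0;
    FILE *f = fopen("/proc/self/auxv", "rb");

    if (f == NULL) {
        return 0;
    }
    while (fread(pair, sizeof(pair), 1, f) == 1 && pair[0] != AT_NULL) {
        if (pair[0] == AT_HWCAP) {
            hwcap = pair[1];
            break;
        }
    }
    fclose(f);
    return hwcap;
}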

* [Qemu-devel] [PATCH 16/35] tcg-s390: Re-implement tcg_out_movi.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (14 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-12 12:04   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 17/35] tcg-s390: Implement sign and zero-extension operations Richard Henderson
                   ` (19 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Make better use of the LOAD HALFWORD IMMEDIATE, LOAD IMMEDIATE,
and INSERT IMMEDIATE instruction groups.
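
The selection of a single "load logical immediate" instruction in the
rewritten tcg_out_movi below boils down to finding the one nonzero 16-bit
halfword of the value.  As a standalone restatement of that loop (not part
of the patch; index 0..3 maps to LLILL, LLILH, LLIHL, LLIHH):

#include <stdint.h>

/* Returns which 16-bit halfword (0 = least significant) is the only
   nonzero one, or -1 if the value needs a different strategy.  */
static int lli_halfword(uint64_t uval)
{
    int i;

    for (i = 0; i < 4; i++) {
        uint64_t mask = (uint64_t)0xffff << (i * 16);
        if ((uval & mask) != 0 && (uval & ~mask) == 0) {
            return i;   /* loadable as LLI?? with immediate uval >> (i * 16) */
        }
    }
    return -1;
}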

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  129 +++++++++++++++++++++++++++++++++++++++++++------
 1 files changed, 113 insertions(+), 16 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index d99bb5c..71e017a 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -52,12 +52,23 @@ typedef enum S390Opcode {
     RIL_BRASL   = 0xc005,
     RIL_BRCL    = 0xc004,
     RIL_LARL    = 0xc000,
+    RIL_IIHF    = 0xc008,
+    RIL_IILF    = 0xc009,
+    RIL_LGFI    = 0xc001,
+    RIL_LLIHF   = 0xc00e,
+    RIL_LLILF   = 0xc00f,
 
     RI_AGHI     = 0xa70b,
     RI_AHI      = 0xa70a,
     RI_BRC      = 0xa704,
+    RI_IIHH     = 0xa500,
+    RI_IIHL     = 0xa501,
     RI_IILH     = 0xa502,
+    RI_IILL     = 0xa503,
     RI_LGHI     = 0xa709,
+    RI_LLIHH    = 0xa50c,
+    RI_LLIHL    = 0xa50d,
+    RI_LLILH    = 0xa50e,
     RI_LLILL    = 0xa50f,
 
     RRE_AGR     = 0xb908,
@@ -382,24 +393,110 @@ static inline void tcg_out_mov(TCGContext *s, int ret, int arg)
 }
 
 /* load a register with an immediate value */
-static inline void tcg_out_movi(TCGContext *s, TCGType type,
-                int ret, tcg_target_long arg)
+static void tcg_out_movi(TCGContext *s, TCGType type,
+                         TCGReg ret, tcg_target_long sval)
 {
-    if (arg >= -0x8000 && arg < 0x8000) { /* signed immediate load */
-        tcg_out_insn(s, RI, LGHI, ret, arg);
-    } else if (!(arg & 0xffffffffffff0000UL)) {
-        tcg_out_insn(s, RI, LLILL, ret, arg);
-    } else if (!(arg & 0xffffffff00000000UL) || type == TCG_TYPE_I32) {
-        tcg_out_insn(s, RI, LLILL, ret, arg);
-        tcg_out_insn(s, RI, IILH, ret, arg >> 16);
+    static const S390Opcode lli_insns[4] = {
+        RI_LLILL, RI_LLILH, RI_LLIHL, RI_LLIHH
+    };
+
+    tcg_target_ulong uval = sval;
+    int i;
+
+    if (type == TCG_TYPE_I32) {
+        uval = (uint32_t)sval;
+        sval = (int32_t)sval;
+    }
+
+    /* Try all 32-bit insns that can load it in one go.  */
+    if (sval >= -0x8000 && sval < 0x8000) {
+        tcg_out_insn(s, RI, LGHI, ret, sval);
+        return;
+    }
+
+    for (i = 0; i < 4; i++) {
+        tcg_target_long mask = 0xffffull << i*16;
+        if ((uval & mask) != 0 && (uval & ~mask) == 0) {
+            tcg_out_insn_RI(s, lli_insns[i], ret, uval >> i*16);
+            return;
+        }
+    }
+
+    /* Try all 48-bit insns that can load it in one go.  */
+    if (facilities & FACILITY_EXT_IMM) {
+        if (sval == (int32_t)sval) {
+            tcg_out_insn(s, RIL, LGFI, ret, sval);
+            return;
+        }
+        if (uval <= 0xffffffff) {
+            tcg_out_insn(s, RIL, LLILF, ret, uval);
+            return;
+        }
+        if ((uval & 0xffffffff) == 0) {
+            tcg_out_insn(s, RIL, LLIHF, ret, uval >> 32);
+            return;
+        }
+    }
+
+    /* Try for PC-relative address load.  */
+    if ((sval & 1) == 0) {
+        intptr_t off = (sval - (intptr_t)s->code_ptr) >> 1;
+        if (off == (int32_t)off) {
+            tcg_out_insn(s, RIL, LARL, ret, off);
+            return;
+        }
+    }
+
+    /* If extended immediates are not present, then we may have to issue
+       several instructions to load the low 32 bits.  */
+    if (!(facilities & FACILITY_EXT_IMM)) {
+        /* A 32-bit unsigned value can be loaded in 2 insns.  And given
+           that the lli_insns loop above did not succeed, we know that
+           both insns are required.  */
+        if (uval <= 0xffffffff) {
+            tcg_out_insn(s, RI, LLILL, ret, uval);
+            tcg_out_insn(s, RI, IILH, ret, uval >> 16);
+            return;
+        }
+
+        /* If all high bits are set, the value can be loaded in 2 or 3 insns.
+           We first want to make sure that all the high bits get set.  With
+           luck the low 16-bits can be considered negative to perform that for
+           free, otherwise we load an explicit -1.  */
+        if (sval >> 32 == -1) {
+            if (uval & 0x8000) {
+                tcg_out_insn(s, RI, LGHI, ret, uval);
+            } else {
+                tcg_out_insn(s, RI, LGHI, ret, -1);
+                tcg_out_insn(s, RI, IILL, ret, uval);
+            }
+            tcg_out_insn(s, RI, IILH, ret, uval >> 16);
+            return;
+        }
+    }
+
+    /* If we get here, both the high and low parts have non-zero bits.  */
+
+    /* Recurse to load the lower 32-bits.  */
+    tcg_out_movi(s, TCG_TYPE_I32, ret, sval);
+
+    /* Insert data into the high 32-bits.  */
+    uval >>= 32;
+    if (facilities & FACILITY_EXT_IMM) {
+        if (uval < 0x10000) {
+            tcg_out_insn(s, RI, IIHL, ret, uval);
+        } else if ((uval & 0xffff) == 0) {
+            tcg_out_insn(s, RI, IIHH, ret, uval >> 16);
+        } else {
+            tcg_out_insn(s, RIL, IIHF, ret, uval);
+        }
     } else {
-        /* branch over constant and store its address in R13 */
-        tcg_out_insn(s, RIL, BRASL, TCG_TMP0, (6 + 8) >> 1);
-        /* 64-bit constant */
-        tcg_out32(s, arg >> 32);
-        tcg_out32(s, arg);
-        /* load constant to ret */
-        tcg_out_insn(s, RXY, LG, ret, TCG_TMP0, 0, 0);
+        if (uval & 0xffff) {
+            tcg_out_insn(s, RI, IIHL, ret, uval);
+        }
+        if (uval & 0xffff0000) {
+            tcg_out_insn(s, RI, IIHH, ret, uval >> 16);
+        }
     }
 }
 
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 17/35] tcg-s390: Implement sign and zero-extension operations.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (15 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 16/35] tcg-s390: Re-implement tcg_out_movi Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-12 12:32   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 18/35] tcg-s390: Implement bswap operations Richard Henderson
                   ` (18 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  164 ++++++++++++++++++++++++++++++++++++++++++++-----
 tcg/s390/tcg-target.h |   20 +++---
 2 files changed, 158 insertions(+), 26 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 71e017a..42e3224 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -78,10 +78,14 @@ typedef enum S390Opcode {
     RRE_DLR     = 0xb997,
     RRE_DSGFR   = 0xb91d,
     RRE_DSGR    = 0xb90d,
+    RRE_LGBR    = 0xb906,
     RRE_LCGR    = 0xb903,
     RRE_LGFR    = 0xb914,
+    RRE_LGHR    = 0xb907,
     RRE_LGR     = 0xb904,
+    RRE_LLGCR   = 0xb984,
     RRE_LLGFR   = 0xb916,
+    RRE_LLGHR   = 0xb985,
     RRE_MSGR    = 0xb90c,
     RRE_MSR     = 0xb252,
     RRE_NGR     = 0xb980,
@@ -117,11 +121,9 @@ typedef enum S390Opcode {
     RXY_LGF     = 0xe314,
     RXY_LGH     = 0xe315,
     RXY_LHY     = 0xe378,
-    RXY_LLC     = 0xe394,
     RXY_LLGC    = 0xe390,
     RXY_LLGF    = 0xe316,
     RXY_LLGH    = 0xe391,
-    RXY_LLH     = 0xe395,
     RXY_LMG     = 0xeb04,
     RXY_LRV     = 0xe31e,
     RXY_LRVG    = 0xe30f,
@@ -553,6 +555,96 @@ static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg data,
     }
 }
 
+static void tgen_ext8s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
+{
+    if (facilities & FACILITY_EXT_IMM) {
+        tcg_out_insn(s, RRE, LGBR, dest, src);
+        return;
+    }
+
+    if (type == TCG_TYPE_I32) {
+        if (dest == src) {
+            tcg_out_sh32(s, RS_SLL, dest, TCG_REG_NONE, 24);
+        } else {
+            tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 24);
+        }
+        tcg_out_sh32(s, RS_SRA, dest, TCG_REG_NONE, 24);
+    } else {
+        tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 56);
+        tcg_out_sh64(s, RSY_SRAG, dest, dest, TCG_REG_NONE, 56);
+    }
+}
+
+static void tgen_ext8u(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
+{
+    if (facilities & FACILITY_EXT_IMM) {
+        tcg_out_insn(s, RRE, LLGCR, dest, src);
+        return;
+    }
+
+    if (dest == src) {
+        tcg_out_movi(s, type, TCG_TMP0, 0xff);
+        src = TCG_TMP0;
+    } else {
+        tcg_out_movi(s, type, dest, 0xff);
+    }
+    if (type == TCG_TYPE_I32) {
+        tcg_out_insn(s, RR, NR, dest, src);
+    } else {
+        tcg_out_insn(s, RRE, NGR, dest, src);
+    }
+}
+
+static void tgen_ext16s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
+{
+    if (facilities & FACILITY_EXT_IMM) {
+        tcg_out_insn(s, RRE, LGHR, dest, src);
+        return;
+    }
+
+    if (type == TCG_TYPE_I32) {
+        if (dest == src) {
+            tcg_out_sh32(s, RS_SLL, dest, TCG_REG_NONE, 16);
+        } else {
+            tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 16);
+        }
+        tcg_out_sh32(s, RS_SRA, dest, TCG_REG_NONE, 16);
+    } else {
+        tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 48);
+        tcg_out_sh64(s, RSY_SRAG, dest, dest, TCG_REG_NONE, 48);
+    }
+}
+
+static void tgen_ext16u(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
+{
+    if (facilities & FACILITY_EXT_IMM) {
+        tcg_out_insn(s, RRE, LLGHR, dest, src);
+        return;
+    }
+
+    if (dest == src) {
+        tcg_out_movi(s, type, TCG_TMP0, 0xffff);
+        src = TCG_TMP0;
+    } else {
+        tcg_out_movi(s, type, dest, 0xffff);
+    }
+    if (type == TCG_TYPE_I32) {
+        tcg_out_insn(s, RR, NR, dest, src);
+    } else {
+        tcg_out_insn(s, RRE, NGR, dest, src);
+    }
+}
+
+static inline void tgen_ext32s(TCGContext *s, TCGReg dest, TCGReg src)
+{
+    tcg_out_insn(s, RRE, LGFR, dest, src);
+}
+
+static inline void tgen_ext32u(TCGContext *s, TCGReg dest, TCGReg src)
+{
+    tcg_out_insn(s, RRE, LLGFR, dest, src);
+}
+
 static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
 {
     if (c > TCG_COND_GT) {
@@ -643,8 +735,8 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     }
 
 #if TARGET_LONG_BITS == 32
-    tcg_out_insn(s, RRE, LLGFR, arg1, addr_reg);
-    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
+    tgen_ext32u(s, arg1, addr_reg);
+    tgen_ext32u(s, arg0, addr_reg);
 #else
     tcg_out_mov(s, arg1, addr_reg);
     tcg_out_mov(s, arg0, addr_reg);
@@ -681,7 +773,7 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
 
     /* call load/store helper */
 #if TARGET_LONG_BITS == 32
-    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
+    tgen_ext32u(s, arg0, addr_reg);
 #else
     tcg_out_mov(s, arg0, addr_reg);
 #endif
@@ -697,15 +789,13 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
         /* sign extension */
         switch (opc) {
         case LD_INT8:
-            tcg_out_insn(s, RSY, SLLG, data_reg, arg0, TCG_REG_NONE, 56);
-            tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, TCG_REG_NONE, 56);
+            tgen_ext8s(s, TCG_TYPE_I64, data_reg, arg0);
             break;
         case LD_INT16:
-            tcg_out_insn(s, RSY, SLLG, data_reg, arg0, TCG_REG_NONE, 48);
-            tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, TCG_REG_NONE, 48);
+            tgen_ext16s(s, TCG_TYPE_I64, data_reg, arg0);
             break;
         case LD_INT32:
-            tcg_out_insn(s, RRE, LGFR, data_reg, arg0);
+            tgen_ext32s(s, data_reg, arg0);
             break;
         default:
             /* unsigned -> just copy */
@@ -803,8 +893,7 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 #else
         /* swapped unsigned halfword load with upper bits zeroed */
         tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, 0xffffL);
-        tcg_out_insn(s, RRE, NGR, data_reg, TCG_TMP0);
+        tgen_ext16u(s, TCG_TYPE_I64, data_reg, data_reg);
 #endif
         break;
     case LD_INT16:
@@ -813,8 +902,7 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 #else
         /* swapped sign-extended halfword load */
         tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
-        tcg_out_insn(s, RSY, SLLG, data_reg, data_reg, TCG_REG_NONE, 48);
-        tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, TCG_REG_NONE, 48);
+        tgen_ext16s(s, TCG_TYPE_I64, data_reg, data_reg);
 #endif
         break;
     case LD_UINT32:
@@ -823,7 +911,7 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 #else
         /* swapped unsigned int load with upper bits zeroed */
         tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
-        tcg_out_insn(s, RRE, LLGFR, data_reg, data_reg);
+        tgen_ext32u(s, data_reg, data_reg);
 #endif
         break;
     case LD_INT32:
@@ -832,7 +920,7 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 #else
         /* swapped sign-extended int load */
         tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
-        tcg_out_insn(s, RRE, LGFR, data_reg, data_reg);
+        tgen_ext32s(s, data_reg, data_reg);
 #endif
         break;
     case LD_UINT64:
@@ -1111,6 +1199,38 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         op = RSY_SRAG;
         goto do_shift64;
 
+    case INDEX_op_ext8s_i32:
+        tgen_ext8s(s, TCG_TYPE_I32, args[0], args[1]);
+        break;
+    case INDEX_op_ext8s_i64:
+        tgen_ext8s(s, TCG_TYPE_I64, args[0], args[1]);
+        break;
+    case INDEX_op_ext16s_i32:
+        tgen_ext16s(s, TCG_TYPE_I32, args[0], args[1]);
+        break;
+    case INDEX_op_ext16s_i64:
+        tgen_ext16s(s, TCG_TYPE_I64, args[0], args[1]);
+        break;
+    case INDEX_op_ext32s_i64:
+        tgen_ext32s(s, args[0], args[1]);
+        break;
+
+    case INDEX_op_ext8u_i32:
+        tgen_ext8u(s, TCG_TYPE_I32, args[0], args[1]);
+        break;
+    case INDEX_op_ext8u_i64:
+        tgen_ext8u(s, TCG_TYPE_I64, args[0], args[1]);
+        break;
+    case INDEX_op_ext16u_i32:
+        tgen_ext16u(s, TCG_TYPE_I32, args[0], args[1]);
+        break;
+    case INDEX_op_ext16u_i64:
+        tgen_ext16u(s, TCG_TYPE_I64, args[0], args[1]);
+        break;
+    case INDEX_op_ext32u_i64:
+        tgen_ext32u(s, args[0], args[1]);
+        break;
+
     case INDEX_op_br:
         tgen_branch(s, S390_CC_ALWAYS, args[0]);
         break;
@@ -1228,6 +1348,11 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_shr_i32, { "r", "0", "Ri" } },
     { INDEX_op_sar_i32, { "r", "0", "Ri" } },
 
+    { INDEX_op_ext8s_i32, { "r", "r" } },
+    { INDEX_op_ext8u_i32, { "r", "r" } },
+    { INDEX_op_ext16s_i32, { "r", "r" } },
+    { INDEX_op_ext16u_i32, { "r", "r" } },
+
     { INDEX_op_brcond_i32, { "r", "r" } },
     { INDEX_op_setcond_i32, { "r", "r", "r" } },
 
@@ -1278,6 +1403,13 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_shr_i64, { "r", "r", "Ri" } },
     { INDEX_op_sar_i64, { "r", "r", "Ri" } },
 
+    { INDEX_op_ext8s_i64, { "r", "r" } },
+    { INDEX_op_ext8u_i64, { "r", "r" } },
+    { INDEX_op_ext16s_i64, { "r", "r" } },
+    { INDEX_op_ext16u_i64, { "r", "r" } },
+    { INDEX_op_ext32s_i64, { "r", "r" } },
+    { INDEX_op_ext32u_i64, { "r", "r" } },
+
     { INDEX_op_brcond_i64, { "r", "r" } },
     { INDEX_op_setcond_i64, { "r", "r", "r" } },
 #endif
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 26dafae..570c832 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -50,10 +50,10 @@ typedef enum TCGReg {
 /* optional instructions */
 #define TCG_TARGET_HAS_div2_i32
 // #define TCG_TARGET_HAS_rot_i32
-// #define TCG_TARGET_HAS_ext8s_i32
-// #define TCG_TARGET_HAS_ext16s_i32
-// #define TCG_TARGET_HAS_ext8u_i32
-// #define TCG_TARGET_HAS_ext16u_i32
+#define TCG_TARGET_HAS_ext8s_i32
+#define TCG_TARGET_HAS_ext16s_i32
+#define TCG_TARGET_HAS_ext8u_i32
+#define TCG_TARGET_HAS_ext16u_i32
 // #define TCG_TARGET_HAS_bswap16_i32
 // #define TCG_TARGET_HAS_bswap32_i32
 // #define TCG_TARGET_HAS_not_i32
@@ -66,12 +66,12 @@ typedef enum TCGReg {
 
 #define TCG_TARGET_HAS_div2_i64
 // #define TCG_TARGET_HAS_rot_i64
-// #define TCG_TARGET_HAS_ext8s_i64
-// #define TCG_TARGET_HAS_ext16s_i64
-// #define TCG_TARGET_HAS_ext32s_i64
-// #define TCG_TARGET_HAS_ext8u_i64
-// #define TCG_TARGET_HAS_ext16u_i64
-// #define TCG_TARGET_HAS_ext32u_i64
+#define TCG_TARGET_HAS_ext8s_i64
+#define TCG_TARGET_HAS_ext16s_i64
+#define TCG_TARGET_HAS_ext32s_i64
+#define TCG_TARGET_HAS_ext8u_i64
+#define TCG_TARGET_HAS_ext16u_i64
+#define TCG_TARGET_HAS_ext32u_i64
 // #define TCG_TARGET_HAS_bswap16_i64
 // #define TCG_TARGET_HAS_bswap32_i64
 // #define TCG_TARGET_HAS_bswap64_i64
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 18/35] tcg-s390: Implement bswap operations.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (16 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 17/35] tcg-s390: Implement sign and zero-extension operations Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-12 12:32   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 19/35] tcg-s390: Implement rotates Richard Henderson
                   ` (17 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   24 ++++++++++++++++++++++++
 tcg/s390/tcg-target.h |   10 +++++-----
 2 files changed, 29 insertions(+), 5 deletions(-)
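
The bswap16 cases below rely on TCG's guarantee that the upper bits of the
input are already zero, which is why a single LRVR plus a 16-bit shift is
enough.  A worked analogue (editorial sketch, not part of the patch):

    #include <stdint.h>

    /* 0x0000abcd -> LRVR -> 0xcdab0000 -> SRL 16 -> 0x0000cdab  */
    static uint32_t bswap16_via_lrvr(uint32_t x)   /* bits 16-31 of x are 0 */
    {
        uint32_t full = __builtin_bswap32(x);      /* what LRVR computes */
        return full >> 16;                         /* the trailing SRL 16 */
    }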

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 42e3224..3a98ca3 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -86,6 +86,8 @@ typedef enum S390Opcode {
     RRE_LLGCR   = 0xb984,
     RRE_LLGFR   = 0xb916,
     RRE_LLGHR   = 0xb985,
+    RRE_LRVR    = 0xb91f,
+    RRE_LRVGR   = 0xb90f,
     RRE_MSGR    = 0xb90c,
     RRE_MSR     = 0xb252,
     RRE_NGR     = 0xb980,
@@ -1231,6 +1233,21 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tgen_ext32u(s, args[0], args[1]);
         break;
 
+    case INDEX_op_bswap16_i32:
+    case INDEX_op_bswap16_i64:
+        /* The TCG bswap definition requires bits 0-47 already be zero.
+           Thus we don't need the G-type insns to implement bswap16_i64.  */
+        tcg_out_insn(s, RRE, LRVR, args[0], args[1]);
+        tcg_out_sh32(s, RS_SRL, args[0], TCG_REG_NONE, 16);
+        break;
+    case INDEX_op_bswap32_i32:
+    case INDEX_op_bswap32_i64:
+        tcg_out_insn(s, RRE, LRVR, args[0], args[1]);
+        break;
+    case INDEX_op_bswap64_i64:
+        tcg_out_insn(s, RRE, LRVGR, args[0], args[1]);
+        break;
+
     case INDEX_op_br:
         tgen_branch(s, S390_CC_ALWAYS, args[0]);
         break;
@@ -1353,6 +1370,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_ext16s_i32, { "r", "r" } },
     { INDEX_op_ext16u_i32, { "r", "r" } },
 
+    { INDEX_op_bswap16_i32, { "r", "r" } },
+    { INDEX_op_bswap32_i32, { "r", "r" } },
+
     { INDEX_op_brcond_i32, { "r", "r" } },
     { INDEX_op_setcond_i32, { "r", "r", "r" } },
 
@@ -1410,6 +1430,10 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_ext32s_i64, { "r", "r" } },
     { INDEX_op_ext32u_i64, { "r", "r" } },
 
+    { INDEX_op_bswap16_i64, { "r", "r" } },
+    { INDEX_op_bswap32_i64, { "r", "r" } },
+    { INDEX_op_bswap64_i64, { "r", "r" } },
+
     { INDEX_op_brcond_i64, { "r", "r" } },
     { INDEX_op_setcond_i64, { "r", "r", "r" } },
 #endif
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 570c832..dcb9bc3 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -54,8 +54,8 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_ext16s_i32
 #define TCG_TARGET_HAS_ext8u_i32
 #define TCG_TARGET_HAS_ext16u_i32
-// #define TCG_TARGET_HAS_bswap16_i32
-// #define TCG_TARGET_HAS_bswap32_i32
+#define TCG_TARGET_HAS_bswap16_i32
+#define TCG_TARGET_HAS_bswap32_i32
 // #define TCG_TARGET_HAS_not_i32
 #define TCG_TARGET_HAS_neg_i32
 // #define TCG_TARGET_HAS_andc_i32
@@ -72,9 +72,9 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_ext8u_i64
 #define TCG_TARGET_HAS_ext16u_i64
 #define TCG_TARGET_HAS_ext32u_i64
-// #define TCG_TARGET_HAS_bswap16_i64
-// #define TCG_TARGET_HAS_bswap32_i64
-// #define TCG_TARGET_HAS_bswap64_i64
+#define TCG_TARGET_HAS_bswap16_i64
+#define TCG_TARGET_HAS_bswap32_i64
+#define TCG_TARGET_HAS_bswap64_i64
 // #define TCG_TARGET_HAS_not_i64
 #define TCG_TARGET_HAS_neg_i64
 // #define TCG_TARGET_HAS_andc_i64
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 19/35] tcg-s390: Implement rotates.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (17 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 18/35] tcg-s390: Implement bswap operations Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-12 12:33   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 20/35] tcg-s390: Use LOAD COMPLIMENT for negate Richard Henderson
                   ` (16 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 tcg/s390/tcg-target.h |    4 ++--
 2 files changed, 48 insertions(+), 2 deletions(-)
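
s390 only provides rotate-left instructions (RLL/RLLG), so the diff below
synthesizes rotate-right from them.  The identity it relies on, as a C
sketch (not part of the patch):

    #include <stdint.h>

    static uint32_t rotl32(uint32_t x, unsigned n)
    {
        n &= 31;
        return n ? (x << n) | (x >> (32 - n)) : x;
    }

    /* rotr by n equals rotl by (32 - n) & 31; for a variable count the
       patch negates the count register instead, since the hardware only
       examines the low bits of the rotate amount.  */
    static uint32_t rotr32(uint32_t x, unsigned n)
    {
        return rotl32(x, (32 - n) & 31);
    }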

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 3a98ca3..f53038b 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -108,6 +108,8 @@ typedef enum S390Opcode {
     RR_SR       = 0x1b,
     RR_XR       = 0x17,
 
+    RSY_RLL     = 0xeb1d,
+    RSY_RLLG    = 0xeb1c,
     RSY_SLLG    = 0xeb0d,
     RSY_SRAG    = 0xeb0a,
     RSY_SRLG    = 0xeb0c,
@@ -1201,6 +1203,44 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         op = RSY_SRAG;
         goto do_shift64;
 
+    case INDEX_op_rotl_i32:
+        /* ??? Using tcg_out_sh64 here for the format; it is a 32-bit rol.  */
+        if (const_args[2]) {
+            tcg_out_sh64(s, RSY_RLL, args[0], args[1], TCG_REG_NONE, args[2]);
+        } else {
+            tcg_out_sh64(s, RSY_RLL, args[0], args[1], args[2], 0);
+        }
+        break;
+    case INDEX_op_rotr_i32:
+        if (const_args[2]) {
+            tcg_out_sh64(s, RSY_RLL, args[0], args[1],
+                         TCG_REG_NONE, (32 - args[2]) & 31);
+        } else {
+            tcg_out_insn(s, RR, LCR, TCG_TMP0, args[2]);
+            tcg_out_sh64(s, RSY_RLL, args[0], args[1], TCG_TMP0, 0);
+        }
+        break;
+
+    case INDEX_op_rotl_i64:
+        if (const_args[2]) {
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
+                         TCG_REG_NONE, args[2]);
+        } else {
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], args[2], 0);
+        }
+        break;
+    case INDEX_op_rotr_i64:
+        if (const_args[2]) {
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
+                         TCG_REG_NONE, (64 - args[2]) & 63);
+        } else {
+            /* We can use the smaller 32-bit negate because only the
+               low 6 bits are examined for the rotate.  */
+            tcg_out_insn(s, RR, LCR, TCG_TMP0, args[2]);
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], TCG_TMP0, 0);
+        }
+        break;
+
     case INDEX_op_ext8s_i32:
         tgen_ext8s(s, TCG_TYPE_I32, args[0], args[1]);
         break;
@@ -1365,6 +1405,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_shr_i32, { "r", "0", "Ri" } },
     { INDEX_op_sar_i32, { "r", "0", "Ri" } },
 
+    { INDEX_op_rotl_i32, { "r", "r", "Ri" } },
+    { INDEX_op_rotr_i32, { "r", "r", "Ri" } },
+
     { INDEX_op_ext8s_i32, { "r", "r" } },
     { INDEX_op_ext8u_i32, { "r", "r" } },
     { INDEX_op_ext16s_i32, { "r", "r" } },
@@ -1423,6 +1466,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_shr_i64, { "r", "r", "Ri" } },
     { INDEX_op_sar_i64, { "r", "r", "Ri" } },
 
+    { INDEX_op_rotl_i64, { "r", "r", "Ri" } },
+    { INDEX_op_rotr_i64, { "r", "r", "Ri" } },
+
     { INDEX_op_ext8s_i64, { "r", "r" } },
     { INDEX_op_ext8u_i64, { "r", "r" } },
     { INDEX_op_ext16s_i64, { "r", "r" } },
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index dcb9bc3..9135c7a 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -49,7 +49,7 @@ typedef enum TCGReg {
 
 /* optional instructions */
 #define TCG_TARGET_HAS_div2_i32
-// #define TCG_TARGET_HAS_rot_i32
+#define TCG_TARGET_HAS_rot_i32
 #define TCG_TARGET_HAS_ext8s_i32
 #define TCG_TARGET_HAS_ext16s_i32
 #define TCG_TARGET_HAS_ext8u_i32
@@ -65,7 +65,7 @@ typedef enum TCGReg {
 // #define TCG_TARGET_HAS_nor_i32
 
 #define TCG_TARGET_HAS_div2_i64
-// #define TCG_TARGET_HAS_rot_i64
+#define TCG_TARGET_HAS_rot_i64
 #define TCG_TARGET_HAS_ext8s_i64
 #define TCG_TARGET_HAS_ext16s_i64
 #define TCG_TARGET_HAS_ext32s_i64
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 20/35] tcg-s390: Use LOAD COMPLIMENT for negate.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (18 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 19/35] tcg-s390: Implement rotates Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-12 12:33   ` Aurelien Jarno
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 21/35] tcg-s390: Use the ADD IMMEDIATE instructions Richard Henderson
                   ` (15 subsequent siblings)
  35 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   10 ++--------
 1 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index f53038b..826a2c8 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -1134,16 +1134,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_neg_i32:
-        /* FIXME: optimize args[0] != args[1] case */
-        tcg_out_insn(s, RR, LR, 13, args[1]);
-        tcg_out_movi(s, TCG_TYPE_I32, args[0], 0);
-        tcg_out_insn(s, RR, SR, args[0], 13);
+        tcg_out_insn(s, RR, LCR, args[0], args[1]);
         break;
     case INDEX_op_neg_i64:
-        /* FIXME: optimize args[0] != args[1] case */
-        tcg_out_mov(s, TCG_TMP0, args[1]);
-        tcg_out_movi(s, TCG_TYPE_I64, args[0], 0);
-        tcg_out_insn(s, RRE, SGR, args[0], TCG_TMP0);
+        tcg_out_insn(s, RRE, LCGR, args[0], args[1]);
         break;
 
     case INDEX_op_mul_i32:
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 21/35] tcg-s390: Use the ADD IMMEDIATE instructions.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (19 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 20/35] tcg-s390: Use LOAD COMPLIMENT for negate Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 22/35] tcg-s390: Use the AND " Richard Henderson
                   ` (14 subsequent siblings)
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The ADD IMMEDIATE instructions are in the extended-immediate facility.
Using them gives us a 32-bit immediate addend.
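
A sketch of the resulting immediate selection for the 64-bit case (helper
name hypothetical; the ranges are those tested by tgen64_addi and the
constraint code in the diff below):

    #include <stdint.h>

    static const char *pick_add_imm_insn(int64_t val)
    {
        if (val == (int16_t)val) {
            return "AGHI";    /* 16-bit signed, always available */
        } else if (val == (int32_t)val) {
            return "AGFI";    /* 32-bit signed, extended-immediate facility */
        } else if (val == (uint32_t)val) {
            return "ALGFI";   /* 32-bit unsigned, extended-immediate facility */
        }
        return "not usable as an immediate addend";
    }

Subtraction reuses the same path by negating the constant, which is what the
new 'N' ("force immediate negate") constraint modifier expresses.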

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   96 ++++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 80 insertions(+), 16 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 826a2c8..795ddcd 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -33,8 +33,9 @@
     do { } while (0)
 #endif
 
-#define TCG_CT_CONST_S16                0x100
-#define TCG_CT_CONST_U12                0x200
+#define TCG_CT_CONST_32    0x100
+#define TCG_CT_CONST_NEG   0x200
+#define TCG_CT_CONST_ADDI  0x400
 
 /* Several places within the instruction set 0 means "no register"
    rather than TCG_REG_R0.  */
@@ -49,6 +50,9 @@
    halves of the 16-bit quantity may appear 32 bits apart in the insn.
    This makes it easy to copy the values from the tables in Appendix B.  */
 typedef enum S390Opcode {
+    RIL_AFI     = 0xc209,
+    RIL_AGFI    = 0xc208,
+    RIL_ALGFI   = 0xc20a,
     RIL_BRASL   = 0xc005,
     RIL_BRCL    = 0xc004,
     RIL_LARL    = 0xc000,
@@ -303,9 +307,17 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         tcg_regset_clear(ct->u.regs);
         tcg_regset_set_reg(ct->u.regs, TCG_REG_R3);
         break;
+    case 'N':                  /* force immediate negate */
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_NEG;
+        break;
+    case 'W':                  /* force 32-bit ("word") immediate */
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_32;
+        break;
     case 'I':
         ct->ct &= ~TCG_CT_REG;
-        ct->ct |= TCG_CT_CONST_S16;
+        ct->ct |= TCG_CT_CONST_ADDI;
         break;
     default:
         break;
@@ -322,12 +334,31 @@ static inline int tcg_target_const_match(tcg_target_long val,
 {
     int ct = arg_ct->ct;
 
-    if ((ct & TCG_CT_CONST) ||
-       ((ct & TCG_CT_CONST_S16) && val == (int16_t)val) ||
-       ((ct & TCG_CT_CONST_U12) && val == (val & 0xfff))) {
+    if (ct & TCG_CT_CONST) {
         return 1;
     }
 
+    /* Handle the modifiers.  */
+    if (ct & TCG_CT_CONST_NEG) {
+        val = -val;
+    }
+    if (ct & TCG_CT_CONST_32) {
+        val = (int32_t)val;
+    }
+
+    /* The following are mutually exclusive.  */
+    if (ct & TCG_CT_CONST_ADDI) {
+        /* Immediates that may be used with add.  If we have the
+           extended-immediates facility then we have ADD IMMEDIATE
+           with signed and unsigned 32-bit, otherwise we have only
+           ADD HALFWORD IMMEDIATE with a signed 16-bit.  */
+        if (facilities & FACILITY_EXT_IMM) {
+            return val == (int32_t)val || val == (uint32_t)val;
+        } else {
+            return val == (int16_t)val;
+        }
+    }
+
     return 0;
 }
 
@@ -649,6 +680,29 @@ static inline void tgen_ext32u(TCGContext *s, TCGReg dest, TCGReg src)
     tcg_out_insn(s, RRE, LLGFR, dest, src);
 }
 
+static void tgen32_addi(TCGContext *s, TCGReg dest, int32_t val)
+{
+    if (val == (int16_t)val) {
+        tcg_out_insn(s, RI, AHI, dest, val);
+    } else {
+        tcg_out_insn(s, RIL, AFI, dest, val);
+    }
+}
+
+static void tgen64_addi(TCGContext *s, TCGReg dest, int64_t val)
+{
+    if (val == (int16_t)val) {
+        tcg_out_insn(s, RI, AGHI, dest, val);
+    } else if (val == (int32_t)val) {
+        tcg_out_insn(s, RIL, AGFI, dest, val);
+    } else if (val == (uint32_t)val) {
+        tcg_out_insn(s, RIL, ALGFI, dest, val);
+    } else {
+        tcg_abort();
+    }
+
+}
+
 static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
 {
     if (c > TCG_COND_GT) {
@@ -1095,22 +1149,32 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_add_i32:
         if (const_args[2]) {
-            tcg_out_insn(s, RI, AHI, args[0], args[2]);
+            tgen32_addi(s, args[0], args[2]);
         } else {
             tcg_out_insn(s, RR, AR, args[0], args[2]);
         }
         break;
-
     case INDEX_op_add_i64:
-        tcg_out_insn(s, RRE, AGR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_addi(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, AGR, args[0], args[2]);
+        }
         break;
 
     case INDEX_op_sub_i32:
-        tcg_out_insn(s, RR, SR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen32_addi(s, args[0], -args[2]);
+        } else {
+            tcg_out_insn(s, RR, SR, args[0], args[2]);
+        }
         break;
-
     case INDEX_op_sub_i64:
-        tcg_out_insn(s, RRE, SGR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_addi(s, args[0], -args[2]);
+        } else {
+            tcg_out_insn(s, RRE, SGR, args[0], args[2]);
+        }
         break;
 
     case INDEX_op_and_i32:
@@ -1383,8 +1447,8 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_st16_i32, { "r", "r" } },
     { INDEX_op_st_i32, { "r", "r" } },
 
-    { INDEX_op_add_i32, { "r", "0", "rI" } },
-    { INDEX_op_sub_i32, { "r", "0", "r" } },
+    { INDEX_op_add_i32, { "r", "0", "rWI" } },
+    { INDEX_op_sub_i32, { "r", "0", "rWNI" } },
     { INDEX_op_mul_i32, { "r", "0", "r" } },
 
     { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
@@ -1444,8 +1508,8 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_st32_i64, { "r", "r" } },
     { INDEX_op_st_i64, { "r", "r" } },
 
-    { INDEX_op_add_i64, { "r", "0", "r" } },
-    { INDEX_op_sub_i64, { "r", "0", "r" } },
+    { INDEX_op_add_i64, { "r", "0", "rI" } },
+    { INDEX_op_sub_i64, { "r", "0", "rNI" } },
     { INDEX_op_mul_i64, { "r", "0", "r" } },
 
     { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 22/35] tcg-s390: Use the AND IMMEDIATE instructions.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (20 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 21/35] tcg-s390: Use the ADD IMMEDIATE instructions Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 23/35] tcg-s390: Use the OR " Richard Henderson
                   ` (13 subsequent siblings)
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  179 +++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 166 insertions(+), 13 deletions(-)
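
Worked examples (not from the patch) of constants that tgen64_andi below
handles with a single instruction:

    0xffffffffffff00ff  ->  NILL 0x00ff      (one 16-bit chunk differs from all-ones)
    0xffff0000ffffffff  ->  NIHL 0x0000      (same, for a higher chunk)
    0x12345678ffffffff  ->  NIHF 0x12345678  (one 32-bit half differs; extended-immediate only)
    0x00000000000000ff  ->  emitted as the zero-extension LLGCR when the
                            extended-immediate facility is present

Masks that fit entirely within one 16-bit chunk are rejected by the new 'A'
constraint instead, since loading them with LLI[LH][LH] and using NGR is
smaller than the equivalent immediate sequence.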

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 795ddcd..53a92c5 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -36,6 +36,7 @@
 #define TCG_CT_CONST_32    0x100
 #define TCG_CT_CONST_NEG   0x200
 #define TCG_CT_CONST_ADDI  0x400
+#define TCG_CT_CONST_ANDI  0x800
 
 /* Several places within the instruction set 0 means "no register"
    rather than TCG_REG_R0.  */
@@ -61,6 +62,8 @@ typedef enum S390Opcode {
     RIL_LGFI    = 0xc001,
     RIL_LLIHF   = 0xc00e,
     RIL_LLILF   = 0xc00f,
+    RIL_NIHF    = 0xc00a,
+    RIL_NILF    = 0xc00b,
 
     RI_AGHI     = 0xa70b,
     RI_AHI      = 0xa70a,
@@ -74,6 +77,10 @@ typedef enum S390Opcode {
     RI_LLIHL    = 0xa50d,
     RI_LLILH    = 0xa50e,
     RI_LLILL    = 0xa50f,
+    RI_NIHH     = 0xa504,
+    RI_NIHL     = 0xa505,
+    RI_NILH     = 0xa506,
+    RI_NILL     = 0xa507,
 
     RRE_AGR     = 0xb908,
     RRE_CGR     = 0xb920,
@@ -319,6 +326,10 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         ct->ct &= ~TCG_CT_REG;
         ct->ct |= TCG_CT_CONST_ADDI;
         break;
+    case 'A':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_ANDI;
+        break;
     default:
         break;
     }
@@ -328,9 +339,66 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
     return 0;
 }
 
+/* Immediates to be used with logical AND.  This is an optimization only,
+   since a full 64-bit immediate AND can always be performed with 4 sequential
+   NI[LH][LH] instructions.  What we're looking for is immediates that we
+   can load efficiently, and the immediate load plus the reg-reg AND is
+   smaller than the sequential NI's.  */
+
+static int tcg_match_andi(int ct, tcg_target_ulong val)
+{
+    int i;
+
+    if (facilities & FACILITY_EXT_IMM) {
+        if (ct & TCG_CT_CONST_32) {
+            /* All 32-bit ANDs can be performed with 1 48-bit insn.  */
+            return 1;
+        }
+
+        /* Zero-extensions.  */
+        if (val == 0xff || val == 0xffff || val == 0xffffffff) {
+            return 1;
+        }
+    } else {
+        if (ct & TCG_CT_CONST_32) {
+            val = (uint32_t)val;
+        } else if (val == 0xffffffff) {
+            return 1;
+        }
+    }
+
+    /* Try all 32-bit insns that can perform it in one go.  */
+    for (i = 0; i < 4; i++) {
+        tcg_target_ulong mask = ~(0xffffull << i*16);
+        if ((val & mask) == mask) {
+            return 1;
+        }
+    }
+
+    /* Look for 16-bit values performing the mask.  These are better
+       to load with LLI[LH][LH].  */
+    for (i = 0; i < 4; i++) {
+        tcg_target_ulong mask = 0xffffull << i*16;
+        if ((val & mask) == val) {
+            return 0;
+        }
+    }
+
+    /* Look for 32-bit values performing the 64-bit mask.  These
+       are better to load with LLI[LH]F, or if extended immediates
+       not available, with a pair of LLI insns.  */
+    if ((ct & TCG_CT_CONST_32) == 0) {
+        if (val <= 0xffffffff || (val & 0xffffffff) == 0) {
+            return 0;
+        }
+    }
+
+    return 1;
+}
+
 /* Test if a constant matches the constraint. */
-static inline int tcg_target_const_match(tcg_target_long val,
-                                         const TCGArgConstraint *arg_ct)
+static int tcg_target_const_match(tcg_target_long val,
+                                  const TCGArgConstraint *arg_ct)
 {
     int ct = arg_ct->ct;
 
@@ -357,6 +425,8 @@ static inline int tcg_target_const_match(tcg_target_long val,
         } else {
             return val == (int16_t)val;
         }
+    } else if (ct & TCG_CT_CONST_ANDI) {
+        return tcg_match_andi(ct, val);
     }
 
     return 0;
@@ -703,6 +773,74 @@ static void tgen64_addi(TCGContext *s, TCGReg dest, int64_t val)
 
 }
 
+static void tgen64_andi(TCGContext *s, TCGReg dest, tcg_target_ulong val)
+{
+    static const S390Opcode ni_insns[4] = {
+        RI_NILL, RI_NILH, RI_NIHL, RI_NIHH
+    };
+    static const S390Opcode nif_insns[2] = {
+        RIL_NILF, RIL_NIHF
+    };
+
+    int i;
+
+    /* Look for no-op.  */
+    if (val == -1) {
+        return;
+    }
+
+    /* Look for the zero-extensions.  */
+    if (val == 0xffffffff) {
+        tgen_ext32u(s, dest, dest);
+        return;
+    }
+
+    if (facilities & FACILITY_EXT_IMM) {
+        if (val == 0xff) {
+            tgen_ext8u(s, TCG_TYPE_I64, dest, dest);
+            return;
+        }
+        if (val == 0xffff) {
+            tgen_ext16u(s, TCG_TYPE_I64, dest, dest);
+            return;
+        }
+
+        /* Try all 32-bit insns that can perform it in one go.  */
+        for (i = 0; i < 4; i++) {
+            tcg_target_ulong mask = ~(0xffffull << i*16);
+            if ((val & mask) == mask) {
+                tcg_out_insn_RI(s, ni_insns[i], dest, val >> i*16);
+                return;
+            }
+        }
+
+        /* Try all 48-bit insns that can perform it in one go.  */
+        if (facilities & FACILITY_EXT_IMM) {
+            for (i = 0; i < 2; i++) {
+                tcg_target_ulong mask = ~(0xffffffffull << i*32);
+                if ((val & mask) == mask) {
+                    tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
+                    return;
+                }
+            }
+        }
+
+        /* Perform the AND via sequential modifications to the high and low
+           parts.  Do this via recursion to handle 16-bit vs 32-bit masks in
+           each half.  */
+        tgen64_andi(s, dest, val | 0xffffffff00000000ull);
+        tgen64_andi(s, dest, val | 0x00000000ffffffffull);
+    } else {
+        /* With no extended-immediate facility, just emit the sequence.  */
+        for (i = 0; i < 4; i++) {
+            tcg_target_ulong mask = 0xffffull << i*16;
+            if ((val & mask) != mask) {
+                tcg_out_insn_RI(s, ni_insns[i], dest, val >> i*16);
+            }
+        }
+    }
+}
+
 static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
 {
     if (c > TCG_COND_GT) {
@@ -776,6 +914,16 @@ static void tgen_calli(TCGContext *s, tcg_target_long dest)
 }
 
 #if defined(CONFIG_SOFTMMU)
+static void tgen64_andi_tmp(TCGContext *s, TCGReg dest, tcg_target_ulong val)
+{
+    if (tcg_match_andi(0, val)) {
+        tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, val);
+        tcg_out_insn(s, RRE, NGR, dest, TCG_TMP0);
+    } else {
+        tgen64_andi(s, dest, val);
+    }
+}
+
 static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
                                   int mem_index, int opc,
                                   uint16_t **label2_ptr_p, int is_store)
@@ -803,13 +951,8 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, TCG_REG_NONE,
                  TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
 
-    tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
-                 TARGET_PAGE_MASK | ((1 << s_bits) - 1));
-    tcg_out_insn(s, RRE, NGR, arg0, TCG_TMP0);
-
-    tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
-                 (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
-    tcg_out_insn(s, RRE, NGR, arg1, TCG_TMP0);
+    tgen64_andi_tmp(s, arg0, TARGET_PAGE_MASK | ((1 << s_bits) - 1));
+    tgen64_andi_tmp(s, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
 
     if (is_store) {
         tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
@@ -1178,7 +1321,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_and_i32:
-        tcg_out_insn(s, RR, NR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_andi(s, args[0], args[2] | 0xffffffff00000000ull);
+        } else {
+            tcg_out_insn(s, RR, NR, args[0], args[2]);
+        }
         break;
     case INDEX_op_or_i32:
         tcg_out_insn(s, RR, OR, args[0], args[2]);
@@ -1188,7 +1335,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_and_i64:
-        tcg_out_insn(s, RRE, NGR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_andi(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, NGR, args[0], args[2]);
+        }
         break;
     case INDEX_op_or_i64:
         tcg_out_insn(s, RRE, OGR, args[0], args[2]);
@@ -1454,9 +1605,10 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
 
-    { INDEX_op_and_i32, { "r", "0", "r" } },
+    { INDEX_op_and_i32, { "r", "0", "rWA" } },
     { INDEX_op_or_i32, { "r", "0", "r" } },
     { INDEX_op_xor_i32, { "r", "0", "r" } },
+
     { INDEX_op_neg_i32, { "r", "r" } },
 
     { INDEX_op_shl_i32, { "r", "0", "Ri" } },
@@ -1515,9 +1667,10 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
 
-    { INDEX_op_and_i64, { "r", "0", "r" } },
+    { INDEX_op_and_i64, { "r", "0", "rA" } },
     { INDEX_op_or_i64, { "r", "0", "r" } },
     { INDEX_op_xor_i64, { "r", "0", "r" } },
+
     { INDEX_op_neg_i64, { "r", "r" } },
 
     { INDEX_op_shl_i64, { "r", "r", "Ri" } },
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 23/35] tcg-s390: Use the OR IMMEDIATE instructions.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (21 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 22/35] tcg-s390: Use the AND " Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 24/35] tcg-s390: Use the XOR " Richard Henderson
                   ` (12 subsequent siblings)
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  119 +++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 111 insertions(+), 8 deletions(-)
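
The new tcg_match_ori below rejects small negative constants because they
are cheaper to materialize and OR register-to-register.  For example (not
from the patch): val = 0xffffffffffff8000 (-32768) fails the constraint, so
it is loaded with a single LGHI and combined with OGR, instead of an
OILL/OILH/OIHL/OIHH (or OILF/OIHF) sequence touching every non-zero chunk.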

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 53a92c5..a17ef91 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -33,10 +33,11 @@
     do { } while (0)
 #endif
 
-#define TCG_CT_CONST_32    0x100
-#define TCG_CT_CONST_NEG   0x200
-#define TCG_CT_CONST_ADDI  0x400
-#define TCG_CT_CONST_ANDI  0x800
+#define TCG_CT_CONST_32    0x0100
+#define TCG_CT_CONST_NEG   0x0200
+#define TCG_CT_CONST_ADDI  0x0400
+#define TCG_CT_CONST_ANDI  0x1000
+#define TCG_CT_CONST_ORI   0x2000
 
 /* Several places within the instruction set 0 means "no register"
    rather than TCG_REG_R0.  */
@@ -64,6 +65,8 @@ typedef enum S390Opcode {
     RIL_LLILF   = 0xc00f,
     RIL_NIHF    = 0xc00a,
     RIL_NILF    = 0xc00b,
+    RIL_OIHF    = 0xc00c,
+    RIL_OILF    = 0xc00d,
 
     RI_AGHI     = 0xa70b,
     RI_AHI      = 0xa70a,
@@ -81,6 +84,10 @@ typedef enum S390Opcode {
     RI_NIHL     = 0xa505,
     RI_NILH     = 0xa506,
     RI_NILL     = 0xa507,
+    RI_OIHH     = 0xa508,
+    RI_OIHL     = 0xa509,
+    RI_OILH     = 0xa50a,
+    RI_OILL     = 0xa50b,
 
     RRE_AGR     = 0xb908,
     RRE_CGR     = 0xb920,
@@ -330,6 +337,10 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         ct->ct &= ~TCG_CT_REG;
         ct->ct |= TCG_CT_CONST_ANDI;
         break;
+    case 'O':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_ORI;
+        break;
     default:
         break;
     }
@@ -396,6 +407,36 @@ static int tcg_match_andi(int ct, tcg_target_ulong val)
     return 1;
 }
 
+/* Immediates to be used with logical OR.  This is an optimization only,
+   since a full 64-bit immediate OR can always be performed with 4 sequential
+   OI[LH][LH] instructions.  What we're looking for is immediates that we
+   can load efficiently, and the immediate load plus the reg-reg OR is
+   smaller than the sequential OI's.  */
+
+static int tcg_match_ori(int ct, tcg_target_long val)
+{
+    if (facilities & FACILITY_EXT_IMM) {
+        if (ct & TCG_CT_CONST_32) {
+            /* All 32-bit ORs can be performed with 1 48-bit insn.  */
+            return 1;
+        }
+    }
+
+    /* Look for negative values.  These are best to load with LGHI.  */
+    if (val < 0) {
+        if (val == (int16_t)val) {
+            return 0;
+        }
+        if (facilities & FACILITY_EXT_IMM) {
+            if (val == (int32_t)val) {
+                return 0;
+            }
+        }
+    }
+
+    return 1;
+}
+
 /* Test if a constant matches the constraint. */
 static int tcg_target_const_match(tcg_target_long val,
                                   const TCGArgConstraint *arg_ct)
@@ -427,6 +468,8 @@ static int tcg_target_const_match(tcg_target_long val,
         }
     } else if (ct & TCG_CT_CONST_ANDI) {
         return tcg_match_andi(ct, val);
+    } else if (ct & TCG_CT_CONST_ORI) {
+        return tcg_match_ori(ct, val);
     }
 
     return 0;
@@ -841,6 +884,58 @@ static void tgen64_andi(TCGContext *s, TCGReg dest, tcg_target_ulong val)
     }
 }
 
+static void tgen64_ori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
+{
+    static const S390Opcode oi_insns[4] = {
+        RI_OILL, RI_OILH, RI_OIHL, RI_OIHH
+    };
+    static const S390Opcode nif_insns[2] = {
+        RIL_OILF, RIL_OIHF
+    };
+
+    int i;
+
+    /* Look for no-op.  */
+    if (val == 0) {
+        return;
+    }
+
+    if (facilities & FACILITY_EXT_IMM) {
+        /* Try all 32-bit insns that can perform it in one go.  */
+        for (i = 0; i < 4; i++) {
+            tcg_target_ulong mask = (0xffffull << i*16);
+            if ((val & mask) != 0 && (val & ~mask) == 0) {
+                tcg_out_insn_RI(s, oi_insns[i], dest, val >> i*16);
+                return;
+            }
+        }
+
+        /* Try all 48-bit insns that can perform it in one go.  */
+        for (i = 0; i < 2; i++) {
+            tcg_target_ulong mask = (0xffffffffull << i*32);
+            if ((val & mask) != 0 && (val & ~mask) == 0) {
+                tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
+                return;
+            }
+        }
+
+        /* Perform the OR via sequential modifications to the high and
+           low parts.  Do this via recursion to handle 16-bit vs 32-bit
+           masks in each half.  */
+        tgen64_ori(s, dest, val & 0x00000000ffffffffull);
+        tgen64_ori(s, dest, val & 0xffffffff00000000ull);
+    } else {
+        /* With no extended-immediate facility, we don't need to be so
+           clever.  Just iterate over the insns and mask in the constant.  */
+        for (i = 0; i < 4; i++) {
+            tcg_target_ulong mask = (0xffffull << i*16);
+            if ((val & mask) != 0) {
+                tcg_out_insn_RI(s, oi_insns[i], dest, val >> i*16);
+            }
+        }
+    }
+}
+
 static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
 {
     if (c > TCG_COND_GT) {
@@ -1328,7 +1423,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
     case INDEX_op_or_i32:
-        tcg_out_insn(s, RR, OR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_ori(s, args[0], args[2] & 0xffffffff);
+        } else {
+            tcg_out_insn(s, RR, OR, args[0], args[2]);
+        }
         break;
     case INDEX_op_xor_i32:
         tcg_out_insn(s, RR, XR, args[0], args[2]);
@@ -1342,7 +1441,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
     case INDEX_op_or_i64:
-        tcg_out_insn(s, RRE, OGR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_ori(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, OGR, args[0], args[2]);
+        }
         break;
     case INDEX_op_xor_i64:
         tcg_out_insn(s, RRE, XGR, args[0], args[2]);
@@ -1606,7 +1709,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
 
     { INDEX_op_and_i32, { "r", "0", "rWA" } },
-    { INDEX_op_or_i32, { "r", "0", "r" } },
+    { INDEX_op_or_i32, { "r", "0", "rWO" } },
     { INDEX_op_xor_i32, { "r", "0", "r" } },
 
     { INDEX_op_neg_i32, { "r", "r" } },
@@ -1668,7 +1771,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
 
     { INDEX_op_and_i64, { "r", "0", "rA" } },
-    { INDEX_op_or_i64, { "r", "0", "r" } },
+    { INDEX_op_or_i64, { "r", "0", "rO" } },
     { INDEX_op_xor_i64, { "r", "0", "r" } },
 
     { INDEX_op_neg_i64, { "r", "r" } },
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 24/35] tcg-s390: Use the XOR IMMEDIATE instructions.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (22 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 23/35] tcg-s390: Use the OR " Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 25/35] tcg-s390: Use the MULTIPLY " Richard Henderson
                   ` (11 subsequent siblings)
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   60 +++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 56 insertions(+), 4 deletions(-)
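
tgen64_xori below simply splits the constant into 32-bit halves; for example
(not from the patch):

    0x00000001ffffffff  ->  XILF 0xffffffff then XIHF 0x00000001
    0x0000000000ff0000  ->  XILF 0x00ff0000 only, since the high half is zero

Unlike AND and OR there are no 16-bit XOR-immediate register forms, which is
why the whole immediate path is gated on the extended-immediate facility.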

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index a17ef91..5446591 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -38,6 +38,7 @@
 #define TCG_CT_CONST_ADDI  0x0400
 #define TCG_CT_CONST_ANDI  0x1000
 #define TCG_CT_CONST_ORI   0x2000
+#define TCG_CT_CONST_XORI  0x4000
 
 /* Several places within the instruction set 0 means "no register"
    rather than TCG_REG_R0.  */
@@ -67,6 +68,8 @@ typedef enum S390Opcode {
     RIL_NILF    = 0xc00b,
     RIL_OIHF    = 0xc00c,
     RIL_OILF    = 0xc00d,
+    RIL_XIHF    = 0xc006,
+    RIL_XILF    = 0xc007,
 
     RI_AGHI     = 0xa70b,
     RI_AHI      = 0xa70a,
@@ -341,6 +344,10 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         ct->ct &= ~TCG_CT_REG;
         ct->ct |= TCG_CT_CONST_ORI;
         break;
+    case 'X':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_XORI;
+        break;
     default:
         break;
     }
@@ -437,6 +444,30 @@ static int tcg_match_ori(int ct, tcg_target_long val)
     return 1;
 }
 
+/* Immediates to be used with logical XOR.  This is almost, but not quite,
+   only an optimization.  XOR with immediate is only supported with the
+   extended-immediate facility.  That said, there are a few patterns for
+   which it is better to load the value into a register first.  */
+
+static int tcg_match_xori(int ct, tcg_target_long val)
+{
+    if ((facilities & FACILITY_EXT_IMM) == 0) {
+        return 0;
+    }
+
+    if (ct & TCG_CT_CONST_32) {
+        /* All 32-bit XORs can be performed with 1 48-bit insn.  */
+        return 1;
+    }
+
+    /* Look for negative values.  These are best to load with LGHI.  */
+    if (val < 0 && val == (int32_t)val) {
+        return 0;
+    }
+
+    return 1;
+}
+
 /* Test if a constant matches the constraint. */
 static int tcg_target_const_match(tcg_target_long val,
                                   const TCGArgConstraint *arg_ct)
@@ -470,6 +501,8 @@ static int tcg_target_const_match(tcg_target_long val,
         return tcg_match_andi(ct, val);
     } else if (ct & TCG_CT_CONST_ORI) {
         return tcg_match_ori(ct, val);
+    } else if (ct & TCG_CT_CONST_XORI) {
+        return tcg_match_xori(ct, val);
     }
 
     return 0;
@@ -936,6 +969,17 @@ static void tgen64_ori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
     }
 }
 
+static void tgen64_xori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
+{
+    /* Perform the xor by parts.  */
+    if (val & 0xffffffff) {
+        tcg_out_insn(s, RIL, XILF, dest, val);
+    }
+    if (val > 0xffffffff) {
+        tcg_out_insn(s, RIL, XIHF, dest, val >> 32);
+    }
+}
+
 static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
 {
     if (c > TCG_COND_GT) {
@@ -1430,7 +1474,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
     case INDEX_op_xor_i32:
-        tcg_out_insn(s, RR, XR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_xori(s, args[0], args[2] & 0xffffffff);
+        } else {
+            tcg_out_insn(s, RR, XR, args[0], args[2]);
+        }
         break;
 
     case INDEX_op_and_i64:
@@ -1448,7 +1496,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
     case INDEX_op_xor_i64:
-        tcg_out_insn(s, RRE, XGR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_xori(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, XGR, args[0], args[2]);
+        }
         break;
 
     case INDEX_op_neg_i32:
@@ -1710,7 +1762,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_and_i32, { "r", "0", "rWA" } },
     { INDEX_op_or_i32, { "r", "0", "rWO" } },
-    { INDEX_op_xor_i32, { "r", "0", "r" } },
+    { INDEX_op_xor_i32, { "r", "0", "rWX" } },
 
     { INDEX_op_neg_i32, { "r", "r" } },
 
@@ -1772,7 +1824,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_and_i64, { "r", "0", "rA" } },
     { INDEX_op_or_i64, { "r", "0", "rO" } },
-    { INDEX_op_xor_i64, { "r", "0", "r" } },
+    { INDEX_op_xor_i64, { "r", "0", "rX" } },
 
     { INDEX_op_neg_i64, { "r", "r" } },
 
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 25/35] tcg-s390: Use the MULTIPLY IMMEDIATE instructions.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (23 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 24/35] tcg-s390: Use the XOR " Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 26/35] tcg-s390: Tidy goto_tb Richard Henderson
                   ` (10 subsequent siblings)
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   43 +++++++++++++++++++++++++++++++++++++++----
 1 files changed, 39 insertions(+), 4 deletions(-)
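
As with the add path, the multiply path picks the narrowest immediate that
fits; for the 64-bit multiply in the diff below (worked examples, not from
the patch):

    multiply by 10       ->  MGHI 10       (fits a signed 16-bit immediate)
    multiply by 100000   ->  MSGFI 100000  (needs general-instruction-extensions)
    multiply by 1 << 40  ->  rejected by the new 'K' constraint, so the
                             constant is loaded into a register and MSGR used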

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 5446591..f1e00e9 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -36,6 +36,7 @@
 #define TCG_CT_CONST_32    0x0100
 #define TCG_CT_CONST_NEG   0x0200
 #define TCG_CT_CONST_ADDI  0x0400
+#define TCG_CT_CONST_MULI  0x8000
 #define TCG_CT_CONST_ANDI  0x1000
 #define TCG_CT_CONST_ORI   0x2000
 #define TCG_CT_CONST_XORI  0x4000
@@ -64,6 +65,8 @@ typedef enum S390Opcode {
     RIL_LGFI    = 0xc001,
     RIL_LLIHF   = 0xc00e,
     RIL_LLILF   = 0xc00f,
+    RIL_MSFI    = 0xc201,
+    RIL_MSGFI   = 0xc200,
     RIL_NIHF    = 0xc00a,
     RIL_NILF    = 0xc00b,
     RIL_OIHF    = 0xc00c,
@@ -83,6 +86,8 @@ typedef enum S390Opcode {
     RI_LLIHL    = 0xa50d,
     RI_LLILH    = 0xa50e,
     RI_LLILL    = 0xa50f,
+    RI_MGHI     = 0xa70d,
+    RI_MHI      = 0xa70c,
     RI_NIHH     = 0xa504,
     RI_NIHL     = 0xa505,
     RI_NILH     = 0xa506,
@@ -336,6 +341,10 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         ct->ct &= ~TCG_CT_REG;
         ct->ct |= TCG_CT_CONST_ADDI;
         break;
+    case 'K':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_MULI;
+        break;
     case 'A':
         ct->ct &= ~TCG_CT_REG;
         ct->ct |= TCG_CT_CONST_ANDI;
@@ -497,6 +506,16 @@ static int tcg_target_const_match(tcg_target_long val,
         } else {
             return val == (int16_t)val;
         }
+    } else if (ct & TCG_CT_CONST_MULI) {
+        /* Immediates that may be used with multiply.  If we have the
+           general-instruction-extensions, then we have MULTIPLY SINGLE
+           IMMEDIATE with a signed 32-bit, otherwise we have only
+           MULTIPLY HALFWORD IMMEDIATE, with a signed 16-bit.  */
+        if (facilities & FACILITY_GEN_INST_EXT) {
+            return val == (int32_t)val;
+        } else {
+            return val == (int16_t)val;
+        }
     } else if (ct & TCG_CT_CONST_ANDI) {
         return tcg_match_andi(ct, val);
     } else if (ct & TCG_CT_CONST_ORI) {
@@ -1511,10 +1530,26 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_mul_i32:
-        tcg_out_insn(s, RRE, MSR, args[0], args[2]);
+        if (const_args[2]) {
+            if ((int32_t)args[2] == (int16_t)args[2]) {
+                tcg_out_insn(s, RI, MHI, args[0], args[2]);
+            } else {
+                tcg_out_insn(s, RIL, MSFI, args[0], args[2]);
+            }
+        } else {
+            tcg_out_insn(s, RRE, MSR, args[0], args[2]);
+        }
         break;
     case INDEX_op_mul_i64:
-        tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
+        if (const_args[2]) {
+            if (args[2] == (int16_t)args[2]) {
+                tcg_out_insn(s, RI, MGHI, args[0], args[2]);
+            } else {
+                tcg_out_insn(s, RIL, MSGFI, args[0], args[2]);
+            }
+        } else {
+            tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
+        }
         break;
 
     case INDEX_op_div2_i32:
@@ -1755,7 +1790,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_add_i32, { "r", "0", "rWI" } },
     { INDEX_op_sub_i32, { "r", "0", "rWNI" } },
-    { INDEX_op_mul_i32, { "r", "0", "r" } },
+    { INDEX_op_mul_i32, { "r", "0", "rK" } },
 
     { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
@@ -1817,7 +1852,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_add_i64, { "r", "0", "rI" } },
     { INDEX_op_sub_i64, { "r", "0", "rNI" } },
-    { INDEX_op_mul_i64, { "r", "0", "r" } },
+    { INDEX_op_mul_i64, { "r", "0", "rK" } },
 
     { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 26/35] tcg-s390: Tidy goto_tb.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (24 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 25/35] tcg-s390: Use the MULTIPLY " Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 27/35] tcg-s390: Rearrange qemu_ld/st to avoid register copy Richard Henderson
                   ` (9 subsequent siblings)
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Introduce tcg_out_ld_abs, which uses the LOAD RELATIVE instructions when
available, and use it for the goto_tb address load.
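
A sketch of the reachability check the new helper performs (illustrative
code, not the patch itself; LRL/LGRL encode a signed 32-bit offset counted
in halfwords, so the reach is +/- 4GiB around the instruction):

    #include <stdint.h>

    /* Can this absolute host address be loaded PC-relative?  */
    static int lrl_reachable(uintptr_t addr, uintptr_t code_ptr)
    {
        intptr_t disp = ((intptr_t)addr - (intptr_t)code_ptr) >> 1;
        return disp == (int32_t)disp;
    }

When this fails, or without the general-instruction-extension facility, the
helper materializes addr & ~0xffff with tcg_out_movi and folds the low 16
bits into the displacement of an ordinary load.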

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   37 +++++++++++++++++++++++++------------
 1 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index f1e00e9..822835b 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -59,12 +59,14 @@ typedef enum S390Opcode {
     RIL_ALGFI   = 0xc20a,
     RIL_BRASL   = 0xc005,
     RIL_BRCL    = 0xc004,
-    RIL_LARL    = 0xc000,
     RIL_IIHF    = 0xc008,
     RIL_IILF    = 0xc009,
+    RIL_LARL    = 0xc000,
     RIL_LGFI    = 0xc001,
+    RIL_LGRL    = 0xc408,
     RIL_LLIHF   = 0xc00e,
     RIL_LLILF   = 0xc00f,
+    RIL_LRL     = 0xc40d,
     RIL_MSFI    = 0xc201,
     RIL_MSGFI   = 0xc200,
     RIL_NIHF    = 0xc00a,
@@ -755,6 +757,27 @@ static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg data,
     }
 }
 
+/* load data from an absolute host address */
+static void tcg_out_ld_abs(TCGContext *s, TCGType type, TCGReg dest, void *abs)
+{
+    tcg_target_long addr = (tcg_target_long)abs;
+
+    if (facilities & FACILITY_GEN_INST_EXT) {
+        tcg_target_long disp = (addr - (tcg_target_long)s->code_ptr) >> 1;
+        if (disp == (int32_t)disp) {
+            if (type == TCG_TYPE_I32) {
+                tcg_out_insn(s, RIL, LRL, dest, disp);
+            } else {
+                tcg_out_insn(s, RIL, LGRL, dest, disp);
+            }
+            return;
+        }
+    }
+
+    tcg_out_movi(s, TCG_TYPE_PTR, dest, addr & ~0xffff);
+    tcg_out_ld(s, type, dest, dest, addr & 0xffff);
+}
+
 static void tgen_ext8s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
 {
     if (facilities & FACILITY_EXT_IMM) {
@@ -1360,18 +1383,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         if (s->tb_jmp_offset) {
             tcg_abort();
         } else {
-            tcg_target_long off = ((tcg_target_long)(s->tb_next + args[0]) -
-                                   (tcg_target_long)s->code_ptr) >> 1;
-            if (off == (int32_t)off) {
-                /* load address relative to PC */
-                tcg_out_insn(s, RIL, LARL, TCG_TMP0, off);
-            } else {
-                /* too far for larl */
-                tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
-                             (tcg_target_long)(s->tb_next + args[0]));
-            }
             /* load address stored at s->tb_next + args[0] */
-            tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP0, TCG_TMP0, 0);
+            tcg_out_ld_abs(s, TCG_TYPE_PTR, TCG_TMP0, s->tb_next + args[0]);
             /* and go there */
             tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_TMP0);
         }
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 27/35] tcg-s390: Rearrange qemu_ld/st to avoid register copy.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (25 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 26/35] tcg-s390: Tidy goto_tb Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 28/35] tcg-s390: Tidy tcg_prepare_qemu_ldst Richard Henderson
                   ` (8 subsequent siblings)
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Split out qemu_ld/st_direct with full address components.
Avoid copy from addr_reg to R2 for 64-bit guests.
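
The new *_direct helpers take the full base+index+displacement operand that
the RX/RXY formats provide; register 0 means "no register" in the base and
index slots, which is why TCG_REG_NONE can be passed straight through.  A
rough value-level model of that addressing mode (names other than
TCG_REG_NONE are invented for illustration):

    #include <stdint.h>

    enum { REG_NONE = 0 };   /* stands in for TCG_REG_NONE */

    static uint64_t effective_addr(const uint64_t regs[16],
                                   unsigned base, unsigned index, int64_t disp)
    {
        uint64_t addr = (uint64_t)disp;
        if (base != REG_NONE) {
            addr += regs[base];
        }
        if (index != REG_NONE) {
            addr += regs[index];
        }
        return addr;
    }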

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  282 ++++++++++++++++++++++++++-----------------------
 1 files changed, 151 insertions(+), 131 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 822835b..88b5592 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -1094,6 +1094,115 @@ static void tgen_calli(TCGContext *s, tcg_target_long dest)
     }
 }
 
+static void tcg_out_qemu_ld_direct(TCGContext *s, int opc, TCGReg data,
+                                   TCGReg base, TCGReg index, int disp)
+{
+#ifdef TARGET_WORDS_BIGENDIAN
+    const int bswap = 0;
+#else
+    const int bswap = 1;
+#endif
+    switch (opc) {
+    case LD_UINT8:
+        tcg_out_insn(s, RXY, LLGC, data, base, index, disp);
+        break;
+    case LD_INT8:
+        tcg_out_insn(s, RXY, LGB, data, base, index, disp);
+        break;
+    case LD_UINT16:
+        if (bswap) {
+            /* swapped unsigned halfword load with upper bits zeroed */
+            tcg_out_insn(s, RXY, LRVH, data, base, index, disp);
+            tgen_ext16u(s, TCG_TYPE_I64, data, data);
+        } else {
+            tcg_out_insn(s, RXY, LLGH, data, base, index, disp);
+        }
+        break;
+    case LD_INT16:
+        if (bswap) {
+            /* swapped sign-extended halfword load */
+            tcg_out_insn(s, RXY, LRVH, data, base, index, disp);
+            tgen_ext16s(s, TCG_TYPE_I64, data, data);
+        } else {
+            tcg_out_insn(s, RXY, LGH, data, base, index, disp);
+        }
+        break;
+    case LD_UINT32:
+        if (bswap) {
+            /* swapped unsigned int load with upper bits zeroed */
+            tcg_out_insn(s, RXY, LRV, data, base, index, disp);
+            tgen_ext32u(s, data, data);
+        } else {
+            tcg_out_insn(s, RXY, LLGF, data, base, index, disp);
+        }
+        break;
+    case LD_INT32:
+        if (bswap) {
+            /* swapped sign-extended int load */
+            tcg_out_insn(s, RXY, LRV, data, base, index, disp);
+            tgen_ext32s(s, data, data);
+        } else {
+            tcg_out_insn(s, RXY, LGF, data, base, index, disp);
+        }
+        break;
+    case LD_UINT64:
+        if (bswap) {
+            tcg_out_insn(s, RXY, LRVG, data, base, index, disp);
+        } else {
+            tcg_out_insn(s, RXY, LG, data, base, index, disp);
+        }
+        break;
+    default:
+        tcg_abort();
+    }
+}
+
+static void tcg_out_qemu_st_direct(TCGContext *s, int opc, TCGReg data,
+                                   TCGReg base, TCGReg index, int disp)
+{
+#ifdef TARGET_WORDS_BIGENDIAN
+    const int bswap = 0;
+#else
+    const int bswap = 1;
+#endif
+    switch (opc) {
+    case LD_UINT8:
+        if (disp >= 0 && disp < 0x1000) {
+            tcg_out_insn(s, RX, STC, data, base, index, disp);
+        } else {
+            tcg_out_insn(s, RXY, STCY, data, base, index, disp);
+        }
+        break;
+    case LD_UINT16:
+        if (bswap) {
+            tcg_out_insn(s, RXY, STRVH, data, base, index, disp);
+        } else if (disp >= 0 && disp < 0x1000) {
+            tcg_out_insn(s, RX, STH, data, base, index, disp);
+        } else {
+            tcg_out_insn(s, RXY, STHY, data, base, index, disp);
+        }
+        break;
+    case LD_UINT32:
+        if (bswap) {
+            tcg_out_insn(s, RXY, STRV, data, base, index, disp);
+        } else if (disp >= 0 && disp < 0x1000) {
+            tcg_out_insn(s, RX, ST, data, base, index, disp);
+        } else {
+            tcg_out_insn(s, RXY, STY, data, base, index, disp);
+        }
+        break;
+    case LD_UINT64:
+        if (bswap) {
+            tcg_out_insn(s, RXY, STRVG, data, base, index, disp);
+        } else {
+            tcg_out_insn(s, RXY, STG, data, base, index, disp);
+        }
+        break;
+    default:
+        tcg_abort();
+    }
+}
+
 #if defined(CONFIG_SOFTMMU)
 static void tgen64_andi_tmp(TCGContext *s, TCGReg dest, tcg_target_ulong val)
 {
@@ -1105,13 +1214,13 @@ static void tgen64_andi_tmp(TCGContext *s, TCGReg dest, tcg_target_ulong val)
     }
 }
 
-static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
-                                  int mem_index, int opc,
+static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
+                                  TCGReg addr_reg, int mem_index, int opc,
                                   uint16_t **label2_ptr_p, int is_store)
-  {
-    int arg0 = TCG_REG_R2;
-    int arg1 = TCG_REG_R3;
-    int arg2 = TCG_REG_R4;
+{
+    const TCGReg arg0 = TCG_REG_R2;
+    const TCGReg arg1 = TCG_REG_R3;
+    const TCGReg arg2 = TCG_REG_R4;
     int s_bits;
     uint16_t *label1_ptr;
 
@@ -1148,18 +1257,18 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
 
     tcg_out_insn(s, RXY, CG, arg0, arg1, 0, 0);
 
-    label1_ptr = (uint16_t*)s->code_ptr;
-
-    /* je label1 (offset will be patched in later) */
-    tcg_out_insn(s, RI, BRC, S390_CC_EQ, 0);
-
-    /* call load/store helper */
 #if TARGET_LONG_BITS == 32
     tgen_ext32u(s, arg0, addr_reg);
 #else
     tcg_out_mov(s, arg0, addr_reg);
 #endif
 
+    label1_ptr = (uint16_t*)s->code_ptr;
+
+    /* je label1 (offset will be patched in later) */
+    tcg_out_insn(s, RI, BRC, S390_CC_EQ, 0);
+
+    /* call load/store helper */
     if (is_store) {
         tcg_out_mov(s, arg1, data_reg);
         tcg_out_movi(s, TCG_TYPE_I32, arg2, mem_index);
@@ -1205,13 +1314,6 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
                      - offsetof(CPUTLBEntry, addr_read));
     }
 
-#if TARGET_LONG_BITS == 32
-    /* zero upper 32 bits */
-    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
-#else
-    /* just copy */
-    tcg_out_mov(s, arg0, addr_reg);
-#endif
     tcg_out_insn(s, RRE, AGR, arg0, arg1);
 }
 
@@ -1221,150 +1323,68 @@ static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
     *(label2_ptr + 1) = ((unsigned long)s->code_ptr -
                          (unsigned long)label2_ptr) >> 1;
 }
-
-#else /* CONFIG_SOFTMMU */
-
-static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
-                                int mem_index, int opc,
-                                uint16_t **label2_ptr_p, int is_store)
-{
-    int arg0 = TCG_REG_R2;
-
-    /* user mode, no address translation required */
-    if (TARGET_LONG_BITS == 32) {
-        tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
-    } else {
-        tcg_out_mov(s, arg0, addr_reg);
-    }
-}
-
-static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
-{
-}
-
 #endif /* CONFIG_SOFTMMU */
 
 /* load data with address translation (if applicable)
    and endianness conversion */
 static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 {
-    int addr_reg, data_reg, mem_index;
-    int arg0 = TCG_REG_R2;
+    TCGReg addr_reg, data_reg;
+#if defined(CONFIG_SOFTMMU)
+    int mem_index;
     uint16_t *label2_ptr;
+#endif
 
     data_reg = *args++;
     addr_reg = *args++;
-    mem_index = *args;
 
-    dprintf("tcg_out_qemu_ld opc %d data_reg %d addr_reg %d mem_index %d\n"
-            opc, data_reg, addr_reg, mem_index);
+#if defined(CONFIG_SOFTMMU)
+    mem_index = *args;
 
     tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
                           opc, &label2_ptr, 0);
 
-    switch (opc) {
-    case LD_UINT8:
-        tcg_out_insn(s, RXY, LLGC, data_reg, arg0, 0, 0);
-        break;
-    case LD_INT8:
-        tcg_out_insn(s, RXY, LGB, data_reg, arg0, 0, 0);
-        break;
-    case LD_UINT16:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RXY, LLGH, data_reg, arg0, 0, 0);
-#else
-        /* swapped unsigned halfword load with upper bits zeroed */
-        tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
-        tgen_ext16u(s, TCG_TYPE_I64, data_reg, data_reg);
-#endif
-        break;
-    case LD_INT16:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RXY, LGH, data_reg, arg0, 0, 0);
-#else
-        /* swapped sign-extended halfword load */
-        tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
-        tgen_ext16s(s, TCG_TYPE_I64, data_reg, data_reg);
-#endif
-        break;
-    case LD_UINT32:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RXY, LLGF, data_reg, arg0, 0, 0);
-#else
-        /* swapped unsigned int load with upper bits zeroed */
-        tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
-        tgen_ext32u(s, data_reg, data_reg);
-#endif
-        break;
-    case LD_INT32:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RXY, LGF, data_reg, arg0, 0, 0);
-#else
-        /* swapped sign-extended int load */
-        tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
-        tgen_ext32s(s, data_reg, data_reg);
-#endif
-        break;
-    case LD_UINT64:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RXY, LG, data_reg, arg0, 0, 0);
-#else
-        tcg_out_insn(s, RXY, LRVG, data_reg, arg0, 0, 0);
-#endif
-        break;
-    default:
-        tcg_abort();
-    }
+    tcg_out_qemu_ld_direct(s, opc, data_reg, TCG_REG_R2, TCG_REG_NONE, 0);
 
     tcg_finish_qemu_ldst(s, label2_ptr);
+#else
+    if (TARGET_LONG_BITS == 32) {
+        tgen_ext32u(s, TCG_TMP0, addr_reg);
+        tcg_out_qemu_ld_direct(s, opc, data_reg, TCG_TMP0, TCG_REG_NONE, 0);
+    } else {
+        tcg_out_qemu_ld_direct(s, opc, data_reg, addr_reg, TCG_REG_NONE, 0);
+    }
+#endif
 }
 
 static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
 {
-    int addr_reg, data_reg, mem_index;
+    TCGReg addr_reg, data_reg;
+#if defined(CONFIG_SOFTMMU)
+    int mem_index;
     uint16_t *label2_ptr;
-    int arg0 = TCG_REG_R2;
+#endif
 
     data_reg = *args++;
     addr_reg = *args++;
-    mem_index = *args;
 
-    dprintf("tcg_out_qemu_st opc %d data_reg %d addr_reg %d mem_index %d\n"
-            opc, data_reg, addr_reg, mem_index);
+#if defined(CONFIG_SOFTMMU)
+    mem_index = *args;
 
     tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
                           opc, &label2_ptr, 1);
 
-    switch (opc) {
-    case LD_UINT8:
-        tcg_out_insn(s, RX, STC, data_reg, arg0, 0, 0);
-        break;
-    case LD_UINT16:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RX, STH, data_reg, arg0, 0, 0);
-#else
-        tcg_out_insn(s, RXY, STRVH, data_reg, arg0, 0, 0);
-#endif
-        break;
-    case LD_UINT32:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RX, ST, data_reg, arg0, 0, 0);
-#else
-        tcg_out_insn(s, RXY, STRV, data_reg, arg0, 0, 0);
-#endif
-        break;
-    case LD_UINT64:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RXY, STG, data_reg, arg0, 0, 0);
-#else
-        tcg_out_insn(s, RXY, STRVG, data_reg, arg0, 0, 0);
-#endif
-        break;
-    default:
-        tcg_abort();
-    }
+    tcg_out_qemu_st_direct(s, opc, data_reg, TCG_REG_R2, TCG_REG_NONE, 0);
 
     tcg_finish_qemu_ldst(s, label2_ptr);
+#else
+    if (TARGET_LONG_BITS == 32) {
+        tgen_ext32u(s, TCG_TMP0, addr_reg);
+        tcg_out_qemu_st_direct(s, opc, data_reg, TCG_TMP0, TCG_REG_NONE, 0);
+    } else {
+        tcg_out_qemu_st_direct(s, opc, data_reg, addr_reg, TCG_REG_NONE, 0);
+    }
+#endif
 }
 
 static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 28/35] tcg-s390: Tidy tcg_prepare_qemu_ldst.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (26 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 27/35] tcg-s390: Rearrange qemu_ld/st to avoid register copy Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 29/35] tcg-s390: Tidy user qemu_ld/st Richard Henderson
                   ` (7 subsequent siblings)
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Make use of the reg+reg+disp addressing mode to eliminate
redundant additions.  Make use of the load-and-operate insns.
Avoid an extra register copy when using the 64-bit shift insns.
Fix the width of the TLB comparison.
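
The reg+reg+disp forms used here are the RXY "long displacement" formats,
whose displacement field is a signed 20-bit value; that is why the TLB table
offsets can be folded directly into the compare and add, and why the patch
asserts ofs < 0x80000.  A sketch of that range test (the function name is
invented for illustration):

    #include <stdint.h>

    static int fits_long_displacement(int64_t ofs)
    {
        return ofs >= -0x80000 && ofs <= 0x7ffff;   /* signed 20 bits */
    }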

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   64 ++++++++++++++++++++----------------------------
 1 files changed, 27 insertions(+), 37 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 88b5592..b73515d 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -146,7 +146,10 @@ typedef enum S390Opcode {
     RS_SRA      = 0x8a,
     RS_SRL      = 0x88,
 
+    RXY_AG      = 0xe308,
+    RXY_AY      = 0xe35a,
     RXY_CG      = 0xe320,
+    RXY_CY      = 0xe359,
     RXY_LB      = 0xe376,
     RXY_LG      = 0xe304,
     RXY_LGB     = 0xe377,
@@ -170,6 +173,8 @@ typedef enum S390Opcode {
     RXY_STRVH   = 0xe33f,
     RXY_STY     = 0xe350,
 
+    RX_A        = 0x5a,
+    RX_C        = 0x59,
     RX_L        = 0x58,
     RX_LH       = 0x48,
     RX_ST       = 0x50,
@@ -1220,24 +1225,16 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
 {
     const TCGReg arg0 = TCG_REG_R2;
     const TCGReg arg1 = TCG_REG_R3;
-    const TCGReg arg2 = TCG_REG_R4;
-    int s_bits;
+    int s_bits = opc & 3;
     uint16_t *label1_ptr;
+    tcg_target_long ofs;
 
-    if (is_store) {
-        s_bits = opc;
+    if (TARGET_LONG_BITS == 32) {
+        tgen_ext32u(s, arg0, addr_reg);
     } else {
-        s_bits = opc & 3;
+        tcg_out_mov(s, arg0, addr_reg);
     }
 
-#if TARGET_LONG_BITS == 32
-    tgen_ext32u(s, arg1, addr_reg);
-    tgen_ext32u(s, arg0, addr_reg);
-#else
-    tcg_out_mov(s, arg1, addr_reg);
-    tcg_out_mov(s, arg0, addr_reg);
-#endif
-
     tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, TCG_REG_NONE,
                  TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
 
@@ -1245,23 +1242,23 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
     tgen64_andi_tmp(s, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
 
     if (is_store) {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
-                     offsetof(CPUState, tlb_table[mem_index][0].addr_write));
+        ofs = offsetof(CPUState, tlb_table[mem_index][0].addr_write);
     } else {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
-                     offsetof(CPUState, tlb_table[mem_index][0].addr_read));
+        ofs = offsetof(CPUState, tlb_table[mem_index][0].addr_read);
     }
-    tcg_out_insn(s, RRE, AGR, arg1, TCG_TMP0);
+    assert(ofs < 0x80000);
 
-    tcg_out_insn(s, RRE, AGR, arg1, TCG_AREG0);
-
-    tcg_out_insn(s, RXY, CG, arg0, arg1, 0, 0);
+    if (TARGET_LONG_BITS == 32) {
+        tcg_out_mem(s, RX_C, RXY_CY, arg0, arg1, TCG_AREG0, ofs);
+    } else {
+        tcg_out_mem(s, 0, RXY_CG, arg0, arg1, TCG_AREG0, ofs);
+    }
 
-#if TARGET_LONG_BITS == 32
-    tgen_ext32u(s, arg0, addr_reg);
-#else
-    tcg_out_mov(s, arg0, addr_reg);
-#endif
+    if (TARGET_LONG_BITS == 32) {
+        tgen_ext32u(s, arg0, addr_reg);
+    } else {
+        tcg_out_mov(s, arg0, addr_reg);
+    }
 
     label1_ptr = (uint16_t*)s->code_ptr;
 
@@ -1271,7 +1268,7 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
     /* call load/store helper */
     if (is_store) {
         tcg_out_mov(s, arg1, data_reg);
-        tcg_out_movi(s, TCG_TYPE_I32, arg2, mem_index);
+        tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R4, mem_index);
         tgen_calli(s, (tcg_target_ulong)qemu_st_helpers[s_bits]);
     } else {
         tcg_out_movi(s, TCG_TYPE_I32, arg1, mem_index);
@@ -1304,17 +1301,10 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
     *(label1_ptr + 1) = ((unsigned long)s->code_ptr -
                          (unsigned long)label1_ptr) >> 1;
 
-    if (is_store) {
-        tcg_out_insn(s, RXY, LG, arg1, arg1, 0,
-                     offsetof(CPUTLBEntry, addend)
-                     - offsetof(CPUTLBEntry, addr_write));
-    } else {
-        tcg_out_insn(s, RXY, LG, arg1, arg1, 0,
-                     offsetof(CPUTLBEntry, addend)
-                     - offsetof(CPUTLBEntry, addr_read));
-    }
+    ofs = offsetof(CPUState, tlb_table[mem_index][0].addend);
+    assert(ofs < 0x80000);
 
-    tcg_out_insn(s, RRE, AGR, arg0, arg1);
+    tcg_out_mem(s, 0, RXY_AG, arg0, arg1, TCG_AREG0, ofs);
 }
 
 static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 29/35] tcg-s390: Tidy user qemu_ld/st.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (27 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 28/35] tcg-s390: Tidy tcg_prepare_qemu_ldst Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 30/35] tcg-s390: Implement GUEST_BASE Richard Henderson
                   ` (6 subsequent siblings)
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Create a tcg_prepare_user_ldst to prep the host address to
be used to implement the guest memory operation.
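
For a 32-bit guest on this 64-bit host the address register may carry junk
in its upper half, so the helper zero-extends it into a scratch register
before it is used as a base; 64-bit guest addresses are used as-is.  A
sketch of the value-level effect (the function name is invented for
illustration):

    #include <stdint.h>

    static uint64_t canonical_guest_addr(uint64_t addr_reg_value,
                                         int target_long_bits)
    {
        if (target_long_bits == 32) {
            return (uint32_t)addr_reg_value;   /* what tgen_ext32u produces */
        }
        return addr_reg_value;
    }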

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   33 +++++++++++++++++++++------------
 1 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index b73515d..ef1f69e 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -1313,6 +1313,17 @@ static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
     *(label2_ptr + 1) = ((unsigned long)s->code_ptr -
                          (unsigned long)label2_ptr) >> 1;
 }
+#else
+static void tcg_prepare_user_ldst(TCGContext *s, TCGReg *addr_reg,
+                                  TCGReg *index_reg, tcg_target_long *disp)
+{
+    *index_reg = TCG_REG_NONE;
+    *disp = 0;
+    if (TARGET_LONG_BITS == 32) {
+        tgen_ext32u(s, TCG_TMP0, *addr_reg);
+        *addr_reg = TCG_TMP0;
+    }
+}
 #endif /* CONFIG_SOFTMMU */
 
 /* load data with address translation (if applicable)
@@ -1323,6 +1334,9 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 #if defined(CONFIG_SOFTMMU)
     int mem_index;
     uint16_t *label2_ptr;
+#else
+    TCGReg index_reg;
+    tcg_target_long disp;
 #endif
 
     data_reg = *args++;
@@ -1338,12 +1352,8 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 
     tcg_finish_qemu_ldst(s, label2_ptr);
 #else
-    if (TARGET_LONG_BITS == 32) {
-        tgen_ext32u(s, TCG_TMP0, addr_reg);
-        tcg_out_qemu_ld_direct(s, opc, data_reg, TCG_TMP0, TCG_REG_NONE, 0);
-    } else {
-        tcg_out_qemu_ld_direct(s, opc, data_reg, addr_reg, TCG_REG_NONE, 0);
-    }
+    tcg_prepare_user_ldst(s, &addr_reg, &index_reg, &disp);
+    tcg_out_qemu_ld_direct(s, opc, data_reg, addr_reg, index_reg, disp);
 #endif
 }
 
@@ -1353,6 +1363,9 @@ static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
 #if defined(CONFIG_SOFTMMU)
     int mem_index;
     uint16_t *label2_ptr;
+#else
+    TCGReg index_reg;
+    tcg_target_long disp;
 #endif
 
     data_reg = *args++;
@@ -1368,12 +1381,8 @@ static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
 
     tcg_finish_qemu_ldst(s, label2_ptr);
 #else
-    if (TARGET_LONG_BITS == 32) {
-        tgen_ext32u(s, TCG_TMP0, addr_reg);
-        tcg_out_qemu_st_direct(s, opc, data_reg, TCG_TMP0, TCG_REG_NONE, 0);
-    } else {
-        tcg_out_qemu_st_direct(s, opc, data_reg, addr_reg, TCG_REG_NONE, 0);
-    }
+    tcg_prepare_user_ldst(s, &addr_reg, &index_reg, &disp);
+    tcg_out_qemu_st_direct(s, opc, data_reg, addr_reg, index_reg, disp);
 #endif
 }
 
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 30/35] tcg-s390: Implement GUEST_BASE.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (28 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 29/35] tcg-s390: Tidy user qemu_ld/st Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 31/35] tcg-s390: Use 16-bit branches for forward jumps Richard Henderson
                   ` (5 subsequent siblings)
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien
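
In brief (summarising the diff below rather than the author's own wording):
if GUEST_BASE fits the signed 20-bit displacement it is folded into the
memory operand of the user-only qemu_ld/st path, otherwise it is
materialised once in the prologue into a reserved register (R13 here) and
used as the index register.  A sketch of that choice (function and field
names invented for illustration):

    #include <stdint.h>

    struct user_ldst_addr {
        int use_base_reg;   /* 1: use TCG_GUEST_BASE_REG as the index register */
        int64_t disp;       /* 0, or the small GUEST_BASE value                */
    };

    static struct user_ldst_addr pick_guest_base(uint64_t guest_base)
    {
        struct user_ldst_addr a;
        if (guest_base < 0x80000) {          /* fits the 20-bit displacement */
            a.use_base_reg = 0;
            a.disp = (int64_t)guest_base;
        } else {
            a.use_base_reg = 1;              /* loaded once in the prologue */
            a.disp = 0;
        }
        return a;
    }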

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 configure             |    2 ++
 tcg/s390/tcg-target.c |   24 ++++++++++++++++++++++--
 tcg/s390/tcg-target.h |    2 ++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 7f5b5b2..e80b820 100755
--- a/configure
+++ b/configure
@@ -699,10 +699,12 @@ case "$cpu" in
     s390)
            QEMU_CFLAGS="-m31 -march=z990 $QEMU_CFLAGS"
            LDFLAGS="-m31 $LDFLAGS"
+           host_guest_base="yes"
            ;;
     s390x)
            QEMU_CFLAGS="-m64 -march=z990 $QEMU_CFLAGS"
            LDFLAGS="-m64 $LDFLAGS"
+           host_guest_base="yes"
            ;;
     i386)
            QEMU_CFLAGS="-m32 $QEMU_CFLAGS"
diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index ef1f69e..13c4de6 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -48,6 +48,16 @@
 /* A scratch register that may be be used throughout the backend.  */
 #define TCG_TMP0        TCG_REG_R14
 
+#ifdef CONFIG_USE_GUEST_BASE
+#define TCG_GUEST_BASE_REG TCG_REG_R13
+#else
+#define TCG_GUEST_BASE_REG TCG_REG_R0
+#endif
+
+#ifndef GUEST_BASE
+#define GUEST_BASE 0
+#endif
+
 
 /* All of the following instructions are prefixed with their instruction
    format, and are defined as 8- or 16-bit quantities, even when the two
@@ -1317,12 +1327,17 @@ static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
 static void tcg_prepare_user_ldst(TCGContext *s, TCGReg *addr_reg,
                                   TCGReg *index_reg, tcg_target_long *disp)
 {
-    *index_reg = TCG_REG_NONE;
-    *disp = 0;
     if (TARGET_LONG_BITS == 32) {
         tgen_ext32u(s, TCG_TMP0, *addr_reg);
         *addr_reg = TCG_TMP0;
     }
+    if (GUEST_BASE < 0x80000) {
+        *index_reg = TCG_REG_NONE;
+        *disp = GUEST_BASE;
+    } else {
+        *index_reg = TCG_GUEST_BASE_REG;
+        *disp = 0;
+    }
 }
 #endif /* CONFIG_SOFTMMU */
 
@@ -2061,6 +2076,11 @@ void tcg_target_qemu_prologue(TCGContext *s)
     /* aghi %r15,-160 (stack frame) */
     tcg_out_insn(s, RI, AGHI, TCG_REG_R15, -160);
 
+    if (GUEST_BASE >= 0x80000) {
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_GUEST_BASE_REG, GUEST_BASE);
+        tcg_regset_set_reg(s->reserved_regs, TCG_GUEST_BASE_REG);
+    }
+
     /* br %r2 (go to TB) */
     tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R2);
 
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 9135c7a..390c587 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -83,6 +83,8 @@ typedef enum TCGReg {
 // #define TCG_TARGET_HAS_nand_i64
 // #define TCG_TARGET_HAS_nor_i64
 
+#define TCG_TARGET_HAS_GUEST_BASE
+
 /* used for function call generation */
 #define TCG_REG_CALL_STACK		TCG_REG_R15
 #define TCG_TARGET_STACK_ALIGN		8
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 31/35] tcg-s390: Use 16-bit branches for forward jumps.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (29 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 30/35] tcg-s390: Implement GUEST_BASE Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 32/35] tcg-s390: Use the LOAD AND TEST instruction for compares Richard Henderson
                   ` (4 subsequent siblings)
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Translation blocks are never big enough to require 32-bit branches.
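
The 16-bit relative-branch displacement counts halfwords, so it reaches
about +/-64KiB from the branch, far more than a single translation block.
A sketch of the reach test, in the same form as the new R_390_PC16DBL case
in patch_reloc (the helper name is invented for illustration):

    #include <stdint.h>
    #include <stdbool.h>

    static bool short_branch_reaches(intptr_t from, intptr_t to)
    {
        intptr_t pcrel2 = (to - from) >> 1;    /* halfword count */
        return pcrel2 == (int16_t)pcrel2;      /* signed 16-bit field */
    }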

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   27 ++++++++++++++++++++++-----
 1 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 13c4de6..81d5ad3 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -33,6 +33,11 @@
     do { } while (0)
 #endif
 
+/* ??? The translation blocks produced by TCG are generally small enough to
+   be entirely reachable with a 16-bit displacement.  Leaving the option for
+   a 32-bit displacement here Just In Case.  */
+#define USE_LONG_BRANCHES 0
+
 #define TCG_CT_CONST_32    0x0100
 #define TCG_CT_CONST_NEG   0x0200
 #define TCG_CT_CONST_ADDI  0x0400
@@ -301,14 +306,22 @@ static uint8_t *tb_ret_addr;
 static uint64_t facilities;
 
 static void patch_reloc(uint8_t *code_ptr, int type,
-                tcg_target_long value, tcg_target_long addend)
+                        tcg_target_long value, tcg_target_long addend)
 {
-    uint32_t *code_ptr_32 = (uint32_t*)code_ptr;
-    tcg_target_long code_ptr_tlong = (tcg_target_long)code_ptr;
+    tcg_target_long code_ptr_tl = (tcg_target_long)code_ptr;
+    tcg_target_long pcrel2;
 
+    /* ??? Not the usual definition of "addend".  */
+    pcrel2 = (value - (code_ptr_tl + addend)) >> 1;
+    
     switch (type) {
+    case R_390_PC16DBL:
+        assert(pcrel2 == (int16_t)pcrel2);
+        *(int16_t *)code_ptr = pcrel2;
+        break;
     case R_390_PC32DBL:
-        *code_ptr_32 = (value - (code_ptr_tlong + addend)) >> 1;
+        assert(pcrel2 == (int32_t)pcrel2);
+        *(int32_t *)code_ptr = pcrel2;
         break;
     default:
         tcg_abort();
@@ -1091,10 +1104,14 @@ static void tgen_branch(TCGContext *s, int cc, int labelno)
     TCGLabel* l = &s->labels[labelno];
     if (l->has_value) {
         tgen_gotoi(s, cc, l->u.value);
-    } else {
+    } else if (USE_LONG_BRANCHES) {
         tcg_out16(s, RIL_BRCL | (cc << 4));
         tcg_out_reloc(s, s->code_ptr, R_390_PC32DBL, labelno, -2);
         s->code_ptr += 4;
+    } else {
+        tcg_out16(s, RI_BRC | (cc << 4));
+        tcg_out_reloc(s, s->code_ptr, R_390_PC16DBL, labelno, -2);
+        s->code_ptr += 2;
     }
 }
 
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 32/35] tcg-s390: Use the LOAD AND TEST instruction for compares.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (30 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 31/35] tcg-s390: Use 16-bit branches for forward jumps Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 33/35] tcg-s390: Use the COMPARE IMMEDIATE instructions " Richard Henderson
                   ` (3 subsequent siblings)
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

This instruction is always available, and nicely eliminates
the constant load for comparisons against zero.
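
LOAD AND TEST only sets the signed condition codes, but against zero the
unsigned predicates collapse onto them anyway: "< 0" is never true for an
unsigned value, "<= 0" means "== 0", "> 0" means "!= 0" and ">= 0" is
always true, which is exactly what tcg_cond_to_ltr_cond encodes below.
A self-checking sketch of that equivalence (the function name is invented):

    #include <assert.h>
    #include <stdint.h>

    static void check_unsigned_vs_zero(uint64_t x)
    {
        assert((x <  (uint64_t)0) == 0);         /* LTU -> never  */
        assert((x <= (uint64_t)0) == (x == 0));  /* LEU -> EQ     */
        assert((x >  (uint64_t)0) == (x != 0));  /* GTU -> NE     */
        assert((x >= (uint64_t)0) == 1);         /* GEU -> always */
    }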

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  133 +++++++++++++++++++++++++++++++++---------------
 1 files changed, 91 insertions(+), 42 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 81d5ad3..8bc82b4 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -45,6 +45,7 @@
 #define TCG_CT_CONST_ANDI  0x1000
 #define TCG_CT_CONST_ORI   0x2000
 #define TCG_CT_CONST_XORI  0x4000
+#define TCG_CT_CONST_CMPI  0x8000
 
 /* Several places within the instruction set 0 means "no register"
    rather than TCG_REG_R0.  */
@@ -131,6 +132,7 @@ typedef enum S390Opcode {
     RRE_LLGHR   = 0xb985,
     RRE_LRVR    = 0xb91f,
     RRE_LRVGR   = 0xb90f,
+    RRE_LTGR    = 0xb902,
     RRE_MSGR    = 0xb90c,
     RRE_MSR     = 0xb252,
     RRE_NGR     = 0xb980,
@@ -146,6 +148,7 @@ typedef enum S390Opcode {
     RR_DR       = 0x1d,
     RR_LCR      = 0x13,
     RR_LR       = 0x18,
+    RR_LTR      = 0x12,
     RR_NR       = 0x14,
     RR_OR       = 0x16,
     RR_SR       = 0x1b,
@@ -248,9 +251,6 @@ static const int tcg_target_call_oarg_regs[] = {
     TCG_REG_R3,
 };
 
-/* signed/unsigned is handled by using COMPARE and COMPARE LOGICAL,
-   respectively */
-
 #define S390_CC_EQ      8
 #define S390_CC_LT      4
 #define S390_CC_GT      2
@@ -258,19 +258,37 @@ static const int tcg_target_call_oarg_regs[] = {
 #define S390_CC_NE      (S390_CC_LT | S390_CC_GT)
 #define S390_CC_LE      (S390_CC_LT | S390_CC_EQ)
 #define S390_CC_GE      (S390_CC_GT | S390_CC_EQ)
+#define S390_CC_NEVER   0
 #define S390_CC_ALWAYS  15
 
+/* Condition codes that result from a COMPARE and COMPARE LOGICAL.  */
 static const uint8_t tcg_cond_to_s390_cond[10] = {
     [TCG_COND_EQ]  = S390_CC_EQ,
+    [TCG_COND_NE]  = S390_CC_NE,
     [TCG_COND_LT]  = S390_CC_LT,
-    [TCG_COND_LTU] = S390_CC_LT,
     [TCG_COND_LE]  = S390_CC_LE,
-    [TCG_COND_LEU] = S390_CC_LE,
     [TCG_COND_GT]  = S390_CC_GT,
-    [TCG_COND_GTU] = S390_CC_GT,
     [TCG_COND_GE]  = S390_CC_GE,
+    [TCG_COND_LTU] = S390_CC_LT,
+    [TCG_COND_LEU] = S390_CC_LE,
+    [TCG_COND_GTU] = S390_CC_GT,
     [TCG_COND_GEU] = S390_CC_GE,
+};
+
+/* Condition codes that result from a LOAD AND TEST.  Here, we have no
+   unsigned instruction variation, however since the test is vs zero we
+   can re-map the outcomes appropriately.  */
+static const uint8_t tcg_cond_to_ltr_cond[10] = {
+    [TCG_COND_EQ]  = S390_CC_EQ,
     [TCG_COND_NE]  = S390_CC_NE,
+    [TCG_COND_LT]  = S390_CC_LT,
+    [TCG_COND_LE]  = S390_CC_LE,
+    [TCG_COND_GT]  = S390_CC_GT,
+    [TCG_COND_GE]  = S390_CC_GE,
+    [TCG_COND_LTU] = S390_CC_NEVER,
+    [TCG_COND_LEU] = S390_CC_EQ,
+    [TCG_COND_GTU] = S390_CC_NE,
+    [TCG_COND_GEU] = S390_CC_ALWAYS,
 };
 
 #ifdef CONFIG_SOFTMMU
@@ -387,6 +405,10 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         ct->ct &= ~TCG_CT_REG;
         ct->ct |= TCG_CT_CONST_XORI;
         break;
+    case 'C':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_CMPI;
+        break;
     default:
         break;
     }
@@ -507,6 +529,13 @@ static int tcg_match_xori(int ct, tcg_target_long val)
     return 1;
 }
 
+/* Immediates to be used with comparisons.  */
+
+static int tcg_match_cmpi(int ct, tcg_target_long val)
+{
+    return (val == 0);
+}
+
 /* Test if a constant matches the constraint. */
 static int tcg_target_const_match(tcg_target_long val,
                                   const TCGArgConstraint *arg_ct)
@@ -552,6 +581,8 @@ static int tcg_target_const_match(tcg_target_long val,
         return tcg_match_ori(ct, val);
     } else if (ct & TCG_CT_CONST_XORI) {
         return tcg_match_xori(ct, val);
+    } else if (ct & TCG_CT_CONST_CMPI) {
+        return tcg_match_cmpi(ct, val);
     }
 
     return 0;
@@ -1050,39 +1081,48 @@ static void tgen64_xori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
     }
 }
 
-static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
-{
-    if (c > TCG_COND_GT) {
-        /* unsigned */
-        tcg_out_insn(s, RR, CLR, r1, r2);
-    } else {
-        /* signed */
-        tcg_out_insn(s, RR, CR, r1, r2);
-    }
-}
-
-static void tgen64_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
+static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
+                    TCGArg c2, int c2const)
 {
-    if (c > TCG_COND_GT) {
-        /* unsigned */
-        tcg_out_insn(s, RRE, CLGR, r1, r2);
+    if (c2const) {
+        if (c2 == 0) {
+            if (type == TCG_TYPE_I32) {
+                tcg_out_insn(s, RR, LTR, r1, r1);
+            } else {
+                tcg_out_insn(s, RRE, LTGR, r1, r1);
+            }
+            return tcg_cond_to_ltr_cond[c];
+        } else {
+            tcg_abort();
+        }
     } else {
-        /* signed */
-        tcg_out_insn(s, RRE, CGR, r1, r2);
+        if (c > TCG_COND_GT) {
+            /* unsigned */
+            if (type == TCG_TYPE_I32) {
+                tcg_out_insn(s, RR, CLR, r1, c2);
+            } else {
+                tcg_out_insn(s, RRE, CLGR, r1, c2);
+            }
+        } else {
+            /* signed */
+            if (type == TCG_TYPE_I32) {
+                tcg_out_insn(s, RR, CR, r1, c2);
+            } else {
+                tcg_out_insn(s, RRE, CGR, r1, c2);
+            }
+        }
     }
+    return tcg_cond_to_s390_cond[c];
 }
 
 static void tgen_setcond(TCGContext *s, TCGType type, TCGCond c,
-                         TCGReg dest, TCGReg r1, TCGReg r2)
+                         TCGReg dest, TCGReg r1, TCGArg c2, int c2const)
 {
-    if (type == TCG_TYPE_I32) {
-        tgen32_cmp(s, c, r1, r2);
-    } else {
-        tgen64_cmp(s, c, r1, r2);
-    }
+    int cc = tgen_cmp(s, type, c, r1, c2, c2const);
+
     /* Emit: r1 = 1; if (cc) goto over; r1 = 0; over:  */
     tcg_out_movi(s, type, dest, 1);
-    tcg_out_insn(s, RI, BRC, tcg_cond_to_s390_cond[c], (4 + 4) >> 1);
+    tcg_out_insn(s, RI, BRC, cc, (4 + 4) >> 1);
     tcg_out_movi(s, type, dest, 0);
 }
 
@@ -1115,6 +1155,13 @@ static void tgen_branch(TCGContext *s, int cc, int labelno)
     }
 }
 
+static void tgen_brcond(TCGContext *s, TCGType type, TCGCond c,
+                        TCGReg r1, TCGArg c2, int c2const, int labelno)
+{
+    int cc = tgen_cmp(s, type, c, r1, c2, c2const);
+    tgen_branch(s, cc, labelno);
+}
+
 static void tgen_calli(TCGContext *s, tcg_target_long dest)
 {
     tcg_target_long off = (dest - (tcg_target_long)s->code_ptr) >> 1;
@@ -1755,20 +1802,22 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tgen_branch(s, S390_CC_ALWAYS, args[0]);
         break;
 
-    case INDEX_op_brcond_i64:
-        tgen64_cmp(s, args[2], args[0], args[1]);
-        goto do_brcond;
     case INDEX_op_brcond_i32:
-        tgen32_cmp(s, args[2], args[0], args[1]);
-    do_brcond:
-        tgen_branch(s, tcg_cond_to_s390_cond[args[2]], args[3]);
+        tgen_brcond(s, TCG_TYPE_I32, args[2], args[0],
+                    args[1], const_args[1], args[3]);
+        break;
+    case INDEX_op_brcond_i64:
+        tgen_brcond(s, TCG_TYPE_I64, args[2], args[0],
+                    args[1], const_args[1], args[3]);
         break;
 
     case INDEX_op_setcond_i32:
-        tgen_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1], args[2]);
+        tgen_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1],
+                     args[2], const_args[2]);
         break;
     case INDEX_op_setcond_i64:
-        tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1], args[2]);
+        tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1],
+                     args[2], const_args[2]);
         break;
 
     case INDEX_op_qemu_ld8u:
@@ -1880,8 +1929,8 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_bswap16_i32, { "r", "r" } },
     { INDEX_op_bswap32_i32, { "r", "r" } },
 
-    { INDEX_op_brcond_i32, { "r", "r" } },
-    { INDEX_op_setcond_i32, { "r", "r", "r" } },
+    { INDEX_op_brcond_i32, { "r", "rWC" } },
+    { INDEX_op_setcond_i32, { "r", "r", "rWC" } },
 
     { INDEX_op_qemu_ld8u, { "r", "L" } },
     { INDEX_op_qemu_ld8s, { "r", "L" } },
@@ -1945,8 +1994,8 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_bswap32_i64, { "r", "r" } },
     { INDEX_op_bswap64_i64, { "r", "r" } },
 
-    { INDEX_op_brcond_i64, { "r", "r" } },
-    { INDEX_op_setcond_i64, { "r", "r", "r" } },
+    { INDEX_op_brcond_i64, { "r", "rC" } },
+    { INDEX_op_setcond_i64, { "r", "r", "rC" } },
 #endif
 
     { -1 },
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 33/35] tcg-s390: Use the COMPARE IMMEDIATE instructions for compares.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (31 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 32/35] tcg-s390: Use the LOAD AND TEST instruction for compares Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 34/35] tcg-s390: Use COMPARE AND BRANCH instructions Richard Henderson
                   ` (2 subsequent siblings)
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

These instructions are available with extended-immediate facility.
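
The constraint code cannot see whether a given comparison will end up
signed (CFI/CGFI, signed 32-bit immediate) or unsigned (CLFI/CLGFI,
unsigned 32-bit immediate), so it only accepts constants that both forms
can encode: the intersection 0..0x7fffffff, as the comment in
tcg_match_cmpi explains.  A sketch of that test (the function name is
invented for illustration):

    #include <stdint.h>

    static int cmp_immediate_ok(int64_t val)
    {
        int fits_signed_32   = (val == (int32_t)val);
        int fits_unsigned_32 = (val == (int64_t)(uint32_t)val);
        return fits_signed_32 && fits_unsigned_32;  /* 0 <= val <= 0x7fffffff */
    }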

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   44 ++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 8bc82b4..691a3f5 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -75,6 +75,10 @@ typedef enum S390Opcode {
     RIL_ALGFI   = 0xc20a,
     RIL_BRASL   = 0xc005,
     RIL_BRCL    = 0xc004,
+    RIL_CFI     = 0xc20d,
+    RIL_CGFI    = 0xc20c,
+    RIL_CLFI    = 0xc20f,
+    RIL_CLGFI   = 0xc20e,
     RIL_IIHF    = 0xc008,
     RIL_IILF    = 0xc009,
     RIL_LARL    = 0xc000,
@@ -533,7 +537,29 @@ static int tcg_match_xori(int ct, tcg_target_long val)
 
 static int tcg_match_cmpi(int ct, tcg_target_long val)
 {
-    return (val == 0);
+    if (facilities & FACILITY_EXT_IMM) {
+        /* The COMPARE IMMEDIATE instruction is available.  */
+        if (ct & TCG_CT_CONST_32) {
+            /* We have a 32-bit immediate and can compare against anything.  */
+            return 1;
+        } else {
+            /* ??? We have no insight here into whether the comparison is
+               signed or unsigned.  The COMPARE IMMEDIATE insn uses a 32-bit
+               signed immediate, and the COMPARE LOGICAL IMMEDIATE insn uses
+               a 32-bit unsigned immediate.  If we were to use the (semi)
+               obvious "val == (int32_t)val" we would be enabling unsigned
+               comparisons vs very large numbers.  The only solution is to
+               take the intersection of the ranges.  */
+            /* ??? Another possible solution is to simply lie and allow all
+               constants here and force the out-of-range values into a temp
+               register in tgen_cmp when we have knowledge of the actual
+               comparison code in use.  */
+            return val >= 0 && val <= 0x7fffffff;
+        }
+    } else {
+        /* Only the LOAD AND TEST instruction is available.  */
+        return val == 0;
+    }
 }
 
 /* Test if a constant matches the constraint. */
@@ -1093,7 +1119,21 @@ static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
             }
             return tcg_cond_to_ltr_cond[c];
         } else {
-            tcg_abort();
+            if (c > TCG_COND_GT) {
+                /* unsigned */
+                if (type == TCG_TYPE_I32) {
+                    tcg_out_insn(s, RIL, CLFI, r1, c2);
+                } else {
+                    tcg_out_insn(s, RIL, CLGFI, r1, c2);
+                }
+            } else {
+                /* signed */
+                if (type == TCG_TYPE_I32) {
+                    tcg_out_insn(s, RIL, CFI, r1, c2);
+                } else {
+                    tcg_out_insn(s, RIL, CGFI, r1, c2);
+                }
+            }
         }
     } else {
         if (c > TCG_COND_GT) {
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 34/35] tcg-s390: Use COMPARE AND BRANCH instructions.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (32 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 33/35] tcg-s390: Use the COMPARE IMMEDIATE instructions " Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 35/35] tcg-s390: Enable compile in 32-bit mode Richard Henderson
  2010-06-08 13:11 ` [Qemu-devel] Re: [PATCH 00/35] S390 TCG target, version 2 Alexander Graf
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

These instructions are available with the general-instructions-extension
facility.  Use them if available.
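
The register forms (CRJ/CGRJ and friends) take any operands, but the
immediate forms carry only an 8-bit immediate, signed for CIJ/CGIJ and
unsigned for CLIJ/CLGIJ, so out-of-range constants fall back to a separate
compare plus branch.  A sketch mirroring the in_range logic in tgen_brcond
(the function name is invented for illustration):

    #include <stdint.h>

    static int cmp_branch_immediate_ok(int64_t c2, int is_unsigned, int is_64bit)
    {
        if (is_unsigned) {
            uint64_t u = is_64bit ? (uint64_t)c2 : (uint32_t)c2;
            return u == (uint8_t)u;
        } else {
            int64_t s = is_64bit ? c2 : (int32_t)c2;
            return s == (int8_t)s;
        }
    }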

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  102 +++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 95 insertions(+), 7 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 691a3f5..d5c26b8 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -119,6 +119,15 @@ typedef enum S390Opcode {
     RI_OILH     = 0xa50a,
     RI_OILL     = 0xa50b,
 
+    RIE_CGIJ    = 0xec7c,
+    RIE_CGRJ    = 0xec64,
+    RIE_CIJ     = 0xec7e,
+    RIE_CLGRJ   = 0xec65,
+    RIE_CLIJ    = 0xec7f,
+    RIE_CLGIJ   = 0xec7d,
+    RIE_CLRJ    = 0xec77,
+    RIE_CRJ     = 0xec76,
+
     RRE_AGR     = 0xb908,
     RRE_CGR     = 0xb920,
     RRE_CLGR    = 0xb921,
@@ -1110,6 +1119,7 @@ static void tgen64_xori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
 static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
                     TCGArg c2, int c2const)
 {
+    _Bool is_unsigned = (c > TCG_COND_GT);
     if (c2const) {
         if (c2 == 0) {
             if (type == TCG_TYPE_I32) {
@@ -1119,15 +1129,13 @@ static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
             }
             return tcg_cond_to_ltr_cond[c];
         } else {
-            if (c > TCG_COND_GT) {
-                /* unsigned */
+            if (is_unsigned) {
                 if (type == TCG_TYPE_I32) {
                     tcg_out_insn(s, RIL, CLFI, r1, c2);
                 } else {
                     tcg_out_insn(s, RIL, CLGFI, r1, c2);
                 }
             } else {
-                /* signed */
                 if (type == TCG_TYPE_I32) {
                     tcg_out_insn(s, RIL, CFI, r1, c2);
                 } else {
@@ -1136,15 +1144,13 @@ static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
             }
         }
     } else {
-        if (c > TCG_COND_GT) {
-            /* unsigned */
+        if (is_unsigned) {
             if (type == TCG_TYPE_I32) {
                 tcg_out_insn(s, RR, CLR, r1, c2);
             } else {
                 tcg_out_insn(s, RRE, CLGR, r1, c2);
             }
         } else {
-            /* signed */
             if (type == TCG_TYPE_I32) {
                 tcg_out_insn(s, RR, CR, r1, c2);
             } else {
@@ -1195,10 +1201,92 @@ static void tgen_branch(TCGContext *s, int cc, int labelno)
     }
 }
 
+static void tgen_compare_branch(TCGContext *s, S390Opcode opc, int cc,
+                                TCGReg r1, TCGReg r2, int labelno)
+{
+    TCGLabel* l = &s->labels[labelno];
+    tcg_target_long off;
+
+    if (l->has_value) {
+        off = (l->u.value - (tcg_target_long)s->code_ptr) >> 1;
+    } else {
+        /* We need to keep the offset unchanged for retranslation.  */
+        off = ((int16_t *)s->code_ptr)[1];
+        tcg_out_reloc(s, s->code_ptr + 2, R_390_PC16DBL, labelno, -2);
+    }
+
+    tcg_out16(s, (opc & 0xff00) | (r1 << 4) | r2);
+    tcg_out16(s, off);
+    tcg_out16(s, cc << 12 | (opc & 0xff));
+}
+
+static void tgen_compare_imm_branch(TCGContext *s, S390Opcode opc, int cc,
+                                    TCGReg r1, int i2, int labelno)
+{
+    TCGLabel* l = &s->labels[labelno];
+    tcg_target_long off;
+
+    if (l->has_value) {
+        off = (l->u.value - (tcg_target_long)s->code_ptr) >> 1;
+    } else {
+        /* We need to keep the offset unchanged for retranslation.  */
+        off = ((int16_t *)s->code_ptr)[1];
+        tcg_out_reloc(s, s->code_ptr + 2, R_390_PC16DBL, labelno, -2);
+    }
+
+    tcg_out16(s, (opc & 0xff00) | (r1 << 4) | cc);
+    tcg_out16(s, off);
+    tcg_out16(s, (i2 << 8) | (opc & 0xff));
+}
+
 static void tgen_brcond(TCGContext *s, TCGType type, TCGCond c,
                         TCGReg r1, TCGArg c2, int c2const, int labelno)
 {
-    int cc = tgen_cmp(s, type, c, r1, c2, c2const);
+    int cc;
+
+    if (facilities & FACILITY_GEN_INST_EXT) {
+        _Bool is_unsigned = (c > TCG_COND_GT);
+        _Bool in_range;
+        S390Opcode opc;
+
+        cc = tcg_cond_to_s390_cond[c];
+
+        if (!c2const) {
+            opc = (type == TCG_TYPE_I32
+                   ? (is_unsigned ? RIE_CLRJ : RIE_CRJ)
+                   : (is_unsigned ? RIE_CLGRJ : RIE_CGRJ));
+            tgen_compare_branch(s, opc, cc, r1, c2, labelno);
+            return;
+        }
+
+        /* COMPARE IMMEDIATE AND BRANCH RELATIVE has an 8-bit immediate field.
+           If the immediate we've been given does not fit that range, we'll
+           fall back to separate compare and branch instructions using the
+           larger comparison range afforded by COMPARE IMMEDIATE.  */
+        if (type == TCG_TYPE_I32) {
+            if (is_unsigned) {
+                opc = RIE_CLIJ;
+                in_range = (uint32_t)c2 == (uint8_t)c2;
+            } else {
+                opc = RIE_CIJ;
+                in_range = (int32_t)c2 == (int8_t)c2;
+            }
+        } else {
+            if (is_unsigned) {
+                opc = RIE_CLGIJ;
+                in_range = (uint64_t)c2 == (uint8_t)c2;
+            } else {
+                opc = RIE_CGIJ;
+                in_range = (int64_t)c2 == (int8_t)c2;
+            }
+        }
+        if (in_range) {
+            tgen_compare_imm_branch(s, opc, cc, r1, c2, labelno);
+            return;
+        }
+    }
+
+    cc = tgen_cmp(s, type, c, r1, c2, c2const);
     tgen_branch(s, cc, labelno);
 }
 
-- 
1.7.0.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [Qemu-devel] [PATCH 35/35] tcg-s390: Enable compile in 32-bit mode.
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (33 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 34/35] tcg-s390: Use COMPARE AND BRANCH instructions Richard Henderson
@ 2010-06-04 19:14 ` Richard Henderson
  2010-06-08 13:11 ` [Qemu-devel] Re: [PATCH 00/35] S390 TCG target, version 2 Alexander Graf
  35 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-04 19:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The TCG translator will *not* work in 32-bit mode, and there is a
check added to query_facilities to enforce that.

However, QEMU can run in KVM mode when built in 32-bit mode, and
this patch is just good enough to enable that method to continue.
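
One detail worth calling out in the diff: the "val >> 32" shifts become
"val >> 31 >> 1".  When tcg_target_long is only 32 bits wide, shifting by
the full width is undefined in C and draws a compiler warning, which is
presumably why the shift is split; on 64-bit hosts the result is unchanged.
A minimal illustration (the function name is invented; 'word' stands in for
a 32-bit tcg_target_ulong):

    #include <stdint.h>

    static uint32_t high_part_32bit_build(uint32_t word)
    {
        /* return word >> 32;      -- undefined: shift count equals the width */
        return word >> 31 >> 1;    /* well defined, always 0 for a 32-bit word */
    }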

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  386 +++++++++++++++++++++++++------------------------
 tcg/s390/tcg-target.h |    7 +
 2 files changed, 205 insertions(+), 188 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index d5c26b8..8719ed7 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -731,7 +731,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
             return;
         }
         if ((uval & 0xffffffff) == 0) {
-            tcg_out_insn(s, RIL, LLIHF, ret, uval >> 32);
+            tcg_out_insn(s, RIL, LLIHF, ret, uval >> 31 >> 1);
             return;
         }
     }
@@ -761,7 +761,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
            We first want to make sure that all the high bits get set.  With
            luck the low 16-bits can be considered negative to perform that for
            free, otherwise we load an explicit -1.  */
-        if (sval >> 32 == -1) {
+        if (sval >> 31 >> 1 == -1) {
             if (uval & 0x8000) {
                 tcg_out_insn(s, RI, LGHI, ret, uval);
             } else {
@@ -779,7 +779,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
     tcg_out_movi(s, TCG_TYPE_I32, ret, sval);
 
     /* Insert data into the high 32-bits.  */
-    uval >>= 32;
+    uval = uval >> 31 >> 1;
     if (facilities & FACILITY_EXT_IMM) {
         if (uval < 0x10000) {
             tcg_out_insn(s, RI, IIHL, ret, uval);
@@ -962,7 +962,7 @@ static inline void tgen_ext32u(TCGContext *s, TCGReg dest, TCGReg src)
     tcg_out_insn(s, RRE, LLGFR, dest, src);
 }
 
-static void tgen32_addi(TCGContext *s, TCGReg dest, int32_t val)
+static inline void tgen32_addi(TCGContext *s, TCGReg dest, int32_t val)
 {
     if (val == (int16_t)val) {
         tcg_out_insn(s, RI, AHI, dest, val);
@@ -971,7 +971,7 @@ static void tgen32_addi(TCGContext *s, TCGReg dest, int32_t val)
     }
 }
 
-static void tgen64_addi(TCGContext *s, TCGReg dest, int64_t val)
+static inline void tgen64_addi(TCGContext *s, TCGReg dest, int64_t val)
 {
     if (val == (int16_t)val) {
         tcg_out_insn(s, RI, AGHI, dest, val);
@@ -1112,7 +1112,7 @@ static void tgen64_xori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
         tcg_out_insn(s, RIL, XILF, dest, val);
     }
     if (val > 0xffffffff) {
-        tcg_out_insn(s, RIL, XIHF, dest, val >> 32);
+        tcg_out_insn(s, RIL, XIHF, dest, val >> 31 >> 1);
     }
 }
 
@@ -1593,6 +1593,15 @@ static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
 #endif
 }
 
+#if TCG_TARGET_REG_BITS == 64
+# define OP_32_64(x) \
+        case glue(glue(INDEX_op_,x),_i32): \
+        case glue(glue(INDEX_op_,x),_i64)
+#else
+# define OP_32_64(x) \
+        case glue(glue(INDEX_op_,x),_i32)
+#endif
+
 static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                 const TCGArg *args, const int *const_args)
 {
@@ -1625,21 +1634,18 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
-    case INDEX_op_ld8u_i32:
-    case INDEX_op_ld8u_i64:
+    OP_32_64(ld8u):
         /* ??? LLC (RXY format) is only present with the extended-immediate
            facility, whereas LLGC is always present.  */
         tcg_out_mem(s, 0, RXY_LLGC, args[0], args[1], TCG_REG_NONE, args[2]);
         break;
 
-    case INDEX_op_ld8s_i32:
-    case INDEX_op_ld8s_i64:
+    OP_32_64(ld8s):
         /* ??? LB is no smaller than LGB, so no point to using it.  */
         tcg_out_mem(s, 0, RXY_LGB, args[0], args[1], TCG_REG_NONE, args[2]);
         break;
 
-    case INDEX_op_ld16u_i32:
-    case INDEX_op_ld16u_i64:
+    OP_32_64(ld16u):
         /* ??? LLH (RXY format) is only present with the extended-immediate
            facility, whereas LLGH is always present.  */
         tcg_out_mem(s, 0, RXY_LLGH, args[0], args[1], TCG_REG_NONE, args[2]);
@@ -1648,45 +1654,25 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_ld16s_i32:
         tcg_out_mem(s, RX_LH, RXY_LHY, args[0], args[1], TCG_REG_NONE, args[2]);
         break;
-    case INDEX_op_ld16s_i64:
-        tcg_out_mem(s, 0, RXY_LGH, args[0], args[1], TCG_REG_NONE, args[2]);
-        break;
 
     case INDEX_op_ld_i32:
         tcg_out_ld(s, TCG_TYPE_I32, args[0], args[1], args[2]);
         break;
-    case INDEX_op_ld32u_i64:
-        tcg_out_mem(s, 0, RXY_LLGF, args[0], args[1], TCG_REG_NONE, args[2]);
-        break;
-    case INDEX_op_ld32s_i64:
-        tcg_out_mem(s, 0, RXY_LGF, args[0], args[1], TCG_REG_NONE, args[2]);
-        break;
-
-    case INDEX_op_ld_i64:
-        tcg_out_ld(s, TCG_TYPE_I64, args[0], args[1], args[2]);
-        break;
 
-    case INDEX_op_st8_i32:
-    case INDEX_op_st8_i64:
+    OP_32_64(st8):
         tcg_out_mem(s, RX_STC, RXY_STCY, args[0], args[1],
                     TCG_REG_NONE, args[2]);
         break;
 
-    case INDEX_op_st16_i32:
-    case INDEX_op_st16_i64:
+    OP_32_64(st16):
         tcg_out_mem(s, RX_STH, RXY_STHY, args[0], args[1],
                     TCG_REG_NONE, args[2]);
         break;
 
     case INDEX_op_st_i32:
-    case INDEX_op_st32_i64:
         tcg_out_st(s, TCG_TYPE_I32, args[0], args[1], args[2]);
         break;
 
-    case INDEX_op_st_i64:
-        tcg_out_st(s, TCG_TYPE_I64, args[0], args[1], args[2]);
-        break;
-
     case INDEX_op_add_i32:
         if (const_args[2]) {
             tgen32_addi(s, args[0], args[2]);
@@ -1694,14 +1680,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
             tcg_out_insn(s, RR, AR, args[0], args[2]);
         }
         break;
-    case INDEX_op_add_i64:
-        if (const_args[2]) {
-            tgen64_addi(s, args[0], args[2]);
-        } else {
-            tcg_out_insn(s, RRE, AGR, args[0], args[2]);
-        }
-        break;
-
     case INDEX_op_sub_i32:
         if (const_args[2]) {
             tgen32_addi(s, args[0], -args[2]);
@@ -1709,13 +1687,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
             tcg_out_insn(s, RR, SR, args[0], args[2]);
         }
         break;
-    case INDEX_op_sub_i64:
-        if (const_args[2]) {
-            tgen64_addi(s, args[0], -args[2]);
-        } else {
-            tcg_out_insn(s, RRE, SGR, args[0], args[2]);
-        }
-        break;
 
     case INDEX_op_and_i32:
         if (const_args[2]) {
@@ -1739,34 +1710,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
-    case INDEX_op_and_i64:
-        if (const_args[2]) {
-            tgen64_andi(s, args[0], args[2]);
-        } else {
-            tcg_out_insn(s, RRE, NGR, args[0], args[2]);
-        }
-        break;
-    case INDEX_op_or_i64:
-        if (const_args[2]) {
-            tgen64_ori(s, args[0], args[2]);
-        } else {
-            tcg_out_insn(s, RRE, OGR, args[0], args[2]);
-        }
-        break;
-    case INDEX_op_xor_i64:
-        if (const_args[2]) {
-            tgen64_xori(s, args[0], args[2]);
-        } else {
-            tcg_out_insn(s, RRE, XGR, args[0], args[2]);
-        }
-        break;
-
     case INDEX_op_neg_i32:
         tcg_out_insn(s, RR, LCR, args[0], args[1]);
         break;
-    case INDEX_op_neg_i64:
-        tcg_out_insn(s, RRE, LCGR, args[0], args[1]);
-        break;
 
     case INDEX_op_mul_i32:
         if (const_args[2]) {
@@ -1779,17 +1725,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
             tcg_out_insn(s, RRE, MSR, args[0], args[2]);
         }
         break;
-    case INDEX_op_mul_i64:
-        if (const_args[2]) {
-            if (args[2] == (int16_t)args[2]) {
-                tcg_out_insn(s, RI, MGHI, args[0], args[2]);
-            } else {
-                tcg_out_insn(s, RIL, MSGFI, args[0], args[2]);
-            }
-        } else {
-            tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
-        }
-        break;
 
     case INDEX_op_div2_i32:
         tcg_out_insn(s, RR, DR, TCG_REG_R2, args[4]);
@@ -1798,17 +1733,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_insn(s, RRE, DLR, TCG_REG_R2, args[4]);
         break;
 
-    case INDEX_op_div2_i64:
-        /* ??? We get an unnecessary sign-extension of the dividend
-           into R3 with this definition, but as we do in fact always
-           produce both quotient and remainder using INDEX_op_div_i64
-           instead requires jumping through even more hoops.  */
-        tcg_out_insn(s, RRE, DSGR, TCG_REG_R2, args[4]);
-        break;
-    case INDEX_op_divu2_i64:
-        tcg_out_insn(s, RRE, DLGR, TCG_REG_R2, args[4]);
-        break;
-
     case INDEX_op_shl_i32:
         op = RS_SLL;
     do_shift32:
@@ -1825,22 +1749,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         op = RS_SRA;
         goto do_shift32;
 
-    case INDEX_op_shl_i64:
-        op = RSY_SLLG;
-    do_shift64:
-        if (const_args[2]) {
-            tcg_out_sh64(s, op, args[0], args[1], TCG_REG_NONE, args[2]);
-        } else {
-            tcg_out_sh64(s, op, args[0], args[1], args[2], 0);
-        }
-        break;
-    case INDEX_op_shr_i64:
-        op = RSY_SRLG;
-        goto do_shift64;
-    case INDEX_op_sar_i64:
-        op = RSY_SRAG;
-        goto do_shift64;
-
     case INDEX_op_rotl_i32:
         /* ??? Using tcg_out_sh64 here for the format; it is a 32-bit rol.  */
         if (const_args[2]) {
@@ -1859,72 +1767,28 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
-    case INDEX_op_rotl_i64:
-        if (const_args[2]) {
-            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
-                         TCG_REG_NONE, args[2]);
-        } else {
-            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], args[2], 0);
-        }
-        break;
-    case INDEX_op_rotr_i64:
-        if (const_args[2]) {
-            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
-                         TCG_REG_NONE, (64 - args[2]) & 63);
-        } else {
-            /* We can use the smaller 32-bit negate because only the
-               low 6 bits are examined for the rotate.  */
-            tcg_out_insn(s, RR, LCR, TCG_TMP0, args[2]);
-            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], TCG_TMP0, 0);
-        }
-        break;
-
     case INDEX_op_ext8s_i32:
         tgen_ext8s(s, TCG_TYPE_I32, args[0], args[1]);
         break;
-    case INDEX_op_ext8s_i64:
-        tgen_ext8s(s, TCG_TYPE_I64, args[0], args[1]);
-        break;
     case INDEX_op_ext16s_i32:
         tgen_ext16s(s, TCG_TYPE_I32, args[0], args[1]);
         break;
-    case INDEX_op_ext16s_i64:
-        tgen_ext16s(s, TCG_TYPE_I64, args[0], args[1]);
-        break;
-    case INDEX_op_ext32s_i64:
-        tgen_ext32s(s, args[0], args[1]);
-        break;
-
     case INDEX_op_ext8u_i32:
         tgen_ext8u(s, TCG_TYPE_I32, args[0], args[1]);
         break;
-    case INDEX_op_ext8u_i64:
-        tgen_ext8u(s, TCG_TYPE_I64, args[0], args[1]);
-        break;
     case INDEX_op_ext16u_i32:
         tgen_ext16u(s, TCG_TYPE_I32, args[0], args[1]);
         break;
-    case INDEX_op_ext16u_i64:
-        tgen_ext16u(s, TCG_TYPE_I64, args[0], args[1]);
-        break;
-    case INDEX_op_ext32u_i64:
-        tgen_ext32u(s, args[0], args[1]);
-        break;
 
-    case INDEX_op_bswap16_i32:
-    case INDEX_op_bswap16_i64:
+    OP_32_64(bswap16):
         /* The TCG bswap definition requires bits 0-47 already be zero.
            Thus we don't need the G-type insns to implement bswap16_i64.  */
         tcg_out_insn(s, RRE, LRVR, args[0], args[1]);
         tcg_out_sh32(s, RS_SRL, args[0], TCG_REG_NONE, 16);
         break;
-    case INDEX_op_bswap32_i32:
-    case INDEX_op_bswap32_i64:
+    OP_32_64(bswap32):
         tcg_out_insn(s, RRE, LRVR, args[0], args[1]);
         break;
-    case INDEX_op_bswap64_i64:
-        tcg_out_insn(s, RRE, LRVGR, args[0], args[1]);
-        break;
 
     case INDEX_op_br:
         tgen_branch(s, S390_CC_ALWAYS, args[0]);
@@ -1934,46 +1798,27 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tgen_brcond(s, TCG_TYPE_I32, args[2], args[0],
                     args[1], const_args[1], args[3]);
         break;
-    case INDEX_op_brcond_i64:
-        tgen_brcond(s, TCG_TYPE_I64, args[2], args[0],
-                    args[1], const_args[1], args[3]);
-        break;
-
     case INDEX_op_setcond_i32:
         tgen_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1],
                      args[2], const_args[2]);
         break;
-    case INDEX_op_setcond_i64:
-        tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1],
-                     args[2], const_args[2]);
-        break;
 
     case INDEX_op_qemu_ld8u:
         tcg_out_qemu_ld(s, args, LD_UINT8);
         break;
-
     case INDEX_op_qemu_ld8s:
         tcg_out_qemu_ld(s, args, LD_INT8);
         break;
-
     case INDEX_op_qemu_ld16u:
         tcg_out_qemu_ld(s, args, LD_UINT16);
         break;
-
     case INDEX_op_qemu_ld16s:
         tcg_out_qemu_ld(s, args, LD_INT16);
         break;
-
     case INDEX_op_qemu_ld32:
         /* ??? Technically we can use a non-extending instruction.  */
-    case INDEX_op_qemu_ld32u:
         tcg_out_qemu_ld(s, args, LD_UINT32);
         break;
-
-    case INDEX_op_qemu_ld32s:
-        tcg_out_qemu_ld(s, args, LD_INT32);
-        break;
-
     case INDEX_op_qemu_ld64:
         tcg_out_qemu_ld(s, args, LD_UINT64);
         break;
@@ -1981,23 +1826,178 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_qemu_st8:
         tcg_out_qemu_st(s, args, LD_UINT8);
         break;
-
     case INDEX_op_qemu_st16:
         tcg_out_qemu_st(s, args, LD_UINT16);
         break;
-
     case INDEX_op_qemu_st32:
         tcg_out_qemu_st(s, args, LD_UINT32);
         break;
-
     case INDEX_op_qemu_st64:
         tcg_out_qemu_st(s, args, LD_UINT64);
         break;
 
-    case INDEX_op_mov_i32:
-    case INDEX_op_mov_i64:
-    case INDEX_op_movi_i32:
-    case INDEX_op_movi_i64:
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_ld16s_i64:
+        tcg_out_mem(s, 0, RXY_LGH, args[0], args[1], TCG_REG_NONE, args[2]);
+        break;
+    case INDEX_op_ld32u_i64:
+        tcg_out_mem(s, 0, RXY_LLGF, args[0], args[1], TCG_REG_NONE, args[2]);
+        break;
+    case INDEX_op_ld32s_i64:
+        tcg_out_mem(s, 0, RXY_LGF, args[0], args[1], TCG_REG_NONE, args[2]);
+        break;
+    case INDEX_op_ld_i64:
+        tcg_out_ld(s, TCG_TYPE_I64, args[0], args[1], args[2]);
+        break;
+
+    case INDEX_op_st32_i64:
+        tcg_out_st(s, TCG_TYPE_I32, args[0], args[1], args[2]);
+        break;
+    case INDEX_op_st_i64:
+        tcg_out_st(s, TCG_TYPE_I64, args[0], args[1], args[2]);
+        break;
+
+    case INDEX_op_add_i64:
+        if (const_args[2]) {
+            tgen64_addi(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, AGR, args[0], args[2]);
+        }
+        break;
+    case INDEX_op_sub_i64:
+        if (const_args[2]) {
+            tgen64_addi(s, args[0], -args[2]);
+        } else {
+            tcg_out_insn(s, RRE, SGR, args[0], args[2]);
+        }
+        break;
+
+    case INDEX_op_and_i64:
+        if (const_args[2]) {
+            tgen64_andi(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, NGR, args[0], args[2]);
+        }
+        break;
+    case INDEX_op_or_i64:
+        if (const_args[2]) {
+            tgen64_ori(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, OGR, args[0], args[2]);
+        }
+        break;
+    case INDEX_op_xor_i64:
+        if (const_args[2]) {
+            tgen64_xori(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, XGR, args[0], args[2]);
+        }
+        break;
+
+    case INDEX_op_neg_i64:
+        tcg_out_insn(s, RRE, LCGR, args[0], args[1]);
+        break;
+    case INDEX_op_bswap64_i64:
+        tcg_out_insn(s, RRE, LRVGR, args[0], args[1]);
+        break;
+
+    case INDEX_op_mul_i64:
+        if (const_args[2]) {
+            if (args[2] == (int16_t)args[2]) {
+                tcg_out_insn(s, RI, MGHI, args[0], args[2]);
+            } else {
+                tcg_out_insn(s, RIL, MSGFI, args[0], args[2]);
+            }
+        } else {
+            tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
+        }
+        break;
+
+    case INDEX_op_div2_i64:
+        /* ??? We get an unnecessary sign-extension of the dividend
+           into R3 with this definition, but as we do in fact always
+           produce both quotient and remainder using INDEX_op_div_i64
+           instead requires jumping through even more hoops.  */
+        tcg_out_insn(s, RRE, DSGR, TCG_REG_R2, args[4]);
+        break;
+    case INDEX_op_divu2_i64:
+        tcg_out_insn(s, RRE, DLGR, TCG_REG_R2, args[4]);
+        break;
+
+    case INDEX_op_shl_i64:
+        op = RSY_SLLG;
+    do_shift64:
+        if (const_args[2]) {
+            tcg_out_sh64(s, op, args[0], args[1], TCG_REG_NONE, args[2]);
+        } else {
+            tcg_out_sh64(s, op, args[0], args[1], args[2], 0);
+        }
+        break;
+    case INDEX_op_shr_i64:
+        op = RSY_SRLG;
+        goto do_shift64;
+    case INDEX_op_sar_i64:
+        op = RSY_SRAG;
+        goto do_shift64;
+
+    case INDEX_op_rotl_i64:
+        if (const_args[2]) {
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
+                         TCG_REG_NONE, args[2]);
+        } else {
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], args[2], 0);
+        }
+        break;
+    case INDEX_op_rotr_i64:
+        if (const_args[2]) {
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
+                         TCG_REG_NONE, (64 - args[2]) & 63);
+        } else {
+            /* We can use the smaller 32-bit negate because only the
+               low 6 bits are examined for the rotate.  */
+            tcg_out_insn(s, RR, LCR, TCG_TMP0, args[2]);
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], TCG_TMP0, 0);
+        }
+        break;
+
+    case INDEX_op_ext8s_i64:
+        tgen_ext8s(s, TCG_TYPE_I64, args[0], args[1]);
+        break;
+    case INDEX_op_ext16s_i64:
+        tgen_ext16s(s, TCG_TYPE_I64, args[0], args[1]);
+        break;
+    case INDEX_op_ext32s_i64:
+        tgen_ext32s(s, args[0], args[1]);
+        break;
+    case INDEX_op_ext8u_i64:
+        tgen_ext8u(s, TCG_TYPE_I64, args[0], args[1]);
+        break;
+    case INDEX_op_ext16u_i64:
+        tgen_ext16u(s, TCG_TYPE_I64, args[0], args[1]);
+        break;
+    case INDEX_op_ext32u_i64:
+        tgen_ext32u(s, args[0], args[1]);
+        break;
+
+    case INDEX_op_brcond_i64:
+        tgen_brcond(s, TCG_TYPE_I64, args[2], args[0],
+                    args[1], const_args[1], args[3]);
+        break;
+    case INDEX_op_setcond_i64:
+        tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1],
+                     args[2], const_args[2]);
+        break;
+
+    case INDEX_op_qemu_ld32u:
+        tcg_out_qemu_ld(s, args, LD_UINT32);
+        break;
+    case INDEX_op_qemu_ld32s:
+        tcg_out_qemu_ld(s, args, LD_INT32);
+        break;
+#endif /* TCG_TARGET_REG_BITS == 64 */
+
+    OP_32_64(mov):
+    OP_32_64(movi):
         /* These are always emitted by TCG directly.  */
     case INDEX_op_jmp:
         /* This one is obsolete and never emitted.  */
@@ -2064,8 +2064,6 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_qemu_ld8s, { "r", "L" } },
     { INDEX_op_qemu_ld16u, { "r", "L" } },
     { INDEX_op_qemu_ld16s, { "r", "L" } },
-    { INDEX_op_qemu_ld32u, { "r", "L" } },
-    { INDEX_op_qemu_ld32s, { "r", "L" } },
     { INDEX_op_qemu_ld32, { "r", "L" } },
     { INDEX_op_qemu_ld64, { "r", "L" } },
 
@@ -2124,6 +2122,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_brcond_i64, { "r", "rC" } },
     { INDEX_op_setcond_i64, { "r", "r", "rC" } },
+
+    { INDEX_op_qemu_ld32u, { "r", "L" } },
+    { INDEX_op_qemu_ld32s, { "r", "L" } },
 #endif
 
     { -1 },
@@ -2217,13 +2218,22 @@ static void query_facilities(void)
        worthwhile, since even the KVM target requires z/Arch.  */
     fail = 0;
     if ((facilities & FACILITY_ZARCH_ACTIVE) == 0) {
-        fprintf(stderr, "TCG: z/Arch facility is required\n");
+        fprintf(stderr, "TCG: z/Arch facility is required.\n");
+        fprintf(stderr, "TCG: Boot with a 64-bit enabled kernel.\n");
         fail = 1;
     }
     if ((facilities & FACILITY_LONG_DISP) == 0) {
-        fprintf(stderr, "TCG: long-displacement facility is required\n");
+        fprintf(stderr, "TCG: long-displacement facility is required.\n");
         fail = 1;
     }
+
+    /* So far there's just enough support for 31-bit mode to let the
+       compile succeed.  This is good enough to run QEMU with KVM.  */
+    if (sizeof(void *) != 8) {
+        fprintf(stderr, "TCG: 31-bit mode is not supported.\n");
+        fail = 1;
+    }
+
     if (fail) {
         exit(-1);
     }
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 390c587..4e45cf3 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -23,7 +23,12 @@
  */
 #define TCG_TARGET_S390 1
 
+#ifdef __s390x__
 #define TCG_TARGET_REG_BITS 64
+#else
+#define TCG_TARGET_REG_BITS 32
+#endif
+
 #define TCG_TARGET_WORDS_BIGENDIAN
 
 typedef enum TCGReg {
@@ -64,6 +69,7 @@ typedef enum TCGReg {
 // #define TCG_TARGET_HAS_nand_i32
 // #define TCG_TARGET_HAS_nor_i32
 
+#if TCG_TARGET_REG_BITS == 64
 #define TCG_TARGET_HAS_div2_i64
 #define TCG_TARGET_HAS_rot_i64
 #define TCG_TARGET_HAS_ext8s_i64
@@ -82,6 +88,7 @@ typedef enum TCGReg {
 // #define TCG_TARGET_HAS_eqv_i64
 // #define TCG_TARGET_HAS_nand_i64
 // #define TCG_TARGET_HAS_nor_i64
+#endif
 
 #define TCG_TARGET_HAS_GUEST_BASE
 
-- 
1.7.0.1

* [Qemu-devel] Re: [PATCH 00/35] S390 TCG target, version 2
  2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
                   ` (34 preceding siblings ...)
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 35/35] tcg-s390: Enable compile in 32-bit mode Richard Henderson
@ 2010-06-08 13:11 ` Alexander Graf
  35 siblings, 0 replies; 75+ messages in thread
From: Alexander Graf @ 2010-06-08 13:11 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, aurelien


On 04.06.2010, at 21:14, Richard Henderson wrote:

> Changes v1->v2
>  * Disassembler doesn't include GPLv3 code.
> 
>  * Smashed about 20 commits into the "New TCG Target" patch,
>    which is fully functional from the beginning.
> 
>  * Merged a lot of the follow-on patches such that introducing
>    the use of an instruction and conditionalizing the use of
>    the instruction on the active ISA features is not split into
>    two separate patches.
> 
> This patch series is available at
> 
>  git://repo.or.cz/qemu/rth.git tcg-s390-3

It works for me, it improves the current state, and I see no reason not to take it! Let's get S390x host support out for 0.13!

Acked-by: Alexander Graf <agraf@suse.de>


Alex

* Re: [Qemu-devel] [PATCH 08/35] s390: Update disassembler to the last GPLv2 from binutils.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 08/35] s390: Update disassembler to the last GPLv2 from binutils Richard Henderson
@ 2010-06-09 22:47   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-09 22:47 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:16PM -0700, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  s390-dis.c |   81 +++++++++++++++++++++++++++++++++++++-----------------------
>  1 files changed, 50 insertions(+), 31 deletions(-)

Thanks, applied.

> diff --git a/s390-dis.c b/s390-dis.c
> index 86dd84f..3d96be0 100644
> --- a/s390-dis.c
> +++ b/s390-dis.c
> @@ -1,3 +1,4 @@
> +/* opcodes/s390-dis.c revision 1.12 */
>  /* s390-dis.c -- Disassemble S390 instructions
>     Copyright 2000, 2001, 2002, 2003, 2005 Free Software Foundation, Inc.
>     Contributed by Martin Schwidefsky (schwidefsky@de.ibm.com).
> @@ -15,11 +16,14 @@
>     GNU General Public License for more details.
>  
>     You should have received a copy of the GNU General Public License
> -   along with this program; if not, see <http://www.gnu.org/licenses/>.  */
> +   along with this program; if not, write to the Free Software
> +   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
> +   02110-1301, USA.  */
>  
> -#include <stdio.h>
> +#include "qemu-common.h"
>  #include "dis-asm.h"
>  
> +/* include/opcode/s390.h revision 1.9 */
>  /* s390.h -- Header file for S390 opcode table
>     Copyright 2000, 2001, 2003 Free Software Foundation, Inc.
>     Contributed by Martin Schwidefsky (schwidefsky@de.ibm.com).
> @@ -37,7 +41,9 @@
>     GNU General Public License for more details.
>  
>     You should have received a copy of the GNU General Public License
> -   along with this program; if not, see <http://www.gnu.org/licenses/>.  */
> +   along with this program; if not, write to the Free Software
> +   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
> +   02110-1301, USA.  */
>  
>  #ifndef S390_H
>  #define S390_H
> @@ -57,7 +63,8 @@ enum s390_opcode_cpu_val
>      S390_OPCODE_Z900,
>      S390_OPCODE_Z990,
>      S390_OPCODE_Z9_109,
> -    S390_OPCODE_Z9_EC
> +    S390_OPCODE_Z9_EC,
> +    S390_OPCODE_Z10
>    };
>  
>  /* The opcode table is an array of struct s390_opcode.  */
> @@ -95,12 +102,13 @@ struct s390_opcode
>  /* The table itself is sorted by major opcode number, and is otherwise
>     in the order in which the disassembler should consider
>     instructions.  */
> -extern const struct s390_opcode s390_opcodes[];
> -extern const int                s390_num_opcodes;
> +/* QEMU: Mark these static.  */
> +static const struct s390_opcode s390_opcodes[];
> +static const int                s390_num_opcodes;
>  
>  /* A opcode format table for the .insn pseudo mnemonic.  */
> -extern const struct s390_opcode s390_opformats[];
> -extern const int                s390_num_opformats;
> +static const struct s390_opcode s390_opformats[];
> +static const int                s390_num_opformats;
>  
>  /* Values defined for the flags field of a struct powerpc_opcode.  */
>  
> @@ -121,7 +129,7 @@ struct s390_operand
>  /* Elements in the table are retrieved by indexing with values from
>     the operands field of the powerpc_opcodes table.  */
>  
> -extern const struct s390_operand s390_operands[];
> +static const struct s390_operand s390_operands[];
>  
>  /* Values defined for the flags field of a struct s390_operand.  */
>  
> @@ -164,12 +172,13 @@ extern const struct s390_operand s390_operands[];
>     the instruction may be optional.  */
>  #define S390_OPERAND_OPTIONAL 0x400
>  
> -	#endif /* S390_H */
> -
> +#endif /* S390_H */
>  
>  static int init_flag = 0;
>  static int opc_index[256];
> -static int current_arch_mask = 0;
> +
> +/* QEMU: We've disabled the architecture check below.  */
> +/* static int current_arch_mask = 0; */
>  
>  /* Set up index table for first opcode byte.  */
>  
> @@ -188,17 +197,21 @@ init_disasm (struct disassemble_info *info)
>  	     (opcode[1].opcode[0] == opcode->opcode[0]))
>  	opcode++;
>      }
> -//  switch (info->mach)
> -//    {
> -//    case bfd_mach_s390_31:
> +
> +#ifdef QEMU_DISABLE
> +  switch (info->mach)
> +    {
> +    case bfd_mach_s390_31:
>        current_arch_mask = 1 << S390_OPCODE_ESA;
> -//      break;
> -//    case bfd_mach_s390_64:
> -//      current_arch_mask = 1 << S390_OPCODE_ZARCH;
> -//      break;
> -//    default:
> -//      abort ();
> -//    }
> +      break;
> +    case bfd_mach_s390_64:
> +      current_arch_mask = 1 << S390_OPCODE_ZARCH;
> +      break;
> +    default:
> +      abort ();
> +    }
> +#endif /* QEMU_DISABLE */
> +
>    init_flag = 1;
>  }
>  
> @@ -297,9 +310,12 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
>  	  const struct s390_operand *operand;
>  	  const unsigned char *opindex;
>  
> +#ifdef QEMU_DISABLE
>  	  /* Check architecture.  */
>  	  if (!(opcode->modes & current_arch_mask))
>  	    continue;
> +#endif /* QEMU_DISABLE */
> +
>  	  /* Check signature of the opcode.  */
>  	  if ((buffer[1] & opcode->mask[1]) != opcode->opcode[1]
>  	      || (buffer[2] & opcode->mask[2]) != opcode->opcode[2]
> @@ -392,6 +408,8 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
>        return 1;
>      }
>  }
> +
> +/* opcodes/s390-opc.c revision 1.16 */
>  /* s390-opc.c -- S390 opcode list
>     Copyright 2000, 2001, 2003 Free Software Foundation, Inc.
>     Contributed by Martin Schwidefsky (schwidefsky@de.ibm.com).
> @@ -409,9 +427,9 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
>     GNU General Public License for more details.
>  
>     You should have received a copy of the GNU General Public License
> -   along with this program; if not, see <http://www.gnu.org/licenses/>.  */
> -
> -#include <stdio.h>
> +   along with this program; if not, write to the Free Software
> +   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
> +   02110-1301, USA.  */
>  
>  /* This file holds the S390 opcode table.  The opcode table
>     includes almost all of the extended instruction mnemonics.  This
> @@ -427,7 +445,7 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
>  /* The operands table.
>     The fields are bits, shift, insert, extract, flags.  */
>  
> -const struct s390_operand s390_operands[] =
> +static const struct s390_operand s390_operands[] =
>  {
>  #define UNUSED 0
>    { 0, 0, 0 },                    /* Indicates the end of the operand list */
> @@ -563,7 +581,7 @@ const struct s390_operand s390_operands[] =
>        quite close.
>  
>        For example the instruction "mvo" is defined in the PoP as follows:
> -
> +      
>        MVO  D1(L1,B1),D2(L2,B2)   [SS]
>  
>        --------------------------------------
> @@ -739,7 +757,7 @@ const struct s390_operand s390_operands[] =
>  
>  /* The opcode formats table (blueprints for .insn pseudo mnemonic).  */
>  
> -const struct s390_opcode s390_opformats[] =
> +static const struct s390_opcode s390_opformats[] =
>    {
>    { "e",	OP8(0x00LL),	MASK_E,		INSTR_E,	3, 0 },
>    { "ri",	OP8(0x00LL),	MASK_RI_RI,	INSTR_RI_RI,	3, 0 },
> @@ -765,9 +783,10 @@ const struct s390_opcode s390_opformats[] =
>    { "ssf",	OP8(0x00LL),	MASK_SSF_RRDRD,	INSTR_SSF_RRDRD,3, 0 },
>  };
>  
> -const int s390_num_opformats =
> +static const int s390_num_opformats =
>    sizeof (s390_opformats) / sizeof (s390_opformats[0]);
>  
> +/* include "s390-opc.tab" generated from opcodes/s390-opc.txt rev 1.17 */
>  /* The opcode table. This file was generated by s390-mkopc.
>  
>     The format of the opcode table is:
> @@ -783,7 +802,7 @@ const int s390_num_opformats =
>     The disassembler reads the table in order and prints the first
>     instruction which matches.  */
>  
> -const struct s390_opcode s390_opcodes[] =
> +static const struct s390_opcode s390_opcodes[] =
>    {
>    { "dp", OP8(0xfdLL), MASK_SS_LLRDRD, INSTR_SS_LLRDRD, 3, 0},
>    { "mp", OP8(0xfcLL), MASK_SS_LLRDRD, INSTR_SS_LLRDRD, 3, 0},
> @@ -1700,5 +1719,5 @@ const struct s390_opcode s390_opcodes[] =
>    { "pr", OP16(0x0101LL), MASK_E, INSTR_E, 3, 0}
>  };
>  
> -const int s390_num_opcodes =
> +static const int s390_num_opcodes =
>    sizeof (s390_opcodes) / sizeof (s390_opcodes[0]);
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH 09/35] s390: Disassemble some general-instruction-extension insns.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 09/35] s390: Disassemble some general-instruction-extension insns Richard Henderson
@ 2010-06-09 22:47   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-09 22:47 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:17PM -0700, Richard Henderson wrote:
> The full general-instruction-extension facility was added to binutils
> after the change to GPLv3.  This is not the entire extension, just
> what we're using in TCG.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  s390-dis.c |   89 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 files changed, 81 insertions(+), 8 deletions(-)

Thanks, applied.

> diff --git a/s390-dis.c b/s390-dis.c
> index 3d96be0..2eed69b 100644
> --- a/s390-dis.c
> +++ b/s390-dis.c
> @@ -172,6 +172,31 @@ static const struct s390_operand s390_operands[];
>     the instruction may be optional.  */
>  #define S390_OPERAND_OPTIONAL 0x400
>  
> +/* QEMU-ADD */
> +/* ??? Not quite the format the assembler takes, but easy to implement
> +   without recourse to the table generator.  */
> +#define S390_OPERAND_CCODE  0x800
> +
> +static const char s390_ccode_name[16][4] = {
> +    "n",    /* 0000 */
> +    "o",    /* 0001 */
> +    "h",    /* 0010 */
> +    "nle",  /* 0011 */
> +    "l",    /* 0100 */
> +    "nhe",  /* 0101 */
> +    "lh",   /* 0110 */
> +    "ne",   /* 0111 */
> +    "e",    /* 1000 */
> +    "nlh",  /* 1001 */
> +    "he",   /* 1010 */
> +    "nl",   /* 1011 */
> +    "le",   /* 1100 */
> +    "nh",   /* 1101 */
> +    "no",   /* 1110 */
> +    "a"     /* 1111 */
> +};
> +/* QEMU-END */
> +
>  #endif /* S390_H */
>  
>  static int init_flag = 0;
> @@ -325,13 +350,16 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
>  	    continue;
>  
>  	  /* The instruction is valid.  */
> -	  if (opcode->operands[0] != 0)
> -	    (*info->fprintf_func) (info->stream, "%s\t", opcode->name);
> -	  else
> -	    (*info->fprintf_func) (info->stream, "%s", opcode->name);
> +/* QEMU-MOD */
> +         (*info->fprintf_func) (info->stream, "%s", opcode->name);
> +
> +         if (s390_operands[opcode->operands[0]].flags & S390_OPERAND_CCODE)
> +           separator = 0;
> +         else
> +           separator = '\t';
> +/* QEMU-END */
>  
>  	  /* Extract the operands.  */
> -	  separator = 0;
>  	  for (opindex = opcode->operands; *opindex != 0; opindex++)
>  	    {
>  	      unsigned int value;
> @@ -363,6 +391,15 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
>  		(*info->print_address_func) (memaddr + (int) value, info);
>  	      else if (operand->flags & S390_OPERAND_SIGNED)
>  		(*info->fprintf_func) (info->stream, "%i", (int) value);
> +/* QEMU-ADD */
> +              else if (operand->flags & S390_OPERAND_CCODE)
> +                {
> +		  (*info->fprintf_func) (info->stream, "%s",
> +                                         s390_ccode_name[(int) value]);
> +                  separator = '\t';
> +                  continue;
> +                }
> +/* QEMU-END */
>  	      else
>  		(*info->fprintf_func) (info->stream, "%u", value);
>  
> @@ -543,8 +580,16 @@ static const struct s390_operand s390_operands[] =
>  #define M_16   42                 /* 4 bit optional mask starting at 16 */
>    { 4, 16, S390_OPERAND_OPTIONAL },
>  #define RO_28  43                 /* optional GPR starting at position 28 */
> -  { 4, 28, (S390_OPERAND_GPR | S390_OPERAND_OPTIONAL) }
> -
> +  { 4, 28, (S390_OPERAND_GPR | S390_OPERAND_OPTIONAL) },
> +
> +/* QEMU-ADD: */
> +#define M4_12 44                  /* 4-bit condition-code starting at 12 */
> +  { 4, 12, S390_OPERAND_CCODE },
> +#define M4_32 45                  /* 4-bit condition-code starting at 32 */
> +  { 4, 32, S390_OPERAND_CCODE },
> +#define I8_32 46                  /* 8 bit signed value starting at 32 */
> +  { 8, 32, S390_OPERAND_SIGNED },
> +/* QEMU-END */
>  };
>  
>  
> @@ -755,6 +800,14 @@ static const struct s390_operand s390_operands[] =
>  #define MASK_S_RD        { 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 }
>  #define MASK_SSF_RRDRD   { 0xff, 0x0f, 0x00, 0x00, 0x00, 0x00 }
>  
> +/* QEMU-ADD: */
> +#define INSTR_RIE_MRRP   6, { M4_32,R_8,R_12,J16_16,0,0 }	/* e.g. crj */
> +#define MASK_RIE_MRRP    { 0xff, 0x00, 0x00, 0x00, 0x0f, 0xff }
> +
> +#define INSTR_RIE_MRIP   6, { M4_12,R_8,I8_32,J16_16,0,0 }      /* e.g. cij */
> +#define MASK_RIE_MRIP    { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
> +/* QEMU-END */
> +
>  /* The opcode formats table (blueprints for .insn pseudo mnemonic).  */
>  
>  static const struct s390_opcode s390_opformats[] =
> @@ -1092,6 +1145,10 @@ static const struct s390_opcode s390_opcodes[] =
>    { "agfi", OP16(0xc208LL), MASK_RIL_RI, INSTR_RIL_RI, 2, 4},
>    { "slfi", OP16(0xc205LL), MASK_RIL_RU, INSTR_RIL_RU, 2, 4},
>    { "slgfi", OP16(0xc204LL), MASK_RIL_RU, INSTR_RIL_RU, 2, 4},
> +/* QEMU-ADD: */
> +  { "msfi",  OP16(0xc201ll), MASK_RIL_RI, INSTR_RIL_RI, 3, 6},
> +  { "msgfi", OP16(0xc200ll), MASK_RIL_RI, INSTR_RIL_RI, 3, 6},
> +/* QEMU-END */
>    { "jg", OP16(0xc0f4LL), MASK_RIL_0P, INSTR_RIL_0P, 3, 2},
>    { "jgno", OP16(0xc0e4LL), MASK_RIL_0P, INSTR_RIL_0P, 3, 2},
>    { "jgnh", OP16(0xc0d4LL), MASK_RIL_0P, INSTR_RIL_0P, 3, 2},
> @@ -1716,7 +1773,23 @@ static const struct s390_opcode s390_opcodes[] =
>    { "pfpo", OP16(0x010aLL), MASK_E, INSTR_E, 2, 5},
>    { "sckpf", OP16(0x0107LL), MASK_E, INSTR_E, 3, 0},
>    { "upt", OP16(0x0102LL), MASK_E, INSTR_E, 3, 0},
> -  { "pr", OP16(0x0101LL), MASK_E, INSTR_E, 3, 0}
> +  { "pr", OP16(0x0101LL), MASK_E, INSTR_E, 3, 0},
> +
> +/* QEMU-ADD: */
> +  { "crj",   OP48(0xec0000000076LL), MASK_RIE_MRRP, INSTR_RIE_MRRP, 3, 6},
> +  { "cgrj",  OP48(0xec0000000064LL), MASK_RIE_MRRP, INSTR_RIE_MRRP, 3, 6},
> +  { "clrj",  OP48(0xec0000000077LL), MASK_RIE_MRRP, INSTR_RIE_MRRP, 3, 6},
> +  { "clgrj", OP48(0xec0000000065LL), MASK_RIE_MRRP, INSTR_RIE_MRRP, 3, 6},
> +
> +  { "cij",   OP48(0xec000000007eLL), MASK_RIE_MRIP, INSTR_RIE_MRIP, 3, 6},
> +  { "cgij",  OP48(0xec000000007cLL), MASK_RIE_MRIP, INSTR_RIE_MRIP, 3, 6},
> +  { "clij",  OP48(0xec000000007fLL), MASK_RIE_MRIP, INSTR_RIE_MRIP, 3, 6},
> +  { "clgij", OP48(0xec000000007dLL), MASK_RIE_MRIP, INSTR_RIE_MRIP, 3, 6},
> +
> +  { "lrl",   OP16(0xc40dll), MASK_RIL_RP, INSTR_RIL_RP, 3, 6},
> +  { "lgrl",  OP16(0xc408ll), MASK_RIL_RP, INSTR_RIL_RP, 3, 6},
> +  { "lgfrl", OP16(0xc40cll), MASK_RIL_RP, INSTR_RIL_RP, 3, 6},
> +/* QEMU-END */
>  };
>  
>  static const int s390_num_opcodes =
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH 01/35] tcg-s390: Adjust compilation flags.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 01/35] tcg-s390: Adjust compilation flags Richard Henderson
@ 2010-06-09 22:53   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-09 22:53 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:09PM -0700, Richard Henderson wrote:
> Force -m31/-m64 based on s390/s390x target.
> 
> Force -march=z990.  The TCG backend will always require the
> long-displacement facility, so the compiler may as well make
> use of that as well.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  configure |    7 ++++++-
>  1 files changed, 6 insertions(+), 1 deletions(-)

Thanks, applied.

> diff --git a/configure b/configure
> index 653c8d2..65f87a2 100755
> --- a/configure
> +++ b/configure
> @@ -697,7 +697,12 @@ case "$cpu" in
>             fi
>             ;;
>      s390)
> -           QEMU_CFLAGS="-march=z900 $QEMU_CFLAGS"
> +           QEMU_CFLAGS="-m31 -march=z990 $QEMU_CFLAGS"
> +           LDFLAGS="-m31 $LDFLAGS"
> +           ;;
> +    s390x)
> +           QEMU_CFLAGS="-m64 -march=z990 $QEMU_CFLAGS"
> +           LDFLAGS="-m64 $LDFLAGS"
>             ;;
>      i386)
>             QEMU_CFLAGS="-m32 $QEMU_CFLAGS"
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH 02/35] s390x: Avoid _llseek.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 02/35] s390x: Avoid _llseek Richard Henderson
@ 2010-06-09 22:54   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-09 22:54 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:10PM -0700, Richard Henderson wrote:
> There's no _llseek on s390x either.  Replace the existing
> test for __x86_64__ with a functional test for __NR_llseek.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  linux-user/syscall.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)

Thanks, applied.

> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 8222cb9..e94f1ee 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -208,7 +208,7 @@ _syscall3(int, sys_getdents, uint, fd, struct linux_dirent *, dirp, uint, count)
>  _syscall3(int, sys_getdents64, uint, fd, struct linux_dirent64 *, dirp, uint, count);
>  #endif
>  _syscall2(int, sys_getpriority, int, which, int, who);
> -#if defined(TARGET_NR__llseek) && !defined (__x86_64__)
> +#if defined(TARGET_NR__llseek) && defined(__NR_llseek)
>  _syscall5(int, _llseek,  uint,  fd, ulong, hi, ulong, lo,
>            loff_t *, res, uint, wh);
>  #endif
> @@ -5933,7 +5933,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>  #ifdef TARGET_NR__llseek /* Not on alpha */
>      case TARGET_NR__llseek:
>          {
> -#if defined (__x86_64__)
> +#if !defined(__NR_llseek)
>              ret = get_errno(lseek(arg1, ((uint64_t )arg2 << 32) | arg3, arg5));
>              if (put_user_s64(ret, arg4))
>                  goto efault;
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH 03/35] s390x: Don't use a linker script for user-only.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 03/35] s390x: Don't use a linker script for user-only Richard Henderson
@ 2010-06-09 22:54   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-09 22:54 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:11PM -0700, Richard Henderson wrote:
> The default placement of the application at 0x80000000 is fine,
> and will avoid the default placement for most other guests.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  configure |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)

Thanks, applied.

> diff --git a/configure b/configure
> index 65f87a2..7f5b5b2 100755
> --- a/configure
> +++ b/configure
> @@ -2758,6 +2758,9 @@ if test "$target_linux_user" = "yes" -o "$target_bsd_user" = "yes" ; then
>      # -static is used to avoid g1/g3 usage by the dynamic linker
>      ldflags="$linker_script -static $ldflags"
>      ;;
> +  alpha | s390x)
> +    # The default placement of the application is fine.
> +    ;;
>    *)
>      ldflags="$linker_script $ldflags"
>      ;;
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH 04/35] tcg-s390: Compute is_write in cpu_signal_handler.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 04/35] tcg-s390: Compute is_write in cpu_signal_handler Richard Henderson
@ 2010-06-09 22:54   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-09 22:54 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:12PM -0700, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  cpu-exec.c |   42 +++++++++++++++++++++++++++++++++++++++---
>  1 files changed, 39 insertions(+), 3 deletions(-)

Thanks, applied.

> diff --git a/cpu-exec.c b/cpu-exec.c
> index c776605..026980a 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -1156,11 +1156,47 @@ int cpu_signal_handler(int host_signum, void *pinfo,
>      siginfo_t *info = pinfo;
>      struct ucontext *uc = puc;
>      unsigned long pc;
> -    int is_write;
> +    uint16_t *pinsn;
> +    int is_write = 0;
>  
>      pc = uc->uc_mcontext.psw.addr;
> -    /* XXX: compute is_write */
> -    is_write = 0;
> +
> +    /* ??? On linux, the non-rt signal handler has 4 (!) arguments instead
> +       of the normal 2 arguments.  The 3rd argument contains the "int_code"
> +       from the hardware which does in fact contain the is_write value.
> +       The rt signal handler, as far as I can tell, does not give this value
> +       at all.  Not that we could get to it from here even if it were.  */
> +    /* ??? This is not even close to complete, since it ignores all
> +       of the read-modify-write instructions.  */
> +    pinsn = (uint16_t *)pc;
> +    switch (pinsn[0] >> 8) {
> +    case 0x50: /* ST */
> +    case 0x42: /* STC */
> +    case 0x40: /* STH */
> +        is_write = 1;
> +        break;
> +    case 0xc4: /* RIL format insns */
> +        switch (pinsn[0] & 0xf) {
> +        case 0xf: /* STRL */
> +        case 0xb: /* STGRL */
> +        case 0x7: /* STHRL */
> +            is_write = 1;
> +        }
> +        break;
> +    case 0xe3: /* RXY format insns */
> +        switch (pinsn[2] & 0xff) {
> +        case 0x50: /* STY */
> +        case 0x24: /* STG */
> +        case 0x72: /* STCY */
> +        case 0x70: /* STHY */
> +        case 0x8e: /* STPQ */
> +        case 0x3f: /* STRVH */
> +        case 0x3e: /* STRV */
> +        case 0x2f: /* STRVG */
> +            is_write = 1;
> +        }
> +        break;
> +    }
>      return handle_cpu_signal(pc, (unsigned long)info->si_addr,
>                               is_write, &uc->uc_sigmask, puc);
>  }
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH 05/35] tcg-s390: Icache flush is a no-op.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 05/35] tcg-s390: Icache flush is a no-op Richard Henderson
@ 2010-06-09 22:55   ` Aurelien Jarno
  2010-06-10 22:04     ` Richard Henderson
  0 siblings, 1 reply; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-09 22:55 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:13PM -0700, Richard Henderson wrote:
> Before gcc 4.2, __builtin___clear_cache doesn't exist, and
> afterward the gcc s390 backend implements it as nothing.

Does it mean that instruction and data caches are coherent on s390?

> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.h |    5 -----
>  1 files changed, 0 insertions(+), 5 deletions(-)
> 
> diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
> index d8a2955..d7fe0c7 100644
> --- a/tcg/s390/tcg-target.h
> +++ b/tcg/s390/tcg-target.h
> @@ -94,9 +94,4 @@ enum {
>  
>  static inline void flush_icache_range(unsigned long start, unsigned long stop)
>  {
> -#if QEMU_GNUC_PREREQ(4, 1)
> -    __builtin___clear_cache((char *) start, (char *) stop);
> -#else
> -#error not implemented
> -#endif
>  }
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH 06/35] tcg-s390: Allocate the code_gen_buffer near the main program.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 06/35] tcg-s390: Allocate the code_gen_buffer near the main program Richard Henderson
@ 2010-06-09 22:59   ` Aurelien Jarno
  2010-06-10 22:05     ` Richard Henderson
  0 siblings, 1 reply; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-09 22:59 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:14PM -0700, Richard Henderson wrote:
> This allows the use of direct calls to the helpers,
> and a direct branch back to the epilogue.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  exec.c |    7 +++++++
>  1 files changed, 7 insertions(+), 0 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index bb3dcad..7bbfe60 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -519,6 +519,13 @@ static void code_gen_alloc(unsigned long tb_size)
>          start = (void *) 0x01000000UL;
>          if (code_gen_buffer_size > 16 * 1024 * 1024)
>              code_gen_buffer_size = 16 * 1024 * 1024;
> +#elif defined(__s390x__)
> +        /* Map the buffer so that we can use direct calls and branches.  */
> +        /* We have a +- 4GB range on the branches; leave some slop.  */
> +        if (code_gen_buffer_size > (3ul * 1024 * 1024 * 1024)) {
> +            code_gen_buffer_size = 3ul * 1024 * 1024 * 1024;
> +        }
> +        start = (void *)0x90000000UL;

Is there any reason for this address?

>  #endif
>          code_gen_buffer = mmap(start, code_gen_buffer_size,
>                                 PROT_WRITE | PROT_READ | PROT_EXEC,
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH 07/35] tcg: Optionally sign-extend 32-bit arguments for 64-bit host.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 07/35] tcg: Optionally sign-extend 32-bit arguments for 64-bit host Richard Henderson
@ 2010-06-10 10:22   ` Aurelien Jarno
  2010-06-10 22:08     ` Richard Henderson
  2010-06-14 22:20     ` Richard Henderson
  0 siblings, 2 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-10 10:22 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:15PM -0700, Richard Henderson wrote:
> Some hosts (amd64, ia64) have an ABI that ignores the high bits
> of the 64-bit register when passing 32-bit arguments.  Others,
> like s390x, require the value to be properly sign-extended for
> the type.  I.e. "int32_t" must be sign-extended and "uint32_t"
> must be zero-extended to 64-bits.
> 
> To effect this, extend the "sizemask" parameter to tcg_gen_callN
> to include the signedness of the type of each parameter.  If the
> tcg target requires it, extend each 32-bit argument into a 64-bit
> temp and pass that to the function call.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  def-helper.h                 |   38 +++++++++++++++++++++++++++++---------
>  target-i386/ops_sse_header.h |    3 +++
>  target-ppc/helper.h          |    1 +
>  tcg/s390/tcg-target.h        |    2 ++
>  tcg/tcg-op.h                 |   42 +++++++++++++++++++++---------------------
>  tcg/tcg.c                    |   41 +++++++++++++++++++++++++++++++++++------
>  6 files changed, 91 insertions(+), 36 deletions(-)
> 
> diff --git a/def-helper.h b/def-helper.h
> index 8a88c5b..8a822c7 100644
> --- a/def-helper.h
> +++ b/def-helper.h
> @@ -81,9 +81,29 @@
>  #define dh_is_64bit_ptr (TCG_TARGET_REG_BITS == 64)
>  #define dh_is_64bit(t) glue(dh_is_64bit_, dh_alias(t))
>  
> +#define dh_is_signed_void 0
> +#define dh_is_signed_i32 0
> +#define dh_is_signed_s32 1
> +#define dh_is_signed_i64 0
> +#define dh_is_signed_s64 1
> +#define dh_is_signed_f32 0
> +#define dh_is_signed_f64 0
> +#define dh_is_signed_tl  0
> +#define dh_is_signed_int 1
> +/* ??? This is highly specific to the host cpu.  There are even special
> +   extension instructions that may be required, e.g. ia64's addp4.  But
> +   for now we don't support any 64-bit targets with 32-bit pointers.  */
> +#define dh_is_signed_ptr 0
> +#define dh_is_signed_env dh_is_signed_ptr
> +#define dh_is_signed(t) dh_is_signed_##t
> +
> +#define dh_sizemask(t, n) \
> +  sizemask |= dh_is_64bit(t) << (n*2); \
> +  sizemask |= dh_is_signed(t) << (n*2+1)
> +
>  #define dh_arg(t, n) \
>    args[n - 1] = glue(GET_TCGV_, dh_alias(t))(glue(arg, n)); \
> -  sizemask |= dh_is_64bit(t) << n
> +  dh_sizemask(t, n)
>  
>  #define dh_arg_decl(t, n) glue(TCGv_, dh_alias(t)) glue(arg, n)
>  
> @@ -138,8 +158,8 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl0(ret)) \
>  static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1)) \
>  { \
>    TCGArg args[1]; \
> -  int sizemask; \
> -  sizemask = dh_is_64bit(ret); \
> +  int sizemask = 0; \
> +  dh_sizemask(ret, 0); \
>    dh_arg(t1, 1); \
>    tcg_gen_helperN(HELPER(name), flags, sizemask, dh_retvar(ret), 1, args); \
>  }
> @@ -149,8 +169,8 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1
>      dh_arg_decl(t2, 2)) \
>  { \
>    TCGArg args[2]; \
> -  int sizemask; \
> -  sizemask = dh_is_64bit(ret); \
> +  int sizemask = 0; \
> +  dh_sizemask(ret, 0); \
>    dh_arg(t1, 1); \
>    dh_arg(t2, 2); \
>    tcg_gen_helperN(HELPER(name), flags, sizemask, dh_retvar(ret), 2, args); \
> @@ -161,8 +181,8 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1
>      dh_arg_decl(t2, 2), dh_arg_decl(t3, 3)) \
>  { \
>    TCGArg args[3]; \
> -  int sizemask; \
> -  sizemask = dh_is_64bit(ret); \
> +  int sizemask = 0; \
> +  dh_sizemask(ret, 0); \
>    dh_arg(t1, 1); \
>    dh_arg(t2, 2); \
>    dh_arg(t3, 3); \
> @@ -174,8 +194,8 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1
>      dh_arg_decl(t2, 2), dh_arg_decl(t3, 3), dh_arg_decl(t4, 4)) \
>  { \
>    TCGArg args[4]; \
> -  int sizemask; \
> -  sizemask = dh_is_64bit(ret); \
> +  int sizemask = 0; \
> +  dh_sizemask(ret, 0); \
>    dh_arg(t1, 1); \
>    dh_arg(t2, 2); \
>    dh_arg(t3, 3); \
> diff --git a/target-i386/ops_sse_header.h b/target-i386/ops_sse_header.h
> index a0a6361..8d4b2b7 100644
> --- a/target-i386/ops_sse_header.h
> +++ b/target-i386/ops_sse_header.h
> @@ -30,6 +30,9 @@
>  #define dh_ctype_Reg Reg *
>  #define dh_ctype_XMMReg XMMReg *
>  #define dh_ctype_MMXReg MMXReg *
> +#define dh_is_signed_Reg dh_is_signed_ptr
> +#define dh_is_signed_XMMReg dh_is_signed_ptr
> +#define dh_is_signed_MMXReg dh_is_signed_ptr
>  
>  DEF_HELPER_2(glue(psrlw, SUFFIX), void, Reg, Reg)
>  DEF_HELPER_2(glue(psraw, SUFFIX), void, Reg, Reg)
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index 5cf6cd4..c025a2f 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -95,6 +95,7 @@ DEF_HELPER_3(fsel, i64, i64, i64, i64)
>  
>  #define dh_alias_avr ptr
>  #define dh_ctype_avr ppc_avr_t *
> +#define dh_is_signed_avr dh_is_signed_ptr
>  
>  DEF_HELPER_3(vaddubm, void, avr, avr, avr)
>  DEF_HELPER_3(vadduhm, void, avr, avr, avr)
> diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
> index d7fe0c7..8c19262 100644
> --- a/tcg/s390/tcg-target.h
> +++ b/tcg/s390/tcg-target.h
> @@ -87,6 +87,8 @@ enum {
>  #define TCG_TARGET_STACK_ALIGN		8
>  #define TCG_TARGET_CALL_STACK_OFFSET	0
>  
> +#define TCG_TARGET_EXTEND_ARGS 1
> +
>  enum {
>      /* Note: must be synced with dyngen-exec.h */
>      TCG_AREG0 = TCG_REG_R10,
> diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
> index aa436de..4220e3d 100644
> --- a/tcg/tcg-op.h
> +++ b/tcg/tcg-op.h
> @@ -369,8 +369,8 @@ static inline void tcg_gen_helperN(void *func, int flags, int sizemask,
>     and pure, hence the call to tcg_gen_callN() with TCG_CALL_CONST |
>     TCG_CALL_PURE. This may need to be adjusted if these functions
>     start to be used with other helpers. */
> -static inline void tcg_gen_helper32(void *func, TCGv_i32 ret,
> -                                    TCGv_i32 a, TCGv_i32 b)
> +static inline void tcg_gen_helper32(void *func, TCGv_i32 ret, TCGv_i32 a,
> +                                    TCGv_i32 b, _Bool is_signed)

This should be int instead of _Bool.

>  {
>      TCGv_ptr fn;
>      TCGArg args[2];
> @@ -378,12 +378,12 @@ static inline void tcg_gen_helper32(void *func, TCGv_i32 ret,
>      args[0] = GET_TCGV_I32(a);
>      args[1] = GET_TCGV_I32(b);
>      tcg_gen_callN(&tcg_ctx, fn, TCG_CALL_CONST | TCG_CALL_PURE,
> -                  0, GET_TCGV_I32(ret), 2, args);
> +                  (is_signed ? 0x2a : 0x00), GET_TCGV_I32(ret), 2, args);

Wouldn't it be better to actually pass the whole sizemask to
tcg_gen_helper32(), so that in the future we can also support mixed
signedness in the arguments? Also, doing it here looks a bit like a
magic constant.
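For what it's worth, the constants do decode to the dh_sizemask layout
used elsewhere in the patch (bit 2n = "slot n is 64-bit", bit 2n+1 =
"slot n is signed", with slot 0 being the return value).  A standalone
illustration -- not QEMU code, just a toy program using the same bit
layout -- that reproduces the values above:

    #include <stdio.h>

    /* bit 2n: slot n is 64-bit; bit 2n+1: slot n is signed */
    static int sizemask_bits(int n, int is_64bit, int is_signed)
    {
        return (is_64bit << (n * 2)) | (is_signed << (n * 2 + 1));
    }

    int main(void)
    {
        /* signed 32-bit return + two signed 32-bit arguments */
        int m32 = sizemask_bits(0, 0, 1)
                | sizemask_bits(1, 0, 1)
                | sizemask_bits(2, 0, 1);
        /* signed 64-bit return + two signed 64-bit arguments */
        int m64 = sizemask_bits(0, 1, 1)
                | sizemask_bits(1, 1, 1)
                | sizemask_bits(2, 1, 1);
        printf("0x%02x 0x%02x\n", m32, m64);   /* prints 0x2a 0x3f */
        return 0;
    }

The unsigned variants (0x00 and 0x15) are the same layout with the sign
bits cleared.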

>      tcg_temp_free_ptr(fn);
>  }
>  
> -static inline void tcg_gen_helper64(void *func, TCGv_i64 ret,
> -                                    TCGv_i64 a, TCGv_i64 b)
> +static inline void tcg_gen_helper64(void *func, TCGv_i64 ret, TCGv_i64 a,
> +                                    TCGv_i64 b, _Bool is_signed)
>  {
>      TCGv_ptr fn;
>      TCGArg args[2];
> @@ -391,7 +391,7 @@ static inline void tcg_gen_helper64(void *func, TCGv_i64 ret,
>      args[0] = GET_TCGV_I64(a);
>      args[1] = GET_TCGV_I64(b);
>      tcg_gen_callN(&tcg_ctx, fn, TCG_CALL_CONST | TCG_CALL_PURE,
> -                  7, GET_TCGV_I64(ret), 2, args);
> +                  (is_signed ? 0x3f : 0x15), GET_TCGV_I64(ret), 2, args);
>      tcg_temp_free_ptr(fn);
>  }

Same

> @@ -692,22 +692,22 @@ static inline void tcg_gen_remu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
>  #else
>  static inline void tcg_gen_div_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
>  {
> -    tcg_gen_helper32(tcg_helper_div_i32, ret, arg1, arg2);
> +    tcg_gen_helper32(tcg_helper_div_i32, ret, arg1, arg2, 1);
>  }
>  
>  static inline void tcg_gen_rem_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
>  {
> -    tcg_gen_helper32(tcg_helper_rem_i32, ret, arg1, arg2);
> +    tcg_gen_helper32(tcg_helper_rem_i32, ret, arg1, arg2, 1);
>  }
>  
>  static inline void tcg_gen_divu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
>  {
> -    tcg_gen_helper32(tcg_helper_divu_i32, ret, arg1, arg2);
> +    tcg_gen_helper32(tcg_helper_divu_i32, ret, arg1, arg2, 0);
>  }
>  
>  static inline void tcg_gen_remu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
>  {
> -    tcg_gen_helper32(tcg_helper_remu_i32, ret, arg1, arg2);
> +    tcg_gen_helper32(tcg_helper_remu_i32, ret, arg1, arg2, 0);
>  }
>  #endif
>  
> @@ -867,7 +867,7 @@ static inline void tcg_gen_xori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
>     specific code (x86) */
>  static inline void tcg_gen_shl_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
>  {
> -    tcg_gen_helper64(tcg_helper_shl_i64, ret, arg1, arg2);
> +    tcg_gen_helper64(tcg_helper_shl_i64, ret, arg1, arg2, 1);
>  }
>  
>  static inline void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
> @@ -877,7 +877,7 @@ static inline void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
>  
>  static inline void tcg_gen_shr_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
>  {
> -    tcg_gen_helper64(tcg_helper_shr_i64, ret, arg1, arg2);
> +    tcg_gen_helper64(tcg_helper_shr_i64, ret, arg1, arg2, 1);
>  }
>  
>  static inline void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
> @@ -887,7 +887,7 @@ static inline void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
>  
>  static inline void tcg_gen_sar_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
>  {
> -    tcg_gen_helper64(tcg_helper_sar_i64, ret, arg1, arg2);
> +    tcg_gen_helper64(tcg_helper_sar_i64, ret, arg1, arg2, 1);
>  }
>  
>  static inline void tcg_gen_sari_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
> @@ -935,22 +935,22 @@ static inline void tcg_gen_mul_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
>  
>  static inline void tcg_gen_div_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
>  {
> -    tcg_gen_helper64(tcg_helper_div_i64, ret, arg1, arg2);
> +    tcg_gen_helper64(tcg_helper_div_i64, ret, arg1, arg2, 1);
>  }
>  
>  static inline void tcg_gen_rem_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
>  {
> -    tcg_gen_helper64(tcg_helper_rem_i64, ret, arg1, arg2);
> +    tcg_gen_helper64(tcg_helper_rem_i64, ret, arg1, arg2, 1);
>  }
>  
>  static inline void tcg_gen_divu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
>  {
> -    tcg_gen_helper64(tcg_helper_divu_i64, ret, arg1, arg2);
> +    tcg_gen_helper64(tcg_helper_divu_i64, ret, arg1, arg2, 0);
>  }
>  
>  static inline void tcg_gen_remu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
>  {
> -    tcg_gen_helper64(tcg_helper_remu_i64, ret, arg1, arg2);
> +    tcg_gen_helper64(tcg_helper_remu_i64, ret, arg1, arg2, 0);
>  }
>  
>  #else
> @@ -1212,22 +1212,22 @@ static inline void tcg_gen_remu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
>  #else
>  static inline void tcg_gen_div_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
>  {
> -    tcg_gen_helper64(tcg_helper_div_i64, ret, arg1, arg2);
> +    tcg_gen_helper64(tcg_helper_div_i64, ret, arg1, arg2, 1);
>  }
>  
>  static inline void tcg_gen_rem_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
>  {
> -    tcg_gen_helper64(tcg_helper_rem_i64, ret, arg1, arg2);
> +    tcg_gen_helper64(tcg_helper_rem_i64, ret, arg1, arg2, 1);
>  }
>  
>  static inline void tcg_gen_divu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
>  {
> -    tcg_gen_helper64(tcg_helper_divu_i64, ret, arg1, arg2);
> +    tcg_gen_helper64(tcg_helper_divu_i64, ret, arg1, arg2, 0);
>  }
>  
>  static inline void tcg_gen_remu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
>  {
> -    tcg_gen_helper64(tcg_helper_remu_i64, ret, arg1, arg2);
> +    tcg_gen_helper64(tcg_helper_remu_i64, ret, arg1, arg2, 0);
>  }
>  #endif
>  
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index 880e7ce..d8ddd1f 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -560,6 +560,24 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
>      int real_args;
>      int nb_rets;
>      TCGArg *nparam;
> +
> +#if defined(TCG_TARGET_EXTEND_ARGS) && TCG_TARGET_REG_BITS == 64
> +    for (i = 0; i < nargs; ++i) {
> +        int is_64bit = sizemask & (1 << (i+1)*2);
> +        int is_signed = sizemask & (2 << (i+1)*2);
> +        if (!is_64bit) {
> +            TCGv_i64 temp = tcg_temp_new_i64();
> +            TCGv_i64 orig = MAKE_TCGV_I64(args[i]);
> +            if (is_signed) {
> +                tcg_gen_ext32s_i64(temp, orig);
> +            } else {
> +                tcg_gen_ext32u_i64(temp, orig);
> +            }
> +            args[i] = GET_TCGV_I64(temp);
> +        }
> +    }
> +#endif /* TCG_TARGET_EXTEND_ARGS */
> +

This part allocates a lot of temp variables, which will probably generate
a lot of register spills during code generation.

As we do that for all arguments anyway, wouldn't it be possible to do
the extension in place? The value in the register is changed, but that
should not have any effect, as the high bits are ignored anyway by the
other instructions.
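A rough, untested sketch of what that in-place variant might look like
(same loop as in the patch, just dropping the temporary; names and calls
are taken from the hunk above):

    for (i = 0; i < nargs; ++i) {
        int is_64bit = sizemask & (1 << (i+1)*2);
        int is_signed = sizemask & (2 << (i+1)*2);
        if (!is_64bit) {
            TCGv_i64 orig = MAKE_TCGV_I64(args[i]);
            /* extend the 32-bit argument in place, no temporary */
            if (is_signed) {
                tcg_gen_ext32s_i64(orig, orig);
            } else {
                tcg_gen_ext32u_i64(orig, orig);
            }
        }
    }

The matching cleanup loop after the call would then go away as well.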

>      *gen_opc_ptr++ = INDEX_op_call;
>      nparam = gen_opparam_ptr++;
>  #ifdef TCG_TARGET_I386
> @@ -588,7 +606,8 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
>      real_args = 0;
>      for (i = 0; i < nargs; i++) {
>  #if TCG_TARGET_REG_BITS < 64
> -        if (sizemask & (2 << i)) {
> +        int is_64bit = sizemask & (1 << (i+1)*2);
> +        if (is_64bit) {
>  #ifdef TCG_TARGET_I386
>              /* REGPARM case: if the third parameter is 64 bit, it is
>                 allocated on the stack */
> @@ -622,12 +641,12 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
>              *gen_opparam_ptr++ = args[i] + 1;
>  #endif
>              real_args += 2;
> -        } else
> -#endif
> -        {
> -            *gen_opparam_ptr++ = args[i];
> -            real_args++;
> +            continue;
>          }
> +#endif /* TCG_TARGET_REG_BITS < 64 */
> +
> +        *gen_opparam_ptr++ = args[i];
> +        real_args++;
>      }
>      *gen_opparam_ptr++ = GET_TCGV_PTR(func);
>  
> @@ -637,6 +656,16 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
>  
>      /* total parameters, needed to go backward in the instruction stream */
>      *gen_opparam_ptr++ = 1 + nb_rets + real_args + 3;
> +
> +#if defined(TCG_TARGET_EXTEND_ARGS) && TCG_TARGET_REG_BITS == 64
> +    for (i = 0; i < nargs; ++i) {
> +        int is_64bit = sizemask & (1 << (i+1)*2);
> +        if (!is_64bit) {
> +            TCGv_i64 temp = MAKE_TCGV_I64(args[i]);
> +            tcg_temp_free_i64(temp);
> +        }
> +    }
> +#endif /* TCG_TARGET_EXTEND_ARGS */
>  }
>  
>  #if TCG_TARGET_REG_BITS == 32
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH 10/35] tcg-s390: New TCG target
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 10/35] tcg-s390: New TCG target Richard Henderson
@ 2010-06-10 10:24   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-10 10:24 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:18PM -0700, Richard Henderson wrote:
> We already have stubs for a TCG target on S390, but were missing code that
> would actually generate instructions.
> 
> So I took Uli's patch, cleaned it up and present it to you again :-).
> 
> I hope I found all odd coding style and unprettiness issues, but if you
> still spot one feel free to nag about it.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> CC: Uli Hecht <uli@suse.de>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c | 1171 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  tcg/s390/tcg-target.h |   13 +-
>  2 files changed, 1157 insertions(+), 27 deletions(-)

This patch is difficult to review, as a lot of the changes are made in
later patches. I think the best would be to do a quick final review
after squashing all the tcg/s390/* patches together.

> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index 265194a..55f0fa9 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -2,6 +2,7 @@
>   * Tiny Code Generator for QEMU
>   *
>   * Copyright (c) 2009 Ulrich Hecht <uli@suse.de>
> + * Copyright (c) 2009 Alexander Graf <agraf@suse.de>
>   *
>   * Permission is hereby granted, free of charge, to any person obtaining a copy
>   * of this software and associated documentation files (the "Software"), to deal
> @@ -22,81 +23,1209 @@
>   * THE SOFTWARE.
>   */
>  
> +/* #define DEBUG_S390_TCG */
> +
> +#ifdef DEBUG_S390_TCG
> +#define dprintf(fmt, ...) \
> +    do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
> +#else
> +#define dprintf(fmt, ...) \
> +    do { } while (0)
> +#endif
> +
> +#define TCG_CT_CONST_S16                0x100
> +#define TCG_CT_CONST_U12                0x200
> +
> +/* Several places within the instruction set 0 means "no register"
> +   rather than TCG_REG_R0.  */
> +#define TCG_REG_NONE    0
> +
> +/* All of the following instructions are prefixed with their instruction
> +   format, and are defined as 8- or 16-bit quantities, even when the two
> +   halves of the 16-bit quantity may appear 32 bits apart in the insn.
> +   This makes it easy to copy the values from the tables in Appendix B.  */
> +typedef enum S390Opcode {
> +    RIL_BRASL   = 0xc005,
> +    RIL_BRCL    = 0xc004,
> +    RIL_LARL    = 0xc000,
> +
> +    RI_AGHI     = 0xa70b,
> +    RI_AHI      = 0xa70a,
> +    RI_BRC      = 0xa704,
> +    RI_IILH     = 0xa502,
> +    RI_LGHI     = 0xa709,
> +    RI_LLILL    = 0xa50f,
> +
> +    RRE_AGR     = 0xb908,
> +    RRE_CGR     = 0xb920,
> +    RRE_CLGR    = 0xb921,
> +    RRE_DLGR    = 0xb987,
> +    RRE_DLR     = 0xb997,
> +    RRE_DSGFR   = 0xb91d,
> +    RRE_DSGR    = 0xb90d,
> +    RRE_LCGR    = 0xb903,
> +    RRE_LGFR    = 0xb914,
> +    RRE_LGR     = 0xb904,
> +    RRE_LLGFR   = 0xb916,
> +    RRE_MSGR    = 0xb90c,
> +    RRE_MSR     = 0xb252,
> +    RRE_NGR     = 0xb980,
> +    RRE_OGR     = 0xb981,
> +    RRE_SGR     = 0xb909,
> +    RRE_XGR     = 0xb982,
> +
> +    RR_AR       = 0x1a,
> +    RR_BASR     = 0x0d,
> +    RR_BCR      = 0x07,
> +    RR_CLR      = 0x15,
> +    RR_CR       = 0x19,
> +    RR_DR       = 0x1d,
> +    RR_LCR      = 0x13,
> +    RR_LR       = 0x18,
> +    RR_NR       = 0x14,
> +    RR_OR       = 0x16,
> +    RR_SR       = 0x1b,
> +    RR_XR       = 0x17,
> +
> +    RSY_SLLG    = 0xeb0d,
> +    RSY_SRAG    = 0xeb0a,
> +    RSY_SRLG    = 0xeb0c,
> +
> +    RS_SLL      = 0x89,
> +    RS_SRA      = 0x8a,
> +    RS_SRL      = 0x88,
> +
> +    RXY_CG      = 0xe320,
> +    RXY_LB      = 0xe376,
> +    RXY_LG      = 0xe304,
> +    RXY_LGB     = 0xe377,
> +    RXY_LGF     = 0xe314,
> +    RXY_LGH     = 0xe315,
> +    RXY_LHY     = 0xe378,
> +    RXY_LLC     = 0xe394,
> +    RXY_LLGC    = 0xe390,
> +    RXY_LLGF    = 0xe316,
> +    RXY_LLGH    = 0xe391,
> +    RXY_LLH     = 0xe395,
> +    RXY_LMG     = 0xeb04,
> +    RXY_LRV     = 0xe31e,
> +    RXY_LRVG    = 0xe30f,
> +    RXY_LRVH    = 0xe31f,
> +    RXY_LY      = 0xe358,
> +    RXY_STCY    = 0xe372,
> +    RXY_STG     = 0xe324,
> +    RXY_STHY    = 0xe370,
> +    RXY_STMG    = 0xeb24,
> +    RXY_STRV    = 0xe33e,
> +    RXY_STRVG   = 0xe32f,
> +    RXY_STRVH   = 0xe33f,
> +    RXY_STY     = 0xe350,
> +
> +    RX_L        = 0x58,
> +    RX_LH       = 0x48,
> +    RX_ST       = 0x50,
> +    RX_STC      = 0x42,
> +    RX_STH      = 0x40,
> +} S390Opcode;
> +
> +#define LD_SIGNED      0x04
> +#define LD_UINT8       0x00
> +#define LD_INT8        (LD_UINT8 | LD_SIGNED)
> +#define LD_UINT16      0x01
> +#define LD_INT16       (LD_UINT16 | LD_SIGNED)
> +#define LD_UINT32      0x02
> +#define LD_INT32       (LD_UINT32 | LD_SIGNED)
> +#define LD_UINT64      0x03
> +#define LD_INT64       (LD_UINT64 | LD_SIGNED)
> +
> +#ifndef NDEBUG
> +static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
> +    "%r0", "%r1", "%r2", "%r3", "%r4", "%r5", "%r6", "%r7",
> +    "%r8", "%r9", "%r10" "%r11" "%r12" "%r13" "%r14" "%r15"
> +};
> +#endif
> +
>  static const int tcg_target_reg_alloc_order[] = {
> +    TCG_REG_R6,
> +    TCG_REG_R7,
> +    TCG_REG_R8,
> +    TCG_REG_R9,
> +    TCG_REG_R10,
> +    TCG_REG_R11,
> +    TCG_REG_R12,
> +    TCG_REG_R13,
> +    TCG_REG_R14,
> +    TCG_REG_R0,
> +    TCG_REG_R1,
> +    TCG_REG_R2,
> +    TCG_REG_R3,
> +    TCG_REG_R4,
> +    TCG_REG_R5,
>  };
>  
>  static const int tcg_target_call_iarg_regs[] = {
> +    TCG_REG_R2,
> +    TCG_REG_R3,
> +    TCG_REG_R4,
> +    TCG_REG_R5,
> +    TCG_REG_R6,
>  };
>  
>  static const int tcg_target_call_oarg_regs[] = {
> +    TCG_REG_R2,
> +    TCG_REG_R3,
> +};
> +
> +/* signed/unsigned is handled by using COMPARE and COMPARE LOGICAL,
> +   respectively */
> +
> +#define S390_CC_EQ      8
> +#define S390_CC_LT      4
> +#define S390_CC_GT      2
> +#define S390_CC_OV      1
> +#define S390_CC_NE      (S390_CC_LT | S390_CC_GT)
> +#define S390_CC_LE      (S390_CC_LT | S390_CC_EQ)
> +#define S390_CC_GE      (S390_CC_GT | S390_CC_EQ)
> +#define S390_CC_ALWAYS  15
> +
> +static const uint8_t tcg_cond_to_s390_cond[10] = {
> +    [TCG_COND_EQ]  = S390_CC_EQ,
> +    [TCG_COND_LT]  = S390_CC_LT,
> +    [TCG_COND_LTU] = S390_CC_LT,
> +    [TCG_COND_LE]  = S390_CC_LE,
> +    [TCG_COND_LEU] = S390_CC_LE,
> +    [TCG_COND_GT]  = S390_CC_GT,
> +    [TCG_COND_GTU] = S390_CC_GT,
> +    [TCG_COND_GE]  = S390_CC_GE,
> +    [TCG_COND_GEU] = S390_CC_GE,
> +    [TCG_COND_NE]  = S390_CC_NE,
> +};
> +
> +#ifdef CONFIG_SOFTMMU
> +
> +#include "../../softmmu_defs.h"
> +
> +static void *qemu_ld_helpers[4] = {
> +    __ldb_mmu,
> +    __ldw_mmu,
> +    __ldl_mmu,
> +    __ldq_mmu,
> +};
> +
> +static void *qemu_st_helpers[4] = {
> +    __stb_mmu,
> +    __stw_mmu,
> +    __stl_mmu,
> +    __stq_mmu,
>  };
> +#endif
> +
> +static uint8_t *tb_ret_addr;
>  
>  static void patch_reloc(uint8_t *code_ptr, int type,
>                  tcg_target_long value, tcg_target_long addend)
>  {
> -    tcg_abort();
> +    uint32_t *code_ptr_32 = (uint32_t*)code_ptr;
> +    tcg_target_long code_ptr_tlong = (tcg_target_long)code_ptr;
> +
> +    switch (type) {
> +    case R_390_PC32DBL:
> +        *code_ptr_32 = (value - (code_ptr_tlong + addend)) >> 1;
> +        break;
> +    default:
> +        tcg_abort();
> +        break;
> +    }
>  }
>  
> -static inline int tcg_target_get_call_iarg_regs_count(int flags)
> +static int tcg_target_get_call_iarg_regs_count(int flags)
>  {
> -    tcg_abort();
> -    return 0;
> +    return sizeof(tcg_target_call_iarg_regs) / sizeof(int);
>  }
>  
>  /* parse target specific constraints */
>  static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
>  {
> -    tcg_abort();
> +    const char *ct_str;
> +
> +    ct->ct |= TCG_CT_REG;
> +    tcg_regset_set32(ct->u.regs, 0, 0xffff);
> +    ct_str = *pct_str;
> +
> +    switch (ct_str[0]) {
> +    case 'L':                   /* qemu_ld/st constraint */
> +        tcg_regset_reset_reg (ct->u.regs, TCG_REG_R2);
> +        tcg_regset_reset_reg (ct->u.regs, TCG_REG_R3);
> +        break;
> +    case 'R':                        /* not R0 */
> +        tcg_regset_reset_reg(ct->u.regs, TCG_REG_R0);
> +        break;
> +    case 'a':                  /* force R2 for division */
> +        tcg_regset_clear(ct->u.regs);
> +        tcg_regset_set_reg(ct->u.regs, TCG_REG_R2);
> +        break;
> +    case 'b':                  /* force R3 for division */
> +        tcg_regset_clear(ct->u.regs);
> +        tcg_regset_set_reg(ct->u.regs, TCG_REG_R3);
> +        break;
> +    case 'I':
> +        ct->ct &= ~TCG_CT_REG;
> +        ct->ct |= TCG_CT_CONST_S16;
> +        break;
> +    default:
> +        break;
> +    }
> +    ct_str++;
> +    *pct_str = ct_str;
> +
>      return 0;
>  }
>  
>  /* Test if a constant matches the constraint. */
>  static inline int tcg_target_const_match(tcg_target_long val,
> -                const TCGArgConstraint *arg_ct)
> +                                         const TCGArgConstraint *arg_ct)
>  {
> -    tcg_abort();
> +    int ct = arg_ct->ct;
> +
> +    if ((ct & TCG_CT_CONST) ||
> +       ((ct & TCG_CT_CONST_S16) && val == (int16_t)val) ||
> +       ((ct & TCG_CT_CONST_U12) && val == (val & 0xfff))) {
> +        return 1;
> +    }
> +
>      return 0;
>  }
>  
> +/* Emit instructions according to the given instruction format.  */
> +
> +static void tcg_out_insn_RR(TCGContext *s, S390Opcode op, TCGReg r1, TCGReg r2)
> +{
> +    tcg_out16(s, (op << 8) | (r1 << 4) | r2);
> +}
> +
> +static void tcg_out_insn_RRE(TCGContext *s, S390Opcode op,
> +                             TCGReg r1, TCGReg r2)
> +{
> +    tcg_out32(s, (op << 16) | (r1 << 4) | r2);
> +}
> +
> +static void tcg_out_insn_RI(TCGContext *s, S390Opcode op, TCGReg r1, int i2)
> +{
> +    tcg_out32(s, (op << 16) | (r1 << 20) | (i2 & 0xffff));
> +}
> +
> +static void tcg_out_insn_RIL(TCGContext *s, S390Opcode op, TCGReg r1, int i2)
> +{
> +    tcg_out16(s, op | (r1 << 4));
> +    tcg_out32(s, i2);
> +}
> +
> +static void tcg_out_insn_RS(TCGContext *s, S390Opcode op, TCGReg r1,
> +                            TCGReg b2, TCGReg r3, int disp)
> +{
> +    tcg_out32(s, (op << 24) | (r1 << 20) | (r3 << 16) | (b2 << 12)
> +              | (disp & 0xfff));
> +}
> +
> +static void tcg_out_insn_RSY(TCGContext *s, S390Opcode op, TCGReg r1,
> +                             TCGReg b2, TCGReg r3, int disp)
> +{
> +    tcg_out16(s, (op & 0xff00) | (r1 << 4) | r3);
> +    tcg_out32(s, (op & 0xff) | (b2 << 28)
> +              | ((disp & 0xfff) << 16) | ((disp & 0xff000) >> 4));
> +}
> +
> +#define tcg_out_insn_RX   tcg_out_insn_RS
> +#define tcg_out_insn_RXY  tcg_out_insn_RSY
> +
> +/* Emit an opcode with "type-checking" of the format.  */
> +#define tcg_out_insn(S, FMT, OP, ...) \
> +    glue(tcg_out_insn_,FMT)(S, glue(glue(FMT,_),OP), ## __VA_ARGS__)
> +
> +
> +/* emit 64-bit shifts */
> +static void tcg_out_sh64(TCGContext* s, S390Opcode op, TCGReg dest,
> +                         TCGReg src, TCGReg sh_reg, int sh_imm)
> +{
> +    tcg_out_insn_RSY(s, op, dest, sh_reg, src, sh_imm);
> +}
> +
> +/* emit 32-bit shifts */
> +static void tcg_out_sh32(TCGContext* s, S390Opcode op, TCGReg dest,
> +                         TCGReg sh_reg, int sh_imm)
> +{
> +    tcg_out_insn_RS(s, op, dest, sh_reg, 0, sh_imm);
> +}
> +
> +static inline void tcg_out_mov(TCGContext *s, int ret, int arg)
> +{
> +    /* ??? With a TCGType argument, we could emit the smaller LR insn.  */
> +    tcg_out_insn(s, RRE, LGR, ret, arg);
> +}
> +
>  /* load a register with an immediate value */
>  static inline void tcg_out_movi(TCGContext *s, TCGType type,
>                  int ret, tcg_target_long arg)
>  {
> -    tcg_abort();
> +    if (arg >= -0x8000 && arg < 0x8000) { /* signed immediate load */
> +        tcg_out_insn(s, RI, LGHI, ret, arg);
> +    } else if (!(arg & 0xffffffffffff0000UL)) {
> +        tcg_out_insn(s, RI, LLILL, ret, arg);
> +    } else if (!(arg & 0xffffffff00000000UL) || type == TCG_TYPE_I32) {
> +        tcg_out_insn(s, RI, LLILL, ret, arg);
> +        tcg_out_insn(s, RI, IILH, ret, arg >> 16);
> +    } else {
> +        /* branch over constant and store its address in R13 */
> +        tcg_out_insn(s, RIL, BRASL, TCG_REG_R13, (6 + 8) >> 1);
> +        /* 64-bit constant */
> +        tcg_out32(s, arg >> 32);
> +        tcg_out32(s, arg);
> +        /* load constant to ret */
> +        tcg_out_insn(s, RXY, LG, ret, TCG_REG_R13, 0, 0);
> +    }
>  }
>  
> +
> +/* Emit a load/store type instruction.  Inputs are:
> +   DATA:     The register to be loaded or stored.
> +   BASE+OFS: The effective address.
> +   OPC_RX:   If the operation has an RX format opcode (e.g. STC), otherwise 0.
> +   OPC_RXY:  The RXY format opcode for the operation (e.g. STCY).  */
> +
> +static void tcg_out_mem(TCGContext *s, S390Opcode opc_rx, S390Opcode opc_rxy,
> +                        TCGReg data, TCGReg base, TCGReg index,
> +                        tcg_target_long ofs)
> +{
> +    if (ofs < -0x80000 || ofs >= 0x80000) {
> +        /* Combine the low 16 bits of the offset with the actual load insn;
> +           the high 48 bits must come from an immediate load.  */
> +        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, ofs & ~0xffff);
> +        ofs &= 0xffff;
> +
> +        /* If we were already given an index register, add it in.  */
> +        if (index != TCG_REG_NONE) {
> +            tcg_out_insn(s, RRE, AGR, TCG_REG_R13, index);
> +        }
> +        index = TCG_REG_R13;
> +    }
> +
> +    if (opc_rx && ofs >= 0 && ofs < 0x1000) {
> +        tcg_out_insn_RX(s, opc_rx, data, base, index, ofs);
> +    } else {
> +        tcg_out_insn_RXY(s, opc_rxy, data, base, index, ofs);
> +    }
> +}
> +
> +
>  /* load data without address translation or endianness conversion */
> -static inline void tcg_out_ld(TCGContext *s, TCGType type, int arg,
> -                int arg1, tcg_target_long arg2)
> +static inline void tcg_out_ld(TCGContext *s, TCGType type, TCGReg data,
> +                              TCGReg base, tcg_target_long ofs)
>  {
> -    tcg_abort();
> +    if (type == TCG_TYPE_I32) {
> +        tcg_out_mem(s, RX_L, RXY_LY, data, base, TCG_REG_NONE, ofs);
> +    } else {
> +        tcg_out_mem(s, 0, RXY_LG, data, base, TCG_REG_NONE, ofs);
> +    }
>  }
>  
> -static inline void tcg_out_st(TCGContext *s, TCGType type, int arg,
> -                              int arg1, tcg_target_long arg2)
> +static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg data,
> +                              TCGReg base, tcg_target_long ofs)
>  {
> -    tcg_abort();
> +    if (type == TCG_TYPE_I32) {
> +        tcg_out_mem(s, RX_ST, RXY_STY, data, base, TCG_REG_NONE, ofs);
> +    } else {
> +        tcg_out_mem(s, 0, RXY_STG, data, base, TCG_REG_NONE, ofs);
> +    }
> +}
> +
> +static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
> +{
> +    if (c > TCG_COND_GT) {
> +        /* unsigned */
> +        tcg_out_insn(s, RR, CLR, r1, r2);
> +    } else {
> +        /* signed */
> +        tcg_out_insn(s, RR, CR, r1, r2);
> +    }
> +}
> +
> +static void tgen64_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
> +{
> +    if (c > TCG_COND_GT) {
> +        /* unsigned */
> +        tcg_out_insn(s, RRE, CLGR, r1, r2);
> +    } else {
> +        /* signed */
> +        tcg_out_insn(s, RRE, CGR, r1, r2);
> +    }
> +}
> +
> +static void tgen_setcond(TCGContext *s, TCGType type, TCGCond c,
> +                         TCGReg dest, TCGReg r1, TCGReg r2)
> +{
> +    if (type == TCG_TYPE_I32) {
> +        tgen32_cmp(s, c, r1, r2);
> +    } else {
> +        tgen64_cmp(s, c, r1, r2);
> +    }
> +    /* Emit: r1 = 1; if (cc) goto over; r1 = 0; over:  */
> +    tcg_out_movi(s, type, dest, 1);
> +    tcg_out_insn(s, RI, BRC, tcg_cond_to_s390_cond[c], (4 + 4) >> 1);
> +    tcg_out_movi(s, type, dest, 0);
> +}
> +
> +static void tgen_gotoi(TCGContext *s, int cc, tcg_target_long dest)
> +{
> +    tcg_target_long off = (dest - (tcg_target_long)s->code_ptr) >> 1;
> +    if (off > -0x8000 && off < 0x7fff) {
> +        tcg_out_insn(s, RI, BRC, cc, off);
> +    } else if (off == (int32_t)off) {
> +        tcg_out_insn(s, RIL, BRCL, cc, off);
> +    } else {
> +        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, dest);
> +        tcg_out_insn(s, RR, BCR, cc, TCG_REG_R13);
> +    }
> +}
> +
> +static void tgen_branch(TCGContext *s, int cc, int labelno)
> +{
> +    TCGLabel* l = &s->labels[labelno];
> +    if (l->has_value) {
> +        tgen_gotoi(s, cc, l->u.value);
> +    } else {
> +        tcg_out16(s, RIL_BRCL | (cc << 4));
> +        tcg_out_reloc(s, s->code_ptr, R_390_PC32DBL, labelno, -2);
> +        s->code_ptr += 4;
> +    }
> +}
> +
> +static void tgen_calli(TCGContext *s, tcg_target_long dest)
> +{
> +    tcg_target_long off = (dest - (tcg_target_long)s->code_ptr) >> 1;
> +    if (off == (int32_t)off) {
> +        tcg_out_insn(s, RIL, BRASL, TCG_REG_R14, off);
> +    } else {
> +        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, dest);
> +        tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_REG_R13);
> +    }
> +}
> +
> +#if defined(CONFIG_SOFTMMU)
> +static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
> +                                  int mem_index, int opc,
> +                                  uint16_t **label2_ptr_p, int is_store)
> +  {
> +    int arg0 = TCG_REG_R2;
> +    int arg1 = TCG_REG_R3;
> +    int arg2 = TCG_REG_R4;
> +    int s_bits;
> +    uint16_t *label1_ptr;
> +
> +    if (is_store) {
> +        s_bits = opc;
> +    } else {
> +        s_bits = opc & 3;
> +    }
> +
> +#if TARGET_LONG_BITS == 32
> +    tcg_out_insn(s, RRE, LLGFR, arg1, addr_reg);
> +    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
> +#else
> +    tcg_out_mov(s, arg1, addr_reg);
> +    tcg_out_mov(s, arg0, addr_reg);
> +#endif
> +
> +    tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, TCG_REG_NONE,
> +                 TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
> +
> +    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
> +                 TARGET_PAGE_MASK | ((1 << s_bits) - 1));
> +    tcg_out_insn(s, RRE, NGR, arg0, TCG_REG_R13);
> +
> +    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
> +                 (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
> +    tcg_out_insn(s, RRE, NGR, arg1, TCG_REG_R13);
> +
> +    if (is_store) {
> +        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
> +                     offsetof(CPUState, tlb_table[mem_index][0].addr_write));
> +    } else {
> +        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
> +                     offsetof(CPUState, tlb_table[mem_index][0].addr_read));
> +    }
> +    tcg_out_insn(s, RRE, AGR, arg1, TCG_REG_R13);
> +
> +    tcg_out_insn(s, RRE, AGR, arg1, TCG_AREG0);
> +
> +    tcg_out_insn(s, RXY, CG, arg0, arg1, 0, 0);
> +
> +    label1_ptr = (uint16_t*)s->code_ptr;
> +
> +    /* je label1 (offset will be patched in later) */
> +    tcg_out_insn(s, RI, BRC, S390_CC_EQ, 0);
> +
> +    /* call load/store helper */
> +#if TARGET_LONG_BITS == 32
> +    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
> +#else
> +    tcg_out_mov(s, arg0, addr_reg);
> +#endif
> +
> +    if (is_store) {
> +        tcg_out_mov(s, arg1, data_reg);
> +        tcg_out_movi(s, TCG_TYPE_I32, arg2, mem_index);
> +        tgen_calli(s, (tcg_target_ulong)qemu_st_helpers[s_bits]);
> +    } else {
> +        tcg_out_movi(s, TCG_TYPE_I32, arg1, mem_index);
> +        tgen_calli(s, (tcg_target_ulong)qemu_ld_helpers[s_bits]);
> +
> +        /* sign extension */
> +        switch (opc) {
> +        case LD_INT8:
> +            tcg_out_insn(s, RSY, SLLG, data_reg, arg0, TCG_REG_NONE, 56);
> +            tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, TCG_REG_NONE, 56);
> +            break;
> +        case LD_INT16:
> +            tcg_out_insn(s, RSY, SLLG, data_reg, arg0, TCG_REG_NONE, 48);
> +            tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, TCG_REG_NONE, 48);
> +            break;
> +        case LD_INT32:
> +            tcg_out_insn(s, RRE, LGFR, data_reg, arg0);
> +            break;
> +        default:
> +            /* unsigned -> just copy */
> +            tcg_out_mov(s, data_reg, arg0);
> +            break;
> +        }
> +    }
> +
> +    /* jump to label2 (end) */
> +    *label2_ptr_p = (uint16_t*)s->code_ptr;
> +
> +    tcg_out_insn(s, RI, BRC, S390_CC_ALWAYS, 0);
> +
> +    /* this is label1, patch branch */
> +    *(label1_ptr + 1) = ((unsigned long)s->code_ptr -
> +                         (unsigned long)label1_ptr) >> 1;
> +
> +    if (is_store) {
> +        tcg_out_insn(s, RXY, LG, arg1, arg1, 0,
> +                     offsetof(CPUTLBEntry, addend)
> +                     - offsetof(CPUTLBEntry, addr_write));
> +    } else {
> +        tcg_out_insn(s, RXY, LG, arg1, arg1, 0,
> +                     offsetof(CPUTLBEntry, addend)
> +                     - offsetof(CPUTLBEntry, addr_read));
> +    }
> +
> +#if TARGET_LONG_BITS == 32
> +    /* zero upper 32 bits */
> +    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
> +#else
> +    /* just copy */
> +    tcg_out_mov(s, arg0, addr_reg);
> +#endif
> +    tcg_out_insn(s, RRE, AGR, arg0, arg1);
> +}
> +
> +static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
> +{
> +    /* patch branch */
> +    *(label2_ptr + 1) = ((unsigned long)s->code_ptr -
> +                         (unsigned long)label2_ptr) >> 1;
> +}
> +
> +#else /* CONFIG_SOFTMMU */
> +
> +static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
> +                                int mem_index, int opc,
> +                                uint16_t **label2_ptr_p, int is_store)
> +{
> +    int arg0 = TCG_REG_R2;
> +
> +    /* user mode, no address translation required */
> +    if (TARGET_LONG_BITS == 32) {
> +        tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
> +    } else {
> +        tcg_out_mov(s, arg0, addr_reg);
> +    }
> +}
> +
> +static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
> +{
> +}
> +
> +#endif /* CONFIG_SOFTMMU */
> +
> +/* load data with address translation (if applicable)
> +   and endianness conversion */
> +static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
> +{
> +    int addr_reg, data_reg, mem_index;
> +    int arg0 = TCG_REG_R2;
> +    uint16_t *label2_ptr;
> +
> +    data_reg = *args++;
> +    addr_reg = *args++;
> +    mem_index = *args;
> +
> +    dprintf("tcg_out_qemu_ld opc %d data_reg %d addr_reg %d mem_index %d\n"
> +            opc, data_reg, addr_reg, mem_index);
> +
> +    tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
> +                          opc, &label2_ptr, 0);
> +
> +    switch (opc) {
> +    case LD_UINT8:
> +        tcg_out_insn(s, RXY, LLGC, data_reg, arg0, 0, 0);
> +        break;
> +    case LD_INT8:
> +        tcg_out_insn(s, RXY, LGB, data_reg, arg0, 0, 0);
> +        break;
> +    case LD_UINT16:
> +#ifdef TARGET_WORDS_BIGENDIAN
> +        tcg_out_insn(s, RXY, LLGH, data_reg, arg0, 0, 0);
> +#else
> +        /* swapped unsigned halfword load with upper bits zeroed */
> +        tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
> +        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, 0xffffL);
> +        tcg_out_insn(s, RRE, NGR, data_reg, 13);
> +#endif
> +        break;
> +    case LD_INT16:
> +#ifdef TARGET_WORDS_BIGENDIAN
> +        tcg_out_insn(s, RXY, LGH, data_reg, arg0, 0, 0);
> +#else
> +        /* swapped sign-extended halfword load */
> +        tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
> +        tcg_out_insn(s, RSY, SLLG, data_reg, data_reg, TCG_REG_NONE, 48);
> +        tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, TCG_REG_NONE, 48);
> +#endif
> +        break;
> +    case LD_UINT32:
> +#ifdef TARGET_WORDS_BIGENDIAN
> +        tcg_out_insn(s, RXY, LLGF, data_reg, arg0, 0, 0);
> +#else
> +        /* swapped unsigned int load with upper bits zeroed */
> +        tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
> +        tcg_out_insn(s, RRE, LLGFR, data_reg, data_reg);
> +#endif
> +        break;
> +    case LD_INT32:
> +#ifdef TARGET_WORDS_BIGENDIAN
> +        tcg_out_insn(s, RXY, LGF, data_reg, arg0, 0, 0);
> +#else
> +        /* swapped sign-extended int load */
> +        tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
> +        tcg_out_insn(s, RRE, LGFR, data_reg, data_reg);
> +#endif
> +        break;
> +    case LD_UINT64:
> +#ifdef TARGET_WORDS_BIGENDIAN
> +        tcg_out_insn(s, RXY, LG, data_reg, arg0, 0, 0);
> +#else
> +        tcg_out_insn(s, RXY, LRVG, data_reg, arg0, 0, 0);
> +#endif
> +        break;
> +    default:
> +        tcg_abort();
> +    }
> +
> +    tcg_finish_qemu_ldst(s, label2_ptr);
> +}
> +
> +static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
> +{
> +    int addr_reg, data_reg, mem_index;
> +    uint16_t *label2_ptr;
> +    int arg0 = TCG_REG_R2;
> +
> +    data_reg = *args++;
> +    addr_reg = *args++;
> +    mem_index = *args;
> +
> +    dprintf("tcg_out_qemu_st opc %d data_reg %d addr_reg %d mem_index %d\n"
> +            opc, data_reg, addr_reg, mem_index);
> +
> +    tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
> +                          opc, &label2_ptr, 1);
> +
> +    switch (opc) {
> +    case LD_UINT8:
> +        tcg_out_insn(s, RX, STC, data_reg, arg0, 0, 0);
> +        break;
> +    case LD_UINT16:
> +#ifdef TARGET_WORDS_BIGENDIAN
> +        tcg_out_insn(s, RX, STH, data_reg, arg0, 0, 0);
> +#else
> +        tcg_out_insn(s, RXY, STRVH, data_reg, arg0, 0, 0);
> +#endif
> +        break;
> +    case LD_UINT32:
> +#ifdef TARGET_WORDS_BIGENDIAN
> +        tcg_out_insn(s, RX, ST, data_reg, arg0, 0, 0);
> +#else
> +        tcg_out_insn(s, RXY, STRV, data_reg, arg0, 0, 0);
> +#endif
> +        break;
> +    case LD_UINT64:
> +#ifdef TARGET_WORDS_BIGENDIAN
> +        tcg_out_insn(s, RXY, STG, data_reg, arg0, 0, 0);
> +#else
> +        tcg_out_insn(s, RXY, STRVG, data_reg, arg0, 0, 0);
> +#endif
> +        break;
> +    default:
> +        tcg_abort();
> +    }
> +
> +    tcg_finish_qemu_ldst(s, label2_ptr);
>  }
>  
>  static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>                  const TCGArg *args, const int *const_args)
>  {
> -    tcg_abort();
> +    S390Opcode op;
> +
> +    switch (opc) {
> +    case INDEX_op_exit_tb:
> +        /* return value */
> +        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R2, args[0]);
> +        tgen_gotoi(s, S390_CC_ALWAYS, (unsigned long)tb_ret_addr);
> +        break;
> +
> +    case INDEX_op_goto_tb:
> +        if (s->tb_jmp_offset) {
> +            tcg_abort();
> +        } else {
> +            tcg_target_long off = ((tcg_target_long)(s->tb_next + args[0]) -
> +                                   (tcg_target_long)s->code_ptr) >> 1;
> +            if (off == (int32_t)off) {
> +                /* load address relative to PC */
> +                tcg_out_insn(s, RIL, LARL, TCG_REG_R13, off);
> +            } else {
> +                /* too far for larl */
> +                tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
> +                             (tcg_target_long)(s->tb_next + args[0]));
> +            }
> +            /* load address stored at s->tb_next + args[0] */
> +            tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R13, TCG_REG_R13, 0);
> +            /* and go there */
> +            tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R13);
> +        }
> +        s->tb_next_offset[args[0]] = s->code_ptr - s->code_buf;
> +        break;
> +
> +    case INDEX_op_call:
> +        if (const_args[0]) {
> +            tgen_calli(s, args[0]);
> +        } else {
> +            tcg_out_insn(s, RR, BASR, TCG_REG_R14, args[0]);
> +        }
> +        break;
> +
> +    case INDEX_op_jmp:
> +        /* XXX */
> +        tcg_abort();
> +        break;
> +
> +    case INDEX_op_ld8u_i32:
> +    case INDEX_op_ld8u_i64:
> +        /* ??? LLC (RXY format) is only present with the extended-immediate
> +           facility, whereas LLGC is always present.  */
> +        tcg_out_mem(s, 0, RXY_LLGC, args[0], args[1], TCG_REG_NONE, args[2]);
> +        break;
> +
> +    case INDEX_op_ld8s_i32:
> +    case INDEX_op_ld8s_i64:
> +        /* ??? LB is no smaller than LGB, so no point to using it.  */
> +        tcg_out_mem(s, 0, RXY_LGB, args[0], args[1], TCG_REG_NONE, args[2]);
> +        break;
> +
> +    case INDEX_op_ld16u_i32:
> +    case INDEX_op_ld16u_i64:
> +        /* ??? LLH (RXY format) is only present with the extended-immediate
> +           facility, whereas LLGH is always present.  */
> +        tcg_out_mem(s, 0, RXY_LLGH, args[0], args[1], TCG_REG_NONE, args[2]);
> +        break;
> +
> +    case INDEX_op_ld16s_i32:
> +        tcg_out_mem(s, RX_LH, RXY_LHY, args[0], args[1], TCG_REG_NONE, args[2]);
> +        break;
> +    case INDEX_op_ld16s_i64:
> +        tcg_out_mem(s, 0, RXY_LGH, args[0], args[1], TCG_REG_NONE, args[2]);
> +        break;
> +
> +    case INDEX_op_ld_i32:
> +        tcg_out_ld(s, TCG_TYPE_I32, args[0], args[1], args[2]);
> +        break;
> +    case INDEX_op_ld32u_i64:
> +        tcg_out_mem(s, 0, RXY_LLGF, args[0], args[1], TCG_REG_NONE, args[2]);
> +        break;
> +    case INDEX_op_ld32s_i64:
> +        tcg_out_mem(s, 0, RXY_LGF, args[0], args[1], TCG_REG_NONE, args[2]);
> +        break;
> +
> +    case INDEX_op_ld_i64:
> +        tcg_out_ld(s, TCG_TYPE_I64, args[0], args[1], args[2]);
> +        break;
> +
> +    case INDEX_op_st8_i32:
> +    case INDEX_op_st8_i64:
> +        tcg_out_mem(s, RX_STC, RXY_STCY, args[0], args[1],
> +                    TCG_REG_NONE, args[2]);
> +        break;
> +
> +    case INDEX_op_st16_i32:
> +    case INDEX_op_st16_i64:
> +        tcg_out_mem(s, RX_STH, RXY_STHY, args[0], args[1],
> +                    TCG_REG_NONE, args[2]);
> +        break;
> +
> +    case INDEX_op_st_i32:
> +    case INDEX_op_st32_i64:
> +        tcg_out_st(s, TCG_TYPE_I32, args[0], args[1], args[2]);
> +        break;
> +
> +    case INDEX_op_st_i64:
> +        tcg_out_st(s, TCG_TYPE_I64, args[0], args[1], args[2]);
> +        break;
> +
> +    case INDEX_op_mov_i32:
> +        /* XXX */
> +        tcg_abort();
> +        break;
> +
> +    case INDEX_op_movi_i32:
> +        /* XXX */
> +        tcg_abort();
> +        break;
> +
> +    case INDEX_op_add_i32:
> +        if (const_args[2]) {
> +            tcg_out_insn(s, RI, AHI, args[0], args[2]);
> +        } else {
> +            tcg_out_insn(s, RR, AR, args[0], args[2]);
> +        }
> +        break;
> +
> +    case INDEX_op_add_i64:
> +        tcg_out_insn(s, RRE, AGR, args[0], args[2]);
> +        break;
> +
> +    case INDEX_op_sub_i32:
> +        tcg_out_insn(s, RR, SR, args[0], args[2]);
> +        break;
> +
> +    case INDEX_op_sub_i64:
> +        tcg_out_insn(s, RRE, SGR, args[0], args[2]);
> +        break;
> +
> +    case INDEX_op_and_i32:
> +        tcg_out_insn(s, RR, NR, args[0], args[2]);
> +        break;
> +    case INDEX_op_or_i32:
> +        tcg_out_insn(s, RR, OR, args[0], args[2]);
> +        break;
> +    case INDEX_op_xor_i32:
> +        tcg_out_insn(s, RR, XR, args[0], args[2]);
> +        break;
> +
> +    case INDEX_op_and_i64:
> +        tcg_out_insn(s, RRE, NGR, args[0], args[2]);
> +        break;
> +    case INDEX_op_or_i64:
> +        tcg_out_insn(s, RRE, OGR, args[0], args[2]);
> +        break;
> +    case INDEX_op_xor_i64:
> +        tcg_out_insn(s, RRE, XGR, args[0], args[2]);
> +        break;
> +
> +    case INDEX_op_neg_i32:
> +        /* FIXME: optimize args[0] != args[1] case */
> +        tcg_out_insn(s, RR, LR, 13, args[1]);
> +        tcg_out_movi(s, TCG_TYPE_I32, args[0], 0);
> +        tcg_out_insn(s, RR, SR, args[0], 13);
> +        break;
> +    case INDEX_op_neg_i64:
> +        /* FIXME: optimize args[0] != args[1] case */
> +        tcg_out_mov(s, TCG_REG_R13, args[1]);
> +        tcg_out_movi(s, TCG_TYPE_I64, args[0], 0);
> +        tcg_out_insn(s, RRE, SGR, args[0], TCG_REG_R13);
> +        break;
> +
> +    case INDEX_op_mul_i32:
> +        tcg_out_insn(s, RRE, MSR, args[0], args[2]);
> +        break;
> +    case INDEX_op_mul_i64:
> +        tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
> +        break;
> +
> +    case INDEX_op_div2_i32:
> +        tcg_out_insn(s, RR, DR, TCG_REG_R2, args[4]);
> +        break;
> +    case INDEX_op_divu2_i32:
> +        tcg_out_insn(s, RRE, DLR, TCG_REG_R2, args[4]);
> +        break;
> +
> +    case INDEX_op_div2_i64:
> +        /* ??? We get an unnecessary sign-extension of the dividend
> +           into R3 with this definition, but as we do in fact always
> +           produce both quotient and remainder using INDEX_op_div_i64
> +           instead requires jumping through even more hoops.  */
> +        tcg_out_insn(s, RRE, DSGR, TCG_REG_R2, args[4]);
> +        break;
> +    case INDEX_op_divu2_i64:
> +        tcg_out_insn(s, RRE, DLGR, TCG_REG_R2, args[4]);
> +        break;
> +
> +    case INDEX_op_shl_i32:
> +        op = RS_SLL;
> +    do_shift32:
> +        if (const_args[2]) {
> +            tcg_out_sh32(s, op, args[0], TCG_REG_NONE, args[2]);
> +        } else {
> +            tcg_out_sh32(s, op, args[0], args[2], 0);
> +        }
> +        break;
> +    case INDEX_op_shr_i32:
> +        op = RS_SRL;
> +        goto do_shift32;
> +    case INDEX_op_sar_i32:
> +        op = RS_SRA;
> +        goto do_shift32;
> +
> +    case INDEX_op_shl_i64:
> +        op = RSY_SLLG;
> +    do_shift64:
> +        if (const_args[2]) {
> +            tcg_out_sh64(s, op, args[0], args[1], TCG_REG_NONE, args[2]);
> +        } else {
> +            tcg_out_sh64(s, op, args[0], args[1], args[2], 0);
> +        }
> +        break;
> +    case INDEX_op_shr_i64:
> +        op = RSY_SRLG;
> +        goto do_shift64;
> +    case INDEX_op_sar_i64:
> +        op = RSY_SRAG;
> +        goto do_shift64;
> +
> +    case INDEX_op_br:
> +        tgen_branch(s, S390_CC_ALWAYS, args[0]);
> +        break;
> +
> +    case INDEX_op_brcond_i64:
> +        tgen64_cmp(s, args[2], args[0], args[1]);
> +        goto do_brcond;
> +    case INDEX_op_brcond_i32:
> +        tgen32_cmp(s, args[2], args[0], args[1]);
> +    do_brcond:
> +        tgen_branch(s, tcg_cond_to_s390_cond[args[2]], args[3]);
> +        break;
> +
> +    case INDEX_op_setcond_i32:
> +        tgen_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1], args[2]);
> +        break;
> +    case INDEX_op_setcond_i64:
> +        tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1], args[2]);
> +        break;
> +
> +    case INDEX_op_qemu_ld8u:
> +        tcg_out_qemu_ld(s, args, LD_UINT8);
> +        break;
> +
> +    case INDEX_op_qemu_ld8s:
> +        tcg_out_qemu_ld(s, args, LD_INT8);
> +        break;
> +
> +    case INDEX_op_qemu_ld16u:
> +        tcg_out_qemu_ld(s, args, LD_UINT16);
> +        break;
> +
> +    case INDEX_op_qemu_ld16s:
> +        tcg_out_qemu_ld(s, args, LD_INT16);
> +        break;
> +
> +    case INDEX_op_qemu_ld32:
> +        /* ??? Technically we can use a non-extending instruction.  */
> +    case INDEX_op_qemu_ld32u:
> +        tcg_out_qemu_ld(s, args, LD_UINT32);
> +        break;
> +
> +    case INDEX_op_qemu_ld32s:
> +        tcg_out_qemu_ld(s, args, LD_INT32);
> +        break;
> +
> +    case INDEX_op_qemu_ld64:
> +        tcg_out_qemu_ld(s, args, LD_UINT64);
> +        break;
> +
> +    case INDEX_op_qemu_st8:
> +        tcg_out_qemu_st(s, args, LD_UINT8);
> +        break;
> +
> +    case INDEX_op_qemu_st16:
> +        tcg_out_qemu_st(s, args, LD_UINT16);
> +        break;
> +
> +    case INDEX_op_qemu_st32:
> +        tcg_out_qemu_st(s, args, LD_UINT32);
> +        break;
> +
> +    case INDEX_op_qemu_st64:
> +        tcg_out_qemu_st(s, args, LD_UINT64);
> +        break;
> +
> +    default:
> +        fprintf(stderr,"unimplemented opc 0x%x\n",opc);
> +        tcg_abort();
> +    }
>  }
>  
> +static const TCGTargetOpDef s390_op_defs[] = {
> +    { INDEX_op_exit_tb, { } },
> +    { INDEX_op_goto_tb, { } },
> +    { INDEX_op_call, { "ri" } },
> +    { INDEX_op_jmp, { "ri" } },
> +    { INDEX_op_br, { } },
> +
> +    { INDEX_op_mov_i32, { "r", "r" } },
> +    { INDEX_op_movi_i32, { "r" } },
> +
> +    { INDEX_op_ld8u_i32, { "r", "r" } },
> +    { INDEX_op_ld8s_i32, { "r", "r" } },
> +    { INDEX_op_ld16u_i32, { "r", "r" } },
> +    { INDEX_op_ld16s_i32, { "r", "r" } },
> +    { INDEX_op_ld_i32, { "r", "r" } },
> +    { INDEX_op_st8_i32, { "r", "r" } },
> +    { INDEX_op_st16_i32, { "r", "r" } },
> +    { INDEX_op_st_i32, { "r", "r" } },
> +
> +    { INDEX_op_add_i32, { "r", "0", "rI" } },
> +    { INDEX_op_sub_i32, { "r", "0", "r" } },
> +    { INDEX_op_mul_i32, { "r", "0", "r" } },
> +
> +    { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
> +    { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
> +
> +    { INDEX_op_and_i32, { "r", "0", "r" } },
> +    { INDEX_op_or_i32, { "r", "0", "r" } },
> +    { INDEX_op_xor_i32, { "r", "0", "r" } },
> +    { INDEX_op_neg_i32, { "r", "r" } },
> +
> +    { INDEX_op_shl_i32, { "r", "0", "Ri" } },
> +    { INDEX_op_shr_i32, { "r", "0", "Ri" } },
> +    { INDEX_op_sar_i32, { "r", "0", "Ri" } },
> +
> +    { INDEX_op_brcond_i32, { "r", "r" } },
> +    { INDEX_op_setcond_i32, { "r", "r", "r" } },
> +
> +    { INDEX_op_qemu_ld8u, { "r", "L" } },
> +    { INDEX_op_qemu_ld8s, { "r", "L" } },
> +    { INDEX_op_qemu_ld16u, { "r", "L" } },
> +    { INDEX_op_qemu_ld16s, { "r", "L" } },
> +    { INDEX_op_qemu_ld32u, { "r", "L" } },
> +    { INDEX_op_qemu_ld32s, { "r", "L" } },
> +    { INDEX_op_qemu_ld32, { "r", "L" } },
> +    { INDEX_op_qemu_ld64, { "r", "L" } },
> +
> +    { INDEX_op_qemu_st8, { "L", "L" } },
> +    { INDEX_op_qemu_st16, { "L", "L" } },
> +    { INDEX_op_qemu_st32, { "L", "L" } },
> +    { INDEX_op_qemu_st64, { "L", "L" } },
> +
> +#if defined(__s390x__)
> +    { INDEX_op_mov_i64, { "r", "r" } },
> +    { INDEX_op_movi_i64, { "r" } },
> +
> +    { INDEX_op_ld8u_i64, { "r", "r" } },
> +    { INDEX_op_ld8s_i64, { "r", "r" } },
> +    { INDEX_op_ld16u_i64, { "r", "r" } },
> +    { INDEX_op_ld16s_i64, { "r", "r" } },
> +    { INDEX_op_ld32u_i64, { "r", "r" } },
> +    { INDEX_op_ld32s_i64, { "r", "r" } },
> +    { INDEX_op_ld_i64, { "r", "r" } },
> +
> +    { INDEX_op_st8_i64, { "r", "r" } },
> +    { INDEX_op_st16_i64, { "r", "r" } },
> +    { INDEX_op_st32_i64, { "r", "r" } },
> +    { INDEX_op_st_i64, { "r", "r" } },
> +
> +    { INDEX_op_add_i64, { "r", "0", "r" } },
> +    { INDEX_op_sub_i64, { "r", "0", "r" } },
> +    { INDEX_op_mul_i64, { "r", "0", "r" } },
> +
> +    { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
> +    { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
> +
> +    { INDEX_op_and_i64, { "r", "0", "r" } },
> +    { INDEX_op_or_i64, { "r", "0", "r" } },
> +    { INDEX_op_xor_i64, { "r", "0", "r" } },
> +    { INDEX_op_neg_i64, { "r", "r" } },
> +
> +    { INDEX_op_shl_i64, { "r", "r", "Ri" } },
> +    { INDEX_op_shr_i64, { "r", "r", "Ri" } },
> +    { INDEX_op_sar_i64, { "r", "r", "Ri" } },
> +
> +    { INDEX_op_brcond_i64, { "r", "r" } },
> +    { INDEX_op_setcond_i64, { "r", "r", "r" } },
> +#endif
> +
> +    { -1 },
> +};
> +
>  void tcg_target_init(TCGContext *s)
>  {
> -    /* gets called with KVM */
> +#if !defined(CONFIG_USER_ONLY)
> +    /* fail safe */
> +    if ((1 << CPU_TLB_ENTRY_BITS) != sizeof(CPUTLBEntry)) {
> +        tcg_abort();
> +    }
> +#endif
> +
> +    tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffff);
> +    tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffff);
> +    tcg_regset_set32(tcg_target_call_clobber_regs, 0,
> +                     (1 << TCG_REG_R0) |
> +                     (1 << TCG_REG_R1) |
> +                     (1 << TCG_REG_R2) |
> +                     (1 << TCG_REG_R3) |
> +                     (1 << TCG_REG_R4) |
> +                     (1 << TCG_REG_R5) |
> +                     (1 << TCG_REG_R14)); /* link register */
> +
> +    tcg_regset_clear(s->reserved_regs);
> +    /* frequently used as a temporary */
> +    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13);
> +    /* another temporary */
> +    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R12);
> +    /* XXX many insns can't be used with R0, so we better avoid it for now */
> +    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R0);
> +    /* The stack pointer.  */
> +    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R15);
> +
> +    tcg_add_target_add_op_defs(s390_op_defs);
>  }
>  
>  void tcg_target_qemu_prologue(TCGContext *s)
>  {
> -    /* gets called with KVM */
> -}
> +    /* stmg %r6,%r15,48(%r15) (save registers) */
> +    tcg_out_insn(s, RXY, STMG, TCG_REG_R6, TCG_REG_R15, TCG_REG_R15, 48);
>  
> -static inline void tcg_out_mov(TCGContext *s, int ret, int arg)
> -{
> -    tcg_abort();
> +    /* aghi %r15,-160 (stack frame) */
> +    tcg_out_insn(s, RI, AGHI, TCG_REG_R15, -160);
> +
> +    /* br %r2 (go to TB) */
> +    tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R2);
> +
> +    tb_ret_addr = s->code_ptr;
> +
> +    /* lmg %r6,%r15,208(%r15) (restore registers) */
> +    tcg_out_insn(s, RXY, LMG, TCG_REG_R6, TCG_REG_R15, TCG_REG_R15, 208);
> +
> +    /* br %r14 (return) */
> +    tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R14);
>  }
>  
>  static inline void tcg_out_addi(TCGContext *s, int reg, tcg_target_long val)
> diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
> index 8c19262..26dafae 100644
> --- a/tcg/s390/tcg-target.h
> +++ b/tcg/s390/tcg-target.h
> @@ -26,7 +26,7 @@
>  #define TCG_TARGET_REG_BITS 64
>  #define TCG_TARGET_WORDS_BIGENDIAN
>  
> -enum {
> +typedef enum TCGReg {
>      TCG_REG_R0 = 0,
>      TCG_REG_R1,
>      TCG_REG_R2,
> @@ -43,11 +43,12 @@ enum {
>      TCG_REG_R13,
>      TCG_REG_R14,
>      TCG_REG_R15
> -};
> +} TCGReg;
> +
>  #define TCG_TARGET_NB_REGS 16
>  
>  /* optional instructions */
> -// #define TCG_TARGET_HAS_div_i32
> +#define TCG_TARGET_HAS_div2_i32
>  // #define TCG_TARGET_HAS_rot_i32
>  // #define TCG_TARGET_HAS_ext8s_i32
>  // #define TCG_TARGET_HAS_ext16s_i32
> @@ -56,14 +57,14 @@ enum {
>  // #define TCG_TARGET_HAS_bswap16_i32
>  // #define TCG_TARGET_HAS_bswap32_i32
>  // #define TCG_TARGET_HAS_not_i32
> -// #define TCG_TARGET_HAS_neg_i32
> +#define TCG_TARGET_HAS_neg_i32
>  // #define TCG_TARGET_HAS_andc_i32
>  // #define TCG_TARGET_HAS_orc_i32
>  // #define TCG_TARGET_HAS_eqv_i32
>  // #define TCG_TARGET_HAS_nand_i32
>  // #define TCG_TARGET_HAS_nor_i32
>  
> -// #define TCG_TARGET_HAS_div_i64
> +#define TCG_TARGET_HAS_div2_i64
>  // #define TCG_TARGET_HAS_rot_i64
>  // #define TCG_TARGET_HAS_ext8s_i64
>  // #define TCG_TARGET_HAS_ext16s_i64
> @@ -75,7 +76,7 @@ enum {
>  // #define TCG_TARGET_HAS_bswap32_i64
>  // #define TCG_TARGET_HAS_bswap64_i64
>  // #define TCG_TARGET_HAS_not_i64
> -// #define TCG_TARGET_HAS_neg_i64
> +#define TCG_TARGET_HAS_neg_i64
>  // #define TCG_TARGET_HAS_andc_i64
>  // #define TCG_TARGET_HAS_orc_i64
>  // #define TCG_TARGET_HAS_eqv_i64
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 11/35] tcg-s390: Tidy unimplemented opcodes.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 11/35] tcg-s390: Tidy unimplemented opcodes Richard Henderson
@ 2010-06-10 10:24   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-10 10:24 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:19PM -0700, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c |   25 ++++++++++---------------
>  1 files changed, 10 insertions(+), 15 deletions(-)
> 
> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index 55f0fa9..5b2134b 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -824,11 +824,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>          }
>          break;
>  
> -    case INDEX_op_jmp:
> -        /* XXX */
> -        tcg_abort();
> -        break;
> -
>      case INDEX_op_ld8u_i32:
>      case INDEX_op_ld8u_i64:
>          /* ??? LLC (RXY format) is only present with the extended-immediate
> @@ -891,16 +886,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>          tcg_out_st(s, TCG_TYPE_I64, args[0], args[1], args[2]);
>          break;
>  
> -    case INDEX_op_mov_i32:
> -        /* XXX */
> -        tcg_abort();
> -        break;
> -
> -    case INDEX_op_movi_i32:
> -        /* XXX */
> -        tcg_abort();
> -        break;
> -
>      case INDEX_op_add_i32:
>          if (const_args[2]) {
>              tcg_out_insn(s, RI, AHI, args[0], args[2]);
> @@ -1077,6 +1062,16 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>          tcg_out_qemu_st(s, args, LD_UINT64);
>          break;
>  
> +    case INDEX_op_mov_i32:
> +    case INDEX_op_mov_i64:
> +    case INDEX_op_movi_i32:
> +    case INDEX_op_movi_i64:
> +        /* These are always emitted by TCG directly.  */

OTOH, these four are very easy to implement in case TCG starts to emit
such opcodes at some point.
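
Untested, but if they are ever needed, something along these lines, using
the helpers already present in this backend, ought to be enough:

    case INDEX_op_mov_i32:
        /* 32-bit register move; a 64-bit LGR via tcg_out_mov would
           also be correct.  */
        tcg_out_insn(s, RR, LR, args[0], args[1]);
        break;
    case INDEX_op_mov_i64:
        tcg_out_mov(s, args[0], args[1]);
        break;
    case INDEX_op_movi_i32:
        tcg_out_movi(s, TCG_TYPE_I32, args[0], args[1]);
        break;
    case INDEX_op_movi_i64:
        tcg_out_movi(s, TCG_TYPE_I64, args[0], args[1]);
        break;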

> +    case INDEX_op_jmp:
> +        /* This one is obsolete and never emitted.  */
> +        tcg_abort();

I am fine with this one.

> +        break;
> +
>      default:
>          fprintf(stderr,"unimplemented opc 0x%x\n",opc);
>          tcg_abort();
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 12/35] tcg-s390: Define TCG_TMP0.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 12/35] tcg-s390: Define TCG_TMP0 Richard Henderson
@ 2010-06-10 10:25   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-10 10:25 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:20PM -0700, Richard Henderson wrote:
> Use a define for the temp register instead of hard-coding it.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c |   54 ++++++++++++++++++++++++++----------------------
>  1 files changed, 29 insertions(+), 25 deletions(-)

This patch looks ok.

> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index 5b2134b..2b80c02 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -40,6 +40,10 @@
>     rather than TCG_REG_R0.  */
>  #define TCG_REG_NONE    0
>  
> +/* A scratch register that may be used throughout the backend.  */
> +#define TCG_TMP0        TCG_REG_R13
> +
> +
>  /* All of the following instructions are prefixed with their instruction
>     format, and are defined as 8- or 16-bit quantities, even when the two
>     halves of the 16-bit quantity may appear 32 bits apart in the insn.
> @@ -376,12 +380,12 @@ static inline void tcg_out_movi(TCGContext *s, TCGType type,
>          tcg_out_insn(s, RI, IILH, ret, arg >> 16);
>      } else {
>          /* branch over constant and store its address in R13 */
> -        tcg_out_insn(s, RIL, BRASL, TCG_REG_R13, (6 + 8) >> 1);
> +        tcg_out_insn(s, RIL, BRASL, TCG_TMP0, (6 + 8) >> 1);
>          /* 64-bit constant */
>          tcg_out32(s, arg >> 32);
>          tcg_out32(s, arg);
>          /* load constant to ret */
> -        tcg_out_insn(s, RXY, LG, ret, TCG_REG_R13, 0, 0);
> +        tcg_out_insn(s, RXY, LG, ret, TCG_TMP0, 0, 0);
>      }
>  }
>  
> @@ -399,14 +403,14 @@ static void tcg_out_mem(TCGContext *s, S390Opcode opc_rx, S390Opcode opc_rxy,
>      if (ofs < -0x80000 || ofs >= 0x80000) {
>          /* Combine the low 16 bits of the offset with the actual load insn;
>             the high 48 bits must come from an immediate load.  */
> -        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, ofs & ~0xffff);
> +        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, ofs & ~0xffff);
>          ofs &= 0xffff;
>  
>          /* If we were already given an index register, add it in.  */
>          if (index != TCG_REG_NONE) {
> -            tcg_out_insn(s, RRE, AGR, TCG_REG_R13, index);
> +            tcg_out_insn(s, RRE, AGR, TCG_TMP0, index);
>          }
> -        index = TCG_REG_R13;
> +        index = TCG_TMP0;
>      }
>  
>      if (opc_rx && ofs >= 0 && ofs < 0x1000) {
> @@ -482,8 +486,8 @@ static void tgen_gotoi(TCGContext *s, int cc, tcg_target_long dest)
>      } else if (off == (int32_t)off) {
>          tcg_out_insn(s, RIL, BRCL, cc, off);
>      } else {
> -        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, dest);
> -        tcg_out_insn(s, RR, BCR, cc, TCG_REG_R13);
> +        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, dest);
> +        tcg_out_insn(s, RR, BCR, cc, TCG_TMP0);
>      }
>  }
>  
> @@ -505,8 +509,8 @@ static void tgen_calli(TCGContext *s, tcg_target_long dest)
>      if (off == (int32_t)off) {
>          tcg_out_insn(s, RIL, BRASL, TCG_REG_R14, off);
>      } else {
> -        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, dest);
> -        tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_REG_R13);
> +        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, dest);
> +        tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_TMP0);
>      }
>  }
>  
> @@ -538,22 +542,22 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
>      tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, TCG_REG_NONE,
>                   TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
>  
> -    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
> +    tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
>                   TARGET_PAGE_MASK | ((1 << s_bits) - 1));
> -    tcg_out_insn(s, RRE, NGR, arg0, TCG_REG_R13);
> +    tcg_out_insn(s, RRE, NGR, arg0, TCG_TMP0);
>  
> -    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
> +    tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
>                   (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
> -    tcg_out_insn(s, RRE, NGR, arg1, TCG_REG_R13);
> +    tcg_out_insn(s, RRE, NGR, arg1, TCG_TMP0);
>  
>      if (is_store) {
> -        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
> +        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
>                       offsetof(CPUState, tlb_table[mem_index][0].addr_write));
>      } else {
> -        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
> +        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
>                       offsetof(CPUState, tlb_table[mem_index][0].addr_read));
>      }
> -    tcg_out_insn(s, RRE, AGR, arg1, TCG_REG_R13);
> +    tcg_out_insn(s, RRE, AGR, arg1, TCG_TMP0);
>  
>      tcg_out_insn(s, RRE, AGR, arg1, TCG_AREG0);
>  
> @@ -688,8 +692,8 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
>  #else
>          /* swapped unsigned halfword load with upper bits zeroed */
>          tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
> -        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, 0xffffL);
> -        tcg_out_insn(s, RRE, NGR, data_reg, 13);
> +        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, 0xffffL);
> +        tcg_out_insn(s, RRE, NGR, data_reg, TCG_TMP0);
>  #endif
>          break;
>      case LD_INT16:
> @@ -802,16 +806,16 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>                                     (tcg_target_long)s->code_ptr) >> 1;
>              if (off == (int32_t)off) {
>                  /* load address relative to PC */
> -                tcg_out_insn(s, RIL, LARL, TCG_REG_R13, off);
> +                tcg_out_insn(s, RIL, LARL, TCG_TMP0, off);
>              } else {
>                  /* too far for larl */
> -                tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
> +                tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
>                               (tcg_target_long)(s->tb_next + args[0]));
>              }
>              /* load address stored at s->tb_next + args[0] */
> -            tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R13, TCG_REG_R13, 0);
> +            tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP0, TCG_TMP0, 0);
>              /* and go there */
> -            tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R13);
> +            tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_TMP0);
>          }
>          s->tb_next_offset[args[0]] = s->code_ptr - s->code_buf;
>          break;
> @@ -934,9 +938,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>          break;
>      case INDEX_op_neg_i64:
>          /* FIXME: optimize args[0] != args[1] case */
> -        tcg_out_mov(s, TCG_REG_R13, args[1]);
> +        tcg_out_mov(s, TCG_TMP0, args[1]);
>          tcg_out_movi(s, TCG_TYPE_I64, args[0], 0);
> -        tcg_out_insn(s, RRE, SGR, args[0], TCG_REG_R13);
> +        tcg_out_insn(s, RRE, SGR, args[0], TCG_TMP0);
>          break;
>  
>      case INDEX_op_mul_i32:
> @@ -1192,7 +1196,7 @@ void tcg_target_init(TCGContext *s)
>  
>      tcg_regset_clear(s->reserved_regs);
>      /* frequently used as a temporary */
> -    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13);
> +    tcg_regset_set_reg(s->reserved_regs, TCG_TMP0);
>      /* another temporary */
>      tcg_regset_set_reg(s->reserved_regs, TCG_REG_R12);
>      /* XXX many insns can't be used with R0, so we better avoid it for now */
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 13/35] tcg-s390: Tidy regset initialization; use R14 as temporary.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 13/35] tcg-s390: Tidy regset initialization; use R14 as temporary Richard Henderson
@ 2010-06-10 10:26   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-10 10:26 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:21PM -0700, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c |   26 ++++++++++++--------------
>  1 files changed, 12 insertions(+), 14 deletions(-)

This patch looks fine.

> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index 2b80c02..95ea3c8 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -41,7 +41,7 @@
>  #define TCG_REG_NONE    0
>  
>  /* A scratch register that may be used throughout the backend.  */
> -#define TCG_TMP0        TCG_REG_R13
> +#define TCG_TMP0        TCG_REG_R14
>  
>  
>  /* All of the following instructions are prefixed with their instruction
> @@ -1185,24 +1185,22 @@ void tcg_target_init(TCGContext *s)
>  
>      tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffff);
>      tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffff);
> -    tcg_regset_set32(tcg_target_call_clobber_regs, 0,
> -                     (1 << TCG_REG_R0) |
> -                     (1 << TCG_REG_R1) |
> -                     (1 << TCG_REG_R2) |
> -                     (1 << TCG_REG_R3) |
> -                     (1 << TCG_REG_R4) |
> -                     (1 << TCG_REG_R5) |
> -                     (1 << TCG_REG_R14)); /* link register */
> +
> +    tcg_regset_clear(tcg_target_call_clobber_regs);
> +    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R0);
> +    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R1);
> +    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R2);
> +    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R3);
> +    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R4);
> +    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R5);
> +    /* The return register can be considered call-clobbered.  */
> +    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R14);
>  
>      tcg_regset_clear(s->reserved_regs);
> -    /* frequently used as a temporary */
>      tcg_regset_set_reg(s->reserved_regs, TCG_TMP0);
> -    /* another temporary */
> -    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R12);
>      /* XXX many insns can't be used with R0, so we better avoid it for now */
>      tcg_regset_set_reg(s->reserved_regs, TCG_REG_R0);
> -    /* The stack pointer.  */
> -    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R15);
> +    tcg_regset_set_reg(s->reserved_regs, TCG_REG_CALL_STACK);
>  
>      tcg_add_target_add_op_defs(s390_op_defs);
>  }
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 14/35] tcg-s390: Rearrange register allocation order.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 14/35] tcg-s390: Rearrange register allocation order Richard Henderson
@ 2010-06-10 10:26   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-10 10:26 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:22PM -0700, Richard Henderson wrote:
> Try to avoid conflicting with the outgoing function call arguments.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c |   23 +++++++++++++----------
>  1 files changed, 13 insertions(+), 10 deletions(-)

This patch looks fine.

> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index 95ea3c8..3944cb1 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -149,22 +149,25 @@ static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
>  };
>  #endif
>  
> +/* Since R6 is a potential argument register, choose it last of the
> +   call-saved registers.  Likewise prefer the call-clobbered registers
> +   in reverse order to maximize the chance of avoiding the arguments.  */
>  static const int tcg_target_reg_alloc_order[] = {
> -    TCG_REG_R6,
> -    TCG_REG_R7,
> -    TCG_REG_R8,
> -    TCG_REG_R9,
> -    TCG_REG_R10,
> -    TCG_REG_R11,
> -    TCG_REG_R12,
>      TCG_REG_R13,
> +    TCG_REG_R12,
> +    TCG_REG_R11,
> +    TCG_REG_R10,
> +    TCG_REG_R9,
> +    TCG_REG_R8,
> +    TCG_REG_R7,
> +    TCG_REG_R6,
>      TCG_REG_R14,
>      TCG_REG_R0,
>      TCG_REG_R1,
> -    TCG_REG_R2,
> -    TCG_REG_R3,
> -    TCG_REG_R4,
>      TCG_REG_R5,
> +    TCG_REG_R4,
> +    TCG_REG_R3,
> +    TCG_REG_R2,
>  };
>  
>  static const int tcg_target_call_iarg_regs[] = {
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed Richard Henderson
@ 2010-06-10 10:28   ` Aurelien Jarno
  2010-06-10 22:19     ` Richard Henderson
  0 siblings, 1 reply; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-10 10:28 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:23PM -0700, Richard Henderson wrote:
> Verify that we have all the instruction extensions that we generate.
> Future patches can tailor code generation to the set of instructions
> that are present.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c |  113 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 113 insertions(+), 0 deletions(-)
> 
> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index 3944cb1..d99bb5c 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -229,6 +229,17 @@ static void *qemu_st_helpers[4] = {
>  
>  static uint8_t *tb_ret_addr;
>  
> +/* A list of relevant facilities used by this translator.  Some of these
> +   are required for proper operation, and these are checked at startup.  */
> +
> +#define FACILITY_ZARCH		(1ULL << (63 - 1))
> +#define FACILITY_ZARCH_ACTIVE	(1ULL << (63 - 2))
> +#define FACILITY_LONG_DISP	(1ULL << (63 - 18))
> +#define FACILITY_EXT_IMM	(1ULL << (63 - 21))
> +#define FACILITY_GEN_INST_EXT	(1ULL << (63 - 34))
> +
> +static uint64_t facilities;
> +
>  static void patch_reloc(uint8_t *code_ptr, int type,
>                  tcg_target_long value, tcg_target_long addend)
>  {
> @@ -1177,6 +1188,106 @@ static const TCGTargetOpDef s390_op_defs[] = {
>      { -1 },
>  };
>  
> +/* ??? Linux kernels provide an AUXV entry AT_HWCAP that provides most of
> +   this information.  However, getting at that entry is not easy this far
> +   away from main.  Our options are: start searching from environ, but
> +   that fails as soon as someone does a setenv in between.  Read the data
> +   from /proc/self/auxv.  Or do the probing ourselves.  The only thing
> +   extra that AT_HWCAP gives us is HWCAP_S390_HIGH_GPRS, which indicates
> +   that the kernel saves all 64-bits of the registers around traps while
> +   in 31-bit mode.  But this is true of all "recent" kernels (ought to dig
> +   back and see from when this might not be true).  */
> +
> +#include <signal.h>
> +
> +static volatile sig_atomic_t got_sigill;
> +
> +static void sigill_handler(int sig)
> +{
> +    got_sigill = 1;
> +}
> +
> +static void query_facilities(void)
> +{
> +    struct sigaction sa_old, sa_new;
> +    register int r0 __asm__("0");
> +    register void *r1 __asm__("1");
> +    int fail;
> +
> +    memset(&sa_new, 0, sizeof(sa_new));
> +    sa_new.sa_handler = sigill_handler;
> +    sigaction(SIGILL, &sa_new, &sa_old);
> +
> +    /* First, try STORE FACILITY LIST EXTENDED.  If this is present, then
> +       we need not do any more probing.  Unfortunately, this itself is an
> +       extension and the original STORE FACILITY LIST instruction is
> +       kernel-only, storing its results at absolute address 200.  */
> +    /* stfle 0(%r1) */
> +    r1 = &facilities;
> +    asm volatile(".word 0xb2b0,0x1000"
> +                 : "=r"(r0) : "0"(0), "r"(r1) : "memory", "cc");

Wouldn't it be possible to use the instruction directly instead of
dumping the opcode values? Same below

> +
> +    if (got_sigill) {
> +        /* STORE FACILITY EXTENDED is not available.  Probe for one of each
> +           kind of instruction that we're interested in.  */
> +        /* ??? Possibly some of these are in practice never present unless
> +           the store-facility-extended facility is also present.  But since
> +           that isn't documented it's just better to probe for each.  */
> +
> +        /* Test for z/Architecture.  Required even in 31-bit mode.  */
> +        got_sigill = 0;
> +        /* agr %r0,%r0 */
> +        asm volatile(".word 0xb908,0x0000" : "=r"(r0) : : "cc");
> +        if (!got_sigill) {
> +            facilities |= FACILITY_ZARCH | FACILITY_ZARCH_ACTIVE;
> +        }
> +
> +        /* Test for long displacement.  */
> +        got_sigill = 0;
> +        /* ly %r0,0(%r1) */
> +        r1 = &facilities;
> +        asm volatile(".word 0xe300,0x1000,0x0058"
> +                     : "=r"(r0) : "r"(r1) : "cc");
> +        if (!got_sigill) {
> +            facilities |= FACILITY_LONG_DISP;
> +        }
> +
> +        /* Test for extended immediates.  */
> +        got_sigill = 0;
> +        /* afi %r0,0 */
> +        asm volatile(".word 0xc209,0x0000,0x0000" : : : "cc");
> +        if (!got_sigill) {
> +            facilities |= FACILITY_EXT_IMM;
> +        }
> +
> +        /* Test for general-instructions-extension.  */
> +        got_sigill = 0;
> +        /* msfi %r0,1 */
> +        asm volatile(".word 0xc201,0x0000,0x0001");
> +        if (!got_sigill) {
> +            facilities |= FACILITY_GEN_INST_EXT;
> +        }
> +    }
> +
> +    sigaction(SIGILL, &sa_old, NULL);
> +
> +    /* The translator currently uses these extensions unconditionally.
> +       Pruning this back to the base ESA/390 architecture doesn't seem
> +       worthwhile, since even the KVM target requires z/Arch.  */
> +    fail = 0;
> +    if ((facilities & FACILITY_ZARCH_ACTIVE) == 0) {
> +        fprintf(stderr, "TCG: z/Arch facility is required\n");
> +        fail = 1;
> +    }
> +    if ((facilities & FACILITY_LONG_DISP) == 0) {
> +        fprintf(stderr, "TCG: long-displacement facility is required\n");
> +        fail = 1;
> +    }
> +    if (fail) {
> +        exit(-1);
> +    }
> +}
> +
>  void tcg_target_init(TCGContext *s)
>  {
>  #if !defined(CONFIG_USER_ONLY)
> @@ -1186,6 +1297,8 @@ void tcg_target_init(TCGContext *s)
>      }
>  #endif
>  
> +    query_facilities();
> +
>      tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffff);
>      tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffff);
>  
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 05/35] tcg-s390: Icache flush is a no-op.
  2010-06-09 22:55   ` Aurelien Jarno
@ 2010-06-10 22:04     ` Richard Henderson
  2010-06-11  6:46       ` Aurelien Jarno
  0 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-10 22:04 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, agraf

On 06/09/2010 03:55 PM, Aurelien Jarno wrote:
> On Fri, Jun 04, 2010 at 12:14:13PM -0700, Richard Henderson wrote:
>> Before gcc 4.2, __builtin___clear_cache doesn't exist, and
>> afterward the gcc s390 backend implements it as nothing.
> 
> Does it mean that instruction and data caches are coherent on s390?

Yes.

Principles of Operation, 2.1 Main storage:
# Main storage may include a faster-access buffer storage, sometimes called a cache.
# Each CPU may have an associated cache. The effects, except on  performance, of
# the physical construction and the use of distinct storage media are not observable
# by the program.

This architecture pre-dates caches, I think.  ;-)
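
To make the consequence concrete, here is a minimal sketch (not the exact
patch contents; the parameter names are assumptions) of the no-op this
allows on the host side:

    /* s390 keeps instruction and data caches coherent, so nothing needs
       to be done after writing generated code.  */
    static inline void flush_icache_range(unsigned long start,
                                          unsigned long stop)
    {
    }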


r~

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 06/35] tcg-s390: Allocate the code_gen_buffer near the main program.
  2010-06-09 22:59   ` Aurelien Jarno
@ 2010-06-10 22:05     ` Richard Henderson
  2010-06-11  7:31       ` Aurelien Jarno
  0 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-10 22:05 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, agraf

On 06/09/2010 03:59 PM, Aurelien Jarno wrote:
>> +        start = (void *)0x90000000UL;
> 
> Is there any reason for this address?

The default link address for the main application is 0x80000000,
so this is near-by.
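
As a rough sketch of the idea (illustrative only; the size handling and
error checking of the real code are omitted), the buffer is requested with
a hint address close to the executable so that the +-4GB PC-relative forms
(LARL, BRASL) can reach both the buffer and the main binary:

    #include <stddef.h>
    #include <sys/mman.h>

    static void *alloc_code_buffer_near_program(size_t size)
    {
        /* mmap treats the address as a hint; it may still place the
           mapping elsewhere, which the caller would have to handle.  */
        void *hint = (void *)0x90000000UL;
        return mmap(hint, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    }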


r~

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 07/35] tcg: Optionally sign-extend 32-bit arguments for 64-bit host.
  2010-06-10 10:22   ` Aurelien Jarno
@ 2010-06-10 22:08     ` Richard Henderson
  2010-06-14 22:20     ` Richard Henderson
  1 sibling, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-10 22:08 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, agraf

On 06/10/2010 03:22 AM, Aurelien Jarno wrote:
> Wouldn't it be better to actually pass the whole flag to
> tcg_gen_helper32(), so that we can in the future also support mixed
> signedness in arguments? Also doing it here looks a bit like a
> magic constant.

I suppose that's possible.

> This part allocates a lot of temp variables, which will probably generate
> a lot of register spills during the code generation.
> 
> As we do that for all arguments anyway, wouldn't it be possible to do
> the extension in place? The value in the register is changed, but that
> should not have any effect as it is ignored anyway in other
> instructions.

That hadn't occurred to me.  I'll give it a try.
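
For reference, a minimal sketch of the in-place variant being suggested
(the names follow the patch; whether clobbering the argument registers is
acceptable at this point in the call sequence is exactly what needs to be
verified):

    /* Sketch: extend each 32-bit argument in place instead of
       allocating a fresh 64-bit temporary for it.  */
    for (i = 0; i < nargs; ++i) {
        int is_64bit = sizemask & (1 << (i + 1) * 2);
        int is_signed = sizemask & (2 << (i + 1) * 2);
        if (!is_64bit) {
            TCGv_i64 arg = MAKE_TCGV_I64(args[i]);
            if (is_signed) {
                tcg_gen_ext32s_i64(arg, arg);
            } else {
                tcg_gen_ext32u_i64(arg, arg);
            }
        }
    }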


r~

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.
  2010-06-10 10:28   ` Aurelien Jarno
@ 2010-06-10 22:19     ` Richard Henderson
  2010-06-11  8:06       ` Aurelien Jarno
  0 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-10 22:19 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, agraf

On 06/10/2010 03:28 AM, Aurelien Jarno wrote:
>> +    asm volatile(".word 0xb2b0,0x1000"
>> +                 : "=r"(r0) : "0"(0), "r"(r1) : "memory", "cc");
> 
> Wouldn't it be possible to use the instruction directly instead of
> dumping the opcode values? Same below

No, they aren't recognized by older assemblers.  For instance, the one shipped
with RHEL 5.5, and possibly even by Debian Lenny (I don't currently have access
to that machine to check). Apparently some of these are quite new insns -- 2008 era.

That said, all the machines to which either I or agraf have access are the latest
z10 machines.  Frankly I expect that to be true of most if not all machines, since
I think it's just a microcode update which everyone with an active support contract
can get.
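
For comparison, this is roughly what the mnemonic form would look like on a
new-enough assembler (the operand constraints here are my assumptions, not
taken from the patch; the .word sequence hand-encodes the same STFLE
instruction for the benefit of old binutils):

    register int r0 __asm__("0") = 0;       /* number of extra doublewords */
    register void *r1 __asm__("1") = &facilities;
    asm volatile("stfle 0(%%r1)"
                 : "+r"(r0) : "r"(r1) : "memory", "cc");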


r~

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 05/35] tcg-s390: Icache flush is a no-op.
  2010-06-10 22:04     ` Richard Henderson
@ 2010-06-11  6:46       ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-11  6:46 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Thu, Jun 10, 2010 at 03:04:04PM -0700, Richard Henderson wrote:
> On 06/09/2010 03:55 PM, Aurelien Jarno wrote:
> > On Fri, Jun 04, 2010 at 12:14:13PM -0700, Richard Henderson wrote:
> >> Before gcc 4.2, __builtin___clear_cache doesn't exist, and
> >> afterward the gcc s390 backend implements it as nothing.
> > 
> > Does it means that instruction and data caches are coherent on s390?
> 
> Yes.
> 
> Principles of Operation, 2.1 Main storage:
> # Main storage may include a faster-access buffer storage, sometimes called a cache.
> # Each CPU may have an associated cache. The effects, except on  performance, of
> # the physical construction and the use of distinct storage media are not observable
> # by the program.
> 
> This architecture pre-dates caches, I think.  ;-)

Ok, I have applied the patch.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 06/35] tcg-s390: Allocate the code_gen_buffer near the main program.
  2010-06-10 22:05     ` Richard Henderson
@ 2010-06-11  7:31       ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-11  7:31 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Thu, Jun 10, 2010 at 03:05:16PM -0700, Richard Henderson wrote:
> On 06/09/2010 03:59 PM, Aurelien Jarno wrote:
> >> +        start = (void *)0x90000000UL;
> > 
> > Is there any reason for this address?
> 
> The default link address for the main application is 0x80000000,
> so this is near-by.
> 

Ok, that makes sense; I have applied the patch.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.
  2010-06-10 22:19     ` Richard Henderson
@ 2010-06-11  8:06       ` Aurelien Jarno
  2010-06-11 13:07         ` Richard Henderson
  2010-06-11 13:13         ` Richard Henderson
  0 siblings, 2 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-11  8:06 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Thu, Jun 10, 2010 at 03:19:25PM -0700, Richard Henderson wrote:
> On 06/10/2010 03:28 AM, Aurelien Jarno wrote:
> >> +    asm volatile(".word 0xb2b0,0x1000"
> >> +                 : "=r"(r0) : "0"(0), "r"(r1) : "memory", "cc");
> > 
> > Wouldn't it be possible to use the instruction directly instead of
> > dumping the opcode values? Same below
> 
> No, they aren't recognized by older assemblers.  For instance, the one shipped
> with RHEL 5.5, and possibly even by Debian Lenny (I don't currently have access
> to that machine to check). Apparently some of these are quite new insns -- 2008 era.
> 
> That said, all the hardware to which either I or agraf have access are the latest
> z10 machines.  Frankly I expect that to be true of most if not all machines, since
> I think it's just a microcode update which everyone with an active support contract
> can get.
> 

FYI, here are the /proc/cpuinfo feature lines of the s390 machines I have
(more or less) access to:

features        : esan3 zarch msa ldisp 
features        : esan3 zarch stfle msa ldisp eimm dfp
features        : esan3 zarch stfle msa ldisp eimm dfp etf3eh highgprs 

So that seems fine.  However, looking more closely at the code again, I do
wonder about this part:

> +        /* Test for z/Architecture.  Required even in 31-bit mode.  */
> +        got_sigill = 0;
> +        /* agr %r0,%r0 */
> +        asm volatile(".word 0xb908,0x0000" : "=r"(r0) : : "cc");
> +        if (!got_sigill) {
> +            facilities |= FACILITY_ZARCH | FACILITY_ZARCH_ACTIVE;
> +        }
> +

What's the difference between FACILITY_ZARCH and FACILITY_ZARCH_ACTIVE,
given that both are actually set together?  My guess is that
FACILITY_ZARCH_ACTIVE is needed in 64-bit mode, while FACILITY_ZARCH would
only be needed for a possible future 32-bit mode.  Is that correct?

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.
  2010-06-11  8:06       ` Aurelien Jarno
@ 2010-06-11 13:07         ` Richard Henderson
  2010-06-12 11:57           ` Aurelien Jarno
  2010-06-11 13:13         ` Richard Henderson
  1 sibling, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-11 13:07 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, agraf

On 06/11/2010 01:06 AM, Aurelien Jarno wrote:
> What's the difference between FACILITY_ZARCH and FACILITY_ZARCH_ACTIVE,
> as both are actually flagged together. My guess is that
> FACILITY_ZARCH_ACTIVE is needed in 64-bit mode, why FACILITY_ZARCH is
> only needed for a possible future 32-bit mode. Is it correct?

Loosely,

ZARCH is set when the system is 64-bit capable, whether or not it is active.
The OS would check this bit at startup if it wanted to change modes.  This
bit isn't really interesting to us in userspace.

ZARCH_ACTIVE is set when the system is in 64-bit mode, i.e. you've booted
with a 64-bit kernel.  Note that this says nothing about the address 
decoding mode -- this bit can be set while the PSW is set for 31-bit
address translation, e.g. running a 32-bit program on a 64-bit kernel.
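
To put numbers on it (an illustrative expansion only; these are the masks
defined in the patch, with facility bits counted from the most-significant
bit of the first doubleword stored by STFLE):

    FACILITY_ZARCH        = 1ULL << (63 - 1) = 0x4000000000000000   (z/Arch installed)
    FACILITY_ZARCH_ACTIVE = 1ULL << (63 - 2) = 0x2000000000000000   (z/Arch active)

So in userspace only the second one ever needs testing before emitting
64-bit instructions.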


r~

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.
  2010-06-11  8:06       ` Aurelien Jarno
  2010-06-11 13:07         ` Richard Henderson
@ 2010-06-11 13:13         ` Richard Henderson
  2010-06-13 10:49           ` Aurelien Jarno
  1 sibling, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-11 13:13 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, agraf

On 06/11/2010 01:06 AM, Aurelien Jarno wrote:
>> That said, all the hardware to which either I or agraf have access are the latest
>> z10 machines.  Frankly I expect that to be true of most if not all machines, since
>> I think it's just a microcode update which everyone with an active support contract
>> can get.
>>
> 
> FYI, that's the /proc/cpuinfo of s390 machines I have (more or less)
> access:
> 
> features        : esan3 zarch msa ldisp 
> features        : esan3 zarch stfle msa ldisp eimm dfp
> features        : esan3 zarch stfle msa ldisp eimm dfp etf3eh highgprs 

Interesting that your first one doesn't have stfle.  That one will have to
go through the SIGILL path.  I would be very interested to have you test 
that code path.

Also, what era is that second machine without highgprs?  Is it running an
old kernel, or a 32-bit kernel?


r~

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.
  2010-06-11 13:07         ` Richard Henderson
@ 2010-06-12 11:57           ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-12 11:57 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 11, 2010 at 06:07:52AM -0700, Richard Henderson wrote:
> On 06/11/2010 01:06 AM, Aurelien Jarno wrote:
> > What's the difference between FACILITY_ZARCH and FACILITY_ZARCH_ACTIVE,
> > as both are actually flagged together. My guess is that
> > FACILITY_ZARCH_ACTIVE is needed in 64-bit mode, why FACILITY_ZARCH is
> > only needed for a possible future 32-bit mode. Is it correct?
> 
> Loosely,
> 
> ZARCH is set when the system is 64-bit capable, whether or not it is active.
> The OS would check this bit at startup if it wanted to change modes.  This
> bit isn't really interesting to us in userspace.
> 
> ZARCH_ACTIVE is set when the system is in 64-bit mode, i.e. you've booted
> with a 64-bit kernel.  Note that this says nothing about the address 
> decoding mode -- this bit can be set while the PSW is set for 31-bit
> address translation, e.g. running a 32-bit program on a 64-bit kernel.
> 

In short, we never use ZARCH in QEMU, so we probably don't want to have
this #define, nor to set it at the same time as FACILITY_ZARCH_ACTIVE.


-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 16/35] tcg-s390: Re-implement tcg_out_movi.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 16/35] tcg-s390: Re-implement tcg_out_movi Richard Henderson
@ 2010-06-12 12:04   ` Aurelien Jarno
  2010-06-13 23:19     ` Richard Henderson
  0 siblings, 1 reply; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-12 12:04 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:24PM -0700, Richard Henderson wrote:
> Make better use of the LOAD HALFWORD IMMEDIATE, LOAD IMMEDIATE,
> and INSERT IMMEDIATE instruction groups.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c |  129 +++++++++++++++++++++++++++++++++++++++++++------
>  1 files changed, 113 insertions(+), 16 deletions(-)
> 
> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index d99bb5c..71e017a 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -52,12 +52,23 @@ typedef enum S390Opcode {
>      RIL_BRASL   = 0xc005,
>      RIL_BRCL    = 0xc004,
>      RIL_LARL    = 0xc000,
> +    RIL_IIHF    = 0xc008,
> +    RIL_IILF    = 0xc009,
> +    RIL_LGFI    = 0xc001,
> +    RIL_LLIHF   = 0xc00e,
> +    RIL_LLILF   = 0xc00f,
>  
>      RI_AGHI     = 0xa70b,
>      RI_AHI      = 0xa70a,
>      RI_BRC      = 0xa704,
> +    RI_IIHH     = 0xa500,
> +    RI_IIHL     = 0xa501,
>      RI_IILH     = 0xa502,
> +    RI_IILL     = 0xa503,
>      RI_LGHI     = 0xa709,
> +    RI_LLIHH    = 0xa50c,
> +    RI_LLIHL    = 0xa50d,
> +    RI_LLILH    = 0xa50e,
>      RI_LLILL    = 0xa50f,
>  
>      RRE_AGR     = 0xb908,
> @@ -382,24 +393,110 @@ static inline void tcg_out_mov(TCGContext *s, int ret, int arg)
>  }
>  
>  /* load a register with an immediate value */
> -static inline void tcg_out_movi(TCGContext *s, TCGType type,
> -                int ret, tcg_target_long arg)
> +static void tcg_out_movi(TCGContext *s, TCGType type,
> +                         TCGReg ret, tcg_target_long sval)
>  {
> -    if (arg >= -0x8000 && arg < 0x8000) { /* signed immediate load */
> -        tcg_out_insn(s, RI, LGHI, ret, arg);
> -    } else if (!(arg & 0xffffffffffff0000UL)) {
> -        tcg_out_insn(s, RI, LLILL, ret, arg);
> -    } else if (!(arg & 0xffffffff00000000UL) || type == TCG_TYPE_I32) {
> -        tcg_out_insn(s, RI, LLILL, ret, arg);
> -        tcg_out_insn(s, RI, IILH, ret, arg >> 16);
> +    static const S390Opcode lli_insns[4] = {
> +        RI_LLILL, RI_LLILH, RI_LLIHL, RI_LLIHH
> +    };
> +
> +    tcg_target_ulong uval = sval;
> +    int i;
> +
> +    if (type == TCG_TYPE_I32) {
> +        uval = (uint32_t)sval;
> +        sval = (int32_t)sval;
> +    }
> +
> +    /* Try all 32-bit insns that can load it in one go.  */
> +    if (sval >= -0x8000 && sval < 0x8000) {
> +        tcg_out_insn(s, RI, LGHI, ret, sval);
> +        return;
> +    }
> +
> +    for (i = 0; i < 4; i++) {
> +        tcg_target_long mask = 0xffffull << i*16;
> +        if ((uval & mask) != 0 && (uval & ~mask) == 0) {

Wouldn't it be simpler to use (uval & mask) == uval ?

> +            tcg_out_insn_RI(s, lli_insns[i], ret, uval >> i*16);
> +            return;
> +        }
> +    }
> +
> +    /* Try all 48-bit insns that can load it in one go.  */
> +    if (facilities & FACILITY_EXT_IMM) {
> +        if (sval == (int32_t)sval) {
> +            tcg_out_insn(s, RIL, LGFI, ret, sval);
> +            return;
> +        }
> +        if (uval <= 0xffffffff) {
> +            tcg_out_insn(s, RIL, LLILF, ret, uval);
> +            return;
> +        }
> +        if ((uval & 0xffffffff) == 0) {
> +            tcg_out_insn(s, RIL, LLIHF, ret, uval >> 32);
> +            return;
> +        }
> +    }
> +
> +    /* Try for PC-relative address load.  */
> +    if ((sval & 1) == 0) {
> +        intptr_t off = (sval - (intptr_t)s->code_ptr) >> 1;
> +        if (off == (int32_t)off) {
> +            tcg_out_insn(s, RIL, LARL, ret, off);
> +            return;
> +        }
> +    }

Is this part used in practice? There was such a trick on the ARM
backend, but it was actually never used.

> +
> +    /* If extended immediates are not present, then we may have to issue
> +       several instructions to load the low 32 bits.  */
> +    if (!(facilities & FACILITY_EXT_IMM)) {
> +        /* A 32-bit unsigned value can be loaded in 2 insns.  And given
> +           that the lli_insns loop above did not succeed, we know that
> +           both insns are required.  */
> +        if (uval <= 0xffffffff) {
> +            tcg_out_insn(s, RI, LLILL, ret, uval);
> +            tcg_out_insn(s, RI, IILH, ret, uval >> 16);
> +            return;
> +        }
> +
> +        /* If all high bits are set, the value can be loaded in 2 or 3 insns.
> +           We first want to make sure that all the high bits get set.  With
> +           luck the low 16-bits can be considered negative to perform that for
> +           free, otherwise we load an explicit -1.  */
> +        if (sval >> 32 == -1) {
> +            if (uval & 0x8000) {
> +                tcg_out_insn(s, RI, LGHI, ret, uval);
> +            } else {
> +                tcg_out_insn(s, RI, LGHI, ret, -1);
> +                tcg_out_insn(s, RI, IILL, ret, uval);
> +            }
> +            tcg_out_insn(s, RI, IILH, ret, uval >> 16);
> +            return;
> +        }
> +    }
> +
> +    /* If we get here, both the high and low parts have non-zero bits.  */
> +
> +    /* Recurse to load the lower 32-bits.  */
> +    tcg_out_movi(s, TCG_TYPE_I32, ret, sval);
> +
> +    /* Insert data into the high 32-bits.  */
> +    uval >>= 32;
> +    if (facilities & FACILITY_EXT_IMM) {
> +        if (uval < 0x10000) {
> +            tcg_out_insn(s, RI, IIHL, ret, uval);
> +        } else if ((uval & 0xffff) == 0) {
> +            tcg_out_insn(s, RI, IIHH, ret, uval >> 16);
> +        } else {
> +            tcg_out_insn(s, RIL, IIHF, ret, uval);
> +        }
>      } else {
> -        /* branch over constant and store its address in R13 */
> -        tcg_out_insn(s, RIL, BRASL, TCG_TMP0, (6 + 8) >> 1);
> -        /* 64-bit constant */
> -        tcg_out32(s, arg >> 32);
> -        tcg_out32(s, arg);
> -        /* load constant to ret */
> -        tcg_out_insn(s, RXY, LG, ret, TCG_TMP0, 0, 0);
> +        if (uval & 0xffff) {
> +            tcg_out_insn(s, RI, IIHL, ret, uval);
> +        }
> +        if (uval & 0xffff0000) {
> +            tcg_out_insn(s, RI, IIHH, ret, uval >> 16);
> +        }
>      }
>  }
>  
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 17/35] tcg-s390: Implement sign and zero-extension operations.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 17/35] tcg-s390: Implement sign and zero-extension operations Richard Henderson
@ 2010-06-12 12:32   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-12 12:32 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:25PM -0700, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c |  164 ++++++++++++++++++++++++++++++++++++++++++++-----
>  tcg/s390/tcg-target.h |   20 +++---
>  2 files changed, 158 insertions(+), 26 deletions(-)

This patch looks fine.

> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index 71e017a..42e3224 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -78,10 +78,14 @@ typedef enum S390Opcode {
>      RRE_DLR     = 0xb997,
>      RRE_DSGFR   = 0xb91d,
>      RRE_DSGR    = 0xb90d,
> +    RRE_LGBR    = 0xb906,
>      RRE_LCGR    = 0xb903,
>      RRE_LGFR    = 0xb914,
> +    RRE_LGHR    = 0xb907,
>      RRE_LGR     = 0xb904,
> +    RRE_LLGCR   = 0xb984,
>      RRE_LLGFR   = 0xb916,
> +    RRE_LLGHR   = 0xb985,
>      RRE_MSGR    = 0xb90c,
>      RRE_MSR     = 0xb252,
>      RRE_NGR     = 0xb980,
> @@ -117,11 +121,9 @@ typedef enum S390Opcode {
>      RXY_LGF     = 0xe314,
>      RXY_LGH     = 0xe315,
>      RXY_LHY     = 0xe378,
> -    RXY_LLC     = 0xe394,
>      RXY_LLGC    = 0xe390,
>      RXY_LLGF    = 0xe316,
>      RXY_LLGH    = 0xe391,
> -    RXY_LLH     = 0xe395,
>      RXY_LMG     = 0xeb04,
>      RXY_LRV     = 0xe31e,
>      RXY_LRVG    = 0xe30f,
> @@ -553,6 +555,96 @@ static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg data,
>      }
>  }
>  
> +static void tgen_ext8s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
> +{
> +    if (facilities & FACILITY_EXT_IMM) {
> +        tcg_out_insn(s, RRE, LGBR, dest, src);
> +        return;
> +    }
> +
> +    if (type == TCG_TYPE_I32) {
> +        if (dest == src) {
> +            tcg_out_sh32(s, RS_SLL, dest, TCG_REG_NONE, 24);
> +        } else {
> +            tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 24);
> +        }
> +        tcg_out_sh32(s, RS_SRA, dest, TCG_REG_NONE, 24);
> +    } else {
> +        tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 56);
> +        tcg_out_sh64(s, RSY_SRAG, dest, dest, TCG_REG_NONE, 56);
> +    }
> +}
> +
> +static void tgen_ext8u(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
> +{
> +    if (facilities & FACILITY_EXT_IMM) {
> +        tcg_out_insn(s, RRE, LLGCR, dest, src);
> +        return;
> +    }
> +
> +    if (dest == src) {
> +        tcg_out_movi(s, type, TCG_TMP0, 0xff);
> +        src = TCG_TMP0;
> +    } else {
> +        tcg_out_movi(s, type, dest, 0xff);
> +    }
> +    if (type == TCG_TYPE_I32) {
> +        tcg_out_insn(s, RR, NR, dest, src);
> +    } else {
> +        tcg_out_insn(s, RRE, NGR, dest, src);
> +    }
> +}
> +
> +static void tgen_ext16s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
> +{
> +    if (facilities & FACILITY_EXT_IMM) {
> +        tcg_out_insn(s, RRE, LGHR, dest, src);
> +        return;
> +    }
> +
> +    if (type == TCG_TYPE_I32) {
> +        if (dest == src) {
> +            tcg_out_sh32(s, RS_SLL, dest, TCG_REG_NONE, 16);
> +        } else {
> +            tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 16);
> +        }
> +        tcg_out_sh32(s, RS_SRA, dest, TCG_REG_NONE, 16);
> +    } else {
> +        tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 48);
> +        tcg_out_sh64(s, RSY_SRAG, dest, dest, TCG_REG_NONE, 48);
> +    }
> +}
> +
> +static void tgen_ext16u(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
> +{
> +    if (facilities & FACILITY_EXT_IMM) {
> +        tcg_out_insn(s, RRE, LLGHR, dest, src);
> +        return;
> +    }
> +
> +    if (dest == src) {
> +        tcg_out_movi(s, type, TCG_TMP0, 0xffff);
> +        src = TCG_TMP0;
> +    } else {
> +        tcg_out_movi(s, type, dest, 0xffff);
> +    }
> +    if (type == TCG_TYPE_I32) {
> +        tcg_out_insn(s, RR, NR, dest, src);
> +    } else {
> +        tcg_out_insn(s, RRE, NGR, dest, src);
> +    }
> +}
> +
> +static inline void tgen_ext32s(TCGContext *s, TCGReg dest, TCGReg src)
> +{
> +    tcg_out_insn(s, RRE, LGFR, dest, src);
> +}
> +
> +static inline void tgen_ext32u(TCGContext *s, TCGReg dest, TCGReg src)
> +{
> +    tcg_out_insn(s, RRE, LLGFR, dest, src);
> +}
> +
>  static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
>  {
>      if (c > TCG_COND_GT) {
> @@ -643,8 +735,8 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
>      }
>  
>  #if TARGET_LONG_BITS == 32
> -    tcg_out_insn(s, RRE, LLGFR, arg1, addr_reg);
> -    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
> +    tgen_ext32u(s, arg1, addr_reg);
> +    tgen_ext32u(s, arg0, addr_reg);
>  #else
>      tcg_out_mov(s, arg1, addr_reg);
>      tcg_out_mov(s, arg0, addr_reg);
> @@ -681,7 +773,7 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
>  
>      /* call load/store helper */
>  #if TARGET_LONG_BITS == 32
> -    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
> +    tgen_ext32u(s, arg0, addr_reg);
>  #else
>      tcg_out_mov(s, arg0, addr_reg);
>  #endif
> @@ -697,15 +789,13 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
>          /* sign extension */
>          switch (opc) {
>          case LD_INT8:
> -            tcg_out_insn(s, RSY, SLLG, data_reg, arg0, TCG_REG_NONE, 56);
> -            tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, TCG_REG_NONE, 56);
> +            tgen_ext8s(s, TCG_TYPE_I64, data_reg, arg0);
>              break;
>          case LD_INT16:
> -            tcg_out_insn(s, RSY, SLLG, data_reg, arg0, TCG_REG_NONE, 48);
> -            tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, TCG_REG_NONE, 48);
> +            tgen_ext16s(s, TCG_TYPE_I64, data_reg, arg0);
>              break;
>          case LD_INT32:
> -            tcg_out_insn(s, RRE, LGFR, data_reg, arg0);
> +            tgen_ext32s(s, data_reg, arg0);
>              break;
>          default:
>              /* unsigned -> just copy */
> @@ -803,8 +893,7 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
>  #else
>          /* swapped unsigned halfword load with upper bits zeroed */
>          tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
> -        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, 0xffffL);
> -        tcg_out_insn(s, RRE, NGR, data_reg, TCG_TMP0);
> +        tgen_ext16u(s, TCG_TYPE_I64, data_reg, data_reg);
>  #endif
>          break;
>      case LD_INT16:
> @@ -813,8 +902,7 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
>  #else
>          /* swapped sign-extended halfword load */
>          tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
> -        tcg_out_insn(s, RSY, SLLG, data_reg, data_reg, TCG_REG_NONE, 48);
> -        tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, TCG_REG_NONE, 48);
> +        tgen_ext16s(s, TCG_TYPE_I64, data_reg, data_reg);
>  #endif
>          break;
>      case LD_UINT32:
> @@ -823,7 +911,7 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
>  #else
>          /* swapped unsigned int load with upper bits zeroed */
>          tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
> -        tcg_out_insn(s, RRE, LLGFR, data_reg, data_reg);
> +        tgen_ext32u(s, data_reg, data_reg);
>  #endif
>          break;
>      case LD_INT32:
> @@ -832,7 +920,7 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
>  #else
>          /* swapped sign-extended int load */
>          tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
> -        tcg_out_insn(s, RRE, LGFR, data_reg, data_reg);
> +        tgen_ext32s(s, data_reg, data_reg);
>  #endif
>          break;
>      case LD_UINT64:
> @@ -1111,6 +1199,38 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>          op = RSY_SRAG;
>          goto do_shift64;
>  
> +    case INDEX_op_ext8s_i32:
> +        tgen_ext8s(s, TCG_TYPE_I32, args[0], args[1]);
> +        break;
> +    case INDEX_op_ext8s_i64:
> +        tgen_ext8s(s, TCG_TYPE_I64, args[0], args[1]);
> +        break;
> +    case INDEX_op_ext16s_i32:
> +        tgen_ext16s(s, TCG_TYPE_I32, args[0], args[1]);
> +        break;
> +    case INDEX_op_ext16s_i64:
> +        tgen_ext16s(s, TCG_TYPE_I64, args[0], args[1]);
> +        break;
> +    case INDEX_op_ext32s_i64:
> +        tgen_ext32s(s, args[0], args[1]);
> +        break;
> +
> +    case INDEX_op_ext8u_i32:
> +        tgen_ext8u(s, TCG_TYPE_I32, args[0], args[1]);
> +        break;
> +    case INDEX_op_ext8u_i64:
> +        tgen_ext8u(s, TCG_TYPE_I64, args[0], args[1]);
> +        break;
> +    case INDEX_op_ext16u_i32:
> +        tgen_ext16u(s, TCG_TYPE_I32, args[0], args[1]);
> +        break;
> +    case INDEX_op_ext16u_i64:
> +        tgen_ext16u(s, TCG_TYPE_I64, args[0], args[1]);
> +        break;
> +    case INDEX_op_ext32u_i64:
> +        tgen_ext32u(s, args[0], args[1]);
> +        break;
> +
>      case INDEX_op_br:
>          tgen_branch(s, S390_CC_ALWAYS, args[0]);
>          break;
> @@ -1228,6 +1348,11 @@ static const TCGTargetOpDef s390_op_defs[] = {
>      { INDEX_op_shr_i32, { "r", "0", "Ri" } },
>      { INDEX_op_sar_i32, { "r", "0", "Ri" } },
>  
> +    { INDEX_op_ext8s_i32, { "r", "r" } },
> +    { INDEX_op_ext8u_i32, { "r", "r" } },
> +    { INDEX_op_ext16s_i32, { "r", "r" } },
> +    { INDEX_op_ext16u_i32, { "r", "r" } },
> +
>      { INDEX_op_brcond_i32, { "r", "r" } },
>      { INDEX_op_setcond_i32, { "r", "r", "r" } },
>  
> @@ -1278,6 +1403,13 @@ static const TCGTargetOpDef s390_op_defs[] = {
>      { INDEX_op_shr_i64, { "r", "r", "Ri" } },
>      { INDEX_op_sar_i64, { "r", "r", "Ri" } },
>  
> +    { INDEX_op_ext8s_i64, { "r", "r" } },
> +    { INDEX_op_ext8u_i64, { "r", "r" } },
> +    { INDEX_op_ext16s_i64, { "r", "r" } },
> +    { INDEX_op_ext16u_i64, { "r", "r" } },
> +    { INDEX_op_ext32s_i64, { "r", "r" } },
> +    { INDEX_op_ext32u_i64, { "r", "r" } },
> +
>      { INDEX_op_brcond_i64, { "r", "r" } },
>      { INDEX_op_setcond_i64, { "r", "r", "r" } },
>  #endif
> diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
> index 26dafae..570c832 100644
> --- a/tcg/s390/tcg-target.h
> +++ b/tcg/s390/tcg-target.h
> @@ -50,10 +50,10 @@ typedef enum TCGReg {
>  /* optional instructions */
>  #define TCG_TARGET_HAS_div2_i32
>  // #define TCG_TARGET_HAS_rot_i32
> -// #define TCG_TARGET_HAS_ext8s_i32
> -// #define TCG_TARGET_HAS_ext16s_i32
> -// #define TCG_TARGET_HAS_ext8u_i32
> -// #define TCG_TARGET_HAS_ext16u_i32
> +#define TCG_TARGET_HAS_ext8s_i32
> +#define TCG_TARGET_HAS_ext16s_i32
> +#define TCG_TARGET_HAS_ext8u_i32
> +#define TCG_TARGET_HAS_ext16u_i32
>  // #define TCG_TARGET_HAS_bswap16_i32
>  // #define TCG_TARGET_HAS_bswap32_i32
>  // #define TCG_TARGET_HAS_not_i32
> @@ -66,12 +66,12 @@ typedef enum TCGReg {
>  
>  #define TCG_TARGET_HAS_div2_i64
>  // #define TCG_TARGET_HAS_rot_i64
> -// #define TCG_TARGET_HAS_ext8s_i64
> -// #define TCG_TARGET_HAS_ext16s_i64
> -// #define TCG_TARGET_HAS_ext32s_i64
> -// #define TCG_TARGET_HAS_ext8u_i64
> -// #define TCG_TARGET_HAS_ext16u_i64
> -// #define TCG_TARGET_HAS_ext32u_i64
> +#define TCG_TARGET_HAS_ext8s_i64
> +#define TCG_TARGET_HAS_ext16s_i64
> +#define TCG_TARGET_HAS_ext32s_i64
> +#define TCG_TARGET_HAS_ext8u_i64
> +#define TCG_TARGET_HAS_ext16u_i64
> +#define TCG_TARGET_HAS_ext32u_i64
>  // #define TCG_TARGET_HAS_bswap16_i64
>  // #define TCG_TARGET_HAS_bswap32_i64
>  // #define TCG_TARGET_HAS_bswap64_i64
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 18/35] tcg-s390: Implement bswap operations.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 18/35] tcg-s390: Implement bswap operations Richard Henderson
@ 2010-06-12 12:32   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-12 12:32 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:26PM -0700, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c |   24 ++++++++++++++++++++++++
>  tcg/s390/tcg-target.h |   10 +++++-----
>  2 files changed, 29 insertions(+), 5 deletions(-)

This patch looks fine.

> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index 42e3224..3a98ca3 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -86,6 +86,8 @@ typedef enum S390Opcode {
>      RRE_LLGCR   = 0xb984,
>      RRE_LLGFR   = 0xb916,
>      RRE_LLGHR   = 0xb985,
> +    RRE_LRVR    = 0xb91f,
> +    RRE_LRVGR   = 0xb90f,
>      RRE_MSGR    = 0xb90c,
>      RRE_MSR     = 0xb252,
>      RRE_NGR     = 0xb980,
> @@ -1231,6 +1233,21 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>          tgen_ext32u(s, args[0], args[1]);
>          break;
>  
> +    case INDEX_op_bswap16_i32:
> +    case INDEX_op_bswap16_i64:
> +        /* The TCG bswap definition requires bits 0-47 already be zero.
> +           Thus we don't need the G-type insns to implement bswap16_i64.  */
> +        tcg_out_insn(s, RRE, LRVR, args[0], args[1]);
> +        tcg_out_sh32(s, RS_SRL, args[0], TCG_REG_NONE, 16);
> +        break;
> +    case INDEX_op_bswap32_i32:
> +    case INDEX_op_bswap32_i64:
> +        tcg_out_insn(s, RRE, LRVR, args[0], args[1]);
> +        break;
> +    case INDEX_op_bswap64_i64:
> +        tcg_out_insn(s, RRE, LRVGR, args[0], args[1]);
> +        break;
> +
>      case INDEX_op_br:
>          tgen_branch(s, S390_CC_ALWAYS, args[0]);
>          break;
> @@ -1353,6 +1370,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
>      { INDEX_op_ext16s_i32, { "r", "r" } },
>      { INDEX_op_ext16u_i32, { "r", "r" } },
>  
> +    { INDEX_op_bswap16_i32, { "r", "r" } },
> +    { INDEX_op_bswap32_i32, { "r", "r" } },
> +
>      { INDEX_op_brcond_i32, { "r", "r" } },
>      { INDEX_op_setcond_i32, { "r", "r", "r" } },
>  
> @@ -1410,6 +1430,10 @@ static const TCGTargetOpDef s390_op_defs[] = {
>      { INDEX_op_ext32s_i64, { "r", "r" } },
>      { INDEX_op_ext32u_i64, { "r", "r" } },
>  
> +    { INDEX_op_bswap16_i64, { "r", "r" } },
> +    { INDEX_op_bswap32_i64, { "r", "r" } },
> +    { INDEX_op_bswap64_i64, { "r", "r" } },
> +
>      { INDEX_op_brcond_i64, { "r", "r" } },
>      { INDEX_op_setcond_i64, { "r", "r", "r" } },
>  #endif
> diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
> index 570c832..dcb9bc3 100644
> --- a/tcg/s390/tcg-target.h
> +++ b/tcg/s390/tcg-target.h
> @@ -54,8 +54,8 @@ typedef enum TCGReg {
>  #define TCG_TARGET_HAS_ext16s_i32
>  #define TCG_TARGET_HAS_ext8u_i32
>  #define TCG_TARGET_HAS_ext16u_i32
> -// #define TCG_TARGET_HAS_bswap16_i32
> -// #define TCG_TARGET_HAS_bswap32_i32
> +#define TCG_TARGET_HAS_bswap16_i32
> +#define TCG_TARGET_HAS_bswap32_i32
>  // #define TCG_TARGET_HAS_not_i32
>  #define TCG_TARGET_HAS_neg_i32
>  // #define TCG_TARGET_HAS_andc_i32
> @@ -72,9 +72,9 @@ typedef enum TCGReg {
>  #define TCG_TARGET_HAS_ext8u_i64
>  #define TCG_TARGET_HAS_ext16u_i64
>  #define TCG_TARGET_HAS_ext32u_i64
> -// #define TCG_TARGET_HAS_bswap16_i64
> -// #define TCG_TARGET_HAS_bswap32_i64
> -// #define TCG_TARGET_HAS_bswap64_i64
> +#define TCG_TARGET_HAS_bswap16_i64
> +#define TCG_TARGET_HAS_bswap32_i64
> +#define TCG_TARGET_HAS_bswap64_i64
>  // #define TCG_TARGET_HAS_not_i64
>  #define TCG_TARGET_HAS_neg_i64
>  // #define TCG_TARGET_HAS_andc_i64
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 19/35] tcg-s390: Implement rotates.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 19/35] tcg-s390: Implement rotates Richard Henderson
@ 2010-06-12 12:33   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-12 12:33 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:27PM -0700, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
>  tcg/s390/tcg-target.h |    4 ++--
>  2 files changed, 48 insertions(+), 2 deletions(-)

This patch looks fine.

> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index 3a98ca3..f53038b 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -108,6 +108,8 @@ typedef enum S390Opcode {
>      RR_SR       = 0x1b,
>      RR_XR       = 0x17,
>  
> +    RSY_RLL     = 0xeb1d,
> +    RSY_RLLG    = 0xeb1c,
>      RSY_SLLG    = 0xeb0d,
>      RSY_SRAG    = 0xeb0a,
>      RSY_SRLG    = 0xeb0c,
> @@ -1201,6 +1203,44 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>          op = RSY_SRAG;
>          goto do_shift64;
>  
> +    case INDEX_op_rotl_i32:
> +        /* ??? Using tcg_out_sh64 here for the format; it is a 32-bit rol.  */
> +        if (const_args[2]) {
> +            tcg_out_sh64(s, RSY_RLL, args[0], args[1], TCG_REG_NONE, args[2]);
> +        } else {
> +            tcg_out_sh64(s, RSY_RLL, args[0], args[1], args[2], 0);
> +        }
> +        break;
> +    case INDEX_op_rotr_i32:
> +        if (const_args[2]) {
> +            tcg_out_sh64(s, RSY_RLL, args[0], args[1],
> +                         TCG_REG_NONE, (32 - args[2]) & 31);
> +        } else {
> +            tcg_out_insn(s, RR, LCR, TCG_TMP0, args[2]);
> +            tcg_out_sh64(s, RSY_RLL, args[0], args[1], TCG_TMP0, 0);
> +        }
> +        break;
> +
> +    case INDEX_op_rotl_i64:
> +        if (const_args[2]) {
> +            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
> +                         TCG_REG_NONE, args[2]);
> +        } else {
> +            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], args[2], 0);
> +        }
> +        break;
> +    case INDEX_op_rotr_i64:
> +        if (const_args[2]) {
> +            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
> +                         TCG_REG_NONE, (64 - args[2]) & 63);
> +        } else {
> +            /* We can use the smaller 32-bit negate because only the
> +               low 6 bits are examined for the rotate.  */
> +            tcg_out_insn(s, RR, LCR, TCG_TMP0, args[2]);
> +            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], TCG_TMP0, 0);
> +        }
> +        break;
> +
>      case INDEX_op_ext8s_i32:
>          tgen_ext8s(s, TCG_TYPE_I32, args[0], args[1]);
>          break;
> @@ -1365,6 +1405,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
>      { INDEX_op_shr_i32, { "r", "0", "Ri" } },
>      { INDEX_op_sar_i32, { "r", "0", "Ri" } },
>  
> +    { INDEX_op_rotl_i32, { "r", "r", "Ri" } },
> +    { INDEX_op_rotr_i32, { "r", "r", "Ri" } },
> +
>      { INDEX_op_ext8s_i32, { "r", "r" } },
>      { INDEX_op_ext8u_i32, { "r", "r" } },
>      { INDEX_op_ext16s_i32, { "r", "r" } },
> @@ -1423,6 +1466,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
>      { INDEX_op_shr_i64, { "r", "r", "Ri" } },
>      { INDEX_op_sar_i64, { "r", "r", "Ri" } },
>  
> +    { INDEX_op_rotl_i64, { "r", "r", "Ri" } },
> +    { INDEX_op_rotr_i64, { "r", "r", "Ri" } },
> +
>      { INDEX_op_ext8s_i64, { "r", "r" } },
>      { INDEX_op_ext8u_i64, { "r", "r" } },
>      { INDEX_op_ext16s_i64, { "r", "r" } },
> diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
> index dcb9bc3..9135c7a 100644
> --- a/tcg/s390/tcg-target.h
> +++ b/tcg/s390/tcg-target.h
> @@ -49,7 +49,7 @@ typedef enum TCGReg {
>  
>  /* optional instructions */
>  #define TCG_TARGET_HAS_div2_i32
> -// #define TCG_TARGET_HAS_rot_i32
> +#define TCG_TARGET_HAS_rot_i32
>  #define TCG_TARGET_HAS_ext8s_i32
>  #define TCG_TARGET_HAS_ext16s_i32
>  #define TCG_TARGET_HAS_ext8u_i32
> @@ -65,7 +65,7 @@ typedef enum TCGReg {
>  // #define TCG_TARGET_HAS_nor_i32
>  
>  #define TCG_TARGET_HAS_div2_i64
> -// #define TCG_TARGET_HAS_rot_i64
> +#define TCG_TARGET_HAS_rot_i64
>  #define TCG_TARGET_HAS_ext8s_i64
>  #define TCG_TARGET_HAS_ext16s_i64
>  #define TCG_TARGET_HAS_ext32s_i64
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 20/35] tcg-s390: Use LOAD COMPLIMENT for negate.
  2010-06-04 19:14 ` [Qemu-devel] [PATCH 20/35] tcg-s390: Use LOAD COMPLIMENT for negate Richard Henderson
@ 2010-06-12 12:33   ` Aurelien Jarno
  0 siblings, 0 replies; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-12 12:33 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 04, 2010 at 12:14:28PM -0700, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c |   10 ++--------
>  1 files changed, 2 insertions(+), 8 deletions(-)

This patch looks fine.

> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index f53038b..826a2c8 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -1134,16 +1134,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>          break;
>  
>      case INDEX_op_neg_i32:
> -        /* FIXME: optimize args[0] != args[1] case */
> -        tcg_out_insn(s, RR, LR, 13, args[1]);
> -        tcg_out_movi(s, TCG_TYPE_I32, args[0], 0);
> -        tcg_out_insn(s, RR, SR, args[0], 13);
> +        tcg_out_insn(s, RR, LCR, args[0], args[1]);
>          break;
>      case INDEX_op_neg_i64:
> -        /* FIXME: optimize args[0] != args[1] case */
> -        tcg_out_mov(s, TCG_TMP0, args[1]);
> -        tcg_out_movi(s, TCG_TYPE_I64, args[0], 0);
> -        tcg_out_insn(s, RRE, SGR, args[0], TCG_TMP0);
> +        tcg_out_insn(s, RRE, LCGR, args[0], args[1]);
>          break;
>  
>      case INDEX_op_mul_i32:
> -- 
> 1.7.0.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.
  2010-06-11 13:13         ` Richard Henderson
@ 2010-06-13 10:49           ` Aurelien Jarno
  2010-06-13 16:02             ` Richard Henderson
  0 siblings, 1 reply; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-13 10:49 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Fri, Jun 11, 2010 at 06:13:49AM -0700, Richard Henderson wrote:
> On 06/11/2010 01:06 AM, Aurelien Jarno wrote:
> >> That said, all the hardware to which either I or agraf have access are the latest
> >> z10 machines.  Frankly I expect that to be true of most if not all machines, since
> >> I think it's just a microcode update which everyone with an active support contract
> >> can get.
> >>
> > 
> > FYI, that's the /proc/cpuinfo of s390 machines I have (more or less)
> > access:
> > 
> > features        : esan3 zarch msa ldisp 
> > features        : esan3 zarch stfle msa ldisp eimm dfp
> > features        : esan3 zarch stfle msa ldisp eimm dfp etf3eh highgprs 
> 
> Interesting that your first one doesn't have stfle.  That one will have to
> go through the SIGILL path.  I would be very interested to have you test 
> that code path.

I have tried it; it correctly detects zarch and ldisp.

> Also, what era is that second machine without highgprs?  Is it running an
> old kernel, or a 32-bit kernel?
> 

I have very little information about it; it's an IBM System z10 machine
running a 64-bit 2.6.26 kernel.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.
  2010-06-13 10:49           ` Aurelien Jarno
@ 2010-06-13 16:02             ` Richard Henderson
  2010-06-13 16:44               ` Aurelien Jarno
  0 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-13 16:02 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, agraf

On 06/13/2010 03:49 AM, Aurelien Jarno wrote:
>> Also, what era is that second machine without highgprs?  Is it running an
>> old kernel, or a 32-bit kernel?
> 
> I have very few infos about it, it's an IBM System z10 machine running a
> 64-bit 2.6.26 kernel.

Ah, I see it now: ea2a4d3a3a929ef494952bba57a0ef1a8a877881

    [S390] 64-bit register support for 31-bit processes

which adds a mechanism to pass the high parts of the gprs
in the ucontext to the 31-bit signal handler, and adds a
spot for them in the 31-bit core dump.

It doesn't change the actual saving of registers within
the kernel.  Since we take asynchronous signals and return
from them (as opposed to always longjmping out), we cannot
use the full 64-bit register within a 31-bit process without
having that bit set in HWCAP.

Something to remember if we ever implement TCG for 31-bit mode.
At the moment we only allow KVM in 31-bit mode.
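
If we ever do, the runtime check could look something like this (a sketch
only: it assumes a libc that provides getauxval(), and the bit value is the
one from the kernel's s390 ELF hwcap definitions, given here in case the
headers predate it):

    #include <sys/auxv.h>

    #ifndef HWCAP_S390_HIGH_GPRS
    #define HWCAP_S390_HIGH_GPRS 512
    #endif

    /* Nonzero if the kernel preserves the high halves of the 64-bit
       GPRs for this 31-bit process across signals and traps.  */
    static int have_high_gprs(void)
    {
        return (getauxval(AT_HWCAP) & HWCAP_S390_HIGH_GPRS) != 0;
    }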


r~

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.
  2010-06-13 16:02             ` Richard Henderson
@ 2010-06-13 16:44               ` Aurelien Jarno
  2010-06-13 22:23                 ` Alexander Graf
  0 siblings, 1 reply; 75+ messages in thread
From: Aurelien Jarno @ 2010-06-13 16:44 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Sun, Jun 13, 2010 at 09:02:40AM -0700, Richard Henderson wrote:
> On 06/13/2010 03:49 AM, Aurelien Jarno wrote:
> >> Also, what era is that second machine without highgprs?  Is it running an
> >> old kernel, or a 32-bit kernel?
> > 
> > I have very few infos about it, it's an IBM System z10 machine running a
> > 64-bit 2.6.26 kernel.
> 
> Ah, I see it now: ea2a4d3a3a929ef494952bba57a0ef1a8a877881
> 
>     [S390] 64-bit register support for 31-bit processes
> 
> which adds a mechanism to pass the high parts of the gprs
> in the ucontext to the 31-bit signal handler, and adds a
> spot for them in the 31-bit core dump.
> 
> It doesn't change the actual saving of registers within
> the kernel.  Since we take asynchronous signals and return
> from them (as opposed to always longjmping out), we cannot
> use the full 64-bit register within a 31-bit process without
> having that bit set in HWCAP.
> 
> Something to remember if we ever implement TCG for 31-bit mode.
> At the moment we only allow KVM in 31-bit mode.
> 

Is KVM in 31-bit mode actually functional?

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.
  2010-06-13 16:44               ` Aurelien Jarno
@ 2010-06-13 22:23                 ` Alexander Graf
  2010-06-14 16:20                   ` Richard Henderson
  0 siblings, 1 reply; 75+ messages in thread
From: Alexander Graf @ 2010-06-13 22:23 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, Richard Henderson


On 13.06.2010, at 18:44, Aurelien Jarno wrote:

> On Sun, Jun 13, 2010 at 09:02:40AM -0700, Richard Henderson wrote:
>> On 06/13/2010 03:49 AM, Aurelien Jarno wrote:
>>>> Also, what era is that second machine without highgprs?  Is it running an
>>>> old kernel, or a 32-bit kernel?
>>> 
>>> I have very few infos about it, it's an IBM System z10 machine running a
>>> 64-bit 2.6.26 kernel.
>> 
>> Ah, I see it now: ea2a4d3a3a929ef494952bba57a0ef1a8a877881
>> 
>>    [S390] 64-bit register support for 31-bit processes
>> 
>> which adds a mechanism to pass the high parts of the gprs
>> in the ucontext to the 31-bit signal handler, and adds a
>> spot for them in the 31-bit core dump.
>> 
>> It doesn't change the actual saving of registers within
>> the kernel.  Since we take asynchronous signals and return
>> from them (as opposed to always longjmping out), we cannot
>> use the full 64-bit register within a 31-bit process without
>> having that bit set in HWCAP.
>> 
>> Something to remember if we ever implement TCG for 31-bit mode.
>> At the moment we only allow KVM in 31-bit mode.
>> 
> 
> Is KVM in 31-bit mode actually functional?

I'm not aware of anything that would prevent it from working, but I honestly haven't tried. As long as all hypercall parameters stay within the first 32/31 bits, things should be safe.

Alex

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 16/35] tcg-s390: Re-implement tcg_out_movi.
  2010-06-12 12:04   ` Aurelien Jarno
@ 2010-06-13 23:19     ` Richard Henderson
  0 siblings, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-13 23:19 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, agraf

On 06/12/2010 05:04 AM, Aurelien Jarno wrote:
>> +    for (i = 0; i < 4; i++) {
>> +        tcg_target_long mask = 0xffffull << i*16;
>> +        if ((uval & mask) != 0 && (uval & ~mask) == 0) {
> 
> Wouldn't it be simpler to use (uval & mask) == uval ?

Doh.

>> +    /* Try for PC-relative address load.  */
>> +    if ((sval & 1) == 0) {
>> +        intptr_t off = (sval - (intptr_t)s->code_ptr) >> 1;
>> +        if (off == (int32_t)off) {
>> +            tcg_out_insn(s, RIL, LARL, ret, off);
>> +            return;
>> +        }
>> +    }
> 
> Is this part used in practice? There was such a trick on the ARM
> backend, but it was actually never used.

Yes.  The difference here is we have a +- 4GB displacement.

This is primarily used when the extended-immediate facility is not present;
we can generate all even 32-bit constants from LARL, given the placement of
the code_gen_buffer.
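
A quick worked example of that reach (addresses made up for illustration):
LARL encodes a signed 32-bit halfword offset, i.e. +-4GB around the
instruction.  With the buffer mapped near 0x90000000, an insn at
0x90001234 loading the even constant 0xdeadbee0 needs

    off = (0xdeadbee0 - 0x90001234) >> 1 = 0x2756d656

which fits in 32 bits, so the value is reachable without any of the
immediate-load extensions.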


r~

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.
  2010-06-13 22:23                 ` Alexander Graf
@ 2010-06-14 16:20                   ` Richard Henderson
  2010-06-14 17:39                     ` Alexander Graf
  0 siblings, 1 reply; 75+ messages in thread
From: Richard Henderson @ 2010-06-14 16:20 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel, Aurelien Jarno

On 06/13/2010 03:23 PM, Alexander Graf wrote:
> On 13.06.2010, at 18:44, Aurelien Jarno wrote:
>> Is KVM in 31-bit mode actually functional?
> 
> I'm not aware of anything preventing it from being functional, but I
> honestly haven't tried. As long as all hypercall parameters stay within
> the first 32/31 bits, things should be safe.

On the other hand, is there any point in supporting it?
This does seem to be a case where having the extra VM space
(KVM+TCG) and the extra register size (TCG) is extremely
helpful.

I'd be just as happy to adjust the configury to actively
prevent compilation in 31-bit mode, rather than pretend
it might work and never bother to test it...


r~

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.
  2010-06-14 16:20                   ` Richard Henderson
@ 2010-06-14 17:39                     ` Alexander Graf
  0 siblings, 0 replies; 75+ messages in thread
From: Alexander Graf @ 2010-06-14 17:39 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, Aurelien Jarno


Am 14.06.2010 um 18:20 schrieb Richard Henderson <rth@twiddle.net>:

> On 06/13/2010 03:23 PM, Alexander Graf wrote:
>> On 13.06.2010, at 18:44, Aurelien Jarno wrote:
>>> Is KVM in 31-bit mode actually functional?
>>
>> I'm not aware of anything preventing it from being functional, but I
>> honestly haven't tried. As long as all hypercall parameters stay within
>> the first 32/31 bits, things should be safe.
>
> On the other hand, is there any point in supporting it?
> This does seem to be a case where having the extra VM space
> (KVM+TCG) and the extra register size (TCG) is extremely
> helpful.
>
> I'd be just as happy to adjust the configury to actively
> prevent compilation in 31-bit mode, rather than pretend
> it might work and never bother to test it...

Yeah, I agree. 31-bit is really deprecated by now anyway. Aurelien
was the one interested in it.

Alex

>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Qemu-devel] [PATCH 07/35] tcg: Optionally sign-extend 32-bit arguments for 64-bit host.
  2010-06-10 10:22   ` Aurelien Jarno
  2010-06-10 22:08     ` Richard Henderson
@ 2010-06-14 22:20     ` Richard Henderson
  1 sibling, 0 replies; 75+ messages in thread
From: Richard Henderson @ 2010-06-14 22:20 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, agraf

On 06/10/2010 03:22 AM, Aurelien Jarno wrote:
>> -                  0, GET_TCGV_I32(ret), 2, args);
>> +                  (is_signed ? 0x2a : 0x00), GET_TCGV_I32(ret), 2, args);
> 
> Wouldn't it be better to actually pass the whole flag to
> tcg_gen_helper32(), so that we can in the future also support mixed
> signedness in arguments? Also, doing it here looks a bit like a
> magic constant.

I've fixed this.
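
For the record, that 0x2a appears to decompose as below, assuming the
return value sits in bit pair 0 and argument i in bit pair i+1, with
bit 1 of each pair meaning "signed" (a standalone sketch matching the
hunk quoted below, not code from the patch):

    #include <stdio.h>

    int main(void)
    {
        int ret_signed  = 2 << 0 * 2;   /* 0x02: signed 32-bit return     */
        int arg0_signed = 2 << 1 * 2;   /* 0x08: signed 32-bit argument 0 */
        int arg1_signed = 2 << 2 * 2;   /* 0x20: signed 32-bit argument 1 */
        printf("sizemask = %#x\n", ret_signed | arg0_signed | arg1_signed);
        return 0;
    }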

>> +#if defined(TCG_TARGET_EXTEND_ARGS) && TCG_TARGET_REG_BITS == 64
>> +    for (i = 0; i < nargs; ++i) {
>> +        int is_64bit = sizemask & (1 << (i+1)*2);
>> +        int is_signed = sizemask & (2 << (i+1)*2);
>> +        if (!is_64bit) {
>> +            TCGv_i64 temp = tcg_temp_new_i64();
>> +            TCGv_i64 orig = MAKE_TCGV_I64(args[i]);
>> +            if (is_signed) {
>> +                tcg_gen_ext32s_i64(temp, orig);
>> +            } else {
>> +                tcg_gen_ext32u_i64(temp, orig);
>> +            }
>> +            args[i] = GET_TCGV_I64(temp);
>> +        }
>> +    }
>> +#endif /* TCG_TARGET_EXTEND_ARGS */
>> +
> 
> This part allocates a lot of temp variables, which will probably generate
> a lot of register spills during code generation.
> 
> As we do that for all arguments anyway, wouldn't it be possible to do
> the extension in place? The value in the register is changed, but that
> should not have any effect as it is ignored anyway in other
> instructions.

It is *not* possible to do the extension in-place.  At least not without
changing the format of the INDEX_op_call opcode.

With the extension done during opcode generation, like this, ORIG has been
marked TCG_TYPE_I32 and TEMP gets marked TCG_TYPE_I64.  This matters when
it comes time to copy the arguments into place.  If we somehow extended 
ORIG in-place, we'd still use the wrong instruction to copy the value into
the argument list -- either tcg_out_mov or tcg_out_st would get the wrong
TYPE argument.

If we try to do the extension later, e.g. while copying the value into the
argument list, we'd need to have SIZEMASK available.  To do that, we'd need
to save SIZEMASK into INDEX_op_call's argument list somehow.  That, I think,
is a more invasive change.


r~

^ permalink raw reply	[flat|nested] 75+ messages in thread

Thread overview: 75+ messages
2010-06-04 19:14 [Qemu-devel] [PATCH 00/35] S390 TCG target, version 2 Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 01/35] tcg-s390: Adjust compilation flags Richard Henderson
2010-06-09 22:53   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 02/35] s390x: Avoid _llseek Richard Henderson
2010-06-09 22:54   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 03/35] s390x: Don't use a linker script for user-only Richard Henderson
2010-06-09 22:54   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 04/35] tcg-s390: Compute is_write in cpu_signal_handler Richard Henderson
2010-06-09 22:54   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 05/35] tcg-s390: Icache flush is a no-op Richard Henderson
2010-06-09 22:55   ` Aurelien Jarno
2010-06-10 22:04     ` Richard Henderson
2010-06-11  6:46       ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 06/35] tcg-s390: Allocate the code_gen_buffer near the main program Richard Henderson
2010-06-09 22:59   ` Aurelien Jarno
2010-06-10 22:05     ` Richard Henderson
2010-06-11  7:31       ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 07/35] tcg: Optionally sign-extend 32-bit arguments for 64-bit host Richard Henderson
2010-06-10 10:22   ` Aurelien Jarno
2010-06-10 22:08     ` Richard Henderson
2010-06-14 22:20     ` Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 08/35] s390: Update disassembler to the last GPLv2 from binutils Richard Henderson
2010-06-09 22:47   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 09/35] s390: Disassemble some general-instruction-extension insns Richard Henderson
2010-06-09 22:47   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 10/35] tcg-s390: New TCG target Richard Henderson
2010-06-10 10:24   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 11/35] tcg-s390: Tidy unimplemented opcodes Richard Henderson
2010-06-10 10:24   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 12/35] tcg-s390: Define TCG_TMP0 Richard Henderson
2010-06-10 10:25   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 13/35] tcg-s390: Tidy regset initialization; use R14 as temporary Richard Henderson
2010-06-10 10:26   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 14/35] tcg-s390: Rearrange register allocation order Richard Henderson
2010-06-10 10:26   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed Richard Henderson
2010-06-10 10:28   ` Aurelien Jarno
2010-06-10 22:19     ` Richard Henderson
2010-06-11  8:06       ` Aurelien Jarno
2010-06-11 13:07         ` Richard Henderson
2010-06-12 11:57           ` Aurelien Jarno
2010-06-11 13:13         ` Richard Henderson
2010-06-13 10:49           ` Aurelien Jarno
2010-06-13 16:02             ` Richard Henderson
2010-06-13 16:44               ` Aurelien Jarno
2010-06-13 22:23                 ` Alexander Graf
2010-06-14 16:20                   ` Richard Henderson
2010-06-14 17:39                     ` Alexander Graf
2010-06-04 19:14 ` [Qemu-devel] [PATCH 16/35] tcg-s390: Re-implement tcg_out_movi Richard Henderson
2010-06-12 12:04   ` Aurelien Jarno
2010-06-13 23:19     ` Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 17/35] tcg-s390: Implement sign and zero-extension operations Richard Henderson
2010-06-12 12:32   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 18/35] tcg-s390: Implement bswap operations Richard Henderson
2010-06-12 12:32   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 19/35] tcg-s390: Implement rotates Richard Henderson
2010-06-12 12:33   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 20/35] tcg-s390: Use LOAD COMPLIMENT for negate Richard Henderson
2010-06-12 12:33   ` Aurelien Jarno
2010-06-04 19:14 ` [Qemu-devel] [PATCH 21/35] tcg-s390: Use the ADD IMMEDIATE instructions Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 22/35] tcg-s390: Use the AND " Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 23/35] tcg-s390: Use the OR " Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 24/35] tcg-s390: Use the XOR " Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 25/35] tcg-s390: Use the MULTIPLY " Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 26/35] tcg-s390: Tidy goto_tb Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 27/35] tcg-s390: Rearrange qemu_ld/st to avoid register copy Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 28/35] tcg-s390: Tidy tcg_prepare_qemu_ldst Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 29/35] tcg-s390: Tidy user qemu_ld/st Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 30/35] tcg-s390: Implement GUEST_BASE Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 31/35] tcg-s390: Use 16-bit branches for forward jumps Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 32/35] tcg-s390: Use the LOAD AND TEST instruction for compares Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 33/35] tcg-s390: Use the COMPARE IMMEDIATE instrucions " Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 34/35] tcg-s390: Use COMPARE AND BRANCH instructions Richard Henderson
2010-06-04 19:14 ` [Qemu-devel] [PATCH 35/35] tcg-s390: Enable compile in 32-bit mode Richard Henderson
2010-06-08 13:11 ` [Qemu-devel] Re: [PATCH 00/35] S390 TCG target, version 2 Alexander Graf
