All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v1 0/7] s390x/tcg: Cleanups and refactorings for Vector
@ 2019-02-25 11:55 David Hildenbrand
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 1/7] s390x/tcg: RXE has an optional M3 field David Hildenbrand
                   ` (6 more replies)
  0 siblings, 7 replies; 25+ messages in thread
From: David Hildenbrand @ 2019-02-25 11:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-s390x, Thomas Huth, Cornelia Huck, Richard Henderson,
	David Hildenbrand

Before we start with the real magic, some more cleanups and refactorings.
This series does not depend on other patches not yet in master.

Also add a variant of "LOAD LENGTHENED" that is used along with
vector instructions in linux (HFP instructions that can be used without
HFP :) ). Implement "LOAD COUNT TO BLOCK BOUNDARY", introduced with
vector facility but not operating on vectors.

David Hildenbrand (7):
  s390x/tcg: RXE has an optional M3 field
  s390x/tcg: Simplify disassembler operands initialization
  s390x/tcg: Clarify terminology in vec_reg_offset()
  s390x/tcg: Factor out vec_full_reg_offset()
  s390x/tcg: Factor out gen_addi_and_wrap_i64() from get_address()
  s390x/tcg: Implement LOAD LENGTHENED short HFP to long HFP
  s390x/tcg: Implement LOAD COUNT TO BLOCK BOUNDARY

 target/s390x/cc_helper.c     |  8 ++++
 target/s390x/helper.c        |  1 +
 target/s390x/helper.h        |  1 +
 target/s390x/insn-data.def   |  4 ++
 target/s390x/insn-format.def |  2 +-
 target/s390x/internal.h      |  1 +
 target/s390x/mem_helper.c    | 12 +++++
 target/s390x/translate.c     | 92 ++++++++++++++++++++++++------------
 8 files changed, 91 insertions(+), 30 deletions(-)

-- 
2.17.2

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v1 1/7] s390x/tcg: RXE has an optional M3 field
  2019-02-25 11:55 [Qemu-devel] [PATCH v1 0/7] s390x/tcg: Cleanups and refactorings for Vector David Hildenbrand
@ 2019-02-25 11:55 ` David Hildenbrand
  2019-02-25 12:18   ` Thomas Huth
  2019-02-25 15:46   ` Richard Henderson
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 2/7] s390x/tcg: Simplify disassembler operands initialization David Hildenbrand
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: David Hildenbrand @ 2019-02-25 11:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-s390x, Thomas Huth, Cornelia Huck, Richard Henderson,
	David Hildenbrand

Will be needed, so add it to the format description.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/insn-format.def | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/s390x/insn-format.def b/target/s390x/insn-format.def
index a412d90fb7..4297ff4165 100644
--- a/target/s390x/insn-format.def
+++ b/target/s390x/insn-format.def
@@ -36,7 +36,7 @@ F3(RSY_a, R(1, 8),     BDL(2),      R(3,12))
 F3(RSY_b, R(1, 8),     BDL(2),      M(3,12))
 F2(RX_a,  R(1, 8),     BXD(2))
 F2(RX_b,  M(1, 8),     BXD(2))
-F2(RXE,   R(1, 8),     BXD(2))
+F3(RXE,   R(1, 8),     BXD(2),      M(3,32))
 F3(RXF,   R(1,32),     BXD(2),      R(3, 8))
 F2(RXY_a, R(1, 8),     BXDL(2))
 F2(RXY_b, M(1, 8),     BXDL(2))
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v1 2/7] s390x/tcg: Simplify disassembler operands initialization
  2019-02-25 11:55 [Qemu-devel] [PATCH v1 0/7] s390x/tcg: Cleanups and refactorings for Vector David Hildenbrand
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 1/7] s390x/tcg: RXE has an optional M3 field David Hildenbrand
@ 2019-02-25 11:55 ` David Hildenbrand
  2019-02-25 12:20   ` Thomas Huth
                     ` (2 more replies)
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 3/7] s390x/tcg: Clarify terminology in vec_reg_offset() David Hildenbrand
                   ` (4 subsequent siblings)
  6 siblings, 3 replies; 25+ messages in thread
From: David Hildenbrand @ 2019-02-25 11:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-s390x, Thomas Huth, Cornelia Huck, Richard Henderson,
	David Hildenbrand

Let's simply initialization to 0.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/translate.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index 19072efec6..c646e50eb3 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -6091,7 +6091,7 @@ static DisasJumpType translate_one(CPUS390XState *env, DisasContext *s)
     const DisasInsn *insn;
     DisasJumpType ret = DISAS_NEXT;
     DisasFields f;
-    DisasOps o;
+    DisasOps o = {};
 
     /* Search for the insn in the table.  */
     insn = extract_insn(env, s, &f);
@@ -6161,12 +6161,6 @@ static DisasJumpType translate_one(CPUS390XState *env, DisasContext *s)
     /* Set up the strutures we use to communicate with the helpers. */
     s->insn = insn;
     s->fields = &f;
-    o.g_out = o.g_out2 = o.g_in1 = o.g_in2 = false;
-    o.out = NULL;
-    o.out2 = NULL;
-    o.in1 = NULL;
-    o.in2 = NULL;
-    o.addr1 = NULL;
 
     /* Implement the instruction.  */
     if (insn->help_in1) {
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v1 3/7] s390x/tcg: Clarify terminology in vec_reg_offset()
  2019-02-25 11:55 [Qemu-devel] [PATCH v1 0/7] s390x/tcg: Cleanups and refactorings for Vector David Hildenbrand
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 1/7] s390x/tcg: RXE has an optional M3 field David Hildenbrand
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 2/7] s390x/tcg: Simplify disassembler operands initialization David Hildenbrand
@ 2019-02-25 11:55 ` David Hildenbrand
  2019-02-25 12:30   ` Thomas Huth
  2019-02-25 15:47   ` Richard Henderson
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 4/7] s390x/tcg: Factor out vec_full_reg_offset() David Hildenbrand
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: David Hildenbrand @ 2019-02-25 11:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-s390x, Thomas Huth, Cornelia Huck, Richard Henderson,
	David Hildenbrand

We will use s390x speak "Element Size" (es) for MO_8 == 0, MO_16 == 1
... Simple rename of variables.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/translate.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index c646e50eb3..916508b567 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -145,10 +145,10 @@ void s390x_translate_init(void)
     }
 }
 
-static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp size)
+static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp es)
 {
-    const uint8_t es = 1 << size;
-    int offs = enr * es;
+    const uint8_t bytes = 1 << es;
+    int offs = enr * bytes;
 
     g_assert(reg < 32);
     /*
@@ -173,9 +173,9 @@ static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp size)
      * the two 8 byte elements have to be loaded separately. Let's force all
      * 16 byte operations to handle it in a special way.
      */
-    g_assert(size <= MO_64);
+    g_assert(es <= MO_64);
 #ifndef HOST_WORDS_BIGENDIAN
-    offs ^= (8 - es);
+    offs ^= (8 - bytes);
 #endif
     return offs + offsetof(CPUS390XState, vregs[reg][0].d);
 }
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v1 4/7] s390x/tcg: Factor out vec_full_reg_offset()
  2019-02-25 11:55 [Qemu-devel] [PATCH v1 0/7] s390x/tcg: Cleanups and refactorings for Vector David Hildenbrand
                   ` (2 preceding siblings ...)
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 3/7] s390x/tcg: Clarify terminology in vec_reg_offset() David Hildenbrand
@ 2019-02-25 11:55 ` David Hildenbrand
  2019-02-25 12:32   ` Thomas Huth
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 5/7] s390x/tcg: Factor out gen_addi_and_wrap_i64() from get_address() David Hildenbrand
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2019-02-25 11:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-s390x, Thomas Huth, Cornelia Huck, Richard Henderson,
	David Hildenbrand

We'll use that a lot along with gvec helpers, to calculate the start
address of a vector.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/translate.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index 916508b567..f8c285a685 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -145,12 +145,17 @@ void s390x_translate_init(void)
     }
 }
 
+static inline int vec_full_reg_offset(uint8_t reg)
+{
+    g_assert(reg < 32);
+    return offsetof(CPUS390XState, vregs[reg][0].d);
+}
+
 static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp es)
 {
     const uint8_t bytes = 1 << es;
     int offs = enr * bytes;
 
-    g_assert(reg < 32);
     /*
      * vregs[n][0] is the lowest 8 byte and vregs[n][1] the highest 8 byte
      * of the 16 byte vector, on both, little and big endian systems.
@@ -177,7 +182,7 @@ static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp es)
 #ifndef HOST_WORDS_BIGENDIAN
     offs ^= (8 - bytes);
 #endif
-    return offs + offsetof(CPUS390XState, vregs[reg][0].d);
+    return offs + vec_full_reg_offset(reg);
 }
 
 static inline int freg64_offset(uint8_t reg)
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v1 5/7] s390x/tcg: Factor out gen_addi_and_wrap_i64() from get_address()
  2019-02-25 11:55 [Qemu-devel] [PATCH v1 0/7] s390x/tcg: Cleanups and refactorings for Vector David Hildenbrand
                   ` (3 preceding siblings ...)
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 4/7] s390x/tcg: Factor out vec_full_reg_offset() David Hildenbrand
@ 2019-02-25 11:55 ` David Hildenbrand
  2019-02-25 15:53   ` Richard Henderson
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 6/7] s390x/tcg: Implement LOAD LENGTHENED short HFP to long HFP David Hildenbrand
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 7/7] s390x/tcg: Implement LOAD COUNT TO BLOCK BOUNDARY David Hildenbrand
  6 siblings, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2019-02-25 11:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-s390x, Thomas Huth, Cornelia Huck, Richard Henderson,
	David Hildenbrand

Also properly wrap in 24bit mode. While at it, convert the comment (and
drop the comment about fundamental TCG optimizations).

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/translate.c | 41 +++++++++++++++++++++++++---------------
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index f8c285a685..322fbbdf81 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -381,32 +381,43 @@ static inline void gen_trap(DisasContext *s)
     gen_data_exception(0xff);
 }
 
+static inline void gen_addi_and_wrap_i64(DisasContext *s, TCGv_i64 dst,
+                                         TCGv_i64 src, int64_t imm)
+{
+    tcg_gen_addi_i64(dst, src, imm);
+    if (!(s->base.tb->flags & FLAG_MASK_64)) {
+        if (s->base.tb->flags & FLAG_MASK_32) {
+            tcg_gen_andi_i64(dst, dst, 0x7fffffff);
+        } else {
+            tcg_gen_andi_i64(dst, dst, 0x00ffffff);
+        }
+    }
+}
+
 static TCGv_i64 get_address(DisasContext *s, int x2, int b2, int d2)
 {
     TCGv_i64 tmp = tcg_temp_new_i64();
-    bool need_31 = !(s->base.tb->flags & FLAG_MASK_64);
-
-    /* Note that d2 is limited to 20 bits, signed.  If we crop negative
-       displacements early we create larger immedate addends.  */
 
-    /* Note that addi optimizes the imm==0 case.  */
+    /*
+     * Note that d2 is limited to 20 bits, signed.  If we crop negative
+     * displacements early we create larger immedate addends.
+     */
     if (b2 && x2) {
         tcg_gen_add_i64(tmp, regs[b2], regs[x2]);
-        tcg_gen_addi_i64(tmp, tmp, d2);
+        gen_addi_and_wrap_i64(s, tmp, tmp, d2);
     } else if (b2) {
-        tcg_gen_addi_i64(tmp, regs[b2], d2);
+        gen_addi_and_wrap_i64(s, tmp, regs[b2], d2);
     } else if (x2) {
-        tcg_gen_addi_i64(tmp, regs[x2], d2);
-    } else {
-        if (need_31) {
-            d2 &= 0x7fffffff;
-            need_31 = false;
+        gen_addi_and_wrap_i64(s, tmp, regs[x2], d2);
+    } else if (!(s->base.tb->flags & FLAG_MASK_64)) {
+        if (s->base.tb->flags & FLAG_MASK_32) {
+            tcg_gen_movi_i64(tmp, d2 & 0x7fffffff);
+        } else {
+            tcg_gen_movi_i64(tmp, d2 & 0x00ffffff);
         }
+    } else {
         tcg_gen_movi_i64(tmp, d2);
     }
-    if (need_31) {
-        tcg_gen_andi_i64(tmp, tmp, 0x7fffffff);
-    }
 
     return tmp;
 }
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v1 6/7] s390x/tcg: Implement LOAD LENGTHENED short HFP to long HFP
  2019-02-25 11:55 [Qemu-devel] [PATCH v1 0/7] s390x/tcg: Cleanups and refactorings for Vector David Hildenbrand
                   ` (4 preceding siblings ...)
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 5/7] s390x/tcg: Factor out gen_addi_and_wrap_i64() from get_address() David Hildenbrand
@ 2019-02-25 11:55 ` David Hildenbrand
  2019-02-25 15:59   ` Richard Henderson
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 7/7] s390x/tcg: Implement LOAD COUNT TO BLOCK BOUNDARY David Hildenbrand
  6 siblings, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2019-02-25 11:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-s390x, Thomas Huth, Cornelia Huck, Richard Henderson,
	David Hildenbrand

Nice trick to load a 32 bit value into vector element 0 (32 bit element
size) from memory, zeroing out element1. The short HFP to long HFP
conversion really only is a shift.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/insn-data.def | 2 ++
 target/s390x/translate.c   | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index 61582372ab..fb6ee18650 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -598,6 +598,8 @@
     F(0xed04, LDEB,    RXE,   Z,   0, m2_32u, new, f1, ldeb, 0, IF_BFP)
     F(0xed05, LXDB,    RXE,   Z,   0, m2_64, new_P, x1, lxdb, 0, IF_BFP)
     F(0xed06, LXEB,    RXE,   Z,   0, m2_32u, new_P, x1, lxeb, 0, IF_BFP)
+    F(0xb324, LDER,    RXE,   Z,   0, e2, new, f1, lde, 0, IF_AFP1)
+    F(0xed24, LDE,     RXE,   Z,   0, m2_32u, new, f1, lde, 0, IF_AFP1)
 /* LOAD ROUNDED */
     F(0xb344, LEDBR,   RRE,   Z,   0, f2, new, e1, ledb, 0, IF_BFP)
     F(0xb345, LDXBR,   RRE,   Z,   x2h, x2l, new, f1, ldxb, 0, IF_BFP)
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index 322fbbdf81..34799a8704 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -2724,6 +2724,12 @@ static DisasJumpType op_lxeb(DisasContext *s, DisasOps *o)
     return DISAS_NEXT;
 }
 
+static DisasJumpType op_lde(DisasContext *s, DisasOps *o)
+{
+    tcg_gen_shli_i64(o->out, o->in2, 32);
+    return DISAS_NEXT;
+}
+
 static DisasJumpType op_llgt(DisasContext *s, DisasOps *o)
 {
     tcg_gen_andi_i64(o->out, o->in2, 0x7fffffff);
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v1 7/7] s390x/tcg: Implement LOAD COUNT TO BLOCK BOUNDARY
  2019-02-25 11:55 [Qemu-devel] [PATCH v1 0/7] s390x/tcg: Cleanups and refactorings for Vector David Hildenbrand
                   ` (5 preceding siblings ...)
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 6/7] s390x/tcg: Implement LOAD LENGTHENED short HFP to long HFP David Hildenbrand
@ 2019-02-25 11:55 ` David Hildenbrand
  2019-02-25 16:14   ` Richard Henderson
  6 siblings, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2019-02-25 11:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-s390x, Thomas Huth, Cornelia Huck, Richard Henderson,
	David Hildenbrand

Use a new CC helper to calculate the CC lazily if needed. While the
PoP mentions that "A 32-bit unsigned binary integer" is placed into the
first operand, there is no word telling that the other 32 bits (high
part) are left untouched. Maybe the other 32-bit are unpredictable.
So store 64 bit for now.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/cc_helper.c   |  8 ++++++++
 target/s390x/helper.c      |  1 +
 target/s390x/helper.h      |  1 +
 target/s390x/insn-data.def |  2 ++
 target/s390x/internal.h    |  1 +
 target/s390x/mem_helper.c  | 12 ++++++++++++
 target/s390x/translate.c   | 18 ++++++++++++++++++
 7 files changed, 43 insertions(+)

diff --git a/target/s390x/cc_helper.c b/target/s390x/cc_helper.c
index 307ad61aee..0e467bf2b6 100644
--- a/target/s390x/cc_helper.c
+++ b/target/s390x/cc_helper.c
@@ -397,6 +397,11 @@ static uint32_t cc_calc_flogr(uint64_t dst)
     return dst ? 2 : 0;
 }
 
+static uint32_t cc_calc_lcbb(uint64_t dst)
+{
+    return dst == 16 ? 0 : 3;
+}
+
 static uint32_t do_calc_cc(CPUS390XState *env, uint32_t cc_op,
                                   uint64_t src, uint64_t dst, uint64_t vr)
 {
@@ -506,6 +511,9 @@ static uint32_t do_calc_cc(CPUS390XState *env, uint32_t cc_op,
     case CC_OP_FLOGR:
         r = cc_calc_flogr(dst);
         break;
+    case CC_OP_LCBB:
+        r = cc_calc_lcbb(dst);
+        break;
 
     case CC_OP_NZ_F32:
         r = set_cc_nz_f32(dst);
diff --git a/target/s390x/helper.c b/target/s390x/helper.c
index a7edd5df7d..8e9573221c 100644
--- a/target/s390x/helper.c
+++ b/target/s390x/helper.c
@@ -417,6 +417,7 @@ const char *cc_name(enum cc_op cc_op)
         [CC_OP_SLA_32]    = "CC_OP_SLA_32",
         [CC_OP_SLA_64]    = "CC_OP_SLA_64",
         [CC_OP_FLOGR]     = "CC_OP_FLOGR",
+        [CC_OP_LCBB]      = "CC_OP_LCBB",
     };
 
     return cc_names[cc_op];
diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 6260b50496..a2f8f96aae 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -122,6 +122,7 @@ DEF_HELPER_4(cu42, i32, env, i32, i32, i32)
 DEF_HELPER_5(msa, i32, env, i32, i32, i32, i32)
 DEF_HELPER_FLAGS_1(stpt, TCG_CALL_NO_RWG, i64, env)
 DEF_HELPER_FLAGS_1(stck, TCG_CALL_NO_RWG_SE, i64, env)
+DEF_HELPER_FLAGS_2(lcbb, TCG_CALL_NO_RWG_SE, i64, i64, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index fb6ee18650..f4f1d63ab4 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -479,6 +479,8 @@
     F(0xb313, LCDBR,   RRE,   Z,   0, f2, new, f1, negf64, f64, IF_BFP)
     F(0xb343, LCXBR,   RRE,   Z,   x2h, x2l, new_P, x1, negf128, f128, IF_BFP)
     F(0xb373, LCDFR,   RRE,   FPSSH, 0, f2, new, f1, negf64, 0, IF_AFP1 | IF_AFP2)
+/* LOAD COUNT TO BLOCK BOUNDARY */
+    C(0xe727, LCBB,    RXE,   V,   la2, 0, r1, 0, lcbb, 0)
 /* LOAD HALFWORD */
     C(0xb927, LHR,     RRE,   EI,  0, r2_16s, 0, r1_32, mov2, 0)
     C(0xb907, LGHR,    RRE,   EI,  0, r2_16s, 0, r1, mov2, 0)
diff --git a/target/s390x/internal.h b/target/s390x/internal.h
index b2966a3adc..9d0a45d1fe 100644
--- a/target/s390x/internal.h
+++ b/target/s390x/internal.h
@@ -236,6 +236,7 @@ enum cc_op {
     CC_OP_SLA_32,               /* Calculate shift left signed (32bit) */
     CC_OP_SLA_64,               /* Calculate shift left signed (64bit) */
     CC_OP_FLOGR,                /* find leftmost one */
+    CC_OP_LCBB,                 /* load count to block boundary */
     CC_OP_MAX
 };
 
diff --git a/target/s390x/mem_helper.c b/target/s390x/mem_helper.c
index a506d9ef99..7bca848cda 100644
--- a/target/s390x/mem_helper.c
+++ b/target/s390x/mem_helper.c
@@ -2623,3 +2623,15 @@ uint32_t HELPER(cu42)(CPUS390XState *env, uint32_t r1, uint32_t r2, uint32_t m3)
     return convert_unicode(env, r1, r2, m3, GETPC(),
                            decode_utf32, encode_utf16);
 }
+
+uint64_t HELPER(lcbb)(uint64_t addr, uint32_t m3)
+{
+    const uint32_t block_size = 1ul << (m3 + 6);
+    const uint64_t rounded_addr = ROUND_UP(addr, block_size);
+    uint32_t to_load = 16;
+
+    if (rounded_addr != addr) {
+        to_load = MIN(rounded_addr - addr, to_load);
+    }
+    return to_load;
+}
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index 34799a8704..fd08ae6a5d 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -556,6 +556,7 @@ static void gen_op_calc_cc(DisasContext *s)
     case CC_OP_NZ_F32:
     case CC_OP_NZ_F64:
     case CC_OP_FLOGR:
+    case CC_OP_LCBB:
         /* 1 argument */
         gen_helper_calc_cc(cc_op, cpu_env, local_cc_op, dummy, cc_dst, dummy);
         break;
@@ -3141,6 +3142,22 @@ static DisasJumpType op_lzrb(DisasContext *s, DisasOps *o)
     return DISAS_NEXT;
 }
 
+static DisasJumpType op_lcbb(DisasContext *s, DisasOps *o)
+{
+    TCGv_i32 m3;
+
+    if (get_field(s->fields, m3) > 6) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }
+
+    m3 = tcg_const_i32(get_field(s->fields, m3));
+    gen_helper_lcbb(o->out, o->addr1, m3);
+    tcg_temp_free_i32(m3);
+    gen_op_update1_cc_i64(s, CC_OP_LCBB, o->out);
+    return DISAS_NEXT;
+}
+
 static DisasJumpType op_mov2(DisasContext *s, DisasOps *o)
 {
     o->out = o->in2;
@@ -5930,6 +5947,7 @@ enum DisasInsnEnum {
 #define FAC_ECT         S390_FEAT_EXTRACT_CPU_TIME
 #define FAC_PCI         S390_FEAT_ZPCI /* z/PCI facility */
 #define FAC_AIS         S390_FEAT_ADAPTER_INT_SUPPRESSION
+#define FAC_V           S390_FEAT_VECTOR /* vector facility */
 
 static const DisasInsn insn_info[] = {
 #include "insn-data.def"
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 1/7] s390x/tcg: RXE has an optional M3 field
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 1/7] s390x/tcg: RXE has an optional M3 field David Hildenbrand
@ 2019-02-25 12:18   ` Thomas Huth
  2019-02-25 15:46   ` Richard Henderson
  1 sibling, 0 replies; 25+ messages in thread
From: Thomas Huth @ 2019-02-25 12:18 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: qemu-s390x, Cornelia Huck, Richard Henderson

On 25/02/2019 12.55, David Hildenbrand wrote:
> Will be needed, so add it to the format description.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/insn-format.def | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/s390x/insn-format.def b/target/s390x/insn-format.def
> index a412d90fb7..4297ff4165 100644
> --- a/target/s390x/insn-format.def
> +++ b/target/s390x/insn-format.def
> @@ -36,7 +36,7 @@ F3(RSY_a, R(1, 8),     BDL(2),      R(3,12))
>  F3(RSY_b, R(1, 8),     BDL(2),      M(3,12))
>  F2(RX_a,  R(1, 8),     BXD(2))
>  F2(RX_b,  M(1, 8),     BXD(2))
> -F2(RXE,   R(1, 8),     BXD(2))
> +F3(RXE,   R(1, 8),     BXD(2),      M(3,32))

Yes, according to the Principles of Operations, RXE has a M3 field.

Reviewed-by: Thomas Huth <thuth@redhat.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 2/7] s390x/tcg: Simplify disassembler operands initialization
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 2/7] s390x/tcg: Simplify disassembler operands initialization David Hildenbrand
@ 2019-02-25 12:20   ` Thomas Huth
  2019-02-25 12:21   ` Cornelia Huck
  2019-02-25 15:47   ` Richard Henderson
  2 siblings, 0 replies; 25+ messages in thread
From: Thomas Huth @ 2019-02-25 12:20 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: qemu-s390x, Cornelia Huck, Richard Henderson

On 25/02/2019 12.55, David Hildenbrand wrote:
> Let's simply initialization to 0.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/translate.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/target/s390x/translate.c b/target/s390x/translate.c
> index 19072efec6..c646e50eb3 100644
> --- a/target/s390x/translate.c
> +++ b/target/s390x/translate.c
> @@ -6091,7 +6091,7 @@ static DisasJumpType translate_one(CPUS390XState *env, DisasContext *s)
>      const DisasInsn *insn;
>      DisasJumpType ret = DISAS_NEXT;
>      DisasFields f;
> -    DisasOps o;
> +    DisasOps o = {};
>  
>      /* Search for the insn in the table.  */
>      insn = extract_insn(env, s, &f);
> @@ -6161,12 +6161,6 @@ static DisasJumpType translate_one(CPUS390XState *env, DisasContext *s)
>      /* Set up the strutures we use to communicate with the helpers. */
>      s->insn = insn;
>      s->fields = &f;
> -    o.g_out = o.g_out2 = o.g_in1 = o.g_in2 = false;
> -    o.out = NULL;
> -    o.out2 = NULL;
> -    o.in1 = NULL;
> -    o.in2 = NULL;
> -    o.addr1 = NULL;
>  
>      /* Implement the instruction.  */
>      if (insn->help_in1) {
> 

Reviewed-by: Thomas Huth <thuth@redhat.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 2/7] s390x/tcg: Simplify disassembler operands initialization
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 2/7] s390x/tcg: Simplify disassembler operands initialization David Hildenbrand
  2019-02-25 12:20   ` Thomas Huth
@ 2019-02-25 12:21   ` Cornelia Huck
  2019-02-25 13:32     ` David Hildenbrand
  2019-02-25 15:47   ` Richard Henderson
  2 siblings, 1 reply; 25+ messages in thread
From: Cornelia Huck @ 2019-02-25 12:21 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: qemu-devel, qemu-s390x, Thomas Huth, Richard Henderson

On Mon, 25 Feb 2019 12:55:47 +0100
David Hildenbrand <david@redhat.com> wrote:

> Let's simply initialization to 0.

"Let's simply zero-initialize the structure." ?

> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/translate.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 3/7] s390x/tcg: Clarify terminology in vec_reg_offset()
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 3/7] s390x/tcg: Clarify terminology in vec_reg_offset() David Hildenbrand
@ 2019-02-25 12:30   ` Thomas Huth
  2019-02-25 13:33     ` David Hildenbrand
  2019-02-25 15:47   ` Richard Henderson
  1 sibling, 1 reply; 25+ messages in thread
From: Thomas Huth @ 2019-02-25 12:30 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: qemu-s390x, Cornelia Huck, Richard Henderson

On 25/02/2019 12.55, David Hildenbrand wrote:
> We will use s390x speak "Element Size" (es) for MO_8 == 0, MO_16 == 1
> ... Simple rename of variables.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/translate.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/target/s390x/translate.c b/target/s390x/translate.c
> index c646e50eb3..916508b567 100644
> --- a/target/s390x/translate.c
> +++ b/target/s390x/translate.c
> @@ -145,10 +145,10 @@ void s390x_translate_init(void)
>      }
>  }
>  
> -static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp size)
> +static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp es)
>  {
> -    const uint8_t es = 1 << size;
> -    int offs = enr * es;
> +    const uint8_t bytes = 1 << es;

I'd maybe add a comment at the end of above line, saying "/* Convert
element size to bytes */" or something similar, so that it is clear to
the reader that "es" means "element size"...

> +    int offs = enr * bytes;
>  
>      g_assert(reg < 32);
>      /*
> @@ -173,9 +173,9 @@ static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp size)
>       * the two 8 byte elements have to be loaded separately. Let's force all
>       * 16 byte operations to handle it in a special way.
>       */
> -    g_assert(size <= MO_64);
> +    g_assert(es <= MO_64);
>  #ifndef HOST_WORDS_BIGENDIAN
> -    offs ^= (8 - es);
> +    offs ^= (8 - bytes);
>  #endif
>      return offs + offsetof(CPUS390XState, vregs[reg][0].d);
>  }
> 

Reviewed-by: Thomas Huth <thuth@redhat.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 4/7] s390x/tcg: Factor out vec_full_reg_offset()
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 4/7] s390x/tcg: Factor out vec_full_reg_offset() David Hildenbrand
@ 2019-02-25 12:32   ` Thomas Huth
  0 siblings, 0 replies; 25+ messages in thread
From: Thomas Huth @ 2019-02-25 12:32 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: qemu-s390x, Cornelia Huck, Richard Henderson

On 25/02/2019 12.55, David Hildenbrand wrote:
> We'll use that a lot along with gvec helpers, to calculate the start
> address of a vector.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/translate.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/target/s390x/translate.c b/target/s390x/translate.c
> index 916508b567..f8c285a685 100644
> --- a/target/s390x/translate.c
> +++ b/target/s390x/translate.c
> @@ -145,12 +145,17 @@ void s390x_translate_init(void)
>      }
>  }
>  
> +static inline int vec_full_reg_offset(uint8_t reg)
> +{
> +    g_assert(reg < 32);
> +    return offsetof(CPUS390XState, vregs[reg][0].d);
> +}
> +
>  static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp es)
>  {
>      const uint8_t bytes = 1 << es;
>      int offs = enr * bytes;
>  
> -    g_assert(reg < 32);
>      /*
>       * vregs[n][0] is the lowest 8 byte and vregs[n][1] the highest 8 byte
>       * of the 16 byte vector, on both, little and big endian systems.
> @@ -177,7 +182,7 @@ static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp es)
>  #ifndef HOST_WORDS_BIGENDIAN
>      offs ^= (8 - bytes);
>  #endif
> -    return offs + offsetof(CPUS390XState, vregs[reg][0].d);
> +    return offs + vec_full_reg_offset(reg);
>  }
>  
>  static inline int freg64_offset(uint8_t reg)
> 

Reviewed-by: Thomas Huth <thuth@redhat.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 2/7] s390x/tcg: Simplify disassembler operands initialization
  2019-02-25 12:21   ` Cornelia Huck
@ 2019-02-25 13:32     ` David Hildenbrand
  0 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2019-02-25 13:32 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: qemu-devel, qemu-s390x, Thomas Huth, Richard Henderson

On 25.02.19 13:21, Cornelia Huck wrote:
> On Mon, 25 Feb 2019 12:55:47 +0100
> David Hildenbrand <david@redhat.com> wrote:
> 
>> Let's simply initialization to 0.
> 
> "Let's simply zero-initialize the structure." ?

Or "s/simply/simplify/", whatever you prefer.

> 
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  target/s390x/translate.c | 8 +-------
>>  1 file changed, 1 insertion(+), 7 deletions(-)


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 3/7] s390x/tcg: Clarify terminology in vec_reg_offset()
  2019-02-25 12:30   ` Thomas Huth
@ 2019-02-25 13:33     ` David Hildenbrand
  0 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2019-02-25 13:33 UTC (permalink / raw)
  To: Thomas Huth, qemu-devel; +Cc: qemu-s390x, Cornelia Huck, Richard Henderson

On 25.02.19 13:30, Thomas Huth wrote:
> On 25/02/2019 12.55, David Hildenbrand wrote:
>> We will use s390x speak "Element Size" (es) for MO_8 == 0, MO_16 == 1
>> ... Simple rename of variables.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  target/s390x/translate.c | 10 +++++-----
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/target/s390x/translate.c b/target/s390x/translate.c
>> index c646e50eb3..916508b567 100644
>> --- a/target/s390x/translate.c
>> +++ b/target/s390x/translate.c
>> @@ -145,10 +145,10 @@ void s390x_translate_init(void)
>>      }
>>  }
>>  
>> -static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp size)
>> +static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp es)
>>  {
>> -    const uint8_t es = 1 << size;
>> -    int offs = enr * es;
>> +    const uint8_t bytes = 1 << es;
> 
> I'd maybe add a comment at the end of above line, saying "/* Convert
> element size to bytes */" or something similar, so that it is clear to
> the reader that "es" means "element size"...

Can do, in the actual vector instruction part I have a comment
explaining the terminology, but that will be in another file.

> 
>> +    int offs = enr * bytes;
>>  
>>      g_assert(reg < 32);
>>      /*
>> @@ -173,9 +173,9 @@ static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp size)
>>       * the two 8 byte elements have to be loaded separately. Let's force all
>>       * 16 byte operations to handle it in a special way.
>>       */
>> -    g_assert(size <= MO_64);
>> +    g_assert(es <= MO_64);
>>  #ifndef HOST_WORDS_BIGENDIAN
>> -    offs ^= (8 - es);
>> +    offs ^= (8 - bytes);
>>  #endif
>>      return offs + offsetof(CPUS390XState, vregs[reg][0].d);
>>  }
>>
> 
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> 

Thanks!


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 1/7] s390x/tcg: RXE has an optional M3 field
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 1/7] s390x/tcg: RXE has an optional M3 field David Hildenbrand
  2019-02-25 12:18   ` Thomas Huth
@ 2019-02-25 15:46   ` Richard Henderson
  1 sibling, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2019-02-25 15:46 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: qemu-s390x, Cornelia Huck, Thomas Huth, Richard Henderson

On 2/25/19 3:55 AM, David Hildenbrand wrote:
> Will be needed, so add it to the format description.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/insn-format.def | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 2/7] s390x/tcg: Simplify disassembler operands initialization
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 2/7] s390x/tcg: Simplify disassembler operands initialization David Hildenbrand
  2019-02-25 12:20   ` Thomas Huth
  2019-02-25 12:21   ` Cornelia Huck
@ 2019-02-25 15:47   ` Richard Henderson
  2 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2019-02-25 15:47 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: qemu-s390x, Cornelia Huck, Thomas Huth, Richard Henderson

On 2/25/19 3:55 AM, David Hildenbrand wrote:
> Let's simply initialization to 0.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/translate.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)

Modulo what ever fixed language you and Connie agree upon.  ;-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 3/7] s390x/tcg: Clarify terminology in vec_reg_offset()
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 3/7] s390x/tcg: Clarify terminology in vec_reg_offset() David Hildenbrand
  2019-02-25 12:30   ` Thomas Huth
@ 2019-02-25 15:47   ` Richard Henderson
  1 sibling, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2019-02-25 15:47 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: qemu-s390x, Cornelia Huck, Thomas Huth, Richard Henderson

On 2/25/19 3:55 AM, David Hildenbrand wrote:
> We will use s390x speak "Element Size" (es) for MO_8 == 0, MO_16 == 1
> ... Simple rename of variables.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/translate.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 5/7] s390x/tcg: Factor out gen_addi_and_wrap_i64() from get_address()
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 5/7] s390x/tcg: Factor out gen_addi_and_wrap_i64() from get_address() David Hildenbrand
@ 2019-02-25 15:53   ` Richard Henderson
  2019-02-25 16:08     ` David Hildenbrand
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2019-02-25 15:53 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: qemu-s390x, Cornelia Huck, Thomas Huth, Richard Henderson

On 2/25/19 3:55 AM, David Hildenbrand wrote:
> Also properly wrap in 24bit mode. While at it, convert the comment (and
> drop the comment about fundamental TCG optimizations).
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/translate.c | 41 +++++++++++++++++++++++++---------------
>  1 file changed, 26 insertions(+), 15 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

> +static inline void gen_addi_and_wrap_i64(DisasContext *s, TCGv_i64 dst,
> +                                         TCGv_i64 src, int64_t imm)

I would drop the inline and let the compiler choose.
I'm using very few explicit inline markers these days...


r~

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 6/7] s390x/tcg: Implement LOAD LENGTHENED short HFP to long HFP
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 6/7] s390x/tcg: Implement LOAD LENGTHENED short HFP to long HFP David Hildenbrand
@ 2019-02-25 15:59   ` Richard Henderson
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2019-02-25 15:59 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: qemu-s390x, Cornelia Huck, Thomas Huth, Richard Henderson

On 2/25/19 3:55 AM, David Hildenbrand wrote:
> Nice trick to load a 32 bit value into vector element 0 (32 bit element
> size) from memory, zeroing out element1. The short HFP to long HFP
> conversion really only is a shift.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/insn-data.def | 2 ++
>  target/s390x/translate.c   | 6 ++++++
>  2 files changed, 8 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 5/7] s390x/tcg: Factor out gen_addi_and_wrap_i64() from get_address()
  2019-02-25 15:53   ` Richard Henderson
@ 2019-02-25 16:08     ` David Hildenbrand
  0 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2019-02-25 16:08 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: qemu-s390x, Cornelia Huck, Thomas Huth, Richard Henderson

On 25.02.19 16:53, Richard Henderson wrote:
> On 2/25/19 3:55 AM, David Hildenbrand wrote:
>> Also properly wrap in 24bit mode. While at it, convert the comment (and
>> drop the comment about fundamental TCG optimizations).
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  target/s390x/translate.c | 41 +++++++++++++++++++++++++---------------
>>  1 file changed, 26 insertions(+), 15 deletions(-)
> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> 
>> +static inline void gen_addi_and_wrap_i64(DisasContext *s, TCGv_i64 dst,
>> +                                         TCGv_i64 src, int64_t imm)
> 
> I would drop the inline and let the compiler choose.
> I'm using very few explicit inline markers these days...

Makes sense, the compiler usually knows better. Thanks!

> 
> 
> r~
> 


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 7/7] s390x/tcg: Implement LOAD COUNT TO BLOCK BOUNDARY
  2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 7/7] s390x/tcg: Implement LOAD COUNT TO BLOCK BOUNDARY David Hildenbrand
@ 2019-02-25 16:14   ` Richard Henderson
  2019-02-25 16:17     ` David Hildenbrand
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2019-02-25 16:14 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: qemu-s390x, Cornelia Huck, Thomas Huth, Richard Henderson

On 2/25/19 3:55 AM, David Hildenbrand wrote:
> +uint64_t HELPER(lcbb)(uint64_t addr, uint32_t m3)
> +{
> +    const uint32_t block_size = 1ul << (m3 + 6);
> +    const uint64_t rounded_addr = ROUND_UP(addr, block_size);
> +    uint32_t to_load = 16;
> +
> +    if (rounded_addr != addr) {
> +        to_load = MIN(rounded_addr - addr, to_load);
> +    }
> +    return to_load;
> +}

I don't understand all of this "blocksize" business, when they are all powers
of two, and the maximum value returned is 16.

As far as I can see, the result is obtained by -(addr | -16) regardless of the
value of m3.


r~

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 7/7] s390x/tcg: Implement LOAD COUNT TO BLOCK BOUNDARY
  2019-02-25 16:14   ` Richard Henderson
@ 2019-02-25 16:17     ` David Hildenbrand
  2019-02-25 16:40       ` Richard Henderson
  0 siblings, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2019-02-25 16:17 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: qemu-s390x, Cornelia Huck, Thomas Huth, Richard Henderson

On 25.02.19 17:14, Richard Henderson wrote:
> On 2/25/19 3:55 AM, David Hildenbrand wrote:
>> +uint64_t HELPER(lcbb)(uint64_t addr, uint32_t m3)
>> +{
>> +    const uint32_t block_size = 1ul << (m3 + 6);
>> +    const uint64_t rounded_addr = ROUND_UP(addr, block_size);
>> +    uint32_t to_load = 16;
>> +
>> +    if (rounded_addr != addr) {
>> +        to_load = MIN(rounded_addr - addr, to_load);
>> +    }
>> +    return to_load;
>> +}
> 
> I don't understand all of this "blocksize" business, when they are all powers
> of two, and the maximum value returned is 16.
> 
> As far as I can see, the result is obtained by -(addr | -16) regardless of the
> value of m3.

Let's assume we have addr = 63;

Assume block size is 64:
-> to_load = 1

Assume block size is 128:
-> to_load = 16

Or am i missing something?

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 7/7] s390x/tcg: Implement LOAD COUNT TO BLOCK BOUNDARY
  2019-02-25 16:17     ` David Hildenbrand
@ 2019-02-25 16:40       ` Richard Henderson
  2019-02-25 19:55         ` David Hildenbrand
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2019-02-25 16:40 UTC (permalink / raw)
  To: David Hildenbrand, Richard Henderson, qemu-devel
  Cc: qemu-s390x, Cornelia Huck, Thomas Huth

On 2/25/19 8:17 AM, David Hildenbrand wrote:
> On 25.02.19 17:14, Richard Henderson wrote:
>> I don't understand all of this "blocksize" business, when they are all powers
>> of two, and the maximum value returned is 16.
>>
>> As far as I can see, the result is obtained by -(addr | -16) regardless of the
>> value of m3.
> 
> Let's assume we have addr = 63;
> 
> Assume block size is 64:
> -> to_load = 1
> 
> Assume block size is 128:
> -> to_load = 16
> 
> Or am i missing something?

No, just me.

You can still do the computation inline, with

    tcg_gen_ori_i64(tmp, addr, -blocksize);
    tcg_gen_neg_i64(tmp, tmp);
    sixteen = tcg_const_i64(16);
    tcg_gen_umin_i64(tmp, sixteen);


r~

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v1 7/7] s390x/tcg: Implement LOAD COUNT TO BLOCK BOUNDARY
  2019-02-25 16:40       ` Richard Henderson
@ 2019-02-25 19:55         ` David Hildenbrand
  0 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2019-02-25 19:55 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: qemu-s390x, Cornelia Huck, Thomas Huth

On 25.02.19 17:40, Richard Henderson wrote:
> On 2/25/19 8:17 AM, David Hildenbrand wrote:
>> On 25.02.19 17:14, Richard Henderson wrote:
>>> I don't understand all of this "blocksize" business, when they are all powers
>>> of two, and the maximum value returned is 16.
>>>
>>> As far as I can see, the result is obtained by -(addr | -16) regardless of the
>>> value of m3.
>>
>> Let's assume we have addr = 63;
>>
>> Assume block size is 64:
>> -> to_load = 1
>>
>> Assume block size is 128:
>> -> to_load = 16
>>
>> Or am i missing something?
> 
> No, just me.
> 
> You can still do the computation inline, with
> 
>     tcg_gen_ori_i64(tmp, addr, -blocksize);
>     tcg_gen_neg_i64(tmp, tmp);
>     sixteen = tcg_const_i64(16);
>     tcg_gen_umin_i64(tmp, sixteen);
> 

Nice trick, works fine, thanks :)

> 
> r~
> 


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2019-02-25 19:55 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-25 11:55 [Qemu-devel] [PATCH v1 0/7] s390x/tcg: Cleanups and refactorings for Vector David Hildenbrand
2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 1/7] s390x/tcg: RXE has an optional M3 field David Hildenbrand
2019-02-25 12:18   ` Thomas Huth
2019-02-25 15:46   ` Richard Henderson
2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 2/7] s390x/tcg: Simplify disassembler operands initialization David Hildenbrand
2019-02-25 12:20   ` Thomas Huth
2019-02-25 12:21   ` Cornelia Huck
2019-02-25 13:32     ` David Hildenbrand
2019-02-25 15:47   ` Richard Henderson
2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 3/7] s390x/tcg: Clarify terminology in vec_reg_offset() David Hildenbrand
2019-02-25 12:30   ` Thomas Huth
2019-02-25 13:33     ` David Hildenbrand
2019-02-25 15:47   ` Richard Henderson
2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 4/7] s390x/tcg: Factor out vec_full_reg_offset() David Hildenbrand
2019-02-25 12:32   ` Thomas Huth
2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 5/7] s390x/tcg: Factor out gen_addi_and_wrap_i64() from get_address() David Hildenbrand
2019-02-25 15:53   ` Richard Henderson
2019-02-25 16:08     ` David Hildenbrand
2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 6/7] s390x/tcg: Implement LOAD LENGTHENED short HFP to long HFP David Hildenbrand
2019-02-25 15:59   ` Richard Henderson
2019-02-25 11:55 ` [Qemu-devel] [PATCH v1 7/7] s390x/tcg: Implement LOAD COUNT TO BLOCK BOUNDARY David Hildenbrand
2019-02-25 16:14   ` Richard Henderson
2019-02-25 16:17     ` David Hildenbrand
2019-02-25 16:40       ` Richard Henderson
2019-02-25 19:55         ` David Hildenbrand

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.