* [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4
@ 2019-05-31 10:44 David Hildenbrand
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 01/23] s390x: Use uint64_t for vector registers David Hildenbrand
                   ` (24 more replies)
  0 siblings, 25 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

This is the final part of vector instruction support for s390x. It is based
on part 3, which I will send a pull request for to Conny soon.

Part 1: Vector Support Instructions
Part 2: Vector Integer Instructions
Part 3: Vector String Instructions
Part 4: Vector Floating-Point Instructions

The current state can be found at (kept updated):
    https://github.com/davidhildenbrand/qemu/tree/vx

It is based on:
- [PATCH v2 0/5] s390x/tcg: Vector Instruction Support Part 3
- [PATCH v1 0/2] s390x: Fix vector register alignment

With the current state I can boot Linux kernel + user space compiled with
SIMD support. This allows booting distributions compiled exclusively for
z13, requiring SIMD support. Also, it is now possible to build a complete
kernel using rpmbuild, as quite a few issues have been sorted out.

While the current state works fine for me with RHEL 8, I am experiencing
some issues with newer userspace versions (I suspect glibc). I'll have
to look into the details first - could be a bug in a non-vector
instruction or a bug in a vector instruction that was unused until now.

In this part, all Vector Floating-Point Instructions introduced with the
"Vector Facility" are added. Also, the "qemu" model is changed to a
z13 machine.

David Hildenbrand (23):
  s390x: Use uint64_t for vector registers
  s390x/tcg: Introduce tcg_s390_vector_exception()
  s390x/tcg: Export float_comp_to_cc() and float(32|64|128)_dcmask()
  s390x/tcg: Implement VECTOR FP ADD
  s390x/tcg: Implement VECTOR FP COMPARE (AND SIGNAL) SCALAR
  s390x/tcg: Implement VECTOR FP COMPARE (EQUAL|HIGH|HIGH OR EQUAL)
  s390x/tcg: Implement VECTOR FP CONVERT FROM FIXED 64-BIT
  s390x/tcg: Implement VECTOR FP CONVERT FROM LOGICAL 64-BIT
  s390x/tcg: Implement VECTOR FP CONVERT TO FIXED 64-BIT
  s390x/tcg: Implement VECTOR FP CONVERT TO LOGICAL 64-BIT
  s390x/tcg: Implement VECTOR FP DIVIDE
  s390x/tcg: Implement VECTOR LOAD FP INTEGER
  s390x/tcg: Implement VECTOR LOAD LENGTHENED
  s390x/tcg: Implement VECTOR LOAD ROUNDED
  s390x/tcg: Implement VECTOR FP MULTIPLY
  s390x/tcg: Implement VECTOR FP MULTIPLY AND (ADD|SUBTRACT)
  s390x/tcg: Implement VECTOR FP PERFORM SIGN OPERATION
  s390x/tcg: Implement VECTOR FP SQUARE ROOT
  s390x/tcg: Implement VECTOR FP SUBTRACT
  s390x/tcg: Implement VECTOR FP TEST DATA CLASS IMMEDIATE
  s390x/tcg: Allow linux-user to use vector instructions
  s390x/tcg: We support the Vector Facility
  s390x: Bump the "qemu" CPU model up to a stripped-down z13

 hw/s390x/s390-virtio-ccw.c      |   2 +
 linux-user/s390x/signal.c       |   4 +-
 target/s390x/Makefile.objs      |   1 +
 target/s390x/arch_dump.c        |   8 +-
 target/s390x/cpu.c              |   3 +
 target/s390x/cpu.h              |   5 +-
 target/s390x/cpu_models.c       |   4 +-
 target/s390x/excp_helper.c      |  21 +-
 target/s390x/fpu_helper.c       |   4 +-
 target/s390x/gdbstub.c          |  16 +-
 target/s390x/gen-features.c     |  10 +-
 target/s390x/helper.c           |  10 +-
 target/s390x/helper.h           |  46 +++
 target/s390x/insn-data.def      |  45 +++
 target/s390x/internal.h         |   4 +
 target/s390x/kvm.c              |  16 +-
 target/s390x/machine.c          | 128 +++----
 target/s390x/tcg_s390x.h        |   2 +
 target/s390x/translate.c        |   2 +-
 target/s390x/translate_vx.inc.c | 274 ++++++++++++++
 target/s390x/vec_fpu_helper.c   | 644 ++++++++++++++++++++++++++++++++
 21 files changed, 1145 insertions(+), 104 deletions(-)
 create mode 100644 target/s390x/vec_fpu_helper.c

-- 
2.20.1




* [Qemu-devel] [PATCH v1 01/23] s390x: Use uint64_t for vector registers
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 02/23] s390x/tcg: Introduce tcg_s390_vector_exception() David Hildenbrand
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: Christian Borntraeger, Richard Henderson, Denys Vlasenko,
	David Hildenbrand

CPU_DoubleU is primarily used to reinterpret values between integers and
floats. We don't really need this functionality. So let's just keep it
simple and use a uint64_t.
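
As a rough illustration of what the type change means for callers, here is a
minimal standalone sketch. DemoState, demo_get_freg() and
demo_freg_as_double() are hypothetical names made up for this example, not
QEMU code; the sketch only assumes what the patch itself shows, namely that
FP register nr is the first doubleword of vector register nr. With plain
uint64_t elements the accessor is dereferenced directly instead of going
through ->ll, and a bit-level reinterpretation is still possible with
memcpy() where it is actually wanted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, stripped-down CPU state: 32 vector registers, 2 x 64 bit. */
typedef struct {
    uint64_t vregs[32][2];
} DemoState;

/* FP register nr overlays the first doubleword of vector register nr. */
static inline uint64_t *demo_get_freg(DemoState *s, int nr)
{
    assert(nr >= 0 && nr < 16);
    return &s->vregs[nr][0];
}

/* Reinterpret the raw bits as a double only where needed, without a union. */
static inline double demo_freg_as_double(DemoState *s, int nr)
{
    double d;

    memcpy(&d, demo_get_freg(s, nr), sizeof(d));
    return d;
}
```

A caller would write "*demo_get_freg(&s, 3) = value;" to set a register and
"value = *demo_get_freg(&s, 3);" to read it, which is the same shape as the
"*get_freg(env, i)" conversions in the diff below.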

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 linux-user/s390x/signal.c  |   4 +-
 target/s390x/arch_dump.c   |   8 +--
 target/s390x/cpu.h         |   4 +-
 target/s390x/excp_helper.c |   6 +-
 target/s390x/gdbstub.c     |  16 ++---
 target/s390x/helper.c      |  10 +--
 target/s390x/kvm.c         |  16 ++---
 target/s390x/machine.c     | 128 ++++++++++++++++++-------------------
 target/s390x/translate.c   |   2 +-
 9 files changed, 97 insertions(+), 97 deletions(-)

diff --git a/linux-user/s390x/signal.c b/linux-user/s390x/signal.c
index 3d3cb67bbe..ecfa2a14a9 100644
--- a/linux-user/s390x/signal.c
+++ b/linux-user/s390x/signal.c
@@ -123,7 +123,7 @@ static void save_sigregs(CPUS390XState *env, target_sigregs *sregs)
      */
     //save_fp_regs(&current->thread.fp_regs); FIXME
     for (i = 0; i < 16; i++) {
-        __put_user(get_freg(env, i)->ll, &sregs->fpregs.fprs[i]);
+        __put_user(*get_freg(env, i), &sregs->fpregs.fprs[i]);
     }
 }
 
@@ -254,7 +254,7 @@ restore_sigregs(CPUS390XState *env, target_sigregs *sc)
         __get_user(env->aregs[i], &sc->regs.acrs[i]);
     }
     for (i = 0; i < 16; i++) {
-        __get_user(get_freg(env, i)->ll, &sc->fpregs.fprs[i]);
+        __get_user(*get_freg(env, i), &sc->fpregs.fprs[i]);
     }
 
     return err;
diff --git a/target/s390x/arch_dump.c b/target/s390x/arch_dump.c
index c9ef0a6e60..50fa0ae4b6 100644
--- a/target/s390x/arch_dump.c
+++ b/target/s390x/arch_dump.c
@@ -104,7 +104,7 @@ static void s390x_write_elf64_fpregset(Note *note, S390CPU *cpu, int id)
     note->hdr.n_type = cpu_to_be32(NT_FPREGSET);
     note->contents.fpregset.fpc = cpu_to_be32(cpu->env.fpc);
     for (i = 0; i <= 15; i++) {
-        note->contents.fpregset.fprs[i] = cpu_to_be64(get_freg(cs, i)->ll);
+        note->contents.fpregset.fprs[i] = cpu_to_be64(*get_freg(cs, i));
     }
 }
 
@@ -114,7 +114,7 @@ static void s390x_write_elf64_vregslo(Note *note, S390CPU *cpu,  int id)
 
     note->hdr.n_type = cpu_to_be32(NT_S390_VXRS_LOW);
     for (i = 0; i <= 15; i++) {
-        note->contents.vregslo.vregs[i] = cpu_to_be64(cpu->env.vregs[i][1].ll);
+        note->contents.vregslo.vregs[i] = cpu_to_be64(cpu->env.vregs[i][1]);
     }
 }
 
@@ -127,8 +127,8 @@ static void s390x_write_elf64_vregshi(Note *note, S390CPU *cpu, int id)
 
     note->hdr.n_type = cpu_to_be32(NT_S390_VXRS_HIGH);
     for (i = 0; i <= 15; i++) {
-        temp_vregshi->vregs[i][0] = cpu_to_be64(cpu->env.vregs[i + 16][0].ll);
-        temp_vregshi->vregs[i][1] = cpu_to_be64(cpu->env.vregs[i + 16][1].ll);
+        temp_vregshi->vregs[i][0] = cpu_to_be64(cpu->env.vregs[i + 16][0]);
+        temp_vregshi->vregs[i][1] = cpu_to_be64(cpu->env.vregs[i + 16][1]);
     }
 }
 
diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index 1bed12b6c3..317a1377e6 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -66,7 +66,7 @@ struct CPUS390XState {
      * The floating point registers are part of the vector registers.
      * vregs[0][0] -> vregs[15][0] are 16 floating point registers
      */
-    CPU_DoubleU vregs[32][2] QEMU_ALIGNED(16);  /* vector registers */
+    uint64_t vregs[32][2] QEMU_ALIGNED(16);  /* vector registers */
     uint32_t aregs[16];    /* access registers */
     uint8_t riccb[64];     /* runtime instrumentation control */
     uint64_t gscb[4];      /* guarded storage control */
@@ -153,7 +153,7 @@ struct CPUS390XState {
 
 };
 
-static inline CPU_DoubleU *get_freg(CPUS390XState *cs, int nr)
+static inline uint64_t *get_freg(CPUS390XState *cs, int nr)
 {
     return &cs->vregs[nr][0];
 }
diff --git a/target/s390x/excp_helper.c b/target/s390x/excp_helper.c
index 3a467b72c5..85223d00c0 100644
--- a/target/s390x/excp_helper.c
+++ b/target/s390x/excp_helper.c
@@ -390,8 +390,8 @@ static int mchk_store_vregs(CPUS390XState *env, uint64_t mcesao)
     }
 
     for (i = 0; i < 32; i++) {
-        sa->vregs[i][0] = cpu_to_be64(env->vregs[i][0].ll);
-        sa->vregs[i][1] = cpu_to_be64(env->vregs[i][1].ll);
+        sa->vregs[i][0] = cpu_to_be64(env->vregs[i][0]);
+        sa->vregs[i][1] = cpu_to_be64(env->vregs[i][1]);
     }
 
     cpu_physical_memory_unmap(sa, len, 1, len);
@@ -429,7 +429,7 @@ static void do_mchk_interrupt(CPUS390XState *env)
     lowcore->ar_access_id = 1;
 
     for (i = 0; i < 16; i++) {
-        lowcore->floating_pt_save_area[i] = cpu_to_be64(get_freg(env, i)->ll);
+        lowcore->floating_pt_save_area[i] = cpu_to_be64(*get_freg(env, i));
         lowcore->gpregs_save_area[i] = cpu_to_be64(env->regs[i]);
         lowcore->access_regs_save_area[i] = cpu_to_be32(env->aregs[i]);
         lowcore->cregs_save_area[i] = cpu_to_be64(env->cregs[i]);
diff --git a/target/s390x/gdbstub.c b/target/s390x/gdbstub.c
index df147596ce..9cfd8fe3e0 100644
--- a/target/s390x/gdbstub.c
+++ b/target/s390x/gdbstub.c
@@ -116,7 +116,7 @@ static int cpu_read_fp_reg(CPUS390XState *env, uint8_t *mem_buf, int n)
     case S390_FPC_REGNUM:
         return gdb_get_reg32(mem_buf, env->fpc);
     case S390_F0_REGNUM ... S390_F15_REGNUM:
-        return gdb_get_reg64(mem_buf, get_freg(env, n - S390_F0_REGNUM)->ll);
+        return gdb_get_reg64(mem_buf, *get_freg(env, n - S390_F0_REGNUM));
     default:
         return 0;
     }
@@ -129,7 +129,7 @@ static int cpu_write_fp_reg(CPUS390XState *env, uint8_t *mem_buf, int n)
         env->fpc = ldl_p(mem_buf);
         return 4;
     case S390_F0_REGNUM ... S390_F15_REGNUM:
-        get_freg(env, n - S390_F0_REGNUM)->ll = ldtul_p(mem_buf);
+        *get_freg(env, n - S390_F0_REGNUM) = ldtul_p(mem_buf);
         return 8;
     default:
         return 0;
@@ -150,11 +150,11 @@ static int cpu_read_vreg(CPUS390XState *env, uint8_t *mem_buf, int n)
 
     switch (n) {
     case S390_V0L_REGNUM ... S390_V15L_REGNUM:
-        ret = gdb_get_reg64(mem_buf, env->vregs[n][1].ll);
+        ret = gdb_get_reg64(mem_buf, env->vregs[n][1]);
         break;
     case S390_V16_REGNUM ... S390_V31_REGNUM:
-        ret = gdb_get_reg64(mem_buf, env->vregs[n][0].ll);
-        ret += gdb_get_reg64(mem_buf + 8, env->vregs[n][1].ll);
+        ret = gdb_get_reg64(mem_buf, env->vregs[n][0]);
+        ret += gdb_get_reg64(mem_buf + 8, env->vregs[n][1]);
         break;
     default:
         ret = 0;
@@ -167,11 +167,11 @@ static int cpu_write_vreg(CPUS390XState *env, uint8_t *mem_buf, int n)
 {
     switch (n) {
     case S390_V0L_REGNUM ... S390_V15L_REGNUM:
-        env->vregs[n][1].ll = ldtul_p(mem_buf + 8);
+        env->vregs[n][1] = ldtul_p(mem_buf + 8);
         return 8;
     case S390_V16_REGNUM ... S390_V31_REGNUM:
-        env->vregs[n][0].ll = ldtul_p(mem_buf);
-        env->vregs[n][1].ll = ldtul_p(mem_buf + 8);
+        env->vregs[n][0] = ldtul_p(mem_buf);
+        env->vregs[n][1] = ldtul_p(mem_buf + 8);
         return 16;
     default:
         return 0;
diff --git a/target/s390x/helper.c b/target/s390x/helper.c
index 3c8f0a7615..a69e5abf5f 100644
--- a/target/s390x/helper.c
+++ b/target/s390x/helper.c
@@ -249,7 +249,7 @@ int s390_store_status(S390CPU *cpu, hwaddr addr, bool store_arch)
         cpu_physical_memory_write(offsetof(LowCore, ar_access_id), &ar_id, 1);
     }
     for (i = 0; i < 16; ++i) {
-        sa->fprs[i] = cpu_to_be64(get_freg(&cpu->env, i)->ll);
+        sa->fprs[i] = cpu_to_be64(*get_freg(&cpu->env, i));
     }
     for (i = 0; i < 16; ++i) {
         sa->grs[i] = cpu_to_be64(cpu->env.regs[i]);
@@ -299,8 +299,8 @@ int s390_store_adtl_status(S390CPU *cpu, hwaddr addr, hwaddr len)
 
     if (s390_has_feat(S390_FEAT_VECTOR)) {
         for (i = 0; i < 32; i++) {
-            sa->vregs[i][0] = cpu_to_be64(cpu->env.vregs[i][0].ll);
-            sa->vregs[i][1] = cpu_to_be64(cpu->env.vregs[i][1].ll);
+            sa->vregs[i][0] = cpu_to_be64(cpu->env.vregs[i][0]);
+            sa->vregs[i][1] = cpu_to_be64(cpu->env.vregs[i][1]);
         }
     }
     if (s390_has_feat(S390_FEAT_GUARDED_STORAGE) && len >= ADTL_GS_MIN_SIZE) {
@@ -341,13 +341,13 @@ void s390_cpu_dump_state(CPUState *cs, FILE *f, int flags)
         if (s390_has_feat(S390_FEAT_VECTOR)) {
             for (i = 0; i < 32; i++) {
                 qemu_fprintf(f, "V%02d=%016" PRIx64 "%016" PRIx64 "%c",
-                             i, env->vregs[i][0].ll, env->vregs[i][1].ll,
+                             i, env->vregs[i][0], env->vregs[i][1],
                              i % 2 ? '\n' : ' ');
             }
         } else {
             for (i = 0; i < 16; i++) {
                 qemu_fprintf(f, "F%02d=%016" PRIx64 "%c",
-                             i, get_freg(env, i)->ll,
+                             i, *get_freg(env, i),
                              (i % 4) == 3 ? '\n' : ' ');
             }
         }
diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index e5e2b691f2..f0649980c9 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -418,21 +418,21 @@ int kvm_arch_put_registers(CPUState *cs, int level)
 
     if (can_sync_regs(cs, KVM_SYNC_VRS)) {
         for (i = 0; i < 32; i++) {
-            cs->kvm_run->s.regs.vrs[i][0] = env->vregs[i][0].ll;
-            cs->kvm_run->s.regs.vrs[i][1] = env->vregs[i][1].ll;
+            cs->kvm_run->s.regs.vrs[i][0] = env->vregs[i][0];
+            cs->kvm_run->s.regs.vrs[i][1] = env->vregs[i][1];
         }
         cs->kvm_run->s.regs.fpc = env->fpc;
         cs->kvm_run->kvm_dirty_regs |= KVM_SYNC_VRS;
     } else if (can_sync_regs(cs, KVM_SYNC_FPRS)) {
         for (i = 0; i < 16; i++) {
-            cs->kvm_run->s.regs.fprs[i] = get_freg(env, i)->ll;
+            cs->kvm_run->s.regs.fprs[i] = *get_freg(env, i);
         }
         cs->kvm_run->s.regs.fpc = env->fpc;
         cs->kvm_run->kvm_dirty_regs |= KVM_SYNC_FPRS;
     } else {
         /* Floating point */
         for (i = 0; i < 16; i++) {
-            fpu.fprs[i] = get_freg(env, i)->ll;
+            fpu.fprs[i] = *get_freg(env, i);
         }
         fpu.fpc = env->fpc;
 
@@ -586,13 +586,13 @@ int kvm_arch_get_registers(CPUState *cs)
     /* Floating point and vector registers */
     if (can_sync_regs(cs, KVM_SYNC_VRS)) {
         for (i = 0; i < 32; i++) {
-            env->vregs[i][0].ll = cs->kvm_run->s.regs.vrs[i][0];
-            env->vregs[i][1].ll = cs->kvm_run->s.regs.vrs[i][1];
+            env->vregs[i][0] = cs->kvm_run->s.regs.vrs[i][0];
+            env->vregs[i][1] = cs->kvm_run->s.regs.vrs[i][1];
         }
         env->fpc = cs->kvm_run->s.regs.fpc;
     } else if (can_sync_regs(cs, KVM_SYNC_FPRS)) {
         for (i = 0; i < 16; i++) {
-            get_freg(env, i)->ll = cs->kvm_run->s.regs.fprs[i];
+            *get_freg(env, i) = cs->kvm_run->s.regs.fprs[i];
         }
         env->fpc = cs->kvm_run->s.regs.fpc;
     } else {
@@ -601,7 +601,7 @@ int kvm_arch_get_registers(CPUState *cs)
             return r;
         }
         for (i = 0; i < 16; i++) {
-            get_freg(env, i)->ll = fpu.fprs[i];
+            *get_freg(env, i) = fpu.fprs[i];
         }
         env->fpc = fpu.fpc;
     }
diff --git a/target/s390x/machine.c b/target/s390x/machine.c
index cb792aa103..e6851a57bc 100644
--- a/target/s390x/machine.c
+++ b/target/s390x/machine.c
@@ -66,22 +66,22 @@ static const VMStateDescription vmstate_fpu = {
     .minimum_version_id = 1,
     .needed = fpu_needed,
     .fields = (VMStateField[]) {
-        VMSTATE_UINT64(env.vregs[0][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[1][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[2][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[3][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[4][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[5][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[6][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[7][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[8][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[9][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[10][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[11][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[12][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[13][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[14][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[15][0].ll, S390CPU),
+        VMSTATE_UINT64(env.vregs[0][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[1][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[2][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[3][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[4][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[5][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[6][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[7][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[8][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[9][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[10][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[11][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[12][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[13][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[14][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[15][0], S390CPU),
         VMSTATE_UINT32(env.fpc, S390CPU),
         VMSTATE_END_OF_LIST()
     }
@@ -99,54 +99,54 @@ static const VMStateDescription vmstate_vregs = {
     .needed = vregs_needed,
     .fields = (VMStateField[]) {
         /* vregs[0][0] -> vregs[15][0] and fregs are overlays */
-        VMSTATE_UINT64(env.vregs[16][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[17][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[18][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[19][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[20][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[21][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[22][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[23][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[24][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[25][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[26][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[27][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[28][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[29][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[30][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[31][0].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[0][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[1][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[2][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[3][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[4][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[5][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[6][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[7][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[8][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[9][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[10][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[11][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[12][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[13][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[14][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[15][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[16][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[17][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[18][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[19][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[20][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[21][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[22][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[23][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[24][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[25][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[26][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[27][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[28][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[29][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[30][1].ll, S390CPU),
-        VMSTATE_UINT64(env.vregs[31][1].ll, S390CPU),
+        VMSTATE_UINT64(env.vregs[16][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[17][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[18][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[19][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[20][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[21][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[22][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[23][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[24][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[25][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[26][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[27][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[28][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[29][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[30][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[31][0], S390CPU),
+        VMSTATE_UINT64(env.vregs[0][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[1][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[2][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[3][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[4][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[5][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[6][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[7][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[8][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[9][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[10][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[11][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[12][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[13][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[14][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[15][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[16][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[17][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[18][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[19][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[20][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[21][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[22][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[23][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[24][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[25][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[26][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[27][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[28][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[29][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[30][1], S390CPU),
+        VMSTATE_UINT64(env.vregs[31][1], S390CPU),
         VMSTATE_END_OF_LIST()
     }
 };
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index fa57b7550e..ac0d8b6410 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -149,7 +149,7 @@ void s390x_translate_init(void)
 static inline int vec_full_reg_offset(uint8_t reg)
 {
     g_assert(reg < 32);
-    return offsetof(CPUS390XState, vregs[reg][0].d);
+    return offsetof(CPUS390XState, vregs[reg][0]);
 }
 
 static inline int vec_reg_offset(uint8_t reg, uint8_t enr, TCGMemOp es)
-- 
2.20.1




* [Qemu-devel] [PATCH v1 02/23] s390x/tcg: Introduce tcg_s390_vector_exception()
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 01/23] s390x: Use uint64_t for vector registers David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 03/23] s390x/tcg: Export float_comp_to_cc() and float(32|64|128)_dcmask() David Hildenbrand
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

Handling is similar to data exceptions, however we can always store the
VXC into the lowcore and the FPC:

z14 PoP, 6-20, "Vector-Exception Code"
    When a vector-processing exception causes a program
    interruption, a vector-exception code (VXC) is stored at
    location 147, and zeros are stored at locations 144-146.
    The VXC is also placed in the DXC field of the
    floating-point-control (FPC) register if bit 45 of control
    register 0 is one. When bit 45 of control register 0 is zero
    and bit 46 of control register 0 is one, the DXC field of the
    FPC register and the contents of storage at location 147 are
    unpredictable.
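
To make the two stores concrete, here is a small standalone sketch. The
demo_* helpers are hypothetical and written for this example only;
demo_deposit32() is assumed to behave like the deposit32(value, start,
length, field) call used in the patch, counting bits from the least
significant end. A 32-bit big-endian store at location 144 yields zeros at
144-146 and the VXC at 147, and depositing 8 bits at bit 8 fills the DXC
byte of the FPC.

```c
#include <assert.h>
#include <stdint.h>

/* Replace 'length' bits of 'value', starting at bit 'start' (LSB = bit 0). */
static inline uint32_t demo_deposit32(uint32_t value, int start, int length,
                                      uint32_t field)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    uint32_t mask = (~0U >> (32 - length)) << start;

    return (value & ~mask) | ((field << start) & mask);
}

/* Place an 8-bit VXC into the DXC byte of a 32-bit FPC value. */
static inline uint32_t demo_fpc_set_vxc(uint32_t fpc, uint32_t vxc)
{
    assert(vxc <= 0xff);
    return demo_deposit32(fpc, 8, 8, vxc);
}

/* Big-endian 32-bit store at location 144 of a lowcore byte image. */
static inline void demo_lowcore_store_vxc(uint8_t *lowcore, uint32_t vxc)
{
    assert(vxc <= 0xff);
    lowcore[144] = (uint8_t)(vxc >> 24);
    lowcore[145] = (uint8_t)(vxc >> 16);
    lowcore[146] = (uint8_t)(vxc >> 8);
    lowcore[147] = (uint8_t)vxc;
}
```

For example, demo_fpc_set_vxc(0x00f8ff00, 0x12) replaces only the DXC byte
and leaves the other FPC bits alone.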

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/cpu.h         |  1 +
 target/s390x/excp_helper.c | 15 +++++++++++++++
 target/s390x/tcg_s390x.h   |  2 ++
 3 files changed, 18 insertions(+)

diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index 317a1377e6..4fc08a2c88 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -215,6 +215,7 @@ extern const struct VMStateDescription vmstate_s390_cpu;
 #define PGM_SPECIAL_OP                  0x0013
 #define PGM_OPERAND                     0x0015
 #define PGM_TRACE_TABLE                 0x0016
+#define PGM_VECTOR_PROCESSING           0x001b
 #define PGM_SPACE_SWITCH                0x001c
 #define PGM_HFP_SQRT                    0x001d
 #define PGM_PC_TRANS_SPEC               0x001f
diff --git a/target/s390x/excp_helper.c b/target/s390x/excp_helper.c
index 85223d00c0..f21bcf79ae 100644
--- a/target/s390x/excp_helper.c
+++ b/target/s390x/excp_helper.c
@@ -62,6 +62,21 @@ void QEMU_NORETURN tcg_s390_data_exception(CPUS390XState *env, uint32_t dxc,
     tcg_s390_program_interrupt(env, PGM_DATA, ILEN_AUTO, ra);
 }
 
+void QEMU_NORETURN tcg_s390_vector_exception(CPUS390XState *env, uint32_t vxc,
+                                             uintptr_t ra)
+{
+    g_assert(vxc <= 0xff);
+#if !defined(CONFIG_USER_ONLY)
+    /* Always store the VXC into the lowcore, without AFP it is undefined */
+    stl_phys(CPU(s390_env_get_cpu(env))->as,
+             env->psa + offsetof(LowCore, data_exc_code), vxc);
+#endif
+
+    /* Always store the VXC into the FPC, without AFP it is undefined */
+    env->fpc = deposit32(env->fpc, 8, 8, vxc);
+    tcg_s390_program_interrupt(env, PGM_VECTOR_PROCESSING, ILEN_AUTO, ra);
+}
+
 void HELPER(data_exception)(CPUS390XState *env, uint32_t dxc)
 {
     tcg_s390_data_exception(env, dxc, GETPC());
diff --git a/target/s390x/tcg_s390x.h b/target/s390x/tcg_s390x.h
index ab2c4ba703..2813f9d48e 100644
--- a/target/s390x/tcg_s390x.h
+++ b/target/s390x/tcg_s390x.h
@@ -18,5 +18,7 @@ void QEMU_NORETURN tcg_s390_program_interrupt(CPUS390XState *env, uint32_t code,
                                               int ilen, uintptr_t ra);
 void QEMU_NORETURN tcg_s390_data_exception(CPUS390XState *env, uint32_t dxc,
                                            uintptr_t ra);
+void QEMU_NORETURN tcg_s390_vector_exception(CPUS390XState *env, uint32_t vxc,
+                                             uintptr_t ra);
 
 #endif /* TCG_S390X_H */
-- 
2.20.1




* [Qemu-devel] [PATCH v1 03/23] s390x/tcg: Export float_comp_to_cc() and float(32|64|128)_dcmask()
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 01/23] s390x: Use uint64_t for vector registers David Hildenbrand
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 02/23] s390x/tcg: Introduce tcg_s390_vector_exception() David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 04/23] s390x/tcg: Implement VECTOR FP ADD David Hildenbrand
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

Vector floating-point instructions will require these functions, so
allow them to be used from other files.
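
For readers unfamiliar with what a comparison-to-condition-code helper does,
here is a rough standalone sketch. It is not the code from fpu_helper.c: the
DemoRelation enum and both demo_* functions are hypothetical, and the
mapping (equal -> 0, low -> 1, high -> 2, unordered -> 3) is stated here as
an assumption about the usual s390x compare condition codes.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical comparison result, loosely modelled on a softfloat relation. */
typedef enum {
    DEMO_REL_LESS = -1,
    DEMO_REL_EQUAL = 0,
    DEMO_REL_GREATER = 1,
    DEMO_REL_UNORDERED = 2,
} DemoRelation;

/* Compare two host doubles; any NaN operand makes the result unordered. */
static inline DemoRelation demo_compare(double a, double b)
{
    if (isnan(a) || isnan(b)) {
        return DEMO_REL_UNORDERED;
    }
    if (a < b) {
        return DEMO_REL_LESS;
    }
    return a > b ? DEMO_REL_GREATER : DEMO_REL_EQUAL;
}

/* Map the relation to a condition code (assumed mapping, see above). */
static inline int demo_comp_to_cc(DemoRelation rel)
{
    switch (rel) {
    case DEMO_REL_EQUAL:
        return 0;
    case DEMO_REL_LESS:
        return 1;
    case DEMO_REL_GREATER:
        return 2;
    case DEMO_REL_UNORDERED:
        return 3;
    }
    assert(!"invalid relation");
    return -1;
}
```

Making such a helper non-static and declaring it in a shared header is all
the patch does; the sketch only shows the kind of logic being shared.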

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/fpu_helper.c | 4 ++--
 target/s390x/internal.h   | 4 ++++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/target/s390x/fpu_helper.c b/target/s390x/fpu_helper.c
index 1be68bafea..d2c17ed942 100644
--- a/target/s390x/fpu_helper.c
+++ b/target/s390x/fpu_helper.c
@@ -112,7 +112,7 @@ static void handle_exceptions(CPUS390XState *env, bool XxC, uintptr_t retaddr)
     }
 }
 
-static inline int float_comp_to_cc(CPUS390XState *env, int float_compare)
+int float_comp_to_cc(CPUS390XState *env, int float_compare)
 {
     S390CPU *cpu = s390_env_get_cpu(env);
 
@@ -746,7 +746,7 @@ static inline uint16_t dcmask(int bit, bool neg)
 }
 
 #define DEF_FLOAT_DCMASK(_TYPE) \
-static uint16_t _TYPE##_dcmask(CPUS390XState *env, _TYPE f1)       \
+uint16_t _TYPE##_dcmask(CPUS390XState *env, _TYPE f1)              \
 {                                                                  \
     const bool neg = _TYPE##_is_neg(f1);                           \
                                                                    \
diff --git a/target/s390x/internal.h b/target/s390x/internal.h
index 9893fc094b..c243fa725b 100644
--- a/target/s390x/internal.h
+++ b/target/s390x/internal.h
@@ -285,6 +285,10 @@ uint32_t set_cc_nz_f128(float128 v);
 uint8_t s390_softfloat_exc_to_ieee(unsigned int exc);
 int s390_swap_bfp_rounding_mode(CPUS390XState *env, int m3);
 void s390_restore_bfp_rounding_mode(CPUS390XState *env, int old_mode);
+int float_comp_to_cc(CPUS390XState *env, int float_compare);
+uint16_t float32_dcmask(CPUS390XState *env, float32 f1);
+uint16_t float64_dcmask(CPUS390XState *env, float64 f1);
+uint16_t float128_dcmask(CPUS390XState *env, float128 f1);
 
 
 /* gdbstub.c */
-- 
2.20.1




* [Qemu-devel] [PATCH v1 04/23] s390x/tcg: Implement VECTOR FP ADD
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (2 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 03/23] s390x/tcg: Export float_comp_to_cc() and float(32|64|128)_dcmask() David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 15:54   ` Richard Henderson
  2019-05-31 16:30   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 05/23] s390x/tcg: Implement VECTOR FP COMPARE (AND SIGNAL) SCALAR David Hildenbrand
                   ` (20 subsequent siblings)
  24 siblings, 2 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

1. We'll reuse op_vfa() for similar instructions later, prepare for
   that.
2. We'll reuse vop64_3() for other instructions later.
3. Take care of modifying the vector register only if no trap happened.
 - on traps, flags are not updated and no elements are modified
 - traps don't modify the fpc flags
 - without traps, all exceptions of all elements are merged
4. We'll reuse check_ieee_exc() later when we need the XxC flag.

We have to check for exceptions after processing each element.
Provide separate handlers for single/all element processing. We'll do
the same for all applicable FP instructions.
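
The trap rules listed above can be sketched in isolation. This is only a
minimal model, not the helper code from this patch: SketchState,
sketch_commit() and the SK_* constants are hypothetical names, the
per-element results and exception bits are passed in precomputed, and
only two exception kinds are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative exception bits and codes; only two kinds are modeled. */
#define SK_INVALID  0x80
#define SK_INEXACT  0x08

/* Hypothetical stand-in for the CPU state; only what the sketch needs. */
typedef struct {
    uint8_t trap_mask;  /* exceptions that are enabled to trap */
    uint8_t flags;      /* accumulated flags of non-trapping exceptions */
} SketchState;

typedef struct {
    bool trapped;       /* true if an enabled exception was hit */
    uint8_t vxc;        /* element number << 4 | code, valid if trapped */
} SketchResult;

/* Invalid has priority over inexact, mirroring the priority order above. */
static uint8_t sk_code(uint8_t trapping)
{
    return (trapping & SK_INVALID) ? 1 : 5;
}

/*
 * res[i] is the result computed for element i and exc[i] the exception
 * bits it raised. dst and st->flags are only written if nothing trapped.
 */
static SketchResult sketch_commit(SketchState *st, uint64_t dst[2],
                                  const uint64_t res[2],
                                  const uint8_t exc[2], bool single)
{
    SketchResult r = { false, 0 };
    uint64_t tmp[2] = { 0, 0 };
    uint8_t merged = 0;
    const int n = single ? 1 : 2;

    assert(st && dst && res && exc);
    for (int i = 0; i < n; i++) {
        const uint8_t trapping = exc[i] & st->trap_mask;

        tmp[i] = res[i];
        merged |= exc[i];
        if (trapping) {
            /* trap: report the element, leave dst and the flags alone */
            r.trapped = true;
            r.vxc = (uint8_t)(i << 4 | sk_code(trapping));
            return r;
        }
    }
    /* no trap: merge the exceptions of all elements, then commit */
    st->flags |= merged;
    dst[0] = tmp[0];
    dst[1] = tmp[1];
    return r;
}
```

Working on a temporary and committing at the end is what keeps the
destination untouched when a later element traps.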

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/Makefile.objs      |   1 +
 target/s390x/helper.h           |   4 ++
 target/s390x/insn-data.def      |   5 ++
 target/s390x/translate_vx.inc.c |  29 ++++++++
 target/s390x/vec_fpu_helper.c   | 119 ++++++++++++++++++++++++++++++++
 5 files changed, 158 insertions(+)
 create mode 100644 target/s390x/vec_fpu_helper.c

diff --git a/target/s390x/Makefile.objs b/target/s390x/Makefile.objs
index ffdd484ef0..3e2745594a 100644
--- a/target/s390x/Makefile.objs
+++ b/target/s390x/Makefile.objs
@@ -2,6 +2,7 @@ obj-y += cpu.o cpu_models.o cpu_features.o gdbstub.o interrupt.o helper.o
 obj-$(CONFIG_TCG) += translate.o cc_helper.o excp_helper.o fpu_helper.o
 obj-$(CONFIG_TCG) += int_helper.o mem_helper.o misc_helper.o crypto_helper.o
 obj-$(CONFIG_TCG) += vec_helper.o vec_int_helper.o vec_string_helper.o
+obj-$(CONFIG_TCG) += vec_fpu_helper.o
 obj-$(CONFIG_SOFTMMU) += machine.o ioinst.o arch_dump.o mmu_helper.o diag.o
 obj-$(CONFIG_SOFTMMU) += sigp.o
 obj-$(CONFIG_KVM) += kvm.o
diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 5db67779d3..21658a2771 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -249,6 +249,10 @@ DEF_HELPER_6(gvec_vstrc_cc_rt8, void, ptr, cptr, cptr, cptr, env, i32)
 DEF_HELPER_6(gvec_vstrc_cc_rt16, void, ptr, cptr, cptr, cptr, env, i32)
 DEF_HELPER_6(gvec_vstrc_cc_rt32, void, ptr, cptr, cptr, cptr, env, i32)
 
+/* === Vector Floating-Point Instructions */
+DEF_HELPER_FLAGS_5(gvec_vfa64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_5(gvec_vfa64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
 DEF_HELPER_4(diag, void, env, i32, i32, i32)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index a2969fab58..79892f6042 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1204,6 +1204,11 @@
 /* VECTOR STRING RANGE COMPARE */
     F(0xe78a, VSTRC,   VRR_d, V,   0, 0, 0, 0, vstrc, 0, IF_VEC)
 
+/* === Vector Floating-Point Instructions */
+
+/* VECTOR FP ADD */
+    F(0xe7e3, VFA,     VRR_c, V,   0, 0, 0, 0, vfa, 0, IF_VEC)
+
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
     E(0xb250, CSP,     RRE,   Z,   r1_32u, ra2, r1_P, 0, csp, 0, MO_TEUL, IF_PRIV)
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index f26ffa2895..44da9f2645 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -52,6 +52,11 @@
 #define ES_64   MO_64
 #define ES_128  4
 
+/* Floating-Point Format */
+#define FPF_SHORT       2
+#define FPF_LONG        3
+#define FPF_EXT         4
+
 static inline bool valid_vec_element(uint8_t enr, TCGMemOp es)
 {
     return !(enr & ~(NUM_VEC_ELEMENTS(es) - 1));
@@ -2538,3 +2543,27 @@ static DisasJumpType op_vstrc(DisasContext *s, DisasOps *o)
     }
     return DISAS_NEXT;
 }
+
+static DisasJumpType op_vfa(DisasContext *s, DisasOps *o)
+{
+    const uint8_t fpf = get_field(s->fields, m4);
+    const uint8_t m5 = get_field(s->fields, m5);
+    const bool se = extract32(m5, 3, 1);
+    gen_helper_gvec_3_ptr *fn;
+
+    if (fpf != FPF_LONG || extract32(m5, 0, 3)) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }
+
+    switch (s->fields->op2) {
+    case 0xe3:
+        fn = se ? gen_helper_gvec_vfa64s : gen_helper_gvec_vfa64;
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    gen_gvec_3_ptr(get_field(s->fields, v1), get_field(s->fields, v2),
+                   get_field(s->fields, v3), cpu_env, 0, fn);
+    return DISAS_NEXT;
+}
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
new file mode 100644
index 0000000000..11dd20b837
--- /dev/null
+++ b/target/s390x/vec_fpu_helper.c
@@ -0,0 +1,119 @@
+/*
+ * QEMU TCG support -- s390x vector floating point instruction support
+ *
+ * Copyright (C) 2019 Red Hat Inc
+ *
+ * Authors:
+ *   David Hildenbrand <david@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "cpu.h"
+#include "internal.h"
+#include "vec.h"
+#include "tcg_s390x.h"
+#include "tcg/tcg-gvec-desc.h"
+#include "exec/exec-all.h"
+#include "exec/helper-proto.h"
+#include "fpu/softfloat.h"
+
+#define VIC_INVALID         0x1
+#define VIC_DIVBYZERO       0x2
+#define VIC_OVERFLOW        0x3
+#define VIC_UNDERFLOW       0x4
+#define VIC_INEXACT         0x5
+
+/* returns the VXC. If the VXC is 0, there is no trap */
+static uint8_t check_ieee_exc(CPUS390XState *env, uint8_t enr, bool XxC,
+                              uint8_t *vec_exc)
+{
+    uint8_t vece_exc = 0, trap_exc;
+    unsigned qemu_exc;
+
+    /* Retrieve and clear the softfloat exceptions */
+    qemu_exc = env->fpu_status.float_exception_flags;
+    if (qemu_exc == 0) {
+        return 0;
+    }
+    env->fpu_status.float_exception_flags = 0;
+
+    vece_exc = s390_softfloat_exc_to_ieee(qemu_exc);
+
+    /* Add them to the vector-wide s390x exception bits */
+    *vec_exc |= vece_exc;
+
+    /* Check for traps and construct the VXC */
+    trap_exc = vece_exc & env->fpc >> 24;
+    if (trap_exc) {
+        if (trap_exc & S390_IEEE_MASK_INVALID) {
+            return enr << 4 | VIC_INVALID;
+        } else if (trap_exc & S390_IEEE_MASK_DIVBYZERO) {
+            return enr << 4 | VIC_DIVBYZERO;
+        } else if (trap_exc & S390_IEEE_MASK_OVERFLOW) {
+            return enr << 4 | VIC_OVERFLOW;
+        } else if (trap_exc & S390_IEEE_MASK_UNDERFLOW) {
+            return enr << 4 | VIC_UNDERFLOW;
+        } else if (!XxC) {
+            g_assert(trap_exc & S390_IEEE_MASK_INEXACT);
+            /* inexact has lowest priority on traps */
+            return enr << 4 | VIC_INEXACT;
+        }
+    }
+    return 0;
+}
+
+static void handle_ieee_exc(CPUS390XState *env, uint8_t vxc, uint8_t vec_exc,
+                            uintptr_t retaddr)
+{
+    if (vxc) {
+        /* on traps, the fpc flags are not updated, instruction is suppressed */
+        tcg_s390_vector_exception(env, vxc, retaddr);
+    }
+    if (vec_exc) {
+        /* indicate exceptions for all elements combined */
+        env->fpc |= vec_exc << 16;
+    }
+}
+
+typedef uint64_t (*vop64_3_fn)(uint64_t a, uint64_t b, float_status *s);
+static void vop64_3(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
+                    CPUS390XState *env, bool s, vop64_3_fn fn,
+                    uintptr_t retaddr)
+{
+    uint8_t vxc, vec_exc = 0;
+    S390Vector tmp = {};
+    int i;
+
+    for (i = 0; i < 2; i++) {
+        const uint64_t a = s390_vec_read_element64(v2, i);
+        const uint64_t b = s390_vec_read_element64(v3, i);
+
+        s390_vec_write_element64(&tmp, i, fn(a, b, &env->fpu_status));
+        vxc = check_ieee_exc(env, i, false, &vec_exc);
+        if (s || vxc) {
+            break;
+        }
+    }
+    handle_ieee_exc(env, vxc, vec_exc, retaddr);
+    *v1 = tmp;
+}
+
+static uint64_t vfa64(uint64_t a, uint64_t b, float_status *s)
+{
+    return float64_val(float64_add(make_float64(a), make_float64(b), s));
+}
+
+void HELPER(gvec_vfa64)(void *v1, const void *v2, const void *v3,
+                        CPUS390XState *env, uint32_t desc)
+{
+    vop64_3(v1, v2, v3, env, false, vfa64, GETPC());
+}
+
+void HELPER(gvec_vfa64s)(void *v1, const void *v2, const void *v3,
+                         CPUS390XState *env, uint32_t desc)
+{
+    vop64_3(v1, v2, v3, env, true, vfa64, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 05/23] s390x/tcg: Implement VECTOR FP COMPARE (AND SIGNAL) SCALAR
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (3 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 04/23] s390x/tcg: Implement VECTOR FP ADD David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 16:33   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 06/23] s390x/tcg: Implement VECTOR FP COMPARE (EQUAL|HIGH|HIGH OR EQUAL) David Hildenbrand
                   ` (19 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

As far as I can see, the only difference is that WFK (COMPARE AND
SIGNAL) uses the signaling compare, whereas WFC uses the quiet one.
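
To illustrate what separates a quiet from a signaling compare, here is a
minimal sketch that is independent of softfloat. It assumes the usual
IEEE 754 binary64 encoding in which a set most-significant fraction bit
marks a quiet NaN; sk_compare() and the SK_* names are hypothetical and
not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { SK_LESS = -1, SK_EQUAL = 0, SK_GREATER = 1, SK_UNORDERED = 2 };

static bool sk_is_nan(uint64_t u)
{
    return (u & 0x7ff0000000000000ull) == 0x7ff0000000000000ull &&
           (u & 0x000fffffffffffffull) != 0;
}

/* assumption: quiet NaNs have the top fraction bit set */
static bool sk_is_snan(uint64_t u)
{
    return sk_is_nan(u) && !(u & 0x0008000000000000ull);
}

static double sk_from_bits(uint64_t u)
{
    double d;

    memcpy(&d, &u, sizeof(d));
    return d;
}

/*
 * Compare two binary64 bit patterns. *invalid reports whether the
 * invalid-operation exception would be raised:
 *  - signaling compare: for any NaN operand
 *  - quiet compare: only for a signaling NaN operand
 */
static int sk_compare(uint64_t a_bits, uint64_t b_bits, bool signal,
                      bool *invalid)
{
    assert(invalid);
    if (sk_is_nan(a_bits) || sk_is_nan(b_bits)) {
        *invalid = signal || sk_is_snan(a_bits) || sk_is_snan(b_bits);
        return SK_UNORDERED;
    }
    *invalid = false;

    const double a = sk_from_bits(a_bits), b = sk_from_bits(b_bits);

    if (a < b) {
        return SK_LESS;
    }
    return a > b ? SK_GREATER : SK_EQUAL;
}
```

For ordered operands both variants behave identically, so the two
instructions can share everything except the compare call.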

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/helper.h           |  2 ++
 target/s390x/insn-data.def      |  4 ++++
 target/s390x/translate_vx.inc.c | 21 +++++++++++++++++++++
 target/s390x/vec_fpu_helper.c   | 32 ++++++++++++++++++++++++++++++++
 4 files changed, 59 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 21658a2771..d34d6802a6 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -252,6 +252,8 @@ DEF_HELPER_6(gvec_vstrc_cc_rt32, void, ptr, cptr, cptr, cptr, env, i32)
 /* === Vector Floating-Point Instructions */
 DEF_HELPER_FLAGS_5(gvec_vfa64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_FLAGS_5(gvec_vfa64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_4(gvec_wfc64, void, cptr, cptr, env, i32)
+DEF_HELPER_4(gvec_wfk64, void, cptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index 79892f6042..c45e101b10 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1208,6 +1208,10 @@
 
 /* VECTOR FP ADD */
     F(0xe7e3, VFA,     VRR_c, V,   0, 0, 0, 0, vfa, 0, IF_VEC)
+/* VECTOR FP COMPARE SCALAR */
+    F(0xe7cb, WFC,     VRR_a, V,   0, 0, 0, 0, wfc, 0, IF_VEC)
+/* VECTOR FP COMPARE AND SIGNAL SCALAR */
+    F(0xe7ca, WFK,     VRR_a, V,   0, 0, 0, 0, wfc, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index 44da9f2645..283e8aa07a 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2567,3 +2567,24 @@ static DisasJumpType op_vfa(DisasContext *s, DisasOps *o)
                    get_field(s->fields, v3), cpu_env, 0, fn);
     return DISAS_NEXT;
 }
+
+static DisasJumpType op_wfc(DisasContext *s, DisasOps *o)
+{
+    const uint8_t fpf = get_field(s->fields, m3);
+    const uint8_t m4 = get_field(s->fields, m4);
+
+    if (fpf != FPF_LONG || m4) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }
+
+    if (s->fields->op2 == 0xcb) {
+        gen_gvec_2_ptr(get_field(s->fields, v1), get_field(s->fields, v2),
+                       cpu_env, 0, gen_helper_gvec_wfc64);
+    } else {
+        gen_gvec_2_ptr(get_field(s->fields, v1), get_field(s->fields, v2),
+                       cpu_env, 0, gen_helper_gvec_wfk64);
+    }
+    set_cc_static(s);
+    return DISAS_NEXT;
+}
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index 11dd20b837..3c153d8426 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -117,3 +117,35 @@ void HELPER(gvec_vfa64s)(void *v1, const void *v2, const void *v3,
 {
     vop64_3(v1, v2, v3, env, true, vfa64, GETPC());
 }
+
+static int wfc64(const S390Vector *v1, const S390Vector *v2,
+                 CPUS390XState *env, bool signal, uintptr_t retaddr)
+{
+    /* only the zero-indexed elements are compared */
+    const float64 a = make_float64(s390_vec_read_element64(v1, 0));
+    const float64 b = make_float64(s390_vec_read_element64(v2, 0));
+    uint8_t vxc, vec_exc = 0;
+    int cmp;
+
+    if (signal) {
+        cmp = float64_compare(a, b, &env->fpu_status);
+    } else {
+        cmp = float64_compare_quiet(a, b, &env->fpu_status);
+    }
+    vxc = check_ieee_exc(env, 0, false, &vec_exc);
+    handle_ieee_exc(env, vxc, vec_exc, retaddr);
+
+    return float_comp_to_cc(env, cmp);
+}
+
+void HELPER(gvec_wfc64)(const void *v1, const void *v2, CPUS390XState *env,
+                        uint32_t desc)
+{
+    env->cc_op = wfc64(v1, v2, env, false, GETPC());
+}
+
+void HELPER(gvec_wfk64)(const void *v1, const void *v2, CPUS390XState *env,
+                        uint32_t desc)
+{
+    env->cc_op = wfc64(v1, v2, env, true, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 06/23] s390x/tcg: Implement VECTOR FP COMPARE (EQUAL|HIGH|HIGH OR EQUAL)
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (4 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 05/23] s390x/tcg: Implement VECTOR FP COMPARE (AND SIGNAL) SCALAR David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 16:53   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 07/23] s390x/tcg: Implement VECTOR FP CONVERT FROM FIXED 64-BIT David Hildenbrand
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

For all three instructions, provide all four combinations of the cc bit
and the s bit.
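
A minimal sketch of how the result mask and the condition code relate,
assuming cc 0 means that all compared elements matched, cc 1 that some
did and cc 3 that none did. sk_vfc() is a hypothetical stand-in working
on plain doubles; exception handling is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Compare a[i] against b[i] and set mask[i] to all ones on a match.
 * With "single" set, only element 0 is compared and element 1 of the
 * mask stays zero. Returns the condition code.
 */
static int sk_vfc(uint64_t mask[2], const double a[2], const double b[2],
                  bool single, bool test_equal, bool test_high)
{
    const int n = single ? 1 : 2;
    int match = 0;

    assert(test_equal || test_high);
    mask[0] = mask[1] = 0;
    for (int i = 0; i < n; i++) {
        /* comparisons involving a NaN are false, so unordered never matches */
        if ((test_equal && a[i] == b[i]) || (test_high && a[i] > b[i])) {
            mask[i] = UINT64_MAX;
            match++;
        }
    }
    if (match == n) {
        return 0;           /* all compared elements matched */
    }
    return match ? 1 : 3;   /* some matched : none matched */
}
```

Comparing the match count against the number of elements actually
processed is what makes the same function work for both the single
element and the full vector variant.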

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/helper.h           |  12 ++++
 target/s390x/insn-data.def      |   6 ++
 target/s390x/translate_vx.inc.c |  51 +++++++++++++++
 target/s390x/vec_fpu_helper.c   | 107 ++++++++++++++++++++++++++++++++
 4 files changed, 176 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index d34d6802a6..33d3bacf74 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -254,6 +254,18 @@ DEF_HELPER_FLAGS_5(gvec_vfa64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_FLAGS_5(gvec_vfa64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_4(gvec_wfc64, void, cptr, cptr, env, i32)
 DEF_HELPER_4(gvec_wfk64, void, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_5(gvec_vfce64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_5(gvec_vfce64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_5(gvec_vfce64_cc, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_5(gvec_vfce64s_cc, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_5(gvec_vfch64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_5(gvec_vfch64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_5(gvec_vfch64_cc, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_5(gvec_vfch64s_cc, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_5(gvec_vfche64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_5(gvec_vfche64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_5(gvec_vfche64_cc, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_5(gvec_vfche64s_cc, void, ptr, cptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index c45e101b10..446552f251 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1212,6 +1212,12 @@
     F(0xe7cb, WFC,     VRR_a, V,   0, 0, 0, 0, wfc, 0, IF_VEC)
 /* VECTOR FP COMPARE AND SIGNAL SCALAR */
     F(0xe7ca, WFK,     VRR_a, V,   0, 0, 0, 0, wfc, 0, IF_VEC)
+/* VECTOR FP COMPARE EQUAL */
+    F(0xe7e8, VFCE,    VRR_c, V,   0, 0, 0, 0, vfc, 0, IF_VEC)
+/* VECTOR FP COMPARE HIGH */
+    F(0xe7eb, VFCH,    VRR_c, V,   0, 0, 0, 0, vfc, 0, IF_VEC)
+/* VECTOR FP COMPARE HIGH OR EQUAL */
+    F(0xe7ea, VFCHE,   VRR_c, V,   0, 0, 0, 0, vfc, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index 283e8aa07a..5571a71e1a 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2588,3 +2588,54 @@ static DisasJumpType op_wfc(DisasContext *s, DisasOps *o)
     set_cc_static(s);
     return DISAS_NEXT;
 }
+
+static DisasJumpType op_vfc(DisasContext *s, DisasOps *o)
+{
+    const uint8_t fpf = get_field(s->fields, m4);
+    const uint8_t m5 = get_field(s->fields, m5);
+    const uint8_t m6 = get_field(s->fields, m6);
+    const bool se = extract32(m5, 3, 1);
+    const bool cs = extract32(m6, 0, 1);
+    gen_helper_gvec_3_ptr *fn;
+
+    if (fpf != FPF_LONG || extract32(m5, 0, 3) || extract32(m6, 1, 3)) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }
+
+    if (cs) {
+        switch (s->fields->op2) {
+        case 0xe8:
+            fn = se ? gen_helper_gvec_vfce64s_cc : gen_helper_gvec_vfce64_cc;
+            break;
+        case 0xeb:
+            fn = se ? gen_helper_gvec_vfch64s_cc : gen_helper_gvec_vfch64_cc;
+            break;
+        case 0xea:
+            fn = se ? gen_helper_gvec_vfche64s_cc : gen_helper_gvec_vfche64_cc;
+            break;
+        default:
+            g_assert_not_reached();
+        }
+    } else {
+        switch (s->fields->op2) {
+        case 0xe8:
+            fn = se ? gen_helper_gvec_vfce64s : gen_helper_gvec_vfce64;
+            break;
+        case 0xeb:
+            fn = se ? gen_helper_gvec_vfch64s : gen_helper_gvec_vfch64;
+            break;
+        case 0xea:
+            fn = se ? gen_helper_gvec_vfche64s : gen_helper_gvec_vfche64;
+            break;
+        default:
+            g_assert_not_reached();
+        }
+    }
+    gen_gvec_3_ptr(get_field(s->fields, v1), get_field(s->fields, v2),
+                   get_field(s->fields, v3), cpu_env, 0, fn);
+    if (cs) {
+        set_cc_static(s);
+    }
+    return DISAS_NEXT;
+}
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index 3c153d8426..1c4d4661ba 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -149,3 +149,110 @@ void HELPER(gvec_wfk64)(const void *v1, const void *v2, CPUS390XState *env,
 {
     env->cc_op = wfc64(v1, v2, env, true, GETPC());
 }
+
+static int vfc64(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
+                 CPUS390XState *env, bool s, bool test_equal, bool test_high,
+                 uintptr_t retaddr)
+{
+    uint8_t vxc, vec_exc = 0;
+    S390Vector tmp = {};
+    int match = 0;
+    int i;
+
+    for (i = 0; i < 2; i++) {
+        const float64 a = make_float64(s390_vec_read_element64(v2, i));
+        const float64 b = make_float64(s390_vec_read_element64(v3, i));
+        const int cmp = float64_compare_quiet(a, b, &env->fpu_status);
+
+        if ((cmp == float_relation_equal && test_equal) ||
+            (cmp == float_relation_greater && test_high)) {
+            match++;
+            s390_vec_write_element64(&tmp, i, -1ull);
+        }
+        vxc = check_ieee_exc(env, i, false, &vec_exc);
+        if (s || vxc) {
+            break;
+        }
+    }
+
+    handle_ieee_exc(env, vxc, vec_exc, retaddr);
+    *v1 = tmp;
+    if (match == (s ? 1 : 2)) {
+        return 0;
+    } else if (match) {
+        return 1;
+    }
+    return 3;
+}
+
+void HELPER(gvec_vfce64)(void *v1, const void *v2, const void *v3,
+                         CPUS390XState *env, uint32_t desc)
+{
+    vfc64(v1, v2, v3, env, false, true, false, GETPC());
+}
+
+void HELPER(gvec_vfce64s)(void *v1, const void *v2, const void *v3,
+                          CPUS390XState *env, uint32_t desc)
+{
+    vfc64(v1, v2, v3, env, true, true, false, GETPC());
+}
+
+void HELPER(gvec_vfce64_cc)(void *v1, const void *v2, const void *v3,
+                            CPUS390XState *env, uint32_t desc)
+{
+    env->cc_op = vfc64(v1, v2, v3, env, false, true, false, GETPC());
+}
+
+void HELPER(gvec_vfce64s_cc)(void *v1, const void *v2, const void *v3,
+                            CPUS390XState *env, uint32_t desc)
+{
+    env->cc_op = vfc64(v1, v2, v3, env, true, true, false, GETPC());
+}
+
+void HELPER(gvec_vfch64)(void *v1, const void *v2, const void *v3,
+                         CPUS390XState *env, uint32_t desc)
+{
+    vfc64(v1, v2, v3, env, false, false, true, GETPC());
+}
+
+void HELPER(gvec_vfch64s)(void *v1, const void *v2, const void *v3,
+                          CPUS390XState *env, uint32_t desc)
+{
+    vfc64(v1, v2, v3, env, true, false, true, GETPC());
+}
+
+void HELPER(gvec_vfch64_cc)(void *v1, const void *v2, const void *v3,
+                            CPUS390XState *env, uint32_t desc)
+{
+    env->cc_op = vfc64(v1, v2, v3, env, false, false, true, GETPC());
+}
+
+void HELPER(gvec_vfch64s_cc)(void *v1, const void *v2, const void *v3,
+                             CPUS390XState *env, uint32_t desc)
+{
+    env->cc_op = vfc64(v1, v2, v3, env, true, false, true, GETPC());
+}
+
+void HELPER(gvec_vfche64)(void *v1, const void *v2, const void *v3,
+                          CPUS390XState *env, uint32_t desc)
+{
+    vfc64(v1, v2, v3, env, false, true, true, GETPC());
+}
+
+void HELPER(gvec_vfche64s)(void *v1, const void *v2, const void *v3,
+                           CPUS390XState *env, uint32_t desc)
+{
+    vfc64(v1, v2, v3, env, true, true, true, GETPC());
+}
+
+void HELPER(gvec_vfche64_cc)(void *v1, const void *v2, const void *v3,
+                             CPUS390XState *env, uint32_t desc)
+{
+    env->cc_op = vfc64(v1, v2, v3, env, false, true, true, GETPC());
+}
+
+void HELPER(gvec_vfche64s_cc)(void *v1, const void *v2, const void *v3,
+                              CPUS390XState *env, uint32_t desc)
+{
+    env->cc_op = vfc64(v1, v2, v3, env, true, true, true, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 07/23] s390x/tcg: Implement VECTOR FP CONVERT FROM FIXED 64-BIT
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (5 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 06/23] s390x/tcg: Implement VECTOR FP COMPARE (EQUAL|HIGH|HIGH OR EQUAL) David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:10   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 08/23] s390x/tcg: Implement VECTOR FP CONVERT FROM LOGICAL 64-BIT David Hildenbrand
                   ` (17 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

1. We'll reuse op_vcdg() for similar instructions later, prepare for
   that.
2. We'll reuse vop64_2() later for other instructions.

We have to mangle the erm (effective rounding mode) and the m4 into
the simd_data(), and properly unmangle them again.
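
The packing can be sketched as follows. sk_extract32() and
sk_deposit32() are simplified stand-ins for QEMU's extract32() and
deposit32(), and the sk_pack()/sk_unpack_*() wrappers are hypothetical;
the bit positions (erm in bits 4-7, XxC in bit 2 of m4) are the ones
used in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return "length" bits of "value", starting at bit "start". */
static uint32_t sk_extract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0u >> (32 - length));
}

/* Replace "length" bits of "value" at bit "start" with "field". */
static uint32_t sk_deposit32(uint32_t value, int start, int length,
                             uint32_t field)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);

    const uint32_t mask = (~0u >> (32 - length)) << start;

    return (value & ~mask) | ((field << start) & mask);
}

/* translation time: m4 is a 4-bit field, so bits 4-7 are free for erm */
static uint32_t sk_pack(uint8_t m4, uint8_t erm)
{
    return sk_deposit32(m4, 4, 4, erm);
}

/* helper side: recover the individual values again */
static uint8_t sk_unpack_erm(uint32_t data)
{
    return (uint8_t)sk_extract32(data, 4, 4);
}

static bool sk_unpack_xxc(uint32_t data)
{
    return sk_extract32(data, 2, 1) != 0;
}
```

With m4 = 0x4 (XxC set) and erm = 5, the packed value is 0x54, and the
helper side gets erm = 5 and XxC = true back.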

Make sure to restore the original rounding mode before triggering an
exception.
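
Why the order matters can be shown with a small model, assuming the
exception path does not return to the helper (modeled here with
longjmp()). SketchCpu and the sk_*() functions are hypothetical and only
mimic the control flow, not the real rounding-mode handling.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

typedef struct {
    int rounding_mode;  /* currently active rounding mode */
    jmp_buf trap_env;   /* where a raised exception continues */
} SketchCpu;

/* Install a new rounding mode and return the previous one. */
static int sk_swap_mode(SketchCpu *cpu, int mode)
{
    const int old = cpu->rounding_mode;

    cpu->rounding_mode = mode;
    return old;
}

/* Model of a non-returning exception. */
static void sk_raise(SketchCpu *cpu)
{
    longjmp(cpu->trap_env, 1);
}

static void sk_convert(SketchCpu *cpu, int erm, bool element_traps)
{
    assert(erm >= 0 && erm <= 7);

    const int old_mode = sk_swap_mode(cpu, erm);

    /* ... the per-element conversion would run here under "erm" ... */

    /* restore first: nothing after sk_raise() is ever executed */
    cpu->rounding_mode = old_mode;
    if (element_traps) {
        sk_raise(cpu);
    }
}

/* Returns true if the modeled instruction trapped. */
static bool sk_run(SketchCpu *cpu, int erm, bool element_traps)
{
    if (setjmp(cpu->trap_env)) {
        return true;
    }
    sk_convert(cpu, erm, element_traps);
    return false;
}
```

If the restore came after the raise, a trapping instruction would leave
its per-instruction rounding mode active for all following code.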

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/helper.h           |  2 ++
 target/s390x/insn-data.def      |  2 ++
 target/s390x/translate_vx.inc.c | 25 ++++++++++++++++++
 target/s390x/vec_fpu_helper.c   | 47 +++++++++++++++++++++++++++++++++
 4 files changed, 76 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 33d3bacf74..a60f4c49fc 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -266,6 +266,8 @@ DEF_HELPER_FLAGS_5(gvec_vfche64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32
 DEF_HELPER_FLAGS_5(gvec_vfche64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_5(gvec_vfche64_cc, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_5(gvec_vfche64s_cc, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vcdg64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vcdg64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index 446552f251..d3386024c8 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1218,6 +1218,8 @@
     F(0xe7eb, VFCH,    VRR_c, V,   0, 0, 0, 0, vfc, 0, IF_VEC)
 /* VECTOR FP COMPARE HIGH OR EQUAL */
     F(0xe7ea, VFCHE,   VRR_c, V,   0, 0, 0, 0, vfc, 0, IF_VEC)
+/* VECTOR FP CONVERT FROM FIXED 64-BIT */
+    F(0xe7c3, VCDG,    VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index 5571a71e1a..6741b707cc 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2639,3 +2639,28 @@ static DisasJumpType op_vfc(DisasContext *s, DisasOps *o)
     }
     return DISAS_NEXT;
 }
+
+static DisasJumpType op_vcdg(DisasContext *s, DisasOps *o)
+{
+    const uint8_t fpf = get_field(s->fields, m3);
+    const uint8_t m4 = get_field(s->fields, m4);
+    const uint8_t erm = get_field(s->fields, m5);
+    const bool se = extract32(m4, 3, 1);
+    gen_helper_gvec_2_ptr *fn;
+
+    if (fpf != FPF_LONG || extract32(m4, 0, 2) || erm > 7 || erm == 2) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }
+
+    switch (s->fields->op2) {
+    case 0xc3:
+        fn = se ? gen_helper_gvec_vcdg64s : gen_helper_gvec_vcdg64;
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    gen_gvec_2_ptr(get_field(s->fields, v1), get_field(s->fields, v2), cpu_env,
+                   deposit32(m4, 4, 4, erm), fn);
+    return DISAS_NEXT;
+}
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index 1c4d4661ba..488895efdc 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -78,6 +78,30 @@ static void handle_ieee_exc(CPUS390XState *env, uint8_t vxc, uint8_t vec_exc,
     }
 }
 
+typedef uint64_t (*vop64_2_fn)(uint64_t a, float_status *s);
+static void vop64_2(S390Vector *v1, const S390Vector *v2, CPUS390XState *env,
+                    bool s, bool XxC, uint8_t erm, vop64_2_fn fn,
+                    uintptr_t retaddr)
+{
+    uint8_t vxc, vec_exc = 0;
+    S390Vector tmp = {};
+    int i, old_mode;
+
+    old_mode = s390_swap_bfp_rounding_mode(env, erm);
+    for (i = 0; i < 2; i++) {
+        const uint64_t a = s390_vec_read_element64(v2, i);
+
+        s390_vec_write_element64(&tmp, i, fn(a, &env->fpu_status));
+        vxc = check_ieee_exc(env, i, XxC, &vec_exc);
+        if (s || vxc) {
+            break;
+        }
+    }
+    s390_restore_bfp_rounding_mode(env, old_mode);
+    handle_ieee_exc(env, vxc, vec_exc, retaddr);
+    *v1 = tmp;
+}
+
 typedef uint64_t (*vop64_3_fn)(uint64_t a, uint64_t b, float_status *s);
 static void vop64_3(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
                     CPUS390XState *env, bool s, vop64_3_fn fn,
@@ -256,3 +280,26 @@ void HELPER(gvec_vfche64s_cc)(void *v1, const void *v2, const void *v3,
 {
     env->cc_op = vfc64(v1, v2, v3, env, true, true, true, GETPC());
 }
+
+static uint64_t vcdg64(uint64_t a, float_status *s)
+{
+    return float64_val(int64_to_float64(a, s));
+}
+
+void HELPER(gvec_vcdg64)(void *v1, const void *v2, CPUS390XState *env,
+                         uint32_t desc)
+{
+    const uint8_t erm = extract32(simd_data(desc), 4, 4);
+    const bool XxC = extract32(simd_data(desc), 2, 1);
+
+    vop64_2(v1, v2, env, false, XxC, erm, vcdg64, GETPC());
+}
+
+void HELPER(gvec_vcdg64s)(void *v1, const void *v2, CPUS390XState *env,
+                          uint32_t desc)
+{
+    const uint8_t erm = extract32(simd_data(desc), 4, 4);
+    const bool XxC = extract32(simd_data(desc), 2, 1);
+
+    vop64_2(v1, v2, env, true, XxC, erm, vcdg64, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 08/23] s390x/tcg: Implement VECTOR FP CONVERT FROM LOGICAL 64-BIT
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (6 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 07/23] s390x/tcg: Implement VECTOR FP CONVERT FROM FIXED 64-BIT David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:15   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 09/23] s390x/tcg: Implement VECTOR FP CONVERT TO FIXED 64-BIT David Hildenbrand
                   ` (16 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/helper.h           |  2 ++
 target/s390x/insn-data.def      |  2 ++
 target/s390x/translate_vx.inc.c |  3 +++
 target/s390x/vec_fpu_helper.c   | 23 +++++++++++++++++++++++
 4 files changed, 30 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index a60f4c49fc..6fd996e924 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -268,6 +268,8 @@ DEF_HELPER_5(gvec_vfche64_cc, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_5(gvec_vfche64s_cc, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vcdg64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vcdg64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vcdlg64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vcdlg64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index d3386024c8..465b36dd70 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1220,6 +1220,8 @@
     F(0xe7ea, VFCHE,   VRR_c, V,   0, 0, 0, 0, vfc, 0, IF_VEC)
 /* VECTOR FP CONVERT FROM FIXED 64-BIT */
     F(0xe7c3, VCDG,    VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
+/* VECTOR FP CONVERT FROM LOGICAL 64-BIT */
+    F(0xe7c1, VCDLG,   VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index 6741b707cc..fa755cd1d6 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2657,6 +2657,9 @@ static DisasJumpType op_vcdg(DisasContext *s, DisasOps *o)
     case 0xc3:
         fn = se ? gen_helper_gvec_vcdg64s : gen_helper_gvec_vcdg64;
         break;
+    case 0xc1:
+        fn = se ? gen_helper_gvec_vcdlg64s : gen_helper_gvec_vcdlg64;
+        break;
     default:
         g_assert_not_reached();
     }
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index 488895efdc..8f7dac0439 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -303,3 +303,26 @@ void HELPER(gvec_vcdg64s)(void *v1, const void *v2, CPUS390XState *env,
 
     vop64_2(v1, v2, env, true, XxC, erm, vcdg64, GETPC());
 }
+
+static uint64_t vcdlg64(uint64_t a, float_status *s)
+{
+    return float64_val(uint64_to_float64(a, s));
+}
+
+void HELPER(gvec_vcdlg64)(void *v1, const void *v2, CPUS390XState *env,
+                          uint32_t desc)
+{
+    const uint8_t erm = extract32(simd_data(desc), 4, 4);
+    const bool XxC = extract32(simd_data(desc), 2, 1);
+
+    vop64_2(v1, v2, env, false, XxC, erm, vcdlg64, GETPC());
+}
+
+void HELPER(gvec_vcdlg64s)(void *v1, const void *v2, CPUS390XState *env,
+                           uint32_t desc)
+{
+    const uint8_t erm = extract32(simd_data(desc), 4, 4);
+    const bool XxC = extract32(simd_data(desc), 2, 1);
+
+    vop64_2(v1, v2, env, true, XxC, erm, vcdlg64, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 09/23] s390x/tcg: Implement VECTOR FP CONVERT TO FIXED 64-BIT
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (7 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 08/23] s390x/tcg: Implement VECTOR FP CONVERT FROM LOGICAL 64-BIT David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:17   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 10/23] s390x/tcg: Implement VECTOR FP CONVERT TO LOGICAL 64-BIT David Hildenbrand
                   ` (15 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/helper.h           |  2 ++
 target/s390x/insn-data.def      |  2 ++
 target/s390x/translate_vx.inc.c |  3 +++
 target/s390x/vec_fpu_helper.c   | 23 +++++++++++++++++++++++
 4 files changed, 30 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 6fd996e924..9893c677da 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -270,6 +270,8 @@ DEF_HELPER_FLAGS_4(gvec_vcdg64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vcdg64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vcdlg64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vcdlg64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vcgd64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vcgd64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index 465b36dd70..97c62a8af5 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1222,6 +1222,8 @@
     F(0xe7c3, VCDG,    VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
 /* VECTOR FP CONVERT FROM LOGICAL 64-BIT */
     F(0xe7c1, VCDLG,   VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
+/* VECTOR FP CONVERT TO FIXED 64-BIT */
+    F(0xe7c2, VCGD,    VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index fa755cd1d6..a42de2ff01 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2660,6 +2660,9 @@ static DisasJumpType op_vcdg(DisasContext *s, DisasOps *o)
     case 0xc1:
         fn = se ? gen_helper_gvec_vcdlg64s : gen_helper_gvec_vcdlg64;
         break;
+    case 0xc2:
+        fn = se ? gen_helper_gvec_vcgd64s : gen_helper_gvec_vcgd64;
+        break;
     default:
         g_assert_not_reached();
     }
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index 8f7dac0439..e1a797ecca 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -326,3 +326,26 @@ void HELPER(gvec_vcdlg64s)(void *v1, const void *v2, CPUS390XState *env,
 
     vop64_2(v1, v2, env, true, XxC, erm, vcdlg64, GETPC());
 }
+
+static uint64_t vcgd64(uint64_t a, float_status *s)
+{
+    return float64_to_int64(make_float64(a), s);
+}
+
+void HELPER(gvec_vcgd64)(void *v1, const void *v2, CPUS390XState *env,
+                         uint32_t desc)
+{
+    const uint8_t erm = extract32(simd_data(desc), 4, 4);
+    const bool XxC = extract32(simd_data(desc), 2, 1);
+
+    vop64_2(v1, v2, env, false, XxC, erm, vcgd64, GETPC());
+}
+
+void HELPER(gvec_vcgd64s)(void *v1, const void *v2, CPUS390XState *env,
+                          uint32_t desc)
+{
+    const uint8_t erm = extract32(simd_data(desc), 4, 4);
+    const bool XxC = extract32(simd_data(desc), 2, 1);
+
+    vop64_2(v1, v2, env, true, XxC, erm, vcgd64, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 10/23] s390x/tcg: Implement VECTOR FP CONVERT TO LOGICAL 64-BIT
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (8 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 09/23] s390x/tcg: Implement VECTOR FP CONVERT TO FIXED 64-BIT David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:18   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 11/23] s390x/tcg: Implement VECTOR FP DIVIDE David Hildenbrand
                   ` (14 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/helper.h           |  2 ++
 target/s390x/insn-data.def      |  2 ++
 target/s390x/translate_vx.inc.c |  3 +++
 target/s390x/vec_fpu_helper.c   | 23 +++++++++++++++++++++++
 4 files changed, 30 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 9893c677da..9b9062970a 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -272,6 +272,8 @@ DEF_HELPER_FLAGS_4(gvec_vcdlg64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vcdlg64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vcgd64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vcgd64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vclgd64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vclgd64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index 97c62a8af5..ed8b888d59 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1224,6 +1224,8 @@
     F(0xe7c1, VCDLG,   VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
 /* VECTOR FP CONVERT TO FIXED 64-BIT */
     F(0xe7c2, VCGD,    VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
+/* VECTOR FP CONVERT TO LOGICAL 64-BIT */
+    F(0xe7c0, VCLGD,   VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index a42de2ff01..0395d69968 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2663,6 +2663,9 @@ static DisasJumpType op_vcdg(DisasContext *s, DisasOps *o)
     case 0xc2:
         fn = se ? gen_helper_gvec_vcgd64s : gen_helper_gvec_vcgd64;
         break;
+    case 0xc0:
+        fn = se ? gen_helper_gvec_vclgd64s : gen_helper_gvec_vclgd64;
+        break;
     default:
         g_assert_not_reached();
     }
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index e1a797ecca..92a2c04952 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -349,3 +349,26 @@ void HELPER(gvec_vcgd64s)(void *v1, const void *v2, CPUS390XState *env,
 
     vop64_2(v1, v2, env, true, XxC, erm, vcgd64, GETPC());
 }
+
+static uint64_t vclgd64(uint64_t a, float_status *s)
+{
+    return float64_to_uint64(make_float64(a), s);
+}
+
+void HELPER(gvec_vclgd64)(void *v1, const void *v2, CPUS390XState *env,
+                          uint32_t desc)
+{
+    const uint8_t erm = extract32(simd_data(desc), 4, 4);
+    const bool XxC = extract32(simd_data(desc), 2, 1);
+
+    vop64_2(v1, v2, env, false, XxC, erm, vclgd64, GETPC());
+}
+
+void HELPER(gvec_vclgd64s)(void *v1, const void *v2, CPUS390XState *env,
+                           uint32_t desc)
+{
+    const uint8_t erm = extract32(simd_data(desc), 4, 4);
+    const bool XxC = extract32(simd_data(desc), 2, 1);
+
+    vop64_2(v1, v2, env, true, XxC, erm, vclgd64, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 11/23] s390x/tcg: Implement VECTOR FP DIVIDE
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (9 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 10/23] s390x/tcg: Implement VECTOR FP CONVERT TO LOGICAL 64-BIT David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:25   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 12/23] s390x/tcg: Implement VECTOR LOAD FP INTEGER David Hildenbrand
                   ` (13 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

We can reuse most of the infrastructure added for VECTOR FP ADD.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
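Note (not part of the commit message): the reuse mentioned above comes down to a per-element loop that takes the actual operation as a function pointer, so a new instruction only needs a small callback. Below is a minimal standalone sketch of that pattern. Vec2, vop_3, op_add and op_div are hypothetical stand-ins and not the QEMU definitions. It uses host doubles instead of softfloat and leaves out all IEEE exception handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit vector register: two 64-bit elements. */
typedef struct {
    uint64_t e[2];
} Vec2;

/* Callback type: operates on the raw 64-bit patterns of two operands. */
typedef uint64_t (*op3_fn)(uint64_t a, uint64_t b);

static double bits_to_double(uint64_t v)
{
    double d;

    memcpy(&d, &v, sizeof(d));
    return d;
}

static uint64_t double_to_bits(double d)
{
    uint64_t v;

    memcpy(&v, &d, sizeof(v));
    return v;
}

static uint64_t op_add(uint64_t a, uint64_t b)
{
    return double_to_bits(bits_to_double(a) + bits_to_double(b));
}

static uint64_t op_div(uint64_t a, uint64_t b)
{
    return double_to_bits(bits_to_double(a) / bits_to_double(b));
}

/*
 * Apply fn to each pair of elements. With "single" set, only element 0
 * is processed and the remaining element of the result stays zero.
 */
static void vop_3(Vec2 *v1, const Vec2 *v2, const Vec2 *v3, bool single,
                  op3_fn fn)
{
    Vec2 tmp = { { 0, 0 } };

    assert(fn != NULL);
    for (int i = 0; i < 2; i++) {
        tmp.e[i] = fn(v2->e[i], v3->e[i]);
        if (single) {
            break;
        }
    }
    /* Write the destination last, so v1 may alias v2 or v3. */
    *v1 = tmp;
}
```

With this shape, adding a divide next to an add is one new callback plus two thin wrappers (all elements and single element), which is what the helper hunk below does with vfd64.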
 target/s390x/helper.h           |  2 ++
 target/s390x/insn-data.def      |  2 ++
 target/s390x/translate_vx.inc.c |  3 +++
 target/s390x/vec_fpu_helper.c   | 17 +++++++++++++++++
 4 files changed, 24 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 9b9062970a..238bfa2509 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -274,6 +274,8 @@ DEF_HELPER_FLAGS_4(gvec_vcgd64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vcgd64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vclgd64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vclgd64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_5(gvec_vfd64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_5(gvec_vfd64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index ed8b888d59..f9830deace 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1226,6 +1226,8 @@
     F(0xe7c2, VCGD,    VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
 /* VECTOR FP CONVERT TO LOGICAL 64-BIT */
     F(0xe7c0, VCLGD,   VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
+/* VECTOR FP DIVIDE */
+    F(0xe7e5, VFD,     VRR_c, V,   0, 0, 0, 0, vfa, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index 0395d69968..9e55d4488b 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2560,6 +2560,9 @@ static DisasJumpType op_vfa(DisasContext *s, DisasOps *o)
     case 0xe3:
         fn = se ? gen_helper_gvec_vfa64s : gen_helper_gvec_vfa64;
         break;
+    case 0xe5:
+        fn = se ? gen_helper_gvec_vfd64s : gen_helper_gvec_vfd64;
+        break;
     default:
         g_assert_not_reached();
     }
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index 92a2c04952..2c085a8849 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -372,3 +372,20 @@ void HELPER(gvec_vclgd64s)(void *v1, const void *v2, CPUS390XState *env,
 
     vop64_2(v1, v2, env, true, XxC, erm, vclgd64, GETPC());
 }
+
+static uint64_t vfd64(uint64_t a, uint64_t b, float_status *s)
+{
+    return float64_val(float64_div(make_float64(a), make_float64(b), s));
+}
+
+void HELPER(gvec_vfd64)(void *v1, const void *v2, const void *v3,
+                        CPUS390XState *env, uint32_t desc)
+{
+    vop64_3(v1, v2, v3, env, false, vfd64, GETPC());
+}
+
+void HELPER(gvec_vfd64s)(void *v1, const void *v2, const void *v3,
+                         CPUS390XState *env, uint32_t desc)
+{
+    vop64_3(v1, v2, v3, env, true, vfd64, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 12/23] s390x/tcg: Implement VECTOR LOAD FP INTEGER
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (10 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 11/23] s390x/tcg: Implement VECTOR FP DIVIDE David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:26   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 13/23] s390x/tcg: Implement VECTOR LOAD LENGTHENED David Hildenbrand
                   ` (12 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

We can reuse most of the infrastructure introduced for
VECTOR FP CONVERT FROM FIXED 64-BIT and friends.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
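Note (not part of the commit message): the shared conversion path passes the effective rounding mode and the XxC bit to the helper in one 32-bit data word. The translator deposits erm into bits 4-7 on top of m4, and the helper extracts erm from bits 4-7 and XxC from bit 2. Below is a minimal standalone sketch of that packing. extract32 and deposit32 are local reimplementations written for this sketch with the same argument order as the calls in the patch. pack_data is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Local reimplementation for this sketch: read "length" bits from "start". */
static uint32_t extract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Local reimplementation for this sketch: overwrite a bit field. */
static uint32_t deposit32(uint32_t value, int start, int length,
                          uint32_t fieldval)
{
    uint32_t mask;

    assert(start >= 0 && length > 0 && length <= 32 - start);
    mask = (~0U >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Hypothetical wrapper: what the translator hands to the helper. */
static uint32_t pack_data(uint8_t m4, uint8_t erm)
{
    return deposit32(m4, 4, 4, erm);
}
```

For example, m4 = 0x4 (XxC set) with erm = 5 packs to 0x54, from which the helper side recovers erm = 5 and XxC = 1.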
 target/s390x/helper.h           |  2 ++
 target/s390x/insn-data.def      |  2 ++
 target/s390x/translate_vx.inc.c |  3 +++
 target/s390x/vec_fpu_helper.c   | 23 +++++++++++++++++++++++
 4 files changed, 30 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 238bfa2509..10a9cb39b6 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -276,6 +276,8 @@ DEF_HELPER_FLAGS_4(gvec_vclgd64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vclgd64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_5(gvec_vfd64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_FLAGS_5(gvec_vfd64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vfi64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vfi64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index f9830deace..f77aa41253 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1228,6 +1228,8 @@
     F(0xe7c0, VCLGD,   VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
 /* VECTOR FP DIVIDE */
     F(0xe7e5, VFD,     VRR_c, V,   0, 0, 0, 0, vfa, 0, IF_VEC)
+/* VECTOR LOAD FP INTEGER */
+    F(0xe7c7, VFI,     VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index 9e55d4488b..59d8b971c0 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2669,6 +2669,9 @@ static DisasJumpType op_vcdg(DisasContext *s, DisasOps *o)
     case 0xc0:
         fn = se ? gen_helper_gvec_vclgd64s : gen_helper_gvec_vclgd64;
         break;
+    case 0xc7:
+        fn = se ? gen_helper_gvec_vfi64s : gen_helper_gvec_vfi64;
+        break;
     default:
         g_assert_not_reached();
     }
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index 2c085a8849..63ba4cf548 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -389,3 +389,26 @@ void HELPER(gvec_vfd64s)(void *v1, const void *v2, const void *v3,
 {
     vop64_3(v1, v2, v3, env, true, vfd64, GETPC());
 }
+
+static uint64_t vfi64(uint64_t a, float_status *s)
+{
+    return float64_val(float64_round_to_int(make_float64(a), s));
+}
+
+void HELPER(gvec_vfi64)(void *v1, const void *v2, CPUS390XState *env,
+                        uint32_t desc)
+{
+    const uint8_t erm = extract32(simd_data(desc), 4, 4);
+    const bool XxC = extract32(simd_data(desc), 2, 1);
+
+    vop64_2(v1, v2, env, false, XxC, erm, vfi64, GETPC());
+}
+
+void HELPER(gvec_vfi64s)(void *v1, const void *v2, CPUS390XState *env,
+                         uint32_t desc)
+{
+    const uint8_t erm = extract32(simd_data(desc), 4, 4);
+    const bool XxC = extract32(simd_data(desc), 2, 1);
+
+    vop64_2(v1, v2, env, true, XxC, erm, vfi64, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 13/23] s390x/tcg: Implement VECTOR LOAD LENGTHENED
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (11 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 12/23] s390x/tcg: Implement VECTOR LOAD FP INTEGER David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:33   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 14/23] s390x/tcg: Implement VECTOR LOAD ROUNDED David Hildenbrand
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

Take care to read the even 32-bit source elements and to indicate the
32-bit source element number when an exception is recognized.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
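Note (not part of the commit message): the element handling is the only unusual part of this instruction. Two 64-bit results are produced from the 32-bit source elements 0 and 2, while elements 1 and 3 are ignored. Below is a minimal standalone sketch of that indexing. Vec4x32, Vec2x64 and load_lengthened are hypothetical names. It uses the host float to double conversion instead of softfloat and leaves out exception reporting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical views of one 128-bit vector, indexed by element number. */
typedef struct {
    uint32_t e[4];
} Vec4x32;

typedef struct {
    uint64_t e[2];
} Vec2x64;

static uint32_t float_to_bits(float f)
{
    uint32_t v;

    memcpy(&v, &f, sizeof(v));
    return v;
}

static double bits_to_double(uint64_t v)
{
    double d;

    memcpy(&d, &v, sizeof(d));
    return d;
}

/*
 * Widen the even 32-bit source elements (0 and 2) into the two 64-bit
 * result elements. With "single" set, only source element 0 is converted.
 */
static void load_lengthened(Vec2x64 *dst, const Vec4x32 *src, bool single)
{
    Vec2x64 tmp = { { 0, 0 } };

    assert(dst != NULL && src != NULL);
    for (int i = 0; i < 2; i++) {
        float f;
        double d;

        /* load from even element */
        memcpy(&f, &src->e[i * 2], sizeof(f));
        d = f; /* float to double is exact */
        memcpy(&tmp.e[i], &d, sizeof(d));
        if (single) {
            break;
        }
    }
    *dst = tmp;
}
```

The source index i * 2 is also the element number to report for an exception, which is why the helper below passes i * 2 rather than i to check_ieee_exc().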
 target/s390x/helper.h           |  2 ++
 target/s390x/insn-data.def      |  2 ++
 target/s390x/translate_vx.inc.c | 19 +++++++++++++++++
 target/s390x/vec_fpu_helper.c   | 36 +++++++++++++++++++++++++++++++++
 4 files changed, 59 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 10a9cb39b6..cb25141ffe 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -278,6 +278,8 @@ DEF_HELPER_FLAGS_5(gvec_vfd64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_FLAGS_5(gvec_vfd64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vfi64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vfi64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vfll32, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vfll32s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index f77aa41253..5afdb36aec 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1230,6 +1230,8 @@
     F(0xe7e5, VFD,     VRR_c, V,   0, 0, 0, 0, vfa, 0, IF_VEC)
 /* VECTOR LOAD FP INTEGER */
     F(0xe7c7, VFI,     VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
+/* VECTOR LOAD LENGTHENED */
+    F(0xe7c4, VFLL,    VRR_a, V,   0, 0, 0, 0, vfll, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index 59d8b971c0..a25985e5c9 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2679,3 +2679,22 @@ static DisasJumpType op_vcdg(DisasContext *s, DisasOps *o)
                    deposit32(m4, 4, 4, erm), fn);
     return DISAS_NEXT;
 }
+
+static DisasJumpType op_vfll(DisasContext *s, DisasOps *o)
+{
+    const uint8_t fpf = get_field(s->fields, m3);
+    const uint8_t m4 = get_field(s->fields, m4);
+    gen_helper_gvec_2_ptr *fn = gen_helper_gvec_vfll32;
+
+    if (fpf != FPF_SHORT || extract32(m4, 0, 3)) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }
+
+    if (extract32(m4, 3, 1)) {
+        fn = gen_helper_gvec_vfll32s;
+    }
+    gen_gvec_2_ptr(get_field(s->fields, v1), get_field(s->fields, v2), cpu_env,
+                   0, fn);
+    return DISAS_NEXT;
+}
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index 63ba4cf548..f8919beed5 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -412,3 +412,39 @@ void HELPER(gvec_vfi64s)(void *v1, const void *v2, CPUS390XState *env,
 
     vop64_2(v1, v2, env, true, XxC, erm, vfi64, GETPC());
 }
+
+static void vfll32(S390Vector *v1, const S390Vector *v2, CPUS390XState *env,
+                   bool s, uintptr_t retaddr)
+{
+    uint8_t vxc, vec_exc = 0;
+    S390Vector tmp = {};
+    int i;
+
+    for (i = 0; i < 2; i++) {
+        /* load from even element */
+        const float32 a = make_float32(s390_vec_read_element32(v2, i * 2));
+        const uint64_t ret = float64_val(float32_to_float64(a,
+                                                            &env->fpu_status));
+
+        s390_vec_write_element64(&tmp, i, ret);
+        /* indicate the source element */
+        vxc = check_ieee_exc(env, i * 2, false, &vec_exc);
+        if (s || vxc) {
+            break;
+        }
+    }
+    handle_ieee_exc(env, vxc, vec_exc, retaddr);
+    *v1 = tmp;
+}
+
+void HELPER(gvec_vfll32)(void *v1, const void *v2, CPUS390XState *env,
+                         uint32_t desc)
+{
+    vfll32(v1, v2, env, false, GETPC());
+}
+
+void HELPER(gvec_vfll32s)(void *v1, const void *v2, CPUS390XState *env,
+                          uint32_t desc)
+{
+    vfll32(v1, v2, env, true, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 14/23] s390x/tcg: Implement VECTOR LOAD ROUNDED
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (12 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 13/23] s390x/tcg: Implement VECTOR LOAD LENGTHENED David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:37   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 15/23] s390x/tcg: Implement VECTOR FP MULTIPLY David Hildenbrand
                   ` (10 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

We can reuse some of the infrastructure introduced for
VECTOR FP CONVERT FROM FIXED 64-BIT and friends.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/helper.h           |  2 ++
 target/s390x/insn-data.def      |  2 ++
 target/s390x/translate_vx.inc.c |  3 +++
 target/s390x/vec_fpu_helper.c   | 43 +++++++++++++++++++++++++++++++++
 4 files changed, 50 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index cb25141ffe..7526f8e8c6 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -280,6 +280,8 @@ DEF_HELPER_FLAGS_4(gvec_vfi64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vfi64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vfll32, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vfll32s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vflr64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vflr64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index 5afdb36aec..f03914d528 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1232,6 +1232,8 @@
     F(0xe7c7, VFI,     VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
 /* VECTOR LOAD LENGTHENED */
     F(0xe7c4, VFLL,    VRR_a, V,   0, 0, 0, 0, vfll, 0, IF_VEC)
+/* VECTOR LOAD ROUNDED */
+    F(0xe7c5, VFLR,    VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index a25985e5c9..73e1b1062a 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2672,6 +2672,9 @@ static DisasJumpType op_vcdg(DisasContext *s, DisasOps *o)
     case 0xc7:
         fn = se ? gen_helper_gvec_vfi64s : gen_helper_gvec_vfi64;
         break;
+    case 0xc5:
+        fn = se ? gen_helper_gvec_vflr64s : gen_helper_gvec_vflr64;
+        break;
     default:
         g_assert_not_reached();
     }
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index f8919beed5..d5fd931b61 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -448,3 +448,46 @@ void HELPER(gvec_vfll32s)(void *v1, const void *v2, CPUS390XState *env,
 {
     vfll32(v1, v2, env, true, GETPC());
 }
+
+static void vflr64(S390Vector *v1, const S390Vector *v2, CPUS390XState *env,
+                   bool s, bool XxC, uint8_t erm, uintptr_t retaddr)
+{
+    uint8_t vxc, vec_exc = 0;
+    S390Vector tmp = {};
+    int i, old_mode;
+
+    old_mode = s390_swap_bfp_rounding_mode(env, erm);
+    for (i = 0; i < 2; i++) {
+        float64 a = make_float64(s390_vec_read_element64(v2, i));
+        uint32_t ret = float32_val(float64_to_float32(a, &env->fpu_status));
+
+        /* place at even element */
+        s390_vec_write_element32(&tmp, i * 2, ret);
+        /* indicate the source element */
+        vxc = check_ieee_exc(env, i, XxC, &vec_exc);
+        if (s || vxc) {
+            break;
+        }
+    }
+    s390_restore_bfp_rounding_mode(env, old_mode);
+    handle_ieee_exc(env, vxc, vec_exc, retaddr);
+    *v1 = tmp;
+}
+
+void HELPER(gvec_vflr64)(void *v1, const void *v2, CPUS390XState *env,
+                         uint32_t desc)
+{
+    const uint8_t erm = extract32(simd_data(desc), 4, 4);
+    const bool XxC = extract32(simd_data(desc), 2, 1);
+
+    vflr64(v1, v2, env, false, XxC, erm, GETPC());
+}
+
+void HELPER(gvec_vflr64s)(void *v1, const void *v2, CPUS390XState *env,
+                          uint32_t desc)
+{
+    const uint8_t erm = extract32(simd_data(desc), 4, 4);
+    const bool XxC = extract32(simd_data(desc), 2, 1);
+
+    vflr64(v1, v2, env, true, XxC, erm, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 15/23] s390x/tcg: Implement VECTOR FP MULTIPLY
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (13 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 14/23] s390x/tcg: Implement VECTOR LOAD ROUNDED David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:37   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 16/23] s390x/tcg: Implement VECTOR FP MULTIPLY AND (ADD|SUBTRACT) David Hildenbrand
                   ` (9 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

Very similar to VECTOR FP DIVIDE.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/helper.h           |  2 ++
 target/s390x/insn-data.def      |  2 ++
 target/s390x/translate_vx.inc.c |  3 +++
 target/s390x/vec_fpu_helper.c   | 17 +++++++++++++++++
 4 files changed, 24 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 7526f8e8c6..22e02a0178 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -282,6 +282,8 @@ DEF_HELPER_FLAGS_4(gvec_vfll32, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vfll32s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vflr64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vflr64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_5(gvec_vfm64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_5(gvec_vfm64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index f03914d528..e56059ac34 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1234,6 +1234,8 @@
     F(0xe7c4, VFLL,    VRR_a, V,   0, 0, 0, 0, vfll, 0, IF_VEC)
 /* VECTOR LOAD ROUNDED */
     F(0xe7c5, VFLR,    VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
+/* VECTOR FP MULTIPLY */
+    F(0xe7e7, VFM,     VRR_c, V,   0, 0, 0, 0, vfa, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index 73e1b1062a..ae31a327cf 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2563,6 +2563,9 @@ static DisasJumpType op_vfa(DisasContext *s, DisasOps *o)
     case 0xe5:
         fn = se ? gen_helper_gvec_vfd64s : gen_helper_gvec_vfd64;
         break;
+    case 0xe7:
+        fn = se ? gen_helper_gvec_vfm64s : gen_helper_gvec_vfm64;
+        break;
     default:
         g_assert_not_reached();
     }
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index d5fd931b61..fd147cc055 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -491,3 +491,20 @@ void HELPER(gvec_vflr64s)(void *v1, const void *v2, CPUS390XState *env,
 
     vflr64(v1, v2, env, true, XxC, erm, GETPC());
 }
+
+static uint64_t vfm64(uint64_t a, uint64_t b, float_status *s)
+{
+    return float64_val(float64_mul(make_float64(a), make_float64(b), s));
+}
+
+void HELPER(gvec_vfm64)(void *v1, const void *v2, const void *v3,
+                        CPUS390XState *env, uint32_t desc)
+{
+    vop64_3(v1, v2, v3, env, false, vfm64, GETPC());
+}
+
+void HELPER(gvec_vfm64s)(void *v1, const void *v2, const void *v3,
+                         CPUS390XState *env, uint32_t desc)
+{
+    vop64_3(v1, v2, v3, env, true, vfm64, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 16/23] s390x/tcg: Implement VECTOR FP MULTIPLY AND (ADD|SUBTRACT)
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (14 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 15/23] s390x/tcg: Implement VECTOR FP MULTIPLY David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:42   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 17/23] s390x/tcg: Implement VECTOR FP PERFORM SIGN OPERATION David Hildenbrand
                   ` (8 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/helper.h           |  4 +++
 target/s390x/insn-data.def      |  4 +++
 target/s390x/translate_vx.inc.c | 23 +++++++++++++
 target/s390x/vec_fpu_helper.c   | 61 +++++++++++++++++++++++++++++++++
 4 files changed, 92 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 22e02a0178..bcaabb91a5 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -284,6 +284,10 @@ DEF_HELPER_FLAGS_4(gvec_vflr64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vflr64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_5(gvec_vfm64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_FLAGS_5(gvec_vfm64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_6(gvec_vfma64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_6(gvec_vfma64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_6(gvec_vfms64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_6(gvec_vfms64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index e56059ac34..e86ade9e44 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1236,6 +1236,10 @@
     F(0xe7c5, VFLR,    VRR_a, V,   0, 0, 0, 0, vcdg, 0, IF_VEC)
 /* VECTOR FP MULTIPLY */
     F(0xe7e7, VFM,     VRR_c, V,   0, 0, 0, 0, vfa, 0, IF_VEC)
+/* VECTOR FP MULTIPLY AND ADD */
+    F(0xe78f, VFMA,    VRR_e, V,   0, 0, 0, 0, vfma, 0, IF_VEC)
+/* VECTOR FP MULTIPLY AND SUBTRACT */
+    F(0xe78e, VFMS,    VRR_e, V,   0, 0, 0, 0, vfma, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index ae31a327cf..b624c7a8aa 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2704,3 +2704,26 @@ static DisasJumpType op_vfll(DisasContext *s, DisasOps *o)
                    0, fn);
     return DISAS_NEXT;
 }
+
+static DisasJumpType op_vfma(DisasContext *s, DisasOps *o)
+{
+    const uint8_t m5 = get_field(s->fields, m5);
+    const uint8_t fpf = get_field(s->fields, m6);
+    const bool se = extract32(m5, 3, 1);
+    gen_helper_gvec_4_ptr *fn;
+
+    if (fpf != FPF_LONG || extract32(m5, 0, 3)) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }
+
+    if (s->fields->op2 == 0x8f) {
+        fn = se ? gen_helper_gvec_vfma64s : gen_helper_gvec_vfma64;
+    } else {
+        fn = se ? gen_helper_gvec_vfms64s : gen_helper_gvec_vfms64;
+    }
+    gen_gvec_4_ptr(get_field(s->fields, v1), get_field(s->fields, v2),
+                   get_field(s->fields, v3), get_field(s->fields, v4), cpu_env,
+                   0, fn);
+    return DISAS_NEXT;
+}
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index fd147cc055..a27b354214 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -125,6 +125,31 @@ static void vop64_3(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
     *v1 = tmp;
 }
 
+typedef uint64_t (*vop64_4_fn)(uint64_t a, uint64_t b, uint64_t c,
+                               float_status *s);
+static void vop64_4(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
+                    const S390Vector *v4, CPUS390XState *env, bool s,
+                    vop64_4_fn fn, uintptr_t retaddr)
+{
+    uint8_t vxc, vec_exc = 0;
+    S390Vector tmp = {};
+    int i;
+
+    for (i = 0; i < 2; i++) {
+        const uint64_t a = s390_vec_read_element64(v2, i);
+        const uint64_t b = s390_vec_read_element64(v3, i);
+        const uint64_t c = s390_vec_read_element64(v4, i);
+
+        s390_vec_write_element64(&tmp, i, fn(a, b, c, &env->fpu_status));
+        vxc = check_ieee_exc(env, i, false, &vec_exc);
+        if (s || vxc) {
+            break;
+        }
+    }
+    handle_ieee_exc(env, vxc, vec_exc, retaddr);
+    *v1 = tmp;
+}
+
 static uint64_t vfa64(uint64_t a, uint64_t b, float_status *s)
 {
     return float64_val(float64_add(make_float64(a), make_float64(b), s));
@@ -508,3 +533,39 @@ void HELPER(gvec_vfm64s)(void *v1, const void *v2, const void *v3,
 {
     vop64_3(v1, v2, v3, env, true, vfm64, GETPC());
 }
+
+static uint64_t vfma64(uint64_t a, uint64_t b, uint64_t c, float_status *s)
+{
+    return float64_val(float64_muladd(make_float64(a), make_float64(b),
+                       make_float64(c), 0, s));
+}
+
+void HELPER(gvec_vfma64)(void *v1, const void *v2, const void *v3,
+                         const void *v4, CPUS390XState *env, uint32_t desc)
+{
+    vop64_4(v1, v2, v3, v4, env, false, vfma64, GETPC());
+}
+
+void HELPER(gvec_vfma64s)(void *v1, const void *v2, const void *v3,
+                         const void *v4, CPUS390XState *env, uint32_t desc)
+{
+    vop64_4(v1, v2, v3, v4, env, true, vfma64, GETPC());
+}
+
+static uint64_t vfms64(uint64_t a, uint64_t b, uint64_t c, float_status *s)
+{
+    return float64_val(float64_muladd(make_float64(a), make_float64(b),
+                       make_float64(c), float_muladd_negate_c, s));
+}
+
+void HELPER(gvec_vfms64)(void *v1, const void *v2, const void *v3,
+                         const void *v4, CPUS390XState *env, uint32_t desc)
+{
+    vop64_4(v1, v2, v3, v4, env, false, vfms64, GETPC());
+}
+
+void HELPER(gvec_vfms64s)(void *v1, const void *v2, const void *v3,
+                         const void *v4, CPUS390XState *env, uint32_t desc)
+{
+    vop64_4(v1, v2, v3, v4, env, true, vfms64, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 17/23] s390x/tcg: Implement VECTOR FP PERFORM SIGN OPERATION
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (15 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 16/23] s390x/tcg: Implement VECTOR FP MULTIPLY AND (ADD|SUBTRACT) David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:48   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 18/23] s390x/tcg: Implement VECTOR FP SQUARE ROOT David Hildenbrand
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

The only FP instruction we can implement without a helper.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/insn-data.def      |  2 ++
 target/s390x/translate_vx.inc.c | 42 +++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index e86ade9e44..fa2e801747 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1240,6 +1240,8 @@
     F(0xe78f, VFMA,    VRR_e, V,   0, 0, 0, 0, vfma, 0, IF_VEC)
 /* VECTOR FP MULTIPLY AND SUBTRACT */
     F(0xe78e, VFMS,    VRR_e, V,   0, 0, 0, 0, vfma, 0, IF_VEC)
+/* VECTOR FP PERFORM SIGN OPERATION */
+    F(0xe7cc, VFPSO,   VRR_a, V,   0, 0, 0, 0, vfpso, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index b624c7a8aa..b80d2a7a88 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2727,3 +2727,45 @@ static DisasJumpType op_vfma(DisasContext *s, DisasOps *o)
                    0, fn);
     return DISAS_NEXT;
 }
+
+static DisasJumpType op_vfpso(DisasContext *s, DisasOps *o)
+{
+    const uint8_t fpf = get_field(s->fields, m3);
+    const uint8_t m4 = get_field(s->fields, m4);
+    const uint8_t m5 = get_field(s->fields, m5);
+    const bool se = extract32(m4, 3, 1);
+    TCGv_i64 tmp;
+    int i;
+
+    if (fpf != FPF_LONG || extract32(m4, 0, 3) || m5 > 2) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }
+
+    tmp = tcg_temp_new_i64();
+    for (i = 0; i < 2; i++) {
+        read_vec_element_i64(tmp, get_field(s->fields, v2), i, ES_64);
+
+        switch (m5) {
+        case 0:
+            /* sign bit is inverted (complement) */
+            tcg_gen_xori_i64(tmp, tmp, 1ull << 63);
+            break;
+        case 1:
+            /* sign bit is set to one (negative) */
+            tcg_gen_ori_i64(tmp, tmp, 1ull << 63);
+            break;
+        case 2:
+            /* sign bit is set to zero (positive) */
+            tcg_gen_andi_i64(tmp, tmp, (1ull << 63) - 1);
+            break;
+        }
+
+        write_vec_element_i64(tmp, get_field(s->fields, v1), i, ES_64);
+        if (se) {
+            break;
+        }
+    }
+    tcg_temp_free_i64(tmp);
+    return DISAS_NEXT;
+}
-- 
2.20.1
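
To illustrate why no helper is needed here: all three sign operations are
pure bit manipulations on the binary64 pattern, so each element needs only a
single XOR, OR or AND. The following stand-alone sketch assumes the m5
encoding used in op_vfpso() above (0 = complement, 1 = negative,
2 = positive). The names sign_op(), double_to_bits() and bits_to_double()
are made up for this example and do not exist in QEMU.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIGN_BIT (1ull << 63)

/*
 * Hypothetical helper: apply the sign operation selected by m5 to the raw
 * bits of one binary64 element.
 */
static uint64_t sign_op(uint64_t bits, unsigned m5)
{
    assert(m5 <= 2); /* larger values are rejected in op_vfpso() */

    switch (m5) {
    case 0:
        return bits ^ SIGN_BIT;   /* complement the sign bit */
    case 1:
        return bits | SIGN_BIT;   /* force negative */
    default:
        return bits & ~SIGN_BIT;  /* force positive */
    }
}

/* Reinterpret a double as its 64-bit pattern without aliasing issues. */
static uint64_t double_to_bits(double d)
{
    uint64_t u;

    memcpy(&u, &d, sizeof(u));
    return u;
}

/* Reinterpret a 64-bit pattern as a double. */
static double bits_to_double(uint64_t u)
{
    double d;

    memcpy(&d, &u, sizeof(d));
    return d;
}
```

Since only the sign bit is touched, NaN payloads and zeros pass through
unchanged apart from their sign, and no IEEE exception can be raised.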




* [Qemu-devel] [PATCH v1 18/23] s390x/tcg: Implement VECTOR FP SQUARE ROOT
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (16 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 17/23] s390x/tcg: Implement VECTOR FP PERFORM SIGN OPERATION David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:50   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 19/23] s390x/tcg: Implement VECTOR FP SUBTRACT David Hildenbrand
                   ` (6 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

Simulate XxC=0 and ERM=0 (current mode), so we can use the existing
helper function.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/helper.h           |  2 ++
 target/s390x/insn-data.def      |  2 ++
 target/s390x/translate_vx.inc.c | 19 +++++++++++++++++++
 target/s390x/vec_fpu_helper.c   | 17 +++++++++++++++++
 4 files changed, 40 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index bcaabb91a5..23b37af1e4 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -288,6 +288,8 @@ DEF_HELPER_FLAGS_6(gvec_vfma64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, cptr, env
 DEF_HELPER_FLAGS_6(gvec_vfma64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, cptr, env, i32)
 DEF_HELPER_FLAGS_6(gvec_vfms64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, cptr, env, i32)
 DEF_HELPER_FLAGS_6(gvec_vfms64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vfsq64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vfsq64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index fa2e801747..354252d57c 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1242,6 +1242,8 @@
     F(0xe78e, VFMS,    VRR_e, V,   0, 0, 0, 0, vfma, 0, IF_VEC)
 /* VECTOR FP PERFORM SIGN OPERATION */
     F(0xe7cc, VFPSO,   VRR_a, V,   0, 0, 0, 0, vfpso, 0, IF_VEC)
+/* VECTOR FP SQUARE ROOT */
+    F(0xe7ce, VFSQ,    VRR_a, V,   0, 0, 0, 0, vfsq, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index b80d2a7a88..48b4e6008c 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2769,3 +2769,22 @@ static DisasJumpType op_vfpso(DisasContext *s, DisasOps *o)
     tcg_temp_free_i64(tmp);
     return DISAS_NEXT;
 }
+
+static DisasJumpType op_vfsq(DisasContext *s, DisasOps *o)
+{
+    const uint8_t fpf = get_field(s->fields, m3);
+    const uint8_t m4 = get_field(s->fields, m4);
+    gen_helper_gvec_2_ptr *fn = gen_helper_gvec_vfsq64;
+
+    if (fpf != FPF_LONG || extract32(m4, 0, 3)) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }
+
+    if (extract32(m4, 3, 1)) {
+        fn = gen_helper_gvec_vfsq64s;
+    }
+    gen_gvec_2_ptr(get_field(s->fields, v1), get_field(s->fields, v2), cpu_env,
+                   0, fn);
+    return DISAS_NEXT;
+}
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index a27b354214..a78c9dccdc 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -569,3 +569,20 @@ void HELPER(gvec_vfms64s)(void *v1, const void *v2, const void *v3,
 {
     vop64_4(v1, v2, v3, v4, env, true, vfms64, GETPC());
 }
+
+static uint64_t vfsq64(uint64_t a, float_status *s)
+{
+    return float64_val(float64_sqrt(make_float64(a), s));
+}
+
+void HELPER(gvec_vfsq64)(void *v1, const void *v2, CPUS390XState *env,
+                         uint32_t desc)
+{
+    vop64_2(v1, v2, env, false, false, 0, vfsq64, GETPC());
+}
+
+void HELPER(gvec_vfsq64s)(void *v1, const void *v2, CPUS390XState *env,
+                          uint32_t desc)
+{
+    vop64_2(v1, v2, env, true, false, 0, vfsq64, GETPC());
+}
-- 
2.20.1




* [Qemu-devel] [PATCH v1 19/23] s390x/tcg: Implement VECTOR FP SUBTRACT
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (17 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 18/23] s390x/tcg: Implement VECTOR FP SQUARE ROOT David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:51   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 20/23] s390x/tcg: Implement VECTOR FP TEST DATA CLASS IMMEDIATE David Hildenbrand
                   ` (5 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

Similar to VECTOR FP ADD.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/helper.h           |  2 ++
 target/s390x/insn-data.def      |  2 ++
 target/s390x/translate_vx.inc.c |  3 +++
 target/s390x/vec_fpu_helper.c   | 17 +++++++++++++++++
 4 files changed, 24 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 23b37af1e4..c788fc1b7f 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -290,6 +290,8 @@ DEF_HELPER_FLAGS_6(gvec_vfms64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, cptr, env
 DEF_HELPER_FLAGS_6(gvec_vfms64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vfsq64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vfsq64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_5(gvec_vfs64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_5(gvec_vfs64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index 354252d57c..4426f40250 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1244,6 +1244,8 @@
     F(0xe7cc, VFPSO,   VRR_a, V,   0, 0, 0, 0, vfpso, 0, IF_VEC)
 /* VECTOR FP SQUARE ROOT */
     F(0xe7ce, VFSQ,    VRR_a, V,   0, 0, 0, 0, vfsq, 0, IF_VEC)
+/* VECTOR FP SUBTRACT */
+    F(0xe7e2, VFS,     VRR_c, V,   0, 0, 0, 0, vfa, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index 48b4e6008c..bc75a147b6 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2566,6 +2566,9 @@ static DisasJumpType op_vfa(DisasContext *s, DisasOps *o)
     case 0xe7:
         fn = se ? gen_helper_gvec_vfm64s : gen_helper_gvec_vfm64;
         break;
+    case 0xe2:
+        fn = se ? gen_helper_gvec_vfs64s : gen_helper_gvec_vfs64;
+        break;
     default:
         g_assert_not_reached();
     }
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index a78c9dccdc..10249c5105 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -586,3 +586,20 @@ void HELPER(gvec_vfsq64s)(void *v1, const void *v2, CPUS390XState *env,
 {
     vop64_2(v1, v2, env, true, false, 0, vfsq64, GETPC());
 }
+
+static uint64_t vfs64(uint64_t a, uint64_t b, float_status *s)
+{
+    return float64_val(float64_sub(make_float64(a), make_float64(b), s));
+}
+
+void HELPER(gvec_vfs64)(void *v1, const void *v2, const void *v3,
+                        CPUS390XState *env, uint32_t desc)
+{
+    vop64_3(v1, v2, v3, env, false, vfs64, GETPC());
+}
+
+void HELPER(gvec_vfs64s)(void *v1, const void *v2, const void *v3,
+                         CPUS390XState *env, uint32_t desc)
+{
+    vop64_3(v1, v2, v3, env, true, vfs64, GETPC());
+}
-- 
2.20.1
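
The reuse of op_vfa() for multiply and subtract boils down to picking an
element operation from the second opcode byte and honouring the
single-element flag. A stand-alone sketch of that selection follows. It uses
host double arithmetic instead of softfloat, ignores IEEE exceptions, and
models only the 0xe7 and 0xe2 cases visible in the diff. pick_op() and
vec_apply() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef double (*binop_fn)(double a, double b);

static double op_mul(double a, double b) { return a * b; }
static double op_sub(double a, double b) { return a - b; }

/* Select the element operation from the second opcode byte, NULL if unknown. */
static binop_fn pick_op(uint8_t op2)
{
    switch (op2) {
    case 0xe7:
        return op_mul;
    case 0xe2:
        return op_sub;
    default:
        return NULL;
    }
}

/*
 * Apply fn to both elements, or to element 0 only when 'single' is set.
 * Elements that are not computed end up as zero, mirroring the zeroed
 * temporary in vop64_4() from patch 16.
 */
static void vec_apply(double r[2], const double a[2], const double b[2],
                      binop_fn fn, bool single)
{
    double tmp[2] = { 0.0, 0.0 };
    const int n = single ? 1 : 2;

    assert(fn != NULL);
    for (int i = 0; i < n; i++) {
        tmp[i] = fn(a[i], b[i]);
    }
    r[0] = tmp[0];
    r[1] = tmp[1];
}
```

Writing through a temporary keeps the operation correct when the destination
register is also one of the sources.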




* [Qemu-devel] [PATCH v1 20/23] s390x/tcg: Implement VECTOR FP TEST DATA CLASS IMMEDIATE
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (18 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 19/23] s390x/tcg: Implement VECTOR FP SUBTRACT David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:40   ` David Hildenbrand
  2019-05-31 17:54   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 21/23] s390x/tcg: Allow linux-user to use vector instructions David Hildenbrand
                   ` (4 subsequent siblings)
  24 siblings, 2 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

We can reuse float64_dcmask().

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/helper.h           |  2 ++
 target/s390x/insn-data.def      |  2 ++
 target/s390x/translate_vx.inc.c | 21 ++++++++++++++++++
 target/s390x/vec_fpu_helper.c   | 39 +++++++++++++++++++++++++++++++++
 4 files changed, 64 insertions(+)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index c788fc1b7f..e9aff83b05 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -292,6 +292,8 @@ DEF_HELPER_FLAGS_4(gvec_vfsq64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vfsq64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_5(gvec_vfs64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_FLAGS_5(gvec_vfs64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_4(gvec_vftci64, void, ptr, cptr, env, i32)
+DEF_HELPER_4(gvec_vftci64s, void, ptr, cptr, env, i32)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index 4426f40250..f421184fcd 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -1246,6 +1246,8 @@
     F(0xe7ce, VFSQ,    VRR_a, V,   0, 0, 0, 0, vfsq, 0, IF_VEC)
 /* VECTOR FP SUBTRACT */
     F(0xe7e2, VFS,     VRR_c, V,   0, 0, 0, 0, vfa, 0, IF_VEC)
+/* VECTOR FP TEST DATA CLASS IMMEDIATE */
+    F(0xe74a, VFTCI,   VRI_e, V,   0, 0, 0, 0, vftci, 0, IF_VEC)
 
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
index bc75a147b6..715fcb2cb5 100644
--- a/target/s390x/translate_vx.inc.c
+++ b/target/s390x/translate_vx.inc.c
@@ -2791,3 +2791,24 @@ static DisasJumpType op_vfsq(DisasContext *s, DisasOps *o)
                    0, fn);
     return DISAS_NEXT;
 }
+
+static DisasJumpType op_vftci(DisasContext *s, DisasOps *o)
+{
+    const uint16_t i3 = get_field(s->fields, i3);
+    const uint8_t fpf = get_field(s->fields, m4);
+    const uint8_t m5 = get_field(s->fields, m5);
+    gen_helper_gvec_2_ptr *fn = gen_helper_gvec_vftci64;
+
+    if (fpf != FPF_LONG || extract32(m5, 0, 3)) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }
+
+    if (extract32(m5, 3, 1)) {
+        fn = gen_helper_gvec_vftci64s;
+    }
+    gen_gvec_2_ptr(get_field(s->fields, v1), get_field(s->fields, v2), cpu_env,
+                   i3, fn);
+    set_cc_static(s);
+    return DISAS_NEXT;
+}
diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
index 10249c5105..930b6d1db4 100644
--- a/target/s390x/vec_fpu_helper.c
+++ b/target/s390x/vec_fpu_helper.c
@@ -603,3 +603,42 @@ void HELPER(gvec_vfs64s)(void *v1, const void *v2, const void *v3,
 {
     vop64_3(v1, v2, v3, env, true, vfs64, GETPC());
 }
+
+static int vftci64(S390Vector *v1, const S390Vector *v2, CPUS390XState *env,
+                   bool s, uint16_t i3)
+{
+    int i, match = 0;
+
+    for (i = 0; i < 2; i++) {
+        float64 a = make_float64(s390_vec_read_element64(v2, i));
+
+        if (float64_dcmask(env, a) & i3) {
+            match++;
+            s390_vec_write_element64(v1, i, -1ull);
+        } else {
+            s390_vec_write_element64(v1, i, 0);
+        }
+        if (s) {
+            break;
+        }
+    }
+
+    if (match == (s ? 1 : 2)) {
+        return 0;
+    } else if (match) {
+        return 1;
+    }
+    return 3;
+}
+
+void HELPER(gvec_vftci64)(void *v1, const void *v2, CPUS390XState *env,
+                          uint32_t desc)
+{
+    env->cc_op = vftci64(v1, v2, env, false, simd_data(desc));
+}
+
+void HELPER(gvec_vftci64s)(void *v1, const void *v2, CPUS390XState *env,
+                           uint32_t desc)
+{
+    env->cc_op = vftci64(v1, v2, env, true, simd_data(desc));
+}
-- 
2.20.1
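
To make the condition-code rule concrete: each processed element is
classified, the class bit is tested against the immediate mask, and the cc
reports whether all (0), some (1) or none (3) of the processed elements
matched. The sketch below classifies directly from the raw binary64 bits.
The mask layout (bit 11 = +zero down to bit 0 = -signaling NaN) is my
understanding of what float64_dcmask() returns and should be treated as an
assumption. dcmask64() and tdc_cc() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed class order; each class owns two mask bits (positive, negative). */
enum {
    CLS_ZERO = 0,
    CLS_NORMAL = 2,
    CLS_SUBNORMAL = 4,
    CLS_INF = 6,
    CLS_QNAN = 8,
    CLS_SNAN = 10,
};

/* Return the single class bit of a binary64 value given as raw bits. */
static uint16_t dcmask64(uint64_t bits)
{
    const bool neg = bits >> 63;
    const unsigned exp = (bits >> 52) & 0x7ff;
    const uint64_t frac = bits & ((1ull << 52) - 1);
    int cls;

    if (exp == 0) {
        cls = frac ? CLS_SUBNORMAL : CLS_ZERO;
    } else if (exp == 0x7ff) {
        if (!frac) {
            cls = CLS_INF;
        } else {
            /* the top fraction bit distinguishes quiet from signaling */
            cls = (frac >> 51) ? CLS_QNAN : CLS_SNAN;
        }
    } else {
        cls = CLS_NORMAL;
    }
    return 1u << (11 - cls - neg);
}

/*
 * Test one (single) or two elements against the 12-bit mask i3, write an
 * all-ones or all-zeroes result per processed element and return the cc.
 */
static int tdc_cc(const uint64_t v[2], uint16_t i3, bool single,
                  uint64_t out[2])
{
    const int n = single ? 1 : 2;
    int match = 0;

    assert(i3 < (1u << 12));
    for (int i = 0; i < n; i++) {
        if (dcmask64(v[i]) & i3) {
            match++;
            out[i] = ~0ull;
        } else {
            out[i] = 0;
        }
    }
    if (match == n) {
        return 0;           /* every processed element matched */
    }
    return match ? 1 : 3;   /* some matched, or none */
}
```

Comparing the match count against the number of processed elements, rather
than against the loop counter, avoids depending on where the loop stopped.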




* [Qemu-devel] [PATCH v1 21/23] s390x/tcg: Allow linux-user to use vector instructions
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (19 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 20/23] s390x/tcg: Implement VECTOR FP TEST DATA CLASS IMMEDIATE David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:54   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 22/23] s390x/tcg: We support the Vector Facility David Hildenbrand
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

Once we unlock S390_FEAT_VECTOR for TCG, we want linux-user to be
able to make use of it.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/cpu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index b1df63d82c..6af1a1530f 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -145,6 +145,9 @@ static void s390_cpu_full_reset(CPUState *s)
 #if defined(CONFIG_USER_ONLY)
     /* user mode should always be allowed to use the full FPU */
     env->cregs[0] |= CR0_AFP;
+    if (s390_has_feat(S390_FEAT_VECTOR)) {
+        env->cregs[0] |= CR0_VECTOR;
+    }
 #endif
 
     /* architectured initial value for Breaking-Event-Address register */
-- 
2.20.1




* [Qemu-devel] [PATCH v1 22/23] s390x/tcg: We support the Vector Facility
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (20 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 21/23] s390x/tcg: Allow linux-user to use vector instructions David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:55   ` Richard Henderson
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 23/23] s390x: Bump the "qemu" CPU model up to a stripped-down z13 David Hildenbrand
                   ` (2 subsequent siblings)
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

Let's add it to the max model, so we can enable it.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/gen-features.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
index c346b76bdf..a818c80332 100644
--- a/target/s390x/gen-features.c
+++ b/target/s390x/gen-features.c
@@ -702,6 +702,7 @@ static uint16_t qemu_LATEST[] = {
 static uint16_t qemu_MAX[] = {
     /* z13+ features */
     S390_FEAT_STFLE_53,
+    S390_FEAT_VECTOR,
     /* generates a dependency warning, leave it out for now */
     S390_FEAT_MSA_EXT_5,
 };
-- 
2.20.1




* [Qemu-devel] [PATCH v1 23/23] s390x: Bump the "qemu" CPU model up to a stripped-down z13
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (21 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 22/23] s390x/tcg: We support the Vector Facility David Hildenbrand
@ 2019-05-31 10:44 ` David Hildenbrand
  2019-05-31 17:57   ` Richard Henderson
  2019-05-31 10:47 ` [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
  2019-07-19  9:51 ` Aleksandar Markovic
  24 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko, David Hildenbrand

We don't care about the other two missing base features:
- S390_FEAT_DFP_PACKED_CONVERSION
- S390_FEAT_GROUP_GEN13_PTFF

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/s390x/s390-virtio-ccw.c  |  2 ++
 target/s390x/cpu_models.c   |  4 ++--
 target/s390x/gen-features.c | 11 +++++++----
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index bbc6e8fa0b..4d643686cb 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -669,7 +669,9 @@ DEFINE_CCW_MACHINE(4_1, "4.1", true);
 
 static void ccw_machine_4_0_instance_options(MachineState *machine)
 {
+    static const S390FeatInit qemu_cpu_feat = { S390_FEAT_LIST_QEMU_V4_0 };
     ccw_machine_4_1_instance_options(machine);
+    s390_set_qemu_cpu_model(0x2827, 12, 2, qemu_cpu_feat);
 }
 
 static void ccw_machine_4_0_class_options(MachineClass *mc)
diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 21ea819483..b5d16e4c89 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -86,8 +86,8 @@ static S390CPUDef s390_cpu_defs[] = {
     CPUDEF_INIT(0x8562, 15, 1, 47, 0x08000000U, "gen15b", "IBM 8562 GA1"),
 };
 
-#define QEMU_MAX_CPU_TYPE 0x2827
-#define QEMU_MAX_CPU_GEN 12
+#define QEMU_MAX_CPU_TYPE 0x2964
+#define QEMU_MAX_CPU_GEN 13
 #define QEMU_MAX_CPU_EC_GA 2
 static const S390FeatInit qemu_max_cpu_feat_init = { S390_FEAT_LIST_QEMU_MAX };
 static S390FeatBitmap qemu_max_cpu_feat;
diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
index a818c80332..dc320a06c2 100644
--- a/target/s390x/gen-features.c
+++ b/target/s390x/gen-features.c
@@ -689,7 +689,7 @@ static uint16_t qemu_V3_1[] = {
     S390_FEAT_MSA_EXT_4,
 };
 
-static uint16_t qemu_LATEST[] = {
+static uint16_t qemu_V4_0[] = {
     /*
      * Only BFP bits are implemented (HFP, DFP, PFPO and DIVIDE TO INTEGER not
      * implemented yet).
@@ -698,11 +698,13 @@ static uint16_t qemu_LATEST[] = {
     S390_FEAT_ZPCI,
 };
 
-/* add all new definitions before this point */
-static uint16_t qemu_MAX[] = {
-    /* z13+ features */
+static uint16_t qemu_LATEST[] = {
     S390_FEAT_STFLE_53,
     S390_FEAT_VECTOR,
+};
+
+/* add all new definitions before this point */
+static uint16_t qemu_MAX[] = {
     /* generates a dependency warning, leave it out for now */
     S390_FEAT_MSA_EXT_5,
 };
@@ -821,6 +823,7 @@ static FeatGroupDefSpec FeatGroupDef[] = {
 static FeatGroupDefSpec QemuFeatDef[] = {
     QEMU_FEAT_INITIALIZER(V2_11),
     QEMU_FEAT_INITIALIZER(V3_1),
+    QEMU_FEAT_INITIALIZER(V4_0),
     QEMU_FEAT_INITIALIZER(LATEST),
     QEMU_FEAT_INITIALIZER(MAX),
 };
-- 
2.20.1




* Re: [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (22 preceding siblings ...)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 23/23] s390x: Bump the "qemu" CPU model up to a stripped-down z13 David Hildenbrand
@ 2019-05-31 10:47 ` David Hildenbrand
  2019-07-19  9:51 ` Aleksandar Markovic
  24 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 10:47 UTC (permalink / raw)
  To: qemu-devel
  Cc: Christian Borntraeger, Thomas Huth, Cornelia Huck,
	Denys Vlasenko, Richard Henderson

On 31.05.19 12:44, David Hildenbrand wrote:
> This is the final part of vector instruction support for s390x. It is based
> on part 2, for which I will send a pull request to Conny soon.
> 
> Part 1: Vector Support Instructions
> Part 2: Vector Integer Instructions
> Part 3: Vector String Instructions
> Part 4: Vector Floating-Point Instructions
> 
> The current state can be found at (kept updated):
>     https://github.com/davidhildenbrand/qemu/tree/vx
> 
> It is based on:
> - [PATCH v2 0/5] s390x/tcg: Vector Instruction Support Part 3
> - [PATCH v1 0/2] s390x: Fix vector register alignment
> 
> With the current state I can boot a Linux kernel + user space compiled with
> SIMD support. This allows booting distributions compiled exclusively for
> z13, requiring SIMD support. Also, it is now possible to build a complete
> kernel using rpmbuild, as quite a few issues have been sorted out.
> 
> While the current state works fine for me with RHEL 8, I am experiencing
> some issues with newer userspace versions (I suspect glibc). I'll have
> to look into the details first - it could be a bug in a non-vector
> instruction or a bug in a vector instruction that was unused until now.
> 
> In this part, all Vector Floating-Point Instructions introduced with the
> "Vector Facility" are added. Also, the "qemu" model is changed to a
> z13 machine.
> 
> David Hildenbrand (23):
>   s390x: Use uint64_t for vector registers
>   s390x/tcg: Introduce tcg_s390_vector_exception()
>   s390x/tcg: Export float_comp_to_cc() and float(32|64|128)_dcmask()
>   s390x/tcg: Implement VECTOR FP ADD
>   s390x/tcg: Implement VECTOR FP COMPARE (AND SIGNAL) SCALAR
>   s390x/tcg: Implement VECTOR FP COMPARE (EQUAL|HIGH|HIGH OR EQUAL)
>   s390x/tcg: Implement VECTOR FP CONVERT FROM FIXED 64-BIT
>   s390x/tcg: Implement VECTOR FP CONVERT FROM LOGICAL 64-BIT
>   s390x/tcg: Implement VECTOR FP CONVERT TO FIXED 64-BIT
>   s390x/tcg: Implement VECTOR FP CONVERT TO LOGICAL 64-BIT
>   s390x/tcg: Implement VECTOR FP DIVIDE
>   s390x/tcg: Implement VECTOR LOAD FP INTEGER
>   s390x/tcg: Implement VECTOR LOAD LENGTHENED
>   s390x/tcg: Implement VECTOR LOAD ROUNDED
>   s390x/tcg: Implement VECTOR FP MULTIPLY
>   s390x/tcg: Implement VECTOR FP MULTIPLY AND (ADD|SUBTRACT)
>   s390x/tcg: Implement VECTOR FP PERFORM SIGN OPERATION
>   s390x/tcg: Implement VECTOR FP SQUARE ROOT
>   s390x/tcg: Implement VECTOR FP SUBTRACT
>   s390x/tcg: Implement VECTOR FP TEST DATA CLASS IMMEDIATE
>   s390x/tcg: Allow linux-user to use vector instructions
>   s390x/tcg: We support the Vector Facility
>   s390x: Bump the "qemu" CPU model up to a stripped-down z13
> 
>  hw/s390x/s390-virtio-ccw.c      |   2 +
>  linux-user/s390x/signal.c       |   4 +-
>  target/s390x/Makefile.objs      |   1 +
>  target/s390x/arch_dump.c        |   8 +-
>  target/s390x/cpu.c              |   3 +
>  target/s390x/cpu.h              |   5 +-
>  target/s390x/cpu_models.c       |   4 +-
>  target/s390x/excp_helper.c      |  21 +-
>  target/s390x/fpu_helper.c       |   4 +-
>  target/s390x/gdbstub.c          |  16 +-
>  target/s390x/gen-features.c     |  10 +-
>  target/s390x/helper.c           |  10 +-
>  target/s390x/helper.h           |  46 +++
>  target/s390x/insn-data.def      |  45 +++
>  target/s390x/internal.h         |   4 +
>  target/s390x/kvm.c              |  16 +-
>  target/s390x/machine.c          | 128 +++----
>  target/s390x/tcg_s390x.h        |   2 +
>  target/s390x/translate.c        |   2 +-
>  target/s390x/translate_vx.inc.c | 274 ++++++++++++++
>  target/s390x/vec_fpu_helper.c   | 644 ++++++++++++++++++++++++++++++++
>  21 files changed, 1145 insertions(+), 104 deletions(-)
>  create mode 100644 target/s390x/vec_fpu_helper.c
> 

Nasty git "-identity" + manual "-cc" collision.

CC'ing some more people.

-- 

Thanks,

David / dhildenb



* Re: [Qemu-devel] [PATCH v1 04/23] s390x/tcg: Implement VECTOR FP ADD
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 04/23] s390x/tcg: Implement VECTOR FP ADD David Hildenbrand
@ 2019-05-31 15:54   ` Richard Henderson
  2019-05-31 16:26     ` David Hildenbrand
  2019-05-31 16:30   ` Richard Henderson
  1 sibling, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 15:54 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> +static uint64_t vfa64(uint64_t a, uint64_t b, float_status *s)
> +{
> +    return float64_val(float64_add(make_float64(a), make_float64(b), s));
> +}


You don't need either make_float64 or float64_val.
I've been intending to strip them out entirely; we
don't need to add new uses.


r~



* Re: [Qemu-devel] [PATCH v1 04/23] s390x/tcg: Implement VECTOR FP ADD
  2019-05-31 15:54   ` Richard Henderson
@ 2019-05-31 16:26     ` David Hildenbrand
  0 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 16:26 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 31.05.19 17:54, Richard Henderson wrote:
> On 5/31/19 5:44 AM, David Hildenbrand wrote:
>> +static uint64_t vfa64(uint64_t a, uint64_t b, float_status *s)
>> +{
>> +    return float64_val(float64_add(make_float64(a), make_float64(b), s));
>> +}
> 
> 
> You don't need either make_float64 or float64_val.
> I've been intending to strip them out entirely; we
> don't need to add new uses.

Makes sense, I added them for consistency - will remove them.


-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 04/23] s390x/tcg: Implement VECTOR FP ADD
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 04/23] s390x/tcg: Implement VECTOR FP ADD David Hildenbrand
  2019-05-31 15:54   ` Richard Henderson
@ 2019-05-31 16:30   ` Richard Henderson
  1 sibling, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 16:30 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> +void HELPER(gvec_vfa64)(void *v1, const void *v2, const void *v3,
> +                        CPUS390XState *env, uint32_t desc)
> +{
> +    vop64_3(v1, v2, v3, env, false, vfa64, GETPC());
> +}

Given that make_float64 is banished, I guess you can pass float64_add here
directly.
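
A minimal sketch of what passing the add routine directly could look like.
Everything prefixed demo_ is a stand-in invented for illustration (host
double arithmetic instead of softfloat, plain arrays instead of S390Vector),
and zeroing the unused element in the single-element case is an assumption
of the sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the softfloat types; assumptions for illustration only. */
typedef uint64_t demo_float64;
typedef struct { int flags; } demo_float_status;

typedef demo_float64 (*demo_vop64_3_fn)(demo_float64 a, demo_float64 b,
                                        demo_float_status *s);

/* Hypothetical stand-in for float64_add, using host double arithmetic. */
static demo_float64 demo_float64_add(demo_float64 a, demo_float64 b,
                                     demo_float_status *s)
{
    double x, y, r;
    demo_float64 out;

    (void)s;
    memcpy(&x, &a, sizeof(x));
    memcpy(&y, &b, sizeof(y));
    r = x + y;
    memcpy(&out, &r, sizeof(out));
    return out;
}

/* Apply fn to both 64-bit elements, or only to element 0 if se is set. */
static void demo_vop64_3(uint64_t v1[2], const uint64_t v2[2],
                         const uint64_t v3[2], demo_float_status *st,
                         int se, demo_vop64_3_fn fn)
{
    uint64_t tmp[2] = { 0, 0 };
    int i;

    assert(fn != NULL);
    for (i = 0; i < 2; i++) {
        /* No wrapper needed: the raw 64-bit value is the float64 value. */
        tmp[i] = fn(v2[i], v3[i], st);
        if (se) {
            break;
        }
    }
    v1[0] = tmp[0];
    v1[1] = tmp[1];
}
```

A helper would then call demo_vop64_3(v1, v2, v3, &st, 0, demo_float64_add)
without a per-instruction wrapper such as vfa64().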

With that,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 05/23] s390x/tcg: Implement VECTOR FP COMPARE (AND SIGNAL) SCALAR
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 05/23] s390x/tcg: Implement VECTOR FP COMPARE (AND SIGNAL) SCALAR David Hildenbrand
@ 2019-05-31 16:33   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 16:33 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> As far as I can see, there is only a tiny difference.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h           |  2 ++
>  target/s390x/insn-data.def      |  4 ++++
>  target/s390x/translate_vx.inc.c | 21 +++++++++++++++++++++
>  target/s390x/vec_fpu_helper.c   | 32 ++++++++++++++++++++++++++++++++
>  4 files changed, 59 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 06/23] s390x/tcg: Implement VECTOR FP COMPARE (EQUAL|HIGH|HIGH OR EQUAL)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 06/23] s390x/tcg: Implement VECTOR FP COMPARE (EQUAL|HIGH|HIGH OR EQUAL) David Hildenbrand
@ 2019-05-31 16:53   ` Richard Henderson
  2019-05-31 17:18     ` David Hildenbrand
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 16:53 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> +static int vfc64(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
> +                 CPUS390XState *env, bool s, bool test_equal, bool test_high,
> +                 uintptr_t retaddr)
> +{
> +    uint8_t vxc, vec_exc = 0;
> +    S390Vector tmp = {};
> +    int match = 0;
> +    int i;
> +
> +    for (i = 0; i < 2; i++) {
> +        const float64 a = make_float64(s390_vec_read_element64(v2, i));
> +        const float64 b = make_float64(s390_vec_read_element64(v3, i));
> +        const int cmp = float64_compare_quiet(a, b, &env->fpu_status);
> +
> +        if ((cmp == float_relation_equal && test_equal) ||
> +            (cmp == float_relation_greater && test_high)) {

It might be easier to pass in the comparison function instead of test_equal and
test_high (float64_eq_quiet, float64_lt_quiet) and swap the arguments to turn
lt into gt (not affecting eq).

This will let you pass float64_eq and float64_lt when it comes time to support
the SQ bit for the vector-enhancements facility 1.

Otherwise you'll have 3 bools passed in and a bit of a mess here.
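
A rough sketch of the idea, using host doubles and demo_ names that are
invented for illustration (the real code would pass float64_eq_quiet and
float64_lt_quiet; the "less or equal" callback for HIGH OR EQUAL is an
assumption of this sketch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical comparison callback; the real code would take float64. */
typedef bool (*demo_cmp_fn)(double a, double b);

static bool demo_eq_quiet(double a, double b) { return a == b; }
static bool demo_lt_quiet(double a, double b) { return a < b; }
static bool demo_le_quiet(double a, double b) { return a <= b; }

/* Count elements for which fn(a[i], b[i]) holds; se stops after element 0. */
static int demo_count_matches(const double a[2], const double b[2], bool se,
                              demo_cmp_fn fn)
{
    int i, match = 0;

    assert(fn != NULL);
    for (i = 0; i < 2; i++) {
        if (fn(a[i], b[i])) {
            match++;
        }
        if (se) {
            break;
        }
    }
    return match;
}

/* COMPARE EQUAL: operand order does not matter. */
static int demo_vfce(const double v2[2], const double v3[2], bool se)
{
    return demo_count_matches(v2, v3, se, demo_eq_quiet);
}

/* COMPARE HIGH: v2 > v3 is the same as v3 < v2, so swap the operands. */
static int demo_vfch(const double v2[2], const double v3[2], bool se)
{
    return demo_count_matches(v3, v2, se, demo_lt_quiet);
}

/* COMPARE HIGH OR EQUAL: v2 >= v3 is the same as v3 <= v2. */
static int demo_vfche(const double v2[2], const double v3[2], bool se)
{
    return demo_count_matches(v3, v2, se, demo_le_quiet);
}
```

Supporting a signaling variant later would then only mean passing a
different callback, with no additional bool parameter.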


> +    if (match == i + 1) {
> +        return 0;

This doesn't look right.  How can match == 3, with i == 2,
when not exiting the loop early?

The vxc case is handled via longjmp, I think,
which leaves the S case to handle here.

Perhaps better as

	if (match) {
	    return s || match == 2 ? 0 : 1;
	}
	return 3;
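
Spelled out as a standalone function (a sketch only; demo_vfc_cc is a
made-up name, and match is assumed to be the number of matching elements,
0..2):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Condition code for the compare instructions:
 *   0 - all compared elements matched
 *   1 - some, but not all, elements matched
 *   3 - no element matched
 * With s set, only element 0 is compared, so one match means "all".
 */
static int demo_vfc_cc(int match, bool s)
{
    assert(match >= 0 && match <= 2);
    if (match) {
        return (s || match == 2) ? 0 : 1;
    }
    return 3;
}
```

The original check compares match against i + 1; after a full loop i is 2,
so the "all elements matched" case can never be reported unless the loop
exits early.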


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 07/23] s390x/tcg: Implement VECTOR FP CONVERT FROM FIXED 64-BIT
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 07/23] s390x/tcg: Implement VECTOR FP CONVERT FROM FIXED 64-BIT David Hildenbrand
@ 2019-05-31 17:10   ` Richard Henderson
  2019-05-31 17:15     ` Richard Henderson
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:10 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> +static DisasJumpType op_vcdg(DisasContext *s, DisasOps *o)
> +{
> +    const uint8_t fpf = get_field(s->fields, m3);
> +    const uint8_t m4 = get_field(s->fields, m4);
> +    const uint8_t erm = get_field(s->fields, m5);
> +    const bool se = extract32(m4, 3, 1);
> +    gen_helper_gvec_2_ptr *fn;
> +
> +    if (fpf != FPF_LONG || extract32(m4, 0, 2) || erm > 7 || erm == 2) {

Please split out the erm validity check.
We have fpinst_extract_m34 doing some of this now;
it would be a shame to replicate it more.


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 07/23] s390x/tcg: Implement VECTOR FP CONVERT FROM FIXED 64-BIT
  2019-05-31 17:10   ` Richard Henderson
@ 2019-05-31 17:15     ` Richard Henderson
  2019-05-31 17:16       ` David Hildenbrand
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:15 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 12:10 PM, Richard Henderson wrote:
> On 5/31/19 5:44 AM, David Hildenbrand wrote:
>> +static DisasJumpType op_vcdg(DisasContext *s, DisasOps *o)
>> +{
>> +    const uint8_t fpf = get_field(s->fields, m3);
>> +    const uint8_t m4 = get_field(s->fields, m4);
>> +    const uint8_t erm = get_field(s->fields, m5);
>> +    const bool se = extract32(m4, 3, 1);
>> +    gen_helper_gvec_2_ptr *fn;
>> +
>> +    if (fpf != FPF_LONG || extract32(m4, 0, 2) || erm > 7 || erm == 2) {
> 
> Please split out the erm validity check.
> We have fpinst_extract_m34 doing some of this now;
> it would be a shame to replicate it more.

Hmm.  Or perhaps you aren't replicating it because it's only used by these
conversions, and both signed and unsigned go through this same function?

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 08/23] s390x/tcg: Implement VECTOR FP CONVERT FROM LOGICAL 64-BIT
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 08/23] s390x/tcg: Implement VECTOR FP CONVERT FROM LOGICAL 64-BIT David Hildenbrand
@ 2019-05-31 17:15   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:15 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h           |  2 ++
>  target/s390x/insn-data.def      |  2 ++
>  target/s390x/translate_vx.inc.c |  3 +++
>  target/s390x/vec_fpu_helper.c   | 23 +++++++++++++++++++++++
>  4 files changed, 30 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 07/23] s390x/tcg: Implement VECTOR FP CONVERT FROM FIXED 64-BIT
  2019-05-31 17:15     ` Richard Henderson
@ 2019-05-31 17:16       ` David Hildenbrand
  0 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 17:16 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 31.05.19 19:15, Richard Henderson wrote:
> On 5/31/19 12:10 PM, Richard Henderson wrote:
>> On 5/31/19 5:44 AM, David Hildenbrand wrote:
>>> +static DisasJumpType op_vcdg(DisasContext *s, DisasOps *o)
>>> +{
>>> +    const uint8_t fpf = get_field(s->fields, m3);
>>> +    const uint8_t m4 = get_field(s->fields, m4);
>>> +    const uint8_t erm = get_field(s->fields, m5);
>>> +    const bool se = extract32(m4, 3, 1);
>>> +    gen_helper_gvec_2_ptr *fn;
>>> +
>>> +    if (fpf != FPF_LONG || extract32(m4, 0, 2) || erm > 7 || erm == 2) {
>>
>> Please split out the erm validity check.
>> We have fpinst_extract_m34 doing some of this now;
>> it would be a shame to replicate it more.
> 
> Hmm.  Or perhaps you aren't replicating it because it's only used by these
> conversions, and both signed and unsigned go through this same function?

Right, the check is only in one place in this file. Thanks!

> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> 
> 
> r~
> 
> 


-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 09/23] s390x/tcg: Implement VECTOR FP CONVERT TO FIXED 64-BIT
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 09/23] s390x/tcg: Implement VECTOR FP CONVERT TO FIXED 64-BIT David Hildenbrand
@ 2019-05-31 17:17   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:17 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h           |  2 ++
>  target/s390x/insn-data.def      |  2 ++
>  target/s390x/translate_vx.inc.c |  3 +++
>  target/s390x/vec_fpu_helper.c   | 23 +++++++++++++++++++++++
>  4 files changed, 30 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 10/23] s390x/tcg: Implement VECTOR FP CONVERT TO LOGICAL 64-BIT
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 10/23] s390x/tcg: Implement VECTOR FP CONVERT TO LOGICAL 64-BIT David Hildenbrand
@ 2019-05-31 17:18   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:18 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h           |  2 ++
>  target/s390x/insn-data.def      |  2 ++
>  target/s390x/translate_vx.inc.c |  3 +++
>  target/s390x/vec_fpu_helper.c   | 23 +++++++++++++++++++++++
>  4 files changed, 30 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 06/23] s390x/tcg: Implement VECTOR FP COMPARE (EQUAL|HIGH|HIGH OR EQUAL)
  2019-05-31 16:53   ` Richard Henderson
@ 2019-05-31 17:18     ` David Hildenbrand
  0 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 17:18 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 31.05.19 18:53, Richard Henderson wrote:
> On 5/31/19 5:44 AM, David Hildenbrand wrote:
>> +static int vfc64(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
>> +                 CPUS390XState *env, bool s, bool test_equal, bool test_high,
>> +                 uintptr_t retaddr)
>> +{
>> +    uint8_t vxc, vec_exc = 0;
>> +    S390Vector tmp = {};
>> +    int match = 0;
>> +    int i;
>> +
>> +    for (i = 0; i < 2; i++) {
>> +        const float64 a = make_float64(s390_vec_read_element64(v2, i));
>> +        const float64 b = make_float64(s390_vec_read_element64(v3, i));
>> +        const int cmp = float64_compare_quiet(a, b, &env->fpu_status);
>> +
>> +        if ((cmp == float_relation_equal && test_equal) ||
>> +            (cmp == float_relation_greater && test_high)) {
> 
> It might be easier to pass in the comparison function instead of test_equal and
> test_high (float64_eq_quiet, float64_lt_quiet) and swap the arguments to turn
> lt into gt (not affecting eq).
> 
> This will let you pass float64_eq and float64_lt when it comes time to support
> the SQ bit for the vector-enhancements facility 1.
> 
> Otherwise you'll have 3 bools passed in and a bit of a mess here.

Very good idea!

> 
> 
>> +    if (match == i + 1) {
>> +        return 0;
> 
> This doesn't look right.  How can match == 3, with i == 2,
> when not exiting the loop early?
> 
> The vxc case is handled via longjmp, I think,
> which leaves the S case to handle here.
> 
> Perhaps better as
> 
> 	if (match) {
> 	    return s || match == 2 ? 0 : 1;
> 	}
> 	return 3;

Yes indeed, thanks for catching this.

> 
> 
> r~
> 


-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 11/23] s390x/tcg: Implement VECTOR FP DIVIDE
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 11/23] s390x/tcg: Implement VECTOR FP DIVIDE David Hildenbrand
@ 2019-05-31 17:25   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:25 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> We can reuse most of the infrastructure added for VECTOR FP ADD.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h           |  2 ++
>  target/s390x/insn-data.def      |  2 ++
>  target/s390x/translate_vx.inc.c |  3 +++
>  target/s390x/vec_fpu_helper.c   | 17 +++++++++++++++++
>  4 files changed, 24 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 12/23] s390x/tcg: Implement VECTOR LOAD FP INTEGER
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 12/23] s390x/tcg: Implement VECTOR LOAD FP INTEGER David Hildenbrand
@ 2019-05-31 17:26   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:26 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> We can reuse most of the infrastructure introduced for
> VECTOR FP CONVERT FROM FIXED 64-BIT and friends.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h           |  2 ++
>  target/s390x/insn-data.def      |  2 ++
>  target/s390x/translate_vx.inc.c |  3 +++
>  target/s390x/vec_fpu_helper.c   | 23 +++++++++++++++++++++++
>  4 files changed, 30 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 13/23] s390x/tcg: Implement VECTOR LOAD LENGTHENED
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 13/23] s390x/tcg: Implement VECTOR LOAD LENGTHENED David Hildenbrand
@ 2019-05-31 17:33   ` Richard Henderson
  2019-05-31 17:35     ` David Hildenbrand
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:33 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> +    for (i = 0; i < 2; i++) {
> +        /* load from even element */
> +        const float32 a = make_float32(s390_vec_read_element32(v2, i * 2));

I suppose.

You could also reuse vop64_2 with

static uint64_t vfll(uint64_t a, float_status *s)
{
    /* Even float32 are stored in the high half of each doubleword.  */
    return float32_to_float64(a >> 32, s);
}
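
To illustrate the layout that the shift relies on, here is a small
host-side sketch. The demo_ names are invented, and a host float/double
conversion stands in for float32_to_float64:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Each 64-bit doubleword holds two 32-bit elements. With big-endian
 * element numbering, the even-numbered element is the high half.
 */
static uint32_t demo_even_float32_bits(uint64_t dw)
{
    return (uint32_t)(dw >> 32);
}

/* Lengthen the even float32 element of a doubleword to a double. */
static double demo_lengthen(uint64_t dw)
{
    uint32_t bits = demo_even_float32_bits(dw);
    float f;

    assert(sizeof(f) == sizeof(bits));
    memcpy(&f, &bits, sizeof(f));
    return (double)f;
}
```

For a doubleword 0x3F80000040000000 the high half 0x3F800000 (1.0f) is
converted and the low half (2.0f) is ignored.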


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 13/23] s390x/tcg: Implement VECTOR LOAD LENGTHENED
  2019-05-31 17:33   ` Richard Henderson
@ 2019-05-31 17:35     ` David Hildenbrand
  2019-05-31 17:36       ` Richard Henderson
  0 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 17:35 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 31.05.19 19:33, Richard Henderson wrote:
> On 5/31/19 5:44 AM, David Hildenbrand wrote:
>> +    for (i = 0; i < 2; i++) {
>> +        /* load from even element */
>> +        const float32 a = make_float32(s390_vec_read_element32(v2, i * 2));
> 
> I suppose.
> 
> You could also reuse vop64_2 with
> 
> static uint64_t vfll(uint64_t a, float_status *s)
> {
>     /* Even float32 are stored in the high half of each doubleword.  */
>     return float32_to_float64(a >> 32, s);
> }
> 

Then, I wouldn't be able to indicate the correct element index on
exceptions via the vex (has to be the 32-bit index).

Thanks!

> 
> r~
> 


-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 13/23] s390x/tcg: Implement VECTOR LOAD LENGTHENED
  2019-05-31 17:35     ` David Hildenbrand
@ 2019-05-31 17:36       ` Richard Henderson
  2019-05-31 17:38         ` David Hildenbrand
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:36 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 12:35 PM, David Hildenbrand wrote:
> On 31.05.19 19:33, Richard Henderson wrote:
>> On 5/31/19 5:44 AM, David Hildenbrand wrote:
>>> +    for (i = 0; i < 2; i++) {
>>> +        /* load from even element */
>>> +        const float32 a = make_float32(s390_vec_read_element32(v2, i * 2));
>>
>> I suppose.
>>
>> You could also reuse vop64_2 with
>>
>> static uint64_t vfll(uint64_t a, float_status *s)
>> {
>>     /* Even float32 are stored in the high half of each doubleword.  */
>>     return float32_to_float64(a >> 32, s);
>> }
>>
> 
> Then, I wouldn't be able to indicate the correct element index on
> exceptions via the vex (has to be the 32-bit index).

Ah, tricky.  Missed that detail.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 14/23] s390x/tcg: Implement VECTOR LOAD ROUNDED
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 14/23] s390x/tcg: Implement VECTOR LOAD ROUNDED David Hildenbrand
@ 2019-05-31 17:37   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:37 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> We can reuse some of the infrastructure introduced for
> VECTOR FP CONVERT FROM FIXED 64-BIT and friends.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h           |  2 ++
>  target/s390x/insn-data.def      |  2 ++
>  target/s390x/translate_vx.inc.c |  3 +++
>  target/s390x/vec_fpu_helper.c   | 43 +++++++++++++++++++++++++++++++++
>  4 files changed, 50 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 15/23] s390x/tcg: Implement VECTOR FP MULTIPLY
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 15/23] s390x/tcg: Implement VECTOR FP MULTIPLY David Hildenbrand
@ 2019-05-31 17:37   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:37 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> Very similar to VECTOR FP DIVIDE.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h           |  2 ++
>  target/s390x/insn-data.def      |  2 ++
>  target/s390x/translate_vx.inc.c |  3 +++
>  target/s390x/vec_fpu_helper.c   | 17 +++++++++++++++++
>  4 files changed, 24 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 13/23] s390x/tcg: Implement VECTOR LOAD LENGTHENED
  2019-05-31 17:36       ` Richard Henderson
@ 2019-05-31 17:38         ` David Hildenbrand
  0 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 17:38 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 31.05.19 19:36, Richard Henderson wrote:
> On 5/31/19 12:35 PM, David Hildenbrand wrote:
>> On 31.05.19 19:33, Richard Henderson wrote:
>>> On 5/31/19 5:44 AM, David Hildenbrand wrote:
>>>> +    for (i = 0; i < 2; i++) {
>>>> +        /* load from even element */
>>>> +        const float32 a = make_float32(s390_vec_read_element32(v2, i * 2));
>>>
>>> I suppose.
>>>
>>> You could also reuse vop64_2 with
>>>
>>> static uint64_t vfll(uint64_t a, float_status *s)
>>> {
>>>     /* Even float32 are stored in the high half of each doubleword.  */
>>>     return float32_to_float64(a >> 32, s);
>>> }
>>>
>>
>> Then, I wouldn't be able to indicate the correct element index on
>> exceptions via the vex (has to be the 32-bit index).

I guess we can later handle all type conversions (lower -> bigger) via
this single function (passing the source size). Then we only need a
second function for the other direction.

Thanks!

> 
> Ah, tricky.  Missed that detail.
> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> 
> 
> r~
> 


-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 20/23] s390x/tcg: Implement VECTOR FP TEST DATA CLASS IMMEDIATE
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 20/23] s390x/tcg: Implement VECTOR FP TEST DATA CLASS IMMEDIATE David Hildenbrand
@ 2019-05-31 17:40   ` David Hildenbrand
  2019-05-31 17:54   ` Richard Henderson
  1 sibling, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 17:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 31.05.19 12:44, David Hildenbrand wrote:
> We can reuse float64_dcmask().
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h           |  2 ++
>  target/s390x/insn-data.def      |  2 ++
>  target/s390x/translate_vx.inc.c | 21 ++++++++++++++++++
>  target/s390x/vec_fpu_helper.c   | 39 +++++++++++++++++++++++++++++++++
>  4 files changed, 64 insertions(+)
> 
> diff --git a/target/s390x/helper.h b/target/s390x/helper.h
> index c788fc1b7f..e9aff83b05 100644
> --- a/target/s390x/helper.h
> +++ b/target/s390x/helper.h
> @@ -292,6 +292,8 @@ DEF_HELPER_FLAGS_4(gvec_vfsq64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
>  DEF_HELPER_FLAGS_4(gvec_vfsq64s, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
>  DEF_HELPER_FLAGS_5(gvec_vfs64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
>  DEF_HELPER_FLAGS_5(gvec_vfs64s, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
> +DEF_HELPER_4(gvec_vftci64, void, ptr, cptr, env, i32)
> +DEF_HELPER_4(gvec_vftci64s, void, ptr, cptr, env, i32)
>  
>  #ifndef CONFIG_USER_ONLY
>  DEF_HELPER_3(servc, i32, env, i64, i64)
> diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
> index 4426f40250..f421184fcd 100644
> --- a/target/s390x/insn-data.def
> +++ b/target/s390x/insn-data.def
> @@ -1246,6 +1246,8 @@
>      F(0xe7ce, VFSQ,    VRR_a, V,   0, 0, 0, 0, vfsq, 0, IF_VEC)
>  /* VECTOR FP SUBTRACT */
>      F(0xe7e2, VFS,     VRR_c, V,   0, 0, 0, 0, vfa, 0, IF_VEC)
> +/* VECTOR FP TEST DATA CLASS IMMEDIATE */
> +    F(0xe74a, VFTCI,   VRI_e, V,   0, 0, 0, 0, vftci, 0, IF_VEC)
>  
>  #ifndef CONFIG_USER_ONLY
>  /* COMPARE AND SWAP AND PURGE */
> diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
> index bc75a147b6..715fcb2cb5 100644
> --- a/target/s390x/translate_vx.inc.c
> +++ b/target/s390x/translate_vx.inc.c
> @@ -2791,3 +2791,24 @@ static DisasJumpType op_vfsq(DisasContext *s, DisasOps *o)
>                     0, fn);
>      return DISAS_NEXT;
>  }
> +
> +static DisasJumpType op_vftci(DisasContext *s, DisasOps *o)
> +{
> +    const uint16_t i3 = get_field(s->fields, i3);
> +    const uint8_t fpf = get_field(s->fields, m4);
> +    const uint8_t m5 = get_field(s->fields, m5);
> +    gen_helper_gvec_2_ptr *fn = gen_helper_gvec_vftci64;
> +
> +    if (fpf != FPF_LONG || extract32(m5, 0, 3)) {
> +        gen_program_exception(s, PGM_SPECIFICATION);
> +        return DISAS_NORETURN;
> +    }
> +
> +    if (extract32(m5, 3, 1)) {
> +        fn = gen_helper_gvec_vftci64s;
> +    }
> +    gen_gvec_2_ptr(get_field(s->fields, v1), get_field(s->fields, v2), cpu_env,
> +                   i3, fn);
> +    set_cc_static(s);
> +    return DISAS_NEXT;
> +}
> diff --git a/target/s390x/vec_fpu_helper.c b/target/s390x/vec_fpu_helper.c
> index 10249c5105..930b6d1db4 100644
> --- a/target/s390x/vec_fpu_helper.c
> +++ b/target/s390x/vec_fpu_helper.c
> @@ -603,3 +603,42 @@ void HELPER(gvec_vfs64s)(void *v1, const void *v2, const void *v3,
>  {
>      vop64_3(v1, v2, v3, env, true, vfs64, GETPC());
>  }
> +
> +static int vftci64(S390Vector *v1, const S390Vector *v2, CPUS390XState *env,
> +                   bool s, uint16_t i3)
> +{
> +    int i, match = 0;
> +
> +    for (i = 0; i < 2; i++) {
> +        float64 a = make_float64(s390_vec_read_element64(v2, i));
> +
> +        if (float64_dcmask(env, a) & i3) {
> +            match++;
> +            s390_vec_write_element64(v1, i, -1ull);
> +        } else {
> +            s390_vec_write_element64(v1, i, 0);
> +        }
> +        if (s) {
> +            break;
> +        }
> +    }
> +
> +    if (match == i + 1) {
> +        return 0;
> +    } else if (match) {
> +        return 1;
> +    }

This is also wrong, similar to the other function (vfc64): without an early
exit, i == 2 after the loop, so match == i + 1 can never hold.


-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 16/23] s390x/tcg: Implement VECTOR FP MULTIPLY AND (ADD|SUBTRACT)
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 16/23] s390x/tcg: Implement VECTOR FP MULTIPLY AND (ADD|SUBTRACT) David Hildenbrand
@ 2019-05-31 17:42   ` Richard Henderson
  2019-05-31 17:44     ` David Hildenbrand
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:42 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> +typedef uint64_t (*vop64_4_fn)(uint64_t a, uint64_t b, uint64_t c,
> +                               float_status *s);
> +static void vop64_4(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
> +                    const S390Vector *v4, CPUS390XState *env, bool s,
> +                    vop64_4_fn fn, uintptr_t retaddr)
> +{

Surely this is only going to be used for FMA/FMS.
Why not just pass in the float_muladd_* constant
to pass on to float64_muladd?
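
Roughly like this, as a sketch: the demo_ names and the flag value are
placeholders rather than the softfloat definitions, host doubles replace
float64, and the multiply-add here is not guaranteed to be fused:

```c
#include <assert.h>

/* Placeholder for a float_muladd_* style flag; the value is an assumption. */
enum { DEMO_MULADD_NEGATE_C = 1 };

/* Stand-in for float64_muladd: computes a * b + c or a * b - c. */
static double demo_muladd(double a, double b, double c, int flags)
{
    assert((flags & ~DEMO_MULADD_NEGATE_C) == 0);
    if (flags & DEMO_MULADD_NEGATE_C) {
        c = -c;
    }
    return a * b + c;
}

/* One worker for both FMA and FMS: only the flags argument differs. */
static void demo_vop64_4(double v1[2], const double v2[2], const double v3[2],
                         const double v4[2], int se, int flags)
{
    double tmp[2] = { 0.0, 0.0 };
    int i;

    for (i = 0; i < 2; i++) {
        tmp[i] = demo_muladd(v2[i], v3[i], v4[i], flags);
        if (se) {
            break;
        }
    }
    v1[0] = tmp[0];
    v1[1] = tmp[1];
}
```

The FMA helper would pass 0 and the FMS helper DEMO_MULADD_NEGATE_C, so no
function pointer type is needed.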


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 16/23] s390x/tcg: Implement VECTOR FP MULTIPLY AND (ADD|SUBTRACT)
  2019-05-31 17:42   ` Richard Henderson
@ 2019-05-31 17:44     ` David Hildenbrand
  0 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 17:44 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 31.05.19 19:42, Richard Henderson wrote:
> On 5/31/19 5:44 AM, David Hildenbrand wrote:
>> +typedef uint64_t (*vop64_4_fn)(uint64_t a, uint64_t b, uint64_t c,
>> +                               float_status *s);
>> +static void vop64_4(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
>> +                    const S390Vector *v4, CPUS390XState *env, bool s,
>> +                    vop64_4_fn fn, uintptr_t retaddr)
>> +{
> 
> Surely this is only going to be used for FMA/FMS.
> Why not just pass in the float_muladd_* constant
> to pass on to float64_muladd?

I actually had something similar before, but that makes sense, as this will
really only be used for these functions.

Thanks!

> 
> 
> r~
> 


-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 17/23] s390x/tcg: Implement VECTOR FP PERFORM SIGN OPERATION
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 17/23] s390x/tcg: Implement VECTOR FP PERFORM SIGN OPERATION David Hildenbrand
@ 2019-05-31 17:48   ` Richard Henderson
  2019-05-31 18:05     ` David Hildenbrand
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:48 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> +static DisasJumpType op_vfpso(DisasContext *s, DisasOps *o)
> +{
> +    const uint8_t fpf = get_field(s->fields, m3);
> +    const uint8_t m4 = get_field(s->fields, m4);
> +    const uint8_t m5 = get_field(s->fields, m5);
> +    const bool se = extract32(m4, 3, 1);
> +    TCGv_i64 tmp;
> +    int i;
> +
> +    if (fpf != FPF_LONG || extract32(m4, 0, 3) || m5 > 2) {
> +        gen_program_exception(s, PGM_SPECIFICATION);
> +        return DISAS_NORETURN;
> +    }
> +
> +    tmp = tcg_temp_new_i64();
> +    for (i = 0; i < 2; i++) {
> +        read_vec_element_i64(tmp, get_field(s->fields, v2), i, ES_64);
> +
> +        switch (m5) {
> +        case 0:
> +            /* sign bit is inverted (complement) */
> +            tcg_gen_xori_i64(tmp, tmp, 1ull << 63);
> +            break;
> +        case 1:
> +            /* sign bit is set to one (negative) */
> +            tcg_gen_ori_i64(tmp, tmp, 1ull << 63);
> +            break;
> +        case 2:
> +            /* sign bit is set to zero (positive) */
> +            tcg_gen_andi_i64(tmp, tmp, (1ull << 63) - 1);
> +            break;
> +        }
> +
> +        write_vec_element_i64(tmp, get_field(s->fields, v1), i, ES_64);
> +        if (se) {
> +            break;
> +        }
> +    }
> +    tcg_temp_free_i64(tmp);
> +    return DISAS_NEXT;
> +}

Better to use tcg_gen_gvec_{and,xor,or}i to do all of the elements at once.
Won't work for FPF_EXTENDED, but much better for FPF_SINGLE, once they're
supported.
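
The reason a whole-vector operation works is that the same sign mask,
replicated per element, applies to every element. A host-side sketch with
invented demo_ names (this models the arithmetic only, not the TCG gvec
calls, and leaves out the single-element case):

```c
#include <assert.h>
#include <stdint.h>

/* Sign-bit mask replicated across one doubleword for 32/64-bit elements. */
static uint64_t demo_sign_mask(unsigned es_bits)
{
    assert(es_bits == 32 || es_bits == 64);
    return es_bits == 64 ? 0x8000000000000000ULL : 0x8000000080000000ULL;
}

/*
 * m5 == 0: complement the sign, m5 == 1: force negative,
 * m5 == 2: force positive. One mask handles all elements at once.
 */
static void demo_vfpso(uint64_t v1[2], const uint64_t v2[2], unsigned m5,
                       unsigned es_bits)
{
    const uint64_t mask = demo_sign_mask(es_bits);
    int i;

    assert(m5 <= 2);
    for (i = 0; i < 2; i++) {
        switch (m5) {
        case 0:
            v1[i] = v2[i] ^ mask;
            break;
        case 1:
            v1[i] = v2[i] | mask;
            break;
        default:
            v1[i] = v2[i] & ~mask;
            break;
        }
    }
}
```

A 128-bit element has its sign bit in only one of the two doublewords, which
is why the replicated mask does not carry over to the extended format.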


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 18/23] s390x/tcg: Implement VECTOR FP SQUARE ROOT
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 18/23] s390x/tcg: Implement VECTOR FP SQUARE ROOT David Hildenbrand
@ 2019-05-31 17:50   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:50 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> Simulate XxC=0 and ERM=0 (current mode), so we can use the existing
> helper function.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h           |  2 ++
>  target/s390x/insn-data.def      |  2 ++
>  target/s390x/translate_vx.inc.c | 19 +++++++++++++++++++
>  target/s390x/vec_fpu_helper.c   | 17 +++++++++++++++++
>  4 files changed, 40 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v1 19/23] s390x/tcg: Implement VECTOR FP SUBTRACT
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 19/23] s390x/tcg: Implement VECTOR FP SUBTRACT David Hildenbrand
@ 2019-05-31 17:51   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:51 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> Similar to VECTOR FP ADD.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h           |  2 ++
>  target/s390x/insn-data.def      |  2 ++
>  target/s390x/translate_vx.inc.c |  3 +++
>  target/s390x/vec_fpu_helper.c   | 17 +++++++++++++++++
>  4 files changed, 24 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~





* Re: [Qemu-devel] [PATCH v1 20/23] s390x/tcg: Implement VECTOR FP TEST DATA CLASS IMMEDIATE
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 20/23] s390x/tcg: Implement VECTOR FP TEST DATA CLASS IMMEDIATE David Hildenbrand
  2019-05-31 17:40   ` David Hildenbrand
@ 2019-05-31 17:54   ` Richard Henderson
  1 sibling, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:54 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> We can reuse float64_dcmask().
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h           |  2 ++
>  target/s390x/insn-data.def      |  2 ++
>  target/s390x/translate_vx.inc.c | 21 ++++++++++++++++++
>  target/s390x/vec_fpu_helper.c   | 39 +++++++++++++++++++++++++++++++++
>  4 files changed, 64 insertions(+)

Modulo the cc value, as discussed,

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~





* Re: [Qemu-devel] [PATCH v1 21/23] s390x/tcg: Allow linux-user to use vector instructions
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 21/23] s390x/tcg: Allow linux-user to use vector instructions David Hildenbrand
@ 2019-05-31 17:54   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:54 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> Once we unlock S390_FEAT_VECTOR for TCG, we want linux-user to be
> able to make use of it.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/cpu.c | 3 +++
>  1 file changed, 3 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~





* Re: [Qemu-devel] [PATCH v1 22/23] s390x/tcg: We support the Vector Facility
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 22/23] s390x/tcg: We support the Vector Facility David Hildenbrand
@ 2019-05-31 17:55   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:55 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> Let's add it to the max model, so we can enable it.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/gen-features.c | 1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~





* Re: [Qemu-devel] [PATCH v1 23/23] s390x: Bump the "qemu" CPU model up to a stripped-down z13
  2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 23/23] s390x: Bump the "qemu" CPU model up to a stripped-down z13 David Hildenbrand
@ 2019-05-31 17:57   ` Richard Henderson
  2019-05-31 17:58     ` David Hildenbrand
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 17:57 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 5:44 AM, David Hildenbrand wrote:
> We don't care about the other two missing base features:
> - S390_FEAT_DFP_PACKED_CONVERSION
> - S390_FEAT_GROUP_GEN13_PTFF
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  hw/s390x/s390-virtio-ccw.c  |  2 ++
>  target/s390x/cpu_models.c   |  4 ++--
>  target/s390x/gen-features.c | 11 +++++++----
>  3 files changed, 11 insertions(+), 6 deletions(-)

We should get around to supporting DFP at some point.
The code is all there, used by target/ppc/.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



* Re: [Qemu-devel] [PATCH v1 23/23] s390x: Bump the "qemu" CPU model up to a stripped-down z13
  2019-05-31 17:57   ` Richard Henderson
@ 2019-05-31 17:58     ` David Hildenbrand
  2019-05-31 18:06       ` Richard Henderson
  0 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 17:58 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 31.05.19 19:57, Richard Henderson wrote:
> On 5/31/19 5:44 AM, David Hildenbrand wrote:
>> We don't care about the other two missing base features:
>> - S390_FEAT_DFP_PACKED_CONVERSION
>> - S390_FEAT_GROUP_GEN13_PTFF
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  hw/s390x/s390-virtio-ccw.c  |  2 ++
>>  target/s390x/cpu_models.c   |  4 ++--
>>  target/s390x/gen-features.c | 11 +++++++----
>>  3 files changed, 11 insertions(+), 6 deletions(-)
> 
> We should get around to supporting DFP at some point.
> The code is all there, used by target/ppc/.

Cool, didn't know about that - will take a look once I have some spare
time. Are you aware of a HFP library?

Thanks!

> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> 
> 
> r~
> 


-- 

Thanks,

David / dhildenb



* Re: [Qemu-devel] [PATCH v1 17/23] s390x/tcg: Implement VECTOR FP PERFORM SIGN OPERATION
  2019-05-31 17:48   ` Richard Henderson
@ 2019-05-31 18:05     ` David Hildenbrand
  0 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 18:05 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 31.05.19 19:48, Richard Henderson wrote:
> On 5/31/19 5:44 AM, David Hildenbrand wrote:
>> +static DisasJumpType op_vfpso(DisasContext *s, DisasOps *o)
>> +{
>> +    const uint8_t fpf = get_field(s->fields, m3);
>> +    const uint8_t m4 = get_field(s->fields, m4);
>> +    const uint8_t m5 = get_field(s->fields, m5);
>> +    const bool se = extract32(m4, 3, 1);
>> +    TCGv_i64 tmp;
>> +    int i;
>> +
>> +    if (fpf != FPF_LONG || extract32(m4, 0, 3) || m5 > 2) {
>> +        gen_program_exception(s, PGM_SPECIFICATION);
>> +        return DISAS_NORETURN;
>> +    }
>> +
>> +    tmp = tcg_temp_new_i64();
>> +    for (i = 0; i < 2; i++) {
>> +        read_vec_element_i64(tmp, get_field(s->fields, v2), i, ES_64);
>> +
>> +        switch (m5) {
>> +        case 0:
>> +            /* sign bit is inverted (complement) */
>> +            tcg_gen_xori_i64(tmp, tmp, 1ull << 63);
>> +            break;
>> +        case 1:
>> +            /* sign bit is set to one (negative) */
>> +            tcg_gen_ori_i64(tmp, tmp, 1ull << 63);
>> +            break;
>> +        case 2:
>> +            /* sign bit is set to zero (positive) */
>> +            tcg_gen_andi_i64(tmp, tmp, (1ull << 63) - 1);
>> +            break;
>> +        }
>> +
>> +        write_vec_element_i64(tmp, get_field(s->fields, v1), i, ES_64);
>> +        if (se) {
>> +            break;
>> +        }
>> +    }
>> +    tcg_temp_free_i64(tmp);
>> +    return DISAS_NEXT;
>> +}
> 
> Better to use tcg_gen_gvec_{and,xor,or}i to do all of the elements at once.
> Won't work for FPF_EXTENDED, but much better for FPF_SINGLE, once they're
> supported.
> 

How could I miss that :)

Thanks!


-- 

Thanks,

David / dhildenb



* Re: [Qemu-devel] [PATCH v1 23/23] s390x: Bump the "qemu" CPU model up to a stripped-down z13
  2019-05-31 17:58     ` David Hildenbrand
@ 2019-05-31 18:06       ` Richard Henderson
  2019-05-31 18:07         ` David Hildenbrand
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2019-05-31 18:06 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 5/31/19 12:58 PM, David Hildenbrand wrote:
> Are you aware of a HFP library?

No.  It might be possible to shoehorn into softfloat, because I *think* you can
treat HFP as BFP with weird rounding.  At least that's what I remember from my
old college daze on the esa/390.

Otherwise we could maybe steal some code from Hercules.  I see "OSI certified"
and "Q public license" on the web page, which iirc is compatible.
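
For readers unfamiliar with the format, here is a minimal C sketch of decoding
a short (32-bit) HFP value into a host double. It only illustrates the
encoding (sign bit, 7-bit excess-64 exponent as a power of 16, 24-bit
fraction); hfp32_to_double() is a hypothetical helper, not softfloat or
Hercules code, and it says nothing about how HFP rounding would be emulated.

```c
#include <stdint.h>

/*
 * Decode an IBM short HFP value:
 *   bit 31      sign
 *   bits 30-24  exponent, excess-64, as a power of 16
 *   bits 23-0   fraction, with the radix point to the left of bit 23
 */
static double hfp32_to_double(uint32_t raw)
{
    const int negative = (raw >> 31) & 1;
    int exponent = (int)((raw >> 24) & 0x7f) - 64;
    const uint32_t fraction = raw & 0x00ffffff;
    double value = (double)fraction / 16777216.0;   /* fraction / 2^24 */

    /* Scale by 16^exponent; each step is an exact power-of-two scaling. */
    for (; exponent > 0; exponent--) {
        value *= 16.0;
    }
    for (; exponent < 0; exponent++) {
        value /= 16.0;
    }
    return negative ? -value : value;
}
```

For example, 0x42640000 has exponent 2 and fraction 0x640000, which decodes
to 100.0.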


r~



* Re: [Qemu-devel] [PATCH v1 23/23] s390x: Bump the "qemu" CPU model up to a stripped-down z13
  2019-05-31 18:06       ` Richard Henderson
@ 2019-05-31 18:07         ` David Hildenbrand
  0 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-05-31 18:07 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: Christian Borntraeger, Denys Vlasenko

On 31.05.19 20:06, Richard Henderson wrote:
> On 5/31/19 12:58 PM, David Hildenbrand wrote:
>> Are you aware of a HFP library?
> 
> No.  It might be possible to shoehorn into softfloat, because I *think* you can
> treat HFP as BFP with weird rounding.  At least that's what I remember from my
> old college daze on the esa/390.
> 
> Otherwise we could maybe steal some code from Hercules.  I see "OSI certified"
> and "Q public license" on the web page, which iirc is compatible.
> 

Right, they support it. Maybe a project for a cold winter evening ;)

-- 

Thanks,

David / dhildenb



* Re: [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4
  2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
                   ` (23 preceding siblings ...)
  2019-05-31 10:47 ` [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
@ 2019-07-19  9:51 ` Aleksandar Markovic
  2019-07-19 10:00   ` David Hildenbrand
  24 siblings, 1 reply; 61+ messages in thread
From: Aleksandar Markovic @ 2019-07-19  9:51 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: Christian Borntraeger, Denys Vlasenko, qemu-devel

On May 31, 2019 12:48 PM, "David Hildenbrand" <david@redhat.com> wrote:
>
> This is the final part of vector instruction support for s390x. It is based
> on part 2, which I will send a pull-request for to Conny soon.
>
> Part 1: Vector Support Instructions
> Part 2: Vector Integer Instructions
> Part 3: Vector String Instructions
> Part 4: Vector Floating-Point Instructions
>

Congratulations on completing this complex task!

I followed your series (even though I did not make any comment), and I
salute this addition to QEMU.

I would just ask you to provide me and others with the link to the detailed
documentation on this matter - I had the hardest time trying to find it
online.

Thanks in advance!

Aleksandar

> The current state can be found at (kept updated):
>     https://github.com/davidhildenbrand/qemu/tree/vx
>
> It is based on:
> - [PATCH v2 0/5] s390x/tcg: Vector Instruction Support Part 3
> - [PATCH v1 0/2] s390x: Fix vector register alignment
>
> With the current state I can boot Linux kernel + user space compiled with
> SIMD support. This allows to boot distributions compiled exclusively for
> z13, requiring SIMD support. Also, it is now possible to build a complete
> kernel using rpmbuild as quite some issues have been sorted out.
>
> While the current state works fine for me with RHEL 8, I am experiencing
> some issues with newer userspace versions (I suspect glibc). I'll have
> to look into the details first - could be a BUG in !vector
> instruction or a BUG in a vector instruction that was until now unused.
>
> In this part, all Vector Floating-Point Instructions introduced with the
> "Vector Facility" are added. Also, the "qemu" model is changed to a
> z13 machine.
>
> David Hildenbrand (23):
>   s390x: Use uint64_t for vector registers
>   s390x/tcg: Introduce tcg_s390_vector_exception()
>   s390x/tcg: Export float_comp_to_cc() and float(32|64|128)_dcmask()
>   s390x/tcg: Implement VECTOR FP ADD
>   s390x/tcg: Implement VECTOR FP COMPARE (AND SIGNAL) SCALAR
>   s390x/tcg: Implement VECTOR FP COMPARE (EQUAL|HIGH|HIGH OR EQUAL)
>   s390x/tcg: Implement VECTOR FP CONVERT FROM FIXED 64-BIT
>   s390x/tcg: Implement VECTOR FP CONVERT FROM LOGICAL 64-BIT
>   s390x/tcg: Implement VECTOR FP CONVERT TO FIXED 64-BIT
>   s390x/tcg: Implement VECTOR FP CONVERT TO LOGICAL 64-BIT
>   s390x/tcg: Implement VECTOR FP DIVIDE
>   s390x/tcg: Implement VECTOR LOAD FP INTEGER
>   s390x/tcg: Implement VECTOR LOAD LENGTHENED
>   s390x/tcg: Implement VECTOR LOAD ROUNDED
>   s390x/tcg: Implement VECTOR FP MULTIPLY
>   s390x/tcg: Implement VECTOR FP MULTIPLY AND (ADD|SUBTRACT)
>   s390x/tcg: Implement VECTOR FP PERFORM SIGN OPERATION
>   s390x/tcg: Implement VECTOR FP SQUARE ROOT
>   s390x/tcg: Implement VECTOR FP SUBTRACT
>   s390x/tcg: Implement VECTOR FP TEST DATA CLASS IMMEDIATE
>   s390x/tcg: Allow linux-user to use vector instructions
>   s390x/tcg: We support the Vector Facility
>   s390x: Bump the "qemu" CPU model up to a stripped-down z13
>
>  hw/s390x/s390-virtio-ccw.c      |   2 +
>  linux-user/s390x/signal.c       |   4 +-
>  target/s390x/Makefile.objs      |   1 +
>  target/s390x/arch_dump.c        |   8 +-
>  target/s390x/cpu.c              |   3 +
>  target/s390x/cpu.h              |   5 +-
>  target/s390x/cpu_models.c       |   4 +-
>  target/s390x/excp_helper.c      |  21 +-
>  target/s390x/fpu_helper.c       |   4 +-
>  target/s390x/gdbstub.c          |  16 +-
>  target/s390x/gen-features.c     |  10 +-
>  target/s390x/helper.c           |  10 +-
>  target/s390x/helper.h           |  46 +++
>  target/s390x/insn-data.def      |  45 +++
>  target/s390x/internal.h         |   4 +
>  target/s390x/kvm.c              |  16 +-
>  target/s390x/machine.c          | 128 +++----
>  target/s390x/tcg_s390x.h        |   2 +
>  target/s390x/translate.c        |   2 +-
>  target/s390x/translate_vx.inc.c | 274 ++++++++++++++
>  target/s390x/vec_fpu_helper.c   | 644 ++++++++++++++++++++++++++++++++
>  21 files changed, 1145 insertions(+), 104 deletions(-)
>  create mode 100644 target/s390x/vec_fpu_helper.c
>
> --
> 2.20.1
>
>


* Re: [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4
  2019-07-19  9:51 ` Aleksandar Markovic
@ 2019-07-19 10:00   ` David Hildenbrand
  0 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2019-07-19 10:00 UTC (permalink / raw)
  To: Aleksandar Markovic; +Cc: Christian Borntraeger, Denys Vlasenko, qemu-devel

On 19.07.19 11:51, Aleksandar Markovic wrote:
> 
> On May 31, 2019 12:48 PM, "David Hildenbrand" <david@redhat.com> wrote:
>>
>> This is the final part of vector instruction support for s390x. It is based
>> on part 2, which I will send a pull-request for to Conny soon.
>>
>> Part 1: Vector Support Instructions
>> Part 2: Vector Integer Instructions
>> Part 3: Vector String Instructions
>> Part 4: Vector Floating-Point Instructions
>>
> 
> Congratulations on completing this complex task!
> 
> I followed your series (even though I did not make any comment), and I
> salute this addition to QEMU.

Thanks, glad to hear that this addition might be beneficial for others
as well!

> 
> I would just ask you to provide me and others with the link to the
> detailed documentation on this matter - I had the hardest time trying to
> find it online.

So, the s390x architecture (including vector instructions) is described
in the z/Architecture Principles of Operation. You can find the latest
publication at [1].

Regarding TCG internals/vector instruction support ... well, most
documentation is the code itself/implementing architectures. :)

Please let me know if you need more information.

Cheers!

[1]
https://www-01.ibm.com/support/docview.wss?uid=isg2b9de5f05a9d57819852571c500428f9a

> 
> Thanks in advance!
> 
> Aleksandar
-- 

Thanks,

David / dhildenb



end of thread, other threads:[~2019-07-19 10:00 UTC | newest]

Thread overview: 61+ messages
2019-05-31 10:44 [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 01/23] s390x: Use uint64_t for vector registers David Hildenbrand
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 02/23] s390x/tcg: Introduce tcg_s390_vector_exception() David Hildenbrand
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 03/23] s390x/tcg: Export float_comp_to_cc() and float(32|64|128)_dcmask() David Hildenbrand
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 04/23] s390x/tcg: Implement VECTOR FP ADD David Hildenbrand
2019-05-31 15:54   ` Richard Henderson
2019-05-31 16:26     ` David Hildenbrand
2019-05-31 16:30   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 05/23] s390x/tcg: Implement VECTOR FP COMPARE (AND SIGNAL) SCALAR David Hildenbrand
2019-05-31 16:33   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 06/23] s390x/tcg: Implement VECTOR FP COMPARE (EQUAL|HIGH|HIGH OR EQUAL) David Hildenbrand
2019-05-31 16:53   ` Richard Henderson
2019-05-31 17:18     ` David Hildenbrand
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 07/23] s390x/tcg: Implement VECTOR FP CONVERT FROM FIXED 64-BIT David Hildenbrand
2019-05-31 17:10   ` Richard Henderson
2019-05-31 17:15     ` Richard Henderson
2019-05-31 17:16       ` David Hildenbrand
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 08/23] s390x/tcg: Implement VECTOR FP CONVERT FROM LOGICAL 64-BIT David Hildenbrand
2019-05-31 17:15   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 09/23] s390x/tcg: Implement VECTOR FP CONVERT TO FIXED 64-BIT David Hildenbrand
2019-05-31 17:17   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 10/23] s390x/tcg: Implement VECTOR FP CONVERT TO LOGICAL 64-BIT David Hildenbrand
2019-05-31 17:18   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 11/23] s390x/tcg: Implement VECTOR FP DIVIDE David Hildenbrand
2019-05-31 17:25   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 12/23] s390x/tcg: Implement VECTOR LOAD FP INTEGER David Hildenbrand
2019-05-31 17:26   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 13/23] s390x/tcg: Implement VECTOR LOAD LENGTHENED David Hildenbrand
2019-05-31 17:33   ` Richard Henderson
2019-05-31 17:35     ` David Hildenbrand
2019-05-31 17:36       ` Richard Henderson
2019-05-31 17:38         ` David Hildenbrand
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 14/23] s390x/tcg: Implement VECTOR LOAD ROUNDED David Hildenbrand
2019-05-31 17:37   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 15/23] s390x/tcg: Implement VECTOR FP MULTIPLY David Hildenbrand
2019-05-31 17:37   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 16/23] s390x/tcg: Implement VECTOR FP MULTIPLY AND (ADD|SUBTRACT) David Hildenbrand
2019-05-31 17:42   ` Richard Henderson
2019-05-31 17:44     ` David Hildenbrand
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 17/23] s390x/tcg: Implement VECTOR FP PERFORM SIGN OPERATION David Hildenbrand
2019-05-31 17:48   ` Richard Henderson
2019-05-31 18:05     ` David Hildenbrand
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 18/23] s390x/tcg: Implement VECTOR FP SQUARE ROOT David Hildenbrand
2019-05-31 17:50   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 19/23] s390x/tcg: Implement VECTOR FP SUBTRACT David Hildenbrand
2019-05-31 17:51   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 20/23] s390x/tcg: Implement VECTOR FP TEST DATA CLASS IMMEDIATE David Hildenbrand
2019-05-31 17:40   ` David Hildenbrand
2019-05-31 17:54   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 21/23] s390x/tcg: Allow linux-user to use vector instructions David Hildenbrand
2019-05-31 17:54   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 22/23] s390x/tcg: We support the Vector Facility David Hildenbrand
2019-05-31 17:55   ` Richard Henderson
2019-05-31 10:44 ` [Qemu-devel] [PATCH v1 23/23] s390x: Bump the "qemu" CPU model up to a stripped-down z13 David Hildenbrand
2019-05-31 17:57   ` Richard Henderson
2019-05-31 17:58     ` David Hildenbrand
2019-05-31 18:06       ` Richard Henderson
2019-05-31 18:07         ` David Hildenbrand
2019-05-31 10:47 ` [Qemu-devel] [PATCH v1 00/23] s390x/tcg: Vector Instruction Support Part 4 David Hildenbrand
2019-07-19  9:51 ` Aleksandar Markovic
2019-07-19 10:00   ` David Hildenbrand
