* [RISU PATCH 0/5] Add LoongArch architectures support
@ 2022-09-17  7:43 Song Gao
  2022-09-17  7:43 ` [RISU PATCH 1/5] risu: Use alternate stack Song Gao
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Song Gao @ 2022-09-17  7:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: richard.henderson, peter.maydell, alex.bennee, maobibo

Hi,

This series adds LoongArch architecture support. We have tested two modes:
1. LoongArch host server + LoongArch host client;
2. LoongArch host server + QEMU client.

You can find all LoongArch instructions at [1].
This series does not cover all LoongArch instructions; for example,
pcadd, syscalls, rdtime and jumps are not included.

[1]: https://github.com/loongson/LoongArch-Documentation/releases/download/2022.08.12/LoongArch-Vol1-v1.02-EN.pdf


Thanks.
Song Gao


Song Gao (5):
  risu: Use alternate stack
  loongarch: Add LoongArch basic test support
  loongarch: Implement risugen module
  loongarch: Add risufile with loongarch instructions
  loongarch: Add block 'safefloat' and nanbox_s()

 loongarch64.risu           | 612 +++++++++++++++++++++++++++++++++++++
 risu.c                     |  16 +-
 risu_loongarch64.c         |  50 +++
 risu_reginfo_loongarch64.c | 183 +++++++++++
 risu_reginfo_loongarch64.h |  25 ++
 risugen                    |   2 +-
 risugen_loongarch64.pm     | 532 ++++++++++++++++++++++++++++++++
 test_loongarch64.s         |  92 ++++++
 8 files changed, 1510 insertions(+), 2 deletions(-)
 create mode 100644 loongarch64.risu
 create mode 100644 risu_loongarch64.c
 create mode 100644 risu_reginfo_loongarch64.c
 create mode 100644 risu_reginfo_loongarch64.h
 create mode 100644 risugen_loongarch64.pm
 create mode 100644 test_loongarch64.s

-- 
2.31.1




* [RISU PATCH 1/5] risu: Use alternate stack
  2022-09-17  7:43 [RISU PATCH 0/5] Add LoongArch architectures support Song Gao
@ 2022-09-17  7:43 ` Song Gao
  2022-10-10 14:20   ` Richard Henderson
  2022-09-17  7:43 ` [RISU PATCH 2/5] loongarch: Add LoongArch basic test support Song Gao
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Song Gao @ 2022-09-17  7:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: richard.henderson, peter.maydell, alex.bennee, maobibo

Use an alternate signal stack, so that the sp register can be used as an
input/output register by the test code.
Tested on aarch64 and LoongArch.

Signed-off-by: Song Gao <gaosong@loongson.cn>
---
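Note for readers: the idea behind this patch can be tried in isolation.
The following is only a minimal sketch, not risu code. It uses SIGUSR1
instead of SIGILL, and the helper names (install_altstack_handler,
handler_ran_on_altstack) are made up for illustration. It registers an
alternate stack with sigaltstack(), installs a handler with SA_ONSTACK,
and checks that a local variable in the handler lives inside that stack.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical globals describing the alternate stack. */
static unsigned char *alt_base;
static size_t alt_size;
static volatile sig_atomic_t ran_on_alt;

static void handler(int sig, siginfo_t *si, void *uc)
{
    /* The address of a local tells us which stack we are running on. */
    volatile char probe = 0;
    uintptr_t p = (uintptr_t)&probe;

    (void)sig; (void)si; (void)uc;
    ran_on_alt = p >= (uintptr_t)alt_base &&
                 p < (uintptr_t)alt_base + alt_size;
}

/* Allocate and register the alternate stack, then install the handler. */
int install_altstack_handler(void)
{
    stack_t ss;
    struct sigaction sa;

    alt_size = SIGSTKSZ;
    alt_base = malloc(alt_size);
    if (alt_base == NULL) {
        return -1;
    }
    ss.ss_sp = alt_base;
    ss.ss_size = alt_size;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) == -1) {
        return -1;
    }

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = handler;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;   /* same flags as the patch */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Raise the signal and report whether the handler used the alt stack. */
int handler_ran_on_altstack(void)
{
    ran_on_alt = 0;
    raise(SIGUSR1);
    return ran_on_alt;
}
```

Because the handler no longer runs on the interrupted stack, the test
image is free to put arbitrary values in sp between risu operations.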
 risu.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/risu.c b/risu.c
index 1c096a8..714074e 100644
--- a/risu.c
+++ b/risu.c
@@ -329,7 +329,7 @@ static void set_sigill_handler(void (*fn) (int, siginfo_t *, void *))
     memset(&sa, 0, sizeof(struct sigaction));
 
     sa.sa_sigaction = fn;
-    sa.sa_flags = SA_SIGINFO;
+    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
     sigemptyset(&sa.sa_mask);
     if (sigaction(SIGILL, &sa, 0) != 0) {
         perror("sigaction");
@@ -550,6 +550,7 @@ int main(int argc, char **argv)
     char *trace_fn = NULL;
     struct option *longopts;
     char *shortopts;
+    stack_t ss;
 
     longopts = setup_options(&shortopts);
 
@@ -617,6 +618,19 @@ int main(int argc, char **argv)
 
     load_image(imgfile);
 
+    /* create alternate stack */
+    ss.ss_sp = malloc(SIGSTKSZ);
+    if (ss.ss_sp == NULL) {
+        perror("malloc");
+        exit(EXIT_FAILURE);
+    }
+    ss.ss_size = SIGSTKSZ;
+    ss.ss_flags = 0;
+    if (sigaltstack(&ss, NULL) == -1) {
+        perror("sigaltstack");
+        exit(EXIT_FAILURE);
+    }
+
     /* E.g. select requested SVE vector length. */
     arch_init();
 
-- 
2.31.1




* [RISU PATCH 2/5] loongarch: Add LoongArch basic test support
  2022-09-17  7:43 [RISU PATCH 0/5] Add LoongArch architectures support Song Gao
  2022-09-17  7:43 ` [RISU PATCH 1/5] risu: Use alternate stack Song Gao
@ 2022-09-17  7:43 ` Song Gao
  2022-10-10 14:58   ` Richard Henderson
  2022-10-10 15:34   ` Peter Maydell
  2022-09-17  7:43 ` [RISU PATCH 3/5] loongarch: Implement risugen module Song Gao
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 16+ messages in thread
From: Song Gao @ 2022-09-17  7:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: richard.henderson, peter.maydell, alex.bennee, maobibo

This patch adds LoongArch server and client support, and a basic test file.

Signed-off-by: Song Gao <gaosong@loongson.cn>
---
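Note for readers: the risuop decoding in get_risuop() can be exercised
standalone. The sketch below mirrors the logic in this patch (key
0x000001f0, operation number in the low four bits); the function name
decode_risuop and the RISU_KEY macro are illustrative names only and do
not appear in risu.

```c
#include <assert.h>
#include <stdint.h>

/* Key used by this series for risu operations (low 4 bits = op). */
#define RISU_KEY 0x000001f0u

/*
 * Return the risu operation number encoded in insn, or -1 if the
 * faulting instruction is not a risu operation at all.
 */
int decode_risuop(uint32_t insn)
{
    uint32_t op = insn & 0xfu;
    uint32_t key = insn & ~0xfu;

    return (key == RISU_KEY) ? (int)op : -1;
}
```

With this encoding, the two words at the end of test_loongarch64.s
decode as operation 0 (compare) and operation 1 (end of test), while an
ordinary instruction word such as 0x00100000 is reported as -1.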
 risu_loongarch64.c         |  50 ++++++++++
 risu_reginfo_loongarch64.c | 183 +++++++++++++++++++++++++++++++++++++
 risu_reginfo_loongarch64.h |  25 +++++
 test_loongarch64.s         |  92 +++++++++++++++++++
 4 files changed, 350 insertions(+)
 create mode 100644 risu_loongarch64.c
 create mode 100644 risu_reginfo_loongarch64.c
 create mode 100644 risu_reginfo_loongarch64.h
 create mode 100644 test_loongarch64.s

diff --git a/risu_loongarch64.c b/risu_loongarch64.c
new file mode 100644
index 0000000..24599e1
--- /dev/null
+++ b/risu_loongarch64.c
@@ -0,0 +1,50 @@
+/******************************************************************************
+ * Copyright (c) 2022 Loongson Technology Corporation Limited
+ * All rights reserved. This program and the accompanying materials
+ * are made available under the terms of the Eclipse Public License v1.0
+ * which accompanies this distribution, and is available at
+ * http://www.eclipse.org/legal/epl-v10.html
+ *
+ * Contributors:
+ *     based on Peter Maydell's risu_arm.c
+ *****************************************************************************/
+
+#include <asm/types.h>
+#include <signal.h>
+#include <asm/ucontext.h>
+
+#include "risu.h"
+
+void advance_pc(void *vuc)
+{
+    struct ucontext *uc = vuc;
+    uc->uc_mcontext.sc_pc += 4;
+}
+
+void set_ucontext_paramreg(void *vuc, uint64_t value)
+{
+    struct ucontext *uc = vuc;
+    uc->uc_mcontext.sc_regs[4] = value;
+}
+
+uint64_t get_reginfo_paramreg(struct reginfo *ri)
+{
+    return ri->regs[4];
+}
+
+int get_risuop(struct reginfo *ri)
+{
+    /* Return the risuop we have been asked to do
+     * (or -1 if this was a SIGILL for a non-risuop insn)
+     */
+    uint32_t insn = ri->faulting_insn;
+    uint32_t op = insn & 0xf;
+    uint32_t key = insn & ~0xf;
+    uint32_t risukey = 0x000001f0;
+    return (key != risukey) ? -1 : op;
+}
+
+uintptr_t get_pc(struct reginfo *ri)
+{
+   return ri->pc;
+}
diff --git a/risu_reginfo_loongarch64.c b/risu_reginfo_loongarch64.c
new file mode 100644
index 0000000..af6ab77
--- /dev/null
+++ b/risu_reginfo_loongarch64.c
@@ -0,0 +1,183 @@
+/******************************************************************************
+ * Copyright (c) 2022 Loongson Technology Corporation Limited
+ * All rights reserved. This program and the accompanying materials
+ * are made available under the terms of the Eclipse Public License v1.0
+ * which accompanies this distribution, and is available at
+ * http://www.eclipse.org/legal/epl-v10.html
+ *
+ * Contributors:
+ *     based on Peter Maydell's risu_reginfo_arm.c
+ *****************************************************************************/
+
+#include <stdio.h>
+#include <asm/types.h>
+#include <signal.h>
+#include <asm/ucontext.h>
+
+#include <string.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <stdbool.h>
+#include <inttypes.h>
+#include <assert.h>
+#include <sys/prctl.h>
+
+#include "risu.h"
+#include "risu_reginfo_loongarch64.h"
+
+const struct option * const arch_long_opts;
+const char * const arch_extra_help;
+
+struct _ctx_layout {
+        struct sctx_info *addr;
+        unsigned int size;
+};
+
+struct extctx_layout {
+        unsigned long size;
+        unsigned int flags;
+        struct _ctx_layout fpu;
+        struct _ctx_layout end;
+};
+
+void process_arch_opt(int opt, const char *arg)
+{
+    abort();
+}
+
+void arch_init(void)
+{
+}
+
+int reginfo_size(struct reginfo *ri)
+{
+    return sizeof(*ri);
+}
+
+static int parse_extcontext(struct sigcontext *sc, struct extctx_layout *extctx)
+{
+    uint32_t magic, size;
+    struct sctx_info *info = (struct sctx_info *)&sc->sc_extcontext;
+
+    while(1) {
+        magic = (uint32_t)info->magic;
+        size =  (uint32_t)info->size;
+        switch (magic) {
+        case 0: /* END*/
+            return 0;
+        case FPU_CTX_MAGIC:
+            if (size < (sizeof(struct sctx_info) +
+                        sizeof(struct fpu_context))) {
+                return -1;
+            }
+            extctx->fpu.addr = info;
+            break;
+        default:
+            return -1;
+       }
+       info = (struct sctx_info *)((char *)info +size);
+    }
+    return 0;
+}
+
+/* reginfo_init: initialize with a ucontext */
+void reginfo_init(struct reginfo *ri, ucontext_t *context)
+{
+    int i;
+    struct ucontext *uc = (struct ucontext *)context;
+    struct extctx_layout extctx;
+
+    memset(&extctx, 0, sizeof(struct extctx_layout));
+    memset(ri, 0, sizeof(*ri));
+
+    for (i = 1; i < 32; i++) {
+        ri->regs[i] = uc->uc_mcontext.sc_regs[i]; //sp:r3, tp:r2
+    }
+
+    ri->regs[2] = 0xdeadbeefdeadbeef;
+    ri->pc = uc->uc_mcontext.sc_pc - (unsigned long)image_start_address;
+    ri->flags = uc->uc_mcontext.sc_flags;
+    ri->faulting_insn = *(uint32_t *)uc->uc_mcontext.sc_pc;
+
+    parse_extcontext(&uc->uc_mcontext, &extctx);
+    if (extctx.fpu.addr) {
+        struct sctx_info *info = extctx.fpu.addr;
+        struct fpu_context *fpu_ctx = (struct fpu_context *)((char *)info +
+                                       sizeof(struct sctx_info));
+        for(i = 0; i < 32; i++) {
+	    ri->fpregs[i] = fpu_ctx->regs[i];
+        }
+	ri->fcsr = fpu_ctx->fcsr;
+	ri->fcc = fpu_ctx->fcc;
+    }
+}
+
+/* reginfo_is_eq: compare the reginfo structs, returns nonzero if equal */
+int reginfo_is_eq(struct reginfo *r1, struct reginfo *r2)
+{
+    return !memcmp(r1, r2, sizeof(*r1));
+}
+
+/* reginfo_dump: print state to a stream, returns nonzero on success */
+int reginfo_dump(struct reginfo *ri, FILE * f)
+{
+    int i;
+    fprintf(f, "  faulting insn %08x\n", ri->faulting_insn);
+
+    for (i = 0; i < 32; i++) {
+        fprintf(f, "  r%-2d    : %016" PRIx64 "\n", i, ri->regs[i]);
+    }
+
+    fprintf(f, "  pc     : %016" PRIx64 "\n", ri->pc);
+    fprintf(f, "  flags  : %08x\n", ri->flags);
+    fprintf(f, "  fcc    : %016" PRIx64 "\n", ri->fcc);
+    fprintf(f, "  fcsr   : %08x\n", ri->fcsr);
+
+    for (i = 0; i < 32; i++) {
+        fprintf(f, "  f%-2d    : %016lx\n", i, ri->fpregs[i]);
+    }
+
+    return !ferror(f);
+}
+
+/* reginfo_dump_mismatch: print mismatch details to a stream, ret nonzero=ok */
+int reginfo_dump_mismatch(struct reginfo *m, struct reginfo *a, FILE * f)
+{
+    int i;
+    fprintf(f, "mismatch detail (master : apprentice):\n");
+    if (m->faulting_insn != a->faulting_insn) {
+        fprintf(f, "  faulting insn mismatch %08x vs %08x\n",
+                m->faulting_insn, a->faulting_insn);
+    }
+    /* r2:tp, r3:sp */
+    for (i = 0; i < 32; i++) {
+        if (m->regs[i] != a->regs[i]) {
+            fprintf(f, "  r%-2d    : %016" PRIx64 " vs %016" PRIx64 "\n",
+                    i, m->regs[i], a->regs[i]);
+        }
+    }
+
+    if (m->pc != a->pc) {
+        fprintf(f, "  pc     : %016" PRIx64 " vs %016" PRIx64 "\n",
+                m->pc, a->pc);
+    }
+    if (m->flags != a->flags) {
+        fprintf(f, "  flags  : %08x vs %08x\n", m->flags, a->flags);
+    }
+    if (m->fcc != a->fcc) {
+        fprintf(f, "  fcc    : %016" PRIx64 " vs %016" PRIx64 "\n",
+                m->fcc, a->fcc);
+    }
+    if (m->fcsr != a->fcsr) {
+        fprintf(f, "  fcsr   : %08x vs %08x\n", m->fcsr, a->fcsr);
+    }
+
+    for (i = 0; i < 32; i++) {
+        if (m->fpregs[i]!= a->fpregs[i]) {
+            fprintf(f, "  f%-2d    : %016lx vs %016lx\n",
+                    i, m->fpregs[i], a->fpregs[i]);
+        }
+    }
+
+    return !ferror(f);
+}
diff --git a/risu_reginfo_loongarch64.h b/risu_reginfo_loongarch64.h
new file mode 100644
index 0000000..b6c5aaa
--- /dev/null
+++ b/risu_reginfo_loongarch64.h
@@ -0,0 +1,25 @@
+/******************************************************************************
+ * Copyright (c) 2022 Loongson Technology Corporation Limited
+ * All rights reserved. This program and the accompanying materials
+ * are made available under the terms of the Eclipse Public License v1.0
+ * which accompanies this distribution, and is available at
+ * http://www.eclipse.org/legal/epl-v10.html
+ *
+ * Contributors:
+ *     based on Peter Maydell's risu_reginfo_arm.h
+ *****************************************************************************/
+
+#ifndef RISU_REGINFO_LOONGARCH64_H
+#define RISU_REGINFO_LOONGARCH64_H
+
+struct reginfo {
+    uint64_t regs[32];
+    uint64_t pc;
+    uint64_t fcc;
+    uint32_t flags;
+    uint32_t fcsr;
+    uint32_t faulting_insn;
+    uint64_t fpregs[32];
+};
+
+#endif /* RISU_REGINFO_LOONGARCH64_H */
diff --git a/test_loongarch64.s b/test_loongarch64.s
new file mode 100644
index 0000000..431416d
--- /dev/null
+++ b/test_loongarch64.s
@@ -0,0 +1,92 @@
+/*****************************************************************************
+ * Copyright (c) 2022 Loongson Technology Corporation Limited
+ * All rights reserved. This program and the accompanying materials
+ * are made available under the terms of the Eclipse Public License v1.0
+ * which accompanies this distribution, and is available at
+ * http://www.eclipse.org/legal/epl-v10.html
+ *
+ * Contributors:
+ *     based on test_arm.s by Peter Maydell
+ *****************************************************************************/
+
+/* Initialise the gp regs */
+# $r0 is always 0
+addi.w $r1, $r0, 1
+#r2 tp skip r2
+#r3 sp
+addi.w $r3, $r0, 3
+addi.w $r4, $r0, 4
+addi.w $r5, $r0, 5
+addi.w $r6, $r0, 6
+addi.w $r7, $r0, 7
+addi.w $r8, $r0, 8
+addi.w $r9, $r0, 9
+addi.w $r10, $r0, 10
+addi.w $r11, $r0, 11
+addi.w $r12, $r0, 12
+addi.w $r13, $r0, 13
+addi.w $r14, $r0, 14
+addi.w $r15, $r0, 15
+addi.w $r16, $r0, 16
+addi.w $r17, $r0, 17
+addi.w $r18, $r0, 18
+addi.w $r19, $r0, 19
+addi.w $r20, $r0, 20
+addi.w $r21, $r0, 21
+addi.w $r22, $r0, 22
+addi.w $r23, $r0, 23
+addi.w $r24, $r0, 24
+addi.w $r25, $r0, 25
+addi.w $r26, $r0, 26
+addi.w $r27, $r0, 27
+addi.w $r28, $r0, 28
+addi.w $r29, $r0, 29
+addi.w $r30, $r0, 30
+addi.w $r31, $r0, 31
+
+/* Initialise the fp regs */
+movgr2fr.d $f0, $r0
+movgr2fr.d $f1, $r1
+movgr2fr.d $f2, $r0
+movgr2fr.d $f3, $r0
+movgr2fr.d $f4, $r4
+movgr2fr.d $f5, $r5
+movgr2fr.d $f6, $r6
+movgr2fr.d $f7, $r7
+movgr2fr.d $f8, $r8
+movgr2fr.d $f9, $r9
+movgr2fr.d $f10, $r10
+movgr2fr.d $f11, $r11
+movgr2fr.d $f12, $r12
+movgr2fr.d $f13, $r13
+movgr2fr.d $f14, $r14
+movgr2fr.d $f15, $r15
+movgr2fr.d $f16, $r16
+movgr2fr.d $f17, $r17
+movgr2fr.d $f18, $r18
+movgr2fr.d $f19, $r19
+movgr2fr.d $f20, $r20
+movgr2fr.d $f21, $r21
+movgr2fr.d $f22, $r22
+movgr2fr.d $f23, $r23
+movgr2fr.d $f24, $r24
+movgr2fr.d $f25, $r25
+movgr2fr.d $f26, $r26
+movgr2fr.d $f27, $r27
+movgr2fr.d $f28, $r28
+movgr2fr.d $f29, $r29
+movgr2fr.d $f30, $r30
+movgr2fr.d $f31, $r31
+movgr2cf $fcc0, $r0
+movgr2cf $fcc1, $r0
+movgr2cf $fcc2, $r0
+movgr2cf $fcc3, $r0
+movgr2cf $fcc4, $r0
+movgr2cf $fcc5, $r0
+movgr2cf $fcc6, $r0
+movgr2cf $fcc7, $r0
+
+/* do compare. */
+.int 0x000001f0
+/* exit test */
+.int 0x000001f1
-- 
2.31.1




* [RISU PATCH 3/5] loongarch: Implement risugen module
  2022-09-17  7:43 [RISU PATCH 0/5] Add LoongArch architectures support Song Gao
  2022-09-17  7:43 ` [RISU PATCH 1/5] risu: Use alternate stack Song Gao
  2022-09-17  7:43 ` [RISU PATCH 2/5] loongarch: Add LoongArch basic test support Song Gao
@ 2022-09-17  7:43 ` Song Gao
  2022-10-10 15:19   ` Richard Henderson
  2022-09-17  7:43 ` [RISU PATCH 4/5] loongarch: Add risufile with loongarch instructions Song Gao
  2022-09-17  7:43 ` [RISU PATCH 5/5] loongarch: Add block 'safefloat' and nanbox_s() Song Gao
  4 siblings, 1 reply; 16+ messages in thread
From: Song Gao @ 2022-09-17  7:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: richard.henderson, peter.maydell, alex.bennee, maobibo

Signed-off-by: Song Gao <gaosong@loongson.cn>
---
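Note for readers: write_mov_positive_ri() in this patch builds a
non-negative immediate from lu12i.w (upper 20 bits) followed by ori
(lower 12 bits). The C sketch below reproduces only that encoding
arithmetic, using the same opcode constants as the Perl code. The
function names are hypothetical, and the sketch assumes a 32-bit
immediate and register numbers below 32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* lu12i.w rd, si20 (opcode constant taken from the patch) */
uint32_t enc_lu12i_w(unsigned rd, uint32_t si20)
{
    return 0x14000000u | ((si20 & 0xfffffu) << 5) | rd;
}

/* ori rd, rj, ui12 (opcode constant taken from the patch) */
uint32_t enc_ori(unsigned rd, unsigned rj, uint32_t ui12)
{
    return 0x03800000u | ((ui12 & 0xfffu) << 10) | (rj << 5) | rd;
}

/*
 * Emit the instruction words that load imm into rd.
 * Returns the number of words written to out (1 or 2).
 */
size_t emit_mov_positive(uint32_t out[2], unsigned rd, uint32_t imm)
{
    uint32_t high20 = (imm >> 12) & 0xfffffu;
    size_t n = 0;

    if (high20) {
        out[n++] = enc_lu12i_w(rd, high20);       /* bits 31..12 */
        out[n++] = enc_ori(rd, rd, imm & 0xfffu); /* bits 11..0  */
    } else {
        out[n++] = enc_ori(rd, 0, imm & 0xfffu);  /* ori rd, r0, ui12 */
    }
    return n;
}
```

For example, loading 0x12345678 into r4 yields the two words
0x142468a4 and 0x0399e084, while 0x7ff needs only the single ori.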
 risugen_loongarch64.pm | 502 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 502 insertions(+)
 create mode 100644 risugen_loongarch64.pm
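
Note for readers: several helpers in this module (align(),
write_get_offset(), write_memblock_setup()) rely on power-of-two
alignment masks. A minimal C sketch of that arithmetic follows; the
function names are illustrative, and pick_offset() takes the random
value as a parameter so the result is reproducible.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if x is a power of two. */
int is_pow_of_2(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Round v down to a multiple of a (a must be a power of two). */
uint32_t align_down(uint32_t v, uint32_t a)
{
    return v & ~(a - 1);
}

/* Round v up to a multiple of a, as in (datablock + (align-1)) & ~(align-1). */
uint32_t align_up(uint32_t v, uint32_t a)
{
    return (v + (a - 1)) & ~(a - 1);
}

/*
 * Offset selection in the style of write_get_offset(): start at least
 * 256 bytes into the block, stay below 2048 - 256, then align down.
 */
uint32_t pick_offset(uint32_t r, uint32_t align)
{
    return align_down(r % (2048 - 512) + 256, align);
}
```

Aligning down after adding the 256-byte margin keeps the result at or
above 256 for any alignment up to 256, since 256 is itself a multiple
of every such power of two.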

diff --git a/risugen_loongarch64.pm b/risugen_loongarch64.pm
new file mode 100644
index 0000000..693fb71
--- /dev/null
+++ b/risugen_loongarch64.pm
@@ -0,0 +1,502 @@
+#!/usr/bin/perl -w
+###############################################################################
+# Copyright (c) 2022 Loongson Technology Corporation Limited
+# All rights reserved. This program and the accompanying materials
+# are made available under the terms of the Eclipse Public License v1.0
+# which accompanies this distribution, and is available at
+# http://www.eclipse.org/legal/epl-v10.html
+#
+# Contributors:
+#     based on Peter Maydell (Linaro) - initial implementation
+###############################################################################
+
+# risugen -- generate a test binary file for use with risu
+# See 'risugen --help' for usage information.
+package risugen_loongarch64;
+
+use strict;
+use warnings;
+
+use risugen_common;
+
+require Exporter;
+
+our @ISA    = qw(Exporter);
+our @EXPORT = qw(write_test_code);
+
+my $periodic_reg_random = 1;
+
+# Maximum alignment restriction permitted for a memory op.
+my $MAXALIGN = 64;
+
+my $OP_COMPARE = 0;        # compare registers
+my $OP_TESTEND = 1;        # end of test, stop
+my $OP_SETMEMBLOCK = 2;    # r4 is address of memory block (8192 bytes)
+my $OP_GETMEMBLOCK = 3;    # add the address of memory block to r4
+my $OP_COMPAREMEM = 4;     # compare memory block
+
+sub write_risuop($)
+{
+    my ($op) = @_;
+    insn32(0x000001f0 | $op);
+}
+
+sub write_set_fcsr($)
+{
+    my ($fcsr) = @_;
+    # movgr2fcsr r0, r0
+    insn32(0x0114c000);
+}
+
+# Global used to communicate between align(x) and reg() etc.
+my $alignment_restriction;
+
+sub set_reg_w($)
+{
+    my($reg)=@_;
+    # Set reg [0x0, 0x7FFFFFFF]
+
+    # $reg << 33
+    # slli.d  $reg, $reg, 33
+    insn32(0x410000 | 33 << 10 | $reg << 5 | $reg);
+    # $reg >> 33
+    # srli.d  $reg, $reg, 33
+    insn32(0x450000 | 33 << 10 | $reg << 5 | $reg);
+
+    return $reg;
+}
+
+sub align($)
+{
+    my ($a) = @_;
+    if (!is_pow_of_2($a) || ($a < 0) || ($a > $MAXALIGN)) {
+        die "bad align() value $a\n";
+    }
+    $alignment_restriction = $a;
+}
+
+sub write_sub_rrr($$$)
+{
+    my ($rd, $rj, $rk) = @_;
+    # sub.d rd, rj, rk
+    insn32(0x00118000 | $rk << 10 | $rj << 5 | $rd);
+}
+
+sub write_mov_rr($$$)
+{
+    my($rd, $rj, $rk) = @_;
+    # add.d rd, rj, r0
+    insn32(0x00108000 | 0 << 10 | $rj << 5 | $rd);
+}
+
+sub write_mov_positive_ri($$)
+{
+    # Use lu12i.w and ori instruction
+    my ($rd, $imm) = @_;
+    my $high_20 = ($imm >> 12) & 0xfffff;
+
+    if ($high_20) {
+        # lu12i.w rd, si20
+        insn32(0x14000000 | $high_20 << 5 | $rd);
+        # ori rd, rd, ui12
+        insn32(0x03800000 | ($imm & 0xfff) << 10 | $rd << 5 | $rd);
+    } else {
+        # ori rd, 0, ui12
+        insn32(0x03800000 | ($imm & 0xfff) << 10 | 0 << 5 | $rd);
+    }
+}
+
+sub write_mov_ri($$)
+{
+    my ($rd, $imm) = @_;
+
+    if ($imm < 0) {
+        my $tmp = 0 - $imm ;
+        write_mov_positive_ri($rd, $tmp);
+        write_sub_rrr($rd, 0, $rd);
+    } else {
+        write_mov_positive_ri($rd, $imm);
+    }
+}
+
+sub write_get_offset()
+{
+    # Emit code to get a random offset within the memory block, of the
+    # right alignment, into r4
+    # We require the offset to not be within 256 bytes of either
+    # end, to (more than) allow for the worst case data transfer, which is
+    # 16 * 64 bit regs
+    my $offset = (rand(2048 - 512) + 256) & ~($alignment_restriction - 1);
+    write_mov_ri(4, $offset);
+    write_risuop($OP_GETMEMBLOCK);
+}
+
+sub reg_plus_reg($$@)
+{
+    my ($base, $idx, @trashed) = @_;
+    my $savedidx = 0;
+    if ($idx == 4) {
+        # Save the index into some other register for the
+        # moment, because the risuop will trash r4.
+        $idx = 5;
+        $idx++ if $idx == $base;
+        $savedidx = 1;
+        write_mov_rr($idx, 4, 0);
+    }
+    # Get a random offset within the memory block, of the
+    # right alignment.
+    write_get_offset();
+
+    write_sub_rrr($base, 4, $idx);
+    if ($base != 4) {
+        if ($savedidx) {
+            write_mov_rr(4, $idx, 0);
+            write_mov_ri($idx, 0);
+        } else {
+            write_mov_ri(4, 0);
+        }
+    } else {
+	if ($savedidx) {
+            write_mov_ri($idx, 0);
+	}
+    }
+
+    if (grep $_ == $base, @trashed) {
+        return -1;
+    }
+    return $base;
+}
+
+sub reg_plus_imm($$@)
+{
+    # Handle reg + immediate addressing mode
+    my ($base, $imm, @trashed) = @_;
+
+    write_get_offset();
+    # Now r4 is the address we want to do the access to,
+    # so set the basereg by doing the inverse of the
+    # addressing mode calculation, ie base = r4 - imm
+    # We could do this more cleverly with a sub immediate.
+    if ($base != 4) {
+        write_mov_ri($base, $imm);
+        write_sub_rrr($base, 4, $base);
+        # Clear r4 to avoid register compare mismatches
+        # when the memory block location differs between machines.
+         write_mov_ri(4, 0);
+    }else {
+        # We borrow r1 as a temporary (not a problem
+        # as long as we don't leave anything in a register
+        # which depends on the location of the memory block)
+        write_mov_ri(1, $imm);
+        write_sub_rrr($base, 4, 1);
+    }
+
+    if (grep $_ == $base, @trashed) {
+        return -1;
+    }
+    return $base;
+}
+
+sub write_pc_adr($$)
+{
+    my($rd, $imm) = @_;
+    # pcaddi (si20 | 2bit 0) + pc
+    insn32(0x18000000 | $imm << 5 | $rd);
+}
+
+sub write_and($$$)
+{
+    my($rd, $rj, $rk)  = @_;
+    # and rd, rj, rk
+    insn32(0x148000 | $rk << 10 | $rj << 5 | $rd);
+}
+
+sub write_align_reg($$)
+{
+    my ($rd, $align) = @_;
+    # rd = rd & ~($align -1);
+    # use r1 as a temp register.
+    write_mov_ri(1, $align -1);
+    write_sub_rrr(1, 0, 1);
+    write_and($rd, $rd, 1);
+}
+
+sub write_jump_fwd($)
+{
+    my($len) = @_;
+    # b pc + len
+    my ($offslo, $offshi) = (($len / 4 + 1) & 0xffff, ($len / 4 + 1) >> 16);
+    insn32(0x50000000 | $offslo << 10 | $offshi);
+}
+
+sub write_memblock_setup()
+{
+    my $align = $MAXALIGN;
+    my $datalen = 8192 + $align;
+    if (($align > 255) || !is_pow_of_2($align) || $align < 4) {
+        die "bad alignment!";
+    }
+
+    # Set r4 to (datablock + (align-1)) & ~(align-1)
+    # datablock is at PC + (4 * 4 instructions) = PC + 16
+    write_pc_adr(4, (4 * 4) + ($align - 1)); #insn 1
+    write_align_reg(4, $align);              #insn 2
+    write_risuop($OP_SETMEMBLOCK);           #insn 3
+    write_jump_fwd($datalen);                #insn 4
+
+    for(my $i = 0; $i < $datalen / 4; $i++) {
+        insn32(rand(0xffffffff));
+    }
+}
+
+# Write random fp value of passed precision (1=single, 2=double, 4=quad)
+sub write_random_fpreg_var($)
+{
+    my ($precision) = @_;
+    my $randomize_low = 0;
+
+    if ($precision != 1 && $precision != 2 && $precision != 4) {
+        die "write_random_fpreg: invalid precision.\n";
+    }
+
+    my ($low, $high);
+    my $r = rand(100);
+    if ($r < 5) {
+        # +-0 (5%)
+        $low = $high = 0;
+        $high |= 0x80000000 if (rand() < 0.5);
+    } elsif ($r < 10) {
+        # NaN (5%)
+        # (plus a tiny chance of generating +-Inf)
+        $randomize_low = 1;
+        $high = rand(0xffffffff) | 0x7ff00000;
+    } elsif ($r < 15) {
+        # Infinity (5%)
+        $low = 0;
+        $high = 0x7ff00000;
+        $high |= 0x80000000 if (rand() < 0.5);
+    } elsif ($r < 30) {
+        # Denormalized number (15%)
+        # (plus tiny chance of +-0)
+        $randomize_low = 1;
+        $high = rand(0xffffffff) & ~0x7ff00000;
+    } else {
+        # Normalized number (70%)
+        # (plus a small chance of the other cases)
+        $randomize_low = 1;
+        $high = rand(0xffffffff);
+    }
+
+    for (my $i = 1; $i < $precision; $i++) {
+        if ($randomize_low) {
+            $low = rand(0xffffffff);
+        }
+        insn32($low);
+    }
+    insn32($high);
+}
+
+sub write_random_loongarch64_fpdata()
+{
+    # Load floating point registers
+    my $align = 16;
+    my $datalen = 32 * 16 + $align;
+    my $off = 0;
+    write_pc_adr(5, (4 * 4) + $align);       # insn 1  pcaddi
+    write_pc_adr(4, (3 * 4) + ($align - 1)); # insn 2  pcaddi
+    write_align_reg(4, $align);              # insn 3  andi
+    write_jump_fwd($datalen);                # insn 4  b pc + len
+
+    # Align safety
+    for (my $i = 0; $i < ($align / 4); $i++) {
+        insn32(rand(0xffffffff));
+    }
+
+    for (my $i = 0; $i < 32; $i++) {
+        write_random_fpreg_var(4); # 4 words (16 bytes) per register
+    }
+
+    $off = 0;
+    for (my $i = 0; $i < 32; $i++) {
+        my $tmp_reg = 6;
+        # r5 is fp register initial val
+        # r4 is aligned base address
+        # copy memory from r5 to r4
+        # ld.d r6, r5, $off
+        # st.d r6, r4, $off
+        # $off = $off + 16
+        insn32(0x28c00000 | $off << 10 | 5 << 5 | $tmp_reg);
+        insn32(0x29c00000 | $off << 10 | 4 << 5 | $tmp_reg);
+        $off = $off + 8;
+        insn32(0x28c00000 | $off << 10 | 5 << 5 | $tmp_reg);
+        insn32(0x29c00000 | $off << 10 | 4 << 5 | $tmp_reg);
+        $off = $off + 8;
+    }
+
+    $off = 0;
+    for (my $i = 0; $i < 32; $i++) {
+        # fld.d fd, r4, $off
+        insn32(0x2b800000 | $off << 10 | 4 << 5 | $i);
+        $off = $off + 16;
+    }
+}
+
+sub write_random_regdata()
+{
+    # General purpose registers, skip r2
+    write_mov_ri(1, rand(0xffffffff)); # init r1
+    for  (my $i = 3; $i < 32; $i++) {
+        write_mov_ri($i, rand(0xffffffff));
+    }
+}
+
+sub write_random_register_data($)
+{
+    my ($fp_enabled) = @_;
+
+    # Set fcc0 ~ fcc7
+    # movgr2cf $fcc0, $zero
+    insn32(0x114d800);
+    # movgr2cf $fcc1, $zero
+    insn32(0x114d801);
+    # movgr2cf $fcc2, $zero
+    insn32(0x114d802);
+    # movgr2cf $fcc3, $zero
+    insn32(0x114d803);
+    # movgr2cf $fcc4, $zero
+    insn32(0x114d804);
+    # movgr2cf $fcc5, $zero
+    insn32(0x114d805);
+    # movgr2cf $fcc6, $zero
+    insn32(0x114d806);
+    # movgr2cf $fcc7, $zero
+    insn32(0x114d807);
+
+    if ($fp_enabled) {
+        # Load floating point registers
+        write_random_loongarch64_fpdata();
+    }
+
+    write_random_regdata();
+    write_risuop($OP_COMPARE);
+}
+
+sub gen_one_insn($$)
+{
+    # Given an instruction-details array, generate an instruction
+    my $constraintfailures = 0;
+
+    INSN: while(1) {
+        my ($forcecond, $rec) = @_;
+        my $insn = int(rand(0xffffffff));
+        my $insnname = $rec->{name};
+        my $insnwidth = $rec->{width};
+        my $fixedbits = $rec->{fixedbits};
+        my $fixedbitmask = $rec->{fixedbitmask};
+        my $constraint = $rec->{blocks}{"constraints"};
+        my $memblock = $rec->{blocks}{"memory"};
+
+        $insn &= ~$fixedbitmask;
+        $insn |= $fixedbits;
+
+        if (defined $constraint) {
+            # User-specified constraint: evaluate in an environment
+            # with variables set corresponding to the variable fields.
+            my $v = eval_with_fields($insnname, $insn, $rec, "constraints", $constraint);
+            if(!$v) {
+                $constraintfailures++;
+                if ($constraintfailures > 10000) {
+                    print "10000 consecutive constraint failures for $insnname constraints string:\n$constraint\n";
+                    exit (1);
+                }
+                next INSN;
+            }
+        }
+
+        # OK, we got a good one
+        $constraintfailures = 0;
+
+        my $basereg;
+
+        if (defined $memblock) {
+            # This is a load or store. We simply evaluate the block,
+            # which is expected to be a call to a function which emits
+            # the code to set up the base register and returns the
+            # number of the base register.
+            # Default alignment requirement for ARM is 4 bytes,
+            # we use 16 for Aarch64, although often unnecessary and overkill.
+            align(16);
+            $basereg = eval_with_fields($insnname, $insn, $rec, "memory", $memblock);
+        }
+
+        insn32($insn);
+
+        if (defined $memblock) {
+            # Clean up following a memory access instruction:
+            # we need to turn the (possibly written-back) basereg
+            # into an offset from the base of the memory block,
+            # to avoid making register values depend on memory layout.
+            # $basereg -1 means the basereg was a target of a load
+            # (and so it doesn't contain a memory address after the op)
+            if ($basereg != -1) {
+                write_mov_ri($basereg, 0);
+            }
+            write_risuop($OP_COMPAREMEM);
+        }
+        return;
+    }
+}
+
+sub write_test_code($)
+{
+    my ($params) = @_;
+
+    my $condprob = $params->{ 'condprob' };
+    my $fcsr = $params->{'fpscr'};
+    my $numinsns = $params->{ 'numinsns' };
+    my $fp_enabled = $params->{ 'fp_enabled' };
+    my $outfile = $params->{ 'outfile' };
+
+    my %insn_details = %{ $params->{ 'details' } };
+    my @keys = @{ $params->{ 'keys' } };
+
+    open_bin($outfile);
+
+    # Convert from probability that insn will be conditional to
+    # probability of forcing insn to unconditional
+    $condprob = 1 - $condprob;
+
+    # TODO better random number generator?
+    srand(0);
+
+    print "Generating code using patterns: @keys...\n";
+    progress_start(78, $numinsns);
+
+    if ($fp_enabled) {
+        write_set_fcsr($fcsr);
+    }
+
+    if (grep { defined($insn_details{$_}->{blocks}->{"memory"}) } @keys) {
+        write_memblock_setup();
+    }
+    # Memblock setup doesn't clean its registers, so this must come afterwards.
+    write_random_register_data($fp_enabled);
+
+    for my $i (1..$numinsns) {
+        my $insn_enc = $keys[int rand (@keys)];
+        my $forcecond = (rand() < $condprob) ? 1 : 0;
+        gen_one_insn($forcecond, $insn_details{$insn_enc});
+        write_risuop($OP_COMPARE);
+        # Rewrite the registers periodically. This avoids the tendency
+        # for the VFP registers to decay to NaNs and zeroes.
+        if ($periodic_reg_random && ($i % 100) == 0) {
+            write_random_register_data($fp_enabled);
+        }
+        progress_update($i);
+    }
+    write_risuop($OP_TESTEND);
+    progress_end();
+    close_bin();
+}
+
+1;
-- 
2.31.1




* [RISU PATCH 4/5] loongarch: Add risufile with loongarch instructions
  2022-09-17  7:43 [RISU PATCH 0/5] Add LoongArch architectures support Song Gao
                   ` (2 preceding siblings ...)
  2022-09-17  7:43 ` [RISU PATCH 3/5] loongarch: Implement risugen module Song Gao
@ 2022-09-17  7:43 ` Song Gao
  2022-10-10 15:21   ` Richard Henderson
  2022-09-17  7:43 ` [RISU PATCH 5/5] loongarch: Add block 'safefloat' and nanbox_s() Song Gao
  4 siblings, 1 reply; 16+ messages in thread
From: Song Gao @ 2022-09-17  7:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: richard.henderson, peter.maydell, alex.bennee, maobibo

Signed-off-by: Song Gao <gaosong@loongson.cn>
---
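Note for readers: each pattern in this file fixes some bits and leaves
named fields random, and the !constraints block rejects encodings that
touch r2 (tp). The C sketch below illustrates how such a pattern could
be turned into instruction words by rejection sampling, in the spirit
of gen_one_insn() from patch 3. The struct, the xorshift generator and
the function names are assumptions made for this example; the mask and
bits for add_w are derived from the pattern line below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one pattern: which bits are fixed. */
struct pattern {
    uint32_t fixedbits;
    uint32_t fixedbitmask;
};

/* add_w: 0000 00000001 00000 rk:5 rj:5 rd:5 (17 fixed bits) */
const struct pattern add_w = { 0x00100000u, 0xffff8000u };

/* Small deterministic xorshift32 PRNG; state must be nonzero. */
static uint32_t next_rand(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Generate one encoding with rk, rj and rd all different from r2. */
uint32_t gen_insn(const struct pattern *p, uint32_t *rng)
{
    for (;;) {
        uint32_t insn = next_rand(rng);
        unsigned rd, rj, rk;

        insn = (insn & ~p->fixedbitmask) | p->fixedbits;
        rd = insn & 0x1f;
        rj = (insn >> 5) & 0x1f;
        rk = (insn >> 10) & 0x1f;
        if (rd != 2 && rj != 2 && rk != 2) {
            return insn;   /* constraint satisfied */
        }
        /* otherwise draw again, as risugen does on a failed constraint */
    }
}
```

Every word returned keeps the add_w opcode bits intact and never names
r2, which matches what the constraints on the three-register patterns
ask for.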
 loongarch64.risu | 573 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 573 insertions(+)
 create mode 100644 loongarch64.risu

diff --git a/loongarch64.risu b/loongarch64.risu
new file mode 100644
index 0000000..d059811
--- /dev/null
+++ b/loongarch64.risu
@@ -0,0 +1,573 @@
+###############################################################################
+# Copyright (c) 2022 Loongson Technology Corporation Limited
+# All rights reserved. This program and the accompanying materials
+# are made available under the terms of the Eclipse Public License v1.0
+# which accompanies this distribution, and is available at
+# http://www.eclipse.org/legal/epl-v10.html
+#
+# Contributors:
+#     based on aarch64.risu by Claudio Fontana
+#     based on arm.risu by Peter Maydell
+###############################################################################
+
+# Input file for risugen defining LoongArch64 instructions
+.mode loongarch64
+
+#
+# Fixed point arithmetic operation instruction
+#
+add_w LA64 0000 00000001 00000 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+add_d LA64 0000 00000001 00001 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+sub_w LA64 0000 00000001 00010 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+sub_d LA64 0000 00000001 00011 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+slt LA64 0000 00000001 00100 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+sltu LA64 0000 00000001 00101 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+slti LA64 0000 001000 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+sltui LA64 0000 001001 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+nor LA64 0000 00000001 01000 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+and LA64 0000 00000001 01001 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+or LA64 0000 00000001 01010 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+xor LA64 0000 00000001 01011 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+orn LA64 0000 00000001 01100 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+andn LA64 0000 00000001 01101 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+mul_w LA64 0000 00000001 11000 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+mul_d LA64 0000 00000001 11011 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+mulh_w LA64 0000 00000001 11001 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+mulh_d LA64 0000 00000001 11100 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+mulh_wu LA64 0000 00000001 11010 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+mulh_du LA64 0000 00000001 11101 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+mulw_d_w LA64 0000 00000001 11110 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+mulw_d_wu LA64 0000 00000001 11111 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+
+#div.{w[u]/d[u]} rd,rj,rk
+# the docement 2.2.13,  rk, rj, need in 32bit [0x0 ~0x7FFFFFFF]
+# use function set_reg_w($reg)
+div_w LA64 0000 00000010 00000 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { set_reg_w($rj); set_reg_w($rk); }
+div_wu LA64 0000 00000010 00010 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { set_reg_w($rj); set_reg_w($rk); }
+div_d LA64 0000 00000010 00100 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+div_du LA64 0000 00000010 00110 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+mod_w LA64 0000 00000010 00001 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { set_reg_w($rj); set_reg_w($rk); }
+mod_wu LA64 0000 00000010 00011 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { set_reg_w($rj); set_reg_w($rk); }
+mod_d LA64 0000 00000010 00101 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+mod_du LA64 0000 00000010 00111 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+
+alsl_w LA64 0000 00000000 010 sa2:2 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+alsl_wu LA64 0000 00000000 011 sa2:2 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+alsl_d LA64 0000 00000010 110 sa2:2 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+lu12i_w LA64 0001 010 si20:20 rd:5 \
+    !constraints { $rd != 2; }
+lu32i_d LA64 0001 011 si20:20 rd:5 \
+    !constraints { $rd != 2; }
+lu52i_d LA64 0000 001100 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+addi_w LA64 0000 001010 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+addi_d LA64 0000 001011 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+addu16i_d LA64 0001 00 si16:16 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+andi LA64 0000 001101 ui12:12 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+ori LA64 0000 001110 ui12:12 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+xori LA64 0000 001111 ui12:12 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+
+#
+# Fixed point shift operation instruction
+#
+sll_w LA64 0000 00000001 01110 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+sll_d LA64 0000 00000001 10001 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+srl_w LA64 0000 00000001 01111 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+srl_d LA64 0000 00000001 10010 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+sra_w LA64 0000 00000001 10000 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+sra_d LA64 0000 00000001 10011 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+rotr_w LA64 0000 00000001 10110 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+rotr_d LA64 0000 00000001 10111 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+slli_w LA64 0000 00000100 00001 ui5:5 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+slli_d LA64 0000 00000100 0001 ui6:6 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+srli_w LA64 0000 00000100 01001 ui5:5 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+srli_d LA64 0000 00000100 0101 ui6:6 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+srai_w LA64 0000 00000100 10001 ui5:5 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+srai_d LA64 0000 00000100 1001 ui6:6 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+rotri_w LA64 0000 00000100 11001 ui5:5 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+rotri_d LA64 0000 00000100 1101 ui6:6 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+
+#
+# Fixed point bit operation instruction
+#
+ext_w_h LA64 0000 00000000 00000 10110 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+ext_w_b LA64 0000 00000000 00000 10111 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+clo_w LA64 0000 00000000 00000 00100 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+clz_w LA64 0000 00000000 00000 00101 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+cto_w LA64 0000 00000000 00000 00110 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+ctz_w LA64 0000 00000000 00000 00111 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+clo_d LA64 0000 00000000 00000 01000 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+clz_d LA64 0000 00000000 00000 01001 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+cto_d LA64 0000 00000000 00000 01010 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+ctz_d LA64 0000 00000000 00000 01011 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+revb_2h LA64 0000 00000000 00000 01100 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+revb_4h LA64 0000 00000000 00000 01101 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+revb_2w LA64 0000 00000000 00000 01110 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+revb_d  LA64 0000 00000000 00000 01111 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+revh_2w LA64 0000 00000000 00000 10000 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+revh_d  LA64 0000 00000000 00000 10001 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+bitrev_4b LA64 0000 00000000 00000 10010 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+bitrev_8b LA64 0000 00000000 00000 10011 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+bitrev_w  LA64 0000 00000000 00000 10100 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+bitrev_d  LA64 0000 00000000 00000 10101 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2; }
+bytepick_w LA64 0000 00000000 100 sa2:2 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+bytepick_d LA64 0000 00000000 11 sa3:3 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+maskeqz LA64 0000 00000001 00110 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+masknez LA64 0000 00000001 00111 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+bstrins_w LA64 0000 0000011 msbw:5 0 lsbw:5 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2 && $msbw >= $lsbw; }
+bstrins_d LA64 0000 000010 msbd:6 lsbd:6 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2 && $msbd >= $lsbd; }
+bstrpick_w LA64 0000 0000011 msbw:5 1 lsbw:5 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2 && $msbw >= $lsbw; }
+bstrpick_d LA64 0000 000011 msbd:6 lsbd:6 rj:5 rd:5 \
+    !constraints { $rj != 2 && $rd != 2 && $msbd >= $lsbd; }
+
+#
+# Fixed point load/store instruction
+#
+ld_b  LA64 0010 100000 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+ld_h  LA64 0010 100001 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+ld_w  LA64 0010 100010 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+ld_d  LA64 0010 100011 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+ld_bu LA64 0010 101000 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+ld_hu LA64 0010 101001 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+ld_wu LA64 0010 101010 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+st_b  LA64 0010 100100 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rd && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+st_h  LA64 0010 100101 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rd && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+st_w  LA64 0010 100110 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rd && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+st_d  LA64 0010 100111 si12:12 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rd && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+ldx_b LA64 0011 10000000 00000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+ldx_h LA64 0011 10000000 01000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+ldx_w LA64 0011 10000000 10000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+ldx_d LA64 0011 10000000 11000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+ldx_bu LA64 0011 10000010 00000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+ldx_hu LA64 0011 10000010 01000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+ldx_wu LA64 0011 10000010 10000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+stx_b LA64 0011 10000001 00000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rd != $rj && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+stx_h LA64 0011 10000001 01000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rd != $rj && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+stx_w LA64 0011 10000001 10000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rd != $rj && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+stx_d LA64 0011 10000001 11000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rd != $rj && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+preld  LA64 0010 101011 si12:12 rj:5 hint:5 \
+    !constraints { $rj != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+dbar LA64 0011 10000111 00100 hint:15
+ibar LA64 0011 10000111 00101 hint:15
+ldptr_w LA64 0010 0100 si14:14 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si14, 14) * 4); }
+ldptr_d LA64 0010 0110 si14:14 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si14, 14) * 4); }
+stptr_w LA64 0010 0101 si14:14 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rd && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si14, 14) * 4); }
+stptr_d LA64 0010 0111 si14:14 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != $rd && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si14, 14) * 4); }
+
+#
+# Fixed point atomic instruction
+#
+ll_w LA64 0010 0000 si14:14 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si14, 14) * 4); }
+ll_d LA64 0010 0010 si14:14 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si14, 14) * 4); }
+
+amswap_w LA64 0011 10000110 00000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amswap_d LA64 0011 10000110 00001 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amadd_w LA64 0011 10000110 00010 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amadd_d LA64 0011 10000110 00011 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amand_w LA64 0011 10000110 00100 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amand_d LA64 0011 10000110 00101 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amor_w LA64 0011 10000110 00110 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amor_d LA64 0011 10000110 00111 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amxor_w LA64 0011 10000110 01000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amxor_d LA64 0011 10000110 01001 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammax_w LA64 0011 10000110 01010 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammax_d LA64 0011 10000110 01011 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammin_w LA64 0011 10000110 01100 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammin_d LA64 0011 10000110 01101 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammax_wu LA64 0011 10000110 01110 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammax_du LA64 0011 10000110 01111 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammin_wu LA64 0011 10000110 10000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammin_du LA64 0011 10000110 10001 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+
+amswap_db_w LA64 0011 10000110 10010 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amswap_db_d LA64 0011 10000110 10011 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amadd_db_w LA64 0011 10000110 10100 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amadd_db_d LA64 0011 10000110 10101 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amand_db_w LA64 0011 10000110 10110 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amand_db_d LA64 0011 10000110 10111 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amor_db_w LA64 0011 10000110 11000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amor_db_d LA64 0011 10000110 11001 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amxor_db_w LA64 0011 10000110 11010 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+amxor_db_d LA64 0011 10000110 11011 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammax_db_w LA64 0011 10000110 11100 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammax_db_d LA64 0011 10000110 11101 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammin_db_w LA64 0011 10000110 11110 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammin_db_d LA64 0011 10000110 11111 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammax_db_wu LA64 0011 10000111 00000 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammax_db_du LA64 0011 10000111 00001 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammin_db_wu LA64 0011 10000111 00010 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+ammin_db_du LA64 0011 10000111 00011 rk:5 rj:5 rd:5 \
+    !constraints { $rj != 0 && $rd != $rj && $rj != $rk && $rd != $rk && $rk != 2 && $rj != 2 && $rd != 2; } \
+    !memory { reg_plus_reg($rj, 0); }
+
+#
+# Fixed point extra instruction
+#
+crc_w_b_w LA64 0000 00000010 01000 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+crc_w_h_w LA64 0000 00000010 01001 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+crc_w_w_w LA64 0000 00000010 01010 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+crc_w_d_w LA64 0000 00000010 01011 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+crcc_w_b_w LA64 0000 00000010 01100 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+crcc_w_h_w LA64 0000 00000010 01101 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+crcc_w_w_w LA64 0000 00000010 01110 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+crcc_w_d_w LA64 0000 00000010 01111 rk:5 rj:5 rd:5 \
+    !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
+
+#
+# Floating point arithmetic operation instruction
+#
+fadd_s LA64 0000 00010000 00001 fk:5 fj:5 fd:5
+fadd_d LA64 0000 00010000 00010 fk:5 fj:5 fd:5
+fsub_s LA64 0000 00010000 00101 fk:5 fj:5 fd:5
+fsub_d LA64 0000 00010000 00110 fk:5 fj:5 fd:5
+fmul_s LA64 0000 00010000 01001 fk:5 fj:5 fd:5
+fmul_d LA64 0000 00010000 01010 fk:5 fj:5 fd:5
+fdiv_s LA64 0000 00010000 01101 fk:5 fj:5 fd:5
+fdiv_d LA64 0000 00010000 01110 fk:5 fj:5 fd:5
+fmadd_s LA64 0000 10000001 fa:5 fk:5 fj:5 fd:5
+fmadd_d LA64 0000 10000010 fa:5 fk:5 fj:5 fd:5
+fmsub_s LA64 0000 10000101 fa:5 fk:5 fj:5 fd:5
+fmsub_d LA64 0000 10000110 fa:5 fk:5 fj:5 fd:5
+fnmadd_s LA64 0000 10001001 fa:5 fk:5 fj:5 fd:5
+fnmadd_d LA64 0000 10001010 fa:5 fk:5 fj:5 fd:5
+fnmsub_s LA64 0000 10001101 fa:5 fk:5 fj:5 fd:5
+fnmsub_d LA64 0000 10001110 fa:5 fk:5 fj:5 fd:5
+fmax_s LA64 0000 00010000 10001 fk:5 fj:5 fd:5
+fmax_d LA64 0000 00010000 10010 fk:5 fj:5 fd:5
+fmin_s LA64 0000 00010000 10101 fk:5 fj:5 fd:5
+fmin_d LA64 0000 00010000 10110 fk:5 fj:5 fd:5
+fmaxa_s LA64 0000 00010000 11001 fk:5 fj:5 fd:5
+fmaxa_d LA64 0000 00010000 11010 fk:5 fj:5 fd:5
+fmina_s LA64 0000 00010000 11101 fk:5 fj:5 fd:5
+fmina_d LA64 0000 00010000 11110 fk:5 fj:5 fd:5
+fabs_s LA64 0000 00010001 01000 00001 fj:5 fd:5
+fabs_d LA64 0000 00010001 01000 00010 fj:5 fd:5
+fneg_s LA64 0000 00010001 01000 00101 fj:5 fd:5
+fneg_d LA64 0000 00010001 01000 00110 fj:5 fd:5
+fsqrt_s LA64 0000 00010001 01000 10001 fj:5 fd:5
+fsqrt_d LA64 0000 00010001 01000 10010 fj:5 fd:5
+frecip_s LA64 0000 00010001 01000 10101 fj:5 fd:5
+frecip_d LA64 0000 00010001 01000 10110 fj:5 fd:5
+frsqrt_s LA64 0000 00010001 01000 11001 fj:5 fd:5
+frsqrt_d LA64 0000 00010001 01000 11010 fj:5 fd:5
+fscaleb_s LA64 0000 00010001 00001 fk:5 fj:5 fd:5
+fscaleb_d LA64 0000 00010001 00010 fk:5 fj:5 fd:5
+flogb_s LA64 0000 00010001 01000 01001 fj:5 fd:5
+flogb_d LA64 0000 00010001 01000 01010 fj:5 fd:5
+fcopysign_s LA64 0000 00010001 00101 fk:5 fj:5 fd:5
+fcopysign_d LA64 0000 00010001 00110 fk:5 fj:5 fd:5
+fclass_s LA64 0000 00010001 01000 01101 fj:5 fd:5
+fclass_d LA64 0000 00010001 01000 01110 fj:5 fd:5
+
+#
+# Floating point compare instruction
+#
+fcmp_cond_s LA64 0000 11000001 cond:5 fk:5 fj:5 00 cd:3 \
+    !constraints { $cond > 0 && $cond < 0x12; }
+fcmp_cond_d LA64 0000 11000010 cond:5 fk:5 fj:5 00 cd:3 \
+    !constraints { $cond > 0 && $cond < 0x12; }
+
+#
+# Floating point conversion instruction
+#
+fcvt_s_d LA64 0000 00010001 10010 00110 fj:5 fd:5
+fcvt_d_s LA64 0000 00010001 10010 01001 fj:5 fd:5
+ftintrm_w_s LA64 0000 00010001 10100 00001 fj:5 fd:5
+ftintrm_w_d LA64 0000 00010001 10100 00010 fj:5 fd:5
+ftintrm_l_s LA64 0000 00010001 10100 01001 fj:5 fd:5
+ftintrm_l_d LA64 0000 00010001 10100 01010 fj:5 fd:5
+ftintrp_w_s LA64 0000 00010001 10100 10001 fj:5 fd:5
+ftintrp_w_d LA64 0000 00010001 10100 10010 fj:5 fd:5
+ftintrp_l_s LA64 0000 00010001 10100 11001 fj:5 fd:5
+ftintrp_l_d LA64 0000 00010001 10100 11010 fj:5 fd:5
+ftintrz_w_s LA64 0000 00010001 10101 00001 fj:5 fd:5
+ftintrz_w_d LA64 0000 00010001 10101 00010 fj:5 fd:5
+ftintrz_l_s LA64 0000 00010001 10101 01001 fj:5 fd:5
+ftintrz_l_d LA64 0000 00010001 10101 01010 fj:5 fd:5
+ftintrne_w_s LA64 0000 00010001 10101 10001 fj:5 fd:5
+ftintrne_w_d LA64 0000 00010001 10101 10010 fj:5 fd:5
+ftintrne_l_s LA64 0000 00010001 10101 11001 fj:5 fd:5
+ftintrne_l_d LA64 0000 00010001 10101 11010 fj:5 fd:5
+ftint_w_s LA64 0000 00010001 10110 00001 fj:5 fd:5
+ftint_w_d LA64 0000 00010001 10110 00010 fj:5 fd:5
+ftint_l_s LA64 0000 00010001 10110 01001 fj:5 fd:5
+ftint_l_d LA64 0000 00010001 10110 01010 fj:5 fd:5
+ffint_s_w LA64 0000 00010001 11010 00100 fj:5 fd:5
+ffint_s_l LA64 0000 00010001 11010 00110 fj:5 fd:5
+ffint_d_w LA64 0000 00010001 11010 01000 fj:5 fd:5
+ffint_d_l LA64 0000 00010001 11010 01010 fj:5 fd:5
+frint_s LA64 0000 00010001 11100 10001 fj:5 fd:5
+frint_d LA64 0000 00010001 11100 10010 fj:5 fd:5
+
+#
+# Floating point move instruction
+#
+fmov_s LA64 0000 00010001 01001 00101 fj:5 fd:5
+fmov_d LA64 0000 00010001 01001 00110 fj:5 fd:5
+fsel LA64 0000 11010000 00 ca:3 fk:5 fj:5 fd:5
+movgr2fr_w LA64 0000 00010001 01001 01001 rj:5 fd:5 \
+    !constraints { $rj != 2; }
+movgr2fr_d LA64 0000 00010001 01001 01010 rj:5 fd:5 \
+    !constraints { $rj != 2; }
+movgr2frh_w LA64 0000 00010001 01001 01011 rj:5 fd:5 \
+    !constraints { $rj != 2; }
+movfr2gr_s LA64 0000 00010001 01001 01101 fj:5 rd:5 \
+    !constraints { $rd != 2; }
+movfr2gr_d LA64 0000 00010001 01001 01110 fj:5 rd:5 \
+    !constraints { $rd != 2; }
+movfrh2gr_s LA64 0000 00010001 01001 01111 fj:5 rd:5 \
+    !constraints { $rd != 2; }
+movfr2cf LA64 0000 00010001 01001 10100 fj:5 00 cd:3
+movcf2fr LA64 0000 00010001 01001 10101 00 cj:3 fd:5
+movgr2cf LA64 0000 00010001 01001 10110 rj:5 00 cd:3 \
+    !constraints { $rj != 2; }
+movcf2gr LA64 0000 00010001 01001 10111 00 cj:3 rd:5 \
+    !constraints { $rd != 2; }
+
+#
+# Floating point load/store instruction
+#
+fld_s LA64 0010 101100 si12:12 rj:5 fd:5 \
+    !constraints { $rj != 0 && $rj != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+fst_s LA64 0010 101101 si12:12 rj:5 fd:5 \
+    !constraints { $rj != 0 && $rj != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+fld_d LA64 0010 101110 si12:12 rj:5 fd:5 \
+    !constraints { $rj != 0 && $rj != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+fst_d LA64 0010 101111 si12:12 rj:5 fd:5 \
+    !constraints { $rj != 0 && $rj != 2; } \
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+fldx_s LA64 0011 10000011 00000 rk:5 rj:5 fd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+fldx_d LA64 0011 10000011 01000 rk:5 rj:5 fd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+fstx_s LA64 0011 10000011 10000 rk:5 rj:5 fd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
+fstx_d LA64 0011 10000011 11000 rk:5 rj:5 fd:5 \
+    !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2; } \
+    !memory { reg_plus_reg($rj, $rk); }
-- 
2.31.1
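
For readers unfamiliar with the risufile syntax, the sketch below shows how one
pattern from the patch above, "add_w LA64 0000 00000001 00000 rk:5 rj:5 rd:5",
maps onto a 32-bit instruction word, and what its !constraints block rejects.
The helper names (encode_add_w, add_w_constraints_ok, ADD_W_OPCODE) are
hypothetical and are not part of RISU; risugen does this work in Perl. Register
2 is the thread pointer in the LoongArch ABI, which is presumably why every
pattern excludes it, though the patch does not say so.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fixed bits 31..15 of add.w: "0000 00000001 00000" from the pattern. */
#define ADD_W_OPCODE 0x00100000u

/* Mirrors "!constraints { $rk != 2 && $rj != 2 && $rd != 2; }". */
static bool add_w_constraints_ok(unsigned rk, unsigned rj, unsigned rd)
{
    return rk != 2 && rj != 2 && rd != 2;
}

/* Place the three 5-bit fields in pattern order: rk:5 rj:5 rd:5. */
static uint32_t encode_add_w(unsigned rk, unsigned rj, unsigned rd)
{
    assert(rk < 32 && rj < 32 && rd < 32);
    return ADD_W_OPCODE | (rk << 10) | (rj << 5) | rd;
}
```

A generator following this scheme would draw random field values, discard any
draw for which add_w_constraints_ok() returns false, and emit the encoded word.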




* [RISU PATCH 5/5] loongarch: Add block 'safefloat' and nanbox_s()
  2022-09-17  7:43 [RISU PATCH 0/5] Add LoongArch architectures support Song Gao
                   ` (3 preceding siblings ...)
  2022-09-17  7:43 ` [RISU PATCH 4/5] loongarch: Add risufile with loongarch instructions Song Gao
@ 2022-09-17  7:43 ` Song Gao
  2022-10-10 15:24   ` Richard Henderson
  4 siblings, 1 reply; 16+ messages in thread
From: Song Gao @ 2022-09-17  7:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: richard.henderson, peter.maydell, alex.bennee, maobibo

Some LoongArch instructions don't care about the high 32 bits of the
destination register, so use nanbox_s() to set the high 32 bits to 0xffffffff.
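
As a rough illustration of the intended effect (not the actual code):
nanbox_s() in risugen_loongarch64.pm is a Perl routine that emits instructions
into the test image, and its body is not reproduced here. The C sketch below
only shows the resulting register value, assuming a 64-bit FP register whose
low 32 bits hold the single-precision result. The name nanbox_s_value is
hypothetical.

```c
#include <stdint.h>

/*
 * Value-level view of NaN-boxing: keep the low 32 bits (the
 * single-precision result) and force the high 32 bits to all ones.
 */
static uint64_t nanbox_s_value(uint64_t fpr)
{
    return fpr | 0xffffffff00000000ull;
}
```

With the high half forced to a known value, two implementations that differ
only in what they leave in those bits should compare equal.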

Signed-off-by: Song Gao <gaosong@loongson.cn>
---
 loongarch64.risu       | 119 +++++++++++++++++++++++++++--------------
 risugen                |   2 +-
 risugen_loongarch64.pm |  30 +++++++++++
 3 files changed, 110 insertions(+), 41 deletions(-)

diff --git a/loongarch64.risu b/loongarch64.risu
index d059811..d625a12 100644
--- a/loongarch64.risu
+++ b/loongarch64.risu
@@ -62,7 +62,7 @@ mulw_d_wu LA64 0000 00000001 11111 rk:5 rj:5 rd:5 \
     !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
 
 #div.{w[u]/d[u]} rd,rj,rk
-# the docement 2.2.13,  rk, rj, need in 32bit [0x0 ~0x7FFFFFFF]
+# div.w[u], mod.w[u]: rk and rj must be in [0x0, 0x7FFFFFFF]
 # use function set_reg_w($reg)
 div_w LA64 0000 00000010 00000 rk:5 rj:5 rd:5 \
     !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
@@ -436,47 +436,68 @@ crcc_w_d_w LA64 0000 00000010 01111 rk:5 rj:5 rd:5 \
 #
 # Floating point arithmetic operation instruction
 #
-fadd_s LA64 0000 00010000 00001 fk:5 fj:5 fd:5
+fadd_s LA64 0000 00010000 00001 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fadd_d LA64 0000 00010000 00010 fk:5 fj:5 fd:5
-fsub_s LA64 0000 00010000 00101 fk:5 fj:5 fd:5
+fsub_s LA64 0000 00010000 00101 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fsub_d LA64 0000 00010000 00110 fk:5 fj:5 fd:5
-fmul_s LA64 0000 00010000 01001 fk:5 fj:5 fd:5
+fmul_s LA64 0000 00010000 01001 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fmul_d LA64 0000 00010000 01010 fk:5 fj:5 fd:5
-fdiv_s LA64 0000 00010000 01101 fk:5 fj:5 fd:5
+fdiv_s LA64 0000 00010000 01101 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fdiv_d LA64 0000 00010000 01110 fk:5 fj:5 fd:5
-fmadd_s LA64 0000 10000001 fa:5 fk:5 fj:5 fd:5
+fmadd_s LA64 0000 10000001 fa:5 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fmadd_d LA64 0000 10000010 fa:5 fk:5 fj:5 fd:5
-fmsub_s LA64 0000 10000101 fa:5 fk:5 fj:5 fd:5
+fmsub_s LA64 0000 10000101 fa:5 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fmsub_d LA64 0000 10000110 fa:5 fk:5 fj:5 fd:5
-fnmadd_s LA64 0000 10001001 fa:5 fk:5 fj:5 fd:5
+fnmadd_s LA64 0000 10001001 fa:5 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fnmadd_d LA64 0000 10001010 fa:5 fk:5 fj:5 fd:5
-fnmsub_s LA64 0000 10001101 fa:5 fk:5 fj:5 fd:5
+fnmsub_s LA64 0000 10001101 fa:5 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fnmsub_d LA64 0000 10001110 fa:5 fk:5 fj:5 fd:5
-fmax_s LA64 0000 00010000 10001 fk:5 fj:5 fd:5
+fmax_s LA64 0000 00010000 10001 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fmax_d LA64 0000 00010000 10010 fk:5 fj:5 fd:5
-fmin_s LA64 0000 00010000 10101 fk:5 fj:5 fd:5
+fmin_s LA64 0000 00010000 10101 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fmin_d LA64 0000 00010000 10110 fk:5 fj:5 fd:5
-fmaxa_s LA64 0000 00010000 11001 fk:5 fj:5 fd:5
+fmaxa_s LA64 0000 00010000 11001 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fmaxa_d LA64 0000 00010000 11010 fk:5 fj:5 fd:5
-fmina_s LA64 0000 00010000 11101 fk:5 fj:5 fd:5
+fmina_s LA64 0000 00010000 11101 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fmina_d LA64 0000 00010000 11110 fk:5 fj:5 fd:5
-fabs_s LA64 0000 00010001 01000 00001 fj:5 fd:5
+fabs_s LA64 0000 00010001 01000 00001 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fabs_d LA64 0000 00010001 01000 00010 fj:5 fd:5
-fneg_s LA64 0000 00010001 01000 00101 fj:5 fd:5
+fneg_s LA64 0000 00010001 01000 00101 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fneg_d LA64 0000 00010001 01000 00110 fj:5 fd:5
-fsqrt_s LA64 0000 00010001 01000 10001 fj:5 fd:5
+fsqrt_s LA64 0000 00010001 01000 10001 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fsqrt_d LA64 0000 00010001 01000 10010 fj:5 fd:5
-frecip_s LA64 0000 00010001 01000 10101 fj:5 fd:5
+frecip_s LA64 0000 00010001 01000 10101 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 frecip_d LA64 0000 00010001 01000 10110 fj:5 fd:5
-frsqrt_s LA64 0000 00010001 01000 11001 fj:5 fd:5
+frsqrt_s LA64 0000 00010001 01000 11001 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 frsqrt_d LA64 0000 00010001 01000 11010 fj:5 fd:5
-fscaleb_s LA64 0000 00010001 00001 fk:5 fj:5 fd:5
+fscaleb_s LA64 0000 00010001 00001 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fscaleb_d LA64 0000 00010001 00010 fk:5 fj:5 fd:5
-flogb_s LA64 0000 00010001 01000 01001 fj:5 fd:5
+flogb_s LA64 0000 00010001 01000 01001 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 flogb_d LA64 0000 00010001 01000 01010 fj:5 fd:5
-fcopysign_s LA64 0000 00010001 00101 fk:5 fj:5 fd:5
+fcopysign_s LA64 0000 00010001 00101 fk:5 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fcopysign_d LA64 0000 00010001 00110 fk:5 fj:5 fd:5
-fclass_s LA64 0000 00010001 01000 01101 fj:5 fd:5
+fclass_s LA64 0000 00010001 01000 01101 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fclass_d LA64 0000 00010001 01000 01110 fj:5 fd:5
 
 #
@@ -490,43 +511,59 @@ fcmp_cond_d LA64 0000 11000010 cond:5 fk:5 fj:5 00 cd:3 \
 #
 # Floating point conversion instruction
 #
-fcvt_s_d LA64 0000 00010001 10010 00110 fj:5 fd:5
+fcvt_s_d LA64 0000 00010001 10010 00110 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fcvt_d_s LA64 0000 00010001 10010 01001 fj:5 fd:5
-ftintrm_w_s LA64 0000 00010001 10100 00001 fj:5 fd:5
-ftintrm_w_d LA64 0000 00010001 10100 00010 fj:5 fd:5
+ftintrm_w_s LA64 0000 00010001 10100 00001 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
+ftintrm_w_d LA64 0000 00010001 10100 00010 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 ftintrm_l_s LA64 0000 00010001 10100 01001 fj:5 fd:5
 ftintrm_l_d LA64 0000 00010001 10100 01010 fj:5 fd:5
-ftintrp_w_s LA64 0000 00010001 10100 10001 fj:5 fd:5
-ftintrp_w_d LA64 0000 00010001 10100 10010 fj:5 fd:5
+ftintrp_w_s LA64 0000 00010001 10100 10001 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
+ftintrp_w_d LA64 0000 00010001 10100 10010 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 ftintrp_l_s LA64 0000 00010001 10100 11001 fj:5 fd:5
 ftintrp_l_d LA64 0000 00010001 10100 11010 fj:5 fd:5
-ftintrz_w_s LA64 0000 00010001 10101 00001 fj:5 fd:5
-ftintrz_w_d LA64 0000 00010001 10101 00010 fj:5 fd:5
+ftintrz_w_s LA64 0000 00010001 10101 00001 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
+ftintrz_w_d LA64 0000 00010001 10101 00010 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 ftintrz_l_s LA64 0000 00010001 10101 01001 fj:5 fd:5
 ftintrz_l_d LA64 0000 00010001 10101 01010 fj:5 fd:5
-ftintrne_w_s LA64 0000 00010001 10101 10001 fj:5 fd:5
-ftintrne_w_d LA64 0000 00010001 10101 10010 fj:5 fd:5
+ftintrne_w_s LA64 0000 00010001 10101 10001 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
+ftintrne_w_d LA64 0000 00010001 10101 10010 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 ftintrne_l_s LA64 0000 00010001 10101 11001 fj:5 fd:5
 ftintrne_l_d LA64 0000 00010001 10101 11010 fj:5 fd:5
-ftint_w_s LA64 0000 00010001 10110 00001 fj:5 fd:5
-ftint_w_d LA64 0000 00010001 10110 00010 fj:5 fd:5
+ftint_w_s LA64 0000 00010001 10110 00001 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
+ftint_w_d LA64 0000 00010001 10110 00010 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 ftint_l_s LA64 0000 00010001 10110 01001 fj:5 fd:5
 ftint_l_d LA64 0000 00010001 10110 01010 fj:5 fd:5
-ffint_s_w LA64 0000 00010001 11010 00100 fj:5 fd:5
-ffint_s_l LA64 0000 00010001 11010 00110 fj:5 fd:5
+ffint_s_w LA64 0000 00010001 11010 00100 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
+ffint_s_l LA64 0000 00010001 11010 00110 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 ffint_d_w LA64 0000 00010001 11010 01000 fj:5 fd:5
 ffint_d_l LA64 0000 00010001 11010 01010 fj:5 fd:5
-frint_s LA64 0000 00010001 11100 10001 fj:5 fd:5
+frint_s LA64 0000 00010001 11100 10001 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 frint_d LA64 0000 00010001 11100 10010 fj:5 fd:5
 
 #
 # Floating point move instruction
 #
-fmov_s LA64 0000 00010001 01001 00101 fj:5 fd:5
+fmov_s LA64 0000 00010001 01001 00101 fj:5 fd:5 \
+    !safefloat { nanbox_s($fd); }
 fmov_d LA64 0000 00010001 01001 00110 fj:5 fd:5
 fsel LA64 0000 11010000 00 ca:3 fk:5 fj:5 fd:5
 movgr2fr_w LA64 0000 00010001 01001 01001 rj:5 fd:5 \
-    !constraints { $rj != 2; }
+    !constraints { $rj != 2; } \
+    !safefloat { nanbox_s($fd); }
 movgr2fr_d LA64 0000 00010001 01001 01010 rj:5 fd:5 \
     !constraints { $rj != 2; }
 movgr2frh_w LA64 0000 00010001 01001 01011 rj:5 fd:5 \
@@ -549,7 +586,8 @@ movcf2gr LA64 0000 00010001 01001 10111 00 cj:3 rd:5 \
 #
 fld_s LA64 0010 101100 si12:12 rj:5 fd:5 \
     !constraints { $rj != 0 && $rj != 2; } \
-    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
+    !memory { reg_plus_imm($rj, sextract($si12, 12)); } \
+    !safefloat { nanbox_s($fd); }
 fst_s LA64 0010 101101 si12:12 rj:5 fd:5 \
     !constraints { $rj != 0 && $rj != 2; } \
     !memory { reg_plus_imm($rj, sextract($si12, 12)); }
@@ -561,7 +599,8 @@ fst_d LA64 0010 101111 si12:12 rj:5 fd:5 \
     !memory { reg_plus_imm($rj, sextract($si12, 12)); }
 fldx_s LA64 0011 10000011 00000 rk:5 rj:5 fd:5 \
     !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2; } \
-    !memory { reg_plus_reg($rj, $rk); }
+    !memory { reg_plus_reg($rj, $rk); } \
+    !safefloat { nanbox_s($fd); }
 fldx_d LA64 0011 10000011 01000 rk:5 rj:5 fd:5 \
     !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2; } \
     !memory { reg_plus_reg($rj, $rk); }
diff --git a/risugen b/risugen
index e690b18..fa94a39 100755
--- a/risugen
+++ b/risugen
@@ -43,7 +43,7 @@ my @pattern_re = ();            # include pattern
 my @not_pattern_re = ();        # exclude pattern
 
 # Valid block names (keys in blocks hash)
-my %valid_blockname = ( constraints => 1, memory => 1 );
+my %valid_blockname = ( constraints => 1, memory => 1, safefloat =>1 );
 
 sub parse_risu_directive($$@)
 {
diff --git a/risugen_loongarch64.pm b/risugen_loongarch64.pm
index 693fb71..8ab598b 100644
--- a/risugen_loongarch64.pm
+++ b/risugen_loongarch64.pm
@@ -66,6 +66,28 @@ sub set_reg_w($)
     return $reg;
 }
 
+sub write_orn_rrr($$$)
+{
+    my($rd, $rj, $rk)=@_;
+    # $rd = $rj | (~$rk)
+    insn32(0x160000 | $rk << 10 | $rj << 5 | $rd);
+}
+
+sub nanbox_s($)
+{
+    my ($fpreg)=@_;
+
+    # Set $fpreg register high 32bit ffffffff
+    # use r1 as a temp register
+    # r1 = r1 | ~(r0)
+    write_orn_rrr(1, 1, 0);
+
+    # movgr2frh.w   $fpreg ,$1
+    insn32(0x114ac00 | 1 << 5 | $fpreg);
+
+    return $fpreg;
+}
+
 sub align($)
 {
     my ($a) = @_;
@@ -395,6 +417,7 @@ sub gen_one_insn($$)
         my $fixedbitmask = $rec->{fixedbitmask};
         my $constraint = $rec->{blocks}{"constraints"};
         my $memblock = $rec->{blocks}{"memory"};
+        my $safefloat = $rec->{blocks}{"safefloat"};
 
         $insn &= ~$fixedbitmask;
         $insn |= $fixedbits;
@@ -431,6 +454,13 @@ sub gen_one_insn($$)
 
         insn32($insn);
 
+        if (defined $safefloat) {
+            # Some results only use the low 32 bits,
+            # so use nanbox_s() to make sure the high 32 bits are 0xffffffff.
+            my $resultreg;
+            $resultreg = eval_with_fields($insnname, $insn, $rec, "safefloat", $safefloat);
+        }
+
         if (defined $memblock) {
             # Clean up following a memory access instruction:
             # we need to turn the (possibly written-back) basereg
-- 
2.31.1
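
Editorial sketch (not part of the patch): the effect that nanbox_s() is meant
to have on a 64-bit FP register can be modelled in a few lines of C. The
function names nanbox_s_model() and low_single() are hypothetical and exist
only for this illustration. The model assumes what the commit message states:
the low 32 bits hold the single-precision result and the high 32 bits are
forced to 0xffffffff, presumably so that both sides of a risu comparison see
the same register image.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of the register image after nanbox_s():
 * keep the low 32 bits, force the high 32 bits to all ones.
 */
static uint64_t nanbox_s_model(uint64_t fpreg)
{
    return fpreg | 0xffffffff00000000ull;
}

/* Reinterpret the low 32 bits of the register image as a float. */
static float low_single(uint64_t fpreg)
{
    uint32_t lo = (uint32_t)fpreg;
    float f;

    assert(sizeof(f) == sizeof(lo));
    memcpy(&f, &lo, sizeof(f));
    return f;
}
```

Whatever the high half contained before, the single-precision value in the low
half is unchanged by the boxing step.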




* Re: [RISU PATCH 1/5] risu: Use alternate stack
  2022-09-17  7:43 ` [RISU PATCH 1/5] risu: Use alternate stack Song Gao
@ 2022-10-10 14:20   ` Richard Henderson
  2022-10-10 14:43     ` Peter Maydell
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Henderson @ 2022-10-10 14:20 UTC (permalink / raw)
  To: Song Gao, qemu-devel; +Cc: peter.maydell, alex.bennee, maobibo

On 9/17/22 00:43, Song Gao wrote:
> We can use an alternate stack, so that we can use the sp register as an input/output register.
> I have tested this on the aarch64 and LoongArch architectures.
> 
> Signed-off-by: Song Gao<gaosong@loongson.cn>
> ---
>   risu.c | 16 +++++++++++++++-
>   1 file changed, 15 insertions(+), 1 deletion(-)

Good idea.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



* Re: [RISU PATCH 1/5] risu: Use alternate stack
  2022-10-10 14:20   ` Richard Henderson
@ 2022-10-10 14:43     ` Peter Maydell
  2022-10-11  6:56       ` gaosong
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Maydell @ 2022-10-10 14:43 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Song Gao, qemu-devel, alex.bennee, maobibo

On Mon, 10 Oct 2022 at 15:20, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 9/17/22 00:43, Song Gao wrote:
> > We can use an alternate stack, so that we can use the sp register as an input/output register.
> > I have tested this on the aarch64 and LoongArch architectures.
> >
> > Signed-off-by: Song Gao<gaosong@loongson.cn>
> > ---
> >   risu.c | 16 +++++++++++++++-
> >   1 file changed, 15 insertions(+), 1 deletion(-)
>
> Good idea.

Depending on the architecture there might still need to be
restrictions on use of the stack pointer, eg aarch64's
alignment requirements, but this at least means you can
in theory write some risu rules that use SP.

-- PMM



* Re: [RISU PATCH 2/5] loongarch: Add LoongArch basic test support
  2022-09-17  7:43 ` [RISU PATCH 2/5] loongarch: Add LoongArch basic test support Song Gao
@ 2022-10-10 14:58   ` Richard Henderson
  2022-10-10 15:34   ` Peter Maydell
  1 sibling, 0 replies; 16+ messages in thread
From: Richard Henderson @ 2022-10-10 14:58 UTC (permalink / raw)
  To: Song Gao, qemu-devel; +Cc: peter.maydell, alex.bennee, maobibo

On 9/17/22 00:43, Song Gao wrote:
> This patch adds LoongArch server, client support, and basic test file.
> 
> Signed-off-by: Song Gao<gaosong@loongson.cn>
> ---
>   risu_loongarch64.c         |  50 ++++++++++
>   risu_reginfo_loongarch64.c | 183 +++++++++++++++++++++++++++++++++++++
>   risu_reginfo_loongarch64.h |  25 +++++
>   test_loongarch64.s         |  92 +++++++++++++++++++
>   4 files changed, 350 insertions(+)
>   create mode 100644 risu_loongarch64.c
>   create mode 100644 risu_reginfo_loongarch64.c
>   create mode 100644 risu_reginfo_loongarch64.h
>   create mode 100644 test_loongarch64.s

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [RISU PATCH 3/5] loongarch: Implement risugen module
  2022-09-17  7:43 ` [RISU PATCH 3/5] loongarch: Implement risugen module Song Gao
@ 2022-10-10 15:19   ` Richard Henderson
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Henderson @ 2022-10-10 15:19 UTC (permalink / raw)
  To: Song Gao, qemu-devel; +Cc: peter.maydell, alex.bennee, maobibo

On 9/17/22 00:43, Song Gao wrote:
> +sub write_mov_positive_ri($$)
> +{
> +    # Use lu12i.w and ori instruction
> +    my ($rd, $imm) = @_;
> +    my $high_20 = ($imm >> 12) & 0xfffff;
> +
> +    if ($high_20) {
> +        # lu12i.w rd, si20
> +        insn32(0x14000000 | $high_20 << 5 | $rd);

This isn't necessarily positive -- lu12i.w sign-extends from 32-bits.

> +        # ori rd, rd, ui12
> +        insn32(0x03800000 | ($imm & 0xfff) << 10 | $rd << 5 | $rd);
> +    } else {
> +        # ori rd, 0, ui12
> +        insn32(0x03800000 | ($imm & 0xfff) << 10 | 0 << 5 | $rd);
> +    }
> +}
> +
> +sub write_mov_ri($$)
> +{
> +    my ($rd, $imm) = @_;
> +
> +    if ($imm < 0) {
> +        my $tmp = 0 - $imm ;
> +        write_mov_positive_ri($rd, $tmp);
> +        write_sub_rrr($rd, 0, $rd);
> +    } else {
> +        write_mov_positive_ri($rd, $imm);
> +    }
> +}

OTOH, I'm not sure why you'd need to split out write_mov_positive_ri and negate.  I don't 
*think* we need to handle completely arbitrary constants.  From the aarch64 code we 
certainly don't.

I might write

	if ($imm >= -0x1000 && $imm <= 0xfff) {
             addi.w
         } elsif ($imm >= -0x80000000 && $imm <= 0x7fffffff) {
             lu12i.w
             ori
         } else {
             die "unhandled immediate load";
         }


Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
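
Editorial sketch of the range-based selection suggested above, written in C
rather than Perl so that the encodings can be checked in isolation. It is an
illustration, not the risugen code. encode_mov_ri() and the OPC_* names are
hypothetical. The lu12i.w and ori opcode values are the ones in the quoted
patch; the addi.w opcode value (0x02800000) and the use of the signed 12-bit
range [-0x800, 0x7ff] for its immediate are assumptions based on the si12
field width, and are narrower than the range written in the review.

```c
#include <assert.h>
#include <stdint.h>

#define OPC_ADDI_W  0x02800000u   /* assumed: addi.w rd, rj, si12 */
#define OPC_LU12I_W 0x14000000u   /* from the patch: lu12i.w rd, si20 */
#define OPC_ORI     0x03800000u   /* from the patch: ori rd, rj, ui12 */

/*
 * Hypothetical helper: write the instructions that load imm into rd
 * to out[] and return how many were written, or -1 if imm is not
 * representable as a sign-extended 32-bit value.
 */
static int encode_mov_ri(unsigned rd, int64_t imm, uint32_t out[2])
{
    assert(rd < 32);

    if (imm >= -0x800 && imm <= 0x7ff) {
        /* addi.w rd, r0, si12: one instruction, sign-extended */
        out[0] = OPC_ADDI_W | ((uint32_t)(imm & 0xfff) << 10) | (0u << 5) | rd;
        return 1;
    }
    if (imm >= INT32_MIN && imm <= INT32_MAX) {
        uint32_t u = (uint32_t)imm;

        /* lu12i.w sets bits [31:12] and sign-extends from bit 31 */
        out[0] = OPC_LU12I_W | ((u >> 12) << 5) | rd;
        /* ori fills in bits [11:0]; harmless when they are zero */
        out[1] = OPC_ORI | ((u & 0xfff) << 10) | (rd << 5) | rd;
        return 2;
    }
    return -1;                       /* "unhandled immediate load" */
}
```

Because lu12i.w sign-extends, negative 32-bit constants need no separate
negate step in this scheme.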



* Re: [RISU PATCH 4/5] loongarch: Add risufile with loongarch instructions
  2022-09-17  7:43 ` [RISU PATCH 4/5] loongarch: Add risufile with loongarch instructions Song Gao
@ 2022-10-10 15:21   ` Richard Henderson
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Henderson @ 2022-10-10 15:21 UTC (permalink / raw)
  To: Song Gao, qemu-devel; +Cc: peter.maydell, alex.bennee, maobibo

On 9/17/22 00:43, Song Gao wrote:
> Signed-off-by: Song Gao<gaosong@loongson.cn>
> ---
>   loongarch64.risu | 573 +++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 573 insertions(+)
>   create mode 100644 loongarch64.risu

Acked-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [RISU PATCH 5/5] loongarch: Add block 'safefloat' and nanbox_s()
  2022-09-17  7:43 ` [RISU PATCH 5/5] loongarch: Add block 'safefloat' and nanbox_s() Song Gao
@ 2022-10-10 15:24   ` Richard Henderson
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Henderson @ 2022-10-10 15:24 UTC (permalink / raw)
  To: Song Gao, qemu-devel; +Cc: peter.maydell, alex.bennee, maobibo

On 9/17/22 00:43, Song Gao wrote:
> Some LoongArch instructions do not care about the high 32 bits of the result,
> so use nanbox_s() to set the high 32 bits to 0xffffffff.
> 
> Signed-off-by: Song Gao <gaosong@loongson.cn>
> ---
>   loongarch64.risu       | 119 +++++++++++++++++++++++++++--------------
>   risugen                |   2 +-
>   risugen_loongarch64.pm |  30 +++++++++++
>   3 files changed, 110 insertions(+), 41 deletions(-)
> 
> diff --git a/loongarch64.risu b/loongarch64.risu
> index d059811..d625a12 100644
> --- a/loongarch64.risu
> +++ b/loongarch64.risu
> @@ -62,7 +62,7 @@ mulw_d_wu LA64 0000 00000001 11111 rk:5 rj:5 rd:5 \
>       !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
>   
>   #div.{w[u]/d[u]} rd,rj,rk
> -# the docement 2.2.13,  rk, rj, need in 32bit [0x0 ~0x7FFFFFFF]
> +# div.w{u}, mod.w[u]  rk, rj, need in [0x0 ~0x7FFFFFFF]
>   # use function set_reg_w($reg)
>   div_w LA64 0000 00000010 00000 rk:5 rj:5 rd:5 \
>       !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
> @@ -436,47 +436,68 @@ crcc_w_d_w LA64 0000 00000010 01111 rk:5 rj:5 rd:5 \
>   #
>   # Floating point arithmetic operation instruction
>   #
> -fadd_s LA64 0000 00010000 00001 fk:5 fj:5 fd:5
> +fadd_s LA64 0000 00010000 00001 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fadd_d LA64 0000 00010000 00010 fk:5 fj:5 fd:5
> -fsub_s LA64 0000 00010000 00101 fk:5 fj:5 fd:5
> +fsub_s LA64 0000 00010000 00101 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fsub_d LA64 0000 00010000 00110 fk:5 fj:5 fd:5
> -fmul_s LA64 0000 00010000 01001 fk:5 fj:5 fd:5
> +fmul_s LA64 0000 00010000 01001 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fmul_d LA64 0000 00010000 01010 fk:5 fj:5 fd:5
> -fdiv_s LA64 0000 00010000 01101 fk:5 fj:5 fd:5
> +fdiv_s LA64 0000 00010000 01101 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fdiv_d LA64 0000 00010000 01110 fk:5 fj:5 fd:5
> -fmadd_s LA64 0000 10000001 fa:5 fk:5 fj:5 fd:5
> +fmadd_s LA64 0000 10000001 fa:5 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fmadd_d LA64 0000 10000010 fa:5 fk:5 fj:5 fd:5
> -fmsub_s LA64 0000 10000101 fa:5 fk:5 fj:5 fd:5
> +fmsub_s LA64 0000 10000101 fa:5 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fmsub_d LA64 0000 10000110 fa:5 fk:5 fj:5 fd:5
> -fnmadd_s LA64 0000 10001001 fa:5 fk:5 fj:5 fd:5
> +fnmadd_s LA64 0000 10001001 fa:5 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fnmadd_d LA64 0000 10001010 fa:5 fk:5 fj:5 fd:5
> -fnmsub_s LA64 0000 10001101 fa:5 fk:5 fj:5 fd:5
> +fnmsub_s LA64 0000 10001101 fa:5 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fnmsub_d LA64 0000 10001110 fa:5 fk:5 fj:5 fd:5
> -fmax_s LA64 0000 00010000 10001 fk:5 fj:5 fd:5
> +fmax_s LA64 0000 00010000 10001 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fmax_d LA64 0000 00010000 10010 fk:5 fj:5 fd:5
> -fmin_s LA64 0000 00010000 10101 fk:5 fj:5 fd:5
> +fmin_s LA64 0000 00010000 10101 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fmin_d LA64 0000 00010000 10110 fk:5 fj:5 fd:5
> -fmaxa_s LA64 0000 00010000 11001 fk:5 fj:5 fd:5
> +fmaxa_s LA64 0000 00010000 11001 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fmaxa_d LA64 0000 00010000 11010 fk:5 fj:5 fd:5
> -fmina_s LA64 0000 00010000 11101 fk:5 fj:5 fd:5
> +fmina_s LA64 0000 00010000 11101 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fmina_d LA64 0000 00010000 11110 fk:5 fj:5 fd:5
> -fabs_s LA64 0000 00010001 01000 00001 fj:5 fd:5
> +fabs_s LA64 0000 00010001 01000 00001 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fabs_d LA64 0000 00010001 01000 00010 fj:5 fd:5
> -fneg_s LA64 0000 00010001 01000 00101 fj:5 fd:5
> +fneg_s LA64 0000 00010001 01000 00101 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fneg_d LA64 0000 00010001 01000 00110 fj:5 fd:5
> -fsqrt_s LA64 0000 00010001 01000 10001 fj:5 fd:5
> +fsqrt_s LA64 0000 00010001 01000 10001 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fsqrt_d LA64 0000 00010001 01000 10010 fj:5 fd:5
> -frecip_s LA64 0000 00010001 01000 10101 fj:5 fd:5
> +frecip_s LA64 0000 00010001 01000 10101 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   frecip_d LA64 0000 00010001 01000 10110 fj:5 fd:5
> -frsqrt_s LA64 0000 00010001 01000 11001 fj:5 fd:5
> +frsqrt_s LA64 0000 00010001 01000 11001 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   frsqrt_d LA64 0000 00010001 01000 11010 fj:5 fd:5
> -fscaleb_s LA64 0000 00010001 00001 fk:5 fj:5 fd:5
> +fscaleb_s LA64 0000 00010001 00001 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fscaleb_d LA64 0000 00010001 00010 fk:5 fj:5 fd:5
> -flogb_s LA64 0000 00010001 01000 01001 fj:5 fd:5
> +flogb_s LA64 0000 00010001 01000 01001 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   flogb_d LA64 0000 00010001 01000 01010 fj:5 fd:5
> -fcopysign_s LA64 0000 00010001 00101 fk:5 fj:5 fd:5
> +fcopysign_s LA64 0000 00010001 00101 fk:5 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fcopysign_d LA64 0000 00010001 00110 fk:5 fj:5 fd:5
> -fclass_s LA64 0000 00010001 01000 01101 fj:5 fd:5
> +fclass_s LA64 0000 00010001 01000 01101 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fclass_d LA64 0000 00010001 01000 01110 fj:5 fd:5
>   
>   #
> @@ -490,43 +511,59 @@ fcmp_cond_d LA64 0000 11000010 cond:5 fk:5 fj:5 00 cd:3 \
>   #
>   # Floating point conversion instruction
>   #
> -fcvt_s_d LA64 0000 00010001 10010 00110 fj:5 fd:5
> +fcvt_s_d LA64 0000 00010001 10010 00110 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fcvt_d_s LA64 0000 00010001 10010 01001 fj:5 fd:5
> -ftintrm_w_s LA64 0000 00010001 10100 00001 fj:5 fd:5
> -ftintrm_w_d LA64 0000 00010001 10100 00010 fj:5 fd:5
> +ftintrm_w_s LA64 0000 00010001 10100 00001 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
> +ftintrm_w_d LA64 0000 00010001 10100 00010 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   ftintrm_l_s LA64 0000 00010001 10100 01001 fj:5 fd:5
>   ftintrm_l_d LA64 0000 00010001 10100 01010 fj:5 fd:5
> -ftintrp_w_s LA64 0000 00010001 10100 10001 fj:5 fd:5
> -ftintrp_w_d LA64 0000 00010001 10100 10010 fj:5 fd:5
> +ftintrp_w_s LA64 0000 00010001 10100 10001 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
> +ftintrp_w_d LA64 0000 00010001 10100 10010 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   ftintrp_l_s LA64 0000 00010001 10100 11001 fj:5 fd:5
>   ftintrp_l_d LA64 0000 00010001 10100 11010 fj:5 fd:5
> -ftintrz_w_s LA64 0000 00010001 10101 00001 fj:5 fd:5
> -ftintrz_w_d LA64 0000 00010001 10101 00010 fj:5 fd:5
> +ftintrz_w_s LA64 0000 00010001 10101 00001 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
> +ftintrz_w_d LA64 0000 00010001 10101 00010 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   ftintrz_l_s LA64 0000 00010001 10101 01001 fj:5 fd:5
>   ftintrz_l_d LA64 0000 00010001 10101 01010 fj:5 fd:5
> -ftintrne_w_s LA64 0000 00010001 10101 10001 fj:5 fd:5
> -ftintrne_w_d LA64 0000 00010001 10101 10010 fj:5 fd:5
> +ftintrne_w_s LA64 0000 00010001 10101 10001 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
> +ftintrne_w_d LA64 0000 00010001 10101 10010 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   ftintrne_l_s LA64 0000 00010001 10101 11001 fj:5 fd:5
>   ftintrne_l_d LA64 0000 00010001 10101 11010 fj:5 fd:5
> -ftint_w_s LA64 0000 00010001 10110 00001 fj:5 fd:5
> -ftint_w_d LA64 0000 00010001 10110 00010 fj:5 fd:5
> +ftint_w_s LA64 0000 00010001 10110 00001 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
> +ftint_w_d LA64 0000 00010001 10110 00010 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   ftint_l_s LA64 0000 00010001 10110 01001 fj:5 fd:5
>   ftint_l_d LA64 0000 00010001 10110 01010 fj:5 fd:5
> -ffint_s_w LA64 0000 00010001 11010 00100 fj:5 fd:5
> -ffint_s_l LA64 0000 00010001 11010 00110 fj:5 fd:5
> +ffint_s_w LA64 0000 00010001 11010 00100 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
> +ffint_s_l LA64 0000 00010001 11010 00110 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   ffint_d_w LA64 0000 00010001 11010 01000 fj:5 fd:5
>   ffint_d_l LA64 0000 00010001 11010 01010 fj:5 fd:5
> -frint_s LA64 0000 00010001 11100 10001 fj:5 fd:5
> +frint_s LA64 0000 00010001 11100 10001 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   frint_d LA64 0000 00010001 11100 10010 fj:5 fd:5
>   
>   #
>   # Floating point move instruction
>   #
> -fmov_s LA64 0000 00010001 01001 00101 fj:5 fd:5
> +fmov_s LA64 0000 00010001 01001 00101 fj:5 fd:5 \
> +    !safefloat { nanbox_s($fd); }
>   fmov_d LA64 0000 00010001 01001 00110 fj:5 fd:5
>   fsel LA64 0000 11010000 00 ca:3 fk:5 fj:5 fd:5
>   movgr2fr_w LA64 0000 00010001 01001 01001 rj:5 fd:5 \
> -    !constraints { $rj != 2; }
> +    !constraints { $rj != 2; } \
> +    !safefloat { nanbox_s($fd); }
>   movgr2fr_d LA64 0000 00010001 01001 01010 rj:5 fd:5 \
>       !constraints { $rj != 2; }
>   movgr2frh_w LA64 0000 00010001 01001 01011 rj:5 fd:5 \
> @@ -549,7 +586,8 @@ movcf2gr LA64 0000 00010001 01001 10111 00 cj:3 rd:5 \
>   #
>   fld_s LA64 0010 101100 si12:12 rj:5 fd:5 \
>       !constraints { $rj != 0 && $rj != 2; } \
> -    !memory { reg_plus_imm($rj, sextract($si12, 12)); }
> +    !memory { reg_plus_imm($rj, sextract($si12, 12)); } \
> +    !safefloat { nanbox_s($fd); }
>   fst_s LA64 0010 101101 si12:12 rj:5 fd:5 \
>       !constraints { $rj != 0 && $rj != 2; } \
>       !memory { reg_plus_imm($rj, sextract($si12, 12)); }
> @@ -561,7 +599,8 @@ fst_d LA64 0010 101111 si12:12 rj:5 fd:5 \
>       !memory { reg_plus_imm($rj, sextract($si12, 12)); }
>   fldx_s LA64 0011 10000011 00000 rk:5 rj:5 fd:5 \
>       !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2; } \
> -    !memory { reg_plus_reg($rj, $rk); }
> +    !memory { reg_plus_reg($rj, $rk); } \
> +    !safefloat { nanbox_s($fd); }
>   fldx_d LA64 0011 10000011 01000 rk:5 rj:5 fd:5 \
>       !constraints { $rj != 0 && $rj != $rk && $rk != 2 && $rj != 2; } \
>       !memory { reg_plus_reg($rj, $rk); }
> diff --git a/risugen b/risugen
> index e690b18..fa94a39 100755
> --- a/risugen
> +++ b/risugen
> @@ -43,7 +43,7 @@ my @pattern_re = ();            # include pattern
>   my @not_pattern_re = ();        # exclude pattern
>   
>   # Valid block names (keys in blocks hash)
> -my %valid_blockname = ( constraints => 1, memory => 1 );
> +my %valid_blockname = ( constraints => 1, memory => 1, safefloat =>1 );
>   
>   sub parse_risu_directive($$@)
>   {
> diff --git a/risugen_loongarch64.pm b/risugen_loongarch64.pm
> index 693fb71..8ab598b 100644
> --- a/risugen_loongarch64.pm
> +++ b/risugen_loongarch64.pm
> @@ -66,6 +66,28 @@ sub set_reg_w($)
>       return $reg;
>   }
>   
> +sub write_orn_rrr($$$)
> +{
> +    my($rd, $rj, $rk)=@_;
> +    # $rd = $rj | (~$rk)
> +    insn32(0x160000 | $rk << 10 | $rj << 5 | $rd);
> +}
> +
> +sub nanbox_s($)
> +{
> +    my ($fpreg)=@_;
> +
> +    # Set $fpreg register high 32bit ffffffff
> +    # use r1 as a temp register
> +    # r1 = r1 | ~(r0)
> +    write_orn_rrr(1, 1, 0);

Better to use write_mov_ri(1, -1) instead of inventing another helper.

Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



* Re: [RISU PATCH 2/5] loongarch: Add LoongArch basic test support
  2022-09-17  7:43 ` [RISU PATCH 2/5] loongarch: Add LoongArch basic test support Song Gao
  2022-10-10 14:58   ` Richard Henderson
@ 2022-10-10 15:34   ` Peter Maydell
  2022-10-11  1:48     ` gaosong
  1 sibling, 1 reply; 16+ messages in thread
From: Peter Maydell @ 2022-10-10 15:34 UTC (permalink / raw)
  To: Song Gao; +Cc: qemu-devel, richard.henderson, alex.bennee, maobibo

On Sat, 17 Sept 2022 at 08:43, Song Gao <gaosong@loongson.cn> wrote:
>
> This patch adds LoongArch server, client support, and basic test file.
>
> Signed-off-by: Song Gao <gaosong@loongson.cn>

> +int get_risuop(struct reginfo *ri)
> +{
> +    /* Return the risuop we have been asked to do
> +     * (or -1 if this was a SIGILL for a non-risuop insn)
> +     */
> +    uint32_t insn = ri->faulting_insn;
> +    uint32_t op = insn & 0xf;
> +    uint32_t key = insn & ~0xf;
> +    uint32_t risukey = 0x000001f0;
> +    return (key != risukey) ? -1 : op;
> +}

You'll probably find this needs tweaking when you rebase
on current risu git, because a recent refactor means this
function should now return a RisuOp, not an int. The changes
should be minor, though.

thanks
-- PMM
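
Editorial sketch of what the rebased function might look like. This is an
assumption about the refactored interface, not code from risu git: the RisuOp
enum below is a local stand-in whose member names and values should be checked
against the current risu.h, and struct reginfo is reduced to the single field
the function reads.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shape of RisuOp; verify against risu.h before relying on it. */
typedef enum {
    OP_SIGILL = -1,        /* SIGILL not caused by a risuop insn */
    OP_COMPARE = 0,
    OP_TESTEND = 1,
    OP_SETMEMBLOCK = 2,
    OP_GETMEMBLOCK = 3,
    OP_COMPAREMEM = 4,
} RisuOp;

/* Reduced stand-in for the real struct reginfo. */
struct reginfo {
    uint32_t faulting_insn;
};

static RisuOp get_risuop(struct reginfo *ri)
{
    /* Same decoding as in the patch, with the return type changed. */
    uint32_t insn = ri->faulting_insn;
    uint32_t op = insn & 0xf;
    uint32_t key = insn & ~0xfu;
    const uint32_t risukey = 0x000001f0;

    assert(ri != 0);
    return (key != risukey) ? OP_SIGILL : (RisuOp)op;
}
```

The decoding itself is unchanged; only the -1 sentinel is replaced by a named
enumerator.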



* Re: [RISU PATCH 2/5] loongarch: Add LoongArch basic test support
  2022-10-10 15:34   ` Peter Maydell
@ 2022-10-11  1:48     ` gaosong
  0 siblings, 0 replies; 16+ messages in thread
From: gaosong @ 2022-10-11  1:48 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, richard.henderson, alex.bennee, maobibo



On 2022/10/10 23:34, Peter Maydell wrote:
>> +int get_risuop(struct reginfo *ri)
>> +{
>> +    /* Return the risuop we have been asked to do
>> +     * (or -1 if this was a SIGILL for a non-risuop insn)
>> +     */
>> +    uint32_t insn = ri->faulting_insn;
>> +    uint32_t op = insn & 0xf;
>> +    uint32_t key = insn & ~0xf;
>> +    uint32_t risukey = 0x000001f0;
>> +    return (key != risukey) ? -1 : op;
>> +}
> You'll probably find this needs tweaking when you rebase
> on current risu git, because a recent refactor means this
> function should now return a RisuOp, not an int. The changes
> should be minor, though.
OK, I will correct it in v2.

Thanks.
Song Gao




* Re: [RISU PATCH 1/5] risu: Use alternate stack
  2022-10-10 14:43     ` Peter Maydell
@ 2022-10-11  6:56       ` gaosong
  2022-10-11  9:27         ` Peter Maydell
  0 siblings, 1 reply; 16+ messages in thread
From: gaosong @ 2022-10-11  6:56 UTC (permalink / raw)
  To: Peter Maydell, Richard Henderson; +Cc: qemu-devel, alex.bennee, maobibo


On 2022/10/10 22:43, Peter Maydell wrote:
> On Mon, 10 Oct 2022 at 15:20, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>> On 9/17/22 00:43, Song Gao wrote:
>>> We can use an alternate stack, so that we can use the sp register as an input/output register.
>>> I have tested this on the aarch64 and LoongArch architectures.
>>>
>>> Signed-off-by: Song Gao<gaosong@loongson.cn>
>>> ---
>>>    risu.c | 16 +++++++++++++++-
>>>    1 file changed, 15 insertions(+), 1 deletion(-)
>> Good idea.
> Depending on the architecture there might still need to be
> restrictions on use of the stack pointer, eg aarch64's
> alignment requirements, but this at least means you can
> in theory write some risu rules that use SP.
I really want to use the alternate stack, because this way we can reduce the number of risu rules.
What about using this only on the LoongArch architecture?

Thanks.
Song Gao




* Re: [RISU PATCH 1/5] risu: Use alternate stack
  2022-10-11  6:56       ` gaosong
@ 2022-10-11  9:27         ` Peter Maydell
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Maydell @ 2022-10-11  9:27 UTC (permalink / raw)
  To: gaosong; +Cc: Richard Henderson, qemu-devel, alex.bennee, maobibo

On Tue, 11 Oct 2022 at 07:57, gaosong <gaosong@loongson.cn> wrote:
>
>
> On 2022/10/10 22:43, Peter Maydell wrote:
> > On Mon, 10 Oct 2022 at 15:20, Richard Henderson
> > <richard.henderson@linaro.org> wrote:
> >> On 9/17/22 00:43, Song Gao wrote:
> >>> We can use an alternate stack, so that we can use the sp register as an input/output register.
> >>> I have tested this on the aarch64 and LoongArch architectures.
> >>>
> >>> Signed-off-by: Song Gao<gaosong@loongson.cn>
> >>> ---
> >>>    risu.c | 16 +++++++++++++++-
> >>>    1 file changed, 15 insertions(+), 1 deletion(-)
> >> Good idea.
> > Depending on the architecture there might still need to be
> > restrictions on use of the stack pointer, eg aarch64's
> > alignment requirements, but this at least means you can
> > in theory write some risu rules that use SP.
> I really want to use the alternate stack, because this way we can reduce the number of risu rules.
> What about using this only on the LoongArch architecture?

I just mean that although this patch is fine it might
still mean that depending on the architecture some care
and/or special casing of sp in the target risu rules
might be needed. I don't know if that applies to
loongarch or not.

-- PMM

