* [RFC PATCH v3 00/34] Hexagon patch series
@ 2020-08-18 15:50 Taylor Simpson
  2020-08-18 15:50 ` [RFC PATCH v3 01/34] Hexagon Update MAINTAINERS file Taylor Simpson
                   ` (35 more replies)
  0 siblings, 36 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

This series adds support for the Hexagon processor with Linux user-mode support.

See patch 02/34 Hexagon README for detailed information.

Once the series is applied, the Hexagon port will pass "make check-tcg".
The series also includes Hexagon-specific tests in tests/tcg/hexagon.

We have a parallel effort to make the Hexagon Linux toolchain publicly
available inside a Docker container.

*** Future items under consideration ***
Use qemu softfloat
Use qemu decodetree

*** Known checkpatch issues ***

The following are known checkpatch errors in the series
    target/hexagon/reg_fields.h         Complex macro
    target/hexagon/attribs.h            Complex macro
    target/hexagon/decode.c             Complex macro
    target/hexagon/q6v_decode.c         Macro needs do - while
    target/hexagon/printinsn.c          Macro needs do - while
    target/hexagon/gen_semantics.c      Suspicious ; after while (0)
    target/hexagon/gen_dectree_import.c Complex macro
    target/hexagon/gen_dectree_import.c Suspicious ; after while (0)
    target/hexagon/opcodes.c            Complex macro
    target/hexagon/iclass.h             Complex macro
    scripts/qemu-binfmt-conf.sh         Line over 90 characters

The following are known checkpatch warnings in the series
    target/hexagon/fma_emu.c            Comments inside macro definition
    scripts/qemu-binfmt-conf.sh         Line over 80 characters

*** Changes in v3 ***
Remove substantial portions of the code to facilitate review
- Plan to submit subsequent patches
- Hexagon Vector eXtensions (HVX)
- Circular and bit-reverse addressing
- Add/sub-with-carry
- Unused insn_t and pkt_t fields
- Unused instruction attributes
- All TCG overrides except instructions with multiple definitions
- Unused macros
- Unused reg fields
- COUNT_HEX_HELPERS
Use Laurent's gensyscall.sh script to generate linux-user/hexagon/syscall_nr.h
Handle mem_noshuf
Remove "RsV = RsV" per review feedback
Simplify include file structure
Add directed tests in <qemu>/tests/tcg/hexagon
Rework the python scripts to generate one header file at a time
Change fWRAP_* macros to fGEN_TCG_*

*** Changes in v2 ***
- Use scripts/git.orderfile

Taylor Simpson (34):
  Hexagon Update MAINTAINERS file
  Hexagon (target/hexagon) README
  Hexagon (include/elf.h) ELF machine definition
  Hexagon (target/hexagon) scalar core definition
  Hexagon (target/hexagon) register names
  Hexagon (disas) disassembler
  Hexagon (target/hexagon) scalar core helpers
  Hexagon (target/hexagon) GDB Stub
  Hexagon (target/hexagon) architecture types
  Hexagon (target/hexagon) instruction and packet types
  Hexagon (target/hexagon) register fields
  Hexagon (target/hexagon) instruction attributes
  Hexagon (target/hexagon) register map
  Hexagon (target/hexagon) instruction/packet decode
  Hexagon (target/hexagon) instruction printing
  Hexagon (target/hexagon) utility functions
  Hexagon (target/hexagon/imported) arch import - macro definitions
  Hexagon (target/hexagon/imported) arch import - instruction semantics
  Hexagon (target/hexagon/imported) arch import - instruction encoding
  Hexagon (target/hexagon) generator phase 1 - C preprocessor for
    semantics
  Hexagon (target/hexagon) generator phase 2 - generate header files
  Hexagon (target/hexagon) generator phase 3 - C preprocessor for decode
    tree
  Hexagon (target/hexagon) generator phase 4 - decode tree
  Hexagon (target/hexagon) opcode data structures
  Hexagon (target/hexagon) macros to interface with the generator
  Hexagon (target/hexagon) macros referenced in instruction semantics
  Hexagon (target/hexagon) instruction classes
  Hexagon (target/hexagon) TCG generation helpers
  Hexagon (target/hexagon) TCG generation
  Hexagon (target/hexagon) TCG for instructions with multiple
    definitions
  Hexagon (target/hexagon) translation
  Hexagon (linux-user/hexagon) Linux user emulation
  Hexagon (tests/tcg/hexagon) TCG tests
  Hexagon build infrastructure

 configure                                  |    9 +
 default-configs/hexagon-linux-user.mak     |    1 +
 include/disas/dis-asm.h                    |    1 +
 include/elf.h                              |    2 +
 linux-user/hexagon/sockbits.h              |   18 +
 linux-user/hexagon/syscall_nr.h            |  343 +++++
 linux-user/hexagon/target_cpu.h            |   44 +
 linux-user/hexagon/target_elf.h            |   40 +
 linux-user/hexagon/target_fcntl.h          |   18 +
 linux-user/hexagon/target_signal.h         |   34 +
 linux-user/hexagon/target_structs.h        |   46 +
 linux-user/hexagon/target_syscall.h        |   32 +
 linux-user/hexagon/termbits.h              |   18 +
 linux-user/syscall_defs.h                  |   33 +
 target/hexagon/arch.h                      |   42 +
 target/hexagon/attribs.h                   |   32 +
 target/hexagon/attribs_def.h               |   98 ++
 target/hexagon/conv_emu.h                  |   50 +
 target/hexagon/cpu-param.h                 |   26 +
 target/hexagon/cpu.h                       |  164 +++
 target/hexagon/cpu_bits.h                  |   34 +
 target/hexagon/decode.h                    |   39 +
 target/hexagon/fma_emu.h                   |   27 +
 target/hexagon/gen_tcg.h                   |  198 +++
 target/hexagon/genptr.h                    |   25 +
 target/hexagon/genptr_helpers.h            |  244 ++++
 target/hexagon/helper.h                    |   35 +
 target/hexagon/hex_arch_types.h            |   43 +
 target/hexagon/hex_regs.h                  |   83 ++
 target/hexagon/iclass.h                    |   46 +
 target/hexagon/insn.h                      |   75 +
 target/hexagon/internal.h                  |   42 +
 target/hexagon/macros.h                    |  982 +++++++++++++
 target/hexagon/opcodes.h                   |   66 +
 target/hexagon/printinsn.h                 |   26 +
 target/hexagon/reg_fields.h                |   40 +
 target/hexagon/reg_fields_def.h            |   78 +
 target/hexagon/regmap.h                    |   38 +
 target/hexagon/translate.h                 |  103 ++
 disas/hexagon.c                            |   62 +
 linux-user/elfload.c                       |   16 +
 linux-user/hexagon/cpu_loop.c              |   99 ++
 linux-user/hexagon/signal.c                |  276 ++++
 linux-user/syscall.c                       |    2 +
 target/hexagon/arch.c                      |  354 +++++
 target/hexagon/conv_emu.c                  |  369 +++++
 target/hexagon/cpu.c                       |  318 +++++
 target/hexagon/decode.c                    |  593 ++++++++
 target/hexagon/fma_emu.c                   |  781 ++++++++++
 target/hexagon/gdbstub.c                   |   49 +
 target/hexagon/gen_dectree_import.c        |  191 +++
 target/hexagon/gen_semantics.c             |   88 ++
 target/hexagon/genptr.c                    |   56 +
 target/hexagon/iclass.c                    |   88 ++
 target/hexagon/op_helper.c                 |  365 +++++
 target/hexagon/opcodes.c                   |  211 +++
 target/hexagon/printinsn.c                 |   94 ++
 target/hexagon/q6v_decode.c                |  373 +++++
 target/hexagon/reg_fields.c                |   28 +
 target/hexagon/translate.c                 |  730 ++++++++++
 tests/tcg/hexagon/atomics.c                |  122 ++
 tests/tcg/hexagon/clrtnew.c                |   56 +
 tests/tcg/hexagon/dual_stores.c            |   60 +
 tests/tcg/hexagon/exec_counters.c          |   57 +
 tests/tcg/hexagon/mem_noshuf.c             |  291 ++++
 tests/tcg/hexagon/misc.c                   |  293 ++++
 tests/tcg/hexagon/preg_alias.c             |  106 ++
 tests/tcg/hexagon/pthread_cancel.c         |   43 +
 tests/tcg/hexagon/sfminmax.c               |   62 +
 MAINTAINERS                                |    8 +
 disas/Makefile.objs                        |    1 +
 scripts/gensyscalls.sh                     |    3 +-
 scripts/qemu-binfmt-conf.sh                |    6 +-
 target/hexagon/Makefile.objs               |  203 +++
 target/hexagon/README                      |  254 ++++
 target/hexagon/dectree.py                  |  352 +++++
 target/hexagon/gen_helper_funcs.py         |  230 +++
 target/hexagon/gen_helper_protos.py        |  158 +++
 target/hexagon/gen_op_attribs.py           |   46 +
 target/hexagon/gen_op_regs.py              |  119 ++
 target/hexagon/gen_opcodes_def.py          |   43 +
 target/hexagon/gen_printinsn.py            |  182 +++
 target/hexagon/gen_shortcode.py            |   71 +
 target/hexagon/gen_tcg_funcs.py            |  301 ++++
 target/hexagon/hex_common.py               |  204 +++
 target/hexagon/imported/allidefs.def       |   30 +
 target/hexagon/imported/alu.idef           | 1259 +++++++++++++++++
 target/hexagon/imported/branch.idef        |  328 +++++
 target/hexagon/imported/compare.idef       |  621 ++++++++
 target/hexagon/imported/encode.def         |  125 ++
 target/hexagon/imported/encode_pp.def      | 2110 ++++++++++++++++++++++++++++
 target/hexagon/imported/encode_subinsn.def |  150 ++
 target/hexagon/imported/float.idef         |  313 +++++
 target/hexagon/imported/iclass.def         |   52 +
 target/hexagon/imported/ldst.idef          |  286 ++++
 target/hexagon/imported/macros.def         | 1529 ++++++++++++++++++++
 target/hexagon/imported/mpy.idef           | 1212 ++++++++++++++++
 target/hexagon/imported/shift.idef         | 1067 ++++++++++++++
 target/hexagon/imported/subinsns.idef      |  152 ++
 target/hexagon/imported/system.idef        |   69 +
 tests/tcg/configure.sh                     |    4 +-
 tests/tcg/hexagon/Makefile.target          |   49 +
 tests/tcg/hexagon/first.S                  |   57 +
 tests/tcg/hexagon/float_convs.ref          |  748 ++++++++++
 tests/tcg/hexagon/float_madds.ref          |  768 ++++++++++
 105 files changed, 22615 insertions(+), 3 deletions(-)
 create mode 100644 default-configs/hexagon-linux-user.mak
 create mode 100644 linux-user/hexagon/sockbits.h
 create mode 100644 linux-user/hexagon/syscall_nr.h
 create mode 100644 linux-user/hexagon/target_cpu.h
 create mode 100644 linux-user/hexagon/target_elf.h
 create mode 100644 linux-user/hexagon/target_fcntl.h
 create mode 100644 linux-user/hexagon/target_signal.h
 create mode 100644 linux-user/hexagon/target_structs.h
 create mode 100644 linux-user/hexagon/target_syscall.h
 create mode 100644 linux-user/hexagon/termbits.h
 create mode 100644 target/hexagon/arch.h
 create mode 100644 target/hexagon/attribs.h
 create mode 100644 target/hexagon/attribs_def.h
 create mode 100644 target/hexagon/conv_emu.h
 create mode 100644 target/hexagon/cpu-param.h
 create mode 100644 target/hexagon/cpu.h
 create mode 100644 target/hexagon/cpu_bits.h
 create mode 100644 target/hexagon/decode.h
 create mode 100644 target/hexagon/fma_emu.h
 create mode 100644 target/hexagon/gen_tcg.h
 create mode 100644 target/hexagon/genptr.h
 create mode 100644 target/hexagon/genptr_helpers.h
 create mode 100644 target/hexagon/helper.h
 create mode 100644 target/hexagon/hex_arch_types.h
 create mode 100644 target/hexagon/hex_regs.h
 create mode 100644 target/hexagon/iclass.h
 create mode 100644 target/hexagon/insn.h
 create mode 100644 target/hexagon/internal.h
 create mode 100644 target/hexagon/macros.h
 create mode 100644 target/hexagon/opcodes.h
 create mode 100644 target/hexagon/printinsn.h
 create mode 100644 target/hexagon/reg_fields.h
 create mode 100644 target/hexagon/reg_fields_def.h
 create mode 100644 target/hexagon/regmap.h
 create mode 100644 target/hexagon/translate.h
 create mode 100644 disas/hexagon.c
 create mode 100644 linux-user/hexagon/cpu_loop.c
 create mode 100644 linux-user/hexagon/signal.c
 create mode 100644 target/hexagon/arch.c
 create mode 100644 target/hexagon/conv_emu.c
 create mode 100644 target/hexagon/cpu.c
 create mode 100644 target/hexagon/decode.c
 create mode 100644 target/hexagon/fma_emu.c
 create mode 100644 target/hexagon/gdbstub.c
 create mode 100644 target/hexagon/gen_dectree_import.c
 create mode 100644 target/hexagon/gen_semantics.c
 create mode 100644 target/hexagon/genptr.c
 create mode 100644 target/hexagon/iclass.c
 create mode 100644 target/hexagon/op_helper.c
 create mode 100644 target/hexagon/opcodes.c
 create mode 100644 target/hexagon/printinsn.c
 create mode 100644 target/hexagon/q6v_decode.c
 create mode 100644 target/hexagon/reg_fields.c
 create mode 100644 target/hexagon/translate.c
 create mode 100644 tests/tcg/hexagon/atomics.c
 create mode 100644 tests/tcg/hexagon/clrtnew.c
 create mode 100644 tests/tcg/hexagon/dual_stores.c
 create mode 100644 tests/tcg/hexagon/exec_counters.c
 create mode 100644 tests/tcg/hexagon/mem_noshuf.c
 create mode 100644 tests/tcg/hexagon/misc.c
 create mode 100644 tests/tcg/hexagon/preg_alias.c
 create mode 100644 tests/tcg/hexagon/pthread_cancel.c
 create mode 100644 tests/tcg/hexagon/sfminmax.c
 create mode 100644 target/hexagon/Makefile.objs
 create mode 100644 target/hexagon/README
 create mode 100755 target/hexagon/dectree.py
 create mode 100755 target/hexagon/gen_helper_funcs.py
 create mode 100755 target/hexagon/gen_helper_protos.py
 create mode 100755 target/hexagon/gen_op_attribs.py
 create mode 100755 target/hexagon/gen_op_regs.py
 create mode 100755 target/hexagon/gen_opcodes_def.py
 create mode 100755 target/hexagon/gen_printinsn.py
 create mode 100755 target/hexagon/gen_shortcode.py
 create mode 100755 target/hexagon/gen_tcg_funcs.py
 create mode 100755 target/hexagon/hex_common.py
 create mode 100644 target/hexagon/imported/allidefs.def
 create mode 100644 target/hexagon/imported/alu.idef
 create mode 100644 target/hexagon/imported/branch.idef
 create mode 100644 target/hexagon/imported/compare.idef
 create mode 100644 target/hexagon/imported/encode.def
 create mode 100644 target/hexagon/imported/encode_pp.def
 create mode 100644 target/hexagon/imported/encode_subinsn.def
 create mode 100644 target/hexagon/imported/float.idef
 create mode 100644 target/hexagon/imported/iclass.def
 create mode 100644 target/hexagon/imported/ldst.idef
 create mode 100755 target/hexagon/imported/macros.def
 create mode 100644 target/hexagon/imported/mpy.idef
 create mode 100644 target/hexagon/imported/shift.idef
 create mode 100644 target/hexagon/imported/subinsns.idef
 create mode 100644 target/hexagon/imported/system.idef
 create mode 100644 tests/tcg/hexagon/Makefile.target
 create mode 100644 tests/tcg/hexagon/first.S
 create mode 100644 tests/tcg/hexagon/float_convs.ref
 create mode 100644 tests/tcg/hexagon/float_madds.ref

-- 
2.7.4



* [RFC PATCH v3 01/34] Hexagon Update MAINTAINERS file
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26  1:55   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 02/34] Hexagon (target/hexagon) README Taylor Simpson
                   ` (34 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Add Taylor Simpson as the Hexagon target maintainer

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 0886eb3..d85da55 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -184,6 +184,14 @@ F: include/hw/cris/
 F: tests/tcg/cris/
 F: disas/cris.c
 
+Hexagon TCG CPUs
+M: Taylor Simpson <tsimpson@quicinc.com>
+S: Supported
+F: target/hexagon/
+F: linux-user/hexagon/
+F: disas/hexagon.c
+F: default-configs/hexagon-linux-user.mak
+
 HPPA (PA-RISC) TCG CPUs
 M: Richard Henderson <rth@twiddle.net>
 S: Maintained
-- 
2.7.4



* [RFC PATCH v3 02/34] Hexagon (target/hexagon) README
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
  2020-08-18 15:50 ` [RFC PATCH v3 01/34] Hexagon Update MAINTAINERS file Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26  2:06   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 03/34] Hexagon (include/elf.h) ELF machine definition Taylor Simpson
                   ` (33 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Gives an introduction to and overview of the Hexagon target

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/README | 254 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 254 insertions(+)
 create mode 100644 target/hexagon/README

diff --git a/target/hexagon/README b/target/hexagon/README
new file mode 100644
index 0000000..3b65688
--- /dev/null
+++ b/target/hexagon/README
@@ -0,0 +1,254 @@
+Hexagon is Qualcomm's very long instruction word (VLIW) digital signal
+processor (DSP).
+
+The following versions of the Hexagon core are supported
+    Scalar core: v67
+    https://developer.qualcomm.com/downloads/qualcomm-hexagon-v67-programmer-s-reference-manual
+
+We presented an overview of the project at the 2019 KVM Forum.
+    https://kvmforum2019.sched.com/event/Tmwc/qemu-hexagon-automatic-translation-of-the-isa-manual-pseudcode-to-tiny-code-instructions-of-a-vliw-architecture-niccolo-izzo-revng-taylor-simpson-qualcomm-innovation-center
+
+*** Tour of the code ***
+
+The qemu-hexagon implementation is a combination of qemu and the Hexagon
+architecture library (aka archlib).  The three primary directories with
+Hexagon-specific code are
+
+    qemu/target/hexagon
+        This has all the instruction and packet semantics
+    qemu/target/hexagon/imported
+        These files are imported with very little modification from archlib
+        *.idef                  Instruction semantics definition
+        macros.def              Mapping of macros to instruction attributes
+        encode*.def             Encoding patterns for each instruction
+        iclass.def              Instruction class definitions used to determine
+                                legal VLIW slots for each instruction
+    qemu/linux-user/hexagon
+        Helpers for loading the ELF file and making Linux system calls,
+        signals, etc
+
+We start with scripts that generate a bunch of include files.  This
+is a two-step process.  The first step is to use the C preprocessor to expand
+macros inside the architecture definition files.  This is done in
+target/hexagon/gen_semantics.c.  This step produces
+    <BUILD_DIR>/hexagon-linux-user/semantics_generated.pyinc.
+That file is consumed by the following python scripts to produce the indicated
+header files in <BUILD_DIR>/hexagon-linux-user
+        gen_shortcode.py                -> shortcode_generated.h
+        gen_helper_protos.py            -> helper_protos_generated.h
+        gen_tcg_funcs.py                -> tcg_funcs_generated.h
+        gen_helper_funcs.py             -> helper_funcs_generated.h
+
+Qemu helper functions have 3 parts
+    DEF_HELPER declaration indicates the signature of the helper
+    gen_helper_<NAME> will generate a TCG call to the helper function
+    The helper implementation
+
+Here's an example of the A2_add instruction.
+    Instruction tag        A2_add
+    Assembly syntax        "Rd32=add(Rs32,Rt32)"
+    Instruction semantics  "{ RdV=RsV+RtV;}"
+
+By convention, the operands are identified by letter
+    RdV is the destination register
+    RsV, RtV are source registers
+
+The generator uses the operand naming conventions (see large comment in
+hex_common.py) to determine the signature of the helper function.  Here are the
+results for A2_add
+
+helper_protos_generated.h
+    #ifndef fGEN_TCG_A2_add
+    DEF_HELPER_3(A2_add, s32, env, s32, s32)
+    #endif
+
+tcg_funcs_generated.h
+    DEF_TCG_FUNC(A2_add, /* { RdV=RsV+RtV;} */
+    {
+    /* A2_add */
+    DECL_RREG_d(RdV, RdN, 0, 0);
+    DECL_RREG_s(RsV, RsN, 1, 0);
+    DECL_RREG_t(RtV, RtN, 2, 0);
+    READ_RREG_s(RsV, RsN);
+    READ_RREG_t(RtV, RtN);
+    #ifdef fGEN_TCG_A2_add
+    fGEN_TCG_A2_add({ RdV=RsV+RtV;});
+    #else
+    do {
+    gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
+    } while (0);
+    #endif
+    WRITE_RREG_d(RdN, RdV);
+    FREE_RREG_d(RdV);
+    FREE_RREG_s(RsV);
+    FREE_RREG_t(RtV);
+    /* A2_add */
+    })
+
+helper_funcs_generated.h
+    #ifndef fGEN_TCG_A2_add
+    int32_t HELPER(A2_add)(CPUHexagonState *env, int32_t RsV, int32_t RtV)
+    {
+    uint32_t slot __attribute__((unused)) = 4;
+    int32_t RdV = 0;
+    { RdV=RsV+RtV;}
+    return RdV;
+    }
+    #endif
+
+For each operand, there are macros for DECL, FREE, READ, WRITE.  These are
+defined in macros.h.  Note that we append the operand type to the macro name,
+which allows us to specialize the TCG code generated.  For read-only operands,
+DECL simply declares the TCGv variable (no need for tcg_temp_local_new()),
+and READ will assign from the TCGv corresponding to the GPR, and FREE doesn't
+have to do anything.  Also, note that the WRITE macros update the disassembly
+context to be processed when the packet commits (see "Packet Semantics" below).
+
+Note the fGEN_TCG_A2_add macro.  This macro allows us to generate TCG code
+instead of a call to the helper.  If defined, the macro takes 1 argument.
+    C semantics (aka short code)
+
+This allows the code generator to override the auto-generated code.  In some
+cases this is necessary for correct execution.  We can also override for
+faster emulation.  For example, calling a helper for add is more expensive
+than generating a TCG add operation.
+
+The gen_tcg.h file has any overrides. For example,
+    #define fGEN_TCG_A2_add(SHORTCODE) \
+        tcg_gen_add_tl(RdV, RsV, RtV)
+
+The gen_tcg.h file is included twice
+1) In genptr.c, it overrides the semantics from tcg_funcs_generated.h
+2) In helper.h, it prevents the generation of helpers for overridden
+   instructions.  Notice the #ifndef fGEN_TCG_A2_add above in both
+   helper_protos_generated.h and helper_funcs_generated.h
+
+The instruction semantics C code relies heavily on macros.  In cases where the
+C semantics are specified only with macros, we can override the default with
+the short semantics option and #define the macros to generate TCG code.  One
+example is L2_loadw_locked:
+    Instruction tag        L2_loadw_locked
+    Assembly syntax        "Rd32=memw_locked(Rs32)"
+    Instruction semantics  "{ fEA_REG(RsV); fLOAD_LOCKED(1,4,u,EA,RdV) }"
+
+In gen_tcg.h, we use the shortcode
+#define fGEN_TCG_L2_loadw_locked(SHORTCODE) \
+    SHORTCODE
+
+There are also cases where we brute force the TCG code generation.
+Instructions with multiple definitions are examples.  These require special
+handling because qemu helpers can only return a single value.
+
+In addition to instruction semantics, we use a generator to create the decode
+tree.  This generation is also a two-step process.  The first step is to run
+target/hexagon/gen_dectree_import.c to produce
+    <BUILD_DIR>/hexagon-linux-user/iset.py
+This file is imported by target/hexagon/dectree.py to produce
+    <BUILD_DIR>/hexagon-linux-user/dectree_generated.h
+
+*** Key Files ***
+
+cpu.h
+
+This file contains the definition of the CPUHexagonState struct.  It is the
+runtime information for each thread and contains stuff like the GPR and
+predicate registers.
+
+macros.h
+
+The Hexagon arch lib relies heavily on macros for the instruction semantics.
+This is a great advantage for qemu because we can override them for different
+purposes.  You will also notice there are sometimes two definitions of a macro.
+The QEMU_GENERATE variable determines whether we want the macro to generate TCG
+code.  If QEMU_GENERATE is not defined, we want the macro to generate vanilla
+C code that will work in the helper implementation.
+
+translate.c
+
+The functions in this file generate TCG code for a translation block.  Some
+important functions in this file are
+
+    gen_start_packet - initialize the data structures for packet semantics
+    gen_commit_packet - commit the register writes, stores, etc for a packet
+    decode_packet - disassemble a packet and generate code
+
+genptr.c
+genptr_helpers.h
+gen_tcg.h
+
+These files create a function for each instruction.  It is mostly composed of
+fGEN_TCG_<tag> definitions followed by including tcg_funcs_generated.h.  The
+genptr_helpers.h file contains helper functions that are invoked by the macros
+in gen_tcg.h and macros.h
+
+op_helper.c
+
+This file contains the implementations of all the helpers.  There are a few
+general purpose helpers, but most of them are generated by including
+helper_funcs_generated.h.  There are also several helpers used for debugging.
+
+
+*** Packet Semantics ***
+
+VLIW packet semantics differ from serial semantics in that all input operands
+are read, then the operations are performed, then all the results are written.
+For example, this packet performs a swap of registers r0 and r1
+    { r0 = r1; r1 = r0 }
+Note that the result is different if the instructions are executed serially.
+
+Packet semantics dictate that we defer any changes of state until the entire
+packet is committed.  We record the results of each instruction in a side data
+structure, and update the visible processor state when we commit the packet.
+
+The data structures are divided between the runtime state and the translation
+context.
+
+During the TCG generation (see translate.[ch]), we use the DisasContext to
+track what needs to be done during packet commit.  Here are the relevant
+fields
+
+    reg_log            list of registers written
+    reg_log_idx        index into ctx_reg_log
+    pred_log           list of predicates written
+    pred_log_idx       index into ctx_pred_log
+    store_width        width of stores (indexed by slot)
+
+During runtime, the following fields in CPUHexagonState (see cpu.h) are used
+
+    new_value             new value of a given register
+    reg_written           boolean indicating if register was written
+    new_pred_value        new value of a predicate register
+    pred_written          boolean indicating if predicate was written
+    mem_log_stores        record of the stores (indexed by slot)
+
+*** Debugging ***
+
+You can turn on a lot of debugging by changing the HEX_DEBUG macro to 1 in
+internal.h.  This will stream a lot of information as it generates TCG and
+executes the code.
+
+To track down nasty issues with Hexagon->TCG generation, we compare the
+execution results with actual hardware running on a Hexagon Linux target.
+Run qemu with the "-d cpu" option.  Then, we can diff the results and figure
+out where qemu and hardware behave differently.
+
+The stacks are located at different locations.  We handle this by changing
+env->stack_adjust in translate.c.  First, set this to zero and run qemu.
+Then, change env->stack_adjust to the difference between the two stack
+locations.  Then rebuild qemu and run again. That will produce a very
+clean diff.
+
+Here are some handy places to set breakpoints
+
+    At the call to gen_start_packet for a given PC (note that the line number
+        might change in the future)
+        br translate.c:602 if ctx->base.pc_next == 0xdeadbeef
+    The helper function for each instruction is named helper_<TAG>, so here's
+        an example that will set a breakpoint at the start
+        br helper_A2_add
+    If you have the HEX_DEBUG macro set, the following will be useful
+        At the start of execution of a packet for a given PC
+            br helper_debug_start_packet if env->gpr[41] == 0xdeadbeef
+        At the end of execution of a packet for a given PC
+            br helper_debug_commit_end if env->this_PC == 0xdeadbeef
+
-- 
2.7.4



* [RFC PATCH v3 03/34] Hexagon (include/elf.h) ELF machine definition
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
  2020-08-18 15:50 ` [RFC PATCH v3 01/34] Hexagon Update MAINTAINERS file Taylor Simpson
  2020-08-18 15:50 ` [RFC PATCH v3 02/34] Hexagon (target/hexagon) README Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26  2:06   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 04/34] Hexagon (target/hexagon) scalar core definition Taylor Simpson
                   ` (32 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Define EM_HEXAGON 164

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 include/elf.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/elf.h b/include/elf.h
index 5b06b55..447da08 100644
--- a/include/elf.h
+++ b/include/elf.h
@@ -172,6 +172,8 @@ typedef struct mips_elf_abiflags_v0 {
 
 #define EM_UNICORE32    110     /* UniCore32 */
 
+#define EM_HEXAGON     164     /* Qualcomm Hexagon */
+
 #define EM_RISCV        243     /* RISC-V */
 
 #define EM_NANOMIPS     249     /* Wave Computing nanoMIPS */
-- 
2.7.4
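
Editorial sketch (not part of the patch): one way a loader could use the
machine number defined above is to compare it against the e_machine field of
an ELF header.  The fragment below is a standalone illustration under stated
assumptions.  It relies only on the standard ELF layout (e_machine is a
16-bit field at byte offset 18, and byte 5 of e_ident gives the data
encoding) and does not reproduce QEMU's own loader code.  The names
SKETCH_EM_HEXAGON and sketch_is_hexagon_elf are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_EM_HEXAGON 164  /* same value as EM_HEXAGON in the patch */

/* Return true if buf starts with an ELF header whose e_machine is Hexagon */
bool sketch_is_hexagon_elf(const uint8_t *buf, size_t len)
{
    uint16_t machine;

    /* Need e_ident (16 bytes), e_type (2 bytes) and e_machine (2 bytes) */
    if (buf == NULL || len < 20) {
        return false;
    }
    /* ELF magic: 0x7f 'E' 'L' 'F' */
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F') {
        return false;
    }
    /* e_ident[5] is the data encoding: 1 = little-endian, 2 = big-endian */
    if (buf[5] == 1) {
        machine = (uint16_t)(buf[18] | (buf[19] << 8));
    } else if (buf[5] == 2) {
        machine = (uint16_t)((buf[18] << 8) | buf[19]);
    } else {
        return false;
    }
    return machine == SKETCH_EM_HEXAGON;
}
```

The check works for both 32-bit and 64-bit ELF files because e_machine sits
at the same offset in each header layout.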



* [RFC PATCH v3 04/34] Hexagon (target/hexagon) scalar core definition
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (2 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 03/34] Hexagon (include/elf.h) ELF machine definition Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 13:35   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 05/34] Hexagon (target/hexagon) register names Taylor Simpson
                   ` (31 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Add target state header, target definitions and initialization routines

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/cpu-param.h |  26 ++++
 target/hexagon/cpu.h       | 164 +++++++++++++++++++++++
 target/hexagon/cpu_bits.h  |  34 +++++
 target/hexagon/internal.h  |  40 ++++++
 target/hexagon/cpu.c       | 316 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 580 insertions(+)
 create mode 100644 target/hexagon/cpu-param.h
 create mode 100644 target/hexagon/cpu.h
 create mode 100644 target/hexagon/cpu_bits.h
 create mode 100644 target/hexagon/internal.h
 create mode 100644 target/hexagon/cpu.c

diff --git a/target/hexagon/cpu-param.h b/target/hexagon/cpu-param.h
new file mode 100644
index 0000000..3a6b727
--- /dev/null
+++ b/target/hexagon/cpu-param.h
@@ -0,0 +1,26 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_CPU_PARAM_H
+#define HEXAGON_CPU_PARAM_H
+
+#define TARGET_PHYS_ADDR_SPACE_BITS 36
+#define TARGET_VIRT_ADDR_SPACE_BITS 32
+
+#define NB_MMU_MODES 1
+
+#endif
diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
new file mode 100644
index 0000000..af3d644
--- /dev/null
+++ b/target/hexagon/cpu.h
@@ -0,0 +1,164 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_CPU_H
+#define HEXAGON_CPU_H
+
+/* Forward declaration needed by some of the header files */
+typedef struct CPUHexagonState CPUHexagonState;
+
+#include <fenv.h>
+
+#define TARGET_PAGE_BITS 16     /* 64K pages */
+#define TARGET_LONG_BITS 32
+
+#include "qemu/compiler.h"
+#include "qemu-common.h"
+#include "exec/cpu-defs.h"
+#include "hex_regs.h"
+
+#define NUM_PREGS 4
+#ifdef CONFIG_USER_ONLY
+#define TOTAL_PER_THREAD_REGS 64
+#else
+#error System mode not implemented
+#endif
+
+#define SLOTS_MAX 4
+#define STORES_MAX 2
+#define REG_WRITES_MAX 32
+#define PRED_WRITES_MAX 5                   /* 4 insns + endloop */
+
+#define TYPE_HEXAGON_CPU "hexagon-cpu"
+
+#define HEXAGON_CPU_TYPE_SUFFIX "-" TYPE_HEXAGON_CPU
+#define HEXAGON_CPU_TYPE_NAME(name) (name HEXAGON_CPU_TYPE_SUFFIX)
+#define CPU_RESOLVING_TYPE TYPE_HEXAGON_CPU
+
+#define TYPE_HEXAGON_CPU_V67 HEXAGON_CPU_TYPE_NAME("v67")
+
+#define MMU_USER_IDX 0
+
+struct MemLog {
+    target_ulong va;
+    uint8_t width;
+    uint32_t data32;
+    uint64_t data64;
+};
+
+#define EXEC_STATUS_OK          0x0000
+#define EXEC_STATUS_STOP        0x0002
+#define EXEC_STATUS_REPLAY      0x0010
+#define EXEC_STATUS_LOCKED      0x0020
+#define EXEC_STATUS_EXCEPTION   0x0100
+
+
+#define EXCEPTION_DETECTED      (env->status & EXEC_STATUS_EXCEPTION)
+#define REPLAY_DETECTED         (env->status & EXEC_STATUS_REPLAY)
+#define CLEAR_EXCEPTION         (env->status &= (~EXEC_STATUS_EXCEPTION))
+#define SET_EXCEPTION           (env->status |= EXEC_STATUS_EXCEPTION)
+
+struct CPUHexagonState {
+    target_ulong gpr[TOTAL_PER_THREAD_REGS];
+    target_ulong pred[NUM_PREGS];
+    target_ulong branch_taken;
+    target_ulong next_PC;
+
+    /* For comparing with LLDB on target - see hack_stack_ptrs function */
+    target_ulong stack_start;
+    target_ulong stack_adjust;
+
+    uint8_t slot_cancelled;
+    target_ulong new_value[TOTAL_PER_THREAD_REGS];
+
+    /*
+     * Only used when HEX_DEBUG is on, but unconditionally included
+     * to reduce recompile time when turning HEX_DEBUG on/off.
+     */
+    target_ulong this_PC;
+    target_ulong reg_written[TOTAL_PER_THREAD_REGS];
+
+    target_ulong new_pred_value[NUM_PREGS];
+    target_ulong pred_written;
+
+    struct MemLog mem_log_stores[STORES_MAX];
+    target_ulong pkt_has_store_s1;
+    target_ulong dczero_addr;
+
+    fenv_t fenv;
+
+    target_ulong llsc_addr;
+    target_ulong llsc_val;
+    uint64_t     llsc_val_i64;
+
+    target_ulong is_gather_store_insn;
+    target_ulong gather_issued;
+};
+
+#define HEXAGON_CPU_CLASS(klass) \
+    OBJECT_CLASS_CHECK(HexagonCPUClass, (klass), TYPE_HEXAGON_CPU)
+#define HEXAGON_CPU(obj) \
+    OBJECT_CHECK(HexagonCPU, (obj), TYPE_HEXAGON_CPU)
+#define HEXAGON_CPU_GET_CLASS(obj) \
+    OBJECT_GET_CLASS(HexagonCPUClass, (obj), TYPE_HEXAGON_CPU)
+
+typedef struct HexagonCPUClass {
+    /*< private >*/
+    CPUClass parent_class;
+    /*< public >*/
+    DeviceRealize parent_realize;
+    DeviceReset parent_reset;
+} HexagonCPUClass;
+
+typedef struct HexagonCPU {
+    /*< private >*/
+    CPUState parent_obj;
+    /*< public >*/
+    CPUNegativeOffsetState neg;
+    CPUHexagonState env;
+} HexagonCPU;
+
+static inline HexagonCPU *hexagon_env_get_cpu(CPUHexagonState *env)
+{
+    return container_of(env, HexagonCPU, env);
+}
+
+#include "cpu_bits.h"
+
+#define cpu_signal_handler cpu_hexagon_signal_handler
+extern int cpu_hexagon_signal_handler(int host_signum, void *pinfo, void *puc);
+
+static inline void cpu_get_tb_cpu_state(CPUHexagonState *env, target_ulong *pc,
+                                        target_ulong *cs_base, uint32_t *flags)
+{
+    *pc = env->gpr[HEX_REG_PC];
+    *cs_base = 0;
+#ifdef CONFIG_USER_ONLY
+    *flags = 0;
+#else
+#error System mode not supported on Hexagon yet
+#endif
+}
+
+typedef struct CPUHexagonState CPUArchState;
+typedef HexagonCPU ArchCPU;
+
+void hexagon_translate_init(void);
+
+#include "exec/cpu-all.h"
+
+#endif /* HEXAGON_CPU_H */
diff --git a/target/hexagon/cpu_bits.h b/target/hexagon/cpu_bits.h
new file mode 100644
index 0000000..586c717
--- /dev/null
+++ b/target/hexagon/cpu_bits.h
@@ -0,0 +1,34 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_CPU_BITS_H
+#define HEXAGON_CPU_BITS_H
+
+#define HEX_EXCP_FETCH_NO_UPAGE  0x012
+#define HEX_EXCP_INVALID_PACKET  0x015
+#define HEX_EXCP_INVALID_OPCODE  0x015
+#define HEX_EXCP_PRIV_NO_UREAD   0x024
+#define HEX_EXCP_PRIV_NO_UWRITE  0x025
+
+#define HEX_EXCP_TRAP0           0x172
+
+#define PACKET_WORDS_MAX         4
+
+extern int disassemble_hexagon(uint32_t *words, int nwords,
+                               char *buf, int bufsize);
+
+#endif
diff --git a/target/hexagon/internal.h b/target/hexagon/internal.h
new file mode 100644
index 0000000..d3e4412
--- /dev/null
+++ b/target/hexagon/internal.h
@@ -0,0 +1,40 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_INTERNAL_H
+#define HEXAGON_INTERNAL_H
+
+/*
+ * Change HEX_DEBUG to 1 to turn on debugging output
+ */
+#define HEX_DEBUG 0
+#define HEX_DEBUG_LOG(...) \
+    do { \
+        if (HEX_DEBUG) { \
+            rcu_read_lock(); \
+            fprintf(stderr, __VA_ARGS__); \
+            rcu_read_unlock(); \
+        } \
+    } while (0)
+
+extern void hexagon_debug(CPUHexagonState *env);
+
+extern const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS];
+
+extern void init_genptr(void);
+
+#endif
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
new file mode 100644
index 0000000..d812913
--- /dev/null
+++ b/target/hexagon/cpu.c
@@ -0,0 +1,316 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qemu/qemu-print.h"
+#include "cpu.h"
+#include "internal.h"
+#include "exec/exec-all.h"
+#include "qapi/error.h"
+#include "migration/vmstate.h"
+
+static void hexagon_v67_cpu_init(Object *obj)
+{
+}
+
+static ObjectClass *hexagon_cpu_class_by_name(const char *cpu_model)
+{
+    ObjectClass *oc;
+    char *typename;
+    char **cpuname;
+
+    cpuname = g_strsplit(cpu_model, ",", 1);
+    typename = g_strdup_printf(HEXAGON_CPU_TYPE_NAME("%s"), cpuname[0]);
+    oc = object_class_by_name(typename);
+    g_strfreev(cpuname);
+    g_free(typename);
+    if (!oc || !object_class_dynamic_cast(oc, TYPE_HEXAGON_CPU) ||
+        object_class_is_abstract(oc)) {
+        return NULL;
+    }
+    return oc;
+}
+
+const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS] = {
+   "r0", "r1",  "r2",  "r3",  "r4",   "r5",  "r6",  "r7",
+   "r8", "r9",  "r10", "r11", "r12",  "r13", "r14", "r15",
+  "r16", "r17", "r18", "r19", "r20",  "r21", "r22", "r23",
+  "r24", "r25", "r26", "r27", "r28",  "r29", "r30", "r31",
+  "sa0", "lc0", "sa1", "lc1", "p3_0", "c5",  "m0",  "m1",
+  "usr", "pc",  "ugp", "gp",  "cs0",  "cs1", "c14", "c15",
+  "c16", "c17", "c18", "c19", "pkt_cnt",  "insn_cnt", "c22", "c23",
+  "c24", "c25", "c26", "c27", "c28",  "c29", "c30", "c31",
+};
+
+/*
+ * One of the main debugging techniques is to use "-d cpu" and compare against
+ * LLDB output when single stepping.  However, the target and qemu put the
+ * stacks at different locations.  This is used to compensate so the diff is
+ * cleaner.
+ */
+static inline target_ulong hack_stack_ptrs(CPUHexagonState *env,
+                                           target_ulong addr)
+{
+    static bool first = true;
+    if (first) {
+        first = false;
+        env->stack_start = env->gpr[HEX_REG_SP];
+        env->gpr[HEX_REG_USR] = 0x56000;
+
+#define ADJUST_STACK 0
+#if ADJUST_STACK
+        /*
+         * Change the two numbers below to
+         *     1    qemu stack location
+         *     2    hardware stack location
+         * Or set to zero for normal mode (no stack adjustment)
+         */
+        env->stack_adjust = 0xfffeeb80 - 0xbf89f980;
+#else
+        env->stack_adjust = 0;
+#endif
+    }
+
+    target_ulong stack_start = env->stack_start;
+    target_ulong stack_size = 0x10000;
+    target_ulong stack_adjust = env->stack_adjust;
+
+    if (stack_start + 0x1000 >= addr && addr >= (stack_start - stack_size)) {
+        return addr - stack_adjust;
+    }
+    return addr;
+}
+
+/* HEX_REG_P3_0 (aka C4) is an alias for the predicate registers */
+static inline target_ulong read_p3_0(CPUHexagonState *env)
+{
+    int32_t control_reg = 0;
+    int i;
+    for (i = NUM_PREGS - 1; i >= 0; i--) {
+        control_reg <<= 8;
+        control_reg |= env->pred[i] & 0xff;
+    }
+    return control_reg;
+}
+
+static void print_reg(FILE *f, CPUHexagonState *env, int regnum)
+{
+    target_ulong value;
+
+    if (regnum == HEX_REG_P3_0) {
+        value = read_p3_0(env);
+    } else {
+        value = regnum < 32 ? hack_stack_ptrs(env, env->gpr[regnum])
+                            : env->gpr[regnum];
+    }
+
+    qemu_fprintf(f, "  %s = 0x" TARGET_FMT_lx "\n",
+                 hexagon_regnames[regnum], value);
+}
+
+static void hexagon_dump(CPUHexagonState *env, FILE *f)
+{
+    static target_ulong last_pc;
+    int i;
+
+    /*
+     * When comparing with LLDB, it doesn't step through single-cycle
+     * hardware loops the same way.  So, we just skip them here
+     */
+    if (env->gpr[HEX_REG_PC] == last_pc) {
+        return;
+    }
+    last_pc = env->gpr[HEX_REG_PC];
+    qemu_fprintf(f, "General Purpose Registers = {\n");
+    for (i = 0; i < 32; i++) {
+        print_reg(f, env, i);
+    }
+    print_reg(f, env, HEX_REG_SA0);
+    print_reg(f, env, HEX_REG_LC0);
+    print_reg(f, env, HEX_REG_SA1);
+    print_reg(f, env, HEX_REG_LC1);
+    print_reg(f, env, HEX_REG_M0);
+    print_reg(f, env, HEX_REG_M1);
+    print_reg(f, env, HEX_REG_USR);
+    print_reg(f, env, HEX_REG_P3_0);
+    print_reg(f, env, HEX_REG_GP);
+    print_reg(f, env, HEX_REG_UGP);
+    print_reg(f, env, HEX_REG_PC);
+#ifdef CONFIG_USER_ONLY
+    /*
+     * Not modelled in user mode, print junk to minimize the diffs
+     * with LLDB output
+     */
+    qemu_fprintf(f, "  cause = 0x000000db\n");
+    qemu_fprintf(f, "  badva = 0x00000000\n");
+    qemu_fprintf(f, "  cs0 = 0x00000000\n");
+    qemu_fprintf(f, "  cs1 = 0x00000000\n");
+#else
+    print_reg(f, env, HEX_REG_CAUSE);
+    print_reg(f, env, HEX_REG_BADVA);
+    print_reg(f, env, HEX_REG_CS0);
+    print_reg(f, env, HEX_REG_CS1);
+#endif
+    qemu_fprintf(f, "}\n");
+}
+
+static void hexagon_dump_state(CPUState *cs, FILE *f, int flags)
+{
+    HexagonCPU *cpu = HEXAGON_CPU(cs);
+    CPUHexagonState *env = &cpu->env;
+
+    hexagon_dump(env, f);
+}
+
+void hexagon_debug(CPUHexagonState *env)
+{
+    hexagon_dump(env, stdout);
+}
+
+static void hexagon_cpu_set_pc(CPUState *cs, vaddr value)
+{
+    HexagonCPU *cpu = HEXAGON_CPU(cs);
+    CPUHexagonState *env = &cpu->env;
+    env->gpr[HEX_REG_PC] = value;
+}
+
+static void hexagon_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+{
+    HexagonCPU *cpu = HEXAGON_CPU(cs);
+    CPUHexagonState *env = &cpu->env;
+    env->gpr[HEX_REG_PC] = tb->pc;
+}
+
+static bool hexagon_cpu_has_work(CPUState *cs)
+{
+    return true;
+}
+
+void restore_state_to_opc(CPUHexagonState *env, TranslationBlock *tb,
+                          target_ulong *data)
+{
+    env->gpr[HEX_REG_PC] = data[0];
+}
+
+static void hexagon_cpu_reset(DeviceState *dev)
+{
+    CPUState *cs = CPU(dev);
+    HexagonCPU *cpu = HEXAGON_CPU(cs);
+    HexagonCPUClass *mcc = HEXAGON_CPU_GET_CLASS(cpu);
+
+    mcc->parent_reset(dev);
+}
+
+static void hexagon_cpu_disas_set_info(CPUState *s, disassemble_info *info)
+{
+    info->print_insn = print_insn_hexagon;
+}
+
+static void hexagon_cpu_realize(DeviceState *dev, Error **errp)
+{
+    CPUState *cs = CPU(dev);
+    HexagonCPUClass *mcc = HEXAGON_CPU_GET_CLASS(dev);
+    Error *local_err = NULL;
+
+    cpu_exec_realizefn(cs, &local_err);
+    if (local_err != NULL) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    qemu_init_vcpu(cs);
+    cpu_reset(cs);
+
+    mcc->parent_realize(dev, errp);
+}
+
+static void hexagon_cpu_init(Object *obj)
+{
+    HexagonCPU *cpu = HEXAGON_CPU(obj);
+
+    cpu_set_cpustate_pointers(cpu);
+}
+
+static bool hexagon_tlb_fill(CPUState *cs, vaddr address, int size,
+                             MMUAccessType access_type, int mmu_idx,
+                             bool probe, uintptr_t retaddr)
+{
+#ifdef CONFIG_USER_ONLY
+    switch (access_type) {
+    case MMU_INST_FETCH:
+        cs->exception_index = HEX_EXCP_FETCH_NO_UPAGE;
+        break;
+    case MMU_DATA_LOAD:
+        cs->exception_index = HEX_EXCP_PRIV_NO_UREAD;
+        break;
+    case MMU_DATA_STORE:
+        cs->exception_index = HEX_EXCP_PRIV_NO_UWRITE;
+        break;
+    }
+    cpu_loop_exit_restore(cs, retaddr);
+#else
+#error System mode not implemented for Hexagon
+#endif
+}
+
+static void hexagon_cpu_class_init(ObjectClass *c, void *data)
+{
+    HexagonCPUClass *mcc = HEXAGON_CPU_CLASS(c);
+    CPUClass *cc = CPU_CLASS(c);
+    DeviceClass *dc = DEVICE_CLASS(c);
+
+    device_class_set_parent_realize(dc, hexagon_cpu_realize,
+                                    &mcc->parent_realize);
+
+    device_class_set_parent_reset(dc, hexagon_cpu_reset, &mcc->parent_reset);
+
+    cc->class_by_name = hexagon_cpu_class_by_name;
+    cc->has_work = hexagon_cpu_has_work;
+    cc->dump_state = hexagon_dump_state;
+    cc->set_pc = hexagon_cpu_set_pc;
+    cc->synchronize_from_tb = hexagon_cpu_synchronize_from_tb;
+    cc->gdb_num_core_regs = TOTAL_PER_THREAD_REGS;
+    cc->gdb_stop_before_watchpoint = true;
+    cc->disas_set_info = hexagon_cpu_disas_set_info;
+#ifdef CONFIG_TCG
+    cc->tcg_initialize = hexagon_translate_init;
+    cc->tlb_fill = hexagon_tlb_fill;
+#endif
+}
+
+#define DEFINE_CPU(type_name, initfn)      \
+    {                                      \
+        .name = type_name,                 \
+        .parent = TYPE_HEXAGON_CPU,        \
+        .instance_init = initfn            \
+    }
+
+static const TypeInfo hexagon_cpu_type_infos[] = {
+    {
+        .name = TYPE_HEXAGON_CPU,
+        .parent = TYPE_CPU,
+        .instance_size = sizeof(HexagonCPU),
+        .instance_init = hexagon_cpu_init,
+        .abstract = true,
+        .class_size = sizeof(HexagonCPUClass),
+        .class_init = hexagon_cpu_class_init,
+    },
+    DEFINE_CPU(TYPE_HEXAGON_CPU_V67,              hexagon_v67_cpu_init),
+};
+
+DEFINE_TYPES(hexagon_cpu_type_infos)
-- 
2.7.4

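The read_p3_0 helper in the patch above treats HEX_REG_P3_0 as a packed
view of the four 8-bit predicate registers, with pred[0] in the least
significant byte.  A minimal standalone sketch of that packing follows.
The names pack_p3_0 and unpack_pred are hypothetical, and the accumulator
is unsigned here (the patch uses int32_t) so that the shifts are well
defined in plain ISO C.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PREGS 4

/* Pack pred[3..0] into one word; pred[0] lands in the low byte. */
static uint32_t pack_p3_0(const uint32_t pred[NUM_PREGS])
{
    uint32_t packed = 0;
    int i;

    for (i = NUM_PREGS - 1; i >= 0; i--) {
        packed <<= 8;
        packed |= pred[i] & 0xff;   /* only the low 8 bits are kept */
    }
    return packed;
}

/* Inverse direction: extract predicate pnum from the packed word. */
static uint32_t unpack_pred(uint32_t packed, int pnum)
{
    assert(pnum >= 0 && pnum < NUM_PREGS);
    return (packed >> (8 * pnum)) & 0xff;
}
```

For example, predicates {0x01, 0x02, 0x03, 0xff} pack to 0xff030201.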


* [RFC PATCH v3 05/34] Hexagon (target/hexagon) register names
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (3 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 04/34] Hexagon (target/hexagon) scalar core definition Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 13:39   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 06/34] Hexagon (disas) disassembler Taylor Simpson
                   ` (30 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/hex_regs.h | 83 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 83 insertions(+)
 create mode 100644 target/hexagon/hex_regs.h

diff --git a/target/hexagon/hex_regs.h b/target/hexagon/hex_regs.h
new file mode 100644
index 0000000..3b4249a
--- /dev/null
+++ b/target/hexagon/hex_regs.h
@@ -0,0 +1,83 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_REGS_H
+#define HEXAGON_REGS_H
+
+enum {
+    HEX_REG_R00              = 0,
+    HEX_REG_R01              = 1,
+    HEX_REG_R02              = 2,
+    HEX_REG_R03              = 3,
+    HEX_REG_R04              = 4,
+    HEX_REG_R05              = 5,
+    HEX_REG_R06              = 6,
+    HEX_REG_R07              = 7,
+    HEX_REG_R08              = 8,
+    HEX_REG_R09              = 9,
+    HEX_REG_R10              = 10,
+    HEX_REG_R11              = 11,
+    HEX_REG_R12              = 12,
+    HEX_REG_R13              = 13,
+    HEX_REG_R14              = 14,
+    HEX_REG_R15              = 15,
+    HEX_REG_R16              = 16,
+    HEX_REG_R17              = 17,
+    HEX_REG_R18              = 18,
+    HEX_REG_R19              = 19,
+    HEX_REG_R20              = 20,
+    HEX_REG_R21              = 21,
+    HEX_REG_R22              = 22,
+    HEX_REG_R23              = 23,
+    HEX_REG_R24              = 24,
+    HEX_REG_R25              = 25,
+    HEX_REG_R26              = 26,
+    HEX_REG_R27              = 27,
+    HEX_REG_R28              = 28,
+    HEX_REG_R29              = 29,
+    HEX_REG_SP               = 29,
+    HEX_REG_FP               = 30,
+    HEX_REG_R30              = 30,
+    HEX_REG_LR               = 31,
+    HEX_REG_R31              = 31,
+    HEX_REG_SA0              = 32,
+    HEX_REG_LC0              = 33,
+    HEX_REG_SA1              = 34,
+    HEX_REG_LC1              = 35,
+    HEX_REG_P3_0             = 36,
+    HEX_REG_M0               = 38,
+    HEX_REG_M1               = 39,
+    HEX_REG_USR              = 40,
+    HEX_REG_PC               = 41,
+    HEX_REG_UGP              = 42,
+    HEX_REG_GP               = 43,
+    HEX_REG_CS0              = 44,
+    HEX_REG_CS1              = 45,
+    HEX_REG_UPCYCLELO        = 46,
+    HEX_REG_UPCYCLEHI        = 47,
+    HEX_REG_FRAMELIMIT       = 48,
+    HEX_REG_FRAMEKEY         = 49,
+    HEX_REG_PKTCNTLO         = 50,
+    HEX_REG_PKTCNTHI         = 51,
+    /* Use reserved control registers for qemu execution counts */
+    HEX_REG_QEMU_PKT_CNT      = 52,
+    HEX_REG_QEMU_INSN_CNT     = 53,
+    HEX_REG_UTIMERLO          = 62,
+    HEX_REG_UTIMERHI          = 63,
+};
+
+#endif
-- 
2.7.4



* [RFC PATCH v3 06/34] Hexagon (disas) disassembler
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (4 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 05/34] Hexagon (target/hexagon) register names Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 13:52   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 07/34] Hexagon (target/hexagon) scalar core helpers Taylor Simpson
                   ` (29 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

The Hexagon disassembler calls disassemble_hexagon to decode a packet
and format it for printing.

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 include/disas/dis-asm.h |  1 +
 disas/hexagon.c         | 62 +++++++++++++++++++++++++++++++++++++++++++++++++
 disas/Makefile.objs     |  1 +
 3 files changed, 64 insertions(+)
 create mode 100644 disas/hexagon.c

diff --git a/include/disas/dis-asm.h b/include/disas/dis-asm.h
index 9856bf7..14ff2be 100644
--- a/include/disas/dis-asm.h
+++ b/include/disas/dis-asm.h
@@ -460,6 +460,7 @@ int print_insn_xtensa           (bfd_vma, disassemble_info*);
 int print_insn_riscv32          (bfd_vma, disassemble_info*);
 int print_insn_riscv64          (bfd_vma, disassemble_info*);
 int print_insn_rx(bfd_vma, disassemble_info *);
+int print_insn_hexagon(bfd_vma, disassemble_info *);
 
 #if 0
 /* Fetch the disassembler for a given BFD, if that support is available.  */
diff --git a/disas/hexagon.c b/disas/hexagon.c
new file mode 100644
index 0000000..6ee8653
--- /dev/null
+++ b/disas/hexagon.c
@@ -0,0 +1,62 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * QEMU Hexagon Disassembler
+ */
+
+#include "qemu/osdep.h"
+#include "disas/dis-asm.h"
+#include "target/hexagon/cpu_bits.h"
+
+/*
+ * We will disassemble a packet with up to 4 instructions, so we need
+ * a hefty size buffer.
+ */
+#define PACKET_BUFFER_LEN                   1028
+
+int print_insn_hexagon(bfd_vma memaddr, struct disassemble_info *info)
+{
+    uint32_t words[PACKET_WORDS_MAX];
+    int len, slen;
+    char buf[PACKET_BUFFER_LEN];
+    int status;
+    int i;
+
+    for (i = 0; i < PACKET_WORDS_MAX; i++) {
+        status = (*info->read_memory_func)(memaddr + i * sizeof(uint32_t),
+                                           (bfd_byte *)&words[i],
+                                           sizeof(uint32_t), info);
+        if (status) {
+            if (i > 0) {
+                break;
+            }
+            (*info->memory_error_func)(status, memaddr, info);
+            return status;
+        }
+    }
+
+    len = disassemble_hexagon(words, i, buf, PACKET_BUFFER_LEN);
+    slen = strlen(buf);
+    if (slen > 0 && buf[slen - 1] == '\n') {
+        buf[slen - 1] = '\0';
+    }
+    (*info->fprintf_func)(info->stream, "%s", buf);
+
+    return len;
+}
+
diff --git a/disas/Makefile.objs b/disas/Makefile.objs
index 3c1cdce..0e86964 100644
--- a/disas/Makefile.objs
+++ b/disas/Makefile.objs
@@ -24,6 +24,7 @@ common-obj-$(CONFIG_SH4_DIS) += sh4.o
 common-obj-$(CONFIG_SPARC_DIS) += sparc.o
 common-obj-$(CONFIG_LM32_DIS) += lm32.o
 common-obj-$(CONFIG_XTENSA_DIS) += xtensa.o
+common-obj-$(CONFIG_HEXAGON_DIS) += hexagon.o
 
 # TODO: As long as the TCG interpreter and its generated code depend
 # on the QEMU target, we cannot compile the disassembler here.
-- 
2.7.4


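The fetch loop in print_insn_hexagon above reads up to PACKET_WORDS_MAX
words and treats a read failure as an error only when it happens on the
first word.  A minimal standalone sketch of that pattern follows.  The
read_word_fn callback, struct word_mem, and all function names are
hypothetical stand-ins for the disassemble_info callbacks, and the
newline strip guards against an empty string as an extra precaution.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PACKET_WORDS_MAX 4

/* Hypothetical reader: returns 0 on success, nonzero on failure. */
typedef int (*read_word_fn)(uint64_t addr, uint32_t *out, void *opaque);

/* Hypothetical in-memory backing store used to exercise the loop. */
struct word_mem {
    uint64_t base;
    const uint32_t *words;
    int nwords;
};

static int word_mem_read(uint64_t addr, uint32_t *out, void *opaque)
{
    const struct word_mem *mem = opaque;
    uint64_t index;

    if (addr < mem->base) {
        return -1;
    }
    index = (addr - mem->base) / sizeof(uint32_t);
    if (index >= (uint64_t)mem->nwords) {
        return -1;
    }
    *out = mem->words[index];
    return 0;
}

/*
 * Fetch up to PACKET_WORDS_MAX words starting at addr.  Returns the
 * number of words read.  *err is nonzero only if the first read failed;
 * a later failure just ends the packet early.
 */
static int fetch_packet_words(read_word_fn read_word, void *opaque,
                              uint64_t addr, uint32_t *words, int *err)
{
    int i;

    *err = 0;
    for (i = 0; i < PACKET_WORDS_MAX; i++) {
        int status = read_word(addr + i * sizeof(uint32_t),
                               &words[i], opaque);
        if (status) {
            if (i == 0) {
                *err = status;
            }
            break;
        }
    }
    return i;
}

/* Strip one trailing newline; an empty string is left untouched. */
static void strip_trailing_newline(char *buf)
{
    size_t slen = strlen(buf);

    if (slen > 0 && buf[slen - 1] == '\n') {
        buf[slen - 1] = '\0';
    }
}
```

Passing the count of words actually read to the decoder, as the patch
does with i, keeps it from looking at words that were never fetched.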

* [RFC PATCH v3 07/34] Hexagon (target/hexagon) scalar core helpers
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (5 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 06/34] Hexagon (disas) disassembler Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 14:16   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 08/34] Hexagon (target/hexagon) GDB Stub Taylor Simpson
                   ` (28 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Most of the helpers are generated.  Define the helper functions that are
needed, then include the generated file.

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/helper.h    |  33 ++++
 target/hexagon/op_helper.c | 368 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 401 insertions(+)
 create mode 100644 target/hexagon/helper.h
 create mode 100644 target/hexagon/op_helper.c
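
The log_pred_write helper in op_helper.c below ANDs together multiple
writes to the same predicate register within a packet.  A minimal
standalone sketch of that rule follows.  The struct pred_log and the
function pred_log_write are hypothetical stand-ins for the
new_pred_value and pred_written fields of CPUHexagonState.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PREGS 4

/* Hypothetical stand-in for the per-packet predicate log. */
struct pred_log {
    uint32_t new_pred_value[NUM_PREGS];
    uint32_t pred_written;   /* bit n is set once predicate n is written */
};

/* Record a write to predicate pnum; repeated writes are AND-ed together. */
static void pred_log_write(struct pred_log *log, int pnum, uint32_t val)
{
    assert(pnum >= 0 && pnum < NUM_PREGS);

    if (log->pred_written & (1u << pnum)) {
        /* Second or later write in this packet: AND with earlier value. */
        log->new_pred_value[pnum] &= val & 0xff;
    } else {
        /* First write in this packet: store it and mark it written. */
        log->new_pred_value[pnum] = val & 0xff;
        log->pred_written |= 1u << pnum;
    }
}
```

Starting from a cleared log, writing 0xff and then 0x0f to the same
predicate leaves 0x0f as the pending value; a further write of 0xf0
leaves 0x00.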

diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
new file mode 100644
index 0000000..48b1917
--- /dev/null
+++ b/target/hexagon/helper.h
@@ -0,0 +1,33 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+DEF_HELPER_2(raise_exception, noreturn, env, i32)
+DEF_HELPER_1(debug_start_packet, void, env)
+DEF_HELPER_3(debug_check_store_width, void, env, int, int)
+DEF_HELPER_3(debug_commit_end, void, env, int, int)
+DEF_HELPER_3(merge_inflight_store1s, s32, env, s32, s32)
+DEF_HELPER_3(merge_inflight_store1u, s32, env, s32, s32)
+DEF_HELPER_3(merge_inflight_store2s, s32, env, s32, s32)
+DEF_HELPER_3(merge_inflight_store2u, s32, env, s32, s32)
+DEF_HELPER_3(merge_inflight_store4s, s32, env, s32, s32)
+DEF_HELPER_3(merge_inflight_store4u, s32, env, s32, s32)
+DEF_HELPER_3(merge_inflight_store8u, s64, env, s32, s64)
+
+#include "helper_protos_generated.h"
+
+DEF_HELPER_2(debug_value, void, env, s32)
+DEF_HELPER_2(debug_value_i64, void, env, s64)
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
new file mode 100644
index 0000000..f86d45b
--- /dev/null
+++ b/target/hexagon/op_helper.c
@@ -0,0 +1,368 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <math.h>
+#include "qemu/osdep.h"
+#include "qemu.h"
+#include "exec/helper-proto.h"
+#include "cpu.h"
+#include "internal.h"
+#include "macros.h"
+#include "arch.h"
+#include "fma_emu.h"
+#include "conv_emu.h"
+
+/* Exceptions processing helpers */
+static void QEMU_NORETURN do_raise_exception_err(CPUHexagonState *env,
+                                                 uint32_t exception,
+                                                 uintptr_t pc)
+{
+    CPUState *cs = CPU(hexagon_env_get_cpu(env));
+    qemu_log_mask(CPU_LOG_INT, "%s: %d\n", __func__, exception);
+    cs->exception_index = exception;
+    cpu_loop_exit_restore(cs, pc);
+}
+
+void HELPER(raise_exception)(CPUHexagonState *env, uint32_t exception)
+{
+    do_raise_exception_err(env, exception, 0);
+}
+
+static inline void log_reg_write(CPUHexagonState *env, int rnum,
+                                 target_ulong val, uint32_t slot)
+{
+    HEX_DEBUG_LOG("log_reg_write[%d] = " TARGET_FMT_ld " (0x" TARGET_FMT_lx ")",
+                  rnum, val, val);
+    if (env->slot_cancelled & (1 << slot)) {
+        HEX_DEBUG_LOG(" CANCELLED");
+    }
+    if (val == env->gpr[rnum]) {
+        HEX_DEBUG_LOG(" NO CHANGE");
+    }
+    HEX_DEBUG_LOG("\n");
+    if (!(env->slot_cancelled & (1 << slot))) {
+        env->new_value[rnum] = val;
+#if HEX_DEBUG
+        /* Do this so HELPER(debug_commit_end) will know */
+        env->reg_written[rnum] = 1;
+#endif
+    }
+}
+
+static __attribute__((unused))
+inline void log_reg_write_pair(CPUHexagonState *env, int rnum,
+                                      int64_t val, uint32_t slot)
+{
+    HEX_DEBUG_LOG("log_reg_write_pair[%d:%d] = %ld\n", rnum + 1, rnum, val);
+    log_reg_write(env, rnum, val & 0xFFFFFFFF, slot);
+    log_reg_write(env, rnum + 1, (val >> 32) & 0xFFFFFFFF, slot);
+}
+
+static inline void log_pred_write(CPUHexagonState *env, int pnum,
+                                  target_ulong val)
+{
+    HEX_DEBUG_LOG("log_pred_write[%d] = " TARGET_FMT_ld
+                  " (0x" TARGET_FMT_lx ")\n",
+                  pnum, val, val);
+
+    /* Multiple writes to the same preg are and'ed together */
+    if (env->pred_written & (1 << pnum)) {
+        env->new_pred_value[pnum] &= val & 0xff;
+    } else {
+        env->new_pred_value[pnum] = val & 0xff;
+        env->pred_written |= 1 << pnum;
+    }
+}
+
+static inline void log_store32(CPUHexagonState *env, target_ulong addr,
+                               target_ulong val, int width, int slot)
+{
+    HEX_DEBUG_LOG("log_store%d(0x" TARGET_FMT_lx ", " TARGET_FMT_ld
+                  " [0x" TARGET_FMT_lx "])\n",
+                  width, addr, val, val);
+    env->mem_log_stores[slot].va = addr;
+    env->mem_log_stores[slot].width = width;
+    env->mem_log_stores[slot].data32 = val;
+}
+
+static inline void log_store64(CPUHexagonState *env, target_ulong addr,
+                               int64_t val, int width, int slot)
+{
+    HEX_DEBUG_LOG("log_store%d(0x" TARGET_FMT_lx ", %" PRId64
+                  " [0x%" PRIx64 "])\n", width, addr, val, val);
+    env->mem_log_stores[slot].va = addr;
+    env->mem_log_stores[slot].width = width;
+    env->mem_log_stores[slot].data64 = val;
+}
+
+static inline void write_new_pc(CPUHexagonState *env, target_ulong addr)
+{
+    HEX_DEBUG_LOG("write_new_pc(0x" TARGET_FMT_lx ")\n", addr);
+
+    /*
+     * If more than one branch is taken in a packet, only the first one
+     * is actually done.
+     */
+    if (env->branch_taken) {
+        HEX_DEBUG_LOG("INFO: multiple branches taken in same packet, "
+                      "ignoring the second one\n");
+    } else {
+        fCHECK_PCALIGN(addr);
+        env->branch_taken = 1;
+        env->next_PC = addr;
+    }
+}
+
+/* Handy place to set a breakpoint */
+void HELPER(debug_start_packet)(CPUHexagonState *env)
+{
+    HEX_DEBUG_LOG("Start packet: pc = 0x" TARGET_FMT_lx "\n",
+                  env->gpr[HEX_REG_PC]);
+
+    int i;
+    for (i = 0; i < TOTAL_PER_THREAD_REGS; i++) {
+        env->reg_written[i] = 0;
+    }
+}
+
+static inline int32_t new_pred_value(CPUHexagonState *env, int pnum)
+{
+    return env->new_pred_value[pnum];
+}
+
+/* Checks for bookkeeping errors between disassembly context and runtime */
+void HELPER(debug_check_store_width)(CPUHexagonState *env, int slot, int check)
+{
+    if (env->mem_log_stores[slot].width != check) {
+        HEX_DEBUG_LOG("ERROR: %d != %d\n",
+                      env->mem_log_stores[slot].width, check);
+        g_assert_not_reached();
+    }
+}
+
+static void print_store(CPUHexagonState *env, int slot)
+{
+    if (!(env->slot_cancelled & (1 << slot))) {
+        size1u_t width = env->mem_log_stores[slot].width;
+        if (width == 1) {
+            size4u_t data = env->mem_log_stores[slot].data32 & 0xff;
+            HEX_DEBUG_LOG("\tmemb[0x" TARGET_FMT_lx "] = %d (0x%02x)\n",
+                          env->mem_log_stores[slot].va, data, data);
+        } else if (width == 2) {
+            size4u_t data = env->mem_log_stores[slot].data32 & 0xffff;
+            HEX_DEBUG_LOG("\tmemh[0x" TARGET_FMT_lx "] = %d (0x%04x)\n",
+                          env->mem_log_stores[slot].va, data, data);
+        } else if (width == 4) {
+            size4u_t data = env->mem_log_stores[slot].data32;
+            HEX_DEBUG_LOG("\tmemw[0x" TARGET_FMT_lx "] = %d (0x%08x)\n",
+                          env->mem_log_stores[slot].va, data, data);
+        } else if (width == 8) {
+            HEX_DEBUG_LOG("\tmemd[0x" TARGET_FMT_lx "] = %" PRIu64
+                          " (0x%016" PRIx64 ")\n", env->mem_log_stores[slot].va,
+                          env->mem_log_stores[slot].data64,
+                          env->mem_log_stores[slot].data64);
+        } else {
+            HEX_DEBUG_LOG("\tBad store width %d\n", width);
+            g_assert_not_reached();
+        }
+    }
+}
+
+/* This function is a handy place to set a breakpoint */
+void HELPER(debug_commit_end)(CPUHexagonState *env, int has_st0, int has_st1)
+{
+    bool reg_printed = false;
+    bool pred_printed = false;
+    int i;
+
+    HEX_DEBUG_LOG("Packet committed: pc = 0x" TARGET_FMT_lx "\n",
+                  env->this_PC);
+    HEX_DEBUG_LOG("slot_cancelled = %d\n", env->slot_cancelled);
+
+    for (i = 0; i < TOTAL_PER_THREAD_REGS; i++) {
+        if (env->reg_written[i]) {
+            if (!reg_printed) {
+                HEX_DEBUG_LOG("Regs written\n");
+                reg_printed = true;
+            }
+            HEX_DEBUG_LOG("\tr%d = " TARGET_FMT_ld " (0x" TARGET_FMT_lx ")\n",
+                          i, env->new_value[i], env->new_value[i]);
+        }
+    }
+
+    for (i = 0; i < NUM_PREGS; i++) {
+        if (env->pred_written & (1 << i)) {
+            if (!pred_printed) {
+                HEX_DEBUG_LOG("Predicates written\n");
+                pred_printed = true;
+            }
+            HEX_DEBUG_LOG("\tp%d = 0x" TARGET_FMT_lx "\n",
+                          i, env->new_pred_value[i]);
+        }
+    }
+
+    if (has_st0 || has_st1) {
+        HEX_DEBUG_LOG("Stores\n");
+        if (has_st0) {
+            print_store(env, 0);
+        }
+        if (has_st1) {
+            print_store(env, 1);
+        }
+    }
+
+    HEX_DEBUG_LOG("Next PC = 0x" TARGET_FMT_lx "\n", env->next_PC);
+    HEX_DEBUG_LOG("Exec counters: pkt = " TARGET_FMT_lx
+                  ", insn = " TARGET_FMT_lx
+                  "\n",
+                  env->gpr[HEX_REG_QEMU_PKT_CNT],
+                  env->gpr[HEX_REG_QEMU_INSN_CNT]);
+
+}
+
+/*
+ * Handle mem_noshuf
+ *
+ * This occurs when there is a load that might need data forwarded
+ * from an inflight store in slot 1.  Note that the load and store
+ * might have different sizes, so we can't simply compare the
+ * addresses.  We merge only the bytes that overlap (if any).
+ */
+static int64_t merge_bytes(CPUHexagonState *env, target_ulong load_addr,
+                           int64_t load_data, uint32_t load_width)
+{
+    /* Don't do anything if slot 1 was cancelled */
+    const int store_slot = 1;
+    if (env->slot_cancelled & (1 << store_slot)) {
+        return load_data;
+    }
+
+    int store_width = env->mem_log_stores[store_slot].width;
+    target_ulong store_addr = env->mem_log_stores[store_slot].va;
+    union {
+        uint8_t bytes[8];
+        uint32_t data32;
+        uint64_t data64;
+    } retdata, storedata;
+    int bigmask = ((-load_width) & (-store_width));
+    if ((load_addr & bigmask) != (store_addr & bigmask)) {
+        /* no overlap */
+        return load_data;
+    }
+    retdata.data64 = load_data;
+
+    if (store_width == 1 || store_width == 2 || store_width == 4) {
+        storedata.data32 = env->mem_log_stores[store_slot].data32;
+    } else if (store_width == 8) {
+        storedata.data64 = env->mem_log_stores[store_slot].data64;
+    } else {
+        g_assert_not_reached();
+    }
+    int i, j;
+    i = store_addr & (load_width - 1);
+    j = load_addr & (store_width - 1);
+    while ((i < load_width) && (j < store_width)) {
+        retdata.bytes[i] = storedata.bytes[j];
+        i++;
+        j++;
+    }
+    return retdata.data64;
+}
+
+#define MERGE_INFLIGHT(NAME, RET, IN_TYPE, OUT_TYPE, SIZE) \
+RET HELPER(NAME)(CPUHexagonState *env, int32_t addr, IN_TYPE data) \
+{ \
+    return (OUT_TYPE)merge_bytes(env, addr, data, SIZE); \
+}
+
+MERGE_INFLIGHT(merge_inflight_store1s, int32_t, int32_t,  int8_t,  1)
+MERGE_INFLIGHT(merge_inflight_store1u, int32_t, int32_t, uint8_t,  1)
+MERGE_INFLIGHT(merge_inflight_store2s, int32_t, int32_t,  int16_t, 2)
+MERGE_INFLIGHT(merge_inflight_store2u, int32_t, int32_t, uint16_t, 2)
+MERGE_INFLIGHT(merge_inflight_store4s, int32_t, int32_t,  int32_t, 4)
+MERGE_INFLIGHT(merge_inflight_store4u, int32_t, int32_t, uint32_t, 4)
+MERGE_INFLIGHT(merge_inflight_store8u, int64_t, int64_t,  int64_t, 8)
+
+#define CHECK_NOSHUF(DST, VA, SZ, SIGN) \
+    do { \
+        if (slot == 0 && env->pkt_has_store_s1) { \
+            DST = HELPER(merge_inflight_store##SZ##SIGN)(env, VA, DST); \
+        } \
+    } while (0)
+
+static inline uint8_t mem_load1(CPUHexagonState *env, uint32_t slot,
+                                target_ulong vaddr)
+{
+    uint8_t retval;
+    get_user_u8(retval, vaddr);
+    CHECK_NOSHUF(retval, vaddr, 1, u);
+    return retval;
+}
+
+static inline uint16_t mem_load2(CPUHexagonState *env, uint32_t slot,
+                                 target_ulong vaddr)
+{
+    uint16_t retval;
+    get_user_u16(retval, vaddr);
+    CHECK_NOSHUF(retval, vaddr, 2, u);
+    return retval;
+}
+
+static inline uint32_t mem_load4(CPUHexagonState *env, uint32_t slot,
+                                 target_ulong vaddr)
+{
+    uint32_t retval;
+    get_user_u32(retval, vaddr);
+    CHECK_NOSHUF(retval, vaddr, 4, u);
+    return retval;
+}
+
+static inline uint64_t mem_load8(CPUHexagonState *env, uint32_t slot,
+                                 target_ulong vaddr)
+{
+    uint64_t retval;
+    get_user_u64(retval, vaddr);
+    CHECK_NOSHUF(retval, vaddr, 8, u);
+    return retval;
+}
+
+/* Helpful for printing intermediate values within instructions */
+void HELPER(debug_value)(CPUHexagonState *env, int32_t value)
+{
+    HEX_DEBUG_LOG("value = 0x%x\n", value);
+}
+
+void HELPER(debug_value_i64)(CPUHexagonState *env, int64_t value)
+{
+    HEX_DEBUG_LOG("value_i64 = 0x%" PRIx64 "\n", value);
+}
+
+static void cancel_slot(CPUHexagonState *env, uint32_t slot)
+{
+    HEX_DEBUG_LOG("Slot %d cancelled\n", slot);
+    env->slot_cancelled |= (1 << slot);
+}
+
+/* These macros can be referenced in the generated helper functions */
+#define warn(...) /* Nothing */
+#define fatal(...) g_assert_not_reached();
+
+#define BOGUS_HELPER(tag) \
+    printf("ERROR: bogus helper: " #tag "\n")
+
+#include "helper_funcs_generated.h"
+
-- 
2.7.4
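
The mem_noshuf handling in merge_bytes() above can be illustrated with a
standalone sketch. This is not code from the patch: the function name
merge_store_into_load and its explicit store arguments are hypothetical (the
patch reads the in-flight store from env->mem_log_stores[1]), and shifts are
used in place of the union so the result does not depend on host byte order.
It assumes naturally aligned accesses with widths of 1, 2, 4 or 8 bytes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical standalone variant of merge_bytes(): forward the bytes of an
 * in-flight store that overlap a load.  Byte k of a value is bits
 * [8k+7:8k], i.e. little-endian byte numbering.
 */
static uint64_t merge_store_into_load(uint32_t load_addr, uint64_t load_data,
                                      uint32_t load_width,
                                      uint32_t store_addr, uint64_t store_data,
                                      uint32_t store_width)
{
    /* Mask that aligns both addresses down to the larger of the two widths */
    uint32_t bigmask = (0u - load_width) & (0u - store_width);
    uint32_t i, j;

    assert(load_width >= 1 && load_width <= 8);
    assert(store_width >= 1 && store_width <= 8);

    if ((load_addr & bigmask) != (store_addr & bigmask)) {
        /* The accesses fall in different aligned blocks: no overlap */
        return load_data;
    }

    i = store_addr & (load_width - 1);   /* first overlapping load byte */
    j = load_addr & (store_width - 1);   /* first overlapping store byte */
    while (i < load_width && j < store_width) {
        uint64_t byte = (store_data >> (8 * j)) & 0xff;

        load_data &= ~((uint64_t)0xff << (8 * i));
        load_data |= byte << (8 * i);
        i++;
        j++;
    }
    return load_data;
}
```

For example, a 4-byte load of 0x11223344 at 0x1000 combined with an in-flight
1-byte store of 0xaa at 0x1002 yields 0x11aa3344, while a 4-byte store at
0x1004 leaves the loaded value unchanged.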



* [RFC PATCH v3 08/34] Hexagon (target/hexagon) GDB Stub
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (6 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 07/34] Hexagon (target/hexagon) scalar core helpers Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 14:17   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 09/34] Hexagon (target/hexagon) architecture types Taylor Simpson
                   ` (27 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

GDB register read and write routines

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/internal.h |  2 ++
 target/hexagon/cpu.c      |  2 ++
 target/hexagon/gdbstub.c  | 49 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 53 insertions(+)
 create mode 100644 target/hexagon/gdbstub.c

diff --git a/target/hexagon/internal.h b/target/hexagon/internal.h
index d3e4412..acddd91 100644
--- a/target/hexagon/internal.h
+++ b/target/hexagon/internal.h
@@ -37,4 +37,6 @@ extern const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS];
 
 extern void init_genptr(void);
 
+extern int hexagon_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
+extern int hexagon_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 #endif
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index d812913..e2c879c 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -284,6 +284,8 @@ static void hexagon_cpu_class_init(ObjectClass *c, void *data)
     cc->dump_state = hexagon_dump_state;
     cc->set_pc = hexagon_cpu_set_pc;
     cc->synchronize_from_tb = hexagon_cpu_synchronize_from_tb;
+    cc->gdb_read_register = hexagon_gdb_read_register;
+    cc->gdb_write_register = hexagon_gdb_write_register;
     cc->gdb_num_core_regs = TOTAL_PER_THREAD_REGS;
     cc->gdb_stop_before_watchpoint = true;
     cc->disas_set_info = hexagon_cpu_disas_set_info;
diff --git a/target/hexagon/gdbstub.c b/target/hexagon/gdbstub.c
new file mode 100644
index 0000000..a2ea528
--- /dev/null
+++ b/target/hexagon/gdbstub.c
@@ -0,0 +1,49 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "exec/gdbstub.h"
+#include "cpu.h"
+#include "internal.h"
+
+int hexagon_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
+{
+    HexagonCPU *cpu = HEXAGON_CPU(cs);
+    CPUHexagonState *env = &cpu->env;
+
+    if (n < TOTAL_PER_THREAD_REGS) {
+        return gdb_get_regl(mem_buf, env->gpr[n]);
+    }
+
+    g_assert_not_reached();
+    return 0;
+}
+
+int hexagon_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
+{
+    HexagonCPU *cpu = HEXAGON_CPU(cs);
+    CPUHexagonState *env = &cpu->env;
+
+    if (n < TOTAL_PER_THREAD_REGS) {
+        env->gpr[n] = ldtul_p(mem_buf);
+        return sizeof(target_ulong);
+    }
+
+    g_assert_not_reached();
+    return 0;
+}
-- 
2.7.4



* [RFC PATCH v3 09/34] Hexagon (target/hexagon) architecture types
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (7 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 08/34] Hexagon (target/hexagon) GDB Stub Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 14:19   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 10/34] Hexagon (target/hexagon) instruction and packet types Taylor Simpson
                   ` (26 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Define types used in files imported from the Hexagon architecture library

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/hex_arch_types.h | 43 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)
 create mode 100644 target/hexagon/hex_arch_types.h

diff --git a/target/hexagon/hex_arch_types.h b/target/hexagon/hex_arch_types.h
new file mode 100644
index 0000000..5f5b722
--- /dev/null
+++ b/target/hexagon/hex_arch_types.h
@@ -0,0 +1,43 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_ARCH_TYPES_H
+#define HEXAGON_ARCH_TYPES_H
+
+#include <stdint.h>
+
+/*
+ * These types are used by the code generated from the Hexagon
+ * architecture library.
+ */
+typedef uint8_t     size1u_t;
+typedef int8_t      size1s_t;
+typedef uint16_t    size2u_t;
+typedef int16_t     size2s_t;
+typedef uint32_t    size4u_t;
+typedef int32_t     size4s_t;
+typedef uint64_t    size8u_t;
+typedef int64_t     size8s_t;
+typedef uint64_t    paddr_t;
+typedef uint32_t    vaddr_t;
+
+typedef struct size16s {
+    size8s_t hi;
+    size8u_t lo;
+} size16s_t;
+
+#endif
-- 
2.7.4



* [RFC PATCH v3 10/34] Hexagon (target/hexagon) instruction and packet types
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (8 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 09/34] Hexagon (target/hexagon) architecture types Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 14:22   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 11/34] Hexagon (target/hexagon) register fields Taylor Simpson
                   ` (25 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

The insn_t and packet_t types are the interface between instruction decoding
and TCG code generation

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/insn.h | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)
 create mode 100644 target/hexagon/insn.h

diff --git a/target/hexagon/insn.h b/target/hexagon/insn.h
new file mode 100644
index 0000000..aa2d1dd
--- /dev/null
+++ b/target/hexagon/insn.h
@@ -0,0 +1,75 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_INSN_H
+#define HEXAGON_INSN_H
+
+#include "cpu.h"
+#include "hex_arch_types.h"
+
+#define INSTRUCTIONS_MAX 7    /* 2 pairs + loopend */
+#define REG_OPERANDS_MAX 5
+#define IMMEDS_MAX 2
+
+struct Instruction;
+struct Packet;
+struct DisasContext;
+
+typedef void (*semantic_insn_t)(CPUHexagonState *env,
+                                struct DisasContext *ctx,
+                                struct Instruction *insn,
+                                struct Packet *pkt);
+
+struct Instruction {
+    semantic_insn_t generate;            /* pointer to genptr routine */
+    size1u_t regno[REG_OPERANDS_MAX];    /* reg operands including predicates */
+    size2u_t opcode;
+
+    size4u_t iclass:6;
+    size4u_t slot:3;
+    size4u_t part1:1;        /*
+                              * cmp-jumps are split into two insns.
+                              * set for the compare and clear for the jump
+                              */
+    size4u_t extension_valid:1;   /* Has a constant extender attached */
+    size4u_t which_extended:1;    /* If has an extender, which immediate */
+    size4u_t is_endloop:1;   /* This is an end of loop */
+    size4u_t new_value_producer_slot:4;
+    size4s_t immed[IMMEDS_MAX];    /* immediate field */
+};
+
+typedef struct Instruction insn_t;
+
+struct Packet {
+    size2u_t num_insns;
+    size2u_t encod_pkt_size_in_bytes;
+
+    /* Pre-decoded information about change-of-flow (COF) */
+    size8u_t pkt_has_cof:1;          /* Has any change-of-flow */
+    size8u_t pkt_has_endloop:1;
+
+    size8u_t pkt_has_dczeroa:1;
+
+    size8u_t pkt_has_store_s0:1;
+    size8u_t pkt_has_store_s1:1;
+
+    insn_t insn[INSTRUCTIONS_MAX];
+};
+
+typedef struct Packet packet_t;
+
+#endif
-- 
2.7.4



* [RFC PATCH v3 11/34] Hexagon (target/hexagon) register fields
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (9 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 10/34] Hexagon (target/hexagon) instruction and packet types Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 14:29   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 12/34] Hexagon (target/hexagon) instruction attributes Taylor Simpson
                   ` (24 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Declare bitfields within registers such as user status register (USR)

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/reg_fields.h     | 40 +++++++++++++++++++++
 target/hexagon/reg_fields_def.h | 78 +++++++++++++++++++++++++++++++++++++++++
 target/hexagon/reg_fields.c     | 28 +++++++++++++++
 3 files changed, 146 insertions(+)
 create mode 100644 target/hexagon/reg_fields.h
 create mode 100644 target/hexagon/reg_fields_def.h
 create mode 100644 target/hexagon/reg_fields.c

diff --git a/target/hexagon/reg_fields.h b/target/hexagon/reg_fields.h
new file mode 100644
index 0000000..cf168f0
--- /dev/null
+++ b/target/hexagon/reg_fields.h
@@ -0,0 +1,40 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_REG_FIELDS_H
+#define HEXAGON_REG_FIELDS_H
+
+#define NUM_GEN_REGS 32
+
+typedef struct {
+    const char *name;
+    int offset;
+    int width;
+    const char *description;
+} reg_field_t;
+
+extern reg_field_t reg_field_info[];
+
+enum reg_fields_enum {
+#define DEF_REG_FIELD(TAG, NAME, START, WIDTH, DESCRIPTION) \
+    TAG,
+#include "reg_fields_def.h"
+    NUM_REG_FIELDS
+#undef DEF_REG_FIELD
+};
+
+#endif
diff --git a/target/hexagon/reg_fields_def.h b/target/hexagon/reg_fields_def.h
new file mode 100644
index 0000000..ffb18a1
--- /dev/null
+++ b/target/hexagon/reg_fields_def.h
@@ -0,0 +1,78 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * For registers that have individual fields, explain them here
+ *   DEF_REG_FIELD(tag,
+ *                 name,
+ *                 bit start offset,
+ *                 width,
+ *                 description)
+ */
+
+/* USR fields */
+DEF_REG_FIELD(USR_OVF,
+    "ovf", 0, 1,
+    "Sticky Saturation Overflow - "
+    "Set when saturation occurs while executing instruction that specifies "
+    "optional saturation, remains set until explicitly cleared by a USR=Rs "
+    "instruction.")
+DEF_REG_FIELD(USR_FPINVF,
+    "fpinvf", 1, 1,
+    "Floating-point IEEE Invalid Sticky Flag.")
+DEF_REG_FIELD(USR_FPDBZF,
+    "fpdbzf", 2, 1,
+    "Floating-point IEEE Divide-By-Zero Sticky Flag.")
+DEF_REG_FIELD(USR_FPOVFF,
+    "fpovff", 3, 1,
+    "Floating-point IEEE Overflow Sticky Flag.")
+DEF_REG_FIELD(USR_FPUNFF,
+    "fpunff", 4, 1,
+    "Floating-point IEEE Underflow Sticky Flag.")
+DEF_REG_FIELD(USR_FPINPF,
+    "fpinpf", 5, 1,
+    "Floating-point IEEE Inexact Sticky Flag.")
+
+DEF_REG_FIELD(USR_LPCFG,
+    "lpcfg", 8, 2,
+    "Hardware Loop Configuration: "
+    "Number of loop iterations (0-3) remaining before pipeline predicate "
+    "should be set.")
+
+DEF_REG_FIELD(USR_FPRND,
+    "fprnd", 22, 2,
+    "Rounding Mode for Floating-Point Instructions: "
+    "00: Round to nearest, ties to even (default), "
+    "01: Toward zero, "
+    "10: Downward (toward negative infinity), "
+    "11: Upward (toward positive infinity).")
+
+DEF_REG_FIELD(USR_FPINVE,
+    "fpinve", 25, 1,
+    "Enable trap on IEEE Invalid.")
+DEF_REG_FIELD(USR_FPDBZE,
+    "fpdbze", 26, 1, "Enable trap on IEEE Divide-By-Zero.")
+DEF_REG_FIELD(USR_FPOVFE,
+    "fpovfe", 27, 1,
+    "Enable trap on IEEE Overflow.")
+DEF_REG_FIELD(USR_FPUNFE,
+    "fpunfe", 28, 1,
+    "Enable trap on IEEE Underflow.")
+DEF_REG_FIELD(USR_FPINPE,
+    "fpinpe", 29, 1,
+    "Enable trap on IEEE Inexact.")
+
diff --git a/target/hexagon/reg_fields.c b/target/hexagon/reg_fields.c
new file mode 100644
index 0000000..2a3e4f5a
--- /dev/null
+++ b/target/hexagon/reg_fields.c
@@ -0,0 +1,28 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "reg_fields.h"
+
+reg_field_t reg_field_info[] = {
+#define DEF_REG_FIELD(TAG, NAME, START, WIDTH, DESCRIPTION)    \
+      {NAME, START, WIDTH, DESCRIPTION},
+#include "reg_fields_def.h"
+      {NULL, 0, 0, NULL}
+#undef DEF_REG_FIELD
+};
+
-- 
2.7.4
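
The DEF_REG_FIELD pattern above expands one list twice, once into an enum and
once into a table indexed by that enum. The sketch below shows how such a
table might be used to read and write a field. It is an illustration only:
the list is inlined as a macro instead of a separate include file, and the
names EXAMPLE_REG_FIELDS, example_field_info, get_reg_field and set_reg_field
are hypothetical, not part of the patch. The three field definitions are
copied from reg_fields_def.h.

```c
#include <assert.h>
#include <stdint.h>

/* Tag, name, bit start offset, width */
#define EXAMPLE_REG_FIELDS(X) \
    X(USR_OVF,   "ovf",    0, 1) \
    X(USR_LPCFG, "lpcfg",  8, 2) \
    X(USR_FPRND, "fprnd", 22, 2)

/* First expansion: one enum tag per field */
enum example_reg_fields {
#define AS_ENUM(TAG, NAME, START, WIDTH) TAG,
    EXAMPLE_REG_FIELDS(AS_ENUM)
#undef AS_ENUM
    NUM_EXAMPLE_REG_FIELDS
};

typedef struct {
    const char *name;
    int offset;
    int width;
} example_field_t;

/* Second expansion: one table row per field, indexed by the enum tag */
static const example_field_t example_field_info[] = {
#define AS_ROW(TAG, NAME, START, WIDTH) { NAME, START, WIDTH },
    EXAMPLE_REG_FIELDS(AS_ROW)
#undef AS_ROW
};

/* Extract a field from a 32-bit register value (field widths are < 32) */
static uint32_t get_reg_field(uint32_t reg, int field)
{
    const example_field_t *f;

    assert(field >= 0 && field < NUM_EXAMPLE_REG_FIELDS);
    f = &example_field_info[field];
    return (reg >> f->offset) & ((1u << f->width) - 1);
}

/* Return reg with the given field replaced by val (truncated to the width) */
static uint32_t set_reg_field(uint32_t reg, int field, uint32_t val)
{
    const example_field_t *f;
    uint32_t mask;

    assert(field >= 0 && field < NUM_EXAMPLE_REG_FIELDS);
    f = &example_field_info[field];
    mask = ((1u << f->width) - 1) << f->offset;
    return (reg & ~mask) | ((val << f->offset) & mask);
}
```

With these helpers, set_reg_field(0, USR_FPRND, 2) selects the "downward"
rounding mode by setting bits 23:22 to binary 10, giving 0x00800000.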



* [RFC PATCH v3 12/34] Hexagon (target/hexagon) instruction attributes
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (10 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 11/34] Hexagon (target/hexagon) register fields Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 14:34   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 13/34] Hexagon (target/hexagon) register map Taylor Simpson
                   ` (23 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/attribs.h     | 32 +++++++++++++++
 target/hexagon/attribs_def.h | 98 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 130 insertions(+)
 create mode 100644 target/hexagon/attribs.h
 create mode 100644 target/hexagon/attribs_def.h

diff --git a/target/hexagon/attribs.h b/target/hexagon/attribs.h
new file mode 100644
index 0000000..d35af0c
--- /dev/null
+++ b/target/hexagon/attribs.h
@@ -0,0 +1,32 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_ATTRIBS_H
+#define HEXAGON_ATTRIBS_H
+
+enum {
+#define DEF_ATTRIB(NAME, ...) A_##NAME,
+#include "attribs_def.h"
+#undef DEF_ATTRIB
+};
+
+#define ATTRIB_WIDTH 32
+#define GET_ATTRIB(opcode, attrib) \
+    (((opcode_attribs[opcode][(attrib) / ATTRIB_WIDTH])\
+    >> ((attrib) % ATTRIB_WIDTH)) & 0x1)
+
+#endif /* HEXAGON_ATTRIBS_H */
diff --git a/target/hexagon/attribs_def.h b/target/hexagon/attribs_def.h
new file mode 100644
index 0000000..d024176
--- /dev/null
+++ b/target/hexagon/attribs_def.h
@@ -0,0 +1,98 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/* Keep this as the first attribute: */
+DEF_ATTRIB(AA_DUMMY, "Dummy Zeroth Attribute", "", "")
+
+/* Misc */
+DEF_ATTRIB(EXTENSION, "Extension instruction", "", "")
+
+DEF_ATTRIB(PRIV, "Not available in user or guest mode", "", "")
+DEF_ATTRIB(GUEST, "Not available in user mode", "", "")
+
+DEF_ATTRIB(FPOP, "Floating Point Operation", "", "")
+
+DEF_ATTRIB(EXTENDABLE, "Immediate may be extended", "", "")
+
+DEF_ATTRIB(ARCHV2, "V2 architecture", "", "")
+DEF_ATTRIB(ARCHV3, "V3 architecture", "", "")
+DEF_ATTRIB(ARCHV4, "V4 architecture", "", "")
+DEF_ATTRIB(ARCHV5, "V5 architecture", "", "")
+
+DEF_ATTRIB(SUBINSN, "sub-instruction", "", "")
+
+/* Load and Store attributes */
+DEF_ATTRIB(LOAD, "Loads from memory", "", "")
+DEF_ATTRIB(STORE, "Stores to memory", "", "")
+DEF_ATTRIB(MEMLIKE, "Memory-like instruction", "", "")
+DEF_ATTRIB(MEMLIKE_PACKET_RULES, "follows Memory-like packet rules", "", "")
+
+
+/* Change-of-flow attributes */
+DEF_ATTRIB(JUMP, "Jump-type instruction", "", "")
+DEF_ATTRIB(INDIRECT, "Absolute register jump", "", "")
+DEF_ATTRIB(CALL, "Function call instruction", "", "")
+DEF_ATTRIB(COF, "Change-of-flow instruction", "", "")
+DEF_ATTRIB(CONDEXEC, "May be cancelled by a predicate", "", "")
+DEF_ATTRIB(DOTNEWVALUE, "Uses a register value generated in this pkt", "", "")
+DEF_ATTRIB(NEWCMPJUMP, "Compound compare and jump", "", "")
+
+/* access to implicit registers */
+DEF_ATTRIB(IMPLICIT_WRITES_LR, "Writes the link register", "", "UREG.LR")
+DEF_ATTRIB(IMPLICIT_WRITES_PC, "Writes the program counter", "", "UREG.PC")
+DEF_ATTRIB(IMPLICIT_WRITES_SP, "Writes the stack pointer", "", "UREG.SP")
+DEF_ATTRIB(IMPLICIT_WRITES_FP, "Writes the frame pointer", "", "UREG.FP")
+DEF_ATTRIB(IMPLICIT_WRITES_GP, "Writes the GP register", "", "UREG.GP")
+DEF_ATTRIB(IMPLICIT_WRITES_LC0, "Writes loop count for loop 0", "", "UREG.LC0")
+DEF_ATTRIB(IMPLICIT_WRITES_LC1, "Writes loop count for loop 1", "", "UREG.LC1")
+DEF_ATTRIB(IMPLICIT_WRITES_SA0, "Writes start addr for loop 0", "", "UREG.SA0")
+DEF_ATTRIB(IMPLICIT_WRITES_SA1, "Writes start addr for loop 1", "", "UREG.SA1")
+DEF_ATTRIB(IMPLICIT_WRITES_P0, "Writes Predicate 0", "", "UREG.P0")
+DEF_ATTRIB(IMPLICIT_WRITES_P1, "Writes Predicate 1", "", "UREG.P1")
+DEF_ATTRIB(IMPLICIT_WRITES_P2, "Writes Predicate 2", "", "UREG.P2")
+DEF_ATTRIB(IMPLICIT_WRITES_P3, "May write Predicate 3", "", "UREG.P3")
+
+DEF_ATTRIB(CRSLOT23, "Can execute in slot 2 or slot 3 (CR)", "", "")
+DEF_ATTRIB(IT_NOP, "nop instruction", "", "")
+DEF_ATTRIB(IT_EXTENDER, "constant extender instruction", "", "")
+
+
+/* Restrictions to make note of */
+DEF_ATTRIB(RESTRICT_SLOT0ONLY, "Must execute on slot0", "", "")
+DEF_ATTRIB(RESTRICT_SLOT1ONLY, "Must execute on slot1", "", "")
+DEF_ATTRIB(RESTRICT_SLOT2ONLY, "Must execute on slot2", "", "")
+DEF_ATTRIB(RESTRICT_SLOT3ONLY, "Must execute on slot3", "", "")
+DEF_ATTRIB(RESTRICT_NOSLOT1, "No slot 1 instruction in parallel", "", "")
+DEF_ATTRIB(RESTRICT_PREFERSLOT0, "Try to encode into slot 0", "", "")
+
+DEF_ATTRIB(ICOP, "Instruction cache op", "", "")
+
+DEF_ATTRIB(HWLOOP0_END, "Ends HW loop0", "", "")
+DEF_ATTRIB(HWLOOP1_END, "Ends HW loop1", "", "")
+DEF_ATTRIB(DCZEROA, "dczeroa type", "", "")
+DEF_ATTRIB(ICFLUSHOP, "icflush op type", "", "")
+DEF_ATTRIB(DCFLUSHOP, "dcflush op type", "", "")
+DEF_ATTRIB(DCFETCH, "dcfetch type", "", "")
+
+DEF_ATTRIB(L2FETCH, "Instruction is l2fetch type", "", "")
+
+DEF_ATTRIB(ICINVA, "icinva", "", "")
+DEF_ATTRIB(DCCLEANINVA, "dccleaninva", "", "")
+
+/* Keep this as the last attribute: */
+DEF_ATTRIB(ZZ_LASTATTRIB, "Last attribute in the file", "", "")
+
-- 
2.7.4



* [RFC PATCH v3 13/34] Hexagon (target/hexagon) register map
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (11 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 12/34] Hexagon (target/hexagon) instruction attributes Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 14:36   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 14/34] Hexagon (target/hexagon) instruction/packet decode Taylor Simpson
                   ` (22 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Certain operand types represent a non-contiguous set of values.
For example, the compound compare-and-jump instruction can only access
registers R0-R7 and R16-R23.
This table represents the mapping from the encoding to the actual values.
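
As a rough illustration of the mapping described above, the sketch below
expands a single regmap.h row with the same DEF_REGMAP definition that
decode.c uses, then indexes the resulting table with an encoded field.
The helper map_r_16() is hypothetical and is not part of this patch; the
real decoder applies the table through DECODE_MAPPED_REG.

```c
#include <assert.h>
#include <stddef.h>

/* Same expansion that decode.c applies to regmap.h, shown for one row. */
#define DEF_REGMAP(NAME, ELEMENTS, ...) \
    static const unsigned int DECODE_REGISTER_##NAME[ELEMENTS] = \
    { __VA_ARGS__ };

DEF_REGMAP(R_16, 16, 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23)

/*
 * Hypothetical helper (an assumption for illustration only): translate a
 * 4-bit encoded field into the architectural register number.
 */
static unsigned int map_r_16(unsigned int field)
{
    return DECODE_REGISTER_R_16[field & 0xf];
}
```

With this table, encoded values 0-7 select R0-R7 and encoded values 8-15
select R16-R23.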

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/regmap.h | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)
 create mode 100644 target/hexagon/regmap.h

diff --git a/target/hexagon/regmap.h b/target/hexagon/regmap.h
new file mode 100644
index 0000000..2bcc0de
--- /dev/null
+++ b/target/hexagon/regmap.h
@@ -0,0 +1,38 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ *  Certain operand types represent a non-contiguous set of values.
+ *  For example, the compound compare-and-jump instruction can only access
+ *  registers R0-R7 and R16-R23.
+ *  This table represents the mapping from the encoding to the actual values.
+ */
+
+#ifndef HEXAGON_REGMAP_H
+#define HEXAGON_REGMAP_H
+
+        /* Name   Num Table */
+DEF_REGMAP(R_16,  16, 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23)
+DEF_REGMAP(R__8,  8,  0, 2, 4, 6, 16, 18, 20, 22)
+DEF_REGMAP(R__4,  4,  0, 2, 4, 6)
+DEF_REGMAP(R_4,   4,  0, 1, 2, 3)
+DEF_REGMAP(R_8S,  8,  0, 1, 2, 3, 16, 17, 18, 19)
+DEF_REGMAP(R_8,   8,  0, 1, 2, 3, 4, 5, 6, 7)
+DEF_REGMAP(V__8,  8,  0, 4, 8, 12, 16, 20, 24, 28)
+DEF_REGMAP(V__16, 16, 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30)
+
+#endif
-- 
2.7.4



* [RFC PATCH v3 14/34] Hexagon (target/hexagon) instruction/packet decode
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (12 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 13/34] Hexagon (target/hexagon) register map Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 15:06   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 15/34] Hexagon (target/hexagon) instruction printing Taylor Simpson
                   ` (21 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Take the words from instruction memory and build a packet_t for TCG code
generation.

The following operations are performed:
    Convert the .new encoded offset to the register number of the producer
    Reorder the packet so the .new producer is before the consumer
    Apply constant extenders
    Separate subinsns into two instructions
    Break compare-jumps into two instructions
    Create instructions for :endloop
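
Two of the steps above reduce to small pieces of arithmetic, sketched
below under stated assumptions. Both helper names are hypothetical and do
not appear in the patch. The first mirrors the def_idx computation in
decode_fill_newvalue_regno(): the N-field holds the distance back to the
producer, with the LSB shifted off. The second mirrors apply_extender():
it assumes the extender immediate is already positioned in the upper bits,
so only the low 6 bits of the base immediate are kept.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the packet index of the instruction that
 * produces a .new value, or -1 if the N-field points outside the packet.
 */
static int newvalue_producer_index(int consumer_idx, unsigned int n_field,
                                   int num_insns)
{
    /* Drop the LSB (odd/even register indicator), then step back. */
    int def_idx = consumer_idx - (int)(n_field >> 1);

    if (def_idx < 0 || def_idx > num_insns - 1) {
        return -1;
    }
    return def_idx;
}

/*
 * Hypothetical helper: combine an extender value with the low 6 bits of
 * the base immediate, as fZXTN(6, 32, base_immed) does in the patch.
 */
static uint32_t extend_immediate(uint32_t extender, uint32_t base_immed)
{
    return extender | (base_immed & 0x3f);
}
```

For example, a consumer at index 2 with an N-field of 2 refers to the
instruction at index 1.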

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/decode.h     |  39 +++
 target/hexagon/decode.c     | 593 ++++++++++++++++++++++++++++++++++++++++++++
 target/hexagon/q6v_decode.c | 373 ++++++++++++++++++++++++++++
 3 files changed, 1005 insertions(+)
 create mode 100644 target/hexagon/decode.h
 create mode 100644 target/hexagon/decode.c
 create mode 100644 target/hexagon/q6v_decode.c

diff --git a/target/hexagon/decode.h b/target/hexagon/decode.h
new file mode 100644
index 0000000..7f63b1c
--- /dev/null
+++ b/target/hexagon/decode.h
@@ -0,0 +1,39 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_DECODE_H
+#define HEXAGON_DECODE_H
+
+#include "cpu.h"
+#include "opcodes.h"
+#include "hex_arch_types.h"
+#include "insn.h"
+
+extern void decode_init(void);
+
+static inline int is_packet_end(uint32_t word)
+{
+    uint32_t bits = (word >> 14) & 0x3;
+    return ((bits == 0x3) || (bits == 0x0));
+}
+
+extern void decode_send_insn_to(packet_t *packet, int start, int newloc);
+
+extern packet_t *decode_this(int max_words, size4u_t *words,
+                             packet_t *decode_pkt);
+
+#endif
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
new file mode 100644
index 0000000..2e882bc
--- /dev/null
+++ b/target/hexagon/decode.c
@@ -0,0 +1,593 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "iclass.h"
+#include "opcodes.h"
+#include "genptr.h"
+#include "decode.h"
+#include "insn.h"
+#include "printinsn.h"
+
+#define fZXTN(N, M, VAL) ((VAL) & ((1LL << (N)) - 1))
+
+enum {
+    EXT_IDX_noext = 0,
+    EXT_IDX_noext_AFTER = 4,
+    EXT_IDX_mmvec = 4,
+    EXT_IDX_mmvec_AFTER = 8,
+    XX_LAST_EXT_IDX
+};
+
+#define DEF_REGMAP(NAME, ELEMENTS, ...) \
+    static const unsigned int DECODE_REGISTER_##NAME[ELEMENTS] = \
+    { __VA_ARGS__ };
+#include "regmap.h"
+
+#define DECODE_MAPPED_REG(REGNO, NAME) \
+    insn->regno[REGNO] = DECODE_REGISTER_##NAME[insn->regno[REGNO]];
+
+typedef struct {
+    struct _dectree_table_struct *table_link;
+    struct _dectree_table_struct *table_link_b;
+    opcode_t opcode;
+    enum {
+        DECTREE_ENTRY_INVALID,
+        DECTREE_TABLE_LINK,
+        DECTREE_SUBINSNS,
+        DECTREE_EXTSPACE,
+        DECTREE_TERMINAL
+    } type;
+} dectree_entry_t;
+
+typedef struct _dectree_table_struct {
+    unsigned int (*lookup_function)(int startbit, int width, size4u_t opcode);
+    unsigned int size;
+    unsigned int startbit;
+    unsigned int width;
+    dectree_entry_t table[];
+} dectree_table_t;
+
+#define DECODE_NEW_TABLE(TAG, SIZE, WHATNOT) \
+    static struct _dectree_table_struct dectree_table_##TAG;
+#define TABLE_LINK(TABLE)                     /* NOTHING */
+#define TERMINAL(TAG, ENC)                    /* NOTHING */
+#define SUBINSNS(TAG, CLASSA, CLASSB, ENC)    /* NOTHING */
+#define EXTSPACE(TAG, ENC)                    /* NOTHING */
+#define INVALID()                             /* NOTHING */
+#define DECODE_END_TABLE(...)                 /* NOTHING */
+#define DECODE_MATCH_INFO(...)                /* NOTHING */
+#define DECODE_LEGACY_MATCH_INFO(...)         /* NOTHING */
+#define DECODE_OPINFO(...)                    /* NOTHING */
+
+#include "dectree_generated.h"
+
+#undef DECODE_OPINFO
+#undef DECODE_MATCH_INFO
+#undef DECODE_LEGACY_MATCH_INFO
+#undef DECODE_END_TABLE
+#undef INVALID
+#undef TERMINAL
+#undef SUBINSNS
+#undef EXTSPACE
+#undef TABLE_LINK
+#undef DECODE_NEW_TABLE
+#undef DECODE_SEPARATOR_BITS
+
+#define DECODE_SEPARATOR_BITS(START, WIDTH) NULL, START, WIDTH
+#define DECODE_NEW_TABLE_HELPER(TAG, SIZE, FN, START, WIDTH) \
+    static dectree_table_t dectree_table_##TAG = { \
+        .size = SIZE, \
+        .lookup_function = FN, \
+        .startbit = START, \
+        .width = WIDTH, \
+        .table = {
+#define DECODE_NEW_TABLE(TAG, SIZE, WHATNOT) \
+    DECODE_NEW_TABLE_HELPER(TAG, SIZE, WHATNOT)
+
+#define TABLE_LINK(TABLE) \
+    { .type = DECTREE_TABLE_LINK, .table_link = &dectree_table_##TABLE },
+#define TERMINAL(TAG, ENC) \
+    { .type = DECTREE_TERMINAL, .opcode = TAG  },
+#define SUBINSNS(TAG, CLASSA, CLASSB, ENC) \
+    { \
+        .type = DECTREE_SUBINSNS, \
+        .table_link = &dectree_table_DECODE_SUBINSN_##CLASSA, \
+        .table_link_b = &dectree_table_DECODE_SUBINSN_##CLASSB \
+    },
+#define EXTSPACE(TAG, ENC) { .type = DECTREE_EXTSPACE },
+#define INVALID() { .type = DECTREE_ENTRY_INVALID, .opcode = XX_LAST_OPCODE },
+
+#define DECODE_END_TABLE(...) } };
+
+#define DECODE_MATCH_INFO(...)                /* NOTHING */
+#define DECODE_LEGACY_MATCH_INFO(...)         /* NOTHING */
+#define DECODE_OPINFO(...)                    /* NOTHING */
+
+#include "dectree_generated.h"
+
+#undef DECODE_OPINFO
+#undef DECODE_MATCH_INFO
+#undef DECODE_LEGACY_MATCH_INFO
+#undef DECODE_END_TABLE
+#undef INVALID
+#undef TERMINAL
+#undef SUBINSNS
+#undef EXTSPACE
+#undef TABLE_LINK
+#undef DECODE_NEW_TABLE
+#undef DECODE_NEW_TABLE_HELPER
+#undef DECODE_SEPARATOR_BITS
+
+static dectree_table_t dectree_table_DECODE_EXT_EXT_noext = {
+    .size = 1, .lookup_function = NULL, .startbit = 0, .width = 0,
+    .table = {
+        { .type = DECTREE_ENTRY_INVALID, .opcode = XX_LAST_OPCODE },
+    }
+};
+
+static dectree_table_t *ext_trees[XX_LAST_EXT_IDX];
+
+static void decode_ext_init(void)
+{
+    int i;
+    for (i = EXT_IDX_noext; i < EXT_IDX_noext_AFTER; i++) {
+        ext_trees[i] = &dectree_table_DECODE_EXT_EXT_noext;
+    }
+}
+
+typedef struct {
+    size4u_t mask;
+    size4u_t match;
+} decode_itable_entry_t;
+
+#define DECODE_NEW_TABLE(TAG, SIZE, WHATNOT)  /* NOTHING */
+#define TABLE_LINK(TABLE)                     /* NOTHING */
+#define TERMINAL(TAG, ENC)                    /* NOTHING */
+#define SUBINSNS(TAG, CLASSA, CLASSB, ENC)    /* NOTHING */
+#define EXTSPACE(TAG, ENC)                    /* NOTHING */
+#define INVALID()                             /* NOTHING */
+#define DECODE_END_TABLE(...)                 /* NOTHING */
+#define DECODE_OPINFO(...)                    /* NOTHING */
+
+#define DECODE_MATCH_INFO_NORMAL(TAG, MASK, MATCH) \
+    [TAG] = { \
+        .mask = MASK, \
+        .match = MATCH, \
+    },
+
+#define DECODE_MATCH_INFO_NULL(TAG, MASK, MATCH) \
+    [TAG] = { .match = ~0 },
+
+#define DECODE_MATCH_INFO(...) DECODE_MATCH_INFO_NORMAL(__VA_ARGS__)
+#define DECODE_LEGACY_MATCH_INFO(...) /* NOTHING */
+
+static const decode_itable_entry_t decode_itable[XX_LAST_OPCODE] = {
+#include "dectree_generated.h"
+};
+
+#undef DECODE_MATCH_INFO
+#define DECODE_MATCH_INFO(...) DECODE_MATCH_INFO_NULL(__VA_ARGS__)
+
+#undef DECODE_LEGACY_MATCH_INFO
+#define DECODE_LEGACY_MATCH_INFO(...) DECODE_MATCH_INFO_NORMAL(__VA_ARGS__)
+
+static const decode_itable_entry_t decode_legacy_itable[XX_LAST_OPCODE] = {
+#include "dectree_generated.h"
+};
+
+#undef DECODE_OPINFO
+#undef DECODE_MATCH_INFO
+#undef DECODE_LEGACY_MATCH_INFO
+#undef DECODE_END_TABLE
+#undef INVALID
+#undef TERMINAL
+#undef SUBINSNS
+#undef EXTSPACE
+#undef TABLE_LINK
+#undef DECODE_NEW_TABLE
+#undef DECODE_SEPARATOR_BITS
+
+void decode_init(void)
+{
+    decode_ext_init();
+}
+
+void decode_send_insn_to(packet_t *packet, int start, int newloc)
+{
+    insn_t tmpinsn;
+    int direction;
+    int i;
+    if (start == newloc) {
+        return;
+    }
+    if (start < newloc) {
+        /* Move towards end */
+        direction = 1;
+    } else {
+        /* move towards beginning */
+        direction = -1;
+    }
+    for (i = start; i != newloc; i += direction) {
+        tmpinsn = packet->insn[i];
+        packet->insn[i] = packet->insn[i + direction];
+        packet->insn[i + direction] = tmpinsn;
+    }
+}
+
+/* Fill newvalue registers with the correct regno */
+static int
+decode_fill_newvalue_regno(packet_t *packet)
+{
+    int i, use_regidx, def_idx;
+    size2u_t def_opcode, use_opcode;
+    char *dststr;
+
+    for (i = 1; i < packet->num_insns; i++) {
+        if (GET_ATTRIB(packet->insn[i].opcode, A_DOTNEWVALUE) &&
+            !GET_ATTRIB(packet->insn[i].opcode, A_EXTENSION)) {
+            use_opcode = packet->insn[i].opcode;
+
+            /* It's a store, so we're adjusting the Nt field */
+            if (GET_ATTRIB(use_opcode, A_STORE)) {
+                use_regidx = strchr(opcode_reginfo[use_opcode], 't') -
+                    opcode_reginfo[use_opcode];
+            } else {    /* It's a Jump, so we're adjusting the Ns field */
+                use_regidx = strchr(opcode_reginfo[use_opcode], 's') -
+                    opcode_reginfo[use_opcode];
+            }
+
+            /*
+             * The N-field encodes the offset to the instruction producing
+             * the value.  Shift off the LSB which indicates odd/even register.
+             */
+            def_idx = i - ((packet->insn[i].regno[use_regidx]) >> 1);
+
+            /*
+             * Check for a badly encoded N-field which points to an instruction
+             * out-of-range
+             */
+            if ((def_idx < 0) || (def_idx > (packet->num_insns - 1))) {
+                g_assert_not_reached();
+                return 1;
+            }
+
+            /* previous insn is the producer */
+            def_opcode = packet->insn[def_idx].opcode;
+            dststr = strstr(opcode_wregs[def_opcode], "Rd");
+            if (dststr) {
+                dststr = strchr(opcode_reginfo[def_opcode], 'd');
+            } else {
+                dststr = strstr(opcode_wregs[def_opcode], "Rx");
+                if (dststr) {
+                    dststr = strchr(opcode_reginfo[def_opcode], 'x');
+                } else {
+                    dststr = strstr(opcode_wregs[def_opcode], "Re");
+                    if (dststr) {
+                        dststr = strchr(opcode_reginfo[def_opcode], 'e');
+                    } else {
+                        dststr = strstr(opcode_wregs[def_opcode], "Ry");
+                        if (dststr) {
+                            dststr = strchr(opcode_reginfo[def_opcode], 'y');
+                        } else {
+                            g_assert_not_reached();
+                            return 1;
+                        }
+                    }
+                }
+            }
+            g_assert(dststr != NULL);
+
+            /* Now patch up the consumer with the register number */
+            packet->insn[i].regno[use_regidx] =
+                packet->insn[def_idx].regno[dststr -
+                    opcode_reginfo[def_opcode]];
+            /*
+             * We need to remember who produces this value to later
+             * check if it was dynamically cancelled
+             */
+            packet->insn[i].new_value_producer_slot =
+                packet->insn[def_idx].slot;
+        }
+    }
+    return 0;
+}
+
+/* Split CJ into a compare and a jump */
+static int decode_split_cmpjump(packet_t *pkt)
+{
+    int last, i;
+    int numinsns = pkt->num_insns;
+
+    /*
+     * First, split all compare-jumps.
+     * The compare is sent to the end as a new instruction.
+     * Do it this way so we don't reorder dual jumps. Those need to stay in
+     * original order.
+     */
+    for (i = 0; i < numinsns; i++) {
+        /* It's a cmp-jump */
+        if (GET_ATTRIB(pkt->insn[i].opcode, A_NEWCMPJUMP)) {
+            last = pkt->num_insns;
+            pkt->insn[last] = pkt->insn[i];    /* copy the instruction */
+            pkt->insn[last].part1 = 1;    /* last instruction does the CMP */
+            pkt->insn[i].part1 = 0;    /* existing instruction does the JUMP */
+            pkt->num_insns++;
+        }
+    }
+
+    /* Now re-shuffle all the compares back to the beginning */
+    for (i = 0; i < pkt->num_insns; i++) {
+        if (pkt->insn[i].part1) {
+            decode_send_insn_to(pkt, i, 0);
+        }
+    }
+    return 0;
+}
+
+static inline int decode_opcode_can_jump(int opcode)
+{
+    if ((GET_ATTRIB(opcode, A_JUMP)) ||
+        (GET_ATTRIB(opcode, A_CALL)) ||
+        (opcode == J2_trap0)) {
+        /* Exception to A_JUMP attribute */
+        if (opcode == J4_hintjumpr) {
+            return 0;
+        }
+        return 1;
+    }
+
+    return 0;
+}
+
+static inline int decode_opcode_ends_loop(int opcode)
+{
+    return GET_ATTRIB(opcode, A_HWLOOP0_END) ||
+           GET_ATTRIB(opcode, A_HWLOOP1_END);
+}
+
+/* Set the is_* fields in each instruction */
+static int decode_set_insn_attr_fields(packet_t *pkt)
+{
+    int i;
+    int numinsns = pkt->num_insns;
+    size2u_t opcode;
+    int canjump;
+
+    pkt->pkt_has_cof = 0;
+    pkt->pkt_has_endloop = 0;
+    pkt->pkt_has_dczeroa = 0;
+
+    for (i = 0; i < numinsns; i++) {
+        opcode = pkt->insn[i].opcode;
+        if (pkt->insn[i].part1) {
+            continue;    /* Skip compare of cmp-jumps */
+        }
+
+        if (GET_ATTRIB(opcode, A_DCZEROA)) {
+            pkt->pkt_has_dczeroa = 1;
+        }
+
+        if (GET_ATTRIB(opcode, A_STORE)) {
+            if (pkt->insn[i].slot == 0) {
+                pkt->pkt_has_store_s0 = 1;
+            } else {
+                pkt->pkt_has_store_s1 = 1;
+            }
+        }
+
+        canjump = decode_opcode_can_jump(opcode);
+        pkt->pkt_has_cof |= canjump;
+
+        pkt->insn[i].is_endloop = decode_opcode_ends_loop(opcode);
+
+        pkt->pkt_has_endloop |= pkt->insn[i].is_endloop;
+
+        pkt->pkt_has_cof |= pkt->pkt_has_endloop;
+    }
+
+    return 0;
+}
+
+/*
+ * Shuffle for execution
+ * Move stores to end (in same order as encoding)
+ * Move compares to beginning (for use by .new insns)
+ */
+static int decode_shuffle_for_execution(packet_t *packet)
+{
+    int changed = 0;
+    int i;
+    int flag;    /* flag means we've seen a non-memory instruction */
+    int n_mems;
+    int last_insn = packet->num_insns - 1;
+
+    /*
+     * Skip end loops, somehow an end loop is getting in and messing
+     * up the order
+     */
+    if (decode_opcode_ends_loop(packet->insn[last_insn].opcode)) {
+        last_insn--;
+    }
+
+    do {
+        changed = 0;
+        /*
+         * Stores go last, must not reorder.
+         * Cannot shuffle stores past loads, either.
+         * Iterate backwards.  If we see a non-memory instruction,
+         * then a store, shuffle the store to the end.  Don't shuffle
+         * stores wrt each other or a load.
+         */
+        for (flag = n_mems = 0, i = last_insn; i >= 0; i--) {
+            int opcode = packet->insn[i].opcode;
+
+            if (flag && GET_ATTRIB(opcode, A_STORE)) {
+                decode_send_insn_to(packet, i, last_insn - n_mems);
+                n_mems++;
+                changed = 1;
+            } else if (GET_ATTRIB(opcode, A_STORE)) {
+                n_mems++;
+            } else if (GET_ATTRIB(opcode, A_LOAD)) {
+                /*
+                 * Don't set flag, since we don't want to shuffle a
+                 * store past a load
+                 */
+                n_mems++;
+            } else if (GET_ATTRIB(opcode, A_DOTNEWVALUE)) {
+                /*
+                 * Don't set flag, since we don't want to shuffle past
+                 * a .new value
+                 */
+            } else {
+                flag = 1;
+            }
+        }
+
+        if (changed) {
+            continue;
+        }
+        /* Compares go first, may be reordered wrt each other */
+        for (flag = 0, i = 0; i < last_insn + 1; i++) {
+            int opcode = packet->insn[i].opcode;
+
+            if ((strstr(opcode_wregs[opcode], "Pd4") ||
+                 strstr(opcode_wregs[opcode], "Pe4")) &&
+                GET_ATTRIB(opcode, A_STORE) == 0) {
+                /* This should be a compare (not a store conditional) */
+                if (flag) {
+                    decode_send_insn_to(packet, i, 0);
+                    changed = 1;
+                    continue;
+                }
+            } else if (GET_ATTRIB(opcode, A_IMPLICIT_WRITES_P3) &&
+                       !decode_opcode_ends_loop(packet->insn[i].opcode)) {
+                /*
+                 * spNloop instruction
+                 * Don't reorder endloops; they are not valid for .new uses,
+                 * and we want to match HW
+                 */
+                if (flag) {
+                    decode_send_insn_to(packet, i, 0);
+                    changed = 1;
+                    continue;
+                }
+            } else if (GET_ATTRIB(opcode, A_IMPLICIT_WRITES_P0) &&
+                       !GET_ATTRIB(opcode, A_NEWCMPJUMP)) {
+                if (flag) {
+                    decode_send_insn_to(packet, i, 0);
+                    changed = 1;
+                    continue;
+                }
+            } else {
+                flag = 1;
+            }
+        }
+        if (changed) {
+            continue;
+        }
+    } while (changed);
+
+    /*
+     * If we have a .new register compare/branch, move that to the very
+     * end, past stores
+     */
+    for (i = 0; i < last_insn; i++) {
+        if (GET_ATTRIB(packet->insn[i].opcode, A_DOTNEWVALUE)) {
+            decode_send_insn_to(packet, i, last_insn);
+            break;
+        }
+    }
+
+    return 0;
+}
+
+static int
+apply_extender(packet_t *pkt, int i, size4u_t extender)
+{
+    int immed_num;
+    size4u_t base_immed;
+
+    immed_num = opcode_which_immediate_is_extended(pkt->insn[i].opcode);
+    base_immed = pkt->insn[i].immed[immed_num];
+
+    pkt->insn[i].immed[immed_num] = extender | fZXTN(6, 32, base_immed);
+    return 0;
+}
+
+static int decode_apply_extenders(packet_t *packet)
+{
+    int i;
+    for (i = 0; i < packet->num_insns; i++) {
+        if (GET_ATTRIB(packet->insn[i].opcode, A_IT_EXTENDER)) {
+            packet->insn[i + 1].extension_valid = 1;
+            apply_extender(packet, i + 1, packet->insn[i].immed[0]);
+        }
+    }
+    return 0;
+}
+
+static int decode_remove_extenders(packet_t *packet)
+{
+    int i, j;
+    for (i = 0; i < packet->num_insns; i++) {
+        if (GET_ATTRIB(packet->insn[i].opcode, A_IT_EXTENDER)) {
+            for (j = i;
+                (j < packet->num_insns - 1) && (j < INSTRUCTIONS_MAX - 1);
+                j++) {
+                packet->insn[j] = packet->insn[j + 1];
+            }
+            packet->num_insns--;
+        }
+    }
+    return 0;
+}
+
+static const char *
+get_valid_slot_str(const packet_t *pkt, unsigned int slot)
+{
+    return find_iclass_slots(pkt->insn[slot].opcode,
+                             pkt->insn[slot].iclass);
+}
+
+#include "q6v_decode.c"
+
+packet_t *decode_this(int max_words, size4u_t *words, packet_t *decode_pkt)
+{
+    int ret;
+    ret = do_decode_packet(max_words, words, decode_pkt);
+    if (ret <= 0) {
+        /* ERROR or BAD PARSE */
+        return NULL;
+    }
+    return decode_pkt;
+}
+
+/* Used for "-d in_asm" logging */
+int disassemble_hexagon(uint32_t *words, int nwords, char *buf, int bufsize)
+{
+    packet_t pkt;
+
+    if (decode_this(nwords, words, &pkt)) {
+        snprint_a_pkt(buf, bufsize, &pkt);
+        return pkt.encod_pkt_size_in_bytes;
+    } else {
+        snprintf(buf, bufsize, "<invalid>");
+        return 0;
+    }
+}
diff --git a/target/hexagon/q6v_decode.c b/target/hexagon/q6v_decode.c
new file mode 100644
index 0000000..c108ac2
--- /dev/null
+++ b/target/hexagon/q6v_decode.c
@@ -0,0 +1,373 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define DECODE_NEW_TABLE(TAG, SIZE, WHATNOT)     /* NOTHING */
+#define TABLE_LINK(TABLE)                        /* NOTHING */
+#define TERMINAL(TAG, ENC)                       /* NOTHING */
+#define SUBINSNS(TAG, CLASSA, CLASSB, ENC)       /* NOTHING */
+#define EXTSPACE(TAG, ENC)                       /* NOTHING */
+#define INVALID()                                /* NOTHING */
+#define DECODE_END_TABLE(...)                    /* NOTHING */
+#define DECODE_MATCH_INFO(...)                   /* NOTHING */
+#define DECODE_LEGACY_MATCH_INFO(...)            /* NOTHING */
+
+#define DECODE_REG(REGNO, WIDTH, STARTBIT) \
+    insn->regno[REGNO] = ((encoding >> STARTBIT) & ((1 << WIDTH) - 1));
+
+#define DECODE_IMPL_REG(REGNO, VAL) \
+    insn->regno[REGNO] = VAL;
+
+#define DECODE_IMM(IMMNO, WIDTH, STARTBIT, VALSTART) \
+    insn->immed[IMMNO] |= (((encoding >> STARTBIT) & ((1 << WIDTH) - 1))) << \
+                          (VALSTART);
+
+#define DECODE_IMM_SXT(IMMNO, WIDTH) \
+    insn->immed[IMMNO] = ((((size4s_t)insn->immed[IMMNO]) << (32 - WIDTH)) >> \
+                          (32 - WIDTH));
+
+#define DECODE_IMM_NEG(IMMNO, WIDTH) \
+    insn->immed[IMMNO] = -insn->immed[IMMNO];
+
+#define DECODE_IMM_SHIFT(IMMNO, SHAMT)                                 \
+    if ((!insn->extension_valid) || \
+        (insn->which_extended != IMMNO)) { \
+        insn->immed[IMMNO] <<= SHAMT; \
+    }
+
+#define DECODE_OPINFO(TAG, BEH) \
+    case TAG: \
+        { BEH  } \
+        break; \
+
+static void
+decode_op(insn_t *insn, opcode_t tag, size4u_t encoding)
+{
+    insn->immed[0] = 0;
+    insn->immed[1] = 0;
+    insn->opcode = tag;
+    if (insn->extension_valid) {
+        insn->which_extended = opcode_which_immediate_is_extended(tag);
+    }
+
+    switch (tag) {
+#include "dectree_generated.h"
+    default:
+        break;
+    }
+
+    insn->generate = opcode_genptr[tag];
+    insn->iclass = (encoding >> 28) & 0xf;
+    if (((encoding >> 14) & 3) == 0) {
+        insn->iclass += 16;
+    }
+}
+
+#undef DECODE_REG
+#undef DECODE_IMPL_REG
+#undef DECODE_IMM
+#undef DECODE_IMM_SHIFT
+#undef DECODE_OPINFO
+#undef DECODE_MATCH_INFO
+#undef DECODE_LEGACY_MATCH_INFO
+#undef DECODE_END_TABLE
+#undef INVALID
+#undef TERMINAL
+#undef SUBINSNS
+#undef EXTSPACE
+#undef TABLE_LINK
+#undef DECODE_NEW_TABLE
+#undef DECODE_SEPARATOR_BITS
+
+static unsigned int
+decode_subinsn_tablewalk(insn_t *insn, dectree_table_t *table,
+                         size4u_t encoding)
+{
+    unsigned int i;
+    opcode_t opc;
+    if (table->lookup_function) {
+        i = table->lookup_function(table->startbit, table->width, encoding);
+    } else {
+        i = ((encoding >> table->startbit) & ((1 << table->width) - 1));
+    }
+    if (table->table[i].type == DECTREE_TABLE_LINK) {
+        return decode_subinsn_tablewalk(insn, table->table[i].table_link,
+                                        encoding);
+    } else if (table->table[i].type == DECTREE_TERMINAL) {
+        opc = table->table[i].opcode;
+        if ((encoding & decode_itable[opc].mask) != decode_itable[opc].match) {
+            return 0;
+        }
+        decode_op(insn, opc, encoding);
+        return 1;
+    } else {
+        return 0;
+    }
+}
+
+static unsigned int get_insn_a(size4u_t encoding)
+{
+    return encoding & 0x00001fff;
+}
+
+static unsigned int get_insn_b(size4u_t encoding)
+{
+    return (encoding >> 16) & 0x00001fff;
+}
+
+static unsigned int
+decode_insns_tablewalk(insn_t *insn, dectree_table_t *table, size4u_t encoding)
+{
+    unsigned int i;
+    unsigned int a, b;
+    opcode_t opc;
+    if (table->lookup_function) {
+        i = table->lookup_function(table->startbit, table->width, encoding);
+    } else {
+        i = ((encoding >> table->startbit) & ((1 << table->width) - 1));
+    }
+    if (table->table[i].type == DECTREE_TABLE_LINK) {
+        return decode_insns_tablewalk(insn, table->table[i].table_link,
+                                      encoding);
+    } else if (table->table[i].type == DECTREE_SUBINSNS) {
+        a = get_insn_a(encoding);
+        b = get_insn_b(encoding);
+        b = decode_subinsn_tablewalk(insn, table->table[i].table_link_b, b);
+        a = decode_subinsn_tablewalk(insn + 1, table->table[i].table_link, a);
+        if ((a == 0) || (b == 0)) {
+            return 0;
+        }
+        return 2;
+    } else if (table->table[i].type == DECTREE_TERMINAL) {
+        opc = table->table[i].opcode;
+        if ((encoding & decode_itable[opc].mask) != decode_itable[opc].match) {
+            if ((encoding & decode_legacy_itable[opc].mask) !=
+                decode_legacy_itable[opc].match) {
+                return 0;
+            }
+        }
+        decode_op(insn, opc, encoding);
+        return 1;
+    } else {
+        return 0;
+    }
+}
+
+static unsigned int
+decode_insns(insn_t *insn, size4u_t encoding)
+{
+    dectree_table_t *table;
+    if ((encoding & 0x0000c000) != 0) {
+        /* Start with PP table */
+        table = &dectree_table_DECODE_ROOT_32;
+    } else {
+        /* Start with EE table */
+        table = &dectree_table_DECODE_ROOT_EE;
+    }
+    return decode_insns_tablewalk(insn, table, encoding);
+}
+
+static void decode_add_endloop_insn(insn_t *insn, int loopnum)
+{
+    if (loopnum == 10) {
+        insn->opcode = J2_endloop01;
+        insn->generate = opcode_genptr[J2_endloop01];
+    } else if (loopnum == 1) {
+        insn->opcode = J2_endloop1;
+        insn->generate = opcode_genptr[J2_endloop1];
+    } else {
+        insn->opcode = J2_endloop0;
+        insn->generate = opcode_genptr[J2_endloop0];
+    }
+}
+
+static inline int decode_parsebits_is_end(size4u_t encoding32)
+{
+    size4u_t bits = (encoding32 >> 14) & 0x3;
+    return ((bits == 0x3) || (bits == 0x0));
+}
+
+static inline int decode_parsebits_is_loopend(size4u_t encoding32)
+{
+    size4u_t bits = (encoding32 >> 14) & 0x3;
+    return ((bits == 0x2));
+}
+
+static int
+decode_set_slot_number(packet_t *pkt)
+{
+    int slot;
+    int i;
+    int hit_mem_insn = 0;
+    int hit_duplex = 0;
+    const char *valid_slot_str;
+
+    for (i = 0, slot = 3; i < pkt->num_insns; i++) {
+        valid_slot_str = get_valid_slot_str(pkt, i);
+
+        while (strchr(valid_slot_str, '0' + slot) == NULL) {
+            slot--;
+        }
+        pkt->insn[i].slot = slot;
+        if (slot) {
+            /* I've assigned the slot, now decrement it for the next insn */
+            slot--;
+        }
+    }
+
+    /* Fix the exceptions - mem insns to slot 0,1 */
+    for (i = pkt->num_insns - 1; i >= 0; i--) {
+
+        /* First memory instruction always goes to slot 0 */
+        if ((GET_ATTRIB(pkt->insn[i].opcode, A_MEMLIKE) ||
+             GET_ATTRIB(pkt->insn[i].opcode, A_MEMLIKE_PACKET_RULES)) &&
+            !hit_mem_insn) {
+            hit_mem_insn = 1;
+            pkt->insn[i].slot = 0;
+            continue;
+        }
+
+        /* Next memory instruction always goes to slot 1 */
+        if ((GET_ATTRIB(pkt->insn[i].opcode, A_MEMLIKE) ||
+             GET_ATTRIB(pkt->insn[i].opcode, A_MEMLIKE_PACKET_RULES)) &&
+            hit_mem_insn) {
+            pkt->insn[i].slot = 1;
+        }
+    }
+
+    /* Fix the exceptions - duplex always slot 0,1 */
+    for (i = pkt->num_insns - 1; i >= 0; i--) {
+
+        /* First subinsn always goes to slot 0 */
+        if (GET_ATTRIB(pkt->insn[i].opcode, A_SUBINSN) && !hit_duplex) {
+            hit_duplex = 1;
+            pkt->insn[i].slot = 0;
+            continue;
+        }
+
+        /* Next subinsn always goes to slot 1 */
+        if (GET_ATTRIB(pkt->insn[i].opcode, A_SUBINSN) && hit_duplex) {
+            pkt->insn[i].slot = 1;
+        }
+    }
+
+    /* Fix the exceptions - slot 1 is never empty, always aligns to slot 0 */
+    {
+        int slot0_found = 0;
+        int slot1_found = 0;
+        int slot1_iidx = 0;
+        for (i = pkt->num_insns - 1; i >= 0; i--) {
+            /* Is slot0 used? */
+            if (pkt->insn[i].slot == 0) {
+                int is_endloop = (pkt->insn[i].opcode == J2_endloop01);
+                is_endloop |= (pkt->insn[i].opcode == J2_endloop0);
+                is_endloop |= (pkt->insn[i].opcode == J2_endloop1);
+
+                /*
+                 * Make sure it's not endloop, since we're overloading
+                 * slot0 for endloop
+                 */
+                if (!is_endloop) {
+                    slot0_found = 1;
+                }
+            }
+            /* Is slot1 used? */
+            if (pkt->insn[i].slot == 1) {
+                slot1_found = 1;
+                slot1_iidx = i;
+            }
+        }
+        /* Is slot0 empty and slot1 used? */
+        if ((slot0_found == 0) && (slot1_found == 1)) {
+            /* Then push it to slot0 */
+            pkt->insn[slot1_iidx].slot = 0;
+        }
+    }
+    return 0;
+}
+
+/*
+ * do_decode_packet
+ * Decodes packet with given words
+ * Returns negative on error, 0 on insufficient words,
+ * and number of words used on success
+ */
+
+static int do_decode_packet(int max_words, const size4u_t *words, packet_t *pkt)
+{
+    int num_insns = 0;
+    int words_read = 0;
+    int end_of_packet = 0;
+    int new_insns = 0;
+    int errors = 0;
+    size4u_t encoding32;
+
+    /* Initialize */
+    memset(pkt, 0, sizeof(*pkt));
+    /* Try to build packet */
+    while (!end_of_packet && (words_read < max_words)) {
+        encoding32 = words[words_read];
+        end_of_packet = decode_parsebits_is_end(encoding32);
+        new_insns = decode_insns(&pkt->insn[num_insns], encoding32);
+        /*
+         * If we saw an extender, mark next word extended so immediate
+         * decode works
+         */
+        if (pkt->insn[num_insns].opcode == A4_ext) {
+            pkt->insn[num_insns + 1].extension_valid = 1;
+        }
+        num_insns += new_insns;
+        words_read++;
+    }
+
+    pkt->num_insns = num_insns;
+    if (!end_of_packet) {
+        /* Ran out of words! */
+        return 0;
+    }
+    pkt->encod_pkt_size_in_bytes = words_read * 4;
+
+    /* Shuffle / split / reorder for execution */
+    if ((words_read == 2) && (decode_parsebits_is_loopend(words[0]))) {
+        decode_add_endloop_insn(&pkt->insn[pkt->num_insns++], 0);
+    }
+    if (words_read >= 3) {
+        size4u_t has_loop0, has_loop1;
+        has_loop0 = decode_parsebits_is_loopend(words[0]);
+        has_loop1 = decode_parsebits_is_loopend(words[1]);
+        if (has_loop0 && has_loop1) {
+            decode_add_endloop_insn(&pkt->insn[pkt->num_insns++], 10);
+        } else if (has_loop1) {
+            decode_add_endloop_insn(&pkt->insn[pkt->num_insns++], 1);
+        } else if (has_loop0) {
+            decode_add_endloop_insn(&pkt->insn[pkt->num_insns++], 0);
+        }
+    }
+
+    errors += decode_apply_extenders(pkt);
+    errors += decode_remove_extenders(pkt);
+    errors += decode_set_slot_number(pkt);
+    errors += decode_fill_newvalue_regno(pkt);
+
+    errors += decode_shuffle_for_execution(pkt);
+    errors += decode_split_cmpjump(pkt);
+    errors += decode_set_insn_attr_fields(pkt);
+    if (errors) {
+        return -1;
+    }
+
+    return words_read;
+}
-- 
2.7.4



* [RFC PATCH v3 15/34] Hexagon (target/hexagon) instruction printing
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (13 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 14/34] Hexagon (target/hexagon) instruction/packet decode Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 15:08   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 16/34] Hexagon (target/hexagon) utility functions Taylor Simpson
                   ` (20 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/printinsn.h | 26 +++++++++++++
 target/hexagon/printinsn.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 120 insertions(+)
 create mode 100644 target/hexagon/printinsn.h
 create mode 100644 target/hexagon/printinsn.c

diff --git a/target/hexagon/printinsn.h b/target/hexagon/printinsn.h
new file mode 100644
index 0000000..264b63c
--- /dev/null
+++ b/target/hexagon/printinsn.h
@@ -0,0 +1,26 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_PRINTINSN_H
+#define HEXAGON_PRINTINSN_H
+
+#include "qemu/osdep.h"
+#include "insn.h"
+
+extern void snprint_a_pkt(char *buf, int n, packet_t *pkt);
+
+#endif
diff --git a/target/hexagon/printinsn.c b/target/hexagon/printinsn.c
new file mode 100644
index 0000000..b6cffd6
--- /dev/null
+++ b/target/hexagon/printinsn.c
@@ -0,0 +1,94 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "opcodes.h"
+#include "printinsn.h"
+#include "insn.h"
+#include "reg_fields.h"
+#include "internal.h"
+
+#define REGNO(NUM) (insn->regno[NUM])
+#define IMMNO(NUM) (insn->immed[NUM])
+
+static const char *sreg2str(unsigned int reg)
+{
+    if (reg < TOTAL_PER_THREAD_REGS) {
+        return hexagon_regnames[reg];
+    } else {
+        return "???";
+    }
+}
+
+static const char *creg2str(unsigned int reg)
+{
+    return sreg2str(reg + NUM_GEN_REGS);
+}
+
+static void snprintinsn(char *buf, int n, insn_t *insn)
+{
+    switch (insn->opcode) {
+#define DEF_VECX_PRINTINFO(TAG, FMT, ...) DEF_PRINTINFO(TAG, FMT, __VA_ARGS__)
+#define DEF_PRINTINFO(TAG, FMT, ...) \
+    case TAG: \
+        snprintf(buf, n, FMT, __VA_ARGS__);\
+        break;
+#include "printinsn_generated.h"
+#undef DEF_VECX_PRINTINFO
+#undef DEF_PRINTINFO
+    }
+}
+
+void snprint_a_pkt(char *buf, int n, packet_t *pkt)
+{
+    char tmpbuf[128];
+    int i, slot, opcode;
+    buf[0] = '\0';
+
+    if (pkt == NULL) {
+        snprintf(buf, n, "<printpkt: NULL ptr>");
+        return;
+    }
+
+    if (pkt->num_insns > 1) {
+        strncat(buf, "\n{\n", n - strlen(buf) - 1);
+    }
+    for (i = 0; i < pkt->num_insns; i++) {
+        if (pkt->insn[i].part1) {
+            continue;
+        }
+        snprintinsn(tmpbuf, 127, &(pkt->insn[i]));
+        strncat(buf, "\t", n - strlen(buf) - 1);
+        strncat(buf, tmpbuf, n - strlen(buf) - 1);
+        if (GET_ATTRIB(pkt->insn[i].opcode, A_SUBINSN)) {
+            strncat(buf, " //subinsn", n - strlen(buf) - 1);
+        }
+        if (pkt->insn[i].extension_valid) {
+            strncat(buf, " //constant extended", n - strlen(buf) - 1);
+        }
+        slot = pkt->insn[i].slot;
+        opcode = pkt->insn[i].opcode;
+        snprintf(tmpbuf, 127, " //slot=%d:tag=%s", slot, opcode_names[opcode]);
+        strncat(buf, tmpbuf, n - strlen(buf) - 1);
+
+        strncat(buf, "\n", n - strlen(buf) - 1);
+    }
+    if (pkt->num_insns > 1) {
+        strncat(buf, "}\n", n - strlen(buf) - 1);
+    }
+}
+
-- 
2.7.4



* [RFC PATCH v3 16/34] Hexagon (target/hexagon) utility functions
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (14 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 15/34] Hexagon (target/hexagon) instruction printing Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 15:10   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 17/34] Hexagon (target/hexagon/imported) arch import - macro definitions Taylor Simpson
                   ` (19 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Utility functions called by various instructions

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/arch.h     |  42 +++
 target/hexagon/conv_emu.h |  50 +++
 target/hexagon/fma_emu.h  |  27 ++
 target/hexagon/arch.c     | 354 +++++++++++++++++++++
 target/hexagon/conv_emu.c | 369 ++++++++++++++++++++++
 target/hexagon/fma_emu.c  | 781 ++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 1623 insertions(+)
 create mode 100644 target/hexagon/arch.h
 create mode 100644 target/hexagon/conv_emu.h
 create mode 100644 target/hexagon/fma_emu.h
 create mode 100644 target/hexagon/arch.c
 create mode 100644 target/hexagon/conv_emu.c
 create mode 100644 target/hexagon/fma_emu.c

diff --git a/target/hexagon/arch.h b/target/hexagon/arch.h
new file mode 100644
index 0000000..ecf7060
--- /dev/null
+++ b/target/hexagon/arch.h
@@ -0,0 +1,42 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_ARCH_H
+#define HEXAGON_ARCH_H
+
+#include "cpu.h"
+#include "hex_arch_types.h"
+
+extern size8u_t interleave(size4u_t odd, size4u_t even);
+extern size8u_t deinterleave(size8u_t src);
+extern size4u_t carry_from_add64(size8u_t a, size8u_t b, size4u_t c);
+extern size4s_t conv_round(size4s_t a, int n);
+extern size16s_t cast8s_to_16s(size8s_t a);
+extern size8s_t cast16s_to_8s(size16s_t a);
+extern size16s_t add128(size16s_t a, size16s_t b);
+extern size16s_t sub128(size16s_t a, size16s_t b);
+extern size16s_t shiftr128(size16s_t a, size4u_t n);
+extern size16s_t shiftl128(size16s_t a, size4u_t n);
+extern size16s_t and128(size16s_t a, size16s_t b);
+extern void arch_fpop_start(CPUHexagonState *env);
+extern void arch_fpop_end(CPUHexagonState *env);
+extern void arch_raise_fpflag(unsigned int flags);
+extern int arch_sf_recip_common(size4s_t *Rs, size4s_t *Rt, size4s_t *Rd,
+                                int *adjust);
+extern int arch_sf_invsqrt_common(size4s_t *Rs, size4s_t *Rd, int *adjust);
+
+#endif
diff --git a/target/hexagon/conv_emu.h b/target/hexagon/conv_emu.h
new file mode 100644
index 0000000..50d9d2c
--- /dev/null
+++ b/target/hexagon/conv_emu.h
@@ -0,0 +1,50 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_CONV_EMU_H
+#define HEXAGON_CONV_EMU_H
+
+#include "hex_arch_types.h"
+
+extern size8u_t conv_sf_to_8u(float in);
+extern size4u_t conv_sf_to_4u(float in);
+extern size8s_t conv_sf_to_8s(float in);
+extern size4s_t conv_sf_to_4s(float in);
+
+extern size8u_t conv_df_to_8u(double in);
+extern size4u_t conv_df_to_4u(double in);
+extern size8s_t conv_df_to_8s(double in);
+extern size4s_t conv_df_to_4s(double in);
+
+extern double conv_8u_to_df(size8u_t in);
+extern double conv_4u_to_df(size4u_t in);
+extern double conv_8s_to_df(size8s_t in);
+extern double conv_4s_to_df(size4s_t in);
+
+extern float conv_8u_to_sf(size8u_t in);
+extern float conv_4u_to_sf(size4u_t in);
+extern float conv_8s_to_sf(size8s_t in);
+extern float conv_4s_to_sf(size4s_t in);
+
+extern float conv_df_to_sf(double in);
+
+static inline double conv_sf_to_df(float in)
+{
+    return in;
+}
+
+#endif
diff --git a/target/hexagon/fma_emu.h b/target/hexagon/fma_emu.h
new file mode 100644
index 0000000..181b1f6
--- /dev/null
+++ b/target/hexagon/fma_emu.h
@@ -0,0 +1,27 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_FMA_EMU_H
+#define HEXAGON_FMA_EMU_H
+
+extern float internal_fmafx(float a_in, float b_in, float c_in, int scale);
+extern float internal_fmaf(float a_in, float b_in, float c_in);
+extern float internal_mpyf(float a_in, float b_in);
+extern double internal_mpyhh(double a_in, double b_in,
+                             unsigned long long int accumulated);
+
+#endif
diff --git a/target/hexagon/arch.c b/target/hexagon/arch.c
new file mode 100644
index 0000000..0d612d2
--- /dev/null
+++ b/target/hexagon/arch.c
@@ -0,0 +1,354 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include <math.h>
+#include "fma_emu.h"
+#include "arch.h"
+#include "macros.h"
+
+#define BITS_MASK_8 0x5555555555555555ULL
+#define PAIR_MASK_8 0x3333333333333333ULL
+#define NYBL_MASK_8 0x0f0f0f0f0f0f0f0fULL
+#define BYTE_MASK_8 0x00ff00ff00ff00ffULL
+#define HALF_MASK_8 0x0000ffff0000ffffULL
+#define WORD_MASK_8 0x00000000ffffffffULL
+
+size8u_t interleave(size4u_t odd, size4u_t even)
+{
+    /* Convert to long long */
+    size8u_t myodd = odd;
+    size8u_t myeven = even;
+    /* First, spread bits out */
+    myodd = (myodd | (myodd << 16)) & HALF_MASK_8;
+    myeven = (myeven | (myeven << 16)) & HALF_MASK_8;
+    myodd = (myodd | (myodd << 8)) & BYTE_MASK_8;
+    myeven = (myeven | (myeven << 8)) & BYTE_MASK_8;
+    myodd = (myodd | (myodd << 4)) & NYBL_MASK_8;
+    myeven = (myeven | (myeven << 4)) & NYBL_MASK_8;
+    myodd = (myodd | (myodd << 2)) & PAIR_MASK_8;
+    myeven = (myeven | (myeven << 2)) & PAIR_MASK_8;
+    myodd = (myodd | (myodd << 1)) & BITS_MASK_8;
+    myeven = (myeven | (myeven << 1)) & BITS_MASK_8;
+    /* Now OR together */
+    return myeven | (myodd << 1);
+}
+
+size8u_t deinterleave(size8u_t src)
+{
+    /* Get odd and even bits */
+    size8u_t myodd = ((src >> 1) & BITS_MASK_8);
+    size8u_t myeven = (src & BITS_MASK_8);
+
+    /* Unspread bits */
+    myeven = (myeven | (myeven >> 1)) & PAIR_MASK_8;
+    myodd = (myodd | (myodd >> 1)) & PAIR_MASK_8;
+    myeven = (myeven | (myeven >> 2)) & NYBL_MASK_8;
+    myodd = (myodd | (myodd >> 2)) & NYBL_MASK_8;
+    myeven = (myeven | (myeven >> 4)) & BYTE_MASK_8;
+    myodd = (myodd | (myodd >> 4)) & BYTE_MASK_8;
+    myeven = (myeven | (myeven >> 8)) & HALF_MASK_8;
+    myodd = (myodd | (myodd >> 8)) & HALF_MASK_8;
+    myeven = (myeven | (myeven >> 16)) & WORD_MASK_8;
+    myodd = (myodd | (myodd >> 16)) & WORD_MASK_8;
+
+    /* Return odd bits in upper half */
+    return myeven | (myodd << 32);
+}
+
+size4u_t carry_from_add64(size8u_t a, size8u_t b, size4u_t c)
+{
+    size8u_t tmpa, tmpb, tmpc;
+    tmpa = fGETUWORD(0, a);
+    tmpb = fGETUWORD(0, b);
+    tmpc = tmpa + tmpb + c;
+    tmpa = fGETUWORD(1, a);
+    tmpb = fGETUWORD(1, b);
+    tmpc = tmpa + tmpb + fGETUWORD(1, tmpc);
+    tmpc = fGETUWORD(1, tmpc);
+    return tmpc;
+}
+
+size4s_t conv_round(size4s_t a, int n)
+{
+    size8s_t val;
+
+    if (n == 0) {
+        val = a;
+    } else if ((a & ((1 << (n - 1)) - 1)) == 0) {    /* N-1..0 all zero? */
+        /* Add LSB from int part */
+        val = ((fSE32_64(a)) + (size8s_t) (((size4u_t) ((1 << n) & a)) >> 1));
+    } else {
+        val = ((fSE32_64(a)) + (1 << (n - 1)));
+    }
+
+    val = val >> n;
+    return (size4s_t)val;
+}
+
+size16s_t cast8s_to_16s(size8s_t a)
+{
+    size16s_t result = {.hi = 0, .lo = 0};
+    result.lo = a;
+    if (a < 0) {
+        result.hi = -1;
+    }
+    return result;
+}
+
+size8s_t cast16s_to_8s(size16s_t a)
+{
+    return a.lo;
+}
+
+size16s_t add128(size16s_t a, size16s_t b)
+{
+    size16s_t result = {.hi = 0, .lo = 0};
+    result.lo = a.lo + b.lo;
+    result.hi = a.hi + b.hi;
+
+    if (result.lo < b.lo) {
+        result.hi++;
+    }
+
+    return result;
+}
+
+size16s_t sub128(size16s_t a, size16s_t b)
+{
+    size16s_t result = {.hi = 0, .lo = 0};
+    result.lo = a.lo - b.lo;
+    result.hi = a.hi - b.hi;
+    if (result.lo > a.lo) {
+        result.hi--;
+    }
+
+    return result;
+}
+
+size16s_t shiftr128(size16s_t a, size4u_t n)
+{
+    size16s_t result;
+    result.lo = (a.lo >> n) | (a.hi << (64 - n));
+    result.hi = a.hi >> n;
+    return result;
+}
+
+size16s_t shiftl128(size16s_t a, size4u_t n)
+{
+    size16s_t result;
+    result.lo = a.lo << n;
+    result.hi = (a.hi << n) | (a.lo >> (64 - n));
+    return result;
+}
+
+size16s_t and128(size16s_t a, size16s_t b)
+{
+    size16s_t result;
+    result.lo = a.lo & b.lo;
+    result.hi = a.hi & b.hi;
+    return result;
+}
+
+/* Floating Point Stuff */
+
+static const int roundingmodes[] = {
+    FE_TONEAREST,
+    FE_TOWARDZERO,
+    FE_DOWNWARD,
+    FE_UPWARD
+};
+
+void arch_fpop_start(CPUHexagonState *env)
+{
+    fegetenv(&env->fenv);
+    feclearexcept(FE_ALL_EXCEPT);
+    fesetround(roundingmodes[fREAD_REG_FIELD(USR, USR_FPRND)]);
+}
+
+#define NOTHING             /* Don't do anything */
+
+#define TEST_FLAG(LIBCF, MYF, MYE) \
+    do { \
+        if (fetestexcept(LIBCF)) { \
+            if (GET_USR_FIELD(USR_##MYF) == 0) { \
+                SET_USR_FIELD(USR_##MYF, 1); \
+                if (GET_USR_FIELD(USR_##MYE)) { \
+                    NOTHING \
+                } \
+            } \
+        } \
+    } while (0)
+
+void arch_fpop_end(CPUHexagonState *env)
+{
+    if (fetestexcept(FE_ALL_EXCEPT)) {
+        TEST_FLAG(FE_INEXACT, FPINPF, FPINPE);
+        TEST_FLAG(FE_DIVBYZERO, FPDBZF, FPDBZE);
+        TEST_FLAG(FE_INVALID, FPINVF, FPINVE);
+        TEST_FLAG(FE_OVERFLOW, FPOVFF, FPOVFE);
+        TEST_FLAG(FE_UNDERFLOW, FPUNFF, FPUNFE);
+    }
+    fesetenv(&env->fenv);
+}
+
+#undef TEST_FLAG
+
+
+void arch_raise_fpflag(unsigned int flags)
+{
+    feraiseexcept(flags);
+}
+
+int arch_sf_recip_common(size4s_t *Rs, size4s_t *Rt, size4s_t *Rd, int *adjust)
+{
+    int n_class;
+    int d_class;
+    int n_exp;
+    int d_exp;
+    int ret = 0;
+    size4s_t RsV, RtV, RdV;
+    int PeV = 0;
+    RsV = *Rs;
+    RtV = *Rt;
+    n_class = fpclassify(fFLOAT(RsV));
+    d_class = fpclassify(fFLOAT(RtV));
+    if ((n_class == FP_NAN) && (d_class == FP_NAN)) {
+        if (fGETBIT(22, RsV & RtV) == 0) {
+            fRAISEFLAGS(FE_INVALID);
+        }
+        RdV = RsV = RtV = fSFNANVAL();
+    } else if (n_class == FP_NAN) {
+        if (fGETBIT(22, RsV) == 0) {
+            fRAISEFLAGS(FE_INVALID);
+        }
+        RdV = RsV = RtV = fSFNANVAL();
+    } else if (d_class == FP_NAN) {
+        /* or put NaN in num/den fixup? */
+        if (fGETBIT(22, RtV) == 0) {
+            fRAISEFLAGS(FE_INVALID);
+        }
+        RdV = RsV = RtV = fSFNANVAL();
+    } else if ((n_class == FP_INFINITE) && (d_class == FP_INFINITE)) {
+        /* or put Inf in num fixup? */
+        RdV = RsV = RtV = fSFNANVAL();
+        fRAISEFLAGS(FE_INVALID);
+    } else if ((n_class == FP_ZERO) && (d_class == FP_ZERO)) {
+        /* or put zero in num fixup? */
+        RdV = RsV = RtV = fSFNANVAL();
+        fRAISEFLAGS(FE_INVALID);
+    } else if (d_class == FP_ZERO) {
+        /* or put Inf in num fixup? */
+        RsV = fSFINFVAL(RsV ^ RtV);
+        RtV = fSFONEVAL(0);
+        RdV = fSFONEVAL(0);
+        if (n_class != FP_INFINITE) {
+            fRAISEFLAGS(FE_DIVBYZERO);
+        }
+    } else if (d_class == FP_INFINITE) {
+        RsV = 0x80000000 & (RsV ^ RtV);
+        RtV = fSFONEVAL(0);
+        RdV = fSFONEVAL(0);
+    } else if (n_class == FP_ZERO) {
+        /* Does this just work itself out? */
+        /* No, 0/Inf causes problems. */
+        RsV = 0x80000000 & (RsV ^ RtV);
+        RtV = fSFONEVAL(0);
+        RdV = fSFONEVAL(0);
+    } else if (n_class == FP_INFINITE) {
+        /* Does this just work itself out? */
+        RsV = fSFINFVAL(RsV ^ RtV);
+        RtV = fSFONEVAL(0);
+        RdV = fSFONEVAL(0);
+    } else {
+        PeV = 0x00;
+        /* Basic checks passed */
+        n_exp = fSF_GETEXP(RsV);
+        d_exp = fSF_GETEXP(RtV);
+        if ((n_exp - d_exp + fSF_BIAS()) <= fSF_MANTBITS()) {
+            /* Near quotient underflow / inexact Q */
+            PeV = 0x80;
+            RtV = fSF_MUL_POW2(RtV, -64);
+            RsV = fSF_MUL_POW2(RsV, 64);
+        } else if ((n_exp - d_exp + fSF_BIAS()) > (fSF_MAXEXP() - 24)) {
+            /* Near quotient overflow */
+            PeV = 0x40;
+            RtV = fSF_MUL_POW2(RtV, 32);
+            RsV = fSF_MUL_POW2(RsV, -32);
+        } else if (n_exp <= fSF_MANTBITS() + 2) {
+            RtV = fSF_MUL_POW2(RtV, 64);
+            RsV = fSF_MUL_POW2(RsV, 64);
+        } else if (d_exp <= 1) {
+            RtV = fSF_MUL_POW2(RtV, 32);
+            RsV = fSF_MUL_POW2(RsV, 32);
+        } else if (d_exp > 252) {
+            RtV = fSF_MUL_POW2(RtV, -32);
+            RsV = fSF_MUL_POW2(RsV, -32);
+        }
+        RdV = 0;
+        ret = 1;
+    }
+    *Rs = RsV;
+    *Rt = RtV;
+    *Rd = RdV;
+    *adjust = PeV;
+    return ret;
+}
+
+int arch_sf_invsqrt_common(size4s_t *Rs, size4s_t *Rd, int *adjust)
+{
+    int r_class;
+    size4s_t RsV, RdV;
+    int PeV = 0;
+    int r_exp;
+    int ret = 0;
+    RsV = *Rs;
+    r_class = fpclassify(fFLOAT(RsV));
+    if (r_class == FP_NAN) {
+        if (fGETBIT(22, RsV) == 0) {
+            fRAISEFLAGS(FE_INVALID);
+        }
+        RdV = RsV = fSFNANVAL();
+    } else if (fFLOAT(RsV) < 0.0) {
+        /* Negative nonzero values are NaN */
+        fRAISEFLAGS(FE_INVALID);
+        RsV = fSFNANVAL();
+        RdV = fSFNANVAL();
+    } else if (r_class == FP_INFINITE) {
+        /* or put Inf in num fixup? */
+        RsV = fSFINFVAL(-1);
+        RdV = fSFINFVAL(-1);
+    } else if (r_class == FP_ZERO) {
+        /* or put zero in num fixup? */
+        RdV = fSFONEVAL(0);
+    } else {
+        PeV = 0x00;
+        /* Basic checks passed */
+        r_exp = fSF_GETEXP(RsV);
+        if (r_exp <= 24) {
+            RsV = fSF_MUL_POW2(RsV, 64);
+            PeV = 0xe0;
+        }
+        RdV = 0;
+        ret = 1;
+    }
+    *Rs = RsV;
+    *Rd = RdV;
+    *adjust = PeV;
+    return ret;
+}
+
diff --git a/target/hexagon/conv_emu.c b/target/hexagon/conv_emu.c
new file mode 100644
index 0000000..b31ff32
--- /dev/null
+++ b/target/hexagon/conv_emu.c
@@ -0,0 +1,369 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include <math.h>
+#include "hex_arch_types.h"
+#include "macros.h"
+#include "conv_emu.h"
+
+#define isz(X) (fpclassify(X) == FP_ZERO)
+#define DF_BIAS 1023
+#define SF_BIAS 127
+
+#define LL_MAX_POS 0x7fffffffffffffffULL
+#define MAX_POS 0x7fffffffU
+
+#ifdef VCPP
+/*
+ * Visual C isn't GNU C and doesn't have __builtin_clzll
+ */
+
+static int __builtin_clzll(unsigned long long int input)
+{
+    int total = 0;
+    if (input == 0) {
+        return 64;
+    }
+    total += ((input >> (total + 32)) != 0) ? 32 : 0;
+    total += ((input >> (total + 16)) != 0) ? 16 : 0;
+    total += ((input >> (total +  8)) != 0) ?  8 : 0;
+    total += ((input >> (total +  4)) != 0) ?  4 : 0;
+    total += ((input >> (total +  2)) != 0) ?  2 : 0;
+    total += ((input >> (total +  1)) != 0) ?  1 : 0;
+    return 63 - total;
+}
+#endif
+
+typedef union {
+    double f;
+    size8u_t i;
+    struct {
+        size8u_t mant:52;
+        size8u_t exp:11;
+        size8u_t sign:1;
+    } x;
+} df_t;
+
+
+typedef union {
+    float f;
+    size4u_t i;
+    struct {
+        size4u_t mant:23;
+        size4u_t exp:8;
+        size4u_t sign:1;
+    } x;
+} sf_t;
+
+
+#define MAKE_CONV_8U_TO_XF_N(FLOATID, BIGFLOATID, RETTYPE) \
+static RETTYPE conv_8u_to_##FLOATID##_n(size8u_t in, int negate) \
+{ \
+    FLOATID##_t x; \
+    size8u_t tmp, truncbits, shamt; \
+    int leading_zeros; \
+    if (in == 0) { \
+        return 0.0; \
+    } \
+    leading_zeros = __builtin_clzll(in); \
+    tmp = in << (leading_zeros); \
+    tmp <<= 1; \
+    shamt = 64 - f##BIGFLOATID##_MANTBITS(); \
+    truncbits = tmp & ((1ULL << (shamt)) - 1); \
+    tmp >>= shamt; \
+    if (truncbits != 0) { \
+        feraiseexcept(FE_INEXACT); \
+        switch (fegetround()) { \
+        case FE_TOWARDZERO: \
+            break; \
+        case FE_DOWNWARD: \
+            if (negate) { \
+                tmp += 1; \
+            } \
+            break; \
+        case FE_UPWARD: \
+            if (!negate) { \
+                tmp += 1; \
+            } \
+            break; \
+        default: \
+            if ((truncbits & ((1ULL << (shamt - 1)) - 1)) == 0) { \
+                tmp += (tmp & 1); \
+            } else { \
+                tmp += ((truncbits >> (shamt - 1)) & 1); \
+            } \
+            break; \
+        } \
+    } \
+    if (((tmp << shamt) >> shamt) != tmp) { \
+        leading_zeros--; \
+    } \
+    x.x.mant = tmp; \
+    x.x.exp = BIGFLOATID##_BIAS + f##BIGFLOATID##_MANTBITS() - \
+              leading_zeros + shamt - 1; \
+    x.x.sign = negate; \
+    return x.f; \
+}
+
+MAKE_CONV_8U_TO_XF_N(df, DF, double)
+MAKE_CONV_8U_TO_XF_N(sf, SF, float)
+
+double conv_8u_to_df(size8u_t in)
+{
+    return conv_8u_to_df_n(in, 0);
+}
+
+double conv_8s_to_df(size8s_t in)
+{
+    if (in == 0x8000000000000000) {
+        return -0x1p63;
+    }
+    if (in < 0) {
+        return conv_8u_to_df_n(-in, 1);
+    } else {
+        return conv_8u_to_df_n(in, 0);
+    }
+}
+
+double conv_4u_to_df(size4u_t in)
+{
+    return conv_8u_to_df((size8u_t) in);
+}
+
+double conv_4s_to_df(size4s_t in)
+{
+    return conv_8s_to_df(in);
+}
+
+float conv_8u_to_sf(size8u_t in)
+{
+    return conv_8u_to_sf_n(in, 0);
+}
+
+float conv_8s_to_sf(size8s_t in)
+{
+    if (in == 0x8000000000000000) {
+        return -0x1p63;
+    }
+    if (in < 0) {
+        return conv_8u_to_sf_n(-in, 1);
+    } else {
+        return conv_8u_to_sf_n(in, 0);
+    }
+}
+
+float conv_4u_to_sf(size4u_t in)
+{
+    return conv_8u_to_sf(in);
+}
+
+float conv_4s_to_sf(size4s_t in)
+{
+    return conv_8s_to_sf(in);
+}
+
+
+static size8u_t conv_df_to_8u_n(double in, int will_negate)
+{
+    df_t x;
+    int fracshift, endshift;
+    size8u_t tmp, truncbits;
+    x.f = in;
+    if (isinf(in)) {
+        feraiseexcept(FE_INVALID);
+        if (in > 0.0) {
+            return ~0ULL;
+        } else {
+            return 0ULL;
+        }
+    }
+    if (isnan(in)) {
+        feraiseexcept(FE_INVALID);
+        return ~0ULL;
+    }
+    if (isz(in)) {
+        return 0;
+    }
+    if (x.x.sign) {
+        feraiseexcept(FE_INVALID);
+        return 0;
+    }
+    if (in < 0.5) {
+        /* Near zero, captures large fracshifts, denorms, etc */
+        feraiseexcept(FE_INEXACT);
+        switch (fegetround()) {
+        case FE_DOWNWARD:
+            if (will_negate) {
+                return 1;
+            } else {
+                return 0;
+            }
+        case FE_UPWARD:
+            if (!will_negate) {
+                return 1;
+            } else {
+                return 0;
+            }
+        default:
+            return 0;    /* nearest or towards zero */
+        }
+    }
+    if ((x.x.exp - DF_BIAS) >= 64) {
+        /* way too big */
+        feraiseexcept(FE_INVALID);
+        return ~0ULL;
+    }
+    fracshift = fMAX(0, (fDF_MANTBITS() - (x.x.exp - DF_BIAS)));
+    endshift = fMAX(0, ((x.x.exp - DF_BIAS - fDF_MANTBITS())));
+    tmp = x.x.mant | (1ULL << fDF_MANTBITS());
+    truncbits = tmp & ((1ULL << fracshift) - 1);
+    tmp >>= fracshift;
+    if (truncbits) {
+        /* Apply Rounding */
+        feraiseexcept(FE_INEXACT);
+        switch (fegetround()) {
+        case FE_TOWARDZERO:
+            break;
+        case FE_DOWNWARD:
+            if (will_negate) {
+                tmp += 1;
+            }
+            break;
+        case FE_UPWARD:
+            if (!will_negate) {
+                tmp += 1;
+            }
+            break;
+        default:
+            if ((truncbits & ((1ULL << (fracshift - 1)) - 1)) == 0) {
+                /* Exactly .5 */
+                tmp += (tmp & 1);
+            } else {
+                tmp += ((truncbits >> (fracshift - 1)) & 1);
+            }
+        }
+    }
+    /*
+     * If we added one and it carried all the way out,
+     * check to see if overflow
+     */
+    if ((tmp & ((1ULL << (fDF_MANTBITS() + 1)) - 1)) == 0) {
+        if ((x.x.exp - DF_BIAS) == 63) {
+            feclearexcept(FE_INEXACT);
+            feraiseexcept(FE_INVALID);
+            return ~0ULL;
+        }
+    }
+    tmp <<= endshift;
+    return tmp;
+}
+
+static size4u_t conv_df_to_4u_n(double in, int will_negate)
+{
+    size8u_t tmp;
+    tmp = conv_df_to_8u_n(in, will_negate);
+    if (tmp > 0x00000000ffffffffULL) {
+        feclearexcept(FE_INEXACT);
+        feraiseexcept(FE_INVALID);
+        return ~0U;
+    }
+    return (size4u_t)tmp;
+}
+
+size8u_t conv_df_to_8u(double in)
+{
+    return conv_df_to_8u_n(in, 0);
+}
+
+size4u_t conv_df_to_4u(double in)
+{
+    return conv_df_to_4u_n(in, 0);
+}
+
+size8s_t conv_df_to_8s(double in)
+{
+    size8u_t tmp;
+    df_t x;
+    x.f = in;
+    if (isnan(in)) {
+        feraiseexcept(FE_INVALID);
+        return -1;
+    }
+    if (x.x.sign) {
+        tmp = conv_df_to_8u_n(-in, 1);
+    } else {
+        tmp = conv_df_to_8u_n(in, 0);
+    }
+    if (tmp > (LL_MAX_POS + x.x.sign)) {
+        feclearexcept(FE_INEXACT);
+        feraiseexcept(FE_INVALID);
+        tmp = (LL_MAX_POS + x.x.sign);
+    }
+    if (x.x.sign) {
+        return -tmp;
+    } else {
+        return tmp;
+    }
+}
+
+size4s_t conv_df_to_4s(double in)
+{
+    size8u_t tmp;
+    df_t x;
+    x.f = in;
+    if (isnan(in)) {
+        feraiseexcept(FE_INVALID);
+        return -1;
+    }
+    if (x.x.sign) {
+        tmp = conv_df_to_8u_n(-in, 1);
+    } else {
+        tmp = conv_df_to_8u_n(in, 0);
+    }
+    if (tmp > (MAX_POS + x.x.sign)) {
+        feclearexcept(FE_INEXACT);
+        feraiseexcept(FE_INVALID);
+        tmp = (MAX_POS + x.x.sign);
+    }
+    if (x.x.sign) {
+        return -tmp;
+    } else {
+        return tmp;
+    }
+}
+
+size8u_t conv_sf_to_8u(float in)
+{
+    return conv_df_to_8u(in);
+}
+
+size4u_t conv_sf_to_4u(float in)
+{
+    return conv_df_to_4u(in);
+}
+
+size8s_t conv_sf_to_8s(float in)
+{
+    return conv_df_to_8s(in);
+}
+
+size4s_t conv_sf_to_4s(float in)
+{
+    return conv_df_to_4s(in);
+}
+
diff --git a/target/hexagon/fma_emu.c b/target/hexagon/fma_emu.c
new file mode 100644
index 0000000..9d7b798
--- /dev/null
+++ b/target/hexagon/fma_emu.c
@@ -0,0 +1,781 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <math.h>
+#include "qemu/osdep.h"
+#include "macros.h"
+#include "conv_emu.h"
+#include "fma_emu.h"
+
+#define DF_INF_EXP 0x7ff
+#define DF_BIAS 1023
+
+#define SF_INF_EXP 0xff
+#define SF_BIAS 127
+
+#define HF_INF_EXP 0x1f
+#define HF_BIAS 15
+
+#define WAY_BIG_EXP 4096
+
+#define isz(X) (fpclassify(X) == FP_ZERO)
+
+typedef union {
+    double f;
+    size8u_t i;
+    struct {
+        size8u_t mant:52;
+        size8u_t exp:11;
+        size8u_t sign:1;
+    } x;
+} df_t;
+
+typedef union {
+    float f;
+    size4u_t i;
+    struct {
+        size4u_t mant:23;
+        size4u_t exp:8;
+        size4u_t sign:1;
+    } x;
+} sf_t;
+
+typedef struct {
+    union {
+        size8u_t low;
+        struct {
+            size4u_t w0;
+            size4u_t w1;
+        };
+    };
+    union {
+        size8u_t high;
+        struct {
+            size4u_t w3;
+            size4u_t w2;
+        };
+    };
+} int128_t;
+
+typedef struct {
+    int128_t mant;
+    size4s_t exp;
+    size1u_t sign;
+    size1u_t guard;
+    size1u_t round;
+    size1u_t sticky;
+} xf_t;
+
+static inline void xf_init(xf_t *p)
+{
+    p->mant.low = 0;
+    p->mant.high = 0;
+    p->exp = 0;
+    p->sign = 0;
+    p->guard = 0;
+    p->round = 0;
+    p->sticky = 0;
+}
+
+static inline size8u_t df_getmant(df_t a)
+{
+    int class = fpclassify(a.f);
+    switch (class) {
+    case FP_NORMAL:
+        return a.x.mant | 1ULL << 52;
+    case FP_ZERO:
+        return 0;
+    case FP_SUBNORMAL:
+        return a.x.mant;
+    default:
+        return -1;
+    };
+}
+
+static inline size4s_t df_getexp(df_t a)
+{
+    int class = fpclassify(a.f);
+    switch (class) {
+    case FP_NORMAL:
+        return a.x.exp;
+    case FP_SUBNORMAL:
+        return a.x.exp + 1;
+    default:
+        return -1;
+    };
+}
+
+static inline size8u_t sf_getmant(sf_t a)
+{
+    int class = fpclassify(a.f);
+    switch (class) {
+    case FP_NORMAL:
+        return a.x.mant | 1ULL << 23;
+    case FP_ZERO:
+        return 0;
+    case FP_SUBNORMAL:
+        return a.x.mant | 0ULL;
+    default:
+        return -1;
+    };
+}
+
+static inline size4s_t sf_getexp(sf_t a)
+{
+    int class = fpclassify(a.f);
+    switch (class) {
+    case FP_NORMAL:
+        return a.x.exp;
+    case FP_SUBNORMAL:
+        return a.x.exp + 1;
+    default:
+        return -1;
+    };
+}
+
+static inline int128_t int128_mul_6464(size8u_t ai, size8u_t bi)
+{
+    int128_t ret;
+    int128_t a, b;
+    size8u_t pp0, pp1a, pp1b, pp1s, pp2;
+
+    a.high = b.high = 0;
+    a.low = ai;
+    b.low = bi;
+    pp0 = (size8u_t)a.w0 * (size8u_t)b.w0;
+    pp1a = (size8u_t)a.w1 * (size8u_t)b.w0;
+    pp1b = (size8u_t)b.w1 * (size8u_t)a.w0;
+    pp2 = (size8u_t)a.w1 * (size8u_t)b.w1;
+
+    pp1s = pp1a + pp1b;
+    if ((pp1s < pp1a) || (pp1s < pp1b)) {
+        pp2 += (1ULL << 32);
+    }
+    ret.low = pp0 + (pp1s << 32);
+    if ((ret.low < pp0) || (ret.low < (pp1s << 32))) {
+        pp2 += 1;
+    }
+    ret.high = pp2 + (pp1s >> 32);
+
+    return ret;
+}
+
+static inline int128_t int128_shl(int128_t a, size4u_t amt)
+{
+    int128_t ret;
+    if (amt == 0) {
+        return a;
+    }
+    if (amt >= 128) {
+        ret.high = 0;
+        ret.low = 0;
+        return ret;
+    }
+    if (amt >= 64) {
+        amt -= 64;
+        a.high = a.low;
+        a.low = 0;
+    }
+    ret.high = a.high << amt;
+    ret.high |= amt ? (a.low >> (64 - amt)) : 0;
+    ret.low = a.low << amt;
+    return ret;
+}
+
+static inline int128_t int128_shr(int128_t a, size4u_t amt)
+{
+    int128_t ret;
+    if (amt == 0) {
+        return a;
+    }
+    if (amt >= 128) {
+        ret.high = 0;
+        ret.low = 0;
+        return ret;
+    }
+    if (amt >= 64) {
+        amt -= 64;
+        a.low = a.high;
+        a.high = 0;
+    }
+    ret.low = a.low >> amt;
+    ret.low |= amt ? (a.high << (64 - amt)) : 0;
+    ret.high = a.high >> amt;
+    return ret;
+}
+
+static inline int128_t int128_add(int128_t a, int128_t b)
+{
+    int128_t ret;
+    ret.low = a.low + b.low;
+    if ((ret.low < a.low) || (ret.low < b.low)) {
+        /* carry into high part */
+        a.high += 1;
+    }
+    ret.high = a.high + b.high;
+    return ret;
+}
+
+static inline int128_t int128_sub(int128_t a, int128_t b, int borrow)
+{
+    int128_t ret;
+    ret.low = a.low - b.low;
+    if (ret.low > a.low) {
+        /* borrow into high part */
+        a.high -= 1;
+    }
+    ret.high = a.high - b.high;
+    if (borrow == 0) {
+        return ret;
+    } else {
+        a.high = 0;
+        a.low = 1;
+        return int128_sub(ret, a, 0);
+    }
+}
+
+static inline int int128_gt(int128_t a, int128_t b)
+{
+    if (a.high == b.high) {
+        return a.low > b.low;
+    }
+    return a.high > b.high;
+}
+
+static inline xf_t xf_norm_left(xf_t a)
+{
+    a.exp--;
+    a.mant = int128_shl(a.mant, 1);
+    a.mant.low |= a.guard;
+    a.guard = a.round;
+    a.round = a.sticky;
+    return a;
+}
+
+static inline xf_t xf_norm_right(xf_t a, int amt)
+{
+    if (amt > 130) {
+        a.sticky |=
+            a.round | a.guard | (a.mant.low != 0) | (a.mant.high != 0);
+        a.guard = a.round = a.mant.high = a.mant.low = 0;
+        a.exp += amt;
+        return a;
+
+    }
+    while (amt >= 64) {
+        a.sticky |= a.round | a.guard | (a.mant.low != 0);
+        a.guard = (a.mant.low >> 63) & 1;
+        a.round = (a.mant.low >> 62) & 1;
+        a.mant.low = a.mant.high;
+        a.mant.high = 0;
+        a.exp += 64;
+        amt -= 64;
+    }
+    while (amt > 0) {
+        a.exp++;
+        a.sticky |= a.round;
+        a.round = a.guard;
+        a.guard = a.mant.low & 1;
+        a.mant = int128_shr(a.mant, 1);
+        amt--;
+    }
+    return a;
+}
+
+
+/*
+ * On the add/sub, we need to be able to shift out lots of bits, but need a
+ * sticky bit for what was shifted out, I think.
+ */
+static xf_t xf_add(xf_t a, xf_t b);
+
+static inline xf_t xf_sub(xf_t a, xf_t b, int negate)
+{
+    xf_t ret;
+    xf_init(&ret);
+    int borrow;
+
+    if (a.sign != b.sign) {
+        b.sign = !b.sign;
+        return xf_add(a, b);
+    }
+    if (b.exp > a.exp) {
+        /* small - big == - (big - small) */
+        return xf_sub(b, a, !negate);
+    }
+    if ((b.exp == a.exp) && (int128_gt(b.mant, a.mant))) {
+        /* small - big == - (big - small) */
+        return xf_sub(b, a, !negate);
+    }
+
+    while (a.exp > b.exp) {
+        /* Try to normalize exponents: shrink a exponent and grow mantissa */
+        if (a.mant.high & (1ULL << 62)) {
+            /* Can't grow a any more */
+            break;
+        } else {
+            a = xf_norm_left(a);
+        }
+    }
+
+    while (a.exp > b.exp) {
+        /* Try to normalize exponents: grow b exponent and shrink mantissa */
+        /* Keep around shifted out bits... we might need those later */
+        b = xf_norm_right(b, a.exp - b.exp);
+    }
+
+    if ((int128_gt(b.mant, a.mant))) {
+        return xf_sub(b, a, !negate);
+    }
+
+    /* OK, now things should be normalized! */
+    ret.sign = a.sign;
+    ret.exp = a.exp;
+    assert(!int128_gt(b.mant, a.mant));
+    borrow = (b.round << 2) | (b.guard << 1) | b.sticky;
+    ret.mant = int128_sub(a.mant, b.mant, (borrow != 0));
+    borrow = 0 - borrow;
+    ret.guard = (borrow >> 2) & 1;
+    ret.round = (borrow >> 1) & 1;
+    ret.sticky = (borrow >> 0) & 1;
+    if (negate) {
+        ret.sign = !ret.sign;
+    }
+    return ret;
+}
+
+static xf_t xf_add(xf_t a, xf_t b)
+{
+    xf_t ret;
+    xf_init(&ret);
+    if (a.sign != b.sign) {
+        b.sign = !b.sign;
+        return xf_sub(a, b, 0);
+    }
+    if (b.exp > a.exp) {
+        /* small + big ==  (big + small) */
+        return xf_add(b, a);
+    }
+    if ((b.exp == a.exp) && int128_gt(b.mant, a.mant)) {
+        /* small + big ==  (big + small) */
+        return xf_add(b, a);
+    }
+
+    while (a.exp > b.exp) {
+        /* Try to normalize exponents: shrink a exponent and grow mantissa */
+        if (a.mant.high & (1ULL << 62)) {
+            /* Can't grow a any more */
+            break;
+        } else {
+            a = xf_norm_left(a);
+        }
+    }
+
+    while (a.exp > b.exp) {
+        /* Try to normalize exponents: grow b exponent and shrink mantissa */
+        /* Keep around shifted out bits... we might need those later */
+        b = xf_norm_right(b, a.exp - b.exp);
+    }
+
+    /* OK, now things should be normalized! */
+    if (int128_gt(b.mant, a.mant)) {
+        return xf_add(b, a);
+    }
+    ret.sign = a.sign;
+    ret.exp = a.exp;
+    assert(!int128_gt(b.mant, a.mant));
+    ret.mant = int128_add(a.mant, b.mant);
+    ret.guard = b.guard;
+    ret.round = b.round;
+    ret.sticky = b.sticky;
+    return ret;
+}
+
+/* Return an infinity with the same sign as a */
+static inline df_t infinite_df_t(xf_t a)
+{
+    df_t ret;
+    ret.x.sign = a.sign;
+    ret.x.exp = DF_INF_EXP;
+    ret.x.mant = 0ULL;
+    return ret;
+}
+
+/* Return a maximum finite value with the same sign as a */
+static inline df_t maxfinite_df_t(xf_t a)
+{
+    df_t ret;
+    ret.x.sign = a.sign;
+    ret.x.exp = DF_INF_EXP - 1;
+    ret.x.mant = 0x000fffffffffffffULL;
+    return ret;
+}
+
+static inline df_t f2df_t(double in)
+{
+    df_t ret;
+    ret.f = in;
+    return ret;
+}
+
+/* Return an infinity with the same sign as a */
+static inline sf_t infinite_sf_t(xf_t a)
+{
+    sf_t ret;
+    ret.x.sign = a.sign;
+    ret.x.exp = SF_INF_EXP;
+    ret.x.mant = 0ULL;
+    return ret;
+}
+
+/* Return a maximum finite value with the same sign as a */
+static inline sf_t maxfinite_sf_t(xf_t a)
+{
+    sf_t ret;
+    ret.x.sign = a.sign;
+    ret.x.exp = SF_INF_EXP - 1;
+    ret.x.mant = 0x007fffffUL;
+    return ret;
+}
+
+static inline sf_t f2sf_t(float in)
+{
+    sf_t ret;
+    ret.f = in;
+    return ret;
+}
+
+#define GEN_XF_ROUND(TYPE, MANTBITS, INF_EXP) \
+static inline TYPE xf_round_##TYPE(xf_t a) \
+{ \
+    TYPE ret; \
+    ret.i = 0; \
+    ret.x.sign = a.sign; \
+    if ((a.mant.high == 0) && (a.mant.low == 0) \
+        && ((a.guard | a.round | a.sticky) == 0)) { \
+        /* result zero */ \
+        switch (fegetround()) { \
+        case FE_DOWNWARD: \
+            return f2##TYPE(-0.0); \
+        default: \
+            return f2##TYPE(0.0); \
+        } \
+    } \
+    /* Normalize right */ \
+    /* We want MANTBITS bits of mantissa plus the leading one. */ \
+    /* That means that we want MANTBITS+1 bits, or 0x000000000000FF_FFFF */ \
+    /* So we need to normalize right while the high word is non-zero and \
+    * while the low word is nonzero when masked with 0xffe0_0000_0000_0000 */ \
+    while ((a.mant.high != 0) || ((a.mant.low >> (MANTBITS + 1)) != 0)) { \
+        a = xf_norm_right(a, 1); \
+    } \
+    /* \
+     * OK, now normalize left \
+     * We want to normalize left until we have a leading one in bit 24 \
+     * Theoretically, we only need to shift a maximum of one to the left if we \
+     * shifted out lots of bits from B, or if we had no shift / 1 shift sticky \
+     * should be 0  \
+     */ \
+    while ((a.mant.low & (1ULL << MANTBITS)) == 0) { \
+        a = xf_norm_left(a); \
+    } \
+    /* \
+     * OK, now we might need to denormalize because of potential underflow. \
+     * We need to do this before rounding, and rounding might make us normal \
+     * again \
+     */ \
+    while (a.exp <= 0) { \
+        a = xf_norm_right(a, 1 - a.exp); \
+        /* \
+         * Do we have underflow? \
+         * That's when we get an inexact answer because we ran out of bits \
+         * in a denormal. \
+         */ \
+        if (a.guard || a.round || a.sticky) { \
+            feraiseexcept(FE_UNDERFLOW); \
+        } \
+    } \
+    /* OK, we're relatively canonical... now we need to round */ \
+    if (a.guard || a.round || a.sticky) { \
+        feraiseexcept(FE_INEXACT); \
+        switch (fegetround()) { \
+        case FE_TOWARDZERO: \
+            /* Chop and we're done */ \
+            break; \
+        case FE_UPWARD: \
+            if (a.sign == 0) { \
+                a.mant.low += 1; \
+            } \
+            break; \
+        case FE_DOWNWARD: \
+            if (a.sign != 0) { \
+                a.mant.low += 1; \
+            } \
+            break; \
+        default: \
+            if (a.round || a.sticky) { \
+                /* round up if guard is 1, down if guard is zero */ \
+                a.mant.low += a.guard; \
+            } else if (a.guard) { \
+                /* exactly .5, round up if odd */ \
+                a.mant.low += (a.mant.low & 1); \
+            } \
+            break; \
+        } \
+    } \
+    /* \
+     * OK, now we might have carried all the way up. \
+     * So we might need to shr once \
+     * at least we know that the lsb should be zero if we rounded and \
+     * got a carry out... \
+     */ \
+    if ((a.mant.low >> (MANTBITS + 1)) != 0) { \
+        a = xf_norm_right(a, 1); \
+    } \
+    /* Overflow? */ \
+    if (a.exp >= INF_EXP) { \
+        /* Yep, inf result */ \
+        feraiseexcept(FE_OVERFLOW); \
+        feraiseexcept(FE_INEXACT); \
+        switch (fegetround()) { \
+        case FE_TOWARDZERO: \
+            return maxfinite_##TYPE(a); \
+        case FE_UPWARD: \
+            if (a.sign == 0) { \
+                return infinite_##TYPE(a); \
+            } else { \
+                return maxfinite_##TYPE(a); \
+            } \
+        case FE_DOWNWARD: \
+            if (a.sign != 0) { \
+                return infinite_##TYPE(a); \
+            } else { \
+                return maxfinite_##TYPE(a); \
+            } \
+        default: \
+            return infinite_##TYPE(a); \
+        } \
+    } \
+    /* Underflow? */ \
+    if (a.mant.low & (1ULL << MANTBITS)) { \
+        /* Leading one means: No, we're normal. So, we should be done... */ \
+        ret.x.exp = a.exp; \
+        ret.x.mant = a.mant.low; \
+        return ret; \
+    } \
+    if (a.exp != 1) { \
+        printf("a.exp == %d\n", a.exp); \
+    } \
+    assert(a.exp == 1); \
+    ret.x.exp = 0; \
+    ret.x.mant = a.mant.low; \
+    return ret; \
+}
+
+GEN_XF_ROUND(df_t, fDF_MANTBITS(), DF_INF_EXP)
+GEN_XF_ROUND(sf_t, fSF_MANTBITS(), SF_INF_EXP)
+
+static inline double special_fma(df_t a, df_t b, df_t c)
+{
+    df_t ret;
+    ret.i = 0;
+
+    /*
+     * If A multiplied by B is an exact infinity and C is also an infinity
+     * but with the opposite sign, FMA returns NaN and raises invalid.
+     */
+    if (fISINFPROD(a.f, b.f) && isinf(c.f)) {
+        if ((a.x.sign ^ b.x.sign) != c.x.sign) {
+            ret.i = fDFNANVAL();
+            feraiseexcept(FE_INVALID);
+            return ret.f;
+        }
+    }
+    if ((isinf(a.f) && isz(b.f)) || (isz(a.f) && isinf(b.f))) {
+        ret.i = fDFNANVAL();
+        feraiseexcept(FE_INVALID);
+        return ret.f;
+    }
+    /*
+     * If none of the above checks are true and C is a NaN,
+     * a NaN shall be returned
+     * If A or B are NaN, a NAN shall be returned.
+     */
+    if (isnan(a.f) || isnan(b.f) || (isnan(c.f))) {
+        if (isnan(a.f) && (fGETBIT(51, a.i) == 0)) {
+            feraiseexcept(FE_INVALID);
+        }
+        if (isnan(b.f) && (fGETBIT(51, b.i) == 0)) {
+            feraiseexcept(FE_INVALID);
+        }
+        if (isnan(c.f) && (fGETBIT(51, c.i) == 0)) {
+            feraiseexcept(FE_INVALID);
+        }
+        ret.i = fDFNANVAL();
+        return ret.f;
+    }
+    /*
+     * We have checked for adding opposite-signed infinities.
+     * Other infinities return infinity with the correct sign
+     */
+    if (isinf(c.f)) {
+        ret.x.exp = DF_INF_EXP;
+        ret.x.mant = 0;
+        ret.x.sign = c.x.sign;
+        return ret.f;
+    }
+    if (isinf(a.f) || isinf(b.f)) {
+        ret.x.exp = DF_INF_EXP;
+        ret.x.mant = 0;
+        ret.x.sign = (a.x.sign ^ b.x.sign);
+        return ret.f;
+    }
+    g_assert_not_reached();
+    ret.x.exp = 0x123;
+    ret.x.mant = 0xdead;
+    return ret.f;
+}
+
+static inline float special_fmaf(sf_t a, sf_t b, sf_t c)
+{
+    df_t aa, bb, cc;
+    aa.f = a.f;
+    bb.f = b.f;
+    cc.f = c.f;
+    return special_fma(aa, bb, cc);
+}
+
+float internal_fmafx(float a_in, float b_in, float c_in, int scale)
+{
+    sf_t a, b, c;
+    xf_t prod;
+    xf_t acc;
+    xf_t result;
+    xf_init(&prod);
+    xf_init(&acc);
+    xf_init(&result);
+    a.f = a_in;
+    b.f = b_in;
+    c.f = c_in;
+
+    if (isinf(a.f) || isinf(b.f) || isinf(c.f)) {
+        return special_fmaf(a, b, c);
+    }
+    if (isnan(a.f) || isnan(b.f) || isnan(c.f)) {
+        return special_fmaf(a, b, c);
+    }
+    if ((scale == 0) && (isz(a.f) || isz(b.f))) {
+        return a.f * b.f + c.f;
+    }
+
+    /* (a * 2**b) * (c * 2**d) == a*c * 2**(b+d) */
+    prod.mant = int128_mul_6464(sf_getmant(a), sf_getmant(b));
+
+    /*
+     * Note: extracting the mantissa into an int is multiplying by
+     * 2**23, so adjust here
+     */
+    prod.exp = sf_getexp(a) + sf_getexp(b) - SF_BIAS - 23;
+    prod.sign = a.x.sign ^ b.x.sign;
+    if (isz(a.f) || isz(b.f)) {
+        prod.exp = -2 * WAY_BIG_EXP;
+    }
+    if ((scale > 0) && (fpclassify(c.f) == FP_SUBNORMAL)) {
+        acc.mant = int128_mul_6464(0, 0);
+        acc.exp = -WAY_BIG_EXP;
+        acc.sign = c.x.sign;
+        acc.sticky = 1;
+        result = xf_add(prod, acc);
+    } else if (!isz(c.f)) {
+        acc.mant = int128_mul_6464(sf_getmant(c), 1);
+        acc.exp = sf_getexp(c);
+        acc.sign = c.x.sign;
+        result = xf_add(prod, acc);
+    } else {
+        result = prod;
+    }
+    result.exp += scale;
+    return xf_round_sf_t(result).f;
+}
+
+
+float internal_fmaf(float a_in, float b_in, float c_in)
+{
+    return internal_fmafx(a_in, b_in, c_in, 0);
+}
+
+float internal_mpyf(float a_in, float b_in)
+{
+    if (isz(a_in) || isz(b_in)) {
+        return a_in * b_in;
+    }
+    return internal_fmafx(a_in, b_in, 0.0, 0);
+}
+
+static inline double internal_mpyhh_special(double a, double b)
+{
+    return a * b;
+}
+
+double internal_mpyhh(double a_in, double b_in,
+                      unsigned long long int accumulated)
+{
+    df_t a, b;
+    xf_t x;
+    unsigned long long int prod;
+    unsigned int sticky;
+
+    a.f = a_in;
+    b.f = b_in;
+    sticky = accumulated & 1;
+    accumulated >>= 1;
+    xf_init(&x);
+    if (isz(a_in) || isnan(a_in) || isinf(a_in)) {
+        return internal_mpyhh_special(a_in, b_in);
+    }
+    if (isz(b_in) || isnan(b_in) || isinf(b_in)) {
+        return internal_mpyhh_special(a_in, b_in);
+    }
+    x.mant = int128_mul_6464(accumulated, 1);
+    x.sticky = sticky;
+    prod = fGETUWORD(1, df_getmant(a)) * fGETUWORD(1, df_getmant(b));
+    x.mant = int128_add(x.mant, int128_mul_6464(prod, 0x100000000ULL));
+    x.exp = df_getexp(a) + df_getexp(b) - DF_BIAS - 20;
+    if (!isnormal(a_in) || !isnormal(b_in)) {
+        /* crush to inexact zero */
+        x.sticky = 1;
+        x.exp = -4096;
+    }
+    x.sign = a.x.sign ^ b.x.sign;
+    return xf_round_df_t(x).f;
+}
+
+float conv_df_to_sf(double in_f)
+{
+    xf_t x;
+    df_t in;
+    if (isz(in_f) || isnan(in_f) || isinf(in_f)) {
+        return in_f;
+    }
+    xf_init(&x);
+    in.f = in_f;
+    x.mant = int128_mul_6464(df_getmant(in), 1);
+    x.exp = df_getexp(in) - DF_BIAS + SF_BIAS - 52 + 23;
+    x.sign = in.x.sign;
+    return xf_round_sf_t(x).f;
+}
+
-- 
2.7.4
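
Reviewer note: the 32-bit partial-product scheme used by int128_mul_6464()
in fma_emu.c can be checked in isolation. The sketch below is a standalone
restatement, not code from the patch; the names u128_sketch and mul_64x64
are hypothetical, and it uses <stdint.h> types instead of the size8u_t
typedefs.

```c
#include <stdint.h>

/* Hypothetical 128-bit result type: two 64-bit halves. */
typedef struct {
    uint64_t low;
    uint64_t high;
} u128_sketch;

/* Multiply two 64-bit values into 128 bits using four 32x32 products. */
static u128_sketch mul_64x64(uint64_t a, uint64_t b)
{
    uint64_t a0 = (uint32_t)a, a1 = a >> 32;
    uint64_t b0 = (uint32_t)b, b1 = b >> 32;
    uint64_t pp0 = a0 * b0;         /* weight 2^0  */
    uint64_t pp1a = a1 * b0;        /* weight 2^32 */
    uint64_t pp1b = b1 * a0;        /* weight 2^32 */
    uint64_t pp2 = a1 * b1;         /* weight 2^64 */
    uint64_t pp1s = pp1a + pp1b;
    u128_sketch r;

    if (pp1s < pp1a) {
        /* The middle sum wrapped; that carry has weight 2^96. */
        pp2 += 1ULL << 32;
    }
    r.low = pp0 + (pp1s << 32);
    if (r.low < pp0) {
        /* Carry out of the low 64 bits. */
        pp2 += 1;
    }
    r.high = pp2 + (pp1s >> 32);
    return r;
}
```

A quick sanity check is the all-ones case: (2^64 - 1)^2 = 2^128 - 2^65 + 1,
so the high half should be 0xfffffffffffffffe and the low half 1.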



* [RFC PATCH v3 17/34] Hexagon (target/hexagon/imported) arch import - macro definitions
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (15 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 16/34] Hexagon (target/hexagon) utility functions Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 15:17   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 18/34] Hexagon (target/hexagon/imported) arch import - instruction semantics Taylor Simpson
                   ` (18 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Each macro definition specifies instruction attributes that apply to every
instruction referencing the macro. The generator applies these attributes
recursively, so an instruction also picks up attributes from macros nested
inside the macros it uses.

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
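Note (not part of the commit message): the recursive attribute propagation
described above can be sketched roughly as follows. This is only an
illustration under stated assumptions; the series' actual generator is not
shown in this message. MacroDef, macro_table, mentions() and
collect_attribs() are hypothetical names, the behaviors are abbreviated, the
toy attributes differ from macros.def, and fWRAP_IMMEXT is invented.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute bits standing in for A_CONDEXEC, A_EXTENDABLE. */
enum {
    ATTR_CONDEXEC   = 1u << 0,
    ATTR_EXTENDABLE = 1u << 1,
};

typedef struct {
    const char *name;   /* macro name */
    const char *beh;    /* behavior text, may reference other macros */
    unsigned attribs;   /* attributes declared on the macro itself */
} MacroDef;

/* Toy table, not the real macros.def contents. */
static const MacroDef macro_table[] = {
    { "CANCEL",       "{ return; }",  ATTR_CONDEXEC },
    { "LOAD_CANCEL",  "{ CANCEL; }",  0 },
    { "fIMMEXT",      "(IMM = IMM)",  ATTR_EXTENDABLE },
    { "fWRAP_IMMEXT", "fIMMEXT(IMM)", 0 },
};

static int is_ident_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* True if `name` occurs in `text` as a whole identifier. */
static int mentions(const char *text, const char *name)
{
    size_t len = strlen(name);
    const char *p = text;

    while ((p = strstr(p, name)) != NULL) {
        int left_ok = (p == text) || !is_ident_char(p[-1]);
        int right_ok = !is_ident_char(p[len]);
        if (left_ok && right_ok) {
            return 1;
        }
        p += 1;
    }
    return 0;
}

/* Union of the attributes of every macro reachable from `text`. */
static unsigned collect_attribs(const char *text, int depth)
{
    unsigned attribs = 0;
    size_t i;

    assert(depth < 16);     /* guard against cyclic definitions */
    for (i = 0; i < sizeof(macro_table) / sizeof(macro_table[0]); i++) {
        if (mentions(text, macro_table[i].name)) {
            attribs |= macro_table[i].attribs;
            attribs |= collect_attribs(macro_table[i].beh, depth + 1);
        }
    }
    return attribs;
}
```

In this toy table an instruction whose semantics use LOAD_CANCEL ends up
with ATTR_CONDEXEC even though LOAD_CANCEL declares nothing itself, because
its behavior references CANCEL.
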
 target/hexagon/imported/macros.def | 1529 ++++++++++++++++++++++++++++++++++++
 1 file changed, 1529 insertions(+)
 create mode 100755 target/hexagon/imported/macros.def
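
Note for reviewers: among the macros added here, fZXTN, fSXTN and fSATN
build sign extension and signed saturation from a mask plus an
xor-and-subtract step. The sketch below restates that arithmetic as plain
functions so it can be tested on its own. The function names are
hypothetical, the ovf flag is an assumed stand-in for fSET_OVERFLOW(), and
the M argument of fSXTN is dropped because it does not appear in the
behavior shown in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend: keep the low n bits of val (0 < n < 63). */
static int64_t zxtn(unsigned n, int64_t val)
{
    assert(n > 0 && n < 63);
    return val & ((1LL << n) - 1);
}

/* Sign-extend the low n bits of val using xor-and-subtract. */
static int64_t sxtn(unsigned n, int64_t val)
{
    int64_t sign = 1LL << (n - 1);

    /* Flipping the sign bit and subtracting it maps 1xxx to negative. */
    return (zxtn(n, val) ^ sign) - sign;
}

/* Saturate val to the n-bit signed range; *ovf is set when clamping. */
static int64_t satn(unsigned n, int64_t val, int *ovf)
{
    if (sxtn(n, val) == val) {
        return val;             /* already representable in n bits */
    }
    *ovf = 1;
    return (val < 0) ? -(1LL << (n - 1)) : (1LL << (n - 1)) - 1;
}
```

For example, sxtn(8, 0xff) gives -1, and satn(8, 200, &ovf) clamps to 127
and sets the flag.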

diff --git a/target/hexagon/imported/macros.def b/target/hexagon/imported/macros.def
new file mode 100755
index 0000000..affbb1d
--- /dev/null
+++ b/target/hexagon/imported/macros.def
@@ -0,0 +1,1529 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+DEF_MACRO(
+	LIKELY,		/* NAME */
+	__builtin_expect((X),1), /* BEH */
+	()		/* attribs */
+)
+
+DEF_MACRO(
+	UNLIKELY,	/* NAME */
+	__builtin_expect((X),0), /* BEH */
+	()		/* attribs */
+)
+
+DEF_MACRO(
+	CANCEL, /* macro name */
+	{if (thread->last_pkt) thread->last_pkt->slot_cancelled |= (1<<insn->slot); return;} , /* behavior */
+	(A_CONDEXEC)
+)
+
+DEF_MACRO(
+	LOAD_CANCEL, /* macro name */
+	{mem_general_load_cancelled(thread,EA,insn);CANCEL;} , /* behavior */
+	(A_CONDEXEC)
+)
+
+DEF_MACRO(
+	STORE_CANCEL, /* macro name */
+	{mem_general_store_cancelled(thread,EA,insn);CANCEL;} , /* behavior */
+	(A_CONDEXEC)
+)
+
+DEF_MACRO(
+	fMAX, /* macro name */
+	(((A) > (B)) ? (A) : (B)), /* behavior */
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fMIN, /* macro name */
+	(((A) < (B)) ? (A) : (B)), /* behavior */
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fABS, /* macro name */
+	(((A)<0)?(-(A)):(A)), /* behavior */
+	/* optional attributes */
+)
+
+
+/* Bit insert */
+DEF_MACRO(
+	fINSERT_BITS,
+        {
+	   REG = ((REG) & ~(((fCONSTLL(1)<<(WIDTH))-1)<<(OFFSET))) | (((INVAL) & ((fCONSTLL(1)<<(WIDTH))-1)) << (OFFSET));
+        },
+	/* attribs */
+)
+
+/* Bit extract */
+DEF_MACRO(
+	fEXTRACTU_BITS,
+	(fZXTN(WIDTH,32,(INREG >> OFFSET))),
+	/* attribs */
+)
+
+DEF_MACRO(
+	fEXTRACTU_BIDIR,
+	(fZXTN(WIDTH,32,fBIDIR_LSHIFTR((INREG),(OFFSET),4_8))),
+	/* attribs */
+)
+
+DEF_MACRO(
+	fEXTRACTU_RANGE,
+	(fZXTN((HIBIT-LOWBIT+1),32,(INREG >> LOWBIT))),
+	/* attribs */
+)
+
+DEF_MACRO(
+	f8BITSOF,
+	( (VAL) ? 0xff : 0x00),
+	/* attribs */
+)
+
+DEF_MACRO(
+	fLSBOLD,
+	((VAL) & 1),
+	()
+)
+
+DEF_MACRO(
+	fLSBNEW,
+	predlog_read(thread,PNUM),
+	()
+)
+
+DEF_MACRO(
+	fLSBNEW0,
+	predlog_read(thread,0),
+	()
+)
+
+DEF_MACRO(
+	fLSBNEW1,
+	predlog_read(thread,1),
+	()
+)
+
+DEF_MACRO(
+	fLSBOLDNOT,
+	(!fLSBOLD(VAL)),
+	()
+)
+
+DEF_MACRO(
+	fLSBNEWNOT,
+	(!fLSBNEW(PNUM)),
+	()
+)
+
+DEF_MACRO(
+	fLSBNEW0NOT,
+	(!fLSBNEW0),
+	()
+)
+
+DEF_MACRO(
+	fLSBNEW1NOT,
+	(!fLSBNEW1),
+	()
+)
+
+DEF_MACRO(
+	fNEWREG,
+	({if (newvalue_missing(thread,RNUM) ||
+      IS_CANCELLED(insn->new_value_producer_slot)) CANCEL; reglog_read(thread,RNUM);}),
+	(A_DOTNEWVALUE,A_RESTRICT_SLOT0ONLY)
+)
+// A store-new with a missing or cancelled new value goes out as a zero-byte store in V65.
+// This takes advantage of the fact that reglog_read returns zero for an invalid RNUM.
+DEF_MACRO(
+	fNEWREG_ST,
+	({if (newvalue_missing(thread,RNUM) ||
+      IS_CANCELLED(insn->new_value_producer_slot)) { STORE_ZERO; RNUM = -1; }; reglog_read(thread,RNUM);}),
+	(A_DOTNEWVALUE,A_RESTRICT_SLOT0ONLY)
+)
+
+DEF_MACRO(
+	fSATUVALN,
+	({fSET_OVERFLOW(); ((VAL) < 0) ? 0 : ((1LL<<(N))-1);}),
+	()
+)
+
+DEF_MACRO(
+	fSATVALN,
+	({fSET_OVERFLOW(); ((VAL) < 0) ? (-(1LL<<((N)-1))) : ((1LL<<((N)-1))-1);}),
+	()
+)
+
+DEF_MACRO(
+	fZXTN, /* macro name */
+	((VAL) & ((1LL<<(N))-1)),
+	/* attribs */
+)
+
+DEF_MACRO(
+	fSXTN, /* macro name */
+	((fZXTN(N,M,VAL) ^ (1LL<<((N)-1))) - (1LL<<((N)-1))),
+	/* attribs */
+)
+
+DEF_MACRO(
+	fSATN,
+	((fSXTN(N,64,VAL) == (VAL)) ? (VAL) : fSATVALN(N,VAL)),
+	()
+)
+
+DEF_MACRO(
+	fADDSAT64,
+	{
+	  size8u_t __a = fCAST8u(A);
+	  size8u_t __b = fCAST8u(B);
+	  size8u_t __sum = __a + __b;
+	  size8u_t __xor = __a ^ __b;
+	  const size8u_t __mask = 0x8000000000000000ULL;
+	  if (__xor & __mask) {
+		/* Opposite signs, OK */
+		DST = __sum;
+	  } else if ((__a ^ __sum) & __mask) {
+		/* Signs mismatch */
+		if (__sum & __mask) {
+			/* overflowed to negative, make max pos */
+			DST=0x7FFFFFFFFFFFFFFFLL; fSET_OVERFLOW();
+		} else {
+			/* overflowed to positive, make max neg */
+			DST=0x8000000000000000LL; fSET_OVERFLOW();
+		}
+	  } else {
+		/* signs did not mismatch, OK */
+		DST = __sum;
+	  }
+        },
+	()
+)
+
+DEF_MACRO(
+	fSATUN,
+	((fZXTN(N,64,VAL) == (VAL)) ? (VAL) : fSATUVALN(N,VAL)),
+	()
+)
+
+DEF_MACRO(
+	fSATH,
+	(fSATN(16,VAL)),
+	()
+)
+
+
+DEF_MACRO(
+	fSATUH,
+	(fSATUN(16,VAL)),
+	()
+)
+
+DEF_MACRO(
+	fSATUB,
+	(fSATUN(8,VAL)),
+	()
+)
+DEF_MACRO(
+	fSATB,
+	(fSATN(8,VAL)),
+	()
+)
+
+
+/*************************************/
+/* immediate extension               */
+/*************************************/
+
+DEF_MACRO(
+	fIMMEXT,
+	(IMM = IMM),
+	(A_EXTENDABLE)
+)
+
+DEF_MACRO(
+	fMUST_IMMEXT,
+	fIMMEXT(IMM),
+	(A_EXTENDABLE)
+)
+
+DEF_MACRO(
+	fPCALIGN,
+	IMM=(IMM & ~PCALIGN_MASK),
+	(A_EXTENDABLE)
+)
+
+/*************************************/
+/* Read and Write Implicit Regs      */
+/*************************************/
+
+DEF_MACRO(
+	fREAD_LR, /* read link register */
+	(READ_RREG(REG_LR)),          /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fWRITE_LR, /* write lr */
+	WRITE_RREG(REG_LR,A),          /* behavior */
+	(A_IMPLICIT_WRITES_LR)
+)
+
+DEF_MACRO(
+	fWRITE_FP, /* write fp */
+	WRITE_RREG(REG_FP,A),          /* behavior */
+	(A_IMPLICIT_WRITES_FP)
+)
+
+DEF_MACRO(
+	fWRITE_SP, /* write sp */
+        WRITE_RREG(REG_SP,A),          /* behavior */
+	(A_IMPLICIT_WRITES_SP)
+)
+
+DEF_MACRO(
+	fREAD_SP, /* read stack pointer */
+	(READ_RREG(REG_SP)),          /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fREAD_LC0, /* read loop count */
+	(READ_RREG(REG_LC0)),          /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fREAD_LC1, /* read loop count */
+	(READ_RREG(REG_LC1)),          /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fREAD_SA0, /* read start addr */
+	(READ_RREG(REG_SA0)),          /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fREAD_SA1, /* read start addr */
+	(READ_RREG(REG_SA1)),          /* behavior */
+	()
+)
+
+
+DEF_MACRO(
+	fREAD_FP, /* read frame pointer */
+	(READ_RREG(REG_FP)),          /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fREAD_GP, /* read global pointer */
+	(insn->extension_valid ? 0 : READ_RREG(REG_GP)),          /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fREAD_PC, /* read PC */
+	(READ_RREG(REG_PC)),          /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fREAD_NPC, /* read next PC */
+	(thread->next_PC & (0xfffffffe)),          /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fREAD_P0, /* read Predicate 0 */
+	(READ_PREG(0)),          /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fREAD_P3, /* read Predicate 3 */
+	(READ_PREG(3)),          /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fCHECK_PCALIGN,
+	if (((A) & PCALIGN_MASK)) {
+		register_error_exception(thread,PRECISE_CAUSE_PC_NOT_ALIGNED,thread->Regs[REG_BADVA0],thread->Regs[REG_BADVA1],GET_SSR_FIELD(SSR_BVS),GET_SSR_FIELD(SSR_V0),GET_SSR_FIELD(SSR_V1),0);
+	},
+	()
+)
+
+DEF_MACRO(
+	fWRITE_NPC, /* write next PC */
+	if (!thread->branch_taken) {
+           if (A != thread->next_PC) {
+             thread->next_pkt_guess=thread->last_pkt->taken_ptr;
+           }
+	   fCHECK_PCALIGN(A);
+           thread->branched = 1; thread->branch_taken = 1; thread->next_PC = A; \
+           thread->branch_offset = insn->encoding_offset; thread->branch_opcode = insn->opcode;
+        }
+         ,          /* behavior */
+	(A_COF)
+)
+
+DEF_MACRO(
+	fBRANCH,
+	fWRITE_NPC(LOC); fCOF_CALLBACK(LOC,TYPE),
+	()
+)
+
+DEF_MACRO(
+	fJUMPR,	/* A jumpr has executed */
+	{fBRANCH(TARGET,COF_TYPE_JUMPR);},
+	(A_INDIRECT)
+)
+
+DEF_MACRO(
+	fHINTJR,	/* A hintjr instruction has executed */
+	{ },
+)
+
+DEF_MACRO(
+	fCALL,	/* Do a call */
+	if (!thread->branch_taken) {fBP_RAS_CALL(A); fWRITE_LR(fREAD_NPC()); fBRANCH(A,COF_TYPE_CALL);},
+	(A_COF,A_IMPLICIT_WRITES_LR,A_CALL)
+)
+
+DEF_MACRO(
+	fCALLR,	/* Do a call Register */
+	if (!thread->branch_taken) {fBP_RAS_CALL(A); fWRITE_LR(fREAD_NPC()); fBRANCH(A,COF_TYPE_CALLR);},
+	(A_COF,A_IMPLICIT_WRITES_LR,A_CALL)
+)
+
+DEF_MACRO(
+	fWRITE_LOOP_REGS0, /* write lc0,sa0 */
+	{WRITE_RREG(REG_LC0,COUNT);
+         WRITE_RREG(REG_SA0,START);},
+	(A_IMPLICIT_WRITES_LC0,A_IMPLICIT_WRITES_SA0)
+)
+
+DEF_MACRO(
+	fWRITE_LOOP_REGS1, /* write lc1,sa1 */
+	{WRITE_RREG(REG_LC1,COUNT);
+         WRITE_RREG(REG_SA1,START);},
+	(A_IMPLICIT_WRITES_LC1,A_IMPLICIT_WRITES_SA1)
+)
+
+DEF_MACRO(
+	fWRITE_LC0,
+	WRITE_RREG(REG_LC0,VAL),
+	(A_IMPLICIT_WRITES_LC0)
+)
+
+DEF_MACRO(
+	fWRITE_LC1,
+	WRITE_RREG(REG_LC1,VAL),
+	(A_IMPLICIT_WRITES_LC1)
+)
+
+DEF_MACRO(
+	fCARRY_FROM_ADD,
+	carry_from_add64(A,B,C),
+	/* NOTHING */
+)
+
+DEF_MACRO(
+	fSET_OVERFLOW,
+	SET_USR_FIELD(USR_OVF,1),
+	()
+)
+
+DEF_MACRO(
+	fSET_LPCFG,
+	SET_USR_FIELD(USR_LPCFG,(VAL)),
+	()
+)
+
+
+DEF_MACRO(
+	fGET_LPCFG,
+	(GET_USR_FIELD(USR_LPCFG)),
+	()
+)
+
+
+
+DEF_MACRO(
+	fWRITE_P0, /* write Predicate 0 */
+	WRITE_PREG(0,VAL),          /* behavior */
+	(A_IMPLICIT_WRITES_P0)
+)
+
+DEF_MACRO(
+	fWRITE_P1, /* write Predicate 1 */
+	WRITE_PREG(1,VAL),          /* behavior */
+	(A_IMPLICIT_WRITES_P1)
+)
+
+DEF_MACRO(
+	fWRITE_P2, /* write Predicate 2 */
+	WRITE_PREG(2,VAL),          /* behavior */
+	(A_IMPLICIT_WRITES_P2)
+)
+
+DEF_MACRO(
+	fWRITE_P3, /* write Predicate 3 */
+	WRITE_PREG(3,VAL),     /* behavior */
+	(A_IMPLICIT_WRITES_P3)
+)
+
+DEF_MACRO(
+	fPART1, /* do WORK and return if insn->part1 is set */
+	if (insn->part1) { WORK; return; },          /* behavior */
+	/* optional attributes */
+)
+
+
+/*************************************/
+/* Casting, Sign-Zero extension, etc */
+/*************************************/
+
+DEF_MACRO(
+	fCAST4u, /* macro name */
+	((size4u_t)(A)),          /* behavior */
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fCAST4s, /* macro name */
+	((size4s_t)(A)),          /* behavior */
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fCAST8u, /* macro name */
+	((size8u_t)(A)),          /* behavior */
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fCAST8s, /* macro name */
+	((size8s_t)(A)),          /* behavior */
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fCAST4_4s, /* macro name */
+	((size4s_t)(A)),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fCAST4_4u, /* macro name */
+	((size4u_t)(A)),
+	/* optional attributes */
+)
+
+
+DEF_MACRO(
+	fCAST4_8s, /* macro name */
+	((size8s_t)((size4s_t)(A))),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fCAST4_8u, /* macro name */
+	((size8u_t)((size4u_t)(A))),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fCAST8_8s, /* macro name */
+	((size8s_t)(A)),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fCAST8_8u, /* macro name */
+	((size8u_t)(A)),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fCAST2_8s, /* macro name */
+	((size8s_t)((size2s_t)(A))),
+	/* optional attributes */
+)
+DEF_MACRO(
+	fCAST2_8u, /* macro name */
+	((size8u_t)((size2u_t)(A))),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fZE8_16, /* zero-extend 8 to 16 */
+	((size2s_t)((size1u_t)(A))),
+	/* optional attributes */
+)
+DEF_MACRO(
+	fSE8_16, /* sign-extend 8 to 16 */
+	((size2s_t)((size1s_t)(A))),
+	/* optional attributes */
+)
+
+
+DEF_MACRO(
+	fSE16_32, /* sign-extend 16 to 32 */
+	((size4s_t)((size2s_t)(A))),          /* behavior */
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fZE16_32, /* zero-extend 16 to 32 */
+	((size4u_t)((size2u_t)(A))),          /* behavior */
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fSE32_64,
+	( (size8s_t)((size4s_t)(A)) ),          /* behavior */
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fZE32_64,
+	( (size8u_t)((size4u_t)(A)) ),          /* behavior */
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fSE8_32, /* sign-extend 8 to 32 */
+	((size4s_t)((size1s_t)(A))),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fZE8_32, /* zero-extend 8 to 32 */
+	((size4s_t)((size1u_t)(A))),
+	/* optional attributes */
+)
+
+/*************************************/
+/* DSP arithmetic support            */
+/************************************/
+DEF_MACRO(
+	fMPY8UU, /* multiply byte integers, unsigned x unsigned */
+	(int)(fZE8_16(A)*fZE8_16(B)),     /* behavior */
+	()
+)
+DEF_MACRO(
+	fMPY8US, /* multiply byte integers, unsigned x signed */
+	(int)(fZE8_16(A)*fSE8_16(B)),     /* behavior */
+	()
+)
+DEF_MACRO(
+	fMPY8SU, /* multiply byte integers, signed x unsigned */
+	(int)(fSE8_16(A)*fZE8_16(B)),     /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fMPY8SS, /* multiply byte integers, signed x signed */
+	(int)((short)(A)*(short)(B)),     /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fMPY16SS, /* multiply half integer */
+	fSE32_64(fSE16_32(A)*fSE16_32(B)),     /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fMPY16UU, /* multiply unsigned half integer */
+	fZE32_64(fZE16_32(A)*fZE16_32(B)),     /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fMPY16SU, /* multiply half integer */
+	fSE32_64(fSE16_32(A)*fZE16_32(B)),     /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fMPY16US, /* multiply half integer */
+	fMPY16SU(B,A),
+	()
+)
+
+DEF_MACRO(
+	fMPY32SS, /* multiply word integers, signed x signed */
+	(fSE32_64(A)*fSE32_64(B)),     /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fMPY32UU, /* multiply word integers, unsigned x unsigned */
+	(fZE32_64(A)*fZE32_64(B)),     /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fMPY32SU, /* multiply word integers, signed x unsigned */
+	(fSE32_64(A)*fZE32_64(B)),     /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fMPY3216SS, /* multiply mixed precision */
+	(fSE32_64(A)*fSXTN(16,64,B)),     /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fMPY3216SU, /* multiply mixed precision */
+	(fSE32_64(A)*fZXTN(16,64,B)),     /* behavior */
+	()
+)
+
+DEF_MACRO(
+	fROUND, /* optional rounding */
+	(A+0x8000),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fCLIP, /* clip SRC to [-(1<<U), (1<<U)-1] */
+	{ size4s_t maxv = (1<<U)-1;
+	 size4s_t minv = -(1<<U);
+	 DST = fMIN(maxv,fMAX(SRC,minv));
+	},
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fCRND, /* add 1 when the low two bits are both set */
+	((((A)&0x3)==0x3)?((A)+1):((A))),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fRNDN, /* Rounding to a boundary */
+	((((N)==0)?(A):(((fSE32_64(A))+(1<<((N)-1)))))),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fCRNDN, /* Rounding to a boundary */
+	(conv_round(A,N)),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fADD128, /* 128-bit add */
+	(add128(A, B)),
+	/* optional attributes */
+)
+DEF_MACRO(
+	fSUB128, /* 128-bit subtract */
+	(sub128(A, B)),
+	/* optional attributes */
+)
+DEF_MACRO(
+	fSHIFTR128, /* 128-bit shift right */
+	(shiftr128(A, B)),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fSHIFTL128, /* 128-bit shift left */
+	(shiftl128(A, B)),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fAND128, /* 128-bit bitwise AND */
+	(and128(A, B)),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fCAST8S_16S, /* widen 8-byte signed to 16-byte signed */
+	(cast8s_to_16s(A)),
+	/* optional attributes */
+)
+DEF_MACRO(
+	fCAST16S_8S, /* narrow 16-byte signed to 8-byte signed */
+	(cast16s_to_8s(A)),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fEA_RI, /* Calculate EA with Register + Immediate Offset */
+	do { EA=REG+IMM; fDOCHKPAGECROSS(REG,EA); } while (0),
+	()
+)
+
+DEF_MACRO(
+	fEA_RRs, /* Calculate EA with Register + Registers scaled Offset */
+	do { EA=REG+(REG2<<SCALE); fDOCHKPAGECROSS(REG,EA); } while (0),
+	()
+)
+
+DEF_MACRO(
+	fEA_IRs, /* Calculate EA with Immediate + Registers scaled Offset */
+	do { EA=IMM+(REG<<SCALE); fDOCHKPAGECROSS(IMM,EA); } while (0),
+	()
+)
+
+DEF_MACRO(
+	fEA_IMM, /* Calculate EA with Immediate */
+	EA=IMM,
+	()
+)
+
+DEF_MACRO(
+	fEA_REG, /* Calculate EA with REGISTER */
+	EA=REG,
+	()
+)
+
+DEF_MACRO(
+	fEA_GPI, /* Calculate EA with Global Pointer + Immediate */
+    do { EA=fREAD_GP()+IMM; fGP_DOCHKPAGECROSS(fREAD_GP(),EA); } while (0),
+	()
+)
+
+DEF_MACRO(
+	fPM_I, /* Post Modify Register by Immediate*/
+	do { REG = REG + IMM; } while (0),
+	()
+)
+
+DEF_MACRO(
+	fPM_M, /* Post Modify Register by M register */
+	do { REG = REG + MVAL; } while (0),
+	()
+)
+
+DEF_MACRO(
+	fSCALE, /* scale by N */
+	(((size8s_t)(A))<<N),
+	/* optional attributes */
+)
+
+DEF_MACRO(
+	fSATW, /* saturating to 32-bits*/
+	fSATN(32,((long long)A)),
+	()
+)
+
+DEF_MACRO(
+	fSAT, /* saturating to 32-bits*/
+	fSATN(32,(A)),
+	()
+)
+
+DEF_MACRO(
+	fSAT_ORIG_SHL, /* Saturating to 32-bits, with original value, for shift left */
+	((((size4s_t)((fSAT(A)) ^ ((size4s_t)(ORIG_REG)))) < 0) ?
+		fSATVALN(32,((size4s_t)(ORIG_REG))) :
+		((((ORIG_REG) > 0) && ((A) == 0)) ?
+			fSATVALN(32,(ORIG_REG)) :
+			fSAT(A))),
+	()
+)
+
+DEF_MACRO(
+	fPASS,
+	A,
+)
+
+DEF_MACRO(
+	fRND, /* add 1, then shift right by 1 */
+	(((A)+1)>>1),
+)
+
+
+DEF_MACRO(
+	fBIDIR_SHIFTL,
+	(((SHAMT) < 0) ? ((fCAST##REGSTYPE(SRC) >> ((-(SHAMT))-1)) >>1) : (fCAST##REGSTYPE(SRC) << (SHAMT))),
+	()
+)
+
+DEF_MACRO(
+	fBIDIR_ASHIFTL,
+	fBIDIR_SHIFTL(SRC,SHAMT,REGSTYPE##s),
+	()
+)
+
+DEF_MACRO(
+	fBIDIR_LSHIFTL,
+	fBIDIR_SHIFTL(SRC,SHAMT,REGSTYPE##u),
+	()
+)
+
+DEF_MACRO(
+	fBIDIR_ASHIFTL_SAT,
+	(((SHAMT) < 0) ? ((fCAST##REGSTYPE##s(SRC) >> ((-(SHAMT))-1)) >>1) : fSAT_ORIG_SHL(fCAST##REGSTYPE##s(SRC) << (SHAMT),(SRC))),
+	()
+)
+
+
+DEF_MACRO(
+	fBIDIR_SHIFTR,
+	(((SHAMT) < 0) ? ((fCAST##REGSTYPE(SRC) << ((-(SHAMT))-1)) << 1) : (fCAST##REGSTYPE(SRC) >> (SHAMT))),
+	()
+)
+
+DEF_MACRO(
+	fBIDIR_ASHIFTR,
+	fBIDIR_SHIFTR(SRC,SHAMT,REGSTYPE##s),
+	()
+)
+
+DEF_MACRO(
+	fBIDIR_LSHIFTR,
+	fBIDIR_SHIFTR(SRC,SHAMT,REGSTYPE##u),
+	()
+)
+
+DEF_MACRO(
+	fBIDIR_ASHIFTR_SAT,
+	(((SHAMT) < 0) ? fSAT_ORIG_SHL((fCAST##REGSTYPE##s(SRC) << ((-(SHAMT))-1)) << 1,(SRC)) : (fCAST##REGSTYPE##s(SRC) >> (SHAMT))),
+	()
+)
+
+DEF_MACRO(
+	fASHIFTR,
+	(fCAST##REGSTYPE##s(SRC) >> (SHAMT)),
+	/* */
+)
+
+DEF_MACRO(
+	fLSHIFTR,
+	(((SHAMT) >= 64)?0:(fCAST##REGSTYPE##u(SRC) >> (SHAMT))),
+	/* */
+)
+
+DEF_MACRO(
+	fROTL,
+	(((SHAMT)==0) ? (SRC) : ((fCAST##REGSTYPE##u(SRC) << (SHAMT)) | \
+		((fCAST##REGSTYPE##u(SRC) >> ((sizeof(SRC)*8)-(SHAMT)))))),
+	/* */
+)
+
+DEF_MACRO(
+	fROTR,
+	(((SHAMT)==0) ? (SRC) : ((fCAST##REGSTYPE##u(SRC) >> (SHAMT)) | \
+		((fCAST##REGSTYPE##u(SRC) << ((sizeof(SRC)*8)-(SHAMT)))))),
+	/* */
+)
+
+DEF_MACRO(
+	fASHIFTL,
+	(((SHAMT) >= 64)?0:(fCAST##REGSTYPE##s(SRC) << (SHAMT))),
+	/* */
+)
+
+/*************************************/
+/* Floating-Point Support            */
+/************************************/
+
+DEF_MACRO(
+	fFLOAT, /* name */
+	({ union { float f; size4u_t i; } _fipun; _fipun.i = (A); _fipun.f; }),     /* behavior */
+	(A_FPOP)
+)
+
+DEF_MACRO(
+	fUNFLOAT, /* float to raw 32-bit pattern; NaN becomes all ones */
+	({ union { float f; size4u_t i; } _fipun; _fipun.f = (A); isnan(_fipun.f) ? 0xFFFFFFFFU : _fipun.i; }),     /* behavior */
+	(A_FPOP)
+)
+
+DEF_MACRO(
+	fSFNANVAL,
+	0xffffffff,
+	()
+)
+
+DEF_MACRO(
+	fSFINFVAL,
+	(((A) & 0x80000000) | 0x7f800000),
+	()
+)
+
+DEF_MACRO(
+	fSFONEVAL,
+	(((A) & 0x80000000) | fUNFLOAT(1.0)),
+	()
+)
+
+DEF_MACRO(
+	fCHECKSFNAN,
+	do {
+		if (isnan(fFLOAT(A))) {
+			if ((fGETBIT(22,A)) == 0) fRAISEFLAGS(FE_INVALID);
+			DST = fSFNANVAL();
+		}
+	} while (0),
+	()
+)
+
+DEF_MACRO(
+	fCHECKSFNAN3,
+	do {
+		fCHECKSFNAN(DST,A);
+		fCHECKSFNAN(DST,B);
+		fCHECKSFNAN(DST,C);
+	} while (0),
+	()
+)
+
+DEF_MACRO(
+	fSF_BIAS,
+	127,
+	()
+)
+
+DEF_MACRO(
+	fSF_MANTBITS,
+	23,
+	()
+)
+
+DEF_MACRO(
+	fSF_MUL_POW2,
+	(fUNFLOAT(fFLOAT(A) * fFLOAT((fSF_BIAS() + (B)) << fSF_MANTBITS()))),
+	()
+)
+
+DEF_MACRO(
+	fSF_GETEXP,
+	(((A) >> fSF_MANTBITS()) & 0xff),
+	()
+)
+
+DEF_MACRO(
+	fSF_MAXEXP,
+	(254),
+	()
+)
+
+DEF_MACRO(
+	fSF_RECIP_COMMON,
+	arch_sf_recip_common(&N,&D,&O,&A),
+	(A_FPOP)
+)
+
+DEF_MACRO(
+	fSF_INVSQRT_COMMON,
+	arch_sf_invsqrt_common(&N,&O,&A),
+	(A_FPOP)
+)
+
+DEF_MACRO(
+	fFMAFX,
+	internal_fmafx(A,B,C,fSXTN(8,64,ADJ)),
+	()
+)
+
+DEF_MACRO(
+	fFMAF,
+	internal_fmafx(A,B,C,0),
+	()
+)
+
+DEF_MACRO(
+	fSFMPY,
+	internal_mpyf(A,B),
+	()
+)
+
+DEF_MACRO(
+	fMAKESF,
+	((((SIGN) & 1) << 31) | (((EXP) & 0xff) << fSF_MANTBITS()) |
+		((MANT) & ((1<<fSF_MANTBITS())-1))),
+	()
+)
+
+
+DEF_MACRO(
+	fDOUBLE, /* raw 64-bit pattern to double */
+	({ union { double f; size8u_t i; } _fipun; _fipun.i = (A); _fipun.f; }),     /* behavior */
+	(A_FPOP)
+)
+
+DEF_MACRO(
+	fUNDOUBLE, /* double to raw 64-bit pattern; NaN becomes all ones */
+	({ union { double f; size8u_t i; } _fipun; _fipun.f = (A); isnan(_fipun.f) ? 0xFFFFFFFFFFFFFFFFULL : _fipun.i; }),     /* behavior */
+	(A_FPOP)
+)
+
+DEF_MACRO(
+	fDFNANVAL,
+	0xffffffffffffffffULL,
+	()
+)
+
+DEF_MACRO(
+	fDF_ISNORMAL,
+	(fpclassify(fDOUBLE(X)) == FP_NORMAL),
+	()
+)
+
+DEF_MACRO(
+	fDF_ISDENORM,
+	(fpclassify(fDOUBLE(X)) == FP_SUBNORMAL),
+	()
+)
+
+DEF_MACRO(
+	fDF_ISBIG,
+	(fDF_GETEXP(X) >= 512),
+	()
+)
+
+DEF_MACRO(
+	fDF_MANTBITS,
+	52,
+	()
+)
+
+DEF_MACRO(
+	fDF_GETEXP,
+	(((A) >> fDF_MANTBITS()) & 0x7ff),
+	()
+)
+
+DEF_MACRO(
+	fFMA,
+	internal_fma(A,B,C),
+	/* nothing */
+)
+
+DEF_MACRO(
+	fDF_MPY_HH,
+	internal_mpyhh(A,B,ACC),
+	/* nothing */
+)
+
+DEF_MACRO(
+	fFPOP_START,
+	arch_fpop_start(thread),
+	/* nothing */
+)
+
+DEF_MACRO(
+	fFPOP_END,
+	arch_fpop_end(thread),
+	/* nothing */
+)
+
+DEF_MACRO(
+	fFPSETROUND_NEAREST,
+	fesetround(FE_TONEAREST),
+	/* nothing */
+)
+
+DEF_MACRO(
+	fFPSETROUND_CHOP,
+	fesetround(FE_TOWARDZERO),
+	/* nothing */
+)
+
+DEF_MACRO(
+	fFPCANCELFLAGS,
+	feclearexcept(FE_ALL_EXCEPT),
+	/* nothing */
+)
+
+DEF_MACRO(
+	fISINFPROD,
+	((isinf(A) && isinf(B)) ||
+		(isinf(A) && isfinite(B) && ((B) != 0.0)) ||
+		(isinf(B) && isfinite(A) && ((A) != 0.0))),
+	/* nothing */
+)
+
+DEF_MACRO(
+	fISZEROPROD,
+	((((A) == 0.0) && isfinite(B)) || (((B) == 0.0) && isfinite(A))),
+	/* nothing */
+)
+
+DEF_MACRO(
+	fRAISEFLAGS,
+	arch_raise_fpflag(A),
+	/* NOTHING */
+)
+
+DEF_MACRO(
+	fDF_MAX,
+	(((A)==(B))
+		? fDOUBLE(fUNDOUBLE(A) & fUNDOUBLE(B))
+		: fmax(A,B)),
+	(A_FPOP)
+)
+
+DEF_MACRO(
+	fDF_MIN,
+	(((A)==(B))
+		? fDOUBLE(fUNDOUBLE(A) | fUNDOUBLE(B))
+		: fmin(A,B)),
+	(A_FPOP)
+)
+
+DEF_MACRO(
+	fSF_MAX,
+	(((A)==(B))
+		? fFLOAT(fUNFLOAT(A) & fUNFLOAT(B))
+		: fmaxf(A,B)),
+	(A_FPOP)
+)
+
+DEF_MACRO(
+	fSF_MIN,
+	(((A)==(B))
+		? fFLOAT(fUNFLOAT(A) | fUNFLOAT(B))
+		: fminf(A,B)),
+	(A_FPOP)
+)
+
+/*************************************/
+/* Load/Store support                */
+/*************************************/
+
+DEF_MACRO(fLOAD,
+	{ DST = (size##SIZE##SIGN##_t)MEM_LOAD##SIZE(thread,EA,insn); },
+	(A_LOAD,A_MEMLIKE)
+)
+
+DEF_MACRO(fMEMOP,
+	{ memop##SIZE##_##FNTYPE(thread,EA,VALUE); },
+	(A_LOAD,A_STORE,A_MEMLIKE)
+)
+
+DEF_MACRO(fGET_FRAMEKEY,
+	READ_RREG(REG_FRAMEKEY),
+	()
+)
+
+DEF_MACRO(fFRAME_SCRAMBLE,
+	((VAL) ^ (fCAST8u(fGET_FRAMEKEY()) << 32)),
+	/* ATTRIBS */
+)
+
+DEF_MACRO(fFRAME_UNSCRAMBLE,
+	fFRAME_SCRAMBLE(VAL),
+	/* ATTRIBS */
+)
+
+DEF_MACRO(fFRAMECHECK,
+	sys_check_framelimit(thread,ADDR,EA),
+	()
+)
+
+DEF_MACRO(fLOAD_LOCKED,
+	{     DST = (size##SIZE##SIGN##_t)mem_load_locked(thread,EA,SIZE,insn);
+  },
+	(A_LOAD,A_MEMLIKE)
+)
+
+DEF_MACRO(fSTORE,
+	{ MEM_STORE##SIZE(thread,EA,SRC,insn); },
+	(A_STORE,A_MEMLIKE)
+)
+
+
+DEF_MACRO(fSTORE_LOCKED,
+	{ PRED = (mem_store_conditional(thread,EA,SRC,SIZE,insn) ? 0xff : 0); },
+	(A_STORE,A_MEMLIKE)
+)
+
+/*************************************/
+/* Functions to help with bytes      */
+/*************************************/
+
+DEF_MACRO(fGETBYTE,
+         ((size1s_t)((SRC>>((N)*8))&0xff)),
+	/* nothing */
+)
+
+DEF_MACRO(fGETUBYTE,
+         ((size1u_t)((SRC>>((N)*8))&0xff)),
+	/* nothing */
+)
+
+DEF_MACRO(fSETBYTE,
+	{
+	DST = (DST & ~(0x0ffLL<<((N)*8))) | (((size8u_t)((VAL) & 0x0ffLL)) << ((N)*8));
+	},
+	/* nothing */
+)
+
+DEF_MACRO(fGETHALF,
+         ((size2s_t)((SRC>>((N)*16))&0xffff)),
+	/* nothing */
+)
+
+DEF_MACRO(fGETUHALF,
+         ((size2u_t)((SRC>>((N)*16))&0xffff)),
+	/* nothing */
+)
+
+DEF_MACRO(fSETHALF,
+	{
+	DST = (DST & ~(0x0ffffLL<<((N)*16))) | (((size8u_t)((VAL) & 0x0ffff)) << ((N)*16));
+	},
+	/* nothing */
+)
+
+
+
+DEF_MACRO(fGETWORD,
+         ((size8s_t)((size4s_t)((SRC>>((N)*32))&0x0ffffffffLL))),
+	/* nothing */
+)
+
+DEF_MACRO(fGETUWORD,
+         ((size8u_t)((size4u_t)((SRC>>((N)*32))&0x0ffffffffLL))),
+	/* nothing */
+)
+
+DEF_MACRO(fSETWORD,
+	{
+	DST = (DST & ~(0x0ffffffffLL<<((N)*32))) | (((VAL) & 0x0ffffffffLL) << ((N)*32));
+	},
+	/* nothing */
+)
+
+DEF_MACRO(fSETBIT,
+	{
+	DST = (DST & ~(1ULL<<(N))) | (((size8u_t)(VAL))<<(N));
+	},
+	/* nothing */
+)
+
+DEF_MACRO(fGETBIT,
+	(((SRC)>>N)&1),
+	/* nothing */
+)
+
+
+DEF_MACRO(fSETBITS,
+	do {
+        int j;
+        for (j=LO;j<=HI;j++) {
+          fSETBIT(j,DST,VAL);
+        }
+	} while (0),
+	/* nothing */
+)
+
+/*************************************/
+/* Used for parity, etc........      */
+/*************************************/
+DEF_MACRO(fCOUNTONES_4,
+	count_ones_4(VAL),
+	/* nothing */
+)
+
+DEF_MACRO(fCOUNTONES_8,
+	count_ones_8(VAL),
+	/* nothing */
+)
+
+DEF_MACRO(fBREV_8,
+	reverse_bits_8(VAL),
+	/* nothing */
+)
+
+DEF_MACRO(fBREV_4,
+	reverse_bits_4(VAL),
+	/* nothing */
+)
+
+DEF_MACRO(fCL1_8,
+	count_leading_ones_8(VAL),
+	/* nothing */
+)
+
+DEF_MACRO(fCL1_4,
+	count_leading_ones_4(VAL),
+	/* nothing */
+)
+
+DEF_MACRO(fINTERLEAVE,
+	interleave(ODD,EVEN),
+	/* nothing */
+)
+
+DEF_MACRO(fDEINTERLEAVE,
+	deinterleave(MIXED),
+	/* nothing */
+)
+
+DEF_MACRO(fHIDE,
+	A,
+	()
+)
+
+DEF_MACRO(fCONSTLL,
+	A##LL,
+)
+
+/* Do the things in the parens, but don't print the parens. */
+DEF_MACRO(fECHO,
+	(A),
+	/* nothing */
+)
+
+
+/********************************************/
+/* OS interface and stop/wait               */
+/********************************************/
+
+DEF_MACRO(fTRAP,
+    warn("Trap NPC=%x ",fREAD_NPC());
+	warn("Trap exception, PCYCLE=%lld TYPE=%d NPC=%x IMM=0x%x",thread->processor_ptr->pstats[pcycles],TRAPTYPE,fREAD_NPC(),IMM);
+	register_trap_exception(thread,fREAD_NPC(),TRAPTYPE,IMM);,
+	()
+)
+
+DEF_MACRO(fALIGN_REG_FIELD_VALUE,
+	((VAL)<<reg_field_info[FIELD].offset),
+	/* */
+)
+
+DEF_MACRO(fGET_REG_FIELD_MASK,
+	(((1<<reg_field_info[FIELD].width)-1)<<reg_field_info[FIELD].offset),
+	/* */
+)
+
+DEF_MACRO(fREAD_REG_FIELD,
+	fEXTRACTU_BITS(thread->Regs[REG_##REG],
+            reg_field_info[FIELD].width,
+            reg_field_info[FIELD].offset),
+	/* ATTRIBS */
+)
+
+DEF_MACRO(fGET_FIELD,
+	fEXTRACTU_BITS(VAL,
+		reg_field_info[FIELD].width,
+		reg_field_info[FIELD].offset),
+	/* ATTRIBS */
+)
+
+DEF_MACRO(fSET_FIELD,
+	fINSERT_BITS(VAL,
+		reg_field_info[FIELD].width,
+		reg_field_info[FIELD].offset,
+		(NEWVAL)),
+	/* ATTRIBS */
+)
+
+/********************************************/
+/* Cache Management                         */
+/********************************************/
+
+DEF_MACRO(fBARRIER,
+     {
+        sys_barrier(thread, insn->slot);
+     },
+	()
+)
+
+DEF_MACRO(fSYNCH,
+  {
+      sys_sync(thread, insn->slot);
+  },
+	()
+)
+
+DEF_MACRO(fISYNC,
+  {
+      sys_isync(thread, insn->slot);
+  },
+	()
+)
+
+
+DEF_MACRO(fDCFETCH,
+	sys_dcfetch(thread, (REG), insn->slot),
+	(A_MEMLIKE)
+)
+
+DEF_MACRO(fICINVA,
+	{
+	arch_internal_flush(thread->processor_ptr, 0, 0xffffffff);
+   	sys_icinva(thread, (REG),insn->slot);
+	},
+	(A_ICINVA)
+)
+
+DEF_MACRO(fL2FETCH,
+	sys_l2fetch(thread, ADDR,HEIGHT,WIDTH,STRIDE,FLAGS, insn->slot),
+	(A_MEMLIKE,A_L2FETCH)
+)
+
+DEF_MACRO(fDCCLEANA,
+	sys_dccleana(thread, (REG)),
+	(A_MEMLIKE)
+)
+
+DEF_MACRO(fDCCLEANINVA,
+	sys_dccleaninva(thread, (REG), insn->slot),
+	(A_MEMLIKE,A_DCCLEANINVA)
+)
+
+DEF_MACRO(fDCZEROA,
+	sys_dczeroa(thread, (REG)),
+	(A_MEMLIKE)
+)
+
+DEF_MACRO(fCHECKFORPRIV,
+	{sys_check_privs(thread); if (EXCEPTION_DETECTED) return; },
+	()
+)
+
+DEF_MACRO(fCHECKFORGUEST,
+	{sys_check_guest(thread); if (EXCEPTION_DETECTED) return; },
+	()
+)
+
+DEF_MACRO(fBRANCH_SPECULATE_STALL,
+	{
+sys_speculate_branch_stall(thread, insn->slot, JUMP_COND(JUMP_PRED_SET),
+										   SPEC_DIR,
+										   DOTNEWVAL,
+										   HINTBITNUM,
+										   STRBITNUM,
+										   0,
+										   thread->last_pkt->pkt_has_dual_jump,
+										   insn->is_2nd_jump,
+										   (thread->fetch_access.vaddr + insn->encoding_offset*4));
+  },
+	()
+)
+
-- 
2.7.4

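For readers following the extension and saturation macros in this patch
(fZXTN, fSXTN, fSATN), here is a minimal sketch of the same arithmetic
written as plain C functions.  The function names and the "ovf" out
parameter are made up for illustration; "ovf" only stands in for
fSET_OVERFLOW().  The sketch restricts the width to 1..62 bits so that
the shifts stay well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend the low n bits of val (same mask as fZXTN). */
static int64_t zxtn(int n, int64_t val)
{
    assert(n >= 1 && n <= 62);   /* keep 1LL << n well defined */
    return val & ((1LL << n) - 1);
}

/* Sign-extend the low n bits of val (same xor/subtract trick as fSXTN). */
static int64_t sxtn(int n, int64_t val)
{
    int64_t sign = 1LL << (n - 1);
    return (zxtn(n, val) ^ sign) - sign;
}

/*
 * Saturate val to the signed n-bit range, as fSATN does.
 * *ovf is set to 1 only when the value had to be clamped.
 */
static int64_t satn(int n, int64_t val, int *ovf)
{
    if (sxtn(n, val) == val) {
        return val;              /* already representable in n bits */
    }
    *ovf = 1;
    return (val < 0) ? -(1LL << (n - 1)) : (1LL << (n - 1)) - 1;
}
```

With these definitions, satn(16, 40000, &ovf) clamps to 32767 and sets
ovf, while satn(16, -5, &ovf) returns -5 and leaves ovf untouched.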

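The fADDSAT64 macro in this patch detects signed overflow purely from
sign bits: the add can only overflow when both operands have the same
sign and the sum's sign differs from theirs.  A minimal sketch of that
check as a standalone function follows.  The function name and the
"ovf" out parameter are assumptions for illustration (the macro calls
fSET_OVERFLOW() instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturating signed 64-bit add using the sign-bit test from fADDSAT64. */
static int64_t addsat64(int64_t a, int64_t b, int *ovf)
{
    const uint64_t mask = 0x8000000000000000ULL;
    uint64_t ua = (uint64_t)a;
    uint64_t ub = (uint64_t)b;
    uint64_t sum = ua + ub;      /* unsigned add wraps without UB */

    assert(ovf != NULL);

    /* Same-sign operands whose sum changed sign: overflow. */
    if (!((ua ^ ub) & mask) && ((ua ^ sum) & mask)) {
        *ovf = 1;
        /* Sum looks negative: we overflowed upward, so clamp to max. */
        return (sum & mask) ? INT64_MAX : INT64_MIN;
    }

    /*
     * No overflow, so sum holds the two's complement result.  The cast
     * relies on the usual wraparound conversion of gcc and clang.
     */
    return (int64_t)sum;
}
```

For example, addsat64(INT64_MAX, 1, &ovf) stays at INT64_MAX and sets
ovf, whereas addsat64(-5, 3, &ovf) returns -2 without touching ovf.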

* [RFC PATCH v3 18/34] Hexagon (target/hexagon/imported) arch import - instruction semantics
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (16 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 17/34] Hexagon (target/hexagon/imported) arch import - macro definitions Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-18 15:50 ` [RFC PATCH v3 19/34] Hexagon (target/hexagon/imported) arch import - instruction encoding Taylor Simpson
                   ` (17 subsequent siblings)
  35 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Imported from the Hexagon architecture library
    imported/allidefs.def          Top level instruction definition file
    imported/*.idef                Instruction definition files
These files are the input to the first phase of the generator
(gen_semantics.c), which creates a Python include file holding the
instruction semantics and attributes.  The second phase reads that Python
include file and generates various header files.

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
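As an illustration of the first generator phase described above, here is
a minimal sketch of how Q6INSN definitions can be turned into text with
the preprocessor.  It is not the actual gen_semantics.c: the table
layout, the emit_python() helper and the SEMANTICS/ATTRIBUTES dictionary
names are all assumptions made for this example.  Only the two Q6INSN
entries are copied from alu.idef.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Two-level stringification so macro arguments are expanded first. */
#define STR_(...) #__VA_ARGS__
#define STR(...)  STR_(__VA_ARGS__)

/* Hypothetical stand-in: keep the attribute list as a parenthesized tuple. */
#define ATTRIBS(...) (__VA_ARGS__)

struct insn_text {
    const char *tag;
    const char *syntax;
    const char *attribs;
    const char *semantics;
};

/* Each Q6INSN(...) becomes one table row; the semantics are kept as text. */
#define Q6INSN(TAG, SYNTAX, ATTRS, DESCR, ...) \
    { #TAG, SYNTAX, STR(ATTRS), #__VA_ARGS__ },

static const struct insn_text insn_table[] = {
    /* A real generator would #include "allidefs.def" here instead. */
    Q6INSN(A2_add,"Rd32=add(Rs32,Rt32)",ATTRIBS(),
    "Add 32-bit registers",
    { RdV=RsV+RtV;})
    Q6INSN(A2_addpsat,"Rdd32=add(Rss32,Rtt32):sat",ATTRIBS(A_ARCHV3),
    "Add",
    { fADDSAT64(RddV,RssV,RttV);})
};

/*
 * Write one Python-style assignment pair per instruction and return the
 * number of instructions written.  No escaping is done, so this only
 * works for semantics text without double quotes.
 */
static int emit_python(FILE *out)
{
    size_t n = sizeof(insn_table) / sizeof(insn_table[0]);

    assert(out != NULL);
    for (size_t i = 0; i < n; i++) {
        fprintf(out, "SEMANTICS[\"%s\"] = \"%s\"\n",
                insn_table[i].tag, insn_table[i].semantics);
        fprintf(out, "ATTRIBUTES[\"%s\"] = \"%s\"\n",
                insn_table[i].tag, insn_table[i].attribs);
    }
    return (int)n;
}
```

Calling emit_python(stdout) prints one SEMANTICS and one ATTRIBUTES line
per instruction.  The point of the design is that the .idef files stay
the single source of truth: redefining Q6INSN changes what is extracted
from them without editing the instruction definitions.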
 target/hexagon/imported/allidefs.def  |   30 +
 target/hexagon/imported/alu.idef      | 1259 +++++++++++++++++++++++++++++++++
 target/hexagon/imported/branch.idef   |  328 +++++++++
 target/hexagon/imported/compare.idef  |  621 ++++++++++++++++
 target/hexagon/imported/float.idef    |  313 ++++++++
 target/hexagon/imported/ldst.idef     |  286 ++++++++
 target/hexagon/imported/mpy.idef      | 1212 +++++++++++++++++++++++++++++++
 target/hexagon/imported/shift.idef    | 1067 ++++++++++++++++++++++++++++
 target/hexagon/imported/subinsns.idef |  152 ++++
 target/hexagon/imported/system.idef   |   69 ++
 10 files changed, 5337 insertions(+)
 create mode 100644 target/hexagon/imported/allidefs.def
 create mode 100644 target/hexagon/imported/alu.idef
 create mode 100644 target/hexagon/imported/branch.idef
 create mode 100644 target/hexagon/imported/compare.idef
 create mode 100644 target/hexagon/imported/float.idef
 create mode 100644 target/hexagon/imported/ldst.idef
 create mode 100644 target/hexagon/imported/mpy.idef
 create mode 100644 target/hexagon/imported/shift.idef
 create mode 100644 target/hexagon/imported/subinsns.idef
 create mode 100644 target/hexagon/imported/system.idef

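To make the halfword forms in alu.idef easier to read, here is a minimal
sketch of what one of them computes once the SUBSTD_HL_INSN and HLSEM
macros are expanded: A2_addh_l16_sat_hl, i.e. Rd32=add(Rt.L32,Rs.H32):sat.
The helper names are made up for illustration, and the overflow flag that
fSATH sets through fSET_OVERFLOW() is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Signed halfword n (0 = low, 1 = high) of a 32-bit register, in the
 * spirit of fGETHALF.  The int16_t cast relies on the usual wraparound
 * conversion of gcc and clang.
 */
static int32_t get_half(int n, uint32_t reg)
{
    assert(n == 0 || n == 1);
    return (int16_t)((reg >> (n * 16)) & 0xffff);
}

/* Clamp to the signed 16-bit range, like fSATH without the flag update. */
static int32_t sat_half(int32_t v)
{
    if (v > INT16_MAX) {
        return INT16_MAX;
    }
    if (v < INT16_MIN) {
        return INT16_MIN;
    }
    return v;
}

/* Rd = add(Rt.L, Rs.H):sat */
static int32_t addh_l16_sat_hl(uint32_t rt, uint32_t rs)
{
    return sat_half(get_half(0, rt) + get_half(1, rs));
}
```

For instance, with Rt.L = 0x7fff and Rs.H = 0x0001 the plain sum 32768
does not fit in 16 bits, so the result saturates to 32767.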
diff --git a/target/hexagon/imported/allidefs.def b/target/hexagon/imported/allidefs.def
new file mode 100644
index 0000000..6339c10
--- /dev/null
+++ b/target/hexagon/imported/allidefs.def
@@ -0,0 +1,30 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * Top level instruction definition file
+ */
+
+#include "branch.idef"
+#include "ldst.idef"
+#include "compare.idef"
+#include "mpy.idef"
+#include "alu.idef"
+#include "float.idef"
+#include "shift.idef"
+#include "system.idef"
+#include "subinsns.idef"
diff --git a/target/hexagon/imported/alu.idef b/target/hexagon/imported/alu.idef
new file mode 100644
index 0000000..8c0804e
--- /dev/null
+++ b/target/hexagon/imported/alu.idef
@@ -0,0 +1,1259 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * ALU Instructions
+ */
+
+
+/**********************************************/
+/* Add/Sub instructions		       */
+/**********************************************/
+
+Q6INSN(A2_add,"Rd32=add(Rs32,Rt32)",ATTRIBS(),
+"Add 32-bit registers",
+{ RdV=RsV+RtV;})
+
+Q6INSN(A2_sub,"Rd32=sub(Rt32,Rs32)",ATTRIBS(),
+"Subtract 32-bit registers",
+{ RdV=RtV-RsV;})
+
+#define COND_ALU(TAG,OPER,DESCR,SEMANTICS)\
+Q6INSN(TAG##t,"if (Pu4) "OPER,ATTRIBS(A_ARCHV2),DESCR,{if(fLSBOLD(PuV)){SEMANTICS;} else {CANCEL;}})\
+Q6INSN(TAG##f,"if (!Pu4) "OPER,ATTRIBS(A_ARCHV2),DESCR,{if(fLSBOLDNOT(PuV)){SEMANTICS;} else {CANCEL;}})\
+Q6INSN(TAG##tnew,"if (Pu4.new) " OPER,ATTRIBS(A_ARCHV2),DESCR,{if(fLSBNEW(PuN)){SEMANTICS;} else {CANCEL;}})\
+Q6INSN(TAG##fnew,"if (!Pu4.new) "OPER,ATTRIBS(A_ARCHV2),DESCR,{if(fLSBNEWNOT(PuN)){SEMANTICS;} else {CANCEL;}})
+
+COND_ALU(A2_padd,"Rd32=add(Rs32,Rt32)","Conditionally Add 32-bit registers",RdV=RsV+RtV)
+COND_ALU(A2_psub,"Rd32=sub(Rt32,Rs32)","Conditionally Subtract 32-bit registers",RdV=RtV-RsV)
+COND_ALU(A2_paddi,"Rd32=add(Rs32,#s8)","Conditionally Add Register and immediate",fIMMEXT(siV); RdV=RsV+siV)
+COND_ALU(A2_pxor,"Rd32=xor(Rs32,Rt32)","Conditionally XOR registers",RdV=RsV^RtV);
+COND_ALU(A2_pand,"Rd32=and(Rs32,Rt32)","Conditionally AND registers",RdV=RsV&RtV);
+COND_ALU(A2_por,"Rd32=or(Rs32,Rt32)","Conditionally OR registers",RdV=RsV|RtV);
+
+COND_ALU(A4_psxtb,"Rd32=sxtb(Rs32)","Conditionally sign-extend byte", RdV=fSXTN(8,32,RsV));
+COND_ALU(A4_pzxtb,"Rd32=zxtb(Rs32)","Conditionally zero-extend byte", RdV=fZXTN(8,32,RsV));
+COND_ALU(A4_psxth,"Rd32=sxth(Rs32)","Conditionally sign-extend halfword", RdV=fSXTN(16,32,RsV));
+COND_ALU(A4_pzxth,"Rd32=zxth(Rs32)","Conditionally zero-extend halfword", RdV=fZXTN(16,32,RsV));
+COND_ALU(A4_paslh,"Rd32=aslh(Rs32)","Conditionally arithmetic shift left by halfword", RdV=RsV<<16);
+COND_ALU(A4_pasrh,"Rd32=asrh(Rs32)","Conditionally arithmetic shift right by halfword", RdV=RsV>>16);
+
+
+Q6INSN(A2_addsat,"Rd32=add(Rs32,Rt32):sat",ATTRIBS(),
+"Add 32-bit registers with saturation",
+{ RdV=fSAT(fSE32_64(RsV)+fSE32_64(RtV)); })
+
+Q6INSN(A2_subsat,"Rd32=sub(Rt32,Rs32):sat",ATTRIBS(),
+"Subtract 32-bit registers with saturation",
+{ RdV=fSAT(fSE32_64(RtV) - fSE32_64(RsV)); })
+
+
+Q6INSN(A2_addi,"Rd32=add(Rs32,#s16)",ATTRIBS(),
+"Add a signed immediate to a register",
+{ fIMMEXT(siV); RdV=RsV+siV;})
+
+
+Q6INSN(C4_addipc,"Rd32=add(pc,#u6)",ATTRIBS(),
+"Add immediate to PC",
+{ RdV=fREAD_PC()+fIMMEXT(uiV);})
+
+
+
+/**********************************************/
+/* Single-precision HL forms		  */
+/* These insns and the SP mpy are the ones    */
+/* that can do .HL stuff		      */
+/**********************************************/
+#define STD_HL_INSN(TAG,OPER,AOPER,ATR,SEM)\
+Q6INSN(A2_##TAG##_ll, OPER"(Rt.L32,Rs.L32)"AOPER,    ATR,"",{SEM(fGETHALF(0,RtV),fGETHALF(0,RsV));})\
+Q6INSN(A2_##TAG##_lh, OPER"(Rt.L32,Rs.H32)"AOPER,    ATR,"",{SEM(fGETHALF(0,RtV),fGETHALF(1,RsV));})\
+Q6INSN(A2_##TAG##_hl, OPER"(Rt.H32,Rs.L32)"AOPER,    ATR,"",{SEM(fGETHALF(1,RtV),fGETHALF(0,RsV));})\
+Q6INSN(A2_##TAG##_hh, OPER"(Rt.H32,Rs.H32)"AOPER,    ATR,"",{SEM(fGETHALF(1,RtV),fGETHALF(1,RsV));})
+
+#define SUBSTD_HL_INSN(TAG,OPER,AOPER,ATR,SEM)\
+Q6INSN(A2_##TAG##_ll, OPER"(Rt.L32,Rs.L32)"AOPER,    ATR,"",{SEM(fGETHALF(0,RtV),fGETHALF(0,RsV));})\
+Q6INSN(A2_##TAG##_hl, OPER"(Rt.L32,Rs.H32)"AOPER,    ATR,"",{SEM(fGETHALF(0,RtV),fGETHALF(1,RsV));})
+
+
+#undef HLSEM
+#define HLSEM(A,B) RdV=fSXTN(16,32,(A+B))
+SUBSTD_HL_INSN(addh_l16,"Rd32=add","",ATTRIBS(),HLSEM)
+
+#undef HLSEM
+#define HLSEM(A,B) RdV=fSATH(A+B)
+SUBSTD_HL_INSN(addh_l16_sat,"Rd32=add",":sat",ATTRIBS(),HLSEM)
+
+#undef HLSEM
+#define HLSEM(A,B) RdV=fSXTN(16,32,(A-B))
+SUBSTD_HL_INSN(subh_l16,"Rd32=sub","",ATTRIBS(),HLSEM)
+
+#undef HLSEM
+#define HLSEM(A,B) RdV=fSATH(A-B)
+SUBSTD_HL_INSN(subh_l16_sat,"Rd32=sub",":sat",ATTRIBS(),HLSEM)
+
+#undef HLSEM
+#define HLSEM(A,B) RdV=(A+B)<<16
+STD_HL_INSN(addh_h16,"Rd32=add",":<<16",ATTRIBS(),HLSEM)
+
+#undef HLSEM
+#define HLSEM(A,B) RdV=(fSATH(A+B))<<16
+STD_HL_INSN(addh_h16_sat,"Rd32=add",":sat:<<16",ATTRIBS(),HLSEM)
+
+#undef HLSEM
+#define HLSEM(A,B) RdV=(A-B)<<16
+STD_HL_INSN(subh_h16,"Rd32=sub",":<<16",ATTRIBS(),HLSEM)
+
+#undef HLSEM
+#define HLSEM(A,B) RdV=(fSATH(A-B))<<16
+STD_HL_INSN(subh_h16_sat,"Rd32=sub",":sat:<<16",ATTRIBS(),HLSEM)
+
+
+
+
+Q6INSN(A2_aslh,"Rd32=aslh(Rs32)",ATTRIBS(),
+"Arithmetic Shift Left by Halfword",{ RdV=RsV<<16; })
+
+Q6INSN(A2_asrh,"Rd32=asrh(Rs32)",ATTRIBS(),
+"Arithmetic Shift Right by Halfword",{ RdV=RsV>>16; })
+
+
+/* 64-bit versions */
+
+Q6INSN(A2_addp,"Rdd32=add(Rss32,Rtt32)",ATTRIBS(),
+"Add",
+{ RddV=RssV+RttV;})
+
+Q6INSN(A2_addpsat,"Rdd32=add(Rss32,Rtt32):sat",ATTRIBS(A_ARCHV3),
+"Add",
+{ fADDSAT64(RddV,RssV,RttV);})
+
+Q6INSN(A2_addspl,"Rdd32=add(Rss32,Rtt32):raw:lo",ATTRIBS(A_ARCHV3),
+"Add",
+{ RddV=RttV+fSXTN(32,64,fGETWORD(0,RssV));})
+
+Q6INSN(A2_addsph,"Rdd32=add(Rss32,Rtt32):raw:hi",ATTRIBS(A_ARCHV3),
+"Add",
+{ RddV=RttV+fSXTN(32,64,fGETWORD(1,RssV));})
+
+Q6INSN(A2_subp,"Rdd32=sub(Rtt32,Rss32)",ATTRIBS(),
+"Sub",
+{ RddV=RttV-RssV;})
+
+/* NEG and ABS */
+
+Q6INSN(A2_negsat,"Rd32=neg(Rs32):sat",ATTRIBS(),
+"Arithmetic negate register", { RdV = fSAT(-fCAST8s(RsV)); })
+
+Q6INSN(A2_abs,"Rd32=abs(Rs32)",ATTRIBS(),
+"Absolute Value register", { RdV = fABS(RsV); })
+
+Q6INSN(A2_abssat,"Rd32=abs(Rs32):sat",ATTRIBS(),
+"Absolute Value register with saturation", { RdV = fSAT(fABS(fCAST4_8s(RsV))); })
+
+Q6INSN(A2_vconj,"Rdd32=vconj(Rss32):sat",ATTRIBS(A_ARCHV2),
+"Vector Complex conjugate of Rss",
+{  fSETHALF(1,RddV,fSATN(16,-fGETHALF(1,RssV)));
+   fSETHALF(0,RddV,fGETHALF(0,RssV));
+   fSETHALF(3,RddV,fSATN(16,-fGETHALF(3,RssV)));
+   fSETHALF(2,RddV,fGETHALF(2,RssV));
+})
+
+
+/* 64-bit versions */
+
+Q6INSN(A2_negp,"Rdd32=neg(Rss32)",ATTRIBS(),
+"Arithmetic negate register", { RddV = -RssV; })
+
+Q6INSN(A2_absp,"Rdd32=abs(Rss32)",ATTRIBS(),
+"Absolute Value register", { RddV = fABS(RssV); })
+
+
+/* MIN and MAX  R */
+
+Q6INSN(A2_max,"Rd32=max(Rs32,Rt32)",ATTRIBS(),
+"Maximum of two registers",
+{ RdV = fMAX(RsV,RtV); })
+
+Q6INSN(A2_maxu,"Rd32=maxu(Rs32,Rt32)",ATTRIBS(),
+"Maximum of two registers (unsigned)",
+{ RdV = fMAX(fCAST4u(RsV),fCAST4u(RtV)); })
+
+Q6INSN(A2_min,"Rd32=min(Rt32,Rs32)",ATTRIBS(),
+"Minimum of two registers",
+{ RdV = fMIN(RtV,RsV); })
+
+Q6INSN(A2_minu,"Rd32=minu(Rt32,Rs32)",ATTRIBS(),
+"Minimum of two registers (unsigned)",
+{ RdV = fMIN(fCAST4u(RtV),fCAST4u(RsV)); })
+
+/* MIN and MAX Pairs */
+#if 1
+Q6INSN(A2_maxp,"Rdd32=max(Rss32,Rtt32)",ATTRIBS(A_ARCHV3),
+"Maximum of two register pairs",
+{ RddV = fMAX(RssV,RttV); })
+
+Q6INSN(A2_maxup,"Rdd32=maxu(Rss32,Rtt32)",ATTRIBS(A_ARCHV3),
+"Maximum of two register pairs (unsigned)",
+{ RddV = fMAX(fCAST8u(RssV),fCAST8u(RttV)); })
+
+Q6INSN(A2_minp,"Rdd32=min(Rtt32,Rss32)",ATTRIBS(A_ARCHV3),
+"Minimum of two register pairs",
+{ RddV = fMIN(RttV,RssV); })
+
+Q6INSN(A2_minup,"Rdd32=minu(Rtt32,Rss32)",ATTRIBS(A_ARCHV3),
+"Minimum of two register pairs (unsigned)",
+{ RddV = fMIN(fCAST8u(RttV),fCAST8u(RssV)); })
+#endif
+
+/**********************************************/
+/* Register and Immediate Transfers	   */
+/**********************************************/
+
+Q6INSN(A2_nop,"nop",ATTRIBS(A_IT_NOP),
+"Nop (32-bit encoding)",
+ fHIDE( { }  ))
+
+
+Q6INSN(A4_ext,"immext(#u26:6)",ATTRIBS(A_IT_EXTENDER),
+"This instruction carries the 26 most-significant immediate bits for the next instruction",
+{ fHIDE(); })
+
+
+Q6INSN(A2_tfr,"Rd32=Rs32",ATTRIBS(),
+"tfr register",{ RdV=RsV;})
+
+Q6INSN(A2_tfrsi,"Rd32=#s16",ATTRIBS(),
+"transfer signed immediate to register",{ fIMMEXT(siV); RdV=siV;})
+
+Q6INSN(A2_sxtb,"Rd32=sxtb(Rs32)",ATTRIBS(),
+"Sign extend byte", {RdV = fSXTN(8,32,RsV);})
+
+Q6INSN(A2_zxth,"Rd32=zxth(Rs32)",ATTRIBS(),
+"Zero extend half", {RdV = fZXTN(16,32,RsV);})
+
+Q6INSN(A2_sxth,"Rd32=sxth(Rs32)",ATTRIBS(),
+"Sign extend half", {RdV = fSXTN(16,32,RsV);})
+
+Q6INSN(A2_combinew,"Rdd32=combine(Rs32,Rt32)",ATTRIBS(),
+"Combine two words into a register pair",
+{ fSETWORD(0,RddV,RtV);
+  fSETWORD(1,RddV,RsV);
+})
+
+Q6INSN(A4_combineri,"Rdd32=combine(Rs32,#s8)",ATTRIBS(),
+"Combine a word and an immediate into a register pair",
+{ fIMMEXT(siV); fSETWORD(0,RddV,siV);
+  fSETWORD(1,RddV,RsV);
+})
+
+Q6INSN(A4_combineir,"Rdd32=combine(#s8,Rs32)",ATTRIBS(),
+"Combine a word and an immediate into a register pair",
+{ fIMMEXT(siV); fSETWORD(0,RddV,RsV);
+  fSETWORD(1,RddV,siV);
+})
+
+
+
+Q6INSN(A2_combineii,"Rdd32=combine(#s8,#S8)",ATTRIBS(A_ARCHV2),
+"Set two small immediates",
+{ fIMMEXT(siV); fSETWORD(0,RddV,SiV); fSETWORD(1,RddV,siV); })
+
+Q6INSN(A4_combineii,"Rdd32=combine(#s8,#U6)",ATTRIBS(),"Set two small immediates",
+{ fIMMEXT(UiV); fSETWORD(0,RddV,UiV); fSETWORD(1,RddV,siV); })
+
+
+Q6INSN(A2_combine_hh,"Rd32=combine(Rt.H32,Rs.H32)",ATTRIBS(),
+"Combine two halves into a register", {RdV = (fGETUHALF(1,RtV)<<16) | fGETUHALF(1,RsV);})
+
+Q6INSN(A2_combine_hl,"Rd32=combine(Rt.H32,Rs.L32)",ATTRIBS(),
+"Combine two halves into a register", {RdV = (fGETUHALF(1,RtV)<<16) | fGETUHALF(0,RsV);})
+
+Q6INSN(A2_combine_lh,"Rd32=combine(Rt.L32,Rs.H32)",ATTRIBS(),
+"Combine two halves into a register", {RdV = (fGETUHALF(0,RtV)<<16) | fGETUHALF(1,RsV);})
+
+Q6INSN(A2_combine_ll,"Rd32=combine(Rt.L32,Rs.L32)",ATTRIBS(),
+"Combine two halves into a register", {RdV = (fGETUHALF(0,RtV)<<16) | fGETUHALF(0,RsV);})
+
+Q6INSN(A2_tfril,"Rx.L32=#u16",ATTRIBS(),
+"Set low 16-bits, leave upper 16 unchanged",{ fSETHALF(0,RxV,uiV);})
+
+Q6INSN(A2_tfrih,"Rx.H32=#u16",ATTRIBS(),
+"Set high 16-bits, leave low 16 unchanged",{ fSETHALF(1,RxV,uiV);})
+
+Q6INSN(A2_tfrcrr,"Rd32=Cs32",ATTRIBS(),
+"transfer control register to general register",{ RdV=CsV;})
+
+Q6INSN(A2_tfrrcr,"Cd32=Rs32",ATTRIBS(),
+"transfer general register to control register",{ CdV=RsV;})
+
+Q6INSN(A4_tfrcpp,"Rdd32=Css32",ATTRIBS(),
+"transfer control register to general register",{ RddV=CssV;})
+
+Q6INSN(A4_tfrpcp,"Cdd32=Rss32",ATTRIBS(),
+"transfer general register to control register",{ CddV=RssV;})
+
+
+/**********************************************/
+/* Logicals				   */
+/**********************************************/
+
+Q6INSN(A2_and,"Rd32=and(Rs32,Rt32)",ATTRIBS(),
+"logical AND",{ RdV=RsV&RtV;})
+
+Q6INSN(A2_or,"Rd32=or(Rs32,Rt32)",ATTRIBS(),
+"logical OR",{ RdV=RsV|RtV;})
+
+Q6INSN(A2_xor,"Rd32=xor(Rs32,Rt32)",ATTRIBS(),
+"logical XOR",{ RdV=RsV^RtV;})
+
+Q6INSN(M2_xor_xacc,"Rx32^=xor(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"logical XOR with XOR accumulation",{ RxV^=RsV^RtV;})
+
+Q6INSN(M4_xor_xacc,"Rxx32^=xor(Rss32,Rtt32)",ATTRIBS(),
+"logical XOR with XOR accumulation",{ RxxV^=RssV^RttV;})
+
+
+
+Q6INSN(A4_andn,"Rd32=and(Rt32,~Rs32)",ATTRIBS(),
+"And-Not", { RdV = (RtV & ~RsV); })
+
+Q6INSN(A4_orn,"Rd32=or(Rt32,~Rs32)",ATTRIBS(),
+"Or-Not", { RdV = (RtV | ~RsV); })
+
+
+Q6INSN(A4_andnp,"Rdd32=and(Rtt32,~Rss32)",ATTRIBS(),
+"And-Not", { RddV = (RttV & ~RssV); })
+
+Q6INSN(A4_ornp,"Rdd32=or(Rtt32,~Rss32)",ATTRIBS(),
+"Or-Not", { RddV = (RttV | ~RssV); })
+
+
+
+
+/********************/
+/* Compound add-add */
+/********************/
+
+Q6INSN(S4_addaddi,"Rd32=add(Rs32,add(Ru32,#s6))",ATTRIBS(),
+	"3-input add",
+	{ RdV = RsV + RuV + fIMMEXT(siV); })
+
+
+Q6INSN(S4_subaddi,"Rd32=add(Rs32,sub(#s6,Ru32))",ATTRIBS(),
+	"3-input sub",
+	{ RdV = RsV - RuV + fIMMEXT(siV); })
+
+
+
+/****************************/
+/* Compound logical-logical */
+/****************************/
+
+Q6INSN(M4_and_and,"Rx32&=and(Rs32,Rt32)",ATTRIBS(),
+"Compound And-And", { RxV &= (RsV & RtV); })
+
+Q6INSN(M4_and_andn,"Rx32&=and(Rs32,~Rt32)",ATTRIBS(),
+"Compound And-Andn", { RxV &= (RsV & ~RtV); })
+
+Q6INSN(M4_and_or,"Rx32&=or(Rs32,Rt32)",ATTRIBS(),
+"Compound And-Or", { RxV &= (RsV | RtV); })
+
+Q6INSN(M4_and_xor,"Rx32&=xor(Rs32,Rt32)",ATTRIBS(),
+"Compound And-xor", { RxV &= (RsV ^ RtV); })
+
+
+
+Q6INSN(M4_or_and,"Rx32|=and(Rs32,Rt32)",ATTRIBS(),
+"Compound Or-And", { RxV |= (RsV & RtV); })
+
+Q6INSN(M4_or_andn,"Rx32|=and(Rs32,~Rt32)",ATTRIBS(),
+"Compound Or-AndN", { RxV |= (RsV & ~RtV); })
+
+Q6INSN(M4_or_or,"Rx32|=or(Rs32,Rt32)",ATTRIBS(),
+"Compound Or-Or", { RxV |= (RsV | RtV); })
+
+Q6INSN(M4_or_xor,"Rx32|=xor(Rs32,Rt32)",ATTRIBS(),
+"Compound Or-xor", { RxV |= (RsV ^ RtV); })
+
+
+Q6INSN(S4_or_andix,"Rx32=or(Ru32,and(Rx32,#s10))",ATTRIBS(),
+"Compound Or-And", { RxV = RuV | (RxV & fIMMEXT(siV)); })
+
+Q6INSN(S4_or_andi,"Rx32|=and(Rs32,#s10)",ATTRIBS(),
+"Compound Or-And", { RxV = RxV | (RsV & fIMMEXT(siV)); })
+
+Q6INSN(S4_or_ori,"Rx32|=or(Rs32,#s10)",ATTRIBS(),
+"Compound Or-Or", { RxV = RxV | (RsV | fIMMEXT(siV)); })
+
+
+
+
+Q6INSN(M4_xor_and,"Rx32^=and(Rs32,Rt32)",ATTRIBS(),
+"Compound Xor-And", { RxV ^= (RsV & RtV); })
+
+Q6INSN(M4_xor_or,"Rx32^=or(Rs32,Rt32)",ATTRIBS(),
+"Compound Xor-Or", { RxV ^= (RsV | RtV); })
+
+Q6INSN(M4_xor_andn,"Rx32^=and(Rs32,~Rt32)",ATTRIBS(),
+"Compound Xor-AndN", { RxV ^= (RsV & ~RtV); })
+
+
+
+
+
+
+Q6INSN(A2_subri,"Rd32=sub(#s10,Rs32)",ATTRIBS(A_ARCHV2),
+"Subtract register from immediate",{ fIMMEXT(siV); RdV=siV-RsV;})
+
+Q6INSN(A2_andir,"Rd32=and(Rs32,#s10)",ATTRIBS(A_ARCHV2),
+"logical AND with immediate",{ fIMMEXT(siV); RdV=RsV&siV;})
+
+Q6INSN(A2_orir,"Rd32=or(Rs32,#s10)",ATTRIBS(A_ARCHV2),
+"logical OR with immediate",{ fIMMEXT(siV); RdV=RsV|siV;})
+
+
+
+
+Q6INSN(A2_andp,"Rdd32=and(Rss32,Rtt32)",ATTRIBS(),
+"logical AND pair",{ RddV=RssV&RttV;})
+
+Q6INSN(A2_orp,"Rdd32=or(Rss32,Rtt32)",ATTRIBS(),
+"logical OR pair",{ RddV=RssV|RttV;})
+
+Q6INSN(A2_xorp,"Rdd32=xor(Rss32,Rtt32)",ATTRIBS(),
+"logical eXclusive OR pair",{ RddV=RssV^RttV;})
+
+Q6INSN(A2_notp,"Rdd32=not(Rss32)",ATTRIBS(),
+"logical NOT pair",{ RddV=~RssV;})
+
+Q6INSN(A2_sxtw,"Rdd32=sxtw(Rs32)",ATTRIBS(),
+"Sign extend 32-bit word to 64-bit pair",
+{ RddV = fCAST4_8s(RsV); })
+
+Q6INSN(A2_sat,"Rd32=sat(Rss32)",ATTRIBS(),
+"Saturate to 32-bit Signed",
+{ RdV = fSAT(RssV); })
+
+Q6INSN(A2_roundsat,"Rd32=round(Rss32):sat",ATTRIBS(),
+"Round & Saturate to 32-bit Signed",
+{ fHIDE(size8s_t tmp;) fADDSAT64(tmp,RssV,0x080000000ULL); RdV = fGETWORD(1,tmp); })
+
+Q6INSN(A2_sath,"Rd32=sath(Rs32)",ATTRIBS(),
+"Saturate to 16-bit Signed",
+{ RdV = fSATH(RsV); })
+
+Q6INSN(A2_satuh,"Rd32=satuh(Rs32)",ATTRIBS(),
+"Saturate to 16-bit Unsigned",
+{ RdV = fSATUH(RsV); })
+
+Q6INSN(A2_satub,"Rd32=satub(Rs32)",ATTRIBS(),
+"Saturate to 8-bit Unsigned",
+{ RdV = fSATUB(RsV); })
+
+Q6INSN(A2_satb,"Rd32=satb(Rs32)",ATTRIBS(A_ARCHV2),
+"Saturate to 8-bit Signed",
+{ RdV = fSATB(RsV); })
+
+/**********************************************/
+/* Vector Add				 */
+/**********************************************/
+
+Q6INSN(A2_vaddub,"Rdd32=vaddub(Rss32,Rtt32)",ATTRIBS(),
+"Add vector of bytes",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBYTE(i,RddV,(fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV)));
+	}
+})
+
+Q6INSN(A2_vaddubs,"Rdd32=vaddub(Rss32,Rtt32):sat",ATTRIBS(),
+"Add vector of bytes with saturation",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBYTE(i,RddV,fSATUN(8,fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV)));
+	}
+})
+
+Q6INSN(A2_vaddh,"Rdd32=vaddh(Rss32,Rtt32)",ATTRIBS(),
+"Add vector of half integers",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV,fGETHALF(i,RssV)+fGETHALF(i,RttV));
+	}
+})
+
+Q6INSN(A2_vaddhs,"Rdd32=vaddh(Rss32,Rtt32):sat",ATTRIBS(),
+"Add vector of half integers with saturation",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV,fSATN(16,fGETHALF(i,RssV)+fGETHALF(i,RttV)));
+	}
+})
+
+Q6INSN(A2_vadduhs,"Rdd32=vadduh(Rss32,Rtt32):sat",ATTRIBS(),
+"Add vector of unsigned half integers with saturation",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV,fSATUN(16,fGETUHALF(i,RssV)+fGETUHALF(i,RttV)));
+	}
+})
+
+Q6INSN(A5_vaddhubs,"Rd32=vaddhub(Rss32,Rtt32):sat",ATTRIBS(),
+"Add vector of half integers with saturation and pack to unsigned bytes",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+	  fSETBYTE(i,RdV,fSATUB(fGETHALF(i,RssV)+fGETHALF(i,RttV)));
+	}
+})
+
+Q6INSN(A2_vaddw,"Rdd32=vaddw(Rss32,Rtt32)",ATTRIBS(),
+"Add vector of words",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETWORD(i,RddV,fGETWORD(i,RssV)+fGETWORD(i,RttV));
+	}
+})
+
+Q6INSN(A2_vaddws,"Rdd32=vaddw(Rss32,Rtt32):sat",ATTRIBS(),
+"Add vector of words with saturation",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETWORD(i,RddV,fSATN(32,fGETWORD(i,RssV)+fGETWORD(i,RttV)));
+	}
+})
+
+
+
+Q6INSN(S4_vxaddsubw,"Rdd32=vxaddsubw(Rss32,Rtt32):sat",ATTRIBS(),
+"Cross vector add-sub words with saturation",
+{
+	fSETWORD(0,RddV,fSAT(fGETWORD(0,RssV)+fGETWORD(1,RttV)));
+	fSETWORD(1,RddV,fSAT(fGETWORD(1,RssV)-fGETWORD(0,RttV)));
+})
+Q6INSN(S4_vxsubaddw,"Rdd32=vxsubaddw(Rss32,Rtt32):sat",ATTRIBS(),
+"Cross vector sub-add words with saturation",
+{
+	fSETWORD(0,RddV,fSAT(fGETWORD(0,RssV)-fGETWORD(1,RttV)));
+	fSETWORD(1,RddV,fSAT(fGETWORD(1,RssV)+fGETWORD(0,RttV)));
+})
+
+
+
+Q6INSN(S4_vxaddsubh,"Rdd32=vxaddsubh(Rss32,Rtt32):sat",ATTRIBS(),
+"Cross vector add-sub halfwords with saturation",
+{
+	fSETHALF(0,RddV,fSATH(fGETHALF(0,RssV)+fGETHALF(1,RttV)));
+	fSETHALF(1,RddV,fSATH(fGETHALF(1,RssV)-fGETHALF(0,RttV)));
+
+	fSETHALF(2,RddV,fSATH(fGETHALF(2,RssV)+fGETHALF(3,RttV)));
+	fSETHALF(3,RddV,fSATH(fGETHALF(3,RssV)-fGETHALF(2,RttV)));
+
+})
+Q6INSN(S4_vxsubaddh,"Rdd32=vxsubaddh(Rss32,Rtt32):sat",ATTRIBS(),
+"Cross vector sub-add halfwords with saturation",
+{
+	fSETHALF(0,RddV,fSATH(fGETHALF(0,RssV)-fGETHALF(1,RttV)));
+	fSETHALF(1,RddV,fSATH(fGETHALF(1,RssV)+fGETHALF(0,RttV)));
+
+	fSETHALF(2,RddV,fSATH(fGETHALF(2,RssV)-fGETHALF(3,RttV)));
+	fSETHALF(3,RddV,fSATH(fGETHALF(3,RssV)+fGETHALF(2,RttV)));
+})
+
+
+
+
+Q6INSN(S4_vxaddsubhr,"Rdd32=vxaddsubh(Rss32,Rtt32):rnd:>>1:sat",ATTRIBS(),
+"Cross vector add-sub halfwords with shift, round, and saturation",
+{
+	fSETHALF(0,RddV,fSATH((fGETHALF(0,RssV)+fGETHALF(1,RttV)+1)>>1));
+	fSETHALF(1,RddV,fSATH((fGETHALF(1,RssV)-fGETHALF(0,RttV)+1)>>1));
+
+	fSETHALF(2,RddV,fSATH((fGETHALF(2,RssV)+fGETHALF(3,RttV)+1)>>1));
+	fSETHALF(3,RddV,fSATH((fGETHALF(3,RssV)-fGETHALF(2,RttV)+1)>>1));
+
+})
+Q6INSN(S4_vxsubaddhr,"Rdd32=vxsubaddh(Rss32,Rtt32):rnd:>>1:sat",ATTRIBS(),
+"Cross vector sub-add halfwords with shift, round, and saturation",
+{
+	fSETHALF(0,RddV,fSATH((fGETHALF(0,RssV)-fGETHALF(1,RttV)+1)>>1));
+	fSETHALF(1,RddV,fSATH((fGETHALF(1,RssV)+fGETHALF(0,RttV)+1)>>1));
+
+	fSETHALF(2,RddV,fSATH((fGETHALF(2,RssV)-fGETHALF(3,RttV)+1)>>1));
+	fSETHALF(3,RddV,fSATH((fGETHALF(3,RssV)+fGETHALF(2,RttV)+1)>>1));
+})
+
+
+
+
+
+/**********************************************/
+/* 1/2 Vector operations		      */
+/**********************************************/
+
+
+Q6INSN(A2_svavgh,"Rd32=vavgh(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"Avg vector of half integers",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETHALF(i,RdV,((fGETHALF(i,RsV)+fGETHALF(i,RtV))>>1));
+	}
+})
+
+Q6INSN(A2_svavghs,"Rd32=vavgh(Rs32,Rt32):rnd",ATTRIBS(A_ARCHV2),
+"Avg vector of half integers with rounding",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETHALF(i,RdV,((fGETHALF(i,RsV)+fGETHALF(i,RtV)+1)>>1));
+	}
+})
+
+
+
+Q6INSN(A2_svnavgh,"Rd32=vnavgh(Rt32,Rs32)",ATTRIBS(A_ARCHV2),
+"Negative avg vector of half integers",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETHALF(i,RdV,((fGETHALF(i,RtV)-fGETHALF(i,RsV))>>1));
+	}
+})
+
+
+Q6INSN(A2_svaddh,"Rd32=vaddh(Rs32,Rt32)",ATTRIBS(),
+"Add vector of half integers",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETHALF(i,RdV,fGETHALF(i,RsV)+fGETHALF(i,RtV));
+	}
+})
+
+Q6INSN(A2_svaddhs,"Rd32=vaddh(Rs32,Rt32):sat",ATTRIBS(),
+"Add vector of half integers with saturation",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETHALF(i,RdV,fSATN(16,fGETHALF(i,RsV)+fGETHALF(i,RtV)));
+	}
+})
+
+Q6INSN(A2_svadduhs,"Rd32=vadduh(Rs32,Rt32):sat",ATTRIBS(),
+"Add vector of unsigned half integers with saturation",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETHALF(i,RdV,fSATUN(16,fGETUHALF(i,RsV)+fGETUHALF(i,RtV)));
+	}
+})
+
+
+Q6INSN(A2_svsubh,"Rd32=vsubh(Rt32,Rs32)",ATTRIBS(),
+"Sub vector of half integers",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETHALF(i,RdV,fGETHALF(i,RtV)-fGETHALF(i,RsV));
+	}
+})
+
+Q6INSN(A2_svsubhs,"Rd32=vsubh(Rt32,Rs32):sat",ATTRIBS(),
+"Sub vector of half integers with saturation",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETHALF(i,RdV,fSATN(16,fGETHALF(i,RtV)-fGETHALF(i,RsV)));
+	}
+})
+
+Q6INSN(A2_svsubuhs,"Rd32=vsubuh(Rt32,Rs32):sat",ATTRIBS(),
+"Sub vector of unsigned half integers with saturation",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETHALF(i,RdV,fSATUN(16,fGETUHALF(i,RtV)-fGETUHALF(i,RsV)));
+	}
+})
+
+
+
+
+/**********************************************/
+/* Vector Reduce Add			  */
+/**********************************************/
+
+Q6INSN(A2_vraddub,"Rdd32=vraddub(Rss32,Rtt32)",ATTRIBS(),
+"Sum: two vectors of unsigned bytes",
+{
+	fHIDE(int i;)
+	RddV = 0;
+	for (i=0;i<4;i++) {
+		fSETWORD(0,RddV,(fGETWORD(0,RddV) + (fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV))));
+	}
+	for (i=4;i<8;i++) {
+		fSETWORD(1,RddV,(fGETWORD(1,RddV) + (fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV))));
+	}
+})
+
+Q6INSN(A2_vraddub_acc,"Rxx32+=vraddub(Rss32,Rtt32)",ATTRIBS(),
+"Sum: two vectors of unsigned bytes",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 4; i++) {
+		fSETWORD(0,RxxV,(fGETWORD(0,RxxV) + (fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV))));
+	}
+	for (i = 4; i < 8; i++) {
+		fSETWORD(1,RxxV,(fGETWORD(1,RxxV) + (fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV))));
+	}
+})
+
+
+
+Q6INSN(M2_vraddh,"Rd32=vraddh(Rss32,Rtt32)",ATTRIBS(A_ARCHV3),
+"Sum: two vectors of halves",
+{
+	fHIDE(int i;)
+	RdV = 0;
+	for (i=0;i<4;i++) {
+		RdV += (fGETHALF(i,RssV)+fGETHALF(i,RttV));
+	}
+})
+
+Q6INSN(M2_vradduh,"Rd32=vradduh(Rss32,Rtt32)",ATTRIBS(A_ARCHV3),
+"Sum: two vectors of unsigned halves",
+{
+	fHIDE(int i;)
+	RdV = 0;
+	for (i=0;i<4;i++) {
+		RdV += (fGETUHALF(i,RssV)+fGETUHALF(i,RttV));
+	}
+})
+
+/**********************************************/
+/* Vector Sub				 */
+/**********************************************/
+
+Q6INSN(A2_vsubub,"Rdd32=vsubub(Rtt32,Rss32)",ATTRIBS(),
+"Sub vector of bytes",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBYTE(i,RddV,(fGETUBYTE(i,RttV)-fGETUBYTE(i,RssV)));
+	}
+})
+
+Q6INSN(A2_vsububs,"Rdd32=vsubub(Rtt32,Rss32):sat",ATTRIBS(),
+"Sub vector of bytes with saturation",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBYTE(i,RddV,fSATUN(8,fGETUBYTE(i,RttV)-fGETUBYTE(i,RssV)));
+	}
+})
+
+Q6INSN(A2_vsubh,"Rdd32=vsubh(Rtt32,Rss32)",ATTRIBS(),
+"Sub vector of half integers",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV,fGETHALF(i,RttV)-fGETHALF(i,RssV));
+	}
+})
+
+Q6INSN(A2_vsubhs,"Rdd32=vsubh(Rtt32,Rss32):sat",ATTRIBS(),
+"Sub vector of half integers with saturation",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV,fSATN(16,fGETHALF(i,RttV)-fGETHALF(i,RssV)));
+	}
+})
+
+Q6INSN(A2_vsubuhs,"Rdd32=vsubuh(Rtt32,Rss32):sat",ATTRIBS(),
+"Sub vector of unsigned half integers with saturation",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV,fSATUN(16,fGETUHALF(i,RttV)-fGETUHALF(i,RssV)));
+	}
+})
+
+Q6INSN(A2_vsubw,"Rdd32=vsubw(Rtt32,Rss32)",ATTRIBS(),
+"Sub vector of words",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETWORD(i,RddV,fGETWORD(i,RttV)-fGETWORD(i,RssV));
+	}
+})
+
+Q6INSN(A2_vsubws,"Rdd32=vsubw(Rtt32,Rss32):sat",ATTRIBS(),
+"Sub vector of words with saturation",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETWORD(i,RddV,fSATN(32,fGETWORD(i,RttV)-fGETWORD(i,RssV)));
+	}
+})
+
+
+
+
+/**********************************************/
+/* Vector Abs				 */
+/**********************************************/
+
+Q6INSN(A2_vabsh,"Rdd32=vabsh(Rss32)",ATTRIBS(),
+"Absolute Value vector of half integers",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV,fABS(fGETHALF(i,RssV)));
+	}
+})
+
+Q6INSN(A2_vabshsat,"Rdd32=vabsh(Rss32):sat",ATTRIBS(),
+"Absolute Value vector of half integers with saturation",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV,fSATH(fABS(fGETHALF(i,RssV))));
+	}
+})
+
+Q6INSN(A2_vabsw,"Rdd32=vabsw(Rss32)",ATTRIBS(),
+"Absolute Value vector of words",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETWORD(i,RddV,fABS(fGETWORD(i,RssV)));
+	}
+})
+
+Q6INSN(A2_vabswsat,"Rdd32=vabsw(Rss32):sat",ATTRIBS(),
+"Absolute Value vector of words with saturation",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETWORD(i,RddV,fSAT(fABS(fGETWORD(i,RssV))));
+	}
+})
+
+/**********************************************/
+/* Vector SAD				 */
+/**********************************************/
+
+
+Q6INSN(M2_vabsdiffw,"Rdd32=vabsdiffw(Rtt32,Rss32)",ATTRIBS(A_ARCHV2),
+"Absolute Differences: vector of words",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+	  fSETWORD(i,RddV,fABS(fGETWORD(i,RttV) - fGETWORD(i,RssV)));
+	}
+})
+
+Q6INSN(M2_vabsdiffh,"Rdd32=vabsdiffh(Rtt32,Rss32)",ATTRIBS(A_ARCHV2),
+"Absolute Differences: vector of halfwords",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV,fABS(fGETHALF(i,RttV) - fGETHALF(i,RssV)));
+	}
+})
+
+Q6INSN(M6_vabsdiffb,"Rdd32=vabsdiffb(Rtt32,Rss32)",ATTRIBS(),
+"Absolute Differences: vector of bytes",
+{
+	fHIDE(int i;)
+	for (i=0;i<8;i++) {
+	  fSETBYTE(i,RddV,fABS(fGETBYTE(i,RttV) - fGETBYTE(i,RssV)));
+	}
+})
+
+Q6INSN(M6_vabsdiffub,"Rdd32=vabsdiffub(Rtt32,Rss32)",ATTRIBS(),
+"Absolute Differences: vector of unsigned bytes",
+{
+	fHIDE(int i;)
+	for (i=0;i<8;i++) {
+	  fSETBYTE(i,RddV,fABS(fGETUBYTE(i,RttV) - fGETUBYTE(i,RssV)));
+	}
+})
+
+
+
+Q6INSN(A2_vrsadub,"Rdd32=vrsadub(Rss32,Rtt32)",ATTRIBS(),
+"Sum of Absolute Differences: vector of unsigned bytes",
+{
+	fHIDE(int i;)
+	RddV = 0;
+	for (i = 0; i < 4; i++) {
+		fSETWORD(0,RddV,(fGETWORD(0,RddV) + fABS((fGETUBYTE(i,RssV) - fGETUBYTE(i,RttV)))));
+	}
+	for (i = 4; i < 8; i++) {
+		fSETWORD(1,RddV,(fGETWORD(1,RddV) + fABS((fGETUBYTE(i,RssV) - fGETUBYTE(i,RttV)))));
+	}
+})
+
+Q6INSN(A2_vrsadub_acc,"Rxx32+=vrsadub(Rss32,Rtt32)",ATTRIBS(),
+"Sum of Absolute Differences: vector of unsigned bytes",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 4; i++) {
+		fSETWORD(0,RxxV,(fGETWORD(0,RxxV) + fABS((fGETUBYTE(i,RssV) - fGETUBYTE(i,RttV)))));
+	}
+	for (i = 4; i < 8; i++) {
+		fSETWORD(1,RxxV,(fGETWORD(1,RxxV) + fABS((fGETUBYTE(i,RssV) - fGETUBYTE(i,RttV)))));
+	}
+})
+
+
+/**********************************************/
+/* Vector Average			     */
+/**********************************************/
+
+Q6INSN(A2_vavgub,"Rdd32=vavgub(Rss32,Rtt32)",ATTRIBS(),
+"Average vector of unsigned bytes",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBYTE(i,RddV,((fGETUBYTE(i,RssV) + fGETUBYTE(i,RttV))>>1));
+	}
+})
+
+Q6INSN(A2_vavguh,"Rdd32=vavguh(Rss32,Rtt32)",ATTRIBS(),
+"Average vector of unsigned halfwords",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+		fSETHALF(i,RddV,(fGETUHALF(i,RssV)+fGETUHALF(i,RttV))>>1);
+	}
+})
+
+Q6INSN(A2_vavgh,"Rdd32=vavgh(Rss32,Rtt32)",ATTRIBS(),
+"Average vector of halfwords",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+		fSETHALF(i,RddV,(fGETHALF(i,RssV)+fGETHALF(i,RttV))>>1);
+	}
+})
+
+Q6INSN(A2_vnavgh,"Rdd32=vnavgh(Rtt32,Rss32)",ATTRIBS(),
+"Negative Average vector of halfwords",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+		fSETHALF(i,RddV,(fGETHALF(i,RttV)-fGETHALF(i,RssV))>>1);
+	}
+})
+
+Q6INSN(A2_vavgw,"Rdd32=vavgw(Rss32,Rtt32)",ATTRIBS(),
+"Average vector of words",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+		fSETWORD(i,RddV,(fSXTN(32,33,fGETWORD(i,RssV))+fSXTN(32,33,fGETWORD(i,RttV)))>>1);
+	}
+})
+
+Q6INSN(A2_vnavgw,"Rdd32=vnavgw(Rtt32,Rss32)",ATTRIBS(A_ARCHV2),
+"Negative Average vector of words",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+		fSETWORD(i,RddV,(fSXTN(32,33,fGETWORD(i,RttV))-fSXTN(32,33,fGETWORD(i,RssV)))>>1);
+	}
+})
+
+Q6INSN(A2_vavgwr,"Rdd32=vavgw(Rss32,Rtt32):rnd",ATTRIBS(),
+"Average vector of words with rounding",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+		fSETWORD(i,RddV,(fSXTN(32,33,fGETWORD(i,RssV))+fSXTN(32,33,fGETWORD(i,RttV))+1)>>1);
+	}
+})
+
+Q6INSN(A2_vnavgwr,"Rdd32=vnavgw(Rtt32,Rss32):rnd:sat",ATTRIBS(A_ARCHV2),
+"Negative Average vector of words with rounding and saturation",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+		fSETWORD(i,RddV,fSAT((fSXTN(32,33,fGETWORD(i,RttV))-fSXTN(32,33,fGETWORD(i,RssV))+1)>>1));
+	}
+})
+
+Q6INSN(A2_vavgwcr,"Rdd32=vavgw(Rss32,Rtt32):crnd",ATTRIBS(A_ARCHV2),
+"Average vector of words with convergent rounding",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+		fSETWORD(i,RddV,(fCRND(fSXTN(32,33,fGETWORD(i,RssV))+fSXTN(32,33,fGETWORD(i,RttV)))>>1));
+	}
+})
+
+Q6INSN(A2_vnavgwcr,"Rdd32=vnavgw(Rtt32,Rss32):crnd:sat",ATTRIBS(A_ARCHV2),
+"Average negative vector of words with convergent rounding",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+		fSETWORD(i,RddV,fSAT(fCRND(fSXTN(32,33,fGETWORD(i,RttV))-fSXTN(32,33,fGETWORD(i,RssV)))>>1));
+	}
+})
+
+Q6INSN(A2_vavghcr,"Rdd32=vavgh(Rss32,Rtt32):crnd",ATTRIBS(A_ARCHV2),
+"Average vector of halfwords with conv rounding",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+		fSETHALF(i,RddV,fCRND(fGETHALF(i,RssV)+fGETHALF(i,RttV))>>1);
+	}
+})
+
+Q6INSN(A2_vnavghcr,"Rdd32=vnavgh(Rtt32,Rss32):crnd:sat",ATTRIBS(A_ARCHV2),
+"Average negative vector of halfwords with conv rounding",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+		fSETHALF(i,RddV,fSATH(fCRND(fGETHALF(i,RttV)-fGETHALF(i,RssV))>>1));
+	}
+})
+
+
+Q6INSN(A2_vavguw,"Rdd32=vavguw(Rss32,Rtt32)",ATTRIBS(),
+"Average vector of unsigned words",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+		fSETWORD(i,RddV,(fZXTN(32,33,fGETUWORD(i,RssV))+fZXTN(32,33,fGETUWORD(i,RttV)))>>1);
+	}
+})
+
+Q6INSN(A2_vavguwr,"Rdd32=vavguw(Rss32,Rtt32):rnd",ATTRIBS(),
+"Average vector of unsigned words with rounding",
+{
+	fHIDE(int i;)
+	for (i=0;i<2;i++) {
+		fSETWORD(i,RddV,(fZXTN(32,33,fGETUWORD(i,RssV))+fZXTN(32,33,fGETUWORD(i,RttV))+1)>>1);
+	}
+})
+
+Q6INSN(A2_vavgubr,"Rdd32=vavgub(Rss32,Rtt32):rnd",ATTRIBS(),
+"Average vector of unsigned bytes with rounding",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBYTE(i,RddV,((fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV)+1)>>1));
+	}
+})
+
+Q6INSN(A2_vavguhr,"Rdd32=vavguh(Rss32,Rtt32):rnd",ATTRIBS(),
+"Average vector of unsigned halfwords with rounding",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+		fSETHALF(i,RddV,(fGETUHALF(i,RssV)+fGETUHALF(i,RttV)+1)>>1);
+	}
+})
+
+Q6INSN(A2_vavghr,"Rdd32=vavgh(Rss32,Rtt32):rnd",ATTRIBS(),
+"Average vector of halfwords with rounding",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+		fSETHALF(i,RddV,(fGETHALF(i,RssV)+fGETHALF(i,RttV)+1)>>1);
+	}
+})
+
+Q6INSN(A2_vnavghr,"Rdd32=vnavgh(Rtt32,Rss32):rnd:sat",ATTRIBS(A_ARCHV2),
+"Negative Average vector of halfwords with rounding",
+{
+	fHIDE(int i;)
+	for (i=0;i<4;i++) {
+		fSETHALF(i,RddV,fSATH((fGETHALF(i,RttV)-fGETHALF(i,RssV)+1)>>1));
+	}
+})
+
+
+/* Rounding Instruction */
+
+Q6INSN(A4_round_ri,"Rd32=round(Rs32,#u5)",ATTRIBS(),"Round", {RdV = fRNDN(RsV,uiV)>>uiV; })
+Q6INSN(A4_round_rr,"Rd32=round(Rs32,Rt32)",ATTRIBS(),"Round", {RdV = fRNDN(RsV,fZXTN(5,32,RtV))>>fZXTN(5,32,RtV); })
+Q6INSN(A4_round_ri_sat,"Rd32=round(Rs32,#u5):sat",ATTRIBS(),"Round", {RdV = (fSAT(fRNDN(RsV,uiV)))>>uiV; })
+Q6INSN(A4_round_rr_sat,"Rd32=round(Rs32,Rt32):sat",ATTRIBS(),"Round", {RdV = (fSAT(fRNDN(RsV,fZXTN(5,32,RtV))))>>fZXTN(5,32,RtV); })
+
+
+Q6INSN(A4_cround_ri,"Rd32=cround(Rs32,#u5)",ATTRIBS(),"Convergent Round", {RdV = fCRNDN(RsV,uiV); })
+Q6INSN(A4_cround_rr,"Rd32=cround(Rs32,Rt32)",ATTRIBS(),"Convergent Round", {RdV = fCRNDN(RsV,fZXTN(5,32,RtV)); })
+
+
+#define CROUND(DST,SRC,SHIFT) \
+	fHIDE(size16s_t rndbit_128;)\
+	fHIDE(size16s_t tmp128;)\
+	fHIDE(size16s_t src_128;)\
+	if (SHIFT == 0) { \
+		DST = SRC;\
+	} else if ((SRC & (size8s_t)((1LL << (SHIFT - 1)) - 1LL)) == 0) { \
+		src_128 = fCAST8S_16S(SRC);\
+		rndbit_128 = fCAST8S_16S(1LL);\
+		rndbit_128 = fSHIFTL128(rndbit_128, SHIFT);\
+		rndbit_128 = fAND128(rndbit_128, src_128);\
+		rndbit_128 = fSHIFTR128(rndbit_128, 1);\
+		tmp128 = fADD128(src_128, rndbit_128);\
+		tmp128 = fSHIFTR128(tmp128, SHIFT);\
+		DST =  fCAST16S_8S(tmp128);\
+	} else {\
+		size16s_t rndbit_128 =  fCAST8S_16S((1LL << (SHIFT - 1))); \
+		size16s_t src_128 =  fCAST8S_16S(SRC); \
+		size16s_t tmp128 = fADD128(src_128, rndbit_128);\
+		tmp128 = fSHIFTR128(tmp128, SHIFT);\
+		DST =  fCAST16S_8S(tmp128);\
+	}
+
+Q6INSN(A7_croundd_ri,"Rdd32=cround(Rss32,#u6)",ATTRIBS(),"Convergent Round",
+{
+CROUND(RddV,RssV,uiV);
+})
+
+Q6INSN(A7_croundd_rr,"Rdd32=cround(Rss32,Rt32)",ATTRIBS(),"Convergent Round",
+{
+CROUND(RddV,RssV,fZXTN(6,32,RtV));
+})
+
+
+
+
+
+
+
+
+
+Q6INSN(A7_clip,"Rd32=clip(Rs32,#u5)",ATTRIBS(),"Clip to  #s5", 	{   fCLIP(RdV,RsV,uiV);})
+Q6INSN(A7_vclip,"Rdd32=vclip(Rss32,#u5)",ATTRIBS(),"Clip to  #s5",
+{
+fHIDE(size4s_t tmp;)
+fCLIP(tmp, fGETWORD(0, RssV), uiV);
+fSETWORD(0, RddV, tmp);
+fCLIP(tmp,fGETWORD(1, RssV), uiV);
+fSETWORD(1, RddV, tmp);
+}
+)
+
+
+
+/**********************************************/
+/* V4: Cross Vector Min/Max		   */
+/**********************************************/
+
+
+#define VRMINORMAX(TAG,STR,OP,SHORTTYPE,SETTYPE,GETTYPE,NEL,SHIFT) \
+Q6INSN(A4_vr##TAG##SHORTTYPE,"Rxx32=vr"#TAG#SHORTTYPE"(Rss32,Ru32)",ATTRIBS(), \
+"Choose " STR " elements of a vector", \
+{ \
+	fHIDE(int i; size8s_t TAG; size4s_t addr;) \
+	TAG = fGET##GETTYPE(0,RxxV); \
+	addr = fGETWORD(1,RxxV); \
+	for (i = 0; i < NEL; i++) { \
+		if (TAG OP fGET##GETTYPE(i,RssV)) { \
+			TAG = fGET##GETTYPE(i,RssV); \
+			addr = RuV | i<<SHIFT; \
+		} \
+	} \
+	fSETWORD(0,RxxV,TAG); \
+	fSETWORD(1,RxxV,addr); \
+})
+
+#define RMINMAX(SHORTTYPE,SETTYPE,GETTYPE,NEL,SHIFT) \
+VRMINORMAX(min,"minimum",>,SHORTTYPE,SETTYPE,GETTYPE,NEL,SHIFT) \
+VRMINORMAX(max,"maximum",<,SHORTTYPE,SETTYPE,GETTYPE,NEL,SHIFT)
+
+
+RMINMAX(h,HALF,HALF,4,1)
+RMINMAX(uh,HALF,UHALF,4,1)
+RMINMAX(w,WORD,WORD,2,2)
+RMINMAX(uw,WORD,UWORD,2,2)
+
+#undef RMINMAX
+#undef VRMINORMAX
+
+/**********************************************/
+/* Vector Min/Max			     */
+/**********************************************/
+
+#define VMINORMAX(TAG,STR,FUNC,SHORTTYPE,SETTYPE,GETTYPE,NEL) \
+Q6INSN(A2_v##TAG##SHORTTYPE,"Rdd32=v"#TAG#SHORTTYPE"(Rtt32,Rss32)",ATTRIBS(), \
+"Choose " STR " elements of two vectors", \
+{ \
+	fHIDE(int i;) \
+	for (i = 0; i < NEL; i++) { \
+		fSET##SETTYPE(i,RddV,FUNC(fGET##GETTYPE(i,RttV),fGET##GETTYPE(i,RssV))); \
+	} \
+})
+
+#define VMINORMAX3(TAG,STR,FUNC,SHORTTYPE,SETTYPE,GETTYPE,NEL) \
+Q6INSN(A6_v##TAG##SHORTTYPE##3,"Rxx32=v"#TAG#SHORTTYPE"3(Rtt32,Rss32)",ATTRIBS(), \
+"Choose " STR " elements of two vectors", \
+{ \
+	fHIDE(int i;) \
+	for (i = 0; i < NEL; i++) { \
+		fSET##SETTYPE(i,RxxV,FUNC(fGET##GETTYPE(i,RxxV),FUNC(fGET##GETTYPE(i,RttV),fGET##GETTYPE(i,RssV)))); \
+	} \
+})
+
+#define MINMAX(SHORTTYPE,SETTYPE,GETTYPE,NEL) \
+VMINORMAX(min,"minimum",fMIN,SHORTTYPE,SETTYPE,GETTYPE,NEL) \
+VMINORMAX(max,"maximum",fMAX,SHORTTYPE,SETTYPE,GETTYPE,NEL)
+
+MINMAX(b,BYTE,BYTE,8)
+MINMAX(ub,BYTE,UBYTE,8)
+MINMAX(h,HALF,HALF,4)
+MINMAX(uh,HALF,UHALF,4)
+MINMAX(w,WORD,WORD,2)
+MINMAX(uw,WORD,UWORD,2)
+
+#undef MINMAX
+#undef VMINORMAX
+#undef VMINORMAX3
+
+
+/**********************************************/
+/* Modulo Wrap				     */
+/**********************************************/
+
+
+Q6INSN(A4_modwrapu,"Rd32=modwrap(Rs32,Rt32)",ATTRIBS(),
+"Wrap to an unsigned modulo buffer",
+{
+	if (RsV < 0) {
+	    RdV = RsV + fCAST4u(RtV);
+	} else if (fCAST4u(RsV) >= fCAST4u(RtV)) {
+	    RdV = RsV - fCAST4u(RtV);
+	} else {
+	    RdV = RsV;
+	}
+})
+
diff --git a/target/hexagon/imported/branch.idef b/target/hexagon/imported/branch.idef
new file mode 100644
index 0000000..600fec9
--- /dev/null
+++ b/target/hexagon/imported/branch.idef
@@ -0,0 +1,328 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+
+/*********************************************/
+/* Jump instructions			 */
+/*********************************************/
+
+#define A_JDIR A_JUMP
+#define A_CJNEWDIR A_JUMP
+#define A_CJOLDDIR A_JUMP
+#define A_NEWVALUEJ A_JUMP,A_DOTNEWVALUE,A_MEMLIKE_PACKET_RULES
+#define A_JINDIR A_JUMP,A_INDIRECT
+#define A_JINDIRNEW A_JUMP,A_INDIRECT
+#define A_JINDIROLD A_JUMP,A_INDIRECT
+
+Q6INSN(J2_jump,"jump #r22:2",ATTRIBS(A_JDIR), "direct unconditional jump",
+{fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);})
+
+Q6INSN(J2_jumpr,"jumpr Rs32",ATTRIBS(A_JINDIR), "indirect unconditional jump",
+{fJUMPR(RsN,RsV,COF_TYPE_JUMPR);})
+
+#define OLDCOND_JUMP(TAG,OPER,OPER2,ATTRIB,DESCR,SEMANTICS) \
+Q6INSN(TAG##t,"if (Pu4) "OPER":nt "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBOLD(PuV),,SPECULATE_NOT_TAKEN,12,0); if (fLSBOLD(PuV)) { SEMANTICS; }}) \
+Q6INSN(TAG##f,"if (!Pu4) "OPER":nt "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBOLDNOT(PuV),,SPECULATE_NOT_TAKEN,12,0); if (fLSBOLDNOT(PuV)) { SEMANTICS; }}) \
+Q6INSN(TAG##tpt,"if (Pu4) "OPER":t "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBOLD(PuV),,SPECULATE_TAKEN,12,0); if (fLSBOLD(PuV)) { SEMANTICS; }}) \
+Q6INSN(TAG##fpt,"if (!Pu4) "OPER":t "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBOLDNOT(PuV),,SPECULATE_TAKEN,12,0); if (fLSBOLDNOT(PuV)) { SEMANTICS; }})
+
+OLDCOND_JUMP(J2_jump,"jump","#r15:2",ATTRIBS(A_CJOLDDIR),"direct conditional jump",
+fIMMEXT(riV);fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);)
+
+OLDCOND_JUMP(J2_jumpr,"jumpr","Rs32",ATTRIBS(A_JINDIROLD),"indirect conditional jump",
+fJUMPR(RsN,RsV,COF_TYPE_JUMPR);)
+
+#define NEWCOND_JUMP(TAG,OPER,OPER2,ATTRIB,DESCR,SEMANTICS)\
+Q6INSN(TAG##tnew,"if (Pu4.new) "OPER":nt "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBNEW(PuN),, SPECULATE_NOT_TAKEN , 12,0)} {if(fLSBNEW(PuN)){SEMANTICS;}})\
+Q6INSN(TAG##fnew,"if (!Pu4.new) "OPER":nt "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBNEWNOT(PuN),, SPECULATE_NOT_TAKEN , 12,0)} {if(fLSBNEWNOT(PuN)){SEMANTICS;}})\
+Q6INSN(TAG##tnewpt,"if (Pu4.new) "OPER":t "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBNEW(PuN),, SPECULATE_TAKEN , 12,0)} {if(fLSBNEW(PuN)){SEMANTICS;}})\
+Q6INSN(TAG##fnewpt,"if (!Pu4.new) "OPER":t "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBNEWNOT(PuN),, SPECULATE_TAKEN , 12,0)} {if(fLSBNEWNOT(PuN)){SEMANTICS;}})
+
+NEWCOND_JUMP(J2_jump,"jump","#r15:2",ATTRIBS(A_CJNEWDIR,A_ARCHV2),"direct conditional jump",
+fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMPNEW);)
+
+NEWCOND_JUMP(J2_jumpr,"jumpr","Rs32",ATTRIBS(A_JINDIRNEW,A_ARCHV3),"indirect conditional jump",
+fJUMPR(RsN,RsV,COF_TYPE_JUMPR);)
+
+
+
+Q6INSN(J4_hintjumpr,"hintjr(Rs32)",ATTRIBS(A_JINDIR),"hint indirect jump",
+{fHINTJR(RsV);})
+
+
+/*********************************************/
+/* Compound Compare-Jumps		    */
+/*********************************************/
+Q6INSN(J2_jumprz,"if (Rs32!=#0) jump:nt #r13:2",ATTRIBS(A_CJNEWDIR,A_ARCHV3),"direct conditional jump if register true",
+{fBRANCH_SPECULATE_STALL((RsV!=0), , SPECULATE_NOT_TAKEN,12,0) if (RsV != 0) { fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})
+
+Q6INSN(J2_jumprnz,"if (Rs32==#0) jump:nt #r13:2",ATTRIBS(A_CJNEWDIR,A_ARCHV3),"direct conditional jump if register false",
+{fBRANCH_SPECULATE_STALL((RsV==0), , SPECULATE_NOT_TAKEN,12,0) if (RsV == 0) {fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})
+
+Q6INSN(J2_jumprzpt,"if (Rs32!=#0) jump:t #r13:2",ATTRIBS(A_CJNEWDIR,A_ARCHV3),"direct conditional jump if register true",
+{fBRANCH_SPECULATE_STALL((RsV!=0), , SPECULATE_TAKEN,12,0) if (RsV != 0) { fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})
+
+Q6INSN(J2_jumprnzpt,"if (Rs32==#0) jump:t #r13:2",ATTRIBS(A_CJNEWDIR,A_ARCHV3),"direct conditional jump if register false",
+{fBRANCH_SPECULATE_STALL((RsV==0), , SPECULATE_TAKEN,12,0) if (RsV == 0) {fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})
+
+Q6INSN(J2_jumprgtez,"if (Rs32>=#0) jump:nt #r13:2",ATTRIBS(A_CJNEWDIR,A_ARCHV3),"direct conditional jump if register greater than or equal to zero",
+{fBRANCH_SPECULATE_STALL((RsV>=0), , SPECULATE_NOT_TAKEN,12,0) if (RsV>=0) { fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})
+
+Q6INSN(J2_jumprgtezpt,"if (Rs32>=#0) jump:t #r13:2",ATTRIBS(A_CJNEWDIR,A_ARCHV3),"direct conditional jump if register greater than or equal to zero",
+{fBRANCH_SPECULATE_STALL((RsV>=0), , SPECULATE_TAKEN,12,0) if (RsV>=0) { fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})
+
+Q6INSN(J2_jumprltez,"if (Rs32<=#0) jump:nt #r13:2",ATTRIBS(A_CJNEWDIR,A_ARCHV3),"direct conditional jump if register less than or equal to zero",
+{fBRANCH_SPECULATE_STALL((RsV<=0), , SPECULATE_NOT_TAKEN,12,0) if (RsV<=0) { fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})
+
+Q6INSN(J2_jumprltezpt,"if (Rs32<=#0) jump:t #r13:2",ATTRIBS(A_CJNEWDIR,A_ARCHV3),"direct conditional jump if register less than or equal to zero",
+{fBRANCH_SPECULATE_STALL((RsV<=0), , SPECULATE_TAKEN,12,0) if (RsV<=0) { fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})
+
+
+
+/*********************************************/
+/* V4 Compound Compare-Jumps		 */
+/*********************************************/
+
+
+/* V4 compound compare jumps (CJ) */
+#define STD_CMPJUMP(TAG,TST,TSTSEM)\
+Q6INSN(J4_##TAG##_tp0_jump_nt, "p0="TST"; if (p0.new) jump:nt #r9:2", ATTRIBS(A_CJNEWDIR,A_NEWCMPJUMP),"compound compare-jump", {fPART1(fWRITE_P0(f8BITSOF(TSTSEM)))  fBRANCH_SPECULATE_STALL(fLSBNEW0,,SPECULATE_NOT_TAKEN,13,0)  if (fLSBNEW0) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\
+Q6INSN(J4_##TAG##_fp0_jump_nt, "p0="TST"; if (!p0.new) jump:nt #r9:2", ATTRIBS(A_CJNEWDIR,A_NEWCMPJUMP),"compound compare-jump",{fPART1(fWRITE_P0(f8BITSOF(TSTSEM)))  fBRANCH_SPECULATE_STALL(fLSBNEW0NOT,,SPECULATE_NOT_TAKEN,13,0) if (fLSBNEW0NOT) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\
+Q6INSN(J4_##TAG##_tp0_jump_t,  "p0="TST"; if (p0.new) jump:t #r9:2", ATTRIBS(A_CJNEWDIR,A_NEWCMPJUMP),"compound compare-jump",  {fPART1(fWRITE_P0(f8BITSOF(TSTSEM)))  fBRANCH_SPECULATE_STALL(fLSBNEW0,,SPECULATE_TAKEN,13,0)      if (fLSBNEW0) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\
+Q6INSN(J4_##TAG##_fp0_jump_t,  "p0="TST"; if (!p0.new) jump:t #r9:2", ATTRIBS(A_CJNEWDIR,A_NEWCMPJUMP),"compound compare-jump", {fPART1(fWRITE_P0(f8BITSOF(TSTSEM)))  fBRANCH_SPECULATE_STALL(fLSBNEW0NOT,,SPECULATE_TAKEN,13,0)     if (fLSBNEW0NOT) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\
+Q6INSN(J4_##TAG##_tp1_jump_nt, "p1="TST"; if (p1.new) jump:nt #r9:2", ATTRIBS(A_CJNEWDIR,A_NEWCMPJUMP),"compound compare-jump", {fPART1(fWRITE_P1(f8BITSOF(TSTSEM)))  fBRANCH_SPECULATE_STALL(fLSBNEW1,,SPECULATE_NOT_TAKEN,13,0)  if (fLSBNEW1) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\
+Q6INSN(J4_##TAG##_fp1_jump_nt, "p1="TST"; if (!p1.new) jump:nt #r9:2", ATTRIBS(A_CJNEWDIR,A_NEWCMPJUMP),"compound compare-jump",{fPART1(fWRITE_P1(f8BITSOF(TSTSEM)))  fBRANCH_SPECULATE_STALL(fLSBNEW1NOT,,SPECULATE_NOT_TAKEN,13,0) if (fLSBNEW1NOT) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\
+Q6INSN(J4_##TAG##_tp1_jump_t,  "p1="TST"; if (p1.new) jump:t #r9:2", ATTRIBS(A_CJNEWDIR,A_NEWCMPJUMP),"compound compare-jump",  {fPART1(fWRITE_P1(f8BITSOF(TSTSEM)))  fBRANCH_SPECULATE_STALL(fLSBNEW1,,SPECULATE_TAKEN,13,0)      if (fLSBNEW1) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\
+Q6INSN(J4_##TAG##_fp1_jump_t,  "p1="TST"; if (!p1.new) jump:t #r9:2", ATTRIBS(A_CJNEWDIR,A_NEWCMPJUMP),"compound compare-jump", {fPART1(fWRITE_P1(f8BITSOF(TSTSEM)))  fBRANCH_SPECULATE_STALL(fLSBNEW1NOT,,SPECULATE_TAKEN,13,0)     if (fLSBNEW1NOT) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})
+
+
+STD_CMPJUMP(cmpeqi,"cmp.eq(Rs16,#U5)",(RsV==UiV))
+STD_CMPJUMP(cmpgti,"cmp.gt(Rs16,#U5)",(RsV>UiV))
+STD_CMPJUMP(cmpgtui,"cmp.gtu(Rs16,#U5)",(fCAST4u(RsV)>UiV))
+
+STD_CMPJUMP(cmpeqn1,"cmp.eq(Rs16,#-1)",(RsV==-1))
+STD_CMPJUMP(cmpgtn1,"cmp.gt(Rs16,#-1)",(RsV>-1))
+STD_CMPJUMP(tstbit0,"tstbit(Rs16,#0)",(RsV & 1))
+
+STD_CMPJUMP(cmpeq,"cmp.eq(Rs16,Rt16)",(RsV==RtV))
+STD_CMPJUMP(cmpgt,"cmp.gt(Rs16,Rt16)",(RsV>RtV))
+STD_CMPJUMP(cmpgtu,"cmp.gtu(Rs16,Rt16)",(fCAST4u(RsV)>RtV))
+
+
+
+/* V4 jump and transfer (CJ) */
+Q6INSN(J4_jumpseti,"Rd16=#U6 ; jump #r9:2",ATTRIBS(A_JDIR), "direct unconditional jump and set register to immediate",
+{fIMMEXT(riV); fPCALIGN(riV); RdV=UiV; fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);})
+
+Q6INSN(J4_jumpsetr,"Rd16=Rs16 ; jump #r9:2",ATTRIBS(A_JDIR), "direct unconditional jump and transfer register",
+{fIMMEXT(riV); fPCALIGN(riV); RdV=RsV; fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);})
+
+
+/* V4 new-value jumps (NCJ) */
+#define STD_CMPJUMPNEWRS(TAG,TST,TSTSEM)\
+Q6INSN(J4_##TAG##_jumpnv_t, "if ("TST") jump:t #r9:2", ATTRIBS(A_NEWVALUEJ),"compound compare-jump",{fBRANCH_SPECULATE_STALL(TSTSEM,,SPECULATE_TAKEN,13,0);if (TSTSEM) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\
+Q6INSN(J4_##TAG##_jumpnv_nt,"if ("TST") jump:nt #r9:2",ATTRIBS(A_NEWVALUEJ),"compound compare-jump",{fBRANCH_SPECULATE_STALL(TSTSEM,,SPECULATE_NOT_TAKEN,13,0); if (TSTSEM) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})
+
+
+
+
+STD_CMPJUMPNEWRS(cmpeqi_t,"cmp.eq(Ns8.new,#U5)",(fNEWREG(NsN)==(UiV)))
+STD_CMPJUMPNEWRS(cmpeqi_f,"!cmp.eq(Ns8.new,#U5)",(fNEWREG(NsN)!=(UiV)))
+STD_CMPJUMPNEWRS(cmpgti_t,"cmp.gt(Ns8.new,#U5)",(fNEWREG(NsN)>(UiV)))
+STD_CMPJUMPNEWRS(cmpgti_f,"!cmp.gt(Ns8.new,#U5)",!(fNEWREG(NsN)>(UiV)))
+STD_CMPJUMPNEWRS(cmpgtui_t,"cmp.gtu(Ns8.new,#U5)",(fCAST4u(fNEWREG(NsN))>(UiV)))
+STD_CMPJUMPNEWRS(cmpgtui_f,"!cmp.gtu(Ns8.new,#U5)",!(fCAST4u(fNEWREG(NsN))>(UiV)))
+
+
+STD_CMPJUMPNEWRS(cmpeqn1_t,"cmp.eq(Ns8.new,#-1)",(fNEWREG(NsN)==(-1)))
+STD_CMPJUMPNEWRS(cmpeqn1_f,"!cmp.eq(Ns8.new,#-1)",(fNEWREG(NsN)!=(-1)))
+STD_CMPJUMPNEWRS(cmpgtn1_t,"cmp.gt(Ns8.new,#-1)",(fNEWREG(NsN)>(-1)))
+STD_CMPJUMPNEWRS(cmpgtn1_f,"!cmp.gt(Ns8.new,#-1)",!(fNEWREG(NsN)>(-1)))
+STD_CMPJUMPNEWRS(tstbit0_t,"tstbit(Ns8.new,#0)",((fNEWREG(NsN)) & 1))
+STD_CMPJUMPNEWRS(tstbit0_f,"!tstbit(Ns8.new,#0)",!((fNEWREG(NsN)) & 1))
+
+
+STD_CMPJUMPNEWRS(cmpeq_t, "cmp.eq(Ns8.new,Rt32)", (fNEWREG(NsN)==RtV))
+STD_CMPJUMPNEWRS(cmpgt_t, "cmp.gt(Ns8.new,Rt32)", (fNEWREG(NsN)>RtV))
+STD_CMPJUMPNEWRS(cmpgtu_t,"cmp.gtu(Ns8.new,Rt32)",(fCAST4u(fNEWREG(NsN))>fCAST4u(RtV)))
+STD_CMPJUMPNEWRS(cmplt_t, "cmp.gt(Rt32,Ns8.new)", (RtV>fNEWREG(NsN)))
+STD_CMPJUMPNEWRS(cmpltu_t,"cmp.gtu(Rt32,Ns8.new)",(fCAST4u(RtV)>fCAST4u(fNEWREG(NsN))))
+STD_CMPJUMPNEWRS(cmpeq_f, "!cmp.eq(Ns8.new,Rt32)", (fNEWREG(NsN)!=RtV))
+STD_CMPJUMPNEWRS(cmpgt_f, "!cmp.gt(Ns8.new,Rt32)", !(fNEWREG(NsN)>RtV))
+STD_CMPJUMPNEWRS(cmpgtu_f,"!cmp.gtu(Ns8.new,Rt32)",!(fCAST4u(fNEWREG(NsN))>fCAST4u(RtV)))
+STD_CMPJUMPNEWRS(cmplt_f, "!cmp.gt(Rt32,Ns8.new)", !(RtV>fNEWREG(NsN)))
+STD_CMPJUMPNEWRS(cmpltu_f,"!cmp.gtu(Rt32,Ns8.new)",!(fCAST4u(RtV)>fCAST4u(fNEWREG(NsN))))
+
+
+
+
+
+/*********************************************/
+/* Subroutine Call instructions	      */
+/*********************************************/
+
+#define CDIR_STD   A_CALL
+#define CINDIR_STD A_CALL,A_INDIRECT
+
+Q6INSN(J2_call,"call #r22:2",ATTRIBS(CDIR_STD), "direct unconditional call",
+{fIMMEXT(riV); fPCALIGN(riV); fCALL(fREAD_PC()+riV); })
+
+Q6INSN(J2_callt,"if (Pu4) call #r15:2",ATTRIBS(CDIR_STD),"direct conditional call if true",
+{fIMMEXT(riV); fPCALIGN(riV); fBRANCH_SPECULATE_STALL(fLSBOLD(PuV),,SPECULATE_NOT_TAKEN,12,0); if (fLSBOLD(PuV)) { fCALL(fREAD_PC()+riV); }})
+
+Q6INSN(J2_callf,"if (!Pu4) call #r15:2",ATTRIBS(CDIR_STD),"direct conditional call if false",
+{fIMMEXT(riV); fPCALIGN(riV); fBRANCH_SPECULATE_STALL(fLSBOLDNOT(PuV),,SPECULATE_NOT_TAKEN,12,0);if (fLSBOLDNOT(PuV)) { fCALL(fREAD_PC()+riV); }})
+
+Q6INSN(J2_callr,"callr Rs32",ATTRIBS(CINDIR_STD), "indirect unconditional call",
+{ fCALLR(RsV); })
+
+Q6INSN(J2_callrt,"if (Pu4) callr Rs32",ATTRIBS(CINDIR_STD),"indirect conditional call if true",
+{fBRANCH_SPECULATE_STALL(fLSBOLD(PuV),,SPECULATE_NOT_TAKEN,12,0);if (fLSBOLD(PuV)) { fCALLR(RsV); }})
+
+Q6INSN(J2_callrf,"if (!Pu4) callr Rs32",ATTRIBS(CINDIR_STD),"indirect conditional call if false",
+{fBRANCH_SPECULATE_STALL(fLSBOLDNOT(PuV),,SPECULATE_NOT_TAKEN,12,0);if (fLSBOLDNOT(PuV)) { fCALLR(RsV); }})
+
+
+
+
+/*********************************************/
+/* HW Loop instructions		      */
+/*********************************************/
+
+Q6INSN(J2_loop0r,"loop0(#r7:2,Rs32)",ATTRIBS(),"Initialize HW loop 0",
+{ fIMMEXT(riV); fPCALIGN(riV);
+  fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, RsV);
+  fSET_LPCFG(0);
+})
+
+Q6INSN(J2_loop1r,"loop1(#r7:2,Rs32)",ATTRIBS(),"Initialize HW loop 1",
+{ fIMMEXT(riV); fPCALIGN(riV);
+  fWRITE_LOOP_REGS1(/*sa,lc*/ fREAD_PC()+riV, RsV);
+})
+
+Q6INSN(J2_loop0i,"loop0(#r7:2,#U10)",ATTRIBS(),"Initialize HW loop 0",
+{ fIMMEXT(riV); fPCALIGN(riV);
+  fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, UiV);
+  fSET_LPCFG(0);
+})
+
+Q6INSN(J2_loop1i,"loop1(#r7:2,#U10)",ATTRIBS(),"Initialize HW loop 1",
+{ fIMMEXT(riV); fPCALIGN(riV);
+  fWRITE_LOOP_REGS1(/*sa,lc*/ fREAD_PC()+riV, UiV);
+})
+
+
+Q6INSN(J2_ploop1sr,"p3=sp1loop0(#r7:2,Rs32)",ATTRIBS(A_ARCHV2),"Initialize HW loop 0",
+{ fIMMEXT(riV); fPCALIGN(riV);
+  fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, RsV);
+  fSET_LPCFG(1);
+  fWRITE_P3(0);
+})
+Q6INSN(J2_ploop1si,"p3=sp1loop0(#r7:2,#U10)",ATTRIBS(A_ARCHV2),"Initialize HW loop 0",
+{ fIMMEXT(riV); fPCALIGN(riV);
+  fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, UiV);
+  fSET_LPCFG(1);
+  fWRITE_P3(0);
+})
+
+Q6INSN(J2_ploop2sr,"p3=sp2loop0(#r7:2,Rs32)",ATTRIBS(A_ARCHV2),"Initialize HW loop 0",
+{ fIMMEXT(riV); fPCALIGN(riV);
+  fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, RsV);
+  fSET_LPCFG(2);
+  fWRITE_P3(0);
+})
+Q6INSN(J2_ploop2si,"p3=sp2loop0(#r7:2,#U10)",ATTRIBS(A_ARCHV2),"Initialize HW loop 0",
+{ fIMMEXT(riV); fPCALIGN(riV);
+  fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, UiV);
+  fSET_LPCFG(2);
+  fWRITE_P3(0);
+})
+
+Q6INSN(J2_ploop3sr,"p3=sp3loop0(#r7:2,Rs32)",ATTRIBS(A_ARCHV2),"Initialize HW loop 0",
+{ fIMMEXT(riV); fPCALIGN(riV);
+  fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, RsV);
+  fSET_LPCFG(3);
+  fWRITE_P3(0);
+})
+Q6INSN(J2_ploop3si,"p3=sp3loop0(#r7:2,#U10)",ATTRIBS(A_ARCHV2),"Initialize HW loop 0",
+{ fIMMEXT(riV); fPCALIGN(riV);
+  fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, UiV);
+  fSET_LPCFG(3);
+  fWRITE_P3(0);
+})
+
+
+
+Q6INSN(J2_endloop01,"endloop01",ATTRIBS(A_HWLOOP0_END,A_HWLOOP1_END),"Loopend for inner and outer loop",
+{
+
+  /* V2: With predicate control */
+  if (fGET_LPCFG) {
+    fHIDE( if (fGET_LPCFG >= 2) { /* Nothing */ } else )
+    if (fGET_LPCFG==1) {
+       fWRITE_P3(0xff);
+    }
+    fSET_LPCFG(fGET_LPCFG-1);
+  }
+
+  /* check if iterate */
+  if (fREAD_LC0>1) {
+    fBRANCH(fREAD_SA0,COF_TYPE_LOOPEND0);
+    /* decrement loop count */
+    fWRITE_LC0(fREAD_LC0-1);
+  } else {
+    /* check if iterate */
+    if (fREAD_LC1>1) {
+      fBRANCH(fREAD_SA1,COF_TYPE_LOOPEND1);
+      /* decrement loop count */
+      fWRITE_LC1(fREAD_LC1-1);
+    }
+  }
+
+})
+
+Q6INSN(J2_endloop0,"endloop0",ATTRIBS(A_HWLOOP0_END),"Loopend for inner loop",
+{
+
+  /* V2: With predicate control */
+  if (fGET_LPCFG) {
+    fHIDE( if (fGET_LPCFG >= 2) { /* Nothing */ } else )
+    if (fGET_LPCFG==1) {
+       fWRITE_P3(0xff);
+    }
+    fSET_LPCFG(fGET_LPCFG-1);
+  }
+
+  /* check if iterate */
+  if (fREAD_LC0>1) {
+    fBRANCH(fREAD_SA0,COF_TYPE_LOOPEND0);
+    /* decrement loop count */
+    fWRITE_LC0(fREAD_LC0-1);
+  }
+})
+
+Q6INSN(J2_endloop1,"endloop1",ATTRIBS(A_HWLOOP1_END),"Loopend for outer loop",
+{
+  /* check if iterate */
+  if (fREAD_LC1>1) {
+    fBRANCH(fREAD_SA1,COF_TYPE_LOOPEND1);
+    /* decrement loop count */
+    fWRITE_LC1(fREAD_LC1-1);
+  }
+})
+
+
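Reviewer note on J2_endloop0 above: the interaction between LPCFG, P3 and LC0 is easier to follow as plain C. The sketch below is a sequential model under stated assumptions: struct hwloop_state and endloop0_model are hypothetical names, register reads and writes take effect immediately (the real semantics commit at packet end), and the branch is reported through a return value instead of fBRANCH.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the registers endloop0 touches. */
struct hwloop_state {
    uint32_t lc0;    /* loop count for hardware loop 0 */
    uint32_t sa0;    /* start address for hardware loop 0 */
    uint32_t lpcfg;  /* countdown set by spNloop0 (0 for plain loop0) */
    uint8_t  p3;     /* predicate register P3 */
};

/*
 * Model of J2_endloop0. Returns true when the loop iterates again and
 * stores the branch target in *next_pc; returns false on fall-through.
 */
static bool endloop0_model(struct hwloop_state *s, uint32_t *next_pc)
{
    /* Predicate control: P3 becomes all ones when the countdown hits 1. */
    if (s->lpcfg) {
        if (s->lpcfg == 1) {
            s->p3 = 0xff;
        }
        s->lpcfg--;
    }

    /* Iterate while more than one trip remains. */
    if (s->lc0 > 1) {
        *next_pc = s->sa0;
        s->lc0--;
        return true;
    }
    return false;
}
```

Starting from lc0 = 3 and lpcfg = 2 (as after sp2loop0), the first call only decrements the countdown, the second call sets P3 to 0xff, and the third call falls through with lc0 left at 1.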
diff --git a/target/hexagon/imported/compare.idef b/target/hexagon/imported/compare.idef
new file mode 100644
index 0000000..0f97ed8
--- /dev/null
+++ b/target/hexagon/imported/compare.idef
@@ -0,0 +1,621 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * Compare Instructions
+ */
+
+
+
+/*********************************************/
+/* Scalar compare instructions	       */
+/*********************************************/
+
+Q6INSN(C2_cmpeq,"Pd4=cmp.eq(Rs32,Rt32)",ATTRIBS(),
+"Compare for Equal",
+{PdV=f8BITSOF(RsV==RtV);})
+
+Q6INSN(C2_cmpgt,"Pd4=cmp.gt(Rs32,Rt32)",ATTRIBS(),
+"Compare for signed Greater Than",
+{PdV=f8BITSOF(RsV>RtV);})
+
+Q6INSN(C2_cmpgtu,"Pd4=cmp.gtu(Rs32,Rt32)",ATTRIBS(),
+"Compare for Greater Than Unsigned",
+{PdV=f8BITSOF(fCAST4u(RsV)>fCAST4u(RtV));})
+
+Q6INSN(C2_cmpeqp,"Pd4=cmp.eq(Rss32,Rtt32)",ATTRIBS(),
+"Compare for Equal",
+{PdV=f8BITSOF(RssV==RttV);})
+
+Q6INSN(C2_cmpgtp,"Pd4=cmp.gt(Rss32,Rtt32)",ATTRIBS(),
+"Compare for signed Greater Than",
+{PdV=f8BITSOF(RssV>RttV);})
+
+Q6INSN(C2_cmpgtup,"Pd4=cmp.gtu(Rss32,Rtt32)",ATTRIBS(),
+"Compare for Greater Than Unsigned",
+{PdV=f8BITSOF(fCAST8u(RssV)>fCAST8u(RttV));})
+
+
+
+
+/*********************************************/
+/* Compare and put result in GPR	     */
+/*  typically for function I/O	       */
+/*********************************************/
+
+Q6INSN(A4_rcmpeqi,"Rd32=cmp.eq(Rs32,#s8)",ATTRIBS(),
+"Compare for Equal",
+{fIMMEXT(siV); RdV=(RsV==siV); })
+
+Q6INSN(A4_rcmpneqi,"Rd32=!cmp.eq(Rs32,#s8)",ATTRIBS(),
+"Compare for Not Equal",
+{fIMMEXT(siV); RdV=(RsV!=siV); })
+
+
+Q6INSN(A4_rcmpeq,"Rd32=cmp.eq(Rs32,Rt32)",ATTRIBS(),
+"Compare for Equal",
+{RdV=(RsV==RtV); })
+
+Q6INSN(A4_rcmpneq,"Rd32=!cmp.eq(Rs32,Rt32)",ATTRIBS(),
+"Compare for Not Equal",
+{RdV=(RsV!=RtV); })
+
+
+
+/*********************************************/
+/* Scalar compare instructions	       */
+/*********************************************/
+
+
+Q6INSN(C2_bitsset,"Pd4=bitsset(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"Compare for selected bits set",
+{PdV=f8BITSOF((RsV&RtV)==RtV);})
+
+Q6INSN(C2_bitsclr,"Pd4=bitsclr(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"Compare for selected bits clear",
+{PdV=f8BITSOF((RsV&RtV)==0);})
+
+
+Q6INSN(C4_nbitsset,"Pd4=!bitsset(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"Compare for selected bits not all set",
+{PdV=f8BITSOF((RsV&RtV)!=RtV);})
+
+Q6INSN(C4_nbitsclr,"Pd4=!bitsclr(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"Compare for selected bits not all clear",
+{PdV=f8BITSOF((RsV&RtV)!=0);})
+
+
+
+/*********************************************/
+/* Scalar compare instructions W/ immediate  */
+/*********************************************/
+
+Q6INSN(C2_cmpeqi,"Pd4=cmp.eq(Rs32,#s10)",ATTRIBS(),
+"Compare for Equal",
+{fIMMEXT(siV); PdV=f8BITSOF(RsV==siV);})
+
+Q6INSN(C2_cmpgti,"Pd4=cmp.gt(Rs32,#s10)",ATTRIBS(),
+"Compare for signed Greater Than",
+{fIMMEXT(siV); PdV=f8BITSOF(RsV>siV);})
+
+Q6INSN(C2_cmpgtui,"Pd4=cmp.gtu(Rs32,#u9)",ATTRIBS(),
+"Compare for Greater Than Unsigned",
+{fIMMEXT(uiV); PdV=f8BITSOF(fCAST4u(RsV)>fCAST4u(uiV));})
+
+Q6INSN(C2_bitsclri,"Pd4=bitsclr(Rs32,#u6)",ATTRIBS(A_ARCHV2),
+"Compare for selected bits clear",
+{PdV=f8BITSOF((RsV&uiV)==0);})
+
+Q6INSN(C4_nbitsclri,"Pd4=!bitsclr(Rs32,#u6)",ATTRIBS(A_ARCHV2),
+"Compare for selected bits not all clear",
+{PdV=f8BITSOF((RsV&uiV)!=0);})
+
+
+
+
+Q6INSN(C4_cmpneqi,"Pd4=!cmp.eq(Rs32,#s10)",ATTRIBS(), "Compare for Not Equal", {fIMMEXT(siV); PdV=f8BITSOF(RsV!=siV);})
+Q6INSN(C4_cmpltei,"Pd4=!cmp.gt(Rs32,#s10)",ATTRIBS(), "Compare for Less Than or Equal", {fIMMEXT(siV); PdV=f8BITSOF(RsV<=siV);})
+Q6INSN(C4_cmplteui,"Pd4=!cmp.gtu(Rs32,#u9)",ATTRIBS(), "Compare for Less Than or Equal Unsigned", {fIMMEXT(uiV); PdV=f8BITSOF(fCAST4u(RsV)<=fCAST4u(uiV));})
+
+Q6INSN(C4_cmpneq,"Pd4=!cmp.eq(Rs32,Rt32)",ATTRIBS(), "Compare for Not Equal", {PdV=f8BITSOF(RsV!=RtV);})
+Q6INSN(C4_cmplte,"Pd4=!cmp.gt(Rs32,Rt32)",ATTRIBS(), "Compare for Less Than or Equal", {PdV=f8BITSOF(RsV<=RtV);})
+Q6INSN(C4_cmplteu,"Pd4=!cmp.gtu(Rs32,Rt32)",ATTRIBS(), "Compare for Less Than or Equal Unsigned", {PdV=f8BITSOF(fCAST4u(RsV)<=fCAST4u(RtV));})
+
+
+
+
+
+/* Predicate Logical Operations */
+
+Q6INSN(C2_and,"Pd4=and(Pt4,Ps4)",ATTRIBS(A_CRSLOT23),
+"Predicate AND",
+{PdV=PsV & PtV;})
+
+Q6INSN(C2_or,"Pd4=or(Pt4,Ps4)",ATTRIBS(A_CRSLOT23),
+"Predicate OR",
+{PdV=PsV | PtV;})
+
+Q6INSN(C2_xor,"Pd4=xor(Ps4,Pt4)",ATTRIBS(A_CRSLOT23),
+"Predicate XOR",
+{PdV=PsV ^ PtV;})
+
+Q6INSN(C2_andn,"Pd4=and(Pt4,!Ps4)",ATTRIBS(A_CRSLOT23),
+"Predicate AND NOT",
+{PdV=PtV & (~PsV);})
+
+Q6INSN(C2_not,"Pd4=not(Ps4)",ATTRIBS(A_CRSLOT23),
+"Logical NOT Predicate",
+{PdV=~PsV;})
+
+Q6INSN(C2_orn,"Pd4=or(Pt4,!Ps4)",ATTRIBS(A_ARCHV2,A_CRSLOT23),
+"Predicate OR NOT",
+{PdV=PtV | (~PsV);})
+
+
+
+
+
+Q6INSN(C4_and_and,"Pd4=and(Ps4,and(Pt4,Pu4))",ATTRIBS(A_CRSLOT23),
+"Compound And-And", { PdV = PsV & PtV & PuV; })
+
+Q6INSN(C4_and_or,"Pd4=and(Ps4,or(Pt4,Pu4))",ATTRIBS(A_CRSLOT23),
+"Compound And-Or", { PdV = PsV &  (PtV | PuV); })
+
+Q6INSN(C4_or_and,"Pd4=or(Ps4,and(Pt4,Pu4))",ATTRIBS(A_CRSLOT23),
+"Compound Or-And", { PdV = PsV | (PtV & PuV); })
+
+Q6INSN(C4_or_or,"Pd4=or(Ps4,or(Pt4,Pu4))",ATTRIBS(A_CRSLOT23),
+"Compound Or-Or", { PdV = PsV | PtV | PuV; })
+
+
+
+Q6INSN(C4_and_andn,"Pd4=and(Ps4,and(Pt4,!Pu4))",ATTRIBS(A_CRSLOT23),
+"Compound And-AndNot", { PdV = PsV & PtV & (~PuV); })
+
+Q6INSN(C4_and_orn,"Pd4=and(Ps4,or(Pt4,!Pu4))",ATTRIBS(A_CRSLOT23),
+"Compound And-OrNot", { PdV = PsV &  (PtV | (~PuV)); })
+
+Q6INSN(C4_or_andn,"Pd4=or(Ps4,and(Pt4,!Pu4))",ATTRIBS(A_CRSLOT23),
+"Compound Or-AndNot", { PdV = PsV | (PtV & (~PuV)); })
+
+Q6INSN(C4_or_orn,"Pd4=or(Ps4,or(Pt4,!Pu4))",ATTRIBS(A_CRSLOT23),
+"Compound Or-OrNot", { PdV = PsV | PtV | (~PuV); })
+
+
+Q6INSN(C2_any8,"Pd4=any8(Ps4)",ATTRIBS(A_CRSLOT23),
+"Logical ANY of low 8 predicate bits",
+{ PsV ? (PdV=0xff) : (PdV=0x00); })
+
+Q6INSN(C2_all8,"Pd4=all8(Ps4)",ATTRIBS(A_CRSLOT23),
+"Logical ALL of low 8 predicate bits",
+{ (PsV==0xff) ? (PdV=0xff) : (PdV=0x00); })
+
+Q6INSN(C2_vitpack,"Rd32=vitpack(Ps4,Pt4)",ATTRIBS(),
+"Pack the odd and even bits of two predicate registers",
+{ RdV = (PsV&0x55) | (PtV&0xAA); })
+
+/* Mux instructions */
+
+Q6INSN(C2_mux,"Rd32=mux(Pu4,Rs32,Rt32)",ATTRIBS(),
+"Scalar MUX",
+{ (fLSBOLD(PuV)) ? (RdV=RsV):(RdV=RtV); })
+
+
+Q6INSN(C2_cmovenewit,"if (Pu4.new) Rd32=#s12",ATTRIBS(A_ARCHV2),
+"Scalar conditional move",
+{ fIMMEXT(siV); if (fLSBNEW(PuN)) RdV=siV; else CANCEL;})
+
+Q6INSN(C2_cmovenewif,"if (!Pu4.new) Rd32=#s12",ATTRIBS(A_ARCHV2),
+"Scalar conditional move",
+{ fIMMEXT(siV); if (fLSBNEWNOT(PuN)) RdV=siV; else CANCEL;})
+
+Q6INSN(C2_cmoveit,"if (Pu4) Rd32=#s12",ATTRIBS(A_ARCHV2),
+"Scalar conditional move",
+{ fIMMEXT(siV); if (fLSBOLD(PuV)) RdV=siV; else CANCEL;})
+
+Q6INSN(C2_cmoveif,"if (!Pu4) Rd32=#s12",ATTRIBS(A_ARCHV2),
+"Scalar conditional move",
+{ fIMMEXT(siV); if (fLSBOLDNOT(PuV)) RdV=siV; else CANCEL;})
+
+
+
+Q6INSN(C2_ccombinewnewt,"if (Pu4.new) Rdd32=combine(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"Conditionally combine two words into a register pair",
+{ if (fLSBNEW(PuN)) {
+    fSETWORD(0,RddV,RtV);
+    fSETWORD(1,RddV,RsV);
+  } else {CANCEL;}
+})
+
+Q6INSN(C2_ccombinewnewf,"if (!Pu4.new) Rdd32=combine(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"Conditionally combine two words into a register pair",
+{ if (fLSBNEWNOT(PuN)) {
+    fSETWORD(0,RddV,RtV);
+    fSETWORD(1,RddV,RsV);
+  } else {CANCEL;}
+})
+
+Q6INSN(C2_ccombinewt,"if (Pu4) Rdd32=combine(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"Conditionally combine two words into a register pair",
+{ if (fLSBOLD(PuV)) {
+    fSETWORD(0,RddV,RtV);
+    fSETWORD(1,RddV,RsV);
+  } else {CANCEL;}
+})
+
+Q6INSN(C2_ccombinewf,"if (!Pu4) Rdd32=combine(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"Conditionally combine two words into a register pair",
+{ if (fLSBOLDNOT(PuV)) {
+    fSETWORD(0,RddV,RtV);
+    fSETWORD(1,RddV,RsV);
+  } else {CANCEL;}
+})
+
+
+
+Q6INSN(C2_muxii,"Rd32=mux(Pu4,#s8,#S8)",ATTRIBS(A_ARCHV2),
+"Scalar MUX immediates",
+{ fIMMEXT(siV); (fLSBOLD(PuV)) ? (RdV=siV):(RdV=SiV); })
+
+
+
+Q6INSN(C2_muxir,"Rd32=mux(Pu4,Rs32,#s8)",ATTRIBS(A_ARCHV2),
+"Scalar MUX register immediate",
+{ fIMMEXT(siV); (fLSBOLD(PuV)) ? (RdV=RsV):(RdV=siV); })
+
+
+Q6INSN(C2_muxri,"Rd32=mux(Pu4,#s8,Rs32)",ATTRIBS(A_ARCHV2),
+"Scalar MUX register immediate",
+{ fIMMEXT(siV); (fLSBOLD(PuV)) ? (RdV=siV):(RdV=RsV); })
+
+
+
+Q6INSN(C2_vmux,"Rdd32=vmux(Pu4,Rss32,Rtt32)",ATTRIBS(),
+"Vector MUX",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBYTE(i,RddV,(fGETBIT(i,PuV)?(fGETBYTE(i,RssV)):(fGETBYTE(i,RttV))));
+	}
+})
+
+Q6INSN(C2_mask,"Rdd32=mask(Pt4)",ATTRIBS(),
+"Vector Mask Generation",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBYTE(i,RddV,(fGETBIT(i,PtV)?(0xff):(0x00)));
+	}
+})
+
+/* VCMP */
+
+Q6INSN(A2_vcmpbeq,"Pd4=vcmpb.eq(Rss32,Rtt32)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBIT(i,PdV,(fGETBYTE(i,RssV) == fGETBYTE(i,RttV)));
+	}
+})
+
+Q6INSN(A4_vcmpbeqi,"Pd4=vcmpb.eq(Rss32,#u8)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBIT(i,PdV,(fGETUBYTE(i,RssV) == uiV));
+	}
+})
+
+Q6INSN(A4_vcmpbeq_any,"Pd4=any8(vcmpb.eq(Rss32,Rtt32))",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	PdV = 0;
+	for (i = 0; i < 8; i++) {
+		if (fGETBYTE(i,RssV) == fGETBYTE(i,RttV)) PdV = 0xff;
+	}
+})
+
+Q6INSN(A6_vcmpbeq_notany,"Pd4=!any8(vcmpb.eq(Rss32,Rtt32))",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	PdV = 0;
+	for (i = 0; i < 8; i++) {
+		if (fGETBYTE(i,RssV) == fGETBYTE(i,RttV)) PdV = 0xff;
+	}
+	PdV = ~PdV;
+})
+
+Q6INSN(A2_vcmpbgtu,"Pd4=vcmpb.gtu(Rss32,Rtt32)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBIT(i,PdV,(fGETUBYTE(i,RssV) > fGETUBYTE(i,RttV)));
+	}
+})
+
+Q6INSN(A4_vcmpbgtui,"Pd4=vcmpb.gtu(Rss32,#u7)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBIT(i,PdV,(fGETUBYTE(i,RssV) > uiV));
+	}
+})
+
+Q6INSN(A4_vcmpbgt,"Pd4=vcmpb.gt(Rss32,Rtt32)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBIT(i,PdV,(fGETBYTE(i,RssV) > fGETBYTE(i,RttV)));
+	}
+})
+
+Q6INSN(A4_vcmpbgti,"Pd4=vcmpb.gt(Rss32,#s8)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 8; i++) {
+		fSETBIT(i,PdV,(fGETBYTE(i,RssV) > siV));
+	}
+})
+
+
+
+Q6INSN(A4_cmpbeq,"Pd4=cmpb.eq(Rs32,Rt32)",ATTRIBS(),
+"Compare bytes ",
+{
+	PdV=f8BITSOF(fGETBYTE(0,RsV) == fGETBYTE(0,RtV));
+})
+
+Q6INSN(A4_cmpbeqi,"Pd4=cmpb.eq(Rs32,#u8)",ATTRIBS(),
+"Compare bytes ",
+{
+	PdV=f8BITSOF(fGETUBYTE(0,RsV) == uiV);
+})
+
+Q6INSN(A4_cmpbgtu,"Pd4=cmpb.gtu(Rs32,Rt32)",ATTRIBS(),
+"Compare bytes ",
+{
+	PdV=f8BITSOF(fGETUBYTE(0,RsV) > fGETUBYTE(0,RtV));
+})
+
+Q6INSN(A4_cmpbgtui,"Pd4=cmpb.gtu(Rs32,#u7)",ATTRIBS(),
+"Compare bytes ",
+{
+	fIMMEXT(uiV);
+	PdV=f8BITSOF(fGETUBYTE(0,RsV) > fCAST4u(uiV));
+})
+
+Q6INSN(A4_cmpbgt,"Pd4=cmpb.gt(Rs32,Rt32)",ATTRIBS(),
+"Compare bytes ",
+{
+	PdV=f8BITSOF(fGETBYTE(0,RsV) > fGETBYTE(0,RtV));
+})
+
+Q6INSN(A4_cmpbgti,"Pd4=cmpb.gt(Rs32,#s8)",ATTRIBS(),
+"Compare bytes ",
+{
+	PdV=f8BITSOF(fGETBYTE(0,RsV) > siV);
+})
+
+Q6INSN(A2_vcmpheq,"Pd4=vcmph.eq(Rss32,Rtt32)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 4; i++) {
+		fSETBIT(i*2,PdV,  (fGETHALF(i,RssV) == fGETHALF(i,RttV)));
+		fSETBIT(i*2+1,PdV,(fGETHALF(i,RssV) == fGETHALF(i,RttV)));
+	}
+})
+
+Q6INSN(A2_vcmphgt,"Pd4=vcmph.gt(Rss32,Rtt32)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 4; i++) {
+		fSETBIT(i*2,  PdV,  (fGETHALF(i,RssV) > fGETHALF(i,RttV)));
+		fSETBIT(i*2+1,PdV,  (fGETHALF(i,RssV) > fGETHALF(i,RttV)));
+	}
+})
+
+Q6INSN(A2_vcmphgtu,"Pd4=vcmph.gtu(Rss32,Rtt32)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 4; i++) {
+		fSETBIT(i*2,  PdV,  (fGETUHALF(i,RssV) > fGETUHALF(i,RttV)));
+		fSETBIT(i*2+1,PdV,  (fGETUHALF(i,RssV) > fGETUHALF(i,RttV)));
+	}
+})
+
+Q6INSN(A4_vcmpheqi,"Pd4=vcmph.eq(Rss32,#s8)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 4; i++) {
+		fSETBIT(i*2,PdV,  (fGETHALF(i,RssV) == siV));
+		fSETBIT(i*2+1,PdV,(fGETHALF(i,RssV) == siV));
+	}
+})
+
+Q6INSN(A4_vcmphgti,"Pd4=vcmph.gt(Rss32,#s8)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 4; i++) {
+		fSETBIT(i*2,  PdV,  (fGETHALF(i,RssV) > siV));
+		fSETBIT(i*2+1,PdV,  (fGETHALF(i,RssV) > siV));
+	}
+})
+
+
+Q6INSN(A4_vcmphgtui,"Pd4=vcmph.gtu(Rss32,#u7)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 4; i++) {
+		fSETBIT(i*2,  PdV,  (fGETUHALF(i,RssV) > uiV));
+		fSETBIT(i*2+1,PdV,  (fGETUHALF(i,RssV) > uiV));
+	}
+})
+
+Q6INSN(A4_cmpheq,"Pd4=cmph.eq(Rs32,Rt32)",ATTRIBS(),
+"Compare halfwords ",
+{
+	PdV=f8BITSOF(fGETHALF(0,RsV) == fGETHALF(0,RtV));
+})
+
+Q6INSN(A4_cmphgt,"Pd4=cmph.gt(Rs32,Rt32)",ATTRIBS(),
+"Compare halfwords ",
+{
+	PdV=f8BITSOF(fGETHALF(0,RsV) > fGETHALF(0,RtV));
+})
+
+Q6INSN(A4_cmphgtu,"Pd4=cmph.gtu(Rs32,Rt32)",ATTRIBS(),
+"Compare halfwords ",
+{
+	PdV=f8BITSOF(fGETUHALF(0,RsV) > fGETUHALF(0,RtV));
+})
+
+Q6INSN(A4_cmpheqi,"Pd4=cmph.eq(Rs32,#s8)",ATTRIBS(),
+"Compare halfwords ",
+{
+	fIMMEXT(siV);
+	PdV=f8BITSOF(fGETHALF(0,RsV) == siV);
+})
+
+Q6INSN(A4_cmphgti,"Pd4=cmph.gt(Rs32,#s8)",ATTRIBS(),
+"Compare halfwords ",
+{
+	fIMMEXT(siV);
+	PdV=f8BITSOF(fGETHALF(0,RsV) > siV);
+})
+
+Q6INSN(A4_cmphgtui,"Pd4=cmph.gtu(Rs32,#u7)",ATTRIBS(),
+"Compare halfwords ",
+{
+	fIMMEXT(uiV);
+	PdV=f8BITSOF(fGETUHALF(0,RsV) > fCAST4u(uiV));
+})
+
+Q6INSN(A2_vcmpweq,"Pd4=vcmpw.eq(Rss32,Rtt32)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fSETBITS(3,0,PdV,(fGETWORD(0,RssV)==fGETWORD(0,RttV)));
+	fSETBITS(7,4,PdV,(fGETWORD(1,RssV)==fGETWORD(1,RttV)));
+})
+
+Q6INSN(A2_vcmpwgt,"Pd4=vcmpw.gt(Rss32,Rtt32)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fSETBITS(3,0,PdV,(fGETWORD(0,RssV)>fGETWORD(0,RttV)));
+	fSETBITS(7,4,PdV,(fGETWORD(1,RssV)>fGETWORD(1,RttV)));
+})
+
+Q6INSN(A2_vcmpwgtu,"Pd4=vcmpw.gtu(Rss32,Rtt32)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fSETBITS(3,0,PdV,(fGETUWORD(0,RssV)>fGETUWORD(0,RttV)));
+	fSETBITS(7,4,PdV,(fGETUWORD(1,RssV)>fGETUWORD(1,RttV)));
+})
+
+Q6INSN(A4_vcmpweqi,"Pd4=vcmpw.eq(Rss32,#s8)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fSETBITS(3,0,PdV,(fGETWORD(0,RssV)==siV));
+	fSETBITS(7,4,PdV,(fGETWORD(1,RssV)==siV));
+})
+
+Q6INSN(A4_vcmpwgti,"Pd4=vcmpw.gt(Rss32,#s8)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fSETBITS(3,0,PdV,(fGETWORD(0,RssV)>siV));
+	fSETBITS(7,4,PdV,(fGETWORD(1,RssV)>siV));
+})
+
+Q6INSN(A4_vcmpwgtui,"Pd4=vcmpw.gtu(Rss32,#u7)",ATTRIBS(),
+"Compare elements of two vectors ",
+{
+	fSETBITS(3,0,PdV,(fGETUWORD(0,RssV)>fCAST4u(uiV)));
+	fSETBITS(7,4,PdV,(fGETUWORD(1,RssV)>fCAST4u(uiV)));
+})
+
+Q6INSN(A4_boundscheck_hi,"Pd4=boundscheck(Rss32,Rtt32):raw:hi",ATTRIBS(),
+"Detect if a register is within bounds",
+{
+	fHIDE(size4u_t src;)
+	src = fGETUWORD(1,RssV);
+	PdV = f8BITSOF((fCAST4u(src) >= fGETUWORD(0,RttV)) && (fCAST4u(src) < fGETUWORD(1,RttV)));
+})
+
+Q6INSN(A4_boundscheck_lo,"Pd4=boundscheck(Rss32,Rtt32):raw:lo",ATTRIBS(),
+"Detect if a register is within bounds",
+{
+	fHIDE(size4u_t src;)
+	src = fGETUWORD(0,RssV);
+	PdV = f8BITSOF((fCAST4u(src) >= fGETUWORD(0,RttV)) && (fCAST4u(src) < fGETUWORD(1,RttV)));
+})
+
+Q6INSN(A4_tlbmatch,"Pd4=tlbmatch(Rss32,Rt32)",ATTRIBS(),
+"Detect if a VA/ASID matches a TLB entry",
+{
+	fHIDE(size4u_t TLBHI; size4u_t TLBLO; size4u_t MASK; size4u_t SIZE;)
+	MASK = 0x07ffffff;
+	TLBLO = fGETUWORD(0,RssV);
+	TLBHI = fGETUWORD(1,RssV);
+	SIZE = fMIN(6,fCL1_4(~fBREV_4(TLBLO)));
+	MASK &= (0xffffffff << 2*SIZE);
+	PdV = f8BITSOF(fGETBIT(31,TLBHI) && ((TLBHI & MASK) == (RtV & MASK)));
+})
+
+Q6INSN(C2_tfrpr,"Rd32=Ps4",ATTRIBS(),
+"Transfer predicate to general register", { RdV = fZXTN(8,32,PsV); })
+
+Q6INSN(C2_tfrrp,"Pd4=Rs32",ATTRIBS(),
+"Transfer general register to Predicate", { PdV = fGETUBYTE(0,RsV); })
+
+Q6INSN(C4_fastcorner9,"Pd4=fastcorner9(Ps4,Pt4)",ATTRIBS(A_CRSLOT23),
+"Determine whether the predicate sources define a corner",
+{
+	fHIDE(size4u_t tmp = 0; size4u_t i;)
+	fSETHALF(0,tmp,(PsV<<8)|PtV);
+	fSETHALF(1,tmp,(PsV<<8)|PtV);
+	for (i = 1; i < 9; i++) {
+		tmp &= tmp >> 1;
+	}
+	PdV = f8BITSOF(tmp != 0);
+})
+
+Q6INSN(C4_fastcorner9_not,"Pd4=!fastcorner9(Ps4,Pt4)",ATTRIBS(A_CRSLOT23),
+"Determine whether the predicate sources define a corner",
+{
+	fHIDE(size4u_t tmp = 0; size4u_t i;)
+	fSETHALF(0,tmp,(PsV<<8)|PtV);
+	fSETHALF(1,tmp,(PsV<<8)|PtV);
+	for (i = 1; i < 9; i++) {
+		tmp &= tmp >> 1;
+	}
+	PdV = f8BITSOF(tmp == 0);
+})
+
+
diff --git a/target/hexagon/imported/float.idef b/target/hexagon/imported/float.idef
new file mode 100644
index 0000000..e493ce6
--- /dev/null
+++ b/target/hexagon/imported/float.idef
@@ -0,0 +1,313 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * Floating-Point Instructions
+ */
+
+/*************************************/
+/* Scalar FP			 */
+/*************************************/
+Q6INSN(F2_sfadd,"Rd32=sfadd(Rs32,Rt32)",ATTRIBS(),
+"Floating-Point Add",
+{ RdV=fUNFLOAT(fFLOAT(RsV)+fFLOAT(RtV));})
+
+Q6INSN(F2_sfsub,"Rd32=sfsub(Rs32,Rt32)",ATTRIBS(),
+"Floating-Point Subtract",
+{ RdV=fUNFLOAT(fFLOAT(RsV)-fFLOAT(RtV));})
+
+Q6INSN(F2_sfmpy,"Rd32=sfmpy(Rs32,Rt32)",ATTRIBS(),
+"Floating-Point Multiply",
+{ RdV=fUNFLOAT(fSFMPY(fFLOAT(RsV),fFLOAT(RtV)));})
+
+Q6INSN(F2_sffma,"Rx32+=sfmpy(Rs32,Rt32)",ATTRIBS(),
+"Floating-Point Fused Multiply Add",
+{ RxV=fUNFLOAT(fFMAF(fFLOAT(RsV),fFLOAT(RtV),fFLOAT(RxV)));})
+
+Q6INSN(F2_sffma_sc,"Rx32+=sfmpy(Rs32,Rt32,Pu4):scale",ATTRIBS(),
+"Floating-Point Fused Multiply Add w/ Additional Scaling (2**Pu)",
+{
+	fHIDE(size4s_t tmp;)
+	fCHECKSFNAN3(RxV,RxV,RsV,RtV);
+	tmp=fUNFLOAT(fFMAFX(fFLOAT(RsV),fFLOAT(RtV),fFLOAT(RxV),PuV));
+	if (!((fFLOAT(RxV) == 0.0) && fISZEROPROD(fFLOAT(RsV),fFLOAT(RtV)))) RxV = tmp;
+})
+
+Q6INSN(F2_sffms,"Rx32-=sfmpy(Rs32,Rt32)",ATTRIBS(),
+"Floating-Point Fused Multiply Subtract",
+{ RxV=fUNFLOAT(fFMAF(-fFLOAT(RsV),fFLOAT(RtV),fFLOAT(RxV))); })
+
+Q6INSN(F2_sffma_lib,"Rx32+=sfmpy(Rs32,Rt32):lib",ATTRIBS(),
+"Floating-Point Fused Multiply Add for Library Routines",
+{ fFPSETROUND_NEAREST(); fHIDE(int infinp; int infminusinf; size4s_t tmp;)
+	infminusinf = ((isinf(fFLOAT(RxV))) &&
+		(fISINFPROD(fFLOAT(RsV),fFLOAT(RtV))) &&
+		(fGETBIT(31,RsV ^ RxV ^ RtV) != 0));
+	infinp = (isinf(fFLOAT(RxV))) || (isinf(fFLOAT(RtV))) || (isinf(fFLOAT(RsV)));
+	fCHECKSFNAN3(RxV,RxV,RsV,RtV);
+	tmp=fUNFLOAT(fFMAF(fFLOAT(RsV),fFLOAT(RtV),fFLOAT(RxV)));
+	if (!((fFLOAT(RxV) == 0.0) && fISZEROPROD(fFLOAT(RsV),fFLOAT(RtV)))) RxV = tmp;
+	fFPCANCELFLAGS();
+	if (isinf(fFLOAT(RxV)) && !infinp) RxV = RxV - 1;
+	if (infminusinf) RxV = 0;
+})
+
+Q6INSN(F2_sffms_lib,"Rx32-=sfmpy(Rs32,Rt32):lib",ATTRIBS(),
+"Floating-Point Fused Multiply Subtract for Library Routines",
+{ fFPSETROUND_NEAREST(); fHIDE(int infinp; int infminusinf; size4s_t tmp;)
+	infminusinf = ((isinf(fFLOAT(RxV))) &&
+		(fISINFPROD(fFLOAT(RsV),fFLOAT(RtV))) &&
+		(fGETBIT(31,RsV ^ RxV ^ RtV) == 0));
+	infinp = (isinf(fFLOAT(RxV))) || (isinf(fFLOAT(RtV))) || (isinf(fFLOAT(RsV)));
+	fCHECKSFNAN3(RxV,RxV,RsV,RtV);
+	tmp=fUNFLOAT(fFMAF(-fFLOAT(RsV),fFLOAT(RtV),fFLOAT(RxV)));
+	if (!((fFLOAT(RxV) == 0.0) && fISZEROPROD(fFLOAT(RsV),fFLOAT(RtV)))) RxV = tmp;
+	fFPCANCELFLAGS();
+	if (isinf(fFLOAT(RxV)) && !infinp) RxV = RxV - 1;
+	if (infminusinf) RxV = 0;
+})
+
+
+Q6INSN(F2_sfcmpeq,"Pd4=sfcmp.eq(Rs32,Rt32)",ATTRIBS(),
+"Floating-Point Compare for Equal",
+{PdV=f8BITSOF(fFLOAT(RsV)==fFLOAT(RtV));})
+
+Q6INSN(F2_sfcmpgt,"Pd4=sfcmp.gt(Rs32,Rt32)",ATTRIBS(),
+"Floating-Point Compare for Greater Than",
+{PdV=f8BITSOF(fFLOAT(RsV)>fFLOAT(RtV));})
+
+/* cmpge is not the same as !cmpgt(swapops) in IEEE */
+
+Q6INSN(F2_sfcmpge,"Pd4=sfcmp.ge(Rs32,Rt32)",ATTRIBS(),
+"Floating-Point Compare for Greater Than / Equal To",
+{PdV=f8BITSOF(fFLOAT(RsV)>=fFLOAT(RtV));})
+
+/* Everyone seems to have this... */
+
+Q6INSN(F2_sfcmpuo,"Pd4=sfcmp.uo(Rs32,Rt32)",ATTRIBS(),
+"Floating-Point Compare for Unordered",
+{PdV=f8BITSOF(isunordered(fFLOAT(RsV),fFLOAT(RtV)));})
+
+
+Q6INSN(F2_sfmax,"Rd32=sfmax(Rs32,Rt32)",ATTRIBS(),
+"Maximum of Floating-Point values",
+{ RdV = fUNFLOAT(fSF_MAX(fFLOAT(RsV),fFLOAT(RtV))); })
+
+Q6INSN(F2_sfmin,"Rd32=sfmin(Rs32,Rt32)",ATTRIBS(),
+"Minimum of Floating-Point values",
+{ RdV = fUNFLOAT(fSF_MIN(fFLOAT(RsV),fFLOAT(RtV))); })
+
+
+Q6INSN(F2_sfclass,"Pd4=sfclass(Rs32,#u5)",ATTRIBS(),
+"Classify Floating-Point Value",
+{
+	fHIDE(int class;)
+	PdV = 0;
+	class = fpclassify(fFLOAT(RsV));
+	/* Is the value zero? */
+	if (fGETBIT(0,uiV) && (class == FP_ZERO)) PdV = 0xff;
+	if (fGETBIT(1,uiV) && (class == FP_NORMAL)) PdV = 0xff;
+	if (fGETBIT(2,uiV) && (class == FP_SUBNORMAL)) PdV = 0xff;
+	if (fGETBIT(3,uiV) && (class == FP_INFINITE)) PdV = 0xff;
+	if (fGETBIT(4,uiV) && (class == FP_NAN)) PdV = 0xff;
+	fFPCANCELFLAGS();
+})
+
+/* Range: +/- (1.0 .. 1+(63/64)) * 2**(-6 .. +9) */
+/* More immediate bits should probably be used for more precision? */
+
+Q6INSN(F2_sfimm_p,"Rd32=sfmake(#u10):pos",ATTRIBS(),
+"Make Floating Point Value",
+{
+	RdV = (127 - 6) << 23;
+	RdV += uiV << 17;
+})
+
+Q6INSN(F2_sfimm_n,"Rd32=sfmake(#u10):neg",ATTRIBS(),
+"Make Floating Point Value",
+{
+	RdV = (127 - 6) << 23;
+	RdV += (uiV << 17);
+	RdV |= (1 << 31);
+})
+
+
+Q6INSN(F2_sffixupn,"Rd32=sffixupn(Rs32,Rt32)",ATTRIBS(),
+"Fix Up Numerator",
+{
+	fHIDE(int adjust;)
+	fSF_RECIP_COMMON(RsV,RtV,RdV,adjust);
+	RdV = RsV;
+})
+
+Q6INSN(F2_sffixupd,"Rd32=sffixupd(Rs32,Rt32)",ATTRIBS(),
+"Fix Up Denominator",
+{
+	fHIDE(int adjust;)
+	fSF_RECIP_COMMON(RsV,RtV,RdV,adjust);
+	RdV = RtV;
+})
+
+Q6INSN(F2_sffixupr,"Rd32=sffixupr(Rs32)",ATTRIBS(),
+"Fix Up Radicand",
+{
+	fHIDE(int adjust;)
+	fSF_INVSQRT_COMMON(RsV,RdV,adjust);
+	RdV = RsV;
+})
+
+/*************************************/
+/* Scalar DP			 */
+/*************************************/
+Q6INSN(F2_dfadd,"Rdd32=dfadd(Rss32,Rtt32)",ATTRIBS(),
+"Floating-Point Add",
+{ RddV=fUNDOUBLE(fDOUBLE(RssV)+fDOUBLE(RttV));})
+
+Q6INSN(F2_dfsub,"Rdd32=dfsub(Rss32,Rtt32)",ATTRIBS(),
+"Floating-Point Subtract",
+{ RddV=fUNDOUBLE(fDOUBLE(RssV)-fDOUBLE(RttV));})
+
+Q6INSN(F2_dfmax,"Rdd32=dfmax(Rss32,Rtt32)",ATTRIBS(),
+"Maximum of Floating-Point values",
+{ RddV = fUNDOUBLE(fDF_MAX(fDOUBLE(RssV),fDOUBLE(RttV))); })
+
+Q6INSN(F2_dfmin,"Rdd32=dfmin(Rss32,Rtt32)",ATTRIBS(),
+"Minimum of Floating-Point values",
+{ RddV = fUNDOUBLE(fDF_MIN(fDOUBLE(RssV),fDOUBLE(RttV))); })
+
+Q6INSN(F2_dfmpyfix,"Rdd32=dfmpyfix(Rss32,Rtt32)",ATTRIBS(),
+"Fix Up Multiplicand for Multiplication",
+{
+	if (fDF_ISDENORM(RssV) && fDF_ISBIG(RttV) && fDF_ISNORMAL(RttV)) RddV = fUNDOUBLE(fDOUBLE(RssV) * 0x1.0p52);
+	else if (fDF_ISDENORM(RttV) && fDF_ISBIG(RssV) && fDF_ISNORMAL(RssV)) RddV = fUNDOUBLE(fDOUBLE(RssV) * 0x1.0p-52);
+	else RddV = RssV;
+})
+
+Q6INSN(F2_dfmpyll,"Rdd32=dfmpyll(Rss32,Rtt32)",ATTRIBS(),
+"Multiply low*low and shift off low 32 bits into sticky (in MSB)",
+{
+	fHIDE(size8u_t prod;)
+	prod = fMPY32UU(fGETUWORD(0,RssV),fGETUWORD(0,RttV));
+	RddV = (prod >> 32) << 1;
+	if (fGETUWORD(0,prod) != 0) fSETBIT(0,RddV,1);
+})
+
+Q6INSN(F2_dfmpylh,"Rxx32+=dfmpylh(Rss32,Rtt32)",ATTRIBS(),
+"Multiply low*high and accumulate",
+{
+	RxxV += (fGETUWORD(0,RssV) * (0x00100000 | fZXTN(20,64,fGETUWORD(1,RttV)))) << 1;
+})
+
+Q6INSN(F2_dfmpyhh,"Rxx32+=dfmpyhh(Rss32,Rtt32)",ATTRIBS(),
+"Multiply high*high and accumulate with L*H value",
+{
+	RxxV = fUNDOUBLE(fDF_MPY_HH(fDOUBLE(RssV),fDOUBLE(RttV),RxxV));
+})
+
+
+
+Q6INSN(F2_dfcmpeq,"Pd4=dfcmp.eq(Rss32,Rtt32)",ATTRIBS(),
+"Floating-Point Compare for Equal",
+{PdV=f8BITSOF(fDOUBLE(RssV)==fDOUBLE(RttV));})
+
+Q6INSN(F2_dfcmpgt,"Pd4=dfcmp.gt(Rss32,Rtt32)",ATTRIBS(),
+"Floating-Point Compare for Greater Than",
+{PdV=f8BITSOF(fDOUBLE(RssV)>fDOUBLE(RttV));})
+
+
+/* cmpge is not the same as !cmpgt(swapops) in IEEE */
+
+Q6INSN(F2_dfcmpge,"Pd4=dfcmp.ge(Rss32,Rtt32)",ATTRIBS(),
+"Floating-Point Compare for Greater Than / Equal To",
+{PdV=f8BITSOF(fDOUBLE(RssV)>=fDOUBLE(RttV));})
+
+/* Everyone seems to have this... */
+
+Q6INSN(F2_dfcmpuo,"Pd4=dfcmp.uo(Rss32,Rtt32)",ATTRIBS(),
+"Floating-Point Compare for Unordered",
+{PdV=f8BITSOF(isunordered(fDOUBLE(RssV),fDOUBLE(RttV)));})
+
+
+Q6INSN(F2_dfclass,"Pd4=dfclass(Rss32,#u5)",ATTRIBS(),
+"Classify Floating-Point Value",
+{
+	fHIDE(int class;)
+	PdV = 0;
+	class = fpclassify(fDOUBLE(RssV));
+	/* Is the value zero? */
+	if (fGETBIT(0,uiV) && (class == FP_ZERO)) PdV = 0xff;
+	if (fGETBIT(1,uiV) && (class == FP_NORMAL)) PdV = 0xff;
+	if (fGETBIT(2,uiV) && (class == FP_SUBNORMAL)) PdV = 0xff;
+	if (fGETBIT(3,uiV) && (class == FP_INFINITE)) PdV = 0xff;
+	if (fGETBIT(4,uiV) && (class == FP_NAN)) PdV = 0xff;
+	fFPCANCELFLAGS();
+})
+
+
+/* Range: +/- (1.0 .. 1+(63/64)) * 2**(-6 .. +9) */
+/* More immediate bits should probably be used for more precision? */
+
+Q6INSN(F2_dfimm_p,"Rdd32=dfmake(#u10):pos",ATTRIBS(),
+"Make Floating Point Value",
+{
+	RddV = (1023ULL - 6) << 52;
+	RddV += (fHIDE((size8u_t))uiV) << 46;
+})
+
+Q6INSN(F2_dfimm_n,"Rdd32=dfmake(#u10):neg",ATTRIBS(),
+"Make Floating Point Value",
+{
+	RddV = (1023ULL - 6) << 52;
+	RddV += (fHIDE((size8u_t))uiV) << 46;
+	RddV |= ((1ULL) << 63);
+})
+
+
+/* CONVERSION */
+
+#define CONVERT(TAG,DEST,DESTV,SRC,SRCV,OUTCAST,OUTTYPE,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \
+    Q6INSN(F2_conv_##TAG##MODETAG,#DEST"=convert_"#TAG"("#SRC")"#MODESYN,ATTRIBS(), \
+    "Floating point format conversion", \
+    { MODEBEH DESTV = OUTCAST(conv_##INTYPE##_to_##OUTTYPE(INCAST(SRCV))); })
+
+CONVERT(sf2df,Rdd32,RddV,Rs32,RsV,fUNDOUBLE,df,fFLOAT,sf,,,)
+CONVERT(df2sf,Rd32,RdV,Rss32,RssV,fUNFLOAT,sf,fDOUBLE,df,,,)
+
+#define ALLINTDST(TAGSTART,SRC,SRCV,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \
+CONVERT(TAGSTART##uw,Rd32,RdV,SRC,SRCV,fCAST4u,4u,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \
+CONVERT(TAGSTART##w,Rd32,RdV,SRC,SRCV,fCAST4s,4s,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \
+CONVERT(TAGSTART##ud,Rdd32,RddV,SRC,SRCV,fCAST8u,8u,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \
+CONVERT(TAGSTART##d,Rdd32,RddV,SRC,SRCV,fCAST8s,8s,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH)
+
+#define ALLFPDST(TAGSTART,SRC,SRCV,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \
+CONVERT(TAGSTART##sf,Rd32,RdV,SRC,SRCV,fUNFLOAT,sf,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \
+CONVERT(TAGSTART##df,Rdd32,RddV,SRC,SRCV,fUNDOUBLE,df,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH)
+
+#define ALLINTSRC(GEN,MODETAG,MODESYN,MODEBEH) \
+GEN(uw##2,Rs32,RsV,fCAST4u,4u,MODETAG,MODESYN,MODEBEH) \
+GEN(w##2,Rs32,RsV,fCAST4s,4s,MODETAG,MODESYN,MODEBEH) \
+GEN(ud##2,Rss32,RssV,fCAST8u,8u,MODETAG,MODESYN,MODEBEH) \
+GEN(d##2,Rss32,RssV,fCAST8s,8s,MODETAG,MODESYN,MODEBEH)
+
+#define ALLFPSRC(GEN,MODETAG,MODESYN,MODEBEH) \
+GEN(sf##2,Rs32,RsV,fFLOAT,sf,MODETAG,MODESYN,MODEBEH) \
+GEN(df##2,Rss32,RssV,fDOUBLE,df,MODETAG,MODESYN,MODEBEH)
+
+ALLINTSRC(ALLFPDST,,,)
+ALLFPSRC(ALLINTDST,,,)
+ALLFPSRC(ALLINTDST,_chop,:chop,fFPSETROUND_CHOP();)
+
diff --git a/target/hexagon/imported/ldst.idef b/target/hexagon/imported/ldst.idef
new file mode 100644
index 0000000..8ecc408
--- /dev/null
+++ b/target/hexagon/imported/ldst.idef
@@ -0,0 +1,286 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * Load and Store instruction definitions
+ */
+
+/* The set of addressing modes standard to all Load instructions */
+#define STD_LD_AMODES(TAG,OPER,DESCR,ATTRIB,SHFT,SEMANTICS,SCALE)\
+Q6INSN(L2_##TAG##_io,  OPER"(Rs32+#s11:"SHFT")",          ATTRIB,DESCR,{fIMMEXT(siV); fEA_RI(RsV,siV); SEMANTICS; })\
+Q6INSN(L4_##TAG##_ur,  OPER"(Rt32<<#u2+#U6)",             ATTRIB,DESCR,{fMUST_IMMEXT(UiV); fEA_IRs(UiV,RtV,uiV); SEMANTICS;})\
+Q6INSN(L4_##TAG##_ap,  OPER"(Re32=#U6)",                  ATTRIB,DESCR,{fMUST_IMMEXT(UiV); fEA_IMM(UiV); SEMANTICS; ReV=UiV; })\
+Q6INSN(L2_##TAG##_pr,  OPER"(Rx32++Mu2)",                 ATTRIB,DESCR,{fEA_REG(RxV); fPM_M(RxV,MuV); SEMANTICS;})\
+Q6INSN(L2_##TAG##_pi,  OPER"(Rx32++#s4:"SHFT")",          ATTRIB,DESCR,{fEA_REG(RxV); fPM_I(RxV,siV); SEMANTICS;})\
+
+/* The set of 32-bit load instructions */
+STD_LD_AMODES(loadrub,"Rd32=memub","Load Unsigned Byte",ATTRIBS(A_LOAD),"0",fLOAD(1,1,u,EA,RdV),0)
+STD_LD_AMODES(loadrb, "Rd32=memb", "Load signed Byte",ATTRIBS(A_LOAD),"0",fLOAD(1,1,s,EA,RdV),0)
+STD_LD_AMODES(loadruh,"Rd32=memuh","Load unsigned Half integer",ATTRIBS(A_LOAD),"1",fLOAD(1,2,u,EA,RdV),1)
+STD_LD_AMODES(loadrh, "Rd32=memh", "Load signed Half integer",ATTRIBS(A_LOAD),"1",fLOAD(1,2,s,EA,RdV),1)
+STD_LD_AMODES(loadri, "Rd32=memw", "Load Word",ATTRIBS(A_LOAD),"2",fLOAD(1,4,u,EA,RdV),2)
+STD_LD_AMODES(loadrd, "Rdd32=memd","Load Double integer",ATTRIBS(A_LOAD),"3",fLOAD(1,8,u,EA,RddV),3)
+
+/* The set of addressing modes standard to all Store instructions */
+#define STD_ST_AMODES(TAG,DEST,OPER,DESCR,ATTRIB,SHFT,SEMANTICS,SCALE)\
+Q6INSN(S2_##TAG##_io,  OPER"(Rs32+#s11:"SHFT")="DEST,     ATTRIB,DESCR,{fIMMEXT(siV); fEA_RI(RsV,siV); SEMANTICS; })\
+Q6INSN(S2_##TAG##_pi,  OPER"(Rx32++#s4:"SHFT")="DEST,     ATTRIB,DESCR,{fEA_REG(RxV); fPM_I(RxV,siV); SEMANTICS; })\
+Q6INSN(S4_##TAG##_ap,  OPER"(Re32=#U6)="DEST,             ATTRIB,DESCR,{fMUST_IMMEXT(UiV); fEA_IMM(UiV); SEMANTICS; ReV=UiV; })\
+Q6INSN(S2_##TAG##_pr,  OPER"(Rx32++Mu2)="DEST,            ATTRIB,DESCR,{fEA_REG(RxV); fPM_M(RxV,MuV); SEMANTICS; })\
+Q6INSN(S4_##TAG##_ur,  OPER"(Ru32<<#u2+#U6)="DEST,            ATTRIB,DESCR,{fMUST_IMMEXT(UiV); fEA_IRs(UiV,RuV,uiV); SEMANTICS;})\
+
+
+/* The set of 32-bit store instructions */
+STD_ST_AMODES(storerb, "Rt32", "memb","Store Byte",ATTRIBS(A_STORE),"0",fSTORE(1,1,EA,fGETBYTE(0,RtV)),0)
+STD_ST_AMODES(storerh, "Rt32", "memh","Store Half integer",ATTRIBS(A_STORE),"1",fSTORE(1,2,EA,fGETHALF(0,RtV)),1)
+STD_ST_AMODES(storerf, "Rt.H32", "memh","Store Upper Half integer",ATTRIBS(A_STORE),"1",fSTORE(1,2,EA,fGETHALF(1,RtV)),1)
+STD_ST_AMODES(storeri, "Rt32", "memw","Store Word",ATTRIBS(A_STORE),"2",fSTORE(1,4,EA,RtV),2)
+STD_ST_AMODES(storerd, "Rtt32","memd","Store Double integer",ATTRIBS(A_STORE),"3",fSTORE(1,8,EA,RttV),3)
+STD_ST_AMODES(storerinew, "Nt8.new", "memw","Store Word",ATTRIBS(A_STORE),"2",fSTORE(1,4,EA,fNEWREG_ST(NtN)),2)
+STD_ST_AMODES(storerbnew, "Nt8.new", "memb","Store Byte",ATTRIBS(A_STORE),"0",fSTORE(1,1,EA,fGETBYTE(0,fNEWREG_ST(NtN))),0)
+STD_ST_AMODES(storerhnew, "Nt8.new", "memh","Store Half integer",ATTRIBS(A_STORE),"1",fSTORE(1,2,EA,fGETHALF(0,fNEWREG_ST(NtN))),1)
+
+
+Q6INSN(S2_allocframe,"allocframe(Rx32,#u11:3):raw", ATTRIBS(A_STORE,A_RESTRICT_SLOT0ONLY), "Allocate stack frame",
+{ fEA_RI(RxV,-8); fSTORE(1,8,EA,fFRAME_SCRAMBLE((fCAST8_8u(fREAD_LR()) << 32) | fCAST4_4u(fREAD_FP()))); fWRITE_FP(EA); fFRAMECHECK(EA-uiV,EA); RxV = EA-uiV; })
+
+#define A_RETURN A_RESTRICT_SLOT0ONLY
+
+Q6INSN(L2_deallocframe,"Rdd32=deallocframe(Rs32):raw", ATTRIBS(A_LOAD), "Deallocate stack frame",
+{ fHIDE(size8u_t tmp;) fEA_REG(RsV);
+	fLOAD(1,8,u,EA,tmp);
+	RddV = fFRAME_UNSCRAMBLE(tmp);
+	fWRITE_SP(EA+8); })
+
+Q6INSN(L4_return,"Rdd32=dealloc_return(Rs32):raw", ATTRIBS(A_JINDIR,A_LOAD,A_RETURN), "Deallocate stack frame and return",
+{ fHIDE(size8u_t tmp;) fEA_REG(RsV);
+	fLOAD(1,8,u,EA,tmp);
+	RddV = fFRAME_UNSCRAMBLE(tmp);
+	fWRITE_SP(EA+8);
+    fJUMPR(REG_LR,fGETWORD(1,RddV),COF_TYPE_JUMPR);})
+
+#define CONDSEM(SRCREG,STALLBITS0,STALLBITS1,PREDFUNC,PREDARG,STALLSPEC,PREDCOND) \
+{ \
+	fHIDE(size8u_t tmp;) \
+	fBRANCH_SPECULATE_STALL(PREDFUNC##PREDCOND(PREDARG),,STALLSPEC,STALLBITS0,STALLBITS1); \
+	fEA_REG(SRCREG); \
+	if (PREDFUNC##PREDCOND(PREDARG)) { \
+		fLOAD(1,8,u,EA,tmp); \
+		RddV = fFRAME_UNSCRAMBLE(tmp); \
+		fWRITE_SP(EA+8); \
+		fJUMPR(REG_LR,fGETWORD(1,RddV),COF_TYPE_JUMPR); \
+	} else { \
+		LOAD_CANCEL(EA); \
+	} \
+}
+
+#define COND_RETURN_TF(TG,TG2,DOTNEW,STALLBITS0,STALLBITS1,STALLSPEC,ATTRIBS,PREDFUNC,PREDARG,T_NT) \
+	Q6INSN(TG##_t##TG2,"if (Pv4"DOTNEW") Rdd32=dealloc_return(Rs32)"T_NT":raw",ATTRIBS,"deallocate stack frame and return", \
+		CONDSEM(RsV,STALLBITS0,STALLBITS1,PREDFUNC,PREDARG,STALLSPEC,)) \
+	Q6INSN(TG##_f##TG2,"if (!Pv4"DOTNEW") Rdd32=dealloc_return(Rs32)"T_NT":raw",ATTRIBS,"deallocate stack frame and return", \
+		CONDSEM(RsV,STALLBITS0,STALLBITS1,PREDFUNC##NOT,PREDARG,STALLSPEC,))
+
+#define COND_RETURN_NEW(TG,STALLBITS0,STALLBITS1,ATTRIBS) \
+	COND_RETURN_TF(TG,new_pt,".new",12,0,SPECULATE_TAKEN,ATTRIBS,fLSBNEW,PvN,":t") \
+	COND_RETURN_TF(TG,new_pnt,".new",12,0,SPECULATE_NOT_TAKEN,ATTRIBS,fLSBNEW,PvN,":nt") \
+
+#define RETURN_ATTRIBS A_LOAD,A_RETURN
+
+COND_RETURN_TF(L4_return,,,7,0,SPECULATE_NOT_TAKEN,ATTRIBS(RETURN_ATTRIBS,A_JINDIROLD),fLSBOLD,PvV,)
+COND_RETURN_NEW(L4_return,12,0,ATTRIBS(RETURN_ATTRIBS,A_JINDIRNEW))
+
+
+
+
+Q6INSN(L2_loadw_locked,"Rd32=memw_locked(Rs32)", ATTRIBS(A_LOAD,A_RESTRICT_SLOT0ONLY), "Load word with lock",
+{ fEA_REG(RsV); fLOAD_LOCKED(1,4,u,EA,RdV) });
+
+
+Q6INSN(S2_storew_locked,"memw_locked(Rs32,Pd4)=Rt32", ATTRIBS(A_STORE,A_RESTRICT_SLOT0ONLY), "Store word with lock",
+{ fEA_REG(RsV); fSTORE_LOCKED(1,4,EA,RtV,PdV) });
+
+
+Q6INSN(L4_loadd_locked,"Rdd32=memd_locked(Rs32)", ATTRIBS(A_LOAD,A_RESTRICT_SLOT0ONLY), "Load double with lock",
+{ fEA_REG(RsV); fLOAD_LOCKED(1,8,u,EA,RddV) });
+
+Q6INSN(S4_stored_locked,"memd_locked(Rs32,Pd4)=Rtt32", ATTRIBS(A_STORE,A_RESTRICT_SLOT0ONLY), "Store double with lock",
+{ fEA_REG(RsV); fSTORE_LOCKED(1,8,EA,RttV,PdV) });
+
+
+
+
+
+/*****************************************************************/
+/*                                                               */
+/*                       Predicated LDST                         */
+/*                                                               */
+/*****************************************************************/
+
+#define STD_PLD_AMODES(TAG,OPER,DESCR,ATTRIB,SHFT,SHFTNUM,SEMANTICS)\
+Q6INSN(L4_##TAG##_rr,  OPER"(Rs32+Rt32<<#u2)",            ATTRIB,DESCR,{fEA_RRs(RsV,RtV,uiV); SEMANTICS;})\
+Q6INSN(L2_p##TAG##t_io, "if (Pt4) "OPER"(Rs32+#u6:"SHFT")",            ATTRIB,DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); if(fLSBOLD(PtV)){SEMANTICS;} else {LOAD_CANCEL(EA);}})\
+Q6INSN(L2_p##TAG##t_pi, "if (Pt4) "OPER"(Rx32++#s4:"SHFT")",           ATTRIB,DESCR,{fEA_REG(RxV); if(fLSBOLD(PtV)){ fPM_I(RxV,siV); SEMANTICS;} else {LOAD_CANCEL(EA);}})\
+Q6INSN(L2_p##TAG##f_io, "if (!Pt4) "OPER"(Rs32+#u6:"SHFT")",           ATTRIB,DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); if(fLSBOLDNOT(PtV)){ SEMANTICS; } else {LOAD_CANCEL(EA);}})\
+Q6INSN(L2_p##TAG##f_pi, "if (!Pt4) "OPER"(Rx32++#s4:"SHFT")",          ATTRIB,DESCR,{fEA_REG(RxV); if(fLSBOLDNOT(PtV)){ fPM_I(RxV,siV); SEMANTICS;} else {LOAD_CANCEL(EA);}})\
+Q6INSN(L2_p##TAG##tnew_io,"if (Pt4.new) "OPER"(Rs32+#u6:"SHFT")",ATTRIB,DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); if (fLSBNEW(PtN))  { SEMANTICS; } else {LOAD_CANCEL(EA);}})\
+Q6INSN(L2_p##TAG##fnew_io,"if (!Pt4.new) "OPER"(Rs32+#u6:"SHFT")",ATTRIB,DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); if (fLSBNEWNOT(PtN)) { SEMANTICS; } else {LOAD_CANCEL(EA);}})\
+Q6INSN(L4_p##TAG##t_rr, "if (Pv4) "OPER"(Rs32+Rt32<<#u2)",            ATTRIB,DESCR,{fEA_RRs(RsV,RtV,uiV); if(fLSBOLD(PvV)){ SEMANTICS;} else {LOAD_CANCEL(EA);}})\
+Q6INSN(L4_p##TAG##f_rr, "if (!Pv4) "OPER"(Rs32+Rt32<<#u2)",           ATTRIB,DESCR,{fEA_RRs(RsV,RtV,uiV); if(fLSBOLDNOT(PvV)){ SEMANTICS; } else {LOAD_CANCEL(EA);}})\
+Q6INSN(L4_p##TAG##tnew_rr,"if (Pv4.new) "OPER"(Rs32+Rt32<<#u2)",ATTRIB,DESCR,{fEA_RRs(RsV,RtV,uiV); if (fLSBNEW(PvN))  { SEMANTICS; } else {LOAD_CANCEL(EA);}})\
+Q6INSN(L4_p##TAG##fnew_rr,"if (!Pv4.new) "OPER"(Rs32+Rt32<<#u2)",ATTRIB,DESCR,{fEA_RRs(RsV,RtV,uiV); if (fLSBNEWNOT(PvN)) { SEMANTICS; } else {LOAD_CANCEL(EA);}})\
+Q6INSN(L2_p##TAG##tnew_pi, "if (Pt4.new) "OPER"(Rx32++#s4:"SHFT")",           ATTRIB,DESCR,{fEA_REG(RxV); if(fLSBNEW(PtN)){ fPM_I(RxV,siV); SEMANTICS;} else {LOAD_CANCEL(EA);}})\
+Q6INSN(L2_p##TAG##fnew_pi, "if (!Pt4.new) "OPER"(Rx32++#s4:"SHFT")",          ATTRIB,DESCR,{fEA_REG(RxV); if(fLSBNEWNOT(PtN)){ fPM_I(RxV,siV); SEMANTICS;} else {LOAD_CANCEL(EA);}})\
+Q6INSN(L4_p##TAG##t_abs, "if (Pt4) "OPER"(#u6)",            ATTRIB,DESCR,{fMUST_IMMEXT(uiV); fEA_IMM(uiV); if(fLSBOLD(PtV)){ SEMANTICS;} else {LOAD_CANCEL(EA);}})\
+Q6INSN(L4_p##TAG##f_abs, "if (!Pt4) "OPER"(#u6)",           ATTRIB,DESCR,{fMUST_IMMEXT(uiV); fEA_IMM(uiV); if(fLSBOLDNOT(PtV)){ SEMANTICS; } else {LOAD_CANCEL(EA);}})\
+Q6INSN(L4_p##TAG##tnew_abs,"if (Pt4.new) "OPER"(#u6)",ATTRIB,DESCR,{fMUST_IMMEXT(uiV); fEA_IMM(uiV);if (fLSBNEW(PtN))  { SEMANTICS; } else {LOAD_CANCEL(EA);}})\
+Q6INSN(L4_p##TAG##fnew_abs,"if (!Pt4.new) "OPER"(#u6)",ATTRIB,DESCR,{fMUST_IMMEXT(uiV); fEA_IMM(uiV);if (fLSBNEWNOT(PtN)) { SEMANTICS; } else {LOAD_CANCEL(EA);}})
+
+
+
+/* The set of 32-bit predicated load instructions */
+STD_PLD_AMODES(loadrub,"Rd32=memub","Load Unsigned Byte",ATTRIBS(A_ARCHV2,A_LOAD),"0",0,fLOAD(1,1,u,EA,RdV))
+STD_PLD_AMODES(loadrb, "Rd32=memb", "Load signed Byte",ATTRIBS(A_ARCHV2,A_LOAD),"0",0,fLOAD(1,1,s,EA,RdV))
+STD_PLD_AMODES(loadruh,"Rd32=memuh","Load unsigned Half integer",ATTRIBS(A_ARCHV2,A_LOAD),"1",1,fLOAD(1,2,u,EA,RdV))
+STD_PLD_AMODES(loadrh, "Rd32=memh", "Load signed Half integer",ATTRIBS(A_ARCHV2,A_LOAD),"1",1,fLOAD(1,2,s,EA,RdV))
+STD_PLD_AMODES(loadri, "Rd32=memw", "Load Word",ATTRIBS(A_ARCHV2,A_LOAD),"2",2,fLOAD(1,4,u,EA,RdV))
+STD_PLD_AMODES(loadrd, "Rdd32=memd","Load Double integer",ATTRIBS(A_ARCHV2,A_LOAD),"3",3,fLOAD(1,8,u,EA,RddV))
+
+/* The set of addressing modes standard to all predicated store instructions */
+#define STD_PST_AMODES(TAG,DEST,OPER,DESCR,ATTRIB,SHFT,SHFTNUM,SEMANTICS)\
+Q6INSN(S4_##TAG##_rr,  OPER"(Rs32+Ru32<<#u2)="DEST,            ATTRIB,DESCR,{fEA_RRs(RsV,RuV,uiV); SEMANTICS;})\
+Q6INSN(S2_p##TAG##t_io, "if (Pv4) "OPER"(Rs32+#u6:"SHFT")="DEST,     ATTRIB,DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); if (fLSBOLD(PvV)){ SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S2_p##TAG##t_pi, "if (Pv4) "OPER"(Rx32++#s4:"SHFT")="DEST,     ATTRIB,DESCR,{fEA_REG(RxV); if (fLSBOLD(PvV)){ fPM_I(RxV,siV); SEMANTICS;} else {STORE_CANCEL(EA);}})\
+Q6INSN(S2_p##TAG##f_io, "if (!Pv4) "OPER"(Rs32+#u6:"SHFT")="DEST,     ATTRIB,DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); if (fLSBOLDNOT(PvV)){ SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S2_p##TAG##f_pi, "if (!Pv4) "OPER"(Rx32++#s4:"SHFT")="DEST,     ATTRIB,DESCR,{fEA_REG(RxV); if (fLSBOLDNOT(PvV)){ fPM_I(RxV,siV); SEMANTICS;} else {STORE_CANCEL(EA);}})\
+Q6INSN(S4_p##TAG##t_rr, "if (Pv4) "OPER"(Rs32+Ru32<<#u2)="DEST,     ATTRIB,DESCR,{fEA_RRs(RsV,RuV,uiV); if (fLSBOLD(PvV)){ SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S4_p##TAG##f_rr, "if (!Pv4) "OPER"(Rs32+Ru32<<#u2)="DEST,     ATTRIB,DESCR,{fEA_RRs(RsV,RuV,uiV); if (fLSBOLDNOT(PvV)){ SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S4_p##TAG##tnew_io,"if (Pv4.new) "OPER"(Rs32+#u6:"SHFT")="DEST,ATTRIB,DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); if ( fLSBNEW(PvN)) { SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S4_p##TAG##fnew_io,"if (!Pv4.new) "OPER"(Rs32+#u6:"SHFT")="DEST,ATTRIB,DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); if (fLSBNEWNOT(PvN)) { SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S4_p##TAG##tnew_rr,"if (Pv4.new) "OPER"(Rs32+Ru32<<#u2)="DEST,ATTRIB,DESCR,{fEA_RRs(RsV,RuV,uiV); if ( fLSBNEW(PvN)) { SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S4_p##TAG##fnew_rr,"if (!Pv4.new) "OPER"(Rs32+Ru32<<#u2)="DEST,ATTRIB,DESCR,{fEA_RRs(RsV,RuV,uiV); if (fLSBNEWNOT(PvN)) { SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S2_p##TAG##tnew_pi, "if (Pv4.new) "OPER"(Rx32++#s4:"SHFT")="DEST,     ATTRIB,DESCR,{fEA_REG(RxV); if (fLSBNEW(PvN)){ fPM_I(RxV,siV); SEMANTICS;} else {STORE_CANCEL(EA);}})\
+Q6INSN(S2_p##TAG##fnew_pi, "if (!Pv4.new) "OPER"(Rx32++#s4:"SHFT")="DEST,     ATTRIB,DESCR,{fEA_REG(RxV); if (fLSBNEWNOT(PvN)){ fPM_I(RxV,siV); SEMANTICS;} else {STORE_CANCEL(EA);}})\
+Q6INSN(S4_p##TAG##t_abs, "if (Pv4) "OPER"(#u6)="DEST,     ATTRIB,DESCR,{fMUST_IMMEXT(uiV); fEA_IMM(uiV); if (fLSBOLD(PvV)){ SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S4_p##TAG##f_abs, "if (!Pv4) "OPER"(#u6)="DEST,     ATTRIB,DESCR,{fMUST_IMMEXT(uiV);fEA_IMM(uiV); if (fLSBOLDNOT(PvV)){ SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S4_p##TAG##tnew_abs,"if (Pv4.new) "OPER"(#u6)="DEST,ATTRIB,DESCR,{fMUST_IMMEXT(uiV);fEA_IMM(uiV); if ( fLSBNEW(PvN)) { SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S4_p##TAG##fnew_abs,"if (!Pv4.new) "OPER"(#u6)="DEST,ATTRIB,DESCR,{fMUST_IMMEXT(uiV);fEA_IMM(uiV); if (fLSBNEWNOT(PvN)) { SEMANTICS; } else {STORE_CANCEL(EA);}})
+
+
+
+
+/* The set of 32-bit predicated store instructions */
+STD_PST_AMODES(storerb,"Rt32","memb","Store Byte",ATTRIBS(A_ARCHV2,A_STORE),"0",0,fSTORE(1,1,EA,fGETBYTE(0,RtV)))
+STD_PST_AMODES(storerh,"Rt32","memh","Store Half integer",ATTRIBS(A_ARCHV2,A_STORE),"1",1,fSTORE(1,2,EA,fGETHALF(0,RtV)))
+STD_PST_AMODES(storerf,"Rt.H32","memh","Store Upper Half integer",ATTRIBS(A_ARCHV2,A_STORE),"1",1,fSTORE(1,2,EA,fGETHALF(1,RtV)))
+STD_PST_AMODES(storeri,"Rt32","memw","Store Word",ATTRIBS(A_ARCHV2,A_STORE),"2",2,fSTORE(1,4,EA,RtV))
+STD_PST_AMODES(storerd,"Rtt32","memd","Store Double integer",ATTRIBS(A_ARCHV2,A_STORE),"3",3,fSTORE(1,8,EA,RttV))
+STD_PST_AMODES(storerinew,"Nt8.new","memw","Store Word",ATTRIBS(A_ARCHV2,A_STORE),"2",2,fSTORE(1,4,EA,fNEWREG_ST(NtN)))
+STD_PST_AMODES(storerbnew,"Nt8.new","memb","Store Byte",ATTRIBS(A_ARCHV2,A_STORE),"0",0,fSTORE(1,1,EA,fGETBYTE(0,fNEWREG_ST(NtN))))
+STD_PST_AMODES(storerhnew,"Nt8.new","memh","Store Half integer",ATTRIBS(A_ARCHV2,A_STORE),"1",1,fSTORE(1,2,EA,fGETHALF(0,fNEWREG_ST(NtN))))
+
+
+
+
+/*****************************************************************/
+/*                                                               */
+/*                       Mem-Ops (Load-op-Store)                 */
+/*                                                               */
+/*****************************************************************/
+
+/* The set of 32-bit non-predicated mem-ops */
+#define STD_MEMOP_AMODES(TAG,OPER,DESCR,SEMANTICS)\
+Q6INSN(L4_##TAG##w_io,  "memw(Rs32+#u6:2)"OPER,     ATTRIBS(A_RESTRICT_SLOT0ONLY),DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); fHIDE(size4s_t tmp;) fLOAD(1,4,s,EA,tmp); SEMANTICS;  fSTORE(1,4,EA,tmp); })\
+Q6INSN(L4_##TAG##b_io,  "memb(Rs32+#u6:0)"OPER,     ATTRIBS(A_RESTRICT_SLOT0ONLY),DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); fHIDE(size4s_t tmp;) fLOAD(1,1,s,EA,tmp); SEMANTICS;  fSTORE(1,1,EA,tmp); })\
+Q6INSN(L4_##TAG##h_io,  "memh(Rs32+#u6:1)"OPER,     ATTRIBS(A_RESTRICT_SLOT0ONLY),DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); fHIDE(size4s_t tmp;) fLOAD(1,2,s,EA,tmp); SEMANTICS;  fSTORE(1,2,EA,tmp); })
+
+
+
+STD_MEMOP_AMODES(add_memop, "+=Rt32", "Add Register to Memory Word", tmp += RtV)
+STD_MEMOP_AMODES(sub_memop, "-=Rt32", "Sub Register from Memory Word", tmp -= RtV)
+STD_MEMOP_AMODES(and_memop, "&=Rt32", "Logical AND Register to Memory Word", tmp &= RtV)
+STD_MEMOP_AMODES(or_memop, "|=Rt32", "Logical OR Register to Memory Word", tmp |= RtV)
+
+
+STD_MEMOP_AMODES(iadd_memop, "+=#U5", "Add Immediate to Memory Word", tmp += UiV)
+STD_MEMOP_AMODES(isub_memop, "-=#U5", "Sub Immediate from Memory Word", tmp -= UiV)
+STD_MEMOP_AMODES(iand_memop, "=clrbit(#U5)", "Clear a bit in memory", tmp &= (~(1<<UiV)))
+STD_MEMOP_AMODES(ior_memop,  "=setbit(#U5)", "Set a bit in memory", tmp |= (1<<UiV))
+
+
+/*****************************************************************/
+/*                                                               */
+/*                  V4 store immediates                          */
+/*                                                               */
+/*****************************************************************/
+/* Predicated Store immediates */
+#define V4_PSTI_AMODES(TAG,DEST,OPER,DESCR,ATTRIB,SHFT,SEMANTICS)\
+Q6INSN(S4_##TAG##t_io,"if (Pv4) "OPER"(Rs32+#u6:"SHFT")="DEST,ATTRIB,DESCR,{fEA_RI(RsV,uiV); if (fLSBOLD(PvV)){ SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S4_##TAG##f_io,"if (!Pv4) "OPER"(Rs32+#u6:"SHFT")="DEST,ATTRIB,DESCR,{fEA_RI(RsV,uiV); if (fLSBOLDNOT(PvV)){ SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S4_##TAG##tnew_io,"if (Pv4.new) "OPER"(Rs32+#u6:"SHFT")="DEST,ATTRIB,DESCR,{fEA_RI(RsV,uiV); if (fLSBNEW(PvN)){ SEMANTICS; } else {STORE_CANCEL(EA);}})\
+Q6INSN(S4_##TAG##fnew_io,"if (!Pv4.new) "OPER"(Rs32+#u6:"SHFT")="DEST,ATTRIB,DESCR,{fEA_RI(RsV,uiV); if (fLSBNEWNOT(PvN)){ SEMANTICS; } else {STORE_CANCEL(EA);}})
+
+/* The set of 32-bit store immediate instructions */
+V4_PSTI_AMODES(storeirb,"#S6","memb","Store Immediate Byte",ATTRIBS(A_ARCHV2,A_STORE),"0",fIMMEXT(SiV); fSTORE(1,1,EA,SiV))
+V4_PSTI_AMODES(storeirh,"#S6","memh","Store Immediate Half integer",ATTRIBS(A_ARCHV2,A_STORE),"1",fIMMEXT(SiV); fSTORE(1,2,EA,SiV))
+V4_PSTI_AMODES(storeiri,"#S6","memw","Store Immediate Word",ATTRIBS(A_ARCHV2,A_STORE),"2",fIMMEXT(SiV); fSTORE(1,4,EA,SiV))
+
+
+/* Non-predicated store immediates */
+#define V4_STI_AMODES(TAG,DEST,OPER,DESCR,ATTRIB,SHFT,SEMANTICS)\
+Q6INSN(S4_##TAG##_io,  OPER"(Rs32+#u6:"SHFT")="DEST,  ATTRIB,DESCR,{fEA_RI(RsV,uiV); SEMANTICS; })
+
+/* The set of 32-bit store immediate instructions */
+V4_STI_AMODES(storeirb,"#S8","memb","Store Immediate Byte",ATTRIBS(A_ARCHV2,A_STORE),"0",fIMMEXT(SiV); fSTORE(1,1,EA,SiV))
+V4_STI_AMODES(storeirh,"#S8","memh","Store Immediate Half integer",ATTRIBS(A_ARCHV2,A_STORE),"1",fIMMEXT(SiV); fSTORE(1,2,EA,SiV))
+V4_STI_AMODES(storeiri,"#S8","memw","Store Immediate Word",ATTRIBS(A_ARCHV2,A_STORE),"2",fIMMEXT(SiV); fSTORE(1,4,EA,SiV))
+
+
+
+
+
+
+
+/*****************************************************************/
+/*                                                               */
+/*                  V2 GP-relative LD/ST                         */
+/*                                                               */
+/*****************************************************************/
+
+#define STD_GPLD_AMODES(TAG,OPER,DESCR,ATTRIB,SHFT,SEMANTICS)\
+Q6INSN(L2_##TAG##gp, OPER"(gp+#u16:"SHFT")",   ATTRIB,DESCR,{fIMMEXT(uiV); fEA_GPI(uiV); SEMANTICS; })
+
+/* The set of 32-bit load instructions */
+STD_GPLD_AMODES(loadrub,"Rd32=memub","Load Unsigned Byte",ATTRIBS(A_LOAD,A_ARCHV2),"0",fLOAD(1,1,u,EA,RdV))
+STD_GPLD_AMODES(loadrb, "Rd32=memb", "Load Signed Byte",ATTRIBS(A_LOAD,A_ARCHV2),"0",fLOAD(1,1,s,EA,RdV))
+STD_GPLD_AMODES(loadruh,"Rd32=memuh","Load Unsigned Half integer",ATTRIBS(A_LOAD,A_ARCHV2),"1",fLOAD(1,2,u,EA,RdV))
+STD_GPLD_AMODES(loadrh, "Rd32=memh", "Load Signed Half integer",ATTRIBS(A_LOAD,A_ARCHV2),"1",fLOAD(1,2,s,EA,RdV))
+STD_GPLD_AMODES(loadri, "Rd32=memw", "Load Word",ATTRIBS(A_LOAD,A_ARCHV2),"2",fLOAD(1,4,u,EA,RdV))
+STD_GPLD_AMODES(loadrd, "Rdd32=memd","Load Double integer",ATTRIBS(A_LOAD,A_ARCHV2),"3",fLOAD(1,8,u,EA,RddV))
+
+
+#define STD_GPST_AMODES(TAG,DEST,OPER,DESCR,ATTRIB,SHFT,SEMANTICS)\
+Q6INSN(S2_##TAG##gp, OPER"(gp+#u16:"SHFT")="DEST, ATTRIB,DESCR,{fIMMEXT(uiV); fEA_GPI(uiV); SEMANTICS; })
+
+/* The set of 32-bit store instructions */
+STD_GPST_AMODES(storerb, "Rt32", "memb","Store Byte",ATTRIBS(A_STORE,A_ARCHV2),"0",fSTORE(1,1,EA,fGETBYTE(0,RtV)))
+STD_GPST_AMODES(storerh, "Rt32", "memh","Store Half integer",ATTRIBS(A_STORE,A_ARCHV2),"1",fSTORE(1,2,EA,fGETHALF(0,RtV)))
+STD_GPST_AMODES(storerf, "Rt.H32", "memh","Store Upper Half integer",ATTRIBS(A_STORE,A_ARCHV2),"1",fSTORE(1,2,EA,fGETHALF(1,RtV)))
+STD_GPST_AMODES(storeri, "Rt32", "memw","Store Word",ATTRIBS(A_STORE,A_ARCHV2),"2",fSTORE(1,4,EA,RtV))
+STD_GPST_AMODES(storerd, "Rtt32","memd","Store Double integer",ATTRIBS(A_STORE,A_ARCHV2),"3",fSTORE(1,8,EA,RttV))
+STD_GPST_AMODES(storerinew, "Nt8.new", "memw","Store Word",ATTRIBS(A_STORE,A_ARCHV2),"2",fSTORE(1,4,EA,fNEWREG_ST(NtN)))
+STD_GPST_AMODES(storerbnew, "Nt8.new", "memb","Store Byte",ATTRIBS(A_STORE,A_ARCHV2),"0",fSTORE(1,1,EA,fGETBYTE(0,fNEWREG_ST(NtN))))
+STD_GPST_AMODES(storerhnew, "Nt8.new", "memh","Store Half integer",ATTRIBS(A_STORE,A_ARCHV2),"1",fSTORE(1,2,EA,fGETHALF(0,fNEWREG_ST(NtN))))
diff --git a/target/hexagon/imported/mpy.idef b/target/hexagon/imported/mpy.idef
new file mode 100644
index 0000000..0cc62aa
--- /dev/null
+++ b/target/hexagon/imported/mpy.idef
@@ -0,0 +1,1212 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * Multiply Instructions
+ */
+
+
+#define STD_SP_MODES(TAG,OPER,ATR,DST,ACCSEM,SEM,OSEM,SATSEM,RNDSEM)\
+Q6INSN(M2_##TAG##_hh_s0, OPER"(Rs.H32,Rt.H32)"OSEM,        ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM(         fGETHALF(1,RsV),fGETHALF(1,RtV))));})\
+Q6INSN(M2_##TAG##_hh_s1, OPER"(Rs.H32,Rt.H32):<<1"OSEM,    ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETHALF(1,RsV),fGETHALF(1,RtV)))));})\
+Q6INSN(M2_##TAG##_hl_s0, OPER"(Rs.H32,Rt.L32)"OSEM,        ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM(         fGETHALF(1,RsV),fGETHALF(0,RtV))));})\
+Q6INSN(M2_##TAG##_hl_s1, OPER"(Rs.H32,Rt.L32):<<1"OSEM,    ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETHALF(1,RsV),fGETHALF(0,RtV)))));})\
+Q6INSN(M2_##TAG##_lh_s0, OPER"(Rs.L32,Rt.H32)"OSEM,        ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM(         fGETHALF(0,RsV),fGETHALF(1,RtV))));})\
+Q6INSN(M2_##TAG##_lh_s1, OPER"(Rs.L32,Rt.H32):<<1"OSEM,    ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETHALF(0,RsV),fGETHALF(1,RtV)))));})\
+Q6INSN(M2_##TAG##_ll_s0, OPER"(Rs.L32,Rt.L32)"OSEM,        ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM(         fGETHALF(0,RsV),fGETHALF(0,RtV))));})\
+Q6INSN(M2_##TAG##_ll_s1, OPER"(Rs.L32,Rt.L32):<<1"OSEM,    ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETHALF(0,RsV),fGETHALF(0,RtV)))));})
+
+/*****************************************************/
+/* multiply 16x16->32 signed instructions            */
+/*****************************************************/
+STD_SP_MODES(mpy_acc,    "Rx32+=mpy", ,RxV,RxV+    ,fMPY16SS,          ,fPASS,fPASS)
+STD_SP_MODES(mpy_nac,    "Rx32-=mpy", ,RxV,RxV-    ,fMPY16SS,          ,fPASS,fPASS)
+STD_SP_MODES(mpy_acc_sat,"Rx32+=mpy", ,RxV,RxV+    ,fMPY16SS,":sat"    ,fSAT, fPASS)
+STD_SP_MODES(mpy_nac_sat,"Rx32-=mpy", ,RxV,RxV-    ,fMPY16SS,":sat"    ,fSAT, fPASS)
+STD_SP_MODES(mpy,        "Rd32=mpy",  ,RdV,        ,fMPY16SS,          ,fPASS,fPASS)
+STD_SP_MODES(mpy_sat,    "Rd32=mpy",  ,RdV,        ,fMPY16SS,":sat"    ,fSAT, fPASS)
+STD_SP_MODES(mpy_rnd,    "Rd32=mpy",  ,RdV,        ,fMPY16SS,":rnd"    ,fPASS,fROUND)
+STD_SP_MODES(mpy_sat_rnd,"Rd32=mpy",  ,RdV,        ,fMPY16SS,":rnd:sat",fSAT, fROUND)
+STD_SP_MODES(mpyd_acc,   "Rxx32+=mpy",,RxxV,RxxV+  ,fMPY16SS,          ,fPASS,fPASS)
+STD_SP_MODES(mpyd_nac,   "Rxx32-=mpy",,RxxV,RxxV-  ,fMPY16SS,          ,fPASS,fPASS)
+STD_SP_MODES(mpyd,       "Rdd32=mpy", ,RddV,       ,fMPY16SS,          ,fPASS,fPASS)
+STD_SP_MODES(mpyd_rnd,   "Rdd32=mpy", ,RddV,       ,fMPY16SS,":rnd"    ,fPASS,fROUND)
+
+
+/*****************************************************/
+/* multiply 16x16->32 unsigned instructions          */
+/*****************************************************/
+#define STD_USP_MODES(TAG,OPER,ATR,DST,ACCSEM,SEM,OSEM,SATSEM,RNDSEM)\
+Q6INSN(M2_##TAG##_hh_s0, OPER"(Rs.H32,Rt.H32)"OSEM,        ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM(         fGETUHALF(1,RsV),fGETUHALF(1,RtV))));})\
+Q6INSN(M2_##TAG##_hh_s1, OPER"(Rs.H32,Rt.H32):<<1"OSEM,    ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETUHALF(1,RsV),fGETUHALF(1,RtV)))));})\
+Q6INSN(M2_##TAG##_hl_s0, OPER"(Rs.H32,Rt.L32)"OSEM,        ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM(         fGETUHALF(1,RsV),fGETUHALF(0,RtV))));})\
+Q6INSN(M2_##TAG##_hl_s1, OPER"(Rs.H32,Rt.L32):<<1"OSEM,    ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETUHALF(1,RsV),fGETUHALF(0,RtV)))));})\
+Q6INSN(M2_##TAG##_lh_s0, OPER"(Rs.L32,Rt.H32)"OSEM,        ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM(         fGETUHALF(0,RsV),fGETUHALF(1,RtV))));})\
+Q6INSN(M2_##TAG##_lh_s1, OPER"(Rs.L32,Rt.H32):<<1"OSEM,    ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETUHALF(0,RsV),fGETUHALF(1,RtV)))));})\
+Q6INSN(M2_##TAG##_ll_s0, OPER"(Rs.L32,Rt.L32)"OSEM,        ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM(         fGETUHALF(0,RsV),fGETUHALF(0,RtV))));})\
+Q6INSN(M2_##TAG##_ll_s1, OPER"(Rs.L32,Rt.L32):<<1"OSEM,    ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETUHALF(0,RsV),fGETUHALF(0,RtV)))));})
+
+STD_USP_MODES(mpyu_acc,    "Rx32+=mpyu", ,RxV,RxV+  ,fMPY16UU,          ,fPASS,fPASS)
+STD_USP_MODES(mpyu_nac,    "Rx32-=mpyu", ,RxV,RxV-  ,fMPY16UU,          ,fPASS,fPASS)
+STD_USP_MODES(mpyu,        "Rd32=mpyu",  ATTRIBS() ,RdV,  ,fMPY16UU, ,fPASS,fPASS)
+STD_USP_MODES(mpyud_acc,   "Rxx32+=mpyu",,RxxV,RxxV+,fMPY16UU,          ,fPASS,fPASS)
+STD_USP_MODES(mpyud_nac,   "Rxx32-=mpyu",,RxxV,RxxV-,fMPY16UU,          ,fPASS,fPASS)
+STD_USP_MODES(mpyud,       "Rdd32=mpyu", ATTRIBS() ,RddV, ,fMPY16UU, ,fPASS,fPASS)
+
+/**********************************************/
+/* mpyi 32x#u8->32                            */
+/**********************************************/
+
+Q6INSN(M2_mpysip,"Rd32=+mpyi(Rs32,#u8)",ATTRIBS(A_ARCHV2),
+"32-bit Multiply by unsigned immediate",
+{ fIMMEXT(uiV); RdV=RsV*uiV; })
+
+Q6INSN(M2_mpysin,"Rd32=-mpyi(Rs32,#u8)",ATTRIBS(A_ARCHV2),
+"32-bit Multiply by unsigned immediate, negate result",
+{ RdV=RsV*-uiV; })
+
+Q6INSN(M2_macsip,"Rx32+=mpyi(Rs32,#u8)",ATTRIBS(A_ARCHV2),
+"32-bit Multiply-Add by unsigned immediate",
+{ fIMMEXT(uiV); RxV=RxV + (RsV*uiV);})
+
+Q6INSN(M2_macsin,"Rx32-=mpyi(Rs32,#u8)",ATTRIBS(A_ARCHV2),
+"32-bit Multiply-Subtract by unsigned immediate",
+{ fIMMEXT(uiV); RxV=RxV - (RsV*uiV);})
+
+
+/**********************************************/
+/* multiply/mac  32x32->64 instructions       */
+/**********************************************/
+Q6INSN(M2_dpmpyss_s0,    "Rdd32=mpy(Rs32,Rt32)", ATTRIBS(),"Multiply 32x32",{RddV=fMPY32SS(RsV,RtV);})
+Q6INSN(M2_dpmpyss_acc_s0,"Rxx32+=mpy(Rs32,Rt32)",ATTRIBS(),"Multiply 32x32",{RxxV= RxxV + fMPY32SS(RsV,RtV);})
+Q6INSN(M2_dpmpyss_nac_s0,"Rxx32-=mpy(Rs32,Rt32)",ATTRIBS(),"Multiply 32x32",{RxxV= RxxV - fMPY32SS(RsV,RtV);})
+
+Q6INSN(M2_dpmpyuu_s0,    "Rdd32=mpyu(Rs32,Rt32)", ATTRIBS(),"Multiply 32x32",{RddV=fMPY32UU(fCAST4u(RsV),fCAST4u(RtV));})
+Q6INSN(M2_dpmpyuu_acc_s0,"Rxx32+=mpyu(Rs32,Rt32)",ATTRIBS(),"Multiply 32x32",{RxxV= RxxV + fMPY32UU(fCAST4u(RsV),fCAST4u(RtV));})
+Q6INSN(M2_dpmpyuu_nac_s0,"Rxx32-=mpyu(Rs32,Rt32)",ATTRIBS(),"Multiply 32x32",{RxxV= RxxV - fMPY32UU(fCAST4u(RsV),fCAST4u(RtV));})
+
+
+/******************************************************/
+/* multiply/mac  32x32->32 (upper) instructions       */
+/******************************************************/
+Q6INSN(M2_mpy_up,        "Rd32=mpy(Rs32,Rt32)", ATTRIBS(),"Multiply 32x32",{RdV=fMPY32SS(RsV,RtV)>>32;})
+Q6INSN(M2_mpy_up_s1,     "Rd32=mpy(Rs32,Rt32):<<1", ATTRIBS(),"Multiply 32x32",{RdV=fMPY32SS(RsV,RtV)>>31;})
+Q6INSN(M2_mpy_up_s1_sat, "Rd32=mpy(Rs32,Rt32):<<1:sat", ATTRIBS(),"Multiply 32x32",{RdV=fSAT(fMPY32SS(RsV,RtV)>>31);})
+Q6INSN(M2_mpyu_up,       "Rd32=mpyu(Rs32,Rt32)", ATTRIBS(),"Multiply 32x32",{RdV=fMPY32UU(fCAST4u(RsV),fCAST4u(RtV))>>32;})
+Q6INSN(M2_mpysu_up,      "Rd32=mpysu(Rs32,Rt32)", ATTRIBS(),"Multiply 32x32",{RdV=fMPY32SU(RsV,fCAST4u(RtV))>>32;})
+Q6INSN(M2_dpmpyss_rnd_s0,"Rd32=mpy(Rs32,Rt32):rnd", ATTRIBS(),"Multiply 32x32",{RdV=(fMPY32SS(RsV,RtV)+fCONSTLL(0x80000000))>>32;})
+
+Q6INSN(M4_mac_up_s1_sat, "Rx32+=mpy(Rs32,Rt32):<<1:sat", ATTRIBS(),"Multiply 32x32",{RxV=fSAT(  (fSE32_64(RxV)) + (fMPY32SS(RsV,RtV)>>31));})
+Q6INSN(M4_nac_up_s1_sat, "Rx32-=mpy(Rs32,Rt32):<<1:sat", ATTRIBS(),"Multiply 32x32",{RxV=fSAT(  (fSE32_64(RxV)) - (fMPY32SS(RsV,RtV)>>31));})
+
+
+/**********************************************/
+/* 32x32->32 multiply (lower)                 */
+/**********************************************/
+
+Q6INSN(M2_mpyi,"Rd32=mpyi(Rs32,Rt32)",ATTRIBS(),
+"Multiply Integer",
+{ RdV=RsV*RtV;})
+
+Q6INSN(M2_maci,"Rx32+=mpyi(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"Multiply-Accumulate Integer",
+{ RxV=RxV + RsV*RtV;})
+
+Q6INSN(M2_mnaci,"Rx32-=mpyi(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"Multiply-Neg-Accumulate Integer",
+{ RxV=RxV - RsV*RtV;})
+
+/****** WHY ARE THESE IN MPY.IDEF? **********/
+
+Q6INSN(M2_acci,"Rx32+=add(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"Add with accumulate",
+{ RxV=RxV + RsV + RtV;})
+
+Q6INSN(M2_accii,"Rx32+=add(Rs32,#s8)",ATTRIBS(A_ARCHV2),
+"Add with accumulate",
+{ fIMMEXT(siV); RxV=RxV + RsV + siV;})
+
+Q6INSN(M2_nacci,"Rx32-=add(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
+"Add with neg accumulate",
+{ RxV=RxV - (RsV + RtV);})
+
+Q6INSN(M2_naccii,"Rx32-=add(Rs32,#s8)",ATTRIBS(A_ARCHV2),
+"Add with neg accumulate",
+{ fIMMEXT(siV); RxV=RxV - (RsV + siV);})
+
+Q6INSN(M2_subacc,"Rx32+=sub(Rt32,Rs32)",ATTRIBS(A_ARCHV2),
+"Sub with accumulate",
+{ RxV=RxV + RtV - RsV;})
+
+
+
+
+Q6INSN(M4_mpyrr_addr,"Ry32=add(Ru32,mpyi(Ry32,Rs32))",ATTRIBS(),
+"Mpy by reg and add reg",
+{ RyV = RuV + RsV*RyV;})
+
+Q6INSN(M4_mpyri_addr_u2,"Rd32=add(Ru32,mpyi(#u6:2,Rs32))",ATTRIBS(),
+"Mpy by immed and add reg",
+{ RdV = RuV + RsV*uiV;})
+
+Q6INSN(M4_mpyri_addr,"Rd32=add(Ru32,mpyi(Rs32,#u6))",ATTRIBS(),
+"Mpy by immed and add reg",
+{ fIMMEXT(uiV); RdV = RuV + RsV*uiV;})
+
+
+
+Q6INSN(M4_mpyri_addi,"Rd32=add(#u6,mpyi(Rs32,#U6))",ATTRIBS(),
+"Mpy by immed and add immed",
+{ fIMMEXT(uiV); RdV = uiV + RsV*UiV;})
+
+
+
+Q6INSN(M4_mpyrr_addi,"Rd32=add(#u6,mpyi(Rs32,Rt32))",ATTRIBS(),
+"Mpy by reg and add immed",
+{ fIMMEXT(uiV); RdV = uiV + RsV*RtV;})
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+/**********************************************/
+/* vector mac  2x[16x16 -> 32]                */
+/**********************************************/
+
+#undef vmac_sema
+#define vmac_sema(N)\
+{ fSETWORD(0,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV)))));\
+  fSETWORD(1,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV)))));\
+}
+Q6INSN(M2_vmpy2s_s0,"Rdd32=vmpyh(Rs32,Rt32):sat",ATTRIBS(),"Vector Multiply",vmac_sema(0))
+Q6INSN(M2_vmpy2s_s1,"Rdd32=vmpyh(Rs32,Rt32):<<1:sat",ATTRIBS(),"Vector Multiply",vmac_sema(1))
+
+
+#undef vmac_sema
+#define vmac_sema(N)\
+{ fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV)))));\
+  fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV)))));\
+}
+Q6INSN(M2_vmac2s_s0,"Rxx32+=vmpyh(Rs32,Rt32):sat",ATTRIBS(),"Vector Multiply",vmac_sema(0))
+Q6INSN(M2_vmac2s_s1,"Rxx32+=vmpyh(Rs32,Rt32):<<1:sat",ATTRIBS(),"Vector Multiply",vmac_sema(1))
+
+#undef vmac_sema
+#define vmac_sema(N)\
+{ fSETWORD(0,RddV,fSAT(fSCALE(N,fMPY16SU(fGETHALF(0,RsV),fGETUHALF(0,RtV)))));\
+  fSETWORD(1,RddV,fSAT(fSCALE(N,fMPY16SU(fGETHALF(1,RsV),fGETUHALF(1,RtV)))));\
+}
+Q6INSN(M2_vmpy2su_s0,"Rdd32=vmpyhsu(Rs32,Rt32):sat",ATTRIBS(),"Vector Multiply",vmac_sema(0))
+Q6INSN(M2_vmpy2su_s1,"Rdd32=vmpyhsu(Rs32,Rt32):<<1:sat",ATTRIBS(),"Vector Multiply",vmac_sema(1))
+
+
+#undef vmac_sema
+#define vmac_sema(N)\
+{ fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + fSCALE(N,fMPY16SU(fGETHALF(0,RsV),fGETUHALF(0,RtV)))));\
+  fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + fSCALE(N,fMPY16SU(fGETHALF(1,RsV),fGETUHALF(1,RtV)))));\
+}
+Q6INSN(M2_vmac2su_s0,"Rxx32+=vmpyhsu(Rs32,Rt32):sat",ATTRIBS(),"Vector Multiply",vmac_sema(0))
+Q6INSN(M2_vmac2su_s1,"Rxx32+=vmpyhsu(Rs32,Rt32):<<1:sat",ATTRIBS(),"Vector Multiply",vmac_sema(1))
+
+
+
+#undef vmac_sema
+#define vmac_sema(N)\
+{ fSETHALF(1,RdV,fGETHALF(1,(fSAT(fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV))) + 0x8000))));\
+  fSETHALF(0,RdV,fGETHALF(1,(fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV))) + 0x8000))));\
+}
+Q6INSN(M2_vmpy2s_s0pack,"Rd32=vmpyh(Rs32,Rt32):rnd:sat",ATTRIBS(A_ARCHV2),"Vector Multiply",vmac_sema(0))
+Q6INSN(M2_vmpy2s_s1pack,"Rd32=vmpyh(Rs32,Rt32):<<1:rnd:sat",ATTRIBS(A_ARCHV2),"Vector Multiply",vmac_sema(1))
+
+
+#undef vmac_sema
+#define vmac_sema(N)\
+{ fSETWORD(0,RxxV,fGETWORD(0,RxxV) + fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV)));\
+  fSETWORD(1,RxxV,fGETWORD(1,RxxV) + fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV)));\
+}
+Q6INSN(M2_vmac2,"Rxx32+=vmpyh(Rs32,Rt32)",ATTRIBS(A_ARCHV2),"Vector Multiply",vmac_sema(0))
+
+#undef vmac_sema
+#define vmac_sema(N)\
+{ fSETWORD(0,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)))));\
+  fSETWORD(1,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)))));\
+}
+Q6INSN(M2_vmpy2es_s0,"Rdd32=vmpyeh(Rss32,Rtt32):sat",ATTRIBS(),"Vector Multiply",vmac_sema(0))
+Q6INSN(M2_vmpy2es_s1,"Rdd32=vmpyeh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Vector Multiply",vmac_sema(1))
+
+#undef vmac_sema
+#define vmac_sema(N)\
+{ fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)))));\
+  fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)))));\
+}
+Q6INSN(M2_vmac2es_s0,"Rxx32+=vmpyeh(Rss32,Rtt32):sat",ATTRIBS(),"Vector Multiply",vmac_sema(0))
+Q6INSN(M2_vmac2es_s1,"Rxx32+=vmpyeh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Vector Multiply",vmac_sema(1))
+
+#undef vmac_sema
+#define vmac_sema(N)\
+{ fSETWORD(0,RxxV,fGETWORD(0,RxxV) + fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)));\
+  fSETWORD(1,RxxV,fGETWORD(1,RxxV) + fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)));\
+}
+Q6INSN(M2_vmac2es,"Rxx32+=vmpyeh(Rss32,Rtt32)",ATTRIBS(A_ARCHV2),"Vector Multiply",vmac_sema(0))
+
+
+
+
+/********************************************************/
+/* vrmpyh, aka Big Mac, aka Mac Daddy, aka Mac-ac-ac-ac */
+/* vector mac  4x[16x16] + 64 ->64                      */
+/********************************************************/
+
+
+#undef vmac_sema
+#define vmac_sema(N)\
+{ RxxV = RxxV + fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV))\
+              + fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV))\
+              + fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV))\
+              + fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV));\
+}
+Q6INSN(M2_vrmac_s0,"Rxx32+=vrmpyh(Rss32,Rtt32)",ATTRIBS(),"Vector Multiply",vmac_sema(0))
+
+#undef vmac_sema
+#define vmac_sema(N)\
+{ RddV = fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV))\
+       + fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV))\
+       + fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV))\
+       + fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV));\
+}
+Q6INSN(M2_vrmpy_s0,"Rdd32=vrmpyh(Rss32,Rtt32)",ATTRIBS(),"Vector Multiply",vmac_sema(0))
+
+
+
+/******************************************************/
+/* vector dual macs. just like complex                */
+/******************************************************/
+
+
+/* With round&pack */
+#undef dmpy_sema
+#define dmpy_sema(N)\
+{ fSETHALF(0,RdV,fGETHALF(1,(fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV))) + \
+                                  fSCALE(N,fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV))) + 0x8000))));\
+  fSETHALF(1,RdV,fGETHALF(1,(fSAT(fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV))) + \
+                                  fSCALE(N,fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV))) + 0x8000))));\
+}
+Q6INSN(M2_vdmpyrs_s0,"Rd32=vdmpy(Rss32,Rtt32):rnd:sat",ATTRIBS(),    "vector dual mac w/ round&pack",dmpy_sema(0))
+Q6INSN(M2_vdmpyrs_s1,"Rd32=vdmpy(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"vector dual mac w/ round&pack",dmpy_sema(1))
+
+
+
+
+
+/******************************************************/
+/* vector byte multiplies                             */
+/******************************************************/
+
+
+Q6INSN(M5_vrmpybuu,"Rdd32=vrmpybu(Rss32,Rtt32)",ATTRIBS(),
+ "vector dual mpy bytes",
+{
+  fSETWORD(0,RddV,(fMPY16SS(fGETUBYTE(0,RssV),fGETUBYTE(0,RttV)) +
+		   fMPY16SS(fGETUBYTE(1,RssV),fGETUBYTE(1,RttV)) +
+		   fMPY16SS(fGETUBYTE(2,RssV),fGETUBYTE(2,RttV)) +
+		   fMPY16SS(fGETUBYTE(3,RssV),fGETUBYTE(3,RttV))));
+  fSETWORD(1,RddV,(fMPY16SS(fGETUBYTE(4,RssV),fGETUBYTE(4,RttV)) +
+		   fMPY16SS(fGETUBYTE(5,RssV),fGETUBYTE(5,RttV)) +
+		   fMPY16SS(fGETUBYTE(6,RssV),fGETUBYTE(6,RttV)) +
+		   fMPY16SS(fGETUBYTE(7,RssV),fGETUBYTE(7,RttV))));
+ })
+
+Q6INSN(M5_vrmacbuu,"Rxx32+=vrmpybu(Rss32,Rtt32)",ATTRIBS(),
+ "vector dual mac bytes",
+{
+  fSETWORD(0,RxxV,(fGETWORD(0,RxxV) +
+	           fMPY16SS(fGETUBYTE(0,RssV),fGETUBYTE(0,RttV)) +
+		   fMPY16SS(fGETUBYTE(1,RssV),fGETUBYTE(1,RttV)) +
+		   fMPY16SS(fGETUBYTE(2,RssV),fGETUBYTE(2,RttV)) +
+		   fMPY16SS(fGETUBYTE(3,RssV),fGETUBYTE(3,RttV))));
+  fSETWORD(1,RxxV,(fGETWORD(1,RxxV) +
+		   fMPY16SS(fGETUBYTE(4,RssV),fGETUBYTE(4,RttV)) +
+		   fMPY16SS(fGETUBYTE(5,RssV),fGETUBYTE(5,RttV)) +
+		   fMPY16SS(fGETUBYTE(6,RssV),fGETUBYTE(6,RttV)) +
+		   fMPY16SS(fGETUBYTE(7,RssV),fGETUBYTE(7,RttV))));
+ })
+
+
+Q6INSN(M5_vrmpybsu,"Rdd32=vrmpybsu(Rss32,Rtt32)",ATTRIBS(),
+ "vector dual mpy bytes",
+{
+  fSETWORD(0,RddV,(fMPY16SS(fGETBYTE(0,RssV),fGETUBYTE(0,RttV)) +
+		   fMPY16SS(fGETBYTE(1,RssV),fGETUBYTE(1,RttV)) +
+		   fMPY16SS(fGETBYTE(2,RssV),fGETUBYTE(2,RttV)) +
+		   fMPY16SS(fGETBYTE(3,RssV),fGETUBYTE(3,RttV))));
+  fSETWORD(1,RddV,(fMPY16SS(fGETBYTE(4,RssV),fGETUBYTE(4,RttV)) +
+		   fMPY16SS(fGETBYTE(5,RssV),fGETUBYTE(5,RttV)) +
+		   fMPY16SS(fGETBYTE(6,RssV),fGETUBYTE(6,RttV)) +
+		   fMPY16SS(fGETBYTE(7,RssV),fGETUBYTE(7,RttV))));
+ })
+
+Q6INSN(M5_vrmacbsu,"Rxx32+=vrmpybsu(Rss32,Rtt32)",ATTRIBS(),
+ "vector dual mac bytes",
+{
+  fSETWORD(0,RxxV,(fGETWORD(0,RxxV) +
+		   fMPY16SS(fGETBYTE(0,RssV),fGETUBYTE(0,RttV)) +
+		   fMPY16SS(fGETBYTE(1,RssV),fGETUBYTE(1,RttV)) +
+		   fMPY16SS(fGETBYTE(2,RssV),fGETUBYTE(2,RttV)) +
+		   fMPY16SS(fGETBYTE(3,RssV),fGETUBYTE(3,RttV))));
+  fSETWORD(1,RxxV,(fGETWORD(1,RxxV) +
+		   fMPY16SS(fGETBYTE(4,RssV),fGETUBYTE(4,RttV)) +
+		   fMPY16SS(fGETBYTE(5,RssV),fGETUBYTE(5,RttV)) +
+		   fMPY16SS(fGETBYTE(6,RssV),fGETUBYTE(6,RttV)) +
+		   fMPY16SS(fGETBYTE(7,RssV),fGETUBYTE(7,RttV))));
+ })
+
+
+Q6INSN(M5_vmpybuu,"Rdd32=vmpybu(Rs32,Rt32)",ATTRIBS(),
+ "vector mpy bytes",
+{
+  fSETHALF(0,RddV,(fMPY16SS(fGETUBYTE(0,RsV),fGETUBYTE(0,RtV))));
+  fSETHALF(1,RddV,(fMPY16SS(fGETUBYTE(1,RsV),fGETUBYTE(1,RtV))));
+  fSETHALF(2,RddV,(fMPY16SS(fGETUBYTE(2,RsV),fGETUBYTE(2,RtV))));
+  fSETHALF(3,RddV,(fMPY16SS(fGETUBYTE(3,RsV),fGETUBYTE(3,RtV))));
+ })
+
+Q6INSN(M5_vmpybsu,"Rdd32=vmpybsu(Rs32,Rt32)",ATTRIBS(),
+ "vector mpy bytes",
+{
+  fSETHALF(0,RddV,(fMPY16SS(fGETBYTE(0,RsV),fGETUBYTE(0,RtV))));
+  fSETHALF(1,RddV,(fMPY16SS(fGETBYTE(1,RsV),fGETUBYTE(1,RtV))));
+  fSETHALF(2,RddV,(fMPY16SS(fGETBYTE(2,RsV),fGETUBYTE(2,RtV))));
+  fSETHALF(3,RddV,(fMPY16SS(fGETBYTE(3,RsV),fGETUBYTE(3,RtV))));
+ })
+
+
+Q6INSN(M5_vmacbuu,"Rxx32+=vmpybu(Rs32,Rt32)",ATTRIBS(),
+ "vector mac bytes",
+{
+  fSETHALF(0,RxxV,(fGETHALF(0,RxxV)+fMPY16SS(fGETUBYTE(0,RsV),fGETUBYTE(0,RtV))));
+  fSETHALF(1,RxxV,(fGETHALF(1,RxxV)+fMPY16SS(fGETUBYTE(1,RsV),fGETUBYTE(1,RtV))));
+  fSETHALF(2,RxxV,(fGETHALF(2,RxxV)+fMPY16SS(fGETUBYTE(2,RsV),fGETUBYTE(2,RtV))));
+  fSETHALF(3,RxxV,(fGETHALF(3,RxxV)+fMPY16SS(fGETUBYTE(3,RsV),fGETUBYTE(3,RtV))));
+ })
+
+Q6INSN(M5_vmacbsu,"Rxx32+=vmpybsu(Rs32,Rt32)",ATTRIBS(),
+ "vector mac bytes",
+{
+  fSETHALF(0,RxxV,(fGETHALF(0,RxxV)+fMPY16SS(fGETBYTE(0,RsV),fGETUBYTE(0,RtV))));
+  fSETHALF(1,RxxV,(fGETHALF(1,RxxV)+fMPY16SS(fGETBYTE(1,RsV),fGETUBYTE(1,RtV))));
+  fSETHALF(2,RxxV,(fGETHALF(2,RxxV)+fMPY16SS(fGETBYTE(2,RsV),fGETUBYTE(2,RtV))));
+  fSETHALF(3,RxxV,(fGETHALF(3,RxxV)+fMPY16SS(fGETBYTE(3,RsV),fGETUBYTE(3,RtV))));
+ })
+
+
+
+Q6INSN(M5_vdmpybsu,"Rdd32=vdmpybsu(Rss32,Rtt32):sat",ATTRIBS(),
+ "vector quad mpy bytes",
+{
+  fSETHALF(0,RddV,fSATN(16,(fMPY16SS(fGETBYTE(0,RssV),fGETUBYTE(0,RttV)) +
+                            fMPY16SS(fGETBYTE(1,RssV),fGETUBYTE(1,RttV)))));
+  fSETHALF(1,RddV,fSATN(16,(fMPY16SS(fGETBYTE(2,RssV),fGETUBYTE(2,RttV)) +
+                            fMPY16SS(fGETBYTE(3,RssV),fGETUBYTE(3,RttV)))));
+  fSETHALF(2,RddV,fSATN(16,(fMPY16SS(fGETBYTE(4,RssV),fGETUBYTE(4,RttV)) +
+                            fMPY16SS(fGETBYTE(5,RssV),fGETUBYTE(5,RttV)))));
+  fSETHALF(3,RddV,fSATN(16,(fMPY16SS(fGETBYTE(6,RssV),fGETUBYTE(6,RttV)) +
+                            fMPY16SS(fGETBYTE(7,RssV),fGETUBYTE(7,RttV)))));
+ })
+
+
+Q6INSN(M5_vdmacbsu,"Rxx32+=vdmpybsu(Rss32,Rtt32):sat",ATTRIBS(),
+ "vector quad mac bytes",
+{
+  fSETHALF(0,RxxV,fSATN(16,(fGETHALF(0,RxxV) +
+		   fMPY16SS(fGETBYTE(0,RssV),fGETUBYTE(0,RttV)) +
+		   fMPY16SS(fGETBYTE(1,RssV),fGETUBYTE(1,RttV)))));
+  fSETHALF(1,RxxV,fSATN(16,(fGETHALF(1,RxxV) +
+		   fMPY16SS(fGETBYTE(2,RssV),fGETUBYTE(2,RttV)) +
+		   fMPY16SS(fGETBYTE(3,RssV),fGETUBYTE(3,RttV)))));
+  fSETHALF(2,RxxV,fSATN(16,(fGETHALF(2,RxxV) +
+		   fMPY16SS(fGETBYTE(4,RssV),fGETUBYTE(4,RttV)) +
+		   fMPY16SS(fGETBYTE(5,RssV),fGETUBYTE(5,RttV)))));
+  fSETHALF(3,RxxV,fSATN(16,(fGETHALF(3,RxxV) +
+		   fMPY16SS(fGETBYTE(6,RssV),fGETUBYTE(6,RttV)) +
+		   fMPY16SS(fGETBYTE(7,RssV),fGETUBYTE(7,RttV)))));
+ })
+
+
+
+/* Full version */
+#undef dmpy_sema
+#define dmpy_sema(N)\
+{ fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV))) + \
+                     fSCALE(N,fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV)))));\
+  fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV))) + \
+                     fSCALE(N,fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV)))));\
+}
+Q6INSN(M2_vdmacs_s0,"Rxx32+=vdmpy(Rss32,Rtt32):sat",ATTRIBS(),    "",dmpy_sema(0))
+Q6INSN(M2_vdmacs_s1,"Rxx32+=vdmpy(Rss32,Rtt32):<<1:sat",ATTRIBS(),"",dmpy_sema(1))
+
+#undef dmpy_sema
+#define dmpy_sema(N)\
+{ fSETWORD(0,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV))) + \
+              fSCALE(N,fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV)))));\
+  fSETWORD(1,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV))) + \
+              fSCALE(N,fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV)))));\
+}
+
+Q6INSN(M2_vdmpys_s0,"Rdd32=vdmpy(Rss32,Rtt32):sat",ATTRIBS(),    "",dmpy_sema(0))
+Q6INSN(M2_vdmpys_s1,"Rdd32=vdmpy(Rss32,Rtt32):<<1:sat",ATTRIBS(),"",dmpy_sema(1))
+
+
+
+/******************************************************/
+/* complex multiply/mac with                          */
+/* real&imag are packed together and always saturated */
+/* to protect against overflow.                       */
+/******************************************************/
+
+#undef cmpy_sema
+#define cmpy_sema(N,CONJMINUS,CONJPLUS)\
+{ fSETHALF(1,RdV,fGETHALF(1,(fSAT(fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(0,RtV))) CONJMINUS \
+                                  fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(1,RtV))) + 0x8000))));\
+  fSETHALF(0,RdV,fGETHALF(1,(fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV))) CONJPLUS \
+                                  fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV))) + 0x8000))));\
+}
+Q6INSN(M2_cmpyrs_s0,"Rd32=cmpy(Rs32,Rt32):rnd:sat",ATTRIBS(),    "Complex Multiply",cmpy_sema(0,+,-))
+Q6INSN(M2_cmpyrs_s1,"Rd32=cmpy(Rs32,Rt32):<<1:rnd:sat",ATTRIBS(),"Complex Multiply",cmpy_sema(1,+,-))
+
+
+Q6INSN(M2_cmpyrsc_s0,"Rd32=cmpy(Rs32,Rt32*):rnd:sat",ATTRIBS(A_ARCHV2),    "Complex Multiply",cmpy_sema(0,-,+))
+Q6INSN(M2_cmpyrsc_s1,"Rd32=cmpy(Rs32,Rt32*):<<1:rnd:sat",ATTRIBS(A_ARCHV2),"Complex Multiply",cmpy_sema(1,-,+))
+
+
+#undef cmpy_sema
+#define cmpy_sema(N,CONJMINUS,CONJPLUS)\
+{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(0,RtV))) CONJMINUS \
+                                          fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(1,RtV)))));\
+  fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV))) CONJPLUS \
+                                          fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV)))));\
+}
+Q6INSN(M2_cmacs_s0,"Rxx32+=cmpy(Rs32,Rt32):sat",ATTRIBS(),    "Complex Multiply",cmpy_sema(0,+,-))
+Q6INSN(M2_cmacs_s1,"Rxx32+=cmpy(Rs32,Rt32):<<1:sat",ATTRIBS(),"Complex Multiply",cmpy_sema(1,+,-))
+
+/* EJP: Need mac versions w/ CONJ T? */
+Q6INSN(M2_cmacsc_s0,"Rxx32+=cmpy(Rs32,Rt32*):sat",ATTRIBS(A_ARCHV2),    "Complex Multiply",cmpy_sema(0,-,+))
+Q6INSN(M2_cmacsc_s1,"Rxx32+=cmpy(Rs32,Rt32*):<<1:sat",ATTRIBS(A_ARCHV2),"Complex Multiply",cmpy_sema(1,-,+))
+
+
+#undef cmpy_sema
+#define cmpy_sema(N,CONJMINUS,CONJPLUS)\
+{ fSETWORD(1,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(0,RtV))) CONJMINUS \
+                       fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(1,RtV)))));\
+  fSETWORD(0,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV))) CONJPLUS \
+                       fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV)))));\
+}
+
+Q6INSN(M2_cmpys_s0,"Rdd32=cmpy(Rs32,Rt32):sat",ATTRIBS(),    "Complex Multiply",cmpy_sema(0,+,-))
+Q6INSN(M2_cmpys_s1,"Rdd32=cmpy(Rs32,Rt32):<<1:sat",ATTRIBS(),"Complex Multiply",cmpy_sema(1,+,-))
+
+Q6INSN(M2_cmpysc_s0,"Rdd32=cmpy(Rs32,Rt32*):sat",ATTRIBS(A_ARCHV2),    "Complex Multiply",cmpy_sema(0,-,+))
+Q6INSN(M2_cmpysc_s1,"Rdd32=cmpy(Rs32,Rt32*):<<1:sat",ATTRIBS(A_ARCHV2),"Complex Multiply",cmpy_sema(1,-,+))
+
+
+
+#undef cmpy_sema
+#define cmpy_sema(N,CONJMINUS,CONJPLUS)\
+{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) - (fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(0,RtV))) CONJMINUS \
+                                           fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(1,RtV))))));\
+  fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) - (fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV))) CONJPLUS \
+                                           fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV))))));\
+}
+Q6INSN(M2_cnacs_s0,"Rxx32-=cmpy(Rs32,Rt32):sat",ATTRIBS(A_ARCHV2),    "Complex Multiply",cmpy_sema(0,+,-))
+Q6INSN(M2_cnacs_s1,"Rxx32-=cmpy(Rs32,Rt32):<<1:sat",ATTRIBS(A_ARCHV2),"Complex Multiply",cmpy_sema(1,+,-))
+
+/* EJP: need CONJ versions? */
+Q6INSN(M2_cnacsc_s0,"Rxx32-=cmpy(Rs32,Rt32*):sat",ATTRIBS(A_ARCHV2),    "Complex Multiply",cmpy_sema(0,-,+))
+Q6INSN(M2_cnacsc_s1,"Rxx32-=cmpy(Rs32,Rt32*):<<1:sat",ATTRIBS(A_ARCHV2),"Complex Multiply",cmpy_sema(1,-,+))
+
+
+/******************************************************/
+/* complex interpolation                              */
+/* Given a pair of complex values, scale by a,b, sum  */
+/* Saturate/shift1 and round/pack                     */
+/******************************************************/
+
+#undef vrcmpys_sema
+#define vrcmpys_sema(N,INWORD) \
+{ fSETWORD(1,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,INWORD))) + \
+                       fSCALE(N,fMPY16SS(fGETHALF(3,RssV),fGETHALF(1,INWORD)))));\
+  fSETWORD(0,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,INWORD))) + \
+                       fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(1,INWORD)))));\
+}
+
+
+
+Q6INSN(M2_vrcmpys_s1_h,"Rdd32=vrcmpys(Rss32,Rtt32):<<1:sat:raw:hi",ATTRIBS(A_ARCHV3), "Vector Reduce Complex Multiply by Scalar",vrcmpys_sema(1,fGETWORD(1,RttV)))
+Q6INSN(M2_vrcmpys_s1_l,"Rdd32=vrcmpys(Rss32,Rtt32):<<1:sat:raw:lo",ATTRIBS(A_ARCHV3), "Vector Reduce Complex Multiply by Scalar",vrcmpys_sema(1,fGETWORD(0,RttV)))
+
+#undef vrcmpys_sema
+#define vrcmpys_sema(N,INWORD) \
+{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,INWORD))) + \
+                       fSCALE(N,fMPY16SS(fGETHALF(3,RssV),fGETHALF(1,INWORD)))));\
+  fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,INWORD))) + \
+                       fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(1,INWORD)))));\
+}
+
+
+
+Q6INSN(M2_vrcmpys_acc_s1_h,"Rxx32+=vrcmpys(Rss32,Rtt32):<<1:sat:raw:hi",ATTRIBS(A_ARCHV3), "Vector Reduce Complex Multiply by Scalar",vrcmpys_sema(1,fGETWORD(1,RttV)))
+Q6INSN(M2_vrcmpys_acc_s1_l,"Rxx32+=vrcmpys(Rss32,Rtt32):<<1:sat:raw:lo",ATTRIBS(A_ARCHV3), "Vector Reduce Complex Multiply by Scalar",vrcmpys_sema(1,fGETWORD(0,RttV)))
+
+#undef vrcmpys_sema
+#define vrcmpys_sema(N,INWORD) \
+{ fSETHALF(1,RdV,fGETHALF(1,fSAT(fSCALE(N,fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,INWORD))) + \
+                       fSCALE(N,fMPY16SS(fGETHALF(3,RssV),fGETHALF(1,INWORD))) + 0x8000)));\
+  fSETHALF(0,RdV,fGETHALF(1,fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,INWORD))) + \
+                       fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(1,INWORD))) + 0x8000)));\
+}
+
+Q6INSN(M2_vrcmpys_s1rp_h,"Rd32=vrcmpys(Rss32,Rtt32):<<1:rnd:sat:raw:hi",ATTRIBS(A_ARCHV3), "Vector Reduce Complex Multiply by Scalar",vrcmpys_sema(1,fGETWORD(1,RttV)))
+Q6INSN(M2_vrcmpys_s1rp_l,"Rd32=vrcmpys(Rss32,Rtt32):<<1:rnd:sat:raw:lo",ATTRIBS(A_ARCHV3), "Vector Reduce Complex Multiply by Scalar",vrcmpys_sema(1,fGETWORD(0,RttV)))
+
+/**************************************************************/
+/* mixed mode 32x16 vector dual multiplies                    */
+/*                                                            */
+/**************************************************************/
+
+/* SIGNED 32 x SIGNED 16 */
+
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(2,RttV))))>>16)) ); \
+  fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RttV))))>>16)) ); \
+}
+Q6INSN(M2_mmacls_s0,"Rxx32+=vmpyweh(Rss32,Rtt32):sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmacls_s1,"Rxx32+=vmpyweh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(3,RttV))))>>16) )); \
+  fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RttV))))>>16 ))); \
+}
+Q6INSN(M2_mmachs_s0,"Rxx32+=vmpywoh(Rss32,Rtt32):sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmachs_s1,"Rxx32+=vmpywoh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(2,RttV))))>>16)); \
+  fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RttV))))>>16)); \
+}
+Q6INSN(M2_mmpyl_s0,"Rdd32=vmpyweh(Rss32,Rtt32):sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmpyl_s1,"Rdd32=vmpyweh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(3,RttV))))>>16)); \
+  fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RttV))))>>16)); \
+}
+Q6INSN(M2_mmpyh_s0,"Rdd32=vmpywoh(Rss32,Rtt32):sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmpyh_s1,"Rdd32=vmpywoh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+
+/* With rounding */
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(2,RttV)))+0x8000)>>16)) ); \
+  fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RttV)))+0x8000)>>16)) ); \
+}
+Q6INSN(M2_mmacls_rs0,"Rxx32+=vmpyweh(Rss32,Rtt32):rnd:sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmacls_rs1,"Rxx32+=vmpyweh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(3,RttV)))+0x8000)>>16) )); \
+  fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RttV)))+0x8000)>>16 ))); \
+}
+Q6INSN(M2_mmachs_rs0,"Rxx32+=vmpywoh(Rss32,Rtt32):rnd:sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmachs_rs1,"Rxx32+=vmpywoh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(2,RttV)))+0x8000)>>16)); \
+  fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RttV)))+0x8000)>>16)); \
+}
+Q6INSN(M2_mmpyl_rs0,"Rdd32=vmpyweh(Rss32,Rtt32):rnd:sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmpyl_rs1,"Rdd32=vmpyweh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(3,RttV)))+0x8000)>>16)); \
+  fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RttV)))+0x8000)>>16)); \
+}
+Q6INSN(M2_mmpyh_rs0,"Rdd32=vmpywoh(Rss32,Rtt32):rnd:sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmpyh_rs1,"Rdd32=vmpywoh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
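The rounding variants above all share one per-lane pattern: a signed 32x16 product, an optional doubling for `:<<1`, a `+0x8000` rounding constant, an arithmetic shift right by 16, and a saturate to 32 bits. A minimal C sketch of one lane follows. It assumes that `fSCALE(N, x)` means "x shifted left by N" and that `fSAT` clamps to the signed 32-bit range; the names `sat32` and `mpy3216_rnd_sat` are hypothetical and not part of the patch. It also relies on `>>` of a negative `int64_t` being an arithmetic shift, which is implementation-defined in ISO C but holds for GCC and Clang.

```c
#include <stdint.h>

/* Clamp a 64-bit value to the signed 32-bit range (assumed meaning of fSAT). */
static int32_t sat32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/*
 * One lane of the ":rnd:sat" mixed-precision multiply (hypothetical helper).
 * scale is 0 for the plain form and 1 for the ":<<1" form.
 */
static int32_t mpy3216_rnd_sat(int32_t word, int16_t half, int scale)
{
    int64_t prod = (int64_t)word * half;   /* at most 48 significant bits */
    prod *= (int64_t)1 << scale;           /* doubling avoids shifting a negative value */
    return sat32((prod + 0x8000) >> 16);   /* round half up, drop 16 fraction bits */
}
```

For example, `mpy3216_rnd_sat(INT32_MIN, INT16_MIN, 1)` produces 2^31 before clamping, so the result saturates to `INT32_MAX`; with `scale` set to 0 the same inputs give `0x40000000`.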
+
+
+#undef mixmpy_sema
+#define mixmpy_sema(DEST,EQUALS,N)\
+{ DEST EQUALS fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(2,RttV))) + fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RttV)));}
+
+Q6INSN(M4_vrmpyeh_s0,"Rdd32=vrmpyweh(Rss32,Rtt32)",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(RddV,=,0))
+Q6INSN(M4_vrmpyeh_s1,"Rdd32=vrmpyweh(Rss32,Rtt32):<<1",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(RddV,=,1))
+Q6INSN(M4_vrmpyeh_acc_s0,"Rxx32+=vrmpyweh(Rss32,Rtt32)",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(RxxV,+=,0))
+Q6INSN(M4_vrmpyeh_acc_s1,"Rxx32+=vrmpyweh(Rss32,Rtt32):<<1",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(RxxV,+=,1))
+
+#undef mixmpy_sema
+#define mixmpy_sema(DEST,EQUALS,N)\
+{ DEST EQUALS fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(3,RttV))) + fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RttV)));}
+
+Q6INSN(M4_vrmpyoh_s0,"Rdd32=vrmpywoh(Rss32,Rtt32)",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(RddV,=,0))
+Q6INSN(M4_vrmpyoh_s1,"Rdd32=vrmpywoh(Rss32,Rtt32):<<1",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(RddV,=,1))
+Q6INSN(M4_vrmpyoh_acc_s0,"Rxx32+=vrmpywoh(Rss32,Rtt32)",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(RxxV,+=,0))
+Q6INSN(M4_vrmpyoh_acc_s1,"Rxx32+=vrmpywoh(Rss32,Rtt32):<<1",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(RxxV,+=,1))
+
+
+
+
+
+
+#undef mixmpy_sema
+#define mixmpy_sema(N,H,RND)\
+{  RdV = fSAT((fSCALE(N,fMPY3216SS(RsV,fGETHALF(H,RtV)))RND)>>16); \
+}
+Q6INSN(M2_hmmpyl_rs1,"Rd32=mpy(Rs32,Rt.L32):<<1:rnd:sat",ATTRIBS(A_ARCHV2),"Mixed Precision Multiply",mixmpy_sema(1,0,+0x8000))
+Q6INSN(M2_hmmpyh_rs1,"Rd32=mpy(Rs32,Rt.H32):<<1:rnd:sat",ATTRIBS(A_ARCHV2),"Mixed Precision Multiply",mixmpy_sema(1,1,+0x8000))
+Q6INSN(M2_hmmpyl_s1,"Rd32=mpy(Rs32,Rt.L32):<<1:sat",ATTRIBS(A_ARCHV2),"Mixed Precision Multiply",mixmpy_sema(1,0,))
+Q6INSN(M2_hmmpyh_s1,"Rd32=mpy(Rs32,Rt.H32):<<1:sat",ATTRIBS(A_ARCHV2),"Mixed Precision Multiply",mixmpy_sema(1,1,))
+
+
+
+
+
+
+
+
+
+/* SIGNED 32 x UNSIGNED 16 */
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(2,RttV))))>>16)) ); \
+  fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(0,RttV))))>>16)) ); \
+}
+Q6INSN(M2_mmaculs_s0,"Rxx32+=vmpyweuh(Rss32,Rtt32):sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmaculs_s1,"Rxx32+=vmpyweuh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(3,RttV))))>>16) )); \
+  fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(1,RttV))))>>16 ))); \
+}
+Q6INSN(M2_mmacuhs_s0,"Rxx32+=vmpywouh(Rss32,Rtt32):sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmacuhs_s1,"Rxx32+=vmpywouh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(2,RttV))))>>16)); \
+  fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(0,RttV))))>>16)); \
+}
+Q6INSN(M2_mmpyul_s0,"Rdd32=vmpyweuh(Rss32,Rtt32):sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmpyul_s1,"Rdd32=vmpyweuh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(3,RttV))))>>16)); \
+  fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(1,RttV))))>>16)); \
+}
+Q6INSN(M2_mmpyuh_s0,"Rdd32=vmpywouh(Rss32,Rtt32):sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmpyuh_s1,"Rdd32=vmpywouh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+
+/* With rounding */
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(2,RttV)))+0x8000)>>16)) ); \
+  fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(0,RttV)))+0x8000)>>16)) ); \
+}
+Q6INSN(M2_mmaculs_rs0,"Rxx32+=vmpyweuh(Rss32,Rtt32):rnd:sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmaculs_rs1,"Rxx32+=vmpyweuh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(3,RttV)))+0x8000)>>16) )); \
+  fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(1,RttV)))+0x8000)>>16 ))); \
+}
+Q6INSN(M2_mmacuhs_rs0,"Rxx32+=vmpywouh(Rss32,Rtt32):rnd:sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmacuhs_rs1,"Rxx32+=vmpywouh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(2,RttV)))+0x8000)>>16)); \
+  fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(0,RttV)))+0x8000)>>16)); \
+}
+Q6INSN(M2_mmpyul_rs0,"Rdd32=vmpyweuh(Rss32,Rtt32):rnd:sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmpyul_rs1,"Rdd32=vmpyweuh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+#undef mixmpy_sema
+#define mixmpy_sema(N)\
+{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(3,RttV)))+0x8000)>>16)); \
+  fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(1,RttV)))+0x8000)>>16)); \
+}
+Q6INSN(M2_mmpyuh_rs0,"Rdd32=vmpywouh(Rss32,Rtt32):rnd:sat",ATTRIBS(),    "Mixed Precision Multiply",mixmpy_sema(0))
+Q6INSN(M2_mmpyuh_rs1,"Rdd32=vmpywouh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1))
+
+
+/**************************************************************/
+/* complex mac with full 64-bit accum - no sat, no shift      */
+/* either do real or imag,  never both                        */
+/**************************************************************/
+
+Q6INSN(M2_vrcmaci_s0,"Rxx32+=vrcmpyi(Rss32,Rtt32)",ATTRIBS(),"Vector Complex Mac Imaginary",
+{
+RxxV = RxxV + fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,RttV)) + \
+              fMPY16SS(fGETHALF(0,RssV),fGETHALF(1,RttV)) + \
+              fMPY16SS(fGETHALF(3,RssV),fGETHALF(2,RttV)) + \
+              fMPY16SS(fGETHALF(2,RssV),fGETHALF(3,RttV));\
+})
+
+Q6INSN(M2_vrcmacr_s0,"Rxx32+=vrcmpyr(Rss32,Rtt32)",ATTRIBS(),"Vector Complex Mac Real",
+{ RxxV = RxxV + fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)) - \
+                fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV)) + \
+                fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)) - \
+                fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV));\
+})
+
+Q6INSN(M2_vrcmaci_s0c,"Rxx32+=vrcmpyi(Rss32,Rtt32*)",ATTRIBS(A_ARCHV2),"Vector Complex Mac Imaginary",
+{
+RxxV = RxxV + fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,RttV)) - \
+              fMPY16SS(fGETHALF(0,RssV),fGETHALF(1,RttV)) + \
+              fMPY16SS(fGETHALF(3,RssV),fGETHALF(2,RttV)) - \
+              fMPY16SS(fGETHALF(2,RssV),fGETHALF(3,RttV));\
+})
+
+Q6INSN(M2_vrcmacr_s0c,"Rxx32+=vrcmpyr(Rss32,Rtt32*)",ATTRIBS(A_ARCHV2),"Vector Complex Mac Real",
+{ RxxV = RxxV + fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)) + \
+                fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV)) + \
+                fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)) + \
+                fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV));\
+})
+
+Q6INSN(M2_cmaci_s0,"Rxx32+=cmpyi(Rs32,Rt32)",ATTRIBS(),"Vector Complex Mac Imaginary",
+{
+RxxV = RxxV + fMPY16SS(fGETHALF(1,RsV),fGETHALF(0,RtV)) + \
+              fMPY16SS(fGETHALF(0,RsV),fGETHALF(1,RtV));
+})
+
+Q6INSN(M2_cmacr_s0,"Rxx32+=cmpyr(Rs32,Rt32)",ATTRIBS(),"Vector Complex Mac Real",
+{ RxxV = RxxV + fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV)) - \
+                fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV));
+})
+
+
+Q6INSN(M2_vrcmpyi_s0,"Rdd32=vrcmpyi(Rss32,Rtt32)",ATTRIBS(),"Vector Complex Mpy Imaginary",
+{
+RddV = fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,RttV)) + \
+       fMPY16SS(fGETHALF(0,RssV),fGETHALF(1,RttV)) + \
+       fMPY16SS(fGETHALF(3,RssV),fGETHALF(2,RttV)) + \
+       fMPY16SS(fGETHALF(2,RssV),fGETHALF(3,RttV));\
+})
+
+Q6INSN(M2_vrcmpyr_s0,"Rdd32=vrcmpyr(Rss32,Rtt32)",ATTRIBS(),"Vector Complex Mpy Real",
+{ RddV = fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)) - \
+         fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV)) + \
+         fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)) - \
+         fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV));\
+})
+
+Q6INSN(M2_vrcmpyi_s0c,"Rdd32=vrcmpyi(Rss32,Rtt32*)",ATTRIBS(A_ARCHV2),"Vector Complex Mpy Imaginary",
+{
+RddV = fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,RttV)) - \
+       fMPY16SS(fGETHALF(0,RssV),fGETHALF(1,RttV)) + \
+       fMPY16SS(fGETHALF(3,RssV),fGETHALF(2,RttV)) - \
+       fMPY16SS(fGETHALF(2,RssV),fGETHALF(3,RttV));\
+})
+
+Q6INSN(M2_vrcmpyr_s0c,"Rdd32=vrcmpyr(Rss32,Rtt32*)",ATTRIBS(A_ARCHV2),"Vector Complex Mpy Real",
+{ RddV = fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)) + \
+         fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV)) + \
+         fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)) + \
+         fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV));\
+})
+
+Q6INSN(M2_cmpyi_s0,"Rdd32=cmpyi(Rs32,Rt32)",ATTRIBS(),"Vector Complex Mpy Imaginary",
+{
+RddV = fMPY16SS(fGETHALF(1,RsV),fGETHALF(0,RtV)) + \
+       fMPY16SS(fGETHALF(0,RsV),fGETHALF(1,RtV));
+})
+
+Q6INSN(M2_cmpyr_s0,"Rdd32=cmpyr(Rs32,Rt32)",ATTRIBS(),"Vector Complex Mpy Real",
+{ RddV = fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV)) - \
+         fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV));
+})
+
+
+/**************************************************************/
+/* Complex mpy/mac with 2x32 bit accum, sat, shift            */
+/* 32x16 real or imag                                         */
+/**************************************************************/
+
+#if 1
+
+Q6INSN(M4_cmpyi_wh,"Rd32=cmpyiwh(Rss32,Rt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Complex Multiply",
+{
+ RdV = fSAT(  (  fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RtV))
+               + fMPY3216SS(fGETWORD(1,RssV),fGETHALF(0,RtV))
+               + 0x4000)>>15);
+})
+
+
+Q6INSN(M4_cmpyr_wh,"Rd32=cmpyrwh(Rss32,Rt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Complex Multiply",
+{
+ RdV = fSAT(  (  fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RtV))
+               - fMPY3216SS(fGETWORD(1,RssV),fGETHALF(1,RtV))
+               + 0x4000)>>15);
+})
+
+Q6INSN(M4_cmpyi_whc,"Rd32=cmpyiwh(Rss32,Rt32*):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Complex Multiply",
+{
+ RdV = fSAT(  (  fMPY3216SS(fGETWORD(1,RssV),fGETHALF(0,RtV))
+               - fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RtV))
+               + 0x4000)>>15);
+})
+
+
+Q6INSN(M4_cmpyr_whc,"Rd32=cmpyrwh(Rss32,Rt32*):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Complex Multiply",
+{
+ RdV = fSAT(  (  fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RtV))
+               + fMPY3216SS(fGETWORD(1,RssV),fGETHALF(1,RtV))
+               + 0x4000)>>15);
+})
+
+
+#endif
+
+/**************************************************************/
+/* Vector mpy/mac with 2x32 bit accum, sat, shift             */
+/* either do real or imag,  never both                        */
+/**************************************************************/
+
+#undef VCMPYSEMI
+#define VCMPYSEMI(DST,ACC0,ACC1,SHIFT,SAT) \
+	fSETWORD(0,DST,SAT(ACC0 fSCALE(SHIFT,fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,RttV)) + \
+	                                    fMPY16SS(fGETHALF(0,RssV),fGETHALF(1,RttV))))); \
+	fSETWORD(1,DST,SAT(ACC1 fSCALE(SHIFT,fMPY16SS(fGETHALF(3,RssV),fGETHALF(2,RttV)) + \
+	                                    fMPY16SS(fGETHALF(2,RssV),fGETHALF(3,RttV))))); \
+
+#undef VCMPYSEMR
+#define VCMPYSEMR(DST,ACC0,ACC1,SHIFT,SAT) \
+	fSETWORD(0,DST,SAT(ACC0 fSCALE(SHIFT,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)) - \
+	                       fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV))))); \
+	fSETWORD(1,DST,SAT(ACC1 fSCALE(SHIFT,fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)) - \
+	                       fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV))))); \
+
+
+#undef VCMPYIR
+#define VCMPYIR(TAGBASE,DSTSYN,DSTVAL,ACCSEM,ACCVAL0,ACCVAL1,SHIFTSYN,SHIFTVAL,SATSYN,SATVAL) \
+Q6INSN(M2_##TAGBASE##i,DSTSYN ACCSEM "vcmpyi(Rss32,Rtt32)" SHIFTSYN SATSYN,ATTRIBS(A_ARCHV2), \
+	"Vector Complex Multiply Imaginary", { VCMPYSEMI(DSTVAL,ACCVAL0,ACCVAL1,SHIFTVAL,SATVAL); }) \
+Q6INSN(M2_##TAGBASE##r,DSTSYN ACCSEM "vcmpyr(Rss32,Rtt32)" SHIFTSYN SATSYN,ATTRIBS(A_ARCHV2), \
+	"Vector Complex Multiply Real", { VCMPYSEMR(DSTVAL,ACCVAL0,ACCVAL1,SHIFTVAL,SATVAL); })
+
+
+VCMPYIR(vcmpy_s0_sat_,"Rdd32",RddV,"=",,,"",0,":sat",fSAT)
+VCMPYIR(vcmpy_s1_sat_,"Rdd32",RddV,"=",,,":<<1",1,":sat",fSAT)
+VCMPYIR(vcmac_s0_sat_,"Rxx32",RxxV,"+=",fGETWORD(0,RxxV) + ,fGETWORD(1,RxxV) + ,"",0,":sat",fSAT)
+
+
+/**********************************************************************
+ *  Rotation  -- by 0, 90, 180, or 270 means mult by 1, J, -1, -J     *
+ *********************************************************************/
+
+Q6INSN(S2_vcrotate,"Rdd32=vcrotate(Rss32,Rt32)",ATTRIBS(A_ARCHV2),"Rotate complex value by multiple of PI/2",
+{
+	fHIDE(size1u_t tmp;)
+	tmp = fEXTRACTU_RANGE(RtV,1,0);
+	if (tmp == 0) { /* No rotation */
+		fSETHALF(0,RddV,fGETHALF(0,RssV));
+		fSETHALF(1,RddV,fGETHALF(1,RssV));
+	} else if (tmp == 1) { /* Multiply by -J */
+		fSETHALF(0,RddV,fGETHALF(1,RssV));
+		fSETHALF(1,RddV,fSATH(-fGETHALF(0,RssV)));
+	} else if (tmp == 2) { /* Multiply by J */
+		fSETHALF(0,RddV,fSATH(-fGETHALF(1,RssV)));
+		fSETHALF(1,RddV,fGETHALF(0,RssV));
+	} else { /* Multiply by -1 */
+		fHIDE(if (tmp != 3) fatal("C is broken");)
+		fSETHALF(0,RddV,fSATH(-fGETHALF(0,RssV)));
+		fSETHALF(1,RddV,fSATH(-fGETHALF(1,RssV)));
+	}
+	tmp = fEXTRACTU_RANGE(RtV,3,2);
+	if (tmp == 0) { /* No rotation */
+		fSETHALF(2,RddV,fGETHALF(2,RssV));
+		fSETHALF(3,RddV,fGETHALF(3,RssV));
+	} else if (tmp == 1) { /* Multiply by -J */
+		fSETHALF(2,RddV,fGETHALF(3,RssV));
+		fSETHALF(3,RddV,fSATH(-fGETHALF(2,RssV)));
+	} else if (tmp == 2) { /* Multiply by J */
+		fSETHALF(2,RddV,fSATH(-fGETHALF(3,RssV)));
+		fSETHALF(3,RddV,fGETHALF(2,RssV));
+	} else { /* Multiply by -1 */
+		fHIDE(if (tmp != 3) fatal("C is broken");)
+		fSETHALF(2,RddV,fSATH(-fGETHALF(2,RssV)));
+		fSETHALF(3,RddV,fSATH(-fGETHALF(3,RssV)));
+	}
+})
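The vcrotate semantics above treat each 32-bit word as a complex number with the real part in the low halfword and the imaginary part in the high halfword, and a 2-bit control selects multiplication by 1, -j, j, or -1. The sketch below models one such word in plain C. The function names are hypothetical, `sat16` stands in for `fSATH`, and the narrowing casts assume the usual two's-complement wraparound.

```c
#include <stdint.h>

/* Clamp to the signed 16-bit range (assumed meaning of fSATH). */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/*
 * Rotate one packed complex value (low half = real, high half = imaginary)
 * by the quarter turn selected by the low two bits of ctl.
 * Hypothetical helper mirroring one half of the vcrotate semantics.
 */
static uint32_t rotate_quarter(uint32_t word, unsigned ctl)
{
    int16_t re = (int16_t)(word & 0xffff);
    int16_t im = (int16_t)(word >> 16);
    int16_t out_re, out_im;

    switch (ctl & 3) {
    case 0:  out_re = re;          out_im = im;          break; /* times 1  */
    case 1:  out_re = im;          out_im = sat16(-re);  break; /* times -j */
    case 2:  out_re = sat16(-im);  out_im = re;          break; /* times j  */
    default: out_re = sat16(-re);  out_im = sat16(-im);  break; /* times -1 */
    }
    return ((uint32_t)(uint16_t)out_im << 16) | (uint16_t)out_re;
}
```

The saturation matters only for the value -32768, whose negation does not fit in 16 bits: `rotate_quarter(0x00008000, 3)` yields `0x00007fff`.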
+
+
+Q6INSN(S4_vrcrotate_acc,"Rxx32+=vrcrotate(Rss32,Rt32,#u2)",ATTRIBS(),"Rotate and Reduce Bytes",
+{
+	fHIDE(int i; int tmpr; int tmpi; unsigned int control;)
+	fHIDE(int sumr; int sumi;)
+	sumr = 0;
+	sumi = 0;
+	control = fGETUBYTE(uiV,RtV);
+	for (i = 0; i < 8; i += 2) {
+		tmpr = fGETBYTE(i  ,RssV);
+		tmpi = fGETBYTE(i+1,RssV);
+		switch (control & 3) {
+			case 0: /* No Rotation */
+				sumr += tmpr;
+				sumi += tmpi;
+				break;
+			case 1: /* Multiply by -J */
+				sumr += tmpi;
+				sumi -= tmpr;
+				break;
+			case 2: /* Multiply by J */
+				sumr -= tmpi;
+				sumi += tmpr;
+				break;
+			case 3: /* Multiply by -1 */
+				sumr -= tmpr;
+				sumi -= tmpi;
+				break;
+			fHIDE(default: fatal("C is broken!");)
+		}
+		control = control >> 2;
+	}
+	fSETWORD(0,RxxV,fGETWORD(0,RxxV) + sumr);
+	fSETWORD(1,RxxV,fGETWORD(1,RxxV) + sumi);
+})
+
+Q6INSN(S4_vrcrotate,"Rdd32=vrcrotate(Rss32,Rt32,#u2)",ATTRIBS(),"Rotate and Reduce Bytes",
+{
+	fHIDE(int i; int tmpr; int tmpi; unsigned int control;)
+	fHIDE(int sumr; int sumi;)
+	sumr = 0;
+	sumi = 0;
+	control = fGETUBYTE(uiV,RtV);
+	for (i = 0; i < 8; i += 2) {
+		tmpr = fGETBYTE(i  ,RssV);
+		tmpi = fGETBYTE(i+1,RssV);
+		switch (control & 3) {
+			case 0: /* No Rotation */
+				sumr += tmpr;
+				sumi += tmpi;
+				break;
+			case 1: /* Multiply by -J */
+				sumr += tmpi;
+				sumi -= tmpr;
+				break;
+			case 2: /* Multiply by J */
+				sumr -= tmpi;
+				sumi += tmpr;
+				break;
+			case 3: /* Multiply by -1 */
+				sumr -= tmpr;
+				sumi -= tmpi;
+				break;
+			fHIDE(default: fatal("C is broken!");)
+		}
+		control = control >> 2;
+	}
+	fSETWORD(0,RddV,sumr);
+	fSETWORD(1,RddV,sumi);
+})
+
+
+Q6INSN(S2_vcnegh,"Rdd32=vcnegh(Rss32,Rt32)",ATTRIBS(),"Conditional Negate halfwords",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 4; i++) {
+		if (fGETBIT(i,RtV)) {
+			fSETHALF(i,RddV,fSATH(-fGETHALF(i,RssV)));
+		} else {
+			fSETHALF(i,RddV,fGETHALF(i,RssV));
+		}
+	}
+})
+
+Q6INSN(S2_vrcnegh,"Rxx32+=vrcnegh(Rss32,Rt32)",ATTRIBS(),"Vector Reduce Conditional Negate halfwords",
+{
+	fHIDE(int i;)
+	for (i = 0; i < 4; i++) {
+		if (fGETBIT(i,RtV)) {
+			RxxV += -fGETHALF(i,RssV);
+		} else {
+			RxxV += fGETHALF(i,RssV);
+		}
+	}
+})
+
+
+/**********************************************************************
+ *  Finite-field multiplies.  Written by David Hoyle                  *
+ *********************************************************************/
+
+Q6INSN(M4_pmpyw,"Rdd32=pmpyw(Rs32,Rt32)",ATTRIBS(),"Polynomial 32bit Multiplication with Addition in GF(2)",
+{
+        fHIDE(int i; unsigned int y;)
+        fHIDE(unsigned long long x; unsigned long long prod;)
+        x = fGETUWORD(0, RsV);
+        y = fGETUWORD(0, RtV);
+
+        prod = 0;
+        for(i=0; i < 32; i++) {
+            if((y >> i) & 1) prod ^= (x << i);
+        }
+        RddV = prod;
+})
+
+Q6INSN(M4_vpmpyh,"Rdd32=vpmpyh(Rs32,Rt32)",ATTRIBS(),"Dual Polynomial 16bit Multiplication with Addition in GF(2)",
+{
+        fHIDE(int i; unsigned int x0; unsigned int x1;)
+        fHIDE(unsigned int y0; unsigned int y1;)
+        fHIDE(unsigned int prod0; unsigned int prod1;)
+
+        x0 = fGETUHALF(0, RsV);
+        x1 = fGETUHALF(1, RsV);
+        y0 = fGETUHALF(0, RtV);
+        y1 = fGETUHALF(1, RtV);
+
+        prod0 = prod1 = 0;
+        for(i=0; i < 16; i++) {
+            if((y0 >> i) & 1) prod0 ^= (x0 << i);
+            if((y1 >> i) & 1) prod1 ^= (x1 << i);
+        }
+        fSETHALF(0,RddV,fGETUHALF(0,prod0));
+        fSETHALF(1,RddV,fGETUHALF(0,prod1));
+        fSETHALF(2,RddV,fGETUHALF(1,prod0));
+        fSETHALF(3,RddV,fGETUHALF(1,prod1));
+})
+
+Q6INSN(M4_pmpyw_acc,"Rxx32^=pmpyw(Rs32,Rt32)",ATTRIBS(),"Polynomial 32bit Multiplication with Addition in GF(2)",
+{
+        fHIDE(int i; unsigned int y;)
+        fHIDE(unsigned long long x; unsigned long long prod;)
+        x = fGETUWORD(0, RsV);
+        y = fGETUWORD(0, RtV);
+
+        prod = 0;
+        for(i=0; i < 32; i++) {
+            if((y >> i) & 1) prod ^= (x << i);
+        }
+        RxxV ^= prod;
+})
+
+Q6INSN(M4_vpmpyh_acc,"Rxx32^=vpmpyh(Rs32,Rt32)",ATTRIBS(),"Dual Polynomial 16bit Multiplication with Addition in GF(2)",
+{
+        fHIDE(int i; unsigned int x0; unsigned int x1;)
+        fHIDE(unsigned int y0; unsigned int y1;)
+        fHIDE(unsigned int prod0; unsigned int prod1;)
+
+        x0 = fGETUHALF(0, RsV);
+        x1 = fGETUHALF(1, RsV);
+        y0 = fGETUHALF(0, RtV);
+        y1 = fGETUHALF(1, RtV);
+
+        prod0 = prod1 = 0;
+        for(i=0; i < 16; i++) {
+            if((y0 >> i) & 1) prod0 ^= (x0 << i);
+            if((y1 >> i) & 1) prod1 ^= (x1 << i);
+        }
+        fSETHALF(0,RxxV,fGETUHALF(0,RxxV) ^ fGETUHALF(0,prod0));
+        fSETHALF(1,RxxV,fGETUHALF(1,RxxV) ^ fGETUHALF(0,prod1));
+        fSETHALF(2,RxxV,fGETUHALF(2,RxxV) ^ fGETUHALF(1,prod0));
+        fSETHALF(3,RxxV,fGETUHALF(3,RxxV) ^ fGETUHALF(1,prod1));
+})
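The four polynomial multiplies above are carry-less multiplications: for every set bit of one operand, a shifted copy of the other operand is XORed into the product, so addition is in GF(2) and no carries propagate. A standalone C sketch of the same loop is shown here, together with the halfword interleaving that vpmpyh applies to its two 16x16 products. The names `clmul32` and `vpmpyh_model` are hypothetical.

```c
#include <stdint.h>

/* Carry-less multiply of two 32-bit polynomials over GF(2). */
static uint64_t clmul32(uint32_t x, uint32_t y)
{
    uint64_t wide = x;   /* widen first so the shift cannot drop bits */
    uint64_t prod = 0;

    for (int i = 0; i < 32; i++) {
        if ((y >> i) & 1) {
            prod ^= wide << i;
        }
    }
    return prod;
}

/*
 * Two independent 16x16 carry-less products, interleaved the way the
 * vpmpyh semantics store them: halfwords 0 and 2 hold the low and high
 * halves of product 0, halfwords 1 and 3 hold those of product 1.
 */
static uint64_t vpmpyh_model(uint32_t rs, uint32_t rt)
{
    uint32_t prod0 = (uint32_t)clmul32(rs & 0xffff, rt & 0xffff);
    uint32_t prod1 = (uint32_t)clmul32(rs >> 16, rt >> 16);

    return  (uint64_t)(prod0 & 0xffff)
         | ((uint64_t)(prod1 & 0xffff) << 16)
         | ((uint64_t)(prod0 >> 16) << 32)
         | ((uint64_t)(prod1 >> 16) << 48);
}
```

As a quick check, `clmul32(3, 3)` is 5, because (x + 1)^2 = x^2 + 1 once the 2x term cancels in GF(2). The accumulating forms XOR this result into the destination instead of overwriting it.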
+
+
+/* V70: TINY CORE */
+
+#define CMPY64(TAG,NAME,DESC,OPERAND1,OP,W0,W1,W2,W3) \
+Q6INSN(M7_##TAG,"Rdd32=" NAME "(Rss32," OPERAND1 ")",ATTRIBS(A_RESTRICT_SLOT3ONLY),"Complex Multiply 64-bit " DESC,    { RddV  = (fMPY32SS(fGETWORD(W0, RssV), fGETWORD(W1, RttV)) OP fMPY32SS(fGETWORD(W2, RssV), fGETWORD(W3, RttV)));})\
+Q6INSN(M7_##TAG##_acc,"Rxx32+=" NAME "(Rss32,"OPERAND1")",ATTRIBS(A_RESTRICT_SLOT3ONLY),"Complex Multiply-Accumulate 64-bit " DESC, { RxxV += (fMPY32SS(fGETWORD(W0, RssV), fGETWORD(W1, RttV)) OP fMPY32SS(fGETWORD(W2, RssV), fGETWORD(W3, RttV)));})
+
+CMPY64(dcmpyrw, "cmpyrw","Real","Rtt32" ,-,0,0,1,1)
+CMPY64(dcmpyrwc,"cmpyrw","Real","Rtt32*",+,0,0,1,1)
+CMPY64(dcmpyiw, "cmpyiw","Imag","Rtt32" ,+,0,1,1,0)
+CMPY64(dcmpyiwc,"cmpyiw","Imag","Rtt32*",-,1,0,0,1)
+
+#define CMPY128(TAG, NAME, OPERAND1, WORD0, WORD1, WORD2, WORD3, OP) \
+Q6INSN(M7_##TAG,"Rd32=" NAME "(Rss32,"OPERAND1"):<<1:sat",ATTRIBS(A_RESTRICT_SLOT3ONLY),"Complex Multiply 32-bit result",  \
+{ \
+fHIDE(size16s_t acc128;)\
+fHIDE(size16s_t tmp128;)\
+fHIDE(size8s_t acc64;)\
+tmp128 = fCAST8S_16S(fMPY32SS(fGETWORD(WORD0, RssV), fGETWORD(WORD1, RttV)));\
+acc128 = fCAST8S_16S(fMPY32SS(fGETWORD(WORD2, RssV), fGETWORD(WORD3, RttV)));\
+acc128 = OP(tmp128,acc128);\
+acc128 = fSHIFTR128(acc128, 31);\
+acc64 =  fCAST16S_8S(acc128);\
+RdV = fSATW(acc64);\
+})
+
+
+CMPY128(wcmpyrw, "cmpyrw", "Rtt32", 0, 0, 1, 1, fSUB128)
+CMPY128(wcmpyrwc, "cmpyrw", "Rtt32*", 0, 0, 1, 1, fADD128)
+CMPY128(wcmpyiw, "cmpyiw", "Rtt32", 0, 1, 1, 0, fADD128)
+CMPY128(wcmpyiwc, "cmpyiw", "Rtt32*", 1, 0, 0, 1, fSUB128)
+
+
+#define CMPY128RND(TAG, NAME, OPERAND1, WORD0, WORD1, WORD2, WORD3, OP) \
+Q6INSN(M7_##TAG##_rnd,"Rd32=" NAME "(Rss32,"OPERAND1"):<<1:rnd:sat",ATTRIBS(A_RESTRICT_SLOT3ONLY),"Complex Multiply 32-bit result",  \
+{ \
+fHIDE(size16s_t acc128;)\
+fHIDE(size16s_t tmp128;)\
+fHIDE(size16s_t const128;)\
+fHIDE(size8s_t acc64;)\
+tmp128 = fCAST8S_16S(fMPY32SS(fGETWORD(WORD0, RssV), fGETWORD(WORD1, RttV)));\
+acc128 = fCAST8S_16S(fMPY32SS(fGETWORD(WORD2, RssV), fGETWORD(WORD3, RttV)));\
+const128 = fCAST8S_16S(fCONSTLL(0x40000000));\
+acc128 = OP(tmp128,acc128);\
+acc128 = fADD128(acc128,const128);\
+acc128 = fSHIFTR128(acc128, 31);\
+acc64 =  fCAST16S_8S(acc128);\
+RdV = fSATW(acc64);\
+})
+
+CMPY128RND(wcmpyrw, "cmpyrw", "Rtt32", 0, 0, 1, 1, fSUB128)
+CMPY128RND(wcmpyrwc, "cmpyrw", "Rtt32*", 0, 0, 1, 1, fADD128)
+CMPY128RND(wcmpyiw, "cmpyiw", "Rtt32", 0, 1, 1, 0, fADD128)
+CMPY128RND(wcmpyiwc, "cmpyiw", "Rtt32*", 1, 0, 0, 1, fSUB128)
+
+
+
+
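The CMPY128 and CMPY128RND macros above widen both 32x32 products to 128 bits before combining them. One reason this is needed: in the conjugate form the two products are added, and two products of 2^62 sum to 2^63, which is one past `INT64_MAX`. The sketch below models the real-part instructions under stated assumptions: word 0 is the real part and word 1 the imaginary part, `fSUB128(a, b)` computes a - b, and `fSATW` clamps to 32 bits. It uses `__int128`, a GCC/Clang extension rather than ISO C, and the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Real part of a Q31 complex multiply, modelled on the cmpyrw semantics.
 * conj selects the "Rtt32*" form (products added instead of subtracted);
 * round selects the ":rnd" form (adds 0x40000000 before the shift).
 */
static int32_t cmpyrw_model(int32_t a_re, int32_t a_im,
                            int32_t b_re, int32_t b_im,
                            int conj, int round)
{
    __int128 re_re = (__int128)((int64_t)a_re * b_re);
    __int128 im_im = (__int128)((int64_t)a_im * b_im);
    __int128 acc = conj ? re_re + im_im : re_re - im_im;

    if (round) {
        acc += 0x40000000;      /* half of 2^31, the amount shifted out */
    }
    acc >>= 31;                 /* arithmetic shift on GCC and Clang */

    if (acc > INT32_MAX) return INT32_MAX;
    if (acc < INT32_MIN) return INT32_MIN;
    return (int32_t)acc;
}
```

With all four inputs equal to `INT32_MIN` and `conj` set, the accumulator reaches 2^63 and the result saturates to `INT32_MAX`; the same inputs without `conj` cancel to 0.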
diff --git a/target/hexagon/imported/shift.idef b/target/hexagon/imported/shift.idef
new file mode 100644
index 0000000..485c279
--- /dev/null
+++ b/target/hexagon/imported/shift.idef
@@ -0,0 +1,1067 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * S-type Instructions
+ */
+
+/**********************************************/
+/* SHIFTS                                     */
+/**********************************************/
+
+/* NOTE: Rdd = Rs *right* shifts don't make sense */
+/* NOTE: Rd[d] = Rs[s] *right* shifts with saturation don't make sense */
+
+
+#define RSHIFTTYPES(TAGEND,REGD,REGS,REGSTYPE,ACC,ACCSRC,SAT,SATOPT,ATTRS) \
+Q6INSN(S2_asr_r_##TAGEND,#REGD "32" #ACC "=asr(" #REGS "32,Rt32)" #SATOPT,ATTRIBS(ATTRS), \
+	"Arithmetic Shift Right by Register", \
+	{  \
+                fHIDE(size4s_t) shamt=fSXTN(7,32,RtV);\
+		REGD##V = SAT(ACCSRC ACC fBIDIR_ASHIFTR(REGS##V,shamt,REGSTYPE));  \
+	})\
+\
+Q6INSN(S2_asl_r_##TAGEND,#REGD "32" #ACC "=asl(" #REGS "32,Rt32)" #SATOPT,ATTRIBS(ATTRS), \
+	"Arithmetic Shift Left by Register", \
+	{  \
+                fHIDE(size4s_t) shamt=fSXTN(7,32,RtV);\
+		REGD##V = SAT(ACCSRC ACC fBIDIR_ASHIFTL(REGS##V,shamt,REGSTYPE));  \
+	})\
+\
+Q6INSN(S2_lsr_r_##TAGEND,#REGD "32" #ACC "=lsr(" #REGS "32,Rt32)" #SATOPT,ATTRIBS(ATTRS), \
+	"Logical Shift Right by Register", \
+	{  \
+                fHIDE(size4s_t) shamt=fSXTN(7,32,RtV);\
+		REGD##V = SAT(ACCSRC ACC fBIDIR_LSHIFTR(REGS##V,shamt,REGSTYPE));  \
+	})\
+\
+Q6INSN(S2_lsl_r_##TAGEND,#REGD "32" #ACC "=lsl(" #REGS "32,Rt32)" #SATOPT,ATTRIBS(ATTRS), \
+	"Logical Shift Left by Register", \
+	{  \
+                fHIDE(size4s_t) shamt=fSXTN(7,32,RtV);\
+		REGD##V = SAT(ACCSRC ACC fBIDIR_LSHIFTL(REGS##V,shamt,REGSTYPE));  \
+	})
+
+RSHIFTTYPES(r,Rd,Rs,4_8,,,fECHO,,)
+RSHIFTTYPES(p,Rdd,Rss,8_8,,,fECHO,,)
+RSHIFTTYPES(r_acc,Rx,Rs,4_8,+,RxV,fECHO,,)
+RSHIFTTYPES(p_acc,Rxx,Rss,8_8,+,RxxV,fECHO,,)
+RSHIFTTYPES(r_nac,Rx,Rs,4_8,-,RxV,fECHO,,)
+RSHIFTTYPES(p_nac,Rxx,Rss,8_8,-,RxxV,fECHO,,)
+
+RSHIFTTYPES(r_and,Rx,Rs,4_8,&,RxV,fECHO,,)
+RSHIFTTYPES(r_or,Rx,Rs,4_8,|,RxV,fECHO,,)
+RSHIFTTYPES(p_and,Rxx,Rss,8_8,&,RxxV,fECHO,,)
+RSHIFTTYPES(p_or,Rxx,Rss,8_8,|,RxxV,fECHO,,)
+RSHIFTTYPES(p_xor,Rxx,Rss,8_8,^,RxxV,fECHO,,)
+
+
+#undef RSHIFTTYPES
+
+/* Register shift with saturation */
+#define RSATSHIFTTYPES(TAGEND,REGD,REGS,REGSTYPE) \
+Q6INSN(S2_asr_r_##TAGEND,#REGD "32" "=asr(" #REGS "32,Rt32):sat",ATTRIBS(), \
+	"Arithmetic Shift Right by Register", \
+	{  \
+                fHIDE(size4s_t) shamt=fSXTN(7,32,RtV);\
+		REGD##V = fBIDIR_ASHIFTR_SAT(REGS##V,shamt,REGSTYPE);  \
+	})\
+\
+Q6INSN(S2_asl_r_##TAGEND,#REGD "32" "=asl(" #REGS "32,Rt32):sat",ATTRIBS(), \
+	"Arithmetic Shift Left by Register", \
+	{  \
+                fHIDE(size4s_t) shamt=fSXTN(7,32,RtV);\
+		REGD##V = fBIDIR_ASHIFTL_SAT(REGS##V,shamt,REGSTYPE);  \
+	})
+
+RSATSHIFTTYPES(r_sat,Rd,Rs,4_8)
+
+
+
+
+
+#define ISHIFTTYPES(TAGEND,SIZE,REGD,REGS,REGSTYPE,ACC,ACCSRC,SAT,SATOPT,ATTRS) \
+Q6INSN(S2_asr_i_##TAGEND,#REGD "32" #ACC "=asr(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(ATTRS), \
+	"Arithmetic Shift Right by Immediate", \
+	{ REGD##V = SAT(ACCSRC ACC fASHIFTR(REGS##V,uiV,REGSTYPE)); }) \
+\
+Q6INSN(S2_lsr_i_##TAGEND,#REGD "32" #ACC "=lsr(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(ATTRS), \
+	"Logical Shift Right by Immediate", \
+	{ REGD##V = SAT(ACCSRC ACC fLSHIFTR(REGS##V,uiV,REGSTYPE)); }) \
+\
+Q6INSN(S2_asl_i_##TAGEND,#REGD "32" #ACC "=asl(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(ATTRS), \
+	"Shift Left by Immediate", \
+	{ REGD##V = SAT(ACCSRC ACC fASHIFTL(REGS##V,uiV,REGSTYPE)); }) \
+Q6INSN(S6_rol_i_##TAGEND,#REGD "32" #ACC "=rol(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(ATTRS), \
+	"Rotate Left by Immediate", \
+	{ REGD##V = SAT(ACCSRC ACC fROTL(REGS##V,uiV,REGSTYPE)); })
+
+
+#define ISHIFTTYPES_ONLY_ASL(TAGEND,SIZE,REGD,REGS,REGSTYPE,ACC,ACCSRC,SAT,SATOPT) \
+Q6INSN(S2_asl_i_##TAGEND,#REGD "32" #ACC "=asl(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(), \
+	"Shift Left by Immediate", \
+	{ REGD##V = SAT(ACCSRC ACC fASHIFTL(REGS##V,uiV,REGSTYPE)); })
+
+#define ISHIFTTYPES_ONLY_ASR(TAGEND,SIZE,REGD,REGS,REGSTYPE,ACC,ACCSRC,SAT,SATOPT) \
+Q6INSN(S2_asr_i_##TAGEND,#REGD "32" #ACC "=asr(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(), \
+	"Arithmetic Shift Right by Immediate", \
+	{ REGD##V = SAT(ACCSRC ACC fASHIFTR(REGS##V,uiV,REGSTYPE)); })
+
+
+#define ISHIFTTYPES_NOASR(TAGEND,SIZE,REGD,REGS,REGSTYPE,ACC,ACCSRC,SAT,SATOPT) \
+Q6INSN(S2_lsr_i_##TAGEND,#REGD "32" #ACC "=lsr(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(), \
+	"Logical Shift Right by Immediate", \
+	{ REGD##V = SAT(ACCSRC ACC fLSHIFTR(REGS##V,uiV,REGSTYPE)); }) \
+Q6INSN(S2_asl_i_##TAGEND,#REGD "32" #ACC "=asl(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(), \
+	"Shift Left by Immediate", \
+	{ REGD##V = SAT(ACCSRC ACC fASHIFTL(REGS##V,uiV,REGSTYPE)); }) \
+Q6INSN(S6_rol_i_##TAGEND,#REGD "32" #ACC "=rol(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(), \
+	"Rotate Left by Immediate", \
+	{ REGD##V = SAT(ACCSRC ACC fROTL(REGS##V,uiV,REGSTYPE)); })
+
+
+
+ISHIFTTYPES(r,5,Rd,Rs,4_4,,,fECHO,,)
+ISHIFTTYPES(p,6,Rdd,Rss,8_8,,,fECHO,,)
+ISHIFTTYPES(r_acc,5,Rx,Rs,4_4,+,RxV,fECHO,,)
+ISHIFTTYPES(p_acc,6,Rxx,Rss,8_8,+,RxxV,fECHO,,)
+ISHIFTTYPES(r_nac,5,Rx,Rs,4_4,-,RxV,fECHO,,)
+ISHIFTTYPES(p_nac,6,Rxx,Rss,8_8,-,RxxV,fECHO,,)
+
+ISHIFTTYPES_NOASR(r_xacc,5,Rx,Rs,4_4,^, RxV,fECHO,)
+ISHIFTTYPES_NOASR(p_xacc,6,Rxx,Rss,8_8,^, RxxV,fECHO,)
+
+ISHIFTTYPES(r_and,5,Rx,Rs,4_4,&,RxV,fECHO,,)
+ISHIFTTYPES(r_or,5,Rx,Rs,4_4,|,RxV,fECHO,,)
+ISHIFTTYPES(p_and,6,Rxx,Rss,8_8,&,RxxV,fECHO,,)
+ISHIFTTYPES(p_or,6,Rxx,Rss,8_8,|,RxxV,fECHO,,)
+
+ISHIFTTYPES_ONLY_ASL(r_sat,5,Rd,Rs,4_8,,,fSAT,:sat)
+
+
+Q6INSN(S2_asr_i_r_rnd,"Rd32=asr(Rs32,#u5):rnd",ATTRIBS(),
+       "Shift right with round",
+       { RdV = fASHIFTR(((fASHIFTR(RsV,uiV,4_8))+1),1,8_8); })
+
+
+Q6INSN(S2_asr_i_p_rnd,"Rdd32=asr(Rss32,#u6):rnd",ATTRIBS(), "Shift right with round",
+{ fHIDE(size8u_t tmp;)
+  fHIDE(size8u_t rnd;)
+  tmp = fASHIFTR(RssV,uiV,8_8);
+  rnd = tmp & 1;
+  RddV = fASHIFTR(tmp,1,8_8) + rnd; })
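The two ":rnd" shifts above compute the same thing in slightly different ways: shift right by the immediate, then shift right by one more bit while rounding half up. The 32-bit form adds 1 in a 64-bit intermediate before the last shift; the 64-bit form has no wider type available, so it adds the dropped bit back after the shift to avoid overflow. A small C sketch of both, with hypothetical function names and assuming `>>` on negative signed values is an arithmetic shift (true for GCC and Clang):

```c
#include <assert.h>
#include <stdint.h>

/* Model of "Rd32=asr(Rs32,#u5):rnd": the +1 is done in 64 bits. */
static int32_t asr_rnd32(int32_t rs, unsigned u5)
{
    assert(u5 < 32);
    int64_t t = (int64_t)(rs >> u5) + 1;
    return (int32_t)(t >> 1);
}

/* Model of "Rdd32=asr(Rss32,#u6):rnd": add the dropped bit after shifting. */
static int64_t asr_rnd64(int64_t rss, unsigned u6)
{
    assert(u6 < 64);
    int64_t t = rss >> u6;
    return (t >> 1) + (t & 1);
}
```

Both helpers round ties toward positive infinity: an input of 5 with a zero immediate gives 3, and -5 gives -2.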
+
+
+Q6INSN(S4_lsli,"Rd32=lsl(#s6,Rt32)",ATTRIBS(), "Shift an immediate left by register amount",
+{
+	fHIDE(size4s_t) shamt = fSXTN(7,32,RtV);
+	RdV = fBIDIR_LSHIFTL(siV,shamt,4_8);
+})
+
+
+
+
+Q6INSN(S2_addasl_rrri,"Rd32=addasl(Rt32,Rs32,#u3)",ATTRIBS(),
+	"Shift left by small amount and add",
+	{ RdV = RtV + fASHIFTL(RsV,uiV,4_4); })
+
+
+
+#define SHIFTOPI(TAGEND,INNEROP,INNERSEM)\
+Q6INSN(S4_andi_##TAGEND,"Rx32=and(#u8,"INNEROP")",,"Shift-op",{RxV=fIMMEXT(uiV)&INNERSEM;})\
+Q6INSN(S4_ori_##TAGEND, "Rx32=or(#u8,"INNEROP")",,"Shift-op",{RxV=fIMMEXT(uiV)|INNERSEM;})\
+Q6INSN(S4_addi_##TAGEND,"Rx32=add(#u8,"INNEROP")",,"Shift-op",{RxV=fIMMEXT(uiV)+INNERSEM;})\
+Q6INSN(S4_subi_##TAGEND,"Rx32=sub(#u8,"INNEROP")",,"Shift-op",{RxV=fIMMEXT(uiV)-INNERSEM;})
+
+
+SHIFTOPI(asl_ri,"asl(Rx32,#U5)",(RxV<<UiV))
+SHIFTOPI(lsr_ri,"lsr(Rx32,#U5)",(((unsigned int)RxV)>>UiV))
+
+
+/**********************************************/
+/* PERMUTES                                   */
+/**********************************************/
+Q6INSN(S2_valignib,"Rdd32=valignb(Rtt32,Rss32,#u3)",
+ATTRIBS(), "Vector align bytes",
+{
+  RddV = (fLSHIFTR(RssV,uiV*8,8_8))|(fASHIFTL(RttV,((8-uiV)*8),8_8));
+})
+
+Q6INSN(S2_valignrb,"Rdd32=valignb(Rtt32,Rss32,Pu4)",
+ATTRIBS(), "Align with register",
+{ RddV = fLSHIFTR(RssV,(PuV&0x7)*8,8_8)|(fASHIFTL(RttV,(8-(PuV&0x7))*8,8_8));})
+
+Q6INSN(S2_vspliceib,"Rdd32=vspliceb(Rss32,Rtt32,#u3)",
+ATTRIBS(), "Vector splice bytes",
+{ RddV = fASHIFTL(RttV,uiV*8,8_8) | fZXTN(uiV*8,64,RssV); })
+
+Q6INSN(S2_vsplicerb,"Rdd32=vspliceb(Rss32,Rtt32,Pu4)",
+ATTRIBS(), "Splice with register",
+{ RddV = fASHIFTL(RttV,(PuV&7)*8,8_8) | fZXTN((PuV&7)*8,64,RssV); })
+
+Q6INSN(S2_vsplatrh,"Rdd32=vsplath(Rs32)",
+ATTRIBS(), "Vector splat halfwords from register",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV, fGETHALF(0,RsV));
+        }
+})
+
+
+Q6INSN(S2_vsplatrb,"Rd32=vsplatb(Rs32)",
+ATTRIBS(), "Vector splat bytes from register",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+	  fSETBYTE(i,RdV, fGETBYTE(0,RsV));
+        }
+})
+
+Q6INSN(S6_vsplatrbp,"Rdd32=vsplatb(Rs32)",
+ATTRIBS(), "Vector splat bytes from register",
+{
+        fHIDE(int i;)
+        for (i=0;i<8;i++) {
+	  fSETBYTE(i,RddV, fGETBYTE(0,RsV));
+        }
+})
+
+
+
+/**********************************************/
+/* Insert/Extract[u]                          */
+/**********************************************/
+
+Q6INSN(S2_insert,"Rx32=insert(Rs32,#u5,#U5)",
+ATTRIBS(), "Insert bits",
+{
+        fHIDE(int) width=uiV;
+        fHIDE(int) offset=UiV;
+	/* clear bits in Rx where new bits go */
+	RxV &= ~(((fCONSTLL(1)<<width)-1)<<offset);
+	/* OR in new bits */
+	RxV |= ((RsV & ((fCONSTLL(1)<<width)-1)) << offset);
+})
+
+
+Q6INSN(S2_tableidxb,"Rx32=tableidxb(Rs32,#u4,#S6):raw",
+ATTRIBS(A_ARCHV2), "Extract and insert bits",
+{
+        fHIDE(int) width=uiV;
+        fHIDE(int) offset=SiV;
+        fHIDE(int) field = fEXTRACTU_BIDIR(RsV,width,offset);
+        fINSERT_BITS(RxV,width,0,field);
+})
+
+Q6INSN(S2_tableidxh,"Rx32=tableidxh(Rs32,#u4,#S6):raw",
+ATTRIBS(A_ARCHV2), "Extract and insert bits",
+{
+        fHIDE(int) width=uiV;
+        fHIDE(int) offset=SiV+1;
+        fHIDE(int) field = fEXTRACTU_BIDIR(RsV,width,offset);
+        fINSERT_BITS(RxV,width,1,field);
+})
+
+Q6INSN(S2_tableidxw,"Rx32=tableidxw(Rs32,#u4,#S6):raw",
+ATTRIBS(A_ARCHV2), "Extract and insert bits",
+{
+        fHIDE(int) width=uiV;
+        fHIDE(int) offset=SiV+2;
+        fHIDE(int) field = fEXTRACTU_BIDIR(RsV,width,offset);
+        fINSERT_BITS(RxV,width,2,field);
+})
+
+Q6INSN(S2_tableidxd,"Rx32=tableidxd(Rs32,#u4,#S6):raw",
+ATTRIBS(A_ARCHV2), "Extract and insert bits",
+{
+        fHIDE(int) width=uiV;
+        fHIDE(int) offset=SiV+3;
+        fHIDE(int) field = fEXTRACTU_BIDIR(RsV,width,offset);
+        fINSERT_BITS(RxV,width,3,field);
+})
+
+
+Q6INSN(A4_bitspliti,"Rdd32=bitsplit(Rs32,#u5)",
+ATTRIBS(), "Split a bitfield into two registers",
+{
+        fSETWORD(1,RddV,(fCAST4_4u(RsV)>>uiV));
+        fSETWORD(0,RddV,fZXTN(uiV,32,RsV));
+})
+
+Q6INSN(A4_bitsplit,"Rdd32=bitsplit(Rs32,Rt32)",
+ATTRIBS(), "Split a bitfield into two registers",
+{
+		fHIDE(size4u_t) shamt = fZXTN(5,32,RtV);
+        fSETWORD(1,RddV,(fCAST4_4u(RsV)>>shamt));
+        fSETWORD(0,RddV,fZXTN(shamt,32,RsV));
+})
+
+
+
+
+Q6INSN(S4_extract,"Rd32=extract(Rs32,#u5,#U5)",
+ATTRIBS(), "Extract signed bitfield",
+{
+        fHIDE(int) width=uiV;
+        fHIDE(int) offset=UiV;
+		RdV = fSXTN(width,32,(fCAST4_4u(RsV) >> offset));
+})
+
+
+Q6INSN(S2_extractu,"Rd32=extractu(Rs32,#u5,#U5)",
+ATTRIBS(), "Extract unsigned bitfield",
+{
+        fHIDE(int) width=uiV;
+        fHIDE(int) offset=UiV;
+		RdV = fZXTN(width,32,(fCAST4_4u(RsV) >> offset));
+})
+
+Q6INSN(S2_insertp,"Rxx32=insert(Rss32,#u6,#U6)",
+ATTRIBS(), "Insert bits",
+{
+        fHIDE(int) width=uiV;
+      	fHIDE(int) offset=UiV;
+		/* clear bits in Rxx where new bits go */
+		RxxV &= ~(((fCONSTLL(1)<<width)-1)<<offset);
+		/* OR in new bits */
+		RxxV |= ((RssV & ((fCONSTLL(1)<<width)-1)) << offset);
+})
+
+
+Q6INSN(S4_extractp,"Rdd32=extract(Rss32,#u6,#U6)",
+ATTRIBS(), "Extract signed bitfield",
+{
+        fHIDE(int) width=uiV;
+        fHIDE(int) offset=UiV;
+	RddV = fSXTN(width,64,(fCAST8_8u(RssV) >> offset));
+})
+
+
+Q6INSN(S2_extractup,"Rdd32=extractu(Rss32,#u6,#U6)",
+ATTRIBS(), "Extract unsigned bitfield",
+{
+        fHIDE(int) width=uiV;
+        fHIDE(int) offset=UiV;
+	RddV = fZXTN(width,64,(fCAST8_8u(RssV) >> offset));
+})
+
+
+
+
+Q6INSN(S2_mask,"Rd32=mask(#u5,#U5)",
+ATTRIBS(), "Form mask from immediate",
+{
+    RdV = ((1<<uiV)-1) << UiV;
+})
+
+
+
+
+
+Q6INSN(S2_insert_rp,"Rx32=insert(Rs32,Rtt32)",
+ATTRIBS(), "Insert bits",
+{
+        fHIDE(int) width=fZXTN(6,32,(fGETWORD(1,RttV)));
+        fHIDE(int) offset=fSXTN(7,32,(fGETWORD(0,RttV)));
+	fHIDE(size8u_t) mask = ((fCONSTLL(1)<<width)-1);
+	if (offset < 0) {
+		RxV = 0;
+	} else {
+		/* clear bits in Rx where new bits go */
+		RxV &= ~(mask<<offset);
+		/* OR in new bits */
+		RxV |= ((RsV & mask) << offset);
+	}
+})
+
+
+Q6INSN(S4_extract_rp,"Rd32=extract(Rs32,Rtt32)",
+ATTRIBS(), "Extract signed bitfield",
+{
+        fHIDE(int) width=fZXTN(6,32,(fGETWORD(1,RttV)));
+        fHIDE(int) offset=fSXTN(7,32,(fGETWORD(0,RttV)));
+	RdV = fSXTN(width,64,fBIDIR_LSHIFTR(fCAST4_8u(RsV),offset,4_8));
+})
+
+
+
+Q6INSN(S2_extractu_rp,"Rd32=extractu(Rs32,Rtt32)",
+ATTRIBS(), "Extract unsigned bitfield",
+{
+        fHIDE(int) width=fZXTN(6,32,(fGETWORD(1,RttV)));
+        fHIDE(int) offset=fSXTN(7,32,(fGETWORD(0,RttV)));
+	RdV = fZXTN(width,64,fBIDIR_LSHIFTR(fCAST4_8u(RsV),offset,4_8));
+})
+
+Q6INSN(S2_insertp_rp,"Rxx32=insert(Rss32,Rtt32)",
+ATTRIBS(), "Insert bits",
+{
+        fHIDE(int) width=fZXTN(6,32,(fGETWORD(1,RttV)));
+        fHIDE(int) offset=fSXTN(7,32,(fGETWORD(0,RttV)));
+	fHIDE(size8u_t) mask = ((fCONSTLL(1)<<width)-1);
+	if (offset < 0) {
+		RxxV = 0;
+	} else {
+		/* clear bits in Rxx where new bits go */
+		RxxV &= ~(mask<<offset);
+		/* OR in new bits */
+		RxxV |= ((RssV & mask) << offset);
+	}
+})
+
+
+Q6INSN(S4_extractp_rp,"Rdd32=extract(Rss32,Rtt32)",
+ATTRIBS(), "Extract signed bitfield",
+{
+        fHIDE(int) width=fZXTN(6,32,(fGETWORD(1,RttV)));
+        fHIDE(int) offset=fSXTN(7,32,(fGETWORD(0,RttV)));
+	RddV = fSXTN(width,64,fBIDIR_LSHIFTR(fCAST8_8u(RssV),offset,8_8));
+})
+
+
+Q6INSN(S2_extractup_rp,"Rdd32=extractu(Rss32,Rtt32)",
+ATTRIBS(), "Extract unsigned bitfield",
+{
+        fHIDE(int) width=fZXTN(6,32,(fGETWORD(1,RttV)));
+        fHIDE(int) offset=fSXTN(7,32,(fGETWORD(0,RttV)));
+	RddV = fZXTN(width,64,fBIDIR_LSHIFTR(fCAST8_8u(RssV),offset,8_8));
+})
+
+/**********************************************/
+/* tstbit/setbit/clrbit                       */
+/**********************************************/
+
+Q6INSN(S2_tstbit_i,"Pd4=tstbit(Rs32,#u5)",
+ATTRIBS(), "Test a bit",
+{
+	PdV = f8BITSOF((RsV & (1<<uiV)) != 0);
+})
+
+Q6INSN(S4_ntstbit_i,"Pd4=!tstbit(Rs32,#u5)",
+ATTRIBS(), "Test a bit",
+{
+	PdV = f8BITSOF((RsV & (1<<uiV)) == 0);
+})
+
+Q6INSN(S2_setbit_i,"Rd32=setbit(Rs32,#u5)",
+ATTRIBS(), "Set a bit",
+{
+	RdV = (RsV | (1<<uiV));
+})
+
+Q6INSN(S2_togglebit_i,"Rd32=togglebit(Rs32,#u5)",
+ATTRIBS(), "Toggle a bit",
+{
+	RdV = (RsV ^ (1<<uiV));
+})
+
+Q6INSN(S2_clrbit_i,"Rd32=clrbit(Rs32,#u5)",
+ATTRIBS(), "Clear a bit",
+{
+	RdV = (RsV & (~(1<<uiV)));
+})
+
+
+
+/* using a register */
+Q6INSN(S2_tstbit_r,"Pd4=tstbit(Rs32,Rt32)",
+ATTRIBS(), "Test a bit",
+{
+	PdV = f8BITSOF((fCAST4_8u(RsV) & fBIDIR_LSHIFTL(1,fSXTN(7,32,RtV),4_8)) != 0);
+})
+
+Q6INSN(S4_ntstbit_r,"Pd4=!tstbit(Rs32,Rt32)",
+ATTRIBS(), "Test a bit",
+{
+	PdV = f8BITSOF((fCAST4_8u(RsV) & fBIDIR_LSHIFTL(1,fSXTN(7,32,RtV),4_8)) == 0);
+})
+
+Q6INSN(S2_setbit_r,"Rd32=setbit(Rs32,Rt32)",
+ATTRIBS(), "Set a bit",
+{
+	RdV = (RsV | fBIDIR_LSHIFTL(1,fSXTN(7,32,RtV),4_8));
+})
+
+Q6INSN(S2_togglebit_r,"Rd32=togglebit(Rs32,Rt32)",
+ATTRIBS(), "Toggle a bit",
+{
+	RdV = (RsV ^ fBIDIR_LSHIFTL(1,fSXTN(7,32,RtV),4_8));
+})
+
+Q6INSN(S2_clrbit_r,"Rd32=clrbit(Rs32,Rt32)",
+ATTRIBS(), "Clear a bit",
+{
+	RdV = (RsV & (~(fBIDIR_LSHIFTL(1,fSXTN(7,32,RtV),4_8))));
+})
+
+
+/**********************************************/
+/* vector shifting                            */
+/**********************************************/
+
+/* Half Vector Immediate Shifts */
+
+Q6INSN(S2_asr_i_vh,"Rdd32=vasrh(Rss32,#u4)",ATTRIBS(),
+	"Vector Arithmetic Shift Right by Immediate",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV, (fGETHALF(i,RssV)>>uiV));
+        }
+})
+
+
+Q6INSN(S2_lsr_i_vh,"Rdd32=vlsrh(Rss32,#u4)",ATTRIBS(),
+	"Vector Logical Shift Right by Immediate",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV, (fGETUHALF(i,RssV)>>uiV));
+        }
+})
+
+Q6INSN(S2_asl_i_vh,"Rdd32=vaslh(Rss32,#u4)",ATTRIBS(),
+	"Vector Arithmetic Shift Left by Immediate",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV, (fGETHALF(i,RssV)<<uiV));
+        }
+})
+
+/* Half Vector Register Shifts */
+
+Q6INSN(S2_asr_r_vh,"Rdd32=vasrh(Rss32,Rt32)",ATTRIBS(),
+	"Vector Arithmetic Shift Right by Register",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV, fBIDIR_ASHIFTR(fGETHALF(i,RssV),fSXTN(7,32,RtV),2_8));
+        }
+})
+
+Q6INSN(S5_asrhub_rnd_sat,"Rd32=vasrhub(Rss32,#u4):raw",,
+	"Vector Arithmetic Shift Right by Immediate with Round, Saturate, and Pack",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+			fSETBYTE(i,RdV, fSATUB( ((fGETHALF(i,RssV) >> uiV )+1)>>1  ));
+        }
+})
+
+Q6INSN(S5_asrhub_sat,"Rd32=vasrhub(Rss32,#u4):sat",,
+	"Vector Arithmetic Shift Right by Immediate with Saturate and Pack",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+			fSETBYTE(i,RdV, fSATUB( fGETHALF(i,RssV) >> uiV ));
+        }
+})
+
+
+
+Q6INSN(S5_vasrhrnd,"Rdd32=vasrh(Rss32,#u4):raw",,
+	"Vector Arithmetic Shift Right by Immediate with Round",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+			fSETHALF(i,RddV, ( ((fGETHALF(i,RssV) >> uiV)+1)>>1  ));
+        }
+})
+
+
+Q6INSN(S2_asl_r_vh,"Rdd32=vaslh(Rss32,Rt32)",ATTRIBS(),
+	"Vector Arithmetic Shift Left by Register",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV, fBIDIR_ASHIFTL(fGETHALF(i,RssV),fSXTN(7,32,RtV),2_8));
+        }
+})
+
+
+
+Q6INSN(S2_lsr_r_vh,"Rdd32=vlsrh(Rss32,Rt32)",ATTRIBS(),
+	"Vector Logical Shift Right by Register",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV, fBIDIR_LSHIFTR(fGETUHALF(i,RssV),fSXTN(7,32,RtV),2_8));
+        }
+})
+
+
+Q6INSN(S2_lsl_r_vh,"Rdd32=vlslh(Rss32,Rt32)",ATTRIBS(),
+	"Vector Logical Shift Left by Register",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+	  fSETHALF(i,RddV, fBIDIR_LSHIFTL(fGETUHALF(i,RssV),fSXTN(7,32,RtV),2_8));
+        }
+})
+
+
+
+
+/* Word Vector Immediate Shifts */
+
+Q6INSN(S2_asr_i_vw,"Rdd32=vasrw(Rss32,#u5)",ATTRIBS(),
+	"Vector Arithmetic Shift Right by Immediate",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+	  fSETWORD(i,RddV,(fGETWORD(i,RssV)>>uiV));
+        }
+})
+
+
+
+Q6INSN(S2_asr_i_svw_trun,"Rd32=vasrw(Rss32,#u5)",ATTRIBS(A_ARCHV2),
+	"Vector Arithmetic Shift Right by Immediate with Truncate and Pack",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+	  fSETHALF(i,RdV,fGETHALF(0,(fGETWORD(i,RssV)>>uiV)));
+        }
+})
+
+Q6INSN(S2_asr_r_svw_trun,"Rd32=vasrw(Rss32,Rt32)",ATTRIBS(A_ARCHV2),
+	"Vector Arithmetic Shift Right by Register with Truncate and Pack",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+ 	  fSETHALF(i,RdV,fGETHALF(0,fBIDIR_ASHIFTR(fGETWORD(i,RssV),fSXTN(7,32,RtV),4_8)));
+        }
+})
+
+
+Q6INSN(S2_lsr_i_vw,"Rdd32=vlsrw(Rss32,#u5)",ATTRIBS(),
+	"Vector Logical Shift Right by Immediate",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+	  fSETWORD(i,RddV,(fGETUWORD(i,RssV)>>uiV));
+        }
+})
+
+Q6INSN(S2_asl_i_vw,"Rdd32=vaslw(Rss32,#u5)",ATTRIBS(),
+	"Vector Arithmetic Shift Left by Immediate",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+	  fSETWORD(i,RddV,(fGETWORD(i,RssV)<<uiV));
+        }
+})
+
+/* Word Vector Register Shifts */
+
+Q6INSN(S2_asr_r_vw,"Rdd32=vasrw(Rss32,Rt32)",ATTRIBS(),
+	"Vector Arithmetic Shift Right by Register",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+ 	  fSETWORD(i,RddV, fBIDIR_ASHIFTR(fGETWORD(i,RssV),fSXTN(7,32,RtV),4_8));
+        }
+})
+
+
+
+Q6INSN(S2_asl_r_vw,"Rdd32=vaslw(Rss32,Rt32)",ATTRIBS(),
+	"Vector Arithmetic Shift Left by Register",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+ 	  fSETWORD(i,RddV, fBIDIR_ASHIFTL(fGETWORD(i,RssV),fSXTN(7,32,RtV),4_8));
+        }
+})
+
+
+Q6INSN(S2_lsr_r_vw,"Rdd32=vlsrw(Rss32,Rt32)",ATTRIBS(),
+	"Vector Logical Shift Right by Register",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+ 	  fSETWORD(i,RddV, fBIDIR_LSHIFTR(fGETUWORD(i,RssV),fSXTN(7,32,RtV),4_8));
+        }
+})
+
+
+
+Q6INSN(S2_lsl_r_vw,"Rdd32=vlslw(Rss32,Rt32)",ATTRIBS(),
+	"Vector Logical Shift Left by Register",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+ 	  fSETWORD(i,RddV, fBIDIR_LSHIFTL(fGETUWORD(i,RssV),fSXTN(7,32,RtV),4_8));
+        }
+})
+
+
+
+
+
+/**********************************************/
+/* Vector SXT/ZXT/SAT/TRUN/RNDPACK            */
+/**********************************************/
+
+
+Q6INSN(S2_vrndpackwh,"Rd32=vrndwh(Rss32)",ATTRIBS(),
+"Round and Pack vector of words to Halfwords",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+	  fSETHALF(i,RdV,fGETHALF(1,(fGETWORD(i,RssV)+0x08000)));
+        }
+})
+
+Q6INSN(S2_vrndpackwhs,"Rd32=vrndwh(Rss32):sat",ATTRIBS(),
+"Round and Pack vector of words to Halfwords with saturation",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+	  fSETHALF(i,RdV,fGETHALF(1,fSAT(fGETWORD(i,RssV)+0x08000)));
+        }
+})
+
+Q6INSN(S2_vsxtbh,"Rdd32=vsxtbh(Rs32)",ATTRIBS(A_ARCHV2),
+"Vector sign extend byte to half",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+  	  fSETHALF(i,RddV,fGETBYTE(i,RsV));
+        }
+})
+
+Q6INSN(S2_vzxtbh,"Rdd32=vzxtbh(Rs32)",ATTRIBS(),
+"Vector zero extend byte to half",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+  	  fSETHALF(i,RddV,fGETUBYTE(i,RsV));
+        }
+})
+
+Q6INSN(S2_vsathub,"Rd32=vsathub(Rss32)",ATTRIBS(),
+"Vector saturate half to unsigned byte",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+  	  fSETBYTE(i,RdV,fSATUN(8,fGETHALF(i,RssV)));
+        }
+})
+
+
+
+
+
+Q6INSN(S2_svsathub,"Rd32=vsathub(Rs32)",ATTRIBS(A_ARCHV2),
+"Vector saturate half to unsigned byte",
+{
+	fSETBYTE(0,RdV,fSATUN(8,fGETHALF(0,RsV)));
+  	fSETBYTE(1,RdV,fSATUN(8,fGETHALF(1,RsV)));
+  	fSETBYTE(2,RdV,0);
+  	fSETBYTE(3,RdV,0);
+})
+
+Q6INSN(S2_svsathb,"Rd32=vsathb(Rs32)",ATTRIBS(A_ARCHV2),
+"Vector saturate half to signed byte",
+{
+	fSETBYTE(0,RdV,fSATN(8,fGETHALF(0,RsV)));
+  	fSETBYTE(1,RdV,fSATN(8,fGETHALF(1,RsV)));
+  	fSETBYTE(2,RdV,0);
+  	fSETBYTE(3,RdV,0);
+})
+
+
+Q6INSN(S2_vsathb,"Rd32=vsathb(Rss32)",ATTRIBS(A_ARCHV2),
+"Vector saturate half to signed byte",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+  	  fSETBYTE(i,RdV,fSATN(8,fGETHALF(i,RssV)));
+        }
+})
+
+Q6INSN(S2_vtrunohb,"Rd32=vtrunohb(Rss32)",ATTRIBS(A_ARCHV2),
+"Vector truncate half to byte: take high",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+  	  fSETBYTE(i,RdV,fGETBYTE(i*2+1,RssV));
+        }
+})
+
+Q6INSN(S2_vtrunewh,"Rdd32=vtrunewh(Rss32,Rtt32)",ATTRIBS(A_ARCHV2),
+"Vector truncate word to half: take low",
+{
+	fSETHALF(0,RddV,fGETHALF(0,RttV));
+	fSETHALF(1,RddV,fGETHALF(2,RttV));
+	fSETHALF(2,RddV,fGETHALF(0,RssV));
+	fSETHALF(3,RddV,fGETHALF(2,RssV));
+})
+
+Q6INSN(S2_vtrunowh,"Rdd32=vtrunowh(Rss32,Rtt32)",ATTRIBS(A_ARCHV2),
+"Vector truncate word to half: take high",
+{
+	fSETHALF(0,RddV,fGETHALF(1,RttV));
+	fSETHALF(1,RddV,fGETHALF(3,RttV));
+	fSETHALF(2,RddV,fGETHALF(1,RssV));
+	fSETHALF(3,RddV,fGETHALF(3,RssV));
+})
+
+
+Q6INSN(S2_vtrunehb,"Rd32=vtrunehb(Rss32)",ATTRIBS(),
+"Vector truncate half to byte: take low",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+  	  fSETBYTE(i,RdV,fGETBYTE(i*2,RssV));
+        }
+})
+
+Q6INSN(S6_vtrunehb_ppp,"Rdd32=vtrunehb(Rss32,Rtt32)",ATTRIBS(),
+"Vector truncate half to byte: take low",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+  	  		fSETBYTE(i,RddV,fGETBYTE(i*2,RttV));
+			fSETBYTE(i+4,RddV,fGETBYTE(i*2,RssV));
+        }
+})
+
+Q6INSN(S6_vtrunohb_ppp,"Rdd32=vtrunohb(Rss32,Rtt32)",ATTRIBS(),
+"Vector truncate half to byte: take high",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+  	  		fSETBYTE(i,RddV,fGETBYTE(i*2+1,RttV));
+  	  		fSETBYTE(i+4,RddV,fGETBYTE(i*2+1,RssV));
+        }
+})
+
+Q6INSN(S2_vsxthw,"Rdd32=vsxthw(Rs32)",ATTRIBS(),
+"Vector sign extend half to word",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+  	  fSETWORD(i,RddV,fGETHALF(i,RsV));
+        }
+})
+
+Q6INSN(S2_vzxthw,"Rdd32=vzxthw(Rs32)",ATTRIBS(),
+"Vector zero extend half to word",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+  	  fSETWORD(i,RddV,fGETUHALF(i,RsV));
+        }
+})
+
+
+Q6INSN(S2_vsatwh,"Rd32=vsatwh(Rss32)",ATTRIBS(),
+"Vector saturate word to signed half",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+  	  fSETHALF(i,RdV,fSATN(16,fGETWORD(i,RssV)));
+        }
+})
+
+Q6INSN(S2_vsatwuh,"Rd32=vsatwuh(Rss32)",ATTRIBS(),
+"Vector saturate word to unsigned half",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+  	  fSETHALF(i,RdV,fSATUN(16,fGETWORD(i,RssV)));
+        }
+})
+
+/* Other misc insns of this type */
+
+Q6INSN(S2_packhl,"Rdd32=packhl(Rs32,Rt32)",ATTRIBS(),
+"Pack high halfwords and low halfwords together",
+{
+    fSETHALF(0,RddV,fGETHALF(0,RtV));
+    fSETHALF(1,RddV,fGETHALF(0,RsV));
+    fSETHALF(2,RddV,fGETHALF(1,RtV));
+    fSETHALF(3,RddV,fGETHALF(1,RsV));
+})
+
+Q6INSN(A2_swiz,"Rd32=swiz(Rs32)",ATTRIBS(A_ARCHV2),
+"Endian swap the bytes of Rs",
+{
+    fSETBYTE(0,RdV,fGETBYTE(3,RsV));
+    fSETBYTE(1,RdV,fGETBYTE(2,RsV));
+    fSETBYTE(2,RdV,fGETBYTE(1,RsV));
+    fSETBYTE(3,RdV,fGETBYTE(0,RsV));
+})
+
+
+
+/* Vector Sat without Packing */
+Q6INSN(S2_vsathub_nopack,"Rdd32=vsathub(Rss32)",ATTRIBS(),
+"Vector saturate half to unsigned byte without pack",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+  	  fSETHALF(i,RddV,fSATUN(8,fGETHALF(i,RssV)));
+        }
+})
+
+Q6INSN(S2_vsathb_nopack,"Rdd32=vsathb(Rss32)",ATTRIBS(A_ARCHV2),
+"Vector saturate half to signed byte without pack",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+  	  fSETHALF(i,RddV,fSATN(8,fGETHALF(i,RssV)));
+        }
+})
+
+Q6INSN(S2_vsatwh_nopack,"Rdd32=vsatwh(Rss32)",ATTRIBS(),
+"Vector saturate word to signed half without pack",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+  	  fSETWORD(i,RddV,fSATN(16,fGETWORD(i,RssV)));
+        }
+})
+
+Q6INSN(S2_vsatwuh_nopack,"Rdd32=vsatwuh(Rss32)",ATTRIBS(),
+"Vector saturate word to unsigned half without pack",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+  	  fSETWORD(i,RddV,fSATUN(16,fGETWORD(i,RssV)));
+        }
+})
+
+
+/**********************************************/
+/* Shuffle                                    */
+/**********************************************/
+
+
+Q6INSN(S2_shuffob,"Rdd32=shuffob(Rtt32,Rss32)",ATTRIBS(),
+"Shuffle high bytes together",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+	  fSETBYTE(i*2  ,RddV,fGETBYTE(i*2+1,RssV));
+	  fSETBYTE(i*2+1,RddV,fGETBYTE(i*2+1,RttV));
+        }
+})
+
+Q6INSN(S2_shuffeb,"Rdd32=shuffeb(Rss32,Rtt32)",ATTRIBS(),
+"Shuffle low bytes together",
+{
+        fHIDE(int i;)
+        for (i=0;i<4;i++) {
+	  fSETBYTE(i*2  ,RddV,fGETBYTE(i*2,RttV));
+	  fSETBYTE(i*2+1,RddV,fGETBYTE(i*2,RssV));
+        }
+})
+
+Q6INSN(S2_shuffoh,"Rdd32=shuffoh(Rtt32,Rss32)",ATTRIBS(),
+"Shuffle high halves together",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+	  fSETHALF(i*2  ,RddV,fGETHALF(i*2+1,RssV));
+	  fSETHALF(i*2+1,RddV,fGETHALF(i*2+1,RttV));
+        }
+})
+
+Q6INSN(S2_shuffeh,"Rdd32=shuffeh(Rss32,Rtt32)",ATTRIBS(),
+"Shuffle low halves together",
+{
+        fHIDE(int i;)
+        for (i=0;i<2;i++) {
+	  fSETHALF(i*2  ,RddV,fGETHALF(i*2,RttV));
+	  fSETHALF(i*2+1,RddV,fGETHALF(i*2,RssV));
+        }
+})
+
+
+/**********************************************/
+/* Strange bit instructions                   */
+/**********************************************/
+
+Q6INSN(S5_popcountp,"Rd32=popcount(Rss32)",ATTRIBS(),
+"Population Count", { RdV = fCOUNTONES_8(RssV); })
+
+Q6INSN(S4_parity,"Rd32=parity(Rs32,Rt32)",,
+"Parity of Masked Value", { RdV = 1&fCOUNTONES_4(RsV & RtV); })
+
+Q6INSN(S2_parityp,"Rd32=parity(Rss32,Rtt32)",ATTRIBS(A_ARCHV2),
+"Parity of Masked Value", { RdV = 1&fCOUNTONES_8(RssV & RttV); })
+
+Q6INSN(S2_lfsp,"Rdd32=lfs(Rss32,Rtt32)",ATTRIBS(A_ARCHV2),
+"Linear feedback shift", { RddV = (fCAST8u(RssV) >> 1) | (fCAST8u((1&fCOUNTONES_8(RssV & RttV)))<<63) ; })
+
+Q6INSN(S2_clbnorm,"Rd32=normamt(Rs32)",ATTRIBS(A_ARCHV2),
+"Count leading sign bits - 1", { if (RsV == 0) { RdV = 0; } else { RdV = (fMAX(fCL1_4(RsV),fCL1_4(~RsV)))-1;} })
+
+Q6INSN(S4_clbaddi,"Rd32=add(clb(Rs32),#s6)",ATTRIBS(A_ARCHV2),
+"Count leading sign bits then add signed number",
+{ RdV = (fMAX(fCL1_4(RsV),fCL1_4(~RsV)))+siV;} )
+
+Q6INSN(S4_clbpnorm,"Rd32=normamt(Rss32)",ATTRIBS(A_ARCHV2),
+"Count leading sign bits - 1", { if (RssV == 0) { RdV = 0; }
+else { RdV = (fMAX(fCL1_8(RssV),fCL1_8(~RssV)))-1;}})
+
+Q6INSN(S4_clbpaddi,"Rd32=add(clb(Rss32),#s6)",ATTRIBS(A_ARCHV2),
+"Count leading sign bits then add signed number",
+{ RdV = (fMAX(fCL1_8(RssV),fCL1_8(~RssV)))+siV;})
+
+
+Q6INSN(S2_clb,"Rd32=clb(Rs32)",ATTRIBS(),
+"Count leading bits", {RdV = fMAX(fCL1_4(RsV),fCL1_4(~RsV));})
+
+
+Q6INSN(S2_cl0,"Rd32=cl0(Rs32)",ATTRIBS(),
+"Count leading zeros", {RdV = fCL1_4(~RsV);})
+
+Q6INSN(S2_cl1,"Rd32=cl1(Rs32)",ATTRIBS(),
+"Count leading ones", {RdV = fCL1_4(RsV);})
+
+Q6INSN(S2_clbp,"Rd32=clb(Rss32)",ATTRIBS(),
+"Count leading bits", {RdV = fMAX(fCL1_8(RssV),fCL1_8(~RssV));})
+
+Q6INSN(S2_cl0p,"Rd32=cl0(Rss32)",ATTRIBS(),
+"Count leading zeros", {RdV = fCL1_8(~RssV);})
+
+Q6INSN(S2_cl1p,"Rd32=cl1(Rss32)",ATTRIBS(),
+"Count leading ones", {RdV = fCL1_8(RssV);})
+
+
+
+
+Q6INSN(S2_brev,	"Rd32=brev(Rs32)",   ATTRIBS(A_ARCHV2), "Bit Reverse",{RdV = fBREV_4(RsV);})
+Q6INSN(S2_brevp,"Rdd32=brev(Rss32)", ATTRIBS(), "Bit Reverse",{RddV = fBREV_8(RssV);})
+Q6INSN(S2_ct0,  "Rd32=ct0(Rs32)",    ATTRIBS(A_ARCHV2), "Count Trailing Zeros",{RdV = fCL1_4(~fBREV_4(RsV));})
+Q6INSN(S2_ct1,  "Rd32=ct1(Rs32)",    ATTRIBS(A_ARCHV2), "Count Trailing Ones",{RdV = fCL1_4(fBREV_4(RsV));})
+Q6INSN(S2_ct0p, "Rd32=ct0(Rss32)",   ATTRIBS(), "Count Trailing Zeros",{RdV = fCL1_8(~fBREV_8(RssV));})
+Q6INSN(S2_ct1p, "Rd32=ct1(Rss32)",   ATTRIBS(), "Count Trailing Ones",{RdV = fCL1_8(fBREV_8(RssV));})
+
+
+Q6INSN(S2_interleave,"Rdd32=interleave(Rss32)",ATTRIBS(A_ARCHV2),"Interleave bits",
+{RddV = fINTERLEAVE(fGETWORD(1,RssV),fGETWORD(0,RssV));})
+
+Q6INSN(S2_deinterleave,"Rdd32=deinterleave(Rss32)",ATTRIBS(A_ARCHV2),"Deinterleave bits",
+{RddV = fDEINTERLEAVE(RssV);})
+
diff --git a/target/hexagon/imported/subinsns.idef b/target/hexagon/imported/subinsns.idef
new file mode 100644
index 0000000..3bab12f
--- /dev/null
+++ b/target/hexagon/imported/subinsns.idef
@@ -0,0 +1,152 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * sub-instructions
+ */
+
+
+
+/*****************************************************************/
+/*                                                               */
+/*                       A-type subinsns                         */
+/*                                                               */
+/*****************************************************************/
+
+Q6INSN(SA1_addi,     "Rx16=add(Rx16,#s7)",    ATTRIBS(A_SUBINSN),"Add",        { fIMMEXT(siV); RxV=RxV+siV;})
+Q6INSN(SA1_tfr,      "Rd16=Rs16",             ATTRIBS(A_SUBINSN),"Tfr",        { RdV=RsV;})
+Q6INSN(SA1_seti,     "Rd16=#u6",              ATTRIBS(A_SUBINSN),"Set immed",  { fIMMEXT(uiV); RdV=uiV;})
+Q6INSN(SA1_setin1,   "Rd16=#-1",              ATTRIBS(A_SUBINSN),"Set to -1",  { RdV=-1;})
+Q6INSN(SA1_clrtnew,  "if (p0.new) Rd16=#0",   ATTRIBS(A_SUBINSN),"clear if true", { if (fLSBNEW0) {RdV=0;} else {CANCEL;} })
+Q6INSN(SA1_clrfnew,  "if (!p0.new) Rd16=#0",  ATTRIBS(A_SUBINSN),"clear if false",{ if (fLSBNEW0NOT) {RdV=0;} else {CANCEL;} })
+Q6INSN(SA1_clrt,     "if (p0) Rd16=#0",       ATTRIBS(A_SUBINSN),"clear if true", { if (fLSBOLD(fREAD_P0())) {RdV=0;} else {CANCEL;} })
+Q6INSN(SA1_clrf,     "if (!p0) Rd16=#0",      ATTRIBS(A_SUBINSN),"clear if false",{ if (fLSBOLDNOT(fREAD_P0())) {RdV=0;} else {CANCEL;} })
+
+Q6INSN(SA1_addsp,    "Rd16=add(r29,#u6:2)",   ATTRIBS(A_SUBINSN),"Add",        { RdV=fREAD_SP()+uiV; })
+Q6INSN(SA1_inc,      "Rd16=add(Rs16,#1)",     ATTRIBS(A_SUBINSN),"Inc",        { RdV=RsV+1;})
+Q6INSN(SA1_dec,      "Rd16=add(Rs16,#-1)",    ATTRIBS(A_SUBINSN),"Dec",        { RdV=RsV-1;})
+Q6INSN(SA1_addrx,    "Rx16=add(Rx16,Rs16)",   ATTRIBS(A_SUBINSN),"Add",        { RxV=RxV+RsV; })
+Q6INSN(SA1_zxtb,     "Rd16=and(Rs16,#255)",   ATTRIBS(A_SUBINSN),"Zxtb",       { RdV= fZXTN(8,32,RsV);})
+Q6INSN(SA1_and1,     "Rd16=and(Rs16,#1)",     ATTRIBS(A_SUBINSN),"And #1",     { RdV= RsV&1;})
+Q6INSN(SA1_sxtb,     "Rd16=sxtb(Rs16)",       ATTRIBS(A_SUBINSN),"Sxtb",       { RdV= fSXTN(8,32,RsV);})
+Q6INSN(SA1_zxth,     "Rd16=zxth(Rs16)",       ATTRIBS(A_SUBINSN),"Zxth",       { RdV= fZXTN(16,32,RsV);})
+Q6INSN(SA1_sxth,     "Rd16=sxth(Rs16)",       ATTRIBS(A_SUBINSN),"Sxth",       { RdV= fSXTN(16,32,RsV);})
+Q6INSN(SA1_combinezr,"Rdd8=combine(#0,Rs16)", ATTRIBS(A_SUBINSN),"Combines",   { fSETWORD(0,RddV,RsV); fSETWORD(1,RddV,0); })
+Q6INSN(SA1_combinerz,"Rdd8=combine(Rs16,#0)", ATTRIBS(A_SUBINSN),"Combines",   { fSETWORD(0,RddV,0); fSETWORD(1,RddV,RsV); })
+Q6INSN(SA1_combine0i,"Rdd8=combine(#0,#u2)", ATTRIBS(A_SUBINSN),"Combines",   { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,0); })
+Q6INSN(SA1_combine1i,"Rdd8=combine(#1,#u2)", ATTRIBS(A_SUBINSN),"Combines",   { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,1); })
+Q6INSN(SA1_combine2i,"Rdd8=combine(#2,#u2)", ATTRIBS(A_SUBINSN),"Combines",   { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,2); })
+Q6INSN(SA1_combine3i,"Rdd8=combine(#3,#u2)", ATTRIBS(A_SUBINSN),"Combines",   { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,3); })
+Q6INSN(SA1_cmpeqi,   "p0=cmp.eq(Rs16,#u2)",   ATTRIBS(A_SUBINSN),"CompareImmed",{fWRITE_P0(f8BITSOF(RsV==uiV));})
+
+
+
+
+/*****************************************************************/
+/*                                                               */
+/*                       Ld1/2 subinsns                          */
+/*                                                               */
+/*****************************************************************/
+
+Q6INSN(SL1_loadri_io,  "Rd16=memw(Rs16+#u4:2)", ATTRIBS(A_LOAD,A_SUBINSN),"load word", {fEA_RI(RsV,uiV); fLOAD(1,4,u,EA,RdV);})
+Q6INSN(SL1_loadrub_io, "Rd16=memub(Rs16+#u4:0)",ATTRIBS(A_LOAD,A_SUBINSN),"load byte", {fEA_RI(RsV,uiV); fLOAD(1,1,u,EA,RdV);})
+
+Q6INSN(SL2_loadrh_io,  "Rd16=memh(Rs16+#u3:1)", ATTRIBS(A_LOAD,A_SUBINSN),"load half", {fEA_RI(RsV,uiV); fLOAD(1,2,s,EA,RdV);})
+Q6INSN(SL2_loadruh_io, "Rd16=memuh(Rs16+#u3:1)",ATTRIBS(A_LOAD,A_SUBINSN),"load half", {fEA_RI(RsV,uiV); fLOAD(1,2,u,EA,RdV);})
+Q6INSN(SL2_loadrb_io,  "Rd16=memb(Rs16+#u3:0)", ATTRIBS(A_LOAD,A_SUBINSN),"load byte", {fEA_RI(RsV,uiV); fLOAD(1,1,s,EA,RdV);})
+Q6INSN(SL2_loadri_sp,  "Rd16=memw(r29+#u5:2)",  ATTRIBS(A_LOAD,A_SUBINSN),"load word", {fEA_RI(fREAD_SP(),uiV); fLOAD(1,4,u,EA,RdV);})
+Q6INSN(SL2_loadrd_sp,  "Rdd8=memd(r29+#u5:3)", ATTRIBS(A_LOAD,A_SUBINSN),"load dword",{fEA_RI(fREAD_SP(),uiV); fLOAD(1,8,u,EA,RddV);})
+
+Q6INSN(SL2_deallocframe,"deallocframe", ATTRIBS(A_SUBINSN,A_LOAD), "Deallocate stack frame",
+{ fHIDE(size8u_t tmp;) fEA_REG(fREAD_FP());
+	fLOAD(1,8,u,EA,tmp);
+	tmp = fFRAME_UNSCRAMBLE(tmp);
+	fWRITE_LR(fGETWORD(1,tmp));
+	fWRITE_FP(fGETWORD(0,tmp));
+	fWRITE_SP(EA+8); })
+
+Q6INSN(SL2_return,"dealloc_return", ATTRIBS(A_JINDIR,A_SUBINSN,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY), "Deallocate stack frame and return",
+{ fHIDE(size8u_t tmp;) fEA_REG(fREAD_FP());
+	fLOAD(1,8,u,EA,tmp);
+	tmp = fFRAME_UNSCRAMBLE(tmp);
+	fWRITE_LR(fGETWORD(1,tmp));
+	fWRITE_FP(fGETWORD(0,tmp));
+	fWRITE_SP(EA+8);
+    fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);})
+
+Q6INSN(SL2_return_t,"if (p0) dealloc_return", ATTRIBS(A_JINDIROLD,A_SUBINSN,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY), "Deallocate stack frame and return",
+{ fHIDE(size8u_t tmp;); fBRANCH_SPECULATE_STALL(fLSBOLD(fREAD_P0()),, SPECULATE_NOT_TAKEN,4,0); fEA_REG(fREAD_FP()); if (fLSBOLD(fREAD_P0())) { fLOAD(1,8,u,EA,tmp); tmp = fFRAME_UNSCRAMBLE(tmp); fWRITE_LR(fGETWORD(1,tmp)); fWRITE_FP(fGETWORD(0,tmp)); fWRITE_SP(EA+8);
+  fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);} else {LOAD_CANCEL(EA);} })
+
+Q6INSN(SL2_return_f,"if (!p0) dealloc_return", ATTRIBS(A_JINDIROLD,A_SUBINSN,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY), "Deallocate stack frame and return",
+{ fHIDE(size8u_t tmp;);fBRANCH_SPECULATE_STALL(fLSBOLDNOT(fREAD_P0()),, SPECULATE_NOT_TAKEN,4,0); fEA_REG(fREAD_FP()); if (fLSBOLDNOT(fREAD_P0())) { fLOAD(1,8,u,EA,tmp); tmp = fFRAME_UNSCRAMBLE(tmp); fWRITE_LR(fGETWORD(1,tmp)); fWRITE_FP(fGETWORD(0,tmp)); fWRITE_SP(EA+8);
+  fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);} else {LOAD_CANCEL(EA);} })
+
+
+
+Q6INSN(SL2_return_tnew,"if (p0.new) dealloc_return:nt", ATTRIBS(A_JINDIRNEW,A_SUBINSN,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY), "Deallocate stack frame and return",
+{ fHIDE(size8u_t tmp;) fBRANCH_SPECULATE_STALL(fLSBNEW0,, SPECULATE_NOT_TAKEN , 4,3); fEA_REG(fREAD_FP()); if (fLSBNEW0) { fLOAD(1,8,u,EA,tmp); tmp = fFRAME_UNSCRAMBLE(tmp); fWRITE_LR(fGETWORD(1,tmp)); fWRITE_FP(fGETWORD(0,tmp)); fWRITE_SP(EA+8);
+  fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);} else {LOAD_CANCEL(EA);} })
+
+Q6INSN(SL2_return_fnew,"if (!p0.new) dealloc_return:nt", ATTRIBS(A_JINDIRNEW,A_SUBINSN,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY), "Deallocate stack frame and return",
+{ fHIDE(size8u_t tmp;) fBRANCH_SPECULATE_STALL(fLSBNEW0NOT,, SPECULATE_NOT_TAKEN , 4,3); fEA_REG(fREAD_FP()); if (fLSBNEW0NOT) { fLOAD(1,8,u,EA,tmp); tmp = fFRAME_UNSCRAMBLE(tmp); fWRITE_LR(fGETWORD(1,tmp)); fWRITE_FP(fGETWORD(0,tmp)); fWRITE_SP(EA+8);
+  fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);} else {LOAD_CANCEL(EA);} })
+
+
+Q6INSN(SL2_jumpr31,"jumpr r31",ATTRIBS(A_SUBINSN,A_JINDIR,A_RESTRICT_SLOT0ONLY),"indirect unconditional jump",
+{ fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);})
+
+Q6INSN(SL2_jumpr31_t,"if (p0) jumpr r31",ATTRIBS(A_SUBINSN,A_JINDIROLD,A_RESTRICT_SLOT0ONLY),"indirect conditional jump if true",
+{fBRANCH_SPECULATE_STALL(fLSBOLD(fREAD_P0()),, SPECULATE_TAKEN,4,0); if (fLSBOLD(fREAD_P0())) {fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);}})
+
+Q6INSN(SL2_jumpr31_f,"if (!p0) jumpr r31",ATTRIBS(A_SUBINSN,A_JINDIROLD,A_RESTRICT_SLOT0ONLY),"indirect conditional jump if false",
+{fBRANCH_SPECULATE_STALL(fLSBOLDNOT(fREAD_P0()),, SPECULATE_TAKEN,4,0); if (fLSBOLDNOT(fREAD_P0())) {fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);}})
+
+
+
+Q6INSN(SL2_jumpr31_tnew,"if (p0.new) jumpr:nt r31",ATTRIBS(A_SUBINSN,A_JINDIRNEW,A_RESTRICT_SLOT0ONLY),"indirect conditional jump if true",
+{fBRANCH_SPECULATE_STALL(fLSBNEW0,, SPECULATE_NOT_TAKEN , 4,3); if (fLSBNEW0) {fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);}})
+
+Q6INSN(SL2_jumpr31_fnew,"if (!p0.new) jumpr:nt r31",ATTRIBS(A_SUBINSN,A_JINDIRNEW,A_RESTRICT_SLOT0ONLY),"indirect conditional jump if false",
+{fBRANCH_SPECULATE_STALL(fLSBNEW0NOT,, SPECULATE_NOT_TAKEN , 4,3); if (fLSBNEW0NOT) {fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);}})
+
+
+
+
+
+/*****************************************************************/
+/*                                                               */
+/*                       St1/2 subinsns                          */
+/*                                                               */
+/*****************************************************************/
+
+Q6INSN(SS1_storew_io,  "memw(Rs16+#u4:2)=Rt16", ATTRIBS(A_STORE,A_SUBINSN), "store word", {fEA_RI(RsV,uiV); fSTORE(1,4,EA,RtV);})
+Q6INSN(SS1_storeb_io,  "memb(Rs16+#u4:0)=Rt16", ATTRIBS(A_STORE,A_SUBINSN), "store byte", {fEA_RI(RsV,uiV); fSTORE(1,1,EA,fGETBYTE(0,RtV));})
+Q6INSN(SS2_storeh_io,  "memh(Rs16+#u3:1)=Rt16", ATTRIBS(A_STORE,A_SUBINSN), "store half", {fEA_RI(RsV,uiV); fSTORE(1,2,EA,fGETHALF(0,RtV));})
+Q6INSN(SS2_stored_sp,  "memd(r29+#s6:3)=Rtt8", ATTRIBS(A_STORE,A_SUBINSN), "store dword",{fEA_RI(fREAD_SP(),siV); fSTORE(1,8,EA,RttV);})
+Q6INSN(SS2_storew_sp,  "memw(r29+#u5:2)=Rt16",  ATTRIBS(A_STORE,A_SUBINSN), "store word", {fEA_RI(fREAD_SP(),uiV); fSTORE(1,4,EA,RtV);})
+Q6INSN(SS2_storewi0,   "memw(Rs16+#u4:2)=#0", ATTRIBS(A_STORE,A_SUBINSN), "store word", {fEA_RI(RsV,uiV); fSTORE(1,4,EA,0);})
+Q6INSN(SS2_storebi0,   "memb(Rs16+#u4:0)=#0", ATTRIBS(A_STORE,A_SUBINSN), "store byte", {fEA_RI(RsV,uiV); fSTORE(1,1,EA,0);})
+Q6INSN(SS2_storewi1,   "memw(Rs16+#u4:2)=#1", ATTRIBS(A_STORE,A_SUBINSN), "store word", {fEA_RI(RsV,uiV); fSTORE(1,4,EA,1);})
+Q6INSN(SS2_storebi1,   "memb(Rs16+#u4:0)=#1", ATTRIBS(A_STORE,A_SUBINSN), "store byte", {fEA_RI(RsV,uiV); fSTORE(1,1,EA,1);})
+
+
+Q6INSN(SS2_allocframe,"allocframe(#u5:3)", ATTRIBS(A_SUBINSN,A_STORE,A_RESTRICT_SLOT0ONLY), "Allocate stack frame",
+{ fEA_RI(fREAD_SP(),-8);  fSTORE(1,8,EA,fFRAME_SCRAMBLE((fCAST8_8u(fREAD_LR()) << 32) | fCAST4_4u(fREAD_FP()))); fWRITE_FP(EA); fFRAMECHECK(EA-uiV,EA); fWRITE_SP(EA-uiV); })
+
+
+
diff --git a/target/hexagon/imported/system.idef b/target/hexagon/imported/system.idef
new file mode 100644
index 0000000..746144d
--- /dev/null
+++ b/target/hexagon/imported/system.idef
@@ -0,0 +1,69 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * System Interface Instructions
+ */
+
+
+
+/********************************************/
+/* User->OS interface                       */
+/********************************************/
+
+Q6INSN(J2_trap0,"trap0(#u8)",ATTRIBS(A_COF),
+"Trap to Operating System",
+	fTRAP(0,uiV);
+)
+
+Q6INSN(Y2_icinva,"icinva(Rs32)",ATTRIBS(A_ICOP,A_ICFLUSHOP),"Instruction Cache Invalidate Address",{fEA_REG(RsV); fICINVA(EA);})
+
+Q6INSN(Y2_isync,"isync",ATTRIBS(),"Memory Synchronization",{fISYNC();})
+Q6INSN(Y2_barrier,"barrier",ATTRIBS(A_RESTRICT_SLOT0ONLY),"Memory Barrier",{fBARRIER();})
+Q6INSN(Y2_syncht,"syncht",ATTRIBS(A_RESTRICT_SLOT0ONLY),"Memory Synchronization",{fSYNCH();})
+
+
+Q6INSN(Y2_dcfetchbo,"dcfetch(Rs32+#u11:3)",ATTRIBS(A_RESTRICT_PREFERSLOT0,A_DCFETCH),"Data Cache Prefetch",{fEA_RI(RsV,uiV); fDCFETCH(EA);})
+
+
+Q6INSN(Y2_dczeroa,"dczeroa(Rs32)",ATTRIBS(A_STORE,A_RESTRICT_SLOT0ONLY,A_DCZEROA),"Zero an aligned 32-byte cacheline",{fEA_REG(RsV); fDCZEROA(EA);})
+Q6INSN(Y2_dccleana,"dccleana(Rs32)",ATTRIBS(A_RESTRICT_SLOT0ONLY,A_DCFLUSHOP),"Data Cache Clean Address",{fEA_REG(RsV); fDCCLEANA(EA);})
+Q6INSN(Y2_dccleaninva,"dccleaninva(Rs32)",ATTRIBS(A_RESTRICT_SLOT0ONLY,A_DCFLUSHOP),"Data Cache Clean and Invalidate Address",{fEA_REG(RsV); fDCCLEANINVA(EA);})
+Q6INSN(Y2_dcinva,"dcinva(Rs32)",ATTRIBS(A_RESTRICT_SLOT0ONLY,A_DCFLUSHOP),"Data Cache Invalidate Address",{fEA_REG(RsV); fDCCLEANINVA(EA);})
+
+
+
+
+
+
+Q6INSN(Y4_l2fetch,"l2fetch(Rs32,Rt32)",ATTRIBS(A_RESTRICT_SLOT0ONLY),"L2 Cache Prefetch",
+{ fL2FETCH(RsV,
+	(RtV&0xff), /*height*/
+	((RtV>>8)&0xff), /*width*/
+	((RtV>>16)&0xffff), /*stride*/
+	0); /*extra attrib flags*/
+})
+
+
+
+Q6INSN(Y5_l2fetch,"l2fetch(Rs32,Rtt32)",ATTRIBS(A_RESTRICT_SLOT0ONLY),"L2 Cache Prefetch",
+{ fL2FETCH(RsV,
+	fGETUHALF(0,RttV), /*height*/
+	fGETUHALF(1,RttV), /*width*/
+	fGETUHALF(2,RttV), /*stride*/
+	fGETUHALF(3,RttV)); /*flags*/
+})
-- 
2.7.4



* [RFC PATCH v3 19/34] Hexagon (target/hexagon/imported) arch import - instruction encoding
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (17 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 18/34] Hexagon (target/hexagon/imported) arch import - instruction semantics Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-18 15:50 ` [RFC PATCH v3 20/34] Hexagon (target/hexagon) generator phase 1 - C preprocessor for semantics Taylor Simpson
                   ` (16 subsequent siblings)
  35 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Imported from the Hexagon architecture library
    Instruction encoding bit patterns for every instruction

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/imported/encode.def         |  125 ++
 target/hexagon/imported/encode_pp.def      | 2110 ++++++++++++++++++++++++++++
 target/hexagon/imported/encode_subinsn.def |  150 ++
 3 files changed, 2385 insertions(+)
 create mode 100644 target/hexagon/imported/encode.def
 create mode 100644 target/hexagon/imported/encode_pp.def
 create mode 100644 target/hexagon/imported/encode_subinsn.def
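
Reviewer note (illustration only, not part of the imported files): each
DEF_ENC32 entry below pairs an instruction tag with a 32-character bit
pattern, ignoring spaces. The sketch shows one way a consumer might reduce
such a pattern to a mask/match pair for decoding. It assumes that only '0'
and '1' are fixed bits and that every other character (operand letters, the
'P' parse bits, '-') is "don't care". The helper name enc_to_mask_match is
hypothetical and does not appear anywhere in this series.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from encode_pp.def so the sample pattern below is self-contained. */
#define ICLASS_J "0101"

/*
 * Hypothetical helper: convert an encoding string into a mask/match pair.
 * Assumption: '0' and '1' are fixed bits, spaces are separators, and any
 * other character is an operand or don't-care bit.
 * Returns 0 on success, -1 if the string does not describe exactly 32 bits.
 */
static int enc_to_mask_match(const char *enc, uint32_t *mask, uint32_t *match)
{
    uint32_t m = 0;
    uint32_t v = 0;
    int nbits = 0;

    for (const char *p = enc; *p != '\0'; p++) {
        if (*p == ' ') {
            continue;               /* spaces only group the bits visually */
        }
        if (nbits == 32) {
            return -1;              /* more than 32 bit positions */
        }
        m <<= 1;
        v <<= 1;
        if (*p == '0' || *p == '1') {
            m |= 1u;                /* this bit position is fixed */
            v |= (uint32_t)(*p - '0');
        }
        nbits++;
    }
    if (nbits != 32) {
        return -1;                  /* fewer than 32 bit positions */
    }
    *mask = m;
    *match = v;
    return 0;
}
```

Under those assumptions, the J2_jumpr pattern
ICLASS_J" 0010  100sssss  PP------  --------" yields mask 0xFFE00000 and
match 0x52800000, so a word w would be a candidate for J2_jumpr when
(w & mask) == match.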

diff --git a/target/hexagon/imported/encode.def b/target/hexagon/imported/encode.def
new file mode 100644
index 0000000..ae23301
--- /dev/null
+++ b/target/hexagon/imported/encode.def
@@ -0,0 +1,125 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * This just includes all encoding files
+ */
+
+#ifndef DEF_FIELD32
+#define __SELF_DEF_FIELD32
+#define DEF_FIELD32(...) /* nothing */
+#endif
+
+#ifndef DEF_CLASS32
+#define __SELF_DEF_CLASS32
+#define DEF_CLASS32(...) /* nothing */
+#endif
+
+#ifndef DEF_ANTICLASS32
+#define __SELF_DEF_ANTICLASS32
+#define DEF_ANTICLASS32(...) /* nothing */
+#endif
+
+#ifndef LEGACY_DEF_ENC32
+#define __SELF_DEF_LEGACY_DEF_ENC32
+#define LEGACY_DEF_ENC32(...) /* nothing */
+#endif
+
+#ifndef DEF_FIELDROW_DESC32
+#define __SELF_DEF_FIELDROW_DESC32
+#define DEF_FIELDROW_DESC32(...) /* nothing */
+#endif
+
+#ifndef DEF_ENC32
+#define __SELF_DEF_ENC32
+#define DEF_ENC32(...) /* nothing */
+#endif
+
+#ifndef DEF_PACKED32
+#define __SELF_DEF_PACKED32
+#define DEF_PACKED32(...) /* nothing */
+#endif
+
+#ifndef DEF_ENC_SUBINSN
+#define __SELF_DEF_ENC_SUBINSN
+#define DEF_ENC_SUBINSN(...) /* nothing */
+#endif
+
+#ifndef DEF_EXT_ENC
+#define __SELF_DEF_EXT_ENC
+#define DEF_EXT_ENC(...) /* nothing */
+#endif
+
+#ifndef DEF_EXT_SPACE
+#define __SELF_DEF_EXT_SPACE
+#define DEF_EXT_SPACE(...) /* nothing */
+#endif
+
+#include "encode_pp.def"
+#include "encode_subinsn.def"
+
+#ifdef __SELF_DEF_FIELD32
+#undef __SELF_DEF_FIELD32
+#undef DEF_FIELD32
+#endif
+
+#ifdef __SELF_DEF_CLASS32
+#undef __SELF_DEF_CLASS32
+#undef DEF_CLASS32
+#endif
+
+#ifdef __SELF_DEF_ANTICLASS32
+#undef __SELF_DEF_ANTICLASS32
+#undef DEF_ANTICLASS32
+#endif
+
+#ifdef __SELF_DEF_LEGACY_DEF_ENC32
+#undef __SELF_DEF_LEGACY_DEF_ENC32
+#undef LEGACY_DEF_ENC32
+#endif
+
+#ifdef __SELF_DEF_FIELDROW_DESC32
+#undef __SELF_DEF_FIELDROW_DESC32
+#undef DEF_FIELDROW_DESC32
+#endif
+
+#ifdef __SELF_DEF_ENC32
+#undef __SELF_DEF_ENC32
+#undef DEF_ENC32
+#endif
+
+#ifdef __SELF_DEF_EXT_SPACE
+#undef __SELF_DEF_EXT_SPACE
+#undef DEF_EXT_SPACE
+#endif
+
+
+#ifdef __SELF_DEF_PACKED32
+#undef __SELF_DEF_PACKED32
+#undef DEF_PACKED32
+#endif
+
+#ifdef __SELF_DEF_ENC_SUBINSN
+#undef __SELF_DEF_ENC_SUBINSN
+#undef DEF_ENC_SUBINSN
+#endif
+
+#ifdef __SELF_DEF_EXT_ENC
+#undef __SELF_DEF_EXT_ENC
+#undef DEF_EXT_ENC
+#endif
+
diff --git a/target/hexagon/imported/encode_pp.def b/target/hexagon/imported/encode_pp.def
new file mode 100644
index 0000000..d3ff2c3
--- /dev/null
+++ b/target/hexagon/imported/encode_pp.def
@@ -0,0 +1,2110 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * Encodings for 32 bit instructions
+ *
+ */
+
+
+
+
+DEF_CLASS32("---- ---- -------- PP------ --------",ALL_PP)
+DEF_FIELD32("---- ---- -------- !!------ --------",Parse,"Packet/Loop parse bits")
+DEF_FIELD32("!!!! ---- -------- PP------ --------",ICLASS,"Instruction Class")
+
+#define ICLASS_EXTENDER   "0000"
+#define ICLASS_CJ         "0001"
+#define ICLASS_NCJ        "0010"
+#define ICLASS_V4LDST     "0011"
+#define ICLASS_V2LDST     "0100"
+#define ICLASS_J          "0101"
+#define ICLASS_CR         "0110"
+#define ICLASS_ALU2op     "0111"
+#define ICLASS_S2op       "1000"
+#define ICLASS_LD         "1001"
+#define ICLASS_ST         "1010"
+#define ICLASS_ADDI       "1011"
+#define ICLASS_S3op       "1100"
+#define ICLASS_ALU64      "1101"
+#define ICLASS_M          "1110"
+#define ICLASS_ALU3op     "1111"
+
+
+
+/*******************************/
+/*                             */
+/*                             */
+/*     V4 Immediate Payload    */
+/*                             */
+/*                             */
+/*******************************/
+
+DEF_CLASS32(ICLASS_EXTENDER" ---- -------- PP------ --------",EXTENDER)
+DEF_ENC32(A4_ext, ICLASS_EXTENDER "iiii iiiiiiii PPiiiiii iiiiiiii")
+
+
+
+/*******************************/
+/*                             */
+/*                             */
+/*     V2 PREDICATED LD/ST     */
+/*                             */
+/*                             */
+/*******************************/
+
+DEF_CLASS32(ICLASS_V2LDST" ---- -------- PP------ --------",V2LDST)
+DEF_CLASS32(ICLASS_V2LDST" ---1 -------- PP------ --------",V2LD)
+DEF_CLASS32(ICLASS_V2LDST" ---0 -------- PP------ --------",V2ST)
+DEF_CLASS32(ICLASS_V2LDST" 0--1 -------- PP------ --------",PLD)
+DEF_CLASS32(ICLASS_V2LDST" 0--0 -------- PP------ --------",PST)
+DEF_CLASS32(ICLASS_V2LDST" 1--1 -------- PP------ --------",GPLD)
+DEF_CLASS32(ICLASS_V2LDST" 1--0 -------- PP------ --------",GPST)
+
+DEF_FIELD32(ICLASS_V2LDST" 0!-- -------- PP------ --------",PMEM_Sense,"Sense")
+DEF_FIELD32(ICLASS_V2LDST" 0-!- -------- PP------ --------",PMEM_PredNew,"PredNew")
+DEF_FIELD32(ICLASS_V2LDST" ---1 !!------ PP------ --------",PMEML_Type,"Type")
+DEF_FIELD32(ICLASS_V2LDST" ---1 --!----- PP------ --------",PMEML_UN,"Unsigned")
+DEF_FIELD32(ICLASS_V2LDST" ---0 !!!----- PP------ --------",PMEMS_Type,"Type")
+
+#define STD_PLD_IOENC(TAG,OPC) \
+DEF_ENC32(L2_pload##TAG##t_io,   ICLASS_V2LDST" 0001 "OPC"  sssss  PP0ttiii  iiiddddd")\
+DEF_ENC32(L2_pload##TAG##f_io,   ICLASS_V2LDST" 0101 "OPC"  sssss  PP0ttiii  iiiddddd")\
+DEF_ENC32(L2_pload##TAG##tnew_io,ICLASS_V2LDST" 0011 "OPC"  sssss  PP0ttiii  iiiddddd")\
+DEF_ENC32(L2_pload##TAG##fnew_io,ICLASS_V2LDST" 0111 "OPC"  sssss  PP0ttiii  iiiddddd")
+
+STD_PLD_IOENC(rb,  "000")
+STD_PLD_IOENC(rub, "001")
+STD_PLD_IOENC(rh,  "010")
+STD_PLD_IOENC(ruh, "011")
+STD_PLD_IOENC(ri,  "100")
+STD_PLD_IOENC(rd,  "110") /* note dest reg field LSB=0, 1 is reserved */
+
+
+
+#define STD_PST_IOENC(TAG,OPC,SRC) \
+DEF_ENC32(S2_pstore##TAG##t_io,   ICLASS_V2LDST" 0000 "OPC"  sssss  PPi"SRC"  iiiii0vv")\
+DEF_ENC32(S2_pstore##TAG##f_io,   ICLASS_V2LDST" 0100 "OPC"  sssss  PPi"SRC"  iiiii0vv")\
+DEF_ENC32(S4_pstore##TAG##tnew_io,ICLASS_V2LDST" 0010 "OPC"  sssss  PPi"SRC"  iiiii0vv")\
+DEF_ENC32(S4_pstore##TAG##fnew_io,ICLASS_V2LDST" 0110 "OPC"  sssss  PPi"SRC"  iiiii0vv")
+
+STD_PST_IOENC(rb,    "000","ttttt")
+STD_PST_IOENC(rh,    "010","ttttt")
+STD_PST_IOENC(rf,    "011","ttttt")
+STD_PST_IOENC(ri,    "100","ttttt")
+STD_PST_IOENC(rd,    "110","ttttt")
+STD_PST_IOENC(rbnew, "101","00ttt")
+STD_PST_IOENC(rhnew, "101","01ttt")
+STD_PST_IOENC(rinew, "101","10ttt")
+
+
+
+
+
+/*******************************/
+/*                             */
+/*                             */
+/*     V2 GP-RELATIVE LD/ST    */
+/*                             */
+/*                             */
+/*******************************/
+#define STD_LD_GP(TAG,OPC) \
+DEF_ENC32(L2_load##TAG##gp,   ICLASS_V2LDST" 1ii1 "OPC"  iiiii  PPiiiiii  iiiddddd")
+
+STD_LD_GP(rb,  "000")
+STD_LD_GP(rub, "001")
+STD_LD_GP(rh,  "010")
+STD_LD_GP(ruh, "011")
+STD_LD_GP(ri,  "100")
+STD_LD_GP(rd,  "110") /* note dest reg field LSB=0, 1 is reserved */
+
+#define STD_ST_GP(TAG,OPC,SRC) \
+DEF_ENC32(S2_store##TAG##gp,  ICLASS_V2LDST" 1ii0 "OPC"  iiiii  PPi"SRC"  iiiiiiii")
+
+STD_ST_GP(rb,   "000","ttttt")
+STD_ST_GP(rh,   "010","ttttt")
+STD_ST_GP(rf,   "011","ttttt")
+STD_ST_GP(ri,   "100","ttttt")
+STD_ST_GP(rd,   "110","ttttt")
+STD_ST_GP(rbnew,"101","00ttt")
+STD_ST_GP(rhnew,"101","01ttt")
+STD_ST_GP(rinew,"101","10ttt")
+
+
+
+
+
+/*******************************/
+/*                             */
+/*                             */
+/*     V4LDST                  */
+/*                             */
+/*                             */
+/*******************************/
+
+
+DEF_CLASS32(ICLASS_V4LDST" ---- -------- PP------ --------",V4LDST)
+DEF_CLASS32(ICLASS_V4LDST" 0--- -------- PP------ --------",Pred_RplusR)
+DEF_CLASS32(ICLASS_V4LDST" 100- -------- PP------ --------",Pred_StoreImmed)
+DEF_CLASS32(ICLASS_V4LDST" 101- -------- PP------ --------",RplusR)
+DEF_CLASS32(ICLASS_V4LDST" 110- -------- PP------ --------",StoreImmed)
+DEF_CLASS32(ICLASS_V4LDST" 111- -------- PP------ --------",MemOp)
+
+
+
+
+/*******************************/
+/*    Pred (R+R)               */
+/*******************************/
+
+#define STD_PLD_RRENC(TAG,OPC) \
+DEF_ENC32(L4_pload##TAG##t_rr,   ICLASS_V4LDST" 00 00 "OPC"  sssss  PPittttt  ivvddddd")\
+DEF_ENC32(L4_pload##TAG##f_rr,   ICLASS_V4LDST" 00 01 "OPC"  sssss  PPittttt  ivvddddd")\
+DEF_ENC32(L4_pload##TAG##tnew_rr,ICLASS_V4LDST" 00 10 "OPC"  sssss  PPittttt  ivvddddd")\
+DEF_ENC32(L4_pload##TAG##fnew_rr,ICLASS_V4LDST" 00 11 "OPC"  sssss  PPittttt  ivvddddd")
+
+STD_PLD_RRENC(rb,  "000")
+STD_PLD_RRENC(rub, "001")
+STD_PLD_RRENC(rh,  "010")
+STD_PLD_RRENC(ruh, "011")
+STD_PLD_RRENC(ri,  "100")
+STD_PLD_RRENC(rd,  "110")
+
+#define STD_PST_RRENC(TAG,OPC,SRC) \
+DEF_ENC32(S4_pstore##TAG##t_rr,   ICLASS_V4LDST" 01 00 "OPC"  sssss  PPiuuuuu  ivv"SRC)\
+DEF_ENC32(S4_pstore##TAG##f_rr,   ICLASS_V4LDST" 01 01 "OPC"  sssss  PPiuuuuu  ivv"SRC)\
+DEF_ENC32(S4_pstore##TAG##tnew_rr,ICLASS_V4LDST" 01 10 "OPC"  sssss  PPiuuuuu  ivv"SRC)\
+DEF_ENC32(S4_pstore##TAG##fnew_rr,ICLASS_V4LDST" 01 11 "OPC"  sssss  PPiuuuuu  ivv"SRC)
+
+STD_PST_RRENC(rb,    "000","ttttt")
+STD_PST_RRENC(rh,    "010","ttttt")
+STD_PST_RRENC(rf,    "011","ttttt")
+STD_PST_RRENC(ri,    "100","ttttt")
+STD_PST_RRENC(rd,    "110","ttttt")
+STD_PST_RRENC(rbnew, "101","00ttt")
+STD_PST_RRENC(rhnew, "101","01ttt")
+STD_PST_RRENC(rinew, "101","10ttt")
+
+
+
+/*******************************/
+/*     Pred Store immediates   */
+/*******************************/
+
+#define V4_PSTI(TAG,OPC) \
+DEF_ENC32(S4_storei##TAG##t_io,    ICLASS_V4LDST" 100 00  "OPC"  sssss  PPIiiiii  ivvIIIII")\
+DEF_ENC32(S4_storei##TAG##f_io,    ICLASS_V4LDST" 100 01  "OPC"  sssss  PPIiiiii  ivvIIIII")\
+DEF_ENC32(S4_storei##TAG##tnew_io, ICLASS_V4LDST" 100 10  "OPC"  sssss  PPIiiiii  ivvIIIII")\
+DEF_ENC32(S4_storei##TAG##fnew_io, ICLASS_V4LDST" 100 11  "OPC"  sssss  PPIiiiii  ivvIIIII")
+
+V4_PSTI(rb, "00")
+V4_PSTI(rh, "01")
+V4_PSTI(ri, "10")
+
+
+
+/*******************************/
+/*    (R+R)                    */
+/*******************************/
+
+#define STD_LD_RRENC(TAG,OPC) \
+DEF_ENC32(L4_load##TAG##_rr,     ICLASS_V4LDST" 1010 "OPC"  sssss  PPittttt  i--ddddd")
+
+STD_LD_RRENC(rb,  "000")
+STD_LD_RRENC(rub, "001")
+STD_LD_RRENC(rh,  "010")
+STD_LD_RRENC(ruh, "011")
+STD_LD_RRENC(ri,  "100")
+STD_LD_RRENC(rd,  "110")
+
+#define STD_ST_RRENC(TAG,OPC,SRC) \
+DEF_ENC32(S4_store##TAG##_rr,     ICLASS_V4LDST" 1011 "OPC"  sssss  PPiuuuuu  i--"SRC)
+
+STD_ST_RRENC(rb,    "000","ttttt")
+STD_ST_RRENC(rh,    "010","ttttt")
+STD_ST_RRENC(rf,    "011","ttttt")
+STD_ST_RRENC(ri,    "100","ttttt")
+STD_ST_RRENC(rd,    "110","ttttt")
+STD_ST_RRENC(rbnew, "101","00ttt")
+STD_ST_RRENC(rhnew, "101","01ttt")
+STD_ST_RRENC(rinew, "101","10ttt")
+
+
+
+
+/*******************************/
+/*     Store immediates        */
+/*******************************/
+
+#define V4_STI(TAG,OPC) \
+DEF_ENC32(S4_storei##TAG##_io,     ICLASS_V4LDST" 110 -- "OPC"  sssss  PPIiiiii  iIIIIIII")
+
+
+V4_STI(rb, "00")
+V4_STI(rh, "01")
+V4_STI(ri, "10")
+
+
+/*******************************/
+/*     Memops                 */
+/*******************************/
+
+#define MEMOPENC(TAG,OPC) \
+DEF_ENC32(L4_add_##TAG##_io,         ICLASS_V4LDST" 111 0- " OPC "sssss  PP0iiiii  i00ttttt")\
+DEF_ENC32(L4_sub_##TAG##_io,         ICLASS_V4LDST" 111 0- " OPC "sssss  PP0iiiii  i01ttttt")\
+DEF_ENC32(L4_and_##TAG##_io,         ICLASS_V4LDST" 111 0- " OPC "sssss  PP0iiiii  i10ttttt")\
+DEF_ENC32(L4_or_##TAG##_io,          ICLASS_V4LDST" 111 0- " OPC "sssss  PP0iiiii  i11ttttt")\
+\
+DEF_ENC32(L4_iadd_##TAG##_io,        ICLASS_V4LDST" 111 1- " OPC "sssss  PP0iiiii  i00IIIII")\
+DEF_ENC32(L4_isub_##TAG##_io,        ICLASS_V4LDST" 111 1- " OPC "sssss  PP0iiiii  i01IIIII")\
+DEF_ENC32(L4_iand_##TAG##_io,        ICLASS_V4LDST" 111 1- " OPC "sssss  PP0iiiii  i10IIIII")\
+DEF_ENC32(L4_ior_##TAG##_io,         ICLASS_V4LDST" 111 1- " OPC "sssss  PP0iiiii  i11IIIII")
+
+
+
+MEMOPENC(memopw,"10")
+MEMOPENC(memoph,"01")
+MEMOPENC(memopb,"00")
+
+
+
+
+/*******************************/
+/*                             */
+/*                             */
+/*           LOAD              */
+/*                             */
+/*                             */
+/*******************************/
+DEF_CLASS32(ICLASS_LD" ---- -------- PP------ --------",LD)
+
+
+DEF_CLASS32(ICLASS_LD" 0--- -------- PP------ --------",LD_ADDR_ROFFSET)
+DEF_CLASS32(ICLASS_LD" 101- -------- PP00---- --------",LD_ADDR_POST_IMMED)
+DEF_CLASS32(ICLASS_LD" 101- -------- PP01---- --------",LD_ADDR_ABS_UPDATE_V4)
+DEF_CLASS32(ICLASS_LD" 101- -------- PP1----- --------",LD_ADDR_POST_IMMED_PRED_V2)
+DEF_CLASS32(ICLASS_LD" 110- -------- PP-0---- 0-------",LD_ADDR_POST_REG)
+DEF_CLASS32(ICLASS_LD" 110- -------- PP-1---- --------",LD_ADDR_ABS_PLUS_REG_V4)
+DEF_CLASS32(ICLASS_LD" 100- -------- PP----1- --------",LD_ADDR_POST_CREG_V2)
+DEF_CLASS32(ICLASS_LD" 111- -------- PP------ 1-------",LD_ADDR_PRED_ABS_V4)
+
+DEF_FIELD32(ICLASS_LD" !!!- -------- PP------ --------",LD_Amode,"Amode")
+DEF_FIELD32(ICLASS_LD" ---! !!------ PP------ --------",LD_Type,"Type")
+DEF_FIELD32(ICLASS_LD" ---- --!----- PP------ --------",LD_UN,"Unsigned")
+
+#define STD_LD_ENC(TAG,OPC) \
+DEF_ENC32(L2_load##TAG##_io,   ICLASS_LD" 0 ii "OPC"  sssss  PPiiiiii  iiiddddd")\
+DEF_ENC32(L2_load##TAG##_pi,   ICLASS_LD" 1 01 "OPC"  xxxxx  PP00---i  iiiddddd")\
+DEF_ENC32(L4_load##TAG##_ap,   ICLASS_LD" 1 01 "OPC"  eeeee  PP01IIII  -IIddddd")\
+DEF_ENC32(L2_load##TAG##_pr,   ICLASS_LD" 1 10 "OPC"  xxxxx  PPu0----  0--ddddd")\
+DEF_ENC32(L4_load##TAG##_ur,   ICLASS_LD" 1 10 "OPC"  ttttt  PPi1IIII  iIIddddd")\
+
+
+#define STD_LDX_ENC(TAG,OPC) \
+DEF_ENC32(L2_load##TAG##_io,   ICLASS_LD" 0 ii "OPC"  sssss  PPiiiiii  iiiyyyyy")\
+DEF_ENC32(L2_load##TAG##_pi,   ICLASS_LD" 1 01 "OPC"  xxxxx  PP00---i  iiiyyyyy")\
+DEF_ENC32(L4_load##TAG##_ap,   ICLASS_LD" 1 01 "OPC"  eeeee  PP01IIII  -IIyyyyy")\
+DEF_ENC32(L2_load##TAG##_pr,   ICLASS_LD" 1 10 "OPC"  xxxxx  PPu0----  0--yyyyy")\
+DEF_ENC32(L4_load##TAG##_ur,   ICLASS_LD" 1 10 "OPC"  ttttt  PPi1IIII  iIIyyyyy")\
+
+
+#define STD_PLD_ENC(TAG,OPC) \
+DEF_ENC32(L2_pload##TAG##t_pi,    ICLASS_LD" 1 01 "OPC"  xxxxx  PP100tti  iiiddddd")\
+DEF_ENC32(L2_pload##TAG##f_pi,    ICLASS_LD" 1 01 "OPC"  xxxxx  PP101tti  iiiddddd")\
+DEF_ENC32(L2_pload##TAG##tnew_pi, ICLASS_LD" 1 01 "OPC"  xxxxx  PP110tti  iiiddddd")\
+DEF_ENC32(L2_pload##TAG##fnew_pi, ICLASS_LD" 1 01 "OPC"  xxxxx  PP111tti  iiiddddd")\
+DEF_ENC32(L4_pload##TAG##t_abs,   ICLASS_LD" 1 11 "OPC"  iiiii  PP100tti  1--ddddd")\
+DEF_ENC32(L4_pload##TAG##f_abs,   ICLASS_LD" 1 11 "OPC"  iiiii  PP101tti  1--ddddd")\
+DEF_ENC32(L4_pload##TAG##tnew_abs,ICLASS_LD" 1 11 "OPC"  iiiii  PP110tti  1--ddddd")\
+DEF_ENC32(L4_pload##TAG##fnew_abs,ICLASS_LD" 1 11 "OPC"  iiiii  PP111tti  1--ddddd")
+
+
+/*               0 000  misc: dealloc,loadw_locked,dcfetch      */
+STD_LD_ENC(rb,  "1 000")
+STD_LD_ENC(rub, "1 001")
+STD_LD_ENC(rh,  "1 010")
+STD_LD_ENC(ruh, "1 011")
+STD_LD_ENC(ri,  "1 100")
+STD_LD_ENC(rd,  "1 110") /* note dest reg field LSB=0, 1 is reserved */
+
+STD_PLD_ENC(rb,  "1 000")
+STD_PLD_ENC(rub, "1 001")
+STD_PLD_ENC(rh,  "1 010")
+STD_PLD_ENC(ruh, "1 011")
+STD_PLD_ENC(ri,  "1 100")
+STD_PLD_ENC(rd,  "1 110") /* note dest reg field LSB=0, 1 is reserved */
+
+
+DEF_CLASS32(    ICLASS_LD" 0--0 000----- PP------ --------",LD_MISC)
+DEF_ANTICLASS32(ICLASS_LD" 0--0 000----- PP------ --------",LD_ADDR_ROFFSET)
+DEF_ANTICLASS32(ICLASS_LD" 1010 000----- PP------ --------",LD_ADDR_POST_IMMED)
+DEF_ANTICLASS32(ICLASS_LD" 1100 000----- PP------ --------",LD_ADDR_POST_REG)
+DEF_ANTICLASS32(ICLASS_LD" 1110 000----- PP------ --------",LD_ADDR_POST_REG)
+
+DEF_ENC32(L2_deallocframe,    ICLASS_LD" 000 0 000 sssss PP0----- ---ddddd")
+DEF_ENC32(L4_return,          ICLASS_LD" 011 0 000 sssss PP0000-- ---ddddd")
+DEF_ENC32(L4_return_t,        ICLASS_LD" 011 0 000 sssss PP0100vv ---ddddd")
+DEF_ENC32(L4_return_f,        ICLASS_LD" 011 0 000 sssss PP1100vv ---ddddd")
+DEF_ENC32(L4_return_tnew_pt,  ICLASS_LD" 011 0 000 sssss PP0110vv ---ddddd")
+DEF_ENC32(L4_return_fnew_pt,  ICLASS_LD" 011 0 000 sssss PP1110vv ---ddddd")
+DEF_ENC32(L4_return_tnew_pnt, ICLASS_LD" 011 0 000 sssss PP0010vv ---ddddd")
+DEF_ENC32(L4_return_fnew_pnt, ICLASS_LD" 011 0 000 sssss PP1010vv ---ddddd")
+
+DEF_ENC32(L2_loadw_locked,ICLASS_LD" 001 0 000 sssss PP00---- -00ddddd")
+
+
+
+
+
+
+DEF_ENC32(L4_loadd_locked,ICLASS_LD" 001 0 000 sssss PP01---- -00ddddd")
+DEF_EXT_SPACE(EXTRACTW,   ICLASS_LD" 001 0 000 iiiii PP0iiiii -01iiiii")
+DEF_ENC32(Y2_dcfetchbo,   ICLASS_LD" 010 0 000 sssss PP0--iii iiiiiiii")
+
+
+
+
+
+
+
+
+/*******************************/
+/*                             */
+/*                             */
+/*           STORE             */
+/*                             */
+/*                             */
+/*******************************/
+
+DEF_CLASS32(ICLASS_ST" ---- -------- PP------ --------",ST)
+
+DEF_FIELD32(ICLASS_ST" !!!- -------- PP------ --------",ST_Amode,"Amode")
+DEF_FIELD32(ICLASS_ST" ---! !!------ PP------ --------",ST_Type,"Type")
+DEF_FIELD32(ICLASS_ST" ---- --!----- PP------ --------",ST_UN,"Unsigned")
+
+DEF_CLASS32(ICLASS_ST" 0--1 -------- PP------ --------",ST_ADDR_ROFFSET)
+DEF_CLASS32(ICLASS_ST" 1011 -------- PP0----- 0-----0-",ST_ADDR_POST_IMMED)
+DEF_CLASS32(ICLASS_ST" 1011 -------- PP0----- 1-------",ST_ADDR_ABS_UPDATE_V4)
+DEF_CLASS32(ICLASS_ST" 1011 -------- PP1----- --------",ST_ADDR_POST_IMMED_PRED_V2)
+DEF_CLASS32(ICLASS_ST" 1111 -------- PP------ 1-------",ST_ADDR_PRED_ABS_V4)
+DEF_CLASS32(ICLASS_ST" 1101 -------- PP------ 0-------",ST_ADDR_POST_REG)
+DEF_CLASS32(ICLASS_ST" 1101 -------- PP------ 1-------",ST_ADDR_ABS_PLUS_REG_V4)
+DEF_CLASS32(ICLASS_ST" 1001 -------- PP------ ------1-",ST_ADDR_POST_CREG_V2)
+DEF_CLASS32(ICLASS_ST" 0--0 1------- PP------ --------",ST_MISC_STORELIKE)
+DEF_CLASS32(ICLASS_ST" 1--0 0------- PP------ --------",ST_MISC_BUSOP)
+DEF_CLASS32(ICLASS_ST" 0--0 0------- PP------ --------",ST_MISC_CACHEOP)
+
+
+#define STD_ST_ENC(TAG,OPC,SRC) \
+DEF_ENC32(S2_store##TAG##_io,   ICLASS_ST" 0 ii "OPC"  sssss  PPi"SRC"  iiiiiiii")\
+DEF_ENC32(S2_store##TAG##_pi,   ICLASS_ST" 1 01 "OPC"  xxxxx  PP0"SRC"  0iiii-0-")\
+DEF_ENC32(S4_store##TAG##_ap,   ICLASS_ST" 1 01 "OPC"  eeeee  PP0"SRC"  1-IIIIII")\
+DEF_ENC32(S2_store##TAG##_pr,   ICLASS_ST" 1 10 "OPC"  xxxxx  PPu"SRC"  0-------")\
+DEF_ENC32(S4_store##TAG##_ur,   ICLASS_ST" 1 10 "OPC"  uuuuu  PPi"SRC"  1iIIIIII")\
+
+
+#define STD_PST_ENC(TAG,OPC,SRC) \
+DEF_ENC32(S2_pstore##TAG##t_pi,    ICLASS_ST" 1 01 "OPC"  xxxxx  PP1"SRC"  0iiii0vv")\
+DEF_ENC32(S2_pstore##TAG##f_pi,    ICLASS_ST" 1 01 "OPC"  xxxxx  PP1"SRC"  0iiii1vv")\
+DEF_ENC32(S2_pstore##TAG##tnew_pi, ICLASS_ST" 1 01 "OPC"  xxxxx  PP1"SRC"  1iiii0vv")\
+DEF_ENC32(S2_pstore##TAG##fnew_pi, ICLASS_ST" 1 01 "OPC"  xxxxx  PP1"SRC"  1iiii1vv")\
+DEF_ENC32(S4_pstore##TAG##t_abs,   ICLASS_ST" 1 11 "OPC"  ---ii  PP0"SRC"  1iiii0vv")\
+DEF_ENC32(S4_pstore##TAG##f_abs,   ICLASS_ST" 1 11 "OPC"  ---ii  PP0"SRC"  1iiii1vv")\
+DEF_ENC32(S4_pstore##TAG##tnew_abs,ICLASS_ST" 1 11 "OPC"  ---ii  PP1"SRC"  1iiii0vv")\
+DEF_ENC32(S4_pstore##TAG##fnew_abs,ICLASS_ST" 1 11 "OPC"  ---ii  PP1"SRC"  1iiii1vv")
+
+
+/*                 0 0--  Store Misc */
+/*                 0 1xx  Available */
+STD_ST_ENC(rb,    "1 000","ttttt")
+STD_ST_ENC(rh,    "1 010","ttttt")
+STD_ST_ENC(rf,    "1 011","ttttt")
+STD_ST_ENC(ri,    "1 100","ttttt")
+STD_ST_ENC(rd,    "1 110","ttttt")
+STD_ST_ENC(rbnew, "1 101","00ttt")
+STD_ST_ENC(rhnew, "1 101","01ttt")
+STD_ST_ENC(rinew, "1 101","10ttt")
+
+STD_PST_ENC(rb,    "1 000","ttttt")
+STD_PST_ENC(rh,    "1 010","ttttt")
+STD_PST_ENC(rf,    "1 011","ttttt")
+STD_PST_ENC(ri,    "1 100","ttttt")
+STD_PST_ENC(rd,    "1 110","ttttt")
+STD_PST_ENC(rbnew, "1 101","00ttt")
+STD_PST_ENC(rhnew, "1 101","01ttt")
+STD_PST_ENC(rinew, "1 101","10ttt")
+
+
+
+/* User */
+/*                                   xx - st_misc */
+/*                                                */
+/*                               x bus/cache     */
+/*                                    x store/cache     */
+DEF_ENC32(S2_allocframe,   ICLASS_ST" 000 01 00xxxxx PP000iii iiiiiiii")
+DEF_ENC32(S2_storew_locked,ICLASS_ST" 000 01 01sssss PP-ttttt ------dd")
+DEF_ENC32(S4_stored_locked,ICLASS_ST" 000 01 11sssss PP0ttttt ------dd")
+DEF_ENC32(Y2_dczeroa,      ICLASS_ST" 000 01 10sssss PP0----- --------")
+
+
+DEF_ENC32(Y2_barrier,      ICLASS_ST" 100 00 00----- PP------ 000-----")
+DEF_ENC32(Y2_syncht,       ICLASS_ST" 100 00 10----- PP------ --------")
+
+
+
+DEF_ENC32(Y2_dccleana,     ICLASS_ST" 000 00 00sssss PP------ --------")
+DEF_ENC32(Y2_dcinva,       ICLASS_ST" 000 00 01sssss PP------ --------")
+DEF_ENC32(Y2_dccleaninva,  ICLASS_ST" 000 00 10sssss PP------ --------")
+
+/*******************************/
+/*                             */
+/*                             */
+/*           JUMP              */
+/*                             */
+/*                             */
+/*******************************/
+
+DEF_CLASS32(ICLASS_J" ---- -------- PP------ --------",J)
+DEF_CLASS32(ICLASS_J" 0--- -------- PP------ --------",JUMPR_MISC)
+DEF_CLASS32(ICLASS_J" 10-- -------- PP------ --------",UCJUMP)
+DEF_CLASS32(ICLASS_J" 110- -------- PP------ --------",CJUMP)
+DEF_FIELD32(ICLASS_J" 110- -------- PP--!--- --------",J_DN,"Dot-new")
+DEF_FIELD32(ICLASS_J" 110- -------- PP-!---- --------",J_PT,"Predict-taken")
+
+
+
+DEF_FIELDROW_DESC32(ICLASS_J" 0000 -------- PP------ --------","[#0] PC=(Rs), R31=return")
+DEF_ENC32(J2_callr,     ICLASS_J" 0000  101sssss  PP------  --------")
+
+DEF_FIELDROW_DESC32(ICLASS_J" 0001 -------- PP------ --------","[#1] if (Pu) PC=(Rs), R31=return")
+DEF_ENC32(J2_callrt,    ICLASS_J" 0001  000sssss  PP----uu  --------")
+DEF_ENC32(J2_callrf,    ICLASS_J" 0001  001sssss  PP----uu  --------")
+
+DEF_FIELDROW_DESC32(ICLASS_J" 0010 -------- PP------ --------","[#2] PC=(Rs); ")
+DEF_ENC32(J2_jumpr,      ICLASS_J" 0010  100sssss  PP------  --------")
+DEF_ENC32(J4_hintjumpr,  ICLASS_J" 0010  101sssss  PP------  --------")
+
+DEF_FIELDROW_DESC32(ICLASS_J" 0011 -------- PP------ --------","[#3] if (Pu) PC=(Rs) ")
+DEF_ENC32(J2_jumprt,   ICLASS_J" 0011  010sssss  PP-00-uu  --------")
+DEF_ENC32(J2_jumprf,   ICLASS_J" 0011  011sssss  PP-00-uu  --------")
+DEF_ENC32(J2_jumprtpt,    ICLASS_J" 0011  010sssss  PP-10-uu  --------")
+DEF_ENC32(J2_jumprfpt,    ICLASS_J" 0011  011sssss  PP-10-uu  --------")
+DEF_ENC32(J2_jumprtnew,   ICLASS_J" 0011  010sssss  PP-01-uu  --------")
+DEF_ENC32(J2_jumprfnew,   ICLASS_J" 0011  011sssss  PP-01-uu  --------")
+DEF_ENC32(J2_jumprtnewpt, ICLASS_J" 0011  010sssss  PP-11-uu  --------")
+DEF_ENC32(J2_jumprfnewpt, ICLASS_J" 0011  011sssss  PP-11-uu  --------")
+
+DEF_FIELDROW_DESC32(ICLASS_J" 0100 -------- PP------ --------","[#4] (#u8) ")
+DEF_ENC32(J2_trap0,     ICLASS_J" 0100  00------  PP-iiiii  ---iii--")
+
+DEF_FIELDROW_DESC32(ICLASS_J" 0110 -------- PP------ --------","[#6] icop(Rs) ")
+DEF_ENC32(Y2_icinva,    ICLASS_J" 0110  110sssss  PP000---  --------")
+
+DEF_FIELDROW_DESC32(ICLASS_J" 0111 -------- PP------ --------","[#7] () ")
+DEF_ENC32(Y2_isync,     ICLASS_J" 0111  11000000  PP0---00  00000010")
+
+/* JUMP */
+DEF_FIELDROW_DESC32(ICLASS_J" 100- -------- PP------ --------","[#8,9] PC=(#r22)")
+DEF_ENC32(J2_jump,      ICLASS_J" 100i  iiiiiiii  PPiiiiii  iiiiiii-")
+
+DEF_FIELDROW_DESC32(ICLASS_J" 101- -------- PP------ --------","[#10,11] PC=(#r22), R31=return")
+DEF_ENC32(J2_call,      ICLASS_J" 101i  iiiiiiii  PPiiiiii  iiiiiii0")
+
+DEF_FIELDROW_DESC32(ICLASS_J" 1100 -------- PP------ --------","[#12] if (Pu) PC=(#r15)")
+DEF_ENC32(J2_jumpt,  ICLASS_J" 1100  ii0iiiii  PPi00-uu  iiiiiii-")
+DEF_ENC32(J2_jumpf,  ICLASS_J" 1100  ii1iiiii  PPi00-uu  iiiiiii-")
+DEF_ENC32(J2_jumptpt,   ICLASS_J" 1100  ii0iiiii  PPi10-uu  iiiiiii-")
+DEF_ENC32(J2_jumpfpt,   ICLASS_J" 1100  ii1iiiii  PPi10-uu  iiiiiii-")
+DEF_ENC32(J2_jumptnew,  ICLASS_J" 1100  ii0iiiii  PPi01-uu  iiiiiii-")
+DEF_ENC32(J2_jumpfnew,  ICLASS_J" 1100  ii1iiiii  PPi01-uu  iiiiiii-")
+DEF_ENC32(J2_jumptnewpt,ICLASS_J" 1100  ii0iiiii  PPi11-uu  iiiiiii-")
+DEF_ENC32(J2_jumpfnewpt,ICLASS_J" 1100  ii1iiiii  PPi11-uu  iiiiiii-")
+
+DEF_FIELDROW_DESC32(ICLASS_J" 1101 -------- PP------ --------","[#13] if (Pu) PC=(#r15), R31=return")
+DEF_ENC32(J2_callt,     ICLASS_J" 1101  ii0iiiii  PPi-0-uu  iiiiiii-")
+DEF_ENC32(J2_callf,     ICLASS_J" 1101  ii1iiiii  PPi-0-uu  iiiiiii-")
+
+
+
+
+
+
+
+/*******************************/
+/*                             */
+/*        V4                   */
+/*   COMPOUND COMPARE-JUMPS    */
+/*                             */
+/*                             */
+/*******************************/
+
+
+/* EJP: this has to match what we have in htmldocs.py... so I will call it CJ, we can change it */
+DEF_CLASS32(ICLASS_CJ" 0--- -------- PP------ --------",CJ)
+
+DEF_FIELDROW_DESC32(ICLASS_CJ" 00-- -------- -------- --------","[#0-3]  pd=cmp.xx(R,#u5) ; if ([!]p0.new) jump:[h] #s9:2 ")
+DEF_FIELDROW_DESC32(ICLASS_CJ" 010- -------- -------- --------","[#4,5]  pd=cmp.eq(R,R) ; if ([!]p0.new) jump:[h] #s9:2 ")
+DEF_FIELDROW_DESC32(ICLASS_CJ" 0110 -------- -------- --------","[#6]    Rd=#u6 ; jump #s9:2 ")
+DEF_FIELDROW_DESC32(ICLASS_CJ" 0111 -------- -------- --------","[#7]    Rd=Rs ; jump #s9:2 ")
+
+
+#define CMPJMPI_ENC(TAG,OPC) \
+DEF_ENC32(TAG##i_tp0_jump_t,      ICLASS_CJ" 00 0 "OPC"  0iissss  PP1IIIII  iiiiiii-") \
+DEF_ENC32(TAG##i_fp0_jump_t,      ICLASS_CJ" 00 0 "OPC"  1iissss  PP1IIIII  iiiiiii-") \
+DEF_ENC32(TAG##i_tp0_jump_nt,     ICLASS_CJ" 00 0 "OPC"  0iissss  PP0IIIII  iiiiiii-") \
+DEF_ENC32(TAG##i_fp0_jump_nt,     ICLASS_CJ" 00 0 "OPC"  1iissss  PP0IIIII  iiiiiii-") \
+\
+DEF_ENC32(TAG##i_tp1_jump_t,      ICLASS_CJ" 00 1 "OPC"  0iissss  PP1IIIII  iiiiiii-") \
+DEF_ENC32(TAG##i_fp1_jump_t,      ICLASS_CJ" 00 1 "OPC"  1iissss  PP1IIIII  iiiiiii-") \
+DEF_ENC32(TAG##i_tp1_jump_nt,     ICLASS_CJ" 00 1 "OPC"  0iissss  PP0IIIII  iiiiiii-") \
+DEF_ENC32(TAG##i_fp1_jump_nt,     ICLASS_CJ" 00 1 "OPC"  1iissss  PP0IIIII  iiiiiii-")
+
+CMPJMPI_ENC(J4_cmpeq,"00")
+CMPJMPI_ENC(J4_cmpgt,"01")
+CMPJMPI_ENC(J4_cmpgtu,"10")
+
+
+#define CMPJMP1I_ENC(TAG,OPC) \
+DEF_ENC32(TAG##_tp0_jump_t,      ICLASS_CJ" 00 0  11  0iissss  PP1---"OPC"  iiiiiii-") \
+DEF_ENC32(TAG##_fp0_jump_t,      ICLASS_CJ" 00 0  11  1iissss  PP1---"OPC"  iiiiiii-") \
+DEF_ENC32(TAG##_tp0_jump_nt,     ICLASS_CJ" 00 0  11  0iissss  PP0---"OPC"  iiiiiii-") \
+DEF_ENC32(TAG##_fp0_jump_nt,     ICLASS_CJ" 00 0  11  1iissss  PP0---"OPC"  iiiiiii-") \
+\
+DEF_ENC32(TAG##_tp1_jump_t,      ICLASS_CJ" 00 1  11  0iissss  PP1---"OPC"  iiiiiii-") \
+DEF_ENC32(TAG##_fp1_jump_t,      ICLASS_CJ" 00 1  11  1iissss  PP1---"OPC"  iiiiiii-") \
+DEF_ENC32(TAG##_tp1_jump_nt,     ICLASS_CJ" 00 1  11  0iissss  PP0---"OPC"  iiiiiii-") \
+DEF_ENC32(TAG##_fp1_jump_nt,     ICLASS_CJ" 00 1  11  1iissss  PP0---"OPC"  iiiiiii-")
+
+CMPJMP1I_ENC(J4_cmpeqn1,"00")
+CMPJMP1I_ENC(J4_cmpgtn1,"01")
+CMPJMP1I_ENC(J4_tstbit0,"11")
+
+
+
+#define CMPJMPR_ENC(TAG,OPC) \
+DEF_ENC32(TAG##_tp0_jump_t,       ICLASS_CJ" 01 0 "OPC"  0iissss  PP10tttt  iiiiiii-") \
+DEF_ENC32(TAG##_fp0_jump_t,       ICLASS_CJ" 01 0 "OPC"  1iissss  PP10tttt  iiiiiii-") \
+DEF_ENC32(TAG##_tp0_jump_nt,      ICLASS_CJ" 01 0 "OPC"  0iissss  PP00tttt  iiiiiii-") \
+DEF_ENC32(TAG##_fp0_jump_nt,      ICLASS_CJ" 01 0 "OPC"  1iissss  PP00tttt  iiiiiii-") \
+\
+DEF_ENC32(TAG##_tp1_jump_t,       ICLASS_CJ" 01 0 "OPC"  0iissss  PP11tttt  iiiiiii-") \
+DEF_ENC32(TAG##_fp1_jump_t,       ICLASS_CJ" 01 0 "OPC"  1iissss  PP11tttt  iiiiiii-") \
+DEF_ENC32(TAG##_tp1_jump_nt,      ICLASS_CJ" 01 0 "OPC"  0iissss  PP01tttt  iiiiiii-") \
+DEF_ENC32(TAG##_fp1_jump_nt,      ICLASS_CJ" 01 0 "OPC"  1iissss  PP01tttt  iiiiiii-")
+
+CMPJMPR_ENC(J4_cmpeq,"00")
+CMPJMPR_ENC(J4_cmpgt,"01")
+CMPJMPR_ENC(J4_cmpgtu,"10")
+
+
+DEF_ENC32(J4_jumpseti,            ICLASS_CJ" 0110  --iidddd  PPIIIIII  iiiiiii-")
+DEF_ENC32(J4_jumpsetr,            ICLASS_CJ" 0111  --iissss  PP--dddd  iiiiiii-")
+
+
+DEF_EXT_SPACE(EXT_CJ,             ICLASS_CJ"1 iii  iiiiiiii  PPiiiiii  iiiiiiii")
+
+
+
+DEF_CLASS32(ICLASS_NCJ" 0--- -------- PP------ --------",NCJ)
+DEF_FIELDROW_DESC32(ICLASS_NCJ" 00-- -------- -------- --------","[#0-3] if (cmp.xx(R.new,R)) jump:[h] #s9:2 ")
+DEF_FIELDROW_DESC32(ICLASS_NCJ" 01-- -------- -------- --------","[#4-7] if (cmp.xx(R.new,#U5)) jump:[h] #s9:2 ")
+
+#define OPRJMP_ENC(TAG,OPC) \
+DEF_ENC32(TAG##_t_jumpnv_t,       ICLASS_NCJ" 00 "OPC"  0ii-sss  PP1ttttt  iiiiiii-") \
+DEF_ENC32(TAG##_f_jumpnv_t,       ICLASS_NCJ" 00 "OPC"  1ii-sss  PP1ttttt  iiiiiii-") \
+DEF_ENC32(TAG##_t_jumpnv_nt,      ICLASS_NCJ" 00 "OPC"  0ii-sss  PP0ttttt  iiiiiii-") \
+DEF_ENC32(TAG##_f_jumpnv_nt,      ICLASS_NCJ" 00 "OPC"  1ii-sss  PP0ttttt  iiiiiii-")
+
+OPRJMP_ENC(J4_cmpeq,   "000")
+OPRJMP_ENC(J4_cmpgt,   "001")
+OPRJMP_ENC(J4_cmpgtu,  "010")
+OPRJMP_ENC(J4_cmplt,   "011")
+OPRJMP_ENC(J4_cmpltu,  "100")
+
+
+#define OPIJMP_ENC(TAG,OPC) \
+DEF_ENC32(TAG##_t_jumpnv_t,       ICLASS_NCJ" 01 "OPC"  0ii-sss  PP1IIIII  iiiiiii-") \
+DEF_ENC32(TAG##_f_jumpnv_t,       ICLASS_NCJ" 01 "OPC"  1ii-sss  PP1IIIII  iiiiiii-") \
+DEF_ENC32(TAG##_t_jumpnv_nt,      ICLASS_NCJ" 01 "OPC"  0ii-sss  PP0IIIII  iiiiiii-") \
+DEF_ENC32(TAG##_f_jumpnv_nt,      ICLASS_NCJ" 01 "OPC"  1ii-sss  PP0IIIII  iiiiiii-")
+
+OPIJMP_ENC(J4_cmpeqi,  "000")
+OPIJMP_ENC(J4_cmpgti,  "001")
+OPIJMP_ENC(J4_cmpgtui, "010")
+
+
+#define OPI1JMP_ENC(TAG,OPC) \
+DEF_ENC32(TAG##_t_jumpnv_t,       ICLASS_NCJ" 01 "OPC"  0ii-sss  PP1-----  iiiiiii-") \
+DEF_ENC32(TAG##_f_jumpnv_t,       ICLASS_NCJ" 01 "OPC"  1ii-sss  PP1-----  iiiiiii-") \
+DEF_ENC32(TAG##_t_jumpnv_nt,      ICLASS_NCJ" 01 "OPC"  0ii-sss  PP0-----  iiiiiii-") \
+DEF_ENC32(TAG##_f_jumpnv_nt,      ICLASS_NCJ" 01 "OPC"  1ii-sss  PP0-----  iiiiiii-")
+
+OPI1JMP_ENC(J4_cmpeqn1,  "100")
+OPI1JMP_ENC(J4_cmpgtn1,  "101")
+OPI1JMP_ENC(J4_tstbit0,  "011")
+
+
+DEF_EXT_SPACE(EXT_NCJ,             ICLASS_NCJ"1 iii  iiiiiiii  PPiiiiii  iiiiiiii")
+
+
+
+/*******************************/
+/*                             */
+/*                             */
+/*           CR                */
+/*                             */
+/*                             */
+/*******************************/
+
+
+
+DEF_CLASS32(ICLASS_CR" ---- -------- PP------ --------",CR)
+DEF_CLASS32(ICLASS_CR" -0-- -------- PP------ --------",CRUSER)
+DEF_CLASS32(ICLASS_CR" -1-- -------- PP------ --------",CRSUPER)
+
+DEF_FIELD32(ICLASS_CR" -!-- -------- PP------ --------",CR_sm,"Supervisor mode only")
+
+/* User CR ops */
+
+DEF_FIELDROW_DESC32(    ICLASS_CR" 0000  --------  PP------  --------","[#0] (Rs,#r8)")
+DEF_ENC32(J2_loop0r,    ICLASS_CR" 0000  000sssss  PP-iiiii  ---ii---")
+DEF_ENC32(J2_loop1r,    ICLASS_CR" 0000  001sssss  PP-iiiii  ---ii---")
+DEF_ENC32(J2_ploop1sr,  ICLASS_CR" 0000  101sssss  PP-iiiii  ---ii---")
+DEF_ENC32(J2_ploop2sr,  ICLASS_CR" 0000  110sssss  PP-iiiii  ---ii---")
+DEF_ENC32(J2_ploop3sr,  ICLASS_CR" 0000  111sssss  PP-iiiii  ---ii---")
+
+DEF_FIELDROW_DESC32(     ICLASS_CR" 0001  --------  PP------  --------","[#1] (Rs,#r13)")
+DEF_ENC32(J2_jumprz,     ICLASS_CR" 0001  00isssss  PPi0iiii  iiiiiii-")
+DEF_ENC32(J2_jumprzpt,   ICLASS_CR" 0001  00isssss  PPi1iiii  iiiiiii-")
+DEF_ENC32(J2_jumprnz,    ICLASS_CR" 0001  10isssss  PPi0iiii  iiiiiii-")
+DEF_ENC32(J2_jumprnzpt,  ICLASS_CR" 0001  10isssss  PPi1iiii  iiiiiii-")
+
+DEF_ENC32(J2_jumprgtez,  ICLASS_CR" 0001  01isssss  PPi0iiii  iiiiiii-")
+DEF_ENC32(J2_jumprgtezpt,ICLASS_CR" 0001  01isssss  PPi1iiii  iiiiiii-")
+DEF_ENC32(J2_jumprltez,  ICLASS_CR" 0001  11isssss  PPi0iiii  iiiiiii-")
+DEF_ENC32(J2_jumprltezpt,ICLASS_CR" 0001  11isssss  PPi1iiii  iiiiiii-")
+
+DEF_FIELDROW_DESC32(    ICLASS_CR" 0010  --------  PP------  --------","[#2] Cd=Rs ")
+DEF_ENC32(A2_tfrrcr,    ICLASS_CR" 0010  001sssss  PP------  ---ddddd")
+
+DEF_FIELDROW_DESC32(    ICLASS_CR" 0011  --------  PP------  --------","[#3] Cdd=Rss ")
+DEF_ENC32(A4_tfrpcp,    ICLASS_CR" 0011  001sssss  PP------  ---ddddd")
+
+DEF_FIELDROW_DESC32(    ICLASS_CR" 1000  --------  PP------  --------","[#8] Rdd=Css ")
+DEF_ENC32(A4_tfrcpp,    ICLASS_CR" 1000  000sssss  PP------  ---ddddd")
+
+DEF_FIELDROW_DESC32(    ICLASS_CR" 1001  --------  PP------  --------","[#9] (#r8,#U10)")
+DEF_ENC32(J2_ploop1si,  ICLASS_CR" 1001  101IIIII  PP-iiiii  IIIii-II")
+DEF_ENC32(J2_ploop2si,  ICLASS_CR" 1001  110IIIII  PP-iiiii  IIIii-II")
+DEF_ENC32(J2_ploop3si,  ICLASS_CR" 1001  111IIIII  PP-iiiii  IIIii-II")
+DEF_ENC32(J2_loop0i,    ICLASS_CR" 1001  000IIIII  PP-iiiii  IIIii-II")
+DEF_ENC32(J2_loop1i,    ICLASS_CR" 1001  001IIIII  PP-iiiii  IIIii-II")
+
+DEF_FIELDROW_DESC32(    ICLASS_CR" 1010  --------  PP------  --------","[#10] Rd=Cs ")
+DEF_ENC32(A2_tfrcrr,    ICLASS_CR" 1010  000sssss  PP------  ---ddddd")
+DEF_ENC32(C4_addipc,    ICLASS_CR" 1010  01001001  PP-iiiii  i--ddddd")
+
+
+DEF_FIELDROW_DESC32(    ICLASS_CR" 1011  --------  PP0-----  --------","[#11] Pd=(Ps,Pt,Pu)")
+DEF_ENC32(C2_and,       ICLASS_CR" 1011  0000--ss  PP0---tt  ------dd")
+DEF_ENC32(C2_or,        ICLASS_CR" 1011  0010--ss  PP0---tt  ------dd")
+DEF_ENC32(C2_xor,       ICLASS_CR" 1011  0100--ss  PP0---tt  ------dd")
+DEF_ENC32(C2_andn,      ICLASS_CR" 1011  0110--ss  PP0---tt  ------dd")
+DEF_ENC32(C2_any8,      ICLASS_CR" 1011  1000--ss  PP0-----  ------dd")
+DEF_ENC32(C2_all8,      ICLASS_CR" 1011  1010--ss  PP0-----  ------dd")
+DEF_ENC32(C2_not,       ICLASS_CR" 1011  1100--ss  PP0-----  ------dd")
+DEF_ENC32(C2_orn,       ICLASS_CR" 1011  1110--ss  PP0---tt  ------dd")
+
+DEF_ENC32(C4_and_and,   ICLASS_CR" 1011  0001--ss  PP0---tt  uu----dd")
+DEF_ENC32(C4_and_or,    ICLASS_CR" 1011  0011--ss  PP0---tt  uu----dd")
+DEF_ENC32(C4_or_and,    ICLASS_CR" 1011  0101--ss  PP0---tt  uu----dd")
+DEF_ENC32(C4_or_or,     ICLASS_CR" 1011  0111--ss  PP0---tt  uu----dd")
+DEF_ENC32(C4_and_andn,  ICLASS_CR" 1011  1001--ss  PP0---tt  uu----dd")
+DEF_ENC32(C4_and_orn,   ICLASS_CR" 1011  1011--ss  PP0---tt  uu----dd")
+DEF_ENC32(C4_or_andn,   ICLASS_CR" 1011  1101--ss  PP0---tt  uu----dd")
+DEF_ENC32(C4_or_orn,    ICLASS_CR" 1011  1111--ss  PP0---tt  uu----dd")
+
+DEF_ENC32(C4_fastcorner9,     ICLASS_CR"1011 0000--ss  PP1---tt 1--1--dd")
+DEF_ENC32(C4_fastcorner9_not, ICLASS_CR"1011 0001--ss  PP1---tt 1--1--dd")
+
+
+
+/*******************************/
+/*                             */
+/*                             */
+/*           M                 */
+/*                             */
+/*                             */
+/*******************************/
+
+
+DEF_CLASS32(ICLASS_M" ---- -------- PP------ --------",M)
+DEF_FIELD32(ICLASS_M" !!!! -------- PP------ --------",M_RegType,"Register Type")
+DEF_FIELD32(ICLASS_M" ---- !!!----- PP------ --------",M_MajOp,"Major Opcode")
+DEF_FIELD32(ICLASS_M" ---- -------- PP------ !!!-----",M_MinOp,"Minor Opcode")
+
+
+
+#define SP_MPY(TAG,REGTYPE,DSTCHARS,SAT,RND,UNS)\
+DEF_ENC32(TAG##_ll_s0, ICLASS_M  REGTYPE "0"  UNS RND"sssss  PP-ttttt "SAT"00"   DSTCHARS)\
+DEF_ENC32(TAG##_lh_s0, ICLASS_M  REGTYPE "0"  UNS RND"sssss  PP-ttttt "SAT"01"   DSTCHARS)\
+DEF_ENC32(TAG##_hl_s0, ICLASS_M  REGTYPE "0"  UNS RND"sssss  PP-ttttt "SAT"10"   DSTCHARS)\
+DEF_ENC32(TAG##_hh_s0, ICLASS_M  REGTYPE "0"  UNS RND"sssss  PP-ttttt "SAT"11"   DSTCHARS)\
+DEF_ENC32(TAG##_ll_s1, ICLASS_M  REGTYPE "1"  UNS RND"sssss  PP-ttttt "SAT"00"   DSTCHARS)\
+DEF_ENC32(TAG##_lh_s1, ICLASS_M  REGTYPE "1"  UNS RND"sssss  PP-ttttt "SAT"01"   DSTCHARS)\
+DEF_ENC32(TAG##_hl_s1, ICLASS_M  REGTYPE "1"  UNS RND"sssss  PP-ttttt "SAT"10"   DSTCHARS)\
+DEF_ENC32(TAG##_hh_s1, ICLASS_M  REGTYPE "1"  UNS RND"sssss  PP-ttttt "SAT"11"   DSTCHARS)
+
+/* Double precision                   */
+#define MPY_ENC(TAG,REGTYPE,DSTCHARS,SAT,RNDNAC,UNS,SHFT,VMIN2)\
+DEF_ENC32(TAG, ICLASS_M REGTYPE SHFT UNS RNDNAC"sssss  PP0ttttt "SAT VMIN2 DSTCHARS)
+
+#define MPYI_ENC(TAG,REGTYPE,DSTCHARS,RNDNAC,UNS,SHFT)\
+DEF_ENC32(TAG, ICLASS_M REGTYPE SHFT UNS RNDNAC"sssss  PP0iiiii iii" DSTCHARS)
+
+
+DEF_FIELDROW_DESC32(ICLASS_M" 0000 -------- PP------ --------","[#0] Rd=(Rs,#u8)")
+MPYI_ENC(M2_mpysip,          "0000","ddddd","-","-","0"     )
+MPYI_ENC(M2_mpysin,          "0000","ddddd","-","-","1"     )
+
+
+DEF_FIELDROW_DESC32(ICLASS_M" 0001 -------- PP------ --------","[#1] Rx=(Rs,#u8)")
+MPYI_ENC(M2_macsip,          "0001","xxxxx","-","-","0"     )
+MPYI_ENC(M2_macsin,          "0001","xxxxx","-","-","1"     )
+
+
+DEF_FIELDROW_DESC32(ICLASS_M" 0010 -------- PP------ --------","[#2] Rx=(Rs,#s8)")
+MPYI_ENC(M2_accii,           "0010","xxxxx","-","-","0"     )
+MPYI_ENC(M2_naccii,          "0010","xxxxx","-","-","1"     )
+
+
+DEF_FIELDROW_DESC32(ICLASS_M" 0011 -------- PP------ --------","[#3] Ry=(Ru,(Rs,Ry)) ")
+DEF_ENC32(M4_mpyrr_addr,ICLASS_M" 0011 000sssss PP-yyyyy ---uuuuu")
+
+
+DEF_FIELDROW_DESC32(ICLASS_M" 0100 -------- PP------ --------","[#4] Rdd=(Rs,Rt)")
+DEF_FIELD32(ICLASS_M"         0100 -------- PP------ --!-----",Ma_tH,"Rt is High") /* Rt high */
+DEF_FIELD32(ICLASS_M"         0100 -------- PP------ -!------",Ma_sH,"Rs is High") /* Rs high */
+SP_MPY(M2_mpyd,              "0100","ddddd","-","0","0")
+SP_MPY(M2_mpyd_rnd,          "0100","ddddd","-","1","0")
+SP_MPY(M2_mpyud,             "0100","ddddd","-","0","1")
+
+
+DEF_FIELDROW_DESC32(ICLASS_M" 0101 -------- PP------ --------","[#5] Rdd=(Rs,Rt)")
+MPY_ENC(M2_dpmpyss_s0,       "0101","ddddd","0","0","0","0","00")
+MPY_ENC(M2_dpmpyuu_s0,       "0101","ddddd","0","0","1","0","00")
+MPY_ENC(M2_vmpy2s_s0,        "0101","ddddd","1","0","0","0","01")
+MPY_ENC(M2_vmpy2s_s1,        "0101","ddddd","1","0","0","1","01")
+MPY_ENC(M2_cmpyi_s0,         "0101","ddddd","0","0","0","0","01")
+MPY_ENC(M2_cmpyr_s0,         "0101","ddddd","0","0","0","0","10")
+MPY_ENC(M2_cmpys_s0,         "0101","ddddd","1","0","0","0","10")
+MPY_ENC(M2_cmpys_s1,         "0101","ddddd","1","0","0","1","10")
+MPY_ENC(M2_cmpysc_s0,        "0101","ddddd","1","0","1","0","10")
+MPY_ENC(M2_cmpysc_s1,        "0101","ddddd","1","0","1","1","10")
+MPY_ENC(M2_vmpy2su_s0,       "0101","ddddd","1","0","0","0","11")
+MPY_ENC(M2_vmpy2su_s1,       "0101","ddddd","1","0","0","1","11")
+MPY_ENC(M4_pmpyw,            "0101","ddddd","1","0","1","0","11")
+MPY_ENC(M4_vpmpyh,           "0101","ddddd","1","0","1","1","11")
+MPY_ENC(M5_vmpybuu,          "0101","ddddd","0","0","0","1","01")
+MPY_ENC(M5_vmpybsu,          "0101","ddddd","0","0","1","0","01")
+
+
+
+
+DEF_FIELDROW_DESC32(ICLASS_M" 0110 -------- PP------ --------","[#6] Rxx=(Rs,Rt)")
+DEF_FIELD32(ICLASS_M"         0110 -------- PP------ --!-----",Mb_tH,"Rt is High") /* Rt high */
+DEF_FIELD32(ICLASS_M"         0110 -------- PP------ -!------",Mb_sH,"Rs is High") /* Rs high */
+SP_MPY(M2_mpyd_acc,          "0110","xxxxx","0","0","0")
+SP_MPY(M2_mpyud_acc,         "0110","xxxxx","0","0","1")
+SP_MPY(M2_mpyd_nac,          "0110","xxxxx","0","1","0")
+SP_MPY(M2_mpyud_nac,         "0110","xxxxx","0","1","1")
+
+
+DEF_FIELDROW_DESC32(ICLASS_M" 0111 -------- PP------ --------","[#7] Rxx=(Rs,Rt)")
+MPY_ENC(M2_dpmpyss_acc_s0,   "0111","xxxxx","0","0","0","0","00")
+MPY_ENC(M2_dpmpyss_nac_s0,   "0111","xxxxx","0","1","0","0","00")
+MPY_ENC(M2_dpmpyuu_acc_s0,   "0111","xxxxx","0","0","1","0","00")
+MPY_ENC(M2_dpmpyuu_nac_s0,   "0111","xxxxx","0","1","1","0","00")
+MPY_ENC(M2_vmac2s_s0,        "0111","xxxxx","1","0","0","0","01")
+MPY_ENC(M2_vmac2s_s1,        "0111","xxxxx","1","0","0","1","01")
+MPY_ENC(M2_cmaci_s0,         "0111","xxxxx","0","0","0","0","01")
+MPY_ENC(M2_cmacr_s0,         "0111","xxxxx","0","0","0","0","10")
+MPY_ENC(M2_cmacs_s0,         "0111","xxxxx","1","0","0","0","10")
+MPY_ENC(M2_cmacs_s1,         "0111","xxxxx","1","0","0","1","10")
+MPY_ENC(M2_cmacsc_s0,        "0111","xxxxx","1","0","1","0","10")
+MPY_ENC(M2_cmacsc_s1,        "0111","xxxxx","1","0","1","1","10")
+MPY_ENC(M2_vmac2,            "0111","xxxxx","0","1","0","0","01")
+MPY_ENC(M2_cnacs_s0,         "0111","xxxxx","1","0","0","0","11")
+MPY_ENC(M2_cnacs_s1,         "0111","xxxxx","1","0","0","1","11")
+MPY_ENC(M2_cnacsc_s0,        "0111","xxxxx","1","0","1","0","11")
+MPY_ENC(M2_cnacsc_s1,        "0111","xxxxx","1","0","1","1","11")
+MPY_ENC(M2_vmac2su_s0,       "0111","xxxxx","1","1","1","0","01")
+MPY_ENC(M2_vmac2su_s1,       "0111","xxxxx","1","1","1","1","01")
+MPY_ENC(M4_pmpyw_acc,        "0111","xxxxx","1","1","0","0","11")
+MPY_ENC(M4_vpmpyh_acc,       "0111","xxxxx","1","1","0","1","11")
+MPY_ENC(M5_vmacbuu,          "0111","xxxxx","0","0","0","1","01")
+MPY_ENC(M5_vmacbsu,          "0111","xxxxx","0","0","1","1","01")
+
+
+
+
+
+DEF_FIELDROW_DESC32(ICLASS_M" 1000 -------- PP------ --------","[#8] Rdd=(Rss,Rtt)")
+MPY_ENC(M2_vrcmpyi_s0,       "1000","ddddd","0","0","0","0","00")
+MPY_ENC(M2_vdmpys_s0,        "1000","ddddd","1","0","0","0","00")
+MPY_ENC(M2_vdmpys_s1,        "1000","ddddd","1","0","0","1","00")
+MPY_ENC(M2_vrcmpyi_s0c,      "1000","ddddd","0","0","1","0","00")
+MPY_ENC(M2_vabsdiffw,        "1000","ddddd","0","1","0","0","00")
+MPY_ENC(M6_vabsdiffub,       "1000","ddddd","0","1","0","1","00")
+MPY_ENC(M2_vabsdiffh,        "1000","ddddd","0","1","1","0","00")
+MPY_ENC(M6_vabsdiffb,        "1000","ddddd","0","1","1","1","00")
+MPY_ENC(M2_vrcmpys_s1_h,     "1000","ddddd","1","1","0","1","00")
+MPY_ENC(M2_vrcmpys_s1_l,     "1000","ddddd","1","1","1","1","00")
+MPY_ENC(M2_vrcmpyr_s0c,      "1000","ddddd","0","1","1","0","01")
+MPY_ENC(M2_vrcmpyr_s0,       "1000","ddddd","0","0","0","0","01")
+MPY_ENC(A2_vraddub,          "1000","ddddd","0","0","1","0","01")
+MPY_ENC(M2_mmpyl_s0,         "1000","ddddd","1","0","0","0","01")
+MPY_ENC(M2_mmpyl_s1,         "1000","ddddd","1","0","0","1","01")
+MPY_ENC(M2_mmpyl_rs0,        "1000","ddddd","1","1","0","0","01")
+MPY_ENC(M2_mmpyl_rs1,        "1000","ddddd","1","1","0","1","01")
+MPY_ENC(M2_mmpyul_s0,        "1000","ddddd","1","0","1","0","01")
+MPY_ENC(M2_mmpyul_s1,        "1000","ddddd","1","0","1","1","01")
+MPY_ENC(M2_mmpyul_rs0,       "1000","ddddd","1","1","1","0","01")
+MPY_ENC(M2_mmpyul_rs1,       "1000","ddddd","1","1","1","1","01")
+MPY_ENC(M2_vrmpy_s0,         "1000","ddddd","0","0","0","0","10")
+MPY_ENC(A2_vrsadub,          "1000","ddddd","0","0","1","0","10")
+MPY_ENC(M2_vmpy2es_s0,       "1000","ddddd","1","0","0","0","10")
+MPY_ENC(M2_vmpy2es_s1,       "1000","ddddd","1","0","0","1","10")
+MPY_ENC(M2_vcmpy_s0_sat_i,   "1000","ddddd","1","0","1","0","10")
+MPY_ENC(M2_vcmpy_s0_sat_r,   "1000","ddddd","1","1","0","0","10")
+MPY_ENC(M2_vcmpy_s1_sat_i,   "1000","ddddd","1","0","1","1","10")
+MPY_ENC(M2_vcmpy_s1_sat_r,   "1000","ddddd","1","1","0","1","10")
+
+MPY_ENC(M2_mmpyh_s0,         "1000","ddddd","1","0","0","0","11")
+MPY_ENC(M2_mmpyh_s1,         "1000","ddddd","1","0","0","1","11")
+MPY_ENC(M2_mmpyh_rs0,        "1000","ddddd","1","1","0","0","11")
+MPY_ENC(M2_mmpyh_rs1,        "1000","ddddd","1","1","0","1","11")
+MPY_ENC(M2_mmpyuh_s0,        "1000","ddddd","1","0","1","0","11")
+MPY_ENC(M2_mmpyuh_s1,        "1000","ddddd","1","0","1","1","11")
+MPY_ENC(M2_mmpyuh_rs0,       "1000","ddddd","1","1","1","0","11")
+MPY_ENC(M2_mmpyuh_rs1,       "1000","ddddd","1","1","1","1","11")
+
+MPY_ENC(M4_vrmpyeh_s0,       "1000","ddddd","1","0","1","0","00")
+MPY_ENC(M4_vrmpyeh_s1,       "1000","ddddd","1","0","1","1","00")
+MPY_ENC(M4_vrmpyoh_s0,       "1000","ddddd","0","1","0","0","10")
+MPY_ENC(M4_vrmpyoh_s1,       "1000","ddddd","0","1","0","1","10")
+MPY_ENC(M5_vrmpybuu,         "1000","ddddd","0","0","0","1","01")
+MPY_ENC(M5_vrmpybsu,         "1000","ddddd","0","0","1","1","01")
+MPY_ENC(M5_vdmpybsu,         "1000","ddddd","0","1","0","1","01")
+
+MPY_ENC(F2_dfadd,            "1000","ddddd","0","0","0","0","11")
+MPY_ENC(F2_dfsub,            "1000","ddddd","0","0","0","1","11")
+MPY_ENC(F2_dfmpyfix,         "1000","ddddd","0","0","1","0","11")
+MPY_ENC(F2_dfmin,            "1000","ddddd","0","0","1","1","11")
+MPY_ENC(F2_dfmax,            "1000","ddddd","0","1","0","0","11")
+MPY_ENC(F2_dfmpyll,          "1000","ddddd","0","1","0","1","11")
+#ifdef ADD_DP_OPS
+MPY_ENC(F2_dfdivcheat,       "1000","ddddd","0","0","0","1","00")
+
+MPY_ENC(F2_dffixupn,         "1000","ddddd","0","1","0","1","11")
+MPY_ENC(F2_dffixupd,         "1000","ddddd","0","1","1","0","11")
+MPY_ENC(F2_dfrecipa,         "1000","ddddd","0","1","1","1","ee")
+#endif
+
+MPY_ENC(M7_dcmpyrw,          "1000","ddddd","0","0","0","1","10")
+MPY_ENC(M7_dcmpyrwc,         "1000","ddddd","0","0","1","1","10")
+MPY_ENC(M7_dcmpyiw,          "1000","ddddd","0","1","1","0","10")
+MPY_ENC(M7_dcmpyiwc,         "1000","ddddd","0","1","1","1","10")
+
+
+
+DEF_FIELDROW_DESC32(ICLASS_M" 1001 -------- PP------ --------","[#9] Rd=(Rss,Rtt)")
+MPY_ENC(M2_vdmpyrs_s0,       "1001","ddddd","0","0","0","0","00")
+MPY_ENC(M2_vdmpyrs_s1,       "1001","ddddd","0","0","0","1","00")
+
+MPY_ENC(M7_wcmpyrw,          "1001","ddddd","0","0","1","0","00")
+MPY_ENC(M7_wcmpyrw_rnd,      "1001","ddddd","0","0","1","1","00")
+MPY_ENC(M7_wcmpyiw,          "1001","ddddd","0","1","0","0","00")
+MPY_ENC(M7_wcmpyiw_rnd,      "1001","ddddd","0","1","0","1","00")
+
+MPY_ENC(M7_wcmpyrwc,         "1001","ddddd","0","1","1","0","00")
+MPY_ENC(M7_wcmpyrwc_rnd,     "1001","ddddd","0","1","1","1","00")
+MPY_ENC(M7_wcmpyiwc,         "1001","ddddd","1","0","0","0","00")
+MPY_ENC(M7_wcmpyiwc_rnd,     "1001","ddddd","1","0","0","1","00")
+
+
+
+MPY_ENC(M2_vradduh,          "1001","ddddd","-","-","-","0","01")
+MPY_ENC(M2_vrcmpys_s1rp_h,   "1001","ddddd","1","1","-","1","10")
+MPY_ENC(M2_vrcmpys_s1rp_l,   "1001","ddddd","1","1","-","1","11")
+MPY_ENC(M2_vraddh,           "1001","ddddd","1","1","-","0","11")
+
+
+DEF_FIELDROW_DESC32(ICLASS_M" 1010 -------- PP------ --------","[#10] Rxx=(Rss,Rtt)")
+MPY_ENC(M2_vrcmaci_s0,       "1010","xxxxx","0","0","0","0","00")
+MPY_ENC(M2_vdmacs_s0,        "1010","xxxxx","1","0","0","0","00")
+MPY_ENC(M2_vdmacs_s1,        "1010","xxxxx","1","0","0","1","00")
+MPY_ENC(M2_vrcmaci_s0c,      "1010","xxxxx","0","0","1","0","00")
+MPY_ENC(M2_vcmac_s0_sat_i,   "1010","xxxxx","1","0","1","0","00")
+MPY_ENC(M2_vcmac_s0_sat_r,   "1010","xxxxx","1","1","0","0","00")
+MPY_ENC(M2_vrcmpys_acc_s1_h, "1010","xxxxx","1","1","0","1","00")
+MPY_ENC(M2_vrcmpys_acc_s1_l, "1010","xxxxx","1","1","1","1","00")
+MPY_ENC(M2_vrcmacr_s0,       "1010","xxxxx","0","0","0","0","01")
+MPY_ENC(A2_vraddub_acc,      "1010","xxxxx","0","0","1","0","01")
+MPY_ENC(M2_mmacls_s0,        "1010","xxxxx","1","0","0","0","01")
+MPY_ENC(M2_mmacls_s1,        "1010","xxxxx","1","0","0","1","01")
+MPY_ENC(M2_mmacls_rs0,       "1010","xxxxx","1","1","0","0","01")
+MPY_ENC(M2_mmacls_rs1,       "1010","xxxxx","1","1","0","1","01")
+MPY_ENC(M2_mmaculs_s0,       "1010","xxxxx","1","0","1","0","01")
+MPY_ENC(M2_mmaculs_s1,       "1010","xxxxx","1","0","1","1","01")
+MPY_ENC(M2_mmaculs_rs0,      "1010","xxxxx","1","1","1","0","01")
+MPY_ENC(M2_mmaculs_rs1,      "1010","xxxxx","1","1","1","1","01")
+MPY_ENC(M2_vrcmacr_s0c,      "1010","xxxxx","0","1","1","0","01")
+MPY_ENC(M2_vrmac_s0,         "1010","xxxxx","0","0","0","0","10")
+MPY_ENC(A2_vrsadub_acc,      "1010","xxxxx","0","0","1","0","10")
+MPY_ENC(M2_vmac2es_s0,       "1010","xxxxx","1","0","0","0","10")
+MPY_ENC(M2_vmac2es_s1,       "1010","xxxxx","1","0","0","1","10")
+MPY_ENC(M2_vmac2es,          "1010","xxxxx","0","1","0","0","10")
+MPY_ENC(M2_mmachs_s0,        "1010","xxxxx","1","0","0","0","11")
+MPY_ENC(M2_mmachs_s1,        "1010","xxxxx","1","0","0","1","11")
+MPY_ENC(M2_mmachs_rs0,       "1010","xxxxx","1","1","0","0","11")
+MPY_ENC(M2_mmachs_rs1,       "1010","xxxxx","1","1","0","1","11")
+MPY_ENC(M2_mmacuhs_s0,       "1010","xxxxx","1","0","1","0","11")
+MPY_ENC(M2_mmacuhs_s1,       "1010","xxxxx","1","0","1","1","11")
+MPY_ENC(M2_mmacuhs_rs0,      "1010","xxxxx","1","1","1","0","11")
+MPY_ENC(M2_mmacuhs_rs1,      "1010","xxxxx","1","1","1","1","11")
+MPY_ENC(M4_vrmpyeh_acc_s0,   "1010","xxxxx","1","1","0","0","10")
+MPY_ENC(M4_vrmpyeh_acc_s1,   "1010","xxxxx","1","1","0","1","10")
+MPY_ENC(M4_vrmpyoh_acc_s0,   "1010","xxxxx","1","1","1","0","10")
+MPY_ENC(M4_vrmpyoh_acc_s1,   "1010","xxxxx","1","1","1","1","10")
+MPY_ENC(M5_vrmacbuu,         "1010","xxxxx","0","0","0","1","01")
+MPY_ENC(M5_vrmacbsu,         "1010","xxxxx","0","0","1","1","01")
+MPY_ENC(M5_vdmacbsu,         "1010","xxxxx","0","1","0","0","01")
+
+MPY_ENC(F2_dfmpylh,          "1010","xxxxx","0","0","0","0","11")
+MPY_ENC(F2_dfmpyhh,          "1010","xxxxx","0","0","0","1","11")
+#ifdef ADD_DP_OPS
+MPY_ENC(F2_dfmpyhh,          "1010","xxxxx","0","0","1","0","11")
+MPY_ENC(F2_dffma,            "1010","xxxxx","0","0","0","0","11")
+MPY_ENC(F2_dffms,            "1010","xxxxx","0","0","0","1","11")
+
+MPY_ENC(F2_dffma_lib,        "1010","xxxxx","0","0","1","0","11")
+MPY_ENC(F2_dffms_lib,        "1010","xxxxx","0","0","1","1","11")
+MPY_ENC(F2_dffma_sc,         "1010","xxxxx","0","1","1","1","uu")
+#endif
+
+
+MPY_ENC(M7_dcmpyrw_acc,      "1010","xxxxx","0","0","0","1","10")
+MPY_ENC(M7_dcmpyrwc_acc,     "1010","xxxxx","0","0","1","1","10")
+MPY_ENC(M7_dcmpyiw_acc,      "1010","xxxxx","0","1","1","0","10")
+MPY_ENC(M7_dcmpyiwc_acc,     "1010","xxxxx","1","0","1","0","10")
+
+
+
+
+/*
+*/
+
+DEF_FIELDROW_DESC32(ICLASS_M" 1011 -------- PP------ --------","[#11] Reserved")
+MPY_ENC(F2_sfadd,            "1011","ddddd","0","0","0","0","00")
+MPY_ENC(F2_sfsub,            "1011","ddddd","0","0","0","0","01")
+MPY_ENC(F2_sfmax,            "1011","ddddd","0","0","0","1","00")
+MPY_ENC(F2_sfmin,            "1011","ddddd","0","0","0","1","01")
+MPY_ENC(F2_sfmpy,            "1011","ddddd","0","0","1","0","00")
+MPY_ENC(F2_sffixupn,         "1011","ddddd","0","0","1","1","00")
+MPY_ENC(F2_sffixupd,         "1011","ddddd","0","0","1","1","01")
+
+DEF_FIELDROW_DESC32(ICLASS_M" 1100 -------- PP------ --------","[#12] Rd=(Rs,Rt)")
+DEF_FIELD32(ICLASS_M"         1100 -------- PP------ --!-----",Mc_tH,"Rt is High") /* Rt high */
+DEF_FIELD32(ICLASS_M"         1100 -------- PP------ -!------",Mc_sH,"Rs is High") /* Rs high */
+SP_MPY(M2_mpy,               "1100","ddddd","0","0","0")
+SP_MPY(M2_mpy_sat,           "1100","ddddd","1","0","0")
+SP_MPY(M2_mpy_rnd,           "1100","ddddd","0","1","0")
+SP_MPY(M2_mpy_sat_rnd,       "1100","ddddd","1","1","0")
+SP_MPY(M2_mpyu,              "1100","ddddd","0","0","1")
+
+DEF_FIELDROW_DESC32(ICLASS_M" 1101 -------- PP------ --------","[#13] Rd=(Rs,Rt)")
+/* EJP: same as mpyi MPY_ENC(M2_mpyui,            "1101","ddddd","0","0","1","0","00") */
+MPY_ENC(M2_mpyi,             "1101","ddddd","0","0","0","0","00")
+MPY_ENC(M2_mpy_up,           "1101","ddddd","0","0","0","0","01")
+MPY_ENC(M2_mpyu_up,          "1101","ddddd","0","0","1","0","01")
+MPY_ENC(M2_dpmpyss_rnd_s0,   "1101","ddddd","0","1","0","0","01")
+MPY_ENC(M2_cmpyrs_s0,        "1101","ddddd","1","1","0","0","10")
+MPY_ENC(M2_cmpyrs_s1,        "1101","ddddd","1","1","0","1","10")
+MPY_ENC(M2_cmpyrsc_s0,       "1101","ddddd","1","1","1","0","10")
+MPY_ENC(M2_cmpyrsc_s1,       "1101","ddddd","1","1","1","1","10")
+MPY_ENC(M2_vmpy2s_s0pack,    "1101","ddddd","1","1","0","0","11")
+MPY_ENC(M2_vmpy2s_s1pack,    "1101","ddddd","1","1","0","1","11")
+MPY_ENC(M2_hmmpyh_rs1,       "1101","ddddd","1","1","0","1","00")
+MPY_ENC(M2_hmmpyl_rs1,       "1101","ddddd","1","1","1","1","00")
+
+MPY_ENC(M2_hmmpyh_s1,        "1101","ddddd","0","1","0","1","00")
+MPY_ENC(M2_hmmpyl_s1,        "1101","ddddd","0","1","0","1","01")
+MPY_ENC(M2_mpy_up_s1,        "1101","ddddd","0","1","0","1","10")
+MPY_ENC(M2_mpy_up_s1_sat,    "1101","ddddd","0","1","1","1","00")
+MPY_ENC(M2_mpysu_up,         "1101","ddddd","0","1","1","0","01")
+
+
+DEF_FIELDROW_DESC32(ICLASS_M" 1110 -------- PP------ --------","[#14] Rx=(Rs,Rt)")
+DEF_FIELD32(ICLASS_M"         1110 -------- PP------ --!-----",Md_tH,"Rt is High") /* Rt high */
+DEF_FIELD32(ICLASS_M"         1110 -------- PP------ -!------",Md_sH,"Rs is High") /* Rs high */
+SP_MPY(M2_mpyu_acc,          "1110","xxxxx","0","0","1")
+SP_MPY(M2_mpy_acc,           "1110","xxxxx","0","0","0")
+SP_MPY(M2_mpy_acc_sat,       "1110","xxxxx","1","0","0")
+SP_MPY(M2_mpyu_nac,          "1110","xxxxx","0","1","1")
+SP_MPY(M2_mpy_nac,           "1110","xxxxx","0","1","0")
+SP_MPY(M2_mpy_nac_sat,       "1110","xxxxx","1","1","0")
+
+
+DEF_FIELDROW_DESC32(ICLASS_M" 1111 -------- PP------ --------","[#15] Rx=(Rs,Rt)")
+MPY_ENC(M2_maci,             "1111","xxxxx","0","0","0","0","00")
+MPY_ENC(M2_mnaci,            "1111","xxxxx","0","0","0","1","00")
+MPY_ENC(M2_acci,             "1111","xxxxx","0","0","0","0","01")
+MPY_ENC(M2_nacci,            "1111","xxxxx","0","0","0","1","01")
+MPY_ENC(M2_xor_xacc,         "1111","xxxxx","0","0","0","1","11")
+MPY_ENC(M2_subacc,           "1111","xxxxx","0","0","0","0","11")
+
+MPY_ENC(M4_mac_up_s1_sat,    "1111","xxxxx","0","1","1","0","00")
+MPY_ENC(M4_nac_up_s1_sat,    "1111","xxxxx","0","1","1","0","01")
+
+MPY_ENC(M4_and_and,          "1111","xxxxx","0","0","1","0","00")
+MPY_ENC(M4_and_or,           "1111","xxxxx","0","0","1","0","01")
+MPY_ENC(M4_and_xor,          "1111","xxxxx","0","0","1","0","10")
+MPY_ENC(M4_or_and,           "1111","xxxxx","0","0","1","0","11")
+MPY_ENC(M4_or_or,            "1111","xxxxx","0","0","1","1","00")
+MPY_ENC(M4_or_xor,           "1111","xxxxx","0","0","1","1","01")
+MPY_ENC(M4_xor_and,          "1111","xxxxx","0","0","1","1","10")
+MPY_ENC(M4_xor_or,           "1111","xxxxx","0","0","1","1","11")
+
+MPY_ENC(M4_or_andn,          "1111","xxxxx","0","1","0","0","00")
+MPY_ENC(M4_and_andn,         "1111","xxxxx","0","1","0","0","01")
+MPY_ENC(M4_xor_andn,         "1111","xxxxx","0","1","0","0","10")
+
+MPY_ENC(F2_sffma,            "1111","xxxxx","1","0","0","0","00")
+MPY_ENC(F2_sffms,            "1111","xxxxx","1","0","0","0","01")
+
+MPY_ENC(F2_sffma_lib,        "1111","xxxxx","1","0","0","0","10")
+MPY_ENC(F2_sffms_lib,        "1111","xxxxx","1","0","0","0","11")
+
+MPY_ENC(F2_sffma_sc,         "1111","xxxxx","1","1","1","0","uu")
+
+
+/*******************************/
+/*                             */
+/*                             */
+/*           ALU32_2op         */
+/*                             */
+/*                             */
+/*******************************/
+DEF_CLASS32(ICLASS_ADDI" ---- -------- PP------ --------",ALU32_ADDI)
+
+DEF_CLASS32(ICLASS_ALU2op" ---- -------- PP------ --------",ALU32_2op)
+DEF_FIELD32(ICLASS_ALU2op" !--- -------- PP------ --------",A2_Rs,"No Rs read")
+DEF_FIELD32(ICLASS_ALU2op" -!!! -------- PP------ --------",A2_MajOp,"Major Opcode")
+DEF_FIELD32(ICLASS_ALU2op" ---- !!!----- PP------ --------",A2_MinOp,"Minor Opcode")
+
+DEF_FIELD32(ICLASS_ALU3op" -!!! -------- PP------ --------",A3_MajOp,"Major Opcode")
+DEF_FIELD32(ICLASS_ALU3op" ---- !!!----- PP------ --------",A3_MinOp,"Minor Opcode")
+DEF_CLASS32(ICLASS_ALU3op" ---- -------- PP------ --------",ALU32_3op)
+DEF_FIELD32(ICLASS_ALU3op" !--- -------- PP------ --------",A3_P,"Predicated")
+DEF_FIELD32(ICLASS_ALU3op" ---- -------- PP!----- --------",A3_DN,"Dot-new")
+DEF_FIELD32(ICLASS_ALU3op" ---- -------- PP------ !-------",A3_PS,"Predicate sense")
+
+
+/*************************/
+/* Our good friend addi  */
+/*************************/
+DEF_ENC32(A2_addi,    ICLASS_ADDI"  iiii iiisssss PPiiiiii iiiddddd")
+
+
+/*******************************/
+/* Standard ALU32 insns        */
+/*******************************/
+
+#define ALU32_IRR_ENC(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS)\
+DEF_ENC32(TAG, ICLASS_ALU2op" "MAJ4"  "MIN3"sssss  PP"SMOD1"iiiii "VMIN3 DSTCHARS)
+
+#define ALU32_RR_ENC(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS)\
+DEF_ENC32(TAG, ICLASS_ALU2op" "MAJ4"  "MIN3"sssss  PP"SMOD1"----- "VMIN3 DSTCHARS)
+
+#define CONDA32_RR_ENC(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS)\
+DEF_ENC32(TAG##t,   ICLASS_ALU2op" "MAJ4"  "MIN3"sssss  PP"SMOD1"-00uu "VMIN3 DSTCHARS)\
+DEF_ENC32(TAG##f,   ICLASS_ALU2op" "MAJ4"  "MIN3"sssss  PP"SMOD1"-10uu "VMIN3 DSTCHARS)\
+DEF_ENC32(TAG##tnew,ICLASS_ALU2op" "MAJ4"  "MIN3"sssss  PP"SMOD1"-01uu "VMIN3 DSTCHARS)\
+DEF_ENC32(TAG##fnew,ICLASS_ALU2op" "MAJ4"  "MIN3"sssss  PP"SMOD1"-11uu "VMIN3 DSTCHARS)
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 0000 -------- PP------ --------","[#0] (Pu) Rd=(Rs)")
+DEF_FIELD32(            ICLASS_ALU2op" 0000 -------- PP!----- --------",A32a_C,"Conditional")
+DEF_FIELD32(            ICLASS_ALU2op" 0000 -------- PP--!--- --------",A32a_S,"Predicate sense")
+DEF_FIELD32(            ICLASS_ALU2op" 0000 -------- PP---!-- --------",A32a_dn,"Dot-new")
+
+ALU32_RR_ENC(A2_aslh,                 "0000","000","0","---","ddddd")
+ALU32_RR_ENC(A2_asrh,                 "0000","001","0","---","ddddd")
+ALU32_RR_ENC(A2_tfr,                  "0000","011","0","---","ddddd")
+ALU32_RR_ENC(A2_sxtb,                 "0000","101","0","---","ddddd")
+ALU32_RR_ENC(A2_zxth,                 "0000","110","0","---","ddddd")
+ALU32_RR_ENC(A2_sxth,                 "0000","111","0","---","ddddd")
+
+CONDA32_RR_ENC(A4_paslh,               "0000","000","1","---","ddddd")
+CONDA32_RR_ENC(A4_pasrh,               "0000","001","1","---","ddddd")
+CONDA32_RR_ENC(A4_pzxtb,               "0000","100","1","---","ddddd")
+CONDA32_RR_ENC(A4_psxtb,               "0000","101","1","---","ddddd")
+CONDA32_RR_ENC(A4_pzxth,               "0000","110","1","---","ddddd")
+CONDA32_RR_ENC(A4_psxth,               "0000","111","1","---","ddddd")
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 0001 -------- PP------ --------","[#1] Rx=(#u16)")
+DEF_ENC32(A2_tfril,     ICLASS_ALU2op" 0001 ii1xxxxx PPiiiiii iiiiiiii")
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 0010 -------- PP------ --------","[#2] Rx=(#u16)")
+DEF_ENC32(A2_tfrih,     ICLASS_ALU2op" 0010 ii1xxxxx PPiiiiii iiiiiiii")
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 0011 -------- PP------ --------","[#3] Rd=(Pu,Rs,#s8)")
+DEF_ENC32(C2_muxir,     ICLASS_ALU2op" 0011 0uusssss PP0iiiii iiiddddd")
+DEF_ENC32(C2_muxri,     ICLASS_ALU2op" 0011 1uusssss PP0iiiii iiiddddd")
+
+DEF_ENC32(A4_combineri, ICLASS_ALU2op" 0011 -00sssss PP1iiiii iiiddddd") /* Rdd = (Rs,#s8) */
+DEF_ENC32(A4_combineir, ICLASS_ALU2op" 0011 -01sssss PP1iiiii iiiddddd") /* Rdd = (#s8,Rs) */
+DEF_ENC32(A4_rcmpeqi,   ICLASS_ALU2op" 0011 -10sssss PP1iiiii iiiddddd") /* Rd = (Rs,#s8) */
+DEF_ENC32(A4_rcmpneqi,  ICLASS_ALU2op" 0011 -11sssss PP1iiiii iiiddddd") /* Rd = (Rs,#s8) */
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 0100 -------- PP------ --------","[#4] (Pu) Rd=(Rs,#s8)")
+DEF_FIELD32(            ICLASS_ALU2op" 0100 -------- PP!----- --------",A32a_DN,"Dot-new")
+DEF_FIELD32(            ICLASS_ALU2op" 0100 !------- PP------ --------",A32a_PS,"Predicate sense")
+DEF_ENC32(A2_paddit,    ICLASS_ALU2op" 0100 0uusssss PP0iiiii iiiddddd")
+DEF_ENC32(A2_padditnew, ICLASS_ALU2op" 0100 0uusssss PP1iiiii iiiddddd")
+DEF_ENC32(A2_paddif,    ICLASS_ALU2op" 0100 1uusssss PP0iiiii iiiddddd")
+DEF_ENC32(A2_paddifnew, ICLASS_ALU2op" 0100 1uusssss PP1iiiii iiiddddd")
+
+
+DEF_FIELDROW_DESC32(     ICLASS_ALU2op" 0101 -------- PP------ --------","[#5] Pd=(Rs,#s10)")
+DEF_ENC32(C2_cmpeqi,     ICLASS_ALU2op" 0101 00isssss PPiiiiii iii000dd")
+DEF_ENC32(C2_cmpgti,     ICLASS_ALU2op" 0101 01isssss PPiiiiii iii000dd")
+DEF_ENC32(C2_cmpgtui,    ICLASS_ALU2op" 0101 100sssss PPiiiiii iii000dd")
+
+DEF_ENC32(C4_cmpneqi,    ICLASS_ALU2op" 0101 00isssss PPiiiiii iii100dd")
+DEF_ENC32(C4_cmpltei,    ICLASS_ALU2op" 0101 01isssss PPiiiiii iii100dd")
+DEF_ENC32(C4_cmplteui,   ICLASS_ALU2op" 0101 100sssss PPiiiiii iii100dd")
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 0110 -------- PP------ --------","[#6] Rd=(Rs,#s10)")
+ALU32_IRR_ENC(A2_andir,               "0110","00i","i","iii","ddddd")
+ALU32_IRR_ENC(A2_subri,               "0110","01i","i","iii","ddddd")
+ALU32_IRR_ENC(A2_orir,                "0110","10i","i","iii","ddddd")
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 0111 -------- PP------ --------","[#7] Reserved")
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 1000 -------- PP------ --------","[#8] Rd=#s16")
+DEF_ENC32(A2_tfrsi,     ICLASS_ALU2op" 1000 ii-iiiii PPiiiiii iiiddddd")
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 1001 -------- PP------ --------","[#9] Reserved")
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 101- -------- PP------ --------","[#10,#11] Rd=(Pu,#s8,#S8)")
+DEF_ENC32(C2_muxii,     ICLASS_ALU2op" 101u uIIIIIII PPIiiiii iiiddddd")
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 1100 -------- PP------ --------","[#12] Rdd=(#s8,#S8)")
+DEF_ENC32(A2_combineii, ICLASS_ALU2op" 1100 0IIIIIII PPIiiiii iiiddddd")
+DEF_ENC32(A4_combineii, ICLASS_ALU2op" 1100 1--IIIII PPIiiiii iiiddddd")
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 1101 -------- PP------ --------","[#13] Reserved")
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 1110 -------- PP------ --------","[#14] (Pu) Rd=#s12")
+DEF_FIELD32(            ICLASS_ALU2op" 1110 ---0---- PP!----- --------",A32c_DN,"Dot-new")
+DEF_FIELD32(            ICLASS_ALU2op" 1110 !--0---- PP------ --------",A32c_PS,"Predicate sense")
+DEF_ENC32(C2_cmovenewit,ICLASS_ALU2op" 1110 0uu0iiii PP1iiiii iiiddddd")
+DEF_ENC32(C2_cmovenewif,ICLASS_ALU2op" 1110 1uu0iiii PP1iiiii iiiddddd")
+DEF_ENC32(C2_cmoveit,   ICLASS_ALU2op" 1110 0uu0iiii PP0iiiii iiiddddd")
+DEF_ENC32(C2_cmoveif,   ICLASS_ALU2op" 1110 1uu0iiii PP0iiiii iiiddddd")
+
+
+DEF_FIELDROW_DESC32(    ICLASS_ALU2op" 1111 -------- PP------ --------","[#15] nop")
+DEF_ENC32(A2_nop,       ICLASS_ALU2op" 1111 -------- PP------ --------")
+
+
+
+
+
+
+
+
+
+
+
+
+/*******************************/
+/*                             */
+/*                             */
+/*    ALU32_3op                */
+/*                             */
+/*                             */
+/*******************************/
+
+
+#define V2A32_RRR_ENC(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS)\
+DEF_ENC32(TAG, ICLASS_ALU3op" "MAJ4"  "MIN3"sssss  PP"SMOD1"ttttt "VMIN3 DSTCHARS)
+
+#define V2A32_RR_ENC(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS)\
+DEF_ENC32(TAG, ICLASS_ALU3op" "MAJ4"  "MIN3"sssss  PP"SMOD1"----- "VMIN3 DSTCHARS)
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  0000 -------- PP------ --------","[#0] Reserved")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  0001 -------- PP------ --------","[#1] Rd=(Rs,Rt)")
+V2A32_RRR_ENC(A2_and,              "0001","000","-","---","ddddd")
+V2A32_RRR_ENC(A2_or,               "0001","001","-","---","ddddd")
+V2A32_RRR_ENC(A2_xor,              "0001","011","-","---","ddddd")
+V2A32_RRR_ENC(A4_andn,             "0001","100","-","---","ddddd")
+V2A32_RRR_ENC(A4_orn,              "0001","101","-","---","ddddd")
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  0010 -------- PP------ --------","[#2] Pd=(Rs,Rt)")
+V2A32_RRR_ENC(C2_cmpeq,            "0010","-00","-","---","000dd")
+V2A32_RRR_ENC(C2_cmpgt,            "0010","-10","-","---","000dd")
+V2A32_RRR_ENC(C2_cmpgtu,           "0010","-11","-","---","000dd")
+
+V2A32_RRR_ENC(C4_cmpneq,           "0010","-00","-","---","100dd")
+V2A32_RRR_ENC(C4_cmplte,           "0010","-10","-","---","100dd")
+V2A32_RRR_ENC(C4_cmplteu,          "0010","-11","-","---","100dd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  0011 -------- PP------ --------","[#3] Rd=(Rs,Rt)")
+V2A32_RRR_ENC(A2_add,              "0011","000","-","---","ddddd")
+V2A32_RRR_ENC(A2_sub,              "0011","001","-","---","ddddd")
+V2A32_RRR_ENC(A2_combine_hh,       "0011","100","-","---","ddddd")
+V2A32_RRR_ENC(A2_combine_hl,       "0011","101","-","---","ddddd")
+V2A32_RRR_ENC(A2_combine_lh,       "0011","110","-","---","ddddd")
+V2A32_RRR_ENC(A2_combine_ll,       "0011","111","-","---","ddddd")
+V2A32_RRR_ENC(A4_rcmpeq,           "0011","010","-","---","ddddd")
+V2A32_RRR_ENC(A4_rcmpneq,          "0011","011","-","---","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  0100 -------- PP------ --------","[#4] Rd=(Pu,Rs,Rt)")
+V2A32_RRR_ENC(C2_mux,              "0100","---","-","-uu","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  0101 -------- PP------ --------","[#5] Rdd=(Rs,Rt)")
+V2A32_RRR_ENC(A2_combinew,         "0101","0--","-","---","ddddd")
+V2A32_RRR_ENC(S2_packhl,           "0101","1--","-","---","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  0110 -------- PP------ --------","[#6] Rd=(Rs,Rt)")
+V2A32_RRR_ENC(A2_svaddh,           "0110","000","-","---","ddddd")
+V2A32_RRR_ENC(A2_svaddhs,          "0110","001","-","---","ddddd")
+V2A32_RRR_ENC(A2_svadduhs,         "0110","011","-","---","ddddd")
+V2A32_RRR_ENC(A2_svsubh,           "0110","100","-","---","ddddd")
+V2A32_RRR_ENC(A2_svsubhs,          "0110","101","-","---","ddddd")
+V2A32_RRR_ENC(A2_svsubuhs,         "0110","111","-","---","ddddd")
+V2A32_RRR_ENC(A2_addsat,           "0110","010","-","---","ddddd")
+V2A32_RRR_ENC(A2_subsat,           "0110","110","-","---","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  0111 -------- PP------ --------","[#7] Rd=(Rs,Rt)")
+V2A32_RRR_ENC(A2_svavgh,           "0111","-00","-","---","ddddd")
+V2A32_RRR_ENC(A2_svavghs,          "0111","-01","-","---","ddddd")
+V2A32_RRR_ENC(A2_svnavgh,          "0111","-11","-","---","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  1000 -------- PP------ --------","[#8] Reserved")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  1001 -------- PP------ --------","[#9] (Pu) Rd=(Rs,Rt)")
+V2A32_RRR_ENC(A2_pandt,            "1001","-00","0","0uu","ddddd")
+V2A32_RRR_ENC(A2_pandtnew,         "1001","-00","1","0uu","ddddd")
+V2A32_RRR_ENC(A2_pandf,            "1001","-00","0","1uu","ddddd")
+V2A32_RRR_ENC(A2_pandfnew,         "1001","-00","1","1uu","ddddd")
+V2A32_RRR_ENC(A2_port,             "1001","-01","0","0uu","ddddd")
+V2A32_RRR_ENC(A2_portnew,          "1001","-01","1","0uu","ddddd")
+V2A32_RRR_ENC(A2_porf,             "1001","-01","0","1uu","ddddd")
+V2A32_RRR_ENC(A2_porfnew,          "1001","-01","1","1uu","ddddd")
+V2A32_RRR_ENC(A2_pxort,            "1001","-11","0","0uu","ddddd")
+V2A32_RRR_ENC(A2_pxortnew,         "1001","-11","1","0uu","ddddd")
+V2A32_RRR_ENC(A2_pxorf,            "1001","-11","0","1uu","ddddd")
+V2A32_RRR_ENC(A2_pxorfnew,         "1001","-11","1","1uu","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  1010 -------- PP------ --------","[#10] Reserved")
+
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  1011 -------- PP------ --------","[#11] (Pu) Rd=(Rs,Rt)")
+V2A32_RRR_ENC(A2_paddt,            "1011","0-0","0","0uu","ddddd")
+V2A32_RRR_ENC(A2_paddtnew,         "1011","0-0","1","0uu","ddddd")
+V2A32_RRR_ENC(A2_paddf,            "1011","0-0","0","1uu","ddddd")
+V2A32_RRR_ENC(A2_paddfnew,         "1011","0-0","1","1uu","ddddd")
+V2A32_RRR_ENC(A2_psubt,            "1011","0-1","0","0uu","ddddd")
+V2A32_RRR_ENC(A2_psubtnew,         "1011","0-1","1","0uu","ddddd")
+V2A32_RRR_ENC(A2_psubf,            "1011","0-1","0","1uu","ddddd")
+V2A32_RRR_ENC(A2_psubfnew,         "1011","0-1","1","1uu","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  1100 -------- PP------ --------","[#12] Reserved")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  1101 -------- PP------ --------","[#13] (Pu) Rdd=(Rs,Rt)")
+V2A32_RRR_ENC(C2_ccombinewnewt,    "1101","000","1","0uu","ddddd")
+V2A32_RRR_ENC(C2_ccombinewnewf,    "1101","000","1","1uu","ddddd")
+V2A32_RRR_ENC(C2_ccombinewt,       "1101","000","0","0uu","ddddd")
+V2A32_RRR_ENC(C2_ccombinewf,       "1101","000","0","1uu","ddddd")
+
+
+
+
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU3op"  1110 -------- PP------ --------","[#14] Reserved")
+
+
+
+
+
+
+
+
+
+/*******************************/
+/*                             */
+/*                             */
+/*    S                        */
+/*                             */
+/*                             */
+/*******************************/
+
+DEF_CLASS32(ICLASS_S2op" ---- -------- PP------ --------",S_2op)
+DEF_FIELD32(ICLASS_S2op" !!!! -------- PP------ --------",STYPEB_RegType,"Register Type")
+DEF_FIELD32(ICLASS_S2op" ---- !!------ PP------ --------",S2_MajOp,"Major Opcode")
+DEF_FIELD32(ICLASS_S2op" ---- -------- PP------ !!!-----",S2_MinOp,"Minor Opcode")
+
+DEF_CLASS32(ICLASS_S3op" ---- -------- PP------ --------",S_3op)
+DEF_FIELD32(ICLASS_S3op" !!!! -------- PP------ --------",STYPEA_RegType,"Register Type")
+DEF_FIELD32(ICLASS_S3op" ---- !!------ PP------ --------",S3_Maj,"Major Opcode")
+DEF_FIELD32(ICLASS_S3op" ---- -------- PP------ !!------",S3_Min,"Minor Opcode")
+
+
+#define SH_RRR_ENC(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS) \
+DEF_ENC32(TAG,ICLASS_S3op" "MAJ4"  "MIN3"sssss  PP"SMOD1"ttttt "VMIN3 DSTCHARS)
+
+#define SH_RRRiENC(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS) \
+DEF_ENC32(TAG,ICLASS_S3op" "MAJ4"  "MIN3"iiiii  PP"SMOD1"ttttt "VMIN3 DSTCHARS)
+
+#define SH_RRR_ENCX(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS) \
+DEF_ENC32(TAG,ICLASS_S3op" "MAJ4"  "MIN3"sssss  PP"SMOD1"xxxxx "VMIN3 DSTCHARS)
+
+#define SH3_RR_ENC(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS) \
+DEF_ENC32(TAG,ICLASS_S3op" "MAJ4"  "MIN3"sssss  PP"SMOD1"----- "VMIN3 DSTCHARS)
+
+#define SH_PPP_ENC(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS) \
+DEF_ENC32(TAG,ICLASS_S3op" "MAJ4"  "MIN3"---ss  PP"SMOD1"---tt "VMIN3 DSTCHARS)
+
+#define SH2_RR_ENC(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS) \
+DEF_ENC32(TAG,ICLASS_S2op" "MAJ4"  "MIN3"sssss  PP"SMOD1"----- "VMIN3 DSTCHARS)
+
+#define SH2_PPP_ENC(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS) \
+DEF_ENC32(TAG,ICLASS_S2op" "MAJ4"  "MIN3"---ss  PP"SMOD1"---tt "VMIN3 DSTCHARS)
+
+#define SH_RRI4_ENC(TAG,MAJ4,MIN3,VMIN3,DSTCHARS) \
+DEF_ENC32(TAG,ICLASS_S2op" "MAJ4" "MIN3 "sssss PP00iiii " VMIN3 DSTCHARS)
+
+#define SH_RRI5_ENC(TAG,MAJ4,MIN3,VMIN3,DSTCHARS) \
+DEF_ENC32(TAG,ICLASS_S2op" "MAJ4" "MIN3 "sssss PP0iiiii " VMIN3 DSTCHARS)
+
+#define SH_RRI6_ENC(TAG,MAJ4,MIN3,VMIN3,DSTCHARS) \
+DEF_ENC32(TAG,ICLASS_S2op" "MAJ4" "MIN3 "sssss PPiiiiii " VMIN3 DSTCHARS)
+
+#define RSHIFTTYPES(TAGEND,MAJ4,MIN3,SMOD1,DMOD1,DSTCHARS) \
+SH_RRR_ENC(S2_asr_r_##TAGEND,MAJ4,MIN3,SMOD1,"00"DMOD1,DSTCHARS) \
+SH_RRR_ENC(S2_lsr_r_##TAGEND,MAJ4,MIN3,SMOD1,"01"DMOD1,DSTCHARS) \
+SH_RRR_ENC(S2_asl_r_##TAGEND,MAJ4,MIN3,SMOD1,"10"DMOD1,DSTCHARS) \
+SH_RRR_ENC(S2_lsl_r_##TAGEND,MAJ4,MIN3,SMOD1,"11"DMOD1,DSTCHARS)
+
+
+#define I5SHIFTTYPES(TAGEND,MAJ4,MIN3,SMOD1,DSTCHARS) \
+SH_RRI5_ENC(S2_asr_i_##TAGEND,MAJ4,MIN3,SMOD1 "00",DSTCHARS) \
+SH_RRI5_ENC(S2_lsr_i_##TAGEND,MAJ4,MIN3,SMOD1 "01",DSTCHARS) \
+SH_RRI5_ENC(S2_asl_i_##TAGEND,MAJ4,MIN3,SMOD1 "10",DSTCHARS) \
+SH_RRI5_ENC(S6_rol_i_##TAGEND,MAJ4,MIN3,SMOD1 "11",DSTCHARS)
+
+#define I5SHIFTTYPES_NOROL(TAGEND,MAJ4,MIN3,SMOD1,DSTCHARS) \
+SH_RRI5_ENC(S2_asr_i_##TAGEND,MAJ4,MIN3,SMOD1 "00",DSTCHARS) \
+SH_RRI5_ENC(S2_lsr_i_##TAGEND,MAJ4,MIN3,SMOD1 "01",DSTCHARS) \
+SH_RRI5_ENC(S2_asl_i_##TAGEND,MAJ4,MIN3,SMOD1 "10",DSTCHARS)
+
+#define I5SHIFTTYPES_NOASR(TAGEND,MAJ4,MIN3,SMOD1,DSTCHARS) \
+SH_RRI5_ENC(S2_lsr_i_##TAGEND,MAJ4,MIN3,SMOD1 "01",DSTCHARS) \
+SH_RRI5_ENC(S2_asl_i_##TAGEND,MAJ4,MIN3,SMOD1 "10",DSTCHARS) \
+SH_RRI5_ENC(S6_rol_i_##TAGEND,MAJ4,MIN3,SMOD1 "11",DSTCHARS)
+
+#define I4SHIFTTYPES(TAGEND,MAJ4,MIN3,SMOD1,DSTCHARS) \
+SH_RRI4_ENC(S2_asr_i_##TAGEND,MAJ4,MIN3,SMOD1 "00",DSTCHARS) \
+SH_RRI4_ENC(S2_lsr_i_##TAGEND,MAJ4,MIN3,SMOD1 "01",DSTCHARS) \
+SH_RRI4_ENC(S2_asl_i_##TAGEND,MAJ4,MIN3,SMOD1 "10",DSTCHARS)
+
+#define I5ASHIFTTYPES(TAGEND,MAJ4,MIN3,SMOD1,DSTCHARS) \
+SH_RRI5_ENC(S2_asl_i_##TAGEND,MAJ4,MIN3,SMOD1 "10",DSTCHARS)
+
+#define I4ASHIFTTYPES(TAGEND,MAJ4,MIN3,SMOD1,DSTCHARS) \
+SH_RRI4_ENC(S2_asl_i_##TAGEND,MAJ4,MIN3,SMOD1 "10",DSTCHARS)
+
+#define I6SHIFTTYPES(TAGEND,MAJ4,MIN3,SMOD1,DSTCHARS) \
+SH_RRI6_ENC(S2_asr_i_##TAGEND,MAJ4,MIN3,SMOD1 "00",DSTCHARS) \
+SH_RRI6_ENC(S2_lsr_i_##TAGEND,MAJ4,MIN3,SMOD1 "01",DSTCHARS) \
+SH_RRI6_ENC(S2_asl_i_##TAGEND,MAJ4,MIN3,SMOD1 "10",DSTCHARS) \
+SH_RRI6_ENC(S6_rol_i_##TAGEND,MAJ4,MIN3,SMOD1 "11",DSTCHARS)
+
+#define I6SHIFTTYPES_NOASR(TAGEND,MAJ4,MIN3,SMOD1,DSTCHARS) \
+SH_RRI6_ENC(S2_lsr_i_##TAGEND,MAJ4,MIN3,SMOD1 "01",DSTCHARS) \
+SH_RRI6_ENC(S2_asl_i_##TAGEND,MAJ4,MIN3,SMOD1 "10",DSTCHARS) \
+SH_RRI6_ENC(S6_rol_i_##TAGEND,MAJ4,MIN3,SMOD1 "11",DSTCHARS)
+
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op" 0000 -------- PP------ --------","[#0] Rdd=(Rss,#u6)")
+/* EJP: there is actually quite a bit of space here, look at the reserved bits */
+I6SHIFTTYPES(p,                 "0000","000","0","ddddd")
+I5SHIFTTYPES_NOROL(vw,          "0000","010","0","ddddd")
+I4SHIFTTYPES(vh,                "0000","100","0","ddddd")
+
+
+
+/* False assume an immediate */
+SH2_RR_ENC(S2_vsathub_nopack, "0000","000","-","1 00","ddddd")
+SH2_RR_ENC(S2_vsatwuh_nopack, "0000","000","-","1 01","ddddd")
+SH2_RR_ENC(S2_vsatwh_nopack,  "0000","000","-","1 10","ddddd")
+SH2_RR_ENC(S2_vsathb_nopack,  "0000","000","-","1 11","ddddd")
+
+SH_RRI4_ENC(S5_vasrhrnd,      "0000","001",    "0 00","ddddd")
+
+SH2_RR_ENC(A2_vabsh,          "0000","010","-","1 00","ddddd")
+SH2_RR_ENC(A2_vabshsat,       "0000","010","-","1 01","ddddd")
+SH2_RR_ENC(A2_vabsw,          "0000","010","-","1 10","ddddd")
+SH2_RR_ENC(A2_vabswsat,       "0000","010","-","1 11","ddddd")
+
+SH2_RR_ENC(A2_notp,           "0000","100","-","1 00","ddddd")
+SH2_RR_ENC(A2_negp,           "0000","100","-","1 01","ddddd")
+SH2_RR_ENC(A2_absp,           "0000","100","-","1 10","ddddd")
+SH2_RR_ENC(A2_vconj,          "0000","100","-","1 11","ddddd")
+
+SH2_RR_ENC(S2_deinterleave,   "0000","110","-","1 00","ddddd")
+SH2_RR_ENC(S2_interleave,     "0000","110","-","1 01","ddddd")
+SH2_RR_ENC(S2_brevp,          "0000","110","-","1 10","ddddd")
+SH_RRI6_ENC(S2_asr_i_p_rnd,   "0000","110",    "1 11","ddddd")
+
+SH2_RR_ENC(F2_conv_df2d,      "0000","111","0","0 00","ddddd")
+SH2_RR_ENC(F2_conv_df2ud,     "0000","111","0","0 01","ddddd")
+SH2_RR_ENC(F2_conv_ud2df,     "0000","111","0","0 10","ddddd")
+SH2_RR_ENC(F2_conv_d2df,      "0000","111","0","0 11","ddddd")
+#ifdef ADD_DP_OPS
+SH2_RR_ENC(F2_dffixupr,       "0000","111","0","1 00","ddddd")
+SH2_RR_ENC(F2_dfsqrtcheat,    "0000","111","0","1 01","ddddd")
+#endif
+SH2_RR_ENC(F2_conv_df2d_chop, "0000","111","0","1 10","ddddd")
+SH2_RR_ENC(F2_conv_df2ud_chop,"0000","111","0","1 11","ddddd")
+#ifdef ADD_DP_OPS
+SH2_RR_ENC(F2_dfinvsqrta,     "0000","111","1","0 ee","ddddd")
+#endif
+
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"    0001 -------- PP------ --------","[#1] Rdd=(Rss,#u6,#U6)")
+DEF_ENC32(S2_extractup,ICLASS_S2op" 0001 IIIsssss PPiiiiii IIIddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op" 0010 -------- PP------ --------","[#2] Rxx=(Rss,#u6)")
+I6SHIFTTYPES(p_nac,             "0010","00-","0","xxxxx")
+I6SHIFTTYPES(p_acc,             "0010","00-","1","xxxxx")
+I6SHIFTTYPES(p_and,             "0010","01-","0","xxxxx")
+I6SHIFTTYPES(p_or,              "0010","01-","1","xxxxx")
+I6SHIFTTYPES_NOASR(p_xacc,      "0010","10-","0","xxxxx")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"  0011 -------- PP------ --------","[#3] Rxx=(Rss,#u6,#U6)")
+DEF_ENC32(S2_insertp,ICLASS_S2op" 0011 IIIsssss PPiiiiii IIIxxxxx")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"  0100 -------- PP------ --------","[#4] Rdd=(Rs)")
+SH2_RR_ENC(S2_vsxtbh,            "0100","00-","-","00-","ddddd")
+SH2_RR_ENC(S2_vzxtbh,            "0100","00-","-","01-","ddddd")
+SH2_RR_ENC(S2_vsxthw,            "0100","00-","-","10-","ddddd")
+SH2_RR_ENC(S2_vzxthw,            "0100","00-","-","11-","ddddd")
+SH2_RR_ENC(A2_sxtw,              "0100","01-","-","00-","ddddd")
+SH2_RR_ENC(S2_vsplatrh,          "0100","01-","-","01-","ddddd")
+SH2_RR_ENC(S6_vsplatrbp,         "0100","01-","-","10-","ddddd")
+
+SH2_RR_ENC(F2_conv_sf2df,        "0100","1--","-","000","ddddd")
+SH2_RR_ENC(F2_conv_uw2df,        "0100","1--","-","001","ddddd")
+SH2_RR_ENC(F2_conv_w2df,         "0100","1--","-","010","ddddd")
+SH2_RR_ENC(F2_conv_sf2ud,        "0100","1--","-","011","ddddd")
+SH2_RR_ENC(F2_conv_sf2d,         "0100","1--","-","100","ddddd")
+SH2_RR_ENC(F2_conv_sf2ud_chop,   "0100","1--","-","101","ddddd")
+SH2_RR_ENC(F2_conv_sf2d_chop,    "0100","1--","-","110","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"    0101 -------- PP------ --------","[#5] Pd=(Rs,#u6)")
+DEF_ENC32(S2_tstbit_i, ICLASS_S2op" 0101 000sssss PP0iiiii ------dd")
+DEF_ENC32(C2_tfrrp,    ICLASS_S2op" 0101 010sssss PP------ ------dd")
+DEF_ENC32(C2_bitsclri, ICLASS_S2op" 0101 100sssss PPiiiiii ------dd")
+DEF_ENC32(S4_ntstbit_i,ICLASS_S2op" 0101 001sssss PP0iiiii ------dd")
+DEF_ENC32(C4_nbitsclri,ICLASS_S2op" 0101 101sssss PPiiiiii ------dd")
+DEF_ENC32(F2_sfclass,  ICLASS_S2op" 0101 111sssss PP0iiiii ------dd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"   0110 -------- PP------ --------","[#6] Rdd=(Pt)")
+DEF_ENC32(C2_mask, ICLASS_S2op"    0110   --- -----  PP----tt --- ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"    0111 -------- PP------ --------","[#7] Rx=(Rs,#u4,#S6)")
+DEF_ENC32(S2_tableidxb,ICLASS_S2op" 0111 00isssss PPIIIIII iiixxxxx")
+DEF_ENC32(S2_tableidxh,ICLASS_S2op" 0111 01isssss PPIIIIII iiixxxxx")
+DEF_ENC32(S2_tableidxw,ICLASS_S2op" 0111 10isssss PPIIIIII iiixxxxx")
+DEF_ENC32(S2_tableidxd,ICLASS_S2op" 0111 11isssss PPIIIIII iiixxxxx")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"   1000 -------- PP------ --------","[#8] Rd=(Rss,#u6)")
+SH2_RR_ENC(S2_vsathub,            "1000","000","-","000","ddddd")
+SH2_RR_ENC(S2_vsatwh,             "1000","000","-","010","ddddd")
+SH2_RR_ENC(S2_vsatwuh,            "1000","000","-","100","ddddd")
+SH2_RR_ENC(S2_vsathb,             "1000","000","-","110","ddddd")
+SH2_RR_ENC(S2_clbp,               "1000","010","-","000","ddddd")
+SH2_RR_ENC(S2_cl0p,               "1000","010","-","010","ddddd")
+SH2_RR_ENC(S2_cl1p,               "1000","010","-","100","ddddd")
+SH2_RR_ENC(S2_ct0p,               "1000","111","-","010","ddddd")
+SH2_RR_ENC(S2_ct1p,               "1000","111","-","100","ddddd")
+SH2_RR_ENC(S2_vtrunohb,           "1000","100","-","000","ddddd")
+SH2_RR_ENC(S2_vtrunehb,           "1000","100","-","010","ddddd")
+SH2_RR_ENC(S2_vrndpackwh,         "1000","100","-","100","ddddd")
+SH2_RR_ENC(S2_vrndpackwhs,        "1000","100","-","110","ddddd")
+SH2_RR_ENC(A2_sat,                "1000","110","-","000","ddddd")
+SH2_RR_ENC(A2_roundsat,           "1000","110","-","001","ddddd")
+SH_RRI5_ENC(S2_asr_i_svw_trun,    "1000","110",    "010","ddddd")
+SH_RRI5_ENC(A4_bitspliti,         "1000","110",    "100","ddddd")
+
+SH_RRI5_ENC(A7_clip,              "1000","110",    "101","ddddd")
+SH_RRI5_ENC(A7_vclip,             "1000","110",    "110","ddddd")
+
+
+SH2_RR_ENC(S4_clbpnorm,           "1000","011","-","000","ddddd")
+SH_RRI6_ENC(S4_clbpaddi,          "1000","011",    "010","ddddd")
+SH2_RR_ENC(S5_popcountp,          "1000","011","-","011","ddddd")
+
+SH_RRI4_ENC(S5_asrhub_rnd_sat,    "1000","011",    "100","ddddd")
+SH_RRI4_ENC(S5_asrhub_sat,        "1000","011",    "101","ddddd")
+
+SH2_RR_ENC(F2_conv_df2sf,         "1000","000","-","001","ddddd")
+SH2_RR_ENC(F2_conv_ud2sf,         "1000","001","-","001","ddddd")
+SH2_RR_ENC(F2_conv_d2sf,          "1000","010","-","001","ddddd")
+SH2_RR_ENC(F2_conv_df2uw,         "1000","011","-","001","ddddd")
+SH2_RR_ENC(F2_conv_df2w,          "1000","100","-","001","ddddd")
+SH2_RR_ENC(F2_conv_df2uw_chop,    "1000","101","-","001","ddddd")
+SH2_RR_ENC(F2_conv_df2w_chop,     "1000","111","-","001","ddddd")
+
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"   1001 -------- PP------ --------","[#9] Rd=(Ps,Pt)")
+DEF_ENC32(C2_vitpack, ICLASS_S2op" 1001   -00 ---ss  PP----tt --- ddddd")
+DEF_ENC32(C2_tfrpr,   ICLASS_S2op" 1001   -1- ---ss  PP------ --- ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"   1010 -------- PP------ --------","[#10] Rdd=(Rss,#u6,#U6)")
+DEF_ENC32(S4_extractp,ICLASS_S2op" 1010 IIIsssss PPiiiiii IIIddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"   1011 -------- PP------ --------","[#11] Rd=(Rs)")
+SH2_RR_ENC(F2_conv_uw2sf,         "1011","001","-","000","ddddd")
+SH2_RR_ENC(F2_conv_w2sf,          "1011","010","-","000","ddddd")
+SH2_RR_ENC(F2_conv_sf2uw,         "1011","011","-","000","ddddd")
+SH2_RR_ENC(F2_conv_sf2w,          "1011","100","-","000","ddddd")
+SH2_RR_ENC(F2_conv_sf2uw_chop,    "1011","011","-","001","ddddd")
+SH2_RR_ENC(F2_conv_sf2w_chop,     "1011","100","-","001","ddddd")
+SH2_RR_ENC(F2_sffixupr,           "1011","101","-","000","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"      1100 -------- PP------ --------","[#12] Rd=(Rs,#u6)")
+I5SHIFTTYPES(r,                      "1100","000",            "0  ","ddddd")
+SH_RRI5_ENC(S2_asl_i_r_sat,          "1100","010",            "010","ddddd")
+SH_RRI5_ENC(S2_asr_i_r_rnd,          "1100","010",            "000","ddddd")
+
+SH2_RR_ENC(S2_svsathb,               "1100","10-","-",        "00-","ddddd")
+SH2_RR_ENC(S2_svsathub,              "1100","10-","-",        "01-","ddddd")
+
+SH_RRI5_ENC(A4_cround_ri,            "1100","111",            "00-","ddddd")
+SH_RRI6_ENC(A7_croundd_ri,           "1100","111",            "01-","ddddd")
+SH_RRI5_ENC(A4_round_ri,             "1100","111",            "10-","ddddd")
+SH_RRI5_ENC(A4_round_ri_sat,         "1100","111",            "11-","ddddd")
+
+DEF_ENC32(S2_setbit_i,   ICLASS_S2op" 1100   110sssss PP0iiiii 000ddddd")
+DEF_ENC32(S2_clrbit_i,   ICLASS_S2op" 1100   110sssss PP0iiiii 001ddddd")
+DEF_ENC32(S2_togglebit_i,ICLASS_S2op" 1100   110sssss PP0iiiii 010ddddd")
+DEF_ENC32(S4_lsli       ,ICLASS_S2op" 1100   110sssss PPiiiiii 011ddddd")
+
+DEF_ENC32(S4_clbaddi    ,ICLASS_S2op" 1100   001sssss PPiiiiii 000ddddd")
+
+
+
+/* False read #u6 */
+SH2_RR_ENC(S2_clb,                   "1100","000","-","1 00","ddddd")
+SH2_RR_ENC(S2_cl0,                   "1100","000","-","1 01","ddddd")
+SH2_RR_ENC(S2_cl1,                   "1100","000","-","1 10","ddddd")
+SH2_RR_ENC(S2_clbnorm,               "1100","000","-","1 11","ddddd")
+SH2_RR_ENC(S2_ct0,                   "1100","010","-","1 00","ddddd")
+SH2_RR_ENC(S2_ct1,                   "1100","010","-","1 01","ddddd")
+SH2_RR_ENC(S2_brev,                  "1100","010","-","1 10","ddddd")
+SH2_RR_ENC(S2_vsplatrb,              "1100","010","-","1 11","ddddd")
+SH2_RR_ENC(A2_abs,                   "1100","100","-","1 00","ddddd")
+SH2_RR_ENC(A2_abssat,                "1100","100","-","1 01","ddddd")
+SH2_RR_ENC(A2_negsat,                "1100","100","-","1 10","ddddd")
+SH2_RR_ENC(A2_swiz,                  "1100","100","-","1 11","ddddd")
+SH2_RR_ENC(A2_sath,                  "1100","110","-","1 00","ddddd")
+SH2_RR_ENC(A2_satuh,                 "1100","110","-","1 01","ddddd")
+SH2_RR_ENC(A2_satub,                 "1100","110","-","1 10","ddddd")
+SH2_RR_ENC(A2_satb,                  "1100","110","-","1 11","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"     1101 -------- PP------ --------","[#13] Rd=(Rs,#u6,#U6)")
+DEF_ENC32(S2_extractu,  ICLASS_S2op" 1101 0IIsssss PP0iiiii IIIddddd")
+DEF_ENC32(S4_extract,   ICLASS_S2op" 1101 1IIsssss PP0iiiii IIIddddd")
+DEF_ENC32(S2_mask,    ICLASS_S2op"   1101 0II----- PP1iiiii IIIddddd")
+
+
+
+
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"     1110 -------- PP------ --------","[#14] Rx=(Rs,#u6)")
+I5SHIFTTYPES(r_nac,       "1110","00-","0","xxxxx")
+I5SHIFTTYPES(r_acc,       "1110","00-","1","xxxxx")
+I5SHIFTTYPES(r_and,       "1110","01-","0","xxxxx")
+I5SHIFTTYPES(r_or,        "1110","01-","1","xxxxx")
+I5SHIFTTYPES_NOASR(r_xacc,"1110","10-","0","xxxxx")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S2op"     1111 -------- PP------ --------","[#15] Rx=(Rs,#u6,#U6)")
+DEF_ENC32(S2_insert,    ICLASS_S2op" 1111 0IIsssss PP0iiiii IIIxxxxx")
+
+
+
+
+
+/*************************/
+/* S_3_operand           */
+/*************************/
+
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 0000 -------- PP------ --------","[#0] Rdd=(Rss,Rtt,#u3)")
+SH_RRR_ENC(S2_valignib,         "0000","0--","-","iii","ddddd")
+SH_RRR_ENC(S2_vspliceib,        "0000","1--","-","iii","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 0001 -------- PP------ --------","[#1] Rdd=(Rss,Rtt)")
+SH_RRR_ENC(S2_extractup_rp,     "0001","00-","-","00-","ddddd")
+SH_RRR_ENC(S2_shuffeb,          "0001","00-","-","01-","ddddd")
+SH_RRR_ENC(S2_shuffob,          "0001","00-","-","10-","ddddd")
+SH_RRR_ENC(S2_shuffeh,          "0001","00-","-","11-","ddddd")
+
+SH_RRR_ENC(S2_shuffoh,          "0001","10-","-","000","ddddd")
+SH_RRR_ENC(S2_vtrunewh,         "0001","10-","-","010","ddddd")
+SH_RRR_ENC(S6_vtrunehb_ppp,     "0001","10-","-","011","ddddd")
+SH_RRR_ENC(S2_vtrunowh,         "0001","10-","-","100","ddddd")
+SH_RRR_ENC(S6_vtrunohb_ppp,     "0001","10-","-","101","ddddd")
+SH_RRR_ENC(S2_lfsp,             "0001","10-","-","110","ddddd")
+
+SH_RRR_ENC(S4_vxaddsubw,        "0001","01-","-","000","ddddd")
+SH_RRR_ENC(A5_vaddhubs,         "0001","01-","-","001","ddddd")
+SH_RRR_ENC(S4_vxsubaddw,        "0001","01-","-","010","ddddd")
+SH_RRR_ENC(S4_vxaddsubh,        "0001","01-","-","100","ddddd")
+SH_RRR_ENC(S4_vxsubaddh,        "0001","01-","-","110","ddddd")
+
+SH_RRR_ENC(S4_vxaddsubhr,       "0001","11-","-","00-","ddddd")
+SH_RRR_ENC(S4_vxsubaddhr,       "0001","11-","-","01-","ddddd")
+SH_RRR_ENC(S4_extractp_rp,      "0001","11-","-","10-","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 0010 -------- PP------ --------","[#2] Rdd=(Rss,Rtt,Pu)")
+SH_RRR_ENC(S2_valignrb,         "0010","0--","-","-uu","ddddd")
+SH_RRR_ENC(S2_vsplicerb,        "0010","100","-","-uu","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 0011 -------- PP------ --------","[#3] Rdd=(Rss,Rt)")
+RSHIFTTYPES(vw,                 "0011","00-","-","-","ddddd")
+RSHIFTTYPES(vh,                 "0011","01-","-","-","ddddd")
+RSHIFTTYPES(p,                  "0011","10-","-","-","ddddd")
+SH_RRR_ENC(S2_vcrotate,         "0011","11-","-","00-","ddddd")
+SH_RRR_ENC(S2_vcnegh,           "0011","11-","-","01-","ddddd")
+SH_RRR_ENC(S4_vrcrotate,        "0011","11-","i","11i","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 0100 -------- PP------ --------","[#4] Rd=(Rs,Rt,#u3)")
+DEF_ENC32(S2_addasl_rrri, ICLASS_S3op" 0100   000 sssss PP0ttttt iiiddddd")
+
+
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 0101 -------- PP------ --------","[#5] Rd=(Rss,Rt)")
+SH_RRR_ENC(S2_asr_r_svw_trun,   "0101","---","-","010","ddddd")
+SH_RRR_ENC(M4_cmpyi_wh,         "0101","---","-","100","ddddd")
+SH_RRR_ENC(M4_cmpyr_wh,         "0101","---","-","110","ddddd")
+SH_RRR_ENC(M4_cmpyi_whc,        "0101","---","-","101","ddddd")
+SH_RRR_ENC(M4_cmpyr_whc,        "0101","---","-","111","ddddd")
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 0110 -------- PP------ --------","[#6] Rd=(Rs,Rt)")
+SH_RRR_ENC(S2_asr_r_r_sat,      "0110","00-","-","00-","ddddd")
+SH_RRR_ENC(S2_asl_r_r_sat,      "0110","00-","-","10-","ddddd")
+
+RSHIFTTYPES(r,                  "0110","01-","-","-","ddddd")
+
+SH_RRR_ENC(S2_setbit_r,         "0110","10-","-","00-","ddddd")
+SH_RRR_ENC(S2_clrbit_r,         "0110","10-","-","01-","ddddd")
+SH_RRR_ENC(S2_togglebit_r,      "0110","10-","-","10-","ddddd")
+SH_RRRiENC(S4_lsli,             "0110","10-","-","11i","ddddd")
+
+SH_RRR_ENC(A4_cround_rr,        "0110","11-","-","00-","ddddd")
+SH_RRR_ENC(A7_croundd_rr,       "0110","11-","-","01-","ddddd")
+SH_RRR_ENC(A4_round_rr,         "0110","11-","-","10-","ddddd")
+SH_RRR_ENC(A4_round_rr_sat,     "0110","11-","-","11-","ddddd")
+
+
+
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 0111 -------- PP------ --------","[#7] Pd=(Rs,Rt)")
+SH_RRR_ENC(S2_tstbit_r,         "0111","000","-","---","---dd")
+SH_RRR_ENC(C2_bitsset,          "0111","010","-","---","---dd")
+SH_RRR_ENC(C2_bitsclr,          "0111","100","-","---","---dd")
+SH_RRR_ENC(A4_cmpheq,           "0111","110","-","011","---dd")
+SH_RRR_ENC(A4_cmphgt,           "0111","110","-","100","---dd")
+SH_RRR_ENC(A4_cmphgtu,          "0111","110","-","101","---dd")
+SH_RRR_ENC(A4_cmpbeq,           "0111","110","-","110","---dd")
+SH_RRR_ENC(A4_cmpbgtu,          "0111","110","-","111","---dd")
+SH_RRR_ENC(A4_cmpbgt,           "0111","110","-","010","---dd")
+SH_RRR_ENC(S4_ntstbit_r,        "0111","001","-","---","---dd")
+SH_RRR_ENC(C4_nbitsset,         "0111","011","-","---","---dd")
+SH_RRR_ENC(C4_nbitsclr,         "0111","101","-","---","---dd")
+
+SH_RRR_ENC(F2_sfcmpge,          "0111","111","-","000","---dd")
+SH_RRR_ENC(F2_sfcmpuo,          "0111","111","-","001","---dd")
+SH_RRR_ENC(F2_sfcmpeq,          "0111","111","-","011","---dd")
+SH_RRR_ENC(F2_sfcmpgt,          "0111","111","-","100","---dd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 1000 -------- PP------ --------","[#8] Rx=(Rs,Rtt)")
+SH_RRR_ENC(S2_insert_rp,        "1000","---","-","---","xxxxx")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 1001 -------- PP------ --------","[#9] Rd=(Rs,Rtt)")
+SH_RRR_ENC(S2_extractu_rp,      "1001","00-","-","00-","ddddd")
+SH_RRR_ENC(S4_extract_rp,       "1001","00-","-","01-","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 1010 -------- PP------ --------","[#10] Rxx=(Rss,Rtt)")
+SH_RRR_ENC(S2_insertp_rp,       "1010","0--","0","---","xxxxx")
+SH_RRR_ENC(M4_xor_xacc,         "1010","10-","0","000","xxxxx")
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 1011 -------- PP------ --------","[#11] Rxx=(Rss,Rt)")
+RSHIFTTYPES(p_or,               "1011","000","-","-","xxxxx")
+RSHIFTTYPES(p_and,              "1011","010","-","-","xxxxx")
+RSHIFTTYPES(p_nac,              "1011","100","-","-","xxxxx")
+RSHIFTTYPES(p_acc,              "1011","110","-","-","xxxxx")
+RSHIFTTYPES(p_xor,              "1011","011","-","-","xxxxx")
+
+SH_RRR_ENCX(A4_vrmaxh,		"1011","001","0","001","uuuuu")
+SH_RRR_ENCX(A4_vrmaxuh,		"1011","001","1","001","uuuuu")
+SH_RRR_ENCX(A4_vrmaxw,		"1011","001","0","010","uuuuu")
+SH_RRR_ENCX(A4_vrmaxuw,		"1011","001","1","010","uuuuu")
+
+SH_RRR_ENCX(A4_vrminh,		"1011","001","0","101","uuuuu")
+SH_RRR_ENCX(A4_vrminuh,		"1011","001","1","101","uuuuu")
+SH_RRR_ENCX(A4_vrminw,		"1011","001","0","110","uuuuu")
+SH_RRR_ENCX(A4_vrminuw,		"1011","001","1","110","uuuuu")
+
+SH_RRR_ENC(S2_vrcnegh,		"1011","001","1","111","xxxxx")
+
+SH_RRR_ENC(S4_vrcrotate_acc,	"1011","101","i","--i","xxxxx")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 1100 -------- PP------ --------","[#12] Rx=(Rs,Rt)")
+RSHIFTTYPES(r_or,               "1100","00-","-","-","xxxxx")
+RSHIFTTYPES(r_and,              "1100","01-","-","-","xxxxx")
+RSHIFTTYPES(r_nac,              "1100","10-","-","-","xxxxx")
+RSHIFTTYPES(r_acc,              "1100","11-","-","-","xxxxx")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 1101 -------- PP------ --------","[#13] Reserved")
+DEF_FIELDROW_DESC32(ICLASS_S3op" 1110 -------- PP------ --------","[#14] Reserved")
+
+
+DEF_FIELDROW_DESC32(ICLASS_S3op" 1111 -------- PP------ --------","[#15] User Instruction")
+
+
+
+
+
+
+
+
+
+
+
+
+
+/*******************************/
+/*                             */
+/*                             */
+/*           ALU64             */
+/*                             */
+/*                             */
+/*******************************/
+DEF_CLASS32(ICLASS_ALU64" ---- -------- PP------ --------",ALU64)
+DEF_FIELD32(ICLASS_ALU64" !!!! -------- PP------ --------",ALU64_RegType,"Register Type")
+DEF_FIELD32(ICLASS_ALU64" 0--- !!!----- PP------ --------",A_MajOp,"Major Opcode")
+DEF_FIELD32(ICLASS_ALU64" 0--- -------- PP------ !!!-----",A_MinOp,"Minor Opcode")
+DEF_FIELD32(ICLASS_ALU64" 11-- -------- PP------ ---!!!!!",A_MajOp,"Major Opcode")
+
+
+
+#define ALU64_RRR_ENC(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS)\
+DEF_ENC32(TAG, ICLASS_ALU64" "MAJ4"  "MIN3"sssss  PP"SMOD1"ttttt "VMIN3 DSTCHARS)
+
+#define LEGACY_ALU64_RRR_ENC(TAG,MAJ4,MIN3,SMOD1,VMIN3,DSTCHARS)\
+LEGACY_DEF_ENC32(TAG, ICLASS_ALU64" "MAJ4"  "MIN3"sssss  PP"SMOD1"ttttt "VMIN3 DSTCHARS)
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 0000 -------- PP------ --------","[#0] Rd=(Rss,Rtt)")
+ALU64_RRR_ENC(S2_parityp,        "0000","---","-","---","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 0001 -------- PP------ --------","[#1] Rdd=(Pu,Rss,Rtt)")
+ALU64_RRR_ENC(C2_vmux,           "0001","---","-","-uu","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 0010 -------- PP------ --------","[#2] Pd=(Rss,Rtt)")
+ALU64_RRR_ENC(A2_vcmpweq,        "0010","0--","0","000","---dd")
+ALU64_RRR_ENC(A2_vcmpwgt,        "0010","0--","0","001","---dd")
+ALU64_RRR_ENC(A2_vcmpwgtu,       "0010","0--","0","010","---dd")
+ALU64_RRR_ENC(A2_vcmpheq,        "0010","0--","0","011","---dd")
+ALU64_RRR_ENC(A2_vcmphgt,        "0010","0--","0","100","---dd")
+ALU64_RRR_ENC(A2_vcmphgtu,       "0010","0--","0","101","---dd")
+ALU64_RRR_ENC(A2_vcmpbeq,        "0010","0--","0","110","---dd")
+ALU64_RRR_ENC(A2_vcmpbgtu,       "0010","0--","0","111","---dd")
+
+ALU64_RRR_ENC(A4_vcmpbeq_any,    "0010","0--","1","000","---dd")
+ALU64_RRR_ENC(A6_vcmpbeq_notany, "0010","0--","1","001","---dd")
+ALU64_RRR_ENC(A4_vcmpbgt,        "0010","0--","1","010","---dd")
+ALU64_RRR_ENC(A4_tlbmatch,       "0010","0--","1","011","---dd")
+ALU64_RRR_ENC(A4_boundscheck_lo, "0010","0--","1","100","---dd")
+ALU64_RRR_ENC(A4_boundscheck_hi, "0010","0--","1","101","---dd")
+
+ALU64_RRR_ENC(C2_cmpeqp,         "0010","100","-","000","---dd")
+ALU64_RRR_ENC(C2_cmpgtp,         "0010","100","-","010","---dd")
+ALU64_RRR_ENC(C2_cmpgtup,        "0010","100","-","100","---dd")
+
+ALU64_RRR_ENC(F2_dfcmpeq,        "0010","111","-","000","---dd")
+ALU64_RRR_ENC(F2_dfcmpgt,        "0010","111","-","001","---dd")
+ALU64_RRR_ENC(F2_dfcmpge,        "0010","111","-","010","---dd")
+ALU64_RRR_ENC(F2_dfcmpuo,        "0010","111","-","011","---dd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 0011 -------- PP------ --------","[#3] Rdd=(Rss,Rtt)")
+ALU64_RRR_ENC(A2_vaddub,         "0011","000","-","000","ddddd")
+ALU64_RRR_ENC(A2_vaddubs,        "0011","000","-","001","ddddd")
+ALU64_RRR_ENC(A2_vaddh,          "0011","000","-","010","ddddd")
+ALU64_RRR_ENC(A2_vaddhs,         "0011","000","-","011","ddddd")
+ALU64_RRR_ENC(A2_vadduhs,        "0011","000","-","100","ddddd")
+ALU64_RRR_ENC(A2_vaddw,          "0011","000","-","101","ddddd")
+ALU64_RRR_ENC(A2_vaddws,         "0011","000","-","110","ddddd")
+ALU64_RRR_ENC(A2_addp,           "0011","000","-","111","ddddd")
+
+ALU64_RRR_ENC(A2_vsubub,         "0011","001","-","000","ddddd")
+ALU64_RRR_ENC(A2_vsububs,        "0011","001","-","001","ddddd")
+ALU64_RRR_ENC(A2_vsubh,          "0011","001","-","010","ddddd")
+ALU64_RRR_ENC(A2_vsubhs,         "0011","001","-","011","ddddd")
+ALU64_RRR_ENC(A2_vsubuhs,        "0011","001","-","100","ddddd")
+ALU64_RRR_ENC(A2_vsubw,          "0011","001","-","101","ddddd")
+ALU64_RRR_ENC(A2_vsubws,         "0011","001","-","110","ddddd")
+ALU64_RRR_ENC(A2_subp,           "0011","001","-","111","ddddd")
+
+ALU64_RRR_ENC(A2_vavgub,         "0011","010","-","000","ddddd")
+ALU64_RRR_ENC(A2_vavgubr,        "0011","010","-","001","ddddd")
+ALU64_RRR_ENC(A2_vavgh,          "0011","010","-","010","ddddd")
+ALU64_RRR_ENC(A2_vavghr,         "0011","010","-","011","ddddd")
+ALU64_RRR_ENC(A2_vavghcr,        "0011","010","-","100","ddddd")
+ALU64_RRR_ENC(A2_vavguh,         "0011","010","-","101","ddddd")
+ALU64_RRR_ENC(A2_vavguhr,        "0011","010","-","11-","ddddd")
+
+ALU64_RRR_ENC(A2_vavgw,          "0011","011","-","000","ddddd")
+ALU64_RRR_ENC(A2_vavgwr,         "0011","011","-","001","ddddd")
+ALU64_RRR_ENC(A2_vavgwcr,        "0011","011","-","010","ddddd")
+ALU64_RRR_ENC(A2_vavguw,         "0011","011","-","011","ddddd")
+ALU64_RRR_ENC(A2_vavguwr,        "0011","011","-","100","ddddd")
+ALU64_RRR_ENC(A2_addpsat,        "0011","011","-","101","ddddd")
+ALU64_RRR_ENC(A2_addspl,         "0011","011","-","110","ddddd")
+ALU64_RRR_ENC(A2_addsph,         "0011","011","-","111","ddddd")
+
+ALU64_RRR_ENC(A2_vnavgh,         "0011","100","-","000","ddddd")
+ALU64_RRR_ENC(A2_vnavghr,        "0011","100","-","001","ddddd")
+ALU64_RRR_ENC(A2_vnavghcr,       "0011","100","-","010","ddddd")
+ALU64_RRR_ENC(A2_vnavgw,         "0011","100","-","011","ddddd")
+ALU64_RRR_ENC(A2_vnavgwr,        "0011","100","-","10-","ddddd")
+ALU64_RRR_ENC(A2_vnavgwcr,       "0011","100","-","11-","ddddd")
+
+ALU64_RRR_ENC(A2_vminub,         "0011","101","-","000","ddddd")
+ALU64_RRR_ENC(A2_vminh,          "0011","101","-","001","ddddd")
+ALU64_RRR_ENC(A2_vminuh,         "0011","101","-","010","ddddd")
+ALU64_RRR_ENC(A2_vminw,          "0011","101","-","011","ddddd")
+ALU64_RRR_ENC(A2_vminuw,         "0011","101","-","100","ddddd")
+ALU64_RRR_ENC(A2_vmaxuw,         "0011","101","-","101","ddddd")	/* Doh! We did not put max with other max insns in v3 */
+ALU64_RRR_ENC(A2_minp,           "0011","101","-","110","ddddd")
+ALU64_RRR_ENC(A2_minup,          "0011","101","-","111","ddddd")
+
+ALU64_RRR_ENC(A2_vmaxub,         "0011","110","-","000","ddddd")
+ALU64_RRR_ENC(A2_vmaxh,          "0011","110","-","001","ddddd")
+ALU64_RRR_ENC(A2_vmaxuh,         "0011","110","-","010","ddddd")
+ALU64_RRR_ENC(A2_vmaxw,          "0011","110","-","011","ddddd")
+ALU64_RRR_ENC(A2_maxp,           "0011","110","-","100","ddddd")
+ALU64_RRR_ENC(A2_maxup,          "0011","110","-","101","ddddd")
+ALU64_RRR_ENC(A2_vmaxb,          "0011","110","-","110","ddddd")
+ALU64_RRR_ENC(A2_vminb,          "0011","110","-","111","ddddd")	/* EJP: Because vmaxuw out of place */
+
+ALU64_RRR_ENC(A2_andp,           "0011","111","-","000","ddddd")
+ALU64_RRR_ENC(A2_orp,            "0011","111","-","010","ddddd")
+ALU64_RRR_ENC(A2_xorp,           "0011","111","-","100","ddddd")
+ALU64_RRR_ENC(A4_andnp,          "0011","111","-","001","ddddd")
+ALU64_RRR_ENC(A4_ornp,           "0011","111","-","011","ddddd")
+
+ALU64_RRR_ENC(A4_modwrapu,       "0011","111","-","111","ddddd")
+
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 0100 -------- PP------ --------","[#4] Rdd=(Rs,Rt)")
+LEGACY_ALU64_RRR_ENC(S2_packhl,  "0100","--0","-","---","ddddd")
+ALU64_RRR_ENC(A4_bitsplit,       "0100","--1","-","---","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 0101 -------- PP------ --------","[#5] Rd=(Rs,Rt)")
+ALU64_RRR_ENC(A2_addh_l16_ll,    "0101","000","-","00-","ddddd")
+ALU64_RRR_ENC(A2_addh_l16_hl,    "0101","000","-","01-","ddddd")
+ALU64_RRR_ENC(A2_addh_l16_sat_ll,"0101","000","-","10-","ddddd")
+ALU64_RRR_ENC(A2_addh_l16_sat_hl,"0101","000","-","11-","ddddd")
+
+ALU64_RRR_ENC(A2_subh_l16_ll,    "0101","001","-","00-","ddddd")
+ALU64_RRR_ENC(A2_subh_l16_hl,    "0101","001","-","01-","ddddd")
+ALU64_RRR_ENC(A2_subh_l16_sat_ll,"0101","001","-","10-","ddddd")
+ALU64_RRR_ENC(A2_subh_l16_sat_hl,"0101","001","-","11-","ddddd")
+
+ALU64_RRR_ENC(A2_addh_h16_ll,    "0101","010","-","000","ddddd")
+ALU64_RRR_ENC(A2_addh_h16_lh,    "0101","010","-","001","ddddd")
+ALU64_RRR_ENC(A2_addh_h16_hl,    "0101","010","-","010","ddddd")
+ALU64_RRR_ENC(A2_addh_h16_hh,    "0101","010","-","011","ddddd")
+ALU64_RRR_ENC(A2_addh_h16_sat_ll,"0101","010","-","100","ddddd")
+ALU64_RRR_ENC(A2_addh_h16_sat_lh,"0101","010","-","101","ddddd")
+ALU64_RRR_ENC(A2_addh_h16_sat_hl,"0101","010","-","110","ddddd")
+ALU64_RRR_ENC(A2_addh_h16_sat_hh,"0101","010","-","111","ddddd")
+
+ALU64_RRR_ENC(A2_subh_h16_ll,    "0101","011","-","000","ddddd")
+ALU64_RRR_ENC(A2_subh_h16_lh,    "0101","011","-","001","ddddd")
+ALU64_RRR_ENC(A2_subh_h16_hl,    "0101","011","-","010","ddddd")
+ALU64_RRR_ENC(A2_subh_h16_hh,    "0101","011","-","011","ddddd")
+ALU64_RRR_ENC(A2_subh_h16_sat_ll,"0101","011","-","100","ddddd")
+ALU64_RRR_ENC(A2_subh_h16_sat_lh,"0101","011","-","101","ddddd")
+ALU64_RRR_ENC(A2_subh_h16_sat_hl,"0101","011","-","110","ddddd")
+ALU64_RRR_ENC(A2_subh_h16_sat_hh,"0101","011","-","111","ddddd")
+
+LEGACY_ALU64_RRR_ENC(A2_addsat,  "0101","100","-","0--","ddddd")
+LEGACY_ALU64_RRR_ENC(A2_subsat,  "0101","100","-","1--","ddddd")
+
+ALU64_RRR_ENC(A2_min,            "0101","101","-","0--","ddddd")
+ALU64_RRR_ENC(A2_minu,           "0101","101","-","1--","ddddd")
+
+ALU64_RRR_ENC(A2_max,            "0101","110","-","0--","ddddd")
+ALU64_RRR_ENC(A2_maxu,           "0101","110","-","1--","ddddd")
+
+ALU64_RRR_ENC(S4_parity,         "0101","111","-","---","ddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 0110 -------- PP------ --------","[#6] Rd=#u10 ")
+DEF_ENC32(F2_sfimm_p,     ICLASS_ALU64" 0110   00i ----- PPiiiiii iiiddddd")
+DEF_ENC32(F2_sfimm_n,     ICLASS_ALU64" 0110   01i ----- PPiiiiii iiiddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 0111 -------- PP------ --------","[#7] Rd=(Rs,Rt,#u6)")
+DEF_ENC32(M4_mpyrr_addi,  ICLASS_ALU64" 0111   0ii sssss PPittttt iiiddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 1000 -------- PP------ --------","[#8] Rd=(Rs,#u6,#U6)")
+DEF_ENC32(M4_mpyri_addi,  ICLASS_ALU64" 1000   Iii sssss PPiddddd iiiIIIII")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 1001 -------- PP------ --------","[#9] Rdd=#u10 ")
+DEF_ENC32(F2_dfimm_p,     ICLASS_ALU64" 1001   00i ----- PPiiiiii iiiddddd")
+DEF_ENC32(F2_dfimm_n,     ICLASS_ALU64" 1001   01i ----- PPiiiiii iiiddddd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 1010 -------- PP------ --------","[#10] Rx=(Rs,Rx,#s10)")
+DEF_ENC32(S4_or_andix,    ICLASS_ALU64" 1010   01i xxxxx PPiiiiii iiiuuuuu")
+DEF_ENC32(S4_or_andi,     ICLASS_ALU64" 1010   00i sssss PPiiiiii iiixxxxx")
+DEF_ENC32(S4_or_ori,      ICLASS_ALU64" 1010   10i sssss PPiiiiii iiixxxxx")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 1011 -------- PP------ --------","[#11] Rd=(Rs,Rd,#s6)")
+DEF_ENC32(S4_addaddi,     ICLASS_ALU64" 1011   0ii sssss PPiddddd iiiuuuuu")
+DEF_ENC32(S4_subaddi,     ICLASS_ALU64" 1011   1ii sssss PPiddddd iiiuuuuu")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64"     1100 -------- PP------ --------","[#12] Pd=(Rss,#s8)")
+DEF_ENC32(A4_vcmpbeqi,   ICLASS_ALU64"1100 000sssss PP-iiiii iii00-dd")
+DEF_ENC32(A4_vcmpbgti,   ICLASS_ALU64"1100 001sssss PP-iiiii iii00-dd")
+DEF_ENC32(A4_vcmpbgtui,  ICLASS_ALU64"1100 010sssss PP-0iiii iii00-dd")
+DEF_ENC32(A4_vcmpheqi,   ICLASS_ALU64"1100 000sssss PP-iiiii iii01-dd")
+DEF_ENC32(A4_vcmphgti,   ICLASS_ALU64"1100 001sssss PP-iiiii iii01-dd")
+DEF_ENC32(A4_vcmphgtui,  ICLASS_ALU64"1100 010sssss PP-0iiii iii01-dd")
+DEF_ENC32(A4_vcmpweqi,   ICLASS_ALU64"1100 000sssss PP-iiiii iii10-dd")
+DEF_ENC32(A4_vcmpwgti,   ICLASS_ALU64"1100 001sssss PP-iiiii iii10-dd")
+DEF_ENC32(A4_vcmpwgtui,  ICLASS_ALU64"1100 010sssss PP-0iiii iii10-dd")
+
+DEF_ENC32(F2_dfclass,    ICLASS_ALU64"1100 100sssss PP-000ii iii10-dd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 1101 -------- PP------ --------","[#13] Pd=(Rs,#s8)")
+DEF_ENC32(A4_cmpbeqi,    ICLASS_ALU64"1101 -00sssss PP-iiiii iii00-dd")
+DEF_ENC32(A4_cmpbgti,    ICLASS_ALU64"1101 -01sssss PP-iiiii iii00-dd")
+DEF_ENC32(A4_cmpbgtui,   ICLASS_ALU64"1101 -10sssss PP-0iiii iii00-dd")
+DEF_ENC32(A4_cmpheqi,    ICLASS_ALU64"1101 -00sssss PP-iiiii iii01-dd")
+DEF_ENC32(A4_cmphgti,    ICLASS_ALU64"1101 -01sssss PP-iiiii iii01-dd")
+DEF_ENC32(A4_cmphgtui,   ICLASS_ALU64"1101 -10sssss PP-0iiii iii01-dd")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 1110 -------- PP------ --------","[#14] Rx=(#u9,op(Rx,#u5))")
+
+#define OP_OPI_RI(TAG,OPB)\
+DEF_ENC32(S4_andi_##TAG##_ri,ICLASS_ALU64" 1110 iiixxxxx PPiIIIII iii"OPB"i00-")\
+DEF_ENC32(S4_ori_##TAG##_ri, ICLASS_ALU64" 1110 iiixxxxx PPiIIIII iii"OPB"i01-")\
+DEF_ENC32(S4_addi_##TAG##_ri,ICLASS_ALU64" 1110 iiixxxxx PPiIIIII iii"OPB"i10-")\
+DEF_ENC32(S4_subi_##TAG##_ri,ICLASS_ALU64" 1110 iiixxxxx PPiIIIII iii"OPB"i11-")
+
+OP_OPI_RI(asl,"0")
+OP_OPI_RI(lsr,"1")
+
+
+DEF_FIELDROW_DESC32(ICLASS_ALU64" 1111 -------- PP------ --------","[#15] Rd=(Rs,Ru,#u6:2)")
+DEF_ENC32(M4_mpyri_addr_u2, ICLASS_ALU64" 1111   0ii sssss PPiddddd iiiuuuuu")
+DEF_ENC32(M4_mpyri_addr,    ICLASS_ALU64" 1111   1ii sssss PPiddddd iiiuuuuu")
diff --git a/target/hexagon/imported/encode_subinsn.def b/target/hexagon/imported/encode_subinsn.def
new file mode 100644
index 0000000..ebb8119
--- /dev/null
+++ b/target/hexagon/imported/encode_subinsn.def
@@ -0,0 +1,150 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+
+/* DEF_ENC_SUBINSN(TAG, CLASS, ENCSTR) */
+
+
+
+
+/*********************/
+/* Ld1-type subinsns */
+/*********************/
+DEF_ENC_SUBINSN(SL1_loadri_io,   SUBINSN_L1, "0iiiissssdddd")
+DEF_ENC_SUBINSN(SL1_loadrub_io,  SUBINSN_L1, "1iiiissssdddd")
+
+/*********************/
+/* St1-type subinsns */
+/*********************/
+DEF_ENC_SUBINSN(SS1_storew_io,  SUBINSN_S1, "0ii iisssstttt")
+DEF_ENC_SUBINSN(SS1_storeb_io,  SUBINSN_S1, "1ii iisssstttt")
+
+
+/*********************/
+/* Ld2-type subinsns */
+/*********************/
+DEF_ENC_SUBINSN(SL2_loadrh_io,   SUBINSN_L2, "00i iissssdddd")
+DEF_ENC_SUBINSN(SL2_loadruh_io,  SUBINSN_L2, "01i iissssdddd")
+DEF_ENC_SUBINSN(SL2_loadrb_io,   SUBINSN_L2, "10i iissssdddd")
+DEF_ENC_SUBINSN(SL2_loadri_sp,   SUBINSN_L2, "111 0iiiiidddd")
+DEF_ENC_SUBINSN(SL2_loadrd_sp,   SUBINSN_L2, "111 10iiiiiddd")
+
+DEF_ENC_SUBINSN(SL2_deallocframe,SUBINSN_L2, "111 1100---0--")
+
+DEF_ENC_SUBINSN(SL2_return,      SUBINSN_L2, "111 1101---0--")
+DEF_ENC_SUBINSN(SL2_return_t,    SUBINSN_L2, "111 1101---100")
+DEF_ENC_SUBINSN(SL2_return_f,    SUBINSN_L2, "111 1101---101")
+DEF_ENC_SUBINSN(SL2_return_tnew, SUBINSN_L2, "111 1101---110")
+DEF_ENC_SUBINSN(SL2_return_fnew, SUBINSN_L2, "111 1101---111")
+
+DEF_ENC_SUBINSN(SL2_jumpr31,     SUBINSN_L2, "111 1111---0--")
+DEF_ENC_SUBINSN(SL2_jumpr31_t,   SUBINSN_L2, "111 1111---100")
+DEF_ENC_SUBINSN(SL2_jumpr31_f,   SUBINSN_L2, "111 1111---101")
+DEF_ENC_SUBINSN(SL2_jumpr31_tnew,SUBINSN_L2, "111 1111---110")
+DEF_ENC_SUBINSN(SL2_jumpr31_fnew,SUBINSN_L2, "111 1111---111")
+
+
+/*********************/
+/* St2-type subinsns */
+/*********************/
+DEF_ENC_SUBINSN(SS2_storeh_io,   SUBINSN_S2, "00i iisssstttt")
+DEF_ENC_SUBINSN(SS2_storew_sp,   SUBINSN_S2, "010 0iiiiitttt")
+DEF_ENC_SUBINSN(SS2_stored_sp,   SUBINSN_S2, "010 1iiiiiittt")
+
+DEF_ENC_SUBINSN(SS2_storewi0,    SUBINSN_S2, "100 00ssssiiii")
+DEF_ENC_SUBINSN(SS2_storewi1,    SUBINSN_S2, "100 01ssssiiii")
+DEF_ENC_SUBINSN(SS2_storebi0,    SUBINSN_S2, "100 10ssssiiii")
+DEF_ENC_SUBINSN(SS2_storebi1,    SUBINSN_S2, "100 11ssssiiii")
+
+DEF_ENC_SUBINSN(SS2_allocframe,  SUBINSN_S2, "111 0iiiii----")
+
+
+
+/*******************/
+/* A-type subinsns */
+/*******************/
+DEF_ENC_SUBINSN(SA1_addi,       SUBINSN_A, "00i iiiiiixxxx")
+DEF_ENC_SUBINSN(SA1_seti,       SUBINSN_A, "010 iiiiiidddd")
+DEF_ENC_SUBINSN(SA1_addsp,      SUBINSN_A, "011 iiiiiidddd")
+
+DEF_ENC_SUBINSN(SA1_tfr,        SUBINSN_A, "100 00ssssdddd")
+DEF_ENC_SUBINSN(SA1_inc,        SUBINSN_A, "100 01ssssdddd")
+DEF_ENC_SUBINSN(SA1_and1,       SUBINSN_A, "100 10ssssdddd")
+DEF_ENC_SUBINSN(SA1_dec,        SUBINSN_A, "100 11ssssdddd")
+
+DEF_ENC_SUBINSN(SA1_sxth,       SUBINSN_A, "101 00ssssdddd")
+DEF_ENC_SUBINSN(SA1_sxtb,       SUBINSN_A, "101 01ssssdddd")
+DEF_ENC_SUBINSN(SA1_zxth,       SUBINSN_A, "101 10ssssdddd")
+DEF_ENC_SUBINSN(SA1_zxtb,       SUBINSN_A, "101 11ssssdddd")
+
+
+DEF_ENC_SUBINSN(SA1_addrx,      SUBINSN_A, "110 00ssssxxxx")
+DEF_ENC_SUBINSN(SA1_cmpeqi,     SUBINSN_A, "110 01ssss--ii")
+DEF_ENC_SUBINSN(SA1_setin1,     SUBINSN_A, "110 1--0--dddd")
+DEF_ENC_SUBINSN(SA1_clrtnew,    SUBINSN_A, "110 1--100dddd")
+DEF_ENC_SUBINSN(SA1_clrfnew,    SUBINSN_A, "110 1--101dddd")
+DEF_ENC_SUBINSN(SA1_clrt,       SUBINSN_A, "110 1--110dddd")
+DEF_ENC_SUBINSN(SA1_clrf,       SUBINSN_A, "110 1--111dddd")
+
+
+DEF_ENC_SUBINSN(SA1_combine0i,  SUBINSN_A, "111 -0-ii00ddd")
+DEF_ENC_SUBINSN(SA1_combine1i,  SUBINSN_A, "111 -0-ii01ddd")
+DEF_ENC_SUBINSN(SA1_combine2i,  SUBINSN_A, "111 -0-ii10ddd")
+DEF_ENC_SUBINSN(SA1_combine3i,  SUBINSN_A, "111 -0-ii11ddd")
+DEF_ENC_SUBINSN(SA1_combinezr,  SUBINSN_A, "111 -1ssss0ddd")
+DEF_ENC_SUBINSN(SA1_combinerz,  SUBINSN_A, "111 -1ssss1ddd")
+
+
+
+
+/* maybe R=cmpeq ? */
+
+
+/* Add a group of NCJ: if (R.new==#0) jump:hint #r9 */
+/* Add a group of NCJ: if (R.new!=#0) jump:hint #r9 */
+/* NCJ goes with LD1, LD2 */
+
+
+
+
+DEF_FIELD32("---! !!!! !!!!!!!! EE------ --------",SUBFIELD_B_SLOT1,"B: Slot1 Instruction")
+DEF_FIELD32("---- ---- -------- EE-!!!!! !!!!!!!!",SUBFIELD_A_SLOT0,"A: Slot0 Instruction")
+
+
+/* DEF_PACKED32(TAG, CLASSA, CLASSB, ENCSTR) */
+
+DEF_PACKED32(P2_PACKED_L1_L1, SUBINSN_L1, SUBINSN_L1, "000B BBBB BBBB BBBB EE0A AAAA AAAA AAAA")
+DEF_PACKED32(P2_PACKED_L1_L2, SUBINSN_L2, SUBINSN_L1, "000B BBBB BBBB BBBB EE1A AAAA AAAA AAAA")
+DEF_PACKED32(P2_PACKED_L2_L2, SUBINSN_L2, SUBINSN_L2, "001B BBBB BBBB BBBB EE0A AAAA AAAA AAAA")
+DEF_PACKED32(P2_PACKED_A_A,   SUBINSN_A,  SUBINSN_A,  "001B BBBB BBBB BBBB EE1A AAAA AAAA AAAA")
+
+DEF_PACKED32(P2_PACKED_L1_A,  SUBINSN_L1, SUBINSN_A,  "010B BBBB BBBB BBBB EE0A AAAA AAAA AAAA")
+DEF_PACKED32(P2_PACKED_L2_A,  SUBINSN_L2, SUBINSN_A,  "010B BBBB BBBB BBBB EE1A AAAA AAAA AAAA")
+DEF_PACKED32(P2_PACKED_S1_A,  SUBINSN_S1, SUBINSN_A,  "011B BBBB BBBB BBBB EE0A AAAA AAAA AAAA")
+DEF_PACKED32(P2_PACKED_S2_A,  SUBINSN_S2, SUBINSN_A,  "011B BBBB BBBB BBBB EE1A AAAA AAAA AAAA")
+
+DEF_PACKED32(P2_PACKED_S1_L1, SUBINSN_S1, SUBINSN_L1, "100B BBBB BBBB BBBB EE0A AAAA AAAA AAAA")
+DEF_PACKED32(P2_PACKED_S1_L2, SUBINSN_S1, SUBINSN_L2, "100B BBBB BBBB BBBB EE1A AAAA AAAA AAAA")
+DEF_PACKED32(P2_PACKED_S1_S1, SUBINSN_S1, SUBINSN_S1, "101B BBBB BBBB BBBB EE0A AAAA AAAA AAAA")
+DEF_PACKED32(P2_PACKED_S1_S2, SUBINSN_S2, SUBINSN_S1, "101B BBBB BBBB BBBB EE1A AAAA AAAA AAAA")
+
+DEF_PACKED32(P2_PACKED_S2_L1, SUBINSN_S2, SUBINSN_L1, "110B BBBB BBBB BBBB EE0A AAAA AAAA AAAA")
+DEF_PACKED32(P2_PACKED_S2_L2, SUBINSN_S2, SUBINSN_L2, "110B BBBB BBBB BBBB EE1A AAAA AAAA AAAA")
+DEF_PACKED32(P2_PACKED_S2_S2, SUBINSN_S2, SUBINSN_S2, "111B BBBB BBBB BBBB EE0A AAAA AAAA AAAA")
+
+DEF_PACKED32(P2_PACKED_RESERVED, SUBINSN_INVALID, SUBINSN_INVALID, "111B BBBB BBBB BBBB EE1A AAAA AAAA AAAA")
+
-- 
2.7.4



* [RFC PATCH v3 20/34] Hexagon (target/hexagon) generator phase 1 - C preprocessor for semantics
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (18 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 19/34] Hexagon (target/hexagon/imported) arch import - instruction encoding Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-18 15:50 ` [RFC PATCH v3 21/34] Hexagon (target/hexagon) generator phase 2 - generate header files Taylor Simpson
                   ` (15 subsequent siblings)
  35 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Run the C preprocessor across the instruction definition files and the macro
definition file to expand macros and prepare the semantics_generated.pyinc
file.  The resulting file contains one entry with the semantics for each
instruction and one line with the instruction attributes associated with
each macro.
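
As an illustration only (not part of this patch), the sketch below shows how
a Python consumer could read text in the shape that gen_semantics.c prints.
The sample text is reconstructed by hand from the fprintf format strings and
the A2_add / fLSBNEW0 examples in the source comments.  The dictionaries and
load_semantics() are hypothetical names, and using exec() is an assumption
about one possible way to consume the file.

```python
# Hypothetical sample of the generated text, written by hand from the
# fprintf format strings in gen_semantics.c (not real generator output).
SAMPLE = r'''
SEMANTICS( \
    "A2_add", \
    "Rd32=add(Rs32,Rt32)", \
    """{ RdV=RsV+RtV;}""" \
)
ATTRIBUTES( \
    "A2_add", \
    "ATTRIBS()" \
)
MACROATTRIB( \
    "fLSBNEW0", \
    """predlog_read(thread,0)""", \
    "()" \
)
'''

semantics = {}   # tag -> (behavior string, semantics body)
attributes = {}  # tag -> raw attribute string
macros = {}      # macro name -> (behavior, raw attribute string)


def SEMANTICS(tag, beh, sem):
    semantics[tag] = (beh, sem)


def ATTRIBUTES(tag, attribs):
    attributes[tag] = attribs


def MACROATTRIB(name, beh, attribs):
    macros[name] = (beh, attribs)


def load_semantics(text):
    # Each entry is a plain Python call expression, so the whole text can
    # be executed with the three callbacks in scope.
    namespace = {
        "SEMANTICS": SEMANTICS,
        "ATTRIBUTES": ATTRIBUTES,
        "MACROATTRIB": MACROATTRIB,
    }
    exec(text, namespace)


load_semantics(SAMPLE)
print(semantics["A2_add"])
print(macros["fLSBNEW0"])
```

The trailing backslashes are ordinary Python line continuations, which is
why each entry parses as a single call.  A semantics body that itself
contained a triple quote would break this sketch.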

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/gen_semantics.c | 88 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)
 create mode 100644 target/hexagon/gen_semantics.c

diff --git a/target/hexagon/gen_semantics.c b/target/hexagon/gen_semantics.c
new file mode 100644
index 0000000..1b198cb
--- /dev/null
+++ b/target/hexagon/gen_semantics.c
@@ -0,0 +1,88 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * This program generates the semantics file that is processed by
+ * the do_qemu.py script.  We use the C preprocessor to manipulate the
+ * files imported from the Hexagon architecture library.
+ */
+
+#include <stdio.h>
+#define STRINGIZE(X) #X
+
+int main(int argc, char *argv[])
+{
+    FILE *outfile;
+
+    if (argc != 2) {
+        fprintf(stderr, "Usage: gen_semantics outputfile\n");
+        return -1;
+    }
+    outfile = fopen(argv[1], "w");
+    if (outfile == NULL) {
+        fprintf(stderr, "Cannot open %s for writing\n", argv[1]);
+        return -1;
+    }
+
+/*
+ * Process the instruction definitions
+ *     Scalar core instructions have the following form
+ *         Q6INSN(A2_add,"Rd32=add(Rs32,Rt32)",ATTRIBS(),
+ *         "Add 32-bit registers",
+ *         { RdV=RsV+RtV;})
+ */
+#define Q6INSN(TAG, BEH, ATTRIBS, DESCR, SEM) \
+    do { \
+        fprintf(outfile, "SEMANTICS( \\\n" \
+                         "    \"%s\", \\\n" \
+                         "    %s, \\\n" \
+                         "    \"\"\"%s\"\"\" \\\n" \
+                         ")\n", \
+                #TAG, STRINGIZE(BEH), STRINGIZE(SEM)); \
+        fprintf(outfile, "ATTRIBUTES( \\\n" \
+                         "    \"%s\", \\\n" \
+                         "    \"%s\" \\\n" \
+                         ")\n", \
+                #TAG, STRINGIZE(ATTRIBS)); \
+    } while (0);
+#include "imported/allidefs.def"
+#undef Q6INSN
+
+/*
+ * Process the macro definitions
+ *     Macro definitions have the following form
+ *         DEF_MACRO(
+ *             fLSBNEW0,
+ *             predlog_read(thread,0),
+ *             ()
+ *         )
+ * The important part here is the attributes.  Whenever an instruction
+ * invokes a macro, we add the macro's attributes to the instruction.
+ */
+#define DEF_MACRO(MNAME, BEH, ATTRS) \
+    fprintf(outfile, "MACROATTRIB( \\\n" \
+                     "    \"%s\", \\\n" \
+                     "    \"\"\"%s\"\"\", \\\n" \
+                     "    \"%s\" \\\n" \
+                     ")\n", \
+            #MNAME, STRINGIZE(BEH), STRINGIZE(ATTRS));
+#include "imported/macros.def"
+#undef DEF_MACRO
+
+    fclose(outfile);
+    return 0;
+}
-- 
2.7.4



* [RFC PATCH v3 21/34] Hexagon (target/hexagon) generator phase 2 - generate header files
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (19 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 20/34] Hexagon (target/hexagon) generator phase 1 - C preprocessor for semantics Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-18 15:50 ` [RFC PATCH v3 22/34] Hexagon (target/hexagon) generator phase 3 - C preprocessor for decode tree Taylor Simpson
                   ` (14 subsequent siblings)
  35 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Python scripts generate the following files
    helper_protos_generated.h
        For each instruction we create a DEF_HELPER function prototype
    helper_funcs_generated.h
        For each instruction we create the helper function definition
    tcg_funcs_generated.h
        For each instruction we create TCG code to generate a call to the helper
    shortcode_generated.h
        Generate a table of instruction "shortcode" semantics
    opcodes_def_generated.h
        Gives a list of all the opcodes
    op_attribs_generated.h
        Lists all the attributes associated with each instruction
    op_regs_generated.h
        Lists the register and immediate operands for each instruction
    printinsn_generated.h
        Data for printing (disassembling) each instruction (format
        string + operands)
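
As a rough illustration of the helper_funcs_generated.h entry (not the code
in this patch), the sketch below emits the helper text that the comment in
gen_helper_funcs.py gives for A2_add.  It assumes one 32-bit destination and
32-bit sources only.  sketch_helper_function() and its parameters are
hypothetical; the real generator derives operands from the parsed
instruction and also handles pairs, .new values and immediates.

```python
from io import StringIO


def sketch_helper_function(tag, dest, sources, body):
    """Return C text for a helper with a single 32-bit result.

    Simplified assumption: every operand is a single 32-bit register.
    """
    f = StringIO()
    f.write("#ifndef fGEN_TCG_%s\n" % tag)
    f.write("int32_t HELPER(%s)(CPUHexagonState *env" % tag)
    for src in sources:
        # env is always the first argument, so each source needs a comma
        f.write(", int32_t %sV" % src)
    f.write(")\n{\n")
    f.write("uint32_t slot __attribute__((unused)) = 4;\n")
    f.write("int32_t %sV = 0;\n" % dest)
    f.write("%s\n" % body)
    f.write("return %sV;\n" % dest)
    f.write("}\n#endif\n")
    return f.getvalue()


text = sketch_helper_function("A2_add", "Rd", ["Rs", "Rt"],
                              "{ RdV=RsV+RtV;}")
print(text)
```

The #ifndef fGEN_TCG_<tag> guard follows the source: the generated helper is
compiled only when no fGEN_TCG_<tag> override is defined for the instruction.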

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/gen_helper_funcs.py  | 230 +++++++++++++++++++++++++++
 target/hexagon/gen_helper_protos.py | 158 +++++++++++++++++++
 target/hexagon/gen_op_attribs.py    |  46 ++++++
 target/hexagon/gen_op_regs.py       | 119 ++++++++++++++
 target/hexagon/gen_opcodes_def.py   |  43 ++++++
 target/hexagon/gen_printinsn.py     | 182 ++++++++++++++++++++++
 target/hexagon/gen_shortcode.py     |  71 +++++++++
 target/hexagon/gen_tcg_funcs.py     | 301 ++++++++++++++++++++++++++++++++++++
 target/hexagon/hex_common.py        | 204 ++++++++++++++++++++++++
 9 files changed, 1354 insertions(+)
 create mode 100755 target/hexagon/gen_helper_funcs.py
 create mode 100755 target/hexagon/gen_helper_protos.py
 create mode 100755 target/hexagon/gen_op_attribs.py
 create mode 100755 target/hexagon/gen_op_regs.py
 create mode 100755 target/hexagon/gen_opcodes_def.py
 create mode 100755 target/hexagon/gen_printinsn.py
 create mode 100755 target/hexagon/gen_shortcode.py
 create mode 100755 target/hexagon/gen_tcg_funcs.py
 create mode 100755 target/hexagon/hex_common.py

diff --git a/target/hexagon/gen_helper_funcs.py b/target/hexagon/gen_helper_funcs.py
new file mode 100755
index 0000000..4595887
--- /dev/null
+++ b/target/hexagon/gen_helper_funcs.py
@@ -0,0 +1,230 @@
+#!/usr/bin/env python3
+
+##
+##  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##
+##  This program is free software; you can redistribute it and/or modify
+##  it under the terms of the GNU General Public License as published by
+##  the Free Software Foundation; either version 2 of the License, or
+##  (at your option) any later version.
+##
+##  This program is distributed in the hope that it will be useful,
+##  but WITHOUT ANY WARRANTY; without even the implied warranty of
+##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+##  GNU General Public License for more details.
+##
+##  You should have received a copy of the GNU General Public License
+##  along with this program; if not, see <http://www.gnu.org/licenses/>.
+##
+
+import sys
+import re
+import string
+from io import StringIO
+
+from hex_common import *
+
+##
+## Helpers for gen_helper_function
+##
+def gen_decl_ea(f):
+    f.write("size4u_t EA;\n")
+
+def gen_helper_return_type(f,regtype,regid,regno):
+    if regno > 1 : f.write(", ")
+    f.write("int32_t")
+
+def gen_helper_return_type_pair(f,regtype,regid,regno):
+    if regno > 1 : f.write(", ")
+    f.write("int64_t")
+
+def gen_helper_arg(f,regtype,regid,regno):
+    if regno > 0 : f.write(", " )
+    f.write("int32_t %s%sV" % (regtype,regid))
+
+def gen_helper_arg_new(f,regtype,regid,regno):
+    if regno >= 0 : f.write(", " )
+    f.write("int32_t %s%sN" % (regtype,regid))
+
+def gen_helper_arg_pair(f,regtype,regid,regno):
+    if regno >= 0 : f.write(", ")
+    f.write("int64_t %s%sV" % (regtype,regid))
+
+def gen_helper_arg_opn(f,regtype,regid,i,tag):
+    if (is_pair(regid)):
+        gen_helper_arg_pair(f,regtype,regid,i)
+    elif (is_single(regid)):
+        if is_old_val(regtype, regid, tag):
+            gen_helper_arg(f,regtype,regid,i)
+        elif is_new_val(regtype, regid, tag):
+            gen_helper_arg_new(f,regtype,regid,i)
+        else:
+            print("Bad register parse: ",regtype,regid)
+    else:
+        print("Bad register parse: ",regtype,regid)
+
+def gen_helper_arg_imm(f,immlett):
+    f.write(", int32_t %s" % (imm_name(immlett)))
+
+def gen_helper_dest_decl(f,regtype,regid,regno,subfield=""):
+    f.write("int32_t %s%sV%s = 0;\n" % \
+        (regtype,regid,subfield))
+
+def gen_helper_dest_decl_pair(f,regtype,regid,regno,subfield=""):
+    f.write("int64_t %s%sV%s = 0;\n" % \
+        (regtype,regid,subfield))
+
+def gen_helper_dest_decl_opn(f,regtype,regid,i):
+    if (is_pair(regid)):
+        gen_helper_dest_decl_pair(f,regtype,regid,i)
+    elif (is_single(regid)):
+        gen_helper_dest_decl(f,regtype,regid,i)
+    else:
+        print("Bad register parse: ",regtype,regid)
+
+def gen_helper_return(f,regtype,regid,regno):
+    f.write("return %s%sV;\n" % (regtype,regid))
+
+def gen_helper_return_pair(f,regtype,regid,regno):
+    f.write("return %s%sV;\n" % (regtype,regid))
+
+def gen_helper_return_opn(f, regtype, regid, i):
+    if (is_pair(regid)):
+        gen_helper_return_pair(f,regtype,regid,i)
+    elif (is_single(regid)):
+        gen_helper_return(f,regtype,regid,i)
+    else:
+        print("Bad register parse: ",regtype,regid)
+
+##
+## Generate the helper function definition
+##     For A2_add: Rd32=add(Rs32,Rt32), { RdV=RsV+RtV;}
+##     We produce:
+##       #ifndef fGEN_TCG_A2_add
+##       int32_t HELPER(A2_add)(CPUHexagonState *env, int32_t RsV, int32_t RtV)
+##       {
+##       uint32_t slot __attribute__((unused)) = 4;
+##       int32_t RdV = 0;
+##       { RdV=RsV+RtV;}
+##       return RdV;
+##       }
+##       #endif
+##
+def gen_helper_function(f, tag, tagregs, tagimms):
+    regs = tagregs[tag]
+    imms = tagimms[tag]
+
+    f.write('#ifndef fGEN_TCG_%s\n' % tag)
+    numresults = 0
+    numscalarresults = 0
+    numscalarreadwrite = 0
+    for regtype,regid,toss,numregs in regs:
+        if (is_written(regid)):
+            numresults += 1
+            if (is_scalar_reg(regtype)):
+                numscalarresults += 1
+        if (is_readwrite(regid)):
+            if (is_scalar_reg(regtype)):
+                numscalarreadwrite += 1
+
+    if (numscalarresults > 1):
+        ## The helper is bogus when there is more than one result
+        f.write("void HELPER(%s)(CPUHexagonState *env) { BOGUS_HELPER(%s); }\n"
+                % (tag, tag))
+    else:
+        ## The return type of the function is the type of the destination
+        ## register
+        i=0
+        for regtype,regid,toss,numregs in regs:
+            if (is_written(regid)):
+                if (is_pair(regid)):
+                    gen_helper_return_type_pair(f,regtype,regid,i)
+                elif (is_single(regid)):
+                    gen_helper_return_type(f,regtype,regid,i)
+                else:
+                    print("Bad register parse: ",regtype,regid,toss,numregs)
+            i += 1
+
+        if (numscalarresults == 0):
+            f.write("void")
+        f.write(" HELPER(%s)(CPUHexagonState *env" % tag)
+
+        i = 1
+
+        ## Arguments to the helper function are the source regs and immediates
+        for regtype,regid,toss,numregs in regs:
+            if (is_read(regid)):
+                gen_helper_arg_opn(f,regtype,regid,i,tag)
+                i += 1
+        for immlett,bits,immshift in imms:
+            gen_helper_arg_imm(f,immlett)
+            i += 1
+        if need_slot(tag):
+            if i > 0: f.write(", ")
+            f.write("uint32_t slot")
+            i += 1
+        if need_part1(tag):
+            if i > 0: f.write(", ")
+            f.write("uint32_t part1")
+        f.write(")\n{\n")
+        if (not need_slot(tag)):
+            f.write("uint32_t slot __attribute__((unused)) = 4;\n" )
+        if need_ea(tag): gen_decl_ea(f)
+        ## Declare the return variable
+        i=0
+        for regtype,regid,toss,numregs in regs:
+            if (is_writeonly(regid)):
+                gen_helper_dest_decl_opn(f,regtype,regid,i)
+            i += 1
+
+        if 'A_FPOP' in attribdict[tag]:
+            f.write('fFPOP_START();\n')
+
+        f.write(semdict[tag])
+        f.write("\n")
+
+        if 'A_FPOP' in attribdict[tag]:
+            f.write('fFPOP_END();\n')
+
+        ## Save/return the return variable
+        for regtype,regid,toss,numregs in regs:
+            if (is_written(regid)):
+                gen_helper_return_opn(f, regtype, regid, i)
+        f.write("}\n")
+        ## End of the helper definition
+    f.write('#endif\n')
+
+def main():
+    read_semantics_file(sys.argv[1])
+    read_attribs_file(sys.argv[2])
+    calculate_attribs()
+    tagregs = get_tagregs()
+    tagimms = get_tagimms()
+
+    f = StringIO()
+
+    for tag in tags:
+        ## Skip the priv instructions
+        if ( "A_PRIV" in attribdict[tag] ) :
+            continue
+        ## Skip the guest instructions
+        if ( "A_GUEST" in attribdict[tag] ) :
+            continue
+        ## Skip the diag instructions
+        if ( tag == "Y6_diag" ) :
+            continue
+        if ( tag == "Y6_diag0" ) :
+            continue
+        if ( tag == "Y6_diag1" ) :
+            continue
+
+        gen_helper_function(f, tag, tagregs, tagimms)
+
+
+    realf = open('helper_funcs_generated.h','w')
+    realf.write(f.getvalue())
+    realf.close()
+    f.close()
+
+if __name__ == "__main__":
+    main()
diff --git a/target/hexagon/gen_helper_protos.py b/target/hexagon/gen_helper_protos.py
new file mode 100755
index 0000000..6d5ab07
--- /dev/null
+++ b/target/hexagon/gen_helper_protos.py
@@ -0,0 +1,158 @@
+#!/usr/bin/env python3
+
+##
+##  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##
+##  This program is free software; you can redistribute it and/or modify
+##  it under the terms of the GNU General Public License as published by
+##  the Free Software Foundation; either version 2 of the License, or
+##  (at your option) any later version.
+##
+##  This program is distributed in the hope that it will be useful,
+##  but WITHOUT ANY WARRANTY; without even the implied warranty of
+##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+##  GNU General Public License for more details.
+##
+##  You should have received a copy of the GNU General Public License
+##  along with this program; if not, see <http://www.gnu.org/licenses/>.
+##
+
+import sys
+import re
+import string
+from io import StringIO
+
+from hex_common import *
+
+##
+## Helpers for gen_helper_prototype
+##
+def_helper_types = {
+    'N' : 's32',
+    'O' : 's32',
+    'P' : 's32',
+    'M' : 's32',
+    'C' : 's32',
+    'R' : 's32',
+    'V' : 'ptr',
+    'Q' : 'ptr'
+}
+
+def_helper_types_pair = {
+    'R' : 's64',
+    'C' : 's64',
+    'S' : 's64',
+    'G' : 's64',
+    'V' : 'ptr',
+    'Q' : 'ptr'
+}
+
+def gen_def_helper_opn(f, tag, regtype, regid, toss, numregs, i):
+    if (is_pair(regid)):
+        f.write(", %s" % (def_helper_types_pair[regtype]))
+    elif (is_single(regid)):
+        f.write(", %s" % (def_helper_types[regtype]))
+    else:
+        print("Bad register parse: ",regtype,regid,toss,numregs)
+
+##
+## Generate the DEF_HELPER prototype for an instruction
+##     For A2_add: Rd32=add(Rs32,Rt32)
+##     We produce:
+##         #ifndef fGEN_TCG_A2_add
+##         DEF_HELPER_3(A2_add, s32, env, s32, s32)
+##         #endif
+##
+def gen_helper_prototype(f, tag, tagregs, tagimms):
+    regs = tagregs[tag]
+    imms = tagimms[tag]
+
+    f.write('#ifndef fGEN_TCG_%s\n' % tag)
+    numresults = 0
+    numscalarresults = 0
+    numscalarreadwrite = 0
+    for regtype,regid,toss,numregs in regs:
+        if (is_written(regid)):
+            numresults += 1
+            if (is_scalar_reg(regtype)):
+                numscalarresults += 1
+        if (is_readwrite(regid)):
+            if (is_scalar_reg(regtype)):
+                numscalarreadwrite += 1
+
+    if (numscalarresults > 1):
+        ## The helper is bogus when there is more than one result
+        f.write('DEF_HELPER_1(%s, void, env)\n' % tag)
+    else:
+        ## Figure out how many arguments the helper will take
+        if (numscalarresults == 0):
+            def_helper_size = len(regs)+len(imms)+numscalarreadwrite+1
+            if need_part1(tag): def_helper_size += 1
+            if need_slot(tag): def_helper_size += 1
+            f.write('DEF_HELPER_%s(%s' % (def_helper_size, tag))
+            ## The return type is void
+            f.write(', void' )
+        else:
+            def_helper_size = len(regs)+len(imms)+numscalarreadwrite
+            if need_part1(tag): def_helper_size += 1
+            if need_slot(tag): def_helper_size += 1
+            f.write('DEF_HELPER_%s(%s' % (def_helper_size, tag))
+
+        ## Generate the qemu DEF_HELPER type for each result
+        i=0
+        for regtype,regid,toss,numregs in regs:
+            if (is_written(regid)):
+                gen_def_helper_opn(f, tag, regtype, regid, toss, numregs, i)
+                i += 1
+
+        ## Put the env between the outputs and inputs
+        f.write(', env' )
+        i += 1
+
+        ## Generate the qemu type for each input operand (regs and immediates)
+        for regtype,regid,toss,numregs in regs:
+            if (is_read(regid)):
+                gen_def_helper_opn(f, tag, regtype, regid, toss, numregs, i)
+                i += 1
+        for immlett,bits,immshift in imms:
+            f.write(", s32")
+
+        ## Add the arguments for the instruction slot and part1 (if needed)
+        if need_slot(tag): f.write(', i32' )
+        if need_part1(tag): f.write(', i32' )
+        f.write(')\n')
+    f.write('#endif\n')
+
+def main():
+    read_semantics_file(sys.argv[1])
+    read_attribs_file(sys.argv[2])
+    calculate_attribs()
+    tagregs = get_tagregs()
+    tagimms = get_tagimms()
+
+    f = StringIO()
+
+    for tag in tags:
+        ## Skip the priv instructions
+        if ( "A_PRIV" in attribdict[tag] ) :
+            continue
+        ## Skip the guest instructions
+        if ( "A_GUEST" in attribdict[tag] ) :
+            continue
+        ## Skip the diag instructions
+        if ( tag == "Y6_diag" ) :
+            continue
+        if ( tag == "Y6_diag0" ) :
+            continue
+        if ( tag == "Y6_diag1" ) :
+            continue
+
+        gen_helper_prototype(f, tag, tagregs, tagimms)
+
+    realf = open('helper_protos_generated.h','w')
+    realf.write(f.getvalue())
+    realf.close()
+    f.close()
+
+if __name__ == "__main__":
+    main()
diff --git a/target/hexagon/gen_op_attribs.py b/target/hexagon/gen_op_attribs.py
new file mode 100755
index 0000000..663296e
--- /dev/null
+++ b/target/hexagon/gen_op_attribs.py
@@ -0,0 +1,46 @@
+#!/usr/bin/env python3
+
+##
+##  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##
+##  This program is free software; you can redistribute it and/or modify
+##  it under the terms of the GNU General Public License as published by
+##  the Free Software Foundation; either version 2 of the License, or
+##  (at your option) any later version.
+##
+##  This program is distributed in the hope that it will be useful,
+##  but WITHOUT ANY WARRANTY; without even the implied warranty of
+##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+##  GNU General Public License for more details.
+##
+##  You should have received a copy of the GNU General Public License
+##  along with this program; if not, see <http://www.gnu.org/licenses/>.
+##
+
+import sys
+import re
+import string
+from io import StringIO
+
+from hex_common import *
+
+def main():
+    read_semantics_file(sys.argv[1])
+    read_attribs_file(sys.argv[2])
+    calculate_attribs()
+
+    ##
+    ## Generate the op_attribs_generated.h file
+    ##     Lists all the attributes associated with each instruction
+    ##
+    f = StringIO()
+    for tag in tags:
+        f.write('OP_ATTRIB(%s,ATTRIBS(%s))\n' % \
+            (tag, ','.join(sorted(attribdict[tag]))))
+    realf = open('op_attribs_generated.h', 'wt')
+    realf.write(f.getvalue())
+    realf.close()
+    f.close()
+
+if __name__ == "__main__":
+    main()
diff --git a/target/hexagon/gen_op_regs.py b/target/hexagon/gen_op_regs.py
new file mode 100755
index 0000000..abb7441
--- /dev/null
+++ b/target/hexagon/gen_op_regs.py
@@ -0,0 +1,119 @@
+#!/usr/bin/env python3
+
+##
+##  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##
+##  This program is free software; you can redistribute it and/or modify
+##  it under the terms of the GNU General Public License as published by
+##  the Free Software Foundation; either version 2 of the License, or
+##  (at your option) any later version.
+##
+##  This program is distributed in the hope that it will be useful,
+##  but WITHOUT ANY WARRANTY; without even the implied warranty of
+##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+##  GNU General Public License for more details.
+##
+##  You should have received a copy of the GNU General Public License
+##  along with this program; if not, see <http://www.gnu.org/licenses/>.
+##
+
+import sys
+import re
+import string
+from io import StringIO
+
+from hex_common import *
+
+##
+## Generate the op_regs_generated.h file
+##     Lists the register and immediate operands for each instruction
+##
+def calculate_regid_reg(tag):
+    def letter_inc(x): return chr(ord(x)+1)
+    ordered_implregs = [ 'SP','FP','LR' ]
+    srcdst_lett = 'X'
+    src_lett = 'S'
+    dst_lett = 'D'
+    retstr = ""
+    mapdict = {}
+    for reg in ordered_implregs:
+        reg_rd = 0
+        reg_wr = 0
+        if ('A_IMPLICIT_WRITES_'+reg) in attribdict[tag]: reg_wr = 1
+        if reg_rd and reg_wr:
+            retstr += srcdst_lett
+            mapdict[srcdst_lett] = reg
+            srcdst_lett = letter_inc(srcdst_lett)
+        elif reg_rd:
+            retstr += src_lett
+            mapdict[src_lett] = reg
+            src_lett = letter_inc(src_lett)
+        elif reg_wr:
+            retstr += dst_lett
+            mapdict[dst_lett] = reg
+            dst_lett = letter_inc(dst_lett)
+    return retstr,mapdict
+
+def calculate_regid_letters(tag):
+    retstr,mapdict = calculate_regid_reg(tag)
+    return retstr
+
+def strip_reg_prefix(x):
+    y=x.replace('UREG.','')
+    y=y.replace('MREG.','')
+    return y.replace('GREG.','')
+
+def main():
+    read_semantics_file(sys.argv[1])
+    read_attribs_file(sys.argv[2])
+    tagregs = get_tagregs()
+    tagimms = get_tagimms()
+
+    f = StringIO()
+
+    for tag in tags:
+        regs = tagregs[tag]
+        rregs = []
+        wregs = []
+        regids = ""
+        for regtype,regid,toss,numregs in regs:
+            if is_read(regid):
+                if regid[0] not in regids: regids += regid[0]
+                rregs.append(regtype+regid+numregs)
+            if is_written(regid):
+                wregs.append(regtype+regid+numregs)
+                if regid[0] not in regids: regids += regid[0]
+        for attrib in attribdict[tag]:
+            if attribinfo[attrib]['rreg']:
+                rregs.append(strip_reg_prefix(attribinfo[attrib]['rreg']))
+            if attribinfo[attrib]['wreg']:
+                wregs.append(strip_reg_prefix(attribinfo[attrib]['wreg']))
+        regids += calculate_regid_letters(tag)
+        f.write('REGINFO(%s,"%s",\t/*RD:*/\t"%s",\t/*WR:*/\t"%s")\n' % \
+            (tag,regids,",".join(rregs),",".join(wregs)))
+
+    for tag in tags:
+        imms = tagimms[tag]
+        f.write( 'IMMINFO(%s' % tag)
+        if not imms:
+            f.write(''','u',0,0,'U',0,0''')
+        for sign,size,shamt in imms:
+            if sign == 'r': sign = 's'
+            if not shamt:
+                shamt = "0"
+            f.write(''','%s',%s,%s''' % (sign,size,shamt))
+        if len(imms) == 1:
+            if sign.isupper():
+                myu = 'u'
+            else:
+                myu = 'U'
+            f.write(''','%s',0,0''' % myu)
+        f.write(')\n')
+
+    realf = open('op_regs_generated.h','w')
+    realf.write(f.getvalue())
+    realf.close()
+    f.close()
+
+if __name__ == "__main__":
+    main()
diff --git a/target/hexagon/gen_opcodes_def.py b/target/hexagon/gen_opcodes_def.py
new file mode 100755
index 0000000..c34c2b1
--- /dev/null
+++ b/target/hexagon/gen_opcodes_def.py
@@ -0,0 +1,43 @@
+#!/usr/bin/env python3
+
+##
+##  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##
+##  This program is free software; you can redistribute it and/or modify
+##  it under the terms of the GNU General Public License as published by
+##  the Free Software Foundation; either version 2 of the License, or
+##  (at your option) any later version.
+##
+##  This program is distributed in the hope that it will be useful,
+##  but WITHOUT ANY WARRANTY; without even the implied warranty of
+##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+##  GNU General Public License for more details.
+##
+##  You should have received a copy of the GNU General Public License
+##  along with this program; if not, see <http://www.gnu.org/licenses/>.
+##
+
+import sys
+import re
+import string
+from io import StringIO
+
+from hex_common import *
+
+def main():
+    read_semantics_file(sys.argv[1])
+
+    ##
+    ## Generate the opcodes_def_generated.h file
+    ##     Gives a list of all the opcodes
+    ##
+    f = StringIO()
+    for tag in tags:
+        f.write ( "OPCODE(%s),\n" % (tag) )
+    realf = open('opcodes_def_generated.h', 'wt')
+    realf.write(f.getvalue())
+    realf.close()
+    f.close()
+
+if __name__ == "__main__":
+    main()
diff --git a/target/hexagon/gen_printinsn.py b/target/hexagon/gen_printinsn.py
new file mode 100755
index 0000000..6fb9424
--- /dev/null
+++ b/target/hexagon/gen_printinsn.py
@@ -0,0 +1,182 @@
+#!/usr/bin/env python3
+
+##
+##  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##
+##  This program is free software; you can redistribute it and/or modify
+##  it under the terms of the GNU General Public License as published by
+##  the Free Software Foundation; either version 2 of the License, or
+##  (at your option) any later version.
+##
+##  This program is distributed in the hope that it will be useful,
+##  but WITHOUT ANY WARRANTY; without even the implied warranty of
+##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+##  GNU General Public License for more details.
+##
+##  You should have received a copy of the GNU General Public License
+##  along with this program; if not, see <http://www.gnu.org/licenses/>.
+##
+
+import sys
+import re
+import string
+from io import StringIO
+
+from hex_common import *
+
+##
+## Generate the printinsn_generated.h file
+##     Data for printing each instruction (format string + operands)
+##
+def regprinter(m):
+    str = m.group(1)
+    str += ":".join(["%d"]*len(m.group(2)))
+    str += m.group(3)
+    if ('S' in m.group(1)) and (len(m.group(2)) == 1):
+        str += "/%s"
+    elif ('C' in m.group(1)) and (len(m.group(2)) == 1):
+        str += "/%s"
+    return str
+
+def spacify(s):
+    # Regular expression that matches any operator that contains '=' character:
+    opswithequal_re = '[-+^&|!<>=]?='
+    # Regular expression that matches any assignment operator.
+    assignment_re = '[-+^&|]?='
+
+    # Out of the operators that contain the = sign, if the operator is also an
+    # assignment, spaces will be added around it, unless it's enclosed within
+    # parentheses, or spaces are already present.
+
+    equals = re.compile(opswithequal_re)
+    assign = re.compile(assignment_re)
+
+    slen = len(s)
+    paren_count = {}
+    i = 0
+    pc = 0
+    while i < slen:
+        c = s[i]
+        if c == '(':
+            pc += 1
+        elif c == ')':
+            pc -= 1
+        paren_count[i] = pc
+        i += 1
+
+    # Iterate over all operators that contain the equal sign. If any
+    # match is also an assignment operator, add spaces around it if
+    # the parenthesis count is 0.
+    pos = 0
+    out = []
+    for m in equals.finditer(s):
+        ms = m.start()
+        me = m.end()
+        # t is the string that matched opswithequal_re.
+        t = m.string[ms:me]
+        out += s[pos:ms]
+        pos = me
+        if paren_count[ms] == 0:
+            # Check if the entire string t is an assignment.
+            am = assign.match(t)
+            if am and len(am.group(0)) == me-ms:
+                # Don't add spaces if they are already there.
+                if ms > 0 and s[ms-1] != ' ':
+                    out.append(' ')
+                out += t
+                if me < slen and s[me] != ' ':
+                    out.append(' ')
+                continue
+        # If this is not an assignment, just append it to the output
+        # string.
+        out += t
+
+    # Append the remaining part of the string.
+    out += s[pos:len(s)]
+    return ''.join(out)
+
+def main():
+    read_semantics_file(sys.argv[1])
+    read_attribs_file(sys.argv[2])
+
+    immext_casere = re.compile(r'IMMEXT\(([A-Za-z])')
+
+    f = StringIO()
+
+    for tag in tags:
+        if not behdict[tag]: continue
+        extendable_upper_imm = False
+        extendable_lower_imm = False
+        m = immext_casere.search(semdict[tag])
+        if m:
+            if m.group(1).isupper():
+                extendable_upper_imm = True
+            else:
+                extendable_lower_imm = True
+        beh = behdict[tag]
+        beh = regre.sub(regprinter,beh)
+        beh = absimmre.sub(r"#%s0x%x",beh)
+        beh = relimmre.sub(r"PC+%s%d",beh)
+        beh = spacify(beh)
+        # Print out a literal "%s" at the end, used to match empty string
+        # so C won't complain at us
+        if ("A_VECX" in attribdict[tag]): macname = "DEF_VECX_PRINTINFO"
+        else: macname = "DEF_PRINTINFO"
+        f.write('%s(%s,"%s%%s"' % (macname,tag,beh))
+        regs_or_imms = reg_or_immre.findall(behdict[tag])
+        ri = 0
+        seenregs = {}
+        for allregs,a,b,c,d,allimm,immlett,bits,immshift in regs_or_imms:
+            if a:
+                #register
+                if b in seenregs:
+                    regno = seenregs[b]
+                else:
+                    regno = ri
+                if len(b) == 1:
+                    f.write(',REGNO(%d)' % regno)
+                    if 'S' in a:
+                        f.write(',sreg2str(REGNO(%d))' % regno)
+                    elif 'C' in a:
+                        f.write(',creg2str(REGNO(%d))' % regno)
+                elif len(b) == 2:
+                    f.write(',REGNO(%d)+1,REGNO(%d)' % (regno,regno))
+                elif len(b) == 4:
+                    f.write(',REGNO(%d)^3,REGNO(%d)^2,REGNO(%d)^1,REGNO(%d)' % \
+                            (regno,regno,regno,regno))
+                else:
+                    print("Put some stuff to handle quads here")
+                if b not in seenregs:
+                    seenregs[b] = ri
+                    ri += 1
+            else:
+                #immediate
+                if (immlett.isupper()):
+                    if extendable_upper_imm:
+                        if immlett in 'rR':
+                            f.write(',insn->extension_valid?"##":""')
+                        else:
+                            f.write(',insn->extension_valid?"#":""')
+                    else:
+                        f.write(',""')
+                    ii = 1
+                else:
+                    if extendable_lower_imm:
+                        if immlett in 'rR':
+                            f.write(',insn->extension_valid?"##":""')
+                        else:
+                            f.write(',insn->extension_valid?"#":""')
+                    else:
+                        f.write(',""')
+                    ii = 0
+                f.write(',IMMNO(%d)' % ii)
+        # append empty string so there is at least one more arg
+        f.write(',"")\n')
+
+    realf = open('printinsn_generated.h','w')
+    realf.write(f.getvalue())
+    realf.close()
+    f.close()
+
+if __name__ == "__main__":
+    main()
diff --git a/target/hexagon/gen_shortcode.py b/target/hexagon/gen_shortcode.py
new file mode 100755
index 0000000..3255972
--- /dev/null
+++ b/target/hexagon/gen_shortcode.py
@@ -0,0 +1,71 @@
+#!/usr/bin/env python3
+
+##
+##  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##
+##  This program is free software; you can redistribute it and/or modify
+##  it under the terms of the GNU General Public License as published by
+##  the Free Software Foundation; either version 2 of the License, or
+##  (at your option) any later version.
+##
+##  This program is distributed in the hope that it will be useful,
+##  but WITHOUT ANY WARRANTY; without even the implied warranty of
+##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+##  GNU General Public License for more details.
+##
+##  You should have received a copy of the GNU General Public License
+##  along with this program; if not, see <http://www.gnu.org/licenses/>.
+##
+
+import sys
+import re
+import string
+from io import StringIO
+
+from hex_common import *
+
+def gen_shortcode(f, tag):
+    f.write('DEF_SHORTCODE(%s, %s)\n' % (tag,semdict[tag]))
+
+def main():
+    read_semantics_file(sys.argv[1])
+    read_attribs_file(sys.argv[2])
+    calculate_attribs()
+    tagregs = get_tagregs()
+    tagimms = get_tagimms()
+
+    ##
+    ## Generate the shortcode_generated.h file
+    ##
+    f = StringIO()
+
+    f.write("#ifndef DEF_SHORTCODE\n")
+    f.write("#define DEF_SHORTCODE(TAG,SHORTCODE)    /* Nothing */\n")
+    f.write("#endif\n")
+
+    for tag in tags:
+        ## Skip the priv instructions
+        if ( "A_PRIV" in attribdict[tag] ) :
+            continue
+        ## Skip the guest instructions
+        if ( "A_GUEST" in attribdict[tag] ) :
+            continue
+        ## Skip the diag instructions
+        if ( tag == "Y6_diag" ) :
+            continue
+        if ( tag == "Y6_diag0" ) :
+            continue
+        if ( tag == "Y6_diag1" ) :
+            continue
+
+        gen_shortcode(f, tag)
+
+    f.write("#undef DEF_SHORTCODE\n")
+
+    realf = open('shortcode_generated.h','w')
+    realf.write(f.getvalue())
+    realf.close()
+    f.close()
+
+if __name__ == "__main__":
+    main()
diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
new file mode 100755
index 0000000..1b2ef5b
--- /dev/null
+++ b/target/hexagon/gen_tcg_funcs.py
@@ -0,0 +1,301 @@
+#!/usr/bin/env python3
+
+##
+##  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##
+##  This program is free software; you can redistribute it and/or modify
+##  it under the terms of the GNU General Public License as published by
+##  the Free Software Foundation; either version 2 of the License, or
+##  (at your option) any later version.
+##
+##  This program is distributed in the hope that it will be useful,
+##  but WITHOUT ANY WARRANTY; without even the implied warranty of
+##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+##  GNU General Public License for more details.
+##
+##  You should have received a copy of the GNU General Public License
+##  along with this program; if not, see <http://www.gnu.org/licenses/>.
+##
+
+import sys
+import re
+import string
+from io import StringIO
+
+from hex_common import *
+
+##
+## Helpers for gen_tcg_func
+##
+def gen_decl_ea_tcg(f):
+    f.write("DECL_EA;\n")
+
+def gen_free_ea_tcg(f):
+    f.write("FREE_EA;\n")
+
+def genptr_decl(f,regtype,regid,regno):
+    regN="%s%sN" % (regtype,regid)
+    macro = "DECL_%sREG_%s" % (regtype, regid)
+    f.write("%s(%s%sV, %s, %d, 0);\n" % \
+        (macro, regtype, regid, regN, regno))
+
+def genptr_decl_new(f,regtype,regid,regno):
+    regN="%s%sX" % (regtype,regid)
+    macro = "DECL_NEW_%sREG_%s" % (regtype, regid)
+    f.write("%s(%s%sN, %s, %d, 0);\n" % \
+        (macro, regtype, regid, regN, regno))
+
+def genptr_decl_opn(f, tag, regtype, regid, toss, numregs, i):
+    if (is_pair(regid)):
+        genptr_decl(f,regtype,regid,i)
+    elif (is_single(regid)):
+        if is_old_val(regtype, regid, tag):
+            genptr_decl(f,regtype,regid,i)
+        elif is_new_val(regtype, regid, tag):
+            genptr_decl_new(f,regtype,regid,i)
+        else:
+            print("Bad register parse: ",regtype,regid,toss,numregs)
+    else:
+        print("Bad register parse: ",regtype,regid,toss,numregs)
+
+def genptr_decl_imm(f,immlett):
+    if (immlett.isupper()):
+        i = 1
+    else:
+        i = 0
+    f.write("DECL_IMM(%s,%d);\n" % (imm_name(immlett),i))
+
+def genptr_free(f,regtype,regid,regno):
+    macro = "FREE_%sREG_%s" % (regtype, regid)
+    f.write("%s(%s%sV);\n" % (macro, regtype, regid))
+
+def genptr_free_new(f,regtype,regid,regno):
+    macro = "FREE_NEW_%sREG_%s" % (regtype, regid)
+    f.write("%s(%s%sN);\n" % (macro, regtype, regid))
+
+def genptr_free_opn(f,regtype,regid,i,tag):
+    if (is_pair(regid)):
+        genptr_free(f,regtype,regid,i)
+    elif (is_single(regid)):
+        if is_old_val(regtype, regid, tag):
+            genptr_free(f,regtype,regid,i)
+        elif is_new_val(regtype, regid, tag):
+            genptr_free_new(f,regtype,regid,i)
+        else:
+            print("Bad register parse: ",regtype,regid)
+    else:
+        print("Bad register parse: ",regtype,regid)
+
+def genptr_free_imm(f,immlett):
+    f.write("FREE_IMM(%s);\n" % (imm_name(immlett)))
+
+def genptr_src_read(f,regtype,regid):
+    macro = "READ_%sREG_%s" % (regtype, regid)
+    f.write("%s(%s%sV, %s%sN);\n" % \
+        (macro,regtype,regid,regtype,regid))
+
+def genptr_src_read_new(f,regtype,regid):
+    macro = "READ_NEW_%sREG_%s" % (regtype, regid)
+    f.write("%s(%s%sN, %s%sX);\n" % \
+        (macro,regtype,regid,regtype,regid))
+
+def genptr_src_read_opn(f,regtype,regid,tag):
+    if (is_pair(regid)):
+        genptr_src_read(f,regtype,regid)
+    elif (is_single(regid)):
+        if is_old_val(regtype, regid, tag):
+            genptr_src_read(f,regtype,regid)
+        elif is_new_val(regtype, regid, tag):
+            genptr_src_read_new(f,regtype,regid)
+        else:
+            print("Bad register parse: ",regtype,regid)
+    else:
+        print("Bad register parse: ",regtype,regid)
+
+def gen_helper_call_opn(f, tag, regtype, regid, toss, numregs, i):
+    if (i > 0): f.write(", ")
+    if (is_pair(regid)):
+        f.write("%s%sV" % (regtype,regid))
+    elif (is_single(regid)):
+        if is_old_val(regtype, regid, tag):
+            f.write("%s%sV" % (regtype,regid))
+        elif is_new_val(regtype, regid, tag):
+            f.write("%s%sN" % (regtype,regid))
+        else:
+            print("Bad register parse: ",regtype,regid,toss,numregs)
+    else:
+        print("Bad register parse: ",regtype,regid,toss,numregs)
+
+def gen_helper_decl_imm(f,immlett):
+    f.write("DECL_TCG_IMM(tcgv_%s, %s);\n" % \
+        (imm_name(immlett), imm_name(immlett)))
+
+def gen_helper_call_imm(f,immlett):
+    f.write(", tcgv_%s" % imm_name(immlett))
+
+def gen_helper_free_imm(f,immlett):
+    f.write("FREE_TCG_IMM(tcgv_%s);\n" % imm_name(immlett))
+
+def genptr_dst_write(f,regtype, regid):
+    macro = "WRITE_%sREG_%s" % (regtype, regid)
+    f.write("%s(%s%sN, %s%sV);\n" % (macro, regtype, regid, regtype, regid))
+
+def genptr_dst_write_opn(f,regtype, regid, tag):
+    if (is_pair(regid)):
+        genptr_dst_write(f, regtype, regid)
+    elif (is_single(regid)):
+        genptr_dst_write(f, regtype, regid)
+    else:
+        print("Bad register parse: ",regtype,regid)
+
+##
+## Generate the TCG code to call the helper
+##     For A2_add: Rd32=add(Rs32,Rt32), { RdV=RsV+RtV;}
+##     We produce:
+##       {
+##       /* A2_add */
+##       DECL_RREG_d(RdV, RdN, 0, 0);
+##       DECL_RREG_s(RsV, RsN, 1, 0);
+##       DECL_RREG_t(RtV, RtN, 2, 0);
+##       READ_RREG_s(RsV, RsN);
+##       READ_RREG_t(RtV, RtN);
+##       #ifdef fGEN_TCG_A2_add
+##       fGEN_TCG_A2_add({ RdV=RsV+RtV;});
+##       #else
+##       gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
+##       #endif
+##       WRITE_RREG_d(RdN, RdV);
+##       FREE_RREG_d(RdV);
+##       FREE_RREG_s(RsV);
+##       FREE_RREG_t(RtV);
+##       /* A2_add */
+##       }
+##
+def gen_tcg_func(f, tag, regs, imms):
+    f.write('{\n')
+    f.write('/* %s */\n' % tag)
+    if need_ea(tag): gen_decl_ea_tcg(f)
+    i=0
+    ## Declare all the operands (regs and immediates)
+    for regtype,regid,toss,numregs in regs:
+        genptr_decl_opn(f, tag, regtype, regid, toss, numregs, i)
+        i += 1
+    for immlett,bits,immshift in imms:
+        genptr_decl_imm(f,immlett)
+
+    if 'A_PRIV' in attribdict[tag]:
+        f.write('fCHECKFORPRIV();\n')
+    if 'A_GUEST' in attribdict[tag]:
+        f.write('fCHECKFORGUEST();\n')
+    if 'A_FPOP' in attribdict[tag]:
+        f.write('fFPOP_START();\n')
+
+    ## Read all the inputs
+    for regtype,regid,toss,numregs in regs:
+        if (is_read(regid)):
+            genptr_src_read_opn(f,regtype,regid,tag)
+
+    f.write("#ifdef fGEN_TCG_%s\n" % tag)
+    f.write("fGEN_TCG_%s(%s);\n" % (tag, semdict[tag]))
+    f.write("#else\n")
+    ## Generate the call to the helper
+    f.write("do {\n")
+    for immlett,bits,immshift in imms:
+        gen_helper_decl_imm(f,immlett)
+    if need_part1(tag): f.write("PART1_WRAP(")
+    if need_slot(tag): f.write("SLOT_WRAP(")
+    f.write("gen_helper_%s(" % (tag))
+    i=0
+    ## If there is a scalar result, it is the return type
+    for regtype,regid,toss,numregs in regs:
+        if (is_written(regid)):
+            gen_helper_call_opn(f, tag, regtype, regid, toss, numregs, i)
+            i += 1
+    if (i > 0): f.write(", ")
+    f.write("cpu_env")
+    i=1
+    for regtype,regid,toss,numregs in regs:
+        if (is_read(regid)):
+            gen_helper_call_opn(f, tag, regtype, regid, toss, numregs, i)
+            i += 1
+    for immlett,bits,immshift in imms:
+        gen_helper_call_imm(f,immlett)
+
+    if need_slot(tag): f.write(", slot")
+    if need_part1(tag): f.write(", part1" )
+    f.write(")")
+    if need_slot(tag): f.write(")")
+    if need_part1(tag): f.write(")")
+    f.write(";\n")
+    for immlett,bits,immshift in imms:
+        gen_helper_free_imm(f,immlett)
+    f.write("} while (0);\n")
+    f.write("#endif\n")
+
+    ## Write all the outputs
+    for regtype,regid,toss,numregs in regs:
+        if (is_written(regid)):
+            genptr_dst_write_opn(f,regtype, regid, tag)
+
+    if 'A_FPOP' in attribdict[tag]:
+        f.write('fFPOP_END();\n')
+
+
+    ## Free all the operands (regs and immediates)
+    if need_ea(tag): gen_free_ea_tcg(f)
+    for regtype,regid,toss,numregs in regs:
+        genptr_free_opn(f,regtype,regid,i,tag)
+        i += 1
+    for immlett,bits,immshift in imms:
+        genptr_free_imm(f,immlett)
+
+    f.write("/* %s */\n" % tag)
+    f.write("}")
+
+def gen_def_tcg_func(f, tag, tagregs, tagimms):
+    regs = tagregs[tag]
+    imms = tagimms[tag]
+
+    f.write('DEF_TCG_FUNC(%s, /* %s */\n' % (tag,semdict[tag]))
+    gen_tcg_func(f, tag, regs, imms)
+    f.write(")\n" )
+
+def main():
+    read_semantics_file(sys.argv[1])
+    read_attribs_file(sys.argv[2])
+    calculate_attribs()
+    tagregs = get_tagregs()
+    tagimms = get_tagimms()
+
+    f = StringIO()
+
+    f.write("#ifndef DEF_TCG_FUNC\n")
+    f.write("#define DEF_TCG_FUNC(TAG,GENFN)         /* Nothing */\n")
+    f.write("#endif\n")
+
+    for tag in tags:
+        ## Skip the priv instructions
+        if ( "A_PRIV" in attribdict[tag] ) :
+            continue
+        ## Skip the guest instructions
+        if ( "A_GUEST" in attribdict[tag] ) :
+            continue
+        ## Skip the diag instructions
+        if ( tag == "Y6_diag" ) :
+            continue
+        if ( tag == "Y6_diag0" ) :
+            continue
+        if ( tag == "Y6_diag1" ) :
+            continue
+
+        gen_def_tcg_func(f, tag, tagregs, tagimms)
+
+    f.write("#undef DEF_TCG_FUNC\n")
+
+    realf = open('tcg_funcs_generated.h','w')
+    realf.write(f.getvalue())
+    realf.close()
+    f.close()
+
+if __name__ == "__main__":
+    main()
diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
new file mode 100755
index 0000000..d45ffab
--- /dev/null
+++ b/target/hexagon/hex_common.py
@@ -0,0 +1,204 @@
+#!/usr/bin/env python3
+
+##
+##  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##
+##  This program is free software; you can redistribute it and/or modify
+##  it under the terms of the GNU General Public License as published by
+##  the Free Software Foundation; either version 2 of the License, or
+##  (at your option) any later version.
+##
+##  This program is distributed in the hope that it will be useful,
+##  but WITHOUT ANY WARRANTY; without even the implied warranty of
+##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+##  GNU General Public License for more details.
+##
+##  You should have received a copy of the GNU General Public License
+##  along with this program; if not, see <http://www.gnu.org/licenses/>.
+##
+
+import sys
+import re
+import string
+from io import StringIO
+
+behdict = {}          # tag -> behavior
+semdict = {}          # tag -> semantics
+attribdict = {}       # tag -> attributes
+macros = {}           # macro -> macro information...
+attribinfo = {}       # Register information and misc
+tags = []             # list of all tags
+
+# We should do this as a hash for performance,
+# but to keep order let's keep it as a list.
+def uniquify(seq):
+    seen = set()
+    seen_add = seen.add
+    return [x for x in seq if x not in seen and not seen_add(x)]
+
+regre = re.compile(
+    r"((?<!DUP)[MNORCPQXSGVZA])([stuvwxyzdefg]+)([.]?[LlHh]?)(\d+S?)")
+immre = re.compile(r"[#]([rRsSuUm])(\d+)(?:[:](\d+))?")
+reg_or_immre = \
+    re.compile(r"(((?<!DUP)[MNRCOPQXSGVZA])([stuvwxyzdefg]+)" + \
+                r"([.]?[LlHh]?)(\d+S?))|([#]([rRsSuUm])(\d+)[:]?(\d+)?)")
+relimmre = re.compile(r"[#]([rR])(\d+)(?:[:](\d+))?")
+absimmre = re.compile(r"[#]([sSuUm])(\d+)(?:[:](\d+))?")
+
+finished_macros = set()
+
+def expand_macro_attribs(macro,allmac_re):
+    if macro.key not in finished_macros:
+        # Get a list of all things that might be macros
+        l = allmac_re.findall(macro.beh)
+        for submacro in l:
+            if not submacro: continue
+            if submacro not in macros:
+                raise Exception("Couldn't find macro: <%s>" % submacro)
+            macro.attribs |= expand_macro_attribs(
+                macros[submacro], allmac_re)
+            finished_macros.add(macro.key)
+    return macro.attribs
+
+immextre = re.compile(r'f(MUST_)?IMMEXT[(]([UuSsRr])')
+def calculate_attribs():
+    # Recurse down macros, find attributes from sub-macros
+    macroValues = list(macros.values())
+    allmacros_restr = "|".join(set([ m.re.pattern for m in macroValues ]))
+    allmacros_re = re.compile(allmacros_restr)
+    for macro in macroValues:
+        expand_macro_attribs(macro,allmacros_re)
+    # Append attributes to all instructions
+    for tag in tags:
+        for macname in allmacros_re.findall(semdict[tag]):
+            if not macname: continue
+            macro = macros[macname]
+            attribdict[tag] |= set(macro.attribs)
+
+def SEMANTICS(tag, beh, sem):
+    #print tag,beh,sem
+    behdict[tag] = beh
+    semdict[tag] = sem
+    attribdict[tag] = set()
+    tags.append(tag)        # dicts have no order, this is for order
+
+def ATTRIBUTES(tag, attribstring):
+    attribstring = \
+        attribstring.replace("ATTRIBS","").replace("(","").replace(")","")
+    if not attribstring:
+        return
+    attribs = attribstring.split(",")
+    for attrib in attribs:
+        attribdict[tag].add(attrib.strip())
+
+class Macro(object):
+    __slots__ = ['key','name', 'beh', 'attribs', 're']
+    def __init__(self, name, beh, attribs):
+        self.key = name
+        self.name = name
+        self.beh = beh
+        self.attribs = set(attribs)
+        self.re = re.compile("\\b" + name + "\\b")
+
+def MACROATTRIB(macname,beh,attribstring):
+    attribstring = attribstring.replace("(","").replace(")","")
+    if attribstring:
+        attribs = attribstring.split(",")
+    else:
+        attribs = []
+    macros[macname] = Macro(macname,beh,attribs)
+
+def compute_tag_regs(tag):
+    return uniquify(regre.findall(behdict[tag]))
+
+def compute_tag_immediates(tag):
+    return uniquify(immre.findall(behdict[tag]))
+
+##
+##  tagregs is the main data structure we'll use
+##  tagregs[tag] will contain the registers used by an instruction
+##  Within each entry, we'll use the regtype and regid fields
+##      regtype can be one of the following
+##          C                control register
+##          N                new register value
+##          P                predicate register
+##          R                GPR register
+##          M                modifier register
+##      regid can be one of the following
+##          d, e             destination register
+##          dd               destination register pair
+##          s, t, u, v, w    source register
+##          ss, tt, uu, vv   source register pair
+##          x, y             read-write register
+##          xx, yy           read-write register pair
+##
+def get_tagregs():
+    return dict(zip(tags, list(map(compute_tag_regs, tags))))
+
+def get_tagimms():
+    return dict(zip(tags, list(map(compute_tag_immediates, tags))))
+
+def is_pair(regid):
+    return len(regid) == 2
+
+def is_single(regid):
+    return len(regid) == 1
+
+def is_written(regid):
+    return regid[0] in "dexy"
+
+def is_writeonly(regid):
+    return regid[0] in "de"
+
+def is_read(regid):
+    return regid[0] in "stuvwxy"
+
+def is_readwrite(regid):
+    return regid[0] in "xy"
+
+def is_scalar_reg(regtype):
+    return regtype in "RPC"
+
+def is_old_val(regtype, regid, tag):
+    return regtype+regid+'V' in semdict[tag]
+
+def is_new_val(regtype, regid, tag):
+    return regtype+regid+'N' in semdict[tag]
+
+def need_slot(tag):
+    if ('A_CONDEXEC' in attribdict[tag] or
+        'A_STORE' in attribdict[tag] or
+        'A_LOAD' in attribdict[tag]):
+        return 1
+    else:
+        return 0
+
+def need_part1(tag):
+    return re.compile(r"fPART1").search(semdict[tag])
+
+def need_ea(tag):
+    return re.compile(r"\bEA\b").search(semdict[tag])
+
+def imm_name(immlett):
+    return "%siV" % immlett
+
+def read_semantics_file(name):
+    eval_line = ""
+    for line in open(name, 'rt').readlines():
+        if not line.startswith("#"):
+            eval_line += line
+            if line.endswith("\\\n"):
+                eval_line = eval_line.rstrip("\\\n")
+            else:
+                eval(eval_line.strip())
+                eval_line = ""
+
+def read_attribs_file(name):
+    attribre = re.compile(r'DEF_ATTRIB\(([A-Za-z0-9_]+), ([^,]*), ' +
+            r'"([A-Za-z0-9_\.]*)", "([A-Za-z0-9_\.]*)"\)')
+    for line in open(name, 'rt').readlines():
+        if not attribre.match(line):
+            continue
+        (attrib_base,descr,rreg,wreg) = attribre.findall(line)[0]
+        attrib_base = 'A_' + attrib_base
+        attribinfo[attrib_base] = {'rreg':rreg, 'wreg':wreg, 'descr':descr}
-- 
2.7.4



* [RFC PATCH v3 22/34] Hexagon (target/hexagon) generator phase 3 - C preprocessor for decode tree
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (20 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 21/34] Hexagon (target/hexagon) generator phase 2 - generate header files Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-18 15:50 ` [RFC PATCH v3 23/34] Hexagon (target/hexagon) generator phase 4 - " Taylor Simpson
                   ` (13 subsequent siblings)
  35 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Run the C preprocessor across the instruction definition and encoding
files to expand macros and prepare the iset.py file.  The resulting
file contains Python data structures used to build the decode tree.
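
For reviewers unfamiliar with the flow, the sketch below shows roughly
what the generated iset.py looks like and how a consumer might read it.
The dictionary keys mirror the fprintf calls in gen_iset_table(); the
'enc', 'rregs' and 'wregs' values for A2_add are illustrative
placeholders rather than text copied from the imported files, and
usable_encodings() is a hypothetical helper, not part of the series.

```python
# Hypothetical excerpt of a generated iset.py.  Keys follow the output of
# gen_iset_table(); the values below are placeholders for illustration.
iset = {
    'A2_add': {
        'tag': 'A2_add',
        'syntax': 'Rd32=add(Rs32,Rt32)',
        'rregs': 'Rs,Rt',
        'wregs': 'Rd',
        'enc': '1111 0011 000sssss PP-ttttt ---ddddd',
        'enc_class': 'NORMAL',
    },
}
tags = ['A2_add']


def usable_encodings(iset, tags):
    """Return tag -> encoding with spaces removed, skipping opcodes that
    the C generator marked as 'MISSING ENCODING'."""
    return {tag: iset[tag]['enc'].replace(' ', '')
            for tag in tags
            if iset[tag]['enc'] != 'MISSING ENCODING'}


encs = usable_encodings(iset, tags)
print(len(encs['A2_add']))  # one character per encoding bit
```

Emitting plain Python literals from the C side means the later phase
can simply "import iset" instead of parsing the .def files again.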

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/gen_dectree_import.c | 191 ++++++++++++++++++++++++++++++++++++
 1 file changed, 191 insertions(+)
 create mode 100644 target/hexagon/gen_dectree_import.c

diff --git a/target/hexagon/gen_dectree_import.c b/target/hexagon/gen_dectree_import.c
new file mode 100644
index 0000000..237726e
--- /dev/null
+++ b/target/hexagon/gen_dectree_import.c
@@ -0,0 +1,191 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * This program generates the encodings file that is processed by
+ * the dectree.py script to produce the decoding tree.  We use the C
+ * preprocessor to manipulate the files imported from the Hexagon
+ * architecture library.
+ */
+#include "qemu/osdep.h"
+#include "opcodes.h"
+
+#define STRINGIZE(X)    #X
+
+const char *opcode_names[] = {
+#define OPCODE(IID) STRINGIZE(IID)
+#include "opcodes_def_generated.h"
+    NULL
+#undef OPCODE
+};
+
+const char *opcode_syntax[XX_LAST_OPCODE];
+
+/*
+ * Process the instruction definitions
+ *     Scalar core instructions have the following form
+ *         Q6INSN(A2_add,"Rd32=add(Rs32,Rt32)",ATTRIBS(),
+ *         "Add 32-bit registers",
+ *         { RdV=RsV+RtV;})
+ */
+void opcode_init(void)
+{
+#define Q6INSN(TAG, BEH, ATTRIBS, DESCR, SEM) \
+   opcode_syntax[TAG] = BEH;
+#define EXTINSN(TAG, BEH, ATTRIBS, DESCR, SEM) \
+   opcode_syntax[TAG] = BEH;
+#include "imported/allidefs.def"
+#undef Q6INSN
+#undef EXTINSN
+}
+
+const char *opcode_rregs[] = {
+#define REGINFO(TAG, REGINFO, RREGS, WREGS) RREGS,
+#define IMMINFO(TAG, SIGN, SIZE, SHAMT, SIGN2, SIZE2, SHAMT2)  /* nothing */
+#include "op_regs_generated.h"
+    NULL
+#undef REGINFO
+#undef IMMINFO
+};
+
+const char *opcode_wregs[] = {
+#define REGINFO(TAG, REGINFO, RREGS, WREGS) WREGS,
+#define IMMINFO(TAG, SIGN, SIZE, SHAMT, SIGN2, SIZE2, SHAMT2)  /* nothing */
+#include "op_regs_generated.h"
+    NULL
+#undef REGINFO
+#undef IMMINFO
+};
+
+opcode_encoding_t opcode_encodings[] = {
+#define DEF_ENC32(TAG, ENCSTR) \
+    [TAG] = { .encoding = ENCSTR },
+#define DEF_ENC_SUBINSN(TAG, CLASS, ENCSTR) \
+    [TAG] = { .encoding = ENCSTR, .enc_class = CLASS },
+#define DEF_EXT_ENC(TAG, CLASS, ENCSTR) \
+    [TAG] = { .encoding = ENCSTR, .enc_class = CLASS },
+#include "imported/encode.def"
+#undef DEF_ENC32
+#undef DEF_ENC_SUBINSN
+#undef DEF_EXT_ENC
+};
+
+static const char * const opcode_enc_class_names[XX_LAST_ENC_CLASS] = {
+    "NORMAL",
+    "16BIT",
+    "SUBINSN_A",
+    "SUBINSN_L1",
+    "SUBINSN_L2",
+    "SUBINSN_S1",
+    "SUBINSN_S2",
+    "EXT_noext",
+    "EXT_mmvec",
+};
+
+static const char *get_opcode_enc(int opcode)
+{
+    const char *tmp = opcode_encodings[opcode].encoding;
+    if (tmp == NULL) {
+        tmp = "MISSING ENCODING";
+    }
+    return tmp;
+}
+
+static const char *get_opcode_enc_class(int opcode)
+{
+    return opcode_enc_class_names[opcode_encodings[opcode].enc_class];
+}
+
+static void gen_iset_table(FILE *out)
+{
+    int i;
+
+    fprintf(out, "iset = {\n");
+    for (i = 0; i < XX_LAST_OPCODE; i++) {
+        fprintf(out, "\t\'%s\' : {\n", opcode_names[i]);
+        fprintf(out, "\t\t\'tag\' : \'%s\',\n", opcode_names[i]);
+        fprintf(out, "\t\t\'syntax\' : \'%s\',\n", opcode_syntax[i]);
+        fprintf(out, "\t\t\'rregs\' : \'%s\',\n", opcode_rregs[i]);
+        fprintf(out, "\t\t\'wregs\' : \'%s\',\n", opcode_wregs[i]);
+        fprintf(out, "\t\t\'enc\' : \'%s\',\n", get_opcode_enc(i));
+        fprintf(out, "\t\t\'enc_class\' : \'%s\',\n", get_opcode_enc_class(i));
+        fprintf(out, "\t},\n");
+    }
+    fprintf(out, "};\n\n");
+}
+
+static void gen_tags_list(FILE *out)
+{
+    int i;
+
+    fprintf(out, "tags = [\n");
+    for (i = 0; i < XX_LAST_OPCODE; i++) {
+        fprintf(out, "\t\'%s\',\n", opcode_names[i]);
+    }
+    fprintf(out, "];\n\n");
+}
+
+static void gen_enc_ext_spaces_table(FILE *out)
+{
+    fprintf(out, "enc_ext_spaces = {\n");
+#define DEF_EXT_SPACE(SPACEID, ENCSTR) \
+    fprintf(out, "\t\'%s\' : \'%s\',\n", #SPACEID, ENCSTR);
+#include "imported/encode.def"
+#undef DEF_EXT_SPACE
+    fprintf(out, "};\n\n");
+}
+
+static void gen_subinsn_groupings_table(FILE *out)
+{
+    fprintf(out, "subinsn_groupings = {\n");
+#define DEF_PACKED32(TAG, TYPEA, TYPEB, ENCSTR) \
+    do { \
+        fprintf(out, "\t\'%s\' : {\n", #TAG); \
+        fprintf(out, "\t\t\'name\' : \'%s\',\n", #TAG); \
+        fprintf(out, "\t\t\'class_a\' : \'%s\',\n", #TYPEA); \
+        fprintf(out, "\t\t\'class_b\' : \'%s\',\n", #TYPEB); \
+        fprintf(out, "\t\t\'enc\' : \'%s\',\n", ENCSTR); \
+        fprintf(out, "\t},\n"); \
+    } while (0);
+#include "imported/encode.def"
+#undef DEF_PACKED32
+    fprintf(out, "};\n\n");
+}
+
+int main(int argc, char *argv[])
+{
+    FILE *outfile;
+
+    if (argc != 2) {
+        fprintf(stderr, "Usage: gen_dectree_import outputfile\n");
+        return -1;
+    }
+    outfile = fopen(argv[1], "w");
+    if (outfile == NULL) {
+        fprintf(stderr, "Cannot open %s for writing\n", argv[1]);
+        return -1;
+    }
+
+    opcode_init();
+    gen_iset_table(outfile);
+    gen_tags_list(outfile);
+    gen_enc_ext_spaces_table(outfile);
+    gen_subinsn_groupings_table(outfile);
+
+    fclose(outfile);
+    return 0;
+}
-- 
2.7.4



* [RFC PATCH v3 23/34] Hexagon (target/hexagon) generator phase 4 - decode tree
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (21 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 22/34] Hexagon (target/hexagon) generator phase 3 - C preprocessor for decode tree Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-18 15:50 ` [RFC PATCH v3 24/34] Hexagon (target/hexagon) opcode data structures Taylor Simpson
                   ` (12 subsequent siblings)
  35 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Python script that emits the decode tree in dectree_generated.h.
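
As a rough illustration of two ideas the script relies on, the sketch
below computes a mask/match pair from an encoding string (the same
regex approach as print_match_info) and groups tags by the value of a
separator bit field (the partitioning step inside auto_separate).  The
helper names and the 8-bit toy encodings are assumptions made up for
this example; they are not real Hexagon opcodes.

```python
import re


def mask_and_match(enc):
    """Compute (mask, match) for an MSB-first encoding string.

    '0' and '1' are fixed opcode bits; any other character is an operand
    or don't-care bit and contributes 0 to both values.
    """
    enc = enc.replace(' ', '')
    mask = int(re.sub(r'[^1]', '0', enc.replace('0', '1')), 2)
    match = int(re.sub(r'[^01]', '0', enc), 2)
    return mask, match


def split_on_field(encs, lsb, width):
    """Group tags by the value of encoding bits [lsb, lsb + width).

    Assumes every tag has a fixed '0'/'1' in each bit of the field.
    """
    groups = {}
    for tag, enc in encs.items():
        lsb_first = enc.replace(' ', '')[::-1]     # index 0 is bit 0
        field = lsb_first[lsb:lsb + width][::-1]   # back to MSB-first
        groups.setdefault(int(field, 2), set()).add(tag)
    return groups


# Toy 8-bit encodings (hypothetical).
toy = {'OP_A': '00ssdddd', 'OP_B': '01ssdddd', 'OP_C': '10ssdddd'}
print(mask_and_match(toy['OP_B']))
print(split_on_field(toy, lsb=6, width=2))
```

Here bits 7:6 distinguish all three toy opcodes, so a single
four-entry table with one invalid slot would be enough to decode them.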

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/dectree.py | 352 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 352 insertions(+)
 create mode 100755 target/hexagon/dectree.py

diff --git a/target/hexagon/dectree.py b/target/hexagon/dectree.py
new file mode 100755
index 0000000..47a05a3
--- /dev/null
+++ b/target/hexagon/dectree.py
@@ -0,0 +1,352 @@
+#!/usr/bin/env python3
+
+##
+##  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##
+##  This program is free software; you can redistribute it and/or modify
+##  it under the terms of the GNU General Public License as published by
+##  the Free Software Foundation; either version 2 of the License, or
+##  (at your option) any later version.
+##
+##  This program is distributed in the hope that it will be useful,
+##  but WITHOUT ANY WARRANTY; without even the implied warranty of
+##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+##  GNU General Public License for more details.
+##
+##  You should have received a copy of the GNU General Public License
+##  along with this program; if not, see <http://www.gnu.org/licenses/>.
+##
+
+import io
+import re
+
+import sys
+import iset
+
+encs = {tag : ''.join(reversed(iset.iset[tag]['enc'].replace(' ', '')))
+    for tag in iset.tags if iset.iset[tag]['enc'] != 'MISSING ENCODING'}
+
+enc_classes = set([iset.iset[tag]['enc_class'] for tag in encs.keys()])
+subinsn_enc_classes = \
+    set([enc_class for enc_class in enc_classes \
+        if enc_class.startswith('SUBINSN_')])
+ext_enc_classes = \
+    set([enc_class for enc_class in enc_classes \
+        if enc_class not in ('NORMAL', '16BIT') and \
+           not enc_class.startswith('SUBINSN_')])
+
+try:
+    subinsn_groupings = iset.subinsn_groupings
+except AttributeError:
+    subinsn_groupings = {}
+
+for (tag, subinsn_grouping) in subinsn_groupings.items():
+    encs[tag] = ''.join(reversed(subinsn_grouping['enc'].replace(' ', '')))
+
+dectree_normal = {'leaves' : set()}
+dectree_16bit = {'leaves' : set()}
+dectree_subinsn_groupings = {'leaves' : set()}
+dectree_subinsns = {name : {'leaves' : set()} for name in subinsn_enc_classes}
+dectree_extensions = {name : {'leaves' : set()} for name in ext_enc_classes}
+
+for tag in encs.keys():
+    if tag in subinsn_groupings:
+        dectree_subinsn_groupings['leaves'].add(tag)
+        continue
+    enc_class = iset.iset[tag]['enc_class']
+    if enc_class.startswith('SUBINSN_'):
+        if len(encs[tag]) != 32:
+            encs[tag] = encs[tag] + '0' * (32 - len(encs[tag]))
+        dectree_subinsns[enc_class]['leaves'].add(tag)
+    elif  enc_class == '16BIT':
+        if len(encs[tag]) != 16:
+            raise Exception(('Tag "{}" has enc_class "{}" and not an encoding '
+                             'width of 16 bits!').format(tag, enc_class))
+        dectree_16bit['leaves'].add(tag)
+    else:
+        if len(encs[tag]) != 32:
+            raise Exception(('Tag "{}" has enc_class "{}" and not an encoding '
+                             'width of 32 bits!').format(tag, enc_class))
+        if enc_class == 'NORMAL':
+            dectree_normal['leaves'].add(tag)
+        else:
+            dectree_extensions[enc_class]['leaves'].add(tag)
+
+faketags = set()
+for (tag, enc) in iset.enc_ext_spaces.items():
+    faketags.add(tag)
+    encs[tag] = ''.join(reversed(enc.replace(' ', '')))
+    dectree_normal['leaves'].add(tag)
+
+faketags |= set(subinsn_groupings.keys())
+
+def every_bit_counts(bitset):
+    for i in range(1, len(next(iter(bitset)))):
+        if len(set([bits[:i] + bits[i+1:] for bits in bitset])) == len(bitset):
+            return False
+    return True
+
+def auto_separate(node):
+    tags = node['leaves']
+    if len(tags) <= 1:
+        return
+    enc_width = len(encs[next(iter(tags))])
+    opcode_bit_for_all = \
+        [all([encs[tag][i] in '01' \
+            for tag in tags]) for i in range(enc_width)]
+    opcode_bit_is_0_for_all = \
+        [opcode_bit_for_all[i] and all([encs[tag][i] == '0' \
+            for tag in tags]) for i in range(enc_width)]
+    opcode_bit_is_1_for_all = \
+        [opcode_bit_for_all[i] and all([encs[tag][i] == '1' \
+            for tag in tags]) for i in range(enc_width)]
+    differentiator_opcode_bit = \
+        [opcode_bit_for_all[i] and \
+         not (opcode_bit_is_0_for_all[i] or \
+         opcode_bit_is_1_for_all[i]) \
+            for i in range(enc_width)]
+    best_width = 0
+    for width in range(4, 0, -1):
+        for lsb in range(enc_width - width, -1, -1):
+            bitset = set([encs[tag][lsb:lsb+width] for tag in tags])
+            if all(differentiator_opcode_bit[lsb:lsb+width]) and \
+                (len(bitset) == len(tags) or every_bit_counts(bitset)):
+                best_width = width
+                best_lsb = lsb
+                caught_all_tags = len(bitset) == len(tags)
+                break
+        if best_width != 0:
+            break
+    if best_width == 0:
+        raise Exception('Could not find a way to differentiate the encodings ' +
+                         'of the following tags:\n{}'.format('\n'.join(tags)))
+    if caught_all_tags:
+        for width in range(1, best_width):
+            for lsb in range(enc_width - width, -1, -1):
+                bitset = set([encs[tag][lsb:lsb+width] for tag in tags])
+                if all(differentiator_opcode_bit[lsb:lsb+width]) and \
+                    len(bitset) == len(tags):
+                    best_width = width
+                    best_lsb = lsb
+                    break
+            else:
+                continue
+            break
+    node['separator_lsb'] = best_lsb
+    node['separator_width'] = best_width
+    node['children'] = []
+    for value in range(2 ** best_width):
+        child = {}
+        bits = ''.join(reversed('{:0{}b}'.format(value, best_width)))
+        child['leaves'] = \
+            set([tag for tag in tags \
+                if encs[tag][best_lsb:best_lsb+best_width] == bits])
+        node['children'].append(child)
+    for child in node['children']:
+        auto_separate(child)
+
+auto_separate(dectree_normal)
+auto_separate(dectree_16bit)
+if subinsn_groupings:
+    auto_separate(dectree_subinsn_groupings)
+for dectree_subinsn in dectree_subinsns.values():
+    auto_separate(dectree_subinsn)
+for dectree_ext in dectree_extensions.values():
+    auto_separate(dectree_ext)
+
+for tag in faketags:
+    del encs[tag]
+
+def table_name(parents, node):
+    path = parents + [node]
+    root = path[0]
+    tag = next(iter(node['leaves']))
+    if tag in subinsn_groupings:
+        enc_width = len(subinsn_groupings[tag]['enc'].replace(' ', ''))
+    else:
+        tag = next(iter(node['leaves'] - faketags))
+        enc_width = len(encs[tag])
+    determining_bits = ['_'] * enc_width
+    for (parent, child) in zip(path[:-1], path[1:]):
+        lsb = parent['separator_lsb']
+        width = parent['separator_width']
+        value = parent['children'].index(child)
+        determining_bits[lsb:lsb+width] = \
+            list(reversed('{:0{}b}'.format(value, width)))
+    if tag in subinsn_groupings:
+        name = 'DECODE_ROOT_EE'
+    else:
+        enc_class = iset.iset[tag]['enc_class']
+        if enc_class in ext_enc_classes:
+            name = 'DECODE_EXT_{}'.format(enc_class)
+        elif enc_class in subinsn_enc_classes:
+            name = 'DECODE_SUBINSN_{}'.format(enc_class)
+        else:
+            name = 'DECODE_ROOT_{}'.format(enc_width)
+    if node != root:
+        name += '_' + ''.join(reversed(determining_bits))
+    return name
+
+def print_node(f, node, parents):
+    if len(node['leaves']) <= 1:
+        return
+    name = table_name(parents, node)
+    lsb = node['separator_lsb']
+    width = node['separator_width']
+    print('DECODE_NEW_TABLE({},{},DECODE_SEPARATOR_BITS({},{}))'.\
+        format(name, 2 ** width, lsb, width), file=f)
+    for child in node['children']:
+        if len(child['leaves']) == 0:
+            print('INVALID()', file=f)
+        elif len(child['leaves']) == 1:
+            (tag,) = child['leaves']
+            if tag in subinsn_groupings:
+                class_a = subinsn_groupings[tag]['class_a']
+                class_b = subinsn_groupings[tag]['class_b']
+                enc = subinsn_groupings[tag]['enc'].replace(' ', '')
+                if 'RESERVED' in tag:
+                    print('INVALID()', file=f)
+                else:
+                    print('SUBINSNS({},{},{},"{}")'.\
+                        format(tag, class_a, class_b, enc), file=f)
+            elif tag in iset.enc_ext_spaces:
+                enc = iset.enc_ext_spaces[tag].replace(' ', '')
+                print('EXTSPACE({},"{}")'.format(tag, enc), file=f)
+            else:
+                enc = ''.join(reversed(encs[tag]))
+                print('TERMINAL({},"{}")'.format(tag, enc), file=f)
+        else:
+            print('TABLE_LINK({})'.format(table_name(parents + [node], child)),
+                  file=f)
+    print('DECODE_END_TABLE({},{},DECODE_SEPARATOR_BITS({},{}))'.\
+        format(name, 2 ** width, lsb, width), file=f)
+    print(file=f)
+    parents.append(node)
+    for child in node['children']:
+        print_node(f, child, parents)
+    parents.pop()
+
+def print_tree(f, tree):
+    print_node(f, tree, [])
+
+def print_match_info(f):
+    for tag in sorted(encs.keys(), key=iset.tags.index):
+        enc = ''.join(reversed(encs[tag]))
+        mask = int(re.sub(r'[^1]', r'0', enc.replace('0', '1')), 2)
+        match = int(re.sub(r'[^01]', r'0', enc), 2)
+        suffix = ''
+        print('DECODE{}_MATCH_INFO({},0x{:x}U,0x{:x}U)'.\
+            format(suffix, tag, mask, match), file=f)
+
+regre = re.compile(
+    r'((?<!DUP)[MNORCPQXSGVZA])([stuvwxyzdefg]+)([.]?[LlHh]?)(\d+S?)')
+immre = re.compile(r'[#]([rRsSuUm])(\d+)(?:[:](\d+))?')
+
+def ordered_unique(l):
+    return sorted(set(l), key=l.index)
+
+implicit_registers = {
+    'SP' : 29,
+    'FP' : 30,
+    'LR' : 31
+}
+
+num_registers = {
+    'R' : 32,
+    'V' : 32
+}
+
+def print_op_info(f):
+    for tag in sorted(encs.keys(), key=iset.tags.index):
+        enc = encs[tag]
+        print(file=f)
+        print('DECODE_OPINFO({},'.format(tag), file=f)
+        regs = ordered_unique(regre.findall(iset.iset[tag]['syntax']))
+        imms = ordered_unique(immre.findall(iset.iset[tag]['syntax']))
+        regno = 0
+        for reg in regs:
+            reg_type = reg[0]
+            reg_letter = reg[1][0]
+            reg_num_choices = int(reg[3].rstrip('S'))
+            reg_mapping = reg[0] + ''.join(['_' for letter in reg[1]]) + reg[3]
+            reg_enc_fields = re.findall(reg_letter + '+', enc)
+            if len(reg_enc_fields) == 0:
+                raise Exception('Tag "{}" missing register field!'.format(tag))
+            if len(reg_enc_fields) > 1:
+                raise Exception('Tag "{}" has split register field!'.\
+                    format(tag))
+            reg_enc_field = reg_enc_fields[0]
+            if 2 ** len(reg_enc_field) != reg_num_choices:
+                raise Exception('Tag "{}" has incorrect register field width!'.\
+                    format(tag))
+            print('        DECODE_REG({},{},{})'.\
+                format(regno, len(reg_enc_field), enc.index(reg_enc_field)),
+                       file=f)
+            if reg_type in num_registers and \
+                reg_num_choices != num_registers[reg_type]:
+                print('        DECODE_MAPPED_REG({},{})'.\
+                    format(regno, reg_mapping), file=f)
+            regno += 1
+        def implicit_register_key(reg):
+            return implicit_registers[reg]
+        for reg in sorted(
+            set([r for r in (iset.iset[tag]['rregs'].split(',') + \
+                iset.iset[tag]['wregs'].split(',')) \
+                    if r in implicit_registers]), key=implicit_register_key):
+            print('        DECODE_IMPL_REG({},{})'.\
+                format(regno, implicit_registers[reg]), file=f)
+            regno += 1
+        if imms and imms[0][0].isupper():
+            imms = reversed(imms)
+        for imm in imms:
+            if imm[0].isupper():
+                immno = 1
+            else:
+                immno = 0
+            imm_type = imm[0]
+            imm_width = int(imm[1])
+            imm_shift = imm[2]
+            if imm_shift:
+                imm_shift = int(imm_shift)
+            else:
+                imm_shift = 0
+            if imm_type.islower():
+                imm_letter = 'i'
+            else:
+                imm_letter = 'I'
+            remainder = imm_width
+            for m in reversed(list(re.finditer(imm_letter + '+', enc))):
+                remainder -= m.end() - m.start()
+                print('        DECODE_IMM({},{},{},{})'.\
+                    format(immno, m.end() - m.start(), m.start(), remainder),
+                        file=f)
+            if remainder != 0:
+                if imm[2]:
+                    imm[2] = ':' + imm[2]
+                raise Exception(('Tag "{}" has an incorrect number of ' + \
+                    'encoding bits for immediate "{}"').\
+                    format(tag, ''.join(imm)))
+            if imm_type.lower() in 'sr':
+                print('        DECODE_IMM_SXT({},{})'.\
+                    format(immno, imm_width), file=f)
+            if imm_type.lower() == 'n':
+                print('        DECODE_IMM_NEG({},{})'.\
+                    format(immno, imm_width), file=f)
+            if imm_shift:
+                print('        DECODE_IMM_SHIFT({},{})'.\
+                    format(immno, imm_shift), file=f)
+        print(')', file=f)
+
+if __name__ == '__main__':
+    f = io.StringIO()
+    print_tree(f, dectree_normal)
+    print_tree(f, dectree_16bit)
+    if subinsn_groupings:
+        print_tree(f, dectree_subinsn_groupings)
+    for (name, dectree_subinsn) in sorted(dectree_subinsns.items()):
+        print_tree(f, dectree_subinsn)
+    for (name, dectree_ext) in sorted(dectree_extensions.items()):
+        print_tree(f, dectree_ext)
+    print_match_info(f)
+    print_op_info(f)
+    open('dectree_generated.h', 'w').write(f.getvalue())
-- 
2.7.4
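
Note on the DECODE_IMM loop in the script above: an immediate whose bits
are split across several runs of 'i' (or 'I') in the encoding is emitted as
one DECODE_IMM per run. The sketch below replays that loop outside the
generator. It assumes that character 0 of `enc` corresponds to bit 0 (the
script passes m.start() directly as a start bit) and that DECODE_IMM ORs each
extracted run into the immediate at the given shift. Neither is shown in this
patch, and `imm_fields`, `extract_imm`, and the 12-character encoding are
made up for illustration.

```python
import re

def imm_fields(enc, letter, width):
    """Return (run_width, start_bit, value_shift) for each run of `letter`.

    Runs are visited from the highest index down; `remainder` counts the
    immediate bits that still have to be placed below the current run.
    """
    fields = []
    remainder = width
    for m in reversed(list(re.finditer(letter + '+', enc))):
        run = m.end() - m.start()
        remainder -= run
        fields.append((run, m.start(), remainder))
    if remainder != 0:
        raise ValueError('wrong number of "{}" bits in encoding'.format(letter))
    return fields

def extract_imm(word, fields):
    """Assumed DECODE_IMM behaviour: OR each run into place."""
    value = 0
    for run, start, shift in fields:
        value |= ((word >> start) & ((1 << run) - 1)) << shift
    return value

# Hypothetical encoding: bits 0-2 and 7-8 hold a 5-bit immediate.
enc = 'iiiddddii101'
fields = imm_fields(enc, 'i', 5)
print(fields)                    # [(2, 7, 3), (3, 0, 0)]
print(extract_imm(389, fields))  # 29
```

With a declared width of 6 the same encoding raises, which corresponds to the
"incorrect number of encoding bits" exception in the script.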



* [RFC PATCH v3 24/34] Hexagon (target/hexagon) opcode data structures
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (22 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 23/34] Hexagon (target/hexagon) generater phase 4 - " Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-26 15:25   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 25/34] Hexagon (target/hexagon) macros to interface with the generator Taylor Simpson
                   ` (11 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/opcodes.h |  66 +++++++++++++++
 target/hexagon/opcodes.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 277 insertions(+)
 create mode 100644 target/hexagon/opcodes.h
 create mode 100644 target/hexagon/opcodes.c
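
For readers skimming the diff: str2val() in opcodes.c turns an encoding
string into the value of its fixed bits. Operand letters, '-' and '0' each
contribute a 0 bit, '1' contributes a 1 bit, and any other character
(including whitespace) is skipped. A rough Python equivalent follows; the
example encoding string is invented, and the 32-bit mask stands in for the
size4u_t return type.

```python
# Characters that str2val() treats as a 0 bit.
ZERO_CHARS = set('stuvwdexyiIPEo-0')

def str2val(encoding):
    ret = 0
    for ch in encoding:
        if ch in ZERO_CHARS:
            ret = ret << 1
        elif ch == '1':
            ret = (ret << 1) | 1
        # Everything else (spaces, tabs, unknown characters) is ignored.
    return ret & 0xFFFFFFFF  # mimic 32-bit unsigned wraparound

# Hypothetical 16-character encoding, not taken from encode.def.
print(hex(str2val('1010 iiii ssss dddd')))  # 0xa000
```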

diff --git a/target/hexagon/opcodes.h b/target/hexagon/opcodes.h
new file mode 100644
index 0000000..46384d4
--- /dev/null
+++ b/target/hexagon/opcodes.h
@@ -0,0 +1,66 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_OPCODES_H
+#define HEXAGON_OPCODES_H
+
+#include "hex_arch_types.h"
+#include "attribs.h"
+
+typedef enum {
+#define OPCODE(IID) IID
+#include "opcodes_def_generated.h"
+    XX_LAST_OPCODE
+#undef OPCODE
+} opcode_t;
+
+typedef enum {
+    NORMAL,
+    HALF,
+    SUBINSN_A,
+    SUBINSN_L1,
+    SUBINSN_L2,
+    SUBINSN_S1,
+    SUBINSN_S2,
+    EXT_noext,
+    EXT_mmvec,
+    XX_LAST_ENC_CLASS
+} enc_class_t;
+
+extern const char *opcode_names[];
+
+extern const char *opcode_reginfo[];
+extern const char *opcode_rregs[];
+extern const char *opcode_wregs[];
+
+typedef struct {
+    const char * const encoding;
+    size4u_t vals;
+    size4u_t dep_vals;
+    const enc_class_t enc_class;
+} opcode_encoding_t;
+
+extern opcode_encoding_t opcode_encodings[XX_LAST_OPCODE];
+
+extern size4u_t
+    opcode_attribs[XX_LAST_OPCODE][(A_ZZ_LASTATTRIB / ATTRIB_WIDTH) + 1];
+
+extern void opcode_init(void);
+
+extern int opcode_which_immediate_is_extended(opcode_t opcode);
+
+#endif
diff --git a/target/hexagon/opcodes.c b/target/hexagon/opcodes.c
new file mode 100644
index 0000000..3a9a3a1
--- /dev/null
+++ b/target/hexagon/opcodes.c
@@ -0,0 +1,211 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * opcodes.c
+ *
+ * Opcode data tables built from the generated headers,
+ * plus a few helper functions that operate on them
+ */
+
+#include "qemu/osdep.h"
+#include "opcodes.h"
+#include "decode.h"
+
+#define VEC_DESCR(A, B, C) DESCR(A, B, C)
+#define DONAME(X) #X
+
+const char *opcode_names[] = {
+#define OPCODE(IID) DONAME(IID)
+#include "opcodes_def_generated.h"
+    NULL
+#undef OPCODE
+};
+
+const char *opcode_reginfo[] = {
+#define IMMINFO(TAG, SIGN, SIZE, SHAMT, SIGN2, SIZE2, SHAMT2)    /* nothing */
+#define REGINFO(TAG, REGINFO, RREGS, WREGS) REGINFO,
+#include "op_regs_generated.h"
+    NULL
+#undef REGINFO
+#undef IMMINFO
+};
+
+
+const char *opcode_rregs[] = {
+#define IMMINFO(TAG, SIGN, SIZE, SHAMT, SIGN2, SIZE2, SHAMT2)    /* nothing */
+#define REGINFO(TAG, REGINFO, RREGS, WREGS) RREGS,
+#include "op_regs_generated.h"
+    NULL
+#undef REGINFO
+#undef IMMINFO
+};
+
+
+const char *opcode_wregs[] = {
+#define IMMINFO(TAG, SIGN, SIZE, SHAMT, SIGN2, SIZE2, SHAMT2)    /* nothing */
+#define REGINFO(TAG, REGINFO, RREGS, WREGS) WREGS,
+#include "op_regs_generated.h"
+    NULL
+#undef REGINFO
+#undef IMMINFO
+};
+
+const char *opcode_short_semantics[] = {
+#define OPCODE(X)              NULL
+#include "opcodes_def_generated.h"
+#undef OPCODE
+    NULL
+};
+
+size4u_t
+    opcode_attribs[XX_LAST_OPCODE][(A_ZZ_LASTATTRIB / ATTRIB_WIDTH) + 1];
+
+static void init_attribs(int tag, ...)
+{
+    va_list ap;
+    int attr;
+    va_start(ap, tag);
+    while ((attr = va_arg(ap, int)) != 0) {
+        opcode_attribs[tag][attr / ATTRIB_WIDTH] |= 1u << (attr % ATTRIB_WIDTH);
+    }
+    va_end(ap);
+}
+
+static size4u_t str2val(const char *str)
+{
+    size4u_t ret = 0;
+    for ( ; *str; str++) {
+        switch (*str) {
+        case ' ':
+        case '\t':
+            break;
+        case 's':
+        case 't':
+        case 'u':
+        case 'v':
+        case 'w':
+        case 'd':
+        case 'e':
+        case 'x':
+        case 'y':
+        case 'i':
+        case 'I':
+        case 'P':
+        case 'E':
+        case 'o':
+        case '-':
+        case '0':
+            ret = (ret << 1) | 0;
+            break;
+        case '1':
+            ret = (ret << 1) | 1;
+            break;
+        default:
+            break;
+        }
+    }
+    return ret;
+}
+
+opcode_encoding_t opcode_encodings[] = {
+#define DEF_ENC32(OPCODE, ENCSTR) \
+    [OPCODE] = { .encoding = ENCSTR },
+
+#define DEF_ENC_SUBINSN(OPCODE, CLASS, ENCSTR) \
+    [OPCODE] = { .encoding = ENCSTR, .enc_class = CLASS },
+
+#define DEF_EXT_ENC(OPCODE, CLASS, ENCSTR) \
+    [OPCODE] = { .encoding = ENCSTR, .enc_class = CLASS },
+
+#include "imported/encode.def"
+
+#undef DEF_ENC32
+#undef DEF_ENC_SUBINSN
+#undef DEF_EXT_ENC
+};
+
+void opcode_init(void)
+{
+    init_attribs(0, 0);
+
+#define DEF_ENC32(OPCODE, ENCSTR) \
+    opcode_encodings[OPCODE].vals = str2val(ENCSTR);
+
+#define DEF_ENC_SUBINSN(OPCODE, CLASS, ENCSTR) \
+    opcode_encodings[OPCODE].vals = str2val(ENCSTR);
+
+#define LEGACY_DEF_ENC32(OPCODE, ENCSTR) \
+    opcode_encodings[OPCODE].dep_vals = str2val(ENCSTR);
+
+#define DEF_EXT_ENC(OPCODE, CLASS, ENCSTR) \
+    opcode_encodings[OPCODE].vals = str2val(ENCSTR);
+
+#include "imported/encode.def"
+
+#undef LEGACY_DEF_ENC32
+#undef DEF_ENC32
+#undef DEF_ENC_SUBINSN
+#undef DEF_EXT_ENC
+
+#define ATTRIBS(...) , ## __VA_ARGS__, 0
+#define OP_ATTRIB(TAG, ARGS) init_attribs(TAG ARGS);
+#include "op_attribs_generated.h"
+#undef OP_ATTRIB
+#undef ATTRIBS
+
+    decode_init();
+
+#define DEF_SHORTCODE(TAG, SHORTCODE) \
+    opcode_short_semantics[TAG] = #SHORTCODE;
+#include "shortcode_generated.h"
+#undef DEF_SHORTCODE
+}
+
+
+#define NEEDLE "IMMEXT("
+
+int opcode_which_immediate_is_extended(opcode_t opcode)
+{
+    const char *p;
+    if (opcode >= XX_LAST_OPCODE) {
+        g_assert_not_reached();
+        return 0;
+    }
+    if (!GET_ATTRIB(opcode, A_EXTENDABLE)) {
+        g_assert_not_reached();
+        return 0;
+    }
+    p = opcode_short_semantics[opcode];
+    p = strstr(p, NEEDLE);
+    if (p == NULL) {
+        g_assert_not_reached();
+        return 0;
+    }
+    p += strlen(NEEDLE);
+    while (isspace(*p)) {
+        p++;
+    }
+    /* lower is always imm 0, upper always imm 1. */
+    if (islower(*p)) {
+        return 0;
+    } else if (isupper(*p)) {
+        return 1;
+    } else {
+        g_assert_not_reached();
+    }
+}
-- 
2.7.4
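
A note on opcode_which_immediate_is_extended() above: it locates "IMMEXT("
in the opcode's short semantics string, skips whitespace, and uses the case
of the next character to pick the immediate (lower case is immediate 0,
upper case is immediate 1). The Python sketch below mirrors that lookup. The
shortcode strings are hypothetical and only shaped like the generated ones,
and exceptions stand in for g_assert_not_reached().

```python
NEEDLE = 'IMMEXT('

def which_immediate_is_extended(shortcode):
    pos = shortcode.find(NEEDLE)
    if pos < 0:
        raise ValueError('no IMMEXT( in shortcode')
    # Skip whitespace between the parenthesis and the operand name.
    rest = shortcode[pos + len(NEEDLE):].lstrip()
    first = rest[:1]
    if first.islower():
        return 0  # lower-case operand: immediate 0
    if first.isupper():
        return 1  # upper-case operand: immediate 1
    raise ValueError('unexpected operand after IMMEXT(')

# Hypothetical shortcode strings.
print(which_immediate_is_extended('{ fIMMEXT(siV); RdV = RsV + siV; }'))
print(which_immediate_is_extended('{ fIMMEXT( UiV); RdV = UiV; }'))
```

The two calls return 0 and 1 respectively.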



* [RFC PATCH v3 25/34] Hexagon (target/hexagon) macros to interface with the generator
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (23 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 24/34] Hexagon (target/hexagon) opcode data structures Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-29  0:49   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics Taylor Simpson
                   ` (10 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Macros for the various forms of declare, read, write, and free

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/macros.h | 364 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 364 insertions(+)
 create mode 100644 target/hexagon/macros.h
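
Background for DECL_REG_WRITABLE and DECL_PAIR_WRITABLE in the diff below:
for a predicated instruction they copy hex_gpr[NUM] into hex_new_value[NUM]
unless the register is already preloaded. One plausible reading is that this
lets a write whose predicate turns out false commit the old value unchanged.
The toy model below illustrates that reading only. The commit step, the
`preloaded` set, and the class itself are assumptions; none of them appear in
this patch.

```python
class PacketModel:
    """Toy model of register writes within one packet (not QEMU code)."""

    def __init__(self, gpr, preloaded=()):
        self.gpr = list(gpr)
        self.new_value = [0] * len(gpr)
        self.preloaded = set(preloaded)  # stands in for is_preloaded(ctx, NUM)
        self.written = set()             # stands in for ctx_log_reg_write()

    def decl_reg_writable(self, num, predicated):
        # Seed new_value with the current value for predicated instructions.
        if predicated and num not in self.preloaded:
            self.new_value[num] = self.gpr[num]

    def write(self, num, value, predicate_true=True):
        self.written.add(num)  # recorded regardless of the predicate
        if predicate_true:
            self.new_value[num] = value

    def commit(self):
        # Assumed commit: every logged register takes its new_value.
        for num in self.written:
            self.gpr[num] = self.new_value[num]

m = PacketModel([10, 20, 30])
m.decl_reg_writable(1, predicated=True)
m.write(1, 99, predicate_true=False)
m.commit()
print(m.gpr)  # [10, 20, 30]
```

Skipping decl_reg_writable() in this model would commit a stale 0 into
register 1, which is the situation the preload appears to guard against.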

diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
new file mode 100644
index 0000000..5582dcb
--- /dev/null
+++ b/target/hexagon/macros.h
@@ -0,0 +1,364 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_MACROS_H
+#define HEXAGON_MACROS_H
+
+#include "qemu.h"
+#include "cpu.h"
+#include "hex_regs.h"
+#include "reg_fields.h"
+
+#ifdef QEMU_GENERATE
+#define DECL_REG(NAME, NUM, X, OFF) \
+    TCGv NAME = tcg_temp_local_new(); \
+    int NUM = REGNO(X) + OFF
+
+#define DECL_REG_WRITABLE(NAME, NUM, X, OFF) \
+    TCGv NAME = tcg_temp_local_new(); \
+    int NUM = REGNO(X) + OFF; \
+    do { \
+        int is_predicated = GET_ATTRIB(insn->opcode, A_CONDEXEC); \
+        if (is_predicated && !is_preloaded(ctx, NUM)) { \
+            tcg_gen_mov_tl(hex_new_value[NUM], hex_gpr[NUM]); \
+        } \
+    } while (0)
+/*
+ * For read-only temps, avoid allocating and freeing
+ */
+#define DECL_REG_READONLY(NAME, NUM, X, OFF) \
+    TCGv NAME; \
+    int NUM = REGNO(X) + OFF
+
+#define DECL_RREG_d(NAME, NUM, X, OFF) \
+    DECL_REG_WRITABLE(NAME, NUM, X, OFF)
+#define DECL_RREG_e(NAME, NUM, X, OFF) \
+    DECL_REG(NAME, NUM, X, OFF)
+#define DECL_RREG_s(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+#define DECL_RREG_t(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+#define DECL_RREG_u(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+#define DECL_RREG_v(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+#define DECL_RREG_x(NAME, NUM, X, OFF) \
+    DECL_REG_WRITABLE(NAME, NUM, X, OFF)
+#define DECL_RREG_y(NAME, NUM, X, OFF) \
+    DECL_REG_WRITABLE(NAME, NUM, X, OFF)
+
+#define DECL_PREG_d(NAME, NUM, X, OFF) \
+    DECL_REG(NAME, NUM, X, OFF)
+#define DECL_PREG_e(NAME, NUM, X, OFF) \
+    DECL_REG(NAME, NUM, X, OFF)
+#define DECL_PREG_s(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+#define DECL_PREG_t(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+#define DECL_PREG_u(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+#define DECL_PREG_v(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+#define DECL_PREG_x(NAME, NUM, X, OFF) \
+    DECL_REG(NAME, NUM, X, OFF)
+#define DECL_PREG_y(NAME, NUM, X, OFF) \
+    DECL_REG(NAME, NUM, X, OFF)
+
+#define DECL_CREG_d(NAME, NUM, X, OFF) \
+    DECL_REG(NAME, NUM, X, OFF)
+#define DECL_CREG_s(NAME, NUM, X, OFF) \
+    DECL_REG(NAME, NUM, X, OFF)
+
+#define DECL_MREG_u(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+
+#define DECL_NEW_NREG_s(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+#define DECL_NEW_NREG_t(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+
+#define DECL_NEW_PREG_t(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+#define DECL_NEW_PREG_u(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+#define DECL_NEW_PREG_v(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+
+#define DECL_NEW_OREG_s(NAME, NUM, X, OFF) \
+    DECL_REG_READONLY(NAME, NUM, X, OFF)
+
+#define DECL_PAIR(NAME, NUM, X, OFF) \
+    TCGv_i64 NAME = tcg_temp_local_new_i64(); \
+    size1u_t NUM = REGNO(X) + OFF
+
+#define DECL_PAIR_WRITABLE(NAME, NUM, X, OFF) \
+    TCGv_i64 NAME = tcg_temp_local_new_i64(); \
+    size1u_t NUM = REGNO(X) + OFF; \
+    do { \
+        int is_predicated = GET_ATTRIB(insn->opcode, A_CONDEXEC); \
+        if (is_predicated) { \
+            if (!is_preloaded(ctx, NUM)) { \
+                tcg_gen_mov_tl(hex_new_value[NUM], hex_gpr[NUM]); \
+            } \
+            if (!is_preloaded(ctx, NUM + 1)) { \
+                tcg_gen_mov_tl(hex_new_value[NUM + 1], hex_gpr[NUM + 1]); \
+            } \
+        } \
+    } while (0)
+
+#define DECL_RREG_dd(NAME, NUM, X, OFF) \
+    DECL_PAIR_WRITABLE(NAME, NUM, X, OFF)
+#define DECL_RREG_ss(NAME, NUM, X, OFF) \
+    DECL_PAIR(NAME, NUM, X, OFF)
+#define DECL_RREG_tt(NAME, NUM, X, OFF) \
+    DECL_PAIR(NAME, NUM, X, OFF)
+#define DECL_RREG_xx(NAME, NUM, X, OFF) \
+    DECL_PAIR_WRITABLE(NAME, NUM, X, OFF)
+#define DECL_RREG_yy(NAME, NUM, X, OFF) \
+    DECL_PAIR_WRITABLE(NAME, NUM, X, OFF)
+
+#define DECL_CREG_dd(NAME, NUM, X, OFF) \
+    DECL_PAIR_WRITABLE(NAME, NUM, X, OFF)
+#define DECL_CREG_ss(NAME, NUM, X, OFF) \
+    DECL_PAIR(NAME, NUM, X, OFF)
+
+#define DECL_IMM(NAME, X) \
+    int NAME = IMMNO(X); \
+    do { \
+        NAME = NAME; \
+    } while (0)
+#define DECL_TCG_IMM(TCG_NAME, VAL) \
+    TCGv TCG_NAME = tcg_const_tl(VAL)
+
+#define DECL_EA \
+    TCGv EA; \
+    do { \
+        if (GET_ATTRIB(insn->opcode, A_CONDEXEC)) { \
+            EA = tcg_temp_local_new(); \
+        } else { \
+            EA = tcg_temp_new(); \
+        } \
+    } while (0)
+
+#define LOG_REG_WRITE(RNUM, VAL)\
+    do { \
+        int is_predicated = GET_ATTRIB(insn->opcode, A_CONDEXEC); \
+        gen_log_reg_write(RNUM, VAL, insn->slot, is_predicated); \
+        ctx_log_reg_write(ctx, (RNUM)); \
+    } while (0)
+
+#define LOG_PRED_WRITE(PNUM, VAL) \
+    do { \
+        gen_log_pred_write(PNUM, VAL); \
+        ctx_log_pred_write(ctx, (PNUM)); \
+    } while (0)
+
+#define FREE_REG(NAME) \
+    tcg_temp_free(NAME)
+#define FREE_REG_READONLY(NAME) \
+    /* Nothing */
+
+#define FREE_RREG_d(NAME)            FREE_REG(NAME)
+#define FREE_RREG_e(NAME)            FREE_REG(NAME)
+#define FREE_RREG_s(NAME)            FREE_REG_READONLY(NAME)
+#define FREE_RREG_t(NAME)            FREE_REG_READONLY(NAME)
+#define FREE_RREG_u(NAME)            FREE_REG_READONLY(NAME)
+#define FREE_RREG_v(NAME)            FREE_REG_READONLY(NAME)
+#define FREE_RREG_x(NAME)            FREE_REG(NAME)
+#define FREE_RREG_y(NAME)            FREE_REG(NAME)
+
+#define FREE_PREG_d(NAME)            FREE_REG(NAME)
+#define FREE_PREG_e(NAME)            FREE_REG(NAME)
+#define FREE_PREG_s(NAME)            FREE_REG_READONLY(NAME)
+#define FREE_PREG_t(NAME)            FREE_REG_READONLY(NAME)
+#define FREE_PREG_u(NAME)            FREE_REG_READONLY(NAME)
+#define FREE_PREG_v(NAME)            FREE_REG_READONLY(NAME)
+#define FREE_PREG_x(NAME)            FREE_REG(NAME)
+
+#define FREE_CREG_d(NAME)            FREE_REG(NAME)
+#define FREE_CREG_s(NAME)            FREE_REG_READONLY(NAME)
+
+#define FREE_MREG_u(NAME)            FREE_REG_READONLY(NAME)
+
+#define FREE_NEW_NREG_s(NAME)        FREE_REG(NAME)
+#define FREE_NEW_NREG_t(NAME)        FREE_REG(NAME)
+
+#define FREE_NEW_PREG_t(NAME)        FREE_REG_READONLY(NAME)
+#define FREE_NEW_PREG_u(NAME)        FREE_REG_READONLY(NAME)
+#define FREE_NEW_PREG_v(NAME)        FREE_REG_READONLY(NAME)
+
+#define FREE_NEW_OREG_s(NAME)        FREE_REG(NAME)
+
+#define FREE_REG_PAIR(NAME) \
+    tcg_temp_free_i64(NAME)
+
+#define FREE_RREG_dd(NAME)           FREE_REG_PAIR(NAME)
+#define FREE_RREG_ss(NAME)           FREE_REG_PAIR(NAME)
+#define FREE_RREG_tt(NAME)           FREE_REG_PAIR(NAME)
+#define FREE_RREG_xx(NAME)           FREE_REG_PAIR(NAME)
+#define FREE_RREG_yy(NAME)           FREE_REG_PAIR(NAME)
+
+#define FREE_CREG_dd(NAME)           FREE_REG_PAIR(NAME)
+#define FREE_CREG_ss(NAME)           FREE_REG_PAIR(NAME)
+
+#define FREE_IMM(NAME)               /* nothing */
+#define FREE_TCG_IMM(NAME)           tcg_temp_free(NAME)
+
+#define FREE_EA \
+    tcg_temp_free(EA)
+#else
+#define LOG_REG_WRITE(RNUM, VAL)\
+    log_reg_write(env, RNUM, VAL, slot)
+#define LOG_PRED_WRITE(RNUM, VAL)\
+    log_pred_write(env, RNUM, VAL)
+#endif
+
+#define SLOT_WRAP(CODE) \
+    do { \
+        TCGv slot = tcg_const_tl(insn->slot); \
+        CODE; \
+        tcg_temp_free(slot); \
+    } while (0)
+
+#define PART1_WRAP(CODE) \
+    do { \
+        TCGv part1 = tcg_const_tl(insn->part1); \
+        CODE; \
+        tcg_temp_free(part1); \
+    } while (0)
+
+#define MARK_LATE_PRED_WRITE(RNUM) /* Not modelled in qemu */
+
+#define REGNO(NUM) (insn->regno[NUM])
+#define IMMNO(NUM) (insn->immed[NUM])
+
+#ifdef QEMU_GENERATE
+#define READ_REG(dest, NUM) \
+    gen_read_reg(dest, NUM)
+#define READ_REG_READONLY(dest, NUM) \
+    do { dest = hex_gpr[NUM]; } while (0)
+
+#define READ_RREG_s(dest, NUM) \
+    READ_REG_READONLY(dest, NUM)
+#define READ_RREG_t(dest, NUM) \
+    READ_REG_READONLY(dest, NUM)
+#define READ_RREG_u(dest, NUM) \
+    READ_REG_READONLY(dest, NUM)
+#define READ_RREG_x(dest, NUM) \
+    READ_REG(dest, NUM)
+#define READ_RREG_y(dest, NUM) \
+    READ_REG(dest, NUM)
+
+#define READ_OREG_s(dest, NUM) \
+    READ_REG_READONLY(dest, NUM)
+
+#define READ_CREG_s(dest, NUM) \
+    do { \
+        if ((NUM) + HEX_REG_SA0 == HEX_REG_P3_0) { \
+            gen_read_p3_0(dest); \
+        } else { \
+            READ_REG_READONLY(dest, ((NUM) + HEX_REG_SA0)); \
+        } \
+    } while (0)
+
+#define READ_MREG_u(dest, NUM) \
+    do { \
+        READ_REG_READONLY(dest, ((NUM) + HEX_REG_M0)); \
+        dest = dest; \
+    } while (0)
+#else
+#define READ_REG(NUM) \
+    (env->gpr[(NUM)])
+#endif
+
+#ifdef QEMU_GENERATE
+#define READ_REG_PAIR(tmp, NUM) \
+    tcg_gen_concat_i32_i64(tmp, hex_gpr[NUM], hex_gpr[(NUM) + 1])
+#define READ_RREG_ss(tmp, NUM)          READ_REG_PAIR(tmp, NUM)
+#define READ_RREG_tt(tmp, NUM)          READ_REG_PAIR(tmp, NUM)
+#define READ_RREG_xx(tmp, NUM)          READ_REG_PAIR(tmp, NUM)
+#define READ_RREG_yy(tmp, NUM)          READ_REG_PAIR(tmp, NUM)
+
+#define READ_CREG_PAIR(tmp, i) \
+    READ_REG_PAIR(tmp, ((i) + HEX_REG_SA0))
+#define READ_CREG_ss(tmp, i)            READ_CREG_PAIR(tmp, i)
+#endif
+
+#ifdef QEMU_GENERATE
+#define READ_PREG(dest, NUM)             gen_read_preg(dest, (NUM))
+#define READ_PREG_READONLY(dest, NUM)    do { dest = hex_pred[NUM]; } while (0)
+
+#define READ_PREG_s(dest, NUM)           READ_PREG_READONLY(dest, NUM)
+#define READ_PREG_t(dest, NUM)           READ_PREG_READONLY(dest, NUM)
+#define READ_PREG_u(dest, NUM)           READ_PREG_READONLY(dest, NUM)
+#define READ_PREG_v(dest, NUM)           READ_PREG_READONLY(dest, NUM)
+#define READ_PREG_x(dest, NUM)           READ_PREG(dest, NUM)
+
+#define READ_NEW_PREG(pred, PNUM) \
+    do { pred = hex_new_pred_value[PNUM]; } while (0)
+#define READ_NEW_PREG_t(pred, PNUM)      READ_NEW_PREG(pred, PNUM)
+#define READ_NEW_PREG_u(pred, PNUM)      READ_NEW_PREG(pred, PNUM)
+#define READ_NEW_PREG_v(pred, PNUM)      READ_NEW_PREG(pred, PNUM)
+
+#define READ_NEW_REG(tmp, i) \
+    do { tmp = tcg_const_tl(i); } while (0)
+#define READ_NEW_NREG_s(tmp, i)          READ_NEW_REG(tmp, i)
+#define READ_NEW_NREG_t(tmp, i)          READ_NEW_REG(tmp, i)
+#define READ_NEW_OREG_s(tmp, i)          READ_NEW_REG(tmp, i)
+#else
+#define READ_PREG(NUM)                (env->pred[NUM])
+#endif
+
+
+#define WRITE_RREG(NUM, VAL)             LOG_REG_WRITE(NUM, VAL)
+#define WRITE_RREG_d(NUM, VAL)           LOG_REG_WRITE(NUM, VAL)
+#define WRITE_RREG_e(NUM, VAL)           LOG_REG_WRITE(NUM, VAL)
+#define WRITE_RREG_x(NUM, VAL)           LOG_REG_WRITE(NUM, VAL)
+#define WRITE_RREG_y(NUM, VAL)           LOG_REG_WRITE(NUM, VAL)
+
+#define WRITE_PREG(NUM, VAL)             LOG_PRED_WRITE(NUM, VAL)
+#define WRITE_PREG_d(NUM, VAL)           LOG_PRED_WRITE(NUM, VAL)
+#define WRITE_PREG_e(NUM, VAL)           LOG_PRED_WRITE(NUM, VAL)
+#define WRITE_PREG_x(NUM, VAL)           LOG_PRED_WRITE(NUM, VAL)
+
+#ifdef QEMU_GENERATE
+#define WRITE_CREG(i, tmp) \
+    do { \
+        if ((i) + HEX_REG_SA0 == HEX_REG_P3_0) { \
+            gen_write_p3_0(tmp); \
+        } else { \
+            WRITE_RREG((i) + HEX_REG_SA0, tmp); \
+        } \
+    } while (0)
+#define WRITE_CREG_d(NUM, VAL)           WRITE_CREG(NUM, VAL)
+
+#define WRITE_CREG_PAIR(i, tmp)          WRITE_REG_PAIR((i) + HEX_REG_SA0, tmp)
+#define WRITE_CREG_dd(NUM, VAL)          WRITE_CREG_PAIR(NUM, VAL)
+
+#define WRITE_REG_PAIR(NUM, VAL) \
+    do { \
+        int is_predicated = GET_ATTRIB(insn->opcode, A_CONDEXEC); \
+        gen_log_reg_write_pair(NUM, VAL, insn->slot, is_predicated); \
+        ctx_log_reg_write(ctx, (NUM)); \
+        ctx_log_reg_write(ctx, (NUM) + 1); \
+    } while (0)
+
+#define WRITE_RREG_dd(NUM, VAL)          WRITE_REG_PAIR(NUM, VAL)
+#define WRITE_RREG_xx(NUM, VAL)          WRITE_REG_PAIR(NUM, VAL)
+#define WRITE_RREG_yy(NUM, VAL)          WRITE_REG_PAIR(NUM, VAL)
+#endif
+
-- 
2.7.4
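
A note on the register pair macros above: READ_REG_PAIR builds the 64-bit
value with tcg_gen_concat_i32_i64(tmp, hex_gpr[NUM], hex_gpr[NUM + 1]), so
register NUM supplies the low word and NUM + 1 the high word. In plain
integer terms that is the arithmetic sketched below. The write direction is
an assumption (gen_log_reg_write_pair is not part of this patch), and both
function names are invented for the example.

```python
MASK32 = 0xFFFFFFFF

def read_reg_pair(gpr, num):
    """Low word from gpr[num], high word from gpr[num + 1]."""
    low = gpr[num] & MASK32
    high = gpr[num + 1] & MASK32
    return (high << 32) | low

def write_reg_pair(gpr, num, value):
    """Assumed inverse: split a 64-bit value back into two registers."""
    gpr[num] = value & MASK32
    gpr[num + 1] = (value >> 32) & MASK32

gpr = [0x11111111, 0x22222222, 0, 0]
pair = read_reg_pair(gpr, 0)
print(hex(pair))  # 0x2222222211111111
write_reg_pair(gpr, 2, pair)
print([hex(r) for r in gpr])
```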



* [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (24 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 25/34] Hexagon (target/hexagon) macros to interface with the generator Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-29  1:16   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 27/34] Hexagon (target/hexagon) instruction classes Taylor Simpson
                   ` (9 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/macros.h    | 620 ++++++++++++++++++++++++++++++++++++++++++++-
 target/hexagon/op_helper.c |  21 --
 2 files changed, 619 insertions(+), 22 deletions(-)
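
For reference, the arithmetic behind fZXTN, fSXTN and fSATN in the diff
below can be written out in a few lines of Python. This is a sketch only.
Python integers are unbounded where the C macros work on 64-bit values, the
unused M argument is dropped, and the overflow flag is returned to the caller
instead of being set through fSET_OVERFLOW(). The function names are
illustrative.

```python
def zxtn(n, val):
    """Keep the low n bits (fZXTN)."""
    return val & ((1 << n) - 1)

def sxtn(n, val):
    """Sign-extend the low n bits (fSXTN): flip the sign bit, subtract it."""
    sign = 1 << (n - 1)
    return (zxtn(n, val) ^ sign) - sign

def satn(n, val):
    """Signed saturation to n bits (fSATN); returns (value, overflowed)."""
    if sxtn(n, val) == val:
        return val, False
    limit = -(1 << (n - 1)) if val < 0 else (1 << (n - 1)) - 1
    return limit, True

print(sxtn(8, 0xFF))   # -1
print(satn(8, 200))    # (127, True)
print(satn(8, -200))   # (-128, True)
print(satn(8, -5))     # (-5, False)
```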

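The :mem_noshuf comment in the diff below says a helper merges the bytes
from the store buffer with the value loaded from memory. That helper is not
part of this patch, so the following is only a guess at the byte-level idea:
for each byte of the load, take it from the pending slot 1 store if the
addresses overlap, otherwise keep the byte read from memory. Little-endian
byte order is assumed, sign extension is left out, and the function name and
argument list are invented.

```python
def merge_inflight_store(load_addr, load_size, loaded,
                         store_addr, store_size, store_data):
    """Overlay the bytes of a pending store onto a freshly loaded value."""
    result = 0
    for i in range(load_size):
        addr = load_addr + i
        if store_addr <= addr < store_addr + store_size:
            # Byte is covered by the in-flight store.
            byte = (store_data >> (8 * (addr - store_addr))) & 0xFF
        else:
            # Byte comes from memory as loaded.
            byte = (loaded >> (8 * i)) & 0xFF
        result |= byte << (8 * i)
    return result

# Pending 4-byte store of 0xAABBCCDD at 0x1000; 2-byte load at 0x1002.
print(hex(merge_inflight_store(0x1002, 2, 0x1234, 0x1000, 4, 0xAABBCCDD)))
# Partially overlapping 4-byte load at 0x1002.
print(hex(merge_inflight_store(0x1002, 4, 0x11223344, 0x1000, 4, 0xAABBCCDD)))
```

The first call yields 0xaabb (fully covered by the store), the second
0x1122aabb (low half from the store, high half from memory).
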
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 5582dcb..9665bc9 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -18,6 +18,8 @@
 #ifndef HEXAGON_MACROS_H
 #define HEXAGON_MACROS_H
 
+#include "qemu/osdep.h"
+#include "qemu/host-utils.h"
 #include "qemu.h"
 #include "cpu.h"
 #include "hex_regs.h"
@@ -241,7 +243,6 @@
         tcg_temp_free(part1); \
     } while (0)
 
-#define MARK_LATE_PRED_WRITE(RNUM) /* Not modelled in qemu */
 
 #define REGNO(NUM) (insn->regno[NUM])
 #define IMMNO(NUM) (insn->immed[NUM])
@@ -362,3 +363,620 @@
 #define WRITE_RREG_yy(NUM, VAL)          WRITE_REG_PAIR(NUM, VAL)
 #endif
 
+#define PCALIGN 4
+#define PCALIGN_MASK (PCALIGN - 1)
+
+#define GET_FIELD(FIELD, REGIN) \
+    fEXTRACTU_BITS(REGIN, reg_field_info[FIELD].width, \
+                   reg_field_info[FIELD].offset)
+
+#define GET_USR_FIELD(FIELD) \
+    fEXTRACTU_BITS(env->gpr[HEX_REG_USR], reg_field_info[FIELD].width, \
+                   reg_field_info[FIELD].offset)
+
+#define SET_USR_FIELD(FIELD, VAL) \
+    fINSERT_BITS(env->gpr[HEX_REG_USR], reg_field_info[FIELD].width, \
+                 reg_field_info[FIELD].offset, (VAL))
+
+#ifdef QEMU_GENERATE
+/*
+ * Section 5.5 of the Hexagon V67 Programmer's Reference Manual
+ *
+ * Slot 1 store with slot 0 load
+ * A slot 1 store operation with a slot 0 load operation can appear in a packet.
+ * The packet attribute :mem_noshuf inhibits the instruction reordering that
+ * would otherwise be done by the assembler. For example:
+ *     {
+ *         memw(R5) = R2 // slot 1 store
+ *         R3 = memh(R6) // slot 0 load
+ *     }:mem_noshuf
+ * Unlike most packetized operations, these memory operations are not executed
+ * in parallel (Section 3.3.1). Instead, the store instruction in Slot 1
+ * effectively executes first, followed by the load instruction in Slot 0. If
+ * the addresses of the two operations are overlapping, the load will receive
+ * the newly stored data. This feature is supported in processor versions
+ * V65 or greater.
+ *
+ *
+ * For qemu, we look for a load in slot 0 when there is a store in slot 1
+ * in the same packet.  When we see this, we call a helper that merges the
+ * bytes from the store buffer with the value loaded from memory.
+ */
+#define CHECK_NOSHUF(DST, VA, SZ, SIGN) \
+    do { \
+        if (insn->slot == 0 && pkt->pkt_has_store_s1) { \
+            gen_helper_merge_inflight_store##SZ##SIGN(DST, cpu_env, VA, DST); \
+        } \
+    } while (0)
+
+#define MEM_LOAD1s(DST, VA) \
+    do { \
+        tcg_gen_qemu_ld8s(DST, VA, ctx->mem_idx); \
+        CHECK_NOSHUF(DST, VA, 1, s); \
+    } while (0)
+#define MEM_LOAD1u(DST, VA) \
+    do { \
+        tcg_gen_qemu_ld8u(DST, VA, ctx->mem_idx); \
+        CHECK_NOSHUF(DST, VA, 1, u); \
+    } while (0)
+#define MEM_LOAD2s(DST, VA) \
+    do { \
+        tcg_gen_qemu_ld16s(DST, VA, ctx->mem_idx); \
+        CHECK_NOSHUF(DST, VA, 2, s); \
+    } while (0)
+#define MEM_LOAD2u(DST, VA) \
+    do { \
+        tcg_gen_qemu_ld16u(DST, VA, ctx->mem_idx); \
+        CHECK_NOSHUF(DST, VA, 2, u); \
+    } while (0)
+#define MEM_LOAD4s(DST, VA) \
+    do { \
+        tcg_gen_qemu_ld32s(DST, VA, ctx->mem_idx); \
+        CHECK_NOSHUF(DST, VA, 4, s); \
+    } while (0)
+#define MEM_LOAD4u(DST, VA) \
+    do { \
+        tcg_gen_qemu_ld32u(DST, VA, ctx->mem_idx); \
+        CHECK_NOSHUF(DST, VA, 4, u); \
+    } while (0)
+#define MEM_LOAD8u(DST, VA) \
+    do { \
+        tcg_gen_qemu_ld64(DST, VA, ctx->mem_idx); \
+        CHECK_NOSHUF(DST, VA, 8, u); \
+    } while (0)
+#else
+#define MEM_LOAD1s(VA) ((size1s_t)mem_load1(env, slot, VA))
+#define MEM_LOAD1u(VA) ((size1u_t)mem_load1(env, slot, VA))
+#define MEM_LOAD2s(VA) ((size2s_t)mem_load2(env, slot, VA))
+#define MEM_LOAD2u(VA) ((size2u_t)mem_load2(env, slot, VA))
+#define MEM_LOAD4s(VA) ((size4s_t)mem_load4(env, slot, VA))
+#define MEM_LOAD4u(VA) ((size4u_t)mem_load4(env, slot, VA))
+#define MEM_LOAD8s(VA) ((size8s_t)mem_load8(env, slot, VA))
+#define MEM_LOAD8u(VA) ((size8u_t)mem_load8(env, slot, VA))
+
+#define MEM_STORE1(VA, DATA, SLOT) log_store32(env, VA, DATA, 1, SLOT)
+#define MEM_STORE2(VA, DATA, SLOT) log_store32(env, VA, DATA, 2, SLOT)
+#define MEM_STORE4(VA, DATA, SLOT) log_store32(env, VA, DATA, 4, SLOT)
+#define MEM_STORE8(VA, DATA, SLOT) log_store64(env, VA, DATA, 8, SLOT)
+#endif
+
+#define CANCEL cancel_slot(env, slot)
+
+#define LOAD_CANCEL(EA) do { CANCEL; } while (0)
+
+#ifdef QEMU_GENERATE
+static inline void gen_pred_cancel(TCGv pred, int slot_num)
+{
+    TCGv slot_mask = tcg_const_tl(1 << slot_num);
+    TCGv tmp = tcg_temp_new();
+    TCGv zero = tcg_const_tl(0);
+    TCGv one = tcg_const_tl(1);
+    tcg_gen_or_tl(slot_mask, hex_slot_cancelled, slot_mask);
+    tcg_gen_andi_tl(tmp, pred, 1);
+    tcg_gen_movcond_tl(TCG_COND_EQ, hex_slot_cancelled, tmp, zero,
+                       slot_mask, hex_slot_cancelled);
+    tcg_temp_free(slot_mask);
+    tcg_temp_free(tmp);
+    tcg_temp_free(zero);
+    tcg_temp_free(one);
+}
+#define PRED_LOAD_CANCEL(PRED, EA) \
+    gen_pred_cancel(PRED, insn->is_endloop ? 4 : insn->slot)
+#endif
+
+#define STORE_CANCEL(EA) { env->slot_cancelled |= (1 << slot); }
+
+#define fMAX(A, B) (((A) > (B)) ? (A) : (B))
+
+#define fMIN(A, B) (((A) < (B)) ? (A) : (B))
+
+#define fABS(A) (((A) < 0) ? (-(A)) : (A))
+#define fINSERT_BITS(REG, WIDTH, OFFSET, INVAL) \
+    do { \
+        REG = ((REG) & ~(((fCONSTLL(1) << (WIDTH)) - 1) << (OFFSET))) | \
+           (((INVAL) & ((fCONSTLL(1) << (WIDTH)) - 1)) << (OFFSET)); \
+    } while (0)
+#define fEXTRACTU_BITS(INREG, WIDTH, OFFSET) \
+    (fZXTN(WIDTH, 32, ((INREG) >> (OFFSET))))
+#define fEXTRACTU_BIDIR(INREG, WIDTH, OFFSET) \
+    (fZXTN(WIDTH, 32, fBIDIR_LSHIFTR((INREG), (OFFSET), 4_8)))
+#define fEXTRACTU_RANGE(INREG, HIBIT, LOWBIT) \
+    (fZXTN(((HIBIT) - (LOWBIT) + 1), 32, ((INREG) >> (LOWBIT))))
+
+#define f8BITSOF(VAL) ((VAL) ? 0xff : 0x00)
+
+#ifdef QEMU_GENERATE
+#define fLSBOLD(VAL) tcg_gen_andi_tl(LSB, (VAL), 1)
+#else
+#define fLSBOLD(VAL)  ((VAL) & 1)
+#endif
+
+#ifdef QEMU_GENERATE
+#define fLSBNEW(PVAL)   tcg_gen_mov_tl(LSB, (PVAL))
+#define fLSBNEW0        fLSBNEW(0)
+#define fLSBNEW1        fLSBNEW(1)
+#else
+#define fLSBNEW(PVAL)   (PVAL)
+#define fLSBNEW0        new_pred_value(env, 0)
+#define fLSBNEW1        new_pred_value(env, 1)
+#endif
+
+#ifdef QEMU_GENERATE
+static inline void gen_logical_not(TCGv dest, TCGv src)
+{
+    TCGv one = tcg_const_tl(1);
+    TCGv zero = tcg_const_tl(0);
+
+    tcg_gen_movcond_tl(TCG_COND_NE, dest, src, zero, zero, one);
+
+    tcg_temp_free(one);
+    tcg_temp_free(zero);
+}
+#define fLSBOLDNOT(VAL) \
+    do { \
+        tcg_gen_andi_tl(LSB, (VAL), 1); \
+        tcg_gen_xori_tl(LSB, LSB, 1); \
+    } while (0)
+#define fLSBNEWNOT(PNUM) \
+    gen_logical_not(LSB, (PNUM))
+#else
+#define fLSBNEWNOT(PNUM) (!fLSBNEW(PNUM))
+#define fLSBOLDNOT(VAL) (!fLSBOLD(VAL))
+#define fLSBNEW0NOT (!fLSBNEW0)
+#define fLSBNEW1NOT (!fLSBNEW1)
+#endif
+
+#define fNEWREG(RNUM) ((int32_t)(env->new_value[(RNUM)]))
+
+#define fNEWREG_ST(RNUM) (env->new_value[(RNUM)])
+
+#define fSATUVALN(N, VAL) \
+    ({ \
+        fSET_OVERFLOW(); \
+        ((VAL) < 0) ? 0 : ((1LL << (N)) - 1); \
+    })
+#define fSATVALN(N, VAL) \
+    ({ \
+        fSET_OVERFLOW(); \
+        ((VAL) < 0) ? (-(1LL << ((N) - 1))) : ((1LL << ((N) - 1)) - 1); \
+    })
+#define fZXTN(N, M, VAL) ((VAL) & ((1LL << (N)) - 1))
+#define fSXTN(N, M, VAL) \
+    ((fZXTN(N, M, VAL) ^ (1LL << ((N) - 1))) - (1LL << ((N) - 1)))
+#define fSATN(N, VAL) \
+    ((fSXTN(N, 64, VAL) == (VAL)) ? (VAL) : fSATVALN(N, VAL))
+#define fADDSAT64(DST, A, B) \
+    do { \
+        size8u_t __a = fCAST8u(A); \
+        size8u_t __b = fCAST8u(B); \
+        size8u_t __sum = __a + __b; \
+        size8u_t __xor = __a ^ __b; \
+        const size8u_t __mask = 0x8000000000000000ULL; \
+        if (__xor & __mask) { \
+            DST = __sum; \
+        } \
+        else if ((__a ^ __sum) & __mask) { \
+            if (__sum & __mask) { \
+                DST = 0x7FFFFFFFFFFFFFFFLL; \
+                fSET_OVERFLOW(); \
+            } else { \
+                DST = 0x8000000000000000LL; \
+                fSET_OVERFLOW(); \
+            } \
+        } else { \
+            DST = __sum; \
+        } \
+    } while (0)
+#define fSATUN(N, VAL) \
+    ((fZXTN(N, 64, VAL) == (VAL)) ? (VAL) : fSATUVALN(N, VAL))
+#define fSATH(VAL) (fSATN(16, VAL))
+#define fSATUH(VAL) (fSATUN(16, VAL))
+#define fSATUB(VAL) (fSATUN(8, VAL))
+#define fSATB(VAL) (fSATN(8, VAL))
+#define fIMMEXT(IMM) (IMM = IMM)
+#define fMUST_IMMEXT(IMM) fIMMEXT(IMM)
+
+#define fPCALIGN(IMM) IMM = (IMM & ~PCALIGN_MASK)
+
+#define fREAD_LR() (READ_REG(HEX_REG_LR))
+
+#define fWRITE_LR(A) WRITE_RREG(HEX_REG_LR, A)
+#define fWRITE_FP(A) WRITE_RREG(HEX_REG_FP, A)
+#define fWRITE_SP(A) WRITE_RREG(HEX_REG_SP, A)
+
+#define fREAD_SP() (READ_REG(HEX_REG_SP))
+#define fREAD_LC0 (READ_REG(HEX_REG_LC0))
+#define fREAD_LC1 (READ_REG(HEX_REG_LC1))
+#define fREAD_SA0 (READ_REG(HEX_REG_SA0))
+#define fREAD_SA1 (READ_REG(HEX_REG_SA1))
+#define fREAD_FP() (READ_REG(HEX_REG_FP))
+#ifdef FIXME
+/* Figure out how to get insn->extension_valid to helper */
+#define fREAD_GP() \
+    (insn->extension_valid ? 0 : READ_REG(HEX_REG_GP))
+#else
+#define fREAD_GP() READ_REG(HEX_REG_GP)
+#endif
+#define fREAD_PC() (READ_REG(HEX_REG_PC))
+
+#define fREAD_NPC() (env->next_PC & (0xfffffffe))
+
+#define fREAD_P0() (READ_PREG(0))
+#define fREAD_P3() (READ_PREG(3))
+
+#define fCHECK_PCALIGN(A)
+
+#define fWRITE_NPC(A) write_new_pc(env, A)
+
+#define fBRANCH(LOC, TYPE)          fWRITE_NPC(LOC)
+#define fJUMPR(REGNO, TARGET, TYPE) fBRANCH(TARGET, COF_TYPE_JUMPR)
+#define fHINTJR(TARGET) { /* Not modelled in qemu */}
+#define fCALL(A) \
+    do { \
+        fWRITE_LR(fREAD_NPC()); \
+        fBRANCH(A, COF_TYPE_CALL); \
+    } while (0)
+#define fCALLR(A) \
+    do { \
+        fWRITE_LR(fREAD_NPC()); \
+        fBRANCH(A, COF_TYPE_CALLR); \
+    } while (0)
+#define fWRITE_LOOP_REGS0(START, COUNT) \
+    do { \
+        WRITE_RREG(HEX_REG_LC0, COUNT);  \
+        WRITE_RREG(HEX_REG_SA0, START); \
+    } while (0)
+#define fWRITE_LOOP_REGS1(START, COUNT) \
+    do { \
+        WRITE_RREG(HEX_REG_LC1, COUNT);  \
+        WRITE_RREG(HEX_REG_SA1, START);\
+    } while (0)
+#define fWRITE_LC0(VAL) WRITE_RREG(HEX_REG_LC0, VAL)
+#define fWRITE_LC1(VAL) WRITE_RREG(HEX_REG_LC1, VAL)
+
+#define fCARRY_FROM_ADD(A, B, C) carry_from_add64(A, B, C)
+
+#define fSET_OVERFLOW() SET_USR_FIELD(USR_OVF, 1)
+#define fSET_LPCFG(VAL) SET_USR_FIELD(USR_LPCFG, (VAL))
+#define fGET_LPCFG (GET_USR_FIELD(USR_LPCFG))
+#define fWRITE_P0(VAL) WRITE_PREG(0, VAL)
+#define fWRITE_P1(VAL) WRITE_PREG(1, VAL)
+#define fWRITE_P2(VAL) WRITE_PREG(2, VAL)
+#define fWRITE_P3(VAL) WRITE_PREG(3, VAL)
+#define fPART1(WORK) if (part1) { WORK; return; }
+#define fCAST4u(A) ((size4u_t)(A))
+#define fCAST4s(A) ((size4s_t)(A))
+#define fCAST8u(A) ((size8u_t)(A))
+#define fCAST8s(A) ((size8s_t)(A))
+#define fCAST4_4s(A) ((size4s_t)(A))
+#define fCAST4_4u(A) ((size4u_t)(A))
+#define fCAST4_8s(A) ((size8s_t)((size4s_t)(A)))
+#define fCAST4_8u(A) ((size8u_t)((size4u_t)(A)))
+#define fCAST8_8s(A) ((size8s_t)(A))
+#define fCAST8_8u(A) ((size8u_t)(A))
+#define fCAST2_8s(A) ((size8s_t)((size2s_t)(A)))
+#define fCAST2_8u(A) ((size8u_t)((size2u_t)(A)))
+#define fZE8_16(A) ((size2s_t)((size1u_t)(A)))
+#define fSE8_16(A) ((size2s_t)((size1s_t)(A)))
+#define fSE16_32(A) ((size4s_t)((size2s_t)(A)))
+#define fZE16_32(A) ((size4u_t)((size2u_t)(A)))
+#define fSE32_64(A) ((size8s_t)((size4s_t)(A)))
+#define fZE32_64(A) ((size8u_t)((size4u_t)(A)))
+#define fSE8_32(A) ((size4s_t)((size1s_t)(A)))
+#define fZE8_32(A) ((size4s_t)((size1u_t)(A)))
+#define fMPY8UU(A, B) (int)(fZE8_16(A) * fZE8_16(B))
+#define fMPY8US(A, B) (int)(fZE8_16(A) * fSE8_16(B))
+#define fMPY8SU(A, B) (int)(fSE8_16(A) * fZE8_16(B))
+#define fMPY8SS(A, B) (int)((short)(A) * (short)(B))
+#define fMPY16SS(A, B) fSE32_64(fSE16_32(A) * fSE16_32(B))
+#define fMPY16UU(A, B) fZE32_64(fZE16_32(A) * fZE16_32(B))
+#define fMPY16SU(A, B) fSE32_64(fSE16_32(A) * fZE16_32(B))
+#define fMPY16US(A, B) fMPY16SU(B, A)
+#define fMPY32SS(A, B) (fSE32_64(A) * fSE32_64(B))
+#define fMPY32UU(A, B) (fZE32_64(A) * fZE32_64(B))
+#define fMPY32SU(A, B) (fSE32_64(A) * fZE32_64(B))
+#define fMPY3216SS(A, B) (fSE32_64(A) * fSXTN(16, 64, B))
+#define fMPY3216SU(A, B) (fSE32_64(A) * fZXTN(16, 64, B))
+#define fROUND(A) ((A) + 0x8000)
+#define fCLIP(DST, SRC, U) \
+    do { \
+        size4s_t maxv = (1 << U) - 1; \
+        size4s_t minv = -(1 << U); \
+        DST = fMIN(maxv, fMAX(SRC, minv)); \
+    } while (0)
+#define fCRND(A) ((((A) & 0x3) == 0x3) ? ((A) + 1) : ((A)))
+#define fRNDN(A, N) ((((N) == 0) ? (A) : (((fSE32_64(A)) + (1 << ((N) - 1))))))
+#define fCRNDN(A, N) (conv_round(A, N))
+#define fADD128(A, B) (add128(A, B))
+#define fSUB128(A, B) (sub128(A, B))
+#define fSHIFTR128(A, B) (shiftr128(A, B))
+#define fSHIFTL128(A, B) (shiftl128(A, B))
+#define fAND128(A, B) (and128(A, B))
+#define fCAST8S_16S(A) (cast8s_to_16s(A))
+#define fCAST16S_8S(A) (cast16s_to_8s(A))
+
+#define fEA_RI(REG, IMM) \
+    do { \
+        EA = REG + IMM; \
+    } while (0)
+#define fEA_RRs(REG, REG2, SCALE) \
+    do { \
+        EA = REG + (REG2 << SCALE); \
+    } while (0)
+#define fEA_IRs(IMM, REG, SCALE) \
+    do { \
+        EA = IMM + (REG << SCALE); \
+    } while (0)
+
+#ifdef QEMU_GENERATE
+#define fEA_IMM(IMM)        tcg_gen_movi_tl(EA, (IMM))
+#define fEA_REG(REG)        tcg_gen_mov_tl(EA, REG)
+#define fPM_I(REG, IMM)     tcg_gen_addi_tl(REG, REG, (IMM))
+#define fPM_M(REG, MVAL)    tcg_gen_add_tl(REG, REG, MVAL)
+#else
+#define fEA_IMM(IMM)        do { EA = (IMM); } while (0)
+#define fEA_REG(REG)        do { EA = (REG); } while (0)
+#define fEA_GPI(IMM)        do { EA = fREAD_GP() + (IMM); } while (0)
+#define fPM_I(REG, IMM)     do { REG = REG + (IMM); } while (0)
+#define fPM_M(REG, MVAL)    do { REG = REG + (MVAL); } while (0)
+#endif
+#define fSCALE(N, A) (((size8s_t)(A)) << N)
+#define fSATW(A) fSATN(32, ((long long)A))
+#define fSAT(A) fSATN(32, (A))
+#define fSAT_ORIG_SHL(A, ORIG_REG) \
+    ((((size4s_t)((fSAT(A)) ^ ((size4s_t)(ORIG_REG)))) < 0) \
+        ? fSATVALN(32, ((size4s_t)(ORIG_REG))) \
+        : ((((ORIG_REG) > 0) && ((A) == 0)) ? fSATVALN(32, (ORIG_REG)) \
+                                            : fSAT(A)))
+#define fPASS(A) A
+#define fRND(A) (((A) + 1) >> 1)
+#define fBIDIR_SHIFTL(SRC, SHAMT, REGSTYPE) \
+    (((SHAMT) < 0) ? ((fCAST##REGSTYPE(SRC) >> ((-(SHAMT)) - 1)) >> 1) \
+                   : (fCAST##REGSTYPE(SRC) << (SHAMT)))
+#define fBIDIR_ASHIFTL(SRC, SHAMT, REGSTYPE) \
+    fBIDIR_SHIFTL(SRC, SHAMT, REGSTYPE##s)
+#define fBIDIR_LSHIFTL(SRC, SHAMT, REGSTYPE) \
+    fBIDIR_SHIFTL(SRC, SHAMT, REGSTYPE##u)
+#define fBIDIR_ASHIFTL_SAT(SRC, SHAMT, REGSTYPE) \
+    (((SHAMT) < 0) ? ((fCAST##REGSTYPE##s(SRC) >> ((-(SHAMT)) - 1)) >> 1) \
+                   : fSAT_ORIG_SHL(fCAST##REGSTYPE##s(SRC) << (SHAMT), (SRC)))
+#define fBIDIR_SHIFTR(SRC, SHAMT, REGSTYPE) \
+    (((SHAMT) < 0) ? ((fCAST##REGSTYPE(SRC) << ((-(SHAMT)) - 1)) << 1) \
+                   : (fCAST##REGSTYPE(SRC) >> (SHAMT)))
+#define fBIDIR_ASHIFTR(SRC, SHAMT, REGSTYPE) \
+    fBIDIR_SHIFTR(SRC, SHAMT, REGSTYPE##s)
+#define fBIDIR_LSHIFTR(SRC, SHAMT, REGSTYPE) \
+    fBIDIR_SHIFTR(SRC, SHAMT, REGSTYPE##u)
+#define fBIDIR_ASHIFTR_SAT(SRC, SHAMT, REGSTYPE) \
+    (((SHAMT) < 0) ? fSAT_ORIG_SHL((fCAST##REGSTYPE##s(SRC) \
+                        << ((-(SHAMT)) - 1)) << 1, (SRC)) \
+                   : (fCAST##REGSTYPE##s(SRC) >> (SHAMT)))
+#define fASHIFTR(SRC, SHAMT, REGSTYPE) (fCAST##REGSTYPE##s(SRC) >> (SHAMT))
+#define fLSHIFTR(SRC, SHAMT, REGSTYPE) \
+    (((SHAMT) >= 64) ? 0 : (fCAST##REGSTYPE##u(SRC) >> (SHAMT)))
+#define fROTL(SRC, SHAMT, REGSTYPE) \
+    (((SHAMT) == 0) ? (SRC) : ((fCAST##REGSTYPE##u(SRC) << (SHAMT)) | \
+                              ((fCAST##REGSTYPE##u(SRC) >> \
+                                 ((sizeof(SRC) * 8) - (SHAMT))))))
+#define fROTR(SRC, SHAMT, REGSTYPE) \
+    (((SHAMT) == 0) ? (SRC) : ((fCAST##REGSTYPE##u(SRC) >> (SHAMT)) | \
+                              ((fCAST##REGSTYPE##u(SRC) << \
+                                 ((sizeof(SRC) * 8) - (SHAMT))))))
+#define fASHIFTL(SRC, SHAMT, REGSTYPE) \
+    (((SHAMT) >= 64) ? 0 : (fCAST##REGSTYPE##s(SRC) << (SHAMT)))
+#define fFLOAT(A) \
+    ({ union { float f; size4u_t i; } _fipun; _fipun.i = (A); _fipun.f; })
+#define fUNFLOAT(A) \
+    ({ union { float f; size4u_t i; } _fipun; \
+     _fipun.f = (A); isnan(_fipun.f) ? 0xFFFFFFFFU : _fipun.i; })
+#define fSFNANVAL() 0xffffffff
+#define fSFINFVAL(A) (((A) & 0x80000000) | 0x7f800000)
+#define fSFONEVAL(A) (((A) & 0x80000000) | fUNFLOAT(1.0))
+#define fCHECKSFNAN(DST, A) \
+    do { \
+        if (isnan(fFLOAT(A))) { \
+            if ((fGETBIT(22, A)) == 0) { \
+                fRAISEFLAGS(FE_INVALID); \
+            } \
+            DST = fSFNANVAL(); \
+        } \
+    } while (0)
+#define fCHECKSFNAN3(DST, A, B, C) \
+    do { \
+        fCHECKSFNAN(DST, A); \
+        fCHECKSFNAN(DST, B); \
+        fCHECKSFNAN(DST, C); \
+    } while (0)
+#define fSF_BIAS() 127
+#define fSF_MANTBITS() 23
+#define fSF_MUL_POW2(A, B) \
+    (fUNFLOAT(fFLOAT(A) * fFLOAT((fSF_BIAS() + (B)) << fSF_MANTBITS())))
+#define fSF_GETEXP(A) (((A) >> fSF_MANTBITS()) & 0xff)
+#define fSF_MAXEXP() (254)
+#define fSF_RECIP_COMMON(N, D, O, A) arch_sf_recip_common(&N, &D, &O, &A)
+#define fSF_INVSQRT_COMMON(N, O, A) arch_sf_invsqrt_common(&N, &O, &A)
+#define fFMAFX(A, B, C, ADJ) internal_fmafx(A, B, C, fSXTN(8, 64, ADJ))
+#define fFMAF(A, B, C) internal_fmafx(A, B, C, 0)
+#define fSFMPY(A, B) internal_mpyf(A, B)
+#define fMAKESF(SIGN, EXP, MANT) \
+    ((((SIGN) & 1) << 31) | \
+     (((EXP) & 0xff) << fSF_MANTBITS()) | \
+     ((MANT) & ((1 << fSF_MANTBITS()) - 1)))
+#define fDOUBLE(A) \
+    ({ union { double f; size8u_t i; } _fipun; _fipun.i = (A); _fipun.f; })
+#define fUNDOUBLE(A) \
+    ({ union { double f; size8u_t i; } _fipun; \
+     _fipun.f = (A); \
+     isnan(_fipun.f) ? 0xFFFFFFFFFFFFFFFFULL : _fipun.i; })
+#define fDFNANVAL() 0xffffffffffffffffULL
+#define fDF_ISNORMAL(X) (fpclassify(fDOUBLE(X)) == FP_NORMAL)
+#define fDF_ISDENORM(X) (fpclassify(fDOUBLE(X)) == FP_SUBNORMAL)
+#define fDF_ISBIG(X) (fDF_GETEXP(X) >= 512)
+#define fDF_MANTBITS() 52
+#define fDF_GETEXP(A) (((A) >> fDF_MANTBITS()) & 0x7ff)
+#define fFMA(A, B, C) internal_fma(A, B, C)
+#define fDF_MPY_HH(A, B, ACC) internal_mpyhh(A, B, ACC)
+
+#ifdef QEMU_GENERATE
+/* These will be needed if we write any FP instructions with TCG */
+#define fFPOP_START()      /* nothing */
+#define fFPOP_END()        /* nothing */
+#else
+#define fFPOP_START() arch_fpop_start(env)
+#define fFPOP_END() arch_fpop_end(env)
+#endif
+
+#define fFPSETROUND_NEAREST() fesetround(FE_TONEAREST)
+#define fFPSETROUND_CHOP() fesetround(FE_TOWARDZERO)
+#define fFPCANCELFLAGS() feclearexcept(FE_ALL_EXCEPT)
+#define fISINFPROD(A, B) \
+    ((isinf(A) && isinf(B)) || \
+     (isinf(A) && isfinite(B) && ((B) != 0.0)) || \
+     (isinf(B) && isfinite(A) && ((A) != 0.0)))
+#define fISZEROPROD(A, B) \
+    ((((A) == 0.0) && isfinite(B)) || (((B) == 0.0) && isfinite(A)))
+#define fRAISEFLAGS(A) arch_raise_fpflag(A)
+#define fDF_MAX(A, B) \
+    (((A) == (B)) ? fDOUBLE(fUNDOUBLE(A) & fUNDOUBLE(B)) : fmax(A, B))
+#define fDF_MIN(A, B) \
+    (((A) == (B)) ? fDOUBLE(fUNDOUBLE(A) | fUNDOUBLE(B)) : fmin(A, B))
+#define fSF_MAX(A, B) \
+    (((A) == (B)) ? fFLOAT(fUNFLOAT(A) & fUNFLOAT(B)) : fmaxf(A, B))
+#define fSF_MIN(A, B) \
+    (((A) == (B)) ? fFLOAT(fUNFLOAT(A) | fUNFLOAT(B)) : fminf(A, B))
+
+#ifdef QEMU_GENERATE
+#define fLOAD(NUM, SIZE, SIGN, EA, DST) MEM_LOAD##SIZE##SIGN(DST, EA)
+#else
+#define fLOAD(NUM, SIZE, SIGN, EA, DST) \
+    DST = (size##SIZE##SIGN##_t)MEM_LOAD##SIZE##SIGN(EA)
+#endif
+
+#define fMEMOP(NUM, SIZE, SIGN, EA, FNTYPE, VALUE)
+
+#define fGET_FRAMEKEY() READ_REG(HEX_REG_FRAMEKEY)
+#define fFRAME_SCRAMBLE(VAL) ((VAL) ^ (fCAST8u(fGET_FRAMEKEY()) << 32))
+#define fFRAME_UNSCRAMBLE(VAL) fFRAME_SCRAMBLE(VAL)
+
+#ifdef CONFIG_USER_ONLY
+#define fFRAMECHECK(ADDR, EA) do { } while (0) /* Not modelled in linux-user */
+#else
+/* System mode not implemented yet */
+#define fFRAMECHECK(ADDR, EA)  g_assert_not_reached();
+#endif
+
+#ifdef QEMU_GENERATE
+#define fLOAD_LOCKED(NUM, SIZE, SIGN, EA, DST) \
+    gen_load_locked##SIZE##SIGN(DST, EA, ctx->mem_idx);
+#endif
+
+#define fSTORE(NUM, SIZE, EA, SRC) MEM_STORE##SIZE(EA, SRC, slot)
+
+#ifdef QEMU_GENERATE
+#define fSTORE_LOCKED(NUM, SIZE, EA, SRC, PRED) \
+    gen_store_conditional##SIZE(env, ctx, PdN, PRED, EA, SRC);
+#endif
+
+#define fGETBYTE(N, SRC) ((size1s_t)((SRC >> ((N) * 8)) & 0xff))
+#define fGETUBYTE(N, SRC) ((size1u_t)((SRC >> ((N) * 8)) & 0xff))
+
+#define fSETBYTE(N, DST, VAL) \
+    do { \
+        DST = (DST & ~(0x0ffLL << ((N) * 8))) | \
+        (((size8u_t)((VAL) & 0x0ffLL)) << ((N) * 8)); \
+    } while (0)
+#define fGETHALF(N, SRC) ((size2s_t)((SRC >> ((N) * 16)) & 0xffff))
+#define fGETUHALF(N, SRC) ((size2u_t)((SRC >> ((N) * 16)) & 0xffff))
+#define fSETHALF(N, DST, VAL) \
+    do { \
+        DST = (DST & ~(0x0ffffLL << ((N) * 16))) | \
+        (((size8u_t)((VAL) & 0x0ffff)) << ((N) * 16)); \
+    } while (0)
+#define fSETHALFw fSETHALF
+#define fSETHALFd fSETHALF
+
+#define fGETWORD(N, SRC) \
+    ((size8s_t)((size4s_t)((SRC >> ((N) * 32)) & 0x0ffffffffLL)))
+#define fGETUWORD(N, SRC) \
+    ((size8u_t)((size4u_t)((SRC >> ((N) * 32)) & 0x0ffffffffLL)))
+
+#define fSETWORD(N, DST, VAL) \
+    do { \
+        DST = (DST & ~(0x0ffffffffLL << ((N) * 32))) | \
+              (((VAL) & 0x0ffffffffLL) << ((N) * 32)); \
+    } while (0)
+
+#define fSETBIT(N, DST, VAL) \
+    do { \
+        DST = (DST & ~(1ULL << (N))) | (((size8u_t)(VAL)) << (N)); \
+    } while (0)
+
+#define fGETBIT(N, SRC) (((SRC) >> (N)) & 1)
+#define fSETBITS(HI, LO, DST, VAL) \
+    do { \
+        int j; \
+        for (j = LO; j <= HI; j++) { \
+            fSETBIT(j, DST, VAL); \
+        } \
+    } while (0)
+#define fCOUNTONES_4(VAL) ctpop32(VAL)
+#define fCOUNTONES_8(VAL) ctpop64(VAL)
+#define fBREV_8(VAL) revbit64(VAL)
+#define fBREV_4(VAL) revbit32(VAL)
+#define fCL1_8(VAL) clo64(VAL)
+#define fCL1_4(VAL) clo32(VAL)
+#define fINTERLEAVE(ODD, EVEN) interleave(ODD, EVEN)
+#define fDEINTERLEAVE(MIXED) deinterleave(MIXED)
+#define fHIDE(A) A
+#define fCONSTLL(A) A##LL
+#define fECHO(A) (A)
+
+#define fTRAP(TRAPTYPE, IMM) helper_raise_exception(env, HEX_EXCP_TRAP0)
+
+#define fALIGN_REG_FIELD_VALUE(FIELD, VAL) \
+    ((VAL) << reg_field_info[FIELD].offset)
+#define fGET_REG_FIELD_MASK(FIELD) \
+    (((1 << reg_field_info[FIELD].width) - 1) << reg_field_info[FIELD].offset)
+#define fREAD_REG_FIELD(REG, FIELD) \
+    fEXTRACTU_BITS(env->gpr[HEX_REG_##REG], \
+                   reg_field_info[FIELD].width, \
+                   reg_field_info[FIELD].offset)
+#define fGET_FIELD(VAL, FIELD)
+#define fSET_FIELD(VAL, FIELD, NEWVAL)
+#define fBARRIER()
+#define fSYNCH()
+#define fISYNC()
+#define fDCFETCH(REG) do { REG = REG; } while (0) /* Nothing to do in qemu */
+#define fICINVA(REG) do { REG = REG; } while (0) /* Nothing to do in qemu */
+#define fL2FETCH(ADDR, HEIGHT, WIDTH, STRIDE, FLAGS)
+#define fDCCLEANA(REG) do { REG = REG; } while (0) /* Nothing to do in qemu */
+#define fDCCLEANINVA(REG) \
+    do { REG = REG; } while (0) /* Nothing to do in qemu */
+
+#define fDCZEROA(REG) do { env->dczero_addr = (REG); } while (0)
+
+#define fBRANCH_SPECULATE_STALL(DOTNEWVAL, JUMP_COND, SPEC_DIR, HINTBITNUM, \
+                                STRBITNUM) /* Nothing */
+
+
+#endif
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index f86d45b..2ea4e70 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -88,27 +88,6 @@ static inline void log_pred_write(CPUHexagonState *env, int pnum,
     }
 }
 
-static inline void log_store32(CPUHexagonState *env, target_ulong addr,
-                               target_ulong val, int width, int slot)
-{
-    HEX_DEBUG_LOG("log_store%d(0x" TARGET_FMT_lx ", " TARGET_FMT_ld
-                  " [0x" TARGET_FMT_lx "])\n",
-                  width, addr, val, val);
-    env->mem_log_stores[slot].va = addr;
-    env->mem_log_stores[slot].width = width;
-    env->mem_log_stores[slot].data32 = val;
-}
-
-static inline void log_store64(CPUHexagonState *env, target_ulong addr,
-                               int64_t val, int width, int slot)
-{
-    HEX_DEBUG_LOG("log_store%d(0x" TARGET_FMT_lx ", %ld [0x%lx])\n",
-                   width, addr, val, val);
-    env->mem_log_stores[slot].va = addr;
-    env->mem_log_stores[slot].width = width;
-    env->mem_log_stores[slot].data64 = val;
-}
-
 static inline void write_new_pc(CPUHexagonState *env, target_ulong addr)
 {
     HEX_DEBUG_LOG("write_new_pc(0x" TARGET_FMT_lx ")\n", addr);
-- 
2.7.4



* [RFC PATCH v3 27/34] Hexagon (target/hexagon) instruction classes
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (25 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-29  1:37   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 28/34] Hexagon (target/hexagon) TCG generation helpers Taylor Simpson
                   ` (8 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Determine the legal VLIW slots for each instruction from its instruction
class (iclass), with per-opcode attribute overrides

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
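As a rough sketch of the approach, assuming a three-class stand-in for
imported/iclass.def and two hypothetical attribute flags (every DEMO_* and
demo_* name below is made up for illustration and is not in this series):
one class list is expanded twice, once into an enum and once into a table of
slot strings, and per-opcode restrictions are checked before falling back to
the class default.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical class list standing in for imported/iclass.def. */
#define DEMO_ICLASSES(X) \
    X(LD, 01)            \
    X(ALU64, 23)         \
    X(ALU32_2op, 0123)

/* First expansion: one enumerator per class. */
typedef enum {
#define X(TYPE, SLOTS) DEMO_ICLASS_##TYPE,
    DEMO_ICLASSES(X)
#undef X
    DEMO_NUM_ICLASSES
} demo_iclass_t;

/* Second expansion: the slot list is stringized, so 01 becomes "01". */
static const char *const demo_slots[DEMO_NUM_ICLASSES] = {
#define X(TYPE, SLOTS) [DEMO_ICLASS_##TYPE] = #SLOTS,
    DEMO_ICLASSES(X)
#undef X
};

/* Hypothetical per-opcode flags, standing in for GET_ATTRIB() checks. */
enum {
    DEMO_ATTR_SLOT0ONLY = 1 << 0,
    DEMO_ATTR_CALL      = 1 << 1,
};

/* Restrictions win over the class default; the first match is returned. */
static const char *demo_find_slots(unsigned attribs, demo_iclass_t cls)
{
    assert(cls < DEMO_NUM_ICLASSES);
    if (attribs & DEMO_ATTR_SLOT0ONLY) {
        return "0";
    } else if (attribs & DEMO_ATTR_CALL) {
        return "23";
    }
    return demo_slots[cls];
}

/* A slot is legal if its digit appears in the slot string. */
static int demo_slot_is_legal(const char *slots, int slot)
{
    return slot >= 0 && slot <= 3 && strchr(slots, '0' + slot) != NULL;
}
```

With these definitions, demo_find_slots(0, DEMO_ICLASS_LD) yields "01", and
demo_slot_is_legal("23", 0) is false. Returning the slots as a string of
digits keeps the table readable at the cost of a strchr() per query.
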
 target/hexagon/iclass.h            | 46 ++++++++++++++++++++
 target/hexagon/iclass.c            | 88 ++++++++++++++++++++++++++++++++++++++
 target/hexagon/imported/iclass.def | 52 ++++++++++++++++++++++
 3 files changed, 186 insertions(+)
 create mode 100644 target/hexagon/iclass.h
 create mode 100644 target/hexagon/iclass.c
 create mode 100644 target/hexagon/imported/iclass.def

diff --git a/target/hexagon/iclass.h b/target/hexagon/iclass.h
new file mode 100644
index 0000000..89288ac
--- /dev/null
+++ b/target/hexagon/iclass.h
@@ -0,0 +1,46 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_ICLASS_H
+#define HEXAGON_ICLASS_H
+
+#include "opcodes.h"
+
+#define ICLASS_FROM_TYPE(TYPE) ICLASS_##TYPE
+
+typedef enum {
+
+#define DEF_PP_ICLASS32(TYPE, SLOTS, UNITS)    ICLASS_FROM_TYPE(TYPE),
+#define DEF_EE_ICLASS32(TYPE, SLOTS, UNITS)    /* nothing */
+#include "imported/iclass.def"
+#undef DEF_PP_ICLASS32
+#undef DEF_EE_ICLASS32
+
+#define DEF_EE_ICLASS32(TYPE, SLOTS, UNITS)    ICLASS_FROM_TYPE(TYPE),
+#define DEF_PP_ICLASS32(TYPE, SLOTS, UNITS)    /* nothing */
+#include "imported/iclass.def"
+#undef DEF_PP_ICLASS32
+#undef DEF_EE_ICLASS32
+
+    ICLASS_FROM_TYPE(COPROC_VX),
+    ICLASS_FROM_TYPE(COPROC_VMEM),
+    NUM_ICLASSES
+} iclass_t;
+
+extern const char *find_iclass_slots(opcode_t opcode, int itype);
+
+#endif
diff --git a/target/hexagon/iclass.c b/target/hexagon/iclass.c
new file mode 100644
index 0000000..32c2236
--- /dev/null
+++ b/target/hexagon/iclass.c
@@ -0,0 +1,88 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "iclass.h"
+
+typedef struct {
+    const char * const slots;
+} iclass_info_t;
+
+static const iclass_info_t iclass_info[] = {
+
+#define DEF_EE_ICLASS32(TYPE, SLOTS, UNITS)    /* nothing */
+#define DEF_PP_ICLASS32(TYPE, SLOTS, UNITS) \
+    [ICLASS_FROM_TYPE(TYPE)] = { .slots = #SLOTS },
+
+#include "imported/iclass.def"
+#undef DEF_PP_ICLASS32
+#undef DEF_EE_ICLASS32
+
+#define DEF_PP_ICLASS32(TYPE, SLOTS, UNITS)    /* nothing */
+#define DEF_EE_ICLASS32(TYPE, SLOTS, UNITS) \
+    [ICLASS_FROM_TYPE(TYPE)] = { .slots = #SLOTS },
+
+#include "imported/iclass.def"
+#undef DEF_PP_ICLASS32
+#undef DEF_EE_ICLASS32
+
+    {0}
+};
+
+const char *find_iclass_slots(opcode_t opcode, int itype)
+{
+    /* There are some exceptions to what the iclass dictates */
+    if (GET_ATTRIB(opcode, A_ICOP)) {
+        return "2";
+    } else if (GET_ATTRIB(opcode, A_RESTRICT_SLOT0ONLY)) {
+        return "0";
+    } else if (GET_ATTRIB(opcode, A_RESTRICT_SLOT1ONLY)) {
+        return "1";
+    } else if (GET_ATTRIB(opcode, A_RESTRICT_SLOT2ONLY)) {
+        return "2";
+    } else if (GET_ATTRIB(opcode, A_RESTRICT_SLOT3ONLY)) {
+        return "3";
+    } else if (GET_ATTRIB(opcode, A_COF) &&
+               GET_ATTRIB(opcode, A_INDIRECT) &&
+               !GET_ATTRIB(opcode, A_MEMLIKE) &&
+               !GET_ATTRIB(opcode, A_MEMLIKE_PACKET_RULES)) {
+        return "2";
+    } else if (GET_ATTRIB(opcode, A_RESTRICT_NOSLOT1)) {
+        return "0";
+    } else if ((opcode == J2_trap0) ||
+               (opcode == Y2_isync) ||
+               (opcode == J4_hintjumpr)) {
+        return "2";
+    } else if ((itype == ICLASS_V2LDST) && (GET_ATTRIB(opcode, A_STORE))) {
+        return "01";
+    } else if ((itype == ICLASS_V2LDST) && (!GET_ATTRIB(opcode, A_STORE))) {
+        return "01";
+    } else if (GET_ATTRIB(opcode, A_CRSLOT23)) {
+        return "23";
+    } else if (GET_ATTRIB(opcode, A_RESTRICT_PREFERSLOT0)) {
+        return "0";
+    } else if (GET_ATTRIB(opcode, A_SUBINSN)) {
+        return "01";
+    } else if (GET_ATTRIB(opcode, A_CALL)) {
+        return "23";
+    } else if ((opcode == J4_jumpseti) || (opcode == J4_jumpsetr)) {
+        return "23";
+    } else {
+        return iclass_info[itype].slots;
+    }
+}
+
diff --git a/target/hexagon/imported/iclass.def b/target/hexagon/imported/iclass.def
new file mode 100644
index 0000000..4ef725f
--- /dev/null
+++ b/target/hexagon/imported/iclass.def
@@ -0,0 +1,52 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/* DEF_*(TYPE,SLOTS,UNITS) */
+DEF_PP_ICLASS32(EXTENDER,0123,LDST|SUNIT|MUNIT) /* 0 */
+DEF_PP_ICLASS32(CJ,0123,CTRLFLOW) /* 1 */
+DEF_PP_ICLASS32(NCJ,01,LDST|CTRLFLOW) /* 2 */
+DEF_PP_ICLASS32(V4LDST,01,LDST) /* 3 */
+DEF_PP_ICLASS32(V2LDST,01,LDST) /* 4 */
+DEF_PP_ICLASS32(J,0123,CTRLFLOW)  /* 5 */
+DEF_PP_ICLASS32(CR,3,SUNIT)     /* 6 */
+DEF_PP_ICLASS32(ALU32_2op,0123,LDST|SUNIT|MUNIT) /* 7 */
+DEF_PP_ICLASS32(S_2op,23,SUNIT|MUNIT)               /* 8 */
+DEF_PP_ICLASS32(LD,01,LDST)                    /* 9 */
+DEF_PP_ICLASS32(ST,01,LDST)                        /* 10 */
+DEF_PP_ICLASS32(ALU32_ADDI,0123,LDST|SUNIT|MUNIT) /* 11 */
+DEF_PP_ICLASS32(S_3op,23,SUNIT|MUNIT)               /* 12 */
+DEF_PP_ICLASS32(ALU64,23,SUNIT|MUNIT)             /* 13 */
+DEF_PP_ICLASS32(M,23,SUNIT|MUNIT)                 /* 14 */
+DEF_PP_ICLASS32(ALU32_3op,0123,LDST|SUNIT|MUNIT) /* 15 */
+
+DEF_EE_ICLASS32(EE0,01,INVALID) /* 0 */
+DEF_EE_ICLASS32(EE1,01,INVALID) /* 1 */
+DEF_EE_ICLASS32(EE2,01,INVALID) /* 2 */
+DEF_EE_ICLASS32(EE3,01,INVALID) /* 3 */
+DEF_EE_ICLASS32(EE4,01,INVALID) /* 4 */
+DEF_EE_ICLASS32(EE5,01,INVALID) /* 5 */
+DEF_EE_ICLASS32(EE6,01,INVALID) /* 6 */
+DEF_EE_ICLASS32(EE7,01,INVALID) /* 7 */
+DEF_EE_ICLASS32(EE8,01,INVALID) /* 8 */
+DEF_EE_ICLASS32(EE9,01,INVALID) /* 9 */
+DEF_EE_ICLASS32(EEA,01,INVALID) /* 10 */
+DEF_EE_ICLASS32(EEB,01,INVALID) /* 11 */
+DEF_EE_ICLASS32(EEC,01,INVALID) /* 12 */
+DEF_EE_ICLASS32(EED,01,INVALID) /* 13 */
+DEF_EE_ICLASS32(EEE,01,INVALID) /* 14 */
+DEF_EE_ICLASS32(EEF,01,INVALID) /* 15 */
+
-- 
2.7.4



* [RFC PATCH v3 28/34] Hexagon (target/hexagon) TCG generation helpers
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (26 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 27/34] Hexagon (target/hexagon) instruction classes Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-29  1:48   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 29/34] Hexagon (target/hexagon) TCG generation Taylor Simpson
                   ` (7 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Add helpers for reading and writing registers and predicates, and for
load-locked/store-conditional

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
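For readers less familiar with TCG, here is a plain-C sketch of the values
that the code emitted by gen_log_pred_write and gen_read_p3_0 computes at run
time. It assumes four predicate registers; DemoPredState and the demo_*
functions are hypothetical stand-ins for the hex_pred, hex_new_pred_value and
hex_pred_written globals, not code from this series.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NUM_PREGS 4    /* assumption: four predicate registers */

typedef struct {
    uint32_t pred[DEMO_NUM_PREGS];           /* committed predicate values */
    uint32_t new_pred_value[DEMO_NUM_PREGS]; /* values pending in the packet */
    uint32_t pred_written;                   /* one bit per written preg */
} DemoPredState;

/*
 * Model of the movcond in gen_log_pred_write: the first write in a packet
 * stores the low byte of val, later writes to the same preg are ANDed in.
 */
static void demo_log_pred_write(DemoPredState *s, int pnum, uint32_t val)
{
    uint32_t base_val = val & 0xff;

    assert(pnum >= 0 && pnum < DEMO_NUM_PREGS);
    if (s->pred_written & (1u << pnum)) {
        s->new_pred_value[pnum] &= base_val;
    } else {
        s->new_pred_value[pnum] = base_val;
    }
    s->pred_written |= 1u << pnum;
}

/*
 * Model of gen_read_p3_0: pack the low byte of each predicate into one
 * 32-bit value, with P0 in the least significant byte.
 */
static uint32_t demo_read_p3_0(const DemoPredState *s)
{
    uint32_t control_reg = 0;

    for (int i = DEMO_NUM_PREGS - 1; i >= 0; i--) {
        control_reg = (control_reg << 8) | (s->pred[i] & 0xff);
    }
    return control_reg;
}
```

Writing 0xff and then 0x0f to the same predicate within one packet leaves
0x0f pending, and predicates 0x11, 0x22, 0x33, 0x44 (P0 to P3) pack to
0x44332211.
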
 target/hexagon/genptr_helpers.h | 244 ++++++++++++++++++++++++++++++++++++++++
 target/hexagon/op_helper.c      |  18 +++
 2 files changed, 262 insertions(+)
 create mode 100644 target/hexagon/genptr_helpers.h

diff --git a/target/hexagon/genptr_helpers.h b/target/hexagon/genptr_helpers.h
new file mode 100644
index 0000000..ffcb1e3
--- /dev/null
+++ b/target/hexagon/genptr_helpers.h
@@ -0,0 +1,244 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_GENPTR_HELPERS_H
+#define HEXAGON_GENPTR_HELPERS_H
+
+#include "tcg/tcg.h"
+
+static inline TCGv gen_read_reg(TCGv result, int num)
+{
+    tcg_gen_mov_tl(result, hex_gpr[num]);
+    return result;
+}
+
+static inline TCGv gen_read_preg(TCGv pred, uint8_t num)
+{
+    tcg_gen_mov_tl(pred, hex_pred[num]);
+    return pred;
+}
+
+static inline void gen_log_reg_write(int rnum, TCGv val, int slot,
+                                     int is_predicated)
+{
+    if (is_predicated) {
+        TCGv one = tcg_const_tl(1);
+        TCGv zero = tcg_const_tl(0);
+        TCGv slot_mask = tcg_temp_new();
+
+        tcg_gen_andi_tl(slot_mask, hex_slot_cancelled, 1 << slot);
+        tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum], slot_mask, zero,
+                           val, hex_new_value[rnum]);
+#if HEX_DEBUG
+        /* Do this so HELPER(debug_commit_end) will know */
+        tcg_gen_movcond_tl(TCG_COND_EQ, hex_reg_written[rnum], slot_mask, zero,
+                           one, hex_reg_written[rnum]);
+#endif
+
+        tcg_temp_free(one);
+        tcg_temp_free(zero);
+        tcg_temp_free(slot_mask);
+    } else {
+        tcg_gen_mov_tl(hex_new_value[rnum], val);
+#if HEX_DEBUG
+        /* Do this so HELPER(debug_commit_end) will know */
+        tcg_gen_movi_tl(hex_reg_written[rnum], 1);
+#endif
+    }
+}
+
+static inline void gen_log_reg_write_pair(int rnum, TCGv_i64 val, int slot,
+                                          int is_predicated)
+{
+    TCGv val32 = tcg_temp_new();
+
+    if (is_predicated) {
+        TCGv one = tcg_const_tl(1);
+        TCGv zero = tcg_const_tl(0);
+        TCGv slot_mask = tcg_temp_new();
+
+        tcg_gen_andi_tl(slot_mask, hex_slot_cancelled, 1 << slot);
+        /* Low word */
+        tcg_gen_extrl_i64_i32(val32, val);
+        tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum], slot_mask, zero,
+                           val32, hex_new_value[rnum]);
+#if HEX_DEBUG
+        /* Do this so HELPER(debug_commit_end) will know */
+        tcg_gen_movcond_tl(TCG_COND_EQ, hex_reg_written[rnum],
+                           slot_mask, zero,
+                           one, hex_reg_written[rnum]);
+#endif
+
+        /* High word */
+        tcg_gen_extrh_i64_i32(val32, val);
+        tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum + 1],
+                           slot_mask, zero,
+                           val32, hex_new_value[rnum + 1]);
+#if HEX_DEBUG
+        /* Do this so HELPER(debug_commit_end) will know */
+        tcg_gen_movcond_tl(TCG_COND_EQ, hex_reg_written[rnum + 1],
+                           slot_mask, zero,
+                           one, hex_reg_written[rnum + 1]);
+#endif
+
+        tcg_temp_free(one);
+        tcg_temp_free(zero);
+        tcg_temp_free(slot_mask);
+    } else {
+        /* Low word */
+        tcg_gen_extrl_i64_i32(val32, val);
+        tcg_gen_mov_tl(hex_new_value[rnum], val32);
+#if HEX_DEBUG
+        /* Do this so HELPER(debug_commit_end) will know */
+        tcg_gen_movi_tl(hex_reg_written[rnum], 1);
+#endif
+
+        /* High word */
+        tcg_gen_extrh_i64_i32(val32, val);
+        tcg_gen_mov_tl(hex_new_value[rnum + 1], val32);
+#if HEX_DEBUG
+        /* Do this so HELPER(debug_commit_end) will know */
+        tcg_gen_movi_tl(hex_reg_written[rnum + 1], 1);
+#endif
+    }
+
+    tcg_temp_free(val32);
+}
+
+static inline void gen_log_pred_write(int pnum, TCGv val)
+{
+    TCGv zero = tcg_const_tl(0);
+    TCGv base_val = tcg_temp_new();
+    TCGv and_val = tcg_temp_new();
+    TCGv pred_written = tcg_temp_new();
+
+    /* Multiple writes to the same preg are and'ed together */
+    tcg_gen_andi_tl(base_val, val, 0xff);
+    tcg_gen_and_tl(and_val, base_val, hex_new_pred_value[pnum]);
+    tcg_gen_andi_tl(pred_written, hex_pred_written, 1 << pnum);
+    tcg_gen_movcond_tl(TCG_COND_NE, hex_new_pred_value[pnum],
+                       pred_written, zero,
+                       and_val, base_val);
+    tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << pnum);
+
+    tcg_temp_free(zero);
+    tcg_temp_free(base_val);
+    tcg_temp_free(and_val);
+    tcg_temp_free(pred_written);
+}
+
+static inline void gen_read_p3_0(TCGv control_reg)
+{
+    TCGv pval = tcg_temp_new();
+    int i;
+    tcg_gen_movi_tl(control_reg, 0);
+    for (i = NUM_PREGS - 1; i >= 0; i--) {
+        tcg_gen_shli_tl(control_reg, control_reg, 8);
+        tcg_gen_andi_tl(pval, hex_pred[i], 0xff);
+        tcg_gen_or_tl(control_reg, control_reg, pval);
+    }
+    tcg_temp_free(pval);
+}
+
+static inline void gen_write_p3_0(TCGv tmp)
+{
+    TCGv control_reg = tcg_temp_new();
+    TCGv pred_val = tcg_temp_new();
+    int i;
+
+    tcg_gen_mov_tl(control_reg, tmp);
+    for (i = 0; i < NUM_PREGS; i++) {
+        tcg_gen_andi_tl(pred_val, control_reg, 0xff);
+        tcg_gen_mov_tl(hex_pred[i], pred_val);
+        tcg_gen_shri_tl(control_reg, control_reg, 8);
+    }
+    tcg_temp_free(control_reg);
+    tcg_temp_free(pred_val);
+}
+
+static inline void gen_load_locked4u(TCGv dest, TCGv vaddr, int mem_index)
+{
+    tcg_gen_qemu_ld32u(dest, vaddr, mem_index);
+    tcg_gen_mov_tl(hex_llsc_addr, vaddr);
+    tcg_gen_mov_tl(hex_llsc_val, dest);
+}
+
+static inline void gen_load_locked8u(TCGv_i64 dest, TCGv vaddr, int mem_index)
+{
+    tcg_gen_qemu_ld64(dest, vaddr, mem_index);
+    tcg_gen_mov_tl(hex_llsc_addr, vaddr);
+    tcg_gen_mov_i64(hex_llsc_val_i64, dest);
+}
+
+static inline void gen_store_conditional4(CPUHexagonState *env,
+                                          DisasContext *ctx, int prednum,
+                                          TCGv pred, TCGv vaddr, TCGv src)
+{
+    TCGLabel *fail = gen_new_label();
+    TCGLabel *done = gen_new_label();
+
+    tcg_gen_brcond_tl(TCG_COND_NE, vaddr, hex_llsc_addr, fail);
+
+    TCGv one = tcg_const_tl(0xff);
+    TCGv zero = tcg_const_tl(0);
+    TCGv tmp = tcg_temp_new();
+    tcg_gen_atomic_cmpxchg_tl(tmp, hex_llsc_addr, hex_llsc_val, src,
+                              ctx->mem_idx, MO_32);
+    tcg_gen_movcond_tl(TCG_COND_EQ, hex_pred[prednum], tmp, hex_llsc_val,
+                       one, zero);
+    tcg_temp_free(one);
+    tcg_temp_free(zero);
+    tcg_temp_free(tmp);
+    tcg_gen_br(done);
+
+    gen_set_label(fail);
+    tcg_gen_movi_tl(pred, 0);
+
+    gen_set_label(done);
+    tcg_gen_movi_tl(hex_llsc_addr, ~0);
+}
+
+static inline void gen_store_conditional8(CPUHexagonState *env,
+                                          DisasContext *ctx, int prednum,
+                                          TCGv pred, TCGv vaddr, TCGv_i64 src)
+{
+    TCGLabel *fail = gen_new_label();
+    TCGLabel *done = gen_new_label();
+
+    tcg_gen_brcond_tl(TCG_COND_NE, vaddr, hex_llsc_addr, fail);
+
+    TCGv_i64 one = tcg_const_i64(0xff);
+    TCGv_i64 zero = tcg_const_i64(0);
+    TCGv_i64 tmp = tcg_temp_new_i64();
+    tcg_gen_atomic_cmpxchg_i64(tmp, hex_llsc_addr, hex_llsc_val_i64, src,
+                               ctx->mem_idx, MO_64);
+    tcg_gen_movcond_i64(TCG_COND_EQ, tmp, tmp, hex_llsc_val_i64,
+                        one, zero);
+    tcg_gen_extrl_i64_i32(hex_pred[prednum], tmp);
+    tcg_temp_free_i64(one);
+    tcg_temp_free_i64(zero);
+    tcg_temp_free_i64(tmp);
+    tcg_gen_br(done);
+
+    gen_set_label(fail);
+    tcg_gen_movi_tl(pred, 0);
+
+    gen_set_label(done);
+    tcg_gen_movi_tl(hex_llsc_addr, ~0);
+}
+
+#endif
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index 2ea4e70..a234cf6 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -88,6 +88,24 @@ static inline void log_pred_write(CPUHexagonState *env, int pnum,
     }
 }
 
+static inline void log_store32(CPUHexagonState *env, target_ulong addr,
+                               int32_t val, int width, int slot)
+{
+    HEX_DEBUG_LOG("log_store%d(0x%x, %d [0x%x])\n", width, addr, val, val);
+    env->mem_log_stores[slot].va = addr;
+    env->mem_log_stores[slot].width = width;
+    env->mem_log_stores[slot].data32 = val;
+}
+
+static inline void log_store64(CPUHexagonState *env, target_ulong addr,
+                               int64_t val, int width, int slot)
+{
+    HEX_DEBUG_LOG("log_store%d(0x%x, %ld [0x%lx])\n", width, addr, val, val);
+    env->mem_log_stores[slot].va = addr;
+    env->mem_log_stores[slot].width = width;
+    env->mem_log_stores[slot].data64 = val;
+}
+
 static inline void write_new_pc(CPUHexagonState *env, target_ulong addr)
 {
     HEX_DEBUG_LOG("write_new_pc(0x" TARGET_FMT_lx ")\n", addr);
-- 
2.7.4
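
The predicate helpers in the patch above are easier to follow when
restated in plain C instead of TCG ops. The sketch below is only an
illustration of the same arithmetic: repeated writes to one predicate
within a packet are AND'ed together, and P3:0 is packed with p0 in the
low byte. The sketch_* names and the struct are hypothetical, and the
predicate count of 4 is an assumption.

```c
#include <stdint.h>

#define SKETCH_NUM_PREGS 4  /* assumption: p0..p3 */

/* Hypothetical stand-in for the per-packet predicate state */
struct sketch_state {
    uint32_t pred[SKETCH_NUM_PREGS];
    uint32_t new_pred_value[SKETCH_NUM_PREGS];
    uint32_t pred_written;  /* one bit per predicate register */
};

/* cf. gen_log_pred_write: a second write in the same packet is AND'ed */
void sketch_log_pred_write(struct sketch_state *s, int pnum, uint32_t val)
{
    uint32_t base_val = val & 0xff;

    if (s->pred_written & (1u << pnum)) {
        s->new_pred_value[pnum] &= base_val;
    } else {
        s->new_pred_value[pnum] = base_val;
    }
    s->pred_written |= 1u << pnum;
}

/* cf. gen_read_p3_0: pack p3..p0 into one word, p0 in the low byte */
uint32_t sketch_read_p3_0(const struct sketch_state *s)
{
    uint32_t control_reg = 0;
    int i;

    for (i = SKETCH_NUM_PREGS - 1; i >= 0; i--) {
        control_reg = (control_reg << 8) | (s->pred[i] & 0xff);
    }
    return control_reg;
}

/* cf. gen_write_p3_0: unpack one byte per predicate, starting with p0 */
void sketch_write_p3_0(struct sketch_state *s, uint32_t value)
{
    int i;

    for (i = 0; i < SKETCH_NUM_PREGS; i++) {
        s->pred[i] = value & 0xff;
        value >>= 8;
    }
}
```

Writing a packed value with sketch_write_p3_0 and reading it back with
sketch_read_p3_0 returns the same word, which is the round trip the two
TCG helpers are meant to provide.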



* [RFC PATCH v3 29/34] Hexagon (target/hexagon) TCG generation
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (27 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 28/34] Hexagon (target/hexagon) TCG generation helpers Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-29  1:58   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions Taylor Simpson
                   ` (6 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Include the generated files and set up the data structures

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/genptr.h | 25 ++++++++++++++++++++++
 target/hexagon/genptr.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 80 insertions(+)
 create mode 100644 target/hexagon/genptr.h
 create mode 100644 target/hexagon/genptr.c
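
The table setup in genptr.c follows the usual X-macro pattern: the same
list is expanded once to define one function per tag and again to
register those functions in a table that starts out all NULL. Below is
a minimal standalone sketch of that pattern. It assumes inline list
macros in place of the generated headers, uses a simplified function
signature, and every SKETCH_*/sketch_* name and the three opcodes are
made up for illustration.

```c
#include <stddef.h>

/* Hypothetical opcode list; stands in for opcodes_def_generated.h */
#define SKETCH_OPCODE_LIST(X) \
    X(OP_ADD) \
    X(OP_SUB) \
    X(OP_NOP)

/* Hypothetical subset with generators; stands in for tcg_funcs_generated.h */
#define SKETCH_TCG_FUNC_LIST(X) \
    X(OP_ADD, return a + b;) \
    X(OP_SUB, return a - b;)

enum {
#define OPCODE(X) X,
    SKETCH_OPCODE_LIST(OPCODE)
#undef OPCODE
    SKETCH_OPCODE_COUNT
};

typedef int (*sketch_insn_fn)(int a, int b);

/* First expansion: one static function per tag */
#define DEF_TCG_FUNC(TAG, GENFN) \
static int generate_##TAG(int a, int b) \
{ \
    GENFN \
}
SKETCH_TCG_FUNC_LIST(DEF_TCG_FUNC)
#undef DEF_TCG_FUNC

/* Every slot starts out NULL, plus one trailing NULL */
sketch_insn_fn sketch_genptr[SKETCH_OPCODE_COUNT + 1] = {
#define OPCODE(X) NULL,
    SKETCH_OPCODE_LIST(OPCODE)
#undef OPCODE
    NULL
};

/* Second expansion: overwrite only the tags that have a generator */
void sketch_init_genptr(void)
{
#define DEF_TCG_FUNC(TAG, GENFN) \
    sketch_genptr[TAG] = generate_##TAG;
    SKETCH_TCG_FUNC_LIST(DEF_TCG_FUNC)
#undef DEF_TCG_FUNC
}
```

After sketch_init_genptr() runs, OP_ADD and OP_SUB have entries and
OP_NOP stays NULL, so a caller can test the pointer to detect an opcode
without a generator.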

diff --git a/target/hexagon/genptr.h b/target/hexagon/genptr.h
new file mode 100644
index 0000000..a3a3db1
--- /dev/null
+++ b/target/hexagon/genptr.h
@@ -0,0 +1,25 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_GENPTR_H
+#define HEXAGON_GENPTR_H
+
+#include "insn.h"
+
+extern semantic_insn_t opcode_genptr[];
+
+#endif
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
new file mode 100644
index 0000000..a85fc14
--- /dev/null
+++ b/target/hexagon/genptr.c
@@ -0,0 +1,55 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define QEMU_GENERATE
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "cpu.h"
+#include "internal.h"
+#include "tcg/tcg-op.h"
+#include "insn.h"
+#include "opcodes.h"
+#include "translate.h"
+#include "macros.h"
+#include "genptr_helpers.h"
+
+#define DEF_TCG_FUNC(TAG, GENFN) \
+static void generate_##TAG(CPUHexagonState *env, DisasContext *ctx, \
+                           insn_t *insn, packet_t *pkt) \
+{ \
+    GENFN \
+}
+#include "tcg_funcs_generated.h"
+#undef DEF_TCG_FUNC
+
+
+/* Fill in the table with NULLs because not all opcodes have a DEF_TCG_FUNC */
+semantic_insn_t opcode_genptr[] = {
+#define OPCODE(X)                              NULL
+#include "opcodes_def_generated.h"
+    NULL
+#undef OPCODE
+};
+
+/* This function overwrites the NULL entries where we have a DEF_TCG_FUNC */
+void init_genptr(void)
+{
+#define DEF_TCG_FUNC(TAG, GENFN) \
+    opcode_genptr[TAG] = generate_##TAG;
+#include "tcg_funcs_generated.h"
+#undef DEF_TCG_FUNC
+}
-- 
2.7.4



* [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (28 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 29/34] Hexagon (target/hexagon) TCG generation Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-29  2:02   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 31/34] Hexagon (target/hexagon) translation Taylor Simpson
                   ` (5 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Helpers won't work if there are multiple definitions, so we override these
instructions using #define fGEN_TCG_<tag>.

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/gen_tcg.h | 198 +++++++++++++++++++++++++++++++++++++++++++++++
 target/hexagon/helper.h  |   2 +
 target/hexagon/genptr.c  |   1 +
 3 files changed, 201 insertions(+)
 create mode 100644 target/hexagon/gen_tcg.h
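
How the generated per-instruction code picks up these defines is not
part of this patch. The sketch below assumes the consumer tests for the
override with #ifdef and otherwise falls back to a helper call; that
assumption, the fGEN_SKETCH_* macro and the sketch_* functions are all
hypothetical and only show the override mechanism in isolation.

```c
/* Hypothetical override, playing the role of an fGEN_TCG_<tag> define */
#define fGEN_SKETCH_loadw(SHORTCODE)  SHORTCODE

static int sketch_helper_calls;

/* Stand-in for an out-of-line helper call */
static int sketch_helper(int ea)
{
    sketch_helper_calls++;
    return ea + 100;
}

/* Tag with an override: the inline SHORTCODE is used */
int sketch_gen_loadw(int ea)
{
    int res;
#ifdef fGEN_SKETCH_loadw
    fGEN_SKETCH_loadw(res = ea * 2);
#else
    res = sketch_helper(ea);
#endif
    return res;
}

/* Tag without an override: fall back to the helper */
int sketch_gen_storew(int ea)
{
    int res;
#ifdef fGEN_SKETCH_storew
    fGEN_SKETCH_storew(res = ea * 2);
#else
    res = sketch_helper(ea);
#endif
    return res;
}
```

Only sketch_gen_storew reaches sketch_helper here, because no
fGEN_SKETCH_storew define exists; adding one would switch that tag to
the inline path without touching the function body.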

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
new file mode 100644
index 0000000..35568d1
--- /dev/null
+++ b/target/hexagon/gen_tcg.h
@@ -0,0 +1,198 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_GEN_TCG_H
+#define HEXAGON_GEN_TCG_H
+
+/*
+ * Here is a primer to understand the tag names for load/store instructions
+ *
+ * Data types
+ *      b        signed byte                       r0 = memb(r2+#0)
+ *     ub        unsigned byte                     r0 = memub(r2+#0)
+ *      h        signed half word (16 bits)        r0 = memh(r2+#0)
+ *     uh        unsigned half word                r0 = memuh(r2+#0)
+ *      i        integer (32 bits)                 r0 = memw(r2+#0)
+ *      d        double word (64 bits)             r1:0 = memd(r2+#0)
+ *
+ * Addressing modes
+ *     _io       indirect with offset              r0 = memw(r1+#4)
+ *     _ur       absolute with register offset     r0 = memw(r1<<#4+##variable)
+ *     _rr       indirect with register offset     r0 = memw(r1+r4<<#2)
+ *     gp        global pointer relative           r0 = memw(gp+#200)
+ *     _sp       stack pointer relative            r0 = memw(r29+#12)
+ *     _ap       absolute set                      r0 = memw(r1=##variable)
+ *     _pr       post increment register           r0 = memw(r1++m1)
+ *     _pi       post increment immediate          r0 = memb(r1++#1)
+ */
+
+/* Macros for complex addressing modes */
+#define GET_EA_ap \
+    do { \
+        fEA_IMM(UiV); \
+        tcg_gen_movi_tl(ReV, UiV); \
+    } while (0)
+#define GET_EA_pr \
+    do { \
+        fEA_REG(RxV); \
+        fPM_M(RxV, MuV); \
+    } while (0)
+#define GET_EA_pi \
+    do { \
+        fEA_REG(RxV); \
+        fPM_I(RxV, siV); \
+    } while (0)
+
+
+/* Instructions with multiple definitions */
+#define fGEN_TCG_LOAD_AP(RES, SIZE, SIGN) \
+    do { \
+        fMUST_IMMEXT(UiV); \
+        fEA_IMM(UiV); \
+        fLOAD(1, SIZE, SIGN, EA, RES); \
+        tcg_gen_movi_tl(ReV, UiV); \
+    } while (0)
+
+#define fGEN_TCG_L4_loadrub_ap(SHORTCODE) \
+    fGEN_TCG_LOAD_AP(RdV, 1, u)
+#define fGEN_TCG_L4_loadrb_ap(SHORTCODE) \
+    fGEN_TCG_LOAD_AP(RdV, 1, s)
+#define fGEN_TCG_L4_loadruh_ap(SHORTCODE) \
+    fGEN_TCG_LOAD_AP(RdV, 2, u)
+#define fGEN_TCG_L4_loadrh_ap(SHORTCODE) \
+    fGEN_TCG_LOAD_AP(RdV, 2, s)
+#define fGEN_TCG_L4_loadri_ap(SHORTCODE) \
+    fGEN_TCG_LOAD_AP(RdV, 4, u)
+#define fGEN_TCG_L4_loadrd_ap(SHORTCODE) \
+    fGEN_TCG_LOAD_AP(RddV, 8, u)
+
+#define fGEN_TCG_L2_loadrub_pr(SHORTCODE)      SHORTCODE
+#define fGEN_TCG_L2_loadrub_pi(SHORTCODE)      SHORTCODE
+#define fGEN_TCG_L2_loadrb_pr(SHORTCODE)       SHORTCODE
+#define fGEN_TCG_L2_loadrb_pi(SHORTCODE)       SHORTCODE
+#define fGEN_TCG_L2_loadruh_pr(SHORTCODE)      SHORTCODE
+#define fGEN_TCG_L2_loadruh_pi(SHORTCODE)      SHORTCODE
+#define fGEN_TCG_L2_loadrh_pr(SHORTCODE)       SHORTCODE
+#define fGEN_TCG_L2_loadrh_pi(SHORTCODE)       SHORTCODE
+#define fGEN_TCG_L2_loadri_pr(SHORTCODE)       SHORTCODE
+#define fGEN_TCG_L2_loadri_pi(SHORTCODE)       SHORTCODE
+#define fGEN_TCG_L2_loadrd_pr(SHORTCODE)       SHORTCODE
+#define fGEN_TCG_L2_loadrd_pi(SHORTCODE)       SHORTCODE
+
+/*
+ * Predicated loads
+ * Here is a primer to understand the tag names
+ *
+ * Predicate used
+ *      t        true "old" value                  if (p0) r0 = memb(r2+#0)
+ *      f        false "old" value                 if (!p0) r0 = memb(r2+#0)
+ *      tnew     true "new" value                  if (p0.new) r0 = memb(r2+#0)
+ *      fnew     false "new" value                 if (!p0.new) r0 = memb(r2+#0)
+ */
+#define fGEN_TCG_PRED_LOAD(GET_EA, PRED, SIZE, SIGN) \
+    do { \
+        TCGv LSB = tcg_temp_local_new(); \
+        TCGLabel *label = gen_new_label(); \
+        GET_EA; \
+        PRED;  \
+        PRED_LOAD_CANCEL(LSB, EA); \
+        tcg_gen_movi_tl(RdV, 0); \
+        tcg_gen_brcondi_tl(TCG_COND_EQ, LSB, 0, label); \
+            fLOAD(1, SIZE, SIGN, EA, RdV); \
+        gen_set_label(label); \
+        tcg_temp_free(LSB); \
+    } while (0)
+
+#define fGEN_TCG_L2_ploadrubt_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBOLD(PtV), 1, u)
+#define fGEN_TCG_L2_ploadrubf_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBOLDNOT(PtV), 1, u)
+#define fGEN_TCG_L2_ploadrubtnew_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBNEW(PtN), 1, u)
+#define fGEN_TCG_L2_ploadrubfnew_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBNEWNOT(PtN), 1, u)
+#define fGEN_TCG_L2_ploadrbt_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBOLD(PtV), 1, s)
+#define fGEN_TCG_L2_ploadrbf_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBOLDNOT(PtV), 1, s)
+#define fGEN_TCG_L2_ploadrbtnew_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBNEW(PtN), 1, s)
+#define fGEN_TCG_L2_ploadrbfnew_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD({ fEA_REG(RxV); fPM_I(RxV, siV); }, \
+                       fLSBNEWNOT(PtN), 1, s)
+
+#define fGEN_TCG_L2_ploadruht_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBOLD(PtV), 2, u)
+#define fGEN_TCG_L2_ploadruhf_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBOLDNOT(PtV), 2, u)
+#define fGEN_TCG_L2_ploadruhtnew_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBNEW(PtN), 2, u)
+#define fGEN_TCG_L2_ploadruhfnew_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBNEWNOT(PtN), 2, u)
+#define fGEN_TCG_L2_ploadrht_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBOLD(PtV), 2, s)
+#define fGEN_TCG_L2_ploadrhf_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBOLDNOT(PtV), 2, s)
+#define fGEN_TCG_L2_ploadrhtnew_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBNEW(PtN), 2, s)
+#define fGEN_TCG_L2_ploadrhfnew_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBNEWNOT(PtN), 2, s)
+
+#define fGEN_TCG_L2_ploadrit_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBOLD(PtV), 4, u)
+#define fGEN_TCG_L2_ploadrif_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBOLDNOT(PtV), 4, u)
+#define fGEN_TCG_L2_ploadritnew_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBNEW(PtN), 4, u)
+#define fGEN_TCG_L2_ploadrifnew_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD(GET_EA_pi, fLSBNEWNOT(PtN), 4, u)
+
+/* Predicated loads into a register pair */
+#define fGEN_TCG_PRED_LOAD_PAIR(GET_EA, PRED) \
+    do { \
+        TCGv LSB = tcg_temp_local_new(); \
+        TCGLabel *label = gen_new_label(); \
+        GET_EA; \
+        PRED;  \
+        PRED_LOAD_CANCEL(LSB, EA); \
+        tcg_gen_movi_i64(RddV, 0); \
+        tcg_gen_brcondi_tl(TCG_COND_EQ, LSB, 0, label); \
+            fLOAD(1, 8, u, EA, RddV); \
+        gen_set_label(label); \
+        tcg_temp_free(LSB); \
+    } while (0)
+
+#define fGEN_TCG_L2_ploadrdt_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD_PAIR(GET_EA_pi, fLSBOLD(PtV))
+#define fGEN_TCG_L2_ploadrdf_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD_PAIR(GET_EA_pi, fLSBOLDNOT(PtV))
+#define fGEN_TCG_L2_ploadrdtnew_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD_PAIR(GET_EA_pi, fLSBNEW(PtN))
+#define fGEN_TCG_L2_ploadrdfnew_pi(SHORTCODE) \
+    fGEN_TCG_PRED_LOAD_PAIR(GET_EA_pi, fLSBNEWNOT(PtN))
+
+/* load-locked and store-locked */
+#define fGEN_TCG_L2_loadw_locked(SHORTCODE) \
+    SHORTCODE
+#define fGEN_TCG_L4_loadd_locked(SHORTCODE) \
+    SHORTCODE
+#define fGEN_TCG_S2_storew_locked(SHORTCODE) \
+    do { SHORTCODE; READ_PREG(PdV, PdN); } while (0)
+#define fGEN_TCG_S4_stored_locked(SHORTCODE) \
+    do { SHORTCODE; READ_PREG(PdV, PdN); } while (0)
+
+#endif
diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
index 48b1917..cee902b 100644
--- a/target/hexagon/helper.h
+++ b/target/hexagon/helper.h
@@ -15,6 +15,8 @@
  *  along with this program; if not, see <http://www.gnu.org/licenses/>.
  */
 
+#include "gen_tcg.h"
+
 DEF_HELPER_2(raise_exception, noreturn, env, i32)
 DEF_HELPER_1(debug_start_packet, void, env)
 DEF_HELPER_3(debug_check_store_width, void, env, int, int)
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index a85fc14..1808660 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -26,6 +26,7 @@
 #include "translate.h"
 #include "macros.h"
 #include "genptr_helpers.h"
+#include "gen_tcg.h"
 
 #define DEF_TCG_FUNC(TAG, GENFN) \
 static void generate_##TAG(CPUHexagonState *env, DisasContext *ctx, \
-- 
2.7.4



* [RFC PATCH v3 31/34] Hexagon (target/hexagon) translation
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (29 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-29  2:49   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 32/34] Hexagon (linux-user/hexagon) Linux user emulation Taylor Simpson
                   ` (4 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Read the instruction memory
Create a packet data structure
Generate TCG code for the start of the packet
Invoke the generate function for each instruction
Generate TCG code for the end of the packet

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/translate.h | 103 +++++++
 target/hexagon/translate.c | 730 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 833 insertions(+)
 create mode 100644 target/hexagon/translate.h
 create mode 100644 target/hexagon/translate.c
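
The first step listed above, reading the instruction memory, has to
cope with packets that run up to a page boundary. The following
standalone sketch mirrors the control flow of read_packet_words() with
plain arrays in place of cpu_ldl_code(). The 4 KiB page size, the
4-word packet limit and the end-of-packet test on bits [15:14] are
assumptions made for the example, and all SKETCH_*/sketch_* names are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_BITS        12  /* assumed 4 KiB pages */
#define SKETCH_PAGE_MASK        (~((UINT32_C(1) << SKETCH_PAGE_BITS) - 1))
#define SKETCH_PACKET_WORDS_MAX 4   /* assumed packet limit */

enum sketch_status { SKETCH_OK, SKETCH_INVALID_PACKET, SKETCH_DEFER };

/* Assumption: parse bits [15:14] of 0b11 or 0b00 mark the last word */
static bool sketch_is_packet_end(uint32_t word)
{
    uint32_t parse = (word >> 14) & 0x3;
    return parse == 0x3 || parse == 0x0;
}

/* Whole words between pc and the end of the page that contains it */
size_t sketch_words_left_in_page(uint32_t pc)
{
    return (UINT32_C(0) - (pc | SKETCH_PAGE_MASK)) / sizeof(uint32_t);
}

/*
 * Scan code[] (the words starting at pc) for the end of a packet.
 * Returns the word count on success, or 0 with *status explaining why.
 */
size_t sketch_read_packet_words(const uint32_t *code, uint32_t pc,
                                bool first_in_tb,
                                enum sketch_status *status)
{
    size_t max_words = sketch_words_left_in_page(pc);
    size_t nwords;
    bool found_end = false;

    /* The first packet in a TB is allowed to cross a page boundary */
    if (max_words > SKETCH_PACKET_WORDS_MAX || first_in_tb) {
        max_words = SKETCH_PACKET_WORDS_MAX;
    }

    for (nwords = 0; !found_end && nwords < max_words; nwords++) {
        found_end = sketch_is_packet_end(code[nwords]);
    }
    if (!found_end) {
        /* Either the packet is too long or it continues on the next page */
        *status = (nwords == SKETCH_PACKET_WORDS_MAX)
                  ? SKETCH_INVALID_PACKET : SKETCH_DEFER;
        return 0;
    }
    *status = SKETCH_OK;
    return nwords;
}
```

A packet that starts two words before the end of a page and is three
words long is deferred unless it is the first packet in the TB, which
is the case the DISAS_TOO_MANY path in the patch handles.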

diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
new file mode 100644
index 0000000..144140f
--- /dev/null
+++ b/target/hexagon/translate.h
@@ -0,0 +1,103 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_TRANSLATE_H
+#define HEXAGON_TRANSLATE_H
+
+#include "cpu.h"
+#include "exec/translator.h"
+#include "tcg/tcg-op.h"
+#include "internal.h"
+
+typedef struct DisasContext {
+    DisasContextBase base;
+    uint32_t mem_idx;
+    int reg_log[REG_WRITES_MAX];
+    int reg_log_idx;
+    int preg_log[PRED_WRITES_MAX];
+    int preg_log_idx;
+    uint8_t store_width[STORES_MAX];
+} DisasContext;
+
+static inline void ctx_log_reg_write(DisasContext *ctx, int rnum)
+{
+#if HEX_DEBUG
+    int i;
+    for (i = 0; i < ctx->reg_log_idx; i++) {
+        if (ctx->reg_log[i] == rnum) {
+            HEX_DEBUG_LOG("WARNING: Multiple writes to r%d\n", rnum);
+        }
+    }
+#endif
+    ctx->reg_log[ctx->reg_log_idx] = rnum;
+    ctx->reg_log_idx++;
+}
+
+static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
+{
+    ctx->preg_log[ctx->preg_log_idx] = pnum;
+    ctx->preg_log_idx++;
+}
+
+static inline bool is_preloaded(DisasContext *ctx, int num)
+{
+    int i;
+    for (i = 0; i < ctx->reg_log_idx; i++) {
+        if (ctx->reg_log[i] == num) {
+            return true;
+        }
+    }
+    return false;
+}
+
+extern TCGv hex_gpr[TOTAL_PER_THREAD_REGS];
+extern TCGv hex_pred[NUM_PREGS];
+extern TCGv hex_next_PC;
+extern TCGv hex_this_PC;
+extern TCGv hex_slot_cancelled;
+extern TCGv hex_branch_taken;
+extern TCGv hex_new_value[TOTAL_PER_THREAD_REGS];
+extern TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
+extern TCGv hex_new_pred_value[NUM_PREGS];
+extern TCGv hex_pred_written;
+extern TCGv hex_store_addr[STORES_MAX];
+extern TCGv hex_store_width[STORES_MAX];
+extern TCGv hex_store_val32[STORES_MAX];
+extern TCGv_i64 hex_store_val64[STORES_MAX];
+extern TCGv hex_dczero_addr;
+extern TCGv hex_llsc_addr;
+extern TCGv hex_llsc_val;
+extern TCGv_i64 hex_llsc_val_i64;
+
+static inline void gen_slot_cancelled_check(TCGv check, int slot_num)
+{
+    TCGv mask = tcg_const_tl(1 << slot_num);
+    TCGv one = tcg_const_tl(1);
+    TCGv zero = tcg_const_tl(0);
+
+    tcg_gen_and_tl(mask, hex_slot_cancelled, mask);
+    tcg_gen_movcond_tl(TCG_COND_NE, check, mask, zero, one, zero);
+
+    tcg_temp_free(one);
+    tcg_temp_free(zero);
+    tcg_temp_free(mask);
+}
+
+extern void gen_exception(int excp);
+extern void gen_exception_debug(void);
+
+#endif
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
new file mode 100644
index 0000000..9e3f4af
--- /dev/null
+++ b/target/hexagon/translate.c
@@ -0,0 +1,730 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define QEMU_GENERATE
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "tcg/tcg-op.h"
+#include "exec/cpu_ldst.h"
+#include "exec/log.h"
+#include "internal.h"
+#include "attribs.h"
+#include "insn.h"
+#include "decode.h"
+#include "translate.h"
+#include "printinsn.h"
+
+TCGv hex_gpr[TOTAL_PER_THREAD_REGS];
+TCGv hex_pred[NUM_PREGS];
+TCGv hex_next_PC;
+TCGv hex_this_PC;
+TCGv hex_slot_cancelled;
+TCGv hex_branch_taken;
+TCGv hex_new_value[TOTAL_PER_THREAD_REGS];
+#if HEX_DEBUG
+TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
+#endif
+TCGv hex_new_pred_value[NUM_PREGS];
+TCGv hex_pred_written;
+TCGv hex_store_addr[STORES_MAX];
+TCGv hex_store_width[STORES_MAX];
+TCGv hex_store_val32[STORES_MAX];
+TCGv_i64 hex_store_val64[STORES_MAX];
+TCGv hex_pkt_has_store_s1;
+TCGv hex_dczero_addr;
+TCGv hex_llsc_addr;
+TCGv hex_llsc_val;
+TCGv_i64 hex_llsc_val_i64;
+
+static const char * const hexagon_prednames[] = {
+  "p0", "p1", "p2", "p3"
+};
+
+void gen_exception(int excp)
+{
+    TCGv_i32 helper_tmp = tcg_const_i32(excp);
+    gen_helper_raise_exception(cpu_env, helper_tmp);
+    tcg_temp_free_i32(helper_tmp);
+}
+
+void gen_exception_debug(void)
+{
+    gen_exception(EXCP_DEBUG);
+}
+
+#if HEX_DEBUG
+#define PACKET_BUFFER_LEN              1028
+static void print_pkt(packet_t *pkt)
+{
+    char buf[PACKET_BUFFER_LEN];
+    snprint_a_pkt(buf, PACKET_BUFFER_LEN, pkt);
+    HEX_DEBUG_LOG("%s", buf);
+}
+#define HEX_DEBUG_PRINT_PKT(pkt)  print_pkt(pkt)
+#else
+#define HEX_DEBUG_PRINT_PKT(pkt)  /* nothing */
+#endif
+
+static int read_packet_words(CPUHexagonState *env, DisasContext *ctx,
+                             uint32_t words[])
+{
+    bool found_end = false;
+    int max_words;
+    int nwords;
+    int i;
+
+    /* Make sure we don't cross a page boundary */
+    max_words = -(ctx->base.pc_next | TARGET_PAGE_MASK) / sizeof(uint32_t);
+    if (max_words < PACKET_WORDS_MAX) {
+        /* Might cross a page boundary */
+        if (ctx->base.num_insns == 1) {
+            /* OK if it's the first packet in the TB */
+            max_words = PACKET_WORDS_MAX;
+        }
+    } else {
+        max_words = PACKET_WORDS_MAX;
+    }
+
+    memset(words, 0, PACKET_WORDS_MAX * sizeof(uint32_t));
+    for (nwords = 0; !found_end && nwords < max_words; nwords++) {
+        words[nwords] = cpu_ldl_code(env,
+                                ctx->base.pc_next + nwords * sizeof(uint32_t));
+        found_end = is_packet_end(words[nwords]);
+    }
+    if (!found_end) {
+        if (nwords == PACKET_WORDS_MAX) {
+            /* Read too many words without finding the end */
+            gen_exception(HEX_EXCP_INVALID_PACKET);
+            ctx->base.is_jmp = DISAS_NORETURN;
+            return 0;
+        }
+        /* Crosses page boundary - defer to next TB */
+        ctx->base.is_jmp = DISAS_TOO_MANY;
+        return 0;
+    }
+
+    HEX_DEBUG_LOG("decode_packet: pc = 0x%x\n", ctx->base.pc_next);
+    HEX_DEBUG_LOG("    words = { ");
+    for (i = 0; i < nwords; i++) {
+        HEX_DEBUG_LOG("0x%x, ", words[i]);
+    }
+    HEX_DEBUG_LOG("}\n");
+
+    return nwords;
+}
+
+static void gen_start_packet(DisasContext *ctx, packet_t *pkt)
+{
+    target_ulong next_PC = ctx->base.pc_next + pkt->encod_pkt_size_in_bytes;
+    int i;
+
+    /* Clear out the disassembly context */
+    ctx->reg_log_idx = 0;
+    ctx->preg_log_idx = 0;
+    for (i = 0; i < STORES_MAX; i++) {
+        ctx->store_width[i] = 0;
+    }
+    tcg_gen_movi_tl(hex_pkt_has_store_s1, pkt->pkt_has_store_s1);
+
+#if HEX_DEBUG
+    /* Handy place to set a breakpoint before the packet executes */
+    gen_helper_debug_start_packet(cpu_env);
+    tcg_gen_movi_tl(hex_this_PC, ctx->base.pc_next);
+#endif
+
+    /* Initialize the runtime state for packet semantics */
+    tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], ctx->base.pc_next);
+    tcg_gen_movi_tl(hex_slot_cancelled, 0);
+    if (pkt->pkt_has_cof) {
+        tcg_gen_movi_tl(hex_branch_taken, 0);
+        tcg_gen_movi_tl(hex_next_PC, next_PC);
+    }
+    tcg_gen_movi_tl(hex_pred_written, 0);
+}
+
+/*
+ * The LOG_*_WRITE macros mark most of the writes in a packet.
+ * However, there are some implicit writes marked as attributes
+ * of the applicable instructions.
+ */
+static void mark_implicit_reg_write(DisasContext *ctx, insn_t *insn,
+                                    int attrib, int rnum)
+{
+    if (GET_ATTRIB(insn->opcode, attrib)) {
+        int is_predicated = GET_ATTRIB(insn->opcode, A_CONDEXEC);
+        if (is_predicated && !is_preloaded(ctx, rnum)) {
+            tcg_gen_mov_tl(hex_new_value[rnum], hex_gpr[rnum]);
+        }
+
+        ctx->reg_log[ctx->reg_log_idx] = rnum;
+        ctx->reg_log_idx++;
+    }
+}
+
+static void mark_implicit_pred_write(DisasContext *ctx, insn_t *insn,
+                                     int attrib, int pnum)
+{
+    if (GET_ATTRIB(insn->opcode, attrib)) {
+        ctx->preg_log[ctx->preg_log_idx] = pnum;
+        ctx->preg_log_idx++;
+    }
+}
+
+static void mark_implicit_writes(DisasContext *ctx, insn_t *insn)
+{
+    mark_implicit_reg_write(ctx, insn, A_IMPLICIT_WRITES_FP,  HEX_REG_FP);
+    mark_implicit_reg_write(ctx, insn, A_IMPLICIT_WRITES_SP,  HEX_REG_SP);
+    mark_implicit_reg_write(ctx, insn, A_IMPLICIT_WRITES_LR,  HEX_REG_LR);
+    mark_implicit_reg_write(ctx, insn, A_IMPLICIT_WRITES_LC0, HEX_REG_LC0);
+    mark_implicit_reg_write(ctx, insn, A_IMPLICIT_WRITES_SA0, HEX_REG_SA0);
+    mark_implicit_reg_write(ctx, insn, A_IMPLICIT_WRITES_LC1, HEX_REG_LC1);
+    mark_implicit_reg_write(ctx, insn, A_IMPLICIT_WRITES_SA1, HEX_REG_SA1);
+
+    mark_implicit_pred_write(ctx, insn, A_IMPLICIT_WRITES_P0, 0);
+    mark_implicit_pred_write(ctx, insn, A_IMPLICIT_WRITES_P1, 1);
+    mark_implicit_pred_write(ctx, insn, A_IMPLICIT_WRITES_P2, 2);
+    mark_implicit_pred_write(ctx, insn, A_IMPLICIT_WRITES_P3, 3);
+}
+
+static void gen_insn(CPUHexagonState *env, DisasContext *ctx,
+                     insn_t *insn, packet_t *pkt)
+{
+    if (insn->generate) {
+        mark_implicit_writes(ctx, insn);
+        insn->generate(env, ctx, insn, pkt);
+    } else {
+        gen_exception(HEX_EXCP_INVALID_OPCODE);
+        ctx->base.is_jmp = DISAS_NORETURN;
+    }
+}
+
+/*
+ * Helpers for generating the packet commit
+ */
+static void gen_reg_writes(DisasContext *ctx)
+{
+    int i;
+
+    for (i = 0; i < ctx->reg_log_idx; i++) {
+        int reg_num = ctx->reg_log[i];
+
+        tcg_gen_mov_tl(hex_gpr[reg_num], hex_new_value[reg_num]);
+    }
+}
+
+static void gen_pred_writes(DisasContext *ctx, packet_t *pkt)
+{
+    /* Early exit if the log is empty */
+    if (!ctx->preg_log_idx) {
+        return;
+    }
+
+    TCGv zero = tcg_const_tl(0);
+    TCGv control_reg = tcg_temp_new();
+    TCGv pval = tcg_temp_new();
+    int i;
+
+    /*
+     * Only endloop instructions will conditionally
+     * write a predicate.  If there are no endloop
+     * instructions, we can use the non-conditional
+     * write of the predicates.
+     */
+    if (pkt->pkt_has_endloop) {
+        TCGv pred_written = tcg_temp_new();
+        for (i = 0; i < ctx->preg_log_idx; i++) {
+            int pred_num = ctx->preg_log[i];
+
+            tcg_gen_andi_tl(pred_written, hex_pred_written, 1 << pred_num);
+            tcg_gen_movcond_tl(TCG_COND_NE, hex_pred[pred_num],
+                               pred_written, zero,
+                               hex_new_pred_value[pred_num],
+                               hex_pred[pred_num]);
+        }
+        tcg_temp_free(pred_written);
+    } else {
+        for (i = 0; i < ctx->preg_log_idx; i++) {
+            int pred_num = ctx->preg_log[i];
+            tcg_gen_mov_tl(hex_pred[pred_num], hex_new_pred_value[pred_num]);
+#if HEX_DEBUG
+            /* Set the bit so HELPER(debug_commit_end) sees this write */
+            tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << pred_num);
+#endif
+        }
+    }
+
+    tcg_temp_free(zero);
+    tcg_temp_free(control_reg);
+    tcg_temp_free(pval);
+}
+
+#if HEX_DEBUG
+static inline void gen_check_store_width(DisasContext *ctx, int slot_num)
+{
+    TCGv slot = tcg_const_tl(slot_num);
+    TCGv check = tcg_const_tl(ctx->store_width[slot_num]);
+    gen_helper_debug_check_store_width(cpu_env, slot, check);
+    tcg_temp_free(slot);
+    tcg_temp_free(check);
+}
+#define HEX_DEBUG_GEN_CHECK_STORE_WIDTH(ctx, slot_num) \
+    gen_check_store_width(ctx, slot_num)
+#else
+#define HEX_DEBUG_GEN_CHECK_STORE_WIDTH(ctx, slot_num)  /* nothing */
+#endif
+
+static void process_store(DisasContext *ctx, int slot_num)
+{
+    TCGv cancelled = tcg_temp_local_new();
+    TCGLabel *label_end = gen_new_label();
+
+    /* Don't do anything if the slot was cancelled */
+    gen_slot_cancelled_check(cancelled, slot_num);
+    tcg_gen_brcondi_tl(TCG_COND_NE, cancelled, 0, label_end);
+    {
+        int ctx_width = ctx->store_width[slot_num];
+        TCGv address = tcg_temp_local_new();
+        tcg_gen_mov_tl(address, hex_store_addr[slot_num]);
+
+        /*
+         * If we know the width from the DisasContext, we can
+         * generate much cleaner code.
+         * Unfortunately, not all instructions execute the fSTORE
+         * macro during code generation.  Anything that uses the
+         * generic helper will have this problem.  Instructions
+         * that use fWRAP to generate proper TCG code will be OK.
+         */
+        TCGv value;
+        TCGv_i64 value_i64;
+        TCGLabel *label_w2;
+        TCGLabel *label_w4;
+        TCGLabel *label_w8;
+        switch (ctx_width) {
+        case 1:
+            HEX_DEBUG_GEN_CHECK_STORE_WIDTH(ctx, slot_num);
+            value = tcg_temp_new();
+            tcg_gen_mov_tl(value, hex_store_val32[slot_num]);
+            tcg_gen_qemu_st8(value, address, ctx->mem_idx);
+            tcg_temp_free(value);
+            break;
+        case 2:
+            HEX_DEBUG_GEN_CHECK_STORE_WIDTH(ctx, slot_num);
+            value = tcg_temp_new();
+            tcg_gen_mov_tl(value, hex_store_val32[slot_num]);
+            tcg_gen_qemu_st16(value, address, ctx->mem_idx);
+            tcg_temp_free(value);
+            break;
+        case 4:
+            HEX_DEBUG_GEN_CHECK_STORE_WIDTH(ctx, slot_num);
+            value = tcg_temp_new();
+            tcg_gen_mov_tl(value, hex_store_val32[slot_num]);
+            tcg_gen_qemu_st32(value, address, ctx->mem_idx);
+            tcg_temp_free(value);
+            break;
+        case 8:
+            HEX_DEBUG_GEN_CHECK_STORE_WIDTH(ctx, slot_num);
+            value_i64 = tcg_temp_new_i64();
+            tcg_gen_mov_i64(value_i64, hex_store_val64[slot_num]);
+            tcg_gen_qemu_st64(value_i64, address, ctx->mem_idx);
+            tcg_temp_free_i64(value_i64);
+            break;
+        default:
+            /*
+             * If we get here, the width is not known at TCG
+             * generation time, so we generate branches that
+             * test the width at runtime.
+             */
+            label_w2 = gen_new_label();
+            label_w4 = gen_new_label();
+            label_w8 = gen_new_label();
+            TCGv width = tcg_temp_local_new();
+
+            tcg_gen_mov_tl(width, hex_store_width[slot_num]);
+            tcg_gen_brcondi_tl(TCG_COND_NE, width, 1, label_w2);
+            {
+                /* Width is 1 byte */
+                TCGv value = tcg_temp_new();
+                tcg_gen_mov_tl(value, hex_store_val32[slot_num]);
+                tcg_gen_qemu_st8(value, address, ctx->mem_idx);
+                tcg_gen_br(label_end);
+                tcg_temp_free(value);
+            }
+            gen_set_label(label_w2);
+            tcg_gen_brcondi_tl(TCG_COND_NE, width, 2, label_w4);
+            {
+                /* Width is 2 bytes */
+                TCGv value = tcg_temp_new();
+                tcg_gen_mov_tl(value, hex_store_val32[slot_num]);
+                tcg_gen_qemu_st16(value, address, ctx->mem_idx);
+                tcg_gen_br(label_end);
+                tcg_temp_free(value);
+            }
+            gen_set_label(label_w4);
+            tcg_gen_brcondi_tl(TCG_COND_NE, width, 4, label_w8);
+            {
+                /* Width is 4 bytes */
+                TCGv value = tcg_temp_new();
+                tcg_gen_mov_tl(value, hex_store_val32[slot_num]);
+                tcg_gen_qemu_st32(value, address, ctx->mem_idx);
+                tcg_gen_br(label_end);
+                tcg_temp_free(value);
+            }
+            gen_set_label(label_w8);
+            {
+                /* Width is 8 bytes */
+                TCGv_i64 value = tcg_temp_new_i64();
+                tcg_gen_mov_i64(value, hex_store_val64[slot_num]);
+                tcg_gen_qemu_st64(value, address, ctx->mem_idx);
+                tcg_gen_br(label_end);
+                tcg_temp_free_i64(value);
+            }
+
+            tcg_temp_free(width);
+        }
+        tcg_temp_free(address);
+    }
+    gen_set_label(label_end);
+
+    tcg_temp_free(cancelled);
+}
+
+static void process_store_log(DisasContext *ctx, packet_t *pkt)
+{
+    /*
+     *  When a packet has two stores, the hardware processes
+     *  slot 1 and then slot 0.  This will be important when
+     *  the memory accesses overlap.
+     */
+    if (pkt->pkt_has_store_s1 && !pkt->pkt_has_dczeroa) {
+        process_store(ctx, 1);
+    }
+    if (pkt->pkt_has_store_s0 && !pkt->pkt_has_dczeroa) {
+        process_store(ctx, 0);
+    }
+}
+
+/* Zero out a 32-byte cache line */
+static void process_dczeroa(DisasContext *ctx, packet_t *pkt)
+{
+    if (pkt->pkt_has_dczeroa) {
+        /* Store 32 bytes of zero starting at (addr & ~0x1f) */
+        TCGv addr = tcg_temp_new();
+        TCGv_i64 zero = tcg_const_i64(0);
+
+        tcg_gen_andi_tl(addr, hex_dczero_addr, ~0x1f);
+        tcg_gen_qemu_st64(zero, addr, ctx->mem_idx);
+        tcg_gen_addi_tl(addr, addr, 8);
+        tcg_gen_qemu_st64(zero, addr, ctx->mem_idx);
+        tcg_gen_addi_tl(addr, addr, 8);
+        tcg_gen_qemu_st64(zero, addr, ctx->mem_idx);
+        tcg_gen_addi_tl(addr, addr, 8);
+        tcg_gen_qemu_st64(zero, addr, ctx->mem_idx);
+
+        tcg_temp_free(addr);
+        tcg_temp_free_i64(zero);
+    }
+}
+
+static bool process_change_of_flow(DisasContext *ctx, packet_t *pkt)
+{
+    if (pkt->pkt_has_cof) {
+        tcg_gen_mov_tl(hex_gpr[HEX_REG_PC], hex_next_PC);
+        return true;
+    }
+    return false;
+}
+
+static void gen_exec_counters(packet_t *pkt)
+{
+    int num_insns = pkt->num_insns;
+    int num_real_insns = 0;
+    int i;
+
+    for (i = 0; i < num_insns; i++) {
+        if (!pkt->insn[i].is_endloop &&
+            !pkt->insn[i].part1 &&
+            !GET_ATTRIB(pkt->insn[i].opcode, A_IT_NOP)) {
+            num_real_insns++;
+        }
+    }
+
+    tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_PKT_CNT],
+                    hex_gpr[HEX_REG_QEMU_PKT_CNT], 1);
+    if (num_real_insns) {
+        tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_INSN_CNT],
+                        hex_gpr[HEX_REG_QEMU_INSN_CNT], num_real_insns);
+    }
+}
+
+static void gen_commit_packet(DisasContext *ctx, packet_t *pkt)
+{
+    bool end_tb = false;
+
+    gen_reg_writes(ctx);
+    gen_pred_writes(ctx, pkt);
+    process_store_log(ctx, pkt);
+    process_dczeroa(ctx, pkt);
+    end_tb |= process_change_of_flow(ctx, pkt);
+    gen_exec_counters(pkt);
+#if HEX_DEBUG
+    {
+        TCGv has_st0 =
+            tcg_const_tl(pkt->pkt_has_store_s0 && !pkt->pkt_has_dczeroa);
+        TCGv has_st1 =
+            tcg_const_tl(pkt->pkt_has_store_s1 && !pkt->pkt_has_dczeroa);
+
+        /* Handy place to set a breakpoint at the end of execution */
+        gen_helper_debug_commit_end(cpu_env, has_st0, has_st1);
+
+        tcg_temp_free(has_st0);
+        tcg_temp_free(has_st1);
+    }
+#endif
+
+    if (end_tb) {
+        tcg_gen_exit_tb(NULL, 0);
+        ctx->base.is_jmp = DISAS_NORETURN;
+    }
+}
+
+static void decode_and_translate_packet(CPUHexagonState *env, DisasContext *ctx)
+{
+    uint32_t words[PACKET_WORDS_MAX];
+    int nwords;
+    packet_t pkt;
+    int i;
+
+    nwords = read_packet_words(env, ctx, words);
+    if (!nwords) {
+        return;
+    }
+
+    if (decode_this(nwords, words, &pkt)) {
+        HEX_DEBUG_PRINT_PKT(&pkt);
+        gen_start_packet(ctx, &pkt);
+        for (i = 0; i < pkt.num_insns; i++) {
+            gen_insn(env, ctx, &pkt.insn[i], &pkt);
+        }
+        gen_commit_packet(ctx, &pkt);
+        ctx->base.pc_next += pkt.encod_pkt_size_in_bytes;
+    } else {
+        gen_exception(HEX_EXCP_INVALID_PACKET);
+        ctx->base.is_jmp = DISAS_NORETURN;
+    }
+}
+
+static void hexagon_tr_init_disas_context(DisasContextBase *dcbase,
+                                          CPUState *cs)
+{
+    DisasContext *ctx = container_of(dcbase, DisasContext, base);
+
+    ctx->mem_idx = MMU_USER_IDX;
+}
+
+static void hexagon_tr_tb_start(DisasContextBase *db, CPUState *cpu)
+{
+}
+
+static void hexagon_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
+{
+    DisasContext *ctx = container_of(dcbase, DisasContext, base);
+
+    tcg_gen_insn_start(ctx->base.pc_next);
+}
+
+static bool hexagon_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                        const CPUBreakpoint *bp)
+{
+    DisasContext *ctx = container_of(dcbase, DisasContext, base);
+
+    tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], ctx->base.pc_next);
+    ctx->base.is_jmp = DISAS_NORETURN;
+    gen_exception_debug();
+    /*
+     * The address covered by the breakpoint must be included in
+     * [tb->pc, tb->pc + tb->size) in order for it to be
+     * properly cleared -- thus we increment the PC here so that
+     * the logic setting tb->size below does the right thing.
+     */
+    ctx->base.pc_next += 4;
+    return true;
+}
+
+
+static void hexagon_tr_translate_packet(DisasContextBase *dcbase, CPUState *cpu)
+{
+    DisasContext *ctx = container_of(dcbase, DisasContext, base);
+    CPUHexagonState *env = cpu->env_ptr;
+
+    decode_and_translate_packet(env, ctx);
+
+    if (ctx->base.is_jmp == DISAS_NEXT) {
+        target_ulong page_start;
+
+        page_start = ctx->base.pc_first & TARGET_PAGE_MASK;
+        if (ctx->base.pc_next - page_start >= TARGET_PAGE_SIZE) {
+            ctx->base.is_jmp = DISAS_TOO_MANY;
+        }
+
+        /*
+         * The CPU log is used to compare against LLDB single stepping,
+         * so end the TB after every packet.
+         */
+        if (qemu_loglevel_mask(CPU_LOG_TB_CPU)) {
+            ctx->base.is_jmp = DISAS_TOO_MANY;
+        }
+#if HEX_DEBUG
+        /* When debugging, only put one packet per TB */
+        ctx->base.is_jmp = DISAS_TOO_MANY;
+#endif
+    }
+}
+
+static void hexagon_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
+{
+    DisasContext *ctx = container_of(dcbase, DisasContext, base);
+
+    switch (ctx->base.is_jmp) {
+    case DISAS_TOO_MANY:
+        tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], ctx->base.pc_next);
+        if (ctx->base.singlestep_enabled) {
+            gen_exception_debug();
+        } else {
+            tcg_gen_exit_tb(NULL, 0);
+        }
+        break;
+    case DISAS_NORETURN:
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static void hexagon_tr_disas_log(const DisasContextBase *dcbase, CPUState *cpu)
+{
+    qemu_log("IN: %s\n", lookup_symbol(dcbase->pc_first));
+    log_target_disas(cpu, dcbase->pc_first, dcbase->tb->size);
+}
+
+
+static const TranslatorOps hexagon_tr_ops = {
+    .init_disas_context = hexagon_tr_init_disas_context,
+    .tb_start           = hexagon_tr_tb_start,
+    .insn_start         = hexagon_tr_insn_start,
+    .breakpoint_check   = hexagon_tr_breakpoint_check,
+    .translate_insn     = hexagon_tr_translate_packet,
+    .tb_stop            = hexagon_tr_tb_stop,
+    .disas_log          = hexagon_tr_disas_log,
+};
+
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+{
+    DisasContext ctx;
+
+    translator_loop(&hexagon_tr_ops, &ctx.base, cs, tb, max_insns);
+}
+
+#define NAME_LEN               64
+static char new_value_names[TOTAL_PER_THREAD_REGS][NAME_LEN];
+#if HEX_DEBUG
+static char reg_written_names[TOTAL_PER_THREAD_REGS][NAME_LEN];
+#endif
+static char new_pred_value_names[NUM_PREGS][NAME_LEN];
+static char store_addr_names[STORES_MAX][NAME_LEN];
+static char store_width_names[STORES_MAX][NAME_LEN];
+static char store_val32_names[STORES_MAX][NAME_LEN];
+static char store_val64_names[STORES_MAX][NAME_LEN];
+
+void hexagon_translate_init(void)
+{
+    int i;
+
+    opcode_init();
+
+    for (i = 0; i < TOTAL_PER_THREAD_REGS; i++) {
+        hex_gpr[i] = tcg_global_mem_new(cpu_env,
+            offsetof(CPUHexagonState, gpr[i]),
+            hexagon_regnames[i]);
+
+        snprintf(new_value_names[i], NAME_LEN, "new_%s", hexagon_regnames[i]);
+        hex_new_value[i] = tcg_global_mem_new(cpu_env,
+            offsetof(CPUHexagonState, new_value[i]),
+            new_value_names[i]);
+
+#if HEX_DEBUG
+        snprintf(reg_written_names[i], NAME_LEN, "reg_written_%s",
+                 hexagon_regnames[i]);
+        hex_reg_written[i] = tcg_global_mem_new(cpu_env,
+            offsetof(CPUHexagonState, reg_written[i]),
+            reg_written_names[i]);
+#endif
+    }
+    for (i = 0; i < NUM_PREGS; i++) {
+        hex_pred[i] = tcg_global_mem_new(cpu_env,
+            offsetof(CPUHexagonState, pred[i]),
+            hexagon_prednames[i]);
+
+        snprintf(new_pred_value_names[i], NAME_LEN, "new_pred_%s",
+                 hexagon_prednames[i]);
+        hex_new_pred_value[i] = tcg_global_mem_new(cpu_env,
+            offsetof(CPUHexagonState, new_pred_value[i]),
+            new_pred_value_names[i]);
+    }
+    hex_pred_written = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, pred_written), "pred_written");
+    hex_next_PC = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, next_PC), "next_PC");
+    hex_this_PC = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, this_PC), "this_PC");
+    hex_slot_cancelled = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, slot_cancelled), "slot_cancelled");
+    hex_branch_taken = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, branch_taken), "branch_taken");
+    hex_pkt_has_store_s1 = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, pkt_has_store_s1), "pkt_has_store_s1");
+    hex_dczero_addr = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, dczero_addr), "dczero_addr");
+    hex_llsc_addr = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, llsc_addr), "llsc_addr");
+    hex_llsc_val = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, llsc_val), "llsc_val");
+    hex_llsc_val_i64 = tcg_global_mem_new_i64(cpu_env,
+        offsetof(CPUHexagonState, llsc_val_i64), "llsc_val_i64");
+    for (i = 0; i < STORES_MAX; i++) {
+        snprintf(store_addr_names[i], NAME_LEN, "store_addr_%d", i);
+        hex_store_addr[i] = tcg_global_mem_new(cpu_env,
+            offsetof(CPUHexagonState, mem_log_stores[i].va),
+            store_addr_names[i]);
+
+        snprintf(store_width_names[i], NAME_LEN, "store_width_%d", i);
+        hex_store_width[i] = tcg_global_mem_new(cpu_env,
+            offsetof(CPUHexagonState, mem_log_stores[i].width),
+            store_width_names[i]);
+
+        snprintf(store_val32_names[i], NAME_LEN, "store_val32_%d", i);
+        hex_store_val32[i] = tcg_global_mem_new(cpu_env,
+            offsetof(CPUHexagonState, mem_log_stores[i].data32),
+            store_val32_names[i]);
+
+        snprintf(store_val64_names[i], NAME_LEN, "store_val64_%d", i);
+        hex_store_val64[i] = tcg_global_mem_new_i64(cpu_env,
+            offsetof(CPUHexagonState, mem_log_stores[i].data64),
+            store_val64_names[i]);
+    }
+
+    init_genptr();
+}
-- 
2.7.4
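
Reader's note on the store commit order in process_store_log() and the
address masking in process_dczeroa(): the following is a minimal host-side
sketch, not code from this series.  StoreLogEntry, commit_store(),
commit_packet_stores() and zero_cache_line() are hypothetical names, guest
memory is modelled as a flat byte array, and stores are written
little-endian.  It only illustrates why committing slot 1 before slot 0
matters when two stores overlap, and how (addr & ~0x1f) selects the
32-byte line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one per-slot entry of the store log. */
typedef struct {
    int valid;      /* nonzero if this slot has a pending store */
    uint32_t va;    /* offset into the toy memory array */
    int width;      /* store width in bytes: 1, 2, 4 or 8 */
    uint64_t data;  /* value to store */
} StoreLogEntry;

/* Write one logged store into the toy memory, low byte first. */
static void commit_store(uint8_t *mem, size_t mem_size,
                         const StoreLogEntry *e)
{
    assert(e->width == 1 || e->width == 2 ||
           e->width == 4 || e->width == 8);
    assert((size_t)e->va + (size_t)e->width <= mem_size);

    for (int i = 0; i < e->width; i++) {
        mem[e->va + i] = (uint8_t)(e->data >> (8 * i));
    }
}

/* Commit slot 1 first, then slot 0, so slot 0 wins on overlap. */
static void commit_packet_stores(uint8_t *mem, size_t mem_size,
                                 const StoreLogEntry log[2])
{
    if (log[1].valid) {
        commit_store(mem, mem_size, &log[1]);
    }
    if (log[0].valid) {
        commit_store(mem, mem_size, &log[0]);
    }
}

/* Zero the 32-byte line that contains addr. */
static void zero_cache_line(uint8_t *mem, size_t mem_size, uint32_t addr)
{
    uint32_t line = addr & ~(uint32_t)0x1f;

    assert((size_t)line + 32 <= mem_size);
    memset(mem + line, 0, 32);
}
```

With slot 1 storing the word 0x11223344 at offset 0 and slot 0 storing the
byte 0xAA at the same offset, the array ends up as AA 33 22 11, because
slot 0 is committed last.  Calling zero_cache_line() with address 0x2b
clears offsets 32 through 63 and leaves offset 31 untouched.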



* [RFC PATCH v3 32/34] Hexagon (linux-user/hexagon) Linux user emulation
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (30 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 31/34] Hexagon (target/hexagon) translation Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-29  2:59   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 33/34] Hexagon (tests/tcg/hexagon) TCG tests Taylor Simpson
                   ` (3 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Implement Linux user emulation for Hexagon.
Some common files are modified in addition to the new files in
linux-user/hexagon.

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 linux-user/hexagon/sockbits.h       |  18 ++
 linux-user/hexagon/syscall_nr.h     | 343 ++++++++++++++++++++++++++++++++++++
 linux-user/hexagon/target_cpu.h     |  44 +++++
 linux-user/hexagon/target_elf.h     |  40 +++++
 linux-user/hexagon/target_fcntl.h   |  18 ++
 linux-user/hexagon/target_signal.h  |  34 ++++
 linux-user/hexagon/target_structs.h |  46 +++++
 linux-user/hexagon/target_syscall.h |  32 ++++
 linux-user/hexagon/termbits.h       |  18 ++
 linux-user/syscall_defs.h           |  33 ++++
 linux-user/elfload.c                |  16 ++
 linux-user/hexagon/cpu_loop.c       |  99 +++++++++++
 linux-user/hexagon/signal.c         | 276 +++++++++++++++++++++++++++++
 linux-user/syscall.c                |   2 +
 scripts/gensyscalls.sh              |   3 +-
 15 files changed, 1021 insertions(+), 1 deletion(-)
 create mode 100644 linux-user/hexagon/sockbits.h
 create mode 100644 linux-user/hexagon/syscall_nr.h
 create mode 100644 linux-user/hexagon/target_cpu.h
 create mode 100644 linux-user/hexagon/target_elf.h
 create mode 100644 linux-user/hexagon/target_fcntl.h
 create mode 100644 linux-user/hexagon/target_signal.h
 create mode 100644 linux-user/hexagon/target_structs.h
 create mode 100644 linux-user/hexagon/target_syscall.h
 create mode 100644 linux-user/hexagon/termbits.h
 create mode 100644 linux-user/hexagon/cpu_loop.c
 create mode 100644 linux-user/hexagon/signal.c

diff --git a/linux-user/hexagon/sockbits.h b/linux-user/hexagon/sockbits.h
new file mode 100644
index 0000000..a6e8966
--- /dev/null
+++ b/linux-user/hexagon/sockbits.h
@@ -0,0 +1,18 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "../generic/sockbits.h"
diff --git a/linux-user/hexagon/syscall_nr.h b/linux-user/hexagon/syscall_nr.h
new file mode 100644
index 0000000..31bf2e4
--- /dev/null
+++ b/linux-user/hexagon/syscall_nr.h
@@ -0,0 +1,343 @@
+/*
+ * This file contains the system call numbers.
+ * Do not modify.
+ * This file is generated by gensyscalls.sh
+ */
+#ifndef LINUX_USER_HEXAGON_SYSCALL_NR_H
+#define LINUX_USER_HEXAGON_SYSCALL_NR_H
+
+#define TARGET_NR_io_setup 0
+#define TARGET_NR_io_destroy 1
+#define TARGET_NR_io_submit 2
+#define TARGET_NR_io_cancel 3
+#define TARGET_NR_io_getevents 4
+#define TARGET_NR_setxattr 5
+#define TARGET_NR_lsetxattr 6
+#define TARGET_NR_fsetxattr 7
+#define TARGET_NR_getxattr 8
+#define TARGET_NR_lgetxattr 9
+#define TARGET_NR_fgetxattr 10
+#define TARGET_NR_listxattr 11
+#define TARGET_NR_llistxattr 12
+#define TARGET_NR_flistxattr 13
+#define TARGET_NR_removexattr 14
+#define TARGET_NR_lremovexattr 15
+#define TARGET_NR_fremovexattr 16
+#define TARGET_NR_getcwd 17
+#define TARGET_NR_lookup_dcookie 18
+#define TARGET_NR_eventfd2 19
+#define TARGET_NR_epoll_create1 20
+#define TARGET_NR_epoll_ctl 21
+#define TARGET_NR_epoll_pwait 22
+#define TARGET_NR_dup 23
+#define TARGET_NR_dup3 24
+#define TARGET_NR_fcntl64 25
+#define TARGET_NR_inotify_init1 26
+#define TARGET_NR_inotify_add_watch 27
+#define TARGET_NR_inotify_rm_watch 28
+#define TARGET_NR_ioctl 29
+#define TARGET_NR_ioprio_set 30
+#define TARGET_NR_ioprio_get 31
+#define TARGET_NR_flock 32
+#define TARGET_NR_mknodat 33
+#define TARGET_NR_mkdirat 34
+#define TARGET_NR_unlinkat 35
+#define TARGET_NR_symlinkat 36
+#define TARGET_NR_linkat 37
+#define TARGET_NR_renameat 38
+#define TARGET_NR_umount2 39
+#define TARGET_NR_mount 40
+#define TARGET_NR_pivot_root 41
+#define TARGET_NR_nfsservctl 42
+#define TARGET_NR_statfs64 43
+#define TARGET_NR_fstatfs64 44
+#define TARGET_NR_truncate64 45
+#define TARGET_NR_ftruncate64 46
+#define TARGET_NR_fallocate 47
+#define TARGET_NR_faccessat 48
+#define TARGET_NR_chdir 49
+#define TARGET_NR_fchdir 50
+#define TARGET_NR_chroot 51
+#define TARGET_NR_fchmod 52
+#define TARGET_NR_fchmodat 53
+#define TARGET_NR_fchownat 54
+#define TARGET_NR_fchown 55
+#define TARGET_NR_openat 56
+#define TARGET_NR_close 57
+#define TARGET_NR_vhangup 58
+#define TARGET_NR_pipe2 59
+#define TARGET_NR_quotactl 60
+#define TARGET_NR_getdents64 61
+#define TARGET_NR_llseek 62
+#define TARGET_NR_read 63
+#define TARGET_NR_write 64
+#define TARGET_NR_readv 65
+#define TARGET_NR_writev 66
+#define TARGET_NR_pread64 67
+#define TARGET_NR_pwrite64 68
+#define TARGET_NR_preadv 69
+#define TARGET_NR_pwritev 70
+#define TARGET_NR_sendfile64 71
+#define TARGET_NR_pselect6 72
+#define TARGET_NR_ppoll 73
+#define TARGET_NR_signalfd4 74
+#define TARGET_NR_vmsplice 75
+#define TARGET_NR_splice 76
+#define TARGET_NR_tee 77
+#define TARGET_NR_readlinkat 78
+#define TARGET_NR_fstatat64 79
+#define TARGET_NR_fstat64 80
+#define TARGET_NR_sync 81
+#define TARGET_NR_fsync 82
+#define TARGET_NR_fdatasync 83
+#define TARGET_NR_sync_file_range 84
+#define TARGET_NR_timerfd_create 85
+#define TARGET_NR_timerfd_settime 86
+#define TARGET_NR_timerfd_gettime 87
+#define TARGET_NR_utimensat 88
+#define TARGET_NR_acct 89
+#define TARGET_NR_capget 90
+#define TARGET_NR_capset 91
+#define TARGET_NR_personality 92
+#define TARGET_NR_exit 93
+#define TARGET_NR_exit_group 94
+#define TARGET_NR_waitid 95
+#define TARGET_NR_set_tid_address 96
+#define TARGET_NR_unshare 97
+#define TARGET_NR_futex 98
+#define TARGET_NR_set_robust_list 99
+#define TARGET_NR_get_robust_list 100
+#define TARGET_NR_nanosleep 101
+#define TARGET_NR_getitimer 102
+#define TARGET_NR_setitimer 103
+#define TARGET_NR_kexec_load 104
+#define TARGET_NR_init_module 105
+#define TARGET_NR_delete_module 106
+#define TARGET_NR_timer_create 107
+#define TARGET_NR_timer_gettime 108
+#define TARGET_NR_timer_getoverrun 109
+#define TARGET_NR_timer_settime 110
+#define TARGET_NR_timer_delete 111
+#define TARGET_NR_clock_settime 112
+#define TARGET_NR_clock_gettime 113
+#define TARGET_NR_clock_getres 114
+#define TARGET_NR_clock_nanosleep 115
+#define TARGET_NR_syslog 116
+#define TARGET_NR_ptrace 117
+#define TARGET_NR_sched_setparam 118
+#define TARGET_NR_sched_setscheduler 119
+#define TARGET_NR_sched_getscheduler 120
+#define TARGET_NR_sched_getparam 121
+#define TARGET_NR_sched_setaffinity 122
+#define TARGET_NR_sched_getaffinity 123
+#define TARGET_NR_sched_yield 124
+#define TARGET_NR_sched_get_priority_max 125
+#define TARGET_NR_sched_get_priority_min 126
+#define TARGET_NR_sched_rr_get_interval 127
+#define TARGET_NR_restart_syscall 128
+#define TARGET_NR_kill 129
+#define TARGET_NR_tkill 130
+#define TARGET_NR_tgkill 131
+#define TARGET_NR_sigaltstack 132
+#define TARGET_NR_rt_sigsuspend 133
+#define TARGET_NR_rt_sigaction 134
+#define TARGET_NR_rt_sigprocmask 135
+#define TARGET_NR_rt_sigpending 136
+#define TARGET_NR_rt_sigtimedwait 137
+#define TARGET_NR_rt_sigqueueinfo 138
+#define TARGET_NR_rt_sigreturn 139
+#define TARGET_NR_setpriority 140
+#define TARGET_NR_getpriority 141
+#define TARGET_NR_reboot 142
+#define TARGET_NR_setregid 143
+#define TARGET_NR_setgid 144
+#define TARGET_NR_setreuid 145
+#define TARGET_NR_setuid 146
+#define TARGET_NR_setresuid 147
+#define TARGET_NR_getresuid 148
+#define TARGET_NR_setresgid 149
+#define TARGET_NR_getresgid 150
+#define TARGET_NR_setfsuid 151
+#define TARGET_NR_setfsgid 152
+#define TARGET_NR_times 153
+#define TARGET_NR_setpgid 154
+#define TARGET_NR_getpgid 155
+#define TARGET_NR_getsid 156
+#define TARGET_NR_setsid 157
+#define TARGET_NR_getgroups 158
+#define TARGET_NR_setgroups 159
+#define TARGET_NR_uname 160
+#define TARGET_NR_sethostname 161
+#define TARGET_NR_setdomainname 162
+#define TARGET_NR_getrlimit 163
+#define TARGET_NR_setrlimit 164
+#define TARGET_NR_getrusage 165
+#define TARGET_NR_umask 166
+#define TARGET_NR_prctl 167
+#define TARGET_NR_getcpu 168
+#define TARGET_NR_gettimeofday 169
+#define TARGET_NR_settimeofday 170
+#define TARGET_NR_adjtimex 171
+#define TARGET_NR_getpid 172
+#define TARGET_NR_getppid 173
+#define TARGET_NR_getuid 174
+#define TARGET_NR_geteuid 175
+#define TARGET_NR_getgid 176
+#define TARGET_NR_getegid 177
+#define TARGET_NR_gettid 178
+#define TARGET_NR_sysinfo 179
+#define TARGET_NR_mq_open 180
+#define TARGET_NR_mq_unlink 181
+#define TARGET_NR_mq_timedsend 182
+#define TARGET_NR_mq_timedreceive 183
+#define TARGET_NR_mq_notify 184
+#define TARGET_NR_mq_getsetattr 185
+#define TARGET_NR_msgget 186
+#define TARGET_NR_msgctl 187
+#define TARGET_NR_msgrcv 188
+#define TARGET_NR_msgsnd 189
+#define TARGET_NR_semget 190
+#define TARGET_NR_semctl 191
+#define TARGET_NR_semtimedop 192
+#define TARGET_NR_semop 193
+#define TARGET_NR_shmget 194
+#define TARGET_NR_shmctl 195
+#define TARGET_NR_shmat 196
+#define TARGET_NR_shmdt 197
+#define TARGET_NR_socket 198
+#define TARGET_NR_socketpair 199
+#define TARGET_NR_bind 200
+#define TARGET_NR_listen 201
+#define TARGET_NR_accept 202
+#define TARGET_NR_connect 203
+#define TARGET_NR_getsockname 204
+#define TARGET_NR_getpeername 205
+#define TARGET_NR_sendto 206
+#define TARGET_NR_recvfrom 207
+#define TARGET_NR_setsockopt 208
+#define TARGET_NR_getsockopt 209
+#define TARGET_NR_shutdown 210
+#define TARGET_NR_sendmsg 211
+#define TARGET_NR_recvmsg 212
+#define TARGET_NR_readahead 213
+#define TARGET_NR_brk 214
+#define TARGET_NR_munmap 215
+#define TARGET_NR_mremap 216
+#define TARGET_NR_add_key 217
+#define TARGET_NR_request_key 218
+#define TARGET_NR_keyctl 219
+#define TARGET_NR_clone 220
+#define TARGET_NR_execve 221
+#define TARGET_NR_mmap2 222
+#define TARGET_NR_fadvise64_64 223
+#define TARGET_NR_swapon 224
+#define TARGET_NR_swapoff 225
+#define TARGET_NR_mprotect 226
+#define TARGET_NR_msync 227
+#define TARGET_NR_mlock 228
+#define TARGET_NR_munlock 229
+#define TARGET_NR_mlockall 230
+#define TARGET_NR_munlockall 231
+#define TARGET_NR_mincore 232
+#define TARGET_NR_madvise 233
+#define TARGET_NR_remap_file_pages 234
+#define TARGET_NR_mbind 235
+#define TARGET_NR_get_mempolicy 236
+#define TARGET_NR_set_mempolicy 237
+#define TARGET_NR_migrate_pages 238
+#define TARGET_NR_move_pages 239
+#define TARGET_NR_rt_tgsigqueueinfo 240
+#define TARGET_NR_perf_event_open 241
+#define TARGET_NR_accept4 242
+#define TARGET_NR_recvmmsg 243
+#define TARGET_NR_arch_specific_syscall 244
+#define TARGET_NR_wait4 260
+#define TARGET_NR_prlimit64 261
+#define TARGET_NR_fanotify_init 262
+#define TARGET_NR_fanotify_mark 263
+#define TARGET_NR_name_to_handle_at 264
+#define TARGET_NR_open_by_handle_at 265
+#define TARGET_NR_clock_adjtime 266
+#define TARGET_NR_syncfs 267
+#define TARGET_NR_setns 268
+#define TARGET_NR_sendmmsg 269
+#define TARGET_NR_process_vm_readv 270
+#define TARGET_NR_process_vm_writev 271
+#define TARGET_NR_kcmp 272
+#define TARGET_NR_finit_module 273
+#define TARGET_NR_sched_setattr 274
+#define TARGET_NR_sched_getattr 275
+#define TARGET_NR_renameat2 276
+#define TARGET_NR_seccomp 277
+#define TARGET_NR_getrandom 278
+#define TARGET_NR_memfd_create 279
+#define TARGET_NR_bpf 280
+#define TARGET_NR_execveat 281
+#define TARGET_NR_userfaultfd 282
+#define TARGET_NR_membarrier 283
+#define TARGET_NR_mlock2 284
+#define TARGET_NR_copy_file_range 285
+#define TARGET_NR_preadv2 286
+#define TARGET_NR_pwritev2 287
+#define TARGET_NR_pkey_mprotect 288
+#define TARGET_NR_pkey_alloc 289
+#define TARGET_NR_pkey_free 290
+#define TARGET_NR_open 1024
+#define TARGET_NR_link 1025
+#define TARGET_NR_unlink 1026
+#define TARGET_NR_mknod 1027
+#define TARGET_NR_chmod 1028
+#define TARGET_NR_chown 1029
+#define TARGET_NR_mkdir 1030
+#define TARGET_NR_rmdir 1031
+#define TARGET_NR_lchown 1032
+#define TARGET_NR_access 1033
+#define TARGET_NR_rename 1034
+#define TARGET_NR_readlink 1035
+#define TARGET_NR_symlink 1036
+#define TARGET_NR_utimes 1037
+#define TARGET_NR_stat64 1038
+#define TARGET_NR_lstat64 1039
+#define TARGET_NR_pipe 1040
+#define TARGET_NR_dup2 1041
+#define TARGET_NR_epoll_create 1042
+#define TARGET_NR_inotify_init 1043
+#define TARGET_NR_eventfd 1044
+#define TARGET_NR_signalfd 1045
+#define TARGET_NR_sendfile 1046
+#define TARGET_NR_ftruncate 1047
+#define TARGET_NR_truncate 1048
+#define TARGET_NR_stat 1049
+#define TARGET_NR_lstat 1050
+#define TARGET_NR_fstat 1051
+#define TARGET_NR_fcntl 1052
+#define TARGET_NR_fadvise64 1053
+#define TARGET_NR_newfstatat 1054
+#define TARGET_NR_fstatfs 1055
+#define TARGET_NR_statfs 1056
+#define TARGET_NR_lseek 1057
+#define TARGET_NR_mmap 1058
+#define TARGET_NR_alarm 1059
+#define TARGET_NR_getpgrp 1060
+#define TARGET_NR_pause 1061
+#define TARGET_NR_time 1062
+#define TARGET_NR_utime 1063
+#define TARGET_NR_creat 1064
+#define TARGET_NR_getdents 1065
+#define TARGET_NR_futimesat 1066
+#define TARGET_NR_select 1067
+#define TARGET_NR_poll 1068
+#define TARGET_NR_epoll_wait 1069
+#define TARGET_NR_ustat 1070
+#define TARGET_NR_vfork 1071
+#define TARGET_NR_oldwait4 1072
+#define TARGET_NR_recv 1073
+#define TARGET_NR_send 1074
+#define TARGET_NR_bdflush 1075
+#define TARGET_NR_umount 1076
+#define TARGET_NR_uselib 1077
+#define TARGET_NR__sysctl 1078
+#define TARGET_NR_fork 1079
+
+#endif /* LINUX_USER_HEXAGON_SYSCALL_NR_H */
+
diff --git a/linux-user/hexagon/target_cpu.h b/linux-user/hexagon/target_cpu.h
new file mode 100644
index 0000000..53d45f5
--- /dev/null
+++ b/linux-user/hexagon/target_cpu.h
@@ -0,0 +1,44 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_TARGET_CPU_H
+#define HEXAGON_TARGET_CPU_H
+
+static inline void cpu_clone_regs_child(CPUHexagonState *env,
+                                        target_ulong newsp, unsigned flags)
+{
+    if (newsp) {
+        env->gpr[HEX_REG_SP] = newsp;
+    }
+    env->gpr[0] = 0;
+}
+
+static inline void cpu_clone_regs_parent(CPUHexagonState *env, unsigned flags)
+{
+}
+
+static inline void cpu_set_tls(CPUHexagonState *env, target_ulong newtls)
+{
+    env->gpr[HEX_REG_UGP] = newtls;
+}
+
+static inline abi_ulong get_sp_from_cpustate(CPUHexagonState *state)
+{
+    return state->gpr[HEX_REG_SP];
+}
+
+#endif
diff --git a/linux-user/hexagon/target_elf.h b/linux-user/hexagon/target_elf.h
new file mode 100644
index 0000000..0058b94
--- /dev/null
+++ b/linux-user/hexagon/target_elf.h
@@ -0,0 +1,40 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_TARGET_ELF_H
+#define HEXAGON_TARGET_ELF_H
+
+static inline const char *cpu_get_model(uint32_t eflags)
+{
+    /* For now, treat v5 and anything newer as a v67 */
+    /* FIXME - Disable instructions that are newer than the specified arch */
+    if (eflags == 0x04 ||    /* v5  */
+        eflags == 0x05 ||    /* v55 */
+        eflags == 0x60 ||    /* v60 */
+        eflags == 0x61 ||    /* v61 */
+        eflags == 0x62 ||    /* v62 */
+        eflags == 0x65 ||    /* v65 */
+        eflags == 0x66 ||    /* v66 */
+        eflags == 0x67 ||    /* v67 */
+        eflags == 0x8067     /* v67t */
+       ) {
+        return "v67";
+    }
+    return "unknown";
+}
+
+#endif
diff --git a/linux-user/hexagon/target_fcntl.h b/linux-user/hexagon/target_fcntl.h
new file mode 100644
index 0000000..08162e6
--- /dev/null
+++ b/linux-user/hexagon/target_fcntl.h
@@ -0,0 +1,18 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "../generic/fcntl.h"
diff --git a/linux-user/hexagon/target_signal.h b/linux-user/hexagon/target_signal.h
new file mode 100644
index 0000000..12a6187
--- /dev/null
+++ b/linux-user/hexagon/target_signal.h
@@ -0,0 +1,34 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_TARGET_SIGNAL_H
+#define HEXAGON_TARGET_SIGNAL_H
+
+typedef struct target_sigaltstack {
+    abi_ulong ss_sp;
+    abi_int ss_flags;
+    abi_ulong ss_size;
+} target_stack_t;
+
+#define TARGET_SS_ONSTACK 1
+#define TARGET_SS_DISABLE 2
+
+#define TARGET_MINSIGSTKSZ 2048
+
+#include "../generic/signal.h"
+
+#endif /* HEXAGON_TARGET_SIGNAL_H */
diff --git a/linux-user/hexagon/target_structs.h b/linux-user/hexagon/target_structs.h
new file mode 100644
index 0000000..2e06227
--- /dev/null
+++ b/linux-user/hexagon/target_structs.h
@@ -0,0 +1,46 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * Hexagon specific structures for linux-user
+ */
+#ifndef HEXAGON_TARGET_STRUCTS_H
+#define HEXAGON_TARGET_STRUCTS_H
+
+struct target_ipc_perm {
+    abi_int __key;
+    abi_int uid;
+    abi_int gid;
+    abi_int cuid;
+    abi_int cgid;
+    abi_ushort mode;
+    abi_ushort __pad1;
+    abi_ushort __seq;
+};
+
+struct target_shmid_ds {
+    struct target_ipc_perm shm_perm;
+    abi_long shm_segsz;
+    abi_ulong shm_atime;
+    abi_ulong shm_dtime;
+    abi_ulong shm_ctime;
+    abi_int shm_cpid;
+    abi_int shm_lpid;
+    abi_ulong shm_nattch;
+};
+
+#endif
diff --git a/linux-user/hexagon/target_syscall.h b/linux-user/hexagon/target_syscall.h
new file mode 100644
index 0000000..a3bd307
--- /dev/null
+++ b/linux-user/hexagon/target_syscall.h
@@ -0,0 +1,32 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_TARGET_SYSCALL_H
+#define HEXAGON_TARGET_SYSCALL_H
+
+struct target_pt_regs {
+    abi_long sepc;
+    abi_long sp;
+};
+
+#define UNAME_MACHINE "hexagon"
+#define UNAME_MINIMUM_RELEASE "4.15.0"
+
+#define TARGET_MLOCKALL_MCL_CURRENT 1
+#define TARGET_MLOCKALL_MCL_FUTURE  2
+
+#endif
diff --git a/linux-user/hexagon/termbits.h b/linux-user/hexagon/termbits.h
new file mode 100644
index 0000000..c5f92f1
--- /dev/null
+++ b/linux-user/hexagon/termbits.h
@@ -0,0 +1,18 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "../i386/termbits.h"
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 3c261cf..9323418 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -102,6 +102,14 @@
 #define TARGET_IOC_WRITE  2U
 #define TARGET_IOC_READ   1U
 
+#elif defined(TARGET_HEXAGON)
+
+#define TARGET_IOC_SIZEBITS     14
+
+#define TARGET_IOC_NONE   0U
+#define TARGET_IOC_WRITE  1U
+#define TARGET_IOC_READ   2U
+
 #else
 #error unsupported CPU
 #endif
@@ -2136,6 +2144,31 @@ struct target_stat64 {
     uint64_t   st_ino;
 };
 
+#elif defined(TARGET_HEXAGON)
+
+struct target_stat {
+    unsigned long long st_dev;
+    unsigned long long st_ino;
+    unsigned int st_mode;
+    unsigned int st_nlink;
+    unsigned int st_uid;
+    unsigned int st_gid;
+    unsigned long long st_rdev;
+    target_ulong __pad1;
+    long long st_size;
+    target_long st_blksize;
+    int __pad2;
+    long long st_blocks;
+
+    target_long target_st_atime;
+    target_long target_st_atime_nsec;
+    target_long target_st_mtime;
+    target_long target_st_mtime_nsec;
+    target_long target_st_ctime;
+    target_long target_st_ctime_nsec;
+    int __unused[2];
+};
+
 #else
 #error unsupported CPU
 #endif
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index fe9dfe7..3a229d2 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1475,6 +1475,22 @@ static void elf_core_copy_regs(target_elf_gregset_t *regs,
 
 #endif /* TARGET_XTENSA */
 
+#ifdef TARGET_HEXAGON
+
+#define ELF_START_MMAP 0x20000000
+
+#define ELF_CLASS       ELFCLASS32
+#define ELF_ARCH        EM_HEXAGON
+
+static inline void init_thread(struct target_pt_regs *regs,
+                               struct image_info *infop)
+{
+    regs->sepc = infop->entry;
+    regs->sp = infop->start_stack;
+}
+
+#endif /* TARGET_HEXAGON */
+
 #ifndef ELF_PLATFORM
 #define ELF_PLATFORM (NULL)
 #endif
diff --git a/linux-user/hexagon/cpu_loop.c b/linux-user/hexagon/cpu_loop.c
new file mode 100644
index 0000000..f40a844
--- /dev/null
+++ b/linux-user/hexagon/cpu_loop.c
@@ -0,0 +1,99 @@
+/*
+ *  qemu user cpu loop
+ *
+ *  Copyright (c) 2003-2008 Fabrice Bellard
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu.h"
+#include "cpu_loop-common.h"
+#include "internal.h"
+
+void cpu_loop(CPUHexagonState *env)
+{
+    CPUState *cs = CPU(hexagon_env_get_cpu(env));
+    int trapnr, signum, sigcode;
+    target_ulong sigaddr;
+    target_ulong syscallnum;
+    target_ulong ret;
+
+    for (;;) {
+        cpu_exec_start(cs);
+        trapnr = cpu_exec(cs);
+        cpu_exec_end(cs);
+        process_queued_cpu_work(cs);
+
+        signum = 0;
+        sigcode = 0;
+        sigaddr = 0;
+
+        switch (trapnr) {
+        case EXCP_INTERRUPT:
+            /* just indicate that signals should be handled asap */
+            break;
+        case HEX_EXCP_TRAP0:
+            syscallnum = env->gpr[6];
+            env->gpr[HEX_REG_PC] += 4;
+            ret = do_syscall(env,
+                             syscallnum,
+                             env->gpr[0],
+                             env->gpr[1],
+                             env->gpr[2],
+                             env->gpr[3],
+                             env->gpr[4],
+                             env->gpr[5],
+                             0, 0);
+            if (ret == -TARGET_ERESTARTSYS) {
+                env->gpr[HEX_REG_PC] -= 4;
+            } else if (ret != -TARGET_QEMU_ESIGRETURN) {
+                env->gpr[0] = ret;
+            }
+            break;
+        case HEX_EXCP_FETCH_NO_UPAGE:
+        case HEX_EXCP_PRIV_NO_UREAD:
+        case HEX_EXCP_PRIV_NO_UWRITE:
+            signum = TARGET_SIGSEGV;
+            sigcode = TARGET_SEGV_MAPERR;
+            break;
+        case EXCP_ATOMIC:
+            cpu_exec_step_atomic(cs);
+            break;
+        default:
+            EXCP_DUMP(env, "\nqemu: unhandled CPU exception %#x - aborting\n",
+                     trapnr);
+            exit(EXIT_FAILURE);
+        }
+
+        if (signum) {
+            target_siginfo_t info = {
+                .si_signo = signum,
+                .si_errno = 0,
+                .si_code = sigcode,
+                ._sifields._sigfault._addr = sigaddr
+            };
+            queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+        }
+
+        process_pending_signals(env);
+    }
+}
+
+void target_cpu_copy_regs(CPUArchState *env, struct target_pt_regs *regs)
+{
+    env->gpr[HEX_REG_PC] = regs->sepc;
+    env->gpr[HEX_REG_SP] = regs->sp;
+}
diff --git a/linux-user/hexagon/signal.c b/linux-user/hexagon/signal.c
new file mode 100644
index 0000000..99837e1
--- /dev/null
+++ b/linux-user/hexagon/signal.c
@@ -0,0 +1,276 @@
+/*
+ *  Emulation of Linux signals
+ *
+ *  Copyright (c) 2003 Fabrice Bellard
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include "qemu/osdep.h"
+#include "qemu.h"
+#include "signal-common.h"
+#include "linux-user/trace.h"
+
+struct target_sigcontext {
+    target_ulong r0,  r1,  r2,  r3;
+    target_ulong r4,  r5,  r6,  r7;
+    target_ulong r8,  r9, r10, r11;
+    target_ulong r12, r13, r14, r15;
+    target_ulong r16, r17, r18, r19;
+    target_ulong r20, r21, r22, r23;
+    target_ulong r24, r25, r26, r27;
+    target_ulong r28, r29, r30, r31;
+    target_ulong sa0;
+    target_ulong lc0;
+    target_ulong sa1;
+    target_ulong lc1;
+    target_ulong m0;
+    target_ulong m1;
+    target_ulong usr;
+    target_ulong p3_0;
+    target_ulong gp;
+    target_ulong ugp;
+    target_ulong pc;
+    target_ulong cause;
+    target_ulong badva;
+    target_ulong pad1;
+    target_ulong pad2;
+    target_ulong pad3;
+};
+
+struct target_ucontext {
+    unsigned long uc_flags;
+    target_ulong uc_link; /* target pointer */
+    target_stack_t uc_stack;
+    struct target_sigcontext uc_mcontext;
+    target_sigset_t uc_sigmask;
+};
+
+struct target_rt_sigframe {
+    uint32_t tramp[2];
+    struct target_siginfo info;
+    struct target_ucontext uc;
+};
+
+static abi_ulong get_sigframe(struct target_sigaction *ka,
+                              CPUHexagonState *regs, size_t framesize)
+{
+    abi_ulong sp = get_sp_from_cpustate(regs);
+
+    /* This is the X/Open sanctioned signal stack switching.  */
+    sp = target_sigsp(sp, ka) - framesize;
+
+    sp = QEMU_ALIGN_DOWN(sp, 8);
+
+    return sp;
+}
+
+static void setup_sigcontext(struct target_sigcontext *sc, CPUHexagonState *env)
+{
+    __put_user(env->gpr[HEX_REG_R00], &sc->r0);
+    __put_user(env->gpr[HEX_REG_R01], &sc->r1);
+    __put_user(env->gpr[HEX_REG_R02], &sc->r2);
+    __put_user(env->gpr[HEX_REG_R03], &sc->r3);
+    __put_user(env->gpr[HEX_REG_R04], &sc->r4);
+    __put_user(env->gpr[HEX_REG_R05], &sc->r5);
+    __put_user(env->gpr[HEX_REG_R06], &sc->r6);
+    __put_user(env->gpr[HEX_REG_R07], &sc->r7);
+    __put_user(env->gpr[HEX_REG_R08], &sc->r8);
+    __put_user(env->gpr[HEX_REG_R09], &sc->r9);
+    __put_user(env->gpr[HEX_REG_R10], &sc->r10);
+    __put_user(env->gpr[HEX_REG_R11], &sc->r11);
+    __put_user(env->gpr[HEX_REG_R12], &sc->r12);
+    __put_user(env->gpr[HEX_REG_R13], &sc->r13);
+    __put_user(env->gpr[HEX_REG_R14], &sc->r14);
+    __put_user(env->gpr[HEX_REG_R15], &sc->r15);
+    __put_user(env->gpr[HEX_REG_R16], &sc->r16);
+    __put_user(env->gpr[HEX_REG_R17], &sc->r17);
+    __put_user(env->gpr[HEX_REG_R18], &sc->r18);
+    __put_user(env->gpr[HEX_REG_R19], &sc->r19);
+    __put_user(env->gpr[HEX_REG_R20], &sc->r20);
+    __put_user(env->gpr[HEX_REG_R21], &sc->r21);
+    __put_user(env->gpr[HEX_REG_R22], &sc->r22);
+    __put_user(env->gpr[HEX_REG_R23], &sc->r23);
+    __put_user(env->gpr[HEX_REG_R24], &sc->r24);
+    __put_user(env->gpr[HEX_REG_R25], &sc->r25);
+    __put_user(env->gpr[HEX_REG_R26], &sc->r26);
+    __put_user(env->gpr[HEX_REG_R27], &sc->r27);
+    __put_user(env->gpr[HEX_REG_R28], &sc->r28);
+    __put_user(env->gpr[HEX_REG_R29], &sc->r29);
+    __put_user(env->gpr[HEX_REG_R30], &sc->r30);
+    __put_user(env->gpr[HEX_REG_R31], &sc->r31);
+    __put_user(env->gpr[HEX_REG_SA0], &sc->sa0);
+    __put_user(env->gpr[HEX_REG_LC0], &sc->lc0);
+    __put_user(env->gpr[HEX_REG_SA1], &sc->sa1);
+    __put_user(env->gpr[HEX_REG_LC1], &sc->lc1);
+    __put_user(env->gpr[HEX_REG_M0], &sc->m0);
+    __put_user(env->gpr[HEX_REG_M1], &sc->m1);
+    __put_user(env->gpr[HEX_REG_USR], &sc->usr);
+    __put_user(env->gpr[HEX_REG_P3_0], &sc->p3_0);
+    __put_user(env->gpr[HEX_REG_GP], &sc->gp);
+    __put_user(env->gpr[HEX_REG_UGP], &sc->ugp);
+    __put_user(env->gpr[HEX_REG_PC], &sc->pc);
+}
+
+static void setup_ucontext(struct target_ucontext *uc,
+                           CPUHexagonState *env, target_sigset_t *set)
+{
+    int i;
+
+    __put_user(0, &(uc->uc_flags));
+    __put_user(0, &(uc->uc_link));
+    target_save_altstack(&uc->uc_stack, env);
+
+    for (i = 0; i < TARGET_NSIG_WORDS; i++) {
+        __put_user(set->sig[i], &(uc->uc_sigmask.sig[i]));
+    }
+
+    setup_sigcontext(&uc->uc_mcontext, env);
+}
+
+static inline void install_sigtramp(uint32_t *tramp)
+{
+    __put_user(0x7800d166, tramp + 0); /*  { r6=#__NR_rt_sigreturn } */
+    __put_user(0x5400c004, tramp + 1); /*  { trap0(#1) } */
+}
+
+void setup_rt_frame(int sig, struct target_sigaction *ka,
+                    target_siginfo_t *info,
+                    target_sigset_t *set, CPUHexagonState *env)
+{
+    abi_ulong frame_addr;
+    struct target_rt_sigframe *frame;
+
+    frame_addr = get_sigframe(ka, env, sizeof(*frame));
+    trace_user_setup_rt_frame(env, frame_addr);
+
+    if (!lock_user_struct(VERIFY_WRITE, frame, frame_addr, 0)) {
+        goto badframe;
+    }
+
+    setup_ucontext(&frame->uc, env, set);
+    tswap_siginfo(&frame->info, info);
+    install_sigtramp(frame->tramp);
+
+    env->gpr[HEX_REG_PC] = ka->_sa_handler;
+    env->gpr[HEX_REG_SP] = frame_addr;
+    env->gpr[HEX_REG_R00] = sig;
+    env->gpr[HEX_REG_R01] =
+        frame_addr + offsetof(struct target_rt_sigframe, info);
+    env->gpr[HEX_REG_R02] =
+        frame_addr + offsetof(struct target_rt_sigframe, uc);
+    env->gpr[HEX_REG_LR] =
+        frame_addr + offsetof(struct target_rt_sigframe, tramp);
+
+    return;
+
+badframe:
+    unlock_user_struct(frame, frame_addr, 1);
+    if (sig == TARGET_SIGSEGV) {
+        ka->_sa_handler = TARGET_SIG_DFL;
+    }
+    force_sig(TARGET_SIGSEGV);
+}
+
+static void restore_sigcontext(CPUHexagonState *env,
+                               struct target_sigcontext *sc)
+{
+    __get_user(env->gpr[HEX_REG_R00], &sc->r0);
+    __get_user(env->gpr[HEX_REG_R01], &sc->r1);
+    __get_user(env->gpr[HEX_REG_R02], &sc->r2);
+    __get_user(env->gpr[HEX_REG_R03], &sc->r3);
+    __get_user(env->gpr[HEX_REG_R04], &sc->r4);
+    __get_user(env->gpr[HEX_REG_R05], &sc->r5);
+    __get_user(env->gpr[HEX_REG_R06], &sc->r6);
+    __get_user(env->gpr[HEX_REG_R07], &sc->r7);
+    __get_user(env->gpr[HEX_REG_R08], &sc->r8);
+    __get_user(env->gpr[HEX_REG_R09], &sc->r9);
+    __get_user(env->gpr[HEX_REG_R10], &sc->r10);
+    __get_user(env->gpr[HEX_REG_R11], &sc->r11);
+    __get_user(env->gpr[HEX_REG_R12], &sc->r12);
+    __get_user(env->gpr[HEX_REG_R13], &sc->r13);
+    __get_user(env->gpr[HEX_REG_R14], &sc->r14);
+    __get_user(env->gpr[HEX_REG_R15], &sc->r15);
+    __get_user(env->gpr[HEX_REG_R16], &sc->r16);
+    __get_user(env->gpr[HEX_REG_R17], &sc->r17);
+    __get_user(env->gpr[HEX_REG_R18], &sc->r18);
+    __get_user(env->gpr[HEX_REG_R19], &sc->r19);
+    __get_user(env->gpr[HEX_REG_R20], &sc->r20);
+    __get_user(env->gpr[HEX_REG_R21], &sc->r21);
+    __get_user(env->gpr[HEX_REG_R22], &sc->r22);
+    __get_user(env->gpr[HEX_REG_R23], &sc->r23);
+    __get_user(env->gpr[HEX_REG_R24], &sc->r24);
+    __get_user(env->gpr[HEX_REG_R25], &sc->r25);
+    __get_user(env->gpr[HEX_REG_R26], &sc->r26);
+    __get_user(env->gpr[HEX_REG_R27], &sc->r27);
+    __get_user(env->gpr[HEX_REG_R28], &sc->r28);
+    __get_user(env->gpr[HEX_REG_R29], &sc->r29);
+    __get_user(env->gpr[HEX_REG_R30], &sc->r30);
+    __get_user(env->gpr[HEX_REG_R31], &sc->r31);
+    __get_user(env->gpr[HEX_REG_SA0], &sc->sa0);
+    __get_user(env->gpr[HEX_REG_LC0], &sc->lc0);
+    __get_user(env->gpr[HEX_REG_SA1], &sc->sa1);
+    __get_user(env->gpr[HEX_REG_LC1], &sc->lc1);
+    __get_user(env->gpr[HEX_REG_M0], &sc->m0);
+    __get_user(env->gpr[HEX_REG_M1], &sc->m1);
+    __get_user(env->gpr[HEX_REG_USR], &sc->usr);
+    __get_user(env->gpr[HEX_REG_P3_0], &sc->p3_0);
+    __get_user(env->gpr[HEX_REG_GP], &sc->gp);
+    __get_user(env->gpr[HEX_REG_UGP], &sc->ugp);
+    __get_user(env->gpr[HEX_REG_PC], &sc->pc);
+}
+
+static void restore_ucontext(CPUHexagonState *env, struct target_ucontext *uc)
+{
+    sigset_t blocked;
+    target_sigset_t target_set;
+    int i;
+
+    target_sigemptyset(&target_set);
+    for (i = 0; i < TARGET_NSIG_WORDS; i++) {
+        __get_user(target_set.sig[i], &(uc->uc_sigmask.sig[i]));
+    }
+
+    target_to_host_sigset_internal(&blocked, &target_set);
+    set_sigmask(&blocked);
+
+    restore_sigcontext(env, &uc->uc_mcontext);
+}
+
+long do_rt_sigreturn(CPUHexagonState *env)
+{
+    struct target_rt_sigframe *frame;
+    abi_ulong frame_addr;
+
+    frame_addr = env->gpr[HEX_REG_SP];
+    trace_user_do_sigreturn(env, frame_addr);
+    if (!lock_user_struct(VERIFY_READ, frame, frame_addr, 1)) {
+        goto badframe;
+    }
+
+    restore_ucontext(env, &frame->uc);
+
+    if (do_sigaltstack(frame_addr + offsetof(struct target_rt_sigframe,
+            uc.uc_stack), 0, get_sp_from_cpustate(env)) == -TARGET_EFAULT) {
+        goto badframe;
+    }
+
+    unlock_user_struct(frame, frame_addr, 0);
+    return -TARGET_QEMU_ESIGRETURN;
+
+badframe:
+    unlock_user_struct(frame, frame_addr, 0);
+    force_sig(TARGET_SIGSEGV);
+    return 0;
+}
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 945fc25..863058b 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -522,6 +522,8 @@ static inline int regpairs_aligned(void *cpu_env, int num)
 }
 #elif defined(TARGET_XTENSA)
 static inline int regpairs_aligned(void *cpu_env, int num) { return 1; }
+#elif defined(TARGET_HEXAGON)
+static inline int regpairs_aligned(void *cpu_env, int num) { return 1; }
 #else
 static inline int regpairs_aligned(void *cpu_env, int num) { return 0; }
 #endif
diff --git a/scripts/gensyscalls.sh b/scripts/gensyscalls.sh
index b7b8456..e2ee60a 100755
--- a/scripts/gensyscalls.sh
+++ b/scripts/gensyscalls.sh
@@ -53,7 +53,7 @@ read_includes()
 
 filter_defines()
 {
-    grep -e "#define __NR_" -e "#define __NR3264"
+    grep -e "#define __NR_" -e "#define __NR3264" | grep -v __NR_syscalls
 }
 
 rename_defines()
@@ -99,4 +99,5 @@ generate_syscall_nr openrisc 32 "$output/linux-user/openrisc/syscall_nr.h"
 
 generate_syscall_nr riscv 32 "$output/linux-user/riscv/syscall32_nr.h"
 generate_syscall_nr riscv 64 "$output/linux-user/riscv/syscall64_nr.h"
+generate_syscall_nr hexagon 32 "$output/linux-user/hexagon/syscall_nr.h"
 rm -fr "$TMP"
-- 
2.7.4


* [RFC PATCH v3 33/34] Hexagon (tests/tcg/hexagon) TCG tests
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (31 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 32/34] Hexagon (linux-user/hexagon) Linux user emulation Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-29  3:05   ` Richard Henderson
  2020-08-18 15:50 ` [RFC PATCH v3 34/34] Hexagon build infrastructure Taylor Simpson
                   ` (2 subsequent siblings)
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Modify tests/tcg/configure.sh
Add reference files to tests/tcg/hexagon
Add Hexagon-specific tests

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
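Note for reviewers (not part of the commit): atomics.c exercises the
memw_locked/memd_locked retry loops from two threads and expects both
counters to return to 1. For anyone without a Hexagon toolchain, the same
check can be approximated on the host with C11 atomics. This is only a
rough sketch: add32(), add64(), the thread functions and run_counter_check()
are names invented here, and the compare-exchange loop merely stands in for
the load-locked/store-conditional sequence; it is not what the test runs.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define LOOP_CNT 1000

static atomic_int tick32 = 1;
static atomic_llong tick64 = 1;

/*
 * Hypothetical helper: retry until the update lands, in the same spirit
 * as the "if (!p0) jump 1b" loop in atomics.c.  Returns the old value.
 */
static int add32(atomic_int *x, int delta)
{
    int old = atomic_load(x);

    while (!atomic_compare_exchange_weak(x, &old, old + delta)) {
        /* old now holds the current value; try again */
    }
    return old;
}

/* 64-bit variant of the helper above */
static long long add64(atomic_llong *x, long long delta)
{
    long long old = atomic_load(x);

    while (!atomic_compare_exchange_weak(x, &old, old + delta)) {
        /* old now holds the current value; try again */
    }
    return old;
}

static void *up32_down64(void *arg)
{
    (void)arg;
    for (int i = 0; i < LOOP_CNT; i++) {
        add32(&tick32, 1);
        add64(&tick64, -1);
    }
    return NULL;
}

static void *down32_up64(void *arg)
{
    (void)arg;
    for (int i = 0; i < LOOP_CNT; i++) {
        add32(&tick32, -1);
        add64(&tick64, 1);
    }
    return NULL;
}

/* Returns how many counters failed to come back to 1, or -1 on error. */
static int run_counter_check(void)
{
    pthread_t t1, t2;

    atomic_store(&tick32, 1);
    atomic_store(&tick64, 1);

    if (pthread_create(&t1, NULL, up32_down64, NULL) != 0) {
        return -1;
    }
    if (pthread_create(&t2, NULL, down32_up64, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    return (atomic_load(&tick32) != 1) + (atomic_load(&tick64) != 1);
}
```

Calling run_counter_check() from a main() and treating a non-zero result
as failure mirrors the PASS/FAIL convention used by the tests below. The
weak compare-exchange is used on purpose, since it may fail spuriously
much like a store-conditional can.
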
 tests/tcg/hexagon/atomics.c        | 122 ++++++
 tests/tcg/hexagon/clrtnew.c        |  56 +++
 tests/tcg/hexagon/dual_stores.c    |  60 +++
 tests/tcg/hexagon/exec_counters.c  |  57 +++
 tests/tcg/hexagon/mem_noshuf.c     | 291 ++++++++++++++
 tests/tcg/hexagon/misc.c           | 293 ++++++++++++++
 tests/tcg/hexagon/preg_alias.c     | 106 +++++
 tests/tcg/hexagon/pthread_cancel.c |  43 +++
 tests/tcg/hexagon/sfminmax.c       |  62 +++
 tests/tcg/configure.sh             |   4 +-
 tests/tcg/hexagon/Makefile.target  |  49 +++
 tests/tcg/hexagon/first.S          |  57 +++
 tests/tcg/hexagon/float_convs.ref  | 748 ++++++++++++++++++++++++++++++++++++
 tests/tcg/hexagon/float_madds.ref  | 768 +++++++++++++++++++++++++++++++++++++
 14 files changed, 2715 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/hexagon/atomics.c
 create mode 100644 tests/tcg/hexagon/clrtnew.c
 create mode 100644 tests/tcg/hexagon/dual_stores.c
 create mode 100644 tests/tcg/hexagon/exec_counters.c
 create mode 100644 tests/tcg/hexagon/mem_noshuf.c
 create mode 100644 tests/tcg/hexagon/misc.c
 create mode 100644 tests/tcg/hexagon/preg_alias.c
 create mode 100644 tests/tcg/hexagon/pthread_cancel.c
 create mode 100644 tests/tcg/hexagon/sfminmax.c
 create mode 100644 tests/tcg/hexagon/Makefile.target
 create mode 100644 tests/tcg/hexagon/first.S
 create mode 100644 tests/tcg/hexagon/float_convs.ref
 create mode 100644 tests/tcg/hexagon/float_madds.ref

diff --git a/tests/tcg/hexagon/atomics.c b/tests/tcg/hexagon/atomics.c
new file mode 100644
index 0000000..88d7450
--- /dev/null
+++ b/tests/tcg/hexagon/atomics.c
@@ -0,0 +1,122 @@
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <pthread.h>
+
+/* Using volatile because we are testing atomics */
+static inline int atomic_inc32(volatile int *x)
+{
+    int old, dummy;
+    __asm__ __volatile__(
+        "1: %0 = memw_locked(%2)\n\t"
+        "   %1 = add(%0, #1)\n\t"
+        "   memw_locked(%2, p0) = %1\n\t"
+        "   if (!p0) jump 1b\n\t"
+        : "=&r"(old), "=&r"(dummy)
+        : "r"(x)
+        : "p0", "memory");
+    return old;
+}
+
+/* Using volatile because we are testing atomics */
+static inline long long atomic_inc64(volatile long long *x)
+{
+    long long old, dummy;
+    __asm__ __volatile__(
+        "1: %0 = memd_locked(%2)\n\t"
+        "   %1 = #1\n\t"
+        "   %1 = add(%0, %1)\n\t"
+        "   memd_locked(%2, p0) = %1\n\t"
+        "   if (!p0) jump 1b\n\t"
+        : "=&r"(old), "=&r"(dummy)
+        : "r"(x)
+        : "p0", "memory");
+    return old;
+}
+
+/* Using volatile because we are testing atomics */
+static inline int atomic_dec32(volatile int *x)
+{
+    int old, dummy;
+    __asm__ __volatile__(
+        "1: %0 = memw_locked(%2)\n\t"
+        "   %1 = add(%0, #-1)\n\t"
+        "   memw_locked(%2, p0) = %1\n\t"
+        "   if (!p0) jump 1b\n\t"
+        : "=&r"(old), "=&r"(dummy)
+        : "r"(x)
+        : "p0", "memory");
+    return old;
+}
+
+/* Using volatile because we are testing atomics */
+static inline long long atomic_dec64(volatile long long *x)
+{
+    long long old, dummy;
+    __asm__ __volatile__(
+        "1: %0 = memd_locked(%2)\n\t"
+        "   %1 = #-1\n\t"
+        "   %1 = add(%0, %1)\n\t"
+        "   memd_locked(%2, p0) = %1\n\t"
+        "   if (!p0) jump 1b\n\t"
+        : "=&r"(old), "=&r"(dummy)
+        : "r"(x)
+        : "p0", "memory");
+    return old;
+}
+
+#define LOOP_CNT 1000
+/* Using volatile because we are testing atomics */
+volatile int tick32 = 1;
+/* Using volatile because we are testing atomics */
+volatile long long tick64 = 1;
+int err;
+
+void *thread1_func(void *arg)
+{
+    int i;
+
+    for (i = 0; i < LOOP_CNT; i++) {
+        atomic_inc32(&tick32);
+        atomic_dec64(&tick64);
+    }
+    return NULL;
+}
+
+void *thread2_func(void *arg)
+{
+    int i;
+    for (i = 0; i < LOOP_CNT; i++) {
+        atomic_dec32(&tick32);
+        atomic_inc64(&tick64);
+    }
+    return NULL;
+}
+
+void test_pthread(void)
+{
+    pthread_t tid1, tid2;
+
+    pthread_create(&tid1, NULL, thread1_func, "hello1");
+    pthread_create(&tid2, NULL, thread2_func, "hello2");
+    pthread_join(tid1, NULL);
+    pthread_join(tid2, NULL);
+
+    if (tick32 != 1) {
+        printf("ERROR: tick32 %d != 1\n", tick32);
+        err++;
+    }
+    if (tick64 != 1) {
+        printf("ERROR: tick64 %lld != 1\n", tick64);
+        err++;
+    }
+}
+
+int main(int argc, char **argv)
+{
+    test_pthread();
+    puts(err ? "FAIL" : "PASS");
+    return err;
+}
diff --git a/tests/tcg/hexagon/clrtnew.c b/tests/tcg/hexagon/clrtnew.c
new file mode 100644
index 0000000..665c2b3
--- /dev/null
+++ b/tests/tcg/hexagon/clrtnew.c
@@ -0,0 +1,56 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+
+static inline int test_clrtnew(int arg1, int old_val)
+{
+  int ret;
+  asm volatile("r5 = %2\n\t"
+               "{\n\t"
+                   "p0 = cmp.eq(%1, #1)\n\t"
+                   "if (p0.new) r5=#0\n\t"
+               "}\n\t"
+               "%0 = r5\n\t"
+               : "=r"(ret)
+               : "r"(arg1), "r"(old_val)
+               : "p0", "r5");
+  return ret;
+}
+
+int err;
+
+static void check(int val, int expect)
+{
+    if (val != expect) {
+        printf("ERROR: 0x%x != 0x%x\n", val, expect);
+        err++;
+    }
+}
+
+int main()
+{
+    int res;
+
+    res = test_clrtnew(1, 7);
+    check(res, 0);
+    res = test_clrtnew(2, 7);
+    check(res, 7);
+
+    puts(err ? "FAIL" : "PASS");
+    return err;
+}
diff --git a/tests/tcg/hexagon/dual_stores.c b/tests/tcg/hexagon/dual_stores.c
new file mode 100644
index 0000000..ba81bc2
--- /dev/null
+++ b/tests/tcg/hexagon/dual_stores.c
@@ -0,0 +1,60 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+
+/*
+ *  Make sure that two stores in the same packet honor proper
+ *  semantics: slot 1 executes first, then slot 0.
+ *  This is important when the addresses overlap.
+ */
+static inline void dual_stores(int *p, char *q, int x, char y)
+{
+  asm volatile("{\n\t"
+               "    memw(%0) = %2\n\t"
+               "    memb(%1) = %3\n\t"
+               "}\n"
+               :: "r"(p), "r"(q), "r"(x), "r"(y)
+               : "memory");
+}
+
+typedef union {
+    int word;
+    char byte;
+} Dual;
+
+int err;
+
+static void check(Dual d, int expect)
+{
+    if (d.word != expect) {
+        printf("ERROR: 0x%08x != 0x%08x\n", d.word, expect);
+        err++;
+    }
+}
+
+int main()
+{
+    Dual d;
+
+    d.word = ~0;
+    dual_stores(&d.word, &d.byte, 0x12345678, 0xff);
+    check(d, 0x123456ff);
+
+    puts(err ? "FAIL" : "PASS");
+    return err;
+}
diff --git a/tests/tcg/hexagon/exec_counters.c b/tests/tcg/hexagon/exec_counters.c
new file mode 100644
index 0000000..36eae25
--- /dev/null
+++ b/tests/tcg/hexagon/exec_counters.c
@@ -0,0 +1,57 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ *  Check the packet and instruction counters in QEMU
+ */
+#include <stdio.h>
+
+int err;
+
+static void check(const char *name, int val, int expect)
+{
+    if (val != expect) {
+        printf("ERROR: %s %d != %d\n", name, val, expect);
+        err++;
+    }
+}
+
+int main()
+{
+  int pkt, insn;
+
+  asm volatile("r2 = #0\n\t"
+               "c23 = r2\n\t"
+               "c22 = r2\n\t"
+               "c21 = r2\n\t"
+               "c20 = r2\n\t"
+               "r2 = #7\n\t"
+               "loop0(1f, #3)\n\t"
+               "1:\n\t"
+               "    { p0 = cmp.eq(r2,#5); if (p0.new) jump:nt 2f }\n\t"
+               "    {r0 = r1; r1 = r0 }:endloop0\n\t"
+               "2:\n\t"
+               "%[pkt] = c20\n\t"
+               "%[insn] = c21\n\t"
+               : [pkt] "=r"(pkt), [insn] "=r"(insn)
+               : : "r0", "r1", "r2", "sa0", "lc0", "p0");
+
+  check("Packet", pkt, 9);
+  check("Instruction", insn, 14);
+  puts(err ? "FAIL" : "PASS");
+  return err;
+}
diff --git a/tests/tcg/hexagon/mem_noshuf.c b/tests/tcg/hexagon/mem_noshuf.c
new file mode 100644
index 0000000..f7bf2e3
--- /dev/null
+++ b/tests/tcg/hexagon/mem_noshuf.c
@@ -0,0 +1,291 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+
+/*
+ *  Make sure that the :mem_noshuf packet attribute is honored.
+ *  This is important when the addresses overlap.
+ *  The store instruction in slot 1 effectively executes first,
+ *  followed by the load instruction in slot 0.
+ */
+
+#define MEM_NOSHUF32(NAME, ST_TYPE, LD_TYPE, ST_OP, LD_OP) \
+static inline unsigned int NAME(ST_TYPE * p, LD_TYPE * q, ST_TYPE x) \
+{ \
+    unsigned int ret; \
+    asm volatile("{\n\t" \
+                 "    " #ST_OP "(%1) = %3\n\t" \
+                 "    %0 = " #LD_OP "(%2)\n\t" \
+                 "}:mem_noshuf\n" \
+                 : "=r"(ret) \
+                 : "r"(p), "r"(q), "r"(x) \
+                 : "memory"); \
+    return ret; \
+}
+
+#define MEM_NOSHUF64(NAME, ST_TYPE, LD_TYPE, ST_OP, LD_OP) \
+static inline unsigned long long NAME(ST_TYPE * p, LD_TYPE * q, ST_TYPE x) \
+{ \
+    unsigned long long ret; \
+    asm volatile("{\n\t" \
+                 "    " #ST_OP "(%1) = %3\n\t" \
+                 "    %0 = " #LD_OP "(%2)\n\t" \
+                 "}:mem_noshuf\n" \
+                 : "=r"(ret) \
+                 : "r"(p), "r"(q), "r"(x) \
+                 : "memory"); \
+    return ret; \
+}
+
+/* Store byte combinations */
+MEM_NOSHUF32(mem_noshuf_sb_lb,  signed char,  signed char,      memb, memb)
+MEM_NOSHUF32(mem_noshuf_sb_lub, signed char,  unsigned char,    memb, memub)
+MEM_NOSHUF32(mem_noshuf_sb_lh,  signed char,  signed short,     memb, memh)
+MEM_NOSHUF32(mem_noshuf_sb_luh, signed char,  unsigned short,   memb, memuh)
+MEM_NOSHUF32(mem_noshuf_sb_lw,  signed char,  signed int,       memb, memw)
+MEM_NOSHUF64(mem_noshuf_sb_ld,  signed char,  signed long long, memb, memd)
+
+/* Store half combinations */
+MEM_NOSHUF32(mem_noshuf_sh_lb,  signed short, signed char,      memh, memb)
+MEM_NOSHUF32(mem_noshuf_sh_lub, signed short, unsigned char,    memh, memub)
+MEM_NOSHUF32(mem_noshuf_sh_lh,  signed short, signed short,     memh, memh)
+MEM_NOSHUF32(mem_noshuf_sh_luh, signed short, unsigned short,   memh, memuh)
+MEM_NOSHUF32(mem_noshuf_sh_lw,  signed short, signed int,       memh, memw)
+MEM_NOSHUF64(mem_noshuf_sh_ld,  signed short, signed long long, memh, memd)
+
+/* Store word combinations */
+MEM_NOSHUF32(mem_noshuf_sw_lb,  signed int,   signed char,      memw, memb)
+MEM_NOSHUF32(mem_noshuf_sw_lub, signed int,   unsigned char,    memw, memub)
+MEM_NOSHUF32(mem_noshuf_sw_lh,  signed int,   signed short,     memw, memh)
+MEM_NOSHUF32(mem_noshuf_sw_luh, signed int,   unsigned short,   memw, memuh)
+MEM_NOSHUF32(mem_noshuf_sw_lw,  signed int,   signed int,       memw, memw)
+MEM_NOSHUF64(mem_noshuf_sw_ld,  signed int,   signed long long, memw, memd)
+
+/* Store double combinations */
+MEM_NOSHUF32(mem_noshuf_sd_lb,  long long,    signed char,      memd, memb)
+MEM_NOSHUF32(mem_noshuf_sd_lub, long long,    unsigned char,    memd, memub)
+MEM_NOSHUF32(mem_noshuf_sd_lh,  long long,    signed short,     memd, memh)
+MEM_NOSHUF32(mem_noshuf_sd_luh, long long,    unsigned short,   memd, memuh)
+MEM_NOSHUF32(mem_noshuf_sd_lw,  long long,    signed int,       memd, memw)
+MEM_NOSHUF64(mem_noshuf_sd_ld,  long long,    signed long long, memd, memd)
+
+static inline unsigned int cancel_sw_lb(int pred, int *p, signed char *q, int x)
+{
+    unsigned int ret;
+    asm volatile("p0 = cmp.eq(%4, #0)\n\t"
+                 "{\n\t"
+                 "    if (!p0) memw(%1) = %3\n\t"
+                 "    %0 = memb(%2)\n\t"
+                 "}:mem_noshuf\n"
+                 : "=r"(ret)
+                 : "r"(p), "r"(q), "r"(x), "r"(pred)
+                 : "p0", "memory");
+    return ret;
+}
+
+static inline
+unsigned long long cancel_sw_ld(int pred, int *p, long long *q, int x)
+{
+    unsigned long long ret;
+    asm volatile("p0 = cmp.eq(%4, #0)\n\t"
+                 "{\n\t"
+                 "    if (!p0) memw(%1) = %3\n\t"
+                 "    %0 = memd(%2)\n\t"
+                 "}:mem_noshuf\n"
+                 : "=r"(ret)
+                 : "r"(p), "r"(q), "r"(x), "r"(pred)
+                 : "p0", "memory");
+    return ret;
+}
+
+typedef union {
+    signed long long d;
+    unsigned long long ud;
+    signed int w[2];
+    unsigned int uw[2];
+    signed short h[4];
+    unsigned short uh[4];
+    signed char b[8];
+    unsigned char ub[8];
+} Memory;
+
+int err;
+
+static void check32(int n, int expect)
+{
+    if (n != expect) {
+        printf("ERROR: 0x%08x != 0x%08x\n", n, expect);
+        err++;
+    }
+}
+
+static void check64(long long n, long long expect)
+{
+    if (n != expect) {
+        printf("ERROR: 0x%016llx != 0x%016llx\n", n, expect);
+        err++;
+    }
+}
+
+int main()
+{
+    Memory n;
+    unsigned int res32;
+    unsigned long long res64;
+
+    /*
+     * Store byte combinations
+     */
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sb_lb(&n.b[0], &n.b[0], 0x87);
+    check32(res32, 0xffffff87);
+
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sb_lub(&n.b[0], &n.ub[0], 0x87);
+    check32(res32, 0x00000087);
+
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sb_lh(&n.b[0], &n.h[0], 0x87);
+    check32(res32, 0xffffff87);
+
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sb_luh(&n.b[0], &n.uh[0], 0x87);
+    check32(res32, 0x0000ff87);
+
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sb_lw(&n.b[0], &n.w[0], 0x87);
+    check32(res32, 0xffffff87);
+
+    n.d = ~0;
+    res64 = mem_noshuf_sb_ld(&n.b[0], &n.d, 0x87);
+    check64(res64, 0xffffffffffffff87);
+
+    /*
+     * Store half combinations
+     */
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sh_lb(&n.h[0], &n.b[0], 0x8787);
+    check32(res32, 0xffffff87);
+
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sh_lub(&n.h[0], &n.ub[1], 0x8f87);
+    check32(res32, 0x0000008f);
+
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sh_lh(&n.h[0], &n.h[0], 0x8a87);
+    check32(res32, 0xffff8a87);
+
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sh_luh(&n.h[0], &n.uh[0], 0x8a87);
+    check32(res32, 0x8a87);
+
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sh_lw(&n.h[1], &n.w[0], 0x8a87);
+    check32(res32, 0x8a87ffff);
+
+    n.d = ~0;
+    res64 = mem_noshuf_sh_ld(&n.h[1], &n.d, 0x8a87);
+    check64(res64, 0xffffffff8a87ffff);
+
+    /*
+     * Store word combinations
+     */
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sw_lb(&n.w[0], &n.b[0], 0x12345687);
+    check32(res32, 0xffffff87);
+
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sw_lub(&n.w[0], &n.ub[0], 0x12345687);
+    check32(res32, 0x00000087);
+
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sw_lh(&n.w[0], &n.h[0], 0x1234f678);
+    check32(res32, 0xfffff678);
+
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sw_luh(&n.w[0], &n.uh[0], 0x12345678);
+    check32(res32, 0x00005678);
+
+    n.w[0] = ~0;
+    res32 = mem_noshuf_sw_lw(&n.w[0], &n.w[0], 0x12345678);
+    check32(res32, 0x12345678);
+
+    n.d = ~0;
+    res64 = mem_noshuf_sw_ld(&n.w[0], &n.d, 0x12345678);
+    check64(res64, 0xffffffff12345678);
+
+    /*
+     * Store double combinations
+     */
+    n.d = ~0;
+    res32 = mem_noshuf_sd_lb(&n.d, &n.b[1], 0x123456789abcdef0);
+    check32(res32, 0xffffffde);
+
+    n.d = ~0;
+    res32 = mem_noshuf_sd_lub(&n.d, &n.ub[1], 0x123456789abcdef0);
+    check32(res32, 0x000000de);
+
+    n.d = ~0;
+    res32 = mem_noshuf_sd_lh(&n.d, &n.h[1], 0x123456789abcdef0);
+    check32(res32, 0xffff9abc);
+
+    n.d = ~0;
+    res32 = mem_noshuf_sd_luh(&n.d, &n.uh[1], 0x123456789abcdef0);
+    check32(res32, 0x00009abc);
+
+    n.d = ~0;
+    res32 = mem_noshuf_sd_lw(&n.d, &n.w[1], 0x123456789abcdef0);
+    check32(res32, 0x12345678);
+
+    n.d = ~0;
+    res64 = mem_noshuf_sd_ld(&n.d, &n.d, 0x123456789abcdef0);
+    check64(res64, 0x123456789abcdef0);
+
+    /*
+     * Predicated word stores followed by byte loads
+     */
+    n.w[0] = ~0;
+    res32 = cancel_sw_lb(0, &n.w[0], &n.b[0], 0x12345678);
+    check32(res32, 0xffffffff);
+
+    n.w[0] = ~0;
+    res32 = cancel_sw_lb(1, &n.w[0], &n.b[0], 0x12345687);
+    check32(res32, 0xffffff87);
+
+    /*
+     * Predicated word stores followed by double loads
+     */
+    n.d = ~0;
+    res64 = cancel_sw_ld(0, &n.w[0], &n.d, 0x12345678);
+    check64(res64, 0xffffffffffffffff);
+
+    n.d = ~0;
+    res64 = cancel_sw_ld(1, &n.w[0], &n.d, 0x12345678);
+    check64(res64, 0xffffffff12345678);
+
+    n.d = ~0;
+    res64 = cancel_sw_ld(0, &n.w[1], &n.d, 0x12345678);
+    check64(res64, 0xffffffffffffffff);
+
+    n.d = ~0;
+    res64 = cancel_sw_ld(1, &n.w[1], &n.d, 0x12345678);
+    check64(res64, 0x12345678ffffffff);
+
+    puts(err ? "FAIL" : "PASS");
+    return err;
+}
diff --git a/tests/tcg/hexagon/misc.c b/tests/tcg/hexagon/misc.c
new file mode 100644
index 0000000..796ac74
--- /dev/null
+++ b/tests/tcg/hexagon/misc.c
@@ -0,0 +1,293 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+#include <string.h>
+
+typedef unsigned char uint8_t;
+typedef unsigned short uint16_t;
+typedef unsigned int uint32_t;
+
+
+static inline void S4_storerhnew_rr(void *p, int index, uint16_t v)
+{
+  asm volatile("{\n\t"
+               "    r0 = %0\n\t"
+               "    memh(%1+%2<<#2) = r0.new\n\t"
+               "}\n"
+               :: "r"(v), "r"(p), "r"(index)
+               : "r0", "memory");
+}
+
+static uint32_t data;
+static inline void *S4_storerbnew_ap(uint8_t v)
+{
+  void *ret;
+  asm volatile("{\n\t"
+               "    r0 = %1\n\t"
+               "    memb(%0 = ##data) = r0.new\n\t"
+               "}\n"
+               : "=r"(ret)
+               : "r"(v)
+               : "r0", "memory");
+  return ret;
+}
+
+static inline void *S4_storerhnew_ap(uint16_t v)
+{
+  void *ret;
+  asm volatile("{\n\t"
+               "    r0 = %1\n\t"
+               "    memh(%0 = ##data) = r0.new\n\t"
+               "}\n"
+               : "=r"(ret)
+               : "r"(v)
+               : "r0", "memory");
+  return ret;
+}
+
+static inline void *S4_storerinew_ap(uint32_t v)
+{
+  void *ret;
+  asm volatile("{\n\t"
+               "    r0 = %1\n\t"
+               "    memw(%0 = ##data) = r0.new\n\t"
+               "}\n"
+               : "=r"(ret)
+               : "r"(v)
+               : "r0", "memory");
+  return ret;
+}
+
+static inline void S4_storeirbt_io(void *p, int pred)
+{
+  asm volatile("p0 = cmp.eq(%0, #1)\n\t"
+               "if (p0) memb(%1+#4)=#27\n\t"
+               :: "r"(pred), "r"(p)
+               : "p0", "memory");
+}
+
+static inline void S4_storeirbf_io(void *p, int pred)
+{
+  asm volatile("p0 = cmp.eq(%0, #1)\n\t"
+               "if (!p0) memb(%1+#4)=#27\n\t"
+               :: "r"(pred), "r"(p)
+               : "p0", "memory");
+}
+
+static inline void S4_storeirbtnew_io(void *p, int pred)
+{
+  asm volatile("{\n\t"
+               "    p0 = cmp.eq(%0, #1)\n\t"
+               "    if (p0.new) memb(%1+#4)=#27\n\t"
+               "}\n\t"
+               :: "r"(pred), "r"(p)
+               : "p0", "memory");
+}
+
+static inline void S4_storeirbfnew_io(void *p, int pred)
+{
+  asm volatile("{\n\t"
+               "    p0 = cmp.eq(%0, #1)\n\t"
+               "    if (!p0.new) memb(%1+#4)=#27\n\t"
+               "}\n\t"
+               :: "r"(pred), "r"(p)
+               : "p0", "memory");
+}
+
+static inline void S4_storeirht_io(void *p, int pred)
+{
+  asm volatile("p0 = cmp.eq(%0, #1)\n\t"
+               "if (p0) memh(%1+#4)=#27\n\t"
+               :: "r"(pred), "r"(p)
+               : "p0", "memory");
+}
+
+static inline void S4_storeirhf_io(void *p, int pred)
+{
+  asm volatile("p0 = cmp.eq(%0, #1)\n\t"
+               "if (!p0) memh(%1+#4)=#27\n\t"
+               :: "r"(pred), "r"(p)
+               : "p0", "memory");
+}
+
+static inline void S4_storeirhtnew_io(void *p, int pred)
+{
+  asm volatile("{\n\t"
+               "    p0 = cmp.eq(%0, #1)\n\t"
+               "    if (p0.new) memh(%1+#4)=#27\n\t"
+               "}\n\t"
+               :: "r"(pred), "r"(p)
+               : "p0", "memory");
+}
+
+static inline void S4_storeirhfnew_io(void *p, int pred)
+{
+  asm volatile("{\n\t"
+               "    p0 = cmp.eq(%0, #1)\n\t"
+               "    if (!p0.new) memh(%1+#4)=#27\n\t"
+               "}\n\t"
+               :: "r"(pred), "r"(p)
+               : "p0", "memory");
+}
+
+static inline void S4_storeirit_io(void *p, int pred)
+{
+  asm volatile("p0 = cmp.eq(%0, #1)\n\t"
+               "if (p0) memw(%1+#4)=#27\n\t"
+               :: "r"(pred), "r"(p)
+               : "p0", "memory");
+}
+
+static inline void S4_storeirif_io(void *p, int pred)
+{
+  asm volatile("p0 = cmp.eq(%0, #1)\n\t"
+               "if (!p0) memw(%1+#4)=#27\n\t"
+               :: "r"(pred), "r"(p)
+               : "p0", "memory");
+}
+
+static inline void S4_storeiritnew_io(void *p, int pred)
+{
+  asm volatile("{\n\t"
+               "    p0 = cmp.eq(%0, #1)\n\t"
+               "    if (p0.new) memw(%1+#4)=#27\n\t"
+               "}\n\t"
+               :: "r"(pred), "r"(p)
+               : "p0", "memory");
+}
+
+static inline void S4_storeirifnew_io(void *p, int pred)
+{
+  asm volatile("{\n\t"
+               "    p0 = cmp.eq(%0, #1)\n\t"
+               "    if (!p0.new) memw(%1+#4)=#27\n\t"
+               "}\n\t"
+               :: "r"(pred), "r"(p)
+               : "p0", "memory");
+}
+
+int err;
+
+static void check(int val, int expect)
+{
+    if (val != expect) {
+        printf("ERROR: 0x%08x != 0x%08x\n", val, expect);
+        err++;
+    }
+}
+
+uint32_t init[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
+uint32_t array[10];
+
+int main()
+{
+
+    memcpy(array, init, sizeof(array));
+    S4_storerhnew_rr(array, 4, 0xffff);
+    check(array[4], 0xffff);
+
+    data = ~0;
+    check((uint32_t)S4_storerbnew_ap(0x12), (uint32_t)&data);
+    check(data, 0xffffff12);
+
+    data = ~0;
+    check((uint32_t)S4_storerhnew_ap(0x1234), (uint32_t)&data);
+    check(data, 0xffff1234);
+
+    data = ~0;
+    check((uint32_t)S4_storerinew_ap(0x12345678), (uint32_t)&data);
+    check(data, 0x12345678);
+
+    /* Byte */
+    memcpy(array, init, sizeof(array));
+    S4_storeirbt_io(&array[1], 1);
+    check(array[2], 27);
+    S4_storeirbt_io(&array[2], 0);
+    check(array[3], 3);
+
+    memcpy(array, init, sizeof(array));
+    S4_storeirbf_io(&array[3], 0);
+    check(array[4], 27);
+    S4_storeirbf_io(&array[4], 1);
+    check(array[5], 5);
+
+    memcpy(array, init, sizeof(array));
+    S4_storeirbtnew_io(&array[5], 1);
+    check(array[6], 27);
+    S4_storeirbtnew_io(&array[6], 0);
+    check(array[7], 7);
+
+    memcpy(array, init, sizeof(array));
+    S4_storeirbfnew_io(&array[7], 0);
+    check(array[8], 27);
+    S4_storeirbfnew_io(&array[8], 1);
+    check(array[9], 9);
+
+    /* Half word */
+    memcpy(array, init, sizeof(array));
+    S4_storeirht_io(&array[1], 1);
+    check(array[2], 27);
+    S4_storeirht_io(&array[2], 0);
+    check(array[3], 3);
+
+    memcpy(array, init, sizeof(array));
+    S4_storeirhf_io(&array[3], 0);
+    check(array[4], 27);
+    S4_storeirhf_io(&array[4], 1);
+    check(array[5], 5);
+
+    memcpy(array, init, sizeof(array));
+    S4_storeirhtnew_io(&array[5], 1);
+    check(array[6], 27);
+    S4_storeirhtnew_io(&array[6], 0);
+    check(array[7], 7);
+
+    memcpy(array, init, sizeof(array));
+    S4_storeirhfnew_io(&array[7], 0);
+    check(array[8], 27);
+    S4_storeirhfnew_io(&array[8], 1);
+    check(array[9], 9);
+
+    /* Word */
+    memcpy(array, init, sizeof(array));
+    S4_storeirit_io(&array[1], 1);
+    check(array[2], 27);
+    S4_storeirit_io(&array[2], 0);
+    check(array[3], 3);
+
+    memcpy(array, init, sizeof(array));
+    S4_storeirif_io(&array[3], 0);
+    check(array[4], 27);
+    S4_storeirif_io(&array[4], 1);
+    check(array[5], 5);
+
+    memcpy(array, init, sizeof(array));
+    S4_storeiritnew_io(&array[5], 1);
+    check(array[6], 27);
+    S4_storeiritnew_io(&array[6], 0);
+    check(array[7], 7);
+
+    memcpy(array, init, sizeof(array));
+    S4_storeirifnew_io(&array[7], 0);
+    check(array[8], 27);
+    S4_storeirifnew_io(&array[8], 1);
+    check(array[9], 9);
+
+    puts(err ? "FAIL" : "PASS");
+    return err;
+}
diff --git a/tests/tcg/hexagon/preg_alias.c b/tests/tcg/hexagon/preg_alias.c
new file mode 100644
index 0000000..4d51674
--- /dev/null
+++ b/tests/tcg/hexagon/preg_alias.c
@@ -0,0 +1,106 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+
+static inline int preg_alias(int v0, int v1, int v2, int v3)
+{
+  int ret;
+  asm volatile("p0 = %1\n\t"
+               "p1 = %2\n\t"
+               "p2 = %3\n\t"
+               "p3 = %4\n\t"
+               "%0 = C4\n"
+               : "=r"(ret)
+               : "r"(v0), "r"(v1), "r"(v2), "r"(v3)
+               : "p0", "p1", "p2", "p3");
+  return ret;
+}
+
+typedef union {
+    int creg;
+    struct {
+      unsigned char p0;
+      unsigned char p1;
+      unsigned char p2;
+      unsigned char p3;
+    } pregs;
+} PRegs;
+
+static inline void creg_alias(int cval, PRegs *pregs)
+{
+  unsigned char val;
+  asm volatile("c4 = %0" : : "r"(cval));
+
+  asm volatile("%0 = p0" : "=r"(val));
+  pregs->pregs.p0 = val;
+  asm volatile("%0 = p1" : "=r"(val));
+  pregs->pregs.p1 = val;
+  asm volatile("%0 = p2" : "=r"(val));
+  pregs->pregs.p2 = val;
+  asm volatile("%0 = p3" : "=r"(val));
+  pregs->pregs.p3 = val;
+}
+
+int err;
+
+static void check(int val, int expect)
+{
+    if (val != expect) {
+        printf("ERROR: 0x%08x != 0x%08x\n", val, expect);
+        err++;
+    }
+}
+
+int main()
+{
+    int c4;
+    PRegs pregs;
+
+    c4 = preg_alias(0xff, 0x00, 0xff, 0x00);
+    check(c4, 0x00ff00ff);
+    c4 = preg_alias(0xff, 0x00, 0x00, 0x00);
+    check(c4, 0x000000ff);
+    c4 = preg_alias(0x00, 0xff, 0x00, 0x00);
+    check(c4, 0x0000ff00);
+    c4 = preg_alias(0x00, 0x00, 0xff, 0x00);
+    check(c4, 0x00ff0000);
+    c4 = preg_alias(0x00, 0x00, 0x00, 0xff);
+    check(c4, 0xff000000);
+    c4 = preg_alias(0xff, 0xff, 0xff, 0xff);
+    check(c4, 0xffffffff);
+
+    creg_alias(0x00ff00ff, &pregs);
+    check(pregs.creg, 0x00ff00ff);
+    creg_alias(0x00ffff00, &pregs);
+    check(pregs.creg, 0x00ffff00);
+    creg_alias(0x00000000, &pregs);
+    check(pregs.creg, 0x00000000);
+    creg_alias(0xff000000, &pregs);
+    check(pregs.creg, 0xff000000);
+    creg_alias(0x00ff0000, &pregs);
+    check(pregs.creg, 0x00ff0000);
+    creg_alias(0x0000ff00, &pregs);
+    check(pregs.creg, 0x0000ff00);
+    creg_alias(0x000000ff, &pregs);
+    check(pregs.creg, 0x000000ff);
+    creg_alias(0xffffffff, &pregs);
+    check(pregs.creg, 0xffffffff);
+
+    puts(err ? "FAIL" : "PASS");
+    return err;
+}
diff --git a/tests/tcg/hexagon/pthread_cancel.c b/tests/tcg/hexagon/pthread_cancel.c
new file mode 100644
index 0000000..db30cbc
--- /dev/null
+++ b/tests/tcg/hexagon/pthread_cancel.c
@@ -0,0 +1,43 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+#include <unistd.h>
+#include <pthread.h>
+
+static void *func(void *arg)
+{
+    sleep(3);
+    return 0;
+}
+
+int main()
+{
+    int err = 0;
+    pthread_t thread;
+    void *res;
+
+    pthread_create(&thread, 0, func, NULL);
+    pthread_cancel(thread);
+    pthread_join(thread, &res);
+    if (res != PTHREAD_CANCELED) {
+        err++;
+    }
+
+    puts(err == 0 ? "PASS" : "FAIL");
+    return err == 0 ? 0 : -1;
+}
diff --git a/tests/tcg/hexagon/sfminmax.c b/tests/tcg/hexagon/sfminmax.c
new file mode 100644
index 0000000..cfe443f
--- /dev/null
+++ b/tests/tcg/hexagon/sfminmax.c
@@ -0,0 +1,62 @@
+/*
+ *  Copyright(c) 2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+
+/*
+ * This test checks that sfmin/sfmax do not set the FP invalid bit
+ * in USR when one of the operands is a quiet NaN.
+ */
+const int FPINVF_BIT = 1;
+
+int err;
+
+int main()
+{
+    int usr;
+
+    asm volatile("r2 = usr\n\t"
+                 "r2 = clrbit(r2, #%1)\n\t"
+                 "usr = r2\n\t"
+                 "r2 = ##0x7fc00000\n\t"    /* NaN */
+                 "r3 = ##0x7f7fffff\n\t"
+                 "r2 = sfmin(r2, r3)\n\t"
+                 "%0 = usr\n\t"
+                 : "=r"(usr) : "i"(FPINVF_BIT) : "r2", "r3", "usr");
+
+    if (usr & (1 << FPINVF_BIT)) {
+        puts("sfmin test failed");
+        err++;
+    }
+
+    asm volatile("r2 = usr\n\t"
+                 "r2 = clrbit(r2, #%1)\n\t"
+                 "usr = r2\n\t"
+                 "r2 = ##0x7fc00000\n\t"    /* NaN */
+                 "r3 = ##0x7f7fffff\n\t"
+                 "r2 = sfmax(r2, r3)\n\t"
+                 "%0 = usr\n\t"
+                 : "=r"(usr) : "i"(FPINVF_BIT) : "r2", "r3", "usr");
+
+    if (usr & (1 << FPINVF_BIT)) {
+        puts("sfmax test failed");
+        err++;
+    }
+
+    puts(err ? "FAIL" : "PASS");
+    return err ? 1 : 0;
+}
diff --git a/tests/tcg/configure.sh b/tests/tcg/configure.sh
index 102578c..43a940d 100755
--- a/tests/tcg/configure.sh
+++ b/tests/tcg/configure.sh
@@ -69,6 +69,8 @@ fi
 : ${cross_cc_cflags_sparc64="-m64 -mcpu=ultrasparc"}
 : ${cross_cc_x86_64="x86_64-pc-linux-gnu-gcc"}
 : ${cross_cc_cflags_x86_64="-m64"}
+: ${cross_cc_hexagon="hexagon-linux-clang"}
+: ${cross_cc_cflags_hexagon="-mv67 -O2 -Wno-incompatible-pointer-types"}
 
 for target in $target_list; do
   arch=${target%%-*}
@@ -94,7 +96,7 @@ for target in $target_list; do
     xtensa|xtensaeb)
       arches=xtensa
       ;;
-    alpha|cris|hppa|i386|lm32|m68k|openrisc|riscv64|s390x|sh4|sparc64)
+    alpha|cris|hppa|i386|lm32|m68k|openrisc|riscv64|s390x|sh4|sparc64|hexagon)
       arches=$target
       ;;
     *)
diff --git a/tests/tcg/hexagon/Makefile.target b/tests/tcg/hexagon/Makefile.target
new file mode 100644
index 0000000..f696c78
--- /dev/null
+++ b/tests/tcg/hexagon/Makefile.target
@@ -0,0 +1,49 @@
+##
+##  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##
+##  This program is free software; you can redistribute it and/or modify
+##  it under the terms of the GNU General Public License as published by
+##  the Free Software Foundation; either version 2 of the License, or
+##  (at your option) any later version.
+##
+##  This program is distributed in the hope that it will be useful,
+##  but WITHOUT ANY WARRANTY; without even the implied warranty of
+##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+##  GNU General Public License for more details.
+##
+##  You should have received a copy of the GNU General Public License
+##  along with this program; if not, see <http://www.gnu.org/licenses/>.
+##
+
+# Hexagon doesn't support gdb, so skip the EXTRA_RUNS
+EXTRA_RUNS =
+
+# Hexagon has 64K pages, so increase the timeout to keep
+# test-mmap from timing out
+ifeq ($(CONFIG_DEBUG_TCG),y)
+TIMEOUT=90
+else
+TIMEOUT=40
+endif
+
+
+CFLAGS = -mv67 -O2
+
+HEX_SRC=$(SRC_PATH)/tests/tcg/hexagon
+VPATH += $(HEX_SRC)
+
+first: $(HEX_SRC)/first.S
+	$(CC) -static -mv67 -nostdlib $^ -o $@
+
+HEX_TESTS = first
+HEX_TESTS += exec_counters
+HEX_TESTS += misc
+HEX_TESTS += preg_alias
+HEX_TESTS += dual_stores
+HEX_TESTS += clrtnew
+HEX_TESTS += mem_noshuf
+HEX_TESTS += pthread_cancel
+HEX_TESTS += atomics
+HEX_TESTS += sfminmax
+
+TESTS += $(HEX_TESTS)
diff --git a/tests/tcg/hexagon/first.S b/tests/tcg/hexagon/first.S
new file mode 100644
index 0000000..841c0bb
--- /dev/null
+++ b/tests/tcg/hexagon/first.S
@@ -0,0 +1,57 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define SYS_write                64
+#define SYS_exit_group           94
+#define SYS_exit                 93
+
+#define FD_STDOUT                1
+
+	.type	str,@object
+	.section	.rodata
+str:
+	.string	"Hello!\n"
+	.size	str, 8
+
+.text
+.global _start
+_start:
+	r6 = #SYS_write
+	r0 = #FD_STDOUT
+	r1 = ##str
+	r2 = #7
+	trap0(#1)
+
+	r0 = #0
+	r6 = #SYS_exit_group
+	trap0(#1)
+
+.section ".note.ABI-tag", "a"
+.align 4
+.long 1f - 0f          /* name length */
+.long 3f - 2f          /* data length */
+.long  1               /* note type */
+
+/*
+ * It seems the vendor name should be MUSL, but lldb doesn't agree.
+ */
+0:     .asciz "GNU"
+1:     .align 4
+2:     .long 0 /* linux */
+       .long 3,0,0
+3:     .align 4
+
diff --git a/tests/tcg/hexagon/float_convs.ref b/tests/tcg/hexagon/float_convs.ref
new file mode 100644
index 0000000..9ec9ffc
--- /dev/null
+++ b/tests/tcg/hexagon/float_convs.ref
@@ -0,0 +1,748 @@
+### Rounding to nearest
+from single: f32(-nan:0xffa00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (INVALID)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-nan:0xffc00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (OK)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-inf:0xff800000)
+  to double: f64(-inf:0x00fff0000000000000) (OK)
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+  to double: f64(-0x1.fffffe00000000000000p+127:0x00c7efffffe0000000) (INEXACT )
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+  to double: f64(-0x1.1874b200000000000000p+103:0x00c661874b20000000) (INEXACT )
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+  to double: f64(-0x1.c0bab600000000000000p+99:0x00c62c0bab60000000) (INEXACT )
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+  to double: f64(-0x1.31f75000000000000000p-40:0x00bd731f7500000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+  to double: f64(-0x1.50544400000000000000p-66:0x00bbd5054440000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.00000000000000000000p-126:0x80800000)
+  to double: f64(-0x1.00000000000000000000p-126:0x00b810000000000000) (OK)
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(0x0.00000000000000000000p+0:0000000000)
+  to double: f64(0x0.00000000000000000000p+0:00000000000000000000) (OK)
+   to int32: 0 (OK)
+   to int64: 0 (OK)
+  to uint32: 0 (OK)
+  to uint64: 0 (OK)
+from single: f32(0x1.00000000000000000000p-126:0x00800000)
+  to double: f64(0x1.00000000000000000000p-126:0x003810000000000000) (OK)
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.00000000000000000000p-25:0x33000000)
+  to double: f64(0x1.00000000000000000000p-25:0x003e60000000000000) (OK)
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+  to double: f64(0x1.ffffe600000000000000p-25:0x003e6ffffe60000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+  to double: f64(0x1.ff801a00000000000000p-15:0x003f0ff801a0000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.00000c00000000000000p-14:0x38800006)
+  to double: f64(0x1.00000c00000000000000p-14:0x003f100000c0000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.00000000000000000000p+0:0x3f800000)
+  to double: f64(0x1.00000000000000000000p+0:0x003ff0000000000000) (OK)
+   to int32: 1 (OK)
+   to int64: 1 (OK)
+  to uint32: 1 (OK)
+  to uint64: 1 (OK)
+from single: f32(0x1.00400000000000000000p+0:0x3f802000)
+  to double: f64(0x1.00400000000000000000p+0:0x003ff0040000000000) (INEXACT )
+   to int32: 1 (INEXACT )
+   to int64: 1 (INEXACT )
+  to uint32: 1 (INEXACT )
+  to uint64: 1 (INEXACT )
+from single: f32(0x1.00000000000000000000p+1:0x40000000)
+  to double: f64(0x1.00000000000000000000p+1:0x004000000000000000) (OK)
+   to int32: 2 (OK)
+   to int64: 2 (OK)
+  to uint32: 2 (OK)
+  to uint64: 2 (OK)
+from single: f32(0x1.5bf0a800000000000000p+1:0x402df854)
+  to double: f64(0x1.5bf0a800000000000000p+1:0x004005bf0a80000000) (INEXACT )
+   to int32: 2 (INEXACT )
+   to int64: 2 (INEXACT )
+  to uint32: 2 (INEXACT )
+  to uint64: 2 (INEXACT )
+from single: f32(0x1.921fb600000000000000p+1:0x40490fdb)
+  to double: f64(0x1.921fb600000000000000p+1:0x00400921fb60000000) (INEXACT )
+   to int32: 3 (INEXACT )
+   to int64: 3 (INEXACT )
+  to uint32: 3 (INEXACT )
+  to uint64: 3 (INEXACT )
+from single: f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+  to double: f64(0x1.ffbe0000000000000000p+15:0x0040effbe000000000) (INEXACT )
+   to int32: 65503 (OK)
+   to int64: 65503 (OK)
+  to uint32: 65503 (OK)
+  to uint64: 65503 (OK)
+from single: f32(0x1.ffc00000000000000000p+15:0x477fe000)
+  to double: f64(0x1.ffc00000000000000000p+15:0x0040effc0000000000) (INEXACT )
+   to int32: 65504 (OK)
+   to int64: 65504 (OK)
+  to uint32: 65504 (OK)
+  to uint64: 65504 (OK)
+from single: f32(0x1.ffc20000000000000000p+15:0x477fe100)
+  to double: f64(0x1.ffc20000000000000000p+15:0x0040effc2000000000) (INEXACT )
+   to int32: 65505 (OK)
+   to int64: 65505 (OK)
+  to uint32: 65505 (OK)
+  to uint64: 65505 (OK)
+from single: f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+  to double: f64(0x1.ffbf0000000000000000p+16:0x0040fffbf000000000) (INEXACT )
+   to int32: 131007 (OK)
+   to int64: 131007 (OK)
+  to uint32: 131007 (OK)
+  to uint64: 131007 (OK)
+from single: f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+  to double: f64(0x1.ffc00000000000000000p+16:0x0040fffc0000000000) (INEXACT )
+   to int32: 131008 (OK)
+   to int64: 131008 (OK)
+  to uint32: 131008 (OK)
+  to uint64: 131008 (OK)
+from single: f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+  to double: f64(0x1.ffc10000000000000000p+16:0x0040fffc1000000000) (INEXACT )
+   to int32: 131009 (OK)
+   to int64: 131009 (OK)
+  to uint32: 131009 (OK)
+  to uint64: 131009 (OK)
+from single: f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+  to double: f64(0x1.c0bab600000000000000p+99:0x00462c0bab60000000) (INEXACT )
+   to int32: 2147483647 (INVALID)
+   to int64: 9223372036854775807 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+  to double: f64(0x1.fffffe00000000000000p+127:0x0047efffffe0000000) (INEXACT )
+   to int32: 2147483647 (INVALID)
+   to int64: 9223372036854775807 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(inf:0x7f800000)
+  to double: f64(inf:0x007ff0000000000000) (OK)
+   to int32: 2147483647 (INVALID)
+   to int64: 9223372036854775807 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-nan:0x7fc00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (OK)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-nan:0x7fa00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (INVALID)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+### Rounding upwards
+from single: f32(-nan:0xffa00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (INVALID)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-nan:0xffc00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (OK)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-inf:0xff800000)
+  to double: f64(-inf:0x00fff0000000000000) (OK)
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+  to double: f64(-0x1.fffffe00000000000000p+127:0x00c7efffffe0000000) (INEXACT )
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+  to double: f64(-0x1.1874b200000000000000p+103:0x00c661874b20000000) (INEXACT )
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+  to double: f64(-0x1.c0bab600000000000000p+99:0x00c62c0bab60000000) (INEXACT )
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+  to double: f64(-0x1.31f75000000000000000p-40:0x00bd731f7500000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+  to double: f64(-0x1.50544400000000000000p-66:0x00bbd5054440000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.00000000000000000000p-126:0x80800000)
+  to double: f64(-0x1.00000000000000000000p-126:0x00b810000000000000) (OK)
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(0x0.00000000000000000000p+0:0000000000)
+  to double: f64(0x0.00000000000000000000p+0:00000000000000000000) (OK)
+   to int32: 0 (OK)
+   to int64: 0 (OK)
+  to uint32: 0 (OK)
+  to uint64: 0 (OK)
+from single: f32(0x1.00000000000000000000p-126:0x00800000)
+  to double: f64(0x1.00000000000000000000p-126:0x003810000000000000) (OK)
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.00000000000000000000p-25:0x33000000)
+  to double: f64(0x1.00000000000000000000p-25:0x003e60000000000000) (OK)
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+  to double: f64(0x1.ffffe600000000000000p-25:0x003e6ffffe60000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+  to double: f64(0x1.ff801a00000000000000p-15:0x003f0ff801a0000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.00000c00000000000000p-14:0x38800006)
+  to double: f64(0x1.00000c00000000000000p-14:0x003f100000c0000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.00000000000000000000p+0:0x3f800000)
+  to double: f64(0x1.00000000000000000000p+0:0x003ff0000000000000) (OK)
+   to int32: 1 (OK)
+   to int64: 1 (OK)
+  to uint32: 1 (OK)
+  to uint64: 1 (OK)
+from single: f32(0x1.00400000000000000000p+0:0x3f802000)
+  to double: f64(0x1.00400000000000000000p+0:0x003ff0040000000000) (INEXACT )
+   to int32: 1 (INEXACT )
+   to int64: 1 (INEXACT )
+  to uint32: 1 (INEXACT )
+  to uint64: 1 (INEXACT )
+from single: f32(0x1.00000000000000000000p+1:0x40000000)
+  to double: f64(0x1.00000000000000000000p+1:0x004000000000000000) (OK)
+   to int32: 2 (OK)
+   to int64: 2 (OK)
+  to uint32: 2 (OK)
+  to uint64: 2 (OK)
+from single: f32(0x1.5bf0a800000000000000p+1:0x402df854)
+  to double: f64(0x1.5bf0a800000000000000p+1:0x004005bf0a80000000) (INEXACT )
+   to int32: 2 (INEXACT )
+   to int64: 2 (INEXACT )
+  to uint32: 2 (INEXACT )
+  to uint64: 2 (INEXACT )
+from single: f32(0x1.921fb600000000000000p+1:0x40490fdb)
+  to double: f64(0x1.921fb600000000000000p+1:0x00400921fb60000000) (INEXACT )
+   to int32: 3 (INEXACT )
+   to int64: 3 (INEXACT )
+  to uint32: 3 (INEXACT )
+  to uint64: 3 (INEXACT )
+from single: f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+  to double: f64(0x1.ffbe0000000000000000p+15:0x0040effbe000000000) (INEXACT )
+   to int32: 65503 (OK)
+   to int64: 65503 (OK)
+  to uint32: 65503 (OK)
+  to uint64: 65503 (OK)
+from single: f32(0x1.ffc00000000000000000p+15:0x477fe000)
+  to double: f64(0x1.ffc00000000000000000p+15:0x0040effc0000000000) (INEXACT )
+   to int32: 65504 (OK)
+   to int64: 65504 (OK)
+  to uint32: 65504 (OK)
+  to uint64: 65504 (OK)
+from single: f32(0x1.ffc20000000000000000p+15:0x477fe100)
+  to double: f64(0x1.ffc20000000000000000p+15:0x0040effc2000000000) (INEXACT )
+   to int32: 65505 (OK)
+   to int64: 65505 (OK)
+  to uint32: 65505 (OK)
+  to uint64: 65505 (OK)
+from single: f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+  to double: f64(0x1.ffbf0000000000000000p+16:0x0040fffbf000000000) (INEXACT )
+   to int32: 131007 (OK)
+   to int64: 131007 (OK)
+  to uint32: 131007 (OK)
+  to uint64: 131007 (OK)
+from single: f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+  to double: f64(0x1.ffc00000000000000000p+16:0x0040fffc0000000000) (INEXACT )
+   to int32: 131008 (OK)
+   to int64: 131008 (OK)
+  to uint32: 131008 (OK)
+  to uint64: 131008 (OK)
+from single: f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+  to double: f64(0x1.ffc10000000000000000p+16:0x0040fffc1000000000) (INEXACT )
+   to int32: 131009 (OK)
+   to int64: 131009 (OK)
+  to uint32: 131009 (OK)
+  to uint64: 131009 (OK)
+from single: f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+  to double: f64(0x1.c0bab600000000000000p+99:0x00462c0bab60000000) (INEXACT )
+   to int32: 2147483647 (INVALID)
+   to int64: 9223372036854775807 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+  to double: f64(0x1.fffffe00000000000000p+127:0x0047efffffe0000000) (INEXACT )
+   to int32: 2147483647 (INVALID)
+   to int64: 9223372036854775807 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(inf:0x7f800000)
+  to double: f64(inf:0x007ff0000000000000) (OK)
+   to int32: 2147483647 (INVALID)
+   to int64: 9223372036854775807 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-nan:0x7fc00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (OK)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-nan:0x7fa00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (INVALID)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+### Rounding downwards
+from single: f32(-nan:0xffa00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (INVALID)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-nan:0xffc00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (OK)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-inf:0xff800000)
+  to double: f64(-inf:0x00fff0000000000000) (OK)
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+  to double: f64(-0x1.fffffe00000000000000p+127:0x00c7efffffe0000000) (INEXACT )
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+  to double: f64(-0x1.1874b200000000000000p+103:0x00c661874b20000000) (INEXACT )
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+  to double: f64(-0x1.c0bab600000000000000p+99:0x00c62c0bab60000000) (INEXACT )
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+  to double: f64(-0x1.31f75000000000000000p-40:0x00bd731f7500000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+  to double: f64(-0x1.50544400000000000000p-66:0x00bbd5054440000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.00000000000000000000p-126:0x80800000)
+  to double: f64(-0x1.00000000000000000000p-126:0x00b810000000000000) (OK)
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(0x0.00000000000000000000p+0:0000000000)
+  to double: f64(0x0.00000000000000000000p+0:00000000000000000000) (OK)
+   to int32: 0 (OK)
+   to int64: 0 (OK)
+  to uint32: 0 (OK)
+  to uint64: 0 (OK)
+from single: f32(0x1.00000000000000000000p-126:0x00800000)
+  to double: f64(0x1.00000000000000000000p-126:0x003810000000000000) (OK)
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.00000000000000000000p-25:0x33000000)
+  to double: f64(0x1.00000000000000000000p-25:0x003e60000000000000) (OK)
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+  to double: f64(0x1.ffffe600000000000000p-25:0x003e6ffffe60000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+  to double: f64(0x1.ff801a00000000000000p-15:0x003f0ff801a0000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.00000c00000000000000p-14:0x38800006)
+  to double: f64(0x1.00000c00000000000000p-14:0x003f100000c0000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.00000000000000000000p+0:0x3f800000)
+  to double: f64(0x1.00000000000000000000p+0:0x003ff0000000000000) (OK)
+   to int32: 1 (OK)
+   to int64: 1 (OK)
+  to uint32: 1 (OK)
+  to uint64: 1 (OK)
+from single: f32(0x1.00400000000000000000p+0:0x3f802000)
+  to double: f64(0x1.00400000000000000000p+0:0x003ff0040000000000) (INEXACT )
+   to int32: 1 (INEXACT )
+   to int64: 1 (INEXACT )
+  to uint32: 1 (INEXACT )
+  to uint64: 1 (INEXACT )
+from single: f32(0x1.00000000000000000000p+1:0x40000000)
+  to double: f64(0x1.00000000000000000000p+1:0x004000000000000000) (OK)
+   to int32: 2 (OK)
+   to int64: 2 (OK)
+  to uint32: 2 (OK)
+  to uint64: 2 (OK)
+from single: f32(0x1.5bf0a800000000000000p+1:0x402df854)
+  to double: f64(0x1.5bf0a800000000000000p+1:0x004005bf0a80000000) (INEXACT )
+   to int32: 2 (INEXACT )
+   to int64: 2 (INEXACT )
+  to uint32: 2 (INEXACT )
+  to uint64: 2 (INEXACT )
+from single: f32(0x1.921fb600000000000000p+1:0x40490fdb)
+  to double: f64(0x1.921fb600000000000000p+1:0x00400921fb60000000) (INEXACT )
+   to int32: 3 (INEXACT )
+   to int64: 3 (INEXACT )
+  to uint32: 3 (INEXACT )
+  to uint64: 3 (INEXACT )
+from single: f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+  to double: f64(0x1.ffbe0000000000000000p+15:0x0040effbe000000000) (INEXACT )
+   to int32: 65503 (OK)
+   to int64: 65503 (OK)
+  to uint32: 65503 (OK)
+  to uint64: 65503 (OK)
+from single: f32(0x1.ffc00000000000000000p+15:0x477fe000)
+  to double: f64(0x1.ffc00000000000000000p+15:0x0040effc0000000000) (INEXACT )
+   to int32: 65504 (OK)
+   to int64: 65504 (OK)
+  to uint32: 65504 (OK)
+  to uint64: 65504 (OK)
+from single: f32(0x1.ffc20000000000000000p+15:0x477fe100)
+  to double: f64(0x1.ffc20000000000000000p+15:0x0040effc2000000000) (INEXACT )
+   to int32: 65505 (OK)
+   to int64: 65505 (OK)
+  to uint32: 65505 (OK)
+  to uint64: 65505 (OK)
+from single: f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+  to double: f64(0x1.ffbf0000000000000000p+16:0x0040fffbf000000000) (INEXACT )
+   to int32: 131007 (OK)
+   to int64: 131007 (OK)
+  to uint32: 131007 (OK)
+  to uint64: 131007 (OK)
+from single: f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+  to double: f64(0x1.ffc00000000000000000p+16:0x0040fffc0000000000) (INEXACT )
+   to int32: 131008 (OK)
+   to int64: 131008 (OK)
+  to uint32: 131008 (OK)
+  to uint64: 131008 (OK)
+from single: f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+  to double: f64(0x1.ffc10000000000000000p+16:0x0040fffc1000000000) (INEXACT )
+   to int32: 131009 (OK)
+   to int64: 131009 (OK)
+  to uint32: 131009 (OK)
+  to uint64: 131009 (OK)
+from single: f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+  to double: f64(0x1.c0bab600000000000000p+99:0x00462c0bab60000000) (INEXACT )
+   to int32: 2147483647 (INVALID)
+   to int64: 9223372036854775807 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+  to double: f64(0x1.fffffe00000000000000p+127:0x0047efffffe0000000) (INEXACT )
+   to int32: 2147483647 (INVALID)
+   to int64: 9223372036854775807 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(inf:0x7f800000)
+  to double: f64(inf:0x007ff0000000000000) (OK)
+   to int32: 2147483647 (INVALID)
+   to int64: 9223372036854775807 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-nan:0x7fc00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (OK)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-nan:0x7fa00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (INVALID)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+### Rounding to zero
+from single: f32(-nan:0xffa00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (INVALID)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-nan:0xffc00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (OK)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-inf:0xff800000)
+  to double: f64(-inf:0x00fff0000000000000) (OK)
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+  to double: f64(-0x1.fffffe00000000000000p+127:0x00c7efffffe0000000) (INEXACT )
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+  to double: f64(-0x1.1874b200000000000000p+103:0x00c661874b20000000) (INEXACT )
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+  to double: f64(-0x1.c0bab600000000000000p+99:0x00c62c0bab60000000) (INEXACT )
+   to int32: -2147483648 (INVALID)
+   to int64: -9223372036854775808 (INVALID)
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+  to double: f64(-0x1.31f75000000000000000p-40:0x00bd731f7500000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+  to double: f64(-0x1.50544400000000000000p-66:0x00bbd5054440000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(-0x1.00000000000000000000p-126:0x80800000)
+  to double: f64(-0x1.00000000000000000000p-126:0x00b810000000000000) (OK)
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INVALID)
+  to uint64: 0 (INVALID)
+from single: f32(0x0.00000000000000000000p+0:0000000000)
+  to double: f64(0x0.00000000000000000000p+0:00000000000000000000) (OK)
+   to int32: 0 (OK)
+   to int64: 0 (OK)
+  to uint32: 0 (OK)
+  to uint64: 0 (OK)
+from single: f32(0x1.00000000000000000000p-126:0x00800000)
+  to double: f64(0x1.00000000000000000000p-126:0x003810000000000000) (OK)
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.00000000000000000000p-25:0x33000000)
+  to double: f64(0x1.00000000000000000000p-25:0x003e60000000000000) (OK)
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+  to double: f64(0x1.ffffe600000000000000p-25:0x003e6ffffe60000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+  to double: f64(0x1.ff801a00000000000000p-15:0x003f0ff801a0000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.00000c00000000000000p-14:0x38800006)
+  to double: f64(0x1.00000c00000000000000p-14:0x003f100000c0000000) (INEXACT )
+   to int32: 0 (INEXACT )
+   to int64: 0 (INEXACT )
+  to uint32: 0 (INEXACT )
+  to uint64: 0 (INEXACT )
+from single: f32(0x1.00000000000000000000p+0:0x3f800000)
+  to double: f64(0x1.00000000000000000000p+0:0x003ff0000000000000) (OK)
+   to int32: 1 (OK)
+   to int64: 1 (OK)
+  to uint32: 1 (OK)
+  to uint64: 1 (OK)
+from single: f32(0x1.00400000000000000000p+0:0x3f802000)
+  to double: f64(0x1.00400000000000000000p+0:0x003ff0040000000000) (INEXACT )
+   to int32: 1 (INEXACT )
+   to int64: 1 (INEXACT )
+  to uint32: 1 (INEXACT )
+  to uint64: 1 (INEXACT )
+from single: f32(0x1.00000000000000000000p+1:0x40000000)
+  to double: f64(0x1.00000000000000000000p+1:0x004000000000000000) (OK)
+   to int32: 2 (OK)
+   to int64: 2 (OK)
+  to uint32: 2 (OK)
+  to uint64: 2 (OK)
+from single: f32(0x1.5bf0a800000000000000p+1:0x402df854)
+  to double: f64(0x1.5bf0a800000000000000p+1:0x004005bf0a80000000) (INEXACT )
+   to int32: 2 (INEXACT )
+   to int64: 2 (INEXACT )
+  to uint32: 2 (INEXACT )
+  to uint64: 2 (INEXACT )
+from single: f32(0x1.921fb600000000000000p+1:0x40490fdb)
+  to double: f64(0x1.921fb600000000000000p+1:0x00400921fb60000000) (INEXACT )
+   to int32: 3 (INEXACT )
+   to int64: 3 (INEXACT )
+  to uint32: 3 (INEXACT )
+  to uint64: 3 (INEXACT )
+from single: f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+  to double: f64(0x1.ffbe0000000000000000p+15:0x0040effbe000000000) (INEXACT )
+   to int32: 65503 (OK)
+   to int64: 65503 (OK)
+  to uint32: 65503 (OK)
+  to uint64: 65503 (OK)
+from single: f32(0x1.ffc00000000000000000p+15:0x477fe000)
+  to double: f64(0x1.ffc00000000000000000p+15:0x0040effc0000000000) (INEXACT )
+   to int32: 65504 (OK)
+   to int64: 65504 (OK)
+  to uint32: 65504 (OK)
+  to uint64: 65504 (OK)
+from single: f32(0x1.ffc20000000000000000p+15:0x477fe100)
+  to double: f64(0x1.ffc20000000000000000p+15:0x0040effc2000000000) (INEXACT )
+   to int32: 65505 (OK)
+   to int64: 65505 (OK)
+  to uint32: 65505 (OK)
+  to uint64: 65505 (OK)
+from single: f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+  to double: f64(0x1.ffbf0000000000000000p+16:0x0040fffbf000000000) (INEXACT )
+   to int32: 131007 (OK)
+   to int64: 131007 (OK)
+  to uint32: 131007 (OK)
+  to uint64: 131007 (OK)
+from single: f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+  to double: f64(0x1.ffc00000000000000000p+16:0x0040fffc0000000000) (INEXACT )
+   to int32: 131008 (OK)
+   to int64: 131008 (OK)
+  to uint32: 131008 (OK)
+  to uint64: 131008 (OK)
+from single: f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+  to double: f64(0x1.ffc10000000000000000p+16:0x0040fffc1000000000) (INEXACT )
+   to int32: 131009 (OK)
+   to int64: 131009 (OK)
+  to uint32: 131009 (OK)
+  to uint64: 131009 (OK)
+from single: f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+  to double: f64(0x1.c0bab600000000000000p+99:0x00462c0bab60000000) (INEXACT )
+   to int32: 2147483647 (INVALID)
+   to int64: 9223372036854775807 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+  to double: f64(0x1.fffffe00000000000000p+127:0x0047efffffe0000000) (INEXACT )
+   to int32: 2147483647 (INVALID)
+   to int64: 9223372036854775807 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(inf:0x7f800000)
+  to double: f64(inf:0x007ff0000000000000) (OK)
+   to int32: 2147483647 (INVALID)
+   to int64: 9223372036854775807 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-nan:0x7fc00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (OK)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
+from single: f32(-nan:0x7fa00000)
+  to double: f64(-nan:0x00ffffffffffffffff) (INVALID)
+   to int32: -1 (INVALID)
+   to int64: -1 (INVALID)
+  to uint32: -1 (INVALID)
+  to uint64: -1 (INVALID)
diff --git a/tests/tcg/hexagon/float_madds.ref b/tests/tcg/hexagon/float_madds.ref
new file mode 100644
index 0000000..ceed3bb
--- /dev/null
+++ b/tests/tcg/hexagon/float_madds.ref
@@ -0,0 +1,768 @@
+### Rounding to nearest
+op : f32(-nan:0xffa00000) * f32(-nan:0xffc00000) + f32(-inf:0xff800000)
+res: f32(-nan:0xffffffff) flags=INVALID (0/0)
+op : f32(-nan:0xffc00000) * f32(-inf:0xff800000) + f32(-nan:0xffa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (0/1)
+op : f32(-inf:0xff800000) * f32(-nan:0xffa00000) + f32(-nan:0xffc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (0/2)
+op : f32(-nan:0xffc00000) * f32(-inf:0xff800000) + f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+res: f32(-nan:0xffffffff) flags=OK (1/0)
+op : f32(-inf:0xff800000) * f32(-0x1.fffffe00000000000000p+127:0xff7fffff) + f32(-nan:0xffc00000)
+res: f32(-nan:0xffffffff) flags=OK (1/1)
+op : f32(-0x1.fffffe00000000000000p+127:0xff7fffff) * f32(-nan:0xffc00000) + f32(-inf:0xff800000)
+res: f32(-nan:0xffffffff) flags=OK (1/2)
+op : f32(-inf:0xff800000) * f32(-0x1.fffffe00000000000000p+127:0xff7fffff) + f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+res: f32(inf:0x7f800000) flags=OK (2/0)
+op : f32(-0x1.fffffe00000000000000p+127:0xff7fffff) * f32(-0x1.1874b200000000000000p+103:0xf30c3a59) + f32(-inf:0xff800000)
+res: f32(-inf:0xff800000) flags=OK (2/1)
+op : f32(-0x1.1874b200000000000000p+103:0xf30c3a59) * f32(-inf:0xff800000) + f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+res: f32(inf:0x7f800000) flags=OK (2/2)
+op : f32(-0x1.fffffe00000000000000p+127:0xff7fffff) * f32(-0x1.1874b200000000000000p+103:0xf30c3a59) + f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (3/0)
+op : f32(-0x1.1874b200000000000000p+103:0xf30c3a59) * f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) + f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (3/1)
+op : f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) * f32(-0x1.fffffe00000000000000p+127:0xff7fffff) + f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (3/2)
+op : f32(-0x1.1874b200000000000000p+103:0xf30c3a59) * f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) + f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (4/0)
+op : f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) * f32(-0x1.31f75000000000000000p-40:0xab98fba8) + f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+res: f32(-0x1.1874b200000000000000p+103:0xf30c3a59) flags=INEXACT  (4/1)
+op : f32(-0x1.31f75000000000000000p-40:0xab98fba8) * f32(-0x1.1874b200000000000000p+103:0xf30c3a59) + f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+res: f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) flags=INEXACT  (4/2)
+op : f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) * f32(-0x1.31f75000000000000000p-40:0xab98fba8) + f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+res: f32(0x1.0c27fa00000000000000p+60:0x5d8613fd) flags=INEXACT  (5/0)
+op : f32(-0x1.31f75000000000000000p-40:0xab98fba8) * f32(-0x1.50544400000000000000p-66:0x9ea82a22) + f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+res: f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) flags=INEXACT  (5/1)
+op : f32(-0x1.50544400000000000000p-66:0x9ea82a22) * f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) + f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+res: f32(0x1.26c46200000000000000p+34:0x50936231) flags=INEXACT  (5/2)
+op : f32(-0x1.31f75000000000000000p-40:0xab98fba8) * f32(-0x1.50544400000000000000p-66:0x9ea82a22) + f32(-0x1.00000000000000000000p-126:0x80800000)
+res: f32(0x1.91f94000000000000000p-106:0x0ac8fca0) flags=INEXACT  (6/0)
+op : f32(-0x1.50544400000000000000p-66:0x9ea82a22) * f32(-0x1.00000000000000000000p-126:0x80800000) + f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+res: f32(-0x1.31f75000000000000000p-40:0xab98fba8) flags=INEXACT  (6/1)
+op : f32(-0x1.00000000000000000000p-126:0x80800000) * f32(-0x1.31f75000000000000000p-40:0xab98fba8) + f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+res: f32(-0x1.50544400000000000000p-66:0x9ea82a22) flags=INEXACT  (6/2)
+op : f32(-0x1.50544400000000000000p-66:0x9ea82a22) * f32(-0x1.00000000000000000000p-126:0x80800000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(0x0.00000000000000000000p+0:0000000000) flags=UNDERFLOW INEXACT  (7/0)
+op : f32(-0x1.00000000000000000000p-126:0x80800000) * f32(0x0.00000000000000000000p+0:0000000000) + f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+res: f32(-0x1.50544400000000000000p-66:0x9ea82a22) flags=INEXACT  (7/1)
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(-0x1.50544400000000000000p-66:0x9ea82a22) + f32(-0x1.00000000000000000000p-126:0x80800000)
+res: f32(-0x1.00000000000000000000p-126:0x80800000) flags=OK (7/2)
+op : f32(-0x1.00000000000000000000p-126:0x80800000) * f32(0x0.00000000000000000000p+0:0000000000) + f32(0x1.00000000000000000000p-126:0x00800000)
+res: f32(0x1.00000000000000000000p-126:0x00800000) flags=OK (8/0)
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(0x1.00000000000000000000p-126:0x00800000) + f32(-0x1.00000000000000000000p-126:0x80800000)
+res: f32(-0x1.00000000000000000000p-126:0x80800000) flags=OK (8/1)
+op : f32(0x1.00000000000000000000p-126:0x00800000) * f32(-0x1.00000000000000000000p-126:0x80800000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(-0x0.00000000000000000000p+0:0x80000000) flags=UNDERFLOW INEXACT  (8/2)
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(0x1.00000000000000000000p-126:0x00800000) + f32(0x1.00000000000000000000p-25:0x33000000)
+res: f32(0x1.00000000000000000000p-25:0x33000000) flags=OK (9/0)
+op : f32(0x1.00000000000000000000p-126:0x00800000) * f32(0x1.00000000000000000000p-25:0x33000000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(0x0.00000000000000000000p+0:0000000000) flags=UNDERFLOW INEXACT  (9/1)
+op : f32(0x1.00000000000000000000p-25:0x33000000) * f32(0x0.00000000000000000000p+0:0000000000) + f32(0x1.00000000000000000000p-126:0x00800000)
+res: f32(0x1.00000000000000000000p-126:0x00800000) flags=OK (9/2)
+op : f32(0x1.00000000000000000000p-126:0x00800000) * f32(0x1.00000000000000000000p-25:0x33000000) + f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+res: f32(0x1.ffffe600000000000000p-25:0x337ffff3) flags=INEXACT  (10/0)
+op : f32(0x1.00000000000000000000p-25:0x33000000) * f32(0x1.ffffe600000000000000p-25:0x337ffff3) + f32(0x1.00000000000000000000p-126:0x00800000)
+res: f32(0x1.ffffe600000000000000p-50:0x26fffff3) flags=INEXACT  (10/1)
+op : f32(0x1.ffffe600000000000000p-25:0x337ffff3) * f32(0x1.00000000000000000000p-126:0x00800000) + f32(0x1.00000000000000000000p-25:0x33000000)
+res: f32(0x1.00000000000000000000p-25:0x33000000) flags=INEXACT  (10/2)
+op : f32(0x1.00000000000000000000p-25:0x33000000) * f32(0x1.ffffe600000000000000p-25:0x337ffff3) + f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+res: f32(0x1.ff801a00000000000000p-15:0x387fc00d) flags=INEXACT  (11/0)
+op : f32(0x1.ffffe600000000000000p-25:0x337ffff3) * f32(0x1.ff801a00000000000000p-15:0x387fc00d) + f32(0x1.00000000000000000000p-25:0x33000000)
+res: f32(0x1.0007fe00000000000000p-25:0x330003ff) flags=INEXACT  (11/1)
+op : f32(0x1.ff801a00000000000000p-15:0x387fc00d) * f32(0x1.00000000000000000000p-25:0x33000000) + f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+res: f32(0x1.0001f200000000000000p-24:0x338000f9) flags=INEXACT  (11/2)
+op : f32(0x1.ffffe600000000000000p-25:0x337ffff3) * f32(0x1.ff801a00000000000000p-15:0x387fc00d) + f32(0x1.00000c00000000000000p-14:0x38800006)
+res: f32(0x1.00000c00000000000000p-14:0x38800006) flags=INEXACT  (12/0)
+op : f32(0x1.ff801a00000000000000p-15:0x387fc00d) * f32(0x1.00000c00000000000000p-14:0x38800006) + f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+res: f32(0x1.0ffbf400000000000000p-24:0x3387fdfa) flags=INEXACT  (12/1)
+op : f32(0x1.00000c00000000000000p-14:0x38800006) * f32(0x1.ffffe600000000000000p-25:0x337ffff3) + f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+res: f32(0x1.ff801c00000000000000p-15:0x387fc00e) flags=INEXACT  (12/2)
+op : f32(0x1.ff801a00000000000000p-15:0x387fc00d) * f32(0x1.00000c00000000000000p-14:0x38800006) + f32(0x1.00000000000000000000p+0:0x3f800000)
+res: f32(0x1.00000000000000000000p+0:0x3f800000) flags=INEXACT  (13/0)
+op : f32(0x1.00000c00000000000000p-14:0x38800006) * f32(0x1.00000000000000000000p+0:0x3f800000) + f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+res: f32(0x1.ffc01800000000000000p-14:0x38ffe00c) flags=INEXACT  (13/1)
+op : f32(0x1.00000000000000000000p+0:0x3f800000) * f32(0x1.ff801a00000000000000p-15:0x387fc00d) + f32(0x1.00000c00000000000000p-14:0x38800006)
+res: f32(0x1.ffc01800000000000000p-14:0x38ffe00c) flags=INEXACT  (13/2)
+op : f32(0x1.00000c00000000000000p-14:0x38800006) * f32(0x1.00000000000000000000p+0:0x3f800000) + f32(0x1.00400000000000000000p+0:0x3f802000)
+res: f32(0x1.00440000000000000000p+0:0x3f802200) flags=INEXACT  (14/0)
+op : f32(0x1.00000000000000000000p+0:0x3f800000) * f32(0x1.00400000000000000000p+0:0x3f802000) + f32(0x1.00000c00000000000000p-14:0x38800006)
+res: f32(0x1.00440000000000000000p+0:0x3f802200) flags=INEXACT  (14/1)
+op : f32(0x1.00400000000000000000p+0:0x3f802000) * f32(0x1.00000c00000000000000p-14:0x38800006) + f32(0x1.00000000000000000000p+0:0x3f800000)
+res: f32(0x1.00040200000000000000p+0:0x3f800201) flags=INEXACT  (14/2)
+op : f32(0x1.00000000000000000000p+0:0x3f800000) * f32(0x1.00400000000000000000p+0:0x3f802000) + f32(0x1.00000000000000000000p+1:0x40000000)
+res: f32(0x1.80200000000000000000p+1:0x40401000) flags=INEXACT  (15/0)
+op : f32(0x1.00400000000000000000p+0:0x3f802000) * f32(0x1.00000000000000000000p+1:0x40000000) + f32(0x1.00000000000000000000p+0:0x3f800000)
+res: f32(0x1.80400000000000000000p+1:0x40402000) flags=INEXACT  (15/1)
+op : f32(0x1.00000000000000000000p+1:0x40000000) * f32(0x1.00000000000000000000p+0:0x3f800000) + f32(0x1.00400000000000000000p+0:0x3f802000)
+res: f32(0x1.80200000000000000000p+1:0x40401000) flags=INEXACT  (15/2)
+op : f32(0x1.00400000000000000000p+0:0x3f802000) * f32(0x1.00000000000000000000p+1:0x40000000) + f32(0x1.5bf0a800000000000000p+1:0x402df854)
+res: f32(0x1.2e185400000000000000p+2:0x40970c2a) flags=INEXACT  (16/0)
+op : f32(0x1.00000000000000000000p+1:0x40000000) * f32(0x1.5bf0a800000000000000p+1:0x402df854) + f32(0x1.00400000000000000000p+0:0x3f802000)
+res: f32(0x1.9c00a800000000000000p+2:0x40ce0054) flags=INEXACT  (16/1)
+op : f32(0x1.5bf0a800000000000000p+1:0x402df854) * f32(0x1.00400000000000000000p+0:0x3f802000) + f32(0x1.00000000000000000000p+1:0x40000000)
+res: f32(0x1.2e23d200000000000000p+2:0x409711e9) flags=INEXACT  (16/2)
+op : f32(0x1.00000000000000000000p+1:0x40000000) * f32(0x1.5bf0a800000000000000p+1:0x402df854) + f32(0x1.921fb600000000000000p+1:0x40490fdb)
+res: f32(0x1.12804200000000000000p+3:0x41094021) flags=INEXACT  (17/0)
+op : f32(0x1.5bf0a800000000000000p+1:0x402df854) * f32(0x1.921fb600000000000000p+1:0x40490fdb) + f32(0x1.00000000000000000000p+1:0x40000000)
+res: f32(0x1.51458000000000000000p+3:0x4128a2c0) flags=INEXACT  (17/1)
+op : f32(0x1.921fb600000000000000p+1:0x40490fdb) * f32(0x1.00000000000000000000p+1:0x40000000) + f32(0x1.5bf0a800000000000000p+1:0x402df854)
+res: f32(0x1.200c0400000000000000p+3:0x41100602) flags=INEXACT  (17/2)
+op : f32(0x1.5bf0a800000000000000p+1:0x402df854) * f32(0x1.921fb600000000000000p+1:0x40490fdb) + f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+res: f32(0x1.ffcf1400000000000000p+15:0x477fe78a) flags=INEXACT  (18/0)
+op : f32(0x1.921fb600000000000000p+1:0x40490fdb) * f32(0x1.ffbe0000000000000000p+15:0x477fdf00) + f32(0x1.5bf0a800000000000000p+1:0x402df854)
+res: f32(0x1.91ed3c00000000000000p+17:0x4848f69e) flags=INEXACT  (18/1)
+op : f32(0x1.ffbe0000000000000000p+15:0x477fdf00) * f32(0x1.5bf0a800000000000000p+1:0x402df854) + f32(0x1.921fb600000000000000p+1:0x40490fdb)
+res: f32(0x1.5bc56000000000000000p+17:0x482de2b0) flags=INEXACT  (18/2)
+op : f32(0x1.921fb600000000000000p+1:0x40490fdb) * f32(0x1.ffbe0000000000000000p+15:0x477fdf00) + f32(0x1.ffc00000000000000000p+15:0x477fe000)
+res: f32(0x1.08edf000000000000000p+18:0x488476f8) flags=INEXACT  (19/0)
+op : f32(0x1.ffbe0000000000000000p+15:0x477fdf00) * f32(0x1.ffc00000000000000000p+15:0x477fe000) + f32(0x1.921fb600000000000000p+1:0x40490fdb)
+res: f32(0x1.ff7e0800000000000000p+31:0x4f7fbf04) flags=INEXACT  (19/1)
+op : f32(0x1.ffc00000000000000000p+15:0x477fe000) * f32(0x1.921fb600000000000000p+1:0x40490fdb) + f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+res: f32(0x1.08ee7a00000000000000p+18:0x4884773d) flags=INEXACT  (19/2)
+op : f32(0x1.ffbe0000000000000000p+15:0x477fdf00) * f32(0x1.ffc00000000000000000p+15:0x477fe000) + f32(0x1.ffc20000000000000000p+15:0x477fe100)
+res: f32(0x1.ff800800000000000000p+31:0x4f7fc004) flags=INEXACT  (20/0)
+op : f32(0x1.ffc00000000000000000p+15:0x477fe000) * f32(0x1.ffc20000000000000000p+15:0x477fe100) + f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+res: f32(0x1.ff840800000000000000p+31:0x4f7fc204) flags=INEXACT  (20/1)
+op : f32(0x1.ffc20000000000000000p+15:0x477fe100) * f32(0x1.ffbe0000000000000000p+15:0x477fdf00) + f32(0x1.ffc00000000000000000p+15:0x477fe000)
+res: f32(0x1.ff820800000000000000p+31:0x4f7fc104) flags=INEXACT  (20/2)
+op : f32(0x1.ffc00000000000000000p+15:0x477fe000) * f32(0x1.ffc20000000000000000p+15:0x477fe100) + f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+res: f32(0x1.ff860800000000000000p+31:0x4f7fc304) flags=INEXACT  (21/0)
+op : f32(0x1.ffc20000000000000000p+15:0x477fe100) * f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) + f32(0x1.ffc00000000000000000p+15:0x477fe000)
+res: f32(0x1.ff820800000000000000p+32:0x4fffc104) flags=INEXACT  (21/1)
+op : f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) * f32(0x1.ffc00000000000000000p+15:0x477fe000) + f32(0x1.ffc20000000000000000p+15:0x477fe100)
+res: f32(0x1.ff800800000000000000p+32:0x4fffc004) flags=INEXACT  (21/2)
+op : f32(0x1.ffc20000000000000000p+15:0x477fe100) * f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) + f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+res: f32(0x1.ff830800000000000000p+32:0x4fffc184) flags=INEXACT  (22/0)
+op : f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) * f32(0x1.ffc00000000000000000p+16:0x47ffe000) + f32(0x1.ffc20000000000000000p+15:0x477fe100)
+res: f32(0x1.ff7f8800000000000000p+33:0x507fbfc4) flags=INEXACT  (22/1)
+op : f32(0x1.ffc00000000000000000p+16:0x47ffe000) * f32(0x1.ffc20000000000000000p+15:0x477fe100) + f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+res: f32(0x1.ff840800000000000000p+32:0x4fffc204) flags=INEXACT  (22/2)
+op : f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) * f32(0x1.ffc00000000000000000p+16:0x47ffe000) + f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+res: f32(0x1.ff800800000000000000p+33:0x507fc004) flags=INEXACT  (23/0)
+op : f32(0x1.ffc00000000000000000p+16:0x47ffe000) * f32(0x1.ffc10000000000000000p+16:0x47ffe080) + f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+res: f32(0x1.ff820800000000000000p+33:0x507fc104) flags=INEXACT  (23/1)
+op : f32(0x1.ffc10000000000000000p+16:0x47ffe080) * f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) + f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+res: f32(0x1.ff810800000000000000p+33:0x507fc084) flags=INEXACT  (23/2)
+op : f32(0x1.ffc00000000000000000p+16:0x47ffe000) * f32(0x1.ffc10000000000000000p+16:0x47ffe080) + f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+res: f32(0x1.c0bab600000000000000p+99:0x71605d5b) flags=INEXACT  (24/0)
+op : f32(0x1.ffc10000000000000000p+16:0x47ffe080) * f32(0x1.c0bab600000000000000p+99:0x71605d5b) + f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+res: f32(0x1.c0838000000000000000p+116:0x79e041c0) flags=INEXACT  (24/1)
+op : f32(0x1.c0bab600000000000000p+99:0x71605d5b) * f32(0x1.ffc00000000000000000p+16:0x47ffe000) + f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+res: f32(0x1.c0829e00000000000000p+116:0x79e0414f) flags=INEXACT  (24/2)
+op : f32(0x1.ffc10000000000000000p+16:0x47ffe080) * f32(0x1.c0bab600000000000000p+99:0x71605d5b) + f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (25/0)
+op : f32(0x1.c0bab600000000000000p+99:0x71605d5b) * f32(0x1.fffffe00000000000000p+127:0x7f7fffff) + f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (25/1)
+op : f32(0x1.fffffe00000000000000p+127:0x7f7fffff) * f32(0x1.ffc10000000000000000p+16:0x47ffe080) + f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (25/2)
+op : f32(0x1.c0bab600000000000000p+99:0x71605d5b) * f32(0x1.fffffe00000000000000p+127:0x7f7fffff) + f32(inf:0x7f800000)
+res: f32(inf:0x7f800000) flags=OK (26/0)
+op : f32(0x1.fffffe00000000000000p+127:0x7f7fffff) * f32(inf:0x7f800000) + f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+res: f32(inf:0x7f800000) flags=OK (26/1)
+op : f32(inf:0x7f800000) * f32(0x1.c0bab600000000000000p+99:0x71605d5b) + f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+res: f32(inf:0x7f800000) flags=OK (26/2)
+op : f32(0x1.fffffe00000000000000p+127:0x7f7fffff) * f32(inf:0x7f800000) + f32(-nan:0x7fc00000)
+res: f32(-nan:0xffffffff) flags=OK (27/0)
+op : f32(inf:0x7f800000) * f32(-nan:0x7fc00000) + f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+res: f32(-nan:0xffffffff) flags=OK (27/1)
+op : f32(-nan:0x7fc00000) * f32(0x1.fffffe00000000000000p+127:0x7f7fffff) + f32(inf:0x7f800000)
+res: f32(-nan:0xffffffff) flags=OK (27/2)
+op : f32(inf:0x7f800000) * f32(-nan:0x7fc00000) + f32(-nan:0x7fa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (28/0)
+op : f32(-nan:0x7fc00000) * f32(-nan:0x7fa00000) + f32(inf:0x7f800000)
+res: f32(-nan:0xffffffff) flags=INVALID (28/1)
+op : f32(-nan:0x7fa00000) * f32(inf:0x7f800000) + f32(-nan:0x7fc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (28/2)
+op : f32(-nan:0x7fc00000) * f32(-nan:0x7fa00000) + f32(-nan:0xffa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (29/0)
+op : f32(-nan:0x7fa00000) * f32(-nan:0xffa00000) + f32(-nan:0x7fc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (29/1)
+op : f32(-nan:0xffa00000) * f32(-nan:0x7fc00000) + f32(-nan:0x7fa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (29/2)
+op : f32(-nan:0x7fa00000) * f32(-nan:0xffa00000) + f32(-nan:0xffc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (30/0)
+op : f32(-nan:0xffa00000) * f32(-nan:0xffc00000) + f32(-nan:0x7fa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (30/1)
+op : f32(-nan:0xffc00000) * f32(-nan:0x7fa00000) + f32(-nan:0xffa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (30/2)
+# LP184149
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(0x1.00000000000000000000p-1:0x3f000000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(0x0.00000000000000000000p+0:0000000000) flags=OK (31/0)
+op : f32(0x1.00000000000000000000p-149:0x00000001) * f32(0x1.00000000000000000000p-149:0x00000001) + f32(0x1.00000000000000000000p-149:0x00000001)
+res: f32(0x1.00000000000000000000p-149:0x00000001) flags=UNDERFLOW INEXACT  (32/0)
+### Rounding upwards
+op : f32(-nan:0xffa00000) * f32(-nan:0xffc00000) + f32(-inf:0xff800000)
+res: f32(-nan:0xffffffff) flags=INVALID (0/0)
+op : f32(-nan:0xffc00000) * f32(-inf:0xff800000) + f32(-nan:0xffa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (0/1)
+op : f32(-inf:0xff800000) * f32(-nan:0xffa00000) + f32(-nan:0xffc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (0/2)
+op : f32(-nan:0xffc00000) * f32(-inf:0xff800000) + f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+res: f32(-nan:0xffffffff) flags=OK (1/0)
+op : f32(-inf:0xff800000) * f32(-0x1.fffffe00000000000000p+127:0xff7fffff) + f32(-nan:0xffc00000)
+res: f32(-nan:0xffffffff) flags=OK (1/1)
+op : f32(-0x1.fffffe00000000000000p+127:0xff7fffff) * f32(-nan:0xffc00000) + f32(-inf:0xff800000)
+res: f32(-nan:0xffffffff) flags=OK (1/2)
+op : f32(-inf:0xff800000) * f32(-0x1.fffffe00000000000000p+127:0xff7fffff) + f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+res: f32(inf:0x7f800000) flags=OK (2/0)
+op : f32(-0x1.fffffe00000000000000p+127:0xff7fffff) * f32(-0x1.1874b200000000000000p+103:0xf30c3a59) + f32(-inf:0xff800000)
+res: f32(-inf:0xff800000) flags=OK (2/1)
+op : f32(-0x1.1874b200000000000000p+103:0xf30c3a59) * f32(-inf:0xff800000) + f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+res: f32(inf:0x7f800000) flags=OK (2/2)
+op : f32(-0x1.fffffe00000000000000p+127:0xff7fffff) * f32(-0x1.1874b200000000000000p+103:0xf30c3a59) + f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (3/0)
+op : f32(-0x1.1874b200000000000000p+103:0xf30c3a59) * f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) + f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (3/1)
+op : f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) * f32(-0x1.fffffe00000000000000p+127:0xff7fffff) + f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (3/2)
+op : f32(-0x1.1874b200000000000000p+103:0xf30c3a59) * f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) + f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (4/0)
+op : f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) * f32(-0x1.31f75000000000000000p-40:0xab98fba8) + f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+res: f32(-0x1.1874b000000000000000p+103:0xf30c3a58) flags=INEXACT  (4/1)
+op : f32(-0x1.31f75000000000000000p-40:0xab98fba8) * f32(-0x1.1874b200000000000000p+103:0xf30c3a59) + f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+res: f32(-0x1.c0bab400000000000000p+99:0xf1605d5a) flags=INEXACT  (4/2)
+op : f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) * f32(-0x1.31f75000000000000000p-40:0xab98fba8) + f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+res: f32(0x1.0c27fa00000000000000p+60:0x5d8613fd) flags=INEXACT  (5/0)
+op : f32(-0x1.31f75000000000000000p-40:0xab98fba8) * f32(-0x1.50544400000000000000p-66:0x9ea82a22) + f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+res: f32(-0x1.c0bab400000000000000p+99:0xf1605d5a) flags=INEXACT  (5/1)
+op : f32(-0x1.50544400000000000000p-66:0x9ea82a22) * f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) + f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+res: f32(0x1.26c46200000000000000p+34:0x50936231) flags=INEXACT  (5/2)
+op : f32(-0x1.31f75000000000000000p-40:0xab98fba8) * f32(-0x1.50544400000000000000p-66:0x9ea82a22) + f32(-0x1.00000000000000000000p-126:0x80800000)
+res: f32(0x1.91f94000000000000000p-106:0x0ac8fca0) flags=INEXACT  (6/0)
+op : f32(-0x1.50544400000000000000p-66:0x9ea82a22) * f32(-0x1.00000000000000000000p-126:0x80800000) + f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+res: f32(-0x1.31f74e00000000000000p-40:0xab98fba7) flags=INEXACT  (6/1)
+op : f32(-0x1.00000000000000000000p-126:0x80800000) * f32(-0x1.31f75000000000000000p-40:0xab98fba8) + f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+res: f32(-0x1.50544200000000000000p-66:0x9ea82a21) flags=INEXACT  (6/2)
+op : f32(-0x1.50544400000000000000p-66:0x9ea82a22) * f32(-0x1.00000000000000000000p-126:0x80800000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(0x1.00000000000000000000p-149:0x00000001) flags=UNDERFLOW INEXACT  (7/0)
+op : f32(-0x1.00000000000000000000p-126:0x80800000) * f32(0x0.00000000000000000000p+0:0000000000) + f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+res: f32(-0x1.50544400000000000000p-66:0x9ea82a22) flags=INEXACT  (7/1)
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(-0x1.50544400000000000000p-66:0x9ea82a22) + f32(-0x1.00000000000000000000p-126:0x80800000)
+res: f32(-0x1.00000000000000000000p-126:0x80800000) flags=OK (7/2)
+op : f32(-0x1.00000000000000000000p-126:0x80800000) * f32(0x0.00000000000000000000p+0:0000000000) + f32(0x1.00000000000000000000p-126:0x00800000)
+res: f32(0x1.00000000000000000000p-126:0x00800000) flags=OK (8/0)
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(0x1.00000000000000000000p-126:0x00800000) + f32(-0x1.00000000000000000000p-126:0x80800000)
+res: f32(-0x1.00000000000000000000p-126:0x80800000) flags=OK (8/1)
+op : f32(0x1.00000000000000000000p-126:0x00800000) * f32(-0x1.00000000000000000000p-126:0x80800000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(-0x0.00000000000000000000p+0:0x80000000) flags=UNDERFLOW INEXACT  (8/2)
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(0x1.00000000000000000000p-126:0x00800000) + f32(0x1.00000000000000000000p-25:0x33000000)
+res: f32(0x1.00000000000000000000p-25:0x33000000) flags=OK (9/0)
+op : f32(0x1.00000000000000000000p-126:0x00800000) * f32(0x1.00000000000000000000p-25:0x33000000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(0x1.00000000000000000000p-149:0x00000001) flags=UNDERFLOW INEXACT  (9/1)
+op : f32(0x1.00000000000000000000p-25:0x33000000) * f32(0x0.00000000000000000000p+0:0000000000) + f32(0x1.00000000000000000000p-126:0x00800000)
+res: f32(0x1.00000000000000000000p-126:0x00800000) flags=OK (9/2)
+op : f32(0x1.00000000000000000000p-126:0x00800000) * f32(0x1.00000000000000000000p-25:0x33000000) + f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+res: f32(0x1.ffffe800000000000000p-25:0x337ffff4) flags=INEXACT  (10/0)
+op : f32(0x1.00000000000000000000p-25:0x33000000) * f32(0x1.ffffe600000000000000p-25:0x337ffff3) + f32(0x1.00000000000000000000p-126:0x00800000)
+res: f32(0x1.ffffe800000000000000p-50:0x26fffff4) flags=INEXACT  (10/1)
+op : f32(0x1.ffffe600000000000000p-25:0x337ffff3) * f32(0x1.00000000000000000000p-126:0x00800000) + f32(0x1.00000000000000000000p-25:0x33000000)
+res: f32(0x1.00000200000000000000p-25:0x33000001) flags=INEXACT  (10/2)
+op : f32(0x1.00000000000000000000p-25:0x33000000) * f32(0x1.ffffe600000000000000p-25:0x337ffff3) + f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+res: f32(0x1.ff801c00000000000000p-15:0x387fc00e) flags=INEXACT  (11/0)
+op : f32(0x1.ffffe600000000000000p-25:0x337ffff3) * f32(0x1.ff801a00000000000000p-15:0x387fc00d) + f32(0x1.00000000000000000000p-25:0x33000000)
+res: f32(0x1.00080000000000000000p-25:0x33000400) flags=INEXACT  (11/1)
+op : f32(0x1.ff801a00000000000000p-15:0x387fc00d) * f32(0x1.00000000000000000000p-25:0x33000000) + f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+res: f32(0x1.0001f400000000000000p-24:0x338000fa) flags=INEXACT  (11/2)
+op : f32(0x1.ffffe600000000000000p-25:0x337ffff3) * f32(0x1.ff801a00000000000000p-15:0x387fc00d) + f32(0x1.00000c00000000000000p-14:0x38800006)
+res: f32(0x1.00000e00000000000000p-14:0x38800007) flags=INEXACT  (12/0)
+op : f32(0x1.ff801a00000000000000p-15:0x387fc00d) * f32(0x1.00000c00000000000000p-14:0x38800006) + f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+res: f32(0x1.0ffbf600000000000000p-24:0x3387fdfb) flags=INEXACT  (12/1)
+op : f32(0x1.00000c00000000000000p-14:0x38800006) * f32(0x1.ffffe600000000000000p-25:0x337ffff3) + f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+res: f32(0x1.ff801c00000000000000p-15:0x387fc00e) flags=INEXACT  (12/2)
+op : f32(0x1.ff801a00000000000000p-15:0x387fc00d) * f32(0x1.00000c00000000000000p-14:0x38800006) + f32(0x1.00000000000000000000p+0:0x3f800000)
+res: f32(0x1.00000200000000000000p+0:0x3f800001) flags=INEXACT  (13/0)
+op : f32(0x1.00000c00000000000000p-14:0x38800006) * f32(0x1.00000000000000000000p+0:0x3f800000) + f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+res: f32(0x1.ffc01a00000000000000p-14:0x38ffe00d) flags=INEXACT  (13/1)
+op : f32(0x1.00000000000000000000p+0:0x3f800000) * f32(0x1.ff801a00000000000000p-15:0x387fc00d) + f32(0x1.00000c00000000000000p-14:0x38800006)
+res: f32(0x1.ffc01a00000000000000p-14:0x38ffe00d) flags=INEXACT  (13/2)
+op : f32(0x1.00000c00000000000000p-14:0x38800006) * f32(0x1.00000000000000000000p+0:0x3f800000) + f32(0x1.00400000000000000000p+0:0x3f802000)
+res: f32(0x1.00440200000000000000p+0:0x3f802201) flags=INEXACT  (14/0)
+op : f32(0x1.00000000000000000000p+0:0x3f800000) * f32(0x1.00400000000000000000p+0:0x3f802000) + f32(0x1.00000c00000000000000p-14:0x38800006)
+res: f32(0x1.00440200000000000000p+0:0x3f802201) flags=INEXACT  (14/1)
+op : f32(0x1.00400000000000000000p+0:0x3f802000) * f32(0x1.00000c00000000000000p-14:0x38800006) + f32(0x1.00000000000000000000p+0:0x3f800000)
+res: f32(0x1.00040200000000000000p+0:0x3f800201) flags=INEXACT  (14/2)
+op : f32(0x1.00000000000000000000p+0:0x3f800000) * f32(0x1.00400000000000000000p+0:0x3f802000) + f32(0x1.00000000000000000000p+1:0x40000000)
+res: f32(0x1.80200000000000000000p+1:0x40401000) flags=INEXACT  (15/0)
+op : f32(0x1.00400000000000000000p+0:0x3f802000) * f32(0x1.00000000000000000000p+1:0x40000000) + f32(0x1.00000000000000000000p+0:0x3f800000)
+res: f32(0x1.80400000000000000000p+1:0x40402000) flags=INEXACT  (15/1)
+op : f32(0x1.00000000000000000000p+1:0x40000000) * f32(0x1.00000000000000000000p+0:0x3f800000) + f32(0x1.00400000000000000000p+0:0x3f802000)
+res: f32(0x1.80200000000000000000p+1:0x40401000) flags=INEXACT  (15/2)
+op : f32(0x1.00400000000000000000p+0:0x3f802000) * f32(0x1.00000000000000000000p+1:0x40000000) + f32(0x1.5bf0a800000000000000p+1:0x402df854)
+res: f32(0x1.2e185400000000000000p+2:0x40970c2a) flags=INEXACT  (16/0)
+op : f32(0x1.00000000000000000000p+1:0x40000000) * f32(0x1.5bf0a800000000000000p+1:0x402df854) + f32(0x1.00400000000000000000p+0:0x3f802000)
+res: f32(0x1.9c00a800000000000000p+2:0x40ce0054) flags=INEXACT  (16/1)
+op : f32(0x1.5bf0a800000000000000p+1:0x402df854) * f32(0x1.00400000000000000000p+0:0x3f802000) + f32(0x1.00000000000000000000p+1:0x40000000)
+res: f32(0x1.2e23d400000000000000p+2:0x409711ea) flags=INEXACT  (16/2)
+op : f32(0x1.00000000000000000000p+1:0x40000000) * f32(0x1.5bf0a800000000000000p+1:0x402df854) + f32(0x1.921fb600000000000000p+1:0x40490fdb)
+res: f32(0x1.12804200000000000000p+3:0x41094021) flags=INEXACT  (17/0)
+op : f32(0x1.5bf0a800000000000000p+1:0x402df854) * f32(0x1.921fb600000000000000p+1:0x40490fdb) + f32(0x1.00000000000000000000p+1:0x40000000)
+res: f32(0x1.51458200000000000000p+3:0x4128a2c1) flags=INEXACT  (17/1)
+op : f32(0x1.921fb600000000000000p+1:0x40490fdb) * f32(0x1.00000000000000000000p+1:0x40000000) + f32(0x1.5bf0a800000000000000p+1:0x402df854)
+res: f32(0x1.200c0600000000000000p+3:0x41100603) flags=INEXACT  (17/2)
+op : f32(0x1.5bf0a800000000000000p+1:0x402df854) * f32(0x1.921fb600000000000000p+1:0x40490fdb) + f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+res: f32(0x1.ffcf1600000000000000p+15:0x477fe78b) flags=INEXACT  (18/0)
+op : f32(0x1.921fb600000000000000p+1:0x40490fdb) * f32(0x1.ffbe0000000000000000p+15:0x477fdf00) + f32(0x1.5bf0a800000000000000p+1:0x402df854)
+res: f32(0x1.91ed3c00000000000000p+17:0x4848f69e) flags=INEXACT  (18/1)
+op : f32(0x1.ffbe0000000000000000p+15:0x477fdf00) * f32(0x1.5bf0a800000000000000p+1:0x402df854) + f32(0x1.921fb600000000000000p+1:0x40490fdb)
+res: f32(0x1.5bc56200000000000000p+17:0x482de2b1) flags=INEXACT  (18/2)
+op : f32(0x1.921fb600000000000000p+1:0x40490fdb) * f32(0x1.ffbe0000000000000000p+15:0x477fdf00) + f32(0x1.ffc00000000000000000p+15:0x477fe000)
+res: f32(0x1.08edf000000000000000p+18:0x488476f8) flags=INEXACT  (19/0)
+op : f32(0x1.ffbe0000000000000000p+15:0x477fdf00) * f32(0x1.ffc00000000000000000p+15:0x477fe000) + f32(0x1.921fb600000000000000p+1:0x40490fdb)
+res: f32(0x1.ff7e0a00000000000000p+31:0x4f7fbf05) flags=INEXACT  (19/1)
+op : f32(0x1.ffc00000000000000000p+15:0x477fe000) * f32(0x1.921fb600000000000000p+1:0x40490fdb) + f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+res: f32(0x1.08ee7a00000000000000p+18:0x4884773d) flags=INEXACT  (19/2)
+op : f32(0x1.ffbe0000000000000000p+15:0x477fdf00) * f32(0x1.ffc00000000000000000p+15:0x477fe000) + f32(0x1.ffc20000000000000000p+15:0x477fe100)
+res: f32(0x1.ff800a00000000000000p+31:0x4f7fc005) flags=INEXACT  (20/0)
+op : f32(0x1.ffc00000000000000000p+15:0x477fe000) * f32(0x1.ffc20000000000000000p+15:0x477fe100) + f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+res: f32(0x1.ff840800000000000000p+31:0x4f7fc204) flags=INEXACT  (20/1)
+op : f32(0x1.ffc20000000000000000p+15:0x477fe100) * f32(0x1.ffbe0000000000000000p+15:0x477fdf00) + f32(0x1.ffc00000000000000000p+15:0x477fe000)
+res: f32(0x1.ff820800000000000000p+31:0x4f7fc104) flags=INEXACT  (20/2)
+op : f32(0x1.ffc00000000000000000p+15:0x477fe000) * f32(0x1.ffc20000000000000000p+15:0x477fe100) + f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+res: f32(0x1.ff860800000000000000p+31:0x4f7fc304) flags=INEXACT  (21/0)
+op : f32(0x1.ffc20000000000000000p+15:0x477fe100) * f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) + f32(0x1.ffc00000000000000000p+15:0x477fe000)
+res: f32(0x1.ff820800000000000000p+32:0x4fffc104) flags=INEXACT  (21/1)
+op : f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) * f32(0x1.ffc00000000000000000p+15:0x477fe000) + f32(0x1.ffc20000000000000000p+15:0x477fe100)
+res: f32(0x1.ff800a00000000000000p+32:0x4fffc005) flags=INEXACT  (21/2)
+op : f32(0x1.ffc20000000000000000p+15:0x477fe100) * f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) + f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+res: f32(0x1.ff830800000000000000p+32:0x4fffc184) flags=INEXACT  (22/0)
+op : f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) * f32(0x1.ffc00000000000000000p+16:0x47ffe000) + f32(0x1.ffc20000000000000000p+15:0x477fe100)
+res: f32(0x1.ff7f8a00000000000000p+33:0x507fbfc5) flags=INEXACT  (22/1)
+op : f32(0x1.ffc00000000000000000p+16:0x47ffe000) * f32(0x1.ffc20000000000000000p+15:0x477fe100) + f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+res: f32(0x1.ff840800000000000000p+32:0x4fffc204) flags=INEXACT  (22/2)
+op : f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) * f32(0x1.ffc00000000000000000p+16:0x47ffe000) + f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+res: f32(0x1.ff800a00000000000000p+33:0x507fc005) flags=INEXACT  (23/0)
+op : f32(0x1.ffc00000000000000000p+16:0x47ffe000) * f32(0x1.ffc10000000000000000p+16:0x47ffe080) + f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+res: f32(0x1.ff820800000000000000p+33:0x507fc104) flags=INEXACT  (23/1)
+op : f32(0x1.ffc10000000000000000p+16:0x47ffe080) * f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) + f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+res: f32(0x1.ff810800000000000000p+33:0x507fc084) flags=INEXACT  (23/2)
+op : f32(0x1.ffc00000000000000000p+16:0x47ffe000) * f32(0x1.ffc10000000000000000p+16:0x47ffe080) + f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+res: f32(0x1.c0bab800000000000000p+99:0x71605d5c) flags=INEXACT  (24/0)
+op : f32(0x1.ffc10000000000000000p+16:0x47ffe080) * f32(0x1.c0bab600000000000000p+99:0x71605d5b) + f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+res: f32(0x1.c0838000000000000000p+116:0x79e041c0) flags=INEXACT  (24/1)
+op : f32(0x1.c0bab600000000000000p+99:0x71605d5b) * f32(0x1.ffc00000000000000000p+16:0x47ffe000) + f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+res: f32(0x1.c082a000000000000000p+116:0x79e04150) flags=INEXACT  (24/2)
+op : f32(0x1.ffc10000000000000000p+16:0x47ffe080) * f32(0x1.c0bab600000000000000p+99:0x71605d5b) + f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (25/0)
+op : f32(0x1.c0bab600000000000000p+99:0x71605d5b) * f32(0x1.fffffe00000000000000p+127:0x7f7fffff) + f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (25/1)
+op : f32(0x1.fffffe00000000000000p+127:0x7f7fffff) * f32(0x1.ffc10000000000000000p+16:0x47ffe080) + f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+res: f32(inf:0x7f800000) flags=OVERFLOW INEXACT  (25/2)
+op : f32(0x1.c0bab600000000000000p+99:0x71605d5b) * f32(0x1.fffffe00000000000000p+127:0x7f7fffff) + f32(inf:0x7f800000)
+res: f32(inf:0x7f800000) flags=OK (26/0)
+op : f32(0x1.fffffe00000000000000p+127:0x7f7fffff) * f32(inf:0x7f800000) + f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+res: f32(inf:0x7f800000) flags=OK (26/1)
+op : f32(inf:0x7f800000) * f32(0x1.c0bab600000000000000p+99:0x71605d5b) + f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+res: f32(inf:0x7f800000) flags=OK (26/2)
+op : f32(0x1.fffffe00000000000000p+127:0x7f7fffff) * f32(inf:0x7f800000) + f32(-nan:0x7fc00000)
+res: f32(-nan:0xffffffff) flags=OK (27/0)
+op : f32(inf:0x7f800000) * f32(-nan:0x7fc00000) + f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+res: f32(-nan:0xffffffff) flags=OK (27/1)
+op : f32(-nan:0x7fc00000) * f32(0x1.fffffe00000000000000p+127:0x7f7fffff) + f32(inf:0x7f800000)
+res: f32(-nan:0xffffffff) flags=OK (27/2)
+op : f32(inf:0x7f800000) * f32(-nan:0x7fc00000) + f32(-nan:0x7fa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (28/0)
+op : f32(-nan:0x7fc00000) * f32(-nan:0x7fa00000) + f32(inf:0x7f800000)
+res: f32(-nan:0xffffffff) flags=INVALID (28/1)
+op : f32(-nan:0x7fa00000) * f32(inf:0x7f800000) + f32(-nan:0x7fc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (28/2)
+op : f32(-nan:0x7fc00000) * f32(-nan:0x7fa00000) + f32(-nan:0xffa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (29/0)
+op : f32(-nan:0x7fa00000) * f32(-nan:0xffa00000) + f32(-nan:0x7fc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (29/1)
+op : f32(-nan:0xffa00000) * f32(-nan:0x7fc00000) + f32(-nan:0x7fa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (29/2)
+op : f32(-nan:0x7fa00000) * f32(-nan:0xffa00000) + f32(-nan:0xffc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (30/0)
+op : f32(-nan:0xffa00000) * f32(-nan:0xffc00000) + f32(-nan:0x7fa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (30/1)
+op : f32(-nan:0xffc00000) * f32(-nan:0x7fa00000) + f32(-nan:0xffa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (30/2)
+# LP184149
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(0x1.00000000000000000000p-1:0x3f000000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(0x0.00000000000000000000p+0:0000000000) flags=OK (31/0)
+op : f32(0x1.00000000000000000000p-149:0x00000001) * f32(0x1.00000000000000000000p-149:0x00000001) + f32(0x1.00000000000000000000p-149:0x00000001)
+res: f32(0x1.00000000000000000000p-148:0x00000002) flags=UNDERFLOW INEXACT  (32/0)
+### Rounding downwards
+op : f32(-nan:0xffa00000) * f32(-nan:0xffc00000) + f32(-inf:0xff800000)
+res: f32(-nan:0xffffffff) flags=INVALID (0/0)
+op : f32(-nan:0xffc00000) * f32(-inf:0xff800000) + f32(-nan:0xffa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (0/1)
+op : f32(-inf:0xff800000) * f32(-nan:0xffa00000) + f32(-nan:0xffc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (0/2)
+op : f32(-nan:0xffc00000) * f32(-inf:0xff800000) + f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+res: f32(-nan:0xffffffff) flags=OK (1/0)
+op : f32(-inf:0xff800000) * f32(-0x1.fffffe00000000000000p+127:0xff7fffff) + f32(-nan:0xffc00000)
+res: f32(-nan:0xffffffff) flags=OK (1/1)
+op : f32(-0x1.fffffe00000000000000p+127:0xff7fffff) * f32(-nan:0xffc00000) + f32(-inf:0xff800000)
+res: f32(-nan:0xffffffff) flags=OK (1/2)
+op : f32(-inf:0xff800000) * f32(-0x1.fffffe00000000000000p+127:0xff7fffff) + f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+res: f32(inf:0x7f800000) flags=OK (2/0)
+op : f32(-0x1.fffffe00000000000000p+127:0xff7fffff) * f32(-0x1.1874b200000000000000p+103:0xf30c3a59) + f32(-inf:0xff800000)
+res: f32(-inf:0xff800000) flags=OK (2/1)
+op : f32(-0x1.1874b200000000000000p+103:0xf30c3a59) * f32(-inf:0xff800000) + f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+res: f32(inf:0x7f800000) flags=OK (2/2)
+op : f32(-0x1.fffffe00000000000000p+127:0xff7fffff) * f32(-0x1.1874b200000000000000p+103:0xf30c3a59) + f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (3/0)
+op : f32(-0x1.1874b200000000000000p+103:0xf30c3a59) * f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) + f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (3/1)
+op : f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) * f32(-0x1.fffffe00000000000000p+127:0xff7fffff) + f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (3/2)
+op : f32(-0x1.1874b200000000000000p+103:0xf30c3a59) * f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) + f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (4/0)
+op : f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) * f32(-0x1.31f75000000000000000p-40:0xab98fba8) + f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+res: f32(-0x1.1874b200000000000000p+103:0xf30c3a59) flags=INEXACT  (4/1)
+op : f32(-0x1.31f75000000000000000p-40:0xab98fba8) * f32(-0x1.1874b200000000000000p+103:0xf30c3a59) + f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+res: f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) flags=INEXACT  (4/2)
+op : f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) * f32(-0x1.31f75000000000000000p-40:0xab98fba8) + f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+res: f32(0x1.0c27f800000000000000p+60:0x5d8613fc) flags=INEXACT  (5/0)
+op : f32(-0x1.31f75000000000000000p-40:0xab98fba8) * f32(-0x1.50544400000000000000p-66:0x9ea82a22) + f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+res: f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) flags=INEXACT  (5/1)
+op : f32(-0x1.50544400000000000000p-66:0x9ea82a22) * f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) + f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+res: f32(0x1.26c46000000000000000p+34:0x50936230) flags=INEXACT  (5/2)
+op : f32(-0x1.31f75000000000000000p-40:0xab98fba8) * f32(-0x1.50544400000000000000p-66:0x9ea82a22) + f32(-0x1.00000000000000000000p-126:0x80800000)
+res: f32(0x1.91f93e00000000000000p-106:0x0ac8fc9f) flags=INEXACT  (6/0)
+op : f32(-0x1.50544400000000000000p-66:0x9ea82a22) * f32(-0x1.00000000000000000000p-126:0x80800000) + f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+res: f32(-0x1.31f75000000000000000p-40:0xab98fba8) flags=INEXACT  (6/1)
+op : f32(-0x1.00000000000000000000p-126:0x80800000) * f32(-0x1.31f75000000000000000p-40:0xab98fba8) + f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+res: f32(-0x1.50544400000000000000p-66:0x9ea82a22) flags=INEXACT  (6/2)
+op : f32(-0x1.50544400000000000000p-66:0x9ea82a22) * f32(-0x1.00000000000000000000p-126:0x80800000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(0x0.00000000000000000000p+0:0000000000) flags=UNDERFLOW INEXACT  (7/0)
+op : f32(-0x1.00000000000000000000p-126:0x80800000) * f32(0x0.00000000000000000000p+0:0000000000) + f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+res: f32(-0x1.50544400000000000000p-66:0x9ea82a22) flags=INEXACT  (7/1)
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(-0x1.50544400000000000000p-66:0x9ea82a22) + f32(-0x1.00000000000000000000p-126:0x80800000)
+res: f32(-0x1.00000000000000000000p-126:0x80800000) flags=OK (7/2)
+op : f32(-0x1.00000000000000000000p-126:0x80800000) * f32(0x0.00000000000000000000p+0:0000000000) + f32(0x1.00000000000000000000p-126:0x00800000)
+res: f32(0x1.00000000000000000000p-126:0x00800000) flags=OK (8/0)
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(0x1.00000000000000000000p-126:0x00800000) + f32(-0x1.00000000000000000000p-126:0x80800000)
+res: f32(-0x1.00000000000000000000p-126:0x80800000) flags=OK (8/1)
+op : f32(0x1.00000000000000000000p-126:0x00800000) * f32(-0x1.00000000000000000000p-126:0x80800000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(-0x1.00000000000000000000p-149:0x80000001) flags=UNDERFLOW INEXACT  (8/2)
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(0x1.00000000000000000000p-126:0x00800000) + f32(0x1.00000000000000000000p-25:0x33000000)
+res: f32(0x1.00000000000000000000p-25:0x33000000) flags=OK (9/0)
+op : f32(0x1.00000000000000000000p-126:0x00800000) * f32(0x1.00000000000000000000p-25:0x33000000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(0x0.00000000000000000000p+0:0000000000) flags=UNDERFLOW INEXACT  (9/1)
+op : f32(0x1.00000000000000000000p-25:0x33000000) * f32(0x0.00000000000000000000p+0:0000000000) + f32(0x1.00000000000000000000p-126:0x00800000)
+res: f32(0x1.00000000000000000000p-126:0x00800000) flags=OK (9/2)
+op : f32(0x1.00000000000000000000p-126:0x00800000) * f32(0x1.00000000000000000000p-25:0x33000000) + f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+res: f32(0x1.ffffe600000000000000p-25:0x337ffff3) flags=INEXACT  (10/0)
+op : f32(0x1.00000000000000000000p-25:0x33000000) * f32(0x1.ffffe600000000000000p-25:0x337ffff3) + f32(0x1.00000000000000000000p-126:0x00800000)
+res: f32(0x1.ffffe600000000000000p-50:0x26fffff3) flags=INEXACT  (10/1)
+op : f32(0x1.ffffe600000000000000p-25:0x337ffff3) * f32(0x1.00000000000000000000p-126:0x00800000) + f32(0x1.00000000000000000000p-25:0x33000000)
+res: f32(0x1.00000000000000000000p-25:0x33000000) flags=INEXACT  (10/2)
+op : f32(0x1.00000000000000000000p-25:0x33000000) * f32(0x1.ffffe600000000000000p-25:0x337ffff3) + f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+res: f32(0x1.ff801a00000000000000p-15:0x387fc00d) flags=INEXACT  (11/0)
+op : f32(0x1.ffffe600000000000000p-25:0x337ffff3) * f32(0x1.ff801a00000000000000p-15:0x387fc00d) + f32(0x1.00000000000000000000p-25:0x33000000)
+res: f32(0x1.0007fe00000000000000p-25:0x330003ff) flags=INEXACT  (11/1)
+op : f32(0x1.ff801a00000000000000p-15:0x387fc00d) * f32(0x1.00000000000000000000p-25:0x33000000) + f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+res: f32(0x1.0001f200000000000000p-24:0x338000f9) flags=INEXACT  (11/2)
+op : f32(0x1.ffffe600000000000000p-25:0x337ffff3) * f32(0x1.ff801a00000000000000p-15:0x387fc00d) + f32(0x1.00000c00000000000000p-14:0x38800006)
+res: f32(0x1.00000c00000000000000p-14:0x38800006) flags=INEXACT  (12/0)
+op : f32(0x1.ff801a00000000000000p-15:0x387fc00d) * f32(0x1.00000c00000000000000p-14:0x38800006) + f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+res: f32(0x1.0ffbf400000000000000p-24:0x3387fdfa) flags=INEXACT  (12/1)
+op : f32(0x1.00000c00000000000000p-14:0x38800006) * f32(0x1.ffffe600000000000000p-25:0x337ffff3) + f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+res: f32(0x1.ff801a00000000000000p-15:0x387fc00d) flags=INEXACT  (12/2)
+op : f32(0x1.ff801a00000000000000p-15:0x387fc00d) * f32(0x1.00000c00000000000000p-14:0x38800006) + f32(0x1.00000000000000000000p+0:0x3f800000)
+res: f32(0x1.00000000000000000000p+0:0x3f800000) flags=INEXACT  (13/0)
+op : f32(0x1.00000c00000000000000p-14:0x38800006) * f32(0x1.00000000000000000000p+0:0x3f800000) + f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+res: f32(0x1.ffc01800000000000000p-14:0x38ffe00c) flags=INEXACT  (13/1)
+op : f32(0x1.00000000000000000000p+0:0x3f800000) * f32(0x1.ff801a00000000000000p-15:0x387fc00d) + f32(0x1.00000c00000000000000p-14:0x38800006)
+res: f32(0x1.ffc01800000000000000p-14:0x38ffe00c) flags=INEXACT  (13/2)
+op : f32(0x1.00000c00000000000000p-14:0x38800006) * f32(0x1.00000000000000000000p+0:0x3f800000) + f32(0x1.00400000000000000000p+0:0x3f802000)
+res: f32(0x1.00440000000000000000p+0:0x3f802200) flags=INEXACT  (14/0)
+op : f32(0x1.00000000000000000000p+0:0x3f800000) * f32(0x1.00400000000000000000p+0:0x3f802000) + f32(0x1.00000c00000000000000p-14:0x38800006)
+res: f32(0x1.00440000000000000000p+0:0x3f802200) flags=INEXACT  (14/1)
+op : f32(0x1.00400000000000000000p+0:0x3f802000) * f32(0x1.00000c00000000000000p-14:0x38800006) + f32(0x1.00000000000000000000p+0:0x3f800000)
+res: f32(0x1.00040000000000000000p+0:0x3f800200) flags=INEXACT  (14/2)
+op : f32(0x1.00000000000000000000p+0:0x3f800000) * f32(0x1.00400000000000000000p+0:0x3f802000) + f32(0x1.00000000000000000000p+1:0x40000000)
+res: f32(0x1.80200000000000000000p+1:0x40401000) flags=INEXACT  (15/0)
+op : f32(0x1.00400000000000000000p+0:0x3f802000) * f32(0x1.00000000000000000000p+1:0x40000000) + f32(0x1.00000000000000000000p+0:0x3f800000)
+res: f32(0x1.80400000000000000000p+1:0x40402000) flags=INEXACT  (15/1)
+op : f32(0x1.00000000000000000000p+1:0x40000000) * f32(0x1.00000000000000000000p+0:0x3f800000) + f32(0x1.00400000000000000000p+0:0x3f802000)
+res: f32(0x1.80200000000000000000p+1:0x40401000) flags=INEXACT  (15/2)
+op : f32(0x1.00400000000000000000p+0:0x3f802000) * f32(0x1.00000000000000000000p+1:0x40000000) + f32(0x1.5bf0a800000000000000p+1:0x402df854)
+res: f32(0x1.2e185400000000000000p+2:0x40970c2a) flags=INEXACT  (16/0)
+op : f32(0x1.00000000000000000000p+1:0x40000000) * f32(0x1.5bf0a800000000000000p+1:0x402df854) + f32(0x1.00400000000000000000p+0:0x3f802000)
+res: f32(0x1.9c00a800000000000000p+2:0x40ce0054) flags=INEXACT  (16/1)
+op : f32(0x1.5bf0a800000000000000p+1:0x402df854) * f32(0x1.00400000000000000000p+0:0x3f802000) + f32(0x1.00000000000000000000p+1:0x40000000)
+res: f32(0x1.2e23d200000000000000p+2:0x409711e9) flags=INEXACT  (16/2)
+op : f32(0x1.00000000000000000000p+1:0x40000000) * f32(0x1.5bf0a800000000000000p+1:0x402df854) + f32(0x1.921fb600000000000000p+1:0x40490fdb)
+res: f32(0x1.12804000000000000000p+3:0x41094020) flags=INEXACT  (17/0)
+op : f32(0x1.5bf0a800000000000000p+1:0x402df854) * f32(0x1.921fb600000000000000p+1:0x40490fdb) + f32(0x1.00000000000000000000p+1:0x40000000)
+res: f32(0x1.51458000000000000000p+3:0x4128a2c0) flags=INEXACT  (17/1)
+op : f32(0x1.921fb600000000000000p+1:0x40490fdb) * f32(0x1.00000000000000000000p+1:0x40000000) + f32(0x1.5bf0a800000000000000p+1:0x402df854)
+res: f32(0x1.200c0400000000000000p+3:0x41100602) flags=INEXACT  (17/2)
+op : f32(0x1.5bf0a800000000000000p+1:0x402df854) * f32(0x1.921fb600000000000000p+1:0x40490fdb) + f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+res: f32(0x1.ffcf1400000000000000p+15:0x477fe78a) flags=INEXACT  (18/0)
+op : f32(0x1.921fb600000000000000p+1:0x40490fdb) * f32(0x1.ffbe0000000000000000p+15:0x477fdf00) + f32(0x1.5bf0a800000000000000p+1:0x402df854)
+res: f32(0x1.91ed3a00000000000000p+17:0x4848f69d) flags=INEXACT  (18/1)
+op : f32(0x1.ffbe0000000000000000p+15:0x477fdf00) * f32(0x1.5bf0a800000000000000p+1:0x402df854) + f32(0x1.921fb600000000000000p+1:0x40490fdb)
+res: f32(0x1.5bc56000000000000000p+17:0x482de2b0) flags=INEXACT  (18/2)
+op : f32(0x1.921fb600000000000000p+1:0x40490fdb) * f32(0x1.ffbe0000000000000000p+15:0x477fdf00) + f32(0x1.ffc00000000000000000p+15:0x477fe000)
+res: f32(0x1.08edee00000000000000p+18:0x488476f7) flags=INEXACT  (19/0)
+op : f32(0x1.ffbe0000000000000000p+15:0x477fdf00) * f32(0x1.ffc00000000000000000p+15:0x477fe000) + f32(0x1.921fb600000000000000p+1:0x40490fdb)
+res: f32(0x1.ff7e0800000000000000p+31:0x4f7fbf04) flags=INEXACT  (19/1)
+op : f32(0x1.ffc00000000000000000p+15:0x477fe000) * f32(0x1.921fb600000000000000p+1:0x40490fdb) + f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+res: f32(0x1.08ee7800000000000000p+18:0x4884773c) flags=INEXACT  (19/2)
+op : f32(0x1.ffbe0000000000000000p+15:0x477fdf00) * f32(0x1.ffc00000000000000000p+15:0x477fe000) + f32(0x1.ffc20000000000000000p+15:0x477fe100)
+res: f32(0x1.ff800800000000000000p+31:0x4f7fc004) flags=INEXACT  (20/0)
+op : f32(0x1.ffc00000000000000000p+15:0x477fe000) * f32(0x1.ffc20000000000000000p+15:0x477fe100) + f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+res: f32(0x1.ff840600000000000000p+31:0x4f7fc203) flags=INEXACT  (20/1)
+op : f32(0x1.ffc20000000000000000p+15:0x477fe100) * f32(0x1.ffbe0000000000000000p+15:0x477fdf00) + f32(0x1.ffc00000000000000000p+15:0x477fe000)
+res: f32(0x1.ff820600000000000000p+31:0x4f7fc103) flags=INEXACT  (20/2)
+op : f32(0x1.ffc00000000000000000p+15:0x477fe000) * f32(0x1.ffc20000000000000000p+15:0x477fe100) + f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+res: f32(0x1.ff860600000000000000p+31:0x4f7fc303) flags=INEXACT  (21/0)
+op : f32(0x1.ffc20000000000000000p+15:0x477fe100) * f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) + f32(0x1.ffc00000000000000000p+15:0x477fe000)
+res: f32(0x1.ff820600000000000000p+32:0x4fffc103) flags=INEXACT  (21/1)
+op : f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) * f32(0x1.ffc00000000000000000p+15:0x477fe000) + f32(0x1.ffc20000000000000000p+15:0x477fe100)
+res: f32(0x1.ff800800000000000000p+32:0x4fffc004) flags=INEXACT  (21/2)
+op : f32(0x1.ffc20000000000000000p+15:0x477fe100) * f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) + f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+res: f32(0x1.ff830600000000000000p+32:0x4fffc183) flags=INEXACT  (22/0)
+op : f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) * f32(0x1.ffc00000000000000000p+16:0x47ffe000) + f32(0x1.ffc20000000000000000p+15:0x477fe100)
+res: f32(0x1.ff7f8800000000000000p+33:0x507fbfc4) flags=INEXACT  (22/1)
+op : f32(0x1.ffc00000000000000000p+16:0x47ffe000) * f32(0x1.ffc20000000000000000p+15:0x477fe100) + f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+res: f32(0x1.ff840600000000000000p+32:0x4fffc203) flags=INEXACT  (22/2)
+op : f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) * f32(0x1.ffc00000000000000000p+16:0x47ffe000) + f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+res: f32(0x1.ff800800000000000000p+33:0x507fc004) flags=INEXACT  (23/0)
+op : f32(0x1.ffc00000000000000000p+16:0x47ffe000) * f32(0x1.ffc10000000000000000p+16:0x47ffe080) + f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+res: f32(0x1.ff820600000000000000p+33:0x507fc103) flags=INEXACT  (23/1)
+op : f32(0x1.ffc10000000000000000p+16:0x47ffe080) * f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) + f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+res: f32(0x1.ff810600000000000000p+33:0x507fc083) flags=INEXACT  (23/2)
+op : f32(0x1.ffc00000000000000000p+16:0x47ffe000) * f32(0x1.ffc10000000000000000p+16:0x47ffe080) + f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+res: f32(0x1.c0bab600000000000000p+99:0x71605d5b) flags=INEXACT  (24/0)
+op : f32(0x1.ffc10000000000000000p+16:0x47ffe080) * f32(0x1.c0bab600000000000000p+99:0x71605d5b) + f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+res: f32(0x1.c0837e00000000000000p+116:0x79e041bf) flags=INEXACT  (24/1)
+op : f32(0x1.c0bab600000000000000p+99:0x71605d5b) * f32(0x1.ffc00000000000000000p+16:0x47ffe000) + f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+res: f32(0x1.c0829e00000000000000p+116:0x79e0414f) flags=INEXACT  (24/2)
+op : f32(0x1.ffc10000000000000000p+16:0x47ffe080) * f32(0x1.c0bab600000000000000p+99:0x71605d5b) + f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (25/0)
+op : f32(0x1.c0bab600000000000000p+99:0x71605d5b) * f32(0x1.fffffe00000000000000p+127:0x7f7fffff) + f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (25/1)
+op : f32(0x1.fffffe00000000000000p+127:0x7f7fffff) * f32(0x1.ffc10000000000000000p+16:0x47ffe080) + f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (25/2)
+op : f32(0x1.c0bab600000000000000p+99:0x71605d5b) * f32(0x1.fffffe00000000000000p+127:0x7f7fffff) + f32(inf:0x7f800000)
+res: f32(inf:0x7f800000) flags=OK (26/0)
+op : f32(0x1.fffffe00000000000000p+127:0x7f7fffff) * f32(inf:0x7f800000) + f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+res: f32(inf:0x7f800000) flags=OK (26/1)
+op : f32(inf:0x7f800000) * f32(0x1.c0bab600000000000000p+99:0x71605d5b) + f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+res: f32(inf:0x7f800000) flags=OK (26/2)
+op : f32(0x1.fffffe00000000000000p+127:0x7f7fffff) * f32(inf:0x7f800000) + f32(-nan:0x7fc00000)
+res: f32(-nan:0xffffffff) flags=OK (27/0)
+op : f32(inf:0x7f800000) * f32(-nan:0x7fc00000) + f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+res: f32(-nan:0xffffffff) flags=OK (27/1)
+op : f32(-nan:0x7fc00000) * f32(0x1.fffffe00000000000000p+127:0x7f7fffff) + f32(inf:0x7f800000)
+res: f32(-nan:0xffffffff) flags=OK (27/2)
+op : f32(inf:0x7f800000) * f32(-nan:0x7fc00000) + f32(-nan:0x7fa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (28/0)
+op : f32(-nan:0x7fc00000) * f32(-nan:0x7fa00000) + f32(inf:0x7f800000)
+res: f32(-nan:0xffffffff) flags=INVALID (28/1)
+op : f32(-nan:0x7fa00000) * f32(inf:0x7f800000) + f32(-nan:0x7fc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (28/2)
+op : f32(-nan:0x7fc00000) * f32(-nan:0x7fa00000) + f32(-nan:0xffa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (29/0)
+op : f32(-nan:0x7fa00000) * f32(-nan:0xffa00000) + f32(-nan:0x7fc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (29/1)
+op : f32(-nan:0xffa00000) * f32(-nan:0x7fc00000) + f32(-nan:0x7fa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (29/2)
+op : f32(-nan:0x7fa00000) * f32(-nan:0xffa00000) + f32(-nan:0xffc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (30/0)
+op : f32(-nan:0xffa00000) * f32(-nan:0xffc00000) + f32(-nan:0x7fa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (30/1)
+op : f32(-nan:0xffc00000) * f32(-nan:0x7fa00000) + f32(-nan:0xffa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (30/2)
+# LP184149
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(0x1.00000000000000000000p-1:0x3f000000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(0x0.00000000000000000000p+0:0000000000) flags=OK (31/0)
+op : f32(0x1.00000000000000000000p-149:0x00000001) * f32(0x1.00000000000000000000p-149:0x00000001) + f32(0x1.00000000000000000000p-149:0x00000001)
+res: f32(0x1.00000000000000000000p-149:0x00000001) flags=UNDERFLOW INEXACT  (32/0)
+### Rounding to zero
+op : f32(-nan:0xffa00000) * f32(-nan:0xffc00000) + f32(-inf:0xff800000)
+res: f32(-nan:0xffffffff) flags=INVALID (0/0)
+op : f32(-nan:0xffc00000) * f32(-inf:0xff800000) + f32(-nan:0xffa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (0/1)
+op : f32(-inf:0xff800000) * f32(-nan:0xffa00000) + f32(-nan:0xffc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (0/2)
+op : f32(-nan:0xffc00000) * f32(-inf:0xff800000) + f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+res: f32(-nan:0xffffffff) flags=OK (1/0)
+op : f32(-inf:0xff800000) * f32(-0x1.fffffe00000000000000p+127:0xff7fffff) + f32(-nan:0xffc00000)
+res: f32(-nan:0xffffffff) flags=OK (1/1)
+op : f32(-0x1.fffffe00000000000000p+127:0xff7fffff) * f32(-nan:0xffc00000) + f32(-inf:0xff800000)
+res: f32(-nan:0xffffffff) flags=OK (1/2)
+op : f32(-inf:0xff800000) * f32(-0x1.fffffe00000000000000p+127:0xff7fffff) + f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+res: f32(inf:0x7f800000) flags=OK (2/0)
+op : f32(-0x1.fffffe00000000000000p+127:0xff7fffff) * f32(-0x1.1874b200000000000000p+103:0xf30c3a59) + f32(-inf:0xff800000)
+res: f32(-inf:0xff800000) flags=OK (2/1)
+op : f32(-0x1.1874b200000000000000p+103:0xf30c3a59) * f32(-inf:0xff800000) + f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+res: f32(inf:0x7f800000) flags=OK (2/2)
+op : f32(-0x1.fffffe00000000000000p+127:0xff7fffff) * f32(-0x1.1874b200000000000000p+103:0xf30c3a59) + f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (3/0)
+op : f32(-0x1.1874b200000000000000p+103:0xf30c3a59) * f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) + f32(-0x1.fffffe00000000000000p+127:0xff7fffff)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (3/1)
+op : f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) * f32(-0x1.fffffe00000000000000p+127:0xff7fffff) + f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (3/2)
+op : f32(-0x1.1874b200000000000000p+103:0xf30c3a59) * f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) + f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (4/0)
+op : f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) * f32(-0x1.31f75000000000000000p-40:0xab98fba8) + f32(-0x1.1874b200000000000000p+103:0xf30c3a59)
+res: f32(-0x1.1874b000000000000000p+103:0xf30c3a58) flags=INEXACT  (4/1)
+op : f32(-0x1.31f75000000000000000p-40:0xab98fba8) * f32(-0x1.1874b200000000000000p+103:0xf30c3a59) + f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+res: f32(-0x1.c0bab400000000000000p+99:0xf1605d5a) flags=INEXACT  (4/2)
+op : f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) * f32(-0x1.31f75000000000000000p-40:0xab98fba8) + f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+res: f32(0x1.0c27f800000000000000p+60:0x5d8613fc) flags=INEXACT  (5/0)
+op : f32(-0x1.31f75000000000000000p-40:0xab98fba8) * f32(-0x1.50544400000000000000p-66:0x9ea82a22) + f32(-0x1.c0bab600000000000000p+99:0xf1605d5b)
+res: f32(-0x1.c0bab400000000000000p+99:0xf1605d5a) flags=INEXACT  (5/1)
+op : f32(-0x1.50544400000000000000p-66:0x9ea82a22) * f32(-0x1.c0bab600000000000000p+99:0xf1605d5b) + f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+res: f32(0x1.26c46000000000000000p+34:0x50936230) flags=INEXACT  (5/2)
+op : f32(-0x1.31f75000000000000000p-40:0xab98fba8) * f32(-0x1.50544400000000000000p-66:0x9ea82a22) + f32(-0x1.00000000000000000000p-126:0x80800000)
+res: f32(0x1.91f93e00000000000000p-106:0x0ac8fc9f) flags=INEXACT  (6/0)
+op : f32(-0x1.50544400000000000000p-66:0x9ea82a22) * f32(-0x1.00000000000000000000p-126:0x80800000) + f32(-0x1.31f75000000000000000p-40:0xab98fba8)
+res: f32(-0x1.31f74e00000000000000p-40:0xab98fba7) flags=INEXACT  (6/1)
+op : f32(-0x1.00000000000000000000p-126:0x80800000) * f32(-0x1.31f75000000000000000p-40:0xab98fba8) + f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+res: f32(-0x1.50544200000000000000p-66:0x9ea82a21) flags=INEXACT  (6/2)
+op : f32(-0x1.50544400000000000000p-66:0x9ea82a22) * f32(-0x1.00000000000000000000p-126:0x80800000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(0x0.00000000000000000000p+0:0000000000) flags=UNDERFLOW INEXACT  (7/0)
+op : f32(-0x1.00000000000000000000p-126:0x80800000) * f32(0x0.00000000000000000000p+0:0000000000) + f32(-0x1.50544400000000000000p-66:0x9ea82a22)
+res: f32(-0x1.50544400000000000000p-66:0x9ea82a22) flags=INEXACT  (7/1)
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(-0x1.50544400000000000000p-66:0x9ea82a22) + f32(-0x1.00000000000000000000p-126:0x80800000)
+res: f32(-0x1.00000000000000000000p-126:0x80800000) flags=OK (7/2)
+op : f32(-0x1.00000000000000000000p-126:0x80800000) * f32(0x0.00000000000000000000p+0:0000000000) + f32(0x1.00000000000000000000p-126:0x00800000)
+res: f32(0x1.00000000000000000000p-126:0x00800000) flags=OK (8/0)
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(0x1.00000000000000000000p-126:0x00800000) + f32(-0x1.00000000000000000000p-126:0x80800000)
+res: f32(-0x1.00000000000000000000p-126:0x80800000) flags=OK (8/1)
+op : f32(0x1.00000000000000000000p-126:0x00800000) * f32(-0x1.00000000000000000000p-126:0x80800000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(-0x0.00000000000000000000p+0:0x80000000) flags=UNDERFLOW INEXACT  (8/2)
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(0x1.00000000000000000000p-126:0x00800000) + f32(0x1.00000000000000000000p-25:0x33000000)
+res: f32(0x1.00000000000000000000p-25:0x33000000) flags=OK (9/0)
+op : f32(0x1.00000000000000000000p-126:0x00800000) * f32(0x1.00000000000000000000p-25:0x33000000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(0x0.00000000000000000000p+0:0000000000) flags=UNDERFLOW INEXACT  (9/1)
+op : f32(0x1.00000000000000000000p-25:0x33000000) * f32(0x0.00000000000000000000p+0:0000000000) + f32(0x1.00000000000000000000p-126:0x00800000)
+res: f32(0x1.00000000000000000000p-126:0x00800000) flags=OK (9/2)
+op : f32(0x1.00000000000000000000p-126:0x00800000) * f32(0x1.00000000000000000000p-25:0x33000000) + f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+res: f32(0x1.ffffe600000000000000p-25:0x337ffff3) flags=INEXACT  (10/0)
+op : f32(0x1.00000000000000000000p-25:0x33000000) * f32(0x1.ffffe600000000000000p-25:0x337ffff3) + f32(0x1.00000000000000000000p-126:0x00800000)
+res: f32(0x1.ffffe600000000000000p-50:0x26fffff3) flags=INEXACT  (10/1)
+op : f32(0x1.ffffe600000000000000p-25:0x337ffff3) * f32(0x1.00000000000000000000p-126:0x00800000) + f32(0x1.00000000000000000000p-25:0x33000000)
+res: f32(0x1.00000000000000000000p-25:0x33000000) flags=INEXACT  (10/2)
+op : f32(0x1.00000000000000000000p-25:0x33000000) * f32(0x1.ffffe600000000000000p-25:0x337ffff3) + f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+res: f32(0x1.ff801a00000000000000p-15:0x387fc00d) flags=INEXACT  (11/0)
+op : f32(0x1.ffffe600000000000000p-25:0x337ffff3) * f32(0x1.ff801a00000000000000p-15:0x387fc00d) + f32(0x1.00000000000000000000p-25:0x33000000)
+res: f32(0x1.0007fe00000000000000p-25:0x330003ff) flags=INEXACT  (11/1)
+op : f32(0x1.ff801a00000000000000p-15:0x387fc00d) * f32(0x1.00000000000000000000p-25:0x33000000) + f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+res: f32(0x1.0001f200000000000000p-24:0x338000f9) flags=INEXACT  (11/2)
+op : f32(0x1.ffffe600000000000000p-25:0x337ffff3) * f32(0x1.ff801a00000000000000p-15:0x387fc00d) + f32(0x1.00000c00000000000000p-14:0x38800006)
+res: f32(0x1.00000c00000000000000p-14:0x38800006) flags=INEXACT  (12/0)
+op : f32(0x1.ff801a00000000000000p-15:0x387fc00d) * f32(0x1.00000c00000000000000p-14:0x38800006) + f32(0x1.ffffe600000000000000p-25:0x337ffff3)
+res: f32(0x1.0ffbf400000000000000p-24:0x3387fdfa) flags=INEXACT  (12/1)
+op : f32(0x1.00000c00000000000000p-14:0x38800006) * f32(0x1.ffffe600000000000000p-25:0x337ffff3) + f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+res: f32(0x1.ff801a00000000000000p-15:0x387fc00d) flags=INEXACT  (12/2)
+op : f32(0x1.ff801a00000000000000p-15:0x387fc00d) * f32(0x1.00000c00000000000000p-14:0x38800006) + f32(0x1.00000000000000000000p+0:0x3f800000)
+res: f32(0x1.00000000000000000000p+0:0x3f800000) flags=INEXACT  (13/0)
+op : f32(0x1.00000c00000000000000p-14:0x38800006) * f32(0x1.00000000000000000000p+0:0x3f800000) + f32(0x1.ff801a00000000000000p-15:0x387fc00d)
+res: f32(0x1.ffc01800000000000000p-14:0x38ffe00c) flags=INEXACT  (13/1)
+op : f32(0x1.00000000000000000000p+0:0x3f800000) * f32(0x1.ff801a00000000000000p-15:0x387fc00d) + f32(0x1.00000c00000000000000p-14:0x38800006)
+res: f32(0x1.ffc01800000000000000p-14:0x38ffe00c) flags=INEXACT  (13/2)
+op : f32(0x1.00000c00000000000000p-14:0x38800006) * f32(0x1.00000000000000000000p+0:0x3f800000) + f32(0x1.00400000000000000000p+0:0x3f802000)
+res: f32(0x1.00440000000000000000p+0:0x3f802200) flags=INEXACT  (14/0)
+op : f32(0x1.00000000000000000000p+0:0x3f800000) * f32(0x1.00400000000000000000p+0:0x3f802000) + f32(0x1.00000c00000000000000p-14:0x38800006)
+res: f32(0x1.00440000000000000000p+0:0x3f802200) flags=INEXACT  (14/1)
+op : f32(0x1.00400000000000000000p+0:0x3f802000) * f32(0x1.00000c00000000000000p-14:0x38800006) + f32(0x1.00000000000000000000p+0:0x3f800000)
+res: f32(0x1.00040000000000000000p+0:0x3f800200) flags=INEXACT  (14/2)
+op : f32(0x1.00000000000000000000p+0:0x3f800000) * f32(0x1.00400000000000000000p+0:0x3f802000) + f32(0x1.00000000000000000000p+1:0x40000000)
+res: f32(0x1.80200000000000000000p+1:0x40401000) flags=INEXACT  (15/0)
+op : f32(0x1.00400000000000000000p+0:0x3f802000) * f32(0x1.00000000000000000000p+1:0x40000000) + f32(0x1.00000000000000000000p+0:0x3f800000)
+res: f32(0x1.80400000000000000000p+1:0x40402000) flags=INEXACT  (15/1)
+op : f32(0x1.00000000000000000000p+1:0x40000000) * f32(0x1.00000000000000000000p+0:0x3f800000) + f32(0x1.00400000000000000000p+0:0x3f802000)
+res: f32(0x1.80200000000000000000p+1:0x40401000) flags=INEXACT  (15/2)
+op : f32(0x1.00400000000000000000p+0:0x3f802000) * f32(0x1.00000000000000000000p+1:0x40000000) + f32(0x1.5bf0a800000000000000p+1:0x402df854)
+res: f32(0x1.2e185400000000000000p+2:0x40970c2a) flags=INEXACT  (16/0)
+op : f32(0x1.00000000000000000000p+1:0x40000000) * f32(0x1.5bf0a800000000000000p+1:0x402df854) + f32(0x1.00400000000000000000p+0:0x3f802000)
+res: f32(0x1.9c00a800000000000000p+2:0x40ce0054) flags=INEXACT  (16/1)
+op : f32(0x1.5bf0a800000000000000p+1:0x402df854) * f32(0x1.00400000000000000000p+0:0x3f802000) + f32(0x1.00000000000000000000p+1:0x40000000)
+res: f32(0x1.2e23d200000000000000p+2:0x409711e9) flags=INEXACT  (16/2)
+op : f32(0x1.00000000000000000000p+1:0x40000000) * f32(0x1.5bf0a800000000000000p+1:0x402df854) + f32(0x1.921fb600000000000000p+1:0x40490fdb)
+res: f32(0x1.12804000000000000000p+3:0x41094020) flags=INEXACT  (17/0)
+op : f32(0x1.5bf0a800000000000000p+1:0x402df854) * f32(0x1.921fb600000000000000p+1:0x40490fdb) + f32(0x1.00000000000000000000p+1:0x40000000)
+res: f32(0x1.51458000000000000000p+3:0x4128a2c0) flags=INEXACT  (17/1)
+op : f32(0x1.921fb600000000000000p+1:0x40490fdb) * f32(0x1.00000000000000000000p+1:0x40000000) + f32(0x1.5bf0a800000000000000p+1:0x402df854)
+res: f32(0x1.200c0400000000000000p+3:0x41100602) flags=INEXACT  (17/2)
+op : f32(0x1.5bf0a800000000000000p+1:0x402df854) * f32(0x1.921fb600000000000000p+1:0x40490fdb) + f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+res: f32(0x1.ffcf1400000000000000p+15:0x477fe78a) flags=INEXACT  (18/0)
+op : f32(0x1.921fb600000000000000p+1:0x40490fdb) * f32(0x1.ffbe0000000000000000p+15:0x477fdf00) + f32(0x1.5bf0a800000000000000p+1:0x402df854)
+res: f32(0x1.91ed3a00000000000000p+17:0x4848f69d) flags=INEXACT  (18/1)
+op : f32(0x1.ffbe0000000000000000p+15:0x477fdf00) * f32(0x1.5bf0a800000000000000p+1:0x402df854) + f32(0x1.921fb600000000000000p+1:0x40490fdb)
+res: f32(0x1.5bc56000000000000000p+17:0x482de2b0) flags=INEXACT  (18/2)
+op : f32(0x1.921fb600000000000000p+1:0x40490fdb) * f32(0x1.ffbe0000000000000000p+15:0x477fdf00) + f32(0x1.ffc00000000000000000p+15:0x477fe000)
+res: f32(0x1.08edee00000000000000p+18:0x488476f7) flags=INEXACT  (19/0)
+op : f32(0x1.ffbe0000000000000000p+15:0x477fdf00) * f32(0x1.ffc00000000000000000p+15:0x477fe000) + f32(0x1.921fb600000000000000p+1:0x40490fdb)
+res: f32(0x1.ff7e0800000000000000p+31:0x4f7fbf04) flags=INEXACT  (19/1)
+op : f32(0x1.ffc00000000000000000p+15:0x477fe000) * f32(0x1.921fb600000000000000p+1:0x40490fdb) + f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+res: f32(0x1.08ee7800000000000000p+18:0x4884773c) flags=INEXACT  (19/2)
+op : f32(0x1.ffbe0000000000000000p+15:0x477fdf00) * f32(0x1.ffc00000000000000000p+15:0x477fe000) + f32(0x1.ffc20000000000000000p+15:0x477fe100)
+res: f32(0x1.ff800800000000000000p+31:0x4f7fc004) flags=INEXACT  (20/0)
+op : f32(0x1.ffc00000000000000000p+15:0x477fe000) * f32(0x1.ffc20000000000000000p+15:0x477fe100) + f32(0x1.ffbe0000000000000000p+15:0x477fdf00)
+res: f32(0x1.ff840600000000000000p+31:0x4f7fc203) flags=INEXACT  (20/1)
+op : f32(0x1.ffc20000000000000000p+15:0x477fe100) * f32(0x1.ffbe0000000000000000p+15:0x477fdf00) + f32(0x1.ffc00000000000000000p+15:0x477fe000)
+res: f32(0x1.ff820600000000000000p+31:0x4f7fc103) flags=INEXACT  (20/2)
+op : f32(0x1.ffc00000000000000000p+15:0x477fe000) * f32(0x1.ffc20000000000000000p+15:0x477fe100) + f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+res: f32(0x1.ff860600000000000000p+31:0x4f7fc303) flags=INEXACT  (21/0)
+op : f32(0x1.ffc20000000000000000p+15:0x477fe100) * f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) + f32(0x1.ffc00000000000000000p+15:0x477fe000)
+res: f32(0x1.ff820600000000000000p+32:0x4fffc103) flags=INEXACT  (21/1)
+op : f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) * f32(0x1.ffc00000000000000000p+15:0x477fe000) + f32(0x1.ffc20000000000000000p+15:0x477fe100)
+res: f32(0x1.ff800800000000000000p+32:0x4fffc004) flags=INEXACT  (21/2)
+op : f32(0x1.ffc20000000000000000p+15:0x477fe100) * f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) + f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+res: f32(0x1.ff830600000000000000p+32:0x4fffc183) flags=INEXACT  (22/0)
+op : f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) * f32(0x1.ffc00000000000000000p+16:0x47ffe000) + f32(0x1.ffc20000000000000000p+15:0x477fe100)
+res: f32(0x1.ff7f8800000000000000p+33:0x507fbfc4) flags=INEXACT  (22/1)
+op : f32(0x1.ffc00000000000000000p+16:0x47ffe000) * f32(0x1.ffc20000000000000000p+15:0x477fe100) + f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+res: f32(0x1.ff840600000000000000p+32:0x4fffc203) flags=INEXACT  (22/2)
+op : f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) * f32(0x1.ffc00000000000000000p+16:0x47ffe000) + f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+res: f32(0x1.ff800800000000000000p+33:0x507fc004) flags=INEXACT  (23/0)
+op : f32(0x1.ffc00000000000000000p+16:0x47ffe000) * f32(0x1.ffc10000000000000000p+16:0x47ffe080) + f32(0x1.ffbf0000000000000000p+16:0x47ffdf80)
+res: f32(0x1.ff820600000000000000p+33:0x507fc103) flags=INEXACT  (23/1)
+op : f32(0x1.ffc10000000000000000p+16:0x47ffe080) * f32(0x1.ffbf0000000000000000p+16:0x47ffdf80) + f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+res: f32(0x1.ff810600000000000000p+33:0x507fc083) flags=INEXACT  (23/2)
+op : f32(0x1.ffc00000000000000000p+16:0x47ffe000) * f32(0x1.ffc10000000000000000p+16:0x47ffe080) + f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+res: f32(0x1.c0bab600000000000000p+99:0x71605d5b) flags=INEXACT  (24/0)
+op : f32(0x1.ffc10000000000000000p+16:0x47ffe080) * f32(0x1.c0bab600000000000000p+99:0x71605d5b) + f32(0x1.ffc00000000000000000p+16:0x47ffe000)
+res: f32(0x1.c0837e00000000000000p+116:0x79e041bf) flags=INEXACT  (24/1)
+op : f32(0x1.c0bab600000000000000p+99:0x71605d5b) * f32(0x1.ffc00000000000000000p+16:0x47ffe000) + f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+res: f32(0x1.c0829e00000000000000p+116:0x79e0414f) flags=INEXACT  (24/2)
+op : f32(0x1.ffc10000000000000000p+16:0x47ffe080) * f32(0x1.c0bab600000000000000p+99:0x71605d5b) + f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (25/0)
+op : f32(0x1.c0bab600000000000000p+99:0x71605d5b) * f32(0x1.fffffe00000000000000p+127:0x7f7fffff) + f32(0x1.ffc10000000000000000p+16:0x47ffe080)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (25/1)
+op : f32(0x1.fffffe00000000000000p+127:0x7f7fffff) * f32(0x1.ffc10000000000000000p+16:0x47ffe080) + f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+res: f32(0x1.fffffe00000000000000p+127:0x7f7fffff) flags=OVERFLOW INEXACT  (25/2)
+op : f32(0x1.c0bab600000000000000p+99:0x71605d5b) * f32(0x1.fffffe00000000000000p+127:0x7f7fffff) + f32(inf:0x7f800000)
+res: f32(inf:0x7f800000) flags=OK (26/0)
+op : f32(0x1.fffffe00000000000000p+127:0x7f7fffff) * f32(inf:0x7f800000) + f32(0x1.c0bab600000000000000p+99:0x71605d5b)
+res: f32(inf:0x7f800000) flags=OK (26/1)
+op : f32(inf:0x7f800000) * f32(0x1.c0bab600000000000000p+99:0x71605d5b) + f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+res: f32(inf:0x7f800000) flags=OK (26/2)
+op : f32(0x1.fffffe00000000000000p+127:0x7f7fffff) * f32(inf:0x7f800000) + f32(-nan:0x7fc00000)
+res: f32(-nan:0xffffffff) flags=OK (27/0)
+op : f32(inf:0x7f800000) * f32(-nan:0x7fc00000) + f32(0x1.fffffe00000000000000p+127:0x7f7fffff)
+res: f32(-nan:0xffffffff) flags=OK (27/1)
+op : f32(-nan:0x7fc00000) * f32(0x1.fffffe00000000000000p+127:0x7f7fffff) + f32(inf:0x7f800000)
+res: f32(-nan:0xffffffff) flags=OK (27/2)
+op : f32(inf:0x7f800000) * f32(-nan:0x7fc00000) + f32(-nan:0x7fa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (28/0)
+op : f32(-nan:0x7fc00000) * f32(-nan:0x7fa00000) + f32(inf:0x7f800000)
+res: f32(-nan:0xffffffff) flags=INVALID (28/1)
+op : f32(-nan:0x7fa00000) * f32(inf:0x7f800000) + f32(-nan:0x7fc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (28/2)
+op : f32(-nan:0x7fc00000) * f32(-nan:0x7fa00000) + f32(-nan:0xffa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (29/0)
+op : f32(-nan:0x7fa00000) * f32(-nan:0xffa00000) + f32(-nan:0x7fc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (29/1)
+op : f32(-nan:0xffa00000) * f32(-nan:0x7fc00000) + f32(-nan:0x7fa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (29/2)
+op : f32(-nan:0x7fa00000) * f32(-nan:0xffa00000) + f32(-nan:0xffc00000)
+res: f32(-nan:0xffffffff) flags=INVALID (30/0)
+op : f32(-nan:0xffa00000) * f32(-nan:0xffc00000) + f32(-nan:0x7fa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (30/1)
+op : f32(-nan:0xffc00000) * f32(-nan:0x7fa00000) + f32(-nan:0xffa00000)
+res: f32(-nan:0xffffffff) flags=INVALID (30/2)
+# LP184149
+op : f32(0x0.00000000000000000000p+0:0000000000) * f32(0x1.00000000000000000000p-1:0x3f000000) + f32(0x0.00000000000000000000p+0:0000000000)
+res: f32(0x0.00000000000000000000p+0:0000000000) flags=OK (31/0)
+op : f32(0x1.00000000000000000000p-149:0x00000001) * f32(0x1.00000000000000000000p-149:0x00000001) + f32(0x1.00000000000000000000p-149:0x00000001)
+res: f32(0x1.00000000000000000000p-149:0x00000001) flags=UNDERFLOW INEXACT  (32/0)
-- 
2.7.4
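
Note on reading the reference output above: each value is printed as a
hex-float followed by its raw IEEE 754 single-precision bit pattern, for
example 0x1.00400000000000000000p+0:0x3f802000. The Python sketch below
shows one way to convert between the two forms and to spot-check the value
(not the flags) of a result line. The helper names are hypothetical, and the
arithmetic is done in double precision and then rounded to single, which is
only an approximation of a true fused multiply-add; it is exact for the
simple cases shown.

```python
# Illustrative sketch only: helper names are hypothetical and this is not
# part of the test harness. It checks values, never exception flags.
import struct


def bits_to_f32(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def f32_to_bits(value: float) -> int:
    """Round a Python float to single precision and return its bit pattern."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def muladd_bits(a_bits: int, b_bits: int, c_bits: int) -> int:
    """Approximate a * b + c on single-precision operands.

    Assumption: the double-precision intermediate is exact for the operands
    used here; in general this is not equivalent to a fused f32 operation.
    """
    a, b, c = (bits_to_f32(x) for x in (a_bits, b_bits, c_bits))
    return f32_to_bits(a * b + c)


if __name__ == "__main__":
    # Operand from the (15/x) cases: 0x3f802000 is 1 + 2**-10.
    print(bits_to_f32(0x3F802000).hex())
    # Case (15/0): 1.0 * (1 + 2**-10) + 2.0
    print(hex(muladd_bits(0x3F800000, 0x3F802000, 0x40000000)))
```

The NaN and overflow cases are deliberately left out of this sketch, since
their results and flags depend on the target's floating-point behaviour.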



* [RFC PATCH v3 34/34] Hexagon build infrastructure
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (32 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 33/34] Hexagon (tests/tcg/hexagon) TCG tests Taylor Simpson
@ 2020-08-18 15:50 ` Taylor Simpson
  2020-08-29  3:19   ` Richard Henderson
  2020-08-18 16:32 ` [RFC PATCH v3 00/34] Hexagon patch series no-reply
  2020-08-29  3:27 ` Richard Henderson
  35 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-18 15:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: ale, riku.voipio, richard.henderson, laurent, tsimpson, philmd,
	aleksandar.m.mail

Add default-configs/hexagon-linux-user.mak
Add the hexagon target and its disassembler config to configure
Add target/hexagon/Makefile.objs
Add the Hexagon ELF magic/mask to scripts/qemu-binfmt-conf.sh

We can build a hexagon-linux-user target and run programs on the Hexagon
scalar core.  With hexagon-linux-clang installed, "make check-tcg" will
pass.

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 configure                              |   9 ++
 default-configs/hexagon-linux-user.mak |   1 +
 scripts/qemu-binfmt-conf.sh            |   6 +-
 target/hexagon/Makefile.objs           | 203 +++++++++++++++++++++++++++++++++
 4 files changed, 218 insertions(+), 1 deletion(-)
 create mode 100644 default-configs/hexagon-linux-user.mak
 create mode 100644 target/hexagon/Makefile.objs
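
Note on the binfmt entry: the hexagon_magic/hexagon_mask pair added to
scripts/qemu-binfmt-conf.sh below describes the first 20 bytes of a 32-bit
little-endian ELF header whose e_machine field is 0xa4 (164, EM_HEXAGON).
The following is a minimal Python sketch of the masked comparison that
binfmt_misc applies to the start of a file. The helper names
(matches_hexagon, make_header) are hypothetical and exist only for
illustration; the byte strings are copied from the patch.

```python
# Illustrative sketch only: mirrors a masked magic compare in user space.
# Helper names are hypothetical; they are not part of QEMU or the kernel.

# Values copied from hexagon_magic / hexagon_mask in qemu-binfmt-conf.sh.
HEXAGON_MAGIC = b"\x7fELF\x01\x01\x01" + b"\x00" * 9 + b"\x02\x00\xa4\x00"
HEXAGON_MASK = b"\xff" * 7 + b"\x00" + b"\xff" * 8 + b"\xfe\xff\xff\xff"


def matches_hexagon(header: bytes) -> bool:
    """Return True if the first 20 bytes equal the magic under the mask."""
    if len(header) < len(HEXAGON_MAGIC):
        return False
    return all(
        (h & m) == (g & m)
        for h, g, m in zip(header, HEXAGON_MAGIC, HEXAGON_MASK)
    )


def make_header(e_type: int, e_machine: int, osabi: int = 0) -> bytes:
    """Build the first 20 bytes of a 32-bit little-endian ELF header."""
    # e_ident: magic, ELFCLASS32, ELFDATA2LSB, EV_CURRENT, OS ABI, padding.
    e_ident = b"\x7fELF\x01\x01\x01" + bytes([osabi]) + b"\x00" * 8
    return (
        e_ident
        + e_type.to_bytes(2, "little")
        + e_machine.to_bytes(2, "little")
    )


if __name__ == "__main__":
    print(matches_hexagon(make_header(2, 0xA4)))  # ET_EXEC, Hexagon
    print(matches_hexagon(make_header(3, 0xA4)))  # ET_DYN, Hexagon
    print(matches_hexagon(make_header(2, 0x03)))  # ET_EXEC, i386
```

Two details are visible in the mask: byte 7 (the OS ABI) is 0x00, so it is
ignored, and the low byte of e_type is masked with 0xfe, so both ET_EXEC (2)
and ET_DYN (3) binaries are accepted.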

diff --git a/configure b/configure
index 2acc4d1..1f5c5a0 100755
--- a/configure
+++ b/configure
@@ -8275,6 +8275,12 @@ case "$target_name" in
     bflt="yes"
     mttcg="yes"
   ;;
+  hexagon)
+    TARGET_BASE_ARCH=hexagon
+    TARGET_ABI_DIR=hexagon
+    mttcg=yes
+    target_compiler=$cross_cc_hexagon
+  ;;
   *)
     error_exit "Unsupported target CPU"
   ;;
@@ -8447,6 +8453,9 @@ for i in $ARCH $TARGET_BASE_ARCH ; do
   xtensa*)
     disas_config "XTENSA"
   ;;
+  hexagon)
+    disas_config "HEXAGON"
+  ;;
   esac
 done
 if test "$tcg_interpreter" = "yes" ; then
diff --git a/default-configs/hexagon-linux-user.mak b/default-configs/hexagon-linux-user.mak
new file mode 100644
index 0000000..ad55af0
--- /dev/null
+++ b/default-configs/hexagon-linux-user.mak
@@ -0,0 +1 @@
+# Default configuration for hexagon-linux-user
diff --git a/scripts/qemu-binfmt-conf.sh b/scripts/qemu-binfmt-conf.sh
index 9f1580a..7b5d54b 100755
--- a/scripts/qemu-binfmt-conf.sh
+++ b/scripts/qemu-binfmt-conf.sh
@@ -4,7 +4,7 @@
 qemu_target_list="i386 i486 alpha arm armeb sparc sparc32plus sparc64 \
 ppc ppc64 ppc64le m68k mips mipsel mipsn32 mipsn32el mips64 mips64el \
 sh4 sh4eb s390x aarch64 aarch64_be hppa riscv32 riscv64 xtensa xtensaeb \
-microblaze microblazeel or1k x86_64"
+microblaze microblazeel or1k x86_64 hexagon"
 
 i386_magic='\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x03\x00'
 i386_mask='\xff\xff\xff\xff\xff\xfe\xfe\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff'
@@ -136,6 +136,10 @@ or1k_magic='\x7fELF\x01\x02\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\
 or1k_mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff'
 or1k_family=or1k
 
+hexagon_magic='\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xa4\x00'
+hexagon_mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff'
+hexagon_family=hexagon
+
 qemu_get_family() {
     cpu=${HOST_ARCH:-$(uname -m)}
     case "$cpu" in
diff --git a/target/hexagon/Makefile.objs b/target/hexagon/Makefile.objs
new file mode 100644
index 0000000..f9321c8
--- /dev/null
+++ b/target/hexagon/Makefile.objs
@@ -0,0 +1,203 @@
+##
+##  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##
+##  This program is free software; you can redistribute it and/or modify
+##  it under the terms of the GNU General Public License as published by
+##  the Free Software Foundation; either version 2 of the License, or
+##  (at your option) any later version.
+##
+##  This program is distributed in the hope that it will be useful,
+##  but WITHOUT ANY WARRANTY; without even the implied warranty of
+##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+##  GNU General Public License for more details.
+##
+##  You should have received a copy of the GNU General Public License
+##  along with this program; if not, see <http://www.gnu.org/licenses/>.
+##
+
+obj-y += \
+    cpu.o \
+    translate.o \
+    op_helper.o \
+    gdbstub.o \
+    genptr.o \
+    reg_fields.o \
+    decode.o \
+    iclass.o \
+    opcodes.o \
+    printinsn.o \
+    arch.o \
+    fma_emu.o \
+    conv_emu.o
+
+#
+#  Step 1
+#  We use a C program to create semantics_generated.pyinc
+#
+BUILD_USER_DIR = $(BUILD_DIR)/hexagon-linux-user
+GEN_SEMANTICS = gen_semantics
+GEN_SEMANTICS_SRC = $(SRC_PATH)/target/hexagon/gen_semantics.c
+
+IDEF_FILES = \
+    $(SRC_PATH)/target/hexagon/imported/alu.idef \
+    $(SRC_PATH)/target/hexagon/imported/branch.idef \
+    $(SRC_PATH)/target/hexagon/imported/compare.idef \
+    $(SRC_PATH)/target/hexagon/imported/float.idef \
+    $(SRC_PATH)/target/hexagon/imported/ldst.idef \
+    $(SRC_PATH)/target/hexagon/imported/mpy.idef \
+    $(SRC_PATH)/target/hexagon/imported/shift.idef \
+    $(SRC_PATH)/target/hexagon/imported/subinsns.idef \
+    $(SRC_PATH)/target/hexagon/imported/system.idef
+DEF_FILES = \
+    $(SRC_PATH)/target/hexagon/imported/allidefs.def \
+    $(SRC_PATH)/target/hexagon/imported/macros.def
+
+$(GEN_SEMANTICS): $(GEN_SEMANTICS_SRC) $(IDEF_FILES) $(DEF_FILES)
+	$(CC) $(CFLAGS) $(GEN_SEMANTICS_SRC) -o $(GEN_SEMANTICS)
+
+SEMANTICS=semantics_generated.pyinc
+$(SEMANTICS): $(GEN_SEMANTICS)
+	$(call quiet-command, \
+	    $(BUILD_USER_DIR)/$(GEN_SEMANTICS) $(SEMANTICS), \
+	    "GEN", $(SEMANTICS))
+
+#
+# Step 2
+# We use Python scripts to generate the following files
+#
+SHORTCODE_H = $(BUILD_USER_DIR)/shortcode_generated.h
+HELPER_PROTOS_H = $(BUILD_USER_DIR)/helper_protos_generated.h
+TCG_FUNCS_H = $(BUILD_USER_DIR)/tcg_funcs_generated.h
+HELPER_FUNCS_H = $(BUILD_USER_DIR)/helper_funcs_generated.h
+OPCODES_DEF_H = $(BUILD_USER_DIR)/opcodes_def_generated.h
+OP_ATTRIBS_H = $(BUILD_USER_DIR)/op_attribs_generated.h
+OP_REGS_H = $(BUILD_USER_DIR)/op_regs_generated.h
+PRINTINSN_H = $(BUILD_USER_DIR)/printinsn_generated.h
+
+GENERATED_HEXAGON_FILES = \
+    $(SHORTCODE_H) \
+    $(HELPER_PROTOS_H) \
+    $(TCG_FUNCS_H) \
+    $(HELPER_FUNCS_H) \
+    $(OPCODES_DEF_H) \
+    $(OP_ATTRIBS_H) \
+    $(OP_REGS_H) \
+    $(PRINTINSN_H)
+
+$(SHORTCODE_H): \
+    $(SRC_PATH)/target/hexagon/hex_common.py \
+    $(SRC_PATH)/target/hexagon/gen_shortcode.py \
+    $(SEMANTICS) \
+    $(SRC_PATH)/target/hexagon/attribs_def.h
+	$(call quiet-command, \
+	    $(SRC_PATH)/target/hexagon/gen_shortcode.py \
+                $(SEMANTICS) \
+                $(SRC_PATH)/target/hexagon/attribs_def.h, \
+	    "GEN", "Hexagon shortcode_generated.h")
+
+$(HELPER_PROTOS_H): \
+    $(SRC_PATH)/target/hexagon/hex_common.py \
+    $(SRC_PATH)/target/hexagon/gen_helper_protos.py \
+    $(SEMANTICS) \
+    $(SRC_PATH)/target/hexagon/attribs_def.h
+	$(call quiet-command, \
+	    $(SRC_PATH)/target/hexagon/gen_helper_protos.py \
+                $(SEMANTICS) \
+                $(SRC_PATH)/target/hexagon/attribs_def.h, \
+	    "GEN", "Hexagon helper_protos_generated.h")
+
+$(TCG_FUNCS_H): \
+    $(SRC_PATH)/target/hexagon/hex_common.py \
+    $(SRC_PATH)/target/hexagon/gen_tcg_funcs.py \
+    $(SEMANTICS) \
+    $(SRC_PATH)/target/hexagon/attribs_def.h
+	$(call quiet-command, \
+	    $(SRC_PATH)/target/hexagon/gen_tcg_funcs.py \
+                $(SEMANTICS) \
+                $(SRC_PATH)/target/hexagon/attribs_def.h, \
+	    "GEN", "Hexagon tcg_funcs_generated.h")
+
+$(HELPER_FUNCS_H): \
+    $(SRC_PATH)/target/hexagon/hex_common.py \
+    $(SRC_PATH)/target/hexagon/gen_helper_funcs.py \
+    $(SEMANTICS) \
+    $(SRC_PATH)/target/hexagon/attribs_def.h
+	$(call quiet-command, \
+	    $(SRC_PATH)/target/hexagon/gen_helper_funcs.py \
+                $(SEMANTICS) \
+                $(SRC_PATH)/target/hexagon/attribs_def.h, \
+	    "GEN", "Hexagon helper_funcs_generated.h")
+
+$(PRINTINSN_H): \
+    $(SRC_PATH)/target/hexagon/hex_common.py \
+    $(SRC_PATH)/target/hexagon/gen_printinsn.py \
+    $(SEMANTICS) \
+    $(SRC_PATH)/target/hexagon/attribs_def.h
+	$(call quiet-command, \
+	    $(SRC_PATH)/target/hexagon/gen_printinsn.py \
+                $(SEMANTICS) \
+                $(SRC_PATH)/target/hexagon/attribs_def.h, \
+	    "GEN", "Hexagon printinsn_generated.h")
+
+$(OP_REGS_H): \
+    $(SRC_PATH)/target/hexagon/hex_common.py \
+    $(SRC_PATH)/target/hexagon/gen_op_regs.py \
+    $(SEMANTICS) \
+    $(SRC_PATH)/target/hexagon/attribs_def.h
+	$(call quiet-command, \
+	    $(SRC_PATH)/target/hexagon/gen_op_regs.py \
+                $(SEMANTICS) \
+                $(SRC_PATH)/target/hexagon/attribs_def.h, \
+	    "GEN", "Hexagon op_regs_generated.h")
+
+$(OP_ATTRIBS_H): \
+    $(SRC_PATH)/target/hexagon/hex_common.py \
+    $(SRC_PATH)/target/hexagon/gen_op_attribs.py \
+    $(SEMANTICS) \
+    $(SRC_PATH)/target/hexagon/attribs_def.h
+	$(call quiet-command, \
+	    $(SRC_PATH)/target/hexagon/gen_op_attribs.py \
+                $(SEMANTICS) \
+                $(SRC_PATH)/target/hexagon/attribs_def.h, \
+	    "GEN", "Hexagon op_attribs_generated.h")
+
+$(OPCODES_DEF_H): \
+    $(SRC_PATH)/target/hexagon/hex_common.py \
+    $(SRC_PATH)/target/hexagon/gen_opcodes_def.py \
+    $(SEMANTICS) \
+    $(SRC_PATH)/target/hexagon/attribs_def.h
+	$(call quiet-command, \
+	    $(SRC_PATH)/target/hexagon/gen_opcodes_def.py \
+                $(SEMANTICS) \
+                $(SRC_PATH)/target/hexagon/attribs_def.h, \
+	    "GEN", "Hexagon opcodes_def_generated.h")
+
+#
+# Step 3
+# We use a C program to create iset.py which is imported into dectree.py
+# to create the decode tree
+#
+GEN_DECTREE_IMPORT=gen_dectree_import
+GEN_DECTREE_IMPORT_SRC = $(SRC_PATH)/target/hexagon/gen_dectree_import.c
+
+$(GEN_DECTREE_IMPORT): $(GEN_DECTREE_IMPORT_SRC) $(OPCODES_DEF_H) config-target.h
+	$(CC) $(QEMU_CFLAGS) $(QEMU_INCLUDES) -I$(BUILD_DIR) $(GEN_DECTREE_IMPORT_SRC) -o $(GEN_DECTREE_IMPORT)
+
+DECTREE_IMPORT=iset.py
+$(DECTREE_IMPORT): $(GEN_DECTREE_IMPORT)
+	$(call quiet-command, \
+	    $(BUILD_USER_DIR)/$(GEN_DECTREE_IMPORT) $(DECTREE_IMPORT), \
+	    "GEN", $(DECTREE_IMPORT))
+
+#
+# Step 4
+# We use the dectree.py script to generate the decode tree header file
+#
+DECTREE_HEADER=dectree_generated.h
+$(DECTREE_HEADER): $(SRC_PATH)/target/hexagon/dectree.py $(DECTREE_IMPORT)
+	$(call quiet-command, \
+	    PYTHONPATH=$(BUILD_USER_DIR) \
+	    $(PYTHON) $(SRC_PATH)/target/hexagon/dectree.py, \
+	    "GEN", "Hexagon decode tree")
+
+generated-files-y += $(GENERATED_HEXAGON_FILES) $(DECTREE_HEADER)
-- 
2.7.4



* Re: [RFC PATCH v3 00/34] Hexagon patch series
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (33 preceding siblings ...)
  2020-08-18 15:50 ` [RFC PATCH v3 34/34] Hexagon build infrastructure Taylor Simpson
@ 2020-08-18 16:32 ` no-reply
  2020-08-29  3:27 ` Richard Henderson
  35 siblings, 0 replies; 122+ messages in thread
From: no-reply @ 2020-08-18 16:32 UTC (permalink / raw)
  To: tsimpson
  Cc: ale, riku.voipio, richard.henderson, qemu-devel, laurent,
	tsimpson, philmd, aleksandar.m.mail

Patchew URL: https://patchew.org/QEMU/1597765847-16637-1-git-send-email-tsimpson@quicinc.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 1597765847-16637-1-git-send-email-tsimpson@quicinc.com
Subject: [RFC PATCH v3 00/34] Hexagon patch series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===
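
For a series this long it can help to tally the checkpatch output per patch
rather than read it linearly. The Python sketch below is one hypothetical way
to do that; it assumes only the two line shapes that appear in this report
("N/M Checking commit <hash> (<subject>)" and "total: X errors, Y warnings,
Z lines checked") and is not part of patchew or checkpatch.

```python
# Illustrative sketch only: summarize() is a hypothetical helper that
# tallies checkpatch results from a saved copy of a report like this one.
import re

CHECK_RE = re.compile(r"^(\d+)/(\d+) Checking commit ([0-9a-f]+) \((.*)\)$")
TOTAL_RE = re.compile(
    r"^total: (\d+) errors, (\d+) warnings, (\d+) lines checked$"
)


def summarize(log: str) -> list:
    """Return one dict per patch with its error and warning counts.

    Patches for which no "total:" line is printed keep counts of zero.
    """
    results = []
    current = None
    for line in log.splitlines():
        check = CHECK_RE.match(line)
        if check:
            current = {
                "patch": int(check.group(1)),
                "subject": check.group(4),
                "errors": 0,
                "warnings": 0,
            }
            results.append(current)
            continue
        total = TOTAL_RE.match(line)
        if total and current is not None:
            current["errors"] = int(total.group(1))
            current["warnings"] = int(total.group(2))
    return results


if __name__ == "__main__":
    sample = "\n".join([
        "1/34 Checking commit 360594de302c (Hexagon Update MAINTAINERS file)",
        "11/34 Checking commit 01bf963045a8 "
        "(Hexagon (target/hexagon) register fields)",
        "total: 1 errors, 1 warnings, 146 lines checked",
    ])
    for entry in summarize(sample):
        print(entry["patch"], entry["errors"], entry["warnings"])
```

Filtering the returned list for entries with a non-zero error count gives the
patches that need attention beyond the MAINTAINERS warning.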

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/1597765847-16637-1-git-send-email-tsimpson@quicinc.com -> patchew/1597765847-16637-1-git-send-email-tsimpson@quicinc.com
 - [tag update]      patchew/20200610120426.12826-1-vsementsov@virtuozzo.com -> patchew/20200610120426.12826-1-vsementsov@virtuozzo.com
Switched to a new branch 'test'
149d5a8 Hexagon build infrastructure
4db60b1 Hexagon (tests/tcg/hexagon) TCG tests
3799611 Hexagon (linux-user/hexagon) Linux user emulation
a71a593 Hexagon (target/hexagon) translation
edab971 Hexagon (target/hexagon) TCG for instructions with multiple definitions
e932b7e Hexagon (target/hexagon) TCG generation
77ada1d Hexagon (target/hexagon) TCG generation helpers
be70802 Hexagon (target/hexagon) instruction classes
33ce822 Hexagon (target/hexagon) macros referenced in instruction semantics
f44d484 Hexagon (target/hexagon) macros to interface with the generator
c4e4b84 Hexagon (target/hexagon) opcode data structures
64f25a3 Hexagon (target/hexagon) generater phase 4 - decode tree
799a8cb Hexagon (target/hexagon) generator phase 3 - C preprocessor for decode tree
fd0b8ef Hexagon (target/hexagon) generator phase 2 - generate header files
01ee39f Hexagon (target/hexagon) generator phase 1 - C preprocessor for semantics
56a8168 Hexagon (target/hexagon/imported) arch import - instruction encoding
fe828ec Hexagon (target/hexagon/imported) arch import - instruction semantics
47fcbc9 Hexagon (target/hexagon/imported) arch import - macro definitions
032fa2d Hexagon (target/hexagon) utility functions
90e2392 Hexagon (target/hexagon) instruction printing
e4a75c2 Hexagon (target/hexagon) instruction/packet decode
720bbdf Hexagon (target/hexagon) register map
4a5c3d8 Hexagon (target/hexagon) instruction attributes
01bf963 Hexagon (target/hexagon) register fields
934c416 Hexagon (target/hexagon) instruction and packet types
e905475 Hexagon (target/hexagon) architecture types
e10c00c Hexagon (target/hexagon) GDB Stub
54222f4 Hexagon (target/hexagon) scalar core helpers
8c90070 Hexagon (disas) disassembler
fc17f04 Hexagon (target/hexagon) register names
87807c4 Hexagon (target/hexagon) scalar core definition
28282ee Hexagon (include/elf.h) ELF machine definition
acf59c5 Hexagon (target/hexagon) README
360594d Hexagon Update MAINTAINERS file

=== OUTPUT BEGIN ===
1/34 Checking commit 360594de302c (Hexagon Update MAINTAINERS file)
2/34 Checking commit acf59c5f8d57 (Hexagon (target/hexagon) README)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#13: 
new file mode 100644

total: 0 errors, 1 warnings, 254 lines checked

Patch 2/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
3/34 Checking commit 28282eeb25c8 (Hexagon (include/elf.h) ELF machine definition)
4/34 Checking commit 87807c4ee5ed (Hexagon (target/hexagon) scalar core definition)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#13: 
new file mode 100644

total: 0 errors, 1 warnings, 580 lines checked

Patch 4/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
5/34 Checking commit fc17f04e763d (Hexagon (target/hexagon) register names)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#11: 
new file mode 100644

total: 0 errors, 1 warnings, 83 lines checked

Patch 5/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
6/34 Checking commit 8c900700b508 (Hexagon (disas) disassembler)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#26: 
new file mode 100644

total: 0 errors, 1 warnings, 76 lines checked

Patch 6/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
7/34 Checking commit 54222f424b69 (Hexagon (target/hexagon) scalar core helpers)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#14: 
new file mode 100644

total: 0 errors, 1 warnings, 401 lines checked

Patch 7/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
8/34 Checking commit e10c00cef005 (Hexagon (target/hexagon) GDB Stub)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#26: 
new file mode 100644

total: 0 errors, 1 warnings, 63 lines checked

Patch 8/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
9/34 Checking commit e9054759731f (Hexagon (target/hexagon) architecture types)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#13: 
new file mode 100644

total: 0 errors, 1 warnings, 43 lines checked

Patch 9/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
10/34 Checking commit 934c416b066b (Hexagon (target/hexagon) instruction and packet types)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#14: 
new file mode 100644

total: 0 errors, 1 warnings, 75 lines checked

Patch 10/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
11/34 Checking commit 01bf963045a8 (Hexagon (target/hexagon) register fields)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#13: 
new file mode 100644

ERROR: Macros with complex values should be enclosed in parenthesis
#84: FILE: target/hexagon/reg_fields.h:33:
+#define DEF_REG_FIELD(TAG, NAME, START, WIDTH, DESCRIPTION) \
+    TAG,

total: 1 errors, 1 warnings, 146 lines checked

Patch 11/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

12/34 Checking commit 4a5c3d8b0344 (Hexagon (target/hexagon) instruction attributes)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#11: 
new file mode 100644

ERROR: Macros with complex values should be enclosed in parenthesis
#37: FILE: target/hexagon/attribs.h:22:
+#define DEF_ATTRIB(NAME, ...) A_##NAME,

total: 1 errors, 1 warnings, 130 lines checked

Patch 12/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

13/34 Checking commit 720bbdf3c5d8 (Hexagon (target/hexagon) register map)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#16: 
new file mode 100644

total: 0 errors, 1 warnings, 38 lines checked

Patch 13/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
14/34 Checking commit e4a75c280804 (Hexagon (target/hexagon) instruction/packet decode)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#22: 
new file mode 100644

ERROR: Macros with complex values should be enclosed in parenthesis
#118: FILE: target/hexagon/decode.c:92:
+#define DECODE_SEPARATOR_BITS(START, WIDTH) NULL, START, WIDTH

ERROR: Macros with multiple statements should be enclosed in a do - while loop
#721: FILE: target/hexagon/q6v_decode.c:51:
+#define DECODE_OPINFO(TAG, BEH) \
+    case TAG: \
+        { BEH  } \
+        break; \
+

total: 2 errors, 1 warnings, 1005 lines checked

Patch 14/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

15/34 Checking commit 90e239256680 (Hexagon (target/hexagon) instruction printing)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#11: 
new file mode 100644

ERROR: Macros with multiple statements should be enclosed in a do - while loop
#61: FILE: target/hexagon/printinsn.c:46:
+#define DEF_PRINTINFO(TAG, FMT, ...) \
+    case TAG: \
+        snprintf(buf, n, FMT, __VA_ARGS__);\
+        break;

total: 1 errors, 1 warnings, 120 lines checked

Patch 15/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

16/34 Checking commit 032fa2d6f2f2 (Hexagon (target/hexagon) utility functions)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#13: 
new file mode 100644

WARNING: Block comments use a leading /* on a separate line
#1326: FILE: target/hexagon/fma_emu.c:470:
+        /* result zero */ \

WARNING: Block comments use a leading /* on a separate line
#1334: FILE: target/hexagon/fma_emu.c:478:
+    /* Normalize right */ \

WARNING: Block comments use a leading /* on a separate line
#1335: FILE: target/hexagon/fma_emu.c:479:
+    /* We want MANTBITS bits of mantissa plus the leading one. */ \

WARNING: Block comments use a leading /* on a separate line
#1336: FILE: target/hexagon/fma_emu.c:480:
+    /* That means that we want MANTBITS+1 bits, or 0x000000000000FF_FFFF */ \

WARNING: Block comments use a leading /* on a separate line
#1337: FILE: target/hexagon/fma_emu.c:481:
+    /* So we need to normalize right while the high word is non-zero and \

WARNING: Block comments should align the * on each line
#1338: FILE: target/hexagon/fma_emu.c:482:
+    /* So we need to normalize right while the high word is non-zero and \
+    * while the low word is nonzero when masked with 0xffe0_0000_0000_0000 */ \

WARNING: Block comments use a leading /* on a separate line
#1342: FILE: target/hexagon/fma_emu.c:486:
+    /* \

WARNING: Block comments use a leading /* on a separate line
#1352: FILE: target/hexagon/fma_emu.c:496:
+    /* \

WARNING: Block comments use a leading /* on a separate line
#1359: FILE: target/hexagon/fma_emu.c:503:
+        /* \

WARNING: Block comments use a leading /* on a separate line
#1368: FILE: target/hexagon/fma_emu.c:512:
+    /* OK, we're relatively canonical... now we need to round */ \

WARNING: Block comments use a leading /* on a separate line
#1373: FILE: target/hexagon/fma_emu.c:517:
+            /* Chop and we're done */ \

WARNING: Block comments use a leading /* on a separate line
#1387: FILE: target/hexagon/fma_emu.c:531:
+                /* round up if guard is 1, down if guard is zero */ \

WARNING: Block comments use a leading /* on a separate line
#1390: FILE: target/hexagon/fma_emu.c:534:
+                /* exactly .5, round up if odd */ \

WARNING: Block comments use a leading /* on a separate line
#1396: FILE: target/hexagon/fma_emu.c:540:
+    /* \

WARNING: Block comments use a leading /* on a separate line
#1405: FILE: target/hexagon/fma_emu.c:549:
+    /* Overflow? */ \

WARNING: Block comments use a leading /* on a separate line
#1407: FILE: target/hexagon/fma_emu.c:551:
+        /* Yep, inf result */ \

WARNING: Block comments use a leading /* on a separate line
#1429: FILE: target/hexagon/fma_emu.c:573:
+    /* Underflow? */ \

WARNING: Block comments use a leading /* on a separate line
#1431: FILE: target/hexagon/fma_emu.c:575:
+        /* Leading one means: No, we're normal. So, we should be done... */ \

total: 0 errors, 19 warnings, 1623 lines checked

Patch 16/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
17/34 Checking commit 47fcbc98752d (Hexagon (target/hexagon/imported) arch import - macro definitions)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#15: 
new file mode 100755

total: 0 errors, 1 warnings, 1529 lines checked

Patch 17/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
18/34 Checking commit fe828ec04435 (Hexagon (target/hexagon/imported) arch import - instruction semantics)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#19: 
new file mode 100644

total: 0 errors, 1 warnings, 5337 lines checked

Patch 18/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
19/34 Checking commit 56a816808b60 (Hexagon (target/hexagon/imported) arch import - instruction encoding)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#14: 
new file mode 100644

total: 0 errors, 1 warnings, 2385 lines checked

Patch 19/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
20/34 Checking commit 01ee39f0b141 (Hexagon (target/hexagon) generator phase 1 - C preprocessor for semantics)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#17: 
new file mode 100644

ERROR: suspicious ; after while (0)
#82: FILE: target/hexagon/gen_semantics.c:61:
+    } while (0);

total: 1 errors, 1 warnings, 88 lines checked

Patch 20/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

21/34 Checking commit fd0b8efdcba8 (Hexagon (target/hexagon) generator phase 2 - generate header files)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#30: 
new file mode 100755

total: 0 errors, 1 warnings, 1354 lines checked

Patch 21/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
22/34 Checking commit 799a8cb9426a (Hexagon (target/hexagon) generator phase 3 - C preprocessor for decode tree)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#15: 
new file mode 100644

ERROR: Macros with complex values should be enclosed in parenthesis
#76: FILE: target/hexagon/gen_dectree_import.c:57:
+#define REGINFO(TAG, REGINFO, RREGS, WREGS) RREGS,

ERROR: Macros with complex values should be enclosed in parenthesis
#85: FILE: target/hexagon/gen_dectree_import.c:66:
+#define REGINFO(TAG, REGINFO, RREGS, WREGS) WREGS,

ERROR: suspicious ; after while (0)
#182: FILE: target/hexagon/gen_dectree_import.c:163:
+    } while (0);

total: 3 errors, 1 warnings, 191 lines checked

Patch 22/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

23/34 Checking commit 64f25a3e9492 (Hexagon (target/hexagon) generater phase 4 - decode tree)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#13: 
new file mode 100755

total: 0 errors, 1 warnings, 352 lines checked

Patch 23/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
24/34 Checking commit c4e4b843d932 (Hexagon (target/hexagon) opcode data structures)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#11: 
new file mode 100644

ERROR: Macros with complex values should be enclosed in parenthesis
#56: FILE: target/hexagon/opcodes.c:41:
+#define REGINFO(TAG, REGINFO, RREGS, WREGS) REGINFO,

ERROR: Macros with complex values should be enclosed in parenthesis
#66: FILE: target/hexagon/opcodes.c:51:
+#define REGINFO(TAG, REGINFO, RREGS, WREGS) RREGS,

ERROR: Macros with complex values should be enclosed in parenthesis
#76: FILE: target/hexagon/opcodes.c:61:
+#define REGINFO(TAG, REGINFO, RREGS, WREGS) WREGS,

ERROR: Macros with complex values should be enclosed in parenthesis
#180: FILE: target/hexagon/opcodes.c:165:
+#define ATTRIBS(...) , ## __VA_ARGS__, 0

total: 4 errors, 1 warnings, 277 lines checked

Patch 24/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

25/34 Checking commit f44d4844c738 (Hexagon (target/hexagon) macros to interface with the generator)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#13: 
new file mode 100644

total: 0 errors, 1 warnings, 364 lines checked

Patch 25/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
26/34 Checking commit 33ce822b03e9 (Hexagon (target/hexagon) macros referenced in instruction semantics)
27/34 Checking commit be70802f716c (Hexagon (target/hexagon) instruction classes)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#13: 
new file mode 100644

ERROR: Macros with complex values should be enclosed in parenthesis
#138: FILE: target/hexagon/iclass.h:27:
+#define DEF_PP_ICLASS32(TYPE, SLOTS, UNITS)    ICLASS_FROM_TYPE(TYPE),

ERROR: Macros with complex values should be enclosed in parenthesis
#144: FILE: target/hexagon/iclass.h:33:
+#define DEF_EE_ICLASS32(TYPE, SLOTS, UNITS)    ICLASS_FROM_TYPE(TYPE),

total: 2 errors, 1 warnings, 186 lines checked

Patch 27/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

28/34 Checking commit 77ada1dc6124 (Hexagon (target/hexagon) TCG generation helpers)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#14: 
new file mode 100644

total: 0 errors, 1 warnings, 268 lines checked

Patch 28/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
29/34 Checking commit e932b7e1745b (Hexagon (target/hexagon) TCG generation)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#13: 
new file mode 100644

total: 0 errors, 1 warnings, 80 lines checked

Patch 29/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
30/34 Checking commit edab9714dc4f (Hexagon (target/hexagon) TCG for instructions with multiple definitions)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#14: 
new file mode 100644

total: 0 errors, 1 warnings, 213 lines checked

Patch 30/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
31/34 Checking commit a71a593d99c6 (Hexagon (target/hexagon) translation)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#17: 
new file mode 100644

total: 0 errors, 1 warnings, 833 lines checked

Patch 31/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
32/34 Checking commit 3799611b203f (Hexagon (linux-user/hexagon) Linux user emulation)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#41: 
new file mode 100644

total: 0 errors, 1 warnings, 1056 lines checked

Patch 32/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
33/34 Checking commit 4db60b131aa7 (Hexagon (tests/tcg/hexagon) TCG tests)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#37: 
new file mode 100644

total: 0 errors, 1 warnings, 2728 lines checked

Patch 33/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
34/34 Checking commit 149d5a836f53 (Hexagon build infrastructure)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#47: 
new file mode 100644

WARNING: line over 80 characters
#70: FILE: scripts/qemu-binfmt-conf.sh:139:
+hexagon_magic='\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xa4\x00'

ERROR: line over 90 characters
#71: FILE: scripts/qemu-binfmt-conf.sh:140:
+hexagon_mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff'

total: 1 errors, 2 warnings, 243 lines checked

Patch 34/34 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/1597765847-16637-1-git-send-email-tsimpson@quicinc.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 01/34] Hexagon Update MAINTAINERS file
  2020-08-18 15:50 ` [RFC PATCH v3 01/34] Hexagon Update MAINTAINERS file Taylor Simpson
@ 2020-08-26  1:55   ` Richard Henderson
  0 siblings, 0 replies; 122+ messages in thread
From: Richard Henderson @ 2020-08-26  1:55 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> Add Taylor Simpson as the Hexagon target maintainer
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>  MAINTAINERS | 8 ++++++++
>  1 file changed, 8 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [RFC PATCH v3 02/34] Hexagon (target/hexagon) README
  2020-08-18 15:50 ` [RFC PATCH v3 02/34] Hexagon (target/hexagon) README Taylor Simpson
@ 2020-08-26  2:06   ` Richard Henderson
  0 siblings, 0 replies; 122+ messages in thread
From: Richard Henderson @ 2020-08-26  2:06 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +    #ifdef fGEN_TCG_A2_add
> +    fGEN_TCG_A2_add({ RdV=RsV+RtV;});
> +    #else
> +    do {
> +    gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
> +    } while (0);

I don't understand the benefit of passing the SHORTCODE to fGEN_TCG_*.  Is this
file included for helper generation?

It seems to contradict what you have a few lines lower:


> +The gen_tcg.h file has any overrides. For example,
> +    #define fGEN_TCG_A2_add(GENHLPR, SHORTCODE) \
> +        tcg_gen_add_tl(RdV, RsV, RtV)

which has two arguments not one.

Is this README out of date?


r~



* Re: [RFC PATCH v3 03/34] Hexagon (include/elf.h) ELF machine definition
  2020-08-18 15:50 ` [RFC PATCH v3 03/34] Hexagon (include/elf.h) ELF machine definition Taylor Simpson
@ 2020-08-26  2:06   ` Richard Henderson
  0 siblings, 0 replies; 122+ messages in thread
From: Richard Henderson @ 2020-08-26  2:06 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> Define EM_HEXAGON 164
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>  include/elf.h | 2 ++
>  1 file changed, 2 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~




* Re: [RFC PATCH v3 04/34] Hexagon (target/hexagon) scalar core definition
  2020-08-18 15:50 ` [RFC PATCH v3 04/34] Hexagon (target/hexagon) scalar core definition Taylor Simpson
@ 2020-08-26 13:35   ` Richard Henderson
  2020-08-26 23:51     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 13:35 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +#include <fenv.h>

This should not be in cpu.h.  What's up?


> +#define TARGET_PAGE_BITS 16     /* 64K pages */
> +#define TARGET_LONG_BITS 32

Belongs in cpu-param.h

> +#ifdef CONFIG_USER_ONLY
> +#define TOTAL_PER_THREAD_REGS 64
> +#else
...
> +    target_ulong gpr[TOTAL_PER_THREAD_REGS];

Perhaps I don't know Hexagon well enough, but why would the number of general
registers vary with system mode?  Why is the define conditional on user-only?

> +/*
> + * Change HEX_DEBUG to 1 to turn on debugging output
> + */
> +#define HEX_DEBUG 0
> +#define HEX_DEBUG_LOG(...) \
> +    do { \
> +        if (HEX_DEBUG) { \
> +            rcu_read_lock(); \
> +            fprintf(stderr, __VA_ARGS__); \
> +            rcu_read_unlock(); \
> +        } \
> +    } while (0)
> +

No.  There are plenty of bad examples of this in qemu, let's not add another.

First, the lock doesn't do what you think.
Second, stderr is never right.
Third, just about any time you want this, there's a tracepoint that you could
add that would be better, correct, and toggleable from the command-line.
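
To illustrate the "toggleable at run time" point: QEMU already provides
qemu_log_mask() and trace events for this.  The stand-alone sketch below only
shows the shape of the idea, a category mask checked at run time plus a
configurable destination instead of a compile-time switch and a hard-coded
stderr.  All demo_* / DEMO_* names are hypothetical and not part of QEMU or
of this series.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical log categories; QEMU's real masks live in "qemu/log.h". */
enum {
    DEMO_LOG_PKT   = 1 << 0,
    DEMO_LOG_STORE = 1 << 1,
};

static unsigned demo_log_mask;  /* would be filled in from the command line */
static FILE *demo_log_file;     /* NULL means no log destination is set */

/* Print only if the category is enabled; returns the number of bytes written. */
static int demo_log(unsigned category, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (!(demo_log_mask & category) || !demo_log_file) {
        return 0;
    }
    va_start(ap, fmt);
    n = vfprintf(demo_log_file, fmt, ap);
    va_end(ap);
    return n;
}
```

With this shape the call sites stay in the binary and cost one mask test when
logging is disabled, so no rebuild is needed to turn the output on.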

> +/*
> + * One of the main debugging techniques is to use "-d cpu" and compare against
> + * LLDB output when single stepping.  However, the target and qemu put the
> + * stacks at different locations.  This is used to compensate so the diff is
> + * cleaner.
> + */
> +static inline target_ulong hack_stack_ptrs(CPUHexagonState *env,
> +                                           target_ulong addr)
> +{
> +    static bool first = true;
> +    if (first) {
> +        first = false;
> +        env->stack_start = env->gpr[HEX_REG_SP];
> +        env->gpr[HEX_REG_USR] = 0x56000;
> +
> +#define ADJUST_STACK 0
> +#if ADJUST_STACK
> +        /*
> +         * Change the two numbers below to
> +         *     1    qemu stack location
> +         *     2    hardware stack location
> +         * Or set to zero for normal mode (no stack adjustment)
> +         */
> +        env->stack_adjust = 0xfffeeb80 - 0xbf89f980;
> +#else
> +        env->stack_adjust = 0;
> +#endif
> +    }
> +
> +    target_ulong stack_start = env->stack_start;
> +    target_ulong stack_size = 0x10000;
> +    target_ulong stack_adjust = env->stack_adjust;
> +
> +    if (stack_start + 0x1000 >= addr && addr >= (stack_start - stack_size)) {
> +        return addr - stack_adjust;
> +    }
> +    return addr;
> +}

I understand your desire for this sort of comparison.  What I don't understand
is the method.  Surely it would be preferable to actually change the stack
location in qemu, rather than constantly adjust for it.

Add a per-target hook to linux-user/hexagon/target_elf.h that controls the
allocation of the stack in elfload.c, setup_arg_pages().

> +static void hexagon_dump(CPUHexagonState *env, FILE *f)
> +{
> +    static target_ulong last_pc;
> +    int i;
> +
> +    /*
> +     * When comparing with LLDB, it doesn't step through single-cycle
> +     * hardware loops the same way.  So, we just skip them here
> +     */
> +    if (env->gpr[HEX_REG_PC] == last_pc) {
> +        return;
> +    }

Multi-threaded data race.  Might as well move last_pc to env->dump_last_pc or
something.

But I'd also suggest that all of this lldb compatibility stuff be optional,
switchable from the command-line with a cpu property.  Because there are going
to be real cases where *not* single-stepping will result in dumps from the same
PC, and you've just squashed all of those.

Call the property x-lldb-compat, or something, and default it to off.  You then
turn it on with "-cpu v67,x-lldb-compat=on".
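
A minimal sketch of the two suggestions combined, assuming a per-CPU field
and a boolean that the property would set.  DemoCPUState, dump_last_pc,
lldb_compat and demo_should_dump are hypothetical names used only for
illustration; the real state lives in CPUHexagonState.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of the CPU state. */
typedef struct {
    uint32_t pc;
    uint32_t dump_last_pc;  /* per-CPU, replaces the function-local static */
    bool lldb_compat;       /* would be set by an "x-lldb-compat" property */
} DemoCPUState;

/* Returns true if a register dump should be emitted for the current PC. */
static bool demo_should_dump(DemoCPUState *env)
{
    /* Repeated-PC dumps are squashed only when compatibility mode is on. */
    if (env->lldb_compat && env->pc == env->dump_last_pc) {
        return false;
    }
    env->dump_last_pc = env->pc;
    return true;
}
```

Because the last PC is stored per CPU, two threads dumping at once no longer
share state, and with the flag off every dump is emitted.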


r~



* Re: [RFC PATCH v3 05/34] Hexagon (target/hexagon) register names
  2020-08-18 15:50 ` [RFC PATCH v3 05/34] Hexagon (target/hexagon) register names Taylor Simpson
@ 2020-08-26 13:39   ` Richard Henderson
  0 siblings, 0 replies; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 13:39 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>  target/hexagon/hex_regs.h | 83 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 83 insertions(+)
>  create mode 100644 target/hexagon/hex_regs.h

Noting for the record that this apparent out-of-order patch hasn't broken
bisecting because the build is not enabled until the final patch.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>



* Re: [RFC PATCH v3 06/34] Hexagon (disas) disassembler
  2020-08-18 15:50 ` [RFC PATCH v3 06/34] Hexagon (disas) disassembler Taylor Simpson
@ 2020-08-26 13:52   ` Richard Henderson
  2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 13:52 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +    len = disassemble_hexagon(words, i, buf, PACKET_BUFFER_LEN);
> +    slen = strlen(buf);
> +    if (buf[slen - 1] == '\n') {
> +        buf[slen - 1] = '\0';
> +    }
> +    (*info->fprintf_func)(info->stream, "%s", buf);

Normally our disassemblers print the instruction address; sometimes the raw
bytes (or word) of the instruction.

Looking forward to patch 14 where disassemble_hexagon is defined, I see none of
that.  Indeed, if disassembly fails, we get...

> +    if (decode_this(nwords, words, &pkt)) {
> +        snprint_a_pkt(buf, bufsize, &pkt);
> +        return pkt.encod_pkt_size_in_bytes;
> +    } else {
> +        snprintf(buf, bufsize, "<invalid>");
> +        return 0;
> +    }

... no indication at all what happened or where, just "<invalid>".  That's not
going to make it easy to find problems.
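
As a rough sketch of a more useful failure message, the fallback could carry
the address and the raw instruction word.  This assumes both are available at
the point of failure; demo_format_invalid is a hypothetical helper, not a
function from the patch.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format an undecodable instruction word together with its address.
 * Returns what snprintf returns: the length the full string needs.
 */
static int demo_format_invalid(char *buf, size_t bufsize,
                               uint32_t addr, uint32_t word)
{
    return snprintf(buf, bufsize,
                    "0x%08" PRIx32 ":  0x%08" PRIx32 "  <invalid>",
                    addr, word);
}
```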


r~



* Re: [RFC PATCH v3 07/34] Hexagon (target/hexagon) scalar core helpers
  2020-08-18 15:50 ` [RFC PATCH v3 07/34] Hexagon (target/hexagon) scalar core helpers Taylor Simpson
@ 2020-08-26 14:16   ` Richard Henderson
  2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 14:16 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> The majority of helpers are generated.  Define the helper functions needed
> then include the generated file
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>  target/hexagon/helper.h    |  33 ++++
>  target/hexagon/op_helper.c | 368 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 401 insertions(+)
>  create mode 100644 target/hexagon/helper.h
>  create mode 100644 target/hexagon/op_helper.c
> 
> diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
> new file mode 100644
> index 0000000..48b1917
> --- /dev/null
> +++ b/target/hexagon/helper.h
> @@ -0,0 +1,33 @@
> +/*
> + *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +DEF_HELPER_2(raise_exception, noreturn, env, i32)
> +DEF_HELPER_1(debug_start_packet, void, env)
> +DEF_HELPER_3(debug_check_store_width, void, env, int, int)
> +DEF_HELPER_3(debug_commit_end, void, env, int, int)
> +DEF_HELPER_3(merge_inflight_store1s, s32, env, s32, s32)
> +DEF_HELPER_3(merge_inflight_store1u, s32, env, s32, s32)
> +DEF_HELPER_3(merge_inflight_store2s, s32, env, s32, s32)
> +DEF_HELPER_3(merge_inflight_store2u, s32, env, s32, s32)
> +DEF_HELPER_3(merge_inflight_store4s, s32, env, s32, s32)
> +DEF_HELPER_3(merge_inflight_store4u, s32, env, s32, s32)
> +DEF_HELPER_3(merge_inflight_store8u, s64, env, s32, s64)

Please use DEF_HELPER_FLAGS_N when possible.

You should be able to know when

(1) No exceptions are possible, and nothing is touched except the inputs.  In
this case use TCG_CALL_NO_RWG_SE.

(2) No exceptions are possible, and nothing in env is touched for which you
have created a tcg variable in hexagon_translate_init().  In this case use
TCG_CALL_NO_RWG.

(3) Exceptions are possible, and no tcg variables are touched on the
non-exceptional path.  In this case use TCG_CALL_NO_WG.

> +static inline void log_pred_write(CPUHexagonState *env, int pnum,
> +                                  target_ulong val)

You might be better off letting the compiler decide about inlining.

> +/*
> + * Handle mem_noshuf
> + *
> + * This occurs when there is a load that might need data forwarded
> + * from an inflight store in slot 1.  Note that the load and store
> + * might have different sizes, so we can't simply compare the
> + * addresses.  We merge only the bytes that overlap (if any).
> + */
> +static int64_t merge_bytes(CPUHexagonState *env, target_ulong load_addr,
> +                           int64_t load_data, uint32_t load_width)
> +{
> +    /* Don't do anything if slot 1 was cancelled */
> +    const int store_slot = 1;
> +    if (env->slot_cancelled & (1 << store_slot)) {
> +        return load_data;
> +    }
> +
> +    int store_width = env->mem_log_stores[store_slot].width;
> +    target_ulong store_addr = env->mem_log_stores[store_slot].va;
> +    union {
> +        uint8_t bytes[8];
> +        uint32_t data32;
> +        uint64_t data64;
> +    } retdata, storedata;

Red flag here.  This is assuming that the host and the target are both
little-endian.

> +    int bigmask = ((-load_width) & (-store_width));
> +    if ((load_addr & bigmask) != (store_addr & bigmask)) {

Huh?  This doesn't detect overlap.  Counterexample:

  load_addr = 0, load_width = 4,
  store_addr = 1, store_width = 1.

  bigmask = -4 & -1 -> -4.

  (0 & -4) != (1 & -4) -> 0 != 1
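
A width-independent way to state the overlap test is a half-open interval
comparison.  The sketch below assumes 32-bit addresses and widths in bytes;
demo_ranges_overlap is a hypothetical name, not from the patch.  It makes no
alignment assumption, whereas a mask comparison is equivalent to it only when
both accesses are naturally aligned and power-of-two sized.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if [a_addr, a_addr + a_width) and [b_addr, b_addr + b_width) share a byte. */
static bool demo_ranges_overlap(uint32_t a_addr, uint32_t a_width,
                                uint32_t b_addr, uint32_t b_width)
{
    /* Widen to 64 bits so that addr + width cannot wrap around. */
    uint64_t a_end = (uint64_t)a_addr + a_width;
    uint64_t b_end = (uint64_t)b_addr + b_width;

    return a_addr < b_end && b_addr < a_end;
}
```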


> +    while ((i < load_width) && (j < store_width)) {
> +        retdata.bytes[i] = storedata.bytes[j];
> +        i++;
> +        j++;
> +    }
> +    return retdata.data64;

This most definitely requires host-endian adjustment.
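
One way to avoid the host byte-order dependency is to address bytes by shift
instead of through a union.  The sketch below is a minimal illustration under
these assumptions: values are little-endian register values (byte 0 is the
lowest address), widths are at most 8, and demo_merge_bytes is a hypothetical
stand-in rather than the patch's function.

```c
#include <stdint.h>

/*
 * Overlay the bytes of an in-flight store onto loaded data.  Only the
 * bytes whose addresses fall inside the store are replaced.
 */
static uint64_t demo_merge_bytes(uint32_t load_addr, uint64_t load_data,
                                 uint32_t load_width,
                                 uint32_t store_addr, uint64_t store_data,
                                 uint32_t store_width)
{
    uint64_t store_end = (uint64_t)store_addr + store_width;
    uint64_t result = load_data;

    for (uint32_t i = 0; i < load_width; i++) {
        uint64_t byte_addr = (uint64_t)load_addr + i;

        if (byte_addr >= store_addr && byte_addr < store_end) {
            uint32_t j = (uint32_t)(byte_addr - store_addr);
            uint64_t byte = (store_data >> (8 * j)) & 0xff;

            /* Clear byte i of the result, then drop the store byte in. */
            result &= ~((uint64_t)0xff << (8 * i));
            result |= byte << (8 * i);
        }
    }
    return result;
}
```

Since only shifts and masks on integer values are used, the result is the
same on big-endian and little-endian hosts.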

> +/* Helpful for printing intermediate values within instructions */
> +void HELPER(debug_value)(CPUHexagonState *env, int32_t value)
> +{
> +    HEX_DEBUG_LOG("value = 0x%x\n", value);
> +}
> +
> +void HELPER(debug_value_i64)(CPUHexagonState *env, int64_t value)
> +{
> +    HEX_DEBUG_LOG("value_i64 = 0x%lx\n", value);
> +}

I think we need to lose all of this.
There are other ways to debug TCG.


r~



* Re: [RFC PATCH v3 08/34] Hexagon (target/hexagon) GDB Stub
  2020-08-18 15:50 ` [RFC PATCH v3 08/34] Hexagon (target/hexagon) GDB Stub Taylor Simpson
@ 2020-08-26 14:17   ` Richard Henderson
  0 siblings, 0 replies; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 14:17 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> GDB register read and write routines
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>  target/hexagon/internal.h |  2 ++
>  target/hexagon/cpu.c      |  2 ++
>  target/hexagon/gdbstub.c  | 49 +++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 53 insertions(+)
>  create mode 100644 target/hexagon/gdbstub.c


Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [RFC PATCH v3 09/34] Hexagon (target/hexagon) architecture types
  2020-08-18 15:50 ` [RFC PATCH v3 09/34] Hexagon (target/hexagon) architecture types Taylor Simpson
@ 2020-08-26 14:19   ` Richard Henderson
  2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 14:19 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +#ifndef HEXAGON_ARCH_TYPES_H
> +#define HEXAGON_ARCH_TYPES_H
> +
> +#include <stdint.h>

Do you really need to re-include this?
This was done in "qemu/osdep.h".

In general, osdep.h must be included first, and it takes care of all of the
basic system includes.


r~



* Re: [RFC PATCH v3 10/34] Hexagon (target/hexagon) instruction and packet types
  2020-08-18 15:50 ` [RFC PATCH v3 10/34] Hexagon (target/hexagon) instruction and packet types Taylor Simpson
@ 2020-08-26 14:22   ` Richard Henderson
  2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 14:22 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +struct Instruction {
> +    semantic_insn_t generate;            /* pointer to genptr routine */
> +    size1u_t regno[REG_OPERANDS_MAX];    /* reg operands including predicates */
> +    size2u_t opcode;
> +
> +    size4u_t iclass:6;
> +    size4u_t slot:3;
> +    size4u_t part1:1;        /*
> +                              * cmp-jumps are split into two insns.
> +                              * set for the compare and clear for the jump
> +                              */
> +    size4u_t extension_valid:1;   /* Has a constant extender attached */
> +    size4u_t which_extended:1;    /* If has an extender, which immediate */
> +    size4u_t is_endloop:1;   /* This is an end of loop */
> +    size4u_t new_value_producer_slot:4;
> +    size4s_t immed[IMMEDS_MAX];    /* immediate field */
> +};

Is this an imported file or not?

If it is not imported, I'd very much prefer that we stick to the stdint.h type
names.


r~



* Re: [RFC PATCH v3 11/34] Hexagon (target/hexagon) register fields
  2020-08-18 15:50 ` [RFC PATCH v3 11/34] Hexagon (target/hexagon) register fields Taylor Simpson
@ 2020-08-26 14:29   ` Richard Henderson
  2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 14:29 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> Declare bitfields within registers such as user status register (USR)
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>  target/hexagon/reg_fields.h     | 40 +++++++++++++++++++++
>  target/hexagon/reg_fields_def.h | 78 +++++++++++++++++++++++++++++++++++++++++
>  target/hexagon/reg_fields.c     | 28 +++++++++++++++
>  3 files changed, 146 insertions(+)
>  create mode 100644 target/hexagon/reg_fields.h
>  create mode 100644 target/hexagon/reg_fields_def.h
>  create mode 100644 target/hexagon/reg_fields.c
> 
> diff --git a/target/hexagon/reg_fields.h b/target/hexagon/reg_fields.h
> new file mode 100644
> index 0000000..cf168f0
> --- /dev/null
> +++ b/target/hexagon/reg_fields.h
> @@ -0,0 +1,40 @@
> +/*
> + *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HEXAGON_REG_FIELDS_H
> +#define HEXAGON_REG_FIELDS_H
> +
> +#define NUM_GEN_REGS 32

What's this?  It doesn't appear to be field related.

> +extern reg_field_t reg_field_info[];

const.

> +enum reg_fields_enum {

Doesn't follow naming guidelines.  But you don't actually use the name at all,
so better to just drop the name entirely?

> +/* USR fields */
> +DEF_REG_FIELD(USR_OVF,
> +    "ovf", 0, 1,
> +    "Sticky Saturation Overflow - "
> +    "Set when saturation occurs while executing instruction that specifies "
> +    "optional saturation, remains set until explicitly cleared by a USR=Rs "
> +    "instruction.")

Is the description as a string really useful, or even used?
A comment would seem to do just as well, not consume space in the final binary,
and even then seems redundant with the actual architecture manual.


r~



* Re: [RFC PATCH v3 12/34] Hexagon (target/hexagon) instruction attributes
  2020-08-18 15:50 ` [RFC PATCH v3 12/34] Hexagon (target/hexagon) instruction attributes Taylor Simpson
@ 2020-08-26 14:34   ` Richard Henderson
  0 siblings, 0 replies; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 14:34 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +#define ATTRIB_WIDTH 32
> +#define GET_ATTRIB(opcode, attrib) \
> +    (((opcode_attribs[opcode][attrib / ATTRIB_WIDTH])\
> +    >> (attrib % ATTRIB_WIDTH)) & 0x1)

Can you define GET_ATTRIB in terms of qemu/bitops.h?

I'm leery of ATTRIB_WIDTH being separate from the actual definition of
opcode_attribs, over in opcodes.h.

Why does attribs.h need to live separately?  They're clearly closely related,
and cannot in fact be used separately.


r~



* Re: [RFC PATCH v3 13/34] Hexagon (target/hexagon) register map
  2020-08-18 15:50 ` [RFC PATCH v3 13/34] Hexagon (target/hexagon) register map Taylor Simpson
@ 2020-08-26 14:36   ` Richard Henderson
  2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 14:36 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +#ifndef HEXAGON_REGMAP_H
> +#define HEXAGON_REGMAP_H
> +
> +        /* Name   Num Table */
> +DEF_REGMAP(R_16,  16, 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23)
> +DEF_REGMAP(R__8,  8,  0, 2, 4, 6, 16, 18, 20, 22)
> +DEF_REGMAP(R__4,  4,  0, 2, 4, 6)
> +DEF_REGMAP(R_4,   4,  0, 1, 2, 3)
> +DEF_REGMAP(R_8S,  8,  0, 1, 2, 3, 16, 17, 18, 19)
> +DEF_REGMAP(R_8,   8,  0, 1, 2, 3, 4, 5, 6, 7)
> +DEF_REGMAP(V__8,  8,  0, 4, 8, 12, 16, 20, 24, 28)
> +DEF_REGMAP(V__16, 16, 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30)

Given that DEF_REGMAP itself is defined in decode.c, and not even in another
header file, why do these not live in decode.c as well?


r~



* Re: [RFC PATCH v3 14/34] Hexagon (target/hexagon) instruction/packet decode
  2020-08-18 15:50 ` [RFC PATCH v3 14/34] Hexagon (target/hexagon) instruction/packet decode Taylor Simpson
@ 2020-08-26 15:06   ` Richard Henderson
  2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 15:06 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +#define DECODE_NEW_TABLE(TAG, SIZE, WHATNOT) \
> +    static struct _dectree_table_struct dectree_table_##TAG;

All of these little structures should be const.


> +typedef struct {
> +    struct _dectree_table_struct *table_link;
> +    struct _dectree_table_struct *table_link_b;
> +    opcode_t opcode;
> +    enum {
> +        DECTREE_ENTRY_INVALID,
> +        DECTREE_TABLE_LINK,
> +        DECTREE_SUBINSNS,
> +        DECTREE_EXTSPACE,
> +        DECTREE_TERMINAL
> +    } type;
> +} dectree_entry_t;

Which probably requires these links to be const, and a few uses within the
functions that use them.

That should move 78K worth of tables into .data.rel.ro, where they can't be
modified after relocation.
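
A minimal sketch of the const-ification suggested above. Only the entry layout
follows the quoted patch; the table struct, the table names
(dectree_table_root, dectree_table_leaf), the opcode values and the lookup
helper are hypothetical and exist only to make the example compile on its own.

```c
#include <assert.h>
#include <stddef.h>

typedef int opcode_t;                       /* stand-in for the real type */

struct dectree_table;

typedef struct {
    const struct dectree_table *table_link; /* link is pointer-to-const */
    opcode_t opcode;
    enum {
        DECTREE_ENTRY_INVALID,
        DECTREE_TABLE_LINK,
        DECTREE_TERMINAL
    } type;
} dectree_entry_t;

struct dectree_table {
    size_t size;
    dectree_entry_t table[2];               /* fixed size only for the demo */
};

/* Both tables are const, so the toolchain may place them in read-only data. */
static const struct dectree_table dectree_table_leaf = {
    .size = 2,
    .table = {
        { .opcode = 7, .type = DECTREE_TERMINAL },
        { .type = DECTREE_ENTRY_INVALID },
    },
};

static const struct dectree_table dectree_table_root = {
    .size = 2,
    .table = {
        { .table_link = &dectree_table_leaf, .type = DECTREE_TABLE_LINK },
        { .opcode = 9, .type = DECTREE_TERMINAL },
    },
};

/* Follow one index per level; every access goes through pointer-to-const. */
static opcode_t dectree_lookup(const struct dectree_table *t,
                               const unsigned *path)
{
    for (;;) {
        const dectree_entry_t *e;

        assert(*path < t->size);
        e = &t->table[*path++];
        if (e->type == DECTREE_TABLE_LINK) {
            t = e->table_link;
        } else {
            return e->type == DECTREE_TERMINAL ? e->opcode : -1;
        }
    }
}
```

The users of the tables only need their local pointers changed to
pointer-to-const, as in dectree_lookup() here.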

> +            /* previous insn is the producer */
> +            def_opcode = packet->insn[def_idx].opcode;
> +            dststr = strstr(opcode_wregs[def_opcode], "Rd");
> +            if (dststr) {
> +                dststr = strchr(opcode_reginfo[def_opcode], 'd');
> +            } else {
> +                dststr = strstr(opcode_wregs[def_opcode], "Rx");
> +                if (dststr) {
> +                    dststr = strchr(opcode_reginfo[def_opcode], 'x');
> +                } else {
> +                    dststr = strstr(opcode_wregs[def_opcode], "Re");
> +                    if (dststr) {
> +                        dststr = strchr(opcode_reginfo[def_opcode], 'e');
> +                    } else {
> +                        dststr = strstr(opcode_wregs[def_opcode], "Ry");
> +                        if (dststr) {
> +                            dststr = strchr(opcode_reginfo[def_opcode], 'y');
> +                        } else {
> +                            g_assert_not_reached();
> +                            return 1;
> +                        }
> +                    }
> +                }
> +            }

I can't help but think that all of this string manipulation is not the
ideal way to implement a decoder.  Surely all this can be pre-processed into
code, or flags, or an enumeration or something.

Oh, and g_assert_not_reached() will never return.  You don't have to keep
putting return statements after it.  Please remove all of those, everywhere.
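
One possible shape for this, as a sketch only: the generator emits an
enumeration per opcode, and the nested strstr() chain becomes a table lookup.
The names DestRegKind, opcode_dest_kind, dest_reg_letter and the OP_* opcodes
are made up for illustration, and abort() stands in for g_assert_not_reached()
so the example builds without glib.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum {
    DEST_NONE,
    DEST_RD,
    DEST_RX,
    DEST_RE,
    DEST_RY,
} DestRegKind;

enum { OP_A, OP_B, OP_C, OP_LAST };     /* hypothetical opcodes */

/* Would be emitted by the generator instead of the "Rd"/"Rx" strings. */
static const DestRegKind opcode_dest_kind[OP_LAST] = {
    [OP_A] = DEST_RD,
    [OP_B] = DEST_RX,
    [OP_C] = DEST_NONE,
};

/* Operand letter of the destination register for a producer opcode. */
static char dest_reg_letter(int opcode)
{
    assert(opcode >= 0 && opcode < OP_LAST);
    switch (opcode_dest_kind[opcode]) {
    case DEST_RD:
        return 'd';
    case DEST_RX:
        return 'x';
    case DEST_RE:
        return 'e';
    case DEST_RY:
        return 'y';
    case DEST_NONE:
        break;
    }
    /* Never returns, so no return statement is needed after it. */
    abort();
}
```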

I'm going to ignore the rest of the decoder, as it's somewhat bewildering.
Even in the final patches-applied form.  It could really use some more commentary.


r~



* Re: [RFC PATCH v3 15/34] Hexagon (target/hexagon) instruction printing
  2020-08-18 15:50 ` [RFC PATCH v3 15/34] Hexagon (target/hexagon) instruction printing Taylor Simpson
@ 2020-08-26 15:08   ` Richard Henderson
  0 siblings, 0 replies; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 15:08 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +extern void snprint_a_pkt(char *buf, int n, packet_t *pkt);

Is there a good reason you're using a fixed size buffer and not returning a
GString?


r~



* Re: [RFC PATCH v3 16/34] Hexagon (target/hexagon) utility functions
  2020-08-18 15:50 ` [RFC PATCH v3 16/34] Hexagon (target/hexagon) utility functions Taylor Simpson
@ 2020-08-26 15:10   ` Richard Henderson
  2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 15:10 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +extern size16s_t cast8s_to_16s(size8s_t a);
> +extern size8s_t cast16s_to_8s(size16s_t a);
> +extern size16s_t add128(size16s_t a, size16s_t b);
> +extern size16s_t sub128(size16s_t a, size16s_t b);
> +extern size16s_t shiftr128(size16s_t a, size4u_t n);
> +extern size16s_t shiftl128(size16s_t a, size4u_t n);
> +extern size16s_t and128(size16s_t a, size16s_t b);

Did you really need to reinvent qemu/int128.h?


r~



* Re: [RFC PATCH v3 17/34] Hexagon (target/hexagon/imported) arch import - macro definitions
  2020-08-18 15:50 ` [RFC PATCH v3 17/34] Hexagon (target/hexagon/imported) arch import - macro definitions Taylor Simpson
@ 2020-08-26 15:17   ` Richard Henderson
  2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 15:17 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> The macro definitions specify instruction attributes that are applied
> to each instruction that references the macro. The generator will
> recursively apply attributes to each instruction that used the macro.
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>  target/hexagon/imported/macros.def | 1529 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 1529 insertions(+)
>  create mode 100755 target/hexagon/imported/macros.def


You might as well squash all of the patches dealing with imported files,
because they're beyond review.  They are what they are: another project's files.

Give that squashed patch my

Acked-by: Richard Henderson <richard.henderson@linaro.org>


r~


PS: I notice that the README mentions archlib, but does not give a pointer to it.




* Re: [RFC PATCH v3 24/34] Hexagon (target/hexagon) opcode data structures
  2020-08-18 15:50 ` [RFC PATCH v3 24/34] Hexagon (target/hexagon) opcode data structures Taylor Simpson
@ 2020-08-26 15:25   ` Richard Henderson
  2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-26 15:25 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +extern const char *opcode_names[];
> +
> +extern const char *opcode_reginfo[];
> +extern const char *opcode_rregs[];
> +extern const char *opcode_wregs[];

const char * const


> +extern opcode_encoding_t opcode_encodings[XX_LAST_OPCODE];

const.

> +extern size4u_t
> +    opcode_attribs[XX_LAST_OPCODE][(A_ZZ_LASTATTRIB / ATTRIB_WIDTH) + 1];

const.

And using qemu/bitops.h if possible, as discussed earlier vs attribs.h.

> +const char *opcode_short_semantics[] = {
> +#define OPCODE(X)              NULL
> +#include "opcodes_def_generated.h"
> +#undef OPCODE
> +    NULL
> +};
...
> +#define DEF_SHORTCODE(TAG, SHORTCODE) \
> +    opcode_short_semantics[TAG] = #SHORTCODE;
> +#include "shortcode_generated.h"
> +#undef DEF_SHORTCODE

Just initialize opcode_short_semantics with shortcode_generated.h in the first
place.  Then you don't need to create a table of NULL and overwrite at startup.

And you can also make the table constant.
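
A sketch of the direct initialization, assuming designated initializers are
acceptable here.  SHORTCODE_LIST is a stand-in for the contents of
shortcode_generated.h (which is produced at build time), and the OP_* tags,
their semantics strings and the short_semantics() accessor are made up for
the example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for what #include "shortcode_generated.h" would provide. */
#define SHORTCODE_LIST \
    DEF_SHORTCODE(OP_ADD, { RdV = RsV + RtV; }) \
    DEF_SHORTCODE(OP_SUB, { RdV = RtV - RsV; })

enum { OP_ADD, OP_SUB, OP_NOP, XX_LAST_OPCODE };

/* Filled in at compile time; opcodes without a shortcode stay NULL. */
static const char * const opcode_short_semantics[XX_LAST_OPCODE] = {
#define DEF_SHORTCODE(TAG, SHORTCODE) [TAG] = #SHORTCODE,
    SHORTCODE_LIST
#undef DEF_SHORTCODE
};

/* Bounds-checked accessor, only here to exercise the table. */
static const char *short_semantics(int opcode)
{
    assert(opcode >= 0 && opcode < XX_LAST_OPCODE);
    return opcode_short_semantics[opcode];
}
```

With this form there is no startup pass over the table, and both the pointers
and the strings are const.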

> +    if (p == NULL) {
> +        g_assert_not_reached();
> +        return 0;
> +    }

I prefer assert(p != NULL) to if (test) { abort(); }, where possible.  E.g.
this later one is fine:

> +    if (islower(*p)) {
> +        return 0;
> +    } else if (isupper(*p)) {
> +        return 1;
> +    } else {
> +        g_assert_not_reached();
> +    }


r~



* RE: [RFC PATCH v3 04/34] Hexagon (target/hexagon) scalar core definition
  2020-08-26 13:35   ` Richard Henderson
@ 2020-08-26 23:51     ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-26 23:51 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

Thanks for the feedback, Richard!!



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, August 26, 2020 7:36 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 04/34] Hexagon (target/hexagon) scalar core
> definition
>
> On 8/18/20 8:50 AM, Taylor Simpson wrote:
> > +#include <fenv.h>
>
> This should not be in cpu.h.  What's up?

We're not using qemu softfloat (it's on the TODO list), so there's a fenv_t field in CPUHexagonState.

> > +#define TARGET_PAGE_BITS 16     /* 64K pages */
> > +#define TARGET_LONG_BITS 32
>
> Belongs in cpu-param.h

OK

> > +#ifdef CONFIG_USER_ONLY
> > +#define TOTAL_PER_THREAD_REGS 64
> > +#else
> ...
> > +    target_ulong gpr[TOTAL_PER_THREAD_REGS];
>
> Do I not understand hexagon enough to know why the number of general
> registers
> would vary with system mode?  Why is the define conditional on user-only?

Yes, there are some registers that are only available in system mode.  Since this series is only for linux-user mode, I'll remove the ifdef for now.

We're working on system mode.  When that series is ready, we can put the ifdef back in.  At that time, you'll also see the extra registers in hex_regs.h.

> No.  There are plenty of bad examples of this in qemu, let's not add another.
>
> First, the lock doesn't do what you think.
> Second, stderr is never right.
> Third, just about any time you want this, there's a tracepoint that you could
> add that would be better, correct, and toggleable from the command-line.

OK

> I understand your desire for this sort of comparison.  What I don't
> understand
> is the method.  Surely it would be preferable to actually change the stack
> location in qemu, rather than constantly adjust for it.
>
> Add a per-target hook to linux-user/hexagon/target_elf.h that controls the
> allocation of the stack in elfload.c, setup_arg_pages().

OK, will look into this.  Thanks for the advice, I wasn't aware this was possible.

>
> > +static void hexagon_dump(CPUHexagonState *env, FILE *f)
> > +{
> > +    static target_ulong last_pc;
> > +    int i;
> > +
> > +    /*
> > +     * When comparing with LLDB, it doesn't step through single-cycle
> > +     * hardware loops the same way.  So, we just skip them here
> > +     */
> > +    if (env->gpr[HEX_REG_PC] == last_pc) {
> > +        return;
> > +    }
>
> Multi-threaded data race.  Might as well move last_pc to env->dump_last_pc
> or
> something.
>
> But I'd also suggest that all of this lldb compatibility stuff be optional,
> switchable from the command-line with a cpu property.  Because there are
> going
> to be real cases where *not* single-stepping will result in dumps from the
> same
> PC, and you've just squashed all of those.
>
> Call the property x-lldb-compat, or something, and default it to off.  You then
> turn it on with "-cpu v67,x-lldb-compat=on".

OK



* RE: [RFC PATCH v3 06/34] Hexagon (disas) disassembler
  2020-08-26 13:52   ` Richard Henderson
@ 2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-26 23:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, August 26, 2020 7:52 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 06/34] Hexagon (disas) disassembler
>
> Normally our disassemblers print the instruction address; sometimes the raw
> bytes (or word) of the instruction.

OK



* RE: [RFC PATCH v3 07/34] Hexagon (target/hexagon) scalar core helpers
  2020-08-26 14:16   ` Richard Henderson
@ 2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-26 23:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, August 26, 2020 8:17 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 07/34] Hexagon (target/hexagon) scalar core
> helpers
>
> > +DEF_HELPER_3(merge_inflight_store8u, s64, env, s32, s64)
>
> Please use DEF_HELPER_FLAGS_N when possible.

OK

> > +static inline void log_pred_write(CPUHexagonState *env, int pnum,
> > +                                  target_ulong val)
>
> You might be better off letting the compiler decide about inlining.

Isn't "inline" just a hint to the compiler?

> > +    union {
> > +        uint8_t bytes[8];
> > +        uint32_t data32;
> > +        uint64_t data64;
> > +    } retdata, storedata;
>
> Red flag here.  This is assuming that the host and the target are both
> little-endian.

OK

> > +    int bigmask = ((-load_width) & (-store_width));
> > +    if ((load_addr & bigmask) != (store_addr & bigmask)) {
>
> Huh?  This doesn't detect overlap.  Counter example:
>
>   load_addr = 0, load_width = 4,
>   store_addr = 1, store_width = 1.
>
>   bigmask = -4 & -1 -> -4.
>
>   (0 & -4) != (1 & -4) -> 0 != 1

OK

> > +    while ((i < load_width) && (j < store_width)) {
> > +        retdata.bytes[i] = storedata.bytes[j];
> > +        i++;
> > +        j++;
> > +    }
> > +    return retdata.data64;
>
> This most definitely requires host-endian adjustment.

OK

> > +/* Helpful for printing intermediate values within instructions */
> > +void HELPER(debug_value)(CPUHexagonState *env, int32_t value)
> > +{
> > +    HEX_DEBUG_LOG("value = 0x%x\n", value);
> > +}
> > +
> > +void HELPER(debug_value_i64)(CPUHexagonState *env, int64_t value)
> > +{
> > +    HEX_DEBUG_LOG("value_i64 = 0x%lx\n", value);
> > +}
>
> I think we need to lose all of this.
> There are other ways to debug TCG.

OK



* RE: [RFC PATCH v3 09/34] Hexagon (target/hexagon) architecture types
  2020-08-26 14:19   ` Richard Henderson
@ 2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-26 23:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, August 26, 2020 8:19 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 09/34] Hexagon (target/hexagon) architecture
> types
>
> > +#include <stdint.h>
>
> Do you really need to re-include this?
> This was done in "qemu/osdep.h".
>
> In general, osdep.h must be included first, and it takes care of all of the
> basic system includes.

OK


* RE: [RFC PATCH v3 10/34] Hexagon (target/hexagon) instruction and packet types
  2020-08-26 14:22   ` Richard Henderson
@ 2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-26 23:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, August 26, 2020 8:22 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: ale@rev.ng; riku.voipio@iki.fi; laurent@vivier.eu; philmd@redhat.com;
> aleksandar.m.mail@gmail.com
> Subject: Re: [RFC PATCH v3 10/34] Hexagon (target/hexagon) instruction and
> packet types
>
> On 8/18/20 8:50 AM, Taylor Simpson wrote:
> > +struct Instruction {
> > +    semantic_insn_t generate;            /* pointer to genptr routine */
> > +    size1u_t regno[REG_OPERANDS_MAX];    /* reg operands including
> predicates */
> > +    size2u_t opcode;
> > +
> > +    size4u_t iclass:6;
> > +    size4u_t slot:3;
> > +    size4u_t part1:1;        /*
> > +                              * cmp-jumps are split into two insns.
> > +                              * set for the compare and clear for the jump
> > +                              */
> > +    size4u_t extension_valid:1;   /* Has a constant extender attached */
> > +    size4u_t which_extended:1;    /* If has an extender, which immediate
> */
> > +    size4u_t is_endloop:1;   /* This is an end of loop */
> > +    size4u_t new_value_producer_slot:4;
> > +    size4s_t immed[IMMEDS_MAX];    /* immediate field */
> > +};
>
> Is this an imported file or not?
>
> If it is not imported, I'd very much prefer that we stick to the stdint.h type
> names.

Agreed.  My goal is to stick with stdint.h types outside the imported directory.  I'll change this.



* RE: [RFC PATCH v3 11/34] Hexagon (target/hexagon) register fields
  2020-08-26 14:29   ` Richard Henderson
@ 2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-26 23:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, August 26, 2020 8:30 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 11/34] Hexagon (target/hexagon) register fields
>
> > +#define NUM_GEN_REGS 32
>
> What's this?  It doesn't appear to be field related.

Not needed, will remove it.

> > +extern reg_field_t reg_field_info[];
>
> const.

OK

> > +enum reg_fields_enum {
>
> Doesn't follow naming guidelines.  But you don't actually use the name at all,
> so better to just drop the name entirely?

OK

> > +/* USR fields */
> > +DEF_REG_FIELD(USR_OVF,
> > +    "ovf", 0, 1,
> > +    "Sticky Saturation Overflow - "
> > +    "Set when saturation occurs while executing instruction that specifies "
> > +    "optional saturation, remains set until explicitly cleared by a USR=Rs "
> > +    "instruction.")
>
> Is the description as a string really useful, or even used?
> A comment would seem to do just as well, not consume space in the final
> binary,
> and even then seems redundant with the actual architecture manual.

I thought they helped make the code more readable.  You are right that they shouldn't take space in the binary.

I can either change them so they don't go into the binary or remove them altogether.  I guess I'll remove them altogether.


* RE: [RFC PATCH v3 13/34] Hexagon (target/hexagon) register map
  2020-08-26 14:36   ` Richard Henderson
@ 2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-26 23:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, August 26, 2020 8:36 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 13/34] Hexagon (target/hexagon) register map
>
> > +DEF_REGMAP(V__16, 16, 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28,
> 30)
>
> Given that DEF_REGMAP itself is defined in decode.c, and not even in
> another
> header file, why do these not live in decode.c as well?

I'll move them to decode.c.


* RE: [RFC PATCH v3 14/34] Hexagon (target/hexagon) instruction/packet decode
  2020-08-26 15:06   ` Richard Henderson
@ 2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-26 23:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, August 26, 2020 9:07 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 14/34] Hexagon (target/hexagon)
> instruction/packet decode
>
> On 8/18/20 8:50 AM, Taylor Simpson wrote:
> > +#define DECODE_NEW_TABLE(TAG, SIZE, WHATNOT) \
> > +    static struct _dectree_table_struct dectree_table_##TAG;
>
> All of these little structures should be const.
>
> Which probably requires these links to be const, and a few uses within the
> functions that use them.
>
> That should move 78K worth of tables into .data.rel.ro, where they can't be
> modified after relocation.

OK

> I can't help but think that all of this string manipulation is not the
> ideal way to implement a decoder.  Surely all this can be pre-processed into
> code, or flags, or an enumeration or something.

I'll have to think about this one.

>
> Oh, and g_assert_not_reached() will never return.  You don't have to keep
> putting return statements after it.  Please remove all of those, everywhere.

OK

> I'm going to ignore the rest of the decoder, as it's somewhat bewildering.
> Even in the final patches-applied form.  It could really use some more
> commentary.

OK


* RE: [RFC PATCH v3 16/34] Hexagon (target/hexagon) utility functions
  2020-08-26 15:10   ` Richard Henderson
@ 2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-26 23:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, August 26, 2020 9:11 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 16/34] Hexagon (target/hexagon) utility functions
>
> Did you really need to reinvent qemu/int128.h?

I remember looking at qemu/int128.h briefly but can't remember why I ended up not using it.  I'll take another look.


* RE: [RFC PATCH v3 17/34] Hexagon (target/hexagon/imported) arch import - macro definitions
  2020-08-26 15:17   ` Richard Henderson
@ 2020-08-26 23:52     ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-26 23:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, August 26, 2020 9:17 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 17/34] Hexagon (target/hexagon/imported) arch
> import - macro definitions
>
> You might as well squash all of the patches dealing with imported files,
> because they're beyond review.  They are what they are: another project's
> files.
>
> Give that squashed patch my
>
> Acked-by: Richard Henderson <richard.henderson@linaro.org>

OK

> PS: I notice that the README mentions archlib, but does not give a pointer to
> it.

Other than what we import here, it's not open source.



* RE: [RFC PATCH v3 24/34] Hexagon (target/hexagon) opcode data structures
  2020-08-26 15:25   ` Richard Henderson
@ 2020-08-26 23:52     ` Taylor Simpson
  2020-08-27  4:05       ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-26 23:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, August 26, 2020 9:26 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 24/34] Hexagon (target/hexagon) opcode data
> structures
>
> > +extern const char *opcode_names[];
> > +
> > +extern const char *opcode_reginfo[];
> > +extern const char *opcode_rregs[];
> > +extern const char *opcode_wregs[];
>
> const char * const

OK

> > +extern opcode_encoding_t opcode_encodings[XX_LAST_OPCODE];
>
> const.

OK

> > +extern size4u_t
> > +    opcode_attribs[XX_LAST_OPCODE][(A_ZZ_LASTATTRIB /
> ATTRIB_WIDTH) + 1];
>
> const.

OK

> And using qemu/bitops.h if possible, as discussed earlier vs attribs.h.

Do you mean replace the GET_ATTRIB macro with test_bit from qemu/bitops.h?

> Just initialize opcode_short_semantics with shortcode_generated.h in the
> first
> place.  Then you don't need to create a table of NULL and overwrite at
> startup.
>
> And you can also make the table constant.

OK

> > +    if (p == NULL) {
> > +        g_assert_not_reached();
> > +        return 0;
> > +    }
>
> I prefer assert(p != NULL) to if (test) { abort(); }, where possible.  E.g.
> this later one is fine:

OK



* Re: [RFC PATCH v3 24/34] Hexagon (target/hexagon) opcode data structures
  2020-08-26 23:52     ` Taylor Simpson
@ 2020-08-27  4:05       ` Richard Henderson
  0 siblings, 0 replies; 122+ messages in thread
From: Richard Henderson @ 2020-08-27  4:05 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/26/20 4:52 PM, Taylor Simpson wrote:
>> And using qemu/bitops.h if possible, as discussed earlier vs attribs.h.
> 
> Do you mean replace the GET_ATTRIB macro with test_bit from qemu/bitops.h?

No, just define GET_ATTRIB in terms of test_bit, and define opcode_attribs
using BITS_TO_LONGS.
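
Roughly like the following sketch.  BITS_PER_LONG, BITS_TO_LONGS and
test_bit are local stand-ins that mirror the helpers of the same name in
qemu/bitops.h, so that the example builds outside the QEMU tree; the
attribute and opcode enumerators and the ATTRIB_BIT initializer macro are
hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Local stand-ins for the helpers provided by qemu/bitops.h. */
#define BITS_PER_LONG       (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(nr)   (((nr) + BITS_PER_LONG - 1) / BITS_PER_LONG)

static inline int test_bit(long nr, const unsigned long *addr)
{
    return 1UL & (addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG));
}

/* Hypothetical attributes and opcodes, just enough to span two words. */
enum { A_LOAD, A_STORE, A_CONDEXEC = 70, A_ZZ_LASTATTRIB };
enum { OP_X, OP_Y, XX_LAST_OPCODE };

/* Initializer for one attribute bit (one attribute per word in this demo). */
#define ATTRIB_BIT(a)  [(a) / BITS_PER_LONG] = 1UL << ((a) % BITS_PER_LONG)

/* The row width is derived from the attribute count, not a separate define. */
static const unsigned long
    opcode_attribs[XX_LAST_OPCODE][BITS_TO_LONGS(A_ZZ_LASTATTRIB)] = {
    [OP_X] = { ATTRIB_BIT(A_STORE) },
    [OP_Y] = { ATTRIB_BIT(A_LOAD), ATTRIB_BIT(A_CONDEXEC) },
};

#define GET_ATTRIB(opcode, attrib) \
    test_bit(attrib, opcode_attribs[opcode])
```

This keeps the word size out of the Hexagon code entirely, which addresses
the concern about ATTRIB_WIDTH living apart from the array definition.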


r~



* Re: [RFC PATCH v3 25/34] Hexagon (target/hexagon) macros to interface with the generator
  2020-08-18 15:50 ` [RFC PATCH v3 25/34] Hexagon (target/hexagon) macros to interface with the generator Taylor Simpson
@ 2020-08-29  0:49   ` Richard Henderson
  2020-08-30 20:30     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-29  0:49 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +#define DECL_REG(NAME, NUM, X, OFF) \
> +    TCGv NAME = tcg_temp_local_new(); \
> +    int NUM = REGNO(X) + OFF
> +
> +#define DECL_REG_WRITABLE(NAME, NUM, X, OFF) \
> +    TCGv NAME = tcg_temp_local_new(); \
> +    int NUM = REGNO(X) + OFF; \
> +    do { \
> +        int is_predicated = GET_ATTRIB(insn->opcode, A_CONDEXEC); \
> +        if (is_predicated && !is_preloaded(ctx, NUM)) { \
> +            tcg_gen_mov_tl(hex_new_value[NUM], hex_gpr[NUM]); \
> +        } \
> +    } while (0)
> +/*
> + * For read-only temps, avoid allocating and freeing
> + */
> +#define DECL_REG_READONLY(NAME, NUM, X, OFF) \
> +    TCGv NAME; \
> +    int NUM = REGNO(X) + OFF
> +
> +#define DECL_RREG_d(NAME, NUM, X, OFF) \
> +    DECL_REG_WRITABLE(NAME, NUM, X, OFF)
> +#define DECL_RREG_e(NAME, NUM, X, OFF) \
> +    DECL_REG(NAME, NUM, X, OFF)

Is there a good reason for all these macros?
Why not just bake this knowledge into gen_tcg_funcs.py?
Seems like it would be just a couple of functions...

At present, both this and the intermediary files are unreadable.  One has to
pass genptr.c through -E and indent to see what's going on.


r~



* Re: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics
  2020-08-18 15:50 ` [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics Taylor Simpson
@ 2020-08-29  1:16   ` Richard Henderson
  2020-08-30 20:23     ` Taylor Simpson
  2020-10-08 15:00     ` Taylor Simpson
  0 siblings, 2 replies; 122+ messages in thread
From: Richard Henderson @ 2020-08-29  1:16 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +/*
> + * Section 5.5 of the Hexagon V67 Programmer's Reference Manual
> + *
> + * Slot 1 store with slot 0 load
> + * A slot 1 store operation with a slot 0 load operation can appear in a packet.
> + * The packet attribute :mem_noshuf inhibits the instruction reordering that
> + * would otherwise be done by the assembler. For example:
> + *     {
> + *         memw(R5) = R2 // slot 1 store
> + *         R3 = memh(R6) // slot 0 load
> + *     }:mem_noshuf
> + * Unlike most packetized operations, these memory operations are not executed
> + * in parallel (Section 3.3.1). Instead, the store instruction in Slot 1
> + * effectively executes first, followed by the load instruction in Slot 0. If
> + * the addresses of the two operations are overlapping, the load will receive
> + * the newly stored data. This feature is supported in processor versions
> + * V65 or greater.
> + *
> + *
> + * For qemu, we look for a load in slot 0 when there is  a store in slot 1
> + * in the same packet.  When we see this, we call a helper that merges the
> + * bytes from the store buffer with the value loaded from memory.
> + */
> +#define CHECK_NOSHUF(DST, VA, SZ, SIGN) \
> +    do { \
> +        if (insn->slot == 0 && pkt->pkt_has_store_s1) { \
> +            gen_helper_merge_inflight_store##SZ##SIGN(DST, cpu_env, VA, DST); \
> +        } \
> +    } while (0)

Ah, so I see what you're trying to do with the merge thing, which had the
host-endian problems.

I think the merge stuff is a mistake.  I think you can get the semantics that
you want with

	probe_read(ld_addr, ld_len)
	qemu_st(st_value, st_addr)
	qemu_ld(ld_value, ld_addr)

In this way, all exceptions are recognized before the store is complete, and the
normal memory operations handle any possible overlap.
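
To illustrate the ordering, here is a minimal standalone sketch, not code from the series. It models guest memory as a 16-byte array; toy_probe, toy_store32 and toy_load16 are hypothetical stand-ins for probe_read, qemu_st and qemu_ld, and the little-endian byte order is an assumption made so the result does not depend on the host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MEM_SIZE 16u            /* hypothetical toy address space */

typedef struct {
    uint8_t bytes[TOY_MEM_SIZE];
} ToyMem;

/* Stand-in for probe_read(): is [addr, addr + len) accessible? */
static bool toy_probe(uint32_t addr, uint32_t len)
{
    return addr < TOY_MEM_SIZE && len <= TOY_MEM_SIZE - addr;
}

/* Stand-in for qemu_st: little-endian store that faults before writing. */
static bool toy_store32(ToyMem *m, uint32_t addr, uint32_t val)
{
    if (!toy_probe(addr, 4)) {
        return false;
    }
    for (uint32_t i = 0; i < 4; i++) {
        m->bytes[addr + i] = (uint8_t)(val >> (8 * i));
    }
    return true;
}

/* Stand-in for qemu_ld: little-endian load, range already probed. */
static uint16_t toy_load16(const ToyMem *m, uint32_t addr)
{
    assert(toy_probe(addr, 2));
    return (uint16_t)(m->bytes[addr] | (m->bytes[addr + 1] << 8));
}

/* Slot 1 store followed by slot 0 load; false means "would fault". */
static bool store_then_load(ToyMem *m, uint32_t st_addr, uint32_t st_val,
                            uint32_t ld_addr, uint16_t *ld_val)
{
    if (!toy_probe(ld_addr, 2)) {           /* probe the load first */
        return false;                       /* nothing has been stored yet */
    }
    if (!toy_store32(m, st_addr, st_val)) { /* the store checks itself */
        return false;
    }
    *ld_val = toy_load16(m, ld_addr);       /* overlap handled by memory */
    return true;
}
```

With a word stored at offset 4 and a halfword loaded from offset 6, the load returns the upper half of the value just stored, and a load that would fault leaves the array untouched. No merge step is needed.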

> +#define fINSERT_BITS(REG, WIDTH, OFFSET, INVAL) \
> +    do { \
> +        REG = ((REG) & ~(((fCONSTLL(1) << (WIDTH)) - 1) << (OFFSET))) | \
> +           (((INVAL) & ((fCONSTLL(1) << (WIDTH)) - 1)) << (OFFSET)); \
> +    } while (0)

reg = deposit32(reg, offset, width, inval)
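
For comparison, a small standalone sketch of the two forms side by side. deposit32_sketch is a local copy written to behave like the deposit32 helper in qemu/bitops.h as I understand it, and insert_bits_open_coded is a hypothetical function wrapping the arithmetic from the macro; neither name exists in the tree.

```c
#include <assert.h>
#include <stdint.h>

/* The arithmetic from fINSERT_BITS, written as a function. */
static uint32_t insert_bits_open_coded(uint32_t reg, int width, int offset,
                                       uint32_t inval)
{
    uint64_t mask = (UINT64_C(1) << width) - 1;

    return (uint32_t)((reg & ~(mask << offset)) | ((inval & mask) << offset));
}

/* Local model of deposit32(value, start, length, fieldval). */
static uint32_t deposit32_sketch(uint32_t value, int start, int length,
                                 uint32_t fieldval)
{
    uint32_t mask;

    assert(start >= 0 && length > 0 && length <= 32 - start);
    mask = (~0U >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}
```

Note the argument order: the macro takes (width, offset) while deposit32 takes (start, length), which is why the call above passes offset before width.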

> +#define fEXTRACTU_BITS(INREG, WIDTH, OFFSET) \
> +    (fZXTN(WIDTH, 32, (INREG >> OFFSET)))
> +#define fEXTRACTU_BIDIR(INREG, WIDTH, OFFSET) \
> +    (fZXTN(WIDTH, 32, fBIDIR_LSHIFTR((INREG), (OFFSET), 4_8)))
> +#define fEXTRACTU_RANGE(INREG, HIBIT, LOWBIT) \
> +    (fZXTN((HIBIT - LOWBIT + 1), 32, (INREG >> LOWBIT)))

extract32(inreg, offset, width)

> +#define fZXTN(N, M, VAL) ((VAL) & ((1LL << (N)) - 1))

extract32(VAL, 0, n)

> +#define fSXTN(N, M, VAL) \
> +    ((fZXTN(N, M, VAL) ^ (1LL << ((N) - 1))) - (1LL << ((N) - 1)))

sextract32(val, 0, n)
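
As a quick check of these substitutions, here is a standalone sketch. The two macros are copied from the patch; extract32_sketch and sextract32_sketch are local copies meant to behave like the qemu/bitops.h helpers and are not the real functions. The signed variant relies on arithmetic right shift of negative values, which is implementation-defined in C but holds on the compilers QEMU supports.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the patch for comparison. */
#define fZXTN(N, M, VAL) ((VAL) & ((1LL << (N)) - 1))
#define fSXTN(N, M, VAL) \
    ((fZXTN(N, M, VAL) ^ (1LL << ((N) - 1))) - (1LL << ((N) - 1)))

/* Local model of extract32(value, start, length). */
static uint32_t extract32_sketch(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Local model of sextract32(value, start, length). */
static int32_t sextract32_sketch(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    /* Move the field to the top, then shift back down with sign fill. */
    return ((int32_t)(value << (32 - length - start))) >> (32 - length);
}
```

For fEXTRACTU_RANGE the mapping would be extract32(inreg, lowbit, hibit - lowbit + 1).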

> +#define fRND(A) (((A) + 1) >> 1)

Does A have a type that won't overflow?
For Arm we write this as

    (A >> 1) + (A & 1)
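
A minimal sketch of the difference, assuming A is evaluated as a 32-bit unsigned value (the patch does not say what type the macro sees). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* fRND as posted: the add can wrap before the shift. */
static uint32_t rnd_add_first(uint32_t a)
{
    return (a + 1) >> 1;
}

/* Shift first, then add the rounding bit: no intermediate overflow. */
static uint32_t rnd_shift_first(uint32_t a)
{
    return (a >> 1) + (a & 1);
}
```

The two agree everywhere except at the maximum value, where the first form wraps to zero.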

> +#define fDCFETCH(REG) do { REG = REG; } while (0) /* Nothing to do in qemu */
> +#define fICINVA(REG) do { REG = REG; } while (0) /* Nothing to do in qemu */
> +#define fL2FETCH(ADDR, HEIGHT, WIDTH, STRIDE, FLAGS)
> +#define fDCCLEANA(REG) do { REG = REG; } while (0) /* Nothing to do in qemu */
> +#define fDCCLEANINVA(REG) \
> +    do { REG = REG; } while (0) /* Nothing to do in qemu */

Is this "R=R" thing trying to avoid a compiler warning?
Perhaps "(void)R" would be sufficient to avoid that?
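
Something along these lines, as a sketch only. NOP_CACHE_OP and touch_and_return are hypothetical names standing in for the cache macros above and a caller.

```c
#include <assert.h>
#include <stdint.h>

/* Marks the operand as used without a self-assignment. */
#define NOP_CACHE_OP(REG) do { (void)(REG); } while (0)

static uint32_t touch_and_return(const uint32_t addr)
{
    NOP_CACHE_OP(addr);     /* fine even though addr is const */
    return addr;
}
```

One side benefit is that the cast form also accepts operands that are not assignable, which "REG = REG" would reject.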

> -static inline void log_store32(CPUHexagonState *env, target_ulong addr,
> -                               target_ulong val, int width, int slot)
> -{
> -    HEX_DEBUG_LOG("log_store%d(0x" TARGET_FMT_lx ", " TARGET_FMT_ld
> -                  " [0x" TARGET_FMT_lx "])\n",
> -                  width, addr, val, val);
> -    env->mem_log_stores[slot].va = addr;
> -    env->mem_log_stores[slot].width = width;
> -    env->mem_log_stores[slot].data32 = val;
> -}
> -
> -static inline void log_store64(CPUHexagonState *env, target_ulong addr,
> -                               int64_t val, int width, int slot)
> -{
> -    HEX_DEBUG_LOG("log_store%d(0x" TARGET_FMT_lx ", %ld [0x%lx])\n",
> -                   width, addr, val, val);
> -    env->mem_log_stores[slot].va = addr;
> -    env->mem_log_stores[slot].width = width;
> -    env->mem_log_stores[slot].data64 = val;
> -}
> -

Fold this back to wherever it came from.  Clearly no need to introduce it in
the first place.


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 27/34] Hexagon (target/hexagon) instruction classes
  2020-08-18 15:50 ` [RFC PATCH v3 27/34] Hexagon (target/hexagon) instruction classes Taylor Simpson
@ 2020-08-29  1:37   ` Richard Henderson
  2020-08-30 20:04     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-29  1:37 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +} iclass_t;
...
> +extern const char *find_iclass_slots(opcode_t opcode, int itype);
...
> +typedef struct {
> +    const char * const slots;
> +} iclass_info_t;

I'll note that you aren't following our CODING_STYLE for types.  Which of these
need to match imported/ and which can you fix?

> +typedef enum {
> +
> +#define DEF_PP_ICLASS32(TYPE, SLOTS, UNITS)    ICLASS_FROM_TYPE(TYPE),
> +#define DEF_EE_ICLASS32(TYPE, SLOTS, UNITS)    /* nothing */
> +#include "imported/iclass.def"
> +#undef DEF_PP_ICLASS32
> +#undef DEF_EE_ICLASS32
> +
> +#define DEF_EE_ICLASS32(TYPE, SLOTS, UNITS)    ICLASS_FROM_TYPE(TYPE),
> +#define DEF_PP_ICLASS32(TYPE, SLOTS, UNITS)    /* nothing */
> +#include "imported/iclass.def"
> +#undef DEF_PP_ICLASS32
> +#undef DEF_EE_ICLASS32

Any particular reason why you're defining one as nothing, and doing this dance
twice?

> +const char *find_iclass_slots(opcode_t opcode, int itype)
> +{
> +    /* There are some exceptions to what the iclass dictates */
> +    if (GET_ATTRIB(opcode, A_ICOP)) {
> +        return "2";

Why are you returning a string and not a simple bitmask?

> +    [ICLASS_FROM_TYPE(TYPE)] = { .slots = #SLOTS },

This could be converted to the bitmask with

enum {
    SLOTS_0  = (1 << 0),
    SLOTS_1  = (1 << 1),
    SLOTS_23 = (1 << 2) | (1 << 3),
    ...
};

static const uint8_t iclass_info[] = {

#define DEF_ICLASS(TYPE, SLOTS) \
    [ICLASS_##TYPE] = SLOTS_##SLOTS
#define DEF_PP_ICLASS32(TYPE, SLOTS, UNITS) \
    DEF_ICLASS(TYPE, SLOTS)
#define DEF_EE_ICLASS32(TYPE, SLOTS, UNITS) \
    DEF_ICLASS(TYPE, SLOTS)

At which point the ultimate consumer,

>     for (i = 0, slot = 3; i < pkt->num_insns; i++) {
>         valid_slot_str = get_valid_slot_str(pkt, i);
> 
>         while (strchr(valid_slot_str, '0' + slot) == NULL) {
>             slot--;
>         }
>         pkt->insn[i].slot = slot;

becomes a simple integer mask check.
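
Roughly like the following standalone sketch. It mirrors only the quoted part of the loop (anything after the assignment in the real function is not shown above and is left out), assign_slots is a hypothetical name, and the lower bound on slot plus the error return are my additions.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SLOTS_0  = (1 << 0),
    SLOTS_1  = (1 << 1),
    SLOTS_23 = (1 << 2) | (1 << 3),
};

/*
 * Walk slots downward from 3 and give each instruction the first slot
 * its mask allows.  Returns -1 if an instruction cannot be placed.
 */
static int assign_slots(const uint8_t *valid_masks, int num_insns,
                        int *slots_out)
{
    int slot = 3;

    for (int i = 0; i < num_insns; i++) {
        /* Replaces strchr(valid_slot_str, '0' + slot) == NULL. */
        while (slot >= 0 && !(valid_masks[i] & (1u << slot))) {
            slot--;
        }
        if (slot < 0) {
            return -1;
        }
        slots_out[i] = slot;
    }
    return 0;
}
```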


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 28/34] Hexagon (target/hexagon) TCG generation helpers
  2020-08-18 15:50 ` [RFC PATCH v3 28/34] Hexagon (target/hexagon) TCG generation helpers Taylor Simpson
@ 2020-08-29  1:48   ` Richard Henderson
  2020-08-30 19:53     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-29  1:48 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> Helpers for reading and writing registers
> Helpers for load-locked/store-conditional
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>  target/hexagon/genptr_helpers.h | 244 ++++++++++++++++++++++++++++++++++++++++
>  target/hexagon/op_helper.c      |  18 +++
>  2 files changed, 262 insertions(+)
>  create mode 100644 target/hexagon/genptr_helpers.h
> 
> diff --git a/target/hexagon/genptr_helpers.h b/target/hexagon/genptr_helpers.h
> new file mode 100644
> index 0000000..ffcb1e3
> --- /dev/null
> +++ b/target/hexagon/genptr_helpers.h
> @@ -0,0 +1,244 @@
> +/*
> + *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HEXAGON_GENPTR_HELPERS_H
> +#define HEXAGON_GENPTR_HELPERS_H
> +
> +#include "tcg/tcg.h"
> +
> +static inline TCGv gen_read_reg(TCGv result, int num)
> +{
> +    tcg_gen_mov_tl(result, hex_gpr[num]);
> +    return result;
> +}
> +
> +static inline TCGv gen_read_preg(TCGv pred, uint8_t num)
> +{
> +    tcg_gen_mov_tl(pred, hex_pred[num]);
> +    return pred;
> +}
> +
> +static inline void gen_log_reg_write(int rnum, TCGv val, int slot,
> +                                     int is_predicated)

These are quite large.  Why are they marked inline?

> +        /* Low word */
> +        tcg_gen_extrl_i64_i32(val32, val);
> +        tcg_gen_mov_tl(hex_new_value[rnum], val32);

Why are you extracting into a temporary?
This could be done with

    tcg_gen_extr_i64_i32(hex_new_value[rnum],
                         hex_new_value[rnum + 1], val);

> +static inline void gen_read_p3_0(TCGv control_reg)
> +{
> +    TCGv pval = tcg_temp_new();
> +    int i;
> +    tcg_gen_movi_tl(control_reg, 0);
> +    for (i = NUM_PREGS - 1; i >= 0; i--) {
> +        tcg_gen_shli_tl(control_reg, control_reg, 8);
> +        tcg_gen_andi_tl(pval, hex_pred[i], 0xff);
> +        tcg_gen_or_tl(control_reg, control_reg, pval);

tcg_gen_deposit_tl(control_reg, control_reg,
                   hex_pred[i], i * 8, 8);

> +    for (i = 0; i < NUM_PREGS; i++) {
> +        tcg_gen_andi_tl(pred_val, control_reg, 0xff);
> +        tcg_gen_mov_tl(hex_pred[i], pred_val);
> +        tcg_gen_shri_tl(control_reg, control_reg, 8);

tcg_gen_extract_tl(hex_pred[i], control_reg, i * 8, 8);
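
In plain C terms, the pair of loops reduces to the sketch below. This is a host-side model, not TCG code: the _sketch helpers are local copies of the bitops-style functions, and NUM_PREGS_SKETCH = 4 is an assumption taken from the p3_0 name.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PREGS_SKETCH 4      /* assumed: p0..p3 */

static uint32_t deposit32_sketch(uint32_t value, int start, int length,
                                 uint32_t fieldval)
{
    uint32_t mask;

    assert(start >= 0 && length > 0 && length <= 32 - start);
    mask = (~0U >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

static uint32_t extract32_sketch(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Pack the predicates into one word: one deposit each, no temporary. */
static uint32_t pack_p3_0(const uint8_t pred[NUM_PREGS_SKETCH])
{
    uint32_t control = 0;

    for (int i = 0; i < NUM_PREGS_SKETCH; i++) {
        control = deposit32_sketch(control, i * 8, 8, pred[i]);
    }
    return control;
}

/* Unpack without modifying the source word: one extract each. */
static void unpack_p3_0(uint32_t control, uint8_t pred[NUM_PREGS_SKETCH])
{
    for (int i = 0; i < NUM_PREGS_SKETCH; i++) {
        pred[i] = (uint8_t)extract32_sketch(control, i * 8, 8);
    }
}
```

The extract form also leaves control_reg intact, whereas the posted loop shifts it away as it goes.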

> +static inline void log_store32(CPUHexagonState *env, target_ulong addr,
> +                               int32_t val, int width, int slot)
> +{
> +    HEX_DEBUG_LOG("log_store%d(0x%x, %d [0x%x])\n", width, addr, val, val);
> +    env->mem_log_stores[slot].va = addr;
> +    env->mem_log_stores[slot].width = width;
> +    env->mem_log_stores[slot].data32 = val;
> +}
> +
> +static inline void log_store64(CPUHexagonState *env, target_ulong addr,
> +                               int64_t val, int width, int slot)
> +{
> +    HEX_DEBUG_LOG("log_store%d(0x%x, %ld [0x%lx])\n", width, addr, val, val);
> +    env->mem_log_stores[slot].va = addr;
> +    env->mem_log_stores[slot].width = width;
> +    env->mem_log_stores[slot].data64 = val;
> +}

... or fold this re-addition back into where it was accidentally removed.  ;-)


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 29/34] Hexagon (target/hexagon) TCG generation
  2020-08-18 15:50 ` [RFC PATCH v3 29/34] Hexagon (target/hexagon) TCG generation Taylor Simpson
@ 2020-08-29  1:58   ` Richard Henderson
  2020-08-30 19:49     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-29  1:58 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +/* Fill in the table with NULLs because not all the opcodes have DEF_QEMU */
> +semantic_insn_t opcode_genptr[] = {
> +#define OPCODE(X)                              NULL
> +#include "opcodes_def_generated.h"
> +    NULL
> +#undef OPCODE
> +};
> +
> +/* This function overwrites the NULL entries where we have a DEF_QEMU */
> +void init_genptr(void)
> +{
> +#define DEF_TCG_FUNC(TAG, GENFN) \
> +    opcode_genptr[TAG] = generate_##TAG;
> +#include "tcg_funcs_generated.h"
> +#undef DEF_TCG_FUNC
> +}

Just size the array properly to start.

const semantic_insn_t opcode_genptr[XX_LAST_OPCODE] = {

#define DEF_TCG_FUNC(TAG, GENFN) \
    [TAG] = generate_##TAG,
#include "tcg_funcs_generated.h"

};
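
The same pattern in a self-contained form, in case it helps. The opcode names, the gen_fn_sketch type and the one-argument DEF_TCG_FUNC are all hypothetical; in the series the entries would come from the generated include. Elements without a designated initializer are zero-initialized, so the NULL fill pass and init_genptr() both go away.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*gen_fn_sketch)(void);

/* Hypothetical opcode list; OP_NOP has no generator. */
enum { OP_ADD, OP_NOP, OP_SUB, OP_LAST };

static void generate_OP_ADD(void) { }
static void generate_OP_SUB(void) { }

/* Sized by the enum; unnamed slots are NULL by default. */
static const gen_fn_sketch genptr_sketch[OP_LAST] = {
#define DEF_TCG_FUNC(TAG) [TAG] = generate_##TAG,
    DEF_TCG_FUNC(OP_ADD)
    DEF_TCG_FUNC(OP_SUB)
#undef DEF_TCG_FUNC
};
```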


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-08-18 15:50 ` [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions Taylor Simpson
@ 2020-08-29  2:02   ` Richard Henderson
  2020-08-30 19:48     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-29  2:02 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> +++ b/target/hexagon/helper.h
> @@ -15,6 +15,8 @@
>   *  along with this program; if not, see <http://www.gnu.org/licenses/>.
>   */
>  
> +#include "gen_tcg.h"

Why would you need this here?  Definitely looks wrong.


r~



^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 31/34] Hexagon (target/hexagon) translation
  2020-08-18 15:50 ` [RFC PATCH v3 31/34] Hexagon (target/hexagon) translation Taylor Simpson
@ 2020-08-29  2:49   ` Richard Henderson
  2020-08-30 19:37     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-29  2:49 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> Read the instruction memory
> Create a packet data structure
> Generate TCG code for the start of the packet
> Invoke the generate function for each instruction
> Generate TCG code for the end of the packet
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>  target/hexagon/translate.h | 103 +++++++
>  target/hexagon/translate.c | 730 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 833 insertions(+)
>  create mode 100644 target/hexagon/translate.h
>  create mode 100644 target/hexagon/translate.c
> 
> diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
> new file mode 100644
> index 0000000..144140f
> --- /dev/null
> +++ b/target/hexagon/translate.h
> @@ -0,0 +1,103 @@
> +/*
> + *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HEXAGON_TRANSLATE_H
> +#define HEXAGON_TRANSLATE_H
> +
> +#include "cpu.h"
> +#include "exec/translator.h"
> +#include "tcg/tcg-op.h"
> +#include "internal.h"
> +
> +typedef struct DisasContext {
> +    DisasContextBase base;
> +    uint32_t mem_idx;
> +    int reg_log[REG_WRITES_MAX];
> +    int reg_log_idx;
> +    int preg_log[PRED_WRITES_MAX];
> +    int preg_log_idx;
> +    uint8_t store_width[STORES_MAX];
> +} DisasContext;
> +
> +static inline void ctx_log_reg_write(DisasContext *ctx, int rnum)
> +{
> +#if HEX_DEBUG
> +    int i;
> +    for (i = 0; i < ctx->reg_log_idx; i++) {
> +        if (ctx->reg_log[i] == rnum) {
> +            HEX_DEBUG_LOG("WARNING: Multiple writes to r%d\n", rnum);
> +        }
> +    }
> +#endif
> +    ctx->reg_log[ctx->reg_log_idx] = rnum;
> +    ctx->reg_log_idx++;
> +}

Why not just keep a bitmask of the rnum written?
Does the order of this log really matter?

> +static inline bool is_preloaded(DisasContext *ctx, int num)
> +{
> +    int i;
> +    for (i = 0; i < ctx->reg_log_idx; i++) {
> +        if (ctx->reg_log[i] == num) {
> +            return true;
> +        }
> +    }
> +    return false;
> +}

It would mean this one becomes constant time.
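
A sketch of what I have in mind, assuming the register numbers fit in 64 bits (not checked against the series). RegLogSketch and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t regs_written;      /* bit N set: rN written in this packet */
} RegLogSketch;

/* Record a write; returns true if rnum had already been written. */
static bool log_reg_write_sketch(RegLogSketch *log, int rnum)
{
    uint64_t bit;
    bool seen;

    assert(rnum >= 0 && rnum < 64);
    bit = UINT64_C(1) << rnum;
    seen = (log->regs_written & bit) != 0;
    log->regs_written |= bit;
    return seen;
}

/* Constant-time replacement for the linear search. */
static bool is_preloaded_sketch(const RegLogSketch *log, int rnum)
{
    assert(rnum >= 0 && rnum < 64);
    return (log->regs_written >> rnum) & 1;
}
```

The return value of the write gives you the "multiple writes" debug warning for free, without the loop under HEX_DEBUG.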

> +static inline void gen_slot_cancelled_check(TCGv check, int slot_num)
> +{
> +    TCGv mask = tcg_const_tl(1 << slot_num);
> +    TCGv one = tcg_const_tl(1);
> +    TCGv zero = tcg_const_tl(0);
> +
> +    tcg_gen_and_tl(mask, hex_slot_cancelled, mask);
> +    tcg_gen_movcond_tl(TCG_COND_NE, check, mask, zero, one, zero);

This is a bit silly.  Better as

    tcg_gen_extract_i32(check, hex_slot_cancelled, slot_num, 1);


> +static int read_packet_words(CPUHexagonState *env, DisasContext *ctx,
> +                             uint32_t words[])
> +{
> +    bool found_end = false;
> +    int max_words;
> +    int nwords;
> +    int i;
> +
> +    /* Make sure we don't cross a page boundary */
> +    max_words = -(ctx->base.pc_next | TARGET_PAGE_MASK) / sizeof(uint32_t);
> +    if (max_words < PACKET_WORDS_MAX) {
> +        /* Might cross a page boundary */
> +        if (ctx->base.num_insns == 1) {
> +            /* OK if it's the first packet in the TB */
> +            max_words = PACKET_WORDS_MAX;
> +        }
> +    } else {
> +        max_words = PACKET_WORDS_MAX;
> +    }
> +
> +    memset(words, 0, PACKET_WORDS_MAX * sizeof(uint32_t));
> +    for (nwords = 0; !found_end && nwords < max_words; nwords++) {
> +        words[nwords] = cpu_ldl_code(env,
> +                                ctx->base.pc_next + nwords * sizeof(uint32_t));
> +        found_end = is_packet_end(words[nwords]);
> +    }
> +    if (!found_end) {
> +        if (nwords == PACKET_WORDS_MAX) {
> +            /* Read too many words without finding the end */
> +            gen_exception(HEX_EXCP_INVALID_PACKET);
> +            ctx->base.is_jmp = DISAS_NORETURN;
> +            return 0;
> +        }
> +        /* Crosses page boundary - defer to next TB */
> +        ctx->base.is_jmp = DISAS_TOO_MANY;

The problem with this is that the translator has asked for the next insn, and
you haven't provided it.

One way to fix this might be to decrement ctx->base.num_insns, to compensate
for the increment that *will* happen after you return.

Another way, which involves less poking about into internals is to look for the
next packet once you've completed the current packet.  This is what Arm does
for thumb2 insns:

>     if (dc->base.is_jmp == DISAS_NEXT
>         && (dc->base.pc_next - dc->page_start >= TARGET_PAGE_SIZE
>             || (dc->base.pc_next - dc->page_start >= TARGET_PAGE_SIZE - 3
>                 && insn_crosses_page(env, dc)))) {
>         dc->base.is_jmp = DISAS_TOO_MANY;
>     }

I.e. you only have to do this for the few packets that are near enough to the
end of the page that PACKET_WORDS_MAX crosses.
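
The "near enough" test can be sketched as below. This is standalone arithmetic only: the 4 KiB page size and a maximum of 4 words per packet are assumptions for the example, and the names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH        4096u   /* assumed target page size */
#define PACKET_WORDS_MAX_SKETCH 4u      /* assumed maximum packet length */

/* Bytes from pc up to the end of the page containing it. */
static uint32_t bytes_left_on_page(uint32_t pc)
{
    return PAGE_SIZE_SKETCH - (pc & (PAGE_SIZE_SKETCH - 1));
}

/*
 * True only when a maximum-length packet starting at pc could spill
 * into the next page; only then is it worth scanning for the packet end.
 */
static bool packet_might_cross_page(uint32_t pc)
{
    return bytes_left_on_page(pc) <
           PACKET_WORDS_MAX_SKETCH * sizeof(uint32_t);
}
```

Run after each completed packet on the next PC, this lets you set DISAS_TOO_MANY before the translator asks for a packet you cannot deliver.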

> +    tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], ctx->base.pc_next);
> +    tcg_gen_movi_tl(hex_slot_cancelled, 0);
> +    if (pkt->pkt_has_cof) {
> +        tcg_gen_movi_tl(hex_branch_taken, 0);
> +        tcg_gen_movi_tl(hex_next_PC, next_PC);
> +    }
> +    tcg_gen_movi_tl(hex_pred_written, 0);
> +}

Surely you don't need to actually set PC for every packet?
Nor set hex_slot_cancelled if the packet contains nothing that can cancel
anything.  Nor set hex_pred_written if no predicates are written.

> +    TCGv cancelled = tcg_temp_local_new();
> +    TCGLabel *label_end = gen_new_label();
> +
> +    /* Don't do anything if the slot was cancelled */
> +    gen_slot_cancelled_check(cancelled, slot_num);
> +    tcg_gen_brcondi_tl(TCG_COND_NE, cancelled, 0, label_end);

cancelled does not need to be local; it is consumed by the branch and not
consumed afterward.  Just free it here.

> +        /*
> +         * If we know the width from the DisasContext, we can
> +         * generate much cleaner code.
> +         * Unfortunately, not all instructions execute the fSTORE
> +         * macro during code generation.  Anything that uses the
> +         * generic helper will have this problem.  Instructions
> +         * that use fWRAP to generate proper TCG code will be OK.
> +         */

OMG.  How disgusting.

> +            value = tcg_temp_new();
> +            tcg_gen_mov_tl(value, hex_store_val32[slot_num]);
> +            tcg_gen_qemu_st8(value, address, ctx->mem_idx);
> +            tcg_temp_free(value);

Why are you copying to a temporary?

> +        default:
> +            /*
> +             * If we get to here, we don't know the width at
> +             * TCG generation time, we'll generate branching
> +             * based on the width at runtime.
> +             */
> +            label_w2 = gen_new_label();
> +            label_w4 = gen_new_label();
> +            label_w8 = gen_new_label();
> +            TCGv width = tcg_temp_local_new();

You might as well make this a helper.  This is going to generate a *lot* of code.

> +static void gen_exec_counters(packet_t *pkt)
> +{
> +    int num_insns = pkt->num_insns;
> +    int num_real_insns = 0;
> +    int i;
> +
> +    for (i = 0; i < num_insns; i++) {
> +        if (!pkt->insn[i].is_endloop &&
> +            !pkt->insn[i].part1 &&
> +            !GET_ATTRIB(pkt->insn[i].opcode, A_IT_NOP)) {
> +            num_real_insns++;
> +        }
> +    }
> +
> +    tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_PKT_CNT],
> +                    hex_gpr[HEX_REG_QEMU_PKT_CNT], 1);
> +    if (num_real_insns) {
> +        tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_INSN_CNT],
> +                        hex_gpr[HEX_REG_QEMU_INSN_CNT], num_real_insns);
> +    }

tcg_gen_addi_tl will check for the immediate == 0.

As with updating PC for every insn, this is going to be expensive.

You could accumulate these values through the TB and then update them at the
end.  You'd need to store the intermediate values in the space managed by
TARGET_INSN_START_EXTRA_WORDS, so that you can update them on any exceptional
path out of the TB, in restore_state_to_opc().

> +    if (end_tb) {
> +        tcg_gen_exit_tb(NULL, 0);

This misses out on ctx->base.singlestep_enabled, and almost certainly belongs
in hexagon_tr_tb_stop.  Use

#define DISAS_EXIT  DISAS_TARGET_0

or some other appropriate naming.


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 32/34] Hexagon (linux-user/hexagon) Linux user emulation
  2020-08-18 15:50 ` [RFC PATCH v3 32/34] Hexagon (linux-user/hexagon) Linux user emulation Taylor Simpson
@ 2020-08-29  2:59   ` Richard Henderson
  0 siblings, 0 replies; 122+ messages in thread
From: Richard Henderson @ 2020-08-29  2:59 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> Implementation of Linux user emulation for Hexagon
> Some common files modified in addition to new files in linux-user/hexagon
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---

Looks plausible.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 33/34] Hexagon (tests/tcg/hexagon) TCG tests
  2020-08-18 15:50 ` [RFC PATCH v3 33/34] Hexagon (tests/tcg/hexagon) TCG tests Taylor Simpson
@ 2020-08-29  3:05   ` Richard Henderson
  2020-09-01  9:57     ` Alessandro Di Federico
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-29  3:05 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, philmd, riku.voipio, laurent, Alex Bennée, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> Modify tests/tcg/configure.sh
> Add reference files to tests/tcg/hexagon
> Add Hexagon-specific tests
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---

Looks ok.

Could you please work with Alex Bennée to set up a tests/docker/dockerfiles/
script containing the cross-compiler from the Qualcomm SDK?  That way these
tests can be run automatically.

Compare debian-xtensa-cross.docker, which is similar.


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 34/34] Hexagon build infrastructure
  2020-08-18 15:50 ` [RFC PATCH v3 34/34] Hexagon build infrastructure Taylor Simpson
@ 2020-08-29  3:19   ` Richard Henderson
  2020-09-24  2:35     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-29  3:19 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> Add file to default-configs
> Change configure
> Add target/hexagon/Makefile.objs
> Change scripts/qemu-binfmt-conf.sh
> 
> We can build a hexagon-linux-user target and run programs on the Hexagon
> scalar core.  With hexagon-linux-clang installed, "make check-tcg" will
> pass.
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---

This will have to be updated for the meson conversion.

I don't understand it all myself, and all of those generated files will need
special attention.


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 00/34] Hexagon patch series
  2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
                   ` (34 preceding siblings ...)
  2020-08-18 16:32 ` [RFC PATCH v3 00/34] Hexagon patch series no-reply
@ 2020-08-29  3:27 ` Richard Henderson
  2020-08-30 20:47   ` Taylor Simpson
  35 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-29  3:27 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/18/20 8:50 AM, Taylor Simpson wrote:
> This series adds support for the Hexagon processor with Linux user support
> 
> See patch 02/34 Hexagon README for detailed information.
> 
> Once the series is applied, the Hexagon port will pass "make check-tcg".
> The series also includes Hexagon-specific tests in tcg/tests/hexagon.
> 
> We have a parallel effort to make the Hexagon Linux toolchain inside a docker
> container publically available.

Oh, excellent.

> *** Future items under consideration ***
> Use qemu softfloat

This is a blocker.  It's definitely not hard.

> Use qemu decodetree

This would certainly clean up all of the string processing that I mentioned.
It seems like it would be just as easy to convert into the decodetree format as
what you're currently doing for dectree_generated.h.  Indeed, easier, since you
only need the ones that are TERMINAL().  All of the other things labeled
TABLE_LINK are handled by decodetree itself.

Anyway, I've completed what review I planned to do against this version.


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* RE: [RFC PATCH v3 31/34] Hexagon (target/hexagon) translation
  2020-08-29  2:49   ` Richard Henderson
@ 2020-08-30 19:37     ` Taylor Simpson
  2020-08-30 23:08       ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-30 19:37 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Friday, August 28, 2020 8:50 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 31/34] Hexagon (target/hexagon) translation
>
> > +static inline void ctx_log_reg_write(DisasContext *ctx, int rnum)
> > +{
> > +#if HEX_DEBUG
> > +    int i;
> > +    for (i = 0; i < ctx->reg_log_idx; i++) {
> > +        if (ctx->reg_log[i] == rnum) {
> > +            HEX_DEBUG_LOG("WARNING: Multiple writes to r%d\n", rnum);
> > +        }
> > +    }
> > +#endif
> > +    ctx->reg_log[ctx->reg_log_idx] = rnum;
> > +    ctx->reg_log_idx++;
> > +}
>
> Why not just keep a bitmask of the rnum written?
> Does the order of this log really matter?

OK

> > +static inline void gen_slot_cancelled_check(TCGv check, int slot_num)
> > +{
> > +    TCGv mask = tcg_const_tl(1 << slot_num);
> > +    TCGv one = tcg_const_tl(1);
> > +    TCGv zero = tcg_const_tl(0);
> > +
> > +    tcg_gen_and_tl(mask, hex_slot_cancelled, mask);
> > +    tcg_gen_movcond_tl(TCG_COND_NE, check, mask, zero, one, zero);
>
> This is a bit silly.  Better as
>
>     tcg_gen_extract_i32(check, hex_slot_cancelled, slot_num, 1);

OK

> > +static int read_packet_words(CPUHexagonState *env, DisasContext *ctx,
> > +                             uint32_t words[])
> > +{
> > +    bool found_end = false;
> > +    int max_words;
> > +    int nwords;
> > +    int i;
> > +
> > +    /* Make sure we don't cross a page boundary */
> > +    max_words = -(ctx->base.pc_next | TARGET_PAGE_MASK) /
> sizeof(uint32_t);
> > +    if (max_words < PACKET_WORDS_MAX) {
> > +        /* Might cross a page boundary */
> > +        if (ctx->base.num_insns == 1) {
> > +            /* OK if it's the first packet in the TB */
> > +            max_words = PACKET_WORDS_MAX;
> > +        }
> > +    } else {
> > +        max_words = PACKET_WORDS_MAX;
> > +    }
> > +
> > +    memset(words, 0, PACKET_WORDS_MAX * sizeof(uint32_t));
> > +    for (nwords = 0; !found_end && nwords < max_words; nwords++) {
> > +        words[nwords] = cpu_ldl_code(env,
> > +                                ctx->base.pc_next + nwords * sizeof(uint32_t));
> > +        found_end = is_packet_end(words[nwords]);
> > +    }
> > +    if (!found_end) {
> > +        if (nwords == PACKET_WORDS_MAX) {
> > +            /* Read too many words without finding the end */
> > +            gen_exception(HEX_EXCP_INVALID_PACKET);
> > +            ctx->base.is_jmp = DISAS_NORETURN;
> > +            return 0;
> > +        }
> > +        /* Crosses page boundary - defer to next TB */
> > +        ctx->base.is_jmp = DISAS_TOO_MANY;
>
> The problem with this is that the translator has asked for the next insn, and
> you haven't provided it.
>
> One way to fix this might be to decrement ctx->base.num_insns, to
> compensate
> for the increment that *will* happen after you return.
>
> Another way, which involves less poking about into internals is to look for the
> next packet once you've completed the current packet.  This is what Arm
> does
> for thumb2 insns:
>
> >     if (dc->base.is_jmp == DISAS_NEXT
> >         && (dc->base.pc_next - dc->page_start >= TARGET_PAGE_SIZE
> >             || (dc->base.pc_next - dc->page_start >= TARGET_PAGE_SIZE - 3
> >                 && insn_crosses_page(env, dc)))) {
> >         dc->base.is_jmp = DISAS_TOO_MANY;
> >     }
>
> I.e. you only have to do this for the few packets that are near enough to the
> end of the page that PACKET_WORDS_MAX crosses.

I'm actually checking two conditions here.
1) packet crossing a page boundary
2) reading too many words without finding the end of the packet.
I guess it would be better to separate them.

What is the correct behavior for the second case?  Should we return an error code from here and have the higher level code generate the invalid packet exception?

> > +    tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], ctx->base.pc_next);
> > +    tcg_gen_movi_tl(hex_slot_cancelled, 0);
> > +    if (pkt->pkt_has_cof) {
> > +        tcg_gen_movi_tl(hex_branch_taken, 0);
> > +        tcg_gen_movi_tl(hex_next_PC, next_PC);
> > +    }
> > +    tcg_gen_movi_tl(hex_pred_written, 0);
> > +}
>
> Surely you don't need to actually set PC for every packet?

What do other targets do?

> Nor set hex_slot_cancelled if the packet contains nothing that can cancel
> anything.  Nor set hex_pred_written if no predicates are written.

Checking for instructions that can cancel is pretty straightforward because we have the CONDEXEC attribute.  Checking if any predicates are written will be more complex.  I'll scratch my head and figure out the cleanest way to do this.

>
> > +    TCGv cancelled = tcg_temp_local_new();
> > +    TCGLabel *label_end = gen_new_label();
> > +
> > +    /* Don't do anything if the slot was cancelled */
> > +    gen_slot_cancelled_check(cancelled, slot_num);
> > +    tcg_gen_brcondi_tl(TCG_COND_NE, cancelled, 0, label_end);
>
> cancelled does not need to be local; it is consumed by the branch and not
> consumed afterward.  Just free it here.

OK

> > +        /*
> > +         * If we know the width from the DisasContext, we can
> > +         * generate much cleaner code.
> > +         * Unfortunately, not all instructions execute the fSTORE
> > +         * macro during code generation.  Anything that uses the
> > +         * generic helper will have this problem.  Instructions
> > +         * that use fWRAP to generate proper TCG code will be OK.
> > +         */
>
> OMG.  How disgusting.

The word "generic" is a typo - should be "generated".  In order to keep this series small, we're only overriding the helper for the minimal number of instructions.  Over time, we'll override all the stores and we can eliminate this.


> > +            value = tcg_temp_new();
> > +            tcg_gen_mov_tl(value, hex_store_val32[slot_num]);
> > +            tcg_gen_qemu_st8(value, address, ctx->mem_idx);
> > +            tcg_temp_free(value);
>
> Why are you copying to a temporary?

Will fix.

> > +        default:
> > +            /*
> > +             * If we get to here, we don't know the width at
> > +             * TCG generation time, we'll generate branching
> > +             * based on the width at runtime.
> > +             */
> > +            label_w2 = gen_new_label();
> > +            label_w4 = gen_new_label();
> > +            label_w8 = gen_new_label();
> > +            TCGv width = tcg_temp_local_new();
>
> You might as well make this a helper.  This is going to generate a *lot* of
> code.

This is the part that will go away when we override all the stores.

> > +static void gen_exec_counters(packet_t *pkt)
> > +{
> > +    int num_insns = pkt->num_insns;
> > +    int num_real_insns = 0;
> > +    int i;
> > +
> > +    for (i = 0; i < num_insns; i++) {
> > +        if (!pkt->insn[i].is_endloop &&
> > +            !pkt->insn[i].part1 &&
> > +            !GET_ATTRIB(pkt->insn[i].opcode, A_IT_NOP)) {
> > +            num_real_insns++;
> > +        }
> > +    }
> > +
> > +    tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_PKT_CNT],
> > +                    hex_gpr[HEX_REG_QEMU_PKT_CNT], 1);
> > +    if (num_real_insns) {
> > +        tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_INSN_CNT],
> > +                        hex_gpr[HEX_REG_QEMU_INSN_CNT], num_real_insns);
> > +    }
>
> tcg_gen_addi_tl will check for the immediate == 0.

OK, great.  I also see that tcg_gen_mov_i32 will check that the source and dest are different.  So no TCG code will be generated when they are the same.

> As with updating PC for every insn, this is going to be expensive.
>
> You could accumulate these values through the TB and then update them at
> the
> end.  You'd need to store the intermediate values in the space managed by
> TARGET_INSN_START_EXTRA_WORDS, so that you can update them on any
> exceptional
> path out of the TB, in restore_state_to_opc().

OK

> > +    if (end_tb) {
> > +        tcg_gen_exit_tb(NULL, 0);
>
> This misses out on ctx->base.singlestep_enabled, and almost certainly
> belongs
> in hexagon_tr_tb_stop.  Use
>
> #define DISAS_EXIT  DISAS_TARGET_0
>
> or some other appropriate naming.

OK



* RE: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-08-29  2:02   ` Richard Henderson
@ 2020-08-30 19:48     ` Taylor Simpson
  2020-08-30 21:13       ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-30 19:48 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Friday, August 28, 2020 8:03 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for
> instructions with multiple definitions
>
> On 8/18/20 8:50 AM, Taylor Simpson wrote:
> > +++ b/target/hexagon/helper.h
> > @@ -15,6 +15,8 @@
> >   *  along with this program; if not, see <http://www.gnu.org/licenses/>.
> >   */
> >
> > +#include "gen_tcg.h"
>
> Why would you need this here?  Definitely looks wrong.

I'll add the following comment to indicate what's going on:

/*
 * Each of the generated helpers is wrapped with #ifndef fGEN_TCG_<tag>.
 * For example
 *     #ifndef fGEN_TCG_A2_add
 *     DEF_HELPER_3(A2_add, s32, env, s32, s32)
 *     #endif
 * We include gen_tcg.h here to provide definitions of fGEN_TCG for any
 * instructions that are overridden.
 *
 * This prevents unused helpers from taking up space in the executable.
 */


* RE: [RFC PATCH v3 29/34] Hexagon (target/hexagon) TCG generation
  2020-08-29  1:58   ` Richard Henderson
@ 2020-08-30 19:49     ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-30 19:49 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Friday, August 28, 2020 7:58 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 29/34] Hexagon (target/hexagon) TCG
> generation
>
> On 8/18/20 8:50 AM, Taylor Simpson wrote:
> > +/* Fill in the table with NULLs because not all the opcodes have
> DEF_QEMU */
> > +semantic_insn_t opcode_genptr[] = {
> > +#define OPCODE(X)                              NULL
> > +#include "opcodes_def_generated.h"
> > +    NULL
> > +#undef OPCODE
> > +};
> > +
> > +/* This function overwrites the NULL entries where we have a DEF_QEMU
> */
> > +void init_genptr(void)
> > +{
> > +#define DEF_TCG_FUNC(TAG, GENFN) \
> > +    opcode_genptr[TAG] = generate_##TAG;
> > +#include "tcg_funcs_generated.h"
> > +#undef DEF_TCG_FUNC
> > +}
>
> Just size the array properly to start.
>
> const semantic_insn_t opcode_genptr[XX_LAST_OPCODE] = {
>
> #define DEF_TCG_FUNC(TAG, GENFN) \
>     [TAG] = generate_##TAG,
> #include "tcg_funcs_generated.h"
>
> };

OK
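
As an illustration of the suggestion above, here is a minimal, self-contained sketch of sizing the table by the opcode enum and filling it with designated initializers. The opcode names, the `TCG_FUNC_LIST` stand-in for `tcg_funcs_generated.h`, and the `has_generator` helper are all hypothetical and exist only for this example; the real series uses generated headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcode list; the real one comes from generated headers. */
typedef enum {
    OP_ADD,
    OP_SUB,
    OP_NOP,          /* deliberately has no generator */
    XX_LAST_OPCODE
} Opcode;

typedef int (*SemanticInsn)(int a, int b);

static int generate_OP_ADD(int a, int b) { return a + b; }
static int generate_OP_SUB(int a, int b) { return a - b; }

/* Stand-in for the contents of tcg_funcs_generated.h (assumption). */
#define TCG_FUNC_LIST(DEF) \
    DEF(OP_ADD) \
    DEF(OP_SUB)

/*
 * The array is sized by the enum, so entries without an initializer
 * are implicitly NULL and no runtime init function is needed.
 */
#define DEF_TCG_FUNC(TAG) [TAG] = generate_##TAG,
static const SemanticInsn opcode_genptr[XX_LAST_OPCODE] = {
    TCG_FUNC_LIST(DEF_TCG_FUNC)
};
#undef DEF_TCG_FUNC

/* Returns 1 if the opcode has a generator function, 0 otherwise. */
static int has_generator(Opcode op)
{
    assert(op < XX_LAST_OPCODE);
    return opcode_genptr[op] != NULL;
}
```

Because the table can now be `const`, it no longer needs an `init_genptr()` call before first use.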


* RE: [RFC PATCH v3 28/34] Hexagon (target/hexagon) TCG generation helpers
  2020-08-29  1:48   ` Richard Henderson
@ 2020-08-30 19:53     ` Taylor Simpson
  2020-08-30 20:52       ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-30 19:53 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Friday, August 28, 2020 7:49 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 28/34] Hexagon (target/hexagon) TCG
> generation helpers
>
> On 8/18/20 8:50 AM, Taylor Simpson wrote:
> > Helpers for reading and writing registers
> > Helpers for load-locked/store-conditional
> >
> > Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> > ---
> >  target/hexagon/genptr_helpers.h | 244
> ++++++++++++++++++++++++++++++++++++++++
> >  target/hexagon/op_helper.c      |  18 +++
> >  2 files changed, 262 insertions(+)
> >  create mode 100644 target/hexagon/genptr_helpers.h
> >
> > diff --git a/target/hexagon/genptr_helpers.h
> b/target/hexagon/genptr_helpers.h
> > new file mode 100644
> > index 0000000..ffcb1e3
> > --- /dev/null
> > +++ b/target/hexagon/genptr_helpers.h
> > @@ -0,0 +1,244 @@
> > +
> > +static inline void gen_log_reg_write(int rnum, TCGv val, int slot,
> > +                                     int is_predicated)
>
> These are quite large.  Why are they marked inline?

Since this is a header file, it prevents the compiler from complaining when they aren't used.

>
> > +        /* Low word */
> > +        tcg_gen_extrl_i64_i32(val32, val);
> > +        tcg_gen_mov_tl(hex_new_value[rnum], val32);
>
> Why are you extracting into a temporary?
> This could be done with
>
>     tcg_gen_extr_i64_i32(hex_new_value[rnum],
>                          hex_new_value[rnum + 1], val);

OK

> > +static inline void gen_read_p3_0(TCGv control_reg)
> > +{
> > +    TCGv pval = tcg_temp_new();
> > +    int i;
> > +    tcg_gen_movi_tl(control_reg, 0);
> > +    for (i = NUM_PREGS - 1; i >= 0; i--) {
> > +        tcg_gen_shli_tl(control_reg, control_reg, 8);
> > +        tcg_gen_andi_tl(pval, hex_pred[i], 0xff);
> > +        tcg_gen_or_tl(control_reg, control_reg, pval);
>
> tcg_gen_deposit_tl(control_reg, control_reg,
>                    hex_pred[i], i * 8, 8);

OK

> > +    for (i = 0; i < NUM_PREGS; i++) {
> > +        tcg_gen_andi_tl(pred_val, control_reg, 0xff);
> > +        tcg_gen_mov_tl(hex_pred[i], pred_val);
> > +        tcg_gen_shri_tl(control_reg, control_reg, 8);
>
> tcg_gen_extract_tl(hex_pred[i], control_reg, i * 8, 8);

OK
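
The two replacements above can be checked on plain integers. The sketch below is host-side C, not TCG: `extract32` and `deposit32` are local copies written to follow the semantics of the QEMU helpers of the same name, `NUM_PREGS` is assumed to be 4, and the `pack_*`/`unpack_*` function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PREGS 4   /* assumption: four 8-bit predicate registers */

/* Return bits [start, start + length) of value, zero-extended. */
static uint32_t extract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Return value with bits [start, start + length) replaced by fieldval. */
static uint32_t deposit32(uint32_t value, int start, int length,
                          uint32_t fieldval)
{
    uint32_t mask;

    assert(start >= 0 && length > 0 && length <= 32 - start);
    mask = (~0U >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Shift/and/or form, mirroring the loop in the quoted patch. */
static uint32_t pack_preds_shift(const uint32_t pred[NUM_PREGS])
{
    uint32_t reg = 0;

    for (int i = NUM_PREGS - 1; i >= 0; i--) {
        reg = (reg << 8) | (pred[i] & 0xff);
    }
    return reg;
}

/* Deposit form: one operation per predicate, no temporary needed. */
static uint32_t pack_preds_deposit(const uint32_t pred[NUM_PREGS])
{
    uint32_t reg = 0;

    for (int i = 0; i < NUM_PREGS; i++) {
        reg = deposit32(reg, i * 8, 8, pred[i]);
    }
    return reg;
}

/* Extract form: the source register is left unmodified. */
static void unpack_preds_extract(uint32_t reg, uint32_t pred[NUM_PREGS])
{
    for (int i = 0; i < NUM_PREGS; i++) {
        pred[i] = extract32(reg, i * 8, 8);
    }
}
```

One side effect worth noting: the extract form does not shift `control_reg` in place, so the caller's value survives the loop.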

> > +static inline void log_store32(CPUHexagonState *env, target_ulong addr,
> > +                               int32_t val, int width, int slot)
> > +{
> > +    HEX_DEBUG_LOG("log_store%d(0x%x, %d [0x%x])\n", width, addr, val,
> val);
> > +    env->mem_log_stores[slot].va = addr;
> > +    env->mem_log_stores[slot].width = width;
> > +    env->mem_log_stores[slot].data32 = val;
> > +}
> > +
> > +static inline void log_store64(CPUHexagonState *env, target_ulong addr,
> > +                               int64_t val, int width, int slot)
> > +{
> > +    HEX_DEBUG_LOG("log_store%d(0x%x, %ld [0x%lx])\n", width, addr,
> val, val);
> > +    env->mem_log_stores[slot].va = addr;
> > +    env->mem_log_stores[slot].width = width;
> > +    env->mem_log_stores[slot].data64 = val;
> > +}
>
> ... or fold this re-addition back into where it was accidentally removed.  ;-)

Could you elaborate?



* RE: [RFC PATCH v3 27/34] Hexagon (target/hexagon) instruction classes
  2020-08-29  1:37   ` Richard Henderson
@ 2020-08-30 20:04     ` Taylor Simpson
  2020-08-30 20:43       ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-30 20:04 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Friday, August 28, 2020 7:37 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 27/34] Hexagon (target/hexagon) instruction
> classes
>
> On 8/18/20 8:50 AM, Taylor Simpson wrote:
> > +} iclass_t;
> ...
> > +extern const char *find_iclass_slots(opcode_t opcode, int itype);
> ...
> > +typedef struct {
> > +    const char * const slots;
> > +} iclass_info_t;
>
> I'll note that you aren't following our CODING_STYLE for types.  Which of
> these
> need to match imported/ and which can you fix.

So, this should be CamelCase?  I should be able to fix all of them.

>
> > +typedef enum {
> > +
> > +#define DEF_PP_ICLASS32(TYPE, SLOTS, UNITS)
> ICLASS_FROM_TYPE(TYPE),
> > +#define DEF_EE_ICLASS32(TYPE, SLOTS, UNITS)    /* nothing */
> > +#include "imported/iclass.def"
> > +#undef DEF_PP_ICLASS32
> > +#undef DEF_EE_ICLASS32
> > +
> > +#define DEF_EE_ICLASS32(TYPE, SLOTS, UNITS)
> ICLASS_FROM_TYPE(TYPE),
> > +#define DEF_PP_ICLASS32(TYPE, SLOTS, UNITS)    /* nothing */
> > +#include "imported/iclass.def"
> > +#undef DEF_PP_ICLASS32
> > +#undef DEF_EE_ICLASS32
>
> Any particular reason why you're defining one as nothing, and doing this
> dance
> twice?

Will investigate.

> > +const char *find_iclass_slots(opcode_t opcode, int itype)
> > +{
> > +    /* There are some exceptions to what the iclass dictates */
> > +    if (GET_ATTRIB(opcode, A_ICOP)) {
> > +        return "2";
>
> Why are you returning a string and not a simple bitmask?

Will fix

> > +    [ICLASS_FROM_TYPE(TYPE)] = { .slots = #SLOTS },
>
> This could be converted to the bitmask with
>
> enum {
>     SLOTS_0  = (1 << 0),
>     SLOTS_1  = (1 << 1),
>     SLOTS_23 = (1 << 2) | (1 << 3),
>     ...
> };
>
> static const uint8_t iclass_info[] = {
>
> #define DEF_ICLASS(TYPE, SLOTS) \
>     [ICLASS_##TYPE] = SLOTS_##SLOTS
> #define DEF_PP_ICLASS32(TYPE, SLOTS, UNITS) \
>     DEF_ICLASS(TYPE, SLOTS)
> #define DEF_EE_ICLASS32(TYPE, SLOTS, UNITS) \
>     DEF_ICLASS(TYPE, SLOTS)
>
> At which point the ultimate consumer,
>
> >     for (i = 0, slot = 3; i < pkt->num_insns; i++) {
> >         valid_slot_str = get_valid_slot_str(pkt, i);
> >
> >         while (strchr(valid_slot_str, '0' + slot) == NULL) {
> >             slot--;
> >         }
> >         pkt->insn[i].slot = slot;
>
> becomes a simple integer mask check.

Will fix
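
A rough sketch of what the "simple integer mask check" could look like, assuming four slots numbered 0 to 3. Only `SLOTS_0`, `SLOTS_1` and `SLOTS_23` appear in the review; the other enumerators and the `next_valid_slot` helper are hypothetical, and the bounds guard is an addition that the quoted `strchr` loop does not have.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SLOTS_0    = (1 << 0),
    SLOTS_1    = (1 << 1),
    SLOTS_01   = SLOTS_0 | SLOTS_1,          /* hypothetical */
    SLOTS_23   = (1 << 2) | (1 << 3),
    SLOTS_0123 = SLOTS_01 | SLOTS_23,        /* hypothetical */
};

/*
 * Starting from 'slot', walk downward to the highest slot permitted by
 * 'valid_mask'.  Returns -1 if no slot at or below 'slot' is permitted.
 */
static int next_valid_slot(uint8_t valid_mask, int slot)
{
    assert(slot >= 0 && slot <= 3);
    while (slot >= 0 && !(valid_mask & (1u << slot))) {
        slot--;
    }
    return slot;
}
```

The `slot >= 0` test is evaluated first, so the shift is never performed with a negative count.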


* RE: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics
  2020-08-29  1:16   ` Richard Henderson
@ 2020-08-30 20:23     ` Taylor Simpson
  2020-08-30 21:06       ` Richard Henderson
  2020-10-08 15:00     ` Taylor Simpson
  1 sibling, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-30 20:23 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Friday, August 28, 2020 7:17 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros
> referenced in instruction semantics
>
> On 8/18/20 8:50 AM, Taylor Simpson wrote:
> > + * For qemu, we look for a load in slot 0 when there is  a store in slot 1
> > + * in the same packet.  When we see this, we call a helper that merges the
> > + * bytes from the store buffer with the value loaded from memory.
> > + */
> > +#define CHECK_NOSHUF(DST, VA, SZ, SIGN) \
> > +    do { \
> > +        if (insn->slot == 0 && pkt->pkt_has_store_s1) { \
> > +            gen_helper_merge_inflight_store##SZ##SIGN(DST, cpu_env, VA,
> DST); \
> > +        } \
> > +    } while (0)
>
> Ah, so I see what you're trying to do with the merge thing, which had the
> host-endian problems.
>
> I think the merge stuff is a mistake.  I think you can get the semantics that
> you want with
>
> probe_read(ld_addr, ld_len)
> qemu_st(st_value, st_addr)
> qemu_ld(ld_value, ld_addr)
>
> In this way, all exceptions are recognized before the store is complete, the
> normal memory operations handle any possible overlap.

So, do this inside the helper?  Or is there a way to generate TCG code?

>
> > +#define fINSERT_BITS(REG, WIDTH, OFFSET, INVAL) \
> > +    do { \
> > +        REG = ((REG) & ~(((fCONSTLL(1) << (WIDTH)) - 1) << (OFFSET))) | \
> > +           (((INVAL) & ((fCONSTLL(1) << (WIDTH)) - 1)) << (OFFSET)); \
> > +    } while (0)
>
> reg = deposit32(reg, offset, width, inval)

OK

> > +#define fEXTRACTU_BITS(INREG, WIDTH, OFFSET) \
> > +    (fZXTN(WIDTH, 32, (INREG >> OFFSET)))
> > +#define fEXTRACTU_BIDIR(INREG, WIDTH, OFFSET) \
> > +    (fZXTN(WIDTH, 32, fBIDIR_LSHIFTR((INREG), (OFFSET), 4_8)))
> > +#define fEXTRACTU_RANGE(INREG, HIBIT, LOWBIT) \
> > +    (fZXTN((HIBIT - LOWBIT + 1), 32, (INREG >> LOWBIT)))
>
> extract32(inreg, offset, width)

OK

> > +#define fZXTN(N, M, VAL) ((VAL) & ((1LL << (N)) - 1))
>
> extract32(VAL, 0, n)

OK

> > +#define fSXTN(N, M, VAL) \
> > +    ((fZXTN(N, M, VAL) ^ (1LL << ((N) - 1))) - (1LL << ((N) - 1)))
>
> sextract32(val, 0, n)

OK
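
For reference, the macro form and the helper form of sign extension can be compared directly. In this sketch the two macros are copied from the quoted patch, while `sextract32` is a local copy written to follow the semantics of the QEMU helper of the same name. Like that helper, it relies on arithmetic right shift of negative signed values, which is implementation-defined in C but is what the usual compilers provide.

```c
#include <assert.h>
#include <stdint.h>

/* Macro forms from the quoted patch (the M argument is unused). */
#define fZXTN(N, M, VAL) ((VAL) & ((1LL << (N)) - 1))
#define fSXTN(N, M, VAL) \
    ((fZXTN(N, M, VAL) ^ (1LL << ((N) - 1))) - (1LL << ((N) - 1)))

/* Return bits [start, start + length) of value, sign-extended. */
static int32_t sextract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    /* Move the field to the top, then shift back down arithmetically. */
    return ((int32_t)(value << (32 - length - start))) >> (32 - length);
}
```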

> > +#define fRND(A) (((A) + 1) >> 1)
>
> Does A have a type that won't overflow?
> For Arm we write this as
>
>     (A >> 1) + (A & 1)

Will investigate
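
To make the overflow concern concrete, here is a small sketch that assumes the operand is a 32-bit unsigned value; the actual type of `A` in the Hexagon semantics is not stated in this thread. The function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Form used by fRND: the addition wraps when a == UINT32_MAX. */
static uint32_t rnd_add_first(uint32_t a)
{
    return (a + 1) >> 1;
}

/* Form suggested in the review: halve first, then add the rounding bit. */
static uint32_t rnd_split(uint32_t a)
{
    return (a >> 1) + (a & 1);
}

/* The two forms agree for every input except UINT32_MAX. */
static void check_rnd_agree(uint32_t a)
{
    assert(a != UINT32_MAX);
    assert(rnd_add_first(a) == rnd_split(a));
}
```

If `A` is always evaluated in a wider type than its value range, the original form is fine; the split form simply removes that dependency.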

> > +#define fDCFETCH(REG) do { REG = REG; } while (0) /* Nothing to do in
> qemu */
> > +#define fICINVA(REG) do { REG = REG; } while (0) /* Nothing to do in
> qemu */
> > +#define fL2FETCH(ADDR, HEIGHT, WIDTH, STRIDE, FLAGS)
> > +#define fDCCLEANA(REG) do { REG = REG; } while (0) /* Nothing to do in
> qemu */
> > +#define fDCCLEANINVA(REG) \
> > +    do { REG = REG; } while (0) /* Nothing to do in qemu */
>
> Is this "R=R" thing trying to avoid a compiler warning?
> Perhaps "(void)R" would be sufficient to avoid that?

Yes, it avoids a compiler warning.  Will change to (void)

> > -static inline void log_store32(CPUHexagonState *env, target_ulong addr,
> > -                               target_ulong val, int width, int slot)
> > -{
> > -    HEX_DEBUG_LOG("log_store%d(0x" TARGET_FMT_lx ", "
> TARGET_FMT_ld
> > -                  " [0x" TARGET_FMT_lx "])\n",
> > -                  width, addr, val, val);
> > -    env->mem_log_stores[slot].va = addr;
> > -    env->mem_log_stores[slot].width = width;
> > -    env->mem_log_stores[slot].data32 = val;
> > -}
> > -
> > -static inline void log_store64(CPUHexagonState *env, target_ulong addr,
> > -                               int64_t val, int width, int slot)
> > -{
> > -    HEX_DEBUG_LOG("log_store%d(0x" TARGET_FMT_lx ", %ld [0x%lx])\n",
> > -                   width, addr, val, val);
> > -    env->mem_log_stores[slot].va = addr;
> > -    env->mem_log_stores[slot].width = width;
> > -    env->mem_log_stores[slot].data64 = val;
> > -}
> > -
>
> Fold this back to wherever it came from.  Clearly no need to introduce it in
> the first place.

These are invoked by the MEM_STORE macros.  Are you saying to put this code there?




* RE: [RFC PATCH v3 25/34] Hexagon (target/hexagon) macros to interface with the generator
  2020-08-29  0:49   ` Richard Henderson
@ 2020-08-30 20:30     ` Taylor Simpson
  2020-08-30 20:59       ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-30 20:30 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Friday, August 28, 2020 6:49 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 25/34] Hexagon (target/hexagon) macros to
> interface with the generator
>
> On 8/18/20 8:50 AM, Taylor Simpson wrote:
> > +#define DECL_REG(NAME, NUM, X, OFF) \
> > +    TCGv NAME = tcg_temp_local_new(); \
> > +    int NUM = REGNO(X) + OFF
> > +
> > +#define DECL_REG_WRITABLE(NAME, NUM, X, OFF) \
> > +    TCGv NAME = tcg_temp_local_new(); \
> > +    int NUM = REGNO(X) + OFF; \
> > +    do { \
> > +        int is_predicated = GET_ATTRIB(insn->opcode, A_CONDEXEC); \
> > +        if (is_predicated && !is_preloaded(ctx, NUM)) { \
> > +            tcg_gen_mov_tl(hex_new_value[NUM], hex_gpr[NUM]); \
> > +        } \
> > +    } while (0)
> > +/*
> > + * For read-only temps, avoid allocating and freeing
> > + */
> > +#define DECL_REG_READONLY(NAME, NUM, X, OFF) \
> > +    TCGv NAME; \
> > +    int NUM = REGNO(X) + OFF
> > +
> > +#define DECL_RREG_d(NAME, NUM, X, OFF) \
> > +    DECL_REG_WRITABLE(NAME, NUM, X, OFF)
> > +#define DECL_RREG_e(NAME, NUM, X, OFF) \
> > +    DECL_REG(NAME, NUM, X, OFF)
>
> Is there a good reason for all these macros?
> Why not just bake this knowledge into gen_tcg_funcs.py?
> Seems like it would be just a couple of functions...
>
> At present, both this and the intermediary files are unreadable.  One has to
> pass genptr.c through -E and indent to see what's going on.

I add the regid (see comment in hex_common.py) in order to reduce the number of TCGv temps and the amount of TCG code we generate.  I originally had a single DEF/READ/WRITE/FREE set.  We would always create a TCGv and copy to/from the temp.  In read-only cases, we don't need a temp - we just point to the source.  For write-only, we assign directly to the new_value.  For read-write, we actually need the temp.


* Re: [RFC PATCH v3 27/34] Hexagon (target/hexagon) instruction classes
  2020-08-30 20:04     ` Taylor Simpson
@ 2020-08-30 20:43       ` Richard Henderson
  0 siblings, 0 replies; 122+ messages in thread
From: Richard Henderson @ 2020-08-30 20:43 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/30/20 1:04 PM, Taylor Simpson wrote:
> So, this should be CamelCase?  I should be able to fix all of them.

Yes, they should.  Thanks.


r~



* RE: [RFC PATCH v3 00/34] Hexagon patch series
  2020-08-29  3:27 ` Richard Henderson
@ 2020-08-30 20:47   ` Taylor Simpson
  2020-08-30 23:33     ` Richard Henderson
  2020-09-07  9:49     ` Rob Landley
  0 siblings, 2 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-30 20:47 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

Richard,

Thank you so much for the feedback.  I really appreciate it.

I'll get to work addressing the issues.  Since some of the items will take longer than others, please advise whether it's preferred to send intermediate updates or wait until they are all addressed.

Taylor


> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Friday, August 28, 2020 9:27 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 00/34] Hexagon patch series
>
> On 8/18/20 8:50 AM, Taylor Simpson wrote:
> > This series adds support for the Hexagon processor with Linux user support
> >
> > See patch 02/34 Hexagon README for detailed information.
> >
> > Once the series is applied, the Hexagon port will pass "make check-tcg".
> > The series also includes Hexagon-specific tests in tcg/tests/hexagon.
> >
> > We have a parallel effort to make the Hexagon Linux toolchain inside a
> docker
> > container publically available.
>
> Oh, excellent.
>
> > *** Future items under consideration ***
> > Use qemu softfloat
>
> This is a blocker.  It's definitely not hard.
>
> > Use qemu decodetree
>
> This would certainly clean up all of the string processing that I mentioned.
> It seems like it would be just as easy to convert into the decodetree format
> as
> what you're currently doing for dectree_generated.h.  Indeed, easier, since
> you
> only need the ones that are TERMINAL().  All of the other things labeled
> TABLE_LINK are handled by decodetree itself.
>
> Anyway, I've completed what review I planned to do against this version.
>
>
> r~


* Re: [RFC PATCH v3 28/34] Hexagon (target/hexagon) TCG generation helpers
  2020-08-30 19:53     ` Taylor Simpson
@ 2020-08-30 20:52       ` Richard Henderson
  2020-08-30 21:38         ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-30 20:52 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/30/20 12:53 PM, Taylor Simpson wrote:
>>> +++ b/target/hexagon/genptr_helpers.h
>>> @@ -0,0 +1,244 @@
>>> +
>>> +static inline void gen_log_reg_write(int rnum, TCGv val, int slot,
>>> +                                     int is_predicated)
>>
>> These are quite large.  Why are they marked inline?
> 
> Since this is a header file, it prevents the compiler from complaining when they aren't used.

Ok, why are they in a header file?
Why would they be unused, come to that?

The header file is used exactly once, by genptr.c.  Seems to me they could just
as well be *in* genptr.c.

If the functions are not used, just remove them?


>>> +static inline void log_store32(CPUHexagonState *env, target_ulong addr,
>>> +                               int32_t val, int width, int slot)
>>> +{
>>> +    HEX_DEBUG_LOG("log_store%d(0x%x, %d [0x%x])\n", width, addr, val,
>> val);
>>> +    env->mem_log_stores[slot].va = addr;
>>> +    env->mem_log_stores[slot].width = width;
>>> +    env->mem_log_stores[slot].data32 = val;
>>> +}
>>> +
>>> +static inline void log_store64(CPUHexagonState *env, target_ulong addr,
>>> +                               int64_t val, int width, int slot)
>>> +{
>>> +    HEX_DEBUG_LOG("log_store%d(0x%x, %ld [0x%lx])\n", width, addr,
>> val, val);
>>> +    env->mem_log_stores[slot].va = addr;
>>> +    env->mem_log_stores[slot].width = width;
>>> +    env->mem_log_stores[slot].data64 = val;
>>> +}
>>
>> ... or fold this re-addition back into where it was accidentally removed.  ;-)
> 
> Could you elaborate?

You added this code in one patch (didn't check which), removed it in patch 26,
and re-added it here in patch 28.


r~



* Re: [RFC PATCH v3 25/34] Hexagon (target/hexagon) macros to interface with the generator
  2020-08-30 20:30     ` Taylor Simpson
@ 2020-08-30 20:59       ` Richard Henderson
  2020-08-30 21:20         ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-30 20:59 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/30/20 1:30 PM, Taylor Simpson wrote:
> 
> 
>> -----Original Message-----
>> From: Richard Henderson <richard.henderson@linaro.org>
>> Sent: Friday, August 28, 2020 6:49 PM
>> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
>> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
>> aleksandar.m.mail@gmail.com; ale@rev.ng
>> Subject: Re: [RFC PATCH v3 25/34] Hexagon (target/hexagon) macros to
>> interface with the generator
>>
>> On 8/18/20 8:50 AM, Taylor Simpson wrote:
>>> +#define DECL_REG(NAME, NUM, X, OFF) \
>>> +    TCGv NAME = tcg_temp_local_new(); \
>>> +    int NUM = REGNO(X) + OFF
>>> +
>>> +#define DECL_REG_WRITABLE(NAME, NUM, X, OFF) \
>>> +    TCGv NAME = tcg_temp_local_new(); \
>>> +    int NUM = REGNO(X) + OFF; \
>>> +    do { \
>>> +        int is_predicated = GET_ATTRIB(insn->opcode, A_CONDEXEC); \
>>> +        if (is_predicated && !is_preloaded(ctx, NUM)) { \
>>> +            tcg_gen_mov_tl(hex_new_value[NUM], hex_gpr[NUM]); \
>>> +        } \
>>> +    } while (0)
>>> +/*
>>> + * For read-only temps, avoid allocating and freeing
>>> + */
>>> +#define DECL_REG_READONLY(NAME, NUM, X, OFF) \
>>> +    TCGv NAME; \
>>> +    int NUM = REGNO(X) + OFF
>>> +
>>> +#define DECL_RREG_d(NAME, NUM, X, OFF) \
>>> +    DECL_REG_WRITABLE(NAME, NUM, X, OFF)
>>> +#define DECL_RREG_e(NAME, NUM, X, OFF) \
>>> +    DECL_REG(NAME, NUM, X, OFF)
>>
>> Is there a good reason for all these macros?
>> Why not just bake this knowledge into gen_tcg_funcs.py?
>> Seems like it would be just a couple of functions...
>>
>> At present, both this and the intermediary files are unreadable.  One has to
>> pass genptr.c through -E and indent to see what's going on.
> 
> I add the regid...

No, that doesn't answer the question.

Why does DECL_RREG_d et al exist as macros at all?  Why not emit the expansions
directly by gen_tcg_funcs.py?

It seems to me that all this does is obfuscate the code, adding one more layer
that one has to unwind in order to understand.


r~



* Re: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics
  2020-08-30 20:23     ` Taylor Simpson
@ 2020-08-30 21:06       ` Richard Henderson
  0 siblings, 0 replies; 122+ messages in thread
From: Richard Henderson @ 2020-08-30 21:06 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/30/20 1:23 PM, Taylor Simpson wrote:
>> I think the merge stuff is a mistake.  I think you can get the semantics that
>> you want with
>>
>> probe_read(ld_addr, ld_len)
>> qemu_st(st_value, st_addr)
>> qemu_ld(ld_value, ld_addr)
>>
>> In this way, all exceptions are recognized before the store is complete, the
>> normal memory operations handle any possible overlap.
> 
> So, do this inside the helper?  Or is there a way to generate TCG code?

I was thinking TCG code, where you can look at the packet before any code gen,
and decide whether or not this situation actually applies.

The probe is a simple helper call, for which the generic machinery exists
(probe_access, probe_write, probe_read).  All you have to do is write the wrapper.

The loads and stores are, well, normal loads and stores.


r~



* Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-08-30 19:48     ` Taylor Simpson
@ 2020-08-30 21:13       ` Richard Henderson
  2020-08-30 21:30         ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-30 21:13 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/30/20 12:48 PM, Taylor Simpson wrote:
> I'll add the following comment to indicate what's going on
> 
> /*
>   * Each of the generated helpers is wrapped with #ifndef fGEN_TCG_<tag>.
>   * For example
>    *     #ifndef fGEN_TCG_A2_add
>    *     DEF_HELPER_3(A2_add, s32, env, s32, s32)
>    *     #endif
>   * We include gen_tcg.h here to provide definitions of fGEN_TCG for any instructions that
>   * are overridden.
>   *
>   * This prevents unused helpers from taking up space in the executable.
>   */

Ah, hum.  Well.

How about we figure out a way to communicate to the generator scripts which
functions have been implemented "directly", and don't need to be generated at all?

I don't see why we have to wait until the c preprocessor to remove this unused
code, and the less we have of it, the better.


r~




* RE: [RFC PATCH v3 25/34] Hexagon (target/hexagon) macros to interface with the generator
  2020-08-30 20:59       ` Richard Henderson
@ 2020-08-30 21:20         ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-30 21:20 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Sunday, August 30, 2020 2:59 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 25/34] Hexagon (target/hexagon) macros to
> interface with the generator
>
> On 8/30/20 1:30 PM, Taylor Simpson wrote:
> >
> >
> >> -----Original Message-----
> >> From: Richard Henderson <richard.henderson@linaro.org>
> >> Sent: Friday, August 28, 2020 6:49 PM
> >> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> >> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> >> aleksandar.m.mail@gmail.com; ale@rev.ng
> >> Subject: Re: [RFC PATCH v3 25/34] Hexagon (target/hexagon) macros to
> >> interface with the generator
> >>
> >> On 8/18/20 8:50 AM, Taylor Simpson wrote:
> >>> +#define DECL_REG(NAME, NUM, X, OFF) \
> >>> +    TCGv NAME = tcg_temp_local_new(); \
> >>> +    int NUM = REGNO(X) + OFF
> >>> +
> >>> +#define DECL_REG_WRITABLE(NAME, NUM, X, OFF) \
> >>> +    TCGv NAME = tcg_temp_local_new(); \
> >>> +    int NUM = REGNO(X) + OFF; \
> >>> +    do { \
> >>> +        int is_predicated = GET_ATTRIB(insn->opcode, A_CONDEXEC); \
> >>> +        if (is_predicated && !is_preloaded(ctx, NUM)) { \
> >>> +            tcg_gen_mov_tl(hex_new_value[NUM], hex_gpr[NUM]); \
> >>> +        } \
> >>> +    } while (0)
> >>> +/*
> >>> + * For read-only temps, avoid allocating and freeing
> >>> + */
> >>> +#define DECL_REG_READONLY(NAME, NUM, X, OFF) \
> >>> +    TCGv NAME; \
> >>> +    int NUM = REGNO(X) + OFF
> >>> +
> >>> +#define DECL_RREG_d(NAME, NUM, X, OFF) \
> >>> +    DECL_REG_WRITABLE(NAME, NUM, X, OFF)
> >>> +#define DECL_RREG_e(NAME, NUM, X, OFF) \
> >>> +    DECL_REG(NAME, NUM, X, OFF)
> >>
> >> Is there a good reason for all these macros?
> >> Why not just bake this knowledge into gen_tcg_funcs.py?
> >> Seems like it would be just a couple of functions...
> >>
> >> At present, both this and the intermediary files are unreadable.  One has
> to
> >> pass genptr.c through -E and indent to see what's going on.
> >
> > I add the regid...
>
> No, that doesn't answer the question.
>
> Why does DECL_RREG_d et al exist as macros at all?  Why not emit the
> expansions
> directly by gen_tcg_funcs.py?
>
> It seems to me that all this does is obfuscate the code, adding one more layer
> that one has to unwind in order to understand.
>
>
> r~

This is partly historical.  My intent was to keep the generator simple and mechanical and put the optimization and complexity into the macros.  I will code this up and see if we think it's better.
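
For illustration only, a minimal sketch of what emitting the expansion directly from the generator might look like.  It is written in Python because gen_tcg_funcs.py is.  The function name emit_decl_writable and its arguments are hypothetical, and it assumes the generator already knows at generation time whether the instruction is predicated (the macro quoted above checks A_CONDEXEC at translate time instead).

```python
def emit_decl_writable(name, num, regid, off, predicated):
    """Return C text equivalent to DECL_REG_WRITABLE(name, num, regid, off).

    Hypothetical helper: 'predicated' stands in for the A_CONDEXEC
    attribute, assumed here to be known when the generator runs.
    """
    lines = [
        f"TCGv {name} = tcg_temp_local_new();",
        f"int {num} = REGNO({regid}) + {off};",
    ]
    if predicated:
        # Preload the new value so a cancelled write keeps the old one.
        lines += [
            f"if (!is_preloaded(ctx, {num})) {{",
            f"    tcg_gen_mov_tl(hex_new_value[{num}], hex_gpr[{num}]);",
            "}",
        ]
    return "\n".join(lines)


print(emit_decl_writable("RdV", "RdN", 0, 0, predicated=True))
```

With this shape the generated file contains the plain C statements, so nothing has to be passed through -E to see what a given instruction does.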



^ permalink raw reply	[flat|nested] 122+ messages in thread

* RE: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-08-30 21:13       ` Richard Henderson
@ 2020-08-30 21:30         ` Taylor Simpson
  2020-08-30 23:26           ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-30 21:30 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Sunday, August 30, 2020 3:13 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for
> instructions with multiple definitions
>
> On 8/30/20 12:48 PM, Taylor Simpson wrote:
> > I'll add the following comment to indicate what's going on
> >
> > /*
> >   * Each of the generated helpers is wrapped with #ifndef
> fGEN_TCG_<tag>.
> >   * For example
> >    *     #ifndef fGEN_TCG_A2_add
> >    *     DEF_HELPER_3(A2_add, s32, env, s32, s32)
> >    *     #endif
> >   * We include gen_tcg.h here to provide definitions of fGEN_TCG for any
> instructions that
> >   * are overridden.
> >   *
> >   * This prevents unused helpers from taking up space in the executable.
> >   */
>
> Ah, hum.  Well.
>
> How about we figure out a way to communicate to the generator scripts
> which
> functions have been implemented "directly", and don't need to be
> generated at all?
>
> I don't see why we have to wait until the c preprocessor to remove this
> unused
> code, and the less we have of it, the better.
>

A few reasons
- Makes it easy to override instruction helpers.  All one has to do is #define fGEN_TCG_<tag>.
- When debugging the override, sometimes you want to quickly revert back to the helper.  Or if you've written a bunch of overrides in one shot and then find that a test case is failing, you can binary search which one to turn off and get the test to pass.  This is the one with the bug to fix.
- Reduces time for an incremental build.  When we add or delete an override, we don't have to re-run the generator.
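
The binary search in the second point could be sketched roughly as follows.  This is an illustration, not part of the series: find_bad_override and test_passes are hypothetical names, the tag names are only examples, and the sketch assumes exactly one override is faulty and that the test passes whenever that override is disabled.

```python
def find_bad_override(overrides, test_passes):
    """Bisect a list of override tags to find the single faulty one.

    test_passes(enabled) is a hypothetical callback that rebuilds with
    only the 'enabled' overrides active and reports whether the test passes.
    """
    candidates = list(overrides)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        disabled = set(half)
        enabled = [tag for tag in overrides if tag not in disabled]
        if test_passes(enabled):
            # Disabling this half fixed the test, so the bug is in it.
            candidates = half
        else:
            # The bug is still enabled, so it is in the other half.
            candidates = candidates[len(half):]
    return candidates[0]


tags = ["A2_add", "A2_sub", "A2_and", "A2_or", "A2_xor"]
print(find_bad_override(tags, lambda enabled: "A2_or" not in enabled))
```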

^ permalink raw reply	[flat|nested] 122+ messages in thread

* RE: [RFC PATCH v3 28/34] Hexagon (target/hexagon) TCG generation helpers
  2020-08-30 20:52       ` Richard Henderson
@ 2020-08-30 21:38         ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-30 21:38 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Sunday, August 30, 2020 2:52 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 28/34] Hexagon (target/hexagon) TCG
> generation helpers
>
> On 8/30/20 12:53 PM, Taylor Simpson wrote:
> >>> +++ b/target/hexagon/genptr_helpers.h
> >>> @@ -0,0 +1,244 @@
> >>> +
> >>> +static inline void gen_log_reg_write(int rnum, TCGv val, int slot,
> >>> +                                     int is_predicated)
> >>
> >> These are quite large.  Why are they marked inline?
> >
> > Since this is a header file, it prevents the compiler from complaining when
> they aren't used.
>
> Ok, why are they in a header file?
> Why would they be unused, come to that?
>
> The header file is used exactly once, by genptr.c.  Seems to me they could
> just
> as well be *in* genptr.c.
>
> If the functions are not used, just remove them?

I could have sworn it was included in more than one C file.  I'll move the contents to genptr.c.


> >>> +static inline void log_store32(CPUHexagonState *env, target_ulong
> addr,
> >>> +                               int32_t val, int width, int slot)
> >>> +{
> >>> +    HEX_DEBUG_LOG("log_store%d(0x%x, %d [0x%x])\n", width, addr,
> val,
> >> val);
> >>> +    env->mem_log_stores[slot].va = addr;
> >>> +    env->mem_log_stores[slot].width = width;
> >>> +    env->mem_log_stores[slot].data32 = val;
> >>> +}
> >>> +
> >>> +static inline void log_store64(CPUHexagonState *env, target_ulong
> addr,
> >>> +                               int64_t val, int width, int slot)
> >>> +{
> >>> +    HEX_DEBUG_LOG("log_store%d(0x%x, %ld [0x%lx])\n", width, addr,
> >> val, val);
> >>> +    env->mem_log_stores[slot].va = addr;
> >>> +    env->mem_log_stores[slot].width = width;
> >>> +    env->mem_log_stores[slot].data64 = val;
> >>> +}
> >>
> >> ... or fold this re-addition back into where it was accidentally removed.  ;-)
> >
> > Could you elaborate?
>
> You added this code in one patch (didn't check which), removed it in patch
> 26,
> and re-added it here in patch 28.

My apologies, this is my screwing up the git rebase.  I'll fix it.

>
>
> r~

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 31/34] Hexagon (target/hexagon) translation
  2020-08-30 19:37     ` Taylor Simpson
@ 2020-08-30 23:08       ` Richard Henderson
  0 siblings, 0 replies; 122+ messages in thread
From: Richard Henderson @ 2020-08-30 23:08 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/30/20 12:37 PM, Taylor Simpson wrote:
> I'm actually checking two conditions here.
> 1) packet crossing a page boundary
> 2) reading too many words without finding the end of the packet.
> I guess it would be better to separate them.
> 
> What is the correct behavior for the second case?  Should we return an error code from here and have the higher level code generate the invalid packet exception?

I would return an error code.

In fact, I would also pass in the max number of words to read:

static int read_packet_words(CPUHexagonState *env,
                             DisasContext *ctx,
                             uint32_t words[PACKET_WORDS_MAX],
                             int max_words)
{
   // stuff
   return !found_end ? 0 : nwords;
}


Then, in translate_packet,


    uint32_t words[PACKET_WORDS_MAX];
    int nwords = read_packet_words(env, ctx, words,
                                   PACKET_WORDS_MAX);

    if (nwords == 0) {
        // raise exception
        return;
    }

    decode_and_translate_packet(env, ctx, words, nwords);

    /* If we're going to try for another packet... */
    if (ctx->base.is_jmp == DISAS_NEXT &&
        ctx->base.num_insns < ctx->base.max_insns) {
        /*
         * Remember the end of the page containing the
         * first packet.  Note that the first packet
         * is allowed to span two pages, so this is not
         * necessarily the same as the end of the page
         * containing ctx->base.pc_start.
         */
        if (ctx->base.num_insns == 1) {
            ctx->page_end
                = TARGET_PAGE_ALIGN(ctx->base.pc_next);
        }

        /*
         * If there are not PACKET_WORDS_MAX remaining on
         * the page, check to see if a full packet remains.
         * If not, split the TB so that the packet that
         * crosses the page begins the next TB.
         */
        target_long left = ctx->page_end - ctx->base.pc_next;
        tcg_debug_assert(left >= 0);
        if (left == 0
            || (left < PACKET_WORDS_MAX * 4 &&
                !read_packet_words(env, ctx, words, left / 4))) {
            ctx->base.is_jmp = DISAS_TOO_MANY;
        }
    }


The reason for all this is to properly capture the behaviour of instruction
execution vs SIGSEGV.

First, during translate we do not want to read from the next page unless
absolutely necessary.  Doing so could raise SIGSEGV before it would be
appropriate, e.g. because the TB should have branched away (or raised SIGFPE,
or anything else) before getting that far.

Second, when dispatching a TB, we check the 1 or 2 pages that the TB occupies
for validity.  If the second page is invalid, we raise SIGSEGV without
executing the TB at all.  Which makes it appear as if the SIGSEGV happened at
the first insn of the TB.  Which is correct if and only if the first insn is
the one that did cross the page.
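
The page arithmetic can be checked in isolation.  The following is a small model in Python rather than QEMU code, under stated assumptions: a 4 KiB page, PACKET_WORDS_MAX of 4, and a hypothetical callback full_packet_fits(n) standing in for a non-zero return from read_packet_words() limited to n words.

```python
TARGET_PAGE_SIZE = 4096   # assumed page size for this model
PACKET_WORDS_MAX = 4      # assumed maximum number of words in a packet
WORD_BYTES = 4


def page_align_up(addr):
    """Round addr up to the next page boundary (models TARGET_PAGE_ALIGN)."""
    return (addr + TARGET_PAGE_SIZE - 1) & ~(TARGET_PAGE_SIZE - 1)


def must_end_tb(pc_next, page_end, full_packet_fits):
    """Decide whether the TB has to stop before the next packet."""
    left = page_end - pc_next
    assert left >= 0
    if left == 0:
        return True
    if left < PACKET_WORDS_MAX * WORD_BYTES:
        # Only probe the words that remain on this page.
        return not full_packet_fits(left // WORD_BYTES)
    return False


# Two words left on the page, and the next packet is two words long.
print(must_end_tb(0x0FF8, 0x1000, lambda n: n >= 2))
```

The point of the callback is that the probe never looks past page_end, so translation of a later packet cannot fault on a page that execution may never reach.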

>> Surely you don't need to actually set PC for every PC?
> 
> What do other targets do?

If you have a pc-relative instruction, e.g. x86_64's

  lea  offset(%rip), %rax

then you just use the known immediate for %rip:

  tcg_gen_movi_tl(cpu_reg[eax], ctx->base.pc_next + offset);

Normally, PC is only valid when explicitly returning to the cpu loop
(tcg_gen_exit_tb, static exception), for indirect branching
(tcg_gen_lookup_and_goto_ptr), or after dynamic exception unwinding
(restore_state_to_opc).

When using goto_tb, we can get away with *assuming* static state, because the
values get baked into the link to a specific next-TB.  That's why the general
form is

  tcg_gen_goto_tb(n);
  gen_set_pc_im(s, dest);
  tcg_gen_exit_tb(s->base.tb, n);

The first time we cross link N, the link is unset, which causes the goto_tb to
continue to the next tcg opcode.  Which then sets all of the static state that
has been assumed (often, as here, just the pc).  We then exit, telling the
cpu_loop to examine cpu state, locate the next TB, and fill in link N from the
current TB.

The second time we cross link N, the link is set, which causes the goto_tb to
continue immediately to the next TB.  We do not execute the store to PC, as it
is implied by next_tb->pc.
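
A toy model of the two cases, in Python rather than TCG, may make this concrete.  Everything here is illustrative: the TB class, the cpu dict, and run_exit are made-up names, and the real lookup and patching are done by the cpu loop, not by translated code.

```python
class TB:
    """Toy translation block: a start pc plus two goto_tb link slots."""

    def __init__(self, pc):
        self.pc = pc
        self.links = [None, None]


def run_exit(cpu, tb, n, dest, tb_cache):
    """Model 'goto_tb(n); set pc to dest; exit_tb(tb, n)' for one execution."""
    nxt = tb.links[n]
    if nxt is not None:
        # Link already set: jump straight to the next TB, no store to PC.
        return nxt
    # Link unset: fall through, store the assumed static state, exit.
    cpu["pc"] = dest
    cpu["pc_stores"] += 1
    # The cpu loop locates (or creates) the next TB and fills in link n.
    nxt = tb_cache.setdefault(dest, TB(dest))
    tb.links[n] = nxt
    return nxt


cpu = {"pc": 0x100, "pc_stores": 0}
cache = {}
tb = TB(0x100)
first = run_exit(cpu, tb, 0, 0x200, cache)
second = run_exit(cpu, tb, 0, 0x200, cache)
print(first is second, cpu["pc_stores"])
```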


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-08-30 21:30         ` Taylor Simpson
@ 2020-08-30 23:26           ` Richard Henderson
  2020-08-31 17:08             ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-30 23:26 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/30/20 2:30 PM, Taylor Simpson wrote:
> 
> 
>> -----Original Message-----
>> From: Richard Henderson <richard.henderson@linaro.org>
>> Sent: Sunday, August 30, 2020 3:13 PM
>> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
>> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
>> aleksandar.m.mail@gmail.com; ale@rev.ng
>> Subject: Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for
>> instructions with multiple definitions
>>
>> On 8/30/20 12:48 PM, Taylor Simpson wrote:
>>> I'll add the following comment to indicate what's going on
>>>
>>> /*
>>>   * Each of the generated helpers is wrapped with #ifndef
>> fGEN_TCG_<tag>.
>>>   * For example
>>>    *     #ifndef fGEN_TCG_A2_add
>>>    *     DEF_HELPER_3(A2_add, s32, env, s32, s32)
>>>    *     #endif
>>>   * We include gen_tcg.h here to provide definitions of fGEN_TCG for any
>> instructions that
>>>   * are overridden.
>>>   *
>>>   * This prevents unused helpers from taking up space in the executable.
>>>   */
>>
>> Ah, hum.  Well.
>>
>> How about we figure out a way to communicate to the generator scripts
>> which
>> functions have been implemented "directly", and don't need to be
>> generated at all?
>>
>> I don't see why we have to wait until the c preprocessor to remove this
>> unused
>> code, and the less we have of it, the better.
>>
> 
> A few reasons
> - Makes it easy to override instruction helpers.  All one has to do is #define fGEN_TCG_<tag>.

If the generator can examine, say, genptr_override.c.inc, then you don't even
have to add a #define.  Just add the code.

Perhaps something like

DEFINE_FGEN(tag)
{
    // some code
}

where DEFINE_FGEN expands to the appropriate function declaration.  The
generator then need only grep '^DEFINE_FGEN' and extract the list of overridden
tags.
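
On the generator side this is only a few lines.  A minimal sketch, assuming the overrides live in one file whose text has already been read into a string; overridden_tags is a hypothetical name.

```python
import re

# Matches DEFINE_FGEN(tag) only at the start of a line, like grep '^DEFINE_FGEN'.
DEFINE_FGEN_RE = re.compile(r"^DEFINE_FGEN\((\w+)\)", re.MULTILINE)


def overridden_tags(source_text):
    """Return the set of instruction tags that have a hand-written override."""
    return set(DEFINE_FGEN_RE.findall(source_text))


sample = """\
DEFINE_FGEN(A2_add)
{
    /* some code */
}
/* DEFINE_FGEN(A2_sub) is commented out, so it is not an override */
"""
print(sorted(overridden_tags(sample)))
```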


> - When debugging the override, sometimes you want to quickly revert back to the helper.  Or if you've written a bunch of overrides in one shot and then find that a test case is failing, you can binary search which one to turn off and get the test to pass.  This is the one with the bug to fix.

No difference there, just binary searching on different text.

> - Reduces time for an incremental build.  When we add or delete an override, we don't have to re-run the generator.

About this I care not at all.  I can't imagine that more than fractions of a
second are at stake.


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 00/34] Hexagon patch series
  2020-08-30 20:47   ` Taylor Simpson
@ 2020-08-30 23:33     ` Richard Henderson
  2020-08-31 17:57       ` Taylor Simpson
  2020-09-07  9:49     ` Rob Landley
  1 sibling, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-30 23:33 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/30/20 1:47 PM, Taylor Simpson wrote:
> Richard,
> 
> Thank you so much for the feedback.  I really appreciate it.
> 
> I'll get to work addressing the issues.  Since some of the items will take longer than others, please advise whether it's preferred to send intermediate updates or wait until they are all addressed.

I don't mind intermediate updates.  Just keep a list in the cover letter of the
things that are still on the to-do list, and I'll not focus on those.

We could also talk about what portions of the to-do list are blockers, and what
can be done via normal development.  Because neither you nor I want to carry
around this huge patch set forever.


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* RE: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-08-30 23:26           ` Richard Henderson
@ 2020-08-31 17:08             ` Taylor Simpson
  2020-08-31 17:29               ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-31 17:08 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail


> >> -----Original Message-----
> >> From: Richard Henderson <richard.henderson@linaro.org>
> >> Sent: Sunday, August 30, 2020 3:13 PM
> >> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> >> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> >> aleksandar.m.mail@gmail.com; ale@rev.ng
> >> Subject: Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for
> >> instructions with multiple definitions
> >>
> >> On 8/30/20 12:48 PM, Taylor Simpson wrote:
> >>> I'll add the following comment to indicate what's going on
> >>>
> >>> /*
> >>>   * Each of the generated helpers is wrapped with #ifndef
> >> fGEN_TCG_<tag>.
> >>>   * For example
> >>>    *     #ifndef fGEN_TCG_A2_add
> >>>    *     DEF_HELPER_3(A2_add, s32, env, s32, s32)
> >>>    *     #endif
> >>>   * We include gen_tcg.h here to provide definitions of fGEN_TCG for
> any
> >> instructions that
> >>>   * are overridden.
> >>>   *
> >>>   * This prevents unused helpers from taking up space in the executable.
> >>>   */
> >>
> >> Ah, hum.  Well.
> >>
> >> How about we figure out a way to communicate to the generator scripts
> >> which
> >> functions have been implemented "directly", and don't need to be
> >> generated at all?
> >>
> >> I don't see why we have to wait until the c preprocessor to remove this
> >> unused
> >> code, and the less we have of it, the better.
> >>
> >
> > A few reasons
> > - Makes it easy to override instruction helpers.  All one has to do is #define
> fGEN_TCG_<tag>.
>
> If the generator can examine, say, genptr_override.c.inc, then you don't
> even
> have to add a #define.  Just add the code.
>
> Perhaps something like
>
> DEFINE_FGEN(tag)
> {
>     // some code
> }
>
> where DEFINE_FGEN expands to the appropriate function declaration.  The
> generator then need only grep '^DEFINE_FGEN' and extract the list of
> overridden
> tags.
>
>
> > - When debugging the override, sometimes you want to quickly revert back
> to the helper.  Or if you've written a bunch of overrides in one shot and then
> find that a test case is failing, you can binary search which one to turn off and
> get the test to pass.  This is the one with the bug to fix.
>
> No difference there, just binary searching on different text.
>
> > - Reduces time for an incremental build.  When we add or delete an
> override, we don't have to re-run the generator.
>
> About this I care not at all.  I can't imagine that more than fractions of a
> second are at stake.

I can modify the generator to skip instructions with overrides.

There are some assumptions here I'd like to clarify.  When I started this discussion, there seemed to be general resistance to the concept of a generator.  Because of this, I made the generator as simple as possible and pushed the complexity and optimization into code that consumes the output of the generator.  Also, I assumed people wouldn't read the output of the generator, so I didn't focus on making it readable.

Now, it sounds like my assumptions were not correct.  You pointed out two things to do in the generator
- Expand the macros inline
- Skip the instructions that have overrides
In addition, there are other things I should do if we want the generated files to be more readable
- Indent the code
- Only generate one of the fGEN_TCG_<tag> or gen_helper_<tag>
- Generate the declaration of the generate_<tag> function instead of just the body

Thoughts?

Thanks,
Taylor


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-08-31 17:08             ` Taylor Simpson
@ 2020-08-31 17:29               ` Richard Henderson
  2020-08-31 18:14                 ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-31 17:29 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/31/20 10:08 AM, Taylor Simpson wrote:
> There are some assumptions here I'd like to clarify.  When I started this
> discussion, there seemed to be general resistance to the concept of a
> generator.  Because of this, I made the generator as simple as possible and
> pushed the complexity and optimization into code that consumes the output of
> the generator.  Also, I assumed people wouldn't read the output of the
> generator, so I didn't focus on making it readable.

Except, at some point, one has to debug this code.
If the code is unreadable, how do you figure out what's broken?

> Now, it sounds like my assumptions were not correct.  You pointed out two
> things to do in the generator
> - Expand the macros inline
> - Skip the instructions that have overrides

Yes please.

> In addition, there are other things I should do if we want the generated files to be more readable
> - Indent the code

Helpful, yes.

I wouldn't worry about nested statements within the *.def files too much,
except to put each ";" terminated statement on a new line, so that gdb's step
command goes to the next statement instead of skipping everything all at once.
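
A rough sketch of that splitting step, in Python since the generator is.  The function name is hypothetical and the approach is deliberately naive: it only tracks parenthesis depth so that the header of a for loop stays on one line, and it does not try to understand string literals or comments.

```python
def one_statement_per_line(semantics):
    """Put each ';'-terminated statement of a C fragment on its own line."""
    out, cur, depth = [], "", 0
    for ch in semantics:
        cur += ch
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == ";" and depth == 0:
            # End of a statement outside any parentheses: start a new line.
            out.append(cur.strip())
            cur = ""
    if cur.strip():
        out.append(cur.strip())
    return "\n".join(out)


print(one_statement_per_line("{ RdV=RsV+RtV; RxV=RdV; }"))
```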

> - Only generate one of the fGEN_TCG_<tag> or gen_helper_<tag>

That would be part of "skip the instructions that have overrides", would it not?

> - Generate the declaration of the generate_<tag> function instead of just the body

I'm not quite sure what this means.

Aren't the "generate_<tag>" functions private to genptr.c?  Why would we need a
separate declaration of them, as opposed to just a definition?


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* RE: [RFC PATCH v3 00/34] Hexagon patch series
  2020-08-30 23:33     ` Richard Henderson
@ 2020-08-31 17:57       ` Taylor Simpson
  2020-08-31 20:43         ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-31 17:57 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail


> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Sunday, August 30, 2020 5:33 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 00/34] Hexagon patch series
>
> I don't mind intermediate updates.  Just keep a list in the cover letter of the
> things that are still on the to-do list, and I'll not focus on those.
>
> We could also talk about what portions of the to-do list are blockers, and what
> can be done via normal development.  Because neither you nor I want to
> carry
> around this huge patch set forever.

OK, here's the list of items.  Let me know if I missed anything.  I'll indicate which ones can be done quickly and which ones would take more time.  I added a column for blocker if you or anyone else has input on that.

Patch            Item                                                                      Effort  Blocker
                 Use qemu softfloat                                                        ??      Yes
                 Use qemu decodetree.py                                                    ??
Several          Use const when appropriate                                                small
Several          Remove anything after g_assert_not_reached                                small
Several          Fix log_store32/64 add/remove/add in patch series                         small
Several          Follow naming guidelines for structs and enums                            small
04               Move decls to cpu-param.h                                                 small
04               Remove CONFIG_USER_ONLY ifdef's                                           small
04               Remove DEBUG_HEXAGON                                                      small
04               Remove stack pointer modification hack, use qemu mechanism                small
04               Add property x-lldb-compat to control output in log                       small
06               Include instruction and raw bytes in disassembly                          small
07               Use DEF_HELPER_FLAGS                                                      small
07, 26           Endianness of merge_bytes                                                 small
07               Fix overlap test                                                          small
07               Remove HELPER(debug_value)/HELPER(debug_value_i64)                        small
09               Include "qemu/osdep.h" instead of <stdint.h>                              small
10 (and others)  Stick with stdint.h types except in imported files                        small
11               Remove description from reg field definitions                             small
13               Move regmap.h into decode.c                                               small
14, 27           Use bit mask instead of strings in decoding                               small
14               Add comments to decoder                                                   small
16               Use qemu/int128.h                                                         medium
17               Squash patches dealing with imported files                                small
24               Use qemu/bitops.h for instruction attributes                              small
24               Fix initialization of opcode_short_semantics                              small
24               Change if (p == NULL) { g_assert_not_reached(); } to assert(p != NULL)    small
25               Expand DECL/READ/WRITE/FREE macros into generated code                    small
26               Rewrite fINSERT*, fEXTRACT*, f?XTN macros                                 small
26               Investigate fRND macro                                                    small
26               Change REG = REG to (VOID)REG to suppress compiler warning                small
27               Remove multiple includes of imported/iclass.def                           small
28               Move genptr_helpers.h into genptr.c                                       small
28               Remove unneeded temps                                                     small
28               Use tcg_gen_deposit_tl and tcg_gen_extract_tl when dealing with p3_0      small
29               Size opcode_genptr[] properly and initialize with [TAG] = generate_##TAG  small
30               Don't generate helpers for instructions that are overridden               small
                 Don't include "gen_tcg.h" in helper.h
31               Use bitmask for ctx->reg_log instead of an array                          small
31               Use tcg_gen_extract_i32 for gen_slot_cancelled_check                      small
31               Properly deal with reading instructions across a page boundary and too    medium
                 many instructions before finding end-of-packet
31               Don't set PC at the beginning of every packet                             medium
31               Don't set slot_cancelled unless needed                                    small
31               Don't set hex_pred_written unless needed                                  medium
31               Change cancelled variable to not local                                    small
31               Remove unnecessary temp                                                   small
31               Let tcg_gen_addi_tl check for zero                                        small
31               Move gen_exec_counters to end of TB instead of every packet               medium
31               Move end of TB handling to hexagon_tr_tb_stop                             small

^ permalink raw reply	[flat|nested] 122+ messages in thread

* RE: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-08-31 17:29               ` Richard Henderson
@ 2020-08-31 18:14                 ` Taylor Simpson
  2020-08-31 19:20                   ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-31 18:14 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Monday, August 31, 2020 11:29 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for
> instructions with multiple definitions
>
> On 8/31/20 10:08 AM, Taylor Simpson wrote:
> > There are some assumptions here I'd like to clarify.  When I started this
> > discussion, there seemed to be general resistance to the concept of a
> > generator.  Because of this, I made the generator as simple as possible and
> > pushed the complexity and optimization into code that consumes the
> output of
> > the generator.  Also, I assumed people wouldn't read the output of the
> > generator, so I didn't focus on making it readable.
>
> Except, at some point, one has to debug this code.
> If the code is unreadable, how do you figure out what's broken?
>
> > Now, it sounds like my assumptions were not correct.  You pointed out two
> > things to do in the generator
> > - Expand the macros inline
> > - Skip the instructions that have overrides
>
> Yes please.
>
> > In addition, there are other things I should do if we want the generated files to
> be more readable
> > - Indent the code
>
> Helpful, yes.
>
> I wouldn't worry about nested statements within the *.def files too much,
> except to put each ";" terminated statement on a new line, so that gdb's step
> command goes to the next statement instead of skipping everything all at
> once.
>
> > - Only generate one of the fGEN_TCG_<tag> or gen_helper_<tag>
>
> That would be part of "skip the instructions that have overrides", would it
> not?

Just to be explicit, the code that generates code is generated as
    #ifdef fGEN_TCG_A2_add
    fGEN_TCG_A2_add({ RdV=RsV+RtV;});
    #else
    do {
    gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
    } while (0);
    #endif
If we're going to have the generator know if there is an override, this would be more readable as either
    fGEN_TCG_A2_add({ RdV=RsV+RtV;});
or
    gen_helper_A2_add(RdV, cpu_env, RsV, RtV);


> > - Generate the declaration of the generate_<tag> function instead of just
> the body
>
> I'm not quite sure what this means.
>
> Aren't the "generate_<tag>" functions private to genptr.c?  Why would we
> need a
> separate declaration of them, as opposed to just a definition?

In genptr.c, there is
    #define DEF_TCG_FUNC(TAG, GENFN) \
    static void generate_##TAG(CPUHexagonState *env, DisasContext *ctx, \
                               insn_t *insn, packet_t *pkt) \
    { \
        GENFN \
    }
    #include "tcg_funcs_generated.h"
    #undef DEF_TCG_FUNC
The generated code only has the body of the function.  It would be more readable to move the static void generate_##TAG ... into the generated code.





^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-08-31 18:14                 ` Taylor Simpson
@ 2020-08-31 19:20                   ` Richard Henderson
  2020-08-31 23:10                     ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-31 19:20 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/31/20 11:14 AM, Taylor Simpson wrote:
> Just to be explicit, the code that generates code is generated as
>     #ifdef fGEN_TCG_A2_add
>     fGEN_TCG_A2_add({ RdV=RsV+RtV;});
>     #else
>     do {
>     gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
>     } while (0);
>     #endif
> If we're going to have the generator know if there is an override, this would be more readable as either
>     fGEN_TCG_A2_add({ RdV=RsV+RtV;});
> or
>     gen_helper_A2_add(RdV, cpu_env, RsV, RtV);

Not quite, see below.

> In genptr.c, there is
>     #define DEF_TCG_FUNC(TAG, GENFN) \
>     static void generate_##TAG(CPUHexagonState *env, DisasContext *ctx, \
>                                insn_t *insn, packet_t *pkt) \
>     { \
>         GENFN \
>     }
>     #include "tcg_funcs_generated.h"
>     #undef DEF_TCG_FUNC
> The generated code only has the body of the function.  It would be more
> readable to move the static void generate_##TAG ... into the generated
> code.

Yes.

The fGEN_TCG_A2_add macro does not require nor use that {...} argument.  What
it *does* need are the same arguments as are given to generate_<rtag>.  I
assume you are using those arguments implicitly in your current fGEN_TCG_<rtag>
instances?

It would be cleanest to only have the generate_* functions.

Either they are written by hand (replacing the current fGEN_TCG_*), or they are
generated.  In either case, there's just the one level of indirection from
opcode_genptr[].

I'd imagine

--- genptr.c

#define DEF_TCG_FUNC(TAG) \
static void generate_##TAG(CPUHexagonState *env, \
    DisasContext *ctx, insn_t *insn, packet_t *pkt)

/*
 * All IIDs with an explicit implementation,
 * overriding the auto-generated helper functions.
 */

DEF_TCG_FUNC(A2_add)
{
    /* { RdV=RsV+RtV;} */
    tcg_gen_add_tl(args...);
}

/*
 * Generate calls to the auto-generated helpers,
 * and slot everything into the opcode_genptr table.
 */
#include "genptr_generated.c.inc"

--- genptr_generated.c.inc

DEF_TCG_FUNC(A4_tlbmatch)
{
   gen_helper_A4_tlbmatch(args...);
}

// etc

const SemanticInsn opcode_genptr[] = {
    // All IID's, generated or not.
};

---

This leaves genptr.c as the file to grep for '^DEF_TCG_FUNC'.
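
To make the generator side of that concrete, here is a hedged sketch in Python.  The function name generate and the placeholder argument text are hypothetical.  It assumes the hand-written functions start with DEF_TCG_FUNC at the beginning of a line, as in the layout above, and it uses the [TAG] = generate_TAG initializer form for the table.

```python
import re


def generate(all_tags, genptr_c_text):
    """Emit genptr_generated.c.inc text for every tag without an override."""
    # The '#define DEF_TCG_FUNC(TAG)' line does not match: it starts with '#'.
    hand_written = set(
        re.findall(r"^DEF_TCG_FUNC\((\w+)\)", genptr_c_text, re.MULTILINE)
    )
    out = []
    for tag in all_tags:
        if tag in hand_written:
            continue  # explicit implementation exists, emit nothing
        out.append(f"DEF_TCG_FUNC({tag})")
        out.append("{")
        out.append(f"    gen_helper_{tag}(/* args */);")
        out.append("}")
        out.append("")
    # The table covers every tag, generated or not.
    out.append("const SemanticInsn opcode_genptr[] = {")
    for tag in all_tags:
        out.append(f"    [{tag}] = generate_{tag},")
    out.append("};")
    return "\n".join(out)


genptr_c = """\
#define DEF_TCG_FUNC(TAG) \\
static void generate_##TAG(CPUHexagonState *env, \\
    DisasContext *ctx, insn_t *insn, packet_t *pkt)

DEF_TCG_FUNC(A2_add)
{
    tcg_gen_add_tl(/* args */);
}
"""
print(generate(["A2_add", "A4_tlbmatch"], genptr_c))
```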


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 00/34] Hexagon patch series
  2020-08-31 17:57       ` Taylor Simpson
@ 2020-08-31 20:43         ` Richard Henderson
  2020-08-31 23:48           ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-08-31 20:43 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/31/20 10:57 AM, Taylor Simpson wrote:
> OK, here's the list of items.  Let me know if I missed anything.  I'll 
> indicate which ones can be done quickly and which ones would take more time.
> I added a column for blocker if you or anyone else has input on that.
> 
> PatchItemEffortBlocker
> Use qemu softfloat??Yes

Hmm, this table didn't render.  Below, yes/no for blocker column.

> Use qemu decodetree.py??

no

> SeveralUse const when appropriatesmall

yes

> SeveralRemove anything after g_assert_not_reachedsmall

yes

> SeveralFix log_store32/64 add/remove/add in patch seriessmall

yes

> SeveralFollow naming guidelines for structs and enumssmall

yes

> 04Move decls to cpu-param.hsmall

Yes.  The only reason this even compiled is that you don't do system mode yet.  ;-)

> 04Remove CONFIG_USER_ONLY ifdef'ssmall

yes

> 04Remove DEBUG_HEXAGONsmall

yes

> 04Remove stack pointer modification hack, use qemu mechanismsmall

yes

> 04Add property x-lldb-compat to control output in logsmall

yes

> 06Include instruction and raw bytes in disassemblysmall

yes

> 07Use DEF_HELPER_FLAGSsmall

no, but should be easy.

> 07, 26Endianness of merge_bytessmall

Yes, one way or another; hopefully by removing all of the merge_bytes and using
probe_read.

> 07Fix overlap testsmall

yes

> 07Remove HELPER(debug_value)/HELPER(debug_value_i64)small

yes

> 09Include "qemu/osdep.h" instead of <stdint.h>small

yes

> 10 (and others)Stick with stdint.h types except in imported filessmall

yes

> 11Remove description from reg field definitionssmall

yes

> 13Move regmap.h into decode.csmall

yes

> 14, 27Use bit mask instead of strings in decodingsmall

no, but should be easy.

> 14Add comments to decodersmall

yes

> 16Use qemu/int128.hmedium

no

> 17Squash patches dealing with imported filessmall

yes

> 24Use qemu/bitops.h for instruction attributessmall

no

> 24Fix initialization of opcode_short_semanticssmall

yes

> 24Change if (p == NULL) { g_assert_not_reached(); } to assert(p != NULL)small

no

> 25Expand DECL/READ/WRITE/FREE macros into generated codesmall

Yes.

I think some of these will in the end want to be helper functions.
As I was thinking how to best write A2_add, I was thinking

/* TODO: You currently have an "offset" parameter to
   DECL_REG.  I can't see that it is ever used?
   I would *hope* that this could be resolved earlier,
   so that by this time insn->regno[*] is absolute.  */
static int resolve_regno(Insn *insn, int slot, int off);

/* Return hex_new_value[regno]; log the write. */
static TCGv reg_for_write(DisasContext *ctx, int regno);

/* Return hex_reg[regno]; force up-to-date value for PC. */
static TCGv reg_for_read(DisasContext *ctx, int regno);

/* if (preg) hex_new_value[regno] = hex_reg[regno],
   or !preg if !test_positive.
   Leaves hex_new_value[] canonical for gen_reg_writes,
   no extra temporary required. */
static void gen_cancel_reg(DisasContext *ctx, int preg,
                           int rreg, bool test_positive);

DEF_TCG_FUNC(A2_add)
{
    int rd = resolve_regno(insn, 0, 0);
    int rs = resolve_regno(insn, 1, 0);
    int rt = resolve_regno(insn, 2, 0);

    tcg_gen_add_tl(reg_for_write(ctx, rd),
                   reg_for_read(ctx, rs),
                   reg_for_read(ctx, rt));
}

DEF_TCG_FUNC(A2_paddit)
{
    int pu = resolve_regno(insn, 0, 0);
    int rd = resolve_regno(insn, 1, 0);
    int rs = resolve_regno(insn, 2, 0);
    int rt = resolve_regno(insn, 3, 0);

    tcg_gen_add_tl(reg_for_write(ctx, rd),
                   reg_for_read(ctx, rs),
                   reg_for_read(ctx, rt));
    gen_cancel_reg(ctx, pu, rd, true);
}

However, I don't think we have to have a comprehensive set of these now.  We
can expand everything into the generator to start, then adjust the generator as
we add helper functions and use them within the overrides.
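
A standalone sketch of how such helpers might behave, modeled with plain C
integers rather than TCG ops.  SimState, cancel_reg, commit_packet and sim_padd
are hypothetical names, and reading test_positive as "the instruction executes
when the predicate's low bit is set" is an assumption:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGS  32
#define NUM_PREDS 4

/* Hypothetical stand-in for CPU state plus the per-packet write log. */
typedef struct {
    uint32_t reg[NUM_REGS];        /* committed values (hex_reg) */
    uint32_t new_value[NUM_REGS];  /* pending values (hex_new_value) */
    uint8_t  pred[NUM_PREDS];      /* predicate registers */
    uint32_t reg_log;              /* bit n set: reg n written in this packet */
} SimState;

/* Return the pending slot for regno and log the write. */
static uint32_t *reg_for_write(SimState *s, int regno)
{
    s->reg_log |= 1u << regno;
    return &s->new_value[regno];
}

/* Reads see the value from before the packet started. */
static uint32_t reg_for_read(const SimState *s, int regno)
{
    return s->reg[regno];
}

/* If the predicate test fails, restore the pending value to the old one. */
static void cancel_reg(SimState *s, int preg, int rreg, bool test_positive)
{
    bool pred_true = (s->pred[preg] & 1) != 0;

    if (pred_true != test_positive) {
        s->new_value[rreg] = s->reg[rreg];
    }
}

/* End of packet: copy every logged pending value into the register file. */
static void commit_packet(SimState *s)
{
    for (int i = 0; i < NUM_REGS; i++) {
        if (s->reg_log & (1u << i)) {
            s->reg[i] = s->new_value[i];
        }
    }
    s->reg_log = 0;
}

/* if (Pu) Rd = Rs + Rt, written the same way as the sketch above. */
static void sim_padd(SimState *s, int pu, int rd, int rs, int rt)
{
    *reg_for_write(s, rd) = reg_for_read(s, rs) + reg_for_read(s, rt);
    cancel_reg(s, pu, rd, true);
}
```

Because a cancelled write leaves new_value equal to the old register value,
the commit loop needs no separate "was this cancelled" check.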

> 26Rewrite fINSERT*, fEXTRACT*, f?XTN macrossmall

yes

> 26Investigate fRND macrosmall

no, but should be easy.

> 26Change REG = REG to (VOID)REG to suppress compiler warningsmall

yes

> 27Remove multiple includes of imported/iclass.defsmall

yes

> 28Move genptr_helpers.h into genptr.csmall

yes

> 28Remove unneeded tempssmall

no

> 28Use tcg_gen_deposit_tl and tcg_gen_extract_tl when dealing with p3_0small

no

> 29Size opcode_genptr[] properly and initialize with [TAG] = generate_##TAGsmall

yes; I think this will fall out of other changes.

> 30Don't generate helpers for instructions that are overriddensmall

yes

> Don't include "gen_tcg.h" in helper.h

yes

> 31Use bitmask for ctx->reg_log instead of an arraysmall

yes

> 31Use tcg_gen_extract_i32 for gen_slot_cancelled_checksmall

yes

> 31Properly deal with reading instructions across a page boundary and toomedium
> many instructions before finding end-of-packet

Yes, this should be easy.  Unless there's some surprise in the code, I have
already suggested code.

> 31Don't set PC at the beginning of every packetmedium

no

> 31Don't set slot_cancelled unless neededsmall

no

> 31Don't set hex_pred_written unless neededmedium

no

> 31Change cancelled variable to not localsmall

yes

> 31Remove unnecessary tempsmall

yes

> 31Let tcg_gen_addi_tl check for zerosmall

yes

> 31Move gen_exec_counters to end of TB instead of every packetmedium

no

> 31Move end of TB handling to hexagon_tr_tb_stopsmall

yes


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* RE: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-08-31 19:20                   ` Richard Henderson
@ 2020-08-31 23:10                     ` Taylor Simpson
  2020-09-01  2:40                       ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-08-31 23:10 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Monday, August 31, 2020 1:20 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for
> instructions with multiple definitions
>
> The fGEN_TCG_A2_add macro does not require nor use that {...} argument.

The fGEN_TCG_A2_add macro does not need that argument, but there are cases that do need it.  Here's an example from gen_tcg.h
    #define fGEN_TCG_L2_loadrub_pr(SHORTCODE)      SHORTCODE
This is explained in the README, but basically the argument is useful if we can properly define the macros that it contains to generate TCG.


> What it *does* need are the same arguments as are given to generate_<rtag>.  I
> assume you are using those arguments implicitly in your current
> fGEN_TCG_<rtag>
> instances?

Yes

>
> It would be cleanest to only have the generate_* functions.
>
> Either they are written by hand (replacing the current fGEN_TCG_*), or they
> are
> generated.  In either case, there's just the one level of indirection from
> opcode_genptr[].
>
> I'd imagine
>
> --- genptr.c
>
> #define DEF_TCG_FUNC(TAG) \
> static void generate_##TAG(CPUHexagonState *env, \
>     DisasContext *ctx, insn_t *insn, packet_t *pkt)
>
> /*
>  * All IIDs with an explicit implementation,
>  * overriding the auto-generated helper functions.
>  */
>
> DEF_TCG_FUNC(A2_add)
> {
>     /* { RdV=RsV+RtV;} */
>     tcg_gen_add_tl(args...);
> }

There's additional generated code before and after the tcg_gen_add_tl.  IMO, we don't want the person who writes an override having to reproduce the generated code.  Assuming we have a definition of fGEN_TCG_A2_add and we have the generator intelligently expanding the macros, this is what will be generated.

static void generate_A2_add(CPUHexagonState *env, DisasContext *ctx, insn_t *insn, packet_t *pkt)
{
/* A2_add */
int RdN = insn->regno[0];
TCGv RdV = tcg_temp_local_new();
int RsN = insn->regno[1];
TCGv RsV = hex_gpr[RsN];
int RtN = insn->regno[2];
TCGv RtV = hex_gpr[RtN];

fGEN_TCG_A2_add({ RdV=RsV+RtV;});

gen_log_reg_write(RdN, RdV, insn->slot, 0);
ctx_log_reg_write(ctx, RdN);

tcg_temp_free(RdV);
/* A2_add */
}

If there weren't an override, we'd get this

static void generate_A2_add(CPUHexagonState *env, DisasContext *ctx, insn_t *insn, packet_t *pkt)
{
/* A2_add */
int RdN = insn->regno[0];
TCGv RdV = tcg_temp_local_new();
int RsN = insn->regno[1];
TCGv RsV = hex_gpr[RsN];
int RtN = insn->regno[2];
TCGv RtV = hex_gpr[RtN];

gen_helper_A2_add(RdV, cpu_env, RsV, RtV);                    /* Only difference is this line */

gen_log_reg_write(RdN, RdV, insn->slot, 0);
ctx_log_reg_write(ctx, RdN);

tcg_temp_free(RdV);
/* A2_add */
}

The fGEN_TCG_<tag> macro could also take the operands of the instruction (RdV, RsV, RtV in this example) as explicit arguments.

Unlike the generate_<tag> functions, which all have the same signature, the overrides would then have different signatures.  This would be more defensive programming because you know exactly where the variables come from, but it is more verbose when writing the overrides by hand.  Also, note that these need to be macros in order to take advantage of the SHORTCODE.

In other words, instead of
#define fGEN_TCG_A2_add(SHORTCODE)    tcg_gen_add_tl(RdV, RsV, RtV)

We would write
#define fGEN_TCG_A2_add(env, ctx, insn, pkt, RdV, RsV, RtV, SHORTCODE)    tcg_gen_add_tl(RdV, RsV, RtV);

Personally, I prefer the former, but will change to the latter if you feel strongly.

I'm not married to the fGEN_TCG_<tag> name.  DEF_TCG_<tag> would also be fine.
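
To make the trade-off concrete, here is a small standalone sketch of the two
styles using plain ints instead of TCGv.  The OVERRIDE_* macro names and the
add_implicit/add_explicit wrappers are hypothetical; only the shape of the two
definitions mirrors the ones above:

```c
#include <assert.h>

/*
 * Style 1 (implicit): the override body names RdV, RsV and RtV directly and
 * relies on the expansion site to declare them.  SHORTCODE is accepted but
 * unused here.
 */
#define OVERRIDE_IMPLICIT_add(SHORTCODE) \
    (RdV = RsV + RtV)

/* Style 2 (explicit): every operand is a positional macro parameter. */
#define OVERRIDE_EXPLICIT_add(RdV, RsV, RtV, SHORTCODE) \
    ((RdV) = (RsV) + (RtV))

/* Plays the role of the generated function for style 1. */
static int add_implicit(int rs, int rt)
{
    int RdV = 0;
    int RsV = rs;
    int RtV = rt;

    /* Compiles only because the locals above use exactly these names. */
    OVERRIDE_IMPLICIT_add({ RdV = RsV + RtV; });
    return RdV;
}

/* Plays the role of the generated function for style 2. */
static int add_explicit(int rs, int rt)
{
    int dst = 0;

    /* Local names are free, but the argument order must match the macro. */
    OVERRIDE_EXPLICIT_add(dst, rs, rt, { dst = rs + rt; });
    return dst;
}
```

In the implicit style, renaming a local in the generated function turns into a
compile error at the override; in the explicit style, a swapped argument would
compile silently.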

>
> /*
>  * Generate calls to the auto-generated helpers,
>  * and slot everything into the opcode_genptr table.
>  */
> #include "genptr_generated.c.inc"
>
> --- genptr_generated.c.inc
>
> DEF_TCG_FUNC(A4_tlbmatch)
> {
>    gen_helper_A4_tlbmatch(args...);
> }
>
> // etc
>
> const SemanticInsn opcode_genptr[] = {
>     // All IID's, generated or not.
> };
>
> ---
>
> This leaves genptr.c as the file to grep for '^DEF_TCG_FUNC'.
>
>
> r~

^ permalink raw reply	[flat|nested] 122+ messages in thread

* RE: [RFC PATCH v3 00/34] Hexagon patch series
  2020-08-31 20:43         ` Richard Henderson
@ 2020-08-31 23:48           ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-08-31 23:48 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail


> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Monday, August 31, 2020 2:44 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 00/34] Hexagon patch series
>
> On 8/31/20 10:57 AM, Taylor Simpson wrote:
> > OK, here's the list of items.  Let me know if I missed anything.  I'll
> > indicate which ones can be done quickly and which ones would take more
> time.
> > I added a column for blocker if you or anyone else has input on that.
> >
> > PatchItemEffortBlocker
> > Use qemu softfloat??Yes
>
> Hmm, this table didn't render.  Below, yes/no for blocker column.

Sorry about that - not sure what happened.

I will work on all those you marked "yes" or "no, but should be easy".

> > 25Expand DECL/READ/WRITE/FREE macros into generated codesmall
>
> Yes.
>
> In the end I think some of these will in the end want to be helper functions.
> As I was thinking how to best write A2_add, I was thinking

See my response to the thread on patch 30/34.

Since you mention A2_paddit, here's what it would look like assuming it is overridden.

static void generate_A2_paddit(CPUHexagonState *env, DisasContext *ctx, insn_t *insn, packet_t *pkt)
{
/* A2_paddit */
int PuN = insn->regno[0];
TCGv PuV = hex_pred[PuN];
int RdN = insn->regno[1];
TCGv RdV = tcg_temp_local_new();
if (!is_preloaded(ctx, RdN)) {
    tcg_gen_mov_tl(hex_new_value[RdN], hex_gpr[RdN]);
}
int RsN = insn->regno[2];
TCGv RsV = hex_gpr[RsN];
int siV = insn->immed[0];

fGEN_TCG_A2_paddit({if(fLSBOLD(PuV)){fIMMEXT(siV); RdV=RsV+siV;} else {CANCEL;}});

gen_log_reg_write(RdN, RdV, insn->slot, 1);   /* Only does the write if we haven't cancelled */
ctx_log_reg_write(ctx, RdN);

tcg_temp_free(RdV);
/* A2_paddit */
}

Here's what the override looks like (there are a bunch of these, so we have a helper macro which could also be a function)
/* Predicated add instructions */
#define GEN_TCG_padd(PRED, ADD) \
    do { \
        TCGv LSB = tcg_temp_new(); \
        TCGv mask = tcg_temp_new(); \
        TCGv zero = tcg_const_tl(0); \
        PRED; \
        ADD; \
        tcg_gen_movi_tl(mask, 1 << insn->slot); \
        tcg_gen_or_tl(mask, hex_slot_cancelled, mask); \
        tcg_gen_movcond_tl(TCG_COND_NE, hex_slot_cancelled, LSB, zero, \
                                               hex_slot_cancelled, mask); \
        tcg_temp_free(LSB); \
        tcg_temp_free(mask); \
        tcg_temp_free(zero); \
    } while (0)

#define fGEN_TCG_A2_paddit(SHORTCODE) \
    GEN_TCG_padd(fLSBOLD(PuV), tcg_gen_addi_tl(RdV, RsV, siV))
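
The slot-cancel update at the end of GEN_TCG_padd can be read as the following
plain C, assuming the usual movcond semantics (result = cond(c1, c2) ? v1 : v2).
The function names are hypothetical and the values are ordinary integers
rather than TCG temporaries:

```c
#include <assert.h>
#include <stdint.h>

/* Integer model of a movcond with the NE condition. */
static uint32_t movcond_ne(uint32_t c1, uint32_t c2, uint32_t v1, uint32_t v2)
{
    return c1 != c2 ? v1 : v2;
}

/*
 * If the predicate's low bit is set, the instruction executes and the
 * cancel mask is left alone; otherwise the bit for this slot is set.
 */
static uint32_t update_slot_cancelled(uint32_t slot_cancelled,
                                      uint32_t pred_lsb, int slot)
{
    uint32_t mask = slot_cancelled | (1u << slot);

    return movcond_ne(pred_lsb, 0, slot_cancelled, mask);
}
```

Computing the OR unconditionally and selecting afterwards keeps the sequence
branch-free, which is what the macro does with tcg_gen_or_tl followed by
tcg_gen_movcond_tl.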

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-08-31 23:10                     ` Taylor Simpson
@ 2020-09-01  2:40                       ` Richard Henderson
  2020-09-01  4:17                         ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-09-01  2:40 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/31/20 4:10 PM, Taylor Simpson wrote:
> 
> 
>> -----Original Message-----
>> From: Richard Henderson <richard.henderson@linaro.org>
>> Sent: Monday, August 31, 2020 1:20 PM
>> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
>> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
>> aleksandar.m.mail@gmail.com; ale@rev.ng
>> Subject: Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for
>> instructions with multiple definitions
>>
>> The fGEN_TCG_A2_add macro does not require nor use that {...} argument.
> 
> The fGEN_TCG_A2_add macro does not need that argument, but there are cases that
> do need it.  Here's an example from gen_tcg.h
>     #define fGEN_TCG_L2_loadrub_pr(SHORTCODE)      SHORTCODE
> This is explained in the README, but basically the argument is useful if we
> can properly define the macros that it contains to generate TCG.

We're certainly not going to be able to handle e.g. "+" or "if", so it is going
to work only for the most trivial of SHORTCODE.

Though in fact loadrub_pr makes that grade...

> IMO, we don't want the person who writes an override having to reproduce the 
> generated code.  Assuming we have a definition of fGEN_TCG_A2_add and we
> have the generator intelligently expanding the macros, this is what will be
> generated.

You need to give me a better example than A2_add, then.  Because I see that
being exactly one line, calling a helper that handles all instructions of the
same format, passing tcg_gen_add_tl as a callback.

Have a browse through my recent microblaze decodetree conversion.  Note that
the basic logical operations are implemented with exactly one source line.

> Unlike the generate_<tag> functions that all have the same signature.  The overrides would have different signatures.  This would be more defensive programming because you know exactly where the variables come from but more verbose when writing the overrides by hand.  Also, note that these need to be macros in order to take advantage of the SHORTCODE.
> 
> In other words, instead of
> #define fGEN_TCG_A2_add(SHORTCODE)    tcg_gen_add_tl(RdV, RsV, RtV)
> 
> We would write
> #define fGEN_TCG_A2_add(env, ctx, insn, pkt, RdV, RsV, RtV, SHORTCODE)    tcg_gen_add_tl(RdV, RsV, RtV);
> 
> Personally, I prefer the former, but will change to the latter if you feel strongly.

This comes from trying to handle instructions in different ways, but represent
them all the same.

I guess I see the attraction of the magic non-parameters -- you get a
compilation error if the variable is not present, but are not tied to
positional parameters.

Ho hum.  Maybe I'm trying to overthink this too much before tackling the
ultimate goal of full parsing of the SHORTCODE.

Perhaps the only thing for the short term is to have the generator grep
genptr.c for "#define fGEN", to choose between the two alternatives: inline
generation or out-of-line helper generation.
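
A rough sketch of that selection step, assuming the generator has the override
file's text in memory.  has_override and emit_call are hypothetical names, the
match is a plain substring search that expects exactly one space after
"#define", and the helper call is hard-wired to the three-operand form purely
for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if source contains "#define fGEN_TCG_<tag>(". */
static bool has_override(const char *source, const char *tag)
{
    char needle[128];
    int n = snprintf(needle, sizeof(needle), "#define fGEN_TCG_%s(", tag);

    if (n < 0 || (size_t)n >= sizeof(needle)) {
        return false;   /* tag too long for this sketch */
    }
    /* The trailing '(' keeps A2_add from matching A2_addi. */
    return strstr(source, needle) != NULL;
}

/* Write either the inline override or the out-of-line helper call. */
static int emit_call(char *buf, size_t len, const char *source,
                     const char *tag, const char *shortcode)
{
    if (has_override(source, tag)) {
        return snprintf(buf, len, "fGEN_TCG_%s(%s);", tag, shortcode);
    }
    return snprintf(buf, len, "gen_helper_%s(RdV, cpu_env, RsV, RtV);", tag);
}
```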


r~


^ permalink raw reply	[flat|nested] 122+ messages in thread

* RE: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-09-01  2:40                       ` Richard Henderson
@ 2020-09-01  4:17                         ` Taylor Simpson
  2020-09-24  2:56                           ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-09-01  4:17 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Monday, August 31, 2020 8:41 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for
> instructions with multiple definitions
>
> On 8/31/20 4:10 PM, Taylor Simpson wrote:
> >
> >
> >> -----Original Message-----
> >> From: Richard Henderson <richard.henderson@linaro.org>
> >> Sent: Monday, August 31, 2020 1:20 PM
> >> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> >> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> >> aleksandar.m.mail@gmail.com; ale@rev.ng
> >> Subject: Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for
> >> instructions with multiple definitions
> >>
> >> The fGEN_TCG_A2_add macro does not require nor use that {...}
> argument.
> >
> > The fGEN_TCG_A2_add macro does not need that argument, but there are
> cases that
> > do need it.  Here's an example from gen_tcg.h
> >     #define fGEN_TCG_L2_loadrub_pr(SHORTCODE)      SHORTCODE
> > This is explained in the README, but basically the argument is useful if we
> > can properly define the macros that it contains to generate TCG.
> We're certainly not going to be able to handle e.g. "+" or "if", so it is going
> to work only for the most trivial of SHORTCODE.
>
> Though in fact loadrub_pr makes that grade...

The prior version of this series included all the overrides I've written to date.  To reduce the size of this version, I removed most of them and only left the ones that are essential for correct execution.  I plan to submit the others in subsequent series.  Anyway, there are >50 overrides of load/store instructions that leverage SHORTCODE.

> > IMO, we don't want the person who writes an override having to
> reproduce the
> > generated code.  Assuming we have a definition of fGEN_TCG_A2_add and
> we
> > have the generator intelligently expanding the macros, this is what will be
> > generated.
> You need to give me a better example than A2_add, then.  Because I see that
> being exactly one line, calling a helper that handles all instructions of the
> same format, passing tcg_gen_add_tl as a callback.

Here's a more complicated example for a predicated post-increment load

static void generate_L2_ploadrit_pi(CPUHexagonState *env, DisasContext *ctx, insn_t *insn, packet_t *pkt)
{
/* L2_ploadrit_pi */
TCGv EA = tcg_temp_local_new();
int PtN = insn->regno[0];
TCGv PtV = hex_pred[PtN];
int RdN = insn->regno[1];
TCGv RdV = tcg_temp_local_new();
if (!is_preloaded(ctx, RdN)) {
    tcg_gen_mov_tl(hex_new_value[RdN], hex_gpr[RdN]);
}
int RxN = insn->regno[2];
TCGv RxV = tcg_temp_local_new();
if (!is_preloaded(ctx, RxN)) {
    tcg_gen_mov_tl(hex_new_value[RxN], hex_gpr[RxN]);
}
int siV = insn->immed[0];
tcg_gen_mov_tl(RxV, hex_gpr[RxN]);
fGEN_TCG_L2_ploadrit_pi({fEA_REG(RxV); if(fLSBOLD(PtV)){ fPM_I(RxV,siV); fLOAD(1,4,u,EA,RdV);} else {LOAD_CANCEL(EA);}});
gen_log_reg_write(RdN, RdV, insn->slot, 1);
gen_log_reg_write(RxN, RxV, insn->slot, 1);
tcg_temp_free(EA);
tcg_temp_free(RdV);
tcg_temp_free(RxV);
/* L2_ploadrit_pi */
}


> Have a browse through my recent microblaze decodetree conversion.  Note
> that
> the basic logical operations are implemented with exactly one source line.

With a helper function, our compares are all one line also

static inline void gen_compare(TCGCond cond, TCGv res, TCGv arg1, TCGv arg2)
{
    TCGv one = tcg_const_tl(0xff);
    TCGv zero = tcg_const_tl(0);

    tcg_gen_movcond_tl(cond, res, arg1, arg2, one, zero);

    tcg_temp_free(one);
    tcg_temp_free(zero);
}

/* Compare instructions */
#define fGEN_TCG_C2_cmpeq(SHORTCODE) \
    gen_compare(TCG_COND_EQ, PdV, RsV, RtV)
#define fGEN_TCG_C4_cmpneq(SHORTCODE) \
    gen_compare(TCG_COND_NE, PdV, RsV, RtV)
#define fGEN_TCG_C2_cmpgt(SHORTCODE) \
    gen_compare(TCG_COND_GT, PdV, RsV, RtV)
#define fGEN_TCG_C2_cmpgtu(SHORTCODE) \
    gen_compare(TCG_COND_GTU, PdV, RsV, RtV)
...
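
For reference, the value gen_compare leaves in the predicate can be modeled in
plain C as below.  The Cond enum and compare_pred are hypothetical stand-ins
for TCGCond and the generated TCG; the point is only that a true comparison
yields 0xff (all eight predicate bits set) and that GT and GTU differ in
signedness:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { COND_EQ, COND_NE, COND_GT, COND_GTU } Cond;

/* Integer model of gen_compare: 0xff when the condition holds, else 0. */
static uint32_t compare_pred(Cond cond, uint32_t a, uint32_t b)
{
    bool t;

    switch (cond) {
    case COND_EQ:
        t = a == b;
        break;
    case COND_NE:
        t = a != b;
        break;
    case COND_GT:
        t = (int32_t)a > (int32_t)b;   /* signed comparison */
        break;
    case COND_GTU:
        t = a > b;                     /* unsigned comparison */
        break;
    default:
        t = false;
        break;
    }
    return t ? 0xff : 0;
}
```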



> > Unlike the generate_<tag> functions that all have the same signature.  The
> overrides would have different signatures.  This would be more defensive
> programming because you know exactly where the variables come from but
> more verbose when writing the overrides by hand.  Also, note that these
> need to be macros in order to take advantage of the SHORTCODE.
> >
> > In other words, instead of
> > #define fGEN_TCG_A2_add(SHORTCODE)    tcg_gen_add_tl(RdV, RsV, RtV)
> >
> > We would write
> > #define fGEN_TCG_A2_add(env, ctx, insn, pkt, RdV, RsV, RtV,
> SHORTCODE)    tcg_gen_add_tl(RdV, RsV, RtV);
> >
> > Personally, I prefer the former, but will change to the latter if you feel
> strongly.
>
> This comes from trying to handle instructions in different ways, but
> represent
> them all the same.
>
> I guess I see the attraction of the magic non-parameters -- you get a
> compilation error if the variable is not present, but are not tied to
> positional parameters.
>
> Ho hum.  Maybe I'm trying to overthink this too much before tackling the
> ultimate goal of full parsing of the SHORTCODE.

Alessandro (ale@rev.ng) and Niccolo (nizzo@rev.ng) are working on this 😊

> Perhaps the only thing for the short term is to have the generator grep
> genptr.c for "#define fGEN", to choose between the two alternatives: inline
> generation or out-of-line helper generation.

That's certainly doable.  It will also be good to implement some of your other ideas
- Having the generator expand the DECL/READ/WRITE/FREE macros will make the generated code more readable, and we can specialize them for predicated vs non-predicated instructions, which will make translation faster.
- Generating the entire generate_<tag> function instead of just the body will make the generated code more readable.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 33/34] Hexagon (tests/tcg/hexagon) TCG tests
  2020-08-29  3:05   ` Richard Henderson
@ 2020-09-01  9:57     ` Alessandro Di Federico
  0 siblings, 0 replies; 122+ messages in thread
From: Alessandro Di Federico @ 2020-09-01  9:57 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: philmd, riku.voipio, laurent, Taylor Simpson, Alex Bennée,
	aleksandar.m.mail

On Fri, 28 Aug 2020 20:05:44 -0700
Richard Henderson <richard.henderson@linaro.org> wrote:

> Could you please work with Alex Bennee to set up a
> tests/docker/dockerfile/ script containing the cross-compiler from
> the Qualcomm SDK?  That way these tests can be run automatically.
> 
> Compare debian-xtensa-cross.docker, which is similar.

We already have something similar, but it didn't make it in this
patchset. We have put effort into putting together a fully open source
toolchain.

It takes a while to build, but we'll provide a pre-built image on
dockerhub.
Eventually, upstream LLVM and musl will be in sync and it will be no
longer necessary to build it by hand.

-- 
Alessandro Di Federico
rev.ng

P.S. Richard: thanks a lot for the thorough reviews.


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC PATCH v3 00/34] Hexagon patch series
  2020-08-30 20:47   ` Taylor Simpson
  2020-08-30 23:33     ` Richard Henderson
@ 2020-09-07  9:49     ` Rob Landley
  2020-09-15  0:41       ` [EXT] " Brian Cain
  1 sibling, 1 reply; 122+ messages in thread
From: Rob Landley @ 2020-09-07  9:49 UTC (permalink / raw)
  To: Taylor Simpson, Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 8/30/20 3:47 PM, Taylor Simpson wrote:
> Richard,
> 
> Thank you so much for the feedback.  I really appreciate it.
> 
> I'll get to work addressing the issues.  Since some of the items will take longer than others, please advise whether it's preferred to send intermediate updates or wait until they are all addressed.
> 
> Taylor

Which branch of https://github.com/quic/qemu do I pull to try this without
scraping 30 patches out of the mailing list?

>>> Once the series is applied, the Hexagon port will pass "make check-tcg".
>>> The series also includes Hexagon-specific tests in tcg/tests/hexagon.
>>>
>>> We have a parallel effort to make the Hexagon Linux toolchain inside a
>> docker
>>> container publically available.
>>
>> Oh, excellent.

Is that a follow-up to https://www.spinics.net/lists/linux-hexagon/msg01706.html
and is this a clang toolchain or a gcc toolchain?

I've noticed gcc 9.2 has hexagon in config.guess and config.sub, but the only
other file outside of the test suite containing the word "hexagon" in a case
insensitive match is the Changelog saying Linas Vepstas added it to config.guess
and config.sub in 2011. And while https://github.com/quic has a musl fork it
doesn't have any compiler or linker forks...

Rob


^ permalink raw reply	[flat|nested] 122+ messages in thread

* RE: [EXT] Re: [RFC PATCH v3 00/34] Hexagon patch series
  2020-09-07  9:49     ` Rob Landley
@ 2020-09-15  0:41       ` Brian Cain
  0 siblings, 0 replies; 122+ messages in thread
From: Brian Cain @ 2020-09-15  0:41 UTC (permalink / raw)
  To: Rob Landley, Taylor Simpson, Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

> -----Original Message-----
> From: Qemu-devel <qemu-devel-bounces+bcain=quicinc.com@nongnu.org>
> On Behalf Of Rob Landley
...
> 
> On 8/30/20 3:47 PM, Taylor Simpson wrote:
> > Richard,
> >
> > Thank you so much for the feedback.  I really appreciate it.
> >
> > I'll get to work addressing the issues.  Since some of the items will take longer
> than others, please advise whether it's preferred to send intermediate updates
> or wait until they are all addressed.
> >
> > Taylor
> 
> Which branch of https://github.com/quic/qemu do I pull to try this without
> scraping 30 patches out of the mailing list?

IIRC this patch series was "small_series_v3"

> >>> Once the series is applied, the Hexagon port will pass "make check-tcg".
> >>> The series also includes Hexagon-specific tests in tcg/tests/hexagon.
> >>>
> >>> We have a parallel effort to make the Hexagon Linux toolchain inside
> >>> a
> >> docker
> >>> container publically available.
> >>
> >> Oh, excellent.
> 
> Is that a follow-up to https://www.spinics.net/lists/linux-
> hexagon/msg01706.html
> and is this a clang toolchain or a gcc toolchain?

It's a clang toolchain.

> I've noticed gcc 9.2 has hexagon in config.guess and config.sub, but the only
> other file outside of the test suite containing the word "hexagon" in a case
> insensitive match is the Changelog saying Linas Vepstas added it to config.guess
> and config.sub in 2011. And while https://github.com/quic has a musl fork it
> doesn't have any compiler or linker forks...

clang and lld support for hexagon exists in the upstream llvm-project repo.

-Brian

^ permalink raw reply	[flat|nested] 122+ messages in thread

* RE: [RFC PATCH v3 34/34] Hexagon build infrastructure
  2020-08-29  3:19   ` Richard Henderson
@ 2020-09-24  2:35     ` Taylor Simpson
  2020-09-25 16:59       ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-09-24  2:35 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Friday, August 28, 2020 9:20 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 34/34] Hexagon build infrastructure
>
> This will have to be updated for the meson conversion.
>
> I don't understand it all myself, and all of those generated files will need
> special attention.
>

I've made the changes for meson, including converting target/hexagon/Makefile.objs to target/hexagon/meson.build, and I can build qemu-hexagon with
    mkdir build
    cd build
    ../configure --target-list=hexagon-linux-user
    make

However, when I run "make check-tcg", nothing actually happens.
      BUILD   TCG tests for hexagon-linux-user
    Generating qemu-version.h with a meson_exe.py custom command
      RUN     TCG tests for hexagon-linux-user

What am I missing?  Has some other command replaced "make check-tcg"?

Thanks,
Taylor



^ permalink raw reply	[flat|nested] 122+ messages in thread

* RE: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-09-01  4:17                         ` Taylor Simpson
@ 2020-09-24  2:56                           ` Taylor Simpson
  2020-09-24 15:03                             ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-09-24  2:56 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> > On 8/31/20 4:10 PM, Taylor Simpson wrote:
> > >
> > >
> > >> -----Original Message-----
> > >> From: Richard Henderson <richard.henderson@linaro.org>
> > >> Sent: Monday, August 31, 2020 1:20 PM
> > >> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> > >> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> > >> aleksandar.m.mail@gmail.com; ale@rev.ng
> > >> Subject: Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for
> > >> instructions with multiple definitions
> > >>
> > Ho hum.  Maybe I'm trying to overthink this too much before tackling the
> > ultimate goal of full parsing of the SHORTCODE.
> > Perhaps the only thing for the short term is to have the generator grep
> > genptr.c for "#define fGEN", to choose between the two alternatives:
> inline
> > generation or out-of-line helper generation.
>
> That's certainly doable.  It will also be good to implement some of your other
> ideas
> - Having the generator expand the DECL/READ/WRITE/FREE macros will make
> the generated code more readable and we can specialize them for
> predicated vs non-predicated instructions which will make translation faster.
> - Generating the entire generate_<tag> function instead of just the body will
> make the generated code more readable.

I've made these changes to the generator.  I hope you like the results.  As an example, here is what we generate for the add instruction

DEF_TCG_FUNC(A2_add,
static void generate_A2_add(
                CPUHexagonState *env,
                DisasContext *ctx,
                insn_t *insn,
                packet_t *pkt)
{
    TCGv RdV = tcg_temp_local_new();
    const int RdN = insn->regno[0];
    TCGv RsV = hex_gpr[insn->regno[1]];
    TCGv RtV = hex_gpr[insn->regno[2]];
    gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
    gen_log_reg_write(RdN, RdV);
    ctx_log_reg_write(ctx, RdN);
    tcg_temp_free(RdV);
})

And here is how the generated file gets used in genptr.c

#define DEF_TCG_FUNC(TAG, GENFN) \
    GENFN
#include "tcg_funcs_generated.h"
#undef DEF_TCG_FUNC

/*
 * Not all opcodes have generate_<tag> functions, so initialize
 * the table from the tcg_funcs_generated.h file.
 */
const semantic_insn_t opcode_genptr[XX_LAST_OPCODE] = {
#define DEF_TCG_FUNC(TAG, GENFN) \
    [TAG] = generate_##TAG,
#include "tcg_funcs_generated.h"
#undef DEF_TCG_FUNC
};

I've also addressed several of the items from Richard's review, so I'll resubmit the series once I figure out how to get "make check-tcg" working under meson.

Thanks,
Taylor



* Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-09-24  2:56                           ` Taylor Simpson
@ 2020-09-24 15:03                             ` Richard Henderson
  2020-09-24 17:18                               ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-09-24 15:03 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 9/23/20 7:56 PM, Taylor Simpson wrote:
> 
> 
>>> On 8/31/20 4:10 PM, Taylor Simpson wrote:
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Richard Henderson <richard.henderson@linaro.org>
>>>>> Sent: Monday, August 31, 2020 1:20 PM
>>>>> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
>>>>> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
>>>>> aleksandar.m.mail@gmail.com; ale@rev.ng
>>>>> Subject: Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for
>>>>> instructions with multiple definitions
>>>>>
>>> Ho hum.  Maybe I'm trying to overthink this too much before tackling the
>>> ultimate goal of full parsing of the SHORTCODE.
>>> Perhaps the only thing for the short term is to have the generator grep
>>> genptr.c for "#define fGEN", to choose between the two alternatives:
>>> inline generation or out-of-line helper generation.
>>
>> That's certainly doable.  It will also be good to implement some of your other
>> ideas
>> - Have the generator expand the DECL/READ/WRITE/FREE macros will make
>> the generated code more readable and we can specialize them for
>> predicated vs non-predicated instructions which will make translation faster.
>> - Generate the entire generate_<tag> function instead of just the body will
>> make the generated code more readable.
> 
> I've made these changes to the generator.  I hope you like the results.  As an example, here is what we generate for the add instruction
> 
> DEF_TCG_FUNC(A2_add,
> static void generate_A2_add(
>                 CPUHexagonState *env,
>                 DisasContext *ctx,
>                 insn_t *insn,
>                 packet_t *pkt)
> {
>     TCGv RdV = tcg_temp_local_new();
>     const int RdN = insn->regno[0];
>     TCGv RsV = hex_gpr[insn->regno[1]];
>     TCGv RtV = hex_gpr[insn->regno[2]];
>     gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
>     gen_log_reg_write(RdN, RdV);
>     ctx_log_reg_write(ctx, RdN);
>     tcg_temp_free(RdV);
> })

I would be happier if the entire function body were not in a macro.  Have you
tried to set a gdb breakpoint in one of these?  Once upon a time at least, this
would have resulted in all lines of the function becoming one "source line" in
the debug info.

I also think the full function prototype is unnecessary, and the replication of
"A2_add" undesirable.

I would prefer the function prototype itself to be macro-ized.

E.g.

DEF_TCG_FUNC(A2_add)
{
    TCGv RdV = tcg_temp_local_new();
    const int RdN = insn->regno[0];
    TCGv RsV = hex_gpr[insn->regno[1]];
    TCGv RtV = hex_gpr[insn->regno[2]];
    gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
    gen_log_reg_write(RdN, RdV);
    ctx_log_reg_write(ctx, RdN);
    tcg_temp_free(RdV);
}

with

#define DEF_TCG_FUNC(TAG)                             \
    static void generate_##TAG(CPUHexagonState *env,  \
                               DisasContext *ctx,     \
                               insn_t *insn,          \
                               packet_t *pkt)

> And here is how the generated file gets used in genptr.c
> 
> #define DEF_TCG_FUNC(TAG, GENFN) \
>     GENFN
> #include "tcg_funcs_generated.h"
> #undef DEF_TCG_FUNC
> 
> /*
>  * Not all opcodes have generate_<tag> functions, so initialize
>  * the table from the tcg_funcs_generated.h file.
>  */
> const semantic_insn_t opcode_genptr[XX_LAST_OPCODE] = {
> #define DEF_TCG_FUNC(TAG, GENFN) \
>     [TAG] = generate_##TAG,
> #include "tcg_funcs_generated.h"
> #undef DEF_TCG_FUNC
> };

Obviously, the macro I propose above cannot be directly reused, as you do here.
But I also think we should not try to do so.

You've got a script generating stuff.  It can just as easily generate two
different lists.  You're trying to do too much with the C preprocessor and too
little with python.
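
For illustration only, the "two different lists" idea could look roughly like the sketch below.  It is not the actual generator: the function names gen_tcg_func, gen_table_entry and generate are hypothetical, and the body lines are passed in as ready-made strings rather than derived from the instruction semantics.

```python
# Hypothetical sketch: emit the function definitions and the table
# initializers as two separate outputs, instead of expanding a single
# C macro in two different ways.

SIGNATURE = (
    "static void generate_{tag}(CPUHexagonState *env, DisasContext *ctx,\n"
    "                           insn_t *insn, packet_t *pkt)"
)


def gen_tcg_func(tag, body_lines):
    """Return the C text of one generate_<tag> function."""
    lines = [SIGNATURE.format(tag=tag), "{"]
    lines += ["    " + line for line in body_lines]
    lines.append("}")
    return "\n".join(lines) + "\n"


def gen_table_entry(tag):
    """Return the designated initializer for one opcode_genptr[] slot."""
    return "    [{tag}] = generate_{tag},\n".format(tag=tag)


def generate(tags_with_bodies):
    """Build both outputs as strings: (function definitions, table entries)."""
    funcs = "\n".join(gen_tcg_func(t, b) for t, b in tags_with_bodies)
    table = "".join(gen_table_entry(t) for t, _ in tags_with_bodies)
    return funcs, table
```

With this split, genptr.c would include one generated file at file scope and the other inside the opcode_genptr initializer, and no function body ends up inside a macro argument.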

At some point in the v3 thread, I had suggested grepping for some macro in
order to indicate to the python script which tags are implemented manually.  My
definition above is easy to look for: exactly one thing on the line, easy regexp.

> I've also addressed several of the items from Richard's review, so I'll resubmit the series once I figure out how to get "make check-tcg" working under meson.

Yes, make check-tcg is currently broken, as are a few other check-foo.  I've
not yet had the courage to look into it, hoping that someone else will do it first.


r~



* RE: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-09-24 15:03                             ` Richard Henderson
@ 2020-09-24 17:18                               ` Taylor Simpson
  2020-09-24 19:04                                 ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-09-24 17:18 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Thursday, September 24, 2020 9:04 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for
> instructions with multiple definitions
>
> >
> > I've made these changes to the generator.  I hope you like the results.  As
> an example, here is what we generate for the add instruction
> >
> > DEF_TCG_FUNC(A2_add,
> > static void generate_A2_add(
> >                 CPUHexagonState *env,
> >                 DisasContext *ctx,
> >                 insn_t *insn,
> >                 packet_t *pkt)
> > {
> >     TCGv RdV = tcg_temp_local_new();
> >     const int RdN = insn->regno[0];
> >     TCGv RsV = hex_gpr[insn->regno[1]];
> >     TCGv RtV = hex_gpr[insn->regno[2]];
> >     gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
> >     gen_log_reg_write(RdN, RdV);
> >     ctx_log_reg_write(ctx, RdN);
> >     tcg_temp_free(RdV);
> > })
>
> I would be happier if the entire function body were not in a macro.  Have you
> tried to set a gdb breakpoint in one of these?  Once upon a time at least, this
> would have resulted in all lines of the function becoming one "source line" in
> the debug info.

Good point.  It still comes out as a single line.

> I also think the full function prototype is unnecessary, and the replication of
> "A2_add" undesirable.
>
> I would prefer the function prototype itself to be macro-ized.
>
> E.g.
>
> DEF_TCG_FUNC(A2_add)
> {
>     TCGv RdV = tcg_temp_local_new();
>     const int RdN = insn->regno[0];
>     TCGv RsV = hex_gpr[insn->regno[1]];
>     TCGv RtV = hex_gpr[insn->regno[2]];
>     gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
>     gen_log_reg_write(RdN, RdV);
>     ctx_log_reg_write(ctx, RdN);
>     tcg_temp_free(RdV);
> }
>
> with
>
> #define DEF_TCG_FUNC(TAG)                             \
>     static void generate_##TAG(CPUHexagonState *env,  \
>                                DisasContext *ctx,     \
>                                insn_t *insn,          \
>                                packet_t *pkt)
>
> > And here is how the generated file gets used in genptr.c
> >
> > #define DEF_TCG_FUNC(TAG, GENFN) \
> >     GENFN
> > #include "tcg_funcs_generated.h"
> > #undef DEF_TCG_FUNC
> >
> > /*
> >  * Not all opcodes have generate_<tag> functions, so initialize
> >  * the table from the tcg_funcs_generated.h file.
> >  */
> > const semantic_insn_t opcode_genptr[XX_LAST_OPCODE] = {
> > #define DEF_TCG_FUNC(TAG, GENFN) \
> >     [TAG] = generate_##TAG,
> > #include "tcg_funcs_generated.h"
> > #undef DEF_TCG_FUNC
> > };
>
> Obviously, the macro I propose above cannot be directly reused, as you do
> here.  But I also think we should not try to do so.
>
> You've got a script generating stuff.  It can just as easily generate two
> different lists.  You're trying to do too much with the C preprocessor and too
> little with python.

Sure, generating two lists was also suggested by Alessandro (ale@rev.ng).  However, doing more in python and less with the C preprocessor would lead to the conclusion that we shouldn't define the function prototype in a macro either.  If we generate two lists, what is the advantage of putting the function signature in a macro vs generating it?

>
> At some point in the v3 thread, I had suggested grepping for some macro in
> order to indicate to the python script which tags are implemented manually.
> My definition above is easy to look for: exactly one thing on the line, easy
> regexp.

This is already done as well.  As you may recall, we were previously generating:
    #ifdef fGEN_TCG_A2_add
    fGEN_TCG_A2_add({ RdV=RsV+RtV;});
    #else
    do {
    gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
    } while (0);
    #endif

The generator now searches for "#define fGEN_TCG_<tag>" and generates different code depending on what it finds.  This version of the series only contains overrides that are required for correct execution, so A2_add isn't among them.  When we do override it, the generator produces this:

static void generate_A2_add(
                CPUHexagonState *env,
                DisasContext *ctx,
                insn_t *insn,
                packet_t *pkt)
{
    TCGv RdV = tcg_temp_local_new();
    const int RdN = insn->regno[0];
    TCGv RsV = hex_gpr[insn->regno[1]];
    TCGv RtV = hex_gpr[insn->regno[2]];
    fGEN_TCG_A2_add({ RdV=RsV+RtV;});   /* <-- this line is different */
    gen_log_reg_write(RdN, RdV);
    ctx_log_reg_write(ctx, RdN);
    tcg_temp_free(RdV);
}

Also, if it finds the override, it doesn't generate the DEF_HELPER or the helper function.  That means we don't include tcg_gen.h in helper.h as you suggested.
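
For illustration only, the override detection described above could be sketched as follows.  The regular expression and the function names find_overrides, gen_body_call and needs_helper are assumptions made for this sketch, not the code in the series.

```python
import re

# Matches lines such as "#define fGEN_TCG_A2_add(SHORTCODE) ..." and
# captures the tag ("A2_add").  The exact pattern is an assumption.
OVERRIDE_RE = re.compile(r"^\s*#\s*define\s+fGEN_TCG_(\w+)\b")


def find_overrides(lines):
    """Return the set of tags that have a hand-written fGEN_TCG_<tag>."""
    tags = set()
    for line in lines:
        match = OVERRIDE_RE.match(line)
        if match:
            tags.add(match.group(1))
    return tags


def gen_body_call(tag, overrides, shortcode, helper_args):
    """Emit the inline override if one exists, else the helper call."""
    if tag in overrides:
        return "fGEN_TCG_{0}({1});".format(tag, shortcode)
    return "gen_helper_{0}({1});".format(tag, helper_args)


def needs_helper(tag, overrides):
    """DEF_HELPER and the helper body are only needed without an override."""
    return tag not in overrides
```

The same set of tags drives both decisions, so the generated TCG function and the generated helper list cannot disagree about which instructions are overridden.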


Thanks,
Taylor



* Re: [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions
  2020-09-24 17:18                               ` Taylor Simpson
@ 2020-09-24 19:04                                 ` Richard Henderson
  0 siblings, 0 replies; 122+ messages in thread
From: Richard Henderson @ 2020-09-24 19:04 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 9/24/20 10:18 AM, Taylor Simpson wrote:
>> You've got a script generating stuff.  It can just as easily generate two
>> different lists.  You're trying to do too much with the C preprocessor and too
>> little with python.
> 
> Sure, generating two lists was also suggested by Alessandro (ale@rev.ng).  Although doing more in python and less with the C preprocessor would lead to the conclusion we shouldn't define the function prototype in a macro.  If we generate two lists, what is the advantage of putting the function signature in a macro vs generating?

None, because...

>> At some point in the v3 thread, I had suggested grepping for some macro in
>> order to indicate to the python script which tags are implemented manually.
>> My definition above is easy to look for: exactly one thing on the line, easy
>> regexp.
> 
> This is already done as well.  As you may recall, we were previously generating
>     #ifdef fGEN_TCG_A2_add
>     fGEN_TCG_A2_add({ RdV=RsV+RtV;});
>     #else
>     do {
>     gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
>     } while (0);
> 
> The generator now searches for "#define fGEN_TCG_<tag>" ...

... I'd forgotten that they were two different macros.

> Also, if it finds the override, it doesn't generate the DEF_HELPER or the helper function.  That means we don't include tcg_gen.h in helper.h as you suggested.

Excellent.


r~



* Re: [RFC PATCH v3 34/34] Hexagon build infrastructure
  2020-09-24  2:35     ` Taylor Simpson
@ 2020-09-25 16:59       ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 122+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-25 16:59 UTC (permalink / raw)
  To: Taylor Simpson, Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, laurent, aleksandar.m.mail

Hi Taylor,

On 9/24/20 4:35 AM, Taylor Simpson wrote:
>> -----Original Message-----
>> From: Richard Henderson <richard.henderson@linaro.org>
>> Sent: Friday, August 28, 2020 9:20 PM
>> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
>> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
>> aleksandar.m.mail@gmail.com; ale@rev.ng
>> Subject: Re: [RFC PATCH v3 34/34] Hexagon build infrastructure
>>
>> This will have to be updated for the meson conversion.
>>
>> I don't understand it all myself, and all of those generated files will need
>> special attention.
>>
> 
> I've made the changes for meson, including converting target/hexagon/Makefile.objs to target/hexagon/meson.build, and I can build qemu-hexagon with
>     mkdir build
>     cd build
>     ../configure --target-list=hexagon-linux-user
>     make
> 
> However, when I run "make check-tcg", nothing actually happens.
>       BUILD   TCG tests for hexagon-linux-user
>     Generating qemu-version.h with a meson_exe.py custom command
>       RUN     TCG tests for hexagon-linux-user
> 
> What am I missing?  Has some other command replaced "make check-tcg"?

Maybe that patch from Paolo fixes it:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg742912.html

> 
> Thanks,
> Taylor
> 
> 




* RE: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics
  2020-08-29  1:16   ` Richard Henderson
  2020-08-30 20:23     ` Taylor Simpson
@ 2020-10-08 15:00     ` Taylor Simpson
  2020-10-08 17:30       ` Richard Henderson
  1 sibling, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-10-08 15:00 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Friday, August 28, 2020 7:17 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros
> referenced in instruction semantics
>
>
> Ah, so I see what you're trying to do with the merge thing, which had the
> host-endian problems.
>
> I think the merge stuff is a mistake.  I think you can get the semantics that
> you want with
>
> probe_read(ld_addr, ld_len)
> qemu_st(st_value, st_addr)
> qemu_ld(ld_value, ld_addr)
>
> In this way, all exceptions are recognized before the store is complete, the
> normal memory operations handle any possible overlap.

Following up on this...

1) We don't need to do the probe_read because the load has already happened at this point.

2) I'm not familiar with qemu_st/qemu_ld.  Are these shorthand for tcg_gen_qemu_st*/tcg_gen_qemu_ld*?  We can't actually do the store at this point because it would alter the memory before we are sure the packet will commit (i.e., there might still be an exception raised by another instruction in the packet).

3) How important is it to support big endian hosts?  Would it be OK to put something like this to declare it isn't supported for Hexagon?
#if defined(HOST_WORDS_BIGENDIAN)
#error Hexagon guest not supported on big endian host
#endif

4) If the above is not OK, are the macros in bswap.h the correct ones to use?  Is this pseudo-code correct?
store_val = le32_to_cpu(store_val);
load_val = le32_to_cpu(load_val);
<merge bytes>
/* store_val is dead so no need to convert back */
load_val = cpu_to_le32(load_val)


Thanks,
Taylor



* Re: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics
  2020-10-08 15:00     ` Taylor Simpson
@ 2020-10-08 17:30       ` Richard Henderson
  2020-10-08 18:51         ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-10-08 17:30 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 10/8/20 10:00 AM, Taylor Simpson wrote:
> 
> 
>> -----Original Message-----
>> From: Richard Henderson <richard.henderson@linaro.org>
>> Sent: Friday, August 28, 2020 7:17 PM
>> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
>> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
>> aleksandar.m.mail@gmail.com; ale@rev.ng
>> Subject: Re: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros
>> referenced in instruction semantics
>>
>>
>> Ah, so I see what you're trying to do with the merge thing, which had the
>> host-endian problems.
>>
>> I think the merge stuff is a mistake.  I think you can get the semantics that
>> you want with
>>
>> probe_read(ld_addr, ld_len)
>> qemu_st(st_value, st_addr)
>> qemu_ld(ld_value, ld_addr)
>>
>> In this way, all exceptions are recognized before the store is complete, the
>> normal memory operations handle any possible overlap.
> 
> Following up on this...
> 
> 1) We don't need to do the probe_read because the load has already happened at this point.


What do you mean "has already happened"?  How can it have been done without
doing the merging by hand, which this operation ordering is intended to make
unnecessary?

I think you're missing the point.


> 2) I'm not familiar with qemu_st/qemu_ld.  Are these shorthand for tcg_gen_qemu_st*/tcg_gen_qemu_ld*?

Yes.

> We can't actually do the store at this point because it would alter the memory before we are sure the packet will commit (i.e., there might still be an exception raised by another instruction in the packet).

What other instruction?  Give me a concrete example so that I can give you a
concrete answer.

I think you may need to do more preprocessing of the entire packet so that you
can answer this sort of question (is there any other possible exception) when
generating code.

> 3) How important is it to support big endian hosts?  Would it be OK to put something like this to declare it isn't supported for Hexagon?
> #if defined(HOST_WORDS_BIGENDIAN)
> #error Hexagon guest not supported on big endian host
> #endif

That would make ./configure && make fail on such a host.
So, no, you can't do that.

> 
> 4) If the above is not OK, are the macros in bswap.h the correct ones to use?  Is this pseudo-code correct?
> store_val = le32_to_cpu(store_val);
> load_val = le32_to_cpu(load_val);
> <merge bytes>
> /* store_val is dead so no need to convert back */
> load_val = cpu_to_le32(load_val)

There's some misuse of cpu_to_le32 vs le32_to_cpu there (I've never liked those
names, but it helps to think about what form the data is already in).


r~



* RE: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics
  2020-10-08 17:30       ` Richard Henderson
@ 2020-10-08 18:51         ` Taylor Simpson
  2020-10-08 20:02           ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-10-08 18:51 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Thursday, October 8, 2020 11:31 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros
> referenced in instruction semantics
>
> >> Ah, so I see what you're trying to do with the merge thing, which had the
> >> host-endian problems.
> >>
> >> I think the merge stuff is a mistake.  I think you can get the semantics that
> >> you want with
> >>
> >> probe_read(ld_addr, ld_len)
> >> qemu_st(st_value, st_addr)
> >> qemu_ld(ld_value, ld_addr)
> >>
> >> In this way, all exceptions are recognized before the store is complete,
> the
> >> normal memory operations handle any possible overlap.
> >
> > Following up on this...
> >
> > 1) We don't need to do the probe_read because the load has already
> happened at this point.
>
>
> What do you mean "has already happened"?
> How can it have done without doing the merging by hand.  Which this operation ordering is intended to make unnecessary.
>
> I think you're missing the point.

Sorry I wasn't clear.  We have done the load from memory as it was at the beginning of the packet.  The result of the store is in mem_log_stores in CPUHexagonState.  This code updates the bytes that were loaded with any overlapping bytes from the store that hasn't been committed yet.
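
For illustration only, the merge described above can be modeled at the byte level as in the sketch below.  This is a toy model, not the code in the series: merge_pending_store and its parameters are hypothetical names, and the buffers hold bytes in memory order so that host endianness does not enter the picture.

```python
def merge_pending_store(load_addr, load_bytes, store_addr, store_bytes):
    """Overlay a not-yet-committed store onto bytes that were already loaded.

    load_bytes holds memory as it was at the start of the packet;
    store_bytes is the logged store that has not been committed yet.
    Only positions where the two address ranges overlap are replaced.
    """
    merged = bytearray(load_bytes)
    for i in range(len(merged)):
        offset = load_addr + i - store_addr
        if 0 <= offset < len(store_bytes):
            merged[i] = store_bytes[offset]
    return bytes(merged)
```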

>
>
> > 2) I'm not familiar with qemu_st/qemu_ld.  Are these shorthand for
> tcg_gen_qemu_st*/tcg_gen_qemu_ld*?
>
> Yes.
>
> > We can't actually do the store at this point because it would alter the
> memory before we are sure the packet will commit (i.e., there might still
> be an exception raised by another instruction in the packet).
>
> What other instruction?  Give me a concrete example so that I can give you a
> concrete answer.

Remember, there can be 4 instructions in a packet.  This code is only dealing with two of them (a load and a store).  Here's an example where a different instruction in the packet can fault.

    67f8:       c0 40 21 1f     1f2140c0 {      v0.uh = vsat(v0.uw,v1.uw)
    67fc:       00 45 02 a1     a1024500        memb(r2+#0) = r5
    6800:       02 c0 04 91     9104c002        r2 = memb(r4+#0) }

The vsat instruction requires a vector context.  If the thread doesn't have a vector context, an exception will be raised.  The OS can provide a context and replay the packet.

> I think you may need to do more preprocessing of the entire packet so that
> you
> can answer this sort of question (is there any other possible exception) when
> generating code.
>
> > 3) How important is it to support big endian hosts?  Would it be OK to put
> something like this to declare it isn't supported for Hexagon?
> > #if defined(HOST_WORDS_BIGENDIAN)
> > #error Hexagon guest not supported on big endian host
> > #endif
>
> That would make ./configure && make fail on such a host.
> So, no, you can't do that.
>
> >
> > 4) If the above is not OK, are the macros in bswap.h the correct ones to
> use?  Is this pseudo-code correct?
> > store_val = le32_to_cpu(store_val);
> > load_val = le32_to_cpu(load_val);
> > <merge bytes>
> > /* store_val is dead so no need to convert back */
> > load_val = cpu_to_le32(load_val)
>
> There's some misuse of cpu_to_le32 vs le32_to_cpu there (I've never liked
> those
> names, but it helps to think about what form the data is already in).

So, what is the right sequence?


Thanks,
Taylor



* Re: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics
  2020-10-08 18:51         ` Taylor Simpson
@ 2020-10-08 20:02           ` Richard Henderson
  2020-10-08 20:54             ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-10-08 20:02 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 10/8/20 1:51 PM, Taylor Simpson wrote:
>> How can it have done without doing the merging by hand.  Which this operation ordering is intended to make unnecessary.
>>
>> I think you're missing the point.
> 
> Sorry I wasn't clear.  We have done the load from memory as it was at the beginning of the packet.  The result of the store is in mem_log_stores in CPUHexagonState.  This code updates the bytes that were loaded with any overlapping bytes from the store that hasn't been committed yet.

Right, so you *are* missing the point.

The point is to *not* do the load earlier, but only probe the memory for
readability so that any exception is recognized before the store commits.

Then, after the store, actually perform the load.  Thus any overlapping bytes
get the values that they should.

Voila, no by-hand merging.
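
For illustration only, the probe, store, load ordering can be modeled with a toy memory as in the sketch below.  ToyMemory, Fault and store_then_load are hypothetical names; the real operations would be TCG memory ops and a probe helper, which this sketch does not attempt to reproduce.

```python
class Fault(Exception):
    """Stands in for a guest memory exception in this toy model."""


class ToyMemory:
    def __init__(self, size):
        self.data = bytearray(size)

    def _check(self, addr, length):
        if addr < 0 or addr + length > len(self.data):
            raise Fault("bad access at 0x%x" % addr)

    def probe_read(self, addr, length):
        """Raise Fault if the range is unreadable; memory is not changed."""
        self._check(addr, length)

    def store(self, addr, value):
        self._check(addr, len(value))
        self.data[addr:addr + len(value)] = value

    def load(self, addr, length):
        self._check(addr, length)
        return bytes(self.data[addr:addr + length])


def store_then_load(mem, st_addr, st_value, ld_addr, ld_len):
    # 1. Probe only: a load fault is raised here, before memory changes.
    mem.probe_read(ld_addr, ld_len)
    # 2. Commit the store.
    mem.store(st_addr, st_value)
    # 3. The real load now sees any overlapping stored bytes.
    return mem.load(ld_addr, ld_len)
```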

>     67f8:       c0 40 21 1f     1f2140c0 {      v0.uh = vsat(v0.uw,v1.uw)
>     67fc:       00 45 02 a1     a1024500        memb(r2+#0) = r5
>     6800:       02 c0 04 91     9104c002        r2 = memb(r4+#0) }
> 
> The vsat instruction requires a vector context.  If the thread doesn't have a vector context, an exception will be raised.  The OS can provide a context and replay the packet.

Sure.

Is there any per-packet exception priority that would require a fault from the
store to be provided in preference to the fault for the vector context?

Anyway, I'm suggesting that we order the operations within the packet in
whatever way is most convenient for us.

>>> store_val = le32_to_cpu(store_val);
>>> load_val = le32_to_cpu(load_val);
>>> <merge bytes>
>>> /* store_val is dead so no need to convert back */
>>> load_val = cpu_to_le32(load_val)
>>
>> There's some misuse of cpu_to_le32 vs le32_to_cpu there (I've never liked
>> those
>> names, but it helps to think about what form the data is already in).
> 
> So, what is the right sequence?

Well, <merge_bytes> wants to operate on a le ordering, so the final load_val
assignment should use le32_to_cpu.  Think about this in terms of units, like
Fahrenheit vs Celsius.

As for the other two, it depends on where the values come from.  Probably they
should be cpu_to_le32, but I can't tell without extra context.
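
For illustration only, the "units" view can be modeled as in the sketch below.  The real le32_to_cpu and cpu_to_le32 in QEMU's bswap.h take a single argument and use the build host's byte order; the extra host_big_endian parameter and the host_memory_image helper are assumptions of this model so that both kinds of host can be exercised.

```python
import struct


def bswap32(value):
    """Reverse the byte order of a 32-bit value."""
    return struct.unpack("<I", struct.pack(">I", value))[0]


def cpu_to_le32(value, host_big_endian):
    """Input is a host-order integer; output is in little-endian 'units'."""
    return bswap32(value) if host_big_endian else value


def le32_to_cpu(value, host_big_endian):
    """Input is in little-endian 'units'; output is a host-order integer."""
    return bswap32(value) if host_big_endian else value


def host_memory_image(value, host_big_endian):
    """Bytes the modeled host would write to memory for this integer."""
    return struct.pack(">I" if host_big_endian else "<I", value)
```

Both conversions perform the same swap on a given host, so the names only record which form the data is in before and after the call.  In this model, after cpu_to_le32 the host's memory image of the value is in little-endian byte order on either kind of host, and le32_to_cpu turns it back into the original integer.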


r~



* RE: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics
  2020-10-08 20:02           ` Richard Henderson
@ 2020-10-08 20:54             ` Taylor Simpson
  2020-10-09 12:59               ` Richard Henderson
  0 siblings, 1 reply; 122+ messages in thread
From: Taylor Simpson @ 2020-10-08 20:54 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Thursday, October 8, 2020 2:02 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros
> referenced in instruction semantics
>
> On 10/8/20 1:51 PM, Taylor Simpson wrote:
> >> How can it have done without doing the merging by hand.  Which this
> operation ordering is intended to make unnecessary.
> >>
> >> I think you're missing the point.
> >
> > Sorry I wasn't clear.  We have done the load from memory as it was at the
> beginning of the packet.  The result of the store is in mem_log_stores in
> CPUHexagonState.  This code updates the bytes that were loaded with any
> overlapping bytes from the store that hasn't been committed yet.
>
> Right, so you *are* missing the point.
>
> The point is to *not* do the load earlier, but only probe the memory for
> readability so that any exception is recognized before the store commits.
>
> Then, after the store, actually perform the load.  Thus any overlapping bytes
> get the values that they should.

That's what I coded originally, but after sleeping on it I decided it could lead to problems.  See below...

> Voila, no by-hand merging.
>
> >     67f8:       c0 40 21 1f     1f2140c0 {      v0.uh = vsat(v0.uw,v1.uw)
> >     67fc:       00 45 02 a1     a1024500        memb(r2+#0) = r5
> >     6800:       02 c0 04 91     9104c002        r2 = memb(r4+#0) }
> >
> > The vsat instruction requires a vector context.  If the thread doesn't have a
> vector context, an exception will be raised.  The OS can provide a context and
> replay the packet.
>
> Sure.
>
> Is there any per-packet exception priority that would require a fault from the
> store to be provided in preference to the fault for the vector context?

I don't think there's a priority for the exceptions.  In this example, any of the 3 instructions could generate an exception.  The thing I'm worried about isn't that the store could generate an exception.  The problem is letting the store change the machine state before we're sure none of the instructions will raise an exception.  Maybe I'm worrying about something that would never result in different behavior - if the packet gets replayed, we'll just store the same value again.

I'll reach out to the architects and ask if there is a case where doing the store first could be a problem - and also about the priority of exceptions.


> Anyway, I'm suggesting ordering the operations within the packet to be one
> that's most convenient for us.

This may or may not be possible.  We already have to reorder to put .new consumers after the producers, and we have to keep the change-of-flow instructions in the original order.  So, I'm reluctant to go down this path, because I worry that there will be a long tail of corner-case bugs to engineer out.





* Re: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics
  2020-10-08 20:54             ` Taylor Simpson
@ 2020-10-09 12:59               ` Richard Henderson
  2020-10-09 16:02                 ` Taylor Simpson
  0 siblings, 1 reply; 122+ messages in thread
From: Richard Henderson @ 2020-10-09 12:59 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail

On 10/8/20 3:54 PM, Taylor Simpson wrote:
> I'll reach out to the architects and ask if there is a case where doing the store first could be a problem - and also about the priority of exceptions.

Of course there are cases where doing the store first will cause problems.

> This may or may not be possible.  We already have to reorder to put .new consumers after the producers, and we have to keep the change-of-flow instructions in the original order.  So, I'm reluctant to go down this path worrying that there will be a long tail of corner case bugs to engineer out.

Well, do as you like, I suppose, but I think the merging thing that you were
trying to do is just as fraught with peril.


r~



* RE: [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics
  2020-10-09 12:59               ` Richard Henderson
@ 2020-10-09 16:02                 ` Taylor Simpson
  0 siblings, 0 replies; 122+ messages in thread
From: Taylor Simpson @ 2020-10-09 16:02 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: ale, riku.voipio, philmd, laurent, aleksandar.m.mail


> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Friday, October 9, 2020 6:59 AM
>
> On 10/8/20 3:54 PM, Taylor Simpson wrote:
> > I'll reach out to the architects and ask if there is a case where doing the
> store first could be a problem - and also about the priority of exceptions.
>
> Of course there are cases where doing the store first will cause problems.
>
> > This may or may not be possible.  We already have to reorder to put .new
> consumers after the producers, and we have to keep the change-of-flow
> instructions in the original order.  So, I'm reluctant to go down this path
> worrying that there will be a long tail of corner case bugs to engineer out.
>
> Well, do as you like, I suppose, but I think the merging thing that you were
> trying to do is just as fraught with peril.

Thanks for your patience on this.  After speaking to the architects and putting more comments in the decoder (per your feedback), I'm convinced that doing the store first won't lead to problems.  So, I'll move forward with your suggestion.

In other news, this change and the switch to qemu softfloat are the only blockers from your review.  Once these are done, I'll submit v5 of the series.  Thanks for your feedback; it has made our code significantly better.


Thanks
Taylor



Thread overview: 122+ messages
2020-08-18 15:50 [RFC PATCH v3 00/34] Hexagon patch series Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 01/34] Hexagon Update MAINTAINERS file Taylor Simpson
2020-08-26  1:55   ` Richard Henderson
2020-08-18 15:50 ` [RFC PATCH v3 02/34] Hexagon (target/hexagon) README Taylor Simpson
2020-08-26  2:06   ` Richard Henderson
2020-08-18 15:50 ` [RFC PATCH v3 03/34] Hexagon (include/elf.h) ELF machine definition Taylor Simpson
2020-08-26  2:06   ` Richard Henderson
2020-08-18 15:50 ` [RFC PATCH v3 04/34] Hexagon (target/hexagon) scalar core definition Taylor Simpson
2020-08-26 13:35   ` Richard Henderson
2020-08-26 23:51     ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 05/34] Hexagon (target/hexagon) register names Taylor Simpson
2020-08-26 13:39   ` Richard Henderson
2020-08-18 15:50 ` [RFC PATCH v3 06/34] Hexagon (disas) disassembler Taylor Simpson
2020-08-26 13:52   ` Richard Henderson
2020-08-26 23:52     ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 07/34] Hexagon (target/hexagon) scalar core helpers Taylor Simpson
2020-08-26 14:16   ` Richard Henderson
2020-08-26 23:52     ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 08/34] Hexagon (target/hexagon) GDB Stub Taylor Simpson
2020-08-26 14:17   ` Richard Henderson
2020-08-18 15:50 ` [RFC PATCH v3 09/34] Hexagon (target/hexagon) architecture types Taylor Simpson
2020-08-26 14:19   ` Richard Henderson
2020-08-26 23:52     ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 10/34] Hexagon (target/hexagon) instruction and packet types Taylor Simpson
2020-08-26 14:22   ` Richard Henderson
2020-08-26 23:52     ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 11/34] Hexagon (target/hexagon) register fields Taylor Simpson
2020-08-26 14:29   ` Richard Henderson
2020-08-26 23:52     ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 12/34] Hexagon (target/hexagon) instruction attributes Taylor Simpson
2020-08-26 14:34   ` Richard Henderson
2020-08-18 15:50 ` [RFC PATCH v3 13/34] Hexagon (target/hexagon) register map Taylor Simpson
2020-08-26 14:36   ` Richard Henderson
2020-08-26 23:52     ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 14/34] Hexagon (target/hexagon) instruction/packet decode Taylor Simpson
2020-08-26 15:06   ` Richard Henderson
2020-08-26 23:52     ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 15/34] Hexagon (target/hexagon) instruction printing Taylor Simpson
2020-08-26 15:08   ` Richard Henderson
2020-08-18 15:50 ` [RFC PATCH v3 16/34] Hexagon (target/hexagon) utility functions Taylor Simpson
2020-08-26 15:10   ` Richard Henderson
2020-08-26 23:52     ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 17/34] Hexagon (target/hexagon/imported) arch import - macro definitions Taylor Simpson
2020-08-26 15:17   ` Richard Henderson
2020-08-26 23:52     ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 18/34] Hexagon (target/hexagon/imported) arch import - instruction semantics Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 19/34] Hexagon (target/hexagon/imported) arch import - instruction encoding Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 20/34] Hexagon (target/hexagon) generator phase 1 - C preprocessor for semantics Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 21/34] Hexagon (target/hexagon) generator phase 2 - generate header files Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 22/34] Hexagon (target/hexagon) generator phase 3 - C preprocessor for decode tree Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 23/34] Hexagon (target/hexagon) generator phase 4 - " Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 24/34] Hexagon (target/hexagon) opcode data structures Taylor Simpson
2020-08-26 15:25   ` Richard Henderson
2020-08-26 23:52     ` Taylor Simpson
2020-08-27  4:05       ` Richard Henderson
2020-08-18 15:50 ` [RFC PATCH v3 25/34] Hexagon (target/hexagon) macros to interface with the generator Taylor Simpson
2020-08-29  0:49   ` Richard Henderson
2020-08-30 20:30     ` Taylor Simpson
2020-08-30 20:59       ` Richard Henderson
2020-08-30 21:20         ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 26/34] Hexagon (target/hexagon) macros referenced in instruction semantics Taylor Simpson
2020-08-29  1:16   ` Richard Henderson
2020-08-30 20:23     ` Taylor Simpson
2020-08-30 21:06       ` Richard Henderson
2020-10-08 15:00     ` Taylor Simpson
2020-10-08 17:30       ` Richard Henderson
2020-10-08 18:51         ` Taylor Simpson
2020-10-08 20:02           ` Richard Henderson
2020-10-08 20:54             ` Taylor Simpson
2020-10-09 12:59               ` Richard Henderson
2020-10-09 16:02                 ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 27/34] Hexagon (target/hexagon) instruction classes Taylor Simpson
2020-08-29  1:37   ` Richard Henderson
2020-08-30 20:04     ` Taylor Simpson
2020-08-30 20:43       ` Richard Henderson
2020-08-18 15:50 ` [RFC PATCH v3 28/34] Hexagon (target/hexagon) TCG generation helpers Taylor Simpson
2020-08-29  1:48   ` Richard Henderson
2020-08-30 19:53     ` Taylor Simpson
2020-08-30 20:52       ` Richard Henderson
2020-08-30 21:38         ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 29/34] Hexagon (target/hexagon) TCG generation Taylor Simpson
2020-08-29  1:58   ` Richard Henderson
2020-08-30 19:49     ` Taylor Simpson
2020-08-18 15:50 ` [RFC PATCH v3 30/34] Hexagon (target/hexagon) TCG for instructions with multiple definitions Taylor Simpson
2020-08-29  2:02   ` Richard Henderson
2020-08-30 19:48     ` Taylor Simpson
2020-08-30 21:13       ` Richard Henderson
2020-08-30 21:30         ` Taylor Simpson
2020-08-30 23:26           ` Richard Henderson
2020-08-31 17:08             ` Taylor Simpson
2020-08-31 17:29               ` Richard Henderson
2020-08-31 18:14                 ` Taylor Simpson
2020-08-31 19:20                   ` Richard Henderson
2020-08-31 23:10                     ` Taylor Simpson
2020-09-01  2:40                       ` Richard Henderson
2020-09-01  4:17                         ` Taylor Simpson
2020-09-24  2:56                           ` Taylor Simpson
2020-09-24 15:03                             ` Richard Henderson
2020-09-24 17:18                               ` Taylor Simpson
2020-09-24 19:04                                 ` Richard Henderson
2020-08-18 15:50 ` [RFC PATCH v3 31/34] Hexagon (target/hexagon) translation Taylor Simpson
2020-08-29  2:49   ` Richard Henderson
2020-08-30 19:37     ` Taylor Simpson
2020-08-30 23:08       ` Richard Henderson
2020-08-18 15:50 ` [RFC PATCH v3 32/34] Hexagon (linux-user/hexagon) Linux user emulation Taylor Simpson
2020-08-29  2:59   ` Richard Henderson
2020-08-18 15:50 ` [RFC PATCH v3 33/34] Hexagon (tests/tcg/hexagon) TCG tests Taylor Simpson
2020-08-29  3:05   ` Richard Henderson
2020-09-01  9:57     ` Alessandro Di Federico
2020-08-18 15:50 ` [RFC PATCH v3 34/34] Hexagon build infrastructure Taylor Simpson
2020-08-29  3:19   ` Richard Henderson
2020-09-24  2:35     ` Taylor Simpson
2020-09-25 16:59       ` Philippe Mathieu-Daudé
2020-08-18 16:32 ` [RFC PATCH v3 00/34] Hexagon patch series no-reply
2020-08-29  3:27 ` Richard Henderson
2020-08-30 20:47   ` Taylor Simpson
2020-08-30 23:33     ` Richard Henderson
2020-08-31 17:57       ` Taylor Simpson
2020-08-31 20:43         ` Richard Henderson
2020-08-31 23:48           ` Taylor Simpson
2020-09-07  9:49     ` Rob Landley
2020-09-15  0:41       ` [EXT] " Brian Cain
