* [PATCH 0/3] Hexagon (target/hexagon) improve store handling
@ 2022-09-20  8:07 Taylor Simpson
  2022-09-20  8:07 ` [PATCH 1/3] Hexagon (target/hexagon) add instruction attributes from archlib Taylor Simpson
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Taylor Simpson @ 2022-09-20  8:07 UTC
  To: qemu-devel; +Cc: tsimpson, richard.henderson, f4bug, ale, anjo, bcain, mlambert

Make store handling faster and more robust

Taylor Simpson (3):
  Hexagon (target/hexagon) add instruction attributes from archlib
  Hexagon (target/hexagon) move store size tracking to translation
  Hexagon (target/hexagon) Change decision to set pkt_has_store_s[01]

 target/hexagon/macros.h               |   8 +-
 target/hexagon/attribs_def.h.inc      |  38 +++++++-
 target/hexagon/decode.c               |  17 ++--
 target/hexagon/genptr.c               |  36 +++-----
 target/hexagon/translate.c            |  36 +++++++-
 target/hexagon/hex_common.py          |   3 +-
 target/hexagon/imported/ldst.idef     | 122 +++++++++++++-------------
 target/hexagon/imported/subinsns.idef |  72 +++++++--------
 8 files changed, 196 insertions(+), 136 deletions(-)

-- 
2.17.1



* [PATCH 1/3] Hexagon (target/hexagon) add instruction attributes from archlib
  2022-09-20  8:07 [PATCH 0/3] Hexagon (target/hexagon) improve store handling Taylor Simpson
@ 2022-09-20  8:07 ` Taylor Simpson
  2022-09-28 16:12   ` Richard Henderson
  2022-09-20  8:07 ` [PATCH 2/3] Hexagon (target/hexagon) move store size tracking to translation Taylor Simpson
  2022-09-20  8:07 ` [PATCH 3/3] Hexagon (target/hexagon) Change decision to set pkt_has_store_s[01] Taylor Simpson
  2 siblings, 1 reply; 8+ messages in thread
From: Taylor Simpson @ 2022-09-20  8:07 UTC
  To: qemu-devel; +Cc: tsimpson, richard.henderson, f4bug, ale, anjo, bcain, mlambert

The imported files from the architecture library have added some
instruction attributes.  Some of these will be used in a subsequent
patch for determining the size of a store.
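
As a rough illustration of how size attributes of this kind could be
consumed, the sketch below maps a set of attributes to an access width.
It is not the code from this series: ATTRIB_LIST, attrib_mask, ATTR and
memsize_from_attribs are hypothetical names, and the bitmask stands in
for whatever per-opcode attribute storage the real translator uses.
Only the A_STORE and A_MEMSIZE_* names come from the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for a few entries of attribs_def.h.inc. */
#define ATTRIB_LIST(X) \
    X(STORE)           \
    X(MEMSIZE_1B)      \
    X(MEMSIZE_2B)      \
    X(MEMSIZE_4B)      \
    X(MEMSIZE_8B)

/* Expand the list into enumerators A_STORE, A_MEMSIZE_1B, ... */
enum {
#define X(NAME) A_##NAME,
    ATTRIB_LIST(X)
#undef X
    A_COUNT
};

/* Assumption: attributes of one instruction are held in a bitmask. */
typedef unsigned int attrib_mask;
#define ATTR(NAME) (1u << A_##NAME)

/* Return the memory access width in bytes, or 0 if no size is tagged. */
static int memsize_from_attribs(attrib_mask m)
{
    attrib_mask size_bits = m & (ATTR(MEMSIZE_1B) | ATTR(MEMSIZE_2B) |
                                 ATTR(MEMSIZE_4B) | ATTR(MEMSIZE_8B));

    /* At most one MEMSIZE attribute is expected per instruction. */
    assert((size_bits & (size_bits - 1)) == 0);

    if (size_bits & ATTR(MEMSIZE_1B)) {
        return 1;
    }
    if (size_bits & ATTR(MEMSIZE_2B)) {
        return 2;
    }
    if (size_bits & ATTR(MEMSIZE_4B)) {
        return 4;
    }
    if (size_bits & ATTR(MEMSIZE_8B)) {
        return 8;
    }
    return 0;
}
```

With this sketch, an instruction tagged like storerh (A_STORE plus
A_MEMSIZE_2B) would yield a width of 2, so the size can be read from
static instruction metadata rather than tracked while the store executes.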

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/attribs_def.h.inc      |  37 +++++++-
 target/hexagon/imported/ldst.idef     | 122 +++++++++++++-------------
 target/hexagon/imported/subinsns.idef |  72 +++++++--------
 3 files changed, 133 insertions(+), 98 deletions(-)

diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc
index dc890a557f..222ad95fb0 100644
--- a/target/hexagon/attribs_def.h.inc
+++ b/target/hexagon/attribs_def.h.inc
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -38,6 +38,16 @@ DEF_ATTRIB(SUBINSN, "sub-instruction", "", "")
 /* Load and Store attributes */
 DEF_ATTRIB(LOAD, "Loads from memory", "", "")
 DEF_ATTRIB(STORE, "Stores to memory", "", "")
+DEF_ATTRIB(STOREIMMED, "Stores immed to memory", "", "")
+DEF_ATTRIB(MEMSIZE_0B, "Memory width is 0 byte", "", "")
+DEF_ATTRIB(MEMSIZE_1B, "Memory width is 1 byte", "", "")
+DEF_ATTRIB(MEMSIZE_2B, "Memory width is 2 bytes", "", "")
+DEF_ATTRIB(MEMSIZE_4B, "Memory width is 4 bytes", "", "")
+DEF_ATTRIB(MEMSIZE_8B, "Memory width is 8 bytes", "", "")
+DEF_ATTRIB(REGWRSIZE_1B, "Memory width is 1 byte", "", "")
+DEF_ATTRIB(REGWRSIZE_2B, "Memory width is 2 bytes", "", "")
+DEF_ATTRIB(REGWRSIZE_4B, "Memory width is 4 bytes", "", "")
+DEF_ATTRIB(REGWRSIZE_8B, "Memory width is 8 bytes", "", "")
 DEF_ATTRIB(MEMLIKE, "Memory-like instruction", "", "")
 DEF_ATTRIB(MEMLIKE_PACKET_RULES, "follows Memory-like packet rules", "", "")
 
@@ -71,6 +81,11 @@ DEF_ATTRIB(COF, "Change-of-flow instruction", "", "")
 DEF_ATTRIB(CONDEXEC, "May be cancelled by a predicate", "", "")
 DEF_ATTRIB(DOTNEWVALUE, "Uses a register value generated in this pkt", "", "")
 DEF_ATTRIB(NEWCMPJUMP, "Compound compare and jump", "", "")
+DEF_ATTRIB(NVSTORE, "New-value store", "", "")
+DEF_ATTRIB(MEMOP, "memop", "", "")
+
+DEF_ATTRIB(ROPS_2, "Compound instruction worth 2 RISC-ops", "", "")
+DEF_ATTRIB(ROPS_3, "Compound instruction worth 3 RISC-ops", "", "")
 
 /* access to implicit registers */
 DEF_ATTRIB(IMPLICIT_WRITES_LR, "Writes the link register", "", "UREG.LR")
@@ -87,6 +102,9 @@ DEF_ATTRIB(IMPLICIT_WRITES_P3, "May write Predicate 3", "", "UREG.P3")
 DEF_ATTRIB(IMPLICIT_READS_PC, "Reads the PC register", "", "")
 DEF_ATTRIB(IMPLICIT_WRITES_USR, "May write USR", "", "")
 DEF_ATTRIB(WRITES_PRED_REG, "Writes a predicate register", "", "")
+DEF_ATTRIB(COMMUTES, "The operation is communitive", "", "")
+DEF_ATTRIB(DEALLOCRET, "dealloc_return", "", "")
+DEF_ATTRIB(DEALLOCFRAME, "deallocframe", "", "")
 
 DEF_ATTRIB(CRSLOT23, "Can execute in slot 2 or slot 3 (CR)", "", "")
 DEF_ATTRIB(IT_NOP, "nop instruction", "", "")
@@ -94,17 +112,21 @@ DEF_ATTRIB(IT_EXTENDER, "constant extender instruction", "", "")
 
 
 /* Restrictions to make note of */
+DEF_ATTRIB(RESTRICT_COF_MAX1, "One change-of-flow per packet", "", "")
+DEF_ATTRIB(RESTRICT_NOPACKET, "Not allowed in a packet", "", "")
 DEF_ATTRIB(RESTRICT_SLOT0ONLY, "Must execute on slot0", "", "")
 DEF_ATTRIB(RESTRICT_SLOT1ONLY, "Must execute on slot1", "", "")
 DEF_ATTRIB(RESTRICT_SLOT2ONLY, "Must execute on slot2", "", "")
 DEF_ATTRIB(RESTRICT_SLOT3ONLY, "Must execute on slot3", "", "")
 DEF_ATTRIB(RESTRICT_NOSLOT1, "No slot 1 instruction in parallel", "", "")
 DEF_ATTRIB(RESTRICT_PREFERSLOT0, "Try to encode into slot 0", "", "")
+DEF_ATTRIB(RESTRICT_PACKET_AXOK, "May exist with A-type or X-type", "", "")
 
 DEF_ATTRIB(ICOP, "Instruction cache op", "", "")
 
 DEF_ATTRIB(HWLOOP0_END, "Ends HW loop0", "", "")
 DEF_ATTRIB(HWLOOP1_END, "Ends HW loop1", "", "")
+DEF_ATTRIB(RET_TYPE, "return type", "", "")
 DEF_ATTRIB(DCZEROA, "dczeroa type", "", "")
 DEF_ATTRIB(ICFLUSHOP, "icflush op type", "", "")
 DEF_ATTRIB(DCFLUSHOP, "dcflush op type", "", "")
@@ -116,5 +138,18 @@ DEF_ATTRIB(L2FETCH, "Instruction is l2fetch type", "", "")
 DEF_ATTRIB(ICINVA, "icinva", "", "")
 DEF_ATTRIB(DCCLEANINVA, "dccleaninva", "", "")
 
+/* Documentation Notes */
+DEF_ATTRIB(NOTE_CONDITIONAL, "can be conditionally executed", "", "")
+DEF_ATTRIB(NOTE_NEWVAL_SLOT0, "New-value oprnd must execute on slot 0", "", "")
+DEF_ATTRIB(NOTE_PRIV, "Monitor-level feature", "", "")
+DEF_ATTRIB(NOTE_NOPACKET, "solo instruction", "", "")
+DEF_ATTRIB(NOTE_AXOK, "May only be grouped with ALU32 or non-FP XTYPE.", "", "")
+DEF_ATTRIB(NOTE_LATEPRED, "The predicate can not be used as a .new", "", "")
+DEF_ATTRIB(NOTE_NVSLOT0, "Can execute only in slot 0 (ST)", "", "")
+
+/* Restrictions to make note of */
+DEF_ATTRIB(RESTRICT_NOSLOT1_STORE, "Packet must not have slot 1 store", "", "")
+DEF_ATTRIB(RESTRICT_LATEPRED, "Predicate can not be used as a .new.", "", "")
+
 /* Keep this as the last attribute: */
 DEF_ATTRIB(ZZ_LASTATTRIB, "Last attribute in the file", "", "")
diff --git a/target/hexagon/imported/ldst.idef b/target/hexagon/imported/ldst.idef
index 359d3b744e..237634bdd9 100644
--- a/target/hexagon/imported/ldst.idef
+++ b/target/hexagon/imported/ldst.idef
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -31,12 +31,12 @@ Q6INSN(L2_##TAG##_pci, OPER"(Rx32++#s4:"SHFT":circ(Mu2))",ATTRIB,DESCR,{fEA_REG(
 Q6INSN(L2_##TAG##_pcr, OPER"(Rx32++I:circ(Mu2))",  ATTRIB,DESCR,{fEA_REG(RxV); fPM_CIRR(RxV,fREAD_IREG(MuV)<<SCALE,MuV); SEMANTICS;})
 
 /* The set of 32-bit load instructions */
-STD_LD_AMODES(loadrub,"Rd32=memub","Load Unsigned Byte",ATTRIBS(A_LOAD),"0",fLOAD(1,1,u,EA,RdV),0)
-STD_LD_AMODES(loadrb, "Rd32=memb", "Load signed Byte",ATTRIBS(A_LOAD),"0",fLOAD(1,1,s,EA,RdV),0)
-STD_LD_AMODES(loadruh,"Rd32=memuh","Load unsigned Half integer",ATTRIBS(A_LOAD),"1",fLOAD(1,2,u,EA,RdV),1)
-STD_LD_AMODES(loadrh, "Rd32=memh", "Load signed Half integer",ATTRIBS(A_LOAD),"1",fLOAD(1,2,s,EA,RdV),1)
-STD_LD_AMODES(loadri, "Rd32=memw", "Load Word",ATTRIBS(A_LOAD),"2",fLOAD(1,4,u,EA,RdV),2)
-STD_LD_AMODES(loadrd, "Rdd32=memd","Load Double integer",ATTRIBS(A_LOAD),"3",fLOAD(1,8,u,EA,RddV),3)
+STD_LD_AMODES(loadrub,"Rd32=memub","Load Unsigned Byte",ATTRIBS(A_MEMSIZE_1B,A_LOAD,A_REGWRSIZE_1B),"0",fLOAD(1,1,u,EA,RdV),0)
+STD_LD_AMODES(loadrb, "Rd32=memb", "Load signed Byte",ATTRIBS(A_MEMSIZE_1B,A_LOAD),"0",fLOAD(1,1,s,EA,RdV),0)
+STD_LD_AMODES(loadruh,"Rd32=memuh","Load unsigned Half integer",ATTRIBS(A_REGWRSIZE_2B,A_MEMSIZE_2B,A_LOAD),"1",fLOAD(1,2,u,EA,RdV),1)
+STD_LD_AMODES(loadrh, "Rd32=memh", "Load signed Half integer",ATTRIBS(A_REGWRSIZE_2B,A_MEMSIZE_2B,A_LOAD),"1",fLOAD(1,2,s,EA,RdV),1)
+STD_LD_AMODES(loadri, "Rd32=memw", "Load Word",ATTRIBS(A_REGWRSIZE_4B,A_MEMSIZE_4B,A_LOAD),"2",fLOAD(1,4,u,EA,RdV),2)
+STD_LD_AMODES(loadrd, "Rdd32=memd","Load Double integer",ATTRIBS(A_REGWRSIZE_8B,A_MEMSIZE_8B,A_LOAD),"3",fLOAD(1,8,u,EA,RddV),3)
 
 /* These instructions do a load an unpack */
 STD_LD_AMODES(loadbzw2, "Rd32=memubh", "Load Bytes and Vector Zero-Extend (unpack)",
@@ -113,28 +113,28 @@ Q6INSN(S2_##TAG##_pcr, OPER"(Rx32++I:circ(Mu2))="DEST,  ATTRIB,DESCR,{fEA_REG(Rx
 
 
 /* The set of 32-bit store instructions */
-STD_ST_AMODES(storerb, "Rt32", "memb","Store Byte",ATTRIBS(A_STORE),"0",fSTORE(1,1,EA,fGETBYTE(0,RtV)),0)
-STD_ST_AMODES(storerh, "Rt32", "memh","Store Half integer",ATTRIBS(A_STORE),"1",fSTORE(1,2,EA,fGETHALF(0,RtV)),1)
-STD_ST_AMODES(storerf, "Rt.H32", "memh","Store Upper Half integer",ATTRIBS(A_STORE),"1",fSTORE(1,2,EA,fGETHALF(1,RtV)),1)
-STD_ST_AMODES(storeri, "Rt32", "memw","Store Word",ATTRIBS(A_STORE),"2",fSTORE(1,4,EA,RtV),2)
-STD_ST_AMODES(storerd, "Rtt32","memd","Store Double integer",ATTRIBS(A_STORE),"3",fSTORE(1,8,EA,RttV),3)
-STD_ST_AMODES(storerinew, "Nt8.new", "memw","Store Word",ATTRIBS(A_STORE),"2",fSTORE(1,4,EA,fNEWREG_ST(NtN)),2)
-STD_ST_AMODES(storerbnew, "Nt8.new", "memb","Store Byte",ATTRIBS(A_STORE),"0",fSTORE(1,1,EA,fGETBYTE(0,fNEWREG_ST(NtN))),0)
-STD_ST_AMODES(storerhnew, "Nt8.new", "memh","Store Half integer",ATTRIBS(A_STORE),"1",fSTORE(1,2,EA,fGETHALF(0,fNEWREG_ST(NtN))),1)
+STD_ST_AMODES(storerb, "Rt32", "memb","Store Byte",ATTRIBS(A_MEMSIZE_1B,A_STORE),"0",fSTORE(1,1,EA,fGETBYTE(0,RtV)),0)
+STD_ST_AMODES(storerh, "Rt32", "memh","Store Half integer",ATTRIBS(A_REGWRSIZE_2B,A_MEMSIZE_2B,A_STORE),"1",fSTORE(1,2,EA,fGETHALF(0,RtV)),1)
+STD_ST_AMODES(storerf, "Rt.H32", "memh","Store Upper Half integer",ATTRIBS(A_REGWRSIZE_2B,A_MEMSIZE_2B,A_STORE),"1",fSTORE(1,2,EA,fGETHALF(1,RtV)),1)
+STD_ST_AMODES(storeri, "Rt32", "memw","Store Word",ATTRIBS(A_REGWRSIZE_4B,A_MEMSIZE_4B,A_STORE),"2",fSTORE(1,4,EA,RtV),2)
+STD_ST_AMODES(storerd, "Rtt32","memd","Store Double integer",ATTRIBS(A_REGWRSIZE_8B,A_MEMSIZE_8B,A_STORE),"3",fSTORE(1,8,EA,RttV),3)
+STD_ST_AMODES(storerinew, "Nt8.new", "memw","Store Word",ATTRIBS(A_REGWRSIZE_4B,A_NOTE_NEWVAL_SLOT0,A_NVSTORE,A_NOTE_NVSLOT0,A_MEMSIZE_4B,A_STORE,A_RESTRICT_NOSLOT1_STORE),"2",fSTORE(1,4,EA,fNEWREG_ST(NtN)),2)
+STD_ST_AMODES(storerbnew, "Nt8.new", "memb","Store Byte",ATTRIBS(A_NOTE_NEWVAL_SLOT0,A_NVSTORE,A_NOTE_NVSLOT0,A_MEMSIZE_1B,A_STORE,A_RESTRICT_NOSLOT1_STORE),"0",fSTORE(1,1,EA,fGETBYTE(0,fNEWREG_ST(NtN))),0)
+STD_ST_AMODES(storerhnew, "Nt8.new", "memh","Store Half integer",ATTRIBS(A_REGWRSIZE_2B,A_NOTE_NEWVAL_SLOT0,A_NVSTORE,A_NOTE_NVSLOT0,A_MEMSIZE_2B,A_STORE,A_RESTRICT_NOSLOT1_STORE),"1",fSTORE(1,2,EA,fGETHALF(0,fNEWREG_ST(NtN))),1)
 
 
-Q6INSN(S2_allocframe,"allocframe(Rx32,#u11:3):raw", ATTRIBS(A_STORE,A_RESTRICT_SLOT0ONLY), "Allocate stack frame",
+Q6INSN(S2_allocframe,"allocframe(Rx32,#u11:3):raw", ATTRIBS(A_REGWRSIZE_8B,A_MEMSIZE_8B,A_STORE,A_RESTRICT_SLOT0ONLY), "Allocate stack frame",
 { fEA_RI(RxV,-8); fSTORE(1,8,EA,fFRAME_SCRAMBLE((fCAST8_8u(fREAD_LR()) << 32) | fCAST4_4u(fREAD_FP()))); fWRITE_FP(EA); fFRAMECHECK(EA-uiV,EA); RxV = EA-uiV; })
 
-#define A_RETURN A_RESTRICT_SLOT0ONLY
+#define A_RETURN A_RESTRICT_COF_MAX1,A_RESTRICT_SLOT0ONLY,A_RESTRICT_NOSLOT1_STORE,A_RET_TYPE,A_DEALLOCRET
 
-Q6INSN(L2_deallocframe,"Rdd32=deallocframe(Rs32):raw", ATTRIBS(A_LOAD), "Deallocate stack frame",
+Q6INSN(L2_deallocframe,"Rdd32=deallocframe(Rs32):raw", ATTRIBS(A_REGWRSIZE_8B,A_MEMSIZE_8B,A_LOAD,A_DEALLOCFRAME), "Deallocate stack frame",
 { fHIDE(size8u_t tmp;) fEA_REG(RsV);
   fLOAD(1,8,u,EA,tmp);
   RddV = fFRAME_UNSCRAMBLE(tmp);
   fWRITE_SP(EA+8); })
 
-Q6INSN(L4_return,"Rdd32=dealloc_return(Rs32):raw", ATTRIBS(A_JINDIR,A_LOAD,A_RETURN), "Deallocate stack frame and return",
+Q6INSN(L4_return,"Rdd32=dealloc_return(Rs32):raw", ATTRIBS(A_REGWRSIZE_8B,A_ROPS_2,A_JINDIR,A_MEMSIZE_8B,A_LOAD,A_RETURN), "Deallocate stack frame and return",
 { fHIDE(size8u_t tmp;) fEA_REG(RsV);
   fLOAD(1,8,u,EA,tmp);
   RddV = fFRAME_UNSCRAMBLE(tmp);
@@ -166,7 +166,7 @@ Q6INSN(L4_return,"Rdd32=dealloc_return(Rs32):raw", ATTRIBS(A_JINDIR,A_LOAD,A_RET
     COND_RETURN_TF(TG,new_pt,".new",12,0,SPECULATE_TAKEN,ATTRIBS,fLSBNEW,PvN,":t") \
     COND_RETURN_TF(TG,new_pnt,".new",12,0,SPECULATE_NOT_TAKEN,ATTRIBS,fLSBNEW,PvN,":nt") \
 
-#define RETURN_ATTRIBS A_LOAD,A_RETURN
+#define RETURN_ATTRIBS A_ROPS_2,A_MEMSIZE_8B,A_LOAD,A_RETURN
 
 COND_RETURN_TF(L4_return,,,7,0,SPECULATE_NOT_TAKEN,ATTRIBS(RETURN_ATTRIBS,A_JINDIROLD),fLSBOLD,PvV,)
 COND_RETURN_NEW(L4_return,12,0,ATTRIBS(RETURN_ATTRIBS,A_JINDIRNEW))
@@ -174,18 +174,18 @@ COND_RETURN_NEW(L4_return,12,0,ATTRIBS(RETURN_ATTRIBS,A_JINDIRNEW))
 
 
 
-Q6INSN(L2_loadw_locked,"Rd32=memw_locked(Rs32)", ATTRIBS(A_LOAD,A_RESTRICT_SLOT0ONLY), "Load word with lock",
+Q6INSN(L2_loadw_locked,"Rd32=memw_locked(Rs32)", ATTRIBS(A_REGWRSIZE_4B,A_MEMSIZE_4B,A_LOAD,A_RESTRICT_SLOT0ONLY,A_RESTRICT_PACKET_AXOK,A_NOTE_AXOK), "Load word with lock",
 { fEA_REG(RsV); fLOAD_LOCKED(1,4,u,EA,RdV) })
 
 
-Q6INSN(S2_storew_locked,"memw_locked(Rs32,Pd4)=Rt32", ATTRIBS(A_STORE,A_RESTRICT_SLOT0ONLY), "Store word with lock",
+Q6INSN(S2_storew_locked,"memw_locked(Rs32,Pd4)=Rt32", ATTRIBS(A_REGWRSIZE_4B,A_MEMSIZE_4B,A_STORE,A_RESTRICT_SLOT0ONLY,A_RESTRICT_PACKET_AXOK,A_NOTE_AXOK,A_RESTRICT_LATEPRED,A_NOTE_LATEPRED), "Store word with lock",
 { fEA_REG(RsV); fSTORE_LOCKED(1,4,EA,RtV,PdV) })
 
 
-Q6INSN(L4_loadd_locked,"Rdd32=memd_locked(Rs32)", ATTRIBS(A_LOAD,A_RESTRICT_SLOT0ONLY), "Load double with lock",
+Q6INSN(L4_loadd_locked,"Rdd32=memd_locked(Rs32)", ATTRIBS(A_REGWRSIZE_8B,A_MEMSIZE_8B,A_LOAD,A_RESTRICT_SLOT0ONLY,A_RESTRICT_PACKET_AXOK,A_NOTE_AXOK), "Load double with lock",
 { fEA_REG(RsV); fLOAD_LOCKED(1,8,u,EA,RddV) })
 
-Q6INSN(S4_stored_locked,"memd_locked(Rs32,Pd4)=Rtt32", ATTRIBS(A_STORE,A_RESTRICT_SLOT0ONLY), "Store word with lock",
+Q6INSN(S4_stored_locked,"memd_locked(Rs32,Pd4)=Rtt32", ATTRIBS(A_REGWRSIZE_8B,A_MEMSIZE_8B,A_STORE,A_RESTRICT_SLOT0ONLY,A_RESTRICT_PACKET_AXOK,A_NOTE_AXOK,A_RESTRICT_LATEPRED,A_NOTE_LATEPRED), "Store word with lock",
 { fEA_REG(RsV); fSTORE_LOCKED(1,8,EA,RttV,PdV) })
 
 
@@ -220,12 +220,12 @@ Q6INSN(L4_p##TAG##fnew_abs,"if (!Pt4.new) "OPER"(#u6)",ATTRIB,DESCR,{fMUST_IMMEX
 
 
 /* The set of 32-bit predicated load instructions */
-STD_PLD_AMODES(loadrub,"Rd32=memub","Load Unsigned Byte",ATTRIBS(A_ARCHV2,A_LOAD),"0",0,fLOAD(1,1,u,EA,RdV))
-STD_PLD_AMODES(loadrb, "Rd32=memb", "Load signed Byte",ATTRIBS(A_ARCHV2,A_LOAD),"0",0,fLOAD(1,1,s,EA,RdV))
-STD_PLD_AMODES(loadruh,"Rd32=memuh","Load unsigned Half integer",ATTRIBS(A_ARCHV2,A_LOAD),"1",1,fLOAD(1,2,u,EA,RdV))
-STD_PLD_AMODES(loadrh, "Rd32=memh", "Load signed Half integer",ATTRIBS(A_ARCHV2,A_LOAD),"1",1,fLOAD(1,2,s,EA,RdV))
-STD_PLD_AMODES(loadri, "Rd32=memw", "Load Word",ATTRIBS(A_ARCHV2,A_LOAD),"2",2,fLOAD(1,4,u,EA,RdV))
-STD_PLD_AMODES(loadrd, "Rdd32=memd","Load Double integer",ATTRIBS(A_ARCHV2,A_LOAD),"3",3,fLOAD(1,8,u,EA,RddV))
+STD_PLD_AMODES(loadrub,"Rd32=memub","Load Unsigned Byte",ATTRIBS(A_ARCHV2,A_MEMSIZE_1B,A_LOAD,A_REGWRSIZE_1B),"0",0,fLOAD(1,1,u,EA,RdV))
+STD_PLD_AMODES(loadrb, "Rd32=memb", "Load signed Byte",ATTRIBS(A_ARCHV2,A_MEMSIZE_1B,A_LOAD),"0",0,fLOAD(1,1,s,EA,RdV))
+STD_PLD_AMODES(loadruh,"Rd32=memuh","Load unsigned Half integer",ATTRIBS(A_REGWRSIZE_2B,A_ARCHV2,A_MEMSIZE_2B,A_LOAD),"1",1,fLOAD(1,2,u,EA,RdV))
+STD_PLD_AMODES(loadrh, "Rd32=memh", "Load signed Half integer",ATTRIBS(A_REGWRSIZE_2B,A_ARCHV2,A_MEMSIZE_2B,A_LOAD),"1",1,fLOAD(1,2,s,EA,RdV))
+STD_PLD_AMODES(loadri, "Rd32=memw", "Load Word",ATTRIBS(A_REGWRSIZE_4B,A_ARCHV2,A_MEMSIZE_4B,A_LOAD),"2",2,fLOAD(1,4,u,EA,RdV))
+STD_PLD_AMODES(loadrd, "Rdd32=memd","Load Double integer",ATTRIBS(A_REGWRSIZE_8B,A_ARCHV2,A_MEMSIZE_8B,A_LOAD),"3",3,fLOAD(1,8,u,EA,RddV))
 
 /* The set of addressing modes standard to all predicated store instructions */
 #define STD_PST_AMODES(TAG,DEST,OPER,DESCR,ATTRIB,SHFT,SHFTNUM,SEMANTICS)\
@@ -251,14 +251,14 @@ Q6INSN(S4_p##TAG##fnew_abs,"if (!Pv4.new) "OPER"(#u6)="DEST,ATTRIB,DESCR,{fMUST_
 
 
 /* The set of 32-bit predicated store instructions */
-STD_PST_AMODES(storerb,"Rt32","memb","Store Byte",ATTRIBS(A_ARCHV2,A_STORE),"0",0,fSTORE(1,1,EA,fGETBYTE(0,RtV)))
-STD_PST_AMODES(storerh,"Rt32","memh","Store Half integer",ATTRIBS(A_ARCHV2,A_STORE),"1",1,fSTORE(1,2,EA,fGETHALF(0,RtV)))
-STD_PST_AMODES(storerf,"Rt.H32","memh","Store Upper Half integer",ATTRIBS(A_ARCHV2,A_STORE),"1",1,fSTORE(1,2,EA,fGETHALF(1,RtV)))
-STD_PST_AMODES(storeri,"Rt32","memw","Store Word",ATTRIBS(A_ARCHV2,A_STORE),"2",2,fSTORE(1,4,EA,RtV))
-STD_PST_AMODES(storerd,"Rtt32","memd","Store Double integer",ATTRIBS(A_ARCHV2,A_STORE),"3",3,fSTORE(1,8,EA,RttV))
-STD_PST_AMODES(storerinew,"Nt8.new","memw","Store Word",ATTRIBS(A_ARCHV2,A_STORE),"2",2,fSTORE(1,4,EA,fNEWREG_ST(NtN)))
-STD_PST_AMODES(storerbnew,"Nt8.new","memb","Store Byte",ATTRIBS(A_ARCHV2,A_STORE),"0",0,fSTORE(1,1,EA,fGETBYTE(0,fNEWREG_ST(NtN))))
-STD_PST_AMODES(storerhnew,"Nt8.new","memh","Store Half integer",ATTRIBS(A_ARCHV2,A_STORE),"1",1,fSTORE(1,2,EA,fGETHALF(0,fNEWREG_ST(NtN))))
+STD_PST_AMODES(storerb,"Rt32","memb","Store Byte",ATTRIBS(A_ARCHV2,A_MEMSIZE_1B,A_STORE),"0",0,fSTORE(1,1,EA,fGETBYTE(0,RtV)))
+STD_PST_AMODES(storerh,"Rt32","memh","Store Half integer",ATTRIBS(A_REGWRSIZE_2B,A_ARCHV2,A_MEMSIZE_2B,A_STORE),"1",1,fSTORE(1,2,EA,fGETHALF(0,RtV)))
+STD_PST_AMODES(storerf,"Rt.H32","memh","Store Upper Half integer",ATTRIBS(A_REGWRSIZE_2B,A_ARCHV2,A_MEMSIZE_2B,A_STORE),"1",1,fSTORE(1,2,EA,fGETHALF(1,RtV)))
+STD_PST_AMODES(storeri,"Rt32","memw","Store Word",ATTRIBS(A_REGWRSIZE_4B,A_ARCHV2,A_MEMSIZE_4B,A_STORE),"2",2,fSTORE(1,4,EA,RtV))
+STD_PST_AMODES(storerd,"Rtt32","memd","Store Double integer",ATTRIBS(A_REGWRSIZE_8B,A_ARCHV2,A_MEMSIZE_8B,A_STORE),"3",3,fSTORE(1,8,EA,RttV))
+STD_PST_AMODES(storerinew,"Nt8.new","memw","Store Word",ATTRIBS(A_REGWRSIZE_4B,A_ARCHV2,A_NOTE_NEWVAL_SLOT0,A_NVSTORE,A_NOTE_NVSLOT0,A_MEMSIZE_4B,A_STORE,A_RESTRICT_NOSLOT1_STORE),"2",2,fSTORE(1,4,EA,fNEWREG_ST(NtN)))
+STD_PST_AMODES(storerbnew,"Nt8.new","memb","Store Byte",ATTRIBS(A_ARCHV2,A_NOTE_NEWVAL_SLOT0,A_NVSTORE,A_NOTE_NVSLOT0,A_MEMSIZE_1B,A_STORE,A_RESTRICT_NOSLOT1_STORE),"0",0,fSTORE(1,1,EA,fGETBYTE(0,fNEWREG_ST(NtN))))
+STD_PST_AMODES(storerhnew,"Nt8.new","memh","Store Half integer",ATTRIBS(A_REGWRSIZE_2B,A_ARCHV2,A_NOTE_NEWVAL_SLOT0,A_NVSTORE,A_NOTE_NVSLOT0,A_MEMSIZE_2B,A_STORE,A_RESTRICT_NOSLOT1_STORE),"1",1,fSTORE(1,2,EA,fGETHALF(0,fNEWREG_ST(NtN))))
 
 
 
@@ -271,9 +271,9 @@ STD_PST_AMODES(storerhnew,"Nt8.new","memh","Store Half integer",ATTRIBS(A_ARCHV2
 
 /* The set of 32-bit non-predicated mem-ops */
 #define STD_MEMOP_AMODES(TAG,OPER,DESCR,SEMANTICS)\
-Q6INSN(L4_##TAG##w_io,  "memw(Rs32+#u6:2)"OPER,     ATTRIBS(A_RESTRICT_SLOT0ONLY),DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); fHIDE(size4s_t tmp;) fLOAD(1,4,s,EA,tmp); SEMANTICS;  fSTORE(1,4,EA,tmp); })\
-Q6INSN(L4_##TAG##b_io,  "memb(Rs32+#u6:0)"OPER,     ATTRIBS(A_RESTRICT_SLOT0ONLY),DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); fHIDE(size4s_t tmp;) fLOAD(1,1,s,EA,tmp); SEMANTICS;  fSTORE(1,1,EA,tmp); })\
-Q6INSN(L4_##TAG##h_io,  "memh(Rs32+#u6:1)"OPER,     ATTRIBS(A_RESTRICT_SLOT0ONLY),DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); fHIDE(size4s_t tmp;) fLOAD(1,2,s,EA,tmp); SEMANTICS;  fSTORE(1,2,EA,tmp); })
+Q6INSN(L4_##TAG##w_io,  "memw(Rs32+#u6:2)"OPER,     ATTRIBS(A_MEMOP,A_ROPS_3,A_MEMSIZE_4B,A_RESTRICT_SLOT0ONLY,A_RESTRICT_NOSLOT1_STORE),DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); fHIDE(size4s_t tmp;) fLOAD(1,4,s,EA,tmp); SEMANTICS;  fSTORE(1,4,EA,tmp); })\
+Q6INSN(L4_##TAG##b_io,  "memb(Rs32+#u6:0)"OPER,     ATTRIBS(A_MEMOP,A_ROPS_3,A_MEMSIZE_1B,A_RESTRICT_SLOT0ONLY,A_RESTRICT_NOSLOT1_STORE),DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); fHIDE(size4s_t tmp;) fLOAD(1,1,s,EA,tmp); SEMANTICS;  fSTORE(1,1,EA,tmp); })\
+Q6INSN(L4_##TAG##h_io,  "memh(Rs32+#u6:1)"OPER,     ATTRIBS(A_MEMOP,A_ROPS_3,A_MEMSIZE_2B,A_RESTRICT_SLOT0ONLY,A_RESTRICT_NOSLOT1_STORE),DESCR,{fIMMEXT(uiV); fEA_RI(RsV,uiV); fHIDE(size4s_t tmp;) fLOAD(1,2,s,EA,tmp); SEMANTICS;  fSTORE(1,2,EA,tmp); })
 
 
 
@@ -302,9 +302,9 @@ Q6INSN(S4_##TAG##tnew_io,"if (Pv4.new) "OPER"(Rs32+#u6:"SHFT")="DEST,ATTRIB,DESC
 Q6INSN(S4_##TAG##fnew_io,"if (!Pv4.new) "OPER"(Rs32+#u6:"SHFT")="DEST,ATTRIB,DESCR,{fEA_RI(RsV,uiV); if (fLSBNEWNOT(PvN)){ SEMANTICS; } else {STORE_CANCEL(EA);}})
 
 /* The set of 32-bit store immediate instructions */
-V4_PSTI_AMODES(storeirb,"#S6","memb","Store Immediate Byte",ATTRIBS(A_ARCHV2,A_STORE),"0",fIMMEXT(SiV); fSTORE(1,1,EA,SiV))
-V4_PSTI_AMODES(storeirh,"#S6","memh","Store Immediate Half integer",ATTRIBS(A_ARCHV2,A_STORE),"1",fIMMEXT(SiV); fSTORE(1,2,EA,SiV))
-V4_PSTI_AMODES(storeiri,"#S6","memw","Store Immediate Word",ATTRIBS(A_ARCHV2,A_STORE),"2",fIMMEXT(SiV); fSTORE(1,4,EA,SiV))
+V4_PSTI_AMODES(storeirb,"#S6","memb","Store Immediate Byte",ATTRIBS(A_ARCHV2,A_ROPS_2,A_MEMSIZE_1B,A_STORE,A_STOREIMMED),"0",fIMMEXT(SiV); fSTORE(1,1,EA,SiV))
+V4_PSTI_AMODES(storeirh,"#S6","memh","Store Immediate Half integer",ATTRIBS(A_REGWRSIZE_2B,A_ARCHV2,A_ROPS_2,A_MEMSIZE_2B,A_STORE,A_STOREIMMED),"1",fIMMEXT(SiV); fSTORE(1,2,EA,SiV))
+V4_PSTI_AMODES(storeiri,"#S6","memw","Store Immediate Word",ATTRIBS(A_REGWRSIZE_4B,A_ARCHV2,A_ROPS_2,A_MEMSIZE_4B,A_STORE,A_STOREIMMED),"2",fIMMEXT(SiV); fSTORE(1,4,EA,SiV))
 
 
 /* Non-predicated store immediates */
@@ -312,9 +312,9 @@ V4_PSTI_AMODES(storeiri,"#S6","memw","Store Immediate Word",ATTRIBS(A_ARCHV2,A_S
 Q6INSN(S4_##TAG##_io,  OPER"(Rs32+#u6:"SHFT")="DEST,  ATTRIB,DESCR,{fEA_RI(RsV,uiV); SEMANTICS; })
 
 /* The set of 32-bit store immediate instructions */
-V4_STI_AMODES(storeirb,"#S8","memb","Store Immediate Byte",ATTRIBS(A_ARCHV2,A_STORE),"0",fIMMEXT(SiV); fSTORE(1,1,EA,SiV))
-V4_STI_AMODES(storeirh,"#S8","memh","Store Immediate Half integer",ATTRIBS(A_ARCHV2,A_STORE),"1",fIMMEXT(SiV); fSTORE(1,2,EA,SiV))
-V4_STI_AMODES(storeiri,"#S8","memw","Store Immediate Word",ATTRIBS(A_ARCHV2,A_STORE),"2",fIMMEXT(SiV); fSTORE(1,4,EA,SiV))
+V4_STI_AMODES(storeirb,"#S8","memb","Store Immediate Byte",ATTRIBS(A_ARCHV2,A_ROPS_2,A_MEMSIZE_1B,A_STORE,A_STOREIMMED),"0",fIMMEXT(SiV); fSTORE(1,1,EA,SiV))
+V4_STI_AMODES(storeirh,"#S8","memh","Store Immediate Half integer",ATTRIBS(A_REGWRSIZE_2B,A_ARCHV2,A_ROPS_2,A_MEMSIZE_2B,A_STORE,A_STOREIMMED),"1",fIMMEXT(SiV); fSTORE(1,2,EA,SiV))
+V4_STI_AMODES(storeiri,"#S8","memw","Store Immediate Word",ATTRIBS(A_REGWRSIZE_4B,A_ARCHV2,A_ROPS_2,A_MEMSIZE_4B,A_STORE,A_STOREIMMED),"2",fIMMEXT(SiV); fSTORE(1,4,EA,SiV))
 
 
 
@@ -332,23 +332,23 @@ V4_STI_AMODES(storeiri,"#S8","memw","Store Immediate Word",ATTRIBS(A_ARCHV2,A_ST
 Q6INSN(L2_##TAG##gp, OPER"(gp+#u16:"SHFT")",   ATTRIB,DESCR,{fIMMEXT(uiV); fEA_GPI(uiV); SEMANTICS; })
 
 /* The set of 32-bit load instructions */
-STD_GPLD_AMODES(loadrub,"Rd32=memub","Load Unsigned Byte",ATTRIBS(A_LOAD,A_ARCHV2),"0",fLOAD(1,1,u,EA,RdV))
-STD_GPLD_AMODES(loadrb, "Rd32=memb", "Load signed Byte",ATTRIBS(A_LOAD,A_ARCHV2),"0",fLOAD(1,1,s,EA,RdV))
-STD_GPLD_AMODES(loadruh,"Rd32=memuh","Load unsigned Half integer",ATTRIBS(A_LOAD,A_ARCHV2),"1",fLOAD(1,2,u,EA,RdV))
-STD_GPLD_AMODES(loadrh, "Rd32=memh", "Load signed Half integer",ATTRIBS(A_LOAD,A_ARCHV2),"1",fLOAD(1,2,s,EA,RdV))
-STD_GPLD_AMODES(loadri, "Rd32=memw", "Load Word",ATTRIBS(A_LOAD,A_ARCHV2),"2",fLOAD(1,4,u,EA,RdV))
-STD_GPLD_AMODES(loadrd, "Rdd32=memd","Load Double integer",ATTRIBS(A_LOAD,A_ARCHV2),"3",fLOAD(1,8,u,EA,RddV))
+STD_GPLD_AMODES(loadrub,"Rd32=memub","Load Unsigned Byte",ATTRIBS(A_MEMSIZE_1B,A_LOAD,A_ARCHV2,A_REGWRSIZE_1B),"0",fLOAD(1,1,u,EA,RdV))
+STD_GPLD_AMODES(loadrb, "Rd32=memb", "Load signed Byte",ATTRIBS(A_MEMSIZE_1B,A_LOAD,A_ARCHV2),"0",fLOAD(1,1,s,EA,RdV))
+STD_GPLD_AMODES(loadruh,"Rd32=memuh","Load unsigned Half integer",ATTRIBS(A_REGWRSIZE_2B,A_MEMSIZE_2B,A_LOAD,A_ARCHV2),"1",fLOAD(1,2,u,EA,RdV))
+STD_GPLD_AMODES(loadrh, "Rd32=memh", "Load signed Half integer",ATTRIBS(A_REGWRSIZE_2B,A_MEMSIZE_2B,A_LOAD,A_ARCHV2),"1",fLOAD(1,2,s,EA,RdV))
+STD_GPLD_AMODES(loadri, "Rd32=memw", "Load Word",ATTRIBS(A_REGWRSIZE_4B,A_MEMSIZE_4B,A_LOAD,A_ARCHV2),"2",fLOAD(1,4,u,EA,RdV))
+STD_GPLD_AMODES(loadrd, "Rdd32=memd","Load Double integer",ATTRIBS(A_REGWRSIZE_8B,A_MEMSIZE_8B,A_LOAD,A_ARCHV2),"3",fLOAD(1,8,u,EA,RddV))
 
 
 #define STD_GPST_AMODES(TAG,DEST,OPER,DESCR,ATTRIB,SHFT,SEMANTICS)\
 Q6INSN(S2_##TAG##gp, OPER"(gp+#u16:"SHFT")="DEST, ATTRIB,DESCR,{fIMMEXT(uiV); fEA_GPI(uiV); SEMANTICS; })
 
 /* The set of 32-bit store instructions */
-STD_GPST_AMODES(storerb, "Rt32", "memb","Store Byte",ATTRIBS(A_STORE,A_ARCHV2),"0",fSTORE(1,1,EA,fGETBYTE(0,RtV)))
-STD_GPST_AMODES(storerh, "Rt32", "memh","Store Half integer",ATTRIBS(A_STORE,A_ARCHV2),"1",fSTORE(1,2,EA,fGETHALF(0,RtV)))
-STD_GPST_AMODES(storerf, "Rt.H32", "memh","Store Upper Half integer",ATTRIBS(A_STORE,A_ARCHV2),"1",fSTORE(1,2,EA,fGETHALF(1,RtV)))
-STD_GPST_AMODES(storeri, "Rt32", "memw","Store Word",ATTRIBS(A_STORE,A_ARCHV2),"2",fSTORE(1,4,EA,RtV))
-STD_GPST_AMODES(storerd, "Rtt32","memd","Store Double integer",ATTRIBS(A_STORE,A_ARCHV2),"3",fSTORE(1,8,EA,RttV))
-STD_GPST_AMODES(storerinew, "Nt8.new", "memw","Store Word",ATTRIBS(A_STORE,A_ARCHV2),"2",fSTORE(1,4,EA,fNEWREG_ST(NtN)))
-STD_GPST_AMODES(storerbnew, "Nt8.new", "memb","Store Byte",ATTRIBS(A_STORE,A_ARCHV2),"0",fSTORE(1,1,EA,fGETBYTE(0,fNEWREG_ST(NtN))))
-STD_GPST_AMODES(storerhnew, "Nt8.new", "memh","Store Half integer",ATTRIBS(A_STORE,A_ARCHV2),"1",fSTORE(1,2,EA,fGETHALF(0,fNEWREG_ST(NtN))))
+STD_GPST_AMODES(storerb, "Rt32", "memb","Store Byte",ATTRIBS(A_MEMSIZE_1B,A_STORE,A_ARCHV2),"0",fSTORE(1,1,EA,fGETBYTE(0,RtV)))
+STD_GPST_AMODES(storerh, "Rt32", "memh","Store Half integer",ATTRIBS(A_REGWRSIZE_2B,A_MEMSIZE_2B,A_STORE,A_ARCHV2),"1",fSTORE(1,2,EA,fGETHALF(0,RtV)))
+STD_GPST_AMODES(storerf, "Rt.H32", "memh","Store Upper Half integer",ATTRIBS(A_REGWRSIZE_2B,A_MEMSIZE_2B,A_STORE,A_ARCHV2),"1",fSTORE(1,2,EA,fGETHALF(1,RtV)))
+STD_GPST_AMODES(storeri, "Rt32", "memw","Store Word",ATTRIBS(A_REGWRSIZE_4B,A_MEMSIZE_4B,A_STORE,A_ARCHV2),"2",fSTORE(1,4,EA,RtV))
+STD_GPST_AMODES(storerd, "Rtt32","memd","Store Double integer",ATTRIBS(A_REGWRSIZE_8B,A_MEMSIZE_8B,A_STORE,A_ARCHV2),"3",fSTORE(1,8,EA,RttV))
+STD_GPST_AMODES(storerinew, "Nt8.new", "memw","Store Word",ATTRIBS(A_REGWRSIZE_4B,A_NOTE_NEWVAL_SLOT0,A_NVSTORE,A_NOTE_NVSLOT0,A_MEMSIZE_4B,A_STORE,A_RESTRICT_NOSLOT1_STORE,A_ARCHV2),"2",fSTORE(1,4,EA,fNEWREG_ST(NtN)))
+STD_GPST_AMODES(storerbnew, "Nt8.new", "memb","Store Byte",ATTRIBS(A_NOTE_NEWVAL_SLOT0,A_NVSTORE,A_NOTE_NVSLOT0,A_MEMSIZE_1B,A_STORE,A_RESTRICT_NOSLOT1_STORE,A_ARCHV2),"0",fSTORE(1,1,EA,fGETBYTE(0,fNEWREG_ST(NtN))))
+STD_GPST_AMODES(storerhnew, "Nt8.new", "memh","Store Half integer",ATTRIBS(A_REGWRSIZE_2B,A_NOTE_NEWVAL_SLOT0,A_NVSTORE,A_NOTE_NVSLOT0,A_MEMSIZE_2B,A_STORE,A_RESTRICT_NOSLOT1_STORE,A_ARCHV2),"1",fSTORE(1,2,EA,fGETHALF(0,fNEWREG_ST(NtN))))
diff --git a/target/hexagon/imported/subinsns.idef b/target/hexagon/imported/subinsns.idef
index ec1c74f479..be0ae8779d 100644
--- a/target/hexagon/imported/subinsns.idef
+++ b/target/hexagon/imported/subinsns.idef
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -39,18 +39,18 @@ Q6INSN(SA1_clrf,     "if (!p0) Rd16=#0",      ATTRIBS(A_SUBINSN),"clear if false
 Q6INSN(SA1_addsp,    "Rd16=add(r29,#u6:2)",   ATTRIBS(A_SUBINSN),"Add",        { RdV=fREAD_SP()+uiV; })
 Q6INSN(SA1_inc,      "Rd16=add(Rs16,#1)",     ATTRIBS(A_SUBINSN),"Inc",        { RdV=RsV+1;})
 Q6INSN(SA1_dec,      "Rd16=add(Rs16,#-1)",    ATTRIBS(A_SUBINSN),"Dec",        { RdV=RsV-1;})
-Q6INSN(SA1_addrx,    "Rx16=add(Rx16,Rs16)",   ATTRIBS(A_SUBINSN),"Add",        { RxV=RxV+RsV; })
+Q6INSN(SA1_addrx,    "Rx16=add(Rx16,Rs16)",   ATTRIBS(A_SUBINSN,A_COMMUTES),"Add",        { RxV=RxV+RsV; })
 Q6INSN(SA1_zxtb,     "Rd16=and(Rs16,#255)",   ATTRIBS(A_SUBINSN),"Zxtb",       { RdV= fZXTN(8,32,RsV);})
 Q6INSN(SA1_and1,     "Rd16=and(Rs16,#1)",     ATTRIBS(A_SUBINSN),"And #1",     { RdV= RsV&1;})
 Q6INSN(SA1_sxtb,     "Rd16=sxtb(Rs16)",       ATTRIBS(A_SUBINSN),"Sxtb",       { RdV= fSXTN(8,32,RsV);})
 Q6INSN(SA1_zxth,     "Rd16=zxth(Rs16)",       ATTRIBS(A_SUBINSN),"Zxth",       { RdV= fZXTN(16,32,RsV);})
 Q6INSN(SA1_sxth,     "Rd16=sxth(Rs16)",       ATTRIBS(A_SUBINSN),"Sxth",       { RdV= fSXTN(16,32,RsV);})
-Q6INSN(SA1_combinezr,"Rdd8=combine(#0,Rs16)", ATTRIBS(A_SUBINSN),"Combines",   { fSETWORD(0,RddV,RsV); fSETWORD(1,RddV,0); })
-Q6INSN(SA1_combinerz,"Rdd8=combine(Rs16,#0)", ATTRIBS(A_SUBINSN),"Combines",   { fSETWORD(0,RddV,0); fSETWORD(1,RddV,RsV); })
-Q6INSN(SA1_combine0i,"Rdd8=combine(#0,#u2)", ATTRIBS(A_SUBINSN),"Combines",   { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,0); })
-Q6INSN(SA1_combine1i,"Rdd8=combine(#1,#u2)", ATTRIBS(A_SUBINSN),"Combines",   { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,1); })
-Q6INSN(SA1_combine2i,"Rdd8=combine(#2,#u2)", ATTRIBS(A_SUBINSN),"Combines",   { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,2); })
-Q6INSN(SA1_combine3i,"Rdd8=combine(#3,#u2)", ATTRIBS(A_SUBINSN),"Combines",   { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,3); })
+Q6INSN(SA1_combinezr,"Rdd8=combine(#0,Rs16)", ATTRIBS(A_SUBINSN,A_ROPS_2),"Combines",   { fSETWORD(0,RddV,RsV); fSETWORD(1,RddV,0); })
+Q6INSN(SA1_combinerz,"Rdd8=combine(Rs16,#0)", ATTRIBS(A_SUBINSN,A_ROPS_2),"Combines",   { fSETWORD(0,RddV,0); fSETWORD(1,RddV,RsV); })
+Q6INSN(SA1_combine0i,"Rdd8=combine(#0,#u2)", ATTRIBS(A_SUBINSN,A_ROPS_2),"Combines",   { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,0); })
+Q6INSN(SA1_combine1i,"Rdd8=combine(#1,#u2)", ATTRIBS(A_SUBINSN,A_ROPS_2),"Combines",   { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,1); })
+Q6INSN(SA1_combine2i,"Rdd8=combine(#2,#u2)", ATTRIBS(A_SUBINSN,A_ROPS_2),"Combines",   { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,2); })
+Q6INSN(SA1_combine3i,"Rdd8=combine(#3,#u2)", ATTRIBS(A_SUBINSN,A_ROPS_2),"Combines",   { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,3); })
 Q6INSN(SA1_cmpeqi,   "p0=cmp.eq(Rs16,#u2)",   ATTRIBS(A_SUBINSN),"CompareImmed",{fWRITE_P0(f8BITSOF(RsV==uiV));})
 
 
@@ -62,16 +62,16 @@ Q6INSN(SA1_cmpeqi,   "p0=cmp.eq(Rs16,#u2)",   ATTRIBS(A_SUBINSN),"CompareImmed",
 /*                                                               */
 /*****************************************************************/
 
-Q6INSN(SL1_loadri_io,  "Rd16=memw(Rs16+#u4:2)", ATTRIBS(A_LOAD,A_SUBINSN),"load word", {fEA_RI(RsV,uiV); fLOAD(1,4,u,EA,RdV);})
-Q6INSN(SL1_loadrub_io, "Rd16=memub(Rs16+#u4:0)",ATTRIBS(A_LOAD,A_SUBINSN),"load byte", {fEA_RI(RsV,uiV); fLOAD(1,1,u,EA,RdV);})
+Q6INSN(SL1_loadri_io,  "Rd16=memw(Rs16+#u4:2)", ATTRIBS(A_REGWRSIZE_4B,A_MEMSIZE_4B,A_LOAD,A_SUBINSN),"load word", {fEA_RI(RsV,uiV); fLOAD(1,4,u,EA,RdV);})
+Q6INSN(SL1_loadrub_io, "Rd16=memub(Rs16+#u4:0)",ATTRIBS(A_MEMSIZE_1B,A_LOAD,A_SUBINSN,A_REGWRSIZE_1B),"load byte", {fEA_RI(RsV,uiV); fLOAD(1,1,u,EA,RdV);})
 
-Q6INSN(SL2_loadrh_io,  "Rd16=memh(Rs16+#u3:1)", ATTRIBS(A_LOAD,A_SUBINSN),"load half", {fEA_RI(RsV,uiV); fLOAD(1,2,s,EA,RdV);})
-Q6INSN(SL2_loadruh_io, "Rd16=memuh(Rs16+#u3:1)",ATTRIBS(A_LOAD,A_SUBINSN),"load half", {fEA_RI(RsV,uiV); fLOAD(1,2,u,EA,RdV);})
-Q6INSN(SL2_loadrb_io,  "Rd16=memb(Rs16+#u3:0)", ATTRIBS(A_LOAD,A_SUBINSN),"load byte", {fEA_RI(RsV,uiV); fLOAD(1,1,s,EA,RdV);})
-Q6INSN(SL2_loadri_sp,  "Rd16=memw(r29+#u5:2)",  ATTRIBS(A_LOAD,A_SUBINSN),"load word", {fEA_RI(fREAD_SP(),uiV); fLOAD(1,4,u,EA,RdV);})
-Q6INSN(SL2_loadrd_sp,  "Rdd8=memd(r29+#u5:3)", ATTRIBS(A_LOAD,A_SUBINSN),"load dword",{fEA_RI(fREAD_SP(),uiV); fLOAD(1,8,u,EA,RddV);})
+Q6INSN(SL2_loadrh_io,  "Rd16=memh(Rs16+#u3:1)", ATTRIBS(A_REGWRSIZE_2B,A_MEMSIZE_2B,A_LOAD,A_SUBINSN),"load half", {fEA_RI(RsV,uiV); fLOAD(1,2,s,EA,RdV);})
+Q6INSN(SL2_loadruh_io, "Rd16=memuh(Rs16+#u3:1)",ATTRIBS(A_REGWRSIZE_2B,A_MEMSIZE_2B,A_LOAD,A_SUBINSN),"load half", {fEA_RI(RsV,uiV); fLOAD(1,2,u,EA,RdV);})
+Q6INSN(SL2_loadrb_io,  "Rd16=memb(Rs16+#u3:0)", ATTRIBS(A_MEMSIZE_1B,A_LOAD,A_SUBINSN),"load byte", {fEA_RI(RsV,uiV); fLOAD(1,1,s,EA,RdV);})
+Q6INSN(SL2_loadri_sp,  "Rd16=memw(r29+#u5:2)",  ATTRIBS(A_REGWRSIZE_4B,A_MEMSIZE_4B,A_LOAD,A_SUBINSN),"load word", {fEA_RI(fREAD_SP(),uiV); fLOAD(1,4,u,EA,RdV);})
+Q6INSN(SL2_loadrd_sp,  "Rdd8=memd(r29+#u5:3)", ATTRIBS(A_REGWRSIZE_8B,A_MEMSIZE_8B,A_LOAD,A_SUBINSN),"load dword",{fEA_RI(fREAD_SP(),uiV); fLOAD(1,8,u,EA,RddV);})
 
-Q6INSN(SL2_deallocframe,"deallocframe", ATTRIBS(A_SUBINSN,A_LOAD), "Deallocate stack frame",
+Q6INSN(SL2_deallocframe,"deallocframe", ATTRIBS(A_REGWRSIZE_8B,A_SUBINSN,A_MEMSIZE_8B,A_LOAD,A_DEALLOCFRAME), "Deallocate stack frame",
 { fHIDE(size8u_t tmp;) fEA_REG(fREAD_FP());
   fLOAD(1,8,u,EA,tmp);
   tmp = fFRAME_UNSCRAMBLE(tmp);
@@ -79,7 +79,7 @@ Q6INSN(SL2_deallocframe,"deallocframe", ATTRIBS(A_SUBINSN,A_LOAD), "Deallocate s
   fWRITE_FP(fGETWORD(0,tmp));
   fWRITE_SP(EA+8); })
 
-Q6INSN(SL2_return,"dealloc_return", ATTRIBS(A_JINDIR,A_SUBINSN,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY), "Deallocate stack frame and return",
+Q6INSN(SL2_return,"dealloc_return", ATTRIBS(A_REGWRSIZE_8B,A_JINDIR,A_SUBINSN,A_ROPS_2,A_MEMSIZE_8B,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY,A_RET_TYPE,A_DEALLOCRET), "Deallocate stack frame and return",
 { fHIDE(size8u_t tmp;) fEA_REG(fREAD_FP());
   fLOAD(1,8,u,EA,tmp);
   tmp = fFRAME_UNSCRAMBLE(tmp);
@@ -88,40 +88,40 @@ Q6INSN(SL2_return,"dealloc_return", ATTRIBS(A_JINDIR,A_SUBINSN,A_LOAD,A_RETURN,A
   fWRITE_SP(EA+8);
   fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);})
 
-Q6INSN(SL2_return_t,"if (p0) dealloc_return", ATTRIBS(A_JINDIROLD,A_SUBINSN,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY), "Deallocate stack frame and return",
+Q6INSN(SL2_return_t,"if (p0) dealloc_return", ATTRIBS(A_REGWRSIZE_8B,A_JINDIROLD,A_SUBINSN,A_ROPS_2,A_MEMSIZE_8B,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY,A_RET_TYPE), "Deallocate stack frame and return",
 { fHIDE(size8u_t tmp;); fBRANCH_SPECULATE_STALL(fLSBOLD(fREAD_P0()),, SPECULATE_NOT_TAKEN,4,0); fEA_REG(fREAD_FP()); if (fLSBOLD(fREAD_P0())) { fLOAD(1,8,u,EA,tmp); tmp = fFRAME_UNSCRAMBLE(tmp); fWRITE_LR(fGETWORD(1,tmp)); fWRITE_FP(fGETWORD(0,tmp)); fWRITE_SP(EA+8);
   fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);} else {LOAD_CANCEL(EA);} })
 
-Q6INSN(SL2_return_f,"if (!p0) dealloc_return", ATTRIBS(A_JINDIROLD,A_SUBINSN,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY), "Deallocate stack frame and return",
+Q6INSN(SL2_return_f,"if (!p0) dealloc_return", ATTRIBS(A_REGWRSIZE_8B,A_JINDIROLD,A_SUBINSN,A_ROPS_2,A_MEMSIZE_8B,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY,A_RET_TYPE), "Deallocate stack frame and return",
 { fHIDE(size8u_t tmp;);fBRANCH_SPECULATE_STALL(fLSBOLDNOT(fREAD_P0()),, SPECULATE_NOT_TAKEN,4,0); fEA_REG(fREAD_FP()); if (fLSBOLDNOT(fREAD_P0())) { fLOAD(1,8,u,EA,tmp); tmp = fFRAME_UNSCRAMBLE(tmp); fWRITE_LR(fGETWORD(1,tmp)); fWRITE_FP(fGETWORD(0,tmp)); fWRITE_SP(EA+8);
   fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);} else {LOAD_CANCEL(EA);} })
 
 
 
-Q6INSN(SL2_return_tnew,"if (p0.new) dealloc_return:nt", ATTRIBS(A_JINDIRNEW,A_SUBINSN,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY), "Deallocate stack frame and return",
+Q6INSN(SL2_return_tnew,"if (p0.new) dealloc_return:nt", ATTRIBS(A_REGWRSIZE_8B,A_JINDIRNEW,A_SUBINSN,A_ROPS_2,A_MEMSIZE_8B,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY,A_RET_TYPE), "Deallocate stack frame and return",
 { fHIDE(size8u_t tmp;) fBRANCH_SPECULATE_STALL(fLSBNEW0,, SPECULATE_NOT_TAKEN , 4,3); fEA_REG(fREAD_FP()); if (fLSBNEW0) { fLOAD(1,8,u,EA,tmp); tmp = fFRAME_UNSCRAMBLE(tmp); fWRITE_LR(fGETWORD(1,tmp)); fWRITE_FP(fGETWORD(0,tmp)); fWRITE_SP(EA+8);
   fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);} else {LOAD_CANCEL(EA);} })
 
-Q6INSN(SL2_return_fnew,"if (!p0.new) dealloc_return:nt", ATTRIBS(A_JINDIRNEW,A_SUBINSN,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY), "Deallocate stack frame and return",
+Q6INSN(SL2_return_fnew,"if (!p0.new) dealloc_return:nt", ATTRIBS(A_REGWRSIZE_8B,A_JINDIRNEW,A_SUBINSN,A_ROPS_2,A_MEMSIZE_8B,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY,A_RET_TYPE), "Deallocate stack frame and return",
 { fHIDE(size8u_t tmp;) fBRANCH_SPECULATE_STALL(fLSBNEW0NOT,, SPECULATE_NOT_TAKEN , 4,3); fEA_REG(fREAD_FP()); if (fLSBNEW0NOT) { fLOAD(1,8,u,EA,tmp); tmp = fFRAME_UNSCRAMBLE(tmp); fWRITE_LR(fGETWORD(1,tmp)); fWRITE_FP(fGETWORD(0,tmp)); fWRITE_SP(EA+8);
   fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);} else {LOAD_CANCEL(EA);} })
 
 
-Q6INSN(SL2_jumpr31,"jumpr r31",ATTRIBS(A_SUBINSN,A_JINDIR,A_RESTRICT_SLOT0ONLY),"indirect unconditional jump",
+Q6INSN(SL2_jumpr31,"jumpr r31",ATTRIBS(A_SUBINSN,A_JINDIR,A_RESTRICT_SLOT0ONLY,A_RET_TYPE),"indirect unconditional jump",
 { fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);})
 
-Q6INSN(SL2_jumpr31_t,"if (p0) jumpr r31",ATTRIBS(A_SUBINSN,A_JINDIROLD,A_RESTRICT_SLOT0ONLY),"indirect conditional jump if true",
+Q6INSN(SL2_jumpr31_t,"if (p0) jumpr r31",ATTRIBS(A_SUBINSN,A_JINDIROLD,A_NOTE_CONDITIONAL,A_RESTRICT_SLOT0ONLY,A_RET_TYPE),"indirect conditional jump if true",
 {fBRANCH_SPECULATE_STALL(fLSBOLD(fREAD_P0()),, SPECULATE_TAKEN,4,0); if (fLSBOLD(fREAD_P0())) {fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);}})
 
-Q6INSN(SL2_jumpr31_f,"if (!p0) jumpr r31",ATTRIBS(A_SUBINSN,A_JINDIROLD,A_RESTRICT_SLOT0ONLY),"indirect conditional jump if false",
+Q6INSN(SL2_jumpr31_f,"if (!p0) jumpr r31",ATTRIBS(A_SUBINSN,A_JINDIROLD,A_NOTE_CONDITIONAL,A_RESTRICT_SLOT0ONLY,A_RET_TYPE),"indirect conditional jump if false",
 {fBRANCH_SPECULATE_STALL(fLSBOLDNOT(fREAD_P0()),, SPECULATE_TAKEN,4,0); if (fLSBOLDNOT(fREAD_P0())) {fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);}})
 
 
 
-Q6INSN(SL2_jumpr31_tnew,"if (p0.new) jumpr:nt r31",ATTRIBS(A_SUBINSN,A_JINDIRNEW,A_RESTRICT_SLOT0ONLY),"indirect conditional jump if true",
+Q6INSN(SL2_jumpr31_tnew,"if (p0.new) jumpr:nt r31",ATTRIBS(A_SUBINSN,A_JINDIRNEW,A_NOTE_CONDITIONAL,A_RESTRICT_SLOT0ONLY,A_RET_TYPE),"indirect conditional jump if true",
 {fBRANCH_SPECULATE_STALL(fLSBNEW0,, SPECULATE_NOT_TAKEN , 4,3); if (fLSBNEW0) {fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);}})
 
-Q6INSN(SL2_jumpr31_fnew,"if (!p0.new) jumpr:nt r31",ATTRIBS(A_SUBINSN,A_JINDIRNEW,A_RESTRICT_SLOT0ONLY),"indirect conditional jump if false",
+Q6INSN(SL2_jumpr31_fnew,"if (!p0.new) jumpr:nt r31",ATTRIBS(A_SUBINSN,A_JINDIRNEW,A_NOTE_CONDITIONAL,A_RESTRICT_SLOT0ONLY,A_RET_TYPE),"indirect conditional jump if false",
 {fBRANCH_SPECULATE_STALL(fLSBNEW0NOT,, SPECULATE_NOT_TAKEN , 4,3); if (fLSBNEW0NOT) {fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);}})
 
 
@@ -134,16 +134,16 @@ Q6INSN(SL2_jumpr31_fnew,"if (!p0.new) jumpr:nt r31",ATTRIBS(A_SUBINSN,A_JINDIRNE
 /*                                                               */
 /*****************************************************************/
 
-Q6INSN(SS1_storew_io,  "memw(Rs16+#u4:2)=Rt16", ATTRIBS(A_STORE,A_SUBINSN), "store word", {fEA_RI(RsV,uiV); fSTORE(1,4,EA,RtV);})
-Q6INSN(SS1_storeb_io,  "memb(Rs16+#u4:0)=Rt16", ATTRIBS(A_STORE,A_SUBINSN), "store byte", {fEA_RI(RsV,uiV); fSTORE(1,1,EA,fGETBYTE(0,RtV));})
-Q6INSN(SS2_storeh_io,  "memh(Rs16+#u3:1)=Rt16", ATTRIBS(A_STORE,A_SUBINSN), "store half", {fEA_RI(RsV,uiV); fSTORE(1,2,EA,fGETHALF(0,RtV));})
-Q6INSN(SS2_stored_sp,  "memd(r29+#s6:3)=Rtt8", ATTRIBS(A_STORE,A_SUBINSN), "store dword",{fEA_RI(fREAD_SP(),siV); fSTORE(1,8,EA,RttV);})
-Q6INSN(SS2_storew_sp,  "memw(r29+#u5:2)=Rt16",  ATTRIBS(A_STORE,A_SUBINSN), "store word", {fEA_RI(fREAD_SP(),uiV); fSTORE(1,4,EA,RtV);})
-Q6INSN(SS2_storewi0,   "memw(Rs16+#u4:2)=#0", ATTRIBS(A_STORE,A_SUBINSN), "store word", {fEA_RI(RsV,uiV); fSTORE(1,4,EA,0);})
-Q6INSN(SS2_storebi0,   "memb(Rs16+#u4:0)=#0", ATTRIBS(A_STORE,A_SUBINSN), "store byte", {fEA_RI(RsV,uiV); fSTORE(1,1,EA,0);})
-Q6INSN(SS2_storewi1,   "memw(Rs16+#u4:2)=#1", ATTRIBS(A_STORE,A_SUBINSN), "store word", {fEA_RI(RsV,uiV); fSTORE(1,4,EA,1);})
-Q6INSN(SS2_storebi1,   "memb(Rs16+#u4:0)=#1", ATTRIBS(A_STORE,A_SUBINSN), "store byte", {fEA_RI(RsV,uiV); fSTORE(1,1,EA,1);})
+Q6INSN(SS1_storew_io,  "memw(Rs16+#u4:2)=Rt16", ATTRIBS(A_REGWRSIZE_4B,A_MEMSIZE_4B,A_STORE,A_SUBINSN), "store word", {fEA_RI(RsV,uiV); fSTORE(1,4,EA,RtV);})
+Q6INSN(SS1_storeb_io,  "memb(Rs16+#u4:0)=Rt16", ATTRIBS(A_MEMSIZE_1B,A_STORE,A_SUBINSN), "store byte", {fEA_RI(RsV,uiV); fSTORE(1,1,EA,fGETBYTE(0,RtV));})
+Q6INSN(SS2_storeh_io,  "memh(Rs16+#u3:1)=Rt16", ATTRIBS(A_MEMSIZE_2B,A_STORE,A_SUBINSN), "store half", {fEA_RI(RsV,uiV); fSTORE(1,2,EA,fGETHALF(0,RtV));})
+Q6INSN(SS2_stored_sp,  "memd(r29+#s6:3)=Rtt8", ATTRIBS(A_REGWRSIZE_8B,A_MEMSIZE_8B,A_STORE,A_SUBINSN), "store dword",{fEA_RI(fREAD_SP(),siV); fSTORE(1,8,EA,RttV);})
+Q6INSN(SS2_storew_sp,  "memw(r29+#u5:2)=Rt16",  ATTRIBS(A_REGWRSIZE_4B,A_MEMSIZE_4B,A_STORE,A_SUBINSN), "store word", {fEA_RI(fREAD_SP(),uiV); fSTORE(1,4,EA,RtV);})
+Q6INSN(SS2_storewi0,   "memw(Rs16+#u4:2)=#0", ATTRIBS(A_REGWRSIZE_4B,A_MEMSIZE_4B,A_STORE,A_SUBINSN,A_ROPS_2), "store word", {fEA_RI(RsV,uiV); fSTORE(1,4,EA,0);})
+Q6INSN(SS2_storebi0,   "memb(Rs16+#u4:0)=#0", ATTRIBS(A_MEMSIZE_1B,A_STORE,A_SUBINSN,A_ROPS_2), "store byte", {fEA_RI(RsV,uiV); fSTORE(1,1,EA,0);})
+Q6INSN(SS2_storewi1,   "memw(Rs16+#u4:2)=#1", ATTRIBS(A_REGWRSIZE_4B,A_MEMSIZE_4B,A_STORE,A_SUBINSN,A_ROPS_2), "store word", {fEA_RI(RsV,uiV); fSTORE(1,4,EA,1);})
+Q6INSN(SS2_storebi1,   "memb(Rs16+#u4:0)=#1", ATTRIBS(A_MEMSIZE_1B,A_STORE,A_SUBINSN,A_ROPS_2), "store byte", {fEA_RI(RsV,uiV); fSTORE(1,1,EA,1);})
 
 
-Q6INSN(SS2_allocframe,"allocframe(#u5:3)", ATTRIBS(A_SUBINSN,A_STORE,A_RESTRICT_SLOT0ONLY), "Allocate stack frame",
+Q6INSN(SS2_allocframe,"allocframe(#u5:3)", ATTRIBS(A_REGWRSIZE_8B,A_SUBINSN,A_MEMSIZE_8B,A_STORE,A_RESTRICT_SLOT0ONLY), "Allocate stack frame",
 { fEA_RI(fREAD_SP(),-8);  fSTORE(1,8,EA,fFRAME_SCRAMBLE((fCAST8_8u(fREAD_LR()) << 32) | fCAST4_4u(fREAD_FP()))); fWRITE_FP(EA); fFRAMECHECK(EA-uiV,EA); fWRITE_SP(EA-uiV); })
-- 
2.17.1



* [PATCH 2/3] Hexagon (target/hexagon) move store size tracking to translation
  2022-09-20  8:07 [PATCH 0/3] Hexagon (target/hexagon) improve store handling Taylor Simpson
  2022-09-20  8:07 ` [PATCH 1/3] Hexagon (target/hexagon) add instruction attributes from archlib Taylor Simpson
@ 2022-09-20  8:07 ` Taylor Simpson
  2022-09-28 16:09   ` Richard Henderson
  2022-09-20  8:07 ` [PATCH 3/3] Hexagon (target/hexagon) Change decision to set pkt_has_store_s[01] Taylor Simpson
  2 siblings, 1 reply; 8+ messages in thread
From: Taylor Simpson @ 2022-09-20  8:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: tsimpson, richard.henderson, f4bug, ale, anjo, bcain, mlambert

The store width is needed for packet commit, so it is stored in
ctx->store_width.  Currently, it is set when a store has a TCG
override instead of a QEMU helper.  In the QEMU helper case,
ctx->store_width is not set, so we invoke a helper during packet commit
that uses the runtime store width.

This patch ensures ctx->store_width is set for all store instructions,
so performance is improved because packet commit can generate the proper
TCG store rather than the generic helper.

We do this by
- Using the attributes from the instructions during translation to
  set ctx->store_width
- Removing the setting of ctx->store_width from genptr.c
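
The benefit described above can be pictured with a small standalone
model.  This is only a sketch and not the QEMU code: ToyCtx,
ToyCommitOp and pick_commit_op are hypothetical names, and using 0 to
mean "width not known at translation time" is an assumption made for
the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the per-slot width recorded during translation. */
typedef struct {
    uint8_t store_width[2];   /* 0 = not known at translation time (assumed) */
} ToyCtx;

/* Hypothetical choices available to packet commit. */
typedef enum {
    EMIT_ST8,      /* fixed 1-byte store */
    EMIT_ST16,     /* fixed 2-byte store */
    EMIT_ST32,     /* fixed 4-byte store */
    EMIT_ST64,     /* fixed 8-byte store */
    EMIT_HELPER    /* generic helper that reads the width at run time */
} ToyCommitOp;

/* Pick a fixed-width store when the width is known, else fall back. */
static ToyCommitOp pick_commit_op(const ToyCtx *ctx, int slot)
{
    switch (ctx->store_width[slot]) {
    case 1:
        return EMIT_ST8;
    case 2:
        return EMIT_ST16;
    case 4:
        return EMIT_ST32;
    case 8:
        return EMIT_ST64;
    default:
        return EMIT_HELPER;
    }
}
```

With the width filled in for every store, the fallback branch in this
model would no longer be reached for store instructions.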

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/macros.h    |  8 ++++----
 target/hexagon/genptr.c    | 36 ++++++++++++------------------------
 target/hexagon/translate.c | 26 ++++++++++++++++++++++++++
 3 files changed, 42 insertions(+), 28 deletions(-)

diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 92eb8bbf05..c8805bdaeb 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -156,7 +156,7 @@
         __builtin_choose_expr(TYPE_TCGV(X), \
             gen_store1, (void)0))
 #define MEM_STORE1(VA, DATA, SLOT) \
-    MEM_STORE1_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
+    MEM_STORE1_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
 
 #define MEM_STORE2_FUNC(X) \
     __builtin_choose_expr(TYPE_INT(X), \
@@ -164,7 +164,7 @@
         __builtin_choose_expr(TYPE_TCGV(X), \
             gen_store2, (void)0))
 #define MEM_STORE2(VA, DATA, SLOT) \
-    MEM_STORE2_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
+    MEM_STORE2_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
 
 #define MEM_STORE4_FUNC(X) \
     __builtin_choose_expr(TYPE_INT(X), \
@@ -172,7 +172,7 @@
         __builtin_choose_expr(TYPE_TCGV(X), \
             gen_store4, (void)0))
 #define MEM_STORE4(VA, DATA, SLOT) \
-    MEM_STORE4_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
+    MEM_STORE4_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
 
 #define MEM_STORE8_FUNC(X) \
     __builtin_choose_expr(TYPE_INT(X), \
@@ -180,7 +180,7 @@
         __builtin_choose_expr(TYPE_TCGV_I64(X), \
             gen_store8, (void)0))
 #define MEM_STORE8(VA, DATA, SLOT) \
-    MEM_STORE8_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
+    MEM_STORE8_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
 #else
 #define MEM_LOAD1s(VA) ((int8_t)mem_load1(env, slot, VA))
 #define MEM_LOAD1u(VA) ((uint8_t)mem_load1(env, slot, VA))
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 8a334ba07b..806d0974ff 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -401,62 +401,50 @@ static inline void gen_store32(TCGv vaddr, TCGv src, int width, int slot)
     tcg_gen_mov_tl(hex_store_val32[slot], src);
 }
 
-static inline void gen_store1(TCGv_env cpu_env, TCGv vaddr, TCGv src,
-                              DisasContext *ctx, int slot)
+static inline void gen_store1(TCGv_env cpu_env, TCGv vaddr, TCGv src, int slot)
 {
     gen_store32(vaddr, src, 1, slot);
-    ctx->store_width[slot] = 1;
 }
 
-static inline void gen_store1i(TCGv_env cpu_env, TCGv vaddr, int32_t src,
-                               DisasContext *ctx, int slot)
+static inline void gen_store1i(TCGv_env cpu_env, TCGv vaddr, int32_t src, int slot)
 {
     TCGv tmp = tcg_constant_tl(src);
-    gen_store1(cpu_env, vaddr, tmp, ctx, slot);
+    gen_store1(cpu_env, vaddr, tmp, slot);
 }
 
-static inline void gen_store2(TCGv_env cpu_env, TCGv vaddr, TCGv src,
-                              DisasContext *ctx, int slot)
+static inline void gen_store2(TCGv_env cpu_env, TCGv vaddr, TCGv src, int slot)
 {
     gen_store32(vaddr, src, 2, slot);
-    ctx->store_width[slot] = 2;
 }
 
-static inline void gen_store2i(TCGv_env cpu_env, TCGv vaddr, int32_t src,
-                               DisasContext *ctx, int slot)
+static inline void gen_store2i(TCGv_env cpu_env, TCGv vaddr, int32_t src, int slot)
 {
     TCGv tmp = tcg_constant_tl(src);
-    gen_store2(cpu_env, vaddr, tmp, ctx, slot);
+    gen_store2(cpu_env, vaddr, tmp, slot);
 }
 
-static inline void gen_store4(TCGv_env cpu_env, TCGv vaddr, TCGv src,
-                              DisasContext *ctx, int slot)
+static inline void gen_store4(TCGv_env cpu_env, TCGv vaddr, TCGv src, int slot)
 {
     gen_store32(vaddr, src, 4, slot);
-    ctx->store_width[slot] = 4;
 }
 
-static inline void gen_store4i(TCGv_env cpu_env, TCGv vaddr, int32_t src,
-                               DisasContext *ctx, int slot)
+static inline void gen_store4i(TCGv_env cpu_env, TCGv vaddr, int32_t src, int slot)
 {
     TCGv tmp = tcg_constant_tl(src);
-    gen_store4(cpu_env, vaddr, tmp, ctx, slot);
+    gen_store4(cpu_env, vaddr, tmp, slot);
 }
 
-static inline void gen_store8(TCGv_env cpu_env, TCGv vaddr, TCGv_i64 src,
-                              DisasContext *ctx, int slot)
+static inline void gen_store8(TCGv_env cpu_env, TCGv vaddr, TCGv_i64 src, int slot)
 {
     tcg_gen_mov_tl(hex_store_addr[slot], vaddr);
     tcg_gen_movi_tl(hex_store_width[slot], 8);
     tcg_gen_mov_i64(hex_store_val64[slot], src);
-    ctx->store_width[slot] = 8;
 }
 
-static inline void gen_store8i(TCGv_env cpu_env, TCGv vaddr, int64_t src,
-                               DisasContext *ctx, int slot)
+static inline void gen_store8i(TCGv_env cpu_env, TCGv vaddr, int64_t src, int slot)
 {
     TCGv_i64 tmp = tcg_constant_i64(src);
-    gen_store8(cpu_env, vaddr, tmp, ctx, slot);
+    gen_store8(cpu_env, vaddr, tmp, slot);
 }
 
 static TCGv gen_8bitsof(TCGv result, TCGv value)
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 0e8a0772f7..bc02870b9f 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -327,6 +327,31 @@ static void mark_implicit_pred_writes(DisasContext *ctx, Insn *insn)
     mark_implicit_pred_write(ctx, insn, A_IMPLICIT_WRITES_P3, 3);
 }
 
+static void mark_store_width(DisasContext *ctx, Insn *insn)
+{
+    uint16_t opcode = insn->opcode;
+    uint32_t slot = insn->slot;
+
+    if (GET_ATTRIB(opcode, A_STORE)) {
+        if (GET_ATTRIB(opcode, A_MEMSIZE_1B)) {
+            ctx->store_width[slot] = 1;
+            return;
+        }
+        if (GET_ATTRIB(opcode, A_MEMSIZE_2B)) {
+            ctx->store_width[slot] = 2;
+            return;
+        }
+        if (GET_ATTRIB(opcode, A_MEMSIZE_4B)) {
+            ctx->store_width[slot] = 4;
+            return;
+        }
+        if (GET_ATTRIB(opcode, A_MEMSIZE_8B)) {
+            ctx->store_width[slot] = 8;
+            return;
+        }
+    }
+}
+
 static void gen_insn(CPUHexagonState *env, DisasContext *ctx,
                      Insn *insn, Packet *pkt)
 {
@@ -334,6 +359,7 @@ static void gen_insn(CPUHexagonState *env, DisasContext *ctx,
         mark_implicit_reg_writes(ctx, insn);
         insn->generate(env, ctx, insn, pkt);
         mark_implicit_pred_writes(ctx, insn);
+        mark_store_width(ctx, insn);
     } else {
         gen_exception_end_tb(ctx, HEX_EXCP_INVALID_OPCODE);
     }
-- 
2.17.1



* [PATCH 3/3] Hexagon (target/hexagon) Change decision to set pkt_has_store_s[01]
  2022-09-20  8:07 [PATCH 0/3] Hexagon (target/hexagon) improve store handling Taylor Simpson
  2022-09-20  8:07 ` [PATCH 1/3] Hexagon (target/hexagon) add instruction attributes from archlib Taylor Simpson
  2022-09-20  8:07 ` [PATCH 2/3] Hexagon (target/hexagon) move store size tracking to translation Taylor Simpson
@ 2022-09-20  8:07 ` Taylor Simpson
  2022-09-28 16:11   ` Richard Henderson
  2 siblings, 1 reply; 8+ messages in thread
From: Taylor Simpson @ 2022-09-20  8:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: tsimpson, richard.henderson, f4bug, ale, anjo, bcain, mlambert

We have found cases where pkt_has_store_s[01] is set incorrectly.
This leads to generating an unnecessary store that is left over
from a previous packet.

Add an attribute to determine if an instruction is a scalar store.
The attribute is attached to the fSTORE macro (hex_common.py).
Simplify the logic in decode.c that sets pkt_has_store_s[01].
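
As a rough standalone illustration of the new decision (not the QEMU
code itself), the sketch below models the condition with plain bit
flags.  T_STORE, T_SCALAR_STORE, T_MEMSIZE_0B, ToyPkt and
toy_mark_store are hypothetical stand-ins for the real attributes and
the Packet structure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for A_STORE, A_SCALAR_STORE and A_MEMSIZE_0B. */
enum {
    T_STORE        = 1u << 0,
    T_SCALAR_STORE = 1u << 1,
    T_MEMSIZE_0B   = 1u << 2,
};

/* Hypothetical model of the two per-packet flags. */
typedef struct {
    bool has_store_s0;
    bool has_store_s1;
} ToyPkt;

/*
 * Flag the slot only for a scalar store with a nonzero memory size;
 * other instructions carrying the store attribute leave the flags alone.
 */
static void toy_mark_store(ToyPkt *pkt, unsigned attrs, int slot)
{
    if ((attrs & T_STORE) &&
        (attrs & T_SCALAR_STORE) &&
        !(attrs & T_MEMSIZE_0B)) {
        if (slot == 0) {
            pkt->has_store_s0 = true;
        } else {
            pkt->has_store_s1 = true;
        }
    }
}
```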

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/attribs_def.h.inc |  1 +
 target/hexagon/decode.c          | 17 ++++++++++++-----
 target/hexagon/translate.c       | 10 ++++++----
 target/hexagon/hex_common.py     |  3 ++-
 4 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc
index 222ad95fb0..5d2a102c18 100644
--- a/target/hexagon/attribs_def.h.inc
+++ b/target/hexagon/attribs_def.h.inc
@@ -44,6 +44,7 @@ DEF_ATTRIB(MEMSIZE_1B, "Memory width is 1 byte", "", "")
 DEF_ATTRIB(MEMSIZE_2B, "Memory width is 2 bytes", "", "")
 DEF_ATTRIB(MEMSIZE_4B, "Memory width is 4 bytes", "", "")
 DEF_ATTRIB(MEMSIZE_8B, "Memory width is 8 bytes", "", "")
+DEF_ATTRIB(SCALAR_STORE, "Store is scalar", "", "")
 DEF_ATTRIB(REGWRSIZE_1B, "Memory width is 1 byte", "", "")
 DEF_ATTRIB(REGWRSIZE_2B, "Memory width is 2 bytes", "", "")
 DEF_ATTRIB(REGWRSIZE_4B, "Memory width is 4 bytes", "", "")
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index 6f0f27b4ba..2ba94a77de 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -402,10 +402,17 @@ static void decode_set_insn_attr_fields(Packet *pkt)
         }
 
         if (GET_ATTRIB(opcode, A_STORE)) {
-            if (pkt->insn[i].slot == 0) {
-                pkt->pkt_has_store_s0 = true;
-            } else {
-                pkt->pkt_has_store_s1 = true;
+            if (GET_ATTRIB(opcode, A_SCALAR_STORE) &&
+                !GET_ATTRIB(opcode, A_MEMSIZE_0B)) {
+                g_assert(GET_ATTRIB(opcode, A_MEMSIZE_1B) ||
+                         GET_ATTRIB(opcode, A_MEMSIZE_2B) ||
+                         GET_ATTRIB(opcode, A_MEMSIZE_4B) ||
+                         GET_ATTRIB(opcode, A_MEMSIZE_8B));
+                if (pkt->insn[i].slot == 0) {
+                    pkt->pkt_has_store_s0 = true;
+                } else {
+                    pkt->pkt_has_store_s1 = true;
+                }
             }
         }
 
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index bc02870b9f..efe7d2264e 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -525,10 +525,12 @@ static void process_store_log(DisasContext *ctx, Packet *pkt)
      *  slot 1 and then slot 0.  This will be important when
      *  the memory accesses overlap.
      */
-    if (pkt->pkt_has_store_s1 && !pkt->pkt_has_dczeroa) {
+    if (pkt->pkt_has_store_s1) {
+        g_assert(!pkt->pkt_has_dczeroa);
         process_store(ctx, pkt, 1);
     }
-    if (pkt->pkt_has_store_s0 && !pkt->pkt_has_dczeroa) {
+    if (pkt->pkt_has_store_s0) {
+        g_assert(!pkt->pkt_has_dczeroa);
         process_store(ctx, pkt, 0);
     }
 }
@@ -691,7 +693,7 @@ static void gen_commit_packet(CPUHexagonState *env, DisasContext *ctx,
          * The dczeroa will be the store in slot 0, check that we don't have
          * a store in slot 1 or an HVX store.
          */
-        g_assert(has_store_s0 && !has_store_s1 && !has_hvx_store);
+        g_assert(!has_store_s1 && !has_hvx_store);
         process_dczeroa(ctx, pkt);
     } else if (has_hvx_store) {
         TCGv mem_idx = tcg_constant_tl(ctx->mem_idx);
diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
index c81aca8d2a..d9ba7df786 100755
--- a/target/hexagon/hex_common.py
+++ b/target/hexagon/hex_common.py
@@ -1,7 +1,7 @@
 #!/usr/bin/env python3
 
 ##
-##  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
 ##
 ##  This program is free software; you can redistribute it and/or modify
 ##  it under the terms of the GNU General Public License as published by
@@ -75,6 +75,7 @@ def calculate_attribs():
     add_qemu_macro_attrib('fWRITE_P3', 'A_WRITES_PRED_REG')
     add_qemu_macro_attrib('fSET_OVERFLOW', 'A_IMPLICIT_WRITES_USR')
     add_qemu_macro_attrib('fSET_LPCFG', 'A_IMPLICIT_WRITES_USR')
+    add_qemu_macro_attrib('fSTORE', 'A_SCALAR_STORE')
 
     # Recurse down macros, find attributes from sub-macros
     macroValues = list(macros.values())
-- 
2.17.1



* Re: [PATCH 2/3] Hexagon (target/hexagon) move store size tracking to translation
  2022-09-20  8:07 ` [PATCH 2/3] Hexagon (target/hexagon) move store size tracking to translation Taylor Simpson
@ 2022-09-28 16:09   ` Richard Henderson
  0 siblings, 0 replies; 8+ messages in thread
From: Richard Henderson @ 2022-09-28 16:09 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel; +Cc: f4bug, ale, anjo, bcain, mlambert

On 9/20/22 01:07, Taylor Simpson wrote:
> The store width is needed for packet commit, so it is stored in
> ctx->store_width.  Currently, it is set when a store has a TCG
> override instead of a QEMU helper.  In the QEMU helper case,
> ctx->store_width is not set, so we invoke a helper during packet commit
> that uses the runtime store width.
> 
> This patch ensures ctx->store_width is set for all store instructions,
> so performance is improved because packet commit can generate the proper
> TCG store rather than the generic helper.
> 
> We do this by
> - Using the attributes from the instructions during translation to
>    set ctx->store_width
> - Removing the setting of ctx->store_width from genptr.c
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>   target/hexagon/macros.h    |  8 ++++----
>   target/hexagon/genptr.c    | 36 ++++++++++++------------------------
>   target/hexagon/translate.c | 26 ++++++++++++++++++++++++++
>   3 files changed, 42 insertions(+), 28 deletions(-)
> 
> diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
> index 92eb8bbf05..c8805bdaeb 100644
> --- a/target/hexagon/macros.h
> +++ b/target/hexagon/macros.h
> @@ -156,7 +156,7 @@
>           __builtin_choose_expr(TYPE_TCGV(X), \
>               gen_store1, (void)0))
>   #define MEM_STORE1(VA, DATA, SLOT) \
> -    MEM_STORE1_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
> +    MEM_STORE1_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
>   
>   #define MEM_STORE2_FUNC(X) \
>       __builtin_choose_expr(TYPE_INT(X), \
> @@ -164,7 +164,7 @@
>           __builtin_choose_expr(TYPE_TCGV(X), \
>               gen_store2, (void)0))
>   #define MEM_STORE2(VA, DATA, SLOT) \
> -    MEM_STORE2_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
> +    MEM_STORE2_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
>   
>   #define MEM_STORE4_FUNC(X) \
>       __builtin_choose_expr(TYPE_INT(X), \
> @@ -172,7 +172,7 @@
>           __builtin_choose_expr(TYPE_TCGV(X), \
>               gen_store4, (void)0))
>   #define MEM_STORE4(VA, DATA, SLOT) \
> -    MEM_STORE4_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
> +    MEM_STORE4_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
>   
>   #define MEM_STORE8_FUNC(X) \
>       __builtin_choose_expr(TYPE_INT(X), \
> @@ -180,7 +180,7 @@
>           __builtin_choose_expr(TYPE_TCGV_I64(X), \
>               gen_store8, (void)0))
>   #define MEM_STORE8(VA, DATA, SLOT) \
> -    MEM_STORE8_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
> +    MEM_STORE8_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
>   #else
>   #define MEM_LOAD1s(VA) ((int8_t)mem_load1(env, slot, VA))
>   #define MEM_LOAD1u(VA) ((uint8_t)mem_load1(env, slot, VA))
> diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
> index 8a334ba07b..806d0974ff 100644
> --- a/target/hexagon/genptr.c
> +++ b/target/hexagon/genptr.c
> @@ -401,62 +401,50 @@ static inline void gen_store32(TCGv vaddr, TCGv src, int width, int slot)
>       tcg_gen_mov_tl(hex_store_val32[slot], src);
>   }
>   
> -static inline void gen_store1(TCGv_env cpu_env, TCGv vaddr, TCGv src,
> -                              DisasContext *ctx, int slot)
> +static inline void gen_store1(TCGv_env cpu_env, TCGv vaddr, TCGv src, int slot)
>   {
>       gen_store32(vaddr, src, 1, slot);
> -    ctx->store_width[slot] = 1;
>   }
>   
> -static inline void gen_store1i(TCGv_env cpu_env, TCGv vaddr, int32_t src,
> -                               DisasContext *ctx, int slot)
> +static inline void gen_store1i(TCGv_env cpu_env, TCGv vaddr, int32_t src, int slot)
>   {
>       TCGv tmp = tcg_constant_tl(src);
> -    gen_store1(cpu_env, vaddr, tmp, ctx, slot);
> +    gen_store1(cpu_env, vaddr, tmp, slot);
>   }
>   
> -static inline void gen_store2(TCGv_env cpu_env, TCGv vaddr, TCGv src,
> -                              DisasContext *ctx, int slot)
> +static inline void gen_store2(TCGv_env cpu_env, TCGv vaddr, TCGv src, int slot)
>   {
>       gen_store32(vaddr, src, 2, slot);
> -    ctx->store_width[slot] = 2;
>   }
>   
> -static inline void gen_store2i(TCGv_env cpu_env, TCGv vaddr, int32_t src,
> -                               DisasContext *ctx, int slot)
> +static inline void gen_store2i(TCGv_env cpu_env, TCGv vaddr, int32_t src, int slot)
>   {
>       TCGv tmp = tcg_constant_tl(src);
> -    gen_store2(cpu_env, vaddr, tmp, ctx, slot);
> +    gen_store2(cpu_env, vaddr, tmp, slot);
>   }
>   
> -static inline void gen_store4(TCGv_env cpu_env, TCGv vaddr, TCGv src,
> -                              DisasContext *ctx, int slot)
> +static inline void gen_store4(TCGv_env cpu_env, TCGv vaddr, TCGv src, int slot)
>   {
>       gen_store32(vaddr, src, 4, slot);
> -    ctx->store_width[slot] = 4;
>   }
>   
> -static inline void gen_store4i(TCGv_env cpu_env, TCGv vaddr, int32_t src,
> -                               DisasContext *ctx, int slot)
> +static inline void gen_store4i(TCGv_env cpu_env, TCGv vaddr, int32_t src, int slot)
>   {
>       TCGv tmp = tcg_constant_tl(src);
> -    gen_store4(cpu_env, vaddr, tmp, ctx, slot);
> +    gen_store4(cpu_env, vaddr, tmp, slot);
>   }
>   
> -static inline void gen_store8(TCGv_env cpu_env, TCGv vaddr, TCGv_i64 src,
> -                              DisasContext *ctx, int slot)
> +static inline void gen_store8(TCGv_env cpu_env, TCGv vaddr, TCGv_i64 src, int slot)
>   {
>       tcg_gen_mov_tl(hex_store_addr[slot], vaddr);
>       tcg_gen_movi_tl(hex_store_width[slot], 8);
>       tcg_gen_mov_i64(hex_store_val64[slot], src);
> -    ctx->store_width[slot] = 8;
>   }
>   
> -static inline void gen_store8i(TCGv_env cpu_env, TCGv vaddr, int64_t src,
> -                               DisasContext *ctx, int slot)
> +static inline void gen_store8i(TCGv_env cpu_env, TCGv vaddr, int64_t src, int slot)
>   {
>       TCGv_i64 tmp = tcg_constant_i64(src);
> -    gen_store8(cpu_env, vaddr, tmp, ctx, slot);
> +    gen_store8(cpu_env, vaddr, tmp, slot);
>   }
>   
>   static TCGv gen_8bitsof(TCGv result, TCGv value)
> diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
> index 0e8a0772f7..bc02870b9f 100644
> --- a/target/hexagon/translate.c
> +++ b/target/hexagon/translate.c
> @@ -327,6 +327,31 @@ static void mark_implicit_pred_writes(DisasContext *ctx, Insn *insn)
>       mark_implicit_pred_write(ctx, insn, A_IMPLICIT_WRITES_P3, 3);
>   }
>   
> +static void mark_store_width(DisasContext *ctx, Insn *insn)
> +{
> +    uint16_t opcode = insn->opcode;
> +    uint32_t slot = insn->slot;
> +
> +    if (GET_ATTRIB(opcode, A_STORE)) {
> +        if (GET_ATTRIB(opcode, A_MEMSIZE_1B)) {
> +            ctx->store_width[slot] = 1;
> +            return;
> +        }
> +        if (GET_ATTRIB(opcode, A_MEMSIZE_2B)) {
> +            ctx->store_width[slot] = 2;
> +            return;
> +        }
> +        if (GET_ATTRIB(opcode, A_MEMSIZE_4B)) {
> +            ctx->store_width[slot] = 4;
> +            return;
> +        }
> +        if (GET_ATTRIB(opcode, A_MEMSIZE_8B)) {
> +            ctx->store_width[slot] = 8;
> +            return;
> +        }

Hmm.  Perhaps

     int size = 0;
     if (GET_ATTRIB(opcode, A_MEMSIZE_1B)) {
         size |= 1;
     }
     ...
     tcg_debug_assert(is_power_of_2(size));
     ctx->store_width[slot] = size;

just to make sure you get exactly one of the above cases.
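
A self-contained sketch of that check, for illustration only. The attribute bit values, the helper name store_width_from_attribs, and the local is_power_of_2 are stand-ins invented here so the snippet compiles outside QEMU; in the tree the tests would go through GET_ATTRIB and the failure would be caught by tcg_debug_assert as shown above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical attribute bits; QEMU itself queries these via GET_ATTRIB(). */
enum {
    ATTR_MEMSIZE_1B = 1u << 0,
    ATTR_MEMSIZE_2B = 1u << 1,
    ATTR_MEMSIZE_4B = 1u << 2,
    ATTR_MEMSIZE_8B = 1u << 3,
};

/* Local stand-in: true when exactly one bit of v is set. */
static bool is_power_of_2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * OR every matching width into size.  The widths 1, 2, 4 and 8 are distinct
 * single bits, so size is a power of two only when exactly one MEMSIZE
 * attribute is present.  Returns 0 where the real code would assert.
 */
static int store_width_from_attribs(uint32_t attribs)
{
    int size = 0;

    if (attribs & ATTR_MEMSIZE_1B) {
        size |= 1;
    }
    if (attribs & ATTR_MEMSIZE_2B) {
        size |= 2;
    }
    if (attribs & ATTR_MEMSIZE_4B) {
        size |= 4;
    }
    if (attribs & ATTR_MEMSIZE_8B) {
        size |= 8;
    }
    return is_power_of_2((uint32_t)size) ? size : 0;
}
```

With this shape, an opcode carrying both the 1-byte and 2-byte attributes yields size 3 and is rejected, while an opcode carrying none yields 0 and is rejected as well.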

Otherwise, LGTM,

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



* Re: [PATCH 3/3] Hexagon (target/hexagon) Change decision to set pkt_has_store_s[01]
  2022-09-20  8:07 ` [PATCH 3/3] Hexagon (target/hexagon) Change decision to set pkt_has_store_s[01] Taylor Simpson
@ 2022-09-28 16:11   ` Richard Henderson
  2022-09-28 17:52     ` Taylor Simpson
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Henderson @ 2022-09-28 16:11 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel; +Cc: f4bug, ale, anjo, bcain, mlambert

On 9/20/22 01:07, Taylor Simpson wrote:
> We have found cases where pkt_has_store_s[01] is set incorrectly.
> This leads to generating an unnecessary store that is left over
> from a previous packet.
> 
> Add an attribute to determine if an instruction is a scalar store
> The attribute is attached to the fSTORE macro (hex_common.py)
> Simplify the logic in decode.c that sets pkt_has_store_s[01]
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>   target/hexagon/attribs_def.h.inc |  1 +
>   target/hexagon/decode.c          | 17 ++++++++++++-----
>   target/hexagon/translate.c       | 10 ++++++----
>   target/hexagon/hex_common.py     |  3 ++-
>   4 files changed, 21 insertions(+), 10 deletions(-)
> 
> diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc
> index 222ad95fb0..5d2a102c18 100644
> --- a/target/hexagon/attribs_def.h.inc
> +++ b/target/hexagon/attribs_def.h.inc
> @@ -44,6 +44,7 @@ DEF_ATTRIB(MEMSIZE_1B, "Memory width is 1 byte", "", "")
>   DEF_ATTRIB(MEMSIZE_2B, "Memory width is 2 bytes", "", "")
>   DEF_ATTRIB(MEMSIZE_4B, "Memory width is 4 bytes", "", "")
>   DEF_ATTRIB(MEMSIZE_8B, "Memory width is 8 bytes", "", "")
> +DEF_ATTRIB(SCALAR_STORE, "Store is scalar", "", "")
>   DEF_ATTRIB(REGWRSIZE_1B, "Memory width is 1 byte", "", "")
>   DEF_ATTRIB(REGWRSIZE_2B, "Memory width is 2 bytes", "", "")
>   DEF_ATTRIB(REGWRSIZE_4B, "Memory width is 4 bytes", "", "")
> diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
> index 6f0f27b4ba..2ba94a77de 100644
> --- a/target/hexagon/decode.c
> +++ b/target/hexagon/decode.c
> @@ -1,5 +1,5 @@
>   /*
> - *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
> + *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
>    *
>    *  This program is free software; you can redistribute it and/or modify
>    *  it under the terms of the GNU General Public License as published by
> @@ -402,10 +402,17 @@ static void decode_set_insn_attr_fields(Packet *pkt)
>           }
>   
>           if (GET_ATTRIB(opcode, A_STORE)) {
> -            if (pkt->insn[i].slot == 0) {
> -                pkt->pkt_has_store_s0 = true;
> -            } else {
> -                pkt->pkt_has_store_s1 = true;
> +            if (GET_ATTRIB(opcode, A_SCALAR_STORE) &&
> +                !GET_ATTRIB(opcode, A_MEMSIZE_0B)) {
> +                g_assert(GET_ATTRIB(opcode, A_MEMSIZE_1B) ||
> +                         GET_ATTRIB(opcode, A_MEMSIZE_2B) ||
> +                         GET_ATTRIB(opcode, A_MEMSIZE_4B) ||
> +                         GET_ATTRIB(opcode, A_MEMSIZE_8B));

Would this assert be redundant with the one I suggested vs patch 2?

Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

> +                if (pkt->insn[i].slot == 0) {
> +                    pkt->pkt_has_store_s0 = true;
> +                } else {
> +                    pkt->pkt_has_store_s1 = true;
> +                }
>               }
>           }
>   
> diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
> index bc02870b9f..efe7d2264e 100644
> --- a/target/hexagon/translate.c
> +++ b/target/hexagon/translate.c
> @@ -1,5 +1,5 @@
>   /*
> - *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
> + *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
>    *
>    *  This program is free software; you can redistribute it and/or modify
>    *  it under the terms of the GNU General Public License as published by
> @@ -525,10 +525,12 @@ static void process_store_log(DisasContext *ctx, Packet *pkt)
>        *  slot 1 and then slot 0.  This will be important when
>        *  the memory accesses overlap.
>        */
> -    if (pkt->pkt_has_store_s1 && !pkt->pkt_has_dczeroa) {
> +    if (pkt->pkt_has_store_s1) {
> +        g_assert(!pkt->pkt_has_dczeroa);
>           process_store(ctx, pkt, 1);
>       }
> -    if (pkt->pkt_has_store_s0 && !pkt->pkt_has_dczeroa) {
> +    if (pkt->pkt_has_store_s0) {
> +        g_assert(!pkt->pkt_has_dczeroa);
>           process_store(ctx, pkt, 0);
>       }
>   }
> @@ -691,7 +693,7 @@ static void gen_commit_packet(CPUHexagonState *env, DisasContext *ctx,
>            * The dczeroa will be the store in slot 0, check that we don't have
>            * a store in slot 1 or an HVX store.
>            */
> -        g_assert(has_store_s0 && !has_store_s1 && !has_hvx_store);
> +        g_assert(!has_store_s1 && !has_hvx_store);
>           process_dczeroa(ctx, pkt);
>       } else if (has_hvx_store) {
>           TCGv mem_idx = tcg_constant_tl(ctx->mem_idx);
> diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
> index c81aca8d2a..d9ba7df786 100755
> --- a/target/hexagon/hex_common.py
> +++ b/target/hexagon/hex_common.py
> @@ -1,7 +1,7 @@
>   #!/usr/bin/env python3
>   
>   ##
> -##  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
> +##  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
>   ##
>   ##  This program is free software; you can redistribute it and/or modify
>   ##  it under the terms of the GNU General Public License as published by
> @@ -75,6 +75,7 @@ def calculate_attribs():
>       add_qemu_macro_attrib('fWRITE_P3', 'A_WRITES_PRED_REG')
>       add_qemu_macro_attrib('fSET_OVERFLOW', 'A_IMPLICIT_WRITES_USR')
>       add_qemu_macro_attrib('fSET_LPCFG', 'A_IMPLICIT_WRITES_USR')
> +    add_qemu_macro_attrib('fSTORE', 'A_SCALAR_STORE')
>   
>       # Recurse down macros, find attributes from sub-macros
>       macroValues = list(macros.values())
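
For reference, the decode-time decision in the quoted decode.c hunk can be modeled in isolation. This is only a sketch: the attribute bits, the PktFlags struct, and mark_scalar_store are hypothetical names made up for the example, and the real code reads attributes with GET_ATTRIB and writes pkt->pkt_has_store_s0 / pkt->pkt_has_store_s1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical attribute bits standing in for GET_ATTRIB() lookups. */
enum {
    ATTR_STORE        = 1u << 0,
    ATTR_SCALAR_STORE = 1u << 1,
    ATTR_MEMSIZE_0B   = 1u << 2,
};

/* Hypothetical reduced packet: only the two flags discussed here. */
typedef struct {
    bool has_store_s0;
    bool has_store_s1;
} PktFlags;

/*
 * Set the per-slot flag only for a scalar store with a non-zero width.
 * Stores without the scalar attribute, and stores tagged as 0 bytes wide,
 * leave both flags untouched.
 */
static void mark_scalar_store(PktFlags *pkt, uint32_t attribs, int slot)
{
    if (!(attribs & ATTR_STORE)) {
        return;
    }
    if ((attribs & ATTR_SCALAR_STORE) && !(attribs & ATTR_MEMSIZE_0B)) {
        if (slot == 0) {
            pkt->has_store_s0 = true;
        } else {
            pkt->has_store_s1 = true;
        }
    }
}
```

Under this model a flag is set only when a scalar store log entry will actually be written, which is what lets the commit path replace the extra conditions with assertions.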




* Re: [PATCH 1/3] Hexagon (target/hexagon) add instruction attributes from archlib
  2022-09-20  8:07 ` [PATCH 1/3] Hexagon (target/hexagon) add instruction attributes from archlib Taylor Simpson
@ 2022-09-28 16:12   ` Richard Henderson
  0 siblings, 0 replies; 8+ messages in thread
From: Richard Henderson @ 2022-09-28 16:12 UTC (permalink / raw)
  To: Taylor Simpson, qemu-devel; +Cc: f4bug, ale, anjo, bcain, mlambert

On 9/20/22 01:07, Taylor Simpson wrote:
> The imported files from the architecture library have added some
> instruction attributes.  Some of these will be used in a subsequent
> patch for determining the size of a store.
> 
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>   target/hexagon/attribs_def.h.inc      |  37 +++++++-
>   target/hexagon/imported/ldst.idef     | 122 +++++++++++++-------------
>   target/hexagon/imported/subinsns.idef |  72 +++++++--------
>   3 files changed, 133 insertions(+), 98 deletions(-)

Acked-by: Richard Henderson <richard.henderson@linaro.org>

r~



* RE: [PATCH 3/3] Hexagon (target/hexagon) Change decision to set pkt_has_store_s[01]
  2022-09-28 16:11   ` Richard Henderson
@ 2022-09-28 17:52     ` Taylor Simpson
  0 siblings, 0 replies; 8+ messages in thread
From: Taylor Simpson @ 2022-09-28 17:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: f4bug, ale, anjo, Brian Cain, Michael Lambert



> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, September 28, 2022 11:12 AM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: f4bug@amsat.org; ale@rev.ng; anjo@rev.ng; Brian Cain
> <bcain@quicinc.com>; Michael Lambert <mlambert@quicinc.com>
> Subject: Re: [PATCH 3/3] Hexagon (target/hexagon) Change decision to set
> pkt_has_store_s[01]
> 
> On 9/20/22 01:07, Taylor Simpson wrote:
> > We have found cases where pkt_has_store_s[01] is set incorrectly.
> > This leads to generating an unnecessary store that is left over from a
> > previous packet.
> >
> > Add an attribute to determine if an instruction is a scalar store
> > The attribute is attached to the fSTORE macro (hex_common.py)
> > Simplify the logic in decode.c that sets pkt_has_store_s[01]
> >
> > Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> > ---
> >   target/hexagon/attribs_def.h.inc |  1 +
> >   target/hexagon/decode.c          | 17 ++++++++++++-----
> >   target/hexagon/translate.c       | 10 ++++++----
> >   target/hexagon/hex_common.py     |  3 ++-
> >   4 files changed, 21 insertions(+), 10 deletions(-)
> >
> > diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
> > index 6f0f27b4ba..2ba94a77de 100644
> > --- a/target/hexagon/decode.c
> > +++ b/target/hexagon/decode.c
> > @@ -402,10 +402,17 @@ static void decode_set_insn_attr_fields(Packet *pkt)
> >           }
> >
> >           if (GET_ATTRIB(opcode, A_STORE)) {
> > -            if (pkt->insn[i].slot == 0) {
> > -                pkt->pkt_has_store_s0 = true;
> > -            } else {
> > -                pkt->pkt_has_store_s1 = true;
> > +            if (GET_ATTRIB(opcode, A_SCALAR_STORE) &&
> > +                !GET_ATTRIB(opcode, A_MEMSIZE_0B)) {
> > +                g_assert(GET_ATTRIB(opcode, A_MEMSIZE_1B) ||
> > +                         GET_ATTRIB(opcode, A_MEMSIZE_2B) ||
> > +                         GET_ATTRIB(opcode, A_MEMSIZE_4B) ||
> > +                         GET_ATTRIB(opcode, A_MEMSIZE_8B));
> 
> Would this assert be redundant with the one I suggested vs patch 2?
> 
> Otherwise,
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Yes, this would be redundant with the one you suggested.  Further, the one you suggested is an improvement because it ensures that exactly one of the attributes is set.
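
The difference between the two checks can be seen by enumerating every combination of the four MEMSIZE attributes. The sketch below is illustrative only; the 4-bit encoding and the function names are assumptions made for the example and do not exist in QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical encoding: bit 0 = MEMSIZE_1B, bit 1 = MEMSIZE_2B,
 * bit 2 = MEMSIZE_4B, bit 3 = MEMSIZE_8B.
 */

/* Models the g_assert in patch 3: passes if any size attribute is set. */
static bool at_least_one(uint32_t memsize_bits)
{
    return (memsize_bits & 0xf) != 0;
}

/* Models the power-of-two check: passes only if a single attribute is set. */
static bool exactly_one(uint32_t memsize_bits)
{
    uint32_t v = memsize_bits & 0xf;
    return v != 0 && (v & (v - 1)) == 0;
}

/* Count how many of the 16 attribute combinations a predicate accepts. */
static int count_accepted(bool (*pred)(uint32_t))
{
    int n = 0;
    for (uint32_t bits = 0; bits < 16; bits++) {
        if (pred(bits)) {
            n++;
        }
    }
    return n;
}
```

Of the 16 possible combinations, the OR-chain accepts 15 (everything except "no attribute"), while the power-of-two test accepts only the 4 single-attribute cases.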

Will make the changes and create a PR.

Thanks,
Taylor



Thread overview: 8+ messages
2022-09-20  8:07 [PATCH 0/3] Hexagon (target/hexagon) improve store handling Taylor Simpson
2022-09-20  8:07 ` [PATCH 1/3] Hexagon (target/hexagon) add instruction attributes from archlib Taylor Simpson
2022-09-28 16:12   ` Richard Henderson
2022-09-20  8:07 ` [PATCH 2/3] Hexagon (target/hexagon) move store size tracking to translation Taylor Simpson
2022-09-28 16:09   ` Richard Henderson
2022-09-20  8:07 ` [PATCH 3/3] Hexagon (target/hexagon) Change decision to set pkt_has_store_s[01] Taylor Simpson
2022-09-28 16:11   ` Richard Henderson
2022-09-28 17:52     ` Taylor Simpson
