[PATCH v2 00/17] target/arm: Convert rest of Neon 3-reg-same to decodetree

From: Peter Maydell @ 2020-05-12 16:38 UTC
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

This patchset is v2 of the Neon decodetree conversion. The first
half of v1 is in master already, so we're left just with patches
converting the rest of the 3-reg-same Neon dp insn group.

Based-on: <20200508152200.6547-1-richard.henderson@linaro.org>
("[PATCH v3 00/16] target/arm: partial vector cleanup")
Strictly speaking, based on that with a fixup for the VRSRA bug,
but I think patchew should be placated by that based-on tag.

Git tree available at:
 https://git.linaro.org/people/peter.maydell/qemu-arm.git neon-decodetree
with the whole patchstack including RTH's series.

Changes from v1:
 * the first 19 or so patches have been upstreamed
 * patch 1 (VQRDMLAH/VQRDMLSH) uses do_3same() now
 * patch 3 (64-bit elt 3-reg-same): shifts now use a format
   which swaps Vn and Vm in decode, so we don't need to special
   case them in the C code. We also use the gvec interface
   rather than hand-rolling a for-each-pass loop.
 * patch 4: make DO_3SAME_32() handle just one trans fn rather
   than doing both _U and _S in one macro invocation.
   Use gvec rather than hand-rolling the for-each-pass loop.
 * patch 5 (vaba/vabd): new, since rth's patchset rewrote how these
   were handled and they can now just be handled via DO_3SAME_NO_SZ_3()
 * patch 7: saturating shift handling rewritten to use the gvec APIs.
 * patch 10 (vqdmulh/vqrdmulh): rewritten to use gvec
 * patch 11 (vadd/vsub): rewritten to use gvec; vabd now in an earlier
   patch with vaba
 * patch 13 (vmul/vmla/vmls): vmul uses gvec; do_3same_fp() and
   DO_3S_FP() now added in this patch as it is the first user
 * new patch 15 making recps_f32 and rsqrts_f32 easier to use with
   common gvec APIs and macros by moving the 'env' argument to the front
 * patch 16: updated VRECPS/VRSQRTS code to use gvec

NB: I have not attempted to merge VQSHL_S and VQSHL_S64 into
one pattern in patch 7 (as suggested in review of v1) -- adding
yet another DO_3SAME_FOO for the case of "64 bit and 32 bit
can be done with same trans/gen fn" didn't seem worthwhile.
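
As a rough illustration of the macro style discussed above (one
invocation generating exactly one trans function, as in the reworked
DO_3SAME_32()), here is a toy standalone sketch. Everything in it is
hypothetical: DO_3SAME_32_SKETCH, apply_3same_32(), the max_s32/max_u32
operations and the trans_EXAMPLE_* names are invented for the example,
and the real trans functions in QEMU have a different signature and
emit TCG ops rather than computing results directly.

```c
#include <stdint.h>

/* Per-element operation on 32-bit lanes (hypothetical helper type). */
typedef uint32_t BinOp32(uint32_t, uint32_t);

static uint32_t max_s32(uint32_t a, uint32_t b)
{
    return ((int32_t)a > (int32_t)b) ? a : b;   /* signed compare */
}

static uint32_t max_u32(uint32_t a, uint32_t b)
{
    return (a > b) ? a : b;                     /* unsigned compare */
}

/* Shared loop: apply 'fn' to each pair of 32-bit elements. */
static void apply_3same_32(uint32_t *d, const uint32_t *n,
                           const uint32_t *m, int count, BinOp32 *fn)
{
    for (int i = 0; i < count; i++) {
        d[i] = fn(n[i], m[i]);
    }
}

/* One invocation defines one trans_<INSN>() function, nothing more. */
#define DO_3SAME_32_SKETCH(INSN, FUNC)                                  \
    static void trans_##INSN(uint32_t *d, const uint32_t *n,            \
                             const uint32_t *m, int count)              \
    {                                                                   \
        apply_3same_32(d, n, m, count, FUNC);                           \
    }

/* Signed and unsigned variants are spelled out as separate invocations. */
DO_3SAME_32_SKETCH(EXAMPLE_S, max_s32)
DO_3SAME_32_SKETCH(EXAMPLE_U, max_u32)
```

The cost of this style is one extra line per variant; the benefit is
that a macro which only ever defines one function is easier to read
and to grep for than one that silently defines a _U/_S pair.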

thanks
-- PMM

Peter Maydell (17):
  target/arm: Convert Neon 3-reg-same VQRDMLAH/VQRDMLSH to decodetree
  target/arm: Convert Neon 3-reg-same SHA to decodetree
  target/arm: Convert Neon 64-bit element 3-reg-same insns
  target/arm: Convert Neon VHADD 3-reg-same insns
  target/arm: Convert Neon VABA/VABD 3-reg-same to decodetree
  target/arm: Convert Neon VRHADD, VHSUB 3-reg-same insns to decodetree
  target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to
    decodetree
  target/arm: Convert Neon VPMAX/VPMIN 3-reg-same insns to decodetree
  target/arm: Convert Neon VPADD 3-reg-same insns to decodetree
  target/arm: Convert Neon VQDMULH/VQRDMULH 3-reg-same to decodetree
  target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to
    decodetree
  target/arm: Convert Neon VPMIN/VPMAX/VPADD float 3-reg-same insns to
    decodetree
  target/arm: Convert Neon fp VMUL, VMLA, VMLS 3-reg-same insns to
    decodetree
  target/arm: Convert Neon 3-reg-same compare insns to decodetree
  target/arm: Move 'env' argument of recps_f32 and rsqrts_f32 helpers to
    usual place
  target/arm: Convert Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS to
    decodetree
  target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree

 target/arm/helper.h             |   7 +-
 target/arm/neon-dp.decode       | 113 +++++-
 target/arm/neon_helper.c        |   7 -
 target/arm/translate-neon.inc.c | 639 ++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 495 +------------------------
 target/arm/vec_helper.c         |   7 +
 target/arm/vfp_helper.c         |   4 +-
 7 files changed, 767 insertions(+), 505 deletions(-)

-- 
2.20.1
