* [Qemu-devel] [PATCH 00/42] target/arm: Convert VFP decoder to decodetree
@ 2019-06-06 17:45 Peter Maydell
From: Peter Maydell @ 2019-06-06 17:45 UTC
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

This patchset converts the Arm VFP instructions to use decodetree
instead of the current hand-written decode.

We gain:
 * a more maintainable decoder which doesn't live in one big function
 * correct prioritization of UNDEF exceptions against "VFP disabled"
   exceptions and "M-profile lazy FP stacking" activity
 * significant reduction in the use of the "cpu_F0[sd]" and "cpu_F1[sd]"
   TCG globals. These are a relic of a much older translator and
   eventually we should try to get rid of them entirely
 * more accurate decode, UNDEFing some things we were incorrectly lax on
 * a fixed bug for VFP short-vector mixed vector/scalar VMLA/VMLS/VNMLA/VNMLS
   insns: we were incorrectly corrupting the scalar input operand
   in the process of performing the multiply-accumulate, so every
   element after the first was miscalculated
 * a fixed bug in the calculation of the next register number to use
   when VFP short-vector operations wrapped around the vector bank
 * decode which checks ID registers for "do we have D16-D31" rather
   than using "is this VFPv3" -- this means that Cortex-M4, -M33 and -R5F
   all now correctly give the guest only 16 Dregs rather than 32.
   (Note that the old decoder hides this UNDEF handling inside the
   VFP_DREG macros...)
 * the fused multiply-add insns now correctly UNDEF for attempts to
   use them as short-vector operations
 * short-vector functionality is only implemented if the ID registers
   say it should be (which in practice means "only Cortex-A8 or earlier");
   we continue to provide it in -cpu max for compatibility
 * VRINTR, VRINTZ and VRINTX are only provided in v8A and above
 * VFP related translation code split out into its own source file
 * the "is this special register present and accessible" check is
   now consistent between read and write
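
As an illustration of two of the points above (the register-number
calculation when a short-vector operation wraps around its bank, and
the "do we have D16-D31" ID register check), here is a minimal
standalone C sketch. The function names are hypothetical and are not
taken from the patches. The sketch assumes the architectural layout in
which single-precision registers are grouped in banks of 8 and
double-precision registers in banks of 4, and in which MVFR0 bits [3:0]
(SIMDReg) read as 1 for 16 Dregs and 2 for 32 Dregs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Advance a single-precision register number by 'delta' within its
 * bank of 8 registers. The low 3 bits wrap around; the bank-select
 * bits above them are preserved, so S15 + 1 gives S8, not S16.
 */
static inline int advance_sreg(int reg, int delta)
{
    return ((reg + delta) & 0x7) | (reg & ~0x7);
}

/* As above, for double-precision registers, which use banks of 4. */
static inline int advance_dreg(int reg, int delta)
{
    return ((reg + delta) & 0x3) | (reg & ~0x3);
}

/*
 * Assumed encoding: MVFR0[3:0] (SIMDReg) is 1 when only D0-D15 exist
 * and 2 when D0-D31 exist.
 */
static inline bool mvfr0_has_d32(uint32_t mvfr0)
{
    return (mvfr0 & 0xf) >= 2;
}

/*
 * A Dreg number is usable if it is in D0-D15, or if the ID register
 * says the upper 16 registers are implemented. A decoder would UNDEF
 * when this returns false.
 */
static inline bool dreg_is_valid(uint32_t mvfr0, int reg)
{
    return reg < 16 || mvfr0_has_d32(mvfr0);
}
```

Keying the check on the ID register field rather than on an
architecture version is what lets a core that implements only D0-D15
reject D16-D31 regardless of which VFP version it reports.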

There is definitely scope for further cleanup:
 * translate-vfp.inc.c could be further isolated into its
   own standalone .c file rather than being #included into translate.c
 * cpu_F0* are still used in parts of the Neon decode (and the
   iwmmxt code, alas)
 * I noticed some places doing a load-and-shift or load-modify-store
   sequence to update byte or halfword parts of float registers;
   these could be rewritten to do direct byte or halfword loads/stores
 * we could remove the remaining uses of tcg_gen_ld/st_f32()
   (in the Neon decode)
but at 42 patches this is already a pretty hefty patchset, so
I have deferred those to attack later once this has got in.

On the downside, there are more lines of code here, but some of
them we'll get back when we finish some of the cleanups noted
above, some are just copyright-and-license boilerplate, and I
think the rest are well invested in easier-to-modify code...

Patch 1 is Richard's recent decodetree script bugfix, which
is needed for the VFP decode to behave correctly.

Tested with RISU, a mixture of comparison against real Cortex-A7
and Cortex-A8 and against the old version of QEMU, plus some
smoke-testing of aarch32 system emulation.

thanks
-- PMM

Peter Maydell (41):
  target/arm: Add stubs for AArch32 VFP decodetree
  target/arm: Factor out VFP access checking code
  target/arm: Fix Cortex-R5F MVFR values
  target/arm: Explicitly enable VFP short-vectors for aarch32 -cpu max
  target/arm: Convert the VSEL instructions to decodetree
  target/arm: Convert VMINNM, VMAXNM to decodetree
  target/arm: Convert VRINTA/VRINTN/VRINTP/VRINTM to decodetree
  target/arm: Convert VCVTA/VCVTN/VCVTP/VCVTM to decodetree
  target/arm: Move the VFP trans_* functions to translate-vfp.inc.c
  target/arm: Add helpers for VFP register loads and stores
  target/arm: Convert "double-precision" register moves to decodetree
  target/arm: Convert "single-precision" register moves to decodetree
  target/arm: Convert VFP two-register transfer insns to decodetree
  target/arm: Convert VFP VLDR and VSTR to decodetree
  target/arm: Convert the VFP load/store multiple insns to decodetree
  target/arm: Remove VLDR/VSTR/VLDM/VSTM use of cpu_F0s and cpu_F0d
  target/arm: Convert VFP VMLA to decodetree
  target/arm: Convert VFP VMLS to decodetree
  target/arm: Convert VFP VNMLS to decodetree
  target/arm: Convert VFP VNMLA to decodetree
  target/arm: Convert VMUL to decodetree
  target/arm: Convert VNMUL to decodetree
  target/arm: Convert VADD to decodetree
  target/arm: Convert VSUB to decodetree
  target/arm: Convert VDIV to decodetree
  target/arm: Convert VFP fused multiply-add insns to decodetree
  target/arm: Convert VMOV (imm) to decodetree
  target/arm: Convert VABS to decodetree
  target/arm: Convert VNEG to decodetree
  target/arm: Convert VSQRT to decodetree
  target/arm: Convert VMOV (register) to decodetree
  target/arm: Convert VFP comparison insns to decodetree
  target/arm: Convert the VCVT-from-f16 insns to decodetree
  target/arm: Convert the VCVT-to-f16 insns to decodetree
  target/arm: Convert VFP round insns to decodetree
  target/arm: Convert double-single precision conversion insns to
    decodetree
  target/arm: Convert integer-to-float insns to decodetree
  target/arm: Convert VJCVT to decodetree
  target/arm: Convert VCVT fp/fixed-point conversion insns to decodetree
  target/arm: Convert float-to-integer VCVT insns to decodetree
  target/arm: Fix short-vector increment behaviour

Richard Henderson (1):
  decodetree: Fix comparison of Field

 target/arm/Makefile.objs       |   13 +
 target/arm/cpu.h               |   11 +
 target/arm/cpu.c               |    6 +
 target/arm/translate-vfp.inc.c | 2660 ++++++++++++++++++++++++++++++++
 target/arm/translate.c         | 1503 +-----------------
 scripts/decodetree.py          |    2 +-
 target/arm/vfp-uncond.decode   |   63 +
 target/arm/vfp.decode          |  242 +++
 8 files changed, 3024 insertions(+), 1476 deletions(-)
 create mode 100644 target/arm/translate-vfp.inc.c
 create mode 100644 target/arm/vfp-uncond.decode
 create mode 100644 target/arm/vfp.decode

-- 
2.20.1



Thread overview: 88+ messages
2019-06-06 17:45 [Qemu-devel] [PATCH 00/42] target/arm: Convert VFP decoder to decodetree Peter Maydell
2019-06-06 17:45 ` [Qemu-devel] [PATCH 01/42] decodetree: Fix comparison of Field Peter Maydell
2019-06-06 17:45 ` [Qemu-devel] [PATCH 02/42] target/arm: Add stubs for AArch32 VFP decodetree Peter Maydell
2019-06-07 14:47   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 03/42] target/arm: Factor out VFP access checking code Peter Maydell
2019-06-07 14:49   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 04/42] target/arm: Fix Cortex-R5F MVFR values Peter Maydell
2019-06-07 14:50   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 05/42] target/arm: Explicitly enable VFP short-vectors for aarch32 -cpu max Peter Maydell
2019-06-07 14:51   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 06/42] target/arm: Convert the VSEL instructions to decodetree Peter Maydell
2019-06-07 14:54   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 07/42] target/arm: Convert VMINNM, VMAXNM " Peter Maydell
2019-06-07 14:55   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 08/42] target/arm: Convert VRINTA/VRINTN/VRINTP/VRINTM " Peter Maydell
2019-06-07 14:57   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 09/42] target/arm: Convert VCVTA/VCVTN/VCVTP/VCVTM " Peter Maydell
2019-06-07 15:38   ` Richard Henderson
2019-06-07 15:39     ` Peter Maydell
2019-06-06 17:45 ` [Qemu-devel] [PATCH 10/42] target/arm: Move the VFP trans_* functions to translate-vfp.inc.c Peter Maydell
2019-06-07 15:53   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 11/42] target/arm: Add helpers for VFP register loads and stores Peter Maydell
2019-06-07 17:11   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 12/42] target/arm: Convert "double-precision" register moves to decodetree Peter Maydell
2019-06-07 17:27   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 13/42] target/arm: Convert "single-precision" " Peter Maydell
2019-06-07 18:08   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 14/42] target/arm: Convert VFP two-register transfer insns " Peter Maydell
2019-06-08 13:46   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 15/42] target/arm: Convert VFP VLDR and VSTR " Peter Maydell
2019-06-08 13:54   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 16/42] target/arm: Convert the VFP load/store multiple insns " Peter Maydell
2019-06-08 14:04   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 17/42] target/arm: Remove VLDR/VSTR/VLDM/VSTM use of cpu_F0s and cpu_F0d Peter Maydell
2019-06-08 14:05   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 18/42] target/arm: Convert VFP VMLA to decodetree Peter Maydell
2019-06-08 14:14   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 19/42] target/arm: Convert VFP VMLS " Peter Maydell
2019-06-08 18:21   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 20/42] target/arm: Convert VFP VNMLS " Peter Maydell
2019-06-08 18:25   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 21/42] target/arm: Convert VFP VNMLA " Peter Maydell
2019-06-08 18:26   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 22/42] target/arm: Convert VMUL " Peter Maydell
2019-06-08 18:28   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 23/42] target/arm: Convert VNMUL " Peter Maydell
2019-06-08 18:29   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 24/42] target/arm: Convert VADD " Peter Maydell
2019-06-08 18:29   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 25/42] target/arm: Convert VSUB " Peter Maydell
2019-06-08 18:30   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 26/42] target/arm: Convert VDIV " Peter Maydell
2019-06-08 18:31   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 27/42] target/arm: Convert VFP fused multiply-add insns " Peter Maydell
2019-06-08 18:40   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 28/42] target/arm: Convert VMOV (imm) " Peter Maydell
2019-06-08 18:55   ` Richard Henderson
2019-06-10 17:12     ` Peter Maydell
2019-06-10 18:40       ` Richard Henderson
2019-06-10 19:27         ` [Qemu-devel] [Qemu-arm] " Ali Mezgani
2019-06-06 17:45 ` [Qemu-devel] [PATCH 29/42] target/arm: Convert VABS " Peter Maydell
2019-06-08 18:57   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 30/42] target/arm: Convert VNEG " Peter Maydell
2019-06-08 18:57   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 31/42] target/arm: Convert VSQRT " Peter Maydell
2019-06-08 18:59   ` Richard Henderson
2019-06-06 17:45 ` [Qemu-devel] [PATCH 32/42] target/arm: Convert VMOV (register) " Peter Maydell
2019-06-08 19:00   ` Richard Henderson
2019-06-06 17:46 ` [Qemu-devel] [PATCH 33/42] target/arm: Convert VFP comparison insns " Peter Maydell
2019-06-08 19:02   ` Richard Henderson
2019-06-06 17:46 ` [Qemu-devel] [PATCH 34/42] target/arm: Convert the VCVT-from-f16 " Peter Maydell
2019-06-08 19:08   ` Richard Henderson
2019-06-06 17:46 ` [Qemu-devel] [PATCH 35/42] target/arm: Convert the VCVT-to-f16 " Peter Maydell
2019-06-08 19:10   ` Richard Henderson
2019-06-06 17:46 ` [Qemu-devel] [PATCH 36/42] target/arm: Convert VFP round " Peter Maydell
2019-06-08 19:11   ` Richard Henderson
2019-06-06 17:46 ` [Qemu-devel] [PATCH 37/42] target/arm: Convert double-single precision conversion " Peter Maydell
2019-06-08 19:14   ` Richard Henderson
2019-06-06 17:46 ` [Qemu-devel] [PATCH 38/42] target/arm: Convert integer-to-float " Peter Maydell
2019-06-08 19:15   ` Richard Henderson
2019-06-06 17:46 ` [Qemu-devel] [PATCH 39/42] target/arm: Convert VJCVT " Peter Maydell
2019-06-08 19:16   ` Richard Henderson
2019-06-06 17:46 ` [Qemu-devel] [PATCH 40/42] target/arm: Convert VCVT fp/fixed-point conversion insns " Peter Maydell
2019-06-08 19:22   ` Richard Henderson
2019-06-06 17:46 ` [Qemu-devel] [PATCH 41/42] target/arm: Convert float-to-integer VCVT " Peter Maydell
2019-06-08 19:24   ` Richard Henderson
2019-06-06 17:46 ` [Qemu-devel] [PATCH 42/42] target/arm: Fix short-vector increment behaviour Peter Maydell
2019-06-08 19:26   ` Richard Henderson
