From: David Gibson <david@gibson.dropbear.id.au>
To: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: qemu-devel@nongnu.org, qemu-ppc@nongnu.org, richard.henderson@linaro.org
Subject: Re: [Qemu-devel] [PATCH v5 0/9] target/ppc: prepare for conversion to TCG vector operations
Date: Thu, 3 Jan 2019 11:23:13 +1100
Message-ID: <20190103002313.GD10853@umbus.fritz.box>
In-Reply-To: <20190102091423.21155-1-mark.cave-ayland@ilande.co.uk>

On Wed, Jan 02, 2019 at 09:14:14AM +0000, Mark Cave-Ayland wrote:
> This patchset attempts to improve VMX (Altivec) instruction performance by
> laying the groundwork for use of the new TCG vector operations.
> 
> Patches 1 and 2 fix a sign-extension error in EXTRACT_SHELPER and an associated
> typo in the SIMM5 macro, both discovered whilst testing Richard's follow-on TCG
> vector improvements patchset.
> 
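
For readers unfamiliar with the distinction between unsigned and
sign-extending field extraction, here is a minimal standalone sketch. The
sketch_* names are hypothetical stand-ins written for this note; they only
mimic the (value, start, length) calling convention of extract32() and
sextract32() and are not QEMU's implementations. The 5-bit field at bit 16
is chosen purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: return `length` bits of `value`, starting at bit
 * `start` (bit 0 is the least significant), zero-extended. */
static uint32_t sketch_extract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    uint32_t mask = (length == 32) ? UINT32_MAX
                                   : ((UINT32_C(1) << length) - 1);
    return (value >> start) & mask;
}

/* Hypothetical stand-in: same field, but sign-extended from its top bit.
 * Uses the (x ^ s) - s identity, where s is the field's sign bit; the
 * final cast assumes the usual two's complement conversion. */
static int32_t sketch_sextract32(uint32_t value, int start, int length)
{
    uint32_t field = sketch_extract32(value, start, length);
    uint32_t sign = UINT32_C(1) << (length - 1);
    return (int32_t)((field ^ sign) - sign);
}
```

With an all-ones 5-bit field, sketch_extract32(0x001F0000, 16, 5) gives 31
while sketch_sextract32(0x001F0000, 16, 5) gives -1, which is the difference
that matters for a signed immediate.
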
> In order to use TCG vector operations, the registers must be accessible from cpu_env,
> whereas currently they are accessed via arrays of static TCG globals. Patches 3-5
> are therefore mechanical patches which introduce access helpers for FPR, AVR and VSR
> registers using the supplied TCGv_i64 parameter.
> 
> Once this is done, patch 6 enables us to remove the static TCG global arrays and updates
> the access helpers to read/write to the relevant fields in cpu_env directly.
> 
> Patches 7 and 8 perform the legwork required to enable VSX instructions to be converted
> to use TCG vector operations in future by rearranging the FP, VMX and VSX registers into
> a single aligned VSR register array (the scope of this patchset is VMX only).
> 
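
To illustrate why a single aligned array is convenient, the sketch below
models register access as byte offsets into one state structure. Everything
here is hypothetical: SketchEnv, sketch_vsr_t and the sketch_* functions are
made up for this note and do not reproduce the real CPUPPCState layout or
its host-endian handling. The mapping assumed (FPR n in the first doubleword
of VSR n, AVR n as VSR 32 + n) follows the Power ISA register overlay.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit register: two 64-bit doublewords. */
typedef union {
    uint8_t  u8[16];
    uint64_t dw[2];
} sketch_vsr_t;

/* Hypothetical, heavily reduced stand-in for the CPU state structure. */
typedef struct {
    uint64_t other_state;               /* unrelated field before the array */
    _Alignas(16) sketch_vsr_t vsr[64];  /* one 16-byte-aligned register file */
} SketchEnv;

/* Byte offset of VSR n from the start of the state structure. */
static size_t sketch_vsr_offset(int n)
{
    assert(n >= 0 && n < 64);
    return offsetof(SketchEnv, vsr) + (size_t)n * sizeof(sketch_vsr_t);
}

/* Assumption: FPR n lives in dw[0] of VSR n (host endianness ignored). */
static size_t sketch_fpr_offset(int n)
{
    assert(n >= 0 && n < 32);
    return sketch_vsr_offset(n) + offsetof(sketch_vsr_t, dw);
}

/* Assumption: AVR n is the whole of VSR 32 + n. */
static size_t sketch_avr_offset(int n)
{
    assert(n >= 0 && n < 32);
    return sketch_vsr_offset(32 + n);
}

/* Read FPR n through its offset rather than through a separate array. */
static uint64_t sketch_get_fpr(const SketchEnv *env, int n)
{
    uint64_t v;
    memcpy(&v, (const unsigned char *)env + sketch_fpr_offset(n), sizeof(v));
    return v;
}

/* Write FPR n through its offset. */
static void sketch_set_fpr(SketchEnv *env, int n, uint64_t v)
{
    memcpy((unsigned char *)env + sketch_fpr_offset(n), &v, sizeof(v));
}
```

Because every register is then a fixed, 16-byte-aligned offset from one base
pointer, a vector operation only needs the offsets of its operands, and the
FP, VMX and VSX views all address the same storage.
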
> Patch 9 removes the AVR* macros and replaces them with the corresponding Vsr* macros
> since they are equivalent.
> 
> Finally thanks to Richard for taking the time to answer some of my (mostly beginner)
> questions related to TCG.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

Applied to ppc-for-4.0, thanks.

> 
> v5:
> - Fix up KVM-enabled builds on PPC host due to missing conversion of target/ppc/kvm.c
> 
> v4:
> - Rebase onto master
> - Add extra R-B tags from Richard
> - Leave HI_IDX/LO_IDX in int_helper.c in patch 9 (similarly named macros are also
>   used in other files so let's ensure there is no confusion)
> - Add cpu_fpr_ptr(), cpu_vsrl_ptr() and cpu_avr_ptr() as suggested by Richard in
>   patch 8
> 
> v3:
> - Rebase onto master, drop RFC prefix, alter subject line
> - Add A-B tags from David
> - Add SIMM5/EXTRACT_HELPER macro fix patches to the start of the series
> - Drop patch 4 from previous patchset (delay AVR register writeback) as it should
>   not be required.
> - Remove extra get_fpr() accidentally added to GEN_FLOAT macros in patch 3
> - Fix temporary leak when VMX/VSX not enabled in patches 4 and 5
> - Add patch to remove AVR* macros, replacing them with Vsr* macros
> - Drop patches converting logical, add and sub instructions to TCG vector ops (let
>   Richard incorporate this into his TCG vector improvements patchset)
> 
> v2:
> - Rebase onto master
> - Add comment explaining rationale for FPR helpers in description for patch 1
> - Add R-B tags from Richard
> - Add patch 3 to delay AVR register writeback as spotted by Richard
> - Add patches 6 and 7 to merge FPR, VMX and VSX registers into the vsr array
>   to facilitate conversion of VSX instructions to vector operations later
> - Fix accidental bug whereby the conversion of get_vsr()/set_vsr() to access
>   data from cpu_env was incorrectly squashed into patch 3
> - Move set_fpr() further down in gen_fsqrts() and gen_frsqrtes() in patch 1
> 
> Mark Cave-Ayland (9):
>   target/ppc: fix typo in SIMM5 extraction helper
>   target/ppc: switch EXTRACT_HELPER macros over to use
>     sextract32/extract32
>   target/ppc: introduce get_fpr() and set_fpr() helpers for FP register
>     access
>   target/ppc: introduce get_avr64() and set_avr64() helpers for VMX
>     register access
>   target/ppc: introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}()
>     helpers for VSR register access
>   target/ppc: switch FPR, VMX and VSX helpers to access data directly
>     from cpu_env
>   target/ppc: merge ppc_vsr_t and ppc_avr_t union types
>   target/ppc: move FP and VMX registers into aligned vsr register array
>   target/ppc: replace AVR* macros with Vsr* macros
> 
>  linux-user/ppc/signal.c             |  28 +-
>  target/ppc/arch_dump.c              |  15 +-
>  target/ppc/cpu.h                    |  42 +-
>  target/ppc/gdbstub.c                |   8 +-
>  target/ppc/int_helper.c             |  86 ++--
>  target/ppc/internal.h               |  39 +-
>  target/ppc/kvm.c                    |  24 +-
>  target/ppc/machine.c                |  72 ++-
>  target/ppc/monitor.c                |   4 +-
>  target/ppc/translate.c              |  73 ++-
>  target/ppc/translate/dfp-impl.inc.c |   2 +-
>  target/ppc/translate/fp-impl.inc.c  | 486 +++++++++++++++-----
>  target/ppc/translate/vmx-impl.inc.c | 154 +++++--
>  target/ppc/translate/vsx-impl.inc.c | 862 ++++++++++++++++++++++++++----------
>  target/ppc/translate_init.inc.c     |  26 +-
>  15 files changed, 1374 insertions(+), 547 deletions(-)
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
