All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Maydell <peter.maydell@linaro.org>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: QEMU Developers <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH 2/3] target-arm: add env->tbflags
Date: Mon, 26 Sep 2016 15:00:52 -0700	[thread overview]
Message-ID: <CAFEAcA-xfaoA3DyjPA_vDpSv6shR70L3RSJjaQqrHOcit79XMg@mail.gmail.com> (raw)
In-Reply-To: <1473847013-20191-3-git-send-email-pbonzini@redhat.com>

On 14 September 2016 at 02:56, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Computing TranslationBlock flags is pretty expensive on ARM, especially
> 32-bit.  In order to limit the cost we want to cache as many of them
> as possible.  Therefore, the flags are split in two parts.  Static flags
> come directly from a new CPUARMState field env->tbflags, and are updated
> whenever EL changes (i.e. on exceptions and exception returns) or after
> MSR instructions.

Doing this for all MSR writes is a bit sad, because a lot of them
don't actually change the TB flags, and quite a few of them which
previously we were able to code to not have to do a helper call
at all (direct writes to fields) now get a pointless helper call.

You're also recalculating more often than stated here, in that
you also recalc on any gen_lookup_tb() call in the 32-bit
decoder. (This is just as well because for instance vec_len
and vec_stride aren't set via the cp15 system register write
path.)

You're treating the PSTATE_SS flag as static, but you don't
have anything which causes a recalculation of it on the code
path which changes it (gen_ss_advance()).

The 32-bit SETEND instruction changes CPSR_E, which has
an effect on the BE_DATA_MASK flag, but I don't think
that code path will cause us to recalculate flags.

I found this patch kind of difficult to review because
it isn't obvious why we recalculate the static flags at
the points where we do (ie whether those points are
necessary and sufficient for correct behaviour). Most
of the comments above are the result of my looking at
whether some particular flags that I suspected of being
tricky were handled correctly :-)

Is there a possibility of having an assertion somewhere that
the static flags we think we have actually match the
old dynamic flags?

> Dynamic flags are computed on the fly by cpu_get_tb_cpu_state (which
> calls cpu_dynamic_tb_cpu_flags to retrieve them), same as before.
> As of this patch, all flags are dynamic and env->tbflags is always 0,
> so this patch adds the infrastructure but does not do any caching yet.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  target-arm/cpu.c           |  2 ++
>  target-arm/cpu.h           | 10 +++++++++-
>  target-arm/helper.c        |  2 ++
>  target-arm/helper.h        |  1 +
>  target-arm/op_helper.c     |  7 +++++++
>  target-arm/translate-a64.c |  4 ++++
>  target-arm/translate.c     | 12 ++++++++++--
>  target-arm/translate.h     |  1 +
>  8 files changed, 36 insertions(+), 3 deletions(-)
>
> diff --git a/target-arm/cpu.c b/target-arm/cpu.c
> index ce8b8f4..189ceab 100644
> --- a/target-arm/cpu.c
> +++ b/target-arm/cpu.c
> @@ -225,6 +225,8 @@ static void arm_cpu_reset(CPUState *s)
>                                &env->vfp.fp_status);
>      set_float_detect_tininess(float_tininess_before_rounding,
>                                &env->vfp.standard_fp_status);
> +
> +    env->tbflags = cpu_get_tb_cpu_flags(env);
>      tlb_flush(s, 1);
>
>  #ifndef CONFIG_USER_ONLY
> diff --git a/target-arm/cpu.h b/target-arm/cpu.h
> index ef195bd..5918df5 100644
> --- a/target-arm/cpu.h
> +++ b/target-arm/cpu.h
> @@ -155,6 +155,7 @@ typedef struct CPUARMState {
>       */
>      uint32_t pstate;
>      uint32_t aarch64; /* 1 if CPU is in aarch64 state; inverse of PSTATE.nRW */
> +    uint32_t tbflags;
>
>      /* Frequently accessed CPSR bits are stored separately for efficiency.
>         This contains all the other bits.  Use cpsr_{read,write} to access
> @@ -2370,10 +2371,17 @@ static inline uint32_t cpu_dynamic_tb_cpu_flags(CPUARMState *env)
>      return flags;
>  }
>
> +static inline uint32_t cpu_get_tb_cpu_flags(CPUARMState *env)
> +{
> +    uint32_t flags = 0;
> +
> +    return flags;
> +}
> +
>  static inline void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>                                          target_ulong *cs_base, uint32_t *flags)
>  {
> -    *flags = cpu_dynamic_tb_cpu_flags(env);
> +    *flags = env->tbflags | cpu_dynamic_tb_cpu_flags(env);
>
>      if (is_a64(env)) {
>          *pc = env->pc;
> diff --git a/target-arm/helper.c b/target-arm/helper.c
> index bdb842c..0b4f6de 100644
> --- a/target-arm/helper.c
> +++ b/target-arm/helper.c
> @@ -6068,6 +6068,7 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
>      addr = ldl_phys(cs->as, env->v7m.vecbase + env->v7m.exception * 4);
>      env->regs[15] = addr & 0xfffffffe;
>      env->thumb = addr & 1;
> +    env->tbflags = cpu_get_tb_cpu_flags(env);
>  }
>
>  /* Function used to synchronize QEMU's AArch64 register set with AArch32
> @@ -6642,6 +6643,7 @@ void arm_cpu_do_interrupt(CPUState *cs)
>          arm_cpu_do_interrupt_aarch32(cs);
>      }
>
> +    env->tbflags = cpu_get_tb_cpu_flags(env);
>      arm_call_el_change_hook(cpu);
>
>      if (!kvm_enabled()) {
> diff --git a/target-arm/helper.h b/target-arm/helper.h
> index 84aa637..21a0e35 100644
> --- a/target-arm/helper.h
> +++ b/target-arm/helper.h
> @@ -73,6 +73,7 @@ DEF_HELPER_2(get_cp_reg64, i64, env, ptr)
>  DEF_HELPER_3(msr_i_pstate, void, env, i32, i32)
>  DEF_HELPER_1(clear_pstate_ss, void, env)
>  DEF_HELPER_1(exception_return, void, env)
> +DEF_HELPER_FLAGS_1(compute_tbflags, TCG_CALL_NO_WG_SE, i32, env)
>
>  DEF_HELPER_2(get_r13_banked, i32, env, i32)
>  DEF_HELPER_3(set_r13_banked, void, env, i32, i32)
> diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
> index be27b21..e5d7cfc 100644
> --- a/target-arm/op_helper.c
> +++ b/target-arm/op_helper.c
> @@ -479,6 +479,7 @@ void HELPER(cpsr_write_eret)(CPUARMState *env, uint32_t val)
>  {
>      cpsr_write(env, val, CPSR_ERET_MASK, CPSRWriteExceptionReturn);
>
> +    env->tbflags = cpu_get_tb_cpu_flags(env);
>      arm_call_el_change_hook(arm_env_get_cpu(env));
>  }
>
> @@ -975,6 +976,7 @@ void HELPER(exception_return)(CPUARMState *env)
>          env->pc = env->elr_el[cur_el];
>      }
>
> +    env->tbflags = cpu_get_tb_cpu_flags(env);
>      arm_call_el_change_hook(arm_env_get_cpu(env));
>
>      return;
> @@ -1326,3 +1328,8 @@ uint32_t HELPER(ror_cc)(CPUARMState *env, uint32_t x, uint32_t i)
>          return ((uint32_t)x >> shift) | (x << (32 - shift));
>      }
>  }
> +
> +uint32_t HELPER(compute_tbflags)(CPUARMState *env)
> +{
> +    return cpu_get_tb_cpu_flags(env);
> +}
> diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
> index f5e29d2..7c355b0 100644
> --- a/target-arm/translate-a64.c
> +++ b/target-arm/translate-a64.c
> @@ -1507,6 +1507,10 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
>          }
>      }
>
> +    if (!isread) {
> +        gen_helper_compute_tbflags(cpu_tbflags, cpu_env);
> +    }
> +
>      if ((s->tb->cflags & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
>          /* I/O operations must end the TB here (whether read or write) */
>          gen_io_end();
> diff --git a/target-arm/translate.c b/target-arm/translate.c
> index bd5d5cb..b4d8fe7 100644
> --- a/target-arm/translate.c
> +++ b/target-arm/translate.c
> @@ -68,6 +68,7 @@ TCGv_i64 cpu_exclusive_val;
>  TCGv_i64 cpu_exclusive_test;
>  TCGv_i32 cpu_exclusive_info;
>  #endif
> +TCGv_i32 cpu_tbflags;
>
>  /* FIXME:  These should be removed.  */
>  static TCGv_i32 cpu_F0s, cpu_F1s;
> @@ -107,6 +108,8 @@ void arm_translate_init(void)
>      cpu_exclusive_info = tcg_global_mem_new_i32(cpu_env,
>          offsetof(CPUARMState, exclusive_info), "exclusive_info");
>  #endif
> +    cpu_tbflags = tcg_global_mem_new_i32(cpu_env,
> +        offsetof(CPUARMState, tbflags), "tbflags");
>
>      a64_translate_init();
>  }
> @@ -1135,6 +1138,7 @@ static void gen_exception_insn(DisasContext *s, int offset, int excp,
>  /* Force a TB lookup after an instruction that changes the CPU state.  */
>  static inline void gen_lookup_tb(DisasContext *s)
>  {
> +    gen_helper_compute_tbflags(cpu_tbflags, cpu_env);

Why do we need to do this for 32-bit ARM but there's nothing
similar in the 64-bit ARM decoder?

>      tcg_gen_movi_i32(cpu_R[15], s->pc & ~1);
>      s->is_jmp = DISAS_JUMP;
>  }
> @@ -7630,12 +7634,16 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
>              /* I/O operations must end the TB here (whether read or write) */
>              gen_io_end();
>              gen_lookup_tb(s);
> -        } else if (!isread && !(ri->type & ARM_CP_SUPPRESS_TB_END)) {
> +        } else if (!isread) {
>              /* We default to ending the TB on a coprocessor register write,
>               * but allow this to be suppressed by the register definition
>               * (usually only necessary to work around guest bugs).
>               */
> -            gen_lookup_tb(s);
> +            if (ri->type & ARM_CP_SUPPRESS_TB_END) {
> +                gen_helper_compute_tbflags(cpu_tbflags, cpu_env);
> +            } else {
> +                gen_lookup_tb(s);
> +            }
>          }
>
>          return 0;
> diff --git a/target-arm/translate.h b/target-arm/translate.h
> index dbd7ac8..d269f4c 100644
> --- a/target-arm/translate.h
> +++ b/target-arm/translate.h
> @@ -81,6 +81,7 @@ extern TCGv_i64 cpu_exclusive_val;
>  extern TCGv_i64 cpu_exclusive_test;
>  extern TCGv_i32 cpu_exclusive_info;
>  #endif
> +extern TCGv_i32 cpu_tbflags;
>
>  static inline int arm_dc_feature(DisasContext *dc, int feature)
>  {
> --
> 2.7.4

thanks
-- PMM

  reply	other threads:[~2016-09-26 22:01 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-14  9:56 [Qemu-devel] [PATCH 0/3] target-arm: cache tbflags in CPUARMState Paolo Bonzini
2016-09-14  9:56 ` [Qemu-devel] [PATCH 1/3] target-arm: introduce cpu_dynamic_tb_cpu_flags Paolo Bonzini
2016-09-14  9:56 ` [Qemu-devel] [PATCH 2/3] target-arm: add env->tbflags Paolo Bonzini
2016-09-26 22:00   ` Peter Maydell [this message]
2016-09-27  8:15     ` Paolo Bonzini
2016-09-14  9:56 ` [Qemu-devel] [PATCH 3/3] target-arm: cache most tbflags Paolo Bonzini
2016-09-26 10:04 ` [Qemu-devel] [PATCH 0/3] target-arm: cache tbflags in CPUARMState Laurent Desnogues
2016-09-26 11:13   ` Laurent Desnogues
2016-11-10 11:42 ` Alex Bennée
2016-11-10 12:19   ` Paolo Bonzini
2016-11-10 13:37     ` Alex Bennée

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFEAcA-xfaoA3DyjPA_vDpSv6shR70L3RSJjaQqrHOcit79XMg@mail.gmail.com \
    --to=peter.maydell@linaro.org \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.