From: Jan Beulich <jbeulich@suse.com>
To: "Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Roger Pau Monné" <roger.pau@citrix.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Wei Liu <wl@xen.org>
Subject: Ping: [PATCH] x86emul: de-duplicate scatters to the same linear address
Date: Fri, 5 Feb 2021 11:10:21 +0100
Message-ID: <a9739f4a-cfc6-673c-f2a7-a21723eda199@suse.com>
In-Reply-To: <6064996d-943f-1be3-9bfd-e872149da2a1@suse.com>

On 10.11.2020 14:26, Jan Beulich wrote:
> The SDM specifically allows earlier writes to fully overlapping ranges
> to be dropped. If a guest issued such overlapping writes,
> hvmemul_phys_mmio_access() would crash it when varying data is written
> to the same address. Detect overlaps early, as doing so in
> hvmemul_{linear,phys}_mmio_access() would be quite a bit more
> difficult.
> 
> Note that due to cache slot use being linear address based, there's no
> similar issue with multiple writes to the same physical address (mapped
> through different linear addresses).
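> 
> For illustration only (not part of the patch): with AVX-512 intrinsics,
> a guest might issue a scatter in which two enabled elements target the
> same linear address; the SDM permits the earlier element's write to be
> dropped in favor of the later one:
> 
>     #include <immintrin.h>
>     #include <stdint.h>
> 
>     int main(void) /* build with -mavx512f, run on AVX-512F hardware */
>     {
>         uint32_t buf[16] = { 0 };
>         /* Elements 0 and 1 both target buf[0]; hardware may drop the
>          * element-0 write and perform only element 1's. */
>         __m512i idx = _mm512_setr_epi32(0, 0, 2, 3, 4, 5, 6, 7, 8, 9,
>                                         10, 11, 12, 13, 14, 15);
>         __m512i val = _mm512_setr_epi32(0x11111111, 0x22222222, 2, 3,
>                                         4, 5, 6, 7, 8, 9, 10, 11, 12,
>                                         13, 14, 15);
> 
>         _mm512_i32scatter_epi32(buf, idx, val, 4);
>         return buf[0] != 0x22222222; /* the later write survives */
>     }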
> 
> Since this requires an adjustment to the EVEX Disp8 scaling test,
> correct a comment there at the same time.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> TBD: The SDM isn't entirely clear about the faulting behavior in this
>      case: If a fault would need delivering on the earlier slot despite
>      the write getting squashed, we'd have to call ops->write() with
>      size set to zero for the earlier write(s). However,
>      hvm/emulate.c's handling of zero-byte accesses extends only to the
>      virtual-to-linear address conversions (and raising of the faults
>      involved), so in order to also observe #PF, changes to that logic
>      would be needed as well. Can we live with a possible misbehavior
>      here?
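> 
>      Purely as a sketch of that alternative (hypothetical, not what
>      this patch implements), the squashed element would still be
>      pushed through the access checks via a zero-byte write:
> 
>          /* Hypothetical: deliver a pending fault for a squashed
>           * element without transferring any data. */
>          rc = ops->write(ea.mem.seg, offs, NULL, 0, ctxt);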
> 
> --- a/tools/tests/x86_emulator/evex-disp8.c
> +++ b/tools/tests/x86_emulator/evex-disp8.c
> @@ -647,8 +647,8 @@ static const uint16_t imm0f[16] = {
>  static struct x86_emulate_ops emulops;
>  
>  /*
> - * Access tracking (by granular) is used on the first 64 bytes of address
> - * space. Instructions get encode with a raw Disp8 value of 1, which then
> + * Access tracking (byte granular) is used on the first 64 bytes of address
> + * space. Instructions get encoded with a raw Disp8 value of 1, which then
>   * gets scaled accordingly. Hence accesses below the address <scaling factor>
>   * as well as at or above 2 * <scaling factor> are indications of bugs. To
>   * aid diagnosis / debugging, track all accesses below 3 * <scaling factor>.
> @@ -804,6 +804,31 @@ static void test_one(const struct test *
>  
>      asm volatile ( "kxnorw %k1, %k1, %k1" );
>      asm volatile ( "vxorps %xmm4, %xmm4, %xmm4" );
> +    if ( sg && (test->opc | 3) == 0xa3 )
> +    {
> +        /*
> +         * Non-prefetch scatters need handling specially, due to the
> +         * overlapped write elimination done by the emulator. The order of
> +         * indexes chosen is somewhat arbitrary, within the constraints
> +         * imposed by the various different uses.
> +         */
> +        static const struct {
> +            int32_t _[16];
> +        } off32 = {{ 1, 0, 2, 3, 7, 6, 5, 4, 8, 10, 12, 14, 9, 11, 13, 15 }};
> +
> +        if ( test->opc & 1 )
> +        {
> +            asm volatile ( "vpmovsxdq %0, %%zmm4" :: "m" (off32) );
> +            vsz >>= !evex.w;
> +        }
> +        else
> +            asm volatile ( "vmovdqu32 %0, %%zmm4" :: "m" (off32) );
> +
> +        /* Scale by element size. */
> +        instr[6] |= (evex.w | 2) << 6;
> +
> +        sg = false;
> +    }
>  
>      ctxt->regs->eip = (unsigned long)&instr[0];
>      ctxt->regs->edx = 0;
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -10032,25 +10032,47 @@ x86_emulate(
>  
>          for ( i = 0; op_mask; ++i )
>          {
> -            long idx = b & 1 ? index.qw[i] : index.dw[i];
> +            long idx = (b & 1 ? index.qw[i]
> +                              : index.dw[i]) * (1 << state->sib_scale);
> +            unsigned long offs = truncate_ea(ea.mem.off + idx);
> +            unsigned int j;
>  
>              if ( !(op_mask & (1 << i)) )
>                  continue;
>  
> -            rc = ops->write(ea.mem.seg,
> -                            truncate_ea(ea.mem.off +
> -                                        idx * (1 << state->sib_scale)),
> -                            (void *)mmvalp + i * op_bytes, op_bytes, ctxt);
> -            if ( rc != X86EMUL_OKAY )
> +            /*
> +             * hvmemul_linear_mmio_access() will find a cache slot based on
> +             * linear address. hvmemul_phys_mmio_access() will crash the
> +             * domain if observing varying data getting written to the same
> +             * address within a cache slot. Utilize that squashing earlier
> +             * writes to fully overlapping addresses is permitted by the spec.
> +             */
> +            for ( j = i + 1; j < n; ++j )
>              {
> -                /* See comment in gather emulation. */
> -                if ( rc != X86EMUL_EXCEPTION && done )
> -                    rc = X86EMUL_RETRY;
> -                break;
> +                long idx2 = (b & 1 ? index.qw[j]
> +                                   : index.dw[j]) * (1 << state->sib_scale);
> +
> +                if ( (op_mask & (1 << j)) &&
> +                     truncate_ea(ea.mem.off + idx2) == offs )
> +                    break;
> +            }
> +
> +            if ( j >= n )
> +            {
> +                rc = ops->write(ea.mem.seg, offs,
> +                                (void *)mmvalp + i * op_bytes, op_bytes, ctxt);
> +                if ( rc != X86EMUL_OKAY )
> +                {
> +                    /* See comment in gather emulation. */
> +                    if ( rc != X86EMUL_EXCEPTION && done )
> +                        rc = X86EMUL_RETRY;
> +                    break;
> +                }
> +
> +                done = true;
>              }
>  
>              op_mask &= ~(1 << i);
> -            done = true;
>  
>  #ifdef __XEN__
>              if ( op_mask && local_events_need_delivery() )
> 
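
For readers without the emulator context, a minimal standalone sketch of
the elimination logic the second hunk adds (hypothetical names,
simplified flat-memory model; only the last enabled element targeting a
given address performs its write):

    #include <stdint.h>
    #include <stdio.h>

    /* Simplified model of the j-scan above: skip element i's write if
     * a later enabled element j targets the same address. */
    static void scatter(uint32_t *mem, const long *idx,
                        const uint32_t *val, unsigned int n,
                        unsigned int op_mask)
    {
        for ( unsigned int i = 0; i < n; ++i )
        {
            unsigned int j;

            if ( !(op_mask & (1u << i)) )
                continue;

            for ( j = i + 1; j < n; ++j )
                if ( (op_mask & (1u << j)) && idx[j] == idx[i] )
                    break;

            if ( j >= n ) /* no later overlap - this write survives */
                mem[idx[i]] = val[i];
        }
    }

    int main(void)
    {
        uint32_t mem[3] = { 0 };
        static const long idx[4] = { 0, 0, 1, 2 };
        static const uint32_t val[4] = { 0x11, 0x22, 0x33, 0x44 };

        scatter(mem, idx, val, 4, 0xf);
        printf("%x %x %x\n", mem[0], mem[1], mem[2]); /* 22 33 44 */
        return 0;
    }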



Thread overview: 6+ messages
2020-11-10 13:26 [PATCH] x86emul: de-duplicate scatters to the same linear address Jan Beulich
2021-02-05 10:10 ` Jan Beulich [this message]
2021-02-05 10:41 ` Andrew Cooper
2021-02-05 11:28   ` Jan Beulich
2021-02-17  8:32     ` Ping: " Jan Beulich
2021-04-01  8:05       ` Ping²: " Jan Beulich
