qemu-devel.nongnu.org archive mirror
* plugins: Missing Store Exclusive Memory Accesses
@ 2021-09-16 20:44 Aaron Lindsay
  2021-09-17 11:05 ` Alex Bennée
  0 siblings, 1 reply; 11+ messages in thread
From: Aaron Lindsay @ 2021-09-16 20:44 UTC (permalink / raw)
  To: qemu-devel, Alex Bennée; +Cc: cota, richard.henderson

Hello,

I recently noticed that the plugin interface does not appear to be
emitting callbacks to functions registered via
`qemu_plugin_register_vcpu_mem_cb` for AArch64 store exclusives. This
would include instructions like `stxp  w16, x2, x3, [x4]` (encoding:
0xc8300c82). Seeing as how I'm only running with a single CPU, I don't
see how this could be due to losing exclusivity after the preceding
`ldxp`.
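
As a sanity check on the encoding quoted above, here is a minimal sketch of
a decoder for the A64 load/store exclusive pair class. The struct and
function names are hypothetical (they are not QEMU identifiers); the bit
positions follow the A64 encoding of LDXP/STXP and their acquire/release
variants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of an A64 load/store exclusive pair (LDXP/STXP/LDAXP/STLXP). */
struct ldst_excl_pair {
    bool is_load;   /* L bit (bit 22): 1 = load, 0 = store */
    bool is_64bit;  /* sz bit (bit 30): 1 = 64-bit data registers */
    unsigned rs;    /* status register, bits 20:16 (31 for loads) */
    unsigned rt2;   /* second data register, bits 14:10 */
    unsigned rn;    /* base address register, bits 9:5 */
    unsigned rt;    /* first data register, bits 4:0 */
};

/*
 * Returns true and fills *out when insn belongs to the exclusive pair
 * class: bit 31 = 1, bits 29:24 = 001000, bit 23 (o2) = 0, bit 21 (o1) = 1.
 */
static bool decode_ldst_excl_pair(uint32_t insn, struct ldst_excl_pair *out)
{
    assert(out != NULL);

    if ((insn & 0xBFA00000u) != 0x88200000u) {
        return false;
    }
    out->is_load = (insn >> 22) & 1;
    out->is_64bit = (insn >> 30) & 1;
    out->rs = (insn >> 16) & 0x1f;
    out->rt2 = (insn >> 10) & 0x1f;
    out->rn = (insn >> 5) & 0x1f;
    out->rt = insn & 0x1f;
    return true;
}
```

Fed 0xc8300c82, this reports a 64-bit store with rs=16, rt=2, rt2=3 and
rn=4, which matches `stxp  w16, x2, x3, [x4]`.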

In looking at QEMU's source, I *think* this is because the
`gen_store_exclusive` function in translate-a64.c is not making the same
calls to `plugin_gen_mem_callbacks` & company that are being made by
"normal" stores handled by functions like `tcg_gen_qemu_st_i64` (at
least in my case; I do see some code paths under `gen_store_exclusive`
call down into `tcg_gen_qemu_st_i64` eventually, but it appears not all
of them do?).

Does my initial guess check out? And, if so, does anyone have insight
into how to fix this issue most cleanly/generically? I suspect if/when I
debug my particular case I can discover one code path to fix, but I'm
wondering if my discovery may be part of a larger class of cases which
fell through the cracks and ought to be fixed together.

Thanks for any help,

Aaron



* Re: plugins: Missing Store Exclusive Memory Accesses
  2021-09-16 20:44 plugins: Missing Store Exclusive Memory Accesses Aaron Lindsay
@ 2021-09-17 11:05 ` Alex Bennée
  2021-09-17 14:44   ` Aaron Lindsay via
  2021-09-21 20:28   ` Aaron Lindsay via
  0 siblings, 2 replies; 11+ messages in thread
From: Alex Bennée @ 2021-09-17 11:05 UTC (permalink / raw)
  To: Aaron Lindsay; +Cc: cota, richard.henderson, qemu-devel


Aaron Lindsay <aaron@os.amperecomputing.com> writes:

> Hello,
>
> I recently noticed that the plugin interface does not appear to be
> emitting callbacks to functions registered via
> `qemu_plugin_register_vcpu_mem_cb` for AArch64 store exclusives. This
> would include instructions like `stxp  w16, x2, x3, [x4]` (encoding:
> 0xc8300c82). Seeing as how I'm only running with a single CPU, I don't
> see how this could be due to losing exclusivity after the preceding
> `ldxp`.

The exclusive handling is a bit special due to the need to emulate its
behaviour using cmpxchg primitives.

>
> In looking at QEMU's source, I *think* this is because the
> `gen_store_exclusive` function in translate-a64.c is not making the same
> calls to `plugin_gen_mem_callbacks` & company that are being made by
> "normal" stores handled by functions like `tcg_gen_qemu_st_i64` (at
> least in my case; I do see some code paths under `gen_store_exclusive`
> call down into `tcg_gen_qemu_st_i64` eventually, but it appears not all
> of them do?).

The key TCG operation is the cmpxchg which does the effective store. For
-smp 1 we should use normal ld and st tcg ops. For > 1 it eventually
falls to tcg_gen_atomic_cmpxchg_XX which is a helper. That eventually
ends up at:

  atomic_trace_rmw_post

which should be where things are hooked.
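
To illustrate the general idea of emulating a store exclusive with a
cmpxchg primitive, here is a minimal sketch using C11 atomics. It is a toy
model under stated assumptions, not QEMU's implementation: the
`excl_monitor` struct and both function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical exclusive monitor: remembers what the last load-exclusive saw. */
struct excl_monitor {
    _Atomic uint64_t *addr; /* monitored address, NULL when no monitor is held */
    uint64_t val;           /* value observed by the load-exclusive */
};

/* Load-exclusive: read the value and arm the monitor. */
static uint64_t load_exclusive(struct excl_monitor *m, _Atomic uint64_t *p)
{
    assert(m != NULL && p != NULL);
    m->addr = p;
    m->val = atomic_load(p);
    return m->val;
}

/*
 * Store-exclusive emulated with compare-and-exchange: the store lands only
 * if memory still holds the value seen by the load-exclusive.
 * Returns 0 on success and 1 on failure, like the stxr status register.
 */
static int store_exclusive(struct excl_monitor *m, _Atomic uint64_t *p,
                           uint64_t newv)
{
    int status = 1;

    assert(m != NULL && p != NULL);
    if (m->addr == p) {
        uint64_t expected = m->val;
        if (atomic_compare_exchange_strong(p, &expected, newv)) {
            status = 0;
        }
    }
    m->addr = NULL; /* the monitor is cleared whether or not the store landed */
    return status;
}
```

In this model the effective store happens inside the compare-and-exchange,
so any instrumentation has to be attached to that operation rather than to
a plain store.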

> Does my initial guess check out? And, if so, does anyone have insight
> into how to fix this issue most cleanly/generically? I suspect if/when I
> debug my particular case I can discover one code path to fix, but I'm
> wondering if my discovery may be part of a larger class of cases which
> fell through the cracks and ought to be fixed together.

Have you got a simple example of a test case?

>
> Thanks for any help,
>
> Aaron


-- 
Alex Bennée



* Re: plugins: Missing Store Exclusive Memory Accesses
  2021-09-17 11:05 ` Alex Bennée
@ 2021-09-17 14:44   ` Aaron Lindsay via
  2021-09-21 20:28   ` Aaron Lindsay via
  1 sibling, 0 replies; 11+ messages in thread
From: Aaron Lindsay via @ 2021-09-17 14:44 UTC (permalink / raw)
  To: Alex Bennée; +Cc: qemu-devel, cota, richard.henderson

On Sep 17 12:05, Alex Bennée wrote:
> Aaron Lindsay <aaron@os.amperecomputing.com> writes:
> > In looking at QEMU's source, I *think* this is because the
> > `gen_store_exclusive` function in translate-a64.c is not making the same
> > calls to `plugin_gen_mem_callbacks` & company that are being made by
> > "normal" stores handled by functions like `tcg_gen_qemu_st_i64` (at
> > least in my case; I do see some code paths under `gen_store_exclusive`
> > call down into `tcg_gen_qemu_st_i64` eventually, but it appears not all
> > of them do?).
> 
> The key TCG operation is the cmpxchg which does the effective store. For
> -smp 1 we should use normal ld and st tcg ops. For > 1 it eventually
> falls to tcg_gen_atomic_cmpxchg_XX which is a helper. That eventually
> ends up at:
> 
>   atomic_trace_rmw_post
> 
> which should be where things are hooked.

If I am understanding you correctly, it seems like my `stxp` should be using
the "normal" load and store tcg ops since I am running with `-smp 1`, and
therefore correctly emitting plugin memory callbacks.

I think my next step is to figure out exactly which tcg code path is being used
for this instruction to remove any doubt about what's going on here.

> > Does my initial guess check out? And, if so, does anyone have insight
> > into how to fix this issue most cleanly/generically? I suspect if/when I
> > debug my particular case I can discover one code path to fix, but I'm
> > wondering if my discovery may be part of a larger class of cases which
> > fell through the cracks and ought to be fixed together.
> 
> Have you got a simple example of a test case?

My test case is reasonably simple - I can reproduce the issue reliably and in
under 5 minutes - but I don't currently have a self-contained version in a form
I can share.

Here is the surrounding dynamic instruction stream, as reported by the plugin
interface (via callbacks registered with
`qemu_plugin_register_vcpu_insn_exec_cb`), along with corresponding memory
accesses (reported via callbacks registered with
`qemu_plugin_register_vcpu_mem_cb`):

  pc               ( opcode   ): `disassembly`
------------------|-------------|-------------
0xffff0000082076b4 (0x9436c8a9): `bl    #0xffff000008fb9958`
0xffff000008fb9958 (0xf9800091): `prfm  pstl1strm, [x4]`
0xffff000008fb995c (0xc87f4490): `ldxp  x16, x17, [x4]`
	^ accesses virtual addresses: 0xffff8002fffdde60, 0xffff8002fffdde68
0xffff000008fb9960 (0xca000210): `eor   x16, x16, x0`
0xffff000008fb9964 (0xca010231): `eor   x17, x17, x1`
0xffff000008fb9968 (0xaa110211): `orr   x17, x16, x17`
0xffff000008fb996c (0xb5000071): `cbnz  x17, #0xffff000008fb9978`
0xffff000008fb9970 (0xc8300c82): `stxp  w16, x2, x3, [x4]`
0xffff000008fb9974 (0x35ffff50): `cbnz  w16, #0xffff000008fb995c`
0xffff000008fb9978 (0xaa1103e0): `mov   x0, x17`
0xffff000008fb997c (0xd65f03c0): `ret   `
0xffff0000082076b8 (0xd503201f): `nop   `
0xffff0000082076bc (0xd503201f): `nop   `
0xffff0000082076c0 (0xd503201f): `nop   `
0xffff0000082076c4 (0xb94010a1): `ldr   w1, [x5, #0x10]`
	^ accesses virtual addresses: 0xffff8002f18b5cd0
0xffff0000082076c8 (0x51000421): `sub   w1, w1, #1`
0xffff0000082076cc (0xb90010a1): `str   w1, [x5, #0x10]`
	^ accesses virtual addresses: 0xffff8002f18b5cd0
0xffff0000082076d0 (0x35000061): `cbnz  w1, #0xffff0000082076dc`

Notice that the `stxp` receives no corresponding callbacks via
`qemu_plugin_register_vcpu_mem_cb` like the `ldxp`, `ldr`, and `str` do.

-Aaron



* Re: plugins: Missing Store Exclusive Memory Accesses
  2021-09-17 11:05 ` Alex Bennée
  2021-09-17 14:44   ` Aaron Lindsay via
@ 2021-09-21 20:28   ` Aaron Lindsay via
  2021-09-22 20:22     ` Aaron Lindsay via
  1 sibling, 1 reply; 11+ messages in thread
From: Aaron Lindsay via @ 2021-09-21 20:28 UTC (permalink / raw)
  To: Alex Bennée; +Cc: qemu-devel, cota, richard.henderson

On Sep 17 12:05, Alex Bennée wrote:
> Aaron Lindsay <aaron@os.amperecomputing.com> writes:
> > I recently noticed that the plugin interface does not appear to be
> > emitting callbacks to functions registered via
> > `qemu_plugin_register_vcpu_mem_cb` for AArch64 store exclusives. This
> > would include instructions like `stxp  w16, x2, x3, [x4]` (encoding:
> > 0xc8300c82). Seeing as how I'm only running with a single CPU, I don't
> > see how this could be due to losing exclusivity after the preceding
> > `ldxp`.
> 
> The exclusive handling is a bit special due to the need to emulate its
> behaviour using cmpxchg primitives.
> 
> >
> > In looking at QEMU's source, I *think* this is because the
> > `gen_store_exclusive` function in translate-a64.c is not making the same
> > calls to `plugin_gen_mem_callbacks` & company that are being made by
> > "normal" stores handled by functions like `tcg_gen_qemu_st_i64` (at
> > least in my case; I do see some code paths under `gen_store_exclusive`
> > call down into `tcg_gen_qemu_st_i64` eventually, but it appears not all
> > of them do?).
> 
> The key TCG operation is the cmpxchg which does the effective store. For
> -smp 1 we should use normal ld and st tcg ops. For > 1 it eventually
> falls to tcg_gen_atomic_cmpxchg_XX which is a helper. That eventually
> ends up at:
> 
>   atomic_trace_rmw_post
> 
> which should be where things are hooked.

When I open this up in gdb, I see that I'm getting the following call
graph for the `stxp` instruction in question (for -smp 1):

gen_store_exclusive -> gen_helper_paired_cmpxchg64_le

In other words, I'm taking the `s->be_data == MO_LE` else/if clause.

I do not see where the helper behind that (defined in helper-a64.c as
`uint64_t HELPER(paired_cmpxchg64_le)...`) is calling in to generate
plugin callbacks in this case. Am I missing something?

-Aaron



* Re: plugins: Missing Store Exclusive Memory Accesses
  2021-09-21 20:28   ` Aaron Lindsay via
@ 2021-09-22 20:22     ` Aaron Lindsay via
  2021-10-20 17:12       ` Aaron Lindsay via
  0 siblings, 1 reply; 11+ messages in thread
From: Aaron Lindsay via @ 2021-09-22 20:22 UTC (permalink / raw)
  To: Alex Bennée, richard.henderson; +Cc: qemu-devel, cota

On Sep 21 16:28, Aaron Lindsay wrote:
> On Sep 17 12:05, Alex Bennée wrote:
> > Aaron Lindsay <aaron@os.amperecomputing.com> writes:
> > > I recently noticed that the plugin interface does not appear to be
> > > emitting callbacks to functions registered via
> > > `qemu_plugin_register_vcpu_mem_cb` for AArch64 store exclusives. This
> > > would include instructions like `stxp  w16, x2, x3, [x4]` (encoding:
> > > 0xc8300c82). Seeing as how I'm only running with a single CPU, I don't
> > > see how this could be due to losing exclusivity after the preceding
> > > `ldxp`.
> > 
> > The exclusive handling is a bit special due to the need to emulate its
> > behaviour using cmpxchg primitives.
> > 
> > >
> > > In looking at QEMU's source, I *think* this is because the
> > > `gen_store_exclusive` function in translate-a64.c is not making the same
> > > calls to `plugin_gen_mem_callbacks` & company that are being made by
> > > "normal" stores handled by functions like `tcg_gen_qemu_st_i64` (at
> > > least in my case; I do see some code paths under `gen_store_exclusive`
> > > call down into `tcg_gen_qemu_st_i64` eventually, but it appears not all
> > > of them do?).
> > 
> > The key TCG operation is the cmpxchg which does the effective store. For
> > -smp 1 we should use normal ld and st tcg ops. For > 1 it eventually
> > falls to tcg_gen_atomic_cmpxchg_XX which is a helper. That eventually
> > ends up at:
> > 
> >   atomic_trace_rmw_post
> > 
> > which should be where things are hooked.
> 
> When I open this up in gdb, I see that I'm getting the following call
> graph for the `stxp` instruction in question (for -smp 1):
> 
> gen_store_exclusive -> gen_helper_paired_cmpxchg64_le
> 
> In other words, I'm taking the `s->be_data == MO_LE` else/if clause.
> 
> I do not see where the helper behind that (defined in helper-a64.c as
> `uint64_t HELPER(paired_cmpxchg64_le)...`) is calling in to generate
> plugin callbacks in this case. Am I missing something?

Richard, Alex,

The more I look at this, the more it feels like the following
AArch64-specific helpers may have been overlooked when adding
tracing/plugin hooks:
	gen_helper_paired_cmpxchg64_le
	gen_helper_paired_cmpxchg64_be

But... I'm still not sure I fully understand how everything I'm digging
into interacts; I am happy to keep investigating and work towards a fix,
but think I need a nudge in the right direction.

Thanks for any nudges,

Aaron



* Re: plugins: Missing Store Exclusive Memory Accesses
  2021-09-22 20:22     ` Aaron Lindsay via
@ 2021-10-20 17:12       ` Aaron Lindsay via
  2021-10-20 17:54         ` Alex Bennée
  0 siblings, 1 reply; 11+ messages in thread
From: Aaron Lindsay via @ 2021-10-20 17:12 UTC (permalink / raw)
  To: Alex Bennée, richard.henderson; +Cc: qemu-devel, cota

On Sep 22 16:22, Aaron Lindsay wrote:
> On Sep 21 16:28, Aaron Lindsay wrote:
> > On Sep 17 12:05, Alex Bennée wrote:
> > > Aaron Lindsay <aaron@os.amperecomputing.com> writes:
> > > > I recently noticed that the plugin interface does not appear to be
> > > > emitting callbacks to functions registered via
> > > > `qemu_plugin_register_vcpu_mem_cb` for AArch64 store exclusives. This
> > > > would include instructions like `stxp  w16, x2, x3, [x4]` (encoding:
> > > > 0xc8300c82). Seeing as how I'm only running with a single CPU, I don't
> > > > see how this could be due to losing exclusivity after the preceding
> > > > `ldxp`.
> > > 
> > > The exclusive handling is a bit special due to the need to emulate its
> > > behaviour using cmpxchg primitives.
> > > 
> > > >
> > > > In looking at QEMU's source, I *think* this is because the
> > > > `gen_store_exclusive` function in translate-a64.c is not making the same
> > > > calls to `plugin_gen_mem_callbacks` & company that are being made by
> > > > "normal" stores handled by functions like `tcg_gen_qemu_st_i64` (at
> > > > least in my case; I do see some code paths under `gen_store_exclusive`
> > > > call down into `tcg_gen_qemu_st_i64` eventually, but it appears not all
> > > > of them do?).
> > > 
> > > The key TCG operation is the cmpxchg which does the effective store. For
> > > -smp 1 we should use normal ld and st tcg ops. For > 1 it eventually
> > > falls to tcg_gen_atomic_cmpxchg_XX which is a helper. That eventually
> > > ends up at:
> > > 
> > >   atomic_trace_rmw_post
> > > 
> > > which should be where things are hooked.
> > 
> > When I open this up in gdb, I see that I'm getting the following call
> > graph for the `stxp` instruction in question (for -smp 1):
> > 
> > gen_store_exclusive -> gen_helper_paired_cmpxchg64_le
> > 
> > In other words, I'm taking the `s->be_data == MO_LE` else/if clause.
> > 
> > I do not see where the helper behind that (defined in helper-a64.c as
> > `uint64_t HELPER(paired_cmpxchg64_le)...`) is calling in to generate
> > plugin callbacks in this case. Am I missing something?
> 
> Richard, Alex,
> 
> The more I look at this, the more it feels like the following
> AArch64-specific helpers may have been overlooked when adding
> tracing/plugin hooks:
> 	gen_helper_paired_cmpxchg64_le
> 	gen_helper_paired_cmpxchg64_be
> 
> But... I'm still not sure I fully understand how everything I'm digging
> into interacts; I am happy to keep investigating and work towards a fix,
> but think I need a nudge in the right direction.

Ping?

I'm happy to spend some more time digging into this issue, and would
love to be pointed in the right direction if someone is able!

Thanks!

-Aaron



* Re: plugins: Missing Store Exclusive Memory Accesses
  2021-10-20 17:12       ` Aaron Lindsay via
@ 2021-10-20 17:54         ` Alex Bennée
  2021-10-20 20:49           ` Aaron Lindsay via
  0 siblings, 1 reply; 11+ messages in thread
From: Alex Bennée @ 2021-10-20 17:54 UTC (permalink / raw)
  To: Aaron Lindsay; +Cc: cota, richard.henderson, qemu-devel


Aaron Lindsay <aaron@os.amperecomputing.com> writes:

> On Sep 22 16:22, Aaron Lindsay wrote:
>> On Sep 21 16:28, Aaron Lindsay wrote:
>> > On Sep 17 12:05, Alex Bennée wrote:
>> > > Aaron Lindsay <aaron@os.amperecomputing.com> writes:
>> > > > I recently noticed that the plugin interface does not appear to be
>> > > > emitting callbacks to functions registered via
>> > > > `qemu_plugin_register_vcpu_mem_cb` for AArch64 store exclusives. This
>> > > > would include instructions like `stxp  w16, x2, x3, [x4]` (encoding:
>> > > > 0xc8300c82). Seeing as how I'm only running with a single CPU, I don't
>> > > > see how this could be due to losing exclusivity after the preceding
>> > > > `ldxp`.
>> > > 
>> > > The exclusive handling is a bit special due to the need to emulate its
>> > > behaviour using cmpxchg primitives.
>> > > 
>> > > >
>> > > > In looking at QEMU's source, I *think* this is because the
>> > > > `gen_store_exclusive` function in translate-a64.c is not making the same
>> > > > calls to `plugin_gen_mem_callbacks` & company that are being made by
>> > > > "normal" stores handled by functions like `tcg_gen_qemu_st_i64` (at
>> > > > least in my case; I do see some code paths under `gen_store_exclusive`
>> > > > call down into `tcg_gen_qemu_st_i64` eventually, but it appears not all
>> > > > of them do?).
>> > > 
>> > > The key TCG operation is the cmpxchg which does the effective store. For
>> > > -smp 1 we should use normal ld and st tcg ops. For > 1 it eventually
>> > > falls to tcg_gen_atomic_cmpxchg_XX which is a helper. That eventually
>> > > ends up at:
>> > > 
>> > >   atomic_trace_rmw_post
>> > > 
>> > > which should be where things are hooked.
>> > 
>> > When I open this up in gdb, I see that I'm getting the following call
>> > graph for the `stxp` instruction in question (for -smp 1):
>> > 
>> > gen_store_exclusive -> gen_helper_paired_cmpxchg64_le
>> > 
>> > In other words, I'm taking the `s->be_data == MO_LE` else/if clause.
>> > 
>> > I do not see where the helper behind that (defined in helper-a64.c as
>> > `uint64_t HELPER(paired_cmpxchg64_le)...`) is calling in to generate
>> > plugin callbacks in this case. Am I missing something?
>> 
>> Richard, Alex,
>> 
>> The more I look at this, the more it feels like the following
>> AArch64-specific helpers may have been overlooked when adding
>> tracing/plugin hooks:
>> 	gen_helper_paired_cmpxchg64_le
>> 	gen_helper_paired_cmpxchg64_be
>> 
>> But... I'm still not sure I fully understand how everything I'm digging
>> into interacts; I am happy to keep investigating and work towards a fix,
>> but think I need a nudge in the right direction.
>
> Ping?
>
> I'm happy to spend some more time digging into this issue, and would
> love to be pointed in the right direction if someone is able!

These all end up in:

      ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
                                    ABI_TYPE cmpv, ABI_TYPE newv,
                                    MemOpIdx oi, uintptr_t retaddr)

Have you got a test case you are using so I can try and replicate the
failure you are seeing? So far by inspection everything looks OK to me.

>
> Thanks!
>
> -Aaron


-- 
Alex Bennée



* Re: plugins: Missing Store Exclusive Memory Accesses
  2021-10-20 17:54         ` Alex Bennée
@ 2021-10-20 20:49           ` Aaron Lindsay via
  2021-10-21 12:28             ` Alex Bennée
  0 siblings, 1 reply; 11+ messages in thread
From: Aaron Lindsay via @ 2021-10-20 20:49 UTC (permalink / raw)
  To: Alex Bennée; +Cc: richard.henderson, qemu-devel, cota

On Oct 20 18:54, Alex Bennée wrote:
> Have you got a test case you are using so I can try and replicate the
> failure you are seeing? So far by inspection everything looks OK to me.

I took some time today to put together a minimal(ish) reproducer using
usermode. The source files used are below; I compiled the test binary on an
AArch64 system using:

$ gcc -g -o stxp stxp.s stxp.c

Then built the plugin from stxp_plugin.cc, and ran it all like:

qemu-aarch64 \
    -cpu cortex-a57 \
    -D stxp_plugin.log \
    -d plugin \
    -plugin 'stxp_plugin.so' \
    ./stxp

I observe that, for me, the objdump of stxp contains:
000000000040070c <loop>:
  40070c:   f9800011    prfm    pstl1strm, [x0]
  400710:   c87f4410    ldxp    x16, x17, [x0]
  400714:   c8300c02    stxp    w16, x2, x3, [x0]
  400718:   f1000652    subs    x18, x18, #0x1
  40071c:   54000040    b.eq    400724 <done>  // b.none
  400720:   17fffffb    b   40070c <loop>

But the output in stxp_plugin.log looks something like:
	Executing PC: 0x40070c
	Executing PC: 0x400710
	PC 0x400710 accessed memory at 0x550080ec70
	PC 0x400710 accessed memory at 0x550080ec78
	Executing PC: 0x400714
	Executing PC: 0x400718
	Executing PC: 0x40071c
	Executing PC: 0x400720

From this, I believe the ldxp instruction at PC 0x400710 is reporting two
memory accesses but the stxp instruction at 0x400714 is not.
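
To make that check mechanical, here is a minimal sketch of a helper that
counts the "accessed memory" lines attributed to a given PC. The function
name is hypothetical, and it assumes the exact line format produced by the
stxp_plugin.cc source in this message.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Count the "PC 0x... accessed memory at 0x..." lines in log that name pc. */
static unsigned count_accesses(const char *log, uint64_t pc)
{
    unsigned count = 0;
    const char *line = log;

    assert(log != NULL);
    while (*line) {
        uint64_t line_pc, addr;
        const char *nl;

        /* The log lines quoted in this message are tab-indented. */
        while (*line == ' ' || *line == '\t') {
            line++;
        }
        if (sscanf(line, "PC 0x%" SCNx64 " accessed memory at 0x%" SCNx64,
                   &line_pc, &addr) == 2 && line_pc == pc) {
            count++;
        }
        nl = strchr(line, '\n');
        if (!nl) {
            break;
        }
        line = nl + 1;
    }
    return count;
}
```

On the excerpt above, this gives 2 for PC 0x400710 (the ldxp) and 0 for
PC 0x400714 (the stxp).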

-Aaron

--- stxp.c ---
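/* Defined in stxp.s; takes the address used by ldxp/stxp in x0. */
void stxp_issue_demo(void *buf);

int main(void) {
        /* 64-bit ldxp/stxp require a 16-byte-aligned address. */
        char arr[16] __attribute__((aligned(16)));
        stxp_issue_demo(arr);
        return 0;
}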

--- stxp.s ---
.align 8

stxp_issue_demo:
    mov x18, 0x1000
    mov x2, 0x0
    mov x3, 0x0
loop:
    prfm  pstl1strm, [x0]
    ldxp  x16, x17, [x0]
    stxp  w16, x2, x3, [x0]

    subs x18, x18, 1
    beq done
    b loop
done:
    ret

.global stxp_issue_demo

--- stxp_plugin.cc ---
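#include <inttypes.h>  /* PRIx64 */
#include <stdarg.h>    /* va_list, va_start, va_end */
#include <stdio.h>     /* vsnprintf */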

extern "C" {

#include <qemu-plugin.h>

QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;

/* printf-style wrapper around qemu_plugin_outs(). */
static void qemu_logf(const char *fmt, ...)
{
    char message[1024];
    va_list args;

    va_start(args, fmt);
    vsnprintf(message, sizeof(message), fmt, args);
    va_end(args);

    qemu_plugin_outs(message);
}

void before_insn_cb(unsigned int cpu_index, void *udata)
{
    uint64_t pc = (uint64_t)udata;
    qemu_logf("Executing PC: 0x%" PRIx64 "\n", pc);
}

static void mem_cb(unsigned int cpu_index, qemu_plugin_meminfo_t meminfo, uint64_t va, void *udata)
{
    uint64_t pc = (uint64_t)udata;
    qemu_logf("PC 0x%" PRIx64 " accessed memory at 0x%" PRIx64 "\n", pc, va);
}

static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
{
    size_t n = qemu_plugin_tb_n_insns(tb);

    for (size_t i = 0; i < n; i++) {
        struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, i);
        uint64_t pc = qemu_plugin_insn_vaddr(insn);

        qemu_plugin_register_vcpu_insn_exec_cb(insn, before_insn_cb, QEMU_PLUGIN_CB_R_REGS, (void *)pc);
        qemu_plugin_register_vcpu_mem_cb(insn, mem_cb, QEMU_PLUGIN_CB_NO_REGS, QEMU_PLUGIN_MEM_RW, (void*)pc);
    }
}

QEMU_PLUGIN_EXPORT
int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
                        int argc, char **argv)
{
    qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
    return 0;
}

}



* Re: plugins: Missing Store Exclusive Memory Accesses
  2021-10-20 20:49           ` Aaron Lindsay via
@ 2021-10-21 12:28             ` Alex Bennée
  2021-10-21 20:40               ` Aaron Lindsay via
  0 siblings, 1 reply; 11+ messages in thread
From: Alex Bennée @ 2021-10-21 12:28 UTC (permalink / raw)
  To: Aaron Lindsay; +Cc: cota, richard.henderson, qemu-devel


Aaron Lindsay <aaron@os.amperecomputing.com> writes:

> On Oct 20 18:54, Alex Bennée wrote:
>> Have you got a test case you are using so I can try and replicate the
>> failure you are seeing? So far by inspection everything looks OK to me.
>
> I took some time today to put together a minimal(ish) reproducer using
> usermode. The source files used are below; I compiled the test binary on an
> AArch64 system using:
>
> $ gcc -g -o stxp stxp.s stxp.c
>
> Then built the plugin from stxp_plugin.cc, and ran it all like:
>
> qemu-aarch64 \
>     -cpu cortex-a57 \
>     -D stxp_plugin.log \
>     -d plugin \
>     -plugin 'stxp_plugin.so' \
>     ./stxp
>
> I observe that, for me, the objdump of stxp contains:
> 000000000040070c <loop>:
>   40070c:   f9800011    prfm    pstl1strm, [x0]
>   400710:   c87f4410    ldxp    x16, x17, [x0]
>   400714:   c8300c02    stxp    w16, x2, x3, [x0]
>   400718:   f1000652    subs    x18, x18, #0x1
>   40071c:   54000040    b.eq    400724 <done>  // b.none
>   400720:   17fffffb    b   40070c <loop>
>
> But the output in stxp_plugin.log looks something like:
> 	Executing PC: 0x40070c
> 	Executing PC: 0x400710
> 	PC 0x400710 accessed memory at 0x550080ec70
> 	PC 0x400710 accessed memory at 0x550080ec78
> 	Executing PC: 0x400714
> 	Executing PC: 0x400718
> 	Executing PC: 0x40071c
> 	Executing PC: 0x400720
>
> From this, I believe the ldxp instruction at PC 0x400710 is reporting two
> memory accesses but the stxp instruction at 0x400714 is not.

This is fascinating but I can't replicate your results. I get the
following pattern:

  Executing PC: 0x400910                                                                
  Executing PC: 0x400914                                                                
  PC 0x400914 accessed memory at 0x55007fffd0 
  PC 0x400914 accessed memory at 0x55007fffd8                                           
  Executing PC: 0x400918                                                                
  PC 0x400918 accessed memory at 0x55007fffd0 
  PC 0x400918 accessed memory at 0x55007fffd8                                           
  PC 0x400918 accessed memory at 0x55007fffd0                                           
  PC 0x400918 accessed memory at 0x55007fffd8                                           
  Executing PC: 0x40091c                                                                
  Executing PC: 0x400920                                                                
  Executing PC: 0x400924                                                                
  Executing PC: 0x400910                                                                
  Executing PC: 0x400914                                                                
  PC 0x400914 accessed memory at 0x55007fffd0                                           
  PC 0x400914 accessed memory at 0x55007fffd8                                           
  Executing PC: 0x400918                                                                
  PC 0x400918 accessed memory at 0x55007fffd0                                           
  PC 0x400918 accessed memory at 0x55007fffd8                                           
  PC 0x400918 accessed memory at 0x55007fffd0                                           
  PC 0x400918 accessed memory at 0x55007fffd8                                           
  Executing PC: 0x40091c                                                                
  Executing PC: 0x400920                                                                
  Executing PC: 0x400924                                                                
  Executing PC: 0x400910                                                                
  Executing PC: 0x400914                                                                
  PC 0x400914 accessed memory at 0x55007fffd0                                           
  PC 0x400914 accessed memory at 0x55007fffd8                                           
  Executing PC: 0x400918                                                                
  PC 0x400918 accessed memory at 0x55007fffd0                                           
  PC 0x400918 accessed memory at 0x55007fffd8                                           
  PC 0x400918 accessed memory at 0x55007fffd0 
  PC 0x400918 accessed memory at 0x55007fffd8                                           
  Executing PC: 0x40091c                                                                
  Executing PC: 0x400920                                                                
  Executing PC: 0x400924                                                                
  Executing PC: 0x400910                                                                
  Executing PC: 0x400914                                                                
  PC 0x400914 accessed memory at 0x55007fffd0                                           
  PC 0x400914 accessed memory at 0x55007fffd8                                           
  Executing PC: 0x400918                                                                
  PC 0x400918 accessed memory at 0x55007fffd0 
  PC 0x400918 accessed memory at 0x55007fffd8                                           
  PC 0x400918 accessed memory at 0x55007fffd0                                           
  PC 0x400918 accessed memory at 0x55007fffd8 

It's a bit clearer if you use the contrib/execlog plugin:

  ./qemu-aarch64 -plugin contrib/plugins/libexeclog.so -d plugin  ./tests/tcg/aarch64-linux-user/stxp

  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8 
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8 
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"
  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8 
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8 
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"
  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8 
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8 
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"
  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8 
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8 
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"
  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8 
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8 
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"

Although you can see stxp looks a bit weird on account of the loads it
does during the cmpxchg. So consider me stumped. The only thing I can
think of next is to see how closely I can replicate your build
environment.
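
For readers wondering why a store exclusive would report loads at all,
here is a minimal single-threaded sketch of a paired compare-and-exchange
that records each access it makes. It is a toy model, not QEMU's helper:
every name in it is hypothetical, and the access log merely stands in for
plugin memory callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum access_kind { ACCESS_LOAD, ACCESS_STORE };

struct access {
    enum access_kind kind;
    const uint64_t *addr;
};

/* Hypothetical fixed-size log standing in for plugin memory callbacks. */
struct access_log {
    struct access entries[8];
    size_t n;
};

static void record(struct access_log *log, enum access_kind kind,
                   const uint64_t *addr)
{
    assert(log != NULL && log->n < 8);
    log->entries[log->n].kind = kind;
    log->entries[log->n].addr = addr;
    log->n++;
}

/*
 * Paired compare-and-exchange: load both halves, and store the new pair
 * only if both halves match the expected values.
 * Returns 0 on success and 1 on failure.
 */
static int paired_cmpxchg(struct access_log *log, uint64_t mem[2],
                          const uint64_t expect[2], const uint64_t newv[2])
{
    uint64_t lo, hi;

    record(log, ACCESS_LOAD, &mem[0]);
    lo = mem[0];
    record(log, ACCESS_LOAD, &mem[1]);
    hi = mem[1];

    if (lo != expect[0] || hi != expect[1]) {
        return 1;
    }

    record(log, ACCESS_STORE, &mem[0]);
    mem[0] = newv[0];
    record(log, ACCESS_STORE, &mem[1]);
    mem[1] = newv[1];
    return 0;
}
```

A successful call logs load, load, store, store at base and base + 8,
which is the same shape as the stxp lines in the execlog output above.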

-- 
Alex Bennée



* Re: plugins: Missing Store Exclusive Memory Accesses
  2021-10-21 12:28             ` Alex Bennée
@ 2021-10-21 20:40               ` Aaron Lindsay via
  2021-10-22  8:37                 ` Alex Bennée
  0 siblings, 1 reply; 11+ messages in thread
From: Aaron Lindsay via @ 2021-10-21 20:40 UTC (permalink / raw)
  To: Alex Bennée; +Cc: richard.henderson, qemu-devel, cota

On Oct 21 13:28, Alex Bennée wrote:
> It's a bit clearer if you use the contrib/execlog plugin:
> 
>   ./qemu-aarch64 -plugin contrib/plugins/libexeclog.so -d plugin  ./tests/tcg/aarch64-linux-user/stxp
> 
>   0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
>   0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8 
>   0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8 
>   0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
>   0, 0x400920, 0x54000040, "b.eq #0x400928"
>   0, 0x400924, 0x17fffffb, "b #0x400910"
>   0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
>   0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8 
>   0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8 
>   0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
>   0, 0x400920, 0x54000040, "b.eq #0x400928"
>   0, 0x400924, 0x17fffffb, "b #0x400910"
>   0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
>   0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8 
>   0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8 
>   0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
>   0, 0x400920, 0x54000040, "b.eq #0x400928"
>   0, 0x400924, 0x17fffffb, "b #0x400910"
>   0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
>   0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8 
>   0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8 
>   0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
>   0, 0x400920, 0x54000040, "b.eq #0x400928"
>   0, 0x400924, 0x17fffffb, "b #0x400910"
>   0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
>   0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8 
>   0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8 
>   0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
>   0, 0x400920, 0x54000040, "b.eq #0x400928"
>   0, 0x400924, 0x17fffffb, "b #0x400910"
> 
> Although you can see stxp looks a bit weird on account of the loads it
> does during the cmpxchg. So consider me stumped. The only thing I can
> think of next is to see how closely I can replicate your build
> environment.

I apologize, I had apparently gotten farther behind upstream than I
realized since originally encountering this. I tried the latest upstream
code and am now able to observe the same thing as you. Somewhere between
v6.1.0 and now, the original issue I reported has been resolved.

However, I am not sure reporting loads for a store exclusive makes sense
to me here, either. My understanding is that the stxp needs to check if
it still has exclusive access and QEMU's implementation results in the
extra loads, but I would expect that the plugin interface would only
report architectural loads.

Is there any obvious way to omit the loads from the plugin interface
here?

-Aaron



* Re: plugins: Missing Store Exclusive Memory Accesses
  2021-10-21 20:40               ` Aaron Lindsay via
@ 2021-10-22  8:37                 ` Alex Bennée
  0 siblings, 0 replies; 11+ messages in thread
From: Alex Bennée @ 2021-10-22  8:37 UTC (permalink / raw)
  To: Aaron Lindsay; +Cc: cota, richard.henderson, qemu-devel


Aaron Lindsay <aaron@os.amperecomputing.com> writes:

> On Oct 21 13:28, Alex Bennée wrote:
>> It's a bit clearer if you use the contrib/execlog plugin:
>> 
>>   ./qemu-aarch64 -plugin contrib/plugins/libexeclog.so -d plugin  ./tests/tcg/aarch64-linux-user/stxp
>> 
>>   0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
>>   0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8 
>>   0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8 
>>   0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
>>   0, 0x400920, 0x54000040, "b.eq #0x400928"
>>   0, 0x400924, 0x17fffffb, "b #0x400910"
<snip>
>> 
>> Although you can see stxp looks a bit weird on account of the loads it
>> does during the cmpxchg. So consider me stumped. The only thing I can
>> think of next is to see how closely I can replicate your build
>> environment.
>
> I apologize, I had apparently gotten farther behind upstream than I
> realized since originally encountering this. I tried the latest upstream
> code and am now able to observe the same thing as you. Somewhere between
> v6.1.0 and now, the original issue I reported has been resolved.
>
> However, I am not sure reporting loads for a store exclusive makes sense
> to me here, either. My understanding is that the stxp needs to check if
> it still has exclusive access and QEMU's implementation results in the
> extra loads, but I would expect that the plugin interface would only
> report architectural loads.

Yes, this is an anomaly. It's not reporting all loads and stores, because
there are also accesses to cpu_exclusive_addr and cpu_exclusive_val, which
we use to simulate the exclusivity check, and those do not show up.
However, we don't currently have a way to signal to TCG that a cmpxchg is
only being done to simulate a store.
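
As a rough illustration of why the loads appear, here is a minimal
single-word model in C11 of a store exclusive implemented as a
compare-and-swap against the value remembered by the load exclusive. It
is not QEMU's code: struct excl_monitor, load_exclusive and
store_exclusive are names invented for this sketch, and the real stxp
operates on a register pair.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical monitor state, loosely echoing cpu_exclusive_addr/val. */
struct excl_monitor {
    _Atomic uint64_t *addr; /* address marked by the load exclusive */
    uint64_t val;           /* value the load exclusive observed */
};

/* Model of a load exclusive: read memory and remember what was seen. */
static uint64_t load_exclusive(struct excl_monitor *m, _Atomic uint64_t *addr)
{
    assert(m != NULL && addr != NULL);
    m->addr = addr;
    m->val = atomic_load(addr);
    return m->val;
}

/*
 * Model of a store exclusive. Returns 0 on success and 1 on failure,
 * matching the status value an stxr/stxp writes to its status register.
 */
static int store_exclusive(struct excl_monitor *m, _Atomic uint64_t *addr,
                           uint64_t newval)
{
    int status = 1;

    assert(m != NULL && addr != NULL);
    if (m->addr == addr) {
        uint64_t expected = m->val;
        /*
         * The compare-and-swap has to read memory in order to compare it
         * with the remembered value; that read is the "extra" load.
         */
        if (atomic_compare_exchange_strong(addr, &expected, newval)) {
            status = 0;
        }
    }
    m->addr = NULL; /* this model drops the reservation either way */
    return status;
}
```

The store only lands if memory still holds the value seen by the matching
load exclusive, so every store exclusive in this scheme is a
read-modify-write from the memory system's point of view.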

I guess we need to signal the helper in some way so that it calls
atomic_trace_st_post instead of atomic_trace_rmw_post. Ideally we
could signal this in metadata somehow (although I suspect adding
something to MemOpIdx might be too ugly). The alternative would be
defining another series of cmpxchg helpers that did this.
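
To make the idea concrete, here is a toy model of the reporting decision,
assuming some hint reaches the helper. It deliberately does not call the
real trace helpers, whose signatures are not shown in this thread;
access_kind, report_cmpxchg and the simulates_store flag are all
hypothetical, and the failed-cmpxchg case is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum access_kind { ACCESS_LOAD, ACCESS_STORE };

/*
 * Fill out[] with the accesses a plugin would be told about for one
 * successful cmpxchg and return how many there are.
 * simulates_store is the hypothetical hint that the cmpxchg only exists
 * to emulate a store exclusive.
 */
static size_t report_cmpxchg(bool simulates_store, enum access_kind out[2])
{
    size_t n = 0;

    assert(out != NULL);
    if (!simulates_store) {
        /* Genuine read-modify-write: the read is architectural. */
        out[n++] = ACCESS_LOAD;
    }
    out[n++] = ACCESS_STORE;
    return n;
}
```

With the flag set, an stxp would be reported with stores only; without
it, a real guest cmpxchg would still be reported as a load plus a store.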

Looking at the code also reminds me that we need to excise the broken
memory trace code.

> Is there any obvious way to omit the loads from the plugin interface
> here?
>
> -Aaron


-- 
Alex Bennée



Thread overview: 11+ messages
2021-09-16 20:44 plugins: Missing Store Exclusive Memory Accesses Aaron Lindsay
2021-09-17 11:05 ` Alex Bennée
2021-09-17 14:44   ` Aaron Lindsay via
2021-09-21 20:28   ` Aaron Lindsay via
2021-09-22 20:22     ` Aaron Lindsay via
2021-10-20 17:12       ` Aaron Lindsay via
2021-10-20 17:54         ` Alex Bennée
2021-10-20 20:49           ` Aaron Lindsay via
2021-10-21 12:28             ` Alex Bennée
2021-10-21 20:40               ` Aaron Lindsay via
2021-10-22  8:37                 ` Alex Bennée
