* plugins: Missing Store Exclusive Memory Accesses
From: Aaron Lindsay @ 2021-09-16 20:44 UTC
To: qemu-devel, Alex Bennée; +Cc: cota, richard.henderson

Hello,

I recently noticed that the plugin interface does not appear to be
emitting callbacks to functions registered via
`qemu_plugin_register_vcpu_mem_cb` for AArch64 store exclusives. This
would include instructions like `stxp w16, x2, x3, [x4]` (encoding:
0xc8300c82). Seeing as how I'm only running with a single CPU, I don't
see how this could be due to losing exclusivity after the preceding
`ldxp`.

In looking at QEMU's source, I *think* this is because the
`gen_store_exclusive` function in translate-a64.c is not making the same
calls to `plugin_gen_mem_callbacks` & company that are being made by
"normal" stores handled by functions like `tcg_gen_qemu_st_i64` (at
least in my case; I do see some code paths under `gen_store_exclusive`
call down into `tcg_gen_qemu_st_i64` eventually, but it appears not all
of them do?).

Does my initial guess check out? And, if so, does anyone have insight
into how to fix this issue most cleanly/generically? I suspect if/when I
debug my particular case I can discover one code path to fix, but I'm
wondering if my discovery may be part of a larger class of cases which
fell through the cracks and ought to be fixed together.

Thanks for any help,

Aaron
* Re: plugins: Missing Store Exclusive Memory Accesses
From: Alex Bennée @ 2021-09-17 11:05 UTC
To: Aaron Lindsay; +Cc: cota, richard.henderson, qemu-devel

Aaron Lindsay <aaron@os.amperecomputing.com> writes:

> Hello,
>
> I recently noticed that the plugin interface does not appear to be
> emitting callbacks to functions registered via
> `qemu_plugin_register_vcpu_mem_cb` for AArch64 store exclusives. This
> would include instructions like `stxp w16, x2, x3, [x4]` (encoding:
> 0xc8300c82). Seeing as how I'm only running with a single CPU, I don't
> see how this could be due to losing exclusivity after the preceding
> `ldxp`.

The exclusive handling is a bit special due to the need to emulate its
behaviour using cmpxchg primitives.

>
> In looking at QEMU's source, I *think* this is because the
> `gen_store_exclusive` function in translate-a64.c is not making the same
> calls to `plugin_gen_mem_callbacks` & company that are being made by
> "normal" stores handled by functions like `tcg_gen_qemu_st_i64` (at
> least in my case; I do see some code paths under `gen_store_exclusive`
> call down into `tcg_gen_qemu_st_i64` eventually, but it appears not all
> of them do?).

The key TCG operation is the cmpxchg which does the effective store. For
-smp 1 we should use normal ld and st tcg ops. For > 1 it eventually
falls to tcg_gen_atomic_cmpxchg_XX which is a helper. That eventually
ends up at:

  atomic_trace_rmw_post

which should be where things are hooked.

> Does my initial guess check out? And, if so, does anyone have insight
> into how to fix this issue most cleanly/generically? I suspect if/when I
> debug my particular case I can discover one code path to fix, but I'm
> wondering if my discovery may be part of a larger class of cases which
> fell through the cracks and ought to be fixed together.

Have you got a simple example of a test case?

>
> Thanks for any help,
>
> Aaron

--
Alex Bennée
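The idea of emulating an exclusive load/store pair with a compare-and-swap, as described in the reply above, can be sketched in plain C. This is a minimal illustration under stated assumptions, not QEMU's implementation: `ExclusiveMonitor`, `load_exclusive` and `store_exclusive` are hypothetical names, and only a single 64-bit location is modelled rather than the 128-bit pair used by `ldxp`/`stxp`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical monitor state: the address and value seen by the last
 * load-exclusive. The names are illustrative, not taken from QEMU. */
typedef struct {
    _Atomic uint64_t *addr;  /* NULL when no reservation is held */
    uint64_t val;            /* value observed by the load-exclusive */
} ExclusiveMonitor;

/* Load-exclusive: read the value and remember where it came from. */
static uint64_t load_exclusive(ExclusiveMonitor *m, _Atomic uint64_t *addr)
{
    m->addr = addr;
    m->val = atomic_load(addr);
    return m->val;
}

/* Store-exclusive: succeeds only if a reservation for this address is held
 * and memory still contains the remembered value. Returns 0 on success and
 * 1 on failure, following the status-register convention of STXR/STXP. */
static int store_exclusive(ExclusiveMonitor *m, _Atomic uint64_t *addr,
                           uint64_t newv)
{
    int status = 1;
    if (m->addr == addr) {
        uint64_t expected = m->val;
        /* The compare-exchange reads memory and writes it conditionally. */
        if (atomic_compare_exchange_strong(addr, &expected, newv)) {
            status = 0;
        }
    }
    m->addr = NULL;  /* the reservation is consumed either way */
    return status;
}
```

In this sketch the effective store happens inside the compare-exchange, so any instrumentation that only watches plain store operations would not see it.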
* Re: plugins: Missing Store Exclusive Memory Accesses
From: Aaron Lindsay @ 2021-09-17 14:44 UTC
To: Alex Bennée; +Cc: qemu-devel, cota, richard.henderson

On Sep 17 12:05, Alex Bennée wrote:
> Aaron Lindsay <aaron@os.amperecomputing.com> writes:
> > In looking at QEMU's source, I *think* this is because the
> > `gen_store_exclusive` function in translate-a64.c is not making the same
> > calls to `plugin_gen_mem_callbacks` & company that are being made by
> > "normal" stores handled by functions like `tcg_gen_qemu_st_i64` (at
> > least in my case; I do see some code paths under `gen_store_exclusive`
> > call down into `tcg_gen_qemu_st_i64` eventually, but it appears not all
> > of them do?).
>
> The key TCG operation is the cmpxchg which does the effective store. For
> -smp 1 we should use normal ld and st tcg ops. For > 1 it eventually
> falls to tcg_gen_atomic_cmpxchg_XX which is a helper. That eventually
> ends up at:
>
>   atomic_trace_rmw_post
>
> which should be where things are hooked.

If I am understanding you correctly, it seems like my `stxp` should be
using the "normal" load and store tcg ops since I am running with
`-smp 1`, and therefore correctly emitting plugin memory callbacks. I
think my next step is to figure out exactly which tcg code path is being
used for this instruction to remove any doubt about what's going on
here.

> > Does my initial guess check out? And, if so, does anyone have insight
> > into how to fix this issue most cleanly/generically? I suspect if/when I
> > debug my particular case I can discover one code path to fix, but I'm
> > wondering if my discovery may be part of a larger class of cases which
> > fell through the cracks and ought to be fixed together.
>
> Have you got simple example of a test case?

My test case is reasonably simple - I can reproduce the issue reliably
and in under 5 minutes - but I don't currently have a self-contained
version in a form I can share. Here is the surrounding dynamic
instruction stream, as reported by the plugin interface (via callbacks
registered with `qemu_plugin_register_vcpu_insn_exec_cb`), along with
corresponding memory accesses (reported via callbacks registered with
`qemu_plugin_register_vcpu_mem_cb`):

        pc         (  opcode  ): `disassembly`
------------------|------------|-------------
0xffff0000082076b4 (0x9436c8a9): `bl #0xffff000008fb9958`
0xffff000008fb9958 (0xf9800091): `prfm pstl1strm, [x4]`
0xffff000008fb995c (0xc87f4490): `ldxp x16, x17, [x4]`
 ^ accesses virtual addresses: 0xffff8002fffdde60, 0xffff8002fffdde68
0xffff000008fb9960 (0xca000210): `eor x16, x16, x0`
0xffff000008fb9964 (0xca010231): `eor x17, x17, x1`
0xffff000008fb9968 (0xaa110211): `orr x17, x16, x17`
0xffff000008fb996c (0xb5000071): `cbnz x17, #0xffff000008fb9978`
0xffff000008fb9970 (0xc8300c82): `stxp w16, x2, x3, [x4]`
0xffff000008fb9974 (0x35ffff50): `cbnz w16, #0xffff000008fb995c`
0xffff000008fb9978 (0xaa1103e0): `mov x0, x17`
0xffff000008fb997c (0xd65f03c0): `ret `
0xffff0000082076b8 (0xd503201f): `nop `
0xffff0000082076bc (0xd503201f): `nop `
0xffff0000082076c0 (0xd503201f): `nop `
0xffff0000082076c4 (0xb94010a1): `ldr w1, [x5, #0x10]`
 ^ accesses virtual addresses: 0xffff8002f18b5cd0
0xffff0000082076c8 (0x51000421): `sub w1, w1, #1`
0xffff0000082076cc (0xb90010a1): `str w1, [x5, #0x10]`
 ^ accesses virtual addresses: 0xffff8002f18b5cd0
0xffff0000082076d0 (0x35000061): `cbnz w1, #0xffff0000082076dc`

Notice that the `stxp` receives no corresponding callbacks via
`qemu_plugin_register_vcpu_mem_cb` like the `ldxp`, `ldr`, and `str` do.

-Aaron
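As a cross-check on the opcodes in the stream above, the register fields of the exclusive-pair instructions can be decoded by hand. The following C sketch assumes the A64 load/store exclusive pair layout (bit 31 set, bit 30 = size, bits 29:24 = 001000, bit 23 clear, bit 22 = L, bit 21 set, then Rs, o0, Rt2, Rn, Rt); the `ExclPair` type and `decode_excl_pair` function are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded fields of an A64 exclusive-pair instruction
 * (STXP, STLXP, LDXP, LDAXP). Hypothetical helper type. */
typedef struct {
    bool is_load;    /* L bit (22): 1 for loads, 0 for stores */
    bool is_64bit;   /* size bit (30): 1 for X registers, 0 for W */
    unsigned rs;     /* status register (31 for loads) */
    unsigned rt2;    /* second data register */
    unsigned rn;     /* base address register */
    unsigned rt;     /* first data register */
} ExclPair;

/* Returns true and fills *out if insn is in the exclusive-pair class. */
static bool decode_excl_pair(uint32_t insn, ExclPair *out)
{
    /* Fixed bits: 31 = 1, 29:24 = 001000, 23 = 0, 21 = 1. */
    if ((insn & 0xBFA00000u) != 0x88200000u) {
        return false;
    }
    out->is_64bit = (insn >> 30) & 1;
    out->is_load  = (insn >> 22) & 1;
    out->rs  = (insn >> 16) & 0x1f;
    out->rt2 = (insn >> 10) & 0x1f;
    out->rn  = (insn >> 5) & 0x1f;
    out->rt  = insn & 0x1f;
    return true;
}
```

Applied to 0xc8300c82 this yields Rs = 16, Rt = 2, Rt2 = 3, Rn = 4, which matches `stxp w16, x2, x3, [x4]`; a plain `str` such as 0xb90010a1 is rejected by the mask.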
* Re: plugins: Missing Store Exclusive Memory Accesses
From: Aaron Lindsay @ 2021-09-21 20:28 UTC
To: Alex Bennée; +Cc: qemu-devel, cota, richard.henderson

On Sep 17 12:05, Alex Bennée wrote:
> Aaron Lindsay <aaron@os.amperecomputing.com> writes:
> > I recently noticed that the plugin interface does not appear to be
> > emitting callbacks to functions registered via
> > `qemu_plugin_register_vcpu_mem_cb` for AArch64 store exclusives. This
> > would include instructions like `stxp w16, x2, x3, [x4]` (encoding:
> > 0xc8300c82). Seeing as how I'm only running with a single CPU, I don't
> > see how this could be due to losing exclusivity after the preceding
> > `ldxp`.
>
> The exclusive handling is a bit special due to the need to emulate it's
> behaviour using cmpxchg primitives.
>
> >
> > In looking at QEMU's source, I *think* this is because the
> > `gen_store_exclusive` function in translate-a64.c is not making the same
> > calls to `plugin_gen_mem_callbacks` & company that are being made by
> > "normal" stores handled by functions like `tcg_gen_qemu_st_i64` (at
> > least in my case; I do see some code paths under `gen_store_exclusive`
> > call down into `tcg_gen_qemu_st_i64` eventually, but it appears not all
> > of them do?).
>
> The key TCG operation is the cmpxchg which does the effective store. For
> -smp 1 we should use normal ld and st tcg ops. For > 1 it eventually
> falls to tcg_gen_atomic_cmpxchg_XX which is a helper. That eventually
> ends up at:
>
>   atomic_trace_rmw_post
>
> which should be where things are hooked.

When I open this up in gdb, I see that I'm getting the following call
graph for the `stxp` instruction in question (for -smp 1):

  gen_store_exclusive -> gen_helper_paired_cmpxchg64_le

In other words, I'm taking the `s->be_data == MO_LE` else/if clause.

I do not see where the helper behind that (defined in helper-a64.c as
`uint64_t HELPER(paired_cmpxchg64_le)...`) is calling in to generate
plugin callbacks in this case. Am I missing something?

-Aaron
* Re: plugins: Missing Store Exclusive Memory Accesses
From: Aaron Lindsay @ 2021-09-22 20:22 UTC
To: Alex Bennée, richard.henderson; +Cc: qemu-devel, cota

On Sep 21 16:28, Aaron Lindsay wrote:
> On Sep 17 12:05, Alex Bennée wrote:
<snip>
> > The key TCG operation is the cmpxchg which does the effective store. For
> > -smp 1 we should use normal ld and st tcg ops. For > 1 it eventually
> > falls to tcg_gen_atomic_cmpxchg_XX which is a helper. That eventually
> > ends up at:
> >
> >   atomic_trace_rmw_post
> >
> > which should be where things are hooked.
>
> When I open this up in gdb, I see that I'm getting the following call
> graph for the `stxp` instruction in question (for -smp 1):
>
>   gen_store_exclusive -> gen_helper_paired_cmpxchg64_le
>
> In other words, I'm taking the `s->be_data == MO_LE` else/if clause.
>
> I do not see where the helper behind that (defined in helper-a64.c as
> `uint64_t HELPER(paired_cmpxchg64_le)...`) is calling in to generate
> plugin callbacks in this case. Am I missing something?

Richard, Alex,

The more I look at this, the more it feels like the following
AArch64-specific helpers may have been overlooked when adding
tracing/plugin hooks:
  gen_helper_paired_cmpxchg64_le
  gen_helper_paired_cmpxchg64_be

But... I'm still not sure I fully understand how everything I'm digging
into interacts; I am happy to keep investigating and work towards a fix,
but think I need a nudge in the right direction.

Thanks for any nudges,

Aaron
* Re: plugins: Missing Store Exclusive Memory Accesses
From: Aaron Lindsay @ 2021-10-20 17:12 UTC
To: Alex Bennée, richard.henderson; +Cc: qemu-devel, cota

On Sep 22 16:22, Aaron Lindsay wrote:
> On Sep 21 16:28, Aaron Lindsay wrote:
<snip>
> > When I open this up in gdb, I see that I'm getting the following call
> > graph for the `stxp` instruction in question (for -smp 1):
> >
> >   gen_store_exclusive -> gen_helper_paired_cmpxchg64_le
> >
> > In other words, I'm taking the `s->be_data == MO_LE` else/if clause.
> >
> > I do not see where the helper behind that (defined in helper-a64.c as
> > `uint64_t HELPER(paired_cmpxchg64_le)...`) is calling in to generate
> > plugin callbacks in this case. Am I missing something?
>
> Richard, Alex,
>
> The more I look at this, the more it feels like the following
> AArch64-specific helpers may have been overlooked when adding
> tracing/plugin hooks:
>   gen_helper_paired_cmpxchg64_le
>   gen_helper_paired_cmpxchg64_be
>
> But... I'm still not sure I fully understand how everything I'm digging
> into interacts; I am happy to keep investigating and work towards a fix,
> but think I need a nudge in the right direction.

Ping?

I'm happy to spend some more time digging into this issue, and would
love to be pointed in the right direction if someone is able!

Thanks!

-Aaron
* Re: plugins: Missing Store Exclusive Memory Accesses
From: Alex Bennée @ 2021-10-20 17:54 UTC
To: Aaron Lindsay; +Cc: cota, richard.henderson, qemu-devel

Aaron Lindsay <aaron@os.amperecomputing.com> writes:

> On Sep 22 16:22, Aaron Lindsay wrote:
<snip>
>> Richard, Alex,
>>
>> The more I look at this, the more it feels like the following
>> AArch64-specific helpers may have been overlooked when adding
>> tracing/plugin hooks:
>>   gen_helper_paired_cmpxchg64_le
>>   gen_helper_paired_cmpxchg64_be
>>
>> But... I'm still not sure I fully understand how everything I'm digging
>> into interacts; I am happy to keep investigating and work towards a fix,
>> but think I need a nudge in the right direction.
>
> Ping?
>
> I'm happy to spend some more time digging into this issue, and would
> love to be pointed in the right direction if someone is able!

These all end up in:

  ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
                                ABI_TYPE cmpv, ABI_TYPE newv,
                                MemOpIdx oi, uintptr_t retaddr)

Have you got a test case you are using so I can try and replicate the
failure you are seeing? So far by inspection everything looks OK to me.

>
> Thanks!
>
> -Aaron

--
Alex Bennée
* Re: plugins: Missing Store Exclusive Memory Accesses
From: Aaron Lindsay @ 2021-10-20 20:49 UTC
To: Alex Bennée; +Cc: richard.henderson, qemu-devel, cota

On Oct 20 18:54, Alex Bennée wrote:
> Have you got a test case you are using so I can try and replicate the
> failure you are seeing? So far by inspection everything looks OK to me.

I took some time today to put together a minimal(ish) reproducer using
usermode. The source files used are below. I compiled the test binary on
an AArch64 system using:

$ gcc -g -o stxp stxp.s stxp.c

Then built the plugin from stxp_plugin.cc, and ran it all like:

qemu-aarch64 \
    -cpu cortex-a57 \
    -D stxp_plugin.log \
    -d plugin \
    -plugin 'stxp_plugin.so' \
    ./stxp

I observe that, for me, the objdump of stxp contains:

000000000040070c <loop>:
  40070c:   f9800011    prfm    pstl1strm, [x0]
  400710:   c87f4410    ldxp    x16, x17, [x0]
  400714:   c8300c02    stxp    w16, x2, x3, [x0]
  400718:   f1000652    subs    x18, x18, #0x1
  40071c:   54000040    b.eq    400724 <done>  // b.none
  400720:   17fffffb    b       40070c <loop>

But the output in stxp_plugin.log looks something like:

    Executing PC: 0x40070c
    Executing PC: 0x400710
    PC 0x400710 accessed memory at 0x550080ec70
    PC 0x400710 accessed memory at 0x550080ec78
    Executing PC: 0x400714
    Executing PC: 0x400718
    Executing PC: 0x40071c
    Executing PC: 0x400720

From this, I believe the ldxp instruction at PC 0x400710 is reporting two
memory accesses but the stxp instruction at 0x400714 is not.

-Aaron

--- stxp.c ---

void stxp_issue_demo();

int main()
{
    char arr[16];
    stxp_issue_demo(&arr);
}

--- stxp.s ---

.align 8

stxp_issue_demo:
    mov x18, 0x1000
    mov x2, 0x0
    mov x3, 0x0
loop:
    prfm  pstl1strm, [x0]
    ldxp  x16, x17, [x0]
    stxp  w16, x2, x3, [x0]

    subs x18, x18, 1
    beq done
    b loop
done:
    ret

.global stxp_issue_demo

--- stxp_plugin.cc ---

#include <inttypes.h>  /* PRIx64 */
#include <stdarg.h>    /* va_list, va_start, va_end */
#include <stdio.h>

extern "C" {

#include <qemu-plugin.h>

QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;

/* printf-style wrapper around qemu_plugin_outs() */
void qemu_logf(const char *str, ...)
{
    char message[1024];
    va_list args;
    va_start(args, str);
    vsnprintf(message, sizeof(message), str, args);

    qemu_plugin_outs(message);

    va_end(args);
}

/* Logs the PC of every executed instruction */
void before_insn_cb(unsigned int cpu_index, void *udata)
{
    uint64_t pc = (uint64_t)udata;
    qemu_logf("Executing PC: 0x%" PRIx64 "\n", pc);
}

/* Logs every memory access together with the PC that issued it */
static void mem_cb(unsigned int cpu_index, qemu_plugin_meminfo_t meminfo,
                   uint64_t va, void *udata)
{
    uint64_t pc = (uint64_t)udata;
    qemu_logf("PC 0x%" PRIx64 " accessed memory at 0x%" PRIx64 "\n", pc, va);
}

/* Registers both callbacks on every instruction of each translated block */
static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
{
    size_t n = qemu_plugin_tb_n_insns(tb);

    for (size_t i = 0; i < n; i++) {
        struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, i);
        uint64_t pc = qemu_plugin_insn_vaddr(insn);

        qemu_plugin_register_vcpu_insn_exec_cb(insn, before_insn_cb,
                                               QEMU_PLUGIN_CB_R_REGS,
                                               (void *)pc);
        qemu_plugin_register_vcpu_mem_cb(insn, mem_cb,
                                         QEMU_PLUGIN_CB_NO_REGS,
                                         QEMU_PLUGIN_MEM_RW, (void *)pc);
    }
}

QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
                                           const qemu_info_t *info,
                                           int argc, char **argv)
{
    qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
    return 0;
}

}
* Re: plugins: Missing Store Exclusive Memory Accesses
From: Alex Bennée @ 2021-10-21 12:28 UTC
To: Aaron Lindsay; +Cc: cota, richard.henderson, qemu-devel

Aaron Lindsay <aaron@os.amperecomputing.com> writes:

> On Oct 20 18:54, Alex Bennée wrote:
>> Have you got a test case you are using so I can try and replicate the
>> failure you are seeing? So far by inspection everything looks OK to me.
>
> I took some time today to put together a minimal(ish) reproducer using
> usermode. The source files used are below, I compiled the test binary on an
> AArch64 system using:
>
> $ gcc -g -o stxp stxp.s stxp.c
>
> Then built the plugin from stxp_plugin.cc, and ran it all like:
>
> qemu-aarch64 \
>     -cpu cortex-a57 \
>     -D stxp_plugin.log \
>     -d plugin \
>     -plugin 'stxp_plugin.so' \
>     ./stxp
>
> I observe that, for me, the objdump of stxp contains:
> 000000000040070c <loop>:
>   40070c:   f9800011    prfm    pstl1strm, [x0]
>   400710:   c87f4410    ldxp    x16, x17, [x0]
>   400714:   c8300c02    stxp    w16, x2, x3, [x0]
>   400718:   f1000652    subs    x18, x18, #0x1
>   40071c:   54000040    b.eq    400724 <done>  // b.none
>   400720:   17fffffb    b       40070c <loop>
>
> But the output in stxp_plugin.log looks something like:
>     Executing PC: 0x40070c
>     Executing PC: 0x400710
>     PC 0x400710 accessed memory at 0x550080ec70
>     PC 0x400710 accessed memory at 0x550080ec78
>     Executing PC: 0x400714
>     Executing PC: 0x400718
>     Executing PC: 0x40071c
>     Executing PC: 0x400720
>
> From this, I believe the ldxp instruction at PC 0x400710 is reporting two
> memory accesses but the stxp instruction at 0x400714 is not.

This is fascinating but I can't replicate your results. I get the
following pattern:

  Executing PC: 0x400910
  Executing PC: 0x400914
  PC 0x400914 accessed memory at 0x55007fffd0
  PC 0x400914 accessed memory at 0x55007fffd8
  Executing PC: 0x400918
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  Executing PC: 0x40091c
  Executing PC: 0x400920
  Executing PC: 0x400924
  Executing PC: 0x400910
  Executing PC: 0x400914
  PC 0x400914 accessed memory at 0x55007fffd0
  PC 0x400914 accessed memory at 0x55007fffd8
  Executing PC: 0x400918
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  Executing PC: 0x40091c
  Executing PC: 0x400920
  Executing PC: 0x400924
  Executing PC: 0x400910
  Executing PC: 0x400914
  PC 0x400914 accessed memory at 0x55007fffd0
  PC 0x400914 accessed memory at 0x55007fffd8
  Executing PC: 0x400918
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  Executing PC: 0x40091c
  Executing PC: 0x400920
  Executing PC: 0x400924
  Executing PC: 0x400910
  Executing PC: 0x400914
  PC 0x400914 accessed memory at 0x55007fffd0
  PC 0x400914 accessed memory at 0x55007fffd8
  Executing PC: 0x400918
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8

It's a bit clearer if you use the contrib/execlog plugin:

  ./qemu-aarch64 -plugin contrib/plugins/libexeclog.so -d plugin ./tests/tcg/aarch64-linux-user/stxp

  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"
  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"
  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"
  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"
  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"

Although you can see stxp looks a bit weird on account of the loads it
does during the cmpxchg. So consider me stumped. The only thing I can
think of next is to see how closely I can replicate your build
environment.

--
Alex Bennée
* Re: plugins: Missing Store Exclusive Memory Accesses
From: Aaron Lindsay @ 2021-10-21 20:40 UTC
To: Alex Bennée; +Cc: richard.henderson, qemu-devel, cota

On Oct 21 13:28, Alex Bennée wrote:
> It's a bit clearer if you use the contrib/execlog plugin:
>
>   ./qemu-aarch64 -plugin contrib/plugins/libexeclog.so -d plugin ./tests/tcg/aarch64-linux-user/stxp
>
>   0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
>   0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8
>   0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8
>   0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
>   0, 0x400920, 0x54000040, "b.eq #0x400928"
>   0, 0x400924, 0x17fffffb, "b #0x400910"
<snip>
>
> Although you can see stxp looks a bit weird on account of the loads it
> does during the cmpxchg. So consider me stumped. The only thing I can
> think of next is to see how closely I can replicate your build
> environment.

I apologize, I had apparently gotten farther behind upstream than I
realized since originally encountering this. I tried the latest upstream
code and am now able to observe the same thing as you. Somewhere between
v6.1.0 and now, the original issue I reported has been resolved.

However, I am not sure reporting loads for a store exclusive makes sense
to me here, either. My understanding is that the stxp needs to check if
it still has exclusive access and QEMU's implementation results in the
extra loads, but I would expect that the plugin interface would only
report architectural loads.

Is there any obvious way to omit the loads from the plugin interface
here?

-Aaron
* Re: plugins: Missing Store Exclusive Memory Accesses
From: Alex Bennée @ 2021-10-22 8:37 UTC
To: Aaron Lindsay; +Cc: cota, richard.henderson, qemu-devel

Aaron Lindsay <aaron@os.amperecomputing.com> writes:

> On Oct 21 13:28, Alex Bennée wrote:
>> It's a bit clearer if you use the contrib/execlog plugin:
>>
>>   ./qemu-aarch64 -plugin contrib/plugins/libexeclog.so -d plugin ./tests/tcg/aarch64-linux-user/stxp
>>
>>   0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
>>   0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8
>>   0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8
>>   0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
>>   0, 0x400920, 0x54000040, "b.eq #0x400928"
>>   0, 0x400924, 0x17fffffb, "b #0x400910"
<snip>
>>
>> Although you can see stxp looks a bit weird on account of the loads it
>> does during the cmpxchg. So consider me stumped. The only thing I can
>> think of next is to see how closely I can replicate your build
>> environment.
>
> I apologize, I had apparently gotten farther behind upstream than I
> realized since originally encountering this. I tried the latest upstream
> code and am now able to observe the same thing as you. Somewhere between
> v6.1.0 and now, the original issue I reported has been resolved.
>
> However, I am not sure reporting loads for a store exclusive makes sense
> to me here, either. My understanding is that the stxp needs to check if
> it still has exclusive access and QEMU's implementation results in the
> extra loads, but I would expect that the plugin interface would only
> report architectural loads.

Yes, this is an anomaly. It's not reporting all loads and stores because
there are accesses to cpu_exclusive_addr and cpu_exclusive_val which we
use to simulate the exclusivity check. However, we don't currently have a
way to signal to the TCG that a cmpxchg is only being done to simulate a
store.

I guess we need to signal the helper somehow so that it avoids calling
atomic_trace_rmw_post and calls atomic_trace_st_post instead. Ideally we
could signal this in metadata somehow (although I suspect adding
something to MemOpIdx might be too ugly). The alternative would be
defining another series of cmpxchg helpers that did this.

Looking at the code also reminds me that we need to excise the broken
memory trace code.

> Is there any obvious way to omit the loads from the plugin interface
> here?
>
> -Aaron

--
Alex Bennée
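Until something along those lines exists, one conceivable workaround is to drop the internal loads on the consumer side, after the accesses have been collected. The C sketch below is only an illustration under stated assumptions: it is not part of the QEMU plugin API, `MemRecord`, `is_store_exclusive` and `drop_loads_of_store_exclusives` are hypothetical names, and the opcode test is a simplified match on the A64 store-exclusive encodings (bits 29:24 = 001000, bit 23 = 0, L bit 22 = 0, with bit 31 required for the pair form so that CASP is not matched).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One memory access as a consumer might record it from a memory callback.
 * Hypothetical structure, not a QEMU type. */
typedef struct {
    uint64_t pc;      /* address of the instruction making the access */
    uint32_t opcode;  /* raw A64 encoding of that instruction */
    bool is_store;    /* true for stores, false for loads */
    uint64_t vaddr;   /* accessed virtual address */
} MemRecord;

/* Simplified test for A64 store-exclusive encodings (STXR*, STLXR*,
 * STXP, STLXP). */
static bool is_store_exclusive(uint32_t insn)
{
    if ((insn & 0x3FC00000u) != 0x08000000u) {
        return false;
    }
    if (insn & (1u << 21)) {
        /* Pair form: bit 31 must be set, which excludes CASP. */
        return (insn >> 31) & 1;
    }
    return true;
}

/* Compacts recs in place, removing loads attributed to store-exclusive
 * instructions. Returns the number of records kept. */
static size_t drop_loads_of_store_exclusives(MemRecord *recs, size_t n)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        bool internal_load = !recs[i].is_store &&
                             is_store_exclusive(recs[i].opcode);
        if (!internal_load) {
            recs[kept++] = recs[i];
        }
    }
    return kept;
}
```

Fed the `ldxp`/`stxp` accesses from the execlog output earlier in the thread, this keeps the two loads of the `ldxp` and the two stores of the `stxp`, and discards the two loads reported for the `stxp`.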