qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] Fwd: [joel@sing.id.au: atomic failures on qemu-system-riscv64]
       [not found] <4350e3aabc7651c63286c78ffa3f273b5cb16884.camel@wdc.com>
@ 2019-06-05 20:59 ` Marco Peereboom
  2019-06-05 23:18   ` Palmer Dabbelt
  0 siblings, 1 reply; 4+ messages in thread
From: Marco Peereboom @ 2019-06-05 20:59 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3865 bytes --]

Joel is on vacation so here it is again.

> Begin forwarded message:
> 
> From: Alistair Francis <Alistair.Francis@wdc.com>
> Subject: Re: [joel@sing.id.au: atomic failures on qemu-system-riscv64]
> Date: June 5, 2019 at 7:19:53 PM GMT+1
> To: "joel@sing.id.au" <joel@sing.id.au>, "palmer@sifive.com" <palmer@sifive.com>
> Cc: "marco@decred.org" <marco@decred.org>, "me@carlosedp.com" <me@carlosedp.com>
> 
> On Fri, 2019-05-31 at 03:21 +1000, Joel Sing wrote:
>> I've just sent this to qemu-riscv@nongnu.org - forwarding for
>> visibility...
> 
> Hello Joel,
> 
> Can you please send this to the QEMU mailing list?
> https://wiki.qemu.org/Contribute/MailingLists
> 
>> 
>> ----- Forwarded message from Joel Sing <joel@sing.id.au> -----
>> 
>> Date: Fri, 31 May 2019 03:20:03 +1000
>> From: Joel Sing <joel@sing.id.au>
>> To: qemu-riscv@nongnu.org
>> Subject: atomic failures on qemu-system-riscv64
>> User-Agent: Mutt/1.10.1 (2018-07-13)
>> 
>> While working on a Go (www.golang.org) port for riscv, I've run
>> into issues with atomics (namely LR/SC) on qemu-system-riscv64.
>> There are several reproducers for this problem including (using
>> gcc builtin atomics):
>> 
>>  https://gist.github.com/4a6f656c/8433032a3f70893a278259f8108aad90
>> 
>> And a version using inline assembly:
>> 
>>  https://gist.github.com/4a6f656c/d883091f5ca811822720213be343a75a
>> 
>> Depending on the qemu configuration the number of threads may
>> need increasing (to force context switching) and/or run in a
>> loop. Go's sync/atomic tests also fail regularly.
>> 
>> Having dug into the qemu code, what I believe is happening is
>> along the lines of the following:
>> 
>> 1. Thread 1 runs and executes an LR - this assigns an address
>>   to load_res and a value to load_val (say 1).
>> 
>> 2. A context switch occurs and thread 2 is now run - it runs
>>   an LR and SC on the same address modifying the stored value.
>>   Another LR is executed loading load_val with the current
>>   value (say 3).
>> 
>> 3. A context switch occurs and thread 1 is now run again, it
>>   continues on its LR/SC sequence and now runs the SC instruction.
>>   This is based on the assumption that it had a reservation
>>   and the SC will fail if the memory has changed. The underlying
>>   implementation of SC is a cmpxchg with the value in load_val
>>   - this no longer has the original value and hence successfully
>>   compares (as does the tcg_gen_setcond_tl() between the returned
>>   value and load_val) thus the SC succeeds when it should not.
>> 
>> The diff below clears load_res when the mode changes - with this
>> applied the reproducers work correctly, as do Go's atomic tests.
>> I'm not sure this "fix" is 100% correct, but it certainly verifies
>> where the problem lies. It does also fall inline with the RISCV
>> spec:
>> 
>> "The SC must fail if there is an observable memory access from
>> another hart to the address, or if there is an intervening context
>> switch on this hart, or if in the meantime the hart executed a
>> privileged exception-return instruction."
>> 
>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
>> index b17f169681..9875b8e5d3 100644
>> --- a/target/riscv/cpu_helper.c
>> +++ b/target/riscv/cpu_helper.c
>> @@ -113,6 +113,8 @@ void riscv_cpu_set_mode(CPURISCVState *env,
>> target_ulong newpriv)
>>     }
>>     /* tlb_flush is unnecessary as mode is contained in mmu_idx */
>>     env->priv = newpriv;
>> +
>> +    env->load_res = 0;
> 
> This looks ok to me, I'll read the spec to double check though. Do you
> mind adding a comment in the code as to why this is required?
> 
> Alistair
> 
>> }
>> 
>> /* get_physical_address - get the physical address for this virtual
>> address
>> 
>> ----- End forwarded message -----


[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] Fwd: [joel@sing.id.au: atomic failures on qemu-system-riscv64]
  2019-06-05 20:59 ` [Qemu-devel] Fwd: [joel@sing.id.au: atomic failures on qemu-system-riscv64] Marco Peereboom
@ 2019-06-05 23:18   ` Palmer Dabbelt
  2019-06-07  2:50     ` Richard Henderson
  0 siblings, 1 reply; 4+ messages in thread
From: Palmer Dabbelt @ 2019-06-05 23:18 UTC (permalink / raw)
  To: marco; +Cc: qemu-riscv, qemu-devel

On Wed, 05 Jun 2019 13:59:53 PDT (-0700), marco@decred.org wrote:
> Joel is on vacation so here it is again.
>
>> Begin forwarded message:
>> 
>> From: Alistair Francis <Alistair.Francis@wdc.com>
>> Subject: Re: [joel@sing.id.au: atomic failures on qemu-system-riscv64]
>> Date: June 5, 2019 at 7:19:53 PM GMT+1
>> To: "joel@sing.id.au" <joel@sing.id.au>, "palmer@sifive.com" <palmer@sifive.com>
>> Cc: "marco@decred.org" <marco@decred.org>, "me@carlosedp.com" <me@carlosedp.com>
>> 
>> On Fri, 2019-05-31 at 03:21 +1000, Joel Sing wrote:
>>> I've just sent this to qemu-riscv@nongnu.org - forwarding for
>>> visibility...
>> 
>> Hello Joel,
>> 
>> Can you please send this to the QEMU mailing list?
>> https://wiki.qemu.org/Contribute/MailingLists
>> 
>>> 
>>> ----- Forwarded message from Joel Sing <joel@sing.id.au> -----
>>> 
>>> Date: Fri, 31 May 2019 03:20:03 +1000
>>> From: Joel Sing <joel@sing.id.au>
>>> To: qemu-riscv@nongnu.org
>>> Subject: atomic failures on qemu-system-riscv64
>>> User-Agent: Mutt/1.10.1 (2018-07-13)
>>> 
>>> While working on a Go (www.golang.org) port for riscv, I've run
>>> into issues with atomics (namely LR/SC) on qemu-system-riscv64.
>>> There are several reproducers for this problem including (using
>>> gcc builtin atomics):
>>> 
>>>  https://gist.github.com/4a6f656c/8433032a3f70893a278259f8108aad90
>>> 
>>> And a version using inline assembly:
>>> 
>>>  https://gist.github.com/4a6f656c/d883091f5ca811822720213be343a75a
>>> 
>>> Depending on the qemu configuration the number of threads may
>>> need increasing (to force context switching) and/or run in a
>>> loop. Go's sync/atomic tests also fail regularly.
>>> 
>>> Having dug into the qemu code, what I believe is happening is
>>> along the lines of the following:
>>> 
>>> 1. Thread 1 runs and executes an LR - this assigns an address
>>>   to load_res and a value to load_val (say 1).
>>> 
>>> 2. A context switch occurs and thread 2 is now run - it runs
>>>   an LR and SC on the same address modifying the stored value.
>>>   Another LR is executed loading load_val with the current
>>>   value (say 3).
>>> 
>>> 3. A context switch occurs and thread 1 is now run again, it
>>>   continues on its LR/SC sequence and now runs the SC instruction.
>>>   This is based on the assumption that it had a reservation
>>>   and the SC will fail if the memory has changed. The underlying
>>>   implementation of SC is a cmpxchg with the value in load_val
>>>   - this no longer has the original value and hence successfully
>>>   compares (as does the tcg_gen_setcond_tl() between the returned
>>>   value and load_val) thus the SC succeeds when it should not.
>>> 
>>> The diff below clears load_res when the mode changes - with this
>>> applied the reproducers work correctly, as do Go's atomic tests.
>>> I'm not sure this "fix" is 100% correct, but it certainly verifies
>>> where the problem lies. It does also fall inline with the RISCV
>>> spec:
>>> 
>>> "The SC must fail if there is an observable memory access from
>>> another hart to the address, or if there is an intervening context
>>> switch on this hart, or if in the meantime the hart executed a
>>> privileged exception-return instruction."

Sorry about this, but that wording is actually a bug in the ISA manual.  The
current version says

    An SC must fail if there is
    another SC (to any address) between the LR and the SC in program
    order.  The precise statement of the atomicity requirements for
    successful LR/SC sequences is defined by the Atomicity Axiom in
    Section~\ref{sec:rvwmo}.
    
    \begin{commentary}
    A store-conditional instruction to a scratch word of memory should be used
    during a preemptive context switch to forcibly yield any existing load
    reservation.
    \end{commentary}

The text in version 2.2 was changed because it doesn't really have meaning:
harts are a hardware construct, but context switches are a software context.
There were a few of these bugs floating around the ISA manual and we cleaned
them up more than a year ago, but the version on the RISC-V website hasn't
changed.  The ratified ISA manual will have the correct wording.

That said, the Linux user ABI allows programs to rely on reservations being
shot down on context switches -- if that wasn't the case then LR/SC would be
useless.  That's how the ambiguous wording snuck into the user ISA manual: we
confused the ISA behavior (ie, what the hardware does) with the user ABI
behavior (ie, what programs can rely on).

The reason you're only seeing this in QEMU is because the hardware has a
timeout on load reservations of something like 1000 cycles, and the Linux
context switching code takes longer than that timeout.  For some reason I
remember having fixed this bug in Linux, but I must have forgotten to actually
send out the patch.

Also, unless I'm misunderstanding something our implementation of LR/SC is
pretty broken.  We're just using a CAS to check if the value changed, which
suffers from the ABA problem that LR/SC is there to fix in the first place.  I
might be missing something here, though, as it looks like MIPS is doing
essentially the same thing.

>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
>>> index b17f169681..9875b8e5d3 100644
>>> --- a/target/riscv/cpu_helper.c
>>> +++ b/target/riscv/cpu_helper.c
>>> @@ -113,6 +113,8 @@ void riscv_cpu_set_mode(CPURISCVState *env,
>>> target_ulong newpriv)
>>>     }
>>>     /* tlb_flush is unnecessary as mode is contained in mmu_idx */
>>>     env->priv = newpriv;
>>> +
>>> +    env->load_res = 0;
>> 
>> This looks ok to me, I'll read the spec to double check though. Do you
>> mind adding a comment in the code as to why this is required?
>> 
>> Alistair
>> 
>>> }
>>> 
>>> /* get_physical_address - get the physical address for this virtual
>>> address
>>> 
>>> ----- End forwarded message -----


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] Fwd: [joel@sing.id.au: atomic failures on qemu-system-riscv64]
  2019-06-05 23:18   ` Palmer Dabbelt
@ 2019-06-07  2:50     ` Richard Henderson
  2019-06-08  5:41       ` Palmer Dabbelt
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Henderson @ 2019-06-07  2:50 UTC (permalink / raw)
  To: Palmer Dabbelt; +Cc: open list:RISC-V, qemu-devel, marco

>
> Also, unless I'm misunderstanding something our implementation of LR/SC is
>
pretty broken.  We're just using a CAS to check if the value changed, which
> suffers from the ABA problem that LR/SC is there to fix in the first
> place.  I
> might be missing something here, though, as it looks like MIPS is doing
> essentially the same thing.
>

All of our load-lock/store-conditional implementations do the same.

You are correct that we do not implement the exact correct semantics.
Correct semantics, as far as I know, cannot be modeled without emulating
everything in a serial context, including caches.

We therefore make a considered choice to observe that while ll/sc can be
used to do all kinds of wild and woolly things, no one actually does so.
What people actually do is write portable code, using c/c++ atomic
primitives. And that all of these can be done with cmpxchg, because that's
what x86 has.

Using that choice, we can provide multi-threaded smp emulation that,
demonstrably, works.

>

r~

>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] Fwd: [joel@sing.id.au: atomic failures on qemu-system-riscv64]
  2019-06-07  2:50     ` Richard Henderson
@ 2019-06-08  5:41       ` Palmer Dabbelt
  0 siblings, 0 replies; 4+ messages in thread
From: Palmer Dabbelt @ 2019-06-08  5:41 UTC (permalink / raw)
  To: richard.henderson; +Cc: qemu-riscv, qemu-devel, marco

On Thu, 06 Jun 2019 19:50:57 PDT (-0700), richard.henderson@linaro.org wrote:
>>
>> Also, unless I'm misunderstanding something our implementation of LR/SC is
>>
> pretty broken.  We're just using a CAS to check if the value changed, which
>> suffers from the ABA problem that LR/SC is there to fix in the first
>> place.  I
>> might be missing something here, though, as it looks like MIPS is doing
>> essentially the same thing.
>>
>
> All of our load-lock/store-conditional implementations do the same.
>
> You are correct that we do not implement the exact correct semantics.
> Correct semantics, as far as I know, cannot be modeled without emulating
> everything in a serial context, including caches.
>
> We therefore make a considered choice to observe that while ll/sc can be
> used to do all kinds of wild and woolly things, no one actually does so.
> What people actually do is write portable code, using c/c++ atomic
> primitives. And that all of these can be done with cmpxchg, because that's
> what x86 has.
>
> Using that choice, we can provide multi-threaded smp emulation that,
> demonstrably, works.

OK, that makes sense.  I was just worried that we were screwing something up on
our end.


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-06-08  5:42 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <4350e3aabc7651c63286c78ffa3f273b5cb16884.camel@wdc.com>
2019-06-05 20:59 ` [Qemu-devel] Fwd: [joel@sing.id.au: atomic failures on qemu-system-riscv64] Marco Peereboom
2019-06-05 23:18   ` Palmer Dabbelt
2019-06-07  2:50     ` Richard Henderson
2019-06-08  5:41       ` Palmer Dabbelt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).