All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alistair Francis <alistair23@gmail.com>
To: Palmer Dabbelt <palmer@sifive.com>
Cc: me@carlosedp.com, "open list:RISC-V" <qemu-riscv@nongnu.org>,
	Richard Henderson <richard.henderson@linaro.org>,
	"qemu-devel@nongnu.org Developers" <qemu-devel@nongnu.org>,
	Joel Sing <joel@sing.id.au>,
	Alistair Francis <Alistair.Francis@wdc.com>,
	marco@decred.org
Subject: Re: [Qemu-devel] [PATCH] atomic failures on qemu-system-riscv64
Date: Tue, 24 Sep 2019 18:01:36 -0700	[thread overview]
Message-ID: <CAKmqyKPVxkbKkZPg5u+nKiLtNgtf5Na6JgzF3ocrx8wy8+bxJw@mail.gmail.com> (raw)
In-Reply-To: <CAKmqyKON2=3mO-n3nXwyqN9i-RDOeyrpjxt9JJMSBwyyiEiuow@mail.gmail.com>

On Tue, Sep 24, 2019 at 4:35 PM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Tue, Sep 24, 2019 at 1:04 PM Palmer Dabbelt <palmer@sifive.com> wrote:
> >
> > On Tue, 24 Sep 2019 11:29:25 PDT (-0700), alistair23@gmail.com wrote:
> > > On Mon, Jun 24, 2019 at 11:21 AM Joel Sing <joel@sing.id.au> wrote:
> > >>
> > >> On 19-06-17 16:52:44, Richard Henderson wrote:
> > >> > On 6/16/19 12:19 PM, Joel Sing wrote:
> > >> > > +    /*
> > >> > > +     * Clear the load reservation, since an SC must fail if there is
> > >> > > +     * an SC to any address, in between an LR and SC pair.
> > >> > > +     */
> > >> > > +    tcg_gen_movi_tl(load_res, 0);
> > >> > > +
> > >> > >      gen_set_label(l2);
> > >> >
> > >> > This clear needs to be moved down below label l2.
> > >> > Otherwise, with lr / sc / sc, the second sc could succeed in error.
> > >>
> > >> Indeed, thanks.
> > >>
> > >> > FWIW, other targets have used -1 as the "invalid" load reservation, since the
> > >> > architecture does not require address 0 to be unmapped.  This should be quite
> > >> > visible in M-mode with paging disabled and ram at offset 0.  Often, other
> > >> > targets require alignment for the lr/sc address, though I don't see that for riscv.
> > >>
> > >> I've switched to -1 as suggested. Regarding the alignment for reservations, the
> > >> specification does require this, although I do not recall seeing any enforcement
> > >> of this by qemu itself.
> > >>
> > >> New diff follows.
> > >>
> > >> From 8ef31a2ce8ef1cbeee92995a0b2994f480e9bb6d Mon Sep 17 00:00:00 2001
> > >> From: Joel Sing <joel@sing.id.au>
> > >> Date: Tue, 25 Jun 2019 02:44:24 +1000
> > >> Subject: [PATCH] Clear load reservations on qemu riscv target
> > >>
> > >> This prevents a load reservation from being placed in one context/process,
> > >> then being used in another, resulting in an SC succeeding incorrectly and
> > >> breaking atomics.
> > >>
> > >> Signed-off-by: Joel Sing <joel@sing.id.au>
> > >> ---
> > >>  target/riscv/cpu.c                      | 1 +
> > >>  target/riscv/cpu_helper.c               | 9 +++++++++
> > >>  target/riscv/insn_trans/trans_rva.inc.c | 8 +++++++-
> > >>  3 files changed, 17 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> > >> index d61bce6d55..e7c8bf48fc 100644
> > >> --- a/target/riscv/cpu.c
> > >> +++ b/target/riscv/cpu.c
> > >> @@ -281,6 +281,7 @@ static void riscv_cpu_reset(CPUState *cs)
> > >>      env->pc = env->resetvec;
> > >>  #endif
> > >>      cs->exception_index = EXCP_NONE;
> > >> +    env->load_res = -1;
> > >>      set_default_nan_mode(1, &env->fp_status);
> > >>  }
> > >>
> > >> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > >> index b17f169681..6a07b12e65 100644
> > >> --- a/target/riscv/cpu_helper.c
> > >> +++ b/target/riscv/cpu_helper.c
> > >> @@ -113,6 +113,15 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong newpriv)
> > >>      }
> > >>      /* tlb_flush is unnecessary as mode is contained in mmu_idx */
> > >>      env->priv = newpriv;
> > >> +
> > >> +    /* Clear the load reservation - otherwise a reservation placed in one
> > >> +     * context/process can be used by another, resulting in an SC succeeding
> > >> +     * incorrectly. Version 2.2 of the ISA specification explicitly requires
> > >> +     * this behaviour, while later revisions say that the kernel "should" use
> > >> +     * an SC instruction to force the yielding of a load reservation on a
> > >> +     * preemptive context switch. As a result, do both.
> > >> +     */
> > >> +    env->load_res = -1;
> > >>  }
> > >>
> > >>  /* get_physical_address - get the physical address for this virtual address
> > >> diff --git a/target/riscv/insn_trans/trans_rva.inc.c b/target/riscv/insn_trans/trans_rva.inc.c
> > >> index f6dbbc065e..fadd88849e 100644
> > >> --- a/target/riscv/insn_trans/trans_rva.inc.c
> > >> +++ b/target/riscv/insn_trans/trans_rva.inc.c
> > >> @@ -61,7 +61,7 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
> > >>
> > >>      gen_set_label(l1);
> > >>      /*
> > >> -     * Address comparion failure.  However, we still need to
> > >> +     * Address comparison failure.  However, we still need to
> > >>       * provide the memory barrier implied by AQ/RL.
> > >>       */
> > >>      tcg_gen_mb(TCG_MO_ALL + a->aq * TCG_BAR_LDAQ + a->rl * TCG_BAR_STRL);
> > >> @@ -69,6 +69,12 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
> > >>      gen_set_gpr(a->rd, dat);
> > >>
> > >>      gen_set_label(l2);
> > >> +    /*
> > >> +     * Clear the load reservation, since an SC must fail if there is
> > >> +     * an SC to any address, in between an LR and SC pair.
> > >> +     */
> > >> +    tcg_gen_movi_tl(load_res, -1);
> > >> +
> > >>      tcg_temp_free(dat);
> > >>      tcg_temp_free(src1);
> > >>      tcg_temp_free(src2);
> > >> --
> > >
> > > This patch causes boot failures when booting systemd built with musl on RV64.
> > >
> > > It could possibly be a musl bug, but I wanted to throw that out here
> > > first to see what people think.
> >
> > Looking at the musl port, I see at least one bug in their atomics jumping out
> > at me:
> >
> > diff --git a/arch/riscv64/atomic_arch.h b/arch/riscv64/atomic_arch.h
> > index c9765342..41ad4d04 100644
> > --- a/arch/riscv64/atomic_arch.h
> > +++ b/arch/riscv64/atomic_arch.h
> > @@ -14,7 +14,7 @@ static inline int a_cas(volatile int *p, int t, int s)
> >                 "       sc.w.aqrl %1, %4, (%2)\n"
> >                 "       bnez %1, 1b\n"
> >                 "1:"
> > -               : "=&r"(old), "=r"(tmp)
> > +               : "=&r"(old), "=&r"(tmp)
> >                 : "r"(p), "r"(t), "r"(s)
> >                 : "memory");
> >         return old;
> > @@ -31,7 +31,7 @@ static inline void *a_cas_p(volatile void *p, void *t, void *s)
> >                 "       sc.d.aqrl %1, %4, (%2)\n"
> >                 "       bnez %1, 1b\n"
> >                 "1:"
> > -               : "=&r"(old), "=r"(tmp)
> > +               : "=&r"(old), "=&r"(tmp)
> >                 : "r"(p), "r"(t), "r"(s)
> >                 : "memory");
> >         return old;
> >
> > It's a shot in the dark as to whether that'll fix your bug, but I could at
> > least see a mechanism for it: before we yielded load reservations on context
> > switches then that backwards branch would never be taken, so we wouldn't notice
> > if tmp was allocated into one of the same registers as the inputs.  Even if it
> > doesn't fix your issue it's still a bug so I'll send the patch out, just LMK so
> > I can indicate how important the issue is.
>
> I haven't had a chance to test this fix yet. The bug was reported by
> Khem (and other OE people) as it's break musl for RISC-V.

I did get a chance to test it. This seems to fix the issue :)

Please send the patch to musl and CC me when you do.

Good catch!

Alistair

>
> >
> > This should manifest on hardware, but it looks like we managed to drop that SC
> > patch.  I'll go send the patch out now...
>
> Thanks, do you mind CCing me?
>
> Alistair
>
> >
> > >
> > > Alistair
> > >
> > >> 2.21.0
> > >>
> > >>


WARNING: multiple messages have this Message-ID (diff)
From: Alistair Francis <alistair23@gmail.com>
To: Palmer Dabbelt <palmer@sifive.com>
Cc: Joel Sing <joel@sing.id.au>,
	Richard Henderson <richard.henderson@linaro.org>,
	me@carlosedp.com,  "open list:RISC-V" <qemu-riscv@nongnu.org>,
	 "qemu-devel@nongnu.org Developers" <qemu-devel@nongnu.org>,
	Alistair Francis <Alistair.Francis@wdc.com>,
	marco@decred.org
Subject: Re: [Qemu-devel] [PATCH] atomic failures on qemu-system-riscv64
Date: Tue, 24 Sep 2019 18:01:36 -0700	[thread overview]
Message-ID: <CAKmqyKPVxkbKkZPg5u+nKiLtNgtf5Na6JgzF3ocrx8wy8+bxJw@mail.gmail.com> (raw)
In-Reply-To: <CAKmqyKON2=3mO-n3nXwyqN9i-RDOeyrpjxt9JJMSBwyyiEiuow@mail.gmail.com>

On Tue, Sep 24, 2019 at 4:35 PM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Tue, Sep 24, 2019 at 1:04 PM Palmer Dabbelt <palmer@sifive.com> wrote:
> >
> > On Tue, 24 Sep 2019 11:29:25 PDT (-0700), alistair23@gmail.com wrote:
> > > On Mon, Jun 24, 2019 at 11:21 AM Joel Sing <joel@sing.id.au> wrote:
> > >>
> > >> On 19-06-17 16:52:44, Richard Henderson wrote:
> > >> > On 6/16/19 12:19 PM, Joel Sing wrote:
> > >> > > +    /*
> > >> > > +     * Clear the load reservation, since an SC must fail if there is
> > >> > > +     * an SC to any address, in between an LR and SC pair.
> > >> > > +     */
> > >> > > +    tcg_gen_movi_tl(load_res, 0);
> > >> > > +
> > >> > >      gen_set_label(l2);
> > >> >
> > >> > This clear needs to be moved down below label l2.
> > >> > Otherwise, with lr / sc / sc, the second sc could succeed in error.
> > >>
> > >> Indeed, thanks.
> > >>
> > >> > FWIW, other targets have used -1 as the "invalid" load reservation, since the
> > >> > architecture does not require address 0 to be unmapped.  This should be quite
> > >> > visible in M-mode with paging disabled and ram at offset 0.  Often, other
> > >> > targets require alignment for the lr/sc address, though I don't see that for riscv.
> > >>
> > >> I've switched to -1 as suggested. Regarding the alignment for reservations, the
> > >> specification does require this, although I do not recall seeing any enforcement
> > >> of this by qemu itself.
> > >>
> > >> New diff follows.
> > >>
> > >> From 8ef31a2ce8ef1cbeee92995a0b2994f480e9bb6d Mon Sep 17 00:00:00 2001
> > >> From: Joel Sing <joel@sing.id.au>
> > >> Date: Tue, 25 Jun 2019 02:44:24 +1000
> > >> Subject: [PATCH] Clear load reservations on qemu riscv target
> > >>
> > >> This prevents a load reservation from being placed in one context/process,
> > >> then being used in another, resulting in an SC succeeding incorrectly and
> > >> breaking atomics.
> > >>
> > >> Signed-off-by: Joel Sing <joel@sing.id.au>
> > >> ---
> > >>  target/riscv/cpu.c                      | 1 +
> > >>  target/riscv/cpu_helper.c               | 9 +++++++++
> > >>  target/riscv/insn_trans/trans_rva.inc.c | 8 +++++++-
> > >>  3 files changed, 17 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> > >> index d61bce6d55..e7c8bf48fc 100644
> > >> --- a/target/riscv/cpu.c
> > >> +++ b/target/riscv/cpu.c
> > >> @@ -281,6 +281,7 @@ static void riscv_cpu_reset(CPUState *cs)
> > >>      env->pc = env->resetvec;
> > >>  #endif
> > >>      cs->exception_index = EXCP_NONE;
> > >> +    env->load_res = -1;
> > >>      set_default_nan_mode(1, &env->fp_status);
> > >>  }
> > >>
> > >> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > >> index b17f169681..6a07b12e65 100644
> > >> --- a/target/riscv/cpu_helper.c
> > >> +++ b/target/riscv/cpu_helper.c
> > >> @@ -113,6 +113,15 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong newpriv)
> > >>      }
> > >>      /* tlb_flush is unnecessary as mode is contained in mmu_idx */
> > >>      env->priv = newpriv;
> > >> +
> > >> +    /* Clear the load reservation - otherwise a reservation placed in one
> > >> +     * context/process can be used by another, resulting in an SC succeeding
> > >> +     * incorrectly. Version 2.2 of the ISA specification explicitly requires
> > >> +     * this behaviour, while later revisions say that the kernel "should" use
> > >> +     * an SC instruction to force the yielding of a load reservation on a
> > >> +     * preemptive context switch. As a result, do both.
> > >> +     */
> > >> +    env->load_res = -1;
> > >>  }
> > >>
> > >>  /* get_physical_address - get the physical address for this virtual address
> > >> diff --git a/target/riscv/insn_trans/trans_rva.inc.c b/target/riscv/insn_trans/trans_rva.inc.c
> > >> index f6dbbc065e..fadd88849e 100644
> > >> --- a/target/riscv/insn_trans/trans_rva.inc.c
> > >> +++ b/target/riscv/insn_trans/trans_rva.inc.c
> > >> @@ -61,7 +61,7 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
> > >>
> > >>      gen_set_label(l1);
> > >>      /*
> > >> -     * Address comparion failure.  However, we still need to
> > >> +     * Address comparison failure.  However, we still need to
> > >>       * provide the memory barrier implied by AQ/RL.
> > >>       */
> > >>      tcg_gen_mb(TCG_MO_ALL + a->aq * TCG_BAR_LDAQ + a->rl * TCG_BAR_STRL);
> > >> @@ -69,6 +69,12 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
> > >>      gen_set_gpr(a->rd, dat);
> > >>
> > >>      gen_set_label(l2);
> > >> +    /*
> > >> +     * Clear the load reservation, since an SC must fail if there is
> > >> +     * an SC to any address, in between an LR and SC pair.
> > >> +     */
> > >> +    tcg_gen_movi_tl(load_res, -1);
> > >> +
> > >>      tcg_temp_free(dat);
> > >>      tcg_temp_free(src1);
> > >>      tcg_temp_free(src2);
> > >> --
> > >
> > > This patch causes boot failures when booting systemd built with musl on RV64.
> > >
> > > It could possibly be a musl bug, but I wanted to throw that out here
> > > first to see what people think.
> >
> > Looking at the musl port, I see at least one bug in their atomics jumping out
> > at me:
> >
> > diff --git a/arch/riscv64/atomic_arch.h b/arch/riscv64/atomic_arch.h
> > index c9765342..41ad4d04 100644
> > --- a/arch/riscv64/atomic_arch.h
> > +++ b/arch/riscv64/atomic_arch.h
> > @@ -14,7 +14,7 @@ static inline int a_cas(volatile int *p, int t, int s)
> >                 "       sc.w.aqrl %1, %4, (%2)\n"
> >                 "       bnez %1, 1b\n"
> >                 "1:"
> > -               : "=&r"(old), "=r"(tmp)
> > +               : "=&r"(old), "=&r"(tmp)
> >                 : "r"(p), "r"(t), "r"(s)
> >                 : "memory");
> >         return old;
> > @@ -31,7 +31,7 @@ static inline void *a_cas_p(volatile void *p, void *t, void *s)
> >                 "       sc.d.aqrl %1, %4, (%2)\n"
> >                 "       bnez %1, 1b\n"
> >                 "1:"
> > -               : "=&r"(old), "=r"(tmp)
> > +               : "=&r"(old), "=&r"(tmp)
> >                 : "r"(p), "r"(t), "r"(s)
> >                 : "memory");
> >         return old;
> >
> > It's a shot in the dark as to whether that'll fix your bug, but I could at
> > least see a mechanism for it: before we yielded load reservations on context
> > switches then that backwards branch would never be taken, so we wouldn't notice
> > if tmp was allocated into one of the same registers as the inputs.  Even if it
> > doesn't fix your issue it's still a bug so I'll send the patch out, just LMK so
> > I can indicate how important the issue is.
>
> I haven't had a chance to test this fix yet. The bug was reported by
> Khem (and other OE people) as it's break musl for RISC-V.

I did get a chance to test it. This seems to fix the issue :)

Please send the patch to musl and CC me when you do.

Good catch!

Alistair

>
> >
> > This should manifest on hardware, but it looks like we managed to drop that SC
> > patch.  I'll go send the patch out now...
>
> Thanks, do you mind CCing me?
>
> Alistair
>
> >
> > >
> > > Alistair
> > >
> > >> 2.21.0
> > >>
> > >>


  reply	other threads:[~2019-09-25  1:07 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-16 19:19 [Qemu-devel] [PATCH] atomic failures on qemu-system-riscv64 Joel Sing
2019-06-16 19:19 ` [Qemu-riscv] " Joel Sing
2019-06-16 20:54 ` [Qemu-devel] " no-reply
2019-06-16 20:54   ` [Qemu-riscv] " no-reply
2019-06-17 23:52 ` Richard Henderson
2019-06-17 23:52   ` [Qemu-riscv] " Richard Henderson
2019-06-24  6:42   ` Palmer Dabbelt
2019-06-24  6:42     ` [Qemu-riscv] " Palmer Dabbelt
2019-06-24 18:08   ` Joel Sing
2019-06-24 18:08     ` [Qemu-riscv] " Joel Sing
2019-06-25 15:36     ` Richard Henderson
2019-06-25 15:36       ` [Qemu-riscv] " Richard Henderson
2019-06-26  6:07       ` Palmer Dabbelt
2019-06-26  6:07         ` [Qemu-riscv] " Palmer Dabbelt
2019-06-26  7:48         ` Richard Henderson
2019-06-26  7:48           ` [Qemu-riscv] " Richard Henderson
2019-06-26  8:25           ` Palmer Dabbelt
2019-06-26  8:25             ` [Qemu-riscv] " Palmer Dabbelt
2019-06-26  8:30             ` Richard Henderson
2019-06-26  8:30               ` [Qemu-riscv] " Richard Henderson
2019-06-26  8:31               ` Palmer Dabbelt
2019-06-26  8:31                 ` [Qemu-riscv] " Palmer Dabbelt
2019-06-25 15:39     ` Richard Henderson
2019-06-25 15:39       ` [Qemu-riscv] " Richard Henderson
2019-06-26  6:07       ` Palmer Dabbelt
2019-06-26  6:07         ` [Qemu-riscv] " Palmer Dabbelt
2019-09-24 18:29     ` Alistair Francis
2019-09-24 18:29       ` Alistair Francis
2019-09-24 20:04       ` Palmer Dabbelt
2019-09-24 20:04         ` Palmer Dabbelt
2019-09-24 23:35         ` Alistair Francis
2019-09-24 23:35           ` Alistair Francis
2019-09-25  1:01           ` Alistair Francis [this message]
2019-09-25  1:01             ` Alistair Francis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKmqyKPVxkbKkZPg5u+nKiLtNgtf5Na6JgzF3ocrx8wy8+bxJw@mail.gmail.com \
    --to=alistair23@gmail.com \
    --cc=Alistair.Francis@wdc.com \
    --cc=joel@sing.id.au \
    --cc=marco@decred.org \
    --cc=me@carlosedp.com \
    --cc=palmer@sifive.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-riscv@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.