* [Qemu-devel] [PATCH v1 0/1] RISC-V: Critical fixes for QEMU 2.12 @ 2018-03-27 19:54 Michael Clark 2018-03-27 19:54 ` [Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug Michael Clark 0 siblings, 1 reply; 8+ messages in thread From: Michael Clark @ 2018-03-27 19:54 UTC (permalink / raw) To: qemu-devel, Peter Maydell Cc: patches, Michael Clark, Palmer Dabbelt, Sagar Karandikar, Bastian Koppelmann This series includes changes that are considered release critical, such as floating point register file corruption under SMP Linux. Michael Clark (1): RISC-V: Workaround for critical mstatus.FS MTTCG bug target/riscv/op_helper.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) -- 2.7.0 ^ permalink raw reply [flat|nested] 8+ messages in thread
* [Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug 2018-03-27 19:54 [Qemu-devel] [PATCH v1 0/1] RISC-V: Critical fixes for QEMU 2.12 Michael Clark @ 2018-03-27 19:54 ` Michael Clark 2018-03-27 22:17 ` Philippe Mathieu-Daudé 2018-03-28 2:08 ` [Qemu-devel] [patches] " Palmer Dabbelt 0 siblings, 2 replies; 8+ messages in thread From: Michael Clark @ 2018-03-27 19:54 UTC (permalink / raw) To: qemu-devel Cc: patches, Michael Clark, Palmer Dabbelt, Sagar Karandikar, Bastian Koppelmann, Peter Maydell This change is a workaround for a bug where mstatus.FS is not correctly reporting dirty when MTTCG and SMP are enabled which results in the floating point register file not being saved during context switches. This a critical bug for RISC-V in QEMU as it results in floating point register file corruption when running SMP Linux in the RISC-V 'virt' machine. This workaround will return dirty if mstatus.FS is switched from off to initial or clean. We have checked the specification and it is legal for an implementation to return either off, or dirty, if set to initial or clean. This workaround will result in unnecessary floating point save restore. When mstatus.FS is off, floating point instruction trap to indicate the process is using the FPU. The OS can then save floating-point state of the previous process using the FPU and set mstatus.FS to initial or clean. With this workaround, mstatus.FS will always return dirty if set to a non-zero value, indicating floating point save restore is necessary, versus misreporting mstatus.FS resulting in floating point register file corruption. Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Sagar Karandikar <sagark@eecs.berkeley.edu> Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Cc: Peter Maydell <peter.maydell@linaro.org> Tested-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Michael Clark <mjc@sifive.com> --- target/riscv/op_helper.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c index e34715d..7281b98 100644 --- a/target/riscv/op_helper.c +++ b/target/riscv/op_helper.c @@ -144,8 +144,23 @@ void csr_write_helper(CPURISCVState *env, target_ulong val_to_write, } mstatus = (mstatus & ~mask) | (val_to_write & mask); - int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS; - dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS; + + /* Note: this is a workaround for an issue where mstatus.FS + does not report dirty when SMP and MTTCG is enabled. This + workaround is technically compliant with the RISC-V Privileged + specification as it is legal to return only off, or dirty, + however this may cause unnecessary saves of floating point state. + Without this workaround, floating point state is not saved and + restored correctly when SMP and MTTCG is enabled, */ + if (qemu_tcg_mttcg_enabled()) { + /* FP is always dirty or off */ + if (mstatus & MSTATUS_FS) { + mstatus |= MSTATUS_FS; + } + } + + int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) | + ((mstatus & MSTATUS_XS) == MSTATUS_XS); mstatus = set_field(mstatus, MSTATUS_SD, dirty); env->mstatus = mstatus; break; -- 2.7.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug 2018-03-27 19:54 ` [Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug Michael Clark @ 2018-03-27 22:17 ` Philippe Mathieu-Daudé 2018-03-28 0:15 ` Michael Clark 2018-03-28 2:08 ` [Qemu-devel] [patches] " Palmer Dabbelt 1 sibling, 1 reply; 8+ messages in thread From: Philippe Mathieu-Daudé @ 2018-03-27 22:17 UTC (permalink / raw) To: qemu-devel, Alex Bennée, Richard Henderson Cc: Michael Clark, Peter Maydell, Sagar Karandikar, Bastian Koppelmann, Palmer Dabbelt, patches Cc'ing Alex and Richard. On 03/27/2018 04:54 PM, Michael Clark wrote: > This change is a workaround for a bug where mstatus.FS > is not correctly reporting dirty when MTTCG and SMP are > enabled which results in the floating point register file > not being saved during context switches. This a critical > bug for RISC-V in QEMU as it results in floating point > register file corruption when running SMP Linux in the > RISC-V 'virt' machine. > > This workaround will return dirty if mstatus.FS is > switched from off to initial or clean. We have checked > the specification and it is legal for an implementation > to return either off, or dirty, if set to initial or clean. > > This workaround will result in unnecessary floating point > save restore. When mstatus.FS is off, floating point > instruction trap to indicate the process is using the FPU. > The OS can then save floating-point state of the previous > process using the FPU and set mstatus.FS to initial or > clean. With this workaround, mstatus.FS will always return > dirty if set to a non-zero value, indicating floating point > save restore is necessary, versus misreporting mstatus.FS > resulting in floating point register file corruption. > > Cc: Palmer Dabbelt <palmer@sifive.com> > Cc: Sagar Karandikar <sagark@eecs.berkeley.edu> > Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> > Cc: Peter Maydell <peter.maydell@linaro.org> > Tested-by: Richard W.M. Jones <rjones@redhat.com> > Signed-off-by: Michael Clark <mjc@sifive.com> > --- > target/riscv/op_helper.c | 19 +++++++++++++++++-- > 1 file changed, 17 insertions(+), 2 deletions(-) > > diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c > index e34715d..7281b98 100644 > --- a/target/riscv/op_helper.c > +++ b/target/riscv/op_helper.c > @@ -144,8 +144,23 @@ void csr_write_helper(CPURISCVState *env, target_ulong val_to_write, > } > > mstatus = (mstatus & ~mask) | (val_to_write & mask); > - int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS; > - dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS; > + > + /* Note: this is a workaround for an issue where mstatus.FS > + does not report dirty when SMP and MTTCG is enabled. This > + workaround is technically compliant with the RISC-V Privileged > + specification as it is legal to return only off, or dirty, > + however this may cause unnecessary saves of floating point state. > + Without this workaround, floating point state is not saved and > + restored correctly when SMP and MTTCG is enabled, */ > + if (qemu_tcg_mttcg_enabled()) { > + /* FP is always dirty or off */ > + if (mstatus & MSTATUS_FS) { > + mstatus |= MSTATUS_FS; > + } > + } > + > + int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) | > + ((mstatus & MSTATUS_XS) == MSTATUS_XS); > mstatus = set_field(mstatus, MSTATUS_SD, dirty); > env->mstatus = mstatus; > break; > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug 2018-03-27 22:17 ` Philippe Mathieu-Daudé @ 2018-03-28 0:15 ` Michael Clark 2018-03-28 2:26 ` Richard Henderson 2018-03-30 7:11 ` Alex Bennée 0 siblings, 2 replies; 8+ messages in thread From: Michael Clark @ 2018-03-28 0:15 UTC (permalink / raw) To: Philippe Mathieu-Daudé Cc: QEMU Developers, Alex Bennée, Richard Henderson, Peter Maydell, Sagar Karandikar, Bastian Koppelmann, Palmer Dabbelt, RISC-V Patches On Tue, Mar 27, 2018 at 3:17 PM, Philippe Mathieu-Daudé <f4bug@amsat.org> wrote: > Cc'ing Alex and Richard. > > On 03/27/2018 04:54 PM, Michael Clark wrote: > > This change is a workaround for a bug where mstatus.FS > > is not correctly reporting dirty when MTTCG and SMP are > > enabled which results in the floating point register file > > not being saved during context switches. This a critical > > bug for RISC-V in QEMU as it results in floating point > > register file corruption when running SMP Linux in the > > RISC-V 'virt' machine. > > > > This workaround will return dirty if mstatus.FS is > > switched from off to initial or clean. We have checked > > the specification and it is legal for an implementation > > to return either off, or dirty, if set to initial or clean. > > > > This workaround will result in unnecessary floating point > > save restore. When mstatus.FS is off, floating point > > instruction trap to indicate the process is using the FPU. > > The OS can then save floating-point state of the previous > > process using the FPU and set mstatus.FS to initial or > > clean. With this workaround, mstatus.FS will always return > > dirty if set to a non-zero value, indicating floating point > > save restore is necessary, versus misreporting mstatus.FS > > resulting in floating point register file corruption. > > > > Cc: Palmer Dabbelt <palmer@sifive.com> > > Cc: Sagar Karandikar <sagark@eecs.berkeley.edu> > > Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> > > Cc: Peter Maydell <peter.maydell@linaro.org> > > Tested-by: Richard W.M. Jones <rjones@redhat.com> > > Signed-off-by: Michael Clark <mjc@sifive.com> > > --- > > target/riscv/op_helper.c | 19 +++++++++++++++++-- > > 1 file changed, 17 insertions(+), 2 deletions(-) > > > > diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c > > index e34715d..7281b98 100644 > > --- a/target/riscv/op_helper.c > > +++ b/target/riscv/op_helper.c > > @@ -144,8 +144,23 @@ void csr_write_helper(CPURISCVState *env, > target_ulong val_to_write, > > } > > > > mstatus = (mstatus & ~mask) | (val_to_write & mask); > > - int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS; > > - dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS; > > + > > + /* Note: this is a workaround for an issue where mstatus.FS > > + does not report dirty when SMP and MTTCG is enabled. This > > + workaround is technically compliant with the RISC-V > Privileged > > + specification as it is legal to return only off, or dirty, > > + however this may cause unnecessary saves of floating point > state. > > + Without this workaround, floating point state is not saved > and > > + restored correctly when SMP and MTTCG is enabled, */ > On looking at this again, I think we may need to remove the qemu_tcg_mttcg_enabled conditional and always return dirty if the state is initial or clean, but not off. While testing on uniprocessor worked okay, it's likely because we were lucky and there was no task migration or multiple FPU tasks working. This would mean we would revert to Richard W.M. Jones initial patch. > + if (qemu_tcg_mttcg_enabled()) { > > + /* FP is always dirty or off */ > > + if (mstatus & MSTATUS_FS) { > > + mstatus |= MSTATUS_FS; > > + } > > + } > > + > > + int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) | > > + ((mstatus & MSTATUS_XS) == MSTATUS_XS); > > mstatus = set_field(mstatus, MSTATUS_SD, dirty); > > env->mstatus = mstatus; > > break; > > > The text from the specification that allows us to always return dirty if set to initial or clean, is below i.e. Dirty implies state has "potentially" been modified, so that gives us wriggle room. " When an extension's status is set to Off , any instruction that attempts to read or write the corresponding state will cause an exception. When the status is Initial, the corresponding state should have an initial constant value. When the status is Clean, the corresponding state is potentially di fferent from the initial value, but matches the last value stored on a context swap. When the status is Dirty, the corresponding state has potentially been modif ed since the last context save. " I think the problem is Linux is setting the state to clean after saving fpu register state [1], but we have no code in the QEMU FPU operations to set the state to dirty, if is clean or initial, only code to cause an exception if the floating point extension state is set to off e.g. static void gen_fp_store(DisasContext *ctx, uint32_t opc, int rs1, int rs2, target_long imm) { TCGv t0; if (!(ctx->flags & TB_FLAGS_FP_ENABLE)) { gen_exception_illegal(ctx); return; } t0 = tcg_temp_new(); gen_get_gpr(t0, rs1); tcg_gen_addi_tl(t0, t0, imm); switch (opc) { case OPC_RISC_FSW: tcg_gen_qemu_st_i64(cpu_fpr[rs2], t0, ctx->mem_idx, MO_TEUL); break; case OPC_RISC_FSD: tcg_gen_qemu_st_i64(cpu_fpr[rs2], t0, ctx->mem_idx, MO_TEQ); break; default: gen_exception_illegal(ctx); break; } tcg_temp_free(t0); } [1] https://github.com/torvalds/linux/blob/master/arch/riscv/include/asm/switch_to.h ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug 2018-03-28 0:15 ` Michael Clark @ 2018-03-28 2:26 ` Richard Henderson 2018-03-30 7:11 ` Alex Bennée 1 sibling, 0 replies; 8+ messages in thread From: Richard Henderson @ 2018-03-28 2:26 UTC (permalink / raw) To: Michael Clark, Philippe Mathieu-Daudé Cc: QEMU Developers, Alex Bennée, Peter Maydell, Sagar Karandikar, Bastian Koppelmann, Palmer Dabbelt, RISC-V Patches On 03/28/2018 08:15 AM, Michael Clark wrote: > On looking at this again, I think we may need to remove > the qemu_tcg_mttcg_enabled conditional and always return dirty if the state is > initial or clean, but not off. Yes. > While testing on uniprocessor worked okay, it's likely because we were lucky > and there was no task migration or multiple FPU tasks working. Also yes. > > + if (qemu_tcg_mttcg_enabled()) { > > + /* FP is always dirty or off */ > > + if (mstatus & MSTATUS_FS) { > > + mstatus |= MSTATUS_FS; > > + } > > + } I've just posted an alternate patch set to track MSTATUS_FS more exactly, but if that's thought to be too much this late in the cycle, then I'll sign off on this patch without the mttcg test. r~ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug 2018-03-28 0:15 ` Michael Clark 2018-03-28 2:26 ` Richard Henderson @ 2018-03-30 7:11 ` Alex Bennée 2018-03-30 17:01 ` Michael Clark 1 sibling, 1 reply; 8+ messages in thread From: Alex Bennée @ 2018-03-30 7:11 UTC (permalink / raw) To: Michael Clark Cc: Philippe Mathieu-Daudé, QEMU Developers, Richard Henderson, Peter Maydell, Sagar Karandikar, Bastian Koppelmann, Palmer Dabbelt, RISC-V Patches Michael Clark <mjc@sifive.com> writes: > On Tue, Mar 27, 2018 at 3:17 PM, Philippe Mathieu-Daudé <f4bug@amsat.org> > wrote: > >> Cc'ing Alex and Richard. >> >> On 03/27/2018 04:54 PM, Michael Clark wrote: >> > This change is a workaround for a bug where mstatus.FS >> > is not correctly reporting dirty when MTTCG and SMP are >> > enabled which results in the floating point register file >> > not being saved during context switches. This a critical >> > bug for RISC-V in QEMU as it results in floating point >> > register file corruption when running SMP Linux in the >> > RISC-V 'virt' machine. >> > >> > This workaround will return dirty if mstatus.FS is >> > switched from off to initial or clean. We have checked >> > the specification and it is legal for an implementation >> > to return either off, or dirty, if set to initial or clean. >> > >> > This workaround will result in unnecessary floating point >> > save restore. When mstatus.FS is off, floating point >> > instruction trap to indicate the process is using the FPU. >> > The OS can then save floating-point state of the previous >> > process using the FPU and set mstatus.FS to initial or >> > clean. With this workaround, mstatus.FS will always return >> > dirty if set to a non-zero value, indicating floating point >> > save restore is necessary, versus misreporting mstatus.FS >> > resulting in floating point register file corruption. >> > >> > Cc: Palmer Dabbelt <palmer@sifive.com> >> > Cc: Sagar Karandikar <sagark@eecs.berkeley.edu> >> > Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> >> > Cc: Peter Maydell <peter.maydell@linaro.org> >> > Tested-by: Richard W.M. Jones <rjones@redhat.com> >> > Signed-off-by: Michael Clark <mjc@sifive.com> >> > --- >> > target/riscv/op_helper.c | 19 +++++++++++++++++-- >> > 1 file changed, 17 insertions(+), 2 deletions(-) >> > >> > diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c >> > index e34715d..7281b98 100644 >> > --- a/target/riscv/op_helper.c >> > +++ b/target/riscv/op_helper.c >> > @@ -144,8 +144,23 @@ void csr_write_helper(CPURISCVState *env, >> target_ulong val_to_write, >> > } >> > >> > mstatus = (mstatus & ~mask) | (val_to_write & mask); >> > - int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS; >> > - dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS; >> > + >> > + /* Note: this is a workaround for an issue where mstatus.FS >> > + does not report dirty when SMP and MTTCG is enabled. This >> > + workaround is technically compliant with the RISC-V >> Privileged >> > + specification as it is legal to return only off, or dirty, >> > + however this may cause unnecessary saves of floating point >> state. >> > + Without this workaround, floating point state is not saved >> and >> > + restored correctly when SMP and MTTCG is enabled, */ >> > > On looking at this again, I think we may need to remove the > qemu_tcg_mttcg_enabled conditional and always return dirty if the state is > initial or clean, but not off. > > While testing on uniprocessor worked okay, it's likely because we were > lucky and there was no task migration or multiple FPU tasks working. This > would mean we would revert to Richard W.M. Jones initial patch. > >> + if (qemu_tcg_mttcg_enabled()) { >> > + /* FP is always dirty or off */ >> > + if (mstatus & MSTATUS_FS) { >> > + mstatus |= MSTATUS_FS; >> > + } >> > + } I'm confused. If mstatus is a per-vCPU variable why does the enabling or not of MTTCG matter here? >> > + >> > + int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) | >> > + ((mstatus & MSTATUS_XS) == MSTATUS_XS); >> > mstatus = set_field(mstatus, MSTATUS_SD, dirty); >> > env->mstatus = mstatus; >> > break; >> > >> > > The text from the specification that allows us to always return dirty if > set to initial or clean, is below i.e. Dirty implies state has > "potentially" been modified, so that gives us wriggle room. > > " > When an extension's status is set to Off , any instruction that attempts to > read or write the corresponding > state will cause an exception. When the status is Initial, the > corresponding state should > have an initial constant value. When the status is Clean, the corresponding > state is potentially > di fferent from the initial value, but matches the last value stored on a > context swap. When the > status is Dirty, the corresponding state has potentially been modif ed > since the last context save. > " > > I think the problem is Linux is setting the state to clean after saving > fpu register state [1], but we have no code in the QEMU FPU operations to > set the state to dirty, if is clean or initial, only code to cause an > exception if the floating point extension state is set to off e.g. > > static void gen_fp_store(DisasContext *ctx, uint32_t opc, int rs1, > int rs2, target_long imm) > { > TCGv t0; > > if (!(ctx->flags & TB_FLAGS_FP_ENABLE)) { > gen_exception_illegal(ctx); > return; > } > > t0 = tcg_temp_new(); > gen_get_gpr(t0, rs1); > tcg_gen_addi_tl(t0, t0, imm); > > switch (opc) { > case OPC_RISC_FSW: > tcg_gen_qemu_st_i64(cpu_fpr[rs2], t0, ctx->mem_idx, MO_TEUL); > break; > case OPC_RISC_FSD: > tcg_gen_qemu_st_i64(cpu_fpr[rs2], t0, ctx->mem_idx, MO_TEQ); > break; > default: > gen_exception_illegal(ctx); > break; > } > > tcg_temp_free(t0); > } > > [1] > https://github.com/torvalds/linux/blob/master/arch/riscv/include/asm/switch_to.h -- Alex Bennée ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug 2018-03-30 7:11 ` Alex Bennée @ 2018-03-30 17:01 ` Michael Clark 0 siblings, 0 replies; 8+ messages in thread From: Michael Clark @ 2018-03-30 17:01 UTC (permalink / raw) To: Alex Bennée Cc: Philippe Mathieu-Daudé, QEMU Developers, Richard Henderson, Peter Maydell, Sagar Karandikar, Bastian Koppelmann, Palmer Dabbelt, RISC-V Patches On Fri, Mar 30, 2018 at 12:11 AM, Alex Bennée <alex.bennee@linaro.org> wrote: > > Michael Clark <mjc@sifive.com> writes: > > > On Tue, Mar 27, 2018 at 3:17 PM, Philippe Mathieu-Daudé <f4bug@amsat.org > > > > wrote: > > > >> Cc'ing Alex and Richard. > >> > >> On 03/27/2018 04:54 PM, Michael Clark wrote: > >> > This change is a workaround for a bug where mstatus.FS > >> > is not correctly reporting dirty when MTTCG and SMP are > >> > enabled which results in the floating point register file > >> > not being saved during context switches. This a critical > >> > bug for RISC-V in QEMU as it results in floating point > >> > register file corruption when running SMP Linux in the > >> > RISC-V 'virt' machine. > >> > > >> > This workaround will return dirty if mstatus.FS is > >> > switched from off to initial or clean. We have checked > >> > the specification and it is legal for an implementation > >> > to return either off, or dirty, if set to initial or clean. > >> > > >> > This workaround will result in unnecessary floating point > >> > save restore. When mstatus.FS is off, floating point > >> > instruction trap to indicate the process is using the FPU. > >> > The OS can then save floating-point state of the previous > >> > process using the FPU and set mstatus.FS to initial or > >> > clean. With this workaround, mstatus.FS will always return > >> > dirty if set to a non-zero value, indicating floating point > >> > save restore is necessary, versus misreporting mstatus.FS > >> > resulting in floating point register file corruption. > >> > > >> > Cc: Palmer Dabbelt <palmer@sifive.com> > >> > Cc: Sagar Karandikar <sagark@eecs.berkeley.edu> > >> > Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> > >> > Cc: Peter Maydell <peter.maydell@linaro.org> > >> > Tested-by: Richard W.M. Jones <rjones@redhat.com> > >> > Signed-off-by: Michael Clark <mjc@sifive.com> > >> > --- > >> > target/riscv/op_helper.c | 19 +++++++++++++++++-- > >> > 1 file changed, 17 insertions(+), 2 deletions(-) > >> > > >> > diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c > >> > index e34715d..7281b98 100644 > >> > --- a/target/riscv/op_helper.c > >> > +++ b/target/riscv/op_helper.c > >> > @@ -144,8 +144,23 @@ void csr_write_helper(CPURISCVState *env, > >> target_ulong val_to_write, > >> > } > >> > > >> > mstatus = (mstatus & ~mask) | (val_to_write & mask); > >> > - int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS; > >> > - dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS; > >> > + > >> > + /* Note: this is a workaround for an issue where mstatus.FS > >> > + does not report dirty when SMP and MTTCG is enabled. This > >> > + workaround is technically compliant with the RISC-V > >> Privileged > >> > + specification as it is legal to return only off, or dirty, > >> > + however this may cause unnecessary saves of floating point > >> state. > >> > + Without this workaround, floating point state is not saved > >> and > >> > + restored correctly when SMP and MTTCG is enabled, */ > >> > > > > On looking at this again, I think we may need to remove the > > qemu_tcg_mttcg_enabled conditional and always return dirty if the state > is > > initial or clean, but not off. > > > > While testing on uniprocessor worked okay, it's likely because we were > > lucky and there was no task migration or multiple FPU tasks working. This > > would mean we would revert to Richard W.M. Jones initial patch. > > > >> + if (qemu_tcg_mttcg_enabled()) { > >> > + /* FP is always dirty or off */ > >> > + if (mstatus & MSTATUS_FS) { > >> > + mstatus |= MSTATUS_FS; > >> > + } > >> > + } > > > I'm confused. If mstatus is a per-vCPU variable why does the enabling or > not of MTTCG matter here? It was an incorrect inference from the available data at the time given the problem showed up on SMP but not uniprocessor. After inspecting the logic we concluded we were just lucky in the uniprocessor case because our test case only have once process using the CPU, whereas with SMP the process was rescheduled to another core which made the corruption visible on SMP. We removed the MTTCG condition from the subsequent version of the patch, updated the patch description and comment and removed references to MTTCG. > >> > + > >> > + int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) | > >> > + ((mstatus & MSTATUS_XS) == MSTATUS_XS); > >> > mstatus = set_field(mstatus, MSTATUS_SD, dirty); > >> > env->mstatus = mstatus; > >> > break; > >> > > >> > > > > The text from the specification that allows us to always return dirty if > > set to initial or clean, is below i.e. Dirty implies state has > > "potentially" been modified, so that gives us wriggle room. > > > > " > > When an extension's status is set to Off , any instruction that attempts > to > > read or write the corresponding > > state will cause an exception. When the status is Initial, the > > corresponding state should > > have an initial constant value. When the status is Clean, the > corresponding > > state is potentially > > di fferent from the initial value, but matches the last value stored on a > > context swap. When the > > status is Dirty, the corresponding state has potentially been modif ed > > since the last context save. > > " > > > > I think the problem is Linux is setting the state to clean after saving > > fpu register state [1], but we have no code in the QEMU FPU operations to > > set the state to dirty, if is clean or initial, only code to cause an > > exception if the floating point extension state is set to off e.g. > > > > static void gen_fp_store(DisasContext *ctx, uint32_t opc, int rs1, > > int rs2, target_long imm) > > { > > TCGv t0; > > > > if (!(ctx->flags & TB_FLAGS_FP_ENABLE)) { > > gen_exception_illegal(ctx); > > return; > > } > > > > t0 = tcg_temp_new(); > > gen_get_gpr(t0, rs1); > > tcg_gen_addi_tl(t0, t0, imm); > > > > switch (opc) { > > case OPC_RISC_FSW: > > tcg_gen_qemu_st_i64(cpu_fpr[rs2], t0, ctx->mem_idx, MO_TEUL); > > break; > > case OPC_RISC_FSD: > > tcg_gen_qemu_st_i64(cpu_fpr[rs2], t0, ctx->mem_idx, MO_TEQ); > > break; > > default: > > gen_exception_illegal(ctx); > > break; > > } > > > > tcg_temp_free(t0); > > } > > > > [1] > > https://github.com/torvalds/linux/blob/master/arch/riscv/ > include/asm/switch_to.h > > > -- > Alex Bennée > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] [patches] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug 2018-03-27 19:54 ` [Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug Michael Clark 2018-03-27 22:17 ` Philippe Mathieu-Daudé @ 2018-03-28 2:08 ` Palmer Dabbelt 1 sibling, 0 replies; 8+ messages in thread From: Palmer Dabbelt @ 2018-03-28 2:08 UTC (permalink / raw) Cc: qemu-devel, patches, Michael Clark, sagark, kbastian, peter.maydell On Tue, 27 Mar 2018 12:54:47 PDT (-0700), Michael Clark wrote: > This change is a workaround for a bug where mstatus.FS > is not correctly reporting dirty when MTTCG and SMP are > enabled which results in the floating point register file > not being saved during context switches. This a critical > bug for RISC-V in QEMU as it results in floating point > register file corruption when running SMP Linux in the > RISC-V 'virt' machine. > > This workaround will return dirty if mstatus.FS is > switched from off to initial or clean. We have checked > the specification and it is legal for an implementation > to return either off, or dirty, if set to initial or clean. > > This workaround will result in unnecessary floating point > save restore. When mstatus.FS is off, floating point > instruction trap to indicate the process is using the FPU. > The OS can then save floating-point state of the previous > process using the FPU and set mstatus.FS to initial or > clean. With this workaround, mstatus.FS will always return > dirty if set to a non-zero value, indicating floating point > save restore is necessary, versus misreporting mstatus.FS > resulting in floating point register file corruption. > > Cc: Palmer Dabbelt <palmer@sifive.com> > Cc: Sagar Karandikar <sagark@eecs.berkeley.edu> > Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> > Cc: Peter Maydell <peter.maydell@linaro.org> > Tested-by: Richard W.M. Jones <rjones@redhat.com> > Signed-off-by: Michael Clark <mjc@sifive.com> > --- > target/riscv/op_helper.c | 19 +++++++++++++++++-- > 1 file changed, 17 insertions(+), 2 deletions(-) > > diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c > index e34715d..7281b98 100644 > --- a/target/riscv/op_helper.c > +++ b/target/riscv/op_helper.c > @@ -144,8 +144,23 @@ void csr_write_helper(CPURISCVState *env, target_ulong val_to_write, > } > > mstatus = (mstatus & ~mask) | (val_to_write & mask); > - int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS; > - dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS; > + > + /* Note: this is a workaround for an issue where mstatus.FS > + does not report dirty when SMP and MTTCG is enabled. This > + workaround is technically compliant with the RISC-V Privileged > + specification as it is legal to return only off, or dirty, > + however this may cause unnecessary saves of floating point state. > + Without this workaround, floating point state is not saved and > + restored correctly when SMP and MTTCG is enabled, */ > + if (qemu_tcg_mttcg_enabled()) { > + /* FP is always dirty or off */ > + if (mstatus & MSTATUS_FS) { > + mstatus |= MSTATUS_FS; > + } > + } > + > + int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) | > + ((mstatus & MSTATUS_XS) == MSTATUS_XS); > mstatus = set_field(mstatus, MSTATUS_SD, dirty); > env->mstatus = mstatus; > break; FWIW, this isn't just "technically compliant with the RISC-V Privileged specification" but it's actually an intended design point. We're considering making this a bit more explicit in the ISA manual -- well, unless Andrew decides I'm being too pedantic in one of my possible readings of the spec :). Reviewed-By: Palmer Dabbelt <palmer@sifive.com> ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2018-03-30 17:01 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-03-27 19:54 [Qemu-devel] [PATCH v1 0/1] RISC-V: Critical fixes for QEMU 2.12 Michael Clark 2018-03-27 19:54 ` [Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug Michael Clark 2018-03-27 22:17 ` Philippe Mathieu-Daudé 2018-03-28 0:15 ` Michael Clark 2018-03-28 2:26 ` Richard Henderson 2018-03-30 7:11 ` Alex Bennée 2018-03-30 17:01 ` Michael Clark 2018-03-28 2:08 ` [Qemu-devel] [patches] " Palmer Dabbelt
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.