All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1
@ 2021-10-26  3:48 Reiji Watanabe
  2021-10-26  9:06 ` Catalin Marinas
  2021-10-26 11:22 ` Robin Murphy
  0 siblings, 2 replies; 10+ messages in thread
From: Reiji Watanabe @ 2021-10-26  3:48 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Marc Zyngier, linux-arm-kernel, Peter Shier, Ricardo Koller,
	Oliver Upton, Jing Zhang, Raghavendra Rao Anata, Reiji Watanabe

Currently, clear_page() uses DC ZVA instruction unconditionally.  But it
should make sure that DCZID_EL0.DZP, which indicates whether or not use
of DC ZVA instruction is prohibited, is zero when using the instruction.
Use stp as memset does instead when DCZID_EL0.DZP == 1.

Signed-off-by: Reiji Watanabe <reijiw@google.com>
---
 arch/arm64/lib/clear_page.S | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
index b84b179edba3..7ce1bfa4081c 100644
--- a/arch/arm64/lib/clear_page.S
+++ b/arch/arm64/lib/clear_page.S
@@ -16,6 +16,7 @@
  */
 SYM_FUNC_START_PI(clear_page)
 	mrs	x1, dczid_el0
+	tbnz	x1, #4, 2f	/* Branch if DC GVA is prohibited */
 	and	w1, w1, #0xf
 	mov	x2, #4
 	lsl	x1, x2, x1
@@ -25,5 +26,15 @@ SYM_FUNC_START_PI(clear_page)
 	tst	x0, #(PAGE_SIZE - 1)
 	b.ne	1b
 	ret
+
+2:	mov	x1, #(PAGE_SIZE)
+	sub	x0, x0, #16	/* Pre-bias. */
+3:	stp	xzr, xzr, [x0, #16]
+	stp	xzr, xzr, [x0, #32]
+	stp	xzr, xzr, [x0, #48]
+	stp	xzr, xzr, [x0, #64]!
+	subs	x1, x1, #64
+	b.gt	3b
+	ret
 SYM_FUNC_END_PI(clear_page)
 EXPORT_SYMBOL(clear_page)
-- 
2.33.0.1079.g6e70778dc9-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1
  2021-10-26  3:48 [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 Reiji Watanabe
@ 2021-10-26  9:06 ` Catalin Marinas
  2021-10-27  1:35   ` Reiji Watanabe
  2021-10-26 11:22 ` Robin Murphy
  1 sibling, 1 reply; 10+ messages in thread
From: Catalin Marinas @ 2021-10-26  9:06 UTC (permalink / raw)
  To: Reiji Watanabe
  Cc: Will Deacon, Marc Zyngier, linux-arm-kernel, Peter Shier,
	Ricardo Koller, Oliver Upton, Jing Zhang, Raghavendra Rao Anata

On Mon, Oct 25, 2021 at 08:48:44PM -0700, Reiji Watanabe wrote:
> Currently, clear_page() uses DC ZVA instruction unconditionally.  But it
> should make sure that DCZID_EL0.DZP, which indicates whether or not use
> of DC ZVA instruction is prohibited, is zero when using the instruction.
> Use stp as memset does instead when DCZID_EL0.DZP == 1.

For correctness, this is fine, but have you come across a system that
has DZP == 1 (or hypervisor setting HCR_EL2.TDZ)?

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1
  2021-10-26  3:48 [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 Reiji Watanabe
  2021-10-26  9:06 ` Catalin Marinas
@ 2021-10-26 11:22 ` Robin Murphy
  2021-10-26 12:23   ` Mark Rutland
  1 sibling, 1 reply; 10+ messages in thread
From: Robin Murphy @ 2021-10-26 11:22 UTC (permalink / raw)
  To: Reiji Watanabe, Catalin Marinas, Will Deacon
  Cc: Marc Zyngier, linux-arm-kernel, Peter Shier, Ricardo Koller,
	Oliver Upton, Jing Zhang, Raghavendra Rao Anata

On 2021-10-26 04:48, Reiji Watanabe wrote:
> Currently, clear_page() uses DC ZVA instruction unconditionally.  But it
> should make sure that DCZID_EL0.DZP, which indicates whether or not use
> of DC ZVA instruction is prohibited, is zero when using the instruction.
> Use stp as memset does instead when DCZID_EL0.DZP == 1.
> 
> Signed-off-by: Reiji Watanabe <reijiw@google.com>
> ---
>   arch/arm64/lib/clear_page.S | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
> index b84b179edba3..7ce1bfa4081c 100644
> --- a/arch/arm64/lib/clear_page.S
> +++ b/arch/arm64/lib/clear_page.S
> @@ -16,6 +16,7 @@
>    */
>   SYM_FUNC_START_PI(clear_page)
>   	mrs	x1, dczid_el0
> +	tbnz	x1, #4, 2f	/* Branch if DC GVA is prohibited */
>   	and	w1, w1, #0xf
>   	mov	x2, #4
>   	lsl	x1, x2, x1
> @@ -25,5 +26,15 @@ SYM_FUNC_START_PI(clear_page)
>   	tst	x0, #(PAGE_SIZE - 1)
>   	b.ne	1b
>   	ret
> +
> +2:	mov	x1, #(PAGE_SIZE)
> +	sub	x0, x0, #16	/* Pre-bias. */

Out of curiosity, what's this for? It's not like we need to worry about 
PAGE_SIZE or page addresses being misaligned. I don't really see why 
we'd need a different condition from the DC ZVA loop.

Robin.

> +3:	stp	xzr, xzr, [x0, #16]
> +	stp	xzr, xzr, [x0, #32]
> +	stp	xzr, xzr, [x0, #48]
> +	stp	xzr, xzr, [x0, #64]!
> +	subs	x1, x1, #64
> +	b.gt	3b
> +	ret
>   SYM_FUNC_END_PI(clear_page)
>   EXPORT_SYMBOL(clear_page)
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1
  2021-10-26 11:22 ` Robin Murphy
@ 2021-10-26 12:23   ` Mark Rutland
  2021-10-27  6:44     ` Reiji Watanabe
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Rutland @ 2021-10-26 12:23 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Reiji Watanabe, Catalin Marinas, Will Deacon, Marc Zyngier,
	linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton,
	Jing Zhang, Raghavendra Rao Anata

On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote:
> On 2021-10-26 04:48, Reiji Watanabe wrote:
> > Currently, clear_page() uses DC ZVA instruction unconditionally.  But it
> > should make sure that DCZID_EL0.DZP, which indicates whether or not use
> > of DC ZVA instruction is prohibited, is zero when using the instruction.
> > Use stp as memset does instead when DCZID_EL0.DZP == 1.
> > 
> > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > ---
> >   arch/arm64/lib/clear_page.S | 11 +++++++++++
> >   1 file changed, 11 insertions(+)
> > 
> > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
> > index b84b179edba3..7ce1bfa4081c 100644
> > --- a/arch/arm64/lib/clear_page.S
> > +++ b/arch/arm64/lib/clear_page.S
> > @@ -16,6 +16,7 @@
> >    */
> >   SYM_FUNC_START_PI(clear_page)
> >   	mrs	x1, dczid_el0
> > +	tbnz	x1, #4, 2f	/* Branch if DC GVA is prohibited */

DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA}
are prohibited. This loop uses DZ ZVA, not GC GVA, so it'd be nice to
s/GVA/ZVA/ here.

Howver, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(),
which'll need a similar update...

> >   	and	w1, w1, #0xf
> >   	mov	x2, #4
> >   	lsl	x1, x2, x1
> > @@ -25,5 +26,15 @@ SYM_FUNC_START_PI(clear_page)
> >   	tst	x0, #(PAGE_SIZE - 1)
> >   	b.ne	1b
> >   	ret
> > +
> > +2:	mov	x1, #(PAGE_SIZE)
> > +	sub	x0, x0, #16	/* Pre-bias. */
> 
> Out of curiosity, what's this for? It's not like we need to worry about
> PAGE_SIZE or page addresses being misaligned. I don't really see why we'd
> need a different condition from the DC ZVA loop.

I believe this was copied from arch/arm64/lib/memset.S, in the
`.Lnot_short` case, where we have:

| .Lnot_short:
|         sub     dst, dst, #16/* Pre-bias.  */
|         sub     count, count, #64
| 1:
|         stp     A_l, A_l, [dst, #16]
|         stp     A_l, A_l, [dst, #32]
|         stp     A_l, A_l, [dst, #48]
|         stp     A_l, A_l, [dst, #64]!
|         subs    count, count, #64
|         b.ge    1b

> Robin.
> 
> > +3:	stp	xzr, xzr, [x0, #16]
> > +	stp	xzr, xzr, [x0, #32]
> > +	stp	xzr, xzr, [x0, #48]
> > +	stp	xzr, xzr, [x0, #64]!
> > +	subs	x1, x1, #64
> > +	b.gt	3b
> > +	ret

FWIW, I'd also prefer consistency with the existing loop, i.e.

2:	stp	xzr, xzr, [x0, #0]
	stp	xzr, xzr, [x0, #16]
	stp	xzr, xzr, [x0, #32]
	stp	xzr, xzr, [x0, #48]
	add	x0, x0, #64
	tst	x0, #(PAGE_SIZE - 1)
	b.ne	2b
	ret

Thanks,
Mark.

> >   SYM_FUNC_END_PI(clear_page)
> >   EXPORT_SYMBOL(clear_page)
> > 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1
  2021-10-26  9:06 ` Catalin Marinas
@ 2021-10-27  1:35   ` Reiji Watanabe
  0 siblings, 0 replies; 10+ messages in thread
From: Reiji Watanabe @ 2021-10-27  1:35 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Marc Zyngier, linux-arm-kernel, Peter Shier,
	Ricardo Koller, Oliver Upton, Jing Zhang, Raghavendra Rao Anata

On Tue, Oct 26, 2021 at 2:06 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Mon, Oct 25, 2021 at 08:48:44PM -0700, Reiji Watanabe wrote:
> > Currently, clear_page() uses DC ZVA instruction unconditionally.  But it
> > should make sure that DCZID_EL0.DZP, which indicates whether or not use
> > of DC ZVA instruction is prohibited, is zero when using the instruction.
> > Use stp as memset does instead when DCZID_EL0.DZP == 1.
>
> For correctness, this is fine, but have you come across a system that
> has DZP == 1 (or hypervisor setting HCR_EL2.TDZ)?

We are thinking of setting HCR_EL2.TDZ considering live migration
support between two systems with different DCZID_EL0.BS.

Thanks,
Reiji

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1
  2021-10-26 12:23   ` Mark Rutland
@ 2021-10-27  6:44     ` Reiji Watanabe
  2021-10-27 11:09       ` Mark Rutland
  0 siblings, 1 reply; 10+ messages in thread
From: Reiji Watanabe @ 2021-10-27  6:44 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Robin Murphy, Catalin Marinas, Will Deacon, Marc Zyngier,
	linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton,
	Jing Zhang, Raghavendra Rao Anata

On Tue, Oct 26, 2021 at 5:23 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote:
> > On 2021-10-26 04:48, Reiji Watanabe wrote:
> > > Currently, clear_page() uses DC ZVA instruction unconditionally.  But it
> > > should make sure that DCZID_EL0.DZP, which indicates whether or not use
> > > of DC ZVA instruction is prohibited, is zero when using the instruction.
> > > Use stp as memset does instead when DCZID_EL0.DZP == 1.
> > >
> > > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > > ---
> > >   arch/arm64/lib/clear_page.S | 11 +++++++++++
> > >   1 file changed, 11 insertions(+)
> > >
> > > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
> > > index b84b179edba3..7ce1bfa4081c 100644
> > > --- a/arch/arm64/lib/clear_page.S
> > > +++ b/arch/arm64/lib/clear_page.S
> > > @@ -16,6 +16,7 @@
> > >    */
> > >   SYM_FUNC_START_PI(clear_page)
> > >     mrs     x1, dczid_el0
> > > +   tbnz    x1, #4, 2f      /* Branch if DC GVA is prohibited */
>
> DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA}
> are prohibited. This loop uses DZ ZVA, not GC GVA, so it'd be nice to
> s/GVA/ZVA/ here.

Thank you for catching it ! I will fix that.

> Howver, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(),
> which'll need a similar update...

Yes, I'm aware of that and mte_zero_clear_page_tags() needs to get
updated as well.  But, Since I'm not familiar with MTE (and I don't
have any plans to use MTE yet), I didn't work on them (I'm not sure
how I can test them).
I might try to fix them separately later as well when I have time
(not so soon most likely though).


> > >     and     w1, w1, #0xf
> > >     mov     x2, #4
> > >     lsl     x1, x2, x1
> > > @@ -25,5 +26,15 @@ SYM_FUNC_START_PI(clear_page)
> > >     tst     x0, #(PAGE_SIZE - 1)
> > >     b.ne    1b
> > >     ret
> > > +
> > > +2: mov     x1, #(PAGE_SIZE)
> > > +   sub     x0, x0, #16     /* Pre-bias. */
> >
> > Out of curiosity, what's this for? It's not like we need to worry about
> > PAGE_SIZE or page addresses being misaligned. I don't really see why we'd
> > need a different condition from the DC ZVA loop.
>
> I believe this was copied from arch/arm64/lib/memset.S, in the
> `.Lnot_short` case, where we have:

Yes, I used the same code as memset.  I couldn't come up with any
other implementation that could get same performance with fewer
number of instructions in the loop than that.


> | .Lnot_short:
> |         sub     dst, dst, #16/* Pre-bias.  */
> |         sub     count, count, #64
> | 1:
> |         stp     A_l, A_l, [dst, #16]
> |         stp     A_l, A_l, [dst, #32]
> |         stp     A_l, A_l, [dst, #48]
> |         stp     A_l, A_l, [dst, #64]!
> |         subs    count, count, #64
> |         b.ge    1b
>
> > Robin.
> >
> > > +3: stp     xzr, xzr, [x0, #16]
> > > +   stp     xzr, xzr, [x0, #32]
> > > +   stp     xzr, xzr, [x0, #48]
> > > +   stp     xzr, xzr, [x0, #64]!
> > > +   subs    x1, x1, #64
> > > +   b.gt    3b
> > > +   ret
>
> FWIW, I'd also prefer consistency with the existing loop, i.e.
>
> 2:      stp     xzr, xzr, [x0, #0]
>         stp     xzr, xzr, [x0, #16]
>         stp     xzr, xzr, [x0, #32]
>         stp     xzr, xzr, [x0, #48]
>         add     x0, x0, #64
>         tst     x0, #(PAGE_SIZE - 1)
>         b.ne    2b
>         ret

Thank you for the suggestion.  Yes, I agree that it would be better
in terms of consistency with the existing loop.
I will fix this as you suggested in the v2 patch.

Thanks,
Reiji

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1
  2021-10-27  6:44     ` Reiji Watanabe
@ 2021-10-27 11:09       ` Mark Rutland
  2021-10-28  1:49         ` Reiji Watanabe
  2021-10-28  7:46         ` Will Deacon
  0 siblings, 2 replies; 10+ messages in thread
From: Mark Rutland @ 2021-10-27 11:09 UTC (permalink / raw)
  To: Reiji Watanabe
  Cc: Robin Murphy, Catalin Marinas, Will Deacon, Marc Zyngier,
	linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton,
	Jing Zhang, Raghavendra Rao Anata

On Tue, Oct 26, 2021 at 11:44:51PM -0700, Reiji Watanabe wrote:
> On Tue, Oct 26, 2021 at 5:23 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote:
> > > On 2021-10-26 04:48, Reiji Watanabe wrote:
> > > > Currently, clear_page() uses DC ZVA instruction unconditionally.  But it
> > > > should make sure that DCZID_EL0.DZP, which indicates whether or not use
> > > > of DC ZVA instruction is prohibited, is zero when using the instruction.
> > > > Use stp as memset does instead when DCZID_EL0.DZP == 1.
> > > >
> > > > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > > > ---
> > > >   arch/arm64/lib/clear_page.S | 11 +++++++++++
> > > >   1 file changed, 11 insertions(+)
> > > >
> > > > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
> > > > index b84b179edba3..7ce1bfa4081c 100644
> > > > --- a/arch/arm64/lib/clear_page.S
> > > > +++ b/arch/arm64/lib/clear_page.S
> > > > @@ -16,6 +16,7 @@
> > > >    */
> > > >   SYM_FUNC_START_PI(clear_page)
> > > >     mrs     x1, dczid_el0
> > > > +   tbnz    x1, #4, 2f      /* Branch if DC GVA is prohibited */
> >
> > DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA}
> > are prohibited. This loop uses DZ ZVA, not GC GVA, so it'd be nice to
> > s/GVA/ZVA/ here.
> 
> Thank you for catching it ! I will fix that.
> 
> > Howver, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(),
> > which'll need a similar update...
> 
> Yes, I'm aware of that and mte_zero_clear_page_tags() needs to get
> updated as well.  But, Since I'm not familiar with MTE (and I don't
> have any plans to use MTE yet), I didn't work on them (I'm not sure
> how I can test them).
> I might try to fix them separately later as well when I have time
> (not so soon most likely though).

My view is that we should either:

* Document that we require DCZID_EL0.DZP==0, as is implicitly the case
  today.

* Fix *all* usage of DC {ZVA,GVZ,GZVA} to work with DCZID_EL0.DZP==1.

... otherwise we're just hiding the problem rather than fixing it.

QEMU TCG mode has MTE support, so it should be possible to test using
that in a configuration such as:

 -machine virt,virtualization=on,mte=on -cpu max

... then you can hack the EL2 stub code in head.S to initialize
HCR_EL2.TDZ=1 before dropping to EL1 (and reporting that the kernel
started at EL1).

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1
  2021-10-27 11:09       ` Mark Rutland
@ 2021-10-28  1:49         ` Reiji Watanabe
  2021-10-28  7:46         ` Will Deacon
  1 sibling, 0 replies; 10+ messages in thread
From: Reiji Watanabe @ 2021-10-28  1:49 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Robin Murphy, Catalin Marinas, Will Deacon, Marc Zyngier,
	linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton,
	Jing Zhang, Raghavendra Rao Anata

On Wed, Oct 27, 2021 at 4:09 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Tue, Oct 26, 2021 at 11:44:51PM -0700, Reiji Watanabe wrote:
> > On Tue, Oct 26, 2021 at 5:23 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > > On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote:
> > > > On 2021-10-26 04:48, Reiji Watanabe wrote:
> > > > > Currently, clear_page() uses DC ZVA instruction unconditionally.  But it
> > > > > should make sure that DCZID_EL0.DZP, which indicates whether or not use
> > > > > of DC ZVA instruction is prohibited, is zero when using the instruction.
> > > > > Use stp as memset does instead when DCZID_EL0.DZP == 1.
> > > > >
> > > > > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > > > > ---
> > > > >   arch/arm64/lib/clear_page.S | 11 +++++++++++
> > > > >   1 file changed, 11 insertions(+)
> > > > >
> > > > > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
> > > > > index b84b179edba3..7ce1bfa4081c 100644
> > > > > --- a/arch/arm64/lib/clear_page.S
> > > > > +++ b/arch/arm64/lib/clear_page.S
> > > > > @@ -16,6 +16,7 @@
> > > > >    */
> > > > >   SYM_FUNC_START_PI(clear_page)
> > > > >     mrs     x1, dczid_el0
> > > > > +   tbnz    x1, #4, 2f      /* Branch if DC GVA is prohibited */
> > >
> > > DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA}
> > > are prohibited. This loop uses DZ ZVA, not GC GVA, so it'd be nice to
> > > s/GVA/ZVA/ here.
> >
> > Thank you for catching it ! I will fix that.
> >
> > > Howver, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(),
> > > which'll need a similar update...
> >
> > Yes, I'm aware of that and mte_zero_clear_page_tags() needs to get
> > updated as well.  But, Since I'm not familiar with MTE (and I don't
> > have any plans to use MTE yet), I didn't work on them (I'm not sure
> > how I can test them).
> > I might try to fix them separately later as well when I have time
> > (not so soon most likely though).
>
> My view is that we should either:
>
> * Document that we require DCZID_EL0.DZP==0, as is implicitly the case
>   today.
>
> * Fix *all* usage of DC {ZVA,GVZ,GZVA} to work with DCZID_EL0.DZP==1.
>
> ... otherwise we're just hiding the problem rather than fixing it.
>
> QEMU TCG mode has MTE support, so it should be possible to test using
> that in a configuration such as:
>
>  -machine virt,virtualization=on,mte=on -cpu max
>
> ... then you can hack the EL2 stub code in head.S to initialize
> HCR_EL2.TDZ=1 before dropping to EL1 (and reporting that the kernel
> started at EL1).

Understood.
I will work on the MTE fixes and include them into the v2 patch.
Thank you so much for all the comments and information.

Regards,
Reiji

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1
  2021-10-27 11:09       ` Mark Rutland
  2021-10-28  1:49         ` Reiji Watanabe
@ 2021-10-28  7:46         ` Will Deacon
  2021-10-28  9:03           ` Mark Rutland
  1 sibling, 1 reply; 10+ messages in thread
From: Will Deacon @ 2021-10-28  7:46 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Reiji Watanabe, Robin Murphy, Catalin Marinas, Marc Zyngier,
	linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton,
	Jing Zhang, Raghavendra Rao Anata

On Wed, Oct 27, 2021 at 12:09:47PM +0100, Mark Rutland wrote:
> On Tue, Oct 26, 2021 at 11:44:51PM -0700, Reiji Watanabe wrote:
> > On Tue, Oct 26, 2021 at 5:23 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > > On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote:
> > > > On 2021-10-26 04:48, Reiji Watanabe wrote:
> > > > > Currently, clear_page() uses DC ZVA instruction unconditionally.  But it
> > > > > should make sure that DCZID_EL0.DZP, which indicates whether or not use
> > > > > of DC ZVA instruction is prohibited, is zero when using the instruction.
> > > > > Use stp as memset does instead when DCZID_EL0.DZP == 1.
> > > > >
> > > > > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > > > > ---
> > > > >   arch/arm64/lib/clear_page.S | 11 +++++++++++
> > > > >   1 file changed, 11 insertions(+)
> > > > >
> > > > > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
> > > > > index b84b179edba3..7ce1bfa4081c 100644
> > > > > --- a/arch/arm64/lib/clear_page.S
> > > > > +++ b/arch/arm64/lib/clear_page.S
> > > > > @@ -16,6 +16,7 @@
> > > > >    */
> > > > >   SYM_FUNC_START_PI(clear_page)
> > > > >     mrs     x1, dczid_el0
> > > > > +   tbnz    x1, #4, 2f      /* Branch if DC GVA is prohibited */
> > >
> > > DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA}
> > > are prohibited. This loop uses DZ ZVA, not GC GVA, so it'd be nice to
> > > s/GVA/ZVA/ here.
> > 
> > Thank you for catching it ! I will fix that.
> > 
> > > Howver, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(),
> > > which'll need a similar update...
> > 
> > Yes, I'm aware of that and mte_zero_clear_page_tags() needs to get
> > updated as well.  But, Since I'm not familiar with MTE (and I don't
> > have any plans to use MTE yet), I didn't work on them (I'm not sure
> > how I can test them).
> > I might try to fix them separately later as well when I have time
> > (not so soon most likely though).
> 
> My view is that we should either:
> 
> * Document that we require DCZID_EL0.DZP==0, as is implicitly the case
>   today.

I disagree with that. There's nothing wrong in trapping this stuff, as long
as you go ahead and emulate it, which is exactly why we aren't checking it at
the moment as EL2 should be prepared to handle the trap. The Arm ARM talks
about the instructions being "prohibited" but that doesn't mean anything --
the reality is that they trap to EL2.

We could document *that* though?

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1
  2021-10-28  7:46         ` Will Deacon
@ 2021-10-28  9:03           ` Mark Rutland
  0 siblings, 0 replies; 10+ messages in thread
From: Mark Rutland @ 2021-10-28  9:03 UTC (permalink / raw)
  To: Will Deacon
  Cc: Reiji Watanabe, Robin Murphy, Catalin Marinas, Marc Zyngier,
	linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton,
	Jing Zhang, Raghavendra Rao Anata

On Thu, Oct 28, 2021 at 08:46:10AM +0100, Will Deacon wrote:
> On Wed, Oct 27, 2021 at 12:09:47PM +0100, Mark Rutland wrote:
> > On Tue, Oct 26, 2021 at 11:44:51PM -0700, Reiji Watanabe wrote:
> > > On Tue, Oct 26, 2021 at 5:23 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote:
> > > > > On 2021-10-26 04:48, Reiji Watanabe wrote:
> > > > > > Currently, clear_page() uses DC ZVA instruction unconditionally.  But it
> > > > > > should make sure that DCZID_EL0.DZP, which indicates whether or not use
> > > > > > of DC ZVA instruction is prohibited, is zero when using the instruction.
> > > > > > Use stp as memset does instead when DCZID_EL0.DZP == 1.
> > > > > >
> > > > > > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > > > > > ---
> > > > > >   arch/arm64/lib/clear_page.S | 11 +++++++++++
> > > > > >   1 file changed, 11 insertions(+)
> > > > > >
> > > > > > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
> > > > > > index b84b179edba3..7ce1bfa4081c 100644
> > > > > > --- a/arch/arm64/lib/clear_page.S
> > > > > > +++ b/arch/arm64/lib/clear_page.S
> > > > > > @@ -16,6 +16,7 @@
> > > > > >    */
> > > > > >   SYM_FUNC_START_PI(clear_page)
> > > > > >     mrs     x1, dczid_el0
> > > > > > +   tbnz    x1, #4, 2f      /* Branch if DC GVA is prohibited */
> > > >
> > > > DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA}
> > > > are prohibited. This loop uses DZ ZVA, not GC GVA, so it'd be nice to
> > > > s/GVA/ZVA/ here.
> > > 
> > > Thank you for catching it ! I will fix that.
> > > 
> > > > Howver, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(),
> > > > which'll need a similar update...
> > > 
> > > Yes, I'm aware of that and mte_zero_clear_page_tags() needs to get
> > > updated as well.  But, Since I'm not familiar with MTE (and I don't
> > > have any plans to use MTE yet), I didn't work on them (I'm not sure
> > > how I can test them).
> > > I might try to fix them separately later as well when I have time
> > > (not so soon most likely though).
> > 
> > My view is that we should either:
> > 
> > * Document that we require DCZID_EL0.DZP==0, as is implicitly the case
> >   today.
> 
> I disagree with that. There's nothing wrong in trapping this stuff, as long
> as you go ahead and emulate it, which is exactly why we aren't checking it at
> the moment as EL2 should be prepared to handle the trap. The Arm ARM talks
> about the instructions being "prohibited" but that doesn't mean anything --
> the reality is that they trap to EL2.

TBH, I think trapping DC ZVA rather than forcing it to be UNDEF was an
architectural mistake.

I think the intent of the "prohibited" wording (and exposure of
DCZID_EL0.DZP to EL0 and EL1) is clearly that SW should *avoid* DC ZVA and
friends when DCZID_EL0.DZP is set (and libc and the Arm optimized
routines have *always* checked that), and performance would be abysmal
with emulated DC ZVA, so at minimum we want to avoid hitting emulation
in the common case.

> We could document *that* though?

I guess; though it makes me uneasy since the architecture clearly pushes
people to read CTR_EL0.DZP, and heavily implies that there's no need to
emulate, so I don't think we have much of a leg to stand on from an
architecture PoV.

If we need to support this, my preference would be to support
DCZID_EL0.DZP==1 by avoiding the trapped instructions entirely.

I realise the problem as ever is "what does userspace do?".

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-10-28  9:06 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-26  3:48 [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 Reiji Watanabe
2021-10-26  9:06 ` Catalin Marinas
2021-10-27  1:35   ` Reiji Watanabe
2021-10-26 11:22 ` Robin Murphy
2021-10-26 12:23   ` Mark Rutland
2021-10-27  6:44     ` Reiji Watanabe
2021-10-27 11:09       ` Mark Rutland
2021-10-28  1:49         ` Reiji Watanabe
2021-10-28  7:46         ` Will Deacon
2021-10-28  9:03           ` Mark Rutland

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.