linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH] KVM: arm64: nvhe: Disable profile optimization
@ 2022-09-20  8:20 Denis Nikitin
  2022-09-20  9:33 ` Marc Zyngier
  2022-09-22  5:31 ` [PATCH v2] KVM: arm64: nvhe: Fix build with " Denis Nikitin
  0 siblings, 2 replies; 14+ messages in thread
From: Denis Nikitin @ 2022-09-20  8:20 UTC (permalink / raw)
  To: Marc Zyngier, Catalin Marinas, Will Deacon
  Cc: James Morse, Alexandru Elisei, Nick Desaulniers, Manoj Gupta,
	David Brazdil, linux-arm-kernel, kvmarm, linux-kernel,
	Denis Nikitin

Kernel build with -fprofile-sample-use raises the following failure:

error: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o: Unexpected SHT_REL
section ".rel.llvm.call-graph-profile"

SHT_REL is generated by the latest lld, see
https://reviews.llvm.org/rGca3bdb57fa1ac98b711a735de048c12b5fdd8086.
Disable profile optimization in kvm/nvhe to fix the build with
AutoFDO.

Signed-off-by: Denis Nikitin <denik@chromium.org>
---
 arch/arm64/kvm/hyp/nvhe/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index b5c5119c7396..6a6188374a52 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -89,6 +89,9 @@ quiet_cmd_hypcopy = HYPCOPY $@
 # Remove ftrace, Shadow Call Stack, and CFI CFLAGS.
 # This is equivalent to the 'notrace', '__noscs', and '__nocfi' annotations.
 KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI), $(KBUILD_CFLAGS))
+# Profile optimization creates SHT_REL section '.llvm.call-graph-profile' for
+# the hot code. SHT_REL is currently not supported by the KVM tools.
+KBUILD_CFLAGS += $(call cc-option,-fno-profile-sample-use,-fno-profile-use)
 
 # KVM nVHE code is run at a different exception code with a different map, so
 # compiler instrumentation that inserts callbacks or checks into the code may
-- 
2.37.3.968.ga6b4b080e4-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: [PATCH] KVM: arm64: nvhe: Disable profile optimization
  2022-09-20  8:20 [PATCH] KVM: arm64: nvhe: Disable profile optimization Denis Nikitin
@ 2022-09-20  9:33 ` Marc Zyngier
  2022-09-21  0:08   ` Denis Nikitin
  2022-09-22  5:31 ` [PATCH v2] KVM: arm64: nvhe: Fix build with " Denis Nikitin
  1 sibling, 1 reply; 14+ messages in thread
From: Marc Zyngier @ 2022-09-20  9:33 UTC (permalink / raw)
  To: Denis Nikitin
  Cc: Catalin Marinas, Will Deacon, James Morse, Alexandru Elisei,
	Nick Desaulniers, Manoj Gupta, David Brazdil, linux-arm-kernel,
	kvmarm, linux-kernel

Hi Denis,

On Tue, 20 Sep 2022 09:20:05 +0100,
Denis Nikitin <denik@chromium.org> wrote:
> 
> Kernel build with -fprofile-sample-use raises the following failure:
> 
> error: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o: Unexpected SHT_REL
> section ".rel.llvm.call-graph-profile"

How is this flag provided? I don't see any occurrence of it in the
kernel so far.

> 
> SHT_REL is generated by the latest lld, see
> https://reviews.llvm.org/rGca3bdb57fa1ac98b711a735de048c12b5fdd8086.

Is this part of a released toolchain? If so, can you spell out the
first version where this occurs?

> Disable profile optimization in kvm/nvhe to fix the build with
> AutoFDO.

It'd be good to at least mention how AutoFDO and -fprofile-sample-use
relate to each other.

> 
> Signed-off-by: Denis Nikitin <denik@chromium.org>
> ---
>  arch/arm64/kvm/hyp/nvhe/Makefile | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index b5c5119c7396..6a6188374a52 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -89,6 +89,9 @@ quiet_cmd_hypcopy = HYPCOPY $@
>  # Remove ftrace, Shadow Call Stack, and CFI CFLAGS.
>  # This is equivalent to the 'notrace', '__noscs', and '__nocfi' annotations.
>  KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI), $(KBUILD_CFLAGS))
> +# Profile optimization creates SHT_REL section '.llvm.call-graph-profile' for
> +# the hot code. SHT_REL is currently not supported by the KVM tools.

'KVM tools' seems vague. Maybe call out the actual helper that
processes the relocations?

> +KBUILD_CFLAGS += $(call cc-option,-fno-profile-sample-use,-fno-profile-use)

Why add these options instead of filtering out the offending option,
as is done just above?

Also, is this the only place the kernel fails to compile? The EFI stub
does similar things AFAIR, and could potentially fail the same way.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH] KVM: arm64: nvhe: Disable profile optimization
  2022-09-20  9:33 ` Marc Zyngier
@ 2022-09-21  0:08   ` Denis Nikitin
  2022-09-21  6:02     ` Denis Nikitin
  0 siblings, 1 reply; 14+ messages in thread
From: Denis Nikitin @ 2022-09-21  0:08 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Denis Nikitin, Catalin Marinas, Will Deacon, James Morse,
	Alexandru Elisei, Nick Desaulniers, Manoj Gupta, David Brazdil,
	linux-arm-kernel, kvmarm, linux-kernel

Hi Marc,

Thank you for the quick response.

On Tue, Sep 20, 2022 at 2:34 AM Marc Zyngier <maz@kernel.org> wrote:
>
> Hi Denis,
>
> On Tue, 20 Sep 2022 09:20:05 +0100,
> Denis Nikitin <denik@chromium.org> wrote:
> >
> > Kernel build with -fprofile-sample-use raises the following failure:
> >
> > error: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o: Unexpected SHT_REL
> > section ".rel.llvm.call-graph-profile"
>
> How is this flag provided? I don't see any occurrence of it in the
> kernel so far.

On ChromeOS we build the kernel with sample profiles by adding
-fprofile-sample-use=/path/to/gcov.profile to KCFLAGS.

>
> >
> > SHT_REL is generated by the latest lld, see
> > https://reviews.llvm.org/rGca3bdb57fa1ac98b711a735de048c12b5fdd8086.
>
> Is this part of a released toolchain? If so, can you spell out the
> first version where this occurs?

Yes, it was added in llvm-13. I will update the patch.

>
> > Disable profile optimization in kvm/nvhe to fix the build with
> > AutoFDO.
>
> It'd be good to at least mention how AutoFDO and -fprofile-sample-use
> relate to each other.

Good point. AutoFDO is one example of sample-based profile optimization.
It's not actually relevant to the bug, so I'd better remove it.

>
> >
> > Signed-off-by: Denis Nikitin <denik@chromium.org>
> > ---
> >  arch/arm64/kvm/hyp/nvhe/Makefile | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> > index b5c5119c7396..6a6188374a52 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> > +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> > @@ -89,6 +89,9 @@ quiet_cmd_hypcopy = HYPCOPY $@
> >  # Remove ftrace, Shadow Call Stack, and CFI CFLAGS.
> >  # This is equivalent to the 'notrace', '__noscs', and '__nocfi' annotations.
> >  KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI), $(KBUILD_CFLAGS))
> > +# Profile optimization creates SHT_REL section '.llvm.call-graph-profile' for
> > +# the hot code. SHT_REL is currently not supported by the KVM tools.
>
> 'KVM tools' seems vague. Maybe call out the actual helper that
> processes the relocations?

Agreed.

>
> > +KBUILD_CFLAGS += $(call cc-option,-fno-profile-sample-use,-fno-profile-use)
>
> Why adding these options instead of filtering out the offending option
> as it is done just above?

That was actually the alternative solution, and it worked as well.
Let me double-check that profile optimization doesn't interfere with other
sections; if it doesn't, I will remove the '.llvm.call-graph-profile'
section instead.

>
> Also, is this the only place the kernel fails to compile? The EFI stub
> does similar things AFAIR, and could potentially fail the same way.

This was the only place in 5.15 where we tested it.
Let me see if EFI has this section.

>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

Thanks,
Denis


* Re: [PATCH] KVM: arm64: nvhe: Disable profile optimization
  2022-09-21  0:08   ` Denis Nikitin
@ 2022-09-21  6:02     ` Denis Nikitin
  2022-09-21 17:25       ` Marc Zyngier
  0 siblings, 1 reply; 14+ messages in thread
From: Denis Nikitin @ 2022-09-21  6:02 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Catalin Marinas, Will Deacon, James Morse, Alexandru Elisei,
	Nick Desaulniers, Manoj Gupta, David Brazdil, linux-arm-kernel,
	kvmarm, linux-kernel, Denis Nikitin

Adding a few more comments...

On Tue, Sep 20, 2022 at 5:08 PM Denis Nikitin <denik@google.com> wrote:
>
> Hi Marc,
>
> Thank you for a quick response.
>
> On Tue, Sep 20, 2022 at 2:34 AM Marc Zyngier <maz@kernel.org> wrote:
> >
> > Hi Denis,
> >
> > On Tue, 20 Sep 2022 09:20:05 +0100,
> > Denis Nikitin <denik@chromium.org> wrote:
> > >
> > > Kernel build with -fprofile-sample-use raises the following failure:
> > >
> > > error: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o: Unexpected SHT_REL
> > > section ".rel.llvm.call-graph-profile"
> >
> > How is this flag provided? I don't see any occurrence of it in the
> > kernel so far.
>
> On ChromeOS we build the kernel with sample profiles by adding
> -fprofile-sample-use=/path/to/gcov.profile to KCFLAGS.
>
> >
> > >
> > > SHT_REL is generated by the latest lld, see
> > > https://reviews.llvm.org/rGca3bdb57fa1ac98b711a735de048c12b5fdd8086.
> >
> > Is this part of a released toolchain? If so, can you spell out the
> > first version where this occurs?
>
> Yes, it was added in llvm-13. I will update the patch.
>
> >
> > > Disable profile optimization in kvm/nvhe to fix the build with
> > > AutoFDO.
> >
> > It'd be good to at least mention how AutoFDO and -fprofile-sample-use
> > relate to each other.
>
> Good point. AutoFDO is an example of sample profiles.
> It's not actually relevant for the bug. I will better remove it.
>
> >
> > >
> > > Signed-off-by: Denis Nikitin <denik@chromium.org>
> > > ---
> > >  arch/arm64/kvm/hyp/nvhe/Makefile | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> > > index b5c5119c7396..6a6188374a52 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> > > +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> > > @@ -89,6 +89,9 @@ quiet_cmd_hypcopy = HYPCOPY $@
> > >  # Remove ftrace, Shadow Call Stack, and CFI CFLAGS.
> > >  # This is equivalent to the 'notrace', '__noscs', and '__nocfi' annotations.
> > >  KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI), $(KBUILD_CFLAGS))
> > > +# Profile optimization creates SHT_REL section '.llvm.call-graph-profile' for
> > > +# the hot code. SHT_REL is currently not supported by the KVM tools.
> >
> > 'KVM tools' seems vague. Maybe call out the actual helper that
> > processes the relocations?
>
> Agreed.
>
> >
> > > +KBUILD_CFLAGS += $(call cc-option,-fno-profile-sample-use,-fno-profile-use)
> >
> > Why adding these options instead of filtering out the offending option
> > as it is done just above?
>
> That was actually the alternative solution and it worked as well.
> Let me double check if profile optimization doesn't mess up with other
> sections and if it doesn't I will remove the '.llvm.call-graph-profile'
> section instead.

When I remove the '.llvm.call-graph-profile' section, the layout of the other
sections changes slightly (offsets and sizes) compared to a build with
`-fno-profile-sample-use`, but the list of sections remains the same.

>
> >
> > Also, is this the only place the kernel fails to compile? The EFI stub
> > does similar things AFAIR, and could potentially fail the same way.
>
> This was the only place in 5.15 where we tested it.
> Let me see if EFI has this section.

EFI code is not marked as hot in the profile.

Regarding "could potentially fail", I don't see any explicit manipulation
of code sections in EFI.
The hardcoded EFI stub entries should not be affected.

>
> >
> > Thanks,
> >
> >         M.
> >
> > --
> > Without deviation from the norm, progress is not possible.
>
> Thanks,
> Denis

- Denis


* Re: [PATCH] KVM: arm64: nvhe: Disable profile optimization
  2022-09-21  6:02     ` Denis Nikitin
@ 2022-09-21 17:25       ` Marc Zyngier
  0 siblings, 0 replies; 14+ messages in thread
From: Marc Zyngier @ 2022-09-21 17:25 UTC (permalink / raw)
  To: Denis Nikitin
  Cc: Catalin Marinas, Will Deacon, James Morse, Alexandru Elisei,
	Nick Desaulniers, Manoj Gupta, David Brazdil, linux-arm-kernel,
	kvmarm, linux-kernel

On Wed, 21 Sep 2022 07:02:50 +0100,
Denis Nikitin <denik@chromium.org> wrote:
> 
> Adding a few more comments...
> 
> On Tue, Sep 20, 2022 at 5:08 PM Denis Nikitin <denik@google.com> wrote:
> >
> > Hi Marc,
> >
> > Thank you for a quick response.
> >
> > On Tue, Sep 20, 2022 at 2:34 AM Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > Hi Denis,
> > >
> > > On Tue, 20 Sep 2022 09:20:05 +0100,
> > > Denis Nikitin <denik@chromium.org> wrote:
> > > >
> > > > Kernel build with -fprofile-sample-use raises the following failure:
> > > >
> > > > error: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o: Unexpected SHT_REL
> > > > section ".rel.llvm.call-graph-profile"
> > >
> > > How is this flag provided? I don't see any occurrence of it in the
> > > kernel so far.
> >
> > On ChromeOS we build the kernel with sample profiles by adding
> > -fprofile-sample-use=/path/to/gcov.profile to KCFLAGS.
> >
> > >
> > > >
> > > > SHT_REL is generated by the latest lld, see
> > > > https://reviews.llvm.org/rGca3bdb57fa1ac98b711a735de048c12b5fdd8086.
> > >
> > > Is this part of a released toolchain? If so, can you spell out the
> > > first version where this occurs?
> >
> > Yes, it was added in llvm-13. I will update the patch.
> >
> > >
> > > > Disable profile optimization in kvm/nvhe to fix the build with
> > > > AutoFDO.
> > >
> > > It'd be good to at least mention how AutoFDO and -fprofile-sample-use
> > > relate to each other.
> >
> > Good point. AutoFDO is an example of sample profiles.
> > It's not actually relevant for the bug. I will better remove it.
> >
> > >
> > > >
> > > > Signed-off-by: Denis Nikitin <denik@chromium.org>
> > > > ---
> > > >  arch/arm64/kvm/hyp/nvhe/Makefile | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> > > > index b5c5119c7396..6a6188374a52 100644
> > > > --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> > > > +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> > > > @@ -89,6 +89,9 @@ quiet_cmd_hypcopy = HYPCOPY $@
> > > >  # Remove ftrace, Shadow Call Stack, and CFI CFLAGS.
> > > >  # This is equivalent to the 'notrace', '__noscs', and '__nocfi' annotations.
> > > >  KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI), $(KBUILD_CFLAGS))
> > > > +# Profile optimization creates SHT_REL section '.llvm.call-graph-profile' for
> > > > +# the hot code. SHT_REL is currently not supported by the KVM tools.
> > >
> > > 'KVM tools' seems vague. Maybe call out the actual helper that
> > > processes the relocations?
> >
> > Agreed.
> >
> > >
> > > > +KBUILD_CFLAGS += $(call cc-option,-fno-profile-sample-use,-fno-profile-use)
> > >
> > > Why adding these options instead of filtering out the offending option
> > > as it is done just above?
> >
> > That was actually the alternative solution and it worked as well.
> > Let me double check if profile optimization doesn't mess up with other
> > sections and if it doesn't I will remove the '.llvm.call-graph-profile'
> > section instead.
> 
> When I remove the '.llvm.call-graph-profile' section the layout of other
> sections slightly changes (offsets and sizes) compared to
> `-fno-profile-sample-use`. But the list of sections remains the same.

If this method works well enough, I'd rather we stick to it, instead
of having two ways to disable this sort of thing.

> > > Also, is this the only place the kernel fails to compile? The EFI stub
> > > does similar things AFAIR, and could potentially fail the same way.
> >
> > This was the only place in 5.15 where we tested it.
> > Let me see if EFI has this section.
> 
> EFI code is not marked as hot in the profile.
> 
> Regarding "could potentially fail", I don't see any explicit manipulations
> with code sections in EFI.
> The hardcoded EFI stub entries should not be affected.

I was more worried by the runtime relocation that the EFI stub
performs for the kernel, but if you've checked that already, that
works for me.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* [PATCH v2] KVM: arm64: nvhe: Fix build with profile optimization
  2022-09-20  8:20 [PATCH] KVM: arm64: nvhe: Disable profile optimization Denis Nikitin
  2022-09-20  9:33 ` Marc Zyngier
@ 2022-09-22  5:31 ` Denis Nikitin
  2022-09-22 10:37   ` Marc Zyngier
  1 sibling, 1 reply; 14+ messages in thread
From: Denis Nikitin @ 2022-09-22  5:31 UTC (permalink / raw)
  To: Marc Zyngier, Catalin Marinas, Will Deacon
  Cc: James Morse, Alexandru Elisei, Nick Desaulniers, Manoj Gupta,
	David Brazdil, linux-arm-kernel, kvmarm, linux-kernel,
	Denis Nikitin

Kernel build with clang and KCFLAGS=-fprofile-sample-use fails with:

error: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o: Unexpected SHT_REL
section ".rel.llvm.call-graph-profile"

Starting from 13.0.0 llvm can generate SHT_REL section, see
https://reviews.llvm.org/rGca3bdb57fa1ac98b711a735de048c12b5fdd8086.
gen-hyprel does not support SHT_REL relocation section.

Remove ".llvm.call-graph-profile" SHT_REL relocation from kvm_nvhe
to fix the build.

Signed-off-by: Denis Nikitin <denik@chromium.org>
---
V1 -> V2: Remove the relocation instead of disabling the profile-use flags.
---
 arch/arm64/kvm/hyp/nvhe/Makefile | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index b5c5119c7396..49ec950ac57b 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -78,8 +78,10 @@ $(obj)/kvm_nvhe.o: $(obj)/kvm_nvhe.rel.o FORCE
 
 # The HYPREL command calls `gen-hyprel` to generate an assembly file with
 # a list of relocations targeting hyp code/data.
+# Starting from 13.0.0 llvm emits SHT_REL section '.llvm.call-graph-profile'
+# when profile optimization is applied. gen-hyprel does not support SHT_REL.
 quiet_cmd_hyprel = HYPREL  $@
-      cmd_hyprel = $(obj)/gen-hyprel $< > $@
+	cmd_hyprel = $(OBJCOPY) -R .llvm.call-graph-profile $<; $(obj)/gen-hyprel $< > $@
 
 # The HYPCOPY command uses `objcopy` to prefix all ELF symbol names
 # to avoid clashes with VHE code/data.
-- 
2.37.3.968.ga6b4b080e4-goog



* Re: [PATCH v2] KVM: arm64: nvhe: Fix build with profile optimization
  2022-09-22  5:31 ` [PATCH v2] KVM: arm64: nvhe: Fix build with " Denis Nikitin
@ 2022-09-22 10:37   ` Marc Zyngier
  2022-09-23  5:01     ` Denis Nikitin
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Zyngier @ 2022-09-22 10:37 UTC (permalink / raw)
  To: Denis Nikitin
  Cc: Catalin Marinas, Will Deacon, James Morse, Alexandru Elisei,
	Nick Desaulniers, Manoj Gupta, David Brazdil, linux-arm-kernel,
	kvmarm, linux-kernel

On Thu, 22 Sep 2022 06:31:45 +0100,
Denis Nikitin <denik@chromium.org> wrote:
> 
> Kernel build with clang and KCFLAGS=-fprofile-sample-use fails with:
> 
> error: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o: Unexpected SHT_REL
> section ".rel.llvm.call-graph-profile"
> 
> Starting from 13.0.0 llvm can generate SHT_REL section, see
> https://reviews.llvm.org/rGca3bdb57fa1ac98b711a735de048c12b5fdd8086.
> gen-hyprel does not support SHT_REL relocation section.
> 
> Remove ".llvm.call-graph-profile" SHT_REL relocation from kvm_nvhe
> to fix the build.
> 
> Signed-off-by: Denis Nikitin <denik@chromium.org>
> ---
> V1 -> V2: Remove the relocation instead of disabling the profile-use flags.
> ---
>  arch/arm64/kvm/hyp/nvhe/Makefile | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index b5c5119c7396..49ec950ac57b 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -78,8 +78,10 @@ $(obj)/kvm_nvhe.o: $(obj)/kvm_nvhe.rel.o FORCE
>  
>  # The HYPREL command calls `gen-hyprel` to generate an assembly file with
>  # a list of relocations targeting hyp code/data.
> +# Starting from 13.0.0 llvm emits SHT_REL section '.llvm.call-graph-profile'
> +# when profile optimization is applied. gen-hyprel does not support SHT_REL.
>  quiet_cmd_hyprel = HYPREL  $@
> -      cmd_hyprel = $(obj)/gen-hyprel $< > $@
> +	cmd_hyprel = $(OBJCOPY) -R .llvm.call-graph-profile $<; $(obj)/gen-hyprel $< > $@

I was really hoping that you'd just drop the flags from the CFLAGS
instead of removing the generated section. Something like:

diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index b5c5119c7396..e5b2d43925b4 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -88,7 +88,7 @@ quiet_cmd_hypcopy = HYPCOPY $@
 
 # Remove ftrace, Shadow Call Stack, and CFI CFLAGS.
 # This is equivalent to the 'notrace', '__noscs', and '__nocfi' annotations.
-KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI), $(KBUILD_CFLAGS))
+KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI) -fprofile-sample-use, $(KBUILD_CFLAGS))
 
 # KVM nVHE code is run at a different exception code with a different map, so
 # compiler instrumentation that inserts callbacks or checks into the code may

However, I even failed to reproduce your problem using LLVM 14 as
packaged by Debian (if that matters, I'm using an arm64 build
machine). I build the kernel with:

$ make LLVM=1 KCFLAGS=-fprofile-sample-use -j8 vmlinux

and the offending object only contains the following sections:

arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o:     file format elf64-littleaarch64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .hyp.idmap.text 00000ae4  0000000000000000  0000000000000000  00000800  2**11
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .hyp.text     0000e988  0000000000000000  0000000000000000  00001800  2**11
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  2 .hyp.data..ro_after_init 00000820  0000000000000000  0000000000000000  00010188  2**3
                  CONTENTS, ALLOC, LOAD, DATA
  3 .hyp.rodata   00002e70  0000000000000000  0000000000000000  000109a8  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
  4 .hyp.data..percpu 00001ee0  0000000000000000  0000000000000000  00013820  2**4
                  CONTENTS, ALLOC, LOAD, DATA
  5 .hyp.bss      00001158  0000000000000000  0000000000000000  00015700  2**3
                  ALLOC
  6 .comment      0000001f  0000000000000000  0000000000000000  00017830  2**0
                  CONTENTS, READONLY
  7 .llvm_addrsig 000000b8  0000000000000000  0000000000000000  0001784f  2**0
                  CONTENTS, READONLY, EXCLUDE
  8 .altinstructions 00001284  0000000000000000  0000000000000000  00015700  2**0
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
  9 __jump_table  00000960  0000000000000000  0000000000000000  00016988  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, DATA
 10 __bug_table   0000051c  0000000000000000  0000000000000000  000172e8  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, DATA
 11 __kvm_ex_table 00000028  0000000000000000  0000000000000000  00017808  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
 12 .note.GNU-stack 00000000  0000000000000000  0000000000000000  00027370  2**0
                  CONTENTS, READONLY

So what am I missing to trigger this issue? Does it rely on something
like PGO, which is not upstream yet? A bit of handholding would be
much appreciated.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v2] KVM: arm64: nvhe: Fix build with profile optimization
  2022-09-22 10:37   ` Marc Zyngier
@ 2022-09-23  5:01     ` Denis Nikitin
       [not found]       ` <CAH=Qcsi3aQ51AsAE0WmAH9VmpqjOaQQt=ru5Nav4+d8F3fMPwQ@mail.gmail.com>
  0 siblings, 1 reply; 14+ messages in thread
From: Denis Nikitin @ 2022-09-23  5:01 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Catalin Marinas, Will Deacon, James Morse, Alexandru Elisei,
	Nick Desaulniers, Manoj Gupta, David Brazdil, linux-arm-kernel,
	kvmarm, linux-kernel

Hi Marc,

On Thu, Sep 22, 2022 at 3:38 AM Marc Zyngier <maz@kernel.org> wrote:
>
> I was really hoping that you'd just drop the flags from the CFLAGS
> instead of removing the generated section. Something like:
>
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index b5c5119c7396..e5b2d43925b4 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -88,7 +88,7 @@ quiet_cmd_hypcopy = HYPCOPY $@
>
>  # Remove ftrace, Shadow Call Stack, and CFI CFLAGS.
>  # This is equivalent to the 'notrace', '__noscs', and '__nocfi' annotations.
> -KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI), $(KBUILD_CFLAGS))
> +KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI) -fprofile-sample-use, $(KBUILD_CFLAGS))
>
>  # KVM nVHE code is run at a different exception code with a different map, so
>  # compiler instrumentation that inserts callbacks or checks into the code may

Sorry, I moved on with a different approach and didn't explain the rationale.

As you mentioned before, the flag `-fprofile-sample-use` does not appear
anywhere in the kernel, so it would look confusing to disable it or filter
it out here. That was the first reason.

The second reason is that the root cause of the build failure isn't the
compiler's profile-guided optimization itself but the extra metadata in the
SHT_REL section that LLVM injects into kvm_nvhe.tmp.o for later optimization
by the linker.
If we remove the .llvm.call-graph-profile section, we fix the build and avoid
potential problems with relocations optimized by the linker. The
profile-guided optimization will still be applied by the compiler.
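
To check whether a given object carries the offending relocation section,
something like the sketch below might help. It is only an illustration:
has_cgprofile_rel is a hypothetical helper name, and the commented commands
assume llvm-readelf and llvm-objcopy are on the PATH and that removing the
section also drops its relocation section, as it did in the v2 patch.

```shell
# Sketch only: has_cgprofile_rel is a hypothetical helper name.
# It reads a section-header listing on stdin and succeeds if the
# relocation section named in the build error is present.
has_cgprofile_rel() {
    grep -Fq '.rel.llvm.call-graph-profile'
}

# Intended use (needs a kernel tree and LLVM binutils, so left commented out):
#   obj=arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o
#   llvm-readelf -S "$obj" | has_cgprofile_rel && echo "SHT_REL section present"
#   llvm-objcopy -R .llvm.call-graph-profile "$obj"
#   llvm-readelf -S "$obj" | has_cgprofile_rel || echo "section removed"
```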

Let me know what you think about it.

>
> However, I even failed to reproduce your problem using LLVM 14 as
> packaged by Debian (if that matters, I'm using an arm64 build
> machine). I build the kernel with:
>
> $ make LLVM=1 KCFLAGS=-fprofile-sample-use -j8 vmlinux
>
> and the offending object only contains the following sections:
>
> arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o:     file format elf64-littleaarch64
>
> Sections:
> Idx Name          Size      VMA               LMA               File off  Algn
>   0 .hyp.idmap.text 00000ae4  0000000000000000  0000000000000000  00000800  2**11
>                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>   1 .hyp.text     0000e988  0000000000000000  0000000000000000  00001800  2**11
>                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>   2 .hyp.data..ro_after_init 00000820  0000000000000000  0000000000000000  00010188  2**3
>                   CONTENTS, ALLOC, LOAD, DATA
>   3 .hyp.rodata   00002e70  0000000000000000  0000000000000000  000109a8  2**3
>                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>   4 .hyp.data..percpu 00001ee0  0000000000000000  0000000000000000  00013820  2**4
>                   CONTENTS, ALLOC, LOAD, DATA
>   5 .hyp.bss      00001158  0000000000000000  0000000000000000  00015700  2**3
>                   ALLOC
>   6 .comment      0000001f  0000000000000000  0000000000000000  00017830  2**0
>                   CONTENTS, READONLY
>   7 .llvm_addrsig 000000b8  0000000000000000  0000000000000000  0001784f  2**0
>                   CONTENTS, READONLY, EXCLUDE
>   8 .altinstructions 00001284  0000000000000000  0000000000000000  00015700  2**0
>                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>   9 __jump_table  00000960  0000000000000000  0000000000000000  00016988  2**3
>                   CONTENTS, ALLOC, LOAD, RELOC, DATA
>  10 __bug_table   0000051c  0000000000000000  0000000000000000  000172e8  2**2
>                   CONTENTS, ALLOC, LOAD, RELOC, DATA
>  11 __kvm_ex_table 00000028  0000000000000000  0000000000000000  00017808  2**3
>                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>  12 .note.GNU-stack 00000000  0000000000000000  0000000000000000  00027370  2**0
>                   CONTENTS, READONLY
>
> So what am I missing to trigger this issue? Does it rely on something
> like PGO, which is not upstream yet? A bit of handholding would be
> much appreciated.

Right, it relies on the PGO profile.
On ChromeOS we collect the sample PGO profile from Arm devices with
CoreSight/ETM enabled. You can find more details on ETM at
https://www.kernel.org/doc/Documentation/trace/coresight/coresight.rst.

https://github.com/Linaro/OpenCSD/blob/master/decoder/tests/auto-fdo/autofdo.md
contains information about the pipeline of collecting, processing, and applying
the profile.
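
For reproducing, the flag needs the profile path attached, as in the
KCFLAGS example earlier in the thread. A minimal sketch, assuming a profile
has already been generated: profile_kcflags is a hypothetical helper, and
/path/to/gcov.profile is a placeholder rather than a real file.

```shell
# Sketch only: profile_kcflags is a hypothetical helper, and
# /path/to/gcov.profile is the placeholder path used earlier in the thread.
profile_kcflags() {
    printf '%s=%s\n' '-fprofile-sample-use' "$1"
}

# Intended use (needs a kernel tree and a real profile, so left commented out):
#   make LLVM=1 KCFLAGS="$(profile_kcflags /path/to/gcov.profile)" -j8 vmlinux
```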

>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

Thanks,
Denis


* Re: [PATCH v2] KVM: arm64: nvhe: Fix build with profile optimization
       [not found]       ` <CAH=Qcsi3aQ51AsAE0WmAH9VmpqjOaQQt=ru5Nav4+d8F3fMPwQ@mail.gmail.com>
@ 2022-09-29 16:13         ` Denis Nikitin
  2022-10-06 16:28           ` Denis Nikitin
  0 siblings, 1 reply; 14+ messages in thread
From: Denis Nikitin @ 2022-09-29 16:13 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Catalin Marinas, Will Deacon, James Morse, Alexandru Elisei,
	Nick Desaulniers, David Brazdil, linux-arm-kernel, kvmarm,
	linux-kernel, Manoj Gupta

Hi Marc,

Please let me know what you think about this approach.

Thanks,
Denis

On Thu, Sep 22, 2022 at 11:04 PM Manoj Gupta <manojgupta@google.com> wrote:
>
>
>
> On Thu, Sep 22, 2022 at 10:01 PM Denis Nikitin <denik@chromium.org> wrote:
>>
>> Hi Marc,
>>
>> On Thu, Sep 22, 2022 at 3:38 AM Marc Zyngier <maz@kernel.org> wrote:
>> >
>> > I was really hoping that you'd just drop the flags from the CFLAGS
>> > instead of removing the generated section. Something like:
>> >
>> > diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
>> > index b5c5119c7396..e5b2d43925b4 100644
>> > --- a/arch/arm64/kvm/hyp/nvhe/Makefile
>> > +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
>> > @@ -88,7 +88,7 @@ quiet_cmd_hypcopy = HYPCOPY $@
>> >
>> >  # Remove ftrace, Shadow Call Stack, and CFI CFLAGS.
>> >  # This is equivalent to the 'notrace', '__noscs', and '__nocfi' annotations.
>> > -KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI), $(KBUILD_CFLAGS))
>> > +KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI) -fprofile-sample-use, $(KBUILD_CFLAGS))
>> >
>> >  # KVM nVHE code is run at a different exception code with a different map, so
>> >  # compiler instrumentation that inserts callbacks or checks into the code may
>>
>> Sorry, I moved on with a different approach and didn't explain the rationale.
>>
>> Like you mentioned before, the flag `-fprofile-sample-use` does not appear
>> in the kernel. And it looks confusing when the flag is disabled or filtered out
>> here. This was the first reason.
>>
>> The root cause of the build failure wasn't the compiler profile guided
>> optimization but the extra metadata in SHT_REL section which llvm injected
>> into kvm_nvhe.tmp.o for further link optimization.
>> If we remove the .llvm.call-graph-profile section we fix the build and avoid
>> potential problems with relocations optimized by the linker. The profile
>> guided optimization will still be applied by the compiler.
>>
>> Let me know what you think about it.
>>
>> >
>> > However, I even failed to reproduce your problem using LLVM 14 as
>> > packaged by Debian (if that matters, I'm using an arm64 build
>> > machine). I build the kernel with:
>> >
>> > $ make LLVM=1 KCFLAGS=-fprofile-sample-use -j8 vmlinux
>> >
>> > and the offending object only contains the following sections:
>> >
>
>
> Just some comments based on my ChromeOS build experience.
>
> -fprofile-sample-use needs a profile file name argument to read the PGO data from,
> i.e. -fprofile-sample-use=/path/to/gcov.profile.
>
> Since the path to the file can change, filtering the flag out is more difficult.
> It is certainly possible to find and filter the exact argument with a string search of KCFLAGS,
> but passing -fno-profile-sample-use is easier and less error-prone, which I believe is what the previous patch version tried to do.
>
>
>> > arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o:     file format elf64-littleaarch64
>> >
>> > Sections:
>> > Idx Name          Size      VMA               LMA               File off  Algn
>> >   0 .hyp.idmap.text 00000ae4  0000000000000000  0000000000000000  00000800  2**11
>> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>> >   1 .hyp.text     0000e988  0000000000000000  0000000000000000  00001800  2**11
>> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>> >   2 .hyp.data..ro_after_init 00000820  0000000000000000  0000000000000000  00010188  2**3
>> >                   CONTENTS, ALLOC, LOAD, DATA
>> >   3 .hyp.rodata   00002e70  0000000000000000  0000000000000000  000109a8  2**3
>> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>> >   4 .hyp.data..percpu 00001ee0  0000000000000000  0000000000000000  00013820  2**4
>> >                   CONTENTS, ALLOC, LOAD, DATA
>> >   5 .hyp.bss      00001158  0000000000000000  0000000000000000  00015700  2**3
>> >                   ALLOC
>> >   6 .comment      0000001f  0000000000000000  0000000000000000  00017830  2**0
>> >                   CONTENTS, READONLY
>> >   7 .llvm_addrsig 000000b8  0000000000000000  0000000000000000  0001784f  2**0
>> >                   CONTENTS, READONLY, EXCLUDE
>> >   8 .altinstructions 00001284  0000000000000000  0000000000000000  00015700  2**0
>> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>> >   9 __jump_table  00000960  0000000000000000  0000000000000000  00016988  2**3
>> >                   CONTENTS, ALLOC, LOAD, RELOC, DATA
>> >  10 __bug_table   0000051c  0000000000000000  0000000000000000  000172e8  2**2
>> >                   CONTENTS, ALLOC, LOAD, RELOC, DATA
>> >  11 __kvm_ex_table 00000028  0000000000000000  0000000000000000  00017808  2**3
>> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>> >  12 .note.GNU-stack 00000000  0000000000000000  0000000000000000  00027370  2**0
>> >                   CONTENTS, READONLY
>> >
>> > So what am I missing to trigger this issue? Does it rely on something
>> > like PGO, which is not upstream yet? A bit of handholding would be
>> > much appreciated.
>>
>> Right, it relies on the PGO profile.
>> On ChromeOS we collect the sample PGO profile from Arm devices with
>> enabled CoreSight/ETM. You can find more details on ETM at
>> https://www.kernel.org/doc/Documentation/trace/coresight/coresight.rst.
>>
>> https://github.com/Linaro/OpenCSD/blob/master/decoder/tests/auto-fdo/autofdo.md
>> contains information about the pipeline of collecting, processing, and applying
>> the profile.
>>
>
> Generally the difficult part is collecting a good matching profile for the workload.
> So I think this patch is better than the previous one, since it still keeps the compiler optimization for the hot code paths
> in the file but removes the problematic section.
>
> Thanks,
> Manoj
>
>
>>
>> >
>> > Thanks,
>> >
>> >         M.
>> >
>> > --
>> > Without deviation from the norm, progress is not possible.
>>
>> Thanks,
>> Denis


* Re: [PATCH v2] KVM: arm64: nvhe: Fix build with profile optimization
  2022-09-29 16:13         ` Denis Nikitin
@ 2022-10-06 16:28           ` Denis Nikitin
  2022-10-09  2:20             ` Marc Zyngier
  0 siblings, 1 reply; 14+ messages in thread
From: Denis Nikitin @ 2022-10-06 16:28 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Catalin Marinas, Will Deacon, James Morse, Alexandru Elisei,
	Nick Desaulniers, David Brazdil, linux-arm-kernel, kvmarm,
	linux-kernel, Manoj Gupta

Hi Mark,

This problem currently blocks the PGO roll on the ChromeOS kernel and
we need some kind of a solution.
Could you please take a look?

Thanks,
Denis



* Re: [PATCH v2] KVM: arm64: nvhe: Fix build with profile optimization
  2022-10-06 16:28           ` Denis Nikitin
@ 2022-10-09  2:20             ` Marc Zyngier
  2022-10-11  2:15               ` Denis Nikitin
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Zyngier @ 2022-10-09  2:20 UTC (permalink / raw)
  To: Denis Nikitin
  Cc: Catalin Marinas, Will Deacon, James Morse, Alexandru Elisei,
	Nick Desaulniers, David Brazdil, linux-arm-kernel, kvmarm,
	linux-kernel, Manoj Gupta

On Thu, 06 Oct 2022 17:28:17 +0100,
Denis Nikitin <denik@chromium.org> wrote:
> 
> Hi Mark,

s/k/c/

> 
> This problem currently blocks the PGO roll on the ChromeOS kernel and
> we need some kind of a solution.

I'm sorry, but I don't feel constrained by your internal deadlines. I
have my own...

> Could you please take a look?

I have asked for a reproducer. All I got for an answer is "this is
hard". Providing a profiling file would help, for example.

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v2] KVM: arm64: nvhe: Fix build with profile optimization
  2022-10-09  2:20             ` Marc Zyngier
@ 2022-10-11  2:15               ` Denis Nikitin
  2022-10-13 11:09                 ` Marc Zyngier
  0 siblings, 1 reply; 14+ messages in thread
From: Denis Nikitin @ 2022-10-11  2:15 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Catalin Marinas, Will Deacon, James Morse, Alexandru Elisei,
	Nick Desaulniers, David Brazdil, linux-arm-kernel, kvmarm,
	linux-kernel, Manoj Gupta

On Sat, Oct 8, 2022 at 7:22 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On Thu, 06 Oct 2022 17:28:17 +0100,
> Denis Nikitin <denik@chromium.org> wrote:
> >
> > Hi Mark,
>
> s/k/c/
>
> >
> > This problem currently blocks the PGO roll on the ChromeOS kernel and
> > we need some kind of a solution.
>
> I'm sorry, but I don't feel constrained by your internal deadlines. I
> have my own...
>
> > Could you please take a look?
>
> I have asked for a reproducer. All I got for an answer is "this is
> hard". Providing a profiling file would help, for example.

Could you please try the following profile on the 5.15 branch?

$ cat <<EOF > prof.txt
kvm_pgtable_walk:100:10
 2: 5
 3: 5
 5: 5
 6: 5
 10: 5
 10: _kvm_pgtable_walk:50
  5: 5
  7: 5
  10: 5
  13.2: 5
  14: 5
  16: 5 __kvm_pgtable_walk:5
  13: kvm_pgd_page_idx:30
   2: __kvm_pgd_page_idx:30
    2: 5
    3: 5
    5: 5
    2: kvm_granule_shift:5
     3: 5
EOF

$ make LLVM=1 ARCH=arm64 KCFLAGS=-fprofile-sample-use=prof.txt -j8 vmlinux
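
To confirm that a build with this profile actually hits the problem, the
intermediate object can be checked for the section named in the error
message. A minimal sketch, assuming `llvm-readobj` is installed and the
object is still present at the in-tree path from the error message;
`has_cgprofile_rel` is a hypothetical helper name, not something from the
kernel tree:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the section listing on stdin names the
# relocation section reported in the build error.
has_cgprofile_rel() {
    grep -q -F '.rel.llvm.call-graph-profile'
}

# Assumed path of the intermediate nVHE object for an in-tree build.
obj=arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o

# Only inspect the object when both it and the tool are available.
if [ -f "$obj" ] && command -v llvm-readobj >/dev/null 2>&1; then
    if llvm-readobj --sections "$obj" | has_cgprofile_rel; then
        echo "$obj: .rel.llvm.call-graph-profile present"
    else
        echo "$obj: .rel.llvm.call-graph-profile absent"
    fi
fi
```

The script prints nothing when the object or the tool is missing, so it is
safe to run from the top of a tree that has not been built yet.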

Thanks,
Denis

>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.


* Re: [PATCH v2] KVM: arm64: nvhe: Fix build with profile optimization
  2022-10-11  2:15               ` Denis Nikitin
@ 2022-10-13 11:09                 ` Marc Zyngier
  2022-10-13 19:02                   ` Denis Nikitin
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Zyngier @ 2022-10-13 11:09 UTC (permalink / raw)
  To: Denis Nikitin
  Cc: Catalin Marinas, Will Deacon, James Morse, Alexandru Elisei,
	Nick Desaulniers, David Brazdil, linux-arm-kernel, kvmarm,
	linux-kernel, Manoj Gupta

On Tue, 11 Oct 2022 03:15:36 +0100,
Denis Nikitin <denik@chromium.org> wrote:
> 
> On Sat, Oct 8, 2022 at 7:22 PM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Thu, 06 Oct 2022 17:28:17 +0100,
> > Denis Nikitin <denik@chromium.org> wrote:
> > >
> > > Hi Mark,
> >
> > s/k/c/
> >
> > >
> > > This problem currently blocks the PGO roll on the ChromeOS kernel and
> > > we need some kind of a solution.
> >
> > I'm sorry, but I don't feel constrained by your internal deadlines. I
> > have my own...
> >
> > > Could you please take a look?
> >
> > I have asked for a reproducer. All I got for an answer is "this is
> > hard". Providing a profiling file would help, for example.
> 
> Could you please try the following profile on the 5.15 branch?
> 
> $ cat <<EOF > prof.txt
> kvm_pgtable_walk:100:10
>  2: 5
>  3: 5
>  5: 5
>  6: 5
>  10: 5
>  10: _kvm_pgtable_walk:50
>   5: 5
>   7: 5
>   10: 5
>   13.2: 5
>   14: 5
>   16: 5 __kvm_pgtable_walk:5
>   13: kvm_pgd_page_idx:30
>    2: __kvm_pgd_page_idx:30
>     2: 5
>     3: 5
>     5: 5
>     2: kvm_granule_shift:5
>      3: 5
> EOF
> 
> $ make LLVM=1 ARCH=arm64 KCFLAGS=-fprofile-sample-use=prof.txt -j8 vmlinux

Thanks, this was helpful, as I was able to reproduce the build failure.

FWIW, it seems pretty easy to work around by filtering out the
offending option, making it consistent with the mechanism we already
use for tracing and the like.

I came up with the hack below, which does the trick and is IMHO better
than dropping the section (extra work) or adding the negation of this
option (which depends on the compiler option evaluation order).

	M.

diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 48f6ae7cc6e6..7df1b6afca7f 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -91,7 +91,7 @@ quiet_cmd_hypcopy = HYPCOPY $@
 
 # Remove ftrace, Shadow Call Stack, and CFI CFLAGS.
 # This is equivalent to the 'notrace', '__noscs', and '__nocfi' annotations.
-KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI), $(KBUILD_CFLAGS))
+KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI) -fprofile-sample-use=%, $(KBUILD_CFLAGS))
 
 # KVM nVHE code is run at a different exception code with a different map, so
 # compiler instrumentation that inserts callbacks or checks into the code may
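
The effect of the `=%` pattern can be checked outside the kernel tree. A
minimal sketch, assuming GNU make is installed; the flag list, the profile
path, and the `demo_filter_out` name are made up for illustration:

```shell
#!/bin/sh
# Hypothetical demo: compare filtering the bare flag with filtering the
# '=%' pattern on a made-up flag list.
demo_filter_out() {
    tmp=$(mktemp -d) || return 1
    # Quoted heredoc delimiter: the shell leaves $(...) for make to expand.
    cat > "$tmp/Makefile" <<'EOF'
FLAGS := -O2 -fprofile-sample-use=/path/to/prof.txt -Wall
$(info exact: $(filter-out -fprofile-sample-use, $(FLAGS)))
$(info pattern: $(filter-out -fprofile-sample-use=%, $(FLAGS)))
all: ;
EOF
    make -s -f "$tmp/Makefile"
    rm -rf "$tmp"
}

# Skip quietly when make is not available.
if command -v make >/dev/null 2>&1; then
    demo_filter_out
fi
```

The `exact:` line still carries the `-fprofile-sample-use=/path/to/prof.txt`
word, because without `%` filter-out only matches whole words; the
`pattern:` line no longer contains it.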


-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v2] KVM: arm64: nvhe: Fix build with profile optimization
  2022-10-13 11:09                 ` Marc Zyngier
@ 2022-10-13 19:02                   ` Denis Nikitin
  0 siblings, 0 replies; 14+ messages in thread
From: Denis Nikitin @ 2022-10-13 19:02 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Catalin Marinas, Will Deacon, James Morse, Alexandru Elisei,
	Nick Desaulniers, David Brazdil, linux-arm-kernel, kvmarm,
	linux-kernel, Manoj Gupta

Thank you Marc for figuring out the filtering-out solution!
It fixed the build on ChromeOS.

I will update the patch and also filter out `-fprofile-use` which will avoid
a similar problem with the instrumented PGO in the future.

Thanks,
Denis



Thread overview: 14+ messages
2022-09-20  8:20 [PATCH] KVM: arm64: nvhe: Disable profile optimization Denis Nikitin
2022-09-20  9:33 ` Marc Zyngier
2022-09-21  0:08   ` Denis Nikitin
2022-09-21  6:02     ` Denis Nikitin
2022-09-21 17:25       ` Marc Zyngier
2022-09-22  5:31 ` [PATCH v2] KVM: arm64: nvhe: Fix build with " Denis Nikitin
2022-09-22 10:37   ` Marc Zyngier
2022-09-23  5:01     ` Denis Nikitin
     [not found]       ` <CAH=Qcsi3aQ51AsAE0WmAH9VmpqjOaQQt=ru5Nav4+d8F3fMPwQ@mail.gmail.com>
2022-09-29 16:13         ` Denis Nikitin
2022-10-06 16:28           ` Denis Nikitin
2022-10-09  2:20             ` Marc Zyngier
2022-10-11  2:15               ` Denis Nikitin
2022-10-13 11:09                 ` Marc Zyngier
2022-10-13 19:02                   ` Denis Nikitin
