linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Denis Nikitin <denik@chromium.org>
To: Marc Zyngier <maz@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,  James Morse <james.morse@arm.com>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	 Nick Desaulniers <ndesaulniers@google.com>,
	David Brazdil <dbrazdil@google.com>,
	 linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu,  linux-kernel@vger.kernel.org,
	Manoj Gupta <manojgupta@google.com>
Subject: Re: [PATCH v2] KVM: arm64: nvhe: Fix build with profile optimization
Date: Thu, 29 Sep 2022 09:13:12 -0700	[thread overview]
Message-ID: <CADDJ8CXObkhMvOx+L29awjtt7tiaTWxFrRxOmhUqvzku1wswHw@mail.gmail.com> (raw)
In-Reply-To: <CAH=Qcsi3aQ51AsAE0WmAH9VmpqjOaQQt=ru5Nav4+d8F3fMPwQ@mail.gmail.com>

Hi Marc,

Please let me know what you think about this approach.

Thanks,
Denis

On Thu, Sep 22, 2022 at 11:04 PM Manoj Gupta <manojgupta@google.com> wrote:
>
>
>
> On Thu, Sep 22, 2022 at 10:01 PM Denis Nikitin <denik@chromium.org> wrote:
>>
>> Hi Mark,
>>
>> On Thu, Sep 22, 2022 at 3:38 AM Marc Zyngier <maz@kernel.org> wrote:
>> >
>> > I was really hoping that you'd just drop the flags from the CFLAGS
>> > instead of removing the generated section. Something like:
>> >
>> > diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
>> > index b5c5119c7396..e5b2d43925b4 100644
>> > --- a/arch/arm64/kvm/hyp/nvhe/Makefile
>> > +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
>> > @@ -88,7 +88,7 @@ quiet_cmd_hypcopy = HYPCOPY $@
>> >
>> >  # Remove ftrace, Shadow Call Stack, and CFI CFLAGS.
>> >  # This is equivalent to the 'notrace', '__noscs', and '__nocfi' annotations.
>> > -KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI), $(KBUILD_CFLAGS))
>> > +KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI) -fprofile-sample-use, $(KBUILD_CFLAGS))
>> >
>> >  # KVM nVHE code is run at a different exception code with a different map, so
>> >  # compiler instrumentation that inserts callbacks or checks into the code may
>>
>> Sorry, I moved on with a different approach and didn't explain the rationale.
>>
>> Like you mentioned before, the flag `-fprofile-sample-use` does not appear
>> in the kernel. And it looks confusing when the flag is disabled or filtered out
>> here. This was the first reason.
>>
>> The root cause of the build failure wasn't the compiler profile guided
>> optimization but the extra metadata in SHT_REL section which llvm injected
>> into kvm_nvhe.tmp.o for further link optimization.
>> If we remove the .llvm.call-graph-profile section we fix the build and avoid
>> potential problems with relocations optimized by the linker. The profile
>> guided optimization will still be applied by the compiler.
>>
>> Let me know what you think about it.
>>
>> >
>> > However, I even failed to reproduce your problem using LLVM 14 as
>> > packaged by Debian (if that matters, I'm using an arm64 build
>> > machine). I build the kernel with:
>> >
>> > $ make LLVM=1 KCFLAGS=-fprofile-sample-use -j8 vmlinux
>> >
>> > and the offending object only contains the following sections:
>> >
>
>
> Just some comments based on my ChromeOS build experience.
>
> fprofile-sample-use needs the profile file name argument to read the pgo data from
> i.e. -fprofile-sample-use=/path/to/gcov.profile.
>
> Since the path to filename can change, it makes filtering out more difficult.
> It is certainly possible to find and filter the exact argument by some string search of KCFLAGS.
> But passing  -fno-profile-sample-use is easier and less error prone which I believe the previous patch version tried to do.
>
>
>> > arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o:     file format elf64-littleaarch64
>> >
>> > Sections:
>> > Idx Name          Size      VMA               LMA               File off  Algn
>> >   0 .hyp.idmap.text 00000ae4  0000000000000000  0000000000000000  00000800  2**11
>> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>> >   1 .hyp.text     0000e988  0000000000000000  0000000000000000  00001800  2**11
>> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
>> >   2 .hyp.data..ro_after_init 00000820  0000000000000000  0000000000000000  00010188  2**3
>> >                   CONTENTS, ALLOC, LOAD, DATA
>> >   3 .hyp.rodata   00002e70  0000000000000000  0000000000000000  000109a8  2**3
>> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>> >   4 .hyp.data..percpu 00001ee0  0000000000000000  0000000000000000  00013820  2**4
>> >                   CONTENTS, ALLOC, LOAD, DATA
>> >   5 .hyp.bss      00001158  0000000000000000  0000000000000000  00015700  2**3
>> >                   ALLOC
>> >   6 .comment      0000001f  0000000000000000  0000000000000000  00017830  2**0
>> >                   CONTENTS, READONLY
>> >   7 .llvm_addrsig 000000b8  0000000000000000  0000000000000000  0001784f  2**0
>> >                   CONTENTS, READONLY, EXCLUDE
>> >   8 .altinstructions 00001284  0000000000000000  0000000000000000  00015700  2**0
>> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>> >   9 __jump_table  00000960  0000000000000000  0000000000000000  00016988  2**3
>> >                   CONTENTS, ALLOC, LOAD, RELOC, DATA
>> >  10 __bug_table   0000051c  0000000000000000  0000000000000000  000172e8  2**2
>> >                   CONTENTS, ALLOC, LOAD, RELOC, DATA
>> >  11 __kvm_ex_table 00000028  0000000000000000  0000000000000000  00017808  2**3
>> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
>> >  12 .note.GNU-stack 00000000  0000000000000000  0000000000000000  00027370  2**0
>> >                   CONTENTS, READONLY
>> >
>> > So what am I missing to trigger this issue? Does it rely on something
>> > like PGO, which is not upstream yet? A bit of handholding would be
>> > much appreciated.
>>
>> Right, it relies on the PGO profile.
>> On ChromeOS we collect the sample PGO profile from Arm devices with
>> enabled CoreSight/ETM. You can find more details on ETM at
>> https://www.kernel.org/doc/Documentation/trace/coresight/coresight.rst.
>>
>> https://github.com/Linaro/OpenCSD/blob/master/decoder/tests/auto-fdo/autofdo.md
>> contains information about the pipeline of collecting, processing, and applying
>> the profile.
>>
>
> Generally the difficult part is in collecting a good matching profile for the workload.
> So I think this patch is better than previous since it still keeps the compiler optimization for the hot code paths
> in the file but removes the problematic section.
>
> Thanks,
> Manoj
>
>
>>
>> >
>> > Thanks,
>> >
>> >         M.
>> >
>> > --
>> > Without deviation from the norm, progress is not possible.
>>
>> Thanks,
>> Denis

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  parent reply	other threads:[~2022-09-29 16:14 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-20  8:20 [PATCH] KVM: arm64: nvhe: Disable profile optimization Denis Nikitin
2022-09-20  9:33 ` Marc Zyngier
2022-09-21  0:08   ` Denis Nikitin
2022-09-21  6:02     ` Denis Nikitin
2022-09-21 17:25       ` Marc Zyngier
2022-09-22  5:31 ` [PATCH v2] KVM: arm64: nvhe: Fix build with " Denis Nikitin
2022-09-22 10:37   ` Marc Zyngier
2022-09-23  5:01     ` Denis Nikitin
     [not found]       ` <CAH=Qcsi3aQ51AsAE0WmAH9VmpqjOaQQt=ru5Nav4+d8F3fMPwQ@mail.gmail.com>
2022-09-29 16:13         ` Denis Nikitin [this message]
2022-10-06 16:28           ` Denis Nikitin
2022-10-09  2:20             ` Marc Zyngier
2022-10-11  2:15               ` Denis Nikitin
2022-10-13 11:09                 ` Marc Zyngier
2022-10-13 19:02                   ` Denis Nikitin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CADDJ8CXObkhMvOx+L29awjtt7tiaTWxFrRxOmhUqvzku1wswHw@mail.gmail.com \
    --to=denik@chromium.org \
    --cc=alexandru.elisei@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=dbrazdil@google.com \
    --cc=james.morse@arm.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=manojgupta@google.com \
    --cc=maz@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).