From: Ricardo Ribalda <ribalda@chromium.org> To: Philipp Rudo <prudo@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com>, linux-kernel@vger.kernel.org, kexec@lists.infradead.org, Baoquan He <bhe@redhat.com> Subject: Re: [PATCH v3] kexec: Support purgatories with .text.hot sections Date: Mon, 27 Mar 2023 13:52:08 +0200 [thread overview] Message-ID: <CANiDSCtu8oOn9vV9eak=S2RDVVO9yan2BO8K5ia9jALABqiwjQ@mail.gmail.com> (raw) In-Reply-To: <20230324165855.23084947@rotkaeppchen> Hi Philipp On Fri, 24 Mar 2023 at 17:00, Philipp Rudo <prudo@redhat.com> wrote: > > Hi Ricardo, > > On Wed, 22 Mar 2023 20:09:21 +0100 > Ricardo Ribalda <ribalda@chromium.org> wrote: > > > Clang16 links the purgatory text in two sections: > > > > [ 1] .text PROGBITS 0000000000000000 00000040 > > 00000000000011a1 0000000000000000 AX 0 0 16 > > [ 2] .rela.text RELA 0000000000000000 00003498 > > 0000000000000648 0000000000000018 I 24 1 8 > > ... > > [17] .text.hot. PROGBITS 0000000000000000 00003220 > > 000000000000020b 0000000000000000 AX 0 0 1 > > [18] .rela.text.hot. RELA 0000000000000000 00004428 > > 0000000000000078 0000000000000018 I 24 17 8 > > > > And both of them have their range [sh_addr ... sh_addr+sh_size] on the > > area pointed by `e_entry`. > > > > This causes that image->start is calculated twice, once for .text and > > another time for .text.hot. The second calculation leaves image->start > > in a random location. > > > > Because of this, the system crashes inmediatly after: > > > > kexec_core: Starting new kernel > > Great analysis! > > > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> > > --- > > kexec: Fix kexec_file_load for llvm16 > > > > When upreving llvm I realised that kexec stopped working on my test > > platform. This patch fixes it. > > > > To: Eric Biederman <ebiederm@xmission.com> > > Cc: Baoquan He <bhe@redhat.com> > > Cc: Philipp Rudo <prudo@redhat.com> > > Cc: kexec@lists.infradead.org > > Cc: linux-kernel@vger.kernel.org > > --- > > Changes in v3: > > - Fix initial value. Thanks Ross! > > - Link to v2: https://lore.kernel.org/r/20230321-kexec_clang16-v2-0-d10e5d517869@chromium.org > > > > Changes in v2: > > - Fix if condition. Thanks Steven!. > > - Update Philipp email. Thanks Baoquan. > > - Link to v1: https://lore.kernel.org/r/20230321-kexec_clang16-v1-0-a768fc2c7c4d@chromium.org > > --- > > kernel/kexec_file.c | 13 ++++++++++++- > > 1 file changed, 12 insertions(+), 1 deletion(-) > > > > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c > > index f1a0e4e3fb5c..25a37d8f113a 100644 > > --- a/kernel/kexec_file.c > > +++ b/kernel/kexec_file.c > > @@ -901,10 +901,21 @@ static int kexec_purgatory_setup_sechdrs(struct purgatory_info *pi, > > } > > > > offset = ALIGN(offset, align); > > + > > + /* > > + * Check if the segment contains the entry point, if so, > > + * calculate the value of image->start based on it. > > + * If the compiler has produced more than one .text sections > > + * (Eg: .text.hot), they are generally after the main .text > > + * section, and they shall not be used to calculate > > + * image->start. So do not re-calculate image->start if it > > + * is not set to the initial value. > > + */ > > if (sechdrs[i].sh_flags & SHF_EXECINSTR && > > pi->ehdr->e_entry >= sechdrs[i].sh_addr && > > pi->ehdr->e_entry < (sechdrs[i].sh_addr > > - + sechdrs[i].sh_size)) { > > + + sechdrs[i].sh_size) && > > + kbuf->image->start == pi->ehdr->e_entry) { > > I'm not entirely sure if this is the solution to go with. As you state > in the comment above this solution assumes that the .text section comes > before any other .text.* section. But this assumption isn't much > stronger than the assumption that there is only a single .text section, > which is used nowadays. > > The best solution I can come up with right now is to introduce a linker > script for the purgatory that simply merges the .text sections into > one. Similar to what I did for s390 in > arch/s390/purgatory/purgatory.lds.S (although for a different reason). > But that would require every architecture to get one. An alternative > would be to find a way to get rid of the -r option on the LD_FLAGS, > which IIRC is the reason why both section overlap in the first place. I tried removing the -r from arch/x86/purgatory/Makefile and that resulted into: [ 115.631578] BUG: unable to handle page fault for address: ffff93224d5c8e20 [ 115.631583] #PF: supervisor write access in kernel mode [ 115.631585] #PF: error_code(0x0002) - not-present page [ 115.631586] PGD 100000067 P4D 100000067 PUD 1001ed067 PMD 132b58067 PTE 0 [ 115.631589] Oops: 0002 [#1] PREEMPT SMP NOPTI [ 115.631592] CPU: 0 PID: 5291 Comm: kexec-lite Tainted: G U 5.15.103-17399-g852a928df601-dirty #19 cd159e0d6a91f03e06035a0a8eb7fc984a8f3e82 [ 115.631594] Hardware name: Google Crota/Crota, BIOS Google_Crota.14505.288.0 11/08/2022 [ 115.631595] RIP: 0010:memcpy_erms+0x6/0x10 [ 115.631599] Code: 5d 00 eb bd eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 cc cc cc cc 66 90 48 89 f8 48 89 d1 <f3> a4 c3 cc cc cc cc 0f 1f 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe [ 115.631601] RSP: 0018:ffff93224f65fe50 EFLAGS: 00010246 [ 115.631602] RAX: ffff93224d5c8e20 RBX: 00000000ffffffea RCX: 0000000000000100 [ 115.631603] RDX: 0000000000000100 RSI: ffff9322407bd000 RDI: ffff93224d5c8e20 [ 115.631604] RBP: ffff93224f65fe88 R08: 0000000000000000 R09: ffff92133cd3ef08 [ 115.631605] R10: ffff9322407be000 R11: ffffffffa1b4f2e0 R12: 0000000000000000 [ 115.631606] R13: ffff92133cee4c00 R14: 0000000000000100 R15: ffffffffa2b6f14f [ 115.631607] FS: 000078e8b9dbf7c0(0000) GS:ffff921437800000(0000) knlGS:0000000000000000 [ 115.631609] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 115.631610] CR2: ffff93224d5c8e20 CR3: 000000015be26001 CR4: 0000000000770ef0 [ 115.631611] PKRU: 55555554 [ 115.631612] Call Trace: [ 115.631614] <TASK> [ 115.631615] kexec_purgatory_get_set_symbol+0x82/0xd3 [ 115.631619] __se_sys_kexec_file_load+0x523/0x644 [ 115.631621] do_syscall_64+0x58/0xa5 [ 115.631623] entry_SYSCALL_64_after_hwframe+0x61/0xcb And I did not continue in that direction. I also tried finding a flag for llvm that would avoid splitting .text, but was not lucky either. I will look into making a linker script for x86, we could combine it with something like: if (sechdrs[i].sh_flags & SHF_EXECINSTR && pi->ehdr->e_entry >= sechdrs[i].sh_addr && pi->ehdr->e_entry < (sechdrs[i].sh_addr - + sechdrs[i].sh_size) && - kbuf->image->start == pi->ehdr->e_entry) { - kbuf->image->start -= sechdrs[i].sh_addr; - kbuf->image->start += kbuf->mem + offset; + + sechdrs[i].sh_size)) { + if (!WARN_ON(kbuf->image->start != pi->ehdr->e_entry)) { + kbuf->image->start -= sechdrs[i].sh_addr; + kbuf->image->start += kbuf->mem + offset; + } } So developers have some hints of what to look at. Thanks! > > Thanks > Philipp > > > kbuf->image->start -= sechdrs[i].sh_addr; > > kbuf->image->start += kbuf->mem + offset; > > } > > > > --- > > base-commit: 17214b70a159c6547df9ae204a6275d983146f6b > > change-id: 20230321-kexec_clang16-4510c23d129c > > > > Best regards, > -- Ricardo Ribalda
WARNING: multiple messages have this Message-ID (diff)
From: Ricardo Ribalda <ribalda@chromium.org> To: Philipp Rudo <prudo@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com>, linux-kernel@vger.kernel.org, kexec@lists.infradead.org, Baoquan He <bhe@redhat.com> Subject: Re: [PATCH v3] kexec: Support purgatories with .text.hot sections Date: Mon, 27 Mar 2023 13:52:08 +0200 [thread overview] Message-ID: <CANiDSCtu8oOn9vV9eak=S2RDVVO9yan2BO8K5ia9jALABqiwjQ@mail.gmail.com> (raw) In-Reply-To: <20230324165855.23084947@rotkaeppchen> Hi Philipp On Fri, 24 Mar 2023 at 17:00, Philipp Rudo <prudo@redhat.com> wrote: > > Hi Ricardo, > > On Wed, 22 Mar 2023 20:09:21 +0100 > Ricardo Ribalda <ribalda@chromium.org> wrote: > > > Clang16 links the purgatory text in two sections: > > > > [ 1] .text PROGBITS 0000000000000000 00000040 > > 00000000000011a1 0000000000000000 AX 0 0 16 > > [ 2] .rela.text RELA 0000000000000000 00003498 > > 0000000000000648 0000000000000018 I 24 1 8 > > ... > > [17] .text.hot. PROGBITS 0000000000000000 00003220 > > 000000000000020b 0000000000000000 AX 0 0 1 > > [18] .rela.text.hot. RELA 0000000000000000 00004428 > > 0000000000000078 0000000000000018 I 24 17 8 > > > > And both of them have their range [sh_addr ... sh_addr+sh_size] on the > > area pointed by `e_entry`. > > > > This causes that image->start is calculated twice, once for .text and > > another time for .text.hot. The second calculation leaves image->start > > in a random location. > > > > Because of this, the system crashes inmediatly after: > > > > kexec_core: Starting new kernel > > Great analysis! > > > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> > > --- > > kexec: Fix kexec_file_load for llvm16 > > > > When upreving llvm I realised that kexec stopped working on my test > > platform. This patch fixes it. > > > > To: Eric Biederman <ebiederm@xmission.com> > > Cc: Baoquan He <bhe@redhat.com> > > Cc: Philipp Rudo <prudo@redhat.com> > > Cc: kexec@lists.infradead.org > > Cc: linux-kernel@vger.kernel.org > > --- > > Changes in v3: > > - Fix initial value. Thanks Ross! > > - Link to v2: https://lore.kernel.org/r/20230321-kexec_clang16-v2-0-d10e5d517869@chromium.org > > > > Changes in v2: > > - Fix if condition. Thanks Steven!. > > - Update Philipp email. Thanks Baoquan. > > - Link to v1: https://lore.kernel.org/r/20230321-kexec_clang16-v1-0-a768fc2c7c4d@chromium.org > > --- > > kernel/kexec_file.c | 13 ++++++++++++- > > 1 file changed, 12 insertions(+), 1 deletion(-) > > > > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c > > index f1a0e4e3fb5c..25a37d8f113a 100644 > > --- a/kernel/kexec_file.c > > +++ b/kernel/kexec_file.c > > @@ -901,10 +901,21 @@ static int kexec_purgatory_setup_sechdrs(struct purgatory_info *pi, > > } > > > > offset = ALIGN(offset, align); > > + > > + /* > > + * Check if the segment contains the entry point, if so, > > + * calculate the value of image->start based on it. > > + * If the compiler has produced more than one .text sections > > + * (Eg: .text.hot), they are generally after the main .text > > + * section, and they shall not be used to calculate > > + * image->start. So do not re-calculate image->start if it > > + * is not set to the initial value. > > + */ > > if (sechdrs[i].sh_flags & SHF_EXECINSTR && > > pi->ehdr->e_entry >= sechdrs[i].sh_addr && > > pi->ehdr->e_entry < (sechdrs[i].sh_addr > > - + sechdrs[i].sh_size)) { > > + + sechdrs[i].sh_size) && > > + kbuf->image->start == pi->ehdr->e_entry) { > > I'm not entirely sure if this is the solution to go with. As you state > in the comment above this solution assumes that the .text section comes > before any other .text.* section. But this assumption isn't much > stronger than the assumption that there is only a single .text section, > which is used nowadays. > > The best solution I can come up with right now is to introduce a linker > script for the purgatory that simply merges the .text sections into > one. Similar to what I did for s390 in > arch/s390/purgatory/purgatory.lds.S (although for a different reason). > But that would require every architecture to get one. An alternative > would be to find a way to get rid of the -r option on the LD_FLAGS, > which IIRC is the reason why both section overlap in the first place. I tried removing the -r from arch/x86/purgatory/Makefile and that resulted into: [ 115.631578] BUG: unable to handle page fault for address: ffff93224d5c8e20 [ 115.631583] #PF: supervisor write access in kernel mode [ 115.631585] #PF: error_code(0x0002) - not-present page [ 115.631586] PGD 100000067 P4D 100000067 PUD 1001ed067 PMD 132b58067 PTE 0 [ 115.631589] Oops: 0002 [#1] PREEMPT SMP NOPTI [ 115.631592] CPU: 0 PID: 5291 Comm: kexec-lite Tainted: G U 5.15.103-17399-g852a928df601-dirty #19 cd159e0d6a91f03e06035a0a8eb7fc984a8f3e82 [ 115.631594] Hardware name: Google Crota/Crota, BIOS Google_Crota.14505.288.0 11/08/2022 [ 115.631595] RIP: 0010:memcpy_erms+0x6/0x10 [ 115.631599] Code: 5d 00 eb bd eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 cc cc cc cc 66 90 48 89 f8 48 89 d1 <f3> a4 c3 cc cc cc cc 0f 1f 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe [ 115.631601] RSP: 0018:ffff93224f65fe50 EFLAGS: 00010246 [ 115.631602] RAX: ffff93224d5c8e20 RBX: 00000000ffffffea RCX: 0000000000000100 [ 115.631603] RDX: 0000000000000100 RSI: ffff9322407bd000 RDI: ffff93224d5c8e20 [ 115.631604] RBP: ffff93224f65fe88 R08: 0000000000000000 R09: ffff92133cd3ef08 [ 115.631605] R10: ffff9322407be000 R11: ffffffffa1b4f2e0 R12: 0000000000000000 [ 115.631606] R13: ffff92133cee4c00 R14: 0000000000000100 R15: ffffffffa2b6f14f [ 115.631607] FS: 000078e8b9dbf7c0(0000) GS:ffff921437800000(0000) knlGS:0000000000000000 [ 115.631609] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 115.631610] CR2: ffff93224d5c8e20 CR3: 000000015be26001 CR4: 0000000000770ef0 [ 115.631611] PKRU: 55555554 [ 115.631612] Call Trace: [ 115.631614] <TASK> [ 115.631615] kexec_purgatory_get_set_symbol+0x82/0xd3 [ 115.631619] __se_sys_kexec_file_load+0x523/0x644 [ 115.631621] do_syscall_64+0x58/0xa5 [ 115.631623] entry_SYSCALL_64_after_hwframe+0x61/0xcb And I did not continue in that direction. I also tried finding a flag for llvm that would avoid splitting .text, but was not lucky either. I will look into making a linker script for x86, we could combine it with something like: if (sechdrs[i].sh_flags & SHF_EXECINSTR && pi->ehdr->e_entry >= sechdrs[i].sh_addr && pi->ehdr->e_entry < (sechdrs[i].sh_addr - + sechdrs[i].sh_size) && - kbuf->image->start == pi->ehdr->e_entry) { - kbuf->image->start -= sechdrs[i].sh_addr; - kbuf->image->start += kbuf->mem + offset; + + sechdrs[i].sh_size)) { + if (!WARN_ON(kbuf->image->start != pi->ehdr->e_entry)) { + kbuf->image->start -= sechdrs[i].sh_addr; + kbuf->image->start += kbuf->mem + offset; + } } So developers have some hints of what to look at. Thanks! > > Thanks > Philipp > > > kbuf->image->start -= sechdrs[i].sh_addr; > > kbuf->image->start += kbuf->mem + offset; > > } > > > > --- > > base-commit: 17214b70a159c6547df9ae204a6275d983146f6b > > change-id: 20230321-kexec_clang16-4510c23d129c > > > > Best regards, > -- Ricardo Ribalda _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
next prev parent reply other threads:[~2023-03-27 11:52 UTC|newest] Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top 2023-03-22 19:09 [PATCH v3] kexec: Support purgatories with .text.hot sections Ricardo Ribalda 2023-03-22 19:09 ` Ricardo Ribalda 2023-03-22 20:42 ` Ross Zwisler 2023-03-22 20:42 ` Ross Zwisler 2023-03-22 20:57 ` Ricardo Ribalda 2023-03-22 20:57 ` Ricardo Ribalda 2023-03-24 15:58 ` Philipp Rudo 2023-03-24 15:58 ` Philipp Rudo 2023-03-27 11:52 ` Ricardo Ribalda [this message] 2023-03-27 11:52 ` Ricardo Ribalda 2023-04-03 14:35 ` Philipp Rudo 2023-04-03 14:35 ` Philipp Rudo
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to='CANiDSCtu8oOn9vV9eak=S2RDVVO9yan2BO8K5ia9jALABqiwjQ@mail.gmail.com' \ --to=ribalda@chromium.org \ --cc=bhe@redhat.com \ --cc=ebiederm@xmission.com \ --cc=kexec@lists.infradead.org \ --cc=linux-kernel@vger.kernel.org \ --cc=prudo@redhat.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.