From: Michael Roth <michael.roth@amd.com>
To: "NOMURA JUNICHI(野村 淳一)" <junichi.nomura@nec.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>, "bp@suse.de" <bp@suse.de>
Subject: Re: [Regression v5.19-rc1] crash kexec fails to boot the 2nd kernel (Re: [PATCH v12 38/46] x86/sev: Add SEV-SNP feature detection/setup)
Date: Tue, 28 Jun 2022 19:41:14 -0500
Message-ID: <20220629004114.zn5rurrmqdkiceun@amd.com>
In-Reply-To: <TYCPR01MB694815CD815E98945F63C99183B49@TYCPR01MB6948.jpnprd01.prod.outlook.com>

On Fri, Jun 24, 2022 at 12:44:52AM +0000, NOMURA JUNICHI(野村 淳一) wrote:
> I found crash kexec fails to boot the 2nd kernel since v5.19-rc1 and
> git bisect points to this as a bad commit:
> 
>   commit b190a043c49af4587f5e157053f909192820522a
>   Author: Michael Roth <michael.roth@amd.com>
>   Date:   Thu Feb 24 10:56:18 2022 -0600
> 
>     x86/sev: Add SEV-SNP feature detection/setup
> 
>     Initial/preliminary detection of SEV-SNP is done via the Confidential
>     Computing blob. Check for it prior to the normal SEV/SME feature
>     initialization, and add some sanity checks to confirm it agrees with
>     SEV-SNP CPUID/MSR bits.
> 
> The problem seems to occur when find_cc_blob_setup_data() walks setup_data
> chain.  If the code is modified to do nothing in find_cc_blob_setup_data(),
> the 2nd kernel boots fine.
> 
> On my system, the chain of setup_data looks like the following on regular boot:
>   setup_data: type=0x3 addr=0x9e9e5018 next=0x9e9dc018
>   setup_data: type=0x3 addr=0x9e9dc018 next=0x9e9d2018
>   setup_data: type=0x3 addr=0x9e9d2018 next=0x8a27b018
>   setup_data: type=0x3 addr=0x8a27b018 next=0x8a218018
>   setup_data: type=0x3 addr=0x8a218018 next=0x9e9a0018
>   setup_data: type=0x3 addr=0x9e9a0018 next=0x8a1e6018
>   setup_data: type=0x3 addr=0x8a1e6018 next=0x8a1b4018
>   setup_data: type=0x3 addr=0x8a1b4018 next=0x8a182018
>   setup_data: type=0x3 addr=0x8a182018 next=0x8a056018
>   setup_data: type=0x3 addr=0x8a056018 next=0x8a020018
>   setup_data: type=0x3 addr=0x8a020018 next=0x89fea018
>   setup_data: type=0x3 addr=0x89fea018 next=0x0
> 
> OTOH, it looks like the following on crash kexec boot:
>   setup_data: type=0x4 addr=0x2e000000 next=0x0

Hi,

Thanks for the debug info. I haven't been able to reproduce this on the
Milan or Cascade Lake systems I've tried, with kexec -l/-p, as well as
with/without -s, so there may be something hardware/environment-specific
going on here. I could really use your help to test possible fixes.

> 
> Other places that parse setup_data use early_memremap() before
> accessing the data (e.g. parse_setup_data()).  I wonder if the lack of
> remapping causes the problem but find_cc_blob is too early in the
> boot process for early_memremap to work.

I think this might be the case. Prior to early_memremap() being
available, we need to rely on the initial identity map set up by the
decompression kernel. It has some stuff to add mappings for boot_params
and whatnot, but I don't see where boot_params->hdr.setup_data is
handled.

If you use kexec -s to force kexec_file_load, then the kernel sets it up
so that boot_params->hdr.setup_data points to some memory just after
boot_params, and boot/compressed uses 2M pages in its identity map, so
that generally ends up handling the whole range.
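
To illustrate why 2M pages tend to cover setup_data in the
kexec_file_load case, here is a rough userspace sketch of the
arithmetic only. It is not the kernel's mapping code: the helper names
(map_start_2m, map_end_2m, covered_by_2m_mapping) are made up for this
example, and it assumes an explicitly mapped byte range is simply
rounded out to 2M boundaries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PMD_SIZE (UINT64_C(1) << 21)	/* 2 MiB */
#define PMD_MASK (~(PMD_SIZE - 1))

/* Hypothetical helpers: round a mapped byte range out to 2M pages. */
static uint64_t map_start_2m(uint64_t start)
{
	return start & PMD_MASK;
}

static uint64_t map_end_2m(uint64_t end)
{
	return (end + PMD_SIZE - 1) & PMD_MASK;
}

/*
 * Return true if [addr, addr + len) lies entirely inside the 2M pages
 * backing an explicit mapping of [map_start, map_end).
 */
static bool covered_by_2m_mapping(uint64_t map_start, uint64_t map_end,
				  uint64_t addr, uint64_t len)
{
	assert(map_start < map_end);

	return addr >= map_start_2m(map_start) &&
	       addr + len <= map_end_2m(map_end);
}
```

With a made-up boot_params mapping of [0x1000000, 0x1001000), a
setup_data entry placed right after it at 0x1001000 is covered, while
an entry at 0x2e000000 (the address from the crash kexec report above)
is not.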

But if you use kexec's default kexec_load functionality, setup_data might
be allocated elsewhere, so in that case we might need explicit mapping. I
noticed on my systems boot_params->hdr.setup_data seems to generally end
up at 0x100000 for some reason, and maybe that addr just happens to
get mapped for other reasons so I don't end up hitting the crash.

Could you give it a shot with the kexec -s flag and see if that works?

If so, can you apply the below potential fix, and retry your original
reproducer?

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index 44c350d627c7..c548950981a2 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -110,6 +110,7 @@ void kernel_add_identity_map(unsigned long start, unsigned long end)
 void initialize_identity_maps(void *rmode)
 {
        unsigned long cmdline;
+       struct setup_data *sd;

        /* Exclude the encryption mask from __PHYSICAL_MASK */
        physical_mask &= ~sme_me_mask;
@@ -163,6 +164,12 @@ void initialize_identity_maps(void *rmode)
        cmdline = get_cmd_line_ptr();
        kernel_add_identity_map(cmdline, cmdline + COMMAND_LINE_SIZE);

+       sd = (struct setup_data *)boot_params->hdr.setup_data;
+       while (sd) {
+               kernel_add_identity_map((unsigned long)sd, (unsigned long)sd + sizeof(*sd) + sd->len);
+               sd = (struct setup_data *)sd->next;
+       }
+
        sev_prep_identity_maps(top_level_pgt);

        /* Load the new page-table. */
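
For anyone who wants to sanity-check the walk outside the kernel, the
loop in the patch can be modelled in userspace roughly as below. This is
only a sketch: struct range and collect_setup_data_ranges() are
hypothetical names for this example, and the struct mirrors the
setup_data field layout (next, type, len, data[]) rather than including
the kernel header. Each entry's range runs from its own address to that
address plus header size plus len, which is the [start, end) pair
kernel_add_identity_map() takes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field layout as the kernel's struct setup_data. */
struct setup_data {
	uint64_t next;	/* address of the next entry, 0 ends the chain */
	uint32_t type;
	uint32_t len;	/* length of data[] only, header not included */
	uint8_t data[];
};

/* Hypothetical container for one [start, end) range to be mapped. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Walk the chain starting at 'first' and record the range each entry
 * occupies. Stores at most 'max' ranges and returns how many were stored.
 */
static size_t collect_setup_data_ranges(uint64_t first, struct range *out,
					size_t max)
{
	const struct setup_data *sd = (const struct setup_data *)(uintptr_t)first;
	size_t n = 0;

	assert(out != NULL);

	while (sd && n < max) {
		uint64_t start = (uint64_t)(uintptr_t)sd;

		out[n].start = start;
		out[n].end = start + sizeof(*sd) + sd->len;
		n++;

		sd = (const struct setup_data *)(uintptr_t)sd->next;
	}

	return n;
}
```

Only the headers are read during the walk, so the payload behind data[]
does not need to be accessible for the ranges to be computed.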


Thanks!

-Mike

> 
> -- 
> Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.
> 


