linux-kernel.vger.kernel.org archive mirror
* Does earlyprintk=ttyS0 work for an AMD SNP guest on KVM?
@ 2023-02-16  4:40 Dexuan Cui
  2023-02-16  9:14 ` Jeremi Piotrowski
  0 siblings, 1 reply; 6+ messages in thread
From: Dexuan Cui @ 2023-02-16  4:40 UTC (permalink / raw)
  To: Tom Lendacky, Borislav Petkov, sandipan.das, Gupta, Pankaj,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky, kvm,
	x86
  Cc: Tianyu Lan, linux-hyperv, linux-kernel

Hi all,
With the earlyprintk=ttyS0 kernel parameter, a C-bit mode Linux SNP guest
on Hyper-V always terminates via sev_es_terminate() in
do_boot_stage2_vc(), because early_setup_ghcb() fails:

early_setup_ghcb() ->
  set_page_decrypted() ->
    set_clr_page_flags() ->
      split_large_pmd() ->
        alloc_pgt_page() fails to allocate memory.

static void *alloc_pgt_page(void *context)
{
...
        /* Validate there is space available for a new page. */
        if (pages->pgt_buf_offset >= pages->pgt_buf_size) {
  ...
              return NULL;
        }
...
}

alloc_pgt_page() fails to allocate memory because both
pages->pgt_buf_offset and pages->pgt_buf_size are zero.
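
The failure can be reproduced outside the kernel with a small user-space
model. This is only a sketch: the field names follow the snippet above, but
model_pgt_data, model_alloc_pgt_page() and MODEL_PAGE_SIZE are hypothetical
names, and the real function does more than what is shown here.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* hypothetical stand-in for PAGE_SIZE */

/* Simplified allocator state; field names follow the quoted snippet. */
struct model_pgt_data {
	unsigned char *pgt_buf;
	unsigned long pgt_buf_size;
	unsigned long pgt_buf_offset;
};

/* Hand out the next page of the buffer, or NULL when nothing is left. */
static void *model_alloc_pgt_page(void *context)
{
	struct model_pgt_data *pages = context;
	unsigned char *entry;

	assert(pages != NULL);

	/* Same check as above: 0 >= 0 is true, so an empty buffer fails. */
	if (pages->pgt_buf_offset >= pages->pgt_buf_size)
		return NULL;

	entry = pages->pgt_buf + pages->pgt_buf_offset;
	pages->pgt_buf_offset += MODEL_PAGE_SIZE;
	return entry;
}
```

With pgt_buf_size == 0 the very first call returns NULL, which matches the
behaviour described above; with a two-page buffer the third call fails.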


pgt_data.pgt_buf_size is zero because of this line in
initialize_identity_maps()
     pgt_data.pgt_buf_size = BOOT_PGT_SIZE - BOOT_INIT_PGT_SIZE;

void initialize_identity_maps(void *rmode)
{
...
        top_level_pgt = read_cr3_pa();
        if (p4d_offset((pgd_t *)top_level_pgt, 0) == (p4d_t *)_pgtable) {
                pgt_data.pgt_buf = _pgtable + BOOT_INIT_PGT_SIZE;
                pgt_data.pgt_buf_size = BOOT_PGT_SIZE - BOOT_INIT_PGT_SIZE;
                memset(pgt_data.pgt_buf, 0, pgt_data.pgt_buf_size);
        } else {
                pgt_data.pgt_buf = _pgtable;
                pgt_data.pgt_buf_size = BOOT_PGT_SIZE;
                memset(pgt_data.pgt_buf, 0, pgt_data.pgt_buf_size);
                top_level_pgt = (unsigned long)alloc_pgt_page(&pgt_data);
        }

In arch/x86/include/asm/boot.h, BOOT_PGT_SIZE equals
BOOT_INIT_PGT_SIZE if CONFIG_RANDOMIZE_BASE is not defined 
(which is my case):
 
# define BOOT_INIT_PGT_SIZE     (6*4096)

# ifdef CONFIG_RANDOMIZE_BASE
...
#  ifdef CONFIG_X86_VERBOSE_BOOTUP
#   define BOOT_PGT_SIZE        (19*4096)
#  else /* !CONFIG_X86_VERBOSE_BOOTUP */
#   define BOOT_PGT_SIZE        (17*4096)
#  endif
# else /* !CONFIG_RANDOMIZE_BASE */
#  define BOOT_PGT_SIZE         BOOT_INIT_PGT_SIZE
# endif
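
The arithmetic behind these macros can be checked with a short sketch. It
assumes only the values quoted above (6, 17 and 19 pages); the function names
are hypothetical, and the config options are modelled as run-time booleans
rather than preprocessor symbols so that all combinations can be evaluated in
one build.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_BOOT_INIT_PGT_SIZE (6UL * 4096)

/* BOOT_PGT_SIZE as selected by the #ifdef ladder quoted above. */
static unsigned long model_boot_pgt_size(bool randomize_base,
					 bool verbose_bootup)
{
	if (!randomize_base)
		return MODEL_BOOT_INIT_PGT_SIZE;
	return (verbose_bootup ? 19UL : 17UL) * 4096;
}

/* pgt_buf_size when the first branch of initialize_identity_maps() runs. */
static unsigned long model_first_branch_buf_size(bool randomize_base,
						 bool verbose_bootup)
{
	unsigned long size = model_boot_pgt_size(randomize_base, verbose_bootup);

	/* Guard against unsigned underflow in the subtraction below. */
	assert(size >= MODEL_BOOT_INIT_PGT_SIZE);
	return size - MODEL_BOOT_INIT_PGT_SIZE;
}
```

Without CONFIG_RANDOMIZE_BASE the difference is 0 bytes; with it, 11 or 13
pages remain depending on CONFIG_X86_VERBOSE_BOOTUP.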

I think this means: if CONFIG_RANDOMIZE_BASE is not defined,
earlyprintk=ttyS0 also doesn't work for an SNP guest on KVM? 
Sorry I don't have a KVM environment at hand to test it.

If I define CONFIG_RANDOMIZE_BASE, my C-bit mode SNP guest crashes
even earlier -- it looks like CONFIG_RANDOMIZE_BASE is incompatible
with my guest on Hyper-V for some reason I don't know.

Do you always use CONFIG_RANDOMIZE_BASE for an SNP guest on KVM,
and does earlyprintk=ttyS0 work for you?

Can you please share your thoughts? Thanks!

Thanks,
-- Dexuan


* Re: Does earlyprintk=ttyS0 work for an AMD SNP guest on KVM?
  2023-02-16  4:40 Does earlyprintk=ttyS0 work for an AMD SNP guest on KVM? Dexuan Cui
@ 2023-02-16  9:14 ` Jeremi Piotrowski
  2023-02-16 17:58   ` Dexuan Cui
  0 siblings, 1 reply; 6+ messages in thread
From: Jeremi Piotrowski @ 2023-02-16  9:14 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Tom Lendacky, Borislav Petkov, sandipan.das, Gupta, Pankaj,
	ray.huang, brijesh.singh, michael.roth, kvm, x86, Tianyu Lan,
	linux-hyperv, linux-kernel

On Thu, Feb 16, 2023 at 04:40:14AM +0000, Dexuan Cui wrote:
> Hi all,
> With the earlyprintk=ttyS0 kernel parameter, a C-bit mode Linux SNP guest
> on Hyper-V always decides to crash via sev_es_terminate() in
> do_boot_stage2_vc(), because early_setup_ghcb() fails:
> 
> early_setup_ghcb() ->
>   set_page_decrypted() ->
>     set_clr_page_flags() ->
>       split_large_pmd() ->
>         alloc_pgt_page() fails to allocate memory.
> 
> static void *alloc_pgt_page(void *context)
> {
> ...
>         /* Validate there is space available for a new page. */
>         if (pages->pgt_buf_offset >= pages->pgt_buf_size) {
>   ...
>               return NULL;
>         }
> ...
> }
> 
> alloc_pgt_page() fails to allocate memory because both
> pages->pgt_buf_offset and pages->pgt_buf_size are zero.
> 
> 
> pgt_data.pgt_buf_size is zero because of this line in
> initialize_identity_maps()
>      pgt_data.pgt_buf_size = BOOT_PGT_SIZE - BOOT_INIT_PGT_SIZE;
> 
> void initialize_identity_maps(void *rmode)
> {
> ...
>         top_level_pgt = read_cr3_pa();
>         if (p4d_offset((pgd_t *)top_level_pgt, 0) == (p4d_t *)_pgtable) {
>                 pgt_data.pgt_buf = _pgtable + BOOT_INIT_PGT_SIZE;
>                 pgt_data.pgt_buf_size = BOOT_PGT_SIZE - BOOT_INIT_PGT_SIZE;
>                 memset(pgt_data.pgt_buf, 0, pgt_data.pgt_buf_size);
>         } else {
>                 pgt_data.pgt_buf = _pgtable;
>                 pgt_data.pgt_buf_size = BOOT_PGT_SIZE;
>                 memset(pgt_data.pgt_buf, 0, pgt_data.pgt_buf_size);
>                 top_level_pgt = (unsigned long)alloc_pgt_page(&pgt_data);

I just tested an SNP guest on KVM with and without CONFIG_RANDOMIZE_BASE.
In both cases we end up in the else branch:

  With CONFIG_RANDOMIZE_BASE:    BOOT_PGT_SIZE = 0x13000
  Without CONFIG_RANDOMIZE_BASE: BOOT_PGT_SIZE = 0x6000

So in both cases pgt_data.pgt_buf_size != 0.

Getting into that first branch would require having 5-level paging supported
(CONFIG_X86_5LEVEL=y) and enabled inside the guest; I don't have that on any
hardware I have access to.

Jeremi

>         }
> 
> In arch/x86/include/asm/boot.h, BOOT_PGT_SIZE equals
> BOOT_INIT_PGT_SIZE if CONFIG_RANDOMIZE_BASE is not defined 
> (which is my case):
>  
> # define BOOT_INIT_PGT_SIZE     (6*4096)
> 
> # ifdef CONFIG_RANDOMIZE_BASE
> ...
> #  ifdef CONFIG_X86_VERBOSE_BOOTUP
> #   define BOOT_PGT_SIZE        (19*4096)
> #  else /* !CONFIG_X86_VERBOSE_BOOTUP */
> #   define BOOT_PGT_SIZE        (17*4096)
> #  endif
> # else /* !CONFIG_RANDOMIZE_BASE */
> #  define BOOT_PGT_SIZE         BOOT_INIT_PGT_SIZE
> # endif
> 
> I think this means: if CONFIG_RANDOMIZE_BASE is not defined,
> earlyprintk=ttyS0 also doesn't work for an SNP guest on KVM? 
> Sorry I don't have a KVM environment at hand to test it.
> 
> If I define CONFIG_RANDOMIZE_BASE, my C-bit mode SNP guest crashes
> even earlier -- it looks like CONFIG_RANDOMIZE_BASE is incompatible
> with my guest on Hyper-V due to some reason I don't know.
> 
> Do you always use CONFIG_RANDOMIZE_BASE for a SNP guest on KVM
> and does earlyprintk=ttyS0 work for you?
> 
> Can you please share your thoughts? Thanks!
> 
> Thanks,
> -- Dexuan


* RE: Does earlyprintk=ttyS0 work for an AMD SNP guest on KVM?
  2023-02-16  9:14 ` Jeremi Piotrowski
@ 2023-02-16 17:58   ` Dexuan Cui
  2023-02-17 12:51     ` Jeremi Piotrowski
  0 siblings, 1 reply; 6+ messages in thread
From: Dexuan Cui @ 2023-02-16 17:58 UTC (permalink / raw)
  To: Jeremi Piotrowski
  Cc: Tom Lendacky, Borislav Petkov, sandipan.das, Gupta, Pankaj,
	ray.huang, brijesh.singh, michael.roth, kvm, x86, Tianyu Lan,
	linux-hyperv, linux-kernel

> From: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
> Sent: Thursday, February 16, 2023 1:15 AM
> > ...
> > alloc_pgt_page() fails to allocate memory because both
> > pages->pgt_buf_offset and pages->pgt_buf_size are zero.
> >
> >
> > pgt_data.pgt_buf_size is zero because of this line in
> > initialize_identity_maps()
> >      pgt_data.pgt_buf_size = BOOT_PGT_SIZE - BOOT_INIT_PGT_SIZE;
> >
> > void initialize_identity_maps(void *rmode)
> > {
> > ...
> >         top_level_pgt = read_cr3_pa();
> >         if (p4d_offset((pgd_t *)top_level_pgt, 0) == (p4d_t *)_pgtable) {
> >                 pgt_data.pgt_buf = _pgtable + BOOT_INIT_PGT_SIZE;
> >                 pgt_data.pgt_buf_size = BOOT_PGT_SIZE -
> > BOOT_INIT_PGT_SIZE;
> >                 memset(pgt_data.pgt_buf, 0, pgt_data.pgt_buf_size);
> >         } else {
> >                 pgt_data.pgt_buf = _pgtable;
> >                 pgt_data.pgt_buf_size = BOOT_PGT_SIZE;
> >                 memset(pgt_data.pgt_buf, 0, pgt_data.pgt_buf_size);
> >                 top_level_pgt = (unsigned
> > long)alloc_pgt_page(&pgt_data);
> 
> I just tested an SNP guest on KVM with and without
> CONFIG_RANDOMIZE_BASE.
> In both cases we end up in the else() branch.
> With CONFIG_RANDOMIZE_BASE BOOT_PGT_SIZE=0x13000
> Without CONFIG_RANDOMIZE_BASE BOOT_PGT_SIZE=0x6000.
> 
> So in both cases pgt_data.pgt_buf_size != 0.
> 
> Getting into that first branch would require having 5-level paging supported
> (CONFIG_X86_5LEVEL=y) and enabled inside the guest, I don't have that on
> any
> hardware I have access to.
> 
> Jeremi

CONFIG_X86_5LEVEL is not set for my kernel.

The comment before the first branch says:
  On 4-level paging, p4d_offset(top_level_pgt, 0) is equal to 'top_level_pgt'.

IIUC this means 'top_level_pgt' is equal to '_pgtable'? i.e. without 
CONFIG_RANDOMIZE_BASE, pgt_data.pgt_buf_size should be 0.

Not sure why it's not getting into the first branch for you.


* Re: Does earlyprintk=ttyS0 work for an AMD SNP guest on KVM?
  2023-02-16 17:58   ` Dexuan Cui
@ 2023-02-17 12:51     ` Jeremi Piotrowski
  2023-02-18  2:54       ` Dexuan Cui
  0 siblings, 1 reply; 6+ messages in thread
From: Jeremi Piotrowski @ 2023-02-17 12:51 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Tom Lendacky, Borislav Petkov, sandipan.das, Gupta, Pankaj,
	ray.huang, brijesh.singh, michael.roth, kvm, x86, Tianyu Lan,
	linux-hyperv, linux-kernel

On 16/02/2023 18:58, Dexuan Cui wrote:
>> From: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
>> Sent: Thursday, February 16, 2023 1:15 AM
>>> ...
>>> alloc_pgt_page() fails to allocate memory because both
>>> pages->pgt_buf_offset and pages->pgt_buf_size are zero.
>>>
>>>
>>> pgt_data.pgt_buf_size is zero because of this line in
>>> initialize_identity_maps()
>>>      pgt_data.pgt_buf_size = BOOT_PGT_SIZE - BOOT_INIT_PGT_SIZE;
>>>
>>> void initialize_identity_maps(void *rmode)
>>> {
>>> ...
>>>         top_level_pgt = read_cr3_pa();
>>>         if (p4d_offset((pgd_t *)top_level_pgt, 0) == (p4d_t *)_pgtable) {
>>>                 pgt_data.pgt_buf = _pgtable + BOOT_INIT_PGT_SIZE;
>>>                 pgt_data.pgt_buf_size = BOOT_PGT_SIZE -
>>> BOOT_INIT_PGT_SIZE;
>>>                 memset(pgt_data.pgt_buf, 0, pgt_data.pgt_buf_size);
>>>         } else {
>>>                 pgt_data.pgt_buf = _pgtable;
>>>                 pgt_data.pgt_buf_size = BOOT_PGT_SIZE;
>>>                 memset(pgt_data.pgt_buf, 0, pgt_data.pgt_buf_size);
>>>                 top_level_pgt = (unsigned
>>> long)alloc_pgt_page(&pgt_data);
>>
>> I just tested an SNP guest on KVM with and without
>> CONFIG_RANDOMIZE_BASE.
>> In both cases we end up in the else() branch.
>> With CONFIG_RANDOMIZE_BASE BOOT_PGT_SIZE=0x13000
>> Without CONFIG_RANDOMIZE_BASE BOOT_PGT_SIZE=0x6000.
>>
>> So in both cases pgt_data.pgt_buf_size != 0.
>>
>> Getting into that first branch would require having 5-level paging supported
>> (CONFIG_X86_5LEVEL=y) and enabled inside the guest, I don't have that on
>> any
>> hardware I have access to.
>>
>> Jeremi
> 
> CONFIG_X86_5LEVEL is not set for my kernel.
> 
> The comment before the first branch says:
>   On 4-level paging, p4d_offset(top_level_pgt, 0) is equal to 'top_level_pgt'.
> 
> IIUC this means 'top_level_pgt' is equal to '_pgtable'? i.e. without 
> CONFIG_RANDOMIZE_BASE, pgt_data.pgt_buf_size should be 0.
> 
> Not sure why it's not getting into the first branch for you.

Sorry, I got two things confused here. The relevant part of the comment is this:
"If we came here via startup_32(), cr3 will be _pgtable already".

Booting a (non-SNP) guest via BIOS I end up in the first branch. Upstream SNP support
requires OVMF (UEFI) so we'll always reach the kernel in 64-bit mode (startup_64?),
and end up in the second branch.
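
The difference between the two entry paths can be sketched as below. This is
a hypothetical model, not kernel code: it assumes 4-level paging, so that the
branch condition reduces to "cr3 already points at _pgtable", and it only
tracks how much of the page-table buffer is left after
initialize_identity_maps(). All names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE 4096UL

struct model_pgt_layout {
	unsigned long buf_start;	/* offset of pgt_buf from _pgtable */
	unsigned long buf_size;		/* pgt_buf_size */
	unsigned long buf_used;		/* bytes already handed out */
};

/* Mirror the two branches of initialize_identity_maps() quoted earlier. */
static struct model_pgt_layout
model_identity_map_layout(bool cr3_is_pgtable, unsigned long boot_pgt_size,
			  unsigned long boot_init_pgt_size)
{
	struct model_pgt_layout l;

	assert(boot_pgt_size >= boot_init_pgt_size);

	if (cr3_is_pgtable) {
		/* startup_32() path: keep the tables that are already in use. */
		l.buf_start = boot_init_pgt_size;
		l.buf_size = boot_pgt_size - boot_init_pgt_size;
		l.buf_used = 0;
	} else {
		/* 64-bit entry: whole buffer, minus one new top-level table. */
		l.buf_start = 0;
		l.buf_size = boot_pgt_size;
		l.buf_used = MODEL_PAGE_SIZE;
	}
	return l;
}

/* Pages still available for later allocations such as a PMD split. */
static unsigned long model_pages_left(struct model_pgt_layout l)
{
	return (l.buf_size - l.buf_used) / MODEL_PAGE_SIZE;
}
```

With BOOT_PGT_SIZE == BOOT_INIT_PGT_SIZE == 0x6000, the first branch leaves
0 pages while the second leaves 5, which is consistent with the two results
reported in this thread.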

Jeremi


* RE: Does earlyprintk=ttyS0 work for an AMD SNP guest on KVM?
  2023-02-17 12:51     ` Jeremi Piotrowski
@ 2023-02-18  2:54       ` Dexuan Cui
  2023-03-17 18:07         ` Tom Lendacky
  0 siblings, 1 reply; 6+ messages in thread
From: Dexuan Cui @ 2023-02-18  2:54 UTC (permalink / raw)
  To: Jeremi Piotrowski
  Cc: Tom Lendacky, Borislav Petkov, sandipan.das, Gupta, Pankaj,
	ray.huang, brijesh.singh, michael.roth, kvm, x86, Tianyu Lan,
	linux-hyperv, linux-kernel

> From: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
> Sent: Friday, February 17, 2023 4:51 AM
> To: Dexuan Cui <decui@microsoft.com>
> > [...]
> > The comment before the first branch says:
> >   On 4-level paging, p4d_offset(top_level_pgt, 0) is equal to 'top_level_pgt'.
> >
> > IIUC this means 'top_level_pgt' is equal to '_pgtable'? i.e. without
> > CONFIG_RANDOMIZE_BASE, pgt_data.pgt_buf_size should be 0.
> >
> > Not sure why it's not getting into the first branch for you.
> 
> Sorry, I got two things confused here. The relevant part of the comment is this:
> "If we came here via startup_32(), cr3 will be _pgtable already".
> 
> Booting a (non-SNP) guest via BIOS I end up in the first branch. Upstream SNP
> support requires OVMF (UEFI) so we'll always reach the kernel in 64-bit mode
> (startup_64?), and end up in the second branch.
> 
> Jeremi

Here I'm running a C-bit mode SNP guest on Hyper-V via "direct-boot" (i.e.
I run Set-VMFirmware to tell Hyper-V to boot the kernel directly without
UEFI). It looks like startup_32 in arch/x86/boot/compressed/head_64.S runs
first and calls startup_64 later (?). This might explain why I'm getting into
the first branch, which I hope someone can fix...


* Re: Does earlyprintk=ttyS0 work for an AMD SNP guest on KVM?
  2023-02-18  2:54       ` Dexuan Cui
@ 2023-03-17 18:07         ` Tom Lendacky
  0 siblings, 0 replies; 6+ messages in thread
From: Tom Lendacky @ 2023-03-17 18:07 UTC (permalink / raw)
  To: Dexuan Cui, Jeremi Piotrowski
  Cc: Borislav Petkov, sandipan.das, Gupta, Pankaj, ray.huang,
	brijesh.singh, michael.roth, kvm, x86, Tianyu Lan, linux-hyperv,
	linux-kernel

On 2/17/23 20:54, Dexuan Cui wrote:
>> From: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
>> Sent: Friday, February 17, 2023 4:51 AM
>> To: Dexuan Cui <decui@microsoft.com>
>>> [...]
>>> The comment before the first branch says:
>>>    On 4-level paging, p4d_offset(top_level_pgt, 0) is equal to 'top_level_pgt'.
>>>
>>> IIUC this means 'top_level_pgt' is equal to '_pgtable'? i.e. without
>>> CONFIG_RANDOMIZE_BASE, pgt_data.pgt_buf_size should be 0.
>>>
>>> Not sure why it's not getting into the first branch for you.
>>
>> Sorry, I got two things confused here. The relevant part of the comment is this:
>> "If we came here via startup_32(), cr3 will be _pgtable already".
>>
>> Booting a (non-SNP) guest via BIOS I end up in the first branch. Upstream SNP
>> support requires OVMF (UEFI) so we'll always reach the kernel in 64-bit mode
>> (startup_64?), and end up in the second branch.
>>
>> Jeremi
> 
> Here I'm running a C-bit mode SNP guest on Hyper-V via "direct-boot" (i.e.
> I run Set-VMFirmware to tell Hyper-V to boot the kernel directly without
> UEFI). Looks like arch/x86/boot/compressed/head_64.S: startup_32 runs
> first and calls startup_64 later (?) This might explain why I'm getting into
> the first branch, which I hope could be fixed by someone...

It sounds like there aren't enough pages available to satisfy the page
split needed to make the GHCB shared. Have you tried increasing
BOOT_INIT_PGT_SIZE by 1 page? Splitting the page will require an
additional page table; I think that is all that would be needed.
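
As a rough check of the "one additional page table" estimate, here is a
sketch of the arithmetic, assuming the GHCB sits inside a 2 MiB mapping that
has to be split into 4 KiB pages and that page-table entries are 8 bytes, as
on x86-64. The macro and function names are hypothetical.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE       4096UL			/* 4 KiB base page */
#define MODEL_LARGE_PAGE_SIZE (2UL * 1024 * 1024)	/* 2 MiB PMD mapping */
#define MODEL_ENTRY_SIZE      8UL			/* bytes per entry */

/* Number of 4 KiB entries needed to cover one 2 MiB mapping. */
static unsigned long model_split_entries(void)
{
	assert(MODEL_LARGE_PAGE_SIZE % MODEL_PAGE_SIZE == 0);
	return MODEL_LARGE_PAGE_SIZE / MODEL_PAGE_SIZE;
}

/* Page-table pages the allocator must supply for one split (rounded up). */
static unsigned long model_split_table_pages(void)
{
	unsigned long bytes = model_split_entries() * MODEL_ENTRY_SIZE;

	return (bytes + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}
```

The split needs 512 entries, which fill exactly one 4 KiB page, so a single
successful alloc_pgt_page() call would cover it under these assumptions.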

Thanks,
Tom


