From: lijiang <lijiang@redhat.com>
To: "Lendacky, Thomas" <Thomas.Lendacky@amd.com>,
	Baoquan He <bhe@redhat.com>
Cc: "kexec@lists.infradead.org" <kexec@lists.infradead.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: The current SME implementation fails kexec/kdump kernel booting.
Date: Tue, 11 Jun 2019 17:52:55 +0800
Message-ID: <33b9237f-5e8c-fe49-4f55-220ce9a492fb@redhat.com>
In-Reply-To: <2fe0e56c-9286-b71d-3d6d-c2a6fbcfba89@redhat.com>

On 2019-06-09 11:45, lijiang wrote:
> On 2019-06-06 00:04, Lendacky, Thomas wrote:
>> On 6/4/19 7:56 PM, Baoquan He wrote:
>>> On 06/04/19 at 03:56pm, Lendacky, Thomas wrote:
>>>> On 6/4/19 8:49 AM, Baoquan He wrote:
>>>>> Hi Tom,
>>>>>
>>>>> Lianbo reported that the kdump kernel can't boot with 'nokaslr' added,
>>>>> and that KASLR has to be enabled in the kdump kernel for it to boot
>>>>> successfully. This blocked his work on enabling SME for kexec/kdump. And
>>>>> on some machines the SME kernel can't boot as the 1st kernel.
>>>>>
>>>>> I checked the code of the SME implementation and found the root cause.
>>>>> The above failures are caused by the SME code in sme_encrypt_kernel(). In
>>>>> sme_encrypt_kernel(), you take a 2M encryption work area as an
>>>>> intermediate buffer to encrypt the kernel in place. And the work area is
>>>>> just after the kernel's _end.
>>>>
>>>> I remember worrying about something like this back when I was testing the
>>>> kexec support. I had come up with a patch to address it, but never got the
>>>> time to test and submit it.  I've included it here if you'd like to test
>>>> it (I haven't run this patch in quite some time). If it works, we can
>>>> think about submitting it.
>>>
>>> Thanks for your quick response and making this patch, Tom.
>>>
>>> Tested on a speedway machine: it entered the kernel but failed at the
>>> stage below. Tested twice; it happened both times.
>>
>> Is this the initial kernel boot or the kexec kernel boot?
>>
>> It looks like this is related to the initrd/initramfs decryption. Not
>> sure what could be happening there. I just tried the patch on my Naples
>> system and a 5.2.0-rc3 kernel and have been able to repeatedly kexec boot
>> a number of times so far.
>>
> 
> I used the hacked kexec-tools (by Baoquan) to test it, and both the kexec-d
> kernel and the kdump kernel worked well. But Tom's patch only worked for the
> kexec-d kernel; the kdump kernel did not work (it could not boot successfully).
> What's the difference between them?
> 

After applying Tom's patch, I changed the memory reserved for the crash kernel to
more than 256M, such as crashkernel=320M, 384M, or 512M; the kdump kernel then
works and successfully dumps the vmcore.
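
A minimal sketch for checking which side of that 256M boundary a given kernel
command line falls on. The helper name crashkernel_bytes is hypothetical, only
the plain crashkernel=size[KMG][@offset] form is handled, and the 256M figure
is just the boundary observed in these tests, not a kernel rule:

```python
import re

# Multipliers for the K/M/G suffixes accepted by crashkernel=size[KMG].
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def crashkernel_bytes(cmdline):
    """Return the simple crashkernel= reservation size in bytes, or None.

    Only the plain size[KMG][@offset] form is recognized; the range-based
    and ,high/,low variants are deliberately ignored in this sketch.
    """
    m = re.search(
        r"(?:^|\s)crashkernel=(\d+)([KMGkmg]?)(?:@\S+)?(?=\s|$)", cmdline
    )
    if m is None:
        return None
    return int(m.group(1)) * _UNITS[m.group(2).upper()]


# Boundary observed in this thread (an assumption, not a kernel constant).
OBSERVED_BOUNDARY = 256 << 20

for arg in ("crashkernel=256M", "crashkernel=320M", "crashkernel=512M"):
    size = crashkernel_bytes("ro quiet " + arg)
    print(arg, "above boundary" if size > OBSERVED_BOUNDARY else "at or below")
```

On a running system the same function could be fed the contents of
/proc/cmdline.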

But with 256M or less reserved, the kdump kernel always panicked or could not boot
successfully. On the HP machine I noticed that it printed OOM: the kdump kernel had
too little memory. But I never saw the OOM on the speedway machine (probably
because earlyprintk doesn't work there and many logs are lost).

After removing the option 'CONFIG_DEBUG_INFO' from .config, I tested again: the kdump
kernel did not panic with crashkernel=256M, and it worked and succeeded in dumping
the vmcore on both the HP machine and the speedway machine.

It seems that too little memory caused the previous failures in the kdump kernel. I
would suggest posting this patch upstream. What's your opinion, Tom, Baoquan, and
others? Do you have any comments?

Thanks.
Lianbo

> Thanks
> Lianbo
> 
>> Thanks,
>> Tom
>>
>>>
>>>
>>> [    4.978521] Freeing unused decrypted memory: 2040K
>>> [    4.983800] Freeing unused kernel image memory: 2344K
>>> [    4.988943] Write protecting the kernel read-only data: 18432k
>>> [    4.995306] Freeing unused kernel image memory: 2012K
>>> [    5.000488] Freeing unused kernel image memory: 256K
>>> [    5.005540] Run /init as init process
>>> [    5.009443] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
>>> [    5.017230] CPU: 0 PID: 1 Comm: init Not tainted 5.2.0-rc2+ #38
>>> [    5.023251] Hardware name: AMD Corporation Speedway/Speedway, BIOS RSW1004B 10/18/2017
>>> [    5.031299] Call Trace:
>>> [    5.033793]  dump_stack+0x46/0x60
>>> [    5.037169]  panic+0xfb/0x2cb
>>> [    5.040191]  do_exit.cold.21+0x59/0x81
>>> [    5.044004]  do_group_exit+0x3a/0xa0
>>> [    5.047640]  __x64_sys_exit_group+0x14/0x20
>>> [    5.051899]  do_syscall_64+0x55/0x1c0
>>> [    5.055627]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>> [    5.060764] RIP: 0033:0x7fa1b1fc9e2e
>>> [    5.064404] Code: Bad RIP value.
>>> [    5.067687] RSP: 002b:00007fffc5abb778 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
>>> [    5.075296] RAX: ffffffffffffffda RBX: 00007fa1b1fd2528 RCX: 00007fa1b1fc9e2e
>>> [    5.082625] RDX: 000000000000007f RSI: 000000000000003c RDI: 000000000000007f
>>> [    5.089879] RBP: 00007fa1b21d8d00 R08: 00000000000000e7 R09: 00007fffc5abb688
>>> [    5.097134] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002
>>> [    5.104386] R13: 0000000000000001 R14: 00007fa1b21d8d40 R15: 00007fa1b21d8d30
>>> [    5.111645] Kernel Offset: disabled
>>> [    5.423002] Rebooting in 10 seconds..
>>> [   15.429641] ACPI MEMORY or I/O RESET_REG.
>>>
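
For reference, the exitcode value in the panic line above uses the wait-status
layout: the exit status sits in bits 8-15 and a terminating signal, if any, in
the low 7 bits. A minimal sketch of decoding it follows; decode_exitcode is a
hypothetical helper name, not a kernel or library function:

```python
def decode_exitcode(code):
    """Split a wait-status style exit code into (status, signal, core_dumped)."""
    signal = code & 0x7F          # low 7 bits: terminating signal, 0 if none
    core_dumped = bool(code & 0x80)  # bit 7: core dump flag
    status = (code >> 8) & 0xFF   # bits 8-15: value passed to exit()
    return status, signal, core_dumped


# Value taken from the "Attempted to kill init!" line in the log.
print(decode_exitcode(0x00007F00))  # -> (127, 0, False)
```

So init exited by itself with status 127 rather than being killed by a signal,
which is consistent with RDI=0x7f and ORIG_RAX=0xe7 (exit_group) in the
register dump.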
