* kexec-starting-kernel-problem-on-vm
@ 2021-04-14  7:04 Jingxian He
  2021-04-19  8:26 ` kexec-starting-kernel-problem-on-vm Baoquan He
  0 siblings, 1 reply; 3+ messages in thread
From: Jingxian He @ 2021-04-14  7:04 UTC
  To: kexec; +Cc: hewenliang4, wuxu.wu, horms, hejingxian

We use 'kexec -l' and 'kexec -e' on our virtual machine to upgrade the
Linux kernel. With low probability, the new kernel fails to start because
the SHA-256 check of the initrd segment fails.

The related code is as follows:
/* arch/x86/purgatory/purgatory.c */
static int verify_sha256_digest(void)
{
	struct kexec_sha_region *ptr, *end;
	u8 digest[SHA256_DIGEST_SIZE];
	struct sha256_state sctx;

	sha256_init(&sctx);
	end = purgatory_sha_regions + ARRAY_SIZE(purgatory_sha_regions);

	for (ptr = purgatory_sha_regions; ptr < end; ptr++)
		sha256_update(&sctx, (uint8_t *)(ptr->start), ptr->len);

	sha256_final(&sctx, digest);

	if (memcmp(digest, purgatory_sha256_digest, sizeof(digest)))
		return 1;

	return 0;
}

void purgatory(void)
{
	int ret;

	ret = verify_sha256_digest();
	if (ret) { //<------verify_sha256 fail, entering loop forever
		/* loop forever */
		for (;;)
			;
	}
	copy_backup_region();
}


Our opinion of this problem:
We think that relocating the new kernel depends on the boot CPU running
without interruption. However, the vCPUs may be interrupted by the QEMU
process through async page fault (async_page_fault) interrupts.
This creates a risk of memory being overwritten while the boot vCPU
relocates the new kernel.
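
To illustrate why a stray write between load and execution shows up as a
digest failure, here is a minimal userspace sketch, not the purgatory code
itself. It assumes a hypothetical `struct region` in place of
`struct kexec_sha_region` and uses FNV-1a purely as a short stand-in for
SHA-256; `digest_regions()`, `verify_digest()` and `simulate_overwrite()`
are names made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct kexec_sha_region: one loaded segment. */
struct region {
	const uint8_t *start;
	size_t len;
};

/* FNV-1a over all regions; used here ONLY as a short stand-in for SHA-256. */
static uint64_t digest_regions(const struct region *r, size_t n)
{
	uint64_t h = 14695981039346656037ULL;	/* FNV-1a 64-bit offset basis */

	for (size_t i = 0; i < n; i++) {
		for (size_t j = 0; j < r[i].len; j++) {
			h ^= r[i].start[j];
			h *= 1099511628211ULL;	/* FNV-1a 64-bit prime */
		}
	}
	return h;
}

/* Same contract as verify_sha256_digest(): 0 if unchanged, 1 on mismatch. */
static int verify_digest(const struct region *r, size_t n, uint64_t expected)
{
	return digest_regions(r, n) != expected;
}

/*
 * Simulates the suspected scenario: the digest is recorded at load time,
 * then (optionally) one bit of the initrd buffer is flipped before the
 * check runs. Returns the verification result.
 */
static int simulate_overwrite(int corrupt)
{
	uint8_t kernel[64], initrd[64];
	struct region regions[] = {
		{ kernel, sizeof(kernel) },
		{ initrd, sizeof(initrd) },
	};
	uint64_t expected;

	memset(kernel, 0xAA, sizeof(kernel));
	memset(initrd, 0x55, sizeof(initrd));
	expected = digest_regions(regions, 2);	/* "kexec -l" time */

	if (corrupt)
		initrd[10] ^= 0x01;		/* stray write after load */

	return verify_digest(regions, 2, expected);	/* "kexec -e" time */
}
```

Called from a small test program, simulate_overwrite(0) returns 0 and
simulate_overwrite(1) returns 1; the second case corresponds to purgatory()
entering its endless loop.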

When we enable the KVM_GUEST feature and give the VM less than 500M of
memory, the new kernel fails to start with 'kexec -l/-e' every time.

My last question is:
Are the 'kexec -l' and 'kexec -e' commands not applicable to virtual machines?



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

* Re: kexec-starting-kernel-problem-on-vm
  2021-04-14  7:04 kexec-starting-kernel-problem-on-vm Jingxian He
@ 2021-04-19  8:26 ` Baoquan He
  2021-04-19  8:37   ` kexec-starting-kernel-problem-on-vm David Hildenbrand
  0 siblings, 1 reply; 3+ messages in thread
From: Baoquan He @ 2021-04-19  8:26 UTC
  To: Jingxian He; +Cc: kexec, hewenliang4, wuxu.wu, horms, jasowang, david, peterx

Hi Jingxian,

On 04/14/21 at 03:04pm, Jingxian He wrote:
> We use 'kexec -l' and 'kexec -e' on our virtual machine to upgrade the
> Linux kernel. With low probability, the new kernel fails to start because
> the SHA-256 check of the initrd segment fails.
> 
> The related code is as follows:
> /* arch/x86/purgatory/purgatory.c */
> static int verify_sha256_digest(void)
> {
> 	struct kexec_sha_region *ptr, *end;
> 	u8 digest[SHA256_DIGEST_SIZE];
> 	struct sha256_state sctx;
> 
> 	sha256_init(&sctx);
> 	end = purgatory_sha_regions + ARRAY_SIZE(purgatory_sha_regions);
> 
> 	for (ptr = purgatory_sha_regions; ptr < end; ptr++)
> 		sha256_update(&sctx, (uint8_t *)(ptr->start), ptr->len);
> 
> 	sha256_final(&sctx, digest);
> 
> 	if (memcmp(digest, purgatory_sha256_digest, sizeof(digest)))
> 		return 1;
> 
> 	return 0;
> }
> 
> void purgatory(void)
> {
> 	int ret;
> 
> 	ret = verify_sha256_digest();

I usually use a QEMU/KVM guest to test the kernel, kexec and kdump, and
haven't met this issue; kexec -l/-e works well for me. It seems you are not
using the latest kexec-tools. Otherwise, you can use "-i" ("--no-checks")
to work around this for the time being.

> 	if (ret) { //<------verify_sha256 fail, entering loop forever
> 		/* loop forever */
> 		for (;;)
> 			;
> 	}
> 	copy_backup_region();
> }
> 
> 
> Our opinion of this problem:
> We think that relocating the new kernel depends on the boot CPU running
> without interruption. However, the vCPUs may be interrupted by the QEMU
> process through async page fault (async_page_fault) interrupts.
> This creates a risk of memory being overwritten while the boot vCPU
> relocates the new kernel.
> 
> When we enable the KVM_GUEST feature and give the VM less than 500M of
> memory, the new kernel fails to start with 'kexec -l/-e' every time.

I am not familiar with QEMU/KVM, so I have added several virt experts to
the CC list to see if they have any idea about it. Meanwhile, you might
need to provide the kernel version you are testing. I am also wondering
whether you have tried the latest kexec-tools.

Thanks
Baoquan


* Re: kexec-starting-kernel-problem-on-vm
  2021-04-19  8:26 ` kexec-starting-kernel-problem-on-vm Baoquan He
@ 2021-04-19  8:37   ` David Hildenbrand
  0 siblings, 0 replies; 3+ messages in thread
From: David Hildenbrand @ 2021-04-19  8:37 UTC
  To: Baoquan He, Jingxian He
  Cc: kexec, hewenliang4, wuxu.wu, horms, jasowang, peterx

On 19.04.21 10:26, Baoquan He wrote:
> Hi Jingxian,
> 
> On 04/14/21 at 03:04pm, Jingxian He wrote:
>> We use 'kexec -l' and 'kexec -e' on our virtual machine to upgrade the
>> Linux kernel. With low probability, the new kernel fails to start because
>> the SHA-256 check of the initrd segment fails.
>>
>> The related code is as follows:
>> /* arch/x86/purgatory/purgatory.c */
>> static int verify_sha256_digest(void)
>> {
>> 	struct kexec_sha_region *ptr, *end;
>> 	u8 digest[SHA256_DIGEST_SIZE];
>> 	struct sha256_state sctx;
>>
>> 	sha256_init(&sctx);
>> 	end = purgatory_sha_regions + ARRAY_SIZE(purgatory_sha_regions);
>>
>> 	for (ptr = purgatory_sha_regions; ptr < end; ptr++)
>> 		sha256_update(&sctx, (uint8_t *)(ptr->start), ptr->len);
>>
>> 	sha256_final(&sctx, digest);
>>
>> 	if (memcmp(digest, purgatory_sha256_digest, sizeof(digest)))
>> 		return 1;
>>
>> 	return 0;
>> }
>>
>> void purgatory(void)
>> {
>> 	int ret;
>>
>> 	ret = verify_sha256_digest();
> 
> I usually use a QEMU/KVM guest to test the kernel, kexec and kdump, and
> haven't met this issue; kexec -l/-e works well for me. It seems you are not
> using the latest kexec-tools. Otherwise, you can use "-i" ("--no-checks")
> to work around this for the time being.
> 
>> 	if (ret) { //<------verify_sha256 fail, entering loop forever
>> 		/* loop forever */
>> 		for (;;)
>> 			;
>> 	}
>> 	copy_backup_region();
>> }
>>
>>
>> Our opinion of this problem:
>> We think that relocating the new kernel depends on the boot CPU running
>> without interruption. However, the vCPUs may be interrupted by the QEMU
>> process through async page fault (async_page_fault) interrupts.

So, are you saying that the host still delivers an APF (async page fault)
to the guest, even though it has interrupts disabled (including APF)? It is
hard to imagine that this would be the case right now.

Any other host activity that temporarily stops or schedules out the vCPU
should not be relevant to the VM ("vCPU interrupted by the QEMU
process"); if there were something running inside the VM that
disables interrupts to reduce the size of a race window, that would need
fixing inside the VM.

-- 
Thanks,

David / dhildenb

