All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brijesh Singh <brijesh.singh@amd.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>, Borislav Petkov <bp@suse.de>
Cc: brijesh.singh@amd.com, Radim Krcmar <rkrcmar@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juergen Gross <jgross@suse.com>,
	Pavel Tatashin <pasha.tatashin@oracle.com>,
	steven.sistare@oracle.com, daniel.m.jordan@oracle.com,
	x86@kernel.org, kvm@vger.kernel.org, "Lendacky,
	Thomas" <Thomas.Lendacky@amd.com>
Subject: Re: [patch 0/7] x86/kvmclock: Remove memblock dependency and further cleanups
Date: Fri, 6 Jul 2018 18:51:16 -0500	[thread overview]
Message-ID: <faa8df8c-f658-7fca-886d-ae741e41f4a8@amd.com> (raw)
In-Reply-To: <f4071613-e151-90ff-7298-1ebc6e94c152@redhat.com>


Adding Tom and Boris


On 7/6/18 12:47 PM, Paolo Bonzini wrote:
> On 06/07/2018 18:13, Thomas Gleixner wrote:
>> To allow early utilization of kvmclock it is required to remove the
>> memblock dependency. memblock is currently used to allocate the per
>> cpu data for kvmclock.
>>
>> The first patch replaces the memblock with a static array sized 64bytes *
>> NR_CPUS and was posted by Pavel. That patch allocates everything statically
>> which is a waste when kvmclock is not used.
>>
>> The rest of the series cleans up the code and converts it to per cpu
>> variables but does not put the kvmclock data into the per cpu area as that
>> has an issue vs. mapping the boot cpu data into the VDSO (leaks arbitrary
>> data, unless page sized).
>>
>> The per cpu data consists of pointers to the actual data. For the boot cpu
>> a page sized array is statically allocated which can be mapped into the
>> VDSO. That array is used for initializing the first 64 CPU pointers. If
>> there are more CPUs the pvclock data is allocated during CPU bringup.
>>
>> So this still will have some overhead when kvmclock is not in use, but
>> bringing it down to zero would be a massive trainwreck and even more
>> indirections.
>>
>> Thanks,
>>
>> 	tglx
>>
>> 8<--------------
>>  a/arch/x86/include/asm/kvm_guest.h |    7 
>>  arch/x86/include/asm/kvm_para.h    |    1 
>>  arch/x86/kernel/kvm.c              |   14 -
>>  arch/x86/kernel/kvmclock.c         |  262 ++++++++++++++-----------------------
>>  arch/x86/kernel/setup.c            |    4 
>>  5 files changed, 105 insertions(+), 183 deletions(-)
>>
>>
>>
>>
> Thanks, this is really nice.  With the small changes from my review,
>
> Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Hi Paolo and Thomas,


This series breaks SEV guest support. The physical address of both
wall_clock and hv_clock is shared with hypervisor for updates. In case
of SEV the address must be mapped as 'decrypted (i.e C=0)' so that both
guest and HV can access the data correctly. The follow patch should map
the pages as decrypted.


diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 890e9e5..640c796 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -251,6 +251,20 @@ static void kvm_shutdown(void)
        native_machine_shutdown();
 }
 
+static void sev_map_clocks_decrypted(void)
+{
+       if (!sev_active())
+               return;
+
+       /*
+        * wall_clock and hv_clock addresses are shared with hypervisor.
+        * When SEV is enabled, any addresses shared with hypervisor must be
+        * mapped decrypted.
+        */
+       early_set_memory_decrypted((unsigned long) wall_clock,
WALL_CLOCK_SIZE);
+       early_set_memory_decrypted((unsigned long) hv_clock, HV_CLOCK_SIZE);
+}
+
 void __init kvmclock_init(void)
 {
        struct pvclock_vcpu_time_info *vcpu_time;
@@ -269,6 +283,8 @@ void __init kvmclock_init(void)
        wall_clock = (struct pvclock_wall_clock *)wall_clock_mem;
        hv_clock = (struct pvclock_vsyscall_time_info *)hv_clock_mem;
 
+       sev_map_clocks_decrypted();
+
        if (kvm_register_clock("primary cpu clock")) {
                hv_clock = NULL;
                wall_clock = NULL;


But this patch triggers the below kernel crash.
early_set_memory_decrypted() uses kernel_physical_mapping_init() to
split the large pages and clear the C-bit. It seems this function still
has dependency with memblock.

[    0.000000] Hypervisor detected: KVM
[    0.000000] Kernel panic - not syncing: alloc_low_pages: ran out of
memory
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.18.0-rc3-sev #19
[    0.000000] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
0.0.0 02/06/2015
[    0.000000] Call Trace:
[    0.000000]  ? dump_stack+0x5c/0x80
[    0.000000]  ? panic+0xe7/0x247
[    0.000000]  ? alloc_low_pages+0x130/0x130
[    0.000000]  ? kernel_physical_mapping_init+0xe0/0x204
[    0.000000]  ? early_set_memory_enc_dec+0x10f/0x160
[    0.000000]  ? 0xffffffffb1000000
[    0.000000]  ? kvmclock_init+0x83/0x20a
[    0.000000]  ? setup_arch+0x42c/0xce6
[    0.000000]  ? start_kernel+0x67/0x531
[    0.000000]  ? load_ucode_bsp+0x76/0x12e
[    0.000000]  ? secondary_startup_64+0xa5/0xb0
[    0.000000] ---[ end Kernel panic - not syncing: alloc_low_pages: ran
out of memory ]---

- Brijesh


  reply	other threads:[~2018-07-06 23:51 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-06 16:13 [patch 0/7] x86/kvmclock: Remove memblock dependency and further cleanups Thomas Gleixner
2018-07-06 16:13 ` [patch 1/7] x86/kvmclock: Remove memblock dependency Thomas Gleixner
2018-07-06 16:13 ` [patch 2/7] x86/kvmclock: Remove page size requirement from wall_clock Thomas Gleixner
2018-07-12  2:15   ` Pavel Tatashin
2018-07-06 16:13 ` [patch 3/7] x86/kvmclock: Decrapify kvm_register_clock() Thomas Gleixner
2018-07-06 17:38   ` Paolo Bonzini
2018-07-06 17:39     ` Thomas Gleixner
2018-07-12  2:24   ` Pavel Tatashin
2018-07-06 16:13 ` [patch 4/7] x86/kvmclock: Cleanup the code Thomas Gleixner
2018-07-06 17:39   ` Paolo Bonzini
2018-07-09  9:05   ` Peter Zijlstra
2018-07-09 10:03     ` Thomas Gleixner
2018-07-09 11:32     ` Paolo Bonzini
2018-07-06 16:13 ` [patch 5/7] x86/kvmclock: Mark variables __initdata and __ro_after_init Thomas Gleixner
2018-07-12  2:31   ` Pavel Tatashin
2018-07-06 16:13 ` [patch 6/7] x86/kvmclock: Move kvmclock vsyscall param and init to kvmclock Thomas Gleixner
2018-07-06 17:43   ` Paolo Bonzini
2018-07-06 19:23     ` Thomas Gleixner
2018-07-12  2:52   ` Pavel Tatashin
2018-07-06 16:13 ` [patch 7/7] x86/kvmclock: Switch kvmclock data to a PER_CPU variable Thomas Gleixner
2018-07-12  3:12   ` Pavel Tatashin
2018-07-06 17:47 ` [patch 0/7] x86/kvmclock: Remove memblock dependency and further cleanups Paolo Bonzini
2018-07-06 23:51   ` Brijesh Singh [this message]
2018-07-09  9:22 ` [patch 8/7] x86/kvmclock: Avoid TSC recalibration Peter Zijlstra
2018-07-12  2:12 ` [patch 0/7] x86/kvmclock: Remove memblock dependency and further cleanups Pavel Tatashin
2018-07-13 22:51   ` Thomas Gleixner
2018-07-14  0:20     ` Pavel Tatashin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=faa8df8c-f658-7fca-886d-ae741e41f4a8@amd.com \
    --to=brijesh.singh@amd.com \
    --cc=Thomas.Lendacky@amd.com \
    --cc=bp@suse.de \
    --cc=daniel.m.jordan@oracle.com \
    --cc=jgross@suse.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pasha.tatashin@oracle.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rkrcmar@redhat.com \
    --cc=steven.sistare@oracle.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.