linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Eugene Batalov <ebatalov@parallels.com>,
	kvm@vger.kernel.org, imammedo@redhat.com,
	linux-kernel@vger.kernel.org, eabatalov89@gmail.com
Subject: Re: [PATCHv1] kvm guest: fix uninitialized kvmclock read by KVM guest
Date: Wed, 19 Jun 2013 15:05:23 +0200	[thread overview]
Message-ID: <51C1AC93.6010001@redhat.com> (raw)
In-Reply-To: <20130618222114.GC13856@amt.cnet>

Il 19/06/2013 00:21, Marcelo Tosatti ha scritto:
> On Sat, Jun 15, 2013 at 10:01:45PM +0400, Eugene Batalov wrote:
>> Due to unintialized kvmclock read KVM guest is hanging on SMP boot stage.
>> If unintialized memory contains fatal garbage then hang reproduction is 100%.
>> Unintialized memory is allocated by memblock_alloc. So the garbage values
>> depend on many many things.
>>
>> See the detailed description of the bug and possible ways to fix it
>> in the kernel bug tracker.
>> "Bug 59521 - KVM linux guest reads uninitialized pvclock values before
>> executing rdmsr MSR_KVM_WALL_CLOCK"
>>
>> I decided to fix it simply returning 0ULL from kvm_clock_read when
>> kvm clocksource is not initialized yet.
>> The same as kernel bootstrap CPU doesn on boot stage when kernel
>> clocksources are not initialized yet.
>>
>> Signed-off-by: Eugene Batalov <ebatalov@parallels.com>
>> ---
>> I dont' use kernel percpu variables because for each SMP CPU
>> their contents are copied from the bootstrap CPU. And I don't
>> think that fixing the value for each SMP CPU is a good style.
>> If you know a better approach to store the is_pv_clock_ready flags
>> I am ready to use it.
>>
>> The patch applies cleanly to
>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>
>> I've tested the changes with Ubuntu 13.04 "raring" userspace and
>> Ubuntu-3.8.0.19-30 kernel tag.
>>
>>  arch/x86/kernel/kvmclock.c |   13 +++++++++++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
>> index 5bedbdd..a6e0af4 100644
>> --- a/arch/x86/kernel/kvmclock.c
>> +++ b/arch/x86/kernel/kvmclock.c
>> @@ -43,6 +43,9 @@ early_param("no-kvmclock", parse_no_kvmclock);
>>  static struct pvclock_vsyscall_time_info *hv_clock;
>>  static struct pvclock_wall_clock wall_clock;
>>  
>> +/* For each cpu store here a flag which tells whether pvclock is initialized */
>> +static int __cacheline_aligned_in_smp is_pv_clock_ready[NR_CPUS] = {};
>> +
>>  /*
>>   * The wallclock is the time of day when we booted. Since then, some time may
>>   * have elapsed since the hypervisor wrote the data. So we try to account for
>> @@ -84,8 +87,11 @@ static cycle_t kvm_clock_read(void)
>>  
>>  	preempt_disable_notrace();
>>  	cpu = smp_processor_id();
>> -	src = &hv_clock[cpu].pvti;
>> -	ret = pvclock_clocksource_read(src);
>> +	if (is_pv_clock_ready[cpu]) {
>> +		src = &hv_clock[cpu].pvti;
>> +		ret = pvclock_clocksource_read(src);
>> +	} else
>> +		ret = 0ULL;
>>  	preempt_enable_notrace();
>>  	return ret;
>>  }
>> @@ -168,6 +174,9 @@ int kvm_register_clock(char *txt)
>>  	printk(KERN_INFO "kvm-clock: cpu %d, msr %x:%x, %s\n",
>>  	       cpu, high, low, txt);
>>  
>> +	if (!ret)
>> +		is_pv_clock_ready[cpu] = 1;
>> +
>>  	return ret;
>>  }
> 
> The path can be really hot, so better avoid it if possible. The patch
> to zero the memblock_alloc'd area from Igor brings the behaviour back
> to before regression: return 0 until kvmclock is initialized. On top of
> your analysis in the BZ, it now seems the right (and safer) thing to do.

Yeah, definitely safer for master and stable.  I'm applying that to master.

But if we can invert cpu_init with early_clock_percpu_init, it would be
nicer to do that for 3.11.

Paolo

  reply	other threads:[~2013-06-19 13:05 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <bug-59521-28872@https.bugzilla.kernel.org/>
2013-06-10 16:31 ` [PATCH] x86: kvmclock: zero initialize pvclock shared memory area Igor Mammedov
2013-06-10 20:19   ` Marcelo Tosatti
2013-06-15 18:01     ` [PATCHv1] kvm guest: fix uninitialized kvmclock read by KVM guest Eugene Batalov
2013-06-18 22:21       ` Marcelo Tosatti
2013-06-19 13:05         ` Paolo Bonzini [this message]
     [not found]           ` <CAJF2t5sYHy9q9a7-fZauf1Z7_FkK1_DOP13GHji=8-vDUsnnsQ@mail.gmail.com>
2013-06-19 13:29             ` Paolo Bonzini
2013-06-20  8:30               ` Igor Mammedov
2013-06-20  8:35                 ` Paolo Bonzini
2013-06-21  9:01 ` [PATCH 0/2 v2] x86: kvmclock: Prevent uninitialized per-cpu kvmclock usage Igor Mammedov
2013-06-21  9:01 ` [PATCH 1/2] x86: kvmclock: zero initialize pvclock shared memory area Igor Mammedov
2013-06-21  9:01 ` [PATCH 2/2] x86: kvmclock: register per-cpu kvmclock at earliest possible time Igor Mammedov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51C1AC93.6010001@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=eabatalov89@gmail.com \
    --cc=ebatalov@parallels.com \
    --cc=imammedo@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).