From: Glauber Costa <glommer@redhat.com>
To: Avi Kivity <avi@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, aliguori@us.ibm.com
Subject: Re: [PATCH 01/16] KVM-HDR: register KVM basic header infrastructure
Date: Wed, 26 Jan 2011 13:36:10 -0200
Message-ID: <1296056170.3591.14.camel@mothafucka.localdomain>
In-Reply-To: <4D4039CB.6060008@redhat.com>

On Wed, 2011-01-26 at 17:12 +0200, Avi Kivity wrote:
> On 01/26/2011 02:13 PM, Glauber Costa wrote:
> > >
> > >  - it doesn't lend itself well to live migration.  Extra state must be
> > >  maintained in the hypervisor.
> > Yes, but can be queried at any time as well. I don't do it in this
> > patch, but this is explicitly mentioned in my TODO.
> 
> Using the existing method (MSRs) takes care of this, which reduces churn.

No, it doesn't.

First, we have to explicitly list some MSRs for save/restore in
userspace anyway. But also, an MSR only holds a value. In the case I'm
trying to address here, where MSRs are used to register something, like
kvmclock, there is usually accompanying code as well.
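
To make the point concrete, here is a minimal toy sketch, not KVM code.
Every name in it (toy_vcpu, TOY_MSR_PLAIN, TOY_MSR_REGISTER, the
"bit 0 = enable" convention) is an assumption made up for illustration.
It shows a value-only MSR next to a registering MSR whose write has a
side effect, and a migration helper that walks an explicit list and
replays the writes so the registration code runs again on the
destination.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSR indices, for illustration only. */
#define TOY_MSR_PLAIN     0x100u  /* only stores a value */
#define TOY_MSR_REGISTER  0x101u  /* registers a guest memory area */

struct toy_vcpu {
    uint64_t plain_value;  /* state fully described by the MSR value */
    uint64_t area_gpa;     /* guest-physical address of the registered area */
    bool     area_active;  /* side effect: host keeps the area updated */
};

/* Writing the registering MSR runs code in addition to storing a value. */
static int toy_wrmsr(struct toy_vcpu *v, uint32_t msr, uint64_t data)
{
    switch (msr) {
    case TOY_MSR_PLAIN:
        v->plain_value = data;
        return 0;
    case TOY_MSR_REGISTER:
        v->area_gpa = data & ~1ull;
        v->area_active = (data & 1) != 0;  /* bit 0 = enable (assumed) */
        return 0;
    default:
        return -1;  /* unknown MSR; a real guest would see a #GP */
    }
}

static int toy_rdmsr(const struct toy_vcpu *v, uint32_t msr, uint64_t *data)
{
    switch (msr) {
    case TOY_MSR_PLAIN:
        *data = v->plain_value;
        return 0;
    case TOY_MSR_REGISTER:
        *data = v->area_gpa | (v->area_active ? 1u : 0u);
        return 0;
    default:
        return -1;
    }
}

/* The MSRs to migrate still have to be listed explicitly. */
static const uint32_t toy_msrs_to_save[] = { TOY_MSR_PLAIN, TOY_MSR_REGISTER };

/* Read each listed MSR from src and replay the write on dst. */
static int toy_migrate(const struct toy_vcpu *src, struct toy_vcpu *dst)
{
    size_t n = sizeof(toy_msrs_to_save) / sizeof(toy_msrs_to_save[0]);

    for (size_t i = 0; i < n; i++) {
        uint64_t val;

        if (toy_rdmsr(src, toy_msrs_to_save[i], &val) != 0)
            return -1;
        if (toy_wrmsr(dst, toy_msrs_to_save[i], val) != 0)
            return -1;
    }
    return 0;
}
```

If TOY_MSR_REGISTER were left out of toy_msrs_to_save, the destination
would end up with the plain value but no active area, which is the kind
of extra work the MSR approach does not remove by itself.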


> > >  - it isn't how normal hardware operates
> > Since we're trying to go for guest cooperation here, I don't really see
> > a need to stay close to hardware here.
> 
> For Linux there is not much difference, since we can easily adapt it.
> But we don't know the impact on other guests, and we can't refactor 
> them.  Staying close to precedent means it will be easier for other 
> guests to work with a kvm host, if they choose.

I honestly don't see the difference. I am not proposing anything
terribly different. In the end, for this specific point of guest
supportability, it's all 1 msr+cpuid vs n msr+cpuid.

> 
> > >
> > >  what's wrong with extending the normal approach of one msr per feature?
> >
> > * It's harder to do discovery with MSRs. You can't just rely on getting
> > an error before the IDTs are properly set up. The way I am proposing
> > allows us to just try to register a memory area, and get a failure if we
> > can't handle it, at any time
> 
> Use cpuid to ensure that you won't get a #GP.
Again, that increases confusion, IMHO. Your hypervisor may have a
feature while userspace lacks it, and then you end up trying to figure
out why something does not work.

> 
> > * To overcome the above, we had usually relied on cpuids. This requires
> > qemu/userspace cooperation for feature enablement
> 
> We need that anyway.  The kernel cannot enable features on its own since 
> that breaks live migration.

That is true. But easy to overcome as well.

> > * This mechanism just bumps us out to userspace if we can't handle a
> > request. As such, it allows for pure guest kernel ->  userspace
> > communication, that can be used, for instance, to emulate new features
> > in older hypervisors one does not want to change. BTW, maybe there is
> > value in exiting to userspace even if we stick to the
> > one-msr-per-feature approach?
> 
> Yes.
> 
> I'm not 100% happy with emulating MSRs in userspace, but we can think 
> about a mechanism that allows userspace to designate certain MSRs as 
> handled by userspace.
> 
> Before we do that I'd like to see what fraction of MSRs can be usefully 
> emulated in userspace (beyond those that just store a value and ignore it).

None of the existing ones. But, for instance, I was discussing this issue
with Anthony a while ago, and he thinks that, in order to completely avoid
bogus softlockups, qemu/userspace, which is the entity here that knows
when it has stopped (think ctrl+Z or stop + cont, save/restore, etc),
could notify the guest kernel of this directly through a shared variable
like this.
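
A rough sketch of that idea, under heavy assumptions: the layout of the
shared area, the flag name host_was_stopped and the toy watchdog are all
hypothetical and only meant to show the flow. Userspace sets a flag in
the shared area when it resumes the guest, and the guest-side check
treats the elapsed time as host downtime instead of reporting a lockup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared area: written by userspace, read by the guest. */
struct toy_pause_area {
    volatile uint32_t host_was_stopped;  /* nonzero after a stop + cont */
};

/* Hypothetical guest-side watchdog state. */
struct toy_watchdog {
    uint64_t last_touch_ns;  /* last time the watchdog was touched */
    uint64_t threshold_ns;   /* gap after which a lockup is reported */
};

/* Userspace side: called when the VM is resumed. */
static void toy_userspace_on_resume(struct toy_pause_area *a)
{
    a->host_was_stopped = 1;
}

/* Guest side: returns true only if a lockup should be reported. */
static bool toy_watchdog_check(struct toy_watchdog *w,
                               struct toy_pause_area *a,
                               uint64_t now_ns)
{
    if (a->host_was_stopped) {
        /* The gap was host downtime, not a guest lockup. */
        a->host_was_stopped = 0;
        w->last_touch_ns = now_ns;
        return false;
    }
    return now_ns - w->last_touch_ns > w->threshold_ns;
}
```

The guest needs no new feature-specific exit for this; it only has to
look at memory that userspace already knows how to write.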

See, this is not about "new features", but rather about sharing pieces
of memory. So what I'm doing in the end is just generalizing "an MSR for
shared memory", instead of one new MSR for each piece of data.
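
As an illustration of what "an MSR for shared memory" could look like,
here is a small sketch. It is not the code from this series: the
descriptor layout, the type numbers, and toy_userspace_register() are
assumptions invented for the example. A single registration entry point
takes a typed descriptor; the kernel handles the types it knows, bumps
everything else out to userspace, and the guest gets a plain failure if
nobody can handle the request.

```c
#include <stdint.h>

/* Hypothetical area types; numbering is made up for this sketch. */
enum toy_area_type {
    TOY_AREA_KVMCLOCK = 0,
    TOY_AREA_ASYNC_PF = 1,
    TOY_AREA_KERNEL_MAX,             /* types below this are in-kernel */
    TOY_AREA_USER_DEFINED = 100      /* known only to userspace */
};

/* Descriptor the guest hands over through the one registration MSR. */
struct toy_area_desc {
    uint32_t type;
    uint32_t size;
    uint64_t gpa;   /* guest-physical address of the area */
};

enum toy_handler { TOY_NOBODY, TOY_KERNEL, TOY_USERSPACE };

struct toy_host {
    uint64_t kernel_gpa[TOY_AREA_KERNEL_MAX];
};

/* Stand-in for the userspace side of the exit (hypothetical). */
static int toy_userspace_register(const struct toy_area_desc *d)
{
    return d->type == TOY_AREA_USER_DEFINED ? 0 : -1;
}

/* Returns 0 on success, -1 if neither kernel nor userspace handles it. */
static int toy_register_area(struct toy_host *h,
                             const struct toy_area_desc *d,
                             enum toy_handler *who)
{
    *who = TOY_NOBODY;
    if (d->size == 0)
        return -1;

    if (d->type < TOY_AREA_KERNEL_MAX) {
        h->kernel_gpa[d->type] = d->gpa;
        *who = TOY_KERNEL;
        return 0;
    }

    /* The kernel can't handle it: bump the request out to userspace. */
    if (toy_userspace_register(d) == 0) {
        *who = TOY_USERSPACE;
        return 0;
    }
    return -1;  /* the guest sees a failure and can fall back */
}
```

Adding a new piece of shared data then means adding a type, not a new
MSR, and an older kernel still forwards the unknown type to userspace.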

Maybe it was unfortunate that I mentioned async_pf in the description
to begin with.

