From: Alexander Graf <agraf@suse.de>
To: Avi Kivity <avi@redhat.com>
Cc: Anthony Liguori <anthony@codemonkey.ws>,
	KVM list <kvm@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	qemu-devel <qemu-devel@nongnu.org>,
	kvm-ppc <kvm-ppc@vger.kernel.org>
Subject: Re: [Qemu-devel] [RFC] Next gen kvm api
Date: Fri, 17 Feb 2012 01:19:35 +0100
Message-ID: <8A20A1D8-9CB7-4256-A9BD-03D972C5292A@suse.de>
In-Reply-To: <4F3D5B35.4000606@redhat.com>


On 16.02.2012, at 20:38, Avi Kivity wrote:

> On 02/16/2012 09:34 PM, Alexander Graf wrote:
>> On 16.02.2012, at 20:24, Avi Kivity wrote:
>> 
>>> On 02/15/2012 04:08 PM, Alexander Graf wrote:
>>>>> 
>>>>> Well, the scatter/gather registers I proposed will give you just one
>>>>> register or all of them.
>>>> 
>>>> One register is hardly any use. We need either all ways for a given address, to do a full-fledged lookup, or all of them.
>>> 
>>> I should have said, just one register, or all of them, or anything in
>>> between.
>>> 
>>>> By sharing the same data structures between qemu and kvm, we actually managed to reuse all of the tcg code for lookups, just like you do for x86.
>>> 
>>> Sharing the data structures is not needed.  Simply synchronize them before
>>> lookup, like we do for ordinary registers.
>> 
>> Ordinary registers are a few bytes. We're talking of dozens of kbytes here.
> 
> A TLB way is a few dozen bytes, no?
> 
>>> 
>>>> On x86 you also have shared memory for page tables, it's just guest visible, hence in guest memory. The concept is the same.
>>> 
>>> But cr3 isn't, and if we put it in shared memory, we'd have to VMREAD it
>>> on every exit.  And you're risking the same thing if your hardware gets
>>> cleverer.
>> 
>> Yes, we do. When that day comes, we forget the CAP and do it another way. Which way, we will find out when that day of cleverer hardware comes :).
> 
> Or we try to be less clever unless we have a really compelling reason. 
> qemu monitor and gdb support aren't compelling reasons to optimize.

The goal here was simplicity, with a dash of concern for performance.

So what do you envision? Should we make all of the MMU walker code in target-ppc KVM-aware, so that it fetches the single way it actually cares about from the kernel on demand? That is pretty intrusive, and it goes against the way KVM integrates today, which is by fitting in nicely with the existing code.

Also, we need to store the guest TLB somewhere. With this model, we can just store it in user space memory, so we keep only a single copy around, reducing memory footprint. If we had to copy it, we would need more than a single copy.
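
To illustrate the memory argument, here is a minimal sketch in C. All names in it (struct tlb_entry, TLB_ENTRIES, kernel_write, tlb_lookup, sync_from_kernel, struct copied_tlb) are hypothetical and made up for this example; it is not the actual KVM or QEMU interface, and a real TLB entry looks different. It only contrasts one array that both sides use in place with a kernel-private array that has to be copied before user space can walk it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TLB_ENTRIES 8 /* hypothetical size, far smaller than a real guest TLB */

/* Hypothetical entry layout, for illustration only. */
struct tlb_entry {
    uint64_t vaddr;
    uint64_t paddr;
    uint32_t valid;
};

/* Copy model: the "kernel" owns one array, user space keeps a second one. */
struct copied_tlb {
    struct tlb_entry kernel[TLB_ENTRIES];
    struct tlb_entry user[TLB_ENTRIES];
};

/* Stand-in for the kernel side installing a guest TLB entry. */
void kernel_write(struct tlb_entry *tlb, size_t idx,
                  uint64_t vaddr, uint64_t paddr)
{
    assert(idx < TLB_ENTRIES);
    tlb[idx].vaddr = vaddr;
    tlb[idx].paddr = paddr;
    tlb[idx].valid = 1;
}

/* Stand-in for the user space MMU walker: it only ever sees a plain array,
 * so it does not need to know which model is in use. */
const struct tlb_entry *tlb_lookup(const struct tlb_entry *tlb, size_t n,
                                   uint64_t vaddr)
{
    for (size_t i = 0; i < n; i++) {
        if (tlb[i].valid && tlb[i].vaddr == vaddr)
            return &tlb[i];
    }
    return NULL;
}

/* Copy model only: must run before every lookup that needs current data. */
void sync_from_kernel(struct copied_tlb *c)
{
    memcpy(c->user, c->kernel, sizeof c->user);
}
```

In the shared model, kernel_write() and tlb_lookup() operate on the same array, so there is nothing to synchronize and only one copy exists. In the copy model, the walker sees stale data until sync_from_kernel() has run, and the TLB is held in memory twice.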

> 
>>> 
>>> It's too magical, fitting a random version of a random userspace
>>> component.  Now you can't change this tcg code (and still keep the magic).
>>> 
>>> Some complexity is part of keeping software as separate components.
>> 
>> Why? If another user space wants to use this, they can
>> 
>> a) do the slow copy path
>> or
>> b) simply use our struct definitions
>> 
>> The whole copy thing really only makes sense when you have existing code in user space that you don't want to touch, but want to easily add KVM on to. If KVM is part of your whole design, then integrating things makes a lot more sense.
> 
> Yeah, I guess.
> 
>> 
>>> 
>>>> There are essentially no if(kvm_enabled)'s in our MMU walking code, because the tables are just there. Makes everything a lot easier (without dragging down performance).
>>> 
>>> We have the same issue with registers.  There we call
>>> cpu_synchronize_state() before every access.  No magic, but we get to
>>> reuse the code just the same.
>> 
>> Yes, and for those few bytes it's ok to do so - most of the time. On s390, even those get shared by now. And it makes sense to do so - if we synchronize it every time anyways, why not do so implicitly?
>> 
> 
> At least on x86, we synchronize only rarely.

Yeah, on s390 we only know which registers actually contain the information we need for traps / hypercalls once we are in user space, since that's where the decoding happens. So we'd better have all GPRs available to read from and write to.


Alex


