linux-kernel.vger.kernel.org archive mirror
From: Avi Kivity <avi@redhat.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: qemu-devel <qemu-devel@nongnu.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Gleb Natapov <gleb@redhat.com>, KVM list <kvm@vger.kernel.org>
Subject: Re: [Qemu-devel] [RFC] Next gen kvm api
Date: Mon, 06 Feb 2012 15:54:25 +0200
Message-ID: <4F2FDB91.80200@redhat.com>
In-Reply-To: <4F2FD692.5060708@codemonkey.ws>

On 02/06/2012 03:33 PM, Anthony Liguori wrote:
>> Look at arch/x86/kvm/i8254.c:pit_ioport_read() for a counterexample.
>> There are also interactions with other devices (for example the
>> apic/ioapic interaction via the apic bus).
>
>
> Hrm, maybe I'm missing it, but the path that would be hot is:
>
> if (!status_latched && !count_latched) {
>    value = kpit_elapsed()
>    // manipulate count based on mode
>    // mask value depending on read_state
> }
>
> This path is side-effect free, and applies relatively simple math to a
> time counter.

Do guests always read an unlatched counter?  Doesn't seem reasonable
since they can't get a stable count this way.
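
A minimal sketch of why an unlatched read is unstable, assuming the
usual arrangement where the 16-bit counter is read as two separate byte
accesses.  The helper names are hypothetical and are not taken from
i8254.c; the point is only that the counter can tick between the two
accesses, while a latch takes a single snapshot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: an unlatched 16-bit read is two byte reads, and
 * the down-counter may have changed between them. */
static uint16_t read_unlatched(uint16_t count_at_lsb, uint16_t count_at_msb)
{
    uint8_t lsb = (uint8_t)(count_at_lsb & 0xff);  /* first access  */
    uint8_t msb = (uint8_t)(count_at_msb >> 8);    /* second access */
    return (uint16_t)((msb << 8) | lsb);
}

/* A latched read snapshots the counter once; both bytes come from it. */
static uint16_t read_latched(uint16_t count_at_latch)
{
    return read_unlatched(count_at_latch, count_at_latch);
}

/* Usage: the counter ticks from 0x0100 to 0x00ff between the accesses. */
void demo_torn_read(void)
{
    /* Torn result: matches neither 0x0100 nor 0x00ff. */
    assert(read_unlatched(0x0100, 0x00ff) == 0x0000);
    /* Latched result: the value at the moment of the latch. */
    assert(read_latched(0x0100) == 0x0100);
}
```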

>
> The idea would be to allow the filter to not handle an I/O request
> depending on existing state.  Anything that modifies state (like
> reading the latch counter) would drop to userspace.

This restricts us to a subset of the device which is at the mercy of the
guest.
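
The proposed filter can be sketched roughly as follows.  This is an
illustration under stated assumptions, not the design under discussion:
the struct layout, the verdict enum and pit_read_filter() are all
hypothetical, elapsed_ticks stands in for whatever kpit_elapsed() would
yield, and "manipulate count based on mode" is reduced to a simple
periodic countdown.  The sketch also ignores the byte toggle that a
two-byte access mode would have to update on every read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum pio_verdict { PIO_HANDLED, PIO_EXIT_TO_USERSPACE };

/* Hypothetical state shared between the filter and userspace. */
struct pit_channel_state {
    bool status_latched;
    bool count_latched;
    uint16_t initial_count;  /* programmed reload value, must be nonzero */
    bool read_msb;           /* which byte this read returns */
};

/* Handle the read only when doing so leaves the state untouched. */
static enum pio_verdict pit_read_filter(const struct pit_channel_state *s,
                                        uint64_t elapsed_ticks,
                                        uint8_t *value)
{
    assert(s->initial_count != 0);

    /* Reading a latched value would consume the latch: punt. */
    if (s->status_latched || s->count_latched)
        return PIO_EXIT_TO_USERSPACE;

    /* Simplified periodic countdown from the reload value. */
    uint16_t count = (uint16_t)(s->initial_count -
                                elapsed_ticks % s->initial_count);
    *value = s->read_msb ? (uint8_t)(count >> 8) : (uint8_t)(count & 0xff);
    return PIO_HANDLED;
}
```

Which branch is taken depends entirely on whether the guest latched
the counter first, so the share of reads that stay in the kernel is
decided by guest behaviour rather than by the filter.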

>
>>
>>>
>>> If userspace had a way to upload bytecode to the kernel that was
>>> executed for a PIO operation, it could either pass the operation to
>>> userspace or handle it within the kernel when possible without taking
>>> a heavyweight exit.
>>>
>>> If the bytecode can access variables in a shared memory area, it could
>>> be pretty efficient to work with.
>>>
>>> This means that the kernel never has to deal with specific in-kernel
>>> devices but that userspace can accelerate as many of its devices as
>>> it sees fit.
>>
>> I would really love to have this, but the problem is that we'd need a
>> general purpose bytecode VM with binding to some kernel APIs.  The
>> bytecode VM, if made general enough to host more complicated devices,
>> would likely be much larger than the actual code we have in the
>> kernel now.
>
> I think the question is whether BPF is good enough as it stands.  I'm
> not really sure.

I think not.  It doesn't have 64-bit muldiv, required for hpet, for example.
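
For context, "muldiv" here means computing a * b / c without letting
the intermediate product overflow, the kind of scaling a timer model
needs when converting between time units and ticks.  Classic BPF's
accumulator and index register are 32 bits wide, so this does not map
onto its instruction set directly.  A portable sketch, modelled on the
muldiv64() helper of the same name in QEMU (treat the exact signature
as an assumption), keeps a 96-bit intermediate in two 64-bit halves:

```c
#include <assert.h>
#include <stdint.h>

/* Compute a * b / c with a 96-bit intermediate product.  The result is
 * truncated to 64 bits if the true quotient does not fit. */
static uint64_t muldiv64(uint64_t a, uint32_t b, uint32_t c)
{
    assert(c != 0);

    uint64_t lo = (a & 0xffffffffu) * (uint64_t)b;  /* low 32 bits of a  */
    uint64_t hi = (a >> 32) * (uint64_t)b;          /* high 32 bits of a */

    hi += lo >> 32;                 /* carry from the low partial product */

    uint64_t res_hi = hi / c;       /* upper 32 bits of the quotient */
    uint64_t res_lo = (((hi % c) << 32) | (lo & 0xffffffffu)) / c;

    return (res_hi << 32) | res_lo;
}
```

As a usage example, muldiv64(UINT64_MAX, 2, 2) returns UINT64_MAX,
whereas evaluating (a * 2) / 2 in plain 64-bit arithmetic would wrap
during the multiplication.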

>   I agree that inventing a new bytecode VM is probably not worth it.
>
>>>
>>> This could replace ioeventfd as a mechanism (which would allow
>>> clearing the notify flag before writing to an eventfd).
>>>
>>> We could potentially just use BPF for this.
>>
>> BPF generally just computes a predicate.
>
> Can it modify a packet in place?  I think a predicate is about right
> (can this io operation be handled in the kernel or not) but the
> question is whether there's a way to produce an output as a side effect.

You can use the scratch area, and say that it's persistent.  But the VM
itself isn't rich enough.

>
>> We could overload the scratch
>> area for storing internal state and for read results, though (and have
>> an "mmio scratch register" for reading the time).
>
> Right.
>

We could define mmio registers for muldiv64, and for communicating over
the APIC bus.  But then the device model for BPF ends up more
complicated than the kernel devices we have put together.

-- 
error compiling committee.c: too many arguments to function

