From: Michael Ellerman <michael@ellerman.id.au>
To: Arnd Bergmann <arnd@arndb.de>
Cc: qemu-devel@nongnu.org, Alexander Graf <agraf@suse.de>,
	KVM list <kvm@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Eric Northup <digitaleric@google.com>,
	Scott Wood <scottwood@freescale.com>, Avi Kivity <avi@redhat.com>
Subject: Re: [Qemu-devel] [RFC] Next gen kvm api
Date: Thu, 16 Feb 2012 12:04:05 +1100
Message-ID: <1329354245.6976.25.camel@concordia>
In-Reply-To: <201202152221.36154.arnd@arndb.de>

On Wed, 2012-02-15 at 22:21 +0000, Arnd Bergmann wrote:
> On Tuesday 07 February 2012, Alexander Graf wrote:
> > On 07.02.2012, at 07:58, Michael Ellerman wrote:
> > 
> > > On Mon, 2012-02-06 at 13:46 -0600, Scott Wood wrote:
> > >> You're exposing a large, complex kernel subsystem that does very
> > >> low-level things with the hardware.  It's a potential source of exploits
> > >> (from bugs in KVM or in hardware).  I can see people wanting to be
> > >> selective with access because of that.
> > > 
> > > Exactly.
> > > 
> > > In a perfect world I'd agree with Anthony, but in reality I think
> > > sysadmins are quite happy that they can prevent some users from using
> > > KVM.
> > > 
> > > You could presumably achieve something similar with capabilities or
> > > whatever, but a node in /dev is much simpler.
> > 
> > Well, you could still keep the /dev/kvm node and then have syscalls operate on the fd.
> > 
> > But again, I don't see the problem with the ioctl interface. It's nice, extensible and works great for us.
> > 
> 
> ioctl is good for hardware devices and stuff that you want to enumerate
> and/or control permissions on. For something like KVM that is really a
> core kernel service, a syscall makes much more sense.

Yeah, maybe. That distinction is at least in part just historical.

The first problem I see with using a syscall is that you don't need one
syscall for KVM, you need ~90. OK, so you wouldn't do that; you'd use a
multiplexed syscall like epoll_ctl(), or probably several
(vm/vcpu/etc).

Secondly you still need a handle/context for those syscalls, and I think
the most sane thing to use for that is an fd.

At that point you've basically reinvented ioctl :)
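
To make the "reinvented ioctl" point concrete, here is a minimal userspace
sketch of what a multiplexed entry point tends to look like. Everything in it
is hypothetical and exists only for illustration: the names demo_ctx_create()
and demo_kvm_ctl(), the three operation codes, and the small handle table that
stands in for a file descriptor table. It is not a proposed or existing KVM
interface.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical operation codes; a real interface would need far more. */
enum demo_op {
    DEMO_OP_GET_VERSION,
    DEMO_OP_SET_NR_VCPUS,
    DEMO_OP_GET_NR_VCPUS
};

/* Hypothetical per-handle context, standing in for what an fd refers to. */
struct demo_ctx {
    int in_use;
    int nr_vcpus;
};

#define DEMO_MAX_CTX 8
static struct demo_ctx demo_table[DEMO_MAX_CTX];

/* Stand-in for "get a handle": returns a small integer, as open() does. */
int demo_ctx_create(void)
{
    for (int i = 0; i < DEMO_MAX_CTX; i++) {
        if (!demo_table[i].in_use) {
            demo_table[i].in_use = 1;
            demo_table[i].nr_vcpus = 0;
            return i;
        }
    }
    errno = EMFILE;
    return -1;
}

/*
 * The multiplexed entry point takes (handle, op, arg), which is the same
 * shape as ioctl(fd, request, argp).
 */
int demo_kvm_ctl(int handle, enum demo_op op, void *arg)
{
    if (handle < 0 || handle >= DEMO_MAX_CTX || !demo_table[handle].in_use) {
        errno = EBADF;
        return -1;
    }

    struct demo_ctx *ctx = &demo_table[handle];

    switch (op) {
    case DEMO_OP_GET_VERSION:
        return 1; /* arbitrary version number for the sketch */
    case DEMO_OP_SET_NR_VCPUS:
        if (arg == NULL) {
            errno = EFAULT;
            return -1;
        }
        ctx->nr_vcpus = *(int *)arg;
        return 0;
    case DEMO_OP_GET_NR_VCPUS:
        if (arg == NULL) {
            errno = EFAULT;
            return -1;
        }
        *(int *)arg = ctx->nr_vcpus;
        return 0;
    default:
        errno = EINVAL; /* unknown operation code */
        return -1;
    }
}
```

Once the handle is an fd and the dispatcher switches on an operation code with
an untyped argument pointer, the result is hard to tell apart from ioctl().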

I also think it is an advantage that you have a node in /dev for
permissions. I know other "core kernel" interfaces don't use a /dev
node, but arguably that is their loss.
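
As a rough illustration of why a /dev node is convenient for permissions:
access to it is governed by ordinary file ownership and mode bits, so a
program can check it with the same call it would use for any other file. The
helper name can_use_node() below is made up for this sketch, and how the node
is owned (for example, by a dedicated group) is a distribution policy choice
that is assumed here rather than taken from this thread.

```c
#include <unistd.h>

/*
 * Returns 1 if the calling process may read and write `path`, else 0.
 * Note that access(2) checks against the real UID/GID, not the effective
 * ones, so this is only an approximation of what open() would allow.
 */
int can_use_node(const char *path)
{
    return access(path, R_OK | W_OK) == 0;
}
```

A caller would pass "/dev/kvm"; an administrator who wants to keep some users
away from KVM only has to adjust the node's owner, group or mode.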

> I would certainly never mix the two concepts: If you use a chardev to get
> a file descriptor, use ioctl to do operations on it, and if you use a 
> syscall to get the file descriptor then use other syscalls to do operations
> on it.

Sure, we use a syscall to get the fd (open) and then other syscalls to
do operations on it, ioctl and kvm_vcpu_run. ;)
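
For reference, the existing pattern described here (open() to get the fd, then
ioctl() on it) looks roughly like this from userspace. This is a small sketch,
not code from the thread: the function name kvm_api_version() is made up, it
assumes the Linux UAPI header <linux/kvm.h> is installed, and it returns -1
when /dev/kvm is absent or not accessible.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/*
 * Open the KVM device node and query the API version with an ioctl.
 * Returns the version reported by the kernel, or -1 on any failure.
 */
int kvm_api_version(void)
{
    int fd = open("/dev/kvm", O_RDWR);
    if (fd < 0)
        return -1; /* no KVM, or no permission on the node */

    int version = ioctl(fd, KVM_GET_API_VERSION, 0);
    close(fd);
    return version;
}
```

The fd returned by open() is the handle, and the request code passed to
ioctl() selects the operation.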

But seriously, I guess that makes sense. Though it's a bit of a pity,
because if you want a syscall for any of it, e.g. vcpu_run(), then you
basically have to reinvent ioctl for all the other little operations.

cheers
