linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [KVM PATCH v4 0/2] irqfd
@ 2009-05-04 17:57 Gregory Haskins
  2009-05-04 17:57 ` [KVM PATCH v4 1/2] eventfd: export eventfd interfaces for module use Gregory Haskins
                   ` (4 more replies)
  0 siblings, 5 replies; 35+ messages in thread
From: Gregory Haskins @ 2009-05-04 17:57 UTC (permalink / raw)
  To: kvm; +Cc: viro, linux-kernel, avi, davidel

(Applies to kvm.git:7da2e3ba, plus you will also need Davide Libenzi's
eventfd_file_create() patch, which you can find here:

http://www.mail-archive.com/kvm@vger.kernel.org/msg13923.html

You can find my complete tree with kvm.git, Davide's patch, and this series
here:

http://git.kernel.org/?p=linux/kernel/git/ghaskins/vbus/linux-2.6.git;a=shortlog;h=irqfd

----------------------

irqfd, v4

This series implements a mechanism called "irqfd".  It lets you create
an eventfd based file-desriptor to inject interrupts to a kvm guest. For
more details, please see the prologue for patch 2/2.

[ Changelog:

   v4:
        *) Changed allocation model to create the new fd last, after
           we get past the last potential error point by using Davide's
           new eventfd_file_create interface (Al Viro, Davide Libenzi)
	*) We no longer export sys_eventfd2() since it is replaced
           functionally with eventfd_file_create();
        *) Rebased to kvm.git:7da2e3ba

   v3:
        *) The kernel now allocates the eventfd (need to export sys_eventfd2)
        *) Added a flags field for future expansion to kvm_irqfd()
        *) We properly toggle the irq level 1+0.
        *) We re-use the USERSPACE_SRC_ID instead of creating our own
        *) Properly check for failures establishing a poll-table with eventfd
	*) Fixed fd/file leaks on failure
	*) Rebased to lateste kvm.git::41b76d8d04

   v2:
	*) Dropped notifier_chain based callbacks in favor of
	   wait_queue_t::func and file::poll based callbacks (Thanks to
	   Davide for the suggestion)

   v1:
        *) Initial release

--------

We do not have a user of this interface in this series, though note
future version of virtual-bus (v4 and above) will be based on this.

Note that this series requires userspace patches for qemu-kvm.git, v3, which
you can find here: http://patchwork.kernel.org/patch/20213/

-Greg

---

Gregory Haskins (2):
      kvm: add support for irqfd via eventfd-notification interface
      eventfd: export eventfd interfaces for module use


 arch/x86/kvm/Makefile    |    2 -
 arch/x86/kvm/x86.c       |    1 
 fs/eventfd.c             |    4 +
 include/linux/kvm.h      |    7 ++
 include/linux/kvm_host.h |    4 +
 virt/kvm/irqfd.c         |  159 ++++++++++++++++++++++++++++++++++++++++++++++
 virt/kvm/kvm_main.c      |   11 +++
 7 files changed, 187 insertions(+), 1 deletions(-)
 create mode 100644 virt/kvm/irqfd.c



^ permalink raw reply	[flat|nested] 35+ messages in thread

* [KVM PATCH v4 1/2] eventfd: export eventfd interfaces for module use
  2009-05-04 17:57 [KVM PATCH v4 0/2] irqfd Gregory Haskins
@ 2009-05-04 17:57 ` Gregory Haskins
  2009-05-04 22:24   ` Al Viro
  2009-05-04 17:57 ` [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface Gregory Haskins
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 35+ messages in thread
From: Gregory Haskins @ 2009-05-04 17:57 UTC (permalink / raw)
  To: kvm; +Cc: viro, linux-kernel, avi, davidel

We will re-use eventfd for implmenting irqfd later in the series, and the
irqfd users will potentially live in modules.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
---

 fs/eventfd.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/fs/eventfd.c b/fs/eventfd.c
index 0de6ebb..2e1c2ff 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -16,6 +16,7 @@
 #include <linux/anon_inodes.h>
 #include <linux/eventfd.h>
 #include <linux/syscalls.h>
+#include <linux/module.h>
 
 struct eventfd_ctx {
 	wait_queue_head_t wqh;
@@ -56,6 +57,7 @@ int eventfd_signal(struct file *file, int n)
 
 	return n;
 }
+EXPORT_SYMBOL_GPL(eventfd_signal);
 
 static int eventfd_release(struct inode *inode, struct file *file)
 {
@@ -197,6 +199,7 @@ struct file *eventfd_fget(int fd)
 
 	return file;
 }
+EXPORT_SYMBOL_GPL(eventfd_fget);
 
 struct file *eventfd_file_create(unsigned int count, int flags)
 {
@@ -225,6 +228,7 @@ struct file *eventfd_file_create(unsigned int count, int flags)
 
 	return file;
 }
+EXPORT_SYMBOL_GPL(eventfd_file_create);
 
 SYSCALL_DEFINE2(eventfd2, unsigned int, count, int, flags)
 {


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-04 17:57 [KVM PATCH v4 0/2] irqfd Gregory Haskins
  2009-05-04 17:57 ` [KVM PATCH v4 1/2] eventfd: export eventfd interfaces for module use Gregory Haskins
@ 2009-05-04 17:57 ` Gregory Haskins
  2009-05-05 15:45   ` Avi Kivity
  2009-05-06 11:35   ` Gregory Haskins
  2009-05-04 18:06 ` [KVM PATCH v4 0/2] irqfd Gregory Haskins
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 35+ messages in thread
From: Gregory Haskins @ 2009-05-04 17:57 UTC (permalink / raw)
  To: kvm; +Cc: viro, linux-kernel, avi, davidel

KVM provides a complete virtual system environment for guests, including
support for injecting interrupts modeled after the real exception/interrupt
facilities present on the native platform (such as the IDT on x86).
Virtual interrupts can come from a variety of sources (emulated devices,
pass-through devices, etc) but all must be injected to the guest via
the KVM infrastructure.  This patch adds a new mechanism to inject a specific
interrupt to a guest using a decoupled eventfd mechnanism:  Any legal signal
on the irqfd (using eventfd semantics from either userspace or kernel) will
translate into an injected interrupt in the guest at the next available
interrupt window.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
---

 arch/x86/kvm/Makefile    |    2 -
 arch/x86/kvm/x86.c       |    1 
 include/linux/kvm.h      |    7 ++
 include/linux/kvm_host.h |    4 +
 virt/kvm/irqfd.c         |  159 ++++++++++++++++++++++++++++++++++++++++++++++
 virt/kvm/kvm_main.c      |   11 +++
 6 files changed, 183 insertions(+), 1 deletions(-)
 create mode 100644 virt/kvm/irqfd.c

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index b43c4ef..d5fff51 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -3,7 +3,7 @@
 #
 
 common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
-                coalesced_mmio.o irq_comm.o)
+                coalesced_mmio.o irq_comm.o irqfd.o)
 ifeq ($(CONFIG_KVM_TRACE),y)
 common-objs += $(addprefix ../../../virt/kvm/, kvm_trace.o)
 endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2d7082c..699a407 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1027,6 +1027,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_REINJECT_CONTROL:
 	case KVM_CAP_IRQ_INJECT_STATUS:
 	case KVM_CAP_ASSIGN_DEV_IRQ:
+	case KVM_CAP_IRQFD:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 3db5d8d..5e9b861 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -415,6 +415,7 @@ struct kvm_trace_rec {
 #define KVM_CAP_ASSIGN_DEV_IRQ 29
 /* Another bug in KVM_SET_USER_MEMORY_REGION fixed: */
 #define KVM_CAP_JOIN_MEMORY_REGIONS_WORKS 30
+#define KVM_CAP_IRQFD 31
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -454,6 +455,11 @@ struct kvm_irq_routing {
 
 #endif
 
+struct kvm_irqfd {
+	__u32 gsi;
+	__u32 flags;
+};
+
 /*
  * ioctls for VM fds
  */
@@ -498,6 +504,7 @@ struct kvm_irq_routing {
 #define KVM_ASSIGN_SET_MSIX_ENTRY \
 			_IOW(KVMIO, 0x74, struct kvm_assigned_msix_entry)
 #define KVM_DEASSIGN_DEV_IRQ       _IOW(KVMIO, 0x75, struct kvm_assigned_irq)
+#define KVM_IRQFD                  _IOW(KVMIO, 0x76, struct kvm_irqfd)
 
 /*
  * ioctls for vcpu fds
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 095ebb6..6a8d1c1 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -134,6 +134,7 @@ struct kvm {
 	struct list_head vm_list;
 	struct kvm_io_bus mmio_bus;
 	struct kvm_io_bus pio_bus;
+	struct list_head irqfds;
 	struct kvm_vm_stat stat;
 	struct kvm_arch arch;
 	atomic_t users_count;
@@ -524,4 +525,7 @@ static inline void kvm_free_irq_routing(struct kvm *kvm) {}
 
 #endif
 
+int kvm_irqfd(struct kvm *kvm, int gsi, int flags);
+void kvm_irqfd_release(struct kvm *kvm);
+
 #endif
diff --git a/virt/kvm/irqfd.c b/virt/kvm/irqfd.c
new file mode 100644
index 0000000..6c82fcb
--- /dev/null
+++ b/virt/kvm/irqfd.c
@@ -0,0 +1,159 @@
+/*
+ * irqfd: Allows an eventfd to be used to inject an interrupt to the guest
+ *
+ * Credit goes to Avi Kivity for the original idea.
+ *
+ * Copyright 2009 Novell.  All Rights Reserved.
+ *
+ * Author:
+ *	Gregory Haskins <ghaskins@novell.com>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include <linux/kvm_host.h>
+#include <linux/eventfd.h>
+#include <linux/workqueue.h>
+#include <linux/syscalls.h>
+#include <linux/wait.h>
+#include <linux/poll.h>
+#include <linux/file.h>
+#include <linux/list.h>
+
+struct _irqfd {
+	struct kvm               *kvm;
+	int                       gsi;
+	struct file              *file;
+	struct list_head          list;
+	poll_table                pt;
+	wait_queue_head_t        *wqh;
+	wait_queue_t              wait;
+	struct work_struct        work;
+};
+
+static void
+irqfd_inject(struct work_struct *work)
+{
+	struct _irqfd *irqfd = container_of(work, struct _irqfd, work);
+	struct kvm *kvm = irqfd->kvm;
+
+	mutex_lock(&kvm->lock);
+	kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1);
+	kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0);
+	mutex_unlock(&kvm->lock);
+}
+
+static int
+irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)
+{
+	struct _irqfd *irqfd = container_of(wait, struct _irqfd, wait);
+
+	/*
+	 * The eventfd calls its wake_up with interrupts disabled, and
+	 * eventfd_signal() may be called from interrupt context. Therefore
+	 * we need to defer the IRQ injection until later since we need to
+	 * acquire the kvm->lock to do so.
+	 */
+	schedule_work(&irqfd->work);
+
+	return 0;
+}
+
+static void
+irqfd_ptable_queue_proc(struct file *file, wait_queue_head_t *wqh,
+			poll_table *pt)
+{
+	struct _irqfd *irqfd = container_of(pt, struct _irqfd, pt);
+
+	irqfd->wqh = wqh;
+	add_wait_queue(wqh, &irqfd->wait);
+}
+
+int
+kvm_irqfd(struct kvm *kvm, int gsi, int flags)
+{
+	struct _irqfd *irqfd;
+	struct file *file = NULL;
+	int fd = -1;
+	int ret;
+
+	irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
+	if (!irqfd)
+		return -ENOMEM;
+
+	irqfd->kvm = kvm;
+	irqfd->gsi = gsi;
+	INIT_LIST_HEAD(&irqfd->list);
+	INIT_WORK(&irqfd->work, irqfd_inject);
+
+	/*
+	 * We re-use eventfd for irqfd, and therefore will embed the eventfd
+	 * lifetime in the irqfd.
+	 */
+	file = eventfd_file_create(0, 0);
+	if (IS_ERR(file)) {
+		ret = PTR_ERR(file);
+		goto fail;
+	}
+
+	irqfd->file = file;
+
+	/*
+	 * Install our own custom wake-up handling so we are notified via
+	 * a callback whenever someone signals the underlying eventfd
+	 */
+	init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
+	init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc);
+
+	ret = file->f_op->poll(file, &irqfd->pt);
+	if (ret < 0)
+		goto fail;
+
+	mutex_lock(&kvm->lock);
+	list_add_tail(&irqfd->list, &kvm->irqfds);
+	mutex_unlock(&kvm->lock);
+
+	ret = get_unused_fd();
+	if (ret < 0)
+		goto fail;
+
+	fd = ret;
+
+	fd_install(fd, file);
+
+	return fd;
+
+fail:
+	if (file && !IS_ERR(file))
+		fput(file);
+
+	kfree(irqfd);
+	return ret;
+}
+
+void
+kvm_irqfd_release(struct kvm *kvm)
+{
+	struct _irqfd *irqfd, *tmp;
+
+	list_for_each_entry_safe(irqfd, tmp, &kvm->irqfds, list) {
+		remove_wait_queue(irqfd->wqh, &irqfd->wait);
+
+		flush_work(&irqfd->work);
+		fput(irqfd->file);
+
+		list_del(&irqfd->list);
+		kfree(irqfd);
+	}
+}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2b73e19..8b3b06a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -972,6 +972,7 @@ static struct kvm *kvm_create_vm(void)
 	atomic_inc(&kvm->mm->mm_count);
 	spin_lock_init(&kvm->mmu_lock);
 	kvm_io_bus_init(&kvm->pio_bus);
+	INIT_LIST_HEAD(&kvm->irqfds);
 	mutex_init(&kvm->lock);
 	kvm_io_bus_init(&kvm->mmio_bus);
 	init_rwsem(&kvm->slots_lock);
@@ -1023,6 +1024,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
 	spin_lock(&kvm_lock);
 	list_del(&kvm->vm_list);
 	spin_unlock(&kvm_lock);
+	kvm_irqfd_release(kvm);
 	kvm_free_irq_routing(kvm);
 	kvm_io_bus_destroy(&kvm->pio_bus);
 	kvm_io_bus_destroy(&kvm->mmio_bus);
@@ -2197,6 +2199,15 @@ static long kvm_vm_ioctl(struct file *filp,
 	}
 #endif
 #endif /* KVM_CAP_IRQ_ROUTING */
+	case KVM_IRQFD: {
+		struct kvm_irqfd data;
+
+		r = -EFAULT;
+		if (copy_from_user(&data, argp, sizeof data))
+			goto out;
+		r = kvm_irqfd(kvm, data.gsi, data.flags);
+		break;
+	}
 	default:
 		r = kvm_arch_vm_ioctl(filp, ioctl, arg);
 	}


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 0/2] irqfd
  2009-05-04 17:57 [KVM PATCH v4 0/2] irqfd Gregory Haskins
  2009-05-04 17:57 ` [KVM PATCH v4 1/2] eventfd: export eventfd interfaces for module use Gregory Haskins
  2009-05-04 17:57 ` [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface Gregory Haskins
@ 2009-05-04 18:06 ` Gregory Haskins
  2009-05-04 18:52 ` Gregory Haskins
  2009-05-05 14:16 ` Davide Libenzi
  4 siblings, 0 replies; 35+ messages in thread
From: Gregory Haskins @ 2009-05-04 18:06 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: kvm, viro, linux-kernel, avi, davidel

[-- Attachment #1: Type: text/plain, Size: 506 bytes --]

Gregory Haskins wrote:
> (Applies to kvm.git:7da2e3ba, plus you will also need Davide Libenzi's
> eventfd_file_create() patch, which you can find here:
>   

[ snip ]

Sorry about the double post of v4.  The first time through I
fat-fingered Al's and LKML's addresses so they were munged together.  I
tried to kill the mail before it went out, but it looks like I missed
it.  If replying, please be sure to use the second version of v4 (which
has a legit address for Al and LKML).

-Greg



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 0/2] irqfd
  2009-05-04 17:57 [KVM PATCH v4 0/2] irqfd Gregory Haskins
                   ` (2 preceding siblings ...)
  2009-05-04 18:06 ` [KVM PATCH v4 0/2] irqfd Gregory Haskins
@ 2009-05-04 18:52 ` Gregory Haskins
  2009-05-05 14:16 ` Davide Libenzi
  4 siblings, 0 replies; 35+ messages in thread
From: Gregory Haskins @ 2009-05-04 18:52 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: kvm, viro, linux-kernel, avi, davidel

[-- Attachment #1: Type: text/plain, Size: 425 bytes --]

Gregory Haskins wrote:
> (Applies to kvm.git:7da2e3ba, plus you will also need Davide Libenzi's
> eventfd_file_create() patch, which you can find here:
>   
[snip]

I should also add that v4 is build-tested only.  I am in the middle of
refactoring virtual-bus, which is my only current test-harness for
this.  I will report back later this week how the testing goes once I
get something running again.

-Greg



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 1/2] eventfd: export eventfd interfaces for module use
  2009-05-04 17:57 ` [KVM PATCH v4 1/2] eventfd: export eventfd interfaces for module use Gregory Haskins
@ 2009-05-04 22:24   ` Al Viro
  2009-05-05  2:17     ` Gregory Haskins
  0 siblings, 1 reply; 35+ messages in thread
From: Al Viro @ 2009-05-04 22:24 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: kvm, linux-kernel, avi, davidel

On Mon, May 04, 2009 at 01:57:45PM -0400, Gregory Haskins wrote:
> @@ -56,6 +57,7 @@ int eventfd_signal(struct file *file, int n)
>  
>  	return n;
>  }
> +EXPORT_SYMBOL_GPL(eventfd_signal);

perhaps, but...

> @@ -197,6 +199,7 @@ struct file *eventfd_fget(int fd)
>  
>  	return file;
>  }
> +EXPORT_SYMBOL_GPL(eventfd_fget);

this one looks very odd.  Could you show legitimate users?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 1/2] eventfd: export eventfd interfaces for module use
  2009-05-04 22:24   ` Al Viro
@ 2009-05-05  2:17     ` Gregory Haskins
  0 siblings, 0 replies; 35+ messages in thread
From: Gregory Haskins @ 2009-05-05  2:17 UTC (permalink / raw)
  To: Al Viro; +Cc: kvm, linux-kernel, avi, davidel

[-- Attachment #1: Type: text/plain, Size: 1019 bytes --]

Al Viro wrote:
> On Mon, May 04, 2009 at 01:57:45PM -0400, Gregory Haskins wrote:
>   
>> @@ -56,6 +57,7 @@ int eventfd_signal(struct file *file, int n)
>>  
>>  	return n;
>>  }
>> +EXPORT_SYMBOL_GPL(eventfd_signal);
>>     
>
> perhaps, but...
>
>   
>> @@ -197,6 +199,7 @@ struct file *eventfd_fget(int fd)
>>  
>>  	return file;
>>  }
>> +EXPORT_SYMBOL_GPL(eventfd_fget);
>>     
>
> this one looks very odd.  Could you show legitimate users?
>   
Hi Al,

I plan on using this in a similar way as the AIO io_submit function does
today for the next version of the virtual-bus patches.  That is,
userspace allocates an eventfd (in this case, via kvm_irqfd() and the
kernel side code you have been reviewing) and subsequently submits the
fd to a unique virtual-bus instance in the kernel via an ioctl().  This
will ultimately associate the vbus instance dynamically with the kvm
instance in a loosely coupled relationship.  Vbus may be compiled as a
module, thus the export.

-Greg


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 0/2] irqfd
  2009-05-04 17:57 [KVM PATCH v4 0/2] irqfd Gregory Haskins
                   ` (3 preceding siblings ...)
  2009-05-04 18:52 ` Gregory Haskins
@ 2009-05-05 14:16 ` Davide Libenzi
  2009-05-05 14:27   ` Gregory Haskins
  4 siblings, 1 reply; 35+ messages in thread
From: Davide Libenzi @ 2009-05-05 14:16 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: kvm, viro, Linux Kernel Mailing List, avi

On Mon, 4 May 2009, Gregory Haskins wrote:

> (Applies to kvm.git:7da2e3ba, plus you will also need Davide Libenzi's
> eventfd_file_create() patch, which you can find here:
> 
> http://www.mail-archive.com/kvm@vger.kernel.org/msg13923.html

Ping me back if Al acks the irqfd thing, that I'll take a better look at 
the patch above and make an official post. Without any users, I'd rather 
leave the current code as is.


- Davide



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 0/2] irqfd
  2009-05-05 14:16 ` Davide Libenzi
@ 2009-05-05 14:27   ` Gregory Haskins
  0 siblings, 0 replies; 35+ messages in thread
From: Gregory Haskins @ 2009-05-05 14:27 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: kvm, viro, Linux Kernel Mailing List, avi

[-- Attachment #1: Type: text/plain, Size: 531 bytes --]

Davide Libenzi wrote:
> On Mon, 4 May 2009, Gregory Haskins wrote:
>
>   
>> (Applies to kvm.git:7da2e3ba, plus you will also need Davide Libenzi's
>> eventfd_file_create() patch, which you can find here:
>>
>> http://www.mail-archive.com/kvm@vger.kernel.org/msg13923.html
>>     
>
> Ping me back if Al acks the irqfd thing, that I'll take a better look at 
> the patch above and make an official post. Without any users, I'd rather 
> leave the current code as is.
>   

Will do, Davide.  Thank you.

-Greg



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-04 17:57 ` [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface Gregory Haskins
@ 2009-05-05 15:45   ` Avi Kivity
  2009-05-05 17:34     ` Gregory Haskins
  2009-05-05 17:56     ` Gregory Haskins
  2009-05-06 11:35   ` Gregory Haskins
  1 sibling, 2 replies; 35+ messages in thread
From: Avi Kivity @ 2009-05-05 15:45 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: kvm, viro, linux-kernel, davidel

Gregory Haskins wrote:
> KVM provides a complete virtual system environment for guests, including
> support for injecting interrupts modeled after the real exception/interrupt
> facilities present on the native platform (such as the IDT on x86).
> Virtual interrupts can come from a variety of sources (emulated devices,
> pass-through devices, etc) but all must be injected to the guest via
> the KVM infrastructure.  This patch adds a new mechanism to inject a specific
> interrupt to a guest using a decoupled eventfd mechnanism:  Any legal signal
> on the irqfd (using eventfd semantics from either userspace or kernel) will
> translate into an injected interrupt in the guest at the next available
> interrupt window.
>
>  
> +struct kvm_irqfd {
> +	__u32 gsi;
> +	__u32 flags;
> +};
> +
>   

Please add some reserved space here.

> +int
> +kvm_irqfd(struct kvm *kvm, int gsi, int flags)
> +{
> +	struct _irqfd *irqfd;
> +	struct file *file = NULL;
> +	int fd = -1;
> +	int ret;
> +
> +	irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
> +	if (!irqfd)
> +		return -ENOMEM;
> +
> +	irqfd->kvm = kvm;
>   

You need to increase the refcount on struct kvm here.  Otherwise evil 
userspace will create an irqfd, close the vm and vcpu fds, and inject an 
interrupt.

Otherwise, looks good.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-05 15:45   ` Avi Kivity
@ 2009-05-05 17:34     ` Gregory Haskins
  2009-05-05 17:43       ` Avi Kivity
  2009-05-05 17:56     ` Gregory Haskins
  1 sibling, 1 reply; 35+ messages in thread
From: Gregory Haskins @ 2009-05-05 17:34 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, viro, linux-kernel, davidel

[-- Attachment #1: Type: text/plain, Size: 1531 bytes --]

Avi Kivity wrote:
> Gregory Haskins wrote:
>> KVM provides a complete virtual system environment for guests, including
>> support for injecting interrupts modeled after the real
>> exception/interrupt
>> facilities present on the native platform (such as the IDT on x86).
>> Virtual interrupts can come from a variety of sources (emulated devices,
>> pass-through devices, etc) but all must be injected to the guest via
>> the KVM infrastructure.  This patch adds a new mechanism to inject a
>> specific
>> interrupt to a guest using a decoupled eventfd mechnanism:  Any legal
>> signal
>> on the irqfd (using eventfd semantics from either userspace or
>> kernel) will
>> translate into an injected interrupt in the guest at the next available
>> interrupt window.
>>
>>  
>> +struct kvm_irqfd {
>> +    __u32 gsi;
>> +    __u32 flags;
>> +};
>> +
>>   
>
> Please add some reserved space here.

Ack.  Any rule of thumb here?  How about a "__u8 pad[16]" ?

>
>> +int
>> +kvm_irqfd(struct kvm *kvm, int gsi, int flags)
>> +{
>> +    struct _irqfd *irqfd;
>> +    struct file *file = NULL;
>> +    int fd = -1;
>> +    int ret;
>> +
>> +    irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
>> +    if (!irqfd)
>> +        return -ENOMEM;
>> +
>> +    irqfd->kvm = kvm;
>>   
>
> You need to increase the refcount on struct kvm here.  Otherwise evil
> userspace will create an irqfd, close the vm and vcpu fds, and inject
> an interrupt.

Good catch.  Will fix.

Thanks Avi,
-Greg



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-05 17:34     ` Gregory Haskins
@ 2009-05-05 17:43       ` Avi Kivity
  0 siblings, 0 replies; 35+ messages in thread
From: Avi Kivity @ 2009-05-05 17:43 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: kvm, viro, linux-kernel, davidel

Gregory Haskins wrote:
>>>  
>>> +struct kvm_irqfd {
>>> +    __u32 gsi;
>>> +    __u32 flags;
>>> +};
>>> +
>>>   
>>>       
>> Please add some reserved space here.
>>     
>
> Ack.  Any rule of thumb here?  How about a "__u8 pad[16]" ?
>   

I'd round it up so the whole thing is 32 bytes (not that it matters).

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-05 15:45   ` Avi Kivity
  2009-05-05 17:34     ` Gregory Haskins
@ 2009-05-05 17:56     ` Gregory Haskins
  2009-05-05 18:10       ` Avi Kivity
  1 sibling, 1 reply; 35+ messages in thread
From: Gregory Haskins @ 2009-05-05 17:56 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, viro, linux-kernel, davidel

[-- Attachment #1: Type: text/plain, Size: 1022 bytes --]

Avi Kivity wrote:
> Gregory Haskins wrote:
>
>> +int
>> +kvm_irqfd(struct kvm *kvm, int gsi, int flags)
>> +{
>> +    struct _irqfd *irqfd;
>> +    struct file *file = NULL;
>> +    int fd = -1;
>> +    int ret;
>> +
>> +    irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
>> +    if (!irqfd)
>> +        return -ENOMEM;
>> +
>> +    irqfd->kvm = kvm;
>>   
>
> You need to increase the refcount on struct kvm here.  Otherwise evil
> userspace will create an irqfd, close the vm and vcpu fds, and inject
> an interrupt.

I just reviewed the code in prep for v5, and now I remember why I didnt
take a reference:  I designed it the opposite direction:  the vm-fd owns
a reference to the irqfd, and will decouple the kvm context from the
eventfd on shutdown  (see kvm_irqfd_release()).   I still need to spin a
v5 regardless in order to add the padding as previously discussed.  But
let me know if you still see any holes in light of this alternate object
lifetime approach I am using.

-Greg



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-05 17:56     ` Gregory Haskins
@ 2009-05-05 18:10       ` Avi Kivity
  2009-05-05 18:21         ` Gregory Haskins
  0 siblings, 1 reply; 35+ messages in thread
From: Avi Kivity @ 2009-05-05 18:10 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: kvm, viro, linux-kernel, davidel

Gregory Haskins wrote:
> Avi Kivity wrote:
>   
>> Gregory Haskins wrote:
>>
>>     
>>> +int
>>> +kvm_irqfd(struct kvm *kvm, int gsi, int flags)
>>> +{
>>> +    struct _irqfd *irqfd;
>>> +    struct file *file = NULL;
>>> +    int fd = -1;
>>> +    int ret;
>>> +
>>> +    irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
>>> +    if (!irqfd)
>>> +        return -ENOMEM;
>>> +
>>> +    irqfd->kvm = kvm;
>>>   
>>>       
>> You need to increase the refcount on struct kvm here.  Otherwise evil
>> userspace will create an irqfd, close the vm and vcpu fds, and inject
>> an interrupt.
>>     
>
> I just reviewed the code in prep for v5, and now I remember why I didnt
> take a reference:  I designed it the opposite direction:  the vm-fd owns
> a reference to the irqfd, and will decouple the kvm context from the
> eventfd on shutdown  (see kvm_irqfd_release()).   I still need to spin a
> v5 regardless in order to add the padding as previously discussed.  But
> let me know if you still see any holes in light of this alternate object
> lifetime approach I am using.
>   

Right, irqfd_release works.  But I think refcounting is simpler, since 
we already kvm_get_kvm() and kvm_put_kvm(), and you wouldn't need the 
irqfd list.  On the other hand, I'm not sure you get a callback from 
eventfd on close(), so refcounting may not be implementable.

Drat, irqfd_release doesn't work.  You reference kvm->lock in 
irqfd_inject without taking any locks.

btw, there's still your original idea of creating the eventfd in 
userspace and passing it down.  That would be workable if we can see a 
way to both signal the eventfd and get called back in irq context.  
Maybe that's preferable to what we're doing here, but we need to see how 
it would work.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-05 18:10       ` Avi Kivity
@ 2009-05-05 18:21         ` Gregory Haskins
  0 siblings, 0 replies; 35+ messages in thread
From: Gregory Haskins @ 2009-05-05 18:21 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Gregory Haskins, kvm, viro, linux-kernel, davidel

[-- Attachment #1: Type: text/plain, Size: 3132 bytes --]

Avi Kivity wrote:
> Gregory Haskins wrote:
>> Avi Kivity wrote:
>>  
>>> Gregory Haskins wrote:
>>>
>>>    
>>>> +int
>>>> +kvm_irqfd(struct kvm *kvm, int gsi, int flags)
>>>> +{
>>>> +    struct _irqfd *irqfd;
>>>> +    struct file *file = NULL;
>>>> +    int fd = -1;
>>>> +    int ret;
>>>> +
>>>> +    irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
>>>> +    if (!irqfd)
>>>> +        return -ENOMEM;
>>>> +
>>>> +    irqfd->kvm = kvm;
>>>>         
>>> You need to increase the refcount on struct kvm here.  Otherwise evil
>>> userspace will create an irqfd, close the vm and vcpu fds, and inject
>>> an interrupt.
>>>     
>>
>> I just reviewed the code in prep for v5, and now I remember why I didnt
>> take a reference:  I designed it the opposite direction:  the vm-fd owns
>> a reference to the irqfd, and will decouple the kvm context from the
>> eventfd on shutdown  (see kvm_irqfd_release()).   I still need to spin a
>> v5 regardless in order to add the padding as previously discussed.  But
>> let me know if you still see any holes in light of this alternate object
>> lifetime approach I am using.
>>   
>
> Right, irqfd_release works.  But I think refcounting is simpler, since
> we already kvm_get_kvm() and kvm_put_kvm(), and you wouldn't need the
> irqfd list.  On the other hand, I'm not sure you get a callback from
> eventfd on close(), so refcounting may not be implementable.

;)

>
> Drat, irqfd_release doesn't work.  You reference kvm->lock in
> irqfd_inject without taking any locks.

I *think* this is ok, tho.  I remove myself from the waitq, and then
flush any potentially scheduled deferred work before returning.  This
all happens synchronously to the vm_release() code when the vm-fd is
bring dropped, but before we actually release the struct kvm*. 
Therefore, I think kvm->lock is guaranteed to remain valid for the
duration of the irqfd_release(), and we guarantee it wont be accessed
after the irqfd_release() completes.  Or do you have a different concern?

On this topic of proper ref counts, though....

I wonder if I need an extra fget() in there.  I presume that the
evenfd_file_create() returns with only a single reference, which
presumably I am handing one to userspace, and one to the irqfd.... which
is broken.  Or does fd_install() bump that for me (doesnt look like
it)?  Al, Davide, any comments?

>
> btw, there's still your original idea of creating the eventfd in
> userspace and passing it down.  That would be workable if we can see a
> way to both signal the eventfd and get called back in irq context. 
> Maybe that's preferable to what we're doing here, but we need to see
> how it would work.

We can do that, but I don't see it as changing the general problem
here.  However, I think if you find that the above comments about the
kvm->lock w.r.t. irqfd_release() are ok, we don't need to worry about
it.  If you prefer the userspace allocation of eventfd() for other
reasons, we can easily go back to that model as well...but its not
strictly necessary for this particular issue iiuc.

-Greg


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-04 17:57 ` [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface Gregory Haskins
  2009-05-05 15:45   ` Avi Kivity
@ 2009-05-06 11:35   ` Gregory Haskins
  2009-05-06 15:24     ` Davide Libenzi
  1 sibling, 1 reply; 35+ messages in thread
From: Gregory Haskins @ 2009-05-06 11:35 UTC (permalink / raw)
  To: viro, davidel; +Cc: kvm, linux-kernel, avi

[-- Attachment #1: Type: text/plain, Size: 1563 bytes --]

Al, Davide,

Gregory Haskins wrote:
> +
> +int
> +kvm_irqfd(struct kvm *kvm, int gsi, int flags)
> +{
> +	struct _irqfd *irqfd;
> +	struct file *file = NULL;
> +	int fd = -1;
> +	int ret;
> +
> +	irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
> +	if (!irqfd)
> +		return -ENOMEM;
> +
> +	irqfd->kvm = kvm;
> +	irqfd->gsi = gsi;
> +	INIT_LIST_HEAD(&irqfd->list);
> +	INIT_WORK(&irqfd->work, irqfd_inject);
> +
> +	/*
> +	 * We re-use eventfd for irqfd, and therefore will embed the eventfd
> +	 * lifetime in the irqfd.
> +	 */
> +	file = eventfd_file_create(0, 0);
> +	if (IS_ERR(file)) {
> +		ret = PTR_ERR(file);
> +		goto fail;
> +	}
> +
> +	irqfd->file = file;
> +
> +	/*
> +	 * Install our own custom wake-up handling so we are notified via
> +	 * a callback whenever someone signals the underlying eventfd
> +	 */
> +	init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
> +	init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc);
> +
> +	ret = file->f_op->poll(file, &irqfd->pt);
> +	if (ret < 0)
> +		goto fail;
> +
> +	mutex_lock(&kvm->lock);
> +	list_add_tail(&irqfd->list, &kvm->irqfds);
> +	mutex_unlock(&kvm->lock);
> +
> +	ret = get_unused_fd();
> +	if (ret < 0)
> +		goto fail;
> +
> +	fd = ret;
> +
> +	fd_install(fd, file);
>   

Can you comment on whether this function needs to take an additional
reference on file here?  (one for the one held by userspace/fd, and the
other for irqfd->file)  My instinct is telling me this may be currently
broken, but I am not sure.

-Greg




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-06 11:35   ` Gregory Haskins
@ 2009-05-06 15:24     ` Davide Libenzi
  2009-05-06 15:37       ` Gregory Haskins
  0 siblings, 1 reply; 35+ messages in thread
From: Davide Libenzi @ 2009-05-06 15:24 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: viro, kvm, Linux Kernel Mailing List, avi

On Wed, 6 May 2009, Gregory Haskins wrote:

> Al, Davide,
> 
> Gregory Haskins wrote:
> > +
> > +int
> > +kvm_irqfd(struct kvm *kvm, int gsi, int flags)
> > +{
> > +	struct _irqfd *irqfd;
> > +	struct file *file = NULL;
> > +	int fd = -1;
> > +	int ret;
> > +
> > +	irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
> > +	if (!irqfd)
> > +		return -ENOMEM;
> > +
> > +	irqfd->kvm = kvm;
> > +	irqfd->gsi = gsi;
> > +	INIT_LIST_HEAD(&irqfd->list);
> > +	INIT_WORK(&irqfd->work, irqfd_inject);
> > +
> > +	/*
> > +	 * We re-use eventfd for irqfd, and therefore will embed the eventfd
> > +	 * lifetime in the irqfd.
> > +	 */
> > +	file = eventfd_file_create(0, 0);
> > +	if (IS_ERR(file)) {
> > +		ret = PTR_ERR(file);
> > +		goto fail;
> > +	}
> > +
> > +	irqfd->file = file;
> > +
> > +	/*
> > +	 * Install our own custom wake-up handling so we are notified via
> > +	 * a callback whenever someone signals the underlying eventfd
> > +	 */
> > +	init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
> > +	init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc);
> > +
> > +	ret = file->f_op->poll(file, &irqfd->pt);
> > +	if (ret < 0)
> > +		goto fail;
> > +
> > +	mutex_lock(&kvm->lock);
> > +	list_add_tail(&irqfd->list, &kvm->irqfds);
> > +	mutex_unlock(&kvm->lock);
> > +
> > +	ret = get_unused_fd();
> > +	if (ret < 0)
> > +		goto fail;
> > +
> > +	fd = ret;
> > +
> > +	fd_install(fd, file);
> >   
> 
> Can you comment on whether this function needs to take an additional
> reference on file here?  (one for the one held by userspace/fd, and the
> other for irqfd->file)  My instinct is telling me this may be currently
> broken, but I am not sure.

I don't know exactly how it is used, but looks broken. If the fd is 
returned to userspace, and userspace closes the fd, you will leak the 
irqfd*. If you do an extra fget(), you will never know if the userspace 
closed the last-1 instance of the eventfd file*.
What is likely needed, is add a callback to eventfd_file_create(), so that 
the eventfd can call your callback on ->release, and give you a chance to 
perform proper cleanup of the irqfd*.



- Davide



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-06 15:24     ` Davide Libenzi
@ 2009-05-06 15:37       ` Gregory Haskins
  2009-05-07  1:34         ` Davide Libenzi
  0 siblings, 1 reply; 35+ messages in thread
From: Gregory Haskins @ 2009-05-06 15:37 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: viro, kvm, Linux Kernel Mailing List, avi

[-- Attachment #1: Type: text/plain, Size: 3083 bytes --]

Davide Libenzi wrote:
> On Wed, 6 May 2009, Gregory Haskins wrote:
>
>   
>> Al, Davide,
>>
>> Gregory Haskins wrote:
>>     
>>> +
>>> +int
>>> +kvm_irqfd(struct kvm *kvm, int gsi, int flags)
>>> +{
>>> +	struct _irqfd *irqfd;
>>> +	struct file *file = NULL;
>>> +	int fd = -1;
>>> +	int ret;
>>> +
>>> +	irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
>>> +	if (!irqfd)
>>> +		return -ENOMEM;
>>> +
>>> +	irqfd->kvm = kvm;
>>> +	irqfd->gsi = gsi;
>>> +	INIT_LIST_HEAD(&irqfd->list);
>>> +	INIT_WORK(&irqfd->work, irqfd_inject);
>>> +
>>> +	/*
>>> +	 * We re-use eventfd for irqfd, and therefore will embed the eventfd
>>> +	 * lifetime in the irqfd.
>>> +	 */
>>> +	file = eventfd_file_create(0, 0);
>>> +	if (IS_ERR(file)) {
>>> +		ret = PTR_ERR(file);
>>> +		goto fail;
>>> +	}
>>> +
>>> +	irqfd->file = file;
>>> +
>>> +	/*
>>> +	 * Install our own custom wake-up handling so we are notified via
>>> +	 * a callback whenever someone signals the underlying eventfd
>>> +	 */
>>> +	init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
>>> +	init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc);
>>> +
>>> +	ret = file->f_op->poll(file, &irqfd->pt);
>>> +	if (ret < 0)
>>> +		goto fail;
>>> +
>>> +	mutex_lock(&kvm->lock);
>>> +	list_add_tail(&irqfd->list, &kvm->irqfds);
>>> +	mutex_unlock(&kvm->lock);
>>> +
>>> +	ret = get_unused_fd();
>>> +	if (ret < 0)
>>> +		goto fail;
>>> +
>>> +	fd = ret;
>>> +
>>> +	fd_install(fd, file);
>>>   
>>>       
>> Can you comment on whether this function needs to take an additional
>> reference on file here?  (one for the one held by userspace/fd, and the
>> other for irqfd->file)  My instinct is telling me this may be currently
>> broken, but I am not sure.
>>     
>
> I don't know exactly how it is used, but looks broken.
Yeah, I think it is.  I addressed this in v5 so please review that
version if you have a moment.

>  If the fd is 
> returned to userspace, and userspace closes the fd, you will leak the 
> irqfd*. If you do an extra fget(), you will never know if the userspace 
> closed the last-1 instance of the eventfd file*.
> What is likely needed, is add a callback to eventfd_file_create(), so that 
> the eventfd can call your callback on ->release, and give you a chance to 
> perform proper cleanup of the irqfd*.
>   

I think we are ok in this regard (at least in v5) without the callback. 
kvm holds irqfd, which holds eventfd.  In a normal situation, we will
have eventfd with 2 references.  If userspace closes the eventfd, it
will drop 1 of the 2 eventfd file references, but the object should
remain intact as long as kvm still holds it as well.  When the kvm-fd is
released, we will then decouple from the eventfd->wqh and drop the last
fput(), officially freeing it.

Likewise, if kvm is closed before the eventfd, we will simply decouple
from the wqh and fput(eventfd), leaving the last reference held by
userspace until it closes as well.

Let me know if you see any holes in that.

Thanks Davide,
-Greg



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-06 15:37       ` Gregory Haskins
@ 2009-05-07  1:34         ` Davide Libenzi
  2009-05-07  2:06           ` Gregory Haskins
  2009-05-07  9:48           ` Avi Kivity
  0 siblings, 2 replies; 35+ messages in thread
From: Davide Libenzi @ 2009-05-07  1:34 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: viro, kvm, Linux Kernel Mailing List, avi

On Wed, 6 May 2009, Gregory Haskins wrote:

> I think we are ok in this regard (at least in v5) without the callback. 
> kvm holds irqfd, which holds eventfd.  In a normal situation, we will
> have eventfd with 2 references.  If userspace closes the eventfd, it
> will drop 1 of the 2 eventfd file references, but the object should
> remain intact as long as kvm still holds it as well.  When the kvm-fd is
> released, we will then decouple from the eventfd->wqh and drop the last
> fput(), officially freeing it.
> 
> Likewise, if kvm is closed before the eventfd, we will simply decouple
> from the wqh and fput(eventfd), leaving the last reference held by
> userspace until it closes as well.
> 
> Let me know if you see any holes in that.

Looks OK, modulo my knowledge of KVM internals.


- Davide



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-07  1:34         ` Davide Libenzi
@ 2009-05-07  2:06           ` Gregory Haskins
  2009-05-08 15:06             ` Davide Libenzi
  2009-05-07  9:48           ` Avi Kivity
  1 sibling, 1 reply; 35+ messages in thread
From: Gregory Haskins @ 2009-05-07  2:06 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: viro, kvm, Linux Kernel Mailing List, avi

[-- Attachment #1: Type: text/plain, Size: 1116 bytes --]

Davide Libenzi wrote:
> On Wed, 6 May 2009, Gregory Haskins wrote:
>
>   
>> I think we are ok in this regard (at least in v5) without the callback. 
>> kvm holds irqfd, which holds eventfd.  In a normal situation, we will
>> have eventfd with 2 references.  If userspace closes the eventfd, it
>> will drop 1 of the 2 eventfd file references, but the object should
>> remain intact as long as kvm still holds it as well.  When the kvm-fd is
>> released, we will then decouple from the eventfd->wqh and drop the last
>> fput(), officially freeing it.
>>
>> Likewise, if kvm is closed before the eventfd, we will simply decouple
>> from the wqh and fput(eventfd), leaving the last reference held by
>> userspace until it closes as well.
>>
>> Let me know if you see any holes in that.
>>     
>
> Looks OK, modulo my knowledge of KVM internals.
>   

Thanks Davide,

If there isn't any more feedback on the series from Al, Avi,
etc...please formally submit your eventfd patch so this series is
available for Avi to pull in for inclusion when/if he deems it fit.

Thanks again,
-Greg



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-07  1:34         ` Davide Libenzi
  2009-05-07  2:06           ` Gregory Haskins
@ 2009-05-07  9:48           ` Avi Kivity
  2009-05-07 13:46             ` Marcelo Tosatti
                               ` (2 more replies)
  1 sibling, 3 replies; 35+ messages in thread
From: Avi Kivity @ 2009-05-07  9:48 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Gregory Haskins, viro, kvm, Linux Kernel Mailing List

Davide Libenzi wrote:
> On Wed, 6 May 2009, Gregory Haskins wrote:
>
>   
>> I think we are ok in this regard (at least in v5) without the callback. 
>> kvm holds irqfd, which holds eventfd.  In a normal situation, we will
>> have eventfd with 2 references.  If userspace closes the eventfd, it
>> will drop 1 of the 2 eventfd file references, but the object should
>> remain intact as long as kvm still holds it as well.  When the kvm-fd is
>> released, we will then decouple from the eventfd->wqh and drop the last
>> fput(), officially freeing it.
>>
>> Likewise, if kvm is closed before the eventfd, we will simply decouple
>> from the wqh and fput(eventfd), leaving the last reference held by
>> userspace until it closes as well.
>>
>> Let me know if you see any holes in that.
>>     
>
> Looks OK, modulo my knowledge of KVM internals.
>   

What's your take on adding irq context safe callbacks to irqfd?

To give some background here, we would like to use eventfd as a generic 
connector between components, so the components do not know about each 
other.  So far eventfd successfully abstracts among components in the 
same process, in different processes, and in the kernel.

eventfd_signal() can be safely called from irq context, and will wake up 
a waiting task.  But in some cases, if the consumer is in the kernel, it 
may be able to consume the event from irq context, saving a context switch.

So, will you consider patches adding this capability to eventfd?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-07  9:48           ` Avi Kivity
@ 2009-05-07 13:46             ` Marcelo Tosatti
  2009-05-07 14:01               ` Gregory Haskins
  2009-05-07 14:46             ` Davide Libenzi
  2009-05-08  3:13             ` Davide Libenzi
  2 siblings, 1 reply; 35+ messages in thread
From: Marcelo Tosatti @ 2009-05-07 13:46 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Davide Libenzi, Gregory Haskins, viro, kvm, Linux Kernel Mailing List

On Thu, May 07, 2009 at 12:48:21PM +0300, Avi Kivity wrote:
> Davide Libenzi wrote:
>> On Wed, 6 May 2009, Gregory Haskins wrote:
>>
>>   
>>> I think we are ok in this regard (at least in v5) without the 
>>> callback. kvm holds irqfd, which holds eventfd.  In a normal 
>>> situation, we will
>>> have eventfd with 2 references.  If userspace closes the eventfd, it
>>> will drop 1 of the 2 eventfd file references, but the object should
>>> remain intact as long as kvm still holds it as well.  When the kvm-fd is
>>> released, we will then decouple from the eventfd->wqh and drop the last
>>> fput(), officially freeing it.
>>>
>>> Likewise, if kvm is closed before the eventfd, we will simply decouple
>>> from the wqh and fput(eventfd), leaving the last reference held by
>>> userspace until it closes as well.
>>>
>>> Let me know if you see any holes in that.
>>>     
>>
>> Looks OK, modulo my knowledge of KVM internals.
>>   
>
> What's your take on adding irq context safe callbacks to irqfd?
>
> To give some background here, we would like to use eventfd as a generic  
> connector between components, so the components do not know about each  
> other.  So far eventfd successfully abstracts among components in the  
> same process, in different processes, and in the kernel.
>
> eventfd_signal() can be safely called from irq context, and will wake up  
> a waiting task.  But in some cases, if the consumer is in the kernel, it  
> may be able to consume the event from irq context, saving a context 
> switch.
>
> So, will you consider patches adding this capability to eventfd?

(pasting from a separate thread)

> That's my thinking.  PCI interrupts don't work because we need to do  
> some hacky stuff in there, but MSI should.  Oh, and we could improve
> UIO  
> support for interrupts when using MSI, since there's no need to  
> acknowledge the interrupt.

Ok, so for INTx assigned devices all you need to do on the ACK handler
is to re-enable the host interrupt (and set the guest interrupt line to
low).

Right now the ack comes through a kvm internal irq ack callback.

AFAICS there is no mechanism in irqfd for ACK notification, and
interrupt injection is edge triggered.

So for PCI INTx assigned devices (or any INTx level), you'd want to keep
the guest interrupt high, with some way to notify the ACK.

Avi mentioned a separate irqfd to notify the ACK. For assigned devices,
you could register a fd wakeup function in that fd, which replaces the
current irq ACK callback?


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-07 13:46             ` Marcelo Tosatti
@ 2009-05-07 14:01               ` Gregory Haskins
  2009-05-07 14:34                 ` Avi Kivity
  0 siblings, 1 reply; 35+ messages in thread
From: Gregory Haskins @ 2009-05-07 14:01 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Avi Kivity, Davide Libenzi, Gregory Haskins, viro, kvm,
	Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 3815 bytes --]

Marcelo Tosatti wrote:
> On Thu, May 07, 2009 at 12:48:21PM +0300, Avi Kivity wrote:
>   
>> Davide Libenzi wrote:
>>     
>>> On Wed, 6 May 2009, Gregory Haskins wrote:
>>>
>>>   
>>>       
>>>> I think we are ok in this regard (at least in v5) without the 
>>>> callback. kvm holds irqfd, which holds eventfd.  In a normal 
>>>> situation, we will
>>>> have eventfd with 2 references.  If userspace closes the eventfd, it
>>>> will drop 1 of the 2 eventfd file references, but the object should
>>>> remain intact as long as kvm still holds it as well.  When the kvm-fd is
>>>> released, we will then decouple from the eventfd->wqh and drop the last
>>>> fput(), officially freeing it.
>>>>
>>>> Likewise, if kvm is closed before the eventfd, we will simply decouple
>>>> from the wqh and fput(eventfd), leaving the last reference held by
>>>> userspace until it closes as well.
>>>>
>>>> Let me know if you see any holes in that.
>>>>     
>>>>         
>>> Looks OK, modulo my knowledge of KVM internals.
>>>   
>>>       
>> What's your take on adding irq context safe callbacks to irqfd?
>>
>> To give some background here, we would like to use eventfd as a generic  
>> connector between components, so the components do not know about each  
>> other.  So far eventfd successfully abstracts among components in the  
>> same process, in different processes, and in the kernel.
>>
>> eventfd_signal() can be safely called from irq context, and will wake up  
>> a waiting task.  But in some cases, if the consumer is in the kernel, it  
>> may be able to consume the event from irq context, saving a context 
>> switch.
>>
>> So, will you consider patches adding this capability to eventfd?
>>     
>
> (pasting from a separate thread)
>
>   
>> That's my thinking.  PCI interrupts don't work because we need to do  
>> some hacky stuff in there, but MSI should.  Oh, and we could improve
>> UIO  
>> support for interrupts when using MSI, since there's no need to  
>> acknowledge the interrupt.
>>     
>
> Ok, so for INTx assigned devices all you need to do on the ACK handler
> is to re-enable the host interrupt (and set the guest interrupt line to
> low).
>
> Right now the ack comes through a kvm internal irq ack callback.
>
> AFAICS there is no mechanism in irqfd for ACK notification, and
> interrupt injection is edge triggered.
>
> So for PCI INTx assigned devices (or any INTx level), you'd want to keep
> the guest interrupt high, with some way to notify the ACK.
>
> Avi mentioned a separate irqfd to notify the ACK. For assigned devices,
> you could register a fd wakeup function in that fd, which replaces the
> current irq ACK callback?
>   

One thing I was thinking here was that I could create a flag for the
kvm_irqfd() function for something like "KVM_IRQFD_MODE_CLEAR".  This
flag when specified at creation time will cause the event to execute a
clear operation instead of a set when triggered.  That way, the default
mode is an edge-triggered set.  The non-default mode is to trigger a
clear.  Level-triggered ints could therefore create two irqfds, one for
raising, the other for clearing.

An alternative is to abandon the use of eventfd, and allow the irqfd to
be a first-class anon-fd.  The parameters passed to the write/signal()
function could then indicate the desired level.  The disadvantage would
be that it would not be compatible with eventfd, so we would need to
decide if the tradeoff is worth it.

OTOH, I suspect level triggered interrupts will be primarily in the
legacy domain, so perhaps we do not need to worry about it too much. 
Therefore, another option is that we *could* simply set the stake in the
ground that legacy/level cannot use irqfd.

Thoughts?

-Greg



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-07 14:01               ` Gregory Haskins
@ 2009-05-07 14:34                 ` Avi Kivity
  2009-05-07 14:54                   ` Gregory Haskins
  0 siblings, 1 reply; 35+ messages in thread
From: Avi Kivity @ 2009-05-07 14:34 UTC (permalink / raw)
  To: Gregory Haskins
  Cc: Marcelo Tosatti, Davide Libenzi, viro, kvm, Linux Kernel Mailing List

Gregory Haskins wrote:
> One thing I was thinking here was that I could create a flag for the
> kvm_irqfd() function for something like "KVM_IRQFD_MODE_CLEAR".  This
> flag when specified at creation time will cause the event to execute a
> clear operation instead of a set when triggered.  That way, the default
> mode is an edge-triggered set.  The non-default mode is to trigger a
> clear.  Level-triggered ints could therefore create two irqfds, one for
> raising, the other for clearing.
>   

That's my second choice option.

> An alternative is to abandon the use of eventfd, and allow the irqfd to
> be a first-class anon-fd.  The parameters passed to the write/signal()
> function could then indicate the desired level.  The disadvantage would
> be that it would not be compatible with eventfd, so we would need to
> decide if the tradeoff is worth it.
>   

I would really like to keep using eventfd.  Which is why I asked Davide 
about the prospects of direct callbacks (vs wakeups).

> OTOH, I suspect level triggered interrupts will be primarily in the
> legacy domain, so perhaps we do not need to worry about it too much. 
> Therefore, another option is that we *could* simply set the stake in the
> ground that legacy/level cannot use irqfd.
>   

This is my preferred option.  For a virtio-net-server in the kernel, 
we'd service its eventfd in qemu, raising and lowering the pci interrupt 
in the traditional way.

But we'd still need to know when to lower the interrupt.  How?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-07  9:48           ` Avi Kivity
  2009-05-07 13:46             ` Marcelo Tosatti
@ 2009-05-07 14:46             ` Davide Libenzi
  2009-05-07 15:47               ` Avi Kivity
  2009-05-08  3:13             ` Davide Libenzi
  2 siblings, 1 reply; 35+ messages in thread
From: Davide Libenzi @ 2009-05-07 14:46 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Gregory Haskins, viro, kvm, Linux Kernel Mailing List

On Thu, 7 May 2009, Avi Kivity wrote:

> What's your take on adding irq context safe callbacks to irqfd?
> 
> To give some background here, we would like to use eventfd as a generic
> connector between components, so the components do not know about each other.
> So far eventfd successfully abstracts among components in the same process, in
> different processes, and in the kernel.
> 
> eventfd_signal() can be safely called from irq context, and will wake up a
> waiting task.  But in some cases, if the consumer is in the kernel, it may be
> able to consume the event from irq context, saving a context switch.
> 
> So, will you consider patches adding this capability to eventfd?

Maybe I got lost in the thread, but inside the kernel we have 
callback-based wakeup since long time. This is what epoll uses, when 
hooking into the file* f_op->poll() subsystem.
Did you mean something else?


- Davide



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-07 14:34                 ` Avi Kivity
@ 2009-05-07 14:54                   ` Gregory Haskins
  2009-05-07 15:26                     ` Avi Kivity
  0 siblings, 1 reply; 35+ messages in thread
From: Gregory Haskins @ 2009-05-07 14:54 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Marcelo Tosatti, Davide Libenzi, viro, kvm, Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 2675 bytes --]

Avi Kivity wrote:
> Gregory Haskins wrote:
>> One thing I was thinking here was that I could create a flag for the
>> kvm_irqfd() function for something like "KVM_IRQFD_MODE_CLEAR".  This
>> flag when specified at creation time will cause the event to execute a
>> clear operation instead of a set when triggered.  That way, the default
>> mode is an edge-triggered set.  The non-default mode is to trigger a
>> clear.  Level-triggered ints could therefore create two irqfds, one for
>> raising, the other for clearing.
>>   
>
> That's my second choice option.
>
>> An alternative is to abandon the use of eventfd, and allow the irqfd to
>> be a first-class anon-fd.  The parameters passed to the write/signal()
>> function could then indicate the desired level.  The disadvantage would
>> be that it would not be compatible with eventfd, so we would need to
>> decide if the tradeoff is worth it.
>>   
>
> I would really like to keep using eventfd.  Which is why I asked
> Davide about the prospects of direct callbacks (vs wakeups).

I saw that request.  That would be ideal.

>
>> OTOH, I suspect level triggered interrupts will be primarily in the
>> legacy domain, so perhaps we do not need to worry about it too much.
>> Therefore, another option is that we *could* simply set the stake in the
>> ground that legacy/level cannot use irqfd.
>>   
>
> This is my preferred option.  For a virtio-net-server in the kernel,
> we'd service its eventfd in qemu, raising and lowering the pci
> interrupt in the traditional way.
>
> But we'd still need to know when to lower the interrupt.  How?

IIUC, isn't that  usually device/subsystem specific, and out of scope of
the GSI delivery vehicle?  For instance, most devices I have seen with
level ints have a register in their device register namespace for acking
the int.  As an aside, this is what causes some of the grief in dealing
with shared interrupts like KVM pass-through and/or threaded-isrs:  
There isn't a standardized way to ACK them.

You may also see some generalization of masking/acking in things like
the MSI-X table.  But again, this would be out of scope of the general
GSI delivery path IIUC.

I understand that there is a feedback mechanism in the ioapic model for
calling back on acknowledgment of the interrupt.  But I am not sure what
is how the real hardware works normally, and therefore I am not
convinced that is something we need to feed all the way back (i.e. via
irqfd or whatever).  In the interest of full disclosure, its been a few
years since I studied the xAPIC docs, so I might be out to lunch on that
assertion. ;)

-Greg




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-07 14:54                   ` Gregory Haskins
@ 2009-05-07 15:26                     ` Avi Kivity
  0 siblings, 0 replies; 35+ messages in thread
From: Avi Kivity @ 2009-05-07 15:26 UTC (permalink / raw)
  To: Gregory Haskins
  Cc: Marcelo Tosatti, Davide Libenzi, viro, kvm, Linux Kernel Mailing List

Gregory Haskins wrote:

  

>> This is my preferred option.  For a virtio-net-server in the kernel,
>> we'd service its eventfd in qemu, raising and lowering the pci
>> interrupt in the traditional way.
>>
>> But we'd still need to know when to lower the interrupt.  How?
>>     
>
> IIUC, isn't that  usually device/subsystem specific, and out of scope of
> the GSI delivery vehicle?  For instance, most devices I have seen with
> level ints have a register in their device register namespace for acking
> the int.  

Yes it is.

> As an aside, this is what causes some of the grief in dealing
> with shared interrupts like KVM pass-through and/or threaded-isrs:  
> There isn't a standardized way to ACK them.
>   

So we'd need a side channel to tell userspace to lower the irq.  Another 
eventfd likely.

Note we don't support device assignment for devices with shared interrupts.

> You may also see some generalization of masking/acking in things like
> the MSI-X table.  But again, this would be out of scope of the general
> GSI delivery path IIUC.
>
> I understand that there is a feedback mechanism in the ioapic model for
> calling back on acknowledgment of the interrupt.  But I am not sure what
> is how the real hardware works normally, and therefore I am not
> convinced that is something we need to feed all the way back (i.e. via
> irqfd or whatever).  In the interest of full disclosure, its been a few
> years since I studied the xAPIC docs, so I might be out to lunch on that
> assertion. ;)
>   

Right, that ack thing is completely internal, used for catching up on 
time drift, and for shutting down level triggered assigned interrupts.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-07 14:46             ` Davide Libenzi
@ 2009-05-07 15:47               ` Avi Kivity
  2009-05-07 16:44                 ` Davide Libenzi
  0 siblings, 1 reply; 35+ messages in thread
From: Avi Kivity @ 2009-05-07 15:47 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Gregory Haskins, viro, kvm, Linux Kernel Mailing List

Davide Libenzi wrote:
>   
>> What's your take on adding irq context safe callbacks to irqfd?
>>
>> To give some background here, we would like to use eventfd as a generic
>> connector between components, so the components do not know about each other.
>> So far eventfd successfully abstracts among components in the same process, in
>> different processes, and in the kernel.
>>
>> eventfd_signal() can be safely called from irq context, and will wake up a
>> waiting task.  But in some cases, if the consumer is in the kernel, it may be
>> able to consume the event from irq context, saving a context switch.
>>
>> So, will you consider patches adding this capability to eventfd?
>>     
>
> Maybe I got lost in the thread, but inside the kernel we have 
> callback-based wakeup since long time. This is what epoll uses, when 
> hooking into the file* f_op->poll() subsystem.
> Did you mean something else?
>   

Do you mean wait_queue_t::func?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-07 15:47               ` Avi Kivity
@ 2009-05-07 16:44                 ` Davide Libenzi
  2009-05-07 18:12                   ` Avi Kivity
  0 siblings, 1 reply; 35+ messages in thread
From: Davide Libenzi @ 2009-05-07 16:44 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Gregory Haskins, viro, kvm, Linux Kernel Mailing List

On Thu, 7 May 2009, Avi Kivity wrote:

> Davide Libenzi wrote:
> >   
> > > What's your take on adding irq context safe callbacks to irqfd?
> > > 
> > > To give some background here, we would like to use eventfd as a generic
> > > connector between components, so the components do not know about each
> > > other.
> > > So far eventfd successfully abstracts among components in the same
> > > process, in
> > > different processes, and in the kernel.
> > > 
> > > eventfd_signal() can be safely called from irq context, and will wake up a
> > > waiting task.  But in some cases, if the consumer is in the kernel, it may
> > > be
> > > able to consume the event from irq context, saving a context switch.
> > > 
> > > So, will you consider patches adding this capability to eventfd?
> > >     
> > 
> > Maybe I got lost in the thread, but inside the kernel we have callback-based
> > wakeup since long time. This is what epoll uses, when hooking into the file*
> > f_op->poll() subsystem.
> > Did you mean something else?
> >   
> 
> Do you mean wait_queue_t::func?

Yes, it is since long time ago that a wakeup in Linux can lead either to a 
real wakeup (in scheduler terms), or to a simple function call.
Isn't that enough?


- Davide



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-07 16:44                 ` Davide Libenzi
@ 2009-05-07 18:12                   ` Avi Kivity
  0 siblings, 0 replies; 35+ messages in thread
From: Avi Kivity @ 2009-05-07 18:12 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Gregory Haskins, viro, kvm, Linux Kernel Mailing List

Davide Libenzi wrote:
> On Thu, 7 May 2009, Avi Kivity wrote:
>
>   
>> Davide Libenzi wrote:
>>     
>>>   
>>>       
>>>> What's your take on adding irq context safe callbacks to irqfd?
>>>>
>>>> To give some background here, we would like to use eventfd as a generic
>>>> connector between components, so the components do not know about each
>>>> other.
>>>> So far eventfd successfully abstracts among components in the same
>>>> process, in
>>>> different processes, and in the kernel.
>>>>
>>>> eventfd_signal() can be safely called from irq context, and will wake up a
>>>> waiting task.  But in some cases, if the consumer is in the kernel, it may
>>>> be
>>>> able to consume the event from irq context, saving a context switch.
>>>>
>>>> So, will you consider patches adding this capability to eventfd?
>>>>     
>>>>         
>>> Maybe I got lost in the thread, but inside the kernel we have callback-based
>>> wakeup since long time. This is what epoll uses, when hooking into the file*
>>> f_op->poll() subsystem.
>>> Did you mean something else?
>>>   
>>>       
>> Do you mean wait_queue_t::func?
>>     
>
> Yes, it is since long time ago that a wakeup in Linux can lead either to a 
> real wakeup (in scheduler terms), or to a simple function call.
> Isn't that enough?
>
>   

It's perfect.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-07  9:48           ` Avi Kivity
  2009-05-07 13:46             ` Marcelo Tosatti
  2009-05-07 14:46             ` Davide Libenzi
@ 2009-05-08  3:13             ` Davide Libenzi
  2009-05-08  8:19               ` Avi Kivity
  2 siblings, 1 reply; 35+ messages in thread
From: Davide Libenzi @ 2009-05-08  3:13 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Gregory Haskins, viro, kvm, Linux Kernel Mailing List

On Thu, 7 May 2009, Avi Kivity wrote:

> What's your take on adding irq context safe callbacks to irqfd?
> 
> To give some background here, we would like to use eventfd as a generic
> connector between components, so the components do not know about each other.
> So far eventfd successfully abstracts among components in the same process, in
> different processes, and in the kernel.
> 
> eventfd_signal() can be safely called from irq context, and will wake up a
> waiting task.  But in some cases, if the consumer is in the kernel, it may be
> able to consume the event from irq context, saving a context switch.
> 
> So, will you consider patches adding this capability to eventfd?

Since I received this one after your ACK on the capability of eventfd of 
triggering callbacks, I assume we're clear on this point, aren't we?


- Davide



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-08  3:13             ` Davide Libenzi
@ 2009-05-08  8:19               ` Avi Kivity
  0 siblings, 0 replies; 35+ messages in thread
From: Avi Kivity @ 2009-05-08  8:19 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Gregory Haskins, viro, kvm, Linux Kernel Mailing List

Davide Libenzi wrote:
> On Thu, 7 May 2009, Avi Kivity wrote:
>
>   
>> What's your take on adding irq context safe callbacks to irqfd?
>>
>> To give some background here, we would like to use eventfd as a generic
>> connector between components, so the components do not know about each other.
>> So far eventfd successfully abstracts among components in the same process, in
>> different processes, and in the kernel.
>>
>> eventfd_signal() can be safely called from irq context, and will wake up a
>> waiting task.  But in some cases, if the consumer is in the kernel, it may be
>> able to consume the event from irq context, saving a context switch.
>>
>> So, will you consider patches adding this capability to eventfd?
>>     
>
> Since I received this one after your ACK on the capability of eventfd of 
> triggering callbacks, I assume we're clear on this point, aren't we?
>   

We are.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-07  2:06           ` Gregory Haskins
@ 2009-05-08 15:06             ` Davide Libenzi
  2009-05-12  3:55               ` Gregory Haskins
  0 siblings, 1 reply; 35+ messages in thread
From: Davide Libenzi @ 2009-05-08 15:06 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: viro, kvm, Linux Kernel Mailing List, avi

On Wed, 6 May 2009, Gregory Haskins wrote:

> If there isn't any more feedback on the series from Al, Avi,
> etc...please formally submit your eventfd patch so this series is
> available for Avi to pull in for inclusion when/if he deems it fit.

Did you decide you will be using those bits?


- Davide



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-08 15:06             ` Davide Libenzi
@ 2009-05-12  3:55               ` Gregory Haskins
  2009-05-12  6:55                 ` Davide Libenzi
  0 siblings, 1 reply; 35+ messages in thread
From: Gregory Haskins @ 2009-05-12  3:55 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Gregory Haskins, viro, kvm, Linux Kernel Mailing List, avi

[-- Attachment #1: Type: text/plain, Size: 684 bytes --]

Davide Libenzi wrote:
> On Wed, 6 May 2009, Gregory Haskins wrote:
>
>   
>> If there isn't any more feedback on the series from Al, Avi,
>> etc...please formally submit your eventfd patch so this series is
>> available for Avi to pull in for inclusion when/if he deems it fit.
>>     
>
> Did you decide you will be using those bits?
>   

Hi Davide,
  It appears that we will not end up needing them since the new version
I am about to push goes back to creating the eventfd in userspace to
begin with, per Avi's request.  Regardless, thank you kindly for your
help here, and also for your poll suggestion which we will definitely be
using.

Regards,
-Greg



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 266 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface
  2009-05-12  3:55               ` Gregory Haskins
@ 2009-05-12  6:55                 ` Davide Libenzi
  0 siblings, 0 replies; 35+ messages in thread
From: Davide Libenzi @ 2009-05-12  6:55 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: viro, kvm, Linux Kernel Mailing List, avi

On Mon, 11 May 2009, Gregory Haskins wrote:

> Davide Libenzi wrote:
> > On Wed, 6 May 2009, Gregory Haskins wrote:
> >
> >   
> >> If there isn't any more feedback on the series from Al, Avi,
> >> etc...please formally submit your eventfd patch so this series is
> >> available for Avi to pull in for inclusion when/if he deems it fit.
> >>     
> >
> > Did you decide you will be using those bits?
> >   
> 
> Hi Davide,
>   It appears that we will not end up needing them since the new version
> I am about to push goes back to creating the eventfd in userspace to
> begin with, per Avi's request.

That looks like a good idea.


- Davide



^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2009-05-12  7:00 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-05-04 17:57 [KVM PATCH v4 0/2] irqfd Gregory Haskins
2009-05-04 17:57 ` [KVM PATCH v4 1/2] eventfd: export eventfd interfaces for module use Gregory Haskins
2009-05-04 22:24   ` Al Viro
2009-05-05  2:17     ` Gregory Haskins
2009-05-04 17:57 ` [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface Gregory Haskins
2009-05-05 15:45   ` Avi Kivity
2009-05-05 17:34     ` Gregory Haskins
2009-05-05 17:43       ` Avi Kivity
2009-05-05 17:56     ` Gregory Haskins
2009-05-05 18:10       ` Avi Kivity
2009-05-05 18:21         ` Gregory Haskins
2009-05-06 11:35   ` Gregory Haskins
2009-05-06 15:24     ` Davide Libenzi
2009-05-06 15:37       ` Gregory Haskins
2009-05-07  1:34         ` Davide Libenzi
2009-05-07  2:06           ` Gregory Haskins
2009-05-08 15:06             ` Davide Libenzi
2009-05-12  3:55               ` Gregory Haskins
2009-05-12  6:55                 ` Davide Libenzi
2009-05-07  9:48           ` Avi Kivity
2009-05-07 13:46             ` Marcelo Tosatti
2009-05-07 14:01               ` Gregory Haskins
2009-05-07 14:34                 ` Avi Kivity
2009-05-07 14:54                   ` Gregory Haskins
2009-05-07 15:26                     ` Avi Kivity
2009-05-07 14:46             ` Davide Libenzi
2009-05-07 15:47               ` Avi Kivity
2009-05-07 16:44                 ` Davide Libenzi
2009-05-07 18:12                   ` Avi Kivity
2009-05-08  3:13             ` Davide Libenzi
2009-05-08  8:19               ` Avi Kivity
2009-05-04 18:06 ` [KVM PATCH v4 0/2] irqfd Gregory Haskins
2009-05-04 18:52 ` Gregory Haskins
2009-05-05 14:16 ` Davide Libenzi
2009-05-05 14:27   ` Gregory Haskins

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).