Message-ID: <1446056328.8018.422.camel@redhat.com>
Subject: Re: [RFC PATCH] VFIO: Add a parameter to force nonthread IRQ
From: Alex Williamson
To: Yunhong Jiang
Cc: Paolo Bonzini, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Steven Rostedt
Date: Wed, 28 Oct 2015 12:18:48 -0600
In-Reply-To: <20151028175013.GA21961@jnakajim-build>
References: <1445908801-14732-1-git-send-email-yunhong.jiang@linux.intel.com>
 <1445917034.8018.220.camel@redhat.com>
 <20151027063501.GA22054@jnakajim-build>
 <562F43F8.1040101@redhat.com>
 <20151027212648.GA22916@jnakajim-build>
 <56301A87.9030907@redhat.com>
 <20151028175013.GA21961@jnakajim-build>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2015-10-28 at 10:50 -0700, Yunhong Jiang wrote:
> On Wed, Oct 28, 2015 at 01:44:55AM +0100, Paolo Bonzini wrote:
> >
> >
> > On 27/10/2015 22:26, Yunhong Jiang wrote:
> > >> > On RT kernels however can you call eventfd_signal from interrupt
> > >> > context? You cannot call spin_lock_irqsave (which can sleep) from a
> > >> > non-threaded interrupt handler, can you? You would need a raw spin lock.
> > > Thanks for pointing this out. Yes, we can't call spin_lock_irqsave on RT
> > > kernel. Will do this way on next patch. But not sure if it's overkill to use
> > > raw_spinlock there since the eventfd_signal is used by other caller also.
> >
> > No, I don't think you can use raw_spinlock there. The problem is not
> > just eventfd_signal, it is especially wake_up_locked_poll.
> > You cannot convert the whole workqueue infrastructure to use raw_spinlock.
>
> You mean the waitqueue, instead of workqueue, right? One choice is to change
> the eventfd to use simple wait queue, which is raw_spinlock. But use simple
> waitqueue on eventfd may in fact impact real time latency if not in this
> scenario.
>
> >
> > Alex, would it make sense to use the IRQ bypass infrastructure always,
> > not just for VT-d, to do the MSI injection directly from the VFIO
> > interrupt handler and bypass the eventfd? Basically this would add an
> > RCU-protected list of consumers matching the token to struct
> > irq_bypass_producer, and a
> >
> > 	int (*inject)(struct irq_bypass_consumer *);
> >
> > callback to struct irq_bypass_consumer. If any callback returns true,
> > the eventfd is not signaled. The KVM implementation would be like this
> > (compare with virt/kvm/eventfd.c):
> >
> > 	/* Extracted out of irqfd_wakeup */
> > 	static int
> > 	irqfd_wakeup_pollin(struct kvm_kernel_irqfd *irqfd)
> > 	{
> > 		...
> > 	}
> >
> > 	/* Extracted out of irqfd_wakeup */
> > 	static int
> > 	irqfd_wakeup_pollhup(struct kvm_kernel_irqfd *irqfd)
> > 	{
> > 		...
> > 	}
> >
> > 	static int
> > 	irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync,
> > 		     void *key)
> > 	{
> > 		struct _irqfd *irqfd = container_of(wait,
> > 			struct _irqfd, wait);
> > 		unsigned long flags = (unsigned long)key;
> >
> > 		if (flags & POLLIN)
> > 			irqfd_wakeup_pollin(irqfd);
> > 		if (flags & POLLHUP)
> > 			irqfd_wakeup_pollhup(irqfd);
> >
> > 		return 0;
> > 	}
> >
> > 	static int kvm_arch_irq_bypass_inject(
> > 		struct irq_bypass_consumer *cons)
> > 	{
> > 		struct kvm_kernel_irqfd *irqfd =
> > 			container_of(cons, struct kvm_kernel_irqfd,
> > 				     consumer);
> >
> > 		irqfd_wakeup_pollin(irqfd);
> > 	}
> >
> This is a good idea IMHO. So for MSI interrupt, the
> kvm_arch_irq_bypass_inject will be used, and the irqfd_wakeup will not be
> invoked anymore, am I right?
>
> I noticed the irq bypass manager is not merged yet, are there any git branch
> for it?

It's in linux-next via the kvm.git next branch:

git://git.kernel.org/pub/scm/virt/kvm/kvm.git

Thanks,
Alex
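
For readers following the thread, the control flow of the inject-callback proposal quoted above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code and not the IRQ bypass manager's API: the struct and function names (bypass_producer, bypass_consumer, producer_handle_irq and so on) are hypothetical, a plain singly linked list stands in for the RCU-protected consumer list, and a counter stands in for eventfd_signal. It also assumes every consumer is offered the interrupt, which the proposal does not spell out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of the proposal; none of these names are kernel API. */
struct bypass_consumer {
	struct bypass_consumer *next;
	/* Returns nonzero ("true") if the interrupt was injected directly. */
	int (*inject)(struct bypass_consumer *cons);
	int injected;			/* number of direct injections performed */
};

struct bypass_producer {
	/* In the proposal this would be an RCU-protected list; a plain list here. */
	struct bypass_consumer *consumers;
	int eventfd_signals;		/* stand-in for calls to eventfd_signal() */
};

/* Attach a consumer to the producer (no locking in this sketch). */
static void producer_add_consumer(struct bypass_producer *prod,
				  struct bypass_consumer *cons)
{
	assert(prod != NULL && cons != NULL);
	cons->next = prod->consumers;
	prod->consumers = cons;
}

/*
 * Model of the producer's interrupt handler: offer the interrupt to every
 * consumer, and fall back to the eventfd path only if none accepted it.
 * Returns 1 if some consumer injected directly, 0 otherwise.
 */
static int producer_handle_irq(struct bypass_producer *prod)
{
	struct bypass_consumer *cons;
	int handled = 0;

	assert(prod != NULL);
	for (cons = prod->consumers; cons != NULL; cons = cons->next) {
		if (cons->inject != NULL && cons->inject(cons))
			handled = 1;
	}
	if (!handled)
		prod->eventfd_signals++;	/* slow path: signal the eventfd */
	return handled;
}

/* Example consumer callback that injects directly and reports success. */
static int demo_inject_accept(struct bypass_consumer *cons)
{
	cons->injected++;
	return 1;
}

/* Example consumer callback that declines, forcing the eventfd fallback. */
static int demo_inject_decline(struct bypass_consumer *cons)
{
	(void)cons;
	return 0;
}
```

In this model, a producer with no consumers, or with only declining consumers, takes the eventfd path on every interrupt. Once a consumer whose callback returns nonzero is attached, the eventfd counter stops increasing, which corresponds to the case where irqfd_wakeup is no longer invoked.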