From: Alex Williamson <alex.williamson@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: rukhsana ansari <ruk.ansari@gmail.com>,
	qemu-devel@nongnu.org, kvm@vger.kernel.org,
	Juan Quintela <quintela@redhat.com>,
	jasowang@redhat.com, Jes.Sorensen@redhat.com
Subject: Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
Date: Mon, 14 Mar 2011 13:31:20 -0600
Message-ID: <1300131080.3141.47.camel@x201>
In-Reply-To: <20110314190002.GA24011@redhat.com>

On Mon, 2011-03-14 at 21:00 +0200, Michael S. Tsirkin wrote:
> On Mon, Mar 14, 2011 at 10:35:08PM +0530, rukhsana ansari wrote:
> > Seeking clarification to the original question I posted:
> > >>
> > >>
> > > This may be a novice question - I would appreciate it if you can provide a
> > > pointer to documentation or relevant code that explains what the
> > > limitation is in supporting level irqs in kvm irqfd.
> > >
> > >
> > >
> > After browsing the KVM kernel code, it does look like direct assignment of PCI
> > devices allows support for level-triggered interrupts to be injected to the
> > guest from the kernel.  (as opposed to not supporting it for vhost irqfd
> > mechanism)
> > This occurs when the guest device supports INTX.
> > Reference:  kvm_assigned_dev_interrupt_work_handler() in assigned-dev.c calls
> > kvm_set_irq()
> > with the guest_irq.
> > This function in turn invokes the assigned set function  (either
> > kvm_set_pic_irq or kvm_set_ioapic_irq) which was setup at kvm_irq_chip creation
> > time when kvm_setup_default_irq_routing() is called to handle the ioctl
> > KVM_CREATE_IRQCHIP.
> > 
> > So, it isn't clear why level-triggered interrupts aren't supported by the
> > irqfd mechanism.
> > Would greatly appreciate clarification here.
> > 
> > Thanks
> > -Rukhsana
> > 
> 
> Mostly, no one came up with an implementation so far.
> 
> If the point is to use irqfd with vhost-net, there's also
> a question of adding interfaces to
> 1. pass IO read transactions directly to another kernel module
> 2. add an interface to clear the irq level
> 
> Maybe the right thing is to combine the two somehow:
> irqfd might get an option to set a bit in memory,
> ioeventfd might get an option to read and clear from memory
> and clear irqfd line at the same time.

I had wanted this for VFIO too and it gets pretty complicated.  The
first problem with level triggered interrupts is that you need to know
which GSI your device triggers.  This means translating PCI INTA through
bridge swizzles and chipset mapping to an IOAPIC.  Current device
assignment does this through a complete hack in qemu.  Then you can set
the IRQ, but being level triggered, we need to know when the guest has
serviced the IRQ so we can de-assert it.  This requires a hook into the
in-kernel APIC to send the EOI back out to userspace.

I posted RFC patches for doing all this a while back, but they didn't go
anywhere.  I think the feeling was that it was too intrusive for "slow"
interrupts.  The current thinking for VFIO based device assignment is to
use qemu for level interrupts until we find something that actually
needs low latency in this path.  We generally consider INTx to be like
supporting i/o port space or non-4k BARs, i.e. necessary for
compatibility, but not necessarily a performance path.  High performance
devices should always be using some kind of MSI because it bypasses all
of the APIC complications and slowness.  Thanks,

Alex

