linux-kernel.vger.kernel.org archive mirror
From: Alex Williamson <alex.williamson@redhat.com>
To: Avi Kivity <avi@redhat.com>
Cc: mst@redhat.com, gleb@redhat.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, jan.kiszka@siemens.com
Subject: Re: [PATCH v7 2/2] kvm: KVM_EOIFD, an eventfd for EOIs
Date: Mon, 13 Aug 2012 15:34:01 -0600
Message-ID: <1344893641.4683.146.camel@ul30vt.home>
In-Reply-To: <50276B11.8020708@redhat.com>

On Sun, 2012-08-12 at 11:36 +0300, Avi Kivity wrote:
> On 08/09/2012 10:26 PM, Alex Williamson wrote:
> > On Mon, 2012-08-06 at 13:40 +0300, Avi Kivity wrote:
> >> On 08/06/2012 01:38 PM, Avi Kivity wrote:
> >> 
> >> > Regarding the implementation, instead of a linked list, would an array
> >> > of counters parallel to the bitmap make it simpler?
> >> 
> >> Or even, replace the bitmap with an array of counters.
> > 
> > I'm not sure a counter array is what we're really after.  That gives us
> > reference counting for the irq source IDs, but not the key->gsi lookup.
> 
> You can look up the gsi while registering the eoifd, so it's accessible
> as eoifd->gsi instead of eoifd->source->gsi.  The irqfd can go away
> while the eoifd is still active, but is this a problem?

In my opinion, no, but Michael disagrees.
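
For reference, a minimal sketch of the counter-array idea discussed above, in which a per-ID reference count replaces the allocation bitmap. All names (`source_id_table`, `source_id_alloc`, and so on) are hypothetical and this is not the kernel's implementation. It assumes the first two IDs are permanently reserved, as described for the shared userspace ID and the PIT. As noted above, this provides reference counting only; the gsi would have to be stored elsewhere, for example in the eoifd itself.

```c
#include <limits.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the kernel's code. */
#define NR_SOURCE_IDS   ((int)(sizeof(unsigned long) * CHAR_BIT)) /* BITS_PER_LONG */
#define NR_RESERVED_IDS 2  /* assumed: shared userspace ID and the PIT ID */

struct source_id_table {
	unsigned int refs[NR_SOURCE_IDS];  /* 0 means the ID is free */
};

/* Mark the reserved IDs as permanently in use and everything else free. */
static void source_id_table_init(struct source_id_table *t)
{
	int i;

	for (i = 0; i < NR_SOURCE_IDS; i++)
		t->refs[i] = (i < NR_RESERVED_IDS) ? 1 : 0;
}

/* Claim the lowest free ID with one reference; returns -1 if none is free. */
static int source_id_alloc(struct source_id_table *t)
{
	int i;

	for (i = NR_RESERVED_IDS; i < NR_SOURCE_IDS; i++) {
		if (t->refs[i] == 0) {
			t->refs[i] = 1;
			return i;
		}
	}
	return -1;
}

/* Take another reference on an allocated ID; returns -1 if invalid or free. */
static int source_id_get(struct source_id_table *t, int id)
{
	if (id < NR_RESERVED_IDS || id >= NR_SOURCE_IDS || t->refs[id] == 0)
		return -1;
	t->refs[id]++;
	return 0;
}

/* Drop a reference; returns the count left (0 means freed), -1 on error. */
static int source_id_put(struct source_id_table *t, int id)
{
	if (id < NR_RESERVED_IDS || id >= NR_SOURCE_IDS || t->refs[id] == 0)
		return -1;
	return (int)--t->refs[id];
}
```

With a 64-bit long this leaves 62 allocatable IDs, matching the limit mentioned below.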

> > It also highlights another issue, that we have a limited set of source
> > IDs.  Looks like we have BITS_PER_LONG IDs, with two already used, one
> > for the shared userspace ID and another for the PIT.  How happy are we
> > going to be with a limit of 62 level interrupts in use at one time?
> 
> When we start being unhappy we can increase that number.  On the other
> hand more locks and lists make me unhappy now.

Yep, good point.  My latest version removes the source ID object lock
and list (and objects).  I still have a lock and list for the ack
notification, but it's hard not to unless we combine them into one
mega-irqfd ioctl as Michael suggests.

> > It's arguably a reasonable number since the most virtualization friendly
> > devices (sr-iov VFs) don't even support this kind of interrupt.  It's
> > also very wasteful allocating an entire source ID for a single GSI
> > within that source ID.  PCI supports interrupts A, B, C, and D, which,
> > in the most optimal config, each go to different GSIs.  So we could
> > theoretically be more efficient in our use and allocation of irq source
> > IDs if we tracked use by the source ID, gsi pair.
> 
> There are, in one userspace, just three gsis available for PCI links, so
> you're compressing the source id space by 3.

I imagine there's a way to put each PCI interrupt pin on its own GSI, but
that's still only 4, which isn't a great expansion of the source ID space.
I prefer Michael's idea of re-using source IDs if we run out.

> > That probably makes it less practical to replace anything at the top
> > level with a counter array.  The key that we pass back is currently the
> > actual source ID, but we don't specify what it is, so we could split it
> > and have it encode a 16-bit source ID plus a 16-bit GSI.  It could also be
> > an idr entry.
> 
> We can fix those kinds of problems by adding another layer of
> indirection.  But I doubt they will be needed.  I don't see people
> assigning 60 legacy devices to one guest.

Yep, we can ignore it for now and put it in the hands of userspace to
re-use IDs if needed.
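
To illustrate the split-key idea quoted above (a 16-bit source ID plus a 16-bit GSI in one opaque key), here is a small sketch. The helper names and the choice to put the source ID in the high half are assumptions made for illustration; the thread does not specify a layout.

```c
#include <stdint.h>

/* Hypothetical helpers; the field layout is an assumption, not from the patch. */

/* Pack a 16-bit source ID (high half) and a 16-bit GSI (low half). */
static inline uint32_t eoifd_key_pack(uint16_t source_id, uint16_t gsi)
{
	return ((uint32_t)source_id << 16) | gsi;
}

/* Recover the source ID from a packed key. */
static inline uint16_t eoifd_key_source_id(uint32_t key)
{
	return (uint16_t)(key >> 16);
}

/* Recover the GSI from a packed key. */
static inline uint16_t eoifd_key_gsi(uint32_t key)
{
	return (uint16_t)(key & 0xffff);
}
```

Since the key is opaque to userspace, the encoding could change later (or become an idr entry) without affecting callers.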

> > Michael, would the interface be more acceptable to you if we added
> > separate ioctls to allocate and free some representation of an irq
> > source ID, gsi pair?  For instance, an ioctl might return an idr entry
> > for an irq source ID/gsi object which would then be passed as a
> > parameter in struct kvm_irqfd and struct kvm_eoifd so that the object
> > representing the source id/gsi isn't magically freed on its own.  This
> > would also allow us to deassign/close one end and reconfigure it later.
> > Thanks,
> 
> Another option is to push the responsibility for allocating IDs for the
> association to userspace.  Let userspace both create the irqfd and the
> eoifd with the same ID, the kernel matches them at registration time and
> copies the gsi/sourceid from the first to the second eventfd.

Aside from the copying gsi/sourceid bit, you've just described my latest
attempt at this series.  Specifying both a sourceid and gsi also allows
userspace to make better use of the sourceid address space (use more
than one gsi if userspace wants the complexity of managing them).
Thanks,

Alex
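
A rough sketch of the userspace-allocated ID option quoted above, where the irqfd is registered first and a later eoifd registration with the same ID copies the gsi/sourceid. Everything here (the fixed-size table, `irqfd_register`, `eoifd_register`, the struct names) is hypothetical and only meant to show the matching step, not the real ioctl handling or any locking.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ASSOC 8  /* arbitrary size for the sketch */

/* One userspace-chosen ID and the gsi/sourceid recorded for it. */
struct assoc {
	int used;
	uint32_t id;
	uint32_t gsi;
	uint32_t source_id;
};

/* What the eoifd keeps after registration; a copy, not a pointer. */
struct eoifd_binding {
	uint32_t gsi;
	uint32_t source_id;
};

static struct assoc assoc_table[MAX_ASSOC];

/* Record gsi/sourceid under a userspace ID; -1 if duplicate or table full. */
static int irqfd_register(uint32_t id, uint32_t gsi, uint32_t source_id)
{
	struct assoc *free_slot = NULL;
	size_t i;

	for (i = 0; i < MAX_ASSOC; i++) {
		if (assoc_table[i].used && assoc_table[i].id == id)
			return -1;
		if (!assoc_table[i].used && !free_slot)
			free_slot = &assoc_table[i];
	}
	if (!free_slot)
		return -1;

	free_slot->used = 1;
	free_slot->id = id;
	free_slot->gsi = gsi;
	free_slot->source_id = source_id;
	return 0;
}

/* Match an eoifd to an existing ID and copy its gsi/sourceid; -1 if no match. */
static int eoifd_register(uint32_t id, struct eoifd_binding *out)
{
	size_t i;

	for (i = 0; i < MAX_ASSOC; i++) {
		if (assoc_table[i].used && assoc_table[i].id == id) {
			out->gsi = assoc_table[i].gsi;
			out->source_id = assoc_table[i].source_id;
			return 0;
		}
	}
	return -1;
}
```

Because the values are copied at registration time, the eoifd in this sketch does not depend on the irqfd staying around afterwards.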
