From: Andy Lutomirski <luto@amacapital.net>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: virtio-dev@lists.oasis-open.org,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Linux Virtualization <virtualization@lists.linux-foundation.org>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	"linux390@de.ibm.com" <linux390@de.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API
Date: Wed, 3 Sep 2014 00:50:33 -0700
Message-ID: <CALCETrWMc6TQy02kLXLcLwMJbPkpY4yv7U-As6jv0Frx34nuww@mail.gmail.com>
In-Reply-To: <874mwpkxxi.fsf@rustcorp.com.au>

On Sep 2, 2014 11:53 PM, "Rusty Russell" <rusty@rustcorp.com.au> wrote:
>
> Andy Lutomirski <luto@amacapital.net> writes:
> > There really are virtio devices that are pieces of silicon and not
> > figments of a hypervisor's imagination [1].
>
> Hi Andy,
>
>         As you're discovering, there's a reason no one has done the DMA
> API before.
>
> So the problem is that ppc64's IOMMU is a platform thing, not a bus
> thing.  They really do carve out an exception for virtio devices,
> because performance (LOTS of performance).  It remains to be seen if
> other platforms have the same performance issues, but in absence of
> other evidence, the answer is yes.
>
> It's a hack.  But having specific virtual-only devices are an even
> bigger hack.
>
> Physical virtio devices have been talked about, but don't actually exist
> in Real Life.  And someone a virtio PCI card is going to have serious
> performance issues: mainly because they'll want the rings in the card's
> MMIO region, not allocated by the driver.  Being broken on PPC is really
> the least of their problems.
>
> So, what do we do?  It'd be nice if Linux virtio Just Worked under Xen,
> though Xen's IOMMU is outside the virtio spec.  Since virtio_pci can be
> a module, obvious hacks like having xen_arch_setup initialize a dma_ops pointer
> exposed by virtio_pci.c is out.

Xen does expose dma_ops.  The trick is knowing when to use it.

>
> I think the best approach is to have a new feature bit (25 is free),
> VIRTIO_F_USE_BUS_MAPPING which indicates that a device really wants to
> use the mapping for the bus it is on.  A real device would set this,
> or it won't work behind an IOMMU.  A Xen device would also set this.

The devices I care about aren't actually Xen devices.  They're devices
supplied by QEMU/KVM, booting a Xen hypervisor, which in turn passes
the virtio device (along with every other PCI device) through to dom0.
So this is exactly the same virtio device that regular x86 KVM guests
would see.  The reason that current code fails is that Xen guest
physical addresses aren't the same as the addresses seen by the outer
hypervisor.

These devices don't know that physical addresses != bus addresses, so
they can't advertise that fact.
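
To illustrate the mismatch, here is a minimal userspace sketch, not kernel
code.  Everything in it (toy_p2m, toy_phys_to_bus, the frame numbers) is
hypothetical and only models the idea that a guest "physical" frame and
the frame the outer hypervisor sees can differ, so that handing the
device a raw guest physical address points it at the wrong memory.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_OFFSET_MASK ((UINT64_C(1) << TOY_PAGE_SHIFT) - 1)

/*
 * Hypothetical toy table: the index is a guest pseudo-physical frame
 * number, the value is the frame number as seen by the outer hypervisor
 * (the "bus" view).  The values are made up for illustration.
 */
static const uint64_t toy_p2m[] = { 0x80, 0x13, 0x7f, 0x02 };

/*
 * Translate a guest physical address into the address the device must be
 * given.  This stands in for what a DMA-API-style mapping would do.
 * Returns UINT64_MAX if the frame is not in the toy table.
 */
static uint64_t toy_phys_to_bus(uint64_t guest_phys)
{
    uint64_t pfn = guest_phys >> TOY_PAGE_SHIFT;

    if (pfn >= sizeof(toy_p2m) / sizeof(toy_p2m[0]))
        return UINT64_MAX;

    return (toy_p2m[pfn] << TOY_PAGE_SHIFT) |
           (guest_phys & TOY_PAGE_OFFSET_MASK);
}
```

In this model a driver that puts the guest physical address 0x1234 into
the ring is off by a whole frame, because the device needs to see the
translated value instead.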

If we ever end up with a virtio_pci device with physical addressing,
behind an IOMMU (but ignoring it), on Xen, we'll have a problem, since
neither "physical" addressing nor dma ops will work.

That being said, there are also proposals for virtio devices supplied
by Xen dom0 to domU, and these will presumably work the same way,
except that the device implementation will know that it's on Xen.

Grr.  This is mostly a result of the fact that virtio_pci devices
aren't really PCI devices.  I still think that virtio_pci shouldn't
have to worry about this; ideally this would all be handled higher up
in the device hierarchy.  x86 already gets this right.

Are there any platforms other than PPC that use virtio_pci, have IOMMUs
on the PCI slot that virtio_pci lives in, and use physical
addressing?  If not, I think that just quirking PPC will work (at
least until someone wants IOMMU support in virtio_pci on PPC, in which
case doing something using devicetree seems like a reasonable
solution).
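
Roughly, the quirk could look like the sketch below.  This is only an
illustration under assumptions: the enum, the struct, and the function
name are all made up, wants_iommu stands in for whatever a devicetree
property might eventually provide, and none of this is existing kernel
API.

```c
#include <stdbool.h>

/* Hypothetical platform identifiers, for illustration only. */
enum toy_platform { TOY_X86, TOY_PPC64, TOY_OTHER };

struct toy_virtio_dev {
    enum toy_platform platform;
    /* Hypothetical opt-in, e.g. derived from a devicetree property. */
    bool wants_iommu;
};

/*
 * Decide whether ring addresses should go through the DMA API.
 * Assumption: PPC is the only platform that needs the physical
 * addressing exception, and everything else can trust its dma_ops.
 */
static bool toy_virtio_use_dma_api(const struct toy_virtio_dev *dev)
{
    if (dev->platform == TOY_PPC64)
        return dev->wants_iommu;

    return true;
}
```

The point of keeping the exception in one place is that the default path
stays "use the DMA API", and only the quirked platform bypasses it.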

--Andy

>
> Thoughts?
> Rusty.
>
> PS.  I cc'd OASIS virtio-dev: it's subscriber only for IP reasons (to
>      subscribe you have to promise we can use your suggestion in the
>      standard).  Feel free to remove in any replies, but it's part of
>      the world we live in...
