From: David Woodhouse <dwmw2@infradead.org>
To: Andy Lutomirski <luto@amacapital.net>,
	"Michael S. Tsirkin" <mst@redhat.com>
Cc: linux-s390 <linux-s390@vger.kernel.org>,
	KVM <kvm@vger.kernel.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Sebastian Ott <sebott@linux.vnet.ibm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Linux Virtualization <virtualization@lists.linux-foundation.org>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Joerg Roedel <jroedel@suse.de>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v3 0/3] virtio DMA API core stuff
Date: Wed, 11 Nov 2015 23:30:27 +0100
Message-ID: <1447281027.3513.11.camel__20534.7838149921$1447281050$gmane$org@infradead.org>
In-Reply-To: <CALCETrWmZaQxS3-r9jsUb3BPhdLRbRrdZWok2geHnYKaWC4YKA@mail.gmail.com>



On Wed, 2015-11-11 at 07:56 -0800, Andy Lutomirski wrote:
> 
> Can you flesh out this trick?
> 
> On x86 IIUC the IOMMU more-or-less defaults to passthrough.  If the
> kernel wants, it can switch it to a non-passthrough mode.  My patches
> cause the virtio driver to do exactly this, except that the host
> implementation doesn't actually exist yet, so the patches will instead
> have no particular effect.

At some level, yes — we're compatible with a 1982 IBM PC and thus the
IOMMU is entirely disabled at boot until the kernel turns it on —
except in TXT mode where we abandon that compatibility.

But no, the virtio driver has *nothing* to do with switching the device
out of passthrough mode. It is either in passthrough mode, or it isn't.

If the VMM *doesn't* expose an IOMMU to the guest, obviously the
devices are in passthrough mode. If the guest kernel doesn't have IOMMU
support enabled, then obviously the devices are in passthrough mode.
And if the ACPI tables exposed to the guest kernel *tell* it that the
virtio devices are not actually behind the IOMMU (which qemu gets
wrong), then they'll be in passthrough mode.

If the IOMMU is exposed, and enabled, and telling the guest kernel that
it *does* cover the virtio devices, then those virtio devices will
*not* be in passthrough mode.
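
To spell out the cases above, here is a minimal userspace sketch of the
decision. The struct and function names are hypothetical, invented for
illustration; nothing here is a kernel or qemu interface. It only
encodes the three conditions as stated: a device is translated only if
the IOMMU is exposed, enabled, and reported as covering the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the three conditions; not a kernel structure. */
struct guest_iommu_view {
	bool vmm_exposes_iommu;   /* the VMM presents an IOMMU to the guest */
	bool guest_iommu_enabled; /* the guest kernel has IOMMU support enabled */
	bool tables_cover_device; /* firmware tables say the device is behind it */
};

/* Returns true when the device ends up in passthrough mode. */
bool device_in_passthrough(const struct guest_iommu_view *v)
{
	assert(v != NULL);
	/* Translation happens only when all three conditions hold. */
	return !(v->vmm_exposes_iommu &&
		 v->guest_iommu_enabled &&
		 v->tables_cover_device);
}
```

Note that the driver appears nowhere in the inputs, which is the point.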

Your choosing to use the DMA API in the virtio device drivers, instead
of being buggy, has nothing to do with whether the device is actually
in passthrough mode or not. Either way, using the DMA API is
technically the right thing to do, because it should either *do* the
translation or return a 1:1 mapped IOVA, as appropriate.
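
As a rough illustration of "either do the translation or return a 1:1
mapped IOVA", here is a toy userspace model. Everything prefixed toy_
is hypothetical and is not the kernel DMA API; a real IOMMU maps pages
through page tables, whereas this sketch assumes a single linear window
just to keep the arithmetic visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_dma_addr_t;

/* Hypothetical per-device state; the platform sets it, not the driver. */
struct toy_device {
	bool translated;          /* true: behind a translating IOMMU */
	toy_dma_addr_t iova_base; /* start of the toy IOVA window */
	uint64_t phys_base;       /* physical address that iova_base maps to */
};

/* Stand-in for a DMA mapping call: returns the address the device uses. */
toy_dma_addr_t toy_dma_map(const struct toy_device *dev, uint64_t phys)
{
	assert(dev != NULL);
	if (!dev->translated)
		return phys; /* passthrough: 1:1 mapped */
	return dev->iova_base + (phys - dev->phys_base);
}

/* The "driver" makes the same call in both modes and never checks which. */
toy_dma_addr_t toy_driver_prepare_buffer(const struct toy_device *dev,
					 uint64_t buf_phys)
{
	return toy_dma_map(dev, buf_phys);
}
```

The driver function is identical for both devices; only the state
behind the mapping call differs.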


> On powerpc and sparc, we *already* screwed up.  The host already tells
> the guest that there's an IOMMU and that it's *enabled* because those
> platforms don't have selective IOMMU coverage the way that x86 does.
> So we need to work around it.

No, we need it on x86 too because once we fix the virtio device driver
bug and make it start using the DMA API, then we start to trip up on
the qemu bug where it lies about which devices are covered by the
IOMMU.

Of course, we still have that same qemu bug w.r.t. assigned devices,
which it *also* claims are behind its IOMMU when they're not...

> I think that, if we want fancy virt-friendly IOMMU stuff like you're
> talking about, then the right thing to do is to create a virtio bus
> instead of pretending to be PCI.  That bus could have a virtio IOMMU
> and its own cross-platform enumeration mechanism for devices on the
> bus, and everything would be peachy.

That doesn't really help very much for the x86 case where the problem
is compatibility with *existing* (arguably broken) qemu
implementations.

Having said that, if this were real hardware I'd just be blacklisting
it and saying "Another BIOS with broken DMAR tables --> IOMMU
completely disabled". So perhaps we should just do that.


> I still don't understand what trick.  If we want virtio devices to be
> assignable, then they should be translated through the IOMMU, and the
> DMA API is the right interface for that.

The DMA API is the right interface *regardless* of whether there's
actual translation to be done. The device driver itself should not be
involved in any way with that decision.

When you want to access MMIO, you use ioremap() and writel() instead of
doing random crap for yourself. When you want DMA, you use the DMA API
to get a bus address for your device *even* if you expect there to be
no IOMMU and you expect it to precisely match the physical address. No
excuses.

-- 
dwmw2


