From: "Michael S. Tsirkin" <mst@redhat.com>
To: Joerg Roedel <jroedel@suse.de>
Cc: David Woodhouse <dwmw2@infradead.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Andy Lutomirski <luto@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Cornelia Huck <cornelia.huck@de.ibm.com>,
	Sebastian Ott <sebott@linux.vnet.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Christoph Hellwig <hch@lst.de>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	KVM <kvm@vger.kernel.org>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	linux-s390 <linux-s390@vger.kernel.org>,
	Linux Virtualization <virtualization@lists.linux-foundation.org>
Subject: Re: [PATCH v3 0/3] virtio DMA API core stuff
Date: Tue, 10 Nov 2015 17:02:19 +0200
Message-ID: <20151109224720-mutt-send-email-mst@redhat.com>
In-Reply-To: <20151108114946.GG2255@suse.de>

On Sun, Nov 08, 2015 at 12:49:46PM +0100, Joerg Roedel wrote:
> On Sun, Nov 08, 2015 at 12:37:47PM +0200, Michael S. Tsirkin wrote:
> > I have no problem with that. For example, can we teach
> > the DMA API on intel x86 to use PT for virtio by default?
> > That would allow merging Andy's patches with
> > full compatibility with old guests and hosts.
> 
> Well, the only incompatibility comes from an experimental qemu feature,
> more explicitly from a bug in that features implementation. So why
> should we work around that in the kernel? I think it is not too hard to
> fix qemu to generate a correct DMAR table which excludes the virtio
> devices from iommu translation.
> 
> 
> 	Joerg

It's not that easy - you'd have to dedicate some buses
for iommu bypass, and teach management tools to only put
virtio there - but it's possible.

This will absolutely address guests that don't need to set up IOMMU for
virtio devices, and virtio that bypasses the IOMMU.

But the problem is that we do want to *allow* guests
to set up IOMMU for virtio devices.
In that case, there are two other use cases:

A- monolithic virtio within QEMU:
	iommu only needed for VFIO ->
	guest should always use iommu=pt
	iommu=on works but is just useless overhead.

B- modular, out-of-process virtio outside QEMU:
	iommu needed for VFIO or kernel driver ->
	guest should use iommu=pt or iommu=on
	depending on security/performance requirements

Note that there could easily be a mix of these in the same system.

So for these cases we do need QEMU to tell the guest that the IOMMU covers
the virtio devices.  Also, once one does this, the default on Linux is
iommu=on and not pt, which works but is very slow at the moment.

This poses three problems:

1. How do we address the different needs of A and B?
   One way would be for virtio to pass the information to the guest
   in some virtio-specific way, and have drivers
   specify what kind of DMA access they want.

2. (Kind of a subset of 1) Once we do allow IOMMU, how do we make sure most guests
   use the more sensible iommu=pt?

3. Once we do allow IOMMU, how can we keep existing guests working in this configuration?
   Creating different hypervisor configurations depending on the guest is very nasty.
   Again, one way would be some virtio-specific interface.
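
To make the three questions concrete, here is a toy model of the per-device
decision they are asking for. It is a sketch in Python, not kernel code, and
every name in it (DmaMode, pick_dma_mode, and all four parameters) is
hypothetical; no such interface exists in virtio or in the DMA API today. It
only assumes that the hypervisor can somehow tell the guest whether the IOMMU
covers a given virtio device, and that an existing guest cannot see that hint.

```python
from enum import Enum


class DmaMode(Enum):
    BYPASS = "bypass"       # today's behaviour: virtio ignores the IOMMU
    PASSTHROUGH = "pt"      # goes through the DMA API, identity mapping
    TRANSLATED = "on"       # goes through the DMA API, full translation


def pick_dma_mode(device_behind_iommu, out_of_process, want_isolation,
                  guest_knows_flag=True):
    """Hypothetical policy for one virtio device.

    device_behind_iommu: hypervisor says the IOMMU covers this device
    out_of_process:      case B (backend outside QEMU); False means case A
    want_isolation:      security preferred over performance
    guest_knows_flag:    False models an existing guest (problem 3)
    """
    # An existing guest cannot see the hint and keeps bypassing the IOMMU;
    # whether the device side tolerates that is exactly problem 3.
    if not device_behind_iommu or not guest_knows_flag:
        return DmaMode.BYPASS
    # Case B with isolation requested is the only case where full
    # translation buys any security.
    if out_of_process and want_isolation:
        return DmaMode.TRANSLATED
    # Case A, or case B tuned for performance: pt avoids useless overhead.
    return DmaMode.PASSTHROUGH
```

With this policy, pick_dma_mode(True, False, True) returns
DmaMode.PASSTHROUGH (case A never pays for translation), and
pick_dma_mode(True, True, True) returns DmaMode.TRANSLATED. Since A and B can
be mixed in one system, the decision has to be made per device and not once
per guest.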

I'd rather we figured out the answers to these before merging Andy's patches,
because I'm concerned that instead of one broken configuration
(virtio always bypasses IOMMU) we'll get two bad configurations
(in the second one, virtio uses the slow default with no
gain in security).

Suggestions welcome.

-- 
MST
