linux-kernel.vger.kernel.org archive mirror
From: Tom Lyon <pugs@lyon-about.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: randy.dunlap@oracle.com, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, chrisw@sous-sol.org, joro@8bytes.org,
	hjk@linutronix.de, avi@redhat.com, gregkh@suse.de,
	aafabbri@cisco.com, scofeldm@cisco.com
Subject: Re: [PATCH V2] VFIO driver: Non-privileged user level PCI drivers
Date: Thu, 17 Jun 2010 14:14:00 -0700
Message-ID: <201006171414.00878.pugs@lyon-about.com>
In-Reply-To: <20100613102339.GB4191@redhat.com>

On Sunday 13 June 2010 03:23:39 am Michael S. Tsirkin wrote:
> On Fri, Jun 11, 2010 at 03:15:53PM -0700, Tom Lyon wrote:
> > [ bunch of stuff about MSI-X checking and IOMMUs and config registers...]
> > 
> > OK, here's the thing.  The IOMMU API today does not do squat about
> > dealing with interrupts. Interrupts are special because the APIC
> > addresses are not each in their own page.  Yes, the IOMMU hardware
> > supports it (at least Intel), and there's some Intel intr remapping
> > code (not AMD), but it doesn't look like it is enough.
> 
> The iommu book from AMD seems to say that interrupt remapping table
> address is taken from the device table entry.  So hardware support seems
> to be there, and to me it looks like it should be enough.
> Need to look at the iommu/msi code some more to figure out
> whether what linux does is handling this correctly -
> if it doesn't we need to fix that.
> 
> > Therefore, we must not allow the user level driver to diddle the MSI
> > or MSI-X areas - either in config space or in the device memory space.
> 
> It won't help.
> Consider that you want to let a userspace driver control
> the device with DMA capabilities.
> 
> So if there is a range of addresses that device
> can write into that can break host, these writes
> can be triggered by userspace. Limiting
> userspace access to MSI registers won't help:
> you need a way to protect host from the device.

OK, after more investigation, I realize you are right.
We definitely need the IOMMU protection for interrupts, and
if we have it, a lot of the code for config space protection is pointless.
It does seem that the Intel intr_remapping code does what we want
(accidentally) but that the AMD iommu code does not yet do any
interrupt remapping.  Joerg - can you comment? On the roadmap?

I should have an AMD system with an IOMMU in a couple of days, so I
can check this out.

> 
> >  If the device doesn't have its MSI-X registers in nice page aligned
> >  areas, then it is not "well-behaved" and it is S.O.L. The SR-IOV spec
> >  recommends that devices be designed the well-behaved way.
> > 
> > When the code in vfio_pci_config speaks of "virtualization" it means
> > that there are fake registers which the user driver can read or write,
> > but do not affect the real registers. BARs are one case, MSI regs
> > another. The PCI vendor and device ID are virtual because SR-IOV
> > doesn't supply them but I wanted the user driver to find them in the
> > same old place.
> 
> Sorry, I still don't understand why do we bother.  All this is already
> implemented in userspace.  Why can't we just use this existing userspace
> implementation?  It seems that all kernel needs to do is prevent
> userspace from writing BARs.

I assume the userspace of which you speak is qemu?  This is not what I'm
doing with vfio - I'm interested in the HPC networking model of direct 
user space access to the network. 

> Why can't we replace all this complexity with basically:
> 
> if (addr <= PCI_BASE_ADDRESS_5 && addr + len >= PCI_BASE_ADDRESS_0)
> 	return -EPERM;
> 
> And maybe another register or two. Most registers should be fine.
> 
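The BAR check suggested above can be sketched as a standalone range-overlap
test. This is only an illustration, not code from the VFIO patch: the helper
name vfio_sketch_check_config_write is hypothetical, the offsets are the
standard PCI config-space values for BAR0 and BAR5, and the comparison is
written as a strict half-open overlap so that a write ending exactly at BAR0
is allowed while an unaligned write into the last bytes of BAR5 is refused.

```c
#include <errno.h>

/* Standard PCI config-space offsets (same values as in linux/pci_regs.h). */
#define PCI_BASE_ADDRESS_0 0x10 /* first BAR */
#define PCI_BASE_ADDRESS_5 0x24 /* last BAR, 4 bytes wide */

/*
 * Hypothetical helper, for illustration only.
 * Returns -EPERM if the write [addr, addr + len) touches any BAR byte,
 * and 0 otherwise.
 */
static int vfio_sketch_check_config_write(unsigned int addr, unsigned int len)
{
	unsigned int bar_start = PCI_BASE_ADDRESS_0;
	unsigned int bar_end = PCI_BASE_ADDRESS_5 + 4; /* one past the last BAR byte */

	if (len == 0)
		return 0; /* an empty write touches nothing */

	/* Two half-open ranges overlap iff each starts before the other ends. */
	if (addr < bar_end && addr + len > bar_start)
		return -EPERM;

	return 0;
}
```

With this sketch, a 4-byte write at 0x0c (ending just before BAR0) passes,
while a 1-byte write at 0x27 (the last byte of BAR5) is rejected.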
> > [ Re: Hotplug and Suspend/Resume]
> > There are *plenty* of real drivers - brand new ones - which don't
> > bother with these today.  Yeah, I can see adding them to the framework
> > someday - but if there's no urgent need then it is way down the
> > priority list.
> 
> Well, for kernel drivers everything mostly works out of the box, it is
> handled by the PCI subsystem.  So some kind of framework will need to be
> added for userspace drivers as well.  And I suspect this issue won't be
> fixable later without breaking applications.

Whatever works out of the box for the kernel drivers which don't implement
suspend/resume will work for the user level drivers which don't.

> 
> > Meanwhile, the other uses beckon.
> 
> Which other uses? I thought the whole point was fixing
> what's broken with current kvm implementation.
> So it seems to me we should not rush it, ignoring existing issues such as
> hotplug.

Non-kvm cases, which don't care about suspend/resume.

 




