linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	David Gibson <david@gibson.dropbear.id.au>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org
Subject: Re: [PATCH 1/2] vfio powerpc: implemented IOMMU driver for VFIO
Date: Mon, 26 Nov 2012 21:29:48 -0700	[thread overview]
Message-ID: <1353990588.1809.148.camel@bling.home> (raw)
In-Reply-To: <50B43C36.60707@ozlabs.ru>

On Tue, 2012-11-27 at 15:06 +1100, Alexey Kardashevskiy wrote:
> On 27/11/12 05:20, Alex Williamson wrote:
> > On Fri, 2012-11-23 at 20:03 +1100, Alexey Kardashevskiy wrote:
> >> VFIO implements platform independent stuff such as
> >> a PCI driver, BAR access (via read/write on a file descriptor
> >> or direct mapping when possible) and IRQ signaling.
> >>
> >> The platform dependent part includes IOMMU initialization
> >> and handling. This patch implements an IOMMU driver for VFIO
> >> which does mapping/unmapping pages for the guest IO and
> >> provides information about DMA window (required by a POWERPC
> >> guest).
> >>
> >> The counterpart in QEMU is required to support this functionality.
> >>
> >> Cc: David Gibson <david@gibson.dropbear.id.au>
> >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >> ---
> >>   drivers/vfio/Kconfig                |    6 +
> >>   drivers/vfio/Makefile               |    1 +
> >>   drivers/vfio/vfio_iommu_spapr_tce.c |  247 +++++++++++++++++++++++++++++++++++
> >>   include/linux/vfio.h                |   20 +++
> >>   4 files changed, 274 insertions(+)
> >>   create mode 100644 drivers/vfio/vfio_iommu_spapr_tce.c
> >>
> >> diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
> >> index 7cd5dec..b464687 100644
> >> --- a/drivers/vfio/Kconfig
> >> +++ b/drivers/vfio/Kconfig
> >> @@ -3,10 +3,16 @@ config VFIO_IOMMU_TYPE1
> >>   	depends on VFIO
> >>   	default n
> >>
> >> +config VFIO_IOMMU_SPAPR_TCE
> >> +	tristate
> >> +	depends on VFIO && SPAPR_TCE_IOMMU
> >> +	default n
> >> +
> >>   menuconfig VFIO
> >>   	tristate "VFIO Non-Privileged userspace driver framework"
> >>   	depends on IOMMU_API
> >>   	select VFIO_IOMMU_TYPE1 if X86
> >> +	select VFIO_IOMMU_SPAPR_TCE if PPC_POWERNV
> >>   	help
> >>   	  VFIO provides a framework for secure userspace device drivers.
> >>   	  See Documentation/vfio.txt for more details.
> >> diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
> >> index 2398d4a..72bfabc 100644
> >> --- a/drivers/vfio/Makefile
> >> +++ b/drivers/vfio/Makefile
> >> @@ -1,3 +1,4 @@
> >>   obj-$(CONFIG_VFIO) += vfio.o
> >>   obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
> >> +obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
> >>   obj-$(CONFIG_VFIO_PCI) += pci/
> >> diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
> >> new file mode 100644
> >> index 0000000..46a6298
> >> --- /dev/null
> >> +++ b/drivers/vfio/vfio_iommu_spapr_tce.c
> >> @@ -0,0 +1,247 @@
> >> +/*
> >> + * VFIO: IOMMU DMA mapping support for TCE on POWER
> >> + *
> >> + * Copyright (C) 2012 IBM Corp.  All rights reserved.
> >> + *     Author: Alexey Kardashevskiy <aik@ozlabs.ru>
> >> + *
> >> + * This program is free software; you can redistribute it and/or modify
> >> + * it under the terms of the GNU General Public License version 2 as
> >> + * published by the Free Software Foundation.
> >> + *
> >> + * Derived from original vfio_iommu_type1.c:
> >> + * Copyright (C) 2012 Red Hat, Inc.  All rights reserved.
> >> + *     Author: Alex Williamson <alex.williamson@redhat.com>
> >> + */
> >> +
> >> +#include <linux/module.h>
> >> +#include <linux/pci.h>
> >> +#include <linux/slab.h>
> >> +#include <linux/uaccess.h>
> >> +#include <linux/err.h>
> >> +#include <linux/vfio.h>
> >> +#include <asm/iommu.h>
> >> +
> >> +#define DRIVER_VERSION  "0.1"
> >> +#define DRIVER_AUTHOR   "aik@ozlabs.ru"
> >> +#define DRIVER_DESC     "VFIO IOMMU SPAPR TCE"
> >> +
> >> +static void tce_iommu_detach_group(void *iommu_data,
> >> +		struct iommu_group *iommu_group);
> >> +
> >> +/*
> >> + * VFIO IOMMU fd for SPAPR_TCE IOMMU implementation
> >> + */
> >> +
> >> +/*
> >> + * The container descriptor supports only a single group per container.
> >> + * Required by the API as the container is not supplied with the IOMMU group
> >> + * at the moment of initialization.
> >> + */
> >> +struct tce_container {
> >> +	struct mutex lock;
> >> +	struct iommu_table *tbl;
> >> +};
> >> +
> >> +static void *tce_iommu_open(unsigned long arg)
> >> +{
> >> +	struct tce_container *container;
> >> +
> >> +	if (arg != VFIO_SPAPR_TCE_IOMMU) {
> >> +		printk(KERN_ERR "tce_vfio: Wrong IOMMU type\n");
> >> +		return ERR_PTR(-EINVAL);
> >> +	}
> >> +
> >> +	container = kzalloc(sizeof(*container), GFP_KERNEL);
> >> +	if (!container)
> >> +		return ERR_PTR(-ENOMEM);
> >> +
> >> +	mutex_init(&container->lock);
> >> +
> >> +	return container;
> >> +}
> >> +
> >> +static void tce_iommu_release(void *iommu_data)
> >> +{
> >> +	struct tce_container *container = iommu_data;
> >> +
> >> +	WARN_ON(container->tbl && !container->tbl->it_group);
> >
> > I think your patch ordering is backwards here.  it_group isn't added
> > until 2/2.  I'd really like to see the arch/powerpc code approved and
> > merged by the powerpc maintainer before we add the code that makes use
> > of it into vfio.  Otherwise we just get lots of churn if interfaces
> > change or they disapprove of it altogether.
> 
> 
> Makes sense, thanks.
> 
> 
> >> +	if (container->tbl && container->tbl->it_group)
> >> +		tce_iommu_detach_group(iommu_data, container->tbl->it_group);
> >> +
> >> +	mutex_destroy(&container->lock);
> >> +
> >> +	kfree(container);
> >> +}
> >> +
> >> +static long tce_iommu_ioctl(void *iommu_data,
> >> +				 unsigned int cmd, unsigned long arg)
> >> +{
> >> +	struct tce_container *container = iommu_data;
> >> +	unsigned long minsz;
> >> +
> >> +	switch (cmd) {
> >> +	case VFIO_CHECK_EXTENSION: {
> >> +		return (arg == VFIO_SPAPR_TCE_IOMMU) ? 1 : 0;
> >> +	}
> >> +	case VFIO_IOMMU_SPAPR_TCE_GET_INFO: {
> >> +		struct vfio_iommu_spapr_tce_info info;
> >> +		struct iommu_table *tbl = container->tbl;
> >> +
> >> +		if (WARN_ON(!tbl))
> >> +			return -ENXIO;
> >> +
> >> +		minsz = offsetofend(struct vfio_iommu_spapr_tce_info,
> >> +				dma64_window_size);
> >> +
> >> +		if (copy_from_user(&info, (void __user *)arg, minsz))
> >> +			return -EFAULT;
> >> +
> >> +		if (info.argsz < minsz)
> >> +			return -EINVAL;
> >> +
> >> +		info.dma32_window_start = tbl->it_offset << IOMMU_PAGE_SHIFT;
> >> +		info.dma32_window_size = tbl->it_size << IOMMU_PAGE_SHIFT;
> >> +		info.dma64_window_start = 0;
> >> +		info.dma64_window_size = 0;
> >> +		info.flags = 0;
> >> +
> >> +		if (copy_to_user((void __user *)arg, &info, minsz))
> >> +			return -EFAULT;
> >> +
> >> +		return 0;
> >> +	}
> >> +	case VFIO_IOMMU_MAP_DMA: {
> >> +		vfio_iommu_spapr_tce_dma_map param;
> >> +		struct iommu_table *tbl = container->tbl;
> >> +		enum dma_data_direction direction = DMA_NONE;
> >> +
> >> +		if (WARN_ON(!tbl))
> >> +			return -ENXIO;
> >> +
> >> +		minsz = offsetofend(vfio_iommu_spapr_tce_dma_map, size);
> >> +
> >> +		if (copy_from_user(&param, (void __user *)arg, minsz))
> >> +			return -EFAULT;
> >> +
> >> +		if (param.argsz < minsz)
> >> +			return -EINVAL;
> >> +
> >> +		if ((param.flags & VFIO_DMA_MAP_FLAG_READ) &&
> >> +				(param.flags & VFIO_DMA_MAP_FLAG_WRITE)) {
> >> +			direction = DMA_BIDIRECTIONAL;
> >> +		} else if (param.flags & VFIO_DMA_MAP_FLAG_READ) {
> >> +			direction = DMA_TO_DEVICE;
> >> +		} else if (param.flags & VFIO_DMA_MAP_FLAG_WRITE) {
> >> +			direction = DMA_FROM_DEVICE;
> >> +		}
> >> +
> >> +		param.size += param.iova & ~IOMMU_PAGE_MASK;
> >> +		param.size = _ALIGN_UP(param.size, IOMMU_PAGE_SIZE);
> >
> > On x86 we force iova, vaddr, and size to all be aligned to the smallest
> > page granularity of the iommu and return -EINVAL if it doesn't fit.
> > What does it imply to the user if they're always aligned to work here?
> > Won't this interface happily map overlapping entries with no indication
> > to the user that the previous mapping is no longer valid?
> > Maybe another reason why a combined unmap/map makes me nervous, we have
> > to assume the user knows what they're doing.
> 
> 
> I got used to guests which do know what they are doing so I am pretty calm :)
> but ok, I'll move alignment to the QEMU, it makes sense.
> 
> 
> >> +
> >> +		return iommu_put_tces(tbl, param.iova >> IOMMU_PAGE_SHIFT,
> >> +				param.vaddr & IOMMU_PAGE_MASK, direction,
> >> +				param.size >> IOMMU_PAGE_SHIFT);
> >> +	}
> >> +	case VFIO_IOMMU_UNMAP_DMA: {
> >> +		vfio_iommu_spapr_tce_dma_unmap param;
> >> +		struct iommu_table *tbl = container->tbl;
> >> +
> >> +		if (WARN_ON(!tbl))
> >> +			return -ENXIO;
> >> +
> >> +		minsz = offsetofend(vfio_iommu_spapr_tce_dma_unmap, size);
> >> +
> >> +		if (copy_from_user(&param, (void __user *)arg, minsz))
> >> +			return -EFAULT;
> >> +
> >> +		if (param.argsz < minsz)
> >> +			return -EINVAL;
> >> +
> >> +		param.size += param.iova & ~IOMMU_PAGE_MASK;
> >> +		param.size = _ALIGN_UP(param.size, IOMMU_PAGE_SIZE);
> >> +
> >> +		return iommu_put_tces(tbl, param.iova >> IOMMU_PAGE_SHIFT,
> >> +				0, DMA_NONE, param.size >> IOMMU_PAGE_SHIFT);
> >> +	}
> >> +	default:
> >> +		printk(KERN_WARNING "tce_vfio: unexpected cmd %x\n", cmd);
> >
> > pr_warn
> >
> >> +	}
> >> +
> >> +	return -ENOTTY;
> >> +}
> >> +
> >> +static int tce_iommu_attach_group(void *iommu_data,
> >> +		struct iommu_group *iommu_group)
> >> +{
> >> +	struct tce_container *container = iommu_data;
> >> +	struct iommu_table *tbl = iommu_group_get_iommudata(iommu_group);
> >> +
> >> +	BUG_ON(!tbl);
> >> +	mutex_lock(&container->lock);
> >> +	pr_debug("tce_vfio: Attaching group #%u to iommu %p\n",
> >> +			iommu_group_id(iommu_group), iommu_group);
> >> +	if (container->tbl) {
> >> +		printk(KERN_WARNING "tce_vfio: Only one group per IOMMU container is allowed, existing id=%d, attaching id=%d\n",
> >
> > pr_warn
> >
> >> +				iommu_group_id(container->tbl->it_group),
> >> +				iommu_group_id(iommu_group));
> >> +		mutex_unlock(&container->lock);
> >> +		return -EBUSY;
> >> +	}
> >> +
> >> +	container->tbl = tbl;
> >
> > Would it be too much paranoia to clear all the tce here as you do below
> > on detach?
> 
> Guess so. I do unmap on detach() and the guest calls put_tce(0) (i.e. 
> unmaps) the whole DMA window at the boot time.

But that's just one user of this interface, we can't assume they'll all
be so agreeable.  If any tces were enabled here, a malicious user would
have a window to host memory, right?  Thanks,

Alex


  reply	other threads:[~2012-11-27  5:02 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20121113033832.GW4696@truffula.fritz.box>
2012-11-20  0:48 ` [PATCH] vfio powerpc: enabled and supported on powernv platform Alexey Kardashevskiy
2012-11-20 18:19   ` Alex Williamson
2012-11-22 11:56     ` Sethi Varun-B16395
2012-11-23  2:02       ` Alexey Kardashevskiy
2012-11-26 15:18         ` Alex Williamson
2012-11-26 18:04           ` Alex Williamson
2012-11-27  0:21             ` Benjamin Herrenschmidt
2012-11-27  3:28             ` Alexey Kardashevskiy
2012-11-27  4:23               ` Alex Williamson
2012-11-26 15:08       ` Alex Williamson
2012-11-23  9:03     ` [PATCH 0/2] vfio powerpc: implemented and enabled Alexey Kardashevskiy
2012-11-23  9:03       ` [PATCH 1/2] vfio powerpc: implemented IOMMU driver for VFIO Alexey Kardashevskiy
2012-11-26 18:20         ` Alex Williamson
2012-11-27  4:06           ` Alexey Kardashevskiy
2012-11-27  4:29             ` Alex Williamson [this message]
2012-11-27  4:58               ` Alexey Kardashevskiy
2012-11-27  5:06                 ` David Gibson
2012-11-27  5:07                 ` Alex Williamson
2012-11-28  7:21                   ` [PATCH] " Alexey Kardashevskiy
2012-11-28 21:01                     ` Alex Williamson
2012-11-29  3:51                       ` Alexey Kardashevskiy
2012-11-23  9:03       ` [PATCH 2/2] vfio powerpc: enabled on powernv platform Alexey Kardashevskiy
2012-11-27  4:41         ` Alex Williamson
2012-11-28  7:18           ` [PATCH] " Alexey Kardashevskiy
2012-11-28 21:30             ` Alex Williamson
2012-11-29  3:53               ` Alexey Kardashevskiy
2012-11-29  4:20                 ` Alex Williamson
2012-11-30  6:14                   ` Alexey Kardashevskiy
2012-11-30 16:48                     ` Alex Williamson
2012-12-01  0:14                       ` Alexey Kardashevskiy
2012-11-30  6:16                   ` Alexey Kardashevskiy
2012-12-03  2:52       ` [PATCH 0/2] vfio on power: yet another try Alexey Kardashevskiy
2012-12-03  2:52         ` [PATCH 1/2] vfio powerpc: enabled on powernv platform Alexey Kardashevskiy
2012-12-03 17:35           ` Alex Williamson
2012-12-04  8:12             ` Alexey Kardashevskiy
2012-12-04 15:51               ` Alex Williamson
2012-12-07  7:35                 ` [PATCH] " Alexey Kardashevskiy
2012-12-07 17:38                   ` Alex Williamson
2012-12-12  6:14                     ` Alexey Kardashevskiy
2012-12-12 14:34                       ` Alex Williamson
2012-12-13  2:29                         ` Benjamin Herrenschmidt
2012-12-13  6:27                           ` Alexey Kardashevskiy
2012-12-13  6:28                           ` [PATCH] iommu: moving initialization earlier Alexey Kardashevskiy
2012-12-13 15:48                             ` Alex Williamson
2012-12-16 11:20                               ` Joerg Roedel
2013-01-04  8:21                                 ` Alexey Kardashevskiy
2013-01-06  9:49                                   ` Joerg Roedel
2012-12-12 12:34                     ` [PATCH] vfio powerpc: enabled on powernv platform Alexey Kardashevskiy
2012-12-12 12:38                       ` Alexey Kardashevskiy
2012-12-12 23:30                       ` Alex Williamson
2012-12-13  2:24                         ` Alexey Kardashevskiy
2012-12-13  2:39                         ` Benjamin Herrenschmidt
2012-12-13  2:57                         ` Benjamin Herrenschmidt
2012-12-13  3:22                           ` Alex Williamson
2012-12-03  2:52         ` [PATCH 2/2] vfio powerpc: implemented IOMMU driver for VFIO Alexey Kardashevskiy
2012-12-03 17:53           ` Alex Williamson
2012-12-07  7:34             ` [PATCH] " Alexey Kardashevskiy
2012-12-07 17:01               ` Alex Williamson
2012-12-12  6:59                 ` Alexey Kardashevskiy
2012-12-12 14:36                   ` Alex Williamson
2012-12-12 12:35                 ` Alexey Kardashevskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1353990588.1809.148.camel@bling.home \
    --to=alex.williamson@redhat.com \
    --cc=aik@ozlabs.ru \
    --cc=benh@kernel.crashing.org \
    --cc=david@gibson.dropbear.id.au \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).