linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: "Mika Penttilä" <mika.penttila@nextfour.com>
Cc: Yan Zhao <yan.y.zhao@intel.com>,
	"zhenyuw@linux.intel.com" <zhenyuw@linux.intel.com>,
	"intel-gvt-dev@lists.freedesktop.org" 
	<intel-gvt-dev@lists.freedesktop.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"kevin.tian@intel.com" <kevin.tian@intel.com>,
	"peterx@redhat.com" <peterx@redhat.com>
Subject: Re: [PATCH v2 1/2] vfio: introduce vfio_dma_rw to read/write a range of IOVAs
Date: Wed, 15 Jan 2020 20:58:27 -0700	[thread overview]
Message-ID: <20200115205827.2249201c@x1.home> (raw)
In-Reply-To: <7528cfff-2512-538e-4e44-85f0a0b0130a@nextfour.com>

On Thu, 16 Jan 2020 03:15:58 +0000
Mika Penttilä <mika.penttila@nextfour.com> wrote:

> On 16.1.2020 4.59, Alex Williamson wrote:
> > On Thu, 16 Jan 2020 02:30:52 +0000
> > Mika Penttilä <mika.penttila@nextfour.com> wrote:
> >  
> >> On 15.1.2020 22.06, Alex Williamson wrote:  
> >>> On Tue, 14 Jan 2020 22:53:03 -0500
> >>> Yan Zhao <yan.y.zhao@intel.com> wrote:
> >>>     
> >>>> vfio_dma_rw will read/write a range of user space memory pointed to by
> >>>> IOVA into/from a kernel buffer without pinning the user space memory.
> >>>>
> >>>> TODO: mark the IOVAs to user space memory dirty if they are written in
> >>>> vfio_dma_rw().
> >>>>
> >>>> Cc: Kevin Tian <kevin.tian@intel.com>
> >>>> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> >>>> ---
> >>>>    drivers/vfio/vfio.c             | 45 +++++++++++++++++++
> >>>>    drivers/vfio/vfio_iommu_type1.c | 76 +++++++++++++++++++++++++++++++++
> >>>>    include/linux/vfio.h            |  5 +++
> >>>>    3 files changed, 126 insertions(+)
> >>>>
> >>>> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> >>>> index c8482624ca34..8bd52bc841cf 100644
> >>>> --- a/drivers/vfio/vfio.c
> >>>> +++ b/drivers/vfio/vfio.c
> >>>> @@ -1961,6 +1961,51 @@ int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn, int npage)
> >>>>    }
> >>>>    EXPORT_SYMBOL(vfio_unpin_pages);
> >>>>    
> >>>> +/*
> >>>> + * Read/Write a range of IOVAs pointing to user space memory into/from a kernel
> >>>> + * buffer without pinning the user space memory
> >>>> + * @dev [in]  : device
> >>>> + * @iova [in] : base IOVA of a user space buffer
> >>>> + * @data [in] : pointer to kernel buffer
> >>>> + * @len [in]  : kernel buffer length
> >>>> + * @write     : indicate read or write
> >>>> + * Return error code on failure or 0 on success.
> >>>> + */
> >>>> +int vfio_dma_rw(struct device *dev, dma_addr_t iova, void *data,
> >>>> +		   size_t len, bool write)
> >>>> +{
> >>>> +	struct vfio_container *container;
> >>>> +	struct vfio_group *group;
> >>>> +	struct vfio_iommu_driver *driver;
> >>>> +	int ret = 0;  
> >> Do you know the iova given to vfio_dma_rw() is indeed a gpa and not iova
> >> from a iommu mapping? So isn't it you actually assume all the guest is
> >> pinned,
> >> like from device assignment?
> >>
> >> Or who and how is the vfio mapping added before the vfio_dma_rw() ?  
> > vfio only knows about IOVAs, not GPAs.  It's possible that IOVAs are
> > identity mapped to the GPA space, but a VM with a vIOMMU would quickly
> > break any such assumption.  Pinning is also not required.  This access
> > is via the CPU, not the I/O device, so we don't require the memory to
> > be pinning and it potentially won't be for a non-IOMMU backed mediated
> > device.  The intention here is that via the mediation of an mdev
> > device, a vendor driver would already know IOVA ranges for the device
> > to access via the guest driver programming of the device.  Thanks,
> >
> > Alex  
> 
> Thanks Alex... you mean IOVA is in the case of iommu already a 
> iommu-translated address to a user space VA in VM host space?

The user (QEMU in the case of device assignment) performs ioctls to map
user VAs to IOVAs for the device.  With IOMMU backing the VAs are
pinned to get HPA and the IOVA to HPA mappings are programmed into the
IOMMU.  Thus the device accesses the IOVA to get to the HPA, which is
the backing for the VA.  In this case we're simply using the IOVA to
lookup the VA and access it with the CPU directly.  The IOMMU isn't
involved, but we're still performing an access as if we were the device
doing a DMA. Let me know if that doesn't answer your question.

> How does it get to hold on that? What piece of meditation is responsible 
> for this?

It's device specific.  The mdev vendor driver is mediating a specific
hardware device where user accesses to MMIO on the device configures
DMA targets.  The mediation needs to trap those accesses in order to
pin page and program the real hardware with real physical addresses (be
they HPA or host-IOVAs depending on the host IOMMU config) to perform
those DMAs.  For cases where the CPU might choose to perform some sort
of virtual DMA on behalf of the device itself, this interface would be
used.  Thanks,

Alex


  reply	other threads:[~2020-01-16  3:58 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-15  3:41 [PATCH v2 0/2] use vfio_dma_rw to read/write IOVAs from CPU side Yan Zhao
2020-01-15  3:53 ` [PATCH v2 1/2] vfio: introduce vfio_dma_rw to read/write a range of IOVAs Yan Zhao
2020-01-15 20:06   ` Alex Williamson
2020-01-16  2:30     ` Mika Penttilä
2020-01-16  2:59       ` Alex Williamson
2020-01-16  3:15         ` Mika Penttilä
2020-01-16  3:58           ` Alex Williamson [this message]
2020-01-16  5:32     ` Yan Zhao
2020-01-15  3:54 ` [PATCH v2 2/2] drm/i915/gvt: subsitute kvm_read/write_guest with vfio_dma_rw Yan Zhao
2020-01-15 20:06   ` Alex Williamson
2020-01-16  5:49     ` Yan Zhao
2020-01-16 15:37       ` Alex Williamson
2020-01-19 10:06         ` Yan Zhao
2020-01-20 20:01           ` Alex Williamson
2020-01-21  8:12             ` Yan Zhao
2020-01-21 16:51               ` Alex Williamson
2020-01-21 22:10                 ` Yan Zhao
2020-01-22  3:07                   ` Yan Zhao
2020-01-23 10:02                     ` Yan Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200115205827.2249201c@x1.home \
    --to=alex.williamson@redhat.com \
    --cc=intel-gvt-dev@lists.freedesktop.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mika.penttila@nextfour.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=yan.y.zhao@intel.com \
    --cc=zhenyuw@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).