From: Alex Williamson <alex.williamson@redhat.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: <cohuck@redhat.com>, <kvm@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <peterx@redhat.com>
Subject: Re: [RFC PATCH 05/10] vfio: Create a vfio_device from vma lookup
Date: Thu, 25 Feb 2021 15:21:13 -0700
Message-ID: <20210225152113.3e083b4a@omen.home.shazbot.org>
In-Reply-To: <20210225000610.GP4247@nvidia.com>

On Wed, 24 Feb 2021 20:06:10 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, Feb 24, 2021 at 02:55:06PM -0700, Alex Williamson wrote:
> 
> > > The only use of the special ops would be if there are multiple types
> > > of mmap's going on, but for this narrow use case those would be safely
> > > distinguished by the vm_pgoff instead  
> > 
> > We potentially do have device specific regions which can support mmap,
> > for example the migration region.  We'll need to think about how we
> > could even know if portions of those regions map to a device.  We could
> > use the notifier to announce it and require the code supporting those
> > device specific regions manage it.  
> 
> So, the above basically says any VFIO VMA is allowed for VFIO to map
> to the IOMMU.
> 
> If there are places creating mmaps for VFIO that should not go to the
> IOMMU then they need to return NULL from this function.
> 
> > I'm not really clear what you're getting at with vm_pgoff though, could
> > you explain further?  
> 
> Ah, so I have to take a side discussion to explain what I meant.
> 
> The vm_pgoff is a bit confused because we change it here in vfio_pci:
> 
>     vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff;
> 
> But the address_space invalidation assumes it still has the region
> based encoding:
> 
> +	vfio_device_unmap_mapping_range(vdev->device,
> +			VFIO_PCI_INDEX_TO_OFFSET(VFIO_PCI_BAR0_REGION_INDEX),
> +			VFIO_PCI_INDEX_TO_OFFSET(VFIO_PCI_ROM_REGION_INDEX) -
> +			VFIO_PCI_INDEX_TO_OFFSET(VFIO_PCI_BAR0_REGION_INDEX));
> 
> Those three indexes are in the vm_pgoff numberspace and so vm_pgoff
> must always be set to the same thing - either the
> VFIO_PCI_INDEX_TO_OFFSET() coding or the physical pfn. 

Aha, I hadn't made that connection.

> Since you say we need a limited invalidation this looks like a bug to
> me - and it must always be the VFIO_PCI_INDEX_TO_OFFSET coding.

Yes, this must have only worked in testing because I mmap'd BAR0 which
is at index/offset zero, so the pfn range overlapped the user offset.
I'm glad you caught that...
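
A minimal userspace sketch of the arithmetic, to show how a pfn-coded
vm_pgoff can land inside a zap range that starts at region offset zero
yet miss a narrower range.  The shift of 40 matches
VFIO_PCI_OFFSET_SHIFT in vfio-pci; the 4 KiB page size, the helper
names, and the example BAR address 0xfe000000 are assumptions made for
illustration only:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12			/* assumption: 4 KiB pages */
#define VFIO_PCI_OFFSET_SHIFT 40
#define VFIO_PCI_INDEX_TO_OFFSET(index) \
	((uint64_t)(index) << VFIO_PCI_OFFSET_SHIFT)

/* First page offset of a region in the region-based encoding. */
static inline uint64_t region_first_pgoff(unsigned int index)
{
	return VFIO_PCI_INDEX_TO_OFFSET(index) >> PAGE_SHIFT;
}

/* vm_pgoff after the mmap handler rewrites it to a physical pfn. */
static inline uint64_t pfn_pgoff(uint64_t bar_phys_start, uint64_t pgoff)
{
	return (bar_phys_start >> PAGE_SHIFT) + pgoff;
}

/*
 * Hypothetical helper: would a zap covering regions
 * [first_index, end_index) in region coding match this vm_pgoff?
 */
static inline int zap_hits(unsigned int first_index, unsigned int end_index,
			   uint64_t vm_pgoff)
{
	return vm_pgoff >= region_first_pgoff(first_index) &&
	       vm_pgoff < region_first_pgoff(end_index);
}

static inline void sketch_self_check(void)
{
	uint64_t pfn_coded = pfn_pgoff(0xfe000000ULL, 0);

	/* BAR0..ROM range starts at page offset 0: matches by coincidence. */
	assert(zap_hits(0, 6, pfn_coded));
	/* A range limited to region index 1 does not match the same vma. */
	assert(!zap_hits(1, 2, pfn_coded));
}
```

With these assumed values the full BAR0-to-ROM range spans page offsets
0 through (6 << 28) - 1, so a pfn of 0xfe000 falls inside it, while a
range for index 1 alone begins at 1 << 28 and does not.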

> So, the PCI vma needs to get switched to use the
> VFIO_PCI_INDEX_TO_OFFSET coding and then we can always extract the
> region number from the vm_pgoff and thus access any additional data,
> such as the base pfn or a flag saying this cannot be mapped to the
> IOMMU. Do the reverse of VFIO_PCI_INDEX_TO_OFFSET and consult
> information attached to that region ID.
> 
> All places creating vfio mmaps have to set the vm_pgoff to
> VFIO_PCI_INDEX_TO_OFFSET().

This is where it gets tricky.  The vm_pgoff we get from
file_operations.mmap is already essentially describing an offset from
the base of a specific resource.  We could convert that from an absolute
offset to a pfn offset, but it's only the bus driver code (e.g.
vfio-pci) that knows how to get the base, assuming there is a single
base per region (we can't assume enough bits per region to store
absolute pfn).  Also note that you're suggesting that all vfio mmaps
would need to standardize on the vfio-pci implementation of region
layouts.  Not that most drivers haven't copied vfio-pci, but we've
specifically avoided exposing it as a fixed uAPI such that we could have
the flexibility for a bus driver to implement region offsets however
they need.
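
For reference, a rough userspace sketch of what keeping the index bits
in vm_pgoff would imply within vfio-pci.  The shift of 40 matches
VFIO_PCI_OFFSET_SHIFT; the 4 KiB page size, the helper names, and the
region_base[] table (standing in for whatever per-region base the bus
driver knows, e.g. pci_resource_start()) are assumptions for
illustration, not a proposed interface:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12			/* assumption: 4 KiB pages */
#define VFIO_PCI_OFFSET_SHIFT 40
#define VFIO_PCI_INDEX_TO_OFFSET(index) \
	((uint64_t)(index) << VFIO_PCI_OFFSET_SHIFT)

/* Bits of vm_pgoff left for the page offset within one region. */
#define REGION_PGOFF_BITS (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT)

/* Reverse of VFIO_PCI_INDEX_TO_OFFSET(), applied to a page offset. */
static inline unsigned int pgoff_to_index(uint64_t vm_pgoff)
{
	return (unsigned int)(vm_pgoff >> REGION_PGOFF_BITS);
}

/* Page offset relative to the start of the region. */
static inline uint64_t pgoff_in_region(uint64_t vm_pgoff)
{
	return vm_pgoff & ((1ULL << REGION_PGOFF_BITS) - 1);
}

/*
 * Hypothetical lookup: only the bus driver can supply region_base[],
 * and this assumes a single physical base per region.
 */
static inline uint64_t pgoff_to_pfn(const uint64_t *region_base,
				    uint64_t vm_pgoff)
{
	return (region_base[pgoff_to_index(vm_pgoff)] >> PAGE_SHIFT) +
	       pgoff_in_region(vm_pgoff);
}

static inline void sketch_self_check(void)
{
	const uint64_t base[2] = { 0xfe000000ULL, 0xfd000000ULL };
	uint64_t vm_pgoff = (VFIO_PCI_INDEX_TO_OFFSET(1) >> PAGE_SHIFT) + 5;

	assert(pgoff_to_index(vm_pgoff) == 1);
	assert(pgoff_to_pfn(base, vm_pgoff) == 0xfd005ULL);
}
```

Under these assumptions only 28 bits remain per region for the page
offset, and the pfn lookup depends on both the vfio-pci layout and a
table that lives in the bus driver, which is the limitation described
above.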

So I'm not really sure what this looks like.  Within vfio-pci we could
keep the index bits in place to allow unmap_mapping_range() to
selectively zap matching vm_pgoffs but expanding that to a vfio
standard such that the IOMMU backend can also extract a pfn looks very
limiting, or ugly.  Thanks,

Alex

> But we have these violations that need fixing:
> 
> drivers/vfio/fsl-mc/vfio_fsl_mc.c:      vma->vm_pgoff = (region.addr >> PAGE_SHIFT) + pgoff;
> drivers/vfio/platform/vfio_platform_common.c:   vma->vm_pgoff = (region.addr >> PAGE_SHIFT) + pgoff;
> 
> Couldn't see any purpose to this code, cargo cult copy? Just delete
> it.
> 
> drivers/vfio/pci/vfio_pci.c:    vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff;
> 
> Used to implement fault() but we could get the region number and
> extract the pfn from the vfio_pci_device's data easy enough.
> 
> I manually checked that other parts of VFIO not under drivers/vfio are
> doing it OK, looks fine.
> 
> Jason
> 

