From: Alex Williamson <alex.williamson@redhat.com>
To: alex.williamson@redhat.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	jgg@nvidia.com, peterx@redhat.com, prime.zeng@hisilicon.com,
	cohuck@redhat.com
Subject: Re: [PATCH v2] vfio/pci: Handle concurrent vma faults
Date: Mon, 28 Jun 2021 10:46:53 -0600
Message-ID: <20210628104653.4ca65921.alex.williamson@redhat.com>
In-Reply-To: <161540257788.10151.6284852774772157400.stgit@gimli.home>

On Wed, 10 Mar 2021 11:58:07 -0700
Alex Williamson <alex.williamson@redhat.com> wrote:

> vfio_pci_mmap_fault() incorrectly makes use of io_remap_pfn_range()
> from within a vm_ops fault handler.  This function will trigger a
> BUG_ON if it encounters a populated pte within the remapped range,
> where any fault is meant to populate the entire vma.  Concurrent
> inflight faults to the same vma will therefore hit this issue,
> triggering traces such as:
> 
> [ 1591.733256] kernel BUG at mm/memory.c:2177!
> [ 1591.739515] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [ 1591.747381] Modules linked in: vfio_iommu_type1 vfio_pci vfio_virqfd vfio pv680_mii(O)
> [ 1591.760536] CPU: 2 PID: 227 Comm: lcore-worker-2 Tainted: G O 5.11.0-rc3+ #1
> [ 1591.770735] Hardware name:  , BIOS HixxxxFPGA 1P B600 V121-1
> [ 1591.778872] pstate: 40400009 (nZcv daif +PAN -UAO -TCO BTYPE=--)
> [ 1591.786134] pc : remap_pfn_range+0x214/0x340
> [ 1591.793564] lr : remap_pfn_range+0x1b8/0x340
> [ 1591.799117] sp : ffff80001068bbd0
> [ 1591.803476] x29: ffff80001068bbd0 x28: 0000042eff6f0000
> [ 1591.810404] x27: 0000001100910000 x26: 0000001300910000
> [ 1591.817457] x25: 0068000000000fd3 x24: ffffa92f1338e358
> [ 1591.825144] x23: 0000001140000000 x22: 0000000000000041
> [ 1591.832506] x21: 0000001300910000 x20: ffffa92f141a4000
> [ 1591.839520] x19: 0000001100a00000 x18: 0000000000000000
> [ 1591.846108] x17: 0000000000000000 x16: ffffa92f11844540
> [ 1591.853570] x15: 0000000000000000 x14: 0000000000000000
> [ 1591.860768] x13: fffffc0000000000 x12: 0000000000000880
> [ 1591.868053] x11: ffff0821bf3d01d0 x10: ffff5ef2abd89000
> [ 1591.875932] x9 : ffffa92f12ab0064 x8 : ffffa92f136471c0
> [ 1591.883208] x7 : 0000001140910000 x6 : 0000000200000000
> [ 1591.890177] x5 : 0000000000000001 x4 : 0000000000000001
> [ 1591.896656] x3 : 0000000000000000 x2 : 0168044000000fd3
> [ 1591.903215] x1 : ffff082126261880 x0 : fffffc2084989868
> [ 1591.910234] Call trace:
> [ 1591.914837]  remap_pfn_range+0x214/0x340
> [ 1591.921765]  vfio_pci_mmap_fault+0xac/0x130 [vfio_pci]
> [ 1591.931200]  __do_fault+0x44/0x12c
> [ 1591.937031]  handle_mm_fault+0xcc8/0x1230
> [ 1591.942475]  do_page_fault+0x16c/0x484
> [ 1591.948635]  do_translation_fault+0xbc/0xd8
> [ 1591.954171]  do_mem_abort+0x4c/0xc0
> [ 1591.960316]  el0_da+0x40/0x80
> [ 1591.965585]  el0_sync_handler+0x168/0x1b0
> [ 1591.971608]  el0_sync+0x174/0x180
> [ 1591.978312] Code: eb1b027f 540000c0 f9400022 b4fffe02 (d4210000)
> 
> Switch to using vmf_insert_pfn() to allow replacing mappings, and
> include decrypted memory protection as formerly provided by
> io_remap_pfn_range().  Tracking of vmas is also updated to
> prevent duplicate entries.
> 
> Fixes: 11c4cd07ba11 ("vfio-pci: Fault mmaps to enable vma tracking")
> Reported-by: Zeng Tao <prime.zeng@hisilicon.com>
> Suggested-by: Zeng Tao <prime.zeng@hisilicon.com>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
> 
> v2: Set decrypted pgprot in mmap, use non-_prot vmf_insert_pfn()
>     as suggested by Jason G.

IIRC, there were no blocking issues on this patch as an interim fix to
resolve the concurrent fault issues with io_remap_pfn_range().
Unfortunately it also got no Reviewed-by or Tested-by feedback.  I'd
like to put this in for v5.14 (should have gone in earlier).  Any final
comments?  Thanks,

Alex

> 
>  drivers/vfio/pci/vfio_pci.c |   30 ++++++++++++++++++------------
>  1 file changed, 18 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 65e7e6b44578..73e125d73640 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1573,6 +1573,11 @@ static int __vfio_pci_add_vma(struct vfio_pci_device *vdev,
>  {
>  	struct vfio_pci_mmap_vma *mmap_vma;
>  
> +	list_for_each_entry(mmap_vma, &vdev->vma_list, vma_next) {
> +		if (mmap_vma->vma == vma)
> +			return 0; /* Swallow the error, the vma is tracked */
> +	}
> +
>  	mmap_vma = kmalloc(sizeof(*mmap_vma), GFP_KERNEL);
>  	if (!mmap_vma)
>  		return -ENOMEM;
> @@ -1612,31 +1617,31 @@ static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
>  {
>  	struct vm_area_struct *vma = vmf->vma;
>  	struct vfio_pci_device *vdev = vma->vm_private_data;
> -	vm_fault_t ret = VM_FAULT_NOPAGE;
> +	unsigned long vaddr = vma->vm_start, pfn = vma->vm_pgoff;
> +	vm_fault_t ret = VM_FAULT_SIGBUS;
>  
>  	mutex_lock(&vdev->vma_lock);
>  	down_read(&vdev->memory_lock);
>  
> -	if (!__vfio_pci_memory_enabled(vdev)) {
> -		ret = VM_FAULT_SIGBUS;
> -		mutex_unlock(&vdev->vma_lock);
> +	if (!__vfio_pci_memory_enabled(vdev))
>  		goto up_out;
> +
> +	for (; vaddr < vma->vm_end; vaddr += PAGE_SIZE, pfn++) {
> +		ret = vmf_insert_pfn(vma, vaddr, pfn);
> +		if (ret != VM_FAULT_NOPAGE) {
> +			zap_vma_ptes(vma, vma->vm_start, vaddr - vma->vm_start);
> +			goto up_out;
> +		}
>  	}
>  
>  	if (__vfio_pci_add_vma(vdev, vma)) {
>  		ret = VM_FAULT_OOM;
> -		mutex_unlock(&vdev->vma_lock);
> -		goto up_out;
> +		zap_vma_ptes(vma, vma->vm_start, vma->vm_end - vma->vm_start);
>  	}
>  
> -	mutex_unlock(&vdev->vma_lock);
> -
> -	if (io_remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
> -			       vma->vm_end - vma->vm_start, vma->vm_page_prot))
> -		ret = VM_FAULT_SIGBUS;
> -
>  up_out:
>  	up_read(&vdev->memory_lock);
> +	mutex_unlock(&vdev->vma_lock);
>  	return ret;
>  }
>  
> @@ -1702,6 +1707,7 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
>  
>  	vma->vm_private_data = vdev;
>  	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> +	vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
>  	vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff;
>  
>  	/*
> 
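
For readers following the commit message, here is a minimal userspace
sketch of why a second fault on the same vma is fatal with a
whole-range remap but harmless with per-page insertion.  It is a toy
model only, not the kernel implementation: every toy_* name, the
fixed-size arrays, and the "0 means empty pte" convention are
assumptions made for illustration.  The two "concurrent" faults are
simulated as two sequential calls on the same vma.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_NPAGES      4
#define TOY_MAX_TRACKED 8

/* Hypothetical stand-in for a vma; a pte value of 0 means "not populated". */
struct toy_vma {
	unsigned long pgoff;            /* first pfn backing the mapping */
	unsigned long pte[TOY_NPAGES];  /* one slot per page in the vma */
};

/* Hypothetical stand-in for the per-device list of tracked vmas. */
struct toy_dev {
	const struct toy_vma *tracked[TOY_MAX_TRACKED];
	size_t ntracked;
};

/*
 * Models the old fault path: remap the whole range and refuse any slot
 * that is already populated.  The kernel hits a BUG_ON at this point;
 * the toy just reports failure.
 */
static bool toy_fault_remap(struct toy_vma *vma)
{
	for (size_t i = 0; i < TOY_NPAGES; i++) {
		if (vma->pte[i] != 0)
			return false;
		vma->pte[i] = vma->pgoff + i;
	}
	return true;
}

/* Models per-page insertion: an already populated slot is left alone. */
static void toy_insert_pfn(struct toy_vma *vma, size_t idx, unsigned long pfn)
{
	assert(idx < TOY_NPAGES);
	if (vma->pte[idx] == 0)
		vma->pte[idx] = pfn;
}

/* Track a vma once; a repeat registration is not an error. */
static int toy_add_vma(struct toy_dev *dev, const struct toy_vma *vma)
{
	for (size_t i = 0; i < dev->ntracked; i++)
		if (dev->tracked[i] == vma)
			return 0;       /* already tracked */
	if (dev->ntracked == TOY_MAX_TRACKED)
		return -1;              /* stands in for -ENOMEM */
	dev->tracked[dev->ntracked++] = vma;
	return 0;
}

/* Models the new fault path: populate page by page, then track the vma. */
static bool toy_fault_insert(struct toy_dev *dev, struct toy_vma *vma)
{
	for (size_t i = 0; i < TOY_NPAGES; i++)
		toy_insert_pfn(vma, i, vma->pgoff + i);

	if (toy_add_vma(dev, vma) != 0) {
		/* Undo the mapping if the vma cannot be tracked. */
		memset(vma->pte, 0, sizeof(vma->pte));
		return false;
	}
	return true;
}
```

Calling toy_fault_remap() twice on one vma fails the second time,
while calling toy_fault_insert() twice succeeds both times and leaves
a single entry in the tracking array.  The toy omits all locking; the
patch itself serializes this path with vma_lock and memory_lock.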

