linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kirti Wankhede <kwankhede@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: <pbonzini@redhat.com>, <kraxel@redhat.com>, <cjia@nvidia.com>,
	<qemu-devel@nongnu.org>, <kvm@vger.kernel.org>,
	<kevin.tian@intel.com>, <jike.song@intel.com>,
	<bjsdjshi@linux.vnet.ibm.com>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v11 11/22] vfio iommu: Add blocking notifier to notify DMA_UNMAP
Date: Wed, 9 Nov 2016 01:29:19 +0530	[thread overview]
Message-ID: <675f0d46-aa2b-7404-701e-d9ce17b64549@nvidia.com> (raw)
In-Reply-To: <20161108104619.6f76b918@t450s.home>



On 11/8/2016 11:16 PM, Alex Williamson wrote:
> On Tue, 8 Nov 2016 21:56:29 +0530
> Kirti Wankhede <kwankhede@nvidia.com> wrote:
> 
>> On 11/8/2016 5:15 AM, Alex Williamson wrote:
>>> On Sat, 5 Nov 2016 02:40:45 +0530
>>> Kirti Wankhede <kwankhede@nvidia.com> wrote:
>>>   
>> ...
>>>>  
>>>> +int vfio_register_notifier(struct device *dev, struct notifier_block *nb)  
>>>
>>> Is the expectation here that this is a generic notifier for all
>>> vfio->mdev signaling?  That should probably be made clear in the mdev
>>> API to avoid vendor drivers assuming their notifier callback only
>>> occurs for unmaps, even if that's currently the case.
>>>   
>>
>> Ok. Adding comment about notifier callback in mdev_device which is part
>> of next patch.
>>
>> ...
>>
>>>>  	mutex_lock(&iommu->lock);
>>>>  
>>>> -	if (!iommu->external_domain) {
>>>> +	/* Fail if notifier list is empty */
>>>> +	if ((!iommu->external_domain) || (!iommu->notifier.head)) {
>>>>  		ret = -EINVAL;
>>>>  		goto pin_done;
>>>>  	}
>>>> @@ -867,6 +870,11 @@ unlock:
>>>>  	/* Report how much was unmapped */
>>>>  	unmap->size = unmapped;
>>>>  
>>>> +	if (unmapped && iommu->external_domain)
>>>> +		blocking_notifier_call_chain(&iommu->notifier,
>>>> +					     VFIO_IOMMU_NOTIFY_DMA_UNMAP,
>>>> +					     unmap);  
>>>
>>> This is after the fact, there's already a gap here where pages are
>>> unpinned and the mdev device is still running.  
>>
>> Oh, there is a bug here, now unpin_pages() take user_pfn as argument and
>> find vfio_dma. If its not found, it doesn't unpin pages. We have to call
>> this notifier before vfio_remove_dma(). But if we call this before
>> vfio_remove_dma() there will be deadlock since iommu->lock is already
>> held here and vfio_iommu_type1_unpin_pages() will also try to hold
>> iommu->lock.
>> If we want to call blocking_notifier_call_chain() before
>> vfio_remove_dma(), sequence should be:
>>
>> unmapped += dma->size;
>> mutex_unlock(&iommu->lock);
>> if (iommu->external_domain)) {
>> 	struct vfio_iommu_type1_dma_unmap nb_unmap;
>>
>> 	nb_unmap.iova = dma->iova;
>> 	nb_unmap.size = dma->size;
>> 	blocking_notifier_call_chain(&iommu->notifier,
>>         	                     VFIO_IOMMU_NOTIFY_DMA_UNMAP,
>>                 	             &nb_unmap);
>> }
>> mutex_lock(&iommu->lock);
>> vfio_remove_dma(iommu, dma);
> 
> It seems like it would be worthwhile to have the rb-tree rooted in the
> vfio-dma, then we only need to call the notifier if there are pages
> pinned within that vfio-dma (ie. the rb-tree is not empty).  We can
> then release the lock call the notifier, re-acquire the lock, and
> BUG_ON if the rb-tree still is not empty.  We might get duplicate pfns
> between separate vfio_dma structs, but as I mentioned in other replies,
> that seems like an exception that we don't need to optimize for.
> 

If we don't optimize for the case where iova from different vfio_dma are
mapped to same pfn and we would not consider this case for page
accounting then:
- have rb tree of pinned iova, where key would be iova, in each vfio_dma
structure.
- iova tracking structure would have iova and ref_count only.
- page accounting would only count number of iova's in rb_tree, case
where different iova could map to same pfn would not be considered in
this implementation for now.
- vfio_unpin_pages() would have user_pfn and pfn as input, we would
validate that iova exist in rb tree and trust vendor driver that
corresponding pfn is correct, there is no validation of pfn. If want
validate pfn, call GUP, verify pfn and call put_pfn().
- In .release() or .detach_group() path, if there are entries in this rb
tree, call GUP again using that iova, get pfn and then call
put_pfn(pfn) for ref_count+1 times. This is because we are not keeping
pfn in our tracking logic.

Does this sound reasonable?

Thanks,
Kirti


>>>  The notifier needs to
>>> happen prior to that and I suspect that we need to validate that we
>>> have no remaining external pfn references within this vfio_dma block.
>>> It seems like we need to root our pfn tracking in the vfio_dma so that
>>> we can see that it's empty after the notifier chain and BUG_ON if not.  
>>
>> There is no way to find pfns from that iova range with current
>> implementation. We can have this validate if we go with linear array of
>> iova to track pfns.
> 
> Right, I was still hoping to avoid storing the pfn even with the
> array/page-table approach though, ask the mm layer for the mapping
> again.  Is that too much overhead?  Maybe the page table could store
> the phys addr and we could use PAGE_MASK to store the reference count
> so that each entry is still only 8bytes(?)
>  
>>> I would also add some enforcement that external pinning is only enabled
>>> when vfio_iommu_type1 is configured for v2 semantics (ie. we only
>>> support unmaps exactly matching previous maps).
>>>   
>>
>> Ok I'll add that check.
>>
>> Thanks,
>> Kirti
> 

  reply	other threads:[~2016-11-08 20:00 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-04 21:10 [PATCH v11 00/22] Add Mediated device support Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 01/22] vfio: Mediated device Core driver Kirti Wankhede
2016-11-07  6:40   ` Tian, Kevin
     [not found]   ` <20161108092552.GA2090@bjsdjshi@linux.vnet.ibm.com>
2016-11-08 21:06     ` Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 02/22] vfio: VFIO based driver for Mediated devices Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 03/22] vfio: Rearrange functions to get vfio_group from dev Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 04/22] vfio: Common function to increment container_users Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 05/22] vfio iommu: Added pin and unpin callback functions to vfio_iommu_driver_ops Kirti Wankhede
2016-11-07 19:36   ` Alex Williamson
2016-11-08 13:55     ` Kirti Wankhede
2016-11-08 16:39       ` Alex Williamson
2016-11-08 18:47         ` Kirti Wankhede
2016-11-08 19:14           ` Alex Williamson
2016-11-04 21:10 ` [PATCH v11 06/22] vfio iommu type1: Update arguments of vfio_lock_acct Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 07/22] vfio iommu type1: Update argument of vaddr_get_pfn() Kirti Wankhede
2016-11-07  8:42   ` Alexey Kardashevskiy
2016-11-04 21:10 ` [PATCH v11 08/22] vfio iommu type1: Add find_iommu_group() function Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 09/22] vfio iommu type1: Add task structure to vfio_dma Kirti Wankhede
2016-11-07 21:03   ` Alex Williamson
2016-11-08 14:13     ` Kirti Wankhede
2016-11-08 16:43       ` Alex Williamson
2016-11-04 21:10 ` [PATCH v11 10/22] vfio iommu type1: Add support for mediated devices Kirti Wankhede
2016-11-07 23:16   ` Alex Williamson
2016-11-08  2:20     ` Jike Song
2016-11-08 16:18       ` Alex Williamson
2016-11-08 15:06     ` Kirti Wankhede
2016-11-08 17:05       ` Alex Williamson
2016-11-08  6:52   ` Alexey Kardashevskiy
2016-11-15  5:17     ` Alexey Kardashevskiy
2016-11-15  6:33       ` Kirti Wankhede
2016-11-15  7:27         ` Alexey Kardashevskiy
2016-11-15  7:56           ` Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 11/22] vfio iommu: Add blocking notifier to notify DMA_UNMAP Kirti Wankhede
2016-11-07 23:45   ` Alex Williamson
2016-11-08 16:26     ` Kirti Wankhede
2016-11-08 17:46       ` Alex Williamson
2016-11-08 19:59         ` Kirti Wankhede [this message]
2016-11-08 21:28           ` Alex Williamson
2016-11-14  7:52             ` Kirti Wankhede
2016-11-14 15:37               ` Alex Williamson
2016-11-04 21:10 ` [PATCH v11 12/22] vfio: Add notifier callback to parent's ops structure of mdev Kirti Wankhede
2016-11-07 23:51   ` Alex Williamson
2016-11-04 21:10 ` [PATCH v11 13/22] vfio: Introduce common function to add capabilities Kirti Wankhede
2016-11-08  7:29   ` Alexey Kardashevskiy
2016-11-08 20:46     ` Kirti Wankhede
2016-11-08 21:42       ` Alex Williamson
2016-11-09  2:23         ` Alexey Kardashevskiy
2016-11-04 21:10 ` [PATCH v11 14/22] vfio_pci: Update vfio_pci to use vfio_info_add_capability() Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 15/22] vfio: Introduce vfio_set_irqs_validate_and_prepare() Kirti Wankhede
2016-11-08  8:46   ` Alexey Kardashevskiy
2016-11-08 20:22     ` Kirti Wankhede
2016-11-09  3:07       ` Alexey Kardashevskiy
2016-11-09  3:35         ` Alex Williamson
2016-11-04 21:10 ` [PATCH v11 16/22] vfio_pci: Updated to use vfio_set_irqs_validate_and_prepare() Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 17/22] vfio_platform: " Kirti Wankhede
2016-11-08  8:52   ` Alexey Kardashevskiy
2016-11-08 20:41     ` Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 18/22] vfio: Define device_api strings Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 19/22] docs: Add Documentation for Mediated devices Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 20/22] docs: Sysfs ABI for mediated device framework Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 21/22] docs: Sample driver to demonstrate how to use Mediated " Kirti Wankhede
2016-11-04 21:10 ` [PATCH v11 22/22] MAINTAINERS: Add entry VFIO based Mediated device drivers Kirti Wankhede
2016-11-07  3:30 ` [PATCH v11 00/22] Add Mediated device support Alexey Kardashevskiy
2016-11-07  3:59   ` Kirti Wankhede
2016-11-07  5:06     ` Kirti Wankhede
2016-11-07  6:15     ` Alexey Kardashevskiy
2016-11-07  6:36       ` Kirti Wankhede
2016-11-07  6:46         ` Alexey Kardashevskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=675f0d46-aa2b-7404-701e-d9ce17b64549@nvidia.com \
    --to=kwankhede@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=bjsdjshi@linux.vnet.ibm.com \
    --cc=cjia@nvidia.com \
    --cc=jike.song@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kraxel@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).