linux-arm-kernel.lists.infradead.org archive mirror
From: Shenming Lu <lushenming@huawei.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>, Will Deacon <will@kernel.org>,
	"Robin Murphy" <robin.murphy@arm.com>,
	Joerg Roedel <joro@8bytes.org>,
	"Jean-Philippe Brucker" <jean-philippe@linaro.org>,
	Eric Auger <eric.auger@redhat.com>, <kvm@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<iommu@lists.linux-foundation.org>, <linux-api@vger.kernel.org>,
	Kevin Tian <kevin.tian@intel.com>,
	Lu Baolu <baolu.lu@linux.intel.com>, <yi.l.liu@intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	"Barry Song" <song.bao.hua@hisilicon.com>,
	<wanghaibin.wang@huawei.com>, <yuzenghui@huawei.com>
Subject: Re: [RFC PATCH v3 6/8] vfio/type1: No need to statically pin and map if IOPF enabled
Date: Fri, 21 May 2021 14:39:03 +0800
Message-ID: <f0b64de4-adda-1dc2-464b-00de5a8279cb@huawei.com>
In-Reply-To: <20210518125818.2282941f.alex.williamson@redhat.com>

On 2021/5/19 2:58, Alex Williamson wrote:
> On Fri, 9 Apr 2021 11:44:18 +0800
> Shenming Lu <lushenming@huawei.com> wrote:
> 
>> If IOPF enabled for the VFIO container, there is no need to statically
>> pin and map the entire DMA range, we can do it on demand. And unmap
>> according to the IOPF mapped bitmap when removing vfio_dma.
>>
>> Note that we still mark all pages dirty even if IOPF enabled, we may
>> add IOPF-based fine grained dirty tracking support in the future.
>>
>> Signed-off-by: Shenming Lu <lushenming@huawei.com>
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 38 +++++++++++++++++++++++++++------
>>  1 file changed, 32 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index 7df5711e743a..dcc93c3b258c 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -175,6 +175,7 @@ struct vfio_iopf_group {
>>  #define IOPF_MAPPED_BITMAP_GET(dma, i)	\
>>  			      ((dma->iopf_mapped_bitmap[(i) / BITS_PER_LONG]	\
>>  			       >> ((i) % BITS_PER_LONG)) & 0x1)  
>> +#define IOPF_MAPPED_BITMAP_BYTES(n)	DIRTY_BITMAP_BYTES(n)
>>  
>>  #define WAITED 1
>>  
>> @@ -959,7 +960,8 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>>  	 * already pinned and accounted. Accouting should be done if there is no
>>  	 * iommu capable domain in the container.
>>  	 */
>> -	do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu);
>> +	do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu) ||
>> +			iommu->iopf_enabled;
>>  
>>  	for (i = 0; i < npage; i++) {
>>  		struct vfio_pfn *vpfn;
>> @@ -1048,7 +1050,8 @@ static int vfio_iommu_type1_unpin_pages(void *iommu_data,
>>  
>>  	mutex_lock(&iommu->lock);
>>  
>> -	do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu);
>> +	do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu) ||
>> +			iommu->iopf_enabled;
> 
> pin/unpin are actually still pinning pages, why does iopf exempt them
> from accounting?

IOPF does not exempt them: if iopf_enabled is true, do_accounting is true
as well, so the externally pinned pages are still accounted.
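
To spell out the expression, here is a standalone sketch (not kernel code).
The struct and the helper name are made up for illustration; the two fields
only model IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu) and iommu->iopf_enabled:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two container properties the patch tests. */
struct sketch_container {
	bool has_iommu_cap_domain;	/* models IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu) */
	bool iopf_enabled;		/* models iommu->iopf_enabled */
};

/* Mirrors the patched expression used in the pin/unpin paths. */
static bool sketch_do_accounting(const struct sketch_container *c)
{
	return !c->has_iommu_cap_domain || c->iopf_enabled;
}
```

With an IOMMU capable domain and IOPF disabled the result is false (pages
were already pinned and accounted at map time); in every other combination,
including IOPF enabled, the result is true.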

> 
> 
>>  	for (i = 0; i < npage; i++) {
>>  		struct vfio_dma *dma;
>>  		dma_addr_t iova;
>> @@ -1169,7 +1172,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
>>  	if (!dma->size)
>>  		return 0;
>>  
>> -	if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu))
>> +	if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu) || iommu->iopf_enabled)
>>  		return 0;
>>  
>>  	/*
>> @@ -1306,11 +1309,20 @@ static void vfio_unmap_partial_iopf(struct vfio_iommu *iommu,
>>  	}
>>  }
>>  
>> +static void vfio_dma_clean_iopf(struct vfio_iommu *iommu, struct vfio_dma *dma)
>> +{
>> +	vfio_unmap_partial_iopf(iommu, dma, dma->iova, dma->iova + dma->size);
>> +
>> +	kfree(dma->iopf_mapped_bitmap);
>> +}
>> +
>>  static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
>>  {
>>  	WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list));
>>  	vfio_unmap_unpin(iommu, dma, true);
>>  	vfio_unlink_dma(iommu, dma);
>> +	if (iommu->iopf_enabled)
>> +		vfio_dma_clean_iopf(iommu, dma);
>>  	put_task_struct(dma->task);
>>  	vfio_dma_bitmap_free(dma);
>>  	if (dma->vaddr_invalid) {
>> @@ -1359,7 +1371,8 @@ static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
>>  	 * mark all pages dirty if any IOMMU capable device is not able
>>  	 * to report dirty pages and all pages are pinned and mapped.
>>  	 */
>> -	if (iommu->num_non_pinned_groups && dma->iommu_mapped)
>> +	if (iommu->num_non_pinned_groups &&
>> +	    (dma->iommu_mapped || iommu->iopf_enabled))
>>  		bitmap_set(dma->bitmap, 0, nbits);
> 
> This seems like really poor integration of iopf into dirty page
> tracking.  I'd expect dirty logging to flush the mapped pages and
> write faults to mark pages dirty.  Shouldn't the fault handler also
> provide only the access faulted, so for example a read fault wouldn't
> mark the page dirty?

I just wanted to keep the existing behavior here: even if IOPF is enabled,
we still mark all pages dirty.

We can distinguish between write and read faults in the fault handler,
so IOPF-based fine-grained dirty tracking is possible. But I am not sure
whether it needs to be implemented now. Could we consider it in the
future?
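
As a rough userspace model of what such tracking could look like (a sketch
only, not the handler from this series): every fault records the page as
mapped, and only a write fault records it as dirty. All names below are
invented for illustration. The sketch also assumes that a read fault
installs a read-only mapping, so that a later write to the same page
faults again; without that, writes after a read fault would go unseen.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_FAULT_READ	(1u << 0)	/* hypothetical permission flags */
#define SKETCH_FAULT_WRITE	(1u << 1)

static void sketch_set_bit(unsigned long *map, size_t i)
{
	map[i / SKETCH_BITS_PER_LONG] |= 1UL << (i % SKETCH_BITS_PER_LONG);
}

static bool sketch_test_bit(const unsigned long *map, size_t i)
{
	return (map[i / SKETCH_BITS_PER_LONG] >> (i % SKETCH_BITS_PER_LONG)) & 1UL;
}

/* Record a fault on a page index: always mapped, dirty only on write. */
static void sketch_handle_fault(unsigned long *mapped, unsigned long *dirty,
				size_t page, unsigned int perm)
{
	sketch_set_bit(mapped, page);
	if (perm & SKETCH_FAULT_WRITE)
		sketch_set_bit(dirty, page);
}
```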

> 
>>  
>>  	if (shift) {
>> @@ -1772,6 +1785,16 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>  		goto out_unlock;
>>  	}
>>  
>> +	if (iommu->iopf_enabled) {
>> +		dma->iopf_mapped_bitmap = kvzalloc(IOPF_MAPPED_BITMAP_BYTES(
>> +						size >> PAGE_SHIFT), GFP_KERNEL);
>> +		if (!dma->iopf_mapped_bitmap) {
>> +			ret = -ENOMEM;
>> +			kfree(dma);
>> +			goto out_unlock;
>> +		}
> 
> 
> So we're assuming nothing can fault and therefore nothing can reference
> the iopf_mapped_bitmap until this point in the series?

I will move this allocation to the front of the series, so that the bitmap
exists before any later patch can reference it.
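
For a sense of how much memory the allocation takes, here is a sketch that
assumes one bit per page and rounds up to whole 64-bit words. The patch
reuses DIRTY_BITMAP_BYTES() for the size; the rounding shown here is an
assumption about that macro, and SKETCH_PAGE_SHIFT and the helper name are
invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12	/* assumes 4 KiB pages */

/* Bytes needed for one bit per page, rounded up to a whole 64-bit word. */
static size_t sketch_bitmap_bytes(uint64_t size)
{
	uint64_t npages = size >> SKETCH_PAGE_SHIFT;
	uint64_t words = (npages + 63) / 64;

	return (size_t)(words * sizeof(uint64_t));
}
```

Under these assumptions a 1 GiB mapping needs a 32 KiB bitmap.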

Thanks,
Shenming

> 
> 
>> +	}
>> +
>>  	iommu->dma_avail--;
>>  	dma->iova = iova;
>>  	dma->vaddr = vaddr;
>> @@ -1811,8 +1834,11 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>  	/* Insert zero-sized and grow as we map chunks of it */
>>  	vfio_link_dma(iommu, dma);
>>  
>> -	/* Don't pin and map if container doesn't contain IOMMU capable domain*/
>> -	if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu))
>> +	/*
>> +	 * Don't pin and map if container doesn't contain IOMMU capable domain,
>> +	 * or IOPF enabled for the container.
>> +	 */
>> +	if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu) || iommu->iopf_enabled)
>>  		dma->size = size;
>>  	else
>>  		ret = vfio_pin_map_dma(iommu, dma, size);
> 
> 

