From: Robin Murphy <robin.murphy@arm.com>
To: Nadav Amit <nadav.amit@gmail.com>, Joerg Roedel <joro@8bytes.org>
Cc: linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
	Nadav Amit <namit@vmware.com>, Jiajun Cao <caojiajun@vmware.com>,
	Will Deacon <will@kernel.org>
Subject: Re: [PATCH v3 5/6] iommu/amd: Tailored gather logic for AMD
Date: Tue, 15 Jun 2021 13:55:18 +0100	[thread overview]
Message-ID: <1913c012-e6c0-1d5e-01b3-5f6da367c6bd@arm.com> (raw)
In-Reply-To: <20210607182541.119756-6-namit@vmware.com>

On 2021-06-07 19:25, Nadav Amit wrote:
> From: Nadav Amit <namit@vmware.com>
> 
> AMD's IOMMU can efficiently flush any range (i.e., in a single flush).
> This is in contrast, for instance, to Intel IOMMUs, which have a limit on
> the number of pages that can be flushed in a single flush. In addition,
> AMD's IOMMU does not care about the page-size, so changes of the page size
> do not need to trigger a TLB flush.
> 
> So in most cases, a TLB flush due to disjoint ranges or page-size changes
> is not needed for AMD. Yet, vIOMMUs require the hypervisor to
> synchronize the virtualized IOMMU's PTEs with the physical ones. This
> process induces overhead, so it is better not to cause unnecessary
> flushes, i.e., flushes of PTEs that were not modified.
> 
> Implement amd_iommu_iotlb_gather_add_page() and use it instead of the
> generic iommu_iotlb_gather_add_page(). Ignore page-size changes and
> disjoint regions unless the "non-present cache" feature is reported by
> the IOMMU capabilities, as this is an indication that we are running on
> a physical IOMMU. A similar indication is used by VT-d (see "caching
> mode"). The new logic retains the same flushing behavior that we had
> before the introduction of page-selective IOTLB flushes for AMD.
> 
> In virtualized environments, check whether the newly flushed region and
> the gathered one are disjoint, and flush if they are. Also check whether
> the new region would cause an IOTLB invalidation of a large region that
> would include unmodified PTEs. The latter check is done according to the
> "order" of the IOTLB flush.

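As a purely illustrative aside, the order check described above can be
modelled in userspace roughly as follows; this is only a sketch, not the
driver code, and it uses __builtin_clzll() as a stand-in for the kernel's
fls64():

#include <stdio.h>

/* Stand-in for fls64(): 1-based index of the most significant set bit, 0 for 0. */
static int fls64_equiv(unsigned long long x)
{
	return x ? 64 - __builtin_clzll(x) : 0;
}

/*
 * Decide whether merging the new range [start, end] into the gathered
 * range [g_start, g_end] would grow the flush "order" by more than one,
 * i.e. whether the combined invalidation would cover PTEs that were
 * never touched.
 */
static int merge_grows_order_too_much(unsigned long long g_start,
				      unsigned long long g_end,
				      unsigned long long start,
				      unsigned long long end)
{
	unsigned long long new_start = g_start < start ? g_start : start;
	unsigned long long new_end   = g_end   > end   ? g_end   : end;
	int msb_diff     = fls64_equiv(g_end ^ g_start);
	int new_msb_diff = fls64_equiv(new_end ^ new_start);

	return new_msb_diff > msb_diff + 1;
}

int main(void)
{
	/* Gathered range is one 4KiB page at 0x1000; a distant page blows the range up. */
	printf("%d\n", merge_grows_order_too_much(0x1000, 0x1fff, 0x100000, 0x100fff));
	/* Adding the adjacent page at 0x3000 only grows the order by one. */
	printf("%d\n", merge_grows_order_too_much(0x2000, 0x2fff, 0x3000, 0x3fff));
	return 0;
}
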
If it helps,

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

I wonder if it might be more effective to defer the alignment-based 
splitting part to amd_iommu_iotlb_sync() itself, but that could be 
investigated as another follow-up.

Robin.

> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Jiajun Cao <caojiajun@vmware.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Cc: iommu@lists.linux-foundation.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Nadav Amit <namit@vmware.com>
> ---
>   drivers/iommu/amd/iommu.c | 44 ++++++++++++++++++++++++++++++++++++++-
>   1 file changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 3e40f6610b6a..128f2e889ced 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2053,6 +2053,48 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
>   	return ret;
>   }
>   
> +static void amd_iommu_iotlb_gather_add_page(struct iommu_domain *domain,
> +					    struct iommu_iotlb_gather *gather,
> +					    unsigned long iova, size_t size)
> +{
> +	/*
> +	 * AMD's IOMMU can flush as many pages as necessary in a single flush.
> +	 * Unless we run in a virtual machine, which can be inferred according
> +	 * to whether "non-present cache" is on, it is probably best to prefer
> +	 * (potentially) too extensive TLB flushing (i.e., more misses) over
> +	 * multiple TLB flushes (i.e., more flushes). For virtual machines the
> +	 * hypervisor needs to synchronize the host IOMMU PTEs with those of
> +	 * the guest, and the trade-off is different: unnecessary TLB flushes
> +	 * should be avoided.
> +	 */
> +	if (amd_iommu_np_cache && gather->end != 0) {
> +		unsigned long start = iova, end = start + size - 1;
> +
> +		if (iommu_iotlb_gather_is_disjoint(gather, iova, size)) {
> +			/*
> +			 * If the new page is disjoint from the current range,
> +			 * flush.
> +			 */
> +			iommu_iotlb_sync(domain, gather);
> +		} else {
> +			/*
> +			 * If the order of TLB flushes increases by more than
> +			 * 1, it means that we would have to flush PTEs that
> +			 * were not modified. In this case, flush.
> +			 */
> +			unsigned long new_start = min(gather->start, start);
> +			unsigned long new_end = max(gather->end, end);
> +			int msb_diff = fls64(gather->end ^ gather->start);
> +			int new_msb_diff = fls64(new_end ^ new_start);
> +
> +			if (new_msb_diff > msb_diff + 1)
> +				iommu_iotlb_sync(domain, gather);
> +		}
> +	}
> +
> +	iommu_iotlb_gather_add_range(gather, iova, size);
> +}
> +
>   static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
>   			      size_t page_size,
>   			      struct iommu_iotlb_gather *gather)
> @@ -2067,7 +2109,7 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
>   
>   	r = (ops->unmap) ? ops->unmap(ops, iova, page_size, gather) : 0;
>   
> -	iommu_iotlb_gather_add_page(dom, gather, iova, page_size);
> +	amd_iommu_iotlb_gather_add_page(dom, gather, iova, page_size);
>   
>   	return r;
>   }
> 
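
(For completeness, the disjointness test relied on above, factored out in
patch 4/6, can be modelled standalone roughly as below; this is my
paraphrase for illustration only, not the helper itself.)

#include <stdbool.h>
#include <stdio.h>

/*
 * Model of the disjointness test: the new range [iova, iova + size - 1]
 * is disjoint from the gathered range [g_start, g_end] when it is neither
 * overlapping nor directly adjacent.
 */
static bool ranges_disjoint(unsigned long g_start, unsigned long g_end,
			    unsigned long iova, unsigned long size)
{
	unsigned long start = iova, end = start + size - 1;

	/* An empty gather (g_end == 0) is never considered disjoint. */
	return g_end != 0 && (end + 1 < g_start || start > g_end + 1);
}

int main(void)
{
	printf("%d\n", ranges_disjoint(0x1000, 0x1fff, 0x2000, 0x1000)); /* adjacent -> 0 */
	printf("%d\n", ranges_disjoint(0x1000, 0x1fff, 0x5000, 0x1000)); /* gap -> 1 */
	return 0;
}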
