All of lore.kernel.org
 help / color / mirror / Atom feed
From: Robin Murphy <robin.murphy@arm.com>
To: joro@8bytes.org
Cc: iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, dwmw2@infradead.org,
	thunder.leizhen@huawei.com, lorenzo.pieralisi@arm.com,
	ard.biesheuvel@linaro.org, Jonathan.Cameron@huawei.com,
	nwatters@codeaurora.org, ray.jui@broadcom.com
Subject: [PATCH v2 2/4] iommu/iova: Optimise the padding calculation
Date: Fri, 21 Jul 2017 12:41:59 +0100	[thread overview]
Message-ID: <4f964e56fe39c6ce1c84b8458c3a27dcb51077d4.1500636791.git.robin.murphy@arm.com> (raw)
In-Reply-To: <cover.1500636791.git.robin.murphy@arm.com>

From: Zhen Lei <thunder.leizhen@huawei.com>

The mask for calculating the padding size doesn't change, so there's no
need to recalculate it every loop iteration. Furthermore, Once we've
done that, it becomes clear that we don't actually need to calculate a
padding size at all - by flipping the arithmetic around, we can just
combine the upper limit, size, and mask directly to check against the
lower limit.

For an arm64 build, this alone knocks 16% off the size of the entire
alloc_iova() function!

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Zhen Lei <thunder.leizhen@huawei.com>
[rm: simplified more of the arithmetic, rewrote commit message]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---

v2: No change

 drivers/iommu/iova.c | 40 ++++++++++++++--------------------------
 1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 8f7552dc4e04..d094d1ca8f23 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -129,16 +129,6 @@ iova_insert_rbtree(struct rb_root *root, struct iova *iova,
 	rb_insert_color(&iova->node, root);
 }
 
-/*
- * Computes the padding size required, to make the start address
- * naturally aligned on the power-of-two order of its size
- */
-static unsigned int
-iova_get_pad_size(unsigned int size, unsigned int limit_pfn)
-{
-	return (limit_pfn - size) & (__roundup_pow_of_two(size) - 1);
-}
-
 static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 		unsigned long size, unsigned long limit_pfn,
 			struct iova *new, bool size_aligned)
@@ -146,7 +136,10 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 	struct rb_node *prev, *curr = NULL;
 	unsigned long flags;
 	unsigned long saved_pfn;
-	unsigned int pad_size = 0;
+	unsigned long align_mask = ~0UL;
+
+	if (size_aligned)
+		align_mask <<= __fls(size);
 
 	/* Walk the tree backwards */
 	spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
@@ -156,31 +149,26 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 	while (curr) {
 		struct iova *curr_iova = rb_entry(curr, struct iova, node);
 
-		if (limit_pfn <= curr_iova->pfn_lo) {
+		if (limit_pfn <= curr_iova->pfn_lo)
 			goto move_left;
-		} else if (limit_pfn > curr_iova->pfn_hi) {
-			if (size_aligned)
-				pad_size = iova_get_pad_size(size, limit_pfn);
-			if ((curr_iova->pfn_hi + size + pad_size) < limit_pfn)
-				break;	/* found a free slot */
-		}
+
+		if (((limit_pfn - size) & align_mask) > curr_iova->pfn_hi)
+			break;	/* found a free slot */
+
 		limit_pfn = curr_iova->pfn_lo;
 move_left:
 		prev = curr;
 		curr = rb_prev(curr);
 	}
 
-	if (!curr) {
-		if (size_aligned)
-			pad_size = iova_get_pad_size(size, limit_pfn);
-		if ((iovad->start_pfn + size + pad_size) > limit_pfn) {
-			spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
-			return -ENOMEM;
-		}
+	if (limit_pfn < size ||
+	    (!curr && ((limit_pfn - size) & align_mask) < iovad->start_pfn)) {
+		spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
+		return -ENOMEM;
 	}
 
 	/* pfn_lo will point to size aligned address if size_aligned is set */
-	new->pfn_lo = limit_pfn - (size + pad_size);
+	new->pfn_lo = (limit_pfn - size) & align_mask;
 	new->pfn_hi = new->pfn_lo + size - 1;
 
 	/* If we have 'prev', it's a valid place to start the insertion. */
-- 
2.12.2.dirty

WARNING: multiple messages have this Message-ID (diff)
From: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
To: joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org
Cc: ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org,
	Jonathan.Cameron-hv44wF8Li93QT0dZR+AlfA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	ray.jui-dY08KVG/lbpWk0Htik3J/w@public.gmane.org,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA@public.gmane.org,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
Subject: [PATCH v2 2/4] iommu/iova: Optimise the padding calculation
Date: Fri, 21 Jul 2017 12:41:59 +0100	[thread overview]
Message-ID: <4f964e56fe39c6ce1c84b8458c3a27dcb51077d4.1500636791.git.robin.murphy@arm.com> (raw)
In-Reply-To: <cover.1500636791.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>

From: Zhen Lei <thunder.leizhen-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>

The mask for calculating the padding size doesn't change, so there's no
need to recalculate it every loop iteration. Furthermore, Once we've
done that, it becomes clear that we don't actually need to calculate a
padding size at all - by flipping the arithmetic around, we can just
combine the upper limit, size, and mask directly to check against the
lower limit.

For an arm64 build, this alone knocks 16% off the size of the entire
alloc_iova() function!

Signed-off-by: Zhen Lei <thunder.leizhen-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
Tested-by: Zhen Lei <thunder.leizhen-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
[rm: simplified more of the arithmetic, rewrote commit message]
Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---

v2: No change

 drivers/iommu/iova.c | 40 ++++++++++++++--------------------------
 1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 8f7552dc4e04..d094d1ca8f23 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -129,16 +129,6 @@ iova_insert_rbtree(struct rb_root *root, struct iova *iova,
 	rb_insert_color(&iova->node, root);
 }
 
-/*
- * Computes the padding size required, to make the start address
- * naturally aligned on the power-of-two order of its size
- */
-static unsigned int
-iova_get_pad_size(unsigned int size, unsigned int limit_pfn)
-{
-	return (limit_pfn - size) & (__roundup_pow_of_two(size) - 1);
-}
-
 static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 		unsigned long size, unsigned long limit_pfn,
 			struct iova *new, bool size_aligned)
@@ -146,7 +136,10 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 	struct rb_node *prev, *curr = NULL;
 	unsigned long flags;
 	unsigned long saved_pfn;
-	unsigned int pad_size = 0;
+	unsigned long align_mask = ~0UL;
+
+	if (size_aligned)
+		align_mask <<= __fls(size);
 
 	/* Walk the tree backwards */
 	spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
@@ -156,31 +149,26 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 	while (curr) {
 		struct iova *curr_iova = rb_entry(curr, struct iova, node);
 
-		if (limit_pfn <= curr_iova->pfn_lo) {
+		if (limit_pfn <= curr_iova->pfn_lo)
 			goto move_left;
-		} else if (limit_pfn > curr_iova->pfn_hi) {
-			if (size_aligned)
-				pad_size = iova_get_pad_size(size, limit_pfn);
-			if ((curr_iova->pfn_hi + size + pad_size) < limit_pfn)
-				break;	/* found a free slot */
-		}
+
+		if (((limit_pfn - size) & align_mask) > curr_iova->pfn_hi)
+			break;	/* found a free slot */
+
 		limit_pfn = curr_iova->pfn_lo;
 move_left:
 		prev = curr;
 		curr = rb_prev(curr);
 	}
 
-	if (!curr) {
-		if (size_aligned)
-			pad_size = iova_get_pad_size(size, limit_pfn);
-		if ((iovad->start_pfn + size + pad_size) > limit_pfn) {
-			spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
-			return -ENOMEM;
-		}
+	if (limit_pfn < size ||
+	    (!curr && ((limit_pfn - size) & align_mask) < iovad->start_pfn)) {
+		spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
+		return -ENOMEM;
 	}
 
 	/* pfn_lo will point to size aligned address if size_aligned is set */
-	new->pfn_lo = limit_pfn - (size + pad_size);
+	new->pfn_lo = (limit_pfn - size) & align_mask;
 	new->pfn_hi = new->pfn_lo + size - 1;
 
 	/* If we have 'prev', it's a valid place to start the insertion. */
-- 
2.12.2.dirty

WARNING: multiple messages have this Message-ID (diff)
From: robin.murphy@arm.com (Robin Murphy)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2 2/4] iommu/iova: Optimise the padding calculation
Date: Fri, 21 Jul 2017 12:41:59 +0100	[thread overview]
Message-ID: <4f964e56fe39c6ce1c84b8458c3a27dcb51077d4.1500636791.git.robin.murphy@arm.com> (raw)
In-Reply-To: <cover.1500636791.git.robin.murphy@arm.com>

From: Zhen Lei <thunder.leizhen@huawei.com>

The mask for calculating the padding size doesn't change, so there's no
need to recalculate it every loop iteration. Furthermore, Once we've
done that, it becomes clear that we don't actually need to calculate a
padding size at all - by flipping the arithmetic around, we can just
combine the upper limit, size, and mask directly to check against the
lower limit.

For an arm64 build, this alone knocks 16% off the size of the entire
alloc_iova() function!

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Zhen Lei <thunder.leizhen@huawei.com>
[rm: simplified more of the arithmetic, rewrote commit message]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---

v2: No change

 drivers/iommu/iova.c | 40 ++++++++++++++--------------------------
 1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 8f7552dc4e04..d094d1ca8f23 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -129,16 +129,6 @@ iova_insert_rbtree(struct rb_root *root, struct iova *iova,
 	rb_insert_color(&iova->node, root);
 }
 
-/*
- * Computes the padding size required, to make the start address
- * naturally aligned on the power-of-two order of its size
- */
-static unsigned int
-iova_get_pad_size(unsigned int size, unsigned int limit_pfn)
-{
-	return (limit_pfn - size) & (__roundup_pow_of_two(size) - 1);
-}
-
 static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 		unsigned long size, unsigned long limit_pfn,
 			struct iova *new, bool size_aligned)
@@ -146,7 +136,10 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 	struct rb_node *prev, *curr = NULL;
 	unsigned long flags;
 	unsigned long saved_pfn;
-	unsigned int pad_size = 0;
+	unsigned long align_mask = ~0UL;
+
+	if (size_aligned)
+		align_mask <<= __fls(size);
 
 	/* Walk the tree backwards */
 	spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
@@ -156,31 +149,26 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 	while (curr) {
 		struct iova *curr_iova = rb_entry(curr, struct iova, node);
 
-		if (limit_pfn <= curr_iova->pfn_lo) {
+		if (limit_pfn <= curr_iova->pfn_lo)
 			goto move_left;
-		} else if (limit_pfn > curr_iova->pfn_hi) {
-			if (size_aligned)
-				pad_size = iova_get_pad_size(size, limit_pfn);
-			if ((curr_iova->pfn_hi + size + pad_size) < limit_pfn)
-				break;	/* found a free slot */
-		}
+
+		if (((limit_pfn - size) & align_mask) > curr_iova->pfn_hi)
+			break;	/* found a free slot */
+
 		limit_pfn = curr_iova->pfn_lo;
 move_left:
 		prev = curr;
 		curr = rb_prev(curr);
 	}
 
-	if (!curr) {
-		if (size_aligned)
-			pad_size = iova_get_pad_size(size, limit_pfn);
-		if ((iovad->start_pfn + size + pad_size) > limit_pfn) {
-			spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
-			return -ENOMEM;
-		}
+	if (limit_pfn < size ||
+	    (!curr && ((limit_pfn - size) & align_mask) < iovad->start_pfn)) {
+		spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
+		return -ENOMEM;
 	}
 
 	/* pfn_lo will point to size aligned address if size_aligned is set */
-	new->pfn_lo = limit_pfn - (size + pad_size);
+	new->pfn_lo = (limit_pfn - size) & align_mask;
 	new->pfn_hi = new->pfn_lo + size - 1;
 
 	/* If we have 'prev', it's a valid place to start the insertion. */
-- 
2.12.2.dirty

  parent reply	other threads:[~2017-07-21 11:42 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-21 11:41 [PATCH v2 0/4] Optimise 64-bit IOVA allocations Robin Murphy
2017-07-21 11:41 ` Robin Murphy
2017-07-21 11:41 ` [PATCH v2 1/4] iommu/iova: Optimise rbtree searching Robin Murphy
2017-07-21 11:41   ` Robin Murphy
2017-07-21 11:41   ` Robin Murphy
2017-07-21 11:41 ` Robin Murphy [this message]
2017-07-21 11:41   ` [PATCH v2 2/4] iommu/iova: Optimise the padding calculation Robin Murphy
2017-07-21 11:41   ` Robin Murphy
2017-07-21 11:42 ` [PATCH v2 3/4] iommu/iova: Extend rbtree node caching Robin Murphy
2017-07-21 11:42   ` Robin Murphy
2017-07-21 11:42   ` Robin Murphy
2017-07-29  3:57   ` Nate Watterson
2017-07-29  3:57     ` Nate Watterson
2017-07-31 11:42     ` Robin Murphy
2017-07-31 11:42       ` Robin Murphy
2017-07-31 11:42       ` Robin Murphy
2017-08-03 19:41       ` Nate Watterson
2017-08-03 19:41         ` Nate Watterson
2017-08-31  9:46         ` Leizhen (ThunderTown)
2017-08-31  9:46           ` Leizhen (ThunderTown)
2017-08-31  9:46           ` Leizhen (ThunderTown)
2017-09-19  9:42       ` Leizhen (ThunderTown)
2017-09-19  9:42         ` Leizhen (ThunderTown)
2017-09-19  9:42         ` Leizhen (ThunderTown)
2017-09-19 10:07         ` Robin Murphy
2017-09-19 10:07           ` Robin Murphy
2017-09-19 10:07           ` Robin Murphy
2017-07-21 11:42 ` [PATCH v2 4/4] iommu/iova: Make dma_32bit_pfn implicit Robin Murphy
2017-07-21 11:42   ` Robin Murphy
2017-07-26 11:08 ` [PATCH v2 0/4] Optimise 64-bit IOVA allocations Joerg Roedel
2017-07-26 11:08   ` Joerg Roedel
2017-07-26 11:08   ` Joerg Roedel
2017-07-26 11:17   ` Leizhen (ThunderTown)
2017-07-26 11:17     ` Leizhen (ThunderTown)
2017-07-26 11:17     ` Leizhen (ThunderTown)
2017-08-08 12:03     ` Ganapatrao Kulkarni
2017-08-08 12:03       ` Ganapatrao Kulkarni
2017-08-09  1:42       ` Leizhen (ThunderTown)
2017-08-09  1:42         ` Leizhen (ThunderTown)
2017-08-09  1:42         ` Leizhen (ThunderTown)
2017-08-09  3:24         ` Ganapatrao Kulkarni
2017-08-09  3:24           ` Ganapatrao Kulkarni
2017-08-09  3:24           ` Ganapatrao Kulkarni
2017-08-09  4:09           ` Leizhen (ThunderTown)
2017-08-09  4:09             ` Leizhen (ThunderTown)
2017-08-09  4:09             ` Leizhen (ThunderTown)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4f964e56fe39c6ce1c84b8458c3a27dcb51077d4.1500636791.git.robin.murphy@arm.com \
    --to=robin.murphy@arm.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=dwmw2@infradead.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=joro@8bytes.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=nwatters@codeaurora.org \
    --cc=ray.jui@broadcom.com \
    --cc=thunder.leizhen@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.