Subject: Re: [PATCH 3/7] iommu/iova: insert start_pfn boundary of dma32
From: Robin Murphy
To: Zhen Lei, Joerg Roedel, iommu, David Woodhouse, Sudeep Dutt, Ashutosh Dixit, linux-kernel
Cc: Zefan Li, Xinwei Hu, Tianhong Ding, Hanjun Guo
Date: Thu, 23 Mar 2017 13:01:50 +0000
Message-ID: <85bb4d05-b0a7-333b-c5e1-163402b44327@arm.com>
In-Reply-To: <1490164067-12552-4-git-send-email-thunder.leizhen@huawei.com>
References: <1490164067-12552-1-git-send-email-thunder.leizhen@huawei.com> <1490164067-12552-4-git-send-email-thunder.leizhen@huawei.com>

On 22/03/17 06:27, Zhen Lei wrote:
> Reserve the first granule size memory(start at start_pfn) as boundary
> iova, to make sure that iovad->cached32_node can not be NULL in future.
> Meanwhile, changed the assignment of iovad->cached32_node from rb_next to
> rb_prev of &free->node in function __cached_rbnode_delete_update.

I'm not sure I follow this. It's a top-down allocator, so cached32_node
points to the last node allocated (or the node above the last one freed)
on the assumption that there is likely free space directly below there,
thus it's a pretty good place for the next allocation to start searching
from. On the other hand, start_pfn is a hard "do not go below this line"
limit, so it doesn't seem to make any sense to ever point the former at
the latter.

I could understand slightly more if we were reserving the PFN *above* the
cached range, but as-is I don't see what we gain from the change here,
nor what benefit the cached32_node != NULL assumption gives (AFAICS it
would be more useful to simply restrict the cases where it may be NULL to
when the address space is either completely full or completely empty, or
perhaps both).

Robin.
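For reference, a minimal sketch of the existing lookup described above,
reconstructed from the '-' lines of the patch quoted below; the function
name is only a placeholder for illustration, not the in-tree
__get_cached_rbnode:

/*
 * Current behaviour: resume the search from the cached 32-bit node when
 * one exists and the limit lies within the 32-bit range, otherwise start
 * from the top of the rbtree; the limit is clamped to just below the
 * cached entry before stepping down to the previous node.
 */
static struct rb_node *
cached_rbnode_sketch(struct iova_domain *iovad, unsigned long *limit_pfn)
{
        if ((*limit_pfn > iovad->dma_32bit_pfn) || !iovad->cached32_node)
                return rb_last(&iovad->rbroot);

        /* Clamp the limit below the cached entry, then walk downwards. */
        *limit_pfn = rb_entry(iovad->cached32_node, struct iova, node)->pfn_lo - 1;
        return rb_prev(iovad->cached32_node);
}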
> Signed-off-by: Zhen Lei
> ---
>  drivers/iommu/iova.c | 63 ++++++++++++++++++++++++++++++----------------------
>  1 file changed, 37 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 1c49969..b5a148e 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -32,6 +32,17 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
>  static void init_iova_rcaches(struct iova_domain *iovad);
>  static void free_iova_rcaches(struct iova_domain *iovad);
>
> +static void
> +insert_iova_boundary(struct iova_domain *iovad)
> +{
> +        struct iova *iova;
> +        unsigned long start_pfn_32bit = iovad->start_pfn;
> +
> +        iova = reserve_iova(iovad, start_pfn_32bit, start_pfn_32bit);
> +        BUG_ON(!iova);
> +        iovad->cached32_node = &iova->node;
> +}
> +
>  void
>  init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>          unsigned long start_pfn, unsigned long pfn_32bit)
> @@ -45,27 +56,38 @@ init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>
>          spin_lock_init(&iovad->iova_rbtree_lock);
>          iovad->rbroot = RB_ROOT;
> -        iovad->cached32_node = NULL;
>          iovad->granule = granule;
>          iovad->start_pfn = start_pfn;
>          iovad->dma_32bit_pfn = pfn_32bit;
>          init_iova_rcaches(iovad);
> +
> +        /*
> +         * Insert boundary nodes for dma32. So cached32_node can not be NULL in
> +         * future.
> +         */
> +        insert_iova_boundary(iovad);
>  }
>  EXPORT_SYMBOL_GPL(init_iova_domain);
>
>  static struct rb_node *
>  __get_cached_rbnode(struct iova_domain *iovad, unsigned long *limit_pfn)
>  {
> -        if ((*limit_pfn > iovad->dma_32bit_pfn) ||
> -                (iovad->cached32_node == NULL))
> +        struct rb_node *cached_node;
> +        struct rb_node *next_node;
> +
> +        if (*limit_pfn > iovad->dma_32bit_pfn)
>                  return rb_last(&iovad->rbroot);
> -        else {
> -                struct rb_node *prev_node = rb_prev(iovad->cached32_node);
> -                struct iova *curr_iova =
> -                        rb_entry(iovad->cached32_node, struct iova, node);
> -                *limit_pfn = curr_iova->pfn_lo - 1;
> -                return prev_node;
> +        else
> +                cached_node = iovad->cached32_node;
> +
> +        next_node = rb_next(cached_node);
> +        if (next_node) {
> +                struct iova *next_iova = rb_entry(next_node, struct iova, node);
> +
> +                *limit_pfn = min(*limit_pfn, next_iova->pfn_lo - 1);
>          }
> +
> +        return cached_node;
>  }
>
>  static void
> @@ -83,20 +105,13 @@ __cached_rbnode_delete_update(struct iova_domain *iovad, struct iova *free)
>          struct iova *cached_iova;
>          struct rb_node *curr;
>
> -        if (!iovad->cached32_node)
> -                return;
>          curr = iovad->cached32_node;
>          cached_iova = rb_entry(curr, struct iova, node);
>
>          if (free->pfn_lo >= cached_iova->pfn_lo) {
> -                struct rb_node *node = rb_next(&free->node);
> -                struct iova *iova = rb_entry(node, struct iova, node);
> -
>                  /* only cache if it's below 32bit pfn */
> -                if (node && iova->pfn_lo < iovad->dma_32bit_pfn)
> -                        iovad->cached32_node = node;
> -                else
> -                        iovad->cached32_node = NULL;
> +                if (free->pfn_hi <= iovad->dma_32bit_pfn)
> +                        iovad->cached32_node = rb_prev(&free->node);
>          }
>  }
>
> @@ -114,7 +129,7 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>                  unsigned long size, unsigned long limit_pfn,
>                  struct iova *new, bool size_aligned)
>  {
> -        struct rb_node *prev, *curr = NULL;
> +        struct rb_node *prev, *curr;
>          unsigned long flags;
>          unsigned long saved_pfn;
>          unsigned long pad_size = 0;
> @@ -144,13 +159,9 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>                  curr = rb_prev(curr);
>          }
>
> -        if (!curr) {
> -                if (size_aligned)
> -                        pad_size = iova_get_pad_size(size, limit_pfn);
> -                if ((iovad->start_pfn + size + pad_size) > limit_pfn) {
> -                        spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
> -                        return -ENOMEM;
> -                }
> +        if (unlikely(!curr)) {
> +                spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
> +                return -ENOMEM;
>          }
>
>          /* pfn_lo will point to size aligned address if size_aligned is set */
>