* [PATCH 0/3] iommu: reduce spinlock contention on fast path @ 2019-11-21 0:13 Cong Wang 2019-11-21 0:13 ` [PATCH 1/3] iommu: match the original algorithm Cong Wang ` (2 more replies) 0 siblings, 3 replies; 7+ messages in thread From: Cong Wang @ 2019-11-21 0:13 UTC (permalink / raw) To: iommu; +Cc: Cong Wang, linux-kernel This patchset contains three small optimizations for the global spinlock contention in IOVA cache. Our memcache perf test shows this reduced its p999 latency down by 45% on AMD when IOMMU is enabled. Cong Wang (3): iommu: match the original algorithm iommu: optimize iova_magazine_free_pfns() iommu: avoid taking iova_rbtree_lock twice --- drivers/iommu/iova.c | 44 +++++++++++++++++++++++++------------------- 1 file changed, 25 insertions(+), 19 deletions(-) -- 2.21.0 _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/3] iommu: match the original algorithm 2019-11-21 0:13 [PATCH 0/3] iommu: reduce spinlock contention on fast path Cong Wang @ 2019-11-21 0:13 ` Cong Wang 2019-11-27 18:00 ` John Garry 2019-11-21 0:13 ` [PATCH 2/3] iommu: optimize iova_magazine_free_pfns() Cong Wang 2019-11-21 0:13 ` [PATCH 3/3] iommu: avoid taking iova_rbtree_lock twice Cong Wang 2 siblings, 1 reply; 7+ messages in thread From: Cong Wang @ 2019-11-21 0:13 UTC (permalink / raw) To: iommu; +Cc: Cong Wang, linux-kernel The IOVA cache algorithm implemented in IOMMU code does not exactly match the original algorithm described in the paper. Particularly, it doesn't need to free the loaded empty magazine when trying to put it back to global depot. This patch makes it exactly match the original algorithm. Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> --- drivers/iommu/iova.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 41c605b0058f..92f72a85e62a 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -900,7 +900,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad, if (!iova_magazine_full(cpu_rcache->loaded)) { can_insert = true; - } else if (!iova_magazine_full(cpu_rcache->prev)) { + } else if (iova_magazine_empty(cpu_rcache->prev)) { swap(cpu_rcache->prev, cpu_rcache->loaded); can_insert = true; } else { @@ -909,8 +909,9 @@ static bool __iova_rcache_insert(struct iova_domain *iovad, if (new_mag) { spin_lock(&rcache->lock); if (rcache->depot_size < MAX_GLOBAL_MAGS) { - rcache->depot[rcache->depot_size++] = - cpu_rcache->loaded; + swap(rcache->depot[rcache->depot_size], cpu_rcache->prev); + swap(cpu_rcache->prev, cpu_rcache->loaded); + rcache->depot_size++; } else { mag_to_free = cpu_rcache->loaded; } @@ -963,14 +964,15 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache, if (!iova_magazine_empty(cpu_rcache->loaded)) { has_pfn = true; - } else if (!iova_magazine_empty(cpu_rcache->prev)) { + } else if (iova_magazine_full(cpu_rcache->prev)) { swap(cpu_rcache->prev, cpu_rcache->loaded); has_pfn = true; } else { spin_lock(&rcache->lock); if (rcache->depot_size > 0) { - iova_magazine_free(cpu_rcache->loaded); - cpu_rcache->loaded = rcache->depot[--rcache->depot_size]; + swap(rcache->depot[rcache->depot_size - 1], cpu_rcache->prev); + swap(cpu_rcache->prev, cpu_rcache->loaded); + rcache->depot_size--; has_pfn = true; } spin_unlock(&rcache->lock); -- 2.21.0 _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/3] iommu: match the original algorithm 2019-11-21 0:13 ` [PATCH 1/3] iommu: match the original algorithm Cong Wang @ 2019-11-27 18:00 ` John Garry 2019-11-27 18:38 ` Qian Cai 2019-11-28 19:30 ` Cong Wang 0 siblings, 2 replies; 7+ messages in thread From: John Garry @ 2019-11-27 18:00 UTC (permalink / raw) To: Cong Wang, iommu; +Cc: linux-kernel On 21/11/2019 00:13, Cong Wang wrote: > The IOVA cache algorithm implemented in IOMMU code does not > exactly match the original algorithm described in the paper. > > Particularly, it doesn't need to free the loaded empty magazine > when trying to put it back to global depot. > > This patch makes it exactly match the original algorithm. > I haven't gone into the details, but this patch alone is giving this: root@(none)$ [ 123.857024] kmemleak: 8 new suspected memory leaks (see /sys/kernel/debug/kmemleak) root@(none)$ cat /sys/kernel/debug/kmemleak unreferenced object 0xffff002339843000 (size 2048): comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000001d2710bf>] kmem_cache_alloc+0x188/0x260 [<00000000cc229a78>] init_iova_domain+0x1e8/0x2a8 [<000000002646fc92>] iommu_setup_dma_ops+0x200/0x710 [<00000000acc5fe46>] arch_setup_dma_ops+0x80/0x128 [<00000000994e1e43>] acpi_dma_configure+0x11c/0x140 [<00000000effe9374>] pci_dma_configure+0xe0/0x108 [<00000000f614ae1e>] really_probe+0x210/0x548 [<0000000087884b1b>] driver_probe_device+0x7c/0x148 [<0000000010af2936>] device_driver_attach+0x94/0xa0 [<00000000c92b2971>] __driver_attach+0xa4/0x110 [<00000000c873500f>] bus_for_each_dev+0xe8/0x158 [<00000000c7d0e008>] driver_attach+0x30/0x40 [<000000003cf39ba8>] bus_add_driver+0x234/0x2f0 [<0000000043830a45>] driver_register+0xbc/0x1d0 [<00000000c8a41162>] __pci_register_driver+0xb0/0xc8 [<00000000e562eeec>] sas_v3_pci_driver_init+0x20/0x28 unreferenced object 0xffff002339844000 (size 2048): comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s) [snip] And I don't feel like continuing until it's resolved.... Thanks, John > Cc: Joerg Roedel <joro@8bytes.org> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> > --- > drivers/iommu/iova.c | 14 ++++++++------ > 1 file changed, 8 insertions(+), 6 deletions(-) > > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c > index 41c605b0058f..92f72a85e62a 100644 > --- a/drivers/iommu/iova.c > +++ b/drivers/iommu/iova.c > @@ -900,7 +900,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad, > > if (!iova_magazine_full(cpu_rcache->loaded)) { > can_insert = true; > - } else if (!iova_magazine_full(cpu_rcache->prev)) { > + } else if (iova_magazine_empty(cpu_rcache->prev)) { > swap(cpu_rcache->prev, cpu_rcache->loaded); > can_insert = true; > } else { > @@ -909,8 +909,9 @@ static bool __iova_rcache_insert(struct iova_domain *iovad, > if (new_mag) { > spin_lock(&rcache->lock); > if (rcache->depot_size < MAX_GLOBAL_MAGS) { > - rcache->depot[rcache->depot_size++] = > - cpu_rcache->loaded; > + swap(rcache->depot[rcache->depot_size], cpu_rcache->prev); > + swap(cpu_rcache->prev, cpu_rcache->loaded); > + rcache->depot_size++; > } else { > mag_to_free = cpu_rcache->loaded; > } > @@ -963,14 +964,15 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache, > > if (!iova_magazine_empty(cpu_rcache->loaded)) { > has_pfn = true; > - } else if (!iova_magazine_empty(cpu_rcache->prev)) { > + } else if (iova_magazine_full(cpu_rcache->prev)) { > swap(cpu_rcache->prev, cpu_rcache->loaded); > has_pfn = true; > } else { > spin_lock(&rcache->lock); > if (rcache->depot_size > 0) { > - iova_magazine_free(cpu_rcache->loaded); > - cpu_rcache->loaded = rcache->depot[--rcache->depot_size]; > + swap(rcache->depot[rcache->depot_size - 1], cpu_rcache->prev); > + swap(cpu_rcache->prev, cpu_rcache->loaded); > + rcache->depot_size--; > has_pfn = true; > } > spin_unlock(&rcache->lock); > _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/3] iommu: match the original algorithm 2019-11-27 18:00 ` John Garry @ 2019-11-27 18:38 ` Qian Cai 2019-11-28 19:30 ` Cong Wang 1 sibling, 0 replies; 7+ messages in thread From: Qian Cai @ 2019-11-27 18:38 UTC (permalink / raw) To: John Garry; +Cc: Cong Wang, iommu, linux-kernel > On Nov 27, 2019, at 1:01 PM, John Garry <john.garry@huawei.com> wrote: > > I haven't gone into the details, but this patch alone is giving this: > > root@(none)$ [ 123.857024] kmemleak: 8 new suspected memory leaks (see /sys/kernel/debug/kmemleak) > > root@(none)$ cat /sys/kernel/debug/kmemleak > unreferenced object 0xffff002339843000 (size 2048): > comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s) > hex dump (first 32 bytes): > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > backtrace: > [<000000001d2710bf>] kmem_cache_alloc+0x188/0x260 > [<00000000cc229a78>] init_iova_domain+0x1e8/0x2a8 > [<000000002646fc92>] iommu_setup_dma_ops+0x200/0x710 > [<00000000acc5fe46>] arch_setup_dma_ops+0x80/0x128 > [<00000000994e1e43>] acpi_dma_configure+0x11c/0x140 > [<00000000effe9374>] pci_dma_configure+0xe0/0x108 > [<00000000f614ae1e>] really_probe+0x210/0x548 > [<0000000087884b1b>] driver_probe_device+0x7c/0x148 > [<0000000010af2936>] device_driver_attach+0x94/0xa0 > [<00000000c92b2971>] __driver_attach+0xa4/0x110 > [<00000000c873500f>] bus_for_each_dev+0xe8/0x158 > [<00000000c7d0e008>] driver_attach+0x30/0x40 > [<000000003cf39ba8>] bus_add_driver+0x234/0x2f0 > [<0000000043830a45>] driver_register+0xbc/0x1d0 > [<00000000c8a41162>] __pci_register_driver+0xb0/0xc8 > [<00000000e562eeec>] sas_v3_pci_driver_init+0x20/0x28 > unreferenced object 0xffff002339844000 (size 2048): > comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s) > > [snip] > > And I don't feel like continuing until it's resolved.... Thanks for talking a hit by this before me. It is frustrating that people tend not to test their patches properly with things like kmemleak. _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/3] iommu: match the original algorithm 2019-11-27 18:00 ` John Garry 2019-11-27 18:38 ` Qian Cai @ 2019-11-28 19:30 ` Cong Wang 1 sibling, 0 replies; 7+ messages in thread From: Cong Wang @ 2019-11-28 19:30 UTC (permalink / raw) To: John Garry; +Cc: iommu, LKML On Wed, Nov 27, 2019 at 10:01 AM John Garry <john.garry@huawei.com> wrote: > > On 21/11/2019 00:13, Cong Wang wrote: > > The IOVA cache algorithm implemented in IOMMU code does not > > exactly match the original algorithm described in the paper. > > > > Particularly, it doesn't need to free the loaded empty magazine > > when trying to put it back to global depot. > > > > This patch makes it exactly match the original algorithm. > > > > I haven't gone into the details, but this patch alone is giving this: > > root@(none)$ [ 123.857024] kmemleak: 8 new suspected memory leaks (see > /sys/kernel/debug/kmemleak) Ah, thanks for catching it! I see what I missed, I should pre-allocate those empty entries in order to make it work correctly. I didn't catch this because this was tested on a production machine where we can't afford CONFIG_DEBUG_KMEMLEAK=y for obvious performance concerns. Anyway, I will fix this and send v2. Thanks! _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/3] iommu: optimize iova_magazine_free_pfns() 2019-11-21 0:13 [PATCH 0/3] iommu: reduce spinlock contention on fast path Cong Wang 2019-11-21 0:13 ` [PATCH 1/3] iommu: match the original algorithm Cong Wang @ 2019-11-21 0:13 ` Cong Wang 2019-11-21 0:13 ` [PATCH 3/3] iommu: avoid taking iova_rbtree_lock twice Cong Wang 2 siblings, 0 replies; 7+ messages in thread From: Cong Wang @ 2019-11-21 0:13 UTC (permalink / raw) To: iommu; +Cc: Cong Wang, linux-kernel If the maganize is empty, iova_magazine_free_pfns() should be a nop, however it misses the case of mag->size==0. So we should just call iova_magazine_empty(). This should reduce the contention on iovad->iova_rbtree_lock a little bit. Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> --- drivers/iommu/iova.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 92f72a85e62a..b82c6f1cbfc2 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -797,13 +797,23 @@ static void iova_magazine_free(struct iova_magazine *mag) kfree(mag); } +static bool iova_magazine_full(struct iova_magazine *mag) +{ + return (mag && mag->size == IOVA_MAG_SIZE); +} + +static bool iova_magazine_empty(struct iova_magazine *mag) +{ + return (!mag || mag->size == 0); +} + static void iova_magazine_free_pfns(struct iova_magazine *mag, struct iova_domain *iovad) { unsigned long flags; int i; - if (!mag) + if (iova_magazine_empty(mag)) return; spin_lock_irqsave(&iovad->iova_rbtree_lock, flags); @@ -820,16 +830,6 @@ iova_magazine_free_pfns(struct iova_magazine *mag, struct iova_domain *iovad) mag->size = 0; } -static bool iova_magazine_full(struct iova_magazine *mag) -{ - return (mag && mag->size == IOVA_MAG_SIZE); -} - -static bool iova_magazine_empty(struct iova_magazine *mag) -{ - return (!mag || mag->size == 0); -} - static unsigned long iova_magazine_pop(struct iova_magazine *mag, unsigned long limit_pfn) { -- 2.21.0 _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 3/3] iommu: avoid taking iova_rbtree_lock twice 2019-11-21 0:13 [PATCH 0/3] iommu: reduce spinlock contention on fast path Cong Wang 2019-11-21 0:13 ` [PATCH 1/3] iommu: match the original algorithm Cong Wang 2019-11-21 0:13 ` [PATCH 2/3] iommu: optimize iova_magazine_free_pfns() Cong Wang @ 2019-11-21 0:13 ` Cong Wang 2 siblings, 0 replies; 7+ messages in thread From: Cong Wang @ 2019-11-21 0:13 UTC (permalink / raw) To: iommu; +Cc: Cong Wang, linux-kernel Both find_iova() and __free_iova() take iova_rbtree_lock, there is no reason to take and release it twice inside free_iova(). Fold them into the critical section by calling the unlock versions instead. Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> --- drivers/iommu/iova.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index b82c6f1cbfc2..ba6bd33f2b16 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -390,10 +390,14 @@ EXPORT_SYMBOL_GPL(__free_iova); void free_iova(struct iova_domain *iovad, unsigned long pfn) { - struct iova *iova = find_iova(iovad, pfn); + unsigned long flags; + struct iova *iova; + spin_lock_irqsave(&iovad->iova_rbtree_lock, flags); + iova = private_find_iova(iovad, pfn); if (iova) - __free_iova(iovad, iova); + private_free_iova(iovad, iova); + spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags); } EXPORT_SYMBOL_GPL(free_iova); -- 2.21.0 _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2019-11-28 19:30 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-11-21 0:13 [PATCH 0/3] iommu: reduce spinlock contention on fast path Cong Wang 2019-11-21 0:13 ` [PATCH 1/3] iommu: match the original algorithm Cong Wang 2019-11-27 18:00 ` John Garry 2019-11-27 18:38 ` Qian Cai 2019-11-28 19:30 ` Cong Wang 2019-11-21 0:13 ` [PATCH 2/3] iommu: optimize iova_magazine_free_pfns() Cong Wang 2019-11-21 0:13 ` [PATCH 3/3] iommu: avoid taking iova_rbtree_lock twice Cong Wang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).