linux-kernel.vger.kernel.org archive mirror
* [PATCH] iommu: Disable preemption around use of this_cpu_ptr()
@ 2016-05-31 16:34 Chris Wilson
  2016-06-01 10:23 ` [PATCH v2] " Chris Wilson
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2016-05-31 16:34 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: intel-gfx, Chris Wilson, Joonas Lahtinen, iommu, linux-kernel

Between acquiring the this_cpu_ptr() and using it, ideally we don't want
to be preempted and moved to work on another CPU's private data. If we
disable preemption around this_cpu_ptr(), we do not need the CPU-local
spinlock, so long as we take care that no other CPU is running that code,
as we do when performing the cross-CPU cache flushing and teardown.

[  167.997877] BUG: using smp_processor_id() in preemptible [00000000] code: usb-storage/216
[  167.997940] caller is debug_smp_processor_id+0x17/0x20
[  167.997945] CPU: 7 PID: 216 Comm: usb-storage Tainted: G     U          4.7.0-rc1-gfxbench-RO_Patchwork_1057+ #1
[  167.997948] Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
[  167.997951]  0000000000000000 ffff880118b7f9c8 ffffffff8140dca5 0000000000000007
[  167.997958]  ffffffff81a3a7e9 ffff880118b7f9f8 ffffffff8142a927 0000000000000000
[  167.997965]  ffff8800d499ed58 0000000000000001 00000000000fffff ffff880118b7fa08
[  167.997971] Call Trace:
[  167.997977]  [<ffffffff8140dca5>] dump_stack+0x67/0x92
[  167.997981]  [<ffffffff8142a927>] check_preemption_disabled+0xd7/0xe0
[  167.997985]  [<ffffffff8142a947>] debug_smp_processor_id+0x17/0x20
[  167.997990]  [<ffffffff81507e17>] alloc_iova_fast+0xb7/0x210
[  167.997994]  [<ffffffff8150c55f>] intel_alloc_iova+0x7f/0xd0
[  167.997998]  [<ffffffff8151021d>] intel_map_sg+0xbd/0x240
[  167.998002]  [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[  167.998009]  [<ffffffff81596059>] usb_hcd_map_urb_for_dma+0x4b9/0x5a0
[  167.998013]  [<ffffffff81596d19>] usb_hcd_submit_urb+0xe9/0xaa0
[  167.998017]  [<ffffffff810cff2f>] ? mark_held_locks+0x6f/0xa0
[  167.998022]  [<ffffffff810d525c>] ? __raw_spin_lock_init+0x1c/0x50
[  167.998025]  [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[  167.998028]  [<ffffffff815988f3>] usb_submit_urb+0x3f3/0x5a0
[  167.998032]  [<ffffffff810d0082>] ? trace_hardirqs_on_caller+0x122/0x1b0
[  167.998035]  [<ffffffff81599ae7>] usb_sg_wait+0x67/0x150
[  167.998039]  [<ffffffff815dc202>] usb_stor_bulk_transfer_sglist.part.3+0x82/0xd0
[  167.998042]  [<ffffffff815dc29c>] usb_stor_bulk_srb+0x4c/0x60
[  167.998045]  [<ffffffff815dc42e>] usb_stor_Bulk_transport+0x17e/0x420
[  167.998049]  [<ffffffff815dcf32>] usb_stor_invoke_transport+0x242/0x540
[  167.998052]  [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[  167.998058]  [<ffffffff815dba19>] usb_stor_transparent_scsi_command+0x9/0x10
[  167.998061]  [<ffffffff815de518>] usb_stor_control_thread+0x158/0x260
[  167.998064]  [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
[  167.998067]  [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
[  167.998071]  [<ffffffff8109ddfa>] kthread+0xea/0x100
[  167.998078]  [<ffffffff817ac6af>] ret_from_fork+0x1f/0x40
[  167.998081]  [<ffffffff8109dd10>] ? kthread_create_on_node+0x1f0/0x1f0

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96293
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/iommu/iova.c | 31 ++++++++++---------------------
 1 file changed, 10 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index ba764a0835d3..5c5bbdb86a42 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -390,6 +390,11 @@ free_iova(struct iova_domain *iovad, unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(free_iova);
 
+static void free_this_cached_iovas(void *info)
+{
+	free_cpu_cached_iovas(smp_processor_id(), info);
+}
+
 /**
  * alloc_iova_fast - allocates an iova from rcache
  * @iovad: - iova domain in question
@@ -413,15 +418,12 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 retry:
 	new_iova = alloc_iova(iovad, size, limit_pfn, true);
 	if (!new_iova) {
-		unsigned int cpu;
-
 		if (flushed_rcache)
 			return 0;
 
 		/* Try replenishing IOVAs by flushing rcache. */
 		flushed_rcache = true;
-		for_each_online_cpu(cpu)
-			free_cpu_cached_iovas(cpu, iovad);
+		on_each_cpu(free_this_cached_iovas, iovad, true);
 		goto retry;
 	}
 
@@ -645,7 +647,6 @@ struct iova_magazine {
 };
 
 struct iova_cpu_rcache {
-	spinlock_t lock;
 	struct iova_magazine *loaded;
 	struct iova_magazine *prev;
 };
@@ -727,7 +728,6 @@ static void init_iova_rcaches(struct iova_domain *iovad)
 			continue;
 		for_each_possible_cpu(cpu) {
 			cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
-			spin_lock_init(&cpu_rcache->lock);
 			cpu_rcache->loaded = iova_magazine_alloc(GFP_KERNEL);
 			cpu_rcache->prev = iova_magazine_alloc(GFP_KERNEL);
 		}
@@ -747,10 +747,9 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 	struct iova_magazine *mag_to_free = NULL;
 	struct iova_cpu_rcache *cpu_rcache;
 	bool can_insert = false;
-	unsigned long flags;
 
+	preempt_disable();
 	cpu_rcache = this_cpu_ptr(rcache->cpu_rcaches);
-	spin_lock_irqsave(&cpu_rcache->lock, flags);
 
 	if (!iova_magazine_full(cpu_rcache->loaded)) {
 		can_insert = true;
@@ -778,7 +777,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 	if (can_insert)
 		iova_magazine_push(cpu_rcache->loaded, iova_pfn);
 
-	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
+	preempt_enable();
 
 	if (mag_to_free) {
 		iova_magazine_free_pfns(mag_to_free, iovad);
@@ -810,10 +809,9 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
 	struct iova_cpu_rcache *cpu_rcache;
 	unsigned long iova_pfn = 0;
 	bool has_pfn = false;
-	unsigned long flags;
 
+	preempt_disable();
 	cpu_rcache = this_cpu_ptr(rcache->cpu_rcaches);
-	spin_lock_irqsave(&cpu_rcache->lock, flags);
 
 	if (!iova_magazine_empty(cpu_rcache->loaded)) {
 		has_pfn = true;
@@ -833,7 +831,7 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
 	if (has_pfn)
 		iova_pfn = iova_magazine_pop(cpu_rcache->loaded, limit_pfn);
 
-	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
+	preempt_enable();
 
 	return iova_pfn;
 }
@@ -862,17 +860,11 @@ static void free_cpu_iova_rcache(unsigned int cpu, struct iova_domain *iovad,
 				 struct iova_rcache *rcache)
 {
 	struct iova_cpu_rcache *cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
-	unsigned long flags;
-
-	spin_lock_irqsave(&cpu_rcache->lock, flags);
-
 	iova_magazine_free_pfns(cpu_rcache->loaded, iovad);
 	iova_magazine_free(cpu_rcache->loaded);
 
 	iova_magazine_free_pfns(cpu_rcache->prev, iovad);
 	iova_magazine_free(cpu_rcache->prev);
-
-	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
 }
 
 /*
@@ -906,16 +898,13 @@ void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
 {
 	struct iova_cpu_rcache *cpu_rcache;
 	struct iova_rcache *rcache;
-	unsigned long flags;
 	int i;
 
 	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
 		rcache = &iovad->rcaches[i];
 		cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
-		spin_lock_irqsave(&cpu_rcache->lock, flags);
 		iova_magazine_free_pfns(cpu_rcache->loaded, iovad);
 		iova_magazine_free_pfns(cpu_rcache->prev, iovad);
-		spin_unlock_irqrestore(&cpu_rcache->lock, flags);
 	}
 }
 
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2] iommu: Disable preemption around use of this_cpu_ptr()
  2016-05-31 16:34 [PATCH] iommu: Disable preemption around use of this_cpu_ptr() Chris Wilson
@ 2016-06-01 10:23 ` Chris Wilson
  2016-06-01 11:10   ` [PATCH v3 1/2] " Chris Wilson
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2016-06-01 10:23 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: intel-gfx, Chris Wilson, Joonas Lahtinen, iommu, linux-kernel

Between acquiring the this_cpu_ptr() and using it, ideally we don't want
to be preempted and moved to work on another CPU's private data. If we
disable preemption around this_cpu_ptr(), we do not need the CPU-local
spinlock, so long as we take care that no other CPU is running that code,
as we do when performing the cross-CPU cache flushing and teardown.

[  167.997877] BUG: using smp_processor_id() in preemptible [00000000] code: usb-storage/216
[  167.997940] caller is debug_smp_processor_id+0x17/0x20
[  167.997945] CPU: 7 PID: 216 Comm: usb-storage Tainted: G     U          4.7.0-rc1-gfxbench-RO_Patchwork_1057+ #1
[  167.997948] Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
[  167.997951]  0000000000000000 ffff880118b7f9c8 ffffffff8140dca5 0000000000000007
[  167.997958]  ffffffff81a3a7e9 ffff880118b7f9f8 ffffffff8142a927 0000000000000000
[  167.997965]  ffff8800d499ed58 0000000000000001 00000000000fffff ffff880118b7fa08
[  167.997971] Call Trace:
[  167.997977]  [<ffffffff8140dca5>] dump_stack+0x67/0x92
[  167.997981]  [<ffffffff8142a927>] check_preemption_disabled+0xd7/0xe0
[  167.997985]  [<ffffffff8142a947>] debug_smp_processor_id+0x17/0x20
[  167.997990]  [<ffffffff81507e17>] alloc_iova_fast+0xb7/0x210
[  167.997994]  [<ffffffff8150c55f>] intel_alloc_iova+0x7f/0xd0
[  167.997998]  [<ffffffff8151021d>] intel_map_sg+0xbd/0x240
[  167.998002]  [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[  167.998009]  [<ffffffff81596059>] usb_hcd_map_urb_for_dma+0x4b9/0x5a0
[  167.998013]  [<ffffffff81596d19>] usb_hcd_submit_urb+0xe9/0xaa0
[  167.998017]  [<ffffffff810cff2f>] ? mark_held_locks+0x6f/0xa0
[  167.998022]  [<ffffffff810d525c>] ? __raw_spin_lock_init+0x1c/0x50
[  167.998025]  [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[  167.998028]  [<ffffffff815988f3>] usb_submit_urb+0x3f3/0x5a0
[  167.998032]  [<ffffffff810d0082>] ? trace_hardirqs_on_caller+0x122/0x1b0
[  167.998035]  [<ffffffff81599ae7>] usb_sg_wait+0x67/0x150
[  167.998039]  [<ffffffff815dc202>] usb_stor_bulk_transfer_sglist.part.3+0x82/0xd0
[  167.998042]  [<ffffffff815dc29c>] usb_stor_bulk_srb+0x4c/0x60
[  167.998045]  [<ffffffff815dc42e>] usb_stor_Bulk_transport+0x17e/0x420
[  167.998049]  [<ffffffff815dcf32>] usb_stor_invoke_transport+0x242/0x540
[  167.998052]  [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[  167.998058]  [<ffffffff815dba19>] usb_stor_transparent_scsi_command+0x9/0x10
[  167.998061]  [<ffffffff815de518>] usb_stor_control_thread+0x158/0x260
[  167.998064]  [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
[  167.998067]  [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
[  167.998071]  [<ffffffff8109ddfa>] kthread+0xea/0x100
[  167.998078]  [<ffffffff817ac6af>] ret_from_fork+0x1f/0x40
[  167.998081]  [<ffffffff8109dd10>] ? kthread_create_on_node+0x1f0/0x1f0

v2: convert preempt_disable(); var = this_cpu_ptr() to var = get_cpu_ptr()

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96293
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org

---
 drivers/iommu/iova.c | 33 ++++++++++-----------------------
 1 file changed, 10 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index ba764a0835d3..3464ed371eac 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -390,6 +390,11 @@ free_iova(struct iova_domain *iovad, unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(free_iova);
 
+static void free_this_cached_iovas(void *info)
+{
+	free_cpu_cached_iovas(smp_processor_id(), info);
+}
+
 /**
  * alloc_iova_fast - allocates an iova from rcache
  * @iovad: - iova domain in question
@@ -413,15 +418,12 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 retry:
 	new_iova = alloc_iova(iovad, size, limit_pfn, true);
 	if (!new_iova) {
-		unsigned int cpu;
-
 		if (flushed_rcache)
 			return 0;
 
 		/* Try replenishing IOVAs by flushing rcache. */
 		flushed_rcache = true;
-		for_each_online_cpu(cpu)
-			free_cpu_cached_iovas(cpu, iovad);
+		on_each_cpu(free_this_cached_iovas, iovad, true);
 		goto retry;
 	}
 
@@ -645,7 +647,6 @@ struct iova_magazine {
 };
 
 struct iova_cpu_rcache {
-	spinlock_t lock;
 	struct iova_magazine *loaded;
 	struct iova_magazine *prev;
 };
@@ -727,7 +728,6 @@ static void init_iova_rcaches(struct iova_domain *iovad)
 			continue;
 		for_each_possible_cpu(cpu) {
 			cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
-			spin_lock_init(&cpu_rcache->lock);
 			cpu_rcache->loaded = iova_magazine_alloc(GFP_KERNEL);
 			cpu_rcache->prev = iova_magazine_alloc(GFP_KERNEL);
 		}
@@ -747,10 +747,8 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 	struct iova_magazine *mag_to_free = NULL;
 	struct iova_cpu_rcache *cpu_rcache;
 	bool can_insert = false;
-	unsigned long flags;
 
-	cpu_rcache = this_cpu_ptr(rcache->cpu_rcaches);
-	spin_lock_irqsave(&cpu_rcache->lock, flags);
+	cpu_rcache = get_cpu_var(rcache->cpu_rcaches);
 
 	if (!iova_magazine_full(cpu_rcache->loaded)) {
 		can_insert = true;
@@ -778,7 +776,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 	if (can_insert)
 		iova_magazine_push(cpu_rcache->loaded, iova_pfn);
 
-	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
+	put_cpu_var(rcache->cpu_rcaches);
 
 	if (mag_to_free) {
 		iova_magazine_free_pfns(mag_to_free, iovad);
@@ -810,10 +808,8 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
 	struct iova_cpu_rcache *cpu_rcache;
 	unsigned long iova_pfn = 0;
 	bool has_pfn = false;
-	unsigned long flags;
 
-	cpu_rcache = this_cpu_ptr(rcache->cpu_rcaches);
-	spin_lock_irqsave(&cpu_rcache->lock, flags);
+	cpu_rcache = get_cpu_var(rcache->cpu_rcaches);
 
 	if (!iova_magazine_empty(cpu_rcache->loaded)) {
 		has_pfn = true;
@@ -833,7 +829,7 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
 	if (has_pfn)
 		iova_pfn = iova_magazine_pop(cpu_rcache->loaded, limit_pfn);
 
-	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
+	put_cpu_var(rcache->cpu_rcaches);
 
 	return iova_pfn;
 }
@@ -862,17 +858,11 @@ static void free_cpu_iova_rcache(unsigned int cpu, struct iova_domain *iovad,
 				 struct iova_rcache *rcache)
 {
 	struct iova_cpu_rcache *cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
-	unsigned long flags;
-
-	spin_lock_irqsave(&cpu_rcache->lock, flags);
-
 	iova_magazine_free_pfns(cpu_rcache->loaded, iovad);
 	iova_magazine_free(cpu_rcache->loaded);
 
 	iova_magazine_free_pfns(cpu_rcache->prev, iovad);
 	iova_magazine_free(cpu_rcache->prev);
-
-	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
 }
 
 /*
@@ -906,16 +896,13 @@ void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
 {
 	struct iova_cpu_rcache *cpu_rcache;
 	struct iova_rcache *rcache;
-	unsigned long flags;
 	int i;
 
 	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
 		rcache = &iovad->rcaches[i];
 		cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
-		spin_lock_irqsave(&cpu_rcache->lock, flags);
 		iova_magazine_free_pfns(cpu_rcache->loaded, iovad);
 		iova_magazine_free_pfns(cpu_rcache->prev, iovad);
-		spin_unlock_irqrestore(&cpu_rcache->lock, flags);
 	}
 }
 
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 1/2] iommu: Disable preemption around use of this_cpu_ptr()
  2016-06-01 10:23 ` [PATCH v2] " Chris Wilson
@ 2016-06-01 11:10   ` Chris Wilson
  2016-06-01 11:10     ` [PATCH v3 2/2] iommu: Remove cpu-local spinlock Chris Wilson
                       ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Chris Wilson @ 2016-06-01 11:10 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: intel-gfx, Chris Wilson, Joonas Lahtinen, iommu, linux-kernel

Between acquiring the this_cpu_ptr() and using it, ideally we don't want
to be preempted and moved to work on another CPU's private data.
this_cpu_ptr() checks whether or not preemption is disabled, and
get_cpu_ptr() provides a convenient wrapper for operating on the cpu ptr
inside a preemption-disabled critical section (which currently is
provided by the spinlock). Indeed, if we disable preemption around
this_cpu_ptr(), we do not need the CPU-local spinlock, so long as we take
care that no other CPU is running that code, as we do when performing the
cross-CPU cache flushing and teardown, but that is a subject for another
patch.

[  167.997877] BUG: using smp_processor_id() in preemptible [00000000] code: usb-storage/216
[  167.997940] caller is debug_smp_processor_id+0x17/0x20
[  167.997945] CPU: 7 PID: 216 Comm: usb-storage Tainted: G     U          4.7.0-rc1-gfxbench-RO_Patchwork_1057+ #1
[  167.997948] Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
[  167.997951]  0000000000000000 ffff880118b7f9c8 ffffffff8140dca5 0000000000000007
[  167.997958]  ffffffff81a3a7e9 ffff880118b7f9f8 ffffffff8142a927 0000000000000000
[  167.997965]  ffff8800d499ed58 0000000000000001 00000000000fffff ffff880118b7fa08
[  167.997971] Call Trace:
[  167.997977]  [<ffffffff8140dca5>] dump_stack+0x67/0x92
[  167.997981]  [<ffffffff8142a927>] check_preemption_disabled+0xd7/0xe0
[  167.997985]  [<ffffffff8142a947>] debug_smp_processor_id+0x17/0x20
[  167.997990]  [<ffffffff81507e17>] alloc_iova_fast+0xb7/0x210
[  167.997994]  [<ffffffff8150c55f>] intel_alloc_iova+0x7f/0xd0
[  167.997998]  [<ffffffff8151021d>] intel_map_sg+0xbd/0x240
[  167.998002]  [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[  167.998009]  [<ffffffff81596059>] usb_hcd_map_urb_for_dma+0x4b9/0x5a0
[  167.998013]  [<ffffffff81596d19>] usb_hcd_submit_urb+0xe9/0xaa0
[  167.998017]  [<ffffffff810cff2f>] ? mark_held_locks+0x6f/0xa0
[  167.998022]  [<ffffffff810d525c>] ? __raw_spin_lock_init+0x1c/0x50
[  167.998025]  [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[  167.998028]  [<ffffffff815988f3>] usb_submit_urb+0x3f3/0x5a0
[  167.998032]  [<ffffffff810d0082>] ? trace_hardirqs_on_caller+0x122/0x1b0
[  167.998035]  [<ffffffff81599ae7>] usb_sg_wait+0x67/0x150
[  167.998039]  [<ffffffff815dc202>] usb_stor_bulk_transfer_sglist.part.3+0x82/0xd0
[  167.998042]  [<ffffffff815dc29c>] usb_stor_bulk_srb+0x4c/0x60
[  167.998045]  [<ffffffff815dc42e>] usb_stor_Bulk_transport+0x17e/0x420
[  167.998049]  [<ffffffff815dcf32>] usb_stor_invoke_transport+0x242/0x540
[  167.998052]  [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[  167.998058]  [<ffffffff815dba19>] usb_stor_transparent_scsi_command+0x9/0x10
[  167.998061]  [<ffffffff815de518>] usb_stor_control_thread+0x158/0x260
[  167.998064]  [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
[  167.998067]  [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
[  167.998071]  [<ffffffff8109ddfa>] kthread+0xea/0x100
[  167.998078]  [<ffffffff817ac6af>] ret_from_fork+0x1f/0x40
[  167.998081]  [<ffffffff8109dd10>] ? kthread_create_on_node+0x1f0/0x1f0

v2: convert preempt_disable(); var = this_cpu_ptr() to var = get_cpu_ptr()
v3: Actually use get_cpu_ptr (not get_cpu_var). Drop the spinlock
removal, concentrate on the immediate bug fix.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96293
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/iommu/iova.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index ba764a0835d3..e23001bfcfee 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -420,8 +420,10 @@ retry:
 
 		/* Try replenishing IOVAs by flushing rcache. */
 		flushed_rcache = true;
+		preempt_disable();
 		for_each_online_cpu(cpu)
 			free_cpu_cached_iovas(cpu, iovad);
+		preempt_enable();
 		goto retry;
 	}
 
@@ -749,7 +751,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 	bool can_insert = false;
 	unsigned long flags;
 
-	cpu_rcache = this_cpu_ptr(rcache->cpu_rcaches);
+	cpu_rcache = get_cpu_ptr(rcache->cpu_rcaches);
 	spin_lock_irqsave(&cpu_rcache->lock, flags);
 
 	if (!iova_magazine_full(cpu_rcache->loaded)) {
@@ -779,6 +781,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 		iova_magazine_push(cpu_rcache->loaded, iova_pfn);
 
 	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
+	put_cpu_ptr(rcache->cpu_rcaches);
 
 	if (mag_to_free) {
 		iova_magazine_free_pfns(mag_to_free, iovad);
@@ -812,7 +815,7 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
 	bool has_pfn = false;
 	unsigned long flags;
 
-	cpu_rcache = this_cpu_ptr(rcache->cpu_rcaches);
+	cpu_rcache = get_cpu_ptr(rcache->cpu_rcaches);
 	spin_lock_irqsave(&cpu_rcache->lock, flags);
 
 	if (!iova_magazine_empty(cpu_rcache->loaded)) {
@@ -834,6 +837,7 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
 		iova_pfn = iova_magazine_pop(cpu_rcache->loaded, limit_pfn);
 
 	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
+	put_cpu_ptr(rcache->cpu_rcaches);
 
 	return iova_pfn;
 }
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 2/2] iommu: Remove cpu-local spinlock
  2016-06-01 11:10   ` [PATCH v3 1/2] " Chris Wilson
@ 2016-06-01 11:10     ` Chris Wilson
  2016-06-01 12:40       ` Joonas Lahtinen
  2016-06-15 12:22       ` Joerg Roedel
  2016-06-01 12:25     ` [PATCH v3 1/2] iommu: Disable preemption around use of this_cpu_ptr() Joonas Lahtinen
                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 11+ messages in thread
From: Chris Wilson @ 2016-06-01 11:10 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: intel-gfx, Chris Wilson, Joonas Lahtinen, iommu, linux-kernel

By avoiding cross-CPU usage of the per-cpu iova cache, we can forgo
having a spinlock inside the per-cpu struct. The only place where we
actually may touch another CPU's data is when performing a cache flush
after running out of memory. Here, we can instead schedule a task to run
on the other CPU to do the flush before trying again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/iommu/iova.c | 29 ++++++-----------------------
 1 file changed, 6 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index e23001bfcfee..36cdc8eeab1c 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -390,6 +390,11 @@ free_iova(struct iova_domain *iovad, unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(free_iova);
 
+static void free_this_cached_iovas(void *info)
+{
+	free_cpu_cached_iovas(smp_processor_id(), info);
+}
+
 /**
  * alloc_iova_fast - allocates an iova from rcache
  * @iovad: - iova domain in question
@@ -413,17 +418,12 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 retry:
 	new_iova = alloc_iova(iovad, size, limit_pfn, true);
 	if (!new_iova) {
-		unsigned int cpu;
-
 		if (flushed_rcache)
 			return 0;
 
 		/* Try replenishing IOVAs by flushing rcache. */
 		flushed_rcache = true;
-		preempt_disable();
-		for_each_online_cpu(cpu)
-			free_cpu_cached_iovas(cpu, iovad);
-		preempt_enable();
+		on_each_cpu(free_this_cached_iovas, iovad, true);
 		goto retry;
 	}
 
@@ -647,7 +647,6 @@ struct iova_magazine {
 };
 
 struct iova_cpu_rcache {
-	spinlock_t lock;
 	struct iova_magazine *loaded;
 	struct iova_magazine *prev;
 };
@@ -729,7 +728,6 @@ static void init_iova_rcaches(struct iova_domain *iovad)
 			continue;
 		for_each_possible_cpu(cpu) {
 			cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
-			spin_lock_init(&cpu_rcache->lock);
 			cpu_rcache->loaded = iova_magazine_alloc(GFP_KERNEL);
 			cpu_rcache->prev = iova_magazine_alloc(GFP_KERNEL);
 		}
@@ -749,10 +747,8 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 	struct iova_magazine *mag_to_free = NULL;
 	struct iova_cpu_rcache *cpu_rcache;
 	bool can_insert = false;
-	unsigned long flags;
 
 	cpu_rcache = get_cpu_ptr(rcache->cpu_rcaches);
-	spin_lock_irqsave(&cpu_rcache->lock, flags);
 
 	if (!iova_magazine_full(cpu_rcache->loaded)) {
 		can_insert = true;
@@ -780,7 +776,6 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 	if (can_insert)
 		iova_magazine_push(cpu_rcache->loaded, iova_pfn);
 
-	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
 	put_cpu_ptr(rcache->cpu_rcaches);
 
 	if (mag_to_free) {
@@ -813,10 +808,8 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
 	struct iova_cpu_rcache *cpu_rcache;
 	unsigned long iova_pfn = 0;
 	bool has_pfn = false;
-	unsigned long flags;
 
 	cpu_rcache = get_cpu_ptr(rcache->cpu_rcaches);
-	spin_lock_irqsave(&cpu_rcache->lock, flags);
 
 	if (!iova_magazine_empty(cpu_rcache->loaded)) {
 		has_pfn = true;
@@ -836,7 +829,6 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
 	if (has_pfn)
 		iova_pfn = iova_magazine_pop(cpu_rcache->loaded, limit_pfn);
 
-	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
 	put_cpu_ptr(rcache->cpu_rcaches);
 
 	return iova_pfn;
@@ -866,17 +858,11 @@ static void free_cpu_iova_rcache(unsigned int cpu, struct iova_domain *iovad,
 				 struct iova_rcache *rcache)
 {
 	struct iova_cpu_rcache *cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
-	unsigned long flags;
-
-	spin_lock_irqsave(&cpu_rcache->lock, flags);
-
 	iova_magazine_free_pfns(cpu_rcache->loaded, iovad);
 	iova_magazine_free(cpu_rcache->loaded);
 
 	iova_magazine_free_pfns(cpu_rcache->prev, iovad);
 	iova_magazine_free(cpu_rcache->prev);
-
-	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
 }
 
 /*
@@ -910,16 +896,13 @@ void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
 {
 	struct iova_cpu_rcache *cpu_rcache;
 	struct iova_rcache *rcache;
-	unsigned long flags;
 	int i;
 
 	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
 		rcache = &iovad->rcaches[i];
 		cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
-		spin_lock_irqsave(&cpu_rcache->lock, flags);
 		iova_magazine_free_pfns(cpu_rcache->loaded, iovad);
 		iova_magazine_free_pfns(cpu_rcache->prev, iovad);
-		spin_unlock_irqrestore(&cpu_rcache->lock, flags);
 	}
 }
 
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/2] iommu: Disable preemption around use of this_cpu_ptr()
  2016-06-01 11:10   ` [PATCH v3 1/2] " Chris Wilson
  2016-06-01 11:10     ` [PATCH v3 2/2] iommu: Remove cpu-local spinlock Chris Wilson
@ 2016-06-01 12:25     ` Joonas Lahtinen
  2016-06-15 12:25     ` Joerg Roedel
  2016-06-27 11:14     ` Joerg Roedel
  3 siblings, 0 replies; 11+ messages in thread
From: Joonas Lahtinen @ 2016-06-01 12:25 UTC (permalink / raw)
  To: Chris Wilson, Joerg Roedel; +Cc: intel-gfx, iommu, linux-kernel

On ke, 2016-06-01 at 12:10 +0100, Chris Wilson wrote:
> Between acquiring the this_cpu_ptr() and using it, ideally we don't want
> to be preempted and moved to work on another CPU's private data.
> this_cpu_ptr() checks whether or not preemption is disabled, and
> get_cpu_ptr() provides a convenient wrapper for operating on the cpu ptr
> inside a preemption-disabled critical section (which currently is
> provided by the spinlock). Indeed, if we disable preemption around
> this_cpu_ptr(), we do not need the CPU-local spinlock, so long as we take
> care that no other CPU is running that code, as we do when performing the
> cross-CPU cache flushing and teardown, but that is a subject for another
> patch.
> 
> [  167.997877] BUG: using smp_processor_id() in preemptible [00000000] code: usb-storage/216
> [  167.997940] caller is debug_smp_processor_id+0x17/0x20
> [  167.997945] CPU: 7 PID: 216 Comm: usb-storage Tainted: G     U          4.7.0-rc1-gfxbench-RO_Patchwork_1057+ #1
> [  167.997948] Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
> [  167.997951]  0000000000000000 ffff880118b7f9c8 ffffffff8140dca5 0000000000000007
> [  167.997958]  ffffffff81a3a7e9 ffff880118b7f9f8 ffffffff8142a927 0000000000000000
> [  167.997965]  ffff8800d499ed58 0000000000000001 00000000000fffff ffff880118b7fa08
> [  167.997971] Call Trace:
> [  167.997977]  [] dump_stack+0x67/0x92
> [  167.997981]  [] check_preemption_disabled+0xd7/0xe0
> [  167.997985]  [] debug_smp_processor_id+0x17/0x20
> [  167.997990]  [] alloc_iova_fast+0xb7/0x210
> [  167.997994]  [] intel_alloc_iova+0x7f/0xd0
> [  167.997998]  [] intel_map_sg+0xbd/0x240
> [  167.998002]  [] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [  167.998009]  [] usb_hcd_map_urb_for_dma+0x4b9/0x5a0
> [  167.998013]  [] usb_hcd_submit_urb+0xe9/0xaa0
> [  167.998017]  [] ? mark_held_locks+0x6f/0xa0
> [  167.998022]  [] ? __raw_spin_lock_init+0x1c/0x50
> [  167.998025]  [] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [  167.998028]  [] usb_submit_urb+0x3f3/0x5a0
> [  167.998032]  [] ? trace_hardirqs_on_caller+0x122/0x1b0
> [  167.998035]  [] usb_sg_wait+0x67/0x150
> [  167.998039]  [] usb_stor_bulk_transfer_sglist.part.3+0x82/0xd0
> [  167.998042]  [] usb_stor_bulk_srb+0x4c/0x60
> [  167.998045]  [] usb_stor_Bulk_transport+0x17e/0x420
> [  167.998049]  [] usb_stor_invoke_transport+0x242/0x540
> [  167.998052]  [] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [  167.998058]  [] usb_stor_transparent_scsi_command+0x9/0x10
> [  167.998061]  [] usb_stor_control_thread+0x158/0x260
> [  167.998064]  [] ? fill_inquiry_response+0x20/0x20
> [  167.998067]  [] ? fill_inquiry_response+0x20/0x20
> [  167.998071]  [] kthread+0xea/0x100
> [  167.998078]  [] ret_from_fork+0x1f/0x40
> [  167.998081]  [] ? kthread_create_on_node+0x1f0/0x1f0
> 
> v2: convert preempt_disable(); var = this_cpu_ptr() to var = get_cpu_ptr()
> v3: Actually use get_cpu_ptr (not get_cpu_var). Drop the spinlock
> removal, concentrate on the immediate bug fix.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96293
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: iommu@lists.linux-foundation.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/iommu/iova.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index ba764a0835d3..e23001bfcfee 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -420,8 +420,10 @@ retry:
>  
>  		/* Try replenishing IOVAs by flushing rcache. */
>  		flushed_rcache = true;
> +		preempt_disable();
>  		for_each_online_cpu(cpu)
>  			free_cpu_cached_iovas(cpu, iovad);
> +		preempt_enable();
>  		goto retry;
>  	}
>  
> @@ -749,7 +751,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
>  	bool can_insert = false;
>  	unsigned long flags;
>  
> -	cpu_rcache = this_cpu_ptr(rcache->cpu_rcaches);
> +	cpu_rcache = get_cpu_ptr(rcache->cpu_rcaches);
>  	spin_lock_irqsave(&cpu_rcache->lock, flags);
>  
>  	if (!iova_magazine_full(cpu_rcache->loaded)) {
> @@ -779,6 +781,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
>  		iova_magazine_push(cpu_rcache->loaded, iova_pfn);
>  
>  	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
> +	put_cpu_ptr(rcache->cpu_rcaches);
>  
>  	if (mag_to_free) {
>  		iova_magazine_free_pfns(mag_to_free, iovad);
> @@ -812,7 +815,7 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
>  	bool has_pfn = false;
>  	unsigned long flags;
>  
> -	cpu_rcache = this_cpu_ptr(rcache->cpu_rcaches);
> +	cpu_rcache = get_cpu_ptr(rcache->cpu_rcaches);
>  	spin_lock_irqsave(&cpu_rcache->lock, flags);
>  
>  	if (!iova_magazine_empty(cpu_rcache->loaded)) {
> @@ -834,6 +837,7 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
>  		iova_pfn = iova_magazine_pop(cpu_rcache->loaded, limit_pfn);
>  
>  	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
> +	put_cpu_ptr(rcache->cpu_rcaches);
>  
>  	return iova_pfn;
>  }
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] iommu: Remove cpu-local spinlock
  2016-06-01 11:10     ` [PATCH v3 2/2] iommu: Remove cpu-local spinlock Chris Wilson
@ 2016-06-01 12:40       ` Joonas Lahtinen
  2016-06-15 12:22       ` Joerg Roedel
  1 sibling, 0 replies; 11+ messages in thread
From: Joonas Lahtinen @ 2016-06-01 12:40 UTC (permalink / raw)
  To: Chris Wilson, Joerg Roedel; +Cc: intel-gfx, iommu, linux-kernel

On ke, 2016-06-01 at 12:10 +0100, Chris Wilson wrote:
> By avoiding cross-CPU usage of the per-cpu iova cache, we can forgo
> having a spinlock inside the per-cpu struct. The only place where we
> actually may touch another CPU's data is when performing a cache flush
> after running out of memory. Here, we can instead schedule a task to run
> on the other CPU to do the flush before trying again.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: iommu@lists.linux-foundation.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/iommu/iova.c | 29 ++++++-----------------------
>  1 file changed, 6 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index e23001bfcfee..36cdc8eeab1c 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -390,6 +390,11 @@ free_iova(struct iova_domain *iovad, unsigned long pfn)
>  }
>  EXPORT_SYMBOL_GPL(free_iova);
>  
> +static void free_this_cached_iovas(void *info)
> +{
> +	free_cpu_cached_iovas(smp_processor_id(), info);
> +}
> +
>  /**
>   * alloc_iova_fast - allocates an iova from rcache
>   * @iovad: - iova domain in question
> @@ -413,17 +418,12 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
>  retry:
>  	new_iova = alloc_iova(iovad, size, limit_pfn, true);
>  	if (!new_iova) {
> -		unsigned int cpu;
> -
>  		if (flushed_rcache)
>  			return 0;
>  
>  		/* Try replenishing IOVAs by flushing rcache. */
>  		flushed_rcache = true;
> -		preempt_disable();
> -		for_each_online_cpu(cpu)
> -			free_cpu_cached_iovas(cpu, iovad);
> -		preempt_enable();
> +		on_each_cpu(free_this_cached_iovas, iovad, true);

This is not on a hot path, so this should be a worthwhile change.

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas

>  		goto retry;
>  	}
>  
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] iommu: Remove cpu-local spinlock
  2016-06-01 11:10     ` [PATCH v3 2/2] iommu: Remove cpu-local spinlock Chris Wilson
  2016-06-01 12:40       ` Joonas Lahtinen
@ 2016-06-15 12:22       ` Joerg Roedel
  1 sibling, 0 replies; 11+ messages in thread
From: Joerg Roedel @ 2016-06-15 12:22 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Joonas Lahtinen, iommu, linux-kernel

On Wed, Jun 01, 2016 at 12:10:09PM +0100, Chris Wilson wrote:
> By avoiding cross-CPU usage of the per-cpu iova cache, we can forgo
> having a spinlock inside the per-cpu struct. The only place where we
> actually may touch another CPU's data is when performing a cache flush
> after running out of memory. Here, we can instead schedule a task to run
> on the other CPU to do the flush before trying again.

The big benefit of the spinlock is that the rcache can be flushed from a
different cpu without the need to send IPIs to everyone.


	Joerg

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/2] iommu: Disable preemption around use of this_cpu_ptr()
  2016-06-01 11:10   ` [PATCH v3 1/2] " Chris Wilson
  2016-06-01 11:10     ` [PATCH v3 2/2] iommu: Remove cpu-local spinlock Chris Wilson
  2016-06-01 12:25     ` [PATCH v3 1/2] iommu: Disable preemption around use of this_cpu_ptr() Joonas Lahtinen
@ 2016-06-15 12:25     ` Joerg Roedel
  2016-06-26 10:54       ` Thorsten Leemhuis
  2016-06-27 11:14     ` Joerg Roedel
  3 siblings, 1 reply; 11+ messages in thread
From: Joerg Roedel @ 2016-06-15 12:25 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Joonas Lahtinen, iommu, linux-kernel

On Wed, Jun 01, 2016 at 12:10:08PM +0100, Chris Wilson wrote:
> Between acquiring the this_cpu_ptr() and using it, ideally we don't want
> to be preempted and work on another CPU's private data. this_cpu_ptr()
> checks whether or not preemption is disabled, and get_cpu_ptr() provides
> a convenient wrapper for operating on the cpu ptr inside a preemption
> disabled critical section (which currently is provided by the
> spinlock). Indeed, if we disable preemption around this_cpu_ptr(), we do
> not need the CPU-local spinlock - so long as we take care that no other
> CPU is running that code while we perform the cross-CPU cache flushing
> and teardown, but that is a subject for another patch.
> 
> [  167.997877] BUG: using smp_processor_id() in preemptible [00000000] code: usb-storage/216
> [  167.997940] caller is debug_smp_processor_id+0x17/0x20
> [  167.997945] CPU: 7 PID: 216 Comm: usb-storage Tainted: G     U          4.7.0-rc1-gfxbench-RO_Patchwork_1057+ #1
> [  167.997948] Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
> [  167.997951]  0000000000000000 ffff880118b7f9c8 ffffffff8140dca5 0000000000000007
> [  167.997958]  ffffffff81a3a7e9 ffff880118b7f9f8 ffffffff8142a927 0000000000000000
> [  167.997965]  ffff8800d499ed58 0000000000000001 00000000000fffff ffff880118b7fa08
> [  167.997971] Call Trace:
> [  167.997977]  [<ffffffff8140dca5>] dump_stack+0x67/0x92
> [  167.997981]  [<ffffffff8142a927>] check_preemption_disabled+0xd7/0xe0
> [  167.997985]  [<ffffffff8142a947>] debug_smp_processor_id+0x17/0x20
> [  167.997990]  [<ffffffff81507e17>] alloc_iova_fast+0xb7/0x210
> [  167.997994]  [<ffffffff8150c55f>] intel_alloc_iova+0x7f/0xd0
> [  167.997998]  [<ffffffff8151021d>] intel_map_sg+0xbd/0x240
> [  167.998002]  [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [  167.998009]  [<ffffffff81596059>] usb_hcd_map_urb_for_dma+0x4b9/0x5a0
> [  167.998013]  [<ffffffff81596d19>] usb_hcd_submit_urb+0xe9/0xaa0
> [  167.998017]  [<ffffffff810cff2f>] ? mark_held_locks+0x6f/0xa0
> [  167.998022]  [<ffffffff810d525c>] ? __raw_spin_lock_init+0x1c/0x50
> [  167.998025]  [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [  167.998028]  [<ffffffff815988f3>] usb_submit_urb+0x3f3/0x5a0
> [  167.998032]  [<ffffffff810d0082>] ? trace_hardirqs_on_caller+0x122/0x1b0
> [  167.998035]  [<ffffffff81599ae7>] usb_sg_wait+0x67/0x150
> [  167.998039]  [<ffffffff815dc202>] usb_stor_bulk_transfer_sglist.part.3+0x82/0xd0
> [  167.998042]  [<ffffffff815dc29c>] usb_stor_bulk_srb+0x4c/0x60
> [  167.998045]  [<ffffffff815dc42e>] usb_stor_Bulk_transport+0x17e/0x420
> [  167.998049]  [<ffffffff815dcf32>] usb_stor_invoke_transport+0x242/0x540
> [  167.998052]  [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [  167.998058]  [<ffffffff815dba19>] usb_stor_transparent_scsi_command+0x9/0x10
> [  167.998061]  [<ffffffff815de518>] usb_stor_control_thread+0x158/0x260
> [  167.998064]  [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
> [  167.998067]  [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
> [  167.998071]  [<ffffffff8109ddfa>] kthread+0xea/0x100
> [  167.998078]  [<ffffffff817ac6af>] ret_from_fork+0x1f/0x40
> [  167.998081]  [<ffffffff8109dd10>] ? kthread_create_on_node+0x1f0/0x1f0
> 
> v2: convert preempt_disable(); var = this_cpu_ptr() to var = get_cpu_ptr()
> v3: Actually use get_cpu_ptr (not get_cpu_var). Drop the spinlock
> removal, concentrate on the immediate bug fix.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96293
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: iommu@lists.linux-foundation.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/iommu/iova.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index ba764a0835d3..e23001bfcfee 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -420,8 +420,10 @@ retry:
>  
>  		/* Try replenishing IOVAs by flushing rcache. */
>  		flushed_rcache = true;
> +		preempt_disable();
>  		for_each_online_cpu(cpu)
>  			free_cpu_cached_iovas(cpu, iovad);
> +		preempt_enable();

Why do you need to disable preemption here? The free_cpu_cached_iovas
function does not need to stay on the same cpu as it iterates over the
rcaches for all cpus anyway.


	Joerg

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/2] iommu: Disable preemption around use of this_cpu_ptr()
  2016-06-15 12:25     ` Joerg Roedel
@ 2016-06-26 10:54       ` Thorsten Leemhuis
  2016-06-27 11:19         ` Joerg Roedel
  0 siblings, 1 reply; 11+ messages in thread
From: Thorsten Leemhuis @ 2016-06-26 10:54 UTC (permalink / raw)
  To: Joerg Roedel, Chris Wilson
  Cc: intel-gfx, Joonas Lahtinen, iommu, linux-kernel

On 15.06.2016 14:25, Joerg Roedel wrote:
> On Wed, Jun 01, 2016 at 12:10:08PM +0100, Chris Wilson wrote:
>> Between acquiring the this_cpu_ptr() and using it, ideally we don't want
>> to be preempted and work on another CPU's private data. this_cpu_ptr()
>> checks whether or not preemption is disabled, and get_cpu_ptr() provides
>> a convenient wrapper for operating on the cpu ptr inside a preemption
>> disabled critical section (which currently is provided by the
>> spinlock). Indeed, if we disable preemption around this_cpu_ptr(), we do
>> not need the CPU-local spinlock - so long as we take care that no other
>> CPU is running that code while we perform the cross-CPU cache flushing
>> and teardown, but that is a subject for another patch.
> […]
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96293
> […]
>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>> index ba764a0835d3..e23001bfcfee 100644
>> --- a/drivers/iommu/iova.c
>> +++ b/drivers/iommu/iova.c
>> @@ -420,8 +420,10 @@ retry:
>>  
>>  		/* Try replenishing IOVAs by flushing rcache. */
>>  		flushed_rcache = true;
>> +		preempt_disable();
>>  		for_each_online_cpu(cpu)
>>  			free_cpu_cached_iovas(cpu, iovad);
>> +		preempt_enable();
> 
> Why do you need to disable preemption here? The free_cpu_cached_iovas
> function does not need to stay on the same cpu as it iterates over the
> rcaches for all cpus anyway.

Joerg, what's the status here? This made it onto my 4.7 regressions
report, as the patches from this thread are supposed to fix a
regression; see
http://thread.gmane.org/gmane.linux.usb.general/143504/focus=153154
for details.

Please let me know if the fixes went to mainline already; I did a quick
check and could not see any.

Sincerely, your regression tracker for Linux 4.7 (http://bit.ly/28JRmJo)
 Thorsten

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/2] iommu: Disable preemption around use of this_cpu_ptr()
  2016-06-01 11:10   ` [PATCH v3 1/2] " Chris Wilson
                       ` (2 preceding siblings ...)
  2016-06-15 12:25     ` Joerg Roedel
@ 2016-06-27 11:14     ` Joerg Roedel
  3 siblings, 0 replies; 11+ messages in thread
From: Joerg Roedel @ 2016-06-27 11:14 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Joonas Lahtinen, iommu, linux-kernel

On Wed, Jun 01, 2016 at 12:10:08PM +0100, Chris Wilson wrote:
>  drivers/iommu/iova.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)

Okay, applied this patch to iommu/fixes and will send it upstream this
week.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/2] iommu: Disable preemption around use of this_cpu_ptr()
  2016-06-26 10:54       ` Thorsten Leemhuis
@ 2016-06-27 11:19         ` Joerg Roedel
  0 siblings, 0 replies; 11+ messages in thread
From: Joerg Roedel @ 2016-06-27 11:19 UTC (permalink / raw)
  To: Thorsten Leemhuis
  Cc: Chris Wilson, intel-gfx, Joonas Lahtinen, iommu, linux-kernel

On Sun, Jun 26, 2016 at 12:54:19PM +0200, Thorsten Leemhuis wrote:
> Joerg, what's the status here? This made it onto my 4.7 regressions
> report, as the patches from this thread are supposed to fix a
> regression; see
> http://thread.gmane.org/gmane.linux.usb.general/143504/focus=153154
> for details.
> 
> Please let me know if the fixes went to mainline already; I did a quick
> check and could not see any.

Okay, queued the fix, should be upstream for -rc6.

> Sincerely, your regression tracker for Linux 4.7 (http://bit.ly/28JRmJo)
>  Thorsten

Awesome, glad to see you picking this up :)


Cheers,

	Joerg

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-06-27 11:19 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-31 16:34 [PATCH] iommu: Disable preemption around use of this_cpu_ptr() Chris Wilson
2016-06-01 10:23 ` [PATCH v2] " Chris Wilson
2016-06-01 11:10   ` [PATCH v3 1/2] " Chris Wilson
2016-06-01 11:10     ` [PATCH v3 2/2] iommu: Remove cpu-local spinlock Chris Wilson
2016-06-01 12:40       ` Joonas Lahtinen
2016-06-15 12:22       ` Joerg Roedel
2016-06-01 12:25     ` [PATCH v3 1/2] iommu: Disable preemption around use of this_cpu_ptr() Joonas Lahtinen
2016-06-15 12:25     ` Joerg Roedel
2016-06-26 10:54       ` Thorsten Leemhuis
2016-06-27 11:19         ` Joerg Roedel
2016-06-27 11:14     ` Joerg Roedel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).