* [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
@ 2018-01-15 16:03 Ming Lei
  2018-01-15 16:03 ` [PATCH 1/2] genirq/affinity: move irq vectors spread into one function Ming Lei
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Ming Lei @ 2018-01-15 16:03 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-kernel, Christoph Hellwig,
	Thomas Gleixner
  Cc: Laurence Oberman, Mike Snitzer, Ming Lei

Hi,

These two patches fix an IO hang issue reported by Laurence.

84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
may cause one irq vector to be assigned only offline CPUs, and such a
vector can't handle interrupts any more.

The 1st patch moves the irq vector spreading into one function and
prepares for the fix done in the 2nd patch.

The 2nd patch fixes the issue by trying to make sure that an online CPU
is assigned to each irq vector.


Ming Lei (2):
  genirq/affinity: move irq vectors spread into one function
  genirq/affinity: try best to make sure online CPU is assigned to
    vector

 kernel/irq/affinity.c | 77 ++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 52 insertions(+), 25 deletions(-)

-- 
2.9.5

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/2] genirq/affinity: move irq vectors spread into one function
  2018-01-15 16:03 [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector Ming Lei
@ 2018-01-15 16:03 ` Ming Lei
  2018-01-15 16:03 ` [PATCH 2/2] genirq/affinity: try best to make sure online CPU is assigned to vector Ming Lei
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Ming Lei @ 2018-01-15 16:03 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-kernel, Christoph Hellwig,
	Thomas Gleixner
  Cc: Laurence Oberman, Mike Snitzer, Ming Lei, Christoph Hellwig

This patch prepares for doing a two-step spread:

	- spread vectors across offline CPUs
	- spread vectors across online CPUs

This approach is used to try our best to avoid assigning only offline
CPUs to a single vector.

No functional change, and code gets cleaned up too.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 kernel/irq/affinity.c | 56 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 34 insertions(+), 22 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index a37a3b4b6342..99eb38a4cc83 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -94,6 +94,35 @@ static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
 	return nodes;
 }
 
+/* Spread irq vectors, and the result is stored to @irqmsk. */
+static int irq_vecs_spread_affinity(struct cpumask *irqmsk,
+				    int max_irqmsks,
+				    int max_vecs,
+				    struct cpumask *nmsk)
+{
+	int v, ncpus = cpumask_weight(nmsk);
+	int vecs_to_assign, extra_vecs;
+
+	/* How many vectors we will try to spread */
+	vecs_to_assign = min(max_vecs, ncpus);
+
+	/* Account for rounding errors */
+	extra_vecs = ncpus - vecs_to_assign * (ncpus / vecs_to_assign);
+
+	for (v = 0; v < min(max_irqmsks, vecs_to_assign); v++) {
+		int cpus_per_vec = ncpus / vecs_to_assign;
+
+		/* Account for extra vectors to compensate rounding errors */
+		if (extra_vecs) {
+			cpus_per_vec++;
+			--extra_vecs;
+		}
+		irq_spread_init_one(irqmsk + v, nmsk, cpus_per_vec);
+	}
+
+	return v;
+}
+
 /**
  * irq_create_affinity_masks - Create affinity masks for multiqueue spreading
  * @nvecs:	The total number of vectors
@@ -104,7 +133,7 @@ static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
 struct cpumask *
 irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 {
-	int n, nodes, cpus_per_vec, extra_vecs, curvec;
+	int n, nodes, curvec;
 	int affv = nvecs - affd->pre_vectors - affd->post_vectors;
 	int last_affv = affv + affd->pre_vectors;
 	nodemask_t nodemsk = NODE_MASK_NONE;
@@ -154,33 +183,16 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	}
 
 	for_each_node_mask(n, nodemsk) {
-		int ncpus, v, vecs_to_assign, vecs_per_node;
+		int vecs_per_node;
 
 		/* Spread the vectors per node */
 		vecs_per_node = (affv - (curvec - affd->pre_vectors)) / nodes;
 
-		/* Get the cpus on this node which are in the mask */
 		cpumask_and(nmsk, cpu_possible_mask, node_to_possible_cpumask[n]);
 
-		/* Calculate the number of cpus per vector */
-		ncpus = cpumask_weight(nmsk);
-		vecs_to_assign = min(vecs_per_node, ncpus);
-
-		/* Account for rounding errors */
-		extra_vecs = ncpus - vecs_to_assign * (ncpus / vecs_to_assign);
-
-		for (v = 0; curvec < last_affv && v < vecs_to_assign;
-		     curvec++, v++) {
-			cpus_per_vec = ncpus / vecs_to_assign;
-
-			/* Account for extra vectors to compensate rounding errors */
-			if (extra_vecs) {
-				cpus_per_vec++;
-				--extra_vecs;
-			}
-			irq_spread_init_one(masks + curvec, nmsk, cpus_per_vec);
-		}
-
+		curvec += irq_vecs_spread_affinity(&masks[curvec],
+						   last_affv - curvec,
+						   vecs_per_node, nmsk);
 		if (curvec >= last_affv)
 			break;
 		--nodes;
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 23+ messages in thread
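
The arithmetic inside irq_vecs_spread_affinity() above is easier to see with a
small stand-alone userspace sketch (plain C, not kernel code; spread_one_node()
is a made-up name and cpumasks are reduced to a bare CPU count). It shows how
vecs_to_assign, extra_vecs and cpus_per_vec split the CPUs of one node across
its vectors, with the remainder handed out one extra CPU at a time:

/* gcc -o spread spread.c && ./spread */
#include <stdio.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Split @ncpus CPUs of one node across up to @max_vecs vectors. */
static void spread_one_node(int ncpus, int max_vecs)
{
	int vecs_to_assign = MIN(max_vecs, ncpus);
	/* CPUs left over after the even integer division */
	int extra_vecs = ncpus - vecs_to_assign * (ncpus / vecs_to_assign);
	int v;

	printf("ncpus=%2d max_vecs=%2d -> cpus per vector:", ncpus, max_vecs);
	for (v = 0; v < vecs_to_assign; v++) {
		int cpus_per_vec = ncpus / vecs_to_assign;

		/* the first extra_vecs vectors take one additional CPU each */
		if (extra_vecs) {
			cpus_per_vec++;
			--extra_vecs;
		}
		printf(" %d", cpus_per_vec);
	}
	printf("\n");
}

int main(void)
{
	spread_one_node(16, 8);	/* 2 2 2 2 2 2 2 2 */
	spread_one_node(7, 3);	/* 3 2 2: remainder goes to the first vector */
	spread_one_node(3, 8);	/* 1 1 1: only three vectors get a CPU at all */
	return 0;
}

(The kernel function additionally bounds the loop by max_irqmsks so it never
writes past the caller's masks array; that bound is dropped here for brevity.)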

* [PATCH 2/2] genirq/affinity: try best to make sure online CPU is assigned to vector
  2018-01-15 16:03 [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector Ming Lei
  2018-01-15 16:03 ` [PATCH 1/2] genirq/affinity: move irq vectors spread into one function Ming Lei
@ 2018-01-15 16:03 ` Ming Lei
  2018-01-15 17:40 ` [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector Christoph Hellwig
  2018-01-15 17:43 ` Thomas Gleixner
  3 siblings, 0 replies; 23+ messages in thread
From: Ming Lei @ 2018-01-15 16:03 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-kernel, Christoph Hellwig,
	Thomas Gleixner
  Cc: Laurence Oberman, Mike Snitzer, Ming Lei, Christoph Hellwig

84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs") can
cause an irq vector to be assigned only offline CPUs, and an IO hang on
HPSA was reported by Laurence.

This patch fixes the issue by trying its best to make sure that an online
CPU is assigned to each irq vector. Two steps are taken to spread irq
vectors:

1) spread irq vectors across offline CPUs in the node cpumask

2) spread irq vectors across online CPUs in the node cpumask

Fixes: 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Reported-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 kernel/irq/affinity.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 99eb38a4cc83..8b716548b3db 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -103,6 +103,10 @@ static int irq_vecs_spread_affinity(struct cpumask *irqmsk,
 	int v, ncpus = cpumask_weight(nmsk);
 	int vecs_to_assign, extra_vecs;
 
+	/* May happen when spreading vectors across offline cpus */
+	if (!ncpus)
+		return 0;
+
 	/* How many vectors we will try to spread */
 	vecs_to_assign = min(max_vecs, ncpus);
 
@@ -165,13 +169,16 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	/* Stabilize the cpumasks */
 	get_online_cpus();
 	build_node_to_possible_cpumask(node_to_possible_cpumask);
-	nodes = get_nodes_in_cpumask(node_to_possible_cpumask, cpu_possible_mask,
-				     &nodemsk);
 
 	/*
+	 * Don't spread irq vector across offline node.
+	 *
 	 * If the number of nodes in the mask is greater than or equal the
 	 * number of vectors we just spread the vectors across the nodes.
+	 *
 	 */
+	nodes = get_nodes_in_cpumask(node_to_possible_cpumask, cpu_online_mask,
+				     &nodemsk);
 	if (affv <= nodes) {
 		for_each_node_mask(n, nodemsk) {
 			cpumask_copy(masks + curvec,
@@ -182,14 +189,22 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 		goto done;
 	}
 
+	nodes_clear(nodemsk);
+	nodes = get_nodes_in_cpumask(node_to_possible_cpumask, cpu_possible_mask,
+				     &nodemsk);
 	for_each_node_mask(n, nodemsk) {
 		int vecs_per_node;
 
 		/* Spread the vectors per node */
 		vecs_per_node = (affv - (curvec - affd->pre_vectors)) / nodes;
 
-		cpumask_and(nmsk, cpu_possible_mask, node_to_possible_cpumask[n]);
+		/* spread vectors across offline cpus in the node cpumask */
+		cpumask_andnot(nmsk, node_to_possible_cpumask[n], cpu_online_mask);
+		irq_vecs_spread_affinity(&masks[curvec], last_affv - curvec,
+				vecs_per_node, nmsk);
 
+		/* spread vectors across online cpus in the node cpumask */
+		cpumask_and(nmsk, node_to_possible_cpumask[n], cpu_online_mask);
 		curvec += irq_vecs_spread_affinity(&masks[curvec],
 						   last_affv - curvec,
 						   vecs_per_node, nmsk);
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 23+ messages in thread
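
To illustrate the two-step spread this patch adds, here is a minimal userspace
sketch (a uint64_t word stands in for a cpumask, bit N meaning CPU N;
spread_cpus() is a made-up helper, not kernel API). Step one distributes the
node's offline CPUs over the vectors, step two overlays the node's online CPUs
on the same vectors, mirroring how the patch only advances curvec after the
online pass:

#include <stdio.h>
#include <stdint.h>

/*
 * Chunk the CPUs set in @mask evenly across up to @nvecs vectors, OR-ing each
 * chunk into out[].  Returns how many vectors actually received CPUs.
 */
static int spread_cpus(uint64_t mask, int nvecs, uint64_t *out)
{
	int ncpus = __builtin_popcountll(mask);
	int vecs, extra, v, cpu = 0;

	if (!ncpus)			/* e.g. every CPU of this node is offline */
		return 0;

	vecs = nvecs < ncpus ? nvecs : ncpus;
	extra = ncpus - vecs * (ncpus / vecs);

	for (v = 0; v < vecs; v++) {
		int take = ncpus / vecs;

		if (extra) {		/* hand out the remainder one CPU at a time */
			take++;
			extra--;
		}
		while (take) {		/* grab the next 'take' set bits of @mask */
			if (mask & (1ULL << cpu)) {
				out[v] |= 1ULL << cpu;
				take--;
			}
			cpu++;
		}
	}
	return vecs;
}

int main(void)
{
	uint64_t node_possible = 0xff;	/* one node with possible CPUs 0-7 ... */
	uint64_t online = 0x0f;		/* ... of which only CPUs 0-3 are online */
	uint64_t masks[4] = { 0 };
	int v;

	/* step 1: offline CPUs of the node; the result count is deliberately ignored */
	spread_cpus(node_possible & ~online, 4, masks);
	/* step 2: online CPUs of the same node, overlaid on the same vectors */
	spread_cpus(node_possible & online, 4, masks);

	for (v = 0; v < 4; v++)
		printf("vector %d: cpumask 0x%02llx\n",
		       v, (unsigned long long)masks[v]);
	return 0;
}

With possible CPUs 0-7 and only 0-3 online, every vector ends up with one
online and one offline CPU, so no vector is left with an offline-only mask.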

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
  2018-01-15 16:03 [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector Ming Lei
  2018-01-15 16:03 ` [PATCH 1/2] genirq/affinity: move irq vectors spread into one function Ming Lei
  2018-01-15 16:03 ` [PATCH 2/2] genirq/affinity: try best to make sure online CPU is assigned to vector Ming Lei
@ 2018-01-15 17:40 ` Christoph Hellwig
  2018-01-16  1:30   ` Ming Lei
  2018-01-16  2:15   ` Ming Lei
  2018-01-15 17:43 ` Thomas Gleixner
  3 siblings, 2 replies; 23+ messages in thread
From: Christoph Hellwig @ 2018-01-15 17:40 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, linux-kernel, Christoph Hellwig,
	Thomas Gleixner, Laurence Oberman, Mike Snitzer

On Tue, Jan 16, 2018 at 12:03:43AM +0800, Ming Lei wrote:
> Hi,
> 
> These two patches fixes IO hang issue reported by Laurence.
> 
> 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> may cause one irq vector assigned to all offline CPUs, then this vector
> can't handle irq any more.

Well, that very much was the intention of managed interrupts.  Why
does the device raise an interrupt for a queue that has no online
cpu assigned to it?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
  2018-01-15 16:03 [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector Ming Lei
                   ` (2 preceding siblings ...)
  2018-01-15 17:40 ` [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector Christoph Hellwig
@ 2018-01-15 17:43 ` Thomas Gleixner
  2018-01-15 17:54   ` Laurence Oberman
  2018-01-16  1:34   ` Ming Lei
  3 siblings, 2 replies; 23+ messages in thread
From: Thomas Gleixner @ 2018-01-15 17:43 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, linux-kernel, Christoph Hellwig,
	Laurence Oberman, Mike Snitzer

On Tue, 16 Jan 2018, Ming Lei wrote:
> These two patches fixes IO hang issue reported by Laurence.
> 
> 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> may cause one irq vector assigned to all offline CPUs, then this vector
> can't handle irq any more.
> 
> The 1st patch moves irq vectors spread into one function, and prepares
> for the fix done in 2nd patch.
> 
> The 2nd patch fixes the issue by trying to make sure online CPUs assigned
> to irq vector.

Which means it's completely undoing the intent and mechanism of managed
interrupts. Not going to happen.

Which driver is that which abuses managed interrupts and does not keep its
queues properly sorted on cpu hotplug?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
  2018-01-15 17:43 ` Thomas Gleixner
@ 2018-01-15 17:54   ` Laurence Oberman
  2018-01-16  1:34   ` Ming Lei
  1 sibling, 0 replies; 23+ messages in thread
From: Laurence Oberman @ 2018-01-15 17:54 UTC (permalink / raw)
  To: Thomas Gleixner, Ming Lei, Ming Lei
  Cc: Jens Axboe, linux-block, linux-kernel, Christoph Hellwig,
	Mike Snitzer, Brace, Don

On Mon, 2018-01-15 at 18:43 +0100, Thomas Gleixner wrote:
> On Tue, 16 Jan 2018, Ming Lei wrote:
> > These two patches fixes IO hang issue reported by Laurence.
> > 
> > 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> > may cause one irq vector assigned to all offline CPUs, then this
> > vector
> > can't handle irq any more.
> > 
> > The 1st patch moves irq vectors spread into one function, and
> > prepares
> > for the fix done in 2nd patch.
> > 
> > The 2nd patch fixes the issue by trying to make sure online CPUs
> > assigned
> > to irq vector.
> 
> Which means it's completely undoing the intent and mechanism of
> managed
> interrupts. Not going to happen.
> 
> Which driver is that which abuses managed interrupts and does not
> keep its
> queues properly sorted on cpu hotplug?
> 
> Thanks,
> 
> 	tglx

Hello Thomas

The servers I am using all boot off hpsa (SmartArray), and the system
would hang on boot with the stack below.

So this is seen when booting off the hpsa driver, and not seen by Mike
when booting off a server not using hpsa.

It is also not seen when reverting the patch I called out.

Putting that patch back into Mike/Jens combined tree and adding Ming's
patch seems to fix this issue now. I can boot.

I just did a quick sanity boot and check, not any in-depth testing
right now.

It's not code I am at all familiar with that Ming has changed to make it
work, so I defer to Ming to explain it in depth.


[  246.751050] INFO: task systemd-udevd:411 blocked for more than 120
seconds.
[  246.791852]       Tainted: G          I      4.15.0-
rc4.block.dm.4.16+ #1
[  246.830650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables
this message.
[  246.874637] systemd-udevd   D    0   411    408 0x80000004
[  246.904934] Call Trace:
[  246.918191]  ? __schedule+0x28d/0x870
[  246.937643]  ? _cond_resched+0x15/0x30
[  246.958222]  schedule+0x32/0x80
[  246.975424]  async_synchronize_cookie_domain+0x8b/0x140
[  247.004452]  ? remove_wait_queue+0x60/0x60
[  247.027335]  do_init_module+0xbe/0x219
[  247.048022]  load_module+0x21d6/0x2910
[  247.069436]  ? m_show+0x1c0/0x1c0
[  247.087999]  SYSC_finit_module+0x94/0xe0
[  247.110392]  entry_SYSCALL_64_fastpath+0x1a/0x7d
[  247.136669] RIP: 0033:0x7f84049287f9
[  247.156112] RSP: 002b:00007ffd13199ab8 EFLAGS: 00000246 ORIG_RAX:
0000000000000139
[  247.196883] RAX: ffffffffffffffda RBX: 000055b712b59e80 RCX:
00007f84049287f9
[  247.237989] RDX: 0000000000000000 RSI: 00007f8405245099 RDI:
0000000000000008
[  247.279105] RBP: 00007f8404bf2760 R08: 0000000000000000 R09:
000055b712b45760
[  247.320005] R10: 0000000000000008 R11: 0000000000000246 R12:
0000000000000020
[  247.360625] R13: 00007f8404bf2818 R14: 0000000000000050 R15:
00007f8404bf27b8
[  247.401062] INFO: task scsi_eh_0:471 blocked for more than 120
seconds.
[  247.438161]       Tainted: G          I      4.15.0-
rc4.block.dm.4.16+ #1
[  247.476640] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables
this message.
[  247.520700] scsi_eh_0       D    0   471      2 0x80000000
[  247.551339] Call Trace:
[  247.564360]  ? __schedule+0x28d/0x870
[  247.584720]  schedule+0x32/0x80
[  247.601294]  hpsa_eh_device_reset_handler+0x68c/0x700 [hpsa]
[  247.633358]  ? remove_wait_queue+0x60/0x60
[  247.656345]  scsi_try_bus_device_reset+0x27/0x40
[  247.682424]  scsi_eh_ready_devs+0x53f/0xe20
[  247.706467]  ? __pm_runtime_resume+0x55/0x70
[  247.730327]  scsi_error_handler+0x434/0x5e0
[  247.754387]  ? __schedule+0x295/0x870
[  247.775420]  kthread+0xf5/0x130
[  247.793461]  ? scsi_eh_get_sense+0x240/0x240
[  247.818008]  ? kthread_associate_blkcg+0x90/0x90
[  247.844759]  ret_from_fork+0x1f/0x30
[  247.865440] INFO: task scsi_id:488 blocked for more than 120
seconds.
[  247.901112]       Tainted: G          I      4.15.0-
rc4.block.dm.4.16+ #1
[  247.938743] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables
this message.
[  247.981092] scsi_id         D    0   488      1 0x00000004
[  248.010535] Call Trace:
[  248.023567]  ? __schedule+0x28d/0x870
[  248.044236]  ? __switch_to+0x1f5/0x460
[  248.065776]  schedule+0x32/0x80
[  248.084238]  schedule_timeout+0x1d4/0x2f0
[  248.106184]  wait_for_completion+0x123/0x190
[  248.130759]  ? wake_up_q+0x70/0x70
[  248.150295]  flush_work+0x119/0x1a0
[  248.169238]  ? wake_up_worker+0x30/0x30
[  248.189670]  __cancel_work_timer+0x103/0x190
[  248.213751]  ? kobj_lookup+0x10b/0x160
[  248.235441]  disk_block_events+0x6f/0x90
[  248.257820]  __blkdev_get+0x6a/0x480
[  248.278770]  ? bd_acquire+0xd0/0xd0
[  248.298438]  blkdev_get+0x1a5/0x300
[  248.316587]  ? bd_acquire+0xd0/0xd0
[  248.334814]  do_dentry_open+0x202/0x320
[  248.354372]  ? security_inode_permission+0x3c/0x50
[  248.378818]  path_openat+0x537/0x12c0
[  248.397386]  ? vm_insert_page+0x1e0/0x1f0
[  248.417664]  ? vvar_fault+0x75/0x140
[  248.435811]  do_filp_open+0x91/0x100
[  248.454061]  do_sys_open+0x126/0x210
[  248.472462]  entry_SYSCALL_64_fastpath+0x1a/0x7d
[  248.495438] RIP: 0033:0x7f39e60e1e90
[  248.513136] RSP: 002b:00007ffc4c906ba8 EFLAGS: 00000246 ORIG_RAX:
0000000000000002
[  248.550726] RAX: ffffffffffffffda RBX: 00005624aead3010 RCX:
00007f39e60e1e90
[  248.586207] RDX: 00007f39e60cc0c4 RSI: 0000000000080800 RDI:
00007ffc4c906ed0
[  248.622411] RBP: 00007ffc4c906b60 R08: 00007f39e60cc140 R09:
00007f39e60cc140
[  248.658704] R10: 000000000000001f R11: 0000000000000246 R12:
00007ffc4c906ed0
[  248.695771] R13: 000000009da9d520 R14: 0000000000000000 R15:
00007ffc4c906c28

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
  2018-01-15 17:40 ` [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector Christoph Hellwig
@ 2018-01-16  1:30   ` Ming Lei
  2018-01-16 11:25     ` Thomas Gleixner
  2018-01-16  2:15   ` Ming Lei
  1 sibling, 1 reply; 23+ messages in thread
From: Ming Lei @ 2018-01-16  1:30 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, linux-kernel, Thomas Gleixner,
	Laurence Oberman, Mike Snitzer

On Mon, Jan 15, 2018 at 09:40:36AM -0800, Christoph Hellwig wrote:
> On Tue, Jan 16, 2018 at 12:03:43AM +0800, Ming Lei wrote:
> > Hi,
> > 
> > These two patches fixes IO hang issue reported by Laurence.
> > 
> > 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> > may cause one irq vector assigned to all offline CPUs, then this vector
> > can't handle irq any more.
> 
> Well, that very much was the intention of managed interrupts.  Why
> does the device raise an interrupt for a queue that has no online
> cpu assigned to it?

It is because of irq_create_affinity_masks().

Once irq vectors are spread across all possible CPUs, some of which are
offline, a vector may end up being assigned only offline CPUs.

Take HPSA as an example: there are 8 irq vectors in this device, and the
system supports at most 32 CPUs, but only 16 (0-15) are present after
booting. At least one online CPU should be assignable for handling each
irq vector of HPSA, but:

1) before commit 84676c1f21:

	irq 25, cpu list 0
	irq 26, cpu list 2
	irq 27, cpu list 4
	irq 28, cpu list 6
	irq 29, cpu list 8
	irq 30, cpu list 10
	irq 31, cpu list 12
	irq 32, cpu list 14
	irq 33, cpu list 1
	irq 34, cpu list 3
	irq 35, cpu list 5
	irq 36, cpu list 7
	irq 37, cpu list 9
	irq 38, cpu list 11
	irq 39, cpu list 13
	irq 40, cpu list 15

2) after commit 84676c1f21:

	irq 25, cpu list 0, 2
	irq 26, cpu list 4, 6
	irq 27, cpu list 8, 10
	irq 28, cpu list 12, 14
	irq 29, cpu list 16, 18
	irq 30, cpu list 20, 22
	irq 31, cpu list 24, 26
	irq 32, cpu list 28, 30
	irq 33, cpu list 1, 3
	irq 34, cpu list 5, 7
	irq 35, cpu list 9, 11
	irq 36, cpu list 13, 15
	irq 37, cpu list 17, 19
	irq 38, cpu list 21, 23
	irq 39, cpu list 25, 27
	irq 40, cpu list 29, 31

And vectors 29-32 and 37-40 are assigned only offline CPUs.

-- 
Ming

^ permalink raw reply	[flat|nested] 23+ messages in thread
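
For readers who want to reproduce the second table, the layout can be mimicked
with a few lines of userspace C. The even/odd CPU-to-node interleave and the
"CPUs 16-31 are possible but offline" split are assumptions inferred from the
quoted cpu lists, not something queried from the real machine:

#include <stdio.h>

int main(void)
{
	int nr_online = 16;	/* CPUs 0-15 online, 16-31 possible but offline */
	int irq = 25;
	int node, v;

	for (node = 0; node < 2; node++)	/* assume node 0 = even CPUs, node 1 = odd CPUs */
		for (v = 0; v < 8; v++, irq++) {
			int cpu0 = node + 4 * v;	/* two possible CPUs per vector */
			int cpu1 = cpu0 + 2;

			printf("irq %d, cpu list %d, %d%s\n", irq, cpu0, cpu1,
			       cpu0 >= nr_online ? "   <- no online CPU" : "");
		}
	return 0;
}

The eight vectors flagged are exactly irqs 29-32 and 37-40 from the mail above;
while CPUs 16-31 stay offline, no online CPU can handle interrupts arriving on
those vectors, which is the failure being discussed.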

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
  2018-01-15 17:43 ` Thomas Gleixner
  2018-01-15 17:54   ` Laurence Oberman
@ 2018-01-16  1:34   ` Ming Lei
  1 sibling, 0 replies; 23+ messages in thread
From: Ming Lei @ 2018-01-16  1:34 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jens Axboe, linux-block, linux-kernel, Christoph Hellwig,
	Laurence Oberman, Mike Snitzer

On Mon, Jan 15, 2018 at 06:43:47PM +0100, Thomas Gleixner wrote:
> On Tue, 16 Jan 2018, Ming Lei wrote:
> > These two patches fixes IO hang issue reported by Laurence.
> > 
> > 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> > may cause one irq vector assigned to all offline CPUs, then this vector
> > can't handle irq any more.
> > 
> > The 1st patch moves irq vectors spread into one function, and prepares
> > for the fix done in 2nd patch.
> > 
> > The 2nd patch fixes the issue by trying to make sure online CPUs assigned
> > to irq vector.
> 
> Which means it's completely undoing the intent and mechanism of managed
> interrupts. Not going to happen.

As I replied in the previous mail, after we spread vectors across all
possible CPUs, some of which are not present, some irq vectors may end up
being assigned only offline CPUs.

> 
> Which driver is that which abuses managed interrupts and does not keep its
> queues properly sorted on cpu hotplug?

It isn't related to a specific driver/device; I can easily trigger this
issue on NVMe as well as on HPSA.


Thanks,
Ming

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
  2018-01-15 17:40 ` [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector Christoph Hellwig
  2018-01-16  1:30   ` Ming Lei
@ 2018-01-16  2:15   ` Ming Lei
  1 sibling, 0 replies; 23+ messages in thread
From: Ming Lei @ 2018-01-16  2:15 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, linux-kernel, Thomas Gleixner,
	Laurence Oberman, Mike Snitzer

On Mon, Jan 15, 2018 at 09:40:36AM -0800, Christoph Hellwig wrote:
> On Tue, Jan 16, 2018 at 12:03:43AM +0800, Ming Lei wrote:
> > Hi,
> > 
> > These two patches fixes IO hang issue reported by Laurence.
> > 
> > 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> > may cause one irq vector assigned to all offline CPUs, then this vector
> > can't handle irq any more.
> 
> Well, that very much was the intention of managed interrupts.  Why
> does the device raise an interrupt for a queue that has no online
> cpu assigned to it?

If pci_alloc_irq_vectors() returns success, the driver may think everything
is just fine, configure the related hw queues (such as enabling irqs on the
queues), and finally an irq arrives and no CPU can handle it.

Also I don't think there are drivers which check whether the CPUs assigned
to irq vectors are online or not, and that never seems to be a job a driver
is supposed to do.

-- 
Ming

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
  2018-01-16  1:30   ` Ming Lei
@ 2018-01-16 11:25     ` Thomas Gleixner
  2018-01-16 12:23       ` Ming Lei
  2018-01-16 13:28       ` Laurence Oberman
  0 siblings, 2 replies; 23+ messages in thread
From: Thomas Gleixner @ 2018-01-16 11:25 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, linux-block, linux-kernel,
	Laurence Oberman, Mike Snitzer

On Tue, 16 Jan 2018, Ming Lei wrote:

> On Mon, Jan 15, 2018 at 09:40:36AM -0800, Christoph Hellwig wrote:
> > On Tue, Jan 16, 2018 at 12:03:43AM +0800, Ming Lei wrote:
> > > Hi,
> > > 
> > > These two patches fixes IO hang issue reported by Laurence.
> > > 
> > > 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> > > may cause one irq vector assigned to all offline CPUs, then this vector
> > > can't handle irq any more.
> > 
> > Well, that very much was the intention of managed interrupts.  Why
> > does the device raise an interrupt for a queue that has no online
> > cpu assigned to it?
> 
> It is because of irq_create_affinity_masks().

That still does not answer the question. If the interrupt for a queue is
assigned to an offline CPU, then the queue should not be used and never
raise an interrupt. That's how managed interrupts have been designed.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
  2018-01-16 11:25     ` Thomas Gleixner
@ 2018-01-16 12:23       ` Ming Lei
  2018-01-16 13:28       ` Laurence Oberman
  1 sibling, 0 replies; 23+ messages in thread
From: Ming Lei @ 2018-01-16 12:23 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Christoph Hellwig, Jens Axboe, linux-block, linux-kernel,
	Laurence Oberman, Mike Snitzer, Don Brace, James E.J. Bottomley,
	Martin K. Petersen, esc.storagedev, linux-scsi

On Tue, Jan 16, 2018 at 12:25:19PM +0100, Thomas Gleixner wrote:
> On Tue, 16 Jan 2018, Ming Lei wrote:
> 
> > On Mon, Jan 15, 2018 at 09:40:36AM -0800, Christoph Hellwig wrote:
> > > On Tue, Jan 16, 2018 at 12:03:43AM +0800, Ming Lei wrote:
> > > > Hi,
> > > > 
> > > > These two patches fixes IO hang issue reported by Laurence.
> > > > 
> > > > 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> > > > may cause one irq vector assigned to all offline CPUs, then this vector
> > > > can't handle irq any more.
> > > 
> > > Well, that very much was the intention of managed interrupts.  Why
> > > does the device raise an interrupt for a queue that has no online
> > > cpu assigned to it?
> > 
> > It is because of irq_create_affinity_masks().
> 
> That still does not answer the question. If the interrupt for a queue is
> assigned to an offline CPU, then the queue should not be used and never
> raise an interrupt. That's how managed interrupts have been designed.

Sorry for not answering it in the first place, but later I realized that:

	https://marc.info/?l=linux-block&m=151606896601195&w=2

Also, wrt. HPSA's queues, it looks like they are not the usual IO queues
(such as NVMe's hw queues) which are supposed to follow a C/S model. HPSA's
queues are more like management queues, I guess, since HPSA is still a
single-queue HBA from the blk-mq point of view.

Cc HPSA and SCSI guys.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
  2018-01-16 11:25     ` Thomas Gleixner
  2018-01-16 12:23       ` Ming Lei
@ 2018-01-16 13:28       ` Laurence Oberman
  2018-01-16 15:22           ` Don Brace
  1 sibling, 1 reply; 23+ messages in thread
From: Laurence Oberman @ 2018-01-16 13:28 UTC (permalink / raw)
  To: Thomas Gleixner, Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, linux-block, linux-kernel,
	Mike Snitzer, Brace, Don

On Tue, 2018-01-16 at 12:25 +0100, Thomas Gleixner wrote:
> On Tue, 16 Jan 2018, Ming Lei wrote:
> 
> > On Mon, Jan 15, 2018 at 09:40:36AM -0800, Christoph Hellwig wrote:
> > > On Tue, Jan 16, 2018 at 12:03:43AM +0800, Ming Lei wrote:
> > > > Hi,
> > > > 
> > > > These two patches fixes IO hang issue reported by Laurence.
> > > > 
> > > > 84676c1f21 ("genirq/affinity: assign vectors to all possible
> > > > CPUs")
> > > > may cause one irq vector assigned to all offline CPUs, then
> > > > this vector
> > > > can't handle irq any more.
> > > 
> > > Well, that very much was the intention of managed
> > > interrupts.  Why
> > > does the device raise an interrupt for a queue that has no online
> > > cpu assigned to it?
> > 
> > It is because of irq_create_affinity_masks().
> 
> That still does not answer the question. If the interrupt for a queue
> is
> assigned to an offline CPU, then the queue should not be used and
> never
> raise an interrupt. That's how managed interrupts have been designed.
> 
> Thanks,
> 
> 	tglx
> 
> 
> 
> 

I captured a full boot log for this issue for Microsemi; I will send it
to Don Brace.
I enabled all the HPSA debugging, and here is a snippet:

[    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.15.0-
rc4.noming+ root=/dev/mapper/rhel_ibclient-root ro crashkernel=512M@64M
 rd.lvm.lv=rhel_ibclient/root rd.lvm.lv=rhel_ibclient/swap
log_buf_len=54M console=ttyS1,115200n8 scsi_mod.use_blk_mq=y
dm_mod.use_blk_mq=y
[    0.000000] Memory: 7834908K/1002852K available (8397K kernel code,
3012K rwdata, 3660K rodata, 2184K init, 15344K bss, 2356808K reserved,
0K cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=32,
Nodes=2
[    0.000000] ftrace: allocating 33084 entries in 130 pages
[    0.000000] Running RCU self tests
[    0.000000] Hierarchical RCU implementation.
[    0.000000] 	RCU lockdep checking is enabled.
[    0.000000] 	RCU restricting CPUs from NR_CPUS=8192 to
nr_cpu_ids=32.
[    0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16,
nr_cpu_ids=32
[    0.000000] NR_IRQS: 524544, nr_irqs: 1088, preallocated irqs: 16
..
..
    0.190147] smp: Brought up 2 nodes, 16 CPUs
[    0.192006] smpboot: Max logical packages: 4
[    0.193007] smpboot: Total of 16 processors activated (76776.33
BogoMIPS)
[    0.940640] node 0 initialised, 10803218 pages in 743ms
[    1.005449] node 1 initialised, 11812066 pages in 807ms
..
..
[    7.440896] hpsa 0000:05:00.0: can't disable ASPM; OS doesn't have
ASPM control
[    7.442071] hpsa 0000:05:00.0: Logical aborts not supported
[    7.442075] hpsa 0000:05:00.0: HP SSD Smart Path aborts not
supported
[    7.442164] hpsa 0000:05:00.0: Controller Configuration information
[    7.442167] hpsa 0000:05:00.0: ------------------------------------
[    7.442173] hpsa 0000:05:00.0:    Signature = CISS
[    7.442177] hpsa 0000:05:00.0:    Spec Number = 3
[    7.442182] hpsa 0000:05:00.0:    Transport methods supported =
0x7a000007
[    7.442186] hpsa 0000:05:00.0:    Transport methods active = 0x3
[    7.442190] hpsa 0000:05:00.0:    Requested transport Method = 0x2
[    7.442194] hpsa 0000:05:00.0:    Coalesce Interrupt Delay = 0x0
[    7.442198] hpsa 0000:05:00.0:    Coalesce Interrupt Count = 0x1
[    7.442202] hpsa 0000:05:00.0:    Max outstanding commands = 1024
[    7.442206] hpsa 0000:05:00.0:    Bus Types = 0x200000
[    7.442220] hpsa 0000:05:00.0:    Server Name = 2M21220149
[    7.442224] hpsa 0000:05:00.0:    Heartbeat Counter = 0xd23
[    7.442224] 
[    7.442224] 
..
..
  246.751135] INFO: task systemd-udevd:413 blocked for more than 120
seconds.
[  246.788008]       Tainted: G          I      4.15.0-rc4.noming+ #1
[  246.822380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  246.865594] systemd-udevd   D    0   413    411 0x80000004
[  246.895519] Call Trace:
[  246.909713]  ? __schedule+0x340/0xc20
[  246.930236]  schedule+0x32/0x80
[  246.947905]  schedule_timeout+0x23d/0x450
[  246.970047]  ? find_held_lock+0x2d/0x90
[  246.991774]  ? wait_for_completion_io+0x108/0x170
[  247.018172]  io_schedule_timeout+0x19/0x40
[  247.041208]  wait_for_completion_io+0x110/0x170
[  247.067326]  ? wake_up_q+0x70/0x70
[  247.086801]  hpsa_scsi_do_simple_cmd+0xc6/0x100 [hpsa]
[  247.114315]  hpsa_scsi_do_simple_cmd_with_retry+0xb7/0x1c0 [hpsa]
[  247.146629]  hpsa_scsi_do_inquiry+0x73/0xd0 [hpsa]
[  247.174118]  hpsa_init_one+0x12cb/0x1a59 [hpsa]
[  247.199851]  ? __pm_runtime_resume+0x55/0x70
[  247.224527]  local_pci_probe+0x3f/0xa0
[  247.246034]  pci_device_probe+0x146/0x1b0
[  247.268413]  driver_probe_device+0x2b3/0x4a0
[  247.291868]  __driver_attach+0xda/0xe0
[  247.313370]  ? driver_probe_device+0x4a0/0x4a0
[  247.338399]  bus_for_each_dev+0x6a/0xb0
[  247.359912]  bus_add_driver+0x41/0x260
[  247.380244]  driver_register+0x5b/0xd0
[  247.400811]  ? 0xffffffffc016b000
[  247.418819]  hpsa_init+0x38/0x1000 [hpsa]
[  247.440763]  ? 0xffffffffc016b000
[  247.459451]  do_one_initcall+0x4d/0x19c
[  247.480539]  ? do_init_module+0x22/0x220
[  247.502575]  ? rcu_read_lock_sched_held+0x64/0x70
[  247.529549]  ? kmem_cache_alloc_trace+0x1f7/0x260
[  247.556204]  ? do_init_module+0x22/0x220
[  247.578633]  do_init_module+0x5a/0x220
[  247.600322]  load_module+0x21e8/0x2a50
[  247.621648]  ? __symbol_put+0x60/0x60
[  247.642796]  SYSC_finit_module+0x94/0xe0
[  247.665336]  entry_SYSCALL_64_fastpath+0x1f/0x96
[  247.691751] RIP: 0033:0x7fc63d6527f9
[  247.712308] RSP: 002b:00007ffdf1659ba8 EFLAGS: 00000246 ORIG_RAX:
0000000000000139
[  247.755272] RAX: ffffffffffffffda RBX: 0000556b524c5f70 RCX:
00007fc63d6527f9
[  247.795779] RDX: 0000000000000000 RSI: 00007fc63df6f099 RDI:
0000000000000008
[  247.836413] RBP: 00007fc63df6f099 R08: 0000000000000000 R09:
0000556b524be760
[  247.876395] R10: 0000000000000008 R11: 0000000000000246 R12:
0000000000000000
[  247.917597] R13: 0000556b524c5f10 R14: 0000000000020000 R15:
0000000000000000
[  247.957272] 
[  247.957272] Showing all locks held in the system:
[  247.992019] 1 lock held by khungtaskd/118:
[  248.015019]  #0:  (tasklist_lock){.+.+}, at: [<000000004ef3538d>]
debug_show_all_locks+0x39/0x1b0
[  248.064600] 2 locks held by systemd-udevd/413:
[  248.090031]  #0:  (&dev->mutex){....}, at: [<000000002a395ec8>]
__driver_attach+0x4a/0xe0
[  248.136620]  #1:  (&dev->mutex){....}, at: [<00000000d9def23c>]
__driver_attach+0x58/0xe0
[  248.183245] 
[  248.191675] =============================================
[  248.191675] 
[  314.825134] dracut-initqueue[437]: Warning: dracut-initqueue timeout
- starting timeout scripts
[  315.368421] dracut-initqueue[437]: Warning: dracut-initqueue timeout
- starting timeout scripts
[  315.894373] dracut-initqueue[437]: Warning: dracut-initqueue timeout
- starting timeout scripts
[  316.418385] dracut-initqueue[437]: Warning: dracut-initqueue timeout
- starting timeout scripts
[  316.944461] dracut-initqueue[437]: Warning: dracut-initqueue timeout
- starting timeout scripts
[  317.466708] dracut-initqueue[437]: Warning: dracut-initqueue timeout
- starting timeout scripts
[  317.994380] dracut-initqueue[437]: Warning: dracut-initqueue timeout
- starti

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
@ 2018-01-16 15:22           ` Don Brace
  0 siblings, 0 replies; 23+ messages in thread
From: Don Brace @ 2018-01-16 15:22 UTC (permalink / raw)
  To: Laurence Oberman, Thomas Gleixner, Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, linux-block, linux-kernel, Mike Snitzer

> -----Original Message-----
> From: Laurence Oberman [mailto:loberman@redhat.com]
> Sent: Tuesday, January 16, 2018 7:29 AM
> To: Thomas Gleixner <tglx@linutronix.de>; Ming Lei <ming.lei@redhat.com>
> Cc: Christoph Hellwig <hch@infradead.org>; Jens Axboe <axboe@fb.com>;
> linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Mike Snitzer
> <snitzer@redhat.com>; Don Brace <don.brace@microsemi.com>
> Subject: Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined
> to irq vector
> 
> > > It is because of irq_create_affinity_masks().
> >
> > That still does not answer the question. If the interrupt for a queue
> > is
> > assigned to an offline CPU, then the queue should not be used and
> > never
> > raise an interrupt. That's how managed interrupts have been designed.
> >
> > Thanks,
> >
> >       tglx
> >
> >
> >
> >
> 
> I captured a full boot log for this issue for Microsemi, I will send it
> to Don Brace.
> I enabled all the HPSA debug and here is snippet
> 
> 
> ..
> ..
> ..
>   246.751135] INFO: task systemd-udevd:413 blocked for more than 120
> seconds.
> [  246.788008]       Tainted: G          I      4.15.0-rc4.noming+ #1
> [  246.822380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  246.865594] systemd-udevd   D    0   413    411 0x80000004
> [  246.895519] Call Trace:
> [  246.909713]  ? __schedule+0x340/0xc20
> [  246.930236]  schedule+0x32/0x80
> [  246.947905]  schedule_timeout+0x23d/0x450
> [  246.970047]  ? find_held_lock+0x2d/0x90
> [  246.991774]  ? wait_for_completion_io+0x108/0x170
> [  247.018172]  io_schedule_timeout+0x19/0x40
> [  247.041208]  wait_for_completion_io+0x110/0x170
> [  247.067326]  ? wake_up_q+0x70/0x70
> [  247.086801]  hpsa_scsi_do_simple_cmd+0xc6/0x100 [hpsa]
> [  247.114315]  hpsa_scsi_do_simple_cmd_with_retry+0xb7/0x1c0 [hpsa]
> [  247.146629]  hpsa_scsi_do_inquiry+0x73/0xd0 [hpsa]
> [  247.174118]  hpsa_init_one+0x12cb/0x1a59 [hpsa]

This trace comes from internally generated discovery commands. No SCSI devices have
been presented to the SML yet.

At this point we should be running on only one CPU. These commands are meant to use
reply queue 0, which is tied to CPU 0. It's interesting that the patch helps.

However, I was wondering if you could inspect the iLo IML logs and send the
AHS logs for inspection.

Thanks,
Don Brace
ESC - Smart Storage
Microsemi Corporation

> [  247.199851]  ? __pm_runtime_resume+0x55/0x70
> [  247.224527]  local_pci_probe+0x3f/0xa0
> [  247.246034]  pci_device_probe+0x146/0x1b0
> [  247.268413]  driver_probe_device+0x2b3/0x4a0
> [  247.291868]  __driver_attach+0xda/0xe0
> [  247.313370]  ? driver_probe_device+0x4a0/0x4a0
> [  247.338399]  bus_for_each_dev+0x6a/0xb0
> [  247.359912]  bus_add_driver+0x41/0x260
> [  247.380244]  driver_register+0x5b/0xd0
> [  247.400811]  ? 0xffffffffc016b000
> [  247.418819]  hpsa_init+0x38/0x1000 [hpsa]
> [  247.440763]  ? 0xffffffffc016b000
> [  247.459451]  do_one_initcall+0x4d/0x19c
> [  247.480539]  ? do_init_module+0x22/0x220
> [  247.502575]  ? rcu_read_lock_sched_held+0x64/0x70
> [  247.529549]  ? kmem_cache_alloc_trace+0x1f7/0x260
> [  247.556204]  ? do_init_module+0x22/0x220
> [  247.578633]  do_init_module+0x5a/0x220
> [  247.600322]  load_module+0x21e8/0x2a50
> [  247.621648]  ? __symbol_put+0x60/0x60
> [  247.642796]  SYSC_finit_module+0x94/0xe0
> [  247.665336]  entry_SYSCALL_64_fastpath+0x1f/0x96
> [  247.691751] RIP: 0033:0x7fc63d6527f9
> [  247.712308] RSP: 002b:00007ffdf1659ba8 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000139
> [  247.755272] RAX: ffffffffffffffda RBX: 0000556b524c5f70 RCX:
> 00007fc63d6527f9
> [  247.795779] RDX: 0000000000000000 RSI: 00007fc63df6f099 RDI:
> 0000000000000008
> [  247.836413] RBP: 00007fc63df6f099 R08: 0000000000000000 R09:
> 0000556b524be760
> [  247.876395] R10: 0000000000000008 R11: 0000000000000246 R12:
> 0000000000000000
> [  247.917597] R13: 0000556b524c5f10 R14: 0000000000020000 R15:
> 0000000000000000
> [  247.957272]
> [  247.957272] Showing all locks held in the system:
> [  247.992019] 1 lock held by khungtaskd/118:
> [  248.015019]  #0:  (tasklist_lock){.+.+}, at: [<000000004ef3538d>]
> debug_show_all_locks+0x39/0x1b0
> [  248.064600] 2 locks held by systemd-udevd/413:
> [  248.090031]  #0:  (&dev->mutex){....}, at: [<000000002a395ec8>]
> __driver_attach+0x4a/0xe0
> [  248.136620]  #1:  (&dev->mutex){....}, at: [<00000000d9def23c>]
> __driver_attach+0x58/0xe0
> [  248.183245]
> [  248.191675] =============================================
> [  248.191675]
> [  314.825134] dracut-initqueue[437]: Warning: dracut-initqueue timeout
> - starting timeout scripts
> [  315.368421] dracut-initqueue[437]: Warning: dracut-initqueue timeout
> - starting timeout scripts
> [  315.894373] dracut-initqueue[437]: Warning: dracut-initqueue timeout
> - starting timeout scripts
> [  316.418385] dracut-initqueue[437]: Warning: dracut-initqueue timeout
> - starting timeout scripts
> [  316.944461] dracut-initqueue[437]: Warning: dracut-initqueue timeout
> - starting timeout scripts
> [  317.466708] dracut-initqueue[437]: Warning: dracut-initqueue timeout
> - starting timeout scripts
> [  317.994380] dracut-initqueue[437]: Warning: dracut-initqueue timeout
> - starti


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
  2018-01-16 15:22           ` Don Brace
  (?)
@ 2018-01-16 15:35           ` Laurence Oberman
  -1 siblings, 0 replies; 23+ messages in thread
From: Laurence Oberman @ 2018-01-16 15:35 UTC (permalink / raw)
  To: Don Brace, Thomas Gleixner, Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, linux-block, linux-kernel, Mike Snitzer

On Tue, 2018-01-16 at 15:22 +0000, Don Brace wrote:
> > -----Original Message-----
> > From: Laurence Oberman [mailto:loberman@redhat.com]
> > Sent: Tuesday, January 16, 2018 7:29 AM
> > To: Thomas Gleixner <tglx@linutronix.de>; Ming Lei <ming.lei@redhat
> > .com>
> > Cc: Christoph Hellwig <hch@infradead.org>; Jens Axboe <axboe@fb.com
> > >;
> > linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Mike
> > Snitzer
> > <snitzer@redhat.com>; Don Brace <don.brace@microsemi.com>
> > Subject: Re: [PATCH 0/2] genirq/affinity: try to make sure online
> > CPU is assgined
> > to irq vector
> > 
> > > > It is because of irq_create_affinity_masks().
> > > 
> > > That still does not answer the question. If the interrupt for a
> > > queue
> > > is
> > > assigned to an offline CPU, then the queue should not be used and
> > > never
> > > raise an interrupt. That's how managed interrupts have been
> > > designed.
> > > 
> > > Thanks,
> > > 
> > >       tglx
> > > 
> > > 
> > > 
> > > 
> > 
> > I captured a full boot log for this issue for Microsemi, I will
> > send it
> > to Don Brace.
> > I enabled all the HPSA debug and here is snippet
> > 
> > 
> > ..
> > ..
> > ..
> >   246.751135] INFO: task systemd-udevd:413 blocked for more than
> > 120
> > seconds.
> > [  246.788008]       Tainted: G          I      4.15.0-rc4.noming+
> > #1
> > [  246.822380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  246.865594] systemd-udevd   D    0   413    411 0x80000004
> > [  246.895519] Call Trace:
> > [  246.909713]  ? __schedule+0x340/0xc20
> > [  246.930236]  schedule+0x32/0x80
> > [  246.947905]  schedule_timeout+0x23d/0x450
> > [  246.970047]  ? find_held_lock+0x2d/0x90
> > [  246.991774]  ? wait_for_completion_io+0x108/0x170
> > [  247.018172]  io_schedule_timeout+0x19/0x40
> > [  247.041208]  wait_for_completion_io+0x110/0x170
> > [  247.067326]  ? wake_up_q+0x70/0x70
> > [  247.086801]  hpsa_scsi_do_simple_cmd+0xc6/0x100 [hpsa]
> > [  247.114315]  hpsa_scsi_do_simple_cmd_with_retry+0xb7/0x1c0
> > [hpsa]
> > [  247.146629]  hpsa_scsi_do_inquiry+0x73/0xd0 [hpsa]
> > [  247.174118]  hpsa_init_one+0x12cb/0x1a59 [hpsa]
> 
> This trace comes from internally generated discovery commands. No
> SCSI devices have
> been presented to the SML yet.
> 
> At this point we should be running on only one CPU. These commands
> are meant to use
> reply queue 0 which are tied to CPU 0. It's interesting that the
> patch helps.
> 
> However, I was wondering if you could inspect the iLo IML logs and
> send the
> AHS logs for inspection.
> 
> Thanks,
> Don Brace
> ESC - Smart Storage
> Microsemi Corporation


Hello Don

I took two other DL380 G7s and ran the same kernel, and it hangs in the
identical place. It's absolutely consistent here.

I doubt all three have hardware issues.

Nothing is logged of interest in the IML.

Ming will have more to share on specifically why it helps.
I think he sent that along to you already.

Regards
Laurence

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
@ 2018-01-16 15:47             ` Ming Lei
  0 siblings, 0 replies; 23+ messages in thread
From: Ming Lei @ 2018-01-16 15:47 UTC (permalink / raw)
  To: Don Brace
  Cc: Laurence Oberman, Thomas Gleixner, Christoph Hellwig, Jens Axboe,
	linux-block, linux-kernel, Mike Snitzer

On Tue, Jan 16, 2018 at 03:22:18PM +0000, Don Brace wrote:
> > -----Original Message-----
> > From: Laurence Oberman [mailto:loberman@redhat.com]
> > Sent: Tuesday, January 16, 2018 7:29 AM
> > To: Thomas Gleixner <tglx@linutronix.de>; Ming Lei <ming.lei@redhat.com>
> > Cc: Christoph Hellwig <hch@infradead.org>; Jens Axboe <axboe@fb.com>;
> > linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Mike Snitzer
> > <snitzer@redhat.com>; Don Brace <don.brace@microsemi.com>
> > Subject: Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined
> > to irq vector
> > 
> > > > It is because of irq_create_affinity_masks().
> > >
> > > That still does not answer the question. If the interrupt for a queue
> > > is
> > > assigned to an offline CPU, then the queue should not be used and
> > > never
> > > raise an interrupt. That's how managed interrupts have been designed.
> > >
> > > Thanks,
> > >
> > >       tglx
> > >
> > >
> > >
> > >
> > 
> > I captured a full boot log for this issue for Microsemi, I will send it
> > to Don Brace.
> > I enabled all the HPSA debug and here is snippet
> > 
> > 
> > ..
> > ..
> > ..
> >   246.751135] INFO: task systemd-udevd:413 blocked for more than 120
> > seconds.
> > [  246.788008]       Tainted: G          I      4.15.0-rc4.noming+ #1
> > [  246.822380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  246.865594] systemd-udevd   D    0   413    411 0x80000004
> > [  246.895519] Call Trace:
> > [  246.909713]  ? __schedule+0x340/0xc20
> > [  246.930236]  schedule+0x32/0x80
> > [  246.947905]  schedule_timeout+0x23d/0x450
> > [  246.970047]  ? find_held_lock+0x2d/0x90
> > [  246.991774]  ? wait_for_completion_io+0x108/0x170
> > [  247.018172]  io_schedule_timeout+0x19/0x40
> > [  247.041208]  wait_for_completion_io+0x110/0x170
> > [  247.067326]  ? wake_up_q+0x70/0x70
> > [  247.086801]  hpsa_scsi_do_simple_cmd+0xc6/0x100 [hpsa]
> > [  247.114315]  hpsa_scsi_do_simple_cmd_with_retry+0xb7/0x1c0 [hpsa]
> > [  247.146629]  hpsa_scsi_do_inquiry+0x73/0xd0 [hpsa]
> > [  247.174118]  hpsa_init_one+0x12cb/0x1a59 [hpsa]
> 
> This trace comes from internally generated discovery commands. No SCSI devices have
> been presented to the SML yet.
> 
> At this point we should be running on only one CPU. These commands are meant to use
> reply queue 0, which is tied to CPU 0. It's interesting that the patch helps.

In hpsa_interrupt_mode(), you pass PCI_IRQ_AFFINITY to pci_alloc_irq_vectors(),
which may leave one irq vector assigned to only offline CPUs. From my
observation, that is the cause of the hang reported by Laurence.

BTW, if the interrupt handler for the reply queue isn't performance sensitive,
maybe PCI_IRQ_AFFINITY can simply be dropped to avoid this issue.
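
Just to illustrate the idea (untested sketch, not actual hpsa code; the
vector limits and the h->nreply_queues assignment are only placeholders
for whatever hpsa really uses there):

	/*
	 * Allocate plain, non-managed vectors: no pre-computed affinity
	 * spread over all possible CPUs, so no vector can end up with a
	 * mask that contains only offline CPUs.
	 */
	rc = pci_alloc_irq_vectors(h->pdev, 1, MAX_REPLY_QUEUES,
				   PCI_IRQ_MSIX | PCI_IRQ_MSI | PCI_IRQ_LEGACY);
	if (rc < 0)
		return rc;
	h->nreply_queues = rc;

The trade-off is that the vectors are then balanced by the usual
irqbalance/userspace policy instead of being spread per reply queue by
the irq core.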

But anyway, as I replied in this thread, this patch still improves the irq
vector spread.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
  2018-01-16 15:22           ` Don Brace
@ 2018-02-01 10:36             ` Ming Lei
  -1 siblings, 0 replies; 23+ messages in thread
From: Ming Lei @ 2018-02-01 10:36 UTC (permalink / raw)
  To: Don Brace
  Cc: Laurence Oberman, Thomas Gleixner, Christoph Hellwig, Jens Axboe,
	linux-block, linux-kernel, Mike Snitzer

On Tue, Jan 16, 2018 at 03:22:18PM +0000, Don Brace wrote:
> > -----Original Message-----
> > From: Laurence Oberman [mailto:loberman@redhat.com]
> > Sent: Tuesday, January 16, 2018 7:29 AM
> > To: Thomas Gleixner <tglx@linutronix.de>; Ming Lei <ming.lei@redhat.com>
> > Cc: Christoph Hellwig <hch@infradead.org>; Jens Axboe <axboe@fb.com>;
> > linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Mike Snitzer
> > <snitzer@redhat.com>; Don Brace <don.brace@microsemi.com>
> > Subject: Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined
> > to irq vector
> > 
> > > > It is because of irq_create_affinity_masks().
> > >
> > > That still does not answer the question. If the interrupt for a queue
> > > is
> > > assigned to an offline CPU, then the queue should not be used and
> > > never
> > > raise an interrupt. That's how managed interrupts have been designed.
> > >
> > > Thanks,
> > >
> > >       tglx
> > >
> > >
> > >
> > >
> > 
> > I captured a full boot log for this issue for Microsemi, I will send it
> > to Don Brace.
> > I enabled all the HPSA debug and here is snippet
> > 
> > 
> > ..
> > ..
> > ..
> >   246.751135] INFO: task systemd-udevd:413 blocked for more than 120
> > seconds.
> > [  246.788008]       Tainted: G          I      4.15.0-rc4.noming+ #1
> > [  246.822380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  246.865594] systemd-udevd   D    0   413    411 0x80000004
> > [  246.895519] Call Trace:
> > [  246.909713]  ? __schedule+0x340/0xc20
> > [  246.930236]  schedule+0x32/0x80
> > [  246.947905]  schedule_timeout+0x23d/0x450
> > [  246.970047]  ? find_held_lock+0x2d/0x90
> > [  246.991774]  ? wait_for_completion_io+0x108/0x170
> > [  247.018172]  io_schedule_timeout+0x19/0x40
> > [  247.041208]  wait_for_completion_io+0x110/0x170
> > [  247.067326]  ? wake_up_q+0x70/0x70
> > [  247.086801]  hpsa_scsi_do_simple_cmd+0xc6/0x100 [hpsa]
> > [  247.114315]  hpsa_scsi_do_simple_cmd_with_retry+0xb7/0x1c0 [hpsa]
> > [  247.146629]  hpsa_scsi_do_inquiry+0x73/0xd0 [hpsa]
> > [  247.174118]  hpsa_init_one+0x12cb/0x1a59 [hpsa]
> 
> This trace comes from internally generated discovery commands. No SCSI devices have
> been presented to the SML yet.
> 
> At this point we should be running on only one CPU. These commands are meant to use
> reply queue 0, which is tied to CPU 0. It's interesting that the patch helps.
> 
> However, I was wondering if you could inspect the iLo IML logs and send the
> AHS logs for inspection.

Hello Don,

Now the patch has been merged to linus tree as:

84676c1f21e8ff54b ("genirq/affinity: assign vectors to all possible CPUs")

and it breaks Laurence's machine completely, :-(

I just took a look at the HPSA code, and found that the reply queue is
chosen in the following way in most code paths:

        if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
                cp->ReplyQueue = smp_processor_id() % h->nreply_queues;

h->nreply_queues is the number of MSI-X vectors returned from
pci_alloc_irq_vectors(), and now some of those vectors may be mapped to
only offline CPUs, for example, when one processor isn't plugged into its
socket.

If I understand correctly, 'cp->ReplyQueue' is tied to one irq vector,
and the command is expected to be handled via that irq vector, is that
right?

If yes, I guess this way can't work any more if the number of online
CPUs is >= h->nreply_queues, and you may need to check the cpu affinity
of a vector before choosing the reply queue; block/blk-mq-pci.c
may be helpful for you.
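
Something along these lines is what I have in mind (untested sketch, not
actual hpsa code; h->reply_map is a made-up per-CPU table, built in the
same spirit as what blk_mq_pci_map_queues() does for blk-mq):

	/*
	 * Map each possible CPU to a reply queue whose irq vector really
	 * covers that CPU, based on the affinity masks set up by
	 * pci_alloc_irq_vectors(PCI_IRQ_AFFINITY).
	 */
	static void hpsa_setup_reply_map(struct ctlr_info *h)
	{
		const struct cpumask *mask;
		unsigned int queue, cpu;

		for (queue = 0; queue < h->nreply_queues; queue++) {
			mask = pci_irq_get_affinity(h->pdev, queue);
			if (!mask)
				goto fallback;

			for_each_cpu(cpu, mask)
				h->reply_map[cpu] = queue;
		}
		return;

	fallback:
		for_each_possible_cpu(cpu)
			h->reply_map[cpu] = cpu % h->nreply_queues;
	}

Then the submission path could use h->reply_map[raw_smp_processor_id()]
instead of the plain modulo, so the chosen reply queue always covers the
submitting CPU.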

Thanks,
Ming

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
@ 2018-02-01 10:36             ` Ming Lei
  0 siblings, 0 replies; 23+ messages in thread
From: Ming Lei @ 2018-02-01 10:36 UTC (permalink / raw)
  To: Don Brace
  Cc: Laurence Oberman, Thomas Gleixner, Christoph Hellwig, Jens Axboe,
	linux-block, linux-kernel, Mike Snitzer

On Tue, Jan 16, 2018 at 03:22:18PM +0000, Don Brace wrote:
> > -----Original Message-----
> > From: Laurence Oberman [mailto:loberman@redhat.com]
> > Sent: Tuesday, January 16, 2018 7:29 AM
> > To: Thomas Gleixner <tglx@linutronix.de>; Ming Lei <ming.lei@redhat.com>
> > Cc: Christoph Hellwig <hch@infradead.org>; Jens Axboe <axboe@fb.com>;
> > linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Mike Snitzer
> > <snitzer@redhat.com>; Don Brace <don.brace@microsemi.com>
> > Subject: Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined
> > to irq vector
> > 
> > > > It is because of irq_create_affinity_masks().
> > >
> > > That still does not answer the question. If the interrupt for a queue
> > > is
> > > assigned to an offline CPU, then the queue should not be used and
> > > never
> > > raise an interrupt. That's how managed interrupts have been designed.
> > >
> > > Thanks,
> > >
> > >       tglx
> > >
> > >
> > >
> > >
> > 
> > I captured a full boot log for this issue for Microsemi, I will send it
> > to Don Brace.
> > I enabled all the HPSA debug and here is snippet
> > 
> > 
> > ..
> > ..
> > ..
> >   246.751135] INFO: task systemd-udevd:413 blocked for more than 120
> > seconds.
> > [  246.788008]       Tainted: G          I      4.15.0-rc4.noming+ #1
> > [  246.822380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  246.865594] systemd-udevd   D    0   413    411 0x80000004
> > [  246.895519] Call Trace:
> > [  246.909713]  ? __schedule+0x340/0xc20
> > [  246.930236]  schedule+0x32/0x80
> > [  246.947905]  schedule_timeout+0x23d/0x450
> > [  246.970047]  ? find_held_lock+0x2d/0x90
> > [  246.991774]  ? wait_for_completion_io+0x108/0x170
> > [  247.018172]  io_schedule_timeout+0x19/0x40
> > [  247.041208]  wait_for_completion_io+0x110/0x170
> > [  247.067326]  ? wake_up_q+0x70/0x70
> > [  247.086801]  hpsa_scsi_do_simple_cmd+0xc6/0x100 [hpsa]
> > [  247.114315]  hpsa_scsi_do_simple_cmd_with_retry+0xb7/0x1c0 [hpsa]
> > [  247.146629]  hpsa_scsi_do_inquiry+0x73/0xd0 [hpsa]
> > [  247.174118]  hpsa_init_one+0x12cb/0x1a59 [hpsa]
> 
> This trace comes from internally generated discovery commands. No SCSI devices have
> been presented to the SML yet.
> 
> At this point we should be running on only one CPU. These commands are meant to use
> reply queue 0, which is tied to CPU 0. It's interesting that the patch helps.
> 
> However, I was wondering if you could inspect the iLo IML logs and send the
> AHS logs for inspection.

Hello Don,

Now the patch has been merged to linus tree as:

84676c1f21e8ff54b ("genirq/affinity: assign vectors to all possible CPUs")

and it breaks Laurence's machine completely, :-(

I just took a look at the HPSA code, and found that the reply queue is
chosen in the following way in most code paths:

        if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
                cp->ReplyQueue = smp_processor_id() % h->nreply_queues;

h->nreply_queues is the number of MSI-X vectors returned from
pci_alloc_irq_vectors(), and now some of those vectors may be mapped to
only offline CPUs, for example, when one processor isn't plugged into its
socket.

If I understand correctly, 'cp->ReplyQueue' is tied to one irq vector,
and the command is expected to be handled via that irq vector, is that
right?

If yes, I guess this way can't work any more if the number of online
CPUs is >= h->nreply_queues, and you may need to check the cpu affinity
of a vector before choosing the reply queue; block/blk-mq-pci.c
may be helpful for you.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
  2018-02-01 10:36             ` Ming Lei
@ 2018-02-01 14:53               ` Don Brace
  -1 siblings, 0 replies; 23+ messages in thread
From: Don Brace @ 2018-02-01 14:53 UTC (permalink / raw)
  To: Ming Lei
  Cc: Laurence Oberman, Thomas Gleixner, Christoph Hellwig, Jens Axboe,
	linux-block, linux-kernel, Mike Snitzer

> -----Original Message-----
> From: Ming Lei [mailto:ming.lei@redhat.com]
> Sent: Thursday, February 01, 2018 4:37 AM
> To: Don Brace <don.brace@microsemi.com>
> Cc: Laurence Oberman <loberman@redhat.com>; Thomas Gleixner
> <tglx@linutronix.de>; Christoph Hellwig <hch@infradead.org>; Jens Axboe
> <axboe@fb.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org;
> Mike Snitzer <snitzer@redhat.com>
> Subject: Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined
> to irq vector
> 
> EXTERNAL EMAIL
> 
> 
> On Tue, Jan 16, 2018 at 03:22:18PM +0000, Don Brace wrote:
> > > -----Original Message-----
> > > From: Laurence Oberman [mailto:loberman@redhat.com]
> > > Sent: Tuesday, January 16, 2018 7:29 AM
> > > To: Thomas Gleixner <tglx@linutronix.de>; Ming Lei <ming.lei@redhat.com>
> > > Cc: Christoph Hellwig <hch@infradead.org>; Jens Axboe <axboe@fb.com>;
> > > linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Mike Snitzer
> > > <snitzer@redhat.com>; Don Brace <don.brace@microsemi.com>
> > > Subject: Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is
> assgined
> > > to irq vector
> > >
> > > > > It is because of irq_create_affinity_masks().
> > > >
> > > > That still does not answer the question. If the interrupt for a queue
> > > > is
> > > > assigned to an offline CPU, then the queue should not be used and
> > > > never
> > > > raise an interrupt. That's how managed interrupts have been designed.
> > > >
> > > > Thanks,
> > > >
> > > >       tglx
> > > >
> > > >
> > > >
> > > >
> > >
> > > I captured a full boot log for this issue for Microsemi, I will send it
> > > to Don Brace.
> > > I enabled all the HPSA debug and here is snippet
> > >
> > >
> > > ..
> > > ..
> > > ..
> > >   246.751135] INFO: task systemd-udevd:413 blocked for more than 120
> > > seconds.
> > > [  246.788008]       Tainted: G          I      4.15.0-rc4.noming+ #1
> > > [  246.822380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [  246.865594] systemd-udevd   D    0   413    411 0x80000004
> > > [  246.895519] Call Trace:
> > > [  246.909713]  ? __schedule+0x340/0xc20
> > > [  246.930236]  schedule+0x32/0x80
> > > [  246.947905]  schedule_timeout+0x23d/0x450
> > > [  246.970047]  ? find_held_lock+0x2d/0x90
> > > [  246.991774]  ? wait_for_completion_io+0x108/0x170
> > > [  247.018172]  io_schedule_timeout+0x19/0x40
> > > [  247.041208]  wait_for_completion_io+0x110/0x170
> > > [  247.067326]  ? wake_up_q+0x70/0x70
> > > [  247.086801]  hpsa_scsi_do_simple_cmd+0xc6/0x100 [hpsa]
> > > [  247.114315]  hpsa_scsi_do_simple_cmd_with_retry+0xb7/0x1c0 [hpsa]
> > > [  247.146629]  hpsa_scsi_do_inquiry+0x73/0xd0 [hpsa]
> > > [  247.174118]  hpsa_init_one+0x12cb/0x1a59 [hpsa]
> >
> > This trace comes from internally generated discovery commands. No SCSI
> devices have
> > been presented to the SML yet.
> >
> > At this point we should be running on only one CPU. These commands are
> meant to use
> > reply queue 0, which is tied to CPU 0. It's interesting that the patch helps.
> >
> > However, I was wondering if you could inspect the iLo IML logs and send the
> > AHS logs for inspection.
> 
> Hello Don,
> 
> Now the patch has been merged to linus tree as:
> 
> 84676c1f21e8ff54b ("genirq/affinity: assign vectors to all possible CPUs")
> 
> and it breaks Laurence's machine completely, :-(
> 
> I just took a look at the HPSA code, and found that the reply queue is
> chosen in the following way in most code paths:
> 
>         if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
>                 cp->ReplyQueue = smp_processor_id() % h->nreply_queues;
> 
> h->nreply_queues is the number of MSI-X vectors returned from
> pci_alloc_irq_vectors(), and now some of those vectors may be mapped to
> only offline CPUs, for example, when one processor isn't plugged into its
> socket.
> 
> If I understand correctly, 'cp->ReplyQueue' is tied to one irq vector,
> and the command is expected to be handled via that irq vector, is that
> right?
> 
> If yes, I guess this way can't work any more if the number of online
> CPUs is >= h->nreply_queues, and you may need to check the cpu affinity
> of a vector before choosing the reply queue; block/blk-mq-pci.c
> may be helpful for you.
> 
> Thanks,
> Ming

Thanks Ming,
I'll start working up a patch.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
@ 2018-02-01 14:53               ` Don Brace
  0 siblings, 0 replies; 23+ messages in thread
From: Don Brace @ 2018-02-01 14:53 UTC (permalink / raw)
  To: Ming Lei
  Cc: Laurence Oberman, Thomas Gleixner, Christoph Hellwig, Jens Axboe,
	linux-block, linux-kernel, Mike Snitzer

> -----Original Message-----
> From: Ming Lei [mailto:ming.lei@redhat.com]
> Sent: Thursday, February 01, 2018 4:37 AM
> To: Don Brace <don.brace@microsemi.com>
> Cc: Laurence Oberman <loberman@redhat.com>; Thomas Gleixner
> <tglx@linutronix.de>; Christoph Hellwig <hch@infradead.org>; Jens Axboe
> <axboe@fb.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org;
> Mike Snitzer <snitzer@redhat.com>
> Subject: Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined
> to irq vector
> 
> EXTERNAL EMAIL
> 
> 
> On Tue, Jan 16, 2018 at 03:22:18PM +0000, Don Brace wrote:
> > > -----Original Message-----
> > > From: Laurence Oberman [mailto:loberman@redhat.com]
> > > Sent: Tuesday, January 16, 2018 7:29 AM
> > > To: Thomas Gleixner <tglx@linutronix.de>; Ming Lei <ming.lei@redhat.com>
> > > Cc: Christoph Hellwig <hch@infradead.org>; Jens Axboe <axboe@fb.com>;
> > > linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Mike Snitzer
> > > <snitzer@redhat.com>; Don Brace <don.brace@microsemi.com>
> > > Subject: Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is
> assgined
> > > to irq vector
> > >
> > > > > It is because of irq_create_affinity_masks().
> > > >
> > > > That still does not answer the question. If the interrupt for a queue
> > > > is
> > > > assigned to an offline CPU, then the queue should not be used and
> > > > never
> > > > raise an interrupt. That's how managed interrupts have been designed.
> > > >
> > > > Thanks,
> > > >
> > > >       tglx
> > > >
> > > >
> > > >
> > > >
> > >
> > > I captured a full boot log for this issue for Microsemi, I will send it
> > > to Don Brace.
> > > I enabled all the HPSA debug and here is snippet
> > >
> > >
> > > ..
> > > ..
> > > ..
> > >   246.751135] INFO: task systemd-udevd:413 blocked for more than 120
> > > seconds.
> > > [  246.788008]       Tainted: G          I      4.15.0-rc4.noming+ #1
> > > [  246.822380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [  246.865594] systemd-udevd   D    0   413    411 0x80000004
> > > [  246.895519] Call Trace:
> > > [  246.909713]  ? __schedule+0x340/0xc20
> > > [  246.930236]  schedule+0x32/0x80
> > > [  246.947905]  schedule_timeout+0x23d/0x450
> > > [  246.970047]  ? find_held_lock+0x2d/0x90
> > > [  246.991774]  ? wait_for_completion_io+0x108/0x170
> > > [  247.018172]  io_schedule_timeout+0x19/0x40
> > > [  247.041208]  wait_for_completion_io+0x110/0x170
> > > [  247.067326]  ? wake_up_q+0x70/0x70
> > > [  247.086801]  hpsa_scsi_do_simple_cmd+0xc6/0x100 [hpsa]
> > > [  247.114315]  hpsa_scsi_do_simple_cmd_with_retry+0xb7/0x1c0 [hpsa]
> > > [  247.146629]  hpsa_scsi_do_inquiry+0x73/0xd0 [hpsa]
> > > [  247.174118]  hpsa_init_one+0x12cb/0x1a59 [hpsa]
> >
> > This trace comes from internally generated discovery commands. No SCSI
> devices have
> > been presented to the SML yet.
> >
> > At this point we should be running on only one CPU. These commands are
> meant to use
> > reply queue 0, which is tied to CPU 0. It's interesting that the patch helps.
> >
> > However, I was wondering if you could inspect the iLo IML logs and send the
> > AHS logs for inspection.
> 
> Hello Don,
> 
> Now the patch has been merged to linus tree as:
> 
> 84676c1f21e8ff54b ("genirq/affinity: assign vectors to all possible CPUs")
> 
> and it breaks Laurence's machine completely, :-(
> 
> I just took a look at the HPSA code, and found that the reply queue is
> chosen in the following way in most code paths:
> 
>         if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
>                 cp->ReplyQueue = smp_processor_id() % h->nreply_queues;
> 
> h->nreply_queues is the number of MSI-X vectors returned from
> pci_alloc_irq_vectors(), and now some of those vectors may be mapped to
> only offline CPUs, for example, when one processor isn't plugged into its
> socket.
> 
> If I understand correctly, 'cp->ReplyQueue' is tied to one irq vector,
> and the command is expected to be handled via that irq vector, is that
> right?
> 
> If yes, I guess this way can't work any more if the number of online
> CPUs is >= h->nreply_queues, and you may need to check the cpu affinity
> of a vector before choosing the reply queue; block/blk-mq-pci.c
> may be helpful for you.
> 
> Thanks,
> Ming

Thanks Ming,
I'll start working up a patch.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
  2018-02-01 14:53               ` Don Brace
@ 2018-02-01 15:04                 ` Ming Lei
  -1 siblings, 0 replies; 23+ messages in thread
From: Ming Lei @ 2018-02-01 15:04 UTC (permalink / raw)
  To: Don Brace
  Cc: Laurence Oberman, Thomas Gleixner, Christoph Hellwig, Jens Axboe,
	linux-block, linux-kernel, Mike Snitzer, linux-scsi

On Thu, Feb 01, 2018 at 02:53:35PM +0000, Don Brace wrote:
> > -----Original Message-----
> > From: Ming Lei [mailto:ming.lei@redhat.com]
> > Sent: Thursday, February 01, 2018 4:37 AM
> > To: Don Brace <don.brace@microsemi.com>
> > Cc: Laurence Oberman <loberman@redhat.com>; Thomas Gleixner
> > <tglx@linutronix.de>; Christoph Hellwig <hch@infradead.org>; Jens Axboe
> > <axboe@fb.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org;
> > Mike Snitzer <snitzer@redhat.com>
> > Subject: Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined
> > to irq vector
> > 
> > EXTERNAL EMAIL
> > 
> > 
> > On Tue, Jan 16, 2018 at 03:22:18PM +0000, Don Brace wrote:
> > > > -----Original Message-----
> > > > From: Laurence Oberman [mailto:loberman@redhat.com]
> > > > Sent: Tuesday, January 16, 2018 7:29 AM
> > > > To: Thomas Gleixner <tglx@linutronix.de>; Ming Lei <ming.lei@redhat.com>
> > > > Cc: Christoph Hellwig <hch@infradead.org>; Jens Axboe <axboe@fb.com>;
> > > > linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Mike Snitzer
> > > > <snitzer@redhat.com>; Don Brace <don.brace@microsemi.com>
> > > > Subject: Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is
> > assgined
> > > > to irq vector
> > > >
> > > > > > It is because of irq_create_affinity_masks().
> > > > >
> > > > > That still does not answer the question. If the interrupt for a queue
> > > > > is
> > > > > assigned to an offline CPU, then the queue should not be used and
> > > > > never
> > > > > raise an interrupt. That's how managed interrupts have been designed.
> > > > >
> > > > > Thanks,
> > > > >
> > > > >       tglx
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > > I captured a full boot log for this issue for Microsemi, I will send it
> > > > to Don Brace.
> > > > I enabled all the HPSA debug and here is snippet
> > > >
> > > >
> > > > ..
> > > > ..
> > > > ..
> > > >   246.751135] INFO: task systemd-udevd:413 blocked for more than 120
> > > > seconds.
> > > > [  246.788008]       Tainted: G          I      4.15.0-rc4.noming+ #1
> > > > [  246.822380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > > disables this message.
> > > > [  246.865594] systemd-udevd   D    0   413    411 0x80000004
> > > > [  246.895519] Call Trace:
> > > > [  246.909713]  ? __schedule+0x340/0xc20
> > > > [  246.930236]  schedule+0x32/0x80
> > > > [  246.947905]  schedule_timeout+0x23d/0x450
> > > > [  246.970047]  ? find_held_lock+0x2d/0x90
> > > > [  246.991774]  ? wait_for_completion_io+0x108/0x170
> > > > [  247.018172]  io_schedule_timeout+0x19/0x40
> > > > [  247.041208]  wait_for_completion_io+0x110/0x170
> > > > [  247.067326]  ? wake_up_q+0x70/0x70
> > > > [  247.086801]  hpsa_scsi_do_simple_cmd+0xc6/0x100 [hpsa]
> > > > [  247.114315]  hpsa_scsi_do_simple_cmd_with_retry+0xb7/0x1c0 [hpsa]
> > > > [  247.146629]  hpsa_scsi_do_inquiry+0x73/0xd0 [hpsa]
> > > > [  247.174118]  hpsa_init_one+0x12cb/0x1a59 [hpsa]
> > >
> > > This trace comes from internally generated discovery commands. No SCSI
> > devices have
> > > been presented to the SML yet.
> > >
> > > At this point we should be running on only one CPU. These commands are
> > meant to use
> > > reply queue 0, which is tied to CPU 0. It's interesting that the patch helps.
> > >
> > > However, I was wondering if you could inspect the iLo IML logs and send the
> > > AHS logs for inspection.
> > 
> > Hello Don,
> > 
> > Now the patch has been merged to linus tree as:
> > 
> > 84676c1f21e8ff54b ("genirq/affinity: assign vectors to all possible CPUs")
> > 
> > and it breaks Laurence's machine completely, :-(
> > 
> > I just took a look at the HPSA code, and found that the reply queue is
> > chosen in the following way in most code paths:
> > 
> >         if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
> >                 cp->ReplyQueue = smp_processor_id() % h->nreply_queues;
> > 
> > h->nreply_queues is the number of MSI-X vectors returned from
> > pci_alloc_irq_vectors(), and now some of those vectors may be mapped to
> > only offline CPUs, for example, when one processor isn't plugged into its
> > socket.
> > 
> > If I understand correctly, 'cp->ReplyQueue' is tied to one irq vector,
> > and the command is expected to be handled via that irq vector, is that
> > right?
> > 
> > If yes, I guess this way can't work any more if the number of online
> > CPUs is >= h->nreply_queues, and you may need to check the cpu affinity
> > of a vector before choosing the reply queue; block/blk-mq-pci.c
> > may be helpful for you.
> > 
> > Thanks,
> > Ming
> 
> Thanks Ming,
> I'll start working up a patch.

Also, the reply queue may be mapped to a blk-mq hw queue directly, in which
case the conversion can be done by the blk-mq framework, but the legacy path
still needs the fix.
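
For the scsi-mq case, something like the following is what I mean
(untested sketch; it assumes hpsa wires up a .map_queues callback in its
scsi_host_template and adds #include <linux/blk-mq-pci.h>):

	static int hpsa_map_queues(struct Scsi_Host *shost)
	{
		struct ctlr_info *h = shost_to_hba(shost);

		/*
		 * Let blk-mq derive the hctx <-> CPU mapping from the
		 * affinity of the PCI irq vectors, so a hw queue is only
		 * used by CPUs which its vector actually covers.
		 */
		return blk_mq_pci_map_queues(&shost->tag_set, h->pdev);
	}

This only helps when scsi-mq is enabled, which is why the legacy path
still needs the reply queue fix discussed above.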

thanks
Ming

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector
@ 2018-02-01 15:04                 ` Ming Lei
  0 siblings, 0 replies; 23+ messages in thread
From: Ming Lei @ 2018-02-01 15:04 UTC (permalink / raw)
  To: Don Brace
  Cc: Laurence Oberman, Thomas Gleixner, Christoph Hellwig, Jens Axboe,
	linux-block, linux-kernel, Mike Snitzer, linux-scsi

On Thu, Feb 01, 2018 at 02:53:35PM +0000, Don Brace wrote:
> > -----Original Message-----
> > From: Ming Lei [mailto:ming.lei@redhat.com]
> > Sent: Thursday, February 01, 2018 4:37 AM
> > To: Don Brace <don.brace@microsemi.com>
> > Cc: Laurence Oberman <loberman@redhat.com>; Thomas Gleixner
> > <tglx@linutronix.de>; Christoph Hellwig <hch@infradead.org>; Jens Axboe
> > <axboe@fb.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org;
> > Mike Snitzer <snitzer@redhat.com>
> > Subject: Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined
> > to irq vector
> > 
> > EXTERNAL EMAIL
> > 
> > 
> > On Tue, Jan 16, 2018 at 03:22:18PM +0000, Don Brace wrote:
> > > > -----Original Message-----
> > > > From: Laurence Oberman [mailto:loberman@redhat.com]
> > > > Sent: Tuesday, January 16, 2018 7:29 AM
> > > > To: Thomas Gleixner <tglx@linutronix.de>; Ming Lei <ming.lei@redhat.com>
> > > > Cc: Christoph Hellwig <hch@infradead.org>; Jens Axboe <axboe@fb.com>;
> > > > linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Mike Snitzer
> > > > <snitzer@redhat.com>; Don Brace <don.brace@microsemi.com>
> > > > Subject: Re: [PATCH 0/2] genirq/affinity: try to make sure online CPU is
> > assgined
> > > > to irq vector
> > > >
> > > > > > It is because of irq_create_affinity_masks().
> > > > >
> > > > > That still does not answer the question. If the interrupt for a queue
> > > > > is
> > > > > assigned to an offline CPU, then the queue should not be used and
> > > > > never
> > > > > raise an interrupt. That's how managed interrupts have been designed.
> > > > >
> > > > > Thanks,
> > > > >
> > > > >       tglx
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > > I captured a full boot log for this issue for Microsemi, I will send it
> > > > to Don Brace.
> > > > I enabled all the HPSA debug and here is snippet
> > > >
> > > >
> > > > ..
> > > > ..
> > > > ..
> > > >   246.751135] INFO: task systemd-udevd:413 blocked for more than 120
> > > > seconds.
> > > > [  246.788008]       Tainted: G          I      4.15.0-rc4.noming+ #1
> > > > [  246.822380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > > disables this message.
> > > > [  246.865594] systemd-udevd   D    0   413    411 0x80000004
> > > > [  246.895519] Call Trace:
> > > > [  246.909713]  ? __schedule+0x340/0xc20
> > > > [  246.930236]  schedule+0x32/0x80
> > > > [  246.947905]  schedule_timeout+0x23d/0x450
> > > > [  246.970047]  ? find_held_lock+0x2d/0x90
> > > > [  246.991774]  ? wait_for_completion_io+0x108/0x170
> > > > [  247.018172]  io_schedule_timeout+0x19/0x40
> > > > [  247.041208]  wait_for_completion_io+0x110/0x170
> > > > [  247.067326]  ? wake_up_q+0x70/0x70
> > > > [  247.086801]  hpsa_scsi_do_simple_cmd+0xc6/0x100 [hpsa]
> > > > [  247.114315]  hpsa_scsi_do_simple_cmd_with_retry+0xb7/0x1c0 [hpsa]
> > > > [  247.146629]  hpsa_scsi_do_inquiry+0x73/0xd0 [hpsa]
> > > > [  247.174118]  hpsa_init_one+0x12cb/0x1a59 [hpsa]
> > >
> > > This trace comes from internally generated discovery commands. No SCSI
> > devices have
> > > been presented to the SML yet.
> > >
> > > At this point we should be running on only one CPU. These commands are
> > meant to use
> > > reply queue 0, which is tied to CPU 0. It's interesting that the patch helps.
> > >
> > > However, I was wondering if you could inspect the iLo IML logs and send the
> > > AHS logs for inspection.
> > 
> > Hello Don,
> > 
> > Now the patch has been merged to linus tree as:
> > 
> > 84676c1f21e8ff54b ("genirq/affinity: assign vectors to all possible CPUs")
> > 
> > and it breaks Laurence's machine completely, :-(
> > 
> > I just took a look at the HPSA code, and found that the reply queue is
> > chosen in the following way in most code paths:
> > 
> >         if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
> >                 cp->ReplyQueue = smp_processor_id() % h->nreply_queues;
> > 
> > h->nreply_queues is the number of MSI-X vectors returned from
> > pci_alloc_irq_vectors(), and now some of those vectors may be mapped to
> > only offline CPUs, for example, when one processor isn't plugged into its
> > socket.
> > 
> > If I understand correctly, 'cp->ReplyQueue' is tied to one irq vector,
> > and the command is expected to be handled via that irq vector, is that
> > right?
> > 
> > If yes, I guess this way can't work any more if the number of online
> > CPUs is >= h->nreply_queues, and you may need to check the cpu affinity
> > of a vector before choosing the reply queue; block/blk-mq-pci.c
> > may be helpful for you.
> > 
> > Thanks,
> > Ming
> 
> Thanks Ming,
> I'll start working up a patch.

Also, the reply queue may be mapped to a blk-mq hw queue directly, in which
case the conversion can be done by the blk-mq framework, but the legacy path
still needs the fix.

thanks
Ming

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2018-02-01 15:05 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-15 16:03 [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector Ming Lei
2018-01-15 16:03 ` [PATCH 1/2] genirq/affinity: move irq vectors spread into one function Ming Lei
2018-01-15 16:03 ` [PATCH 2/2] genirq/affinity: try best to make sure online CPU is assigned to vector Ming Lei
2018-01-15 17:40 ` [PATCH 0/2] genirq/affinity: try to make sure online CPU is assgined to irq vector Christoph Hellwig
2018-01-16  1:30   ` Ming Lei
2018-01-16 11:25     ` Thomas Gleixner
2018-01-16 12:23       ` Ming Lei
2018-01-16 13:28       ` Laurence Oberman
2018-01-16 15:22         ` Don Brace
2018-01-16 15:22           ` Don Brace
2018-01-16 15:35           ` Laurence Oberman
2018-01-16 15:47           ` Ming Lei
2018-01-16 15:47             ` Ming Lei
2018-02-01 10:36           ` Ming Lei
2018-02-01 10:36             ` Ming Lei
2018-02-01 14:53             ` Don Brace
2018-02-01 14:53               ` Don Brace
2018-02-01 15:04               ` Ming Lei
2018-02-01 15:04                 ` Ming Lei
2018-01-16  2:15   ` Ming Lei
2018-01-15 17:43 ` Thomas Gleixner
2018-01-15 17:54   ` Laurence Oberman
2018-01-16  1:34   ` Ming Lei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.