All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] IRQ, cpu-hotplug: Fix a race between CPU hotplug and IRQ desc alloc/free
@ 2017-09-04  8:53 Huang, Ying
  2017-09-04  9:23 ` Thomas Gleixner
  0 siblings, 1 reply; 4+ messages in thread
From: Huang, Ying @ 2017-09-04  8:53 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-kernel, Huang Ying

From: Huang Ying <ying.huang@intel.com>

When developing code to bootup some APs (Application CPUs)
asynchronously, the following kernel panic is encountered.  After
checking the code, it is found that the IRQ descriptor may be NULL
during CPU hotplug.  So I added corresponding NULL pointer checking to
fix this.  And it is found that irq_migrate_all_off_this_cpu() doesn't
lock sparse_irq_lock, fixed that too.

"
BUG: unable to handle kernel NULL pointer dereference at 00000000000000a4
IP: _raw_spin_lock_irq+0x1e/0x40
PGD 0
P4D 0

Oops: 0002 [#1] SMP
Modules linked in:
CPU: 93 PID: 713 Comm: cpuhp/93 Not tainted 4.13.0-rc7-00261-g3760d3d #1
Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0335.R00.1601291644 01/29/2016
task: ffff883f930e2680 task.stack: ffffc9000ef00000
RIP: 0010:_raw_spin_lock_irq+0x1e/0x40
RSP: 0000:ffffc9000ef03de0 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000010 RCX: 0000000000000010
RDX: 0000000000000001 RSI: 0000000000000010 RDI: 00000000000000a4
RBP: ffffc9000ef03de0 R08: ffff881036801240 R09: 0000000000000000
R10: 0000000000000040 R11: ffff881036801268 R12: 00000000000000a4
R13: 0000000000000000 R14: 000000000000005d R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff884044540000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a4 CR3: 000000407ee09000 CR4: 00000000003406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 irq_affinity_online_cpu+0x46/0xe0
 ? irq_migrate_all_off_this_cpu+0x2a0/0x2a0
 cpuhp_invoke_callback+0x80/0x400
 cpuhp_up_callbacks+0x36/0xc0
 ? smpboot_thread_fn+0x34/0x1f0
 ? smpboot_thread_fn+0x12d/0x1f0
 cpuhp_thread_fun+0xd5/0xe0
 smpboot_thread_fn+0x128/0x1f0
 kthread+0x114/0x150
 ? sort_range+0x30/0x30
 ? kthread_create_on_node+0x40/0x40
 ret_from_fork+0x25/0x30
Code: 89 e5 e8 26 8b 6f ff 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 fa 66 0f 1f 44 00 00 65 ff 05 49 4a 63 7e 31 c0 ba 01 00 00 00 <f0> 0f b1 17 85 c0 75 02 5d c3 89 c6 e8 21 71 6f ff 66 90 5d c3
RIP: _raw_spin_lock_irq+0x1e/0x40 RSP: ffffc9000ef03de0
CR2: 00000000000000a4
---[ end trace a9eacc0758f1f81e ]---
Kernel panic - not syncing: Fatal exception
"

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
---
 kernel/irq/cpuhotplug.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c
index 638eb9c83d9f..af9029625271 100644
--- a/kernel/irq/cpuhotplug.c
+++ b/kernel/irq/cpuhotplug.c
@@ -129,10 +129,13 @@ void irq_migrate_all_off_this_cpu(void)
 	struct irq_desc *desc;
 	unsigned int irq;
 
+	irq_lock_sparse();
 	for_each_active_irq(irq) {
 		bool affinity_broken;
 
 		desc = irq_to_desc(irq);
+		if (!desc)
+			continue;
 		raw_spin_lock(&desc->lock);
 		affinity_broken = migrate_one_irq(desc);
 		raw_spin_unlock(&desc->lock);
@@ -142,6 +145,7 @@ void irq_migrate_all_off_this_cpu(void)
 					    irq, smp_processor_id());
 		}
 	}
+	irq_unlock_sparse();
 }
 
 static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu)
@@ -179,6 +183,8 @@ int irq_affinity_online_cpu(unsigned int cpu)
 	irq_lock_sparse();
 	for_each_active_irq(irq) {
 		desc = irq_to_desc(irq);
+		if (!desc)
+			continue;
 		raw_spin_lock_irq(&desc->lock);
 		irq_restore_affinity_of_irq(desc, cpu);
 		raw_spin_unlock_irq(&desc->lock);
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] IRQ, cpu-hotplug: Fix a race between CPU hotplug and IRQ desc alloc/free
  2017-09-04  8:53 [PATCH] IRQ, cpu-hotplug: Fix a race between CPU hotplug and IRQ desc alloc/free Huang, Ying
@ 2017-09-04  9:23 ` Thomas Gleixner
  2017-09-04 23:41   ` Huang, Ying
  0 siblings, 1 reply; 4+ messages in thread
From: Thomas Gleixner @ 2017-09-04  9:23 UTC (permalink / raw)
  To: Huang, Ying; +Cc: linux-kernel

On Mon, 4 Sep 2017, Huang, Ying wrote:
> diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c
> index 638eb9c83d9f..af9029625271 100644
> --- a/kernel/irq/cpuhotplug.c
ry> +++ b/kernel/irq/cpuhotplug.c
> @@ -129,10 +129,13 @@ void irq_migrate_all_off_this_cpu(void)
>  	struct irq_desc *desc;
>  	unsigned int irq;
>  
> +	irq_lock_sparse();

You cannot take that lock here as irq_migrate_all_off_this_cpu() is called
with interrupts disabled.

The protection in takedown_cpus() is wrong. Patch below.

Thanks,

	tglx
----

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -642,13 +642,13 @@ static int takedown_cpu(unsigned int cpu
 	wait_for_completion(&st->done);
 	BUG_ON(st->state != CPUHP_AP_IDLE_DEAD);
 
-	/* Interrupts are moved away from the dying cpu, reenable alloc/free */
-	irq_unlock_sparse();
-
 	hotplug_cpu__broadcast_tick_pull(cpu);
 	/* This actually kills the CPU. */
 	__cpu_die(cpu);
 
+	/* Interrupts are moved away from the dying cpu, reenable alloc/free */
+	irq_unlock_sparse();
+
 	tick_cleanup_dead_cpu(cpu);
 	return 0;
 }

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] IRQ, cpu-hotplug: Fix a race between CPU hotplug and IRQ desc alloc/free
  2017-09-04  9:23 ` Thomas Gleixner
@ 2017-09-04 23:41   ` Huang, Ying
  2017-09-05  7:15     ` Thomas Gleixner
  0 siblings, 1 reply; 4+ messages in thread
From: Huang, Ying @ 2017-09-04 23:41 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Huang, Ying, linux-kernel

Thomas Gleixner <tglx@linutronix.de> writes:

> On Mon, 4 Sep 2017, Huang, Ying wrote:
>> diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c
>> index 638eb9c83d9f..af9029625271 100644
>> --- a/kernel/irq/cpuhotplug.c
> ry> +++ b/kernel/irq/cpuhotplug.c
>> @@ -129,10 +129,13 @@ void irq_migrate_all_off_this_cpu(void)
>>  	struct irq_desc *desc;
>>  	unsigned int irq;
>>  
>> +	irq_lock_sparse();
>
> You cannot take that lock here as irq_migrate_all_off_this_cpu() is called
> with interrupts disabled.

Oh, sorry, I misunderstand the code.  I will only keep the !desc check
in the patch.

> The protection in takedown_cpus() is wrong. Patch below.
>
> Thanks,
>
> 	tglx
> ----
>
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -642,13 +642,13 @@ static int takedown_cpu(unsigned int cpu
>  	wait_for_completion(&st->done);
>  	BUG_ON(st->state != CPUHP_AP_IDLE_DEAD);
>  
> -	/* Interrupts are moved away from the dying cpu, reenable alloc/free */
> -	irq_unlock_sparse();
> -
>  	hotplug_cpu__broadcast_tick_pull(cpu);
>  	/* This actually kills the CPU. */
>  	__cpu_die(cpu);
>  
> +	/* Interrupts are moved away from the dying cpu, reenable alloc/free */
> +	irq_unlock_sparse();
> +

I don't understand this.  It appears that irq_migrate_all_off_this_cpu()
is called in take_cpu_down() which has sparse_irq_lock held already.

Best Regards,
Huang, Ying

>  	tick_cleanup_dead_cpu(cpu);
>  	return 0;
>  }

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] IRQ, cpu-hotplug: Fix a race between CPU hotplug and IRQ desc alloc/free
  2017-09-04 23:41   ` Huang, Ying
@ 2017-09-05  7:15     ` Thomas Gleixner
  0 siblings, 0 replies; 4+ messages in thread
From: Thomas Gleixner @ 2017-09-05  7:15 UTC (permalink / raw)
  To: Huang, Ying; +Cc: linux-kernel

On Tue, 5 Sep 2017, Huang, Ying wrote:
> Thomas Gleixner <tglx@linutronix.de> writes:
> > +	/* Interrupts are moved away from the dying cpu, reenable alloc/free */
> > +	irq_unlock_sparse();
> > +
> 
> I don't understand this.  It appears that irq_migrate_all_off_this_cpu()
> is called in take_cpu_down() which has sparse_irq_lock held already.

You're right. I was looking at the wrong place.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-09-05  7:15 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-04  8:53 [PATCH] IRQ, cpu-hotplug: Fix a race between CPU hotplug and IRQ desc alloc/free Huang, Ying
2017-09-04  9:23 ` Thomas Gleixner
2017-09-04 23:41   ` Huang, Ying
2017-09-05  7:15     ` Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.