All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] metag: smp/hotplug fixes
@ 2013-07-01 16:04 James Hogan
  2013-07-01 16:04 ` [PATCH 1/5] metag: use clear_tasks_mm_cpumask() James Hogan
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: James Hogan @ 2013-07-01 16:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: Thomas Gleixner, James Hogan, Srivatsa S. Bhat, Anton Vorontsov,
	Kirill Tkhai, Oleg Nesterov

Kirill Tkhai kindly pointed out a bug in the metag SMP code, prompting
me to give it a bashing and weed out the hotplug bugs that have crept in
(or been around for a while). This patchset fixes various issues noticed
that I intend to apply for v3.11.

They're all short patches so please do give them a glance over in case
I've misunderstood anything.

James Hogan (5):
  metag: use clear_tasks_mm_cpumask()
  metag: smp: enable irqs after set_cpu_online
  metag: smp: don't spin waiting for CPU to start
  metag: kick: prevent nested kick handlers
  metag: cpu hotplug: route_irq: preserve irq mask

 arch/metag/kernel/irq.c  |  5 +++--
 arch/metag/kernel/kick.c |  9 +++++++++
 arch/metag/kernel/smp.c  | 35 +++++++++++++----------------------
 3 files changed, 25 insertions(+), 24 deletions(-)

Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
-- 
1.8.1.2



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/5] metag: use clear_tasks_mm_cpumask()
  2013-07-01 16:04 [PATCH 0/5] metag: smp/hotplug fixes James Hogan
@ 2013-07-01 16:04 ` James Hogan
  2013-07-01 16:04 ` [PATCH 2/5] metag: smp: enable irqs after set_cpu_online James Hogan
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: James Hogan @ 2013-07-01 16:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: Thomas Gleixner, James Hogan, Srivatsa S. Bhat, Oleg Nesterov,
	Anton Vorontsov

Checking for process->mm is not enough because process' main thread may
exit or detach its mm via use_mm(), but other threads may still have a
valid mm.

To fix this we would need to use find_lock_task_mm(), which would walk
up all threads and returns an appropriate task (with task lock held).

clear_tasks_mm_cpumask() was introduced in v3.5-rc1 to fix this issue,
so let's use it for metag too.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
---
 arch/metag/kernel/smp.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
index 86fdda4..b813515 100644
--- a/arch/metag/kernel/smp.c
+++ b/arch/metag/kernel/smp.c
@@ -276,7 +276,6 @@ static DECLARE_COMPLETION(cpu_killed);
 int __cpuexit __cpu_disable(void)
 {
 	unsigned int cpu = smp_processor_id();
-	struct task_struct *p;
 
 	/*
 	 * Take this CPU offline.  Once we clear this, we can't return,
@@ -296,12 +295,7 @@ int __cpuexit __cpu_disable(void)
 	flush_cache_all();
 	local_flush_tlb_all();
 
-	read_lock(&tasklist_lock);
-	for_each_process(p) {
-		if (p->mm)
-			cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
-	}
-	read_unlock(&tasklist_lock);
+	clear_tasks_mm_cpumask(cpu);
 
 	return 0;
 }
-- 
1.8.1.2



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/5] metag: smp: enable irqs after set_cpu_online
  2013-07-01 16:04 [PATCH 0/5] metag: smp/hotplug fixes James Hogan
  2013-07-01 16:04 ` [PATCH 1/5] metag: use clear_tasks_mm_cpumask() James Hogan
@ 2013-07-01 16:04 ` James Hogan
  2013-07-01 21:42   ` Thomas Gleixner
  2013-07-02  5:57   ` Srivatsa S. Bhat
  2013-07-01 16:04 ` [PATCH 3/5] metag: smp: don't spin waiting for CPU to start James Hogan
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 18+ messages in thread
From: James Hogan @ 2013-07-01 16:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: Thomas Gleixner, James Hogan, Kirill Tkhai, Srivatsa S. Bhat

In secondary_start_kernel() interrupts should be enabled with
local_irq_enable() after the cpu is marked as online with
set_cpu_online(). Otherwise it's possible for a timer interrupt to
trigger a softirq, which if the cpu is marked as offline may have it's
affinity altered.

Reported-by: Kirill Tkhai <tkhai@yandex.ru>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 arch/metag/kernel/smp.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
index b813515..09979f2 100644
--- a/arch/metag/kernel/smp.c
+++ b/arch/metag/kernel/smp.c
@@ -379,12 +379,7 @@ asmlinkage void secondary_start_kernel(void)
 
 	setup_priv();
 
-	/*
-	 * Enable local interrupts.
-	 */
-	tbi_startup_interrupt(TBID_SIGNUM_TRT);
 	notify_cpu_starting(cpu);
-	local_irq_enable();
 
 	pr_info("CPU%u (thread %u): Booted secondary processor\n",
 		cpu, cpu_2_hwthread_id[cpu]);
@@ -398,6 +393,12 @@ asmlinkage void secondary_start_kernel(void)
 	set_cpu_online(cpu, true);
 
 	/*
+	 * Enable local interrupts.
+	 */
+	tbi_startup_interrupt(TBID_SIGNUM_TRT);
+	local_irq_enable();
+
+	/*
 	 * OK, it's off to the idle thread for us
 	 */
 	cpu_startup_entry(CPUHP_ONLINE);
-- 
1.8.1.2



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/5] metag: smp: don't spin waiting for CPU to start
  2013-07-01 16:04 [PATCH 0/5] metag: smp/hotplug fixes James Hogan
  2013-07-01 16:04 ` [PATCH 1/5] metag: use clear_tasks_mm_cpumask() James Hogan
  2013-07-01 16:04 ` [PATCH 2/5] metag: smp: enable irqs after set_cpu_online James Hogan
@ 2013-07-01 16:04 ` James Hogan
  2013-07-01 21:46   ` Thomas Gleixner
  2013-07-02  6:12   ` Srivatsa S. Bhat
  2013-07-01 16:04 ` [PATCH 4/5] metag: kick: prevent nested kick handlers James Hogan
  2013-07-01 16:04 ` [PATCH 5/5] metag: cpu hotplug: route_irq: preserve irq mask James Hogan
  4 siblings, 2 replies; 18+ messages in thread
From: James Hogan @ 2013-07-01 16:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: Thomas Gleixner, James Hogan, Srivatsa S. Bhat

Use a completion to block until a secondary CPU has started up, like ARM
do, instead of a loop of udelays.

On Meta, SMP is really SMT, with each "CPU" being a different hardware
thread on the same Meta processor core, so as well as being more
efficient and latency friendly, using a completion prevents the bogomips
of the secondary CPU from being drastically skewed every time by the
execution of the tight in-cache udelay loop on the other CPU.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 arch/metag/kernel/smp.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
index 09979f2..e413875 100644
--- a/arch/metag/kernel/smp.c
+++ b/arch/metag/kernel/smp.c
@@ -8,6 +8,7 @@
  * published by the Free Software Foundation.
  */
 #include <linux/atomic.h>
+#include <linux/completion.h>
 #include <linux/delay.h>
 #include <linux/init.h>
 #include <linux/spinlock.h>
@@ -62,6 +63,8 @@ static DEFINE_PER_CPU(struct ipi_data, ipi_data) = {
 
 static DEFINE_SPINLOCK(boot_lock);
 
+static DECLARE_COMPLETION(cpu_running);
+
 /*
  * "thread" is assumed to be a valid Meta hardware thread ID.
  */
@@ -235,20 +238,12 @@ int __cpuinit __cpu_up(unsigned int cpu, struct task_struct *idle)
 	 */
 	ret = boot_secondary(thread, idle);
 	if (ret == 0) {
-		unsigned long timeout;
-
 		/*
 		 * CPU was successfully started, wait for it
 		 * to come online or time out.
 		 */
-		timeout = jiffies + HZ;
-		while (time_before(jiffies, timeout)) {
-			if (cpu_online(cpu))
-				break;
-
-			udelay(10);
-			barrier();
-		}
+		wait_for_completion_timeout(&cpu_running,
+					    msecs_to_jiffies(1000));
 
 		if (!cpu_online(cpu))
 			ret = -EIO;
@@ -391,6 +386,7 @@ asmlinkage void secondary_start_kernel(void)
 	 * OK, now it's safe to let the boot CPU continue
 	 */
 	set_cpu_online(cpu, true);
+	complete(&cpu_running);
 
 	/*
 	 * Enable local interrupts.
-- 
1.8.1.2



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/5] metag: kick: prevent nested kick handlers
  2013-07-01 16:04 [PATCH 0/5] metag: smp/hotplug fixes James Hogan
                   ` (2 preceding siblings ...)
  2013-07-01 16:04 ` [PATCH 3/5] metag: smp: don't spin waiting for CPU to start James Hogan
@ 2013-07-01 16:04 ` James Hogan
  2013-07-01 21:51   ` Thomas Gleixner
  2013-07-01 16:04 ` [PATCH 5/5] metag: cpu hotplug: route_irq: preserve irq mask James Hogan
  4 siblings, 1 reply; 18+ messages in thread
From: James Hogan @ 2013-07-01 16:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: Thomas Gleixner, James Hogan

The main kick trigger handler iterates a list of kick handlers and calls
each one. This is done with the kick_handlers_lock spin lock held, but
this causes a problem on SMP where IPIs are implemented with kicks. A
reschedule IPI calls scheduler_ipi() which uses irq_enter() and
irq_exit(). This results in the scheduler being invoked with
kick_handlers_lock held which can result in a nested kick trigger
attempting to acquire the lock, resulting in deadlock.

irq_enter() and irq_exit() can nest, so call them from the main kick
interrupt handler so that softirqs are only handled after
kick_handlers_lock is released.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
---
 arch/metag/kernel/kick.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/metag/kernel/kick.c b/arch/metag/kernel/kick.c
index 50fcbec..beb3776 100644
--- a/arch/metag/kernel/kick.c
+++ b/arch/metag/kernel/kick.c
@@ -26,6 +26,8 @@
  * pass it as an argument.
  */
 #include <linux/export.h>
+#include <linux/hardirq.h>
+#include <linux/irq.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/types.h>
@@ -66,6 +68,7 @@ EXPORT_SYMBOL(kick_unregister_func);
 TBIRES
 kick_handler(TBIRES State, int SigNum, int Triggers, int Inst, PTBI pTBI)
 {
+	struct pt_regs *old_regs;
 	struct kick_irq_handler *kh;
 	struct list_head *lh;
 	int handled = 0;
@@ -79,6 +82,9 @@ kick_handler(TBIRES State, int SigNum, int Triggers, int Inst, PTBI pTBI)
 
 	trace_hardirqs_off();
 
+	old_regs = set_irq_regs((struct pt_regs *)State.Sig.pCtx);
+	irq_enter();
+
 	/*
 	 * There is no need to disable interrupts here because we
 	 * can't nest KICK interrupts in a KICK interrupt handler.
@@ -97,5 +103,8 @@ kick_handler(TBIRES State, int SigNum, int Triggers, int Inst, PTBI pTBI)
 
 	WARN_ON(!handled);
 
+	irq_exit();
+	set_irq_regs(old_regs);
+
 	return tail_end(ret);
 }
-- 
1.8.1.2



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/5] metag: cpu hotplug: route_irq: preserve irq mask
  2013-07-01 16:04 [PATCH 0/5] metag: smp/hotplug fixes James Hogan
                   ` (3 preceding siblings ...)
  2013-07-01 16:04 ` [PATCH 4/5] metag: kick: prevent nested kick handlers James Hogan
@ 2013-07-01 16:04 ` James Hogan
  2013-07-01 21:51   ` Thomas Gleixner
  2013-07-02  6:16   ` Srivatsa S. Bhat
  4 siblings, 2 replies; 18+ messages in thread
From: James Hogan @ 2013-07-01 16:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: Thomas Gleixner, James Hogan

The route_irq() function needs to preserve the irq mask by using the
_irqsave/irqrestore variants of raw spin lock functions instead of the
_irq variants. This is because it is called from __cpu_disable() (via
migrate_irqs()), which is called with IRQs disabled, so using the _irq
variants re-enables IRQs.

This appears to have been causing occasional hits of the
BUG_ON(!irqs_disabled()) in __irq_work_run() during CPU hotplug soak
testing:
  BUG: failure at kernel/irq_work.c:122/__irq_work_run()!

Signed-off-by: James Hogan <james.hogan@imgtec.com>
---
 arch/metag/kernel/irq.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/metag/kernel/irq.c b/arch/metag/kernel/irq.c
index d91b1e9..2a2c9d5 100644
--- a/arch/metag/kernel/irq.c
+++ b/arch/metag/kernel/irq.c
@@ -279,11 +279,12 @@ static void route_irq(struct irq_data *data, unsigned int irq, unsigned int cpu)
 {
 	struct irq_desc *desc = irq_to_desc(irq);
 	struct irq_chip *chip = irq_data_get_irq_chip(data);
+	unsigned long flags;
 
-	raw_spin_lock_irq(&desc->lock);
+	raw_spin_lock_irqsave(&desc->lock, flags);
 	if (chip->irq_set_affinity)
 		chip->irq_set_affinity(data, cpumask_of(cpu), false);
-	raw_spin_unlock_irq(&desc->lock);
+	raw_spin_unlock_irqrestore(&desc->lock, flags);
 }
 
 /*
-- 
1.8.1.2



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/5] metag: smp: enable irqs after set_cpu_online
  2013-07-01 16:04 ` [PATCH 2/5] metag: smp: enable irqs after set_cpu_online James Hogan
@ 2013-07-01 21:42   ` Thomas Gleixner
  2013-07-02  5:57   ` Srivatsa S. Bhat
  1 sibling, 0 replies; 18+ messages in thread
From: Thomas Gleixner @ 2013-07-01 21:42 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-kernel, Kirill Tkhai, Srivatsa S. Bhat

On Mon, 1 Jul 2013, James Hogan wrote:

> In secondary_start_kernel() interrupts should be enabled with
> local_irq_enable() after the cpu is marked as online with
> set_cpu_online(). Otherwise it's possible for a timer interrupt to
> trigger a softirq, which if the cpu is marked as offline may have it's
> affinity altered.
> 
> Reported-by: Kirill Tkhai <tkhai@yandex.ru>
> Signed-off-by: James Hogan <james.hogan@imgtec.com>
> Cc: Kirill Tkhai <tkhai@yandex.ru>
> Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>

Acked-by: Thomas Gleixner <tglx@linutronix.de>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/5] metag: smp: don't spin waiting for CPU to start
  2013-07-01 16:04 ` [PATCH 3/5] metag: smp: don't spin waiting for CPU to start James Hogan
@ 2013-07-01 21:46   ` Thomas Gleixner
  2013-07-01 22:02     ` James Hogan
  2013-07-02  6:12   ` Srivatsa S. Bhat
  1 sibling, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2013-07-01 21:46 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-kernel, Srivatsa S. Bhat

On Mon, 1 Jul 2013, James Hogan wrote:
> Use a completion to block until a secondary CPU has started up, like ARM
> do, instead of a loop of udelays.
> 
> On Meta, SMP is really SMT, with each "CPU" being a different hardware
> thread on the same Meta processor core, so as well as being more
> efficient and latency friendly, using a completion prevents the bogomips
> of the secondary CPU from being drastically skewed every time by the
> execution of the tight in-cache udelay loop on the other CPU.

The least of our worries is the correctness of bogomips. But if that
patch makes bogomips less bogus while fixing a real issue, please add
my
 
Acked-by: Thomas Gleixner <tglx@linutronix.de>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] metag: kick: prevent nested kick handlers
  2013-07-01 16:04 ` [PATCH 4/5] metag: kick: prevent nested kick handlers James Hogan
@ 2013-07-01 21:51   ` Thomas Gleixner
  2013-07-01 22:17     ` James Hogan
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2013-07-01 21:51 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-kernel

On Mon, 1 Jul 2013, James Hogan wrote:

> The main kick trigger handler iterates a list of kick handlers and calls
> each one. This is done with the kick_handlers_lock spin lock held, but
> this causes a problem on SMP where IPIs are implemented with kicks. A
> reschedule IPI calls scheduler_ipi() which uses irq_enter() and
> irq_exit(). This results in the scheduler being invoked with
> kick_handlers_lock held which can result in a nested kick trigger
> attempting to acquire the lock, resulting in deadlock.
> 
> irq_enter() and irq_exit() can nest, so call them from the main kick
> interrupt handler so that softirqs are only handled after
> kick_handlers_lock is released.

This changelog is confusing. What I decode from the patch is, that you
are adding a missing irq_enter/exit pair to the kick_handler, right ?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/5] metag: cpu hotplug: route_irq: preserve irq mask
  2013-07-01 16:04 ` [PATCH 5/5] metag: cpu hotplug: route_irq: preserve irq mask James Hogan
@ 2013-07-01 21:51   ` Thomas Gleixner
  2013-07-02  6:16   ` Srivatsa S. Bhat
  1 sibling, 0 replies; 18+ messages in thread
From: Thomas Gleixner @ 2013-07-01 21:51 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-kernel



On Mon, 1 Jul 2013, James Hogan wrote:

> The route_irq() function needs to preserve the irq mask by using the
> _irqsave/irqrestore variants of raw spin lock functions instead of the
> _irq variants. This is because it is called from __cpu_disable() (via
> migrate_irqs()), which is called with IRQs disabled, so using the _irq
> variants re-enables IRQs.
> 
> This appears to have been causing occasional hits of the
> BUG_ON(!irqs_disabled()) in __irq_work_run() during CPU hotplug soak
> testing:
>   BUG: failure at kernel/irq_work.c:122/__irq_work_run()!
> 
> Signed-off-by: James Hogan <james.hogan@imgtec.com>

Acked-by: Thomas Gleixner <tglx@linutronix.de>

> ---
>  arch/metag/kernel/irq.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/metag/kernel/irq.c b/arch/metag/kernel/irq.c
> index d91b1e9..2a2c9d5 100644
> --- a/arch/metag/kernel/irq.c
> +++ b/arch/metag/kernel/irq.c
> @@ -279,11 +279,12 @@ static void route_irq(struct irq_data *data, unsigned int irq, unsigned int cpu)
>  {
>  	struct irq_desc *desc = irq_to_desc(irq);
>  	struct irq_chip *chip = irq_data_get_irq_chip(data);
> +	unsigned long flags;
>  
> -	raw_spin_lock_irq(&desc->lock);
> +	raw_spin_lock_irqsave(&desc->lock, flags);
>  	if (chip->irq_set_affinity)
>  		chip->irq_set_affinity(data, cpumask_of(cpu), false);
> -	raw_spin_unlock_irq(&desc->lock);
> +	raw_spin_unlock_irqrestore(&desc->lock, flags);
>  }
>  
>  /*
> -- 
> 1.8.1.2
> 
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/5] metag: smp: don't spin waiting for CPU to start
  2013-07-01 21:46   ` Thomas Gleixner
@ 2013-07-01 22:02     ` James Hogan
  0 siblings, 0 replies; 18+ messages in thread
From: James Hogan @ 2013-07-01 22:02 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Srivatsa S. Bhat

On 1 July 2013 22:46, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Mon, 1 Jul 2013, James Hogan wrote:
>> Use a completion to block until a secondary CPU has started up, like ARM
>> do, instead of a loop of udelays.
>>
>> On Meta, SMP is really SMT, with each "CPU" being a different hardware
>> thread on the same Meta processor core, so as well as being more
>> efficient and latency friendly, using a completion prevents the bogomips
>> of the secondary CPU from being drastically skewed every time by the
>> execution of the tight in-cache udelay loop on the other CPU.
>
> The least of our worries is the correctness of bogomips. But if that
> patch makes bogomips less bogus while fixing a real issue, please add
> my
>
> Acked-by: Thomas Gleixner <tglx@linutronix.de>

Yes, it's not so much the absolute value, but that secondary cpus will
have a bogomips around 25% higher than the boot cpu, even though they
should be about equal (ignoring the effects of other cpus). If I
understand correctly, all else being equal this would make udelay
block for a different amount of time on different cpus (although I
haven't tried measuring it).

James

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] metag: kick: prevent nested kick handlers
  2013-07-01 21:51   ` Thomas Gleixner
@ 2013-07-01 22:17     ` James Hogan
  2013-07-01 22:56       ` Thomas Gleixner
  0 siblings, 1 reply; 18+ messages in thread
From: James Hogan @ 2013-07-01 22:17 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML

On 1 July 2013 22:51, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Mon, 1 Jul 2013, James Hogan wrote:
>
>> The main kick trigger handler iterates a list of kick handlers and calls
>> each one. This is done with the kick_handlers_lock spin lock held, but
>> this causes a problem on SMP where IPIs are implemented with kicks. A
>> reschedule IPI calls scheduler_ipi() which uses irq_enter() and
>> irq_exit(). This results in the scheduler being invoked with
>> kick_handlers_lock held which can result in a nested kick trigger
>> attempting to acquire the lock, resulting in deadlock.
>>
>> irq_enter() and irq_exit() can nest, so call them from the main kick
>> interrupt handler so that softirqs are only handled after
>> kick_handlers_lock is released.
>
> This changelog is confusing. What I decode from the patch is, that you
> are adding a missing irq_enter/exit pair to the kick_handler, right ?

Yes. Previously the outermost pair of irq_enter/exit was inside the
spin lock critical section (inside scheduler_ipi). so soft-irqs
(apparently including the scheduler) would run from irq_exit with the
spinlock still held. Now it waits until the new outermost irq_exit(),
after the spin lock is released.

I should probably have increased the number of lines of diff context.
http://lxr.linux.no/#linux+v3.10/arch/metag/kernel/kick.c#L66

Cheers
James

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] metag: kick: prevent nested kick handlers
  2013-07-01 22:17     ` James Hogan
@ 2013-07-01 22:56       ` Thomas Gleixner
  2013-07-02  0:00         ` James Hogan
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2013-07-01 22:56 UTC (permalink / raw)
  To: James Hogan; +Cc: LKML

On Mon, 1 Jul 2013, James Hogan wrote:
> On 1 July 2013 22:51, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Mon, 1 Jul 2013, James Hogan wrote:
> >
> >> The main kick trigger handler iterates a list of kick handlers and calls
> >> each one. This is done with the kick_handlers_lock spin lock held, but
> >> this causes a problem on SMP where IPIs are implemented with kicks. A
> >> reschedule IPI calls scheduler_ipi() which uses irq_enter() and
> >> irq_exit(). This results in the scheduler being invoked with
> >> kick_handlers_lock held which can result in a nested kick trigger
> >> attempting to acquire the lock, resulting in deadlock.
> >>
> >> irq_enter() and irq_exit() can nest, so call them from the main kick
> >> interrupt handler so that softirqs are only handled after
> >> kick_handlers_lock is released.
> >
> > This changelog is confusing. What I decode from the patch is, that you
> > are adding a missing irq_enter/exit pair to the kick_handler, right ?
> 
> Yes. Previously the outermost pair of irq_enter/exit was inside the
> spin lock critical section (inside scheduler_ipi). so soft-irqs
> (apparently including the scheduler) would run from irq_exit with the
> spinlock still held. Now it waits until the new outermost irq_exit(),
> after the spin lock is released.
> 
> I should probably have increased the number of lines of diff context.
> http://lxr.linux.no/#linux+v3.10/arch/metag/kernel/kick.c#L66

No, you should have written the changelog less confusing. I can see
the context by just looking at the current and the patched code.

The explanation you provided right now that the outermost pair of
irq_enter/exit was inside the spin locked section matches the patch
and makes sense.

So the real issue is, that all of your various interrupts come from
the same place, i.e. kick_handler. The kick_handler locks a global
lock and calls the handlers which have registered. These handlers
(partially calling generic code) might invoke irq_enter/exit pairs,
which leads to the underlying problem.

If irq_exit() results in a nest count of hard interrupts = 0, then it
invokes eventually pending soft interrupts. The softirq callbacks can
issue an IPI of some sorts and because this IPI will end up in the
kick handler, the systems livelocks on the already held global lock
which protects the kick_handler list. This can happen from any invoked
handler, because the kick_handler is missing an irq_enter/pair
preventing a kick_handler invocation recursion.

The obvious fix for now is to add an irq_enter/exit() pair around the
spin locked section in the kick handler, so irq_exit() invocations of
subhandlers wont lead to softirq execution.

The more elegant fix would be to replace that global lock which is not
really scalable with an RCU protected list of subscribed handlers.

The most elegant fix would be to have separate entry points for
various exceptions, but that seems to be an architectural issue.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] metag: kick: prevent nested kick handlers
  2013-07-01 22:56       ` Thomas Gleixner
@ 2013-07-02  0:00         ` James Hogan
  0 siblings, 0 replies; 18+ messages in thread
From: James Hogan @ 2013-07-02  0:00 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, james.hogan

Thomas Gleixner <tglx@linutronix.de> wrote:

>On Mon, 1 Jul 2013, James Hogan wrote:
>> On 1 July 2013 22:51, Thomas Gleixner <tglx@linutronix.de> wrote:
>> > On Mon, 1 Jul 2013, James Hogan wrote:
>> >
>> >> The main kick trigger handler iterates a list of kick handlers and
>calls
>> >> each one. This is done with the kick_handlers_lock spin lock held,
>but
>> >> this causes a problem on SMP where IPIs are implemented with
>kicks. A
>> >> reschedule IPI calls scheduler_ipi() which uses irq_enter() and
>> >> irq_exit(). This results in the scheduler being invoked with
>> >> kick_handlers_lock held which can result in a nested kick trigger
>> >> attempting to acquire the lock, resulting in deadlock.
>> >>
>> >> irq_enter() and irq_exit() can nest, so call them from the main
>kick
>> >> interrupt handler so that softirqs are only handled after
>> >> kick_handlers_lock is released.
>> >
>> > This changelog is confusing. What I decode from the patch is, that
>you
>> > are adding a missing irq_enter/exit pair to the kick_handler, right
>?
>> 
>> Yes. Previously the outermost pair of irq_enter/exit was inside the
>> spin lock critical section (inside scheduler_ipi). so soft-irqs
>> (apparently including the scheduler) would run from irq_exit with the
>> spinlock still held. Now it waits until the new outermost irq_exit(),
>> after the spin lock is released.
>> 
>> I should probably have increased the number of lines of diff context.
>> http://lxr.linux.no/#linux+v3.10/arch/metag/kernel/kick.c#L66
>
>No, you should have written the changelog less confusing. I can see
>the context by just looking at the current and the patched code.

Yes, i agree, sorry it wasn't clear.

>The explanation you provided right now that the outermost pair of
>irq_enter/exit was inside the spin locked section matches the patch
>and makes sense.
>
>So the real issue is, that all of your various interrupts come from

Only kicks, which are the hardware IPI mechanism (in practice only used for SMP IPIs and for communication with non-linux hw threads). Timer/peripheral interrupts come via different handlers.

>the same place, i.e. kick_handler. The kick_handler locks a global
>lock and calls the handlers which have registered. These handlers
>(partially calling generic code) might invoke irq_enter/exit pairs,
>which leads to the underlying problem.
>
>If irq_exit() results in a nest count of hard interrupts = 0, then it
>invokes eventually pending soft interrupts. The softirq callbacks can
>issue an IPI of some sorts and because this IPI will end up in the
>kick handler, the systems livelocks on the already held global lock
>which protects the kick_handler list. This can happen from any invoked
>handler, because the kick_handler is missing an irq_enter/pair
>preventing a kick_handler invocation recursion.

Yes, that's my understanding of the problem.

>
>The obvious fix for now is to add an irq_enter/exit() pair around the
>spin locked section in the kick handler, so irq_exit() invocations of
>subhandlers wont lead to softirq execution.
>
>The more elegant fix would be to replace that global lock which is not
>really scalable with an RCU protected list of subscribed handlers.

Yes, I considered switching it to an RCU list (thanks for confirming that), but this patch looks sensible to me even with rcu, so I figured that can be a separate patch.

>The most elegant fix would be to have separate entry points for
>various exceptions, but that seems to be an architectural issue.

thanks for taking the time to review and for the other acks. i'll rewrite the commit message of this patch and resend.

cheers
James


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/5] metag: smp: enable irqs after set_cpu_online
  2013-07-01 16:04 ` [PATCH 2/5] metag: smp: enable irqs after set_cpu_online James Hogan
  2013-07-01 21:42   ` Thomas Gleixner
@ 2013-07-02  5:57   ` Srivatsa S. Bhat
  1 sibling, 0 replies; 18+ messages in thread
From: Srivatsa S. Bhat @ 2013-07-02  5:57 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-kernel, Thomas Gleixner, Kirill Tkhai

On 07/01/2013 09:34 PM, James Hogan wrote:
> In secondary_start_kernel() interrupts should be enabled with
> local_irq_enable() after the cpu is marked as online with
> set_cpu_online(). Otherwise it's possible for a timer interrupt to
> trigger a softirq, which if the cpu is marked as offline may have it's
> affinity altered.
> 
> Reported-by: Kirill Tkhai <tkhai@yandex.ru>
> Signed-off-by: James Hogan <james.hogan@imgtec.com>
> Cc: Kirill Tkhai <tkhai@yandex.ru>
> Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> ---

Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Regards,
Srivatsa S. Bhat

>  arch/metag/kernel/smp.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
> index b813515..09979f2 100644
> --- a/arch/metag/kernel/smp.c
> +++ b/arch/metag/kernel/smp.c
> @@ -379,12 +379,7 @@ asmlinkage void secondary_start_kernel(void)
> 
>  	setup_priv();
> 
> -	/*
> -	 * Enable local interrupts.
> -	 */
> -	tbi_startup_interrupt(TBID_SIGNUM_TRT);
>  	notify_cpu_starting(cpu);
> -	local_irq_enable();
> 
>  	pr_info("CPU%u (thread %u): Booted secondary processor\n",
>  		cpu, cpu_2_hwthread_id[cpu]);
> @@ -398,6 +393,12 @@ asmlinkage void secondary_start_kernel(void)
>  	set_cpu_online(cpu, true);
> 
>  	/*
> +	 * Enable local interrupts.
> +	 */
> +	tbi_startup_interrupt(TBID_SIGNUM_TRT);
> +	local_irq_enable();
> +
> +	/*
>  	 * OK, it's off to the idle thread for us
>  	 */
>  	cpu_startup_entry(CPUHP_ONLINE);
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/5] metag: smp: don't spin waiting for CPU to start
  2013-07-01 16:04 ` [PATCH 3/5] metag: smp: don't spin waiting for CPU to start James Hogan
  2013-07-01 21:46   ` Thomas Gleixner
@ 2013-07-02  6:12   ` Srivatsa S. Bhat
  1 sibling, 0 replies; 18+ messages in thread
From: Srivatsa S. Bhat @ 2013-07-02  6:12 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-kernel, Thomas Gleixner

On 07/01/2013 09:34 PM, James Hogan wrote:
> Use a completion to block until a secondary CPU has started up, like ARM
> do, instead of a loop of udelays.
> 
> On Meta, SMP is really SMT, with each "CPU" being a different hardware
> thread on the same Meta processor core, so as well as being more
> efficient and latency friendly, using a completion prevents the bogomips
> of the secondary CPU from being drastically skewed every time by the
> execution of the tight in-cache udelay loop on the other CPU.
> 
> Signed-off-by: James Hogan <james.hogan@imgtec.com>
> Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> ---

Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Regards,
Srivatsa S. Bhat


>  arch/metag/kernel/smp.c | 16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
> index 09979f2..e413875 100644
> --- a/arch/metag/kernel/smp.c
> +++ b/arch/metag/kernel/smp.c
> @@ -8,6 +8,7 @@
>   * published by the Free Software Foundation.
>   */
>  #include <linux/atomic.h>
> +#include <linux/completion.h>
>  #include <linux/delay.h>
>  #include <linux/init.h>
>  #include <linux/spinlock.h>
> @@ -62,6 +63,8 @@ static DEFINE_PER_CPU(struct ipi_data, ipi_data) = {
> 
>  static DEFINE_SPINLOCK(boot_lock);
> 
> +static DECLARE_COMPLETION(cpu_running);
> +
>  /*
>   * "thread" is assumed to be a valid Meta hardware thread ID.
>   */
> @@ -235,20 +238,12 @@ int __cpuinit __cpu_up(unsigned int cpu, struct task_struct *idle)
>  	 */
>  	ret = boot_secondary(thread, idle);
>  	if (ret == 0) {
> -		unsigned long timeout;
> -
>  		/*
>  		 * CPU was successfully started, wait for it
>  		 * to come online or time out.
>  		 */
> -		timeout = jiffies + HZ;
> -		while (time_before(jiffies, timeout)) {
> -			if (cpu_online(cpu))
> -				break;
> -
> -			udelay(10);
> -			barrier();
> -		}
> +		wait_for_completion_timeout(&cpu_running,
> +					    msecs_to_jiffies(1000));
> 
>  		if (!cpu_online(cpu))
>  			ret = -EIO;
> @@ -391,6 +386,7 @@ asmlinkage void secondary_start_kernel(void)
>  	 * OK, now it's safe to let the boot CPU continue
>  	 */
>  	set_cpu_online(cpu, true);
> +	complete(&cpu_running);
> 
>  	/*
>  	 * Enable local interrupts.
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/5] metag: cpu hotplug: route_irq: preserve irq mask
  2013-07-01 16:04 ` [PATCH 5/5] metag: cpu hotplug: route_irq: preserve irq mask James Hogan
  2013-07-01 21:51   ` Thomas Gleixner
@ 2013-07-02  6:16   ` Srivatsa S. Bhat
  2013-07-02  9:59     ` James Hogan
  1 sibling, 1 reply; 18+ messages in thread
From: Srivatsa S. Bhat @ 2013-07-02  6:16 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-kernel, Thomas Gleixner

On 07/01/2013 09:34 PM, James Hogan wrote:
> The route_irq() function needs to preserve the irq mask by using the
> _irqsave/irqrestore variants of raw spin lock functions instead of the
> _irq variants. This is because it is called from __cpu_disable() (via
> migrate_irqs()), which is called with IRQs disabled, so using the _irq
> variants re-enables IRQs.
> 
> This appears to have been causing occasional hits of the
> BUG_ON(!irqs_disabled()) in __irq_work_run() during CPU hotplug soak
> testing:
>   BUG: failure at kernel/irq_work.c:122/__irq_work_run()!
> 
> Signed-off-by: James Hogan <james.hogan@imgtec.com>
> ---

Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Regards,
Srivatsa S. Bhat

>  arch/metag/kernel/irq.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/metag/kernel/irq.c b/arch/metag/kernel/irq.c
> index d91b1e9..2a2c9d5 100644
> --- a/arch/metag/kernel/irq.c
> +++ b/arch/metag/kernel/irq.c
> @@ -279,11 +279,12 @@ static void route_irq(struct irq_data *data, unsigned int irq, unsigned int cpu)
>  {
>  	struct irq_desc *desc = irq_to_desc(irq);
>  	struct irq_chip *chip = irq_data_get_irq_chip(data);
> +	unsigned long flags;
> 
> -	raw_spin_lock_irq(&desc->lock);
> +	raw_spin_lock_irqsave(&desc->lock, flags);
>  	if (chip->irq_set_affinity)
>  		chip->irq_set_affinity(data, cpumask_of(cpu), false);
> -	raw_spin_unlock_irq(&desc->lock);
> +	raw_spin_unlock_irqrestore(&desc->lock, flags);
>  }
> 
>  /*
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/5] metag: cpu hotplug: route_irq: preserve irq mask
  2013-07-02  6:16   ` Srivatsa S. Bhat
@ 2013-07-02  9:59     ` James Hogan
  0 siblings, 0 replies; 18+ messages in thread
From: James Hogan @ 2013-07-02  9:59 UTC (permalink / raw)
  To: Srivatsa S. Bhat; +Cc: linux-kernel, Thomas Gleixner

On 02/07/13 07:16, Srivatsa S. Bhat wrote:
> On 07/01/2013 09:34 PM, James Hogan wrote:
>> The route_irq() function needs to preserve the irq mask by using the
>> _irqsave/irqrestore variants of raw spin lock functions instead of the
>> _irq variants. This is because it is called from __cpu_disable() (via
>> migrate_irqs()), which is called with IRQs disabled, so using the _irq
>> variants re-enables IRQs.
>>
>> This appears to have been causing occasional hits of the
>> BUG_ON(!irqs_disabled()) in __irq_work_run() during CPU hotplug soak
>> testing:
>>   BUG: failure at kernel/irq_work.c:122/__irq_work_run()!
>>
>> Signed-off-by: James Hogan <james.hogan@imgtec.com>
>> ---
> 
> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Thanks for taking the time to review these patches

Cheers
James


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2013-07-02 10:27 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-01 16:04 [PATCH 0/5] metag: smp/hotplug fixes James Hogan
2013-07-01 16:04 ` [PATCH 1/5] metag: use clear_tasks_mm_cpumask() James Hogan
2013-07-01 16:04 ` [PATCH 2/5] metag: smp: enable irqs after set_cpu_online James Hogan
2013-07-01 21:42   ` Thomas Gleixner
2013-07-02  5:57   ` Srivatsa S. Bhat
2013-07-01 16:04 ` [PATCH 3/5] metag: smp: don't spin waiting for CPU to start James Hogan
2013-07-01 21:46   ` Thomas Gleixner
2013-07-01 22:02     ` James Hogan
2013-07-02  6:12   ` Srivatsa S. Bhat
2013-07-01 16:04 ` [PATCH 4/5] metag: kick: prevent nested kick handlers James Hogan
2013-07-01 21:51   ` Thomas Gleixner
2013-07-01 22:17     ` James Hogan
2013-07-01 22:56       ` Thomas Gleixner
2013-07-02  0:00         ` James Hogan
2013-07-01 16:04 ` [PATCH 5/5] metag: cpu hotplug: route_irq: preserve irq mask James Hogan
2013-07-01 21:51   ` Thomas Gleixner
2013-07-02  6:16   ` Srivatsa S. Bhat
2013-07-02  9:59     ` James Hogan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.