* [PATCH] arm64: do not force irq affinity setting
@ 2014-06-26  6:49 Prashant Gaikwad
  2014-06-26 10:20 ` Will Deacon
  0 siblings, 1 reply; 7+ messages in thread
From: Prashant Gaikwad @ 2014-06-26  6:49 UTC (permalink / raw)
  To: linux-arm-kernel

Unconditional copying cpu_online_mask to affinity
may result in migrating affinity to wrong CPU.

For example, IRQ 5 affinity mask contains CPU 4-7,
it was affined to CPU4 and CPU 0-7 are online.
Now if we hot-unplug CPU4 then with current
implementation affinity mask will contain
CPU 0-3,5-7 and IRQ 5 will be affined to CPU0.

Instead, copy cpu_online_mask to affinity only if
no online CPU is present in the affinity mask, and
do not force the affinity setting, so that the
irqchip driver does the CPU online check.

Signed-off-by: Prashant Gaikwad <pgaikwad@nvidia.com>
---
 arch/arm64/kernel/irq.c |   12 ++++--------
 1 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 0f08dfd..dfa6e3e 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -97,19 +97,15 @@ static bool migrate_one_irq(struct irq_desc *desc)
 	if (irqd_is_per_cpu(d) || !cpumask_test_cpu(smp_processor_id(), affinity))
 		return false;
 
-	if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids)
+	if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) {
+		affinity = cpu_online_mask;
 		ret = true;
+	}
 
-	/*
-	 * when using forced irq_set_affinity we must ensure that the cpu
-	 * being offlined is not present in the affinity mask, it may be
-	 * selected as the target CPU otherwise
-	 */
-	affinity = cpu_online_mask;
 	c = irq_data_get_irq_chip(d);
 	if (!c->irq_set_affinity)
 		pr_debug("IRQ%u: unable to set affinity\n", d->irq);
-	else if (c->irq_set_affinity(d, affinity, true) == IRQ_SET_MASK_OK && ret)
+	else if (c->irq_set_affinity(d, affinity, false) == IRQ_SET_MASK_OK && ret)
 		cpumask_copy(d->affinity, affinity);
 
 	return ret;
-- 
1.7.4.1
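
To make the example in the commit message concrete, here is a minimal userspace sketch, not kernel code. It assumes eight CPUs modelled as bits of a uint8_t. The names first_cpu, mask_before_patch, mask_after_patch and NR_CPUS_MODEL are hypothetical. It only models which mask is handed to the irqchip callback before and after the diff above.

```c
#include <stdint.h>

/* Hypothetical model: one bit per CPU, eight CPUs. Not kernel code. */
#define NR_CPUS_MODEL 8

/*
 * Index of the lowest set bit, or NR_CPUS_MODEL if the mask is empty.
 * This mirrors the ">= nr_cpu_ids" convention used in the diff.
 */
static unsigned int first_cpu(uint8_t m)
{
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		if (m & (1u << cpu))
			return cpu;
	return NR_CPUS_MODEL;
}

/* Before the patch: the online mask is always handed to the irqchip. */
static uint8_t mask_before_patch(uint8_t affinity, uint8_t online)
{
	(void)affinity;
	return online;
}

/*
 * After the patch: fall back to the online mask only when no CPU in
 * the affinity mask is still online.
 */
static uint8_t mask_after_patch(uint8_t affinity, uint8_t online)
{
	if (first_cpu(affinity & online) >= NR_CPUS_MODEL)
		return online;
	return affinity;
}
```

With affinity 0xf0 and CPU4 already cleared from the online mask (0xef), the pre-patch path hands over 0xef, whose lowest CPU is CPU0, while the patched path keeps 0xf0.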


* [PATCH] arm64: do not force irq affinity setting
  2014-06-26  6:49 [PATCH] arm64: do not force irq affinity setting Prashant Gaikwad
@ 2014-06-26 10:20 ` Will Deacon
  2014-06-26 12:00   ` Prashant Gaikwad
  2014-06-26 13:45   ` Sudeep Holla
  0 siblings, 2 replies; 7+ messages in thread
From: Will Deacon @ 2014-06-26 10:20 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Thu, Jun 26, 2014 at 07:49:55AM +0100, Prashant Gaikwad wrote:
> Unconditional copying cpu_online_mask to affinity
> may result in migrating affinity to wrong CPU.

We have a bug, but I don't follow your reasoning.

> For example, IRQ 5 affinity mask contains CPU 4-7,

Ok, so d->affinity is 0xf0...

> it was affined to CPU4 and CPU 0-7 are online.

...and cpu_online_mask is 0xff.

> Now if we hot-unplug CPU4 then with current
> implementation affinity mask will contain
> CPU 0-3,5-7 and IRQ 5 will be affined to CPU0.

cpumask_any_and(affinity, cpu_online_mask) will give return < nr_cpu_ids
since there is an intersection of 0xf0. That means ret is false.

The bug is that we then do affinity = cpu_online_mask; unconditionally,
but we *won't* do the cpumask_copy, since ret is false.

You can fix this by simply bringing the arm64 code into line with the arm
code, which begs the question as to why this has to exist in the arch/
backend at all!

Will
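
The point about ret and cpumask_copy can be illustrated with a small userspace model, offered as a sketch rather than the kernel code. It assumes eight CPUs as bits of a uint8_t and assumes the irqchip callback always succeeds. struct irq_model and migrate_one_irq_old are hypothetical names, and the early per-CPU and current-CPU checks from the real function are omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state touched by the pre-patch code path. */
struct irq_model {
	uint8_t affinity;   /* stands in for d->affinity */
	uint8_t chip_mask;  /* last mask handed to the irqchip callback */
};

/*
 * Models the pre-patch flow: ret is set only when the affinity mask
 * has no online CPU, but the online mask is passed to the irqchip in
 * every case.
 */
static bool migrate_one_irq_old(struct irq_model *d, uint8_t online)
{
	uint8_t affinity = d->affinity;
	bool ret = false;

	if ((affinity & online) == 0)
		ret = true;

	affinity = online;        /* the unconditional assignment */
	d->chip_mask = affinity;  /* assumption: the callback succeeds */
	if (ret)
		d->affinity = affinity;  /* models cpumask_copy() */

	return ret;
}
```

For affinity 0xf0 and online mask 0xef, ret is false and the recorded affinity stays 0xf0, yet the irqchip is still given 0xef. The recorded mask and the mask used for routing therefore diverge.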


* [PATCH] arm64: do not force irq affinity setting
  2014-06-26 10:20 ` Will Deacon
@ 2014-06-26 12:00   ` Prashant Gaikwad
  2014-06-26 13:11     ` Will Deacon
  2014-06-26 13:45   ` Sudeep Holla
  1 sibling, 1 reply; 7+ messages in thread
From: Prashant Gaikwad @ 2014-06-26 12:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2014-06-26 at 15:50 +0530, Will Deacon wrote:
> Hello,
> 
> On Thu, Jun 26, 2014 at 07:49:55AM +0100, Prashant Gaikwad wrote:
> > Unconditional copying cpu_online_mask to affinity
> > may result in migrating affinity to wrong CPU.
> 
> We have a bug, but I don't follow your reasoning.
> 
> > For example, IRQ 5 affinity mask contains CPU 4-7,
> 
> Ok, so d->affinity is 0xf0...
> 
> > it was affined to CPU4 and CPU 0-7 are online.
> 
> ...and cpu_online_mask is 0xff.
> 
> > Now if we hot-unplug CPU4 then with current
> > implementation affinity mask will contain
> > CPU 0-3,5-7 and IRQ 5 will be affined to CPU0.
> 
> cpumask_any_and(affinity, cpu_online_mask) will give return < nr_cpu_ids
> since there is an intersection of 0xf0. That means ret is false.
> 
> The bug is that we then do affinity = cpu_online_mask; unconditionally,
> but we *won't* do the cpumask_copy, since ret is false.
> 

We do not copy but the affinity mask passed to irq_set_affinity function
is nothing but cpu_online_mask. So in GIC it will set affinity to CPU0.

> You can fix this by simply bringing the arm64 code into line with the arm
> code, which begs the question as to why this has to exist in the arch/
> backend at all!

Where can we move this code?

> 
> Will


* [PATCH] arm64: do not force irq affinity setting
  2014-06-26 12:00   ` Prashant Gaikwad
@ 2014-06-26 13:11     ` Will Deacon
  2014-06-26 13:40       ` Prashant Gaikwad
  0 siblings, 1 reply; 7+ messages in thread
From: Will Deacon @ 2014-06-26 13:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jun 26, 2014 at 01:00:24PM +0100, Prashant Gaikwad wrote:
> On Thu, 2014-06-26 at 15:50 +0530, Will Deacon wrote:
> > On Thu, Jun 26, 2014 at 07:49:55AM +0100, Prashant Gaikwad wrote:
> > > Unconditional copying cpu_online_mask to affinity
> > > may result in migrating affinity to wrong CPU.
> > 
> > We have a bug, but I don't follow your reasoning.
> > 
> > > For example, IRQ 5 affinity mask contains CPU 4-7,
> > 
> > Ok, so d->affinity is 0xf0...
> > 
> > > it was affined to CPU4 and CPU 0-7 are online.
> > 
> > ...and cpu_online_mask is 0xff.
> > 
> > > Now if we hot-unplug CPU4 then with current
> > > implementation affinity mask will contain
> > > CPU 0-3,5-7 and IRQ 5 will be affined to CPU0.
> > 
> > cpumask_any_and(affinity, cpu_online_mask) will give return < nr_cpu_ids
> > since there is an intersection of 0xf0. That means ret is false.
> > 
> > The bug is that we then do affinity = cpu_online_mask; unconditionally,
> > but we *won't* do the cpumask_copy, since ret is false.
> > 
> 
> We do not copy but the affinity mask passed to irq_set_affinity function
> is nothing but cpu_online_mask. So in GIC it will set affinity to CPU0.

Exactly, but your proposed patch changed more than that.

> > You can fix this by simply bringing the arm64 code into line with the arm
> > code, which begs the question as to why this has to exist in the arch/
> > backend at all!
> 
> Where can we move this code?

kernel/irq/migration.c?

Will


* [PATCH] arm64: do not force irq affinity setting
  2014-06-26 13:11     ` Will Deacon
@ 2014-06-26 13:40       ` Prashant Gaikwad
  2014-06-26 14:04         ` Sudeep Holla
  0 siblings, 1 reply; 7+ messages in thread
From: Prashant Gaikwad @ 2014-06-26 13:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2014-06-26 at 18:41 +0530, Will Deacon wrote:
> On Thu, Jun 26, 2014 at 01:00:24PM +0100, Prashant Gaikwad wrote:
> > On Thu, 2014-06-26 at 15:50 +0530, Will Deacon wrote:
> > > On Thu, Jun 26, 2014 at 07:49:55AM +0100, Prashant Gaikwad wrote:
> > > > Unconditional copying cpu_online_mask to affinity
> > > > may result in migrating affinity to wrong CPU.
> > > 
> > > We have a bug, but I don't follow your reasoning.
> > > 
> > > > For example, IRQ 5 affinity mask contains CPU 4-7,
> > > 
> > > Ok, so d->affinity is 0xf0...
> > > 
> > > > it was affined to CPU4 and CPU 0-7 are online.
> > > 
> > > ...and cpu_online_mask is 0xff.
> > > 
> > > > Now if we hot-unplug CPU4 then with current
> > > > implementation affinity mask will contain
> > > > CPU 0-3,5-7 and IRQ 5 will be affined to CPU0.
> > > 
> > > cpumask_any_and(affinity, cpu_online_mask) will give return < nr_cpu_ids
> > > since there is an intersection of 0xf0. That means ret is false.
> > > 
> > > The bug is that we then do affinity = cpu_online_mask; unconditionally,
> > > but we *won't* do the cpumask_copy, since ret is false.
> > > 
> > 
> > We do not copy but the affinity mask passed to irq_set_affinity function
> > is nothing but cpu_online_mask. So in GIC it will set affinity to CPU0.
> 
> Exactly, but your proposed patch changed more than that.
> 

I am changing the force flag to false. That is because after I fix this
behavior we have another bug where the IRQ affinity is set to offline
CPU.

When cpumask_any_and(affinity, cpu_online_mask) returns < nr_cpu_ids, we
pass the affinity mask as it is, which contains the offline CPU too. If
the force flag is true, the GIC driver skips the online CPU check. So if
CPU0 is going down, the affinity mask will contain CPU0 and the GIC
driver will keep the affinity on CPU0.

Changing the force flag to false ensures that the GIC driver checks for
an online CPU.
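
The effect of the force flag described above can be sketched in userspace as follows. This is an illustrative model based only on the description in this message, not the GIC driver source. lowest_cpu, pick_target and NR_CPUS_MODEL are hypothetical names, CPUs are bits of a uint8_t, and "any" CPU is assumed to mean the lowest-numbered one.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_MODEL 8

/* Index of the lowest set bit, or NR_CPUS_MODEL if the mask is empty. */
static unsigned int lowest_cpu(uint8_t m)
{
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		if (m & (1u << cpu))
			return cpu;
	return NR_CPUS_MODEL;
}

/*
 * Hypothetical stand-in for the irqchip's target selection:
 * forced     -> take the first CPU in the mask, online or not;
 * not forced -> take a CPU that is both in the mask and online.
 */
static unsigned int pick_target(uint8_t mask, uint8_t online, bool force)
{
	if (force)
		return lowest_cpu(mask);
	return lowest_cpu(mask & online);
}
```

With an affinity mask of 0x0f and CPU0 going down (online mask 0xfe), the forced call still selects CPU0, while the non-forced call selects CPU1.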

> > > You can fix this by simply bringing the arm64 code into line with the arm
> > > code, which begs the question as to why this has to exist in the arch/
> > > backend at all!
> > 
> > Where can we move this code?
> 
> kernel/irq/migration.c?
> 
> Will


* [PATCH] arm64: do not force irq affinity setting
  2014-06-26 10:20 ` Will Deacon
  2014-06-26 12:00   ` Prashant Gaikwad
@ 2014-06-26 13:45   ` Sudeep Holla
  1 sibling, 0 replies; 7+ messages in thread
From: Sudeep Holla @ 2014-06-26 13:45 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

On 26/06/14 11:20, Will Deacon wrote:
> Hello,
>
> On Thu, Jun 26, 2014 at 07:49:55AM +0100, Prashant Gaikwad wrote:
>> Unconditional copying cpu_online_mask to affinity
>> may result in migrating affinity to wrong CPU.
>
> We have a bug, but I don't follow your reasoning.
>
>> For example, IRQ 5 affinity mask contains CPU 4-7,
>
> Ok, so d->affinity is 0xf0...
>
>> it was affined to CPU4 and CPU 0-7 are online.
>
> ...and cpu_online_mask is 0xff.
>
>> Now if we hot-unplug CPU4 then with current
>> implementation affinity mask will contain
>> CPU 0-3,5-7 and IRQ 5 will be affined to CPU0.
>
> cpumask_any_and(affinity, cpu_online_mask) will give return < nr_cpu_ids
> since there is an intersection of 0xf0. That means ret is false.
>
> The bug is that we then do affinity = cpu_online_mask; unconditionally,
> but we *won't* do the cpumask_copy, since ret is false.
>
> You can fix this by simply bringing the arm64 code into line with the arm
> code, which begs the question as to why this has to exist in the arch/
> backend at all!
>

I added the unconditional assignment to fix the CPU0 hotplug issue explained
in commit 601c942176d8. That fix is wrong, as is evident from the above use
case. It was added to retain the forced irq_set_affinity. The difference
between arm and arm64 exists because arm doesn't have the patch [1].

We can move to irq_set_affinity without the force option, as this patch does.
I had mentioned a similar solution [2], but Russell wants to get feedback from
tglx [3].

And yes, I see similar implementations for many architectures; they can
definitely be unified.

Regards,
Sudeep

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-May/254838.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-May/259255.html
[3] http://www.spinics.net/lists/arm-kernel/msg340279.html


* [PATCH] arm64: do not force irq affinity setting
  2014-06-26 13:40       ` Prashant Gaikwad
@ 2014-06-26 14:04         ` Sudeep Holla
  0 siblings, 0 replies; 7+ messages in thread
From: Sudeep Holla @ 2014-06-26 14:04 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 26/06/14 14:40, Prashant Gaikwad wrote:
> On Thu, 2014-06-26 at 18:41 +0530, Will Deacon wrote:
>> On Thu, Jun 26, 2014 at 01:00:24PM +0100, Prashant Gaikwad wrote:
>>> On Thu, 2014-06-26 at 15:50 +0530, Will Deacon wrote:
>>>> On Thu, Jun 26, 2014 at 07:49:55AM +0100, Prashant Gaikwad wrote:
>>>>> Unconditional copying cpu_online_mask to affinity
>>>>> may result in migrating affinity to wrong CPU.
>>>>
>>>> We have a bug, but I don't follow your reasoning.
>>>>
>>>>> For example, IRQ 5 affinity mask contains CPU 4-7,
>>>>
>>>> Ok, so d->affinity is 0xf0...
>>>>
>>>>> it was affined to CPU4 and CPU 0-7 are online.
>>>>
>>>> ...and cpu_online_mask is 0xff.
>>>>
>>>>> Now if we hot-unplug CPU4 then with current
>>>>> implementation affinity mask will contain
>>>>> CPU 0-3,5-7 and IRQ 5 will be affined to CPU0.
>>>>
>>>> cpumask_any_and(affinity, cpu_online_mask) will give return < nr_cpu_ids
>>>> since there is an intersection of 0xf0. That means ret is false.
>>>>
>>>> The bug is that we then do affinity = cpu_online_mask; unconditionally,
>>>> but we *won't* do the cpumask_copy, since ret is false.
>>>>
>>>
>>> We do not copy but the affinity mask passed to irq_set_affinity function
>>> is nothing but cpu_online_mask. So in GIC it will set affinity to CPU0.
>>
>> Exactly, but your proposed patch changed more than that.
>>
>
> I am changing the force flag to false. That is because after I fix this
> behavior we have another bug where the IRQ affinity is set to offline
> CPU.
>

That's correct, it's the original issue I saw and fixed incorrectly which
triggered the bug you have now.

The main reason to retain the force flag as true is that the implementation is
irqchip specific. GIC behaves the way you explained, but what if some other
irqchip implementation does something different?

I believe that's the reason why Russell wants to get feedback from tglx.

Regards,
Sudeep

