* [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask @ 2012-09-26 17:38 Chuansheng Liu 2012-09-26 8:49 ` Srivatsa S. Bhat 2012-09-26 23:45 ` Chuansheng Liu 0 siblings, 2 replies; 15+ messages in thread From: Chuansheng Liu @ 2012-09-26 17:38 UTC (permalink / raw) To: srivatsa.bhat, tglx, mingo, x86; +Cc: linux-kernel, yanmin_zhang When one CPU is going offline, and fixup_irqs() will re-set the irq affinity in some cases, we should clean the offlining CPU from the irq affinity. The reason is setting offlining CPU as of the affinity is useless. Moreover, the smp_affinity value will be confusing when the offlining CPU come back again. Example: For irq 93 with 4 CPUS, the default affinity f(1111), normal cases: 4 CPUS will receive the irq93 interrupts. When echo 0 > /sys/devices/system/cpu/cpu3/online, just CPU0,1,2 will receive the interrupts. But after the CPU3 is online again, we will not set affinity,the result will be: the smp_affinity is f, but still just CPU0,1,2 can receive the interrupts. So we should clean the offlining CPU from irq affinity mask in fixup_irqs(). Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: liu chuansheng <chuansheng.liu@intel.com> --- arch/x86/kernel/irq.c | 8 ++++++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index d44f782..08bb905 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -239,6 +239,7 @@ void fixup_irqs(void) struct irq_desc *desc; struct irq_data *data; struct irq_chip *chip; + int cpu = smp_processor_id(); for_each_irq_desc(irq, desc) { int break_affinity = 0; @@ -277,8 +278,11 @@ void fixup_irqs(void) if (!irqd_can_move_in_process_context(data) && chip->irq_mask) chip->irq_mask(data); - if (chip->irq_set_affinity) - chip->irq_set_affinity(data, affinity, true); + if ((chip->irq_set_affinity) && + !chip->irq_set_affinity(data, affinity, true)) { + if (cpumask_test_cpu(cpu, data->affinity)) + cpumask_clear_cpu(cpu, data->affinity); + } else if (!(warned++)) set_affinity = 0; -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-26 17:38 [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask Chuansheng Liu @ 2012-09-26 8:49 ` Srivatsa S. Bhat 2012-09-26 8:51 ` Liu, Chuansheng 2012-09-26 8:56 ` Liu, Chuansheng 2012-09-26 23:45 ` Chuansheng Liu 1 sibling, 2 replies; 15+ messages in thread From: Srivatsa S. Bhat @ 2012-09-26 8:49 UTC (permalink / raw) To: Chuansheng Liu; +Cc: tglx, mingo, x86, linux-kernel, yanmin_zhang On 09/26/2012 11:08 PM, Chuansheng Liu wrote: > > When one CPU is going offline, and fixup_irqs() will re-set the > irq affinity in some cases, we should clean the offlining CPU from > the irq affinity. > > The reason is setting offlining CPU as of the affinity is useless. > Moreover, the smp_affinity value will be confusing when the > offlining CPU come back again. > > Example: > For irq 93 with 4 CPUS, the default affinity f(1111), > normal cases: 4 CPUS will receive the irq93 interrupts. > > When echo 0 > /sys/devices/system/cpu/cpu3/online, just CPU0,1,2 will > receive the interrupts. > > But after the CPU3 is online again, we will not set affinity,the result > will be: > the smp_affinity is f, but still just CPU0,1,2 can receive the interrupts. > > So we should clean the offlining CPU from irq affinity mask > in fixup_irqs(). > > Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Please hold on.. I'm not yet done reviewing, I might have more comments :-) > Signed-off-by: liu chuansheng <chuansheng.liu@intel.com> > --- > arch/x86/kernel/irq.c | 8 ++++++-- > 1 files changed, 6 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c > index d44f782..08bb905 100644 > --- a/arch/x86/kernel/irq.c > +++ b/arch/x86/kernel/irq.c > @@ -239,6 +239,7 @@ void fixup_irqs(void) > struct irq_desc *desc; > struct irq_data *data; > struct irq_chip *chip; > + int cpu = smp_processor_id(); > > for_each_irq_desc(irq, desc) { > int break_affinity = 0; > @@ -277,8 +278,11 @@ void fixup_irqs(void) > if (!irqd_can_move_in_process_context(data) && chip->irq_mask) > chip->irq_mask(data); > > - if (chip->irq_set_affinity) > - chip->irq_set_affinity(data, affinity, true); > + if ((chip->irq_set_affinity) && > + !chip->irq_set_affinity(data, affinity, true)) { A return value of 0 and 1 are acceptable. So this check isn't correct. Regards, Srivatsa S. Bhat > + if (cpumask_test_cpu(cpu, data->affinity)) > + cpumask_clear_cpu(cpu, data->affinity); OMG, why did you drop the other hunk which cleared the cpu *before* invoking ->irq_set_affinity()? IMO, altering irq affinity involves more work than just altering the mask; that's why you have that ->irq_set_affinity() function. So, if you alter the mask *after* calling ->irq_set_affinity(), its not right.. Regards, Srivatsa S. Bhat > + } > else if (!(warned++)) > set_affinity = 0; > ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-26 8:49 ` Srivatsa S. Bhat @ 2012-09-26 8:51 ` Liu, Chuansheng 2012-09-26 8:56 ` Liu, Chuansheng 1 sibling, 0 replies; 15+ messages in thread From: Liu, Chuansheng @ 2012-09-26 8:51 UTC (permalink / raw) To: Srivatsa S. Bhat; +Cc: tglx, mingo, x86, linux-kernel, yanmin_zhang [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="utf-8", Size: 267 bytes --] > Please hold on.. I'm not yet done reviewing, I might have more comments :-) Sure, welcome, thanks again. ÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥ ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-26 8:49 ` Srivatsa S. Bhat 2012-09-26 8:51 ` Liu, Chuansheng @ 2012-09-26 8:56 ` Liu, Chuansheng 2012-09-26 9:02 ` Srivatsa S. Bhat 1 sibling, 1 reply; 15+ messages in thread From: Liu, Chuansheng @ 2012-09-26 8:56 UTC (permalink / raw) To: Srivatsa S. Bhat; +Cc: tglx, mingo, x86, linux-kernel, yanmin_zhang [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="utf-8", Size: 681 bytes --] > A return value of 0 and 1 are acceptable. So this check isn't correct. > > Regards, > Srivatsa S. Bhat > Which case value 1 is acceptable, could you share? Thanks. > OMG, why did you drop the other hunk which cleared the cpu *before* > invoking ->irq_set_affinity()? IMO, altering irq affinity involves more work > than just altering the mask; that's why you have that ->irq_set_affinity() > function. So, if you alter the mask *after* calling ->irq_set_affinity(), > its not right.. Sorry the mistake, will update. ÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-26 8:56 ` Liu, Chuansheng @ 2012-09-26 9:02 ` Srivatsa S. Bhat 0 siblings, 0 replies; 15+ messages in thread From: Srivatsa S. Bhat @ 2012-09-26 9:02 UTC (permalink / raw) To: Liu, Chuansheng; +Cc: tglx, mingo, x86, linux-kernel, yanmin_zhang On 09/26/2012 02:26 PM, Liu, Chuansheng wrote: >> A return value of 0 and 1 are acceptable. So this check isn't correct. >> >> Regards, >> Srivatsa S. Bhat >> > Which case value 1 is acceptable, could you share? Thanks. I can see the following in include/linux/irq.h: /* * Return value for chip->irq_set_affinity() * * IRQ_SET_MASK_OK - OK, core updates irq_data.affinity * IRQ_SET_MASK_NOCPY - OK, chip did update irq_data.affinity */ enum { IRQ_SET_MASK_OK = 0, IRQ_SET_MASK_OK_NOCOPY, }; And see some of those ->irq_set_affinity() implementations at various places. Regards, Srivatsa S. Bhat > >> OMG, why did you drop the other hunk which cleared the cpu *before* >> invoking ->irq_set_affinity()? IMO, altering irq affinity involves more work >> than just altering the mask; that's why you have that ->irq_set_affinity() >> function. So, if you alter the mask *after* calling ->irq_set_affinity(), >> its not right.. > Sorry the mistake, will update. > ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-26 17:38 [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask Chuansheng Liu 2012-09-26 8:49 ` Srivatsa S. Bhat @ 2012-09-26 23:45 ` Chuansheng Liu 2012-09-26 15:47 ` Srivatsa S. Bhat 2012-09-26 16:03 ` Srivatsa S. Bhat 1 sibling, 2 replies; 15+ messages in thread From: Chuansheng Liu @ 2012-09-26 23:45 UTC (permalink / raw) To: srivatsa.bhat, tglx, mingo, x86 Cc: linux-kernel, yanmin_zhang, chuansheng.liu When one CPU is going offline, and fixup_irqs() will re-set the irq affinity in some cases, we should clean the offlining CPU from the irq affinity. The reason is setting offlining CPU as of the affinity is useless. Moreover, the smp_affinity value will be confusing when the offlining CPU come back again. Example: For irq 93 with 4 CPUS, the default affinity f(1111), normal cases: 4 CPUS will receive the irq93 interrupts. When echo 0 > /sys/devices/system/cpu/cpu3/online, just CPU0,1,2 will receive the interrupts. But after the CPU3 is online again, we will not set affinity,the result will be: the smp_affinity is f, but still just CPU0,1,2 can receive the interrupts. So we should clean the offlining CPU from irq affinity mask in fixup_irqs(). Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: liu chuansheng <chuansheng.liu@intel.com> --- arch/x86/kernel/irq.c | 21 +++++++++++++++++---- 1 files changed, 17 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index d44f782..ead0807 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -239,10 +239,13 @@ void fixup_irqs(void) struct irq_desc *desc; struct irq_data *data; struct irq_chip *chip; + int cpu = smp_processor_id(); for_each_irq_desc(irq, desc) { int break_affinity = 0; int set_affinity = 1; + bool set_ret = false; + const struct cpumask *affinity; if (!desc) @@ -256,7 +259,8 @@ void fixup_irqs(void) data = irq_desc_get_irq_data(desc); affinity = data->affinity; if (!irq_has_action(irq) || irqd_is_per_cpu(data) || - cpumask_subset(affinity, cpu_online_mask)) { + cpumask_subset(affinity, cpu_online_mask) || + !cpumask_test_cpu(cpu, data->affinity)) { raw_spin_unlock(&desc->lock); continue; } @@ -277,9 +281,18 @@ void fixup_irqs(void) if (!irqd_can_move_in_process_context(data) && chip->irq_mask) chip->irq_mask(data); - if (chip->irq_set_affinity) - chip->irq_set_affinity(data, affinity, true); - else if (!(warned++)) + if (chip->irq_set_affinity) { + struct cpumask mask; + cpumask_copy(&mask, affinity); + cpumask_clear_cpu(cpu, &mask); + switch (chip->irq_set_affinity(data, &mask, true)) { + case IRQ_SET_MASK_OK: + cpumask_copy(data->affinity, &mask); + case IRQ_SET_MASK_OK_NOCOPY: + set_ret = true; + } + } + if ((!set_ret) && !(warned++)) set_affinity = 0; /* -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-26 23:45 ` Chuansheng Liu @ 2012-09-26 15:47 ` Srivatsa S. Bhat 2012-09-26 16:03 ` Srivatsa S. Bhat 1 sibling, 0 replies; 15+ messages in thread From: Srivatsa S. Bhat @ 2012-09-26 15:47 UTC (permalink / raw) To: Chuansheng Liu Cc: tglx, mingo, x86, linux-kernel, yanmin_zhang, Suresh Siddha, Paul E. McKenney On 09/27/2012 05:15 AM, Chuansheng Liu wrote: > > When one CPU is going offline, and fixup_irqs() will re-set the > irq affinity in some cases, we should clean the offlining CPU from > the irq affinity. > > The reason is setting offlining CPU as of the affinity is useless. > Moreover, the smp_affinity value will be confusing when the > offlining CPU come back again. > > Example: > For irq 93 with 4 CPUS, the default affinity f(1111), > normal cases: 4 CPUS will receive the irq93 interrupts. > > When echo 0 > /sys/devices/system/cpu/cpu3/online, just CPU0,1,2 will > receive the interrupts. > > But after the CPU3 is online again, we will not set affinity,the result > will be: > the smp_affinity is f, but still just CPU0,1,2 can receive the interrupts. > > So we should clean the offlining CPU from irq affinity mask > in fixup_irqs(). > > Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> :-) OK, so here is the general rule: You shouldn't automatically add Reviewed-by tags.. You can include them only if the reviewer _explicitly_ lets you know that he is fine with the patch. Often, review happens in multiple iterations/stages. So just because you addressed all the review comments raised in iteration 'n' doesn't mean there won't be issues in iteration 'n+1', perhaps because the way you addressed the concern might not be the best approach.. or the reviewer might find more issues in iteration 'n+1' which he might have over-looked in iteration 'n'. So please refrain from adding such tags automatically! > Signed-off-by: liu chuansheng <chuansheng.liu@intel.com> > --- > arch/x86/kernel/irq.c | 21 +++++++++++++++++---- > 1 files changed, 17 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c > index d44f782..ead0807 100644 > --- a/arch/x86/kernel/irq.c > +++ b/arch/x86/kernel/irq.c > @@ -239,10 +239,13 @@ void fixup_irqs(void) > struct irq_desc *desc; > struct irq_data *data; > struct irq_chip *chip; > + int cpu = smp_processor_id(); > > for_each_irq_desc(irq, desc) { > int break_affinity = 0; > int set_affinity = 1; > + bool set_ret = false; > + > const struct cpumask *affinity; > > if (!desc) > @@ -256,7 +259,8 @@ void fixup_irqs(void) > data = irq_desc_get_irq_data(desc); > affinity = data->affinity; > if (!irq_has_action(irq) || irqd_is_per_cpu(data) || > - cpumask_subset(affinity, cpu_online_mask)) { > + cpumask_subset(affinity, cpu_online_mask) || > + !cpumask_test_cpu(cpu, data->affinity)) { This last check is superfluous, because it already checks if 'affinity' is a subset of cpu_online_mask. Note that this cpu was already removed from the cpu_online_mask before coming here. > raw_spin_unlock(&desc->lock); > continue; > } > @@ -277,9 +281,18 @@ void fixup_irqs(void) > if (!irqd_can_move_in_process_context(data) && chip->irq_mask) > chip->irq_mask(data); > > - if (chip->irq_set_affinity) > - chip->irq_set_affinity(data, affinity, true); > - else if (!(warned++)) > + if (chip->irq_set_affinity) { > + struct cpumask mask; It is good to avoid allocating huge cpumask bitmasks like this on stack. If we really can't do without a temp mask, you could perhaps do something like: cpumask_var_t mask; alloc_cpumask_var(&mask, GFP_ATOMIC); > + cpumask_copy(&mask, affinity); > + cpumask_clear_cpu(cpu, &mask); > + switch (chip->irq_set_affinity(data, &mask, true)) { > + case IRQ_SET_MASK_OK: > + cpumask_copy(data->affinity, &mask); This is again not required. __ioapic_set_affinity() copies the mask for you. (And __ioapic_set_affinity() is called in every ->irq_set_affinity implementation, if I read the source code correctly). Regards, Srivatsa S. Bhat > + case IRQ_SET_MASK_OK_NOCOPY: > + set_ret = true; > + } > + } > + if ((!set_ret) && !(warned++)) > set_affinity = 0; > > /* > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-26 23:45 ` Chuansheng Liu 2012-09-26 15:47 ` Srivatsa S. Bhat @ 2012-09-26 16:03 ` Srivatsa S. Bhat 2012-09-26 17:06 ` Suresh Siddha 1 sibling, 1 reply; 15+ messages in thread From: Srivatsa S. Bhat @ 2012-09-26 16:03 UTC (permalink / raw) To: Chuansheng Liu Cc: tglx, mingo, x86, linux-kernel, yanmin_zhang, Paul E. McKenney, Suresh Siddha, Peter Zijlstra, rusty On 09/27/2012 05:15 AM, Chuansheng Liu wrote: > > When one CPU is going offline, and fixup_irqs() will re-set the > irq affinity in some cases, we should clean the offlining CPU from > the irq affinity. > > The reason is setting offlining CPU as of the affinity is useless. > Moreover, the smp_affinity value will be confusing when the > offlining CPU come back again. > > Example: > For irq 93 with 4 CPUS, the default affinity f(1111), > normal cases: 4 CPUS will receive the irq93 interrupts. > > When echo 0 > /sys/devices/system/cpu/cpu3/online, just CPU0,1,2 will > receive the interrupts. > > But after the CPU3 is online again, we will not set affinity,the result > will be: > the smp_affinity is f, but still just CPU0,1,2 can receive the interrupts. > > So we should clean the offlining CPU from irq affinity mask > in fixup_irqs(). > I have some fundamental questions here: 1. Why was the CPU never removed from the affinity masks in the original code? I find it hard to believe that it was just an oversight, because the whole point of fixup_irqs() is to affine the interrupts to other CPUs, IIUC. So, is that really a bug or is the existing code correct for some reason which I don't know of? 2. In case this is indeed a bug, why are the warnings ratelimited when the interrupts can't be affined to other CPUs? Are they not serious enough to report? Put more strongly, why do we even silently return with a warning instead of reporting that the CPU offline operation failed?? Is that because we have come way too far in the hotplug sequence and we can't easily roll back? Or are we still actually OK in that situation? Suresh, I'd be grateful if you could kindly throw some light on these issues... I'm actually debugging an issue where an offline CPU gets apic timer interrupts (and in one case, I even saw a device interrupt), which I have reported in another thread at: https://lkml.org/lkml/2012/9/26/119 But this issue in fixup_irqs() that Liu brought to light looks even more surprising to me.. Regards, Srivatsa S. Bhat > --- > arch/x86/kernel/irq.c | 21 +++++++++++++++++---- > 1 files changed, 17 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c > index d44f782..ead0807 100644 > --- a/arch/x86/kernel/irq.c > +++ b/arch/x86/kernel/irq.c > @@ -239,10 +239,13 @@ void fixup_irqs(void) > struct irq_desc *desc; > struct irq_data *data; > struct irq_chip *chip; > + int cpu = smp_processor_id(); > > for_each_irq_desc(irq, desc) { > int break_affinity = 0; > int set_affinity = 1; > + bool set_ret = false; > + > const struct cpumask *affinity; > > if (!desc) > @@ -256,7 +259,8 @@ void fixup_irqs(void) > data = irq_desc_get_irq_data(desc); > affinity = data->affinity; > if (!irq_has_action(irq) || irqd_is_per_cpu(data) || > - cpumask_subset(affinity, cpu_online_mask)) { > + cpumask_subset(affinity, cpu_online_mask) || > + !cpumask_test_cpu(cpu, data->affinity)) { > raw_spin_unlock(&desc->lock); > continue; > } > @@ -277,9 +281,18 @@ void fixup_irqs(void) > if (!irqd_can_move_in_process_context(data) && chip->irq_mask) > chip->irq_mask(data); > > - if (chip->irq_set_affinity) > - chip->irq_set_affinity(data, affinity, true); > - else if (!(warned++)) > + if (chip->irq_set_affinity) { > + struct cpumask mask; > + cpumask_copy(&mask, affinity); > + cpumask_clear_cpu(cpu, &mask); > + switch (chip->irq_set_affinity(data, &mask, true)) { > + case IRQ_SET_MASK_OK: > + cpumask_copy(data->affinity, &mask); > + case IRQ_SET_MASK_OK_NOCOPY: > + set_ret = true; > + } > + } > + if ((!set_ret) && !(warned++)) > set_affinity = 0; > > /* > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-26 16:03 ` Srivatsa S. Bhat @ 2012-09-26 17:06 ` Suresh Siddha 2012-09-26 17:30 ` Srivatsa S. Bhat 0 siblings, 1 reply; 15+ messages in thread From: Suresh Siddha @ 2012-09-26 17:06 UTC (permalink / raw) To: Srivatsa S. Bhat Cc: Chuansheng Liu, tglx, mingo, x86, linux-kernel, yanmin_zhang, Paul E. McKenney, Peter Zijlstra, rusty On Wed, 2012-09-26 at 21:33 +0530, Srivatsa S. Bhat wrote: > I have some fundamental questions here: > 1. Why was the CPU never removed from the affinity masks in the original > code? I find it hard to believe that it was just an oversight, because the > whole point of fixup_irqs() is to affine the interrupts to other CPUs, IIUC. > So, is that really a bug or is the existing code correct for some reason > which I don't know of? I am not aware of the history but my guess is that the affinity mask which is coming from the user-space wants to be preserved. And fixup_irqs() is fixing the underlying interrupt routing when the cpu goes down with a hope that things will be corrected when the cpu comes back online. But as Liu noted, we are not correcting the underlying routing when the cpu comes back online. I think we should fix that rather than modifying the user-specified affinity. > 2. In case this is indeed a bug, why are the warnings ratelimited when the > interrupts can't be affined to other CPUs? Are they not serious enough to > report? Put more strongly, why do we even silently return with a warning > instead of reporting that the CPU offline operation failed?? Is that because > we have come way too far in the hotplug sequence and we can't easily roll > back? Or are we still actually OK in that situation? Are you referring to the "cannot set affinity for irq" messages? That happens only if the irq chip doesn't have the irq_set_affinity() setup. But that is not common. > > Suresh, I'd be grateful if you could kindly throw some light on these > issues... I'm actually debugging an issue where an offline CPU gets apic timer > interrupts (and in one case, I even saw a device interrupt), which I have > reported in another thread at: https://lkml.org/lkml/2012/9/26/119 > But this issue in fixup_irqs() that Liu brought to light looks even more > surprising to me.. These issues look different to me, will look into that. thanks, suresh ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-26 17:06 ` Suresh Siddha @ 2012-09-26 17:30 ` Srivatsa S. Bhat 2012-09-26 22:46 ` Suresh Siddha 0 siblings, 1 reply; 15+ messages in thread From: Srivatsa S. Bhat @ 2012-09-26 17:30 UTC (permalink / raw) To: Suresh Siddha Cc: Chuansheng Liu, tglx, mingo, x86, linux-kernel, yanmin_zhang, Paul E. McKenney, Peter Zijlstra, rusty On 09/26/2012 10:36 PM, Suresh Siddha wrote: > On Wed, 2012-09-26 at 21:33 +0530, Srivatsa S. Bhat wrote: >> I have some fundamental questions here: >> 1. Why was the CPU never removed from the affinity masks in the original >> code? I find it hard to believe that it was just an oversight, because the >> whole point of fixup_irqs() is to affine the interrupts to other CPUs, IIUC. >> So, is that really a bug or is the existing code correct for some reason >> which I don't know of? > > I am not aware of the history but my guess is that the affinity mask > which is coming from the user-space wants to be preserved. And > fixup_irqs() is fixing the underlying interrupt routing when the cpu > goes down and the code that corresponds to that is: irq_force_complete_move(irq); is it? > with a hope that things will be corrected when the cpu comes > back online. But as Liu noted, we are not correcting the underlying > routing when the cpu comes back online. I think we should fix that > rather than modifying the user-specified affinity. > Hmm, I didn't entirely get your suggestion. Are you saying that we should change data->affinity (by calling ->irq_set_affinity()) during offline but maintain a copy of the original affinity mask somewhere, so that we can try to match it when possible (ie., when CPU comes back online)? >> 2. In case this is indeed a bug, why are the warnings ratelimited when the >> interrupts can't be affined to other CPUs? Are they not serious enough to >> report? Put more strongly, why do we even silently return with a warning >> instead of reporting that the CPU offline operation failed?? Is that because >> we have come way too far in the hotplug sequence and we can't easily roll >> back? Or are we still actually OK in that situation? > > Are you referring to the "cannot set affinity for irq" messages? Yes > That happens only if the irq chip doesn't have the irq_set_affinity() setup. That is my other point of concern : setting irq affinity can fail even if we have ->irq_set_affinity(). (If __ioapic_set_affinity() fails, for example). Why don't we complain in that case? I think we should... and if its serious enough, abort the hotplug operation or atleast indicate that offline failed.. > But that is not common. > >> >> Suresh, I'd be grateful if you could kindly throw some light on these >> issues... I'm actually debugging an issue where an offline CPU gets apic timer >> interrupts (and in one case, I even saw a device interrupt), which I have >> reported in another thread at: https://lkml.org/lkml/2012/9/26/119 >> But this issue in fixup_irqs() that Liu brought to light looks even more >> surprising to me.. > > These issues look different to me, will look into that. > Ok, thanks a lot! Regards, Srivatsa S. Bhat ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-26 17:30 ` Srivatsa S. Bhat @ 2012-09-26 22:46 ` Suresh Siddha 2012-09-27 18:42 ` Srivatsa S. Bhat 2012-10-09 8:51 ` Liu, Chuansheng 0 siblings, 2 replies; 15+ messages in thread From: Suresh Siddha @ 2012-09-26 22:46 UTC (permalink / raw) To: Srivatsa S. Bhat Cc: Chuansheng Liu, tglx, mingo, x86, linux-kernel, yanmin_zhang, Paul E. McKenney, Peter Zijlstra, rusty On Wed, 2012-09-26 at 23:00 +0530, Srivatsa S. Bhat wrote: > On 09/26/2012 10:36 PM, Suresh Siddha wrote: > > On Wed, 2012-09-26 at 21:33 +0530, Srivatsa S. Bhat wrote: > >> I have some fundamental questions here: > >> 1. Why was the CPU never removed from the affinity masks in the original > >> code? I find it hard to believe that it was just an oversight, because the > >> whole point of fixup_irqs() is to affine the interrupts to other CPUs, IIUC. > >> So, is that really a bug or is the existing code correct for some reason > >> which I don't know of? > > > > I am not aware of the history but my guess is that the affinity mask > > which is coming from the user-space wants to be preserved. And > > fixup_irqs() is fixing the underlying interrupt routing when the cpu > > goes down > > and the code that corresponds to that is: > irq_force_complete_move(irq); is it? No. irq_set_affinity() > > with a hope that things will be corrected when the cpu comes > > back online. But as Liu noted, we are not correcting the underlying > > routing when the cpu comes back online. I think we should fix that > > rather than modifying the user-specified affinity. > > > > Hmm, I didn't entirely get your suggestion. Are you saying that we should change > data->affinity (by calling ->irq_set_affinity()) during offline but maintain a > copy of the original affinity mask somewhere, so that we can try to match it > when possible (ie., when CPU comes back online)? Don't change the data->affinity in the fixup_irqs() and shortly after a cpu is online, call irq_chip's irq_set_affinity() for those irq's who affinity included this cpu (now that the cpu is back online, irq_set_affinity() will setup the HW routing tables correctly). This presumes that across the suspend/resume, cpu offline/online operations, we don't want to break the irq affinity setup by the user-level entity like irqbalance etc... > > That happens only if the irq chip doesn't have the irq_set_affinity() setup. > > That is my other point of concern : setting irq affinity can fail even if > we have ->irq_set_affinity(). (If __ioapic_set_affinity() fails, for example). > Why don't we complain in that case? I think we should... and if its serious > enough, abort the hotplug operation or atleast indicate that offline failed.. yes if there is a failure then we are in trouble, as the cpu is already disappeared from the online-masks etc. For platforms with interrupt-remapping, interrupts can be migrated from the process context and as such this all can be done much before. And for legacy platforms we have done quite a few changes in the recent past like using eoi_ioapic_irq() for level triggered interrupts etc, that makes it as safe as it can be. Perhaps we can move most of the fixup_irqs() code much ahead and the lost section of the current fixup_irqs() (which check IRR bits and use the retrigger function to trigger the interrupt on another cpu) can still be done late just like now. thanks, suresh ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-26 22:46 ` Suresh Siddha @ 2012-09-27 18:42 ` Srivatsa S. Bhat 2012-09-27 19:20 ` Suresh Siddha 2012-10-09 8:51 ` Liu, Chuansheng 1 sibling, 1 reply; 15+ messages in thread From: Srivatsa S. Bhat @ 2012-09-27 18:42 UTC (permalink / raw) To: Suresh Siddha Cc: Chuansheng Liu, tglx, mingo, x86, linux-kernel, yanmin_zhang, Paul E. McKenney, Peter Zijlstra, rusty On 09/27/2012 04:16 AM, Suresh Siddha wrote: > On Wed, 2012-09-26 at 23:00 +0530, Srivatsa S. Bhat wrote: >> On 09/26/2012 10:36 PM, Suresh Siddha wrote: >>> On Wed, 2012-09-26 at 21:33 +0530, Srivatsa S. Bhat wrote: >>>> I have some fundamental questions here: >>>> 1. Why was the CPU never removed from the affinity masks in the original >>>> code? I find it hard to believe that it was just an oversight, because the >>>> whole point of fixup_irqs() is to affine the interrupts to other CPUs, IIUC. >>>> So, is that really a bug or is the existing code correct for some reason >>>> which I don't know of? >>> >>> I am not aware of the history but my guess is that the affinity mask >>> which is coming from the user-space wants to be preserved. And >>> fixup_irqs() is fixing the underlying interrupt routing when the cpu >>> goes down >> >> and the code that corresponds to that is: >> irq_force_complete_move(irq); is it? > > No. irq_set_affinity() > Um? That takes the updated/changed affinity and sets data->affinity to that value no? You mentioned that probably the intention of the original code was to preserve the user-set affinity mask, but still change the underlying interrupt routing. Sorry, but I still didn't quite understand what is that part of the code that achieves that. It appears that ->irq_set_affinity() is doing both - change the (user-set) affinity as well as the underlying irq routing. And it does it only when all CPUs in the original affinity mask go offline, not on every offline; which looks like a real bug to me. >>> with a hope that things will be corrected when the cpu comes >>> back online. But as Liu noted, we are not correcting the underlying >>> routing when the cpu comes back online. I think we should fix that >>> rather than modifying the user-specified affinity. >>> I am not able to distinguish the 2 things in the existing code, as I mentioned above :( >> >> Hmm, I didn't entirely get your suggestion. Are you saying that we should change >> data->affinity (by calling ->irq_set_affinity()) during offline but maintain a >> copy of the original affinity mask somewhere, so that we can try to match it >> when possible (ie., when CPU comes back online)? > > Don't change the data->affinity in the fixup_irqs() You mean, IOW, don't call ->irq_set_affinity() during CPU offline at all? (Would that even be correct?) > and shortly after a > cpu is online, call irq_chip's irq_set_affinity() for those irq's who > affinity included this cpu (now that the cpu is back online, > irq_set_affinity() will setup the HW routing tables correctly). > > This presumes that across the suspend/resume, cpu offline/online > operations, we don't want to break the irq affinity setup by the > user-level entity like irqbalance etc... > The only way I can think of to preserve the user-set affinity but still alter the underlying routing appropriately when needed, is by having an additional mask (per-irq) that holds the user-set affinity. >>> That happens only if the irq chip doesn't have the irq_set_affinity() setup. >> >> That is my other point of concern : setting irq affinity can fail even if >> we have ->irq_set_affinity(). (If __ioapic_set_affinity() fails, for example). >> Why don't we complain in that case? I think we should... and if its serious >> enough, abort the hotplug operation or atleast indicate that offline failed.. > > yes if there is a failure then we are in trouble, as the cpu is already > disappeared from the online-masks etc. For platforms with > interrupt-remapping, interrupts can be migrated from the process context > and as such this all can be done much before. > > And for legacy platforms we have done quite a few changes in the recent > past like using eoi_ioapic_irq() for level triggered interrupts etc, > that makes it as safe as it can be. Perhaps we can move most of the > fixup_irqs() code much ahead and the lost section of the current > fixup_irqs() (which check IRR bits and use the retrigger function to > trigger the interrupt on another cpu) can still be done late just like > now. > Hmm.. ok.. Thanks for the explanation! Regards, Srivatsa S. Bhat ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-27 18:42 ` Srivatsa S. Bhat @ 2012-09-27 19:20 ` Suresh Siddha 2012-09-27 20:33 ` Srivatsa S. Bhat 0 siblings, 1 reply; 15+ messages in thread From: Suresh Siddha @ 2012-09-27 19:20 UTC (permalink / raw) To: Srivatsa S. Bhat Cc: Chuansheng Liu, tglx, mingo, x86, linux-kernel, yanmin_zhang, Paul E. McKenney, Peter Zijlstra, rusty On Fri, 2012-09-28 at 00:12 +0530, Srivatsa S. Bhat wrote: > On 09/27/2012 04:16 AM, Suresh Siddha wrote: > > > > No. irq_set_affinity() > > > > Um? That takes the updated/changed affinity and sets data->affinity to > that value no? You mentioned that probably the intention of the original > code was to preserve the user-set affinity mask, but still change the > underlying interrupt routing. Sorry, but I still didn't quite understand > what is that part of the code that achieves that. For the HW routing to be changed we AND it with cpu_online_map and use that for programming the interrupt entries etc. The user-specified affinity still has the cpu that is offlined. And when the cpu comes online and if it is part of the user-specified affinity, then the HW routing can be again modified to include the new cpu. hope this clears it! thanks. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-27 19:20 ` Suresh Siddha @ 2012-09-27 20:33 ` Srivatsa S. Bhat 0 siblings, 0 replies; 15+ messages in thread From: Srivatsa S. Bhat @ 2012-09-27 20:33 UTC (permalink / raw) To: Suresh Siddha Cc: Chuansheng Liu, tglx, mingo, x86, linux-kernel, yanmin_zhang, Paul E. McKenney, Peter Zijlstra, rusty On 09/28/2012 12:50 AM, Suresh Siddha wrote: > On Fri, 2012-09-28 at 00:12 +0530, Srivatsa S. Bhat wrote: >> On 09/27/2012 04:16 AM, Suresh Siddha wrote: >>> >>> No. irq_set_affinity() >>> >> >> Um? That takes the updated/changed affinity and sets data->affinity to >> that value no? You mentioned that probably the intention of the original >> code was to preserve the user-set affinity mask, but still change the >> underlying interrupt routing. Sorry, but I still didn't quite understand >> what is that part of the code that achieves that. > > For the HW routing to be changed we AND it with cpu_online_map and use > that for programming the interrupt entries etc. Ah, now I see.. you were referring to the __assign_irq_vector() code, whereas I was looking only at fixup_irqs() and was trying to find the code that did what you said.. that's what got me confused earlier :-) > The user-specified > affinity still has the cpu that is offlined. > Right, so data->affinity is untouched, whereas cfg->domain is updated when the CPU is offlined.. > And when the cpu comes online and if it is part of the user-specified > affinity, then the HW routing can be again modified to include the new > cpu. > Right, got it. > hope this clears it! > Yep, thanks a lot! Regards, Srivatsa S. Bhat ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask 2012-09-26 22:46 ` Suresh Siddha 2012-09-27 18:42 ` Srivatsa S. Bhat @ 2012-10-09 8:51 ` Liu, Chuansheng 1 sibling, 0 replies; 15+ messages in thread From: Liu, Chuansheng @ 2012-10-09 8:51 UTC (permalink / raw) To: Siddha, Suresh B, Srivatsa S. Bhat Cc: tglx, mingo, x86, linux-kernel, yanmin_zhang, Paul E. McKenney, Peter Zijlstra, rusty [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="utf-8", Size: 3974 bytes --] One suggestion below, thanks. > -----Original Message----- > From: Siddha, Suresh B > Sent: Thursday, September 27, 2012 6:46 AM > To: Srivatsa S. Bhat > Cc: Liu, Chuansheng; tglx@linutronix.de; mingo@redhat.com; x86@kernel.org; > linux-kernel@vger.kernel.org; yanmin_zhang@linux.intel.com; Paul E. > McKenney; Peter Zijlstra; rusty@rustcorp.com.au > Subject: Re: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq > affinity mask > > On Wed, 2012-09-26 at 23:00 +0530, Srivatsa S. Bhat wrote: > > On 09/26/2012 10:36 PM, Suresh Siddha wrote: > > > On Wed, 2012-09-26 at 21:33 +0530, Srivatsa S. Bhat wrote: > > >> I have some fundamental questions here: > > >> 1. Why was the CPU never removed from the affinity masks in the original > > >> code? I find it hard to believe that it was just an oversight, because the > > >> whole point of fixup_irqs() is to affine the interrupts to other CPUs, IIUC. > > >> So, is that really a bug or is the existing code correct for some reason > > >> which I don't know of? > > > > > > I am not aware of the history but my guess is that the affinity mask > > > which is coming from the user-space wants to be preserved. And > > > fixup_irqs() is fixing the underlying interrupt routing when the cpu > > > goes down > > > > and the code that corresponds to that is: > > irq_force_complete_move(irq); is it? > > No. irq_set_affinity() > > > > with a hope that things will be corrected when the cpu comes > > > back online. But as Liu noted, we are not correcting the underlying > > > routing when the cpu comes back online. I think we should fix that > > > rather than modifying the user-specified affinity. > > > > > > > Hmm, I didn't entirely get your suggestion. Are you saying that we should > change > > data->affinity (by calling ->irq_set_affinity()) during offline but maintain a > > copy of the original affinity mask somewhere, so that we can try to match it > > when possible (ie., when CPU comes back online)? > > Don't change the data->affinity in the fixup_irqs() and shortly after a > cpu is online, call irq_chip's irq_set_affinity() for those irq's who > affinity included this cpu (now that the cpu is back online, > irq_set_affinity() will setup the HW routing tables correctly). > > This presumes that across the suspend/resume, cpu offline/online > operations, we don't want to break the irq affinity setup by the > user-level entity like irqbalance etc... > In fixup_irqs(), could we untouch the data->affinity, but calling ->irq_set_affinity() without offlined CPU. If so, the better affinity is set, and user-space can use the data->affinity correctly, also new affinity setting will and online cpus. > > > That happens only if the irq chip doesn't have the irq_set_affinity() setup. > > > > That is my other point of concern : setting irq affinity can fail even if > > we have ->irq_set_affinity(). (If __ioapic_set_affinity() fails, for example). > > Why don't we complain in that case? I think we should... and if its serious > > enough, abort the hotplug operation or atleast indicate that offline failed.. > > yes if there is a failure then we are in trouble, as the cpu is already > disappeared from the online-masks etc. For platforms with > interrupt-remapping, interrupts can be migrated from the process context > and as such this all can be done much before. > > And for legacy platforms we have done quite a few changes in the recent > past like using eoi_ioapic_irq() for level triggered interrupts etc, > that makes it as safe as it can be. Perhaps we can move most of the > fixup_irqs() code much ahead and the lost section of the current > fixup_irqs() (which check IRR bits and use the retrigger function to > trigger the interrupt on another cpu) can still be done late just like > now. > > thanks, > suresh ÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥ ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2012-10-09 8:51 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-09-26 17:38 [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask Chuansheng Liu 2012-09-26 8:49 ` Srivatsa S. Bhat 2012-09-26 8:51 ` Liu, Chuansheng 2012-09-26 8:56 ` Liu, Chuansheng 2012-09-26 9:02 ` Srivatsa S. Bhat 2012-09-26 23:45 ` Chuansheng Liu 2012-09-26 15:47 ` Srivatsa S. Bhat 2012-09-26 16:03 ` Srivatsa S. Bhat 2012-09-26 17:06 ` Suresh Siddha 2012-09-26 17:30 ` Srivatsa S. Bhat 2012-09-26 22:46 ` Suresh Siddha 2012-09-27 18:42 ` Srivatsa S. Bhat 2012-09-27 19:20 ` Suresh Siddha 2012-09-27 20:33 ` Srivatsa S. Bhat 2012-10-09 8:51 ` Liu, Chuansheng
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).