All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Liu, Chuansheng" <chuansheng.liu@intel.com>
To: "Siddha, Suresh B" <suresh.b.siddha@intel.com>,
	"Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: "tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"yanmin_zhang@linux.intel.com" <yanmin_zhang@linux.intel.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"rusty@rustcorp.com.au" <rusty@rustcorp.com.au>
Subject: RE: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask
Date: Tue, 9 Oct 2012 08:51:33 +0000	[thread overview]
Message-ID: <27240C0AC20F114CBF8149A2696CBE4A19171C@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <1348699588.6644.21.camel@sbsiddha-desk.sc.intel.com>

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 3974 bytes --]

One suggestion below, thanks.

> -----Original Message-----
> From: Siddha, Suresh B
> Sent: Thursday, September 27, 2012 6:46 AM
> To: Srivatsa S. Bhat
> Cc: Liu, Chuansheng; tglx@linutronix.de; mingo@redhat.com; x86@kernel.org;
> linux-kernel@vger.kernel.org; yanmin_zhang@linux.intel.com; Paul E.
> McKenney; Peter Zijlstra; rusty@rustcorp.com.au
> Subject: Re: [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq
> affinity mask
> 
> On Wed, 2012-09-26 at 23:00 +0530, Srivatsa S. Bhat wrote:
> > On 09/26/2012 10:36 PM, Suresh Siddha wrote:
> > > On Wed, 2012-09-26 at 21:33 +0530, Srivatsa S. Bhat wrote:
> > >> I have some fundamental questions here:
> > >> 1. Why was the CPU never removed from the affinity masks in the original
> > >> code? I find it hard to believe that it was just an oversight, because the
> > >> whole point of fixup_irqs() is to affine the interrupts to other CPUs, IIUC.
> > >> So, is that really a bug or is the existing code correct for some reason
> > >> which I don't know of?
> > >
> > > I am not aware of the history but my guess is that the affinity mask
> > > which is coming from the user-space wants to be preserved. And
> > > fixup_irqs() is fixing the underlying interrupt routing when the cpu
> > > goes down
> >
> > and the code that corresponds to that is:
> > irq_force_complete_move(irq); is it?
> 
> No. irq_set_affinity()
> 
> > > with a hope that things will be corrected when the cpu comes
> > > back online. But  as Liu noted, we are not correcting the underlying
> > > routing when the cpu comes back online. I think we should fix that
> > > rather than modifying the user-specified affinity.
> > >
> >
> > Hmm, I didn't entirely get your suggestion. Are you saying that we should
> change
> > data->affinity (by calling ->irq_set_affinity()) during offline but maintain a
> > copy of the original affinity mask somewhere, so that we can try to match it
> > when possible (ie., when CPU comes back online)?
> 
> Don't change the data->affinity in the fixup_irqs() and shortly after a
> cpu is online, call irq_chip's irq_set_affinity() for those irq's who
> affinity included this cpu (now that the cpu is back online,
> irq_set_affinity() will setup the HW routing tables correctly).
> 
> This presumes that across the suspend/resume, cpu offline/online
> operations, we don't want to break the irq affinity setup by the
> user-level entity like irqbalance etc...
> 
In fixup_irqs(), could we untouch the data->affinity, but calling ->irq_set_affinity() without offlined CPU.
If so, the better affinity is set, and user-space can use the data->affinity correctly, also new affinity setting
will and online cpus.

> > > That happens only if the irq chip doesn't have the irq_set_affinity() setup.
> >
> > That is my other point of concern : setting irq affinity can fail even if
> > we have ->irq_set_affinity(). (If __ioapic_set_affinity() fails, for example).
> > Why don't we complain in that case? I think we should... and if its serious
> > enough, abort the hotplug operation or atleast indicate that offline failed..
> 
> yes if there is a failure then we are in trouble, as the cpu is already
> disappeared from the online-masks etc. For platforms with
> interrupt-remapping, interrupts can be migrated from the process context
> and as such this all can be done much before.
> 
> And for legacy platforms we have done quite a few changes in the recent
> past like using eoi_ioapic_irq() for level triggered interrupts etc,
> that makes it as safe as it can be. Perhaps we can move most of the
> fixup_irqs() code much ahead and the lost section of the current
> fixup_irqs() (which check IRR bits and use the retrigger function to
> trigger the interrupt on another cpu) can still be done late just like
> now.
> 
> thanks,
> suresh

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

      parent reply	other threads:[~2012-10-09  8:51 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-26 17:38 [PATCH RESEND] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask Chuansheng Liu
2012-09-26  8:49 ` Srivatsa S. Bhat
2012-09-26  8:51   ` Liu, Chuansheng
2012-09-26  8:56   ` Liu, Chuansheng
2012-09-26  9:02     ` Srivatsa S. Bhat
2012-09-26 23:45 ` Chuansheng Liu
2012-09-26 15:47   ` Srivatsa S. Bhat
2012-09-26 16:03   ` Srivatsa S. Bhat
2012-09-26 17:06     ` Suresh Siddha
2012-09-26 17:30       ` Srivatsa S. Bhat
2012-09-26 22:46         ` Suresh Siddha
2012-09-27 18:42           ` Srivatsa S. Bhat
2012-09-27 19:20             ` Suresh Siddha
2012-09-27 20:33               ` Srivatsa S. Bhat
2012-10-09  8:51           ` Liu, Chuansheng [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=27240C0AC20F114CBF8149A2696CBE4A19171C@SHSMSX101.ccr.corp.intel.com \
    --to=chuansheng.liu@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rusty@rustcorp.com.au \
    --cc=srivatsa.bhat@linux.vnet.ibm.com \
    --cc=suresh.b.siddha@intel.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --cc=yanmin_zhang@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.