From: Thomas Gleixner <tglx@linutronix.de>
To: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Frank Rowand <frank.rowand@am.sony.com>,
	"paulmck@linux.vnet.ibm.com" <paulmck@linux.vnet.ibm.com>,
	"Rowand, Frank" <Frank_Rowand@sonyusa.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-rt-users <linux-rt-users@vger.kernel.org>,
	Mike Galbraith <efault@gmx.de>, Ingo Molnar <mingo@tglx.de>,
	Venkatesh Pallipadi <venki@google.com>
Subject: Re: [ANNOUNCE] 3.0.1-rt11
Date: Wed, 7 Sep 2011 18:32:38 +0200 (CEST)
Message-ID: <alpine.LFD.2.02.1109071818350.2723@ionos>
In-Reply-To: <20110907140130.GT6619@n2100.arm.linux.org.uk>

On Wed, 7 Sep 2011, Russell King - ARM Linux wrote:

> On Wed, Sep 07, 2011 at 12:57:44PM +0200, Thomas Gleixner wrote:
> > The problem is that if you enable interrupts on the CPU _BEFORE_ it is
> > set online AND active, then you can end up waking up kernel threads
> > which are bound to that CPU and the scheduler will happily schedule
> > them on an online CPU. That makes them lose their affinity to the
> > CPU as well, and all hell breaks loose.
> 
> How can that happen?
> 
> 1. The only interrupts we're likely to receive are the local timer
>    interrupts - we have not routed any other interrupts to this CPU.

Fair enough, on x86 this can happen when we enable interrupts.
 
> 2. We will not schedule on this CPU except at explicit scheduling
>    points (such as contended mutexes or explicit calls to schedule)
>    as we have a call to preempt_disable().

Right, you don't schedule. But a wakeup of a thread whose affinity is
set to the CPU being brought online runs (as Frank pointed out)
through:

   wake_up_process()
      try_to_wake_up()
         select_task_rq()
            if (... || !cpu_online(cpu))
               select_fallback_rq(task_cpu(p), p)
                  ...
                  /* No more Mr. Nice Guy. */
                  dest_cpu = cpuset_cpus_allowed_fallback(p)
                     do_set_cpus_allowed(p, cpu_possible_mask)
                        #  Thus ksoftirqd can now run on any cpu...

So the problem is not scheduling, it's the wakeup code. Sorry for
being imprecise.
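
To make the fallback in that call chain concrete, here is a minimal
userspace sketch. It is a toy model and not the kernel code: the
toy_* names, the 4-CPU bitmask and the struct layout are all
assumptions made for illustration, and the real select_fallback_rq()
handles more cases than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4u

/* Toy model, not kernel code: one bit per CPU. */
typedef uint32_t toy_cpumask_t;

struct toy_task {
	toy_cpumask_t cpus_allowed;	/* affinity mask */
	unsigned int cpu;		/* CPU the task is bound to / last ran on */
};

static bool toy_cpu_test(toy_cpumask_t mask, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return (mask >> cpu) & 1u;
}

/*
 * Simplified stand-in for select_task_rq() plus select_fallback_rq().
 * Returns the CPU picked for the wakeup and, as a side effect, may
 * widen p->cpus_allowed when no allowed CPU is online.
 */
static unsigned int toy_select_task_rq(struct toy_task *p, toy_cpumask_t online)
{
	unsigned int cpu;

	/* Preferred CPU is allowed and online: nothing to fix up. */
	if (toy_cpu_test(p->cpus_allowed, p->cpu) && toy_cpu_test(online, p->cpu))
		return p->cpu;

	/* Otherwise try any CPU that is both allowed and online. */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (toy_cpu_test(p->cpus_allowed, cpu) && toy_cpu_test(online, cpu))
			return cpu;

	/* "No more Mr. Nice Guy": widen the affinity to every possible CPU. */
	p->cpus_allowed = (1u << NR_CPUS) - 1u;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (toy_cpu_test(online, cpu))
			return cpu;

	return p->cpu;	/* no CPU online at all; not reachable on a live system */
}
```

In this model, waking a task bound to CPU 1 while only CPU 0 is online
returns CPU 0 and leaves cpus_allowed covering all four CPUs. The
widened mask stays in place afterwards, which is the lost affinity
described above.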

We can't do anything about it in the scheduler code, so we have to
make sure that the cpu startup code enables interrupts after the
online AND active bits have been set.
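
The ordering requirement can be sketched the same way. Again this is
a toy model under stated assumptions, not the arch startup code:
struct toy_cpu and every toy_* function are hypothetical, and a
single flag stands in for "the bound thread still has its affinity".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a secondary CPU coming up; all names are hypothetical. */
struct toy_cpu {
	bool online;
	bool active;
	bool irqs_enabled;
	bool bound_thread_pinned;	/* false once a wakeup widened the affinity */
};

/* A wakeup of a CPU-bound thread keeps its affinity only if the CPU is up. */
static void toy_wake_bound_thread(struct toy_cpu *c)
{
	if (!(c->online && c->active))
		c->bound_thread_pinned = false;
}

/* A timer interrupt is only delivered, and can only wake threads, with irqs on. */
static void toy_timer_irq(struct toy_cpu *c)
{
	if (c->irqs_enabled)
		toy_wake_bound_thread(c);
}

static void toy_set_online_active(struct toy_cpu *c)
{
	c->online = true;
	c->active = true;
}

/* The fixed ordering enforces the invariant before enabling interrupts. */
static void toy_enable_irqs_checked(struct toy_cpu *c)
{
	assert(c->online && c->active);
	c->irqs_enabled = true;
}

/* Racy ordering: interrupts on first, online/active afterwards. */
static bool toy_bringup_irqs_first(void)
{
	struct toy_cpu c = { .bound_thread_pinned = true };

	c.irqs_enabled = true;
	toy_timer_irq(&c);		/* fires inside the window */
	toy_set_online_active(&c);
	return c.bound_thread_pinned;
}

/* Safe ordering: online/active first, interrupts afterwards. */
static bool toy_bringup_online_first(void)
{
	struct toy_cpu c = { .bound_thread_pinned = true };

	toy_timer_irq(&c);		/* not delivered: irqs are still off */
	toy_set_online_active(&c);
	toy_enable_irqs_checked(&c);
	toy_timer_irq(&c);		/* delivered, CPU is already up */
	return c.bound_thread_pinned;
}
```

Here toy_bringup_irqs_first() returns false (the bound thread lost its
affinity) and toy_bringup_online_first() returns true.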
 
> > Frank has observed this with softirq threads, but the same thing is
> > true for any other CPU bound thread like the worker stuff.
> 
> So who is scheduling a workqueue from the local timer?

The problem is timer callbacks, which may be executed in softirq
context on return from interrupt. We observed one case on x86 where
an expired timer was queued on the CPU that was about to go online,
and its callback scheduled work on that CPU, which then caused the
CPU-affine worker thread to be moved away :(

> > So moving the online, active thing BEFORE enabling interrupts is
> > the only sensible solution.
> 
> Yes, that'll be why even x86 enables interrupts before setting the CPU
> online for the delay calibration.

Correct.

Thanks,

	tglx
