linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: Michael Neuling <mikey@neuling.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>,
	Darren Hart <dvhltc@us.ibm.com>,
	Gautham R Shenoy <ego@in.ibm.com>,
	Josh Triplett <josh@joshtriplett.org>,
	linuxppc-dev@ozlabs.org, Will Schmidt <willschm@us.ibm.com>,
	Paul Mackerras <paulus@samba.org>,
	Ankita Garg <ankita@in.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries
Date: Thu, 02 Sep 2010 00:06:07 -0400	[thread overview]
Message-ID: <1283400367.2356.69.camel@gandalf.stny.rr.com> (raw)
In-Reply-To: <11579.1283389322@neuling.org>

On Thu, 2010-09-02 at 11:02 +1000, Michael Neuling wrote:

> We need to call smp_startup_cpu on boot when we the cpus are still in
> FW.  smp_startup_cpu does this for us on boot.
> 
> I'm wondering if we just need to move the test down a bit to make sure
> the preempt_count is set.  I've not been following this thread, but
> maybe this might work?

Egad no! Setting the preempt_count to zero _is_ the bug. I think Darren
even said that adding the exit prevented the bug (although now he's
hitting a hard lockup someplace else). The original code he was using
did not have the condition to return for kexec. It was just a
coincidence that this code helped in bringing a CPU back online.

> 
> Untested patch below...
> 
> Mikey
> 
> diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
> index 0317cce..3afaba4 100644
> --- a/arch/powerpc/platforms/pseries/smp.c
> +++ b/arch/powerpc/platforms/pseries/smp.c
> @@ -104,18 +104,18 @@ static inline int __devinit smp_startup_cpu(unsigned int lcpu)
>  
>  	pcpu = get_hard_smp_processor_id(lcpu);
>  
> -	/* Check to see if the CPU out of FW already for kexec */
> -	if (smp_query_cpu_stopped(pcpu) == QCSS_NOT_STOPPED){
> -		cpumask_set_cpu(lcpu, of_spin_mask);
> -		return 1;
> -	}
> -
>  	/* Fixup atomic count: it exited inside IRQ handler. */
>  	task_thread_info(paca[lcpu].__current)->preempt_count	= 0;

We DON'T want to do the above. It's nasty! This is one CPU's task
touching an intimate part of another CPU's task. It's equivalent of me
putting my hand down you wife's blouse. It's offensive, and rude.

OK, if the CPU was never online, then you can do what you want. But what
we see is that this fails on CPU hotplug.  You stop a CPU, and it goes
into this cede_processor() call. When you wake it up, suddenly the task
on that woken CPU has its preempt count fscked up.  This was really
really hard to debug. We thought it was stack corruption or something.
But it ended up being that this code has one CPU touching the breasts of
another CPU. This code is a pervert!

What the trace clearly showed, was that we take down a CPU, and in doing
so, the code on that CPU set the preempt count to 1, and it expected to
have it as 1 when it returned. But the code that kicked started this CPU
back to life (bring the CPU back online), set the preempt count on the
task of that CPU to 0, and screwed everything up.

-- Steve


>  
>  	if (get_cpu_current_state(lcpu) == CPU_STATE_INACTIVE)
>  		goto out;
>  
> +	/* Check to see if the CPU out of FW already for kexec */
> +	if (smp_query_cpu_stopped(pcpu) == QCSS_NOT_STOPPED){
> +		cpumask_set_cpu(lcpu, of_spin_mask);
> +		return 1;
> +	}
> +
>  	/* 
>  	 * If the RTAS start-cpu token does not exist then presume the
>  	 * cpu is already spinning.
> 

  reply	other threads:[~2010-09-02  4:06 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-07-22 18:24 [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries Darren Hart
2010-07-22 18:36 ` Darren Hart
2010-07-22 18:38 ` Thomas Gleixner
2010-08-10 22:36   ` Darren Hart
2010-07-22 22:25 ` Benjamin Herrenschmidt
2010-07-22 23:57   ` Darren Hart
2010-07-23  4:44     ` Darren Hart
2010-07-23  5:08       ` Vaidyanathan Srinivasan
2010-07-23  5:11         ` Benjamin Herrenschmidt
2010-07-23  7:07           ` Vaidyanathan Srinivasan
2010-08-05  4:45             ` Darren Hart
2010-08-05 11:06               ` Vaidyanathan Srinivasan
2010-08-05 12:26                 ` Thomas Gleixner
2010-07-23  5:09       ` Benjamin Herrenschmidt
2010-08-06  2:19         ` Darren Hart
2010-08-06  5:09           ` Vaidyanathan Srinivasan
2010-08-06  7:13             ` Darren Hart
2010-07-23 14:39     ` Will Schmidt
2010-08-04 13:44   ` Darren Hart
2010-08-19 15:58 ` Ankita Garg
2010-08-19 18:58   ` Will Schmidt
2010-08-23 22:20   ` Darren Hart
2010-08-31  7:12   ` Darren Hart
2010-09-01  5:54     ` Michael Ellerman
2010-09-01 15:10       ` Darren Hart
2010-09-01 18:47         ` Darren Hart
2010-09-01 19:59           ` Steven Rostedt
2010-09-01 20:42             ` Darren Hart
2010-09-02  1:02               ` Michael Neuling
2010-09-02  4:06                 ` Steven Rostedt [this message]
2010-09-02  6:04                   ` Darren Hart
2010-09-03 20:10                     ` Will Schmidt
2010-09-02 23:04                   ` Michael Neuling
2010-09-03  0:08                     ` Darren Hart
2010-09-02  3:46           ` Michael Neuling

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1283400367.2356.69.camel@gandalf.stny.rr.com \
    --to=rostedt@goodmis.org \
    --cc=ankita@in.ibm.com \
    --cc=dvhltc@us.ibm.com \
    --cc=ego@in.ibm.com \
    --cc=josh@joshtriplett.org \
    --cc=linuxppc-dev@ozlabs.org \
    --cc=mikey@neuling.org \
    --cc=paulus@samba.org \
    --cc=sfr@canb.auug.org.au \
    --cc=tglx@linutronix.de \
    --cc=willschm@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).