From: Prarit Bhargava <prarit@redhat.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>,
	athorlton@sgi.com, CAI Qian <caiqian@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: BUG: tick device NULL pointer during system initialization and shutdown
Date: Mon, 08 Jul 2013 09:04:31 -0400
Message-ID: <51DAB8DF.6060806@redhat.com>
In-Reply-To: <alpine.DEB.2.02.1307011518350.4013@ionos.tec.linutronix.de>



On 07/01/2013 09:30 AM, Thomas Gleixner wrote:
> On Mon, 1 Jul 2013, Prarit Bhargava wrote:
>> On 06/28/2013 06:52 AM, Thomas Gleixner wrote:
>>> Huch. Did the warning in the broadcast code trigger before that?
>>
>> tglx,
>>
>> AFAICT it does not.  Log below on the system I'm testing on.  The test on the
>> system is system boots, sleeps for 30 seconds and then reboots.
> 
>> [  270.563197] INFO: rcu_sched detected stalls on CPUs/tasks: { 51} (detected by
>> 63, t=217205 jiffies, g=3583, c=3582, q=578)
> 
> So the stall is on CPU51, but we do not get a backtrace for CPU51. 
> 
> The backtrace trigger is only sent to online cpus. So CPU51 is offline
> already. Which makes sense as we are in the process of bringing CPUs
> down and the CPUs with backtrace are 0 and 53-63.
> 
> I'm pretty sure, that the patch which clears the stale flag is
> unrelated to this and it cures the NULL pointer dereference (the
> reason why this can happen is clear).
> 
> So now you do not longer trip over the NULL pointer dereference, but
> you see a weird RCU stall on an already DEAD cpu. Note, it's dead
> because we already took CPU52 offline as well.
> 
> Paul???

I hit this a few times ... but it happens MUCH less often than the
original bug did.  So Thomas, can you add

Tested-by: Prarit Bhargava <prarit@redhat.com>

to the "tick: Make oneshot broadcast robust vs. CPU offlining" patch?

IMO the NULL pointer problem is solved and we're just peeling the proverbial
onion and finding deeper bugs.

P.
