From: Chris Mason <clm@fb.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>, <torvalds@linux-foundation.org>,
	<linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@kernel.org>,
	Stanislaw Gruszka <sgruszka@redhat.com>
Subject: Re: New crashes walking proc with Saturday's git
Date: Sun, 23 Nov 2014 16:29:53 -0500
Message-ID: <1416778193.3019.0@mail.thefacebook.com>
In-Reply-To: <1416777079.1732.0@mail.thefacebook.com>

On Sun, Nov 23, 2014 at 4:11 PM, Chris Mason <clm@fb.com> wrote:
>
> On Sun, Nov 23, 2014 at 4:05 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> On Sun, 23 Nov 2014, Chris Mason wrote:
>>>  On Sun, Nov 23, 2014 at 11:32 AM, Borislav Petkov <bp@alien8.de> wrote:
>>>  > On Sun, Nov 23, 2014 at 11:16:51AM -0500, Chris Mason wrote:
>>>  > >  It must be:
>>>  > >
>>>  > >  commit 6e998916dfe327e785e7c2447959b2c1a3ea4930
>>>  > >  Author: Stanislaw Gruszka <sgruszka@redhat.com>
>>>  > >  Date:   Wed Nov 12 16:58:44 2014 +0100
>>>  > >
>>>  > >     sched/cputime: Fix clock_nanosleep()/clock_gettime() inconsistency
>>>  > >
>>>  > >  I'll do two runs to confirm, but it's the only related patch between
>>>  > >  rc5 and now.
>>> 
>>>  I've added Ingo and Stanislaw to the cc.  With
>>>  6e998916dfe327e785e7c2447959b2c1a3ea4930 reverted, I'm no longer crashing.
>>>
>>>  Repeating the stack trace for the new cc list.  I see the crash with atop
>>>  or similar walkers of /proc racing against exiting programs.  Given the
>>>  NULL rip, this line from the patch is probably broken, but it really feels
>>>  like we should be falling over on p->sched_class and not on the
>>>  update_curr func.
>>> 
>>>  +               p->sched_class->update_curr(rq);
>>> 
>>>  I'm leaving my fork bomb running on two machines with the patch reverted
>>>  to make sure.
>> 
>> The sched_class instances which do not have update_curr are stop_task
>> and idle. Patch below.
>> 
>> I'm sure nobody thought about the stats read code path here.
>> 
>> [ 1053.759741]  [<ffffffff81208348>] do_task_stat+0x8b8/0xb00
>> 
>> do_task_stat()
>>  thread_group_cputime_adjusted()
>>    thread_group_cputime()
>>      task_cputime()
>>        task_sched_runtime()
>>          if (task_current(rq, p) && task_on_rq_queued(p)) {
>>                  update_rq_clock(rq);
>>                  p->sched_class->update_curr(rq);
>>          }
>> 
>> Now if the stats are read for a stomp machine task, aka 'migration/N',
>> and that task is current on its cpu: Ooops.
>> 
>> I added the callback for idle tasks as well for completeness' sake.
> 
> This does make sense, but it doesn't match with the crash being much 
> more likely during the fork bomb.  The difference is crashing within 
> a few hours vs crashing within 5 minutes.
> 
> But, maybe I just got lucky.  I'll try the patch.

11 minutes later and it's still alive.  I'll keep an eye on it and yell 
if it falls over.

-chris
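
The NULL rip discussed above can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not the kernel
code: struct rq, struct sched_class and struct task below are heavily
simplified, hypothetical stand-ins, and model_task_sched_runtime() and
the *_unfixed/*_fixed class names are invented for illustration. It
assumes the fix amounts to giving the stop class a callback that does
nothing, which is what "provide update_curr callbacks" suggests; the
patch itself is not shown in this message. The model reports the
missing callback instead of jumping through it, since the indirect call
through a NULL pointer is the step that would oops.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's per-cpu runqueue. */
struct rq {
	int updates;	/* counts how often a real update_curr ran */
};

/* Simplified stand-in for struct sched_class: only the one callback. */
struct sched_class {
	const char *name;
	void (*update_curr)(struct rq *rq);	/* NULL if the class omits it */
};

/* Simplified stand-in for struct task_struct. */
struct task {
	const char *comm;
	const struct sched_class *sched_class;
};

/* A class that does runtime accounting, e.g. the fair class. */
static void update_curr_fair(struct rq *rq)
{
	rq->updates++;
}

/* Assumed shape of the fix: a callback that intentionally does nothing. */
static void update_curr_noop(struct rq *rq)
{
	(void)rq;
}

static const struct sched_class fair_class = { "fair", update_curr_fair };
/* Before the fix: the stop class has no update_curr at all. */
static const struct sched_class stop_class_unfixed = { "stop", NULL };
/* After the fix (assumed): the stop class has a no-op update_curr. */
static const struct sched_class stop_class_fixed = { "stop", update_curr_noop };

/*
 * Models the indirect call in task_sched_runtime(). Returns 0 if the
 * callback was invoked, -1 if the pointer was NULL. In the kernel there
 * is no such check on this path, so the NULL case is a jump to address 0.
 * Note that p->sched_class itself is valid in both cases; only the
 * function pointer inside it is NULL.
 */
static int model_task_sched_runtime(struct rq *rq, const struct task *p)
{
	if (!p->sched_class->update_curr)
		return -1;
	p->sched_class->update_curr(rq);
	return 0;
}
```

Calling model_task_sched_runtime() for a task using stop_class_unfixed
returns -1, while the same task using stop_class_fixed returns 0 without
touching rq. That matches the observation that the pointer being chased
is update_curr and not p->sched_class.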





Thread overview: 15+ messages
2014-11-23 15:02 New crashes walking proc with Saturday's git Chris Mason
2014-11-23 15:56 ` Chris Mason
2014-11-23 16:11   ` Borislav Petkov
2014-11-23 16:16     ` Chris Mason
2014-11-23 16:32       ` Borislav Petkov
2014-11-23 16:36         ` Chris Mason
2014-11-23 16:49         ` Chris Mason
2014-11-23 16:59           ` Borislav Petkov
2014-11-23 21:05           ` Thomas Gleixner
2014-11-23 21:11             ` Chris Mason
2014-11-23 21:29               ` Chris Mason [this message]
2014-11-23 21:38                 ` Borislav Petkov
2014-11-23 21:38               ` Thomas Gleixner
2014-11-23 21:42                 ` Chris Mason
2014-11-23 22:04                   ` [PATCH] sched: Provide update_curr callbacks for stop/idle scheduling classes Thomas Gleixner
