From: Anton Blanchard <anton@samba.org>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: mahesh@linux.vnet.ibm.com, linux-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, mingo@elte.hu,
	benh@kernel.crashing.org, torvalds@linux-foundation.org
Subject: Re: [regression] 3.0-rc boot failure -- bisected to cd4ea6ae3982
Date: Thu, 14 Jul 2011 10:34:18 +1000
Message-ID: <20110714103418.7ef25b68@kryten>
In-Reply-To: <1310036375.3282.509.camel@twins>


Hi Peter,

> Surely this isn't the first multi-node P7 to boot a kernel with this
> patch? If my git foo is any good it hit -next on 23rd of May.
> 
> I guess I'm asking is, do smaller P7 machines boot? And if so, is
> there any difference except size?
> 
> How many nodes does the thing have anyway, 28? Hmm, that could mean
> its the first machine with >16 nodes to boot this, which would make it
> trigger the magic ALL_NODES crap.

We haven't tested a box with more than 16 nodes in quite a while, so it
may be this.

I took a quick look and we are stuck in update_group_power:

        do {
                power += group->cpu_power;
                group = group->next;
        } while (group != child->groups);

I looked at the linked list:

child->groups = c000007b2f74ff00

and dumping group as we go:

c000007b2f74ff00 c000007b2f760000 c000007b2fb60000 c000007b2ff60000

at this point we end up in a cycle and never make it back to
child->groups:

c000008b2e68ff00 c000008b2e6a0000 c000008b2eaa0000 c000008b2eea0000
c000009aee77ff00 c000009aee790000 c000009aeeb90000 c000009aeef90000
c00000bafde91800 c00000dafdf81800 c00000fafce81800 c000011afdf71800
c00001226e70ff00 c00001226e720000 c00001226eb20000 c00001226ef20000
c000008b2e68ff00

Still investigating.

Anton

