linux-kernel.vger.kernel.org archive mirror
From: john stultz <johnstul@us.ibm.com>
To: Andrew Morton <akpm@osdl.org>
Cc: Matthias Urlichs <smurf@smurf.noris.de>,
	linux-kernel@vger.kernel.org, torvalds@osdl.org, bunk@stusta.de,
	lethal@linux-sh.org, hirofumi@mail.parknet.co.jp,
	Andi Kleen <ak@muc.de>
Subject: Re: REGRESSION: the new i386 timer code fails to sync CPUs
Date: Mon, 24 Jul 2006 08:58:58 -0700
Message-ID: <1153756738.9440.14.camel@localhost>
In-Reply-To: <20060723053755.0aaf9ce0.akpm@osdl.org>

On Sun, 2006-07-23 at 05:37 -0700, Andrew Morton wrote:
> On Sun, 23 Jul 2006 14:08:29 +0200
> Matthias Urlichs <smurf@smurf.noris.de> wrote:
> 
> > Hi,
> > 
> > Andrew Morton:
> > > - CPU0 and CPU1 share a TSC and CPU2 and CPU3 share another TSC.
> > > 
> > That makes sense, since they're one dual-core Xeon each.
> 
> OK.
> 
> > > - Earlier kernels didn't use the TSC as a time source whereas this one
> > >   does, hence the problems which you're observing.
> > > 
> > Correct; see below.
> > 
> > > I assume that booting with clock=pit or clock=pmtmr fixes it?
> > > 
> > Testing... yes, both.
> > 
> > > It would be useful to check your 2.6.17 boot logs, see if we can work out
> > > what 2.6.17 was using for a clock source.
> > > 
> > That's easy:
> > 
> > 2.6.17    -Using pmtmr for high-res timesource
> > 2.6.18git +Time: tsc clocksource has been installed.
> > 
> > I missed those two lines, as in the boot logs they're not really
> > adjacent, so they got lost in the jumble of other differences.
> 
> OK, thanks.  Marking the TSC as bad in this case is simple to do - let us
> let John work out the best way.
> 
> We must have lost a TSC sanity check somewhere along the way.  I wonder
> what it was?

Well, I changed the TSC vs ACPI PM timer priority ordering to be more
like x86-64 (Andi had a similar patch he was proposing as well). For
a while suse/redhat kernels have been swapping them, as the TSC gives
such a performance boost; however, the ACPI PM timer is usually the
safer option (distro customers are often told to use clock=pmtmr on
some boxes).

I'll see what we can do to narrow it down, but it's been assumed by both
x86-64 and the new i386 code that the TSCs on Intel SMP boxes are
synced, unless we're explicitly told they aren't (Summit, etc).

With the current code it is trivial to mark the TSC as unstable and the
system will automatically fall back to the next best clocksource. The
difficulty is just making sure we've got all the cases covered without
needlessly disqualifying synced systems.
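
As a rough illustration of the fallback described above, here is a
standalone sketch. It is not the kernel's code: struct clk_src,
select_best(), mark_unstable() and the rating values are all made up
for the example, and it only assumes that each clocksource carries a
rating and that the highest-rated usable one wins.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a clocksource entry. */
struct clk_src {
	const char *name;
	int rating;	/* higher is preferred; 0 means "do not use" */
};

/* Illustrative ratings only; not the values the kernel uses. */
static struct clk_src sources[] = {
	{ "tsc",   300 },
	{ "pmtmr", 200 },
	{ "pit",   100 },
};
#define NSOURCES (sizeof(sources) / sizeof(sources[0]))

/* Return the usable source with the highest rating, or NULL if none. */
static struct clk_src *select_best(void)
{
	struct clk_src *best = NULL;
	size_t i;

	for (i = 0; i < NSOURCES; i++) {
		if (sources[i].rating <= 0)
			continue;	/* disqualified earlier */
		if (!best || sources[i].rating > best->rating)
			best = &sources[i];
	}
	return best;
}

/* Disqualify a source; the next select_best() call skips it. */
static void mark_unstable(struct clk_src *cs)
{
	cs->rating = 0;
}
```

In this sketch, calling mark_unstable() on the "tsc" entry and then
select_best() again yields "pmtmr", which is the kind of fallback
meant above. Deciding when to call mark_unstable() is the hard part.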

Andi: If this is a generic issue, and not specific to Matthias' box, we
may need to re-think the assumption that Intel SMP is synced. Your
thoughts?

> > Interestingly, CPU0/1 gets 6000 bogomips while CPU2/3 only reaches 5600 ..?
> > (That happens with both kernels.) I do wonder why, and whether this has any
> > bearing on the current problem.
> 
> I wouldn't expect it to matter, unless the TSCs are running at different
> speeds or something.

Matthias: "clock=pmtmr" is probably the best workaround in the short
term. Could you send me your dmesg and dmidecode output? We'll try to
find something to key off of so it will mark the tsc as unstable by
default on your system.

thanks
-john



