From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932170AbWG3U67 (ORCPT );
	Sun, 30 Jul 2006 16:58:59 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S932297AbWG3U67 (ORCPT );
	Sun, 30 Jul 2006 16:58:59 -0400
Received: from smtp.osdl.org ([65.172.181.4]:32157 "EHLO smtp.osdl.org")
	by vger.kernel.org with ESMTP id S932170AbWG3U66 (ORCPT );
	Sun, 30 Jul 2006 16:58:58 -0400
Date: Sun, 30 Jul 2006 13:55:18 -0700
From: Andrew Morton
To: Andi Kleen
Cc: johnstul@us.ibm.com, smurf@smurf.noris.de,
	linux-kernel@vger.kernel.org, torvalds@osdl.org, bunk@stusta.de,
	lethal@linux-sh.org, hirofumi@mail.parknet.co.jp,
	asit.k.mallick@intel.com
Subject: Re: REGRESSION: the new i386 timer code fails to sync CPUs
Message-Id: <20060730135518.f16c0399.akpm@osdl.org>
In-Reply-To: <20060730201005.GA85093@muc.de>
References: <20060722173649.952f909f.akpm@osdl.org>
	<20060723081604.GD27566@kiste.smurf.noris.de>
	<20060723044637.3857d428.akpm@osdl.org>
	<20060723120829.GA7776@kiste.smurf.noris.de>
	<20060723053755.0aaf9ce0.akpm@osdl.org>
	<1153756738.9440.14.camel@localhost>
	<20060724171711.GA3662@kiste.smurf.noris.de>
	<20060724175150.GD50320@muc.de>
	<1153774443.12836.6.camel@localhost>
	<20060730020346.5d301bb5.akpm@osdl.org>
	<20060730201005.GA85093@muc.de>
X-Mailer: Sylpheed version 2.2.4 (GTK+ 2.8.17; i686-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 30 Jul 2006 22:10:05 +0200
Andi Kleen wrote:

> > I guess Matthias didn't test this patch. Can we get some obviously-correct
> > fix in place for 2.6.18?
>
> So far we don't have any idea what the problem is on that system.

I believe we do know what the problem is:

a) write_tsc() doesn't work,
b) the TSC's are unsynced (or have an offset),
c) we removed a check which would have caused pmtmr/rtc fallback.

> > It is a "CPU0: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03".
>
> Was that on that system?

yes.

> I guess it could be checked for and TSC
> be forced off.

There's no need for that, I think. synchronize_tsc_bp() knows for-sure
that the synchronization failed, in a way which works on all CPUs. So all
we need to do is to set some flag in synchronize_tsc_bp() if `buggy' is
set, telling the clocksource code to give up on the TSC.

> It sounds like a real CPU bug however.

I was hoping the Intel guys could help out with that.
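
For illustration only, here is a rough userspace sketch of the "set a flag
when `buggy' is set" idea. It is not kernel code and not a proposed patch.
The names tsc_sync_failed, model_synchronize_tsc() and
model_pick_clocksource(), the tolerance parameter, the skew check against
CPU 0 and the ratings are all assumptions made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: set once TSC synchronization is known to have failed. */
bool tsc_sync_failed;

/* Illustrative clock source entry; the ratings used below are invented. */
struct clock_candidate {
	const char *name;
	int rating;	/* higher is preferred */
	bool usable;
};

/*
 * Userspace stand-in for the check described above (an assumption, not
 * the real synchronize_tsc_bp()): compare each CPU's TSC reading with
 * CPU 0 and treat any skew above `tolerance' as "buggy".
 * Returns true if the readings are considered synchronized.
 */
bool model_synchronize_tsc(const uint64_t *tsc, size_t ncpus,
			   uint64_t tolerance)
{
	bool buggy = false;

	assert(ncpus > 0);
	for (size_t i = 1; i < ncpus; i++) {
		uint64_t delta = tsc[i] > tsc[0] ? tsc[i] - tsc[0]
						 : tsc[0] - tsc[i];
		if (delta > tolerance)
			buggy = true;
	}

	/* The point of the proposal: record the failure for later use. */
	if (buggy)
		tsc_sync_failed = true;

	return !buggy;
}

/*
 * Pick the highest-rated usable source. The TSC is skipped when the
 * flag is set, so selection falls back to pmtmr (or rtc).
 */
const char *model_pick_clocksource(void)
{
	const struct clock_candidate candidates[] = {
		{ "tsc",   300, !tsc_sync_failed },
		{ "pmtmr", 200, true },
		{ "rtc",   100, true },
	};
	const struct clock_candidate *best = NULL;

	for (size_t i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
		if (!candidates[i].usable)
			continue;
		if (best == NULL || candidates[i].rating > best->rating)
			best = &candidates[i];
	}
	return best ? best->name : NULL;
}
```

In this model, readings of {1000, 1001, 900000} with a tolerance of 10 set
the flag, and model_pick_clocksource() then returns "pmtmr" instead of
"tsc". The flag is sticky on purpose: once synchronization has failed, the
sketch never trusts the TSC again.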