All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Peter Zijlstra <peterz@infradead.org>
Cc: paulmck@kernel.org, Feng Tang <feng.tang@intel.com>,
	kernel test robot <oliver.sang@intel.com>,
	0day robot <lkp@intel.com>, John Stultz <john.stultz@linaro.org>,
	Stephen Boyd <sboyd@kernel.org>, Jonathan Corbet <corbet@lwn.net>,
	Mark Rutland <Mark.Rutland@arm.com>,
	Marc Zyngier <maz@kernel.org>, Andi Kleen <ak@linux.intel.com>,
	Xing Zhengjun <zhengjun.xing@linux.intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, kernel-team@fb.com, neeraju@codeaurora.org,
	zhengjun.xing@intel.com, x86@kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [clocksource]  8c30ace35d: WARNING:at_kernel/time/clocksource.c:#clocksource_watchdog
Date: Wed, 28 Apr 2021 19:00:15 +0200	[thread overview]
Message-ID: <871raumjj4.ffs@nanos.tec.linutronix.de> (raw)
In-Reply-To: <YImBlV8l7bjZ7Q6G@hirez.programming.kicks-ass.net>

On Wed, Apr 28 2021 at 17:39, Peter Zijlstra wrote:
> On Wed, Apr 28, 2021 at 03:34:52PM +0200, Thomas Gleixner wrote:
>> #4 is the easy case because we can check MSR_TSC_ADJUST to figure out
>>    whether something has written to MSR_TSC or MSR_TSC_ADJUST and undo
>>    the damage in a sane way.
>
> This is after the fact though; userspace (and kernel space) will have
> observed non-linear time and things will be broken in various subtle and
> hard to tell ways.

What I observed in the recent past is that _IF_ that happens it's a
small amount of cycles so it's not a given that this can be observed
accross CPUs. But yes, it's daft.

>> I can live with that and maybe we should have done that 15 years ago
>> instead of trying to work around it at the symptom level.
>
> Anybody that still has runtime BIOS wreckage will then silently suffer
> nonlinear time, doubly so for anybody not having TSC_ADJUST. Are we sure
> we can tell them all to bugger off and buy new hardware?
>
> At the very least we need something like tsc=broken, to explicitly mark
> TSC bad on machines, so that people that see TSC fail on their current
> kernels can continue to use the new kernels. This requires a whole lot
> of care on the part of users though, and will raise a ruckus, because I
> bet a fair number of these people are not even currently aware we're
> disabling TSC for them :/

I'm still allowed to dream, right? :)

Thanks,

        tglx

WARNING: multiple messages have this Message-ID (diff)
From: Thomas Gleixner <tglx@linutronix.de>
To: lkp@lists.01.org
Subject: Re: [clocksource] 8c30ace35d: WARNING:at_kernel/time/clocksource.c:#clocksource_watchdog
Date: Wed, 28 Apr 2021 19:00:15 +0200	[thread overview]
Message-ID: <871raumjj4.ffs@nanos.tec.linutronix.de> (raw)
In-Reply-To: <YImBlV8l7bjZ7Q6G@hirez.programming.kicks-ass.net>

[-- Attachment #1: Type: text/plain, Size: 1471 bytes --]

On Wed, Apr 28 2021 at 17:39, Peter Zijlstra wrote:
> On Wed, Apr 28, 2021 at 03:34:52PM +0200, Thomas Gleixner wrote:
>> #4 is the easy case because we can check MSR_TSC_ADJUST to figure out
>>    whether something has written to MSR_TSC or MSR_TSC_ADJUST and undo
>>    the damage in a sane way.
>
> This is after the fact though; userspace (and kernel space) will have
> observed non-linear time and things will be broken in various subtle and
> hard to tell ways.

What I observed in the recent past is that _IF_ that happens it's a
small amount of cycles so it's not a given that this can be observed
accross CPUs. But yes, it's daft.

>> I can live with that and maybe we should have done that 15 years ago
>> instead of trying to work around it at the symptom level.
>
> Anybody that still has runtime BIOS wreckage will then silently suffer
> nonlinear time, doubly so for anybody not having TSC_ADJUST. Are we sure
> we can tell them all to bugger off and buy new hardware?
>
> At the very least we need something like tsc=broken, to explicitly mark
> TSC bad on machines, so that people that see TSC fail on their current
> kernels can continue to use the new kernels. This requires a whole lot
> of care on the part of users though, and will raise a ruckus, because I
> bet a fair number of these people are not even currently aware we're
> disabling TSC for them :/

I'm still allowed to dream, right? :)

Thanks,

        tglx

  reply	other threads:[~2021-04-28 17:00 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-25 22:45 [PATCH v10 clocksource 0/7] Do not mark clocks unstable due to delays for v5.13 Paul E. McKenney
2021-04-25 22:47 ` [PATCH v10 clocksource 1/7] clocksource: Provide module parameters to inject delays in watchdog Paul E. McKenney
2021-04-26  4:07   ` Andi Kleen
2021-04-26  7:13     ` Thomas Gleixner
2021-04-26 15:28     ` Paul E. McKenney
2021-04-26 16:00       ` Andi Kleen
2021-04-26 16:14         ` Paul E. McKenney
2021-04-26 17:56           ` Andi Kleen
2021-04-26 18:24             ` Paul E. McKenney
2021-04-28  4:49               ` Luming Yu
2021-04-28 13:57                 ` Paul E. McKenney
2021-04-28 14:24                   ` Luming Yu
2021-04-28 14:37                     ` Thomas Gleixner
2021-04-25 22:47 ` [PATCH v10 clocksource 2/7] clocksource: Retry clock read if long delays detected Paul E. McKenney
2021-04-27  1:44   ` Feng Tang
2021-04-25 22:47 ` [PATCH v10 clocksource 3/7] clocksource: Check per-CPU clock synchronization when marked unstable Paul E. McKenney
2021-04-26  4:12   ` Andi Kleen
2021-04-26  7:16     ` Thomas Gleixner
2021-04-25 22:47 ` [PATCH v10 clocksource 4/7] clocksource: Provide a module parameter to fuzz per-CPU clock checking Paul E. McKenney
2021-04-25 22:47 ` [PATCH v10 clocksource 5/7] clocksource: Limit number of CPUs checked for clock synchronization Paul E. McKenney
2021-04-25 22:47 ` [PATCH v10 clocksource 6/7] clocksource: Forgive tsc_early pre-calibration drift Paul E. McKenney
2021-04-26 15:01   ` Feng Tang
2021-04-26 15:25     ` Paul E. McKenney
2021-04-26 15:36       ` Feng Tang
2021-04-26 18:26         ` Paul E. McKenney
2021-04-27  1:13           ` Feng Tang
2021-04-27  3:46             ` Paul E. McKenney
2021-04-27  4:16               ` Feng Tang
2021-04-26 15:28     ` Thomas Gleixner
2021-04-27 21:03     ` Thomas Gleixner
2021-04-27  7:27   ` [clocksource] 8c30ace35d: WARNING:at_kernel/time/clocksource.c:#clocksource_watchdog kernel test robot
2021-04-27  7:27     ` kernel test robot
2021-04-27  8:45     ` Feng Tang
2021-04-27  8:45       ` Feng Tang
2021-04-27 13:37       ` Paul E. McKenney
2021-04-27 13:37         ` Paul E. McKenney
2021-04-27 17:50         ` Paul E. McKenney
2021-04-27 17:50           ` Paul E. McKenney
2021-04-27 21:09           ` Thomas Gleixner
2021-04-27 21:09             ` Thomas Gleixner
2021-04-28  1:48             ` Paul E. McKenney
2021-04-28  1:48               ` Paul E. McKenney
2021-04-28 10:14               ` Thomas Gleixner
2021-04-28 10:14                 ` Thomas Gleixner
2021-04-28 18:31                 ` Paul E. McKenney
2021-04-28 18:31                   ` Paul E. McKenney
2021-04-28 13:34             ` Thomas Gleixner
2021-04-28 13:34               ` Thomas Gleixner
2021-04-28 15:39               ` Peter Zijlstra
2021-04-28 15:39                 ` Peter Zijlstra
2021-04-28 17:00                 ` Thomas Gleixner [this message]
2021-04-28 17:00                   ` Thomas Gleixner
2021-04-29  7:38                   ` Feng Tang
2021-04-29  7:38                     ` Feng Tang
2021-04-28 18:31               ` Paul E. McKenney
2021-04-28 18:31                 ` Paul E. McKenney
2021-04-29  8:27                 ` Thomas Gleixner
2021-04-29  8:27                   ` Thomas Gleixner
2021-04-29 14:26                   ` Paul E. McKenney
2021-04-29 14:26                     ` Paul E. McKenney
2021-04-29 17:30                     ` Thomas Gleixner
2021-04-29 17:30                       ` Thomas Gleixner
2021-04-29 23:04                       ` Andi Kleen
2021-04-29 23:04                         ` Andi Kleen
2021-04-30  0:24                         ` Paul E. McKenney
2021-04-30  0:24                           ` Paul E. McKenney
2021-04-30  0:59                           ` Paul E. McKenney
2021-04-30  0:59                             ` Paul E. McKenney
2021-04-30  5:08                       ` Paul E. McKenney
2021-04-30  5:08                         ` Paul E. McKenney
2021-04-25 22:47 ` [PATCH v9 clocksource 6/6] clocksource: Reduce WATCHDOG_THRESHOLD Paul E. McKenney
2021-04-25 22:47 ` [PATCH v10 clocksource 7/7] " Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871raumjj4.ffs@nanos.tec.linutronix.de \
    --to=tglx@linutronix.de \
    --cc=Mark.Rutland@arm.com \
    --cc=ak@linux.intel.com \
    --cc=corbet@lwn.net \
    --cc=feng.tang@intel.com \
    --cc=john.stultz@linaro.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@intel.com \
    --cc=lkp@lists.01.org \
    --cc=maz@kernel.org \
    --cc=neeraju@codeaurora.org \
    --cc=oliver.sang@intel.com \
    --cc=paulmck@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=sboyd@kernel.org \
    --cc=x86@kernel.org \
    --cc=zhengjun.xing@intel.com \
    --cc=zhengjun.xing@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.