From: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	x86@kernel.org, Peter Zijlstra <peterz@infradead.org>,
	Borislav Petkov <bp@alien8.de>,
	Bruce Schlobohm <bruce.schlobohm@intel.com>,
	Kevin Stanton <kevin.b.stanton@intel.com>,
	Allen Hung <allen_hung@dell.com>
Subject: Re: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm
Date: Wed, 14 Dec 2016 02:36:21 +0100
Message-ID: <357e0a0f-af6b-2a8e-2af0-b05652ccbb30@hispeed.ch>
In-Reply-To: <alpine.DEB.2.20.1612131744500.3415@nanos>

On 13.12.2016 at 17:46, Thomas Gleixner wrote:
> On Tue, 13 Dec 2016, Roland Scheidegger wrote:
> 
>> On 13.12.2016 at 14:14, Thomas Gleixner wrote:
>>> Roland reported interesting TSC ADJUST register wreckage on his DELL
>>> machine, which seems to populate that MSR with a random number generator.
>>
>> FWIW, I thought about the actual values some more and I don't actually
>> think they are all that random any more: the behavior is consistent with
>> the bios trying to zero the TSC of all cpus. If I understand this right,
>> writing a zero to TSC would cause somewhat small negative values in the
>> TSC_ADJ register at boot time, and larger negative values at suspend
>> time (at least if the TSC just stops when suspended and isn't reset) -
>> exactly what I'm seeing.
>> (And of course the different TSC_ADJ values would be because the bios is
>> writing TSC without any thoughts of synchronization, just one cpu after
>> another).
> 
> Yeah, that might be. Still it looks like random nonsense and definitely the
> BIOS developers did not follow the secrit boot protocol.
> 
>>> Deeper investigation into fixing this wreckage unearthed another special
>>> feature which is designed by Intel: Negative TSC adjust values cause
>>> interrupt storms on the TSC deadline timer. Further details in patch 2/2
>>
>> This actually looks like quite a serious hw bug to me, shouldn't there
>> be an erratum for such a bug?
>>
>> And I still don't quite understand why the lockup doesn't happen after a
>> warm boot, there must be something different there...
> 
> What are the adjust values after a warm boot?
> 

So, after a cold boot with a kernel which doesn't adjust the TSCs,
followed by a warm boot, I got:
[    0.000000] TSC ADJUST: CPU0: -602358264300 176072418728
[    0.000000] TSC ADJUST: Boot CPU0: -602358264300
[    0.172245] TSC ADJUST: CPU1: -602360207584 176587932558
[    0.172245] TSC ADJUST differs: Reference CPU0: -602358264300 CPU1: -602360207584
[    0.172246] TSC ADJUST synchronize: Reference CPU0: -602358264300 CPU1: -602360207584
[    0.252663] TSC ADJUST: CPU2: -602359000822 176828627154
[    0.252663] TSC ADJUST differs: Reference CPU0: -602358264300 CPU2: -602359000822
[    0.252664] TSC ADJUST synchronize: Reference CPU0: -602358264300 CPU2: -602359000822
[    0.337014] TSC ADJUST: CPU3: -602360177680 177081093132
[    0.337014] TSC ADJUST differs: Reference CPU0: -602358264300 CPU3: -602360177680
[    0.337015] TSC ADJUST synchronize: Reference CPU0: -602358264300 CPU3: -602360177680

and so on.
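
To make the per-CPU differences in the warm-boot log above easier to
see, here is a minimal sketch that subtracts the CPU0 value from each
CPU's TSC_ADJUST value. The numbers are copied from the log; the
dictionary and the function name are only illustrative and do not come
from the kernel.

```python
# TSC_ADJUST values per CPU, copied from the warm-boot log above.
WARM_BOOT_ADJUST = {
    0: -602358264300,
    1: -602360207584,
    2: -602359000822,
    3: -602360177680,
}


def adjust_deltas(adjust_by_cpu, reference_cpu=0):
    """Return each CPU's TSC_ADJUST minus the reference CPU's value.

    Hypothetical helper for illustration only.
    """
    ref = adjust_by_cpu[reference_cpu]
    return {cpu: value - ref for cpu, value in adjust_by_cpu.items()}


if __name__ == "__main__":
    for cpu, delta in sorted(adjust_deltas(WARM_BOOT_ADJUST).items()):
        print(f"CPU{cpu}: {delta:+d} relative to CPU0")
```

With these inputs, CPU1 to CPU3 come out roughly 0.7 to 1.9 million
counts below CPU0, which is tiny compared to the absolute values.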

However, after another reboot (some minutes later), it locked up
straight away again:

TSC ADJUST: CPU1: -8257481427958 165112676430
TSC ADJUST differs: Reference CPU0: -8257479484330 CPU1: -8257481427958
TSC ADJUST synchronize: Reference CPU0: -8257479484330 CPU1: -8254781427958
TSC target sync skip
...
smpboot: Target CPU is online

I had actually thought the TSC would get reset on a warm boot too, but
that clearly isn't the case...
I don't know what the difference between the first and second reboot is,
though: the adjust values just have a larger magnitude, but otherwise the
direction of the adjustments and everything else looks the same (just
like a cold boot, which also looks the same to me).
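
For reference, the hypothesis quoted further up (the BIOS zeroing the
TSC on one CPU after another) can be sketched as a toy model. It relies
on the architectural rule that a write to the TSC changes TSC_ADJUST by
the same delta. The class, the function and the cycle counts are made up
for illustration; this is not kernel or BIOS code.

```python
class ToyCpu:
    """Toy model of one logical CPU's TSC and TSC_ADJUST (illustration only)."""

    def __init__(self, tsc=0):
        self.tsc = tsc
        self.tsc_adjust = 0

    def tick(self, cycles):
        # Time passing advances the TSC but leaves TSC_ADJUST alone.
        self.tsc += cycles

    def write_tsc(self, value):
        # A write to the TSC changes TSC_ADJUST by the same delta.
        self.tsc_adjust += value - self.tsc
        self.tsc = value


def bios_zeroes_tscs(cpus, cycles_between_writes):
    """Zero each CPU's TSC in turn, without any synchronization.

    Hypothetical stand-in for what the firmware might be doing.
    """
    for cpu in cpus:
        cpu.write_tsc(0)
        for other in cpus:
            other.tick(cycles_between_writes)
    return [cpu.tsc_adjust for cpu in cpus]


if __name__ == "__main__":
    cpus = [ToyCpu(tsc=1000) for _ in range(3)]
    print(bios_zeroes_tscs(cpus, cycles_between_writes=100))
```

In this model every CPU ends up with a negative TSC_ADJUST of about the
TSC value at the time of the write, and the values differ slightly per
CPU because the writes happen at different times.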

Roland

Thread overview: 22+ messages
2016-12-13 13:14 [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm Thomas Gleixner
2016-12-13 13:14 ` [patch 1/2] x86/tsc: Validate TSC_ADJUST after resume Thomas Gleixner
2016-12-13 13:22   ` Peter Zijlstra
2016-12-13 13:23     ` Thomas Gleixner
2016-12-15 10:52   ` [tip:x86/timers] " tip-bot for Thomas Gleixner
2016-12-13 13:14 ` [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero Thomas Gleixner
2016-12-13 13:43   ` Peter Zijlstra
2016-12-13 15:49     ` Thomas Gleixner
2016-12-15 10:53   ` [tip:x86/timers] " tip-bot for Thomas Gleixner
2016-12-16 11:46   ` [patch 2/2] " Thomas Gleixner
2016-12-16 11:52     ` Ingo Molnar
2016-12-16 11:53       ` Thomas Gleixner
2016-12-16 13:33     ` Thomas Gleixner
2016-12-13 16:34 ` [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm Roland Scheidegger
2016-12-13 16:46   ` Thomas Gleixner
2016-12-14  1:36     ` Roland Scheidegger [this message]
2016-12-14  7:31       ` Thomas Gleixner
2016-12-14 20:59         ` Thomas Gleixner
2016-12-14 21:40           ` Thomas Gleixner
2016-12-14 22:54             ` Roland Scheidegger
2016-12-15  9:31               ` Thomas Gleixner
2017-01-26 23:40                 ` Stanton, Kevin B
