From: Thomas Gleixner <tglx@linutronix.de>
To: Sean Christopherson <seanjc@google.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	x86@kernel.org, Ashok Raj <ashok.raj@linux.intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Tony Luck <tony.luck@intel.com>,
	Arjan van de Veen <arjan@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Eric Biederman <ebiederm@xmission.com>
Subject: Re: [patch 0/6] Cure kexec() vs. mwait_play_dead() troubles
Date: Tue, 06 Jun 2023 09:20:10 +0200
Message-ID: <87mt1d3vmd.ffs@tglx>
In-Reply-To: <ZH5rCySnEr0KmATT@google.com>

On Mon, Jun 05 2023 at 16:08, Sean Christopherson wrote:
> On Tue, Jun 06, 2023, Thomas Gleixner wrote:
>> On Mon, Jun 05 2023 at 10:41, Sean Christopherson wrote:
>> > On Sat, Jun 03, 2023, Thomas Gleixner wrote:
>> >> This is only half safe because HLT can resume execution due to NMI, SMI and
>> >> MCE. Unfortunately there is no real safe mechanism to "park" a CPU reliably,
>> >
>> > On Intel.  On AMD, enabling EFER.SVME and doing CLGI will block everything except
>> > single-step #DB (lol) and RESET.  #MC handling is implementation-dependent and
>> > *might* cause shutdown, but at least there's a chance it will work.  And presumably
>> > modern CPUs do pend the #MC until GIF=1.
>> 
>> Abusing SVME for that is definitely in the realm of creative bonus
>> points, but not necessarily a general purpose solution.
>
> Heh, my follow-up ideas for Intel are to abuse XuCode or SEAM ;-)

I feared that :)

>> >> So parking them via INIT is not completely solving the problem, but it
>> >> takes at least NMI and SMI out of the picture.
>> >
>> > Don't most SMM handlers rendezvous all CPUs?  I.e. won't blocking SMIs indefinitely
>> > potentially cause problems too?
>> 
>> Not that I'm aware of. If so then this would be a hideous firmware bug
>> as firmware must be aware of CPUs which hang around in INIT independent
>> of this.
>
> I was thinking of the EDKII code in UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c, e.g.
> SmmWaitForApArrival().  I've never dug deeply into how EDKII uses SMM, what its
> timeouts are, etc., I just remember coming across that code when poking around
> EDKII for other stuff.

There is a comment:

  Note the SMI Handlers must ALWAYS take into account the cases that not
  all APs are available in an SMI run.

Also, not all SMIs require global synchronization. But it's all an
impenetrable mess...

>> Making this work for regular kexec() including this:
>> 
>> > To avoid OOM after many kexec(), reserving a page could be done iff
>> > the current kernel wasn't itself kexec()'d.
>> 
>> would be possible and I thought about it, but that needs a complete new
>> design of "offline", "shutdown offline" and a non-trivial amount of
>> backwards compatibility magic because you can't assume that the kexec()
>> kernel version is greater or equal to the current one. kexec() is
>> supposed to work both ways, downgrading and upgrading. IOW, that ship
>> sailed long ago.
>
> Right, but doesn't gaining "full" protection require ruling out unenlightened
> downgrades?  E.g. if someone downgrades to an old kernel, doesn't hide the "offline"
> CPUs from the kexec() kernel, and boots the old kernel with -nosmt or whatever,
> then that old kernel will do the naive MWAIT or unprotected HLT and
> it's hosed again.

Of course.

> If we're relying on the admin to hide the offline CPUs, could we usurp
> an existing kernel param to hide a small chunk of memory instead?

The only "safe" place is below 1M I think. Not sure whether we have
some existing command line option to "hide" a range there. Neither am I
sure that this would be always the same range.

More questions than answers :)

Thanks

        tglx




