* Re: [PATCH v2] x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
2021-10-18 18:17 ` Borislav Petkov
@ 2021-10-18 19:10 ` H. Peter Anvin
2021-10-18 19:31 ` Borislav Petkov
2021-10-18 19:29 ` Sean Christopherson
2021-10-18 20:06 ` Andrew Cooper
2 siblings, 1 reply; 11+ messages in thread
From: H. Peter Anvin @ 2021-10-18 19:10 UTC (permalink / raw)
To: Borislav Petkov, Jane Malalane
Cc: LKML, x86, Thomas Gleixner, Ingo Molnar, Pu Wen, Paolo Bonzini,
Sean Christopherson, Peter Zijlstra, Andrew Cooper,
Yazen Ghannam, Brijesh Singh, Huang Rui, Andy Lutomirski,
Kim Phillips, stable
AFAIK no Intel CPU has ever had that behavior, and always cleared the segments; I don't Intel has any plans of supporting such a CPUID bit (although I'd certainly be willing to take such a request back to the CPU teams on request.)
That being said, this sounds like an ideal use for the hypervisor CPU feature flag. Maybe we should consider a migration hypervisor flag too to explicitly tell the kernel not to rely on hardware probing that breaks migration in general.
Now, with a CPUID but being introduced, the right thing would be to use the CPUID bit as a feature instead of using a bug flag, and add whitelisting in the vendor-specific code as applicable.
On October 18, 2021 11:17:30 AM PDT, Borislav Petkov <bp@alien8.de> wrote:
>On Wed, Oct 13, 2021 at 03:22:30PM +0100, Jane Malalane wrote:
>> @@ -650,6 +651,27 @@ static void early_init_amd(struct cpuinfo_x86 *c)
>> if (c->x86_power & BIT(14))
>> set_cpu_cap(c, X86_FEATURE_RAPL);
>>
>> + /*
>> + * Zen1 and earlier CPUs don't clear segment base/limits when
>> + * loading a NULL selector. This has been designated
>> + * X86_BUG_NULL_SEG.
>> + *
>> + * Zen3 CPUs advertise Null Selector Clears Base in CPUID.
>> + * Zen2 CPUs also have this behaviour, but no CPUID bit.
>> + *
>> + * A hypervisor may sythesize the bit, but may also hide it
>> + * for migration safety, so we must not probe for model
>> + * specific behaviour when virtualised.
>> + */
>> + if (c->extended_cpuid_level >= 0x80000021 && cpuid_eax(0x80000021) & BIT(6))
>> + nscb = true;
>> +
>> + if (!cpu_has(c, X86_FEATURE_HYPERVISOR) && !nscb && c->x86 == 0x17)
>> + nscb = check_null_seg_clears_base(c);
>> +
>> + if (!nscb)
>> + set_cpu_bug(c, X86_BUG_NULL_SEG);
>> +
>> #ifdef CONFIG_X86_64
>> set_cpu_cap(c, X86_FEATURE_SYSCALL32);
>> #else
>
>Can we do something like this?
>
>It is carved out into a separate function which you can simply call from
>early_init_amd() and early_init_hygon().
>
>I guess you can put that function in arch/x86/kernel/cpu/common.c or so.
>
>Then, you should put the comments right over the code like I've done
>below so that one can follow what's going on with each particular check.
>
>I've also flipped the logic a bit and it is simpler this way.
>
>Totally untested of course.
>
>static void early_probe_null_seg_clearing_base(struct cpuinfo_x86 *c)
>{
> /*
> * A hypervisor may sythesize the bit, but may also hide it
> * for migration safety, so do not probe for model-specific
> * behaviour when virtualised.
> */
> if (cpu_has(c, X86_FEATURE_HYPERVISOR))
> return;
>
> /* Zen3 CPUs advertise Null Selector Clears Base in CPUID. */
> if (c->extended_cpuid_level >= 0x80000021 && cpuid_eax(0x80000021) & BIT(6))
> return;
>
> /* Zen2 CPUs also have this behaviour, but no CPUID bit. */
> if (c->x86 == 0x17 && check_null_seg_clears_base(c))
> return;
>
> /* All the remaining ones are affected */
> set_cpu_bug(c, X86_BUG_NULL_SEG);
>}
>
>> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
>> index 0f8885949e8c..2ca4afb97247 100644
>> --- a/arch/x86/kernel/cpu/common.c
>> +++ b/arch/x86/kernel/cpu/common.c
>> @@ -1395,7 +1395,7 @@ void __init early_cpu_init(void)
>> early_identify_cpu(&boot_cpu_data);
>> }
>>
>> -static void detect_null_seg_behavior(struct cpuinfo_x86 *c)
>> +bool check_null_seg_clears_base(struct cpuinfo_x86 *c)
>> {
>> #ifdef CONFIG_X86_64
>> /*
>> @@ -1418,10 +1418,10 @@ static void detect_null_seg_behavior(struct cpuinfo_x86 *c)
>> wrmsrl(MSR_FS_BASE, 1);
>> loadsegment(fs, 0);
>> rdmsrl(MSR_FS_BASE, tmp);
>> - if (tmp != 0)
>> - set_cpu_bug(c, X86_BUG_NULL_SEG);
>> wrmsrl(MSR_FS_BASE, old_base);
>> + return tmp == 0;
>> #endif
>> + return true;
>> }
>>
>> static void generic_identify(struct cpuinfo_x86 *c)
>> @@ -1457,8 +1457,6 @@ static void generic_identify(struct cpuinfo_x86 *c)
>>
>> get_model_name(c); /* Default name */
>>
>> - detect_null_seg_behavior(c);
>> -
>> /*
>> * ESPFIX is a strange bug. All real CPUs have it. Paravirt
>> * systems that run Linux at CPL > 0 may or may not have the
>
>So this function is called on all x86 CPUs. Are you sure others besides
>AMD and Hygon do not have the same issue?
>
>IOW, I wouldn't remove that call here.
>
>But then this is the identify() phase in the boot process and you've
>moved it to early_identify() by putting it in the ->c_early_init()
>function pointer on AMD and Hygon.
>
>Is there any particular reasoning for this or can that detection remain
>in ->c_identify()?
>
>Because if this null seg behavior detection should happen on all
>CPUs - and I think it should, because, well, it has been that way
>until now - then the vendor specific identification minus what
>detect_null_seg_behavior() does should run first and then after
>->c_identify() is done, you should do something like:
>
> if (!cpu_has_bug(c, X86_BUG_NULL_SEG)) {
> if (!check_null_seg_clears_base(c))
> set_cpu_bug(c, X86_BUG_NULL_SEG);
> }
>
>so that it still takes place on all CPUs.
>
>I.e., you should split the detection.
>
>I hope I'm making sense ...
>
>Ah, btw, that @c parameter to detect_null_seg_behavior() is unused - you
>should remove it in a pre-patch.
>
>Thx.
>
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
2021-10-18 19:10 ` H. Peter Anvin
@ 2021-10-18 19:31 ` Borislav Petkov
0 siblings, 0 replies; 11+ messages in thread
From: Borislav Petkov @ 2021-10-18 19:31 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Jane Malalane, LKML, x86, Thomas Gleixner, Ingo Molnar, Pu Wen,
Paolo Bonzini, Sean Christopherson, Peter Zijlstra,
Andrew Cooper, Yazen Ghannam, Brijesh Singh, Huang Rui,
Andy Lutomirski, Kim Phillips, stable
On Mon, Oct 18, 2021 at 12:10:20PM -0700, H. Peter Anvin wrote:
> AFAIK no Intel CPU has ever had that behavior, and always cleared the
> segments; I don't Intel has any plans of supporting such a CPUID bit
> (although I'd certainly be willing to take such a request back to the
> CPU teams on request.)
No need - we can always set or clear a flag on Intel, depending on what
we do.
> That being said, this sounds like an ideal use for the hypervisor CPU
> feature flag.
Yap, it uses it.
> Maybe we should consider a migration hypervisor flag too to explicitly
> tell the kernel not to rely on hardware probing that breaks migration
> in general.
Meh, migration-specific flag calls for all kinds of nasty when each
HV would want different things to happen in the guest, for migration.
And then the patch flood will come. I mean, we already do a bunch of
X86_FEATURE_HYPERVISOR all over the place and apparently it is enough
here too...
> Now, with a CPUID but being introduced, the right thing would be to
> use the CPUID bit as a feature instead of using a bug flag, and add
> whitelisting in the vendor-specific code as applicable.
I guess we can flip all that logic checking X86_BUG_NULL_SEG... it
sounds like a lot of churn to me, though and I don't see a pressing need
for it unless someone is bored and wants to do some kernel patching
exercises but whatever...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
2021-10-18 18:17 ` Borislav Petkov
2021-10-18 19:10 ` H. Peter Anvin
@ 2021-10-18 19:29 ` Sean Christopherson
2021-10-18 19:37 ` Borislav Petkov
2021-10-18 20:06 ` Andrew Cooper
2 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2021-10-18 19:29 UTC (permalink / raw)
To: Borislav Petkov
Cc: Jane Malalane, LKML, x86, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, Pu Wen, Paolo Bonzini, Peter Zijlstra,
Andrew Cooper, Yazen Ghannam, Brijesh Singh, Huang Rui,
Andy Lutomirski, Kim Phillips, stable
On Mon, Oct 18, 2021, Borislav Petkov wrote:
> On Wed, Oct 13, 2021 at 03:22:30PM +0100, Jane Malalane wrote:
> Totally untested of course.
>
> static void early_probe_null_seg_clearing_base(struct cpuinfo_x86 *c)
> {
> /*
> * A hypervisor may sythesize the bit, but may also hide it
> * for migration safety, so do not probe for model-specific
> * behaviour when virtualised.
> */
> if (cpu_has(c, X86_FEATURE_HYPERVISOR))
This isn't correct. When running as a guest, the intended behavior is to fully
trust the CPUID.0x80000021 bit. If bit 6 is set, yay, the hypervisor has told
the kernel that it will only ever run on hardware without the bug. If bit 6 is
clear and HYPERVISOR is true, then the FMS crud can't be trusted because the
kernel _may_ run on affected hardware in the future even if the current underlying
hardware is not affected.
> return;
>
> /* Zen3 CPUs advertise Null Selector Clears Base in CPUID. */
> if (c->extended_cpuid_level >= 0x80000021 && cpuid_eax(0x80000021) & BIT(6))
> return;
>
> /* Zen2 CPUs also have this behaviour, but no CPUID bit. */
> if (c->x86 == 0x17 && check_null_seg_clears_base(c))
> return;
>
> /* All the remaining ones are affected */
> set_cpu_bug(c, X86_BUG_NULL_SEG);
> }
...
> > @@ -1457,8 +1457,6 @@ static void generic_identify(struct cpuinfo_x86 *c)
> >
> > get_model_name(c); /* Default name */
> >
> > - detect_null_seg_behavior(c);
> > -
> > /*
> > * ESPFIX is a strange bug. All real CPUs have it. Paravirt
> > * systems that run Linux at CPL > 0 may or may not have the
>
> So this function is called on all x86 CPUs. Are you sure others besides
> AMD and Hygon do not have the same issue?
>
> IOW, I wouldn't remove that call here.
I agree. If the argument for this patch is that the kernel can be migrated to
older hardware, then it stands to reason that the kernel could also be migrated
to a different CPU vendor entirely. E.g. start on Intel, migrate to Zen1, kaboom.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
2021-10-18 19:29 ` Sean Christopherson
@ 2021-10-18 19:37 ` Borislav Petkov
2021-10-18 20:05 ` Sean Christopherson
0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2021-10-18 19:37 UTC (permalink / raw)
To: Sean Christopherson
Cc: Jane Malalane, LKML, x86, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, Pu Wen, Paolo Bonzini, Peter Zijlstra,
Andrew Cooper, Yazen Ghannam, Brijesh Singh, Huang Rui,
Andy Lutomirski, Kim Phillips, stable
On Mon, Oct 18, 2021 at 07:29:41PM +0000, Sean Christopherson wrote:
> This isn't correct. When running as a guest, the intended behavior is to fully
> trust the CPUID.0x80000021 bit.
Really? Because I'm coming from an SEV-SNP mail thread where we don't
trust the HV at all and we even hand in a CPUID page into the guest...
:-P
> If bit 6 is set, yay, the hypervisor has told the kernel that it
> will only ever run on hardware without the bug. If bit 6 is clear
> and HYPERVISOR is true, then the FMS crud can't be trusted because
> the kernel _may_ run on affected hardware in the future even if the
> current underlying hardware is not affected.
Ok, I see, then the CPUID check needs to go first, makes sense.
> I agree. If the argument for this patch is that the kernel can be migrated to
> older hardware, then it stands to reason that the kernel could also be migrated
> to a different CPU vendor entirely. E.g. start on Intel, migrate to Zen1, kaboom.
Migration across vendors? Really, that works?
I'll believe it only when I see it with my own eyes.
:-)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
2021-10-18 19:37 ` Borislav Petkov
@ 2021-10-18 20:05 ` Sean Christopherson
2021-10-18 20:18 ` Andrew Cooper
0 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2021-10-18 20:05 UTC (permalink / raw)
To: Borislav Petkov
Cc: Jane Malalane, LKML, x86, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, Pu Wen, Paolo Bonzini, Peter Zijlstra,
Andrew Cooper, Yazen Ghannam, Brijesh Singh, Huang Rui,
Andy Lutomirski, Kim Phillips, stable
On Mon, Oct 18, 2021, Borislav Petkov wrote:
> On Mon, Oct 18, 2021 at 07:29:41PM +0000, Sean Christopherson wrote:
> > I agree. If the argument for this patch is that the kernel can be migrated to
> > older hardware, then it stands to reason that the kernel could also be migrated
> > to a different CPU vendor entirely. E.g. start on Intel, migrate to Zen1, kaboom.
>
> Migration across vendors? Really, that works?
>
> I'll believe it only when I see it with my own eyes.
There are plenty of caveats, but it is feasible. KVM even has a few patches that
came about specifically to support cross-vendor migration, e.g. commit adc2a23734ac
("KVM: nSVM: improve SYSENTER emulation on AMD").
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
2021-10-18 20:05 ` Sean Christopherson
@ 2021-10-18 20:18 ` Andrew Cooper
0 siblings, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2021-10-18 20:18 UTC (permalink / raw)
To: Sean Christopherson, Borislav Petkov
Cc: Jane Malalane, LKML, x86, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, Pu Wen, Paolo Bonzini, Peter Zijlstra,
Yazen Ghannam, Brijesh Singh, Huang Rui, Andy Lutomirski,
Kim Phillips, stable
On 18/10/2021 21:05, Sean Christopherson wrote:
> On Mon, Oct 18, 2021, Borislav Petkov wrote:
>> On Mon, Oct 18, 2021 at 07:29:41PM +0000, Sean Christopherson wrote:
>>> I agree. If the argument for this patch is that the kernel can be migrated to
>>> older hardware, then it stands to reason that the kernel could also be migrated
>>> to a different CPU vendor entirely. E.g. start on Intel, migrate to Zen1, kaboom.
>> Migration across vendors? Really, that works?
>>
>> I'll believe it only when I see it with my own eyes.
> There are plenty of caveats, but it is feasible. KVM even has a few patches that
> came about specifically to support cross-vendor migration, e.g. commit adc2a23734ac
> ("KVM: nSVM: improve SYSENTER emulation on AMD").
http://developer.amd.com/wordpress/media/2012/10/CrossVendorMigration.pdf
Yeah - it really did work back in the day. For Xen PV guests, it still
largely works today, because the most obvious vendor specifics were
already abstracted away in the PV ABI.
Of course, none of this has remotely survived the speculation
apocalypse. A cross-vendor VM has 0 chance of getting speculation
safety working.
~Andrew
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
2021-10-18 18:17 ` Borislav Petkov
2021-10-18 19:10 ` H. Peter Anvin
2021-10-18 19:29 ` Sean Christopherson
@ 2021-10-18 20:06 ` Andrew Cooper
2021-10-18 21:46 ` Borislav Petkov
2 siblings, 1 reply; 11+ messages in thread
From: Andrew Cooper @ 2021-10-18 20:06 UTC (permalink / raw)
To: Borislav Petkov, Jane Malalane
Cc: LKML, x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Pu Wen,
Paolo Bonzini, Sean Christopherson, Peter Zijlstra,
Yazen Ghannam, Brijesh Singh, Huang Rui, Andy Lutomirski,
Kim Phillips, stable
On 18/10/2021 19:17, Borislav Petkov wrote:
> On Wed, Oct 13, 2021 at 03:22:30PM +0100, Jane Malalane wrote:
>> @@ -650,6 +651,27 @@ static void early_init_amd(struct cpuinfo_x86 *c)
>> if (c->x86_power & BIT(14))
>> set_cpu_cap(c, X86_FEATURE_RAPL);
>>
>> + /*
>> + * Zen1 and earlier CPUs don't clear segment base/limits when
>> + * loading a NULL selector. This has been designated
>> + * X86_BUG_NULL_SEG.
>> + *
>> + * Zen3 CPUs advertise Null Selector Clears Base in CPUID.
>> + * Zen2 CPUs also have this behaviour, but no CPUID bit.
>> + *
>> + * A hypervisor may sythesize the bit, but may also hide it
>> + * for migration safety, so we must not probe for model
>> + * specific behaviour when virtualised.
>> + */
>> + if (c->extended_cpuid_level >= 0x80000021 && cpuid_eax(0x80000021) & BIT(6))
>> + nscb = true;
>> +
>> + if (!cpu_has(c, X86_FEATURE_HYPERVISOR) && !nscb && c->x86 == 0x17)
>> + nscb = check_null_seg_clears_base(c);
>> +
>> + if (!nscb)
>> + set_cpu_bug(c, X86_BUG_NULL_SEG);
>> +
>> #ifdef CONFIG_X86_64
>> set_cpu_cap(c, X86_FEATURE_SYSCALL32);
>> #else
> Can we do something like this?
>
> It is carved out into a separate function which you can simply call from
> early_init_amd() and early_init_hygon().
>
> I guess you can put that function in arch/x86/kernel/cpu/common.c or so.
>
> Then, you should put the comments right over the code like I've done
> below so that one can follow what's going on with each particular check.
>
> I've also flipped the logic a bit and it is simpler this way.
:) Sadly...
>
> Totally untested of course.
>
> static void early_probe_null_seg_clearing_base(struct cpuinfo_x86 *c)
> {
> /*
> * A hypervisor may sythesize the bit, but may also hide it
> * for migration safety, so do not probe for model-specific
> * behaviour when virtualised.
> */
> if (cpu_has(c, X86_FEATURE_HYPERVISOR))
> return;
>
> /* Zen3 CPUs advertise Null Selector Clears Base in CPUID. */
> if (c->extended_cpuid_level >= 0x80000021 && cpuid_eax(0x80000021) & BIT(6))
> return;
>
> /* Zen2 CPUs also have this behaviour, but no CPUID bit. */
> if (c->x86 == 0x17 && check_null_seg_clears_base(c))
> return;
... this is 0x18 for Hygon, and ...
>
> /* All the remaining ones are affected */
> set_cpu_bug(c, X86_BUG_NULL_SEG);
... hypervisor && !ncsb still needs to set BUG_NULL_SEG, so you really
can't exit early.
> }
>
>> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
>> index 0f8885949e8c..2ca4afb97247 100644
>> --- a/arch/x86/kernel/cpu/common.c
>> +++ b/arch/x86/kernel/cpu/common.c
>> @@ -1395,7 +1395,7 @@ void __init early_cpu_init(void)
>> early_identify_cpu(&boot_cpu_data);
>> }
>>
>> -static void detect_null_seg_behavior(struct cpuinfo_x86 *c)
>> +bool check_null_seg_clears_base(struct cpuinfo_x86 *c)
>> {
>> #ifdef CONFIG_X86_64
>> /*
>> @@ -1418,10 +1418,10 @@ static void detect_null_seg_behavior(struct cpuinfo_x86 *c)
>> wrmsrl(MSR_FS_BASE, 1);
>> loadsegment(fs, 0);
>> rdmsrl(MSR_FS_BASE, tmp);
>> - if (tmp != 0)
>> - set_cpu_bug(c, X86_BUG_NULL_SEG);
>> wrmsrl(MSR_FS_BASE, old_base);
>> + return tmp == 0;
>> #endif
>> + return true;
>> }
>>
>> static void generic_identify(struct cpuinfo_x86 *c)
>> @@ -1457,8 +1457,6 @@ static void generic_identify(struct cpuinfo_x86 *c)
>>
>> get_model_name(c); /* Default name */
>>
>> - detect_null_seg_behavior(c);
>> -
>> /*
>> * ESPFIX is a strange bug. All real CPUs have it. Paravirt
>> * systems that run Linux at CPL > 0 may or may not have the
> So this function is called on all x86 CPUs. Are you sure others besides
> AMD and Hygon do not have the same issue?
No other CPU vendors are known to have this issue. (And by "issue",
even this is complicated. Back in the 32bit days, it was a plausible
perf improvement, but it backfired massively for AMD64 where there was a
possibility/expectation to use NULL segments.)
Andy only put the check in unilaterally just in case, and even that was
fine-ish until AMD went and fixed it silently in Zen2.
> IOW, I wouldn't remove that call here.
>
> But then this is the identify() phase in the boot process and you've
> moved it to early_identify() by putting it in the ->c_early_init()
> function pointer on AMD and Hygon.
>
> Is there any particular reasoning for this or can that detection remain
> in ->c_identify()?
No particular reason. It needs resolving before init gets created, but
any time before that is fine.
> Because if this null seg behavior detection should happen on all
> CPUs - and I think it should, because, well, it has been that way
> until now - then the vendor specific identification minus what
> detect_null_seg_behavior() does should run first and then after
> ->c_identify() is done, you should do something like:
>
> if (!cpu_has_bug(c, X86_BUG_NULL_SEG)) {
> if (!check_null_seg_clears_base(c))
> set_cpu_bug(c, X86_BUG_NULL_SEG);
> }
>
> so that it still takes place on all CPUs.
This would only really work for boot cpu and setup_force_cap(), because
no CPU is going to have X86_BUG_NULL_SEG set by default, but this still
misses the point of the bugfix which is "check_null_seg_clears_base()
must not be called when cpu_has_hypervisor".
In practice, the BSP is good enough. The behaviour predates the K8,
which was the first CPU where it became observable without
SMM/PUSHALL/etc, and quite possibly goes back to the dawn of time, and
you can't mix a Zen1 and Zen2 in a 2-socket system.
>
> I.e., you should split the detection.
>
> I hope I'm making sense ...
>
> Ah, btw, that @c parameter to detect_null_seg_behavior() is unused - you
> should remove it in a pre-patch.
It is made unused by this patch, so can't be pulled out earlier, but
should be adjusted.
~Andrew
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
2021-10-18 20:06 ` Andrew Cooper
@ 2021-10-18 21:46 ` Borislav Petkov
0 siblings, 0 replies; 11+ messages in thread
From: Borislav Petkov @ 2021-10-18 21:46 UTC (permalink / raw)
To: Andrew Cooper
Cc: Jane Malalane, LKML, x86, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, Pu Wen, Paolo Bonzini, Sean Christopherson,
Peter Zijlstra, Yazen Ghannam, Brijesh Singh, Huang Rui,
Andy Lutomirski, Kim Phillips, stable
On Mon, Oct 18, 2021 at 09:06:07PM +0100, Andrew Cooper wrote:
> ... this is 0x18 for Hygon, and ...
Sure, whatever :)
> >
> > /* All the remaining ones are affected */
> > set_cpu_bug(c, X86_BUG_NULL_SEG);
>
> ... hypervisor && !ncsb still needs to set BUG_NULL_SEG, so you really
> can't exit early.
Yeah, we had a session on IRC, we came up with this rough version, more
polishing tomorrow:
static void early_probe_null_seg_clearing_base(struct cpuinfo_x86 *c)
{
/* Zen3 CPUs advertise Null Selector Clears Base in CPUID. */
if (c->extended_cpuid_level >= 0x80000021 && cpuid_eax(0x80000021) & BIT(6))
return;
/*
* CPUID bit above wasn't set. If this kernel is still running as a HV guest,
* then the HV has decided not to advertize that CPUID bit for whatever reason.
* For example, one member of the migration pool might be vulnerable.
* Which means, the bug is present: set the BUG flag and return.
*/
if (cpu_has(c, X86_FEATURE_HYPERVISOR)) {
set_cpu_bug(c, X86_BUG_NULL_SEG);
return;
}
/* Zen2 CPUs also have this behaviour, but no CPUID bit. 0x18 for Hygon. */
if ((c->x86 == 0x17 || c->x86 == 0x18) && check_null_seg_clears_base(c))
return;
/* All the remaining ones are affected */
set_cpu_bug(c, X86_BUG_NULL_SEG);
}
So I really want to have those comments explaining each step in the
complex check because we will forget why this crazy dance is being done
and as I said in a previous thread, we're not all virtualizers. :)
> No other CPU vendors are known to have this issue.
How do you know? Or should there be a comment along the lines of "Cooper
says that..."
:-)
> (And by "issue", even this is complicated. Back in the 32bit
> days, it was a plausible perf improvement, but it backfired massively
> for AMD64 where there was a possibility/expectation to use NULL
> segments.)
>
> Andy only put the check in unilaterally just in case, and even that was
> fine-ish until AMD went and fixed it silently in Zen2.
Yeah, there's the context switch overhead too but that's for another
thread.
> > Because if this null seg behavior detection should happen on all
> > CPUs - and I think it should, because, well, it has been that way
> > until now - then the vendor specific identification minus what
> > detect_null_seg_behavior() does should run first and then after
> > ->c_identify() is done, you should do something like:
> >
> > if (!cpu_has_bug(c, X86_BUG_NULL_SEG)) { if
> > (!check_null_seg_clears_base(c)) set_cpu_bug(c, X86_BUG_NULL_SEG);
> > }
> >
> > so that it still takes place on all CPUs.
>
> This would only really work for boot cpu and setup_force_cap(),
> because no CPU is going to have X86_BUG_NULL_SEG set by
> default, but this still misses the point of the bugfix which
> is "check_null_seg_clears_base() must not be called when
> cpu_has_hypervisor".
>
> In practice, the BSP is good enough. The behaviour predates the
> K8, which was the first CPU where it became observable without
> SMM/PUSHALL/etc, and quite possibly goes back to the dawn of time, and
> you can't mix a Zen1 and Zen2 in a 2-socket system.
Oh, I didn't express myself properly "should happen on all CPUs" was
supposed to mean, if this detection should happen on all vendors like it
does now. Not BSP vs AP.
> It is made unused by this patch, so can't be pulled out earlier, but
> should be adjusted.
Right.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 11+ messages in thread