* [PATCH] x86/cpu: Avoid writing MSR_IA32_TSX_CTRL when writing it is not supported
@ 2022-09-06 20:17 Hans de Goede
  2022-09-06 20:43 ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Hans de Goede @ 2022-09-06 20:17 UTC (permalink / raw)
  To: Rafael J . Wysocki, Pavel Machek, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H . Peter Anvin
  Cc: Hans de Goede, x86, linux-kernel, Dave Hansen

On an Intel Atom N2600 (and presumably other Cedar Trail models)
MSR_IA32_TSX_CTRL can be read, causing saved_msr.valid to be set for it
by msr_build_context().

This causes restore_processor_state() to try and restore it, but writing
this MSR is not allowed on the Intel Atom N2600 leading to:

[   99.955141] unchecked MSR access error: WRMSR to 0x122 (tried to write 0x0000000000000002) at rIP: 0xffffffff8b07a574 (native_write_msr+0x4/0x20)
[   99.955176] Call Trace:
[   99.955186]  <TASK>
[   99.955195]  restore_processor_state+0x275/0x2c0
[   99.955246]  x86_acpi_suspend_lowlevel+0x10e/0x140
[   99.955273]  acpi_suspend_enter+0xd3/0x100
[   99.955297]  suspend_devices_and_enter+0x7e2/0x830
[   99.955341]  pm_suspend.cold+0x2d2/0x35e
[   99.955368]  state_store+0x68/0xd0
[   99.955402]  kernfs_fop_write_iter+0x15e/0x210
[   99.955442]  vfs_write+0x225/0x4b0
[   99.955523]  ksys_write+0x59/0xd0
[   99.955557]  do_syscall_64+0x58/0x80
[   99.955579]  ? do_syscall_64+0x67/0x80
[   99.955600]  ? up_read+0x17/0x20
[   99.955631]  ? lock_is_held_type+0xe3/0x140
[   99.955670]  ? asm_exc_page_fault+0x22/0x30
[   99.955688]  ? lockdep_hardirqs_on+0x7d/0x100
[   99.955710]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   99.955723] RIP: 0033:0x7f7d0fb018f7
[   99.955741] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[   99.955753] RSP: 002b:00007ffd03292ee8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   99.955771] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f7d0fb018f7
[   99.955781] RDX: 0000000000000004 RSI: 00007ffd03292fd0 RDI: 0000000000000004
[   99.955790] RBP: 00007ffd03292fd0 R08: 000000000000c0fe R09: 0000000000000000
[   99.955799] R10: 00007f7d0fb85fb0 R11: 0000000000000246 R12: 0000000000000004
[   99.955808] R13: 000055df564173e0 R14: 0000000000000004 R15: 00007f7d0fbf49e0
[   99.955910]  </TASK>

Extend the valid check in msr_build_context() to also do a test write of
the read value to avoid marking MSR-s which may not be written as valid.

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 arch/x86/power/cpu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index bb176c72891c..94b41bfd0769 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -433,10 +433,11 @@ static int msr_build_context(const u32 *msr_id, const int num)
 	}
 
 	for (i = saved_msrs->num, j = 0; i < total_num; i++, j++) {
-		u64 dummy;
+		u64 value;
 
 		msr_array[i].info.msr_no	= msr_id[j];
-		msr_array[i].valid		= !rdmsrl_safe(msr_id[j], &dummy);
+		msr_array[i].valid		= !rdmsrl_safe(msr_id[j], &value) &&
+						  !wrmsrl_safe(msr_id[j], value);
 		msr_array[i].info.reg.q		= 0;
 	}
 	saved_msrs->num   = total_num;
-- 
2.37.2
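
The difference between the old read-only check and the read-then-write-back
check in the patch above can be sketched in userspace. This is a minimal
model, not kernel code: struct fake_msr and the fake_*_safe() helpers are
hypothetical stand-ins for real MSR state and for rdmsrl_safe() /
wrmsrl_safe(), which return 0 on success and non-zero on a fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one MSR, for illustration only. */
struct fake_msr {
	uint32_t index;
	uint64_t value;
	bool readable;
	bool writable;
};

/* Stand-in for rdmsrl_safe(): 0 on success, -1 if the read would fault. */
static int fake_rdmsr_safe(const struct fake_msr *m, uint64_t *val)
{
	assert(m != NULL && val != NULL);
	if (!m->readable)
		return -1;
	*val = m->value;
	return 0;
}

/* Stand-in for wrmsrl_safe(): 0 on success, -1 if the write would fault. */
static int fake_wrmsr_safe(struct fake_msr *m, uint64_t val)
{
	assert(m != NULL);
	if (!m->writable)
		return -1;
	m->value = val;
	return 0;
}

/* Old check: an MSR that can be read is marked valid. */
static bool valid_read_only(struct fake_msr *m)
{
	uint64_t dummy;

	return !fake_rdmsr_safe(m, &dummy);
}

/* Patched check: the value read must also be writable back unchanged. */
static bool valid_read_write(struct fake_msr *m)
{
	uint64_t value;

	return !fake_rdmsr_safe(m, &value) && !fake_wrmsr_safe(m, value);
}
```

With an entry modelled on the report (index 0x122, value 0x2, readable but
not writable), valid_read_only() accepts the MSR while valid_read_write()
rejects it, so the restore path would never attempt the faulting write.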



* Re: [PATCH] x86/cpu: Avoid writing MSR_IA32_TSX_CTRL when writing it is not supported
  2022-09-06 20:17 [PATCH] x86/cpu: Avoid writing MSR_IA32_TSX_CTRL when writing it is not supported Hans de Goede
@ 2022-09-06 20:43 ` Peter Zijlstra
  2022-09-06 20:56   ` Hans de Goede
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2022-09-06 20:43 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Rafael J . Wysocki, Pavel Machek, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H . Peter Anvin, x86, linux-kernel,
	Dave Hansen

On Tue, Sep 06, 2022 at 10:17:43PM +0200, Hans de Goede wrote:
> On an Intel Atom N2600 (and presumably other Cedar Trail models)
> MSR_IA32_TSX_CTRL can be read, causing saved_msr.valid to be set for it
> by msr_build_context().
> 
> This causes restore_processor_state() to try and restore it, but writing
> this MSR is not allowed on the Intel Atom N2600 leading to:

FWIW, virt tends to do this same thing a lot. They'll allow reading
random MSRs and only fail on write.


* Re: [PATCH] x86/cpu: Avoid writing MSR_IA32_TSX_CTRL when writing it is not supported
  2022-09-06 20:43 ` Peter Zijlstra
@ 2022-09-06 20:56   ` Hans de Goede
  2022-09-06 21:00     ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Hans de Goede @ 2022-09-06 20:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Rafael J . Wysocki, Pavel Machek, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H . Peter Anvin, x86, linux-kernel,
	Dave Hansen

Hi,

On 9/6/22 22:43, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 10:17:43PM +0200, Hans de Goede wrote:
>> On an Intel Atom N2600 (and presumably other Cedar Trail models)
>> MSR_IA32_TSX_CTRL can be read, causing saved_msr.valid to be set for it
>> by msr_build_context().
>>
>> This causes restore_processor_state() to try and restore it, but writing
>> this MSR is not allowed on the Intel Atom N2600 leading to:
> 
> FWIW, virt tends to do this same thing a lot. They'll allow reading
> random MSRs and only fail on write.

Right. So I guess I should send a v2 with an updated commit
message mentioning this?

Regards,

Hans



* Re: [PATCH] x86/cpu: Avoid writing MSR_IA32_TSX_CTRL when writing it is not supported
  2022-09-06 20:56   ` Hans de Goede
@ 2022-09-06 21:00     ` Peter Zijlstra
  2022-09-06 23:00       ` Andrew Cooper
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2022-09-06 21:00 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Rafael J . Wysocki, Pavel Machek, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H . Peter Anvin, x86, linux-kernel,
	Dave Hansen

On Tue, Sep 06, 2022 at 10:56:47PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 9/6/22 22:43, Peter Zijlstra wrote:
> > On Tue, Sep 06, 2022 at 10:17:43PM +0200, Hans de Goede wrote:
> >> On an Intel Atom N2600 (and presumably other Cedar Trail models)
> >> MSR_IA32_TSX_CTRL can be read, causing saved_msr.valid to be set for it
> >> by msr_build_context().
> >>
> >> This causes restore_processor_state() to try and restore it, but writing
> >> this MSR is not allowed on the Intel Atom N2600 leading to:
> > 
> > FWIW, virt tends to do this same thing a lot. They'll allow reading
> > random MSRs and only fail on write.
> 
> Right. So I guess I should send a v2 with an updated commit
> message mentioning this ?

Nah, just saying this is a somewhat common pattern with MSRs.

The best ones are the ones where writing the value read is invalid :/ or
those that also silently eat a 0 write just for giggles. Luckily that
doesn't happen often.


* Re: [PATCH] x86/cpu: Avoid writing MSR_IA32_TSX_CTRL when writing it is not supported
  2022-09-06 21:00     ` Peter Zijlstra
@ 2022-09-06 23:00       ` Andrew Cooper
  2022-09-07  7:32         ` Hans de Goede
  2022-09-08  1:03         ` Pawan Gupta
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Cooper @ 2022-09-06 23:00 UTC (permalink / raw)
  To: Peter Zijlstra, Hans de Goede
  Cc: Rafael J . Wysocki, Pavel Machek, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H . Peter Anvin, x86, linux-kernel,
	Dave Hansen, Andrew Cooper

On 06/09/2022 22:00, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 10:56:47PM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 9/6/22 22:43, Peter Zijlstra wrote:
>>> On Tue, Sep 06, 2022 at 10:17:43PM +0200, Hans de Goede wrote:
>>>> On an Intel Atom N2600 (and presumably other Cedar Trail models)
>>>> MSR_IA32_TSX_CTRL can be read, causing saved_msr.valid to be set for it
>>>> by msr_build_context().
>>>>
>>>> This causes restore_processor_state() to try and restore it, but writing
>>>> this MSR is not allowed on the Intel Atom N2600 leading to:
>>> FWIW, virt tends to do this same thing a lot. They'll allow reading
>>> random MSRs and only fail on write.
>> Right. So I guess I should send a v2 with an updated commit
>> message mentioning this ?
> Nah, just saying this is a somewhat common pattern with MSRs.
>
> The best ones are the ones where writing the value read is invalid :/ or
> those that also silently eat a 0 write just for giggles. Luckily that
> doesn't happen often.

Several comments.  First of all, MSR_TSX_CTRL is a fully read/write
MSR.  If virt is doing this wrong, fix the hypervisor.  But this doesn't
look virt related?

More importantly, MSR_TSX_CTRL does not plausibly exist on an Atom
N2600, as it is more than a decade old.

MSR_TSX_CTRL was retrofitted in microcode to the MDS_NO, TAA-vulnerable
CPUs, a very narrow range from about one quarter of 2019 which
includes Cascade Lake, and was then included architecturally on subsequent
parts which support TSX.

pm_save_spec_msr() is totally broken.  It's poking MSRs blindly without
checking the enumeration of the capability first.

In this case, I bet the N2600 has a model specific MSR living at index
0x122 which has absolutely nothing at all to do with TSX.

~Andrew
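
A minimal sketch of what "checking the enumeration first" could look like
for this one MSR, written as a pure function so it can run in userspace.
It assumes the usual two-step enumeration: CPUID.(EAX=7,ECX=0):EDX bit 29
says IA32_ARCH_CAPABILITIES exists, and bit 7 of that MSR says
IA32_TSX_CTRL exists. struct cpu_enum and tsx_ctrl_enumerated() are
hypothetical names for illustration, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EDX[29]: IA32_ARCH_CAPABILITIES is present. */
#define CPUID_7_0_EDX_ARCH_CAPS	(1u << 29)
/* IA32_ARCH_CAPABILITIES[7]: IA32_TSX_CTRL is present. */
#define ARCH_CAP_TSX_CTRL_MSR	(1ull << 7)

/* Hypothetical snapshot of what the CPU enumerated at boot. */
struct cpu_enum {
	uint32_t cpuid_7_0_edx;
	uint64_t arch_caps;	/* only meaningful if the CPUID bit is set */
};

/*
 * Decide from enumeration alone whether index 0x122 is IA32_TSX_CTRL.
 * No MSR at 0x122 is read or written to reach this answer.
 */
static bool tsx_ctrl_enumerated(const struct cpu_enum *c)
{
	assert(c != NULL);

	if (!(c->cpuid_7_0_edx & CPUID_7_0_EDX_ARCH_CAPS))
		return false;

	return (c->arch_caps & ARCH_CAP_TSX_CTRL_MSR) != 0;
}
```

On a part that does not enumerate IA32_ARCH_CAPABILITIES at all, this
returns false regardless of what happens to live at index 0x122, which is
the property a blind read probe cannot provide.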


* Re: [PATCH] x86/cpu: Avoid writing MSR_IA32_TSX_CTRL when writing it is not supported
  2022-09-06 23:00       ` Andrew Cooper
@ 2022-09-07  7:32         ` Hans de Goede
  2022-09-08 13:34           ` Andrew Cooper
  2022-09-08  1:03         ` Pawan Gupta
  1 sibling, 1 reply; 9+ messages in thread
From: Hans de Goede @ 2022-09-07  7:32 UTC (permalink / raw)
  To: Andrew Cooper, Peter Zijlstra
  Cc: Rafael J . Wysocki, Pavel Machek, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H . Peter Anvin, x86, linux-kernel,
	Dave Hansen

[-- Attachment #1: Type: text/plain, Size: 2196 bytes --]

Hi,

On 9/7/22 01:00, Andrew Cooper wrote:
> On 06/09/2022 22:00, Peter Zijlstra wrote:
>> On Tue, Sep 06, 2022 at 10:56:47PM +0200, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 9/6/22 22:43, Peter Zijlstra wrote:
>>>> On Tue, Sep 06, 2022 at 10:17:43PM +0200, Hans de Goede wrote:
>>>>> On an Intel Atom N2600 (and presumably other Cedar Trail models)
>>>>> MSR_IA32_TSX_CTRL can be read, causing saved_msr.valid to be set for it
>>>>> by msr_build_context().
>>>>>
>>>>> This causes restore_processor_state() to try and restore it, but writing
>>>>> this MSR is not allowed on the Intel Atom N2600 leading to:
>>>> FWIW, virt tends to do this same thing a lot. They'll allow reading
>>>> random MSRs and only fail on write.
>>> Right. So I guess I should send a v2 with an updated commit
>>> message mentioning this ?
>> Nah, just saying this is a somewhat common pattern with MSRs.
>>
>> The best ones are the ones where writing the value read is invalid :/ or
>> those that also silently eat a 0 write just for giggles. Luckily that
>> doesn't happen often.
> 
> Several comments.  First of all, MSR_TSX_CTRL is a fully read/write
> MSR.  If virt is doing this wrong, fix the hypervisor.  But this doesn't
> look virt related?
> 
> More importantly, MSR_TSX_CTRL does not plausibly exist on an Atom
> N2600, as it is more than a decade old.
> 
> MSR_TSX_CTRL was retrofitted in microcode to the MDS_NO, TAA-vulnerable
> CPUs which is a very narrow range from about 1 quarter of 2019 which
> includes Cascade Lake, and then included architecturally on subsequent
> parts which support TSX.
> 
> pm_save_spec_msr() is totally broken.  It's poking MSRs blindly without
> checking the enumeration of the capability first.

Note I did do a different version of this patch before this one, which did
add a capability check, but I only sent that to various x86 folks +
x86@kernel.org, which as Peter pointed out is an alias, not a list,
so you will not have seen that earlier version.

I have attached the earlier version to this email.

> In this case, I bet the N2600 has a model specific MSR living at index
> 0x122 which has absolutely nothing at all to do with TSX.

That is my guess too.

Regards,

Hans

[-- Attachment #2: 0001-x86-cpu-Avoid-writing-MSR_IA32_TSX_CTRL-when-tsx_ctr.patch --]
[-- Type: text/x-patch, Size: 4991 bytes --]

From 51bac2c734e0f2fd7e2acb406afd8a201ddf3400 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede@redhat.com>
Date: Tue, 6 Sep 2022 18:00:47 +0200
Subject: [PATCH] x86/cpu: Avoid writing MSR_IA32_TSX_CTRL when
 !tsx_ctrl_is_supported()

On an Intel Atom N2600 (and presumably other Cedar Trail models)
MSR_IA32_TSX_CTRL can be read, causing saved_msr.valid to be set for it
by msr_build_context().

This causes restore_processor_state() to try and restore it, but writing
this MSR is not allowed on the Intel Atom N2600 leading to:

[   99.955141] unchecked MSR access error: WRMSR to 0x122 (tried to write 0x0000000000000002) at rIP: 0xffffffff8b07a574 (native_write_msr+0x4/0x20)
[   99.955176] Call Trace:
[   99.955186]  <TASK>
[   99.955195]  restore_processor_state+0x275/0x2c0
[   99.955246]  x86_acpi_suspend_lowlevel+0x10e/0x140
[   99.955273]  acpi_suspend_enter+0xd3/0x100
[   99.955297]  suspend_devices_and_enter+0x7e2/0x830
[   99.955341]  pm_suspend.cold+0x2d2/0x35e
[   99.955368]  state_store+0x68/0xd0
[   99.955402]  kernfs_fop_write_iter+0x15e/0x210
[   99.955442]  vfs_write+0x225/0x4b0
[   99.955523]  ksys_write+0x59/0xd0
[   99.955557]  do_syscall_64+0x58/0x80
[   99.955579]  ? do_syscall_64+0x67/0x80
[   99.955600]  ? up_read+0x17/0x20
[   99.955631]  ? lock_is_held_type+0xe3/0x140
[   99.955670]  ? asm_exc_page_fault+0x22/0x30
[   99.955688]  ? lockdep_hardirqs_on+0x7d/0x100
[   99.955710]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   99.955723] RIP: 0033:0x7f7d0fb018f7
[   99.955741] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[   99.955753] RSP: 002b:00007ffd03292ee8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   99.955771] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f7d0fb018f7
[   99.955781] RDX: 0000000000000004 RSI: 00007ffd03292fd0 RDI: 0000000000000004
[   99.955790] RBP: 00007ffd03292fd0 R08: 000000000000c0fe R09: 0000000000000000
[   99.955799] R10: 00007f7d0fb85fb0 R11: 0000000000000246 R12: 0000000000000004
[   99.955808] R13: 000055df564173e0 R14: 0000000000000004 R15: 00007f7d0fbf49e0
[   99.955910]  </TASK>

Make tsx_ctrl_is_supported() from kernel/cpu/tsx.c non-static and only
pass MSR_IA32_TSX_CTRL to msr_build_context() when that returns true.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
Important note for reviewers: In its current form this patch changes the order in
which MSR-s are restored, it used to be:

        MSR_IA32_SPEC_CTRL,
        MSR_IA32_TSX_CTRL,
        MSR_TSX_FORCE_ABORT,
        MSR_IA32_MCU_OPT_CTRL,
        MSR_AMD64_LS_CFG,

Which is now changed to:

        MSR_IA32_SPEC_CTRL,
        MSR_TSX_FORCE_ABORT,
        MSR_IA32_MCU_OPT_CTRL,
        MSR_AMD64_LS_CFG,
        MSR_IA32_TSX_CTRL,

I am not sure if this may have an impact on the various CPU
vulnerability mitigations, please review carefully.
---
 arch/x86/include/asm/cpu.h | 6 ++++++
 arch/x86/kernel/cpu/tsx.c  | 2 +-
 arch/x86/power/cpu.c       | 4 +++-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index 8cbf623f0ecf..9047701d1966 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -49,6 +49,7 @@ extern bool handle_user_split_lock(struct pt_regs *regs, long error_code);
 extern bool handle_guest_split_lock(unsigned long ip);
 extern void handle_bus_lock(struct pt_regs *regs);
 u8 get_this_hybrid_cpu_type(void);
+bool tsx_ctrl_is_supported(void);
 #else
 static inline void __init sld_setup(struct cpuinfo_x86 *c) {}
 static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code)
@@ -67,6 +68,11 @@ static inline u8 get_this_hybrid_cpu_type(void)
 {
 	return 0;
 }
+
+static inline bool tsx_ctrl_is_supported(void)
+{
+	return false;
+}
 #endif
 #ifdef CONFIG_IA32_FEAT_CTL
 void init_ia32_feat_ctl(struct cpuinfo_x86 *c);
diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
index ec7bbac3a9f2..be7e8d4cc0fc 100644
--- a/arch/x86/kernel/cpu/tsx.c
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -58,7 +58,7 @@ static void tsx_enable(void)
 	wrmsrl(MSR_IA32_TSX_CTRL, tsx);
 }
 
-static bool tsx_ctrl_is_supported(void)
+bool tsx_ctrl_is_supported(void)
 {
 	u64 ia32_cap = x86_read_arch_cap_msr();
 
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index bb176c72891c..9c95099d1add 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -515,13 +515,15 @@ static void pm_save_spec_msr(void)
 {
 	u32 spec_msr_id[] = {
 		MSR_IA32_SPEC_CTRL,
-		MSR_IA32_TSX_CTRL,
 		MSR_TSX_FORCE_ABORT,
 		MSR_IA32_MCU_OPT_CTRL,
 		MSR_AMD64_LS_CFG,
 	};
+	u32 tsx_ctrl_msr_id[] = { MSR_IA32_TSX_CTRL };
 
 	msr_build_context(spec_msr_id, ARRAY_SIZE(spec_msr_id));
+	if (tsx_ctrl_is_supported())
+		msr_build_context(tsx_ctrl_msr_id, ARRAY_SIZE(tsx_ctrl_msr_id));
 }
 
 static int pm_check_save_msr(void)
-- 
2.37.2



* Re: [PATCH] x86/cpu: Avoid writing MSR_IA32_TSX_CTRL when writing it is not supported
  2022-09-06 23:00       ` Andrew Cooper
  2022-09-07  7:32         ` Hans de Goede
@ 2022-09-08  1:03         ` Pawan Gupta
  2022-09-08 13:46           ` Andrew Cooper
  1 sibling, 1 reply; 9+ messages in thread
From: Pawan Gupta @ 2022-09-08  1:03 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Peter Zijlstra, Hans de Goede, Rafael J . Wysocki, Pavel Machek,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H . Peter Anvin, x86, linux-kernel, Dave Hansen

On Tue, Sep 06, 2022 at 11:00:08PM +0000, Andrew Cooper wrote:
> On 06/09/2022 22:00, Peter Zijlstra wrote:
> > On Tue, Sep 06, 2022 at 10:56:47PM +0200, Hans de Goede wrote:
> >> Hi,
> >>
> >> On 9/6/22 22:43, Peter Zijlstra wrote:
> >>> On Tue, Sep 06, 2022 at 10:17:43PM +0200, Hans de Goede wrote:
> >>>> On an Intel Atom N2600 (and presumably other Cedar Trail models)
> >>>> MSR_IA32_TSX_CTRL can be read, causing saved_msr.valid to be set for it
> >>>> by msr_build_context().
> >>>>
> >>>> This causes restore_processor_state() to try and restore it, but writing
> >>>> this MSR is not allowed on the Intel Atom N2600 leading to:
> >>> FWIW, virt tends to do this same thing a lot. They'll allow reading
> >>> random MSRs and only fail on write.
> >> Right. So I guess I should send a v2 with an updated commit
> >> message mentioning this ?
> > Nah, just saying this is a somewhat common pattern with MSRs.
> >
> > The best ones are the ones where writing the value read is invalid :/ or
> > those that also silently eat a 0 write just for giggles. Luckily that
> > doesn't happen often.
> 
> Several comments.  First of all, MSR_TSX_CTRL is a fully read/write
> MSR.  If virt is doing this wrong, fix the hypervisor.  But this doesn't
> look virt related?
> 
> More importantly, MSR_TSX_CTRL does not plausibly exist on an Atom
> N2600, as it is more than a decade old.
> 
> MSR_TSX_CTRL was retrofitted in microcode to the MDS_NO, TAA-vulnerable
> CPUs which is a very narrow range from about 1 quarter of 2019 which
> includes Cascade Lake, and then included architecturally on subsequent
> parts which support TSX.
> 
> pm_save_spec_msr() is totally broken.  It's poking MSRs blindly without
> checking the enumeration of the capability first.

pm_save_spec_msr() relies on the valid-MSR check in msr_build_context(), but
obviously it is not working in this particular case.

Does adding the enumeration check as below look okay?

(I am not sure if I got the enumeration right for MSR_AMD64_LS_CFG).

---
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index 8cbf623f0ecf..a750c1a1964b 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -76,6 +76,8 @@ static inline void init_ia32_feat_ctl(struct cpuinfo_x86 *c) {}
 
 extern __noendbr void cet_disable(void);
 
+extern bool spec_msr_valid(u32 msr_id);
+
 struct ucode_cpu_info;
 
 int intel_cpu_collect_info(struct ucode_cpu_info *uci);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3e508f239098..7430a36fd7ae 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1278,6 +1278,26 @@ static bool __init cpu_matches(const struct x86_cpu_id *table, unsigned long whi
 	return m && !!(m->driver_data & which);
 }
 
+bool spec_msr_valid(u32 msr_id)
+{
+	u64 ia32_cap = x86_read_arch_cap_msr();
+
+	switch (msr_id) {
+	case MSR_IA32_SPEC_CTRL:
+		return boot_cpu_has(X86_FEATURE_MSR_SPEC_CTRL);
+	case MSR_IA32_TSX_CTRL:
+		return !!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR);
+	case MSR_TSX_FORCE_ABORT:
+		return boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT);
+	case MSR_IA32_MCU_OPT_CTRL:
+		return boot_cpu_has(X86_FEATURE_SRBDS_CTRL);
+	case MSR_AMD64_LS_CFG:
+		return boot_cpu_has(X86_FEATURE_LS_CFG_SSBD);
+	}
+
+	return false;
+}
+
 u64 x86_read_arch_cap_msr(void)
 {
 	u64 ia32_cap = 0;
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index bb176c72891c..8db73f7982c7 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -520,8 +520,12 @@ static void pm_save_spec_msr(void)
 		MSR_IA32_MCU_OPT_CTRL,
 		MSR_AMD64_LS_CFG,
 	};
+	int i;
 
-	msr_build_context(spec_msr_id, ARRAY_SIZE(spec_msr_id));
+	for (i = 0; i < ARRAY_SIZE(spec_msr_id); i++) {
+		if (spec_msr_valid(spec_msr_id[i]))
+			msr_build_context(&spec_msr_id[i], 1);
+	}
 }
 
 static int pm_check_save_msr(void)


* Re: [PATCH] x86/cpu: Avoid writing MSR_IA32_TSX_CTRL when writing it is not supported
  2022-09-07  7:32         ` Hans de Goede
@ 2022-09-08 13:34           ` Andrew Cooper
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2022-09-08 13:34 UTC (permalink / raw)
  To: Hans de Goede, Peter Zijlstra
  Cc: Rafael J . Wysocki, Pavel Machek, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H . Peter Anvin, x86, linux-kernel,
	Dave Hansen

On 07/09/2022 08:32, Hans de Goede wrote:
> Hi,
>
> On 9/7/22 01:00, Andrew Cooper wrote:
>> On 06/09/2022 22:00, Peter Zijlstra wrote:
>>> On Tue, Sep 06, 2022 at 10:56:47PM +0200, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 9/6/22 22:43, Peter Zijlstra wrote:
>>>>> On Tue, Sep 06, 2022 at 10:17:43PM +0200, Hans de Goede wrote:
>>>>>> On an Intel Atom N2600 (and presumably other Cedar Trail models)
>>>>>> MSR_IA32_TSX_CTRL can be read, causing saved_msr.valid to be set for it
>>>>>> by msr_build_context().
>>>>>>
>>>>>> This causes restore_processor_state() to try and restore it, but writing
>>>>>> this MSR is not allowed on the Intel Atom N2600 leading to:
>>>>> FWIW, virt tends to do this same thing a lot. They'll allow reading
>>>>> random MSRs and only fail on write.
>>>> Right. So I guess I should send a v2 with an updated commit
>>>> message mentioning this ?
>>> Nah, just saying this is a somewhat common pattern with MSRs.
>>>
>>> The best ones are the ones where writing the value read is invalid :/ or
>>> those that also silently eat a 0 write just for giggles. Luckily that
>>> doesn't happen often.
>> Several comments.  First of all, MSR_TSX_CTRL is a fully read/write
>> MSR.  If virt is doing this wrong, fix the hypervisor.  But this doesn't
>> look virt related?
>>
>> More importantly, MSR_TSX_CTRL does not plausibly exist on an Atom
>> N2600, as it is more than a decade old.
>>
>> MSR_TSX_CTRL was retrofitted in microcode to the MDS_NO, TAA-vulnerable
>> CPUs which is a very narrow range from about 1 quarter of 2019 which
>> includes Cascade Lake, and then included architecturally on subsequent
>> parts which support TSX.
>>
>> pm_save_spec_msr() is totally broken.  It's poking MSRs blindly without
>> checking the enumeration of the capability first.
> Note I did do a different version of this patch before this one, which did
> add a capability check, but I only sent that to various x86 folks +
> x86@kernel.org, which as Peter pointed out is an alias, not a list,
> so you will not have seen that earlier version.
>
> I have attached the earlier version to this email.

In answer to your question in the patch, no the order doesn't matter,
despite the overlapping interactions between TSX_CTRL and MCU_OPT_CTRL.

~Andrew


* Re: [PATCH] x86/cpu: Avoid writing MSR_IA32_TSX_CTRL when writing it is not supported
  2022-09-08  1:03         ` Pawan Gupta
@ 2022-09-08 13:46           ` Andrew Cooper
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2022-09-08 13:46 UTC (permalink / raw)
  To: Pawan Gupta
  Cc: Peter Zijlstra, Hans de Goede, Rafael J . Wysocki, Pavel Machek,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H . Peter Anvin, x86, linux-kernel, Dave Hansen

On 08/09/2022 02:03, Pawan Gupta wrote:
> On Tue, Sep 06, 2022 at 11:00:08PM +0000, Andrew Cooper wrote:
>> On 06/09/2022 22:00, Peter Zijlstra wrote:
>>> On Tue, Sep 06, 2022 at 10:56:47PM +0200, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 9/6/22 22:43, Peter Zijlstra wrote:
>>>>> On Tue, Sep 06, 2022 at 10:17:43PM +0200, Hans de Goede wrote:
>>>>>> On an Intel Atom N2600 (and presumably other Cedar Trail models)
>>>>>> MSR_IA32_TSX_CTRL can be read, causing saved_msr.valid to be set for it
>>>>>> by msr_build_context().
>>>>>>
>>>>>> This causes restore_processor_state() to try and restore it, but writing
>>>>>> this MSR is not allowed on the Intel Atom N2600 leading to:
>>>>> FWIW, virt tends to do this same thing a lot. They'll allow reading
>>>>> random MSRs and only fail on write.
>>>> Right. So I guess I should send a v2 with an updated commit
>>>> message mentioning this ?
>>> Nah, just saying this is a somewhat common pattern with MSRs.
>>>
>>> The best ones are the ones where writing the value read is invalid :/ or
>>> those that also silently eat a 0 write just for giggles. Luckily that
>>> doesn't happen often.
>> Several comments.  First of all, MSR_TSX_CTRL is a fully read/write
>> MSR.  If virt is doing this wrong, fix the hypervisor.  But this doesn't
>> look virt related?
>>
>> More importantly, MSR_TSX_CTRL does not plausibly exist on an Atom
>> N2600, as it is more than a decade old.
>>
>> MSR_TSX_CTRL was retrofitted in microcode to the MDS_NO, TAA-vulnerable
>> CPUs which is a very narrow range from about 1 quarter of 2019 which
>> includes Cascade Lake, and then included architecturally on subsequent
>> parts which support TSX.
>>
>> pm_save_spec_msr() is totally broken.  It's poking MSRs blindly without
>> checking the enumeration of the capability first.
> pm_save_spec_msr() relies on the valid-MSR check in msr_build_context(), but
> obviously it is not working in this particular case.
>
> Does adding the enumeration check as below look okay?
>
> (I am not sure if I got the enumeration right for MSR_AMD64_LS_CFG).

family >= 0x10 && family <= 0x18

>
> ---
> diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
> index 8cbf623f0ecf..a750c1a1964b 100644
> --- a/arch/x86/include/asm/cpu.h
> +++ b/arch/x86/include/asm/cpu.h
> @@ -76,6 +76,8 @@ static inline void init_ia32_feat_ctl(struct cpuinfo_x86 *c) {}
>  
>  extern __noendbr void cet_disable(void);
>  
> +extern bool spec_msr_valid(u32 msr_id);
> +
>  struct ucode_cpu_info;
>  
>  int intel_cpu_collect_info(struct ucode_cpu_info *uci);
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 3e508f239098..7430a36fd7ae 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1278,6 +1278,26 @@ static bool __init cpu_matches(const struct x86_cpu_id *table, unsigned long whi
>  	return m && !!(m->driver_data & which);
>  }
>  
> +bool spec_msr_valid(u32 msr_id)
> +{
> +	u64 ia32_cap = x86_read_arch_cap_msr();
> +
> +	switch (msr_id) {
> +	case MSR_IA32_SPEC_CTRL:
> +		return boot_cpu_has(X86_FEATURE_MSR_SPEC_CTRL);
> +	case MSR_IA32_TSX_CTRL:
> +		return !!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR);
> +	case MSR_TSX_FORCE_ABORT:
> +		return boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT);
> +	case MSR_IA32_MCU_OPT_CTRL:
> +		return boot_cpu_has(X86_FEATURE_SRBDS_CTRL);
> +	case MSR_AMD64_LS_CFG:
> +		return boot_cpu_has(X86_FEATURE_LS_CFG_SSBD);
> +	}
> +
> +	return false;
> +}
> +
>  u64 x86_read_arch_cap_msr(void)
>  {
>  	u64 ia32_cap = 0;
> diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
> index bb176c72891c..8db73f7982c7 100644
> --- a/arch/x86/power/cpu.c
> +++ b/arch/x86/power/cpu.c
> @@ -520,8 +520,12 @@ static void pm_save_spec_msr(void)
>  		MSR_IA32_MCU_OPT_CTRL,
>  		MSR_AMD64_LS_CFG,

Checking the enumerations is definitely an improvement, but this wants
to become a tuple list of { msr, flag } so it can't get out of sync.

Except two of the options aren't simple bits.  The contents of
MSR_ARCH_CAPS ought to become feature bits because it's a CPUID feature
leaf in disguise.

AMD LS_CFG is more complicated, because the dispatch serialising bit
needs setting unilaterally (families 0x10, 0x12 thru 0x18), but the SSBD
control ought to resolve on the next context switch.

~Andrew
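
The "tuple list of { msr, flag }" idea might look roughly like the sketch
below. It is a userspace model under stated assumptions: enum spec_feat,
struct spec_msr and collect_spec_msrs() are hypothetical names, and every
entry is assumed to have been reduced to a single boolean feature flag,
which, as noted above, is not yet true for the MSR_ARCH_CAPS bit or for
AMD LS_CFG. The MSR indices are the architectural ones for the five MSRs
in spec_msr_id[].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature identifiers; the kernel would use X86_FEATURE_* bits. */
enum spec_feat {
	FEAT_SPEC_CTRL,
	FEAT_TSX_CTRL,
	FEAT_TSX_FORCE_ABORT,
	FEAT_SRBDS_CTRL,
	FEAT_LS_CFG,
	FEAT_COUNT
};

/* One row per MSR, so the index and its enumeration cannot drift apart. */
struct spec_msr {
	uint32_t msr;
	enum spec_feat feat;
};

static const struct spec_msr spec_msrs[] = {
	{ 0x00000048, FEAT_SPEC_CTRL },		/* MSR_IA32_SPEC_CTRL */
	{ 0x00000122, FEAT_TSX_CTRL },		/* MSR_IA32_TSX_CTRL */
	{ 0x0000010f, FEAT_TSX_FORCE_ABORT },	/* MSR_TSX_FORCE_ABORT */
	{ 0x00000123, FEAT_SRBDS_CTRL },	/* MSR_IA32_MCU_OPT_CTRL */
	{ 0xc0011020, FEAT_LS_CFG },		/* MSR_AMD64_LS_CFG */
};

#define NUM_SPEC_MSRS (sizeof(spec_msrs) / sizeof(spec_msrs[0]))

/*
 * Copy into out[] the indices of the MSRs whose feature is enumerated,
 * preserving table order. Returns the number of entries written.
 */
static size_t collect_spec_msrs(const bool has_feat[FEAT_COUNT],
				uint32_t *out, size_t cap)
{
	size_t i, n = 0;

	assert(has_feat != NULL && out != NULL);

	for (i = 0; i < NUM_SPEC_MSRS && n < cap; i++) {
		if (has_feat[spec_msrs[i].feat])
			out[n++] = spec_msrs[i].msr;
	}

	return n;
}
```

The resulting array could then be handed to msr_build_context() in a
single call, rather than one call per MSR as in the loop quoted above.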

