linux-kernel.vger.kernel.org archive mirror
* FSGSBASE causing panic on 5.9-rc1
@ 2020-08-19 18:07 Tom Lendacky
  2020-08-19 18:19 ` Tom Lendacky
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Lendacky @ 2020-08-19 18:07 UTC (permalink / raw)
  To: Linux Kernel Mailing List, X86 ML
  Cc: Andy Lutomirski, Chang S. Bae, Thomas Gleixner, Sasha Levin,
	Borislav Petkov, Peter Zijlstra, Ingo Molnar

It looks like the FSGSBASE support is crashing my second generation EPYC
system. I was able to bisect it to:

b745cfba44c1 ("x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit")

The panic only happens when using KVM. Doing kernel builds or stress
on bare-metal appears fine. But if I fire up, in this case, a 64-vCPU
guest and do a kernel build within the guest, I get the following:

[  120.360637] BUG: scheduling while atomic: qemu-system-x86/5485/0x00110000
[  124.041646] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: x86_pmu_handle_irq+0x163/0x170
[  124.041647] ------------[ cut here ]------------
[  124.041649] Hardware name: AMD
[  124.041649] Workqueue:  0x0 (events)
[  124.041651] Call Trace:
[  124.041651] ------------[ cut here ]------------
[  124.041652] corrupted preempt_count: kworker/22:1/1449/0x110000
[  124.051267] WARNING: CPU: 22 PID: 1449 at kernel/sched/core.c:3595 finish_task_switch+0x289/0x290
[  124.051268] Modules linked in: tun ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bridge stp llc fuse amd64_edac_mod edac_mce_amd wmi_bmof kvm_amd kvm irqbypass sg ipmi_ssif ccp k10temp acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq squashfs loop sch_fq_codel parport_pc ppdev lp parport ip_tables raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 linear sd_mod t10_pi crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper ast drm_vram_helper drm_ttm_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect ahci sysimgblt libahci fb_sys_fops libata drm e1000e i2c_piix4 wmi i2c_designware_platform i2c_designware_core pinctrl_amd i2c_core
[  124.051285] CPU: 22 PID: 1449 Comm: kworker/22:1 Tainted: G        W         5.9.0-rc1-sos-linux #1
[  124.051286] Hardware name: AMD
[  124.051286] Workqueue:  0x0 (events)
[  124.051287] RIP: 0010:finish_task_switch+0x289/0x290
[  124.051288] Code: ff 65 48 8b 04 25 c0 7b 01 00 8b 90 a8 08 00 00 48 8d b0 b0 0a 00 00 48 c7 c7 20 10 10 86 c6 05 be aa 55 01 01 e8 89 03 fd ff <0f> 0b e9 6b ff ff ff 55 48 89 e5 41 55 41 54 49 89 fc 53 48 89 f3
[  124.051288] RSP: 0018:ffffc9001afe7e10 EFLAGS: 00010082
[  124.051289] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000023
[  124.051290] RDX: 0000000000000023 RSI: ffffffff86101044 RDI: ffff88900d798bb0
[  124.051290] RBP: ffffc9001afe7e38 R08: ffff88900d798ba8 R09: 0000000000000005
[  124.051290] R10: 000000000000000f R11: ffff88900d798d54 R12: ffff88900d7aacc0
[  124.051291] R13: ffff889bd2308000 R14: 0000000000000000 R15: ffff88900d7aacc0
[  124.051291] FS:  0000000000000000(0000) GS:ffff88900d780000(0000) knlGS:0000000000000000
[  124.051292] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  124.051292] CR2: 00007ff607620000 CR3: 0000001bcb0d2000 CR4: 0000000000350ee0
[  124.051293] Call Trace:
[  124.051293]  __schedule+0x348/0x810
[  124.051293]  ? dbs_work_handler+0x47/0x60
[  124.051294]  schedule+0x4a/0xb0
[  124.051294]  worker_thread+0xcf/0x3b0
[  124.051294]  ? process_one_work+0x370/0x370
[  124.051294]  kthread+0xfe/0x140
[  124.051295]  ? kthread_park+0x90/0x90
[  124.051295]  ret_from_fork+0x22/0x30
[  124.051295] ---[ end trace 7f77ee8ad05caa89 ]---
[  124.051296] Kernel Offset: disabled

Specifying nofsgsbase avoids the issue. This is very reproducible, so I
can easily test any fixes.

Thanks,
Tom


* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-19 18:07 FSGSBASE causing panic on 5.9-rc1 Tom Lendacky
@ 2020-08-19 18:19 ` Tom Lendacky
  2020-08-19 21:25   ` Andy Lutomirski
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Lendacky @ 2020-08-19 18:19 UTC (permalink / raw)
  To: Linux Kernel Mailing List, X86 ML
  Cc: Andy Lutomirski, Chang S. Bae, Thomas Gleixner, Sasha Levin,
	Borislav Petkov, Peter Zijlstra, Ingo Molnar

On 8/19/20 1:07 PM, Tom Lendacky wrote:
> It looks like the FSGSBASE support is crashing my second generation EPYC
> system. I was able to bisect it to:
> 
> b745cfba44c1 ("x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit")
> 
> The panic only happens when using KVM. Doing kernel builds or stress
> on bare-metal appears fine. But if I fire up, in this case, a 64-vCPU
> guest and do a kernel build within the guest, I get the following:

I should clarify that this panic is on the bare-metal system, not in the
guest. And that specifying nofsgsbase on the bare-metal command line fixes
the issue.

Thanks,
Tom



* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-19 18:19 ` Tom Lendacky
@ 2020-08-19 21:25   ` Andy Lutomirski
  2020-08-20  0:21     ` Andy Lutomirski
  2020-08-20 13:43     ` Paolo Bonzini
  0 siblings, 2 replies; 27+ messages in thread
From: Andy Lutomirski @ 2020-08-19 21:25 UTC (permalink / raw)
  To: Tom Lendacky, Joerg Roedel, Christopherson, Sean J,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson
  Cc: Linux Kernel Mailing List, X86 ML, Andy Lutomirski, Chang S. Bae,
	Thomas Gleixner, Sasha Levin, Borislav Petkov, Peter Zijlstra,
	Ingo Molnar

On Wed, Aug 19, 2020 at 11:19 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 8/19/20 1:07 PM, Tom Lendacky wrote:
> > It looks like the FSGSBASE support is crashing my second generation EPYC
> > system. I was able to bisect it to:
> >
> > b745cfba44c1 ("x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit")
> >
> > The panic only happens when using KVM. Doing kernel builds or stress
> > on bare-metal appears fine. But if I fire up, in this case, a 64-vCPU
> > guest and do a kernel build within the guest, I get the following:
>
> I should clarify that this panic is on the bare-metal system, not in the
> guest. And that specifying nofsgsbase on the bare-metal command line fixes
> the issue.

I certainly see some oddities:

We have this code:

static void svm_vcpu_put(struct kvm_vcpu *vcpu)
{
        struct vcpu_svm *svm = to_svm(vcpu);
        int i;

        avic_vcpu_put(vcpu);

        ++vcpu->stat.host_state_reload;
        kvm_load_ldt(svm->host.ldt);
#ifdef CONFIG_X86_64
        loadsegment(fs, svm->host.fs);
        wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);
        load_gs_index(svm->host.gs);

Surely that should do load_gs_index() *before* wrmsrl().  But that's
not the problem at hand.

There are also some open-coded rdmsr and wrmsrs of MSR_GS_BASE --
surely these should be x86_gsbase_read_cpu() and
x86_gsbase_write_cpu().  (Those functions don't actually exist, but
the fsbase equivalents do, and we should add them.)  But that's also
not the problem at hand.

I haven't actually spotted the bug yet...

--Andy


* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-19 21:25   ` Andy Lutomirski
@ 2020-08-20  0:21     ` Andy Lutomirski
  2020-08-20 15:10       ` Sean Christopherson
  2020-08-20 13:43     ` Paolo Bonzini
  1 sibling, 1 reply; 27+ messages in thread
From: Andy Lutomirski @ 2020-08-20  0:21 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Tom Lendacky, Joerg Roedel, Christopherson, Sean J,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On Wed, Aug 19, 2020 at 2:25 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> On Wed, Aug 19, 2020 at 11:19 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >
> > On 8/19/20 1:07 PM, Tom Lendacky wrote:
> > > It looks like the FSGSBASE support is crashing my second generation EPYC
> > > system. I was able to bisect it to:
> > >
> > > b745cfba44c1 ("x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit")
> > >
> > > The panic only happens when using KVM. Doing kernel builds or stress
> > > on bare-metal appears fine. But if I fire up, in this case, a 64-vCPU
> > > guest and do a kernel build within the guest, I get the following:
> >
> > I should clarify that this panic is on the bare-metal system, not in the
> > guest. And that specifying nofsgsbase on the bare-metal command line fixes
> > the issue.
>
> I certainly see some oddities:
>
> We have this code:
>
> static void svm_vcpu_put(struct kvm_vcpu *vcpu)
> {
>         struct vcpu_svm *svm = to_svm(vcpu);
>         int i;
>
>         avic_vcpu_put(vcpu);
>
>         ++vcpu->stat.host_state_reload;
>         kvm_load_ldt(svm->host.ldt);
> #ifdef CONFIG_X86_64
>         loadsegment(fs, svm->host.fs);
>         wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);
>         load_gs_index(svm->host.gs);
>
> Surely that should do load_gs_index() *before* wrmsrl().  But that's
> not the problem at hand.
>
> There are also some open-coded rdmsr and wrmsrs of MSR_GS_BASE --
> surely these should be x86_gsbase_read_cpu() and
> x86_gsbase_write_cpu().  (Those functions don't actually exist, but
> the fsbase equivalents do, and we should add them.)  But that's also
> not the problem at hand.

Make that cpu_kernelmode_gs_base(cpu).  Perf win on all CPUs.

But I still don't see the bug.


* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-19 21:25   ` Andy Lutomirski
  2020-08-20  0:21     ` Andy Lutomirski
@ 2020-08-20 13:43     ` Paolo Bonzini
  2020-08-20 17:51       ` Andy Lutomirski
  1 sibling, 1 reply; 27+ messages in thread
From: Paolo Bonzini @ 2020-08-20 13:43 UTC (permalink / raw)
  To: Andy Lutomirski, Tom Lendacky, Joerg Roedel, Christopherson,
	Sean J, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson
  Cc: Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On 19/08/20 23:25, Andy Lutomirski wrote:
>         wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);
>         load_gs_index(svm->host.gs);
> 
> Surely that should do load_gs_index() *before* wrmsrl().  But that's
> not the problem at hand.

The wrmsrl is writing the inactive GS base so the ordering between
load_gs_index and wrmsrl(MSR_KERNEL_GS_BASE) should be irrelevant?

Paolo



* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20  0:21     ` Andy Lutomirski
@ 2020-08-20 15:10       ` Sean Christopherson
  2020-08-20 15:21         ` Tom Lendacky
  0 siblings, 1 reply; 27+ messages in thread
From: Sean Christopherson @ 2020-08-20 15:10 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Tom Lendacky, Joerg Roedel, Paolo Bonzini, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Linux Kernel Mailing List, X86 ML,
	Chang S. Bae, Thomas Gleixner, Sasha Levin, Borislav Petkov,
	Peter Zijlstra, Ingo Molnar

On Wed, Aug 19, 2020 at 05:21:33PM -0700, Andy Lutomirski wrote:
> On Wed, Aug 19, 2020 at 2:25 PM Andy Lutomirski <luto@kernel.org> wrote:
> >
> > On Wed, Aug 19, 2020 at 11:19 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> > >
> > > On 8/19/20 1:07 PM, Tom Lendacky wrote:
> > > > It looks like the FSGSBASE support is crashing my second generation EPYC
> > > > system. I was able to bisect it to:
> > > >
> > > > b745cfba44c1 ("x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit")
> > > >
> > > > The panic only happens when using KVM. Doing kernel builds or stress
> > > > on bare-metal appears fine. But if I fire up, in this case, a 64-vCPU
> > > > guest and do a kernel build within the guest, I get the following:
> > >
> > > I should clarify that this panic is on the bare-metal system, not in the
> > > guest. And that specifying nofsgsbase on the bare-metal command line fixes
> > > the issue.
> >
> > I certainly see some oddities:
> >
> > We have this code:
> >
> > static void svm_vcpu_put(struct kvm_vcpu *vcpu)
> > {
> >         struct vcpu_svm *svm = to_svm(vcpu);
> >         int i;
> >
> >         avic_vcpu_put(vcpu);
> >
> >         ++vcpu->stat.host_state_reload;
> >         kvm_load_ldt(svm->host.ldt);
> > #ifdef CONFIG_X86_64
> >         loadsegment(fs, svm->host.fs);
> >         wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);

Pretty sure current->thread.gsbase can be stale, i.e. this needs:

	current_save_fsgs();
	wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);

On a related topic, we really should consolidate the VMX and SVM code for
these flows, they're both ugly.
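
As a rough illustration of the staleness concern, here is a user-space toy
model in C. It is not kernel code. It assumes that, with FSGSBASE enabled,
userspace can change its GS base without the kernel's cached copy
(current->thread.gsbase) being updated, and that current_save_fsgs()
refreshes that copy from the CPU. Every toy_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for current->thread.gsbase (the kernel's cached copy). */
struct toy_task {
	uint64_t cached_gsbase;
};

/* Hypothetical stand-in for the user GS base as the CPU currently holds it. */
struct toy_hw {
	uint64_t user_gsbase;
};

/* Models userspace executing WRGSBASE: hardware changes, the cache does not. */
static void toy_user_wrgsbase(struct toy_hw *hw, uint64_t val)
{
	hw->user_gsbase = val;
}

/* Models current_save_fsgs(): refresh the cached copy from the hardware. */
static void toy_current_save_fsgs(struct toy_task *t, const struct toy_hw *hw)
{
	assert(t != NULL && hw != NULL);
	t->cached_gsbase = hw->user_gsbase;
}

/* Models wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase). */
static void toy_restore_from_cache(struct toy_hw *hw, const struct toy_task *t)
{
	hw->user_gsbase = t->cached_gsbase;
}

/*
 * Runs the scenario and returns the user GS base the CPU ends up with.
 * A non-zero refresh calls the save helper before the restore.
 */
static uint64_t toy_put_path(uint64_t old_base, uint64_t new_base, int refresh)
{
	struct toy_task t = { .cached_gsbase = old_base };
	struct toy_hw hw = { .user_gsbase = old_base };

	toy_user_wrgsbase(&hw, new_base);
	if (refresh)
		toy_current_save_fsgs(&t, &hw);
	toy_restore_from_cache(&hw, &t);
	return hw.user_gsbase;
}
```

In this model, toy_put_path(old, new, 0) ends with the old base (the
userspace update is lost), while toy_put_path(old, new, 1) ends with the new
one. The toy only shows why a stale cache would matter. It says nothing about
whether this is the cause of the reported crash.
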

> >         load_gs_index(svm->host.gs);
> >
> > Surely that should do load_gs_index() *before* wrmsrl().  But that's
> > not the problem at hand.
> >
> > There are also some open-coded rdmsr and wrmsrs of MSR_GS_BASE --
> > surely these should be x86_gsbase_read_cpu() and
> > x86_gsbase_write_cpu().  (Those functions don't actually exist, but
> > the fsbase equivalents do, and we should add them.)  But that's also
> > not the problem at hand.
> 
> Make that cpu_kernelmode_gs_base(cpu).  Perf win on all CPUs.
> 
> But I still don't see the bug.




* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 15:10       ` Sean Christopherson
@ 2020-08-20 15:21         ` Tom Lendacky
  2020-08-20 15:55           ` Andy Lutomirski
  2020-08-20 18:43           ` Bae, Chang Seok
  0 siblings, 2 replies; 27+ messages in thread
From: Tom Lendacky @ 2020-08-20 15:21 UTC (permalink / raw)
  To: Sean Christopherson, Andy Lutomirski
  Cc: Joerg Roedel, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Linux Kernel Mailing List, X86 ML, Chang S. Bae,
	Thomas Gleixner, Sasha Levin, Borislav Petkov, Peter Zijlstra,
	Ingo Molnar

On 8/20/20 10:10 AM, Sean Christopherson wrote:
> On Wed, Aug 19, 2020 at 05:21:33PM -0700, Andy Lutomirski wrote:
>> On Wed, Aug 19, 2020 at 2:25 PM Andy Lutomirski <luto@kernel.org> wrote:
>>>
>>> On Wed, Aug 19, 2020 at 11:19 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>>
>>>> On 8/19/20 1:07 PM, Tom Lendacky wrote:
>>>>> It looks like the FSGSBASE support is crashing my second generation EPYC
>>>>> system. I was able to bisect it to:
>>>>>
>>>>> b745cfba44c1 ("x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit")
>>>>>
>>>>> The panic only happens when using KVM. Doing kernel builds or stress
>>>>> on bare-metal appears fine. But if I fire up, in this case, a 64-vCPU
>>>>> guest and do a kernel build within the guest, I get the following:
>>>>
>>>> I should clarify that this panic is on the bare-metal system, not in the
>>>> guest. And that specifying nofsgsbase on the bare-metal command line fixes
>>>> the issue.
>>>
>>> I certainly see some oddities:
>>>
>>> We have this code:
>>>
>>> static void svm_vcpu_put(struct kvm_vcpu *vcpu)
>>> {
>>>          struct vcpu_svm *svm = to_svm(vcpu);
>>>          int i;
>>>
>>>          avic_vcpu_put(vcpu);
>>>
>>>          ++vcpu->stat.host_state_reload;
>>>          kvm_load_ldt(svm->host.ldt);
>>> #ifdef CONFIG_X86_64
>>>          loadsegment(fs, svm->host.fs);
>>>          wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);
> 
> Pretty sure current->thread.gsbase can be stale, i.e. this needs:
> 
> 	current_save_fsgs();

I did try adding current_save_fsgs() in svm_vcpu_load(), saving the 
current->thread.gsbase value to a new variable in the svm struct. I then 
used that variable in the wrmsrl below, but it still crashed.

Thanks,
Tom

> 	wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);
> 
> On a related topic, we really should consolidate the VMX and SVM code for
> these flows, they're both ugly.
> 
>>>          load_gs_index(svm->host.gs);
>>>
>>> Surely that should do load_gs_index() *before* wrmsrl().  But that's
>>> not the problem at hand.
>>>
>>> There are also some open-coded rdmsr and wrmsrs of MSR_GS_BASE --
>>> surely these should be x86_gsbase_read_cpu() and
>>> x86_gsbase_write_cpu().  (Those functions don't actually exist, but
>>> the fsbase equivalents do, and we should add them.)  But that's also
>>> not the problem at hand.
>>
>> Make that cpu_kernelmode_gs_base(cpu).  Perf win on all CPUs.
>>
>> But I still don't see the bug.
> 
> 


* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 15:21         ` Tom Lendacky
@ 2020-08-20 15:55           ` Andy Lutomirski
  2020-08-20 16:17             ` Tom Lendacky
  2020-08-20 18:43           ` Bae, Chang Seok
  1 sibling, 1 reply; 27+ messages in thread
From: Andy Lutomirski @ 2020-08-20 15:55 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Sean Christopherson, Andy Lutomirski, Joerg Roedel,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On Thu, Aug 20, 2020 at 8:21 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 8/20/20 10:10 AM, Sean Christopherson wrote:
> > On Wed, Aug 19, 2020 at 05:21:33PM -0700, Andy Lutomirski wrote:
> >> On Wed, Aug 19, 2020 at 2:25 PM Andy Lutomirski <luto@kernel.org> wrote:
> >>>
> >>> On Wed, Aug 19, 2020 at 11:19 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>>>
> >>>> On 8/19/20 1:07 PM, Tom Lendacky wrote:
> >>>>> It looks like the FSGSBASE support is crashing my second generation EPYC
> >>>>> system. I was able to bisect it to:
> >>>>>
> >>>>> b745cfba44c1 ("x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit")
> >>>>>
> >>>>> The panic only happens when using KVM. Doing kernel builds or stress
> >>>>> on bare-metal appears fine. But if I fire up, in this case, a 64-vCPU
> >>>>> guest and do a kernel build within the guest, I get the following:
> >>>>
> >>>> I should clarify that this panic is on the bare-metal system, not in the
> >>>> guest. And that specifying nofsgsbase on the bare-metal command line fixes
> >>>> the issue.
> >>>
> >>> I certainly see some oddities:
> >>>
> >>> We have this code:
> >>>
> >>> static void svm_vcpu_put(struct kvm_vcpu *vcpu)
> >>> {
> >>>          struct vcpu_svm *svm = to_svm(vcpu);
> >>>          int i;
> >>>
> >>>          avic_vcpu_put(vcpu);
> >>>
> >>>          ++vcpu->stat.host_state_reload;
> >>>          kvm_load_ldt(svm->host.ldt);
> >>> #ifdef CONFIG_X86_64
> >>>          loadsegment(fs, svm->host.fs);
> >>>          wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);
> >
> > Pretty sure current->thread.gsbase can be stale, i.e. this needs:
> >
> >       current_save_fsgs();
>
> I did try adding current_save_fsgs() in svm_vcpu_load(), saving the
> current->thread.gsbase value to a new variable in the svm struct. I then
> used that variable in the wrmsrl below, but it still crashed.

Can you try bisecting all the way back to:

commit dd649bd0b3aa012740059b1ba31ecad28a408f7f
Author: Andy Lutomirski <luto@kernel.org>
Date:   Thu May 28 16:13:48 2020 -0400

    x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE

and adding the unsafe_fsgsbase command line option while you bisect.

Also, you're crashing when you run a guest, right?  Can you try
running the x86 selftests on a bad kernel without running any guests?

--Andy


* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 15:55           ` Andy Lutomirski
@ 2020-08-20 16:17             ` Tom Lendacky
  2020-08-20 16:30               ` Tom Lendacky
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Lendacky @ 2020-08-20 16:17 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Sean Christopherson, Joerg Roedel, Paolo Bonzini,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On 8/20/20 10:55 AM, Andy Lutomirski wrote:
> On Thu, Aug 20, 2020 at 8:21 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>
>> On 8/20/20 10:10 AM, Sean Christopherson wrote:
>>> On Wed, Aug 19, 2020 at 05:21:33PM -0700, Andy Lutomirski wrote:
>>>> On Wed, Aug 19, 2020 at 2:25 PM Andy Lutomirski <luto@kernel.org> wrote:
>>>>>
>>>>> On Wed, Aug 19, 2020 at 11:19 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>>>>
>>>>>> On 8/19/20 1:07 PM, Tom Lendacky wrote:
>>>>>>> It looks like the FSGSBASE support is crashing my second generation EPYC
>>>>>>> system. I was able to bisect it to:
>>>>>>>
>>>>>>> b745cfba44c1 ("x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit")
>>>>>>>
>>>>>>> The panic only happens when using KVM. Doing kernel builds or stress
>>>>>>> on bare-metal appears fine. But if I fire up, in this case, a 64-vCPU
>>>>>>> guest and do a kernel build within the guest, I get the following:
>>>>>>
>>>>>> I should clarify that this panic is on the bare-metal system, not in the
>>>>>> guest. And that specifying nofsgsbase on the bare-metal command line fixes
>>>>>> the issue.
>>>>>
>>>>> I certainly see some oddities:
>>>>>
>>>>> We have this code:
>>>>>
>>>>> static void svm_vcpu_put(struct kvm_vcpu *vcpu)
>>>>> {
>>>>>           struct vcpu_svm *svm = to_svm(vcpu);
>>>>>           int i;
>>>>>
>>>>>           avic_vcpu_put(vcpu);
>>>>>
>>>>>           ++vcpu->stat.host_state_reload;
>>>>>           kvm_load_ldt(svm->host.ldt);
>>>>> #ifdef CONFIG_X86_64
>>>>>           loadsegment(fs, svm->host.fs);
>>>>>           wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);
>>>
>>> Pretty sure current->thread.gsbase can be stale, i.e. this needs:
>>>
>>>        current_save_fsgs();
>>
>> I did try adding current_save_fsgs() in svm_vcpu_load(), saving the
>> current->thread.gsbase value to a new variable in the svm struct. I then
>> used that variable in the wrmsrl below, but it still crashed.
> 
> Can you try bisecting all the way back to:
> 
> commit dd649bd0b3aa012740059b1ba31ecad28a408f7f
> Author: Andy Lutomirski <luto@kernel.org>
> Date:   Thu May 28 16:13:48 2020 -0400
> 
>      x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
> 
> and adding the unsafe_fsgsbase command line option while you bisect.

I'll give that a try.

> 
> Also, you're crashing when you run a guest, right?  Can you try

Right, when the guest is running. The guest boots fine and only when I put 
some stress on it (kernel build) does it cause the issue. It might be 
worth trying to pin all the vCPUs and see if the crash still happens.

> running the x86 selftests on a bad kernel without running any guests?

I'll give that a try.

Thanks,
Tom

> 
> --Andy
> 


* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 16:17             ` Tom Lendacky
@ 2020-08-20 16:30               ` Tom Lendacky
  2020-08-20 17:41                 ` Paolo Bonzini
  2020-08-20 18:34                 ` Tom Lendacky
  0 siblings, 2 replies; 27+ messages in thread
From: Tom Lendacky @ 2020-08-20 16:30 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Sean Christopherson, Joerg Roedel, Paolo Bonzini,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On 8/20/20 11:17 AM, Tom Lendacky wrote:
> On 8/20/20 10:55 AM, Andy Lutomirski wrote:
>> On Thu, Aug 20, 2020 at 8:21 AM Tom Lendacky <thomas.lendacky@amd.com> 
>> wrote:
>>>
>>> On 8/20/20 10:10 AM, Sean Christopherson wrote:
>>>> On Wed, Aug 19, 2020 at 05:21:33PM -0700, Andy Lutomirski wrote:
>>>>> On Wed, Aug 19, 2020 at 2:25 PM Andy Lutomirski <luto@kernel.org> wrote:
>>>>>>
>>>>>> On Wed, Aug 19, 2020 at 11:19 AM Tom Lendacky 
>>>>>> <thomas.lendacky@amd.com> wrote:
>>>>>>>
>>>>>>> On 8/19/20 1:07 PM, Tom Lendacky wrote:
>>>>>>>> It looks like the FSGSBASE support is crashing my second 
>>>>>>>> generation EPYC
>>>>>>>> system. I was able to bisect it to:
>>>>>>>>
>>>>>>>> b745cfba44c1 ("x86/cpu: Enable FSGSBASE on 64bit by default and 
>>>>>>>> add a chicken bit")
>>>>>>>>
>>>>>>>> The panic only happens when using KVM. Doing kernel builds or stress
>>>>>>>> on bare-metal appears fine. But if I fire up, in this case, a 64-vCPU
>>>>>>>> guest and do a kernel build within the guest, I get the following:
>>>>>>>
>>>>>>> I should clarify that this panic is on the bare-metal system, not 
>>>>>>> in the
>>>>>>> guest. And that specifying nofsgsbase on the bare-metal command 
>>>>>>> line fixes
>>>>>>> the issue.
>>>>>>
>>>>>> I certainly see some oddities:
>>>>>>
>>>>>> We have this code:
>>>>>>
>>>>>> static void svm_vcpu_put(struct kvm_vcpu *vcpu)
>>>>>> {
>>>>>>           struct vcpu_svm *svm = to_svm(vcpu);
>>>>>>           int i;
>>>>>>
>>>>>>           avic_vcpu_put(vcpu);
>>>>>>
>>>>>>           ++vcpu->stat.host_state_reload;
>>>>>>           kvm_load_ldt(svm->host.ldt);
>>>>>> #ifdef CONFIG_X86_64
>>>>>>           loadsegment(fs, svm->host.fs);
>>>>>>           wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);
>>>>
>>>> Pretty sure current->thread.gsbase can be stale, i.e. this needs:
>>>>
>>>>        current_save_fsgs();
>>>
>>> I did try adding current_save_fsgs() in svm_vcpu_load(), saving the
>>> current->thread.gsbase value to a new variable in the svm struct. I then
>>> used that variable in the wrmsrl below, but it still crashed.
>>
>> Can you try bisecting all the way back to:
>>
>> commit dd649bd0b3aa012740059b1ba31ecad28a408f7f
>> Author: Andy Lutomirski <luto@kernel.org>
>> Date:   Thu May 28 16:13:48 2020 -0400
>>
>>      x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
>>
>> and adding the unsafe_fsgsbase command line option while you bisect.
> 
> I'll give that a try.
> 
>>
>> Also, you're crashing when you run a guest, right?  Can you try
> 
> Right, when the guest is running. The guest boots fine and only when I put 
> some stress on it (kernel build) does it cause the issue. It might be 
> worth trying to pin all the vCPUs and see if the crash still happens.
> 
>> running the x86 selftests on a bad kernel without running any guests?
> 
> I'll give that a try.

All the selftests passed.

Thanks,
Tom

> 
> Thanks,
> Tom
> 
>>
>> --Andy
>>


* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 16:30               ` Tom Lendacky
@ 2020-08-20 17:41                 ` Paolo Bonzini
  2020-08-20 18:34                 ` Tom Lendacky
  1 sibling, 0 replies; 27+ messages in thread
From: Paolo Bonzini @ 2020-08-20 17:41 UTC (permalink / raw)
  To: Tom Lendacky, Andy Lutomirski
  Cc: Sean Christopherson, Joerg Roedel, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Linux Kernel Mailing List, X86 ML, Chang S. Bae,
	Thomas Gleixner, Sasha Levin, Borislav Petkov, Peter Zijlstra,
	Ingo Molnar

On 20/08/20 18:30, Tom Lendacky wrote:
>>> running the x86 selftests on a bad kernel without running any guests?
>>
>> I'll give that a try.
> 
> All the selftests passed.

Do the KVM selftests also pass?  Especially the dirty_log_test might be
interesting since it can be run for a longer time.

Paolo



* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 13:43     ` Paolo Bonzini
@ 2020-08-20 17:51       ` Andy Lutomirski
  0 siblings, 0 replies; 27+ messages in thread
From: Andy Lutomirski @ 2020-08-20 17:51 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Andy Lutomirski, Tom Lendacky, Joerg Roedel, Christopherson,
	Sean J, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On Thu, Aug 20, 2020 at 6:43 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 19/08/20 23:25, Andy Lutomirski wrote:
> >         wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);
> >         load_gs_index(svm->host.gs);
> >
> > Surely that should do load_gs_index() *before* wrmsrl().  But that's
> > not the problem at hand.
>
> The wrmsrl is writing the inactive GS base so the ordering between
> load_gs_index and wrmsrl(MSR_KERNEL_GS_BASE) should be irrelevant?

load_gs_index() sets the index between a pair of swapgs's -- it writes
the inactive base, too.
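
The ordering point can be sketched with a small user-space toy model in C.
It is not kernel code. It assumes that SWAPGS exchanges the active and
inactive GS bases, and that the selector load inside load_gs_index() replaces
the then-active base with a descriptor base supplied by the caller. Real
hardware behaviour for selector loads (null selectors in particular) may
differ. Every toy_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy GS state: selector, active base, and inactive base. */
struct toy_cpu {
	uint16_t gs_sel;         /* GS selector (index) */
	uint64_t gs_base;        /* active base, models MSR_GS_BASE */
	uint64_t kernel_gs_base; /* inactive base, models MSR_KERNEL_GS_BASE */
};

/* Models SWAPGS: exchange the active and inactive bases. */
static void toy_swapgs(struct toy_cpu *c)
{
	uint64_t tmp;

	assert(c != NULL);
	tmp = c->gs_base;
	c->gs_base = c->kernel_gs_base;
	c->kernel_gs_base = tmp;
}

/* Models wrmsrl(MSR_KERNEL_GS_BASE, val). */
static void toy_wrmsr_kernel_gs_base(struct toy_cpu *c, uint64_t val)
{
	c->kernel_gs_base = val;
}

/*
 * Models load_gs_index(): the selector load happens between two swapgs,
 * so the base it overwrites is the one that is inactive on entry and exit.
 * desc_base is the toy's assumption for the base the load installs.
 */
static void toy_load_gs_index(struct toy_cpu *c, uint16_t sel, uint64_t desc_base)
{
	toy_swapgs(c);
	c->gs_sel = sel;
	c->gs_base = desc_base;
	toy_swapgs(c);
}

/* Order quoted from svm_vcpu_put(): wrmsrl first, then load_gs_index(). */
static uint64_t toy_wrmsr_then_load(uint64_t user_base, uint64_t desc_base)
{
	struct toy_cpu c = { 0, 0xffff88900d780000ULL, 0 };

	toy_wrmsr_kernel_gs_base(&c, user_base);
	toy_load_gs_index(&c, 0, desc_base);
	return c.kernel_gs_base;
}

/* Suggested order: load_gs_index() first, then wrmsrl. */
static uint64_t toy_load_then_wrmsr(uint64_t user_base, uint64_t desc_base)
{
	struct toy_cpu c = { 0, 0xffff88900d780000ULL, 0 };

	toy_load_gs_index(&c, 0, desc_base);
	toy_wrmsr_kernel_gs_base(&c, user_base);
	return c.kernel_gs_base;
}
```

Under these assumptions, toy_wrmsr_then_load() returns desc_base (the MSR
write is clobbered by the selector load), while toy_load_then_wrmsr() returns
user_base. The active base is the same in both orders.
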

--Andy


* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 16:30               ` Tom Lendacky
  2020-08-20 17:41                 ` Paolo Bonzini
@ 2020-08-20 18:34                 ` Tom Lendacky
  2020-08-20 18:38                   ` Jim Mattson
  1 sibling, 1 reply; 27+ messages in thread
From: Tom Lendacky @ 2020-08-20 18:34 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Sean Christopherson, Joerg Roedel, Paolo Bonzini,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On 8/20/20 11:30 AM, Tom Lendacky wrote:
> On 8/20/20 11:17 AM, Tom Lendacky wrote:
>> On 8/20/20 10:55 AM, Andy Lutomirski wrote:
>>> On Thu, Aug 20, 2020 at 8:21 AM Tom Lendacky <thomas.lendacky@amd.com> 
>>> wrote:
>>>>
>>>> On 8/20/20 10:10 AM, Sean Christopherson wrote:
>>>>> On Wed, Aug 19, 2020 at 05:21:33PM -0700, Andy Lutomirski wrote:
>>>>>> On Wed, Aug 19, 2020 at 2:25 PM Andy Lutomirski <luto@kernel.org> 
>>>>>> wrote:
>>>>>>>
>>>>>>> On Wed, Aug 19, 2020 at 11:19 AM Tom Lendacky 
>>>>>>> <thomas.lendacky@amd.com> wrote:
>>>>>>>>
>>>>>>>> On 8/19/20 1:07 PM, Tom Lendacky wrote:
>>>>>>>>> It looks like the FSGSBASE support is crashing my second 
>>>>>>>>> generation EPYC
>>>>>>>>> system. I was able to bisect it to:
>>>>>>>>>
>>>>>>>>> b745cfba44c1 ("x86/cpu: Enable FSGSBASE on 64bit by default and 
>>>>>>>>> add a chicken bit")
>>>>>>>>>
>>>>>>>>> The panic only happens when using KVM. Doing kernel builds or stress
>>>>>>>>> on bare-metal appears fine. But if I fire up, in this case, a 
>>>>>>>>> 64-vCPU
>>>>>>>>> guest and do a kernel build within the guest, I get the following:
>>>>>>>>
>>>>>>>> I should clarify that this panic is on the bare-metal system, not 
>>>>>>>> in the
>>>>>>>> guest. And that specifying nofsgsbase on the bare-metal command 
>>>>>>>> line fixes
>>>>>>>> the issue.
>>>>>>>
>>>>>>> I certainly see some oddities:
>>>>>>>
>>>>>>> We have this code:
>>>>>>>
>>>>>>> static void svm_vcpu_put(struct kvm_vcpu *vcpu)
>>>>>>> {
>>>>>>>           struct vcpu_svm *svm = to_svm(vcpu);
>>>>>>>           int i;
>>>>>>>
>>>>>>>           avic_vcpu_put(vcpu);
>>>>>>>
>>>>>>>           ++vcpu->stat.host_state_reload;
>>>>>>>           kvm_load_ldt(svm->host.ldt);
>>>>>>> #ifdef CONFIG_X86_64
>>>>>>>           loadsegment(fs, svm->host.fs);
>>>>>>>           wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);
>>>>>
>>>>> Pretty sure current->thread.gsbase can be stale, i.e. this needs:
>>>>>
>>>>>        current_save_fsgs();
>>>>
>>>> I did try adding current_save_fsgs() in svm_vcpu_load(), saving the
>>>> current->thread.gsbase value to a new variable in the svm struct. I then
>>>> used that variable in the wrmsrl below, but it still crashed.
>>>
>>> Can you try bisecting all the way back to:
>>>
>>> commit dd649bd0b3aa012740059b1ba31ecad28a408f7f
>>> Author: Andy Lutomirski <luto@kernel.org>
>>> Date:   Thu May 28 16:13:48 2020 -0400
>>>
>>>      x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
>>>
>>> and adding the unsafe_fsgsbase command line option while you bisect.
>>
>> I'll give that a try.

Bisecting with unsafe_fsgsbase identified:

c82965f9e530 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit")

But I'm thinking that could be because it starts using GET_PERCPU_BASE, 
which on Rome would use RDPID. So is SVM restoring TSC_AUX_MSR too late? 
That would explain why I don't see the issue on Naples, which doesn't 
support RDPID.
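
As a rough sketch of why a stale TSC_AUX would matter here: RDPID
returns the TSC_AUX value, which on Linux packs the CPU number into the
low 12 bits and the node number above them. The model below is
user-space C, not the kernel macro; the per-CPU layout
(PERCPU_AREA_START, PERCPU_STRIDE) and the function names are made up
for illustration.

```c
#include <stdint.h>

#define CPUNODE_BITS 12
#define CPUNODE_MASK 0xfffu

/* Hypothetical per-CPU layout, for illustration only. */
#define PERCPU_AREA_START 0x100000ULL
#define PERCPU_STRIDE     0x1000ULL

/* Pack CPU and node numbers the way the host writes TSC_AUX. */
uint32_t encode_cpunode(unsigned int cpu, unsigned int node)
{
	return (node << CPUNODE_BITS) | cpu;
}

/* CPU number as decoded from whatever TSC_AUX currently holds. */
unsigned int cpu_from_tsc_aux(uint32_t aux)
{
	return aux & CPUNODE_MASK;
}

/* Per-CPU base an entry path would derive from that CPU number. */
uint64_t percpu_base_from_tsc_aux(uint32_t aux)
{
	return PERCPU_AREA_START + cpu_from_tsc_aux(aux) * PERCPU_STRIDE;
}
```

If the host runs on CPU 22 but TSC_AUX still holds a guest value of 3,
percpu_base_from_tsc_aux() yields CPU 3's area, so the handler would
operate on another CPU's per-CPU data.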

Thanks,
Tom

>>
>>>
>>> Also, you're crashing when you run a guest, right?  Can you try
>>
>> Right, when the guest is running. The guest boots fine and only when I 
>> put some stress on it (kernel build) does it cause the issue. It might 
>> be worth trying to pin all the vCPUs and see if the crash still happens.
>>
>>> running the x86 selftests on a bad kernel without running any guests?
>>
>> I'll give that a try.
> 
> All the selftests passed.
> 
> Thanks,
> Tom
> 
>>
>> Thanks,
>> Tom
>>
>>>
>>> --Andy
>>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 18:34                 ` Tom Lendacky
@ 2020-08-20 18:38                   ` Jim Mattson
  2020-08-20 18:39                     ` Jim Mattson
  0 siblings, 1 reply; 27+ messages in thread
From: Jim Mattson @ 2020-08-20 18:38 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Andy Lutomirski, Sean Christopherson, Joerg Roedel,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On Thu, Aug 20, 2020 at 11:34 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 8/20/20 11:30 AM, Tom Lendacky wrote:
> > On 8/20/20 11:17 AM, Tom Lendacky wrote:
> >> On 8/20/20 10:55 AM, Andy Lutomirski wrote:
> >>> On Thu, Aug 20, 2020 at 8:21 AM Tom Lendacky <thomas.lendacky@amd.com>
> >>> wrote:
> >>>>
> >>>> On 8/20/20 10:10 AM, Sean Christopherson wrote:
> >>>>> On Wed, Aug 19, 2020 at 05:21:33PM -0700, Andy Lutomirski wrote:
> >>>>>> On Wed, Aug 19, 2020 at 2:25 PM Andy Lutomirski <luto@kernel.org>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> On Wed, Aug 19, 2020 at 11:19 AM Tom Lendacky
> >>>>>>> <thomas.lendacky@amd.com> wrote:
> >>>>>>>>
> >>>>>>>> On 8/19/20 1:07 PM, Tom Lendacky wrote:
> >>>>>>>>> It looks like the FSGSBASE support is crashing my second
> >>>>>>>>> generation EPYC
> >>>>>>>>> system. I was able to bisect it to:
> >>>>>>>>>
> >>>>>>>>> b745cfba44c1 ("x86/cpu: Enable FSGSBASE on 64bit by default and
> >>>>>>>>> add a chicken bit")
> >>>>>>>>>
> >>>>>>>>> The panic only happens when using KVM. Doing kernel builds or stress
> >>>>>>>>> on bare-metal appears fine. But if I fire up, in this case, a
> >>>>>>>>> 64-vCPU
> >>>>>>>>> guest and do a kernel build within the guest, I get the following:
> >>>>>>>>
> >>>>>>>> I should clarify that this panic is on the bare-metal system, not
> >>>>>>>> in the
> >>>>>>>> guest. And that specifying nofsgsbase on the bare-metal command
> >>>>>>>> line fixes
> >>>>>>>> the issue.
> >>>>>>>
> >>>>>>> I certainly see some oddities:
> >>>>>>>
> >>>>>>> We have this code:
> >>>>>>>
> >>>>>>> static void svm_vcpu_put(struct kvm_vcpu *vcpu)
> >>>>>>> {
> >>>>>>>           struct vcpu_svm *svm = to_svm(vcpu);
> >>>>>>>           int i;
> >>>>>>>
> >>>>>>>           avic_vcpu_put(vcpu);
> >>>>>>>
> >>>>>>>           ++vcpu->stat.host_state_reload;
> >>>>>>>           kvm_load_ldt(svm->host.ldt);
> >>>>>>> #ifdef CONFIG_X86_64
> >>>>>>>           loadsegment(fs, svm->host.fs);
> >>>>>>>           wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);
> >>>>>
> >>>>> Pretty sure current->thread.gsbase can be stale, i.e. this needs:
> >>>>>
> >>>>>        current_save_fsgs();
> >>>>
> >>>> I did try adding current_save_fsgs() in svm_vcpu_load(), saving the
> >>>> current->thread.gsbase value to a new variable in the svm struct. I then
> >>>> used that variable in the wrmsrl below, but it still crashed.
> >>>
> >>> Can you try bisecting all the way back to:
> >>>
> >>> commit dd649bd0b3aa012740059b1ba31ecad28a408f7f
> >>> Author: Andy Lutomirski <luto@kernel.org>
> >>> Date:   Thu May 28 16:13:48 2020 -0400
> >>>
> >>>      x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
> >>>
> >>> and adding the unsafe_fsgsbase command line option while you bisect.
> >>
> >> I'll give that a try.
>
> Bisecting with unsafe_fsgsbase identified:
>
> c82965f9e530 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit")
>
> But I'm thinking that could be because it starts using GET_PERCPU_BASE,
> which on Rome would use RDPID. So is SVM restoring TSC_AUX_MSR too late?
> That would explain why I don't see the issue on Naples, which doesn't
> support RDPID.

It looks to me like SVM loads the guest TSC_AUX from vcpu_load to
vcpu_put, with this comment:

/* This assumes that the kernel never uses MSR_TSC_AUX */
if (static_cpu_has(X86_FEATURE_RDTSCP))
        wrmsrl(MSR_TSC_AUX, svm->tsc_aux);

We are talking about mainline here, right?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 18:38                   ` Jim Mattson
@ 2020-08-20 18:39                     ` Jim Mattson
  2020-08-20 18:41                       ` Tom Lendacky
  0 siblings, 1 reply; 27+ messages in thread
From: Jim Mattson @ 2020-08-20 18:39 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Andy Lutomirski, Sean Christopherson, Joerg Roedel,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On Thu, Aug 20, 2020 at 11:38 AM Jim Mattson <jmattson@google.com> wrote:
>
> On Thu, Aug 20, 2020 at 11:34 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >
> > On 8/20/20 11:30 AM, Tom Lendacky wrote:
> > > On 8/20/20 11:17 AM, Tom Lendacky wrote:
> > >> On 8/20/20 10:55 AM, Andy Lutomirski wrote:
> > >>> On Thu, Aug 20, 2020 at 8:21 AM Tom Lendacky <thomas.lendacky@amd.com>
> > >>> wrote:
> > >>>>
> > >>>> On 8/20/20 10:10 AM, Sean Christopherson wrote:
> > >>>>> On Wed, Aug 19, 2020 at 05:21:33PM -0700, Andy Lutomirski wrote:
> > >>>>>> On Wed, Aug 19, 2020 at 2:25 PM Andy Lutomirski <luto@kernel.org>
> > >>>>>> wrote:
> > >>>>>>>
> > >>>>>>> On Wed, Aug 19, 2020 at 11:19 AM Tom Lendacky
> > >>>>>>> <thomas.lendacky@amd.com> wrote:
> > >>>>>>>>
> > >>>>>>>> On 8/19/20 1:07 PM, Tom Lendacky wrote:
> > >>>>>>>>> It looks like the FSGSBASE support is crashing my second
> > >>>>>>>>> generation EPYC
> > >>>>>>>>> system. I was able to bisect it to:
> > >>>>>>>>>
> > >>>>>>>>> b745cfba44c1 ("x86/cpu: Enable FSGSBASE on 64bit by default and
> > >>>>>>>>> add a chicken bit")
> > >>>>>>>>>
> > >>>>>>>>> The panic only happens when using KVM. Doing kernel builds or stress
> > >>>>>>>>> on bare-metal appears fine. But if I fire up, in this case, a
> > >>>>>>>>> 64-vCPU
> > >>>>>>>>> guest and do a kernel build within the guest, I get the following:
> > >>>>>>>>
> > >>>>>>>> I should clarify that this panic is on the bare-metal system, not
> > >>>>>>>> in the
> > >>>>>>>> guest. And that specifying nofsgsbase on the bare-metal command
> > >>>>>>>> line fixes
> > >>>>>>>> the issue.
> > >>>>>>>
> > >>>>>>> I certainly see some oddities:
> > >>>>>>>
> > >>>>>>> We have this code:
> > >>>>>>>
> > >>>>>>> static void svm_vcpu_put(struct kvm_vcpu *vcpu)
> > >>>>>>> {
> > >>>>>>>           struct vcpu_svm *svm = to_svm(vcpu);
> > >>>>>>>           int i;
> > >>>>>>>
> > >>>>>>>           avic_vcpu_put(vcpu);
> > >>>>>>>
> > >>>>>>>           ++vcpu->stat.host_state_reload;
> > >>>>>>>           kvm_load_ldt(svm->host.ldt);
> > >>>>>>> #ifdef CONFIG_X86_64
> > >>>>>>>           loadsegment(fs, svm->host.fs);
> > >>>>>>>           wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);
> > >>>>>
> > >>>>> Pretty sure current->thread.gsbase can be stale, i.e. this needs:
> > >>>>>
> > >>>>>        current_save_fsgs();
> > >>>>
> > >>>> I did try adding current_save_fsgs() in svm_vcpu_load(), saving the
> > >>>> current->thread.gsbase value to a new variable in the svm struct. I then
> > >>>> used that variable in the wrmsrl below, but it still crashed.
> > >>>
> > >>> Can you try bisecting all the way back to:
> > >>>
> > >>> commit dd649bd0b3aa012740059b1ba31ecad28a408f7f
> > >>> Author: Andy Lutomirski <luto@kernel.org>
> > >>> Date:   Thu May 28 16:13:48 2020 -0400
> > >>>
> > >>>      x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
> > >>>
> > >>> and adding the unsafe_fsgsbase command line option while you bisect.
> > >>
> > >> I'll give that a try.
> >
> > Bisecting with unsafe_fsgsbase identified:
> >
> > c82965f9e530 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit")
> >
> > But I'm thinking that could be because it starts using GET_PERCPU_BASE,
> > which on Rome would use RDPID. So is SVM restoring TSC_AUX_MSR too late?
> > That would explain why I don't see the issue on Naples, which doesn't
> > support RDPID.
>
> It looks to me like SVM loads the guest TSC_AUX from vcpu_load to
> vcpu_put, with this comment:
>
> /* This assumes that the kernel never uses MSR_TSC_AUX */
> if (static_cpu_has(X86_FEATURE_RDTSCP))
>         wrmsrl(MSR_TSC_AUX, svm->tsc_aux);

Correction: It never restores TSC_AUX, AFAICT.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 18:39                     ` Jim Mattson
@ 2020-08-20 18:41                       ` Tom Lendacky
  2020-08-20 19:04                         ` Tom Lendacky
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Lendacky @ 2020-08-20 18:41 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Andy Lutomirski, Sean Christopherson, Joerg Roedel,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On 8/20/20 1:39 PM, Jim Mattson wrote:
> On Thu, Aug 20, 2020 at 11:38 AM Jim Mattson <jmattson@google.com> wrote:
>>
>> On Thu, Aug 20, 2020 at 11:34 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>
>>> On 8/20/20 11:30 AM, Tom Lendacky wrote:
>>>> On 8/20/20 11:17 AM, Tom Lendacky wrote:
>>>>> On 8/20/20 10:55 AM, Andy Lutomirski wrote:
>>>>>> On Thu, Aug 20, 2020 at 8:21 AM Tom Lendacky <thomas.lendacky@amd.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> On 8/20/20 10:10 AM, Sean Christopherson wrote:
>>>>>>>> On Wed, Aug 19, 2020 at 05:21:33PM -0700, Andy Lutomirski wrote:
>>>>>>>>> On Wed, Aug 19, 2020 at 2:25 PM Andy Lutomirski <luto@kernel.org>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> On Wed, Aug 19, 2020 at 11:19 AM Tom Lendacky
>>>>>>>>>> <thomas.lendacky@amd.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 8/19/20 1:07 PM, Tom Lendacky wrote:
>>>>>>>>>>>> It looks like the FSGSBASE support is crashing my second
>>>>>>>>>>>> generation EPYC
>>>>>>>>>>>> system. I was able to bisect it to:
>>>>>>>>>>>>
>>>>>>>>>>>> b745cfba44c1 ("x86/cpu: Enable FSGSBASE on 64bit by default and
>>>>>>>>>>>> add a chicken bit")
>>>>>>>>>>>>
>>>>>>>>>>>> The panic only happens when using KVM. Doing kernel builds or stress
>>>>>>>>>>>> on bare-metal appears fine. But if I fire up, in this case, a
>>>>>>>>>>>> 64-vCPU
>>>>>>>>>>>> guest and do a kernel build within the guest, I get the following:
>>>>>>>>>>>
>>>>>>>>>>> I should clarify that this panic is on the bare-metal system, not
>>>>>>>>>>> in the
>>>>>>>>>>> guest. And that specifying nofsgsbase on the bare-metal command
>>>>>>>>>>> line fixes
>>>>>>>>>>> the issue.
>>>>>>>>>>
>>>>>>>>>> I certainly see some oddities:
>>>>>>>>>>
>>>>>>>>>> We have this code:
>>>>>>>>>>
>>>>>>>>>> static void svm_vcpu_put(struct kvm_vcpu *vcpu)
>>>>>>>>>> {
>>>>>>>>>>            struct vcpu_svm *svm = to_svm(vcpu);
>>>>>>>>>>            int i;
>>>>>>>>>>
>>>>>>>>>>            avic_vcpu_put(vcpu);
>>>>>>>>>>
>>>>>>>>>>            ++vcpu->stat.host_state_reload;
>>>>>>>>>>            kvm_load_ldt(svm->host.ldt);
>>>>>>>>>> #ifdef CONFIG_X86_64
>>>>>>>>>>            loadsegment(fs, svm->host.fs);
>>>>>>>>>>            wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);
>>>>>>>>
>>>>>>>> Pretty sure current->thread.gsbase can be stale, i.e. this needs:
>>>>>>>>
>>>>>>>>         current_save_fsgs();
>>>>>>>
>>>>>>> I did try adding current_save_fsgs() in svm_vcpu_load(), saving the
>>>>>>> current->thread.gsbase value to a new variable in the svm struct. I then
>>>>>>> used that variable in the wrmsrl below, but it still crashed.
>>>>>>
>>>>>> Can you try bisecting all the way back to:
>>>>>>
>>>>>> commit dd649bd0b3aa012740059b1ba31ecad28a408f7f
>>>>>> Author: Andy Lutomirski <luto@kernel.org>
>>>>>> Date:   Thu May 28 16:13:48 2020 -0400
>>>>>>
>>>>>>       x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
>>>>>>
>>>>>> and adding the unsafe_fsgsbase command line option while you bisect.
>>>>>
>>>>> I'll give that a try.
>>>
>>> Bisecting with unsafe_fsgsbase identified:
>>>
>>> c82965f9e530 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit")
>>>
>>> But I'm thinking that could be because it starts using GET_PERCPU_BASE,
>>> which on Rome would use RDPID. So is SVM restoring TSC_AUX_MSR too late?
>>> That would explain why I don't see the issue on Naples, which doesn't
>>> support RDPID.
>>
>> It looks to me like SVM loads the guest TSC_AUX from vcpu_load to
>> vcpu_put, with this comment:
>>
>> /* This assumes that the kernel never uses MSR_TSC_AUX */
>> if (static_cpu_has(X86_FEATURE_RDTSCP))
>>          wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
> 
> Correction: It never restores TSC_AUX, AFAICT.

It does, it's in the host_save_user_msrs array.

Thanks,
Tom

> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 15:21         ` Tom Lendacky
  2020-08-20 15:55           ` Andy Lutomirski
@ 2020-08-20 18:43           ` Bae, Chang Seok
  1 sibling, 0 replies; 27+ messages in thread
From: Bae, Chang Seok @ 2020-08-20 18:43 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Christopherson, Sean J, Andy Lutomirski, Joerg Roedel,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Linux Kernel Mailing List, X86 ML, Thomas Gleixner, Sasha Levin,
	Borislav Petkov, Peter Zijlstra, Ingo Molnar, Hansen, Dave,
	Shankar, Ravi V


> On Aug 20, 2020, at 08:21, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> On 8/20/20 10:10 AM, Sean Christopherson wrote:
>> 
>> Pretty sure current->thread.gsbase can be stale, i.e. this needs:
>> 	current_save_fsgs();
> 
> I did try adding current_save_fsgs() in svm_vcpu_load(), saving the current->thread.gsbase value to a new variable in the svm struct. I then used that variable in the wrmsrl below, but it still crashed.

Then, current->thread.gsbase is from __rdgsbase_inactive() which is
user GSBASE.

If you do the wrmsrl below, it overwrites the current GSBASE with the 
user value.

>> 	wrmsrl(MSR_KERNEL_GS_BASE, current->thread.gsbase);

Thanks,
Chang

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 18:41                       ` Tom Lendacky
@ 2020-08-20 19:04                         ` Tom Lendacky
  2020-08-20 19:05                           ` Tom Lendacky
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Lendacky @ 2020-08-20 19:04 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Andy Lutomirski, Sean Christopherson, Joerg Roedel,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On 8/20/20 1:41 PM, Tom Lendacky wrote:
> On 8/20/20 1:39 PM, Jim Mattson wrote:
>> On Thu, Aug 20, 2020 at 11:38 AM Jim Mattson <jmattson@google.com> wrote:
>>>
>>> On Thu, Aug 20, 2020 at 11:34 AM Tom Lendacky <thomas.lendacky@amd.com> 
>>> wrote:
>>>>
>>>>
>>>> Bisecting with unsafe_fsgsbase identified:
>>>>
>>>> c82965f9e530 ("x86/entry/64: Handle FSGSBASE enabled paranoid 
>>>> entry/exit")
>>>>
>>>> But I'm thinking that could be because it starts using GET_PERCPU_BASE,
>>>> which on Rome would use RDPID. So is SVM restoring TSC_AUX_MSR too late?
>>>> That would explain why I don't see the issue on Naples, which doesn't
>>>> support RDPID.
>>>
>>> It looks to me like SVM loads the guest TSC_AUX from vcpu_load to
>>> vcpu_put, with this comment:
>>>
>>> /* This assumes that the kernel never uses MSR_TSC_AUX */
>>> if (static_cpu_has(X86_FEATURE_RDTSCP))
>>>          wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
>>
>> Correction: It never restores TSC_AUX, AFAICT.
> 
> It does, it's in the host_save_user_msrs array.

I added a quick hack to save TSC_AUX to a new variable in the SVM struct 
and then restore it right after VMEXIT (just after where GS is restored in 
svm_vcpu_enter_exit()) and my guest is no longer crashing.

Thanks,
Tom

> 
> Thanks,
> Tom
> 
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 19:04                         ` Tom Lendacky
@ 2020-08-20 19:05                           ` Tom Lendacky
  2020-08-20 20:07                             ` Dave Hansen
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Lendacky @ 2020-08-20 19:05 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Andy Lutomirski, Sean Christopherson, Joerg Roedel,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar



On 8/20/20 2:04 PM, Tom Lendacky wrote:
> On 8/20/20 1:41 PM, Tom Lendacky wrote:
>> On 8/20/20 1:39 PM, Jim Mattson wrote:
>>> On Thu, Aug 20, 2020 at 11:38 AM Jim Mattson <jmattson@google.com> wrote:
>>>>
>>>> On Thu, Aug 20, 2020 at 11:34 AM Tom Lendacky 
>>>> <thomas.lendacky@amd.com> wrote:
>>>>>
>>>>>
>>>>> Bisecting with unsafe_fsgsbase identified:
>>>>>
>>>>> c82965f9e530 ("x86/entry/64: Handle FSGSBASE enabled paranoid 
>>>>> entry/exit")
>>>>>
>>>>> But I'm thinking that could be because it starts using GET_PERCPU_BASE,
>>>>> which on Rome would use RDPID. So is SVM restoring TSC_AUX_MSR too late?
>>>>> That would explain why I don't see the issue on Naples, which doesn't
>>>>> support RDPID.
>>>>
>>>> It looks to me like SVM loads the guest TSC_AUX from vcpu_load to
>>>> vcpu_put, with this comment:
>>>>
>>>> /* This assumes that the kernel never uses MSR_TSC_AUX */
>>>> if (static_cpu_has(X86_FEATURE_RDTSCP))
>>>>          wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
>>>
>>> Correction: It never restores TSC_AUX, AFAICT.
>>
>> It does, it's in the host_save_user_msrs array.
> 
> I added a quick hack to save TSC_AUX to a new variable in the SVM struct 
> and then restore it right after VMEXIT (just after where GS is restored in 
> svm_vcpu_enter_exit()) and my guest is no longer crashing.

Sorry, I mean my host is no longer crashing.

Thanks,
Tom

> 
> Thanks,
> Tom
> 
>>
>> Thanks,
>> Tom
>>
>>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 19:05                           ` Tom Lendacky
@ 2020-08-20 20:07                             ` Dave Hansen
  2020-08-20 20:15                               ` Tom Lendacky
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Hansen @ 2020-08-20 20:07 UTC (permalink / raw)
  To: Tom Lendacky, Jim Mattson
  Cc: Andy Lutomirski, Sean Christopherson, Joerg Roedel,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On 8/20/20 12:05 PM, Tom Lendacky wrote:
>> I added a quick hack to save TSC_AUX to a new variable in the SVM
>> struct and then restore it right after VMEXIT (just after where GS is
>> restored in svm_vcpu_enter_exit()) and my guest is no longer crashing.
> 
> Sorry, I mean my host is no longer crashing.

Just to make sure I've got this:
1. Older CPUs didn't have X86_FEATURE_RDPID
2. FSGSBASE patches started using RDPID in the NMI entry path when
   supported *AND* FSGSBASE was enabled
3. There was a latent SVM bug which did not restore the RDPID data
   before NMIs were reenabled after VMEXIT
4. If an NMI comes in the window between VMEXIT and the
   wrmsr(TSC_AUX)... boom

If FSGSBASE is disabled (as Tom did on the command line), then
the RDPID path isn't hit.
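
The window in steps 3 and 4 can be modelled in a few lines of user-space
C. This is only a sketch of the ordering, under the assumption that an
NMI arrives as soon as NMIs become deliverable after VMEXIT; the struct
and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimal model: the only state that matters is TSC_AUX. */
struct cpu_model {
	uint32_t tsc_aux;
};

/*
 * Returns the value an NMI handler would read via RDPID when the NMI
 * lands right after NMIs become deliverable again following VMEXIT.
 */
uint32_t rdpid_seen_by_nmi(uint32_t host_aux, uint32_t guest_aux,
			   bool restore_before_nmis_unblocked)
{
	/* Right after VMEXIT, TSC_AUX still holds the guest value. */
	struct cpu_model cpu = { .tsc_aux = guest_aux };
	uint32_t seen;

	/* Early restore: the placement used by the quick hack. */
	if (restore_before_nmis_unblocked)
		cpu.tsc_aux = host_aux;

	/* NMIs are deliverable from here on; one arrives immediately. */
	seen = cpu.tsc_aux;

	/* Late restore: too late for the NMI above. */
	cpu.tsc_aux = host_aux;

	return seen;
}
```

With the late restore only, the handler sees the guest value; with the
early restore it sees the host value.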

Fun.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 20:07                             ` Dave Hansen
@ 2020-08-20 20:15                               ` Tom Lendacky
  2020-08-20 20:36                                 ` Andy Lutomirski
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Lendacky @ 2020-08-20 20:15 UTC (permalink / raw)
  To: Dave Hansen, Jim Mattson
  Cc: Andy Lutomirski, Sean Christopherson, Joerg Roedel,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On 8/20/20 3:07 PM, Dave Hansen wrote:
> On 8/20/20 12:05 PM, Tom Lendacky wrote:
>>> I added a quick hack to save TSC_AUX to a new variable in the SVM
>>> struct and then restore it right after VMEXIT (just after where GS is
>>> restored in svm_vcpu_enter_exit()) and my guest is no longer crashing.
>>
>> Sorry, I mean my host is no longer crashing.
> 
> Just to make sure I've got this:
> 1. Older CPUs didn't have X86_FEATURE_RDPID
> 2. FSGSBASE patches started using RDPID in the NMI entry path when
>     supported *AND* FSGSBASE was enabled
> 3. There was a latent SVM bug which did not restore the RDPID data
>     before NMIs were reenabled after VMEXIT
> 4. If an NMI comes in the window between VMEXIT and the
>     wrmsr(TSC_AUX)... boom

Right, which means that the setting of TSC_AUX to the guest value needs to 
be moved, too.

Thanks,
Tom

> 
> If FSGSBASE is disabled (as Tom did on the command line), then
> the RDPID path isn't hit.
> 
> Fun.
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 20:15                               ` Tom Lendacky
@ 2020-08-20 20:36                                 ` Andy Lutomirski
  2020-08-20 22:05                                   ` Sean Christopherson
  0 siblings, 1 reply; 27+ messages in thread
From: Andy Lutomirski @ 2020-08-20 20:36 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Dave Hansen, Jim Mattson, Andy Lutomirski, Sean Christopherson,
	Joerg Roedel, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar



> On Aug 20, 2020, at 1:15 PM, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> 
> On 8/20/20 3:07 PM, Dave Hansen wrote:
>> On 8/20/20 12:05 PM, Tom Lendacky wrote:
>>>> I added a quick hack to save TSC_AUX to a new variable in the SVM
>>>> struct and then restore it right after VMEXIT (just after where GS is
>>>> restored in svm_vcpu_enter_exit()) and my guest is no longer crashing.
>>> 
>>> Sorry, I mean my host is no longer crashing.
>> Just to make sure I've got this:
>> 1. Older CPUs didn't have X86_FEATURE_RDPID
>> 2. FSGSBASE patches started using RDPID in the NMI entry path when
>>    supported *AND* FSGSBASE was enabled
>> 3. There was a latent SVM bug which did not restore the RDPID data
>>    before NMIs were reenabled after VMEXIT
>> 4. If an NMI comes in the window between VMEXIT and the
>>    wrmsr(TSC_AUX)... boom
> 
> Right, which means that the setting of TSC_AUX to the guest value needs to be moved, too.
> 

Depending on how much of a perf hit this is, we could also skip using RDPID in the paranoid path on SVM-capable CPUs.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 20:36                                 ` Andy Lutomirski
@ 2020-08-20 22:05                                   ` Sean Christopherson
  2020-08-20 22:07                                     ` Andy Lutomirski
  0 siblings, 1 reply; 27+ messages in thread
From: Sean Christopherson @ 2020-08-20 22:05 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Tom Lendacky, Dave Hansen, Jim Mattson, Andy Lutomirski,
	Joerg Roedel, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On Thu, Aug 20, 2020 at 01:36:46PM -0700, Andy Lutomirski wrote:
> 
> 
> > On Aug 20, 2020, at 1:15 PM, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> > 
> > On 8/20/20 3:07 PM, Dave Hansen wrote:
> >> On 8/20/20 12:05 PM, Tom Lendacky wrote:
> >>>> I added a quick hack to save TSC_AUX to a new variable in the SVM
> >>>> struct and then restore it right after VMEXIT (just after where GS is
> >>>> restored in svm_vcpu_enter_exit()) and my guest is no longer crashing.
> >>> 
> >>> Sorry, I mean my host is no longer crashing.
> >> Just to make sure I've got this:
> >> 1. Older CPUs didn't have X86_FEATURE_RDPID
> >> 2. FSGSBASE patches started using RDPID in the NMI entry path when
> >>    supported *AND* FSGSBASE was enabled
> >> 3. There was a latent SVM bug which did not restore the RDPID data
> >>    before NMIs were reenabled after VMEXIT
> >> 4. If an NMI comes in the window between VMEXIT and the
> >>    wrmsr(TSC_AUX)... boom
> > 
> > Right, which means that the setting of TSC_AUX to the guest value needs to be moved, too.
> > 
> 
> Depending on how much of a perf hit this is, we could also skip using RDPID
> in the paranoid path on SVM-capable CPUs.

Doesn't this affect VMX as well?  KVM+VMX doesn't restore TSC_AUX until the
kernel returns to userspace.  I don't see anything that prevents the NMI
RDPID path from affecting Intel CPUs.

Assuming that's the case, I would strongly prefer this be handled in the
paranoid path.  NMIs are unblocked immediately on VMX VM-Exit, which means
using the MSR load lists in the VMCS, and I hate those with a vengeance.

Perf overhead on VMX would be 8-10% for VM-Exits that would normally stay
in KVM's run loop, e.g. ~125 cycles for the WRMSR, ~1300-1500 cycles to
handle the most common VM-Exits.  It'd be even higher overhead for the
VMX preemption timer, which is handled without even enabling IRQs and is
a hot path as it's used to emulate the TSC deadline timer for the guest.
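
For reference, the 8-10% figure follows from dividing the extra WRMSR
cost by the cost of handling a VM-Exit. A trivial helper (the function
name is made up for this sketch) makes the arithmetic explicit:

```c
/* Extra cost of one added operation as a percentage of the baseline. */
double overhead_percent(double added_cycles, double baseline_cycles)
{
	return 100.0 * added_cycles / baseline_cycles;
}
```

With the numbers above, overhead_percent(125, 1500) is about 8.3 and
overhead_percent(125, 1300) is about 9.6.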

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 22:05                                   ` Sean Christopherson
@ 2020-08-20 22:07                                     ` Andy Lutomirski
  2020-08-20 22:34                                       ` Sean Christopherson
  0 siblings, 1 reply; 27+ messages in thread
From: Andy Lutomirski @ 2020-08-20 22:07 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Tom Lendacky, Dave Hansen, Jim Mattson, Andy Lutomirski,
	Joerg Roedel, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On Thu, Aug 20, 2020 at 3:05 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Thu, Aug 20, 2020 at 01:36:46PM -0700, Andy Lutomirski wrote:
> >
> >
> > > On Aug 20, 2020, at 1:15 PM, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> > >
> > > On 8/20/20 3:07 PM, Dave Hansen wrote:
> > >> On 8/20/20 12:05 PM, Tom Lendacky wrote:
> > >>>> I added a quick hack to save TSC_AUX to a new variable in the SVM
> > >>>> struct and then restore it right after VMEXIT (just after where GS is
> > >>>> restored in svm_vcpu_enter_exit()) and my guest is no longer crashing.
> > >>>
> > >>> Sorry, I mean my host is no longer crashing.
> > >> Just to make sure I've got this:
> > >> 1. Older CPUs didn't have X86_FEATURE_RDPID
> > >> 2. FSGSBASE patches started using RDPID in the NMI entry path when
> > >>    supported *AND* FSGSBASE was enabled
> > >> 3. There was a latent SVM bug which did not restore the RDPID data
> > >>    before NMIs were reenabled after VMEXIT
> > >> 4. If an NMI comes in the window between VMEXIT and the
> > >>    wrmsr(TSC_AUX)... boom
> > >
> > > Right, which means that the setting of TSC_AUX to the guest value needs to be moved, too.
> > >
> >
> > Depending on how much of a perf hit this is, we could also skip using RDPID
> > in the paranoid path on SVM-capable CPUs.
>
> Doesn't this affect VMX as well?  KVM+VMX doesn't restore TSC_AUX until the
> kernel returns to userspace.  I don't see anything that prevents the NMI
> RDPID path from affecting Intel CPUs.
>
> Assuming that's the case, I would strongly prefer this be handled in the
> paranoid path.  NMIs are unblocked immediately on VMX VM-Exit, which means
> using the MSR load lists in the VMCS, and I hate those with a vengeance.
>
> Perf overhead on VMX would be 8-10% for VM-Exits that would normally stay
> in KVM's run loop, e.g. ~125 cycles for the WRMSR, ~1300-1500 cycles to
> handle the most common VM-Exits.  It'd be even higher overhead for the
> VMX preemption timer, which is handled without even enabling IRQs and is
> a hot path as it's used to emulate the TSC deadline timer for the guest.

I'm fine with that -- let's get rid of RDPID unconditionally in the
paranoid path.  Want to send a patch that also adds a comment
explaining why we're not using RDPID?

--Andy


* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 22:07                                     ` Andy Lutomirski
@ 2020-08-20 22:34                                       ` Sean Christopherson
  2020-08-21  0:00                                         ` Tom Lendacky
  0 siblings, 1 reply; 27+ messages in thread
From: Sean Christopherson @ 2020-08-20 22:34 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Tom Lendacky, Dave Hansen, Jim Mattson, Joerg Roedel,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On Thu, Aug 20, 2020 at 03:07:10PM -0700, Andy Lutomirski wrote:
> On Thu, Aug 20, 2020 at 3:05 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > On Thu, Aug 20, 2020 at 01:36:46PM -0700, Andy Lutomirski wrote:
> > >
> > >
> > > > On Aug 20, 2020, at 1:15 PM, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> > > >
> > > > On 8/20/20 3:07 PM, Dave Hansen wrote:
> > > >> On 8/20/20 12:05 PM, Tom Lendacky wrote:
> > > >>>> I added a quick hack to save TSC_AUX to a new variable in the SVM
> > > >>>> struct and then restore it right after VMEXIT (just after where GS is
> > > >>>> restored in svm_vcpu_enter_exit()) and my guest is no longer crashing.
> > > >>>
> > > >>> Sorry, I mean my host is no longer crashing.
> > > >> Just to make sure I've got this:
> > > >> 1. Older CPUs didn't have X86_FEATURE_RDPID
> > > >> 2. FSGSBASE patches started using RDPID in the NMI entry path when
> > > >>    supported *AND* FSGSBASE was enabled
> > > >> 3. There was a latent SVM bug which did not restore the RDPID data
> > > >>    before NMIs were reenabled after VMEXIT
> > > >> 4. If an NMI comes in the window between VMEXIT and the
> > > >>    wrmsr(TSC_AUX)... boom
> > > >
> > > > Right, which means that the setting of TSC_AUX to the guest value needs to be moved, too.
> > > >
> > >
> > > Depending on how much of a perf hit this is, we could also skip using RDPID
> > > in the paranoid path on SVM-capable CPUs.
> >
> > Doesn't this affect VMX as well?  KVM+VMX doesn't restore TSC_AUX until the
> > kernel returns to userspace.  I don't see anything that prevents the NMI
> > RDPID path from affecting Intel CPUs.
> >
> > Assuming that's the case, I would strongly prefer this be handled in the
> > paranoid path.  NMIs are unblocked immediately on VMX VM-Exit, which means
> > using the MSR load lists in the VMCS, and I hate those with a vengeance.
> >
> > Perf overhead on VMX would be 8-10% for VM-Exits that would normally stay
> > > in KVM's run loop, e.g. ~125 cycles for the WRMSR, ~1300-1500 cycles to
> > handle the most common VM-Exits.  It'd be even higher overhead for the
> > VMX preemption timer, which is handled without even enabling IRQs and is
> > a hot path as it's used to emulate the TSC deadline timer for the guest.
> 
> I'm fine with that -- let's get rid of RDPID unconditionally in the
> paranoid path.  Want to send a patch that also adds a comment
> explaining why we're not using RDPID?

Sure, though I won't object if Tom beats me to the punch :-)


* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-20 22:34                                       ` Sean Christopherson
@ 2020-08-21  0:00                                         ` Tom Lendacky
  2020-08-21  1:56                                           ` Sean Christopherson
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Lendacky @ 2020-08-21  0:00 UTC (permalink / raw)
  To: Sean Christopherson, Andy Lutomirski
  Cc: Dave Hansen, Jim Mattson, Joerg Roedel, Paolo Bonzini,
	Vitaly Kuznetsov, Wanpeng Li, Linux Kernel Mailing List, X86 ML,
	Chang S. Bae, Thomas Gleixner, Sasha Levin, Borislav Petkov,
	Peter Zijlstra, Ingo Molnar

On 8/20/20 5:34 PM, Sean Christopherson wrote:
> On Thu, Aug 20, 2020 at 03:07:10PM -0700, Andy Lutomirski wrote:
>> On Thu, Aug 20, 2020 at 3:05 PM Sean Christopherson
>> <sean.j.christopherson@intel.com> wrote:
>>>
>>> On Thu, Aug 20, 2020 at 01:36:46PM -0700, Andy Lutomirski wrote:
>>>>
>>>>
>>>>> On Aug 20, 2020, at 1:15 PM, Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>>>
>>>>> On 8/20/20 3:07 PM, Dave Hansen wrote:
>>>>>> On 8/20/20 12:05 PM, Tom Lendacky wrote:
>>>>>>>> I added a quick hack to save TSC_AUX to a new variable in the SVM
>>>>>>>> struct and then restore it right after VMEXIT (just after where GS is
>>>>>>>> restored in svm_vcpu_enter_exit()) and my guest is no longer crashing.
>>>>>>>
>>>>>>> Sorry, I mean my host is no longer crashing.
>>>>>> Just to make sure I've got this:
>>>>>> 1. Older CPUs didn't have X86_FEATURE_RDPID
>>>>>> 2. FSGSBASE patches started using RDPID in the NMI entry path when
>>>>>>     supported *AND* FSGSBASE was enabled
>>>>>> 3. There was a latent SVM bug which did not restore the RDPID data
>>>>>>     before NMIs were reenabled after VMEXIT
>>>>>> 4. If an NMI comes in the window between VMEXIT and the
>>>>>>     wrmsr(TSC_AUX)... boom
>>>>>
>>>>> Right, which means that the setting of TSC_AUX to the guest value needs to be moved, too.
>>>>>
>>>>
>>>> Depending on how much of a perf hit this is, we could also skip using RDPID
>>>> in the paranoid path on SVM-capable CPUs.
>>>
>>> Doesn't this affect VMX as well?  KVM+VMX doesn't restore TSC_AUX until the
>>> kernel returns to userspace.  I don't see anything that prevents the NMI
>>> RDPID path from affecting Intel CPUs.
>>>
>>> Assuming that's the case, I would strongly prefer this be handled in the
>>> paranoid path.  NMIs are unblocked immediately on VMX VM-Exit, which means
>>> using the MSR load lists in the VMCS, and I hate those with a vengeance.
>>>
>>> Perf overhead on VMX would be 8-10% for VM-Exits that would normally stay
>>> in KVM's run loop, e.g. ~125 cycles for the WRMSR, ~1300-1500 cycles to
>>> handle the most common VM-Exits.  It'd be even higher overhead for the
>>> VMX preemption timer, which is handled without even enabling IRQs and is
>>> a hot path as it's used to emulate the TSC deadline timer for the guest.
>>
>> I'm fine with that -- let's get rid of RDPID unconditionally in the
>> paranoid path.  Want to send a patch that also adds a comment
>> explaining why we're not using RDPID?
> 
> Sure, though I won't object if Tom beats me to the punch :-)

I can do it, but won't be able to get to it until sometime tomorrow.

Thanks,
Tom

> 


* Re: FSGSBASE causing panic on 5.9-rc1
  2020-08-21  0:00                                         ` Tom Lendacky
@ 2020-08-21  1:56                                           ` Sean Christopherson
  0 siblings, 0 replies; 27+ messages in thread
From: Sean Christopherson @ 2020-08-21  1:56 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Andy Lutomirski, Dave Hansen, Jim Mattson, Joerg Roedel,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Linux Kernel Mailing List, X86 ML, Chang S. Bae, Thomas Gleixner,
	Sasha Levin, Borislav Petkov, Peter Zijlstra, Ingo Molnar

On Thu, Aug 20, 2020 at 07:00:16PM -0500, Tom Lendacky wrote:
> On 8/20/20 5:34 PM, Sean Christopherson wrote:
> > On Thu, Aug 20, 2020 at 03:07:10PM -0700, Andy Lutomirski wrote:
> > > On Thu, Aug 20, 2020 at 3:05 PM Sean Christopherson
> > > <sean.j.christopherson@intel.com> wrote:
> > > > 
> > > > On Thu, Aug 20, 2020 at 01:36:46PM -0700, Andy Lutomirski wrote:
> > > > > 
> > > > > Depending on how much of a perf hit this is, we could also skip using RDPID
> > > > > in the paranoid path on SVM-capable CPUs.
> > > > 
> > > > Doesn't this affect VMX as well?  KVM+VMX doesn't restore TSC_AUX until the
> > > > kernel returns to userspace.  I don't see anything that prevents the NMI
> > > > RDPID path from affecting Intel CPUs.
> > > > 
> > > > Assuming that's the case, I would strongly prefer this be handled in the
> > > > paranoid path.  NMIs are unblocked immediately on VMX VM-Exit, which means
> > > > using the MSR load lists in the VMCS, and I hate those with a vengeance.
> > > > 
> > > > Perf overhead on VMX would be 8-10% for VM-Exits that would normally stay
> > > > > in KVM's run loop, e.g. ~125 cycles for the WRMSR, ~1300-1500 cycles to
> > > > handle the most common VM-Exits.  It'd be even higher overhead for the
> > > > VMX preemption timer, which is handled without even enabling IRQs and is
> > > > a hot path as it's used to emulate the TSC deadline timer for the guest.
> > > 
> > > I'm fine with that -- let's get rid of RDPID unconditionally in the
> > > > paranoid path.  Want to send a patch that also adds a comment
> > > explaining why we're not using RDPID?
> > 
> > Sure, though I won't object if Tom beats me to the punch :-)
> 
> I can do it, but won't be able to get to it until sometime tomorrow.

Confirmed VMX goes kaboom when running perf with a VM.  Patch incoming.
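
For readers following the thread, the failure window discussed above can be illustrated with a toy model. This is a sketch under stated assumptions, not kernel code: the `Cpu` class and `nmi_after_vmexit` function are hypothetical, and the model treats TSC_AUX as holding only the CPU number, ignoring any other bits the kernel may encode there.

```python
# Toy model of the RDPID/TSC_AUX window after a VM exit.
# Hypothetical names throughout; this does not mirror real kernel code.
from dataclasses import dataclass


@dataclass
class Cpu:
    cpu_number: int  # value the host expects RDPID to return (simplified)
    tsc_aux: int     # current contents of the modelled TSC_AUX MSR

    def rdpid(self) -> int:
        """RDPID simply returns whatever TSC_AUX currently holds."""
        return self.tsc_aux


def nmi_after_vmexit(cpu: Cpu, guest_tsc_aux: int, restore_before_nmi: bool) -> int:
    """Return the CPU number an NMI handler would read via RDPID."""
    cpu.tsc_aux = guest_tsc_aux       # guest value loaded for the guest run
    # ... guest runs, then a VM exit occurs ...
    if restore_before_nmi:
        cpu.tsc_aux = cpu.cpu_number  # host value restored before NMIs can hit
    # An NMI arrives here and derives the CPU number from RDPID.
    return cpu.rdpid()


# Host value not yet restored: the handler sees the guest's value.
stale = nmi_after_vmexit(Cpu(cpu_number=22, tsc_aux=22), guest_tsc_aux=3,
                         restore_before_nmi=False)
# Host value restored in time: the handler sees the right CPU.
fresh = nmi_after_vmexit(Cpu(cpu_number=22, tsc_aux=22), guest_tsc_aux=3,
                         restore_before_nmi=True)
print(stale, fresh)
```

In the unrestored case the modelled handler reads the guest's value (3) instead of the host CPU number (22), which is the kind of mismatch the thread attributes the corruption to. Avoiding RDPID in the paranoid path sidesteps the window without adding a WRMSR to every VM exit.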


Thread overview: 27+ messages
2020-08-19 18:07 FSGSBASE causing panic on 5.9-rc1 Tom Lendacky
2020-08-19 18:19 ` Tom Lendacky
2020-08-19 21:25   ` Andy Lutomirski
2020-08-20  0:21     ` Andy Lutomirski
2020-08-20 15:10       ` Sean Christopherson
2020-08-20 15:21         ` Tom Lendacky
2020-08-20 15:55           ` Andy Lutomirski
2020-08-20 16:17             ` Tom Lendacky
2020-08-20 16:30               ` Tom Lendacky
2020-08-20 17:41                 ` Paolo Bonzini
2020-08-20 18:34                 ` Tom Lendacky
2020-08-20 18:38                   ` Jim Mattson
2020-08-20 18:39                     ` Jim Mattson
2020-08-20 18:41                       ` Tom Lendacky
2020-08-20 19:04                         ` Tom Lendacky
2020-08-20 19:05                           ` Tom Lendacky
2020-08-20 20:07                             ` Dave Hansen
2020-08-20 20:15                               ` Tom Lendacky
2020-08-20 20:36                                 ` Andy Lutomirski
2020-08-20 22:05                                   ` Sean Christopherson
2020-08-20 22:07                                     ` Andy Lutomirski
2020-08-20 22:34                                       ` Sean Christopherson
2020-08-21  0:00                                         ` Tom Lendacky
2020-08-21  1:56                                           ` Sean Christopherson
2020-08-20 18:43           ` Bae, Chang Seok
2020-08-20 13:43     ` Paolo Bonzini
2020-08-20 17:51       ` Andy Lutomirski
