kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Hugh Dickins <hughd@google.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Hugh Dickins <hughd@google.com>, Babu Moger <babu.moger@amd.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Jim Mattson <jmattson@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	kvm list <kvm@vger.kernel.org>, Joerg Roedel <joro@8bytes.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>, "H . Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Makarand Sonare <makarandsonare@google.com>,
	Sean Christopherson <seanjc@google.com>
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support
Date: Wed, 24 Mar 2021 19:43:29 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LSU.2.11.2103241913190.10112@eggly.anvils> (raw)
In-Reply-To: <alpine.LSU.2.11.2103241651280.9593@eggly.anvils>

On Wed, 24 Mar 2021, Hugh Dickins wrote:
> On Wed, 24 Mar 2021, Borislav Petkov wrote:
> 
> > Ok,
> > 
> > some more experimenting Babu and I did lead us to:
> > 
> > ---
> > diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> > index f5ca15622dc9..259aa4889cad 100644
> > --- a/arch/x86/include/asm/tlbflush.h
> > +++ b/arch/x86/include/asm/tlbflush.h
> > @@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned long addr)
> >  	 */
> >  	if (kaiser_enabled)
> >  		invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> > +	else
> > +		asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> > +
> >  	invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> >  }
> > 
> > applied on the guest kernel which fixes the issue. And let me add Hugh
> > who did that PCID stuff at the time. So lemme summarize for Hugh and to
> > ask him nicely to sanity-check me. :-)
> 
> Just a brief interim note to assure you that I'm paying attention,
> but wow, it's a long time since I gave any thought down here!
> Trying to page it all back in...
> 
> I see no harm in your workaround if it works, but it's not as if
> this is a previously untried path: so I'm suspicious how an issue
> here with Globals could have gone unnoticed for so long, and need
> to understand it better.

Right, after looking into it more, I completely agree with you:
the Kaiser series (in both 4.4-stable and 4.9-stable) was simply
wrong to lose that invlpg - fine in the kaiser case when we don't
enable Globals at all, but plain wrong in the !kaiser_enabled case.
One way or another, we have somehow got away with it for three years.

I do agree with Paolo that the PCID_ASID_KERN flush would be better
moved under the "if (kaiser_enabled)" now. (And if this were ongoing
development, I'd want to rewrite the function altogether: but no,
these old stable trees are not the place for that.)

Boris, may I leave both -stable fixes to you?
Let me know if you'd prefer me to clean up my mess.

Thanks a lot for tracking this down,
Hugh

> > 
> > Basically, you have an AMD host which supports PCID and INVPCID and you
> > boot on it a 4.9 guest. It explodes like the panic below.
> > 
> > What fixes it is this:
> > 
> > diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> > index f5ca15622dc9..259aa4889cad 100644
> > --- a/arch/x86/include/asm/tlbflush.h
> > +++ b/arch/x86/include/asm/tlbflush.h
> > @@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned long addr)
> >  	 */
> >  	if (kaiser_enabled)
> >  		invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> > +	else
> > +		asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> > +
> >  	invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> >  }
> > 
> > ---
> > 
> > and the reason why it does, IMHO, is because on AMD, kaiser_enabled is
> > false because AMD is not affected by Meltdown, which means, there's no
> > user/kernel pagetables split.
> > 
> > And that also means, you have global TLB entries which means that if you
> > look at that __native_flush_tlb_single() function, it needs to flush
> > global TLB entries on CPUs with X86_FEATURE_INVPCID_SINGLE by doing an
> > INVLPG in the kaiser_enabled=0 case. Errgo, the above hunk.
> > 
> > But I might be completely off here thus this note...
> > 
> > Thoughts?
> > 
> > Thx.
> > 
> > 
> > [    1.235726] ------------[ cut here ]------------
> > [    1.237515] kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!
> > [    1.240926] invalid opcode: 0000 [#1] SMP
> > [    1.243301] Modules linked in:
> > [    1.244585] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 Debian 4.9.228-1
> > [    1.247657] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > [    1.251249] task: ffff909363e94040 task.stack: ffffa41bc0194000
> > [    1.253519] RIP: 0010:[<ffffffff8fa2e40c>]  [<ffffffff8fa2e40c>] text_poke+0x18c/0x240
> > [    1.256593] RSP: 0018:ffffa41bc0197d90  EFLAGS: 00010096
> > [    1.258657] RAX: 000000000000000f RBX: 0000000001020800 RCX: 00000000feda3203
> > [    1.261388] RDX: 00000000178bfbff RSI: 0000000000000000 RDI: ffffffffff57a000
> > [    1.264168] RBP: ffffffff8fbd3eca R08: 0000000000000000 R09: 0000000000000003
> > [    1.266983] R10: 0000000000000003 R11: 0000000000000112 R12: 0000000000000001
> > [    1.269702] R13: ffffa41bc0197dcf R14: 0000000000000286 R15: ffffed1c40407500
> > [    1.272572] FS:  0000000000000000(0000) GS:ffff909366300000(0000) knlGS:0000000000000000
> > [    1.275791] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [    1.278032] CR2: 0000000000000000 CR3: 0000000010c08000 CR4: 00000000003606f0
> > [    1.280815] Stack:
> > [    1.281630]  ffffffff8fbd3eca 0000000000000005 ffffa41bc0197e03 ffffffff8fbd3ecb
> > [    1.284660]  0000000000000000 0000000000000000 ffffffff8fa2e835 ccffffff8fad4326
> > [    1.287729]  1ccd0231874d55d3 ffffffff8fbd3eca ffffa41bc0197e03 ffffffff90203844
> > [    1.290852] Call Trace:
> > [    1.291782]  [<ffffffff8fbd3eca>] ? swap_entry_free+0x12a/0x300
> > [    1.294900]  [<ffffffff8fbd3ecb>] ? swap_entry_free+0x12b/0x300
> > [    1.297267]  [<ffffffff8fa2e835>] ? text_poke_bp+0x55/0xe0
> > [    1.299473]  [<ffffffff8fbd3eca>] ? swap_entry_free+0x12a/0x300
> > [    1.301896]  [<ffffffff8fa2b64c>] ? arch_jump_label_transform+0x9c/0x120
> > [    1.304557]  [<ffffffff9073e81f>] ? set_debug_rodata+0xc/0xc
> > [    1.306790]  [<ffffffff8fb81d92>] ? __jump_label_update+0x72/0x80
> > [    1.309255]  [<ffffffff8fb8206f>] ? static_key_slow_inc+0x8f/0xa0
> > [    1.311680]  [<ffffffff8fbd7a57>] ? frontswap_register_ops+0x107/0x1d0
> > [    1.314281]  [<ffffffff9077078c>] ? init_zswap+0x282/0x3f6
> > [    1.316547]  [<ffffffff9077050a>] ? init_frontswap+0x8c/0x8c
> > [    1.318784]  [<ffffffff8fa0223e>] ? do_one_initcall+0x4e/0x180
> > [    1.321067]  [<ffffffff9073e81f>] ? set_debug_rodata+0xc/0xc
> > [    1.323366]  [<ffffffff9073f08d>] ? kernel_init_freeable+0x16b/0x1ec
> > [    1.325873]  [<ffffffff90011d50>] ? rest_init+0x80/0x80
> > [    1.327989]  [<ffffffff90011d5a>] ? kernel_init+0xa/0x100
> > [    1.330092]  [<ffffffff9001f424>] ? ret_from_fork+0x44/0x70
> > [    1.332311] Code: 00 0f a2 4d 85 e4 74 4a 0f b6 45 00 41 38 45 00 75 19 31 c0 83 c0 01 48 63 d0 49 39 d4 76 33 41 0f b6 4c 15 00 38 4c 15 00 74 e9 <0f> 0b 48 89 ef e8 da d6 19 00 48 8d bd 00 10 00 00 48 89 c3 e8 
> > [    1.342818] RIP  [<ffffffff8fa2e40c>] text_poke+0x18c/0x240
> > [    1.345859]  RSP <ffffa41bc0197d90>
> > [    1.347285] ---[ end trace 0a1c5ab5eb16de89 ]---
> > [    1.349169] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> > [    1.349169] 
> > [    1.352885] Kernel Offset: 0xea00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > [    1.357039] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> > [    1.357039] 
> > 
> > 
> > -- 
> > Regards/Gruss,
> >     Boris.
> > 
> > https://people.kernel.org/tglx/notes-about-netiquette

  reply	other threads:[~2021-03-25  2:44 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-11 19:27 [PATCH v6 00/12] SVM cleanup and INVPCID feature support Babu Moger
2020-09-11 19:27 ` [PATCH v6 01/12] KVM: SVM: Introduce vmcb_(set_intercept/clr_intercept/_is_intercept) Babu Moger
2020-09-11 19:28 ` [PATCH v6 02/12] KVM: SVM: Change intercept_cr to generic intercepts Babu Moger
2020-09-11 19:28 ` [PATCH v6 03/12] KVM: SVM: Change intercept_dr " Babu Moger
2020-09-11 19:28 ` [PATCH v6 04/12] KVM: SVM: Modify intercept_exceptions " Babu Moger
2020-09-12 16:52   ` Paolo Bonzini
2020-09-14 15:06     ` Sean Christopherson
2020-09-22 13:39       ` Paolo Bonzini
2020-09-22 19:11         ` Babu Moger
2020-09-23  2:43           ` Paolo Bonzini
2020-09-23 13:35             ` Babu Moger
2020-09-11 19:28 ` [PATCH v6 05/12] KVM: SVM: Modify 64 bit intercept field to two 32 bit vectors Babu Moger
2020-09-11 19:28 ` [PATCH v6 06/12] KVM: SVM: Add new intercept vector in vmcb_control_area Babu Moger
2020-09-11 19:28 ` [PATCH v6 07/12] KVM: nSVM: Cleanup nested_state data structure Babu Moger
2020-09-11 19:28 ` [PATCH v6 08/12] KVM: SVM: Remove set_cr_intercept, clr_cr_intercept and is_cr_intercept Babu Moger
2020-09-11 19:28 ` [PATCH v6 09/12] KVM: SVM: Remove set_exception_intercept and clr_exception_intercept Babu Moger
2020-09-11 19:29 ` [PATCH v6 10/12] KVM: X86: Rename and move the function vmx_handle_memory_failure to x86.c Babu Moger
2020-09-11 19:29 ` [PATCH v6 11/12] KVM: X86: Move handling of INVPCID types to x86 Babu Moger
2020-09-11 19:29 ` [PATCH v6 12/12] KVM:SVM: Enable INVPCID feature on AMD Babu Moger
2020-09-12 17:08 ` [PATCH v6 00/12] SVM cleanup and INVPCID feature support Paolo Bonzini
2020-09-14 15:05   ` Sean Christopherson
2020-09-14 18:33   ` Babu Moger
2021-01-19 23:01     ` Jim Mattson
2021-01-19 23:45       ` Babu Moger
2021-01-20 21:14         ` Jim Mattson
2021-01-20 21:45           ` Babu Moger
2021-01-21  3:10             ` Babu Moger
2021-01-21 23:51               ` Babu Moger
2021-01-23  1:52                 ` Babu Moger
2021-02-24  0:13                   ` Jim Mattson
2021-02-24 22:17                     ` Babu Moger
2021-03-10  1:04                       ` Babu Moger
2021-03-10  9:08                         ` Paolo Bonzini
2021-03-10 14:55                           ` Babu Moger
2021-03-10 14:58                             ` Babu Moger
2021-03-10 15:31                               ` Paolo Bonzini
2021-03-11  1:21                                 ` Babu Moger
2021-03-11 20:07                                   ` Borislav Petkov
2021-03-11 20:32                                     ` Borislav Petkov
2021-03-11 20:57                                       ` Babu Moger
2021-03-11 21:40                                         ` Borislav Petkov
2021-03-11 22:04                                           ` Babu Moger
2021-03-11 22:15                                             ` Babu Moger
2021-03-11 23:52                                               ` Borislav Petkov
2021-03-12 16:12                                                 ` Babu Moger
2021-03-24 21:21                                                   ` Borislav Petkov
2021-03-24 21:59                                                     ` Paolo Bonzini
2021-03-25  0:05                                                     ` Hugh Dickins
2021-03-25  2:43                                                       ` Hugh Dickins [this message]
2021-03-25  9:56                                                         ` Borislav Petkov
2021-03-25 10:29                                                           ` [PATCH] x86/tlb: Flush global mappings when KAISER is disabled Borislav Petkov
2021-03-25 10:52                                                             ` Paolo Bonzini
2021-03-25 15:13                                                             ` Babu Moger
2021-03-25 16:33                                                             ` Hugh Dickins
2021-03-25 19:00                                                               ` Jim Mattson
2021-03-25 20:09                                                             ` Borislav Petkov
2021-03-25 20:36                                                               ` Sasha Levin
2021-03-25 23:19                                                                 ` Sasha Levin
2021-03-25 23:56                                                                   ` Ben Hutchings
2021-03-11 21:23                                       ` [PATCH v6 00/12] SVM cleanup and INVPCID feature support Jim Mattson
2021-03-11 21:36                                         ` Borislav Petkov
2021-03-11 21:50                                           ` Babu Moger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.11.2103241913190.10112@eggly.anvils \
    --to=hughd@google.com \
    --cc=babu.moger@amd.com \
    --cc=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=makarandsonare@google.com \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).