All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hugh Dickins <hughd@google.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Hugh Dickins <hughd@google.com>, Babu Moger <babu.moger@amd.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Jim Mattson <jmattson@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	kvm list <kvm@vger.kernel.org>, Joerg Roedel <joro@8bytes.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>, "H . Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Makarand Sonare <makarandsonare@google.com>,
	Sean Christopherson <seanjc@google.com>
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support
Date: Wed, 24 Mar 2021 19:43:29 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LSU.2.11.2103241913190.10112@eggly.anvils> (raw)
In-Reply-To: <alpine.LSU.2.11.2103241651280.9593@eggly.anvils>

On Wed, 24 Mar 2021, Hugh Dickins wrote:
> On Wed, 24 Mar 2021, Borislav Petkov wrote:
> 
> > Ok,
> > 
> > some more experimenting Babu and I did lead us to:
> > 
> > ---
> > diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> > index f5ca15622dc9..259aa4889cad 100644
> > --- a/arch/x86/include/asm/tlbflush.h
> > +++ b/arch/x86/include/asm/tlbflush.h
> > @@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned long addr)
> >  	 */
> >  	if (kaiser_enabled)
> >  		invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> > +	else
> > +		asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> > +
> >  	invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> >  }
> > 
> > applied on the guest kernel which fixes the issue. And let me add Hugh
> > who did that PCID stuff at the time. So lemme summarize for Hugh and to
> > ask him nicely to sanity-check me. :-)
> 
> Just a brief interim note to assure you that I'm paying attention,
> but wow, it's a long time since I gave any thought down here!
> Trying to page it all back in...
> 
> I see no harm in your workaround if it works, but it's not as if
> this is a previously untried path: so I'm suspicious how an issue
> here with Globals could have gone unnoticed for so long, and need
> to understand it better.

Right, after looking into it more, I completely agree with you:
the Kaiser series (in both 4.4-stable and 4.9-stable) was simply
wrong to lose that invlpg - fine in the kaiser case when we don't
enable Globals at all, but plain wrong in the !kaiser_enabled case.
One way or another, we have somehow got away with it for three years.

I do agree with Paolo that the PCID_ASID_KERN flush would be better
moved under the "if (kaiser_enabled)" now. (And if this were ongoing
development, I'd want to rewrite the function altogether: but no,
these old stable trees are not the place for that.)

Boris, may I leave both -stable fixes to you?
Let me know if you'd prefer me to clean up my mess.

Thanks a lot for tracking this down,
Hugh

> > 
> > Basically, you have an AMD host which supports PCID and INVPCID and you
> > boot on it a 4.9 guest. It explodes like the panic below.
> > 
> > What fixes it is this:
> > 
> > diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> > index f5ca15622dc9..259aa4889cad 100644
> > --- a/arch/x86/include/asm/tlbflush.h
> > +++ b/arch/x86/include/asm/tlbflush.h
> > @@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned long addr)
> >  	 */
> >  	if (kaiser_enabled)
> >  		invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> > +	else
> > +		asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> > +
> >  	invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> >  }
> > 
> > ---
> > 
> > and the reason why it does, IMHO, is because on AMD, kaiser_enabled is
> > false because AMD is not affected by Meltdown, which means, there's no
> > user/kernel pagetables split.
> > 
> > And that also means, you have global TLB entries which means that if you
> > look at that __native_flush_tlb_single() function, it needs to flush
> > global TLB entries on CPUs with X86_FEATURE_INVPCID_SINGLE by doing an
> > INVLPG in the kaiser_enabled=0 case. Errgo, the above hunk.
> > 
> > But I might be completely off here thus this note...
> > 
> > Thoughts?
> > 
> > Thx.
> > 
> > 
> > [    1.235726] ------------[ cut here ]------------
> > [    1.237515] kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!
> > [    1.240926] invalid opcode: 0000 [#1] SMP
> > [    1.243301] Modules linked in:
> > [    1.244585] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 Debian 4.9.228-1
> > [    1.247657] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > [    1.251249] task: ffff909363e94040 task.stack: ffffa41bc0194000
> > [    1.253519] RIP: 0010:[<ffffffff8fa2e40c>]  [<ffffffff8fa2e40c>] text_poke+0x18c/0x240
> > [    1.256593] RSP: 0018:ffffa41bc0197d90  EFLAGS: 00010096
> > [    1.258657] RAX: 000000000000000f RBX: 0000000001020800 RCX: 00000000feda3203
> > [    1.261388] RDX: 00000000178bfbff RSI: 0000000000000000 RDI: ffffffffff57a000
> > [    1.264168] RBP: ffffffff8fbd3eca R08: 0000000000000000 R09: 0000000000000003
> > [    1.266983] R10: 0000000000000003 R11: 0000000000000112 R12: 0000000000000001
> > [    1.269702] R13: ffffa41bc0197dcf R14: 0000000000000286 R15: ffffed1c40407500
> > [    1.272572] FS:  0000000000000000(0000) GS:ffff909366300000(0000) knlGS:0000000000000000
> > [    1.275791] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [    1.278032] CR2: 0000000000000000 CR3: 0000000010c08000 CR4: 00000000003606f0
> > [    1.280815] Stack:
> > [    1.281630]  ffffffff8fbd3eca 0000000000000005 ffffa41bc0197e03 ffffffff8fbd3ecb
> > [    1.284660]  0000000000000000 0000000000000000 ffffffff8fa2e835 ccffffff8fad4326
> > [    1.287729]  1ccd0231874d55d3 ffffffff8fbd3eca ffffa41bc0197e03 ffffffff90203844
> > [    1.290852] Call Trace:
> > [    1.291782]  [<ffffffff8fbd3eca>] ? swap_entry_free+0x12a/0x300
> > [    1.294900]  [<ffffffff8fbd3ecb>] ? swap_entry_free+0x12b/0x300
> > [    1.297267]  [<ffffffff8fa2e835>] ? text_poke_bp+0x55/0xe0
> > [    1.299473]  [<ffffffff8fbd3eca>] ? swap_entry_free+0x12a/0x300
> > [    1.301896]  [<ffffffff8fa2b64c>] ? arch_jump_label_transform+0x9c/0x120
> > [    1.304557]  [<ffffffff9073e81f>] ? set_debug_rodata+0xc/0xc
> > [    1.306790]  [<ffffffff8fb81d92>] ? __jump_label_update+0x72/0x80
> > [    1.309255]  [<ffffffff8fb8206f>] ? static_key_slow_inc+0x8f/0xa0
> > [    1.311680]  [<ffffffff8fbd7a57>] ? frontswap_register_ops+0x107/0x1d0
> > [    1.314281]  [<ffffffff9077078c>] ? init_zswap+0x282/0x3f6
> > [    1.316547]  [<ffffffff9077050a>] ? init_frontswap+0x8c/0x8c
> > [    1.318784]  [<ffffffff8fa0223e>] ? do_one_initcall+0x4e/0x180
> > [    1.321067]  [<ffffffff9073e81f>] ? set_debug_rodata+0xc/0xc
> > [    1.323366]  [<ffffffff9073f08d>] ? kernel_init_freeable+0x16b/0x1ec
> > [    1.325873]  [<ffffffff90011d50>] ? rest_init+0x80/0x80
> > [    1.327989]  [<ffffffff90011d5a>] ? kernel_init+0xa/0x100
> > [    1.330092]  [<ffffffff9001f424>] ? ret_from_fork+0x44/0x70
> > [    1.332311] Code: 00 0f a2 4d 85 e4 74 4a 0f b6 45 00 41 38 45 00 75 19 31 c0 83 c0 01 48 63 d0 49 39 d4 76 33 41 0f b6 4c 15 00 38 4c 15 00 74 e9 <0f> 0b 48 89 ef e8 da d6 19 00 48 8d bd 00 10 00 00 48 89 c3 e8 
> > [    1.342818] RIP  [<ffffffff8fa2e40c>] text_poke+0x18c/0x240
> > [    1.345859]  RSP <ffffa41bc0197d90>
> > [    1.347285] ---[ end trace 0a1c5ab5eb16de89 ]---
> > [    1.349169] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> > [    1.349169] 
> > [    1.352885] Kernel Offset: 0xea00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > [    1.357039] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> > [    1.357039] 
> > 
> > 
> > -- 
> > Regards/Gruss,
> >     Boris.
> > 
> > https://people.kernel.org/tglx/notes-about-netiquette

  reply	other threads:[~2021-03-25  2:44 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-11 19:27 [PATCH v6 00/12] SVM cleanup and INVPCID feature support Babu Moger
2020-09-11 19:27 ` [PATCH v6 01/12] KVM: SVM: Introduce vmcb_(set_intercept/clr_intercept/_is_intercept) Babu Moger
2020-09-11 19:28 ` [PATCH v6 02/12] KVM: SVM: Change intercept_cr to generic intercepts Babu Moger
2020-09-11 19:28 ` [PATCH v6 03/12] KVM: SVM: Change intercept_dr " Babu Moger
2020-09-11 19:28 ` [PATCH v6 04/12] KVM: SVM: Modify intercept_exceptions " Babu Moger
2020-09-12 16:52   ` Paolo Bonzini
2020-09-14 15:06     ` Sean Christopherson
2020-09-22 13:39       ` Paolo Bonzini
2020-09-22 19:11         ` Babu Moger
2020-09-23  2:43           ` Paolo Bonzini
2020-09-23 13:35             ` Babu Moger
2020-09-11 19:28 ` [PATCH v6 05/12] KVM: SVM: Modify 64 bit intercept field to two 32 bit vectors Babu Moger
2020-09-11 19:28 ` [PATCH v6 06/12] KVM: SVM: Add new intercept vector in vmcb_control_area Babu Moger
2020-09-11 19:28 ` [PATCH v6 07/12] KVM: nSVM: Cleanup nested_state data structure Babu Moger
2020-09-11 19:28 ` [PATCH v6 08/12] KVM: SVM: Remove set_cr_intercept, clr_cr_intercept and is_cr_intercept Babu Moger
2020-09-11 19:28 ` [PATCH v6 09/12] KVM: SVM: Remove set_exception_intercept and clr_exception_intercept Babu Moger
2020-09-11 19:29 ` [PATCH v6 10/12] KVM: X86: Rename and move the function vmx_handle_memory_failure to x86.c Babu Moger
2020-09-11 19:29 ` [PATCH v6 11/12] KVM: X86: Move handling of INVPCID types to x86 Babu Moger
2020-09-11 19:29 ` [PATCH v6 12/12] KVM:SVM: Enable INVPCID feature on AMD Babu Moger
2020-09-12 17:08 ` [PATCH v6 00/12] SVM cleanup and INVPCID feature support Paolo Bonzini
2020-09-14 15:05   ` Sean Christopherson
2020-09-14 18:33   ` Babu Moger
2021-01-19 23:01     ` Jim Mattson
2021-01-19 23:45       ` Babu Moger
2021-01-20 21:14         ` Jim Mattson
2021-01-20 21:45           ` Babu Moger
2021-01-21  3:10             ` Babu Moger
2021-01-21 23:51               ` Babu Moger
2021-01-23  1:52                 ` Babu Moger
2021-02-24  0:13                   ` Jim Mattson
2021-02-24 22:17                     ` Babu Moger
2021-03-10  1:04                       ` Babu Moger
2021-03-10  9:08                         ` Paolo Bonzini
2021-03-10 14:55                           ` Babu Moger
2021-03-10 14:58                             ` Babu Moger
2021-03-10 15:31                               ` Paolo Bonzini
2021-03-11  1:21                                 ` Babu Moger
2021-03-11 20:07                                   ` Borislav Petkov
2021-03-11 20:32                                     ` Borislav Petkov
2021-03-11 20:57                                       ` Babu Moger
2021-03-11 21:40                                         ` Borislav Petkov
2021-03-11 22:04                                           ` Babu Moger
2021-03-11 22:15                                             ` Babu Moger
2021-03-11 23:52                                               ` Borislav Petkov
2021-03-12 14:53                                                 ` Babu Moger
2021-03-12 16:12                                                 ` Babu Moger
2021-03-24 21:21                                                   ` Borislav Petkov
2021-03-24 21:59                                                     ` Paolo Bonzini
2021-03-25  0:05                                                     ` Hugh Dickins
2021-03-25  2:43                                                       ` Hugh Dickins [this message]
2021-03-25  9:56                                                         ` Borislav Petkov
2021-03-25 10:29                                                           ` [PATCH] x86/tlb: Flush global mappings when KAISER is disabled Borislav Petkov
2021-03-25 10:52                                                             ` Paolo Bonzini
2021-03-25 15:13                                                             ` Babu Moger
2021-03-25 16:33                                                             ` Hugh Dickins
2021-03-25 19:00                                                               ` Jim Mattson
2021-03-25 20:09                                                             ` Borislav Petkov
2021-03-25 20:36                                                               ` Sasha Levin
2021-03-25 23:19                                                                 ` Sasha Levin
2021-03-25 23:56                                                                   ` Ben Hutchings
2021-03-11 21:23                                       ` [PATCH v6 00/12] SVM cleanup and INVPCID feature support Jim Mattson
2021-03-11 21:36                                         ` Borislav Petkov
2021-03-11 21:50                                           ` Babu Moger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.11.2103241913190.10112@eggly.anvils \
    --to=hughd@google.com \
    --cc=babu.moger@amd.com \
    --cc=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=makarandsonare@google.com \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.