From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2686C433B4 for ; Tue, 11 May 2021 16:10:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8651B616EA for ; Tue, 11 May 2021 16:10:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230505AbhEKQLE (ORCPT ); Tue, 11 May 2021 12:11:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43480 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230484AbhEKQK6 (ORCPT ); Tue, 11 May 2021 12:10:58 -0400 Received: from mail-pg1-x531.google.com (mail-pg1-x531.google.com [IPv6:2607:f8b0:4864:20::531]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4D7E1C061574 for ; Tue, 11 May 2021 09:09:52 -0700 (PDT) Received: by mail-pg1-x531.google.com with SMTP id q15so11846065pgg.12 for ; Tue, 11 May 2021 09:09:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=IjrFFPJC53lebLPOH5QRUVXmz4ihOhkij0I7QzCDBI0=; b=XcGb4jGjlJ6JO4vxlkNoZxxV11pXvU+oFMTV1r5oyRYzZo8CUqM0TxvcFJPRKg5lVf xFA6Eu8mHcLmEu3dplxBuNpOTarryk+zCnR0rOpd+YVIBuXQmYxs2IDA3byQmFrJQafI yMlWWBA7M/GOuE06gP0n4Seifx2RFiE+aztMDMkSy+wFaJuLr1QVK2/8bT7JKs0+KMmj vByvezV2m79Gi5LZfYDzv2xotK+3Hx+UhggOogV05Pm41AEZBD1RViip+bbidHfjLx61 Bjjnmg/z226fxULvxCFRlpvyl8MWreccXUXvb1npKvoXVjEh2MeQtTF83cAg5gBz9yCq OmLg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=IjrFFPJC53lebLPOH5QRUVXmz4ihOhkij0I7QzCDBI0=; b=nJNwOjmTH+tUfLc9CA2cbyego4EOttfpAw+Ujtw04NA25hE2+s5dSKazGeJDtS3rLR GCcxQNRA3a7Yng/lO+Zi/BYldzDlih4+skPdDoPbhAkf1B0desDXsSY0R8JYMUabfpJz Gjxzuu0ZTVOwYfaOUfCrBZSAfHR14HlatM8k1a1RlWr2OIH3lMtPUE/xOIHQmtUj6eot TpfOw9BrrLxxDlOEf8OJY1JZxaxErbuCe1TnhGydyDmIPtQCJEdvF+azq6O+ydxzEm2/ usnPVKHIqiEB7mB94c4gMboIYv/Oi+SIHjBzeeIMmR64c7b/ogUtiedsJ6YV0Y43V8Ew 2r6Q== X-Gm-Message-State: AOAM530CCjUwxF+FeSurS2r4j5YgrUhsB0G5pYUkVEDD6dJOwI4knj9H wr+cxbftVB6GspGsueT7xwXsbw== X-Google-Smtp-Source: ABdhPJzmmsIpuYEnKafc4HRT84mFXJz/TW3XfLuQ6ixT3txxj4R5VRjY5pgVYaRIjUQ0cu1nHQoCjQ== X-Received: by 2002:a62:30c2:0:b029:289:116c:ec81 with SMTP id w185-20020a6230c20000b0290289116cec81mr31331906pfw.42.1620749391561; Tue, 11 May 2021 09:09:51 -0700 (PDT) Received: from google.com (240.111.247.35.bc.googleusercontent.com. [35.247.111.240]) by smtp.gmail.com with ESMTPSA id i3sm2468601pjv.30.2021.05.11.09.09.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 May 2021 09:09:51 -0700 (PDT) Date: Tue, 11 May 2021 16:09:45 +0000 From: Sean Christopherson To: Dave Hansen Cc: Dan Williams , Kuppuswamy Sathyanarayanan , Peter Zijlstra , Andy Lutomirski , Tony Luck , Andi Kleen , Kirill Shutemov , Kuppuswamy Sathyanarayanan , Raj Ashok , Linux Kernel Mailing List Subject: Re: [RFC v2 16/32] x86/tdx: Handle MWAIT, MONITOR and WBINVD Message-ID: References: <22d56f70-c69c-b3d2-51d6-962abdbc2882@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <22d56f70-c69c-b3d2-51d6-962abdbc2882@intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, May 11, 2021, Dave Hansen wrote: > On 5/10/21 6:23 PM, Dan Williams wrote: > >> To prevent TD guest from using MWAIT/MONITOR instructions, > >> support for these instructions are already disabled by TDX > >> module (SEAM). So CPUID flags for these instructions should > >> be in disabled state. > > Why does this not result in a #UD if the instruction is disabled by > > SEAM? How is it possible to execute a disabled instruction (one > > precluded by CPUID) to the point where it triggers #VE instead of #UD? > > This is actually a vestige of VMX. It's quite possible toady to have a > feature which isn't enumerated in CPUID which still exists and "works" > in the silicon. No, virtualization holes are something else entirely. MONITOR/MWAIT are a bit weird; they do have an enable bit in IA32_MISC_ENABLE, but most VMMs don't context switch IA32_MISC_ENABLE (load guest value on entry, load host value on exit) because that would add ~250 cycles to every host<->guest transition. And IA32_MISC_ENABLE is shared between SMT siblings, which further complicates loading the guest's value into hardware. In the end, it's easier to leave MONITOR/MWAIT enabled in hardware and instead force a VM-Exit. As for why TDX injects #VE instead of #UD, I suspect it's for the same reason that KVM emulates MONITOR/MWAIT as nops instead of injecting a #UD. The CPUID bit for MONITOR/MWAIT reflects their enabling in IA32_MISC_ENABLE, not raw support in hardware. That means there's no definitive way to enumerate to BIOS that MONITOR/MWAIT are not supported, e.g. AFAICT, EDKII blindly assumes it can enable MONITOR/MWAIT in IA32_MISC_ENABLE. To justify #UD instead of #VE, TDX would have to inject #GP on WRMSR to set IA32_MISC_ENABLE.ENABLE_MONITOR, and even then there would be weirdness with respect to VMM behavior in response to TDVMCALL(WRMSR) since the VMM could allow the virtual write. In the end, it's again simpler to inject #VE. > There are all kinds of pitfalls to doing this, but folks evidently do it in > public clouds all the time. Virtualization holes are when instructions/features are enumerated via CPUID, but don't have a control to hide the feature from the guest (or in the case of CET, multiple feature are buried behind a single control). So even if the VMM hides the feature via CPUID, the guest can still _cleanly_ execute the instruction if it's supported by the underlying hardware. > The CPUID virtualization basically just traps into the hypervisor and > lets the hypervisor set whatever register values it wants to appear when > CPUID "returns". > > But, the controls for what instructions generate #UD are actually quite > separate and unrelated to CPUID itself. Eh, any sane VMM will accurately represent its virtual CPU model via CPUID insofar as possible, there are just too many creaky corners in x86 to make things 100% bombproof.