Date: Mon, 18 Apr 2022
 17:25:05 +0800
From: Chao Gao
To: Sean Christopherson
Cc: Zeng Guang, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
 Joerg Roedel, kvm@vger.kernel.org, Dave Hansen, Tony Luck, Kan Liang,
 Thomas Gleixner, Ingo Molnar, Borislav Petkov, "H. Peter Anvin",
 Kim Phillips, Jarkko Sakkinen, Jethro Beekman, Kai Huang, x86@kernel.org,
 linux-kernel@vger.kernel.org, Robert Hu
Subject: Re: [PATCH v8 9/9] KVM: VMX: enable IPI virtualization
Message-ID: <20220418092500.GA14409@gao-cwp>
References: <20220411090447.5928-1-guang.zeng@intel.com>
 <20220411090447.5928-10-guang.zeng@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.9.4 (2018-02-28)
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 15, 2022 at 03:25:06PM +0000, Sean Christopherson wrote:
>On Mon, Apr 11, 2022, Zeng Guang wrote:
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index d1a39285deab..23fbf52f7bea 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -11180,11 +11180,15 @@ static int sync_regs(struct kvm_vcpu *vcpu)
>>
>>  int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id)
>>  {
>> +	int ret = 0;
>> +
>>  	if (kvm_check_tsc_unstable() && atomic_read(&kvm->online_vcpus) != 0)
>>  		pr_warn_once("kvm: SMP vm created on host with unstable TSC; "
>>  			     "guest TSC will not be reliable\n");
>>
>> -	return 0;
>> +	if (kvm_x86_ops.alloc_ipiv_pid_table)
>> +		ret = static_call(kvm_x86_alloc_ipiv_pid_table)(kvm);
>
>Add a generic kvm_x86_ops.vcpu_precreate, no reason to make this so specific.
>And use KVM_X86_OP_RET0 instead of KVM_X86_OP_OPTIONAL, then this can simply be
>
>	return static_call(kvm_x86_vcpu_precreate);
>
>That said, there's a flaw in my genius plan.
>
>  1. KVM_CREATE_VM
>  2. KVM_CAP_MAX_VCPU_ID, set max_vcpu_ids=1
>  3. KVM_CREATE_VCPU, create IPIv table but ultimately fails
>  4. KVM decrements created_vcpus back to '0'
>  5. KVM_CAP_MAX_VCPU_ID, set max_vcpu_ids=4096
>  6. KVM_CREATE_VCPU w/ ID out of range
>
>In other words, malicious userspace could trigger buffer overflow.

Can we simply return an error (e.g., -EEXIST) on step 5 (i.e.,
max_vcpu_ids cannot be changed after being set once)?

Or can we detect the change of max_vcpu_ids in step 6 and re-allocate
the PID table?

>
>That could be solved by adding an arch hook to undo precreate, but that's gross
>and a good indication that we're trying to solve this the wrong way.
>
>I think it's high time we add KVM_FINALIZE_VM, though that's probably a bad name
>since e.g. TDX wants to use that name for VM really, really, being finalized[*],
>i.e. after all vCPUs have been created.
>
>KVM_POST_CREATE_VM? That's not very good either.
>
>Paolo or anyone else, thoughts?
>
>[*] https://lore.kernel.org/all/83768bf0f786d24f49d9b698a45ba65441ef5ef0.1646422845.git.isaku.yamahata@intel.com
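
For illustration only, the first option in the reply (reject a second
KVM_CAP_MAX_VCPU_ID with -EEXIST) can be modeled in plain userspace C.
This is a rough sketch, not KVM code: struct vm_model,
vm_set_max_vcpu_ids() and vm_create_vcpu() are hypothetical names made
up for this example, the table is just a stand-in for the IPIv PID
table, and the model assumes the limit must be set before any vCPU is
created.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace model of the six-step sequence; not KVM code. */
struct vm_model {
	unsigned int max_vcpu_ids;	/* 0 means "not set yet" in this model */
	unsigned int created_vcpus;
	unsigned long *pid_table;	/* stand-in for the IPIv PID table */
};

/* Models KVM_CAP_MAX_VCPU_ID with the "set once" rule (steps 2 and 5). */
static int vm_set_max_vcpu_ids(struct vm_model *vm, unsigned int n)
{
	if (n == 0)
		return -EINVAL;
	if (vm->max_vcpu_ids)
		return -EEXIST;	/* already set: later changes are refused */
	vm->max_vcpu_ids = n;
	return 0;
}

/*
 * Models KVM_CREATE_VCPU (steps 3, 4 and 6). fail_late simulates a
 * creation that fails after the table has been allocated.
 */
static int vm_create_vcpu(struct vm_model *vm, unsigned int id, int fail_late)
{
	if (id >= vm->max_vcpu_ids)
		return -EINVAL;

	/* The table is sized once, from the limit in force at that time. */
	if (!vm->pid_table) {
		vm->pid_table = calloc(vm->max_vcpu_ids, sizeof(*vm->pid_table));
		if (!vm->pid_table)
			return -ENOMEM;
	}

	vm->created_vcpus++;
	if (fail_late) {
		/* The count is rolled back, but the table stays allocated. */
		vm->created_vcpus--;
		return -ENOMEM;
	}

	/* Safe only because max_vcpu_ids cannot grow after allocation. */
	assert(id < vm->max_vcpu_ids);
	vm->pid_table[id] = 1;
	return 0;
}
```

Replaying the sequence against this model (set the limit to 1, fail one
vCPU creation, try to raise the limit to 4096, then create vCPU 4095),
the second vm_set_max_vcpu_ids() call returns -EEXIST, so the later
create is rejected with -EINVAL instead of indexing past the
one-entry table.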