From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 27DAAC433E0 for ; Thu, 11 Mar 2021 20:49:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D399364F8C for ; Thu, 11 Mar 2021 20:49:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230385AbhCKUtI (ORCPT ); Thu, 11 Mar 2021 15:49:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56826 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230231AbhCKUsw (ORCPT ); Thu, 11 Mar 2021 15:48:52 -0500 Received: from mail-io1-xd2e.google.com (mail-io1-xd2e.google.com [IPv6:2607:f8b0:4864:20::d2e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 712BBC061574 for ; Thu, 11 Mar 2021 12:48:52 -0800 (PST) Received: by mail-io1-xd2e.google.com with SMTP id u20so23390343iot.9 for ; Thu, 11 Mar 2021 12:48:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=UQoFxXZvc3gQJU1fgokCDoSWR9gpWKlvLUH6Gip9gtU=; b=r6M+mRbw9lUP1km3/UVJd2JjOV0mWTtKM+KCu+8cQtUYSpi22yudBBx/KvOiVYMOTv WO3mIGiUG8dzqjAyYFZWLoptOF89EXFtOJaPhGC+HFULzsq/Klu32FcabkMf/1Jtrlwj GC2wIdtv1TBf0ovUfXpvnEDsbXL/c1H/QdissdQryZqw+dlGBjOj33ARMsQmJkevugiK r8LuuY3FTHJcXJGNdJftHuCjCAJMy6BSFHrHxNiVpUxc/DK5EZWqrWGGP1zDHdbBAwIJ H0ceYOX9tZYlgW43u8kCHXvqrJkw4EGWBumGnazNIeMl6qltAL1eHnAQMFQn18yGEBIY D6ZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=UQoFxXZvc3gQJU1fgokCDoSWR9gpWKlvLUH6Gip9gtU=; b=Mysy9lVnOU2MMsG8f5e63U2dFhJs0CsnrpKOOOXuJtaczZ/1uM8Tkkbp3412ATrf/n cGEFqxQs22GrmYgLW+RGVKm1t661/09gbi8UEuVrS4FWQfnpw1OS1hNQDOaMdLoE8zTz ILf0q0I4dUeDfgYaOGW8S59i9FkDLcADSZTDVSjntN/j7EoVEbLxAnt0BcWQA4b1OrSZ dx7n/D+anDMTS4+1ACWdwUOhjpms+QF9Xq6KdsQ8wPmAv7I8JJvj+A3ZpEYXwQ4vSW4n maWgP96lcgLaryBKn4mI2oUxuvaZXhr0MRmB+Kk8zImKQGxednp11Gx6hhyTQ95zUPUL 20lw== X-Gm-Message-State: AOAM5327QgJyqrJ1qtMThXOaQCce5tSRAvPLXIG2dIFZTZbWbCqRRu3P VJGLVLg5ep69oaH0IRu5MOMreIHA3CMZFMMJ2nbEwg== X-Google-Smtp-Source: ABdhPJwCZFYBLDdLOOaYii/M1WFGGx42V2HsWWvLddvOZM58Lx4ax6jIB5Wf6yZzVo5L1EmWDw27q0QqF7skThRqZDs= X-Received: by 2002:a05:6638:1614:: with SMTP id x20mr5458728jas.19.1615495726921; Thu, 11 Mar 2021 12:48:46 -0800 (PST) MIME-Version: 1.0 References: <20210224175122.GA19661@ashkalra_ubuntu_server> <20210225202008.GA5208@ashkalra_ubuntu_server> <20210226140432.GB5950@ashkalra_ubuntu_server> <20210302145543.GA29994@ashkalra_ubuntu_server> <20210303185441.GA19944@willie-the-truck> <20210311181458.GA6650@ashkalra_ubuntu_server> In-Reply-To: <20210311181458.GA6650@ashkalra_ubuntu_server> From: Steve Rutherford Date: Thu, 11 Mar 2021 12:48:07 -0800 Message-ID: Subject: Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl To: Ashish Kalra Cc: Will Deacon , Sean Christopherson , "pbonzini@redhat.com" , "joro@8bytes.org" , "Lendacky, Thomas" , "kvm@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "venu.busireddy@oracle.com" , "Singh, Brijesh" , Quentin Perret , maz@kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org On Thu, Mar 11, 2021 at 10:15 AM Ashish Kalra wrote: > > On Wed, Mar 03, 2021 at 06:54:41PM +0000, Will Deacon wrote: > > [+Marc] > > > > On Tue, Mar 02, 2021 at 02:55:43PM +0000, Ashish Kalra wrote: > > > On Fri, Feb 26, 2021 at 09:44:41AM -0800, Sean Christopherson wrote: > > > > On Fri, Feb 26, 2021, Ashish Kalra wrote: > > > > > On Thu, Feb 25, 2021 at 02:59:27PM -0800, Steve Rutherford wrote: > > > > > > On Thu, Feb 25, 2021 at 12:20 PM Ashish Kalra wrote: > > > > > > Thanks for grabbing the data! > > > > > > > > > > > > I am fine with both paths. Sean has stated an explicit desire f= or > > > > > > hypercall exiting, so I think that would be the current consens= us. > > > > > > > > Yep, though it'd be good to get Paolo's input, too. > > > > > > > > > > If we want to do hypercall exiting, this should be in a follow-= up > > > > > > series where we implement something more generic, e.g. a hyperc= all > > > > > > exiting bitmap or hypercall exit list. If we are taking the hyp= ercall > > > > > > exit route, we can drop the kvm side of the hypercall. > > > > > > > > I don't think this is a good candidate for arbitrary hypercall inte= rception. Or > > > > rather, I think hypercall interception should be an orthogonal impl= ementation. > > > > > > > > The guest, including guest firmware, needs to be aware that the hyp= ercall is > > > > supported, and the ABI needs to be well-defined. Relying on usersp= ace VMMs to > > > > implement a common ABI is an unnecessary risk. > > > > > > > > We could make KVM's default behavior be a nop, i.e. have KVM enforc= e the ABI but > > > > require further VMM intervention. But, I just don't see the point,= it would > > > > save only a few lines of code. It would also limit what KVM could = do in the > > > > future, e.g. if KVM wanted to do its own bookkeeping _and_ exit to = userspace, > > > > then mandatory interception would essentially make it impossible fo= r KVM to do > > > > bookkeeping while still honoring the interception request. > > > > > > > > However, I do think it would make sense to have the userspace exit = be a generic > > > > exit type. But hey, we already have the necessary ABI defined for = that! It's > > > > just not used anywhere. > > > > > > > > /* KVM_EXIT_HYPERCALL */ > > > > struct { > > > > __u64 nr; > > > > __u64 args[6]; > > > > __u64 ret; > > > > __u32 longmode; > > > > __u32 pad; > > > > } hypercall; > > > > > > > > > > > > > > Userspace could also handle the MSR using MSR filters (would ne= ed to > > > > > > confirm that). Then userspace could also be in control of the = cpuid bit. > > > > > > > > An MSR is not a great fit; it's x86 specific and limited to 64 bits= of data. > > > > The data limitation could be fudged by shoving data into non-standa= rd GPRs, but > > > > that will result in truly heinous guest code, and extensibility iss= ues. > > > > > > > > The data limitation is a moot point, because the x86-only thing is = a deal > > > > breaker. arm64's pKVM work has a near-identical use case for a gue= st to share > > > > memory with a host. I can't think of a clever way to avoid having = to support > > > > TDX's and SNP's hypervisor-agnostic variants, but we can at least n= ot have > > > > multiple KVM variants. > > > > > > Looking at arm64's pKVM work, i see that it is a recently introduced = RFC > > > patch-set and probably relevant to arm64 nVHE hypervisor > > > mode/implementation, and potentially makes sense as it adds guest > > > memory protection as both host and guest kernels are running on the s= ame > > > privilege level ? > > > > > > Though i do see that the pKVM stuff adds two hypercalls, specifically= : > > > > > > pkvm_create_mappings() ( I assume this is for setting shared memory > > > regions between host and guest) & > > > pkvm_create_private_mappings(). > > > > > > And the use-cases are quite similar to memory protection architectues > > > use cases, for example, use with virtio devices, guest DMA I/O, etc. > > > > These hypercalls are both private to the host kernel communicating with > > its hypervisor counterpart, so I don't think they're particularly > > relevant here. As far as I can see, the more useful thing is to allow > > the guest to communicate back to the host (and the VMM) that it has ope= ned > > up a memory window, perhaps for virtio rings or some other shared memor= y. > > > > We hacked this up as a prototype in the past: > > > > https://nam11.safelinks.protection.outlook.com/?url=3Dhttps%3A%2F%2Fand= roid-kvm.googlesource.com%2Flinux%2F%2B%2Fd12a9e2c12a52cf7140d40cd9fa092dc8= a85fac9%255E%2521%2F&data=3D04%7C01%7Cashish.kalra%40amd.com%7C7ae6bbd9= fa6442f9edcc08d8de75d14b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C63750= 3944913839841%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiL= CJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=3DJuon5nJ7BB6moTWYssRXOWrDOr= YfZLmA%2BLrz3s12Ook%3D&reserved=3D0 > > > > but that's all arm64-specific and if we're solving the same problem as > > you, then let's avoid arch-specific stuff if possible. The way in which > > the guest discovers the interface will be arch-specific (we already hav= e > > a mechanism for that and some hypercalls are already allocated by specs > > from Arm), but the interface back to the VMM and some (most?) of the ho= st > > handling could be shared. > > > > I have started implementing a similar "hypercall to userspace" > functionality for these DMA_SHARE/DMA_UNSHARE type of interfaces > corresponding to SEV guest's add/remove shared regions on the x86 platfor= m. > > This does not implement a generic hypercall exiting infrastructure, > mainly extends the KVM hypercall support to return back to userspace > specifically for add/remove shared region hypercalls and then re-uses > the complete userspace I/O callback functionality to resume the guest > after returning back from userspace handling of the hypercall. > > Looking fwd. to any comments/feedback/thoughts on the above. Others have mentioned a lack of appetite for generic hypercall intercepts, so this is the right approach. Thanks, Steve