From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C13D3C00144 for ; Fri, 29 Jul 2022 19:51:36 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 320356B0072; Fri, 29 Jul 2022 15:51:36 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2CDE66B0073; Fri, 29 Jul 2022 15:51:36 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1BD748E0001; Fri, 29 Jul 2022 15:51:36 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 0E38B6B0072 for ; Fri, 29 Jul 2022 15:51:36 -0400 (EDT) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id DD3CF1601DB for ; Fri, 29 Jul 2022 19:51:35 +0000 (UTC) X-FDA: 79741182150.10.069279B Received: from mail-pl1-f179.google.com (mail-pl1-f179.google.com [209.85.214.179]) by imf09.hostedemail.com (Postfix) with ESMTP id 807E71400DC for ; Fri, 29 Jul 2022 19:51:35 +0000 (UTC) Received: by mail-pl1-f179.google.com with SMTP id iw1so5499902plb.6 for ; Fri, 29 Jul 2022 12:51:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc; bh=Dstciygn1YsoqcejBVc0D+MFf2Th1tfQSWlbVTNdSdk=; b=OrhOLugCM8Cj6h/411/1G13LNCne89tLk5CN5RFMmNJSVoJNIumYmTLvlzznytVtNN +65DB/Lckye6fIPxvoZypOO8+mrNMPNNKas4vW7cgtVmxrweDNwCISCeiu0QCKuS2RZ6 ttaT5y9tC05NWhrlcjtdCsb6bWH7km4UaVw4OedOk2ltG35dSvYrKBj7OmyLILdt3Upf Iy9drzJziV1Aq+PbxPImfRw3PlkeRVsHRBoYCrTRvuKjXPhLz/a00+HdWIEf7gUQxdPD LSVaC821KNqpxBCP0G4SRoMSzSCEBj/43FxcrRa86Ms3qiqAbpkt93nq+xufg884dyHM 3E1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc; bh=Dstciygn1YsoqcejBVc0D+MFf2Th1tfQSWlbVTNdSdk=; b=Y89LV/1rUajdgZYZw44UIQ/N96/L8Kprqs7XNQKTCW9wuRAUIJhvdd+qUPVnIik6Ly mwTbRXs95swET5ari39/n/7PmHYbnYPt6uWoD2wJZc++Q89uhpVAHiAMz7LCYdbJXxrq fG22p5LR7+oPETeCxhBGuoFX5mqpJ2+AcFwRimg22L8xNVg/6SUrA4XxGkSHUbRP0p0b 5EBY9I2DAO8tyLC9bVV/5fwhEDpWkLi8qCXI6ewIH00ijhCR9oculoKWiSaUMcINGS+/ IJJt87I0iiOdhUmp5UpmEwTS7JeuTcvG5xx1HV9fIzCs5ImegEm1Pe2S/DSmo6nuqGlO FlDQ== X-Gm-Message-State: ACgBeo0M2sLfm6MNgzMnv2+AiY0Xp4WDju4i2Dxa8u6ZjDPOuVsQwTU8 RzEeUsCLheBCaAI0Kihj/heQOw== X-Google-Smtp-Source: AA6agR63sDQRf5W5vFXYY3BbXk7I5lysrOCYAyeVjEMh41/Jj9OO8QiEjQjnyE/PUlhIQQOKevQ85Q== X-Received: by 2002:a17:902:7e47:b0:16c:7115:84d6 with SMTP id a7-20020a1709027e4700b0016c711584d6mr5524717pln.93.1659124294027; Fri, 29 Jul 2022 12:51:34 -0700 (PDT) Received: from google.com (7.104.168.34.bc.googleusercontent.com. [34.168.104.7]) by smtp.gmail.com with ESMTPSA id u5-20020a170902714500b0016c574aa0fdsm4057346plm.76.2022.07.29.12.51.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 29 Jul 2022 12:51:33 -0700 (PDT) Date: Fri, 29 Jul 2022 19:51:29 +0000 From: Sean Christopherson To: Chao Peng Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org, linux-doc@vger.kernel.org, qemu-devel@nongnu.org, linux-kselftest@vger.kernel.org, Paolo Bonzini , Jonathan Corbet , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H . Peter Anvin" , Hugh Dickins , Jeff Layton , "J . Bruce Fields" , Andrew Morton , Shuah Khan , Mike Rapoport , Steven Price , "Maciej S . Szmigiero" , Vlastimil Babka , Vishal Annapurve , Yu Zhang , "Kirill A . Shutemov" , luto@kernel.org, jun.nakajima@intel.com, dave.hansen@intel.com, ak@linux.intel.com, david@redhat.com, aarcange@redhat.com, ddutile@redhat.com, dhildenb@redhat.com, Quentin Perret , Michael Roth , mhocko@suse.com, Muchun Song Subject: Re: [PATCH v7 09/14] KVM: Extend the memslot to support fd-based private memory Message-ID: References: <20220706082016.2603916-1-chao.p.peng@linux.intel.com> <20220706082016.2603916-10-chao.p.peng@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20220706082016.2603916-10-chao.p.peng@linux.intel.com> ARC-Authentication-Results: i=1; imf09.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=OrhOLugC; spf=pass (imf09.hostedemail.com: domain of seanjc@google.com designates 209.85.214.179 as permitted sender) smtp.mailfrom=seanjc@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1659124295; a=rsa-sha256; cv=none; b=ExlSaF1J0W6/ZEjrvDiJYsUxEzEITReNrVBe0WdcmSKuGB8zOpB/xo1VxgbAosuJLvfY+d lMEAHTjmRGOiTOxCdU81VZMPm74HXB02nyrIkda8o0Qto85S0hZMmnv81aEpHLF94o3yXA vEKK4piWsOsHC4Gds6nHnEiq8ZZa+H8= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1659124295; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Dstciygn1YsoqcejBVc0D+MFf2Th1tfQSWlbVTNdSdk=; b=nWXPrH9tycSOdHKtfgSQJYd1FQdfzqxNIypdisWgtzvsSMW2w9V6InXtAADnitUAm8eMtL ElvwZfWSe4py1ZoXbWqWl3tjMQKAzuebv4pn1Mgo0nR2yPASPDvAvGJP7R5GgFTyWUivzk ZhgIq3nO3MvUkT94Qq0OdQTNK5CGAso= X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: 807E71400DC Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=OrhOLugC; spf=pass (imf09.hostedemail.com: domain of seanjc@google.com designates 209.85.214.179 as permitted sender) smtp.mailfrom=seanjc@google.com; dmarc=pass (policy=reject) header.from=google.com X-Stat-Signature: k1ejpxjo4dsymdofnjzmeihyx33dj68d X-Rspam-User: X-HE-Tag: 1659124295-737022 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Jul 06, 2022, Chao Peng wrote: > @@ -1332,9 +1332,18 @@ yet and must be cleared on entry. > __u64 userspace_addr; /* start of the userspace allocated memory */ > }; > > + struct kvm_userspace_memory_region_ext { > + struct kvm_userspace_memory_region region; > + __u64 private_offset; > + __u32 private_fd; > + __u32 pad1; > + __u64 pad2[14]; > +}; > + > /* for kvm_memory_region::flags */ > #define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0) > #define KVM_MEM_READONLY (1UL << 1) > + #define KVM_MEM_PRIVATE (1UL << 2) Very belatedly following up on prior feedback... | I think a flag is still needed, the problem is private_fd can be safely | accessed only when this flag is set, e.g. without this flag, we can't | copy_from_user these new fields since they don't exist for previous | kvm_userspace_memory_region callers. I forgot about that aspect of things. We don't technically need a dedicated PRIVATE flag to handle that, but it does seem to be the least awful soltuion. We could either add a generic KVM_MEM_EXTENDED_REGION or an entirely new ioctl(), e.g. KVM_SET_USER_MEMORY_REGION2, but in both approaches there's a decent chance that we'll end up needed individual "this field is valid" flags anways. E.g. if KVM requires pad1 and pad2 to be zero to carve out future extensions, then we're right back here if some future extension needs to treat '0' as a legal input. TL;DR: adding KVM_MEM_PRIVATE still seems like the best approach. > @@ -4631,14 +4658,35 @@ static long kvm_vm_ioctl(struct file *filp, > break; > } > case KVM_SET_USER_MEMORY_REGION: { > - struct kvm_userspace_memory_region kvm_userspace_mem; > + struct kvm_user_mem_region mem; > + unsigned long size; > + u32 flags; > + > + kvm_sanity_check_user_mem_region_alias(); > + > + memset(&mem, 0, sizeof(mem)); > > r = -EFAULT; > - if (copy_from_user(&kvm_userspace_mem, argp, > - sizeof(kvm_userspace_mem))) > + > + if (get_user(flags, > + (u32 __user *)(argp + offsetof(typeof(mem), flags)))) > + goto out; Indentation is funky. It's hard to massage this into something short and readable What about capturing the offset separately? E.g. struct kvm_user_mem_region mem; unsigned int flags_offset = offsetof(typeof(mem), flags)); unsigned long size; u32 flags; kvm_sanity_check_user_mem_region_alias(); memset(&mem, 0, sizeof(mem)); r = -EFAULT; if (get_user(flags, (u32 __user *)(argp + flags_offset))) goto out; But this can actually be punted until KVM_MEM_PRIVATE is fully supported. As of this patch, KVM doesn't read the extended size, so I believe the diff for this patch can simply be: diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index da263c370d00..5194beb7b52f 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4640,6 +4640,10 @@ static long kvm_vm_ioctl(struct file *filp, sizeof(kvm_userspace_mem))) goto out; + r = -EINVAL; + if (mem.flags & KVM_MEM_PRIVATE) + goto out; + r = kvm_vm_ioctl_set_memory_region(kvm, &kvm_userspace_mem); break; }