From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-14.0 required=3.0 tests=BAYES_00,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 25109C2B9F4 for ; Tue, 22 Jun 2021 10:25:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0B6F5613AD for ; Tue, 22 Jun 2021 10:25:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229817AbhFVK15 (ORCPT ); Tue, 22 Jun 2021 06:27:57 -0400 Received: from mail.kernel.org ([198.145.29.99]:42114 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229567AbhFVK1p (ORCPT ); Tue, 22 Jun 2021 06:27:45 -0400 Received: from disco-boy.misterjones.org (disco-boy.misterjones.org [51.254.78.96]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id D02FB613AD; Tue, 22 Jun 2021 10:25:29 +0000 (UTC) Received: from sofa.misterjones.org ([185.219.108.64] helo=why.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1lvdax-0094U6-Pc; Tue, 22 Jun 2021 11:25:27 +0100 Date: Tue, 22 Jun 2021 11:25:27 +0100 Message-ID: <875yy6ci20.wl-maz@kernel.org> From: Marc Zyngier To: Fuad Tabba Cc: Steven Price , Catalin Marinas , Will Deacon , "Dr. 
David Alan Gilbert" , qemu-devel@nongnu.org, Dave Martin , Juan Quintela , Richard Henderson , linux-kernel@vger.kernel.org, Thomas Gleixner , kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org Subject: Re: [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest In-Reply-To: References: <20210621111716.37157-1-steven.price@arm.com> <20210621111716.37157-6-steven.price@arm.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: tabba@google.com, steven.price@arm.com, catalin.marinas@arm.com, will@kernel.org, dgilbert@redhat.com, qemu-devel@nongnu.org, Dave.Martin@arm.com, quintela@redhat.com, richard.henderson@linaro.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Fuad, On Tue, 22 Jun 2021 09:56:22 +0100, Fuad Tabba wrote: > > Hi, > > > On Mon, Jun 21, 2021 at 12:18 PM Steven Price wrote: > > > > The VMM may not wish to have it's own mapping of guest memory mapped > > with PROT_MTE because this causes problems if the VMM has tag checking > > enabled (the guest controls the tags in physical RAM and it's unlikely > > the tags are correct for the VMM). > > > > Instead add a new ioctl which allows the VMM to easily read/write the > > tags from guest memory, allowing the VMM's mapping to be non-PROT_MTE > > while the VMM can still read/write the tags for the purpose of > > migration. 
> > > > Reviewed-by: Catalin Marinas > > Signed-off-by: Steven Price > > --- > > arch/arm64/include/asm/kvm_host.h | 3 ++ > > arch/arm64/include/asm/mte-def.h | 1 + > > arch/arm64/include/uapi/asm/kvm.h | 11 +++++ > > arch/arm64/kvm/arm.c | 7 +++ > > arch/arm64/kvm/guest.c | 82 +++++++++++++++++++++++++++++++ > > include/uapi/linux/kvm.h | 1 + > > 6 files changed, 105 insertions(+) > > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > > index 309e36cc1b42..6a2ac4636d42 100644 > > --- a/arch/arm64/include/asm/kvm_host.h > > +++ b/arch/arm64/include/asm/kvm_host.h > > @@ -729,6 +729,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu, > > int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu, > > struct kvm_device_attr *attr); > > > > +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm, > > + struct kvm_arm_copy_mte_tags *copy_tags); > > + > > /* Guest/host FPSIMD coordination helpers */ > > int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu); > > void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu); > > diff --git a/arch/arm64/include/asm/mte-def.h b/arch/arm64/include/asm/mte-def.h > > index cf241b0f0a42..626d359b396e 100644 > > --- a/arch/arm64/include/asm/mte-def.h > > +++ b/arch/arm64/include/asm/mte-def.h > > @@ -7,6 +7,7 @@ > > > > #define MTE_GRANULE_SIZE UL(16) > > #define MTE_GRANULE_MASK (~(MTE_GRANULE_SIZE - 1)) > > +#define MTE_GRANULES_PER_PAGE (PAGE_SIZE / MTE_GRANULE_SIZE) > > #define MTE_TAG_SHIFT 56 > > #define MTE_TAG_SIZE 4 > > #define MTE_TAG_MASK GENMASK((MTE_TAG_SHIFT + (MTE_TAG_SIZE - 1)), MTE_TAG_SHIFT) > > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h > > index 24223adae150..b3edde68bc3e 100644 > > --- a/arch/arm64/include/uapi/asm/kvm.h > > +++ b/arch/arm64/include/uapi/asm/kvm.h > > @@ -184,6 +184,17 @@ struct kvm_vcpu_events { > > __u32 reserved[12]; > > }; > > > > +struct kvm_arm_copy_mte_tags { > > + __u64 guest_ipa; > > + __u64 length; > > + void __user *addr; > 
> + __u64 flags; > > + __u64 reserved[2]; > > +}; > > + > > +#define KVM_ARM_TAGS_TO_GUEST 0 > > +#define KVM_ARM_TAGS_FROM_GUEST 1 > > + > > /* If you need to interpret the index values, here is the key: */ > > #define KVM_REG_ARM_COPROC_MASK 0x000000000FFF0000 > > #define KVM_REG_ARM_COPROC_SHIFT 16 > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > > index 28ce26a68f09..511f3716fe33 100644 > > --- a/arch/arm64/kvm/arm.c > > +++ b/arch/arm64/kvm/arm.c > > @@ -1359,6 +1359,13 @@ long kvm_arch_vm_ioctl(struct file *filp, > > > > return 0; > > } > > + case KVM_ARM_MTE_COPY_TAGS: { > > + struct kvm_arm_copy_mte_tags copy_tags; > > + > > + if (copy_from_user(&copy_tags, argp, sizeof(copy_tags))) > > + return -EFAULT; > > + return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags); > > + } > > default: > > return -EINVAL; > > } > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c > > index 5cb4a1cd5603..4ddb20017b2f 100644 > > --- a/arch/arm64/kvm/guest.c > > +++ b/arch/arm64/kvm/guest.c > > @@ -995,3 +995,85 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu, > > > > return ret; > > } > > + > > +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm, > > + struct kvm_arm_copy_mte_tags *copy_tags) > > +{ > > + gpa_t guest_ipa = copy_tags->guest_ipa; > > + size_t length = copy_tags->length; > > + void __user *tags = copy_tags->addr; > > + gpa_t gfn; > > + bool write = !(copy_tags->flags & KVM_ARM_TAGS_FROM_GUEST); > > + int ret = 0; > > + > > + if (!kvm_has_mte(kvm)) > > + return -EINVAL; > > + > > + if (copy_tags->reserved[0] || copy_tags->reserved[1]) > > + return -EINVAL; > > + > > + if (copy_tags->flags & ~KVM_ARM_TAGS_FROM_GUEST) > > + return -EINVAL; > > + > > + if (length & ~PAGE_MASK || guest_ipa & ~PAGE_MASK) > > + return -EINVAL; > > + > > + gfn = gpa_to_gfn(guest_ipa); > > + > > + mutex_lock(&kvm->slots_lock); > > + > > + while (length > 0) { > > + kvm_pfn_t pfn = gfn_to_pfn_prot(kvm, gfn, write, NULL); > > + void *maddr; > > + unsigned long num_tags;
> > + struct page *page; > > + > > + if (is_error_noslot_pfn(pfn)) { > > + ret = -EFAULT; > > + goto out; > > + } > > + > > + page = pfn_to_online_page(pfn); > > + if (!page) { > > + /* Reject ZONE_DEVICE memory */ > > + ret = -EFAULT; > > + goto out; > > + } > > + maddr = page_address(page); > > + > > + if (!write) { > > + if (test_bit(PG_mte_tagged, &page->flags)) > > + num_tags = mte_copy_tags_to_user(tags, maddr, > > + MTE_GRANULES_PER_PAGE); > > + else > > + /* No tags in memory, so write zeros */ > > + num_tags = MTE_GRANULES_PER_PAGE - > > + clear_user(tags, MTE_GRANULES_PER_PAGE); > > + kvm_release_pfn_clean(pfn); > > + } else { > > + num_tags = mte_copy_tags_from_user(maddr, tags, > > + MTE_GRANULES_PER_PAGE); > > + kvm_release_pfn_dirty(pfn); > > + } > > + > > + if (num_tags != MTE_GRANULES_PER_PAGE) { > > + ret = -EFAULT; > > + goto out; > > + } > > + > > + /* Set the flag after checking the write completed fully */ > > + if (write) > > + set_bit(PG_mte_tagged, &page->flags); > > + > > + gfn++; > > + tags += num_tags; > > + length -= PAGE_SIZE; > > + } > > + > > +out: > > + mutex_unlock(&kvm->slots_lock); > > + /* If some data has been copied report the number of bytes copied */ > > + if (length != copy_tags->length) > > + return copy_tags->length - length; > > I'm not sure if this is actually an issue, but a couple of comments on > the return value if there is an error after a partial copy has been > done. If mte_copy_tags_to_user or mte_copy_tags_from_user don't return > MTE_GRANULES_PER_PAGE, then the check for num_tags would fail, but > some of the tags would have been copied, which wouldn't be reflected > in length. That said, on a write the tagged bit wouldn't be set, and > on read then the return value would be conservative, but not > incorrect. > > That said, even though it is described that way in the documentation > (rather deep in the description though), it might be confusing to > return a non-negative value on an error. 
The other kvm ioctl I could > find that does something similar, KVM_S390_GET_IRQ_STATE, seems to > always return a -ERROR on error, rather than the number of bytes > copied. My mental analogy for this ioctl is the read()/write() syscalls, which return the number of bytes that have been transferred in either direction. I agree that there are some corner cases (a tag copy that fails because of a faulty page adjacent to a valid page will still report some degree of success), but it is also important to report what has actually been done in either direction. Thanks, M. -- Without deviation from the norm, progress is not possible. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-14.0 required=3.0 tests=BAYES_00,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6076FC48BE5 for ; Tue, 22 Jun 2021 10:39:14 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E65E9613B2 for ; Tue, 22 Jun 2021 10:39:13 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E65E9613B2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:44516 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lvdoH-0004ia-4p for qemu-devel@archiver.kernel.org; Tue, 22 Jun 2021 06:39:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:46756) by 
lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lvdb5-0000UW-SU for qemu-devel@nongnu.org; Tue, 22 Jun 2021 06:25:35 -0400 Received: from mail.kernel.org ([198.145.29.99]:59038) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lvdb2-0002Pf-SK for qemu-devel@nongnu.org; Tue, 22 Jun 2021 06:25:35 -0400 Received: from disco-boy.misterjones.org (disco-boy.misterjones.org [51.254.78.96]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id D02FB613AD; Tue, 22 Jun 2021 10:25:29 +0000 (UTC) Received: from sofa.misterjones.org ([185.219.108.64] helo=why.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1lvdax-0094U6-Pc; Tue, 22 Jun 2021 11:25:27 +0100 Date: Tue, 22 Jun 2021 11:25:27 +0100 Message-ID: <875yy6ci20.wl-maz@kernel.org> From: Marc Zyngier To: Fuad Tabba Subject: Re: [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest In-Reply-To: References: <20210621111716.37157-1-steven.price@arm.com> <20210621111716.37157-6-steven.price@arm.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: tabba@google.com, steven.price@arm.com, catalin.marinas@arm.com, will@kernel.org, dgilbert@redhat.com, qemu-devel@nongnu.org, Dave.Martin@arm.com, quintela@redhat.com, richard.henderson@linaro.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on 
disco-boy.misterjones.org); SAEximRunCond expanded to false Received-SPF: pass client-ip=198.145.29.99; envelope-from=maz@kernel.org; helo=mail.kernel.org X-Spam_score_int: -68 X-Spam_score: -6.9 X-Spam_bar: ------ X-Spam_report: (-6.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_HI=-5, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Juan Quintela , Catalin Marinas , Richard Henderson , qemu-devel@nongnu.org, "Dr. David Alan Gilbert" , kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org, Thomas Gleixner , Steven Price , Will Deacon , Dave Martin , linux-kernel@vger.kernel.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Hi Fuad, On Tue, 22 Jun 2021 09:56:22 +0100, Fuad Tabba wrote: > > Hi, > > > On Mon, Jun 21, 2021 at 12:18 PM Steven Price wrote: > > > > The VMM may not wish to have it's own mapping of guest memory mapped > > with PROT_MTE because this causes problems if the VMM has tag checking > > enabled (the guest controls the tags in physical RAM and it's unlikely > > the tags are correct for the VMM). > > > > Instead add a new ioctl which allows the VMM to easily read/write the > > tags from guest memory, allowing the VMM's mapping to be non-PROT_MTE > > while the VMM can still read/write the tags for the purpose of > > migration. 
> > > > Reviewed-by: Catalin Marinas > > Signed-off-by: Steven Price > > --- > > arch/arm64/include/asm/kvm_host.h | 3 ++ > > arch/arm64/include/asm/mte-def.h | 1 + > > arch/arm64/include/uapi/asm/kvm.h | 11 +++++ > > arch/arm64/kvm/arm.c | 7 +++ > > arch/arm64/kvm/guest.c | 82 +++++++++++++++++++++++++++++++ > > include/uapi/linux/kvm.h | 1 + > > 6 files changed, 105 insertions(+) > > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > > index 309e36cc1b42..6a2ac4636d42 100644 > > --- a/arch/arm64/include/asm/kvm_host.h > > +++ b/arch/arm64/include/asm/kvm_host.h > > @@ -729,6 +729,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu, > > int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu, > > struct kvm_device_attr *attr); > > > > +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm, > > + struct kvm_arm_copy_mte_tags *copy_tags); > > + > > /* Guest/host FPSIMD coordination helpers */ > > int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu); > > void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu); > > diff --git a/arch/arm64/include/asm/mte-def.h b/arch/arm64/include/asm/mte-def.h > > index cf241b0f0a42..626d359b396e 100644 > > --- a/arch/arm64/include/asm/mte-def.h > > +++ b/arch/arm64/include/asm/mte-def.h > > @@ -7,6 +7,7 @@ > > > > #define MTE_GRANULE_SIZE UL(16) > > #define MTE_GRANULE_MASK (~(MTE_GRANULE_SIZE - 1)) > > +#define MTE_GRANULES_PER_PAGE (PAGE_SIZE / MTE_GRANULE_SIZE) > > #define MTE_TAG_SHIFT 56 > > #define MTE_TAG_SIZE 4 > > #define MTE_TAG_MASK GENMASK((MTE_TAG_SHIFT + (MTE_TAG_SIZE - 1)), MTE_TAG_SHIFT) > > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h > > index 24223adae150..b3edde68bc3e 100644 > > --- a/arch/arm64/include/uapi/asm/kvm.h > > +++ b/arch/arm64/include/uapi/asm/kvm.h > > @@ -184,6 +184,17 @@ struct kvm_vcpu_events { > > __u32 reserved[12]; > > }; > > > > +struct kvm_arm_copy_mte_tags { > > + __u64 guest_ipa; > > + __u64 length; > > + void __user *addr; > 
> + __u64 flags; > > + __u64 reserved[2]; > > +}; > > + > > +#define KVM_ARM_TAGS_TO_GUEST 0 > > +#define KVM_ARM_TAGS_FROM_GUEST 1 > > + > > /* If you need to interpret the index values, here is the key: */ > > #define KVM_REG_ARM_COPROC_MASK 0x000000000FFF0000 > > #define KVM_REG_ARM_COPROC_SHIFT 16 > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > > index 28ce26a68f09..511f3716fe33 100644 > > --- a/arch/arm64/kvm/arm.c > > +++ b/arch/arm64/kvm/arm.c > > @@ -1359,6 +1359,13 @@ long kvm_arch_vm_ioctl(struct file *filp, > > > > return 0; > > } > > + case KVM_ARM_MTE_COPY_TAGS: { > > + struct kvm_arm_copy_mte_tags copy_tags; > > + > > + if (copy_from_user(©_tags, argp, sizeof(copy_tags))) > > + return -EFAULT; > > + return kvm_vm_ioctl_mte_copy_tags(kvm, ©_tags); > > + } > > default: > > return -EINVAL; > > } > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c > > index 5cb4a1cd5603..4ddb20017b2f 100644 > > --- a/arch/arm64/kvm/guest.c > > +++ b/arch/arm64/kvm/guest.c > > @@ -995,3 +995,85 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu, > > > > return ret; > > } > > + > > +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm, > > + struct kvm_arm_copy_mte_tags *copy_tags) > > +{ > > + gpa_t guest_ipa = copy_tags->guest_ipa; > > + size_t length = copy_tags->length; > > + void __user *tags = copy_tags->addr; > > + gpa_t gfn; > > + bool write = !(copy_tags->flags & KVM_ARM_TAGS_FROM_GUEST); > > + int ret = 0; > > + > > + if (!kvm_has_mte(kvm)) > > + return -EINVAL; > > + > > + if (copy_tags->reserved[0] || copy_tags->reserved[1]) > > + return -EINVAL; > > + > > + if (copy_tags->flags & ~KVM_ARM_TAGS_FROM_GUEST) > > + return -EINVAL; > > + > > + if (length & ~PAGE_MASK || guest_ipa & ~PAGE_MASK) > > + return -EINVAL; > > + > > + gfn = gpa_to_gfn(guest_ipa); > > + > > + mutex_lock(&kvm->slots_lock); > > + > > + while (length > 0) { > > + kvm_pfn_t pfn = gfn_to_pfn_prot(kvm, gfn, write, NULL); > > + void *maddr; > > + unsigned long num_tags; 
> > + struct page *page; > > + > > + if (is_error_noslot_pfn(pfn)) { > > + ret = -EFAULT; > > + goto out; > > + } > > + > > + page = pfn_to_online_page(pfn); > > + if (!page) { > > + /* Reject ZONE_DEVICE memory */ > > + ret = -EFAULT; > > + goto out; > > + } > > + maddr = page_address(page); > > + > > + if (!write) { > > + if (test_bit(PG_mte_tagged, &page->flags)) > > + num_tags = mte_copy_tags_to_user(tags, maddr, > > + MTE_GRANULES_PER_PAGE); > > + else > > + /* No tags in memory, so write zeros */ > > + num_tags = MTE_GRANULES_PER_PAGE - > > + clear_user(tags, MTE_GRANULES_PER_PAGE); > > + kvm_release_pfn_clean(pfn); > > + } else { > > + num_tags = mte_copy_tags_from_user(maddr, tags, > > + MTE_GRANULES_PER_PAGE); > > + kvm_release_pfn_dirty(pfn); > > + } > > + > > + if (num_tags != MTE_GRANULES_PER_PAGE) { > > + ret = -EFAULT; > > + goto out; > > + } > > + > > + /* Set the flag after checking the write completed fully */ > > + if (write) > > + set_bit(PG_mte_tagged, &page->flags); > > + > > + gfn++; > > + tags += num_tags; > > + length -= PAGE_SIZE; > > + } > > + > > +out: > > + mutex_unlock(&kvm->slots_lock); > > + /* If some data has been copied report the number of bytes copied */ > > + if (length != copy_tags->length) > > + return copy_tags->length - length; > > I'm not sure if this is actually an issue, but a couple of comments on > the return value if there is an error after a partial copy has been > done. If mte_copy_tags_to_user or mte_copy_tags_from_user don't return > MTE_GRANULES_PER_PAGE, then the check for num_tags would fail, but > some of the tags would have been copied, which wouldn't be reflected > in length. That said, on a write the tagged bit wouldn't be set, and > on read then the return value would be conservative, but not > incorrect. > > That said, even though it is described that way in the documentation > (rather deep in the description though), it might be confusing to > return a non-negative value on an error. 
The other kvm ioctl I could > find that does something similar, KVM_S390_GET_IRQ_STATE, seems to > always return a -ERROR on error, rather than the number of bytes > copied. My mental analogy for this ioctl is the read()/write() syscalls, which return the number of bytes that have been transferred in either direction. I agree that there are some corner cases (a tag copy that fails because of a faulty page adjacent to a valid page will still report some degree of success), but it is also important to report what has actually been done in either direction. Thanks, M. -- Without deviation from the norm, progress is not possible. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-14.0 required=3.0 tests=BAYES_00,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8E498C2B9F4 for ; Tue, 22 Jun 2021 10:25:36 +0000 (UTC) Received: from mm01.cs.columbia.edu (mm01.cs.columbia.edu [128.59.11.253]) by mail.kernel.org (Postfix) with ESMTP id 09FEE613B2 for ; Tue, 22 Jun 2021 10:25:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 09FEE613B2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvmarm-bounces@lists.cs.columbia.edu Received: from localhost (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id 7E70740870; Tue, 22 Jun 2021 06:25:35 -0400 (EDT) X-Virus-Scanned: at lists.cs.columbia.edu Received: from mm01.cs.columbia.edu ([127.0.0.1]) by localhost (mm01.cs.columbia.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id YqvpZSyIN0JT; Tue, 22 Jun 2021 06:25:33 -0400 (EDT) Received: from 
mm01.cs.columbia.edu (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id E8728407F4; Tue, 22 Jun 2021 06:25:33 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id 72956407E7 for ; Tue, 22 Jun 2021 06:25:32 -0400 (EDT) X-Virus-Scanned: at lists.cs.columbia.edu Received: from mm01.cs.columbia.edu ([127.0.0.1]) by localhost (mm01.cs.columbia.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id e2Tr9JI-XVbE for ; Tue, 22 Jun 2021 06:25:31 -0400 (EDT) Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by mm01.cs.columbia.edu (Postfix) with ESMTPS id 172A6407D1 for ; Tue, 22 Jun 2021 06:25:31 -0400 (EDT) Received: from disco-boy.misterjones.org (disco-boy.misterjones.org [51.254.78.96]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id D02FB613AD; Tue, 22 Jun 2021 10:25:29 +0000 (UTC) Received: from sofa.misterjones.org ([185.219.108.64] helo=why.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1lvdax-0094U6-Pc; Tue, 22 Jun 2021 11:25:27 +0100 Date: Tue, 22 Jun 2021 11:25:27 +0100 Message-ID: <875yy6ci20.wl-maz@kernel.org> From: Marc Zyngier To: Fuad Tabba Subject: Re: [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest In-Reply-To: References: <20210621111716.37157-1-steven.price@arm.com> <20210621111716.37157-6-steven.price@arm.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: tabba@google.com, steven.price@arm.com, catalin.marinas@arm.com, will@kernel.org, dgilbert@redhat.com, qemu-devel@nongnu.org, 
Dave.Martin@arm.com, quintela@redhat.com, richard.henderson@linaro.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Cc: Juan Quintela , Catalin Marinas , Richard Henderson , qemu-devel@nongnu.org, "Dr. David Alan Gilbert" , kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org, Thomas Gleixner , Steven Price , Will Deacon , Dave Martin , linux-kernel@vger.kernel.org X-BeenThere: kvmarm@lists.cs.columbia.edu X-Mailman-Version: 2.1.14 Precedence: list List-Id: Where KVM/ARM decisions are made List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: kvmarm-bounces@lists.cs.columbia.edu Sender: kvmarm-bounces@lists.cs.columbia.edu Hi Fuad, On Tue, 22 Jun 2021 09:56:22 +0100, Fuad Tabba wrote: > > Hi, > > > On Mon, Jun 21, 2021 at 12:18 PM Steven Price wrote: > > > > The VMM may not wish to have it's own mapping of guest memory mapped > > with PROT_MTE because this causes problems if the VMM has tag checking > > enabled (the guest controls the tags in physical RAM and it's unlikely > > the tags are correct for the VMM). > > > > Instead add a new ioctl which allows the VMM to easily read/write the > > tags from guest memory, allowing the VMM's mapping to be non-PROT_MTE > > while the VMM can still read/write the tags for the purpose of > > migration. 
> > > > Reviewed-by: Catalin Marinas > > Signed-off-by: Steven Price > > --- > > arch/arm64/include/asm/kvm_host.h | 3 ++ > > arch/arm64/include/asm/mte-def.h | 1 + > > arch/arm64/include/uapi/asm/kvm.h | 11 +++++ > > arch/arm64/kvm/arm.c | 7 +++ > > arch/arm64/kvm/guest.c | 82 +++++++++++++++++++++++++++++++ > > include/uapi/linux/kvm.h | 1 + > > 6 files changed, 105 insertions(+) > > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > > index 309e36cc1b42..6a2ac4636d42 100644 > > --- a/arch/arm64/include/asm/kvm_host.h > > +++ b/arch/arm64/include/asm/kvm_host.h > > @@ -729,6 +729,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu, > > int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu, > > struct kvm_device_attr *attr); > > > > +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm, > > + struct kvm_arm_copy_mte_tags *copy_tags); > > + > > /* Guest/host FPSIMD coordination helpers */ > > int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu); > > void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu); > > diff --git a/arch/arm64/include/asm/mte-def.h b/arch/arm64/include/asm/mte-def.h > > index cf241b0f0a42..626d359b396e 100644 > > --- a/arch/arm64/include/asm/mte-def.h > > +++ b/arch/arm64/include/asm/mte-def.h > > @@ -7,6 +7,7 @@ > > > > #define MTE_GRANULE_SIZE UL(16) > > #define MTE_GRANULE_MASK (~(MTE_GRANULE_SIZE - 1)) > > +#define MTE_GRANULES_PER_PAGE (PAGE_SIZE / MTE_GRANULE_SIZE) > > #define MTE_TAG_SHIFT 56 > > #define MTE_TAG_SIZE 4 > > #define MTE_TAG_MASK GENMASK((MTE_TAG_SHIFT + (MTE_TAG_SIZE - 1)), MTE_TAG_SHIFT) > > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h > > index 24223adae150..b3edde68bc3e 100644 > > --- a/arch/arm64/include/uapi/asm/kvm.h > > +++ b/arch/arm64/include/uapi/asm/kvm.h > > @@ -184,6 +184,17 @@ struct kvm_vcpu_events { > > __u32 reserved[12]; > > }; > > > > +struct kvm_arm_copy_mte_tags { > > + __u64 guest_ipa; > > + __u64 length; > > + void __user *addr; > 
> + __u64 flags; > > + __u64 reserved[2]; > > +}; > > + > > +#define KVM_ARM_TAGS_TO_GUEST 0 > > +#define KVM_ARM_TAGS_FROM_GUEST 1 > > + > > /* If you need to interpret the index values, here is the key: */ > > #define KVM_REG_ARM_COPROC_MASK 0x000000000FFF0000 > > #define KVM_REG_ARM_COPROC_SHIFT 16 > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > > index 28ce26a68f09..511f3716fe33 100644 > > --- a/arch/arm64/kvm/arm.c > > +++ b/arch/arm64/kvm/arm.c > > @@ -1359,6 +1359,13 @@ long kvm_arch_vm_ioctl(struct file *filp, > > > > return 0; > > } > > + case KVM_ARM_MTE_COPY_TAGS: { > > + struct kvm_arm_copy_mte_tags copy_tags; > > + > > + if (copy_from_user(©_tags, argp, sizeof(copy_tags))) > > + return -EFAULT; > > + return kvm_vm_ioctl_mte_copy_tags(kvm, ©_tags); > > + } > > default: > > return -EINVAL; > > } > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c > > index 5cb4a1cd5603..4ddb20017b2f 100644 > > --- a/arch/arm64/kvm/guest.c > > +++ b/arch/arm64/kvm/guest.c > > @@ -995,3 +995,85 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu, > > > > return ret; > > } > > + > > +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm, > > + struct kvm_arm_copy_mte_tags *copy_tags) > > +{ > > + gpa_t guest_ipa = copy_tags->guest_ipa; > > + size_t length = copy_tags->length; > > + void __user *tags = copy_tags->addr; > > + gpa_t gfn; > > + bool write = !(copy_tags->flags & KVM_ARM_TAGS_FROM_GUEST); > > + int ret = 0; > > + > > + if (!kvm_has_mte(kvm)) > > + return -EINVAL; > > + > > + if (copy_tags->reserved[0] || copy_tags->reserved[1]) > > + return -EINVAL; > > + > > + if (copy_tags->flags & ~KVM_ARM_TAGS_FROM_GUEST) > > + return -EINVAL; > > + > > + if (length & ~PAGE_MASK || guest_ipa & ~PAGE_MASK) > > + return -EINVAL; > > + > > + gfn = gpa_to_gfn(guest_ipa); > > + > > + mutex_lock(&kvm->slots_lock); > > + > > + while (length > 0) { > > + kvm_pfn_t pfn = gfn_to_pfn_prot(kvm, gfn, write, NULL); > > + void *maddr; > > + unsigned long num_tags; 
> > + struct page *page; > > + > > + if (is_error_noslot_pfn(pfn)) { > > + ret = -EFAULT; > > + goto out; > > + } > > + > > + page = pfn_to_online_page(pfn); > > + if (!page) { > > + /* Reject ZONE_DEVICE memory */ > > + ret = -EFAULT; > > + goto out; > > + } > > + maddr = page_address(page); > > + > > + if (!write) { > > + if (test_bit(PG_mte_tagged, &page->flags)) > > + num_tags = mte_copy_tags_to_user(tags, maddr, > > + MTE_GRANULES_PER_PAGE); > > + else > > + /* No tags in memory, so write zeros */ > > + num_tags = MTE_GRANULES_PER_PAGE - > > + clear_user(tags, MTE_GRANULES_PER_PAGE); > > + kvm_release_pfn_clean(pfn); > > + } else { > > + num_tags = mte_copy_tags_from_user(maddr, tags, > > + MTE_GRANULES_PER_PAGE); > > + kvm_release_pfn_dirty(pfn); > > + } > > + > > + if (num_tags != MTE_GRANULES_PER_PAGE) { > > + ret = -EFAULT; > > + goto out; > > + } > > + > > + /* Set the flag after checking the write completed fully */ > > + if (write) > > + set_bit(PG_mte_tagged, &page->flags); > > + > > + gfn++; > > + tags += num_tags; > > + length -= PAGE_SIZE; > > + } > > + > > +out: > > + mutex_unlock(&kvm->slots_lock); > > + /* If some data has been copied report the number of bytes copied */ > > + if (length != copy_tags->length) > > + return copy_tags->length - length; > > I'm not sure if this is actually an issue, but a couple of comments on > the return value if there is an error after a partial copy has been > done. If mte_copy_tags_to_user or mte_copy_tags_from_user don't return > MTE_GRANULES_PER_PAGE, then the check for num_tags would fail, but > some of the tags would have been copied, which wouldn't be reflected > in length. That said, on a write the tagged bit wouldn't be set, and > on read then the return value would be conservative, but not > incorrect. > > That said, even though it is described that way in the documentation > (rather deep in the description though), it might be confusing to > return a non-negative value on an error. 
> The other kvm ioctl I could
> find that does something similar, KVM_S390_GET_IRQ_STATE, seems to
> always return a -ERROR on error, rather than the number of bytes
> copied.

My mental analogy for this ioctl is the read()/write() syscalls, which
return the number of bytes that have been transferred in either
direction.

I agree that there are some corner cases (a tag copy that fails
because of a faulty page adjacent to a valid page will still report
some degree of success), but it is also important to report what has
actually been done in either direction.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
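
To make the read()/write() analogy concrete, here is a minimal
userspace-only sketch in C of the return convention discussed in this
thread. It does not call the real ioctl. copy_tags_sim(), PAGE_SZ,
GRANULE_SZ and the bad_page fault model are hypothetical stand-ins
chosen for illustration, and a 4K page size is assumed. The sketch
mirrors the structure of the quoted loop: one tag byte per 16-byte
granule, progress counted in whole pages of guest memory, and an error
code returned only when nothing at all was copied.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ			4096UL	/* assumption: 4K pages */
#define GRANULE_SZ		16UL	/* MTE granule size, as in the patch */
#define GRANULES_PER_PAGE	(PAGE_SZ / GRANULE_SZ)

/*
 * Hypothetical stand-in for the ioctl. Copies one tag byte per granule
 * for each page of "guest memory" and simulates a fault when it reaches
 * page index bad_page. Returns the number of bytes of guest memory whose
 * tags were fully copied, or a negative errno if no page was copied.
 */
static long copy_tags_sim(unsigned char *dst, const unsigned char *src,
			  size_t length, size_t bad_page)
{
	size_t remaining = length;
	size_t page = 0;

	/* Like the patch, only whole pages are accepted */
	if (length % PAGE_SZ)
		return -EINVAL;

	while (remaining > 0) {
		if (page == bad_page)
			break;		/* simulated faulty page */

		memcpy(dst, src, GRANULES_PER_PAGE);
		dst += GRANULES_PER_PAGE;
		src += GRANULES_PER_PAGE;
		remaining -= PAGE_SZ;
		page++;
	}

	/* Partial progress wins over the error, as with read()/write() */
	if (remaining != length)
		return (long)(length - remaining);

	return remaining ? -EFAULT : 0;
}
```

With this convention a caller that gets a short count back would
typically advance by that many bytes and retry from the first page that
was not copied; the retry then either makes further progress or
surfaces the actual error code, in the same way as a short read().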