From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-10.3 required=3.0 tests=BAYES_00,
	HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,
	SPF_PASS,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 93987C07E94
	for ; Fri, 4 Jun 2021 09:01:33 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 79D356139A
	for ; Fri, 4 Jun 2021 09:01:33 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S230015AbhFDJDS (ORCPT ); Fri, 4 Jun 2021 05:03:18 -0400
Received: from mail.kernel.org ([198.145.29.99]:48608 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S229930AbhFDJDR (ORCPT ); Fri, 4 Jun 2021 05:03:17 -0400
Received: by mail.kernel.org (Postfix) with ESMTPSA id 7EA186140F;
	Fri, 4 Jun 2021 09:01:28 +0000 (UTC)
Date: Fri, 4 Jun 2021 10:01:26 +0100
From: Catalin Marinas
To: Steven Price
Cc: Marc Zyngier, Will Deacon, James Morse, Julien Thierry,
	Suzuki K Poulose, kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
	Dave Martin, Mark Rutland, Thomas Gleixner, qemu-devel@nongnu.org,
	Juan Quintela, "Dr. David Alan Gilbert", Richard Henderson,
	Peter Maydell, Haibo Xu, Andrew Jones
Subject: Re: [PATCH v13 4/8] KVM: arm64: Introduce MTE VM feature
Message-ID: <20210604090125.GA23321@arm.com>
References: <20210524104513.13258-1-steven.price@arm.com>
	<20210524104513.13258-5-steven.price@arm.com>
	<20210603160031.GE20338@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20210603160031.GE20338@arm.com>
User-Agent: Mutt/1.10.1 (2018-07-13)
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 03, 2021 at 05:00:31PM +0100, Catalin Marinas wrote:
> On Mon, May 24, 2021 at 11:45:09AM +0100, Steven Price wrote:
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index c5d1f3c87dbd..226035cf7d6c 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -822,6 +822,42 @@ transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
> >  	return PAGE_SIZE;
> >  }
> >
> > +static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
> > +			     unsigned long size)
> > +{
> > +	if (kvm_has_mte(kvm)) {
> > +		/*
> > +		 * The page will be mapped in stage 2 as Normal Cacheable, so
> > +		 * the VM will be able to see the page's tags and therefore
> > +		 * they must be initialised first. If PG_mte_tagged is set,
> > +		 * tags have already been initialised.
> > +		 * pfn_to_online_page() is used to reject ZONE_DEVICE pages
> > +		 * that may not support tags.
> > +		 */
> > +		unsigned long i, nr_pages = size >> PAGE_SHIFT;
> > +		struct page *page = pfn_to_online_page(pfn);
> > +
> > +		if (!page)
> > +			return -EFAULT;
> > +
> > +		for (i = 0; i < nr_pages; i++, page++) {
> > +			/*
> > +			 * There is a potential (but very unlikely) race
> > +			 * between two VMs which are sharing a physical page
> > +			 * entering this at the same time. However by splitting
> > +			 * the test/set the only risk is tags being overwritten
> > +			 * by the mte_clear_page_tags() call.
> > +			 */
>
> And I think the real risk here is when the page is writable by at least
> one of the VMs sharing the page. This excludes KSM, so it only leaves
> the MAP_SHARED mappings.
>
> > +			if (!test_bit(PG_mte_tagged, &page->flags)) {
> > +				mte_clear_page_tags(page_address(page));
> > +				set_bit(PG_mte_tagged, &page->flags);
> > +			}
> > +		}
>
> If we want to cover this race (I'd say in a separate patch), we can call
> mte_sync_page_tags(page, __pte(0), false, true) directly (hopefully I
> got the arguments right). We can avoid the big lock in most cases if
> kvm_arch_prepare_memory_region() sets a VM_MTE_RESET (tag clear etc.)
> and __alloc_zeroed_user_highpage() clears the tags on allocation (as we
> do for VM_MTE but the new flag would not affect the stage 1 VMM page
> attributes).

Another idea: if VM_SHARED is found for any vma within a region in
kvm_arch_prepare_memory_region(), we either prevent the enabling of MTE
for the guest or reject the memory slot if MTE was already enabled.

An alternative here would be to clear VM_MTE_ALLOWED so that any
subsequent mprotect(PROT_MTE) in the VMM would fail in
arch_validate_flags(). MTE would still be allowed in the guest but not
in the VMM for the guest memory regions. We can probably do this
irrespective of VM_SHARED. Of course, the VMM can still mmap() the
memory initially with PROT_MTE but that's not an issue IIRC, only the
concurrent mprotect().
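
For illustration only, here is a minimal user-space model of what
serialising the test/clear/set sequence could look like. It is not
kernel code: struct fake_page, tag_lock and sanitise_tags_locked() are
hypothetical stand-ins for struct page, the "big lock" and the locked
path discussed above, and the tag storage is simplified to one byte per
16-byte granule. It only shows the shape of the check: an unlocked fast
path, then a re-check under the lock so that a second caller cannot
clear tags after the first caller has published the flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* 4 KiB page / 16-byte granule; one byte per granule for simplicity. */
#define TAGS_PER_PAGE 256

/* Hypothetical stand-in for struct page plus its tag storage. */
struct fake_page {
	atomic_bool mte_tagged;			/* models PG_mte_tagged */
	unsigned char tags[TAGS_PER_PAGE];	/* models the allocation tags */
};

/* Models a global lock as a spinlock built on atomic_flag (assumption). */
static atomic_flag tag_lock = ATOMIC_FLAG_INIT;

static void tag_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&tag_lock,
						 memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void tag_lock_release(void)
{
	atomic_flag_clear_explicit(&tag_lock, memory_order_release);
}

/*
 * Clear the tags exactly once per page. The flag is set only after the
 * tags are cleared, and both steps happen under the lock, so no caller
 * can clear tags once the flag has been published.
 * Returns true if this caller performed the initialisation.
 */
static bool sanitise_tags_locked(struct fake_page *page)
{
	bool did_init = false;

	assert(page != NULL);

	/* Fast path: tags already initialised, no lock needed. */
	if (atomic_load_explicit(&page->mte_tagged, memory_order_acquire))
		return false;

	tag_lock_acquire();
	/* Re-check under the lock: another caller may have won the race. */
	if (!atomic_load_explicit(&page->mte_tagged, memory_order_relaxed)) {
		memset(page->tags, 0, sizeof(page->tags));
		atomic_store_explicit(&page->mte_tagged, true,
				      memory_order_release);
		did_init = true;
	}
	tag_lock_release();

	return did_init;
}
```

In this model the lock is only taken while the flag is still clear,
which mirrors the point above about avoiding the big lock in most cases;
whether the real code should take this shape is left open here.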
-- 
Catalin