From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 6 Mar 2023 19:34:39 +0200
From: Zhi Wang
To: Steven Price
Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, Catalin Marinas,
 Marc Zyngier, Will Deacon, James Morse, Oliver Upton,
 Suzuki K Poulose, Zenghui Yu, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, Joey Gouly, Alexandru Elisei,
 Christoffer Dall, Fuad Tabba, linux-coco@lists.linux.dev
Subject: Re: [RFC PATCH 16/28] arm64: RME: Allow populating initial contents
Message-ID: <20230306193439.000048f2@gmail.com>
In-Reply-To: <20230127112932.38045-17-steven.price@arm.com>
References: <20230127112248.136810-1-suzuki.poulose@arm.com>
 <20230127112932.38045-1-steven.price@arm.com>
 <20230127112932.38045-17-steven.price@arm.com>
X-Mailer: Claws Mail 4.1.0 (GTK 3.24.33; x86_64-w64-mingw32)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Fri, 27 Jan 2023 11:29:20 +0000
Steven Price wrote:

> The VMM needs to populate the realm with some data before starting (e.g.
> a kernel and initrd).
> This is measured by the RMM and used as part of
> the attestation later on.
>
> Signed-off-by: Steven Price
> ---
>  arch/arm64/kvm/rme.c | 366 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 366 insertions(+)
>
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> index 16e0bfea98b1..3405b43e1421 100644
> --- a/arch/arm64/kvm/rme.c
> +++ b/arch/arm64/kvm/rme.c
> @@ -4,6 +4,7 @@
>   */
>
>  #include
> +#include
>
>  #include
>  #include
> @@ -426,6 +427,359 @@ void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size)
>  	}
>  }
>
> +static int realm_create_protected_data_page(struct realm *realm,
> +					    unsigned long ipa,
> +					    struct page *dst_page,
> +					    struct page *tmp_page)
> +{
> +	phys_addr_t dst_phys, tmp_phys;
> +	int ret;
> +
> +	copy_page(page_address(tmp_page), page_address(dst_page));
> +
> +	dst_phys = page_to_phys(dst_page);
> +	tmp_phys = page_to_phys(tmp_page);
> +
> +	if (rmi_granule_delegate(dst_phys))
> +		return -ENXIO;
> +
> +	ret = rmi_data_create(dst_phys, virt_to_phys(realm->rd), ipa, tmp_phys,
> +			      RMI_MEASURE_CONTENT);
> +
> +	if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> +		/* Create missing RTTs and retry */
> +		int level = RMI_RETURN_INDEX(ret);
> +
> +		ret = realm_create_rtt_levels(realm, ipa, level,
> +					      RME_RTT_MAX_LEVEL, NULL);
> +		if (ret)
> +			goto err;
> +
> +		ret = rmi_data_create(dst_phys, virt_to_phys(realm->rd), ipa,
> +				      tmp_phys, RMI_MEASURE_CONTENT);
> +	}
> +
> +	if (ret)
> +		goto err;
> +
> +	return 0;
> +
> +err:
> +	if (WARN_ON(rmi_granule_undelegate(dst_phys))) {
> +		/* Page can't be returned to NS world so is lost */
> +		get_page(dst_page);
> +	}
> +	return -ENXIO;
> +}
> +
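BTW: the "create missing RTTs and retry" sequence above repeats in several
places in this patch. Roughly, it could be factored into a helper along
these lines (just a sketch, with rmi_op() as a hypothetical stand-in for
rmi_data_create()/rmi_rtt_init_ripas()):

	/*
	 * Run an RMI operation once; if the RMM reports a missing RTT,
	 * create the table levels it asked for and retry once.
	 */
	static int rmi_op_retry_rtt(struct realm *realm, unsigned long ipa,
				    int level,
				    int (*rmi_op)(struct realm *realm,
						  unsigned long ipa))
	{
		int ret = rmi_op(realm, ipa);

		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
			/* RMI_RETURN_INDEX() names the first missing level */
			ret = realm_create_rtt_levels(realm, ipa,
						      RMI_RETURN_INDEX(ret),
						      level, NULL);
			if (ret)
				return ret;
			ret = rmi_op(realm, ipa);
		}

		return ret;
	}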
> +static int fold_rtt(phys_addr_t rd, unsigned long addr, int level,
> +		    struct realm *realm)
> +{
> +	struct rtt_entry rtt;
> +	phys_addr_t rtt_addr;
> +
> +	if (rmi_rtt_read_entry(rd, addr, level, &rtt))
> +		return -ENXIO;
> +
> +	if (rtt.state != RMI_TABLE)
> +		return -EINVAL;
> +
> +	rtt_addr = rmi_rtt_get_phys(&rtt);
> +	if (rmi_rtt_fold(rtt_addr, rd, addr, level + 1))
> +		return -ENXIO;
> +
> +	free_delegated_page(realm, rtt_addr);
> +
> +	return 0;
> +}
> +
> +int realm_map_protected(struct realm *realm,
> +			unsigned long hva,
> +			unsigned long base_ipa,
> +			struct page *dst_page,
> +			unsigned long map_size,
> +			struct kvm_mmu_memory_cache *memcache)
> +{
> +	phys_addr_t dst_phys = page_to_phys(dst_page);
> +	phys_addr_t rd = virt_to_phys(realm->rd);
> +	unsigned long phys = dst_phys;
> +	unsigned long ipa = base_ipa;
> +	unsigned long size;
> +	int map_level;
> +	int ret = 0;
> +
> +	if (WARN_ON(!IS_ALIGNED(ipa, map_size)))
> +		return -EINVAL;
> +
> +	switch (map_size) {
> +	case PAGE_SIZE:
> +		map_level = 3;
> +		break;
> +	case RME_L2_BLOCK_SIZE:
> +		map_level = 2;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	if (map_level < RME_RTT_MAX_LEVEL) {
> +		/*
> +		 * A temporary RTT is needed during the map, precreate it,
> +		 * however if there is an error (e.g. missing parent tables)
> +		 * this will be handled below.
> +		 */
> +		realm_create_rtt_levels(realm, ipa, map_level,
> +					RME_RTT_MAX_LEVEL, memcache);
> +	}
> +
> +	for (size = 0; size < map_size; size += PAGE_SIZE) {
> +		if (rmi_granule_delegate(phys)) {
> +			struct rtt_entry rtt;
> +
> +			/*
> +			 * It's possible we raced with another VCPU on the same
> +			 * fault. If the entry exists and matches then exit
> +			 * early and assume the other VCPU will handle the
> +			 * mapping.
> +			 */
> +			if (rmi_rtt_read_entry(rd, ipa, RME_RTT_MAX_LEVEL, &rtt))
> +				goto err;
> +
> +			// FIXME: For a block mapping this could race at level
> +			// 2 or 3...
> +			if (WARN_ON((rtt.walk_level != RME_RTT_MAX_LEVEL ||
> +				     rtt.state != RMI_ASSIGNED ||
> +				     rtt.desc != phys))) {
> +				goto err;
> +			}
> +
> +			return 0;
> +		}
> +
> +		ret = rmi_data_create_unknown(phys, rd, ipa);
> +
> +		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> +			/* Create missing RTTs and retry */
> +			int level = RMI_RETURN_INDEX(ret);
> +
> +			ret = realm_create_rtt_levels(realm, ipa, level,
> +						      RME_RTT_MAX_LEVEL,
> +						      memcache);
> +			WARN_ON(ret);
> +			if (ret)
> +				goto err_undelegate;
> +
> +			ret = rmi_data_create_unknown(phys, rd, ipa);
> +		}
> +		WARN_ON(ret);
> +
> +		if (ret)
> +			goto err_undelegate;
> +
> +		phys += PAGE_SIZE;
> +		ipa += PAGE_SIZE;
> +	}
> +
> +	if (map_size == RME_L2_BLOCK_SIZE)
> +		ret = fold_rtt(rd, base_ipa, map_level, realm);
> +	if (WARN_ON(ret))
> +		goto err;
> +
> +	return 0;
> +
> +err_undelegate:
> +	if (WARN_ON(rmi_granule_undelegate(phys))) {
> +		/* Page can't be returned to NS world so is lost */
> +		get_page(phys_to_page(phys));
> +	}
> +err:
> +	while (size > 0) {
> +		phys -= PAGE_SIZE;
> +		size -= PAGE_SIZE;
> +		ipa -= PAGE_SIZE;
> +
> +		rmi_data_destroy(rd, ipa);
> +
> +		if (WARN_ON(rmi_granule_undelegate(phys))) {
> +			/* Page can't be returned to NS world so is lost */
> +			get_page(phys_to_page(phys));
> +		}
> +	}
> +	return -ENXIO;
> +}
> +

There seems to be no caller of the function above. Better to move it to the
related patch that introduces its caller.
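Also, one thing worth spelling out about the error paths above: once
rmi_granule_delegate() has succeeded, a page can only return to the normal
world via a successful rmi_granule_undelegate(). Schematically (just a
sketch of the ownership rule, not proposed code):

	static int delegate_example(struct page *page)
	{
		phys_addr_t phys = page_to_phys(page);

		if (rmi_granule_delegate(phys))
			return -ENXIO;	/* page still belongs to the NS world */

		/* ... page is now realm-owned (DELEGATED) and usable as
		 * a DATA/RTT granule ... */

		if (WARN_ON(rmi_granule_undelegate(phys))) {
			/*
			 * The page can never be handed back to the NS world;
			 * take an extra reference so it is leaked rather than
			 * reused.
			 */
			get_page(page);
		}

		return 0;
	}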
> +static int populate_par_region(struct kvm *kvm,
> +			       phys_addr_t ipa_base,
> +			       phys_addr_t ipa_end)
> +{
> +	struct realm *realm = &kvm->arch.realm;
> +	struct kvm_memory_slot *memslot;
> +	gfn_t base_gfn, end_gfn;
> +	int idx;
> +	phys_addr_t ipa;
> +	int ret = 0;
> +	struct page *tmp_page;
> +	phys_addr_t rd = virt_to_phys(realm->rd);
> +
> +	base_gfn = gpa_to_gfn(ipa_base);
> +	end_gfn = gpa_to_gfn(ipa_end);
> +
> +	idx = srcu_read_lock(&kvm->srcu);
> +	memslot = gfn_to_memslot(kvm, base_gfn);
> +	if (!memslot) {
> +		ret = -EFAULT;
> +		goto out;
> +	}
> +
> +	/* We require the region to be contained within a single memslot */
> +	if (memslot->base_gfn + memslot->npages < end_gfn) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	tmp_page = alloc_page(GFP_KERNEL);
> +	if (!tmp_page) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +
> +	mmap_read_lock(current->mm);
> +
> +	ipa = ipa_base;
> +
> +	while (ipa < ipa_end) {
> +		struct vm_area_struct *vma;
> +		unsigned long map_size;
> +		unsigned int vma_shift;
> +		unsigned long offset;
> +		unsigned long hva;
> +		struct page *page;
> +		kvm_pfn_t pfn;
> +		int level;
> +
> +		hva = gfn_to_hva_memslot(memslot, gpa_to_gfn(ipa));
> +		vma = vma_lookup(current->mm, hva);
> +		if (!vma) {
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		if (is_vm_hugetlb_page(vma))
> +			vma_shift = huge_page_shift(hstate_vma(vma));
> +		else
> +			vma_shift = PAGE_SHIFT;
> +
> +		map_size = 1 << vma_shift;
> +
> +		/*
> +		 * FIXME: This causes over mapping, but there's no good
> +		 * solution here with the ABI as it stands
> +		 */
> +		ipa = ALIGN_DOWN(ipa, map_size);
> +
> +		switch (map_size) {
> +		case RME_L2_BLOCK_SIZE:
> +			level = 2;
> +			break;
> +		case PAGE_SIZE:
> +			level = 3;
> +			break;
> +		default:
> +			WARN_ONCE(1, "Unsupport vma_shift %d", vma_shift);
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		pfn = gfn_to_pfn_memslot(memslot, gpa_to_gfn(ipa));
> +
> +		if (is_error_pfn(pfn)) {
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		ret = rmi_rtt_init_ripas(rd, ipa, level);
> +		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> +			ret = realm_create_rtt_levels(realm, ipa,
> +						      RMI_RETURN_INDEX(ret),
> +						      level, NULL);
> +			if (ret)
> +				break;
> +			ret = rmi_rtt_init_ripas(rd, ipa, level);
> +			if (ret) {
> +				ret = -ENXIO;
> +				break;
> +			}
> +		}
> +
> +		if (level < RME_RTT_MAX_LEVEL) {
> +			/*
> +			 * A temporary RTT is needed during the map, precreate
> +			 * it, however if there is an error (e.g. missing
> +			 * parent tables) this will be handled in the
> +			 * realm_create_protected_data_page() call.
> +			 */
> +			realm_create_rtt_levels(realm, ipa, level,
> +						RME_RTT_MAX_LEVEL, NULL);
> +		}
> +
> +		page = pfn_to_page(pfn);
> +
> +		for (offset = 0; offset < map_size && !ret;
> +		     offset += PAGE_SIZE, page++) {
> +			phys_addr_t page_ipa = ipa + offset;
> +
> +			ret = realm_create_protected_data_page(realm, page_ipa,
> +							       page, tmp_page);
> +		}
> +		if (ret)
> +			goto err_release_pfn;
> +
> +		if (level == 2) {
> +			ret = fold_rtt(rd, ipa, level, realm);
> +			if (ret)
> +				goto err_release_pfn;
> +		}
> +
> +		ipa += map_size;
> +		kvm_set_pfn_accessed(pfn);
> +		kvm_set_pfn_dirty(pfn);

kvm_release_pfn_dirty() already calls kvm_set_pfn_{accessed,dirty}(), so
these two calls are redundant.

> +		kvm_release_pfn_dirty(pfn);
> +err_release_pfn:
> +		if (ret) {
> +			kvm_release_pfn_clean(pfn);
> +			break;
> +		}
> +	}
> +
> +	mmap_read_unlock(current->mm);
> +	__free_page(tmp_page);
> +
> +out:
> +	srcu_read_unlock(&kvm->srcu, idx);
> +	return ret;
> +}
> +
> +static int kvm_populate_realm(struct kvm *kvm,
> +			      struct kvm_cap_arm_rme_populate_realm_args *args)
> +{
> +	phys_addr_t ipa_base, ipa_end;
> +

Better to check kvm_is_realm(kvm) here or in kvm_realm_enable_cap().

> +	if (kvm_realm_state(kvm) != REALM_STATE_NEW)
> +		return -EBUSY;

Maybe -EINVAL? The realm hasn't been created yet at this point
(RMI_REALM_CREATE has not been called), so userspace shouldn't reach this
path.
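I.e. something like the following (a sketch only, assuming the
kvm_is_realm() helper introduced earlier in the series):

	static int kvm_populate_realm(struct kvm *kvm,
				      struct kvm_cap_arm_rme_populate_realm_args *args)
	{
		phys_addr_t ipa_base, ipa_end;

		/* Reject non-realm VMs and realms in the wrong state */
		if (!kvm_is_realm(kvm) ||
		    kvm_realm_state(kvm) != REALM_STATE_NEW)
			return -EINVAL;

		/* ... alignment and range checks as in the patch ... */
	}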
> +
> +	if (!IS_ALIGNED(args->populate_ipa_base, PAGE_SIZE) ||
> +	    !IS_ALIGNED(args->populate_ipa_size, PAGE_SIZE))
> +		return -EINVAL;
> +
> +	ipa_base = args->populate_ipa_base;
> +	ipa_end = ipa_base + args->populate_ipa_size;
> +
> +	if (ipa_end < ipa_base)
> +		return -EINVAL;
> +
> +	return populate_par_region(kvm, ipa_base, ipa_end);
> +}
> +
>  static int set_ipa_state(struct kvm_vcpu *vcpu,
>  			 unsigned long ipa,
>  			 unsigned long end,
> @@ -748,6 +1102,18 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>  		r = kvm_init_ipa_range_realm(kvm, &args);
>  		break;
>  	}
> +	case KVM_CAP_ARM_RME_POPULATE_REALM: {
> +		struct kvm_cap_arm_rme_populate_realm_args args;
> +		void __user *argp = u64_to_user_ptr(cap->args[1]);
> +
> +		if (copy_from_user(&args, argp, sizeof(args))) {
> +			r = -EFAULT;
> +			break;
> +		}
> +
> +		r = kvm_populate_realm(kvm, &args);
> +		break;
> +	}
>  	default:
>  		r = -EINVAL;
>  		break;
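For reference, the userspace flow I'd expect here looks roughly like the
below (a sketch only; it assumes the KVM_CAP_ARM_RME multiplexing via
cap->args[0] used elsewhere in this series, and image_ipa/image_size are
placeholders for the VMM's load address and payload size):

	struct kvm_cap_arm_rme_populate_realm_args pop = {
		.populate_ipa_base = image_ipa,		/* page aligned */
		.populate_ipa_size = image_size,	/* page aligned */
	};
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_ARM_RME,
		.args[0] = KVM_CAP_ARM_RME_POPULATE_REALM,
		.args[1] = (__u64)&pop,
	};

	/* After the realm is created, before it is activated */
	if (ioctl(vm_fd, KVM_ENABLE_CAP, &cap) < 0)
		err(1, "KVM_ENABLE_CAP(POPULATE_REALM)");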