Date: Tue, 14 Mar 2023 17:31:21 +0200
From: Zhi Wang
To: Steven Price
Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, Catalin Marinas,
 Marc Zyngier, Will Deacon, James Morse, Oliver Upton, Suzuki K Poulose,
 Zenghui Yu, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, Joey Gouly, Alexandru Elisei,
 Christoffer Dall, Fuad Tabba, linux-coco@lists.linux.dev
Subject: Re: [RFC PATCH 16/28] arm64: RME: Allow populating initial contents
Message-ID: <20230314173121.00006cfb@gmail.com>
In-Reply-To: <6c6ff608-2314-a49a-84ee-fac883c4e227@arm.com>
References: <20230127112248.136810-1-suzuki.poulose@arm.com>
 <20230127112932.38045-1-steven.price@arm.com>
 <20230127112932.38045-17-steven.price@arm.com>
 <20230306193439.000048f2@gmail.com>
 <6c6ff608-2314-a49a-84ee-fac883c4e227@arm.com>

On Fri, 10 Mar 2023 15:47:16 +0000
Steven Price wrote:

> On 06/03/2023 17:34, Zhi Wang wrote:
> > On Fri, 27 Jan 2023 11:29:20 +0000
> > Steven Price wrote:
> >
> >> The VMM needs to populate the realm with some data before starting (e.g.
> >> a kernel and initrd). This is measured by the RMM and used as part of
> >> the attestation later on.
> >>
> >> Signed-off-by: Steven Price
> >> ---
> >>  arch/arm64/kvm/rme.c | 366 +++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 366 insertions(+)
> >>
> >> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> >> index 16e0bfea98b1..3405b43e1421 100644
> >> --- a/arch/arm64/kvm/rme.c
> >> +++ b/arch/arm64/kvm/rme.c
> >> @@ -4,6 +4,7 @@
> >>   */
> >>
> >>  #include
> >> +#include
> >>
> >>  #include
> >>  #include
> >> @@ -426,6 +427,359 @@ void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size)
> >>          }
> >>  }
> >>
> >> +static int realm_create_protected_data_page(struct realm *realm,
> >> +                                            unsigned long ipa,
> >> +                                            struct page *dst_page,
> >> +                                            struct page *tmp_page)
> >> +{
> >> +        phys_addr_t dst_phys, tmp_phys;
> >> +        int ret;
> >> +
> >> +        copy_page(page_address(tmp_page), page_address(dst_page));
> >> +
> >> +        dst_phys = page_to_phys(dst_page);
> >> +        tmp_phys = page_to_phys(tmp_page);
> >> +
> >> +        if (rmi_granule_delegate(dst_phys))
> >> +                return -ENXIO;
> >> +
> >> +        ret = rmi_data_create(dst_phys, virt_to_phys(realm->rd), ipa, tmp_phys,
> >> +                              RMI_MEASURE_CONTENT);
> >> +
> >> +        if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> >> +                /* Create missing RTTs and retry */
> >> +                int level = RMI_RETURN_INDEX(ret);
> >> +
> >> +                ret = realm_create_rtt_levels(realm, ipa, level,
> >> +                                              RME_RTT_MAX_LEVEL, NULL);
> >> +                if (ret)
> >> +                        goto err;
> >> +
> >> +                ret = rmi_data_create(dst_phys, virt_to_phys(realm->rd), ipa,
> >> +                                      tmp_phys, RMI_MEASURE_CONTENT);
> >> +        }
> >> +
> >> +        if (ret)
> >> +                goto err;
> >> +
> >> +        return 0;
> >> +
> >> +err:
> >> +        if (WARN_ON(rmi_granule_undelegate(dst_phys))) {
> >> +                /* Page can't be returned to NS world so is lost */
> >> +                get_page(dst_page);
> >> +        }
> >> +        return -ENXIO;
> >> +}
> >> +
> >> +static int fold_rtt(phys_addr_t rd, unsigned long addr, int level,
> >> +                    struct realm *realm)
> >> +{
> >> +        struct rtt_entry rtt;
> >> +        phys_addr_t rtt_addr;
> >> +
> >> +        if (rmi_rtt_read_entry(rd, addr, level, &rtt))
> >> +                return -ENXIO;
> >> +
> >> +        if (rtt.state != RMI_TABLE)
> >> +                return -EINVAL;
> >> +
> >> +        rtt_addr = rmi_rtt_get_phys(&rtt);
> >> +        if (rmi_rtt_fold(rtt_addr, rd, addr, level + 1))
> >> +                return -ENXIO;
> >> +
> >> +        free_delegated_page(realm, rtt_addr);
> >> +
> >> +        return 0;
> >> +}
> >> +
> >> +int realm_map_protected(struct realm *realm,
> >> +                        unsigned long hva,
> >> +                        unsigned long base_ipa,
> >> +                        struct page *dst_page,
> >> +                        unsigned long map_size,
> >> +                        struct kvm_mmu_memory_cache *memcache)
> >> +{
> >> +        phys_addr_t dst_phys = page_to_phys(dst_page);
> >> +        phys_addr_t rd = virt_to_phys(realm->rd);
> >> +        unsigned long phys = dst_phys;
> >> +        unsigned long ipa = base_ipa;
> >> +        unsigned long size;
> >> +        int map_level;
> >> +        int ret = 0;
> >> +
> >> +        if (WARN_ON(!IS_ALIGNED(ipa, map_size)))
> >> +                return -EINVAL;
> >> +
> >> +        switch (map_size) {
> >> +        case PAGE_SIZE:
> >> +                map_level = 3;
> >> +                break;
> >> +        case RME_L2_BLOCK_SIZE:
> >> +                map_level = 2;
> >> +                break;
> >> +        default:
> >> +                return -EINVAL;
> >> +        }
> >> +
> >> +        if (map_level < RME_RTT_MAX_LEVEL) {
> >> +                /*
> >> +                 * A temporary RTT is needed during the map, precreate it,
> >> +                 * however if there is an error (e.g. missing parent tables)
> >> +                 * this will be handled below.
> >> +                 */
> >> +                realm_create_rtt_levels(realm, ipa, map_level,
> >> +                                        RME_RTT_MAX_LEVEL, memcache);
> >> +        }
> >> +
> >> +        for (size = 0; size < map_size; size += PAGE_SIZE) {
> >> +                if (rmi_granule_delegate(phys)) {
> >> +                        struct rtt_entry rtt;
> >> +
> >> +                        /*
> >> +                         * It's possible we raced with another VCPU on the same
> >> +                         * fault. If the entry exists and matches then exit
> >> +                         * early and assume the other VCPU will handle the
> >> +                         * mapping.
> >> +                         */
> >> +                        if (rmi_rtt_read_entry(rd, ipa, RME_RTT_MAX_LEVEL, &rtt))
> >> +                                goto err;
> >> +
> >> +                        // FIXME: For a block mapping this could race at level
> >> +                        // 2 or 3...
> >> +                        if (WARN_ON((rtt.walk_level != RME_RTT_MAX_LEVEL ||
> >> +                                     rtt.state != RMI_ASSIGNED ||
> >> +                                     rtt.desc != phys))) {
> >> +                                goto err;
> >> +                        }
> >> +
> >> +                        return 0;
> >> +                }
> >> +
> >> +                ret = rmi_data_create_unknown(phys, rd, ipa);
> >> +
> >> +                if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> >> +                        /* Create missing RTTs and retry */
> >> +                        int level = RMI_RETURN_INDEX(ret);
> >> +
> >> +                        ret = realm_create_rtt_levels(realm, ipa, level,
> >> +                                                      RME_RTT_MAX_LEVEL,
> >> +                                                      memcache);
> >> +                        WARN_ON(ret);
> >> +                        if (ret)
> >> +                                goto err_undelegate;
> >> +
> >> +                        ret = rmi_data_create_unknown(phys, rd, ipa);
> >> +                }
> >> +                WARN_ON(ret);
> >> +
> >> +                if (ret)
> >> +                        goto err_undelegate;
> >> +
> >> +                phys += PAGE_SIZE;
> >> +                ipa += PAGE_SIZE;
> >> +        }
> >> +
> >> +        if (map_size == RME_L2_BLOCK_SIZE)
> >> +                ret = fold_rtt(rd, base_ipa, map_level, realm);
> >> +        if (WARN_ON(ret))
> >> +                goto err;
> >> +
> >> +        return 0;
> >> +
> >> +err_undelegate:
> >> +        if (WARN_ON(rmi_granule_undelegate(phys))) {
> >> +                /* Page can't be returned to NS world so is lost */
> >> +                get_page(phys_to_page(phys));
> >> +        }
> >> +err:
> >> +        while (size > 0) {
> >> +                phys -= PAGE_SIZE;
> >> +                size -= PAGE_SIZE;
> >> +                ipa -= PAGE_SIZE;
> >> +
> >> +                rmi_data_destroy(rd, ipa);
> >> +
> >> +                if (WARN_ON(rmi_granule_undelegate(phys))) {
> >> +                        /* Page can't be returned to NS world so is lost */
> >> +                        get_page(phys_to_page(phys));
> >> +                }
> >> +        }
> >> +        return -ENXIO;
> >> +}
> >> +
> >
> > There seems no caller to the function above. Better move it to the related
> > patch.
>
> Indeed this should really be in the next patch - will move as it's very
> confusing having it in this patch (sorry about that).
>
> >> +static int populate_par_region(struct kvm *kvm,
> >> +                               phys_addr_t ipa_base,
> >> +                               phys_addr_t ipa_end)
> >> +{
> >> +        struct realm *realm = &kvm->arch.realm;
> >> +        struct kvm_memory_slot *memslot;
> >> +        gfn_t base_gfn, end_gfn;
> >> +        int idx;
> >> +        phys_addr_t ipa;
> >> +        int ret = 0;
> >> +        struct page *tmp_page;
> >> +        phys_addr_t rd = virt_to_phys(realm->rd);
> >> +
> >> +        base_gfn = gpa_to_gfn(ipa_base);
> >> +        end_gfn = gpa_to_gfn(ipa_end);
> >> +
> >> +        idx = srcu_read_lock(&kvm->srcu);
> >> +        memslot = gfn_to_memslot(kvm, base_gfn);
> >> +        if (!memslot) {
> >> +                ret = -EFAULT;
> >> +                goto out;
> >> +        }
> >> +
> >> +        /* We require the region to be contained within a single memslot */
> >> +        if (memslot->base_gfn + memslot->npages < end_gfn) {
> >> +                ret = -EINVAL;
> >> +                goto out;
> >> +        }
> >> +
> >> +        tmp_page = alloc_page(GFP_KERNEL);
> >> +        if (!tmp_page) {
> >> +                ret = -ENOMEM;
> >> +                goto out;
> >> +        }
> >> +
> >> +        mmap_read_lock(current->mm);
> >> +
> >> +        ipa = ipa_base;
> >> +
> >> +        while (ipa < ipa_end) {
> >> +                struct vm_area_struct *vma;
> >> +                unsigned long map_size;
> >> +                unsigned int vma_shift;
> >> +                unsigned long offset;
> >> +                unsigned long hva;
> >> +                struct page *page;
> >> +                kvm_pfn_t pfn;
> >> +                int level;
> >> +
> >> +                hva = gfn_to_hva_memslot(memslot, gpa_to_gfn(ipa));
> >> +                vma = vma_lookup(current->mm, hva);
> >> +                if (!vma) {
> >> +                        ret = -EFAULT;
> >> +                        break;
> >> +                }
> >> +
> >> +                if (is_vm_hugetlb_page(vma))
> >> +                        vma_shift = huge_page_shift(hstate_vma(vma));
> >> +                else
> >> +                        vma_shift = PAGE_SHIFT;
> >> +
> >> +                map_size = 1 << vma_shift;
> >> +
> >> +                /*
> >> +                 * FIXME: This causes over mapping, but there's no good
> >> +                 * solution here with the ABI as it stands
> >> +                 */
> >> +                ipa = ALIGN_DOWN(ipa, map_size);
> >> +
> >> +                switch (map_size) {
> >> +                case RME_L2_BLOCK_SIZE:
> >> +                        level = 2;
> >> +                        break;
> >> +                case PAGE_SIZE:
> >> +                        level = 3;
> >> +                        break;
> >> +                default:
> >> +                        WARN_ONCE(1, "Unsupport vma_shift %d", vma_shift);
> >> +                        ret = -EFAULT;
> >> +                        break;
> >> +                }
> >> +
> >> +                pfn = gfn_to_pfn_memslot(memslot, gpa_to_gfn(ipa));
> >> +
> >> +                if (is_error_pfn(pfn)) {
> >> +                        ret = -EFAULT;
> >> +                        break;
> >> +                }
> >> +
> >> +                ret = rmi_rtt_init_ripas(rd, ipa, level);
> >> +                if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> >> +                        ret = realm_create_rtt_levels(realm, ipa,
> >> +                                                      RMI_RETURN_INDEX(ret),
> >> +                                                      level, NULL);
> >> +                        if (ret)
> >> +                                break;
> >> +                        ret = rmi_rtt_init_ripas(rd, ipa, level);
> >> +                        if (ret) {
> >> +                                ret = -ENXIO;
> >> +                                break;
> >> +                        }
> >> +                }
> >> +
> >> +                if (level < RME_RTT_MAX_LEVEL) {
> >> +                        /*
> >> +                         * A temporary RTT is needed during the map, precreate
> >> +                         * it, however if there is an error (e.g. missing
> >> +                         * parent tables) this will be handled in the
> >> +                         * realm_create_protected_data_page() call.
> >> +                         */
> >> +                        realm_create_rtt_levels(realm, ipa, level,
> >> +                                                RME_RTT_MAX_LEVEL, NULL);
> >> +                }
> >> +
> >> +                page = pfn_to_page(pfn);
> >> +
> >> +                for (offset = 0; offset < map_size && !ret;
> >> +                     offset += PAGE_SIZE, page++) {
> >> +                        phys_addr_t page_ipa = ipa + offset;
> >> +
> >> +                        ret = realm_create_protected_data_page(realm, page_ipa,
> >> +                                                               page, tmp_page);
> >> +                }
> >> +                if (ret)
> >> +                        goto err_release_pfn;
> >> +
> >> +                if (level == 2) {
> >> +                        ret = fold_rtt(rd, ipa, level, realm);
> >> +                        if (ret)
> >> +                                goto err_release_pfn;
> >> +                }
> >> +
> >> +                ipa += map_size;
> >
> >> +                kvm_set_pfn_accessed(pfn);
> >> +                kvm_set_pfn_dirty(pfn);
> >
> > kvm_release_pfn_dirty() has already called kvm_set_pfn_{accessed, dirty}().
>
> Will remove those calls.
>
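(For reference, unless it has changed very recently, the helper in
virt/kvm/kvm_main.c is simply:

void kvm_release_pfn_dirty(kvm_pfn_t pfn)
{
        kvm_set_pfn_dirty(pfn);
        kvm_release_pfn_clean(pfn);
}

and the release path also marks the page accessed, so dropping the two
explicit calls above should keep the behavior the same.)
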
> >> +                kvm_release_pfn_dirty(pfn);
> >> +err_release_pfn:
> >> +                if (ret) {
> >> +                        kvm_release_pfn_clean(pfn);
> >> +                        break;
> >> +                }
> >> +        }
> >> +
> >> +        mmap_read_unlock(current->mm);
> >> +        __free_page(tmp_page);
> >> +
> >> +out:
> >> +        srcu_read_unlock(&kvm->srcu, idx);
> >> +        return ret;
> >> +}
> >> +
> >> +static int kvm_populate_realm(struct kvm *kvm,
> >> +                              struct kvm_cap_arm_rme_populate_realm_args *args)
> >> +{
> >> +        phys_addr_t ipa_base, ipa_end;
> >> +
> >
> > Check kvm_is_realm(kvm) here or in the kvm_realm_enable_cap().
>
> I'm going to update kvm_vm_ioctl_enable_cap() to check kvm_is_realm() so
> we won't get here.
>
> >> +        if (kvm_realm_state(kvm) != REALM_STATE_NEW)
> >> +                return -EBUSY;
> >
> > Maybe -EINVAL? The realm hasn't been created (RMI_REALM_CREATE is not called
> > yet). The userspace shouldn't reach this path.
>
> Well user space can attempt to populate in the ACTIVE state - which is
> where the idea of 'busy' comes from. Admittedly it's a little confusing
> when RMI_REALM_CREATE hasn't been called.
>
> I'm not particularly bothered about the return code, but it's useful to
> have a different code to -EINVAL as it's not an invalid argument, but
> calling at the wrong time. I can't immediately see a better error code
> though.
>

The reason why I feel -EBUSY is a little bit off is that EBUSY usually
indicates something has already been initialized and is currently running,
and another calling path then wants to operate on it.

I took a look at the ioctls in arch/arm64/kvm/arm.c. It seems people have
different opinions on the error code for an ioctl called at the wrong
time. For example:

long kvm_arch_vcpu_ioctl()
...
        case KVM_GET_REG_LIST: {
                struct kvm_reg_list __user *user_list = argp;
                struct kvm_reg_list reg_list;
                unsigned n;

                r = -ENOEXEC;
                if (unlikely(!kvm_vcpu_initialized(vcpu)))
                        break;

                r = -EPERM;
                if (!kvm_arm_vcpu_is_finalized(vcpu))
                        break;

If we have to choose one of these, I prefer -ENOEXEC, as -EPERM seems even
stranger here. But personally my vote goes to -EINVAL.
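To make it concrete, something like the sketch below is what I have in
mind (just an illustration -- with your kvm_vm_ioctl_enable_cap() change
the kvm_is_realm() check would live there instead):

static int kvm_populate_realm(struct kvm *kvm,
                              struct kvm_cap_arm_rme_populate_realm_args *args)
{
        phys_addr_t ipa_base, ipa_end;

        /* Not a realm VM at all: this path should not be reachable */
        if (!kvm_is_realm(kvm))
                return -ENOEXEC;

        /* Realm is past the point where populating makes sense */
        if (kvm_realm_state(kvm) != REALM_STATE_NEW)
                return -EINVAL;
        ...
}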
> Steve
>
> >> +
> >> +        if (!IS_ALIGNED(args->populate_ipa_base, PAGE_SIZE) ||
> >> +            !IS_ALIGNED(args->populate_ipa_size, PAGE_SIZE))
> >> +                return -EINVAL;
> >> +
> >> +        ipa_base = args->populate_ipa_base;
> >> +        ipa_end = ipa_base + args->populate_ipa_size;
> >> +
> >> +        if (ipa_end < ipa_base)
> >> +                return -EINVAL;
> >> +
> >> +        return populate_par_region(kvm, ipa_base, ipa_end);
> >> +}
> >> +
> >>  static int set_ipa_state(struct kvm_vcpu *vcpu,
> >>                           unsigned long ipa,
> >>                           unsigned long end,
> >> @@ -748,6 +1102,18 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
> >>                  r = kvm_init_ipa_range_realm(kvm, &args);
> >>                  break;
> >>          }
> >> +        case KVM_CAP_ARM_RME_POPULATE_REALM: {
> >> +                struct kvm_cap_arm_rme_populate_realm_args args;
> >> +                void __user *argp = u64_to_user_ptr(cap->args[1]);
> >> +
> >> +                if (copy_from_user(&args, argp, sizeof(args))) {
> >> +                        r = -EFAULT;
> >> +                        break;
> >> +                }
> >> +
> >> +                r = kvm_populate_realm(kvm, &args);
> >> +                break;
> >> +        }
> >>          default:
> >>                  r = -EINVAL;
> >>                  break;
> >
>
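BTW, to double-check my understanding of the ABI: based purely on the hunk
above (and assuming cap->args[0] carries the RME sub-command, as in the
other cases of kvm_realm_enable_cap()), the VMM side would look roughly
like this:

        struct kvm_cap_arm_rme_populate_realm_args populate_args = {
                .populate_ipa_base = base,   /* page-aligned IPA */
                .populate_ipa_size = size,   /* page-aligned size */
        };
        struct kvm_enable_cap cap = {
                .cap = KVM_CAP_ARM_RME,
                .args[0] = KVM_CAP_ARM_RME_POPULATE_REALM,
                .args[1] = (__u64)&populate_args,
        };

        ioctl(vm_fd, KVM_ENABLE_CAP, &cap);

with the whole [base, base + size) range contained in a single memslot and
the call made before the realm is activated.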