linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Steven Price <steven.price@arm.com>
To: Marc Zyngier <maz@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,  James Morse <james.morse@arm.com>,
	Julien Thierry <julien.thierry.kdev@gmail.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Dave Martin <Dave.Martin@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Richard Henderson <richard.henderson@linaro.org>,
	Peter Maydell <peter.maydell@linaro.org>,
	Haibo Xu <Haibo.Xu@arm.com>, Andrew Jones <drjones@redhat.com>
Subject: Re: [PATCH v9 5/6] KVM: arm64: ioctl to fetch/store tags in a guest
Date: Wed, 10 Mar 2021 16:47:56 +0000	[thread overview]
Message-ID: <9f90acc8-54b1-460a-dc25-d2d84e70c9d1@arm.com> (raw)
In-Reply-To: <87ft14xl9e.wl-maz@kernel.org>

On 09/03/2021 17:57, Marc Zyngier wrote:
> On Mon, 01 Mar 2021 14:23:14 +0000,
> Steven Price <steven.price@arm.com> wrote:
>>
>> The VMM may not wish to have it's own mapping of guest memory mapped
>> with PROT_MTE because this causes problems if the VMM has tag checking
>> enabled (the guest controls the tags in physical RAM and it's unlikely
>> the tags are correct for the VMM).
>>
>> Instead add a new ioctl which allows the VMM to easily read/write the
>> tags from guest memory, allowing the VMM's mapping to be non-PROT_MTE
>> while the VMM can still read/write the tags for the purpose of
>> migration.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>   arch/arm64/include/uapi/asm/kvm.h | 13 +++++++
>>   arch/arm64/kvm/arm.c              | 57 +++++++++++++++++++++++++++++++
>>   include/uapi/linux/kvm.h          |  1 +
>>   3 files changed, 71 insertions(+)
>>
>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
>> index 24223adae150..5fc2534ac5df 100644
>> --- a/arch/arm64/include/uapi/asm/kvm.h
>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>> @@ -184,6 +184,19 @@ struct kvm_vcpu_events {
>>   	__u32 reserved[12];
>>   };
>>   
>> +struct kvm_arm_copy_mte_tags {
>> +	__u64 guest_ipa;
>> +	__u64 length;
>> +	union {
>> +		void __user *addr;
>> +		__u64 padding;
>> +	};
>> +	__u64 flags;
> 
> I'd be keen on a couple of reserved __64s. Just in case...

Fair enough, I'll add a __u64 reserved[2];

>> +};
>> +
>> +#define KVM_ARM_TAGS_TO_GUEST		0
>> +#define KVM_ARM_TAGS_FROM_GUEST		1
>> +
>>   /* If you need to interpret the index values, here is the key: */
>>   #define KVM_REG_ARM_COPROC_MASK		0x000000000FFF0000
>>   #define KVM_REG_ARM_COPROC_SHIFT	16
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index 46bf319f6cb7..01d404833e24 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -1297,6 +1297,53 @@ static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
>>   	}
>>   }
>>   
>> +static int kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
>> +				      struct kvm_arm_copy_mte_tags *copy_tags)
>> +{
>> +	gpa_t guest_ipa = copy_tags->guest_ipa;
>> +	size_t length = copy_tags->length;
>> +	void __user *tags = copy_tags->addr;
>> +	gpa_t gfn;
>> +	bool write = !(copy_tags->flags & KVM_ARM_TAGS_FROM_GUEST);
>> +
>> +	if (copy_tags->flags & ~KVM_ARM_TAGS_FROM_GUEST)
>> +		return -EINVAL;
>> +
>> +	if (length & ~PAGE_MASK || guest_ipa & ~PAGE_MASK)
>> +		return -EINVAL;
> 
> It is a bit odd to require userspace to provide a page-aligned
> addr/size, as it now has to find out about the kernel's page
> size. MTE_GRANULE_SIZE-aligned values would make more sense. Is there
> an underlying reason for this?

No fundamental reason, my thoughts were:

  * It's likely user space is naturally going to be using page-aligned 
quantities during migration, so it already has to care about this.

  * It makes the loop below easier.

  * It's easy to relax the restriction in the future if it becomes a 
problem, much harder to tighten it without breaking anything.

But I can switch to MTE_GRANULE_SIZE if you'd prefer, let me know.

>> +
>> +	gfn = gpa_to_gfn(guest_ipa);
>> +
>> +	while (length > 0) {
>> +		kvm_pfn_t pfn = gfn_to_pfn_prot(kvm, gfn, write, NULL);
>> +		void *maddr;
>> +		unsigned long num_tags = PAGE_SIZE / MTE_GRANULE_SIZE;
>> +
>> +		if (is_error_noslot_pfn(pfn))
>> +			return -ENOENT;
>> +
>> +		maddr = page_address(pfn_to_page(pfn));
>> +
>> +		if (!write) {
>> +			num_tags = mte_copy_tags_to_user(tags, maddr, num_tags);
>> +			kvm_release_pfn_clean(pfn);
>> +		} else {
>> +			num_tags = mte_copy_tags_from_user(maddr, tags,
>> +							   num_tags);
>> +			kvm_release_pfn_dirty(pfn);
>> +		}
>> +
> 
> Is it actually safe to do this without holding any lock, without
> checking anything against the mmu_notifier_seq? What if the pages are
> being swapped out? Or the memslot removed from under your feet?
> 
> It looks... dangerous. Do you even want to allow this while vcpus are
> actually running?

Umm... yeah I'm not sure how I managed to forgot the locks. This should 
be holding kvm->slots_lock to prevent the slot going under our feet. I 
was surprised that lockdep didn't catch that, until I noticed I'd 
disabled it and discovered why (the model makes it incredibly slow). 
However I've done a run with it enabled now - and with the 
kvm->slots_lock taken it's happy.

gfn_to_pfn_prot() internally calls a variant of get_user_pages() - so 
swapping out shouldn't be a problem.

In terms of running with the vcpus running - given this is going to be 
used for migration I think that's pretty much a requirement. We want to 
be able to dump the tags while executing to enable early transfer of the 
memory.

Steve

>> +		if (num_tags != PAGE_SIZE / MTE_GRANULE_SIZE)
>> +			return -EFAULT;
>> +
>> +		gfn++;
>> +		tags += num_tags;
>> +		length -= PAGE_SIZE;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>   long kvm_arch_vm_ioctl(struct file *filp,
>>   		       unsigned int ioctl, unsigned long arg)
>>   {
>> @@ -1333,6 +1380,16 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>   
>>   		return 0;
>>   	}
>> +	case KVM_ARM_MTE_COPY_TAGS: {
>> +		struct kvm_arm_copy_mte_tags copy_tags;
>> +
>> +		if (!kvm_has_mte(kvm))
>> +			return -EINVAL;
>> +
>> +		if (copy_from_user(&copy_tags, argp, sizeof(copy_tags)))
>> +			return -EFAULT;
>> +		return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
>> +	}
>>   	default:
>>   		return -EINVAL;
>>   	}
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 05618a4abf7e..b75af0f9ba55 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -1423,6 +1423,7 @@ struct kvm_s390_ucas_mapping {
>>   /* Available with KVM_CAP_PMU_EVENT_FILTER */
>>   #define KVM_SET_PMU_EVENT_FILTER  _IOW(KVMIO,  0xb2, struct kvm_pmu_event_filter)
>>   #define KVM_PPC_SVM_OFF		  _IO(KVMIO,  0xb3)
>> +#define KVM_ARM_MTE_COPY_TAGS	  _IOR(KVMIO,  0xb4, struct kvm_arm_copy_mte_tags)
>>   
>>   /* ioctl for vm fd */
>>   #define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
>> -- 
>> 2.20.1
>>
>>
> 
> Thanks,
> 
> 	M.
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-03-10 16:49 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-01 14:23 [PATCH v9 0/6] MTE support for KVM guest Steven Price
2021-03-01 14:23 ` [PATCH v9 1/6] arm64: mte: Sync tags for pages where PTE is untagged Steven Price
2021-03-01 14:23 ` [PATCH v9 2/6] arm64: kvm: Introduce MTE VM feature Steven Price
2021-03-09 17:06   ` Marc Zyngier
2021-03-10 14:52     ` Steven Price
2021-03-01 14:23 ` [PATCH v9 3/6] arm64: kvm: Save/restore MTE registers Steven Price
2021-03-09 17:27   ` Marc Zyngier
2021-03-10 14:53     ` Steven Price
2021-03-01 14:23 ` [PATCH v9 4/6] arm64: kvm: Expose KVM_ARM_CAP_MTE Steven Price
2021-03-01 14:23 ` [PATCH v9 5/6] KVM: arm64: ioctl to fetch/store tags in a guest Steven Price
2021-03-09 17:57   ` Marc Zyngier
2021-03-10 16:47     ` Steven Price [this message]
2021-03-01 14:23 ` [PATCH v9 6/6] KVM: arm64: Document MTE capability and ioctl Steven Price
2021-03-09 11:01   ` Peter Maydell
2021-03-11 12:35     ` Steven Price

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9f90acc8-54b1-460a-dc25-d2d84e70c9d1@arm.com \
    --to=steven.price@arm.com \
    --cc=Dave.Martin@arm.com \
    --cc=Haibo.Xu@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=dgilbert@redhat.com \
    --cc=drjones@redhat.com \
    --cc=james.morse@arm.com \
    --cc=julien.thierry.kdev@gmail.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=richard.henderson@linaro.org \
    --cc=suzuki.poulose@arm.com \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).