All of lore.kernel.org
 help / color / mirror / Atom feed
From: Quentin Perret <qperret@google.com>
To: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Fuad Tabba <tabba@google.com>,
	David Brazdil <dbrazdil@google.com>,
	linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu, linux-kernel@vger.kernel.org,
	kernel-team@android.com
Subject: Re: [PATCH 16/16] KVM: arm64: pkvm: Unshare guest structs during teardown
Date: Mon, 18 Oct 2021 11:32:13 +0100	[thread overview]
Message-ID: <YW1NLb9Pn9NyEYZF@google.com> (raw)
In-Reply-To: <87h7dhupfa.wl-maz@kernel.org>

On Saturday 16 Oct 2021 at 13:25:45 (+0100), Marc Zyngier wrote:
> At this stage, the old thread may have been destroyed and the memory
> recycled. What happens if, in the interval, that memory gets shared
> again in another context? My guts feeling is that either the sharing
> fails, or the unshare above breaks something else (the refcounting
> doesn't save us if the sharing is not with HYP).

Aha, yes, that's a good point. The problematic scenario would be: a vcpu
runs in the context of task A, then blocks. Then task A is destroyed,
but the vcpu isn't (possibly because e.g. userspace intends to run it in
the context of another task or something along those lines). But the
thread_info and fpsimd_state of task A remain shared with the hypervisor
until the next vcpu run, even though the memory has been freed by the
host, and is possibly reallocated to another guest in the meantime.

So yes, at this point sharing/donating this memory range with a new
guest will fail, and confuse the host massively :/

> In any case, I wonder whether we need a notification from the core
> code that a thread for which we have a HYP mapping is gone so that we
> can mop up the mapping at that point. That's similar to what we have
> for MMU notifiers and S2 PTs.
> 
> This is doable by hooking into fpsimd_release_task() and extending
> thread_struct to track the sharing with HYP.

I've been looking into this, but struggled to find a good way to avoid
all the races. Specifically, handling the case where a vcpu and the task
which last ran it get destroyed at the same time isn't that easy to
handle: you end up having to maintain pointers from the task to the vcpu
and vice versa, but I have no obvious idea to update them both in a
non-racy way (other than having a one big lock to serialize all
those changes, but that would mean serializing all task destructions so
that really doesn't scale).

Another option is to take a refcount on 'current' from
kvm_arch_vcpu_run_map_fp() before sharing thread-specific structs with
the hyp and release the refcount of the previous task after unsharing.
But that means we'll have to also drop the refcount when the vcpu
gets destroyed, as well as explicitly unshare at that point. Shouldn't
be too bad I think. Thoughts?

Thanks,
Quentin

WARNING: multiple messages have this Message-ID (diff)
From: Quentin Perret <qperret@google.com>
To: Marc Zyngier <maz@kernel.org>
Cc: kernel-team@android.com, Will Deacon <will@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH 16/16] KVM: arm64: pkvm: Unshare guest structs during teardown
Date: Mon, 18 Oct 2021 11:32:13 +0100	[thread overview]
Message-ID: <YW1NLb9Pn9NyEYZF@google.com> (raw)
In-Reply-To: <87h7dhupfa.wl-maz@kernel.org>

On Saturday 16 Oct 2021 at 13:25:45 (+0100), Marc Zyngier wrote:
> At this stage, the old thread may have been destroyed and the memory
> recycled. What happens if, in the interval, that memory gets shared
> again in another context? My guts feeling is that either the sharing
> fails, or the unshare above breaks something else (the refcounting
> doesn't save us if the sharing is not with HYP).

Aha, yes, that's a good point. The problematic scenario would be: a vcpu
runs in the context of task A, then blocks. Then task A is destroyed,
but the vcpu isn't (possibly because e.g. userspace intends to run it in
the context of another task or something along those lines). But the
thread_info and fpsimd_state of task A remain shared with the hypervisor
until the next vcpu run, even though the memory has been freed by the
host, and is possibly reallocated to another guest in the meantime.

So yes, at this point sharing/donating this memory range with a new
guest will fail, and confuse the host massively :/

> In any case, I wonder whether we need a notification from the core
> code that a thread for which we have a HYP mapping is gone so that we
> can mop up the mapping at that point. That's similar to what we have
> for MMU notifiers and S2 PTs.
> 
> This is doable by hooking into fpsimd_release_task() and extending
> thread_struct to track the sharing with HYP.

I've been looking into this, but struggled to find a good way to avoid
all the races. Specifically, handling the case where a vcpu and the task
which last ran it get destroyed at the same time isn't that easy to
handle: you end up having to maintain pointers from the task to the vcpu
and vice versa, but I have no obvious idea to update them both in a
non-racy way (other than having a one big lock to serialize all
those changes, but that would mean serializing all task destructions so
that really doesn't scale).

Another option is to take a refcount on 'current' from
kvm_arch_vcpu_run_map_fp() before sharing thread-specific structs with
the hyp and release the refcount of the previous task after unsharing.
But that means we'll have to also drop the refcount when the vcpu
gets destroyed, as well as explicitly unshare at that point. Shouldn't
be too bad I think. Thoughts?

Thanks,
Quentin
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

WARNING: multiple messages have this Message-ID (diff)
From: Quentin Perret <qperret@google.com>
To: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Fuad Tabba <tabba@google.com>,
	David Brazdil <dbrazdil@google.com>,
	linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu, linux-kernel@vger.kernel.org,
	kernel-team@android.com
Subject: Re: [PATCH 16/16] KVM: arm64: pkvm: Unshare guest structs during teardown
Date: Mon, 18 Oct 2021 11:32:13 +0100	[thread overview]
Message-ID: <YW1NLb9Pn9NyEYZF@google.com> (raw)
In-Reply-To: <87h7dhupfa.wl-maz@kernel.org>

On Saturday 16 Oct 2021 at 13:25:45 (+0100), Marc Zyngier wrote:
> At this stage, the old thread may have been destroyed and the memory
> recycled. What happens if, in the interval, that memory gets shared
> again in another context? My guts feeling is that either the sharing
> fails, or the unshare above breaks something else (the refcounting
> doesn't save us if the sharing is not with HYP).

Aha, yes, that's a good point. The problematic scenario would be: a vcpu
runs in the context of task A, then blocks. Then task A is destroyed,
but the vcpu isn't (possibly because e.g. userspace intends to run it in
the context of another task or something along those lines). But the
thread_info and fpsimd_state of task A remain shared with the hypervisor
until the next vcpu run, even though the memory has been freed by the
host, and is possibly reallocated to another guest in the meantime.

So yes, at this point sharing/donating this memory range with a new
guest will fail, and confuse the host massively :/

> In any case, I wonder whether we need a notification from the core
> code that a thread for which we have a HYP mapping is gone so that we
> can mop up the mapping at that point. That's similar to what we have
> for MMU notifiers and S2 PTs.
> 
> This is doable by hooking into fpsimd_release_task() and extending
> thread_struct to track the sharing with HYP.

I've been looking into this, but struggled to find a good way to avoid
all the races. Specifically, handling the case where a vcpu and the task
which last ran it get destroyed at the same time isn't that easy to
handle: you end up having to maintain pointers from the task to the vcpu
and vice versa, but I have no obvious idea to update them both in a
non-racy way (other than having a one big lock to serialize all
those changes, but that would mean serializing all task destructions so
that really doesn't scale).

Another option is to take a refcount on 'current' from
kvm_arch_vcpu_run_map_fp() before sharing thread-specific structs with
the hyp and release the refcount of the previous task after unsharing.
But that means we'll have to also drop the refcount when the vcpu
gets destroyed, as well as explicitly unshare at that point. Shouldn't
be too bad I think. Thoughts?

Thanks,
Quentin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-10-18 10:32 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-13 15:58 [PATCH 00/16] KVM: arm64: Implement unshare hypercall for pkvm Quentin Perret
2021-10-13 15:58 ` Quentin Perret
2021-10-13 15:58 ` Quentin Perret
2021-10-13 15:58 ` [PATCH 01/16] KVM: arm64: Introduce do_share() helper for memory sharing between components Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-15 15:11   ` Andrew Walbran
2021-10-15 15:11     ` Andrew Walbran
2021-10-15 15:11     ` Andrew Walbran
2021-10-19 10:37     ` Quentin Perret
2021-10-19 10:37       ` Quentin Perret
2021-10-19 10:37       ` Quentin Perret
2021-10-13 15:58 ` [PATCH 02/16] KVM: arm64: Implement __pkvm_host_share_hyp() using do_share() Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58 ` [PATCH 03/16] KVM: arm64: Avoid remapping the SVE state in the hyp stage-1 Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-16 11:04   ` Marc Zyngier
2021-10-16 11:04     ` Marc Zyngier
2021-10-16 11:04     ` Marc Zyngier
2021-10-18 10:36     ` Quentin Perret
2021-10-18 10:36       ` Quentin Perret
2021-10-18 10:36       ` Quentin Perret
2021-10-13 15:58 ` [PATCH 04/16] KVM: arm64: Introduce kvm_share_hyp() Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-17 10:41   ` Marc Zyngier
2021-10-17 10:41     ` Marc Zyngier
2021-10-17 10:41     ` Marc Zyngier
2021-10-18 10:51     ` Quentin Perret
2021-10-18 10:51       ` Quentin Perret
2021-10-18 10:51       ` Quentin Perret
2021-10-13 15:58 ` [PATCH 05/16] KVM: arm64: Accept page ranges in pkvm share hypercall Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58 ` [PATCH 06/16] KVM: arm64: Provide {get,put}_page() stubs for early hyp allocator Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58 ` [PATCH 07/16] KVM: arm64: Refcount hyp stage-1 pgtable pages Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58 ` [PATCH 08/16] KVM: arm64: Fixup hyp stage-1 refcount Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58 ` [PATCH 09/16] KVM: arm64: Hook up ->page_count() for hypervisor stage-1 page-table Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58 ` [PATCH 10/16] KVM: arm64: Implement kvm_pgtable_hyp_unmap() at EL2 Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58 ` [PATCH 11/16] KVM: arm64: Back hyp_vmemmap for all of memory Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58 ` [PATCH 12/16] KVM: arm64: Move hyp refcount helpers to header files Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58 ` [PATCH 13/16] KVM: arm64: Move double-sharing logic into hyp-specific function Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58 ` [PATCH 14/16] KVM: arm64: Refcount shared pages at EL2 Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58 ` [PATCH 15/16] KVM: arm64: pkvm: Introduce an unshare hypercall Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58 ` [PATCH 16/16] KVM: arm64: pkvm: Unshare guest structs during teardown Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-13 15:58   ` Quentin Perret
2021-10-16 12:25   ` Marc Zyngier
2021-10-16 12:25     ` Marc Zyngier
2021-10-16 12:25     ` Marc Zyngier
2021-10-18 10:32     ` Quentin Perret [this message]
2021-10-18 10:32       ` Quentin Perret
2021-10-18 10:32       ` Quentin Perret
2021-10-18 14:03       ` Quentin Perret
2021-10-18 14:03         ` Quentin Perret
2021-10-18 14:03         ` Quentin Perret
2021-10-18 17:12         ` Marc Zyngier
2021-10-18 17:12           ` Marc Zyngier
2021-10-18 17:12           ` Marc Zyngier
2021-10-19  9:40           ` Quentin Perret
2021-10-19  9:40             ` Quentin Perret
2021-10-19  9:40             ` Quentin Perret

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YW1NLb9Pn9NyEYZF@google.com \
    --to=qperret@google.com \
    --cc=alexandru.elisei@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=dbrazdil@google.com \
    --cc=james.morse@arm.com \
    --cc=kernel-team@android.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=suzuki.poulose@arm.com \
    --cc=tabba@google.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.