From: Matthew Rosato <mjrosato@linux.ibm.com>
To: Tony Krowiak <akrowiak@linux.ibm.com>,
	linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org
Cc: jjherne@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com,
	pasic@linux.ibm.com, pbonzini@redhat.com, frankja@linux.ibm.com,
	imbrenda@linux.ibm.com, david@redhat.com
Subject: Re: [RFC] kvm: reverse call order of kvm_arch_destroy_vm() and kvm_destroy_devices()
Date: Tue, 5 Jul 2022 15:30:26 -0400
Message-ID: <c4062e02-4b35-e130-b653-e467bef2eb4f@linux.ibm.com>
In-Reply-To: <20220705185430.499688-1-akrowiak@linux.ibm.com>

On 7/5/22 2:54 PM, Tony Krowiak wrote:
> There is a new requirement for s390 secure execution guests that the
> hypervisor ensures all AP queues are reset and disassociated from the
> KVM guest before the secure configuration is torn down. It is the
> responsibility of the vfio_ap device driver to handle this.
> 
> Prior to commit ("vfio: remove VFIO_GROUP_NOTIFY_SET_KVM"),
> the driver reset all AP queues passed through to a KVM guest when notified
> that the KVM pointer was being set to NULL. Subsequently, the AP queues
> are only reset when the fd for the mediated device used to pass the queues
> through to the guest is closed (the vfio_ap_mdev_close_device() callback).
> This is not a problem when userspace is well-behaved and uses the
> KVM_DEV_VFIO_GROUP_DEL attribute to remove the VFIO group; however, if
> userspace for some reason does not close the mdev fd, a secure execution
> guest will tear down its configuration before the AP queues are
> reset because the teardown is done in the kvm_arch_destroy_vm function
> which is invoked prior to kvm_destroy_devices.

To clarify, even before "vfio: remove VFIO_GROUP_NOTIFY_SET_KVM" if 
userspace did not delete the group via KVM_DEV_VFIO_GROUP_DEL then the 
old callback would also not have been triggered until 
kvm_destroy_devices() anyway (the callback would have been triggered 
with a NULL kvm pointer via a call from kvm_vfio_destroy(), triggered 
from kvm_destroy_devices()).

My point being: this behavior did not start with "vfio: remove 
VFIO_GROUP_NOTIFY_SET_KVM", that patch just removed the notifier since 
both actions always took place at device open/close time anyway.  So if 
destroying the devices before the vm isn't doable, a new 
notifier/whatever that sets the KVM association to NULL would also have 
to happen at an earlier point in time than VFIO_GROUP_NOTIFY_SET_KVM did 
(and should maybe be something that is optional/opt-in and used only by 
vfio drivers that need it to clean up a KVM association at a point prior 
to the device being destroyed).  There should still be no need for any 
sort of notifier to set the (non-NULL) KVM association as it's already 
associated with the vfio group before device_open.

But let's first see if anyone can shed some understanding on the 
ordering between kvm_arch_destroy_vm and kvm_destroy_devices...
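
To make the ordering question concrete, here is a rough userspace toy model, not kernel code. Everything in it is hypothetical: the toy_* functions, struct toy_vm and the event strings are made up for illustration and only borrow the shape of kvm_destroy_vm(). It also assumes, as the RFC does, that destroying the devices is what ends up triggering the AP queue reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these functions exist in the kernel. */
#define MAX_EVENTS 8

struct toy_vm {
	bool queues_reset;              /* AP queues already reset? */
	const char *events[MAX_EVENTS]; /* teardown steps, in call order */
	size_t count;
};

static void log_event(struct toy_vm *vm, const char *name)
{
	assert(vm->count < MAX_EVENTS);
	vm->events[vm->count++] = name;
}

/* Stand-in for the vfio_ap reset path; resets at most once. */
static void toy_reset_ap_queues(struct toy_vm *vm)
{
	if (!vm->queues_reset) {
		vm->queues_reset = true;
		log_event(vm, "reset_ap_queues");
	}
}

/* Well-behaved userspace: models KVM_DEV_VFIO_GROUP_DEL before teardown. */
static void toy_group_del(struct toy_vm *vm)
{
	toy_reset_ap_queues(vm);
}

/* Models kvm_arch_destroy_vm() tearing down the secure configuration. */
static void toy_arch_destroy_vm(struct toy_vm *vm)
{
	log_event(vm, "secure_config_teardown");
}

/* Models kvm_destroy_devices(); assumed to lead to the queue reset. */
static void toy_destroy_devices(struct toy_vm *vm)
{
	toy_reset_ap_queues(vm);
}

/* devices_first == false is the current order, true is the RFC order. */
static void toy_destroy_vm(struct toy_vm *vm, bool devices_first)
{
	if (devices_first) {
		toy_destroy_devices(vm);
		toy_arch_destroy_vm(vm);
	} else {
		toy_arch_destroy_vm(vm);
		toy_destroy_devices(vm);
	}
}

static int event_index(const struct toy_vm *vm, const char *name)
{
	for (size_t i = 0; i < vm->count; i++)
		if (strcmp(vm->events[i], name) == 0)
			return (int)i;
	return -1;
}

/* True if the queues were reset before the secure config went away. */
static bool reset_precedes_teardown(const struct toy_vm *vm)
{
	int reset = event_index(vm, "reset_ap_queues");
	int teardown = event_index(vm, "secure_config_teardown");

	return reset >= 0 && teardown >= 0 && reset < teardown;
}
```

In this model, calling toy_destroy_vm() with devices_first == false and no prior toy_group_del() leaves the reset after the secure teardown, which is the problem case. Either calling toy_group_del() first or passing devices_first == true puts the reset ahead of it. It says nothing about whether other architectures depend on the current order, which is the open question here.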

> 
> This patch proposes a simple solution; rather than introducing a new
> notifier into vfio or callback into KVM, what about reversing the order
> in which kvm_arch_destroy_vm and kvm_destroy_devices are called? In
> some very limited testing (i.e., the automated regression tests for
> the vfio_ap device driver) this did not seem to cause any problems.
> 
> The question remains, is there a good technical reason why the VM
> is destroyed before the devices it is using? This is not intuitive, so
> this is a request for comments on this proposed patch. The assumption
> here is that the mdev fd will get closed when the devices are destroyed.
> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>   virt/kvm/kvm_main.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index a49df8988cd6..edaf2918be9b 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1248,8 +1248,8 @@ static void kvm_destroy_vm(struct kvm *kvm)
>   #else
>   	kvm_flush_shadow_all(kvm);
>   #endif
> -	kvm_arch_destroy_vm(kvm);
>   	kvm_destroy_devices(kvm);
> +	kvm_arch_destroy_vm(kvm);
>   	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
>   		kvm_free_memslots(kvm, &kvm->__memslots[i][0]);
>   		kvm_free_memslots(kvm, &kvm->__memslots[i][1]);


Thread overview: 6+ messages
2022-07-05 18:54 [RFC] kvm: reverse call order of kvm_arch_destroy_vm() and kvm_destroy_devices() Tony Krowiak
2022-07-05 19:30 ` Matthew Rosato [this message]
2022-07-18 14:11 ` Anthony Krowiak
2022-07-27 19:00 ` Anthony Krowiak
2022-08-01 11:53   ` Halil Pasic
2022-08-11 14:39     ` Anthony Krowiak