kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Halil Pasic <pasic@linux.ibm.com>
To: Tony Krowiak <akrowiak@linux.ibm.com>
Cc: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, freude@linux.ibm.com,
	borntraeger@de.ibm.com, cohuck@redhat.com,
	mjrosato@linux.ibm.com, alex.williamson@redhat.com,
	kwankhede@nvidia.com, fiuczy@linux.ibm.com,
	frankja@linux.ibm.com, david@redhat.com, imbrenda@linux.ibm.com,
	hca@linux.ibm.com, gor@linux.ibm.com
Subject: Re: [PATCH v10 11/16] s390/vfio-ap: allow hot plug/unplug of AP resources using mdev device
Date: Mon, 28 Sep 2020 03:01:47 +0200	[thread overview]
Message-ID: <20200928030147.7ee6f494.pasic@linux.ibm.com> (raw)
In-Reply-To: <20200821195616.13554-12-akrowiak@linux.ibm.com>

On Fri, 21 Aug 2020 15:56:11 -0400
Tony Krowiak <akrowiak@linux.ibm.com> wrote:

> Let's hot plug/unplug adapters, domains and control domains assigned to or
> unassigned from an AP matrix mdev device while it is in use by a guest per
> the following:
> 
> * When the APID of an adapter is assigned to a matrix mdev in use by a KVM
>   guest, the adapter will be hot plugged into the KVM guest as long as each
>   APQN derived from the Cartesian product of the APID being assigned and
>   the APQIs already assigned to the guest's CRYCB references a queue device
>   bound to the vfio_ap device driver.
> 
> * When the APID of an adapter is unassigned from a matrix mdev in use by a
>   KVM guest, the adapter will be hot unplugged from the KVM guest.
> 
> * When the APQI of a domain is assigned to a matrix mdev in use by a KVM
>   guest, the domain will be hot plugged into the KVM guest as long as each
>   APQN derived from the Cartesian product of the APQI being assigned and
>   the APIDs already assigned to the guest's CRYCB references a queue device
>   bound to the vfio_ap device driver.
> 
> * When the APQI of a domain is unassigned from a matrix mdev in use by a
>   KVM guest, the domain will be hot unplugged from the KVM guest



Hm, I suppose this means that what your guest effectively gets may depend
on whether assign_domain or assign_adapter is done first.

Suppose we have the queues
0.0 0.1
1.0 
bound to vfio_ap, i.e. 1.1 is missing for a reason different than
belonging to the default drivers (for what exact reason no idea).

Let's suppose we started with the matix containing only adapter
0 (0.) and domain 0 (.0).

After echo 1 > assign_adapter && echo 1 > assign_domain we end up with
matrix:
0.0 0.1
1.0 1.1
guest_matrix:
0.0 0.1
while after echo 1 > assign_domain && echo 1 > assign_adapter we end up
with:
matrix:
0.0 0.1
1.0 1.1
guest_matrix:
0.0
0.1

That means, the set of bound queues and the set of assigned resources do
not fully determine the set of resources passed through to the guest.

I that a deliberate design choice?



> 
> * When the domain number of a control domain is assigned to a matrix mdev
>   in use by a KVM guest, the control domain will be hot plugged into the
>   KVM guest.
> 
> * When the domain number of a control domain is unassigned from a matrix
>   mdev in use by a KVM guest, the control domain will be hot unplugged
>   from the KVM guest.
> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_ops.c | 196 ++++++++++++++++++++++++++++++
>  1 file changed, 196 insertions(+)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index cf3321eb239b..2b01a8eb6ee7 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -731,6 +731,56 @@ static void vfio_ap_mdev_link_queues(struct ap_matrix_mdev *matrix_mdev,
>  	}
>  }
>  
> +static bool vfio_ap_mdev_assign_apqis_4_apid(struct ap_matrix_mdev *matrix_mdev,
> +					     unsigned long apid)
> +{
> +	DECLARE_BITMAP(aqm, AP_DOMAINS);
> +	unsigned long apqi, apqn;
> +
> +	bitmap_copy(aqm, matrix_mdev->matrix.aqm, AP_DOMAINS);
> +
> +	for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, AP_DOMAINS) {
> +		if (!test_bit_inv(apqi,
> +				  (unsigned long *) matrix_dev->info.aqm))
> +			clear_bit_inv(apqi, aqm);
> +
> +		apqn = AP_MKQID(apid, apqi);
> +		if (!vfio_ap_get_mdev_queue(matrix_mdev, apqn))
> +			clear_bit_inv(apqi, aqm);
> +	}
> +
> +	if (bitmap_empty(aqm, AP_DOMAINS))
> +		return false;
> +
> +	set_bit_inv(apid, matrix_mdev->shadow_apcb.apm);
> +	bitmap_copy(matrix_mdev->shadow_apcb.aqm, aqm, AP_DOMAINS);
> +
> +	return true;
> +}
> +
> +static bool vfio_ap_mdev_assign_guest_apid(struct ap_matrix_mdev *matrix_mdev,
> +					   unsigned long apid)
> +{
> +	unsigned long apqi, apqn;
> +
> +	if (!vfio_ap_mdev_has_crycb(matrix_mdev) ||
> +	    !test_bit_inv(apid, (unsigned long *)matrix_dev->info.apm))
> +		return false;
> +
> +	if (bitmap_empty(matrix_mdev->shadow_apcb.aqm, AP_DOMAINS))
> +		return vfio_ap_mdev_assign_apqis_4_apid(matrix_mdev, apid);


Hm. Let's say we have the same situation regarding the bound queues as
above but we start with the empty matrix, and do all the assignments
while the guest is running.

Consider the following sequence of actions.

1) echo 0 > assign_domain
2) echo 1 > assign_domain
3) echo 1 > assign_adapter
4) echo 0 > assign_adapter
5) echo 1 > unassign_adapter

I understand that at 3), because
bitmap_empty(matrix_mdev->shadow_apcb.aqm)we would end up with a shadow
aqm containing just domain 0, as queue 1.1 ain't bound to us.

Thus at the end we would have
matrix:
0.0 0.1
guest_matrix:
0.0

And if we add in an adapter 2. into the mix with the queues 2.0 and 2.1
then after
6) echo 2 > assign_adapter
we get
Thus at the end we would have
matrix:
0.0 0.1
2.0 2.1
guest_matrix:
0.0
2.0

This looks very quirky to me. Did I read the code wrong? Opinions?

> +
> +	for_each_set_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm, AP_DOMAINS) {
> +		apqn = AP_MKQID(apid, apqi);
> +		if (!vfio_ap_get_mdev_queue(matrix_mdev, apqn))
> +			return false;
> +	}
> +
> +	set_bit_inv(apid, matrix_mdev->shadow_apcb.apm);
> +
> +	return true;
> +}
> +
>  /**
>   * assign_adapter_store
>   *
> @@ -792,12 +842,42 @@ static ssize_t assign_adapter_store(struct device *dev,
>  	}
>  	set_bit_inv(apid, matrix_mdev->matrix.apm);
>  	vfio_ap_mdev_link_queues(matrix_mdev, LINK_APID, apid);
> +	if (vfio_ap_mdev_assign_guest_apid(matrix_mdev, apid))
> +		vfio_ap_mdev_commit_shadow_apcb(matrix_mdev);
>  	mutex_unlock(&matrix_dev->lock);
>  
>  	return count;
>  }
>  static DEVICE_ATTR_WO(assign_adapter);
>  
> +static bool vfio_ap_mdev_unassign_guest_apid(struct ap_matrix_mdev *matrix_mdev,
> +					     unsigned long apid)
> +{
> +	if (vfio_ap_mdev_has_crycb(matrix_mdev)) {
> +		if (test_bit_inv(apid, matrix_mdev->shadow_apcb.apm)) {
> +			clear_bit_inv(apid, matrix_mdev->shadow_apcb.apm);
> +
> +			/*
> +			 * If there are no APIDs assigned to the guest, then
> +			 * the guest will not have access to any queues, so
> +			 * let's also go ahead and unassign the APQIs. Keeping
> +			 * them around may yield unpredictable results during
> +			 * a probe that is not related to a host AP
> +			 * configuration change (i.e., an AP adapter is
> +			 * configured online).
> +			 */

I don't quite understand this comment. Clearing out the other mask when
the one becomes empty, does allow us to recover the full possible guest
matrix in the scenario described above. I don't see any shadow
manipulation in the probe handler at this stage. Are we maybe
talking about the same effect as I described for assign?

Regards,
Halil

> +			if (bitmap_empty(matrix_mdev->shadow_apcb.apm,
> +					 AP_DEVICES))
> +				bitmap_clear(matrix_mdev->shadow_apcb.aqm, 0,
> +					     AP_DOMAINS);
> +
> +			return true;
> +		}
> +	}
> +
> +	return false;
> +}
> +
>  /**
>   * unassign_adapter_store
>   *
> @@ -834,12 +914,64 @@ static ssize_t unassign_adapter_store(struct device *dev,
>  	mutex_lock(&matrix_dev->lock);
>  	clear_bit_inv((unsigned long)apid, matrix_mdev->matrix.apm);
>  	vfio_ap_mdev_link_queues(matrix_mdev, UNLINK_APID, apid);
> +	if (vfio_ap_mdev_unassign_guest_apid(matrix_mdev, apid))
> +		vfio_ap_mdev_commit_shadow_apcb(matrix_mdev);
>  	mutex_unlock(&matrix_dev->lock);
>  
>  	return count;
>  }
>  static DEVICE_ATTR_WO(unassign_adapter);
>  
> +static bool vfio_ap_mdev_assign_apids_4_apqi(struct ap_matrix_mdev *matrix_mdev,
> +					     unsigned long apqi)
> +{
> +	DECLARE_BITMAP(apm, AP_DEVICES);
> +	unsigned long apid, apqn;
> +
> +	bitmap_copy(apm, matrix_mdev->matrix.apm, AP_DEVICES);
> +
> +	for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, AP_DEVICES) {
> +		if (!test_bit_inv(apid,
> +				  (unsigned long *) matrix_dev->info.apm))
> +			clear_bit_inv(apqi, apm);
> +
> +		apqn = AP_MKQID(apid, apqi);
> +		if (!vfio_ap_get_mdev_queue(matrix_mdev, apqn))
> +			clear_bit_inv(apid, apm);
> +	}
> +
> +	if (bitmap_empty(apm, AP_DEVICES))
> +		return false;
> +
> +	set_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm);
> +	bitmap_copy(matrix_mdev->shadow_apcb.apm, apm, AP_DEVICES);
> +
> +	return true;
> +}
> +
> +static bool vfio_ap_mdev_assign_guest_apqi(struct ap_matrix_mdev *matrix_mdev,
> +					   unsigned long apqi)
> +{
> +	unsigned long apid, apqn;
> +
> +	if (!vfio_ap_mdev_has_crycb(matrix_mdev) ||
> +	    !test_bit_inv(apqi, (unsigned long *)matrix_dev->info.aqm))
> +		return false;
> +
> +	if (bitmap_empty(matrix_mdev->shadow_apcb.apm, AP_DEVICES))
> +		return vfio_ap_mdev_assign_apids_4_apqi(matrix_mdev, apqi);
> +
> +	for_each_set_bit_inv(apid, matrix_mdev->shadow_apcb.apm, AP_DEVICES) {
> +		apqn = AP_MKQID(apid, apqi);
> +		if (!vfio_ap_get_mdev_queue(matrix_mdev, apqn))
> +			return false;
> +	}
> +
> +	set_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm);
> +
> +	return true;
> +}
> +
>  /**
>   * assign_domain_store
>   *
> @@ -901,12 +1033,41 @@ static ssize_t assign_domain_store(struct device *dev,
>  	}
>  	set_bit_inv(apqi, matrix_mdev->matrix.aqm);
>  	vfio_ap_mdev_link_queues(matrix_mdev, LINK_APQI, apqi);
> +	if (vfio_ap_mdev_assign_guest_apqi(matrix_mdev, apqi))
> +		vfio_ap_mdev_commit_shadow_apcb(matrix_mdev);
>  	mutex_unlock(&matrix_dev->lock);
>  
>  	return count;
>  }
>  static DEVICE_ATTR_WO(assign_domain);
>  
> +static bool vfio_ap_mdev_unassign_guest_apqi(struct ap_matrix_mdev *matrix_mdev,
> +					     unsigned long apqi)
> +{
> +	if (vfio_ap_mdev_has_crycb(matrix_mdev)) {
> +		if (test_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm)) {
> +			clear_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm);
> +
> +			/*
> +			 * If there are no APQIs assigned to the guest, then
> +			 * the guest will not have access to any queues, so
> +			 * let's also go ahead and unassign the APIDs. Keeping
> +			 * them around may yield unpredictable results during
> +			 * a probe that is not related to a host AP
> +			 * configuration change (i.e., an AP adapter is
> +			 * configured online).
> +			 */
> +			if (bitmap_empty(matrix_mdev->shadow_apcb.aqm,
> +					 AP_DOMAINS))
> +				bitmap_clear(matrix_mdev->shadow_apcb.apm, 0,
> +					     AP_DEVICES);
> +
> +			return true;
> +		}
> +	}
> +
> +	return false;
> +}
>  
>  /**
>   * unassign_domain_store
> @@ -944,12 +1105,28 @@ static ssize_t unassign_domain_store(struct device *dev,
>  	mutex_lock(&matrix_dev->lock);
>  	clear_bit_inv((unsigned long)apqi, matrix_mdev->matrix.aqm);
>  	vfio_ap_mdev_link_queues(matrix_mdev, UNLINK_APQI, apqi);
> +	if (vfio_ap_mdev_unassign_guest_apqi(matrix_mdev, apqi))
> +		vfio_ap_mdev_commit_shadow_apcb(matrix_mdev);
>  	mutex_unlock(&matrix_dev->lock);
>  
>  	return count;
>  }
>  static DEVICE_ATTR_WO(unassign_domain);
>  
> +static bool vfio_ap_mdev_assign_guest_cdom(struct ap_matrix_mdev *matrix_mdev,
> +					   unsigned long domid)
> +{
> +	if (vfio_ap_mdev_has_crycb(matrix_mdev)) {
> +		if (!test_bit_inv(domid, matrix_mdev->shadow_apcb.adm)) {
> +			set_bit_inv(domid, matrix_mdev->shadow_apcb.adm);
> +
> +			return true;
> +		}
> +	}
> +
> +	return false;
> +}
> +
>  /**
>   * assign_control_domain_store
>   *
> @@ -984,12 +1161,29 @@ static ssize_t assign_control_domain_store(struct device *dev,
>  
>  	mutex_lock(&matrix_dev->lock);
>  	set_bit_inv(id, matrix_mdev->matrix.adm);
> +	if (vfio_ap_mdev_assign_guest_cdom(matrix_mdev, id))
> +		vfio_ap_mdev_commit_shadow_apcb(matrix_mdev);
>  	mutex_unlock(&matrix_dev->lock);
>  
>  	return count;
>  }
>  static DEVICE_ATTR_WO(assign_control_domain);
>  
> +static bool
> +vfio_ap_mdev_unassign_guest_cdom(struct ap_matrix_mdev *matrix_mdev,
> +				 unsigned long domid)
> +{
> +	if (vfio_ap_mdev_has_crycb(matrix_mdev)) {
> +		if (test_bit_inv(domid, matrix_mdev->shadow_apcb.adm)) {
> +			clear_bit_inv(domid, matrix_mdev->shadow_apcb.adm);
> +
> +			return true;
> +		}
> +	}
> +
> +	return false;
> +}
> +
>  /**
>   * unassign_control_domain_store
>   *
> @@ -1024,6 +1218,8 @@ static ssize_t unassign_control_domain_store(struct device *dev,
>  
>  	mutex_lock(&matrix_dev->lock);
>  	clear_bit_inv(domid, matrix_mdev->matrix.adm);
> +	if (vfio_ap_mdev_unassign_guest_cdom(matrix_mdev, domid))
> +		vfio_ap_mdev_commit_shadow_apcb(matrix_mdev);
>  	mutex_unlock(&matrix_dev->lock);
>  
>  	return count;

u

  reply	other threads:[~2020-09-28  1:02 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-21 19:56 [PATCH v10 00/16] s390/vfio-ap: dynamic configuration support Tony Krowiak
2020-08-21 19:56 ` [PATCH v10 01/16] s390/vfio-ap: add version vfio_ap module Tony Krowiak
2020-08-25 10:04   ` Cornelia Huck
2020-08-26 14:49     ` Tony Krowiak
2020-08-27 10:32       ` Cornelia Huck
2020-08-27 14:39         ` Tony Krowiak
2020-08-28  8:10           ` Cornelia Huck
2020-08-21 19:56 ` [PATCH v10 02/16] s390/vfio-ap: use new AP bus interface to search for queue devices Tony Krowiak
2020-08-25 10:13   ` Cornelia Huck
2020-08-27 14:24     ` Tony Krowiak
2020-08-28  8:13       ` Cornelia Huck
2020-08-28 15:10         ` Tony Krowiak
2020-09-25  2:11       ` Halil Pasic
2020-10-16 20:59         ` Tony Krowiak
2020-09-04  8:11   ` Christian Borntraeger
2020-09-08 18:54     ` Tony Krowiak
2020-09-25  2:27   ` Halil Pasic
2020-09-29 13:07     ` Tony Krowiak
2020-09-29 13:37       ` Halil Pasic
2020-09-29 20:57         ` Tony Krowiak
2020-08-21 19:56 ` [PATCH v10 03/16] s390/vfio-ap: manage link between queue struct and matrix mdev Tony Krowiak
2020-08-25 10:25   ` Cornelia Huck
2020-08-28 23:05     ` Tony Krowiak
2020-09-04  8:15   ` Christian Borntraeger
2020-09-08 19:03     ` Tony Krowiak
2020-09-25  7:58   ` Halil Pasic
2020-08-21 19:56 ` [PATCH v10 04/16] s390/zcrypt: driver callback to indicate resource in use Tony Krowiak
2020-09-14 15:29   ` Cornelia Huck
2020-09-15 19:32     ` Tony Krowiak
2020-09-17 12:14       ` Cornelia Huck
2020-09-17 13:54         ` Tony Krowiak
2020-09-25  9:24   ` Halil Pasic
2020-09-29 13:59     ` Tony Krowiak
2020-08-21 19:56 ` [PATCH v10 05/16] s390/vfio-ap: implement in-use callback for vfio_ap driver Tony Krowiak
2020-09-14 15:31   ` Cornelia Huck
2020-09-25  9:29   ` Halil Pasic
2020-09-29 14:00     ` Tony Krowiak
2020-08-21 19:56 ` [PATCH v10 06/16] s390/vfio-ap: introduce shadow APCB Tony Krowiak
2020-09-17 14:22   ` Cornelia Huck
2020-09-18 17:03     ` Tony Krowiak
2020-09-26  1:38   ` Halil Pasic
2020-09-29 16:04     ` Tony Krowiak
2020-09-29 16:19       ` Halil Pasic
2020-08-21 19:56 ` [PATCH v10 07/16] s390/vfio-ap: sysfs attribute to display the guest's matrix Tony Krowiak
2020-09-17 14:34   ` Cornelia Huck
2020-09-18 17:09     ` Tony Krowiak
2020-09-26  7:16       ` Halil Pasic
2020-09-29 21:00         ` Tony Krowiak
2020-08-21 19:56 ` [PATCH v10 08/16] s390/vfio-ap: filter matrix for unavailable queue devices Tony Krowiak
2020-09-26  8:24   ` Halil Pasic
2020-09-29 21:59     ` Tony Krowiak
2020-08-21 19:56 ` [PATCH v10 09/16] s390/vfio-ap: allow assignment of unavailable AP queues to mdev device Tony Krowiak
2020-09-26 23:49   ` Halil Pasic
2020-09-30 12:59     ` Tony Krowiak
2020-09-30 22:29       ` Halil Pasic
2020-08-21 19:56 ` [PATCH v10 10/16] s390/vfio-ap: allow configuration of matrix mdev in use by a KVM guest Tony Krowiak
2020-09-27  0:03   ` Halil Pasic
2020-09-30 13:19     ` Tony Krowiak
2020-08-21 19:56 ` [PATCH v10 11/16] s390/vfio-ap: allow hot plug/unplug of AP resources using mdev device Tony Krowiak
2020-09-28  1:01   ` Halil Pasic [this message]
2020-10-05 16:24     ` Tony Krowiak
2020-10-05 18:30       ` Halil Pasic
2020-10-05 21:48         ` Tony Krowiak
2020-10-05 23:05         ` Tony Krowiak
2020-08-21 19:56 ` [PATCH v10 12/16] s390/zcrypt: Notify driver on config changed and scan complete callbacks Tony Krowiak
2020-09-27  1:39   ` Halil Pasic
2020-08-21 19:56 ` [PATCH v10 13/16] s390/vfio-ap: handle host AP config change notification Tony Krowiak
2020-09-28  1:38   ` Halil Pasic
2020-10-12 20:53     ` Tony Krowiak
2020-10-12 21:27     ` Tony Krowiak
2020-08-21 19:56 ` [PATCH v10 14/16] s390/vfio-ap: handle AP bus scan completed notification Tony Krowiak
2020-09-28  2:11   ` Halil Pasic
2020-08-21 19:56 ` [PATCH v10 15/16] s390/vfio-ap: handle probe/remove not due to host AP config changes Tony Krowiak
2020-09-28  2:45   ` Halil Pasic
2020-08-21 19:56 ` [PATCH v10 16/16] s390/vfio-ap: update docs to include dynamic config support Tony Krowiak
2020-08-25 10:45   ` Cornelia Huck
2020-08-31 18:34     ` Tony Krowiak
2020-09-28  2:48   ` Halil Pasic
2020-10-16 16:36     ` Tony Krowiak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200928030147.7ee6f494.pasic@linux.ibm.com \
    --to=pasic@linux.ibm.com \
    --cc=akrowiak@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=borntraeger@de.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=david@redhat.com \
    --cc=fiuczy@linux.ibm.com \
    --cc=frankja@linux.ibm.com \
    --cc=freude@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mjrosato@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).