linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Jason J. Herne" <jjherne@linux.ibm.com>
To: Tony Krowiak <akrowiak@linux.ibm.com>,
	linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org
Cc: freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com,
	mjrosato@linux.ibm.com, pasic@linux.ibm.com,
	alex.williamson@redhat.com, kwankhede@nvidia.com,
	fiuczy@linux.ibm.com
Subject: Re: [PATCH v19 15/20] s390/vfio-ap: implement in-use callback for vfio_ap driver
Date: Thu, 2 Jun 2022 16:21:03 -0400	[thread overview]
Message-ID: <043e8669-145c-df44-74c4-a3023cb6b159@linux.ibm.com> (raw)
In-Reply-To: <e1ee7b53-9522-6a8a-463b-329bdab0dd30@linux.ibm.com>

On 6/2/22 15:19, Tony Krowiak wrote:
> 
> 
> On 6/2/22 2:16 PM, Jason J. Herne wrote:
>> On 4/4/22 18:10, Tony Krowiak wrote:
>>> Let's implement the callback to indicate when an APQN
>>> is in use by the vfio_ap device driver. The callback is
>>> invoked whenever a change to the apmask or aqmask would
>>> result in one or more queue devices being removed from the driver. The
>>> vfio_ap device driver will indicate a resource is in use
>>> if the APQN of any of the queue devices to be removed are assigned to
>>> any of the matrix mdevs under the driver's control.
>>>
>>> There is potential for a deadlock condition between the
>>> matrix_dev->guests_lock used to lock the guest during assignment of
>>> adapters and domains and the ap_perms_mutex locked by the AP bus when
>>> changes are made to the sysfs apmask/aqmask attributes.
>>>
>>> The AP Perms lock controls access to the objects that store the adapter
>>> numbers (ap_perms) and domain numbers (aq_perms) for the sysfs
>>> /sys/bus/ap/apmask and /sys/bus/ap/aqmask attributes. These attributes
>>> identify which queues are reserved for the zcrypt default device drivers.
>>> Before allowing a bit to be removed from either mask, the AP bus must check
>>> with the vfio_ap device driver to verify that none of the queues are
>>> assigned to any of its mediated devices.
>>>
>>> The apmask/aqmask attributes can be written or read at any time from
>>> userspace, so care must be taken to prevent a deadlock with asynchronous
>>> operations that might be taking place in the vfio_ap device driver. For
>>> example, consider the following:
>>>
>>> 1. A system administrator assigns an adapter to a mediated device under the
>>>     control of the vfio_ap device driver. The driver will need to first take
>>>     the matrix_dev->guests_lock to potentially hot plug the adapter into
>>>     the KVM guest.
>>> 2. At the same time, a system administrator sets a bit in the sysfs
>>>     /sys/bus/ap/ap_mask attribute. To complete the operation, the AP bus
>>>     must:
>>>     a. Take the ap_perms_mutex lock to update the object storing the values
>>>        for the /sys/bus/ap/ap_mask attribute.
>>>     b. Call the vfio_ap device driver's in-use callback to verify that the
>>>        queues now being reserved for the default zcrypt drivers are not
>>>        assigned to a mediated device owned by the vfio_ap device driver. To
>>>        do the verification, the in-use callback function takes the
>>>        matrix_dev->guests_lock, but has to wait because it is already held
>>>        by the operation in 1 above.
>>> 3. The vfio_ap device driver calls an AP bus function to verify that the
>>>     new queues resulting from the assignment of the adapter in step 1 are
>>>     not reserved for the default zcrypt device driver. This AP bus function
>>>     tries to take the ap_perms_mutex lock but gets stuck waiting for the
>>>     waiting for the lock due to step 2a above.
>>>
>>> Consequently, we have the following deadlock situation:
>>>
>>> matrix_dev->guests_lock locked (1)
>>> ap_perms_mutex lock locked (2a)
>>> Waiting for matrix_dev->gusts_lock (2b) which is currently held (1)
>>> Waiting for ap_perms_mutex lock (3) which is currently held (2a)
>>>
>>> To prevent this deadlock scenario, the function called in step 3 will no
>>> longer take the ap_perms_mutex lock and require the caller to take the
>>> lock. The lock will be the first taken by the adapter/domain assignment
>>> functions in the vfio_ap device driver to maintain the proper locking
>>> order.
>>>
>>> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
>>> ---
>>>   drivers/s390/crypto/ap_bus.c          | 31 ++++++++----
>>>   drivers/s390/crypto/vfio_ap_drv.c     |  1 +
>>>   drivers/s390/crypto/vfio_ap_ops.c     | 68 +++++++++++++++++++++++++++
>>>   drivers/s390/crypto/vfio_ap_private.h |  2 +
>>>   4 files changed, 94 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
>>> index fdf16cb70881..f48552e900a2 100644
>>> --- a/drivers/s390/crypto/ap_bus.c
>>> +++ b/drivers/s390/crypto/ap_bus.c
>>> @@ -817,6 +817,17 @@ static void ap_bus_revise_bindings(void)
>>>       bus_for_each_dev(&ap_bus_type, NULL, NULL, __ap_revise_reserved);
>>>   }
>>>   +/**
>>> + * ap_apqn_in_matrix_owned_by_def_drv: indicates whether an APQN c is reserved
>>> + *                       for the host drivers or not.
>>> + * @card: the APID of the adapter card to check
>>> + * @queue: the APQI of the queue to check
>>> + *
>>> + * Note: the ap_perms_mutex must be locked by the caller of this function.
>>> + *
>>> + * Return: an int specifying whether the APQN is reserved for the host (1) or
>>> + *       not (0)
>>> + */
>>
>> Comment's function name does not match real function name. Also APQN "c", from
>> description does not appear to exist.
> 
> No, it definitely does not match and the 'c' is an extraneous typo. I'll fix both.
> 
>>
>>
>>
>>>   int ap_owned_by_def_drv(int card, int queue)
>>>   {
>>>       int rc = 0;
>>> @@ -824,25 +835,31 @@ int ap_owned_by_def_drv(int card, int queue)
>>>       if (card < 0 || card >= AP_DEVICES || queue < 0 || queue >= AP_DOMAINS)
>>>           return -EINVAL;
>>>   -    mutex_lock(&ap_perms_mutex);
>>> -
>>>       if (test_bit_inv(card, ap_perms.apm)
>>>           && test_bit_inv(queue, ap_perms.aqm))
>>>           rc = 1;
>>>   -    mutex_unlock(&ap_perms_mutex);
>>> -
>>>       return rc;
>>>   }
>>>   EXPORT_SYMBOL(ap_owned_by_def_drv);
>>>   +/**
>>> + * ap_apqn_in_matrix_owned_by_def_drv: indicates whether every APQN contained in
>>> + *                       a set is reserved for the host drivers
>>> + *                       or not.
>>> + * @apm: a bitmap specifying a set of APIDs comprising the APQNs to check
>>> + * @aqm: a bitmap specifying a set of APQIs comprising the APQNs to check
>>> + *
>>> + * Note: the ap_perms_mutex must be locked by the caller of this function.
>>> + *
>>> + * Return: an int specifying whether each APQN is reserved for the host (1) or
>>> + *       not (0)
>>> + */
>>>   int ap_apqn_in_matrix_owned_by_def_drv(unsigned long *apm,
>>>                          unsigned long *aqm)
>>>   {
>>>       int card, queue, rc = 0;
>>>   -    mutex_lock(&ap_perms_mutex);
>>> -
>>>       for (card = 0; !rc && card < AP_DEVICES; card++)
>>>           if (test_bit_inv(card, apm) &&
>>>               test_bit_inv(card, ap_perms.apm))
>>> @@ -851,8 +868,6 @@ int ap_apqn_in_matrix_owned_by_def_drv(unsigned long *apm,
>>>                       test_bit_inv(queue, ap_perms.aqm))
>>>                       rc = 1;
>>>   -    mutex_unlock(&ap_perms_mutex);
>>> -
>>>       return rc;
>>>   }
>>>   EXPORT_SYMBOL(ap_apqn_in_matrix_owned_by_def_drv);
>>> diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
>>> index c258e5f7fdfc..2c3084589347 100644
>>> --- a/drivers/s390/crypto/vfio_ap_drv.c
>>> +++ b/drivers/s390/crypto/vfio_ap_drv.c
>>> @@ -107,6 +107,7 @@ static const struct attribute_group vfio_queue_attr_group = {
>>>   static struct ap_driver vfio_ap_drv = {
>>>       .probe = vfio_ap_mdev_probe_queue,
>>>       .remove = vfio_ap_mdev_remove_queue,
>>> +    .in_use = vfio_ap_mdev_resource_in_use,
>>>       .ids = ap_queue_ids,
>>>   };
>>>   diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
>>> index 49ed54dc9e05..3ece2cd9f1e7 100644
>>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>>> @@ -902,6 +902,21 @@ static int vfio_ap_mdev_verify_no_sharing(unsigned long *mdev_apm,
>>>       return 0;
>>>   }
>>>   +/**
>>> + * vfio_ap_mdev_validate_masks - verify that the APQNs assigned to the mdev are
>>> + *                 not reserved for the default zcrypt driver and
>>> + *                 are not assigned to another mdev.
>>> + *
>>> + * @matrix_mdev: the mdev to which the APQNs being validated are assigned.
>>> + *
>>> + * Return: One of the following values:
>>> + * o the error returned from the ap_apqn_in_matrix_owned_by_def_drv() function,
>>> + *   most likely -EBUSY indicating the ap_perms_mutex lock is already held.
>>> + * o EADDRNOTAVAIL if an APQN assigned to @matrix_mdev is reserved for the
>>> + *           zcrypt default driver.
>>> + * o EADDRINUSE if an APQN assigned to @matrix_mdev is assigned to another mdev
>>> + * o A zero indicating validation succeeded.
>>> + */
>>>   static int vfio_ap_mdev_validate_masks(struct ap_matrix_mdev *matrix_mdev)
>>>   {
>>>       if (ap_apqn_in_matrix_owned_by_def_drv(matrix_mdev->matrix.apm,
>>> @@ -951,6 +966,10 @@ static void vfio_ap_mdev_link_adapter(struct ap_matrix_mdev 
>>> *matrix_mdev,
>>>    *       An APQN derived from the cross product of the APID being assigned
>>>    *       and the APQIs previously assigned is being used by another mediated
>>>    *       matrix device
>>> + *
>>> + *    5. -EAGAIN
>>> + *       A lock required to validate the mdev's AP configuration could not
>>> + *       be obtained.
>>>    */
>>>   static ssize_t assign_adapter_store(struct device *dev,
>>>                       struct device_attribute *attr,
>>> @@ -961,6 +980,7 @@ static ssize_t assign_adapter_store(struct device *dev,
>>>       DECLARE_BITMAP(apm_delta, AP_DEVICES);
>>>       struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev);
>>>   +    mutex_lock(&ap_perms_mutex);
>>>       get_update_locks_for_mdev(matrix_mdev);
>>>         ret = kstrtoul(buf, 0, &apid);
>>> @@ -991,6 +1011,7 @@ static ssize_t assign_adapter_store(struct device *dev,
>>>       ret = count;
>>>   done:
>>>       release_update_locks_for_mdev(matrix_mdev);
>>> +    mutex_unlock(&ap_perms_mutex);
>>>         return ret;
>>>   }
>>> @@ -1144,6 +1165,10 @@ static void vfio_ap_mdev_link_domain(struct ap_matrix_mdev 
>>> *matrix_mdev,
>>>    *       An APQN derived from the cross product of the APQI being assigned
>>>    *       and the APIDs previously assigned is being used by another mediated
>>>    *       matrix device
>>> + *
>>> + *    5. -EAGAIN
>>> + *       The lock required to validate the mdev's AP configuration could not
>>> + *       be obtained.
>>>    */
>>>   static ssize_t assign_domain_store(struct device *dev,
>>>                      struct device_attribute *attr,
>>> @@ -1154,6 +1179,7 @@ static ssize_t assign_domain_store(struct device *dev,
>>>       DECLARE_BITMAP(aqm_delta, AP_DOMAINS);
>>>       struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev);
>>>   +    mutex_lock(&ap_perms_mutex);
>>>       get_update_locks_for_mdev(matrix_mdev);
>>>         ret = kstrtoul(buf, 0, &apqi);
>>> @@ -1184,6 +1210,7 @@ static ssize_t assign_domain_store(struct device *dev,
>>>       ret = count;
>>>   done:
>>>       release_update_locks_for_mdev(matrix_mdev);
>>> +    mutex_unlock(&ap_perms_mutex);
>>>         return ret;
>>>   }
>>> @@ -1868,3 +1895,44 @@ void vfio_ap_mdev_remove_queue(struct ap_device *apdev)
>>>       kfree(q);
>>>       release_update_locks_for_mdev(matrix_mdev);
>>>   }
>>> +
>>> +/**
>>> + * vfio_ap_mdev_resource_in_use: check whether any of a set of APQNs is
>>> + *                 assigned to a mediated device under the control
>>> + *                 of the vfio_ap device driver.
>>> + *
>>> + * @apm: a bitmap specifying a set of APIDs comprising the APQNs to check.
>>> + * @aqm: a bitmap specifying a set of APQIs comprising the APQNs to check.
>>> + *
>>> + * This function is invoked by the AP bus when changes to the apmask/aqmask
>>> + * attributes will result in giving control of the queue devices specified via
>>> + * @apm and @aqm to the default zcrypt device driver. Prior to calling this
>>> + * function, the AP bus locks the ap_perms_mutex. If this function is called
>>> + * while an adapter or domain is being assigned to a mediated device, the
>>> + * assignment operations will take the matrix_dev->guests_lock and
>>> + * matrix_dev->mdevs_lock then call the ap_apqn_in_matrix_owned_by_def_drv
>>> + * function, which also locks the ap_perms_mutex. This could result in a
>>> + * deadlock.
>>> + *
>>> + * To avoid a deadlock, this function will verify that the
>>> + * matrix_dev->guests_lock and matrix_dev->mdevs_lock are not currently held and
>>> + * will return -EBUSY if the locks can not be obtained.
>>
>> This part of the comment does not seem to match reality. Unless I'm missing
>> something? Did you mean to call mutex_trylock? Or is the comment stale?
> 
> The comment was written prior to the changes introduced to avoid the locking
> dependency (i.e., taking the ap_perms_mutex lock by the assignment functions
> which I believe was your suggestion). I shall remove the comment.

With both changes made...
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>




-- 
-- Jason J. Herne (jjherne@linux.ibm.com)

  reply	other threads:[~2022-06-02 20:21 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-04 22:10 [PATCH v19 00/20] s390/vfio-ap: dynamic configuration support Tony Krowiak
2022-04-04 22:10 ` [PATCH v19 01/20] s390/vfio-ap: use new AP bus interface to search for queue devices Tony Krowiak
2022-04-04 22:10 ` [PATCH v19 02/20] s390/vfio-ap: move probe and remove callbacks to vfio_ap_ops.c Tony Krowiak
2022-05-24 14:49   ` Jason J. Herne
2022-05-24 17:41     ` Tony Krowiak
2022-04-04 22:10 ` [PATCH v19 03/20] s390/vfio-ap: manage link between queue struct and matrix mdev Tony Krowiak
2022-04-04 22:10 ` [PATCH v19 04/20] s390/vfio-ap: introduce shadow APCB Tony Krowiak
2022-04-04 22:10 ` [PATCH v19 05/20] s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev Tony Krowiak
2022-05-16 16:36   ` Jason J. Herne
2022-05-16 17:13     ` Tony Krowiak
2022-05-16 17:50       ` Jason J. Herne
2022-05-16 18:06         ` Tony Krowiak
2022-04-04 22:10 ` [PATCH v19 06/20] s390/vfio-ap: allow assignment of unavailable AP queues to mdev device Tony Krowiak
2022-04-04 22:10 ` [PATCH v19 07/20] s390/vfio-ap: rename matrix_dev->lock mutex to matrix_dev->mdevs_lock Tony Krowiak
2022-05-17 14:02   ` Jason J. Herne
2022-05-17 18:36     ` Tony Krowiak
2022-04-04 22:10 ` [PATCH v19 08/20] s390/vfio-ap: introduce new mutex to control access to the KVM pointer Tony Krowiak
2022-05-27 12:40   ` Jason J. Herne
2022-04-04 22:10 ` [PATCH v19 09/20] s390/vfio-ap: use proper locking order when setting/clearing " Tony Krowiak
2022-05-27 12:41   ` Jason J. Herne
2022-04-04 22:10 ` [PATCH v19 10/20] s390/vfio-ap: prepare for dynamic update of guest's APCB on assign/unassign Tony Krowiak
2022-05-27 13:18   ` Jason J. Herne
2022-05-31 10:32     ` Tony Krowiak
2022-04-04 22:10 ` [PATCH v19 11/20] s390/vfio-ap: prepare for dynamic update of guest's APCB on queue probe/remove Tony Krowiak
2022-05-27 13:36   ` Jason J. Herne
2022-05-31 10:44     ` Tony Krowiak
2022-06-07 12:05       ` Halil Pasic
2022-06-08 13:31         ` Tony Krowiak
2022-05-27 13:50   ` Jason J. Herne
2022-05-31 11:57     ` Tony Krowiak
2022-05-31 12:02     ` Tony Krowiak
2022-04-04 22:10 ` [PATCH v19 12/20] s390/vfio-ap: allow hot plug/unplug of AP devices when assigned/unassigned Tony Krowiak
2022-06-01 18:54   ` Jason J. Herne
2022-04-04 22:10 ` [PATCH v19 13/20] s390/vfio-ap: hot plug/unplug of AP devices when probed/removed Tony Krowiak
2022-06-01 18:55   ` Jason J. Herne
2022-04-04 22:10 ` [PATCH v19 14/20] s390/vfio-ap: reset queues after adapter/domain unassignment Tony Krowiak
2022-06-02 15:00   ` Jason J. Herne
2022-04-04 22:10 ` [PATCH v19 15/20] s390/vfio-ap: implement in-use callback for vfio_ap driver Tony Krowiak
2022-06-02 18:16   ` Jason J. Herne
2022-06-02 19:19     ` Tony Krowiak
2022-06-02 20:21       ` Jason J. Herne [this message]
2022-04-04 22:10 ` [PATCH v19 16/20] s390/vfio-ap: sysfs attribute to display the guest's matrix Tony Krowiak
2022-04-04 22:10 ` [PATCH v19 17/20] s390/vfio-ap: handle config changed and scan complete notification Tony Krowiak
2022-06-06 17:50   ` Jason J. Herne
2022-04-04 22:10 ` [PATCH v19 18/20] s390/vfio-ap: update docs to include dynamic config support Tony Krowiak
2022-05-31 13:22   ` Jason J. Herne
2022-04-04 22:10 ` [PATCH v19 19/20] s390/Docs: new doc describing lock usage by the vfio_ap device driver Tony Krowiak
2022-05-31 19:23   ` Jason J. Herne
2022-06-02 16:11     ` Tony Krowiak
2022-04-04 22:10 ` [PATCH v19 20/20] MAINTAINERS: pick up all vfio_ap docs for VFIO AP maintainers Tony Krowiak
2022-05-31 18:26   ` Matthew Rosato
2022-06-02 16:19     ` Tony Krowiak
2022-04-29 19:57 ` [PATCH v19 00/20] s390/vfio-ap: dynamic configuration support Tony Krowiak
2022-05-03 17:39 ` Tony Krowiak
2022-05-09 14:34 ` Tony Krowiak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=043e8669-145c-df44-74c4-a3023cb6b159@linux.ibm.com \
    --to=jjherne@linux.ibm.com \
    --cc=akrowiak@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=borntraeger@de.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=fiuczy@linux.ibm.com \
    --cc=freude@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mjrosato@linux.ibm.com \
    --cc=pasic@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).