From: "Liu, Yi L" <yi.l.liu@intel.com>
To: Nicolin Chen <nicolinc@nvidia.com>,
"jgg@nvidia.com" <jgg@nvidia.com>,
"Tian, Kevin" <kevin.tian@intel.com>,
"joro@8bytes.org" <joro@8bytes.org>,
"will@kernel.org" <will@kernel.org>,
"robin.murphy@arm.com" <robin.murphy@arm.com>,
"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
"shuah@kernel.org" <shuah@kernel.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"linux-kselftest@vger.kernel.org"
<linux-kselftest@vger.kernel.org>,
"baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>
Subject: RE: [PATCH v2 08/10] iommufd/device: Use iommu_group_replace_domain()
Date: Wed, 8 Feb 2023 08:08:42 +0000 [thread overview]
Message-ID: <DS0PR11MB75293887741EEB8953BCB0A9C3D89@DS0PR11MB7529.namprd11.prod.outlook.com> (raw)
In-Reply-To: <4653f009c3dacae8ebf3a4865aaa944aa9c7cc7e.1675802050.git.nicolinc@nvidia.com>
Hi Nic,
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Wednesday, February 8, 2023 5:18 AM
>
> iommu_group_replace_domain() is introduced to support use cases where
> an
> iommu_group can be attached to a new domain without getting detached
> from
> the old one. This replacement feature will be useful, for cases such as:
> 1) vPASID mode, when a guest wants to replace a single pasid (PASID=0)
> table with a larger table (PASID=N)
> 2) Nesting mode, when switching the attaching device from an S2 domain
> to an S1 domain, or when switching between relevant S1 domains.
> as it allows these cases to switch seamlessly without a DMA disruption.
>
> So, call iommu_group_replace_domain() in the
> iommufd_device_do_attach().
> And add a __iommmufd_device_detach helper to allow the replace routine
> to
> do a partial detach on the current hwpt that's being replaced. Though the
> updated locking logic is overcomplicated, it will be eased, once those
> iopt_table_add/remove_ioas and list_add/del calls are moved to hwpt's
> allocation/destroy() functions in the coming nesting series, as that'll
> depend on a new ->domain_alloc_user op in the iommu core.
>
> Also, block replace operations that are from/to auto_domains, i.e. only
> user-allocated hw_pagetables can be replaced or replaced with.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> drivers/iommu/iommufd/device.c | 101 +++++++++++++++++-------
> drivers/iommu/iommufd/iommufd_private.h | 2 +
> 2 files changed, 76 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/iommu/iommufd/device.c
> b/drivers/iommu/iommufd/device.c
> index b8c3e3baccb5..8a9834fc129a 100644
> --- a/drivers/iommu/iommufd/device.c
> +++ b/drivers/iommu/iommufd/device.c
> @@ -9,6 +9,8 @@
> #include "io_pagetable.h"
> #include "iommufd_private.h"
>
> +MODULE_IMPORT_NS(IOMMUFD_INTERNAL);
> +
> static bool allow_unsafe_interrupts;
> module_param(allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
> MODULE_PARM_DESC(
> @@ -194,9 +196,61 @@ static bool
> iommufd_hw_pagetable_has_group(struct iommufd_hw_pagetable *hwpt,
> return false;
> }
>
> +/**
> + * __iommmufd_device_detach - Detach a device from idev->hwpt to
> new_hwpt
This function doesn't do anything to make this device attached to new_hwpt.
It is done in the iommufd_device_attach_ioas(). New_hwpt here indicates if
this detach requires to do some extra thing. E.g. remove reserved iova from
the idev->hwpt->ioas. So may just say " Detach a device from idev->hwpt",
and explain the usage of new_hwpt in the below.
> + * @idev: device to detach
> + * @new_hwpt: new hw_pagetable to attach (pass in NULL for a simple
> detach)
The new hw_pagetable to be attached.
> + * @detach_group: flag to call iommu_detach_group
> + *
> + * This is a cleanup helper shared by the replace and detach routines.
> Comparing
> + * to a detach routine, a replace routine only needs a partial detach
> procedure:
> + * it does not need the iommu_detach_group(); it will attach the device to
> a new
> + * hw_pagetable after a partial detach from the currently attached
> hw_pagetable,
> + * so certain steps can be skipped if two hw_pagetables have the same
> IOAS.
> + */
> +static void __iommmufd_device_detach(struct iommufd_device *idev,
> + struct iommufd_hw_pagetable
> *new_hwpt,
> + bool detach_group)
> +{
> + struct iommufd_hw_pagetable *hwpt = idev->hwpt;
> + struct iommufd_ioas *new_ioas = NULL;
> +
> + if (new_hwpt)
> + new_ioas = new_hwpt->ioas;
> +
> + mutex_lock(&hwpt->devices_lock);
> + list_del(&idev->devices_item);
> + if (hwpt->ioas != new_ioas)
> + mutex_lock(&hwpt->ioas->mutex);
The lock order is mostly hwpt->ioas->mutex and then hwpt->devices_lock.
See the iommufd_device_auto_get_domain(). If possible, may switch the
order sequence here. Also, rename hwpt to be cur_hwpt, this may help
reviewers to distinguish it from the hwpt in the caller of this function. It
looks to be a deadlock at first look, but not after closer reading.
> + if (!iommufd_hw_pagetable_has_group(hwpt, idev->group)) {
> + if (list_empty(&hwpt->devices)) {
> + iopt_table_remove_domain(&hwpt->ioas->iopt,
> + hwpt->domain);
> + list_del(&hwpt->hwpt_item);
> + }
> + if (detach_group)
> + iommu_detach_group(hwpt->domain, idev->group);
> + }
> + if (hwpt->ioas != new_ioas) {
> + iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev-
> >dev);
> + mutex_unlock(&hwpt->ioas->mutex);
> + }
> + mutex_unlock(&hwpt->devices_lock);
> +
> + if (hwpt->auto_domain)
> + iommufd_object_destroy_user(idev->ictx, &hwpt->obj);
> + else
> + refcount_dec(&hwpt->obj.users);
> +
> + idev->hwpt = NULL;
> +
> + refcount_dec(&idev->obj.users);
> +}
> +
> static int iommufd_device_do_attach(struct iommufd_device *idev,
> struct iommufd_hw_pagetable *hwpt)
> {
> + struct iommufd_hw_pagetable *cur_hwpt = idev->hwpt;
> phys_addr_t sw_msi_start = PHYS_ADDR_MAX;
> int rc;
>
> @@ -236,7 +290,7 @@ static int iommufd_device_do_attach(struct
> iommufd_device *idev,
> * the group once for the first device that is in the group.
> */
> if (!iommufd_hw_pagetable_has_group(hwpt, idev->group)) {
> - rc = iommu_attach_group(hwpt->domain, idev->group);
> + rc = iommu_group_replace_domain(idev->group, hwpt-
> >domain);
> if (rc)
> goto out_iova;
>
> @@ -249,6 +303,10 @@ static int iommufd_device_do_attach(struct
> iommufd_device *idev,
> }
> }
>
> + /* Replace the cur_hwpt without iommu_detach_group() */
> + if (cur_hwpt)
> + __iommmufd_device_detach(idev, hwpt, false);
> +
> idev->hwpt = hwpt;
> refcount_inc(&hwpt->obj.users);
> list_add(&idev->devices_item, &hwpt->devices);
> @@ -256,7 +314,10 @@ static int iommufd_device_do_attach(struct
> iommufd_device *idev,
> return 0;
>
> out_detach:
> - iommu_detach_group(hwpt->domain, idev->group);
> + if (cur_hwpt)
> + iommu_group_replace_domain(idev->group, cur_hwpt-
> >domain);
> + else
> + iommu_detach_group(hwpt->domain, idev->group);
> out_iova:
> iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev);
> out_unlock:
> @@ -345,6 +406,13 @@ int iommufd_device_attach(struct iommufd_device
> *idev, u32 *pt_id)
> struct iommufd_hw_pagetable *hwpt =
> container_of(pt_obj, struct
> iommufd_hw_pagetable, obj);
>
> + if (idev->hwpt == hwpt)
> + goto out_done;
> + if (idev->hwpt && idev->hwpt->auto_domain) {
> + rc = -EBUSY;
This means if device was attached to an auto_created hwpt, then we
cannot replace it with a user allocated hwpt? If yes, this means the
replace is not available until user hwpt support, which is part of nesting.
> + goto out_put_pt_obj;
> + }
> +
> mutex_lock(&hwpt->ioas->mutex);
> rc = iommufd_device_do_attach(idev, hwpt);
> mutex_unlock(&hwpt->ioas->mutex);
> @@ -356,6 +424,8 @@ int iommufd_device_attach(struct iommufd_device
> *idev, u32 *pt_id)
> struct iommufd_ioas *ioas =
> container_of(pt_obj, struct iommufd_ioas, obj);
>
> + if (idev->hwpt)
> + return -EBUSY;
So we don't allow ioas replacement for physical devices. Is it?
Looks like emulated devices allows it.
> rc = iommufd_device_auto_get_domain(idev, ioas);
> if (rc)
> goto out_put_pt_obj;
> @@ -367,6 +437,7 @@ int iommufd_device_attach(struct iommufd_device
> *idev, u32 *pt_id)
> }
>
> refcount_inc(&idev->obj.users);
> +out_done:
> *pt_id = idev->hwpt->obj.id;
> rc = 0;
>
> @@ -385,31 +456,7 @@
> EXPORT_SYMBOL_NS_GPL(iommufd_device_attach, IOMMUFD);
> */
> void iommufd_device_detach(struct iommufd_device *idev)
> {
> - struct iommufd_hw_pagetable *hwpt = idev->hwpt;
> -
> - mutex_lock(&hwpt->ioas->mutex);
> - mutex_lock(&hwpt->devices_lock);
> - list_del(&idev->devices_item);
> - if (!iommufd_hw_pagetable_has_group(hwpt, idev->group)) {
> - if (list_empty(&hwpt->devices)) {
> - iopt_table_remove_domain(&hwpt->ioas->iopt,
> - hwpt->domain);
> - list_del(&hwpt->hwpt_item);
> - }
> - iommu_detach_group(hwpt->domain, idev->group);
> - }
> - iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev);
> - mutex_unlock(&hwpt->devices_lock);
> - mutex_unlock(&hwpt->ioas->mutex);
> -
> - if (hwpt->auto_domain)
> - iommufd_object_destroy_user(idev->ictx, &hwpt->obj);
> - else
> - refcount_dec(&hwpt->obj.users);
> -
> - idev->hwpt = NULL;
> -
> - refcount_dec(&idev->obj.users);
> + __iommmufd_device_detach(idev, NULL, true);
> }
> EXPORT_SYMBOL_NS_GPL(iommufd_device_detach, IOMMUFD);
>
> diff --git a/drivers/iommu/iommufd/iommufd_private.h
> b/drivers/iommu/iommufd/iommufd_private.h
> index 593138bb37b8..200c783800ad 100644
> --- a/drivers/iommu/iommufd/iommufd_private.h
> +++ b/drivers/iommu/iommufd/iommufd_private.h
> @@ -9,6 +9,8 @@
> #include <linux/refcount.h>
> #include <linux/uaccess.h>
>
> +#include "../iommu-priv.h"
> +
> struct iommu_domain;
> struct iommu_group;
> struct iommu_option;
> --
> 2.39.1
Regards,
Yi Liu
next prev parent reply other threads:[~2023-02-08 8:08 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-07 21:17 [PATCH v2 00/10] Add IO page table replacement support Nicolin Chen
2023-02-07 21:17 ` [PATCH v2 01/10] iommu: Move dev_iommu_ops() to private header Nicolin Chen
2023-02-09 2:49 ` Tian, Kevin
2023-02-07 21:17 ` [PATCH v2 02/10] iommu: Introduce a new iommu_group_replace_domain() API Nicolin Chen
2023-02-09 2:55 ` Tian, Kevin
2023-02-09 13:23 ` Jason Gunthorpe
2023-02-10 1:34 ` Tian, Kevin
2023-02-10 23:51 ` Alex Williamson
2023-02-11 0:44 ` Jason Gunthorpe
2023-02-13 2:24 ` Tian, Kevin
2023-02-13 8:34 ` Baolu Lu
2023-02-13 14:45 ` Jason Gunthorpe
2023-02-14 3:29 ` Tian, Kevin
2023-02-15 6:10 ` Tian, Kevin
2023-02-15 12:52 ` Jason Gunthorpe
2023-02-22 2:11 ` Tian, Kevin
2023-02-24 0:57 ` Jason Gunthorpe
2023-02-24 8:07 ` Tian, Kevin
2023-02-07 21:17 ` [PATCH v2 03/10] iommufd: Create access in vfio_iommufd_emulated_bind() Nicolin Chen
2023-02-09 2:56 ` Tian, Kevin
2023-02-09 16:15 ` Nicolin Chen
2023-02-09 18:58 ` Eric Farman
2023-02-09 19:54 ` Nicolin Chen
2023-02-07 21:17 ` [PATCH v2 04/10] iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_SET_IOAS coverage Nicolin Chen
2023-02-09 2:59 ` Tian, Kevin
2023-02-07 21:17 ` [PATCH v2 05/10] iommufd: Add replace support in iommufd_access_set_ioas() Nicolin Chen
2023-02-09 3:13 ` Tian, Kevin
2023-02-09 20:28 ` Nicolin Chen
2023-02-09 20:49 ` Jason Gunthorpe
2023-02-09 22:18 ` Nicolin Chen
2023-02-07 21:17 ` [PATCH v2 06/10] iommufd/selftest: Add coverage for access->ioas replacement Nicolin Chen
2023-02-07 21:17 ` [PATCH v2 07/10] iommufd/device: Make hwpt_list list_add/del symmetric Nicolin Chen
2023-02-09 3:23 ` Tian, Kevin
2023-02-09 13:24 ` Jason Gunthorpe
2023-02-10 1:46 ` Tian, Kevin
2023-02-10 21:17 ` Jason Gunthorpe
2023-02-13 2:12 ` Tian, Kevin
2023-02-07 21:18 ` [PATCH v2 08/10] iommufd/device: Use iommu_group_replace_domain() Nicolin Chen
2023-02-08 8:08 ` Liu, Yi L [this message]
2023-02-09 20:55 ` Nicolin Chen
2023-02-08 8:12 ` Liu, Yi L
2023-02-09 20:56 ` Nicolin Chen
2023-02-09 4:00 ` Tian, Kevin
2023-02-09 21:13 ` Nicolin Chen
2023-02-10 0:01 ` Jason Gunthorpe
2023-02-10 20:50 ` Nicolin Chen
2023-02-10 2:11 ` Tian, Kevin
2023-02-11 0:10 ` Nicolin Chen
2023-02-13 2:34 ` Tian, Kevin
2023-02-13 7:48 ` Nicolin Chen
2023-02-13 8:27 ` Tian, Kevin
2023-02-13 14:49 ` Jason Gunthorpe
2023-02-14 10:54 ` Nicolin Chen
2023-02-15 1:37 ` Tian, Kevin
2023-02-15 1:58 ` Nicolin Chen
2023-02-15 2:15 ` Tian, Kevin
2023-02-15 7:15 ` Nicolin Chen
2023-02-15 7:24 ` Tian, Kevin
2023-02-15 12:51 ` Jason Gunthorpe
2023-02-14 10:59 ` Nicolin Chen
2023-02-15 1:38 ` Tian, Kevin
2023-02-15 7:16 ` Nicolin Chen
2023-02-07 21:18 ` [PATCH v2 09/10] vfio: Support IO page table replacement Nicolin Chen
2023-02-09 4:06 ` Tian, Kevin
2023-02-07 21:18 ` [PATCH v2 10/10] vfio: Do not allow !ops->dma_unmap in vfio_pin/unpin_pages() Nicolin Chen
2023-02-09 4:10 ` Tian, Kevin
2023-02-09 13:26 ` Jason Gunthorpe
2023-02-09 16:19 ` Nicolin Chen
2023-02-09 2:50 ` [PATCH v2 00/10] Add IO page table replacement support Tian, Kevin
2023-02-09 16:13 ` Nicolin Chen
2023-02-10 1:34 ` Tian, Kevin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DS0PR11MB75293887741EEB8953BCB0A9C3D89@DS0PR11MB7529.namprd11.prod.outlook.com \
--to=yi.l.liu@intel.com \
--cc=alex.williamson@redhat.com \
--cc=baolu.lu@linux.intel.com \
--cc=iommu@lists.linux.dev \
--cc=jgg@nvidia.com \
--cc=joro@8bytes.org \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=nicolinc@nvidia.com \
--cc=robin.murphy@arm.com \
--cc=shuah@kernel.org \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).