iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
From: "Prakhya, Sai Praneeth" <sai.praneeth.prakhya@intel.com>
To: Lu Baolu <baolu.lu@linux.intel.com>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>
Cc: "Raj, Ashok" <ashok.raj@intel.com>,
	Will Deacon <will.deacon@arm.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Christoph Hellwig <hch@lst.de>
Subject: RE: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
Date: Mon, 24 Feb 2020 03:20:14 +0000	[thread overview]
Message-ID: <FFF73D592F13FD46B8700F0A279B802F572177EC@ORSMSX114.amr.corp.intel.com> (raw)
In-Reply-To: <b8b5ac1f-b736-531d-a621-030ec8e3e7b1@linux.intel.com>

> On 2020/2/17 5:57, Sai Praneeth Prakhya wrote:
> > Presently, the default domain of an iommu_group is allocated during
> > boot time (i.e. when a device is being added to a group) and it cannot
> > be changed later. So, the device would typically be either in identity
> > (also known as pass_through) mode (controlled by "iommu=pt" kernel
> > command line
> 
> The default domain type is initialized according to the kernel build option, and
> could be overrided by several kernel commands like iommu=pt and
> iommu.passthrough.

Yes, that's right. Will change this text.

> > argument) or the device would be in DMA mode as long as the machine is
> > up and running. There is no way to change the default domain type
> > dynamically i.e. after booting, a device cannot switch between
> > identity mode and DMA mode.
> >
> > But, assume a use case wherein the user trusts the device and also
> > believes that the OS is secure enough and hence wants *only* this
> > device to bypass IOMMU (so that it could be high performing) whereas
> > all the other devices to go through IOMMU (so that the system is
> > protected). Presently, this use case is not supported. Hence it will
> > be helpful if there is some way to change the default domain of a
> > B:D.F dynamically. Since, linux iommu subsystem prefers to deal at
> > iommu_group level instead of B:D.F level, it might be helpful if there
> > is some way to change the default domain of a
> > *iommu_group* dynamically. Hence, add such support.

[snipped]

> > +/*
> > + * Changes the default domain of a group
> > + *
> > + * @group: The group for which the default domain should be changed
> > + * @prv_dom: The previous domain that is being switched from
> 
> The previous domain is still kept in group->default_domain, so it's unnecessary
> to pass it as a parameter, right?

Yes, you are right. I passed it as an argument thinking that I could make one assignment less in this function. The caller of this function iommu_group_store_type() already has prv_dom, so just passed it. But, will change this so that instead of every caller getting and passing the argument, it's easier to do it in this function.

> > + * @type: The type of the new default domain that gets associated
> > +with the group
> > + *
> > + * Returns 0 on success and error code on failure  */ static int
> > +iommu_group_change_def_domain(struct iommu_group *group,
> > +					 struct iommu_domain *prv_dom,
> > +					 int type)
> > +{
> > +	struct group_device *grp_dev, *temp;
> > +	struct iommu_domain *new_domain;
> > +	int ret;
> > +
> > +	/*
> > +	 * iommu_domain_alloc() takes "struct bus_type" as an argument which
> is
> > +	 * a member in "struct device". Changing a group's default domain type
> > +	 * deals at iommu_group level rather than device level and hence there
> > +	 * is no straight forward way to get "bus_type" of an iommu_group that
> > +	 * could be passed to iommu_domain_alloc(). So, instead of directly
> > +	 * calling iommu_domain_alloc(), use iommu_ops from previous default
> > +	 * domain.
> > +	 */
> > +	if (!prv_dom || !prv_dom->ops || !prv_dom->ops->domain_alloc ||
> !type)
> > +		return -EINVAL;
> > +
> > +	/* Allocate a new domain of requested type */
> > +	new_domain = prv_dom->ops->domain_alloc(type);
> > +	if (!new_domain) {
> > +		pr_err("Unable to allocate memory for the new domain\n");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	new_domain->type = type;
> > +	new_domain->ops = prv_dom->ops;
> > +	new_domain->pgsize_bitmap = prv_dom->pgsize_bitmap;
> > +
> > +	/*
> > +	 * Set these upfront here because iommu_group_add_device() and
> > +	 * iommu_group_create_direct_mappings() needs these to be set
> > +	 */
> > +	mutex_lock(&group->mutex);
> > +	group->default_domain = new_domain;
> > +	group->domain = new_domain;
> > +	mutex_unlock(&group->mutex);
> > +
> > +	iommu_group_ref_get(group);
> > +
> > +	list_for_each_entry_safe(grp_dev, temp, &group->devices, list) {
> > +		struct device *dev;
> > +
> > +		dev = grp_dev->dev;
> > +		iommu_release_device(dev);
> > +
> > +		ret = iommu_group_add_device(group, dev);
> > +		if (ret)
> > +			dev_err(dev, "Failed to add to iommu group %d: %d\n",
> > +				group->id, ret);
> 
> Need to handle this error case.

I wasn't sure on how to handle the error ☹
i.e. group's domain/default_domain are already updated to new domain and assume there are 'n' devices in the group and this failed for 'k'th device, I wasn't sure how I could roll back the changes made for k-1 devices. So, I thought probably just alert the user that there was an error while changing default domain type and try updating for other devices in the group (hopefully other devices might succeed). Also, *generally* we shouldn't see any errors here because all these devices were already in the same group earlier (we aren’t adding/removing new devices to the group). We are just changing default domain type and we already made sure that device could be in the requested default domain type.

> > +
> > +		ret = prv_dom->ops->add_device(dev);
> > +		if (ret)
> > +			dev_err(dev, "Error adding to iommu: %d\n", ret);
> 
> Ditto.
> 
> > +	}
> > +
> > +	iommu_group_put(group);
> > +	iommu_domain_free(prv_dom);
> > +	return 0;
> > +}
> > +
> > +static int is_driver_bound(struct device *dev, void *not_used) {
> > +	int ret = 0;
> > +
> > +	device_lock(dev);
> > +	if (device_is_bound(dev))
> > +		ret = 1;
> > +	device_unlock(dev);
> > +	return ret;
> > +}
> > +
> > +static ssize_t iommu_group_store_type(struct iommu_group *group,
> > +				      const char *buf, size_t count) {
> > +	int ret = 0, req_type = 0, req_auto = 0;
> > +	struct iommu_domain *prv_dom;
> > +	struct group_device *grp_dev;
> > +	const struct iommu_ops *ops;
> > +	struct device *dev;
> > +
> > +	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
> > +		return -EACCES;
> > +
> > +	if (WARN_ON(!group))
> > +		return -EINVAL;
> > +
> > +	if (sysfs_streq(buf, "identity"))
> > +		req_type = IOMMU_DOMAIN_IDENTITY;
> > +	else if (sysfs_streq(buf, "DMA"))
> > +		req_type = IOMMU_DOMAIN_DMA;
> > +	else if (sysfs_streq(buf, "auto"))
> > +		req_auto = 1;
> > +	else
> > +		return -EINVAL;
> > +
> > +	/*
> > +	 * Check if any device in the group still has a driver binded to it.
> > +	 * This might race with device driver probing code and unfortunately
> > +	 * there is no clean way out of that either, locking all devices in the
> > +	 * group and then do the re-attach will introduce a lock-inversion with
> > +	 * group->mutex - Joerg.
> 
> Do you mean that we can't do below?
> 
> mutex_lock(&group->mutex);
> for_each_group_device()
> 	device_lock(dev);
> 
> /* Default domain switch */
> for_each_group_device()
> 	device_unlock()
> mutex_unlock(&group->mutex)

I think, Joerg talks about two issues here
1. is_driver_bound() takes/releases device_lock() which could race with device driver probing code i.e. in an other terminal user could try to unload/load the module.
2. We cannot do as you suggested above because the functions (one is mentioned below) that switch default domain need to take iommu_group_mutex lock. So, taking the mutex and then calling iommu functions that switch default domain will dead lock.

> > +	 */
> > +	if (iommu_group_for_each_dev(group, NULL, is_driver_bound)) {
> > +		pr_err("Active drivers exist for devices in the group\n");
> > +		return -EBUSY;
> > +	}
> > +
> > +	mutex_lock(&group->mutex);
> > +	prv_dom = group->default_domain;
> > +	if (!prv_dom || !prv_dom->ops || !prv_dom->ops-
> >dev_def_domain_type) {
> > +		pr_err("'dev_def_domain_type' call back isn't registered\n");
> > +		ret = -EPERM;
> > +		goto out;
> > +	}
> > +
> > +	/*
> > +	 * Check if any user level driver (that doesn't use kernel driver like
> > +	 * VFIO) is directly using the group. VFIO use case is detected by
> > +	 * is_driver_bound() check above
> > +	 */
> > +	if (prv_dom != group->domain) {
> > +		pr_err("Group assigned to user level for direct access\n");
> > +		ret = -EBUSY;
> > +		goto out;
> > +	}
> > +
> > +	/*
> > +	 * If user has requested "auto", kernel has to decide the default domain
> > +	 * type of a group. Hence, find out individual preferences of a device.
> > +	 */
> > +	ops = prv_dom->ops;
> > +	if (req_auto) {
> > +		int dma_devs = 0, idt_devs = 0, dev_def_dom;
> > +
> > +		list_for_each_entry(grp_dev, &group->devices, list) {
> > +			dev = grp_dev->dev;
> > +			dev_def_dom = ops->dev_def_domain_type(dev);
> > +			if (dev_def_dom == IOMMU_DOMAIN_IDENTITY)
> > +				idt_devs++;
> > +			if (dev_def_dom == IOMMU_DOMAIN_DMA)
> > +				dma_devs++;
> > +		}
> > +
> > +		/*
> > +		 * Default domain type of a group is undefined if some devices
> > +		 * in the group should be in dma domain while other devices
> > +		 * should be in identity domain
> > +		 */
> > +		if (idt_devs && dma_devs) {
> > +			pr_err("Some devices need identity domain while other
> need dma domain\n");
> > +			ret = -EPERM;
> > +			goto out;
> > +		}
> > +
> > +		/*
> > +		 * Default domain type of a group is identity if
> > +		 * 1. All the devices in the group need to be in identity domain
> > +		 * 2. Some devices should be in identity domain while other
> > +		 *    devices could be in either of dma or identity domain
> > +		 */
> > +		if (idt_devs && !dma_devs)
> > +			req_type = IOMMU_DOMAIN_IDENTITY;
> > +
> > +		/*
> > +		 * Default domain type of a group is dma if
> > +		 * 1. All the devices in the group need to be in dma domain
> > +		 * 2. Some devices should be in dma domain while other devices
> > +		 *    could be in either of dma or identity domain
> > +		 * 3. All the devices could be in either of the domains (namely
> > +		 *    dma and identity)
> > +		 */
> > +		if (!idt_devs)
> > +			req_type = IOMMU_DOMAIN_DMA;
> 
> Actually, There are four combinations:
> 
> 			idt_devs==0		idt_devs!=0
> dma_devs == 0		iommu_def_domain_type	identity
> dma_devs != 0		DMA			unsupported

Yes.. you are right. All these four combinations are presently handled in the code.
The comment above talks about combinations that lead us to select DMA domain.

The cases you mentioned above are handled as:
1. dma_devs == 0 && idt_devs == 0 -> In code, req_type = IOMMU_DOMAIN_DMA; because this means *all* the devices in the group can be in either of identity domain or DMA domain, hence select DMA domain over identity domain.
2. dma_devs != 0 && idt_devs == 0 -> In code, req_type = IOMMU_DOMAIN_DMA; because this means some devices in the group *should* be in DMA domain and other devices in the group could be in either of identity domain or DMA domain.
3. dma_devs == 0 && idt_devs != 0 -> In code, req_type = IOMMU_DOMAIN_IDENTITY; because this means some devices in the group *should* be in identity domain and other devices in the group could be in either of identity domain or DMA domain.
4. dma_devs != 0 && idt_devs != 0 -> In code, return error, because this means some devices in the group *should* be in identity domain and other devices in the group *should* be in DMA domain. This combination is not expected. This situation shouldn't arise because this would have failed during boot itself because these kind of devices aren't grouped together in the first place.

> > +	}
> > +
> > +	/*
> > +	 * Switch to a new domain only if the requested domain type is different
> > +	 * from the existing default domain type
> > +	 */
> > +	if (prv_dom->type == req_type)
> > +		goto out;
> > +
> > +	/*
> > +	 * Every device may not support both the domain types (namely DMA
> and
> > +	 * identity), so check if it's ok to change domain type of every device
> > +	 * in the group to the requested domain. This check is not required if
> > +	 * user has requested "auto" because it's already done above implicitly.
> > +	 */
> > +	if (!req_auto) {
> > +		list_for_each_entry(grp_dev, &group->devices, list) {
> > +			int allowed_types;
> > +
> > +			dev = grp_dev->dev;
> > +			allowed_types = ops->dev_def_domain_type(dev);
> > +			if (allowed_types && allowed_types != req_type) {
> > +				dev_err(dev, "Cannot be in %s domain\n", buf);
> > +				ret = -EPERM;
> > +				goto out;
> > +			}
> > +		}
> > +	}
> > +
> > +	mutex_unlock(&group->mutex);
> > +	ret = iommu_group_change_def_domain(group, prv_dom, req_type);
> 
> Why do you want to put group->mutex before do the real domain switching?

Because the below call stack takes iommu_group_mutex. I drop the lock here to facilitate it, otherwise the kernel hangs trying to get the same lock which it is already holding on to 

mutex_lock(&group->mutex);
iommu_group_remove_device()
intel_iommu_remove_device()
iommu_release_device()
iommu_group_change_def_domain()

Regards,
Sai
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

  reply	other threads:[~2020-02-24  3:20 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-16 21:57 [PATCH V2 0/5] iommu: Add support to change default domain of a group Sai Praneeth Prakhya
2020-02-16 21:57 ` [PATCH V2 1/5] iommu: Add dev_def_domain_type() call back function to iommu_ops Sai Praneeth Prakhya
2020-02-22 23:37   ` Lu Baolu
2020-02-22 23:39     ` Prakhya, Sai Praneeth
2020-02-16 21:57 ` [PATCH V2 2/5] iommu/vt-d: Rename device_def_domain_type() to intel_iommu_dev_def_domain_type() Sai Praneeth Prakhya
2020-02-22 23:42   ` Lu Baolu
2020-02-22 23:59     ` Prakhya, Sai Praneeth
2020-02-23  1:50       ` Lu Baolu
2020-02-24  3:23         ` Prakhya, Sai Praneeth
2020-02-16 21:57 ` [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group Sai Praneeth Prakhya
2020-02-23  1:20   ` Lu Baolu
2020-02-24  3:20     ` Prakhya, Sai Praneeth [this message]
2020-02-24  5:46       ` Lu Baolu
2020-02-24  7:03         ` Prakhya, Sai Praneeth
2020-02-24  7:39           ` Lu Baolu
2020-02-24  7:57             ` Prakhya, Sai Praneeth
2020-02-24  8:12               ` Lu Baolu
2020-02-24  8:39                 ` Lu Baolu
2020-02-24  8:44                   ` Prakhya, Sai Praneeth
2020-03-02 15:08   ` Joerg Roedel
2020-03-03  6:47     ` Lu Baolu
2020-03-03 13:13       ` Joerg Roedel
2020-03-04 12:17         ` Lu Baolu
2020-02-16 21:57 ` [PATCH V2 4/5] iommu: Take lock before reading iommu_group default domain type Sai Praneeth Prakhya
2020-02-16 21:57 ` [PATCH V2 5/5] iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file Sai Praneeth Prakhya
2020-02-23  1:38   ` Lu Baolu
2020-02-24  2:18     ` Prakhya, Sai Praneeth
2020-02-22 23:40 ` [PATCH V2 0/5] iommu: Add support to change default domain of a group Prakhya, Sai Praneeth

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=FFF73D592F13FD46B8700F0A279B802F572177EC@ORSMSX114.amr.corp.intel.com \
    --to=sai.praneeth.prakhya@intel.com \
    --cc=ashok.raj@intel.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux-foundation.org \
    --cc=robin.murphy@arm.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).