From: Jason Gunthorpe <jgg@nvidia.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: "iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	"linux-kselftest@vger.kernel.org"
	<linux-kselftest@vger.kernel.org>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Nicolin Chen <nicolinc@nvidia.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>
Subject: Re: [PATCH v3 03/17] iommufd: Replace the hwpt->devices list with iommufd_group
Date: Fri, 14 Apr 2023 10:31:33 -0300
Message-ID: <ZDlVtcwhV2G8ZKao@nvidia.com>
In-Reply-To: <BN9PR11MB52762841AAA04A24F76A743C8C989@BN9PR11MB5276.namprd11.prod.outlook.com>

On Thu, Apr 13, 2023 at 02:52:54AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, April 12, 2023 7:18 PM
> > 
> > On Wed, Apr 12, 2023 at 08:27:36AM +0000, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > Sent: Tuesday, April 11, 2023 10:31 PM
> > > >
> > > > On Thu, Mar 23, 2023 at 07:21:42AM +0000, Tian, Kevin wrote:
> > > >
> > > > > If no oversight then we can directly put the lock in
> > > > > iommufd_hw_pagetable_attach/detach() which can also simplify a bit on
> > > > > its callers in device.c.
> > > >
> > > > So, I did this, and syzkaller explains why this can't be done:
> > > >
> > > > https://lore.kernel.org/r/0000000000006e66d605f83e09bc@google.com
> > > >
> > > > We can't allow the hwpt to be discovered by a parallel
> > > > iommufd_hw_pagetable_attach() until it is done being setup, otherwise
> > > > if we fail to set it up we can't destroy the hwpt.
> > > >
> > > > 	if (immediate_attach) {
> > > > 		rc = iommufd_hw_pagetable_attach(hwpt, idev);
> > > > 		if (rc)
> > > > 			goto out_abort;
> > > > 	}
> > > >
> > > > 	rc = iopt_table_add_domain(&hwpt->ioas->iopt, hwpt->domain);
> > > > 	if (rc)
> > > > 		goto out_detach;
> > > > 	list_add_tail(&hwpt->hwpt_item, &hwpt->ioas->hwpt_list);
> > > > 	return hwpt;
> > > >
> > > > out_detach:
> > > > 	if (immediate_attach)
> > > > 		iommufd_hw_pagetable_detach(idev);
> > > > out_abort:
> > > > 	iommufd_object_abort_and_destroy(ictx, &hwpt->obj);
> > > >
> > > > As some other idev could be pointing at it too now.
> > >
> > > How could this happen before this object is finalized? iirc you pointed to
> > > me this fact in previous discussion.
> > 
> > It only is unavailable through the xarray, but we've added it to at
> > least one internal list on the group already, it is kind of sketchy to
> > work like this, it should all be atomic..
> > 
> 
> which internal list? group has a list for attached devices but regarding
> to hwpt it's stored in a single field igroup->hwpt.

It is added to the ioas->hwpt_list by

	list_add_tail(&hwpt->hwpt_item, &hwpt->ioas->hwpt_list);

Which can be observed from

	mutex_lock(&ioas->mutex);
	list_for_each_entry(hwpt, &ioas->hwpt_list, hwpt_item) {
		if (!hwpt->auto_domain)
			continue;

		if (!iommufd_lock_obj(&hwpt->obj))
			continue;

If iommufd_lock_obj() has already succeeded on the new hwpt, then
iommufd_object_abort_and_destroy() is in trouble, because another
thread now holds the object we are about to destroy.

Thus we need to hold the ioas->mutex right up until we know we can't
call iommufd_object_abort_and_destroy(), or lift out the hwpt list_add.
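
To make the ordering concrete, here is a minimal userspace sketch of the
second option (publish to the list only after every step that can fail).
It is not the iommufd code: the toy_* types and functions are hypothetical
stand-ins, a pthread mutex plays the role of ioas->mutex, and a plain
counter plays the role of the reference that iommufd_lock_obj() takes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the iommufd structures, not the kernel types. */
struct toy_hwpt {
	struct toy_hwpt *next;	/* link in toy_ioas.hwpt_list */
	int users;		/* references taken by list walkers */
};

struct toy_ioas {
	pthread_mutex_t mutex;
	struct toy_hwpt *hwpt_list;
};

/* Walker: takes a reference on the first hwpt it can see on the list. */
static struct toy_hwpt *toy_find_and_lock(struct toy_ioas *ioas)
{
	struct toy_hwpt *hwpt;

	pthread_mutex_lock(&ioas->mutex);
	hwpt = ioas->hwpt_list;
	if (hwpt)
		hwpt->users++;
	pthread_mutex_unlock(&ioas->mutex);
	return hwpt;
}

/* Abort path: only valid while nobody else holds a reference. */
static void toy_abort_and_destroy(struct toy_hwpt *hwpt)
{
	assert(hwpt->users == 0);
	free(hwpt);
}

/*
 * Allocator: the fallible setup runs before the list insertion and the
 * mutex is held across both, so the abort path only ever frees an object
 * that no walker could have observed.
 */
static struct toy_hwpt *toy_hwpt_alloc(struct toy_ioas *ioas, bool fail_setup)
{
	struct toy_hwpt *hwpt = calloc(1, sizeof(*hwpt));

	if (!hwpt)
		return NULL;

	pthread_mutex_lock(&ioas->mutex);
	if (fail_setup) {
		/* Still private to this thread: safe to destroy. */
		pthread_mutex_unlock(&ioas->mutex);
		toy_abort_and_destroy(hwpt);
		return NULL;
	}
	/* Nothing can fail past this point, so publishing is safe. */
	hwpt->next = ioas->hwpt_list;
	ioas->hwpt_list = hwpt;
	pthread_mutex_unlock(&ioas->mutex);
	return hwpt;
}
```

If the list insertion were moved above the fail_setup check and the mutex
dropped in between, a concurrent toy_find_and_lock() could bump users
before the abort, which is the situation the assert stands in for.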

This could maybe also be fixed by holding the destroy_rwsem right up
until finalize. Though, I think I looked at this once and decided
against it for some reason.

> btw removing this lock in this file also makes it easier to support siov
> device which doesn't have group. We can have internal group attach
> and pasid attach wrappers within device.c and leave igroup->lock held
> in the group attach path.

Yeah, I expect this will need more work when we get to PASID support.

Most likely the resolution will be something like: PASID domains can't
be used as PF/VF domains because they don't have the right reserved
regions, so they shouldn't be in the hwpt_list at all, and we can use
more relaxed locking for them.

Jason
