iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
From: Robin Murphy <robin.murphy@arm.com>
To: Jason Gunthorpe <jgg@nvidia.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Cornelia Huck <cohuck@redhat.com>,
	iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, Will Deacon <will@kernel.org>
Cc: Qian Cai <cai@lca.pw>, Joerg Roedel <jroedel@suse.de>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Matthew Rosato <mjrosato@linux.ibm.com>
Subject: Re: [PATCH 4/4] iommu: Fix ordering of iommu_release_device()
Date: Thu, 8 Sep 2022 22:27:06 +0100	[thread overview]
Message-ID: <ada74e00-77e1-770b-f0b7-a4c43a86c06f@arm.com> (raw)
In-Reply-To: <87b7041e-bc8d-500c-7167-04190e3795a9@arm.com>

On 2022-09-08 22:05, Robin Murphy wrote:
> On 2022-09-08 19:45, Jason Gunthorpe wrote:
>> default domains created a situation where the device is always connected
>> to a domain of some kind. When the device is idle it is connected to one
>> of the two pre-existing domains in the group, blocking_domain or
>> default_domain. In this way we have a continuous assertion of what state
>> the transation is in.
>>
>> When this is all destructed then we need to remove all the devices from
>> their domains via the ops->release_device() call before the domain can be
>> freed. This is the bug recognized in commit 9ac8545199a1 ("iommu: Fix
>> use-after-free in iommu_release_device").
>>
>> However, we must also stop any concurrent access to the iommu driver for
>> this device before we destroy it. This is done by:
>>
>>   1) Drivers only using the iommu API while they have a device driver
>>      attached to the device. This directly prevents release from 
>> happening.
>>
>>   2) Removing the device from the group list so any lingering group
>>      references no longer refer to the device. This is done by
>>      iommu_group_remove_device()
>>
>> Since iommu_group_remove_device() has been moved this breaks #2 and
>> triggers an WARN when VFIO races group activities with the release of the
>> device:
>>
>>     iommu driver failed to attach the default/blocking domain
>>     WARNING: CPU: 0 PID: 5082 at drivers/iommu/iommu.c:1961 
>> iommu_detach_group+0x6c/0x80
>>     Modules linked in: macvtap macvlan tap vfio_pci vfio_pci_core 
>> irqbypass vfio_virqfd kvm nft_fib_inet nft_fib_ipv4 nft_fib_ipv6>
>>     CPU: 0 PID: 5082 Comm: qemu-system-s39 Tainted: G        
>> W          6.0.0-rc3 #5
>>     Hardware name: IBM 3931 A01 782 (LPAR)
>>     Krnl PSW : 0704c00180000000 000000095bb10d28 
>> (iommu_detach_group+0x70/0x80)
>>           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
>>     Krnl GPRS: 0000000000000001 0000000900000027 0000000000000039 
>> 000000095c97ffe0
>>           00000000fffeffff 00000009fc290000 00000000af1fda50 
>> 00000000af590b58
>>           00000000af1fdaf0 0000000135c7a320 0000000135e52258 
>> 0000000135e52200
>>           00000000a29e8000 00000000af590b40 000000095bb10d24 
>> 0000038004b13c98
>>     Krnl Code: 000000095bb10d18: c020003d56fc        larl    
>> %r2,000000095c2bbb10
>>               000000095bb10d1e: c0e50019d901        brasl   
>> %r14,000000095be4bf20
>>              #000000095bb10d24: af000000            mc      0,0
>>              >000000095bb10d28: b904002a            lgr     %r2,%r10
>>               000000095bb10d2c: ebaff0a00004        lmg     
>> %r10,%r15,160(%r15)
>>               000000095bb10d32: c0f4001aa867        brcl    
>> 15,000000095be65e00
>>               000000095bb10d38: c004002168e0        brcl    
>> 0,000000095bf3def8
>>               000000095bb10d3e: eb6ff0480024        stmg    
>> %r6,%r15,72(%r15)
>>     Call Trace:
>>      [<000000095bb10d28>] iommu_detach_group+0x70/0x80
>>     ([<000000095bb10d24>] iommu_detach_group+0x6c/0x80)
>>      [<000003ff80243b0e>] vfio_iommu_type1_detach_group+0x136/0x6c8 
>> [vfio_iommu_type1]
>>      [<000003ff80137780>] __vfio_group_unset_container+0x58/0x158 [vfio]
>>      [<000003ff80138a16>] vfio_group_fops_unl_ioctl+0x1b6/0x210 [vfio]
>>     pci 0004:00:00.0: Removing from iommu group 4
>>      [<000000095b5b62e8>] __s390x_sys_ioctl+0xc0/0x100
>>      [<000000095be5d3b4>] __do_syscall+0x1d4/0x200
>>      [<000000095be6c072>] system_call+0x82/0xb0
>>     Last Breaking-Event-Address:
>>      [<000000095be4bf80>] __warn_printk+0x60/0x68
>>
>> So, put things in the right order:
>>   - Remove the device from the group's list
>>   - Release the device from the iommu driver to drop all domain 
>> references
>>   - Free the domains
>>
>> This is done by splitting out the kobject_put(), which triggers
>> iommu_group_release(), from the rest of iommu_group_remove_device() and
>> placing it after release is called.
> 
> So simple... now how did I fail to think of that? :)

Oh, because s390 is using iommu_get_domain_for_dev() in its 
release_device callback, which needs to dereference the group to work, 
and the current domain may also be a non-default one which we can't 
prevent from disappearing racily, that was why :(

I think we may be back to looking at s390 having to keep track of 
domains internally like SMMUv3 does, and both drivers synchronising 
between their domain_free and release_device to to do their internal 
detach from either place... unless possibly the core code starts 
refcounting domains as well?

Robin.

>> Fixes: 9ac8545199a1 ("iommu: Fix use-after-free in iommu_release_device")
>> Reported-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> Cc: Robin Murphy <robin.murphy@arm.com>
>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>> ---
>>   drivers/iommu/iommu.c | 36 +++++++++++++++++++++++++++---------
>>   1 file changed, 27 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 780fb70715770d..c451bf715182ac 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -90,6 +90,7 @@ static int 
>> iommu_create_device_direct_mappings(struct iommu_group *group,
>>   static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
>>   static ssize_t iommu_group_store_type(struct iommu_group *group,
>>                         const char *buf, size_t count);
>> +static void __iommu_group_remove_device(struct device *dev);
>>   #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)        \
>>   struct iommu_group_attribute iommu_group_attr_##_name =        \
>> @@ -330,6 +331,7 @@ int iommu_probe_device(struct device *dev)
>>   void iommu_release_device(struct device *dev)
>>   {
>> +    struct iommu_group *group = dev->iommu_group;
>>       const struct iommu_ops *ops;
>>       if (!dev->iommu)
>> @@ -337,11 +339,20 @@ void iommu_release_device(struct device *dev)
>>       iommu_device_unlink(dev->iommu->iommu_dev, dev);
> 
> In fact, now that you've made it obvious, could we not simply do an 
> extra kobject_get() here before calling regular 
> iommu_group_remove_device(), and avoid having to split that up at all? 
> That should delay any default domain teardown just as definitively as 
> holding the original reference for longer, no?
> 
> Thanks,
> Robin.
> 
>> +    __iommu_group_remove_device(dev);
>>       ops = dev_iommu_ops(dev);
>>       if (ops->release_device)
>>           ops->release_device(dev);
>> -    iommu_group_remove_device(dev);
>> +    /*
>> +     * This will eventually call iommu_group_release() which will 
>> free the
>> +     * iommu_domains. Up until the release_device() above the 
>> iommu_domains
>> +     * may still have been associated with the device, and we cannot 
>> free
>> +     * them until the have been detached. release_device() is 
>> expected to
>> +     * detach all domains connected to the dev.
>> +     */
>> +    kobject_put(group->devices_kobj);
>> +
>>       module_put(ops->owner);
>>       dev_iommu_free(dev);
>>   }
>> @@ -939,14 +950,7 @@ int iommu_group_add_device(struct iommu_group 
>> *group, struct device *dev)
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_group_add_device);
>> -/**
>> - * iommu_group_remove_device - remove a device from it's current group
>> - * @dev: device to be removed
>> - *
>> - * This function is called by an iommu driver to remove the device from
>> - * it's current group.  This decrements the iommu group reference count.
>> - */
>> -void iommu_group_remove_device(struct device *dev)
>> +static void __iommu_group_remove_device(struct device *dev)
>>   {
>>       struct iommu_group *group = dev->iommu_group;
>>       struct group_device *tmp_device, *device = NULL;
>> @@ -977,6 +981,20 @@ void iommu_group_remove_device(struct device *dev)
>>       kfree(device->name);
>>       kfree(device);
>>       dev->iommu_group = NULL;
>> +}
>> +
>> +/**
>> + * iommu_group_remove_device - remove a device from it's current group
>> + * @dev: device to be removed
>> + *
>> + * This function is called by an iommu driver to remove the device from
>> + * it's current group.  This decrements the iommu group reference count.
>> + */
>> +void iommu_group_remove_device(struct device *dev)
>> +{
>> +    struct iommu_group *group = dev->iommu_group;
>> +
>> +    __iommu_group_remove_device(dev);
>>       kobject_put(group->devices_kobj);
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_group_remove_device);
> 

  reply	other threads:[~2022-09-08 21:27 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-08 18:44 [PATCH 0/4] Fix splats releated to using the iommu_group after destroying devices Jason Gunthorpe
2022-09-08 18:44 ` [PATCH 1/4] vfio: Simplify vfio_create_group() Jason Gunthorpe
2022-09-20 19:45   ` Matthew Rosato
2022-09-08 18:44 ` [PATCH 2/4] vfio: Move the sanity check of the group to vfio_create_group() Jason Gunthorpe
2022-09-22 19:10   ` Alex Williamson
2022-09-22 19:36     ` Jason Gunthorpe
2022-09-22 21:23       ` Alex Williamson
2022-09-22 23:12         ` Jason Gunthorpe
2022-09-08 18:45 ` [PATCH 3/4] vfio: Follow a strict lifetime for struct iommu_group * Jason Gunthorpe
2022-09-20 19:32   ` Matthew Rosato
2022-09-08 18:45 ` [PATCH 4/4] iommu: Fix ordering of iommu_release_device() Jason Gunthorpe
2022-09-08 21:05   ` Robin Murphy
2022-09-08 21:27     ` Robin Murphy [this message]
2022-09-08 21:43       ` Jason Gunthorpe
2022-09-09  9:05         ` Robin Murphy
2022-09-09 13:25           ` Jason Gunthorpe
2022-09-09 17:57             ` Robin Murphy
2022-09-09 18:30               ` Jason Gunthorpe
2022-09-09 19:55                 ` Robin Murphy
2022-09-09 23:45                   ` Jason Gunthorpe
2022-09-12 11:13                     ` Robin Murphy
2022-09-22 16:56                       ` Jason Gunthorpe
2022-09-09 12:49 ` [PATCH 0/4] Fix splats releated to using the iommu_group after destroying devices Matthew Rosato
2022-09-09 16:24   ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ada74e00-77e1-770b-f0b7-a4c43a86c06f@arm.com \
    --to=robin.murphy@arm.com \
    --cc=alex.williamson@redhat.com \
    --cc=cai@lca.pw \
    --cc=cohuck@redhat.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=jroedel@suse.de \
    --cc=kvm@vger.kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=mjrosato@linux.ibm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).