iommu.lists.linux-foundation.org archive mirror
From: Jason Gunthorpe <jgg@nvidia.com>
To: Matthew Rosato <mjrosato@linux.ibm.com>
Cc: Robin Murphy <robin.murphy@arm.com>,
	iommu@lists.linux.dev,
	Alex Williamson <alex.williamson@redhat.com>,
	linux-s390@vger.kernel.org, schnelle@linux.ibm.com,
	pmorel@linux.ibm.com, borntraeger@linux.ibm.com,
	hca@linux.ibm.com, gor@linux.ibm.com,
	gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com,
	svens@linux.ibm.com, joro@8bytes.org, will@kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/2] iommu/s390: Fix race with release_device ops
Date: Fri, 2 Sep 2022 14:21:23 -0300
Message-ID: <YxI7kzuchcJz8sRX@nvidia.com>
In-Reply-To: <273fdd58-549c-30d4-39a9-85fe631162ba@linux.ibm.com>

On Fri, Sep 02, 2022 at 01:11:09PM -0400, Matthew Rosato wrote:
> On 9/1/22 4:37 PM, Jason Gunthorpe wrote:
> > On Thu, Sep 01, 2022 at 12:14:24PM -0400, Matthew Rosato wrote:
> >> On 9/1/22 6:25 AM, Robin Murphy wrote:
> >>> On 2022-08-31 21:12, Matthew Rosato wrote:
> >>>> With commit fa7e9ecc5e1c ("iommu/s390: Tolerate repeat attach_dev
> >>>> calls") s390-iommu is supposed to handle dynamic switching between IOMMU
> >>>> domains and the DMA API handling.  However, this commit does not
> >>>> sufficiently handle the case where the device is released via a call
> >>>> to the release_device op as it may occur at the same time as an opposing
> >>>> attach_dev or detach_dev since the group mutex is not held over
> >>>> release_device.  This was observed if the device is deconfigured during a
> >>>> small window during vfio-pci initialization and can result in WARNs and
> >>>> potential kernel panics.
> >>>
> >>> Hmm, the more I think about it, something doesn't sit right about
> >>> this whole situation... release_device is called via the notifier
> >>> from device_del() after the device has been removed from its parent
> >>> bus and largely dismantled; it should definitely not still have a
> >>> driver bound by that point, so how is VFIO doing things that manage
> >>> to race at all?
> >>>
> >>> Robin.
> >>
> >> So, I generally have seen the issue manifest as one of the calls
> >> into the iommu core from __vfio_group_unset_container
> >> (e.g. iommu_detach_group via vfio_iommu_type1) failing with a WARN.
> >> This happens when the vfio group fd is released, which could be
> >> coming e.g. from a userspace ioctl VFIO_GROUP_UNSET_CONTAINER.
> >> AFAICT there's nothing serializing the notion of calling into the
> >> iommu core here against a device that is simultaneously going
> >> through release_device (because we don't enter release_device with
> >> the group mutex held), resulting in unpredictable behavior between
> >> the dueling attach_dev/detach_dev and the release_device for
> >> s390-iommu at least.
> > 
> > Oh, this is a vfio bug.
> 
> I've been running with your diff applied today on s390 and this
> indeed fixes the issue by preventing the detach-after-release coming
> out of vfio. 

Heh, I'm shocked it worked at all.

I've been trying to understand Robin's latest remarks because maybe I
don't really understand your situation right.

IMHO this is definitely a VFIO bug, because in a single-device group
we must not allow the domain to remain attached past remove(). Or more
broadly we shouldn't be holding ownership of a group without also
having a driver attached.

But this discussion with Robin about multi-device groups and hotplug
makes me wonder what your situation is? There is certainly something
interesting there too, and this can't be a solution to that problem.

> Can you send as a patch for review?

After I wrote this I had a better idea, to avoid the completion and
just fully orphan the group fd.

And the patch is kind of messy

Can you forward me the backtrace you hit also?

(Though I'm not sure I can get to this promptly, I have only 4 working
days before LPC and still many things to do)

Thanks,
Jason

Thread overview: 24+ messages
2022-08-31 20:12 [PATCH v4 0/2] iommu/s390: fixes related to repeat attach_dev calls Matthew Rosato
2022-08-31 20:12 ` [PATCH v4 1/2] iommu/s390: Fix race with release_device ops Matthew Rosato
2022-09-01  7:56   ` Pierre Morel
2022-09-01  9:37     ` Niklas Schnelle
2022-09-01 11:01       ` Robin Murphy
2022-09-01 13:42         ` Niklas Schnelle
2022-09-01 14:17           ` Niklas Schnelle
2022-09-01 14:29           ` Robin Murphy
2022-09-01 14:34             ` Jason Gunthorpe
2022-09-01 15:03               ` Robin Murphy
2022-09-01 15:49                 ` Jason Gunthorpe
2022-09-01 17:00                   ` Robin Murphy
2022-09-01 20:28       ` Matthew Rosato
2022-09-02  7:49         ` Niklas Schnelle
2022-09-01 10:25   ` Robin Murphy
2022-09-01 16:14     ` Matthew Rosato
2022-09-01 20:37       ` Jason Gunthorpe
2022-09-02 17:11         ` Matthew Rosato
2022-09-02 17:21           ` Jason Gunthorpe [this message]
2022-09-02 18:20             ` Matthew Rosato
2022-09-05  9:46             ` Robin Murphy
2022-09-06 13:36               ` Jason Gunthorpe
2022-09-02 10:48       ` Robin Murphy
2022-08-31 20:12 ` [PATCH v4 2/2] iommu/s390: fix leak of s390_domain_device Matthew Rosato
