linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Kevin Tian <kevin.tian@intel.com>,
	Ashok Raj <ashok.raj@intel.com>,
	kvm@vger.kernel.org, rafael@kernel.org,
	David Airlie <airlied@linux.ie>,
	linux-pci@vger.kernel.org,
	Thierry Reding <thierry.reding@gmail.com>,
	Diana Craciun <diana.craciun@oss.nxp.com>,
	Dmitry Osipenko <digetx@gmail.com>, Will Deacon <will@kernel.org>,
	Stuart Yoder <stuyoder@gmail.com>,
	Jonathan Hunter <jonathanh@nvidia.com>,
	Chaitanya Kulkarni <kch@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Cornelia Huck <cohuck@redhat.com>,
	linux-kernel@vger.kernel.org, Li Yang <leoyang.li@nxp.com>,
	iommu@lists.linux-foundation.org,
	Jacob jun Pan <jacob.jun.pan@intel.com>,
	Daniel Vetter <daniel@ffwll.ch>
Subject: Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
Date: Tue, 22 Feb 2022 19:53:53 -0400	[thread overview]
Message-ID: <20220222235353.GF10061@nvidia.com> (raw)
In-Reply-To: <3d4c3bf1-fed6-f640-dc20-36d667de7461@arm.com>

On Tue, Feb 22, 2022 at 09:18:23PM +0000, Robin Murphy wrote:

> > Still not sure I see what you are thinking though..
> 
> What part of "How hard is it to hold group->mutex when reading or writing
> group->owner?" sounded like "complex lockless algorithm", exactly?

group->owner is not the issue, this series is already using the group
lock to protect the group_cnt and owner.

It is how you inspect the struct device it iterates over to decide if
it is still using the DMA API or not that is the problem. Hint: this
is why I keep mentioning the device_lock() as it is the only locking we
for this today.

> To spell it out, the scheme I'm proposing looks like this:

Well, I already got this, it is what is in driver_or_DMA_API_token()
that matters

I think you are suggesting to do something like:

   if (!READ_ONCE(dev->driver) ||  ???)
       return NULL;
   return group;  // A DMA_API 'token'

Which is locklessly reading dev->driver, and why you are talking about
races, I guess.

> remove:
>        bool still_owned = false;
>        mutex_lock(group->mutex);
>        list_for_each_entry(tmp, &group->devices, list) {
>                void *owner = driver_or_DMA_API_token(tmp);
>                if (tmp == dev || !owner || owner != group->owner)

And here you expect this will never be called if a group is owned by
VFIO? Which bakes in that weird behavior of really_probe() that only
some errors deserve to get a notifier?

How does the next series work? The iommu_attach_device() work relies
on the owner_cnt too. I don't think this list can replace it there.

> always serialised, even if the list walk in the remove hook sees "future"
> information WRT other devices' drivers, at worst it should merely short-cut
> to a corresponding pending reclaim of ownership.

Depending on what the actual logic is I could accept this argument,
assuming it came with a WRITE_ONCE on the store side and we all
thought carefully about how all this is ordered.

> Because the current alternative to adding a few simple lines to dd.c is
> adding loads of lines all over the place to end up calling back into common
> IOMMU code, to do something I'm 99% certain the common IOMMU code
> could do

*shrug* both Christoph and I tried to convince Greg. He never really
explained why, but currently he thinks this is the right way to design
it, and so here we are.

> for itself in private. That said, having worked through the above, it does
> start looking like a bit of a big change for this series at this point, so
> I'm happy to keep it on the back burner for when I have to rip
> .dma_configure to pieces anyway.

OK, thanks.

> According to lockdep, I think I've solved the VFIO locking issue provided
> vfio_group_viable() goes away, so I'm certainly keen not to delay that for
> another cycle!

Indeed.

Keep in mind that lockdep is disabled on the device_lock()..

> paragraph quoted above again. I'm not talking about automatic DMA API
> claiming, that clearly happens per-device; I'm talking about explicit
> callers of iommu_group_claim_dma_owner(). Does VFIO call that multiple times
> for individual devices? No. Should it? No. Is it reasonable that any other
> future callers should need to? I don't think so. Would things be easier to
> reason about if we just disallowed it outright? For sure.

iommufd is device centric and the current draft does call
iommu_group_claim_dma_owner() once for each device. It doesn't have
any reason to track groups, so it has no way to know if it is
"nesting" or not.

I hope the iommu_attach_device() work will progress and iommufd can
eventualy call a cleaner device API, it is setup to work this way at
least.

So, yes, currently future calls need the owner_cnt to work right.
(and we are doing all this to allow iommufd and vfio to share the
ownership logic - adding VFIO-like group tracking complexity to
iommufd to save a few bus callbacks is not a win, IMHO)

> > It has already been 8 weeks on this point, lets just move on please.
> 
> Sure, if it was rc7 with the merge window looming I'd be saying "this is
> close enough, let's get it in now and fix the small stuff next cycle".
> However while there's still potentially time to get things right first time,
> I for one am going to continue to point them out because I'm not a fan of
> avoidable churn. I'm sorry I haven't had a chance to look properly at this
> series between v1 and v6, but that's just how things have been.

Well, it is understandable, but this was supposed to be a smallish
cleanup series. All the improvments from the discussion are welcomed
and certainly did improve it, but this started in November and is
dragging on..

Sometimes we need churn to bring everyone along the journey.

Jason

  reply	other threads:[~2022-02-22 23:54 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-18  0:55 [PATCH v6 00/11] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
2022-02-18  0:55 ` [PATCH v6 01/11] iommu: Add dma ownership management interfaces Lu Baolu
2022-02-19  7:31   ` Christoph Hellwig
2022-02-21  4:02     ` Lu Baolu
2022-02-23 18:00   ` Robin Murphy
2022-02-23 18:02     ` Jason Gunthorpe
2022-02-23 18:20       ` Robin Murphy
2022-02-23 18:32         ` Jason Gunthorpe
2022-02-24  5:16       ` Lu Baolu
2022-02-24  5:29         ` Lu Baolu
2022-02-24  8:58           ` Robin Murphy
2022-02-24  5:21     ` Lu Baolu
2022-02-18  0:55 ` [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type Lu Baolu
2022-02-19  7:32   ` Christoph Hellwig
2022-02-21 20:43     ` Robin Murphy
2022-02-21 23:48       ` Jason Gunthorpe
2022-02-22  4:48         ` Lu Baolu
2022-02-22 10:58         ` Robin Murphy
2022-02-22 15:16           ` Jason Gunthorpe
2022-02-22 21:18             ` Robin Murphy
2022-02-22 23:53               ` Jason Gunthorpe [this message]
2022-02-23  5:01                 ` Lu Baolu
2022-02-23 13:04                   ` Robin Murphy
2022-02-23 13:46                     ` Jason Gunthorpe
2022-02-23 14:06                       ` Greg Kroah-Hartman
2022-02-23 14:09                         ` Jason Gunthorpe
2022-02-23 14:30                           ` Jason Gunthorpe
2022-02-23 16:03                             ` Greg Kroah-Hartman
2022-02-23 17:05                               ` Robin Murphy
2022-02-23 17:47                                 ` Greg Kroah-Hartman
2022-02-18  0:55 ` [PATCH v6 03/11] amba: Stop sharing platform_dma_configure() Lu Baolu
2022-02-18  0:55 ` [PATCH v6 04/11] bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management Lu Baolu
2022-02-18  7:55   ` Greg Kroah-Hartman
2022-02-18  0:55 ` [PATCH v6 05/11] PCI: pci_stub: Set driver_managed_dma Lu Baolu
2022-02-18  0:55 ` [PATCH v6 06/11] PCI: portdrv: " Lu Baolu
2022-02-18  0:55 ` [PATCH v6 07/11] vfio: Set DMA ownership for VFIO devices Lu Baolu
2022-02-18  0:55 ` [PATCH v6 08/11] vfio: Remove use of vfio_group_viable() Lu Baolu
2022-02-18  0:55 ` [PATCH v6 09/11] vfio: Delete the unbound_list Lu Baolu
2022-02-18  0:55 ` [PATCH v6 10/11] vfio: Remove iommu group notifier Lu Baolu
2022-02-23 21:53   ` Alex Williamson
2022-02-24  2:49     ` Lu Baolu
2022-02-18  0:55 ` [PATCH v6 11/11] iommu: Remove iommu group changes notifier Lu Baolu
2022-02-18 15:51 ` [PATCH v6 00/11] Fix BUG_ON in vfio_iommu_group_notifier() Jason Gunthorpe
2022-02-21  3:38   ` Lu Baolu
2022-02-28  0:58 ` Lu Baolu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220222235353.GF10061@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=airlied@linux.ie \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=bhelgaas@google.com \
    --cc=cohuck@redhat.com \
    --cc=dan.j.williams@intel.com \
    --cc=daniel@ffwll.ch \
    --cc=diana.craciun@oss.nxp.com \
    --cc=digetx@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hch@infradead.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacob.jun.pan@intel.com \
    --cc=jonathanh@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=kch@nvidia.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=leoyang.li@nxp.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=rafael@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=stuyoder@gmail.com \
    --cc=thierry.reding@gmail.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).