From: Jacob Pan <jacob.jun.pan@linux.intel.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>,
arnd@arndb.de, "Yu, Fenghua" <fenghua.yu@intel.com>,
gregkh@linuxfoundation.org, iommu@lists.linux-foundation.org,
zhangfei.gao@linaro.org, linux-accelerators@lists.ozlabs.org
Subject: Re: [PATCH 0/2] iommu: Remove iommu_sva_ops::mm_exit()
Date: Wed, 8 Apr 2020 16:48:02 -0700 [thread overview]
Message-ID: <20200408164802.155a69e3@jacob-builder> (raw)
In-Reply-To: <20200408223218.GC11886@ziepe.ca>
On Wed, 8 Apr 2020 19:32:18 -0300
Jason Gunthorpe <jgg@ziepe.ca> wrote:
> On Wed, Apr 08, 2020 at 02:35:52PM -0700, Jacob Pan wrote:
> > > On Wed, Apr 08, 2020 at 11:35:52AM -0700, Jacob Pan wrote:
> > > > Hi Jean,
> > > >
> > > > On Wed, 8 Apr 2020 16:04:25 +0200
> > > > Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:
> > > >
> > > > > The IOMMU SVA API currently requires device drivers to
> > > > > implement an mm_exit() callback, which stops device jobs that
> > > > > do DMA. This function is called in the release() MMU
> > > > > notifier, when an address space that is shared with a device
> > > > > exits.
> > > > >
> > > > > It has been noted several time during discussions about SVA
> > > > > that cancelling DMA jobs can be slow and complex, and doing
> > > > > it in the release() notifier might cause synchronization
> > > > > issues (patch 2 has more background). Device drivers must in
> > > > > any case call unbind() to remove their bond, after stopping
> > > > > DMA from a more favorable context (release of a file
> > > > > descriptor).
> > > > >
> > > > > So after mm exits, rather than notifying device drivers, we
> > > > > can hold on to the PASID until unbind(), ask IOMMU drivers to
> > > > > silently abort DMA and Page Requests in the meantime. This
> > > > > change should relieve the mmput() path.
> > > >
> > > > I assume mm is destroyed after all the FDs are closed
> > >
> > > FDs do not hold a mmget(), but they may hold a mmgrab(), ie
> > > anything using mmu_notifiers has to hold a grab until the
> > > notifier is destroyed, which is often triggered by FD close.
> > >
> > Sorry, I don't get this. Are you saying we have to hold a mmgrab()
> > between svm_bind/mmu_notifier_register and
> > svm_unbind/mmu_notifier_unregister?
>
> Yes. This is done automatically for the caller inside the mmu_notifier
> implementation. We now even store the mm_struct pointer inside the
> notifier.
>
> Once a notifier is registered the mm_struct remains valid memory until
> the notifier is unregistered.
>
> > Isn't the idea of mmu_notifier is to avoid holding the mm reference
> > and rely on the notifier to tell us when mm is going away?
>
> The notifier only holds a mmgrab(), not a mmget() - this allows
> exit_mmap to proceed, but the mm_struct memory remains.
>
> This is also probably why it is a bad idea to tie the lifetime of
> something like a pasid to the mmdrop as a evil user could cause a
> large number of mm structs to be released but not freed, probably
> defeating cgroup limits and so forth (not sure)
>
> > It seems both Intel and AMD iommu drivers don't hold mmgrab after
> > mmu_notifier_register.
>
> It is done internally to the implementation.
>
> > > So the exit_mmap() -> release() may happen before the FDs are
> > > destroyed, but the final mmdrop() will be during some FD release
> > > when the final mmdrop() happens.
> >
> > Do you mean mmdrop() is after FD release?
>
> Yes, it will be done by the mmu_notifier_unregister(), which should be
> called during FD release if the iommu lifetime is linked to some FD.
>
> > If so, unbind is called in FD release should take care of
> > everything, i.e. stops DMA, clear PASID context on IOMMU, flush PRS
> > queue etc.
>
> Yes, this is the proper way, when the DMA is stopped and no use of the
> PASID remains then you can drop the mmu notifier and release the PASID
> entirely. If that is linked to the lifetime of the FD then forget
> completely about the mm_struct lifetime, it doesn't matter..
>
Got everything above, thanks a lot.
If everything is in order with the FD close. Why do we need to
"ask IOMMU drivers to silently abort DMA and Page Requests in the
meantime." in mm_exit notifier? This will be done orderly in unbind
anyway.
> > Enforcing unbind upon FD close might be a precarious path, perhaps
> > that is why we have to deal with out of order situation?
>
> How so? You just put it in the FD release function :)
>
I was thinking some driver may choose to defer unbind in some workqueue
etc.
> > > > In VT-d, because of enqcmd and lazy PASID free we plan to hold
> > > > on to the PASID until mmdrop.
> > > > https://lore.kernel.org/patchwork/patch/1217762/
> > >
> > > Why? The bind already gets a mmu_notifier which has refcounts and
> > > the right lifetime for PASID.. This code could already be
> > > simplified by using the mmu_notifier_get()/put() stuff.
> > >
> > Yes, I guess mmu_notifier_get()/put() is new :)
> > +Fenghua
>
> I was going to convert the intel code when I did many other drivers,
> but it was a bit too complex..
>
> But the approach is straightforward. Get rid of the mm search list and
> use mmu_notifier_get(). This returns a singlton notifier for the
> mm_struct and handles refcounting/etc
>
> Use mmu_notifier_put() during a unbind, it will callback to
> free_notifier() to do the final frees (ie this is where the pasid
> should go away)
>
> For the SVM_FLAG_PRIVATE_PASID continue to use mmu_notifier_register,
> however this can now be mixed with mmu_notifier_put() so the cleanup
> is the same. A separate ops static struct is needed to create a unique
> key though
>
> Jason
[Jacob Pan]
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
next prev parent reply other threads:[~2020-04-08 23:42 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-08 14:04 [PATCH 0/2] iommu: Remove iommu_sva_ops::mm_exit() Jean-Philippe Brucker
2020-04-08 14:04 ` [PATCH 1/2] uacce: Remove mm_exit() op Jean-Philippe Brucker
2020-04-09 9:07 ` Zhangfei Gao
2020-04-09 9:44 ` Jean-Philippe Brucker
2020-04-08 14:04 ` [PATCH 2/2] iommu: Remove iommu_sva_ops::mm_exit() Jean-Philippe Brucker
2020-04-08 18:35 ` [PATCH 0/2] " Jacob Pan
2020-04-08 19:02 ` Jason Gunthorpe
2020-04-08 21:35 ` Jacob Pan
2020-04-08 22:32 ` Jason Gunthorpe
2020-04-08 23:48 ` Jacob Pan [this message]
2020-04-09 6:39 ` Jean-Philippe Brucker
2020-04-09 14:14 ` Jacob Pan
2020-04-09 14:25 ` Jason Gunthorpe
2020-04-09 16:21 ` Jacob Pan
2020-04-09 16:58 ` Jason Gunthorpe
2020-04-09 14:50 ` Jean-Philippe Brucker
2020-04-09 16:27 ` Jacob Pan
2020-04-10 15:52 ` Jacob Pan
2020-04-15 7:47 ` Jean-Philippe Brucker
2020-04-16 20:58 ` Jacob Pan
2020-04-20 8:02 ` Jean-Philippe Brucker
2020-04-09 12:08 ` Jason Gunthorpe
2020-04-09 16:31 ` Jacob Pan
2020-04-08 23:49 ` Fenghua Yu
2020-04-09 12:12 ` Jason Gunthorpe
2020-04-08 19:04 ` Jason Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200408164802.155a69e3@jacob-builder \
--to=jacob.jun.pan@linux.intel.com \
--cc=arnd@arndb.de \
--cc=fenghua.yu@intel.com \
--cc=gregkh@linuxfoundation.org \
--cc=iommu@lists.linux-foundation.org \
--cc=jean-philippe@linaro.org \
--cc=jgg@ziepe.ca \
--cc=linux-accelerators@lists.ozlabs.org \
--cc=zhangfei.gao@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).