From: Jean-Philippe Brucker <jean-philippe@linaro.org>
To: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
Raj Ashok <ashok.raj@intel.com>,
Jean-Philippe Brucker <jean-philippe@linaro.com>,
LKML <linux-kernel@vger.kernel.org>,
iommu@lists.linux-foundation.org,
Alex Williamson <alex.williamson@redhat.com>,
"Wu, Hao" <hao.wu@intel.com>,
David Woodhouse <dwmw2@infradead.org>,
Jonathan Cameron <jic23@kernel.org>
Subject: Re: [PATCH 00/10] IOASID extensions for guest SVA
Date: Thu, 2 Apr 2020 14:26:33 +0200 [thread overview]
Message-ID: <20200402122633.GC1176452@myrica> (raw)
In-Reply-To: <20200401163842.09c8e1a6@jacob-builder>
On Wed, Apr 01, 2020 at 04:38:42PM -0700, Jacob Pan wrote:
> On Wed, 1 Apr 2020 16:03:01 +0200
> Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:
>
> > Hi Jacob,
> >
> > On Wed, Mar 25, 2020 at 10:55:21AM -0700, Jacob Pan wrote:
> > > IOASID was introduced in v5.5 as a generic kernel allocator service
> > > for both PCIe Process Address Space ID (PASID) and ARM SMMU's Sub
> > > Stream ID. In addition to basic ID allocation, ioasid_set was
> > > introduced as a token that is shared by a group of IOASIDs. This
> > > set token can be used for permission checking but lack of some
> > > features needed by guest Shared Virtual Address (SVA). In addition,
> > > IOASID support for life cycle management is needed among multiple
> > > users.
> > >
> > > This patchset introduces two extensions to the IOASID code,
> > > 1. IOASID set operations
> > > 2. Notifications for IOASID state synchronization
> >
> > My main concern with this series is patch 7 changing the spinlock to a
> > mutex, which prevents SVA from calling ioasid_free() from the RCU
> > callback of MMU notifiers. Could we use atomic notifiers, or do the
> > FREE notification another way?
> >
> Maybe I am looking at the wrong code, I thought
> mmu_notifier_ops.free_notifier() is called outside spinlock with
> call_srcu(), which will be invoked in the thread context.
> in mmu_notifier.c mmu_notifier_put()
> spin_unlock(&mm->notifier_subscriptions->lock);
>
> call_srcu(&srcu, &subscription->rcu, mmu_notifier_free_rcu);
free_notifier() is called from RCU callback, and according to
Documentation/RCU/checklist.txt:
5. If call_rcu() or call_srcu() is used, the callback function will
be called from softirq context. In particular, it cannot block.
When applying the patch I get the sleep-in-atomic warning:
[ 87.861793] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
[ 87.863293] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 74, name: kworker/6:1
[ 87.863993] 2 locks held by kworker/6:1/74:
[ 87.864493] #0: ffffff885ac12538 ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x740/0x1880
[ 87.865593] #1: ffffff88591efd30 ((work_completion)(&sdp->work)){+.+.}-{0:0}, at: process_one_work+0x740/0x1880
[ 87.866993] CPU: 6 PID: 74 Comm: kworker/6:1 Not tainted 5.6.0-next-20200331+ #121
[ 87.867393] Hardware name: FVP Base (DT)
[ 87.867893] Workqueue: rcu_gp srcu_invoke_callbacks
[ 87.868393] Call trace:
[ 87.868793] dump_backtrace+0x0/0x310
[ 87.869293] show_stack+0x14/0x20
[ 87.869693] dump_stack+0x124/0x180
[ 87.870193] ___might_sleep+0x2ac/0x428
[ 87.870693] __might_sleep+0x88/0x168
[ 87.871094] __mutex_lock+0xa0/0x1270
[ 87.871593] mutex_lock_nested+0x1c/0x28
[ 87.872093] ioasid_free+0x28/0x48
[ 87.872493] io_mm_free+0x1d0/0x608
[ 87.872993] mmu_notifier_free_rcu+0x74/0xe8
[ 87.873393] srcu_invoke_callbacks+0x1d0/0x2c8
[ 87.873893] process_one_work+0x858/0x1880
[ 87.874393] worker_thread+0x314/0xcd0
[ 87.874793] kthread+0x318/0x400
[ 87.875293] ret_from_fork+0x10/0x18
>
> Anyway, if we have to use atomic. I tried atomic notifier first, there
> are two subscribers to the free event on x86.
> 1. IOMMU
> 2. KVM
>
> For #1, the problem is that in the free operation, VT-d driver
> needs to do a lot of clean up in thread context.
> - hold a mutex to traverse a list of devices
> - clear PASID entry and flush cache
>
> For #2, KVM might be able to deal with spinlocks for updating VMCS
> PASID translation table. +Hao
>
> Perhaps two solutions I can think of:
> 1. Use a cyclic IOASID allocator. The main reason of clean up at free
> is to prevent race with IOASID alloc. Similar to PID, 2M IOASID
> will take long time overflow. Then we can use atomic notifier and a
> deferred workqueue to do IOMMU cleanup. The downside is a large and
> growing PASID table, may not be a performance issue since it has TLB.
That might be a problem for SMMU, which has 1024 * 64kB leaf PASID tables,
for a total of 64MB per endpoint if there is too much fragmentation in
the IOASID space.
> 2. Let VFIO ensure free always happen after unbind. Then there is no
> need to do cleanup. But that requires VFIO to keep track of all the
> PASIDs within each VM. When the VM terminates, VFIO is responsible for
> the clean up. That was Yi's original proposal. I also tried to provide
> an IOASID set iterator for VFIO to free the IOASIDs within each VM/set,
> but the private data belongs to IOMMU driver.
Not really my place to comment on this, but I find it nicer to use the
same gpasid_unbind() path when VFIO frees a PASID as when the guest
explicitly unbinds before freeing.
Thanks,
Jean
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
next prev parent reply other threads:[~2020-04-02 12:26 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-25 17:55 [PATCH 00/10] IOASID extensions for guest SVA Jacob Pan
2020-03-25 17:55 ` [PATCH 01/10] iommu/ioasid: Introduce system-wide capacity Jacob Pan
2020-03-27 8:07 ` Tian, Kevin
2020-03-27 16:08 ` Jacob Pan
2020-04-01 13:45 ` Jean-Philippe Brucker
2020-04-01 22:50 ` Jacob Pan
2020-03-25 17:55 ` [PATCH 02/10] iommu/vt-d: Set IOASID capacity when SVM is enabled Jacob Pan
2020-03-27 8:08 ` Tian, Kevin
2020-03-25 17:55 ` [PATCH 03/10] iommu/ioasid: Introduce per set allocation APIs Jacob Pan
2020-03-26 2:12 ` Lu Baolu
2020-03-26 21:30 ` Jacob Pan
2020-03-27 8:38 ` Tian, Kevin
2020-03-27 16:59 ` Jacob Pan
2020-03-28 6:32 ` Tian, Kevin
2020-04-01 13:47 ` Jean-Philippe Brucker
2020-04-06 20:02 ` Jacob Pan
2020-04-07 11:01 ` Jean-Philippe Brucker
2020-04-21 21:51 ` Jacob Pan
2020-03-25 17:55 ` [PATCH 04/10] iommu/ioasid: Rename ioasid_set_data to avoid confusion with ioasid_set Jacob Pan
2020-03-27 9:35 ` Tian, Kevin
2020-03-25 17:55 ` [PATCH 05/10] iommu/ioasid: Create an IOASID set for host SVA use Jacob Pan
2020-03-27 9:41 ` Tian, Kevin
2020-03-27 17:28 ` Jacob Pan
2020-03-28 6:33 ` Tian, Kevin
2020-04-01 13:53 ` Jean-Philippe Brucker
2020-04-06 15:33 ` Jacob Pan
2020-04-07 11:01 ` Jean-Philippe Brucker
2020-04-13 22:06 ` Jacob Pan
2020-04-15 15:10 ` Jean-Philippe Brucker
2020-03-25 17:55 ` [PATCH 06/10] iommu/ioasid: Convert to set aware allocations Jacob Pan
2020-03-27 9:54 ` Tian, Kevin
2020-03-27 17:41 ` Jacob Pan
2020-03-28 6:40 ` Tian, Kevin
2020-04-06 20:07 ` Jacob Pan
2020-04-01 13:55 ` Jean-Philippe Brucker
2020-04-01 22:45 ` Jacob Pan
2020-03-25 17:55 ` [PATCH 07/10] iommu/ioasid: Use mutex instead of spinlock Jacob Pan
2020-03-27 9:55 ` Tian, Kevin
2020-04-01 13:58 ` Jean-Philippe Brucker
2020-03-25 17:55 ` [PATCH 08/10] iommu/ioasid: Introduce notifier APIs Jacob Pan
2020-03-27 10:03 ` Tian, Kevin
2020-03-27 18:36 ` Jacob Pan
2020-03-28 6:43 ` Tian, Kevin
2020-03-31 15:13 ` Jacob Pan
2020-04-01 14:00 ` Jean-Philippe Brucker
2020-04-10 15:43 ` Jacob Pan
2020-03-25 17:55 ` [PATCH 09/10] iommu/ioasid: Support ioasid_set quota adjustment Jacob Pan
2020-03-27 10:09 ` Tian, Kevin
2020-03-27 23:30 ` Jacob Pan
2020-03-28 6:44 ` Tian, Kevin
2020-03-25 17:55 ` [PATCH 10/10] iommu/vt-d: Register PASID notifier for status change Jacob Pan
2020-03-27 10:22 ` Tian, Kevin
2020-03-27 23:47 ` Jacob Pan
2020-04-01 14:03 ` [PATCH 00/10] IOASID extensions for guest SVA Jean-Philippe Brucker
2020-04-01 23:38 ` Jacob Pan
2020-04-02 12:26 ` Jean-Philippe Brucker [this message]
2020-04-02 16:09 ` Jacob Pan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200402122633.GC1176452@myrica \
--to=jean-philippe@linaro.org \
--cc=alex.williamson@redhat.com \
--cc=ashok.raj@intel.com \
--cc=dwmw2@infradead.org \
--cc=hao.wu@intel.com \
--cc=iommu@lists.linux-foundation.org \
--cc=jacob.jun.pan@linux.intel.com \
--cc=jean-philippe@linaro.com \
--cc=jic23@kernel.org \
--cc=kevin.tian@intel.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).