From: Jason Wang <jasowang@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: "Liu, Yi L" <yi.l.liu@intel.com>,
	yi.y.sun@linux.intel.com, qemu-devel <qemu-devel@nongnu.org>,
	mst <mst@redhat.com>
Subject: Re: [PATCH 3/3] intel-iommu: PASID support
Date: Fri, 14 Jan 2022 13:58:07 +0800
Message-ID: <CACGkMEun7WEhXy_ApxfgYmbVofjjKgGuA0ezPZG4ypRK+HtSfA@mail.gmail.com>
In-Reply-To: <YeDumkj9ZgPKGgoN@xz-m1.local>

On Fri, Jan 14, 2022 at 11:31 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Fri, Jan 14, 2022 at 10:47:44AM +0800, Jason Wang wrote:
> >
> > On 2022/1/13 at 1:06 PM, Peter Xu wrote:
> > > On Wed, Jan 05, 2022 at 12:19:45PM +0800, Jason Wang wrote:
> > > > @@ -1725,11 +1780,16 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
> > > >           cc_entry->context_cache_gen = s->context_cache_gen;
> > > >       }
> > > > > +    /* Try to fetch slpte from IOTLB */
> > > > +    if ((pasid == PCI_NO_PASID) && s->root_scalable) {
> > > > +        pasid = VTD_CE_GET_RID2PASID(&ce);
> > > > +    }
> > > > +
> > > >       /*
> > > >        * We don't need to translate for pass-through context entries.
> > > >        * Also, let's ignore IOTLB caching as well for PT devices.
> > > >        */
> > > > -    if (vtd_dev_pt_enabled(s, &ce)) {
> > > > +    if (vtd_dev_pt_enabled(s, &ce, pasid)) {
> > > >           entry->iova = addr & VTD_PAGE_MASK_4K;
> > > >           entry->translated_addr = entry->iova;
> > > >           entry->addr_mask = ~VTD_PAGE_MASK_4K;
> > > > @@ -1750,14 +1810,24 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
> > > >           return true;
> > > >       }
> > > > +    iotlb_entry = vtd_lookup_iotlb(s, source_id, addr, pasid);
> > > > +    if (iotlb_entry) {
> > > > +        trace_vtd_iotlb_page_hit(source_id, addr, iotlb_entry->slpte,
> > > > +                                 iotlb_entry->domain_id);
> > > > +        slpte = iotlb_entry->slpte;
> > > > +        access_flags = iotlb_entry->access_flags;
> > > > +        page_mask = iotlb_entry->mask;
> > > > +        goto out;
> > > > +    }
> > > > IIUC the iotlb lookup moved down only because, in the pasid==NO_PASID
> > > > case, we need to fetch the default pasid from the context entry first.
> > > > That looks reasonable.
> > >
> > > > It's just a bit of a pity, because logically it will slow down iotlb hits
> > > > due to the context entry operations.  With NO_PASID we could have looked
> > > > up the iotlb without checking the pasid at all, assuming that the
> > > > "default pasid" always matches.  But that is a little bit hacky.
> >
> >
> > Right, but I think you meant to do this only when scalable mode is disabled.
>
> Yes, IMHO it will definitely suit the !scalable case, since that's exactly
> what we did before.  What I'm also wondering is whether, even if scalable is
> enabled but no "real" pasid is used (so all the translations go through the
> default pasid that is stored in the device context entry), we can ignore
> checking it.  The latter is the "hacky" part mentioned above.

The problem I see is that we can't know which PASID is used as the default
without reading the context entry?

>
> The other thing to mention is, if we postpone the iotlb lookup until after
> the context entry lookup, then logically we can have a per-device iotlb.
> That means we can replace IntelIOMMUState.iotlb with VTDAddressSpace.iotlb in
> the future, too, which can also be more efficient.

Right, but we still need to limit the total number of slots, and ATS is
actually a better way to deal with the IOTLB bottleneck.
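
As a rough illustration of the slot limit, here is a toy bounded cache. It is
a sketch under assumptions: the toy_ names, the cap of 4, and the
flush-everything-when-full policy are all chosen for brevity and are not meant
to describe the actual code. The same cap would be needed whether the table is
global or per-device:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_IOTLB_SLOTS 4   /* hypothetical cap, kept small for illustration */

struct toy_iotlb_entry {
    uint16_t source_id;
    uint32_t pasid;
    uint64_t gfn;
    uint64_t slpte;
};

struct toy_iotlb {
    struct toy_iotlb_entry slot[TOY_IOTLB_SLOTS];
    size_t used;            /* number of valid entries at the front of slot[] */
};

/* Linear scan over the valid entries; returns NULL on a miss. */
static const struct toy_iotlb_entry *
toy_iotlb_lookup(const struct toy_iotlb *tlb, uint16_t source_id,
                 uint32_t pasid, uint64_t gfn)
{
    for (size_t i = 0; i < tlb->used; i++) {
        const struct toy_iotlb_entry *e = &tlb->slot[i];
        if (e->source_id == source_id && e->pasid == pasid && e->gfn == gfn) {
            return e;
        }
    }
    return NULL;
}

/*
 * Append an entry. When the cap is reached, drop every cached entry first.
 * Duplicate keys are not detected; callers are assumed to look up first.
 */
static void toy_iotlb_insert(struct toy_iotlb *tlb, struct toy_iotlb_entry e)
{
    assert(tlb != NULL);
    if (tlb->used == TOY_IOTLB_SLOTS) {
        memset(tlb->slot, 0, sizeof(tlb->slot));
        tlb->used = 0;
    }
    tlb->slot[tlb->used++] = e;
}
```

Inserting a fifth entry into a full table flushes the first four, so the cost
of a small cap shows up as extra page walks rather than unbounded memory.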

>
> Not sure whether Michael has a preference; for me, I think either way can be
> done on top.
>
> >
> >
> > >
> > > > In production, vIOMMU will probably be used mostly for assigned devices
> > > > and dpdk, since it is too slow otherwise, so maybe this is not a big
> > > > deal at all.
> > >
> > > [...]
> > >
> > > > @@ -2011,7 +2083,52 @@ static void vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
> > > >       vtd_iommu_lock(s);
> > > >       g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_page, &info);
> > > >       vtd_iommu_unlock(s);
> > > > -    vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am);
> > > > +    vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am, PCI_NO_PASID);
> > > > +}
> > > > +
> > > > +static void vtd_iotlb_page_pasid_invalidate(IntelIOMMUState *s,
> > > > +                                            uint16_t domain_id,
> > > > +                                            hwaddr addr, uint8_t am,
> > > > +                                            uint32_t pasid)
> > > > +{
> > > > +    VTDIOTLBPageInvInfo info;
> > > > +
> > > > +    trace_vtd_inv_desc_iotlb_pasid_pages(domain_id, addr, am, pasid);
> > > > +
> > > > +    assert(am <= VTD_MAMV);
> > > > +    info.domain_id = domain_id;
> > > > +    info.addr = addr;
> > > > +    info.mask = ~((1 << am) - 1);
> > > > +    info.pasid = pasid;
> > > > +    vtd_iommu_lock(s);
> > > > +    g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_page_pasid, &info);
> > > > +    vtd_iommu_unlock(s);
> > > > +    vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am, pasid);
> > > > Hmm, I think we do need a notification, but it will be unnecessary for
> > > > e.g. vfio map notifiers, because this is a 1st level invalidation and, at
> > > > least so far, vfio map notifiers rewalk only the 2nd level page table, so
> > > > it is destined to be a no-op and pure overhead.
> >
> >
> > > Right, considering that we don't implement L1 and we don't have a 1st
> > > level abstraction in either vhost or vfio, we can simply remove this.
>
> We probably still need the real pasid invalidation parts in the future?

Yes.

>  Either
> for vhost (if vhost is going to cache pasid-based translations), or for
> compatible assigned devices in the future where the HW can cache it.

Vhost plans to support ASID; see:

https://patchwork.kernel.org/project/kvm/patch/20201216064818.48239-11-jasowang@redhat.com/#23866593

>
> I'm not sure yet what the best way to do this is. Perhaps adding a new field
> to vtd_iotlb_page_invalidate_notify() telling whether this is pasid-based or
> not (basically, an invalidation for the 1st or 2nd level pgtable)?

AFAIK there's no L1 in the device IOTLB abstraction, only a combined
IOVA-to-GPA translation result.

>  Then if it is
> pasid-based, we could opt out of the shadow page walking.
>
> But as you mentioned we could also postpone it to the future.  Your call. :-)

Right, I tend to defer it; otherwise there seems to be no way to test this.

Thanks

>
> Thanks,
>
> >
> >
> > >
> > > > +}
> > > > +
> > > > +static void vtd_iotlb_pasid_invalidate(IntelIOMMUState *s, uint16_t domain_id,
> > > > +                                       uint32_t pasid)
> > > > +{
> > > > +    VTDIOTLBPageInvInfo info;
> > > > +    VTDAddressSpace *vtd_as;
> > > > +    VTDContextEntry ce;
> > > > +
> > > > +    trace_vtd_inv_desc_iotlb_pasid(domain_id, pasid);
> > > > +
> > > > +    info.domain_id = domain_id;
> > > > +    info.pasid = pasid;
> > > > +    vtd_iommu_lock(s);
> > > > +    g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_pasid, &info);
> > > > +    vtd_iommu_unlock(s);
> > > > +
> > > > +    QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) {
> > > > +        if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
> > > > +                                      vtd_as->devfn, &ce) &&
> > > > +            domain_id == vtd_get_domain_id(s, &ce, vtd_as->pasid) &&
> > > > +            pasid == vtd_as->pasid) {
> > > > +            vtd_sync_shadow_page_table(vtd_as);
> > > Do we need to rewalk the shadow pgtable (which is the 2nd level, afaict) even
> > > if we got the 1st level pgtable invalidated?
> >
> >
> > > Seems not, and this makes me think we should remove the whole PASID-based
> > > invalidation logic, since it is for L1, which is not implemented in this
> > > series.
>
> --
> Peter Xu
>


