From: Paul Durrant <Paul.Durrant@citrix.com>
To: 'Jan Beulich' <JBeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Wei Liu <wei.liu2@citrix.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	"Tim (Xen.org)" <tim@xen.org>,
	George Dunlap <George.Dunlap@citrix.com>,
	Julien Grall <julien.grall@arm.com>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	xen-devel <xen-devel@lists.xenproject.org>,
	Ian Jackson <Ian.Jackson@citrix.com>,
	Brian Woods <brian.woods@amd.com>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [PATCH v2 3/4] iommu: elide flushing for higher order map/unmap operations
Date: Tue, 4 Dec 2018 15:36:12 +0000	[thread overview]
Message-ID: <0d6a78307bb5438582dd7e8e55a449db@AMSPEX02CL03.citrite.net> (raw)
In-Reply-To: <5C069A650200007800202BFD@prv1-mh.provo.novell.com>

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 04 December 2018 15:17
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Brian Woods <brian.woods@amd.com>; Suravee Suthikulpanit
> <suravee.suthikulpanit@amd.com>; Julien Grall <julien.grall@arm.com>;
> Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
> <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Kevin
> Tian <kevin.tian@intel.com>; Stefano Stabellini <sstabellini@kernel.org>;
> xen-devel <xen-devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH v2 3/4] iommu: elide flushing for higher order
> map/unmap operations
> 
> >>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
> > This patch removes any implicit flushing that occurs in the
> > implementation of map and unmap operations and adds new iommu_map/unmap()
> > wrapper functions. To maintain the semantics of the
> > iommu_legacy_map/unmap() wrapper functions, these are modified to call
> > the new wrapper functions and then perform an explicit flush operation.
> >
> > Because VT-d currently performs two different types of flush dependent
> > upon whether a PTE is being modified versus merely added (i.e. replacing
> > a non-present PTE), 'iommu flush flags' are defined by this patch and
> > the iommu_ops map_page() and unmap_page() methods are modified to OR the
> > type of flush necessary for the PTE that has been populated or
> > depopulated into an accumulated flags value. The accumulated value can
> > then be passed into the explicit flush operation.
> >
> > The ARM SMMU implementations of map_page() and unmap_page() currently
> > perform no implicit flushing and therefore the modified methods do not
> > adjust the flush flags.
> 
> Which, however, likely is wrong. If we mean the flushing to be initiated
> by the arch- and vendor-independent wrappers, then all map/unmap
> backends should indicate the needed kind of flush. Granted this can be
> done later, if things are otherwise correct on Arm right now.
> 
> > --- a/xen/arch/arm/p2m.c
> > +++ b/xen/arch/arm/p2m.c
> > @@ -971,8 +971,17 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
> >
> >      if ( need_iommu_pt_sync(p2m->domain) &&
> >           (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
> > +    {
> > +        unsigned int flush_flags = 0;
> > +
> > +        if ( lpae_is_valid(orig_pte) )
> > +            flush_flags |= IOMMU_FLUSHF_modified;
> > +        if ( lpae_is_valid(*entry) )
> > +            flush_flags |= IOMMU_FLUSHF_added;
> 
> Shouldn't this be "else if" with the meaning assigned to both
> types? From an abstract perspective I'd also expect that for
> a single mapping no more than one of the flags can come
> back set (through the iommu_ops interface).

That's not how I see it. My rationale is:

- present PTE made non-present, or modified -> IOMMU_FLUSHF_modified
- new PTE value is present -> IOMMU_FLUSHF_added

So a single op can set any combination of bits, which is why the code above does not use 'else if'.
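
To illustrate (the table is my own summary; the code simply mirrors the
Arm hunk quoted above):

/*
 * PTE transition        Accumulated flush_flags
 * --------------------  -----------------------------------------
 * invalid -> valid      IOMMU_FLUSHF_added
 * valid   -> invalid    IOMMU_FLUSHF_modified
 * valid   -> valid      IOMMU_FLUSHF_added | IOMMU_FLUSHF_modified
 */
unsigned int flush_flags = 0;

if ( lpae_is_valid(orig_pte) )  /* an existing PTE was changed or removed */
    flush_flags |= IOMMU_FLUSHF_modified;
if ( lpae_is_valid(*entry) )    /* the new PTE value is present */
    flush_flags |= IOMMU_FLUSHF_added;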

> 
> > @@ -84,7 +86,7 @@ static bool set_iommu_pde_present(uint32_t *pde, unsigned long next_mfn,
> >
> >          if ( maddr_old != maddr_next || iw != old_w || ir != old_r ||
> >               old_level != next_level )
> > -            need_flush = true;
> > +            flush_flags = IOMMU_FLUSHF_modified;
> 
> Why uniformly "modified"?

Because, AFAICT, the AMD IOMMU requires a flush even for a non-present -> present transition; the old code certainly implies this.

> 
> > @@ -645,11 +648,13 @@ static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
> >  }
> >
> >  int amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
> > -                                unsigned int page_count)
> > +                                unsigned int page_count,
> > +                                unsigned int flush_flags)
> >  {
> >      unsigned long dfn_l = dfn_x(dfn);
> >
> >      ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
> > +    ASSERT(flush_flags & IOMMU_FLUSHF_modified);
> 
> Is this valid? What if a map operation solely re-established what
> was already there? Aiui in that case set_iommu_pde_present()
> would always return zero. Or take this (seeing that the generic
> wrapper has a zero check for the flush flags):

Yes. The ASSERT is there because this should never be called unless flush_flags != 0 (which the wrapper ensures), and the AMD map code should only ever set IOMMU_FLUSHF_modified.
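
For reference, a rough sketch of the shape of the map wrapper I have in
mind (illustrative only, not verbatim patch code; the zero check could
equally live inside iommu_flush() itself):

int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
                     unsigned int page_order, unsigned int flags)
{
    unsigned int flush_flags = 0;
    int rc = iommu_map(d, dfn, mfn, page_order, flags, &flush_flags);

    /* flush_flags == 0 means nothing changed, so no flush is needed. */
    if ( !rc && flush_flags )
        rc = iommu_flush(d, dfn, 1u << page_order, flush_flags);

    return rc;
}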

> 
> > @@ -692,6 +697,7 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
> >      unsigned long npages, i;
> >      unsigned long gfn;
> >      unsigned int flags = !!ir;
> > +    unsigned int flush_flags = 0;
> >      int rt = 0;
> >
> >      if ( iw )
> > @@ -703,11 +709,21 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
> >      {
> >          unsigned long frame = gfn + i;
> >
> > -        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame), flags);
> > +        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame), flags,
> > +                                &flush_flags);
> >          if ( rt != 0 )
> > -            return rt;
> > +            break;
> >      }
> > -    return 0;
> > +
> > +    /*
> > +     * The underlying implementation is void so the return value is
> > +     * meaningless and can hence be ignored.
> > +     */
> > +    while ( amd_iommu_flush_iotlb_pages(domain, _dfn(gfn), npages,
> > +                                        flush_flags) )
> > +        break;
> 
> Nothing here guarantees flush_flags to be non-zero.

Good point. I'll add a check.
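
I.e. something like:

    if ( flush_flags )
        while ( amd_iommu_flush_iotlb_pages(domain, _dfn(gfn), npages,
                                            flush_flags) )
            break;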

> 
> > @@ -235,6 +236,9 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
> >                  process_pending_softirqs();
> >          }
> >
> > +        while ( !flush_flags && iommu_flush_all(d) )
> > +            break;
> 
> Is there a comment missing here explaining the seemingly odd
> loop?

I'm merely using the construct you suggested, but I can add a comment.
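
Something along these lines, perhaps (assuming the point of the
construct is to consume the __must_check return value):

    /*
     * iommu_flush_all() is __must_check but there is nothing useful
     * to do on failure here, so use while/break to consume the return
     * value without triggering a compiler warning.
     */
    while ( !flush_flags && iommu_flush_all(d) )
        break;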

> 
> > @@ -381,6 +402,17 @@ int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
> >      return rc;
> >  }
> >
> > +int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
> > +{
> > +    unsigned int flush_flags = 0;
> > +    int rc = iommu_unmap(d, dfn, page_order, &flush_flags);
> > +
> > +    if ( !rc )
> > +        rc = iommu_flush(d, dfn, (1u << page_order), flush_flags);
> 
> No iommu_dont_flush_iotlb check needed here?

I thought the old VT-d unmap code ignored it, but I see it didn't, so yes, I do need to add the check.
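
I.e. something like (sketch only):

int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
{
    unsigned int flush_flags = 0;
    int rc = iommu_unmap(d, dfn, page_order, &flush_flags);

    /* Preserve legacy semantics: the caller may have deferred flushing. */
    if ( !rc && !this_cpu(iommu_dont_flush_iotlb) )
        rc = iommu_flush(d, dfn, 1u << page_order, flush_flags);

    return rc;
}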

> 
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -633,11 +633,14 @@ static int __must_check iommu_flush_iotlb(struct domain *d, dfn_t dfn,
> >
> >  static int __must_check iommu_flush_iotlb_pages(struct domain *d,
> >                                                  dfn_t dfn,
> > -                                                unsigned int page_count)
> > +                                                unsigned int page_count,
> > +                                                unsigned int flush_flags)
> >  {
> >      ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
> > +    ASSERT(flush_flags);
> >
> > -    return iommu_flush_iotlb(d, dfn, 1, page_count);
> > +    return iommu_flush_iotlb(d, dfn, flush_flags & IOMMU_FLUSHF_modified,
> > +                             page_count);
> 
> Why the restriction to "modified"?

The parameter is a bool which should be true if an existing PTE was modified and false otherwise. I can make this !!(flush_flags & IOMMU_FLUSHF_modified) if you prefer.
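
I.e.:

    return iommu_flush_iotlb(d, dfn,
                             !!(flush_flags & IOMMU_FLUSHF_modified),
                             page_count);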

> 
> > @@ -1825,15 +1825,18 @@ static int __must_check intel_iommu_map_page(struct domain *d,
> >      spin_unlock(&hd->arch.mapping_lock);
> >      unmap_vtd_domain_page(page);
> >
> > -    if ( !this_cpu(iommu_dont_flush_iotlb) )
> > -        rc = iommu_flush_iotlb(d, dfn, dma_pte_present(old), 1);
> > +    *flush_flags |= IOMMU_FLUSHF_added;
> > +    if ( dma_pte_present(old) )
> > +        *flush_flags |= IOMMU_FLUSHF_modified;
> 
> See my earlier comment as to only one of them to get set for an
> individual mapping.
> 
> > @@ -62,14 +61,15 @@ int arch_iommu_populate_page_table(struct domain *d)
> >          {
> >              unsigned long mfn = mfn_x(page_to_mfn(page));
> >              unsigned long gfn = mfn_to_gmfn(d, mfn);
> > +            unsigned int flush_flags = 0;
> >
> >              if ( gfn != gfn_x(INVALID_GFN) )
> >              {
> >                  ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
> >                  BUG_ON(SHARED_M2P(gfn));
> > -                rc = iommu_legacy_map(d, _dfn(gfn), _mfn(mfn),
> > -                                      PAGE_ORDER_4K, IOMMUF_readable |
> > -                                      IOMMUF_writable);
> > +                rc = iommu_map(d, _dfn(gfn), _mfn(mfn),
> > +                               PAGE_ORDER_4K, IOMMUF_readable |
> > +                               IOMMUF_writable, &flush_flags);
> >              }
> >              if ( rc )
> >              {
> > @@ -103,7 +103,6 @@ int arch_iommu_populate_page_table(struct domain *d)
> >      }
> >
> >      spin_unlock(&d->page_alloc_lock);
> > -    this_cpu(iommu_dont_flush_iotlb) = 0;
> >
> >      if ( !rc )
> >          rc = iommu_flush_all(d);
> 
> Would be nice to have a comment here clarifying why flush_flags
> doesn't get used.

Ok.
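
For instance (comment wording is mine):

    /*
     * flush_flags is deliberately not consulted here: the loop above
     * populates the page table without flushing, and the unconditional
     * iommu_flush_all() below covers everything that was mapped.
     */
    if ( !rc )
        rc = iommu_flush_all(d);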

> 
> > @@ -249,6 +251,9 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
> >          if (!(i & 0xfffff))
> >              process_pending_softirqs();
> >      }
> > +
> > +    if ( !flush_flags && iommu_flush_all(d) )
> > +        return;
> >  }
> 
> Again please attach a brief comment explaining the seemingly
> strange construct.
> 

Ok.

> > --- a/xen/include/xen/iommu.h
> > +++ b/xen/include/xen/iommu.h
> > @@ -93,6 +93,22 @@ void iommu_teardown(struct domain *d);
> >  #define _IOMMUF_writable 1
> >  #define IOMMUF_writable  (1u<<_IOMMUF_writable)
> >
> > +enum
> > +{
> 
> Brace on the same line as "enum" please, just like for struct/union. When
> they're named this helps finding the place where a certain type gets
> (fully) declared.

Ok.
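
I.e. (member names inferred from the flag names used elsewhere in the
patch):

enum {
    _IOMMU_FLUSHF_added,
    _IOMMU_FLUSHF_modified,
};

#define IOMMU_FLUSHF_added    (1u << _IOMMU_FLUSHF_added)
#define IOMMU_FLUSHF_modified (1u << _IOMMU_FLUSHF_modified)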

  Paul

> 
> Jan

