xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: "Xu, Quan" <quan.xu@intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: "dario.faggioli@citrix.com" <dario.faggioli@citrix.com>,
	"Wu, Feng" <feng.wu@intel.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue
Date: Fri, 17 Jun 2016 08:15:03 +0000	[thread overview]
Message-ID: <945CA011AD5F084CBEA3E851C0AB28894B8E6E7D@SHSMSX103.ccr.corp.intel.com> (raw)
In-Reply-To: <5763BC3B02000078000F5E4F@prv-mh.provo.novell.com>

On June 17, 2016 3:01 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 17.06.16 at 08:08, <quan.xu@intel.com> wrote:
> 
> > On June 16, 2016 5:05 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >> >>> On 16.06.16 at 10:42, <quan.xu@intel.com> wrote:
> >> > On June 02, 2016 7:07 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >> >> >>> On 01.06.16 at 11:05, <quan.xu@intel.com> wrote:
> >> >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16
> did,
> >> >> > +                                         struct pci_ats_dev
> >> >> > +*ats_dev) {
> >> >> > +    struct domain *d = NULL;
> >> >> > +    struct pci_dev *pdev;
> >> >> > +
> >> >> > +    if ( test_bit(did, iommu->domid_bitmap) )
> >> >> > +        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> >> >> > +
> >> >> > +    /*
> >> >> > +     * In case the domain has been freed or the IOMMU domid bitmap
> is
> >> >> > +     * not valid, the device no longer belongs to this domain.
> >> >> > +     */
> >> >> > +    if ( d == NULL )
> >> >> > +        return;
> >> >> > +
> >> >> > +    pcidevs_lock();
> >> >> > +
> >> >> > +    for_each_pdev(d, pdev)
> >> >> > +    {
> >> >> > +        if ( (pdev->seg == ats_dev->seg) &&
> >> >> > +             (pdev->bus == ats_dev->bus) &&
> >> >> > +             (pdev->devfn == ats_dev->devfn) )
> >> >> > +        {
> >> >> > +            ASSERT(pdev->domain);
> >> >> > +            list_del(&pdev->domain_list);
> >> >> > +            pdev->domain = NULL;
> >> >> > +            pci_hide_existing_device(pdev);
> >> >> > +            break;
> >> >> > +        }
> >> >> > +    }
> >> >> > +
> >> >> > +    pcidevs_unlock();
> >> >>
> >> >> ... this loop (and locking). (Of course such a change may better
> >> >> be done in another preparatory patch.)
> >> >>
> >> >
> >> > To eliminate the locking?  I am afraid the locking is still a must
> >> > here even without the loop, also referring  to device_assigned()..
> >>
> >> If the entire loop gets eliminated, what would be left is
> >>
> >>     pcidevs_lock();
> >>     pcidevs_unlock();
> >>
> >> which I don't think does any good.
> >
> > Why? I can't follow it..
> 
> I don't understand your question. Can you explain what use above code
> sequence is, in your opinion? Or else - what does the "why"
> refer to?
> 

Ah, there may be a gap between us. without this loop,  these pdev operation should be still there, such as,


+    ASSERT(pdev->domain);
+    list_del(&pdev->domain_list);
+    pdev->domain = NULL;
+    pci_hide_existing_device(pdev);

So, the left is not only:
   pcidevs_lock();
   pcidevs_unlock();


> >> >> > +static int __must_check dev_invalidate_sync(struct iommu
> >> >> > +*iommu,
> >> >> > +u16
> >> >> did,
> >> >> > +                                            struct pci_ats_dev
> >> >> > +*ats_dev) {
> >> >> > +    struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
> >> >> > +    int rc = 0;
> >> >> > +
> >> >> > +    if ( qi_ctrl->qinval_maddr )
> >> >> > +    {
> >> >> > +        rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
> >> >> > +
> >> >> > +        if ( rc == -ETIMEDOUT )
> >> >> > +            dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
> >> >> > +    }
> >> >> > +
> >> >> > +    return rc;
> >> >> > +}
> >> >>
> >> >> I've never really understood why invalidate_sync() returns success
> >> >> when it didn't do anything. Now that you copy this same behavior
> >> >> here, I really need to ask you to explain that.
> >> >>
> >> >
> >> > It is acceptable to me, returning success when it didn't do
> >> > anything
> >> > -- this is worth reflection and criticism:(..
> >> > It is better:
> >> > +    if ( qi_ctrl->qinval_maddr )
> >> > +        ...
> >> > +    else
> >> > +        rc = -ENOENT;
> >>
> >> Right. And perhaps a separate patch to make invalidate_sync() do the
> same.
> >
> > Agreed.
> >
> >> Question is whether this really ought to be a conditional, or whether
> > instead
> >> this code is unreachable when qinval is not in use, in which case
> >> these conditionals would imo better be converted to ASSERT()s.
> >>
> >
> > IMO, this should be a conditional.
> > As mentioned below, strictly speaking, this is a bug. We can't
> > ASSERT() based on a bug..
> > A coming patch may fix it..
> 
> And again I don't understand: ASSERT()s are to verify assumed state. If static
> code analysis resulted in understanding a function is unreachable when
> qi_ctrl->qinval_maddr is zero (because qinval ought to have got disabled if any
> of the table setup failed), then adding ASSERT() would (a) document that and
> (b) allow to know quickly if something broke that assumption.
> 

You are right.  A separate patch does this.

> But then again I may simply misunderstand your wording: "We can't ASSERT()
> based on a bug" is really pretty unclear to me.
> 

I supposed the variable in ASSERT() is always true, but disable_qinval() needs to make
qi_ctrl->qinval_maddr  zero, but today it doesn't do this -- a bug.
With your explanation,  I got it now. Thanks.

Quan








_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  reply	other threads:[~2016-06-17  8:15 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-01  9:05 [Patch v11 0/3] VT-d Device-TLB flush issue Xu, Quan
2016-06-01  9:05 ` [Patch v11 1/3] IOMMU: add a timeout parameter for device IOTLB invalidation Xu, Quan
2016-06-02 10:24   ` Jan Beulich
2016-06-15  2:55     ` Xu, Quan
2016-06-01  9:05 ` [Patch v11 2/3] vt-d: synchronize for Device-TLB flush one by one Xu, Quan
2016-06-02 10:49   ` Jan Beulich
2016-06-01  9:05 ` [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue Xu, Quan
2016-06-02 11:07   ` Jan Beulich
2016-06-16  8:42     ` Xu, Quan
2016-06-16  9:04       ` Jan Beulich
2016-06-17  6:08         ` Xu, Quan
2016-06-17  7:00           ` Jan Beulich
2016-06-17  8:15             ` Xu, Quan [this message]
2016-06-17  8:40               ` Jan Beulich
2016-06-22 15:54             ` Xu, Quan
2016-06-22 16:18               ` Jan Beulich
2016-06-23  2:08                 ` Xu, Quan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=945CA011AD5F084CBEA3E851C0AB28894B8E6E7D@SHSMSX103.ccr.corp.intel.com \
    --to=quan.xu@intel.com \
    --cc=JBeulich@suse.com \
    --cc=dario.faggioli@citrix.com \
    --cc=feng.wu@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=xen-devel@lists.xen.org \
    --subject='Re: [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).