Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error.

From: "Xu, Quan" <quan.xu@intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	"Wu, Feng" <feng.wu@intel.com>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	Liu Jinsong <jinsong.liu@alibaba-inc.com>,
	Dario Faggioli <dario.faggioli@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Keir Fraser <keir@xen.org>
Subject: Re: [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error.
Date: Wed, 30 Mar 2016 02:28:13 +0000
Message-ID: <945CA011AD5F084CBEA3E851C0AB28894B871005@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <56FA48E302000078000E0B40@prv-mh.provo.novell.com>

On March 29, 2016 3:21pm, <JBeulich@suse.com> wrote:
> >>> On 28.03.16 at 05:33, <quan.xu@intel.com> wrote:
> > On March 18, 2016 1:15am, <JBeulich@suse.com> wrote:
> >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote:
> >> > --- a/xen/common/grant_table.c
> >> > +++ b/xen/common/grant_table.c
> >> > @@ -932,8 +932,9 @@ __gnttab_map_grant_ref(
> >> >              {
> >> >                  nr_gets++;
> >> >                  (void)get_page(pg, rd);
> >> > -                if ( !(op->flags & GNTMAP_readonly) )
> >> > -                    get_page_type(pg, PGT_writable_page);
> >> > +                if ( !(op->flags & GNTMAP_readonly) &&
> >> > +                     !get_page_type(pg, PGT_writable_page) )
> >> > +                        goto could_not_pin;
> >>
> >> This needs explanation, as it doesn't look related to what your
> >> actual goal is: If an error was possible here, I think this would be
> >> a security issue. However, as also kind of documented by the
> >> explicitly ignored return value from get_page(), it is my understanding that
> >> here we only obtain an _extra_ reference.
> >>
> >
> > For this point, I inferred from:
> > map_vcpu_info()
> > {
> > ...
> >     if ( !get_page_type(page, PGT_writable_page) )
> >     {
> >         put_page(page);
> >         return -EINVAL;
> >     }
> > ...
> > }
> > From this, I take the return value of get_page_type() to be:
> >      0 -- error,
> >      1 -- success.
> >
> > So if get_page_type() fails, we should goto could_not_pin.
> 
> Did you read my reply at all? The explanation I'm expecting here is why error
> checking is all of a sudden needed _at all_.
> 

Sorry for my stupid reply.
In this version, written before the open discussion, I tried to return the iommu_{,un}map_page() error up this call tree:
           iommu_{,un}map_page() -- __get_page_type() -- get_page_type() ---
At this point in the code, I was then trying to handle that iommu_{,un}map_page() error.

> > btw, there is another issue in the call path:
> >     iommu_{,un}map_page() -- __get_page_type() -- get_page_type()---
> >
> >
> > I tried to return iommu_{,un}map_page() error code in
> > __get_page_type(), is it right?
> 
> If the operation got fully rolled back - yes. Whether fully rolling back is feasible
> there though is - see the respective discussion - an open question.
> 

For the open question, does it refer to the text below:

"""
As said, we first need
to settle on an abstract model. Do we want IOMMU mapping
failures to be fatal to the domain (perhaps with the exception
of the hardware one)? I think we do, and for the hardware domain
we'd do things on a best effort basis (always erring on the side
of unmapping). Which would probably mean crashing the domain
could be centralized in iommu_{,un}map_page(). How much roll
back would then still be needed in callers of these functions
for the hardware domain's sake would need to be seen.
"""

I hope the answer is yes. I read all of your emails again and again, and I did not get the point until this Monday.
I am summarizing it and will send it out in a new thread.
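
To make sure I read the model correctly, here is a minimal standalone sketch of it, not Xen code. The names model_domain, model_hw_map() and model_iommu_map_page() are hypothetical, and -5 is only a stand-in for an I/O error. It shows the crash decision living in one wrapper: a failure is fatal to an ordinary domain, while for the hardware domain the error is just returned so the caller can continue on a best effort basis.

```c
#include <stdbool.h>

/* Hypothetical, simplified domain model; this is not Xen's struct domain. */
struct model_domain {
    bool is_hardware_domain;
    bool crashed;
};

/* Stand-in for the real mapping operation; -5 plays the role of an I/O error. */
static int model_hw_map(int inject_failure)
{
    return inject_failure ? -5 : 0;
}

/* The crash policy lives here, so callers do not have to repeat it. */
static int model_iommu_map_page(struct model_domain *d, int inject_failure)
{
    int rc = model_hw_map(inject_failure);

    if ( rc && !d->is_hardware_domain )
        d->crashed = true;    /* failure is fatal to an ordinary domain */

    return rc;                /* hardware domain: best effort, caller sees rc */
}
```

How much roll back the callers still need for the hardware domain is not covered by this sketch; that part remains open.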


> >> > --- a/xen/drivers/passthrough/x86/iommu.c
> >> > +++ b/xen/drivers/passthrough/x86/iommu.c
> >> > @@ -104,7 +104,11 @@ int arch_iommu_populate_page_table(struct domain *d)
> >> >      this_cpu(iommu_dont_flush_iotlb) = 0;
> >> >
> >> >      if ( !rc )
> >> > -        iommu_iotlb_flush_all(d);
> >> > +    {
> >> > +        rc = iommu_iotlb_flush_all(d);
> >> > +        if ( rc )
> >> > +            iommu_teardown(d);
> >> > +    }
> >> >      else if ( rc != -ERESTART )
> >> >          iommu_teardown(d);
> >>
> >> Why can't you just use the existing call to iommu_teardown(), by simply
> >> deleting the "else"?
> >>
> >
> > Just to check, could I modify it as below:
> > --- a/xen/drivers/passthrough/x86/iommu.c
> > +++ b/xen/drivers/passthrough/x86/iommu.c
> > @@ -105,7 +105,8 @@ int arch_iommu_populate_page_table(struct domain *d)
> >
> >      if ( !rc )
> >          iommu_iotlb_flush_all(d);
> > -    else if ( rc != -ERESTART )
> > +
> > +    if ( rc != -ERESTART )
> >          iommu_teardown(d);
> 
> Clearly not - not only are you losing the return value of
> iommu_iotlb_flush_all() now, you would then also call
> iommu_teardown() in the "success" case. My comment was related to code
> structure, yet you seem to have taken it literally.
> 

Then, what about this one:

--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -104,8 +104,9 @@ int arch_iommu_populate_page_table(struct domain *d)
     this_cpu(iommu_dont_flush_iotlb) = 0;

     if ( !rc )
-        iommu_iotlb_flush_all(d);
-    else if ( rc != -ERESTART )
+        rc = iommu_iotlb_flush_all(d);
+
+    if ( rc && rc != -ERESTART )
         iommu_teardown(d);


IMO, my original modification is correct, though redundant in having two iommu_teardown() calls.
If this one is still not correct, could you help me by sending out the correct one?
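
For reference, below is a minimal standalone model of the control flow I am aiming for, not the real Xen code. fake_iotlb_flush_all(), fake_teardown(), finish_populate() and MODEL_ERESTART are hypothetical stand-ins that only record what happens. The intent is that the flush error is kept in rc, and that teardown runs exactly once for any failure other than -ERESTART and never in the success case.

```c
/* Hypothetical stand-in for Xen's ERESTART value; only used by this model. */
#define MODEL_ERESTART 85

static int teardown_calls;   /* how many times the fake teardown ran */
static int flush_result;     /* value the fake flush will return */

/* Stand-ins for iommu_iotlb_flush_all() and iommu_teardown(). */
static int fake_iotlb_flush_all(void) { return flush_result; }
static void fake_teardown(void) { teardown_calls++; }

/* Tail of the function; rc is the result of the preceding mapping loop. */
static int finish_populate(int rc)
{
    if ( !rc )
        rc = fake_iotlb_flush_all();      /* keep the flush error */

    if ( rc && rc != -MODEL_ERESTART )    /* real failure from either step */
        fake_teardown();

    return rc;
}
```

With this shape there is a single teardown call site, and a non-zero flush result reaches the caller instead of being dropped.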

Quan
