xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: "Xu, Quan" <quan.xu@intel.com>
To: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	"Wu, Feng" <feng.wu@intel.com>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	Liu Jinsong <jinsong.liu@alibaba-inc.com>,
	Dario Faggioli <dario.faggioli@citrix.com>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Keir Fraser <keir@xen.org>
Subject: abstract model of  IOMMU unmaping/mapping failures
Date: Thu, 31 Mar 2016 09:06:21 +0000	[thread overview]
Message-ID: <945CA011AD5F084CBEA3E851C0AB28894B873451@SHSMSX101.ccr.corp.intel.com> (raw)

All,

Here is a summary of my investigation of the abstract model:

Below policies are adopted when deciding whether to rollback a callchain:

1. Domain will be crashed immediately within iommu_{,un}map_page, treated as a fatal error (with the exception of the hardware one). Whether to rollback depends on the need of hardware domain;

2. For hardware domain, roll back on a best effort basis. When rollback is not feasible (in early initialization phase or trade-off of complexity), at least unmap upon maps error and then throw out error message;

Below are a detail analysis of all existing callers on IOMMU interfaces (8-11 needs more discussions):

----
1. memory_add(): rollback (no change)

(Existing code)
   >>...
   if ( ret )
        goto destroy_m2p;

    if ( iommu_enabled && !iommu_passthrough && !need_iommu(hardware_domain) )
    {
        for ( i = spfn; i < epfn; i++ )
            if ( iommu_map_page(hardware_domain, i, i, IOMMUF_readable|IOMMUF_writable) )
                break;
        if ( i != epfn )
        {
            while (i-- > old_max)
                iommu_unmap_page(hardware_domain, i);
            goto destroy_m2p;
        }
    } 
   <<...
Existing code already rolls back through destroy_m2p cleanly.

----
2. vtd_set_hwdom_mapping(): no rollback (no change).

It is in early initialization phase. A quick thought looks a bit tricky to fully rollback this path, so propose to leave as it is.

----
3. __gnttab_map_grant_ref():rollback (no change)


(Existing code)
        >>...
        if ( (act_pin & (GNTPIN_hstw_mask|GNTPIN_devw_mask)) &&
             !(old_pin & (GNTPIN_hstw_mask|GNTPIN_devw_mask)) )
        {
            if ( !(kind & MAPKIND_WRITE) )
                err = iommu_map_page(ld, frame, frame,
                                     IOMMUF_readable|IOMMUF_writable);
        }
        else if ( act_pin && !old_pin )
        {
            if ( !kind )
                err = iommu_map_page(ld, frame, frame, IOMMUF_readable);
        }
        if ( err )
        {
            double_gt_unlock(lgt, rgt);
            rc = GNTST_general_error;
            goto undo_out;
        }
        <<...

rollback is already cleanly handled.

----
4. __gnttab_unmap_common():rollback (no change)

(Existing code)
       >>...
             if ( !kind )
            err = iommu_unmap_page(ld, op->frame);
        else if ( !(kind & MAPKIND_WRITE) )
            err = iommu_map_page(ld, op->frame, op->frame, IOMMUF_readable);

        double_gt_unlock(lgt, rgt);

        if ( err )
            rc = GNTST_general_error;
       <<...

Similarly no change required, as error has been correctly handled.

----
5. guest_physmap_add_entry():rollback (no change)

(Existing code)
   >>...
    rc = iommu_map_page(
        d, mfn + i, mfn + i, IOMMUF_readable|IOMMUF_writable);
    if ( rc != 0 )
    {
        while ( i-- > 0 )
            iommu_unmap_page(d, mfn + i);
        return rc;
    }
   <<...

Above is in the start of the function, so no additional state to be cleared thus it is in good shape too.

----
6. clear_identity_p2m_entry():rollback (no change).

(Existing code)
  >>...
    if ( !paging_mode_translate(d) )
    {
        if ( !need_iommu(d) )
            return 0;
        return iommu_unmap_page(d, gfn);
    }
  <<...

in the start, and error already rolled back to caller.

----
7. set_identity_p2m_entry():rollback (no change).

(Existing code)
  >>...
    if ( !paging_mode_translate(p2m->domain) )
    {
        if ( !need_iommu(d) )
            return 0;
        return iommu_map_page(d, gfn, gfn, IOMMUF_readable|IOMMUF_writable);
    }
  <<...
	 if ( is_hardware_domain(d) && !iommu_use_hap_pt(d) )
        ret = iommu_map_page(d, gfn, gfn, IOMMUF_readable|IOMMUF_writable);
   >> ...

error handling and rollback already in-place.

----
8. p2m_remove_page():rollback one level(change required).

(Existing code)
  >>...
    if ( !paging_mode_translate(p2m->domain) )
    {
        if ( need_iommu(p2m->domain) )
            for ( i = 0; i < (1 << page_order); i++ )
                iommu_unmap_page(p2m->domain, mfn + i);
        return 0;
    }
  <<...

There is no error checking of iommu_unmap_page. We need to add.
However, there are several callers of this function which don't handle error at all. I'm not sure whether hardware domain will use this function.
Since it is in core p2m logic (which I don't have much knowledge), I hope we can print a warning and simply return error here (given the fact that non-hardware domains are all crashed immediately within unmap call)

----
9. p2m_pt_set_entry(): open (propose no-rollback).

(Existing code)
  >>...
           for ( i = 0; i < (1UL << page_order); i++ )
                iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
                               iommu_pte_flags);
        else
            for ( i = 0; i < (1UL << page_order); i++ )
                iommu_unmap_page(p2m->domain, gfn + i);
  <<...

Above is in the end of the func.

I'm not sure whether it will be called for hardware domain ( PVH mode is one potential concern. as we know, PVH has been introduced on Xen 4.4 as a DomU, and on Xen 4.5 as a Dom0.).

If not, we can leave it as today. Otherwise, I want to hear your suggestion whether we can use no-rollback (just erring) for hardware domain here since this code path is very complex.

  10. ept_set_entry(): open (propose no-rollback).

(Existing code)
  >>...
            if ( iommu_flags )
                for ( i = 0; i < (1 << order); i++ )
                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
            else
                for ( i = 0; i < (1 << order); i++ )
                    iommu_unmap_page(d, gfn + i);
  <<...

Above is in the end of the func.

Same open as 9) whether we can use no-rollback here.

----
11. __get_page_type(): open (propose no-rollback).
The following is in detail:
    >>...
      if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
        {
            if ( (x & PGT_type_mask) == PGT_writable_page )
                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
            else if ( type == PGT_writable_page )
                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
                               page_to_mfn(page),
                               IOMMUF_readable|IOMMUF_writable);
        }
    <<...

It's almost in the end of the func. 
As the limited set of cases where __get_page_type() calls iommu_{,un}map_page(), and a get_page_type() invocation that
previously was known to never fail (or at least so we hope, based on the existing code) [I got it from Jan's reply.],
we can classify it as normal PV domain and be treated as a fatal error. 
So for I am inclined to leave it as today. 


I welcome your comments and feedback!!

Quan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

             reply	other threads:[~2016-03-31  9:06 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-31  9:06 Xu, Quan [this message]
2016-03-31  9:12 ` abstract model of IOMMU unmaping/mapping failures Xu, Quan
2016-04-01  0:25 ` Tian, Kevin
2016-04-01 11:57 ` Jan Beulich
2016-04-06  7:38   ` Xu, Quan
2016-04-07 22:44     ` Jan Beulich
2016-04-08  8:54       ` Xu, Quan
2016-04-12  3:30 ` Xu, Quan
2016-04-15 13:32   ` George Dunlap
2016-04-15 14:02 ` George Dunlap
2016-04-18  1:20   ` Xu, Quan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=945CA011AD5F084CBEA3E851C0AB28894B873451@SHSMSX101.ccr.corp.intel.com \
    --to=quan.xu@intel.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=dario.faggioli@citrix.com \
    --cc=feng.wu@intel.com \
    --cc=george.dunlap@eu.citrix.com \
    --cc=jinsong.liu@alibaba-inc.com \
    --cc=jun.nakajima@intel.com \
    --cc=keir@xen.org \
    --cc=kevin.tian@intel.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).