All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Xu, Quan" <quan.xu@intel.com>
To: George Dunlap <george.dunlap@citrix.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>, Keir Fraser <keir@xen.org>,
	Jan Beulich <jbeulich@suse.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Dario Faggioli <dario.faggioli@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	"Nakajima, Jun" <jun.nakajima@intel.com>
Subject: Re: [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping
Date: Fri, 6 May 2016 02:40:14 +0000	[thread overview]
Message-ID: <945CA011AD5F084CBEA3E851C0AB28894B8A9C2E@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <572B2861.7020807@citrix.com>

On May 05, 2016 7:03 PM, George Dunlap <george.dunlap@citrix.com> wrote:
> On 05/05/16 07:53, Xu, Quan wrote:
> > On May 04, 2016 9:49 PM, George Dunlap <George.Dunlap@eu.citrix.com>
> wrote:
> >> On Fri, Apr 29, 2016 at 10:25 AM, Quan Xu <quan.xu@intel.com> wrote:
> >>> When IOMMU mapping is failed, we issue a best effort rollback,
> >>> stopping IOMMU mapping, unmapping the previous IOMMU maps and
> >> then
> >>> reporting the error up to the call trees. When rollback is not
> >>> feasible (in early initialization phase or trade-off of complexity)
> >>> for the hardware domain, we do things on a best effort basis, only
> >>> throwing
> >> out an error message.
> >>>
> >>> IOMMU unmapping should perhaps continue despite an error, in an
> >>> attempt to do best effort cleanup.
> >>>
> >>> Signed-off-by: Quan Xu <quan.xu@intel.com>
> >>>
> >>> CC: Keir Fraser <keir@xen.org>
> >>> CC: Jan Beulich <jbeulich@suse.com>
> >>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> >>> CC: Jun Nakajima <jun.nakajima@intel.com>
> >>> CC: Kevin Tian <kevin.tian@intel.com>
> >>> CC: George Dunlap <george.dunlap@eu.citrix.com>
> >>> ---
> >>>  xen/arch/x86/mm.c               | 13 ++++++++-----
> >>>  xen/arch/x86/mm/p2m-ept.c       | 27 +++++++++++++++++++++++++--
> >>>  xen/arch/x86/mm/p2m-pt.c        | 24 ++++++++++++++++++++----
> >>>  xen/arch/x86/mm/p2m.c           | 11 +++++++++--
> >>>  xen/drivers/passthrough/iommu.c | 14 +++++++++++++-
> >>>  5 files changed, 75 insertions(+), 14 deletions(-)
> >>>
> >>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index
> >>> a42097f..427097d 100644
> >>> --- a/xen/arch/x86/mm.c
> >>> +++ b/xen/arch/x86/mm.c
> >>> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info
> >> *page, unsigned long type,
> >>>                             int preemptible)  {
> >>>      unsigned long nx, x, y = page->u.inuse.type_info;
> >>> -    int rc = 0;
> >>> +    int rc = 0, ret = 0;
> >>>
> >>>      ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
> >>>
> >>> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info
> >> *page, unsigned long type,
> >>>          if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
> >>>          {
> >>>              if ( (x & PGT_type_mask) == PGT_writable_page )
> >>> -                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> >>> +                ret = iommu_unmap_page(d, mfn_to_gmfn(d,
> >> page_to_mfn(page)));
> >>>              else if ( type == PGT_writable_page )
> >>> -                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> >>> -                               page_to_mfn(page),
> >>> -                               IOMMUF_readable|IOMMUF_writable);
> >>> +                ret = iommu_map_page(d, mfn_to_gmfn(d,
> page_to_mfn(page)),
> >>> +                                     page_to_mfn(page),
> >>> +
> >>> + IOMMUF_readable|IOMMUF_writable);
> >>>          }
> >>>      }
> >>>
> >>> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info
> >> *page, unsigned long type,
> >>>      if ( (x & PGT_partial) && !(nx & PGT_partial) )
> >>>          put_page(page);
> >>>
> >>> +    if ( !rc )
> >>> +        rc = ret;
> >>> +
> >>>      return rc;
> >>>  }
> >>>
> >>> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> >>> index 1ed5b47..df87944 100644
> >>> --- a/xen/arch/x86/mm/p2m-ept.c
> >>> +++ b/xen/arch/x86/mm/p2m-ept.c
> >>> @@ -821,6 +821,8 @@ out:
> >>>      if ( needs_sync )
> >>>          ept_sync_domain(p2m);
> >>>
> >>> +    ret = 0;
> >>> +
> >>>      /* For host p2m, may need to change VT-d page table.*/
> >>>      if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
> >>>           need_modify_vtd_table )
> >>> @@ -831,11 +833,29 @@ out:
> >>>          {
> >>>              if ( iommu_flags )
> >>>                  for ( i = 0; i < (1 << order); i++ )
> >>> -                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
> >>> +                {
> >>> +                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i,
> >>> + iommu_flags);
> >>> +
> >>> +                    if ( !ret && unlikely(rc) )
> >>> +                    {
> >>> +                        while ( i-- )
> >>> +                            iommu_unmap_page(d, gfn + i);
> >>> +
> >>> +                        ret = rc;
> >>> +                        break;
> >>> +                    }
> >>> +                }
> >>>              else
> >>>                  for ( i = 0; i < (1 << order); i++ )
> >>> -                    iommu_unmap_page(d, gfn + i);
> >>> +                {
> >>> +                    rc = iommu_unmap_page(d, gfn + i);
> >>> +
> >>> +                    if ( !ret && unlikely(rc) )
> >>> +                        ret = rc;
> >>> +                }
> >>>          }
> >>> +
> >>> +        rc = 0;
> >>>      }
> >>
> >> So the reason for doing this thing with setting ret=0, then using rc,
> >> then setting rc to zero, is to make sure that the change is
> >> propagated to the altp2m if necessary?
> >>
> >> I'm not a fan of this sort of "implied signalling".  The check at the
> >> altp2m conditional is meant to say, "everything went fine, go ahead
> >> and propagate the change".  But with your changes, that's not what we
> >> want anymore -- we want, "If the change actually made it into the
> >> hostp2m, propagate it to the altp2m."
> >>
> >> As such, I think it would be a lot clearer to just make a new boolean
> >> variable, maybe "entry_written", which we initialize to false and
> >> then set to true when the entry is actually written; and then change
> >> the condition re the altp2m to "if ( entry_written &&
> >> p2m_is_hostp2m(..) )".
> >>
> >> Then I think you could make the ret / rc use mirror what you do
> >> elsewhere, where ret is used to store the return value of the
> >> function call, and it's propagated to rc only if rc is not already set.
> >>
> >
> > George,
> >
> > Thanks for your detailed explanation. This seems an another matter of
> preference.
> > Of course, I'd follow your suggestion in p2m  field, while I need to
> > hear the other maintainers' options as well (especially when it comes from
> Kevin/Jan, as they do spend a lot of time for me).
> >
> > A little bit of difference here, IMO, an int 'entry_written' is much
> > better, as we also need  to propagate the 'entry_written' up to callers, i.e. the
> errors returned from atomic_write_ept_entry() are '-EINVAL', '-ESRCH' and '-
> EBUSY'. Then, the follow is my modification (feel free to correct me):
> 
> But it doesn't look like you're actually taking advantage of the fact that
> entry_written is non-boolean -- other than immediately copying the value into
> rc, all the places you're only checking if it's zero or not (where "zero" in this
> case actually means "the entry was written").
> 
> The code you had before was, as far as I can tell, functionally correct.
>  The point of introducing the extra variable is to decrease cognitive load on
> people coming along to modify the code later.  To do that, you want to
> maximize the things you do that are within expectation, and minimize the
> things you do outside of that.
> 
> The name "entry_written" sounds like a boolean; developers will expect
> "0 / false" to mean "not written" and "1/true" to mean "entry written".
>  Furthermore, rc looks like a return value; if coders see "rc =
> atomic_write_ept_entry()" they immediately have a framework for what's
> going on; whereas if they see "entry_written == [...]; if
> (entry_written) { ... rc= entry_written; }", it takes some thinking to figure out.
> Not much, but every little bit of unnecessary thinking adds up.
> 
> My suggestion remains to:
> 1. Declare entry_written as a bool, initialized to false 2. In both
> atomic_write_ept_entry() calls, use rc to collect the return value 3. In the
> second one (the one which may fail), add an else { entry_written = true; } 4. In
> the conditional deciding whether to update the iommu, use "entry_written", I
> think.  At this point rc == 0 and entry_written = true will be identical, but if we
> ever insert something in between, we want the iommu update to be
> attempted any time the entry is successfully written.
> 5. Use "if ( entry_written && ...)" when deciding whether to propagate the
> change to the altp2m.
> 

George, 
Thanks. It is very clear now, and I will  follow it in next v4. 

Quan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  reply	other threads:[~2016-05-06  2:40 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-29  9:25 [PATCH v3 00/10] Check VT-d Device-TLB flush error Quan Xu
2016-04-29  9:25 ` [PATCH v3 01/10] vt-d: fix the IOMMU flush issue Quan Xu
2016-05-04  1:26   ` Tian, Kevin
2016-05-04 12:09     ` Xu, Quan
2016-05-04 13:51       ` Jan Beulich
2016-05-04 15:03         ` Xu, Quan
2016-04-29  9:25 ` [PATCH v3 02/10] IOMMU: handle IOMMU mapping and unmapping failures Quan Xu
2016-05-04  1:28   ` Tian, Kevin
2016-05-04 11:49     ` Xu, Quan
2016-05-04 13:44       ` Jan Beulich
2016-05-05  1:47         ` Xu, Quan
2016-04-29  9:25 ` [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping Quan Xu
2016-05-04  1:45   ` Tian, Kevin
2016-05-04  8:40     ` Jan Beulich
2016-05-04  9:13       ` Xu, Quan
2016-05-04  9:28         ` Jan Beulich
2016-05-05  0:38           ` Tian, Kevin
2016-05-04 13:48   ` George Dunlap
2016-05-05  6:53     ` Xu, Quan
2016-05-05 11:02       ` George Dunlap
2016-05-06  2:40         ` Xu, Quan [this message]
2016-04-29  9:25 ` [PATCH v3 04/10] vt-d: propagate IOMMU Device-TLB flush error up to IOMMU unmapping Quan Xu
2016-04-29  9:25 ` [PATCH v3 05/10] vt-d: propagate IOMMU Device-TLB flush error up to IOMMU mapping Quan Xu
2016-04-29  9:25 ` [PATCH v3 06/10] propagate IOMMU Device-TLB flush error up to iommu_iotlb_flush{, _all} (top level ones) Quan Xu
2016-04-29  9:25 ` [PATCH v3 07/10] IOMMU: propagate IOMMU Device-TLB flush error up to iommu_iotlb_flush{, _all} (leaf ones) Quan Xu
2016-05-04  1:54   ` Tian, Kevin
2016-04-29  9:25 ` [PATCH v3 08/10] vt-d/ept: propagate IOMMU Device-TLB flush error up to EPT update Quan Xu
2016-05-04  1:56   ` Tian, Kevin
2016-04-29  9:25 ` [PATCH v3 09/10] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending Quan Xu
2016-05-04  1:59   ` Tian, Kevin
2016-05-04  2:14     ` Xu, Quan
2016-05-04  8:42       ` Jan Beulich
2016-05-04  8:59         ` Xu, Quan
2016-05-05 10:18         ` Xu, Quan
2016-05-06  7:02           ` Jan Beulich
2016-05-06  7:20             ` Xu, Quan
2016-04-29  9:25 ` [PATCH v3 10/10] vt-d: propagate error up to ME phantom function mapping and unmapping Quan Xu
2016-05-04  2:02   ` Tian, Kevin
2016-05-04  2:19     ` Xu, Quan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=945CA011AD5F084CBEA3E851C0AB28894B8A9C2E@SHSMSX101.ccr.corp.intel.com \
    --to=quan.xu@intel.com \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=dario.faggioli@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=jbeulich@suse.com \
    --cc=jun.nakajima@intel.com \
    --cc=keir@xen.org \
    --cc=kevin.tian@intel.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.