From: Tamas K Lengyel <tamas@tklengyel.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "Tamas K Lengyel" <tamas.lengyel@intel.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"George Dunlap" <george.dunlap@citrix.com>,
	"Roger Pau Monné" <roger.pau@citrix.com>, "Wei Liu" <wl@xen.org>,
	Xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH 2/2] x86/hap: Resolve mm-lock order violations when forking VMs with nested p2m
Date: Wed, 6 Jan 2021 10:29:48 -0500
Message-ID: <CABfawhn3OBbpHW9e1Dd9n4UCOe_KaymBS0QwnJC2tLr-oAnBmg@mail.gmail.com>
In-Reply-To: <158cf7ca-17cf-c886-20e8-b53519bec1b5@suse.com>

On Wed, Jan 6, 2021 at 7:03 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 04.01.2021 18:41, Tamas K Lengyel wrote:
> > @@ -893,13 +894,11 @@ static int nominate_page(struct domain *d, gfn_t gfn,
> >          goto out;
> >
> >      /*
> > -     * Now that the page is validated, we can lock it. There is no
> > -     * race because we're holding the p2m entry, so no one else
> > -     * could be nominating this gfn.
> > +     * Now that the page is validated, we can make it shared. There is no race
> > +     * because we're holding the p2m entry, so no one else could be nominating
> > +     * this gfn, and it is evidently not yet shared with any other VM, thus we
> > +     * don't need to take the mem_sharing_page_lock here.
> >       */
> > -    ret = -ENOENT;
> > -    if ( !mem_sharing_page_lock(page) )
> > -        goto out;
>
> Isn't it too limited to mention just nomination in the comment?
> Unsharing, for example, also needs to be prevented (or else
> the tail of sharing could race with the beginning of unsharing).
> The other paths look to similarly hold the GFN, so the
> reasoning is fine for them as well. Except maybe audit() - what
> about races with that one?

Audit is unused and should be removed. I don't plan on maintaining
that bit; it may already be broken, and I don't see a use for it.

Unsharing is already protected. No other domain can be doing an
unshare, since no other domain can have this page mapped as shared
before nominate finishes: mapping it shared requires the sharing ref
that nominate returns. If the same domain triggers an unshare we are
protected as well, because the first thing unshare_page() does is
get_gfn, and we already hold the gfn until nominate finishes. So we
are all good.
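
(Roughly, the sequencing I'm relying on, as simplified pseudocode -
not the actual code, just the lock-relevant steps of the two paths:)

    /* nominate path: holds the p2m entry for the whole operation */
    mfn = get_gfn(d, gfn, &p2mt);   /* takes the gfn                 */
    /* validate the page, mark it shared - no page lock needed       */
    put_gfn(d, gfn);                /* releases the gfn              */

    /* unshare path: its very first step contends on the same gfn    */
    mfn = get_gfn(d, gfn, &p2mt);   /* blocks until nominate's put_gfn */

So any same-domain unshare of this gfn serializes behind nominate.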

>
> > @@ -1214,7 +1212,7 @@ int __mem_sharing_unshare_page(struct domain *d,
> >      p2m_type_t p2mt;
> >      mfn_t mfn;
> >      struct page_info *page, *old_page;
> > -    int last_gfn;
> > +    int last_gfn, rc = 0;
>
> I consider this unhelpful: last_gfn really wants to be bool, and
> hence wants to not share a declaration with rc. But you're the
> maintainer ...

Doesn't really matter in the end IMHO.
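
(For reference, the suggested form is just:

    bool last_gfn;
    int rc = 0;

but I don't see that making a practical difference here.)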

>
> > @@ -1226,6 +1224,15 @@ int __mem_sharing_unshare_page(struct domain *d,
> >          return 0;
> >      }
> >
> > +    /* lock nested p2ms to avoid lock-order violation */
>
> Would you mind mentioning here the other side of the possible
> violation, to aid the reader?

You mean what the nested p2m locks would conflict with? I think in
the context of mem_sharing it's clear that the only thing they can
conflict with is the mem_sharing mm lock.
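
(As a generic illustration - plain pthreads, not Xen code, with the
lock names invented for this example - the shape of the violation is
two paths taking the same pair of locks in opposite order:)

    #include <pthread.h>

    static pthread_mutex_t sharing_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t nested_p2m_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Path A: sharing lock first, then a nested p2m lock. */
    static void *path_a(void *arg)
    {
        pthread_mutex_lock(&sharing_lock);
        pthread_mutex_lock(&nested_p2m_lock);  /* can deadlock vs path B */
        pthread_mutex_unlock(&nested_p2m_lock);
        pthread_mutex_unlock(&sharing_lock);
        return NULL;
    }

    /* Path B: the opposite order. Running A and B concurrently can
     * deadlock; Xen's mm-locks.h ordering checks exist to flag such
     * inversions before they turn into actual hangs. */
    static void *path_b(void *arg)
    {
        pthread_mutex_lock(&nested_p2m_lock);
        pthread_mutex_lock(&sharing_lock);
        pthread_mutex_unlock(&sharing_lock);
        pthread_mutex_unlock(&nested_p2m_lock);
        return NULL;
    }

    /* May or may not hang on a given run; the inversion is the bug. */
    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, path_a, NULL);
        pthread_create(&b, NULL, path_b, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }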

>
> > +    if ( unlikely(nestedhvm_enabled(d)) )
> > +    {
> > +        int i;
>
> unsigned int please (also further down), no matter that there may
> be other similar examples of (bad) use of plain int.

IMHO this is the type of change request that makes absolutely zero
difference in the end.

>
> > +        for ( i = 0; i < MAX_NESTEDP2M; i++ )
> > +            p2m_lock(d->arch.nested_p2m[i]);
>
> From a brief scan, this is the first instance of acquiring all
> nested p2m locks in one go. Ordering these by index is perhaps
> fine, but I think this wants spelling out in e.g. mm-locks.h. Of
> course the question is if you really need to go this far, i.e.
> whether really all of the locks need holding. This is even more
> so with p2m_flush_table_locked() not really looking to be a
> quick operation, when many pages have accumulated for it.
> I.e. the overall lock holding time may turn out even more
> excessive this way than it apparently already is.

I agree this is not ideal, but it gets things working without Xen
crashing. I would prefer if we could get rid of the mm lock ordering
requirement altogether in this context. We already hold the host p2m
lock and the sharing lock, and that ought to suffice. But I don't
have a better way to work around it for now.
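
(Reduced to its locking skeleton, the approach in the patch is the
classic fixed-total-order pattern - sketch only, using MAX_NESTEDP2M
and p2m_lock()/p2m_unlock() as in the tree:)

    /* Always acquire in ascending index order; any two paths doing
     * this can never deadlock against each other. */
    for ( i = 0; i < MAX_NESTEDP2M; i++ )
        p2m_lock(d->arch.nested_p2m[i]);

    /* ... flush the nested tables ... */

    /* Release order doesn't affect deadlock-freedom. */
    for ( i = 0; i < MAX_NESTEDP2M; i++ )
        p2m_unlock(d->arch.nested_p2m[i]);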

>
> > --- a/xen/arch/x86/mm/p2m.c
> > +++ b/xen/arch/x86/mm/p2m.c
> > @@ -1598,8 +1598,17 @@ void
> >  p2m_flush_nestedp2m(struct domain *d)
> >  {
> >      int i;
> > +    struct p2m_domain *p2m;
> > +
> >      for ( i = 0; i < MAX_NESTEDP2M; i++ )
> > -        p2m_flush_table(d->arch.nested_p2m[i]);
> > +    {
> > +        p2m = d->arch.nested_p2m[i];
>
> Please move the declaration here, making this the variable's
> initializer (unless line length constraints make the latter
> undesirable).

I really don't get what difference this would make.
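
(For the record, the requested shape is just:

    for ( i = 0; i < MAX_NESTEDP2M; i++ )
    {
        struct p2m_domain *p2m = d->arch.nested_p2m[i];
        /* ... */
    }

i.e. scope-limiting the variable - functionally identical.)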

Thanks for the review,
Tamas

