All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Cooper <Andrew.Cooper3@citrix.com>
To: Jan Beulich <jbeulich@suse.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Cc: Wei Liu <wl@xen.org>, Roger Pau Monne <roger.pau@citrix.com>,
	George Dunlap <George.Dunlap@citrix.com>,
	"Tim (Xen.org)" <tim@xen.org>
Subject: Re: [PATCH 2/8] x86/shadow: properly handle get_page() failing
Date: Wed, 27 Jul 2022 12:46:50 +0000	[thread overview]
Message-ID: <6c5b5339-f645-4214-bc53-2b7ead004367@citrix.com> (raw)
In-Reply-To: <c5797fa8-8ee1-7668-936c-392f8298dff1@suse.com>

On 26/07/2022 17:04, Jan Beulich wrote:
> We should not blindly (in a release build) insert the new entry in the
> hash if a reference to the guest page cannot be obtained, or else an
> excess reference would be put when removing the hash entry again. Crash
> the domain in that case instead. The sole caller doesn't further care
> about the state of the guest page: All it does is return the
> corresponding shadow page (which was obtained successfully before) to
> its caller.
>
> To compensate we further need to adjust hash removal: Since the shadow
> page already has had its backlink set, domain cleanup code would try to
> destroy the shadow, and hence still cause a put_page() without
> corresponding get_page(). Leverage that the failed get_page() leads to
> no hash insertion, making shadow_hash_delete() no longer assume it will
> find the requested entry. Instead return back whether the entry was
> found. This way delete_shadow_status() can avoid calling put_page() in
> the problem scenario.
>
> For the other caller of shadow_hash_delete() simply reinstate the
> otherwise dropped assertion at the call site.
>
> While touching the conditionals in {set,delete}_shadow_status() anyway,
> also switch around their two pre-existing parts, to have the cheap one
> first (frequently allowing to avoid evaluation of the expensive - due to
> evaluate_nospec() - one altogether).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>, although with
some observations and a suggestion.

>
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -1581,7 +1581,7 @@ void shadow_hash_insert(struct domain *d
>      sh_hash_audit_bucket(d, key);
>  }
>  
> -void shadow_hash_delete(struct domain *d, unsigned long n, unsigned int t,
> +bool shadow_hash_delete(struct domain *d, unsigned long n, unsigned int t,
>                          mfn_t smfn)
>  /* Excise the mapping (n,t)->smfn from the hash table */
>  {
> @@ -1606,10 +1606,8 @@ void shadow_hash_delete(struct domain *d
>      {
>          /* Need to search for the one we want */
>          x = d->arch.paging.shadow.hash_table[key];
> -        while ( 1 )
> +        while ( x )
>          {
> -            ASSERT(x); /* We can't have hit the end, since our target is
> -                        * still in the chain somehwere... */
>              if ( next_shadow(x) == sp )
>              {
>                  x->next_shadow = sp->next_shadow;
> @@ -1617,10 +1615,14 @@ void shadow_hash_delete(struct domain *d
>              }
>              x = next_shadow(x);
>          }
> +        if ( !x )
> +            return false;
>      }

This entire if/else block is "delete matching element from single linked
list", opencoded in a very confusing way to follow.  I can't help but
feel there's a way to simplify it.  (Not in this patch, but for future
clean-up.)

>      set_next_shadow(sp, NULL);
>  
>      sh_hash_audit_bucket(d, key);
> +
> +    return true;
>  }
>  
>  typedef int (*hash_vcpu_callback_t)(struct vcpu *v, mfn_t smfn, mfn_t other_mfn);
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -132,7 +132,8 @@ delete_fl1_shadow_status(struct domain *
>      SHADOW_PRINTK("gfn=%"SH_PRI_gfn", type=%08x, smfn=%"PRI_mfn"\n",
>                     gfn_x(gfn), SH_type_fl1_shadow, mfn_x(smfn));
>      ASSERT(mfn_to_page(smfn)->u.sh.head);
> -    shadow_hash_delete(d, gfn_x(gfn), SH_type_fl1_shadow, smfn);
> +    if ( !shadow_hash_delete(d, gfn_x(gfn), SH_type_fl1_shadow, smfn) )
> +        ASSERT_UNREACHABLE();

I'd recommend crashing the domain here too.  I know the we've already
got the return value we want, but this represents corruption in the
shadow pagetable metadata, and I highly doubt it is safe to let the
guest continue to execute in such circumstances.

>  }
>  
>  
> --- a/xen/arch/x86/mm/shadow/private.h
> +++ b/xen/arch/x86/mm/shadow/private.h
> @@ -375,7 +375,7 @@ shadow_size(unsigned int shadow_type)
>  mfn_t shadow_hash_lookup(struct domain *d, unsigned long n, unsigned int t);
>  void  shadow_hash_insert(struct domain *d,
>                           unsigned long n, unsigned int t, mfn_t smfn);
> -void  shadow_hash_delete(struct domain *d,
> +bool  shadow_hash_delete(struct domain *d,
>                           unsigned long n, unsigned int t, mfn_t smfn);
>  
>  /* shadow promotion */
> @@ -773,18 +773,19 @@ static inline void
>  set_shadow_status(struct domain *d, mfn_t gmfn, u32 shadow_type, mfn_t smfn)
>  /* Put a shadow into the hash table */
>  {
> -    int res;
> -
>      SHADOW_PRINTK("d%d gmfn=%lx, type=%08x, smfn=%lx\n",
>                    d->domain_id, mfn_x(gmfn), shadow_type, mfn_x(smfn));
>  
>      ASSERT(mfn_to_page(smfn)->u.sh.head);
>  
>      /* 32-bit PV guests don't own their l4 pages so can't get_page them */
> -    if ( !is_pv_32bit_domain(d) || shadow_type != SH_type_l4_64_shadow )
> +    if ( (shadow_type != SH_type_l4_64_shadow || !is_pv_32bit_domain(d)) &&

This is the sensible way around anyway, but this is also a great example
of a predicate which really doesn't want to be eval_nospec().

We've got a growing pile of such usecases, so really do need to find a
predicate we can use which indicates "safely not speculation relevant".

~Andrew

  reply	other threads:[~2022-07-27 12:47 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-26 15:58 [PATCH 0/8] x86: XSA-40{1,2,8} follow-up Jan Beulich
2022-07-26 16:03 ` [PATCH 1/8] x86/shadow: drop shadow_prepare_page_type_change()'s 3rd parameter Jan Beulich
2022-07-27  9:25   ` Andrew Cooper
2022-07-26 16:04 ` [PATCH 2/8] x86/shadow: properly handle get_page() failing Jan Beulich
2022-07-27 12:46   ` Andrew Cooper [this message]
2022-07-27 12:53     ` Jan Beulich
2022-07-27 19:06       ` Andrew Cooper
2022-07-28  6:12         ` Jan Beulich
2022-07-26 16:04 ` [PATCH 3/8] mm: enforce return value checking on get_page() Jan Beulich
2022-07-26 17:32   ` Julien Grall
2022-07-26 16:05 ` [PATCH 4/8] x86/shadow: exclude HVM-only code from sh_remove_shadows() when !HVM Jan Beulich
2022-07-27 12:55   ` Andrew Cooper
2022-07-26 16:05 ` [PATCH 5/8] x86/shadow: don't open-code shadow_remove_all_shadows() Jan Beulich
2022-07-27 12:56   ` Andrew Cooper
2022-07-26 16:06 ` [PATCH 6/8] x86/shadow: drop CONFIG_HVM conditionals from sh_update_cr3() Jan Beulich
2022-07-27 12:57   ` Andrew Cooper
2022-07-26 16:06 ` [PATCH 7/8] x86/mm: adjust type check around _get_page_type()'s TLB flush Jan Beulich
2022-07-27 18:25   ` Andrew Cooper
2022-07-28  6:16     ` Jan Beulich
2022-08-12  8:51     ` Jan Beulich
2022-07-26 16:07 ` [PATCH 8/8] x86/mm: re-arrange " Jan Beulich
2022-07-27 18:31   ` Andrew Cooper
2022-07-28  6:20     ` Jan Beulich
2022-07-30 20:28 ` [PATCH 0/8] x86: XSA-40{1,2,8} follow-up Tim Deegan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6c5b5339-f645-4214-bc53-2b7ead004367@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=George.Dunlap@citrix.com \
    --cc=jbeulich@suse.com \
    --cc=roger.pau@citrix.com \
    --cc=tim@xen.org \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.