Re: [PATCH] libx86: Introduce x86_cpu_policy_calculate_compatible() with MSR_ARCH_CAPS handling

From: Andrew Cooper <andrew.cooper3@citrix.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Xen-devel <xen-devel@lists.xenproject.org>,
	Jan Beulich <JBeulich@suse.com>, Wei Liu <wl@xen.org>
Subject: Re: [PATCH] libx86: Introduce x86_cpu_policy_calculate_compatible() with MSR_ARCH_CAPS handling
Date: Wed, 5 May 2021 13:37:48 +0100	[thread overview]
Message-ID: <8f6f339b-f025-2cd0-e666-a3083e79af3a@citrix.com> (raw)
In-Reply-To: <YJJtqyDOIkMxjvxW@Air-de-Roger>

On 05/05/2021 11:04, Roger Pau Monné wrote:
> On Tue, May 04, 2021 at 10:31:20PM +0100, Andrew Cooper wrote:
>> Just as with x86_cpu_policies_are_compatible(), make a start on this function
>> with some token handling.
>>
>> Add levelling support for MSR_ARCH_CAPS, because RSBA has interesting
>> properties, and introduce test_calculate_compatible_success() to the unit
>> tests, covering various cases where the arch_caps CPUID bit falls out, and
>> with RSBA accumulating rather than intersecting across the two.
>>
>> Extend x86_cpu_policies_are_compatible() with a check for MSR_ARCH_CAPS, which
>> was arguably missing from c/s e32605b07ef "x86: Begin to introduce support for
>> MSR_ARCH_CAPS".
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>> CC: Wei Liu <wl@xen.org>
>> ---
>>  tools/include/xen-tools/libs.h           |   5 ++
>>  tools/tests/cpu-policy/test-cpu-policy.c | 150 +++++++++++++++++++++++++++++++
>>  xen/include/xen/lib/x86/cpu-policy.h     |  22 +++++
>>  xen/lib/x86/policy.c                     |  47 ++++++++++
>>  4 files changed, 224 insertions(+)
>>
>> diff --git a/tools/include/xen-tools/libs.h b/tools/include/xen-tools/libs.h
>> index a16e0c3807..4de10efdea 100644
>> --- a/tools/include/xen-tools/libs.h
>> +++ b/tools/include/xen-tools/libs.h
>> @@ -63,4 +63,9 @@
>>  #define ROUNDUP(_x,_w) (((unsigned long)(_x)+(1UL<<(_w))-1) & ~((1UL<<(_w))-1))
>>  #endif
>>  
>> +#ifndef _AC
>> +#define __AC(X, Y)   (X ## Y)
>> +#define _AC(X, Y)    __AC(X, Y)
>> +#endif
> You need to remove those definitions from libxl_internal.h.

Ok.

>
>>  #endif	/* __XEN_TOOLS_LIBS__ */
>> diff --git a/tools/tests/cpu-policy/test-cpu-policy.c b/tools/tests/cpu-policy/test-cpu-policy.c
>> index 75973298df..455b4fe3c0 100644
>> --- a/tools/tests/cpu-policy/test-cpu-policy.c
>> +++ b/tools/tests/cpu-policy/test-cpu-policy.c
>> @@ -775,6 +775,154 @@ static void test_is_compatible_failure(void)
>>      }
>>  }
>>  
>> +static void test_calculate_compatible_success(void)
>> +{
>> +    static struct test {
>> +        const char *name;
>> +        struct {
>> +            struct cpuid_policy p;
>> +            struct msr_policy m;
>> +        } a, b, out;
>> +    } tests[] = {
>> +        {
>> +            "arch_caps, b short max_leaf",
>> +            .a = {
>> +                .p.basic.max_leaf = 7,
>> +                .p.feat.arch_caps = true,
>> +                .m.arch_caps.rdcl_no = true,
>> +            },
>> +            .b = {
>> +                .p.basic.max_leaf = 6,
>> +                .p.feat.arch_caps = true,
>> +                .m.arch_caps.rdcl_no = true,
>> +            },
>> +            .out = {
>> +                .p.basic.max_leaf = 6,
>> +            },
>> +        },
>> +        {
>> +            "arch_caps, b feat missing",
>> +            .a = {
>> +                .p.basic.max_leaf = 7,
>> +                .p.feat.arch_caps = true,
>> +                .m.arch_caps.rdcl_no = true,
>> +            },
>> +            .b = {
>> +                .p.basic.max_leaf = 7,
>> +                .m.arch_caps.rdcl_no = true,
>> +            },
>> +            .out = {
>> +                .p.basic.max_leaf = 7,
>> +            },
>> +        },
>> +        {
>> +            "arch_caps, b rdcl_no missing",
>> +            .a = {
>> +                .p.basic.max_leaf = 7,
>> +                .p.feat.arch_caps = true,
>> +                .m.arch_caps.rdcl_no = true,
>> +            },
>> +            .b = {
>> +                .p.basic.max_leaf = 7,
>> +                .p.feat.arch_caps = true,
>> +            },
>> +            .out = {
>> +                .p.basic.max_leaf = 7,
>> +                .p.feat.arch_caps = true,
>> +            },
>> +        },
>> +        {
>> +            "arch_caps, rdcl_no ok",
>> +            .a = {
>> +                .p.basic.max_leaf = 7,
>> +                .p.feat.arch_caps = true,
>> +                .m.arch_caps.rdcl_no = true,
>> +            },
>> +            .b = {
>> +                .p.basic.max_leaf = 7,
>> +                .p.feat.arch_caps = true,
>> +                .m.arch_caps.rdcl_no = true,
>> +            },
>> +            .out = {
>> +                .p.basic.max_leaf = 7,
>> +                .p.feat.arch_caps = true,
>> +                .m.arch_caps.rdcl_no = true,
>> +            },
>> +        },
>> +        {
>> +            "arch_caps, rsba accum",
>> +            .a = {
>> +                .p.basic.max_leaf = 7,
>> +                .p.feat.arch_caps = true,
>> +                .m.arch_caps.rsba = true,
>> +            },
>> +            .b = {
>> +                .p.basic.max_leaf = 7,
>> +                .p.feat.arch_caps = true,
>> +            },
>> +            .out = {
>> +                .p.basic.max_leaf = 7,
>> +                .p.feat.arch_caps = true,
>> +                .m.arch_caps.rsba = true,
>> +            },
>> +        },
>> +    };
>> +    struct cpu_policy_errors no_errors = INIT_CPU_POLICY_ERRORS;
>> +
>> +    printf("Testing calculate compatibility success:\n");
>> +
>> +    for ( size_t i = 0; i < ARRAY_SIZE(tests); ++i )
>> +    {
>> +        struct test *t = &tests[i];
>> +        struct cpuid_policy *p = malloc(sizeof(struct cpuid_policy));
>> +        struct msr_policy *m = malloc(sizeof(struct msr_policy));
>> +        struct cpu_policy a = {
>> +            &t->a.p,
>> +            &t->a.m,
>> +        }, b = {
>> +            &t->b.p,
>> +            &t->b.m,
>> +        }, out = {
>> +            p,
>> +            m,
>> +        };
>> +        struct cpu_policy_errors e;
>> +        int res;
>> +
>> +        if ( !p || !m )
>> +            err(1, "%s() malloc failure", __func__);
>> +
>> +        res = x86_cpu_policy_calculate_compatible(&a, &b, &out, &e);
>> +
>> +        /* Check the expected error output. */
>> +        if ( res != 0 || memcmp(&no_errors, &e, sizeof(no_errors)) )
>> +        {
>> +            fail("  Test '%s' expected no errors\n"
>> +                 "    got res %d { leaf %08x, subleaf %08x, msr %08x }\n",
>> +                 t->name, res, e.leaf, e.subleaf, e.msr);
>> +            goto test_done;
>> +        }
>> +
>> +        if ( memcmp(&t->out.p, p, sizeof(*p)) )
>> +        {
>> +            fail("  Test '%s' resulting CPUID policy not as expected\n",
>> +                 t->name);
>> +            goto test_done;
>> +        }
>> +
>> +        if ( memcmp(&t->out.m, m, sizeof(*m)) )
>> +        {
>> +            fail("  Test '%s' resulting MSR policy not as expected\n",
>> +                 t->name);
>> +            goto test_done;
>> +        }
> In order to assert that we don't mess things up, I would also add a
> x86_cpu_policies_are_compatible check here:
>
> res = x86_cpu_policies_are_compatible(&a, &out, &e);
> if ( res )
>     fail("  Test '%s' created incompatible policy\n"
>          "    got res %d { leaf %08x, subleaf %08x, msr %08x }\n",
>          t->name, res, e.leaf, e.subleaf, e.msr);
> res = x86_cpu_policies_are_compatible(&b, &out, &e);
> if ( res )
>     fail("  Test '%s' created incompatible policy\n"
>          "    got res %d { leaf %08x, subleaf %08x, msr %08x }\n",
>          t->name, res, e.leaf, e.subleaf, e.msr);

That's potentially problematic, hence not including it the first time
around.  See the discussion below.

>> +
>> +    test_done:
>> +        free(p);
>> +        free(m);
>> +    }
>> +}
>> +
>>  int main(int argc, char **argv)
>>  {
>>      printf("CPU Policy unit tests\n");
>> @@ -793,6 +941,8 @@ int main(int argc, char **argv)
>>      test_is_compatible_success();
>>      test_is_compatible_failure();
>>  
>> +    test_calculate_compatible_success();
>> +
>>      if ( nr_failures )
>>          printf("Done: %u failures\n", nr_failures);
>>      else
>> diff --git a/xen/include/xen/lib/x86/cpu-policy.h b/xen/include/xen/lib/x86/cpu-policy.h
>> index 5a2c4c7b2d..0422a15557 100644
>> --- a/xen/include/xen/lib/x86/cpu-policy.h
>> +++ b/xen/include/xen/lib/x86/cpu-policy.h
>> @@ -37,6 +37,28 @@ int x86_cpu_policies_are_compatible(const struct cpu_policy *host,
>>                                      const struct cpu_policy *guest,
>>                                      struct cpu_policy_errors *err);
>>  
>> +/*
>> + * Given two policies, calculate one which is compatible with each.
>> + *
>> + * i.e. Given host @a and host @b, calculate what to give a VM so it can live
>> + * migrate between the two.
>> + *
>> + * @param a        A cpu_policy.
>> + * @param b        Another cpu_policy.
>> + * @param out      A policy compatible with @a and @b.
>> + * @param err      Optional hint for error diagnostics.
>> + * @returns -errno
>> + *
>> + * For typical usage, @a and @b should be system policies of the same type
>> + * (i.e. PV/HVM default/max) from different hosts.  In the case that an
>> + * incompatibility is detected, the optional err pointer may identify the
>> + * problematic leaf/subleaf and/or MSR.
>> + */
>> +int x86_cpu_policy_calculate_compatible(const struct cpu_policy *a,
>> +                                        const struct cpu_policy *b,
>> +                                        struct cpu_policy *out,
>> +                                        struct cpu_policy_errors *err);
>> +
>>  #endif /* !XEN_LIB_X86_POLICIES_H */
>>  
>>  /*
>> diff --git a/xen/lib/x86/policy.c b/xen/lib/x86/policy.c
>> index f6cea4e2f9..06039e8aa8 100644
>> --- a/xen/lib/x86/policy.c
>> +++ b/xen/lib/x86/policy.c
>> @@ -29,6 +29,9 @@ int x86_cpu_policies_are_compatible(const struct cpu_policy *host,
>>      if ( ~host->msr->platform_info.raw & guest->msr->platform_info.raw )
>>          FAIL_MSR(MSR_INTEL_PLATFORM_INFO);
>>  
>> +    if ( ~host->msr->arch_caps.raw & guest->msr->arch_caps.raw )
>> +        FAIL_MSR(MSR_ARCH_CAPABILITIES);
> It might be nice to expand test_is_compatible_{success,failure} to
> account for arch_caps being checked now.

At some point we're going to need to stop unit testing "does the AND
operator work", and limit testing to the interesting corner cases.

> Shouldn't this check also take into account that host might not have
> RSBA set, but it's legit for a guest policy to have it?

When we expose this properly to guests, the max policies will have RSBA
set, and the default policies will have RSBA forwarded from hardware
and/or the model table.

Therefore, we can accept any VM RSBA configuration, irrespective of the
particulars of this host, but if you e.g. have a pool of haswell's, the
default policy will have RSBA clear across the board, and the VM won't
see it.

> if ( ~host->msr->arch_caps.raw & guest->msr->arch_caps.raw & ~POL_MASK )
>     FAIL_MSR(MSR_ARCH_CAPABILITIES);
>
> Maybe POL_MASK should be renamed and defined in a header so it's
> widely available?

No - this would be incorrect.  The polarity of certain bits only matters
for levelling calculations, not for "can this VM run on this host"
calculations.

If the VM has seen RSBA, and Xen doesn't know about it, the VM cannot run.

>
>> +
>>  #undef FAIL_MSR
>>  #undef FAIL_CPUID
>>  #undef NA
>> @@ -43,6 +46,50 @@ int x86_cpu_policies_are_compatible(const struct cpu_policy *host,
>>      return ret;
>>  }
>>  
>> +int x86_cpu_policy_calculate_compatible(const struct cpu_policy *a,
>> +                                        const struct cpu_policy *b,
>> +                                        struct cpu_policy *out,
>> +                                        struct cpu_policy_errors *err)
> I think this should be in an #ifndef __XEN__ protected region?
>
> There's no need to expose this to the hypervisor, as I would expect it
> will never have to do compatible policy generation? (ie: it will
> always be done by the toolstack?)

As indicated previously, I still think we want this in Xen for the boot
paths, but I suppose the guard was my suggestion to you, so is only fair
at this point.

>
>> +{
>> +    const struct cpuid_policy *ap = a->cpuid, *bp = b->cpuid;
>> +    const struct msr_policy *am = a->msr, *bm = b->msr;
>> +    struct cpuid_policy *cp = out->cpuid;
>> +    struct msr_policy *mp = out->msr;
>> +
>> +    memset(cp, 0, sizeof(*cp));
>> +    memset(mp, 0, sizeof(*mp));
>> +
>> +    cp->basic.max_leaf = min(ap->basic.max_leaf, bp->basic.max_leaf);
>> +
>> +    if ( cp->basic.max_leaf >= 7 )
>> +    {
>> +        cp->feat.max_subleaf = min(ap->feat.max_subleaf, bp->feat.max_subleaf);
>> +
>> +        cp->feat.raw[0].b = ap->feat.raw[0].b & bp->feat.raw[0].b;
>> +        cp->feat.raw[0].c = ap->feat.raw[0].c & bp->feat.raw[0].c;
>> +        cp->feat.raw[0].d = ap->feat.raw[0].d & bp->feat.raw[0].d;
>> +    }
>> +
>> +    /* TODO: Far more. */
> Right, my proposed patch (07/13) went a bit further and also leveled
> 1c, 1d, Da1, e1c, e1d, e7d, e8b and e21a, and we also need to level
> a couple of max_leaf fields.
>
> I'm happy for this to go in first, and I can rebase the extra logic I
> have on top of this one.

There is a lot of work to do.

One thing I haven't addressed yet is the fact is things which don't
level, e.g. vendor.  You've got to pick one, and there isn't a
mathematical relationship to use between a and b.

I think for that, we ought to document that we strictly take from a. 
This makes the operation not commutative, and in particular, I don't
think we want to waste too much time/effort trying to make cross-vendor
cases work - it was a stunt a decade ago, with a huge number of sharp
corners, as well as creating a number of XSAs due to poor implementation.

For v1, I suggest we firmly stick to the same-vendor case.  It's not as
if there is a lack of things to do to make this work.

~Andrew