From: George Dunlap <george.dunlap@citrix.com>
To: <xen-devel@lists.xenproject.org>
Cc: Steven Haigh <netwiz@crc.id.au>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	George Dunlap <george.dunlap@citrix.com>,
	Andreas Kinzler <hfp@posteo.de>, Jan Beulich <jbeulich@suse.com>,
	Anthony Perard <anthony.perard@citrix.com>,
	Ian Jackson <ian.jackson@citrix.com>
Subject: [Xen-devel] [PATCH RFC] x86: Add hack to disable "Fake HT" mode
Date: Fri, 15 Nov 2019 10:57:39 +0000
Message-ID: <20191115105739.20333-1-george.dunlap@citrix.com>

Changeset ca2eee92df44 ("x86, hvm: Expose host core/HT topology to HVM
guests") attempted to "fake up" a topology which would induce guest
operating systems to not treat vcpus as sibling hyperthreads.  This
involved (among other things) actually reporting hyperthreading as
available, but giving vcpus every other APICID.  The resulting cpu
featureset is invalid, but most operating systems on most hardware
managed to cope with it.
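
As a rough illustration of what "Fake HT" does to leaf 1, here is a
minimal sketch.  `struct fake_policy` and both helper names are
hypothetical stand-ins, not the real libxc policy structure; only the
arithmetic follows the code being moved by this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the leaf 1 fields involved. */
struct fake_policy {
    bool    htt;   /* Leaf 1 EDX[28]: hyperthreading reported as available */
    uint8_t lppp;  /* Leaf 1 EBX[23:16]: max logical processors per package */
};

/* Each vcpu gets every other APIC ID, so no two vcpus look like siblings. */
static uint32_t fake_ht_apic_id(uint32_t vcpu_id)
{
    return vcpu_id * 2;
}

/* Report HTT and double the logical processor count, avoiding overflow. */
static void apply_fake_ht(struct fake_policy *p)
{
    p->htt = true;

    if ( !(p->lppp & 0x80) )
        p->lppp *= 2;
}
```

With four vcpus the guest sees APIC IDs 0, 2, 4 and 6 while HTT is
set, which is the contradictory combination described above.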

Unfortunately, Windows running on modern AMD hardware -- including
Ryzen 3xxx series processors, and reportedly EPYC "Rome" cpus -- gets
confused by the resulting contradictory feature bits and crashes
during installation.  (Linux guests have so far continued to cope.)

A "proper" fix is complicated, and it is too late either to get one
into 4.13 or to backport one to supported branches.  As a short-term
fix, implement an option to disable this "Fake HT" mode.  The resulting
reported topology will not be canonical, but experimentally continues
to work with Windows guests.

However, disabling this "Fake HT" mode has not been widely tested, and
will almost certainly break migration if applied inconsistently.

To minimize impact, add an environment variable which allows
administrators to disable "Fake HT" mode only for guests which are
known not to work with it (i.e., Windows guests) on affected hardware.
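
A minimal sketch of the check as written in the patch, wrapped in a
hypothetical helper (`fake_ht_enabled()` does not exist in libxc).
Note that only the presence of the variable is tested, so under this
reading any value, including "0" or the empty string, disables
"Fake HT".

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper mirroring the patch's test: "Fake HT" stays
 * enabled unless XEN_LIBXC_DISABLE_FAKEHT exists in the environment.
 * The value of the variable is never inspected.
 */
static bool fake_ht_enabled(void)
{
    return getenv("XEN_LIBXC_DISABLE_FAKEHT") == NULL;
}
```

Since the check lives in libxc, the variable presumably has to be set
in the environment of the toolstack process which builds the domain,
not inside the guest.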

Reported-by: Steven Haigh <netwiz@crc.id.au>
Reported-by: Andreas Kinzler <hfp@posteo.de>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
This has been compile-tested only; I'm posting it early to get
feedback on the approach.

TODO: Prevent such guests from being migrated

Open questions:

- Is this the right place to put the `getenv` check?

- Is there any way we can make migration work, at least in some cases?

- Can we check for known-problematic models, and at least report a
  more useful error?

CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Anthony Perard <anthony.perard@citrix.com>
---
 tools/libxc/xc_cpuid_x86.c | 74 +++++++++++++++++++++++---------------
 1 file changed, 45 insertions(+), 29 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 312c481f1e..70c85e1467 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -579,52 +579,68 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid,
     }
     else
     {
-        /*
-         * Topology for HVM guests is entirely controlled by Xen.  For now, we
-         * hardcode APIC_ID = vcpu_id * 2 to give the illusion of no SMT.
-         */
-        p->basic.htt = true;
+        p->basic.htt = false;
         p->extd.cmp_legacy = false;
 
-        /*
-         * Leaf 1 EBX[23:16] is Maximum Logical Processors Per Package.
-         * Update to reflect vLAPIC_ID = vCPU_ID * 2, but make sure to avoid
-         * overflow.
-         */
-        if ( !(p->basic.lppp & 0x80) )
-            p->basic.lppp *= 2;
-
         switch ( p->x86_vendor )
         {
         case X86_VENDOR_INTEL:
             for ( i = 0; (p->cache.subleaf[i].type &&
                           i < ARRAY_SIZE(p->cache.raw)); ++i )
             {
-                p->cache.subleaf[i].cores_per_package =
-                    (p->cache.subleaf[i].cores_per_package << 1) | 1;
+                p->cache.subleaf[i].cores_per_package = 0;
                 p->cache.subleaf[i].threads_per_cache = 0;
             }
             break;
+        }
 
-        case X86_VENDOR_AMD:
-        case X86_VENDOR_HYGON:
+        if ( !getenv("XEN_LIBXC_DISABLE_FAKEHT") ) {
             /*
-             * Leaf 0x80000008 ECX[15:12] is ApicIdCoreSize.
-             * Leaf 0x80000008 ECX[7:0] is NumberOfCores (minus one).
-             * Update to reflect vLAPIC_ID = vCPU_ID * 2.  But avoid
-             * - overflow,
-             * - going out of sync with leaf 1 EBX[23:16],
-             * - incrementing ApicIdCoreSize when it's zero (which changes the
-             *   meaning of bits 7:0).
+             * Topology for HVM guests is entirely controlled by Xen.  For now, we
+             * hardcode APIC_ID = vcpu_id * 2 to give the illusion of no SMT.
              */
-            if ( p->extd.nc < 0x7f )
+            p->basic.htt = true;
+
+            /*
+             * Leaf 1 EBX[23:16] is Maximum Logical Processors Per Package.
+             * Update to reflect vLAPIC_ID = vCPU_ID * 2, but make sure to avoid
+             * overflow.
+             */
+            if ( !(p->basic.lppp & 0x80) )
+                p->basic.lppp *= 2;
+
+            switch ( p->x86_vendor )
             {
-                if ( p->extd.apic_id_size != 0 && p->extd.apic_id_size != 0xf )
-                    p->extd.apic_id_size++;
+            case X86_VENDOR_INTEL:
+                for ( i = 0; (p->cache.subleaf[i].type &&
+                              i < ARRAY_SIZE(p->cache.raw)); ++i )
+                {
+                    p->cache.subleaf[i].cores_per_package =
+                        (p->cache.subleaf[i].cores_per_package << 1) | 1;
+                    p->cache.subleaf[i].threads_per_cache = 0;
+                }
+                break;
+            case X86_VENDOR_AMD:
+            case X86_VENDOR_HYGON:
+                /*
+                 * Leaf 0x80000008 ECX[15:12] is ApicIdCoreSize.
+                 * Leaf 0x80000008 ECX[7:0] is NumberOfCores (minus one).
+                 * Update to reflect vLAPIC_ID = vCPU_ID * 2.  But avoid
+                 * - overflow,
+                 * - going out of sync with leaf 1 EBX[23:16],
+                 * - incrementing ApicIdCoreSize when it's zero (which changes the
+                 *   meaning of bits 7:0).
+                 */
+                if ( p->extd.nc < 0x7f )
+                {
+                    if ( p->extd.apic_id_size != 0 && p->extd.apic_id_size != 0xf )
+                        p->extd.apic_id_size++;
+
+                    p->extd.nc = (p->extd.nc << 1) | 1;
+                }
+                break;
 
-                p->extd.nc = (p->extd.nc << 1) | 1;
             }
-            break;
         }
 
         /*
-- 
2.24.0

