[Qemu-devel] [PATCH v3 0/4] x86: Physical address limit patches

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH v3 0/4] x86: Physical address limit patches
@ 2016-07-05 19:03 Dr. David Alan Gilbert (git)
  2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 1/4] x86: Allow physical address bits to be set Dr. David Alan Gilbert (git)
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2016-07-05 19:03 UTC (permalink / raw)
  To: qemu-devel, pbonzini, ehabkost, marcel, mst, kraxel

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

QEMU sets the guests physical address bits to 40; this is wrong
on most hardware, and can be detected by the guest.
It also stops you using really huge multi-TB VMs.

Red Hat has had a patch, that Andrea wrote, downstream for a couple
of years that reads the hosts value and uses that in the guest.  That's
correct as far as the guest sees it, and lets you create huge VMs.

The downside, is that if you've got a mix of hosts, say an i7 and a Xeon,
life gets complicated in migration; prior to 2.6 it all apparently
worked (although a guest that looked might spot the change).
In 2.6 Paolo started checking MSR writes and they failed when the
incoming MTRR mask didn't fit.

This series:
   a) Fixes up mtrr masks so that if you're migrating between hosts
      of different physical address size it tries to do something sensible.

   b) Lets you specify the guest physical address size via a CPU property, i.e.
        -cpu SandyBridge,phys-bits=36

      The default is still to use the existing 40 bits value.

   c) Lets you tell qemu to use the same setting as the host, i.e.
        -cpu SandyBridge,phys-bits=0

Note that mixed size hosts are still not necessarily safe; a guest
started on a host with a large physical address size might start using
those bits and get upset when it's moved to a small host.
However that was already potentially broken in existing qemu that
used a magic value of 40.

There's potential to add some extra guards against people
doing silly stuff; e.g. stop people running VMs using 1TB of
address space on a tiny host.

(* Note I need to do some more testing on this version; but it passes
the smoke test; I'll report back on that but wanted people to comment
if this is closer to what people wanted).

Dave

v3
  Leave the default behaviour as before rather than switching to host behaviour
  Use 9999 as the default value so that we can tell later if the value
     is a user specified value or the default
  Make the host mismatch warning only warn once and not in compat case
  Only use host-bits in kvm mode
  Don't allow explicit setting in 32bit mode
  Flattened the warning & 32 bit patches into the other patches

Dr. David Alan Gilbert (4):
  x86: Allow physical address bits to be set
  x86: Mask mtrr mask based on CPU physical address limits
  x86: fill high bits of mtrr mask
  x86: Set physical address bits based on host

 include/hw/i386/pc.h |  5 +++
 target-i386/cpu.c    | 97 +++++++++++++++++++++++++++++++++++++++++++++++-----
 target-i386/cpu.h    |  6 ++++
 target-i386/kvm.c    | 37 ++++++++++++++++++--
 4 files changed, 133 insertions(+), 12 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v3 1/4] x86: Allow physical address bits to be set
  2016-07-05 19:03 [Qemu-devel] [PATCH v3 0/4] x86: Physical address limit patches Dr. David Alan Gilbert (git)
@ 2016-07-05 19:03 ` Dr. David Alan Gilbert (git)
  2016-07-06 19:01   ` Eduardo Habkost
  2016-07-06 19:57   ` Eduardo Habkost
  2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 2/4] x86: Mask mtrr mask based on CPU physical address limits Dr. David Alan Gilbert (git)
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2016-07-05 19:03 UTC (permalink / raw)
  To: qemu-devel, pbonzini, ehabkost, marcel, mst, kraxel

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Currently QEMU sets the x86 number of physical address bits to the
magic number 40.  This is only correct on some small AMD systems;
Intel systems tend to have 36, 39, 46 bits, and large AMD systems
tend to have 48.

Having the value different from your actual hardware is detectable
by the guest and in principal can cause problems;
The current limit of 40 stops TB VMs being created by those lucky
enough to have that much.

This patch lets you set the physical bits by a cpu property but
defaults to the same existing magic 40.

I've removed the ancient warning about the 42 bit limit in exec.c;
I can't find that limit in there and no one else seems to know where
it is.

We use a magic value of 9999 as the property default so that we can
later distinguish between the default and a user set value.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 target-i386/cpu.c | 43 ++++++++++++++++++++++++++++++++++---------
 target-i386/cpu.h |  3 +++
 2 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 3bd3cfc..74d53c5 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2602,17 +2602,13 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         break;
     case 0x80000008:
         /* virtual & phys address size in low 2 bytes. */
-/* XXX: This value must match the one used in the MMU code. */
         if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
-            /* 64 bit processor */
-/* XXX: The physical address space is limited to 42 bits in exec.c. */
-            *eax = 0x00003028; /* 48 bits virtual, 40 bits physical */
+            /* 64 bit processor, 48 bits virtual, configurable
+             * physical bits.
+             */
+            *eax = 0x00003000 + cpu->phys_bits;
         } else {
-            if (env->features[FEAT_1_EDX] & CPUID_PSE36) {
-                *eax = 0x00000024; /* 36 bits physical */
-            } else {
-                *eax = 0x00000020; /* 32 bits physical */
-            }
+            *eax = cpu->phys_bits;
         }
         *ebx = 0;
         *ecx = 0;
@@ -2956,7 +2952,35 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
            & CPUID_EXT2_AMD_ALIASES);
     }
 
+    if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
+        /* 9999 is a special meaning 'use the old default',
+         */
+        if (cpu->phys_bits == 9999) {
+            /* this must match the PHYS_ADDR_MASK in cpu.h */
+            cpu->phys_bits = 40;
+        }
+        if (cpu->phys_bits > TARGET_PHYS_ADDR_SPACE_BITS ||
+            cpu->phys_bits < 32) {
+            error_setg(errp, "phys_bits should be between 32 and %u or 0 to"
+                             " use host size (but is %u)",
+                             TARGET_PHYS_ADDR_SPACE_BITS, cpu->phys_bits);
+            return;
+        }
+    } else {
+        /* For 32 bit systems don't use the user set value, but keep
+         * phys_bits consistent with what we tell the guest.
+         */
+        if (cpu->phys_bits != 9999) {
+            error_setg(errp, "phys_bits is not user-configurable in 32 bit");
+            return;
+        }
 
+        if (env->features[FEAT_1_EDX] & CPUID_PSE36) {
+            cpu->phys_bits = 36;
+        } else {
+            cpu->phys_bits = 32;
+        }
+    }
     cpu_exec_init(cs, &error_abort);
 
     if (tcg_enabled()) {
@@ -3257,6 +3281,7 @@ static Property x86_cpu_properties[] = {
     DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
     DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
     DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
+    DEFINE_PROP_UINT32("phys-bits", X86CPU, phys_bits, 9999),
     DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, 0),
     DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, 0),
     DEFINE_PROP_UINT32("xlevel2", X86CPU, env.cpuid_xlevel2, 0),
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 474b0b9..221b1a2 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -1181,6 +1181,9 @@ struct X86CPU {
     /* Compatibility bits for old machine types: */
     bool enable_cpuid_0xb;
 
+    /* Number of physical address bits supported */
+    uint32_t phys_bits;
+
     /* in order to simplify APIC support, we leave this pointer to the
        user */
     struct DeviceState *apic_state;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v3 2/4] x86: Mask mtrr mask based on CPU physical address limits
  2016-07-05 19:03 [Qemu-devel] [PATCH v3 0/4] x86: Physical address limit patches Dr. David Alan Gilbert (git)
  2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 1/4] x86: Allow physical address bits to be set Dr. David Alan Gilbert (git)
@ 2016-07-05 19:03 ` Dr. David Alan Gilbert (git)
  2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 3/4] x86: fill high bits of mtrr mask Dr. David Alan Gilbert (git)
  2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 4/4] x86: Set physical address bits based on host Dr. David Alan Gilbert (git)
  3 siblings, 0 replies; 14+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2016-07-05 19:03 UTC (permalink / raw)
  To: qemu-devel, pbonzini, ehabkost, marcel, mst, kraxel

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

The CPU GPs if we try and set a bit in a variable MTRR mask above
the limit of physical address bits on the host.  We hit this
when loading a migration from a host with a larger physical
address limit than our destination (e.g. a Xeon->i7 of same
generation) but previously used to get away with it
until 48e1a45 started checking that msr writes actually worked.

It seems in our case the GP probably comes from KVM emulating
that GP.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
---
 target-i386/kvm.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index f3698f1..6429205 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1677,6 +1677,8 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
             }
         }
         if (has_msr_mtrr) {
+            uint64_t phys_mask = MAKE_64BIT_MASK(0, cpu->phys_bits);
+
             kvm_msr_entry_add(cpu, MSR_MTRRdefType, env->mtrr_deftype);
             kvm_msr_entry_add(cpu, MSR_MTRRfix64K_00000, env->mtrr_fixed[0]);
             kvm_msr_entry_add(cpu, MSR_MTRRfix16K_80000, env->mtrr_fixed[1]);
@@ -1690,10 +1692,15 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
             kvm_msr_entry_add(cpu, MSR_MTRRfix4K_F0000, env->mtrr_fixed[9]);
             kvm_msr_entry_add(cpu, MSR_MTRRfix4K_F8000, env->mtrr_fixed[10]);
             for (i = 0; i < MSR_MTRRcap_VCNT; i++) {
+                /* The CPU GPs if we write to a bit above the physical limit of
+                 * the host CPU (and KVM emulates that)
+                 */
+                uint64_t mask = env->mtrr_var[i].mask;
+                mask &= phys_mask;
+
                 kvm_msr_entry_add(cpu, MSR_MTRRphysBase(i),
                                   env->mtrr_var[i].base);
-                kvm_msr_entry_add(cpu, MSR_MTRRphysMask(i),
-                                  env->mtrr_var[i].mask);
+                kvm_msr_entry_add(cpu, MSR_MTRRphysMask(i), mask);
             }
         }

-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v3 3/4] x86: fill high bits of mtrr mask
  2016-07-05 19:03 [Qemu-devel] [PATCH v3 0/4] x86: Physical address limit patches Dr. David Alan Gilbert (git)
  2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 1/4] x86: Allow physical address bits to be set Dr. David Alan Gilbert (git)
  2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 2/4] x86: Mask mtrr mask based on CPU physical address limits Dr. David Alan Gilbert (git)
@ 2016-07-05 19:03 ` Dr. David Alan Gilbert (git)
  2016-07-06 19:18   ` Eduardo Habkost
  2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 4/4] x86: Set physical address bits based on host Dr. David Alan Gilbert (git)
  3 siblings, 1 reply; 14+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2016-07-05 19:03 UTC (permalink / raw)
  To: qemu-devel, pbonzini, ehabkost, marcel, mst, kraxel

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Fill the bits between 51..number-of-physical-address-bits in the
MTRR_PHYSMASKn variable range mtrr masks so that they're consistent
in the migration stream irrespective of the physical address space
of the source VM in a migration.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/hw/i386/pc.h |  5 +++++
 target-i386/cpu.c    |  1 +
 target-i386/cpu.h    |  3 +++
 target-i386/kvm.c    | 26 +++++++++++++++++++++++++-
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 7e43b20..d85e924 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -374,6 +374,11 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
         .driver   = TYPE_X86_CPU,\
         .property = "cpuid-0xb",\
         .value    = "off",\
+    },\
+    {\
+        .driver = TYPE_X86_CPU,\
+        .property = "fill-mtrr-mask",\
+        .value = "off",\
     },
 
 #define PC_COMPAT_2_5 \
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 74d53c5..f33cf58 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -3282,6 +3282,7 @@ static Property x86_cpu_properties[] = {
     DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
     DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
     DEFINE_PROP_UINT32("phys-bits", X86CPU, phys_bits, 9999),
+    DEFINE_PROP_BOOL("fill-mtrr-mask", X86CPU, fill_mtrr_mask, true),
     DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, 0),
     DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, 0),
     DEFINE_PROP_UINT32("xlevel2", X86CPU, env.cpuid_xlevel2, 0),
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 221b1a2..a9bbd91 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -1181,6 +1181,9 @@ struct X86CPU {
     /* Compatibility bits for old machine types: */
     bool enable_cpuid_0xb;
 
+    /* if true fill the top bits of the MTRR_PHYSMASKn variable range */
+    bool fill_mtrr_mask;
+
     /* Number of physical address bits supported */
     uint32_t phys_bits;
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 6429205..b711867 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1935,6 +1935,7 @@ static int kvm_get_msrs(X86CPU *cpu)
     CPUX86State *env = &cpu->env;
     struct kvm_msr_entry *msrs = cpu->kvm_msr_buf->entries;
     int ret, i;
+    uint64_t mtrr_top_bits;
 
     kvm_msr_buf_reset(cpu);
 
@@ -2084,6 +2085,28 @@ static int kvm_get_msrs(X86CPU *cpu)
     }
 
     assert(ret == cpu->kvm_msr_buf->nmsrs);
+    /*
+     * MTRR masks: Each mask consists of 5 parts
+     * a  10..0: must be zero
+     * b  11   : valid bit
+     * c n-1.12: actual mask bits
+     * d  51..n: reserved must be zero
+     * e  63.52: reserved must be zero
+     *
+     * 'n' is the number of physical bits supported by the CPU and is
+     * apparently always <= 52.   We know our 'n' but don't know what
+     * the destinations 'n' is; it might be smaller, in which case
+     * it masks (c) on loading. It might be larger, in which case
+     * we fill 'd' so that d..c is consistent irrespetive of the 'n'
+     * we're migrating to.
+     */
+    if (cpu->fill_mtrr_mask && cpu->phys_bits < TARGET_PHYS_ADDR_SPACE_BITS) {
+        mtrr_top_bits = MAKE_64BIT_MASK(cpu->phys_bits,
+                                TARGET_PHYS_ADDR_SPACE_BITS - cpu->phys_bits);
+    } else {
+        mtrr_top_bits = 0;
+    }
+
     for (i = 0; i < ret; i++) {
         uint32_t index = msrs[i].index;
         switch (index) {
@@ -2279,7 +2302,8 @@ static int kvm_get_msrs(X86CPU *cpu)
             break;
         case MSR_MTRRphysBase(0) ... MSR_MTRRphysMask(MSR_MTRRcap_VCNT - 1):
             if (index & 1) {
-                env->mtrr_var[MSR_MTRRphysIndex(index)].mask = msrs[i].data;
+                env->mtrr_var[MSR_MTRRphysIndex(index)].mask = msrs[i].data |
+                                                               mtrr_top_bits;
             } else {
                 env->mtrr_var[MSR_MTRRphysIndex(index)].base = msrs[i].data;
             }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v3 4/4] x86: Set physical address bits based on host
  2016-07-05 19:03 [Qemu-devel] [PATCH v3 0/4] x86: Physical address limit patches Dr. David Alan Gilbert (git)
                   ` (2 preceding siblings ...)
  2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 3/4] x86: fill high bits of mtrr mask Dr. David Alan Gilbert (git)
@ 2016-07-05 19:03 ` Dr. David Alan Gilbert (git)
  2016-07-06 19:32   ` Eduardo Habkost
  3 siblings, 1 reply; 14+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2016-07-05 19:03 UTC (permalink / raw)
  To: qemu-devel, pbonzini, ehabkost, marcel, mst, kraxel

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

A special case based on the previous phys-bits property; if it's
the magic value 0 then use the hosts capabilities.

We can also use the value we read from the host to check the users
explicitly set value and warn them if it doesn't match.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 target-i386/cpu.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index f33cf58..6ebd26b 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2952,7 +2952,60 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
            & CPUID_EXT2_AMD_ALIASES);
     }
 
+    /* For 64bit systems think about the number of physical bits to present.
+     * ideally this should be the same as the host; anything other than matching
+     * the host can cause incorrect guest behaviour.
+     * QEMU used to pick the magic value of 40 bits that corresponds to
+     * consumer AMD devices but nothing esle.
+     */
     if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
+        if (kvm_enabled()) {
+            /* Read the hosts physical address size, and compare it to what we
+             * were asked for; note old machine types default to 40 bits
+             */
+            uint32_t eax;
+            uint32_t host_phys_bits = 0;
+            static bool warned;
+
+            host_cpuid(0x80000000, 0, &eax, NULL, NULL, NULL);
+            if (eax >= 0x80000008) {
+                host_cpuid(0x80000008, 0, &eax, NULL, NULL, NULL);
+                /* Note: According to AMD doc 25481 rev 2.34 they have a field
+                 * at 23:16 that can specify a maximum physical address bits for
+                 * the guest that can override this value; but I've not seen
+                 * anything with that set.
+                 */
+                host_phys_bits = eax & 0xff;
+            } else {
+                /* It's an odd 64 bit machine that doesn't have the leaf for
+                 * physical address bits; fall back to 36 that's most older
+                 * Intel.
+                 */
+                host_phys_bits = 36;
+            }
+
+            if (cpu->phys_bits == 0) {
+                /* The user asked for us to use the host physical bits */
+                cpu->phys_bits = host_phys_bits;
+            }
+
+            /* Print a warning if the user set it to a value that's not the
+             * host value.
+             */
+            if (cpu->phys_bits != host_phys_bits && cpu->phys_bits != 9999 &&
+                !warned) {
+                error_report("Warning: Host physical bits (%u)"
+                                 " does not match phys_bits (%u)",
+                                 host_phys_bits, cpu->phys_bits);
+                warned = true;
+            }
+        } else {
+            if (cpu->phys_bits == 0) {
+                error_setg(errp, "phys_bits can not be read from the host in"
+                                 " TCG mode");
+                return;
+            }
+        }
         /* 9999 is a special meaning 'use the old default',
          */
         if (cpu->phys_bits == 9999) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/4] x86: Allow physical address bits to be set
  2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 1/4] x86: Allow physical address bits to be set Dr. David Alan Gilbert (git)
@ 2016-07-06 19:01   ` Eduardo Habkost
  2016-07-07 16:39     ` Dr. David Alan Gilbert
  2016-07-06 19:57   ` Eduardo Habkost
  1 sibling, 1 reply; 14+ messages in thread
From: Eduardo Habkost @ 2016-07-06 19:01 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git); +Cc: qemu-devel, pbonzini, marcel, mst, kraxel

On Tue, Jul 05, 2016 at 08:03:15PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Currently QEMU sets the x86 number of physical address bits to the
> magic number 40.  This is only correct on some small AMD systems;
> Intel systems tend to have 36, 39, 46 bits, and large AMD systems
> tend to have 48.
> 
> Having the value different from your actual hardware is detectable
> by the guest and in principal can cause problems;
> The current limit of 40 stops TB VMs being created by those lucky
> enough to have that much.
> 
> This patch lets you set the physical bits by a cpu property but
> defaults to the same existing magic 40.
> 
> I've removed the ancient warning about the 42 bit limit in exec.c;
> I can't find that limit in there and no one else seems to know where
> it is.
> 
> We use a magic value of 9999 as the property default so that we can
> later distinguish between the default and a user set value.

I'd prefer to use -1 or 0xFFFFFFFF (UINT32_MAX) to indicate it
was not set by the user, and not document it as "use the old
default" but just as "it was not set explicitly".

This won't allow us to differentiate between "set by user" and
"set by machine-type compat_props" in the future. But I would
prefer to use a MachineClass field or boolean property than magic
numbers for that, anyway.

If you replace 9999 with UINT32_MAX or 0xFFFFFFFF in this patch:

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>

BTW, using 0 to indicate "not set" would be acceptable, too, but
the magic 0 value in patch 4/4 would need to be replaced with a
boolean "host-phys-bits" property. But I prefer boolean properties
instead of magic numbers, anyway.

> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  target-i386/cpu.c | 43 ++++++++++++++++++++++++++++++++++---------
>  target-i386/cpu.h |  3 +++
>  2 files changed, 37 insertions(+), 9 deletions(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 3bd3cfc..74d53c5 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -2602,17 +2602,13 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>          break;
>      case 0x80000008:
>          /* virtual & phys address size in low 2 bytes. */
> -/* XXX: This value must match the one used in the MMU code. */
>          if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
> -            /* 64 bit processor */
> -/* XXX: The physical address space is limited to 42 bits in exec.c. */
> -            *eax = 0x00003028; /* 48 bits virtual, 40 bits physical */
> +            /* 64 bit processor, 48 bits virtual, configurable
> +             * physical bits.
> +             */
> +            *eax = 0x00003000 + cpu->phys_bits;
>          } else {
> -            if (env->features[FEAT_1_EDX] & CPUID_PSE36) {
> -                *eax = 0x00000024; /* 36 bits physical */
> -            } else {
> -                *eax = 0x00000020; /* 32 bits physical */
> -            }
> +            *eax = cpu->phys_bits;
>          }
>          *ebx = 0;
>          *ecx = 0;
> @@ -2956,7 +2952,35 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>             & CPUID_EXT2_AMD_ALIASES);
>      }
>  
> +    if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
> +        /* 9999 is a special meaning 'use the old default',
> +         */
> +        if (cpu->phys_bits == 9999) {
> +            /* this must match the PHYS_ADDR_MASK in cpu.h */
> +            cpu->phys_bits = 40;
> +        }
> +        if (cpu->phys_bits > TARGET_PHYS_ADDR_SPACE_BITS ||
> +            cpu->phys_bits < 32) {
> +            error_setg(errp, "phys_bits should be between 32 and %u or 0 to"
> +                             " use host size (but is %u)",
> +                             TARGET_PHYS_ADDR_SPACE_BITS, cpu->phys_bits);
> +            return;
> +        }
> +    } else {
> +        /* For 32 bit systems don't use the user set value, but keep
> +         * phys_bits consistent with what we tell the guest.
> +         */
> +        if (cpu->phys_bits != 9999) {
> +            error_setg(errp, "phys_bits is not user-configurable in 32 bit");
> +            return;
> +        }
>  
> +        if (env->features[FEAT_1_EDX] & CPUID_PSE36) {
> +            cpu->phys_bits = 36;
> +        } else {
> +            cpu->phys_bits = 32;
> +        }
> +    }
>      cpu_exec_init(cs, &error_abort);
>  
>      if (tcg_enabled()) {
> @@ -3257,6 +3281,7 @@ static Property x86_cpu_properties[] = {
>      DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
>      DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
>      DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
> +    DEFINE_PROP_UINT32("phys-bits", X86CPU, phys_bits, 9999),
>      DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, 0),
>      DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, 0),
>      DEFINE_PROP_UINT32("xlevel2", X86CPU, env.cpuid_xlevel2, 0),
> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> index 474b0b9..221b1a2 100644
> --- a/target-i386/cpu.h
> +++ b/target-i386/cpu.h
> @@ -1181,6 +1181,9 @@ struct X86CPU {
>      /* Compatibility bits for old machine types: */
>      bool enable_cpuid_0xb;
>  
> +    /* Number of physical address bits supported */
> +    uint32_t phys_bits;
> +
>      /* in order to simplify APIC support, we leave this pointer to the
>         user */
>      struct DeviceState *apic_state;
> -- 
> 2.7.4
> 

-- 
Eduardo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 3/4] x86: fill high bits of mtrr mask
  2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 3/4] x86: fill high bits of mtrr mask Dr. David Alan Gilbert (git)
@ 2016-07-06 19:18   ` Eduardo Habkost
  2016-07-07 17:51     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 14+ messages in thread
From: Eduardo Habkost @ 2016-07-06 19:18 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git); +Cc: qemu-devel, pbonzini, marcel, mst, kraxel

On Tue, Jul 05, 2016 at 08:03:17PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Fill the bits between 51..number-of-physical-address-bits in the
> MTRR_PHYSMASKn variable range mtrr masks so that they're consistent
> in the migration stream irrespective of the physical address space
> of the source VM in a migration.
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
[...]
> @@ -2084,6 +2085,28 @@ static int kvm_get_msrs(X86CPU *cpu)
>      }
>  
>      assert(ret == cpu->kvm_msr_buf->nmsrs);
> +    /*
> +     * MTRR masks: Each mask consists of 5 parts
> +     * a  10..0: must be zero
> +     * b  11   : valid bit
> +     * c n-1.12: actual mask bits
> +     * d  51..n: reserved must be zero
> +     * e  63.52: reserved must be zero
> +     *
> +     * 'n' is the number of physical bits supported by the CPU and is
> +     * apparently always <= 52.   We know our 'n' but don't know what
> +     * the destinations 'n' is; it might be smaller, in which case
> +     * it masks (c) on loading. It might be larger, in which case
> +     * we fill 'd' so that d..c is consistent irrespetive of the 'n'
> +     * we're migrating to.
> +     */
> +    if (cpu->fill_mtrr_mask && cpu->phys_bits < TARGET_PHYS_ADDR_SPACE_BITS) {

As we already ensure phys_bits <= TARGET_PHYS_ADDR_SPACE_BITS in
patch 1/4, what about just doing this:

  assert(cpu->phys_bits <= TARGET_PHYS_ADDR_SPACE_BITS)


> +        mtrr_top_bits = MAKE_64BIT_MASK(cpu->phys_bits,
> +                                TARGET_PHYS_ADDR_SPACE_BITS - cpu->phys_bits);

What's the actual meaning of TARGET_PHYS_ADDR_SPACE_BITS? Can it
ever change in the future?  Should a change in
TARGET_PHYS_ADDR_SPACE_BITS really change the migration format?

To make sure we won't have any surprises if
TARGET_PHYS_ADDR_SPACE_BITS change, I would change the code to:

 QEMU_BUILD_BUG_ON(TARGET_PHYS_ADDR_SPACE_BITS > 52);
 assert(cpu->phs_bits <= TARGET_PHYS_ADDR_SPACE_BITS);
 mtrr_top_bits = MAKE_64BIT_MASK(cpu->phys_bits, 52 - cpu->phys_bits);


> +    } else {
> +        mtrr_top_bits = 0;
> +    }
> +
>      for (i = 0; i < ret; i++) {
>          uint32_t index = msrs[i].index;
>          switch (index) {
> @@ -2279,7 +2302,8 @@ static int kvm_get_msrs(X86CPU *cpu)
>              break;
>          case MSR_MTRRphysBase(0) ... MSR_MTRRphysMask(MSR_MTRRcap_VCNT - 1):
>              if (index & 1) {
> -                env->mtrr_var[MSR_MTRRphysIndex(index)].mask = msrs[i].data;
> +                env->mtrr_var[MSR_MTRRphysIndex(index)].mask = msrs[i].data |
> +                                                               mtrr_top_bits;
>              } else {
>                  env->mtrr_var[MSR_MTRRphysIndex(index)].base = msrs[i].data;
>              }
> -- 
> 2.7.4
> 

-- 
Eduardo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 4/4] x86: Set physical address bits based on host
  2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 4/4] x86: Set physical address bits based on host Dr. David Alan Gilbert (git)
@ 2016-07-06 19:32   ` Eduardo Habkost
  2016-07-07 18:46     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 14+ messages in thread
From: Eduardo Habkost @ 2016-07-06 19:32 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git); +Cc: qemu-devel, pbonzini, marcel, mst, kraxel

On Tue, Jul 05, 2016 at 08:03:18PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> A special case based on the previous phys-bits property; if it's
> the magic value 0 then use the hosts capabilities.
> 
> We can also use the value we read from the host to check the users
> explicitly set value and warn them if it doesn't match.
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  target-i386/cpu.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 53 insertions(+)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index f33cf58..6ebd26b 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -2952,7 +2952,60 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>             & CPUID_EXT2_AMD_ALIASES);
>      }
>  
> +    /* For 64bit systems think about the number of physical bits to present.
> +     * ideally this should be the same as the host; anything other than matching
> +     * the host can cause incorrect guest behaviour.
> +     * QEMU used to pick the magic value of 40 bits that corresponds to
> +     * consumer AMD devices but nothing esle.
> +     */
>      if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
> +        if (kvm_enabled()) {
> +            /* Read the hosts physical address size, and compare it to what we
> +             * were asked for; note old machine types default to 40 bits
> +             */

Isn't the "old machine types default to 40 bits" part obsolete,
now that you use 9999 to indicate it was not set explicitly?

Also, the observation makes me confused: I know the old machine
types default to 40 bits, but I don't know why I need to know
that to understand the host_cpuid() logic below.

> +            uint32_t eax;
> +            uint32_t host_phys_bits = 0;
> +            static bool warned;
> +
> +            host_cpuid(0x80000000, 0, &eax, NULL, NULL, NULL);
> +            if (eax >= 0x80000008) {
> +                host_cpuid(0x80000008, 0, &eax, NULL, NULL, NULL);
> +                /* Note: According to AMD doc 25481 rev 2.34 they have a field
> +                 * at 23:16 that can specify a maximum physical address bits for
> +                 * the guest that can override this value; but I've not seen
> +                 * anything with that set.
> +                 */
> +                host_phys_bits = eax & 0xff;
> +            } else {
> +                /* It's an odd 64 bit machine that doesn't have the leaf for
> +                 * physical address bits; fall back to 36 that's most older
> +                 * Intel.
> +                 */
> +                host_phys_bits = 36;
> +            }

I would love to see this logic moved inside a
x86_host_phys_bits() function.

> +
> +            if (cpu->phys_bits == 0) {
> +                /* The user asked for us to use the host physical bits */
> +                cpu->phys_bits = host_phys_bits;
> +            }
> +
> +            /* Print a warning if the user set it to a value that's not the
> +             * host value.
> +             */
> +            if (cpu->phys_bits != host_phys_bits && cpu->phys_bits != 9999 &&
> +                !warned) {
> +                error_report("Warning: Host physical bits (%u)"
> +                                 " does not match phys_bits (%u)",
> +                                 host_phys_bits, cpu->phys_bits);
> +                warned = true;

The name of the user-visible property is "phys-bits", not
phys_bits. Maybe we could say "does not match phys-bits
property".

> +            }
> +        } else {
> +            if (cpu->phys_bits == 0) {
> +                error_setg(errp, "phys_bits can not be read from the host in"
> +                                 " TCG mode");
> +                return;
> +            }
> +        }
>          /* 9999 is a special meaning 'use the old default',
>           */
>          if (cpu->phys_bits == 9999) {
> -- 
> 2.7.4
> 

-- 
Eduardo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/4] x86: Allow physical address bits to be set
  2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 1/4] x86: Allow physical address bits to be set Dr. David Alan Gilbert (git)
  2016-07-06 19:01   ` Eduardo Habkost
@ 2016-07-06 19:57   ` Eduardo Habkost
  1 sibling, 0 replies; 14+ messages in thread
From: Eduardo Habkost @ 2016-07-06 19:57 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git); +Cc: qemu-devel, pbonzini, marcel, mst, kraxel

On Tue, Jul 05, 2016 at 08:03:15PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Currently QEMU sets the x86 number of physical address bits to the
> magic number 40.  This is only correct on some small AMD systems;
> Intel systems tend to have 36, 39, 46 bits, and large AMD systems
> tend to have 48.
> 
> Having the value different from your actual hardware is detectable
> by the guest and in principal can cause problems;
> The current limit of 40 stops TB VMs being created by those lucky
> enough to have that much.
> 
> This patch lets you set the physical bits by a cpu property but
> defaults to the same existing magic 40.
> 
> I've removed the ancient warning about the 42 bit limit in exec.c;
> I can't find that limit in there and no one else seems to know where
> it is.
> 
> We use a magic value of 9999 as the property default so that we can
> later distinguish between the default and a user set value.
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  target-i386/cpu.c | 43 ++++++++++++++++++++++++++++++++++---------
>  target-i386/cpu.h |  3 +++
>  2 files changed, 37 insertions(+), 9 deletions(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 3bd3cfc..74d53c5 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -2602,17 +2602,13 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>          break;
>      case 0x80000008:
>          /* virtual & phys address size in low 2 bytes. */
> -/* XXX: This value must match the one used in the MMU code. */
>          if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
> -            /* 64 bit processor */
> -/* XXX: The physical address space is limited to 42 bits in exec.c. */
> -            *eax = 0x00003028; /* 48 bits virtual, 40 bits physical */
> +            /* 64 bit processor, 48 bits virtual, configurable
> +             * physical bits.
> +             */
> +            *eax = 0x00003000 + cpu->phys_bits;
>          } else {
> -            if (env->features[FEAT_1_EDX] & CPUID_PSE36) {
> -                *eax = 0x00000024; /* 36 bits physical */
> -            } else {
> -                *eax = 0x00000020; /* 32 bits physical */
> -            }
> +            *eax = cpu->phys_bits;
>          }
>          *ebx = 0;
>          *ecx = 0;
> @@ -2956,7 +2952,35 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>             & CPUID_EXT2_AMD_ALIASES);
>      }
>  
> +    if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
> +        /* 9999 is a special meaning 'use the old default',
> +         */
> +        if (cpu->phys_bits == 9999) {
> +            /* this must match the PHYS_ADDR_MASK in cpu.h */
> +            cpu->phys_bits = 40;
> +        }
> +        if (cpu->phys_bits > TARGET_PHYS_ADDR_SPACE_BITS ||
> +            cpu->phys_bits < 32) {

Can it be set to anything except 40/36 in TCG? It looks like the
TCG reserved-bits code added by Paolo in
e8f6d00c30ed88910d0d985f4b2bf41654172ceb assumes the reserved
bits are defined at compile time. It even has a comment saying:

/* XXX: This value should match the one returned by CPUID and in exec.c */

(Paolo, what's the value "in exec.c" mentioned in this comment?)

To make it better documented, I would rework the cpu.h macros
like this:

cpu.h:

  #if defined(TARGET_X86_64)
  #define TCG_PHYS_ADDR_LIMIT 40
  #else
  #define TCG_PHYS_ADDR_LIMIT 36
  #endif

  #define TCG_PHYS_ADDR_MASK MAKE_64BIT_MASK(0, TCG_PHYS_ADDR_LIMIT)

  #define KVM_PHYS_ADDR_LIMIT 52

  QEMU_BUILD_BUG_ON(TARGET_PHYS_ADDR_SPACE_BITS > KVM_PHYS_ADDR_LIMIT)

cpu.c:

  if (kvm_enabled()) {
      if (cpu->phys_bits > KVM_PHYS_ADDR_LIMIT() ||
          cpu->phys_bits < 32) {
          error_setg(errp, "phys-bits should be between 32 and %u (but is %u)",
                           phys_addr_limit(), cpu->phys_bits);
          return;
      }
  } else {
     if (cpu->phys_bits != TCG_PHYS_ADDR_LIMIT) {
         error_setg(errp, "TCG only supports phys-bits=%u", TCG_PHYS_ADDR_LIMIT);
         return;
     }
  }

> +            error_setg(errp, "phys_bits should be between 32 and %u or 0 to"
> +                             " use host size (but is %u)",
> +                             TARGET_PHYS_ADDR_SPACE_BITS, cpu->phys_bits);

The "0 to use host size" part is not valid until we apply patch
4/4.

> +            return;
> +        }
> +    } else {
> +        /* For 32 bit systems don't use the user set value, but keep
> +         * phys_bits consistent with what we tell the guest.
> +         */
> +        if (cpu->phys_bits != 9999) {
> +            error_setg(errp, "phys_bits is not user-configurable in 32 bit");
> +            return;
> +        }
>  
> +        if (env->features[FEAT_1_EDX] & CPUID_PSE36) {
> +            cpu->phys_bits = 36;
> +        } else {
> +            cpu->phys_bits = 32;
> +        }
> +    }
>      cpu_exec_init(cs, &error_abort);
>  
>      if (tcg_enabled()) {
> @@ -3257,6 +3281,7 @@ static Property x86_cpu_properties[] = {
>      DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
>      DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
>      DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
> +    DEFINE_PROP_UINT32("phys-bits", X86CPU, phys_bits, 9999),
>      DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, 0),
>      DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, 0),
>      DEFINE_PROP_UINT32("xlevel2", X86CPU, env.cpuid_xlevel2, 0),
> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> index 474b0b9..221b1a2 100644
> --- a/target-i386/cpu.h
> +++ b/target-i386/cpu.h
> @@ -1181,6 +1181,9 @@ struct X86CPU {
>      /* Compatibility bits for old machine types: */
>      bool enable_cpuid_0xb;
>  
> +    /* Number of physical address bits supported */
> +    uint32_t phys_bits;
> +
>      /* in order to simplify APIC support, we leave this pointer to the
>         user */
>      struct DeviceState *apic_state;
> -- 
> 2.7.4
> 

-- 
Eduardo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/4] x86: Allow physical address bits to be set
  2016-07-06 19:01   ` Eduardo Habkost
@ 2016-07-07 16:39     ` Dr. David Alan Gilbert
  2016-07-07 18:02       ` Eduardo Habkost
  0 siblings, 1 reply; 14+ messages in thread
From: Dr. David Alan Gilbert @ 2016-07-07 16:39 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: qemu-devel, pbonzini, marcel, mst, kraxel

* Eduardo Habkost (ehabkost@redhat.com) wrote:
> On Tue, Jul 05, 2016 at 08:03:15PM +0100, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Currently QEMU sets the x86 number of physical address bits to the
> > magic number 40.  This is only correct on some small AMD systems;
> > Intel systems tend to have 36, 39, 46 bits, and large AMD systems
> > tend to have 48.
> > 
> > Having the value different from your actual hardware is detectable
> > by the guest and in principal can cause problems;
> > The current limit of 40 stops TB VMs being created by those lucky
> > enough to have that much.
> > 
> > This patch lets you set the physical bits by a cpu property but
> > defaults to the same existing magic 40.
> > 
> > I've removed the ancient warning about the 42 bit limit in exec.c;
> > I can't find that limit in there and no one else seems to know where
> > it is.
> > 
> > We use a magic value of 9999 as the property default so that we can
> > later distinguish between the default and a user set value.
> 
> I'd prefer to use -1 or 0xFFFFFFFF (UINT32_MAX) to indicate it
> was not set by the user, and not document it as "use the old
> default" but just as "it was not set explicitly".
> 
> This won't allow us to differentiate between "set by user" and
> "set by machine-type compat_props" in the future. But I would
> prefer to use a MachineClass field or boolean property than magic
> numbers for that, anyway.
> 
> If you replace 9999 with UINT32_MAX or 0xFFFFFFFF in this patch:
> 
> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
> 
> BTW, using 0 to indicate "not set" would be acceptable, too, but
> the magic 0 value in patch 4/4 would need to be replaced with a
> boolean "host-phys-bits" property. But I prefer boolean properties
> instead of magic numbers, anyway.

OK, lets do that then.  I'll use 0 here for the magic default
and then add the host-phys-bits boolean that strictly overrides
the phys-bits numeric.

I didn't use UINT32_MAX or 0xffffffff because I wasn't convinced
how that would interact with the machine-type/compat code which
uses strings to represent default values for machine types.

(And we have ~40 cases of DEFINE_PROP_UINT*(.....,-1) which I just
find very wrong).

Dave

> 
> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  target-i386/cpu.c | 43 ++++++++++++++++++++++++++++++++++---------
> >  target-i386/cpu.h |  3 +++
> >  2 files changed, 37 insertions(+), 9 deletions(-)
> > 
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > index 3bd3cfc..74d53c5 100644
> > --- a/target-i386/cpu.c
> > +++ b/target-i386/cpu.c
> > @@ -2602,17 +2602,13 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >          break;
> >      case 0x80000008:
> >          /* virtual & phys address size in low 2 bytes. */
> > -/* XXX: This value must match the one used in the MMU code. */
> >          if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
> > -            /* 64 bit processor */
> > -/* XXX: The physical address space is limited to 42 bits in exec.c. */
> > -            *eax = 0x00003028; /* 48 bits virtual, 40 bits physical */
> > +            /* 64 bit processor, 48 bits virtual, configurable
> > +             * physical bits.
> > +             */
> > +            *eax = 0x00003000 + cpu->phys_bits;
> >          } else {
> > -            if (env->features[FEAT_1_EDX] & CPUID_PSE36) {
> > -                *eax = 0x00000024; /* 36 bits physical */
> > -            } else {
> > -                *eax = 0x00000020; /* 32 bits physical */
> > -            }
> > +            *eax = cpu->phys_bits;
> >          }
> >          *ebx = 0;
> >          *ecx = 0;
> > @@ -2956,7 +2952,35 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> >             & CPUID_EXT2_AMD_ALIASES);
> >      }
> >  
> > +    if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
> > +        /* 9999 is a special meaning 'use the old default',
> > +         */
> > +        if (cpu->phys_bits == 9999) {
> > +            /* this must match the PHYS_ADDR_MASK in cpu.h */
> > +            cpu->phys_bits = 40;
> > +        }
> > +        if (cpu->phys_bits > TARGET_PHYS_ADDR_SPACE_BITS ||
> > +            cpu->phys_bits < 32) {
> > +            error_setg(errp, "phys_bits should be between 32 and %u or 0 to"
> > +                             " use host size (but is %u)",
> > +                             TARGET_PHYS_ADDR_SPACE_BITS, cpu->phys_bits);
> > +            return;
> > +        }
> > +    } else {
> > +        /* For 32 bit systems don't use the user set value, but keep
> > +         * phys_bits consistent with what we tell the guest.
> > +         */
> > +        if (cpu->phys_bits != 9999) {
> > +            error_setg(errp, "phys_bits is not user-configurable in 32 bit");
> > +            return;
> > +        }
> >  
> > +        if (env->features[FEAT_1_EDX] & CPUID_PSE36) {
> > +            cpu->phys_bits = 36;
> > +        } else {
> > +            cpu->phys_bits = 32;
> > +        }
> > +    }
> >      cpu_exec_init(cs, &error_abort);
> >  
> >      if (tcg_enabled()) {
> > @@ -3257,6 +3281,7 @@ static Property x86_cpu_properties[] = {
> >      DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
> >      DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
> >      DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
> > +    DEFINE_PROP_UINT32("phys-bits", X86CPU, phys_bits, 9999),
> >      DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, 0),
> >      DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, 0),
> >      DEFINE_PROP_UINT32("xlevel2", X86CPU, env.cpuid_xlevel2, 0),
> > diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> > index 474b0b9..221b1a2 100644
> > --- a/target-i386/cpu.h
> > +++ b/target-i386/cpu.h
> > @@ -1181,6 +1181,9 @@ struct X86CPU {
> >      /* Compatibility bits for old machine types: */
> >      bool enable_cpuid_0xb;
> >  
> > +    /* Number of physical address bits supported */
> > +    uint32_t phys_bits;
> > +
> >      /* in order to simplify APIC support, we leave this pointer to the
> >         user */
> >      struct DeviceState *apic_state;
> > -- 
> > 2.7.4
> > 
> 
> -- 
> Eduardo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 3/4] x86: fill high bits of mtrr mask
  2016-07-06 19:18   ` Eduardo Habkost
@ 2016-07-07 17:51     ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 14+ messages in thread
From: Dr. David Alan Gilbert @ 2016-07-07 17:51 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: qemu-devel, pbonzini, marcel, mst, kraxel

* Eduardo Habkost (ehabkost@redhat.com) wrote:
> On Tue, Jul 05, 2016 at 08:03:17PM +0100, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Fill the bits between 51..number-of-physical-address-bits in the
> > MTRR_PHYSMASKn variable range mtrr masks so that they're consistent
> > in the migration stream irrespective of the physical address space
> > of the source VM in a migration.
> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> [...]
> > @@ -2084,6 +2085,28 @@ static int kvm_get_msrs(X86CPU *cpu)
> >      }
> >  
> >      assert(ret == cpu->kvm_msr_buf->nmsrs);
> > +    /*
> > +     * MTRR masks: Each mask consists of 5 parts
> > +     * a  10..0: must be zero
> > +     * b  11   : valid bit
> > +     * c n-1.12: actual mask bits
> > +     * d  51..n: reserved must be zero
> > +     * e  63.52: reserved must be zero
> > +     *
> > +     * 'n' is the number of physical bits supported by the CPU and is
> > +     * apparently always <= 52.   We know our 'n' but don't know what
> > +     * the destinations 'n' is; it might be smaller, in which case
> > +     * it masks (c) on loading. It might be larger, in which case
> > +     * we fill 'd' so that d..c is consistent irrespetive of the 'n'
> > +     * we're migrating to.
> > +     */
> > +    if (cpu->fill_mtrr_mask && cpu->phys_bits < TARGET_PHYS_ADDR_SPACE_BITS) {
> 
> As we already ensure phys_bits <= TARGET_PHYS_ADDR_SPACE_BITS in
> patch 1/4, what about just doing this:
> 
>   assert(cpu->phys_bits <= TARGET_PHYS_ADDR_SPACE_BITS)
> 
> 
> > +        mtrr_top_bits = MAKE_64BIT_MASK(cpu->phys_bits,
> > +                                TARGET_PHYS_ADDR_SPACE_BITS - cpu->phys_bits);
> 
> What's the actual meaning of TARGET_PHYS_ADDR_SPACE_BITS? Can it
> ever change in the future?  Should a change in
> TARGET_PHYS_ADDR_SPACE_BITS really change the migration format?
> 
> To make sure we won't have any surprises if
> TARGET_PHYS_ADDR_SPACE_BITS change, I would change the code to:
> 
>  QEMU_BUILD_BUG_ON(TARGET_PHYS_ADDR_SPACE_BITS > 52);
>  assert(cpu->phs_bits <= TARGET_PHYS_ADDR_SPACE_BITS);
>  mtrr_top_bits = MAKE_64BIT_MASK(cpu->phys_bits, 52 - cpu->phys_bits);

Done.

All limits tend to stretch with time, so no it wouldn't surprise me if that
happened some day.

Dave

> 
> 
> > +    } else {
> > +        mtrr_top_bits = 0;
> > +    }
> > +
> >      for (i = 0; i < ret; i++) {
> >          uint32_t index = msrs[i].index;
> >          switch (index) {
> > @@ -2279,7 +2302,8 @@ static int kvm_get_msrs(X86CPU *cpu)
> >              break;
> >          case MSR_MTRRphysBase(0) ... MSR_MTRRphysMask(MSR_MTRRcap_VCNT - 1):
> >              if (index & 1) {
> > -                env->mtrr_var[MSR_MTRRphysIndex(index)].mask = msrs[i].data;
> > +                env->mtrr_var[MSR_MTRRphysIndex(index)].mask = msrs[i].data |
> > +                                                               mtrr_top_bits;
> >              } else {
> >                  env->mtrr_var[MSR_MTRRphysIndex(index)].base = msrs[i].data;
> >              }
> > -- 
> > 2.7.4
> > 
> 
> -- 
> Eduardo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/4] x86: Allow physical address bits to be set
  2016-07-07 16:39     ` Dr. David Alan Gilbert
@ 2016-07-07 18:02       ` Eduardo Habkost
  2016-07-07 18:36         ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 14+ messages in thread
From: Eduardo Habkost @ 2016-07-07 18:02 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, pbonzini, marcel, mst, kraxel

On Thu, Jul 07, 2016 at 05:39:14PM +0100, Dr. David Alan Gilbert wrote:
[...]
> * Eduardo Habkost (ehabkost@redhat.com) wrote:
> > On Tue, Jul 05, 2016 at 08:03:15PM +0100, Dr. David Alan Gilbert (git) wrote:
> > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > 
> > > Currently QEMU sets the x86 number of physical address bits to the
> > > magic number 40.  This is only correct on some small AMD systems;
> > > Intel systems tend to have 36, 39, 46 bits, and large AMD systems
> > > tend to have 48.
> > > 
> > > Having the value different from your actual hardware is detectable
> > > by the guest and in principal can cause problems;
> > > The current limit of 40 stops TB VMs being created by those lucky
> > > enough to have that much.
> > > 
> > > This patch lets you set the physical bits by a cpu property but
> > > defaults to the same existing magic 40.
> > > 
> > > I've removed the ancient warning about the 42 bit limit in exec.c;
> > > I can't find that limit in there and no one else seems to know where
> > > it is.
> > > 
> > > We use a magic value of 9999 as the property default so that we can
> > > later distinguish between the default and a user set value.
> > 
> > I'd prefer to use -1 or 0xFFFFFFFF (UINT32_MAX) to indicate it
> > was not set by the user, and not document it as "use the old
> > default" but just as "it was not set explicitly".
> > 
> > This won't allow us to differentiate between "set by user" and
> > "set by machine-type compat_props" in the future. But I would
> > prefer to use a MachineClass field or boolean property than magic
> > numbers for that, anyway.
> > 
> > If you replace 9999 with UINT32_MAX or 0xFFFFFFFF in this patch:
> > 
> > Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
> > 
> > BTW, using 0 to indicate "not set" would be acceptable, too, but
> > the magic 0 value in patch 4/4 would need to be replaced with a
> > boolean "host-phys-bits" property. But I prefer boolean properties
> > instead of magic numbers, anyway.
> 
> OK, lets do that then.  I'll use 0 here for the magic default
> and then add the host-phys-bits boolean that strictly overrides
> the phys-bits numeric.
> 
> I didn't use UINT32_MAX or 0xffffffff because I wasn't convinced
> how that would interact with the machine-type/compat code which
> uses strings to represent default values for machine types.

Good point. Integer parsing code in QEMU is ... interesting.

> 
> (And we have ~40 cases of DEFINE_PROP_UINT*(.....,-1) which I just
> find very wrong).

Just wait until you see how string-input-visitor.c parses
uint64_t values.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/4] x86: Allow physical address bits to be set
  2016-07-07 18:02       ` Eduardo Habkost
@ 2016-07-07 18:36         ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 14+ messages in thread
From: Dr. David Alan Gilbert @ 2016-07-07 18:36 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: qemu-devel, pbonzini, marcel, mst, kraxel

* Eduardo Habkost (ehabkost@redhat.com) wrote:
> On Thu, Jul 07, 2016 at 05:39:14PM +0100, Dr. David Alan Gilbert wrote:
> [...]
> > * Eduardo Habkost (ehabkost@redhat.com) wrote:
> > > On Tue, Jul 05, 2016 at 08:03:15PM +0100, Dr. David Alan Gilbert (git) wrote:
> > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > 
> > > > Currently QEMU sets the x86 number of physical address bits to the
> > > > magic number 40.  This is only correct on some small AMD systems;
> > > > Intel systems tend to have 36, 39, 46 bits, and large AMD systems
> > > > tend to have 48.
> > > > 
> > > > Having the value different from your actual hardware is detectable
> > > > by the guest and in principal can cause problems;
> > > > The current limit of 40 stops TB VMs being created by those lucky
> > > > enough to have that much.
> > > > 
> > > > This patch lets you set the physical bits by a cpu property but
> > > > defaults to the same existing magic 40.
> > > > 
> > > > I've removed the ancient warning about the 42 bit limit in exec.c;
> > > > I can't find that limit in there and no one else seems to know where
> > > > it is.
> > > > 
> > > > We use a magic value of 9999 as the property default so that we can
> > > > later distinguish between the default and a user set value.
> > > 
> > > I'd prefer to use -1 or 0xFFFFFFFF (UINT32_MAX) to indicate it
> > > was not set by the user, and not document it as "use the old
> > > default" but just as "it was not set explicitly".
> > > 
> > > This won't allow us to differentiate between "set by user" and
> > > "set by machine-type compat_props" in the future. But I would
> > > prefer to use a MachineClass field or boolean property than magic
> > > numbers for that, anyway.
> > > 
> > > If you replace 9999 with UINT32_MAX or 0xFFFFFFFF in this patch:
> > > 
> > > Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
> > > 
> > > BTW, using 0 to indicate "not set" would be acceptable, too, but
> > > the magic 0 value in patch 4/4 would need to be replaced with a
> > > boolean "host-phys-bits" property. But I prefer boolean properties
> > > instead of magic numbers, anyway.
> > 
> > OK, lets do that then.  I'll use 0 here for the magic default
> > and then add the host-phys-bits boolean that strictly overrides
> > the phys-bits numeric.
> > 
> > I didn't use UINT32_MAX or 0xffffffff because I wasn't convinced
> > how that would interact with the machine-type/compat code which
> > uses strings to represent default values for machine types.
> 
> Good point. Integer parsing code in QEMU is ... interesting.
> 
> > 
> > (And we have ~40 cases of DEFINE_PROP_UINT*(.....,-1) which I just
> > find very wrong).
> 
> Just wait until you see how string-input-visitor.c parses
> uint64_t values.

Blech!

Dave

> 
> -- 
> Eduardo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 4/4] x86: Set physical address bits based on host
  2016-07-06 19:32   ` Eduardo Habkost
@ 2016-07-07 18:46     ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 14+ messages in thread
From: Dr. David Alan Gilbert @ 2016-07-07 18:46 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: qemu-devel, pbonzini, marcel, mst, kraxel

* Eduardo Habkost (ehabkost@redhat.com) wrote:
> On Tue, Jul 05, 2016 at 08:03:18PM +0100, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > A special case based on the previous phys-bits property; if it's
> > the magic value 0 then use the hosts capabilities.
> > 
> > We can also use the value we read from the host to check the users
> > explicitly set value and warn them if it doesn't match.
> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  target-i386/cpu.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 53 insertions(+)
> > 
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > index f33cf58..6ebd26b 100644
> > --- a/target-i386/cpu.c
> > +++ b/target-i386/cpu.c
> > @@ -2952,7 +2952,60 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> >             & CPUID_EXT2_AMD_ALIASES);
> >      }
> >  
> > +    /* For 64bit systems think about the number of physical bits to present.
> > +     * ideally this should be the same as the host; anything other than matching
> > +     * the host can cause incorrect guest behaviour.
> > +     * QEMU used to pick the magic value of 40 bits that corresponds to
> > +     * consumer AMD devices but nothing esle.
> > +     */
> >      if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
> > +        if (kvm_enabled()) {
> > +            /* Read the hosts physical address size, and compare it to what we
> > +             * were asked for; note old machine types default to 40 bits
> > +             */
> 
> Isn't the "old machine types default to 40 bits" part obsolete,
> now that you use 9999 to indicate it was not set explicitly?

Yes it is; gone.

> Also, the observation makes me confused: I know the old machine
> types default to 40 bits, but I don't know why I need to know
> that to understand the host_cpuid() logic below.

Yes that comment also made more sense the way it used to be arranged.

> 
> > +            uint32_t eax;
> > +            uint32_t host_phys_bits = 0;
> > +            static bool warned;
> > +
> > +            host_cpuid(0x80000000, 0, &eax, NULL, NULL, NULL);
> > +            if (eax >= 0x80000008) {
> > +                host_cpuid(0x80000008, 0, &eax, NULL, NULL, NULL);
> > +                /* Note: According to AMD doc 25481 rev 2.34 they have a field
> > +                 * at 23:16 that can specify a maximum physical address bits for
> > +                 * the guest that can override this value; but I've not seen
> > +                 * anything with that set.
> > +                 */
> > +                host_phys_bits = eax & 0xff;
> > +            } else {
> > +                /* It's an odd 64 bit machine that doesn't have the leaf for
> > +                 * physical address bits; fall back to 36 that's most older
> > +                 * Intel.
> > +                 */
> > +                host_phys_bits = 36;
> > +            }
> 
> I would love to see this logic moved inside a
> x86_host_phys_bits() function.

Done.

> > +
> > +            if (cpu->phys_bits == 0) {
> > +                /* The user asked for us to use the host physical bits */
> > +                cpu->phys_bits = host_phys_bits;
> > +            }
> > +
> > +            /* Print a warning if the user set it to a value that's not the
> > +             * host value.
> > +             */
> > +            if (cpu->phys_bits != host_phys_bits && cpu->phys_bits != 9999 &&
> > +                !warned) {
> > +                error_report("Warning: Host physical bits (%u)"
> > +                                 " does not match phys_bits (%u)",
> > +                                 host_phys_bits, cpu->phys_bits);
> > +                warned = true;
> 
> The name of the user-visible property is "phys-bits", not
> phys_bits. Maybe we could say "does not match phys-bits
> property".

Done.

Dave

> 
> > +            }
> > +        } else {
> > +            if (cpu->phys_bits == 0) {
> > +                error_setg(errp, "phys_bits can not be read from the host in"
> > +                                 " TCG mode");
> > +                return;
> > +            }
> > +        }
> >          /* 9999 is a special meaning 'use the old default',
> >           */
> >          if (cpu->phys_bits == 9999) {
> > -- 
> > 2.7.4
> > 
> 
> -- 
> Eduardo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-07-07 18:46 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-05 19:03 [Qemu-devel] [PATCH v3 0/4] x86: Physical address limit patches Dr. David Alan Gilbert (git)
2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 1/4] x86: Allow physical address bits to be set Dr. David Alan Gilbert (git)
2016-07-06 19:01   ` Eduardo Habkost
2016-07-07 16:39     ` Dr. David Alan Gilbert
2016-07-07 18:02       ` Eduardo Habkost
2016-07-07 18:36         ` Dr. David Alan Gilbert
2016-07-06 19:57   ` Eduardo Habkost
2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 2/4] x86: Mask mtrr mask based on CPU physical address limits Dr. David Alan Gilbert (git)
2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 3/4] x86: fill high bits of mtrr mask Dr. David Alan Gilbert (git)
2016-07-06 19:18   ` Eduardo Habkost
2016-07-07 17:51     ` Dr. David Alan Gilbert
2016-07-05 19:03 ` [Qemu-devel] [PATCH v3 4/4] x86: Set physical address bits based on host Dr. David Alan Gilbert (git)
2016-07-06 19:32   ` Eduardo Habkost
2016-07-07 18:46     ` Dr. David Alan Gilbert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.