[Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models

* [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
@ 2015-06-08 19:07 Eduardo Habkost
  2015-06-08 19:07 ` [Qemu-devel] [PATCH 1/2] target-i386: Introduce "-cpu custom" Eduardo Habkost
                   ` (3 more replies)
  0 siblings, 4 replies; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-08 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: mimu, borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark,
	Andreas Färber, rth

The problem:

The existing libvirt APIs assume that if a given CPU model is runnable in a
host kernel+hardware combination, it will always be runnable on that host even
if the machine-type changes.

That assumption is implied in some of libvirt's interfaces, for example, at:

1) Host capabilities, which let callers know the set of CPU models
   that can run in a host:
   https://libvirt.org/formatcaps.html#elementHost

   "virsh capabilities" returns a CPU model name + CPU feature list, assuming
   that a CPU model name has a meaning that's independent from the
   machine-type.

2) The function that checks if a given CPU model definition
   is compatible with a host (virConnectCompareCPU()),
   which does not take the machine-type as argument:
   http://libvirt.org/html/libvirt-libvirt-host.html#virConnectCompareCPU

But that assumption is not true, as QEMU changes CPU models in new
machine-types when fixing bugs, or when new features (previously unsupported by
QEMU, TCG or KVM) get implemented.

The solution:

libvirt can solve this problem partially by making sure every feature in a CPU
model is explicitly configured, instead of (incorrectly) expecting that a named
CPU model will never change in QEMU. But this doesn't solve the problem
completely, because it is still possible that new features unknown to libvirt
get enabled in the default CPU model in future machine-types (that's very
likely to happen when we introduce new KVM features, for example).

So, to make sure no new feature will ever be enabled without the knowledge of
libvirt, add a "-cpu custom" mode, where no CPU model data is loaded at all,
and everything needs to be configured explicitly using CPU properties. That
means no CPU features will ever change depending on machine-type or accelerator
capabilities when using "-cpu custom".
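
As a rough illustration of what "everything configured explicitly" could
look like from the management side, here is a minimal Python 3 sketch that
builds the value of the -cpu option starting from "custom". The helper name
build_cpu_option and the example values are hypothetical, and the sketch
assumes the key=value form of -cpu parameters and that no value contains a
comma:

```python
def build_cpu_option(props):
    """Return a value for QEMU's -cpu option, starting from "custom".

    props maps CPU property names to values; booleans become on/off.
    """
    parts = ["custom"]
    for name in sorted(props):
        value = props[name]
        # bool must be checked first, since bool is a subclass of int.
        if isinstance(value, bool):
            text = "on" if value else "off"
        else:
            text = str(value)
        if "," in text:
            # Assumption: this sketch does not attempt any comma escaping.
            raise ValueError("value for %r contains a comma" % name)
        parts.append("%s=%s" % (name, text))
    return ",".join(parts)


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the shape of the option.
    example = {"level": 13, "vendor": "GenuineIntel", "family": 6, "sse2": True}
    print("-cpu", build_cpu_option(example))
```

Sorting the property names keeps the generated option stable between runs,
which makes it easier to compare command lines.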

                              * * *

I know that this is basically the opposite of what we were aiming at in the
last few month^Wyears, where we were struggling to implement probing interfaces
to expose machine-type-dependent data for libvirt. But, at least the fact that
we could implement the new approach using a 9-line patch means we were still
going in the right direction. :)

To help libvirt in the transition, an x86-cpu-model-dump script is provided,
that will generate a config file that can be loaded using -readconfig, based on
the -cpu and -machine options provided in the command-line.

                              * * *

This is basically the same version I sent as an RFC in April. A git tree is
available at:

  git://github.com/ehabkost/qemu-hacks.git work/x86-cpu-custom-model

Eduardo Habkost (2):
  target-i386: Introduce "-cpu custom"
  scripts: x86-cpu-model-dump script

 scripts/x86-cpu-model-dump | 322 +++++++++++++++++++++++++++++++++++++++++++++
 target-i386/cpu.c          |  10 +-
 2 files changed, 331 insertions(+), 1 deletion(-)
 create mode 100755 scripts/x86-cpu-model-dump

-- 
2.1.0

* [Qemu-devel] [PATCH 1/2] target-i386: Introduce "-cpu custom"
  2015-06-08 19:07 [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models Eduardo Habkost
@ 2015-06-08 19:07 ` Eduardo Habkost
  2015-06-08 19:07 ` [Qemu-devel] [PATCH 2/2] scripts: x86-cpu-model-dump script Eduardo Habkost
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-08 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: mimu, borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark,
	Andreas Färber, rth

Now that we can configure everything in a CPU using QOM properties, add
a new CPU model name that won't load anything from the CPU model table.
That means no CPUID field will be initialized with any data that depends
on CPU model name, machine-type, or accelerator.

This will allow management software to control CPUID data completely
using the "-cpu" command-line option, or using global properties.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
 target-i386/cpu.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 4e7cdaa..4677784 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -3052,7 +3052,9 @@ static void x86_cpu_initfn(Object *obj)
         }
     }
 
-    x86_cpu_load_def(cpu, xcc->cpu_def, &error_abort);
+    if (xcc->cpu_def) {
+        x86_cpu_load_def(cpu, xcc->cpu_def, &error_abort);
+    }
 
     /* init various static tables used in TCG mode */
     if (tcg_enabled() && !inited) {
@@ -3182,6 +3184,11 @@ static const TypeInfo x86_cpu_type_info = {
     .class_init = x86_cpu_common_class_init,
 };
 
+static const TypeInfo custom_x86_cpu_type_info = {
+    .name = X86_CPU_TYPE_NAME("custom"),
+    .parent = TYPE_X86_CPU,
+};
+
 static void x86_cpu_register_types(void)
 {
     int i;
@@ -3190,6 +3197,7 @@ static void x86_cpu_register_types(void)
     for (i = 0; i < ARRAY_SIZE(builtin_x86_defs); i++) {
         x86_register_cpudef_type(&builtin_x86_defs[i]);
     }
+    type_register_static(&custom_x86_cpu_type_info);
 #ifdef CONFIG_KVM
     type_register_static(&host_x86_cpu_type_info);
 #endif
-- 
2.1.0

* [Qemu-devel] [PATCH 2/2] scripts: x86-cpu-model-dump script
  2015-06-08 19:07 [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models Eduardo Habkost
  2015-06-08 19:07 ` [Qemu-devel] [PATCH 1/2] target-i386: Introduce "-cpu custom" Eduardo Habkost
@ 2015-06-08 19:07 ` Eduardo Habkost
  2015-06-08 20:18 ` [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models Jiri Denemark
  2015-06-16 17:40 ` Eduardo Habkost
  3 siblings, 0 replies; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-08 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: mimu, borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark,
	Andreas Färber, rth

This is an example script that can be used to help generate a config
file that will reproduce a given CPU model from QEMU. The generated
config file can be loaded using "-readconfig" to make QEMU create CPUs
that will look exactly like the one used when x86-cpu-model-dump was run.
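
For reference, the generated file consists of one [global] section per CPU
property. A minimal Python 3 sketch of that output format follows (the
script itself targets Python 2.7). The helper names to_config_value and
write_global_sections are made up for this example, and the property
values in the usage part are hypothetical:

```python
import io


def to_config_value(value):
    """Convert a property value to the string form used in the config file."""
    # bool must be checked first, since bool is a subclass of int.
    if isinstance(value, bool):
        return "on" if value else "off"
    if isinstance(value, (int, str)):
        return str(value)
    raise TypeError("Unsupported property type: %r" % (type(value),))


def write_global_sections(output, props, driver="cpu"):
    """Write one [global] section per property, sorted by property name."""
    for name in sorted(props):
        output.write('[global]\n')
        output.write('driver = "%s"\n' % driver)
        output.write('property = "%s"\n' % name)
        output.write('value = "%s"\n' % to_config_value(props[name]))
        output.write('\n')


if __name__ == "__main__":
    buf = io.StringIO()
    # Hypothetical property values, for illustration only.
    write_global_sections(buf, {"level": 13, "sse2": True})
    print(buf.getvalue(), end="")
```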

A --self-test mode is implemented, to make sure the config file
generated by the script produces a 100% equivalent CPU when used
with "-cpu custom".
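
The property comparison at the heart of the self-test can be sketched in
Python 3 as follows. This is only an illustration of the idea, not the
script's code: diff_props is a hypothetical helper, and the dictionaries in
the usage part are made-up stand-ins for the QOM properties read from the
two QEMU instances:

```python
def diff_props(props1, props2, ignore=("type",)):
    """Return (name, value1, value2) for every property that differs.

    Properties listed in ignore are skipped; a property missing on one
    side is reported with None for that side.
    """
    keys = (set(props1) | set(props2)) - set(ignore)
    mismatches = []
    for key in sorted(keys):
        v1 = props1.get(key)
        v2 = props2.get(key)
        if v1 != v2:
            mismatches.append((key, v1, v2))
    return mismatches


if __name__ == "__main__":
    # Hypothetical property sets: only 'type' is expected to differ.
    named = {"type": "Haswell-x86_64-cpu", "level": 13, "sse2": True}
    custom = {"type": "custom-x86_64-cpu", "level": 13, "sse2": True}
    for name, v1, v2 in diff_props(named, custom):
        print("Property %r mismatch: %r != %r" % (name, v1, v2))
```

An empty result means the two CPUs are equivalent as far as their
properties are concerned.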

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
Changes v1 -> v2:
* Use "cpuid-" prefix instead of "feat-"
* Exit earlier if QEMU fails
* Exit code of the script will match QEMU or diff exit code

Changes v2 -> v3:
* Don't rely on "cpuid-" prefix for feature flag properties,
  simply look for known feature names based on cpu_map.xml
* Implement self-test mode inside the script, and check
  every single QOM property of the resulting CPU
* Don't use "kvmclock" property to check KVM_FEATURE_CLOCKSOURCE2
* More verbose assertion messages to help debugging
* Add '-d' argument for debugging
* Use the new "custom" CPU model for self-test
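
The feature-name lookup mentioned above boils down to mapping
(function, index, register, bit) tuples to names. A minimal Python 3
sketch, using only the XSAVE table from the script as input; the helper
names index_feature_names and decode_bits are invented for this example:

```python
# Same shape as one KNOWN_FEAT_NAMES entry: (function, index, register, names)
XSAVE_NAMES = (0xd, 1, "eax", ["xsaveopt", "xsavec", "xgetbv1", "xsaves"])


def index_feature_names(tables):
    """Build a {(func, idx, reg, bitnr): name} dict from per-register tables."""
    names = {}
    for func, idx, reg, bit_names in tables:
        for bitnr, name in enumerate(bit_names):
            if name:
                names[(func, idx, reg, bitnr)] = name
    return names


def decode_bits(names, func, idx, reg, value):
    """Return the names of all bits set in value, lowest bit first."""
    result = []
    for bitnr in range(32):
        if value & (1 << bitnr):
            # Fall back to a placeholder for bits that have no known name.
            result.append(names.get((func, idx, reg, bitnr), "bit%d" % bitnr))
    return result


if __name__ == "__main__":
    names = index_feature_names([XSAVE_NAMES])
    print(decode_bits(names, 0xd, 1, "eax", 0b1001))
```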
---
 scripts/x86-cpu-model-dump | 322 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 322 insertions(+)
 create mode 100755 scripts/x86-cpu-model-dump

diff --git a/scripts/x86-cpu-model-dump b/scripts/x86-cpu-model-dump
new file mode 100755
index 0000000..1654836
--- /dev/null
+++ b/scripts/x86-cpu-model-dump
@@ -0,0 +1,322 @@
+#!/usr/bin/env python2.7
+#
+# Script to dump CPU model information as a QEMU config file that can be loaded
+# using -readconfig
+#
+# Author: Eduardo Habkost <ehabkost@redhat.com>
+#
+# Copyright (c) 2015 Red Hat Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in
+# all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+# THE SOFTWARE.
+
+
+import sys, os, signal, tempfile, re, argparse, StringIO
+import xml.etree.ElementTree
+
+# Allow us to load the qmp/qmp.py module:
+sys.path.append(os.path.join(os.path.dirname(sys.argv[0]), 'qmp'))
+import qmp
+
+import logging
+logger = logging.getLogger('x86-cpu-model-dump')
+
+CPU_PATH = '/machine/icc-bridge/icc/child[0]'
+PROPS = set(['level',
+             'xlevel',
+             'xlevel2',
+             'vendor',
+             'family',
+             'model',
+             'stepping',
+             'model-id',
+            ])
+CPU_MAP = '/usr/share/libvirt/cpu_map.xml'
+
+# features that may not be on cpu_map.xml:
+KNOWN_FEAT_NAMES = [
+    # CPU feature aliases don't have properties, add some special feature
+    # names telling the script to ignore them:
+    (0x80000001, 0, 'edx', [
+        "fpu-ALIAS", "vme-ALIAS", "de-ALIAS", "pse-ALIAS",
+        "tsc-ALIAS", "msr-ALIAS", "pae-ALIAS", "mce-ALIAS",
+        "cx8-ALIAS", "apic-ALIAS", None, None,
+        "mtrr-ALIAS", "pge-ALIAS", "mca-ALIAS", "cmov-ALIAS",
+        "pat-ALIAS", "pse36-ALIAS", None, None,
+        None, None, None, "mmx-ALIAS",
+        "fxsr-ALIAS", None, None, None,
+        None, None, None, None,
+    ]),
+    # cpu_map.xml does not contain KVM feature flags:
+    (0x40000001, 0, 'eax', [
+        "kvmclock", "kvm-nopiodelay", "kvm-mmu", "kvmclock-ALIAS",
+        "kvm-asyncpf", "kvm-steal-time", "kvm-pv-eoi", "kvm-pv-unhalt",
+        None, None, None, None,
+        None, None, None, None,
+        None, None, None, None,
+        None, None, None, None,
+        "kvmclock-stable-bit", None, None, None,
+        None, None, None, None,
+    ]),
+    # cpu_map.xml does not have XSAVE flags:
+    (0xd, 1, 'eax', [
+        "xsaveopt", "xsavec", "xgetbv1", "xsaves",
+    ]),
+    # cpu_map.xml does not contain SVM flags:
+    (0x8000000a, 0, 'edx', [
+        "npt", "lbrv", "svm_lock", "nrip_save",
+        "tsc_scale", "vmcb_clean",  "flushbyasid", "decodeassists",
+        None, None, "pause_filter", None,
+        "pfthreshold", None, None, None,
+        None, None, None, None,
+        None, None, None, None,
+        None, None, None, None,
+        None, None, None, None,
+    ]),
+]
+
+def dbg(msg, *args):
+    logger.debug(msg, *args)
+    pass
+
+def value_to_string(v):
+    """Convert property value to string parseable by -global"""
+    t = type(v)
+    if t == bool:
+        return v and "on" or "off"
+    elif t == str or t == unicode:
+        return v
+    elif t == int:
+        return str(v)
+    else:
+        raise Exception("Unsupported property type: %r" % (t,))
+
+def propname(feat):
+    return feat.replace('_', '-')
+
+def load_feat_names(cpu_map):
+    """Load feature names from libvirt cpu_map.xml"""
+    cpumap = xml.etree.ElementTree.parse(cpu_map)
+    feat_names = {}
+
+    for func, idx, reg, names in KNOWN_FEAT_NAMES:
+        for bitnr, name in enumerate(names):
+            if name:
+                feat_names[(func, idx, reg, bitnr)] = name
+
+    for f in cpumap.getroot().findall("./arch[@name='x86']/feature"):
+        fname = f.attrib['name']
+        for cpuid in f.findall('cpuid'):
+            func = int(cpuid.attrib['function'], 0)
+            idx = 0
+            for reg in 'abcd':
+                regname = 'e%sx' % (reg)
+                if regname in cpuid.attrib:
+                    v = int(cpuid.attrib[regname], 0)
+                    for bitnr in range(32):
+                        bitval = (1 << bitnr)
+                        if v & bitval:
+                            feat_names[(func, idx, regname, bitnr)] = fname
+
+    return feat_names
+
+def get_all_props(qmp, path):
+    r = {}
+    props = qmp.command('qom-list', path=path)
+    for p in props:
+        value = qmp.command('qom-get', path=path, property=p['name'])
+        r[p['name']] = value
+    return r
+
+def dump_cpu_data(output, qmp, cpu_path, feat_names):
+    def get_prop(pname):
+        return qmp.command('qom-get', path=cpu_path, property=pname)
+
+    def pname_for_feature_bit(fw, bitnr):
+        func = fw['cpuid-input-eax']
+        idx = fw.get('cpuid-input-ecx', 0)
+        regname = fw['cpuid-register'].lower()
+        key = (func, idx, regname, bitnr)
+        keystr = "0x%x,0x%x,%s,%d" % (func, idx, regname, bitnr)
+        pname = feat_names.get(key)
+        if pname:
+            pname = propname(pname)
+        return pname
+
+    def enumerate_feature_props(fw_list):
+        for fw in fw_list:
+            value = fw['features']
+            for bitnr in range(32):
+                is_set = (value & (1 << bitnr)) != 0
+                pname = pname_for_feature_bit(fw, bitnr)
+
+                # special case for alias bits: ignore them
+                if pname and pname.endswith('-ALIAS'):
+                    continue
+
+                if pname is None:
+                    pname = 'no-property-for-%r-%d' % (fw, bitnr)
+
+                yield is_set, pname
+
+    props = qmp.command('qom-list', path=cpu_path)
+    props = set([prop['name'] for prop in props])
+
+    known_props = PROPS.copy()
+    feat_props = set([propname(feat) for feat in feat_names.values()])
+    known_props.update(feat_props)
+    known_props.intersection_update(props)
+
+    propdict = {}
+    for pname in known_props:
+        propdict[pname] = get_prop(pname)
+
+    # sanity-check feature-words:
+    for is_set, pname in enumerate_feature_props(get_prop('feature-words')):
+        # feature-word bits must match property:
+        assert propdict.get(pname, False) == is_set, \
+            "property (%s) is not %r" % (pname, is_set)
+
+    # bits set on filtered-features need property fixup:
+    for is_set, pname in enumerate_feature_props(get_prop('filtered-features')):
+        if is_set:
+            assert propdict.get(pname, False) == False, \
+                "filtered-feature %r is not off" % (pname)
+            propdict[pname] = True
+
+    for pname in sorted(propdict.keys()):
+        pvalue = propdict.get(pname)
+        output.write('[global]\n')
+        output.write('driver = "cpu"\n')
+        output.write('property = "%s"\n' % (pname))
+        output.write('value = "%s"\n' % (value_to_string(pvalue)))
+        output.write('\n')
+
+def run_qemu(qemu_bin, args):
+    sockdir = tempfile.mkdtemp()
+    sockpath = os.path.join(sockdir, 'monitor.sock')
+    pidfile = os.path.join(sockdir, 'pidfile')
+
+    try:
+        qemu_cmd = [qemu_bin]
+        qemu_cmd.extend(args)
+        qemu_cmd.append('-chardev')
+        qemu_cmd.append('socket,id=qmp0,path=%s,server,nowait' % (sockpath))
+        qemu_cmd.append('-qmp')
+        qemu_cmd.append('chardev:qmp0')
+        qemu_cmd.append('-daemonize')
+        qemu_cmd.append('-pidfile')
+        qemu_cmd.append(pidfile)
+
+        dbg("Running QEMU: %r" % (qemu_cmd))
+
+        ret = os.spawnvp(os.P_WAIT, qemu_bin, qemu_cmd)
+        if ret != 0:
+            raise Exception("Failed to start QEMU")
+
+        srv = qmp.QEMUMonitorProtocol(sockpath)
+        srv.connect()
+
+        yield srv
+    finally:
+        try:
+            pid = int(open(pidfile, 'r').read())
+            dbg('Killing QEMU, pid: %d' % (pid))
+            os.kill(pid, signal.SIGTERM)
+            os.waitpid(pid, 0)
+        except:
+            pass
+        try:
+            os.unlink(pidfile)
+        except:
+            pass
+        try:
+            os.unlink(sockpath)
+        except:
+            pass
+        os.rmdir(sockdir)
+
+def self_test(args, feat_names):
+    args1 = args.qemu_args + ['-cpu', args.selftest]
+    o1 = tempfile.NamedTemporaryFile()
+    q1 = run_qemu(args.qemu_bin, args1)
+    srv = q1.next()
+    dump_cpu_data(o1, srv, CPU_PATH, feat_names)
+    o1.flush()
+    props1 = get_all_props(srv, CPU_PATH)
+    q1.close()
+
+    args2 = args.qemu_args + ['-cpu', 'custom', '-readconfig', o1.name]
+
+    o2 = tempfile.NamedTemporaryFile()
+    q2 = run_qemu(args.qemu_bin, args2)
+    srv = q2.next()
+    dump_cpu_data(o2, srv, CPU_PATH, feat_names)
+    o2.flush()
+    props2 = get_all_props(srv, CPU_PATH)
+    q2.close()
+
+    v1 = open(o1.name, 'r').read()
+    v2 = open(o2.name, 'r').read()
+    assert v1 == v2
+
+    r = 0
+    props_to_check = set(props1.keys() + props2.keys())
+    # The 'type' property is the only one we expect to change:
+    props_to_check.difference_update(set(['type']))
+
+    for k in props_to_check:
+        p1 = props1.get(k)
+        p2 = props2.get(k)
+        if p1 != p2:
+            print >>sys.stderr, "Property %r mismatch:" % (k)
+            print >>sys.stderr, repr(p1)
+            print >>sys.stderr, repr(p2)
+            print >>sys.stderr, ''
+            r = 1
+    return r
+
+def main(argv):
+    parser = argparse.ArgumentParser(description='Dump CPU model as QEMU config')
+    parser.add_argument('qemu_bin', metavar='QEMU', type=str,
+                        help='Path to QEMU binary')
+    parser.add_argument('--self-test', '--selftest', metavar='CPU_MODEL',
+                        dest='selftest',
+                        help='Self-test script using -cpu CPU_MODEL')
+    parser.add_argument('-d', dest='debug', action='store_true',
+                        help='Enable debug messages')
+
+    # parse_known_args() won't stop because of QEMU command-line arguments
+    args, qemu_args = parser.parse_known_args(argv[1:])
+    args.qemu_args = qemu_args
+
+    if args.debug:
+        logging.basicConfig(level=logging.DEBUG)
+
+    feat_names = load_feat_names(CPU_MAP)
+
+    if args.selftest:
+        return self_test(args, feat_names)
+    else:
+        qemu = run_qemu(args.qemu_bin, args.qemu_args)
+        srv = qemu.next()
+        dump_cpu_data(sys.stdout, srv, CPU_PATH, feat_names)
+        qemu.close()
+
+if __name__ == '__main__':
+    sys.exit(main(sys.argv))
-- 
2.1.0

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-08 19:07 [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models Eduardo Habkost
  2015-06-08 19:07 ` [Qemu-devel] [PATCH 1/2] target-i386: Introduce "-cpu custom" Eduardo Habkost
  2015-06-08 19:07 ` [Qemu-devel] [PATCH 2/2] scripts: x86-cpu-model-dump script Eduardo Habkost
@ 2015-06-08 20:18 ` Jiri Denemark
  2015-06-09  8:56   ` Daniel P. Berrange
  2015-06-23 12:32   ` Andreas Färber
  2015-06-16 17:40 ` Eduardo Habkost
  3 siblings, 2 replies; 81+ messages in thread
From: Jiri Denemark @ 2015-06-08 20:18 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, qemu-devel, borntraeger, Igor Mammedov, Paolo Bonzini,
	Andreas Färber, rth

On Mon, Jun 08, 2015 at 16:07:38 -0300, Eduardo Habkost wrote:
...
> libvirt can solve this problem partially by making sure every feature in a CPU
> model is explicitly configured, instead of (incorrectly) expecting that a named
> CPU model will never change in QEMU. But this doesn't solve the problem
> completely, because it is still possible that new features unknown to libvirt
> get enabled in the default CPU model in future machine-types (that's very
> likely to happen when we introduce new KVM features, for example).
> 
> So, to make sure no new feature will ever be enabled without the knowledge of
> libvirt, add a "-cpu custom" mode, where no CPU model data is loaded at all,
> and everything needs to be configured explicitly using CPU properties. That
> means no CPU features will ever change depending on machine-type or accelerator
> capabilities when using "-cpu custom".
> 
>                               * * *
> 
> I know that this is basically the opposite of what we were aiming at in the
> last few month^Wyears, where we were struggling to implement probing interfaces
> to expose machine-type-dependent data for libvirt. But, at least the fact that
> we could implement the new approach using a 9-line patch means we were still
> going in the right direction. :)
> 
> To help libvirt in the transition, an x86-cpu-model-dump script is provided,
> that will generate a config file that can be loaded using -readconfig, based on
> the -cpu and -machine options provided in the command-line.

Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
configuration data to libvirt, but now I think it actually makes sense.
We already have a partial copy of CPU model definitions in libvirt
anyway, but as QEMU changes some CPU models in some machine types (and
libvirt does not do that) we have no real control over the guest CPU
configuration, while what we really want is full control to enforce a
stable guest ABI.

I will summarize my ideas on how libvirt should be using -cpu custom and
send them to libvirt-list soon.

Jirka

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-08 20:18 ` [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models Jiri Denemark
@ 2015-06-09  8:56   ` Daniel P. Berrange
  2015-06-09 13:16     ` Eduardo Habkost
  2015-06-23 12:32   ` Andreas Färber
  1 sibling, 1 reply; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-09  8:56 UTC (permalink / raw)
  To: Jiri Denemark
  Cc: mimu, qemu-devel, borntraeger, Igor Mammedov, Paolo Bonzini, rth,
	Andreas Färber, Eduardo Habkost

On Mon, Jun 08, 2015 at 10:18:35PM +0200, Jiri Denemark wrote:
> On Mon, Jun 08, 2015 at 16:07:38 -0300, Eduardo Habkost wrote:
> ...
> > libvirt can solve this problem partially by making sure every feature in a CPU
> > model is explicitly configured, instead of (incorrectly) expecting that a named
> > CPU model will never change in QEMU. But this doesn't solve the problem
> > completely, because it is still possible that new features unknown to libvirt
> > get enabled in the default CPU model in future machine-types (that's very
> > likely to happen when we introduce new KVM features, for example).
> > 
> > So, to make sure no new feature will ever be enabled without the knowledge of
> > libvirt, add a "-cpu custom" mode, where no CPU model data is loaded at all,
> > and everything needs to be configured explicitly using CPU properties. That
> > means no CPU features will ever change depending on machine-type or accelerator
> > capabilities when using "-cpu custom".
> > 
> >                               * * *
> > 
> > I know that this is basically the opposite of what we were aiming at in the
> > last few month^Wyears, where we were struggling to implement probing interfaces
> > to expose machine-type-dependent data for libvirt. But, at least the fact that
> > we could implement the new approach using a 9-line patch means we were still
> > going in the right direction. :)
> > 
> > To help libvirt in the transition, an x86-cpu-model-dump script is provided,
> > that will generate a config file that can be loaded using -readconfig, based on
> > the -cpu and -machine options provided in the command-line.
> 
> Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> configuration data to libvirt, but now I think it actually makes sense.
> We already have a partial copy of CPU model definitions in libvirt
> anyway, but as QEMU changes some CPU models in some machine types (and
> libvirt does not do that) we have no real control over the guest CPU
> configuration. While what we really want is full control to enforce
> stable guest ABI.

I, meanwhile, always wanted the full CPU config data in libvirt, so that
we could ensure libvirt was able to use the exact same CPU model setup
on other hypervisors too - e.g. Xen and VMware let us specify the CPUID masks,
so we could re-use the libvirt data there.

> I will summarize my ideas on how libvirt should be using -cpu custom and
> send them to libvirt-list soon.

These patches are obviously x86-only. Is there anything we need to worry
about for non-x86 architectures, I wonder? IIRC, all the non-x86 archs have
traditionally only needed CPU model names and don't really have the same
level of per-flag CPUID mask configurability that x86 has.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-09  8:56   ` Daniel P. Berrange
@ 2015-06-09 13:16     ` Eduardo Habkost
  0 siblings, 0 replies; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-09 13:16 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: mimu, qemu-devel, borntraeger, Igor Mammedov, Paolo Bonzini,
	Jiri Denemark, Andreas Färber, rth

On Tue, Jun 09, 2015 at 09:56:25AM +0100, Daniel P. Berrange wrote:
> On Mon, Jun 08, 2015 at 10:18:35PM +0200, Jiri Denemark wrote:
> > On Mon, Jun 08, 2015 at 16:07:38 -0300, Eduardo Habkost wrote:
> > ...
> > > libvirt can solve this problem partially by making sure every feature in a CPU
> > > model is explicitly configured, instead of (incorrectly) expecting that a named
> > > CPU model will never change in QEMU. But this doesn't solve the problem
> > > completely, because it is still possible that new features unknown to libvirt
> > > get enabled in the default CPU model in future machine-types (that's very
> > > likely to happen when we introduce new KVM features, for example).
> > > 
> > > So, to make sure no new feature will ever be enabled without the knowledge of
> > > libvirt, add a "-cpu custom" mode, where no CPU model data is loaded at all,
> > > and everything needs to be configured explicitly using CPU properties. That
> > > means no CPU features will ever change depending on machine-type or accelerator
> > > capabilities when using "-cpu custom".
> > > 
> > >                               * * *
> > > 
> > > I know that this is basically the opposite of what we were aiming at in the
> > > last few month^Wyears, where we were struggling to implement probing interfaces
> > > to expose machine-type-dependent data for libvirt. But, at least the fact that
> > > we could implement the new approach using a 9-line patch means we were still
> > > going in the right direction. :)
> > > 
> > > To help libvirt in the transition, an x86-cpu-model-dump script is provided,
> > > that will generate a config file that can be loaded using -readconfig, based on
> > > the -cpu and -machine options provided in the command-line.
> > 
> > Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > configuration data to libvirt, but now I think it actually makes sense.
> > We already have a partial copy of CPU model definitions in libvirt
> > anyway, but as QEMU changes some CPU models in some machine types (and
> > libvirt does not do that) we have no real control over the guest CPU
> > configuration. While what we really want is full control to enforce
> > stable guest ABI.
> 
> I meanwhile, always wanted the full CPU config data in libvirt, so that
> we could ensure libvirt was able to use the exact same CPU model setup
> on other hypervisors too - eg Xen, VMWare let us specify the CPUID masks
> so we could re-use the libvirt data there.
> 
> > I will summarize my ideas on how libvirt should be using -cpu custom and
> > send them to libvirt-list soon.
> 
> These patches are x86 obviously - is there anything we need to worry about
> for non-x86 architectures I wonder ? IIRC, all the non-x86 archs have
> traditionally only needed CPU model names and don't really have the same
> level of per-flag CPUID mask configurability that x86 has.

X86 started with opaque CPU model names hiding implementation details,
then moved to allow extra -cpu parameters. Then we noticed that the CPU
models were hiding too much from libvirt in the X86 case, and now those
parameters have been converted to QOM properties configurable using
-global.

I expect other architectures to follow a similar pattern and allow
things to be configured using QOM properties, but I am not sure they
would go to the extreme of making every single detail configurable using
QOM.

One thing you may need to worry about for all architectures is to check
if a CPU model is runnable in a host before starting or migrating a VM.
In this case, we're introducing a generic mechanism in
query-cpu-definitions for that. See "[PATCH v6 15/17] target-s390x:
Extend arch specific QMP command query-cpu-definitions" and related
patches (posted on 2015-04-27) on qemu-devel.

And in the case of runnability checks, x86 is a bit more complex, too
(because it is too configurable in QEMU and in libvirt): as libvirt
needs to know what exactly is blocking the CPU from running, we have the
"filtered-features" property (that libvirt can start using, now that we
have the "qom-path" field on query-cpus).
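
To illustrate how "filtered-features" could be consumed, here is a minimal
Python 3 sketch that lists the feature bits blocking a CPU from running. It
assumes the property value is a list of feature-word dictionaries with the
same keys that the x86-cpu-model-dump script reads ("cpuid-input-eax",
optional "cpuid-input-ecx", "cpuid-register", "features"). The helper name
blocking_bits and the sample data are hypothetical:

```python
def blocking_bits(filtered_features):
    """Return (func, idx, reg, bitnr) for every bit set in filtered-features."""
    blocked = []
    for word in filtered_features:
        func = word["cpuid-input-eax"]
        idx = word.get("cpuid-input-ecx", 0)
        reg = word["cpuid-register"].lower()
        for bitnr in range(32):
            if word["features"] & (1 << bitnr):
                blocked.append((func, idx, reg, bitnr))
    return blocked


if __name__ == "__main__":
    # Made-up sample: one feature word with nothing filtered, one with bit 5.
    sample = [
        {"cpuid-input-eax": 1, "cpuid-register": "EDX", "features": 0},
        {"cpuid-input-eax": 7, "cpuid-input-ecx": 0,
         "cpuid-register": "EBX", "features": 1 << 5},
    ]
    blocked = blocking_bits(sample)
    if blocked:
        for func, idx, reg, bitnr in blocked:
            print("filtered: CPUID[0x%x,%d].%s bit %d" % (func, idx, reg, bitnr))
    else:
        print("no features were filtered")
```

An empty list would mean that no requested feature was filtered out on
this host.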

-- 
Eduardo

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-08 19:07 [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models Eduardo Habkost
                   ` (2 preceding siblings ...)
  2015-06-08 20:18 ` [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models Jiri Denemark
@ 2015-06-16 17:40 ` Eduardo Habkost
  3 siblings, 0 replies; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-16 17:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: mimu, borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark,
	Andreas Färber, rth


Ping? Any feedback? I want to get this into 2.4.

On Mon, Jun 08, 2015 at 04:07:38PM -0300, Eduardo Habkost wrote:
> The problem:
> 
> The existing libvirt APIs assume that if a given CPU model is runnable in a
> host kernel+hardware combination, it will always be runnable on that host even
> if the machine-type changes.
> 
> That assumption is implied in some of libvirt's interfaces, for example, at:
> 
> 1) Host capabilities, which let callers know the set of CPU models
>    that can run in a host:
>    https://libvirt.org/formatcaps.html#elementHost
> 
>    "virsh capabilities" returns a CPU model name + CPU feature list, assuming
>    that a CPU model name has a meaning that's independent from the
>    machine-type.
> 
> 2) The function that checks if a given CPU model definition
>    is compatible with a host (virConnectCompareCPU()),
>    which does not take the machine-type as argument:
>    http://libvirt.org/html/libvirt-libvirt-host.html#virConnectCompareCPU
> 
> But that assumption is not true, as QEMU changes CPU models in new
> machine-types when fixing bugs, or when new features (previously unsupported by
> QEMU, TCG or KVM) get implemented.
> 
> The solution:
> 
> libvirt can solve this problem partially by making sure every feature in a CPU
> model is explicitly configured, instead of (incorrectly) expecting that a named
> CPU model will never change in QEMU. But this doesn't solve the problem
> completely, because it is still possible that new features unknown to libvirt
> get enabled in the default CPU model in future machine-types (that's very
> likely to happen when we introduce new KVM features, for example).
> 
> So, to make sure no new feature will be ever enabled without the knowledge of
> libvirt, add a "-cpu custom" mode, where no CPU model data is loaded at all,
> and everything needs to be configured explicitly using CPU properties. That
> means no CPU features will ever change depending on machine-type or accelerator
> capabilities when using "-cpu custom".
> 
>                               * * *
> 
> I know that this is basically the opposite of what we were aiming at in the
> last few month^Wyears, where we were struggling to implement probing interfaces
> to expose machine-type-dependent data for libvirt. But, at least the fact that
> we could implement the new approach using a 9-line patch means we were still
> going in the right direction. :)
> 
> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> that will generate a config file that can be loaded using -readconfig, based on
> the -cpu and -machine options provided in the command-line.
> 
>                               * * *
> 
> This is basically the same version I sent as an RFC in April. A git tree is
> available at:
> 
>   git://github.com/ehabkost/qemu-hacks.git work/x86-cpu-custom-model
> 
> Eduardo Habkost (2):
>   target-i386: Introduce "-cpu custom"
>   scripts: x86-cpu-model-dump script
> 
>  scripts/x86-cpu-model-dump | 322 +++++++++++++++++++++++++++++++++++++++++++++
>  target-i386/cpu.c          |  10 +-
>  2 files changed, 331 insertions(+), 1 deletion(-)
>  create mode 100755 scripts/x86-cpu-model-dump
> 
> -- 
> 2.1.0
> 
> 

-- 
Eduardo

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-08 20:18 ` [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models Jiri Denemark
  2015-06-09  8:56   ` Daniel P. Berrange
@ 2015-06-23 12:32   ` Andreas Färber
  2015-06-23 15:08     ` Eduardo Habkost
  2015-06-24  9:20     ` Jiri Denemark
  1 sibling, 2 replies; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 12:32 UTC (permalink / raw)
  To: Jiri Denemark
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, rth, Eduardo Habkost

Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
>> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
>> that will generate a config file that can be loaded using -readconfig, based on
>> the -cpu and -machine options provided in the command-line.
> 
> Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> configuration data to libvirt, but now I think it actually makes sense.
> We already have a partial copy of CPU model definitions in libvirt
> anyway, but as QEMU changes some CPU models in some machine types (and
> libvirt does not do that) we have no real control over the guest CPU
> configuration. While what we really want is full control to enforce
> stable guest ABI.

That sounds like FUD to me. Any concrete data points where QEMU does not
have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
for.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 12:32   ` Andreas Färber
@ 2015-06-23 15:08     ` Eduardo Habkost
  2015-06-23 15:32       ` Michael S. Tsirkin
  2015-06-23 15:51       ` Daniel P. Berrange
  2015-06-24  9:20     ` Jiri Denemark
  1 sibling, 2 replies; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 15:08 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> >> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> >> that will generate a config file that can be loaded using -readconfig, based on
> >> the -cpu and -machine options provided in the command-line.
> > 
> > Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > configuration data to libvirt, but now I think it actually makes sense.
> > We already have a partial copy of CPU model definitions in libvirt
> > anyway, but as QEMU changes some CPU models in some machine types (and
> > libvirt does not do that) we have no real control over the guest CPU
> > configuration. While what we really want is full control to enforce
> > stable guest ABI.
> 
> That sounds like FUD to me. Any concrete data points where QEMU does not
> have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> for.

What Jiri is saying is that the CPUs change depending on -machine, not
that the ABI is broken by a given machine.

The problem here is that libvirt needs to provide CPU models whose
runnability does not depend on the machine-type. If users have a VM that
is running in a host and the VM machine-type changes, the VM should be
still runnable in that host. QEMU doesn't provide that, our CPU models
may change when we introduce new machine-types, so we are giving them a
mechanism that allows libvirt to implement the policy they need.

-- 
Eduardo

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 15:08     ` Eduardo Habkost
@ 2015-06-23 15:32       ` Michael S. Tsirkin
  2015-06-23 15:58         ` Eduardo Habkost
  2015-06-23 15:51       ` Daniel P. Berrange
  1 sibling, 1 reply; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-23 15:32 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, Andreas Färber, rth

On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> > Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > >> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > >> that will generate a config file that can be loaded using -readconfig, based on
> > >> the -cpu and -machine options provided in the command-line.
> > > 
> > > Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > > configuration data to libvirt, but now I think it actually makes sense.
> > > We already have a partial copy of CPU model definitions in libvirt
> > > anyway, but as QEMU changes some CPU models in some machine types (and
> > > libvirt does not do that) we have no real control over the guest CPU
> > > configuration. While what we really want is full control to enforce
> > > stable guest ABI.
> > 
> > That sounds like FUD to me. Any concrete data points where QEMU does not
> > have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > for.
> 
> What Jiri is saying is that the CPUs change depending on -machine, not
> that the ABI is broken by a given machine.
> 
> The problem here is that libvirt needs to provide CPU models whose
> runnability does not depend on the machine-type. If users have a VM that
> is running in a host and the VM machine-type changes,

How does it change, and why?

> the VM should be
> still runnable in that host. QEMU doesn't provide that, our CPU models
> may change when we introduce new machine-types, so we are giving them a
> mechanism that allows libvirt to implement the policy they need.

I don't mind wrt CPU specifically, but we absolutely do change guest ABI
in many ways when we change machine types.

> -- 
> Eduardo

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 15:08     ` Eduardo Habkost
  2015-06-23 15:32       ` Michael S. Tsirkin
@ 2015-06-23 15:51       ` Daniel P. Berrange
  2015-06-23 15:56         ` Michael S. Tsirkin
  1 sibling, 1 reply; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-23 15:51 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark,
	Andreas Färber, rth

On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> > Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > >> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > >> that will generate a config file that can be loaded using -readconfig, based on
> > >> the -cpu and -machine options provided in the command-line.
> > > 
> > > Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > > configuration data to libvirt, but now I think it actually makes sense.
> > > We already have a partial copy of CPU model definitions in libvirt
> > > anyway, but as QEMU changes some CPU models in some machine types (and
> > > libvirt does not do that) we have no real control over the guest CPU
> > > configuration. While what we really want is full control to enforce
> > > stable guest ABI.
> > 
> > That sounds like FUD to me. Any concrete data points where QEMU does not
> > have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > for.
> 
> What Jiri is saying is that the CPUs change depending on -machine, not
> that the ABI is broken by a given machine.
> 
> The problem here is that libvirt needs to provide CPU models whose
> runnability does not depend on the machine-type. If users have a VM that
> is running in a host and the VM machine-type changes, the VM should be
> still runnable in that host. QEMU doesn't provide that, our CPU models
> may change when we introduce new machine-types, so we are giving them a
> mechanism that allows libvirt to implement the policy they need.

Expanding on that: by tying the CPU model to the machine type, QEMU
has in turn effectively tied the machine type to the host hardware.
E.g. switching to a newer machine type may then prevent the guest
from being able to launch on the hardware that it was previously
able to run on, due to some new requirement of the CPU model associated
with the machine type.

Libvirt wants the CPU models to be independent of the machine type,
so that in general only the CPU model depends on hardware capabilities
and the machine type is isolated from hardware.

Libvirt still intends to do versioning of the CPU models, but the
versioning will be separate from the versioning of the machine types,
and will be handled by libvirt itself.

This also allows us to get further towards our goal, which is to have a
consistent representation of CPU models across all libvirt hypervisors.
E.g. the same libvirt CPU model and versions can be made consistent across
kvm, xen, vmware, etc, as they're no longer changing behind our back
based on the qemu machine type.
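
To make the idea concrete, here is a minimal sketch of a CPU model table
that is versioned on its own and never consults the machine type. The
model name, the version numbers and the feature sets are all invented
for illustration; this is not libvirt's actual data or API.

```python
# Hypothetical table owned by the management layer:
# (model name, model version) -> explicit feature set.
CPU_MODELS = {
    ("ExampleModel", 1): {"sse2", "avx"},
    ("ExampleModel", 2): {"sse2", "avx", "rdrand"},
}


def explicit_features(name, version):
    """Expand a versioned model into the full, explicit feature list."""
    return sorted(CPU_MODELS[(name, version)])


def can_run(name, version, host_features):
    """A model is runnable if the host provides every one of its features."""
    return CPU_MODELS[(name, version)] <= set(host_features)


host = {"sse2", "avx"}
print(explicit_features("ExampleModel", 1))
print(can_run("ExampleModel", 1, host))
print(can_run("ExampleModel", 2, host))
```

Note that neither function takes a machine type as input, so changing
the machine type cannot change the answer. Only moving to a different
model version can, and that is a decision made by whoever owns the
table.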

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 15:51       ` Daniel P. Berrange
@ 2015-06-23 15:56         ` Michael S. Tsirkin
  2015-06-23 16:00           ` Daniel P. Berrange
  0 siblings, 1 reply; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-23 15:56 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Andreas Färber,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 04:51:00PM +0100, Daniel P. Berrange wrote:
> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> > On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> > > Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > > >> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > > >> that will generate a config file that can be loaded using -readconfig, based on
> > > >> the -cpu and -machine options provided in the command-line.
> > > > 
> > > > Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > > > configuration data to libvirt, but now I think it actually makes sense.
> > > > We already have a partial copy of CPU model definitions in libvirt
> > > > anyway, but as QEMU changes some CPU models in some machine types (and
> > > > libvirt does not do that) we have no real control over the guest CPU
> > > > configuration. While what we really want is full control to enforce
> > > > stable guest ABI.
> > > 
> > > That sounds like FUD to me. Any concrete data points where QEMU does not
> > > have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > > for.
> > 
> > What Jiri is saying is that the CPUs change depending on -machine, not
> > that the ABI is broken by a given machine.
> > 
> > The problem here is that libvirt needs to provide CPU models whose
> > runnability does not depend on the machine-type. If users have a VM that
> > is running in a host and the VM machine-type changes, the VM should be
> > still runnable in that host. QEMU doesn't provide that, our CPU models
> > may change when we introduce new machine-types, so we are giving them a
> > mechanism that allows libvirt to implement the policy they need.
> 
> Expanding on that: by tying the CPU model to the machine type, QEMU
> has in turn effectively tied the machine type to the host hardware.
> E.g. switching to a newer machine type may then prevent the guest
> from being able to launch on the hardware that it was previously
> able to run on, due to some new requirement of the CPU model associated
> with the machine type.

So why not keep machine type stable?

> Libvirt wants the CPU models to be independent of the machine type,
> so that in general only the CPU model depends on hardware capabilities
> and the machine type is isolated from hardware.
> 
> Libvirt still intends to do versioning of the CPU models, but the
> versioning will be separate from the versioning of the machine types,
> and will be handled by libvirt itself.
> 
> This also allows us to get further towards our goal, which is to have a
> consistent representation of CPU models across all libvirt hypervisors.
> E.g. the same libvirt CPU model and versions can be made consistent across
> kvm, xen, vmware, etc, as they're no longer changing behind our back
> based on the qemu machine type.
> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 15:32       ` Michael S. Tsirkin
@ 2015-06-23 15:58         ` Eduardo Habkost
  2015-06-23 16:15           ` Andreas Färber
  0 siblings, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 15:58 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, Andreas Färber, rth

On Tue, Jun 23, 2015 at 05:32:42PM +0200, Michael S. Tsirkin wrote:
> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> > On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> > > Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > > >> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > > >> that will generate a config file that can be loaded using -readconfig, based on
> > > >> the -cpu and -machine options provided in the command-line.
> > > > 
> > > > Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > > > configuration data to libvirt, but now I think it actually makes sense.
> > > > We already have a partial copy of CPU model definitions in libvirt
> > > > anyway, but as QEMU changes some CPU models in some machine types (and
> > > > libvirt does not do that) we have no real control over the guest CPU
> > > > configuration. While what we really want is full control to enforce
> > > > stable guest ABI.
> > > 
> > > That sounds like FUD to me. Any concrete data points where QEMU does not
> > > have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > > for.
> > 
> > What Jiri is saying is that the CPUs change depending on -machine, not
> > that the ABI is broken by a given machine.
> > 
> > The problem here is that libvirt needs to provide CPU models whose
> > runnability does not depend on the machine-type. If users have a VM that
> > is running in a host and the VM machine-type changes,
> 
> How does it change, and why?

Sometimes we add features to a CPU model because they were not emulated by KVM
and now they are. Sometimes we remove or add features or change other fields
because we are fixing previous mistakes. Recently we were going to remove
features from models because of an Intel CPU erratum, but then decided to create
a new CPU model name instead.

See some examples at the end of this message.

> 
> > the VM should be
> > still runnable in that host. QEMU doesn't provide that, our CPU models
> > may change when we introduce new machine-types, so we are giving them a
> > mechanism that allows libvirt to implement the policy they need.
> 
> I don't mind wrt CPU specifically, but we absolutely do change guest ABI
> in many ways when we change machine types.

All the other ABI changes we introduce in QEMU don't affect runnability of the
VM in a given host; that's the problem we are trying to address here. ABI
changes are expected when changing to a new machine, runnability changes
aren't.
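
As an illustration of a runnability change, here is a small sketch based
on the Haswell HLE/RTM commit quoted below. It models "enforce" as a
plain subset check; the base feature set is a tiny illustrative subset
chosen for the example, and the function names are hypothetical.

```python
# Features dropped from the Haswell model starting on pc-*-2.3,
# according to the commit message quoted below.
HLE_RTM = {"hle", "rtm"}

# Tiny illustrative subset of the model's other features (assumption).
BASE = {"avx2", "bmi2"}


def haswell_features(machine_version):
    """Feature set of the Haswell model for a machine-type (major, minor)."""
    if machine_version >= (2, 3):
        return set(BASE)
    return BASE | HLE_RTM


def runnable_with_enforce(model_features, host_features):
    """Mimic 'enforce': run only if the host has every model feature."""
    missing = set(model_features) - set(host_features)
    return (not missing, sorted(missing))


# A host whose CPU no longer exposes HLE and RTM.
host = {"avx2", "bmi2"}
print(runnable_with_enforce(haswell_features((2, 2)), host))
print(runnable_with_enforce(haswell_features((2, 3)), host))
```

The same model name gives a different answer on the same host depending
only on the machine-type version. In this example the newer machine-type
makes the VM runnable, but a commit that adds a feature to a model would
move the result in the other direction.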


Examples of commits changing CPU models:

commit 726a8ff68677d8d5fba17eb0ffb85076bfb598dc
Author: Eduardo Habkost <ehabkost@redhat.com>
Date:   Fri Apr 10 14:45:00 2015 -0300

    target-i386: Remove AMD feature flag aliases from CPU model table
    
    When CPU vendor is AMD, the AMD feature alias bits on
    CPUID[0x80000001].EDX are already automatically copied from CPUID[1].EDX
    on x86_cpu_realizefn(). When CPU vendor is Intel, those bits are
    reserved and should be zero. On either case, those bits shouldn't be set
    in the CPU model table.
    
    Reviewed-by: Igor Mammedov <imammedo@redhat.com>
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

commit 13704e4c455770d500d6b87b117e32f0d01252c9
Author: Eduardo Habkost <ehabkost@redhat.com>
Date:   Thu Jan 22 17:22:54 2015 -0200

    target-i386: Disable HLE and RTM on Haswell & Broadwell
    
    All Haswell CPUs and some Broadwell CPUs were updated by Intel to have
    the HLE and RTM features disabled. This will prevent
    "-cpu Haswell,enforce" and "-cpu Broadwell,enforce" from running out of
    the box on those CPUs.
    
    Disable those features by default on Broadwell and Haswell CPU models,
    starting on pc-*-2.3. Users who want to use those features can enable
    them explicitly on the command-line.
    
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

commit 78a611f1936b3eac8ed78a2be2146a742a85212c
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Fri Dec 5 10:52:46 2014 +0100

    target-i386: add f16c and rdrand to Haswell and Broadwell
    
    Both were added in Ivy Bridge (for which we do not have a CPU model
    yet!).
    
    Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

commit b3a4f0b1a072a467d003755ca0e55c5be38387cb
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Wed Dec 10 14:12:41 2014 -0200

    target-i386: add VME to all CPUs
    
    vm86 mode extensions date back to the 486.  All models should have
    them.
    
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

commit 0bb0b2d2fe7f645ddaf1f0ff40ac669c9feb4aa1
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Mon Nov 24 15:54:43 2014 +0100

    target-i386: add feature flags for CPUID[EAX=0xd,ECX=1]
    
    These represent xsave-related capabilities of the processor, and KVM may
    or may not support them.
    
    Add feature bits so that they are considered by "-cpu ...,enforce", and use
    the new feature work instead of calling kvm_arch_get_supported_cpuid.
    
    Bit 3 (XSAVES) is not migratables because it requires saving MSR_IA32_XSS.
    Neither KVM nor any commonly available hardware supports it anyway.
    
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

commit e93abc147fa628650bdbe7fd57f27462ca40a3c2
Author: Eduardo Habkost <ehabkost@redhat.com>
Date:   Fri Oct 3 16:39:50 2014 -0300

    target-i386: Don't enable nested VMX by default
    
    TCG doesn't support VMX, and nested VMX is not enabled by default in the
    KVM kernel module.
    
    So, there's no reason to have VMX enabled by default on the core2duo and
    coreduo CPU models, today. Even the newer Intel CPU model definitions
    don't have it enabled.
    
    In this case, we need machine-type compat code, as people may be running
    the older machine-types on hosts that had VMX nesting enabled.
    
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
    Signed-off-by: Andreas Färber <afaerber@suse.de>

commit f8e6a11aecc96e9d8a84f17d7c07019471714e20
Author: Eduardo Habkost <ehabkost@redhat.com>
Date:   Tue Sep 10 17:48:59 2013 -0300

    target-i386: Set model=6 on qemu64 & qemu32 CPU models
    
    There's no Intel CPU with family=6,model=2, and Linux and Windows guests
    disable SEP when seeing that combination due to Pentium Pro erratum #82.
    
    In addition to just having SEP ignored by guests, Skype (and maybe other
    applications) runs sysenter directly without passing through ntdll on
    Windows, and crashes because Windows ignored the SEP CPUID bit.
    
    So, having model > 2 is a better default on qemu64 and qemu32 for two
    reasons: making SEP really available for guests, and avoiding crashing
    applications that work on bare metal.
    
    model=3 would fix the problem, but it causes CPU enumeration problems
    for Windows guests[1]. So let's set model=6, that matches "Athlon
    (PM core)" on AMD and "P2 with on-die L2 cache" on Intel and it allows
    Windows to use all CPUs as well as fixing sysenter.
    
    [1] https://bugzilla.redhat.com/show_bug.cgi?id=508623
    
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
    Reviewed-by: Igor Mammedov <imammedo@redhat.com>
    Signed-off-by: Andreas Färber <afaerber@suse.de>

commit 6b11322e0f724eb0649fdc324a44288b783023ad
Author: Eduardo Habkost <ehabkost@redhat.com>
Date:   Mon May 27 17:23:55 2013 -0300

    target-i386: Set level=4 on Conroe/Penryn/Nehalem
    
    The CPUID level value on Conroe, Penryn, and Nehalem are too low. This
    causes at least one known problem: the -smp "threads" option doesn't
    work as expect if level is < 4, because thread count information is
    provided to the guest on CPUID[EAX=4,ECX=2].EAX
    
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
    Reviewed-by: Igor Mammedov <imammedo@redhat.com>
    Signed-off-by: Andreas Färber <afaerber@suse.de>

commit ffce9ebbb69363dfe7605585cdad58ea3847edf4
Author: Eduardo Habkost <ehabkost@redhat.com>
Date:   Mon May 27 17:23:54 2013 -0300

    target-i386: Update model values on Conroe/Penryn/Nehalem CPU models
    
    The CPUID model values on Conroe, Penryn, and Nehalem are too
    conservative and don't reflect the values found on real Conroe, Penryn,
    and Nehalem CPUs.
    
    This causes at least one known problems: Windows XP disables sysenter
    when (family == 6 && model <= 2), but Skype tries to use the sysenter
    instruction anyway because it is reported as available on CPUID, making
    it crash.
    
    This patch sets appropriate model values that correspond to real Conroe,
    Penryn, and Nehalem CPUs.
    
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
    Reviewed-by: Igor Mammedov <imammedo@redhat.com>
    Signed-off-by: Andreas Färber <afaerber@suse.de>

commit b2a856d99281f2fee60a4313d204205bcd2c4269
Author: Andreas Färber <afaerber@suse.de>
Date:   Wed May 1 17:30:51 2013 +0200

    target-i386: Change CPUID model of 486 to 8
    
    This changes the model number of 486 to 8 (DX4) which matches the
    feature set presented, and actually has the CPUID instruction.
    
    This adds a compatibility property, to keep model=0 on pc-*-1.4 and older.
    
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    [AF: Add compat_props entry]
    Tested-by: Eduardo Habkost <ehabkost@redhat.com>
    Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
    Signed-off-by: Andreas Färber <afaerber@suse.de>


KVM-specific changes:

commit 75d373ef9729bd22fbc46bfd8dcd158cbf6d9777
Author: Eduardo Habkost <ehabkost@redhat.com>
Date:   Fri Oct 3 16:39:51 2014 -0300

    target-i386: Disable SVM by default in KVM mode
    
    Make SVM be disabled by default on all CPU models when in KVM mode.
    Nested SVM is enabled by default in the KVM kernel module, but it is
    probably less stable than nested VMX (which is already disabled by
    default).
    
    Add a new compat function, x86_cpu_compat_kvm_no_autodisable(), to keep
    compatibility on previous machine-types.
    
    Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
    Signed-off-by: Andreas Färber <afaerber@suse.de>

commit 864867b91b48d38e2bfc7b225197901e6f7d8216
Author: Eduardo Habkost <ehabkost@redhat.com>
Date:   Fri Oct 3 16:39:48 2014 -0300

    target-i386: Disable CPUID_ACPI by default in KVM mode
    
    KVM never supported the CPUID_ACPI flag, so it doesn't make sense to
    have it enabled by default when KVM is enabled.
    
    The motivation here is exactly the same we had for the MONITOR flag
    (disabled by commit 136a7e9a85d7047461f8153f7d12c514a3d68f69).
    
    And like in the MONITOR flag case, we don't need machine-type compat code
    because it is currently impossible to run a KVM VM with the ACPI flag set.
    
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
    Signed-off-by: Andreas Färber <afaerber@suse.de>

commit 136a7e9a85d7047461f8153f7d12c514a3d68f69
Author: Eduardo Habkost <ehabkost@redhat.com>
Date:   Wed Apr 30 13:48:28 2014 -0300

    target-i386: kvm: Don't enable MONITOR by default on any CPU model
    
    KVM never supported the MONITOR flag so it doesn't make sense to have it
    enabled by default when KVM is enabled.
    
    The rationale here is similar to the cases where it makes sense to have
    a feature enabled by default on all CPU models when on KVM mode (e.g.
    x2apic). In this case we are having a feature disabled by default for
    the same reasons.
    
    In this case we don't need machine-type compat code because it is
    currently impossible to run a KVM VM with the MONITOR flag set.
    
    Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
    Signed-off-by: Andreas Färber <afaerber@suse.de>

commit ef02ef5f4536dba090b12360a6c862ef0e57e3bc
Author: Eduardo Habkost <ehabkost@redhat.com>
Date:   Wed Feb 19 11:58:12 2014 -0300

    target-i386: Enable x2apic by default on KVM
    
    When on KVM mode, enable x2apic by default on all CPU models.
    
    Normally we try to keep the CPU model definitions as close as the real
    CPUs as possible, but x2apic can be emulated by KVM without host CPU
    support for x2apic, and it improves performance by reducing APIC access
    overhead. x2apic emulation is available on KVM since 2009 (Linux
    2.6.32-rc1), there's no reason for not enabling x2apic by default when
    running KVM.
    
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Andreas Färber <afaerber@suse.de>

commit 6a4784ce6b95b013a13504ead9ab62975faf6eff
Author: Eduardo Habkost <ehabkost@redhat.com>
Date:   Mon Jan 7 16:20:44 2013 -0200

    target-i386: Disable kvm_mmu by default
    
    KVM_CAP_PV_MMU capability reporting was removed from the kernel since
    v2.6.33 (see commit a68a6a7282373), and was completely removed from the
    kernel since v3.3 (see commit fb92045843). It doesn't make sense to keep
    it enabled by default, as it would cause unnecessary hassle when using
    the "enforce" flag.
    
    This disables kvm_mmu on all machine-types. With this fix, the possible
    scenarios when migrating from QEMU <= 1.3 to QEMU 1.4 are:
    
    ------------+----------+----------------------------------------------------
     src kernel | dst kern.| Result
    ------------+----------+----------------------------------------------------
     >= 2.6.33  | any      | kvm_mmu was already disabled and will stay disabled
     <= 2.6.32  | >= 3.3   | correct live migration is impossible
     <= 2.6.32  | <= 3.2   | kvm_mmu will be disabled on next guest reboot *
    ------------+----------+----------------------------------------------------
    
     * If they are running kernel <= 2.6.32 and want kvm_mmu to be kept
       enabled on guest reboot, they can explicitly add +kvm_mmu to the QEMU
       command-line. Using 2.6.33 and higher, it is not possible to enable
       kvm_mmu explicitly anymore.
    
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
    Reviewed-by: Gleb Natapov <gleb@redhat.com>
    Signed-off-by: Andreas Färber <afaerber@suse.de>

commit dc59944bc9a5ad784572eea57610de60e4a2f4e5
Author: Michael S. Tsirkin <mst@redhat.com>
Date:   Thu Oct 18 00:15:48 2012 +0200

    qemu: enable PV EOI for qemu 1.3
    
    Enable KVM PV EOI by default. You can still disable it with
    -kvm_pv_eoi cpu flag. To avoid breaking cross-version migration,
    enable only for qemu 1.3 (or in the future, newer) machine type.
    
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

commit ef8621b1a3b199c348606c0a11a77d8e8bf135f1
Author: Anthony Liguori <aliguori@us.ibm.com>
Date:   Wed Aug 29 09:32:41 2012 -0500

    target-i386: disable pv eoi to fix migration across QEMU versions
    
    We have a problem with how we handle migration with KVM paravirt features.
    We unconditionally enable paravirt features regardless of whether we know how
    to migrate them.
    
    We also don't tie paravirt features to specific machine types so an old QEMU on
    a new kernel would expose features that never existed.
    
    The 1.2 cycle is over and as things stand, migration is broken.  Michael has
    another series that adds support for migrating PV EOI and attempts to make it
    work correctly for different machine types.
    
    After speaking with Michael on IRC, we agreed to take this patch plus 1 & 4
    from his series.  This makes sure QEMU can migrate PV EOI if it's enabled, but
    does not enable it by default.
    
    This also means that we won't unconditionally enable new features for guests
    future proofing us from this happening again in the future.
    
    Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

-- 
Eduardo

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 15:56         ` Michael S. Tsirkin
@ 2015-06-23 16:00           ` Daniel P. Berrange
  2015-06-23 16:30             ` Michael S. Tsirkin
  0 siblings, 1 reply; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-23 16:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Andreas Färber,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 05:56:35PM +0200, Michael S. Tsirkin wrote:
> On Tue, Jun 23, 2015 at 04:51:00PM +0100, Daniel P. Berrange wrote:
> > On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> > > On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> > > > Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > > > >> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > > > >> that will generate a config file that can be loaded using -readconfig, based on
> > > > >> the -cpu and -machine options provided in the command-line.
> > > > > 
> > > > > Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > > > > configuration data to libvirt, but now I think it actually makes sense.
> > > > > We already have a partial copy of CPU model definitions in libvirt
> > > > > anyway, but as QEMU changes some CPU models in some machine types (and
> > > > > libvirt does not do that) we have no real control over the guest CPU
> > > > > configuration. While what we really want is full control to enforce
> > > > > stable guest ABI.
> > > > 
> > > > That sounds like FUD to me. Any concrete data points where QEMU does not
> > > > have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > > > for.
> > > 
> > > What Jiri is saying is that the CPUs change depending on -machine, not
> > > that the ABI is broken by a given machine.
> > > 
> > > The problem here is that libvirt needs to provide CPU models whose
> > > runnability does not depend on the machine-type. If users have a VM that
> > > is running in a host and the VM machine-type changes, the VM should be
> > > still runnable in that host. QEMU doesn't provide that, our CPU models
> > > may change when we introduce new machine-types, so we are giving them a
> > > mechanism that allows libvirt to implement the policy they need.
> > 
> > Expanding on that, by tying the CPU model to the machine type, QEMU
> > has in turn effectively tied the machine type to the host hardware.
> > eg, switching to a newer machine type, may then prevent the guest
> > from being able to launch on the hardware that it was previously
> > able to run on, due to some new requirement of the CPU model associated
> > with the machine type.
> 
> So why not keep machine type stable?

There are many reasons to choose a particular machine type - for
example, to achieve migration compat between hosts with different
QEMU versions, or to enable access to some performance or bug
fix in the machine type in question. Users / apps need to be free
to make those decisions, without being restricted by changes in the
CPU model which may affect what hardware the machine type can be
used on. The current use of machine types for CPU model versioning
is placing users between a rock & hard place, giving them impossible
decisions about which bad behaviour/bug they're willing to accept.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 15:58         ` Eduardo Habkost
@ 2015-06-23 16:15           ` Andreas Färber
  2015-06-23 16:25             ` Daniel P. Berrange
  2015-06-23 16:32             ` Eduardo Habkost
  0 siblings, 2 replies; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 16:15 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
> On Tue, Jun 23, 2015 at 05:32:42PM +0200, Michael S. Tsirkin wrote:
>> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
>>> On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
>>>> Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
>>>>>> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
>>>>>> that will generate a config file that can be loaded using -readconfig, based on
>>>>>> the -cpu and -machine options provided in the command-line.
>>>>>
>>>>> Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
>>>>> configuration data to libvirt, but now I think it actually makes sense.
>>>>> We already have a partial copy of CPU model definitions in libvirt
>>>>> anyway, but as QEMU changes some CPU models in some machine types (and
>>>>> libvirt does not do that) we have no real control over the guest CPU
>>>>> configuration. While what we really want is full control to enforce
>>>>> stable guest ABI.
>>>>
>>>> That sounds like FUD to me. Any concrete data points where QEMU does not
>>>> have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
>>>> for.
>>>
>>> What Jiri is saying is that the CPUs change depending on -machine, not
>>> that the ABI is broken by a given machine.
>>>
>>> The problem here is that libvirt needs to provide CPU models whose
>>> runnability does not depend on the machine-type. If users have a VM that
>>> is running in a host and the VM machine-type changes,
>>
>> How does it change, and why?
> 
> Sometimes we add features to a CPU model because they were not emulated by KVM
> and now they are. Sometimes we remove or add features or change other fields
> because we are fixing previous mistakes. Recently we were going to remove
> features from models because of an Intel CPU errata, but then decided to create
> a new CPU model name instead.
> 
> See some examples at the end of this message.
> 
>>
>>> the VM should be
>>> still runnable in that host. QEMU doesn't provide that, our CPU models
>>> may change when we introduce new machine-types, so we are giving them a
>>> mechanism that allows libvirt to implement the policy they need.
>>
>> I don't mind wrt CPU specifically, but we absolutely do change guest ABI
>> in many ways when we change machine types.
> 
> All the other ABI changes we introduce in QEMU don't affect runnability of the
> VM in a given host, that's the problem we are trying to address here. ABI
> changes are expected when changing to a new machine, runnability changes
> aren't.
> 
> 
> Examples of commits changing CPU models:
[snip]

I've always advocated remaining backwards-compatible and only making CPU
model changes for new machines. You among others felt that was not
always necessary, and now you're using the lack thereof as an argument
to stop using QEMU's CPU models at all? That sounds convoluted...

BTW your list does not answer my question. You would need examples where
a CPU model changes between machines, and I am not aware of any example
beyond the intentional -x.y variations. There are differences between
KVM and TCG though, did you mean that? i440fx and q35 should be
identical and isa-pc, too, and none anyway. None of this has anything to
do with the host CPU.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:15           ` Andreas Färber
@ 2015-06-23 16:25             ` Daniel P. Berrange
  2015-06-23 16:33               ` Michael S. Tsirkin
  2015-06-23 16:40               ` Andreas Färber
  2015-06-23 16:32             ` Eduardo Habkost
  1 sibling, 2 replies; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-23 16:25 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 06:15:51PM +0200, Andreas Färber wrote:
> Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
> > On Tue, Jun 23, 2015 at 05:32:42PM +0200, Michael S. Tsirkin wrote:
> >> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> >>> On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> >>>> Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> >>>>>> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> >>>>>> that will generate a config file that can be loaded using -readconfig, based on
> >>>>>> the -cpu and -machine options provided in the command-line.
> >>>>>
> >>>>> Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> >>>>> configuration data to libvirt, but now I think it actually makes sense.
> >>>>> We already have a partial copy of CPU model definitions in libvirt
> >>>>> anyway, but as QEMU changes some CPU models in some machine types (and
> >>>>> libvirt does not do that) we have no real control over the guest CPU
> >>>>> configuration. While what we really want is full control to enforce
> >>>>> stable guest ABI.
> >>>>
> >>>> That sounds like FUD to me. Any concrete data points where QEMU does not
> >>>> have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> >>>> for.
> >>>
> >>> What Jiri is saying is that the CPUs change depending on -machine, not
> >>> that the ABI is broken by a given machine.
> >>>
> >>> The problem here is that libvirt needs to provide CPU models whose
> >>> runnability does not depend on the machine-type. If users have a VM that
> >>> is running in a host and the VM machine-type changes,
> >>
> >> How does it change, and why?
> > 
> > Sometimes we add features to a CPU model because they were not emulated by KVM
> > and now they are. Sometimes we remove or add features or change other fields
> > because we are fixing previous mistakes. Recently we were going to remove
> > features from models because of an Intel CPU errata, but then decided to create
> > a new CPU model name instead.
> > 
> > See some examples at the end of this message.
> > 
> >>
> >>> the VM should be
> >>> still runnable in that host. QEMU doesn't provide that, our CPU models
> >>> may change when we introduce new machine-types, so we are giving them a
> >>> mechanism that allows libvirt to implement the policy they need.
> >>
> >> I don't mind wrt CPU specifically, but we absolutely do change guest ABI
> >> in many ways when we change machine types.
> > 
> > All the other ABI changes we introduce in QEMU don't affect runnability of the
> > VM in a given host, that's the problem we are trying to address here. ABI
> > changes are expected when changing to a new machine, runnability changes
> > aren't.
> > 
> > 
> > Examples of commits changing CPU models:
> [snip]
> 
> I've always advocated remaining backwards-compatible and only making CPU
> model changes for new machines. You among others felt that was not
> always necessary, and now you're using the lack thereof as an argument
> to stop using QEMU's CPU models at all? That sounds convoluted...

Whether QEMU changed the CPU for existing machines, or only for new
machines is actually not the core problem. Even if we only changed
the CPU in new machines that would still be an unsatisfactory situation
because we want to be able to access different versions of
the CPU without the machine type changing, and access different versions
of the machine type, without the CPU changing. IOW it is the fact that the
changes in CPU are tied to changes in machine type that is the core
problem.
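
As a rough illustration of that decoupling, here is a minimal sketch in
which the machine type and the CPU feature set are two independent fields,
so either one can be changed without touching the other. All names
(GuestConfig, the machine names, the feature names) are hypothetical and
are not libvirt or QEMU code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GuestConfig:
    """Hypothetical guest definition with two independent version axes."""
    machine_type: str
    cpu_features: frozenset


def change_machine_type(config, new_machine):
    # Only the machine type changes; the CPU feature set is carried over.
    return replace(config, machine_type=new_machine)


def change_cpu_features(config, new_features):
    # Only the CPU feature set changes; the machine type is carried over.
    return replace(config, cpu_features=frozenset(new_features))


old = GuestConfig("pc-x.1", frozenset({"sse2", "avx"}))
new_machine = change_machine_type(old, "pc-x.2")
new_cpu = change_cpu_features(old, {"sse2", "avx", "newfeat"})
```

With the CPU definition tied to the machine type, the first operation
could silently perform the second one as well, which is the coupling
described above.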

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:00           ` Daniel P. Berrange
@ 2015-06-23 16:30             ` Michael S. Tsirkin
  0 siblings, 0 replies; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-23 16:30 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Andreas Färber,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 05:00:54PM +0100, Daniel P. Berrange wrote:
> On Tue, Jun 23, 2015 at 05:56:35PM +0200, Michael S. Tsirkin wrote:
> > On Tue, Jun 23, 2015 at 04:51:00PM +0100, Daniel P. Berrange wrote:
> > > On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> > > > On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> > > > > Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > > > > >> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > > > > >> that will generate a config file that can be loaded using -readconfig, based on
> > > > > >> the -cpu and -machine options provided in the command-line.
> > > > > > 
> > > > > > Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > > > > > configuration data to libvirt, but now I think it actually makes sense.
> > > > > > We already have a partial copy of CPU model definitions in libvirt
> > > > > > anyway, but as QEMU changes some CPU models in some machine types (and
> > > > > > libvirt does not do that) we have no real control over the guest CPU
> > > > > > configuration. While what we really want is full control to enforce
> > > > > > stable guest ABI.
> > > > > 
> > > > > That sounds like FUD to me. Any concrete data points where QEMU does not
> > > > > have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > > > > for.
> > > > 
> > > > What Jiri is saying is that the CPUs change depending on -machine, not
> > > > that the ABI is broken by a given machine.
> > > > 
> > > > The problem here is that libvirt needs to provide CPU models whose
> > > > runnability does not depend on the machine-type. If users have a VM that
> > > > is running in a host and the VM machine-type changes, the VM should be
> > > > still runnable in that host. QEMU doesn't provide that, our CPU models
> > > > may change when we introduce new machine-types, so we are giving them a
> > > > mechanism that allows libvirt to implement the policy they need.
> > > 
> > > Expanding on that, by tying the CPU model to the machine type, QEMU
> > > has in turn effectively tied the machine type to the host hardware.
> > > eg, switching to a newer machine type, may then prevent the guest
> > > from being able to launch on the hardware that it was previously
> > > able to run on, due to some new requirement of the CPU model associated
> > > with the machine type.
> > 
> > So why not keep machine type stable?
> 
> There are many reasons to choose a particular machine type - for
> example, to achieve migration compat between hosts with different
> QEMU versions,

This might make you use an old machine type.
It will never make you use a newer machine type
so you will never run into problems.

> or to enable access to some performance or bug
> fix in the machine type in question.

Performance/bugfixes is exactly why we change these though.

> Users / apps need to be free
> to make those decisions, without being restricted by changes in the
> CPU model which may affect what hardware the machine type can be
> used on. The current use of machine types for CPU model versioning
> is placing users between a rock & hard place, giving them impossible
> decisions about which bad behaviour/bug they're willing to accept.
> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:15           ` Andreas Färber
  2015-06-23 16:25             ` Daniel P. Berrange
@ 2015-06-23 16:32             ` Eduardo Habkost
  2015-06-23 17:01               ` Andreas Färber
  1 sibling, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 16:32 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

On Tue, Jun 23, 2015 at 06:15:51PM +0200, Andreas Färber wrote:
> Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
> > On Tue, Jun 23, 2015 at 05:32:42PM +0200, Michael S. Tsirkin wrote:
> >> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> >>> On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> >>>> Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> >>>>>> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> >>>>>> that will generate a config file that can be loaded using -readconfig, based on
> >>>>>> the -cpu and -machine options provided in the command-line.
> >>>>>
> >>>>> Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> >>>>> configuration data to libvirt, but now I think it actually makes sense.
> >>>>> We already have a partial copy of CPU model definitions in libvirt
> >>>>> anyway, but as QEMU changes some CPU models in some machine types (and
> >>>>> libvirt does not do that) we have no real control over the guest CPU
> >>>>> configuration. While what we really want is full control to enforce
> >>>>> stable guest ABI.
> >>>>
> >>>> That sounds like FUD to me. Any concrete data points where QEMU does not
> >>>> have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> >>>> for.
> >>>
> >>> What Jiri is saying is that the CPUs change depending on -machine, not
> >>> that the ABI is broken by a given machine.
> >>>
> >>> The problem here is that libvirt needs to provide CPU models whose
> >>> runnability does not depend on the machine-type. If users have a VM that
> >>> is running in a host and the VM machine-type changes,
> >>
> >> How does it change, and why?
> > 
> > Sometimes we add features to a CPU model because they were not emulated by KVM
> > and now they are. Sometimes we remove or add features or change other fields
> > because we are fixing previous mistakes. Recently we were going to remove
> > features from models because of an Intel CPU errata, but then decided to create
> > a new CPU model name instead.
> > 
> > See some examples at the end of this message.
> > 
> >>
> >>> the VM should be
> >>> still runnable in that host. QEMU doesn't provide that, our CPU models
> >>> may change when we introduce new machine-types, so we are giving them a
> >>> mechanism that allows libvirt to implement the policy they need.
> >>
> >> I don't mind wrt CPU specifically, but we absolutely do change guest ABI
> >> in many ways when we change machine types.
> > 
> > All the other ABI changes we introduce in QEMU don't affect runnability of the
> > VM in a given host, that's the problem we are trying to address here. ABI
> > changes are expected when changing to a new machine, runnability changes
> > aren't.
> > 
> > 
> > Examples of commits changing CPU models:
> [snip]
> 
> I've always advocated remaining backwards-compatible and only making CPU
> model changes for new machines. You among others felt that was not
> always necessary, and now you're using the lack thereof as an argument
> to stop using QEMU's CPU models at all? That sounds convoluted...
> 

Uh? I don't remember anybody suggesting changing CPU models on existing
machines. We always tried to keep existing machines compatible.

> BTW your list does not answer my question. You would need examples where
> a CPU model changes between machines, and I am not aware of any example
> beyond the intentional -x.y variations. There are differences between
> KVM and TCG though, did you mean that? i440fx and q35 should be
> identical and isa-pc, too, and none anyway. None of this has anything to
> do with the host CPU.

We are talking about the -x.y variations (that, yes, are intentional).
But the fact that CPU features change (even the intentional ones in the
-x.y machine variations) affects runnability of VMs (because enabling a
new CPU feature in KVM requires it to be supported by the host kernel
code and by the host CPU).

I was not thinking about the KVM and TCG differences, but this may also
help libvirt deal with the KVM and TCG differences if necessary.

I don't know what you mean by "i440fx and q35 should be identical"
above.
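
To make the runnability point concrete, here is a minimal sketch. It
assumes a CPU model is runnable on a host only if every feature it
requires is available there, and that the same model name can expand to
a different feature set per versioned machine-type. The feature names,
machine names and tables are made up for illustration and do not reflect
QEMU's real CPU model definitions.

```python
# Illustrative only: feature and machine names below are made up and do
# not reflect QEMU's real CPU model tables.

# Features the host kernel + host CPU can expose to a KVM guest (assumed).
HOST_FEATURES = {"sse2", "avx", "aes"}

# The same CPU model name expands to different feature sets depending on
# the versioned machine-type (hypothetical data).
MODEL_FEATURES = {
    ("ExampleCPU", "pc-x.1"): {"sse2", "avx"},
    ("ExampleCPU", "pc-x.2"): {"sse2", "avx", "newfeat"},
}


def missing_features(model, machine, host_features):
    """Return the features the model needs that the host cannot provide."""
    return MODEL_FEATURES[(model, machine)] - host_features


def is_runnable(model, machine, host_features):
    """A model is runnable only if every required feature is available."""
    return not missing_features(model, machine, host_features)


for machine in ("pc-x.1", "pc-x.2"):
    print(machine,
          is_runnable("ExampleCPU", machine, HOST_FEATURES),
          sorted(missing_features("ExampleCPU", machine, HOST_FEATURES)))
```

In this sketch the host and the CPU model name stay the same, yet moving
from the first machine-type to the second makes the guest unrunnable,
because the newer definition requires a feature the host lacks.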

-- 
Eduardo

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:25             ` Daniel P. Berrange
@ 2015-06-23 16:33               ` Michael S. Tsirkin
  2015-06-23 16:38                 ` Eduardo Habkost
  2015-06-23 16:42                 ` Daniel P. Berrange
  2015-06-23 16:40               ` Andreas Färber
  1 sibling, 2 replies; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-23 16:33 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Andreas Färber,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
> On Tue, Jun 23, 2015 at 06:15:51PM +0200, Andreas Färber wrote:
> > Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
> > > On Tue, Jun 23, 2015 at 05:32:42PM +0200, Michael S. Tsirkin wrote:
> > >> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> > >>> On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> > >>>> Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > >>>>>> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > >>>>>> that will generate a config file that can be loaded using -readconfig, based on
> > >>>>>> the -cpu and -machine options provided in the command-line.
> > >>>>>
> > >>>>> Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > >>>>> configuration data to libvirt, but now I think it actually makes sense.
> > >>>>> We already have a partial copy of CPU model definitions in libvirt
> > >>>>> anyway, but as QEMU changes some CPU models in some machine types (and
> > >>>>> libvirt does not do that) we have no real control over the guest CPU
> > >>>>> configuration. While what we really want is full control to enforce
> > >>>>> stable guest ABI.
> > >>>>
> > >>>> That sounds like FUD to me. Any concrete data points where QEMU does not
> > >>>> have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > >>>> for.
> > >>>
> > >>> What Jiri is saying is that the CPUs change depending on -machine, not
> > >>> that the ABI is broken by a given machine.
> > >>>
> > >>> The problem here is that libvirt needs to provide CPU models whose
> > >>> runnability does not depend on the machine-type. If users have a VM that
> > >>> is running in a host and the VM machine-type changes,
> > >>
> > >> How does it change, and why?
> > > 
> > > Sometimes we add features to a CPU model because they were not emulated by KVM
> > > and now they are. Sometimes we remove or add features or change other fields
> > > because we are fixing previous mistakes. Recently we were going to remove
> > > features from models because of an Intel CPU errata, but then decided to create
> > > a new CPU model name instead.
> > > 
> > > See some examples at the end of this message.
> > > 
> > >>
> > >>> the VM should be
> > >>> still runnable in that host. QEMU doesn't provide that, our CPU models
> > >>> may change when we introduce new machine-types, so we are giving them a
> > >>> mechanism that allows libvirt to implement the policy they need.
> > >>
> > >> I don't mind wrt CPU specifically, but we absolutely do change guest ABI
> > >> in many ways when we change machine types.
> > > 
> > > All the other ABI changes we introduce in QEMU don't affect runnability of the
> > > VM in a given host, that's the problem we are trying to address here. ABI
> > > changes are expected when changing to a new machine, runnability changes
> > > aren't.
> > > 
> > > 
> > > Examples of commits changing CPU models:
> > [snip]
> > 
> > I've always advocated remaining backwards-compatible and only making CPU
> > model changes for new machines. You among others felt that was not
> > always necessary, and now you're using the lack thereof as an argument
> > to stop using QEMU's CPU models at all? That sounds convoluted...
> 
> Whether QEMU changed the CPU for existing machines, or only for new
> machines is actually not the core problem. Even if we only changed
> the CPU in new machines that would still be an unsatisfactory situation
> because we want to be able to access different versions of
> the CPU without the machine type changing, and access different versions
> of the machine type, without the CPU changing. IOW it is the fact that the
> changes in CPU are tied to changes in machine type that is the core
> problem.
> 
> Regards,
> Daniel

But that's because we are fixing bugs.  If CPU X used to work on
hardware Y in machine type A and stopped in machine type B, this is
because we have determined that it's the right thing to do for the
guests and the users. We don't break stuff just for fun.
Why do you want to bring back the bugs we fixed?

> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:33               ` Michael S. Tsirkin
@ 2015-06-23 16:38                 ` Eduardo Habkost
  2015-06-23 16:44                   ` Andreas Färber
  2015-06-23 16:42                 ` Daniel P. Berrange
  1 sibling, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 16:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, Andreas Färber, rth

On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
> > On Tue, Jun 23, 2015 at 06:15:51PM +0200, Andreas Färber wrote:
> > > Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
> > > > On Tue, Jun 23, 2015 at 05:32:42PM +0200, Michael S. Tsirkin wrote:
> > > >> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> > > >>> On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> > > >>>> Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > > >>>>>> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > > >>>>>> that will generate a config file that can be loaded using -readconfig, based on
> > > >>>>>> the -cpu and -machine options provided in the command-line.
> > > >>>>>
> > > >>>>> Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > > >>>>> configuration data to libvirt, but now I think it actually makes sense.
> > > >>>>> We already have a partial copy of CPU model definitions in libvirt
> > > >>>>> anyway, but as QEMU changes some CPU models in some machine types (and
> > > >>>>> libvirt does not do that) we have no real control over the guest CPU
> > > >>>>> configuration. While what we really want is full control to enforce
> > > >>>>> stable guest ABI.
> > > >>>>
> > > >>>> That sounds like FUD to me. Any concrete data points where QEMU does not
> > > >>>> have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > > >>>> for.
> > > >>>
> > > >>> What Jiri is saying is that the CPUs change depending on -machine, not
> > > >>> that the ABI is broken by a given machine.
> > > >>>
> > > >>> The problem here is that libvirt needs to provide CPU models whose
> > > >>> runnability does not depend on the machine-type. If users have a VM that
> > > >>> is running in a host and the VM machine-type changes,
> > > >>
> > > >> How does it change, and why?
> > > > 
> > > > Sometimes we add features to a CPU model because they were not emulated by KVM
> > > > and now they are. Sometimes we remove or add features or change other fields
> > > > because we are fixing previous mistakes. Recently we were going to remove
> > > > features from models because of an Intel CPU errata, but then decided to create
> > > > a new CPU model name instead.
> > > > 
> > > > See some examples at the end of this message.
> > > > 
> > > >>
> > > >>> the VM should be
> > > >>> still runnable in that host. QEMU doesn't provide that, our CPU models
> > > >>> may change when we introduce new machine-types, so we are giving them a
> > > >>> mechanism that allows libvirt to implement the policy they need.
> > > >>
> > > >> I don't mind wrt CPU specifically, but we absolutely do change guest ABI
> > > >> in many ways when we change machine types.
> > > > 
> > > > All the other ABI changes we introduce in QEMU don't affect runnability of the
> > > > VM in a given host, that's the problem we are trying to address here. ABI
> > > > changes are expected when changing to a new machine, runnability changes
> > > > aren't.
> > > > 
> > > > 
> > > > Examples of commits changing CPU models:
> > > [snip]
> > > 
> > > I've always advocated remaining backwards-compatible and only making CPU
> > > model changes for new machines. You among others felt that was not
> > > always necessary, and now you're using the lack thereof as an argument
> > > to stop using QEMU's CPU models at all? That sounds convoluted...
> > 
> > Whether QEMU changed the CPU for existing machines, or only for new
> > machines is actually not the core problem. Even if we only changed
> > the CPU in new machines that would still be an unsatisfactory situation
> > because we want to be able to access different versions of
> > the CPU without the machine type changing, and access different versions
> > of the machine type, without the CPU changing. IOW it is the fact that the
> > changes in CPU are tied to changes in machine type that is the core
> > problem.
> > 
> > Regards,
> > Daniel
> 
> But that's because we are fixing bugs.  If CPU X used to work on
> hardware Y in machine type A and stopped in machine type B, this is
> because we have determined that it's the right thing to do for the
> guests and the users. We don't break stuff just for fun.
> Why do you want to bring back the bugs we fixed?

I didn't take the time to count them, but I bet most of the commits I
listed in my previous e-mail message are not bug fixes, but new
features.

Also, it doesn't matter if the change is a bug fix or a new feature: if
it affects runnability of the VM, it has more consequences than a simple
guest-side ABI change, and libvirt can't tie it to the machine-type so
it needs another flag to enable it.
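
To make the runnability point concrete, here is a minimal sketch in Python.
It assumes a CPU model is just a set of feature flags and that a host can
run it only if it supports all of them. The model name, machine names and
feature names below are hypothetical placeholders, not real QEMU definitions.

```python
# Hypothetical feature names and model tables, for illustration only;
# they are not taken from QEMU's real CPU model definitions.
HOST_FEATURES = {"feat-a", "feat-b"}

# Same CPU model name, different feature set depending on the machine-type.
MODEL_BY_MACHINE = {
    ("ExampleCPU", "pc-old"): {"feat-a", "feat-b"},
    ("ExampleCPU", "pc-new"): {"feat-a", "feat-b", "feat-c"},
}


def is_runnable(model, machine, host_features):
    """A model is runnable only if the host supports every feature it enables."""
    return MODEL_BY_MACHINE[(model, machine)] <= host_features


def missing_features(model, machine, host_features):
    """Features the model enables that the host cannot provide."""
    return MODEL_BY_MACHINE[(model, machine)] - host_features
```

With these tables, moving the guest from "pc-old" to "pc-new" changes the
answer of is_runnable() on the same host, which is more than a guest-visible
ABI change.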

-- 
Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:25             ` Daniel P. Berrange
  2015-06-23 16:33               ` Michael S. Tsirkin
@ 2015-06-23 16:40               ` Andreas Färber
  2015-06-23 16:53                 ` Daniel P. Berrange
  1 sibling, 1 reply; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 16:40 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth,
	Eduardo Habkost

Am 23.06.2015 um 18:25 schrieb Daniel P. Berrange:
> Whether QEMU changed the CPU for existing machines, or only for new
> machines is actually not the core problem. Even if we only changed
> the CPU in new machines that would still be an unsatisfactory situation
> because we want to be able to access different versions of
> the CPU without the machine type changing, and access different versions
> of the machine type, without the CPU changing. IOW it is the fact that the
> changes in CPU are tied to changes in machine type that is the core
> problem.

This coupling is by design and we expect all KVM/QEMU users to adhere to
it, including those that use the libvirt tool (which I assume is going
to be the majority of KVM users). Either you want a certain
backwards-compatible machine and CPU, or you want the latest and
greatest - why in the world mix and match?!

Would a qemu64-2.3 model help here that pc*-2.3 could use? I believe
that's been proposed in the past. I don't oppose the idea of a
fully-custom CPU, but this blatant attempt at ignoring QEMU's CPU
versioning by libvirt worries me.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:33               ` Michael S. Tsirkin
  2015-06-23 16:38                 ` Eduardo Habkost
@ 2015-06-23 16:42                 ` Daniel P. Berrange
  2015-06-23 16:47                   ` Andreas Färber
  2015-06-23 21:23                   ` Michael S. Tsirkin
  1 sibling, 2 replies; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-23 16:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Andreas Färber,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
> > On Tue, Jun 23, 2015 at 06:15:51PM +0200, Andreas Färber wrote:
> > > Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
> > > > On Tue, Jun 23, 2015 at 05:32:42PM +0200, Michael S. Tsirkin wrote:
> > > >> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> > > >>> On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> > > >>>> Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > > >>>>>> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > > >>>>>> that will generate a config file that can be loaded using -readconfig, based on
> > > >>>>>> the -cpu and -machine options provided in the command-line.
> > > >>>>>
> > > >>>>> Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > > >>>>> configuration data to libvirt, but now I think it actually makes sense.
> > > >>>>> We already have a partial copy of CPU model definitions in libvirt
> > > >>>>> anyway, but as QEMU changes some CPU models in some machine types (and
> > > >>>>> libvirt does not do that) we have no real control over the guest CPU
> > > >>>>> configuration. While what we really want is full control to enforce
> > > >>>>> stable guest ABI.
> > > >>>>
> > > >>>> That sounds like FUD to me. Any concrete data points where QEMU does not
> > > >>>> have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > > >>>> for.
> > > >>>
> > > >>> What Jiri is saying is that the CPUs change depending on -machine, not
> > > >>> that the ABI is broken by a given machine.
> > > >>>
> > > >>> The problem here is that libvirt needs to provide CPU models whose
> > > >>> runnability does not depend on the machine-type. If users have a VM that
> > > >>> is running in a host and the VM machine-type changes,
> > > >>
> > > >> How does it change, and why?
> > > > 
> > > > Sometimes we add features to a CPU model because they were not emulated by KVM
> > > > and now they are. Sometimes we remove or add features or change other fields
> > > > because we are fixing previous mistakes. Recently we were going to remove
> > > > features from models because of an Intel CPU erratum, but then decided to create
> > > > a new CPU model name instead.
> > > > 
> > > > See some examples at the end of this message.
> > > > 
> > > >>
> > > >>> the VM should be
> > > >>> still runnable in that host. QEMU doesn't provide that, our CPU models
> > > >>> may change when we introduce new machine-types, so we are giving them a
> > > >>> mechanism that allows libvirt to implement the policy they need.
> > > >>
> > > >> I don't mind wrt CPU specifically, but we absolutely do change guest ABI
> > > >> in many ways when we change machine types.
> > > > 
> > > > All the other ABI changes we introduce in QEMU don't affect runnability of the
> > > > VM in a given host, that's the problem we are trying to address here. ABI
> > > > changes are expected when changing to a new machine, runnability changes
> > > > aren't.
> > > > 
> > > > 
> > > > Examples of commits changing CPU models:
> > > [snip]
> > > 
> > > I've always advocated remaining backwards-compatible and only making CPU
> > > model changes for new machines. You among others felt that was not
> > > always necessary, and now you're using the lack thereof as an argument
> > > to stop using QEMU's CPU models at all? That sounds convoluted...
> > 
> > Whether QEMU changed the CPU for existing machines, or only for new
> > machines is actually not the core problem. Even if we only changed
> > the CPU in new machines that would still be an unsatisfactory situation
> > because we want to be able to access different versions of
> > the CPU without the machine type changing, and access different versions
> > of the machine type, without the CPU changing. IOW it is the fact that the
> > changes in CPU are tied to changes in machine type that is the core
> > problem.
> 
> But that's because we are fixing bugs.  If CPU X used to work on
> hardware Y in machine type A and stopped in machine type B, this is
> because we have determined that it's the right thing to do for the
> guests and the users. We don't break stuff just for fun.
> Why do you want to bring back the bugs we fixed?

Huh, I never said we wanted to bring back bugs. This is about allowing
libvirt to fix the CPU bugs in a way that is independent of the machine
types and portable across hypervisors we deal with. We're absolutely
still going to fix CPU model bugs and ensure stable guest ABI.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:38                 ` Eduardo Habkost
@ 2015-06-23 16:44                   ` Andreas Färber
  2015-06-23 17:08                     ` Eduardo Habkost
  0 siblings, 1 reply; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 16:44 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

Am 23.06.2015 um 18:38 schrieb Eduardo Habkost:
> On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
>> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
>>> Whether QEMU changed the CPU for existing machines, or only for new
>>> machines is actually not the core problem. Even if we only changed
>>> the CPU in new machines that would still be an unsatisfactory situation
>>> because we want to be able to access different versions of
>>> the CPU without the machine type changing, and access different versions
>>> of the machine type, without the CPU changing. IOW it is the fact that the
>>> changes in CPU are tied to changes in machine type that is the core
>>> problem.
>>
>> But that's because we are fixing bugs.  If CPU X used to work on
>> hardware Y in machine type A and stopped in machine type B, this is
>> because we have determined that it's the right thing to do for the
>> guests and the users. We don't break stuff just for fun.
>> Why do you want to bring back the bugs we fixed?
> 
> I didn't take the time to count them, but I bet most of the commits I
> listed in my previous e-mail message are not bug fixes, but new
> features.

Huh? Of course the latest machine model gets new features. The point is
that the previous ones don't and that's what we are providing them for -
libvirt is expected to choose one machine and the contract with QEMU is
that for that machine the CPU does *not* grow new features, and we're
going to great lengths to achieve that. So this thread feels more and
more weird...

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:42                 ` Daniel P. Berrange
@ 2015-06-23 16:47                   ` Andreas Färber
  2015-06-23 17:11                     ` Eduardo Habkost
  2015-06-23 17:13                     ` [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models Daniel P. Berrange
  2015-06-23 21:23                   ` Michael S. Tsirkin
  1 sibling, 2 replies; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 16:47 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth,
	Eduardo Habkost

Am 23.06.2015 um 18:42 schrieb Daniel P. Berrange:
> On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
>> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
>>> On Tue, Jun 23, 2015 at 06:15:51PM +0200, Andreas Färber wrote:
>>>> Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
>>>>> On Tue, Jun 23, 2015 at 05:32:42PM +0200, Michael S. Tsirkin wrote:
>>>>>> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
>>>>>>> On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
>>>>>>>> Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
>>>>>>>>>> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
>>>>>>>>>> that will generate a config file that can be loaded using -readconfig, based on
>>>>>>>>>> the -cpu and -machine options provided in the command-line.
>>>>>>>>>
>>>>>>>>> Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
>>>>>>>>> configuration data to libvirt, but now I think it actually makes sense.
>>>>>>>>> We already have a partial copy of CPU model definitions in libvirt
>>>>>>>>> anyway, but as QEMU changes some CPU models in some machine types (and
>>>>>>>>> libvirt does not do that) we have no real control over the guest CPU
>>>>>>>>> configuration. While what we really want is full control to enforce
>>>>>>>>> stable guest ABI.
>>>>>>>>
>>>>>>>> That sounds like FUD to me. Any concrete data points where QEMU does not
>>>>>>>> have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
>>>>>>>> for.
>>>>>>>
>>>>>>> What Jiri is saying is that the CPUs change depending on -machine, not
>>>>>>> that the ABI is broken by a given machine.
>>>>>>>
>>>>>>> The problem here is that libvirt needs to provide CPU models whose
>>>>>>> runnability does not depend on the machine-type. If users have a VM that
>>>>>>> is running in a host and the VM machine-type changes,
>>>>>>
>>>>>> How does it change, and why?
>>>>>
>>>>> Sometimes we add features to a CPU model because they were not emulated by KVM
>>>>> and now they are. Sometimes we remove or add features or change other fields
>>>>> because we are fixing previous mistakes. Recently we were going to remove
>>>>> features from models because of an Intel CPU erratum, but then decided to create
>>>>> a new CPU model name instead.
>>>>>
>>>>> See some examples at the end of this message.
>>>>>
>>>>>>
>>>>>>> the VM should be
>>>>>>> still runnable in that host. QEMU doesn't provide that, our CPU models
>>>>>>> may change when we introduce new machine-types, so we are giving them a
>>>>>>> mechanism that allows libvirt to implement the policy they need.
>>>>>>
>>>>>> I don't mind wrt CPU specifically, but we absolutely do change guest ABI
>>>>>> in many ways when we change machine types.
>>>>>
>>>>> All the other ABI changes we introduce in QEMU don't affect runnability of the
>>>>> VM in a given host, that's the problem we are trying to address here. ABI
>>>>> changes are expected when changing to a new machine, runnability changes
>>>>> aren't.
>>>>>
>>>>>
>>>>> Examples of commits changing CPU models:
>>>> [snip]
>>>>
>>>> I've always advocated remaining backwards-compatible and only making CPU
>>>> model changes for new machines. You among others felt that was not
>>>> always necessary, and now you're using the lack thereof as an argument
>>>> to stop using QEMU's CPU models at all? That sounds convoluted...
>>>
>>> Whether QEMU changed the CPU for existing machines, or only for new
>>> machines is actually not the core problem. Even if we only changed
>>> the CPU in new machines that would still be an unsatisfactory situation
>>> because we want to be able to access different versions of
>>> the CPU without the machine type changing, and access different versions
>>> of the machine type, without the CPU changing. IOW it is the fact that the
>>> changes in CPU are tied to changes in machine type that is the core
>>> problem.
>>
>> But that's because we are fixing bugs.  If CPU X used to work on
>> hardware Y in machine type A and stopped in machine type B, this is
>> because we have determined that it's the right thing to do for the
>> guests and the users. We don't break stuff just for fun.
>> Why do you want to bring back the bugs we fixed?
> 
> Huh, I never said we wanted to bring back bugs. This is about allowing
> libvirt to fix the CPU bugs in a way that is independent of the machine
> types and portable across hypervisors we deal with. We're absolutely
> still going to fix CPU model bugs and ensure stable guest ABI.

No, that's contradictory! Through the -x.y machines we leave bugs in the
old models *exactly* to ensure a stable guest ABI. Fixes are only
applied to new machines, thus I'm pointing out that you should not use a
new CPU model with an old machine type.

Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:40               ` Andreas Färber
@ 2015-06-23 16:53                 ` Daniel P. Berrange
  2015-06-23 17:10                   ` Andreas Färber
  0 siblings, 1 reply; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-23 16:53 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 06:40:19PM +0200, Andreas Färber wrote:
> Am 23.06.2015 um 18:25 schrieb Daniel P. Berrange:
> > Whether QEMU changed the CPU for existing machines, or only for new
> > machines is actually not the core problem. Even if we only changed
> > the CPU in new machines that would still be an unsatisfactory situation
> > because we want to be able to access different versions of
> > the CPU without the machine type changing, and access different versions
> > of the machine type, without the CPU changing. IOW it is the fact that the
> > changes in CPU are tied to changes in machine type that is the core
> > problem.
> 
> This coupling is by design and we expect all KVM/QEMU users to adhere to
> it, including those that use the libvirt tool (which I assume is going
> to be the majority of KVM users). Either you want a certain
> backwards-compatible machine and CPU, or you want the latest and
> greatest - why in the world mix and match?!

As mentioned, changes/fixes to the CPU model can affect the ability to
launch the guest on a particular host, so we need the ability to control
when those CPU changes are activated for a guest, separately from the
machine type.

> Would a qemu64-2.3 model help here that pc*-2.3 could use? I believe
> that's been proposed in the past. I don't oppose the idea of a
> fully-custom CPU, but this blatant attempt at ignoring QEMU's CPU
> versioning by libvirt worries me.

That is still tying CPU model names to machine type versions, unless
I'm misunderstanding you. In general, having the available QEMU models
vary depending on the QEMU version creates problems for apps
higher up the stack. By allowing libvirt to define the CPU model
policy, it can also provide a more consistent interface to applications
above, such as OpenStack, which will facilitate work in the scheduler
when it picks hosts capable of deploying/migrating a VM, when there
are heterogeneous QEMU versions across the hosts.
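
As a rough illustration of the scheduler case, the following Python sketch
filters hosts by an explicit feature list instead of by a versioned model
name. The host inventory, feature names and the candidate_hosts() helper are
all hypothetical; this is not how OpenStack or libvirt actually implement it.

```python
# Hypothetical host inventory; names, versions and features are placeholders.
HOSTS = {
    "host1": {"qemu": "2.2", "features": {"feat-a", "feat-b"}},
    "host2": {"qemu": "2.3", "features": {"feat-a", "feat-b", "feat-c"}},
    "host3": {"qemu": "2.3", "features": {"feat-a"}},
}


def candidate_hosts(required_features, hosts):
    """Return the sorted names of hosts that provide every required feature.

    The QEMU version is deliberately not consulted: the guest CPU is
    described by an explicit feature list, not by a versioned model name.
    """
    return sorted(
        name
        for name, info in hosts.items()
        if required_features <= info["features"]
    )
```

A guest that needs {"feat-a", "feat-b"} can then be placed on host1 or host2
even though they run different QEMU versions.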

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:32             ` Eduardo Habkost
@ 2015-06-23 17:01               ` Andreas Färber
  0 siblings, 0 replies; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 17:01 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

Am 23.06.2015 um 18:32 schrieb Eduardo Habkost:
> On Tue, Jun 23, 2015 at 06:15:51PM +0200, Andreas Färber wrote:
>> Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
>>> On Tue, Jun 23, 2015 at 05:32:42PM +0200, Michael S. Tsirkin wrote:
>>>> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
>>>>> On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
>>>>>> Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
>>>>>>>> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
>>>>>>>> that will generate a config file that can be loaded using -readconfig, based on
>>>>>>>> the -cpu and -machine options provided in the command-line.
>>>>>>>
>>>>>>> Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
>>>>>>> configuration data to libvirt, but now I think it actually makes sense.
>>>>>>> We already have a partial copy of CPU model definitions in libvirt
>>>>>>> anyway, but as QEMU changes some CPU models in some machine types (and
>>>>>>> libvirt does not do that) we have no real control over the guest CPU
>>>>>>> configuration. While what we really want is full control to enforce
>>>>>>> stable guest ABI.
>>>>>>
>>>>>> That sounds like FUD to me. Any concrete data points where QEMU does not
>>>>>> have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
>>>>>> for.
>>>>>
>>>>> What Jiri is saying is that the CPUs change depending on -machine, not
>>>>> that the ABI is broken by a given machine.
>>>>>
>>>>> The problem here is that libvirt needs to provide CPU models whose
>>>>> runnability does not depend on the machine-type. If users have a VM that
>>>>> is running in a host and the VM machine-type changes,
>>>>
>>>> How does it change, and why?
>>>
>>> Sometimes we add features to a CPU model because they were not emulated by KVM
>>> and now they are. Sometimes we remove or add features or change other fields
>>> because we are fixing previous mistakes. Recently we were going to remove
>>> features from models because of an Intel CPU erratum, but then decided to create
>>> a new CPU model name instead.
>>>
>>> See some examples at the end of this message.
>>>
>>>>
>>>>> the VM should be
>>>>> still runnable in that host. QEMU doesn't provide that, our CPU models
>>>>> may change when we introduce new machine-types, so we are giving them a
>>>>> mechanism that allows libvirt to implement the policy they need.
>>>>
>>>> I don't mind wrt CPU specifically, but we absolutely do change guest ABI
>>>> in many ways when we change machine types.
>>>
>>> All the other ABI changes we introduce in QEMU don't affect runnability of the
>>> VM in a given host, that's the problem we are trying to address here. ABI
>>> changes are expected when changing to a new machine, runnability changes
>>> aren't.
>>>
>>>
>>> Examples of commits changing CPU models:
>> [snip]
>>
>> I've always advocated remaining backwards-compatible and only making CPU
>> model changes for new machines. You among others felt that was not
>> always necessary, and now you're using the lack thereof as an argument
>> to stop using QEMU's CPU models at all? That sounds convoluted...
>>
> 
> Uh? I don't remember anybody suggesting changing CPU models on existing
> machines. We always tried to keep existing machines compatible.

Yes, we try in general. And in a few cases I was overruled, possibly
related to TCG feature filtering or something. Thought that was the
problem here - apparently not. Explanations seem to be the culprit here!

>> BTW your list does not answer my question. You would need examples where
>> a CPU model changes between machines, and I am not aware of any example
>> beyond the intentional -x.y variations. There are differences between
>> KVM and TCG though, did you mean that? i440fx and q35 should be
>> identical and isa-pc, too, and none anyway. None of this has anything to
>> do with the host CPU.
> 
> We are talking about the -x.y variations (that, yes, are intentional).
> But the fact that CPU features change (even the intentional ones in the
> -x.y machine variations) affects runnability of VMs (because enabling new
> CPU features in KVM requires them to be supported by the host kernel code
> and by the host CPU).
> 
> I was not thinking about the KVM and TCG differences, but this may also
> help libvirt deal with the KVM and TCG differences if necessary.
> 
> I don't know what you mean by "i440fx and q35 should be identical"
> above.

In this thread there was a claim that CPU models varied between machine
types. I am saying that there should be no CPU model differences between
pc-i440fx-2.3 and pc-q35-2.3 etc. Thus the CPU model is not tied to one
machine, but to the version of QEMU, with -x.y matching the
corresponding release. No news to you, I would hope?

Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:44                   ` Andreas Färber
@ 2015-06-23 17:08                     ` Eduardo Habkost
  2015-06-23 17:18                       ` Andreas Färber
  0 siblings, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 17:08 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

On Tue, Jun 23, 2015 at 06:44:57PM +0200, Andreas Färber wrote:
> Am 23.06.2015 um 18:38 schrieb Eduardo Habkost:
> > On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
> >> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
> >>> Whether QEMU changed the CPU for existing machines, or only for new
> >>> machines is actually not the core problem. Even if we only changed
> >>> the CPU in new machines that would still be an unsatisfactory situation
> >>> because we want to be able to access different versions of
> >>> the CPU without the machine type changing, and access different versions
> >>> of the machine type, without the CPU changing. IOW it is the fact that the
> >>> changes in CPU are tied to changes in machine type that is the core
> >>> problem.
> >>
> >> But that's because we are fixing bugs.  If CPU X used to work on
> >> hardware Y in machine type A and stopped in machine type B, this is
> >> because we have determined that it's the right thing to do for the
> >> guests and the users. We don't break stuff just for fun.
> >> Why do you want to bring back the bugs we fixed?
> > 
> > I didn't take the time to count them, but I bet most of the commits I
> > listed in my previous e-mail message are not bug fixes, but new
> > features.
> 
> Huh? Of course the latest machine model gets new features. The point is
> that the previous ones don't and that's what we are providing them for -
> libvirt is expected to choose one machine and the contract with QEMU is
> that for that machine the CPU does *not* grow new features, and we're
> going to great lengths to achieve that. So this thread feels more and
> more weird...

We are not talking about changes to existing machines. We are talking
about having changes introduced in new machines (the ones we made on
purpose) affecting the runnability of the VM.

-- 
Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:53                 ` Daniel P. Berrange
@ 2015-06-23 17:10                   ` Andreas Färber
  2015-06-23 17:24                     ` Eduardo Habkost
  0 siblings, 1 reply; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 17:10 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth,
	Eduardo Habkost

Am 23.06.2015 um 18:53 schrieb Daniel P. Berrange:
> On Tue, Jun 23, 2015 at 06:40:19PM +0200, Andreas Färber wrote:
>> Am 23.06.2015 um 18:25 schrieb Daniel P. Berrange:
>>> Whether QEMU changed the CPU for existing machines, or only for new
>>> machines is actually not the core problem. Even if we only changed
>>> the CPU in new machines that would still be an unsatisfactory situation
>>> because we want to be able to access different versions of
>>> the CPU without the machine type changing, and access different versions
>>> of the machine type, without the CPU changing. IOW it is the fact that the
>>> changes in CPU are tied to changes in machine type that is the core
>>> problem.
>>
>> This coupling is by design and we expect all KVM/QEMU users to adhere to
>> it, including those that use the libvirt tool (which I assume is going
>> to be the majority of KVM users). Either you want a certain
>> backwards-compatible machine and CPU, or you want the latest and
>> greatest - why in the world mix and match?!
> 
> As mentioned, changes/fixes to the CPU model can affect the ability to
> launch the guest on a particular host, so we need the ability to control
> when those CPU changes are activated for a guest, separately from the
> machine type.

Why? Today's libvirt with QEMU 2.3 resolves "pc" machine to
"pc-i440fx-2.3" and the guest XML stays that way. When we add new
features for 2.4, 2.3 is guaranteed to stay compatible. Any change would
involve the libvirt user actively switching from pc-i440fx-2.3 to a
different machine such as upcoming pc-i440fx-2.4. Why do you need to
change the CPU separately? Why would a user want to run 2.2's CPU with a
2.3 machine? Or a 2.3 machine with a 2.4 CPU? That's nonsense. If you
want to tweak features, you already have command line options available
to do so on the basis of what the selected machine provides.
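
The alias behaviour described above can be sketched as follows in Python.
The machine names come from this message; the ALIASES table layout and the
two helper functions are hypothetical and only model the idea that the alias
is resolved once and the versioned name is what gets stored.

```python
# "pc" resolves to the newest versioned machine known to each QEMU release.
# The 2.4 entry is the "upcoming" machine mentioned above; the table layout
# itself is a hypothetical simplification.
ALIASES = {
    "2.3": {"pc": "pc-i440fx-2.3"},
    "2.4": {"pc": "pc-i440fx-2.4"},
}


def define_guest(requested_machine, qemu_version):
    """Resolve the alias once, at definition time, and store the result."""
    resolved = ALIASES[qemu_version].get(requested_machine, requested_machine)
    return {"machine": resolved}


def machine_after_upgrade(guest_config, new_qemu_version):
    """Look up the stored name again after a QEMU upgrade.

    A versioned name is not an alias, so the lookup leaves it unchanged.
    """
    stored = guest_config["machine"]
    return ALIASES[new_qemu_version].get(stored, stored)
```

A guest defined with "pc" under QEMU 2.3 keeps "pc-i440fx-2.3" after the
host moves to 2.4; only an explicit edit of the stored name changes it.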

>> Would a qemu64-2.3 model help here that pc*-2.3 could use? I believe
>> that's been proposed in the past. I don't oppose the idea of a
>> fully-custom CPU, but this blatant attempt at ignoring QEMU's CPU
>> versioning by libvirt worries me.
> 
> That is still tying CPU model names to machine type versions, unless
> I'm misunderstanding you. In general, having the available QEMU models
> vary depending on the QEMU version creates problems for apps
> higher up the stack. By allowing libvirt to define the CPU model
> policy, it can also provide a more consistent interface to applications
> above, such as OpenStack, which will facilitate work in the scheduler
> when it picks hosts capable of deploying/migrating a VM, when there
> are heterogeneous QEMU versions across the hosts.

If you have heterogeneous QEMUs across hosts, then you need a
common-denominator machine anyway because QEMU wouldn't start with a
machine it doesn't know.

Please give one concrete example of a real-world problem instead of
hypothetical abstract combinations users may or may not be able to
construct.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:47                   ` Andreas Färber
@ 2015-06-23 17:11                     ` Eduardo Habkost
  2015-06-23 21:34                       ` Michael S. Tsirkin
  2015-06-23 17:13                     ` [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models Daniel P. Berrange
  1 sibling, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 17:11 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

On Tue, Jun 23, 2015 at 06:47:16PM +0200, Andreas Färber wrote:
> Am 23.06.2015 um 18:42 schrieb Daniel P. Berrange:
> > On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
> >> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
> >>> On Tue, Jun 23, 2015 at 06:15:51PM +0200, Andreas Färber wrote:
> >>>> Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
> >>>>> On Tue, Jun 23, 2015 at 05:32:42PM +0200, Michael S. Tsirkin wrote:
> >>>>>> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> >>>>>>> On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> >>>>>>>> Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> >>>>>>>>>> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> >>>>>>>>>> that will generate a config file that can be loaded using -readconfig, based on
> >>>>>>>>>> the -cpu and -machine options provided in the command-line.
> >>>>>>>>>
> >>>>>>>>> Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> >>>>>>>>> configuration data to libvirt, but now I think it actually makes sense.
> >>>>>>>>> We already have a partial copy of CPU model definitions in libvirt
> >>>>>>>>> anyway, but as QEMU changes some CPU models in some machine types (and
> >>>>>>>>> libvirt does not do that) we have no real control over the guest CPU
> >>>>>>>>> configuration. While what we really want is full control to enforce
> >>>>>>>>> stable guest ABI.
> >>>>>>>>
> >>>>>>>> That sounds like FUD to me. Any concrete data points where QEMU does not
> >>>>>>>> have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> >>>>>>>> for.
> >>>>>>>
> >>>>>>> What Jiri is saying is that the CPUs change depending on -machine, not
> >>>>>>> that the ABI is broken by a given machine.
> >>>>>>>
> >>>>>>> The problem here is that libvirt needs to provide CPU models whose
> >>>>>>> runnability does not depend on the machine-type. If users have a VM that
> >>>>>>> is running in a host and the VM machine-type changes,
> >>>>>>
> >>>>>> How does it change, and why?
> >>>>>
> >>>>> Sometimes we add features to a CPU model because they were not emulated by KVM
> >>>>> and now they are. Sometimes we remove or add features or change other fields
> >>>>> because we are fixing previous mistakes. Recently we were going to remove
> >>>>> features from models because of an Intel CPU erratum, but then decided to create
> >>>>> a new CPU model name instead.
> >>>>>
> >>>>> See some examples at the end of this message.
> >>>>>
> >>>>>>
> >>>>>>> the VM should be
> >>>>>>> still runnable in that host. QEMU doesn't provide that, our CPU models
> >>>>>>> may change when we introduce new machine-types, so we are giving them a
> >>>>>>> mechanism that allows libvirt to implement the policy they need.
> >>>>>>
> >>>>>> I don't mind wrt CPU specifically, but we absolutely do change guest ABI
> >>>>>> in many ways when we change machine types.
> >>>>>
> >>>>> All the other ABI changes we introduce in QEMU don't affect runnability of the
> >>>>> VM in a given host, that's the problem we are trying to address here. ABI
> >>>>> changes are expected when changing to a new machine, runnability changes
> >>>>> aren't.
> >>>>>
> >>>>>
> >>>>> Examples of commits changing CPU models:
> >>>> [snip]
> >>>>
> >>>> I've always advocated remaining backwards-compatible and only making CPU
> >>>> model changes for new machines. You among others felt that was not
> >>>> always necessary, and now you're using the lack thereof as an argument
> >>>> to stop using QEMU's CPU models at all? That sounds convoluted...
> >>>
> >>> Whether QEMU changed the CPU for existing machines, or only for new
> >>> machines is actually not the core problem. Even if we only changed
> >>> the CPU in new machines that would still be an unsatisfactory situation
> >>> because we want to be able to access different versions of
> >>> the CPU without the machine type changing, and access different versions
> >>> of the machine type, without the CPU changing. IOW it is the fact that the
> >>> changes in CPU are tied to changes in machine type that is the core
> >>> problem.
> >>
> >> But that's because we are fixing bugs.  If CPU X used to work on
> >> hardware Y in machine type A and stopped in machine type B, this is
> >> because we have determined that it's the right thing to do for the
> >> guests and the users. We don't break stuff just for fun.
> >> Why do you want to bring back the bugs we fixed?
> > 
> > Huh, I never said we wanted to bring back bugs. This is about allowing
> > libvirt to fix the CPU bugs in a way that is independent of the machine
> > types and portable across hypervisors we deal with. We're absolutely
> > still going to fix CPU model bugs and ensure stable guest ABI.
> 
> No, that's contradictory! Through the -x.y machines we leave bugs in the
> old models *exactly* to assure a stable guest ABI. Fixes are only
> applied to new machines, thus I'm pointing out that you should not use a
> new CPU model with an old machine type.

They don't need to use a new model with an old machine-type (although I
don't see a reason to prevent that). They need to be able to change the
machine-type to a new one without getting any changes that would make
the VM not runnable in the same host. Even if it is a bug fix. If it is
a change that can make the VM unrunnable, it needs to be controlled by a
separate flag, not by the machine-type.

-- 
Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:47                   ` Andreas Färber
  2015-06-23 17:11                     ` Eduardo Habkost
@ 2015-06-23 17:13                     ` Daniel P. Berrange
  2015-06-23 17:29                       ` Andreas Färber
  2015-06-23 21:26                       ` Michael S. Tsirkin
  1 sibling, 2 replies; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-23 17:13 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 06:47:16PM +0200, Andreas Färber wrote:
> Am 23.06.2015 um 18:42 schrieb Daniel P. Berrange:
> > On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
> >> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
> >>> On Tue, Jun 23, 2015 at 06:15:51PM +0200, Andreas Färber wrote:
> >>>> Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
> >>>> I've always advocated remaining backwards-compatible and only making CPU
> >>>> model changes for new machines. You among others felt that was not
> >>>> always necessary, and now you're using the lack thereof as an argument
> >>>> to stop using QEMU's CPU models at all? That sounds convoluted...
> >>>
> >>> Whether QEMU changed the CPU for existing machines, or only for new
> >>> machines is actually not the core problem. Even if we only changed
> >>> the CPU in new machines that would still be an unsatisfactory situation
> >>> because we want to be able to access different versions of
> >>> the CPU without the machine type changing, and access different versions
> >>> of the machine type, without the CPU changing. IOW it is the fact that the
> >>> changes in CPU are tied to changes in machine type that is the core
> >>> problem.
> >>
> >> But that's because we are fixing bugs.  If CPU X used to work on
> >> hardware Y in machine type A and stopped in machine type B, this is
> >> because we have determined that it's the right thing to do for the
> >> guests and the users. We don't break stuff just for fun.
> >> Why do you want to bring back the bugs we fixed?
> > 
> > Huh, I never said we wanted to bring back bugs. This is about allowing
> > libvirt to fix the CPU bugs in a way that is independent of the machine
> > types and portable across hypervisors we deal with. We're absolutely
> > still going to fix CPU model bugs and ensure stable guest ABI.
> 
> No, that's contradictory! Through the -x.y machines we leave bugs in the
> old models *exactly* to assure a stable guest ABI. Fixes are only
> applied to new machines, thus I'm pointing out that you should not use a
> new CPU model with an old machine type.

I'm not saying that libvirt would ever allow a silent guest ABI change.
Given a libvirt XML config, the guest ABI will never change without an
explicit action on the part of the app/user to change the XML.

This is all about dealing with the case where the app / user consciously
needs/wants to opt in to a guest ABI change for the guest, e.g. they wish
to make use of some bug fix or feature improvement in the new machine
type, but they do *not* wish to have the CPU model changed, because
of some CPU model change that is incompatible with their hosts' CPUs.
Conversely, they may wish to get access to a new CPU model, but not
wish to have the rest of the guest ABI change. In both cases the user
is explicitly opting in to the ABI change with knowledge about what
this might mean for the guest OS. Currently we are tying users'
hands by forcing CPU and machine types to change in lockstep.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:08                     ` Eduardo Habkost
@ 2015-06-23 17:18                       ` Andreas Färber
  2015-06-23 17:27                         ` Daniel P. Berrange
  2015-06-23 17:39                         ` Eduardo Habkost
  0 siblings, 2 replies; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 17:18 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

Am 23.06.2015 um 19:08 schrieb Eduardo Habkost:
> On Tue, Jun 23, 2015 at 06:44:57PM +0200, Andreas Färber wrote:
>> Am 23.06.2015 um 18:38 schrieb Eduardo Habkost:
>>> On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
>>>> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
>>>>> Whether QEMU changed the CPU for existing machines, or only for new
>>>>> machines is actually not the core problem. Even if we only changed
>>>>> the CPU in new machines that would still be an unsatisfactory situation
>>>>> because we want to be able to access different versions of
>>>>> the CPU without the machine type changing, and access different versions
>>>>> of the machine type, without the CPU changing. IOW it is the fact that the
>>>>> changes in CPU are tied to changes in machine type that is the core
>>>>> problem.
>>>>
>>>> But that's because we are fixing bugs.  If CPU X used to work on
>>>> hardware Y in machine type A and stopped in machine type B, this is
>>>> because we have determined that it's the right thing to do for the
>>>> guests and the users. We don't break stuff just for fun.
>>>> Why do you want to bring back the bugs we fixed?
>>>
>>> I didn't take the time to count them, but I bet most of the commits I
>>> listed on my previous e-mail message are not bug fixes, but new
>>> features.
>>
>> Huh? Of course the latest machine model gets new features. The point is
>> that the previous ones don't and that's what we are providing them for -
>> libvirt is expected to choose one machine and the contract with QEMU is
>> that for that machine the CPU does *not* grow new features, and we're
>> going to great lengths to achieve that. So this thread feels more and
>> more weird...
> 
> We are not talking about changes to existing machines. We are talking
> about having changes introduced in new machines (the ones we made on
> purpose) affecting the runnability of the VM.

You are talking abstract!


Example 1:

Point A: Machine pc-i440fx-2.3 exists

Runs or runs not.

Point B: Machine pc-i440fx-2.3 still exists

Still runs or runs not due to guest ABI stability rules.


Example 2:

Point A: pc-i440fx-2.4 does not exist in 2.3

Does not run because it doesn't exist.

Point B: New pc-i440fx-2.4

Runs or does not run, and if so has more features than pc-i440fx-2.3.

There is no runnability problem - either it runs or it doesn't, but
there's no change over time.

This is what the machine -x.y versioning is all about.

Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:10                   ` Andreas Färber
@ 2015-06-23 17:24                     ` Eduardo Habkost
  2015-06-23 17:31                       ` Daniel P. Berrange
  0 siblings, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 17:24 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

On Tue, Jun 23, 2015 at 07:10:11PM +0200, Andreas Färber wrote:
> Am 23.06.2015 um 18:53 schrieb Daniel P. Berrange:
> > On Tue, Jun 23, 2015 at 06:40:19PM +0200, Andreas Färber wrote:
> >> Am 23.06.2015 um 18:25 schrieb Daniel P. Berrange:
> >>> Whether QEMU changed the CPU for existing machines, or only for new
> >>> machines is actually not the core problem. Even if we only changed
> >>> the CPU in new machines that would still be an unsatisfactory situation
> >>> because we want to be able to access different versions of
> >>> the CPU without the machine type changing, and access different versions
> >>> of the machine type, without the CPU changing. IOW it is the fact that the
> >>> changes in CPU are tied to changes in machine type that is the core
> >>> problem.
> >>
> >> This coupling is by design and we expect all KVM/QEMU users to adhere to
> >> it, including those that use the libvirt tool (which I assume is going
> >> to be the majority of KVM users). Either you want a certain
> >> backwards-compatible machine and CPU, or you want the latest and
> >> greatest - why in the world mix and match?!
> > 
> > As mentioned, changes/fixes to the CPU model can affect the ability to
> > launch the guest on a particular host, so we need the ability to control
> > when those CPU changes are activated for a guest, separately from the
> > machine type.
> 
> Why? Today's libvirt with QEMU 2.3 resolves "pc" machine to
> "pc-i440fx-2.3" and the guest XML stays that way. When we add new
> features for 2.4, 2.3 is guaranteed to stay compatible. Any change would
> involve the libvirt user actively switching from pc-i440fx-2.3 to a
> different machine such as upcoming pc-i440fx-2.4. Why do you need to
> change the CPU separately? Why would a user want to run 2.2's CPU with a
> 2.3 machine? Or a 2.3 machine with a 2.4 CPU? That's nonsense. If you
> want to tweak features, you already have command line options available
> to do so on the basis of what the selected machine provides.

Because pure guest-side ABI changes are different from changes that also
have additional host-side requirements, so we want to untie both things.

About being able to tweak features today, that's true: we have
command-line options for most stuff and that's _almost_ enough for what
libvirt needs. What's missing is something to avoid silently getting new
features that libvirt isn't aware of (but that may make the VM unrunnable).
The purpose of "-cpu custom" is to ensure no new host side feature
requirement will be introduced silently, even if choosing a different
machine.
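
As a rough illustration of that intent (this is not QEMU or libvirt code;
the table, the model name "ExampleModel" and the flags "feat-a"/"feat-b"
are all made up), a management layer could expand a named model into an
explicit feature list once, and then compare that pinned list against what
the same name would mean under a different machine:

```python
# Hypothetical table: (model name, machine type) -> default features.
# "ExampleModel", "feat-a" and "feat-b" are made-up names.
MODEL_TABLE = {
    ("ExampleModel", "pc-i440fx-2.3"): {"feat-a"},
    ("ExampleModel", "pc-i440fx-2.4"): {"feat-a", "feat-b"},
}


def named_model_features(model, machine):
    """Look up a named model; the result may differ between machine types."""
    return set(MODEL_TABLE[(model, machine)])


def pin_features(model, machine):
    """Expand a named model once into an explicit, frozen feature list."""
    return frozenset(named_model_features(model, machine))


def silently_added(pinned, model, new_machine):
    """Features the named model would add on new_machine beyond the pinned list."""
    return named_model_features(model, new_machine) - pinned


pinned = pin_features("ExampleModel", "pc-i440fx-2.3")
added = silently_added(pinned, "ExampleModel", "pc-i440fx-2.4")
print(sorted(pinned))  # ['feat-a']
print(sorted(added))   # ['feat-b']
```

With the pinned list, the extra requirement ("feat-b" here) only appears if
somebody asks for it explicitly; it never arrives as a side effect of the
machine change.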

-- 
Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:18                       ` Andreas Färber
@ 2015-06-23 17:27                         ` Daniel P. Berrange
  2015-06-23 17:41                           ` Andreas Färber
  2015-06-23 17:39                         ` Eduardo Habkost
  1 sibling, 1 reply; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-23 17:27 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 07:18:06PM +0200, Andreas Färber wrote:
> Am 23.06.2015 um 19:08 schrieb Eduardo Habkost:
> > On Tue, Jun 23, 2015 at 06:44:57PM +0200, Andreas Färber wrote:
> >> Am 23.06.2015 um 18:38 schrieb Eduardo Habkost:
> >>> On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
> >>>> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
> >>>>> Whether QEMU changed the CPU for existing machines, or only for new
> >>>>> machines is actually not the core problem. Even if we only changed
> >>>>> the CPU in new machines that would still be an unsatisfactory situation
> >>>>> because we want to be able to access different versions of
> >>>>> the CPU without the machine type changing, and access different versions
> >>>>> of the machine type, without the CPU changing. IOW it is the fact that the
> >>>>> changes in CPU are tied to changes in machine type that is the core
> >>>>> problem.
> >>>>
> >>>> But that's because we are fixing bugs.  If CPU X used to work on
> >>>> hardware Y in machine type A and stopped in machine type B, this is
> >>>> because we have determined that it's the right thing to do for the
> >>>> guests and the users. We don't break stuff just for fun.
> >>>> Why do you want to bring back the bugs we fixed?
> >>>
> >>> I didn't take the time to count them, but I bet most of the commits I
> >>> listed on my previous e-mail message are not bug fixes, but new
> >>> features.
> >>
> >> Huh? Of course the latest machine model gets new features. The point is
> >> that the previous ones don't and that's what we are providing them for -
> >> libvirt is expected to choose one machine and the contract with QEMU is
> >> that for that machine the CPU does *not* grow new features, and we're
> >> going to great lengths to achieve that. So this thread feels more and
> >> more weird...
> > 
> > We are not talking about changes to existing machines. We are talking
> > about having changes introduced in new machines (the ones we made on
> > purpose) affecting the runnability of the VM.
> 
> You are talking abstract!
> 
> 
> Example 1:
> 
> Point A: Machine pc-i440fx-2.3 exists
> 
> Runs or runs not.
> 
> Point B: Machine pc-i440fx-2.3 still exists
> 
> Still runs or runs not due to guest ABI stability rules.
> 
> 
> Example 2:
> 
> Point A: pc-i440fx-2.4 does not exist in 2.3
> 
> Does not run because it doesn't exist.
> 
> Point B: New pc-i440fx-2.4
> 
> Runs or does not run, and if so has more features than pc-i440fx-2.3.
> 
> There is no runnability problem - either it runs or it doesn't, but
> there's no change over time.
> 
> This is what the machine -x.y versioning is all about.

Consider a host currently running QEMU 2.3 with machine type
pc-i440fx-2.3 used with SandyBridge.

Now consider that the definition of SandyBridge was buggy, and so in QEMU
2.4 we add the missing CPU feature flag 'wizz', and only
enable that new feature flag with pc-i440fx-2.4.

Now consider there was a bug in the virtio-scsi driver that
we also fixed in QEMU 2.4 and thus pc-i440fx-2.4 includes
that fix.

Updating from pc-i440fx-2.3 to pc-i440fx-2.4 has a dependency
on the host CPU including the 'wizz' flag that was added. This
new CPU feature can prevent the user from using the new machine
type to get the virtio-scsi bug fix.

Separately versioning CPU models still lets us preserve a stable
guest ABI by default, but allows more flexibility when choosing
to opt out of the ABI for a particular upgrade.
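
A minimal sketch of the scenario above, assuming a hypothetical scheme in
which each CPU model carries its own version number ('wizz' is the made-up
flag from the example; the version numbers, the other flag names and the
helper functions are illustrative only, not an existing QEMU or libvirt
interface):

```python
# Hypothetical data: 'wizz' is the made-up flag from the example above.
SANDYBRIDGE_VERSIONS = {
    1: {"sse4.2", "avx"},          # definition as shipped with QEMU 2.3
    2: {"sse4.2", "avx", "wizz"},  # fixed definition from QEMU 2.4
}

# CPU model version that each machine type would select by default.
MACHINE_DEFAULT_CPU_VERSION = {"pc-i440fx-2.3": 1, "pc-i440fx-2.4": 2}


def guest_features(machine, cpu_version=None):
    """Return the CPU features; an explicit cpu_version overrides the default."""
    if cpu_version is None:
        cpu_version = MACHINE_DEFAULT_CPU_VERSION[machine]
    return SANDYBRIDGE_VERSIONS[cpu_version]


def can_run(machine, host_features, cpu_version=None):
    """The guest can start only if the host offers every requested feature."""
    return guest_features(machine, cpu_version) <= host_features


host = {"sse4.2", "avx"}  # host CPU without 'wizz'

print(can_run("pc-i440fx-2.3", host))                 # True
print(can_run("pc-i440fx-2.4", host))                 # False
print(can_run("pc-i440fx-2.4", host, cpu_version=1))  # True
```

The last call is the case of interest: the user gets the new machine type
(and with it the virtio-scsi fix) while explicitly keeping the old CPU
definition that the host can actually run.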

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:13                     ` [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models Daniel P. Berrange
@ 2015-06-23 17:29                       ` Andreas Färber
  2015-06-23 17:42                         ` Eduardo Habkost
  2015-06-23 21:26                       ` Michael S. Tsirkin
  1 sibling, 1 reply; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 17:29 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth,
	Eduardo Habkost

Am 23.06.2015 um 19:13 schrieb Daniel P. Berrange:
> On Tue, Jun 23, 2015 at 06:47:16PM +0200, Andreas Färber wrote:
>> Am 23.06.2015 um 18:42 schrieb Daniel P. Berrange:
>>> On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
>>>> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
>>>>> On Tue, Jun 23, 2015 at 06:15:51PM +0200, Andreas Färber wrote:
>>>>>> Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
>>>>>> I've always advocated remaining backwards-compatible and only making CPU
>>>>>> model changes for new machines. You among others felt that was not
>>>>>> always necessary, and now you're using the lack thereof as an argument
>>>>>> to stop using QEMU's CPU models at all? That sounds convoluted...
>>>>>
>>>>> Whether QEMU changed the CPU for existing machines, or only for new
>>>>> machines is actually not the core problem. Even if we only changed
>>>>> the CPU in new machines that would still be an unsatisfactory situation
>>>>> because we want to be able to access different versions of
>>>>> the CPU without the machine type changing, and access different versions
>>>>> of the machine type, without the CPU changing. IOW it is the fact that the
>>>>> changes in CPU are tied to changes in machine type that is the core
>>>>> problem.
>>>>
>>>> But that's because we are fixing bugs.  If CPU X used to work on
>>>> hardware Y in machine type A and stopped in machine type B, this is
>>>> because we have determined that it's the right thing to do for the
>>>> guests and the users. We don't break stuff just for fun.
>>>> Why do you want to bring back the bugs we fixed?
>>>
>>> Huh, I never said we wanted to bring back bugs. This is about allowing
>>> libvirt to fix the CPU bugs in a way that is independent of the machine
>>> types and portable across hypervisors we deal with. We're absolutely
>>> still going to fix CPU model bugs and ensure stable guest ABI.
>>
>> No, that's contradictory! Through the -x.y machines we leave bugs in the
>> old models *exactly* to assure a stable guest ABI. Fixes are only
>> applied to new machines, thus I'm pointing out that you should not use a
>> new CPU model with an old machine type.
> 
> I'm not saying that libvirt would ever allow a silent guest ABI change.
> Given a libvirt XML config, the guest ABI will never change without an
> explicit action on the part of the app/user to change the XML.
> 
> This is all about dealing with the case where the app / user consciously
> needs/wants to opt in to a guest ABI change for the guest, e.g. they wish
> to make use of some bug fix or feature improvement in the new machine
> type, but they do *not* wish to have the CPU model changed, because
> of some CPU model change that is incompatible with their hosts' CPUs.
> Conversely, they may wish to get access to a new CPU model, but not
> wish to have the rest of the guest ABI change. In both cases the user
> is explicitly opting in to the ABI change with knowledge about what
> this might mean for the guest OS. Currently we are tying users'
> hands by forcing CPU and machine types to change in lockstep.

Look, if you keep repeating that users may wish to do random
combinations if we give them the ability to do so, that is not helping
me understand. Right now they can't. If you don't have a single use case
of what CPU model, KVM feature, CPUID bit and guest may constitute a
valid reason to do so, I disagree with adding such a crazy feature.

In summary you seem to be saying that all the years we have spent
fiddling around with those mind-boggling compat_props in QEMU were in
vain because libvirt now wants to start their own versioning system to
give users more degrees of freedom even when you can't articulate a
single concrete reason why users may want to do so.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:24                     ` Eduardo Habkost
@ 2015-06-23 17:31                       ` Daniel P. Berrange
  0 siblings, 0 replies; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-23 17:31 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark,
	Andreas Färber, rth

On Tue, Jun 23, 2015 at 02:24:16PM -0300, Eduardo Habkost wrote:
> On Tue, Jun 23, 2015 at 07:10:11PM +0200, Andreas Färber wrote:
> > Am 23.06.2015 um 18:53 schrieb Daniel P. Berrange:
> > > On Tue, Jun 23, 2015 at 06:40:19PM +0200, Andreas Färber wrote:
> > >> Am 23.06.2015 um 18:25 schrieb Daniel P. Berrange:
> > >>> Whether QEMU changed the CPU for existing machines, or only for new
> > >>> machines is actually not the core problem. Even if we only changed
> > >>> the CPU in new machines that would still be an unsatisfactory situation
> > >>> because we want to be able to access different versions of
> > >>> the CPU without the machine type changing, and access different versions
> > >>> of the machine type, without the CPU changing. IOW it is the fact that the
> > >>> changes in CPU are tied to changes in machine type that is the core
> > >>> problem.
> > >>
> > >> This coupling is by design and we expect all KVM/QEMU users to adhere to
> > >> it, including those that use the libvirt tool (which I assume is going
> > >> to be the majority of KVM users). Either you want a certain
> > >> backwards-compatible machine and CPU, or you want the latest and
> > >> greatest - why in the world mix and match?!
> > > 
> > > As mentioned, changes/fixes to the CPU model can affect the ability to
> > > launch the guest on a particular host, so we need the ability to control
> > > when those CPU changes are activated for a guest, separately from the
> > > machine type.
> > 
> > Why? Today's libvirt with QEMU 2.3 resolves "pc" machine to
> > "pc-i440fx-2.3" and the guest XML stays that way. When we add new
> > features for 2.4, 2.3 is guaranteed to stay compatible. Any change would
> > involve the libvirt user actively switching from pc-i440fx-2.3 to a
> > different machine such as upcoming pc-i440fx-2.4. Why do you need to
> > change the CPU separately? Why would a user want to run 2.2's CPU with a
> > 2.3 machine? Or a 2.3 machine with a 2.4 CPU? That's nonsense. If you
> > want to tweak features, you already have command line options available
> > to do so on the basis of what the selected machine provides.
> 
> Because pure guest-side ABI changes are different from changes that also
> have additional host-side requirements, so we want to untie both things.
> 
> About being able to tweak features today, that's true: we have
> command-line options for most stuff and that's _almost_ enough for what
> libvirt needs. What's missing is something to avoid silently getting new
> features that libvirt isn't aware of (but that may make the VM unrunnable).
> The purpose of "-cpu custom" is to ensure no new host side feature
> requirement will be introduced silently, even if choosing a different
> machine.

It also has a positive impact on maintenance. By allowing libvirt to
control the CPU definitions, in cases where there is a need to push
out a CPU model bugfix, libvirt will often be able to make the update
available to users via a new CPU model version, without them having
to do a lock-step upgrade to QEMU at the same time.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:18                       ` Andreas Färber
  2015-06-23 17:27                         ` Daniel P. Berrange
@ 2015-06-23 17:39                         ` Eduardo Habkost
  2015-06-23 18:35                           ` Andreas Färber
  1 sibling, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 17:39 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

On Tue, Jun 23, 2015 at 07:18:06PM +0200, Andreas Färber wrote:
> Am 23.06.2015 um 19:08 schrieb Eduardo Habkost:
> > On Tue, Jun 23, 2015 at 06:44:57PM +0200, Andreas Färber wrote:
> >> Am 23.06.2015 um 18:38 schrieb Eduardo Habkost:
> >>> On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
> >>>> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
> >>>>> Whether QEMU changed the CPU for existing machines, or only for new
> >>>>> machines is actually not the core problem. Even if we only changed
> >>>>> the CPU in new machines that would still be an unsatisfactory situation
> >>>>> because we want to be able to access different versions of
> >>>>> the CPU without the machine type changing, and access different versions
> >>>>> of the machine type, without the CPU changing. IOW it is the fact that the
> >>>>> changes in CPU are tied to changes in machine type that is the core
> >>>>> problem.
> >>>>
> >>>> But that's because we are fixing bugs.  If CPU X used to work on
> >>>> hardware Y in machine type A and stopped in machine type B, this is
> >>>> because we have determined that it's the right thing to do for the
> >>>> guests and the users. We don't break stuff just for fun.
> >>>> Why do you want to bring back the bugs we fixed?
> >>>
> >>> I didn't take the time to count them, but I bet most of the commits I
> >>> listed on my previous e-mail message are not bug fixes, but new
> >>> features.
> >>
> >> Huh? Of course the latest machine model gets new features. The point is
> >> that the previous ones don't and that's what we are providing them for -
> >> libvirt is expected to choose one machine and the contract with QEMU is
> >> that for that machine the CPU does *not* grow new features, and we're
> >> going to great lengths to achieve that. So this thread feels more and
> >> more weird...
> > 
> > We are not talking about changes to existing machines. We are talking
> > about having changes introduced in new machines (the ones we made on
> > purpose) affecting the runnability of the VM.
> 
> You are talking abstract!

I am just talking about a different problem, and I don't know if you are
purposely trying to ignore it, or are just denying that it is a problem.

> 
> 
> Example 1:
> 
> Point A: Machine pc-i440fx-2.3 exists
> 
> Runs or runs not.
> 
> Point B: Machine pc-i440fx-2.3 still exists
> 
> Still runs or runs not due to guest ABI stability rules.

If you didn't change the machine name, this is not the problem we are
talking about.

> 
> 
> Example 2:
> 
> Point A: pc-i440fx-2.4 does not exist in 2.3
> 
> Does not run because it doesn't exist.
> 
> Point B: New pc-i440fx-2.4
> 
> Runs or does not run, and if so has more features than pc-i440fx-2.3.

If you didn't change the machine name, this is not the problem we are
talking about.

> 
> There is no runnability problem - either it runs or it doesn't, but
> there's no change over time.
> 
> This is what the machine -x.y versioning is all about.

Let's try a concrete example:

* User is running a kernel that can't emulate x2apic
* User is running pc-i440fx-1.7
* User wants the gigabyte alignment change implemented by commit
  bb43d3839c29b17a2f5c122114cd4ca978065a18
* User changes machine to pc-i440fx-2.0
* x2apic is now enabled by default in all CPU models
* VM with the same configuration (just the machine change) is not
  runnable anymore in the same host
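
The steps above can be sketched as a tiny model. This is an illustration
only, not QEMU code: the MACHINE_DEFAULTS table, the HOST_SUPPORTED set,
the helper names and every feature name other than x2apic are assumptions
made up for the example.

```python
# Illustrative model only: table contents are assumptions, not taken
# from QEMU sources. Only "x2apic" comes from the example in the thread.

# Features a named CPU model enables by default under each machine type.
MACHINE_DEFAULTS = {
    "pc-i440fx-1.7": {"sse2", "nx"},
    "pc-i440fx-2.0": {"sse2", "nx", "x2apic"},  # x2apic now on by default
}

# What the (old) host kernel reports as supported.
HOST_SUPPORTED = {"sse2", "nx"}


def missing_features(machine, host_supported):
    """Return the default features of `machine` the host cannot provide."""
    return MACHINE_DEFAULTS[machine] - host_supported


def is_runnable(machine, host_supported):
    """The VM is runnable only if no default feature is missing."""
    return not missing_features(machine, host_supported)


print(is_runnable("pc-i440fx-1.7", HOST_SUPPORTED))
print(is_runnable("pc-i440fx-2.0", HOST_SUPPORTED))
print(missing_features("pc-i440fx-2.0", HOST_SUPPORTED))
```

Nothing in the VM configuration changed except the machine name, yet the
second check fails on the same host.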

-- 
Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:27                         ` Daniel P. Berrange
@ 2015-06-23 17:41                           ` Andreas Färber
  2015-06-23 17:45                             ` Eduardo Habkost
  2015-06-23 17:55                             ` Daniel P. Berrange
  0 siblings, 2 replies; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 17:41 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth,
	Eduardo Habkost

Am 23.06.2015 um 19:27 schrieb Daniel P. Berrange:
> On Tue, Jun 23, 2015 at 07:18:06PM +0200, Andreas Färber wrote:
>> Am 23.06.2015 um 19:08 schrieb Eduardo Habkost:
>>> On Tue, Jun 23, 2015 at 06:44:57PM +0200, Andreas Färber wrote:
>>>> Am 23.06.2015 um 18:38 schrieb Eduardo Habkost:
>>>>> On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
>>>>>> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
>>>>>>> Whether QEMU changed the CPU for existing machines, or only for new
>>>>>>> machines is actually not the core problem. Even if we only changed
>>>>>>> the CPU in new machines that would still be an unsatisfactory situation
>>>>>>> because we want to be able to access different versions of
>>>>>>> the CPU without the machine type changing, and access different versions
>>>>>>> of the machine type, without the CPU changing. IOW it is the fact that the
>>>>>>> changes in CPU are tied to changes in machine type that is the core
>>>>>>> problem.
>>>>>>
>>>>>> But that's because we are fixing bugs.  If CPU X used to work on
>>>>>> hardware Y in machine type A and stopped in machine type B, this is
>>>>>> because we have determined that it's the right thing to do for the
>>>>>> guests and the users. We don't break stuff just for fun.
>>>>>> Why do you want to bring back the bugs we fixed?
>>>>>
>>>>> I didn't take the time to count them, but I bet most of the commits I
>>>>> listed on my previous e-mail message are not bug fixes, but new
>>>>> features.
>>>>
>>>> Huh? Of course the latest machine model gets new features. The point is
>>>> that the previous ones don't and that's what we are providing them for -
>>>> libvirt is expected to choose one machine and the contract with QEMU is
>>>> that for that machine the CPU does *not* grow new features, and we're
>>>> going to great lengths to achieve that. So this thread feels more and
>>>> more weird...
>>>
>>> We are not talking about changes to existing machines. We are talking
>>> about having changes introduced in new machines (the one we did on
>>> purpose) affecting the runnability of the VM.
>>
>> You are talking abstract!
>>
>>
>> Example 1:
>>
>> Point A: Machine pc-i440fx-2.3 exists
>>
>> Runs or runs not.
>>
>> Point B: Machine pc-i440fx-2.3 still exists
>>
>> Still runs or runs not due to guest ABI stability rules.
>>
>>
>> Example 2:
>>
>> Point A: pc-i440fx-2.4 does not exist in 2.3
>>
>> Does not run because it doesn't exist.
>>
>> Point B: New pc-i440fx-2.4
>>
>> Runs or does not run, and if so has more features than pc-i440fx-2.3.
>>
>> There is no runnability problem - either it runs or it doesn't, but
>> there's no change over time.
>>
>> This is what the machine -x.y versioning is all about.
> 
> Consider a host currently running QEMU 2.3 with machine type
> pc-i440fx-2.3 used with SandyBridge.
> 
> Now consider the def of SandyBridge was buggy and so in QEMU
> 2.4 we add the missing CPU feature flag 'wizz', and only
> enable that new feature flag with pc-i440fx-2.4
> 
> Now consider there was a bug in the virtio-scsi driver that
> we also fixed in QEMU 2.4 and thus pc-i440fx-2.4 includes
> that fix.
> 
> Updating from pc-i440fx-2.3 to pc-i440fx-2.4 has a dependency
> on the host CPU including the 'wizz' flag that was added. This
> new CPU feature can prevent the user from using the new machine
> type to get the virtio-scsi bug fix.

Thanks for this example! :)

Only if you use "-cpu ...,enforce", no?

The KVM feature filtering should take care of dropping features that are
not available otherwise.
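
To make the difference concrete, here is a simplified model of the two
behaviours. It is a sketch, not QEMU's implementation: the function name
and the feature sets are hypothetical, and 'wizz' is the made-up flag
from the example above.

```python
# Simplified model; resolve_features() is a hypothetical helper and
# does not exist in QEMU.

def resolve_features(requested, host_supported, enforce=False):
    """Return the features the guest actually gets.

    Without enforce, unavailable features are silently dropped.
    With enforce, a missing feature is a hard error instead.
    """
    missing = requested - host_supported
    if enforce and missing:
        raise RuntimeError("host lacks: " + ", ".join(sorted(missing)))
    return requested - missing


model = {"sse2", "wizz"}   # model definition under the new machine type
host = {"sse2"}            # host without 'wizz'

print(resolve_features(model, host))            # starts, minus 'wizz'
try:
    resolve_features(model, host, enforce=True)
except RuntimeError as err:
    print("refused to start:", err)
```

In the filtered case the guest starts, but it sees a different CPU than
the one the model name describes.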

So we seem to be getting to the interesting case of the same machine
(different from what was said previously!) but different hosts.

The QOM property gives you insights into which feature bits are set for
the machine for the model (and for s390x I saw QMP extensions to the
same effect, I thought). That way you could discover features to
disable. However you'll only ever know which ones work once you've tried
it once, right?

I'm pretty sure that we've had discussions with Anthony and Avi on the
same topic ages ago. Alex also made the -cpu best proposal long ago.
In general, if you want to run on a group of hosts, then you need to
figure out a common denominator - a CPU model name and optionally
features to enable/disable.
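
As a rough sketch of that common denominator, assuming each host's
supported feature set is already known (the host names and feature sets
below are invented for the example):

```python
# Hypothetical data: host names and feature sets are made up.

def common_features(host_features):
    """Intersect the feature sets of all hosts in the group."""
    sets = list(host_features.values())
    if not sets:
        return set()
    return set.intersection(*sets)


hosts = {
    "host-a": {"sse2", "nx", "x2apic"},
    "host-b": {"sse2", "nx"},
    "host-c": {"sse2", "nx", "avx"},
}

print(sorted(common_features(hosts)))
```

A model name plus explicit enable/disable flags would then have to stay
within that intersection to run on every host in the group.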

If that is the whole problem here, then why not just add a global flag
to only enable explicitly requested KVM features? All other features
should not depend on the host, and the whole discussion about -x.y seems
like a distraction.

Regards,
Andreas

> Separately versioning CPU models still lets us preserve a stable
> guest ABI by default, but allows more flexibility when choosing
> to opt out of the ABI for a particular upgrade.
> 
> Regards,
> Daniel

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:29                       ` Andreas Färber
@ 2015-06-23 17:42                         ` Eduardo Habkost
  2015-06-23 17:55                           ` Andreas Färber
  2015-06-23 21:28                           ` Michael S. Tsirkin
  0 siblings, 2 replies; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 17:42 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

On Tue, Jun 23, 2015 at 07:29:15PM +0200, Andreas Färber wrote:
> In summary you seem to be saying that all the years we have spent
> fiddling around with those mind-boggling compat_props in QEMU were in
> vain because libvirt now wants to start their own versioning system to
> give users more degrees of freedom even when you can't articulate a
> single concrete reason why users may want to do so.

I had a similar reaction when I learned about this libvirt
expectation/requirement I was never aware of. But "we spent lots of
effort trying to do things differently" doesn't seem like a valid
justification for a design decision.

-- 
Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:41                           ` Andreas Färber
@ 2015-06-23 17:45                             ` Eduardo Habkost
  2015-06-23 17:58                               ` Andreas Färber
  2015-06-23 17:55                             ` Daniel P. Berrange
  1 sibling, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 17:45 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

On Tue, Jun 23, 2015 at 07:41:50PM +0200, Andreas Färber wrote:
[...]
> If that is the whole problem here, then why not just add a global flag
> to only enable explicitly requested KVM features? All other features
> should not depend on the host, and the whole discussion about -x.y seems
> like a distraction.

Now replace "KVM features" with "CPU features", because all CPU features
are KVM features, as all of them depend on KVM code enabling them on
GET_SUPPORTED_CPUID.

Thus, the global flag to only enable explicitly requested KVM features on
CPUs is "-cpu custom", which doesn't enable any CPU feature at all.
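
A minimal sketch of what "explicitly requested" could look like from the
management side. "-cpu custom" is the model proposed in this series; the
helper name is hypothetical, and the "+feature" suffixes simply reuse the
existing -cpu syntax, which is an assumption about how the features would
be listed in practice.

```python
# Hypothetical helper: the "+feature" form is borrowed from the existing
# -cpu syntax and is an assumption, not something this series defines.

def custom_cpu_arg(features):
    """Build a -cpu argument that names every feature explicitly."""
    return ",".join(["custom"] + ["+" + f for f in sorted(features)])


print(custom_cpu_arg({"sse2", "nx"}))
print(custom_cpu_arg(set()))
```

Because nothing is enabled implicitly, the result does not depend on
which machine type is selected.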

-- 
Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:42                         ` Eduardo Habkost
@ 2015-06-23 17:55                           ` Andreas Färber
  2015-06-23 17:58                             ` Daniel P. Berrange
  2015-06-23 21:28                           ` Michael S. Tsirkin
  1 sibling, 1 reply; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 17:55 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

Am 23.06.2015 um 19:42 schrieb Eduardo Habkost:
> On Tue, Jun 23, 2015 at 07:29:15PM +0200, Andreas Färber wrote:
>> In summary you seem to be saying that all the years we have spent
>> fiddling around with those mind-boggling compat_props in QEMU were in
>> vain because libvirt now wants to start their own versioning system to
>> give users more degrees of freedom even when you can't articulate a
>> single concrete reason why users may want to do so.
> 
> I had a similar reaction when I learned about this libvirt
> expectation/requirement I was never aware of. But "we spent lots of
> effort trying to do things differently" doesn't seem like a valid
> justification for a design decision.

True, but my expectation is that libvirt is a friendly wrapper around
QEMU using the mechanisms like -x.y we implemented for it and does not
stand up at random and say, like Dan, that it "wants" to have things
differently from now on, breaking all past assumptions and guarantees.

Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:41                           ` Andreas Färber
  2015-06-23 17:45                             ` Eduardo Habkost
@ 2015-06-23 17:55                             ` Daniel P. Berrange
  1 sibling, 0 replies; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-23 17:55 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 07:41:50PM +0200, Andreas Färber wrote:
> Only if you use "-cpu ...,enforce", no?
> 
> The KVM feature filtering should take care of dropping features that are
> not available otherwise.
> 
> So we seem to be getting to the interesting case of the same machine
> (different from what was said previously!) but different hosts.
> 
> The QOM property gives you insights into which feature bits are set for
> the machine for the model (and for s390x I saw QMP extensions to the
> same effect, I thought). That way you could discover features to
> disable. However you'll only ever know which ones work once you've tried
> it once, right?

On this subject, a big problem in QOM in general is that it hasn't
included a distinction between object classes and object instances.
So there's no way for us to introspect what a machine type provides
without actually instantiating a guest with that machine type.

Again, allowing libvirt to control the CPU model removes this problem,
as libvirt will be able to determine what CPUs it can run on a given
host without having to probe all the different CPU <-> machine type
combinations that the QEMU on that host has.

This in turn simplifies life for apps using libvirt, as they will
know that "CPUFoo-X.Y" will always mean the exact same thing on all the
hosts they have, no matter what QEMU version/machine is in use.
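
A small sketch of that idea, under the assumption that the CPU model
database is keyed by (name, version) only. The table contents are
invented; "CPUFoo" and the 'wizz' flag are the placeholder names used
earlier in this thread.

```python
# Hypothetical model database, independent of any machine type.
CPU_MODELS = {
    ("CPUFoo", "1.0"): frozenset({"sse2", "nx"}),
    ("CPUFoo", "1.1"): frozenset({"sse2", "nx", "wizz"}),
}


def runnable_models(host_supported):
    """List the versioned models whose features the host provides."""
    return sorted(
        key for key, feats in CPU_MODELS.items() if feats <= host_supported
    )


print(runnable_models({"sse2", "nx"}))
print(runnable_models({"sse2", "nx", "wizz"}))
```

The lookup takes no machine type argument, so one probe per host is
enough.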

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:55                           ` Andreas Färber
@ 2015-06-23 17:58                             ` Daniel P. Berrange
  0 siblings, 0 replies; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-23 17:58 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 07:55:14PM +0200, Andreas Färber wrote:
> Am 23.06.2015 um 19:42 schrieb Eduardo Habkost:
> > On Tue, Jun 23, 2015 at 07:29:15PM +0200, Andreas Färber wrote:
> >> In summary you seem to be saying that all the years we have spent
> >> fiddling around with those mind-boggling compat_props in QEMU were in
> >> vain because libvirt now wants to start their own versioning system to
> >> give users more degrees of freedom even when you can't articulate a
> >> single concrete reason why users may want to do so.
> > 
> > I had a similar reaction when I learned about this libvirt
> > expectation/requirement I was never aware of. But "we spent lots of
> > effort trying to do things differently" doesn't seem like a valid
> > justification for a design decision.
> 
> True, but my expectation is that libvirt is a friendly wrapper around
> QEMU using the mechanisms like -x.y we implemented for it and does not
> stand up at random and say, like Dan, that it "wants" to have things
> differently from now on, breaking all past assumptions and guarantees.

FWIW, this isn't actually a change in opinion from the libvirt side.
We never wanted CPU representation tied to machine types, even years
back when the first qemu discussions in this area took place. We
always wanted to be able to control CPU definition separately from
the machine type.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:45                             ` Eduardo Habkost
@ 2015-06-23 17:58                               ` Andreas Färber
  2015-06-23 18:05                                 ` Daniel P. Berrange
  2015-06-23 18:11                                 ` Eduardo Habkost
  0 siblings, 2 replies; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 17:58 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

Am 23.06.2015 um 19:45 schrieb Eduardo Habkost:
> On Tue, Jun 23, 2015 at 07:41:50PM +0200, Andreas Färber wrote:
> [...]
>> If that is the whole problem here, then why not just add a global flag
>> to only enable explicitly requested KVM features? All other features
>> should not depend on the host, and the whole discussion about -x.y seems
>> like a distraction.
> 
> Now replace "KVM features" with "CPU features", because all CPU features
> are KVM features, as all of them depend on KVM code enabling them on
> GET_SUPPORTED_CPUID.
> 
> Thus, the global flag to only enable explicitly requested KVM features on
> CPUs is "-cpu custom", which doesn't enable any CPU feature at all.

If libvirt wants to use an empty CPU model, then why export our models
to libvirt?

I don't mind there being an optional custom model, I mind our
compat_props getting ignored that way, which are unrelated to adding new
features, in fact they suppress just that for the -2.3 examples.

Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:58                               ` Andreas Färber
@ 2015-06-23 18:05                                 ` Daniel P. Berrange
  2015-06-23 18:11                                 ` Eduardo Habkost
  1 sibling, 0 replies; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-23 18:05 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 07:58:44PM +0200, Andreas Färber wrote:
> Am 23.06.2015 um 19:45 schrieb Eduardo Habkost:
> > On Tue, Jun 23, 2015 at 07:41:50PM +0200, Andreas Färber wrote:
> > [...]
> >> If that is the whole problem here, then why not just add a global flag
> >> to only enable explicitly requested KVM features? All other features
> >> should not depend on the host, and the whole discussion about -x.y seems
> >> like a distraction.
> > 
> > Now replace "KVM features" with "CPU features", because all CPU features
> > are KVM features, as all of them depend on KVM code enabling them on
> > GET_SUPPORTED_CPUID.
> > 
> > Thus, the global flag to only enable explicitly requested KVM features on
> > CPUs is "-cpu custom", which doesn't enable any CPU feature at all.
> 
> If libvirt wants to use an empty CPU model, then why export our models
> to libvirt?
> 
> I don't mind there being an optional custom model, I mind our
> compat_props getting ignored that way, which are unrelated to adding new
> features, in fact they suppress just that for the -2.3 examples.

If QEMU has a '-cpu custom' there isn't actually any need for QEMU to
export its CPU models to libvirt. Libvirt already has its own CPU
model database, so we'd just start to use our own database exclusively.

Libvirt would have to take care that we don't break ABI when updating
from a QEMU that pre-dates the '-cpu custom' feature - for that we
have done a one-time extraction of details of all the historical
variations in QEMU CPU models per machine type.

So apart from backcompat for existing QEMUs in the wild, libvirt would
no longer have any need to know about QEMU's CPU models.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:58                               ` Andreas Färber
  2015-06-23 18:05                                 ` Daniel P. Berrange
@ 2015-06-23 18:11                                 ` Eduardo Habkost
  1 sibling, 0 replies; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 18:11 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

On Tue, Jun 23, 2015 at 07:58:44PM +0200, Andreas Färber wrote:
> Am 23.06.2015 um 19:45 schrieb Eduardo Habkost:
> > On Tue, Jun 23, 2015 at 07:41:50PM +0200, Andreas Färber wrote:
> > [...]
> >> If that is the whole problem here, then why not just add a global flag
> >> to only enable explicitly requested KVM features? All other features
> >> should not depend on the host, and the whole discussion about -x.y seems
> >> like a distraction.
> > 
> > Now replace "KVM features" with "CPU features", because all CPU features
> > are KVM features, as all of them depend on KVM code enabling them on
> > GET_SUPPORTED_CPUID.
> > 
> > Thus, the global flag to only enable explicitly requested KVM features on
> > CPUs is "-cpu custom", which doesn't enable any CPU feature at all.
> 
> If libvirt wants to use an empty CPU model, then why export our models
> to libvirt?

We wouldn't need it anymore. We would probably keep it just because old
libvirt versions (or maybe the non-x86 libvirt code) may use it, or
maybe libvirt will keep using the existing QEMU models for existing VMs
(because that would be simpler than converting existing "-cpu Foo" to
"-cpu custom -readconfig").

> 
> I don't mind there being an optional custom model, I mind our
> compat_props getting ignored that way, which are unrelated to adding new
> features, in fact they suppress just that for the -2.3 examples.

I am also not happy about having spent lots of time on compat_props for
CPUs when it is not going to be needed by libvirt anymore, but that's a
sunk cost.

-- 
Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:39                         ` Eduardo Habkost
@ 2015-06-23 18:35                           ` Andreas Färber
  2015-06-23 19:25                             ` Eduardo Habkost
  0 siblings, 1 reply; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 18:35 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

Am 23.06.2015 um 19:39 schrieb Eduardo Habkost:
> On Tue, Jun 23, 2015 at 07:18:06PM +0200, Andreas Färber wrote:
>> Am 23.06.2015 um 19:08 schrieb Eduardo Habkost:
>>> On Tue, Jun 23, 2015 at 06:44:57PM +0200, Andreas Färber wrote:
>>>> Am 23.06.2015 um 18:38 schrieb Eduardo Habkost:
>>>>> On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
>>>>>> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
>>>>>>> Whether QEMU changed the CPU for existing machines, or only for new
>>>>>>> machines is actually not the core problem. Even if we only changed
>>>>>>> the CPU in new machines that would still be an unsatisfactory situation
>>>>>>> because we want to be able to access different versions of
>>>>>>> the CPU without the machine type changing, and access different versions
>>>>>>> of the machine type, without the CPU changing. IOW it is the fact that the
>>>>>>> changes in CPU are tied to changes in machine type that is the core
>>>>>>> problem.
>>>>>>
>>>>>> But that's because we are fixing bugs.  If CPU X used to work on
>>>>>> hardware Y in machine type A and stopped in machine type B, this is
>>>>>> because we have determined that it's the right thing to do for the
>>>>>> guests and the users. We don't break stuff just for fun.
>>>>>> Why do you want to bring back the bugs we fixed?
>>>>>
>>>>> I didn't take the time to count them, but I bet most of the commits I
>>>>> listed on my previous e-mail message are not bug fixes, but new
>>>>> features.
>>>>
>>>> Huh? Of course the latest machine model gets new features. The point is
>>>> that the previous ones don't and that's what we are providing them for -
>>>> libvirt is expected to choose one machine and the contract with QEMU is
>>>> that for that machine the CPU does *not* grow new features, and we're
>>>> going to great lengths to achieve that. So this thread feels more and
>>>> more weird...
>>>
>>> We are not talking about changes to existing machines. We are talking
>>> about having changes introduced in new machines (the one we did on
>>> purpose) affecting the runnability of the VM.
>>
>> You are talking abstract!
> 
> I am just talking about a different problem, and I don't know if you are
> purposely trying to ignore it, or are just denying that it is a problem.

So, are you and Dan talking about the same problem or different ones?
I am not deliberately ignoring anything here, but I am denying there is
a problem until either of you explains what a concrete problem is. Seems
we are slowly getting there now.

>> Example 1:
>>
>> Point A: Machine pc-i440fx-2.3 exists
>>
>> Runs or runs not.
>>
>> Point B: Machine pc-i440fx-2.3 still exists
>>
>> Still runs or runs not due to guest ABI stability rules.
> 
> If you didn't change the machine name, this is not the problem we are
> talking about.

OK.

>> Example 2:
>>
>> Point A: pc-i440fx-2.4 does not exist in 2.3
>>
>> Does not run because it doesn't exist.
>>
>> Point B: New pc-i440fx-2.4
>>
>> Runs or does not run, and if so has more features than pc-i440fx-2.3.
> 
> If you didn't change the machine name, this is not the problem we are
> talking about.
> 
>>
>> There is no runnability problem - either it runs or it doesn't, but
>> there's no change over time.
>>
>> This is what the machine -x.y versioning is all about.
> 
> Let's try a concrete example:
> 
> * User is running a kernel that can't emulate x2apic
> * User is running pc-i440fx-1.7
> * User wants the gigabyte alignment change implemented by commit
>   bb43d3839c29b17a2f5c122114cd4ca978065a18
> * User changes machine to pc-i440fx-2.0
> * x2apic is now enabled by default in all CPU models
> * VM with the same configuration (just the machine change) is not
>   runnable anymore in the same host

Then let's take a step back: In order to change the machine type, the
user shuts the machine down (it does not run!), edits the XML and tries
to boot it up again. That's where I've challenged your use of the term
of changed "runnability" above. I acknowledged that it might happen that it
does not run. But that has nothing to do with compatibility of QEMU
versions v2.3.0 vs. v2.4.0 then, it is the user's active choice of
options that are incompatible with her system and that never before
worked there. That seems perfectly valid and unavoidable, just like
adding a non-existing command-line option or an unknown XML element to
the guest config.

The difference of opinion seems to be that when there is a bug in QEMU,
I require that the user updates QEMU (not necessarily to a new version),
whereas you are proposing that libvirt should be the one to work around
bugs in QEMU by tweaking command line parameters.

In order to get a virtio-scsi or gigabyte alignment fix that varies
across -x.y machines, that feature can just as well be enabled via
global properties on the old machine. New machines are primarily for new
features.

If someone wants to use that new -2.0 machine, they need to pass the
correct options such as ",-x2apic" in your example or use a CPU model
that does not enable such options by default. (FWIW in that concrete
example I remember Paolo(?) saying that that feature had been supported
for a really long time already.)
The user, who actively edited the guest definition, gets an error
message and has to edit the guest again and then it starts.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 18:35                           ` Andreas Färber
@ 2015-06-23 19:25                             ` Eduardo Habkost
  2015-06-23 19:41                               ` Andreas Färber
  0 siblings, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 19:25 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

On Tue, Jun 23, 2015 at 08:35:54PM +0200, Andreas Färber wrote:
> Am 23.06.2015 um 19:39 schrieb Eduardo Habkost:
> > On Tue, Jun 23, 2015 at 07:18:06PM +0200, Andreas Färber wrote:
> >> Am 23.06.2015 um 19:08 schrieb Eduardo Habkost:
> >>> On Tue, Jun 23, 2015 at 06:44:57PM +0200, Andreas Färber wrote:
> >>>> Am 23.06.2015 um 18:38 schrieb Eduardo Habkost:
> >>>>> On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
> >>>>>> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
> >>>>>>> Whether QEMU changed the CPU for existing machines, or only for new
> >>>>>>> machines is actually not the core problem. Even if we only changed
> >>>>>>> the CPU in new machines that would still be an unsatisfactory situation
> >>>>>>> because we want to be able to access different versions of
> >>>>>>> the CPU without the machine type changing, and access different versions
> >>>>>>> of the machine type, without the CPU changing. IOW it is the fact that the
> >>>>>>> changes in CPU are tied to changes in machine type that is the core
> >>>>>>> problem.
> >>>>>>
> >>>>>> But that's because we are fixing bugs.  If CPU X used to work on
> >>>>>> hardware Y in machine type A and stopped in machine type B, this is
> >>>>>> because we have determined that it's the right thing to do for the
> >>>>>> guests and the users. We don't break stuff just for fun.
> >>>>>> Why do you want to bring back the bugs we fixed?
> >>>>>
> >>>>> I didn't take the time to count them, but I bet most of the commits I
> >>>>> listed on my previous e-mail message are not bug fixes, but new
> >>>>> features.
> >>>>
> >>>> Huh? Of course the latest machine model gets new features. The point is
> >>>> that the previous ones don't and that's what we are providing them for -
> >>>> libvirt is expected to choose one machine and the contract with QEMU is
> >>>> that for that machine the CPU does *not* grow new features, and we're
> >>>> going to great lengths to achieve that. So this thread feels more and
> >>>> more weird...
> >>>
> >>> We are not talking about changes to existing machines. We are talking
> >>> about having changes introduced in new machines (the one we did on
> >>> purpose) affecting the runnability of the VM.
> >>
> >> You are talking abstract!
> > 
> > I am just talking about a different problem, and I don't know if you are
> > purposely trying to ignore it, or are just denying that it is a problem.
> 
> So, are you and Dan talking about the same problem or different ones?

The same one.

> I am not deliberately ignoring anything here, but I am denying there is
> a problem until either of you explains what a concrete problem is. Seems
> we are slowly getting there now.

I hope so. :)

> 
> >> Example 1:
> >>
> >> Point A: Machine pc-i440fx-2.3 exists
> >>
> >> Runs or runs not.
> >>
> >> Point B: Machine pc-i440fx-2.3 still exists
> >>
> >> Still runs or runs not due to guest ABI stability rules.
> > 
> > If you didn't change the machine name, this is not the problem we are
> > talking about.
> 
> OK.
> 
> >> Example 2:
> >>
> >> Point A: pc-i440fx-2.4 does not exist in 2.3
> >>
> >> Does not run because it doesn't exist.
> >>
> >> Point B: New pc-i440fx-2.4
> >>
> >> Runs or does not run, and if so has more features than pc-i440fx-2.3.
> > 
> > If you didn't change the machine name, this is not the problem we are
> > talking about.
> > 
> >>
> >> There is no runnability problem - either it runs or it doesn't, but
> >> there's no change over time.
> >>
> >> This is what the machine -x.y versioning is all about.
> > 
> > Let's try a concrete example:
> > 
> > * User is running a kernel that can't emulate x2apic
> > * User is running pc-i440fx-1.7
> > * User wants the gigabyte alignment change implemented by commit
> >   bb43d3839c29b17a2f5c122114cd4ca978065a18
> > * User changes machine to pc-i440fx-2.0
> > * x2apic is now enabled by default in all CPU models
> > * VM with the same configuration (just the machine change) is not
> >   runnable anymore in the same host
> 
> Then let's take a step back: In order to change the machine type, the
> user shuts the machine down (it does not run!), edits the XML and tries
> to boot it up again. That's where I've challenged your use of the term
> of changed "runnability" above. I acknowledged, it might happen that it
> does not run. But that has nothing to do with compatibility of QEMU
> versions v2.3.0 vs. v2.4.0 then, it is the user's active choice of
> options that are incompatible with her system and that never before
> worked there. That seems perfectly valid and unavoidable, just like
> adding a non-existing command-line option or an unknown XML element to
> the guest config.

If you add a non-existing command-line option or unknown XML element,
you are providing bad input. The most obvious way to handle it is an
error.

If you change -machine, you are providing good input, but the user won't
have a good explanation why it can't run (because it is a perfectly
valid machine name, reported as supported by QEMU). Or maybe they will
see an explanation, but will have no idea what they need to change to
make the new machine runnable.

CPU model runnability, on the other hand, is well documented in the
libvirt API, and would even allow a management system to automatically
find a solution (because it can tell exactly what's the feature
preventing the VM from running), and tell the user which configurations
are runnable.
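
As a rough illustration of that kind of check (a sketch only: the feature
sets below are made up, and none of these names are libvirt or QEMU APIs),
a management system could compare what a CPU model enables under each
machine type against what the host provides:

```python
# Hypothetical illustration; these names are not libvirt or QEMU APIs.

# Features the host kernel+hardware can provide (made-up example data).
HOST_FEATURES = {"sse2", "nx", "lm", "apic"}

# Features one named CPU model enables, keyed by machine type
# (made-up, heavily simplified data based on the x2apic example).
MODEL_FEATURES = {
    "pc-i440fx-1.7": {"sse2", "nx", "lm", "apic"},
    "pc-i440fx-2.0": {"sse2", "nx", "lm", "apic", "x2apic"},
}


def missing_features(machine, host_features=HOST_FEATURES):
    """Return the sorted list of features the host cannot provide."""
    return sorted(MODEL_FEATURES[machine] - set(host_features))


def runnable_machines(host_features=HOST_FEATURES):
    """Return the machine types whose CPU model the host can run."""
    return sorted(
        machine
        for machine in MODEL_FEATURES
        if not missing_features(machine, host_features)
    )


for machine in sorted(MODEL_FEATURES):
    print(machine, "missing:", missing_features(machine))
```

With this data only pc-i440fx-1.7 is reported as runnable, and x2apic is
named as the feature blocking pc-i440fx-2.0.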

> 
> The difference of opinion seems to be that when there is a bug in QEMU,
> I require that the user updates QEMU (not necessarily to a new version),
> whereas you are proposing that libvirt should be the one to work around
> bugs in QEMU by tweaking command line parameters.

In the case of bugs related to CPU definitions, yes, because they are
different kinds of changes: they affect runnability of the VM when they
are enabled.

> 
> In order to get a virtio-scsi or gigabyte alignment fix that varies
> across -x.y machines, that feature can just as well be enabled via
> global properties on the old machine. New machines are primarily for new
> features.

Replace "gigabyte alignment" with any possible reason a user 5 years
from now may want to change to a newer machine.

If everything was configurable using globals, we wouldn't even need to
introduce new machines, and we could let libvirt or the user configure
everything they want, and they would never touch the machine name again
in their configuration. But we don't live in that world yet.

> 
> If someone wants to use that new -2.0 machine, they need to pass the
> correct options such as ",-x2apic" in your example or use a CPU model
> that does not enable such options by default. (FWIW in that concrete
> example I remember Paolo(?) saying that that feature had been supported
> for a really long time already.)
> The user, who actively edited the guest definition, gets an error
> message and has to edit the guest again and then it starts.

That's what we are trying to avoid. The user may not be human, the user
may have thousands of machines being upgraded, and even a human may take
some time to figure out what's needed to make the VM runnable again.

libvirt has no API for checking if a machine name is runnable, or for
checking if a machine+CPU combination is runnable. And I don't see a
reason to force them to add that to their API if they could just
decouple the sets of CPU features from the machine versions.

-- 
Eduardo

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 19:25                             ` Eduardo Habkost
@ 2015-06-23 19:41                               ` Andreas Färber
  2015-06-23 19:53                                 ` Eduardo Habkost
  2015-06-23 20:26                                 ` Eduardo Habkost
  0 siblings, 2 replies; 81+ messages in thread
From: Andreas Färber @ 2015-06-23 19:41 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Maydell, mimu, Michael S. Tsirkin, qemu-devel,
	Alexander Graf, borntraeger, Igor Mammedov, Paolo Bonzini,
	Jiri Denemark, rth

Am 23.06.2015 um 21:25 schrieb Eduardo Habkost:
> On Tue, Jun 23, 2015 at 08:35:54PM +0200, Andreas Färber wrote:
>> Am 23.06.2015 um 19:39 schrieb Eduardo Habkost:
>>> On Tue, Jun 23, 2015 at 07:18:06PM +0200, Andreas Färber wrote:
>>>> Am 23.06.2015 um 19:08 schrieb Eduardo Habkost:
>>>>> On Tue, Jun 23, 2015 at 06:44:57PM +0200, Andreas Färber wrote:
>>>>>> Am 23.06.2015 um 18:38 schrieb Eduardo Habkost:
>>>>>>> On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
>>>>>>>> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
>>>>>>>>> Whether QEMU changed the CPU for existing machines, or only for new
>>>>>>>>> machines is actually not the core problem. Even if we only changed
>>>>>>>>> the CPU in new machines that would still be an unsatisfactory situation
>>>>>>>>> because we want to be able to access different versions of
>>>>>>>>> the CPU without the machine type changing, and access different versions
>>>>>>>>> of the machine type, without the CPU changing. IOW it is the fact that the
>>>>>>>>> changes in CPU are tied to changes in machine type that is the core
>>>>>>>>> problem.
>>>>>>>>
>>>>>>>> But that's because we are fixing bugs.  If CPU X used to work on
>>>>>>>> hardware Y in machine type A and stopped in machine type B, this is
>>>>>>>> because we have determined that it's the right thing to do for the
>>>>>>>> guests and the users. We don't break stuff just for fun.
>>>>>>>> Why do you want to bring back the bugs we fixed?
>>>>>>>
>>>>>>> I didn't take the time to count them, but I bet most of the commits I
>>>>>>> listed on my previous e-mail message are not bug fixes, but new
>>>>>>> features.
>>>>>>
>>>>>> Huh? Of course the latest machine model gets new features. The point is
>>>>>> that the previous ones don't and that's what we are providing them for -
>>>>>> libvirt is expected to choose one machine and the contract with QEMU is
>>>>>> that for that machine the CPU does *not* grow new features, and we're
>>>>>> going to great lengths to achieve that. So this thread feels more and
>>>>>> more weird...
>>>>>
>>>>> We are not talking about changes to existing machines. We are talking
>>>>> about having changes introduced in new machines (the one we did on
>>>>> purpose) affecting the runnability of the VM.
>>>>
>>>> You are talking abstract!
>>>
>>> I am just talking about a different problem, and I don't know if you are
>>> purposely trying to ignore it, or are just denying that it is a problem.
>>
>> So, are you and Dan talking about the same problem or different ones?
> 
> The same one.
> 
>> I am not deliberately ignoring anything here, but I am denying there is
>> a problem until either of you explains what a concrete problem is. Seems
>> we are slowly getting there now.
> 
> I hope so. :)
> 
>>
>>>> Example 1:
>>>>
>>>> Point A: Machine pc-i440fx-2.3 exists
>>>>
>>>> Runs or runs not.
>>>>
>>>> Point B: Machine pc-i440fx-2.3 still exists
>>>>
>>>> Still runs or runs not due to guest ABI stability rules.
>>>
>>> If you didn't change the machine name, this is not the problem we are
>>> talking about.
>>
>> OK.
>>
>>>> Example 2:
>>>>
>>>> Point A: pc-i440fx-2.4 does not exist in 2.3
>>>>
>>>> Does not run because it doesn't exist.
>>>>
>>>> Point B: New pc-i440fx-2.4
>>>>
>>>> Runs or does not run, and if so has more features than pc-i440fx-2.3.
>>>
>>> If you didn't change the machine name, this is not the problem we are
>>> talking about.
>>>
>>>>
>>>> There is no runnability problem - either it runs or it doesn't, but
>>>> there's no change over time.
>>>>
>>>> This is what the machine -x.y versioning is all about.
>>>
>>> Let's try a concrete example:
>>>
>>> * User is running a kernel that can't emulate x2apic
>>> * User is running pc-i440fx-1.7
>>> * User wants the gigabyte alignment change implemented by commit
>>>   bb43d3839c29b17a2f5c122114cd4ca978065a18
>>> * User changes machine to pc-i440fx-2.0
>>> * x2apic is now enabled by default in all CPU models
>>> * VM with the same configuration (just the machine change) is not
>>>   runnable anymore in the same host
>>
>> Then let's take a step back: In order to change the machine type, the
>> user shuts the machine down (it does not run!), edits the XML and tries
>> to boot it up again. That's where I've challenged your use of the term
>> of changed "runnability" above. I acknowledged, it might happen that it
>> does not run. But that has nothing to do with compatibility of QEMU
>> versions v2.3.0 vs. v2.4.0 then, it is the user's active choice of
>> options that are incompatible with her system and that never before
>> worked there. That seems perfectly valid and unavoidable, just like
>> adding a non-existing command-line option or an unknown XML element to
>> the guest config.
> 
> If you add a non-existing command-line option or unknown XML element,
> you are providing bad input. The most obvious way to handle it is an
> error.
> 
> If you change -machine, you are providing good input, but the user won't
> have a good explanation why it can't run (because it is a perfectly
> valid machine name, reported as supported by QEMU). Or maybe they will
> see an explanation, but will have no idea what they need to change to
> make the new machine runnable.
> 
> CPU model runnability, on the other hand, is well documented in the
> libvirt API, and would even allow a management system to automatically
> find a solution (because it can tell exactly what's the feature
> preventing the VM from running), and tell the user which configurations
> are runnable.
> 
>>
>> The difference of opinion seems to be that when there is a bug in QEMU,
>> I require that the user updates QEMU (not necessarily to a new version),
>> whereas you are proposing that libvirt should be the one to work around
>> bugs in QEMU by tweaking command line parameters.
> 
> In the case of bugs related to CPU definitions, yes, because they are
> different kinds of changes: they affect runnability of the VM when they
> are enabled.
> 
>>
>> In order to get a virtio-scsi or gigabyte alignment fix that varies
>> across -x.y machines, that feature can just as well be enabled via
>> global properties on the old machine. New machines are primarily for new
>> features.
> 
> Replace "gigabyte alignment" with any possible reason a user 5 years
> from now may want to change to a newer machine.
> 
> If everything was configurable using globals, we wouldn't even need to
> introduce new machines, and we could let libvirt or the user configure
> everything they want, and they would never touch the machine name again
> in their configuration. But we don't live in that world yet.
> 
>>
>> If someone wants to use that new -2.0 machine, they need to pass the
>> correct options such as ",-x2apic" in your example or use a CPU model
>> that does not enable such options by default. (FWIW in that concrete
>> example I remember Paolo(?) saying that that feature had been supported
>> for a really long time already.)
>> The user, who actively edited the guest definition, gets an error
>> message and has to edit the guest again and then it starts.
> 
> That's what we are trying to avoid. The user may not be human, the user
> may have thousands of machines being upgraded, and even a human may take
> some time to figure out what's needed to make the VM runnable again.
> 
> libvirt has no API for checking if a machine name is runnable, or for
> checking if a machine+CPU combination is runnable. And I don't see a
> reason to force them to add that to their API if they could just
> decouple the sets of CPU features from the machine versions.

I am going to stop arguing here and suggest you put this on the agenda
for the next KVM call.

Given that we have had this versioning system for years and no problem
specifically with 2.4 has been raised, I see this as 2.5+ material at
this point.

Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 19:41                               ` Andreas Färber
@ 2015-06-23 19:53                                 ` Eduardo Habkost
  2015-06-23 20:26                                 ` Eduardo Habkost
  1 sibling, 0 replies; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 19:53 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Peter Maydell, mimu, Michael S. Tsirkin, qemu-devel,
	Alexander Graf, borntraeger, Igor Mammedov, Paolo Bonzini,
	Jiri Denemark, rth

On Tue, Jun 23, 2015 at 09:41:51PM +0200, Andreas Färber wrote:
[...]
> I am going to stop arguing here and suggest you put this on the agenda
> for the next KVM call.

I am a bit confused. You said "I don't mind there being an optional
custom model" in a previous message.

If you have objections to libvirt API design decisions, I understand it,
but I suggest you take them to the libvir-list mailing list.

Now, if you have objections to having an optional custom model (i.e.
valid reasons to not apply patch 1/2), please let me know. I didn't see
a single argument to reject the patch, yet.

-- 
Eduardo

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 19:41                               ` Andreas Färber
  2015-06-23 19:53                                 ` Eduardo Habkost
@ 2015-06-23 20:26                                 ` Eduardo Habkost
  2015-06-23 21:38                                   ` Michael S. Tsirkin
  1 sibling, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-23 20:26 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Peter Maydell, mimu, Michael S. Tsirkin, qemu-devel,
	Alexander Graf, borntraeger, Igor Mammedov, Paolo Bonzini,
	Jiri Denemark, rth

On Tue, Jun 23, 2015 at 09:41:51PM +0200, Andreas Färber wrote:
[...]
> Given that we have had this versioning system for years and no problem
> specifically with 2.4 has been raised, I see this as 2.5+ material at
> this point.

I see this on 2.4 schedule:

"2015-06-16 	Soft feature freeze. All features
	should have patches on the list by this date; major features should have
	initial code committed."

It is a 9-line patch (of which 6 are declarations, and 2 are actual
code statements), it doesn't affect anybody who is not explicitly using
"-cpu custom", it is a useful feature, and it has been on the list
since June 8 (and as RFC since April 13).

Unless somebody gives me a good reason to consider it harmful (that's
different from not accepting the explanations why it is useful, or
disagreeing with users that want to use it), I would like to include it
in 2.4 as long as I get Reviewed-by lines before hard freeze.

-- 
Eduardo

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 16:42                 ` Daniel P. Berrange
  2015-06-23 16:47                   ` Andreas Färber
@ 2015-06-23 21:23                   ` Michael S. Tsirkin
  2015-06-24  8:52                     ` Daniel P. Berrange
  2015-06-24 14:16                     ` Eduardo Habkost
  1 sibling, 2 replies; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-23 21:23 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Andreas Färber,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 05:42:04PM +0100, Daniel P. Berrange wrote:
> On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
> > On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
> > > On Tue, Jun 23, 2015 at 06:15:51PM +0200, Andreas Färber wrote:
> > > > Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
> > > > > On Tue, Jun 23, 2015 at 05:32:42PM +0200, Michael S. Tsirkin wrote:
> > > > >> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> > > > >>> On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> > > > >>>> Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > > > >>>>>> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > > > >>>>>> that will generate a config file that can be loaded using -readconfig, based on
> > > > >>>>>> the -cpu and -machine options provided in the command-line.
> > > > >>>>>
> > > > >>>>> Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > > > >>>>> configuration data to libvirt, but now I think it actually makes sense.
> > > > >>>>> We already have a partial copy of CPU model definitions in libvirt
> > > > >>>>> anyway, but as QEMU changes some CPU models in some machine types (and
> > > > >>>>> libvirt does not do that) we have no real control over the guest CPU
> > > > >>>>> configuration. While what we really want is full control to enforce
> > > > >>>>> stable guest ABI.
> > > > >>>>
> > > > >>>> That sounds like FUD to me. Any concrete data points where QEMU does not
> > > > >>>> have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > > > >>>> for.
> > > > >>>
> > > > >>> What Jiri is saying is that the CPUs change depending on -machine, not
> > > > >>> that the ABI is broken by a given machine.
> > > > >>>
> > > > >>> The problem here is that libvirt needs to provide CPU models whose
> > > > >>> runnability does not depend on the machine-type. If users have a VM that
> > > > >>> is running in a host and the VM machine-type changes,
> > > > >>
> > > > >> How does it change, and why?
> > > > > 
> > > > > Sometimes we add features to a CPU model because they were not emulated by KVM
> > > > > and now they are. Sometimes we remove or add features or change other fields
> > > > > because we are fixing previous mistakes. Recently we were going to remove
> > > > > features from models because of an Intel CPU erratum, but then decided to create
> > > > > a new CPU model name instead.
> > > > > 
> > > > > See some examples at the end of this message.
> > > > > 
> > > > >>
> > > > >>> the VM should be
> > > > >>> still runnable in that host. QEMU doesn't provide that, our CPU models
> > > > >>> may change when we introduce new machine-types, so we are giving them a
> > > > >>> mechanism that allows libvirt to implement the policy they need.
> > > > >>
> > > > >> I don't mind wrt CPU specifically, but we absolutely do change guest ABI
> > > > >> in many ways when we change machine types.
> > > > > 
> > > > > All the other ABI changes we introduce in QEMU don't affect runnability of the
> > > > > VM in a given host, that's the problem we are trying to address here. ABI
> > > > > changes are expected when changing to a new machine, runnability changes
> > > > > aren't.
> > > > > 
> > > > > 
> > > > > Examples of commits changing CPU models:
> > > > [snip]
> > > > 
> > > > I've always advocated remaining backwards-compatible and only making CPU
> > > > model changes for new machines. You among others felt that was not
> > > > always necessary, and now you're using the lack thereof as an argument
> > > > to stop using QEMU's CPU models at all? That sounds convoluted...
> > > 
> > > Whether QEMU changed the CPU for existing machines, or only for new
> > > machines is actually not the core problem. Even if we only changed
> > > the CPU in new machines that would still be an unsatisfactory situation
> > > because we want to be able to access different versions of
> > > the CPU without the machine type changing, and access different versions
> > > of the machine type, without the CPU changing. IOW it is the fact that the
> > > changes in CPU are tied to changes in machine type that is the core
> > > problem.
> > 
> > But that's because we are fixing bugs.  If CPU X used to work on
> > hardware Y in machine type A and stopped in machine type B, this is
> > because we have determined that it's the right thing to do for the
> > guests and the users. We don't break stuff just for fun.
> > Why do you want to bring back the bugs we fixed?
> 
> Huh, I never said we wanted to bring back bugs. This is about allowing
> libvirt to fix the CPU bugs in a way that is independent of the machine
> types and portable across hypervisors we deal with. We're absolutely
> still going to fix CPU model bugs and ensure stable guest ABI.
> 
> Regards,
> Daniel

So any single CPU flag now needs to be added in
- kvm
- qemu
- libvirt

Next thing libvirt will decide it's a policy thing and so
needs to be pushed up to openstack.

We should just figure out what you want to do and support it in QEMU.

Are there many examples where a single flag used to work and then
stopped? I would say such a change is problematic anyway:
not everyone uses libvirt, you are breaking things for people
that run -M pc.

IMHO -M pc is not supposed to mean "can break at any time".


> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:13                     ` [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models Daniel P. Berrange
  2015-06-23 17:29                       ` Andreas Färber
@ 2015-06-23 21:26                       ` Michael S. Tsirkin
  1 sibling, 0 replies; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-23 21:26 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Andreas Färber,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 06:13:24PM +0100, Daniel P. Berrange wrote:
> On Tue, Jun 23, 2015 at 06:47:16PM +0200, Andreas Färber wrote:
> > Am 23.06.2015 um 18:42 schrieb Daniel P. Berrange:
> > > On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
> > >> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
> > >>> On Tue, Jun 23, 2015 at 06:15:51PM +0200, Andreas Färber wrote:
> > >>>> Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
> > >>>> I've always advocated remaining backwards-compatible and only making CPU
> > >>>> model changes for new machines. You among others felt that was not
> > >>>> always necessary, and now you're using the lack thereof as an argument
> > >>>> to stop using QEMU's CPU models at all? That sounds convoluted...
> > >>>
> > >>> Whether QEMU changed the CPU for existing machines, or only for new
> > >>> machines is actually not the core problem. Even if we only changed
> > >>> the CPU in new machines that would still be an unsatisfactory situation
> > >>> because we want to be able to access different versions of
> > >>> the CPU without the machine type changing, and access different versions
> > >>> of the machine type, without the CPU changing. IOW it is the fact that the
> > >>> changes in CPU are tied to changes in machine type that is the core
> > >>> problem.
> > >>
> > >> But that's because we are fixing bugs.  If CPU X used to work on
> > >> hardware Y in machine type A and stopped in machine type B, this is
> > >> because we have determined that it's the right thing to do for the
> > >> guests and the users. We don't break stuff just for fun.
> > >> Why do you want to bring back the bugs we fixed?
> > > 
> > > Huh, I never said we wanted to bring back bugs. This is about allowing
> > > > libvirt to fix the CPU bugs in a way that is independent of the machine
> > > types and portable across hypervisors we deal with. We're absolutely
> > > still going to fix CPU model bugs and ensure stable guest ABI.
> > 
> > No, that's contradictory! Through the -x.y machines we leave bugs in the
> > > old models *exactly* to assure a stable guest ABI. Fixes are only
> > > applied to new machines, thus I'm pointing out that you should not use a
> > new CPU model with an old machine type.
> 
> I'm not saying that libvirt would ever allow a silent guest ABI change.
> Given a libvirt XML config, the guest ABI will never change without an
> explicit action on the part of the app/user to change the XML.
> 
> This is all about dealing with the case where the app / user consciously
> needs/wants to opt-in to a guest ABI change for the guest. eg they wish
> to make use of some bug fix or feature improvement in the new machine
> type, but they do *not* wish to have the CPU model changed, because
> of some CPU model change that is incompatible with their hosts' CPUs.
> Conversely, they may wish to get access to a new CPU model, but not
> wish to have the rest of the guest ABI change. In both cases the user
> is explicitly opting in to the ABI change with knowledge about what
> this might mean for the guest OS. Currently we are tying users'
> hands by forcing CPU and machine types to change in lockstep.
> 
> Regards,
> Daniel

Can we have a specific example please?  It's hard to understand the
facts based on such generic statements.


> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:42                         ` Eduardo Habkost
  2015-06-23 17:55                           ` Andreas Färber
@ 2015-06-23 21:28                           ` Michael S. Tsirkin
  2015-06-24 14:18                             ` Eduardo Habkost
  1 sibling, 1 reply; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-23 21:28 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, Andreas Färber, rth

On Tue, Jun 23, 2015 at 02:42:37PM -0300, Eduardo Habkost wrote:
> On Tue, Jun 23, 2015 at 07:29:15PM +0200, Andreas Färber wrote:
> > In summary you seem to be saying that all the years we have spent
> > fiddling around with those mind-boggling compat_props in QEMU were in
> > vain because libvirt now wants to start their own versioning system to
> > give users more degrees of freedom even when you can't articulate a
> > single concrete reason why users may want to do so.
> 
> I had a similar reaction when I learned about this libvirt
> expectation/requirement I was never aware of. But "we spent lots of
> effort trying to do things differently" doesn't seem like a valid
> justification for a design decision.

"Users will be hurt because they'll run untested configurations"
seems like a valid reason.

> -- 
> Eduardo

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 17:11                     ` Eduardo Habkost
@ 2015-06-23 21:34                       ` Michael S. Tsirkin
  2015-06-24 14:24                         ` Eduardo Habkost
  0 siblings, 1 reply; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-23 21:34 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, Andreas Färber, rth

On Tue, Jun 23, 2015 at 02:11:22PM -0300, Eduardo Habkost wrote:
> Even if it is a bug fix. If it is a change that can make the VM
> unrunnable, it needs to be controlled by a separate flag, not by the
> machine-type.

I agree - command line compatibility is important.  But we are supposed
to provide that.  I am surprised that libvirt suddenly wants to avoid
some command line flags because they are not stable. IMHO we did something
wrong here if so. Maybe there was a valid reason for it. But then won't
it apply to libvirt as well?

Now, if people want to update CPU models outside the QEMU binary,
that might be doable simply by moving them to a separate package,
with a text file that QEMU reads at startup.

> -- 
> Eduardo

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 20:26                                 ` Eduardo Habkost
@ 2015-06-23 21:38                                   ` Michael S. Tsirkin
  0 siblings, 0 replies; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-23 21:38 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Maydell, mimu, qemu-devel, Alexander Graf, borntraeger,
	Igor Mammedov, Paolo Bonzini, Jiri Denemark, Andreas Färber,
	rth

On Tue, Jun 23, 2015 at 05:26:38PM -0300, Eduardo Habkost wrote:
> On Tue, Jun 23, 2015 at 09:41:51PM +0200, Andreas Färber wrote:
> [...]
> > Given that we have had this versioning system for years and no problem
> > specifically with 2.4 has been raised, I see this as 2.5+ material at
> > this point.
> 
> I see this on 2.4 schedule:
> 
> "2015-06-16 	Soft feature freeze. All features
> 	should have patches on the list by this date; major features should have
> 	initial code committed."
> 
> It is a 9-line patch (of which 6 are declarations, and 2 are actual
> code statements), it doesn't affect anybody who is not explicitly using
> "-cpu custom", it is a useful feature, and it has been on the list
> since June 8 (and as RFC since April 13).
> 
> Unless somebody gives me a good reason to consider it harmful (that's
> different from not accepting the explanations why it is useful, or
> disagreeing with users that want to use it), I would like to include it
> in 2.4 as long as I get Reviewed-by lines before hard freeze.

For the record I have nothing against the patch itself.
Might be useful for testing or something.
And how libvirt uses QEMU is, in the end, in the hands of
libvirt developers.

I do care about command line stability generally, and about
people not breaking existing documentation and tools
using QEMU directly as opposed to through libvirt,
that's the only reason I participated in this discussion.

> -- 
> Eduardo

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 21:23                   ` Michael S. Tsirkin
@ 2015-06-24  8:52                     ` Daniel P. Berrange
  2015-06-24 10:31                       ` Michael S. Tsirkin
  2015-06-24 14:16                     ` Eduardo Habkost
  1 sibling, 1 reply; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-24  8:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Andreas Färber,
	Eduardo Habkost

On Tue, Jun 23, 2015 at 11:23:16PM +0200, Michael S. Tsirkin wrote:
> 
> So any single CPU flag now needs to be added in
> - kvm
> - qemu
> - libvirt

This is in fact already the case, and it will also possibly need
to be added to openstack too.

> Next thing libvirt will decide it's a policy thing and so
> needs to be pushed up to openstack.

In fact openstack would really like it if we did exactly that, but
even just having CPUs versioned separately from machine types would
be a big step forward as far as openstack is concerned.

The openstack scheduler does not have full visibility into the way
the guest is going to be configured by libvirt/QEMU, in particular
it does not know anything about machine type that will be used by
the guest.

The compute hosts report what CPU features they can support, and
the user / admin will be able to express what CPU model and/or
features they desire their guest to run with, and the scheduler
has to match that up to decide on hosts to use. If the QEMU
machine type used can alter what the CPU model means in terms
of features, then the scheduling decisions OpenStack is making
are going to be wrong some portion of the time.
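
The mismatch can be sketched in a few lines of Python. This is only an
illustration of the scheduling problem, not OpenStack or libvirt code; the
feature sets, host names and machine-type names are made up for the example.

```python
# Illustrative only: feature sets, host names and machine-type names are
# invented for this sketch.
MODEL_FEATURES = {
    # (cpu model, machine type) -> features the guest would actually see
    ("Model", "machine-old"): {"sse2", "aes"},
    ("Model", "machine-new"): {"sse2", "aes", "xyz"},
}

HOSTS = {
    "host-a": {"sse2", "aes"},
    "host-b": {"sse2", "aes", "xyz"},
}


def runnable_hosts(model, machine_type):
    """Return the hosts whose features cover everything the model needs."""
    needed = MODEL_FEATURES[(model, machine_type)]
    return sorted(name for name, feats in HOSTS.items() if needed <= feats)


# The scheduler only knows the model name, so it has to guess a machine type.
print(runnable_hosts("Model", "machine-old"))  # both hosts qualify
print(runnable_hosts("Model", "machine-new"))  # only host-b qualifies
```

If the scheduler assumes the "machine-old" definition but the guest is later
started with "machine-new", it may pick host-a, which cannot provide "xyz".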

So from the POV of the OpenStack scheduler, we'd much rather
have CPU models versioned explicitly so their semantics do not
change behind our back based on machine types.

OpenStack is also looking at how to present a consistent
description of CPU models to users, which is independent of
the cloud. Since libvirt/QEMU doesn't allow 100% control of
the CPU model, OpenStack is likely going to have to define
some kind of manual mapping of its own.

> We should just figure out what you want to do and support it in QEMU.

Main thing is versioned CPU models with fixed semantics that
don't magically change based on other aspects of VM configuration,
such as the machine type. This could be accomplished by QEMU
alone.

Following on from that though, there are two other aspects which
we'd like to address. First, be able to present a consistent
set of CPU models across all hypervisors libvirt manages,
regardless of type or version. This is a key reason why we have
always maintained our own CPU model database, even though it
duplicates QEMU's.

More interesting is the question of host passthrough. We have
two modes for that - either 'host-model' or 'host-passthrough'.
The 'host-passthrough' option is something that explicitly
maps to QEMU's '-cpu host'. This is good because it exposes
100% of the host CPU to the guest. This is bad because it then
prevents use of migration in general, unless both machines
are 100% identical - libvirt just blocks it rather than trying
to do such a fine-grained host CPU check.

For that reason we have 'host-model', which is supposed to be
essentially the same thing, except that instead of '-cpu host' we
explicitly list all the features alongside a named model. Since we
control exactly what the guest is being given, we can permit guests
with 'host-model' to be migrated, even if the destination host
is a superset of the source host, because we know the guest won't
suddenly see a model change after migration. Currently we are
limited though, as we can only express the CPU features - we
cannot expose the other aspects like level, xlevel, etc. So
our 'host-model' is not quite as perfect a match as '-cpu host'
is. The '-cpu custom' would help us get a better match
for 'host-model' by allowing these extra things to be configured.
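
As a rough sketch of why an explicit feature list makes migration checkable
(this is not libvirt's implementation; the function names and feature sets
are hypothetical):

```python
def freeze_host_model(host_features):
    """Snapshot the host's features into an explicit, fixed guest definition."""
    return frozenset(host_features)


def can_migrate(guest_features, dest_host_features):
    """The destination must offer at least everything the guest was given."""
    return guest_features <= frozenset(dest_host_features)


source = {"sse2", "aes", "avx"}
bigger = {"sse2", "aes", "avx", "avx2"}
smaller = {"sse2", "aes"}

guest = freeze_host_model(source)
print(can_migrate(guest, bigger))   # destination is a superset of the source
print(can_migrate(guest, smaller))  # destination lacks "avx"
```

The sketch only covers feature flags; values such as level and xlevel would
need to be part of the frozen definition as well, which is the gap described
above.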

> Are there many examples where a single flag used to work and then
> stopped? I would say such a change is problematic anyway:
> not everyone uses libvirt, you are breaking things for people
> that run -M pc.
> 
> IMHO -M pc is not supposed to mean "can break at any time".

Well 'pc' is an unversioned machine type, so it explicitly is said to
break at any time - users/apps are supposed to translate that into a
versioned type if they want a guarantee of no breakage.
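
A minimal sketch of that translation step, assuming the management layer
keeps an alias table (the table contents and the function name are
illustrative, not a real libvirt or QEMU API):

```python
# Illustrative alias table: which versioned type an alias points at depends
# on the QEMU build, so a real tool would query it rather than hard-code it.
ALIASES = {"pc": "pc-i440fx-2.4"}


def resolve_machine(name, aliases=ALIASES):
    """Replace an unversioned alias with the versioned type it points to."""
    return aliases.get(name, name)


# Store the resolved name in the guest config so the guest ABI stays fixed
# even after the alias moves to a newer machine type.
print(resolve_machine("pc"))
print(resolve_machine("pc-i440fx-2.1"))
```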

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 12:32   ` Andreas Färber
  2015-06-23 15:08     ` Eduardo Habkost
@ 2015-06-24  9:20     ` Jiri Denemark
  2015-06-24 10:21       ` Michael S. Tsirkin
  1 sibling, 1 reply; 81+ messages in thread
From: Jiri Denemark @ 2015-06-24  9:20 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, rth, Eduardo Habkost

On Tue, Jun 23, 2015 at 14:32:00 +0200, Andreas Färber wrote:
> Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> >> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> >> that will generate a config file that can be loaded using -readconfig, based on
> >> the -cpu and -machine options provided in the command-line.
> > 
> > Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > configuration data to libvirt, but now I think it actually makes sense.
> > We already have a partial copy of CPU model definitions in libvirt
> > anyway, but as QEMU changes some CPU models in some machine types (and
> > libvirt does not do that) we have no real control over the guest CPU
> > configuration. While what we really want is full control to enforce
> > stable guest ABI.
> 
> That sounds like FUD to me. Any concrete data points where QEMU does not
> have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> for.

QEMU provides stable ABI for x86 CPUs only if you use -cpu ...,enforce.
Without enforce the CPU may change every time a domain is started or
migrated. A small example: let's say a CPU model called "Model" includes
feature "xyz"; when QEMU is started with -cpu Model (no enforce) on a
host which supports xyz, the guest OS will see a CPU with xyz, but when
you migrate it to a host which does not support xyz, QEMU will just
silently drop xyz. In other words, we need to use enforce to make sure
CPU ABI does not change.
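
The difference can be modelled with plain sets. This is a simplified sketch
of the behaviour described above, not QEMU code; "Model" and "xyz" are the
placeholder names from the example and the helper names are hypothetical.

```python
class MissingFeatures(Exception):
    """Raised when 'enforce' is set and the host lacks a model feature."""


def guest_features(model_features, host_features, enforce=False):
    """Return the features the guest sees on a given host."""
    missing = set(model_features) - set(host_features)
    if enforce and missing:
        # With enforce, refuse to start instead of changing the guest ABI.
        raise MissingFeatures(", ".join(sorted(missing)))
    # Without enforce, unsupported features are silently dropped.
    return set(model_features) - missing


MODEL = {"sse2", "xyz"}
print(sorted(guest_features(MODEL, {"sse2", "xyz"})))  # guest sees xyz
print(sorted(guest_features(MODEL, {"sse2"})))         # xyz silently dropped
try:
    guest_features(MODEL, {"sse2"}, enforce=True)
except MissingFeatures as err:
    print("refused to start, missing:", err)
```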

But the problem is we can't use enforce because we don't know what a
specific CPU model looks like for a given machine type. Remember, while
libvirt allows users to explicitly ask for a specific CPU model and
features, it also has a mode in which libvirt itself computes the right CPU
model and features. And this is impossible with enforce without us
knowing all details about CPU models.

So there are two possible ways to address this:
1. enhance QEMU to give us all we need
    - either by providing commands that would do all the computations
      (CPU model comparison, intersection or common denominator, something
      like -cpu best)
    - or provide a way to probe for all (currently 700+) combinations of
      a CPU model and a machine type without actually having to start
      QEMU with each CPU and a machine type separately

2. manage CPU models in libvirt (aka -cpu custom)

During the past several years Eduardo tried to do (1) without getting
anywhere close to something that QEMU would be willing to accept. On the
other hand (2) is a pretty minimal change to QEMU and is more flexible
than (1) because it allows CPU model versions to be decoupled from
machine types (but this was already discussed a lot in the other emails
in this thread).
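
To make the computations mentioned in (1) concrete, here is a small sketch of
what "best model for a host" and "common denominator of several hosts" could
look like once the full model definitions are known. The model table and the
function names are hypothetical, and the selection rule (largest feature set
that fits) is just one possible policy.

```python
def best_model(models, host_features):
    """Pick the model with the most features that the host fully supports."""
    fitting = {name: feats for name, feats in models.items()
               if feats <= host_features}
    if not fitting:
        return None
    # Break ties by name so the result is deterministic.
    return max(fitting, key=lambda name: (len(fitting[name]), name))


def baseline(host_feature_sets):
    """Features common to every host (the 'denominator')."""
    return set.intersection(*(set(f) for f in host_feature_sets))


MODELS = {
    "ModelA": {"sse2"},
    "ModelB": {"sse2", "aes"},
    "ModelC": {"sse2", "aes", "xyz"},
}

print(best_model(MODELS, {"sse2", "aes", "avx"}))
print(sorted(baseline([{"sse2", "aes", "xyz"}, {"sse2", "aes", "avx"}])))
```

None of this works unless the caller knows exactly which features each model
contains for the machine type in use, which is the point made above.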

Jirka

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24  9:20     ` Jiri Denemark
@ 2015-06-24 10:21       ` Michael S. Tsirkin
  2015-06-24 10:31         ` Daniel P. Berrange
  2015-06-24 10:32         ` Paolo Bonzini
  0 siblings, 2 replies; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-24 10:21 UTC (permalink / raw)
  To: Jiri Denemark
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, rth, Andreas Färber, Eduardo Habkost

On Wed, Jun 24, 2015 at 11:20:50AM +0200, Jiri Denemark wrote:
> On Tue, Jun 23, 2015 at 14:32:00 +0200, Andreas Färber wrote:
> > Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > >> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > >> that will generate a config file that can be loaded using -readconfig, based on
> > >> the -cpu and -machine options provided in the command-line.
> > > 
> > > Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > > configuration data to libvirt, but now I think it actually makes sense.
> > > We already have a partial copy of CPU model definitions in libvirt
> > > anyway, but as QEMU changes some CPU models in some machine types (and
> > > libvirt does not do that) we have no real control over the guest CPU
> > > configuration. While what we really want is full control to enforce
> > > stable guest ABI.
> > 
> > That sounds like FUD to me. Any concrete data points where QEMU does not
> > have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > for.
> 
> QEMU provides stable ABI for x86 CPUs only if you use -cpu ...,enforce.
> Without enforce the CPU may change every time a domain is started or
> migrated. A small example: let's say a CPU model called "Model" includes
> feature "xyz"; when QEMU is started with -cpu Model (no enforce) on a
> host which supports xyz, the guest OS will see a CPU with xyz, but when
> you migrate it to a host which does not support xyz, QEMU will just
> silently drop xyz. In other words, we need to use enforce to make sure
> CPU ABI does not change.

Are there really many examples like this?  Could someone supply some
examples? Eduardo gave examples of CPU changes across machine types
but I haven't seen examples where we would break runnability.

> But the problem is we can't use enforce because we don't know how a
> specific CPU model looks like for a given machine type. Remember, while
> libvirt allows users to explicitly ask for a specific CPU model and
> features, it also has a mode when libvirt itself computes the right CPU
> model and features. And this is impossible with enforce without us
> knowing all details about CPU models.
> 
> So there are two possible ways to address this:
> 1. enhance QEMU to give us all we need
>     - either by providing commands that would do all the computations
>       (CPU model comparison, intersections or denominator, something
>       like -cpu best)
>     - or provide a way to probe for all (currently 700+) combinations of
>       a CPU model and a machine type without actually having to start
>       QEMU with each CPU and a machine type separately
> 
> 2. manage CPU models in libvirt (aka -cpu custom)
> 
> During the past several years Eduardo tried to do (1) without getting
> anywhere close to something that QEMU would be willing to accept.

And the reason, presumably, is that it's a hard problem to solve.
Why is it easier to solve at the libvirt level?

> On the
> other hand (2) is a pretty minimal change to QEMU and is more flexible
> than (1) because it allows CPU model versions to be decoupled from
> machine types (but this was already discussed a lot in the other emails
> in this thread).
> 
> Jirka

I'm fine with the change itself, it's useful e.g. for testing.

But how is it a solution for libvirt's problems?
What is libvirt going to do in the above cases?

-- 
MST

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 10:21       ` Michael S. Tsirkin
@ 2015-06-24 10:31         ` Daniel P. Berrange
  2015-06-24 10:40           ` Michael S. Tsirkin
  2015-06-24 10:32         ` Paolo Bonzini
  1 sibling, 1 reply; 81+ messages in thread
From: Daniel P. Berrange @ 2015-06-24 10:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Andreas Färber,
	Eduardo Habkost

On Wed, Jun 24, 2015 at 12:21:57PM +0200, Michael S. Tsirkin wrote:
> On Wed, Jun 24, 2015 at 11:20:50AM +0200, Jiri Denemark wrote:
> > On Tue, Jun 23, 2015 at 14:32:00 +0200, Andreas Färber wrote:
> > > Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > > >> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > > >> that will generate a config file that can be loaded using -readconfig, based on
> > > >> the -cpu and -machine options provided in the command-line.
> > > > 
> > > > Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > > > configuration data to libvirt, but now I think it actually makes sense.
> > > > We already have a partial copy of CPU model definitions in libvirt
> > > > anyway, but as QEMU changes some CPU models in some machine types (and
> > > > libvirt does not do that) we have no real control over the guest CPU
> > > > configuration. While what we really want is full control to enforce
> > > > stable guest ABI.
> > > 
> > > That sounds like FUD to me. Any concrete data points where QEMU does not
> > > have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > > for.
> > 
> > QEMU provides stable ABI for x86 CPUs only if you use -cpu ...,enforce.
> > Without enforce the CPU may change every time a domain is started or
> > migrated. A small example: let's say a CPU model called "Model" includes
> > feature "xyz"; when QEMU is started with -cpu Model (no enforce) on a
> > host which supports xyz, the guest OS will see a CPU with xyz, but when
> > you migrate it to a host which does not support xyz, QEMU will just
> > silently drop xyz. In other words, we need to use enforce to make sure
> > CPU ABI does not change.
> 
> Are there really many examples like this?  Could someone supply some
> examples? Eduardo gave examples of CPU changes across machine types
> but I haven't seen examples where we would break runnability.
> 
> > But the problem is we can't use enforce because we don't know how a
> > specific CPU model looks like for a given machine type. Remember, while
> > libvirt allows users to explicitly ask for a specific CPU model and
> > features, it also has a mode when libvirt itself computes the right CPU
> > model and features. And this is impossible with enforce without us
> > knowing all details about CPU models.
> > 
> > So there are two possible ways to address this:
> > 1. enhance QEMU to give us all we need
> >     - either by providing commands that would do all the computations
> >       (CPU model comparison, intersections or denominator, something
> >       like -cpu best)
> >     - or provide a way to probe for all (currently 700+) combinations of
> >       a CPU model and a machine type without actually having to start
> >       QEMU with each CPU and a machine type separately
> > 
> > 2. manage CPU models in libvirt (aka -cpu custom)
> > 
> > During the past several years Eduardo tried to do (1) without getting
> > anywhere close to something that QEMU would be willing to accept.
> 
> And the reason, presumably, is because it's a hard problem to solve.
> Why is it easier to solve at the libvirt level?

One of the main reasons it is hard is that QEMU machine types are
not statically introspectable - you have to actually instantiate the
machine type to determine what config it produces. This is ultimately
a limitation of QOM, and while it could be fixed it would be a pretty
significant design change for QEMU at this point. So the reason it
would be simpler in libvirt is that we would not have any need to
attempt such introspection - the data we need would be immediately
available to libvirt in the format it needs to use it in.

The OpenStack scheduling example I mentioned elsewhere is another
case where the current scheme causes pain - at the point at which
OpenStack wants to make decisions about host/guest CPU compatibility,
we don't even have a guest configuration available yet, so we don't
know what machine type we'd want to use, and QEMU isn't even installed
on the hosts doing this decision making. Currently OpenStack just has
to pretend that CPU models don't change based on machine type. Most of
the time we'll be lucky and that won't hurt us, but obviously it is
not a desirable thing to have to do.

> > On the
> > other hand (2) is a pretty minimal change to QEMU and is more flexible
> > than (1) because it allows CPU model versions to be decoupled from
> > machine types (but this was already discussed a lot in the other emails
> > in this thread).
> > 
> > Jirka
> 
> I'm fine with the change itself, it's useful e.g. for testing.
> 
> But how is it a solution for libvirt's problems?
> What is libvirt going to do in the above cases?

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24  8:52                     ` Daniel P. Berrange
@ 2015-06-24 10:31                       ` Michael S. Tsirkin
  0 siblings, 0 replies; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-24 10:31 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Andreas Färber,
	Eduardo Habkost

On Wed, Jun 24, 2015 at 09:52:09AM +0100, Daniel P. Berrange wrote:
> On Tue, Jun 23, 2015 at 11:23:16PM +0200, Michael S. Tsirkin wrote:
> > 
> > So any single CPU flag now needs to be added in
> > - kvm
> > - qemu
> > - libvirt
> 
> This is in fact already the case, and it will also possibly need
> to be added to openstack too.
> 
> > Next thing libvirt will decide it's a policy thing and so
> > needs to be pushed up to openstack.
> 
> In fact openstack would really like it if we did exactly that, but
> even just having CPUs versioned separately from machine types would
> be a big step forward as far as openstack is concerned.
> 
> The openstack scheduler does not have full visibility into the way
> the guest is going to be configured by libvirt/QEMU, in particular
> it does not know anything about machine type that will be used by
> the guest.
> 
> The compute hosts report what CPU features they can support, and
> the user / admin will be able to express what CPU model and/or
> features they desire their guest to run with, and the scheduler
> has to match that up to decide on hosts to use. If the QEMU
> machine type used can alter what the CPU model means in terms
> of features, then the scheduling decisions OpenStack is making
> are going to be wrong some portion of the time.

Is this all just theoretical, or are there real examples
where things stop running? I keep hearing "feature xyz"
and it's impossible to argue reasonably about that IMHO.


> So from the POV of the OpenStack scheduler, we'd much rather
> have CPU models versioned explicitly so their semantics do not
> change behind our back based on machine types.
> 
> OpenStack is also looking at how to present a consistent
> description of CPU models to users, which is independent of
> the cloud. Since libvirt/QEMU doesn't allow 100% control of
> the CPU model, OpenStack is likely going to have to define
> some kind of manual mapping of its own.
>
> > We should just figure out what you want to do and support it in QEMU.
> 
> Main thing is versioned CPU models with fixed semantics that
> don't magically change based on other aspects of VM configuration,
> such as the machine type. This could be accomplished by QEMU
> alone.
> 
> Following on from that though, there's two other aspects which
> we'd like to address. First, be able to present a consistent
> set of CPU models across all hypervisors libvirt manages,
> regardless of type or version. This is a key reason why we have
> always maintained our own CPU model database, even though it
> duplicates QEMU's.


Do you also want to migrate that? If yes, the problem
definitely becomes more than just CPU specific.


> More interesting is the question of host passthrough. We have
> two modes for that - either 'host-model' or 'host-passthrough'.
> The 'host-passthrough' option is something that explicitly
> maps to QEMU's  '-cpu host'. This is good because it exposes
> 100% of the host CPU to the guest. This is bad because it then
> prevents use of migration in general, unless both machines
> are 100% identical - libvirt just blocks it rather than trying
> to do such a fine-grained host CPU check.
> 
> For that reason we have 'host-model', which is supposed to be
> essentially the same thing instead of '-cpu host' we explicitly
> list all the features alongside a named model. Since we control
> exactly what the guest is being given, we can permit guests
> with 'host-model' to be migrated, even if the destination host
> is a superset of the source host, we know the guest won't
> suddenly see a model change after migration. Currently we are
> limited though, as we can only express the CPU features - we
> cannot expose the other aspects like level, xlevel, etc. So
> our 'host-model' is not quite as perfect a  match as '-cpu host'
> is. The '-cpu custom' would help us getting a better match
> for 'host-model' by allowing these extra things to be configured.

This duplicates code from QEMU though.
It looks like we need a tool to get the legal CPUs
that can run on the given host?
Would be easy to add, reusing QEMU codebase.
Maybe make it a new QEMU flag.


> > Are there many examples where a single flag used to work and then
> > stopped? I would say such a change is problematic anyway:
> > not everyone uses libvirt, you are breaking things for people
> > that run -M pc.
> > 
> > IMHO -M pc is not supposed to mean "can break at any time".
> 
> Well 'pc' is an unversioned machine type, so it explicitly is said to
> break at any time - users/apps are supposed to translate that into a
> versioned type if they want a guarantee of no breakage.
> 
> Regards,
> Daniel

That's because you are looking at it from libvirt perspective.  From
QEMU command line, since pc is the default, this makes no sense IMHO:
look up usage advice on the internet, and you will see no one
specifies a machine type.


> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 10:21       ` Michael S. Tsirkin
  2015-06-24 10:31         ` Daniel P. Berrange
@ 2015-06-24 10:32         ` Paolo Bonzini
  1 sibling, 0 replies; 81+ messages in thread
From: Paolo Bonzini @ 2015-06-24 10:32 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jiri Denemark
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	rth, Andreas Färber, Eduardo Habkost

On 24/06/2015 12:21, Michael S. Tsirkin wrote:
> > QEMU provides stable ABI for x86 CPUs only if you use -cpu ...,enforce.
> > Without enforce the CPU may change every time a domain is started or
> > migrated. A small example: let's say a CPU model called "Model" includes
> > feature "xyz"; when QEMU is started with -cpu Model (no enforce) on a
> > host which supports xyz, the guest OS will see a CPU with xyz, but when
> > you migrate it to a host which does not support xyz, QEMU will just
> > silently drop xyz. In other words, we need to use enforce to make sure
> > CPU ABI does not change.
> 
> Are there really many examples like this?  Could someone supply some
> examples? Eduardo gave examples of CPU changes across machine types
> but I haven't seen examples where we would break runnability.

Same here, and I would be quite surprised to see any.  Except perhaps
when we introduced x2apic: that would have broken kernels <2.6.32, but I
hope we can ignore those.

At least as far as I've maintained KVM, we've not added new models to
QEMU after KVM's kernel support was added.

The only case I can imagine is that the kernel is ancient so it doesn't
have anything newer than say IvyBridge, while QEMU is new.  If the user
starts a Haswell VM on a SandyBridge host without "enforce", you will
have a problem when you reboot and get a newer kernel, but this is
independent of machine types.

We have added new features in exactly three cases:

1) F16C and RDRAND to {Haswell,Broadwell} in 2.3;

2) MOVBE to n270 in 1.5;

3) PCLMULQDQ to Westmere also in 1.5.

In all three cases, libvirt is still buggy and doesn't let you use the
features if you have the appropriate host.

We were close to breaking libvirt's expectations when we wanted to
remove TSX from Haswell/Broadwell, but in the end we did it right by
adding separate CPU models and no final release broke libvirt.
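
The way a model's meaning ends up depending on the machine type can be
sketched as a base definition plus per-machine-type compatibility overrides.
This is a simplified illustration, not QEMU's actual data structures; the
feature lists are heavily abbreviated and the machine-type names are
illustrative.

```python
# Abbreviated, illustrative data: real models carry far more features.
BASE_MODELS = {
    "Haswell": {"avx2", "f16c", "rdrand"},
}

# Features that an older machine type switches off again, so that guests
# started with that machine type keep seeing the CPU they always saw.
COMPAT_DISABLE = {
    "machine-2.2": {"Haswell": {"f16c", "rdrand"}},
}


def effective_features(model, machine_type):
    """Features the guest sees for this model under this machine type."""
    disabled = COMPAT_DISABLE.get(machine_type, {}).get(model, set())
    return BASE_MODELS[model] - disabled


print(sorted(effective_features("Haswell", "machine-2.2")))
print(sorted(effective_features("Haswell", "machine-2.3")))
```

A management layer that keeps its own single definition of "Haswell" has no
way to express this difference, which is why it cannot currently tell what a
given model name means under a given machine type.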

So, what is the broken case?  And BTW, what is _exactly_ preventing
libvirt from using enforce for models other than qemu32/qemu64?

Paolo

* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 10:31         ` Daniel P. Berrange
@ 2015-06-24 10:40           ` Michael S. Tsirkin
  0 siblings, 0 replies; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-24 10:40 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Andreas Färber,
	Eduardo Habkost

On Wed, Jun 24, 2015 at 11:31:37AM +0100, Daniel P. Berrange wrote:
> On Wed, Jun 24, 2015 at 12:21:57PM +0200, Michael S. Tsirkin wrote:
> > On Wed, Jun 24, 2015 at 11:20:50AM +0200, Jiri Denemark wrote:
> > > On Tue, Jun 23, 2015 at 14:32:00 +0200, Andreas Färber wrote:
> > > > Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > > > >> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > > > >> that will generate a config file that can be loaded using -readconfig, based on
> > > > >> the -cpu and -machine options provided in the command-line.
> > > > > 
> > > > > Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > > > > configuration data to libvirt, but now I think it actually makes sense.
> > > > > We already have a partial copy of CPU model definitions in libvirt
> > > > > anyway, but as QEMU changes some CPU models in some machine types (and
> > > > > libvirt does not do that) we have no real control over the guest CPU
> > > > > configuration. While what we really want is full control to enforce
> > > > > stable guest ABI.
> > > > 
> > > > That sounds like FUD to me. Any concrete data points where QEMU does not
> > > > have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > > > for.
> > > 
> > > QEMU provides stable ABI for x86 CPUs only if you use -cpu ...,enforce.
> > > Without enforce the CPU may change every time a domain is started or
> > > migrated. A small example: let's say a CPU model called "Model" includes
> > > feature "xyz"; when QEMU is started with -cpu Model (no enforce) on a
> > > host which supports xyz, the guest OS will see a CPU with xyz, but when
> > > you migrate it to a host which does not support xyz, QEMU will just
> > > silently drop xyz. In other words, we need to use enforce to make sure
> > > CPU ABI does not change.
> > 
> > Are there really many examples like this?  Could someone supply some
> > examples? Eduardo gave examples of CPU changes across machine types
> > but I haven't seen examples where we would break runnability.

^^^ ???


> > > But the problem is we can't use enforce because we don't know how a
> > > specific CPU model looks like for a given machine type. Remember, while
> > > libvirt allows users to explicitly ask for a specific CPU model and
> > > features, it also has a mode when libvirt itself computes the right CPU
> > > model and features. And this is impossible with enforce without us
> > > knowing all details about CPU models.
> > > 
> > > So there are two possible ways to address this:
> > > 1. enhance QEMU to give us all we need
> > >     - either by providing commands that would do all the computations
> > >       (CPU model comparison, intersections or denominator, something
> > >       like -cpu best)
> > >     - or provide a way to probe for all (currently 700+) combinations of
> > >       a CPU model and a machine type without actually having to start
> > >       QEMU with each CPU and a machine type separately
> > > 
> > > 2. manage CPU models in libvirt (aka -cpu custom)
> > > 
> > > During the past several years Eduardo tried to do (1) without getting
> > > anywhere close to something that QEMU would be willing to accept.
> > 
> > And the reason, presumably, is because it's a hard problem to solve.
> > Why is it easier to solve at the libvirt level?
> 
> One of the main reasons it is hard is because QEMU machine types are
> not statically introspectable - you have to actually instantiate the
> machine type to determine what config it produces.
> This is ultimately
> a limitation of QOM, and while it could be fixed it would be a pretty
> significant design change for QEMU at this point. So the reason it
> would be simpler in libvirt is that we would not have any need to
> attempt such introspection - the data we need would be immediately
> available to libvirt in the format it needs to use it in.

Why do you want to poke at QEMU machine types?
Looks like a small utility printing list of CPU types
compatible with the host should be enough.

> 
> The OpenStack scheduling example I mentioned elsewhere is another
> reason where the current scheme causes pain - the point at which
> OpenStack wants to make decisions about host/guest CPU compatibility,
> we don't even have a guest configuration available yet, so we don't
> know what machine type we'd want to use, and QEMU isn't even installed
> on the hosts doing this decision making.

And libvirt is installed? Make that host query utility separate from
qemu then, make libvirt depend on it.

> Currently OpenStack just has
> to pretend that CPU models don't change based on machine type. Most of
> the time we'll be lucky and that won't hurt us, but obviously it is
> not a desirable thing to have to do.

I still wonder whether it's a theoretical problem.
If yes, why write a ton of code to support it?

> > > On the
> > > other hand (2) is a pretty minimal change to QEMU and is more flexible
> > > than (1) because it allows CPU model versions to be decoupled from
> > > machine types (but this was already discussed a lot in the other emails
> > > in this thread).
> > > 
> > > Jirka
> > 
> > I'm fine with the change itself, it's useful e.g. for testing.
> > 
> > But how is it a solution for libvirt's problems?
> > What is libvirt going to do in the above cases?
> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 21:23                   ` Michael S. Tsirkin
  2015-06-24  8:52                     ` Daniel P. Berrange
@ 2015-06-24 14:16                     ` Eduardo Habkost
  2015-06-24 14:19                       ` Michael S. Tsirkin
  2015-06-24 14:38                       ` Paolo Bonzini
  1 sibling, 2 replies; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-24 14:16 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, Andreas Färber, rth

On Tue, Jun 23, 2015 at 11:23:16PM +0200, Michael S. Tsirkin wrote:
> On Tue, Jun 23, 2015 at 05:42:04PM +0100, Daniel P. Berrange wrote:
> > On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
> > > On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
> > > > On Tue, Jun 23, 2015 at 06:15:51PM +0200, Andreas Färber wrote:
> > > > > Am 23.06.2015 um 17:58 schrieb Eduardo Habkost:
> > > > > > On Tue, Jun 23, 2015 at 05:32:42PM +0200, Michael S. Tsirkin wrote:
> > > > > >> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> > > > > >>> On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> > > > > >>>> Am 08.06.2015 um 22:18 schrieb Jiri Denemark:
> > > > > >>>>>> To help libvirt in the transition, a x86-cpu-model-dump script is provided,
> > > > > >>>>>> that will generate a config file that can be loaded using -readconfig, based on
> > > > > >>>>>> the -cpu and -machine options provided in the command-line.
> > > > > >>>>>
> > > > > >>>>> Thanks Eduardo, I never was a big fan of moving (or copying) all the CPU
> > > > > >>>>> configuration data to libvirt, but now I think it actually makes sense.
> > > > > >>>>> We already have a partial copy of CPU model definitions in libvirt
> > > > > >>>>> anyway, but as QEMU changes some CPU models in some machine types (and
> > > > > >>>>> libvirt does not do that) we have no real control over the guest CPU
> > > > > >>>>> configuration. While what we really want is full control to enforce
> > > > > >>>>> stable guest ABI.
> > > > > >>>>
> > > > > >>>> That sounds like FUD to me. Any concrete data points where QEMU does not
> > > > > >>>> have a stable ABI for x86 CPUs? That's what we have the pc*-x.y machines
> > > > > >>>> for.
> > > > > >>>
> > > > > > >>> What Jiri is saying is that the CPUs change depending on -machine, not
> > > > > >>> that the ABI is broken by a given machine.
> > > > > >>>
> > > > > >>> The problem here is that libvirt needs to provide CPU models whose
> > > > > >>> runnability does not depend on the machine-type. If users have a VM that
> > > > > >>> is running in a host and the VM machine-type changes,
> > > > > >>
> > > > > >> How does it change, and why?
> > > > > > 
> > > > > > Sometimes we add features to a CPU model because they were not emulated by KVM
> > > > > > and now they are. Sometimes we remove or add features or change other fields
> > > > > > > because we are fixing previous mistakes. Recently we were going to remove
> > > > > > > features from models because of an Intel CPU erratum, but then decided to create
> > > > > > a new CPU model name instead.
> > > > > > 
> > > > > > See some examples at the end of this message.
> > > > > > 
> > > > > >>
> > > > > >>> the VM should be
> > > > > >>> still runnable in that host. QEMU doesn't provide that, our CPU models
> > > > > >>> may change when we introduce new machine-types, so we are giving them a
> > > > > >>> mechanism that allows libvirt to implement the policy they need.
> > > > > >>
> > > > > >> I don't mind wrt CPU specifically, but we absolutely do change guest ABI
> > > > > >> in many ways when we change machine types.
> > > > > > 
> > > > > > All the other ABI changes we introduce in QEMU don't affect runnability of the
> > > > > > VM in a given host, that's the problem we are trying to address here. ABI
> > > > > > changes are expected when changing to a new machine, runnability changes
> > > > > > aren't.
> > > > > > 
> > > > > > 
> > > > > > Examples of commits changing CPU models:
> > > > > [snip]
> > > > > 
> > > > > I've always advocated remaining backwards-compatible and only making CPU
> > > > > model changes for new machines. You among others felt that was not
> > > > > always necessary, and now you're using the lack thereof as an argument
> > > > > to stop using QEMU's CPU models at all? That sounds convoluted...
> > > > 
> > > > Whether QEMU changed the CPU for existing machines, or only for new
> > > > machines is actually not the core problem. Even if we only changed
> > > > the CPU in new machines that would still be an unsatisfactory situation
> > > > because we want to be able to access different versions of
> > > > the CPU without the machine type changing, and access different versions
> > > > of the machine type, without the CPU changing. IOW it is the fact that the
> > > > changes in CPU are tied to changes in machine type that is the core
> > > > problem.
> > > 
> > > But that's because we are fixing bugs.  If CPU X used to work on
> > > hardware Y in machine type A and stopped in machine type B, this is
> > > because we have determined that it's the right thing to do for the
> > > guests and the users. We don't break stuff just for fun.
> > > Why do you want to bring back the bugs we fixed?
> > 
> > Huh, I never said we wanted to bring back bugs. This is about allowing
> > libvirt to fix the CPU bugs in a way that is independent of the machine
> > types and portable across hypervisors we deal with. We're absolutely
> > still going to fix CPU model bugs and ensure stable guest ABI.
> > 
> > Regards,
> > Daniel
> 
> So any single CPU flag now needs to be added in
> - kvm
> - qemu
> - libvirt
> 
> Next thing libvirt will decide it's a policy thing and so
> needs to be pushed up to openstack.

I don't think that will happen, but if they really decide to do it, why
should we try to stop them? libvirt and OpenStack know what their users
do/need better than us, and if they believe moving data to OpenStack
will provide what users need, they are free to do it. I trust libvirt
developers to do the right thing, here.

> 
> We should just figure out what you want to do and support it in QEMU.
> 
> Are there many examples where a single flag used to work and then
> stopped? I would say such a change is problematic anyway:
> not everyone uses libvirt, you are breaking things for people
> that run -M pc.

People using -M pc have to live with the fact that the host-side
requirements of -M pc may change in newer QEMU versions.

(Again, this is not about ABI changes, but about adding new host-side
hardware/kernel requirements to make a VM run)

> 
> IMHO -M pc is not supposed to mean "can break at any time".

It means "it may have new host-side requirements and may become unrunnable
in your host (or require additional command-line flags to run) at any time".

-- 
Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 21:28                           ` Michael S. Tsirkin
@ 2015-06-24 14:18                             ` Eduardo Habkost
  2015-06-24 14:24                               ` Michael S. Tsirkin
  0 siblings, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-24 14:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, Andreas Färber, rth

On Tue, Jun 23, 2015 at 11:28:06PM +0200, Michael S. Tsirkin wrote:
> On Tue, Jun 23, 2015 at 02:42:37PM -0300, Eduardo Habkost wrote:
> > On Tue, Jun 23, 2015 at 07:29:15PM +0200, Andreas Färber wrote:
> > > In summary you seem to be saying that all the years we have spent
> > > fiddling around with those mind-boggling compat_props in QEMU were in
> > > vain because libvirt now wants to start their own versioning system to
> > > give users more degrees of freedom even when you can't articulate a
> > > single concrete reason why users may want to do so.
> > 
> > I had a similar reaction when I learned about this libvirt
> > expectation/requirement I was never aware of. But "we spent lots of
> > effort trying to do things differently" doesn't seem like a valid
> > justification for a design decision.
> 
> "Users will be hurt because they'll run untested configurations"
> seems like a valid reason.

I trust libvirt developers to test their CPU definitions as carefully as
we test ours.

-- 
Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 14:16                     ` Eduardo Habkost
@ 2015-06-24 14:19                       ` Michael S. Tsirkin
  2015-06-24 14:35                         ` Andreas Färber
  2015-06-24 14:38                       ` Paolo Bonzini
  1 sibling, 1 reply; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-24 14:19 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, Andreas Färber, rth

On Wed, Jun 24, 2015 at 11:16:51AM -0300, Eduardo Habkost wrote:
> > IMHO -M pc is not supposed to mean "can break at any time".
> 
> It means "it may have new host-side requirements and may become unrunnable
> in your host (or require additional command-line flags to run) at any time".

That would be pretty bad. I don't think we ever had such cases in practice.

> -- 
> Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 14:18                             ` Eduardo Habkost
@ 2015-06-24 14:24                               ` Michael S. Tsirkin
  0 siblings, 0 replies; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-24 14:24 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, Andreas Färber, rth

On Wed, Jun 24, 2015 at 11:18:18AM -0300, Eduardo Habkost wrote:
> On Tue, Jun 23, 2015 at 11:28:06PM +0200, Michael S. Tsirkin wrote:
> > On Tue, Jun 23, 2015 at 02:42:37PM -0300, Eduardo Habkost wrote:
> > > On Tue, Jun 23, 2015 at 07:29:15PM +0200, Andreas Färber wrote:
> > > > In summary you seem to be saying that all the years we have spent
> > > > fiddling around with those mind-boggling compat_props in QEMU were in
> > > > vain because libvirt now wants to start their own versioning system to
> > > > give users more degrees of freedom even when you can't articulate a
> > > > single concrete reason why users may want to do so.
> > > 
> > > I had a similar reaction when I learned about this libvirt
> > > expectation/requirement I was never aware of. But "we spent lots of
> > > effort trying to do things differently" doesn't seem like a valid
> > > justification for a design decision.
> > 
> > "Users will be hurt because they'll run untested configurations"
> > seems like a valid reason.
> 
> I trust libvirt developers to test their CPU definitions as carefully as
> we test ours.

Maybe someone can do a write-up explaining how the requirements
and needs differ?

So far, it looks like gratuitous code duplication based on a
misunderstanding, or highly unlikely theoretical what-if scenarios.

Basing our stable interfaces on such grounds might not be a good idea.


> -- 
> Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-23 21:34                       ` Michael S. Tsirkin
@ 2015-06-24 14:24                         ` Eduardo Habkost
  2015-06-24 14:37                           ` Michael S. Tsirkin
  0 siblings, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-24 14:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, Andreas Färber, rth

On Tue, Jun 23, 2015 at 11:34:45PM +0200, Michael S. Tsirkin wrote:
> On Tue, Jun 23, 2015 at 02:11:22PM -0300, Eduardo Habkost wrote:
> > Even if it is a bug fix. If it is a change that can make the VM
> > unrunnable, it needs to be controlled by a separate flag, not by the
> > machine-type.
> 
> I agree - command line compatibility is important.  But we are supposed
> to provide that.  I am surprised that libvirt suddenly wants to avoid
> some command line flags because they are not stable. IMHO we did something
> wrong here if so. Maybe there was a valid reason for it. But then won't
> it apply to libvirt as well?

Maybe we are having the same misunderstanding here: the problem is not
compatibility/stability of existing machines, but the kind of
(intentional) changes introduced in _new_ machines (when the -machine
argument is changed). There are two kinds of changes introduced in new
machines:

1) Guest-side-only ABI changes: those are OK, libvirt normally ignores
   them, they can't make a VM not-runnable.
2) Changes in the host-side dependencies: those need to be more carefully
   controlled by libvirt. That's where CPU features are special: all CPU
   features depend on KVM-side features, and enabling them by default on
   new machines makes it impossible for libvirt to know/report in
   advance what's necessary to make a VM runnable and to implement their
   existing runnability APIs[1].

Unless we guarantee that QEMU would never introduce type-(2) changes in
new machines (which I don't think will ever happen because that means
never changing existing CPU models in QEMU), libvirt needs to control
CPU features individually (that's why they need -cpu custom).
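
To make the type-(2) case concrete, here is a minimal sketch of the
runnability check, assuming a CPU model is nothing more than a set of
feature names. The model name, machine-type names and feature sets below
are made up for illustration; they are not QEMU's real definitions.

```python
# Hypothetical tables, for illustration only (not QEMU's real CPU models).
MODEL_FEATURES = {
    # (cpu model, machine-type) -> features enabled by default
    ("ExampleCPU", "pc-2.3"): {"sse2", "aes", "avx"},
    ("ExampleCPU", "pc-2.4"): {"sse2", "aes", "avx", "newfeat"},
}


def missing_features(model, machine, host_features):
    """Return the features the host lacks for this model + machine-type."""
    return MODEL_FEATURES[(model, machine)] - set(host_features)


def is_runnable(model, machine, host_features):
    """A VM is runnable only if the host provides every required feature."""
    return not missing_features(model, machine, host_features)


host = {"sse2", "aes", "avx"}
print(is_runnable("ExampleCPU", "pc-2.3", host))
print(sorted(missing_features("ExampleCPU", "pc-2.4", host)))
```

The point of the sketch is that the answer depends on the machine-type
argument, which is exactly what an API like virConnectCompareCPU() does
not receive.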

> 
> Now, if people want to update CPU models outside the QEMU binary,
> that might be doable simply by moving them to a separate package,
> with a text file that QEMU reads at startup.

You seem to be describing exactly what is made possible by
"-cpu custom -readconfig <cpu-config-file>"
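
As a rough sketch of what such a file could look like when generated
outside QEMU: the [global] driver/property/value section layout is the
existing -readconfig syntax, but the driver name "cpu" and the property
names used here are placeholders, since the real ones depend on what the
"custom" model ends up exposing.

```python
def cpu_config(features, driver="cpu"):
    """Render one [global] section per CPU property.

    `driver` and the property names are placeholders for this sketch.
    """
    sections = []
    for name in sorted(features):
        value = "on" if features[name] else "off"
        sections.append(
            "[global]\n"
            f'  driver = "{driver}"\n'
            f'  property = "{name}"\n'
            f'  value = "{value}"\n'
        )
    return "\n".join(sections)


# Hypothetical feature choices, for illustration only.
print(cpu_config({"aes": True, "avx": False}))
```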

[1] http://libvirt.org/html/libvirt-libvirt-host.html#virConnectCompareCPU
    (See the cover letter of this series)

-- 
Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 14:19                       ` Michael S. Tsirkin
@ 2015-06-24 14:35                         ` Andreas Färber
  2015-06-24 14:57                           ` Michael S. Tsirkin
  0 siblings, 1 reply; 81+ messages in thread
From: Andreas Färber @ 2015-06-24 14:35 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Eduardo Habkost

Am 24.06.2015 um 16:19 schrieb Michael S. Tsirkin:
> On Wed, Jun 24, 2015 at 11:16:51AM -0300, Eduardo Habkost wrote:
>>> IMHO -M pc is not supposed to mean "can break at any time".
>>
>> It means "it may have new host-side requirements and may become unrunnable
>> in your host (or require additional command-line flags to run) at any time".
> 
> That would be pretty bad. I don't think we ever had such cases in practice.

Why is that bad or unexpected? If you install new software, it may have
new dependencies. QEMU would be no different from other software there.

Yesterday Eduardo said it was about having a fixed version installed and
therein switching from a legacy machine to a newer machine. In that case
the dependencies existed ever since installation but were not
immediately visible to the end user due to our stability guarantees.

Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 14:24                         ` Eduardo Habkost
@ 2015-06-24 14:37                           ` Michael S. Tsirkin
  2015-06-24 15:44                             ` [Qemu-devel] Not introducing new host-side requirements on new machine-type versions (was Re: [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models) Eduardo Habkost
  0 siblings, 1 reply; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-24 14:37 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, Andreas Färber, rth

On Wed, Jun 24, 2015 at 11:24:46AM -0300, Eduardo Habkost wrote:
> On Tue, Jun 23, 2015 at 11:34:45PM +0200, Michael S. Tsirkin wrote:
> > On Tue, Jun 23, 2015 at 02:11:22PM -0300, Eduardo Habkost wrote:
> > > Even if it is a bug fix. If it is a change that can make the VM
> > > unrunnable, it needs to be controlled by a separate flag, not by the
> > > machine-type.
> > 
> > I agree - command line compatibility is important.  But we are supposed
> > to provide that.  I am surprised that libvirt suddenly wants to avoid
> > some command line flags because they are not stable. IMHO we did something
> > wrong here if so. Maybe there was a valid reason for it. But then won't
> > it apply to libvirt as well?
> 
> Maybe we are having the same misunderstanding here: the problem is not
> compatibility/stability of existing machines, but the kind of
> (intentional) changes introduced in _new_ machines (when the -machine
> argument is changed). There are two kinds of changes introduced in new
> machines:
> 
> 1) Guest-side-only ABI changes: those are OK, libvirt normally ignores
>    them, they can't make a VM not-runnable.
> 2) Changes in the host-side dependencies: those need to be more carefully
>    controlled by libvirt. That's where CPU features are special: all CPU
>    features depend on KVM-side features, and enabling them by default on
>    new machines makes it impossible for libvirt to know/report in
>    advance what's necessary to make a VM runnable and to implement their
>    existing runnability APIs[1].

Not all that special.  e.g. virtio device offloads depend on host
networking's ability to offload them too. And hey, QEMU is part of host
too, so if you want to be pedantic, any device configuration might be
broken on an old host and fixed on a new one, so a VM seems to run but
will lose data or crash.

All this is unusual enough that no one bothers.

> Unless we guarantee that QEMU would never introduce type-(2) changes in
> new machines (which I don't think will ever happen because that means
> never changing existing CPU models in QEMU), libvirt needs to control
> CPU features individually (that's why they need -cpu custom).

Did we ever do it? I don't think so, so yes, we very likely never will.

It's as if someone wrote a wrapper around all kernel system calls
on the assumption that kernel can not guarantee kernel/userspace ABI
will never change. It can and it does.

So let's promise not to break things, and avoid a ton of copy and paste bugs.


> > 
> > Now, if people want to update CPU models outside the QEMU binary,
> > that might be doable simply by moving them to a separate package,
> > with a text file that QEMU reads at startup.
> 
> You seem to be describing exactly what is made possible by
> "-cpu custom -readconfig <cpu-config-file>"
> 
> 
> [1] http://libvirt.org/html/libvirt-libvirt-host.html#virConnectCompareCPU
>     (See the cover letter of this series)

Only if you also add the config files as part of qemu, reworking
existing models to load config file from a pre-defined directory.

My whole point is to avoid useless code duplication.

> -- 
> Eduardo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 14:16                     ` Eduardo Habkost
  2015-06-24 14:19                       ` Michael S. Tsirkin
@ 2015-06-24 14:38                       ` Paolo Bonzini
  2015-06-24 14:54                         ` Peter Maydell
  2015-06-24 15:58                         ` Eduardo Habkost
  1 sibling, 2 replies; 81+ messages in thread
From: Paolo Bonzini @ 2015-06-24 14:38 UTC (permalink / raw)
  To: Eduardo Habkost, Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Jiri Denemark, Andreas Färber, rth

On 24/06/2015 16:16, Eduardo Habkost wrote:
> > So any single CPU flag now needs to be added in
> > - kvm
> > - qemu
> > - libvirt
> > 
> > Next thing libvirt will decide it's a policy thing and so
> > needs to be pushed up to openstack.
> 
> I don't think that will happen, but if they really decide to do it, why
> should we try to stop them? libvirt and OpenStack know what their users
> do/need better than us, and if they believe moving data to OpenStack
> will provide what users need, they are free to do it. I trust libvirt
> developers to do the right thing, here.

After talking to Daniel for more than 1 hour I actually think that
OpenStack's scheduler is totally broken with respect to QEMU upgrades.
For some reason they focused on CPU model changes, but actually there's
no past case where runnability actually changed with a QEMU upgrade, and
there will be no such future case; even without "enforce", QEMU
introduces new models long after all features have been available in
KVM.  At this point CPU is no different from any other device.

At the same time, OpenStack's scheduler is not trying to use a machine
type that is available on all nodes of a compute node pool.  I'm not
sure how one can be sure that migration would succeed in these
circumstances.

It's certainly okay for libvirt and OpenStack to use the host CPU
features in order to check whether a node will run a given VM.  However,
libvirt should trust that QEMU developers will not prevent a VM from
running on a previously viable host, just because you change the machine
type.

And OpenStack should _really_ use the machine type as the abstract
representation of what is compatible with what inside a node pool.  This
is what RHEV does, and I see no reason to do it differently.  In
particular there will not be excessive fragmentation of the nodes into
too many pools, because a deployment with too many active machine types
is just asking for trouble.
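
A minimal sketch of that idea: pick the newest machine type that every
node in a pool supports, so any VM started with it can migrate within the
pool. The node names and machine-type lists are hypothetical, and the
version parsing assumes all names belong to the same chipset family and
end in "-<major>.<minor>"; a real scheduler would query each node's QEMU.

```python
import re


def version_key(machine_type):
    """Sort key for names like 'pc-i440fx-2.3': the numeric version tuple."""
    suffix = machine_type.rsplit("-", 1)[-1]
    return tuple(int(n) for n in re.findall(r"\d+", suffix))


def common_machine_type(pool):
    """Newest machine type available on every node, or None if there is none."""
    if not pool:
        return None
    common = set.intersection(*(set(types) for types in pool.values()))
    return max(common, key=version_key) if common else None


# Hypothetical pool, for illustration only.
pool = {
    "node1": ["pc-i440fx-2.2", "pc-i440fx-2.3"],
    "node2": ["pc-i440fx-2.2", "pc-i440fx-2.3", "pc-i440fx-2.4"],
}
print(common_machine_type(pool))
```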

The possible exceptions are weird cases involving nested virt and
outdated QEMU on the L0 host, but even those cases can be fixed just by
having an up-to-date cpu_map.xml.

Other random notes from my chat:

1) libvirt should _not_ change the flags that the user passes via XML
just because it thinks that QEMU has those flags.  This makes it
possible for libvirt to keep cpu_map.xml up-to-date without worrying
about versioning.

2) libvirt should not add/remove flags when the user specifies
host-model (i.e. -cpu SandyBridge, not -cpu
SandyBridge,+f16c,+rdrand,+erms,+whatever).  host-model has had so many
bugs reported for it that I hope this could be done unconditionally even
if it is not backwards-compatible.  Or perhaps introduce a new name and
deprecate host-model.  I don't know.

3) regarding "enforce", there are indeed some cases where it would break:

- Haswell/Broadwell CPU model after TSX removal

- qemu64 with KVM

- pretty much everything including qemu64 with TCG

So libvirt here could allow the user to specify enforce _now_, either
via XML or via qemu.conf (or via XML + a default specified via qemu.conf).

So, I _hate_ to block a feature that is anyway useful for debugging.
And this feels too much like the typical "kernel developers don't like
systemd" rant.  But it looks like this feature is not only too easy to
misuse: there are _already_ plans for misusing it and sweeping the whole
machine compatibility issue under the rug in OpenStack.  So I agree with
Andreas, and would prefer to have a serious discussion with the
OpenStack folks before accepting it.

Paolo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 14:38                       ` Paolo Bonzini
@ 2015-06-24 14:54                         ` Peter Maydell
  2015-06-24 14:56                           ` Paolo Bonzini
  2015-06-24 15:58                         ` Eduardo Habkost
  1 sibling, 1 reply; 81+ messages in thread
From: Peter Maydell @ 2015-06-24 14:54 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Michael Mueller, Michael S. Tsirkin, QEMU Developers,
	Alexander Graf, Christian Borntraeger, Igor Mammedov,
	Jiri Denemark, Richard Henderson, Andreas Färber,
	Eduardo Habkost

On 24 June 2015 at 15:38, Paolo Bonzini <pbonzini@redhat.com> wrote:
> It's certainly okay for libvirt and OpenStack to use the host CPU
> features in order to check whether a node will run a given VM.  However,
> libvirt should trust that QEMU developers will not prevent a VM from
> running on a previously viable host, just because you change the machine
> type.

Note that this might be true for x86 but isn't (in full generality)
necessarily so for other architectures like ARM. For instance if
you have a host machine with a GICv3 with no v2-back-compatibility
support, then you can run a VM whose machine type specifies a GICv3,
but not machine types that require a GICv2...

thanks
-- PMM


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 14:54                         ` Peter Maydell
@ 2015-06-24 14:56                           ` Paolo Bonzini
  0 siblings, 0 replies; 81+ messages in thread
From: Paolo Bonzini @ 2015-06-24 14:56 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Michael Mueller, Michael S. Tsirkin, QEMU Developers,
	Alexander Graf, Christian Borntraeger, Igor Mammedov,
	Jiri Denemark, Richard Henderson, Andreas Färber,
	Eduardo Habkost



On 24/06/2015 16:54, Peter Maydell wrote:
> > It's certainly okay for libvirt and OpenStack to use the host CPU
> > features in order to check whether a node will run a given VM.  However,
> > libvirt should trust that QEMU developers will not prevent a VM from
> > running on a previously viable host, just because you change the machine
> > type.
> 
> Note that this might be true for x86 but isn't (in full generality)
> necessarily so for other architectures like ARM. For instance if
> you have a host machine with a GICv3 with no v2-back-compatibility
> support, then you can run a VM whose machine type specifies a GICv3,
> but not machine types that require a GICv2...

Hmm, that was too terse.  Just because you change the machine type *to a
newer version* with the same chipset.

Paolo


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 14:35                         ` Andreas Färber
@ 2015-06-24 14:57                           ` Michael S. Tsirkin
  2015-06-24 15:43                             ` Andreas Färber
  0 siblings, 1 reply; 81+ messages in thread
From: Michael S. Tsirkin @ 2015-06-24 14:57 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Eduardo Habkost

On Wed, Jun 24, 2015 at 04:35:13PM +0200, Andreas Färber wrote:
> Am 24.06.2015 um 16:19 schrieb Michael S. Tsirkin:
> > On Wed, Jun 24, 2015 at 11:16:51AM -0300, Eduardo Habkost wrote:
> >>> IMHO -M pc is not supposed to mean "can break at any time".
> >>
> >> It means "it may have new host-side requirements and may become unrunnable
> >> in your host (or require additional command-line flags to run) at any time".
> > 
> > That would be pretty bad. I don't think we ever had such cases in practice.
> 
> Why is that bad or unexpected? If you install new software, it may have
> new dependencies. QEMU would be no different from other software there.

That's just basic compatibility.  If I run using same flags, I expect
compatible behaviour.

How would you like it if each time you update bash, all your scripts had
to stop working, unless you specify a specific compatibility flag in
scripts, in which case you miss (some) bugfixes?

Or if you found code snippets and they can't both work.  Why? Because
snippet 1 says request version 2.4 compatibility and another says use version 2.5
compatibility, so you can't use both.

Each time you break command line compatibility, you
invalidate an unknown chunk of useful documentation somewhere.


> Yesterday Eduardo said it was about having a fixed version installed and
> therein switching from a legacy machine to a newer machine. In that case
> the dependencies existed ever since installation but were not
> immediately visible to the end user due to our stability guarantees.
> 
> Andreas
> 
> -- 
> SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
> 21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 14:57                           ` Michael S. Tsirkin
@ 2015-06-24 15:43                             ` Andreas Färber
  0 siblings, 0 replies; 81+ messages in thread
From: Andreas Färber @ 2015-06-24 15:43 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, rth, Eduardo Habkost

Am 24.06.2015 um 16:57 schrieb Michael S. Tsirkin:
> On Wed, Jun 24, 2015 at 04:35:13PM +0200, Andreas Färber wrote:
>> Am 24.06.2015 um 16:19 schrieb Michael S. Tsirkin:
>>> On Wed, Jun 24, 2015 at 11:16:51AM -0300, Eduardo Habkost wrote:
>>>>> IMHO -M pc is not supposed to mean "can break at any time".
>>>>
>>>> It means "it may have new host-side requirements and may become unrunnable
>>>> in your host (or require additional command-line flags to run) at any time".
>>>
>>> That would be pretty bad. I don't think we ever had such cases in practice.
>>
>> Why is that bad or unexpected? If you install new software, it may have
>> new dependencies. QEMU would be no different from other software there.
> 
> That's just basic compatibility.  If I run using same flags, I expect
> compatible behaviour.
> 
> How would you like it if each time you update bash, all your scripts had
> to stop working, unless you specify a specific compatibility flag in
> scripts, in which case you miss (some) bugfixes?
> 
> Or if you found code snippets and they can't both work.  Why? Because
> snippet 1 says request version 2.4 compatibility and another says use version 2.5
> compatibility, so you can't use both.
> 
> Each time you break command line compatibility, you
> invalidate an unknown chunk of useful documentation somewhere.

Not sure if we're talking about the same thing.

For starters, bash or qemu-system-foo may have a dependency on a new
library. sudo recently grew dependencies on some auditing syscall.
Certain machine or CPU level KVM features may require new ioctls.
Therefore new features may have new dependencies, and by default when
not in backwards-compatibility mode we want to enable such new features
and not just some ancient frozen subset - which would be comparable to
Solaris' /bin/sh being a really ancient version with new ones under new
names.

Don't understand what you mean with those two snippets.

Either way it seems a fundamental non-issue at the moment.

qemu64 apart from x2apic did not change much. I certainly didn't propose
any functional feature changes myself and only took patches once okayed
from KVM and x86 reviewers, raising the compat_props card.

However, looking beyond the artificial qemu64, a causal problem beneath
this discussion seems to be our white-listing of CPU features rather
than transparently black-listing all but implemented features.
Apart from omission bugs, there are two ways CPU features can get added,
implementation in TCG and kernel support in KVM. Only the KVM support
depends on the host, whereas the TCG support solely depends on the QEMU
version. Bug fixes would not fall into the implementation-dependent
category. And for the feature-moratorium Paolo proposed I wonder whether
we have any mechanism to assure and test that?

Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* [Qemu-devel] Not introducing new host-side requirements on new machine-type versions (was Re: [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models)
  2015-06-24 14:37                           ` Michael S. Tsirkin
@ 2015-06-24 15:44                             ` Eduardo Habkost
  2015-06-24 15:58                               ` Andreas Färber
  2015-06-24 15:59                               ` Paolo Bonzini
  0 siblings, 2 replies; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-24 15:44 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Paolo Bonzini, Jiri Denemark, Andreas Färber, rth

I think this will reboot the discussion in a good way.

I have talked to Paolo on the phone, and I think I had a big assumption
that was not true:

On Wed, Jun 24, 2015 at 04:37:33PM +0200, Michael S. Tsirkin wrote:
> On Wed, Jun 24, 2015 at 11:24:46AM -0300, Eduardo Habkost wrote:
[...]
> > There are two kinds of changes introduced in new
> > machines:
> > 
> > 1) Guest-side-only ABI changes: those are OK, libvirt normally ignores
> >    them, they can't make a VM not-runnable.
> > 2) Changes in the host-side dependencies: those need to be more carefully
> >    controlled by libvirt. That's where CPU features are special: all CPU
> >    features depend on KVM-side features, and enabling them by default on
> >    new machines makes it impossible for libvirt to know/report in
> >    advance what's necessary to make a VM runnable and to implement their
> >    existing runnability APIs[1].
[...]
> > Unless we guarantee that QEMU would never introduce type-(2) changes in
> > new machines (which I don't think will ever happen because that means
> > never changing existing CPU models in QEMU), libvirt needs to control
> > CPU features individually (that's why they need -cpu custom).
> 
> Did we ever do it? I don't think so, so yes, we very likely never will.

That was my big assumption. We did it before with KVM features (they
require KVM-side support), where new features were enabled by default on
new machine-types, and I assumed we would do that again with any CPU
feature. We did it before when we added x2apic by default on all CPU
models in KVM (requiring a kernel capable of emulating x2apic).

In another message, Paolo wrote:
> libvirt should trust that QEMU developers will not prevent a VM from
> running on a previously viable host, just because you change the machine
> type.

I assumed that was never going to be true.

As long as QEMU guarantees that, so we don't change existing CPU models
(in new machines) in a way that introduces new host-side dependencies,
we will be OK.

We may need something new to implement this guarantee for KVM features,
though. libvirt will need something that says "please don't enable any
KVM CPUID bits silently for me, let me ask for them explicitly". But
that won't be as drastic as requiring "-cpu custom".

That has some consequences for the way we add new CPU models and
implement CPU model changes. For example: until we know all the features
we want in a CPU model are already available and supported in the latest
kernel, we won't add a new CPU model. The choice of features in CPU
models should be "final" as soon as we add the CPU model, so CPU model
changes should never introduce new host-side requirements. If a CPU
model change requires some additional KVM code or newer host CPU, we
need to add a new CPU model name. We must agree on that and document it,
because I expect to see some complaints in the future when enforcing
this rule.
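
A minimal sketch of that rule, assuming a reviewer already knows which
features would need additional KVM code or a newer host CPU. The function
name and the feature sets in the example are hypothetical and are not
taken from QEMU:

```python
def features_blocking_in_place_change(old_features, new_features,
                                      needs_new_host_support):
    """Return the added features that would introduce new host-side
    requirements.

    An empty result means the existing CPU model may be changed in place.
    A non-empty result means the change belongs in a new CPU model name.
    """
    added = set(new_features) - set(old_features)
    return sorted(added & set(needs_new_host_support))


# Hypothetical example data, for illustration only.
old = {"avx2", "bmi1"}
new = old | {"f16c", "rdrand", "mpx"}
# Assumption: of the added features, only "mpx" needs new kernel support.
blocking = features_blocking_in_place_change(old, new, {"mpx"})
if blocking:
    print("add a new CPU model name; blocked by:", ", ".join(blocking))
else:
    print("the existing model can be changed in place")
```

The hard part is maintaining the set of features that need new host-side
support, which is why the rule has to be agreed on and documented.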

> 
> It's as if someone wrote a wrapper around all kernel system calls
> on the assumption that kernel can not guarantee kernel/userspace ABI
> will never change. It can and it does.
> 
> So let's promise not to break things, and avoid a ton of copy and paste bugs.

My assumption was that this (introducing type-(2) machine changes) was
never considered "breaking things" and just a fact of life.

If we guarantee that we will never prevent a VM from running on a
previously viable host, just because you change the machine type, we
will be OK.

-- 
Eduardo


* Re: [Qemu-devel] Not introducing new host-side requirements on new machine-type versions (was Re: [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models)
  2015-06-24 15:44                             ` [Qemu-devel] Not introducing new host-side requirements on new machine-type versions (was Re: [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models) Eduardo Habkost
@ 2015-06-24 15:58                               ` Andreas Färber
  2015-06-24 16:08                                 ` Eduardo Habkost
  2015-06-24 15:59                               ` Paolo Bonzini
  1 sibling, 1 reply; 81+ messages in thread
From: Andreas Färber @ 2015-06-24 15:58 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

Am 24.06.2015 um 17:44 schrieb Eduardo Habkost:
> In another message, Paolo wrote:
>> libvirt should trust that QEMU developers will not prevent a VM from
>> running on a previously viable host, just because you change the machine
>> type.
> 
> I assumed that was never going to be true.
> 
> As long as QEMU guarantees that, so we don't change existing CPU models
> (in new machines) in a way that introduces new host-side dependencies,
> we will be OK.
> 
> We may need something new to implement this guarantee for KVM features,
> though. libvirt will need something that says "please don't enable any
> KVM CPUID bits silently for me, let me ask for them explicitly". But
> that won't be as drastic as requiring "-cpu custom".

That's what I suggested a global property for yesterday. Not sure how it
would be implemented though.

> That has some consequences for the way we add new CPU models and
> implement CPU model changes. For example: until we know all the features
> we want in a CPU model are already available and supported in the latest
> kernel, we won't add a new CPU model. The choice of features in CPU
> models should be "final" as soon as we add the CPU model, so CPU model
> changes should never introduce new host-side requirements. If a CPU
> model change requires some additional KVM code or newer host CPU, we
> need to add a new CPU model name. We must agree on that and document it,
> because I expect to see some complaints in the future when enforcing
> this rule.
> 
>>
>> It's as if someone wrote a wrapper around all kernel system calls
>> on the assumption that kernel can not guarantee kernel/userspace ABI
>> will never change. It can and it does.
>>
>> So let's promise not to break things, and avoid a ton of copy and paste bugs.
> 
> My assumption was that this (introducing type-(2) machine changes) was
> never considered "breaking things" and just a fact of life.
> 
> If we guarantee that we will never prevent a VM from running on a
> previously viable host, just because you change the machine type, we
> will be OK.

Could you clarify whether that is for KVM only or in general?

Also, if we ignore qemu64 and pick a current X86CPU such as Haswell, can
you make a list of features that are missing in our model and in KVM, if
any, and might be enabled in future Haswell / Haswell+X models? What
delta are we talking about exactly?

Are we at the same point of stability guarantee for ppc POWER?
For s390x, arm and aarch64 I guess not yet?

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 14:38                       ` Paolo Bonzini
  2015-06-24 14:54                         ` Peter Maydell
@ 2015-06-24 15:58                         ` Eduardo Habkost
  2015-06-24 16:00                           ` Paolo Bonzini
  1 sibling, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-24 15:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Jiri Denemark, Andreas Färber,
	rth

On Wed, Jun 24, 2015 at 04:38:51PM +0200, Paolo Bonzini wrote:
[...]
> 1) libvirt should _not_ change the flags that the user passes via XML
> just because it thinks that QEMU has those flags.  This makes it
> possible for libvirt to keep cpu_map.xml up-to-date without worrying
> about versioning.

That's true, and it would solve cases like people asking libvirt for
"Haswell without rdrand" and getting QEMU running "-cpu Haswell" (which
has rdrand enabled) because libvirt thinks Haswell doesn't have rdrand.

> 
> 2) libvirt should not add/remove flags when the user specifies
> host-model (i.e. -cpu SandyBridge, not -cpu
> SandyBridge,+f16c,+rdrand,+erms,+whatever).  host-model has had so many
> bugs reported for it that I hope this could be done unconditionally even
> if it is not backwards-compatible.  Or perhaps introduce a new name and
> deprecate host-model.  I don't know.

In other words, libvirt should not assume anything about the QEMU-side
CPU models. If the user asks for "Broadwell", it should use "-cpu
Broadwell", if the user asks for "Broadwell + fpu", it should use "-cpu
Broadwell,+fpu" even if it believes every CPU model since 486 has FPU
enabled.
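
A minimal sketch of that behaviour, assuming the user's request has already
been parsed into a model name plus a list of (feature, enabled) pairs. The
helper name is hypothetical and this is not libvirt code:

```python
def build_cpu_arg(model, features=()):
    """Build the value passed to -cpu from exactly what the user asked for.

    No feature is dropped because the caller believes the model already
    includes (or lacks) it; every requested flag is passed through.
    """
    parts = [model]
    for name, enabled in features:
        parts.append(("+" if enabled else "-") + name)
    return ",".join(parts)


print(build_cpu_arg("Broadwell"))
print(build_cpu_arg("Broadwell", [("fpu", True)]))
print(build_cpu_arg("Haswell", [("rdrand", False)]))
```

The point is that the function never consults a local table of what each
model contains, so it cannot disagree with QEMU about the model's contents.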

The reason for that is that we may still introduce CPU model changes, as
long as they are not going to mean new host-side dependencies (see
commit 78a611f1936b3eac8ed78a2be2146a742a85212c for an example, where we
added f16c and rdrand to Haswell and Broadwell).

> 
> 3) regarding "enforce", there are indeed some cases where it would break:
> 
> - Haswell/Broadwell CPU model after TSX removal
> 
> - qemu64 with KVM
> 
> - pretty much everything including qemu64 with TCG

- Everything involving KVM CPUID features, which change depending on the
  kernel version.

- Intel CPU models that had VMX enabled in older machine-types,
  and may or may not have VMX enabled depending on kernel VMX nesting
  configuration.

> 
> So libvirt here could allow _now_ the user to specify enforce, either
> via XML or via qemu.conf (or via XML + a default specified via qemu.conf).

More precisely, libvirt would emulate "enforce" mode by checking the
"filtered-features", because "enforce" error messages are not
machine-friendly.

-- 
Eduardo


* Re: [Qemu-devel] Not introducing new host-side requirements on new machine-type versions (was Re: [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models)
  2015-06-24 15:44                             ` [Qemu-devel] Not introducing new host-side requirements on new machine-type versions (was Re: [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models) Eduardo Habkost
  2015-06-24 15:58                               ` Andreas Färber
@ 2015-06-24 15:59                               ` Paolo Bonzini
  1 sibling, 0 replies; 81+ messages in thread
From: Paolo Bonzini @ 2015-06-24 15:59 UTC (permalink / raw)
  To: Eduardo Habkost, Michael S. Tsirkin
  Cc: mimu, qemu-devel, Alexander Graf, borntraeger, Igor Mammedov,
	Jiri Denemark, Andreas Färber, rth



On 24/06/2015 17:44, Eduardo Habkost wrote:
> As long as QEMU guarantees that, so we don't change existing CPU models
> (in new machines) in a way that introduces new host-side dependencies,
> we will be OK.
> 
> We may need something new to implement this guarantee for KVM features,
> though. libvirt will need something that says "please don't enable any
> KVM CPUID bits silently for me, let me ask for them explicitly". But
> that won't be as drastic as requiring "-cpu custom".
> 
> That has some consequences for the way we add new CPU models and
> implement CPU model changes. For example: until we know all the features
> we want in a CPU model are already available and supported in the latest
> kernel, we won't add a new CPU model. The choice of features in CPU
> models

... that require host-side support.

> should be "final" as soon as we add the CPU model, so CPU model
> changes should never introduce new host-side requirements.

Just like we are doing for ARAT, we can always hack around the
limitation in kvm_arch_get_supported_cpuid, because new features that
require kernel support are overall rare.

It happened for TSC_DEADLINE/RDTSCP/XSAVE (Sandy Bridge), and it will
happen for MPX/XSAVES (Skylake), but that was pretty much it in the last
few years.

Paolo

> If a CPU
> model change requires some additional KVM code or newer host CPU, we
> need to add a new CPU model name. We must agree on that and document it,
> because I expect to see some complaints in the future when enforcing
> this rule.


* Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
  2015-06-24 15:58                         ` Eduardo Habkost
@ 2015-06-24 16:00                           ` Paolo Bonzini
  0 siblings, 0 replies; 81+ messages in thread
From: Paolo Bonzini @ 2015-06-24 16:00 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Jiri Denemark, Andreas Färber,
	rth



On 24/06/2015 17:58, Eduardo Habkost wrote:
> > 3) regarding "enforce", there are indeed some cases where it would break:
> > 
> > - Haswell/Broadwell CPU model after TSX removal
> > 
> > - qemu64 with KVM
> > 
> > - pretty much everything including qemu64 with TCG
> 
> - Everything involving KVM CPUID features, which change depending on the
>   kernel version.
> 
> - Intel CPU models that had VMX enabled in older machine-types,
>   and may or may not have VMX enabled depending on kernel VMX nesting
>   configuration.
> 
> > So libvirt here could allow _now_ the user to specify enforce, either
> > via XML or via qemu.conf (or via XML + a default specified via qemu.conf).
> 
> More precisely, libvirt would emulate "enforce" mode by checking the
> "filtered-features", because "enforce" error messages are not
> machine-friendly.

Even better, because it would let libvirt special-case stuff like TSX,
VMX and SVM.
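
A rough sketch of what emulating "enforce" on the management side could
look like, assuming the filtered features have already been read from QEMU
and flattened into plain feature names. The real format of the
"filtered-features" data is not modelled here, and the function name and
the ignorable set (hle and rtm standing in for TSX) are assumptions:

```python
# Assumption: features a management layer might choose to tolerate losing.
IGNORABLE = frozenset({"hle", "rtm", "vmx", "svm"})


def features_failing_enforce(filtered, ignorable=IGNORABLE):
    """Return the filtered features that should prevent the VM from starting.

    'filtered' holds the names of features that were requested but could
    not be provided by the host. Names in 'ignorable' are special-cased
    and never cause a failure.
    """
    return sorted(set(filtered) - set(ignorable))


for filtered in (["hle", "rtm"], ["rtm", "avx2"]):
    failing = features_failing_enforce(filtered)
    if failing:
        print("refuse to start, missing:", ", ".join(failing))
    else:
        print("ok to start")
```

Working from structured data like this, rather than from error messages,
is what makes the per-feature special cases possible.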

Paolo


* Re: [Qemu-devel] Not introducing new host-side requirements on new machine-type versions (was Re: [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models)
  2015-06-24 15:58                               ` Andreas Färber
@ 2015-06-24 16:08                                 ` Eduardo Habkost
  2015-06-24 16:15                                   ` Andreas Färber
  0 siblings, 1 reply; 81+ messages in thread
From: Eduardo Habkost @ 2015-06-24 16:08 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

On Wed, Jun 24, 2015 at 05:58:18PM +0200, Andreas Färber wrote:
> Am 24.06.2015 um 17:44 schrieb Eduardo Habkost:
> > In another message, Paolo wrote:
> >> libvirt should trust that QEMU developers will not prevent a VM from
> >> running on a previously viable host, just because you change the machine
> >> type.
> > 
> > I assumed that was never going to be true.
> > 
> > As long as QEMU guarantees that, so we don't change existing CPU models
> > (in new machines) in a way that introduces new host-side dependencies,
> > we will be OK.
> > 
> > We may need something new to implement this guarantee for KVM features,
> > though. libvirt will need something that says "please don't enable any
> > KVM CPUID bits silently for me, let me ask for them explicitly". But
> > that won't be as drastic as requiring "-cpu custom".
> 
> That's what I suggested a global property for yesterday. Not sure how it
> would be implemented though.

A "default-kvm-features" property that enables/disables loading of
kvm_default_features would be enough, probably.

> 
> > That has some consequences for the way we add new CPU models and
> > implement CPU model changes. For example: until we know all the features
> > we want in a CPU model are already available and supported in the latest
> > kernel, we won't add a new CPU model. The choice of features in CPU
> > models should be "final" as soon as we add the CPU model, so CPU model
> > changes should never introduce new host-side requirements. If a CPU
> > model change requires some additional KVM code or newer host CPU, we
> > need to add a new CPU model name. We must agree on that and document it,
> > because I expect to see some complaints in the future when enforcing
> > this rule.
> > 
> >>
> >> It's as if someone wrote a wrapper around all kernel system calls
> >> on the assumption that kernel can not guarantee kernel/userspace ABI
> >> will never change. It can and it does.
> >>
> >> So let's promise not to break things, and avoid a ton of copy and paste bugs.
> > 
> > My assumption was that this (introducing type-(2) machine changes) was
> > never considered "breaking things" and just a fact of life.
> > 
> > If we guarantee that we will never prevent a VM from running on a
> > previously viable host, just because you change the machine type, we
> > will be OK.
> 
> Could you clarify whether that is for KVM only or in general?
> 
> Also, if we ignore qemu64 and pick a current X86CPU such as Haswell, can
> you make a list of features that are missing in our model and in KVM, if
> any, and might be enabled in future Haswell / Haswell+X models? What
> delta are we talking about exactly?

We can make a list by comparing CPUID data from the guest/QEMU with the
real host CPUID data. But I believe we don't have any features we expect
to enable in the future in existing CPU models that would require new
KVM-side code. We just need to be aware of this in the (unlikely?) case
it happens.
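
A small sketch of such a comparison, assuming the CPUID data from both
sides has already been collected into dictionaries keyed by
(leaf, subleaf, register) with 32-bit register values. The collection step
is not shown, and the values below are made up for illustration:

```python
def bits_missing_in_guest(host, guest):
    """Return, per CPUID register, the bit positions set on the host but
    clear in the guest.

    Both arguments map (leaf, subleaf, register) to a 32-bit integer.
    Registers with no difference are left out of the result.
    """
    result = {}
    for key, host_bits in host.items():
        delta = host_bits & ~guest.get(key, 0)
        if delta:
            result[key] = [bit for bit in range(32) if (delta >> bit) & 1]
    return result


# Made-up register values, not real CPUID dumps.
host = {(7, 0, "EBX"): 0b101001, (1, 0, "ECX"): 0b11}
guest = {(7, 0, "EBX"): 0b001001, (1, 0, "ECX"): 0b11}
print(bits_missing_in_guest(host, guest))
```

Mapping the resulting bit positions back to feature names would need a
CPUID feature table, which is left out here.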

Note that the recent ARAT patches are OK because they don't depend on
new KVM-side GET_SUPPORTED_CPUID changes to work.

> 
> Are we at the same point of stability guarantee for ppc POWER?
> For s390x, arm and aarch64 I guess not yet?

That's a good question, and I have no idea how difficult it would be to
implement that guarantee in those machines.

In s390x we are implementing machine-type-dependent "runnable" info on
query-cpu-definitions, but I don't know what the use cases look like, and
what the consequences are for libvirt and OpenStack.

-- 
Eduardo


* Re: [Qemu-devel] Not introducing new host-side requirements on new machine-type versions (was Re: [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models)
  2015-06-24 16:08                                 ` Eduardo Habkost
@ 2015-06-24 16:15                                   ` Andreas Färber
  0 siblings, 0 replies; 81+ messages in thread
From: Andreas Färber @ 2015-06-24 16:15 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mimu, Michael S. Tsirkin, qemu-devel, Alexander Graf,
	borntraeger, Igor Mammedov, Paolo Bonzini, Jiri Denemark, rth

Am 24.06.2015 um 18:08 schrieb Eduardo Habkost:
> On Wed, Jun 24, 2015 at 05:58:18PM +0200, Andreas Färber wrote:
>> Are we at the same point of stability guarantee for ppc POWER?
>> For s390x, arm and aarch64 I guess not yet?
> 
> That's a good question, and I have no idea how difficult it would be to
> implement that guarantee in those machines.
> 
> In s390x we are implementing machine-type-dependent "runnable" info on
> query-cpu-definitions, but I don't know what the use cases look like, and
> what the consequences are for libvirt and OpenStack.

I assume this discussion is going to be summarized on the Wiki at some
point. Just be sure to explicitly mention x86 then, to be safe.

Cheers,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)
