All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/6] Support for new CPU model SapphireRapids
@ 2023-01-06  8:38 Lei Wang
  2023-01-06  8:38 ` [PATCH v3 1/6] i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E Lei Wang
                   ` (6 more replies)
  0 siblings, 7 replies; 25+ messages in thread
From: Lei Wang @ 2023-01-06  8:38 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, imammedo, dgilbert, berrange, xiaoyao.li, yang.zhong,
	lei4.wang

This series aims to add a new CPU model SapphireRapids, and tries to
address the problem stated in
https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#mcf67dbd1ad37c65d7988c36a2b267be9afd2fb30,
so that named CPU model can define its own AMX values, and QEMU won't
pass the wrong AMX values to KVM in future platforms if they have
different values supported.

The original patch is
https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#u.

---

Changelog:

v3:
 - Rebase on the latest QEMU (d1852caab131ea898134fdcea8c14bc2ee75fbe9).
 - v2: https://lore.kernel.org/qemu-devel/20221102085256.81139-1-lei4.wang@intel.com/

v2:
 - Fix when passing all zeros of AMX-related CPUID, QEMU will warn
   unsupported.
 - Remove unnecessary function definition and make code cleaner.
 - Fix some typos.
 - v1: https://lore.kernel.org/qemu-devel/20221027020036.373140-1-lei4.wang@intel.com/T/#t


Lei Wang (6):
  i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E
  i386: Remove unused parameter "uint32_t bit" in
    feature_word_description()
  i386: Introduce new struct "MultiBitFeatureInfo" for multi-bit
    features
  i386: Mask and report unavailable multi-bit feature values
  i386: Initialize AMX CPUID leaves with corresponding env->features[]
    leaves
  i386: Add new CPU model SapphireRapids

 target/i386/cpu-internal.h |  11 ++
 target/i386/cpu.c          | 311 +++++++++++++++++++++++++++++++++++--
 target/i386/cpu.h          |  16 ++
 3 files changed, 322 insertions(+), 16 deletions(-)


base-commit: d1852caab131ea898134fdcea8c14bc2ee75fbe9
-- 
2.34.1



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v3 1/6] i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E
  2023-01-06  8:38 [PATCH v3 0/6] Support for new CPU model SapphireRapids Lei Wang
@ 2023-01-06  8:38 ` Lei Wang
  2023-02-06  7:45   ` Yuan Yao
  2023-01-06  8:38 ` [PATCH v3 2/6] i386: Remove unused parameter "uint32_t bit" in feature_word_description() Lei Wang
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Lei Wang @ 2023-01-06  8:38 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, imammedo, dgilbert, berrange, xiaoyao.li, yang.zhong,
	lei4.wang

CPUID leaf 0x1D and 0x1E enumerate tile and TMUL information for AMX.

Introduce FeatureWord FEAT_1D_1_EAX, FEAT_1D_1_EBX, FEAT_1D_1_ECX and
FEAT_1E_0_EBX. Thus these features of AMX can be expanded when
"-cpu host/max" and can be configured in named CPU model.

Signed-off-by: Lei Wang <lei4.wang@intel.com>
---
 target/i386/cpu.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++
 target/i386/cpu.h | 12 +++++++++++
 2 files changed, 67 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3410e5e470..b6d1247e5e 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1002,6 +1002,45 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
         },
         .tcg_features = ~0U,
     },
+    [FEAT_1D_1_EAX] = {
+        .type = CPUID_FEATURE_WORD,
+        .cpuid = {
+            .eax = 0x1D,
+            .needs_ecx = true, .ecx = 1,
+            .reg = R_EAX,
+        },
+        .migratable_flags = CPUID_AMX_PALETTE_1_TOTAL_TILE_BYTES_MASK |
+            CPUID_AMX_PALETTE_1_BYTES_PER_TILE_MASK,
+    },
+    [FEAT_1D_1_EBX] = {
+        .type = CPUID_FEATURE_WORD,
+        .cpuid = {
+            .eax = 0x1D,
+            .needs_ecx = true, .ecx = 1,
+            .reg = R_EBX,
+        },
+        .migratable_flags = CPUID_AMX_PALETTE_1_BYTES_PER_ROW_MASK |
+            CPUID_AMX_PALETTE_1_MAX_NAMES_MASK,
+    },
+    [FEAT_1D_1_ECX] = {
+        .type = CPUID_FEATURE_WORD,
+        .cpuid = {
+            .eax = 0x1D,
+            .needs_ecx = true, .ecx = 1,
+            .reg = R_ECX,
+        },
+        .migratable_flags = CPUID_AMX_PALETTE_1_MAX_ROWS_MASK,
+    },
+    [FEAT_1E_0_EBX] = {
+        .type = CPUID_FEATURE_WORD,
+        .cpuid = {
+            .eax = 0x1E,
+            .needs_ecx = true, .ecx = 0,
+            .reg = R_EBX,
+        },
+        .migratable_flags = CPUID_AMX_TMUL_MAX_K_MASK |
+            CPUID_AMX_TMUL_MAX_N_MASK,
+    },
     /*Below are MSR exposed features*/
     [FEAT_ARCH_CAPABILITIES] = {
         .type = MSR_FEATURE_WORD,
@@ -1371,6 +1410,22 @@ static FeatureDep feature_dependencies[] = {
         .from = { FEAT_7_0_EBX,             CPUID_7_0_EBX_INTEL_PT },
         .to = { FEAT_14_0_ECX,              ~0ull },
     },
+    {
+        .from = { FEAT_7_0_EDX,             CPUID_7_0_EDX_AMX_TILE },
+        .to = { FEAT_1D_1_EAX,              ~0ull },
+    },
+    {
+        .from = { FEAT_7_0_EDX,             CPUID_7_0_EDX_AMX_TILE },
+        .to = { FEAT_1D_1_EBX,              ~0ull },
+    },
+    {
+        .from = { FEAT_7_0_EDX,             CPUID_7_0_EDX_AMX_TILE },
+        .to = { FEAT_1D_1_ECX,              ~0ull },
+    },
+    {
+        .from = { FEAT_7_0_EDX,             CPUID_7_0_EDX_AMX_TILE },
+        .to = { FEAT_1E_0_EBX,              ~0ull },
+    },
     {
         .from = { FEAT_8000_0001_EDX,       CPUID_EXT2_RDTSCP },
         .to = { FEAT_VMX_SECONDARY_CTLS,    VMX_SECONDARY_EXEC_RDTSCP },
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d4bc19577a..3e3e0cfe59 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -585,6 +585,14 @@ typedef enum X86Seg {
                                  XSTATE_Hi16_ZMM_MASK | XSTATE_PKRU_MASK | \
                                  XSTATE_XTILE_CFG_MASK | XSTATE_XTILE_DATA_MASK)
 
+#define CPUID_AMX_PALETTE_1_TOTAL_TILE_BYTES_MASK 0xffffU
+#define CPUID_AMX_PALETTE_1_BYTES_PER_TILE_MASK   (0xffffU << 16)
+#define CPUID_AMX_PALETTE_1_BYTES_PER_ROW_MASK    0xffffU
+#define CPUID_AMX_PALETTE_1_MAX_NAMES_MASK        (0xffffU << 16)
+#define CPUID_AMX_PALETTE_1_MAX_ROWS_MASK         0xffffU
+#define CPUID_AMX_TMUL_MAX_K_MASK                 0xffU
+#define CPUID_AMX_TMUL_MAX_N_MASK                 (0xffffU << 8)
+
 /* CPUID feature words */
 typedef enum FeatureWord {
     FEAT_1_EDX,         /* CPUID[1].EDX */
@@ -605,6 +613,10 @@ typedef enum FeatureWord {
     FEAT_6_EAX,         /* CPUID[6].EAX */
     FEAT_XSAVE_XCR0_LO, /* CPUID[EAX=0xd,ECX=0].EAX */
     FEAT_XSAVE_XCR0_HI, /* CPUID[EAX=0xd,ECX=0].EDX */
+    FEAT_1D_1_EAX,      /* CPUID[EAX=0x1d,ECX=1].EAX */
+    FEAT_1D_1_EBX,      /* CPUID[EAX=0x1d,ECX=1].EBX */
+    FEAT_1D_1_ECX,      /* CPUID[EAX=0x1d,ECX=1].ECX */
+    FEAT_1E_0_EBX,      /* CPUID[EAX=0x1e,ECX=0].EBX */
     FEAT_ARCH_CAPABILITIES,
     FEAT_CORE_CAPABILITY,
     FEAT_PERF_CAPABILITIES,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 2/6] i386: Remove unused parameter "uint32_t bit" in feature_word_description()
  2023-01-06  8:38 [PATCH v3 0/6] Support for new CPU model SapphireRapids Lei Wang
  2023-01-06  8:38 ` [PATCH v3 1/6] i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E Lei Wang
@ 2023-01-06  8:38 ` Lei Wang
  2023-01-27 13:29   ` Igor Mammedov
  2023-02-08 14:33   ` Xiaoyao Li
  2023-01-06  8:38 ` [PATCH v3 3/6] i386: Introduce new struct "MultiBitFeatureInfo" for multi-bit features Lei Wang
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: Lei Wang @ 2023-01-06  8:38 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, imammedo, dgilbert, berrange, xiaoyao.li, yang.zhong,
	lei4.wang

Parameter "uint32_t bit" is not used in function feature_word_description(),
so remove it.

Signed-off-by: Lei Wang <lei4.wang@intel.com>
---
 target/i386/cpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b6d1247e5e..883098bc5a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4290,7 +4290,7 @@ static const TypeInfo max_x86_cpu_type_info = {
     .class_init = max_x86_cpu_class_init,
 };
 
-static char *feature_word_description(FeatureWordInfo *f, uint32_t bit)
+static char *feature_word_description(FeatureWordInfo *f)
 {
     assert(f->type == CPUID_FEATURE_WORD || f->type == MSR_FEATURE_WORD);
 
@@ -4329,6 +4329,7 @@ static void mark_unavailable_features(X86CPU *cpu, FeatureWord w, uint64_t mask,
     CPUX86State *env = &cpu->env;
     FeatureWordInfo *f = &feature_word_info[w];
     int i;
+    g_autofree char *feat_word_str = feature_word_description(f);
 
     if (!cpu->force_features) {
         env->features[w] &= ~mask;
@@ -4341,7 +4342,6 @@ static void mark_unavailable_features(X86CPU *cpu, FeatureWord w, uint64_t mask,
 
     for (i = 0; i < 64; ++i) {
         if ((1ULL << i) & mask) {
-            g_autofree char *feat_word_str = feature_word_description(f, i);
             warn_report("%s: %s%s%s [bit %d]",
                         verbose_prefix,
                         feat_word_str,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 3/6] i386: Introduce new struct "MultiBitFeatureInfo" for multi-bit features
  2023-01-06  8:38 [PATCH v3 0/6] Support for new CPU model SapphireRapids Lei Wang
  2023-01-06  8:38 ` [PATCH v3 1/6] i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E Lei Wang
  2023-01-06  8:38 ` [PATCH v3 2/6] i386: Remove unused parameter "uint32_t bit" in feature_word_description() Lei Wang
@ 2023-01-06  8:38 ` Lei Wang
  2023-02-08 14:39   ` Xiaoyao Li
  2023-01-06  8:38 ` [PATCH v3 4/6] i386: Mask and report unavailable multi-bit feature values Lei Wang
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Lei Wang @ 2023-01-06  8:38 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, imammedo, dgilbert, berrange, xiaoyao.li, yang.zhong,
	lei4.wang

Some features use multiple CPUID bits to form a value to be used, e.g.,
CPUID(0x1E,0):EBX[23:08] is regarded as the tmul_maxn value for AMX.
Introduce a new struct "MultiBitFeatureInfo" to hold the information for
those features and create a corresponding member in struct FeatureWordInfo,
so that the infomation can be assigned for each item in feature_word_info
array and used in the future.

Signed-off-by: Lei Wang <lei4.wang@intel.com>
---
 target/i386/cpu-internal.h |  9 +++++++
 target/i386/cpu.c          | 54 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+)

diff --git a/target/i386/cpu-internal.h b/target/i386/cpu-internal.h
index 9baac5c0b4..66b3d66cb4 100644
--- a/target/i386/cpu-internal.h
+++ b/target/i386/cpu-internal.h
@@ -25,6 +25,13 @@ typedef enum FeatureWordType {
    MSR_FEATURE_WORD,
 } FeatureWordType;
 
+typedef struct MultiBitFeatureInfo {
+    const char *feat_name;
+    uint64_t mask;
+    unsigned high_bit_position;
+    unsigned low_bit_position;
+} MultiBitFeatureInfo;
+
 typedef struct FeatureWordInfo {
     FeatureWordType type;
     /* feature flags names are taken from "Intel Processor Identification and
@@ -51,6 +58,8 @@ typedef struct FeatureWordInfo {
     uint64_t migratable_flags; /* Feature flags known to be migratable */
     /* Features that shouldn't be auto-enabled by "-cpu host" */
     uint64_t no_autoenable_flags;
+    unsigned num_multi_bit_features;
+    MultiBitFeatureInfo *multi_bit_features;
 } FeatureWordInfo;
 
 extern FeatureWordInfo feature_word_info[];
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 883098bc5a..88aa780566 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1011,6 +1011,21 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
         },
         .migratable_flags = CPUID_AMX_PALETTE_1_TOTAL_TILE_BYTES_MASK |
             CPUID_AMX_PALETTE_1_BYTES_PER_TILE_MASK,
+        .num_multi_bit_features = 2,
+        .multi_bit_features = (MultiBitFeatureInfo[]){
+            {
+                .feat_name = "total_tile_bytes",
+                .mask = CPUID_AMX_PALETTE_1_TOTAL_TILE_BYTES_MASK,
+                .high_bit_position = 15,
+                .low_bit_position = 0,
+            },
+            {
+                .feat_name = "bytes_per_tile",
+                .mask = CPUID_AMX_PALETTE_1_BYTES_PER_TILE_MASK,
+                .high_bit_position = 31,
+                .low_bit_position = 16,
+            },
+        },
     },
     [FEAT_1D_1_EBX] = {
         .type = CPUID_FEATURE_WORD,
@@ -1021,6 +1036,21 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
         },
         .migratable_flags = CPUID_AMX_PALETTE_1_BYTES_PER_ROW_MASK |
             CPUID_AMX_PALETTE_1_MAX_NAMES_MASK,
+        .num_multi_bit_features = 2,
+        .multi_bit_features = (MultiBitFeatureInfo[]){
+            {
+                .feat_name = "bytes_per_row",
+                .mask = CPUID_AMX_PALETTE_1_BYTES_PER_ROW_MASK,
+                .high_bit_position = 15,
+                .low_bit_position = 0,
+            },
+            {
+                .feat_name = "max_names",
+                .mask = CPUID_AMX_PALETTE_1_MAX_NAMES_MASK,
+                .high_bit_position = 31,
+                .low_bit_position = 16,
+            },
+        },
     },
     [FEAT_1D_1_ECX] = {
         .type = CPUID_FEATURE_WORD,
@@ -1030,6 +1060,15 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             .reg = R_ECX,
         },
         .migratable_flags = CPUID_AMX_PALETTE_1_MAX_ROWS_MASK,
+        .num_multi_bit_features = 1,
+        .multi_bit_features = (MultiBitFeatureInfo[]){
+            {
+                .feat_name = "max_rows",
+                .mask = CPUID_AMX_PALETTE_1_MAX_ROWS_MASK,
+                .high_bit_position = 15,
+                .low_bit_position = 0,
+            },
+        },
     },
     [FEAT_1E_0_EBX] = {
         .type = CPUID_FEATURE_WORD,
@@ -1040,6 +1079,21 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
         },
         .migratable_flags = CPUID_AMX_TMUL_MAX_K_MASK |
             CPUID_AMX_TMUL_MAX_N_MASK,
+        .num_multi_bit_features = 2,
+        .multi_bit_features = (MultiBitFeatureInfo[]){
+            {
+                .feat_name = "tmul_maxk",
+                .mask = CPUID_AMX_TMUL_MAX_K_MASK,
+                .high_bit_position = 7,
+                .low_bit_position = 0,
+            },
+            {
+                .feat_name = "tmul_maxn",
+                .mask = CPUID_AMX_TMUL_MAX_N_MASK,
+                .high_bit_position = 23,
+                .low_bit_position = 8,
+            },
+        },
     },
     /*Below are MSR exposed features*/
     [FEAT_ARCH_CAPABILITIES] = {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 4/6] i386: Mask and report unavailable multi-bit feature values
  2023-01-06  8:38 [PATCH v3 0/6] Support for new CPU model SapphireRapids Lei Wang
                   ` (2 preceding siblings ...)
  2023-01-06  8:38 ` [PATCH v3 3/6] i386: Introduce new struct "MultiBitFeatureInfo" for multi-bit features Lei Wang
@ 2023-01-06  8:38 ` Lei Wang
  2023-02-06  7:43   ` Yuan Yao
  2023-01-06  8:38 ` [PATCH v3 5/6] i386: Initialize AMX CPUID leaves with corresponding env->features[] leaves Lei Wang
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Lei Wang @ 2023-01-06  8:38 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, imammedo, dgilbert, berrange, xiaoyao.li, yang.zhong,
	lei4.wang

Some feature words, e.g., feature words in AMX-related CPUID leaf 0x1D and
0x1E are not bit-wise but multiple bits represents one value. Handle this
situation when the values specified are not the same as which are reported
by KVM. The handling includes:

 - The responsibility of masking bits and giving warnings are delegated to
   the feature enabler. A framework is also provided to enable this.
 - To simplify the initialization, a default function is provided if the
   the function is not specified.

The reason why delegating this responsibility rather than just marking
them as zeros when they are not same is because different multi-bit
features may have different logic, which is case by case, for example:

 1. CPUID.0x14_0x1:EBX[15:0]. Even though it's multi-bits field, it's a
    bitmap and each bit represents a separate capability.

 2. CPUID.0x14_0x1:EAX[2:0] represents the number of configurable Address
    Ranges. 3 bits as a whole to represent a integer value. It means the
    maximum capability of HW. If KVM reports M, then M to 0 is legal
    value to configure (because KVM can emulate each value correctly).

 3. CPUID.0x1D_0x1:EAX[31:16] represents palette 1 bytes_per_tile. 16 bits
    as a whole represent an integer value. It's not like case 2 and SW
    needs to configure the same value as reported. Because it's not
    possible for SW to configure to a different value and KVM cannot
    emulate it.

So marking them blindly as zeros is incorrect, and delegating this
responsibility can let each multi-bit feature have its own way to mask bits.

Signed-off-by: Lei Wang <lei4.wang@intel.com>
---
 target/i386/cpu-internal.h |  2 ++
 target/i386/cpu.c          | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/target/i386/cpu-internal.h b/target/i386/cpu-internal.h
index 66b3d66cb4..83c7b53926 100644
--- a/target/i386/cpu-internal.h
+++ b/target/i386/cpu-internal.h
@@ -30,6 +30,8 @@ typedef struct MultiBitFeatureInfo {
     uint64_t mask;
     unsigned high_bit_position;
     unsigned low_bit_position;
+    void (*mark_unavailable_multi_bit)(X86CPU *cpu, FeatureWord w, int index,
+                                       const char *verbose_prefix);
 } MultiBitFeatureInfo;
 
 typedef struct FeatureWordInfo {
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 88aa780566..e638a31d34 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4377,6 +4377,28 @@ static bool x86_cpu_have_filtered_features(X86CPU *cpu)
     return false;
 }
 
+static void mark_unavailable_multi_bit_default(X86CPU *cpu, FeatureWord w,
+                                               int index,
+                                               const char *verbose_prefix)
+{
+    FeatureWordInfo *f = &feature_word_info[w];
+    g_autofree char *feat_word_str = feature_word_description(f);
+    uint64_t host_feat = x86_cpu_get_supported_feature_word(w, false);
+    MultiBitFeatureInfo mf = f->multi_bit_features[index];
+
+    if ((cpu->env.features[w] & mf.mask) &&
+        ((cpu->env.features[w] ^ host_feat) & mf.mask)) {
+        if (!cpu->force_features) {
+            cpu->env.features[w] &= ~mf.mask;
+        }
+        cpu->filtered_features[w] |= mf.mask;
+        if (verbose_prefix)
+            warn_report("%s: %s.%s [%u:%u]", verbose_prefix, feat_word_str,
+                        mf.feat_name, mf.high_bit_position,
+                        mf.low_bit_position);
+    }
+}
+
 static void mark_unavailable_features(X86CPU *cpu, FeatureWord w, uint64_t mask,
                                       const char *verbose_prefix)
 {
@@ -6442,6 +6464,20 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
             x86_cpu_get_supported_feature_word(w, false);
         uint64_t requested_features = env->features[w];
         uint64_t unavailable_features = requested_features & ~host_feat;
+        FeatureWordInfo f = feature_word_info[w];
+        int i;
+
+        for (i = 0; i < f.num_multi_bit_features; i++) {
+            MultiBitFeatureInfo mf = f.multi_bit_features[i];
+            if (mf.mark_unavailable_multi_bit) {
+                mf.mark_unavailable_multi_bit(cpu, w, i, prefix);
+            } else {
+                mark_unavailable_multi_bit_default(cpu, w, i, prefix);
+            }
+
+            unavailable_features &= ~mf.mask;
+        }
+
         mark_unavailable_features(cpu, w, unavailable_features, prefix);
     }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 5/6] i386: Initialize AMX CPUID leaves with corresponding env->features[] leaves
  2023-01-06  8:38 [PATCH v3 0/6] Support for new CPU model SapphireRapids Lei Wang
                   ` (3 preceding siblings ...)
  2023-01-06  8:38 ` [PATCH v3 4/6] i386: Mask and report unavailable multi-bit feature values Lei Wang
@ 2023-01-06  8:38 ` Lei Wang
  2023-01-06  8:38 ` [PATCH v3 6/6] i386: Add new CPU model SapphireRapids Lei Wang
  2023-02-02 11:05 ` [PATCH v3 0/6] Support for " Igor Mammedov
  6 siblings, 0 replies; 25+ messages in thread
From: Lei Wang @ 2023-01-06  8:38 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, imammedo, dgilbert, berrange, xiaoyao.li, yang.zhong,
	lei4.wang

The AMX-related CPUID value, i.e., CPUID(0x1D,1):EAX, CPUID(0x1D,1):EBX,
CPUID(0x1D,1):ECX and CPUID(0x1E,0):EBX are hard-coded to Sapphire Rapids
without considering future platforms.

Replace these hard-coded values with env->features[], so QEMU can pass the
right value to KVM.

Signed-off-by: Lei Wang <lei4.wang@intel.com>
---
 target/i386/cpu.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index e638a31d34..946df29a3d 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -576,16 +576,16 @@ static CPUCacheInfo legacy_l3_cache = {
 #define INTEL_PT_PSB_BITMAP      (0x003f << 16) /* Support 2K,4K,8K,16K,32K,64K */
 
 /* CPUID Leaf 0x1D constants: */
-#define INTEL_AMX_TILE_MAX_SUBLEAF     0x1
-#define INTEL_AMX_TOTAL_TILE_BYTES     0x2000
-#define INTEL_AMX_BYTES_PER_TILE       0x400
-#define INTEL_AMX_BYTES_PER_ROW        0x40
-#define INTEL_AMX_TILE_MAX_NAMES       0x8
-#define INTEL_AMX_TILE_MAX_ROWS        0x10
+#define INTEL_SPR_AMX_TILE_MAX_SUBLEAF     0x1
+#define INTEL_SPR_AMX_TOTAL_TILE_BYTES     0x2000
+#define INTEL_SPR_AMX_BYTES_PER_TILE       0x400
+#define INTEL_SPR_AMX_BYTES_PER_ROW        0x40
+#define INTEL_SPR_AMX_TILE_MAX_NAMES       0x8
+#define INTEL_SPR_AMX_TILE_MAX_ROWS        0x10
 
 /* CPUID Leaf 0x1E constants: */
-#define INTEL_AMX_TMUL_MAX_K           0x10
-#define INTEL_AMX_TMUL_MAX_N           0x40
+#define INTEL_SPR_AMX_TMUL_MAX_K           0x10
+#define INTEL_SPR_AMX_TMUL_MAX_N           0x40
 
 void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
                               uint32_t vendor2, uint32_t vendor3)
@@ -5764,12 +5764,11 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 
         if (count == 0) {
             /* Highest numbered palette subleaf */
-            *eax = INTEL_AMX_TILE_MAX_SUBLEAF;
+            *eax = INTEL_SPR_AMX_TILE_MAX_SUBLEAF;
         } else if (count == 1) {
-            *eax = INTEL_AMX_TOTAL_TILE_BYTES |
-                   (INTEL_AMX_BYTES_PER_TILE << 16);
-            *ebx = INTEL_AMX_BYTES_PER_ROW | (INTEL_AMX_TILE_MAX_NAMES << 16);
-            *ecx = INTEL_AMX_TILE_MAX_ROWS;
+            *eax = env->features[FEAT_1D_1_EAX];
+            *ebx = env->features[FEAT_1D_1_EBX];
+            *ecx = env->features[FEAT_1D_1_ECX];
         }
         break;
     }
@@ -5785,7 +5784,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 
         if (count == 0) {
             /* Highest numbered palette subleaf */
-            *ebx = INTEL_AMX_TMUL_MAX_K | (INTEL_AMX_TMUL_MAX_N << 8);
+            *ebx = env->features[FEAT_1E_0_EBX];
         }
         break;
     }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 6/6] i386: Add new CPU model SapphireRapids
  2023-01-06  8:38 [PATCH v3 0/6] Support for new CPU model SapphireRapids Lei Wang
                   ` (4 preceding siblings ...)
  2023-01-06  8:38 ` [PATCH v3 5/6] i386: Initialize AMX CPUID leaves with corresponding env->features[] leaves Lei Wang
@ 2023-01-06  8:38 ` Lei Wang
  2023-02-02 10:40   ` Igor Mammedov
  2023-02-02 11:05 ` [PATCH v3 0/6] Support for " Igor Mammedov
  6 siblings, 1 reply; 25+ messages in thread
From: Lei Wang @ 2023-01-06  8:38 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, imammedo, dgilbert, berrange, xiaoyao.li, yang.zhong,
	lei4.wang

The new CPU model mostly inherits features from Icelake-Server, while
adding new features:
 - AMX (Advance Matrix eXtensions)
 - Bus Lock Debug Exception
and new instructions:
 - AVX VNNI (Vector Neural Network Instruction):
    - VPDPBUS: Multiply and Add Unsigned and Signed Bytes
    - VPDPBUSDS: Multiply and Add Unsigned and Signed Bytes with Saturation
    - VPDPWSSD: Multiply and Add Signed Word Integers
    - VPDPWSSDS: Multiply and Add Signed Integers with Saturation
 - FP16: Replicates existing AVX512 computational SP (FP32) instructions
   using FP16 instead of FP32 for ~2X performance gain
 - SERIALIZE: Provide software with a simple way to force the processor to
   complete all modifications, faster, allowed in all privilege levels and
   not causing an unconditional VM exit
 - TSX Suspend Load Address Tracking: Allows programmers to choose which
   memory accesses do not need to be tracked in the TSX read set
 - AVX512_BF16: Vector Neural Network Instructions supporting BFLOAT16
   inputs and conversion instructions from IEEE single precision

Features may be added in future versions:
 - CET (virtualization support hasn't been merged)
Instructions may be added in future versions:
 - fast zero-length MOVSB (KVM doesn't support yet)
 - fast short STOSB (KVM doesn't support yet)
 - fast short CMPSB, SCASB (KVM doesn't support yet)

Signed-off-by: Lei Wang <lei4.wang@intel.com>
Reviewed-by: Robert Hoo <robert.hu@linux.intel.com>
---
 target/i386/cpu.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++
 target/i386/cpu.h |   4 ++
 2 files changed, 139 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 946df29a3d..5e1ecd50b3 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3576,6 +3576,141 @@ static const X86CPUDefinition builtin_x86_defs[] = {
             { /* end of list */ }
         }
     },
+    {
+        .name = "SapphireRapids",
+        .level = 0x20,
+        .vendor = CPUID_VENDOR_INTEL,
+        .family = 6,
+        .model = 143,
+        .stepping = 4,
+        /*
+         * please keep the ascending order so that we can have a clear view of
+         * bit position of each feature.
+         */
+        .features[FEAT_1_EDX] =
+            CPUID_FP87 | CPUID_VME | CPUID_DE | CPUID_PSE | CPUID_TSC |
+            CPUID_MSR | CPUID_PAE | CPUID_MCE | CPUID_CX8 | CPUID_APIC |
+            CPUID_SEP | CPUID_MTRR | CPUID_PGE | CPUID_MCA | CPUID_CMOV |
+            CPUID_PAT | CPUID_PSE36 | CPUID_CLFLUSH | CPUID_MMX | CPUID_FXSR |
+            CPUID_SSE | CPUID_SSE2,
+        .features[FEAT_1_ECX] =
+            CPUID_EXT_SSE3 | CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSSE3 |
+            CPUID_EXT_FMA | CPUID_EXT_CX16 | CPUID_EXT_PCID | CPUID_EXT_SSE41 |
+            CPUID_EXT_SSE42 | CPUID_EXT_X2APIC | CPUID_EXT_MOVBE |
+            CPUID_EXT_POPCNT | CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_AES |
+            CPUID_EXT_XSAVE | CPUID_EXT_AVX | CPUID_EXT_F16C | CPUID_EXT_RDRAND,
+        .features[FEAT_8000_0001_EDX] =
+            CPUID_EXT2_SYSCALL | CPUID_EXT2_NX | CPUID_EXT2_PDPE1GB |
+            CPUID_EXT2_RDTSCP | CPUID_EXT2_LM,
+        .features[FEAT_8000_0001_ECX] =
+            CPUID_EXT3_LAHF_LM | CPUID_EXT3_ABM | CPUID_EXT3_3DNOWPREFETCH,
+        .features[FEAT_8000_0008_EBX] =
+            CPUID_8000_0008_EBX_WBNOINVD,
+        .features[FEAT_7_0_EBX] =
+            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_HLE |
+            CPUID_7_0_EBX_AVX2 | CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 |
+            CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_RTM |
+            CPUID_7_0_EBX_AVX512F | CPUID_7_0_EBX_AVX512DQ |
+            CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_SMAP |
+            CPUID_7_0_EBX_AVX512IFMA | CPUID_7_0_EBX_CLFLUSHOPT |
+            CPUID_7_0_EBX_CLWB | CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI |
+            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL,
+        .features[FEAT_7_0_ECX] =
+            CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
+            CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
+            CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
+            CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
+            CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 |
+            CPUID_7_0_ECX_RDPID | CPUID_7_0_ECX_BUS_LOCK_DETECT,
+        .features[FEAT_7_0_EDX] =
+            CPUID_7_0_EDX_FSRM | CPUID_7_0_EDX_SERIALIZE |
+            CPUID_7_0_EDX_TSX_LDTRK | CPUID_7_0_EDX_AMX_BF16 |
+            CPUID_7_0_EDX_AVX512_FP16 | CPUID_7_0_EDX_AMX_TILE |
+            CPUID_7_0_EDX_AMX_INT8 | CPUID_7_0_EDX_SPEC_CTRL |
+            CPUID_7_0_EDX_ARCH_CAPABILITIES | CPUID_7_0_EDX_SPEC_CTRL_SSBD,
+        .features[FEAT_ARCH_CAPABILITIES] =
+            MSR_ARCH_CAP_RDCL_NO | MSR_ARCH_CAP_IBRS_ALL |
+            MSR_ARCH_CAP_SKIP_L1DFL_VMENTRY | MSR_ARCH_CAP_MDS_NO |
+            MSR_ARCH_CAP_PSCHANGE_MC_NO | MSR_ARCH_CAP_TAA_NO,
+        .features[FEAT_XSAVE] =
+            CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
+            CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES | CPUID_D_1_EAX_XFD,
+        .features[FEAT_6_EAX] =
+            CPUID_6_EAX_ARAT,
+        .features[FEAT_7_1_EAX] =
+            CPUID_7_1_EAX_AVX_VNNI | CPUID_7_1_EAX_AVX512_BF16,
+        .features[FEAT_1D_1_EAX] = INTEL_SPR_AMX_TOTAL_TILE_BYTES |
+            (INTEL_SPR_AMX_BYTES_PER_TILE << 16),
+        .features[FEAT_1D_1_EBX] = INTEL_SPR_AMX_BYTES_PER_ROW |
+            (INTEL_SPR_AMX_TILE_MAX_NAMES << 16),
+        .features[FEAT_1D_1_ECX] = INTEL_SPR_AMX_TILE_MAX_ROWS,
+        .features[FEAT_1E_0_EBX] = INTEL_SPR_AMX_TMUL_MAX_K |
+            (INTEL_SPR_AMX_TMUL_MAX_N << 8),
+        .features[FEAT_VMX_BASIC] =
+            MSR_VMX_BASIC_INS_OUTS | MSR_VMX_BASIC_TRUE_CTLS,
+        .features[FEAT_VMX_ENTRY_CTLS] =
+            VMX_VM_ENTRY_LOAD_DEBUG_CONTROLS | VMX_VM_ENTRY_IA32E_MODE |
+            VMX_VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
+            VMX_VM_ENTRY_LOAD_IA32_PAT | VMX_VM_ENTRY_LOAD_IA32_EFER,
+        .features[FEAT_VMX_EPT_VPID_CAPS] =
+            MSR_VMX_EPT_EXECONLY |
+            MSR_VMX_EPT_PAGE_WALK_LENGTH_4 | MSR_VMX_EPT_PAGE_WALK_LENGTH_5 |
+            MSR_VMX_EPT_WB | MSR_VMX_EPT_2MB | MSR_VMX_EPT_1GB |
+            MSR_VMX_EPT_INVEPT | MSR_VMX_EPT_AD_BITS |
+            MSR_VMX_EPT_INVEPT_SINGLE_CONTEXT | MSR_VMX_EPT_INVEPT_ALL_CONTEXT |
+            MSR_VMX_EPT_INVVPID | MSR_VMX_EPT_INVVPID_SINGLE_ADDR |
+            MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT |
+            MSR_VMX_EPT_INVVPID_ALL_CONTEXT |
+            MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT_NOGLOBALS,
+        .features[FEAT_VMX_EXIT_CTLS] =
+            VMX_VM_EXIT_SAVE_DEBUG_CONTROLS |
+            VMX_VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
+            VMX_VM_EXIT_ACK_INTR_ON_EXIT | VMX_VM_EXIT_SAVE_IA32_PAT |
+            VMX_VM_EXIT_LOAD_IA32_PAT | VMX_VM_EXIT_SAVE_IA32_EFER |
+            VMX_VM_EXIT_LOAD_IA32_EFER | VMX_VM_EXIT_SAVE_VMX_PREEMPTION_TIMER,
+        .features[FEAT_VMX_MISC] =
+            MSR_VMX_MISC_STORE_LMA | MSR_VMX_MISC_ACTIVITY_HLT |
+            MSR_VMX_MISC_VMWRITE_VMEXIT,
+        .features[FEAT_VMX_PINBASED_CTLS] =
+            VMX_PIN_BASED_EXT_INTR_MASK | VMX_PIN_BASED_NMI_EXITING |
+            VMX_PIN_BASED_VIRTUAL_NMIS | VMX_PIN_BASED_VMX_PREEMPTION_TIMER |
+            VMX_PIN_BASED_POSTED_INTR,
+        .features[FEAT_VMX_PROCBASED_CTLS] =
+            VMX_CPU_BASED_VIRTUAL_INTR_PENDING |
+            VMX_CPU_BASED_USE_TSC_OFFSETING | VMX_CPU_BASED_HLT_EXITING |
+            VMX_CPU_BASED_INVLPG_EXITING | VMX_CPU_BASED_MWAIT_EXITING |
+            VMX_CPU_BASED_RDPMC_EXITING | VMX_CPU_BASED_RDTSC_EXITING |
+            VMX_CPU_BASED_CR3_LOAD_EXITING | VMX_CPU_BASED_CR3_STORE_EXITING |
+            VMX_CPU_BASED_CR8_LOAD_EXITING | VMX_CPU_BASED_CR8_STORE_EXITING |
+            VMX_CPU_BASED_TPR_SHADOW | VMX_CPU_BASED_VIRTUAL_NMI_PENDING |
+            VMX_CPU_BASED_MOV_DR_EXITING | VMX_CPU_BASED_UNCOND_IO_EXITING |
+            VMX_CPU_BASED_USE_IO_BITMAPS | VMX_CPU_BASED_MONITOR_TRAP_FLAG |
+            VMX_CPU_BASED_USE_MSR_BITMAPS | VMX_CPU_BASED_MONITOR_EXITING |
+            VMX_CPU_BASED_PAUSE_EXITING |
+            VMX_CPU_BASED_ACTIVATE_SECONDARY_CONTROLS,
+        .features[FEAT_VMX_SECONDARY_CTLS] =
+            VMX_SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
+            VMX_SECONDARY_EXEC_ENABLE_EPT | VMX_SECONDARY_EXEC_DESC |
+            VMX_SECONDARY_EXEC_RDTSCP |
+            VMX_SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
+            VMX_SECONDARY_EXEC_ENABLE_VPID | VMX_SECONDARY_EXEC_WBINVD_EXITING |
+            VMX_SECONDARY_EXEC_UNRESTRICTED_GUEST |
+            VMX_SECONDARY_EXEC_APIC_REGISTER_VIRT |
+            VMX_SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
+            VMX_SECONDARY_EXEC_RDRAND_EXITING |
+            VMX_SECONDARY_EXEC_ENABLE_INVPCID |
+            VMX_SECONDARY_EXEC_ENABLE_VMFUNC | VMX_SECONDARY_EXEC_SHADOW_VMCS |
+            VMX_SECONDARY_EXEC_RDSEED_EXITING | VMX_SECONDARY_EXEC_ENABLE_PML |
+            VMX_SECONDARY_EXEC_XSAVES,
+        .features[FEAT_VMX_VMFUNC] =
+            MSR_VMX_VMFUNC_EPT_SWITCHING,
+        .xlevel = 0x80000008,
+        .model_id = "Intel Xeon Processor (SapphireRapids)",
+        .versions = (X86CPUVersionDefinition[]) {
+            { .version = 1 },
+            { /* end of list */ },
+        },
+    },
     {
         .name = "Denverton",
         .level = 21,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 3e3e0cfe59..8361b9f3ff 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -893,10 +893,14 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_7_0_EDX_TSX_LDTRK         (1U << 16)
 /* Architectural LBRs */
 #define CPUID_7_0_EDX_ARCH_LBR          (1U << 19)
+/* AMX_BF16 instruction */
+#define CPUID_7_0_EDX_AMX_BF16          (1U << 22)
 /* AVX512_FP16 instruction */
 #define CPUID_7_0_EDX_AVX512_FP16       (1U << 23)
 /* AMX tile (two-dimensional register) */
 #define CPUID_7_0_EDX_AMX_TILE          (1U << 24)
+/* AMX_INT8 instruction */
+#define CPUID_7_0_EDX_AMX_INT8          (1U << 25)
 /* Speculation Control */
 #define CPUID_7_0_EDX_SPEC_CTRL         (1U << 26)
 /* Single Thread Indirect Branch Predictors */
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/6] i386: Remove unused parameter "uint32_t bit" in feature_word_description()
  2023-01-06  8:38 ` [PATCH v3 2/6] i386: Remove unused parameter "uint32_t bit" in feature_word_description() Lei Wang
@ 2023-01-27 13:29   ` Igor Mammedov
  2023-02-08 14:33   ` Xiaoyao Li
  1 sibling, 0 replies; 25+ messages in thread
From: Igor Mammedov @ 2023-01-27 13:29 UTC (permalink / raw)
  To: Lei Wang; +Cc: pbonzini, qemu-devel, dgilbert, berrange, xiaoyao.li, yang.zhong

On Fri,  6 Jan 2023 00:38:22 -0800
Lei Wang <lei4.wang@intel.com> wrote:

> Parameter "uint32_t bit" is not used in function feature_word_description(),
> so remove it.
> 
> Signed-off-by: Lei Wang <lei4.wang@intel.com>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  target/i386/cpu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b6d1247e5e..883098bc5a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -4290,7 +4290,7 @@ static const TypeInfo max_x86_cpu_type_info = {
>      .class_init = max_x86_cpu_class_init,
>  };
>  
> -static char *feature_word_description(FeatureWordInfo *f, uint32_t bit)
> +static char *feature_word_description(FeatureWordInfo *f)
>  {
>      assert(f->type == CPUID_FEATURE_WORD || f->type == MSR_FEATURE_WORD);
>  
> @@ -4329,6 +4329,7 @@ static void mark_unavailable_features(X86CPU *cpu, FeatureWord w, uint64_t mask,
>      CPUX86State *env = &cpu->env;
>      FeatureWordInfo *f = &feature_word_info[w];
>      int i;
> +    g_autofree char *feat_word_str = feature_word_description(f);
>  
>      if (!cpu->force_features) {
>          env->features[w] &= ~mask;
> @@ -4341,7 +4342,6 @@ static void mark_unavailable_features(X86CPU *cpu, FeatureWord w, uint64_t mask,
>  
>      for (i = 0; i < 64; ++i) {
>          if ((1ULL << i) & mask) {
> -            g_autofree char *feat_word_str = feature_word_description(f, i);
>              warn_report("%s: %s%s%s [bit %d]",
>                          verbose_prefix,
>                          feat_word_str,



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 6/6] i386: Add new CPU model SapphireRapids
  2023-01-06  8:38 ` [PATCH v3 6/6] i386: Add new CPU model SapphireRapids Lei Wang
@ 2023-02-02 10:40   ` Igor Mammedov
  2023-02-03  6:02     ` Wang, Lei
  0 siblings, 1 reply; 25+ messages in thread
From: Igor Mammedov @ 2023-02-02 10:40 UTC (permalink / raw)
  To: Lei Wang; +Cc: pbonzini, qemu-devel, dgilbert, berrange, xiaoyao.li, yang.zhong

On Fri,  6 Jan 2023 00:38:26 -0800
Lei Wang <lei4.wang@intel.com> wrote:

> The new CPU model mostly inherits features from Icelake-Server, while
> adding new features:
>  - AMX (Advance Matrix eXtensions)
>  - Bus Lock Debug Exception
> and new instructions:
>  - AVX VNNI (Vector Neural Network Instruction):
>     - VPDPBUS: Multiply and Add Unsigned and Signed Bytes
>     - VPDPBUSDS: Multiply and Add Unsigned and Signed Bytes with Saturation
>     - VPDPWSSD: Multiply and Add Signed Word Integers
>     - VPDPWSSDS: Multiply and Add Signed Integers with Saturation
>  - FP16: Replicates existing AVX512 computational SP (FP32) instructions
>    using FP16 instead of FP32 for ~2X performance gain
>  - SERIALIZE: Provide software with a simple way to force the processor to
>    complete all modifications, faster, allowed in all privilege levels and
>    not causing an unconditional VM exit
>  - TSX Suspend Load Address Tracking: Allows programmers to choose which
>    memory accesses do not need to be tracked in the TSX read set
>  - AVX512_BF16: Vector Neural Network Instructions supporting BFLOAT16
>    inputs and conversion instructions from IEEE single precision

you should mention all new features here, preferably in format:
 feature-flag:  expected CPUID feature bits , so reviewer would be able to find it in spec
 
also you mention that it inherits most of the features from Icelake cpu,
it would be nice to provide cpuid diff between real Icelake and new cpu
somewhere in cover letter to simplify comparison.

> 
> Features may be added in future versions:
>  - CET (virtualization support hasn't been merged)
> Instructions may be added in future versions:
>  - fast zero-length MOVSB (KVM doesn't support yet)
>  - fast short STOSB (KVM doesn't support yet)
>  - fast short CMPSB, SCASB (KVM doesn't support yet)
> 
> Signed-off-by: Lei Wang <lei4.wang@intel.com>
> Reviewed-by: Robert Hoo <robert.hu@linux.intel.com>
> ---
>  target/i386/cpu.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++
>  target/i386/cpu.h |   4 ++
>  2 files changed, 139 insertions(+)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 946df29a3d..5e1ecd50b3 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -3576,6 +3576,141 @@ static const X86CPUDefinition builtin_x86_defs[] = {
>              { /* end of list */ }
>          }
>      },
> +    {
> +        .name = "SapphireRapids",
> +        .level = 0x20,
> +        .vendor = CPUID_VENDOR_INTEL,
> +        .family = 6,
> +        .model = 143,
> +        .stepping = 4,
> +        /*
> +         * please keep the ascending order so that we can have a clear view of
> +         * bit position of each feature.
> +         */

unless you have a way to enforce this comment. I wouldn't expect that would
work in practice.

Also If you wish to bring more order here, you should start with a prerequisite
patch that would do the same to all existing CPU models.
That way one can easily see a difference between Icelake and new cpu model.

As it is in this patch (with all unnecessary reordering) it's hard for reviewer
to see differences.
I'd suggest to just copy Icelake definition and modify it to suit new cpu model
(without reordering all feature bits)  

> +        .features[FEAT_1_EDX] =
> +            CPUID_FP87 | CPUID_VME | CPUID_DE | CPUID_PSE | CPUID_TSC |
> +            CPUID_MSR | CPUID_PAE | CPUID_MCE | CPUID_CX8 | CPUID_APIC |
> +            CPUID_SEP | CPUID_MTRR | CPUID_PGE | CPUID_MCA | CPUID_CMOV |
> +            CPUID_PAT | CPUID_PSE36 | CPUID_CLFLUSH | CPUID_MMX | CPUID_FXSR |
> +            CPUID_SSE | CPUID_SSE2,
> +        .features[FEAT_1_ECX] =
> +            CPUID_EXT_SSE3 | CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSSE3 |
> +            CPUID_EXT_FMA | CPUID_EXT_CX16 | CPUID_EXT_PCID | CPUID_EXT_SSE41 |
> +            CPUID_EXT_SSE42 | CPUID_EXT_X2APIC | CPUID_EXT_MOVBE |
> +            CPUID_EXT_POPCNT | CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_AES |
> +            CPUID_EXT_XSAVE | CPUID_EXT_AVX | CPUID_EXT_F16C | CPUID_EXT_RDRAND,
> +        .features[FEAT_8000_0001_EDX] =
> +            CPUID_EXT2_SYSCALL | CPUID_EXT2_NX | CPUID_EXT2_PDPE1GB |
> +            CPUID_EXT2_RDTSCP | CPUID_EXT2_LM,
> +        .features[FEAT_8000_0001_ECX] =
> +            CPUID_EXT3_LAHF_LM | CPUID_EXT3_ABM | CPUID_EXT3_3DNOWPREFETCH,
> +        .features[FEAT_8000_0008_EBX] =
> +            CPUID_8000_0008_EBX_WBNOINVD,
> +        .features[FEAT_7_0_EBX] =
> +            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_HLE |
> +            CPUID_7_0_EBX_AVX2 | CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 |
> +            CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_RTM |
> +            CPUID_7_0_EBX_AVX512F | CPUID_7_0_EBX_AVX512DQ |
> +            CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_SMAP |
> +            CPUID_7_0_EBX_AVX512IFMA | CPUID_7_0_EBX_CLFLUSHOPT |
> +            CPUID_7_0_EBX_CLWB | CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI |
> +            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL,
> +        .features[FEAT_7_0_ECX] =
> +            CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
> +            CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
> +            CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
> +            CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
> +            CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 |
> +            CPUID_7_0_ECX_RDPID | CPUID_7_0_ECX_BUS_LOCK_DETECT,
> +        .features[FEAT_7_0_EDX] =
> +            CPUID_7_0_EDX_FSRM | CPUID_7_0_EDX_SERIALIZE |
> +            CPUID_7_0_EDX_TSX_LDTRK | CPUID_7_0_EDX_AMX_BF16 |
> +            CPUID_7_0_EDX_AVX512_FP16 | CPUID_7_0_EDX_AMX_TILE |
> +            CPUID_7_0_EDX_AMX_INT8 | CPUID_7_0_EDX_SPEC_CTRL |
> +            CPUID_7_0_EDX_ARCH_CAPABILITIES | CPUID_7_0_EDX_SPEC_CTRL_SSBD,
> +        .features[FEAT_ARCH_CAPABILITIES] =
> +            MSR_ARCH_CAP_RDCL_NO | MSR_ARCH_CAP_IBRS_ALL |
> +            MSR_ARCH_CAP_SKIP_L1DFL_VMENTRY | MSR_ARCH_CAP_MDS_NO |
> +            MSR_ARCH_CAP_PSCHANGE_MC_NO | MSR_ARCH_CAP_TAA_NO,
> +        .features[FEAT_XSAVE] =
> +            CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
> +            CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES | CPUID_D_1_EAX_XFD,
> +        .features[FEAT_6_EAX] =
> +            CPUID_6_EAX_ARAT,
> +        .features[FEAT_7_1_EAX] =
> +            CPUID_7_1_EAX_AVX_VNNI | CPUID_7_1_EAX_AVX512_BF16,
> +        .features[FEAT_1D_1_EAX] = INTEL_SPR_AMX_TOTAL_TILE_BYTES |
> +            (INTEL_SPR_AMX_BYTES_PER_TILE << 16),
> +        .features[FEAT_1D_1_EBX] = INTEL_SPR_AMX_BYTES_PER_ROW |
> +            (INTEL_SPR_AMX_TILE_MAX_NAMES << 16),
> +        .features[FEAT_1D_1_ECX] = INTEL_SPR_AMX_TILE_MAX_ROWS,
> +        .features[FEAT_1E_0_EBX] = INTEL_SPR_AMX_TMUL_MAX_K |
> +            (INTEL_SPR_AMX_TMUL_MAX_N << 8),
> +        .features[FEAT_VMX_BASIC] =
> +            MSR_VMX_BASIC_INS_OUTS | MSR_VMX_BASIC_TRUE_CTLS,
> +        .features[FEAT_VMX_ENTRY_CTLS] =
> +            VMX_VM_ENTRY_LOAD_DEBUG_CONTROLS | VMX_VM_ENTRY_IA32E_MODE |
> +            VMX_VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
> +            VMX_VM_ENTRY_LOAD_IA32_PAT | VMX_VM_ENTRY_LOAD_IA32_EFER,
> +        .features[FEAT_VMX_EPT_VPID_CAPS] =
> +            MSR_VMX_EPT_EXECONLY |
> +            MSR_VMX_EPT_PAGE_WALK_LENGTH_4 | MSR_VMX_EPT_PAGE_WALK_LENGTH_5 |
> +            MSR_VMX_EPT_WB | MSR_VMX_EPT_2MB | MSR_VMX_EPT_1GB |
> +            MSR_VMX_EPT_INVEPT | MSR_VMX_EPT_AD_BITS |
> +            MSR_VMX_EPT_INVEPT_SINGLE_CONTEXT | MSR_VMX_EPT_INVEPT_ALL_CONTEXT |
> +            MSR_VMX_EPT_INVVPID | MSR_VMX_EPT_INVVPID_SINGLE_ADDR |
> +            MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT |
> +            MSR_VMX_EPT_INVVPID_ALL_CONTEXT |
> +            MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT_NOGLOBALS,
> +        .features[FEAT_VMX_EXIT_CTLS] =
> +            VMX_VM_EXIT_SAVE_DEBUG_CONTROLS |
> +            VMX_VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
> +            VMX_VM_EXIT_ACK_INTR_ON_EXIT | VMX_VM_EXIT_SAVE_IA32_PAT |
> +            VMX_VM_EXIT_LOAD_IA32_PAT | VMX_VM_EXIT_SAVE_IA32_EFER |
> +            VMX_VM_EXIT_LOAD_IA32_EFER | VMX_VM_EXIT_SAVE_VMX_PREEMPTION_TIMER,
> +        .features[FEAT_VMX_MISC] =
> +            MSR_VMX_MISC_STORE_LMA | MSR_VMX_MISC_ACTIVITY_HLT |
> +            MSR_VMX_MISC_VMWRITE_VMEXIT,
> +        .features[FEAT_VMX_PINBASED_CTLS] =
> +            VMX_PIN_BASED_EXT_INTR_MASK | VMX_PIN_BASED_NMI_EXITING |
> +            VMX_PIN_BASED_VIRTUAL_NMIS | VMX_PIN_BASED_VMX_PREEMPTION_TIMER |
> +            VMX_PIN_BASED_POSTED_INTR,
> +        .features[FEAT_VMX_PROCBASED_CTLS] =
> +            VMX_CPU_BASED_VIRTUAL_INTR_PENDING |
> +            VMX_CPU_BASED_USE_TSC_OFFSETING | VMX_CPU_BASED_HLT_EXITING |
> +            VMX_CPU_BASED_INVLPG_EXITING | VMX_CPU_BASED_MWAIT_EXITING |
> +            VMX_CPU_BASED_RDPMC_EXITING | VMX_CPU_BASED_RDTSC_EXITING |
> +            VMX_CPU_BASED_CR3_LOAD_EXITING | VMX_CPU_BASED_CR3_STORE_EXITING |
> +            VMX_CPU_BASED_CR8_LOAD_EXITING | VMX_CPU_BASED_CR8_STORE_EXITING |
> +            VMX_CPU_BASED_TPR_SHADOW | VMX_CPU_BASED_VIRTUAL_NMI_PENDING |
> +            VMX_CPU_BASED_MOV_DR_EXITING | VMX_CPU_BASED_UNCOND_IO_EXITING |
> +            VMX_CPU_BASED_USE_IO_BITMAPS | VMX_CPU_BASED_MONITOR_TRAP_FLAG |
> +            VMX_CPU_BASED_USE_MSR_BITMAPS | VMX_CPU_BASED_MONITOR_EXITING |
> +            VMX_CPU_BASED_PAUSE_EXITING |
> +            VMX_CPU_BASED_ACTIVATE_SECONDARY_CONTROLS,
> +        .features[FEAT_VMX_SECONDARY_CTLS] =
> +            VMX_SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
> +            VMX_SECONDARY_EXEC_ENABLE_EPT | VMX_SECONDARY_EXEC_DESC |
> +            VMX_SECONDARY_EXEC_RDTSCP |
> +            VMX_SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
> +            VMX_SECONDARY_EXEC_ENABLE_VPID | VMX_SECONDARY_EXEC_WBINVD_EXITING |
> +            VMX_SECONDARY_EXEC_UNRESTRICTED_GUEST |
> +            VMX_SECONDARY_EXEC_APIC_REGISTER_VIRT |
> +            VMX_SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
> +            VMX_SECONDARY_EXEC_RDRAND_EXITING |
> +            VMX_SECONDARY_EXEC_ENABLE_INVPCID |
> +            VMX_SECONDARY_EXEC_ENABLE_VMFUNC | VMX_SECONDARY_EXEC_SHADOW_VMCS |
> +            VMX_SECONDARY_EXEC_RDSEED_EXITING | VMX_SECONDARY_EXEC_ENABLE_PML |
> +            VMX_SECONDARY_EXEC_XSAVES,
> +        .features[FEAT_VMX_VMFUNC] =
> +            MSR_VMX_VMFUNC_EPT_SWITCHING,
> +        .xlevel = 0x80000008,
> +        .model_id = "Intel Xeon Processor (SapphireRapids)",
> +        .versions = (X86CPUVersionDefinition[]) {
> +            { .version = 1 },
> +            { /* end of list */ },
> +        },
> +    },
>      {
>          .name = "Denverton",
>          .level = 21,
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 3e3e0cfe59..8361b9f3ff 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -893,10 +893,14 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>  #define CPUID_7_0_EDX_TSX_LDTRK         (1U << 16)
>  /* Architectural LBRs */
>  #define CPUID_7_0_EDX_ARCH_LBR          (1U << 19)
> +/* AMX_BF16 instruction */
> +#define CPUID_7_0_EDX_AMX_BF16          (1U << 22)
>  /* AVX512_FP16 instruction */
>  #define CPUID_7_0_EDX_AVX512_FP16       (1U << 23)
>  /* AMX tile (two-dimensional register) */
>  #define CPUID_7_0_EDX_AMX_TILE          (1U << 24)
> +/* AMX_INT8 instruction */
> +#define CPUID_7_0_EDX_AMX_INT8          (1U << 25)
>  /* Speculation Control */
>  #define CPUID_7_0_EDX_SPEC_CTRL         (1U << 26)
>  /* Single Thread Indirect Branch Predictors */



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/6] Support for new CPU model SapphireRapids
  2023-01-06  8:38 [PATCH v3 0/6] Support for new CPU model SapphireRapids Lei Wang
                   ` (5 preceding siblings ...)
  2023-01-06  8:38 ` [PATCH v3 6/6] i386: Add new CPU model SapphireRapids Lei Wang
@ 2023-02-02 11:05 ` Igor Mammedov
  2023-02-07  2:50   ` Wang, Lei
                     ` (2 more replies)
  6 siblings, 3 replies; 25+ messages in thread
From: Igor Mammedov @ 2023-02-02 11:05 UTC (permalink / raw)
  To: Lei Wang; +Cc: pbonzini, qemu-devel, dgilbert, berrange, xiaoyao.li, yang.zhong

On Fri,  6 Jan 2023 00:38:20 -0800
Lei Wang <lei4.wang@intel.com> wrote:

> This series aims to add a new CPU model SapphireRapids, and tries to
> address the problem stated in
> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#mcf67dbd1ad37c65d7988c36a2b267be9afd2fb30,
> so that named CPU model can define its own AMX values, and QEMU won't
> pass the wrong AMX values to KVM in future platforms if they have
> different values supported.
> 
> The original patch is
> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#u.

MultiBitFeatureInfo looks like an interesting
idea but among fixing whatever issues this has atm,
you'd probably need to introduce a new  qdev_bitfield property
infrastructure so that such features could be treated like any
other qdev properties.
Also when MultiBitFeatureInfo is added, one should convert all
other usecases where it's applicable (not only for new code)
so that we would end up with consolidated approach instead of
zoo mess.

I'm not sure all MultiBitFeatureInfo complexity is necessary
just for adding a new CPU model, I'd rather use ad hoc approach
that we were using before for non boolean features.

And then try to develop MultiBitFeatureInfo & co as a separate
series to demonstrate how much it will improve current
cpu models definitions.

PS:
 'make check-acceptance' are broken with this

> ---
> 
> Changelog:
> 
> v3:
>  - Rebase on the latest QEMU (d1852caab131ea898134fdcea8c14bc2ee75fbe9).
>  - v2: https://lore.kernel.org/qemu-devel/20221102085256.81139-1-lei4.wang@intel.com/
> 
> v2:
>  - Fix when passing all zeros of AMX-related CPUID, QEMU will warn
>    unsupported.
>  - Remove unnecessary function definition and make code cleaner.
>  - Fix some typos.
>  - v1: https://lore.kernel.org/qemu-devel/20221027020036.373140-1-lei4.wang@intel.com/T/#t
> 
> 
> Lei Wang (6):
>   i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E
>   i386: Remove unused parameter "uint32_t bit" in
>     feature_word_description()
>   i386: Introduce new struct "MultiBitFeatureInfo" for multi-bit
>     features
>   i386: Mask and report unavailable multi-bit feature values
>   i386: Initialize AMX CPUID leaves with corresponding env->features[]
>     leaves
>   i386: Add new CPU model SapphireRapids
> 
>  target/i386/cpu-internal.h |  11 ++
>  target/i386/cpu.c          | 311 +++++++++++++++++++++++++++++++++++--
>  target/i386/cpu.h          |  16 ++
>  3 files changed, 322 insertions(+), 16 deletions(-)
> 
> 
> base-commit: d1852caab131ea898134fdcea8c14bc2ee75fbe9



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 6/6] i386: Add new CPU model SapphireRapids
  2023-02-02 10:40   ` Igor Mammedov
@ 2023-02-03  6:02     ` Wang, Lei
  0 siblings, 0 replies; 25+ messages in thread
From: Wang, Lei @ 2023-02-03  6:02 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: pbonzini, qemu-devel, dgilbert, berrange, xiaoyao.li, yang.zhong

On 2/2/2023 6:40 PM, Igor Mammedov wrote:
> On Fri,  6 Jan 2023 00:38:26 -0800
> Lei Wang <lei4.wang@intel.com> wrote:
> 
>> The new CPU model mostly inherits features from Icelake-Server, while
>> adding new features:
>>  - AMX (Advance Matrix eXtensions)
>>  - Bus Lock Debug Exception
>> and new instructions:
>>  - AVX VNNI (Vector Neural Network Instruction):
>>     - VPDPBUS: Multiply and Add Unsigned and Signed Bytes
>>     - VPDPBUSDS: Multiply and Add Unsigned and Signed Bytes with Saturation
>>     - VPDPWSSD: Multiply and Add Signed Word Integers
>>     - VPDPWSSDS: Multiply and Add Signed Integers with Saturation
>>  - FP16: Replicates existing AVX512 computational SP (FP32) instructions
>>    using FP16 instead of FP32 for ~2X performance gain
>>  - SERIALIZE: Provide software with a simple way to force the processor to
>>    complete all modifications, faster, allowed in all privilege levels and
>>    not causing an unconditional VM exit
>>  - TSX Suspend Load Address Tracking: Allows programmers to choose which
>>    memory accesses do not need to be tracked in the TSX read set
>>  - AVX512_BF16: Vector Neural Network Instructions supporting BFLOAT16
>>    inputs and conversion instructions from IEEE single precision
> 
> you should mention all new features here, preferably in format:
>  feature-flag:  expected CPUID feature bits , so reviewer would be able to find it in spec

Thanks, will mention the new features introduced by the new CPU model.

> also you mention that it inherits most of the features from Icelake cpu,
> it would be nice to provide cpuid diff between real Icelake and new cpu
> somewhere in cover letter to simplify comparison.

OK, will add the diff between the command line output here.

>>
>> Features may be added in future versions:
>>  - CET (virtualization support hasn't been merged)
>> Instructions may be added in future versions:
>>  - fast zero-length MOVSB (KVM doesn't support yet)
>>  - fast short STOSB (KVM doesn't support yet)
>>  - fast short CMPSB, SCASB (KVM doesn't support yet)
>>
>> Signed-off-by: Lei Wang <lei4.wang@intel.com>
>> Reviewed-by: Robert Hoo <robert.hu@linux.intel.com>
>> ---
>>  target/i386/cpu.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++
>>  target/i386/cpu.h |   4 ++
>>  2 files changed, 139 insertions(+)
>>
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index 946df29a3d..5e1ecd50b3 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -3576,6 +3576,141 @@ static const X86CPUDefinition builtin_x86_defs[] = {
>>              { /* end of list */ }
>>          }
>>      },
>> +    {
>> +        .name = "SapphireRapids",
>> +        .level = 0x20,
>> +        .vendor = CPUID_VENDOR_INTEL,
>> +        .family = 6,
>> +        .model = 143,
>> +        .stepping = 4,
>> +        /*
>> +         * please keep the ascending order so that we can have a clear view of
>> +         * bit position of each feature.
>> +         */
> 
> unless you have a way to enforce this comment. I wouldn't expect that would
> work in practice.
> 
> Also If you wish to bring more order here, you should start with a prerequisite
> patch that would do the same to all existing CPU models.
> That way one can easily see a difference between Icelake and new cpu model.
> 
> As it is in this patch (with all unnecessary reordering) it's hard for reviewer
> to see differences.
> I'd suggest to just copy Icelake definition and modify it to suit new cpu model
> (without reordering all feature bits)  

Thanks for the suggestion, will remove the comment and just copy Icelake
definition and modify it to suit new cpu model (without reordering all feature bits)

>> +        .features[FEAT_1_EDX] =
>> +            CPUID_FP87 | CPUID_VME | CPUID_DE | CPUID_PSE | CPUID_TSC |
>> +            CPUID_MSR | CPUID_PAE | CPUID_MCE | CPUID_CX8 | CPUID_APIC |
>> +            CPUID_SEP | CPUID_MTRR | CPUID_PGE | CPUID_MCA | CPUID_CMOV |
>> +            CPUID_PAT | CPUID_PSE36 | CPUID_CLFLUSH | CPUID_MMX | CPUID_FXSR |
>> +            CPUID_SSE | CPUID_SSE2,
>> +        .features[FEAT_1_ECX] =
>> +            CPUID_EXT_SSE3 | CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSSE3 |
>> +            CPUID_EXT_FMA | CPUID_EXT_CX16 | CPUID_EXT_PCID | CPUID_EXT_SSE41 |
>> +            CPUID_EXT_SSE42 | CPUID_EXT_X2APIC | CPUID_EXT_MOVBE |
>> +            CPUID_EXT_POPCNT | CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_AES |
>> +            CPUID_EXT_XSAVE | CPUID_EXT_AVX | CPUID_EXT_F16C | CPUID_EXT_RDRAND,
>> +        .features[FEAT_8000_0001_EDX] =
>> +            CPUID_EXT2_SYSCALL | CPUID_EXT2_NX | CPUID_EXT2_PDPE1GB |
>> +            CPUID_EXT2_RDTSCP | CPUID_EXT2_LM,
>> +        .features[FEAT_8000_0001_ECX] =
>> +            CPUID_EXT3_LAHF_LM | CPUID_EXT3_ABM | CPUID_EXT3_3DNOWPREFETCH,
>> +        .features[FEAT_8000_0008_EBX] =
>> +            CPUID_8000_0008_EBX_WBNOINVD,
>> +        .features[FEAT_7_0_EBX] =
>> +            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_HLE |
>> +            CPUID_7_0_EBX_AVX2 | CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 |
>> +            CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_RTM |
>> +            CPUID_7_0_EBX_AVX512F | CPUID_7_0_EBX_AVX512DQ |
>> +            CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_SMAP |
>> +            CPUID_7_0_EBX_AVX512IFMA | CPUID_7_0_EBX_CLFLUSHOPT |
>> +            CPUID_7_0_EBX_CLWB | CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI |
>> +            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL,
>> +        .features[FEAT_7_0_ECX] =
>> +            CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
>> +            CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
>> +            CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
>> +            CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
>> +            CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 |
>> +            CPUID_7_0_ECX_RDPID | CPUID_7_0_ECX_BUS_LOCK_DETECT,
>> +        .features[FEAT_7_0_EDX] =
>> +            CPUID_7_0_EDX_FSRM | CPUID_7_0_EDX_SERIALIZE |
>> +            CPUID_7_0_EDX_TSX_LDTRK | CPUID_7_0_EDX_AMX_BF16 |
>> +            CPUID_7_0_EDX_AVX512_FP16 | CPUID_7_0_EDX_AMX_TILE |
>> +            CPUID_7_0_EDX_AMX_INT8 | CPUID_7_0_EDX_SPEC_CTRL |
>> +            CPUID_7_0_EDX_ARCH_CAPABILITIES | CPUID_7_0_EDX_SPEC_CTRL_SSBD,
>> +        .features[FEAT_ARCH_CAPABILITIES] =
>> +            MSR_ARCH_CAP_RDCL_NO | MSR_ARCH_CAP_IBRS_ALL |
>> +            MSR_ARCH_CAP_SKIP_L1DFL_VMENTRY | MSR_ARCH_CAP_MDS_NO |
>> +            MSR_ARCH_CAP_PSCHANGE_MC_NO | MSR_ARCH_CAP_TAA_NO,
>> +        .features[FEAT_XSAVE] =
>> +            CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
>> +            CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES | CPUID_D_1_EAX_XFD,
>> +        .features[FEAT_6_EAX] =
>> +            CPUID_6_EAX_ARAT,
>> +        .features[FEAT_7_1_EAX] =
>> +            CPUID_7_1_EAX_AVX_VNNI | CPUID_7_1_EAX_AVX512_BF16,
>> +        .features[FEAT_1D_1_EAX] = INTEL_SPR_AMX_TOTAL_TILE_BYTES |
>> +            (INTEL_SPR_AMX_BYTES_PER_TILE << 16),
>> +        .features[FEAT_1D_1_EBX] = INTEL_SPR_AMX_BYTES_PER_ROW |
>> +            (INTEL_SPR_AMX_TILE_MAX_NAMES << 16),
>> +        .features[FEAT_1D_1_ECX] = INTEL_SPR_AMX_TILE_MAX_ROWS,
>> +        .features[FEAT_1E_0_EBX] = INTEL_SPR_AMX_TMUL_MAX_K |
>> +            (INTEL_SPR_AMX_TMUL_MAX_N << 8),
>> +        .features[FEAT_VMX_BASIC] =
>> +            MSR_VMX_BASIC_INS_OUTS | MSR_VMX_BASIC_TRUE_CTLS,
>> +        .features[FEAT_VMX_ENTRY_CTLS] =
>> +            VMX_VM_ENTRY_LOAD_DEBUG_CONTROLS | VMX_VM_ENTRY_IA32E_MODE |
>> +            VMX_VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
>> +            VMX_VM_ENTRY_LOAD_IA32_PAT | VMX_VM_ENTRY_LOAD_IA32_EFER,
>> +        .features[FEAT_VMX_EPT_VPID_CAPS] =
>> +            MSR_VMX_EPT_EXECONLY |
>> +            MSR_VMX_EPT_PAGE_WALK_LENGTH_4 | MSR_VMX_EPT_PAGE_WALK_LENGTH_5 |
>> +            MSR_VMX_EPT_WB | MSR_VMX_EPT_2MB | MSR_VMX_EPT_1GB |
>> +            MSR_VMX_EPT_INVEPT | MSR_VMX_EPT_AD_BITS |
>> +            MSR_VMX_EPT_INVEPT_SINGLE_CONTEXT | MSR_VMX_EPT_INVEPT_ALL_CONTEXT |
>> +            MSR_VMX_EPT_INVVPID | MSR_VMX_EPT_INVVPID_SINGLE_ADDR |
>> +            MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT |
>> +            MSR_VMX_EPT_INVVPID_ALL_CONTEXT |
>> +            MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT_NOGLOBALS,
>> +        .features[FEAT_VMX_EXIT_CTLS] =
>> +            VMX_VM_EXIT_SAVE_DEBUG_CONTROLS |
>> +            VMX_VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
>> +            VMX_VM_EXIT_ACK_INTR_ON_EXIT | VMX_VM_EXIT_SAVE_IA32_PAT |
>> +            VMX_VM_EXIT_LOAD_IA32_PAT | VMX_VM_EXIT_SAVE_IA32_EFER |
>> +            VMX_VM_EXIT_LOAD_IA32_EFER | VMX_VM_EXIT_SAVE_VMX_PREEMPTION_TIMER,
>> +        .features[FEAT_VMX_MISC] =
>> +            MSR_VMX_MISC_STORE_LMA | MSR_VMX_MISC_ACTIVITY_HLT |
>> +            MSR_VMX_MISC_VMWRITE_VMEXIT,
>> +        .features[FEAT_VMX_PINBASED_CTLS] =
>> +            VMX_PIN_BASED_EXT_INTR_MASK | VMX_PIN_BASED_NMI_EXITING |
>> +            VMX_PIN_BASED_VIRTUAL_NMIS | VMX_PIN_BASED_VMX_PREEMPTION_TIMER |
>> +            VMX_PIN_BASED_POSTED_INTR,
>> +        .features[FEAT_VMX_PROCBASED_CTLS] =
>> +            VMX_CPU_BASED_VIRTUAL_INTR_PENDING |
>> +            VMX_CPU_BASED_USE_TSC_OFFSETING | VMX_CPU_BASED_HLT_EXITING |
>> +            VMX_CPU_BASED_INVLPG_EXITING | VMX_CPU_BASED_MWAIT_EXITING |
>> +            VMX_CPU_BASED_RDPMC_EXITING | VMX_CPU_BASED_RDTSC_EXITING |
>> +            VMX_CPU_BASED_CR3_LOAD_EXITING | VMX_CPU_BASED_CR3_STORE_EXITING |
>> +            VMX_CPU_BASED_CR8_LOAD_EXITING | VMX_CPU_BASED_CR8_STORE_EXITING |
>> +            VMX_CPU_BASED_TPR_SHADOW | VMX_CPU_BASED_VIRTUAL_NMI_PENDING |
>> +            VMX_CPU_BASED_MOV_DR_EXITING | VMX_CPU_BASED_UNCOND_IO_EXITING |
>> +            VMX_CPU_BASED_USE_IO_BITMAPS | VMX_CPU_BASED_MONITOR_TRAP_FLAG |
>> +            VMX_CPU_BASED_USE_MSR_BITMAPS | VMX_CPU_BASED_MONITOR_EXITING |
>> +            VMX_CPU_BASED_PAUSE_EXITING |
>> +            VMX_CPU_BASED_ACTIVATE_SECONDARY_CONTROLS,
>> +        .features[FEAT_VMX_SECONDARY_CTLS] =
>> +            VMX_SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
>> +            VMX_SECONDARY_EXEC_ENABLE_EPT | VMX_SECONDARY_EXEC_DESC |
>> +            VMX_SECONDARY_EXEC_RDTSCP |
>> +            VMX_SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
>> +            VMX_SECONDARY_EXEC_ENABLE_VPID | VMX_SECONDARY_EXEC_WBINVD_EXITING |
>> +            VMX_SECONDARY_EXEC_UNRESTRICTED_GUEST |
>> +            VMX_SECONDARY_EXEC_APIC_REGISTER_VIRT |
>> +            VMX_SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
>> +            VMX_SECONDARY_EXEC_RDRAND_EXITING |
>> +            VMX_SECONDARY_EXEC_ENABLE_INVPCID |
>> +            VMX_SECONDARY_EXEC_ENABLE_VMFUNC | VMX_SECONDARY_EXEC_SHADOW_VMCS |
>> +            VMX_SECONDARY_EXEC_RDSEED_EXITING | VMX_SECONDARY_EXEC_ENABLE_PML |
>> +            VMX_SECONDARY_EXEC_XSAVES,
>> +        .features[FEAT_VMX_VMFUNC] =
>> +            MSR_VMX_VMFUNC_EPT_SWITCHING,
>> +        .xlevel = 0x80000008,
>> +        .model_id = "Intel Xeon Processor (SapphireRapids)",
>> +        .versions = (X86CPUVersionDefinition[]) {
>> +            { .version = 1 },
>> +            { /* end of list */ },
>> +        },
>> +    },
>>      {
>>          .name = "Denverton",
>>          .level = 21,
>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>> index 3e3e0cfe59..8361b9f3ff 100644
>> --- a/target/i386/cpu.h
>> +++ b/target/i386/cpu.h
>> @@ -893,10 +893,14 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>>  #define CPUID_7_0_EDX_TSX_LDTRK         (1U << 16)
>>  /* Architectural LBRs */
>>  #define CPUID_7_0_EDX_ARCH_LBR          (1U << 19)
>> +/* AMX_BF16 instruction */
>> +#define CPUID_7_0_EDX_AMX_BF16          (1U << 22)
>>  /* AVX512_FP16 instruction */
>>  #define CPUID_7_0_EDX_AVX512_FP16       (1U << 23)
>>  /* AMX tile (two-dimensional register) */
>>  #define CPUID_7_0_EDX_AMX_TILE          (1U << 24)
>> +/* AMX_INT8 instruction */
>> +#define CPUID_7_0_EDX_AMX_INT8          (1U << 25)
>>  /* Speculation Control */
>>  #define CPUID_7_0_EDX_SPEC_CTRL         (1U << 26)
>>  /* Single Thread Indirect Branch Predictors */
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 4/6] i386: Mask and report unavailable multi-bit feature values
  2023-01-06  8:38 ` [PATCH v3 4/6] i386: Mask and report unavailable multi-bit feature values Lei Wang
@ 2023-02-06  7:43   ` Yuan Yao
  2023-02-09  1:04     ` Wang, Lei
  0 siblings, 1 reply; 25+ messages in thread
From: Yuan Yao @ 2023-02-06  7:43 UTC (permalink / raw)
  To: Lei Wang
  Cc: pbonzini, qemu-devel, imammedo, dgilbert, berrange, xiaoyao.li,
	yang.zhong

On Fri, Jan 06, 2023 at 12:38:24AM -0800, Lei Wang wrote:
> Some feature words, e.g., feature words in AMX-related CPUID leaf 0x1D and
> 0x1E are not bit-wise but multiple bits represents one value. Handle this
> situation when the values specified are not the same as which are reported
> by KVM. The handling includes:
>
>  - The responsibility of masking bits and giving warnings are delegated to
>    the feature enabler. A framework is also provided to enable this.
>  - To simplify the initialization, a default function is provided if the
>    the function is not specified.
>
> The reason why delegating this responsibility rather than just marking
> them as zeros when they are not same is because different multi-bit
> features may have different logic, which is case by case, for example:
>
>  1. CPUID.0x14_0x1:EBX[15:0]. Even though it's multi-bits field, it's a
>     bitmap and each bit represents a separate capability.
>
>  2. CPUID.0x14_0x1:EAX[2:0] represents the number of configurable Address
>     Ranges. 3 bits as a whole to represent a integer value. It means the
>     maximum capability of HW. If KVM reports M, then M to 0 is legal
>     value to configure (because KVM can emulate each value correctly).
>
>  3. CPUID.0x1D_0x1:EAX[31:16] represents palette 1 bytes_per_tile. 16 bits
>     as a whole represent an integer value. It's not like case 2 and SW
>     needs to configure the same value as reported. Because it's not
>     possible for SW to configure to a different value and KVM cannot
>     emulate it.
>
> So marking them blindly as zeros is incorrect, and delegating this
> responsibility can let each multi-bit feature have its own way to mask bits.
>
> Signed-off-by: Lei Wang <lei4.wang@intel.com>
> ---
>  target/i386/cpu-internal.h |  2 ++
>  target/i386/cpu.c          | 36 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 38 insertions(+)
>
> diff --git a/target/i386/cpu-internal.h b/target/i386/cpu-internal.h
> index 66b3d66cb4..83c7b53926 100644
> --- a/target/i386/cpu-internal.h
> +++ b/target/i386/cpu-internal.h
> @@ -30,6 +30,8 @@ typedef struct MultiBitFeatureInfo {
>      uint64_t mask;
>      unsigned high_bit_position;
>      unsigned low_bit_position;
> +    void (*mark_unavailable_multi_bit)(X86CPU *cpu, FeatureWord w, int index,
> +                                       const char *verbose_prefix);
>  } MultiBitFeatureInfo;
>
>  typedef struct FeatureWordInfo {
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 88aa780566..e638a31d34 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -4377,6 +4377,28 @@ static bool x86_cpu_have_filtered_features(X86CPU *cpu)
>      return false;
>  }
>
> +static void mark_unavailable_multi_bit_default(X86CPU *cpu, FeatureWord w,
> +                                               int index,
> +                                               const char *verbose_prefix)
> +{
> +    FeatureWordInfo *f = &feature_word_info[w];
> +    g_autofree char *feat_word_str = feature_word_description(f);
> +    uint64_t host_feat = x86_cpu_get_supported_feature_word(w, false);
> +    MultiBitFeatureInfo mf = f->multi_bit_features[index];
> +
> +    if ((cpu->env.features[w] & mf.mask) &&

With this checking bits are all 0 but covered by mf.mask's range are skippped,
even if they're different from the host_feat, please check whether it's desried
behavior.

> +        ((cpu->env.features[w] ^ host_feat) & mf.mask)) {
> +        if (!cpu->force_features) {
> +            cpu->env.features[w] &= ~mf.mask;
> +        }
> +        cpu->filtered_features[w] |= mf.mask;
> +        if (verbose_prefix)
> +            warn_report("%s: %s.%s [%u:%u]", verbose_prefix, feat_word_str,
> +                        mf.feat_name, mf.high_bit_position,
> +                        mf.low_bit_position);
> +    }
> +}
> +
>  static void mark_unavailable_features(X86CPU *cpu, FeatureWord w, uint64_t mask,
>                                        const char *verbose_prefix)
>  {
> @@ -6442,6 +6464,20 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>              x86_cpu_get_supported_feature_word(w, false);
>          uint64_t requested_features = env->features[w];
>          uint64_t unavailable_features = requested_features & ~host_feat;
> +        FeatureWordInfo f = feature_word_info[w];
> +        int i;
> +
> +        for (i = 0; i < f.num_multi_bit_features; i++) {
> +            MultiBitFeatureInfo mf = f.multi_bit_features[i];
> +            if (mf.mark_unavailable_multi_bit) {
> +                mf.mark_unavailable_multi_bit(cpu, w, i, prefix);
> +            } else {
> +                mark_unavailable_multi_bit_default(cpu, w, i, prefix);
> +            }
> +
> +            unavailable_features &= ~mf.mask;
> +        }
> +
>          mark_unavailable_features(cpu, w, unavailable_features, prefix);
>      }
>
> --
> 2.34.1
>
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/6] i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E
  2023-01-06  8:38 ` [PATCH v3 1/6] i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E Lei Wang
@ 2023-02-06  7:45   ` Yuan Yao
  0 siblings, 0 replies; 25+ messages in thread
From: Yuan Yao @ 2023-02-06  7:45 UTC (permalink / raw)
  To: Lei Wang
  Cc: pbonzini, qemu-devel, imammedo, dgilbert, berrange, xiaoyao.li,
	yang.zhong

On Fri, Jan 06, 2023 at 12:38:21AM -0800, Lei Wang wrote:
> CPUID leaf 0x1D and 0x1E enumerate tile and TMUL information for AMX.
>
> Introduce FeatureWord FEAT_1D_1_EAX, FEAT_1D_1_EBX, FEAT_1D_1_ECX and
> FEAT_1E_0_EBX. Thus these features of AMX can be expanded when
> "-cpu host/max" and can be configured in named CPU model.
>
> Signed-off-by: Lei Wang <lei4.wang@intel.com>

Reviewed-by: Yao Yuan <yuan.yao@intel.com>

> ---
>  target/i386/cpu.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++
>  target/i386/cpu.h | 12 +++++++++++
>  2 files changed, 67 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 3410e5e470..b6d1247e5e 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -1002,6 +1002,45 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
>          },
>          .tcg_features = ~0U,
>      },
> +    [FEAT_1D_1_EAX] = {
> +        .type = CPUID_FEATURE_WORD,
> +        .cpuid = {
> +            .eax = 0x1D,
> +            .needs_ecx = true, .ecx = 1,
> +            .reg = R_EAX,
> +        },
> +        .migratable_flags = CPUID_AMX_PALETTE_1_TOTAL_TILE_BYTES_MASK |
> +            CPUID_AMX_PALETTE_1_BYTES_PER_TILE_MASK,
> +    },
> +    [FEAT_1D_1_EBX] = {
> +        .type = CPUID_FEATURE_WORD,
> +        .cpuid = {
> +            .eax = 0x1D,
> +            .needs_ecx = true, .ecx = 1,
> +            .reg = R_EBX,
> +        },
> +        .migratable_flags = CPUID_AMX_PALETTE_1_BYTES_PER_ROW_MASK |
> +            CPUID_AMX_PALETTE_1_MAX_NAMES_MASK,
> +    },
> +    [FEAT_1D_1_ECX] = {
> +        .type = CPUID_FEATURE_WORD,
> +        .cpuid = {
> +            .eax = 0x1D,
> +            .needs_ecx = true, .ecx = 1,
> +            .reg = R_ECX,
> +        },
> +        .migratable_flags = CPUID_AMX_PALETTE_1_MAX_ROWS_MASK,
> +    },
> +    [FEAT_1E_0_EBX] = {
> +        .type = CPUID_FEATURE_WORD,
> +        .cpuid = {
> +            .eax = 0x1E,
> +            .needs_ecx = true, .ecx = 0,
> +            .reg = R_EBX,
> +        },
> +        .migratable_flags = CPUID_AMX_TMUL_MAX_K_MASK |
> +            CPUID_AMX_TMUL_MAX_N_MASK,
> +    },
>      /*Below are MSR exposed features*/
>      [FEAT_ARCH_CAPABILITIES] = {
>          .type = MSR_FEATURE_WORD,
> @@ -1371,6 +1410,22 @@ static FeatureDep feature_dependencies[] = {
>          .from = { FEAT_7_0_EBX,             CPUID_7_0_EBX_INTEL_PT },
>          .to = { FEAT_14_0_ECX,              ~0ull },
>      },
> +    {
> +        .from = { FEAT_7_0_EDX,             CPUID_7_0_EDX_AMX_TILE },
> +        .to = { FEAT_1D_1_EAX,              ~0ull },
> +    },
> +    {
> +        .from = { FEAT_7_0_EDX,             CPUID_7_0_EDX_AMX_TILE },
> +        .to = { FEAT_1D_1_EBX,              ~0ull },
> +    },
> +    {
> +        .from = { FEAT_7_0_EDX,             CPUID_7_0_EDX_AMX_TILE },
> +        .to = { FEAT_1D_1_ECX,              ~0ull },
> +    },
> +    {
> +        .from = { FEAT_7_0_EDX,             CPUID_7_0_EDX_AMX_TILE },
> +        .to = { FEAT_1E_0_EBX,              ~0ull },
> +    },
>      {
>          .from = { FEAT_8000_0001_EDX,       CPUID_EXT2_RDTSCP },
>          .to = { FEAT_VMX_SECONDARY_CTLS,    VMX_SECONDARY_EXEC_RDTSCP },
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index d4bc19577a..3e3e0cfe59 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -585,6 +585,14 @@ typedef enum X86Seg {
>                                   XSTATE_Hi16_ZMM_MASK | XSTATE_PKRU_MASK | \
>                                   XSTATE_XTILE_CFG_MASK | XSTATE_XTILE_DATA_MASK)
>
> +#define CPUID_AMX_PALETTE_1_TOTAL_TILE_BYTES_MASK 0xffffU
> +#define CPUID_AMX_PALETTE_1_BYTES_PER_TILE_MASK   (0xffffU << 16)
> +#define CPUID_AMX_PALETTE_1_BYTES_PER_ROW_MASK    0xffffU
> +#define CPUID_AMX_PALETTE_1_MAX_NAMES_MASK        (0xffffU << 16)
> +#define CPUID_AMX_PALETTE_1_MAX_ROWS_MASK         0xffffU
> +#define CPUID_AMX_TMUL_MAX_K_MASK                 0xffU
> +#define CPUID_AMX_TMUL_MAX_N_MASK                 (0xffffU << 8)
> +
>  /* CPUID feature words */
>  typedef enum FeatureWord {
>      FEAT_1_EDX,         /* CPUID[1].EDX */
> @@ -605,6 +613,10 @@ typedef enum FeatureWord {
>      FEAT_6_EAX,         /* CPUID[6].EAX */
>      FEAT_XSAVE_XCR0_LO, /* CPUID[EAX=0xd,ECX=0].EAX */
>      FEAT_XSAVE_XCR0_HI, /* CPUID[EAX=0xd,ECX=0].EDX */
> +    FEAT_1D_1_EAX,      /* CPUID[EAX=0x1d,ECX=1].EAX */
> +    FEAT_1D_1_EBX,      /* CPUID[EAX=0x1d,ECX=1].EBX */
> +    FEAT_1D_1_ECX,      /* CPUID[EAX=0x1d,ECX=1].ECX */
> +    FEAT_1E_0_EBX,      /* CPUID[EAX=0x1e,ECX=0].EBX */
>      FEAT_ARCH_CAPABILITIES,
>      FEAT_CORE_CAPABILITY,
>      FEAT_PERF_CAPABILITIES,
> --
> 2.34.1
>
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/6] Support for new CPU model SapphireRapids
  2023-02-02 11:05 ` [PATCH v3 0/6] Support for " Igor Mammedov
@ 2023-02-07  2:50   ` Wang, Lei
  2023-03-06 12:49     ` Igor Mammedov
  2023-02-08 14:53   ` Xiaoyao Li
  2023-03-02 14:49   ` Robert Hoo
  2 siblings, 1 reply; 25+ messages in thread
From: Wang, Lei @ 2023-02-07  2:50 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: pbonzini, qemu-devel, dgilbert, berrange, xiaoyao.li, yang.zhong

On 2/2/2023 7:05 PM, Igor Mammedov wrote:
> On Fri,  6 Jan 2023 00:38:20 -0800
> Lei Wang <lei4.wang@intel.com> wrote:
> 
>> This series aims to add a new CPU model SapphireRapids, and tries to
>> address the problem stated in
>> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#mcf67dbd1ad37c65d7988c36a2b267be9afd2fb30,
>> so that named CPU model can define its own AMX values, and QEMU won't
>> pass the wrong AMX values to KVM in future platforms if they have
>> different values supported.
>>
>> The original patch is
>> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#u.
> 
> MultiBitFeatureInfo looks like an interesting
> idea but among fixing whatever issues this has atm,
> you'd probably need to introduce a new  qdev_bitfield property
> infrastructure so that such features could be treated like any
> other qdev properties.
> Also when MultiBitFeatureInfo is added, one should convert all
> other usecases where it's applicable (not only for new code)
> so that we would end up with consolidated approach instead of
> zoo mess.
> 
> I'm not sure all MultiBitFeatureInfo complexity is necessary
> just for adding a new CPU model, I'd rather use ad hoc approach
> that we were using before for non boolean features.

Hi, Igor. I do not quite understand what does the "ad hoc approach" mean,
currently if we specify a multi-bit non-boolean CPUID value which is different
from the host value to CPU model, e.g., consider the following scenario:

- KVM **ONLY** supports value 5 (101) and,
- QEMU user want to pass value 3 (011) to it,

and follow the current logic:

    uint64_t unavailable_features = requested_features & ~host_feat;

then:

1. The warning message will be messy and not intuitive:

requested_features bit 1 is 1 and host_feat bit 1 is 0, so it will warn on this
non-sense bit.


2. Some CPUID bits will "leak" into the final CPUID passed to KVM:

requested_features bit 0 is 1 and host_feat bit 0 is also 1, so it will pass
this CPUID bit to host, the request_features value is 3 (011), finally we get 1
(001), this is not we want.

Am I understanding it correctly?

> 
> And then try to develop MultiBitFeatureInfo & co as a separate
> series to demonstrate how much it will improve current
> cpu models definitions.
> 
> PS:
>  'make check-acceptance' are broken with this
> 
>> ---
>>
>> Changelog:
>>
>> v3:
>>  - Rebase on the latest QEMU (d1852caab131ea898134fdcea8c14bc2ee75fbe9).
>>  - v2: https://lore.kernel.org/qemu-devel/20221102085256.81139-1-lei4.wang@intel.com/
>>
>> v2:
>>  - Fix when passing all zeros of AMX-related CPUID, QEMU will warn
>>    unsupported.
>>  - Remove unnecessary function definition and make code cleaner.
>>  - Fix some typos.
>>  - v1: https://lore.kernel.org/qemu-devel/20221027020036.373140-1-lei4.wang@intel.com/T/#t
>>
>>
>> Lei Wang (6):
>>   i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E
>>   i386: Remove unused parameter "uint32_t bit" in
>>     feature_word_description()
>>   i386: Introduce new struct "MultiBitFeatureInfo" for multi-bit
>>     features
>>   i386: Mask and report unavailable multi-bit feature values
>>   i386: Initialize AMX CPUID leaves with corresponding env->features[]
>>     leaves
>>   i386: Add new CPU model SapphireRapids
>>
>>  target/i386/cpu-internal.h |  11 ++
>>  target/i386/cpu.c          | 311 +++++++++++++++++++++++++++++++++++--
>>  target/i386/cpu.h          |  16 ++
>>  3 files changed, 322 insertions(+), 16 deletions(-)
>>
>>
>> base-commit: d1852caab131ea898134fdcea8c14bc2ee75fbe9
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/6] i386: Remove unused parameter "uint32_t bit" in feature_word_description()
  2023-01-06  8:38 ` [PATCH v3 2/6] i386: Remove unused parameter "uint32_t bit" in feature_word_description() Lei Wang
  2023-01-27 13:29   ` Igor Mammedov
@ 2023-02-08 14:33   ` Xiaoyao Li
  1 sibling, 0 replies; 25+ messages in thread
From: Xiaoyao Li @ 2023-02-08 14:33 UTC (permalink / raw)
  To: Lei Wang, pbonzini; +Cc: qemu-devel, imammedo, dgilbert, berrange, yang.zhong

On 1/6/2023 4:38 PM, Lei Wang wrote:
> Parameter "uint32_t bit" is not used in function feature_word_description(),
> so remove it.
> 
> Signed-off-by: Lei Wang <lei4.wang@intel.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> ---
>   target/i386/cpu.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b6d1247e5e..883098bc5a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -4290,7 +4290,7 @@ static const TypeInfo max_x86_cpu_type_info = {
>       .class_init = max_x86_cpu_class_init,
>   };
>   
> -static char *feature_word_description(FeatureWordInfo *f, uint32_t bit)
> +static char *feature_word_description(FeatureWordInfo *f)
>   {
>       assert(f->type == CPUID_FEATURE_WORD || f->type == MSR_FEATURE_WORD);
>   
> @@ -4329,6 +4329,7 @@ static void mark_unavailable_features(X86CPU *cpu, FeatureWord w, uint64_t mask,
>       CPUX86State *env = &cpu->env;
>       FeatureWordInfo *f = &feature_word_info[w];
>       int i;
> +    g_autofree char *feat_word_str = feature_word_description(f);
>   
>       if (!cpu->force_features) {
>           env->features[w] &= ~mask;
> @@ -4341,7 +4342,6 @@ static void mark_unavailable_features(X86CPU *cpu, FeatureWord w, uint64_t mask,
>   
>       for (i = 0; i < 64; ++i) {
>           if ((1ULL << i) & mask) {
> -            g_autofree char *feat_word_str = feature_word_description(f, i);
>               warn_report("%s: %s%s%s [bit %d]",
>                           verbose_prefix,
>                           feat_word_str,



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 3/6] i386: Introduce new struct "MultiBitFeatureInfo" for multi-bit features
  2023-01-06  8:38 ` [PATCH v3 3/6] i386: Introduce new struct "MultiBitFeatureInfo" for multi-bit features Lei Wang
@ 2023-02-08 14:39   ` Xiaoyao Li
  0 siblings, 0 replies; 25+ messages in thread
From: Xiaoyao Li @ 2023-02-08 14:39 UTC (permalink / raw)
  To: Lei Wang, pbonzini; +Cc: qemu-devel, imammedo, dgilbert, berrange, yang.zhong

On 1/6/2023 4:38 PM, Lei Wang wrote:
> Some features use multiple CPUID bits to form a value to be used, e.g.,
> CPUID(0x1E,0):EBX[23:08] is regarded as the tmul_maxn value for AMX.
> Introduce a new struct "MultiBitFeatureInfo" to hold the information for
> those features and create a corresponding member in struct FeatureWordInfo,
> so that the infomation can be assigned for each item in feature_word_info
> array and used in the future.
> 
> Signed-off-by: Lei Wang <lei4.wang@intel.com>
> ---
>   target/i386/cpu-internal.h |  9 +++++++
>   target/i386/cpu.c          | 54 ++++++++++++++++++++++++++++++++++++++
>   2 files changed, 63 insertions(+)
> 
> diff --git a/target/i386/cpu-internal.h b/target/i386/cpu-internal.h
> index 9baac5c0b4..66b3d66cb4 100644
> --- a/target/i386/cpu-internal.h
> +++ b/target/i386/cpu-internal.h
> @@ -25,6 +25,13 @@ typedef enum FeatureWordType {
>      MSR_FEATURE_WORD,
>   } FeatureWordType;
>   
> +typedef struct MultiBitFeatureInfo {
> +    const char *feat_name;
> +    uint64_t mask;
> +    unsigned high_bit_position;
> +    unsigned low_bit_position;
> +} MultiBitFeatureInfo;
> +
>   typedef struct FeatureWordInfo {
>       FeatureWordType type;
>       /* feature flags names are taken from "Intel Processor Identification and
> @@ -51,6 +58,8 @@ typedef struct FeatureWordInfo {
>       uint64_t migratable_flags; /* Feature flags known to be migratable */
>       /* Features that shouldn't be auto-enabled by "-cpu host" */
>       uint64_t no_autoenable_flags;
> +    unsigned num_multi_bit_features;
> +    MultiBitFeatureInfo *multi_bit_features;
>   } FeatureWordInfo;
>   
>   extern FeatureWordInfo feature_word_info[];
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 883098bc5a..88aa780566 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -1011,6 +1011,21 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
>           },
>           .migratable_flags = CPUID_AMX_PALETTE_1_TOTAL_TILE_BYTES_MASK |
>               CPUID_AMX_PALETTE_1_BYTES_PER_TILE_MASK,
> +        .num_multi_bit_features = 2,
> +        .multi_bit_features = (MultiBitFeatureInfo[]){
> +            {
> +                .feat_name = "total_tile_bytes",
> +                .mask = CPUID_AMX_PALETTE_1_TOTAL_TILE_BYTES_MASK,
> +                .high_bit_position = 15,
> +                .low_bit_position = 0,
> +            },
> +            {
> +                .feat_name = "bytes_per_tile",
> +                .mask = CPUID_AMX_PALETTE_1_BYTES_PER_TILE_MASK,
> +                .high_bit_position = 31,
> +                .low_bit_position = 16,
> +            },
> +        },

without Patch 4, we have no idea how MultiBitFeatureInfo will be used.

you can

1) introduce the whole MultiBitFeatureInfo infrastructure in this patch 
by merging Patch 4.

2) define real user of MultiBitFeatureInfo such as AMX in a seperate patch.

>       },
>       [FEAT_1D_1_EBX] = {
>           .type = CPUID_FEATURE_WORD,
> @@ -1021,6 +1036,21 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
>           },
>           .migratable_flags = CPUID_AMX_PALETTE_1_BYTES_PER_ROW_MASK |
>               CPUID_AMX_PALETTE_1_MAX_NAMES_MASK,
> +        .num_multi_bit_features = 2,
> +        .multi_bit_features = (MultiBitFeatureInfo[]){
> +            {
> +                .feat_name = "bytes_per_row",
> +                .mask = CPUID_AMX_PALETTE_1_BYTES_PER_ROW_MASK,
> +                .high_bit_position = 15,
> +                .low_bit_position = 0,
> +            },
> +            {
> +                .feat_name = "max_names",
> +                .mask = CPUID_AMX_PALETTE_1_MAX_NAMES_MASK,
> +                .high_bit_position = 31,
> +                .low_bit_position = 16,
> +            },
> +        },
>       },
>       [FEAT_1D_1_ECX] = {
>           .type = CPUID_FEATURE_WORD,
> @@ -1030,6 +1060,15 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
>               .reg = R_ECX,
>           },
>           .migratable_flags = CPUID_AMX_PALETTE_1_MAX_ROWS_MASK,
> +        .num_multi_bit_features = 1,
> +        .multi_bit_features = (MultiBitFeatureInfo[]){
> +            {
> +                .feat_name = "max_rows",
> +                .mask = CPUID_AMX_PALETTE_1_MAX_ROWS_MASK,
> +                .high_bit_position = 15,
> +                .low_bit_position = 0,
> +            },
> +        },
>       },
>       [FEAT_1E_0_EBX] = {
>           .type = CPUID_FEATURE_WORD,
> @@ -1040,6 +1079,21 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
>           },
>           .migratable_flags = CPUID_AMX_TMUL_MAX_K_MASK |
>               CPUID_AMX_TMUL_MAX_N_MASK,
> +        .num_multi_bit_features = 2,
> +        .multi_bit_features = (MultiBitFeatureInfo[]){
> +            {
> +                .feat_name = "tmul_maxk",
> +                .mask = CPUID_AMX_TMUL_MAX_K_MASK,
> +                .high_bit_position = 7,
> +                .low_bit_position = 0,
> +            },
> +            {
> +                .feat_name = "tmul_maxn",
> +                .mask = CPUID_AMX_TMUL_MAX_N_MASK,
> +                .high_bit_position = 23,
> +                .low_bit_position = 8,
> +            },
> +        },
>       },
>       /*Below are MSR exposed features*/
>       [FEAT_ARCH_CAPABILITIES] = {



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/6] Support for new CPU model SapphireRapids
  2023-02-02 11:05 ` [PATCH v3 0/6] Support for " Igor Mammedov
  2023-02-07  2:50   ` Wang, Lei
@ 2023-02-08 14:53   ` Xiaoyao Li
  2023-03-02 14:49   ` Robert Hoo
  2 siblings, 0 replies; 25+ messages in thread
From: Xiaoyao Li @ 2023-02-08 14:53 UTC (permalink / raw)
  To: Igor Mammedov, Lei Wang
  Cc: pbonzini, qemu-devel, dgilbert, berrange, yang.zhong

On 2/2/2023 7:05 PM, Igor Mammedov wrote:
> On Fri,  6 Jan 2023 00:38:20 -0800
> Lei Wang <lei4.wang@intel.com> wrote:
> 
>> This series aims to add a new CPU model SapphireRapids, and tries to
>> address the problem stated in
>> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#mcf67dbd1ad37c65d7988c36a2b267be9afd2fb30,
>> so that named CPU model can define its own AMX values, and QEMU won't
>> pass the wrong AMX values to KVM in future platforms if they have
>> different values supported.
>>
>> The original patch is
>> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#u.
> 
> MultiBitFeatureInfo looks like an interesting
> idea but among fixing whatever issues this has atm,
> you'd probably need to introduce a new  qdev_bitfield property
> infrastructure so that such features could be treated like any
> other qdev properties.
> Also when MultiBitFeatureInfo is added, one should convert all
> other usecases where it's applicable (not only for new code)
> so that we would end up with consolidated approach instead of
> zoo mess.
> 
> I'm not sure all MultiBitFeatureInfo complexity is necessary
> just for adding a new CPU model, I'd rather use ad hoc approach
> that we were using before for non boolean features.

We have to introduce MultiBitFeatureInfo for SPR cpu model if AMX is 
supposed to be included with SPR cpu model. In fact, MultiBitFeatureInfo 
should have been introduced when adding AMX virtualization support in 
QEMU. I.e., current AMX virtualization design is problematic just like 
Intel-PT virtualization.

Ideally, this series can be split as two: 1) Fix AMX virtualization (by 
introducing MultiBitFeatureInfo), 2) define SPR cpu model.

> 
> And then try to develop MultiBitFeatureInfo & co as a separate
> series to demonstrate how much it will improve current
> cpu models definitions.
> 
> PS:
>   'make check-acceptance' are broken with this
> 
>> ---
>>
>> Changelog:
>>
>> v3:
>>   - Rebase on the latest QEMU (d1852caab131ea898134fdcea8c14bc2ee75fbe9).
>>   - v2: https://lore.kernel.org/qemu-devel/20221102085256.81139-1-lei4.wang@intel.com/
>>
>> v2:
>>   - Fix when passing all zeros of AMX-related CPUID, QEMU will warn
>>     unsupported.
>>   - Remove unnecessary function definition and make code cleaner.
>>   - Fix some typos.
>>   - v1: https://lore.kernel.org/qemu-devel/20221027020036.373140-1-lei4.wang@intel.com/T/#t
>>
>>
>> Lei Wang (6):
>>    i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E
>>    i386: Remove unused parameter "uint32_t bit" in
>>      feature_word_description()
>>    i386: Introduce new struct "MultiBitFeatureInfo" for multi-bit
>>      features
>>    i386: Mask and report unavailable multi-bit feature values
>>    i386: Initialize AMX CPUID leaves with corresponding env->features[]
>>      leaves
>>    i386: Add new CPU model SapphireRapids
>>
>>   target/i386/cpu-internal.h |  11 ++
>>   target/i386/cpu.c          | 311 +++++++++++++++++++++++++++++++++++--
>>   target/i386/cpu.h          |  16 ++
>>   3 files changed, 322 insertions(+), 16 deletions(-)
>>
>>
>> base-commit: d1852caab131ea898134fdcea8c14bc2ee75fbe9
> 



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 4/6] i386: Mask and report unavailable multi-bit feature values
  2023-02-06  7:43   ` Yuan Yao
@ 2023-02-09  1:04     ` Wang, Lei
  2023-02-09  3:29       ` Xiaoyao Li
  0 siblings, 1 reply; 25+ messages in thread
From: Wang, Lei @ 2023-02-09  1:04 UTC (permalink / raw)
  To: Yuan Yao
  Cc: pbonzini, qemu-devel, imammedo, dgilbert, berrange, xiaoyao.li,
	yang.zhong

On 2/6/2023 3:43 PM, Yuan Yao wrote:
> On Fri, Jan 06, 2023 at 12:38:24AM -0800, Lei Wang wrote:
>> Some feature words, e.g., feature words in AMX-related CPUID leaf 0x1D and
>> 0x1E are not bit-wise but multiple bits represents one value. Handle this
>> situation when the values specified are not the same as which are reported
>> by KVM. The handling includes:
>>
>>  - The responsibility of masking bits and giving warnings are delegated to
>>    the feature enabler. A framework is also provided to enable this.
>>  - To simplify the initialization, a default function is provided if the
>>    the function is not specified.
>>
>> The reason why delegating this responsibility rather than just marking
>> them as zeros when they are not same is because different multi-bit
>> features may have different logic, which is case by case, for example:
>>
>>  1. CPUID.0x14_0x1:EBX[15:0]. Even though it's multi-bits field, it's a
>>     bitmap and each bit represents a separate capability.
>>
>>  2. CPUID.0x14_0x1:EAX[2:0] represents the number of configurable Address
>>     Ranges. 3 bits as a whole to represent a integer value. It means the
>>     maximum capability of HW. If KVM reports M, then M to 0 is legal
>>     value to configure (because KVM can emulate each value correctly).
>>
>>  3. CPUID.0x1D_0x1:EAX[31:16] represents palette 1 bytes_per_tile. 16 bits
>>     as a whole represent an integer value. It's not like case 2 and SW
>>     needs to configure the same value as reported. Because it's not
>>     possible for SW to configure to a different value and KVM cannot
>>     emulate it.
>>
>> So marking them blindly as zeros is incorrect, and delegating this
>> responsibility can let each multi-bit feature have its own way to mask bits.
>>
>> Signed-off-by: Lei Wang <lei4.wang@intel.com>
>> ---
>>  target/i386/cpu-internal.h |  2 ++
>>  target/i386/cpu.c          | 36 ++++++++++++++++++++++++++++++++++++
>>  2 files changed, 38 insertions(+)
>>
>> diff --git a/target/i386/cpu-internal.h b/target/i386/cpu-internal.h
>> index 66b3d66cb4..83c7b53926 100644
>> --- a/target/i386/cpu-internal.h
>> +++ b/target/i386/cpu-internal.h
>> @@ -30,6 +30,8 @@ typedef struct MultiBitFeatureInfo {
>>      uint64_t mask;
>>      unsigned high_bit_position;
>>      unsigned low_bit_position;
>> +    void (*mark_unavailable_multi_bit)(X86CPU *cpu, FeatureWord w, int index,
>> +                                       const char *verbose_prefix);
>>  } MultiBitFeatureInfo;
>>
>>  typedef struct FeatureWordInfo {
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index 88aa780566..e638a31d34 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -4377,6 +4377,28 @@ static bool x86_cpu_have_filtered_features(X86CPU *cpu)
>>      return false;
>>  }
>>
>> +static void mark_unavailable_multi_bit_default(X86CPU *cpu, FeatureWord w,
>> +                                               int index,
>> +                                               const char *verbose_prefix)
>> +{
>> +    FeatureWordInfo *f = &feature_word_info[w];
>> +    g_autofree char *feat_word_str = feature_word_description(f);
>> +    uint64_t host_feat = x86_cpu_get_supported_feature_word(w, false);
>> +    MultiBitFeatureInfo mf = f->multi_bit_features[index];
>> +
>> +    if ((cpu->env.features[w] & mf.mask) &&
> 
> With this checking bits are all 0 but covered by mf.mask's range are skippped,
> even if they're different from the host_feat, please check whether it's desried
> behavior.

This is the intended design because there are quite a number of multi-bit CPUIDs
should support passing all 0 to them.

>> +        ((cpu->env.features[w] ^ host_feat) & mf.mask)) {
>> +        if (!cpu->force_features) {
>> +            cpu->env.features[w] &= ~mf.mask;
>> +        }
>> +        cpu->filtered_features[w] |= mf.mask;
>> +        if (verbose_prefix)
>> +            warn_report("%s: %s.%s [%u:%u]", verbose_prefix, feat_word_str,
>> +                        mf.feat_name, mf.high_bit_position,
>> +                        mf.low_bit_position);
>> +    }
>> +}
>> +
>>  static void mark_unavailable_features(X86CPU *cpu, FeatureWord w, uint64_t mask,
>>                                        const char *verbose_prefix)
>>  {
>> @@ -6442,6 +6464,20 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>>              x86_cpu_get_supported_feature_word(w, false);
>>          uint64_t requested_features = env->features[w];
>>          uint64_t unavailable_features = requested_features & ~host_feat;
>> +        FeatureWordInfo f = feature_word_info[w];
>> +        int i;
>> +
>> +        for (i = 0; i < f.num_multi_bit_features; i++) {
>> +            MultiBitFeatureInfo mf = f.multi_bit_features[i];
>> +            if (mf.mark_unavailable_multi_bit) {
>> +                mf.mark_unavailable_multi_bit(cpu, w, i, prefix);
>> +            } else {
>> +                mark_unavailable_multi_bit_default(cpu, w, i, prefix);
>> +            }
>> +
>> +            unavailable_features &= ~mf.mask;
>> +        }
>> +
>>          mark_unavailable_features(cpu, w, unavailable_features, prefix);
>>      }
>>
>> --
>> 2.34.1
>>
>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 4/6] i386: Mask and report unavailable multi-bit feature values
  2023-02-09  1:04     ` Wang, Lei
@ 2023-02-09  3:29       ` Xiaoyao Li
  2023-02-09  4:21         ` Wang, Lei
  0 siblings, 1 reply; 25+ messages in thread
From: Xiaoyao Li @ 2023-02-09  3:29 UTC (permalink / raw)
  To: Wang, Lei, Yuan Yao
  Cc: pbonzini, qemu-devel, imammedo, dgilbert, berrange, yang.zhong

On 2/9/2023 9:04 AM, Wang, Lei wrote:
> On 2/6/2023 3:43 PM, Yuan Yao wrote:
>> On Fri, Jan 06, 2023 at 12:38:24AM -0800, Lei Wang wrote:
>>> Some feature words, e.g., feature words in AMX-related CPUID leaf 0x1D and
>>> 0x1E are not bit-wise but multiple bits represents one value. Handle this
>>> situation when the values specified are not the same as which are reported
>>> by KVM. The handling includes:
>>>
>>>   - The responsibility of masking bits and giving warnings are delegated to
>>>     the feature enabler. A framework is also provided to enable this.
>>>   - To simplify the initialization, a default function is provided if the
>>>     the function is not specified.

What's the behavior of default ? you need to call out it clearly.

>>> The reason why delegating this responsibility rather than just marking
>>> them as zeros when they are not same is because different multi-bit
>>> features may have different logic, which is case by case, for example:
>>>
>>>   1. CPUID.0x14_0x1:EBX[15:0]. Even though it's multi-bits field, it's a
>>>      bitmap and each bit represents a separate capability.
>>>
>>>   2. CPUID.0x14_0x1:EAX[2:0] represents the number of configurable Address
>>>      Ranges. 3 bits as a whole to represent a integer value. It means the
>>>      maximum capability of HW. If KVM reports M, then M to 0 is legal
>>>      value to configure (because KVM can emulate each value correctly).
>>>
>>>   3. CPUID.0x1D_0x1:EAX[31:16] represents palette 1 bytes_per_tile. 16 bits
>>>      as a whole represent an integer value. It's not like case 2 and SW
>>>      needs to configure the same value as reported. Because it's not
>>>      possible for SW to configure to a different value and KVM cannot
>>>      emulate it.
>>>
>>> So marking them blindly as zeros is incorrect, and delegating this
>>> responsibility can let each multi-bit feature have its own way to mask bits.

you can first describe there are such 3 cases of multi-bit features and 
they need different handling for checking whether configured value is 
supported by KVM or not. So introduce a handling callback function that 
each multi-bit feature can implement their own. Meanwhile, provide a 
default handling callback that handles case x: when configured value is 
not the same as the one reported by KVM, clearing it to zero to mark it 
as unavailable.

>>> Signed-off-by: Lei Wang <lei4.wang@intel.com>
>>> ---
>>>   target/i386/cpu-internal.h |  2 ++
>>>   target/i386/cpu.c          | 36 ++++++++++++++++++++++++++++++++++++
>>>   2 files changed, 38 insertions(+)
>>>
>>> diff --git a/target/i386/cpu-internal.h b/target/i386/cpu-internal.h
>>> index 66b3d66cb4..83c7b53926 100644
>>> --- a/target/i386/cpu-internal.h
>>> +++ b/target/i386/cpu-internal.h
>>> @@ -30,6 +30,8 @@ typedef struct MultiBitFeatureInfo {
>>>       uint64_t mask;
>>>       unsigned high_bit_position;
>>>       unsigned low_bit_position;
>>> +    void (*mark_unavailable_multi_bit)(X86CPU *cpu, FeatureWord w, int index,
>>> +                                       const char *verbose_prefix);
>>>   } MultiBitFeatureInfo;
>>>
>>>   typedef struct FeatureWordInfo {
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index 88aa780566..e638a31d34 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -4377,6 +4377,28 @@ static bool x86_cpu_have_filtered_features(X86CPU *cpu)
>>>       return false;
>>>   }
>>>
>>> +static void mark_unavailable_multi_bit_default(X86CPU *cpu, FeatureWord w,
>>> +                                               int index,
>>> +                                               const char *verbose_prefix)
>>> +{
>>> +    FeatureWordInfo *f = &feature_word_info[w];
>>> +    g_autofree char *feat_word_str = feature_word_description(f);
>>> +    uint64_t host_feat = x86_cpu_get_supported_feature_word(w, false);
>>> +    MultiBitFeatureInfo mf = f->multi_bit_features[index];
>>> +
>>> +    if ((cpu->env.features[w] & mf.mask) &&
>>
>> With this checking bits are all 0 but covered by mf.mask's range are skippped,
>> even if they're different from the host_feat, please check whether it's desried
>> behavior.
> 
> This is the intended design because there are quite a number of multi-bit CPUIDs
> should support passing all 0 to them.

you didn't answer the question. The question is why the handling can be 
skipped when the value of multi-bit feature is 0.

>>> +        ((cpu->env.features[w] ^ host_feat) & mf.mask)) {
>>> +        if (!cpu->force_features) {
>>> +            cpu->env.features[w] &= ~mf.mask;
>>> +        }
>>> +        cpu->filtered_features[w] |= mf.mask;
>>> +        if (verbose_prefix)
>>> +            warn_report("%s: %s.%s [%u:%u]", verbose_prefix, feat_word_str,
>>> +                        mf.feat_name, mf.high_bit_position,
>>> +                        mf.low_bit_position);
>>> +    }
>>> +}
>>> +
>>>   static void mark_unavailable_features(X86CPU *cpu, FeatureWord w, uint64_t mask,
>>>                                         const char *verbose_prefix)
>>>   {
>>> @@ -6442,6 +6464,20 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>>>               x86_cpu_get_supported_feature_word(w, false);
>>>           uint64_t requested_features = env->features[w];
>>>           uint64_t unavailable_features = requested_features & ~host_feat;
>>> +        FeatureWordInfo f = feature_word_info[w];
>>> +        int i;
>>> +
>>> +        for (i = 0; i < f.num_multi_bit_features; i++) {
>>> +            MultiBitFeatureInfo mf = f.multi_bit_features[i];
>>> +            if (mf.mark_unavailable_multi_bit) {
>>> +                mf.mark_unavailable_multi_bit(cpu, w, i, prefix);
>>> +            } else {
>>> +                mark_unavailable_multi_bit_default(cpu, w, i, prefix);
>>> +            }
>>> +
>>> +            unavailable_features &= ~mf.mask;
>>> +        }
>>> +
>>>           mark_unavailable_features(cpu, w, unavailable_features, prefix);
>>>       }
>>>
>>> --
>>> 2.34.1
>>>
>>>



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 4/6] i386: Mask and report unavailable multi-bit feature values
  2023-02-09  3:29       ` Xiaoyao Li
@ 2023-02-09  4:21         ` Wang, Lei
  2023-02-09  5:59           ` Xiaoyao Li
  0 siblings, 1 reply; 25+ messages in thread
From: Wang, Lei @ 2023-02-09  4:21 UTC (permalink / raw)
  To: Xiaoyao Li, Yuan Yao
  Cc: pbonzini, qemu-devel, imammedo, dgilbert, berrange, yang.zhong

On 2/9/2023 11:29 AM, Xiaoyao Li wrote:
> On 2/9/2023 9:04 AM, Wang, Lei wrote:
>> On 2/6/2023 3:43 PM, Yuan Yao wrote:
>>> On Fri, Jan 06, 2023 at 12:38:24AM -0800, Lei Wang wrote:
>>>> Some feature words, e.g., feature words in AMX-related CPUID leaf 0x1D and
>>>> 0x1E are not bit-wise but multiple bits represents one value. Handle this
>>>> situation when the values specified are not the same as which are reported
>>>> by KVM. The handling includes:
>>>>
>>>>   - The responsibility of masking bits and giving warnings are delegated to
>>>>     the feature enabler. A framework is also provided to enable this.
>>>>   - To simplify the initialization, a default function is provided if the
>>>>     the function is not specified.
> 
> What's the behavior of default ? you need to call out it clearly.
> 
>>>> The reason why delegating this responsibility rather than just marking
>>>> them as zeros when they are not same is because different multi-bit
>>>> features may have different logic, which is case by case, for example:
>>>>
>>>>   1. CPUID.0x14_0x1:EBX[15:0]. Even though it's multi-bits field, it's a
>>>>      bitmap and each bit represents a separate capability.
>>>>
>>>>   2. CPUID.0x14_0x1:EAX[2:0] represents the number of configurable Address
>>>>      Ranges. 3 bits as a whole to represent a integer value. It means the
>>>>      maximum capability of HW. If KVM reports M, then M to 0 is legal
>>>>      value to configure (because KVM can emulate each value correctly).
>>>>
>>>>   3. CPUID.0x1D_0x1:EAX[31:16] represents palette 1 bytes_per_tile. 16 bits
>>>>      as a whole represent an integer value. It's not like case 2 and SW
>>>>      needs to configure the same value as reported. Because it's not
>>>>      possible for SW to configure to a different value and KVM cannot
>>>>      emulate it.
>>>>
>>>> So marking them blindly as zeros is incorrect, and delegating this
>>>> responsibility can let each multi-bit feature have its own way to mask bits.
> 
> you can first describe there are such 3 cases of multi-bit features and they
> need different handling for checking whether configured value is supported by
> KVM or not. So introduce a handling callback function that each multi-bit
> feature can implement their own. Meanwhile, provide a default handling callback
> that handles case x: when configured value is not the same as the one reported
> by KVM, clearing it to zero to mark it as unavailable.

OK, will rephrase the commit message.

> 
>>>> Signed-off-by: Lei Wang <lei4.wang@intel.com>
>>>> ---
>>>>   target/i386/cpu-internal.h |  2 ++
>>>>   target/i386/cpu.c          | 36 ++++++++++++++++++++++++++++++++++++
>>>>   2 files changed, 38 insertions(+)
>>>>
>>>> diff --git a/target/i386/cpu-internal.h b/target/i386/cpu-internal.h
>>>> index 66b3d66cb4..83c7b53926 100644
>>>> --- a/target/i386/cpu-internal.h
>>>> +++ b/target/i386/cpu-internal.h
>>>> @@ -30,6 +30,8 @@ typedef struct MultiBitFeatureInfo {
>>>>       uint64_t mask;
>>>>       unsigned high_bit_position;
>>>>       unsigned low_bit_position;
>>>> +    void (*mark_unavailable_multi_bit)(X86CPU *cpu, FeatureWord w, int index,
>>>> +                                       const char *verbose_prefix);
>>>>   } MultiBitFeatureInfo;
>>>>
>>>>   typedef struct FeatureWordInfo {
>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>> index 88aa780566..e638a31d34 100644
>>>> --- a/target/i386/cpu.c
>>>> +++ b/target/i386/cpu.c
>>>> @@ -4377,6 +4377,28 @@ static bool x86_cpu_have_filtered_features(X86CPU *cpu)
>>>>       return false;
>>>>   }
>>>>
>>>> +static void mark_unavailable_multi_bit_default(X86CPU *cpu, FeatureWord w,
>>>> +                                               int index,
>>>> +                                               const char *verbose_prefix)
>>>> +{
>>>> +    FeatureWordInfo *f = &feature_word_info[w];
>>>> +    g_autofree char *feat_word_str = feature_word_description(f);
>>>> +    uint64_t host_feat = x86_cpu_get_supported_feature_word(w, false);
>>>> +    MultiBitFeatureInfo mf = f->multi_bit_features[index];
>>>> +
>>>> +    if ((cpu->env.features[w] & mf.mask) &&
>>>
>>> With this checking bits are all 0 but covered by mf.mask's range are skippped,
>>> even if they're different from the host_feat, please check whether it's desried
>>> behavior.
>>
>> This is the intended design because there are quite a number of multi-bit CPUIDs
>> should support passing all 0 to them.
> 
> you didn't answer the question. The question is why the handling can be skipped
> when the value of multi-bit feature is 0.

I think the default function should handle the most common case, which is,
passing all 0 multi-bits to KVM is valid and shouldn't be warned. E.g, when we
are using some earlier CPU models which doesn't support AMX, we shouldn't be
warned that all 0 values don't match the CPUID reported by KVM.

> 
>>>> +        ((cpu->env.features[w] ^ host_feat) & mf.mask)) {
>>>> +        if (!cpu->force_features) {
>>>> +            cpu->env.features[w] &= ~mf.mask;
>>>> +        }
>>>> +        cpu->filtered_features[w] |= mf.mask;
>>>> +        if (verbose_prefix)
>>>> +            warn_report("%s: %s.%s [%u:%u]", verbose_prefix, feat_word_str,
>>>> +                        mf.feat_name, mf.high_bit_position,
>>>> +                        mf.low_bit_position);
>>>> +    }
>>>> +}
>>>> +
>>>>   static void mark_unavailable_features(X86CPU *cpu, FeatureWord w, uint64_t
>>>> mask,
>>>>                                         const char *verbose_prefix)
>>>>   {
>>>> @@ -6442,6 +6464,20 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool
>>>> verbose)
>>>>               x86_cpu_get_supported_feature_word(w, false);
>>>>           uint64_t requested_features = env->features[w];
>>>>           uint64_t unavailable_features = requested_features & ~host_feat;
>>>> +        FeatureWordInfo f = feature_word_info[w];
>>>> +        int i;
>>>> +
>>>> +        for (i = 0; i < f.num_multi_bit_features; i++) {
>>>> +            MultiBitFeatureInfo mf = f.multi_bit_features[i];
>>>> +            if (mf.mark_unavailable_multi_bit) {
>>>> +                mf.mark_unavailable_multi_bit(cpu, w, i, prefix);
>>>> +            } else {
>>>> +                mark_unavailable_multi_bit_default(cpu, w, i, prefix);
>>>> +            }
>>>> +
>>>> +            unavailable_features &= ~mf.mask;
>>>> +        }
>>>> +
>>>>           mark_unavailable_features(cpu, w, unavailable_features, prefix);
>>>>       }
>>>>
>>>> -- 
>>>> 2.34.1
>>>>
>>>>
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 4/6] i386: Mask and report unavailable multi-bit feature values
  2023-02-09  4:21         ` Wang, Lei
@ 2023-02-09  5:59           ` Xiaoyao Li
  2023-02-09  6:15             ` Wang, Lei
  0 siblings, 1 reply; 25+ messages in thread
From: Xiaoyao Li @ 2023-02-09  5:59 UTC (permalink / raw)
  To: Wang, Lei, Yuan Yao
  Cc: pbonzini, qemu-devel, imammedo, dgilbert, berrange, yang.zhong

On 2/9/2023 12:21 PM, Wang, Lei wrote:
>>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>>> index 88aa780566..e638a31d34 100644
>>>>> --- a/target/i386/cpu.c
>>>>> +++ b/target/i386/cpu.c
>>>>> @@ -4377,6 +4377,28 @@ static bool x86_cpu_have_filtered_features(X86CPU *cpu)
>>>>>        return false;
>>>>>    }
>>>>>
>>>>> +static void mark_unavailable_multi_bit_default(X86CPU *cpu, FeatureWord w,
>>>>> +                                               int index,
>>>>> +                                               const char *verbose_prefix)
>>>>> +{
>>>>> +    FeatureWordInfo *f = &feature_word_info[w];
>>>>> +    g_autofree char *feat_word_str = feature_word_description(f);
>>>>> +    uint64_t host_feat = x86_cpu_get_supported_feature_word(w, false);
>>>>> +    MultiBitFeatureInfo mf = f->multi_bit_features[index];
>>>>> +
>>>>> +    if ((cpu->env.features[w] & mf.mask) &&
>>>> With this checking bits are all 0 but covered by mf.mask's range are skippped,
>>>> even if they're different from the host_feat, please check whether it's desried
>>>> behavior.
>>> This is the intended design because there are quite a number of multi-bit CPUIDs
>>> should support passing all 0 to them.
>> you didn't answer the question. The question is why the handling can be skipped
>> when the value of multi-bit feature is 0.
> I think the default function should handle the most common case, which is,
> passing all 0 multi-bits to KVM is valid and shouldn't be warned. E.g, when we
> are using some earlier CPU models which doesn't support AMX, we shouldn't be
> warned that all 0 values don't match the CPUID reported by KVM.
> 

passing value 0 to KVM is valid, is not the reason why the handling can 
be skipped.

The correct reason is that when value is 0, it means the multi-bit 
feature is not enabled/requested. It's always legal and doesn't need any 
handling. So it can be just skipped.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 4/6] i386: Mask and report unavailable multi-bit feature values
  2023-02-09  5:59           ` Xiaoyao Li
@ 2023-02-09  6:15             ` Wang, Lei
  2023-02-09  9:26               ` Xiaoyao Li
  0 siblings, 1 reply; 25+ messages in thread
From: Wang, Lei @ 2023-02-09  6:15 UTC (permalink / raw)
  To: Xiaoyao Li, Yuan Yao
  Cc: pbonzini, qemu-devel, imammedo, dgilbert, berrange, yang.zhong

On 2/9/2023 1:59 PM, Xiaoyao Li wrote:
> On 2/9/2023 12:21 PM, Wang, Lei wrote:
>>>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>>>> index 88aa780566..e638a31d34 100644
>>>>>> --- a/target/i386/cpu.c
>>>>>> +++ b/target/i386/cpu.c
>>>>>> @@ -4377,6 +4377,28 @@ static bool x86_cpu_have_filtered_features(X86CPU
>>>>>> *cpu)
>>>>>>        return false;
>>>>>>    }
>>>>>>
>>>>>> +static void mark_unavailable_multi_bit_default(X86CPU *cpu, FeatureWord w,
>>>>>> +                                               int index,
>>>>>> +                                               const char *verbose_prefix)
>>>>>> +{
>>>>>> +    FeatureWordInfo *f = &feature_word_info[w];
>>>>>> +    g_autofree char *feat_word_str = feature_word_description(f);
>>>>>> +    uint64_t host_feat = x86_cpu_get_supported_feature_word(w, false);
>>>>>> +    MultiBitFeatureInfo mf = f->multi_bit_features[index];
>>>>>> +
>>>>>> +    if ((cpu->env.features[w] & mf.mask) &&
>>>>> With this checking bits are all 0 but covered by mf.mask's range are skippped,
>>>>> even if they're different from the host_feat, please check whether it's
>>>>> desried
>>>>> behavior.
>>>> This is the intended design because there are quite a number of multi-bit
>>>> CPUIDs
>>>> should support passing all 0 to them.
>>> you didn't answer the question. The question is why the handling can be skipped
>>> when the value of multi-bit feature is 0.
>> I think the default function should handle the most common case, which is,
>> passing all 0 multi-bits to KVM is valid and shouldn't be warned. E.g, when we
>> are using some earlier CPU models which doesn't support AMX, we shouldn't be
>> warned that all 0 values don't match the CPUID reported by KVM.
>>
> 
> passing value 0 to KVM is valid, is not the reason why the handling can be skipped.
> 
> The correct reason is that when value is 0, it means the multi-bit feature is
> not enabled/requested. It's always legal and doesn't need any handling. So it
> can be just skipped.

Currently, we can say yes, but I doubt if there will be a multi-bit field in the
future that only accepts the non-zero value specified.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 4/6] i386: Mask and report unavailable multi-bit feature values
  2023-02-09  6:15             ` Wang, Lei
@ 2023-02-09  9:26               ` Xiaoyao Li
  0 siblings, 0 replies; 25+ messages in thread
From: Xiaoyao Li @ 2023-02-09  9:26 UTC (permalink / raw)
  To: Wang, Lei, Yuan Yao
  Cc: pbonzini, qemu-devel, imammedo, dgilbert, berrange, yang.zhong

On 2/9/2023 2:15 PM, Wang, Lei wrote:
> On 2/9/2023 1:59 PM, Xiaoyao Li wrote:
>> On 2/9/2023 12:21 PM, Wang, Lei wrote:
>>>>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>>>>> index 88aa780566..e638a31d34 100644
>>>>>>> --- a/target/i386/cpu.c
>>>>>>> +++ b/target/i386/cpu.c
>>>>>>> @@ -4377,6 +4377,28 @@ static bool x86_cpu_have_filtered_features(X86CPU
>>>>>>> *cpu)
>>>>>>>         return false;
>>>>>>>     }
>>>>>>>
>>>>>>> +static void mark_unavailable_multi_bit_default(X86CPU *cpu, FeatureWord w,
>>>>>>> +                                               int index,
>>>>>>> +                                               const char *verbose_prefix)
>>>>>>> +{
>>>>>>> +    FeatureWordInfo *f = &feature_word_info[w];
>>>>>>> +    g_autofree char *feat_word_str = feature_word_description(f);
>>>>>>> +    uint64_t host_feat = x86_cpu_get_supported_feature_word(w, false);
>>>>>>> +    MultiBitFeatureInfo mf = f->multi_bit_features[index];
>>>>>>> +
>>>>>>> +    if ((cpu->env.features[w] & mf.mask) &&
>>>>>> With this checking bits are all 0 but covered by mf.mask's range are skippped,
>>>>>> even if they're different from the host_feat, please check whether it's
>>>>>> desried
>>>>>> behavior.
>>>>> This is the intended design because there are quite a number of multi-bit
>>>>> CPUIDs
>>>>> should support passing all 0 to them.
>>>> you didn't answer the question. The question is why the handling can be skipped
>>>> when the value of multi-bit feature is 0.
>>> I think the default function should handle the most common case, which is,
>>> passing all 0 multi-bits to KVM is valid and shouldn't be warned. E.g, when we
>>> are using some earlier CPU models which doesn't support AMX, we shouldn't be
>>> warned that all 0 values don't match the CPUID reported by KVM.
>>>
>>
>> passing value 0 to KVM is valid, is not the reason why the handling can be skipped.
>>
>> The correct reason is that when value is 0, it means the multi-bit feature is
>> not enabled/requested. It's always legal and doesn't need any handling. So it
>> can be just skipped.
> 
> Currently, we can say yes, but I doubt if there will be a multi-bit field in the
> future that only accepts the non-zero value specified.

If so, then the multi-bit field cannot fall into the default handling 
category and it requires its own special handling callback.

It's also the reason I asked "What's the behavior of default ? you need 
to call out it clearly."

This patch should define well the behavior of default handler and 
describe it in the commit message. Moreover, it's better to comment it 
as well above the implementation of mark_unavailable_multi_bit_default()


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/6] Support for new CPU model SapphireRapids
  2023-02-02 11:05 ` [PATCH v3 0/6] Support for " Igor Mammedov
  2023-02-07  2:50   ` Wang, Lei
  2023-02-08 14:53   ` Xiaoyao Li
@ 2023-03-02 14:49   ` Robert Hoo
  2 siblings, 0 replies; 25+ messages in thread
From: Robert Hoo @ 2023-03-02 14:49 UTC (permalink / raw)
  To: Igor Mammedov, Lei Wang
  Cc: pbonzini, qemu-devel, dgilbert, berrange, xiaoyao.li, yang.zhong

On Thu, 2023-02-02 at 12:05 +0100, Igor Mammedov wrote:
> MultiBitFeatureInfo looks like an interesting
> idea 

Yeah, we can feel how much effort Lei spent on this.

> but among fixing whatever issues this has atm,
> you'd probably need to introduce a new  qdev_bitfield property
> infrastructure so that such features could be treated like any
> other qdev properties.
> Also when MultiBitFeatureInfo is added, one should convert all
> other usecases where it's applicable (not only for new code)
> so that we would end up with consolidated approach instead of
> zoo mess.
> 
> I'm not sure all MultiBitFeatureInfo complexity is necessary

Kinda ack.

> just for adding a new CPU model, I'd rather use ad hoc approach
> that we were using before for non boolean features.
> 
> And then try to develop MultiBitFeatureInfo & co as a separate
> series to demonstrate how much it will improve current
> cpu models definitions.
> 

CPUID word isn't always bit wise, i.e. each bit representing a feature,
this isn't new.

e.g.
CPUID.1H.EBX[bit23,16] -- Maximum number of addressable IDs for logical
processors in this physical package
CPUID.04H
etc.

And interestingly, we can see that among so many CPUID leaves (which in
turn contain *words* of EAX, EBX, ECX, EDX), only a few has a
corresponding feature word defined in 

typedef enum FeatureWord {
    FEAT_1_EDX,
    FEAT_1_ECX,
    ...
}

Why?

Those CPUID returns are not *feature words(names)*, they're numbers to
decode, strings to interpreted, etc. So does this CPUID.1DH/1EH, I
think.
Why cannot handle them like handling CPUID.04H?



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/6] Support for new CPU model SapphireRapids
  2023-02-07  2:50   ` Wang, Lei
@ 2023-03-06 12:49     ` Igor Mammedov
  0 siblings, 0 replies; 25+ messages in thread
From: Igor Mammedov @ 2023-03-06 12:49 UTC (permalink / raw)
  To: Wang, Lei
  Cc: pbonzini, qemu-devel, dgilbert, berrange, xiaoyao.li, yang.zhong

On Tue, 7 Feb 2023 10:50:56 +0800
"Wang, Lei" <lei4.wang@intel.com> wrote:

> On 2/2/2023 7:05 PM, Igor Mammedov wrote:
> > On Fri,  6 Jan 2023 00:38:20 -0800
> > Lei Wang <lei4.wang@intel.com> wrote:
> >   
> >> This series aims to add a new CPU model SapphireRapids, and tries to
> >> address the problem stated in
> >> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#mcf67dbd1ad37c65d7988c36a2b267be9afd2fb30,
> >> so that named CPU model can define its own AMX values, and QEMU won't
> >> pass the wrong AMX values to KVM in future platforms if they have
> >> different values supported.
> >>
> >> The original patch is
> >> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#u.  
> > 
> > MultiBitFeatureInfo looks like an interesting
> > idea but among fixing whatever issues this has atm,
> > you'd probably need to introduce a new  qdev_bitfield property
> > infrastructure so that such features could be treated like any
> > other qdev properties.
> > Also when MultiBitFeatureInfo is added, one should convert all
> > other usecases where it's applicable (not only for new code)
> > so that we would end up with consolidated approach instead of
> > zoo mess.
> > 
> > I'm not sure all MultiBitFeatureInfo complexity is necessary
> > just for adding a new CPU model, I'd rather use ad hoc approach
> > that we were using before for non boolean features.  
> 
> Hi, Igor. I do not quite understand what does the "ad hoc approach" mean,

by ah hoc I've mean instead of introducing MultiBitFeatureInfo
try to opencode fixups and checks for AMX properties.
(we do have a number of of such cpuid 'features')
For example look at [x]level (it's just a case MultiBitFeatureInfo that takes 32 bits).
Yes that would be ugly but much less complicated than new infrastructure.

And when/if  MultiBitFeatureInfo is ready for usage, you can convert
cpu models to it (AMX and all other 'legacy' features that were open coded).

> currently if we specify a multi-bit non-boolean CPUID value which is different
> from the host value to CPU model, e.g., consider the following scenario:
> 
> - KVM **ONLY** supports value 5 (101) and,
> - QEMU user want to pass value 3 (011) to it,
> 
> and follow the current logic:
> 
>     uint64_t unavailable_features = requested_features & ~host_feat;
> 
> then:
> 
> 1. The warning message will be messy and not intuitive:
> 
> requested_features bit 1 is 1 and host_feat bit 1 is 0, so it will warn on this
> non-sense bit.
> 
> 
> 2. Some CPUID bits will "leak" into the final CPUID passed to KVM:
> 
> requested_features bit 0 is 1 and host_feat bit 0 is also 1, so it will pass
> this CPUID bit to host, the request_features value is 3 (011), finally we get 1
> (001), this is not we want.
> 
> Am I understanding it correctly?
> 
> > 
> > And then try to develop MultiBitFeatureInfo & co as a separate
> > series to demonstrate how much it will improve current
> > cpu models definitions.
> > 
> > PS:
> >  'make check-acceptance' are broken with this
> >   
> >> ---
> >>
> >> Changelog:
> >>
> >> v3:
> >>  - Rebase on the latest QEMU (d1852caab131ea898134fdcea8c14bc2ee75fbe9).
> >>  - v2: https://lore.kernel.org/qemu-devel/20221102085256.81139-1-lei4.wang@intel.com/
> >>
> >> v2:
> >>  - Fix when passing all zeros of AMX-related CPUID, QEMU will warn
> >>    unsupported.
> >>  - Remove unnecessary function definition and make code cleaner.
> >>  - Fix some typos.
> >>  - v1: https://lore.kernel.org/qemu-devel/20221027020036.373140-1-lei4.wang@intel.com/T/#t
> >>
> >>
> >> Lei Wang (6):
> >>   i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E
> >>   i386: Remove unused parameter "uint32_t bit" in
> >>     feature_word_description()
> >>   i386: Introduce new struct "MultiBitFeatureInfo" for multi-bit
> >>     features
> >>   i386: Mask and report unavailable multi-bit feature values
> >>   i386: Initialize AMX CPUID leaves with corresponding env->features[]
> >>     leaves
> >>   i386: Add new CPU model SapphireRapids
> >>
> >>  target/i386/cpu-internal.h |  11 ++
> >>  target/i386/cpu.c          | 311 +++++++++++++++++++++++++++++++++++--
> >>  target/i386/cpu.h          |  16 ++
> >>  3 files changed, 322 insertions(+), 16 deletions(-)
> >>
> >>
> >> base-commit: d1852caab131ea898134fdcea8c14bc2ee75fbe9  
> >   
> 



^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2023-03-06 12:49 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-06  8:38 [PATCH v3 0/6] Support for new CPU model SapphireRapids Lei Wang
2023-01-06  8:38 ` [PATCH v3 1/6] i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E Lei Wang
2023-02-06  7:45   ` Yuan Yao
2023-01-06  8:38 ` [PATCH v3 2/6] i386: Remove unused parameter "uint32_t bit" in feature_word_description() Lei Wang
2023-01-27 13:29   ` Igor Mammedov
2023-02-08 14:33   ` Xiaoyao Li
2023-01-06  8:38 ` [PATCH v3 3/6] i386: Introduce new struct "MultiBitFeatureInfo" for multi-bit features Lei Wang
2023-02-08 14:39   ` Xiaoyao Li
2023-01-06  8:38 ` [PATCH v3 4/6] i386: Mask and report unavailable multi-bit feature values Lei Wang
2023-02-06  7:43   ` Yuan Yao
2023-02-09  1:04     ` Wang, Lei
2023-02-09  3:29       ` Xiaoyao Li
2023-02-09  4:21         ` Wang, Lei
2023-02-09  5:59           ` Xiaoyao Li
2023-02-09  6:15             ` Wang, Lei
2023-02-09  9:26               ` Xiaoyao Li
2023-01-06  8:38 ` [PATCH v3 5/6] i386: Initialize AMX CPUID leaves with corresponding env->features[] leaves Lei Wang
2023-01-06  8:38 ` [PATCH v3 6/6] i386: Add new CPU model SapphireRapids Lei Wang
2023-02-02 10:40   ` Igor Mammedov
2023-02-03  6:02     ` Wang, Lei
2023-02-02 11:05 ` [PATCH v3 0/6] Support for " Igor Mammedov
2023-02-07  2:50   ` Wang, Lei
2023-03-06 12:49     ` Igor Mammedov
2023-02-08 14:53   ` Xiaoyao Li
2023-03-02 14:49   ` Robert Hoo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.