All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
@ 2016-11-18 10:36 Laszlo Ersek
  2016-11-18 10:36 ` [Qemu-devel] [PATCH v3 for-2.9 1/3] hw/isa/apm: introduce callback for APM_STS_IOPORT writes Laszlo Ersek
                   ` (4 more replies)
  0 siblings, 5 replies; 27+ messages in thread
From: Laszlo Ersek @ 2016-11-18 10:36 UTC (permalink / raw)
  To: qemu devel list
  Cc: Kevin O'Connor, Michael S. Tsirkin, Gerd Hoffmann, Paolo Bonzini

This is v3 of the series, with updates based on the v2 discussion:
<http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg02687.html>.

I've added feature negotiation via the APM_STS ("scratchpad") register.
A new spec file called "docs/specs/q35-apm-sts.txt" is included.

Tested with new OVMF patches (about to send out those as well).
Regression tested with SeaBIOS (beyond simple functional tests with
maximum SeaBIOS logging enabled, I used gdb to step through the new
ich9_apm_status_changed() callback to see if it was behaving compatibly
with SeaBIOS).

The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
crashes very quickly for me when running OVMF:

  kvm_io_ioeventfd_add: error adding ioeventfd: File exists

It is my understanding that there are patches on the list for this:

  [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes

Anyway, the series rebases to v2.8.0-rc0 without as much as context
differences.

Cc: "Kevin O'Connor" <kevin@koconnor.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>

Thanks
Laszlo

Laszlo Ersek (3):
  hw/isa/apm: introduce callback for APM_STS_IOPORT writes
  hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS
  hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs

 docs/specs/q35-apm-sts.txt | 80 ++++++++++++++++++++++++++++++++++++++++++++++
 include/hw/i386/ich9.h     |  9 ++++++
 include/hw/isa/apm.h       |  9 +++---
 hw/acpi/piix4.c            |  2 +-
 hw/isa/apm.c               | 15 ++++++---
 hw/isa/lpc_ich9.c          | 64 +++++++++++++++++++++++++++++++++++--
 hw/isa/vt82c686.c          |  2 +-
 7 files changed, 168 insertions(+), 13 deletions(-)
 create mode 100644 docs/specs/q35-apm-sts.txt

-- 
2.9.2

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH v3 for-2.9 1/3] hw/isa/apm: introduce callback for APM_STS_IOPORT writes
  2016-11-18 10:36 [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI Laszlo Ersek
@ 2016-11-18 10:36 ` Laszlo Ersek
  2016-11-18 10:36 ` [Qemu-devel] [PATCH v3 for-2.9 2/3] hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS Laszlo Ersek
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 27+ messages in thread
From: Laszlo Ersek @ 2016-11-18 10:36 UTC (permalink / raw)
  To: qemu devel list
  Cc: Kevin O'Connor, Michael S. Tsirkin, Gerd Hoffmann, Paolo Bonzini

In the following patches we'll add custom handling for when APM_STS_IOPORT
is written. Introduce a dedicated callback similar to the one that is now
called for APM_CNT_IOPORT writes.

Cc: "Kevin O'Connor" <kevin@koconnor.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---

Notes:
    v3:
    - feature negotiation is new in v3 [Paolo, Michael]

 include/hw/isa/apm.h |  9 +++++----
 hw/acpi/piix4.c      |  2 +-
 hw/isa/apm.c         | 15 ++++++++++-----
 hw/isa/lpc_ich9.c    |  2 +-
 hw/isa/vt82c686.c    |  2 +-
 5 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/include/hw/isa/apm.h b/include/hw/isa/apm.h
index 4839ff1df252..8fc83addde4e 100644
--- a/include/hw/isa/apm.h
+++ b/include/hw/isa/apm.h
@@ -5,19 +5,20 @@
 #include "hw/hw.h"
 #include "exec/memory.h"
 
-typedef void (*apm_ctrl_changed_t)(uint32_t val, void *arg);
+typedef void (*apm_reg_changed_t)(uint32_t val, void *arg);
 
 typedef struct APMState {
     uint8_t apmc;
     uint8_t apms;
 
-    apm_ctrl_changed_t callback;
+    apm_reg_changed_t cnt_callback;
+    apm_reg_changed_t sts_callback;
     void *arg;
     MemoryRegion io;
 } APMState;
 
-void apm_init(PCIDevice *dev, APMState *s, apm_ctrl_changed_t callback,
-              void *arg);
+void apm_init(PCIDevice *dev, APMState *s, apm_reg_changed_t cnt_callback,
+              apm_reg_changed_t sts_callback, void *arg);
 
 extern const VMStateDescription vmstate_apm;
 
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 2adc246b0044..bc4087f5f6ef 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -493,7 +493,7 @@ static void piix4_pm_realize(PCIDevice *dev, Error **errp)
     pci_conf[0x3d] = 0x01; // interrupt pin 1
 
     /* APM */
-    apm_init(dev, &s->apm, apm_ctrl_changed, s);
+    apm_init(dev, &s->apm, apm_ctrl_changed, NULL, s);
 
     if (!s->smm_enabled) {
         /* Mark SMM as already inited to prevent SMM from running.  KVM does not
diff --git a/hw/isa/apm.c b/hw/isa/apm.c
index e232b0da033a..67e05b9932d6 100644
--- a/hw/isa/apm.c
+++ b/hw/isa/apm.c
@@ -47,11 +47,15 @@ static void apm_ioport_writeb(void *opaque, hwaddr addr, uint64_t val,
     if (addr == 0) {
         apm->apmc = val;
 
-        if (apm->callback) {
-            (apm->callback)(val, apm->arg);
+        if (apm->cnt_callback) {
+            apm->cnt_callback(val, apm->arg);
         }
     } else {
         apm->apms = val;
+
+        if (apm->sts_callback) {
+            apm->sts_callback(val, apm->arg);
+        }
     }
 }
 
@@ -90,10 +94,11 @@ static const MemoryRegionOps apm_ops = {
     },
 };
 
-void apm_init(PCIDevice *dev, APMState *apm, apm_ctrl_changed_t callback,
-              void *arg)
+void apm_init(PCIDevice *dev, APMState *apm, apm_reg_changed_t cnt_callback,
+              apm_reg_changed_t sts_callback, void *arg)
 {
-    apm->callback = callback;
+    apm->cnt_callback = cnt_callback;
+    apm->sts_callback = sts_callback;
     apm->arg = arg;
 
     /* ioport 0xb2, 0xb3 */
diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
index 10d1ee8b9310..e3556c9f9eae 100644
--- a/hw/isa/lpc_ich9.c
+++ b/hw/isa/lpc_ich9.c
@@ -634,7 +634,7 @@ static void ich9_lpc_realize(PCIDevice *d, Error **errp)
     lpc->isa_bus = isa_bus;
 
     ich9_cc_init(lpc);
-    apm_init(d, &lpc->apm, ich9_apm_ctrl_changed, lpc);
+    apm_init(d, &lpc->apm, ich9_apm_ctrl_changed, NULL, lpc);
 
     lpc->machine_ready.notify = ich9_lpc_machine_ready;
     qemu_add_machine_init_done_notifier(&lpc->machine_ready);
diff --git a/hw/isa/vt82c686.c b/hw/isa/vt82c686.c
index 41d5254f8e55..eb5258597b1d 100644
--- a/hw/isa/vt82c686.c
+++ b/hw/isa/vt82c686.c
@@ -365,7 +365,7 @@ static void vt82c686b_pm_realize(PCIDevice *dev, Error **errp)
     pm_smbus_init(&s->dev.qdev, &s->smb);
     memory_region_add_subregion(get_system_io(), s->smb_io_base, &s->smb.io);
 
-    apm_init(dev, &s->apm, NULL, s);
+    apm_init(dev, &s->apm, NULL, NULL, s);
 
     memory_region_init(&s->io, OBJECT(dev), "vt82c686-pm", 64);
     memory_region_set_enabled(&s->io, false);
-- 
2.9.2

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH v3 for-2.9 2/3] hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS
  2016-11-18 10:36 [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI Laszlo Ersek
  2016-11-18 10:36 ` [Qemu-devel] [PATCH v3 for-2.9 1/3] hw/isa/apm: introduce callback for APM_STS_IOPORT writes Laszlo Ersek
@ 2016-11-18 10:36 ` Laszlo Ersek
  2016-11-18 10:36 ` [Qemu-devel] [PATCH v3 for-2.9 3/3] hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs Laszlo Ersek
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 27+ messages in thread
From: Laszlo Ersek @ 2016-11-18 10:36 UTC (permalink / raw)
  To: qemu devel list
  Cc: Kevin O'Connor, Michael S. Tsirkin, Gerd Hoffmann, Paolo Bonzini

Define and implement non-standard feature control for SMI handling via the
APM_STS register, such that it remains compatible with SeaBIOS's current
access pattern, and enables OVMF to negotiate features.

Cc: "Kevin O'Connor" <kevin@koconnor.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---

Notes:
    v3:
    - feature negotiation is new in v3 [Paolo, Michael]

 docs/specs/q35-apm-sts.txt | 71 ++++++++++++++++++++++++++++++++++++++++++++++
 include/hw/i386/ich9.h     |  8 ++++++
 hw/isa/lpc_ich9.c          | 54 ++++++++++++++++++++++++++++++++++-
 3 files changed, 132 insertions(+), 1 deletion(-)
 create mode 100644 docs/specs/q35-apm-sts.txt

diff --git a/docs/specs/q35-apm-sts.txt b/docs/specs/q35-apm-sts.txt
new file mode 100644
index 000000000000..cdffb6834380
--- /dev/null
+++ b/docs/specs/q35-apm-sts.txt
@@ -0,0 +1,71 @@
+According to the ICH9 specification, the APM_STS (Advanced Power Management
+Status Port) Register, at IO Port 0xB3, is a scratchpad register for passing
+data between the software agent that raises the synchronous SMI, and the SMI
+handler routine in the firmware. It is not coupled to any hardware
+functionality. The register is also known as SMI Data Port.
+
+While the ACPI FADT exposes the SMI Command Port (APM_CNT) to the OS, the Data
+Port is not exposed.
+
+SeaBIOS uses the Data Port values 0x00 and 0x01 in its SMI handler. SeaBIOS
+also relies on the original behavior of QEMU, where writing to the Command
+Port (APM_CNT) would raise the SMI only on the processor writing the port.
+
+OVMF needs dynamically detectable and settable features for SMI behavior. We
+can use the remaining bits in APM_STS for this, without disturbing SeaBIOS.
+
+The following describes the (non-standard) bit definitions in APM_STS.
+
+    7   6   5   4   3   2   1   0
+  +---+---+---+---+---+---+---+---+
+  |   |   |   |   |   |   |   |   |
+  +---+---+---+---+---+---+---+---+
+    ^   ^   ^   ^   ^   ^   ^   ^
+    |   |   |   |   |   |   |   Transparent bit. Whatever bit the firmware
+    |   |   |   |   |   |   |   writes in this position has no effect on
+    |   |   |   |   |   |   |   hardware, and can be read back identically.
+    |   |   |   |   |   |   |
+    |   |   |   |   |   |   Feature negotiation bit.
+    |   |   |   |   |   |
+    Feature bits. All reserved at the moment.
+
+Feature negotiation
+-------------------
+
+Firmware that is enlightened about this feature shall write 1 to the feature
+negotiation bit first (clearing all other bits), then read back the APM_STS
+register. If the feature negotiation bit is set in the result, then QEMU lacks
+the feature negotiation feature, and APM_STS is entirely transparent. Otherwise
+(i.e., the feature negotiation bit is clear in the result), the more
+significant bits (the feature bits) expose the features supported by QEMU. At
+the moment, no features are defined, and all feature bits read as zero.
+
+Once firmware confirms feature negotiation is available, it shall set (select)
+a subset of the advertised feature bits, and clear the feature negotiation bit,
+in a single write to the APM_STS register. In order to verify the result, the
+firmware shall read back the APM_STS register. If the feature negotiation bit
+is clear in the result, then the feature selection was successful. Otherwise,
+the requested features either weren't a subset of the available features, or
+constituted an inconsistent group of features (breaking inter-feature
+dependencies, for example). Regardless of the feature negotiation bit in the
+read back value, the higher order bits (i.e., the individual feature bits) are
+always zero in that value.
+
+SeaBIOS compatibility
+---------------------
+
+Writing 0x00 and 0x01 to APM_STS translates to "clear all features", with the
+following consequences:
+
+- The SMI behavior remains the original QEMU behavior, regardless of any future
+  features / feature bits.
+
+- Reading back APM_STS returns the value previously written. This is because
+
+  - the least significant bit is transparent,
+
+  - clearing all features is a request that always succeeds, hence the feature
+    negotiation bit will read as zero,
+
+  - the feature bits always read as zero after writing zero to the feature
+    negotiation bit.
diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h
index 5fd7e97d2347..8304396a487f 100644
--- a/include/hw/i386/ich9.h
+++ b/include/hw/i386/ich9.h
@@ -70,6 +70,9 @@ typedef struct ICH9LPCState {
     Notifier machine_ready;
 
     qemu_irq gsi[GSI_NUM_PINS];
+
+    /* SMI features negotiated via APM_STS */
+    uint8_t smi_features;
 } ICH9LPCState;
 
 Object *ich9_lpc_find(void);
@@ -208,6 +211,11 @@ Object *ich9_lpc_find(void);
 #define ICH9_APM_ACPI_ENABLE                    0x2
 #define ICH9_APM_ACPI_DISABLE                   0x3
 
+/* non-standard bits for the APM_STS register */
+#define ICH9_APM_STS_TRANSPARENT_MASK          0x01
+#define ICH9_APM_STS_GET_SET_FEATURES          0x02
+#define ICH9_APM_STS_KNOWN_FEATURES            0x00
+#define ICH9_APM_STS_FEATURE_MASK              0xfc
 
 /* D31:F3 SMBus controller */
 #define TYPE_ICH9_SMB_DEVICE "ICH9 SMB"
diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
index e3556c9f9eae..a50c4a15b6d1 100644
--- a/hw/isa/lpc_ich9.c
+++ b/hw/isa/lpc_ich9.c
@@ -390,6 +390,37 @@ static void ich9_apm_ctrl_changed(uint32_t val, void *arg)
     }
 }
 
+static void ich9_apm_status_changed(uint32_t val, void *arg)
+{
+    ICH9LPCState *lpc = arg;
+    uint32_t new_val = 0;
+
+    /* copy the transparent bits */
+    new_val |= val & ICH9_APM_STS_TRANSPARENT_MASK;
+
+    if (val & ICH9_APM_STS_GET_SET_FEATURES) {
+        /* The guest is querying features. Clearing
+         * ICH9_APM_STS_GET_SET_FEATURES means success.
+         */
+
+        /* provide known features */
+        new_val |= ICH9_APM_STS_KNOWN_FEATURES;
+    } else {
+        /* The guest is setting features. */
+        if (val & ICH9_APM_STS_FEATURE_MASK &
+            ~(uint32_t)ICH9_APM_STS_KNOWN_FEATURES) {
+            /* Unknown feature requested, report error. */
+            new_val |= ICH9_APM_STS_GET_SET_FEATURES;
+        } else {
+            /* Set requested features. Clearing ICH9_APM_STS_GET_SET_FEATURES
+             * means success.*/
+            lpc->smi_features = val & ICH9_APM_STS_KNOWN_FEATURES;
+        }
+    }
+
+    lpc->apm.apms = new_val;
+}
+
 /* config:PMBASE */
 static void
 ich9_lpc_pmbase_sci_update(ICH9LPCState *lpc)
@@ -507,6 +538,7 @@ static void ich9_lpc_reset(DeviceState *qdev)
 
     lpc->sci_level = 0;
     lpc->rst_cnt = 0;
+    lpc->smi_features = 0;
 }
 
 /* root complex register block is mapped into memory space */
@@ -634,7 +666,8 @@ static void ich9_lpc_realize(PCIDevice *d, Error **errp)
     lpc->isa_bus = isa_bus;
 
     ich9_cc_init(lpc);
-    apm_init(d, &lpc->apm, ich9_apm_ctrl_changed, NULL, lpc);
+    apm_init(d, &lpc->apm, ich9_apm_ctrl_changed, ich9_apm_status_changed,
+             lpc);
 
     lpc->machine_ready.notify = ich9_lpc_machine_ready;
     qemu_add_machine_init_done_notifier(&lpc->machine_ready);
@@ -668,6 +701,24 @@ static const VMStateDescription vmstate_ich9_rst_cnt = {
     }
 };
 
+static bool ich9_smi_features_needed(void *opaque)
+{
+    ICH9LPCState *lpc = opaque;
+
+    return (lpc->smi_features != 0);
+}
+
+static const VMStateDescription vmstate_ich9_smi_features = {
+    .name = "ICH9LPC/smi_features",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = ich9_smi_features_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT8(smi_features, ICH9LPCState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_ich9_lpc = {
     .name = "ICH9LPC",
     .version_id = 1,
@@ -683,6 +734,7 @@ static const VMStateDescription vmstate_ich9_lpc = {
     },
     .subsections = (const VMStateDescription*[]) {
         &vmstate_ich9_rst_cnt,
+        &vmstate_ich9_smi_features,
         NULL
     }
 };
-- 
2.9.2

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH v3 for-2.9 3/3] hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs
  2016-11-18 10:36 [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI Laszlo Ersek
  2016-11-18 10:36 ` [Qemu-devel] [PATCH v3 for-2.9 1/3] hw/isa/apm: introduce callback for APM_STS_IOPORT writes Laszlo Ersek
  2016-11-18 10:36 ` [Qemu-devel] [PATCH v3 for-2.9 2/3] hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS Laszlo Ersek
@ 2016-11-18 10:36 ` Laszlo Ersek
  2016-11-18 14:10 ` [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI Michael S. Tsirkin
  2016-11-23 22:35 ` Paolo Bonzini
  4 siblings, 0 replies; 27+ messages in thread
From: Laszlo Ersek @ 2016-11-18 10:36 UTC (permalink / raw)
  To: qemu devel list
  Cc: Kevin O'Connor, Michael S. Tsirkin, Gerd Hoffmann, Paolo Bonzini

The generic edk2 SMM infrastructure prefers
EFI_SMM_CONTROL2_PROTOCOL.Trigger() to inject an SMI on each processor. If
Trigger() only brings the current processor into SMM, then edk2 handles it
in the following ways:

(1) If Trigger() is executed by the BSP (which is guaranteed before
    ExitBootServices(), but is not necessarily true at runtime), then:

    (a) If edk2 has been configured for "traditional" SMM synchronization,
        then the BSP sends directed SMIs to the APs with APIC delivery,
        bringing them into SMM individually. Then the BSP runs the SMI
        handler / dispatcher.

    (b) If edk2 has been configured for "relaxed" SMM synchronization,
        then the APs that are not already in SMM are not brought in, and
        the BSP runs the SMI handler / dispatcher.

(2) If Trigger() is executed by an AP (which is possible after
    ExitBootServices(), and can be forced e.g. by "taskset -c 1
    efibootmgr"), then the AP in question brings in the BSP with a
    directed SMI, and the BSP runs the SMI handler / dispatcher.

The smaller problem with (1a) and (2) is that the BSP and AP
synchronization is slow. For example, the "taskset -c 1 efibootmgr"
command from (2) can take more than 3 seconds to complete, because
efibootmgr accesses non-volatile UEFI variables intensively.

The larger problem is that QEMU's current behavior diverges from the
behavior usually seen on physical hardware, and that keeps exposing
obscure corner cases, race conditions and other instabilities in edk2,
which generally expects / prefers a software SMI to affect all CPUs at
once.

Therefore introduce the "broadcast SMI" feature
(ICH9_APM_STS_F_BROADCAST_SMI) that causes QEMU to inject the SMI on all
VCPUs. OVMF's EFI_SMM_CONTROL2_PROTOCOL.Trigger() can utilize this to
accommodate edk2's preference about "broadcast" SMI.

While the original posting of this patch
<http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg05658.html>
only intended to speed up (2), based on our recent "stress testing" of SMM
this patch actually provides functional improvements.

Cc: "Kevin O'Connor" <kevin@koconnor.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Also-suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---

Notes:
    v3:
    - key the broadcast SMI off of ICH9_APM_STS_F_BROADCAST_SMI, if it was
      negotiated [Paolo, Michael]

 docs/specs/q35-apm-sts.txt | 15 ++++++++++++---
 include/hw/i386/ich9.h     |  3 ++-
 hw/isa/lpc_ich9.c          | 10 +++++++++-
 3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/docs/specs/q35-apm-sts.txt b/docs/specs/q35-apm-sts.txt
index cdffb6834380..201baec52e9d 100644
--- a/docs/specs/q35-apm-sts.txt
+++ b/docs/specs/q35-apm-sts.txt
@@ -27,7 +27,9 @@ The following describes the (non-standard) bit definitions in APM_STS.
     |   |   |   |   |   |   |
     |   |   |   |   |   |   Feature negotiation bit.
     |   |   |   |   |   |
-    Feature bits. All reserved at the moment.
+    |   |   |   |   |   Broadcast SMI feature bit.
+    |   |   |   |   |
+    Reserved feature bits.
 
 Feature negotiation
 -------------------
@@ -37,8 +39,8 @@ negotiation bit first (clearing all other bits), then read back the APM_STS
 register. If the feature negotiation bit is set in the result, then QEMU lacks
 the feature negotiation feature, and APM_STS is entirely transparent. Otherwise
 (i.e., the feature negotiation bit is clear in the result), the more
-significant bits (the feature bits) expose the features supported by QEMU. At
-the moment, no features are defined, and all feature bits read as zero.
+significant bits (the feature bits) expose the features supported by QEMU.
+Reserved and unsupported feature bits read as zero.
 
 Once firmware confirms feature negotiation is available, it shall set (select)
 a subset of the advertised feature bits, and clear the feature negotiation bit,
@@ -51,6 +53,13 @@ dependencies, for example). Regardless of the feature negotiation bit in the
 read back value, the higher order bits (i.e., the individual feature bits) are
 always zero in that value.
 
+The broadcast SMI feature
+-------------------------
+
+Negotiating the broadcast SMI feature causes QEMU to raise the SMI on all VCPUs
+in response to subsequent SMI Command Port (APM_CNT) writes. By default QEMU
+raises the SMI only on the VCPU that writes to the SMI Command Port (APM_CNT).
+
 SeaBIOS compatibility
 ---------------------
 
diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h
index 8304396a487f..f14b747ff207 100644
--- a/include/hw/i386/ich9.h
+++ b/include/hw/i386/ich9.h
@@ -214,7 +214,8 @@ Object *ich9_lpc_find(void);
 /* non-standard bits for the APM_STS register */
 #define ICH9_APM_STS_TRANSPARENT_MASK          0x01
 #define ICH9_APM_STS_GET_SET_FEATURES          0x02
-#define ICH9_APM_STS_KNOWN_FEATURES            0x00
+#define ICH9_APM_STS_F_BROADCAST_SMI           0x04
+#define ICH9_APM_STS_KNOWN_FEATURES            0x04
 #define ICH9_APM_STS_FEATURE_MASK              0xfc
 
 /* D31:F3 SMBus controller */
diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
index a50c4a15b6d1..d8332f16e704 100644
--- a/hw/isa/lpc_ich9.c
+++ b/hw/isa/lpc_ich9.c
@@ -386,7 +386,15 @@ static void ich9_apm_ctrl_changed(uint32_t val, void *arg)
 
     /* SMI_EN = PMBASE + 30. SMI control and enable register */
     if (lpc->pm.smi_en & ICH9_PMIO_SMI_EN_APMC_EN) {
-        cpu_interrupt(current_cpu, CPU_INTERRUPT_SMI);
+        if (lpc->smi_features & ICH9_APM_STS_F_BROADCAST_SMI) {
+            CPUState *cs;
+
+            CPU_FOREACH(cs) {
+                cpu_interrupt(cs, CPU_INTERRUPT_SMI);
+            }
+        } else {
+            cpu_interrupt(current_cpu, CPU_INTERRUPT_SMI);
+        }
     }
 }
 
-- 
2.9.2

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-18 10:36 [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI Laszlo Ersek
                   ` (2 preceding siblings ...)
  2016-11-18 10:36 ` [Qemu-devel] [PATCH v3 for-2.9 3/3] hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs Laszlo Ersek
@ 2016-11-18 14:10 ` Michael S. Tsirkin
  2016-11-23 15:48   ` Laszlo Ersek
  2016-11-23 22:35 ` Paolo Bonzini
  4 siblings, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2016-11-18 14:10 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: qemu devel list, Kevin O'Connor, Gerd Hoffmann, Paolo Bonzini

On Fri, Nov 18, 2016 at 11:36:56AM +0100, Laszlo Ersek wrote:
> This is v3 of the series, with updates based on the v2 discussion:
> <http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg02687.html>.
> 
> I've added feature negotiation via the APM_STS ("scratchpad") register.
> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
> 
> Tested with new OVMF patches (about to send out those as well).
> Regression tested with SeaBIOS (beyond simple functional tests with
> maximum SeaBIOS logging enabled, I used gdb to step through the new
> ich9_apm_status_changed() callback to see if it was behaving compatibly
> with SeaBIOS).
> 
> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
> crashes very quickly for me when running OVMF:
> 
>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
> 
> It is my understanding that there are patches on the list for this:
> 
>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
> 
> Anyway, the series rebases to v2.8.0-rc0 without as much as context
> differences.
> 
> Cc: "Kevin O'Connor" <kevin@koconnor.net>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>

I'll review. Pls remember it will have to be re-posted or pinged
after 2.8 is out.

> Thanks
> Laszlo
> 
> Laszlo Ersek (3):
>   hw/isa/apm: introduce callback for APM_STS_IOPORT writes
>   hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS
>   hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs
> 
>  docs/specs/q35-apm-sts.txt | 80 ++++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/i386/ich9.h     |  9 ++++++
>  include/hw/isa/apm.h       |  9 +++---
>  hw/acpi/piix4.c            |  2 +-
>  hw/isa/apm.c               | 15 ++++++---
>  hw/isa/lpc_ich9.c          | 64 +++++++++++++++++++++++++++++++++++--
>  hw/isa/vt82c686.c          |  2 +-
>  7 files changed, 168 insertions(+), 13 deletions(-)
>  create mode 100644 docs/specs/q35-apm-sts.txt
> 
> -- 
> 2.9.2

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-18 14:10 ` [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI Michael S. Tsirkin
@ 2016-11-23 15:48   ` Laszlo Ersek
  0 siblings, 0 replies; 27+ messages in thread
From: Laszlo Ersek @ 2016-11-23 15:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu devel list, Kevin O'Connor, Gerd Hoffmann, Paolo Bonzini

On 11/18/16 15:10, Michael S. Tsirkin wrote:
> On Fri, Nov 18, 2016 at 11:36:56AM +0100, Laszlo Ersek wrote:
>> This is v3 of the series, with updates based on the v2 discussion:
>> <http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg02687.html>.
>>
>> I've added feature negotiation via the APM_STS ("scratchpad") register.
>> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
>>
>> Tested with new OVMF patches (about to send out those as well).
>> Regression tested with SeaBIOS (beyond simple functional tests with
>> maximum SeaBIOS logging enabled, I used gdb to step through the new
>> ich9_apm_status_changed() callback to see if it was behaving compatibly
>> with SeaBIOS).
>>
>> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
>> crashes very quickly for me when running OVMF:
>>
>>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
>>
>> It is my understanding that there are patches on the list for this:
>>
>>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
>>
>> Anyway, the series rebases to v2.8.0-rc0 without as much as context
>> differences.
>>
>> Cc: "Kevin O'Connor" <kevin@koconnor.net>
>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
> 
> I'll review.

Can you please review the

  docs/specs/q35-apm-sts.txt

hunks across the series? The OVMF series has been reviewed and it's
ready to go in.

If you approve the interface spec, I can safely commit the OVMF patches
(exactly because the feature is now negotiated), without having to wait
for the QEMU code to converge.

> Pls remember it will have to be re-posted or pinged
> after 2.8 is out.

Sure, will do.

Thanks!
Laszlo

> 
>> Thanks
>> Laszlo
>>
>> Laszlo Ersek (3):
>>   hw/isa/apm: introduce callback for APM_STS_IOPORT writes
>>   hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS
>>   hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs
>>
>>  docs/specs/q35-apm-sts.txt | 80 ++++++++++++++++++++++++++++++++++++++++++++++
>>  include/hw/i386/ich9.h     |  9 ++++++
>>  include/hw/isa/apm.h       |  9 +++---
>>  hw/acpi/piix4.c            |  2 +-
>>  hw/isa/apm.c               | 15 ++++++---
>>  hw/isa/lpc_ich9.c          | 64 +++++++++++++++++++++++++++++++++++--
>>  hw/isa/vt82c686.c          |  2 +-
>>  7 files changed, 168 insertions(+), 13 deletions(-)
>>  create mode 100644 docs/specs/q35-apm-sts.txt
>>
>> -- 
>> 2.9.2

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-18 10:36 [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI Laszlo Ersek
                   ` (3 preceding siblings ...)
  2016-11-18 14:10 ` [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI Michael S. Tsirkin
@ 2016-11-23 22:35 ` Paolo Bonzini
  2016-11-24  0:01   ` Laszlo Ersek
  4 siblings, 1 reply; 27+ messages in thread
From: Paolo Bonzini @ 2016-11-23 22:35 UTC (permalink / raw)
  To: Laszlo Ersek, qemu devel list
  Cc: Kevin O'Connor, Michael S. Tsirkin, Gerd Hoffmann, Igor Mammedov



On 18/11/2016 11:36, Laszlo Ersek wrote:
> This is v3 of the series, with updates based on the v2 discussion:
> <http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg02687.html>.
> 
> I've added feature negotiation via the APM_STS ("scratchpad") register.
> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
> 
> Tested with new OVMF patches (about to send out those as well).
> Regression tested with SeaBIOS (beyond simple functional tests with
> maximum SeaBIOS logging enabled, I used gdb to step through the new
> ich9_apm_status_changed() callback to see if it was behaving compatibly
> with SeaBIOS).
> 
> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
> crashes very quickly for me when running OVMF:
> 
>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
> 
> It is my understanding that there are patches on the list for this:
> 
>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
> 
> Anyway, the series rebases to v2.8.0-rc0 without as much as context
> differences.

Hi Laszlo,

sorry for the slightly delayed reply.

First of all, I'm wondering if we would be better off adding a new port
0xB1 that is QEMU-specific, instead of reusing 0xB3.

Second, I now remembered the reason why I was against broadcast SMI.
The reason is that it breaks hot-plug.

On hot-plug, the firmware (if it wants to use SMI for anything secure)
must relocate the SMBASE of the newly-hotplugged CPU before the OS has
any chance to put its fangs on it.  This is because the OS can send
direct SMIs and is in control of the area at 0x30000.  So OVMF is
already broken with respect to hotplug, but I am not yet sure if these
patches would break it further.

The full solution is to follow a protocol similar to what real hardware
does.  On hot-plug, before the new CPU starts running, the DSDT should
generate an SMI (with a well-known value written to 0xB2).  The handler
then:

1) waits for SMM rendezvous

2) unparks the hotplugged VCPU.  Until the VCPU is unparked, it doesn't
react to either INITs or of course SIPIs (this would need to be
implemented separately for both TCG and KVM! but only in QEMU, not in
the kernel).

3) relocates SMBASE

4) records the presence of the new VCPU's APIC id for subsequent SMIs.


The other important things are:

* new CPU-hotplug controller (docs/specs/acpi_cpu_hotplug.txt)
interfaces.  I think this would only need a new bit in the write
register at 0x4 ("CPU device control fields").

The 0x0-0x3 write register, currently reserved for reading, might
become read/write for easier communication with the SMI handler.  The
SMI handler would write 1 to the new bit in order to unpark the VCPU.

* how to enable this.  I think it would need a new SMM feature, just
like those that you are adding here, in order to make it negotiable.  If
it is not negotiated, but broadcast SMI is negotiated, you'd need to do
something such as not allowing CPU-hotplug.  (This is the only part that
I think is required for 2.9).

In order to trigger the SMI, the

                 ifctx = aml_if(aml_equal(ins_evt, one));
                 {
                     aml_append(ifctx,
                         aml_call2(CPU_NOTIFY_METHOD, cpu_data, dev_chk));
                     aml_append(ifctx, aml_store(one, ins_evt));
                     aml_append(ifctx, aml_store(one, has_event));
                 }

would be replaced by something like this pseudo-AML:

		Store(And(smm_features, 2), Local1)
		...
		Store(next_cpu_cmd, cpu_cmd)
		If (Equal(ins_evt, One)) {
			If (Greater(Local1, Zero)) {
				Store(CPU_HP_APM_CMD, apm_cmd)
			}
			CPU_NOTIFY_METHOD(cpu_data, dev_chk)
			Store(One, ins_evt)
			Store(One, has_event)
		}


Of course all this is for OVMF only.  SeaBIOS doesn't need to do
anything of this, because it actually likes to have its SMIs only on the
current CPU (and it had better be the BSP, since SMBASE is not relocated
elsewhere!).

Igor, any thoughts?

I understand that this is quite huge in both OVMF and QEMU, but we've
only been delaying it and we knew about it. :(

Paolo

> Cc: "Kevin O'Connor" <kevin@koconnor.net>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> 
> Thanks
> Laszlo
> 
> Laszlo Ersek (3):
>   hw/isa/apm: introduce callback for APM_STS_IOPORT writes
>   hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS
>   hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs
> 
>  docs/specs/q35-apm-sts.txt | 80 ++++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/i386/ich9.h     |  9 ++++++
>  include/hw/isa/apm.h       |  9 +++---
>  hw/acpi/piix4.c            |  2 +-
>  hw/isa/apm.c               | 15 ++++++---
>  hw/isa/lpc_ich9.c          | 64 +++++++++++++++++++++++++++++++++++--
>  hw/isa/vt82c686.c          |  2 +-
>  7 files changed, 168 insertions(+), 13 deletions(-)
>  create mode 100644 docs/specs/q35-apm-sts.txt
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-23 22:35 ` Paolo Bonzini
@ 2016-11-24  0:01   ` Laszlo Ersek
  2016-11-24  0:31     ` Laszlo Ersek
                       ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Laszlo Ersek @ 2016-11-24  0:01 UTC (permalink / raw)
  To: Paolo Bonzini, qemu devel list
  Cc: Kevin O'Connor, Michael S. Tsirkin, Gerd Hoffmann,
	Igor Mammedov, Jordan Justen (Intel address),
	Michael Kinney

CC Jordan & Mike

On 11/23/16 23:35, Paolo Bonzini wrote:
> 
> 
> On 18/11/2016 11:36, Laszlo Ersek wrote:
>> This is v3 of the series, with updates based on the v2 discussion:
>> <http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg02687.html>.
>>
>> I've added feature negotiation via the APM_STS ("scratchpad") register.
>> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
>>
>> Tested with new OVMF patches (about to send out those as well).
>> Regression tested with SeaBIOS (beyond simple functional tests with
>> maximum SeaBIOS logging enabled, I used gdb to step through the new
>> ich9_apm_status_changed() callback to see if it was behaving compatibly
>> with SeaBIOS).
>>
>> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
>> crashes very quickly for me when running OVMF:
>>
>>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
>>
>> It is my understanding that there are patches on the list for this:
>>
>>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
>>
>> Anyway, the series rebases to v2.8.0-rc0 without as much as context
>> differences.
> 
> Hi Laszlo,
> 
> sorry for the slightly delayed reply.
> 
> First of all, I'm wondering if we would be better off adding a new port
> 0xB1 that is QEMU-specific, instead of reusing 0xB3.

Sure, I can look into that, if we agree that's the best way to proceed,
for now. (Although I'm not really happy about the new memory region
stuff it would require. :()

I CC'd Kevin to learn if he foresaw other uses for the APM_STS register
in SeaBIOS.

BTW, what happens in QEMU if the guest reads an unimplemented port?
Hm... unassigned_io_write() seems to be a no-op, and
unassigned_io_read() returns all-bits-one. This means that for a new
port, the negotiation protocol / values have to be reworked.

Port 0xB1 is occupied by ICH9 according to the spec:

  Table 9-2. Fixed I/O Ranges Decoded by Intel ® ICH9 (Sheet 2 of 2)

  I/O
  Address  Read Target           Write Target          Internal Unit
  -------  --------------------  --------------------  -------------
  B0h–B1h  Interrupt Controller  Interrupt Controller  Interrupt

I wonder if we care -- after all, APM_STS (0xB3) is documented not to
have any hardware effects ("scratchpad register").

> Second, I now remembered the reason why I was against broadcast SMI.
> The reason is that it breaks hot-plug.

How does it break hot-plug? After reading your explanation below: is it
that the broadcast SMI (possibly raised by the OS directly) would get to
the new VCPU before the firmware relocated its SMBASE?

> 
> On hot-plug, the firmware (if it wants to use SMI for anything secure)
> must relocate the SMBASE of the newly-hotplugged CPU before the OS has
> any chance to put its fangs on it.  This is because the OS can send
> direct SMIs and is in control of the area at 0x30000.  So OVMF is
> already broken with respect to hotplug,

Yes. We theorized that there could be further edk2 core modules that
hadn't been open sourced yet, necessary for handling VCPU hotplug.

> but I am not yet sure if these
> patches would break it further.

Hard to say without seeing those modules.

I will speculate: assuming that the non-public modules fit together with
the public modules in some way, I expect the broadcast SMI shouldn't
break them. The reason is that the broadcast SMI / traditional delivery
are the default method in UefiCpuPkg, and in practice they work better
(more reliably) with the rest of the edk2 infrastructure, in my
experience, than the relaxed sync method.

In his review today, Jordan wrote
<https://lists.01.org/pipermail/edk2-devel/2016-November/005088.html>,

> I'm glad we'll be using a mechanism that broadcasts to all the
> processors like the real hardware. It is a bit unfortunate that it
> doesn't go through the b2 port for it.

If broadcast is how real hardware does it (even by default!,
apparently), I expect those as-yet unreleased, hotplug-handling modules
in edk2 should be able to cope with broadcast.

> The full solution is to follow a protocol similar to what real hardware
> does.

Real hardware seems to use broadcast, according to the above...

On 11/23/16 23:35, Paolo Bonzini wrote:

> On hot-plug, before the new CPU starts running, the DSDT should
> generate an SMI (with a well-known value written to 0xB2).

I'm not sure I understand right. If it is the DSDT that writes to 0xB2,
that's just another way to say, "the firmware vendor asks the operating
system to write to 0xB2". If the malicious OS is smart enough, it can
realize (from the hardware signal to run the ACPI GPE handler, IIRC)
that a new VCPU is available, and simply not trigger the SMI.

> The handler
> then:
> 
> 1) waits for SMM rendezvous
> 
> 2) unparks the hotplugged VCPU.  Until the VCPU is unparked, it doesn't
> react to either INITs or of course SIPIs (this would need to be
> implemented separately for both TCG and KVM! but only in QEMU, not in
> the kernel).

Okay, this does plug the hole I sketched out above. This logic (the
QEMU-specific unparking) can be done in another platform API in OVMF I
guess (like those in SmmCpuFeaturesLib), but I wonder if we have to
provide the infrastructure in platform code up to the separate SMI
command handler. I think it again depends on those unreleased modules.

> 3) relocates SMBASE
> 
> 4) records the presence of the new VCPU's APIC id for subsequent SMIs.
> 
> 
> The other important things are:
> 
> * new CPU-hotplug controller (docs/specs/acpi_cpu_hotplug.txt)
> interfaces.  I think this would only need a new bit in the write
> register at 0x4 ("CPU device control fields").
> 
> The 0x0-0x3 write register, currently reserved for reading, might
> become read/write for easier communication with the SMI handler.  The
> SMI handler would write 1 to the new bit in order to unpark the VCPU.
> 
> * how to enable this.  I think it would need a new SMM feature, just
> like those that you are adding here, in order to make it negotiable.  If
> it is not negotiated, but broadcast SMI is negotiated, you'd need to do
> something such as not allowing CPU-hotplug.  (This is the only part that
> I think is required for 2.9).

My hope is that we can work on this incrementally (or at least that we
don't forfeit our chance at SMM + VCPU hotplug) by going down the
broadcast SMI route first. Based on what Jordan wrote, this hope should
not be unfounded.

Also, the SMM stack (across components: core edk2 code, OVMF platform
code, QEMU and KVM) is now more than a year old, and we still have
stability problems (hence this series). I'd like to stabilize the base
features before getting wild with hotplug.

(The fact that OVMF lags behind SeaBIOS in a number of features is just
business as usual.)

So, I'd vastly prefer if I needed to extend this series only with a way
to prevent VCPUs from being hotplugged. I think negotiating SMI
broadcast should be good enough for that, as a hook point, given that it
runs in the firmware before the OS gets control (on S3 resume too).

I guess a global variable (or a board-level query function) that would
cause pc_query_hotpluggable_cpus() to return an empty list would be
frowned upon; what's the recommended method?

There is DeviceClass.hotpluggable, but that's not a dynamic property.

> 
> In order to trigger the SMI, the
> 
>                  ifctx = aml_if(aml_equal(ins_evt, one));
>                  {
>                      aml_append(ifctx,
>                          aml_call2(CPU_NOTIFY_METHOD, cpu_data, dev_chk));
>                      aml_append(ifctx, aml_store(one, ins_evt));
>                      aml_append(ifctx, aml_store(one, has_event));
>                  }
> 
> would be replaced by something like this pseudo-AML:
> 
> 		Store(And(smm_features, 2), Local1)
> 		...
> 		Store(next_cpu_cmd, cpu_cmd)
> 		If (Equal(ins_evt, One)) {
> 			If (Greater(Local1, Zero)) {
> 				Store(CPU_HP_APM_CMD, apm_cmd)
> 			}
> 			CPU_NOTIFY_METHOD(cpu_data, dev_chk)
> 			Store(One, ins_evt)
> 			Store(One, has_event)
> 		}
> 
> 
> Of course all this is for OVMF only.  SeaBIOS doesn't need to do
> anything of this, because it actually likes to have its SMIs only on the
> current CPU (and it had better be the BSP, since SMBASE is not relocated
> elsewhere!).
> 
> Igor, any thoughts?
> 
> I understand that this is quite huge in both OVMF and QEMU, but we've
> only been delaying it and we knew about it. :(

I agree about being aware of lack of VCPU hotplug support wrt. SMM.

I disagree about delaying it. I've been aware of problems with the base
SMM features (of differing importance), for example

- that S3 resume with the Ia32 build of OVMF was almost hit-or-miss,

- or that the heavy use of UEFI variable services during Windows 8 boot
caused user-visible choppiness in the rotating animation (because of how
our SMIs worked) -- I think I even mentioned this to you.

Since the stalemate in

  http://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg00125.html

from last November, I've never stopped disliking the lack of broadcast
SMIs, but I couldn't do anything about it, despite it leaving the base
feature set unreliable.

Recently, the open source edk2 SMM stack has acquired a bunch of new
code (you're aware), which exacerbate the instabilities (as I've
documented on the edk2-devel list in excruciating detail). I was
overjoyed when you finally suggested (!) to add broadcast SMI, getting
the discussion un-stuck, because (a) it miraculously fixes everything,
and (b) honestly, I see no other way from keeping the SMM stack from
falling apart otherwise, *without* VCPU hotplug in the picture.

For the last few weeks, I've been working day and night (I'm not
exaggerating) on testing & reviewing edk2 patches related to SMM or even
just multiprocessing, catching errors, reporting them, and even
debugging and fixing them myself. (See for example commits

  00650c531a90 UefiCpuPkg/MpInitLib/X64/MpFuncs.nasm: fix fatal typo
  4af3ae146379 UefiCpuPkg/LocalApicLib: fix feature test for Extended
               Topology CPUID leaf
  1cbd83308989 UefiCpuPkg/MpInitLib: fix feature test for Extended
               Topology CPUID leaf

). Putting out fires more or less as a norm.

Furthermore, I haven't been pretending that VCPU hotplug "doesn't
exist", see

  https://lists.01.org/pipermail/edk2-devel/2016-November/005131.html

for example, which I posted independently, ~8 hours ago.

So, no. I reject the idea that we've been procrastinating SMM enabled
VCPU hotplug. We can't build the second floor while the foundation is
shaking. And then we need to ask Intel to release more code as open source.

Thanks
Laszlo

> 
> Paolo
> 
>> Cc: "Kevin O'Connor" <kevin@koconnor.net>
>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>
>> Thanks
>> Laszlo
>>
>> Laszlo Ersek (3):
>>   hw/isa/apm: introduce callback for APM_STS_IOPORT writes
>>   hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS
>>   hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs
>>
>>  docs/specs/q35-apm-sts.txt | 80 ++++++++++++++++++++++++++++++++++++++++++++++
>>  include/hw/i386/ich9.h     |  9 ++++++
>>  include/hw/isa/apm.h       |  9 +++---
>>  hw/acpi/piix4.c            |  2 +-
>>  hw/isa/apm.c               | 15 ++++++---
>>  hw/isa/lpc_ich9.c          | 64 +++++++++++++++++++++++++++++++++++--
>>  hw/isa/vt82c686.c          |  2 +-
>>  7 files changed, 168 insertions(+), 13 deletions(-)
>>  create mode 100644 docs/specs/q35-apm-sts.txt
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-24  0:01   ` Laszlo Ersek
@ 2016-11-24  0:31     ` Laszlo Ersek
  2016-11-24  0:38     ` Kevin O'Connor
  2016-11-24 14:55     ` Igor Mammedov
  2 siblings, 0 replies; 27+ messages in thread
From: Laszlo Ersek @ 2016-11-24  0:31 UTC (permalink / raw)
  To: Paolo Bonzini, qemu devel list
  Cc: Kevin O'Connor, Michael S. Tsirkin, Gerd Hoffmann,
	Igor Mammedov, Jordan Justen (Intel address),
	Michael Kinney

On 11/24/16 01:01, Laszlo Ersek wrote:
> CC Jordan & Mike
> 
> On 11/23/16 23:35, Paolo Bonzini wrote:
>>
>>
>> On 18/11/2016 11:36, Laszlo Ersek wrote:
>>> This is v3 of the series, with updates based on the v2 discussion:
>>> <http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg02687.html>.
>>>
>>> I've added feature negotiation via the APM_STS ("scratchpad") register.
>>> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
>>>
>>> Tested with new OVMF patches (about to send out those as well).
>>> Regression tested with SeaBIOS (beyond simple functional tests with
>>> maximum SeaBIOS logging enabled, I used gdb to step through the new
>>> ich9_apm_status_changed() callback to see if it was behaving compatibly
>>> with SeaBIOS).
>>>
>>> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
>>> crashes very quickly for me when running OVMF:
>>>
>>>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
>>>
>>> It is my understanding that there are patches on the list for this:
>>>
>>>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
>>>
>>> Anyway, the series rebases to v2.8.0-rc0 without as much as context
>>> differences.
>>
>> Hi Laszlo,
>>
>> sorry for the slightly delayed reply.
>>
>> First of all, I'm wondering if we would be better off adding a new port
>> 0xB1 that is QEMU-specific, instead of reusing 0xB3.
> 
> Sure, I can look into that, if we agree that's the best way to proceed,
> for now. (Although I'm not really happy about the new memory region
> stuff it would require. :()
> 
> I CC'd Kevin to learn if he foresaw other uses for the APM_STS register
> in SeaBIOS.
> 
> BTW, what happens in QEMU if the guest reads an unimplemented port?
> Hm... unassigned_io_write() seems to be a no-op, and
> unassigned_io_read() returns all-bits-one. This means that for a new
> port, the negotiation protocol / values have to be reworked.
> 
> Port 0xB1 is occupied by ICH9 according to the spec:
> 
>   Table 9-2. Fixed I/O Ranges Decoded by Intel ® ICH9 (Sheet 2 of 2)
> 
>   I/O
>   Address  Read Target           Write Target          Internal Unit
>   -------  --------------------  --------------------  -------------
>   B0h–B1h  Interrupt Controller  Interrupt Controller  Interrupt
> 
> I wonder if we care -- after all, APM_STS (0xB3) is documented not to
> have any hardware effects ("scratchpad register").
> 
>> Second, I now remembered the reason why I was against broadcast SMI.
>> The reason is that it breaks hot-plug.
> 
> How does it break hot-plug? After reading your explanation below: is it
> that the broadcast SMI (possibly raised by the OS directly) would get to
> the new VCPU before the firmware relocated its SMBASE?
> 
>>
>> On hot-plug, the firmware (if it wants to use SMI for anything secure)
>> must relocate the SMBASE of the newly-hotplugged CPU before the OS has
>> any chance to put its fangs on it.  This is because the OS can send
>> direct SMIs and is in control of the area at 0x30000.  So OVMF is
>> already broken with respect to hotplug,
> 
> Yes. We theorized that there could be further edk2 core modules that
> hadn't been open sourced yet, necessary for handling VCPU hotplug.
> 
>> but I am not yet sure if these
>> patches would break it further.
> 
> Hard to say without seeing those modules.
> 
> I will speculate: assuming that the non-public modules fit together with
> the public modules in some way, I expect the broadcast SMI shouldn't
> break them. The reason is that the broadcast SMI / traditional delivery
> are the default method in UefiCpuPkg, and in practice they work better
> (more reliably) with the rest of the edk2 infrastructure, in my
> experience, than the relaxed sync method.
> 
> In his review today, Jordan wrote
> <https://lists.01.org/pipermail/edk2-devel/2016-November/005088.html>,
> 
>> I'm glad we'll be using a mechanism that broadcasts to all the
>> processors like the real hardware. It is a bit unfortunate that it
>> doesn't go through the b2 port for it.
> 
> If broadcast is how real hardware does it (even by default!,
> apparently), I expect those as-yet unreleased, hotplug-handling modules
> in edk2 should be able to cope with broadcast.
> 
>> The full solution is to follow a protocol similar to what real hardware
>> does.
> 
> Real hardware seems to use broadcast, according to the above...
> 
> On 11/23/16 23:35, Paolo Bonzini wrote:
> 
>> On hot-plug, before the new CPU starts running, the DSDT should
>> generate an SMI (with a well-known value written to 0xB2).
> 
> I'm not sure I understand right. If it is the DSDT that writes to 0xB2,
> that's just another way to say, "the firmware vendor asks the operating
> system to write to 0xB2". If the malicious OS is smart enough, it can
> realize (from the hardware signal to run the ACPI GPE handler, IIRC)
> that a new VCPU is available, and simply not trigger the SMI.
> 
>> The handler
>> then:
>>
>> 1) waits for SMM rendezvous
>>
>> 2) unparks the hotplugged VCPU.  Until the VCPU is unparked, it doesn't
>> react to either INITs or of course SIPIs (this would need to be
>> implemented separately for both TCG and KVM! but only in QEMU, not in
>> the kernel).
> 
> Okay, this does plug the hole I sketched out above. This logic (the
> QEMU-specific unparking) can be done in another platform API in OVMF I
> guess (like those in SmmCpuFeaturesLib), but I wonder if we have to
> provide the infrastructure in platform code up to the separate SMI
> command handler. I think it again depends on those unreleased modules.
> 
>> 3) relocates SMBASE
>>
>> 4) records the presence of the new VCPU's APIC id for subsequent SMIs.
>>
>>
>> The other important things are:
>>
>> * new CPU-hotplug controller (docs/specs/acpi_cpu_hotplug.txt)
>> interfaces.  I think this would only need a new bit in the write
>> register at 0x4 ("CPU device control fields").
>>
>> The 0x0-0x3 write register, currently reserved for reading, might
>> become read/write for easier communication with the SMI handler.  The
>> SMI handler would write 1 to the new bit in order to unpark the VCPU.
>>
>> * how to enable this.  I think it would need a new SMM feature, just
>> like those that you are adding here, in order to make it negotiable.  If
>> it is not negotiated, but broadcast SMI is negotiated, you'd need to do
>> something such as not allowing CPU-hotplug.  (This is the only part that
>> I think is required for 2.9).
> 
> My hope is that we can work on this incrementally (or at least that we
> don't forfeit our chance at SMM + VCPU hotplug) by going down the
> broadcast SMI route first. Based on what Jordan wrote, this hope should
> not be unfounded.
> 
> Also, the SMM stack (across components: core edk2 code, OVMF platform
> code, QEMU and KVM) is now more than a year old, and we still have
> stability problems (hence this series). I'd like to stabilize the base
> features before getting wild with hotplug.
> 
> (The fact that OVMF lags behind SeaBIOS in a number of features is just
> business as usual.)
> 
> So, I'd vastly prefer if I needed to extend this series only with a way
> to prevent VCPUs from being hotplugged. I think negotiating SMI
> broadcast should be good enough for that, as a hook point, given that it
> runs in the firmware before the OS gets control (on S3 resume too).
> 
> I guess a global variable (or a board-level query function) that would
> cause pc_query_hotpluggable_cpus() to return an empty list would be
> frowned upon; what's the recommended method?
> 
> There is DeviceClass.hotpluggable, but that's not a dynamic property.
> 
>>
>> In order to trigger the SMI, the
>>
>>                  ifctx = aml_if(aml_equal(ins_evt, one));
>>                  {
>>                      aml_append(ifctx,
>>                          aml_call2(CPU_NOTIFY_METHOD, cpu_data, dev_chk));
>>                      aml_append(ifctx, aml_store(one, ins_evt));
>>                      aml_append(ifctx, aml_store(one, has_event));
>>                  }
>>
>> would be replaced by something like this pseudo-AML:
>>
>> 		Store(And(smm_features, 2), Local1)
>> 		...
>> 		Store(next_cpu_cmd, cpu_cmd)
>> 		If (Equal(ins_evt, One)) {
>> 			If (Greater(Local1, Zero)) {
>> 				Store(CPU_HP_APM_CMD, apm_cmd)
>> 			}
>> 			CPU_NOTIFY_METHOD(cpu_data, dev_chk)
>> 			Store(One, ins_evt)
>> 			Store(One, has_event)
>> 		}
>>
>>
>> Of course all this is for OVMF only.  SeaBIOS doesn't need to do
>> anything of this, because it actually likes to have its SMIs only on the
>> current CPU (and it had better be the BSP, since SMBASE is not relocated
>> elsewhere!).
>>
>> Igor, any thoughts?
>>
>> I understand that this is quite huge in both OVMF and QEMU, but we've
>> only been delaying it and we knew about it. :(
> 
> I agree about being aware of lack of VCPU hotplug support wrt. SMM.
> 
> I disagree about delaying it. I've been aware of problems with the base
> SMM features (of differing importance), for example
> 
> - that S3 resume with the Ia32 build of OVMF was almost hit-or-miss,
> 
> - or that the heavy use of UEFI variable services during Windows 8 boot
> caused user-visible choppiness in the rotating animation (because of how
> our SMIs worked) -- I think I even mentioned this to you.
> 
> Since the stalemate in
> 
>   http://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg00125.html
> 
> from last November, I've never stopped disliking the lack of broadcast
> SMIs, but I couldn't do anything about it, despite it leaving the base
> feature set unreliable.

And let's not forget about "niceties" like

- host kernel commit 879ae1880449 ("KVM: x86: obey
KVM_X86_QUIRK_CD_NW_CLEARED in kvm_set_cr0()"), and

- kernel bugzilla <https://bugzilla.kernel.org/show_bug.cgi?id=116731>,

which ate up incredible amounts of time, sending a whole lot of
VFIO/OVMF/GPU-assign gamers from the vfio-users list complaining on
every forum they could think of.

(These issues weren't related to SMM directly, but PiSmmCpuDxeSmm has a
depex on gEfiMpServiceProtocolGuid, so anything that interferes with
multiprocessing in CpuDxe or MpInitLib is automatically an SMM bug too.)

I can think of yet more: have you gotten around looking into
<https://bugzilla.redhat.com/show_bug.cgi?id=1348092>, which is about
SMM breaking on KVM hosts that lack EPT?

I believe you haven't. Which is fine. I just take issue with the notion
that we've been "too lazy" to enable SMM + VCPU hotplug.

We got problems that come before.

Thanks
Laszlo

> Recently, the open source edk2 SMM stack has acquired a bunch of new
> code (you're aware), which exacerbate the instabilities (as I've
> documented on the edk2-devel list in excruciating detail). I was
> overjoyed when you finally suggested (!) to add broadcast SMI, getting
> the discussion un-stuck, because (a) it miraculously fixes everything,
> and (b) honestly, I see no other way from keeping the SMM stack from
> falling apart otherwise, *without* VCPU hotplug in the picture.
> 
> For the last few weeks, I've been working day and night (I'm not
> exaggerating) on testing & reviewing edk2 patches related to SMM or even
> just multiprocessing, catching errors, reporting them, and even
> debugging and fixing them myself. (See for example commits
> 
>   00650c531a90 UefiCpuPkg/MpInitLib/X64/MpFuncs.nasm: fix fatal typo
>   4af3ae146379 UefiCpuPkg/LocalApicLib: fix feature test for Extended
>                Topology CPUID leaf
>   1cbd83308989 UefiCpuPkg/MpInitLib: fix feature test for Extended
>                Topology CPUID leaf
> 
> ). Putting out fires more or less as a norm.
> 
> Furthermore, I haven't been pretending that VCPU hotplug "doesn't
> exist", see
> 
>   https://lists.01.org/pipermail/edk2-devel/2016-November/005131.html
> 
> for example, which I posted independently, ~8 hours ago.
> 
> So, no. I reject the idea that we've been procrastinating SMM enabled
> VCPU hotplug. We can't build the second floor while the foundation is
> shaking. And then we need to ask Intel to release more code as open source.
> 
> Thanks
> Laszlo
> 
>>
>> Paolo
>>
>>> Cc: "Kevin O'Connor" <kevin@koconnor.net>
>>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>>
>>> Thanks
>>> Laszlo
>>>
>>> Laszlo Ersek (3):
>>>   hw/isa/apm: introduce callback for APM_STS_IOPORT writes
>>>   hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS
>>>   hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs
>>>
>>>  docs/specs/q35-apm-sts.txt | 80 ++++++++++++++++++++++++++++++++++++++++++++++
>>>  include/hw/i386/ich9.h     |  9 ++++++
>>>  include/hw/isa/apm.h       |  9 +++---
>>>  hw/acpi/piix4.c            |  2 +-
>>>  hw/isa/apm.c               | 15 ++++++---
>>>  hw/isa/lpc_ich9.c          | 64 +++++++++++++++++++++++++++++++++++--
>>>  hw/isa/vt82c686.c          |  2 +-
>>>  7 files changed, 168 insertions(+), 13 deletions(-)
>>>  create mode 100644 docs/specs/q35-apm-sts.txt
>>>
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-24  0:01   ` Laszlo Ersek
  2016-11-24  0:31     ` Laszlo Ersek
@ 2016-11-24  0:38     ` Kevin O'Connor
  2016-11-24  4:29       ` Michael S. Tsirkin
  2016-11-24 14:55     ` Igor Mammedov
  2 siblings, 1 reply; 27+ messages in thread
From: Kevin O'Connor @ 2016-11-24  0:38 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Paolo Bonzini, qemu devel list, Michael S. Tsirkin,
	Gerd Hoffmann, Igor Mammedov, Jordan Justen (Intel address),
	Michael Kinney

On Thu, Nov 24, 2016 at 01:01:58AM +0100, Laszlo Ersek wrote:
> CC Jordan & Mike
> 
> On 11/23/16 23:35, Paolo Bonzini wrote:
> > 
> > 
> > On 18/11/2016 11:36, Laszlo Ersek wrote:
> >> This is v3 of the series, with updates based on the v2 discussion:
> >> <http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg02687.html>.
> >>
> >> I've added feature negotiation via the APM_STS ("scratchpad") register.
> >> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
> >>
> >> Tested with new OVMF patches (about to send out those as well).
> >> Regression tested with SeaBIOS (beyond simple functional tests with
> >> maximum SeaBIOS logging enabled, I used gdb to step through the new
> >> ich9_apm_status_changed() callback to see if it was behaving compatibly
> >> with SeaBIOS).
> >>
> >> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
> >> crashes very quickly for me when running OVMF:
> >>
> >>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
> >>
> >> It is my understanding that there are patches on the list for this:
> >>
> >>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
> >>
> >> Anyway, the series rebases to v2.8.0-rc0 without as much as context
> >> differences.
> > 
> > Hi Laszlo,
> > 
> > sorry for the slightly delayed reply.
> > 
> > First of all, I'm wondering if we would be better off adding a new port
> > 0xB1 that is QEMU-specific, instead of reusing 0xB3.
> 
> Sure, I can look into that, if we agree that's the best way to proceed,
> for now. (Although I'm not really happy about the new memory region
> stuff it would require. :()
> 
> I CC'd Kevin to learn if he foresaw other uses for the APM_STS register
> in SeaBIOS.

I don't foresee further use of APM_STS in SeaBIOS.  The SMM code in
SeaBIOS is specific to QEMU anyway.  Also, the current use of APM_STS
is so trivial, we could easily remove it from SeaBIOS in a future
release (were that desirable).

As a general comment - it does seem unfortunate that we keep building
adhoc interfaces to communicate information from firmware to QEMU.  We
have a generic mechanism (fw_cfg) for passing adhoc information from
QEMU to the firmware, but the inverse seems to always involve magic
pci registers, magic io space registers, specific init ordering, etc.

That said, I don't object to your proposal.

-Kevin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-24  0:38     ` Kevin O'Connor
@ 2016-11-24  4:29       ` Michael S. Tsirkin
  2016-11-24  8:37         ` Laszlo Ersek
  0 siblings, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2016-11-24  4:29 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Laszlo Ersek, Paolo Bonzini, qemu devel list, Gerd Hoffmann,
	Igor Mammedov, Jordan Justen (Intel address),
	Michael Kinney

On Wed, Nov 23, 2016 at 07:38:35PM -0500, Kevin O'Connor wrote:
> As a general comment - it does seem unfortunate that we keep building
> adhoc interfaces to communicate information from firmware to QEMU.  We
> have a generic mechanism (fw_cfg) for passing adhoc information from
> QEMU to the firmware, but the inverse seems to always involve magic
> pci registers, magic io space registers, specific init ordering, etc.

FWIW I posted a proposal
	fw-cfg: support writeable blobs
a while ago to try to address that

-- 
MST

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-24  4:29       ` Michael S. Tsirkin
@ 2016-11-24  8:37         ` Laszlo Ersek
  2016-11-25  4:00           ` Michael S. Tsirkin
  0 siblings, 1 reply; 27+ messages in thread
From: Laszlo Ersek @ 2016-11-24  8:37 UTC (permalink / raw)
  To: Michael S. Tsirkin, Kevin O'Connor
  Cc: Paolo Bonzini, qemu devel list, Gerd Hoffmann, Igor Mammedov,
	Jordan Justen (Intel address),
	Michael Kinney

On 11/24/16 05:29, Michael S. Tsirkin wrote:
> On Wed, Nov 23, 2016 at 07:38:35PM -0500, Kevin O'Connor wrote:
>> As a general comment - it does seem unfortunate that we keep building
>> adhoc interfaces to communicate information from firmware to QEMU.  We
>> have a generic mechanism (fw_cfg) for passing adhoc information from
>> QEMU to the firmware, but the inverse seems to always involve magic
>> pci registers, magic io space registers, specific init ordering, etc.
> 
> FWIW I posted a proposal
> 	fw-cfg: support writeable blobs
> a while ago to try to address that
> 

Yes, here's the discussion (Feb 2016):

https://www.mail-archive.com/qemu-devel@nongnu.org/msg354852.html

and it was even part of a pull req (Mar 2016):

https://www.mail-archive.com/qemu-devel@nongnu.org/msg359348.html

but it wasn't merged, apparently.

If QEMU (re)gains this feature, I can try basing the broadcast SMI
negotiation on it.

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-24  0:01   ` Laszlo Ersek
  2016-11-24  0:31     ` Laszlo Ersek
  2016-11-24  0:38     ` Kevin O'Connor
@ 2016-11-24 14:55     ` Igor Mammedov
  2016-11-24 17:05       ` Paolo Bonzini
  2 siblings, 1 reply; 27+ messages in thread
From: Igor Mammedov @ 2016-11-24 14:55 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Paolo Bonzini, qemu devel list, Kevin O'Connor,
	Michael S. Tsirkin, Gerd Hoffmann, Jordan Justen (Intel address),
	Michael Kinney

On Thu, 24 Nov 2016 01:01:58 +0100
Laszlo Ersek <lersek@redhat.com> wrote:

> CC Jordan & Mike
> 
> On 11/23/16 23:35, Paolo Bonzini wrote:
> > 
> > 
> > On 18/11/2016 11:36, Laszlo Ersek wrote:
> >> This is v3 of the series, with updates based on the v2 discussion:
> >> <http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg02687.html>.
> >>
> >> I've added feature negotiation via the APM_STS ("scratchpad") register.
> >> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
> >>
> >> Tested with new OVMF patches (about to send out those as well).
> >> Regression tested with SeaBIOS (beyond simple functional tests with
> >> maximum SeaBIOS logging enabled, I used gdb to step through the new
> >> ich9_apm_status_changed() callback to see if it was behaving compatibly
> >> with SeaBIOS).
> >>
> >> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
> >> crashes very quickly for me when running OVMF:
> >>
> >>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
> >>
> >> It is my understanding that there are patches on the list for this:
> >>
> >>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
> >>
> >> Anyway, the series rebases to v2.8.0-rc0 without as much as context
> >> differences.
> > 
> > Hi Laszlo,
> > 
> > sorry for the slightly delayed reply.
> > 
> > First of all, I'm wondering if we would be better off adding a new port
> > 0xB1 that is QEMU-specific, instead of reusing 0xB3.
> 
> Sure, I can look into that, if we agree that's the best way to proceed,
> for now. (Although I'm not really happy about the new memory region
> stuff it would require. :()
> 
> I CC'd Kevin to learn if he foresaw other uses for the APM_STS register
> in SeaBIOS.
> 
> BTW, what happens in QEMU if the guest reads an unimplemented port?
> Hm... unassigned_io_write() seems to be a no-op, and
> unassigned_io_read() returns all-bits-one. This means that for a new
> port, the negotiation protocol / values have to be reworked.
> 
> Port 0xB1 is occupied by ICH9 according to the spec:
> 
>   Table 9-2. Fixed I/O Ranges Decoded by Intel ® ICH9 (Sheet 2 of 2)
> 
>   I/O
>   Address  Read Target           Write Target          Internal Unit
>   -------  --------------------  --------------------  -------------
>   B0h–B1h  Interrupt Controller  Interrupt Controller  Interrupt
> 
> I wonder if we care -- after all, APM_STS (0xB3) is documented not to
> have any hardware effects ("scratchpad register").
+1 in favor of using scratch register as Laszlo suggests.


> > Second, I now remembered the reason why I was against broadcast SMI.
> > The reason is that it breaks hot-plug.
> 
> How does it break hot-plug? After reading your explanation below: is it
> that the broadcast SMI (possibly raised by the OS directly) would get to
> the new VCPU before the firmware relocated its SMBASE?
> 
> > 
> > On hot-plug, the firmware (if it wants to use SMI for anything secure)
> > must relocate the SMBASE of the newly-hotplugged CPU before the OS has
> > any chance to put its fangs on it.  This is because the OS can send
> > direct SMIs and is in control of the area at 0x30000.  So OVMF is
> > already broken with respect to hotplug,
> 
> Yes. We theorized that there could be further edk2 core modules that
> hadn't been open sourced yet, necessary for handling VCPU hotplug.
> 
> > but I am not yet sure if these
> > patches would break it further.
> 
> Hard to say without seeing those modules.
> 
> I will speculate: assuming that the non-public modules fit together with
> the public modules in some way, I expect the broadcast SMI shouldn't
> break them. The reason is that the broadcast SMI / traditional delivery
> are the default method in UefiCpuPkg, and in practice they work better
> (more reliably) with the rest of the edk2 infrastructure, in my
> experience, than the relaxed sync method.
> 
> In his review today, Jordan wrote
> <https://lists.01.org/pipermail/edk2-devel/2016-November/005088.html>,
> 
> > I'm glad we'll be using a mechanism that broadcasts to all the
> > processors like the real hardware. It is a bit unfortunate that it
> > doesn't go through the b2 port for it.
> 
> If broadcast is how real hardware does it (even by default!,
> apparently), I expect those as-yet unreleased, hotplug-handling modules
> in edk2 should be able to cope with broadcast.
> 
> > The full solution is to follow a protocol similar to what real hardware
> > does.
> 
> Real hardware seems to use broadcast, according to the above...
> 
> On 11/23/16 23:35, Paolo Bonzini wrote:
> 
> > On hot-plug, before the new CPU starts running, the DSDT should
> > generate an SMI (with a well-known value written to 0xB2).
> 
> I'm not sure I understand right. If it is the DSDT that writes to 0xB2,
> that's just another way to say, "the firmware vendor asks the operating
> system to write to 0xB2". If the malicious OS is smart enough, it can
> realize (from the hardware signal to run the ACPI GPE handler, IIRC)
> that a new VCPU is available, and simply not trigger the SMI.
> 
> > The handler
> > then:
> > 
> > 1) waits for SMM rendezvous
> > 
> > 2) unparks the hotplugged VCPU.  Until the VCPU is unparked, it doesn't
> > react to either INITs or of course SIPIs (this would need to be
> > implemented separately for both TCG and KVM! but only in QEMU, not in
> > the kernel).
> 
> Okay, this does plug the hole I sketched out above. This logic (the
> QEMU-specific unparking) can be done in another platform API in OVMF I
> guess (like those in SmmCpuFeaturesLib), but I wonder if we have to
> provide the infrastructure in platform code up to the separate SMI
> command handler. I think it again depends on those unreleased modules.
I don't really like parked CPU idea and related modifications to
CPU hotplug MMIO interface.

How about an alternative approach:

1) on CPU hotplug QEMU generates SMI (if SMIs are enabled)
   it doesn't matter if it's a directed (to being hotplugged CPU) or broadcast SMI)
   as hotplugged CPU is in reset state and won't handle SMI until it receives SIPI
   (see SDM: 34.2 SYSTEM MANAGEMENT INTERRUPT: NOTES)

2) waits for SMM rendezvous of all known CPUs (i.e except of a hotplugged one)

3) once in rendezvous, a master CPU restores memory at SMBASE to
   a known-good state and sends directed INIT/SIPI to the hotplugged CPU

4) the hotplugged CPU relocates SMBASE as pending SMI is processed before SIPI

5) SIPI handler registers hotplugged CPU in UefiCpuPkg/MpInitLib runtime data structures
   and switches CPU into x2APIC mode if necessary
   (maybe this could be done from SMI handler (safer) and SIPI handler could be just NOP)

6) control goes to OS that gets and processes GPE interrupt without as chance 
   to do dirty tricks on default SMBASE memory

step 2:
  process could be optimized if SMRR range would cover default SMBASE
  then only master CPU and the hotplugged one have to switch into SMM mode,
  this whole machine doesn't have to be stopped in SMM rendezvous.


BTW:
I don't see how broadcast SMI could be used reliably at initial boot 
as all present APs would race toward the same SMBASE when broadcast
SIPI is sent so it's either directed SMIs or directed SIPIs, I'm not
sure what edk2 actually does.

 
> > 3) relocates SMBASE
> > 
> > 4) records the presence of the new VCPU's APIC id for subsequent SMIs.
> > 
> > 
> > The other important things are:
> > 
> > * new CPU-hotplug controller (docs/specs/acpi_cpu_hotplug.txt)
> > interfaces.  I think this would only need a new bit in the write
> > register at 0x4 ("CPU device control fields").
> > 
> > The 0x0-0x3 write register, currently reserved for reading, might
> > become read/write for easier communication with the SMI handler.  The
> > SMI handler would write 1 to the new bit in order to unpark the VCPU.
> > 
> > * how to enable this.  I think it would need a new SMM feature, just
> > like those that you are adding here, in order to make it negotiable.  If
> > it is not negotiated, but broadcast SMI is negotiated, you'd need to do
> > something such as not allowing CPU-hotplug.  (This is the only part that
> > I think is required for 2.9).
> 
> My hope is that we can work on this incrementally (or at least that we
> don't forfeit our chance at SMM + VCPU hotplug) by going down the
> broadcast SMI route first. Based on what Jordan wrote, this hope should
> not be unfounded.
> 
> Also, the SMM stack (across components: core edk2 code, OVMF platform
> code, QEMU and KVM) is now more than a year old, and we still have
> stability problems (hence this series). I'd like to stabilize the base
> features before getting wild with hotplug.
> 
> (The fact that OVMF lags behind SeaBIOS in a number of features is just
> business as usual.)
> 
> So, I'd vastly prefer if I needed to extend this series only with a way
> to prevent VCPUs from being hotplugged. I think negotiating SMI
> broadcast should be good enough for that, as a hook point, given that it
> runs in the firmware before the OS gets control (on S3 resume too).
> 
> I guess a global variable (or a board-level query function) that would
> cause pc_query_hotpluggable_cpus() to return an empty list would be
> frowned upon; what's the recommended method?
> 
> There is DeviceClass.hotpluggable, but that's not a dynamic property.
> 
> > 
> > In order to trigger the SMI, the
> > 
> >                  ifctx = aml_if(aml_equal(ins_evt, one));
> >                  {
> >                      aml_append(ifctx,
> >                          aml_call2(CPU_NOTIFY_METHOD, cpu_data, dev_chk));
> >                      aml_append(ifctx, aml_store(one, ins_evt));
> >                      aml_append(ifctx, aml_store(one, has_event));
> >                  }
> > 
> > would be replaced by something like this pseudo-AML:
> > 
> > 		Store(And(smm_features, 2), Local1)
> > 		...
> > 		Store(next_cpu_cmd, cpu_cmd)
> > 		If (Equal(ins_evt, One)) {
> > 			If (Greater(Local1, Zero)) {
> > 				Store(CPU_HP_APM_CMD, apm_cmd)
> > 			}
> > 			CPU_NOTIFY_METHOD(cpu_data, dev_chk)
> > 			Store(One, ins_evt)
> > 			Store(One, has_event)
> > 		}
> > 
> > 
> > Of course all this is for OVMF only.  SeaBIOS doesn't need to do
> > anything of this, because it actually likes to have its SMIs only on the
> > current CPU (and it had better be the BSP, since SMBASE is not relocated
> > elsewhere!).
> > 
> > Igor, any thoughts?
> > 
> > I understand that this is quite huge in both OVMF and QEMU, but we've
> > only been delaying it and we knew about it. :(
> 
> I agree about being aware of lack of VCPU hotplug support wrt. SMM.
> 
> I disagree about delaying it. I've been aware of problems with the base
> SMM features (of differing importance), for example
> 
> - that S3 resume with the Ia32 build of OVMF was almost hit-or-miss,
> 
> - or that the heavy use of UEFI variable services during Windows 8 boot
> caused user-visible choppiness in the rotating animation (because of how
> our SMIs worked) -- I think I even mentioned this to you.
> 
> Since the stalemate in
> 
>   http://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg00125.html
> 
> from last November, I've never stopped disliking the lack of broadcast
> SMIs, but I couldn't do anything about it, despite it leaving the base
> feature set unreliable.
> 
> Recently, the open source edk2 SMM stack has acquired a bunch of new
> code (you're aware), which exacerbate the instabilities (as I've
> documented on the edk2-devel list in excruciating detail). I was
> overjoyed when you finally suggested (!) to add broadcast SMI, getting
> the discussion un-stuck, because (a) it miraculously fixes everything,
> and (b) honestly, I see no other way from keeping the SMM stack from
> falling apart otherwise, *without* VCPU hotplug in the picture.
> 
> For the last few weeks, I've been working day and night (I'm not
> exaggerating) on testing & reviewing edk2 patches related to SMM or even
> just multiprocessing, catching errors, reporting them, and even
> debugging and fixing them myself. (See for example commits
> 
>   00650c531a90 UefiCpuPkg/MpInitLib/X64/MpFuncs.nasm: fix fatal typo
>   4af3ae146379 UefiCpuPkg/LocalApicLib: fix feature test for Extended
>                Topology CPUID leaf
>   1cbd83308989 UefiCpuPkg/MpInitLib: fix feature test for Extended
>                Topology CPUID leaf
> 
> ). Putting out fires more or less as a norm.
> 
> Furthermore, I haven't been pretending that VCPU hotplug "doesn't
> exist", see
> 
>   https://lists.01.org/pipermail/edk2-devel/2016-November/005131.html
> 
> for example, which I posted independently, ~8 hours ago.
> 
> So, no. I reject the idea that we've been procrastinating SMM enabled
> VCPU hotplug. We can't build the second floor while the foundation is
> shaking. And then we need to ask Intel to release more code as open source.
> 
> Thanks
> Laszlo
> 
> > 
> > Paolo
> > 
> >> Cc: "Kevin O'Connor" <kevin@koconnor.net>
> >> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> >> Cc: Gerd Hoffmann <kraxel@redhat.com>
> >> Cc: Paolo Bonzini <pbonzini@redhat.com>
> >>
> >> Thanks
> >> Laszlo
> >>
> >> Laszlo Ersek (3):
> >>   hw/isa/apm: introduce callback for APM_STS_IOPORT writes
> >>   hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS
> >>   hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs
> >>
> >>  docs/specs/q35-apm-sts.txt | 80 ++++++++++++++++++++++++++++++++++++++++++++++
> >>  include/hw/i386/ich9.h     |  9 ++++++
> >>  include/hw/isa/apm.h       |  9 +++---
> >>  hw/acpi/piix4.c            |  2 +-
> >>  hw/isa/apm.c               | 15 ++++++---
> >>  hw/isa/lpc_ich9.c          | 64 +++++++++++++++++++++++++++++++++++--
> >>  hw/isa/vt82c686.c          |  2 +-
> >>  7 files changed, 168 insertions(+), 13 deletions(-)
> >>  create mode 100644 docs/specs/q35-apm-sts.txt
> >>
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-24 14:55     ` Igor Mammedov
@ 2016-11-24 17:05       ` Paolo Bonzini
  2016-11-24 18:02         ` Igor Mammedov
  0 siblings, 1 reply; 27+ messages in thread
From: Paolo Bonzini @ 2016-11-24 17:05 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Laszlo Ersek, qemu devel list, Kevin O'Connor,
	Michael S. Tsirkin, Gerd Hoffmann, Jordan Justen (Intel address),
	Michael Kinney


> > Okay, this does plug the hole I sketched out above. This logic (the
> > QEMU-specific unparking) can be done in another platform API in OVMF I
> > guess (like those in SmmCpuFeaturesLib), but I wonder if we have to
> > provide the infrastructure in platform code up to the separate SMI
> > command handler. I think it again depends on those unreleased modules.
>
> I don't really like parked CPU idea and related modifications to
> CPU hotplug MMIO interface.
> 
> How about an alternative approach:
> 
> 1) on CPU hotplug QEMU generates SMI (if SMIs are enabled)
>    it doesn't matter if it's a directed (to being hotplugged CPU) or
>    broadcast SMI)
>    as hotplugged CPU is in reset state and won't handle SMI until it receives
>    SIPI (see SDM: 34.2 SYSTEM MANAGEMENT INTERRUPT: NOTES)

Parked CPUs are exactly how it works on real hardware (the arbitrator is the
BMC, while we have QEMU in its place).  The problem is that, if you just
place the hotplugged CPU in reset state, there is a race between the OSPM
and the firmware.  The OSPM can place its own code at 0x30000 and send
INIT/SIPI/SMI before the firmware gets round to doing it.

> BTW:
> I don't see how broadcast SMI could be used reliably at initial boot
> as all present APs would race toward the same SMBASE when broadcast
> SIPI is sent so it's either directed SMIs or directed SIPIs, I'm not
> sure what edk2 actually does.

It uses directed SMIs via local APIC, to relocate SMBASE one vCPU at a
time.

Paolo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-24 17:05       ` Paolo Bonzini
@ 2016-11-24 18:02         ` Igor Mammedov
  2016-11-25  8:55           ` Paolo Bonzini
  0 siblings, 1 reply; 27+ messages in thread
From: Igor Mammedov @ 2016-11-24 18:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Laszlo Ersek, qemu devel list, Kevin O'Connor,
	Michael S. Tsirkin, Gerd Hoffmann, Jordan Justen (Intel address),
	Michael Kinney

On Thu, 24 Nov 2016 12:05:55 -0500 (EST)
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> > > Okay, this does plug the hole I sketched out above. This logic (the
> > > QEMU-specific unparking) can be done in another platform API in OVMF I
> > > guess (like those in SmmCpuFeaturesLib), but I wonder if we have to
> > > provide the infrastructure in platform code up to the separate SMI
> > > command handler. I think it again depends on those unreleased modules.
> >
> > I don't really like parked CPU idea and related modifications to
> > CPU hotplug MMIO interface.
> > 
> > How about an alternative approach:
> > 
> > 1) on CPU hotplug QEMU generates SMI (if SMIs are enabled)
> >    it doesn't matter if it's a directed (to being hotplugged CPU) or
> >    broadcast SMI)
> >    as hotplugged CPU is in reset state and won't handle SMI until it receives
> >    SIPI (see SDM: 34.2 SYSTEM MANAGEMENT INTERRUPT: NOTES)
> 
> Parked CPUs are exactly how it works on real hardware (the arbitrator is the
> BMC, while we have QEMU in its place).  The problem is that, if you just
> place the hotplugged CPU in reset state, there is a race between the OSPM
> and the firmware.  The OSPM can place its own code at 0x30000 and send
> INIT/SIPI/SMI before the firmware gets round to doing it.
if 0x30000 were covered by SMRR range, then OSPM wouldn't able to
place its own code there and there wouldn't be any need in side interfaces
to put and get CPU in/from some undefined by spec state (parked).

But above would imply a large block allocated at 0x30000 to fit all
possible CPUs+1, not sure if it's doable (maybe edk2 wouldn't have
big issues with reserving large block in lowmem).

Suggestion to use CPU hotplug MMIO interface to unpark CPU also doesn't
seem to be secure as it's not protected from OSPM, hence OPSM could
unpark CPU and hijack SMBASE all the same.
It looks like we need only SMM accessible guest/host interface to make
CPU unparking secure or cover default SMBASE by SMRR.

> > BTW:
> > I don't see how broadcast SMI could be used reliably at initial boot
> > as all present APs would race toward the same SMBASE when broadcast
> > SIPI is sent so it's either directed SMIs or directed SIPIs, I'm not
> > sure what edk2 actually does.
> 
> It uses directed SMIs via local APIC, to relocate SMBASE one vCPU at a
> time.
> 
> Paolo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-24  8:37         ` Laszlo Ersek
@ 2016-11-25  4:00           ` Michael S. Tsirkin
  2016-11-25 12:31             ` Laszlo Ersek
  0 siblings, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2016-11-25  4:00 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Kevin O'Connor, Paolo Bonzini, qemu devel list,
	Gerd Hoffmann, Igor Mammedov, Jordan Justen (Intel address),
	Michael Kinney

On Thu, Nov 24, 2016 at 09:37:41AM +0100, Laszlo Ersek wrote:
> On 11/24/16 05:29, Michael S. Tsirkin wrote:
> > On Wed, Nov 23, 2016 at 07:38:35PM -0500, Kevin O'Connor wrote:
> >> As a general comment - it does seem unfortunate that we keep building
> >> adhoc interfaces to communicate information from firmware to QEMU.  We
> >> have a generic mechanism (fw_cfg) for passing adhoc information from
> >> QEMU to the firmware, but the inverse seems to always involve magic
> >> pci registers, magic io space registers, specific init ordering, etc.
> > 
> > FWIW I posted a proposal
> > 	fw-cfg: support writeable blobs
> > a while ago to try to address that
> > 
> 
> Yes, here's the discussion (Feb 2016):
> 
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg354852.html
> 
> and it was even part of a pull req (Mar 2016):
> 
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg359348.html
> 
> but it wasn't merged, apparently.
> 
> If QEMU (re)gains this feature, I can try basing the broadcast SMI
> negotiation on it.
> 
> Thanks
> Laszlo

I dropped since it wasn't used yet. Go ahead and include it
in your patchset if you like it.

-- 
MST

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-24 18:02         ` Igor Mammedov
@ 2016-11-25  8:55           ` Paolo Bonzini
  2016-11-25 14:10             ` Igor Mammedov
  0 siblings, 1 reply; 27+ messages in thread
From: Paolo Bonzini @ 2016-11-25  8:55 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Laszlo Ersek, qemu devel list, Kevin O'Connor,
	Michael S. Tsirkin, Gerd Hoffmann, Jordan Justen (Intel address),
	Michael Kinney


> > Parked CPUs are exactly how it works on real hardware (the arbitrator is
> > the
> > BMC, while we have QEMU in its place).  The problem is that, if you just
> > place the hotplugged CPU in reset state, there is a race between the OSPM
> > and the firmware.  The OSPM can place its own code at 0x30000 and send
> > INIT/SIPI/SMI before the firmware gets round to doing it.
>
> if 0x30000 were covered by SMRR range, then OSPM wouldn't able to
> place its own code there and there wouldn't be any need in side interfaces
> to put and get CPU in/from some undefined by spec state (parked).
> 
> But above would imply a large block allocated at 0x30000 to fit all
> possible CPUs+1, not sure if it's doable (maybe edk2 wouldn't have
> big issues with reserving large block in lowmem).

If you mean that the default SMRR range would include 0x30000 for an hotplugged
CPU, that would work but it is a significant departure from real hardware.
I'd rather avoid that.

> Suggestion to use CPU hotplug MMIO interface to unpark CPU also doesn't
> seem to be secure as it's not protected from OSPM, hence OPSM could
> unpark CPU and hijack SMBASE all the same.
> It looks like we need only SMM accessible guest/host interface to make
> CPU unparking secure or cover default SMBASE by SMRR.

Yes, unparking would be accessible only from SMM if the unparking feature
is negotiated.

Paolo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-25  4:00           ` Michael S. Tsirkin
@ 2016-11-25 12:31             ` Laszlo Ersek
  2016-11-25 12:40               ` Laszlo Ersek
  2016-11-25 14:22               ` Igor Mammedov
  0 siblings, 2 replies; 27+ messages in thread
From: Laszlo Ersek @ 2016-11-25 12:31 UTC (permalink / raw)
  To: Michael S. Tsirkin, Paolo Bonzini
  Cc: Kevin O'Connor, qemu devel list, Gerd Hoffmann,
	Igor Mammedov, Jordan Justen (Intel address),
	Michael Kinney

On 11/25/16 05:00, Michael S. Tsirkin wrote:
> On Thu, Nov 24, 2016 at 09:37:41AM +0100, Laszlo Ersek wrote:
>> On 11/24/16 05:29, Michael S. Tsirkin wrote:
>>> On Wed, Nov 23, 2016 at 07:38:35PM -0500, Kevin O'Connor wrote:
>>>> As a general comment - it does seem unfortunate that we keep building
>>>> adhoc interfaces to communicate information from firmware to QEMU.  We
>>>> have a generic mechanism (fw_cfg) for passing adhoc information from
>>>> QEMU to the firmware, but the inverse seems to always involve magic
>>>> pci registers, magic io space registers, specific init ordering, etc.
>>>
>>> FWIW I posted a proposal
>>> 	fw-cfg: support writeable blobs
>>> a while ago to try to address that
>>>
>>
>> Yes, here's the discussion (Feb 2016):
>>
>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg354852.html
>>
>> and it was even part of a pull req (Mar 2016):
>>
>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg359348.html
>>
>> but it wasn't merged, apparently.
>>
>> If QEMU (re)gains this feature, I can try basing the broadcast SMI
>> negotiation on it.
>>
>> Thanks
>> Laszlo
> 
> I dropped since it wasn't used yet. Go ahead and include it
> in your patchset if you like it.
> 

I can do that, but I'm no longer sure Paolo still approves of the
broadcast SMI idea.

The way I see it, I can work on getting broadcast SMI functional (with
whichever negotiation method we deem suitable), and make the basic
feature set (which ignores VCPU hotplug entirely) reliable, for now.
Then, later, we can look into VCPU hotplug. VCPU hotplug is already
broken in OVMF and whatever we do for broadcast SMI cannot worsen the
user-observable VCPU hotplug status.

(Note that VCPU hotplug will require a whole bunch of non-platform code
in edk2, such as UefiCpuPkg/Library/MpInitLib, UefiCpuPkg/CpuMpPei, and
UefiCpuPkg/CpuDxe, UefiCpuPkg/PiSmmCpuDxeSmm.)

Or, we can delay (or even drop) broadcast SMI until we divine a design
that, with a lot of code everywhere, makes (a) the basic SMM feature set
reliable *and* (b) VCPU hotplug functional, at the exact same time. I'm
not saying this is impossible, but you'll need a better guy for that
than I am. I always work in incremental, small steps, especially where
the subject matter is hard for me to grasp. If you see me walking down
the wrong path and yank me back, that's appreciated, but the broadcast
SMI idea doesn't look wrong, considering feedback from Jordan as to what
real hardware does (and the edk2 UefiCpuPkg.dec default settings).

So, I'm asking for guidance on:

(1) what interface would be preferred for negotiating SMI:
(1a) APM_STS
(1b) writeable fw_cfg
(1c) another IO port

(2) whether to prevent, and if so, how exactly, VCPU hotplug, when the
broadcast SMI is negotiated. By "how", I mean "what code to modify in QEMU".

For (1), my preference is (1a), simply because it's ready. I do see
perspective in (1b) writeable fw_cfg, so if that's preferred, I can
include Michael's patch and rework my patches to utilize it. (As far as
I see, the textual documentation for fw_cfg is not extended by Michael's
patch, so I guess I'd have to do that too.) (1c) is inferior to (1b) in
my opinion and shouldn't be chosen.

Re (2), I'm clueless. Not sure we should care about it. Even if it is a
security problem, that problem exists within the guest, and triggering
it (i.e., hot-plugging a VCPU) requires a host admin action. The host
admin can cause a bunch of other security problems for the guest, via
different misconfigurations. So if we care about (2), it should be
something minimal, to catch "inadvertent" VCPU hotplug attempts. And
then please tell me what code to mess with.

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-25 12:31             ` Laszlo Ersek
@ 2016-11-25 12:40               ` Laszlo Ersek
  2016-11-28  9:01                 ` Gerd Hoffmann
  2016-11-25 14:22               ` Igor Mammedov
  1 sibling, 1 reply; 27+ messages in thread
From: Laszlo Ersek @ 2016-11-25 12:40 UTC (permalink / raw)
  To: Michael S. Tsirkin, Paolo Bonzini
  Cc: Kevin O'Connor, qemu devel list, Gerd Hoffmann,
	Igor Mammedov, Jordan Justen (Intel address),
	Michael Kinney

On 11/25/16 13:31, Laszlo Ersek wrote:
> On 11/25/16 05:00, Michael S. Tsirkin wrote:
>> On Thu, Nov 24, 2016 at 09:37:41AM +0100, Laszlo Ersek wrote:
>>> On 11/24/16 05:29, Michael S. Tsirkin wrote:
>>>> On Wed, Nov 23, 2016 at 07:38:35PM -0500, Kevin O'Connor wrote:
>>>>> As a general comment - it does seem unfortunate that we keep building
>>>>> adhoc interfaces to communicate information from firmware to QEMU.  We
>>>>> have a generic mechanism (fw_cfg) for passing adhoc information from
>>>>> QEMU to the firmware, but the inverse seems to always involve magic
>>>>> pci registers, magic io space registers, specific init ordering, etc.
>>>>
>>>> FWIW I posted a proposal
>>>> 	fw-cfg: support writeable blobs
>>>> a while ago to try to address that
>>>>
>>>
>>> Yes, here's the discussion (Feb 2016):
>>>
>>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg354852.html
>>>
>>> and it was even part of a pull req (Mar 2016):
>>>
>>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg359348.html
>>>
>>> but it wasn't merged, apparently.
>>>
>>> If QEMU (re)gains this feature, I can try basing the broadcast SMI
>>> negotiation on it.
>>>
>>> Thanks
>>> Laszlo
>>
>> I dropped since it wasn't used yet. Go ahead and include it
>> in your patchset if you like it.
>>
> 
> I can do that, but I'm no longer sure Paolo still approves of the
> broadcast SMI idea.
> 
> The way I see it, I can work on getting broadcast SMI functional (with
> whichever negotiation method we deem suitable), and make the basic
> feature set (which ignores VCPU hotplug entirely) reliable, for now.
> Then, later, we can look into VCPU hotplug. VCPU hotplug is already
> broken in OVMF and whatever we do for broadcast SMI cannot worsen the
> user-observable VCPU hotplug status.
> 
> (Note that VCPU hotplug will require a whole bunch of non-platform code
> in edk2, such as UefiCpuPkg/Library/MpInitLib, UefiCpuPkg/CpuMpPei, and
> UefiCpuPkg/CpuDxe, UefiCpuPkg/PiSmmCpuDxeSmm.)
> 
> Or, we can delay (or even drop) broadcast SMI until we divine a design
> that, with a lot of code everywhere, makes (a) the basic SMM feature set
> reliable *and* (b) VCPU hotplug functional, at the exact same time. I'm
> not saying this is impossible, but you'll need a better guy for that
> than I am. I always work in incremental, small steps, especially where
> the subject matter is hard for me to grasp. If you see me walking down
> the wrong path and yank me back, that's appreciated, but the broadcast
> SMI idea doesn't look wrong, considering feedback from Jordan as to what
> real hardware does (and the edk2 UefiCpuPkg.dec default settings).
> 
> So, I'm asking for guidance on:
> 
> (1) what interface would be preferred for negotiating SMI:
> (1a) APM_STS
> (1b) writeable fw_cfg
> (1c) another IO port
> 
> (2) whether to prevent, and if so, how exactly, VCPU hotplug, when the
> broadcast SMI is negotiated. By "how", I mean "what code to modify in QEMU".
> 
> For (1), my preference is (1a), simply because it's ready. I do see
> perspective in (1b) writeable fw_cfg, so if that's preferred, I can
> include Michael's patch and rework my patches to utilize it. (As far as
> I see, the textual documentation for fw_cfg is not extended by Michael's
> patch, so I guess I'd have to do that too.)

If I understand correctly, one argument against the current state of
writeable fw_cfg, captured in
<https://www.mail-archive.com/qemu-devel@nongnu.org/msg354983.html>, is
that callbacks on write are not supported. Apparently, QEMU code that
uses the data written by the guest is supposed to just read that data,
not to expect a notification about it.

I'm unsure how this can work for actual negotiation, where the guest
usually does a read/write/read cycle, and expects some kind of change
between steps #2 and #3. I don't see how that can be implemented in QEMU
without write callbacks (i.e. how QEMU can confirm or reject the
negotiation attempt).

Thanks
Laszlo


> (1c) is inferior to (1b) in
> my opinion and shouldn't be chosen.
> 
> Re (2), I'm clueless. Not sure we should care about it. Even if it is a
> security problem, that problem exists within the guest, and triggering
> it (i.e., hot-plugging a VCPU) requires a host admin action. The host
> admin can cause a bunch of other security problems for the guest, via
> different misconfigurations. So if we care about (2), it should be
> something minimal, to catch "inadvertent" VCPU hotplug attempts. And
> then please tell me what code to mess with.
> 
> Thanks
> Laszlo
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-25  8:55           ` Paolo Bonzini
@ 2016-11-25 14:10             ` Igor Mammedov
  2016-11-28  9:41               ` Paolo Bonzini
  0 siblings, 1 reply; 27+ messages in thread
From: Igor Mammedov @ 2016-11-25 14:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Michael S. Tsirkin, Jordan Justen (Intel address),
	qemu devel list, Kevin O'Connor, Gerd Hoffmann,
	Michael Kinney, Laszlo Ersek

On Fri, 25 Nov 2016 03:55:29 -0500 (EST)
Paolo Bonzini <pbonzini@redhat.com> wrote:

> > > Parked CPUs are exactly how it works on real hardware (the arbitrator is
> > > the
> > > BMC, while we have QEMU in its place).  The problem is that, if you just
> > > place the hotplugged CPU in reset state, there is a race between the OSPM
> > > and the firmware.  The OSPM can place its own code at 0x30000 and send
> > > INIT/SIPI/SMI before the firmware gets round to doing it.  
> >
> > if 0x30000 were covered by SMRR range, then OSPM wouldn't able to
> > place its own code there and there wouldn't be any need in side interfaces
> > to put and get CPU in/from some undefined by spec state (parked).
> > 
> > But above would imply a large block allocated at 0x30000 to fit all
> > possible CPUs+1, not sure if it's doable (maybe edk2 wouldn't have
> > big issues with reserving large block in lowmem).  
> 
> If you mean that the default SMRR range would include 0x30000 for an hotplugged
> CPU, that would work but it is a significant departure from real hardware.
> I'd rather avoid that.
But that's guest side only solution to guest problem, that won't require
any assistance from QEMU/KVM.
Baremetal would also benefit from it as it won't need to implement unpark
logic anymore. it should also work for existing HW that has unpark logic.

Do you have any pointers to how hardware does unparking now?

> > Suggestion to use CPU hotplug MMIO interface to unpark CPU also doesn't
> > seem to be secure as it's not protected from OSPM, hence OPSM could
> > unpark CPU and hijack SMBASE all the same.
> > It looks like we need only SMM accessible guest/host interface to make
> > CPU unparking secure or cover default SMBASE by SMRR.  
> 
> Yes, unparking would be accessible only from SMM if the unparking feature
> is negotiated.
I suppose we could check in MMIO handler that all CPUs are in SMM mode
before allowing unparking or ignore command if they are not.

For compat reasons unpark should be optin feature (i.e. firmware should
explicitly enable it to avoid breaking existing configurations (SeaBIOS))

> 
> Paolo
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-25 12:31             ` Laszlo Ersek
  2016-11-25 12:40               ` Laszlo Ersek
@ 2016-11-25 14:22               ` Igor Mammedov
  1 sibling, 0 replies; 27+ messages in thread
From: Igor Mammedov @ 2016-11-25 14:22 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Michael S. Tsirkin, Paolo Bonzini, Kevin O'Connor,
	qemu devel list, Gerd Hoffmann, Jordan Justen (Intel address),
	Michael Kinney

On Fri, 25 Nov 2016 13:31:17 +0100
Laszlo Ersek <lersek@redhat.com> wrote:

> On 11/25/16 05:00, Michael S. Tsirkin wrote:
> > On Thu, Nov 24, 2016 at 09:37:41AM +0100, Laszlo Ersek wrote:  
> >> On 11/24/16 05:29, Michael S. Tsirkin wrote:  
> >>> On Wed, Nov 23, 2016 at 07:38:35PM -0500, Kevin O'Connor wrote:  
> >>>> As a general comment - it does seem unfortunate that we keep building
> >>>> adhoc interfaces to communicate information from firmware to QEMU.  We
> >>>> have a generic mechanism (fw_cfg) for passing adhoc information from
> >>>> QEMU to the firmware, but the inverse seems to always involve magic
> >>>> pci registers, magic io space registers, specific init ordering, etc.  
> >>>
> >>> FWIW I posted a proposal
> >>> 	fw-cfg: support writeable blobs
> >>> a while ago to try to address that
> >>>  
> >>
> >> Yes, here's the discussion (Feb 2016):
> >>
> >> https://www.mail-archive.com/qemu-devel@nongnu.org/msg354852.html
> >>
> >> and it was even part of a pull req (Mar 2016):
> >>
> >> https://www.mail-archive.com/qemu-devel@nongnu.org/msg359348.html
> >>
> >> but it wasn't merged, apparently.
> >>
> >> If QEMU (re)gains this feature, I can try basing the broadcast SMI
> >> negotiation on it.
> >>
> >> Thanks
> >> Laszlo  
> > 
> > I dropped since it wasn't used yet. Go ahead and include it
> > in your patchset if you like it.
> >   
> 
> I can do that, but I'm no longer sure Paolo still approves of the
> broadcast SMI idea.
> 
> The way I see it, I can work on getting broadcast SMI functional (with
> whichever negotiation method we deem suitable), and make the basic
> feature set (which ignores VCPU hotplug entirely) reliable, for now.
> Then, later, we can look into VCPU hotplug. VCPU hotplug is already
> broken in OVMF and whatever we do for broadcast SMI cannot worsen the
> user-observable VCPU hotplug status.
> 
> (Note that VCPU hotplug will require a whole bunch of non-platform code
> in edk2, such as UefiCpuPkg/Library/MpInitLib, UefiCpuPkg/CpuMpPei, and
> UefiCpuPkg/CpuDxe, UefiCpuPkg/PiSmmCpuDxeSmm.)
> 
> Or, we can delay (or even drop) broadcast SMI until we divine a design
> that, with a lot of code everywhere, makes (a) the basic SMM feature set
> reliable *and* (b) VCPU hotplug functional, at the exact same time. I'm
> not saying this is impossible, but you'll need a better guy for that
> than I am. I always work in incremental, small steps, especially where
> the subject matter is hard for me to grasp. If you see me walking down
> the wrong path and yank me back, that's appreciated, but the broadcast
> SMI idea doesn't look wrong, considering feedback from Jordan as to what
> real hardware does (and the edk2 UefiCpuPkg.dec default settings).
> 
> So, I'm asking for guidance on:
> 
> (1) what interface would be preferred for negotiating SMI:
> (1a) APM_STS
> (1b) writeable fw_cfg
> (1c) another IO port
> 
> (2) whether to prevent, and if so, how exactly, VCPU hotplug, when the
> broadcast SMI is negotiated. By "how", I mean "what code to modify in QEMU".
> 
> For (1), my preference is (1a), simply because it's ready. I do see
> perspective in (1b) writeable fw_cfg, so if that's preferred, I can
> include Michael's patch and rework my patches to utilize it. (As far as
> I see, the textual documentation for fw_cfg is not extended by Michael's
> patch, so I guess I'd have to do that too.) (1c) is inferior to (1b) in
> my opinion and shouldn't be chosen.
> 
> Re (2), I'm clueless. Not sure we should care about it. Even if it is a
> security problem, that problem exists within the guest, and triggering
> it (i.e., hot-plugging a VCPU) requires a host admin action. The host
> admin can cause a bunch of other security problems for the guest, via
> different misconfigurations. So if we care about (2), it should be
> something minimal, to catch "inadvertent" VCPU hotplug attempts. And
> then please tell me what code to mess with.
I'd ignore (2) for now as VCPU hotplug is broken now anyway.

The only difference between broadcast and unicast SMIs, I see here is
that hotplugged VCPU will attempt to enter SMM handler at default
SMBASE when it gets SIPI while with unicast SMIs there won't be
pending SMI on hotplugged CPUs.
So in broadcast case SMBASE should be valid and do SMBASE relocation
on the first entry.


> 
> Thanks
> Laszlo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-25 12:40               ` Laszlo Ersek
@ 2016-11-28  9:01                 ` Gerd Hoffmann
  2016-11-28 10:22                   ` Laszlo Ersek
  0 siblings, 1 reply; 27+ messages in thread
From: Gerd Hoffmann @ 2016-11-28  9:01 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Michael S. Tsirkin, Paolo Bonzini, Kevin O'Connor,
	qemu devel list, Igor Mammedov, Jordan Justen (Intel address),
	Michael Kinney

  Hi,

> If I understand correctly, one argument against the current state of
> writeable fw_cfg, captured in
> <https://www.mail-archive.com/qemu-devel@nongnu.org/msg354983.html>, is
> that callbacks on write are not supported. Apparently, QEMU code that
> uses the data written by the guest is supposed to just read that data,
> not to expect a notification about it.
> 
> I'm unsure how this can work for actual negotiation, where the guest
> usually does a read/write/read cycle, and expects some kind of change
> between steps #2 and #3. I don't see how that can be implemented in QEMU
> without write callbacks (i.e. how QEMU can confirm or reject the
> negotiation attempt).

Do you actually need negotiation?  I think you only need to know
whenever broadcast-smi is supported, and the presence of the
etc/broadcast-smi (or however we name that) fw_cfg file indicates that.
If it is there just write true/false to it to enable/disable, and qemu
checks the field each time a smi is raised.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-25 14:10             ` Igor Mammedov
@ 2016-11-28  9:41               ` Paolo Bonzini
  2016-11-28 11:24                 ` Igor Mammedov
  0 siblings, 1 reply; 27+ messages in thread
From: Paolo Bonzini @ 2016-11-28  9:41 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Michael S. Tsirkin, Jordan Justen (Intel address),
	qemu devel list, Kevin O'Connor, Gerd Hoffmann,
	Michael Kinney, Laszlo Ersek



On 25/11/2016 15:10, Igor Mammedov wrote:
> On Fri, 25 Nov 2016 03:55:29 -0500 (EST)
> Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> if 0x30000 were covered by SMRR range, then OSPM wouldn't able to
>>> place its own code there and there wouldn't be any need in side interfaces
>>> to put and get CPU in/from some undefined by spec state (parked).
>>>
>>> But above would imply a large block allocated at 0x30000 to fit all
>>> possible CPUs+1, not sure if it's doable (maybe edk2 wouldn't have
>>> big issues with reserving large block in lowmem).  
>>
>> If you mean that the default SMRR range would include 0x30000 for an hotplugged
>> CPU, that would work but it is a significant departure from real hardware.
>> I'd rather avoid that.
> 
> But that's guest side only solution to guest problem, that won't require
> any assistance from QEMU/KVM.

No, I don't think it can be guest-only.  The initial value of SMRRs is
undefined, and SMRRs are per processor.  The newly-hotplugged CPU has no
SMRRs defined when it is started up.

> Baremetal would also benefit from it as it won't need to implement unpark
> logic anymore. it should also work for existing HW that has unpark logic.
> 
> Do you have any pointers to how hardware does unparking now?

The firmware tells the BMC to do it.  I don't know what the firmware-BMC
interface looks like.

>>> It looks like we need only SMM accessible guest/host interface to make
>>> CPU unparking secure or cover default SMBASE by SMRR.  
>>
>> Yes, unparking would be accessible only from SMM if the unparking feature
>> is negotiated.
> 
> I suppose we could check in MMIO handler that all CPUs are in SMM mode
> before allowing unparking or ignore command if they are not.
> 
> For compat reasons unpark should be optin feature (i.e. firmware should
> explicitly enable it to avoid breaking existing configurations (SeaBIOS))

Yes, of course---that's why I'm bringing up in the context of this series.

Paolo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-28  9:01                 ` Gerd Hoffmann
@ 2016-11-28 10:22                   ` Laszlo Ersek
  2016-11-28 11:53                     ` Paolo Bonzini
  0 siblings, 1 reply; 27+ messages in thread
From: Laszlo Ersek @ 2016-11-28 10:22 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Michael S. Tsirkin, Paolo Bonzini, Kevin O'Connor,
	qemu devel list, Igor Mammedov, Jordan Justen (Intel address),
	Michael Kinney

On 11/28/16 10:01, Gerd Hoffmann wrote:
>   Hi,
> 
>> If I understand correctly, one argument against the current state of
>> writeable fw_cfg, captured in
>> <https://www.mail-archive.com/qemu-devel@nongnu.org/msg354983.html>, is
>> that callbacks on write are not supported. Apparently, QEMU code that
>> uses the data written by the guest is supposed to just read that data,
>> not to expect a notification about it.
>>
>> I'm unsure how this can work for actual negotiation, where the guest
>> usually does a read/write/read cycle, and expects some kind of change
>> between steps #2 and #3. I don't see how that can be implemented in QEMU
>> without write callbacks (i.e. how QEMU can confirm or reject the
>> negotiation attempt).
> 
> Do you actually need negotiation?  I think you only need to know
> whenever broadcast-smi is supported, and the presence of the
> etc/broadcast-smi (or however we name that) fw_cfg file indicates that.

Michael suggested to use negotiation like virtio does (where the host
can reject invalid combinations of requested features):

http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg03077.html
http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg03218.html

I thought negotiation was overkill:

http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg03123.html

but I figured I'd just stick with the recommendation.

> If it is there just write true/false to it to enable/disable, and qemu
> checks the field each time a smi is raised.

If we agree there's no need for negotiation, I'm game.

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-28  9:41               ` Paolo Bonzini
@ 2016-11-28 11:24                 ` Igor Mammedov
  2016-11-28 11:51                   ` Paolo Bonzini
  0 siblings, 1 reply; 27+ messages in thread
From: Igor Mammedov @ 2016-11-28 11:24 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Michael S. Tsirkin, Jordan Justen (Intel address),
	qemu devel list, Kevin O'Connor, Gerd Hoffmann,
	Michael Kinney, Laszlo Ersek

On Mon, 28 Nov 2016 10:41:42 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 25/11/2016 15:10, Igor Mammedov wrote:
> > On Fri, 25 Nov 2016 03:55:29 -0500 (EST)
> > Paolo Bonzini <pbonzini@redhat.com> wrote:  
> >>> if 0x30000 were covered by SMRR range, then OSPM wouldn't able to
> >>> place its own code there and there wouldn't be any need in side interfaces
> >>> to put and get CPU in/from some undefined by spec state (parked).
> >>>
> >>> But above would imply a large block allocated at 0x30000 to fit all
> >>> possible CPUs+1, not sure if it's doable (maybe edk2 wouldn't have
> >>> big issues with reserving large block in lowmem).    
> >>
> >> If you mean that the default SMRR range would include 0x30000 for an hotplugged
> >> CPU, that would work but it is a significant departure from real hardware.
> >> I'd rather avoid that.  
> > 
> > But that's guest side only solution to guest problem, that won't require
> > any assistance from QEMU/KVM.  
> 
> No, I don't think it can be guest-only.  The initial value of SMRRs is
> undefined, and SMRRs are per processor.  The newly-hotplugged CPU has no
> SMRRs defined when it is started up.
Does it matter?
If 0x30000 is protected by SMRRs on exiting CPUs so OS can't tamper with it
and SMI is injected on hotplug
(i.e. hotplugged CPU arrives with SMI pin active, we can do this without
inventing PV solution to park/unpark CPU).

Then even if OS sends INIT/SIPI with code to hijack SMRAM that code
won't get a chance to run unrestricted as pending SMI will be processed
first and SMRAM will be relocated by firmware to per CPU area along with
configuring the hotplugged CPU's SMRR before malicious SIPI is even executed.


> > Baremetal would also benefit from it as it won't need to implement unpark
> > logic anymore. it should also work for existing HW that has unpark logic.
> > 
> > Do you have any pointers to how hardware does unparking now?  
> 
> The firmware tells the BMC to do it.  I don't know what the firmware-BMC
> interface looks like.
> 
> >>> It looks like we need only SMM accessible guest/host interface to make
> >>> CPU unparking secure or cover default SMBASE by SMRR.    
> >>
> >> Yes, unparking would be accessible only from SMM if the unparking feature
> >> is negotiated.  
> > 
> > I suppose we could check in MMIO handler that all CPUs are in SMM mode
> > before allowing unparking or ignore command if they are not.
> > 
> > For compat reasons unpark should be optin feature (i.e. firmware should
> > explicitly enable it to avoid breaking existing configurations (SeaBIOS))  
> 
> Yes, of course---that's why I'm bringing up in the context of this series.
> 
> Paolo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-28 11:24                 ` Igor Mammedov
@ 2016-11-28 11:51                   ` Paolo Bonzini
  0 siblings, 0 replies; 27+ messages in thread
From: Paolo Bonzini @ 2016-11-28 11:51 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Michael S. Tsirkin, Jordan Justen (Intel address),
	qemu devel list, Kevin O'Connor, Gerd Hoffmann,
	Michael Kinney, Laszlo Ersek



On 28/11/2016 12:24, Igor Mammedov wrote:
> On Mon, 28 Nov 2016 10:41:42 +0100
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>> On 25/11/2016 15:10, Igor Mammedov wrote:
>>> On Fri, 25 Nov 2016 03:55:29 -0500 (EST)
>>> Paolo Bonzini <pbonzini@redhat.com> wrote:  
>>>>> if 0x30000 were covered by SMRR range, then OSPM wouldn't able to
>>>>> place its own code there and there wouldn't be any need in side interfaces
>>>>> to put and get CPU in/from some undefined by spec state (parked).
>>>>>
>>>>> But above would imply a large block allocated at 0x30000 to fit all
>>>>> possible CPUs+1, not sure if it's doable (maybe edk2 wouldn't have
>>>>> big issues with reserving large block in lowmem).    
>>>>
>>>> If you mean that the default SMRR range would include 0x30000 for an hotplugged
>>>> CPU, that would work but it is a significant departure from real hardware.
>>>> I'd rather avoid that.  
>>>
>>> But that's guest side only solution to guest problem, that won't require
>>> any assistance from QEMU/KVM.  
>>
>> No, I don't think it can be guest-only.  The initial value of SMRRs is
>> undefined, and SMRRs are per processor.  The newly-hotplugged CPU has no
>> SMRRs defined when it is started up.
> 
> Does it matter?
> If 0x30000 is protected by SMRRs on exiting CPUs so OS can't tamper with it
> and SMI is injected on hotplug
> (i.e. hotplugged CPU arrives with SMI pin active, we can do this without
> inventing PV solution to park/unpark CPU).

If you mean placing TSEG at 0x30000 no, that won't work.  You need much
more memory than that.

If you need placing SMRRs there because KVM doesn't need SMRRs to
overlap TSEG, there are problems:

0) KVM doesn't implement SMRRs yet anyway. :)

1) while it's true that we don't need SMRRs, on Haswell and newer
processors SMRRs also provide the range of physical addresses from which
you can execute from in SMM ("SMM Code Access Check").  We do not
implement that, but it's planned.

2) I would still not bet on not having other vulnerabilities hidden.
For example, can the OS try to have two hotplugs run the initial SMM in
parallel?

3) It's really ugly and only works for virt.  I wouldn't even call the
alternative PV.  This thing is not part of the hardware specs; as long
as ACPI can describe it, it's not PV, it's firmware.

Paolo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
  2016-11-28 10:22                   ` Laszlo Ersek
@ 2016-11-28 11:53                     ` Paolo Bonzini
  0 siblings, 0 replies; 27+ messages in thread
From: Paolo Bonzini @ 2016-11-28 11:53 UTC (permalink / raw)
  To: Laszlo Ersek, Gerd Hoffmann
  Cc: Michael S. Tsirkin, Kevin O'Connor, qemu devel list,
	Igor Mammedov, Jordan Justen (Intel address),
	Michael Kinney



On 28/11/2016 11:22, Laszlo Ersek wrote:
> Michael suggested to use negotiation like virtio does (where the host
> can reject invalid combinations of requested features):
> 
> http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg03077.html
> http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg03218.html
> 
> I thought negotiation was overkill:
> 
> http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg03123.html
> 
> but I figured I'd just stick with the recommendation.

I think negotiation is useful.  By doing things like "prevent
broadcast_SMI && !parked_VCPUs if max_cpus > smp_cpus", it lets you
commit your OVMF patch immediately for example.

Paolo

>> > If it is there just write true/false to it to enable/disable, and qemu
>> > checks the field each time a smi is raised.
> If we agree there's no need for negotiation, I'm game.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2016-11-28 11:53 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-18 10:36 [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI Laszlo Ersek
2016-11-18 10:36 ` [Qemu-devel] [PATCH v3 for-2.9 1/3] hw/isa/apm: introduce callback for APM_STS_IOPORT writes Laszlo Ersek
2016-11-18 10:36 ` [Qemu-devel] [PATCH v3 for-2.9 2/3] hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS Laszlo Ersek
2016-11-18 10:36 ` [Qemu-devel] [PATCH v3 for-2.9 3/3] hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs Laszlo Ersek
2016-11-18 14:10 ` [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI Michael S. Tsirkin
2016-11-23 15:48   ` Laszlo Ersek
2016-11-23 22:35 ` Paolo Bonzini
2016-11-24  0:01   ` Laszlo Ersek
2016-11-24  0:31     ` Laszlo Ersek
2016-11-24  0:38     ` Kevin O'Connor
2016-11-24  4:29       ` Michael S. Tsirkin
2016-11-24  8:37         ` Laszlo Ersek
2016-11-25  4:00           ` Michael S. Tsirkin
2016-11-25 12:31             ` Laszlo Ersek
2016-11-25 12:40               ` Laszlo Ersek
2016-11-28  9:01                 ` Gerd Hoffmann
2016-11-28 10:22                   ` Laszlo Ersek
2016-11-28 11:53                     ` Paolo Bonzini
2016-11-25 14:22               ` Igor Mammedov
2016-11-24 14:55     ` Igor Mammedov
2016-11-24 17:05       ` Paolo Bonzini
2016-11-24 18:02         ` Igor Mammedov
2016-11-25  8:55           ` Paolo Bonzini
2016-11-25 14:10             ` Igor Mammedov
2016-11-28  9:41               ` Paolo Bonzini
2016-11-28 11:24                 ` Igor Mammedov
2016-11-28 11:51                   ` Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.