[Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests
@ 2015-11-11 17:15 Aravinda Prasad
  2015-11-11 17:15 ` [Qemu-devel] [PATCH 1/4] spapr: Extend rtas-blob Aravinda Prasad
                   ` (4 more replies)
  0 siblings, 5 replies; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-11 17:15 UTC (permalink / raw)
  To: qemu-ppc, agraf, qemu-devel; +Cc: benh, aik, paulus, sam.bobroff, david

This series of patches add support for FWNMI in PowerKVM guests.

Memory error such as bit flips that cannot be corrected
by hardware is passed on to the kernel for handling
by raising machine check exception (an NMI). Upon such
machine check exception, if the address in error belongs
to guest then KVM causes a guest exit with KVM_EXIT_NMI
exit reason [1].

This patch series adds functionality to pass on such
machine check exception to the guest kernel by suitably
handling KVM_EXIT_NMI exit and building the error log.
This patch series explores the alternate design
discussed in the QEMU mailing list [2], [3].

In earlier design [2] KVM invoked guest's NMI interrupt
vector 0x200 upon machine check exception, while QEMU
patched 0x200 vector to issue hcall to facilitate
building and copying of error log into RTAS space.

The new design simplifies couple of things. Thanks to
David Gibson for the suggestion:

    - Eliminates patching of 0x200 interrupt vector from
      QEMU, hence, trampoline not required
    - Avoids introducing a new private hcall to be called
      from guest's 0x200 vector to enable QEMU to build
      error log
    - Simplifies re-trying by other processors when one
      of the processor is consuming/processing the error
      log.

[1] http://marc.info/?l=kvm-ppc&m=144726114408289
[2] https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
[3] https://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00898.html

---

Aravinda Prasad (4):
      spapr: Extend rtas-blob
      spapr: Register and handle HCALL to receive updated RTAS region
      spapr: Handle "ibm,nmi-register" and "ibm,nmi-interlock" RTAS calls
      target-ppc: Handle NMI guest exit

 hw/ppc/spapr.c         |    4 ++
 hw/ppc/spapr_hcall.c   |    8 +++++
 hw/ppc/spapr_rtas.c    |   29 +++++++++++++++++
 include/hw/ppc/spapr.h |   11 +++++--
 target-ppc/kvm.c       |   69 +++++++++++++++++++++++++++++++++++++++++
 target-ppc/kvm_ppc.h   |   81 ++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 200 insertions(+), 2 deletions(-)

-- 
Aravinda Prasad

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH 1/4] spapr: Extend rtas-blob
  2015-11-11 17:15 [Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests Aravinda Prasad
@ 2015-11-11 17:15 ` Aravinda Prasad
  2015-11-12  3:40   ` David Gibson
  2015-11-12  8:26   ` Thomas Huth
  2015-11-11 17:15 ` [Qemu-devel] [PATCH 2/4] spapr: Register and handle HCALL to receive updated RTAS region Aravinda Prasad
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-11 17:15 UTC (permalink / raw)
  To: qemu-ppc, agraf, qemu-devel; +Cc: benh, aik, paulus, sam.bobroff, david

Extend rtas-blob to accommodate error log. Error log
structure is saved in rtas space upon a machine check
exception.

Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 05926a3..b7b9e09 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1556,6 +1556,10 @@ static void ppc_spapr_init(MachineState *machine)
         exit(1);
     }
     spapr->rtas_size = get_image_size(filename);
+
+    /* Resize blob to accommodate error log. */
+    spapr->rtas_size = TARGET_PAGE_ALIGN(spapr->rtas_size);
+
     spapr->rtas_blob = g_malloc(spapr->rtas_size);
     if (load_image_size(filename, spapr->rtas_blob, spapr->rtas_size) < 0) {
         error_report("Could not load LPAR rtas '%s'", filename);

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH 2/4] spapr: Register and handle HCALL to receive updated RTAS region
  2015-11-11 17:15 [Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests Aravinda Prasad
  2015-11-11 17:15 ` [Qemu-devel] [PATCH 1/4] spapr: Extend rtas-blob Aravinda Prasad
@ 2015-11-11 17:15 ` Aravinda Prasad
  2015-11-12  3:42   ` David Gibson
  2015-11-11 17:15 ` [Qemu-devel] [PATCH 3/4] spapr: Handle "ibm, nmi-register" and "ibm, nmi-interlock" RTAS calls Aravinda Prasad
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-11 17:15 UTC (permalink / raw)
  To: qemu-ppc, agraf, qemu-devel; +Cc: benh, aik, paulus, sam.bobroff, david

Receive updates from SLOF about the updated rtas-base.
A separate patch for SLOF [1] adds functionality to invoke
a private HCALL whenever OS issues instantiate-rtas with
a new rtas-base.

This is required as QEMU needs to know the updated rtas-base
as it allocates error reporting structure in RTAS space upon
a machine check exception.

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-August/120386.html

Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
---
 hw/ppc/spapr_hcall.c   |    8 ++++++++
 include/hw/ppc/spapr.h |    3 ++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 652ddf6..9703ed2 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -580,6 +580,13 @@ static target_ulong h_rtas(PowerPCCPU *cpu, sPAPRMachineState *spapr,
                            nret, rtas_r3 + 12 + 4*nargs);
 }
 
+static target_ulong h_rtas_update(PowerPCCPU *cpu, sPAPRMachineState *spapr,
+                                  target_ulong opcode, target_ulong *args)
+{
+    spapr->rtas_addr = args[0];
+    return 0;
+}
+
 static target_ulong h_logical_load(PowerPCCPU *cpu, sPAPRMachineState *spapr,
                                    target_ulong opcode, target_ulong *args)
 {
@@ -1008,6 +1015,7 @@ static void hypercall_register_types(void)
 
     /* qemu/KVM-PPC specific hcalls */
     spapr_register_hypercall(KVMPPC_H_RTAS, h_rtas);
+    spapr_register_hypercall(KVMPPC_H_RTAS_UPDATE, h_rtas_update);
 
     spapr_register_hypercall(H_SET_MODE, h_set_mode);
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 91a61ab..b5cadd7 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -347,7 +347,8 @@ struct sPAPRMachineState {
 #define KVMPPC_H_LOGICAL_MEMOP  (KVMPPC_HCALL_BASE + 0x1)
 /* Client Architecture support */
 #define KVMPPC_H_CAS            (KVMPPC_HCALL_BASE + 0x2)
-#define KVMPPC_HCALL_MAX        KVMPPC_H_CAS
+#define KVMPPC_H_RTAS_UPDATE    (KVMPPC_HCALL_BASE + 0x3)
+#define KVMPPC_HCALL_MAX        KVMPPC_H_RTAS_UPDATE
 
 typedef struct sPAPRDeviceTreeUpdateHeader {
     uint32_t version_id;

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH 3/4] spapr: Handle "ibm, nmi-register" and "ibm, nmi-interlock" RTAS calls
  2015-11-11 17:15 [Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests Aravinda Prasad
  2015-11-11 17:15 ` [Qemu-devel] [PATCH 1/4] spapr: Extend rtas-blob Aravinda Prasad
  2015-11-11 17:15 ` [Qemu-devel] [PATCH 2/4] spapr: Register and handle HCALL to receive updated RTAS region Aravinda Prasad
@ 2015-11-11 17:15 ` Aravinda Prasad
  2015-11-12  4:02   ` David Gibson
  2015-11-12  9:23   ` Thomas Huth
  2015-11-11 17:16 ` [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit Aravinda Prasad
  2015-11-12  4:30 ` [Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests David Gibson
  4 siblings, 2 replies; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-11 17:15 UTC (permalink / raw)
  To: qemu-ppc, agraf, qemu-devel; +Cc: benh, aik, paulus, sam.bobroff, david

This patch adds support in QEMU to handle "ibm,nmi-register"
and "ibm,nmi-interlock" RTAS calls.

The machine check notification address is saved when the
OS issues "ibm,nmi-register" RTAS call.

This patch also handles the case when multiple processors
experience machine check at or about the same time by
handling "ibm,nmi-interlock" call. In such cases, as per
PAPR, subsequent processors serialize waiting for the first
processor to issue the "ibm,nmi-interlock" call. The second
processor waits till the first processor, which also
received a machine check error, is done reading the error
log. The first processor issues "ibm,nmi-interlock" call
when the error log is consumed. This patch implements the
releasing part of the error-log while subsequent patch
(which builds error log) handles the locking part.

Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
---
 hw/ppc/spapr_rtas.c    |   29 +++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h |    8 +++++++-
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 9869bc9..fd4d2af 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -597,6 +597,31 @@ out:
     rtas_st(rets, 0, rc);
 }
 
+static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
+                                  sPAPRMachineState *spapr,
+                                  uint32_t token, uint32_t nargs,
+                                  target_ulong args,
+                                  uint32_t nret, target_ulong rets)
+{
+    qemu_mutex_init(&spapr->mc_in_progress);
+    spapr->guest_machine_check_addr = rtas_ld(args, 1);
+    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+}
+
+static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu,
+                                   sPAPRMachineState *spapr,
+                                   uint32_t token, uint32_t nargs,
+                                   target_ulong args,
+                                   uint32_t nret, target_ulong rets)
+{
+    /*
+     * VCPU issuing "ibm,nmi-interlock" is done with NMI handling,
+     * hence unlock mc_in_progress.
+     */
+    qemu_mutex_unlock(&spapr->mc_in_progress);
+    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+}
+
 static struct rtas_call {
     const char *name;
     spapr_rtas_fn fn;
@@ -747,6 +772,10 @@ static void core_rtas_register_types(void)
                         rtas_get_sensor_state);
     spapr_rtas_register(RTAS_IBM_CONFIGURE_CONNECTOR, "ibm,configure-connector",
                         rtas_ibm_configure_connector);
+    spapr_rtas_register(RTAS_IBM_NMI_REGISTER, "ibm,nmi-register",
+                        rtas_ibm_nmi_register);
+    spapr_rtas_register(RTAS_IBM_NMI_INTERLOCK, "ibm,nmi-interlock",
+                        rtas_ibm_nmi_interlock);
 }
 
 type_init(core_rtas_register_types)
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index b5cadd7..0a31d15 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -74,6 +74,10 @@ struct sPAPRMachineState {
     /* RTAS state */
     QTAILQ_HEAD(, sPAPRConfigureConnectorState) ccs_list;
 
+    /* State related to "ibm,nmi-register" and "ibm,nmi-interlock" calls */
+    target_ulong guest_machine_check_addr;
+    QemuMutex mc_in_progress;
+
     /*< public >*/
     char *kvm_type;
 };
@@ -458,8 +462,10 @@ int spapr_allocate_irq_block(int num, bool lsi, bool msi);
 #define RTAS_IBM_SET_SLOT_RESET                 (RTAS_TOKEN_BASE + 0x23)
 #define RTAS_IBM_CONFIGURE_PE                   (RTAS_TOKEN_BASE + 0x24)
 #define RTAS_IBM_SLOT_ERROR_DETAIL              (RTAS_TOKEN_BASE + 0x25)
+#define RTAS_IBM_NMI_REGISTER                   (RTAS_TOKEN_BASE + 0x26)
+#define RTAS_IBM_NMI_INTERLOCK                  (RTAS_TOKEN_BASE + 0x27)
 
-#define RTAS_TOKEN_MAX                          (RTAS_TOKEN_BASE + 0x26)
+#define RTAS_TOKEN_MAX                          (RTAS_TOKEN_BASE + 0x28)
 
 /* RTAS ibm,get-system-parameter token values */
 #define RTAS_SYSPARM_SPLPAR_CHARACTERISTICS      20

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-11 17:15 [Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests Aravinda Prasad
                   ` (2 preceding siblings ...)
  2015-11-11 17:15 ` [Qemu-devel] [PATCH 3/4] spapr: Handle "ibm, nmi-register" and "ibm, nmi-interlock" RTAS calls Aravinda Prasad
@ 2015-11-11 17:16 ` Aravinda Prasad
  2015-11-12  4:29   ` David Gibson
  2015-11-12  8:09   ` Thomas Huth
  2015-11-12  4:30 ` [Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests David Gibson
  4 siblings, 2 replies; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-11 17:16 UTC (permalink / raw)
  To: qemu-ppc, agraf, qemu-devel; +Cc: benh, aik, paulus, sam.bobroff, david

Memory error such as bit flips that cannot be corrected
by hardware are passed on to the kernel for handling.
If the memory address in error belongs to guest then
guest kernel is responsible for taking suitable action.
Patch [1] enhances KVM to exit guest with exit reason
set to KVM_EXIT_NMI in such cases.

This patch handles KVM_EXIT_NMI exit. If the guest OS
has registered the machine check handling routine by
calling "ibm,nmi-register", then the handler builds
the error log and invokes the registered handler else
invokes the handler at 0x200.

[1] http://marc.info/?l=kvm-ppc&m=144726114408289

Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
---
 target-ppc/kvm.c     |   69 +++++++++++++++++++++++++++++++++++++++++++
 target-ppc/kvm_ppc.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 150 insertions(+)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 110436d..e2e5170 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         ret = 0;
         break;
 
+    case KVM_EXIT_NMI:
+        DPRINTF("handle NMI exception\n");
+        ret = kvm_handle_nmi(cpu);
+        break;
+
     default:
         fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
         ret = -1;
@@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 {
     return data & 0xffff;
 }
+
+int kvm_handle_nmi(PowerPCCPU *cpu)
+{
+    struct rtas_mc_log mc_log;
+    CPUPPCState *env = &cpu->env;
+    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
+    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
+
+    cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
+
+    /* Properly set bits in MSR before we invoke the handler */
+    env->msr = 0;
+
+    if (!(*pcc->interrupts_big_endian)(cpu)) {
+        env->msr |= (1ULL << MSR_LE);
+    }
+
+#ifdef TARGET_PPC64
+    env->msr |= (1ULL << MSR_SF);
+#endif
+
+    if (!spapr->guest_machine_check_addr) {
+        /*
+         * If OS has not registered with "ibm,nmi-register"
+         * jump to 0x200
+         */
+        env->nip = 0x200;
+        return 0;
+    }
+
+    qemu_mutex_lock(&spapr->mc_in_progress);
+
+    /* Set error log fields */
+    mc_log.r3 = env->gpr[3];
+    mc_log.err_log.byte0 = 0;
+    mc_log.err_log.byte1 =
+        (RTAS_SEVERITY_ERROR_SYNC << RTAS_ELOG_SEVERITY_SHIFT);
+    mc_log.err_log.byte1 |=
+        (RTAS_DISP_NOT_RECOVERED << RTAS_ELOG_DISPOSITION_SHIFT);
+    mc_log.err_log.byte2 =
+        (RTAS_INITIATOR_MEMORY << RTAS_ELOG_INITIATOR_SHIFT);
+    mc_log.err_log.byte2 |= RTAS_TARGET_MEMORY;
+
+    if (env->spr[SPR_DSISR] & P7_DSISR_MC_UE) {
+        mc_log.err_log.byte3 = RTAS_TYPE_ECC_UNCORR;
+    } else {
+        mc_log.err_log.byte3 = 0;
+    }
+
+    /* Handle all Host/Guest LE/BE combinations */
+    if (env->msr & (1ULL << MSR_LE)) {
+        mc_log.r3 = cpu_to_le64(mc_log.r3);
+    } else {
+        mc_log.r3 = cpu_to_be64(mc_log.r3);
+    }
+
+    cpu_physical_memory_write(spapr->rtas_addr + RTAS_ERRLOG_OFFSET,
+                              &mc_log, sizeof(mc_log));
+
+    env->nip = spapr->guest_machine_check_addr;
+    env->gpr[3] = spapr->rtas_addr + RTAS_ERRLOG_OFFSET;
+
+    return 0;
+}
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 5c1d334..1172735 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -53,6 +53,87 @@ void kvmppc_hash64_free_pteg(uint64_t token);
 void kvmppc_hash64_write_pte(CPUPPCState *env, target_ulong pte_index,
                              target_ulong pte0, target_ulong pte1);
 bool kvmppc_has_cap_fixup_hcalls(void);
+int kvm_handle_nmi(PowerPCCPU *cpu);
+
+/* Offset from rtas-base where error log is placed */
+#define RTAS_ERRLOG_OFFSET       (0x200)
+
+#define RTAS_ELOG_SEVERITY_SHIFT         0x5
+#define RTAS_ELOG_DISPOSITION_SHIFT      0x3
+#define RTAS_ELOG_INITIATOR_SHIFT        0x4
+
+/*
+ * Only required RTAS event severity, disposition, initiator
+ * target and type are copied from arch/powerpc/include/asm/rtas.h
+ */
+
+/* RTAS event severity */
+#define RTAS_SEVERITY_ERROR_SYNC    0x3
+
+/* RTAS event disposition */
+#define RTAS_DISP_NOT_RECOVERED     0x2
+
+/* RTAS event initiator */
+#define RTAS_INITIATOR_MEMORY       0x4
+
+/* RTAS event target */
+#define RTAS_TARGET_MEMORY          0x4
+
+/* RTAS event type */
+#define RTAS_TYPE_ECC_UNCORR        0x09
+
+/*
+ * Currently KVM only passes on the uncorrected machine
+ * check memory error to guest. Other machine check errors
+ * such as SLB multi-hit and TLB multi-hit are recovered
+ * in KVM and are not passed on to guest.
+ *
+ * DSISR Bit for uncorrected machine check error. Based
+ * on arch/powerpc/include/asm/mce.h
+ */
+#define PPC_BIT(bit)                (0x8000000000000000ULL >> bit)
+#define P7_DSISR_MC_UE              (PPC_BIT(48))  /* P8 too */
+
+/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */
+struct rtas_error_log {
+    /* Byte 0 */
+    uint8_t     byte0;          /* Architectural version */
+
+    /* Byte 1 */
+    uint8_t     byte1;
+    /* XXXXXXXX
+     * XXX      3: Severity level of error
+     *    XX    2: Degree of recovery
+     *      X   1: Extended log present?
+     *       XX 2: Reserved
+     */
+
+    /* Byte 2 */
+    uint8_t     byte2;
+    /* XXXXXXXX
+     * XXXX     4: Initiator of event
+     *     XXXX 4: Target of failed operation
+     */
+    uint8_t     byte3;          /* General event or error*/
+    __be32      extended_log_length;    /* length in bytes */
+    unsigned char   buffer[1];      /* Start of extended log */
+                                /* Variable length.      */
+};
+
+/*
+ * Data format in RTAS-Blob
+ *
+ * This structure contains error information related to Machine
+ * Check exception. This is filled up and copied to rtas-blob
+ * upon machine check exception. The address of rtas-blob is
+ * passed on to OS registered machine check notification
+ * routines upon machine check exception
+ */
+struct rtas_mc_log {
+    target_ulong r3;
+    struct rtas_error_log err_log;
+};
+
 
 #else
 

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] spapr: Extend rtas-blob
  2015-11-11 17:15 ` [Qemu-devel] [PATCH 1/4] spapr: Extend rtas-blob Aravinda Prasad
@ 2015-11-12  3:40   ` David Gibson
  2015-11-12  8:26   ` Thomas Huth
  1 sibling, 0 replies; 40+ messages in thread
From: David Gibson @ 2015-11-12  3:40 UTC (permalink / raw)
  To: Aravinda Prasad
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff

[-- Attachment #1: Type: text/plain, Size: 1218 bytes --]

On Wed, Nov 11, 2015 at 10:45:15PM +0530, Aravinda Prasad wrote:
> Extend rtas-blob to accommodate error log. Error log
> structure is saved in rtas space upon a machine check
> exception.
> 
> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  hw/ppc/spapr.c |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 05926a3..b7b9e09 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1556,6 +1556,10 @@ static void ppc_spapr_init(MachineState *machine)
>          exit(1);
>      }
>      spapr->rtas_size = get_image_size(filename);
> +
> +    /* Resize blob to accommodate error log. */
> +    spapr->rtas_size = TARGET_PAGE_ALIGN(spapr->rtas_size);
> +
>      spapr->rtas_blob = g_malloc(spapr->rtas_size);
>      if (load_image_size(filename, spapr->rtas_blob, spapr->rtas_size) < 0) {
>          error_report("Could not load LPAR rtas '%s'", filename);
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] spapr: Register and handle HCALL to receive updated RTAS region
  2015-11-11 17:15 ` [Qemu-devel] [PATCH 2/4] spapr: Register and handle HCALL to receive updated RTAS region Aravinda Prasad
@ 2015-11-12  3:42   ` David Gibson
  2015-11-12  5:28     ` [Qemu-devel] [Qemu-ppc] " Nikunj A Dadhania
  0 siblings, 1 reply; 40+ messages in thread
From: David Gibson @ 2015-11-12  3:42 UTC (permalink / raw)
  To: Aravinda Prasad
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff

[-- Attachment #1: Type: text/plain, Size: 2743 bytes --]

On Wed, Nov 11, 2015 at 10:45:30PM +0530, Aravinda Prasad wrote:
> Receive updates from SLOF about the updated rtas-base.
> A separate patch for SLOF [1] adds functionality to invoke
> a private HCALL whenever OS issues instantiate-rtas with
> a new rtas-base.
> 
> This is required as QEMU needs to know the updated rtas-base
> as it allocates error reporting structure in RTAS space upon
> a machine check exception.
> 
> [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-August/120386.html
> 
> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

With the obvious proviso that the SLOF update needs to be merged (both
to SLOF and to qemu) first.

> ---
>  hw/ppc/spapr_hcall.c   |    8 ++++++++
>  include/hw/ppc/spapr.h |    3 ++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index 652ddf6..9703ed2 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -580,6 +580,13 @@ static target_ulong h_rtas(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>                             nret, rtas_r3 + 12 + 4*nargs);
>  }
>  
> +static target_ulong h_rtas_update(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> +                                  target_ulong opcode, target_ulong *args)
> +{
> +    spapr->rtas_addr = args[0];
> +    return 0;
> +}
> +
>  static target_ulong h_logical_load(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>                                     target_ulong opcode, target_ulong *args)
>  {
> @@ -1008,6 +1015,7 @@ static void hypercall_register_types(void)
>  
>      /* qemu/KVM-PPC specific hcalls */
>      spapr_register_hypercall(KVMPPC_H_RTAS, h_rtas);
> +    spapr_register_hypercall(KVMPPC_H_RTAS_UPDATE, h_rtas_update);
>  
>      spapr_register_hypercall(H_SET_MODE, h_set_mode);
>  
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 91a61ab..b5cadd7 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -347,7 +347,8 @@ struct sPAPRMachineState {
>  #define KVMPPC_H_LOGICAL_MEMOP  (KVMPPC_HCALL_BASE + 0x1)
>  /* Client Architecture support */
>  #define KVMPPC_H_CAS            (KVMPPC_HCALL_BASE + 0x2)
> -#define KVMPPC_HCALL_MAX        KVMPPC_H_CAS
> +#define KVMPPC_H_RTAS_UPDATE    (KVMPPC_HCALL_BASE + 0x3)
> +#define KVMPPC_HCALL_MAX        KVMPPC_H_RTAS_UPDATE
>  
>  typedef struct sPAPRDeviceTreeUpdateHeader {
>      uint32_t version_id;
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] spapr: Handle "ibm, nmi-register" and "ibm, nmi-interlock" RTAS calls
  2015-11-11 17:15 ` [Qemu-devel] [PATCH 3/4] spapr: Handle "ibm, nmi-register" and "ibm, nmi-interlock" RTAS calls Aravinda Prasad
@ 2015-11-12  4:02   ` David Gibson
  2015-11-12 18:04     ` Aravinda Prasad
  2015-11-12  9:23   ` Thomas Huth
  1 sibling, 1 reply; 40+ messages in thread
From: David Gibson @ 2015-11-12  4:02 UTC (permalink / raw)
  To: Aravinda Prasad
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff

[-- Attachment #1: Type: text/plain, Size: 4951 bytes --]

On Wed, Nov 11, 2015 at 10:45:44PM +0530, Aravinda Prasad wrote:
> This patch adds support in QEMU to handle "ibm,nmi-register"
> and "ibm,nmi-interlock" RTAS calls.
> 
> The machine check notification address is saved when the
> OS issues "ibm,nmi-register" RTAS call.
> 
> This patch also handles the case when multiple processors
> experience machine check at or about the same time by
> handling "ibm,nmi-interlock" call. In such cases, as per
> PAPR, subsequent processors serialize waiting for the first
> processor to issue the "ibm,nmi-interlock" call. The second
> processor waits till the first processor, which also
> received a machine check error, is done reading the error
> log. The first processor issues "ibm,nmi-interlock" call
> when the error log is consumed. This patch implements the
> releasing part of the error-log while subsequent patch
> (which builds error log) handles the locking part.
> 
> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr_rtas.c    |   29 +++++++++++++++++++++++++++++
>  include/hw/ppc/spapr.h |    8 +++++++-
>  2 files changed, 36 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 9869bc9..fd4d2af 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -597,6 +597,31 @@ out:
>      rtas_st(rets, 0, rc);
>  }
>  
> +static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
> +                                  sPAPRMachineState *spapr,
> +                                  uint32_t token, uint32_t nargs,
> +                                  target_ulong args,
> +                                  uint32_t nret, target_ulong rets)
> +{
> +    qemu_mutex_init(&spapr->mc_in_progress);
> +    spapr->guest_machine_check_addr = rtas_ld(args, 1);
> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +}
> +
> +static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu,
> +                                   sPAPRMachineState *spapr,
> +                                   uint32_t token, uint32_t nargs,
> +                                   target_ulong args,
> +                                   uint32_t nret, target_ulong rets)
> +{
> +    /*
> +     * VCPU issuing "ibm,nmi-interlock" is done with NMI handling,
> +     * hence unlock mc_in_progress.
> +     */
> +    qemu_mutex_unlock(&spapr->mc_in_progress);

Hrm... allowing the guest direct control over a qemu internal mutex
sets off alarm bells for me.

It could be ok, depending on exactly where the mutex is taken, and
whether it has a timeout and so forth, but it makes me nervous.

If nothing else this needs some substantial comments explaining the
locking scheme and why it's safe to give userspace direct control over
the unlock.

> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +}
> +
>  static struct rtas_call {
>      const char *name;
>      spapr_rtas_fn fn;
> @@ -747,6 +772,10 @@ static void core_rtas_register_types(void)
>                          rtas_get_sensor_state);
>      spapr_rtas_register(RTAS_IBM_CONFIGURE_CONNECTOR, "ibm,configure-connector",
>                          rtas_ibm_configure_connector);
> +    spapr_rtas_register(RTAS_IBM_NMI_REGISTER, "ibm,nmi-register",
> +                        rtas_ibm_nmi_register);
> +    spapr_rtas_register(RTAS_IBM_NMI_INTERLOCK, "ibm,nmi-interlock",
> +                        rtas_ibm_nmi_interlock);
>  }
>  
>  type_init(core_rtas_register_types)
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index b5cadd7..0a31d15 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -74,6 +74,10 @@ struct sPAPRMachineState {
>      /* RTAS state */
>      QTAILQ_HEAD(, sPAPRConfigureConnectorState) ccs_list;
>  
> +    /* State related to "ibm,nmi-register" and "ibm,nmi-interlock" calls */
> +    target_ulong guest_machine_check_addr;
> +    QemuMutex mc_in_progress;
> +
>      /*< public >*/
>      char *kvm_type;
>  };
> @@ -458,8 +462,10 @@ int spapr_allocate_irq_block(int num, bool lsi, bool msi);
>  #define RTAS_IBM_SET_SLOT_RESET                 (RTAS_TOKEN_BASE + 0x23)
>  #define RTAS_IBM_CONFIGURE_PE                   (RTAS_TOKEN_BASE + 0x24)
>  #define RTAS_IBM_SLOT_ERROR_DETAIL              (RTAS_TOKEN_BASE + 0x25)
> +#define RTAS_IBM_NMI_REGISTER                   (RTAS_TOKEN_BASE + 0x26)
> +#define RTAS_IBM_NMI_INTERLOCK                  (RTAS_TOKEN_BASE + 0x27)
>  
> -#define RTAS_TOKEN_MAX                          (RTAS_TOKEN_BASE + 0x26)
> +#define RTAS_TOKEN_MAX                          (RTAS_TOKEN_BASE + 0x28)
>  
>  /* RTAS ibm,get-system-parameter token values */
>  #define RTAS_SYSPARM_SPLPAR_CHARACTERISTICS      20
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-11 17:16 ` [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit Aravinda Prasad
@ 2015-11-12  4:29   ` David Gibson
  2015-11-12  5:20     ` Aravinda Prasad
  2015-11-12  8:09   ` Thomas Huth
  1 sibling, 1 reply; 40+ messages in thread
From: David Gibson @ 2015-11-12  4:29 UTC (permalink / raw)
  To: Aravinda Prasad
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff

[-- Attachment #1: Type: text/plain, Size: 7157 bytes --]

On Wed, Nov 11, 2015 at 10:46:02PM +0530, Aravinda Prasad wrote:
> Memory error such as bit flips that cannot be corrected
> by hardware are passed on to the kernel for handling.
> If the memory address in error belongs to guest then
> guest kernel is responsible for taking suitable action.
> Patch [1] enhances KVM to exit guest with exit reason
> set to KVM_EXIT_NMI in such cases.
> 
> This patch handles KVM_EXIT_NMI exit. If the guest OS
> has registered the machine check handling routine by
> calling "ibm,nmi-register", then the handler builds
> the error log and invokes the registered handler else
> invokes the handler at 0x200.
> 
> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
> 
> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
> ---
>  target-ppc/kvm.c     |   69 +++++++++++++++++++++++++++++++++++++++++++
>  target-ppc/kvm_ppc.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 150 insertions(+)
> 
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 110436d..e2e5170 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>          ret = 0;
>          break;
>  
> +    case KVM_EXIT_NMI:
> +        DPRINTF("handle NMI exception\n");
> +        ret = kvm_handle_nmi(cpu);
> +        break;
> +
>      default:
>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>          ret = -1;
> @@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>  {
>      return data & 0xffff;
>  }
> +
> +int kvm_handle_nmi(PowerPCCPU *cpu)
> +{
> +    struct rtas_mc_log mc_log;
> +    CPUPPCState *env = &cpu->env;
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> +
> +    cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
> +
> +    /* Properly set bits in MSR before we invoke the handler */
> +    env->msr = 0;
> +
> +    if (!(*pcc->interrupts_big_endian)(cpu)) {
> +        env->msr |= (1ULL << MSR_LE);
> +    }
> +
> +#ifdef TARGET_PPC64
> +    env->msr |= (1ULL << MSR_SF);
> +#endif

Surely this should depend on what the MSR actually was before, or at
the very least the cpu model, rather than just whether qemu has PPC64
support compiled in.

> +
> +    if (!spapr->guest_machine_check_addr) {
> +        /*
> +         * If OS has not registered with "ibm,nmi-register"
> +         * jump to 0x200
> +         */
> +        env->nip = 0x200;
> +        return 0;
> +    }
> +
> +    qemu_mutex_lock(&spapr->mc_in_progress);
> +
> +    /* Set error log fields */
> +    mc_log.r3 = env->gpr[3];
> +    mc_log.err_log.byte0 = 0;
> +    mc_log.err_log.byte1 =
> +        (RTAS_SEVERITY_ERROR_SYNC << RTAS_ELOG_SEVERITY_SHIFT);
> +    mc_log.err_log.byte1 |=
> +        (RTAS_DISP_NOT_RECOVERED << RTAS_ELOG_DISPOSITION_SHIFT);
> +    mc_log.err_log.byte2 =
> +        (RTAS_INITIATOR_MEMORY << RTAS_ELOG_INITIATOR_SHIFT);
> +    mc_log.err_log.byte2 |= RTAS_TARGET_MEMORY;
> +
> +    if (env->spr[SPR_DSISR] & P7_DSISR_MC_UE) {
> +        mc_log.err_log.byte3 = RTAS_TYPE_ECC_UNCORR;
> +    } else {
> +        mc_log.err_log.byte3 = 0;
> +    }
> +
> +    /* Handle all Host/Guest LE/BE combinations */
> +    if (env->msr & (1ULL << MSR_LE)) {
> +        mc_log.r3 = cpu_to_le64(mc_log.r3);
> +    } else {
> +        mc_log.r3 = cpu_to_be64(mc_log.r3);
> +    }
> +
> +    cpu_physical_memory_write(spapr->rtas_addr + RTAS_ERRLOG_OFFSET,
> +                              &mc_log, sizeof(mc_log));
> +
> +    env->nip = spapr->guest_machine_check_addr;
> +    env->gpr[3] = spapr->rtas_addr + RTAS_ERRLOG_OFFSET;
> +
> +    return 0;
> +}
> diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
> index 5c1d334..1172735 100644
> --- a/target-ppc/kvm_ppc.h
> +++ b/target-ppc/kvm_ppc.h
> @@ -53,6 +53,87 @@ void kvmppc_hash64_free_pteg(uint64_t token);
>  void kvmppc_hash64_write_pte(CPUPPCState *env, target_ulong pte_index,
>                               target_ulong pte0, target_ulong pte1);
>  bool kvmppc_has_cap_fixup_hcalls(void);
> +int kvm_handle_nmi(PowerPCCPU *cpu);
> +
> +/* Offset from rtas-base where error log is placed */
> +#define RTAS_ERRLOG_OFFSET       (0x200)
> +
> +#define RTAS_ELOG_SEVERITY_SHIFT         0x5
> +#define RTAS_ELOG_DISPOSITION_SHIFT      0x3
> +#define RTAS_ELOG_INITIATOR_SHIFT        0x4
> +
> +/*
> + * Only required RTAS event severity, disposition, initiator
> + * target and type are copied from arch/powerpc/include/asm/rtas.h
> + */
> +
> +/* RTAS event severity */
> +#define RTAS_SEVERITY_ERROR_SYNC    0x3
> +
> +/* RTAS event disposition */
> +#define RTAS_DISP_NOT_RECOVERED     0x2
> +
> +/* RTAS event initiator */
> +#define RTAS_INITIATOR_MEMORY       0x4
> +
> +/* RTAS event target */
> +#define RTAS_TARGET_MEMORY          0x4
> +
> +/* RTAS event type */
> +#define RTAS_TYPE_ECC_UNCORR        0x09
> +
> +/*
> + * Currently KVM only passes on the uncorrected machine
> + * check memory error to guest. Other machine check errors
> + * such as SLB multi-hit and TLB multi-hit are recovered
> + * in KVM and are not passed on to guest.
> + *
> + * DSISR Bit for uncorrected machine check error. Based
> + * on arch/powerpc/include/asm/mce.h
> + */
> +#define PPC_BIT(bit)                (0x8000000000000000ULL >> bit)
> +#define P7_DSISR_MC_UE              (PPC_BIT(48))  /* P8 too */
> +
> +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */
> +struct rtas_error_log {
> +    /* Byte 0 */
> +    uint8_t     byte0;          /* Architectural version */
> +
> +    /* Byte 1 */
> +    uint8_t     byte1;
> +    /* XXXXXXXX
> +     * XXX      3: Severity level of error
> +     *    XX    2: Degree of recovery
> +     *      X   1: Extended log present?
> +     *       XX 2: Reserved
> +     */
> +
> +    /* Byte 2 */
> +    uint8_t     byte2;
> +    /* XXXXXXXX
> +     * XXXX     4: Initiator of event
> +     *     XXXX 4: Target of failed operation
> +     */
> +    uint8_t     byte3;          /* General event or error*/
> +    __be32      extended_log_length;    /* length in bytes */
> +    unsigned char   buffer[1];      /* Start of extended log */
> +                                /* Variable length.      */
> +};
> +
> +/*
> + * Data format in RTAS-Blob
> + *
> + * This structure contains error information related to Machine
> + * Check exception. This is filled up and copied to rtas-blob
> + * upon machine check exception. The address of rtas-blob is
> + * passed on to OS registered machine check notification
> + * routines upon machine check exception
> + */
> +struct rtas_mc_log {
> +    target_ulong r3;
> +    struct rtas_error_log err_log;
> +};
> +
>  
>  #else
>  
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests
  2015-11-11 17:15 [Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests Aravinda Prasad
                   ` (3 preceding siblings ...)
  2015-11-11 17:16 ` [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit Aravinda Prasad
@ 2015-11-12  4:30 ` David Gibson
  4 siblings, 0 replies; 40+ messages in thread
From: David Gibson @ 2015-11-12  4:30 UTC (permalink / raw)
  To: Aravinda Prasad
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff

[-- Attachment #1: Type: text/plain, Size: 2082 bytes --]

On Wed, Nov 11, 2015 at 10:45:00PM +0530, Aravinda Prasad wrote:
> This series of patches add support for FWNMI in PowerKVM guests.
> 
> Memory error such as bit flips that cannot be corrected
> by hardware is passed on to the kernel for handling
> by raising machine check exception (an NMI). Upon such
> machine check exception, if the address in error belongs
> to guest then KVM causes a guest exit with KVM_EXIT_NMI
> exit reason [1].
> 
> This patch series adds functionality to pass on such
> machine check exception to the guest kernel by suitably
> handling KVM_EXIT_NMI exit and building the error log.
> This patch series explores the alternate design
> discussed in the QEMU mailing list [2], [3].
> 
> In earlier design [2] KVM invoked guest's NMI interrupt
> vector 0x200 upon machine check exception, while QEMU
> patched 0x200 vector to issue hcall to facilitate
> building and copying of error log into RTAS space.
> 
> The new design simplifies couple of things. Thanks to
> David Gibson for the suggestion:
> 
>     - Eliminates patching of 0x200 interrupt vector from
>       QEMU, hence, trampoline not required
>     - Avoids introducing a new private hcall to be called
>       from guest's 0x200 vector to enable QEMU to build
>       error log
>     - Simplifies re-trying by other processors when one
>       of the processor is consuming/processing the error
>       log.
> 
> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
> [2] https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
> [3]
> https://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00898.html

This looks basically good, although I have some concerns noted in
replies to patches 3 & 4.  We can't merge this, though, until the
prerequisite kernel and SLOF patches are merged, so this won't make it
into 2.5 (hard freeze today).

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-12  4:29   ` David Gibson
@ 2015-11-12  5:20     ` Aravinda Prasad
  0 siblings, 0 replies; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-12  5:20 UTC (permalink / raw)
  To: David Gibson; +Cc: benh, aik, agraf, qemu-devel, paulus, qemu-ppc, sam.bobroff



On Thursday 12 November 2015 09:59 AM, David Gibson wrote:
> On Wed, Nov 11, 2015 at 10:46:02PM +0530, Aravinda Prasad wrote:
>> Memory error such as bit flips that cannot be corrected
>> by hardware are passed on to the kernel for handling.
>> If the memory address in error belongs to guest then
>> guest kernel is responsible for taking suitable action.
>> Patch [1] enhances KVM to exit guest with exit reason
>> set to KVM_EXIT_NMI in such cases.
>>
>> This patch handles KVM_EXIT_NMI exit. If the guest OS
>> has registered the machine check handling routine by
>> calling "ibm,nmi-register", then the handler builds
>> the error log and invokes the registered handler else
>> invokes the handler at 0x200.
>>
>> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
>>
>> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
>> ---
>>  target-ppc/kvm.c     |   69 +++++++++++++++++++++++++++++++++++++++++++
>>  target-ppc/kvm_ppc.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 150 insertions(+)
>>
>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>> index 110436d..e2e5170 100644
>> --- a/target-ppc/kvm.c
>> +++ b/target-ppc/kvm.c
>> @@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>>          ret = 0;
>>          break;
>>  
>> +    case KVM_EXIT_NMI:
>> +        DPRINTF("handle NMI exception\n");
>> +        ret = kvm_handle_nmi(cpu);
>> +        break;
>> +
>>      default:
>>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>>          ret = -1;
>> @@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>>  {
>>      return data & 0xffff;
>>  }
>> +
>> +int kvm_handle_nmi(PowerPCCPU *cpu)
>> +{
>> +    struct rtas_mc_log mc_log;
>> +    CPUPPCState *env = &cpu->env;
>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
>> +
>> +    cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
>> +
>> +    /* Properly set bits in MSR before we invoke the handler */
>> +    env->msr = 0;
>> +
>> +    if (!(*pcc->interrupts_big_endian)(cpu)) {
>> +        env->msr |= (1ULL << MSR_LE);
>> +    }
>> +
>> +#ifdef TARGET_PPC64
>> +    env->msr |= (1ULL << MSR_SF);
>> +#endif
> 
> Surely this should depend on what the MSR actually was before, or at
> the very least the cpu model, rather than just whether qemu has PPC64
> support compiled in.

ah.. ok. Will change.

Regards,
Aravinda


> 
>> +
>> +    if (!spapr->guest_machine_check_addr) {
>> +        /*
>> +         * If OS has not registered with "ibm,nmi-register"
>> +         * jump to 0x200
>> +         */
>> +        env->nip = 0x200;
>> +        return 0;
>> +    }
>> +
>> +    qemu_mutex_lock(&spapr->mc_in_progress);
>> +
>> +    /* Set error log fields */
>> +    mc_log.r3 = env->gpr[3];
>> +    mc_log.err_log.byte0 = 0;
>> +    mc_log.err_log.byte1 =
>> +        (RTAS_SEVERITY_ERROR_SYNC << RTAS_ELOG_SEVERITY_SHIFT);
>> +    mc_log.err_log.byte1 |=
>> +        (RTAS_DISP_NOT_RECOVERED << RTAS_ELOG_DISPOSITION_SHIFT);
>> +    mc_log.err_log.byte2 =
>> +        (RTAS_INITIATOR_MEMORY << RTAS_ELOG_INITIATOR_SHIFT);
>> +    mc_log.err_log.byte2 |= RTAS_TARGET_MEMORY;
>> +
>> +    if (env->spr[SPR_DSISR] & P7_DSISR_MC_UE) {
>> +        mc_log.err_log.byte3 = RTAS_TYPE_ECC_UNCORR;
>> +    } else {
>> +        mc_log.err_log.byte3 = 0;
>> +    }
>> +
>> +    /* Handle all Host/Guest LE/BE combinations */
>> +    if (env->msr & (1ULL << MSR_LE)) {
>> +        mc_log.r3 = cpu_to_le64(mc_log.r3);
>> +    } else {
>> +        mc_log.r3 = cpu_to_be64(mc_log.r3);
>> +    }
>> +
>> +    cpu_physical_memory_write(spapr->rtas_addr + RTAS_ERRLOG_OFFSET,
>> +                              &mc_log, sizeof(mc_log));
>> +
>> +    env->nip = spapr->guest_machine_check_addr;
>> +    env->gpr[3] = spapr->rtas_addr + RTAS_ERRLOG_OFFSET;
>> +
>> +    return 0;
>> +}
>> diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
>> index 5c1d334..1172735 100644
>> --- a/target-ppc/kvm_ppc.h
>> +++ b/target-ppc/kvm_ppc.h
>> @@ -53,6 +53,87 @@ void kvmppc_hash64_free_pteg(uint64_t token);
>>  void kvmppc_hash64_write_pte(CPUPPCState *env, target_ulong pte_index,
>>                               target_ulong pte0, target_ulong pte1);
>>  bool kvmppc_has_cap_fixup_hcalls(void);
>> +int kvm_handle_nmi(PowerPCCPU *cpu);
>> +
>> +/* Offset from rtas-base where error log is placed */
>> +#define RTAS_ERRLOG_OFFSET       (0x200)
>> +
>> +#define RTAS_ELOG_SEVERITY_SHIFT         0x5
>> +#define RTAS_ELOG_DISPOSITION_SHIFT      0x3
>> +#define RTAS_ELOG_INITIATOR_SHIFT        0x4
>> +
>> +/*
>> + * Only required RTAS event severity, disposition, initiator
>> + * target and type are copied from arch/powerpc/include/asm/rtas.h
>> + */
>> +
>> +/* RTAS event severity */
>> +#define RTAS_SEVERITY_ERROR_SYNC    0x3
>> +
>> +/* RTAS event disposition */
>> +#define RTAS_DISP_NOT_RECOVERED     0x2
>> +
>> +/* RTAS event initiator */
>> +#define RTAS_INITIATOR_MEMORY       0x4
>> +
>> +/* RTAS event target */
>> +#define RTAS_TARGET_MEMORY          0x4
>> +
>> +/* RTAS event type */
>> +#define RTAS_TYPE_ECC_UNCORR        0x09
>> +
>> +/*
>> + * Currently KVM only passes on the uncorrected machine
>> + * check memory error to guest. Other machine check errors
>> + * such as SLB multi-hit and TLB multi-hit are recovered
>> + * in KVM and are not passed on to guest.
>> + *
>> + * DSISR Bit for uncorrected machine check error. Based
>> + * on arch/powerpc/include/asm/mce.h
>> + */
>> +#define PPC_BIT(bit)                (0x8000000000000000ULL >> bit)
>> +#define P7_DSISR_MC_UE              (PPC_BIT(48))  /* P8 too */
>> +
>> +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */
>> +struct rtas_error_log {
>> +    /* Byte 0 */
>> +    uint8_t     byte0;          /* Architectural version */
>> +
>> +    /* Byte 1 */
>> +    uint8_t     byte1;
>> +    /* XXXXXXXX
>> +     * XXX      3: Severity level of error
>> +     *    XX    2: Degree of recovery
>> +     *      X   1: Extended log present?
>> +     *       XX 2: Reserved
>> +     */
>> +
>> +    /* Byte 2 */
>> +    uint8_t     byte2;
>> +    /* XXXXXXXX
>> +     * XXXX     4: Initiator of event
>> +     *     XXXX 4: Target of failed operation
>> +     */
>> +    uint8_t     byte3;          /* General event or error*/
>> +    __be32      extended_log_length;    /* length in bytes */
>> +    unsigned char   buffer[1];      /* Start of extended log */
>> +                                /* Variable length.      */
>> +};
>> +
>> +/*
>> + * Data format in RTAS-Blob
>> + *
>> + * This structure contains error information related to Machine
>> + * Check exception. This is filled up and copied to rtas-blob
>> + * upon machine check exception. The address of rtas-blob is
>> + * passed on to OS registered machine check notification
>> + * routines upon machine check exception
>> + */
>> +struct rtas_mc_log {
>> +    target_ulong r3;
>> +    struct rtas_error_log err_log;
>> +};
>> +
>>  
>>  #else
>>  
>>
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] spapr: Register and handle HCALL to receive updated RTAS region
  2015-11-12  3:42   ` David Gibson
@ 2015-11-12  5:28     ` Nikunj A Dadhania
  2015-11-12  7:23       ` David Gibson
  0 siblings, 1 reply; 40+ messages in thread
From: Nikunj A Dadhania @ 2015-11-12  5:28 UTC (permalink / raw)
  To: David Gibson, Aravinda Prasad
  Cc: benh, qemu-devel, qemu-ppc, paulus, sam.bobroff

David Gibson <david@gibson.dropbear.id.au> writes:

> On Wed, Nov 11, 2015 at 10:45:30PM +0530, Aravinda Prasad wrote:
>> Receive updates from SLOF about the updated rtas-base.
>> A separate patch for SLOF [1] adds functionality to invoke
>> a private HCALL whenever OS issues instantiate-rtas with
>> a new rtas-base.
>> 
>> This is required as QEMU needs to know the updated rtas-base
>> as it allocates error reporting structure in RTAS space upon
>> a machine check exception.
>> 
>> [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-August/120386.html
>> 
>> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
>
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
>
> With the obvious proviso that the SLOF update needs to be merged (both
> to SLOF and to qemu) first.

The patch is already part of SLOF tree and is there in qemu:

commit f9a60de30492863811c2cdf6f28988c9e8a2c3d9
Author: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
Date:   Mon Aug 25 18:30:10 2014 +0530

    Add private HCALL to inform updated RTAS base and entry
    
    This patch adds a private HCALL to inform qemu the updated
    rtas-base and rtas-entry address when OS invokes the call
    "instantiate-rtas". This is required as qemu allocates the
    error reporting structure in RTAS space upon a machine check
    exception and hence needs to know the updated RTAS.
    
    Enhancements to qemu to handle the private HCALL, prepare
    error log and invoke machine check notification routine
    are in a separate patch.
    
    Signed-off-by: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
    Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
    Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

Regards
Nikunj

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] spapr: Register and handle HCALL to receive updated RTAS region
  2015-11-12  5:28     ` [Qemu-devel] [Qemu-ppc] " Nikunj A Dadhania
@ 2015-11-12  7:23       ` David Gibson
  0 siblings, 0 replies; 40+ messages in thread
From: David Gibson @ 2015-11-12  7:23 UTC (permalink / raw)
  To: Nikunj A Dadhania
  Cc: benh, qemu-devel, qemu-ppc, sam.bobroff, Aravinda Prasad, paulus

[-- Attachment #1: Type: text/plain, Size: 2142 bytes --]

On Thu, Nov 12, 2015 at 10:58:59AM +0530, Nikunj A Dadhania wrote:
> David Gibson <david@gibson.dropbear.id.au> writes:
> 
> > On Wed, Nov 11, 2015 at 10:45:30PM +0530, Aravinda Prasad wrote:
> >> Receive updates from SLOF about the updated rtas-base.
> >> A separate patch for SLOF [1] adds functionality to invoke
> >> a private HCALL whenever OS issues instantiate-rtas with
> >> a new rtas-base.
> >> 
> >> This is required as QEMU needs to know the updated rtas-base
> >> as it allocates error reporting structure in RTAS space upon
> >> a machine check exception.
> >> 
> >> [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-August/120386.html
> >> 
> >> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
> >
> > Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> >
> > With the obvious proviso that the SLOF update needs to be merged (both
> > to SLOF and to qemu) first.
> 
> The patch is already part of SLOF tree and is there in qemu:
> 
> commit f9a60de30492863811c2cdf6f28988c9e8a2c3d9
> Author: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
> Date:   Mon Aug 25 18:30:10 2014 +0530
> 
>     Add private HCALL to inform updated RTAS base and entry
>     
>     This patch adds a private HCALL to inform qemu the updated
>     rtas-base and rtas-entry address when OS invokes the call
>     "instantiate-rtas". This is required as qemu allocates the
>     error reporting structure in RTAS space upon a machine check
>     exception and hence needs to know the updated RTAS.
>     
>     Enhancements to qemu to handle the private HCALL, prepare
>     error log and invoke machine check notification routine
>     are in a separate patch.
>     
>     Signed-off-by: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
>     Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
>     Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

Ah, ok.  Thanks.
-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-11 17:16 ` [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit Aravinda Prasad
  2015-11-12  4:29   ` David Gibson
@ 2015-11-12  8:09   ` Thomas Huth
  2015-11-12  9:40     ` Thomas Huth
                       ` (2 more replies)
  1 sibling, 3 replies; 40+ messages in thread
From: Thomas Huth @ 2015-11-12  8:09 UTC (permalink / raw)
  To: Aravinda Prasad, qemu-ppc, agraf, qemu-devel
  Cc: aik, benh, paulus, sam.bobroff, david

On 11/11/15 18:16, Aravinda Prasad wrote:
> Memory error such as bit flips that cannot be corrected
> by hardware are passed on to the kernel for handling.
> If the memory address in error belongs to guest then
> guest kernel is responsible for taking suitable action.
> Patch [1] enhances KVM to exit guest with exit reason
> set to KVM_EXIT_NMI in such cases.
> 
> This patch handles KVM_EXIT_NMI exit. If the guest OS
> has registered the machine check handling routine by
> calling "ibm,nmi-register", then the handler builds
> the error log and invokes the registered handler else
> invokes the handler at 0x200.
> 
> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
> 
> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
> ---
>  target-ppc/kvm.c     |   69 +++++++++++++++++++++++++++++++++++++++++++
>  target-ppc/kvm_ppc.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 150 insertions(+)
> 
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 110436d..e2e5170 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>          ret = 0;
>          break;
>  
> +    case KVM_EXIT_NMI:
> +        DPRINTF("handle NMI exception\n");
> +        ret = kvm_handle_nmi(cpu);
> +        break;
> +
>      default:
>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>          ret = -1;
> @@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>  {
>      return data & 0xffff;
>  }
> +
> +int kvm_handle_nmi(PowerPCCPU *cpu)
> +{
> +    struct rtas_mc_log mc_log;
> +    CPUPPCState *env = &cpu->env;
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> +
> +    cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
> +
> +    /* Properly set bits in MSR before we invoke the handler */
> +    env->msr = 0;
> +
> +    if (!(*pcc->interrupts_big_endian)(cpu)) {
> +        env->msr |= (1ULL << MSR_LE);
> +    }
> +
> +#ifdef TARGET_PPC64
> +    env->msr |= (1ULL << MSR_SF);
> +#endif
> +
> +    if (!spapr->guest_machine_check_addr) {
> +        /*
> +         * If OS has not registered with "ibm,nmi-register"
> +         * jump to 0x200
> +         */

Shouldn't you also check MSR_ME here first and enter checkstop when
machine checks are disabled?
Also I think you have to set up some more registers for machine check
interrupts, like SRR0 and SRR1?

> +        env->nip = 0x200;
> +        return 0;
> +    }
> +
> +    qemu_mutex_lock(&spapr->mc_in_progress);

Using a mutex here is definitely wrong. The kvm_arch_handle_exit() code
is run under the Big QEMU Lock™ (see qemu_mutex_lock_iothread() in
kvm_cpu_exec()), so if you would ever get one thread waiting for this
mutex here, it could never be unlocked again in rtas_ibm_nmi_interlock()
because the other code would wait forever to get the BQL ==> Deadlock.

I think if you want to be able to handle multiple NMIs at once, you
likely need something like an error log per CPU instead. And if an NMI
happens one CPU while there is already a NMI handler running on the very
same CPU, you could likely simply track this with an boolean variable
and put the CPU into checkstop if this happens?

> +    /* Set error log fields */
> +    mc_log.r3 = env->gpr[3];
> +    mc_log.err_log.byte0 = 0;
> +    mc_log.err_log.byte1 =
> +        (RTAS_SEVERITY_ERROR_SYNC << RTAS_ELOG_SEVERITY_SHIFT);
> +    mc_log.err_log.byte1 |=
> +        (RTAS_DISP_NOT_RECOVERED << RTAS_ELOG_DISPOSITION_SHIFT);
> +    mc_log.err_log.byte2 =
> +        (RTAS_INITIATOR_MEMORY << RTAS_ELOG_INITIATOR_SHIFT);
> +    mc_log.err_log.byte2 |= RTAS_TARGET_MEMORY;
> +
> +    if (env->spr[SPR_DSISR] & P7_DSISR_MC_UE) {
> +        mc_log.err_log.byte3 = RTAS_TYPE_ECC_UNCORR;
> +    } else {
> +        mc_log.err_log.byte3 = 0;
> +    }
> +
> +    /* Handle all Host/Guest LE/BE combinations */
> +    if (env->msr & (1ULL << MSR_LE)) {
> +        mc_log.r3 = cpu_to_le64(mc_log.r3);
> +    } else {
> +        mc_log.r3 = cpu_to_be64(mc_log.r3);
> +    }
> +
> +    cpu_physical_memory_write(spapr->rtas_addr + RTAS_ERRLOG_OFFSET,
> +                              &mc_log, sizeof(mc_log));
> +
> +    env->nip = spapr->guest_machine_check_addr;
> +    env->gpr[3] = spapr->rtas_addr + RTAS_ERRLOG_OFFSET;
> +
> +    return 0;
> +}
> diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
> index 5c1d334..1172735 100644
> --- a/target-ppc/kvm_ppc.h
> +++ b/target-ppc/kvm_ppc.h
> @@ -53,6 +53,87 @@ void kvmppc_hash64_free_pteg(uint64_t token);
>  void kvmppc_hash64_write_pte(CPUPPCState *env, target_ulong pte_index,
>                               target_ulong pte0, target_ulong pte1);
>  bool kvmppc_has_cap_fixup_hcalls(void);
> +int kvm_handle_nmi(PowerPCCPU *cpu);
> +
> +/* Offset from rtas-base where error log is placed */
> +#define RTAS_ERRLOG_OFFSET       (0x200)

Why paranthesis here?

> +#define RTAS_ELOG_SEVERITY_SHIFT         0x5
> +#define RTAS_ELOG_DISPOSITION_SHIFT      0x3
> +#define RTAS_ELOG_INITIATOR_SHIFT        0x4
> +
> +/*
> + * Only required RTAS event severity, disposition, initiator
> + * target and type are copied from arch/powerpc/include/asm/rtas.h
> + */
> +
> +/* RTAS event severity */
> +#define RTAS_SEVERITY_ERROR_SYNC    0x3
> +
> +/* RTAS event disposition */
> +#define RTAS_DISP_NOT_RECOVERED     0x2
> +
> +/* RTAS event initiator */
> +#define RTAS_INITIATOR_MEMORY       0x4
> +
> +/* RTAS event target */
> +#define RTAS_TARGET_MEMORY          0x4
> +
> +/* RTAS event type */
> +#define RTAS_TYPE_ECC_UNCORR        0x09
> +
> +/*
> + * Currently KVM only passes on the uncorrected machine
> + * check memory error to guest. Other machine check errors
> + * such as SLB multi-hit and TLB multi-hit are recovered
> + * in KVM and are not passed on to guest.
> + *
> + * DSISR Bit for uncorrected machine check error. Based
> + * on arch/powerpc/include/asm/mce.h
> + */
> +#define PPC_BIT(bit)                (0x8000000000000000ULL >> bit)
> +#define P7_DSISR_MC_UE              (PPC_BIT(48))  /* P8 too */
> +
> +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */
> +struct rtas_error_log {
> +    /* Byte 0 */
> +    uint8_t     byte0;          /* Architectural version */
> +
> +    /* Byte 1 */
> +    uint8_t     byte1;
> +    /* XXXXXXXX
> +     * XXX      3: Severity level of error
> +     *    XX    2: Degree of recovery
> +     *      X   1: Extended log present?
> +     *       XX 2: Reserved
> +     */
> +
> +    /* Byte 2 */
> +    uint8_t     byte2;
> +    /* XXXXXXXX
> +     * XXXX     4: Initiator of event
> +     *     XXXX 4: Target of failed operation
> +     */
> +    uint8_t     byte3;          /* General event or error*/
> +    __be32      extended_log_length;    /* length in bytes */
> +    unsigned char   buffer[1];      /* Start of extended log */
> +                                /* Variable length.      */
> +};
> +
> +/*
> + * Data format in RTAS-Blob
> + *
> + * This structure contains error information related to Machine
> + * Check exception. This is filled up and copied to rtas-blob
> + * upon machine check exception. The address of rtas-blob is
> + * passed on to OS registered machine check notification
> + * routines upon machine check exception
> + */
> +struct rtas_mc_log {
> +    target_ulong r3;
> +    struct rtas_error_log err_log;
> +};
> +

QEMU coding style is normally to use CamelCase for type names, see
chapter 3 in CODING_STYLE.

 Thomas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] spapr: Extend rtas-blob
  2015-11-11 17:15 ` [Qemu-devel] [PATCH 1/4] spapr: Extend rtas-blob Aravinda Prasad
  2015-11-12  3:40   ` David Gibson
@ 2015-11-12  8:26   ` Thomas Huth
  2015-11-12 11:53     ` David Gibson
  2015-11-12 18:59     ` Aravinda Prasad
  1 sibling, 2 replies; 40+ messages in thread
From: Thomas Huth @ 2015-11-12  8:26 UTC (permalink / raw)
  To: Aravinda Prasad, qemu-ppc, agraf, qemu-devel
  Cc: aik, benh, paulus, sam.bobroff, david

On 11/11/15 18:15, Aravinda Prasad wrote:
> Extend rtas-blob to accommodate error log. Error log
> structure is saved in rtas space upon a machine check
> exception.
> 
> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 05926a3..b7b9e09 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1556,6 +1556,10 @@ static void ppc_spapr_init(MachineState *machine)
>          exit(1);
>      }
>      spapr->rtas_size = get_image_size(filename);
> +
> +    /* Resize blob to accommodate error log. */
> +    spapr->rtas_size = TARGET_PAGE_ALIGN(spapr->rtas_size);
> +
>      spapr->rtas_blob = g_malloc(spapr->rtas_size);
>      if (load_image_size(filename, spapr->rtas_blob, spapr->rtas_size) < 0) {
>          error_report("Could not load LPAR rtas '%s'", filename);

Sorry to say that, but this patch is horrible!

1) If the rtas blob ever gets bigger than 512 bytes, we will get
"random" corruption of the RTAS code later when an NMI occurs since the
mc log is blindly copied into the RTAS area later!
==> Please add an "assert(spapr->rtas_size < RTAS_ERRLOG_OFFSET)" at the
beginning of your patch.

2) Why resizing with TARGET_PAGE_ALIGN() ? In the very worst case, this
would not change the size at all (if the rtas_size is already a multiple
of PAGE_SIZE)
==> Please set the size to a proper value like
 RTAS_ERRLOG_OFFSET + sizeof(struct rtas_mc_log)
instead!

 Thomas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] spapr: Handle "ibm, nmi-register" and "ibm, nmi-interlock" RTAS calls
  2015-11-11 17:15 ` [Qemu-devel] [PATCH 3/4] spapr: Handle "ibm, nmi-register" and "ibm, nmi-interlock" RTAS calls Aravinda Prasad
  2015-11-12  4:02   ` David Gibson
@ 2015-11-12  9:23   ` Thomas Huth
  2015-11-12 18:52     ` Aravinda Prasad
  1 sibling, 1 reply; 40+ messages in thread
From: Thomas Huth @ 2015-11-12  9:23 UTC (permalink / raw)
  To: Aravinda Prasad, qemu-ppc, agraf, qemu-devel
  Cc: aik, benh, paulus, sam.bobroff, david

On 11/11/15 18:15, Aravinda Prasad wrote:
> This patch adds support in QEMU to handle "ibm,nmi-register"
> and "ibm,nmi-interlock" RTAS calls.
> 
> The machine check notification address is saved when the
> OS issues "ibm,nmi-register" RTAS call.
> 
> This patch also handles the case when multiple processors
> experience machine check at or about the same time by
> handling "ibm,nmi-interlock" call. In such cases, as per
> PAPR, subsequent processors serialize waiting for the first
> processor to issue the "ibm,nmi-interlock" call. The second
> processor waits till the first processor, which also
> received a machine check error, is done reading the error
> log. The first processor issues "ibm,nmi-interlock" call
> when the error log is consumed. This patch implements the
> releasing part of the error-log while subsequent patch
> (which builds error log) handles the locking part.
> 
> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr_rtas.c    |   29 +++++++++++++++++++++++++++++
>  include/hw/ppc/spapr.h |    8 +++++++-
>  2 files changed, 36 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 9869bc9..fd4d2af 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -597,6 +597,31 @@ out:
>      rtas_st(rets, 0, rc);
>  }
>  
> +static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
> +                                  sPAPRMachineState *spapr,
> +                                  uint32_t token, uint32_t nargs,
> +                                  target_ulong args,
> +                                  uint32_t nret, target_ulong rets)
> +{
> +    qemu_mutex_init(&spapr->mc_in_progress);
> +    spapr->guest_machine_check_addr = rtas_ld(args, 1);
> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +}
> +
> +static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu,
> +                                   sPAPRMachineState *spapr,
> +                                   uint32_t token, uint32_t nargs,
> +                                   target_ulong args,
> +                                   uint32_t nret, target_ulong rets)
> +{
> +    /*
> +     * VCPU issuing "ibm,nmi-interlock" is done with NMI handling,
> +     * hence unlock mc_in_progress.
> +     */
> +    qemu_mutex_unlock(&spapr->mc_in_progress);
> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +}

Maybe the interlock function should return an error if the nmi_register
function has not been called before? OTOH, RTAS is not supposed to do
excessive parameter checking, so this is maybe not worth the effort.

 Thomas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-12  8:09   ` Thomas Huth
@ 2015-11-12  9:40     ` Thomas Huth
  2015-11-12 18:49       ` Aravinda Prasad
  2015-11-13  1:57       ` David Gibson
  2015-11-12 18:23     ` Aravinda Prasad
  2015-11-16  3:50     ` Paul Mackerras
  2 siblings, 2 replies; 40+ messages in thread
From: Thomas Huth @ 2015-11-12  9:40 UTC (permalink / raw)
  To: Aravinda Prasad, qemu-ppc, agraf, qemu-devel
  Cc: aik, benh, paulus, sam.bobroff, david

On 12/11/15 09:09, Thomas Huth wrote:
> On 11/11/15 18:16, Aravinda Prasad wrote:
>> Memory error such as bit flips that cannot be corrected
>> by hardware are passed on to the kernel for handling.
>> If the memory address in error belongs to guest then
>> guest kernel is responsible for taking suitable action.
>> Patch [1] enhances KVM to exit guest with exit reason
>> set to KVM_EXIT_NMI in such cases.
>>
>> This patch handles KVM_EXIT_NMI exit. If the guest OS
>> has registered the machine check handling routine by
>> calling "ibm,nmi-register", then the handler builds
>> the error log and invokes the registered handler else
>> invokes the handler at 0x200.
>>
>> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
>>
>> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
>> ---
>>  target-ppc/kvm.c     |   69 +++++++++++++++++++++++++++++++++++++++++++
>>  target-ppc/kvm_ppc.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 150 insertions(+)
>>
>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>> index 110436d..e2e5170 100644
>> --- a/target-ppc/kvm.c
>> +++ b/target-ppc/kvm.c
>> @@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>>          ret = 0;
>>          break;
>>  
>> +    case KVM_EXIT_NMI:
>> +        DPRINTF("handle NMI exception\n");
>> +        ret = kvm_handle_nmi(cpu);
>> +        break;
>> +
>>      default:
>>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>>          ret = -1;
>> @@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>>  {
>>      return data & 0xffff;
>>  }
>> +
>> +int kvm_handle_nmi(PowerPCCPU *cpu)
>> +{
>> +    struct rtas_mc_log mc_log;
>> +    CPUPPCState *env = &cpu->env;
>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
>> +
>> +    cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
>> +
>> +    /* Properly set bits in MSR before we invoke the handler */
>> +    env->msr = 0;
>> +
>> +    if (!(*pcc->interrupts_big_endian)(cpu)) {
>> +        env->msr |= (1ULL << MSR_LE);
>> +    }
>> +
>> +#ifdef TARGET_PPC64
>> +    env->msr |= (1ULL << MSR_SF);
>> +#endif
>> +
>> +    if (!spapr->guest_machine_check_addr) {
>> +        /*
>> +         * If OS has not registered with "ibm,nmi-register"
>> +         * jump to 0x200
>> +         */
> 
> Shouldn't you also check MSR_ME here first and enter checkstop when
> machine checks are disabled?
> Also I think you have to set up some more registers for machine check
> interrupts, like SRR0 and SRR1?
> 
>> +        env->nip = 0x200;
>> +        return 0;
>> +    }
>> +
>> +    qemu_mutex_lock(&spapr->mc_in_progress);
> 
> Using a mutex here is definitely wrong. The kvm_arch_handle_exit() code
> is run under the Big QEMU Lock™ (see qemu_mutex_lock_iothread() in
> kvm_cpu_exec()),

In case you're looking for the calls, I just noticed that the
qemu_mutex_lock_iothread() have recently been pushed into
kvm_arch_handle_exit() itself.

> so if you would ever get one thread waiting for this
> mutex here, it could never be unlocked again in rtas_ibm_nmi_interlock()
> because the other code would wait forever to get the BQL ==> Deadlock.
> 
> I think if you want to be able to handle multiple NMIs at once, you
> likely need something like an error log per CPU instead. And if an NMI
> happens one CPU while there is already a NMI handler running on the very
> same CPU, you could likely simply track this with an boolean variable
> and put the CPU into checkstop if this happens?

Ok, I now had a look into the LoPAPR spec, and if I've got that right,
you really have to serialize the NMIs in case they happen at multiple
CPUs at the same time. So I guess the best thing you can do here is
something like:

   while (spapr->mc_in_progress) {
       /*
        * There is already another NMI in progress, thus we need
        * to yield here to wait until it has been finsihed
        */
       qemu_mutex_unlock_iothread();
       usleep(10);
       qemu_mutex_lock_iothread();
   }
   spapr->mc_in_progress = true;

Also LoPAPR talks about 'subsequent processors report "fatal error
previously reported"', so maybe the other processors should report that
condition in this case?
And of course you've also got to check that the same CPU is not getting
multiple NMIs before the interlock function has been called again.

 Thomas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] spapr: Extend rtas-blob
  2015-11-12  8:26   ` Thomas Huth
@ 2015-11-12 11:53     ` David Gibson
  2015-11-12 18:59     ` Aravinda Prasad
  1 sibling, 0 replies; 40+ messages in thread
From: David Gibson @ 2015-11-12 11:53 UTC (permalink / raw)
  To: Thomas Huth
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, sam.bobroff,
	Aravinda Prasad, paulus

[-- Attachment #1: Type: text/plain, Size: 2043 bytes --]

On Thu, Nov 12, 2015 at 09:26:26AM +0100, Thomas Huth wrote:
> On 11/11/15 18:15, Aravinda Prasad wrote:
> > Extend rtas-blob to accommodate error log. Error log
> > structure is saved in rtas space upon a machine check
> > exception.
> > 
> > Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c |    4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 05926a3..b7b9e09 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1556,6 +1556,10 @@ static void ppc_spapr_init(MachineState *machine)
> >          exit(1);
> >      }
> >      spapr->rtas_size = get_image_size(filename);
> > +
> > +    /* Resize blob to accommodate error log. */
> > +    spapr->rtas_size = TARGET_PAGE_ALIGN(spapr->rtas_size);
> > +
> >      spapr->rtas_blob = g_malloc(spapr->rtas_size);
> >      if (load_image_size(filename, spapr->rtas_blob, spapr->rtas_size) < 0) {
> >          error_report("Could not load LPAR rtas '%s'", filename);
> 
> Sorry to say that, but this patch is horrible!
> 
> 1) If the rtas blob ever gets bigger than 512 bytes, we will get
> "random" corruption of the RTAS code later when an NMI occurs since the
> mc log is blindly copied into the RTAS area later!
> ==> Please add an "assert(spapr->rtas_size < RTAS_ERRLOG_OFFSET)" at the
> beginning of your patch.

The assert is a good idea, although I think the chances of the rtas
ever growing beyond its current 20 bytes is just about nil.

> 2) Why resizing with TARGET_PAGE_ALIGN() ? In the very worst case, this
> would not change the size at all (if the rtas_size is already a multiple
> of PAGE_SIZE)
> ==> Please set the size to a proper value like
>  RTAS_ERRLOG_OFFSET + sizeof(struct rtas_mc_log)
> instead!

That's a good point.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] spapr: Handle "ibm, nmi-register" and "ibm, nmi-interlock" RTAS calls
  2015-11-12  4:02   ` David Gibson
@ 2015-11-12 18:04     ` Aravinda Prasad
  0 siblings, 0 replies; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-12 18:04 UTC (permalink / raw)
  To: David Gibson; +Cc: benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff



On Thursday 12 November 2015 09:32 AM, David Gibson wrote:
> On Wed, Nov 11, 2015 at 10:45:44PM +0530, Aravinda Prasad wrote:
>> This patch adds support in QEMU to handle "ibm,nmi-register"
>> and "ibm,nmi-interlock" RTAS calls.
>>
>> The machine check notification address is saved when the
>> OS issues "ibm,nmi-register" RTAS call.
>>
>> This patch also handles the case when multiple processors
>> experience machine check at or about the same time by
>> handling "ibm,nmi-interlock" call. In such cases, as per
>> PAPR, subsequent processors serialize waiting for the first
>> processor to issue the "ibm,nmi-interlock" call. The second
>> processor waits till the first processor, which also
>> received a machine check error, is done reading the error
>> log. The first processor issues "ibm,nmi-interlock" call
>> when the error log is consumed. This patch implements the
>> releasing part of the error-log while subsequent patch
>> (which builds error log) handles the locking part.
>>
>> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
>> ---
>>  hw/ppc/spapr_rtas.c    |   29 +++++++++++++++++++++++++++++
>>  include/hw/ppc/spapr.h |    8 +++++++-
>>  2 files changed, 36 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
>> index 9869bc9..fd4d2af 100644
>> --- a/hw/ppc/spapr_rtas.c
>> +++ b/hw/ppc/spapr_rtas.c
>> @@ -597,6 +597,31 @@ out:
>>      rtas_st(rets, 0, rc);
>>  }
>>  
>> +static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
>> +                                  sPAPRMachineState *spapr,
>> +                                  uint32_t token, uint32_t nargs,
>> +                                  target_ulong args,
>> +                                  uint32_t nret, target_ulong rets)
>> +{
>> +    qemu_mutex_init(&spapr->mc_in_progress);
>> +    spapr->guest_machine_check_addr = rtas_ld(args, 1);
>> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
>> +}
>> +
>> +static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu,
>> +                                   sPAPRMachineState *spapr,
>> +                                   uint32_t token, uint32_t nargs,
>> +                                   target_ulong args,
>> +                                   uint32_t nret, target_ulong rets)
>> +{
>> +    /*
>> +     * VCPU issuing "ibm,nmi-interlock" is done with NMI handling,
>> +     * hence unlock mc_in_progress.
>> +     */
>> +    qemu_mutex_unlock(&spapr->mc_in_progress);
> 
> Hrm... allowing the guest direct control over a qemu internal mutex
> sets off alarm bells for me.
> 
> It could be ok, depending on exactly where the mutex is taken, and
> whether it has a timeout and so forth, but it makes me nervous.
> 
> If nothing else this needs some substantial comments explaining the
> locking scheme and why it's safe to give userspace direct control over
> the unlock.

I will follow the approach suggested by Thomas which eliminates direct
control of qemu internal lock by guest.

Regards,
Aravinda

> 
>> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
>> +}
>> +
>>  static struct rtas_call {
>>      const char *name;
>>      spapr_rtas_fn fn;
>> @@ -747,6 +772,10 @@ static void core_rtas_register_types(void)
>>                          rtas_get_sensor_state);
>>      spapr_rtas_register(RTAS_IBM_CONFIGURE_CONNECTOR, "ibm,configure-connector",
>>                          rtas_ibm_configure_connector);
>> +    spapr_rtas_register(RTAS_IBM_NMI_REGISTER, "ibm,nmi-register",
>> +                        rtas_ibm_nmi_register);
>> +    spapr_rtas_register(RTAS_IBM_NMI_INTERLOCK, "ibm,nmi-interlock",
>> +                        rtas_ibm_nmi_interlock);
>>  }
>>  
>>  type_init(core_rtas_register_types)
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index b5cadd7..0a31d15 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -74,6 +74,10 @@ struct sPAPRMachineState {
>>      /* RTAS state */
>>      QTAILQ_HEAD(, sPAPRConfigureConnectorState) ccs_list;
>>  
>> +    /* State related to "ibm,nmi-register" and "ibm,nmi-interlock" calls */
>> +    target_ulong guest_machine_check_addr;
>> +    QemuMutex mc_in_progress;
>> +
>>      /*< public >*/
>>      char *kvm_type;
>>  };
>> @@ -458,8 +462,10 @@ int spapr_allocate_irq_block(int num, bool lsi, bool msi);
>>  #define RTAS_IBM_SET_SLOT_RESET                 (RTAS_TOKEN_BASE + 0x23)
>>  #define RTAS_IBM_CONFIGURE_PE                   (RTAS_TOKEN_BASE + 0x24)
>>  #define RTAS_IBM_SLOT_ERROR_DETAIL              (RTAS_TOKEN_BASE + 0x25)
>> +#define RTAS_IBM_NMI_REGISTER                   (RTAS_TOKEN_BASE + 0x26)
>> +#define RTAS_IBM_NMI_INTERLOCK                  (RTAS_TOKEN_BASE + 0x27)
>>  
>> -#define RTAS_TOKEN_MAX                          (RTAS_TOKEN_BASE + 0x26)
>> +#define RTAS_TOKEN_MAX                          (RTAS_TOKEN_BASE + 0x28)
>>  
>>  /* RTAS ibm,get-system-parameter token values */
>>  #define RTAS_SYSPARM_SPLPAR_CHARACTERISTICS      20
>>
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-12  8:09   ` Thomas Huth
  2015-11-12  9:40     ` Thomas Huth
@ 2015-11-12 18:23     ` Aravinda Prasad
  2015-11-13  1:58       ` David Gibson
  2015-11-19  1:56       ` Alexey Kardashevskiy
  2015-11-16  3:50     ` Paul Mackerras
  2 siblings, 2 replies; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-12 18:23 UTC (permalink / raw)
  To: Thomas Huth
  Cc: benh, aik, agraf, qemu-devel, paulus, qemu-ppc, sam.bobroff, david



On Thursday 12 November 2015 01:39 PM, Thomas Huth wrote:
> On 11/11/15 18:16, Aravinda Prasad wrote:
>> Memory error such as bit flips that cannot be corrected
>> by hardware are passed on to the kernel for handling.
>> If the memory address in error belongs to guest then
>> guest kernel is responsible for taking suitable action.
>> Patch [1] enhances KVM to exit guest with exit reason
>> set to KVM_EXIT_NMI in such cases.
>>
>> This patch handles KVM_EXIT_NMI exit. If the guest OS
>> has registered the machine check handling routine by
>> calling "ibm,nmi-register", then the handler builds
>> the error log and invokes the registered handler else
>> invokes the handler at 0x200.
>>
>> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
>>
>> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
>> ---
>>  target-ppc/kvm.c     |   69 +++++++++++++++++++++++++++++++++++++++++++
>>  target-ppc/kvm_ppc.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 150 insertions(+)
>>
>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>> index 110436d..e2e5170 100644
>> --- a/target-ppc/kvm.c
>> +++ b/target-ppc/kvm.c
>> @@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>>          ret = 0;
>>          break;
>>  
>> +    case KVM_EXIT_NMI:
>> +        DPRINTF("handle NMI exception\n");
>> +        ret = kvm_handle_nmi(cpu);
>> +        break;
>> +
>>      default:
>>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>>          ret = -1;
>> @@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>>  {
>>      return data & 0xffff;
>>  }
>> +
>> +int kvm_handle_nmi(PowerPCCPU *cpu)
>> +{
>> +    struct rtas_mc_log mc_log;
>> +    CPUPPCState *env = &cpu->env;
>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
>> +
>> +    cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
>> +
>> +    /* Properly set bits in MSR before we invoke the handler */
>> +    env->msr = 0;
>> +
>> +    if (!(*pcc->interrupts_big_endian)(cpu)) {
>> +        env->msr |= (1ULL << MSR_LE);
>> +    }
>> +
>> +#ifdef TARGET_PPC64
>> +    env->msr |= (1ULL << MSR_SF);
>> +#endif
>> +
>> +    if (!spapr->guest_machine_check_addr) {
>> +        /*
>> +         * If OS has not registered with "ibm,nmi-register"
>> +         * jump to 0x200
>> +         */
> 
> Shouldn't you also check MSR_ME here first and enter checkstop when
> machine checks are disabled?

Yes, MSR_ME should be checked first.

> Also I think you have to set up some more registers for machine check
> interrupts, like SRR0 and SRR1?

SRRO and SRR1 of vcpu are properly set in KVM in kvmppc_interrupt_hv. I
am not sure if any other registers need to be set.

> 
>> +        env->nip = 0x200;
>> +        return 0;
>> +    }
>> +
>> +    qemu_mutex_lock(&spapr->mc_in_progress);
> 
> Using a mutex here is definitely wrong. The kvm_arch_handle_exit() code
> is run under the Big QEMU Lock™ (see qemu_mutex_lock_iothread() in
> kvm_cpu_exec()), so if you would ever get one thread waiting for this
> mutex here, it could never be unlocked again in rtas_ibm_nmi_interlock()
> because the other code would wait forever to get the BQL ==> Deadlock.
> 
> I think if you want to be able to handle multiple NMIs at once, you
> likely need something like an error log per CPU instead. And if an NMI
> happens one CPU while there is already a NMI handler running on the very
> same CPU, you could likely simply track this with an boolean variable
> and put the CPU into checkstop if this happens?
> 
>> +    /* Set error log fields */
>> +    mc_log.r3 = env->gpr[3];
>> +    mc_log.err_log.byte0 = 0;
>> +    mc_log.err_log.byte1 =
>> +        (RTAS_SEVERITY_ERROR_SYNC << RTAS_ELOG_SEVERITY_SHIFT);
>> +    mc_log.err_log.byte1 |=
>> +        (RTAS_DISP_NOT_RECOVERED << RTAS_ELOG_DISPOSITION_SHIFT);
>> +    mc_log.err_log.byte2 =
>> +        (RTAS_INITIATOR_MEMORY << RTAS_ELOG_INITIATOR_SHIFT);
>> +    mc_log.err_log.byte2 |= RTAS_TARGET_MEMORY;
>> +
>> +    if (env->spr[SPR_DSISR] & P7_DSISR_MC_UE) {
>> +        mc_log.err_log.byte3 = RTAS_TYPE_ECC_UNCORR;
>> +    } else {
>> +        mc_log.err_log.byte3 = 0;
>> +    }
>> +
>> +    /* Handle all Host/Guest LE/BE combinations */
>> +    if (env->msr & (1ULL << MSR_LE)) {
>> +        mc_log.r3 = cpu_to_le64(mc_log.r3);
>> +    } else {
>> +        mc_log.r3 = cpu_to_be64(mc_log.r3);
>> +    }
>> +
>> +    cpu_physical_memory_write(spapr->rtas_addr + RTAS_ERRLOG_OFFSET,
>> +                              &mc_log, sizeof(mc_log));
>> +
>> +    env->nip = spapr->guest_machine_check_addr;
>> +    env->gpr[3] = spapr->rtas_addr + RTAS_ERRLOG_OFFSET;
>> +
>> +    return 0;
>> +}
>> diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
>> index 5c1d334..1172735 100644
>> --- a/target-ppc/kvm_ppc.h
>> +++ b/target-ppc/kvm_ppc.h
>> @@ -53,6 +53,87 @@ void kvmppc_hash64_free_pteg(uint64_t token);
>>  void kvmppc_hash64_write_pte(CPUPPCState *env, target_ulong pte_index,
>>                               target_ulong pte0, target_ulong pte1);
>>  bool kvmppc_has_cap_fixup_hcalls(void);
>> +int kvm_handle_nmi(PowerPCCPU *cpu);
>> +
>> +/* Offset from rtas-base where error log is placed */
>> +#define RTAS_ERRLOG_OFFSET       (0x200)
> 
> Why paranthesis here?

hmm.. not required.

> 
>> +#define RTAS_ELOG_SEVERITY_SHIFT         0x5
>> +#define RTAS_ELOG_DISPOSITION_SHIFT      0x3
>> +#define RTAS_ELOG_INITIATOR_SHIFT        0x4
>> +
>> +/*
>> + * Only required RTAS event severity, disposition, initiator
>> + * target and type are copied from arch/powerpc/include/asm/rtas.h
>> + */
>> +
>> +/* RTAS event severity */
>> +#define RTAS_SEVERITY_ERROR_SYNC    0x3
>> +
>> +/* RTAS event disposition */
>> +#define RTAS_DISP_NOT_RECOVERED     0x2
>> +
>> +/* RTAS event initiator */
>> +#define RTAS_INITIATOR_MEMORY       0x4
>> +
>> +/* RTAS event target */
>> +#define RTAS_TARGET_MEMORY          0x4
>> +
>> +/* RTAS event type */
>> +#define RTAS_TYPE_ECC_UNCORR        0x09
>> +
>> +/*
>> + * Currently KVM only passes on the uncorrected machine
>> + * check memory error to guest. Other machine check errors
>> + * such as SLB multi-hit and TLB multi-hit are recovered
>> + * in KVM and are not passed on to guest.
>> + *
>> + * DSISR Bit for uncorrected machine check error. Based
>> + * on arch/powerpc/include/asm/mce.h
>> + */
>> +#define PPC_BIT(bit)                (0x8000000000000000ULL >> bit)
>> +#define P7_DSISR_MC_UE              (PPC_BIT(48))  /* P8 too */
>> +
>> +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */
>> +struct rtas_error_log {
>> +    /* Byte 0 */
>> +    uint8_t     byte0;          /* Architectural version */
>> +
>> +    /* Byte 1 */
>> +    uint8_t     byte1;
>> +    /* XXXXXXXX
>> +     * XXX      3: Severity level of error
>> +     *    XX    2: Degree of recovery
>> +     *      X   1: Extended log present?
>> +     *       XX 2: Reserved
>> +     */
>> +
>> +    /* Byte 2 */
>> +    uint8_t     byte2;
>> +    /* XXXXXXXX
>> +     * XXXX     4: Initiator of event
>> +     *     XXXX 4: Target of failed operation
>> +     */
>> +    uint8_t     byte3;          /* General event or error*/
>> +    __be32      extended_log_length;    /* length in bytes */
>> +    unsigned char   buffer[1];      /* Start of extended log */
>> +                                /* Variable length.      */
>> +};
>> +
>> +/*
>> + * Data format in RTAS-Blob
>> + *
>> + * This structure contains error information related to Machine
>> + * Check exception. This is filled up and copied to rtas-blob
>> + * upon machine check exception. The address of rtas-blob is
>> + * passed on to OS registered machine check notification
>> + * routines upon machine check exception
>> + */
>> +struct rtas_mc_log {
>> +    target_ulong r3;
>> +    struct rtas_error_log err_log;
>> +};
>> +
> 
> QEMU coding style is normally to use CamelCase for type names, see
> chapter 3 in CODING_STYLE.

Will take care in the next revision.

Regards,
Aravinda

> 
>  Thomas
> 
> 
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-12  9:40     ` Thomas Huth
@ 2015-11-12 18:49       ` Aravinda Prasad
  2015-11-16  7:52         ` Thomas Huth
  2015-11-13  1:57       ` David Gibson
  1 sibling, 1 reply; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-12 18:49 UTC (permalink / raw)
  To: Thomas Huth
  Cc: benh, aik, agraf, qemu-devel, paulus, qemu-ppc, sam.bobroff, david



On Thursday 12 November 2015 03:10 PM, Thomas Huth wrote:
> On 12/11/15 09:09, Thomas Huth wrote:
>> On 11/11/15 18:16, Aravinda Prasad wrote:
>>> Memory error such as bit flips that cannot be corrected
>>> by hardware are passed on to the kernel for handling.
>>> If the memory address in error belongs to guest then
>>> guest kernel is responsible for taking suitable action.
>>> Patch [1] enhances KVM to exit guest with exit reason
>>> set to KVM_EXIT_NMI in such cases.
>>>
>>> This patch handles KVM_EXIT_NMI exit. If the guest OS
>>> has registered the machine check handling routine by
>>> calling "ibm,nmi-register", then the handler builds
>>> the error log and invokes the registered handler else
>>> invokes the handler at 0x200.
>>>
>>> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
>>>
>>> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
>>> ---
>>>  target-ppc/kvm.c     |   69 +++++++++++++++++++++++++++++++++++++++++++
>>>  target-ppc/kvm_ppc.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 150 insertions(+)
>>>
>>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>>> index 110436d..e2e5170 100644
>>> --- a/target-ppc/kvm.c
>>> +++ b/target-ppc/kvm.c
>>> @@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>>>          ret = 0;
>>>          break;
>>>  
>>> +    case KVM_EXIT_NMI:
>>> +        DPRINTF("handle NMI exception\n");
>>> +        ret = kvm_handle_nmi(cpu);
>>> +        break;
>>> +
>>>      default:
>>>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>>>          ret = -1;
>>> @@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>>>  {
>>>      return data & 0xffff;
>>>  }
>>> +
>>> +int kvm_handle_nmi(PowerPCCPU *cpu)
>>> +{
>>> +    struct rtas_mc_log mc_log;
>>> +    CPUPPCState *env = &cpu->env;
>>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>>> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
>>> +
>>> +    cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
>>> +
>>> +    /* Properly set bits in MSR before we invoke the handler */
>>> +    env->msr = 0;
>>> +
>>> +    if (!(*pcc->interrupts_big_endian)(cpu)) {
>>> +        env->msr |= (1ULL << MSR_LE);
>>> +    }
>>> +
>>> +#ifdef TARGET_PPC64
>>> +    env->msr |= (1ULL << MSR_SF);
>>> +#endif
>>> +
>>> +    if (!spapr->guest_machine_check_addr) {
>>> +        /*
>>> +         * If OS has not registered with "ibm,nmi-register"
>>> +         * jump to 0x200
>>> +         */
>>
>> Shouldn't you also check MSR_ME here first and enter checkstop when
>> machine checks are disabled?
>> Also I think you have to set up some more registers for machine check
>> interrupts, like SRR0 and SRR1?
>>
>>> +        env->nip = 0x200;
>>> +        return 0;
>>> +    }
>>> +
>>> +    qemu_mutex_lock(&spapr->mc_in_progress);
>>
>> Using a mutex here is definitely wrong. The kvm_arch_handle_exit() code
>> is run under the Big QEMU Lock™ (see qemu_mutex_lock_iothread() in
>> kvm_cpu_exec()),
> 
> In case you're looking for the calls, I just noticed that the
> qemu_mutex_lock_iothread() have recently been pushed into
> kvm_arch_handle_exit() itself.

ok thanks for pointing out.

> 
>> so if you would ever get one thread waiting for this
>> mutex here, it could never be unlocked again in rtas_ibm_nmi_interlock()
>> because the other code would wait forever to get the BQL ==> Deadlock.
>>
>> I think if you want to be able to handle multiple NMIs at once, you
>> likely need something like an error log per CPU instead. And if an NMI
>> happens one CPU while there is already a NMI handler running on the very
>> same CPU, you could likely simply track this with an boolean variable
>> and put the CPU into checkstop if this happens?
> 
> Ok, I now had a look into the LoPAPR spec, and if I've got that right,
> you really have to serialize the NMIs in case they happen at multiple
> CPUs at the same time. So I guess the best thing you can do here is
> something like:
> 
>    while (spapr->mc_in_progress) {
>        /*
>         * There is already another NMI in progress, thus we need
>         * to yield here to wait until it has been finsihed
>         */
>        qemu_mutex_unlock_iothread();
>        usleep(10);
>        qemu_mutex_lock_iothread();
>    }
>    spapr->mc_in_progress = true;

Above piece of code should help. I will modify accordingly in the next
revision.

> 
> Also LoPAPR talks about 'subsequent processors report "fatal error
> previously reported"', so maybe the other processors should report that
> condition in this case?

I feel guest kernel is responsible for that or does that mean that qemu
should report the same error, which first processor encountered, for
subsequent processors? In that case what if the error encountered by
first processor was recovered.

> And of course you've also got to check that the same CPU is not getting
> multiple NMIs before the interlock function has been called again.

I think it is good to check that. However, shouldn't the guest enable ME
until it calls interlock function?

Regards,
Aravinda

> 
>  Thomas
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] spapr: Handle "ibm, nmi-register" and "ibm, nmi-interlock" RTAS calls
  2015-11-12  9:23   ` Thomas Huth
@ 2015-11-12 18:52     ` Aravinda Prasad
  0 siblings, 0 replies; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-12 18:52 UTC (permalink / raw)
  To: Thomas Huth
  Cc: benh, aik, agraf, qemu-devel, paulus, qemu-ppc, sam.bobroff, david



On Thursday 12 November 2015 02:53 PM, Thomas Huth wrote:
> On 11/11/15 18:15, Aravinda Prasad wrote:
>> This patch adds support in QEMU to handle "ibm,nmi-register"
>> and "ibm,nmi-interlock" RTAS calls.
>>
>> The machine check notification address is saved when the
>> OS issues "ibm,nmi-register" RTAS call.
>>
>> This patch also handles the case when multiple processors
>> experience machine check at or about the same time by
>> handling "ibm,nmi-interlock" call. In such cases, as per
>> PAPR, subsequent processors serialize waiting for the first
>> processor to issue the "ibm,nmi-interlock" call. The second
>> processor waits till the first processor, which also
>> received a machine check error, is done reading the error
>> log. The first processor issues "ibm,nmi-interlock" call
>> when the error log is consumed. This patch implements the
>> releasing part of the error-log while subsequent patch
>> (which builds error log) handles the locking part.
>>
>> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
>> ---
>>  hw/ppc/spapr_rtas.c    |   29 +++++++++++++++++++++++++++++
>>  include/hw/ppc/spapr.h |    8 +++++++-
>>  2 files changed, 36 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
>> index 9869bc9..fd4d2af 100644
>> --- a/hw/ppc/spapr_rtas.c
>> +++ b/hw/ppc/spapr_rtas.c
>> @@ -597,6 +597,31 @@ out:
>>      rtas_st(rets, 0, rc);
>>  }
>>  
>> +static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
>> +                                  sPAPRMachineState *spapr,
>> +                                  uint32_t token, uint32_t nargs,
>> +                                  target_ulong args,
>> +                                  uint32_t nret, target_ulong rets)
>> +{
>> +    qemu_mutex_init(&spapr->mc_in_progress);
>> +    spapr->guest_machine_check_addr = rtas_ld(args, 1);
>> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
>> +}
>> +
>> +static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu,
>> +                                   sPAPRMachineState *spapr,
>> +                                   uint32_t token, uint32_t nargs,
>> +                                   target_ulong args,
>> +                                   uint32_t nret, target_ulong rets)
>> +{
>> +    /*
>> +     * VCPU issuing "ibm,nmi-interlock" is done with NMI handling,
>> +     * hence unlock mc_in_progress.
>> +     */
>> +    qemu_mutex_unlock(&spapr->mc_in_progress);
>> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
>> +}
> 
> Maybe the interlock function should return an error if the nmi_register
> function has not been called before? OTOH, RTAS is not supposed to do
> excessive parameter checking, so this is maybe not worth the effort.

This will be a simple check to see if spapr->guest_machine_check_addr is
set. I will include it.

Regards,
Aravinda

> 
>  Thomas
> 
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] spapr: Extend rtas-blob
  2015-11-12  8:26   ` Thomas Huth
  2015-11-12 11:53     ` David Gibson
@ 2015-11-12 18:59     ` Aravinda Prasad
  1 sibling, 0 replies; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-12 18:59 UTC (permalink / raw)
  To: Thomas Huth
  Cc: benh, aik, agraf, qemu-devel, paulus, qemu-ppc, sam.bobroff, david



On Thursday 12 November 2015 01:56 PM, Thomas Huth wrote:
> On 11/11/15 18:15, Aravinda Prasad wrote:
>> Extend rtas-blob to accommodate error log. Error log
>> structure is saved in rtas space upon a machine check
>> exception.
>>
>> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
>> ---
>>  hw/ppc/spapr.c |    4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 05926a3..b7b9e09 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1556,6 +1556,10 @@ static void ppc_spapr_init(MachineState *machine)
>>          exit(1);
>>      }
>>      spapr->rtas_size = get_image_size(filename);
>> +
>> +    /* Resize blob to accommodate error log. */
>> +    spapr->rtas_size = TARGET_PAGE_ALIGN(spapr->rtas_size);
>> +
>>      spapr->rtas_blob = g_malloc(spapr->rtas_size);
>>      if (load_image_size(filename, spapr->rtas_blob, spapr->rtas_size) < 0) {
>>          error_report("Could not load LPAR rtas '%s'", filename);
> 
> Sorry to say that, but this patch is horrible!
> 
> 1) If the rtas blob ever gets bigger than 512 bytes, we will get
> "random" corruption of the RTAS code later when an NMI occurs since the
> mc log is blindly copied into the RTAS area later!
> ==> Please add an "assert(spapr->rtas_size < RTAS_ERRLOG_OFFSET)" at the
> beginning of your patch.

It is good to add an assert. Will include in next revision.

> 
> 2) Why resizing with TARGET_PAGE_ALIGN() ? In the very worst case, this
> would not change the size at all (if the rtas_size is already a multiple
> of PAGE_SIZE)
> ==> Please set the size to a proper value like
>  RTAS_ERRLOG_OFFSET + sizeof(struct rtas_mc_log)
> instead!

sure will add it.

Regards,
Aravinda

> 
>  Thomas
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-12  9:40     ` Thomas Huth
  2015-11-12 18:49       ` Aravinda Prasad
@ 2015-11-13  1:57       ` David Gibson
  2015-11-13  7:03         ` Thomas Huth
  1 sibling, 1 reply; 40+ messages in thread
From: David Gibson @ 2015-11-13  1:57 UTC (permalink / raw)
  To: Thomas Huth
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, sam.bobroff,
	Aravinda Prasad, paulus

[-- Attachment #1: Type: text/plain, Size: 3671 bytes --]

On Thu, Nov 12, 2015 at 10:40:11AM +0100, Thomas Huth wrote:
> On 12/11/15 09:09, Thomas Huth wrote:
> > On 11/11/15 18:16, Aravinda Prasad wrote:
> >> Memory error such as bit flips that cannot be corrected
> >> by hardware are passed on to the kernel for handling.
> >> If the memory address in error belongs to guest then
> >> guest kernel is responsible for taking suitable action.
> >> Patch [1] enhances KVM to exit guest with exit reason
> >> set to KVM_EXIT_NMI in such cases.
> >>
> >> This patch handles KVM_EXIT_NMI exit. If the guest OS
> >> has registered the machine check handling routine by
> >> calling "ibm,nmi-register", then the handler builds
> >> the error log and invokes the registered handler else
> >> invokes the handler at 0x200.
> >>
> >> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
> >>
> >> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
[snip]
> >> +        env->nip = 0x200;
> >> +        return 0;
> >> +    }
> >> +
> >> +    qemu_mutex_lock(&spapr->mc_in_progress);
> > 
> > Using a mutex here is definitely wrong. The kvm_arch_handle_exit() code
> > is run under the Big QEMU Lock™ (see qemu_mutex_lock_iothread() in
> > kvm_cpu_exec()),
> 
> In case you're looking for the calls, I just noticed that the
> qemu_mutex_lock_iothread() have recently been pushed into
> kvm_arch_handle_exit() itself.
> 
> > so if you would ever get one thread waiting for this
> > mutex here, it could never be unlocked again in rtas_ibm_nmi_interlock()
> > because the other code would wait forever to get the BQL ==> Deadlock.
> > 
> > I think if you want to be able to handle multiple NMIs at once, you
> > likely need something like an error log per CPU instead. And if an NMI
> > happens one CPU while there is already a NMI handler running on the very
> > same CPU, you could likely simply track this with an boolean variable
> > and put the CPU into checkstop if this happens?
> 
> Ok, I now had a look into the LoPAPR spec, and if I've got that right,
> you really have to serialize the NMIs in case they happen at multiple
> CPUs at the same time. So I guess the best thing you can do here is
> something like:
> 
>    while (spapr->mc_in_progress) {
>        /*
>         * There is already another NMI in progress, thus we need
>         * to yield here to wait until it has been finsihed
>         */
>        qemu_mutex_unlock_iothread();
>        usleep(10);
>        qemu_mutex_lock_iothread();
>    }
>    spapr->mc_in_progress = true;
> 
> Also LoPAPR talks about 'subsequent processors report "fatal error
> previously reported"', so maybe the other processors should report that
> condition in this case?
> And of course you've also got to check that the same CPU is not getting
> multiple NMIs before the interlock function has been called again.

You should be able to avoid the nasty usleep by using a pthread
condition variable.  So here you'd have

    while (spapr->mc_in_progress) {
        pthread_cond_wait(&mc_delivery_cond, &qemu_iothread_lock);
    }
    spapr->mc_in_progress = true;

Or.. there may be qemu wrappers around the pthread functions you
should be using.  Once delivery of a single MC is complete, you'd use
pthread_cond_signal() to wake up any additional ones.

pthread_cond_wait automatically drops the specified mutex internally,
so access to mc_in_progress itself is still protected by the iothread
mutex.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-12 18:23     ` Aravinda Prasad
@ 2015-11-13  1:58       ` David Gibson
  2015-11-13  4:53         ` Aravinda Prasad
  2015-11-19  1:56       ` Alexey Kardashevskiy
  1 sibling, 1 reply; 40+ messages in thread
From: David Gibson @ 2015-11-13  1:58 UTC (permalink / raw)
  To: Aravinda Prasad
  Cc: Thomas Huth, benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff

[-- Attachment #1: Type: text/plain, Size: 3446 bytes --]

On Thu, Nov 12, 2015 at 11:53:45PM +0530, Aravinda Prasad wrote:
> 
> 
> On Thursday 12 November 2015 01:39 PM, Thomas Huth wrote:
> > On 11/11/15 18:16, Aravinda Prasad wrote:
> >> Memory error such as bit flips that cannot be corrected
> >> by hardware are passed on to the kernel for handling.
> >> If the memory address in error belongs to guest then
> >> guest kernel is responsible for taking suitable action.
> >> Patch [1] enhances KVM to exit guest with exit reason
> >> set to KVM_EXIT_NMI in such cases.
> >>
> >> This patch handles KVM_EXIT_NMI exit. If the guest OS
> >> has registered the machine check handling routine by
> >> calling "ibm,nmi-register", then the handler builds
> >> the error log and invokes the registered handler else
> >> invokes the handler at 0x200.
> >>
> >> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
> >>
> >> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
> >> ---
> >>  target-ppc/kvm.c     |   69 +++++++++++++++++++++++++++++++++++++++++++
> >>  target-ppc/kvm_ppc.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >>  2 files changed, 150 insertions(+)
> >>
> >> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> >> index 110436d..e2e5170 100644
> >> --- a/target-ppc/kvm.c
> >> +++ b/target-ppc/kvm.c
> >> @@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
> >>          ret = 0;
> >>          break;
> >>  
> >> +    case KVM_EXIT_NMI:
> >> +        DPRINTF("handle NMI exception\n");
> >> +        ret = kvm_handle_nmi(cpu);
> >> +        break;
> >> +
> >>      default:
> >>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
> >>          ret = -1;
> >> @@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> >>  {
> >>      return data & 0xffff;
> >>  }
> >> +
> >> +int kvm_handle_nmi(PowerPCCPU *cpu)
> >> +{
> >> +    struct rtas_mc_log mc_log;
> >> +    CPUPPCState *env = &cpu->env;
> >> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> >> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> >> +
> >> +    cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
> >> +
> >> +    /* Properly set bits in MSR before we invoke the handler */
> >> +    env->msr = 0;
> >> +
> >> +    if (!(*pcc->interrupts_big_endian)(cpu)) {
> >> +        env->msr |= (1ULL << MSR_LE);
> >> +    }
> >> +
> >> +#ifdef TARGET_PPC64
> >> +    env->msr |= (1ULL << MSR_SF);
> >> +#endif
> >> +
> >> +    if (!spapr->guest_machine_check_addr) {
> >> +        /*
> >> +         * If OS has not registered with "ibm,nmi-register"
> >> +         * jump to 0x200
> >> +         */
> > 
> > Shouldn't you also check MSR_ME here first and enter checkstop when
> > machine checks are disabled?
> 
> Yes, MSR_ME should be checked first.
> 
> > Also I think you have to set up some more registers for machine check
> > interrupts, like SRR0 and SRR1?
> 
> SRRO and SRR1 of vcpu are properly set in KVM in kvmppc_interrupt_hv. I
> am not sure if any other registers need to be set.

DAR and DSISR are the obvious ones you need to consider, although I
suspect they're already set up correctly by the kernel, too.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-13  1:58       ` David Gibson
@ 2015-11-13  4:53         ` Aravinda Prasad
  2015-11-13  5:57           ` David Gibson
  0 siblings, 1 reply; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-13  4:53 UTC (permalink / raw)
  To: David Gibson
  Cc: Thomas Huth, benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff



On Friday 13 November 2015 07:28 AM, David Gibson wrote:
> On Thu, Nov 12, 2015 at 11:53:45PM +0530, Aravinda Prasad wrote:
>>
>>
>> On Thursday 12 November 2015 01:39 PM, Thomas Huth wrote:
>>> On 11/11/15 18:16, Aravinda Prasad wrote:
>>>> Memory error such as bit flips that cannot be corrected
>>>> by hardware are passed on to the kernel for handling.
>>>> If the memory address in error belongs to guest then
>>>> guest kernel is responsible for taking suitable action.
>>>> Patch [1] enhances KVM to exit guest with exit reason
>>>> set to KVM_EXIT_NMI in such cases.
>>>>
>>>> This patch handles KVM_EXIT_NMI exit. If the guest OS
>>>> has registered the machine check handling routine by
>>>> calling "ibm,nmi-register", then the handler builds
>>>> the error log and invokes the registered handler else
>>>> invokes the handler at 0x200.
>>>>
>>>> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
>>>>
>>>> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
>>>> ---
>>>>  target-ppc/kvm.c     |   69 +++++++++++++++++++++++++++++++++++++++++++
>>>>  target-ppc/kvm_ppc.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  2 files changed, 150 insertions(+)
>>>>
>>>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>>>> index 110436d..e2e5170 100644
>>>> --- a/target-ppc/kvm.c
>>>> +++ b/target-ppc/kvm.c
>>>> @@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>>>>          ret = 0;
>>>>          break;
>>>>  
>>>> +    case KVM_EXIT_NMI:
>>>> +        DPRINTF("handle NMI exception\n");
>>>> +        ret = kvm_handle_nmi(cpu);
>>>> +        break;
>>>> +
>>>>      default:
>>>>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>>>>          ret = -1;
>>>> @@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>>>>  {
>>>>      return data & 0xffff;
>>>>  }
>>>> +
>>>> +int kvm_handle_nmi(PowerPCCPU *cpu)
>>>> +{
>>>> +    struct rtas_mc_log mc_log;
>>>> +    CPUPPCState *env = &cpu->env;
>>>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>>>> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
>>>> +
>>>> +    cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
>>>> +
>>>> +    /* Properly set bits in MSR before we invoke the handler */
>>>> +    env->msr = 0;
>>>> +
>>>> +    if (!(*pcc->interrupts_big_endian)(cpu)) {
>>>> +        env->msr |= (1ULL << MSR_LE);
>>>> +    }
>>>> +
>>>> +#ifdef TARGET_PPC64
>>>> +    env->msr |= (1ULL << MSR_SF);
>>>> +#endif
>>>> +
>>>> +    if (!spapr->guest_machine_check_addr) {
>>>> +        /*
>>>> +         * If OS has not registered with "ibm,nmi-register"
>>>> +         * jump to 0x200
>>>> +         */
>>>
>>> Shouldn't you also check MSR_ME here first and enter checkstop when
>>> machine checks are disabled?
>>
>> Yes, MSR_ME should be checked first.
>>
>>> Also I think you have to set up some more registers for machine check
>>> interrupts, like SRR0 and SRR1?
>>
>> SRRO and SRR1 of vcpu are properly set in KVM in kvmppc_interrupt_hv. I
>> am not sure if any other registers need to be set.
> 
> DAR and DSISR are the obvious ones you need to consider, although I
> suspect they're already set up correctly by the kernel, too.

Yes, they are also setup properly by KVM.

Regards,
Aravinda

> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-13  4:53         ` Aravinda Prasad
@ 2015-11-13  5:57           ` David Gibson
  2015-11-13  6:27             ` Aravinda Prasad
  0 siblings, 1 reply; 40+ messages in thread
From: David Gibson @ 2015-11-13  5:57 UTC (permalink / raw)
  To: Aravinda Prasad
  Cc: Thomas Huth, benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff

[-- Attachment #1: Type: text/plain, Size: 3941 bytes --]

On Fri, Nov 13, 2015 at 10:23:20AM +0530, Aravinda Prasad wrote:
> 
> 
> On Friday 13 November 2015 07:28 AM, David Gibson wrote:
> > On Thu, Nov 12, 2015 at 11:53:45PM +0530, Aravinda Prasad wrote:
> >>
> >>
> >> On Thursday 12 November 2015 01:39 PM, Thomas Huth wrote:
> >>> On 11/11/15 18:16, Aravinda Prasad wrote:
> >>>> Memory error such as bit flips that cannot be corrected
> >>>> by hardware are passed on to the kernel for handling.
> >>>> If the memory address in error belongs to guest then
> >>>> guest kernel is responsible for taking suitable action.
> >>>> Patch [1] enhances KVM to exit guest with exit reason
> >>>> set to KVM_EXIT_NMI in such cases.
> >>>>
> >>>> This patch handles KVM_EXIT_NMI exit. If the guest OS
> >>>> has registered the machine check handling routine by
> >>>> calling "ibm,nmi-register", then the handler builds
> >>>> the error log and invokes the registered handler else
> >>>> invokes the handler at 0x200.
> >>>>
> >>>> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
> >>>>
> >>>> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
> >>>> ---
> >>>>  target-ppc/kvm.c     |   69 +++++++++++++++++++++++++++++++++++++++++++
> >>>>  target-ppc/kvm_ppc.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>  2 files changed, 150 insertions(+)
> >>>>
> >>>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> >>>> index 110436d..e2e5170 100644
> >>>> --- a/target-ppc/kvm.c
> >>>> +++ b/target-ppc/kvm.c
> >>>> @@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
> >>>>          ret = 0;
> >>>>          break;
> >>>>  
> >>>> +    case KVM_EXIT_NMI:
> >>>> +        DPRINTF("handle NMI exception\n");
> >>>> +        ret = kvm_handle_nmi(cpu);
> >>>> +        break;
> >>>> +
> >>>>      default:
> >>>>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
> >>>>          ret = -1;
> >>>> @@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> >>>>  {
> >>>>      return data & 0xffff;
> >>>>  }
> >>>> +
> >>>> +int kvm_handle_nmi(PowerPCCPU *cpu)
> >>>> +{
> >>>> +    struct rtas_mc_log mc_log;
> >>>> +    CPUPPCState *env = &cpu->env;
> >>>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> >>>> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> >>>> +
> >>>> +    cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
> >>>> +
> >>>> +    /* Properly set bits in MSR before we invoke the handler */
> >>>> +    env->msr = 0;
> >>>> +
> >>>> +    if (!(*pcc->interrupts_big_endian)(cpu)) {
> >>>> +        env->msr |= (1ULL << MSR_LE);
> >>>> +    }
> >>>> +
> >>>> +#ifdef TARGET_PPC64
> >>>> +    env->msr |= (1ULL << MSR_SF);
> >>>> +#endif
> >>>> +
> >>>> +    if (!spapr->guest_machine_check_addr) {
> >>>> +        /*
> >>>> +         * If OS has not registered with "ibm,nmi-register"
> >>>> +         * jump to 0x200
> >>>> +         */
> >>>
> >>> Shouldn't you also check MSR_ME here first and enter checkstop when
> >>> machine checks are disabled?
> >>
> >> Yes, MSR_ME should be checked first.
> >>
> >>> Also I think you have to set up some more registers for machine check
> >>> interrupts, like SRR0 and SRR1?
> >>
> >> SRRO and SRR1 of vcpu are properly set in KVM in kvmppc_interrupt_hv. I
> >> am not sure if any other registers need to be set.
> > 
> > DAR and DSISR are the obvious ones you need to consider, although I
> > suspect they're already set up correctly by the kernel, too.
> 
> Yes, they are also setup properly by KVM.

Ok, good.  Might be worth throwing a comment in the qemu code noting
that the kernel has already set up that state properly.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-13  5:57           ` David Gibson
@ 2015-11-13  6:27             ` Aravinda Prasad
  0 siblings, 0 replies; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-13  6:27 UTC (permalink / raw)
  To: David Gibson
  Cc: Thomas Huth, benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff



On Friday 13 November 2015 11:27 AM, David Gibson wrote:
> On Fri, Nov 13, 2015 at 10:23:20AM +0530, Aravinda Prasad wrote:
>>
>>
>> On Friday 13 November 2015 07:28 AM, David Gibson wrote:
>>> On Thu, Nov 12, 2015 at 11:53:45PM +0530, Aravinda Prasad wrote:
>>>>
>>>>
>>>> On Thursday 12 November 2015 01:39 PM, Thomas Huth wrote:
>>>>> On 11/11/15 18:16, Aravinda Prasad wrote:
>>>>>> Memory error such as bit flips that cannot be corrected
>>>>>> by hardware are passed on to the kernel for handling.
>>>>>> If the memory address in error belongs to guest then
>>>>>> guest kernel is responsible for taking suitable action.
>>>>>> Patch [1] enhances KVM to exit guest with exit reason
>>>>>> set to KVM_EXIT_NMI in such cases.
>>>>>>
>>>>>> This patch handles KVM_EXIT_NMI exit. If the guest OS
>>>>>> has registered the machine check handling routine by
>>>>>> calling "ibm,nmi-register", then the handler builds
>>>>>> the error log and invokes the registered handler else
>>>>>> invokes the handler at 0x200.
>>>>>>
>>>>>> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
>>>>>>
>>>>>> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
>>>>>> ---
>>>>>>  target-ppc/kvm.c     |   69 +++++++++++++++++++++++++++++++++++++++++++
>>>>>>  target-ppc/kvm_ppc.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>  2 files changed, 150 insertions(+)
>>>>>>
>>>>>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>>>>>> index 110436d..e2e5170 100644
>>>>>> --- a/target-ppc/kvm.c
>>>>>> +++ b/target-ppc/kvm.c
>>>>>> @@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>>>>>>          ret = 0;
>>>>>>          break;
>>>>>>  
>>>>>> +    case KVM_EXIT_NMI:
>>>>>> +        DPRINTF("handle NMI exception\n");
>>>>>> +        ret = kvm_handle_nmi(cpu);
>>>>>> +        break;
>>>>>> +
>>>>>>      default:
>>>>>>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>>>>>>          ret = -1;
>>>>>> @@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>>>>>>  {
>>>>>>      return data & 0xffff;
>>>>>>  }
>>>>>> +
>>>>>> +int kvm_handle_nmi(PowerPCCPU *cpu)
>>>>>> +{
>>>>>> +    struct rtas_mc_log mc_log;
>>>>>> +    CPUPPCState *env = &cpu->env;
>>>>>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>>>>>> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
>>>>>> +
>>>>>> +    cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
>>>>>> +
>>>>>> +    /* Properly set bits in MSR before we invoke the handler */
>>>>>> +    env->msr = 0;
>>>>>> +
>>>>>> +    if (!(*pcc->interrupts_big_endian)(cpu)) {
>>>>>> +        env->msr |= (1ULL << MSR_LE);
>>>>>> +    }
>>>>>> +
>>>>>> +#ifdef TARGET_PPC64
>>>>>> +    env->msr |= (1ULL << MSR_SF);
>>>>>> +#endif
>>>>>> +
>>>>>> +    if (!spapr->guest_machine_check_addr) {
>>>>>> +        /*
>>>>>> +         * If OS has not registered with "ibm,nmi-register"
>>>>>> +         * jump to 0x200
>>>>>> +         */
>>>>>
>>>>> Shouldn't you also check MSR_ME here first and enter checkstop when
>>>>> machine checks are disabled?
>>>>
>>>> Yes, MSR_ME should be checked first.
>>>>
>>>>> Also I think you have to set up some more registers for machine check
>>>>> interrupts, like SRR0 and SRR1?
>>>>
>>>> SRRO and SRR1 of vcpu are properly set in KVM in kvmppc_interrupt_hv. I
>>>> am not sure if any other registers need to be set.
>>>
>>> DAR and DSISR are the obvious ones you need to consider, although I
>>> suspect they're already set up correctly by the kernel, too.
>>
>> Yes, they are also setup properly by KVM.
> 
> Ok, good.  Might be worth throwing a comment in the qemu code noting
> that the kernel has already set up that state properly.

sure.

> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-13  1:57       ` David Gibson
@ 2015-11-13  7:03         ` Thomas Huth
  2015-11-16  5:45           ` David Gibson
  0 siblings, 1 reply; 40+ messages in thread
From: Thomas Huth @ 2015-11-13  7:03 UTC (permalink / raw)
  To: David Gibson
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, sam.bobroff,
	Aravinda Prasad, paulus

[-- Attachment #1: Type: text/plain, Size: 1923 bytes --]

On 13/11/15 02:57, David Gibson wrote:
> On Thu, Nov 12, 2015 at 10:40:11AM +0100, Thomas Huth wrote:
>> On 12/11/15 09:09, Thomas Huth wrote:
>>> On 11/11/15 18:16, Aravinda Prasad wrote:
[...]
>>>> +    qemu_mutex_lock(&spapr->mc_in_progress);
>>>
>>> Using a mutex here is definitely wrong. The kvm_arch_handle_exit() code
>>> is run under the Big QEMU Lock™ (see qemu_mutex_lock_iothread() in
>>> kvm_cpu_exec()),
[...]
>> Ok, I now had a look into the LoPAPR spec, and if I've got that right,
>> you really have to serialize the NMIs in case they happen at multiple
>> CPUs at the same time. So I guess the best thing you can do here is
>> something like:
>>
>>    while (spapr->mc_in_progress) {
>>        /*
>>         * There is already another NMI in progress, thus we need
>>         * to yield here to wait until it has been finsihed
>>         */
>>        qemu_mutex_unlock_iothread();
>>        usleep(10);
>>        qemu_mutex_lock_iothread();
>>    }
>>    spapr->mc_in_progress = true;
[...]
> You should be able to avoid the nasty usleep by using a pthread
> condition variable.  So here you'd have
> 
>     while (spapr->mc_in_progress) {
>         pthread_cond_wait(&mc_delivery_cond, &qemu_iothread_lock);
>     }
>     spapr->mc_in_progress = true;
> 
> Or.. there may be qemu wrappers around the pthread functions you
> should be using.  Once delivery of a single MC is complete, you'd use
> pthread_cond_signal() to wake up any additional ones.
> 
> pthread_cond_wait automatically drops the specified mutex internally,
> so access to mc_in_progress itself is still protected by the iothread
> mutex.

That's a nice one, didn't know that function yet! And actually, there is
already a QEMU wrapper function: qemu_cond_wait() - so this should be
used instead since threads on Windows are working differently in QEMU as
far as I know.

 Thomas




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-12  8:09   ` Thomas Huth
  2015-11-12  9:40     ` Thomas Huth
  2015-11-12 18:23     ` Aravinda Prasad
@ 2015-11-16  3:50     ` Paul Mackerras
  2015-11-16  9:01       ` Thomas Huth
  2 siblings, 1 reply; 40+ messages in thread
From: Paul Mackerras @ 2015-11-16  3:50 UTC (permalink / raw)
  To: Thomas Huth
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, Aravinda Prasad,
	sam.bobroff, david

On Thu, Nov 12, 2015 at 09:09:59AM +0100, Thomas Huth wrote:
> 
> Shouldn't you also check MSR_ME here first and enter checkstop when
> machine checks are disabled?

MSR_ME is a hypervisor resource and is not able to be controlled by HV
KVM guests, or in fact by the OS running on the pseries machine target
regardless of how it's accelerated or emulated.

What you say would only apply if we had a powernv machine target and
we were emulating the whole system, and in that case we wouldn't be
using any hcalls, and we wouldn't be doing FWNMI (or at least not at
this level).

So the answer is no, MSR_ME will always be set when running in a
guest, and we don't ever need to checkstop the virtual machine.

> Also I think you have to set up some more registers for machine check
> interrupts, like SRR0 and SRR1?

They are already set up by the process of taking the machine check in
the first place.

Paul.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-13  7:03         ` Thomas Huth
@ 2015-11-16  5:45           ` David Gibson
  0 siblings, 0 replies; 40+ messages in thread
From: David Gibson @ 2015-11-16  5:45 UTC (permalink / raw)
  To: Thomas Huth
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, sam.bobroff,
	Aravinda Prasad, paulus

[-- Attachment #1: Type: text/plain, Size: 2362 bytes --]

On Fri, Nov 13, 2015 at 08:03:51AM +0100, Thomas Huth wrote:
> On 13/11/15 02:57, David Gibson wrote:
> > On Thu, Nov 12, 2015 at 10:40:11AM +0100, Thomas Huth wrote:
> >> On 12/11/15 09:09, Thomas Huth wrote:
> >>> On 11/11/15 18:16, Aravinda Prasad wrote:
> [...]
> >>>> +    qemu_mutex_lock(&spapr->mc_in_progress);
> >>>
> >>> Using a mutex here is definitely wrong. The kvm_arch_handle_exit() code
> >>> is run under the Big QEMU Lock™ (see qemu_mutex_lock_iothread() in
> >>> kvm_cpu_exec()),
> [...]
> >> Ok, I now had a look into the LoPAPR spec, and if I've got that right,
> >> you really have to serialize the NMIs in case they happen at multiple
> >> CPUs at the same time. So I guess the best thing you can do here is
> >> something like:
> >>
> >>    while (spapr->mc_in_progress) {
> >>        /*
> >>         * There is already another NMI in progress, thus we need
> >>         * to yield here to wait until it has been finsihed
> >>         */
> >>        qemu_mutex_unlock_iothread();
> >>        usleep(10);
> >>        qemu_mutex_lock_iothread();
> >>    }
> >>    spapr->mc_in_progress = true;
> [...]
> > You should be able to avoid the nasty usleep by using a pthread
> > condition variable.  So here you'd have
> > 
> >     while (spapr->mc_in_progress) {
> >         pthread_cond_wait(&mc_delivery_cond, &qemu_iothread_lock);
> >     }
> >     spapr->mc_in_progress = true;
> > 
> > Or.. there may be qemu wrappers around the pthread functions you
> > should be using.  Once delivery of a single MC is complete, you'd use
> > pthread_cond_signal() to wake up any additional ones.
> > 
> > pthread_cond_wait automatically drops the specified mutex internally,
> > so access to mc_in_progress itself is still protected by the iothread
> > mutex.
> 
> That's a nice one, didn't know that function yet! And actually, there is
> already a QEMU wrapper function: qemu_cond_wait() - so this should be
> used instead since threads on Windows are working differently in QEMU as
> far as I know.

Ah, good point, you definitely need to use the qemu wrapper in order
to get proper windows support.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-12 18:49       ` Aravinda Prasad
@ 2015-11-16  7:52         ` Thomas Huth
  2015-11-16 10:07           ` Aravinda Prasad
  0 siblings, 1 reply; 40+ messages in thread
From: Thomas Huth @ 2015-11-16  7:52 UTC (permalink / raw)
  To: Aravinda Prasad
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff, david

On 12/11/15 19:49, Aravinda Prasad wrote:
> 
> On Thursday 12 November 2015 03:10 PM, Thomas Huth wrote:
...
>> Also LoPAPR talks about 'subsequent processors report "fatal error
>> previously reported"', so maybe the other processors should report that
>> condition in this case?
> 
> I feel guest kernel is responsible for that or does that mean that qemu
> should report the same error, which first processor encountered, for
> subsequent processors? In that case what if the error encountered by
> first processor was recovered.

I simply refered to this text in LoPAPR:

 Multiple processors of the same OS image may experi-
 ence fatal events at, or about, the same time. The first processor
 to enter the machine check handling firmware reports
 the fatal error. Subsequent processors serialize waiting for the
 first processor to issue the ibm,nmi-interlock call. These
 subsequent processors report "fatal error previously reported".

Is there code in the host kernel already that takes care of this (I
haven't checked)? If so, how does the host kernel know that the event
happened "at or about the same time" since you're checking at the QEMU
side for the mutex condition?

>> And of course you've also got to check that the same CPU is not getting
>> multiple NMIs before the interlock function has been called again.
> 
> I think it is good to check that. However, shouldn't the guest enable ME
> until it calls interlock function?

First, the hypervisor should never trust the guest to do the right
things. Second, LoPAPR says "the OS permanently relinquishes to firmware
the Machine State Register's Machine Check Enable bit", and Paul also
said something similar in another mail to this thread, so I think you
really have to check this in QEMU instead.

 Thomas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-16  3:50     ` Paul Mackerras
@ 2015-11-16  9:01       ` Thomas Huth
  2015-11-16 11:29         ` Aravinda Prasad
  2015-11-16 21:46         ` Paul Mackerras
  0 siblings, 2 replies; 40+ messages in thread
From: Thomas Huth @ 2015-11-16  9:01 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, Aravinda Prasad,
	sam.bobroff, david

On 16/11/15 04:50, Paul Mackerras wrote:
> On Thu, Nov 12, 2015 at 09:09:59AM +0100, Thomas Huth wrote:
>>
>> Shouldn't you also check MSR_ME here first and enter checkstop when
>> machine checks are disabled?
> 
> MSR_ME is a hypervisor resource and is not able to be controlled by HV
> KVM guests, or in fact by the OS running on the pseries machine target
> regardless of how it's accelerated or emulated.
> 
> What you say would only apply if we had a powernv machine target and
> we were emulating the whole system, and in that case we wouldn't be
> using any hcalls, and we wouldn't be doing FWNMI (or at least not at
> this level).
> 
> So the answer is no, MSR_ME will always be set when running in a
> guest, and we don't ever need to checkstop the virtual machine.

Good point, I missed that sentence about the hypervisor resource in the
PowerISA. So QEMU does not have to check this bit here.

But out of curiosity: What happens if a guest disables the ME bit? Is
this checked somewhere or simply ignored?

 Thomas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-16  7:52         ` Thomas Huth
@ 2015-11-16 10:07           ` Aravinda Prasad
  2015-11-16 10:41             ` Thomas Huth
  0 siblings, 1 reply; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-16 10:07 UTC (permalink / raw)
  To: Thomas Huth
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff, david



On Monday 16 November 2015 01:22 PM, Thomas Huth wrote:
> On 12/11/15 19:49, Aravinda Prasad wrote:
>>
>> On Thursday 12 November 2015 03:10 PM, Thomas Huth wrote:
> ...
>>> Also LoPAPR talks about 'subsequent processors report "fatal error
>>> previously reported"', so maybe the other processors should report that
>>> condition in this case?
>>
>> I feel guest kernel is responsible for that or does that mean that qemu
>> should report the same error, which first processor encountered, for
>> subsequent processors? In that case what if the error encountered by
>> first processor was recovered.
> 
> I simply refered to this text in LoPAPR:
> 
>  Multiple processors of the same OS image may experi-
>  ence fatal events at, or about, the same time. The first processor
>  to enter the machine check handling firmware reports
>  the fatal error. Subsequent processors serialize waiting for the
>  first processor to issue the ibm,nmi-interlock call. These
>  subsequent processors report "fatal error previously reported".

Yes, I asked this because I am not clear what "fatal error previously
reported" means as described in PAPR.

> 
> Is there code in the host kernel already that takes care of this (I
> haven't checked)? If so, how does the host kernel know that the event
> happened "at or about the same time" since you're checking at the QEMU
> side for the mutex condition?

I don't think the host kernel takes care of this; it simply forwards
such errors to QEMU via NMI exit. I feel the time referred by "at or
about the same time" is the duration between the registered machine
check handler is invoked and the corresponding interlock call is issued
by guest, which QEMU knows and is protected by a mutex.

> 
>>> And of course you've also got to check that the same CPU is not getting
>>> multiple NMIs before the interlock function has been called again.
>>
>> I think it is good to check that. However, shouldn't the guest enable ME
>> until it calls interlock function?
> 
> First, the hypervisor should never trust the guest to do the right
> things. Second, LoPAPR says "the OS permanently relinquishes to firmware
> the Machine State Register's Machine Check Enable bit", and Paul also
> said something similar in another mail to this thread, so I think you
> really have to check this in QEMU instead.

Hmm. ok. Since ME is always set when running in guest (assuming guest is
not disabling it), we cannot check ME bit to figure out whether the same
CPU is getting UEs before interlock is called. One way is to record the
CPU ID upon such error and check before invoking registered machine
check handler whether that CPU has a pending interlock call. Terminate
the guest if there is a pending interlock call for that CPU rather than
causing the guest to trigger recursive machine check errors.

Regards,
Aravinda


> 
>  Thomas
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-16 10:07           ` Aravinda Prasad
@ 2015-11-16 10:41             ` Thomas Huth
  2015-11-16 11:57               ` Aravinda Prasad
  0 siblings, 1 reply; 40+ messages in thread
From: Thomas Huth @ 2015-11-16 10:41 UTC (permalink / raw)
  To: Aravinda Prasad
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff, david

On 16/11/15 11:07, Aravinda Prasad wrote:
> 
> 
> On Monday 16 November 2015 01:22 PM, Thomas Huth wrote:
>> On 12/11/15 19:49, Aravinda Prasad wrote:
>>>
>>> On Thursday 12 November 2015 03:10 PM, Thomas Huth wrote:
>> ...
>>>> Also LoPAPR talks about 'subsequent processors report "fatal error
>>>> previously reported"', so maybe the other processors should report that
>>>> condition in this case?
>>>
>>> I feel guest kernel is responsible for that or does that mean that qemu
>>> should report the same error, which first processor encountered, for
>>> subsequent processors? In that case what if the error encountered by
>>> first processor was recovered.
>>
>> I simply refered to this text in LoPAPR:
>>
>>  Multiple processors of the same OS image may experi-
>>  ence fatal events at, or about, the same time. The first processor
>>  to enter the machine check handling firmware reports
>>  the fatal error. Subsequent processors serialize waiting for the
>>  first processor to issue the ibm,nmi-interlock call. These
>>  subsequent processors report "fatal error previously reported".
> 
> Yes, I asked this because I am not clear what "fatal error previously
> reported" means as described in PAPR.

Looking at table "Table 137. RTAS Event Return Format (Fixed Part)" in
LoPAPR, there is a "ALREADY_REPORTED" severity - I assume this is what
is meant by the cited paragraph?

>> Is there code in the host kernel already that takes care of this (I
>> haven't checked)? If so, how does the host kernel know that the event
>> happened "at or about the same time" since you're checking at the QEMU
>> side for the mutex condition?
> 
> I don't think the host kernel takes care of this; it simply forwards
> such errors to QEMU via NMI exit. I feel the time referred by "at or
> about the same time" is the duration between the registered machine
> check handler is invoked and the corresponding interlock call is issued
> by guest, which QEMU knows and is protected by a mutex.

I agree, that makes sense.

>>>> And of course you've also got to check that the same CPU is not getting
>>>> multiple NMIs before the interlock function has been called again.
>>>
>>> I think it is good to check that. However, shouldn't the guest enable ME
>>> until it calls interlock function?
>>
>> First, the hypervisor should never trust the guest to do the right
>> things. Second, LoPAPR says "the OS permanently relinquishes to firmware
>> the Machine State Register's Machine Check Enable bit", and Paul also
>> said something similar in another mail to this thread, so I think you
>> really have to check this in QEMU instead.
> 
> Hmm. ok. Since ME is always set when running in guest (assuming guest is
> not disabling it), we cannot check ME bit to figure out whether the same
> CPU is getting UEs before interlock is called. One way is to record the
> CPU ID upon such error and check before invoking registered machine
> check handler whether that CPU has a pending interlock call. Terminate
> the guest if there is a pending interlock call for that CPU rather than
> causing the guest to trigger recursive machine check errors.

Do we have some kind of checkstop state emulation in QEMU (sorry, I
haven't checked yet)? If yes, it might be nicer to use that and set the
guest state to PANIC instead of exiting QEMU directly - i.e. to do
something similar like the guest_panicked() function in
target-s390x/kvm.c. That way the management layer (libvirt) can decide
on its own whether to terminate the guest, reboot or keep it in the
crashed state for further analysis.

 Thomas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-16  9:01       ` Thomas Huth
@ 2015-11-16 11:29         ` Aravinda Prasad
  2015-11-16 21:46         ` Paul Mackerras
  1 sibling, 0 replies; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-16 11:29 UTC (permalink / raw)
  To: Thomas Huth
  Cc: benh, aik, agraf, qemu-devel, Paul Mackerras, qemu-ppc,
	sam.bobroff, david



On Monday 16 November 2015 02:31 PM, Thomas Huth wrote:
> On 16/11/15 04:50, Paul Mackerras wrote:
>> On Thu, Nov 12, 2015 at 09:09:59AM +0100, Thomas Huth wrote:
>>>
>>> Shouldn't you also check MSR_ME here first and enter checkstop when
>>> machine checks are disabled?
>>
>> MSR_ME is a hypervisor resource and is not able to be controlled by HV
>> KVM guests, or in fact by the OS running on the pseries machine target
>> regardless of how it's accelerated or emulated.
>>
>> What you say would only apply if we had a powernv machine target and
>> we were emulating the whole system, and in that case we wouldn't be
>> using any hcalls, and we wouldn't be doing FWNMI (or at least not at
>> this level).
>>
>> So the answer is no, MSR_ME will always be set when running in a
>> guest, and we don't ever need to checkstop the virtual machine.
> 
> Good point, I missed that sentence about the hypervisor resource in the
> PowerISA. So QEMU does not have to check this bit here.
> 
> But out of curiosity: What happens if a guest disables the ME bit? Is
> this checked somewhere or simply ignored?

If a guest disables the ME bit (and may be does not set it again) and
should a machine check happen in guest address space then the entire
system checkstops? However, I am not sure if the guest is allowed to
disable ME bit.

Regards,
Aravinda

> 
>  Thomas
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-16 10:41             ` Thomas Huth
@ 2015-11-16 11:57               ` Aravinda Prasad
  0 siblings, 0 replies; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-16 11:57 UTC (permalink / raw)
  To: Thomas Huth
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, paulus, sam.bobroff, david



On Monday 16 November 2015 04:11 PM, Thomas Huth wrote:
> On 16/11/15 11:07, Aravinda Prasad wrote:
>>
>>
>> On Monday 16 November 2015 01:22 PM, Thomas Huth wrote:
>>> On 12/11/15 19:49, Aravinda Prasad wrote:
>>>>
>>>> On Thursday 12 November 2015 03:10 PM, Thomas Huth wrote:
>>> ...
>>>>> Also LoPAPR talks about 'subsequent processors report "fatal error
>>>>> previously reported"', so maybe the other processors should report that
>>>>> condition in this case?
>>>>
>>>> I feel guest kernel is responsible for that or does that mean that qemu
>>>> should report the same error, which first processor encountered, for
>>>> subsequent processors? In that case what if the error encountered by
>>>> first processor was recovered.
>>>
>>> I simply refered to this text in LoPAPR:
>>>
>>>  Multiple processors of the same OS image may experi-
>>>  ence fatal events at, or about, the same time. The first processor
>>>  to enter the machine check handling firmware reports
>>>  the fatal error. Subsequent processors serialize waiting for the
>>>  first processor to issue the ibm,nmi-interlock call. These
>>>  subsequent processors report "fatal error previously reported".
>>
>> Yes, I asked this because I am not clear what "fatal error previously
>> reported" means as described in PAPR.
> 
> Looking at table "Table 137. RTAS Event Return Format (Fixed Part)" in
> LoPAPR, there is a "ALREADY_REPORTED" severity - I assume this is what
> is meant by the cited paragraph?

But QEMU cannot identify whether an error is fatal or not as the
recovery is done inside guest kernel. Also, I am not sure if
"ALREADY_REPORTED" should be set based on the machine check address
i.e., should be set if a machine check is triggered on the same address
as before.

It is difficult to detect whether machine check is triggered on the same
address from DAR as the address in DAR could be same but could belong to
a different user process inside guest backed by a different physical memory.

> 
>>> Is there code in the host kernel already that takes care of this (I
>>> haven't checked)? If so, how does the host kernel know that the event
>>> happened "at or about the same time" since you're checking at the QEMU
>>> side for the mutex condition?
>>
>> I don't think the host kernel takes care of this; it simply forwards
>> such errors to QEMU via NMI exit. I feel the time referred by "at or
>> about the same time" is the duration between the registered machine
>> check handler is invoked and the corresponding interlock call is issued
>> by guest, which QEMU knows and is protected by a mutex.
> 
> I agree, that makes sense.
> 
>>>>> And of course you've also got to check that the same CPU is not getting
>>>>> multiple NMIs before the interlock function has been called again.
>>>>
>>>> I think it is good to check that. However, shouldn't the guest enable ME
>>>> until it calls interlock function?
>>>
>>> First, the hypervisor should never trust the guest to do the right
>>> things. Second, LoPAPR says "the OS permanently relinquishes to firmware
>>> the Machine State Register's Machine Check Enable bit", and Paul also
>>> said something similar in another mail to this thread, so I think you
>>> really have to check this in QEMU instead.
>>
>> Hmm. ok. Since ME is always set when running in guest (assuming guest is
>> not disabling it), we cannot check ME bit to figure out whether the same
>> CPU is getting UEs before interlock is called. One way is to record the
>> CPU ID upon such error and check before invoking registered machine
>> check handler whether that CPU has a pending interlock call. Terminate
>> the guest if there is a pending interlock call for that CPU rather than
>> causing the guest to trigger recursive machine check errors.
> 
> Do we have some kind of checkstop state emulation in QEMU (sorry, I
> haven't checked yet)? If yes, it might be nicer to use that and set the
> guest state to PANIC instead of exiting QEMU directly - i.e. to do
> something similar like the guest_panicked() function in
> target-s390x/kvm.c. That way the management layer (libvirt) can decide
> on its own whether to terminate the guest, reboot or keep it in the
> crashed state for further analysis.

Even I am not sure about checkstop state emulation. I will check it out.

Regards,
Aravinda

> 
>  Thomas
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-16  9:01       ` Thomas Huth
  2015-11-16 11:29         ` Aravinda Prasad
@ 2015-11-16 21:46         ` Paul Mackerras
  1 sibling, 0 replies; 40+ messages in thread
From: Paul Mackerras @ 2015-11-16 21:46 UTC (permalink / raw)
  To: Thomas Huth
  Cc: benh, aik, agraf, qemu-devel, qemu-ppc, Aravinda Prasad,
	sam.bobroff, david

On Mon, Nov 16, 2015 at 10:01:25AM +0100, Thomas Huth wrote:
> On 16/11/15 04:50, Paul Mackerras wrote:
> > On Thu, Nov 12, 2015 at 09:09:59AM +0100, Thomas Huth wrote:
> >>
> >> Shouldn't you also check MSR_ME here first and enter checkstop when
> >> machine checks are disabled?
> > 
> > MSR_ME is a hypervisor resource and is not able to be controlled by HV
> > KVM guests, or in fact by the OS running on the pseries machine target
> > regardless of how it's accelerated or emulated.
> > 
> > What you say would only apply if we had a powernv machine target and
> > we were emulating the whole system, and in that case we wouldn't be
> > using any hcalls, and we wouldn't be doing FWNMI (or at least not at
> > this level).
> > 
> > So the answer is no, MSR_ME will always be set when running in a
> > guest, and we don't ever need to checkstop the virtual machine.
> 
> Good point, I missed that sentence about the hypervisor resource in the
> PowerISA. So QEMU does not have to check this bit here.
> 
> But out of curiosity: What happens if a guest disables the ME bit? Is
> this checked somewhere or simply ignored?

The guest can't turn off the ME bit.  Any attempt to do so is ignored.
See the descriptions of rfid and mtmsrd in the architecture spec.

Paul.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-12 18:23     ` Aravinda Prasad
  2015-11-13  1:58       ` David Gibson
@ 2015-11-19  1:56       ` Alexey Kardashevskiy
  2015-11-19 16:02         ` Aravinda Prasad
  1 sibling, 1 reply; 40+ messages in thread
From: Alexey Kardashevskiy @ 2015-11-19  1:56 UTC (permalink / raw)
  To: Aravinda Prasad, Thomas Huth
  Cc: benh, agraf, qemu-devel, paulus, qemu-ppc, sam.bobroff, david

On 11/13/2015 05:23 AM, Aravinda Prasad wrote:

>>> +
>>> +/*
>>> + * Currently KVM only passes on the uncorrected machine
>>> + * check memory error to guest. Other machine check errors
>>> + * such as SLB multi-hit and TLB multi-hit are recovered
>>> + * in KVM and are not passed on to guest.
>>> + *
>>> + * DSISR Bit for uncorrected machine check error. Based
>>> + * on arch/powerpc/include/asm/mce.h
>>> + */
>>> +#define PPC_BIT(bit)                (0x8000000000000000ULL >> bit)
>>> +#define P7_DSISR_MC_UE              (PPC_BIT(48))  /* P8 too */
>>> +
>>> +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */
>>> +struct rtas_error_log {
>>> +    /* Byte 0 */
>>> +    uint8_t     byte0;          /* Architectural version */
>>> +
>>> +    /* Byte 1 */
>>> +    uint8_t     byte1;
>>> +    /* XXXXXXXX
>>> +     * XXX      3: Severity level of error
>>> +     *    XX    2: Degree of recovery
>>> +     *      X   1: Extended log present?
>>> +     *       XX 2: Reserved
>>> +     */
>>> +
>>> +    /* Byte 2 */
>>> +    uint8_t     byte2;
>>> +    /* XXXXXXXX
>>> +     * XXXX     4: Initiator of event
>>> +     *     XXXX 4: Target of failed operation
>>> +     */
>>> +    uint8_t     byte3;          /* General event or error*/
>>> +    __be32      extended_log_length;    /* length in bytes */
>>> +    unsigned char   buffer[1];      /* Start of extended log */
>>> +                                /* Variable length.      */
>>> +};
>>> +
>>> +/*
>>> + * Data format in RTAS-Blob
>>> + *
>>> + * This structure contains error information related to Machine
>>> + * Check exception. This is filled up and copied to rtas-blob
>>> + * upon machine check exception. The address of rtas-blob is
>>> + * passed on to OS registered machine check notification
>>> + * routines upon machine check exception
>>> + */
>>> +struct rtas_mc_log {
>>> +    target_ulong r3;
>>> +    struct rtas_error_log err_log;
>>> +};
>>> +
>>
>> QEMU coding style is normally to use CamelCase for type names, see
>> chapter 3 in CODING_STYLE.
>
> Will take care in the next revision.

Just to make it clear - the CamelCase comment is valid for the new 
rtas_mc_log type but not for rtas_error_log which is a copy of kernel's
arch/powerpc/include/asm/rtas.h




-- 
Alexey

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
  2015-11-19  1:56       ` Alexey Kardashevskiy
@ 2015-11-19 16:02         ` Aravinda Prasad
  0 siblings, 0 replies; 40+ messages in thread
From: Aravinda Prasad @ 2015-11-19 16:02 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Thomas Huth, benh, agraf, qemu-devel, paulus, qemu-ppc,
	sam.bobroff, david



On Thursday 19 November 2015 07:26 AM, Alexey Kardashevskiy wrote:
> On 11/13/2015 05:23 AM, Aravinda Prasad wrote:
> 
>>>> +
>>>> +/*
>>>> + * Currently KVM only passes on the uncorrected machine
>>>> + * check memory error to guest. Other machine check errors
>>>> + * such as SLB multi-hit and TLB multi-hit are recovered
>>>> + * in KVM and are not passed on to guest.
>>>> + *
>>>> + * DSISR Bit for uncorrected machine check error. Based
>>>> + * on arch/powerpc/include/asm/mce.h
>>>> + */
>>>> +#define PPC_BIT(bit)                (0x8000000000000000ULL >> bit)
>>>> +#define P7_DSISR_MC_UE              (PPC_BIT(48))  /* P8 too */
>>>> +
>>>> +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */
>>>> +struct rtas_error_log {
>>>> +    /* Byte 0 */
>>>> +    uint8_t     byte0;          /* Architectural version */
>>>> +
>>>> +    /* Byte 1 */
>>>> +    uint8_t     byte1;
>>>> +    /* XXXXXXXX
>>>> +     * XXX      3: Severity level of error
>>>> +     *    XX    2: Degree of recovery
>>>> +     *      X   1: Extended log present?
>>>> +     *       XX 2: Reserved
>>>> +     */
>>>> +
>>>> +    /* Byte 2 */
>>>> +    uint8_t     byte2;
>>>> +    /* XXXXXXXX
>>>> +     * XXXX     4: Initiator of event
>>>> +     *     XXXX 4: Target of failed operation
>>>> +     */
>>>> +    uint8_t     byte3;          /* General event or error*/
>>>> +    __be32      extended_log_length;    /* length in bytes */
>>>> +    unsigned char   buffer[1];      /* Start of extended log */
>>>> +                                /* Variable length.      */
>>>> +};
>>>> +
>>>> +/*
>>>> + * Data format in RTAS-Blob
>>>> + *
>>>> + * This structure contains error information related to Machine
>>>> + * Check exception. This is filled up and copied to rtas-blob
>>>> + * upon machine check exception. The address of rtas-blob is
>>>> + * passed on to OS registered machine check notification
>>>> + * routines upon machine check exception
>>>> + */
>>>> +struct rtas_mc_log {
>>>> +    target_ulong r3;
>>>> +    struct rtas_error_log err_log;
>>>> +};
>>>> +
>>>
>>> QEMU coding style is normally to use CamelCase for type names, see
>>> chapter 3 in CODING_STYLE.
>>
>> Will take care in the next revision.
> 
> Just to make it clear - the CamelCase comment is valid for the new
> rtas_mc_log type but not for rtas_error_log which is a copy of kernel's
> arch/powerpc/include/asm/rtas.h

sure. Thanks.


> 
> 
> 
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2015-11-19 16:03 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-11 17:15 [Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests Aravinda Prasad
2015-11-11 17:15 ` [Qemu-devel] [PATCH 1/4] spapr: Extend rtas-blob Aravinda Prasad
2015-11-12  3:40   ` David Gibson
2015-11-12  8:26   ` Thomas Huth
2015-11-12 11:53     ` David Gibson
2015-11-12 18:59     ` Aravinda Prasad
2015-11-11 17:15 ` [Qemu-devel] [PATCH 2/4] spapr: Register and handle HCALL to receive updated RTAS region Aravinda Prasad
2015-11-12  3:42   ` David Gibson
2015-11-12  5:28     ` [Qemu-devel] [Qemu-ppc] " Nikunj A Dadhania
2015-11-12  7:23       ` David Gibson
2015-11-11 17:15 ` [Qemu-devel] [PATCH 3/4] spapr: Handle "ibm, nmi-register" and "ibm, nmi-interlock" RTAS calls Aravinda Prasad
2015-11-12  4:02   ` David Gibson
2015-11-12 18:04     ` Aravinda Prasad
2015-11-12  9:23   ` Thomas Huth
2015-11-12 18:52     ` Aravinda Prasad
2015-11-11 17:16 ` [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit Aravinda Prasad
2015-11-12  4:29   ` David Gibson
2015-11-12  5:20     ` Aravinda Prasad
2015-11-12  8:09   ` Thomas Huth
2015-11-12  9:40     ` Thomas Huth
2015-11-12 18:49       ` Aravinda Prasad
2015-11-16  7:52         ` Thomas Huth
2015-11-16 10:07           ` Aravinda Prasad
2015-11-16 10:41             ` Thomas Huth
2015-11-16 11:57               ` Aravinda Prasad
2015-11-13  1:57       ` David Gibson
2015-11-13  7:03         ` Thomas Huth
2015-11-16  5:45           ` David Gibson
2015-11-12 18:23     ` Aravinda Prasad
2015-11-13  1:58       ` David Gibson
2015-11-13  4:53         ` Aravinda Prasad
2015-11-13  5:57           ` David Gibson
2015-11-13  6:27             ` Aravinda Prasad
2015-11-19  1:56       ` Alexey Kardashevskiy
2015-11-19 16:02         ` Aravinda Prasad
2015-11-16  3:50     ` Paul Mackerras
2015-11-16  9:01       ` Thomas Huth
2015-11-16 11:29         ` Aravinda Prasad
2015-11-16 21:46         ` Paul Mackerras
2015-11-12  4:30 ` [Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests David Gibson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.