* [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug
@ 2014-08-19  0:21 Michael Roth
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node Michael Roth
                   ` (12 more replies)
  0 siblings, 13 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-19  0:21 UTC
  To: qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

These patches are based on ppc-next, and can also be obtained from:

https://github.com/mdroth/qemu/commits/spapr-pci-hotplug-v3-ppc-next

v3:
 * dropped emulation of firmware-managed BAR allocation. This will be
   introduced in a follow-up series, enabled via a -machine flag and tied
   to a separate hotplug event to avoid a race between guest-managed and
   "firmware"-managed BAR allocation, in conjunction with the required
   fixes to the rpaphp hotplug kernel module to utilize this mode.
 * moved drc_table into sPAPREnvironment (Alexey)
 * moved INDICATOR_* constants and friends into spapr_pci.c (Alexey)
 * use prefixes for global types (DrcEntry/ConfigureConnectorState) (Alexey)
 * updated for new hotplug interface (Alexey)
 * fixed get-power-level to report current power-level rather than
   desired (Alexey)
 * rebased to latest ppc-next

v2:
  * re-ordered patches to fix build bisectability (Alexey)
  * replaced g_warning with DPRINTF in RTAS calls for guest errors (Alexey)
  * replaced g_warning with fprintf for qemu errors (Alexey)
  * updated RTAS calls to use pre-existing error/success macros (Alexey)
  * replaced DR_*/SENSOR_* macros with INDICATOR_* for set-indicator/
    get-sensor-state (Alexey)

OVERVIEW

These patches add support for PCI hotplug for SPAPR guests. We advertise
each PHB as DR-capable (as defined by PAPR 13.5/13.6) with 32 hotpluggable
PCI slots per PHB, which models a standard PCI expansion device for Power
machines where the DRC name/loc-code/index for each slot are generated
based on bus/slot number.

This is compatible with existing guest kernels via the rpaphp hotplug
module, and with existing userspace tools such as drmgr/librtas/rtas_errd
for managing devices, in theory...
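
As a rough illustration of the bus/slot-based generation described above,
the standalone sketch below mirrors how patches 01 and 02 of this series
compute a slot's DRC index and name. The two constants are copied from the
series; the helper names slot_drc_index() and slot_drc_name() are
hypothetical and exist only for this example.

```c
#include <stdint.h>
#include <stdio.h>

/* Values as added to include/hw/ppc/spapr.h by this series. */
#define SPAPR_DRC_PHB_SLOT_MAX  32
#define SPAPR_DRC_DEV_ID_BASE   0x40000000

/* Hypothetical helper: DRC index of slot `slot` (0..31) on PHB `phb_index`. */
static uint32_t slot_drc_index(uint32_t phb_index, uint32_t slot)
{
    /* 32 slots shifted left by 3 stay below 1 << 8, so PHBs never overlap. */
    return SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (slot << 3);
}

/* Hypothetical helper: DRC name of the same slot, e.g. "Slot 34". */
static int slot_drc_name(char *buf, size_t len,
                         uint32_t phb_index, uint32_t slot)
{
    return snprintf(buf, len, "Slot %u",
                    (unsigned)(phb_index * SPAPR_DRC_PHB_SLOT_MAX + slot));
}
```

With these definitions, slot 2 on PHB 1 gets DRC index 0x40000110 and the
name "Slot 34", which is the string a guest would pass to drmgr -s.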

NOTES / ADDITIONAL DEPENDENCIES

This series relies on v1.2.19 or later of powerpc-utils (drmgr, rtas_errd,
ppc64-diag, and librtas components, specifically), which will automate
guest-side hotplug setup in response to an EPOW event emitted by QEMU. For
guests with older versions of powerpc-utils, a manual workaround must be
used (documented below).

PATCH LAYOUT

Patches
        1-3   advertise PHBs and associated slots as hotpluggable to guests
        4-7   add RTAS interfaces required for device configuration
        8     fix for ppc (and other) guests that allocate IO BARs starting
              at 0x0
        9     enables device_add/device_del for spapr machines and
              guest-driven hotplug
        10-12 define hotplug event structure and emit them in response to
              device_add/device_del

USAGE

For guests with powerpc-utils 1.2.19+:
  hotplug:
    qemu:
      device_add e1000,id=slot0
  unplug:
    qemu:
      device_del slot0

For guests with powerpc-utils prior to 1.2.19:
  hotplug:
    qemu:
      device_add e1000,id=slot0
    guest:
      drmgr -c pci -s "Slot 0" -n -a
      echo 1 >/sys/bus/pci/rescan
  unplug:
    guest:
      drmgr -c pci -s "Slot 0" -n -r
      echo 1 >/sys/bus/pci/devices/0000:00:00.0/remove
    qemu:
      device_del slot0

 hw/pci/pci.c                |   2 +-
 hw/ppc/spapr.c              | 172 +++++++++++++++++++++++++++++++++-
 hw/ppc/spapr_events.c       | 224 ++++++++++++++++++++++++++++++++++++--------
 hw/ppc/spapr_pci.c          | 689 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 include/hw/pci-host/spapr.h |   1 +
 include/hw/ppc/spapr.h      |  46 ++++++++-
 6 files changed, 1083 insertions(+), 51 deletions(-)


* [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
@ 2014-08-19  0:21 ` Michael Roth
  2014-08-26  7:55   ` Alexey Kardashevskiy
                     ` (3 more replies)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs Michael Roth
                   ` (11 subsequent siblings)
  12 siblings, 4 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-19  0:21 UTC
  To: qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

From: Nathan Fontenot <nfont@linux.vnet.ibm.com>

This adds entries to the root OF node to advertise our PHBs as being
DR-capable in accordance with the PAPR specification.

Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
and associated with a power domain of -1 (indicating to guests that
power management is handled automatically by hardware).

We currently allocate entries for up to 32 DR-capable PHBs, though
this limit can be increased later.

DrcEntry objects that track the state of the DR-connector associated
with each PHB are stored in a 32-entry array. Each DrcEntry can in
turn have a dynamically-allocated array of child DR-connectors,
which we will use later to track the state of DR-connectors
associated with a PHB's physical slots.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_pci.c     |   1 +
 include/hw/ppc/spapr.h |  35 ++++++++++++
 3 files changed, 179 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 5c92707..d5e46c3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
     return ram_size;
 }
 
+sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
+{
+    int i;
+
+    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
+        if (spapr->drc_table[i].phb_buid == buid) {
+            return &spapr->drc_table[i];
+        }
+     }
+
+     return NULL;
+}
+
+static void spapr_init_drc_table(void)
+{
+    int i;
+
+    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
+
+    /* For now we only care about PHB entries */
+    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
+        spapr->drc_table[i].drc_index = 0x2000001 + i;
+    }
+}
+
+sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
+{
+    sPAPRDrcEntry *empty_drc = NULL;
+    sPAPRDrcEntry *found_drc = NULL;
+    int i, phb_index;
+
+    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
+        if (spapr->drc_table[i].phb_buid == 0) {
+            empty_drc = &spapr->drc_table[i];
+        }
+
+        if (spapr->drc_table[i].phb_buid == buid) {
+            found_drc = &spapr->drc_table[i];
+            break;
+        }
+    }
+
+    if (found_drc) {
+        return found_drc;
+    }
+
+    if (empty_drc) {
+        empty_drc->phb_buid = buid;
+        empty_drc->state = state;
+        empty_drc->cc_state.fdt = NULL;
+        empty_drc->cc_state.offset = 0;
+        empty_drc->cc_state.depth = 0;
+        empty_drc->cc_state.state = CC_STATE_IDLE;
+        empty_drc->child_entries =
+            g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
+        phb_index = buid - SPAPR_PCI_BASE_BUID;
+        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
+            empty_drc->child_entries[i].drc_index =
+                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
+        }
+        return empty_drc;
+    }
+
+    return NULL;
+}
+
+static void spapr_create_drc_dt_entries(void *fdt)
+{
+    char char_buf[1024];
+    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
+    uint32_t *entries;
+    int offset, fdt_offset;
+    int i, ret;
+
+    fdt_offset = fdt_path_offset(fdt, "/");
+
+    /* ibm,drc-indexes */
+    memset(int_buf, 0, sizeof(int_buf));
+    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
+
+    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
+        int_buf[i] = spapr->drc_table[i-1].drc_index;
+    }
+
+    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
+                      sizeof(int_buf));
+    if (ret) {
+        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
+    }
+
+    /* ibm,drc-power-domains */
+    memset(int_buf, 0, sizeof(int_buf));
+    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
+
+    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
+        int_buf[i] = 0xffffffff;
+    }
+
+    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
+                      sizeof(int_buf));
+    if (ret) {
+        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains property\n");
+    }
+
+    /* ibm,drc-names */
+    memset(char_buf, 0, sizeof(char_buf));
+    entries = (uint32_t *)&char_buf[0];
+    *entries = SPAPR_DRC_TABLE_SIZE;
+    offset = sizeof(*entries);
+
+    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
+        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
+        char_buf[offset++] = '\0';
+    }
+
+    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
+    if (ret) {
+        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
+    }
+
+    /* ibm,drc-types */
+    memset(char_buf, 0, sizeof(char_buf));
+    entries = (uint32_t *)&char_buf[0];
+    *entries = SPAPR_DRC_TABLE_SIZE;
+    offset = sizeof(*entries);
+
+    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
+        offset += sprintf(char_buf + offset, "PHB");
+        char_buf[offset++] = '\0';
+    }
+
+    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, offset);
+    if (ret) {
+        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
+    }
+}
+
 #define _FDT(exp) \
     do { \
         int ret = (exp);                                           \
@@ -731,6 +868,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
     char *bootlist;
     void *fdt;
     sPAPRPHBState *phb;
+    sPAPRDrcEntry *drc_entry;
 
     fdt = g_malloc(FDT_MAX_SIZE);
 
@@ -750,6 +888,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
     }
 
     QLIST_FOREACH(phb, &spapr->phbs, list) {
+        drc_entry = spapr_phb_to_drc_entry(phb->buid);
+        g_assert(drc_entry);
         ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
     }
 
@@ -789,6 +929,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
         spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
     }
 
+    spapr_create_drc_dt_entries(fdt);
+
     _FDT((fdt_pack(fdt)));
 
     if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
@@ -1443,6 +1585,7 @@ static void ppc_spapr_init(MachineState *machine)
     spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW);
     spapr_pci_rtas_init();
 
+    spapr_init_drc_table();
     phb = spapr_create_phb(spapr, 0);
 
     for (i = 0; i < nb_nics; i++) {
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 9ed39a9..e85134f 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -531,6 +531,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
             + sphb->index * SPAPR_PCI_WINDOW_SPACING;
         sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
         sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
+        spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
     }
 
     if (sphb->buid == -1) {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 36e8e51..c93794b 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -10,6 +10,36 @@ struct sPAPRNVRAM;
 
 #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
 
+/* For dlparable/hotpluggable slots */
+#define SPAPR_DRC_TABLE_SIZE    32
+#define SPAPR_DRC_PHB_SLOT_MAX  32
+#define SPAPR_DRC_DEV_ID_BASE   0x40000000
+
+typedef struct sPAPRConfigureConnectorState {
+    void *fdt;
+    int offset_start;
+    int offset;
+    int depth;
+    PCIDevice *dev;
+    enum {
+        CC_STATE_IDLE = 0,
+        CC_STATE_PENDING = 1,
+        CC_STATE_ACTIVE,
+    } state;
+} sPAPRConfigureConnectorState;
+
+typedef struct sPAPRDrcEntry sPAPRDrcEntry;
+
+struct sPAPRDrcEntry {
+    uint32_t drc_index;
+    uint64_t phb_buid;
+    void *fdt;
+    int fdt_offset;
+    uint32_t state;
+    sPAPRConfigureConnectorState cc_state;
+    sPAPRDrcEntry *child_entries;
+};
+
 typedef struct sPAPREnvironment {
     struct VIOsPAPRBus *vio_bus;
     QLIST_HEAD(, sPAPRPHBState) phbs;
@@ -39,6 +69,9 @@ typedef struct sPAPREnvironment {
     int htab_save_index;
     bool htab_first_pass;
     int htab_fd;
+
+    /* state for Dynamic Reconfiguration Connectors */
+    sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
 } sPAPREnvironment;
 
 #define H_SUCCESS         0
@@ -481,5 +514,7 @@ int spapr_dma_dt(void *fdt, int node_off, const char *propname,
                  uint32_t liobn, uint64_t window, uint32_t size);
 int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
                       sPAPRTCETable *tcet);
+sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
+sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
 
 #endif /* !defined (__HW_SPAPR_H__) */
-- 
1.9.1


* [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs
  2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node Michael Roth
@ 2014-08-19  0:21 ` Michael Roth
  2014-08-26  8:32   ` Alexey Kardashevskiy
                     ` (2 more replies)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 03/12] spapr: add helper to retrieve a PHB/device DrcEntry Michael Roth
                   ` (10 subsequent siblings)
  12 siblings, 3 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-19  0:21 UTC
  To: qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

Reserve 32 entries of type PCI in each PHB's initial FDT. This
advertises to guests that each PHB is a DR-capable device with
physical hotpluggable slots. This is necessary to allow devices
to be hotplugged to it later via bus rescan or the guest rpaphp
hotplug module.

Each entry is assigned a name of "Slot <bus_idx * 32 + slot_idx>"
(with slot_idx counting from 0), advertised as a hotpluggable PCI
slot, and assigned to power domain -1 to indicate to the guest that
power management is handled by the hardware.

This models a DR-capable PCI expansion device attached to a host/lpar
via a single PHB with 32 physical hotpluggable slots (as opposed to a
virtual bridge device with external management console). Hotplug will
be handled by the guest via bus rescan or the rpaphp hotplug module.
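
To make the layout of the per-slot name list concrete, here is a minimal
standalone sketch of how a count-prefixed "ibm,drc-names" buffer can be
assembled: a 32-bit entry count followed by NUL-terminated strings. The
helper name build_drc_names() is hypothetical. One assumption differs from
the patch as posted: the sketch stores the count explicitly in big-endian
byte order (device tree cells are big-endian), whereas the patch uses a
plain uint32_t assignment.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SPAPR_DRC_PHB_SLOT_MAX  32

/*
 * Hypothetical helper: fill `buf` with a 32-bit count followed by one
 * NUL-terminated "Slot N" string per slot of PHB `phb_index`.
 * Returns the number of bytes used, or -1 if `buf` is too small.
 */
static int build_drc_names(char *buf, size_t len, uint32_t phb_index)
{
    unsigned char *p = (unsigned char *)buf;
    uint32_t count = SPAPR_DRC_PHB_SLOT_MAX;
    size_t offset = 4;
    uint32_t i;

    if (len < offset) {
        return -1;
    }

    /* Assumption: store the count big-endian, independent of the host. */
    p[0] = (unsigned char)(count >> 24);
    p[1] = (unsigned char)(count >> 16);
    p[2] = (unsigned char)(count >> 8);
    p[3] = (unsigned char)count;

    for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
        int n = snprintf(buf + offset, len - offset, "Slot %u",
                         (unsigned)(phb_index * SPAPR_DRC_PHB_SLOT_MAX + i));
        if (n < 0 || (size_t)n >= len - offset) {
            return -1; /* name (plus its NUL) did not fit */
        }
        offset += (size_t)n + 1; /* keep the terminating NUL in the list */
    }

    return (int)offset;
}
```

The returned length is what would be passed as the property size, so the
trailing NUL of each name is part of the property value. Using snprintf()
with an explicit bound, rather than sprintf() into a fixed 1024-byte
buffer, is a choice made for this sketch only.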

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c              |   3 +-
 hw/ppc/spapr_pci.c          | 102 ++++++++++++++++++++++++++++++++++++++++++++
 include/hw/pci-host/spapr.h |   1 +
 3 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index d5e46c3..90b25b3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -890,7 +890,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
     QLIST_FOREACH(phb, &spapr->phbs, list) {
         drc_entry = spapr_phb_to_drc_entry(phb->buid);
         g_assert(drc_entry);
-        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
+        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, drc_entry->drc_index,
+                                    fdt);
     }
 
     if (ret < 0) {
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index e85134f..924d488 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -851,8 +851,104 @@ static int spapr_phb_children_dt(Object *child, void *opaque)
     return 1;
 }
 
+static void spapr_create_drc_phb_dt_entries(void *fdt, int bus_off, int phb_index)
+{
+    char char_buf[1024];
+    uint32_t int_buf[SPAPR_DRC_PHB_SLOT_MAX + 1];
+    uint32_t *entries;
+    int i, ret, offset;
+
+    /* ibm,drc-indexes */
+    memset(int_buf, 0 , sizeof(int_buf));
+    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
+
+    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
+        int_buf[i] = SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + ((i - 1) << 3);
+    }
+
+    ret = fdt_setprop(fdt, bus_off, "ibm,drc-indexes", int_buf,
+                      sizeof(int_buf));
+    if (ret) {
+        fprintf(stderr, "error adding 'ibm,drc-indexes' field for PHB FDT");
+    }
+
+    /* ibm,drc-power-domains */
+    memset(int_buf, 0, sizeof(int_buf));
+    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
+
+    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
+        int_buf[i] = 0xffffffff;
+    }
+
+    ret = fdt_setprop(fdt, bus_off, "ibm,drc-power-domains", int_buf,
+                      sizeof(int_buf));
+    if (ret) {
+        fprintf(stderr,
+                "error adding 'ibm,drc-power-domains' field for PHB FDT");
+    }
+
+    /* ibm,drc-names */
+    memset(char_buf, 0, sizeof(char_buf));
+    entries = (uint32_t *)&char_buf[0];
+    *entries = SPAPR_DRC_PHB_SLOT_MAX;
+    offset = sizeof(*entries);
+
+    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
+        offset += sprintf(char_buf + offset, "Slot %d",
+                          (phb_index * SPAPR_DRC_PHB_SLOT_MAX) + i - 1);
+        char_buf[offset++] = '\0';
+    }
+
+    ret = fdt_setprop(fdt, bus_off, "ibm,drc-names", char_buf, offset);
+    if (ret) {
+        fprintf(stderr, "error adding 'ibm,drc-names' field for PHB FDT");
+    }
+
+    /* ibm,drc-types */
+    memset(char_buf, 0, sizeof(char_buf));
+    entries = (uint32_t *)&char_buf[0];
+    *entries = SPAPR_DRC_PHB_SLOT_MAX;
+    offset = sizeof(*entries);
+
+    for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
+        offset += sprintf(char_buf + offset, "28");
+        char_buf[offset++] = '\0';
+    }
+
+    ret = fdt_setprop(fdt, bus_off, "ibm,drc-types", char_buf, offset);
+    if (ret) {
+        fprintf(stderr, "error adding 'ibm,drc-types' field for PHB FDT");
+    }
+
+    /* we want the initial indicator state to be 0 - "empty", when we
+     * hot-plug an adaptor in the slot, we need to set the indicator
+     * to 1 - "present."
+     */
+
+    /* ibm,indicator-9003 */
+    memset(int_buf, 0, sizeof(int_buf));
+    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
+
+    ret = fdt_setprop(fdt, bus_off, "ibm,indicator-9003", int_buf,
+                      sizeof(int_buf));
+    if (ret) {
+        fprintf(stderr, "error adding 'ibm,indicator-9003' field for PHB FDT");
+    }
+
+    /* ibm,sensor-9003 */
+    memset(int_buf, 0, sizeof(int_buf));
+    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
+
+    ret = fdt_setprop(fdt, bus_off, "ibm,sensor-9003", int_buf,
+                      sizeof(int_buf));
+    if (ret) {
+        fprintf(stderr, "error adding 'ibm,sensor-9003' field for PHB FDT");
+    }
+}
+
 int spapr_populate_pci_dt(sPAPRPHBState *phb,
                           uint32_t xics_phandle,
+                          uint32_t drc_index,
                           void *fdt)
 {
     int bus_off, i, j;
@@ -934,6 +1030,12 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb,
     object_child_foreach(OBJECT(phb), spapr_phb_children_dt,
                          &((sPAPRTCEDT){ .fdt = fdt, .node_off = bus_off }));
 
+    spapr_create_drc_phb_dt_entries(fdt, bus_off, phb->index);
+    if (drc_index) {
+        _FDT(fdt_setprop(fdt, bus_off, "ibm,my-drc-index", &drc_index,
+                         sizeof(drc_index)));
+    }
+
     return 0;
 }
 
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index 32f0aa7..8f0a42f 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -116,6 +116,7 @@ PCIHostState *spapr_create_phb(sPAPREnvironment *spapr, int index);
 
 int spapr_populate_pci_dt(sPAPRPHBState *phb,
                           uint32_t xics_phandle,
+                          uint32_t drc_index,
                           void *fdt);
 
 void spapr_pci_msi_init(sPAPREnvironment *spapr, hwaddr addr);
-- 
1.9.1


* [Qemu-devel] [PATCH 03/12] spapr: add helper to retrieve a PHB/device DrcEntry
  2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node Michael Roth
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs Michael Roth
@ 2014-08-19  0:21 ` Michael Roth
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface Michael Roth
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-19  0:21 UTC
  To: qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c         | 23 +++++++++++++++++++++++
 include/hw/ppc/spapr.h |  1 +
 2 files changed, 24 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 90b25b3..39cb0bb 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -309,6 +309,29 @@ sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
      return NULL;
 }
 
+sPAPRDrcEntry *spapr_find_drc_entry(int drc_index)
+{
+    int i, j;
+
+    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
+        sPAPRDrcEntry *phb_entry = &spapr->drc_table[i];
+        if (phb_entry->drc_index == drc_index) {
+            return phb_entry;
+        }
+        if (phb_entry->child_entries == NULL) {
+            continue;
+        }
+        for (j = 0; j < SPAPR_DRC_PHB_SLOT_MAX; j++) {
+            sPAPRDrcEntry *entry = &phb_entry->child_entries[j];
+            if (entry->drc_index == drc_index) {
+                return entry;
+            }
+        }
+     }
+
+     return NULL;
+}
+
 static void spapr_init_drc_table(void)
 {
     int i;
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index c93794b..0ac1a19 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -516,5 +516,6 @@ int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
                       sPAPRTCETable *tcet);
 sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
 sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
+sPAPRDrcEntry *spapr_find_drc_entry(int drc_index);
 
 #endif /* !defined (__HW_SPAPR_H__) */
-- 
1.9.1


* [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface
  2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
                   ` (2 preceding siblings ...)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 03/12] spapr: add helper to retrieve a PHB/device DrcEntry Michael Roth
@ 2014-08-19  0:21 ` Michael Roth
  2014-08-26 11:36   ` Alexander Graf
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 05/12] spapr_pci: add get/set-power-level RTAS interfaces Michael Roth
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 69+ messages in thread
From: Michael Roth @ 2014-08-19  0:21 UTC
  To: qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

From: Mike Day <ncmike@ncultra.org>

Signed-off-by: Mike Day <ncmike@ncultra.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
 hw/ppc/spapr_pci.c     | 119 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h |   3 ++
 2 files changed, 122 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 924d488..23a3477 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -36,6 +36,16 @@
 
 #include "hw/pci/pci_bus.h"
 
+/* #define DEBUG_SPAPR */
+
+#ifdef DEBUG_SPAPR
+#define DPRINTF(fmt, ...) \
+    do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) \
+    do { } while (0)
+#endif
+
 /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
 #define RTAS_QUERY_FN           0
 #define RTAS_CHANGE_FN          1
@@ -47,6 +57,31 @@
 #define RTAS_TYPE_MSI           1
 #define RTAS_TYPE_MSIX          2
 
+/* For set-indicator RTAS interface */
+#define INDICATOR_ISOLATION_MASK            0x0001   /* 9001 one bit */
+#define INDICATOR_GLOBAL_INTERRUPT_MASK     0x0002   /* 9005 one bit */
+#define INDICATOR_ERROR_LOG_MASK            0x0004   /* 9006 one bit */
+#define INDICATOR_IDENTIFY_MASK             0x0008   /* 9007 one bit */
+#define INDICATOR_RESET_MASK                0x0010   /* 9009 one bit */
+#define INDICATOR_DR_MASK                   0x00e0   /* 9002 three bits */
+#define INDICATOR_ALLOCATION_MASK           0x0300   /* 9003 two bits */
+#define INDICATOR_EPOW_MASK                 0x1c00   /* 9 three bits */
+
+#define INDICATOR_ISOLATION_SHIFT           0x00     /* bit 0 */
+#define INDICATOR_GLOBAL_INTERRUPT_SHIFT    0x01     /* bit 1 */
+#define INDICATOR_ERROR_LOG_SHIFT           0x02     /* bit 2 */
+#define INDICATOR_IDENTIFY_SHIFT            0x03     /* bit 3 */
+#define INDICATOR_RESET_SHIFT               0x04     /* bit 4 */
+#define INDICATOR_DR_SHIFT                  0x05     /* bits 5-7 */
+#define INDICATOR_ALLOCATION_SHIFT          0x08     /* bits 8-9 */
+#define INDICATOR_EPOW_SHIFT                0x0a     /* bits 10-12 */
+
+#define DECODE_DRC_STATE(state, m, s)                  \
+    ((((uint32_t)(state) & (uint32_t)(m))) >> (s))
+
+#define ENCODE_DRC_STATE(val, m, s) \
+    (((uint32_t)(val) << (s)) & (uint32_t)(m))
+
 static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
 {
     sPAPRPHBState *sphb;
@@ -402,6 +437,80 @@ static void rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
     rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */
 }
 
+static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
+                               uint32_t token, uint32_t nargs,
+                               target_ulong args, uint32_t nret,
+                               target_ulong rets)
+{
+    uint32_t indicator = rtas_ld(args, 0);
+    uint32_t drc_index = rtas_ld(args, 1);
+    uint32_t indicator_state = rtas_ld(args, 2);
+    uint32_t encoded = 0, shift = 0, mask = 0;
+    uint32_t *pind;
+    sPAPRDrcEntry *drc_entry = NULL;
+
+    if (drc_index == 0) { /* platform indicator */
+        pind = &spapr->state;
+    } else {
+        drc_entry = spapr_find_drc_entry(drc_index);
+        if (!drc_entry) {
+            DPRINTF("rtas_set_indicator: unable to find drc_entry for %x",
+                    drc_index);
+            rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
+            return;
+        }
+        pind = &drc_entry->state;
+    }
+
+    switch (indicator) {
+    case 9:  /* EPOW */
+        shift = INDICATOR_EPOW_SHIFT;
+        mask = INDICATOR_EPOW_MASK;
+        break;
+    case 9001: /* Isolation state */
+        /* encode the new value into the correct bit field */
+        shift = INDICATOR_ISOLATION_SHIFT;
+        mask = INDICATOR_ISOLATION_MASK;
+        break;
+    case 9002: /* DR */
+        shift = INDICATOR_DR_SHIFT;
+        mask = INDICATOR_DR_MASK;
+        break;
+    case 9003: /* Allocation State */
+        shift = INDICATOR_ALLOCATION_SHIFT;
+        mask = INDICATOR_ALLOCATION_MASK;
+        break;
+    case 9005: /* global interrupt */
+        shift = INDICATOR_GLOBAL_INTERRUPT_SHIFT;
+        mask = INDICATOR_GLOBAL_INTERRUPT_MASK;
+        break;
+    case 9006: /* error log */
+        shift = INDICATOR_ERROR_LOG_SHIFT;
+        mask = INDICATOR_ERROR_LOG_MASK;
+        break;
+    case 9007: /* identify */
+        shift = INDICATOR_IDENTIFY_SHIFT;
+        mask = INDICATOR_IDENTIFY_MASK;
+        break;
+    case 9009: /* reset */
+        shift = INDICATOR_RESET_SHIFT;
+        mask = INDICATOR_RESET_MASK;
+        break;
+    default:
+        DPRINTF("rtas_set_indicator: indicator not implemented: %d",
+                indicator);
+        rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
+        return;
+    }
+
+    encoded = ENCODE_DRC_STATE(indicator_state, mask, shift);
+    /* clear the current indicator value */
+    *pind &= ~mask;
+    /* set the new value */
+    *pind |= encoded;
+    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+}
+
 static int pci_spapr_swizzle(int slot, int pin)
 {
     return (slot + pin) % PCI_NUM_PINS;
@@ -624,6 +733,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
         sphb->lsi_table[i].irq = irq;
     }
 
+    /* make sure the platform EPOW sensor is initialized - the
+     * guest will probe it when there is a hotplug event.
+     */
+    spapr->state &= ~(uint32_t)INDICATOR_EPOW_MASK;
+    spapr->state |= ENCODE_DRC_STATE(0,
+                                     INDICATOR_EPOW_MASK,
+                                     INDICATOR_EPOW_SHIFT);
+
     if (!info->finish_realize) {
         error_setg(errp, "finish_realize not defined");
         return;
@@ -1056,6 +1173,8 @@ void spapr_pci_rtas_init(void)
         spapr_rtas_register(RTAS_IBM_CHANGE_MSI, "ibm,change-msi",
                             rtas_ibm_change_msi);
     }
+    spapr_rtas_register(RTAS_SET_INDICATOR, "set-indicator",
+                        rtas_set_indicator);
 }
 
 static void spapr_pci_register_types(void)
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 0ac1a19..fac85f8 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -72,6 +72,9 @@ typedef struct sPAPREnvironment {
 
     /* state for Dynamic Reconfiguration Connectors */
     sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
+
+    /* Platform state - sensors and indicators */
+    uint32_t state;
 } sPAPREnvironment;
 
 #define H_SUCCESS         0
-- 
1.9.1


* [Qemu-devel] [PATCH 05/12] spapr_pci: add get/set-power-level RTAS interfaces
  2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
                   ` (3 preceding siblings ...)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface Michael Roth
@ 2014-08-19  0:21 ` Michael Roth
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 06/12] spapr_pci: add get-sensor-state RTAS interface Michael Roth
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-19  0:21 UTC
  To: qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

From: Nathan Fontenot <nfont@linux.vnet.ibm.com>

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
 hw/ppc/spapr_pci.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 23a3477..f007dd6 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -511,6 +511,27 @@ static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     rtas_st(rets, 0, RTAS_OUT_SUCCESS);
 }
 
+static void rtas_set_power_level(PowerPCCPU *cpu, sPAPREnvironment *spapr,
+                                 uint32_t token, uint32_t nargs,
+                                 target_ulong args, uint32_t nret,
+                                 target_ulong rets)
+{
+    /* we currently only use a single, "live insert" powerdomain for
+     * hotplugged/dlpar'd resources, so the power is always live/full (100)
+     */
+    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+    rtas_st(rets, 1, 100);
+}
+
+static void rtas_get_power_level(PowerPCCPU *cpu, sPAPREnvironment *spapr,
+                                  uint32_t token, uint32_t nargs,
+                                  target_ulong args, uint32_t nret,
+                                  target_ulong rets)
+{
+    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+    rtas_st(rets, 1, 100);
+}
+
 static int pci_spapr_swizzle(int slot, int pin)
 {
     return (slot + pin) % PCI_NUM_PINS;
@@ -1175,6 +1196,10 @@ void spapr_pci_rtas_init(void)
     }
     spapr_rtas_register(RTAS_SET_INDICATOR, "set-indicator",
                         rtas_set_indicator);
+    spapr_rtas_register(RTAS_SET_POWER_LEVEL, "set-power-level",
+                        rtas_set_power_level);
+    spapr_rtas_register(RTAS_GET_POWER_LEVEL, "get-power-level",
+                        rtas_get_power_level);
 }
 
 static void spapr_pci_register_types(void)
-- 
1.9.1

* [Qemu-devel] [PATCH 06/12] spapr_pci: add get-sensor-state RTAS interface
  2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
                   ` (4 preceding siblings ...)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 05/12] spapr_pci: add get/set-power-level RTAS interfaces Michael Roth
@ 2014-08-19  0:21 ` Michael Roth
  2014-09-05  0:34   ` Tyrel Datwyler
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 07/12] spapr_pci: add ibm,configure-connector " Michael Roth
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 69+ messages in thread
From: Michael Roth @ 2014-08-19  0:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

From: Mike Day <ncmike@ncultra.org>

Signed-off-by: Mike Day <ncmike@ncultra.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
 hw/ppc/spapr_pci.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index f007dd6..8d1351d 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -66,6 +66,7 @@
 #define INDICATOR_DR_MASK                   0x00e0   /* 9002 three bits */
 #define INDICATOR_ALLOCATION_MASK           0x0300   /* 9003 two bits */
 #define INDICATOR_EPOW_MASK                 0x1c00   /* 9 three bits */
+#define INDICATOR_ENTITY_SENSE_MASK         0xe000   /* 9003 three bits */
 
 #define INDICATOR_ISOLATION_SHIFT           0x00     /* bit 0 */
 #define INDICATOR_GLOBAL_INTERRUPT_SHIFT    0x01     /* bit 1 */
@@ -75,6 +76,10 @@
 #define INDICATOR_DR_SHIFT                  0x05     /* bits 5-7 */
 #define INDICATOR_ALLOCATION_SHIFT          0x08     /* bits 8-9 */
 #define INDICATOR_EPOW_SHIFT                0x0a     /* bits 10-12 */
+#define INDICATOR_ENTITY_SENSE_SHIFT        0x0d     /* bits 13-15 */
+
+#define INDICATOR_ENTITY_SENSE_EMPTY    0
+#define INDICATOR_ENTITY_SENSE_PRESENT  1
 
 #define DECODE_DRC_STATE(state, m, s)                  \
     ((((uint32_t)(state) & (uint32_t)(m))) >> (s))
@@ -532,6 +537,75 @@ static void rtas_get_power_level(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     rtas_st(rets, 1, 100);
 }
 
+static void rtas_get_sensor_state(PowerPCCPU *cpu, sPAPREnvironment *spapr,
+                                  uint32_t token, uint32_t nargs,
+                                  target_ulong args, uint32_t nret,
+                                  target_ulong rets)
+{
+    uint32_t sensor = rtas_ld(args, 0);
+    uint32_t drc_index = rtas_ld(args, 1);
+    uint32_t sensor_state = 0, decoded = 0;
+    uint32_t shift = 0, mask = 0;
+    sPAPRDrcEntry *drc_entry = NULL;
+
+    if (drc_index == 0) {  /* platform state sensor/indicator */
+        sensor_state = spapr->state;
+    } else { /* we should have a drc entry */
+        drc_entry = spapr_find_drc_entry(drc_index);
+        if (!drc_entry) {
+            DPRINTF("unable to find DRC entry for index %x", drc_index);
+            sensor_state = 0; /* empty */
+            rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
+            return;
+        }
+        sensor_state = drc_entry->state;
+    }
+    switch (sensor) {
+    case 9:  /* EPOW */
+        shift = INDICATOR_EPOW_SHIFT;
+        mask = INDICATOR_EPOW_MASK;
+        break;
+    case 9001: /* Isolation state */
+        /* encode the new value into the correct bit field */
+        shift = INDICATOR_ISOLATION_SHIFT;
+        mask = INDICATOR_ISOLATION_MASK;
+        break;
+    case 9002: /* DR */
+        shift = INDICATOR_DR_SHIFT;
+        mask = INDICATOR_DR_MASK;
+        break;
+    case 9003: /* entity sense */
+        shift = INDICATOR_ENTITY_SENSE_SHIFT;
+        mask = INDICATOR_ENTITY_SENSE_MASK;
+        break;
+    case 9005: /* global interrupt */
+        shift = INDICATOR_GLOBAL_INTERRUPT_SHIFT;
+        mask = INDICATOR_GLOBAL_INTERRUPT_MASK;
+        break;
+    case 9006: /* error log */
+        shift = INDICATOR_ERROR_LOG_SHIFT;
+        mask = INDICATOR_ERROR_LOG_MASK;
+        break;
+    case 9007: /* identify */
+        shift = INDICATOR_IDENTIFY_SHIFT;
+        mask = INDICATOR_IDENTIFY_MASK;
+        break;
+    case 9009: /* reset */
+        shift = INDICATOR_RESET_SHIFT;
+        mask = INDICATOR_RESET_MASK;
+        break;
+    default:
+        DPRINTF("rtas_get_sensor_state: sensor not implemented: %d",
+                sensor);
+        rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
+        return;
+    }
+
+    decoded = DECODE_DRC_STATE(sensor_state, mask, shift);
+    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+    rtas_st(rets, 1, decoded);
+}
+
 static int pci_spapr_swizzle(int slot, int pin)
 {
     return (slot + pin) % PCI_NUM_PINS;
@@ -1200,6 +1274,8 @@ void spapr_pci_rtas_init(void)
                         rtas_set_power_level);
     spapr_rtas_register(RTAS_GET_POWER_LEVEL, "get-power-level",
                         rtas_get_power_level);
+    spapr_rtas_register(RTAS_GET_SENSOR_STATE, "get-sensor-state",
+                        rtas_get_sensor_state);
 }
 
 static void spapr_pci_register_types(void)
-- 
1.9.1
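
The get-sensor-state handler in this patch picks a mask/shift pair per sensor and decodes that field out of one packed 32-bit state word. As a rough, standalone illustration of that packing (not part of the patch), the sketch below copies the entity-sense and EPOW field definitions and the DECODE/ENCODE macros from the series; `drc_state_set_field` is a hypothetical helper added only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout copied from the series: EPOW in bits 10-12,
 * entity sense in bits 13-15 of the packed state word.
 */
#define INDICATOR_EPOW_MASK             0x1c00
#define INDICATOR_EPOW_SHIFT            0x0a
#define INDICATOR_ENTITY_SENSE_MASK     0xe000
#define INDICATOR_ENTITY_SENSE_SHIFT    0x0d

#define DECODE_DRC_STATE(state, m, s) \
    ((((uint32_t)(state) & (uint32_t)(m))) >> (s))
#define ENCODE_DRC_STATE(val, m, s) \
    (((uint32_t)(val) << (s)) & (uint32_t)(m))

/* Hypothetical helper (not in the patch): overwrite a single field of
 * the packed state word and leave every other field untouched.
 */
static uint32_t drc_state_set_field(uint32_t state, uint32_t val,
                                    uint32_t mask, uint32_t shift)
{
    assert(shift < 32);
    state &= ~mask;
    state |= ENCODE_DRC_STATE(val, mask, shift);
    return state;
}
```

Setting entity sense to 1 (present) on an empty state word sets bit 13 only, and decoding with the same mask/shift pair returns 1 again; fields set afterwards with a different pair do not disturb it.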

* [Qemu-devel] [PATCH 07/12] spapr_pci: add ibm,configure-connector RTAS interface
  2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
                   ` (5 preceding siblings ...)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 06/12] spapr_pci: add get-sensor-state RTAS interface Michael Roth
@ 2014-08-19  0:21 ` Michael Roth
  2014-08-26  9:12   ` Alexey Kardashevskiy
  2014-08-26 11:39   ` Alexander Graf
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions Michael Roth
                   ` (5 subsequent siblings)
  12 siblings, 2 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-19  0:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
 hw/ppc/spapr_pci.c | 111 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 111 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 8d1351d..96a57be 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -606,6 +606,115 @@ static void rtas_get_sensor_state(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     rtas_st(rets, 1, decoded);
 }
 
+/* configure-connector work area offsets, int32_t units */
+#define CC_IDX_NODE_NAME_OFFSET 2
+#define CC_IDX_PROP_NAME_OFFSET 2
+#define CC_IDX_PROP_LEN 3
+#define CC_IDX_PROP_DATA_OFFSET 4
+
+#define CC_VAL_DATA_OFFSET ((CC_IDX_PROP_DATA_OFFSET + 1) * 4)
+#define CC_RET_NEXT_SIB 1
+#define CC_RET_NEXT_CHILD 2
+#define CC_RET_NEXT_PROPERTY 3
+#define CC_RET_PREV_PARENT 4
+#define CC_RET_ERROR RTAS_OUT_HW_ERROR
+#define CC_RET_SUCCESS RTAS_OUT_SUCCESS
+
+static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
+                                         sPAPREnvironment *spapr,
+                                         uint32_t token, uint32_t nargs,
+                                         target_ulong args, uint32_t nret,
+                                         target_ulong rets)
+{
+    uint64_t wa_addr = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 0);
+    sPAPRDrcEntry *drc_entry = NULL;
+    sPAPRConfigureConnectorState *ccs;
+    void *wa_buf;
+    int32_t *wa_buf_int;
+    hwaddr map_len = 0x1024;
+    uint32_t drc_index;
+    int rc = 0, next_offset, tag, prop_len, node_name_len;
+    const struct fdt_property *prop;
+    const char *node_name, *prop_name;
+
+    wa_buf = cpu_physical_memory_map(wa_addr, &map_len, 1);
+    if (!wa_buf) {
+        rc = CC_RET_ERROR;
+        goto error_exit;
+    }
+    wa_buf_int = wa_buf;
+
+    drc_index = *(uint32_t *)wa_buf;
+    drc_entry = spapr_find_drc_entry(drc_index);
+    if (!drc_entry) {
+        rc = -1;
+        goto error_exit;
+    }
+
+    ccs = &drc_entry->cc_state;
+    if (ccs->state == CC_STATE_PENDING) {
+        /* fdt should've been attached to drc_entry during
+         * realize/hotplug
+         */
+        g_assert(ccs->fdt);
+        ccs->depth = 0;
+        ccs->offset = ccs->offset_start;
+        ccs->state = CC_STATE_ACTIVE;
+    }
+
+    if (ccs->state == CC_STATE_IDLE) {
+        rc = -1;
+        goto error_exit;
+    }
+
+retry:
+    tag = fdt_next_tag(ccs->fdt, ccs->offset, &next_offset);
+
+    switch (tag) {
+    case FDT_BEGIN_NODE:
+        ccs->depth++;
+        node_name = fdt_get_name(ccs->fdt, ccs->offset, &node_name_len);
+        wa_buf_int[CC_IDX_NODE_NAME_OFFSET] = CC_VAL_DATA_OFFSET;
+        strcpy(wa_buf + wa_buf_int[CC_IDX_NODE_NAME_OFFSET], node_name);
+        rc = CC_RET_NEXT_CHILD;
+        break;
+    case FDT_END_NODE:
+        ccs->depth--;
+        if (ccs->depth == 0) {
+            /* reached the end of top-level node, declare success */
+            ccs->state = CC_STATE_PENDING;
+            rc = CC_RET_SUCCESS;
+        } else {
+            rc = CC_RET_PREV_PARENT;
+        }
+        break;
+    case FDT_PROP:
+        prop = fdt_get_property_by_offset(ccs->fdt, ccs->offset, &prop_len);
+        prop_name = fdt_string(ccs->fdt, fdt32_to_cpu(prop->nameoff));
+        wa_buf_int[CC_IDX_PROP_NAME_OFFSET] = CC_VAL_DATA_OFFSET;
+        wa_buf_int[CC_IDX_PROP_LEN] = prop_len;
+        wa_buf_int[CC_IDX_PROP_DATA_OFFSET] =
+            CC_VAL_DATA_OFFSET + strlen(prop_name) + 1;
+        strcpy(wa_buf + wa_buf_int[CC_IDX_PROP_NAME_OFFSET], prop_name);
+        memcpy(wa_buf + wa_buf_int[CC_IDX_PROP_DATA_OFFSET],
+               prop->data, prop_len);
+        rc = CC_RET_NEXT_PROPERTY;
+        break;
+    case FDT_END:
+        rc = CC_RET_ERROR;
+        break;
+    default:
+        ccs->offset = next_offset;
+        goto retry;
+    }
+
+    ccs->offset = next_offset;
+
+error_exit:
+    cpu_physical_memory_unmap(wa_buf, 0x1024, 1, 0x1024);
+    rtas_st(rets, 0, rc);
+}
+
 static int pci_spapr_swizzle(int slot, int pin)
 {
     return (slot + pin) % PCI_NUM_PINS;
@@ -1276,6 +1385,8 @@ void spapr_pci_rtas_init(void)
                         rtas_get_power_level);
     spapr_rtas_register(RTAS_GET_SENSOR_STATE, "get-sensor-state",
                         rtas_get_sensor_state);
+    spapr_rtas_register(RTAS_IBM_CONFIGURE_CONNECTOR, "ibm,configure-connector",
+                        rtas_ibm_configure_connector);
 }
 
 static void spapr_pci_register_types(void)
-- 
1.9.1
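
For the FDT_PROP case above, the work-area layout may be easier to follow in isolation. The sketch below reuses the CC_* offsets from the patch and writes a property name and value into a caller-supplied buffer. `cc_pack_property` and its bounds check are assumptions made for this example only; the patch itself uses strcpy/memcpy without a length check and, like this sketch, stores the offsets in host byte order.

```c
#include <stdint.h>
#include <string.h>

/* Work-area offsets from the patch, in int32_t units. */
#define CC_IDX_PROP_NAME_OFFSET 2
#define CC_IDX_PROP_LEN 3
#define CC_IDX_PROP_DATA_OFFSET 4

/* Byte offset of the first free byte after the fixed cells. */
#define CC_VAL_DATA_OFFSET ((CC_IDX_PROP_DATA_OFFSET + 1) * 4)

/* Hypothetical helper (not in the patch): place a property into the
 * work area. The name string starts at CC_VAL_DATA_OFFSET and the
 * value follows its NUL terminator. Returns the number of bytes used,
 * or -1 if the property does not fit.
 */
static int cc_pack_property(void *wa, size_t wa_len, const char *name,
                            const void *data, int32_t data_len)
{
    int32_t *wa_int = wa;
    char *wa_bytes = wa;
    size_t name_len = strlen(name) + 1;     /* include the NUL */
    size_t data_off = CC_VAL_DATA_OFFSET + name_len;

    if (data_len < 0 || data_off + (size_t)data_len > wa_len) {
        return -1;
    }

    wa_int[CC_IDX_PROP_NAME_OFFSET] = CC_VAL_DATA_OFFSET;
    wa_int[CC_IDX_PROP_LEN] = data_len;
    wa_int[CC_IDX_PROP_DATA_OFFSET] = (int32_t)data_off;
    memcpy(wa_bytes + CC_VAL_DATA_OFFSET, name, name_len);
    memcpy(wa_bytes + data_off, data, (size_t)data_len);

    return (int)(data_off + (size_t)data_len);
}
```

With a 4-byte value for a property named "reg", the name lands at byte offset 20 and the value at byte offset 24, so 28 bytes of the work area are used in total.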

* [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions
  2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
                   ` (6 preceding siblings ...)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 07/12] spapr_pci: add ibm,configure-connector " Michael Roth
@ 2014-08-19  0:21 ` Michael Roth
  2014-08-26  9:14   ` Alexey Kardashevskiy
                     ` (2 more replies)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations Michael Roth
                   ` (4 subsequent siblings)
  12 siblings, 3 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-19  0:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

Some kernels program a 0 address for io regions. PCI 3.0 spec
section 6.2.5.1 doesn't seem to disallow this.
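
To make the effect of the one-line change concrete, here is a standalone sketch of the 32-bit range check as it appears in the hunk below. `bar32_range_ok` and its `allow_zero` flag are hypothetical names for this illustration: passing false models the check before this patch, true models it after.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the range check in pci_bar_address():
 * reject a BAR whose range is empty, wraps around, or reaches
 * UINT32_MAX. The allow_zero flag selects the old behaviour
 * (address 0 treated as unmapped) or the new one.
 */
static bool bar32_range_ok(uint64_t new_addr, uint64_t size, bool allow_zero)
{
    uint64_t last_addr = new_addr + size - 1;

    if (last_addr <= new_addr || last_addr >= UINT32_MAX) {
        return false;
    }
    if (!allow_zero && new_addr == 0) {
        return false;
    }
    return true;
}
```

A 0x100-byte region programmed at address 0 is rejected with allow_zero set to false and accepted with it set to true; a region ending at 0xffffffff is rejected either way.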

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
 hw/pci/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 351d320..9578749 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1035,7 +1035,7 @@ static pcibus_t pci_bar_address(PCIDevice *d,
         /* Check if 32 bit BAR wraps around explicitly.
          * TODO: make priorities correct and remove this work around.
          */
-        if (last_addr <= new_addr || new_addr == 0 || last_addr >= UINT32_MAX) {
+        if (last_addr <= new_addr || last_addr >= UINT32_MAX) {
             return PCI_BAR_UNMAPPED;
         }
         return new_addr;
-- 
1.9.1

* [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations
  2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
                   ` (7 preceding siblings ...)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions Michael Roth
@ 2014-08-19  0:21 ` Michael Roth
  2014-08-26  9:40   ` Alexey Kardashevskiy
                     ` (2 more replies)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 10/12] spapr_events: re-use EPOW event infrastructure for hotplug events Michael Roth
                   ` (3 subsequent siblings)
  12 siblings, 3 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-19  0:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

This enables hotplug of PCI devices on PHBs. Upon hotplug we generate
the OF nodes required by the PAPR specification and IEEE 1275-1994
"PCI Bus Binding to Open Firmware" for the device.

We attach the FDT for these nodes to the DrcEntry for the slot; the
guest then fetches it via ibm,configure-connector RTAS calls, as
described by the PAPR specification. The FDT is freed on unplug.
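
The unplug path in the diff below depends on ordering between the host-side removal request and the guest isolating the slot via set-indicator 9001. A minimal model of that handshake, assuming a simplified `struct slot_model` in place of sPAPRDrcEntry (all names in this sketch are hypothetical):

```c
#include <stdbool.h>

/* Simplified stand-in for the per-slot DRC state. */
struct slot_model {
    bool unisolated;        /* isolation-state indicator is non-zero */
    bool awaiting_release;  /* host asked for removal, guest not done */
    bool device_present;
};

/* Models spapr_drc_state_reset(): finalize the removal. */
static void model_release(struct slot_model *s)
{
    s->device_present = false;
    s->awaiting_release = false;
}

/* Host side: an unplug request arrives for the slot. */
static void model_unplug_request(struct slot_model *s)
{
    if (!s->unisolated) {
        model_release(s);           /* guest already let go */
    } else {
        s->awaiting_release = true; /* defer until guest isolates */
    }
}

/* Guest side: set-indicator 9001 (isolation state). */
static void model_set_isolation(struct slot_model *s, bool unisolated)
{
    bool changed = (s->unisolated != unisolated);

    s->unisolated = unisolated;
    if (changed && !unisolated && s->awaiting_release) {
        model_release(s);
    }
}
```

If the slot is still unisolated when the unplug request arrives, the device stays present until the guest isolates it; if the guest has already isolated the slot, the request removes the device immediately.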

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
 hw/ppc/spapr_pci.c     | 235 +++++++++++++++++++++++++++++++++++++++++++++++--
 include/hw/ppc/spapr.h |   1 +
 2 files changed, 228 insertions(+), 8 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 96a57be..23864ab 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -87,6 +87,17 @@
 #define ENCODE_DRC_STATE(val, m, s) \
     (((uint32_t)(val) << (s)) & (uint32_t)(m))
 
+#define FDT_MAX_SIZE            0x10000
+#define _FDT(exp) \
+    do { \
+        int ret = (exp);                                           \
+        if (ret < 0) {                                             \
+            return ret;                                            \
+        }                                                          \
+    } while (0)
+
+static void spapr_drc_state_reset(sPAPRDrcEntry *drc_entry);
+
 static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
 {
     sPAPRPHBState *sphb;
@@ -476,6 +487,22 @@ static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
         /* encode the new value into the correct bit field */
         shift = INDICATOR_ISOLATION_SHIFT;
         mask = INDICATOR_ISOLATION_MASK;
+        if (drc_entry) {
+            /* transition from unisolated to isolated for a hotplug slot
+             * entails completion of guest-side device unplug/cleanup, so
+             * we can now safely remove the device if qemu is waiting for
+             * it to be released
+             */
+            if (DECODE_DRC_STATE(*pind, mask, shift) != indicator_state) {
+                if (indicator_state == 0 && drc_entry->awaiting_release) {
+                    /* device_del has been called and host is waiting
+                     * for guest to release/isolate device, go ahead
+                     * and remove it now
+                     */
+                    spapr_drc_state_reset(drc_entry);
+                }
+            }
+        }
         break;
     case 9002: /* DR */
         shift = INDICATOR_DR_SHIFT;
@@ -816,6 +843,198 @@ static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
     return &phb->iommu_as;
 }
 
+static int spapr_populate_pci_child_dt(PCIDevice *dev, void *fdt, int offset,
+                                       int phb_index)
+{
+    int slot = PCI_SLOT(dev->devfn);
+    char slotname[16];
+    bool is_bridge = 1;
+    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
+    uint32_t reg[RESOURCE_CELLS_TOTAL * 8] = { 0 };
+    uint32_t assigned[RESOURCE_CELLS_TOTAL * 8] = { 0 };
+    int reg_size, assigned_size;
+
+    drc_entry = spapr_phb_to_drc_entry(phb_index + SPAPR_PCI_BASE_BUID);
+    g_assert(drc_entry);
+    drc_entry_slot = &drc_entry->child_entries[slot];
+
+    if (pci_default_read_config(dev, PCI_HEADER_TYPE, 1) ==
+        PCI_HEADER_TYPE_NORMAL) {
+        is_bridge = 0;
+    }
+
+    _FDT(fdt_setprop_cell(fdt, offset, "vendor-id",
+                          pci_default_read_config(dev, PCI_VENDOR_ID, 2)));
+    _FDT(fdt_setprop_cell(fdt, offset, "device-id",
+                          pci_default_read_config(dev, PCI_DEVICE_ID, 2)));
+    _FDT(fdt_setprop_cell(fdt, offset, "revision-id",
+                          pci_default_read_config(dev, PCI_REVISION_ID, 1)));
+    _FDT(fdt_setprop_cell(fdt, offset, "class-code",
+                          pci_default_read_config(dev, PCI_CLASS_DEVICE, 2) << 8));
+
+    _FDT(fdt_setprop_cell(fdt, offset, "interrupts",
+                          pci_default_read_config(dev, PCI_INTERRUPT_PIN, 1)));
+
+    /* if this device is NOT a bridge */
+    if (!is_bridge) {
+        _FDT(fdt_setprop_cell(fdt, offset, "min-grant",
+            pci_default_read_config(dev, PCI_MIN_GNT, 1)));
+        _FDT(fdt_setprop_cell(fdt, offset, "max-latency",
+            pci_default_read_config(dev, PCI_MAX_LAT, 1)));
+        _FDT(fdt_setprop_cell(fdt, offset, "subsystem-id",
+            pci_default_read_config(dev, PCI_SUBSYSTEM_ID, 2)));
+        _FDT(fdt_setprop_cell(fdt, offset, "subsystem-vendor-id",
+            pci_default_read_config(dev, PCI_SUBSYSTEM_VENDOR_ID, 2)));
+    }
+
+    _FDT(fdt_setprop_cell(fdt, offset, "cache-line-size",
+        pci_default_read_config(dev, PCI_CACHE_LINE_SIZE, 1)));
+
+    /* the following fdt cells are masked off the pci status register */
+    int pci_status = pci_default_read_config(dev, PCI_STATUS, 2);
+    _FDT(fdt_setprop_cell(fdt, offset, "devsel-speed",
+                          PCI_STATUS_DEVSEL_MASK & pci_status));
+    _FDT(fdt_setprop_cell(fdt, offset, "fast-back-to-back",
+                          PCI_STATUS_FAST_BACK & pci_status));
+    _FDT(fdt_setprop_cell(fdt, offset, "66mhz-capable",
+                          PCI_STATUS_66MHZ & pci_status));
+    _FDT(fdt_setprop_cell(fdt, offset, "udf-supported",
+                          PCI_STATUS_UDF & pci_status));
+
+    _FDT(fdt_setprop_string(fdt, offset, "name", "pci"));
+    sprintf(slotname, "Slot %d", slot + phb_index * 32);
+    _FDT(fdt_setprop(fdt, offset, "ibm,loc-code", slotname, strlen(slotname)));
+    _FDT(fdt_setprop_cell(fdt, offset, "ibm,my-drc-index",
+                          drc_entry_slot->drc_index));
+
+    _FDT(fdt_setprop_cell(fdt, offset, "#address-cells",
+                          RESOURCE_CELLS_ADDRESS));
+    _FDT(fdt_setprop_cell(fdt, offset, "#size-cells",
+                          RESOURCE_CELLS_SIZE));
+    _FDT(fdt_setprop_cell(fdt, offset, "ibm,req#msi-x",
+                          RESOURCE_CELLS_SIZE));
+    fill_resource_props(dev, phb_index, reg, &reg_size,
+                        assigned, &assigned_size);
+    _FDT(fdt_setprop(fdt, offset, "reg", reg, reg_size));
+    _FDT(fdt_setprop(fdt, offset, "assigned-addresses",
+                     assigned, assigned_size));
+
+    return 0;
+}
+
+static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev)
+{
+    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
+    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
+    sPAPRConfigureConnectorState *ccs;
+    int slot = PCI_SLOT(dev->devfn);
+    int offset, ret;
+    void *fdt_orig, *fdt;
+    char nodename[512];
+    uint32_t encoded = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_PRESENT,
+                                        INDICATOR_ENTITY_SENSE_MASK,
+                                        INDICATOR_ENTITY_SENSE_SHIFT);
+
+    drc_entry = spapr_phb_to_drc_entry(phb->buid);
+    g_assert(drc_entry);
+    drc_entry_slot = &drc_entry->child_entries[slot];
+
+    drc_entry->state &= ~(uint32_t)INDICATOR_ENTITY_SENSE_MASK;
+    drc_entry->state |= encoded; /* DR entity present */
+    drc_entry_slot->state &= ~(uint32_t)INDICATOR_ENTITY_SENSE_MASK;
+    drc_entry_slot->state |= encoded; /* and the slot */
+
+    /* add OF node for pci device and required OF DT properties */
+    fdt_orig = g_malloc0(FDT_MAX_SIZE);
+    offset = fdt_create(fdt_orig, FDT_MAX_SIZE);
+    fdt_begin_node(fdt_orig, "");
+    fdt_end_node(fdt_orig);
+    fdt_finish(fdt_orig);
+
+    fdt = g_malloc0(FDT_MAX_SIZE);
+    fdt_open_into(fdt_orig, fdt, FDT_MAX_SIZE);
+    sprintf(nodename, "pci@%d", slot);
+    offset = fdt_add_subnode(fdt, 0, nodename);
+    ret = spapr_populate_pci_child_dt(dev, fdt, offset, phb->index);
+    g_assert(!ret);
+    g_free(fdt_orig);
+
+    /* hold on to node, configure_connector will pass it to the guest later */
+    ccs = &drc_entry_slot->cc_state;
+    ccs->fdt = fdt;
+    ccs->offset_start = offset;
+    ccs->state = CC_STATE_PENDING;
+    ccs->dev = dev;
+
+    return 0;
+}
+
+/* check whether guest has released/isolated device */
+static bool spapr_drc_state_is_releasable(sPAPRDrcEntry *drc_entry)
+{
+    return !DECODE_DRC_STATE(drc_entry->state,
+                             INDICATOR_ISOLATION_MASK,
+                             INDICATOR_ISOLATION_SHIFT);
+}
+
+/* finalize device unplug/deletion */
+static void spapr_drc_state_reset(sPAPRDrcEntry *drc_entry)
+{
+    sPAPRConfigureConnectorState *ccs = &drc_entry->cc_state;
+    uint32_t sense_empty = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_EMPTY,
+                                            INDICATOR_ENTITY_SENSE_MASK,
+                                            INDICATOR_ENTITY_SENSE_SHIFT);
+
+    g_free(ccs->fdt);
+    ccs->fdt = NULL;
+    object_unparent(OBJECT(ccs->dev));
+    ccs->dev = NULL;
+    ccs->state = CC_STATE_IDLE;
+    drc_entry->state &= ~INDICATOR_ENTITY_SENSE_MASK;
+    drc_entry->state |= sense_empty;
+    drc_entry->awaiting_release = false;
+}
+
+static void spapr_device_hotplug_remove(DeviceState *qdev, PCIDevice *dev)
+{
+    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
+    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
+    sPAPRConfigureConnectorState *ccs;
+    int slot = PCI_SLOT(dev->devfn);
+
+    drc_entry = spapr_phb_to_drc_entry(phb->buid);
+    g_assert(drc_entry);
+    drc_entry_slot = &drc_entry->child_entries[slot];
+    ccs = &drc_entry_slot->cc_state;
+    /* shouldn't be removing devices we haven't created an fdt for */
+    g_assert(ccs->state != CC_STATE_IDLE);
+    /* if the device has already been released/isolated by guest, go ahead
+     * and remove it now. Otherwise, flag it as pending guest release so it
+     * can be removed later
+     */
+    if (spapr_drc_state_is_releasable(drc_entry_slot)) {
+        spapr_drc_state_reset(drc_entry_slot);
+    } else {
+        if (drc_entry_slot->awaiting_release) {
+            fprintf(stderr, "waiting for guest to release the device\n");
+        } else {
+            drc_entry_slot->awaiting_release = true;
+        }
+    }
+}
+
+static void spapr_phb_hot_plug(HotplugHandler *plug_handler,
+                               DeviceState *plugged_dev, Error **errp)
+{
+    spapr_device_hotplug_add(DEVICE(plug_handler), PCI_DEVICE(plugged_dev));
+}
+
+static void spapr_phb_hot_unplug(HotplugHandler *plug_handler,
+                                 DeviceState *plugged_dev, Error **errp)
+{
+    spapr_device_hotplug_remove(DEVICE(plug_handler), PCI_DEVICE(plugged_dev));
+}
+
 static void spapr_phb_realize(DeviceState *dev, Error **errp)
 {
     SysBusDevice *s = SYS_BUS_DEVICE(dev);
@@ -903,6 +1122,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
                            &sphb->memspace, &sphb->iospace,
                            PCI_DEVFN(0, 0), PCI_NUM_PINS, TYPE_PCI_BUS);
     phb->bus = bus;
+    qbus_set_hotplug_handler(BUS(phb->bus), DEVICE(sphb), NULL);
 
     /*
      * Initialize PHB address space.
@@ -1108,6 +1328,7 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data)
     PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass);
     DeviceClass *dc = DEVICE_CLASS(klass);
     sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_CLASS(klass);
+    HotplugHandlerClass *hp = HOTPLUG_HANDLER_CLASS(klass);
 
     hc->root_bus_path = spapr_phb_root_bus_path;
     dc->realize = spapr_phb_realize;
@@ -1117,6 +1338,8 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data)
     set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
     dc->cannot_instantiate_with_device_add_yet = false;
     spc->finish_realize = spapr_phb_finish_realize;
+    hp->plug = spapr_phb_hot_plug;
+    hp->unplug = spapr_phb_hot_unplug;
 }
 
 static const TypeInfo spapr_phb_info = {
@@ -1125,6 +1348,10 @@ static const TypeInfo spapr_phb_info = {
     .instance_size = sizeof(sPAPRPHBState),
     .class_init    = spapr_phb_class_init,
     .class_size    = sizeof(sPAPRPHBClass),
+    .interfaces    = (InterfaceInfo[]) {
+        { TYPE_HOTPLUG_HANDLER },
+        { }
+    }
 };
 
 PCIHostState *spapr_create_phb(sPAPREnvironment *spapr, int index)
@@ -1304,14 +1531,6 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb,
         return bus_off;
     }
 
-#define _FDT(exp) \
-    do { \
-        int ret = (exp);                                           \
-        if (ret < 0) {                                             \
-            return ret;                                            \
-        }                                                          \
-    } while (0)
-
     /* Write PHB properties */
     _FDT(fdt_setprop_string(fdt, bus_off, "device_type", "pci"));
     _FDT(fdt_setprop_string(fdt, bus_off, "compatible", "IBM,Logical_PHB"));
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index fac85f8..e08dd2f 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -36,6 +36,7 @@ struct sPAPRDrcEntry {
     void *fdt;
     int fdt_offset;
     uint32_t state;
+    bool awaiting_release;
     sPAPRConfigureConnectorState cc_state;
     sPAPRDrcEntry *child_entries;
 };
-- 
1.9.1

* [Qemu-devel] [PATCH 10/12] spapr_events: re-use EPOW event infrastructure for hotplug events
  2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
                   ` (8 preceding siblings ...)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations Michael Roth
@ 2014-08-19  0:21 ` Michael Roth
  2014-08-26  9:28   ` Alexey Kardashevskiy
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 11/12] spapr_events: event-scan RTAS interface Michael Roth
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 69+ messages in thread
From: Michael Roth @ 2014-08-19  0:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

From: Nathan Fontenot <nfont@linux.vnet.ibm.com>

This extends the data structures currently used to report EPOW events to
guests via the check-exception RTAS interface to also include event types
for hotplug/unplug events.

This event format is not yet documented and is still being finalized for
inclusion in the PAPR specification, but we implement it here as an
extension for guest userspace tools to consume (existing guest kernels
simply log these events via a sysfs interface that's read by rtas_errd).
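
As a rough illustration of the hotplug-specific fields that follow the section header in struct rtas_event_log_v6_hp below, this sketch serializes just those fields for a PCI event identified by DRC index. The constants are taken from the patch; `hp_pack_pci_event`, `HP_BODY_LEN`, and the big-endian encoding of the index are assumptions for this example (the code that actually fills the structure is not shown in this part of the series).

```c
#include <stddef.h>
#include <stdint.h>

/* Constants from the patch. */
#define RTAS_LOG_V6_HP_TYPE_PCI         5
#define RTAS_LOG_V6_HP_ACTION_ADD       1
#define RTAS_LOG_V6_HP_ACTION_REMOVE    2
#define RTAS_LOG_V6_HP_ID_DRC_INDEX     2

/* type + action + identifier + reserved + 32-bit DRC index */
#define HP_BODY_LEN 8

/* Hypothetical helper (not in the patch): write the hotplug fields
 * that follow the section header, assuming the DRC index is stored
 * big-endian. Returns the number of bytes written.
 */
static size_t hp_pack_pci_event(uint8_t out[HP_BODY_LEN], uint8_t action,
                                uint32_t drc_index)
{
    out[0] = RTAS_LOG_V6_HP_TYPE_PCI;
    out[1] = action;
    out[2] = RTAS_LOG_V6_HP_ID_DRC_INDEX;
    out[3] = 0;                             /* reserved */
    out[4] = (uint8_t)(drc_index >> 24);
    out[5] = (uint8_t)(drc_index >> 16);
    out[6] = (uint8_t)(drc_index >> 8);
    out[7] = (uint8_t)drc_index;
    return HP_BODY_LEN;
}
```

Because the identifier byte says how to interpret the trailing union, a consumer would check it for RTAS_LOG_V6_HP_ID_DRC_INDEX before reading the last four bytes as an index.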

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c         |   2 +-
 hw/ppc/spapr_events.c  | 215 ++++++++++++++++++++++++++++++++++++++++---------
 include/hw/ppc/spapr.h |   4 +-
 3 files changed, 180 insertions(+), 41 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 39cb0bb..825fd31 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1706,7 +1706,7 @@ static void ppc_spapr_init(MachineState *machine)
     spapr->fdt_skel = spapr_create_fdt_skel(initrd_base, initrd_size,
                                             kernel_size, kernel_le,
                                             boot_device, kernel_cmdline,
-                                            spapr->epow_irq);
+                                            spapr->check_exception_irq);
     assert(spapr->fdt_skel != NULL);
 }
 
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 1b6157d..c0be0e5 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -32,6 +32,8 @@
 
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_vio.h"
+#include "hw/pci/pci.h"
+#include "hw/pci-host/spapr.h"
 
 #include <libfdt.h>
 
@@ -77,6 +79,7 @@ struct rtas_error_log {
 #define   RTAS_LOG_TYPE_ECC_UNCORR              0x00000009
 #define   RTAS_LOG_TYPE_ECC_CORR                0x0000000a
 #define   RTAS_LOG_TYPE_EPOW                    0x00000040
+#define   RTAS_LOG_TYPE_HOTPLUG                 0x000000e5
     uint32_t extended_length;
 } QEMU_PACKED;
 
@@ -166,6 +169,38 @@ struct epow_log_full {
     struct rtas_event_log_v6_epow epow;
 } QEMU_PACKED;
 
+struct rtas_event_log_v6_hp {
+#define RTAS_LOG_V6_SECTION_ID_HOTPLUG              0x4850 /* HP */
+    struct rtas_event_log_v6_section_header hdr;
+    uint8_t hotplug_type;
+#define RTAS_LOG_V6_HP_TYPE_CPU                          1
+#define RTAS_LOG_V6_HP_TYPE_MEMORY                       2
+#define RTAS_LOG_V6_HP_TYPE_SLOT                         3
+#define RTAS_LOG_V6_HP_TYPE_PHB                          4
+#define RTAS_LOG_V6_HP_TYPE_PCI                          5
+    uint8_t hotplug_action;
+#define RTAS_LOG_V6_HP_ACTION_ADD                        1
+#define RTAS_LOG_V6_HP_ACTION_REMOVE                     2
+    uint8_t hotplug_identifier;
+#define RTAS_LOG_V6_HP_ID_DRC_NAME                       1
+#define RTAS_LOG_V6_HP_ID_DRC_INDEX                      2
+#define RTAS_LOG_V6_HP_ID_DRC_COUNT                      3
+    uint8_t reserved;
+    union {
+        uint32_t index;
+        uint32_t count;
+        char name[1];
+    } drc;
+} QEMU_PACKED;
+
+struct hp_log_full {
+    struct rtas_error_log hdr;
+    struct rtas_event_log_v6 v6hdr;
+    struct rtas_event_log_v6_maina maina;
+    struct rtas_event_log_v6_mainb mainb;
+    struct rtas_event_log_v6_hp hp;
+} QEMU_PACKED;
+
 #define EVENT_MASK_INTERNAL_ERRORS           0x80000000
 #define EVENT_MASK_EPOW                      0x40000000
 #define EVENT_MASK_HOTPLUG                   0x10000000
@@ -181,29 +216,61 @@ struct epow_log_full {
         }                                                          \
     } while (0)
 
-void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq)
+void spapr_events_fdt_skel(void *fdt, uint32_t check_exception_irq)
 {
-    uint32_t epow_irq_ranges[] = {cpu_to_be32(epow_irq), cpu_to_be32(1)};
-    uint32_t epow_interrupts[] = {cpu_to_be32(epow_irq), 0};
+    uint32_t irq_ranges[] = {cpu_to_be32(check_exception_irq), cpu_to_be32(1)};
+    uint32_t interrupts[] = {cpu_to_be32(check_exception_irq), 0};
 
     _FDT((fdt_begin_node(fdt, "event-sources")));
 
     _FDT((fdt_property(fdt, "interrupt-controller", NULL, 0)));
     _FDT((fdt_property_cell(fdt, "#interrupt-cells", 2)));
     _FDT((fdt_property(fdt, "interrupt-ranges",
-                       epow_irq_ranges, sizeof(epow_irq_ranges))));
+                       irq_ranges, sizeof(irq_ranges))));
 
     _FDT((fdt_begin_node(fdt, "epow-events")));
-    _FDT((fdt_property(fdt, "interrupts",
-                       epow_interrupts, sizeof(epow_interrupts))));
+    _FDT((fdt_property(fdt, "interrupts", interrupts, sizeof(interrupts))));
     _FDT((fdt_end_node(fdt)));
 
     _FDT((fdt_end_node(fdt)));
 }
 
 static struct epow_log_full *pending_epow;
+static struct hp_log_full *pending_hp;
 static uint32_t next_plid;
 
+static void spapr_init_v6hdr(struct rtas_event_log_v6 *v6hdr)
+{
+    v6hdr->b0 = RTAS_LOG_V6_B0_VALID | RTAS_LOG_V6_B0_NEW_LOG
+        | RTAS_LOG_V6_B0_BIGENDIAN;
+    v6hdr->b2 = RTAS_LOG_V6_B2_POWERPC_FORMAT
+        | RTAS_LOG_V6_B2_LOG_FORMAT_PLATFORM_EVENT;
+    v6hdr->company = cpu_to_be32(RTAS_LOG_V6_COMPANY_IBM);
+}
+
+static void spapr_init_maina(struct rtas_event_log_v6_maina *maina,
+                             int section_count)
+{
+    struct tm tm;
+    int year;
+
+    maina->hdr.section_id = cpu_to_be16(RTAS_LOG_V6_SECTION_ID_MAINA);
+    maina->hdr.section_length = cpu_to_be16(sizeof(*maina));
+    /* FIXME: section version, subtype and creator id? */
+    qemu_get_timedate(&tm, spapr->rtc_offset);
+    year = tm.tm_year + 1900;
+    maina->creation_date = cpu_to_be32((to_bcd(year / 100) << 24)
+                                       | (to_bcd(year % 100) << 16)
+                                       | (to_bcd(tm.tm_mon + 1) << 8)
+                                       | to_bcd(tm.tm_mday));
+    maina->creation_time = cpu_to_be32((to_bcd(tm.tm_hour) << 24)
+                                       | (to_bcd(tm.tm_min) << 16)
+                                       | (to_bcd(tm.tm_sec) << 8));
+    maina->creator_id = 'H'; /* Hypervisor */
+    maina->section_count = section_count;
+    maina->plid = next_plid++;
+}
+
 static void spapr_powerdown_req(Notifier *n, void *opaque)
 {
     sPAPREnvironment *spapr = container_of(n, sPAPREnvironment, epow_notifier);
@@ -212,8 +279,6 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
     struct rtas_event_log_v6_maina *maina;
     struct rtas_event_log_v6_mainb *mainb;
     struct rtas_event_log_v6_epow *epow;
-    struct tm tm;
-    int year;
 
     if (pending_epow) {
         /* For now, we just throw away earlier events if two come
@@ -237,27 +302,8 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
     hdr->extended_length = cpu_to_be32(sizeof(*pending_epow)
                                        - sizeof(pending_epow->hdr));
 
-    v6hdr->b0 = RTAS_LOG_V6_B0_VALID | RTAS_LOG_V6_B0_NEW_LOG
-        | RTAS_LOG_V6_B0_BIGENDIAN;
-    v6hdr->b2 = RTAS_LOG_V6_B2_POWERPC_FORMAT
-        | RTAS_LOG_V6_B2_LOG_FORMAT_PLATFORM_EVENT;
-    v6hdr->company = cpu_to_be32(RTAS_LOG_V6_COMPANY_IBM);
-
-    maina->hdr.section_id = cpu_to_be16(RTAS_LOG_V6_SECTION_ID_MAINA);
-    maina->hdr.section_length = cpu_to_be16(sizeof(*maina));
-    /* FIXME: section version, subtype and creator id? */
-    qemu_get_timedate(&tm, spapr->rtc_offset);
-    year = tm.tm_year + 1900;
-    maina->creation_date = cpu_to_be32((to_bcd(year / 100) << 24)
-                                       | (to_bcd(year % 100) << 16)
-                                       | (to_bcd(tm.tm_mon + 1) << 8)
-                                       | to_bcd(tm.tm_mday));
-    maina->creation_time = cpu_to_be32((to_bcd(tm.tm_hour) << 24)
-                                       | (to_bcd(tm.tm_min) << 16)
-                                       | (to_bcd(tm.tm_sec) << 8));
-    maina->creator_id = 'H'; /* Hypervisor */
-    maina->section_count = 3; /* Main-A, Main-B and EPOW */
-    maina->plid = next_plid++;
+    spapr_init_v6hdr(v6hdr);
+    spapr_init_maina(maina, 3 /* Main-A, Main-B and EPOW */);
 
     mainb->hdr.section_id = cpu_to_be16(RTAS_LOG_V6_SECTION_ID_MAINB);
     mainb->hdr.section_length = cpu_to_be16(sizeof(*mainb));
@@ -274,7 +320,87 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
     epow->event_modifier = RTAS_LOG_V6_EPOW_MODIFIER_NORMAL;
     epow->extended_modifier = RTAS_LOG_V6_EPOW_XMODIFIER_PARTITION_SPECIFIC;
 
-    qemu_irq_pulse(xics_get_qirq(spapr->icp, spapr->epow_irq));
+    qemu_irq_pulse(xics_get_qirq(spapr->icp, spapr->check_exception_irq));
+}
+
+static void spapr_hotplug_req_event(uint8_t hp_type, uint8_t hp_action,
+                                    sPAPRPHBState *phb, int slot)
+{
+    struct rtas_error_log *hdr;
+    struct rtas_event_log_v6 *v6hdr;
+    struct rtas_event_log_v6_maina *maina;
+    struct rtas_event_log_v6_mainb *mainb;
+    struct rtas_event_log_v6_hp *hp;
+    sPAPRDrcEntry *drc_entry;
+
+    if (pending_hp) {
+        /* Just toss any pending hotplug events for now, this will
+         * need to be fixed later on.
+         */
+        g_free(pending_hp);
+    }
+
+    pending_hp = g_malloc0(sizeof(*pending_hp));
+    hdr = &pending_hp->hdr;
+    v6hdr = &pending_hp->v6hdr;
+    maina = &pending_hp->maina;
+    mainb = &pending_hp->mainb;
+    hp = &pending_hp->hp;
+
+    hdr->summary = cpu_to_be32(RTAS_LOG_VERSION_6
+                               | RTAS_LOG_SEVERITY_EVENT
+                               | RTAS_LOG_DISPOSITION_NOT_RECOVERED
+                               | RTAS_LOG_OPTIONAL_PART_PRESENT
+                               | RTAS_LOG_INITIATOR_HOTPLUG
+                               | RTAS_LOG_TYPE_HOTPLUG);
+    hdr->extended_length = cpu_to_be32(sizeof(*pending_hp)
+                                       - sizeof(pending_hp->hdr));
+
+    spapr_init_v6hdr(v6hdr);
+    spapr_init_maina(maina, 3 /* Main-A, Main-B, HP */);
+
+    mainb->hdr.section_id = cpu_to_be16(RTAS_LOG_V6_SECTION_ID_MAINB);
+    mainb->hdr.section_length = cpu_to_be16(sizeof(*mainb));
+    mainb->subsystem_id = 0x80; /* External environment */
+    mainb->event_severity = 0x00; /* Informational / non-error */
+    mainb->event_subtype = 0x00; /* Normal shutdown */
+
+    hp->hdr.section_id = cpu_to_be16(RTAS_LOG_V6_SECTION_ID_HOTPLUG);
+    hp->hdr.section_length = cpu_to_be16(sizeof(*hp));
+    hp->hdr.section_version = 1; /* includes extended modifier */
+    hp->hotplug_action = hp_action;
+
+    hp->hotplug_type = hp_type;
+
+    drc_entry = spapr_phb_to_drc_entry(phb->buid);
+    if (!drc_entry) {
+        drc_entry = spapr_add_phb_to_drc_table(phb->buid, 2 /* Unusable */);
+    }
+
+    switch (hp_type) {
+    case RTAS_LOG_V6_HP_TYPE_PCI:
+        hp->drc.index = drc_entry->child_entries[slot].drc_index;
+        hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
+        break;
+    }
+
+    qemu_irq_pulse(xics_get_qirq(spapr->icp, spapr->check_exception_irq));
+}
+
+void spapr_pci_hotplug_add_event(DeviceState *qdev, int slot)
+{
+    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
+
+    spapr_hotplug_req_event(RTAS_LOG_V6_HP_TYPE_PCI,
+                            RTAS_LOG_V6_HP_ACTION_ADD, phb, slot);
+}
+
+void spapr_pci_hotplug_remove_event(DeviceState *qdev, int slot)
+{
+    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
+
+    spapr_hotplug_req_event(RTAS_LOG_V6_HP_TYPE_PCI,
+                            RTAS_LOG_V6_HP_ACTION_REMOVE, phb, slot);
 }
 
 static void check_exception(PowerPCCPU *cpu, sPAPREnvironment *spapr,
@@ -298,15 +424,26 @@ static void check_exception(PowerPCCPU *cpu, sPAPREnvironment *spapr,
         xinfo |= (uint64_t)rtas_ld(args, 6) << 32;
     }
 
-    if ((mask & EVENT_MASK_EPOW) && pending_epow) {
-        if (sizeof(*pending_epow) < len) {
-            len = sizeof(*pending_epow);
-        }
+    if (mask & EVENT_MASK_EPOW) {
+        if (pending_epow) {
+            if (sizeof(*pending_epow) < len) {
+                len = sizeof(*pending_epow);
+            }
 
-        cpu_physical_memory_write(buf, pending_epow, len);
-        g_free(pending_epow);
-        pending_epow = NULL;
-        rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+            cpu_physical_memory_write(buf, pending_epow, len);
+            g_free(pending_epow);
+            pending_epow = NULL;
+            rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+        } else if (pending_hp) {
+            if (sizeof(*pending_hp) < len) {
+                len = sizeof(*pending_hp);
+            }
+
+            cpu_physical_memory_write(buf, pending_hp, len);
+            g_free(pending_hp);
+            pending_hp = NULL;
+            rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+        }
     } else {
         rtas_st(rets, 0, RTAS_OUT_NO_ERRORS_FOUND);
     }
@@ -314,7 +451,7 @@ static void check_exception(PowerPCCPU *cpu, sPAPREnvironment *spapr,
 
 void spapr_events_init(sPAPREnvironment *spapr)
 {
-    spapr->epow_irq = xics_alloc(spapr->icp, 0, 0, false);
+    spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false);
     spapr->epow_notifier.notify = spapr_powerdown_req;
     qemu_register_powerdown_notifier(&spapr->epow_notifier);
     spapr_rtas_register(RTAS_CHECK_EXCEPTION, "check-exception",
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index e08dd2f..5382bf1 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -63,7 +63,7 @@ typedef struct sPAPREnvironment {
     struct PPCTimebase tb;
     bool has_graphics;
 
-    uint32_t epow_irq;
+    uint32_t check_exception_irq;
     Notifier epow_notifier;
 
     /* Migration state */
@@ -521,5 +521,7 @@ int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
 sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
 sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
 sPAPRDrcEntry *spapr_find_drc_entry(int drc_index);
+void spapr_pci_hotplug_add_event(DeviceState *qdev, int slot);
+void spapr_pci_hotplug_remove_event(DeviceState *qdev, int slot);
 
 #endif /* !defined (__HW_SPAPR_H__) */
-- 
1.9.1

* [Qemu-devel] [PATCH 11/12] spapr_events: event-scan RTAS interface
  2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
                   ` (9 preceding siblings ...)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 10/12] spapr_events: re-use EPOW event infrastructure for hotplug events Michael Roth
@ 2014-08-19  0:21 ` Michael Roth
  2014-08-26  9:30   ` Alexey Kardashevskiy
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 12/12] spapr_pci: emit hotplug add/remove events during hotplug Michael Roth
  2014-08-26  9:24 ` [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Alexey Kardashevskiy
  12 siblings, 1 reply; 69+ messages in thread
From: Michael Roth @ 2014-08-19  0:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

From: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>

We don't actually rely on this interface to surface hotplug events, and
instead rely on the similar-but-interrupt-driven check-exception RTAS
interface used for EPOW events. However, the existence of this interface
is needed to ensure guest kernels initialize the event-reporting
interfaces which will in turn be used by userspace tools to handle these
events, so we implement this interface as a stub.

Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c         | 1 +
 hw/ppc/spapr_events.c  | 9 +++++++++
 include/hw/ppc/spapr.h | 2 ++
 3 files changed, 12 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 825fd31..c65b13a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -702,6 +702,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
         refpoints, sizeof(refpoints))));
 
     _FDT((fdt_property_cell(fdt, "rtas-error-log-max", RTAS_ERROR_LOG_MAX)));
+    _FDT((fdt_property_cell(fdt, "rtas-event-scan-rate", RTAS_EVENT_SCAN_RATE)));
 
     /*
      * According to PAPR, rtas ibm,os-term, does not gaurantee a return
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index c0be0e5..bb80080 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -449,6 +449,14 @@ static void check_exception(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     }
 }
 
+static void event_scan(PowerPCCPU *cpu, sPAPREnvironment *spapr,
+                       uint32_t token, uint32_t nargs,
+                       target_ulong args,
+                       uint32_t nret, target_ulong rets)
+{
+    rtas_st(rets, 0, RTAS_OUT_NO_ERRORS_FOUND);
+}
+
 void spapr_events_init(sPAPREnvironment *spapr)
 {
     spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false);
@@ -456,4 +464,5 @@ void spapr_events_init(sPAPREnvironment *spapr)
     qemu_register_powerdown_notifier(&spapr->epow_notifier);
     spapr_rtas_register(RTAS_CHECK_EXCEPTION, "check-exception",
                         check_exception);
+    spapr_rtas_register(RTAS_EVENT_SCAN, "event-scan", event_scan);
 }
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 5382bf1..aab627f 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -484,6 +484,8 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
 
 #define RTAS_ERROR_LOG_MAX      2048
 
+#define RTAS_EVENT_SCAN_RATE    1
+
 typedef struct sPAPRTCETable sPAPRTCETable;
 
 #define TYPE_SPAPR_TCE_TABLE "spapr-tce-table"
-- 
1.9.1

* [Qemu-devel] [PATCH 12/12] spapr_pci: emit hotplug add/remove events during hotplug
  2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
                   ` (10 preceding siblings ...)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 11/12] spapr_events: event-scan RTAS interface Michael Roth
@ 2014-08-19  0:21 ` Michael Roth
  2014-08-26  9:35   ` Alexey Kardashevskiy
  2014-08-26 12:36   ` Alexander Graf
  2014-08-26  9:24 ` [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Alexey Kardashevskiy
  12 siblings, 2 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-19  0:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

From: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>

This uses an extension of the existing EPOW interrupt/event mechanism
to notify userspace tools such as librtas/drmgr, so that they can
handle in-guest configuration/cleanup operations in response to
device_add/device_del.

Userspace tools that don't implement this extension will need
to be run manually: after device_add, and before device_del.
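
As a rough model of when events are emitted (a sketch only: model_dev,
model_event, emit() and the model_* functions are hypothetical names and
not part of this patch), devices present at boot produce no add event,
hotplugged devices produce one, and removal always produces one:

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirror RTAS_LOG_V6_HP_ACTION_ADD/REMOVE from patch 10. */
enum { HP_ACTION_ADD = 1, HP_ACTION_REMOVE = 2 };

/* Hypothetical stand-in for the two device fields the handlers read. */
struct model_dev {
    int devfn;        /* PCI device/function number */
    bool hotplugged;  /* false for devices present at boot (coldplug) */
};

/* Hypothetical event record, standing in for the RTAS hotplug log. */
struct model_event {
    int action;
    int slot;
};

#define MAX_EVENTS 8
struct model_event events[MAX_EVENTS];
int n_events;

/* Same arithmetic as the PCI_SLOT() macro: bits 7..3 of devfn. */
int model_pci_slot(int devfn)
{
    return (devfn >> 3) & 0x1f;
}

void emit(int action, int slot)
{
    assert(n_events < MAX_EVENTS);
    events[n_events].action = action;
    events[n_events].slot = slot;
    n_events++;
}

/* Plug path: only devices added at runtime notify the guest. */
void model_hot_plug(const struct model_dev *dev)
{
    if (dev->hotplugged) {
        emit(HP_ACTION_ADD, model_pci_slot(dev->devfn));
    }
}

/* Unplug path: always notifies, whether the device was cold- or hotplugged. */
void model_hot_unplug(const struct model_dev *dev)
{
    emit(HP_ACTION_REMOVE, model_pci_slot(dev->devfn));
}
```

The model keeps every event in an array for inspection; the code in this
series keeps only a single pending hotplug event.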

Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
 hw/ppc/spapr_pci.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 23864ab..17d37c0 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -944,6 +944,18 @@ static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev)
     drc_entry_slot->state &= ~(uint32_t)INDICATOR_ENTITY_SENSE_MASK;
     drc_entry_slot->state |= encoded; /* and the slot */
 
+    /* reliable unplug requires we wait for a transition from
+     * UNISOLATED->ISOLATED prior to device removal/deletion.
+     * However, slots populated by devices at boot-time will not
+     * have ever been set by guest tools to an UNISOLATED/populated
+     * state, so set this manually in the case of coldplug devices
+     */
+    if (!DEVICE(dev)->hotplugged) {
+        drc_entry_slot->state |= ENCODE_DRC_STATE(1,
+                                                  INDICATOR_ISOLATION_MASK,
+                                                  INDICATOR_ISOLATION_SHIFT);
+    }
+
     /* add OF node for pci device and required OF DT properties */
     fdt_orig = g_malloc0(FDT_MAX_SIZE);
     offset = fdt_create(fdt_orig, FDT_MAX_SIZE);
@@ -1026,13 +1038,21 @@ static void spapr_device_hotplug_remove(DeviceState *qdev, PCIDevice *dev)
 static void spapr_phb_hot_plug(HotplugHandler *plug_handler,
                                DeviceState *plugged_dev, Error **errp)
 {
+    int slot = PCI_SLOT(PCI_DEVICE(plugged_dev)->devfn);
+
     spapr_device_hotplug_add(DEVICE(plug_handler), PCI_DEVICE(plugged_dev));
+    if (plugged_dev->hotplugged) {
+        spapr_pci_hotplug_add_event(DEVICE(plug_handler), slot);
+    }
 }
 
 static void spapr_phb_hot_unplug(HotplugHandler *plug_handler,
                                  DeviceState *plugged_dev, Error **errp)
 {
+    int slot = PCI_SLOT(PCI_DEVICE(plugged_dev)->devfn);
+
     spapr_device_hotplug_remove(DEVICE(plug_handler), PCI_DEVICE(plugged_dev));
+    spapr_pci_hotplug_remove_event(DEVICE(plug_handler), slot);
 }
 
 static void spapr_phb_realize(DeviceState *dev, Error **errp)
-- 
1.9.1

* Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node Michael Roth
@ 2014-08-26  7:55   ` Alexey Kardashevskiy
  2014-08-26  8:24     ` Alexey Kardashevskiy
                       ` (2 more replies)
  2014-08-26 11:11   ` [Qemu-devel] " Alexander Graf
                     ` (2 subsequent siblings)
  3 siblings, 3 replies; 69+ messages in thread
From: Alexey Kardashevskiy @ 2014-08-26  7:55 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

On 08/19/2014 10:21 AM, Michael Roth wrote:
> From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> 
> This adds entries to the root OF node to advertise our PHBs as being
> DR-capable in accordance with the PAPR specification.
> 
> Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
> and associated with a power domain of -1 (indicating to guests that
> power management is handled automatically by hardware).
> 
> We currently allocate entries for up to 32 DR-capable PHBs, though
> this limit can be increased later.
> 
> DrcEntry objects to track the state of the DR-connector associated
> with each PHB are stored in a 32-entry array, and each DrcEntry in
> turn has a dynamically-sized number of child DR-connectors,
> which we will use later to track the state of DR-connectors
> associated with a PHB's physical slots.
> 
> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_pci.c     |   1 +
>  include/hw/ppc/spapr.h |  35 ++++++++++++
>  3 files changed, 179 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 5c92707..d5e46c3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
>      return ram_size;
>  }
>  
> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
> +{
> +    int i;
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        if (spapr->drc_table[i].phb_buid == buid) {
> +            return &spapr->drc_table[i];
> +        }
> +     }
> +
> +     return NULL;
> +}
> +
> +static void spapr_init_drc_table(void)
> +{
> +    int i;
> +
> +    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
> +
> +    /* For now we only care about PHB entries */
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        spapr->drc_table[i].drc_index = 0x2000001 + i;
> +    }
> +}
> +
> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
> +{
> +    sPAPRDrcEntry *empty_drc = NULL;
> +    sPAPRDrcEntry *found_drc = NULL;
> +    int i, phb_index;
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        if (spapr->drc_table[i].phb_buid == 0) {
> +            empty_drc = &spapr->drc_table[i];
> +        }
> +
> +        if (spapr->drc_table[i].phb_buid == buid) {
> +            found_drc = &spapr->drc_table[i];

return &spapr->drc_table[i];
?


> +            break;
> +        }
> +    }
> +
> +    if (found_drc) {
> +        return found_drc;
> +    }

   if (!empty_drc) {
        return NULL;
   }

?


> +
> +    if (empty_drc) {

and then there is no need for this check :)
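
For illustration, the shape suggested by the three comments above could
look like the sketch below. It is a reduced stand-in, not the patch
itself: struct drc_entry, table, TABLE_SIZE and add_phb() are
hypothetical names, and the cc_state/child_entries setup is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 32  /* same value as SPAPR_DRC_TABLE_SIZE in the patch */

/* Reduced stand-in for sPAPRDrcEntry: only the fields the lookup touches. */
struct drc_entry {
    uint64_t phb_buid;  /* 0 marks an unused entry */
    uint32_t state;
};

struct drc_entry table[TABLE_SIZE];

struct drc_entry *add_phb(uint64_t buid, uint32_t state)
{
    struct drc_entry *empty = NULL;
    int i;

    for (i = 0; i < TABLE_SIZE; i++) {
        if (table[i].phb_buid == buid) {
            return &table[i];   /* already registered: return it directly */
        }
        if (table[i].phb_buid == 0) {
            empty = &table[i];  /* remember the latest free entry, as the patch does */
        }
    }

    if (!empty) {
        return NULL;            /* table is full */
    }

    /* No nested "if (empty_drc)" needed from here on. */
    empty->phb_buid = buid;
    empty->state = state;
    return empty;
}
```

Behaviour is meant to match the original loop, including the detail that
the last free entry seen is the one that gets used.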


> +        empty_drc->phb_buid = buid;
> +        empty_drc->state = state;
> +        empty_drc->cc_state.fdt = NULL;
> +        empty_drc->cc_state.offset = 0;
> +        empty_drc->cc_state.depth = 0;
> +        empty_drc->cc_state.state = CC_STATE_IDLE;
> +        empty_drc->child_entries =
> +            g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
> +        phb_index = buid - SPAPR_PCI_BASE_BUID;
> +        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +            empty_drc->child_entries[i].drc_index =
> +                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
> +        }
> +        return empty_drc;
> +    }
> +
> +    return NULL;
> +}
> +
> +static void spapr_create_drc_dt_entries(void *fdt)
> +{
> +    char char_buf[1024];
> +    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
> +    uint32_t *entries;
> +    int offset, fdt_offset;
> +    int i, ret;
> +
> +    fdt_offset = fdt_path_offset(fdt, "/");
> +
> +    /* ibm,drc-indexes */
> +    memset(int_buf, 0, sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> +
> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> +        int_buf[i] = spapr->drc_table[i-1].drc_index;
> +    }
> +
> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");

return here and below in the same error cases?

> +    }
> +
> +    /* ibm,drc-power-domains */
> +    memset(int_buf, 0, sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> +
> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> +        int_buf[i] = 0xffffffff;
> +    }
> +
> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains property\n");
> +    }
> +
> +    /* ibm,drc-names */
> +    memset(char_buf, 0, sizeof(char_buf));
> +    entries = (uint32_t *)&char_buf[0];
> +    *entries = SPAPR_DRC_TABLE_SIZE;
> +    offset = sizeof(*entries);
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
> +        char_buf[offset++] = '\0';
> +    }
> +
> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
> +    if (ret) {
> +        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
> +    }
> +
> +    /* ibm,drc-types */
> +    memset(char_buf, 0, sizeof(char_buf));
> +    entries = (uint32_t *)&char_buf[0];
> +    *entries = SPAPR_DRC_TABLE_SIZE;
> +    offset = sizeof(*entries);
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        offset += sprintf(char_buf + offset, "PHB");
> +        char_buf[offset++] = '\0';
> +    }
> +
> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, offset);
> +    if (ret) {
> +        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
> +    }
> +}
> +
>  #define _FDT(exp) \
>      do { \
>          int ret = (exp);                                           \
> @@ -731,6 +868,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>      char *bootlist;
>      void *fdt;
>      sPAPRPHBState *phb;
> +    sPAPRDrcEntry *drc_entry;
>  
>      fdt = g_malloc(FDT_MAX_SIZE);
>  
> @@ -750,6 +888,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>      }
>  
>      QLIST_FOREACH(phb, &spapr->phbs, list) {
> +        drc_entry = spapr_phb_to_drc_entry(phb->buid);
> +        g_assert(drc_entry);
>          ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
>      }
>  
> @@ -789,6 +929,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
>      }
>  
> +    spapr_create_drc_dt_entries(fdt);
> +
>      _FDT((fdt_pack(fdt)));
>  
>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> @@ -1443,6 +1585,7 @@ static void ppc_spapr_init(MachineState *machine)
>      spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW);
>      spapr_pci_rtas_init();
>  
> +    spapr_init_drc_table();
>      phb = spapr_create_phb(spapr, 0);
>  
>      for (i = 0; i < nb_nics; i++) {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 9ed39a9..e85134f 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -531,6 +531,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>              + sphb->index * SPAPR_PCI_WINDOW_SPACING;
>          sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
>          sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
> +        spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);


What exactly does "unusable" mean here? Macro?



>      }
>  
>      if (sphb->buid == -1) {
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 36e8e51..c93794b 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -10,6 +10,36 @@ struct sPAPRNVRAM;
>  
>  #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
>  
> +/* For dlparable/hotpluggable slots */
> +#define SPAPR_DRC_TABLE_SIZE    32
> +#define SPAPR_DRC_PHB_SLOT_MAX  32
> +#define SPAPR_DRC_DEV_ID_BASE   0x40000000


Is this SPAPR_DRC_DEV_ID_BASE really necessary (can it be zero)? Is that
a global ID, or is it per PCI bus or per PHB?
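
To make the question concrete, the arithmetic from the child_entries loop
in the quoted patch can be worked through in isolation. The helper name
slot_drc_index() is made up for this sketch; only the two constants and
the shift expression are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define DRC_DEV_ID_BASE   0x40000000u  /* SPAPR_DRC_DEV_ID_BASE in the patch */
#define DRC_PHB_SLOT_MAX  32u          /* SPAPR_DRC_PHB_SLOT_MAX in the patch */

/*
 * Same expression as in spapr_add_phb_to_drc_table():
 *   base + (phb_index << 8) + (slot << 3)
 * where phb_index is buid - SPAPR_PCI_BASE_BUID.
 */
uint32_t slot_drc_index(uint32_t phb_index, uint32_t slot)
{
    assert(slot < DRC_PHB_SLOT_MAX);
    return DRC_DEV_ID_BASE + (phb_index << 8) + (slot << 3);
}
```

With this formula the 32 slots of one PHB use offsets 0x00 to 0xf8, which
stays below the 0x100 stride between PHBs, so each (PHB, slot) pair gets a
distinct value: for example PHB 1, slot 2 gives 0x40000110. With a base of
zero, PHB 0 slot 0 would come out as index 0.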


> +
> +typedef struct sPAPRConfigureConnectorState {
> +    void *fdt;
> +    int offset_start;
> +    int offset;
> +    int depth;
> +    PCIDevice *dev;
> +    enum {
> +        CC_STATE_IDLE = 0,
> +        CC_STATE_PENDING = 1,
> +        CC_STATE_ACTIVE,
> +    } state;
> +} sPAPRConfigureConnectorState;
> +
> +typedef struct sPAPRDrcEntry sPAPRDrcEntry;
> +
> +struct sPAPRDrcEntry {
> +    uint32_t drc_index;
> +    uint64_t phb_buid;
> +    void *fdt;
> +    int fdt_offset;
> +    uint32_t state;
> +    sPAPRConfigureConnectorState cc_state;
> +    sPAPRDrcEntry *child_entries;
> +};
> +
>  typedef struct sPAPREnvironment {
>      struct VIOsPAPRBus *vio_bus;
>      QLIST_HEAD(, sPAPRPHBState) phbs;
> @@ -39,6 +69,9 @@ typedef struct sPAPREnvironment {
>      int htab_save_index;
>      bool htab_first_pass;
>      int htab_fd;
> +
> +    /* state for Dynamic Reconfiguration Connectors */
> +    sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
>  } sPAPREnvironment;
>  
>  #define H_SUCCESS         0
> @@ -481,5 +514,7 @@ int spapr_dma_dt(void *fdt, int node_off, const char *propname,
>                   uint32_t liobn, uint64_t window, uint32_t size);
>  int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
>                        sPAPRTCETable *tcet);
> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
>  
>  #endif /* !defined (__HW_SPAPR_H__) */
> 


-- 
Alexey

* Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-26  7:55   ` Alexey Kardashevskiy
@ 2014-08-26  8:24     ` Alexey Kardashevskiy
  2014-08-26 15:25       ` Michael Roth
  2014-08-26 14:56     ` Michael Roth
  2014-09-05  0:31     ` [Qemu-devel] [Qemu-ppc] " Tyrel Datwyler
  2 siblings, 1 reply; 69+ messages in thread
From: Alexey Kardashevskiy @ 2014-08-26  8:24 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

On 08/26/2014 05:55 PM, Alexey Kardashevskiy wrote:
> On 08/19/2014 10:21 AM, Michael Roth wrote:
>> From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>
>> This adds entries to the root OF node to advertise our PHBs as being
>> DR-capable in accordance with the PAPR specification.
>>
>> Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
>> and associated with a power domain of -1 (indicating to guests that
>> power management is handled automatically by hardware).
>>
>> We currently allocate entries for up to 32 DR-capable PHBs, though
>> this limit can be increased later.
>>
>> DrcEntry objects to track the state of the DR-connector associated
>> with each PHB are stored in a 32-entry array, and each DrcEntry in
>> turn has a dynamically-sized number of child DR-connectors,
>> which we will use later to track the state of DR-connectors
>> associated with a PHB's physical slots.
>>
>> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>> ---
>>  hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
>>  hw/ppc/spapr_pci.c     |   1 +
>>  include/hw/ppc/spapr.h |  35 ++++++++++++
>>  3 files changed, 179 insertions(+)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 5c92707..d5e46c3 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
>>      return ram_size;
>>  }
>>  
>> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
>> +{
>> +    int i;
>> +
>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>> +        if (spapr->drc_table[i].phb_buid == buid) {
>> +            return &spapr->drc_table[i];
>> +        }
>> +     }
>> +
>> +     return NULL;
>> +}
>> +
>> +static void spapr_init_drc_table(void)
>> +{
>> +    int i;
>> +
>> +    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
>> +
>> +    /* For now we only care about PHB entries */
>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>> +        spapr->drc_table[i].drc_index = 0x2000001 + i;
>> +    }
>> +}
>> +
>> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
>> +{
>> +    sPAPRDrcEntry *empty_drc = NULL;
>> +    sPAPRDrcEntry *found_drc = NULL;
>> +    int i, phb_index;
>> +
>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>> +        if (spapr->drc_table[i].phb_buid == 0) {
>> +            empty_drc = &spapr->drc_table[i];
>> +        }
>> +
>> +        if (spapr->drc_table[i].phb_buid == buid) {
>> +            found_drc = &spapr->drc_table[i];
> 
> return &spapr->drc_table[i];
> ?
> 
> 
>> +            break;
>> +        }
>> +    }
>> +
>> +    if (found_drc) {
>> +        return found_drc;
>> +    }
> 
>    if (!empty_drc) {
>         return NULL;
>    }
> 
> ?
> 
> 
>> +
>> +    if (empty_drc) {
> 
> and then there is no need for this check :)
> 
> 
>> +        empty_drc->phb_buid = buid;
>> +        empty_drc->state = state;
>> +        empty_drc->cc_state.fdt = NULL;
>> +        empty_drc->cc_state.offset = 0;
>> +        empty_drc->cc_state.depth = 0;
>> +        empty_drc->cc_state.state = CC_STATE_IDLE;
>> +        empty_drc->child_entries =
>> +            g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
>> +        phb_index = buid - SPAPR_PCI_BASE_BUID;
>> +        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
>> +            empty_drc->child_entries[i].drc_index =
>> +                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
>> +        }
>> +        return empty_drc;
>> +    }
>> +
>> +    return NULL;
>> +}
>> +
>> +static void spapr_create_drc_dt_entries(void *fdt)
>> +{
>> +    char char_buf[1024];
>> +    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
>> +    uint32_t *entries;
>> +    int offset, fdt_offset;
>> +    int i, ret;
>> +
>> +    fdt_offset = fdt_path_offset(fdt, "/");
>> +
>> +    /* ibm,drc-indexes */
>> +    memset(int_buf, 0, sizeof(int_buf));
>> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
>> +
>> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
>> +        int_buf[i] = spapr->drc_table[i-1].drc_index;
>> +    }
>> +
>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
>> +                      sizeof(int_buf));
>> +    if (ret) {
>> +        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
> 
> return here and below in the same error cases?
> 
>> +    }
>> +
>> +    /* ibm,drc-power-domains */
>> +    memset(int_buf, 0, sizeof(int_buf));
>> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
>> +
>> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
>> +        int_buf[i] = 0xffffffff;
>> +    }
>> +
>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
>> +                      sizeof(int_buf));
>> +    if (ret) {
>> +        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains property\n");
>> +    }
>> +
>> +    /* ibm,drc-names */
>> +    memset(char_buf, 0, sizeof(char_buf));
>> +    entries = (uint32_t *)&char_buf[0];
>> +    *entries = SPAPR_DRC_TABLE_SIZE;
>> +    offset = sizeof(*entries);
>> +
>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>> +        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
>> +        char_buf[offset++] = '\0';
>> +    }
>> +
>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
>> +    if (ret) {
>> +        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
>> +    }
>> +
>> +    /* ibm,drc-types */
>> +    memset(char_buf, 0, sizeof(char_buf));
>> +    entries = (uint32_t *)&char_buf[0];
>> +    *entries = SPAPR_DRC_TABLE_SIZE;
>> +    offset = sizeof(*entries);
>> +
>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>> +        offset += sprintf(char_buf + offset, "PHB");
>> +        char_buf[offset++] = '\0';
>> +    }
>> +
>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, offset);
>> +    if (ret) {
>> +        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
>> +    }
>> +}
>> +
>>  #define _FDT(exp) \
>>      do { \
>>          int ret = (exp);                                           \
>> @@ -731,6 +868,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>      char *bootlist;
>>      void *fdt;
>>      sPAPRPHBState *phb;
>> +    sPAPRDrcEntry *drc_entry;
>>  
>>      fdt = g_malloc(FDT_MAX_SIZE);
>>  
>> @@ -750,6 +888,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>      }
>>  
>>      QLIST_FOREACH(phb, &spapr->phbs, list) {
>> +        drc_entry = spapr_phb_to_drc_entry(phb->buid);
>> +        g_assert(drc_entry);
>>          ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
>>      }
>>  
>> @@ -789,6 +929,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
>>      }
>>  
>> +    spapr_create_drc_dt_entries(fdt);
>> +
>>      _FDT((fdt_pack(fdt)));
>>  
>>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
>> @@ -1443,6 +1585,7 @@ static void ppc_spapr_init(MachineState *machine)
>>      spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW);
>>      spapr_pci_rtas_init();
>>  
>> +    spapr_init_drc_table();
>>      phb = spapr_create_phb(spapr, 0);
>>  
>>      for (i = 0; i < nb_nics; i++) {
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index 9ed39a9..e85134f 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -531,6 +531,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>              + sphb->index * SPAPR_PCI_WINDOW_SPACING;
>>          sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
>>          sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
>> +        spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
> 
> 
> What exactly does "unusable" mean here? Macro?
> 
> 
> 
>>      }
>>  
>>      if (sphb->buid == -1) {
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index 36e8e51..c93794b 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -10,6 +10,36 @@ struct sPAPRNVRAM;
>>  
>>  #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
>>  
>> +/* For dlparable/hotpluggable slots */
>> +#define SPAPR_DRC_TABLE_SIZE    32
>> +#define SPAPR_DRC_PHB_SLOT_MAX  32
>> +#define SPAPR_DRC_DEV_ID_BASE   0x40000000
> 
> 
> Is this SPAPR_DRC_DEV_ID_BASE really necessary (can it be zero)? Is that
> global id or per PCI bus or per PHB?


Ah. Got it. If it was like below, I would not even ask :)

#define SPAPR_DRC_DEV_ID(phb_index, slot) \
	(0x40000000 | ((phb_index)<<8) | ((slot)<<3))

Still not clear what you need this 0x40000000 for. Is it kind of "PHB" DRC type?
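
A minimal sketch of the encoding behind the macro suggested above, with the
field layout spelled out. The 0x40000000 base, the shift by 8 for the PHB
index and the shift by 3 for the slot come from the patch as posted; the
helpers drc_dev_id(), drc_dev_id_phb() and drc_dev_id_slot() are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the patch under review. */
#define SPAPR_DRC_PHB_SLOT_MAX  32
#define SPAPR_DRC_DEV_ID_BASE   0x40000000u

/* The macro suggested in the review, with the base constant named. */
#define SPAPR_DRC_DEV_ID(phb_index, slot) \
    (SPAPR_DRC_DEV_ID_BASE | ((uint32_t)(phb_index) << 8) | \
     ((uint32_t)(slot) << 3))

/* Hypothetical checked wrapper: slots 0..31 shifted left by 3 stay within
 * the low byte, so they never collide with the PHB index field. */
static uint32_t drc_dev_id(uint32_t phb_index, uint32_t slot)
{
    assert(slot < SPAPR_DRC_PHB_SLOT_MAX);
    return SPAPR_DRC_DEV_ID(phb_index, slot);
}

/* Hypothetical decoders, for illustration only. */
static uint32_t drc_dev_id_phb(uint32_t id)
{
    return (id & ~SPAPR_DRC_DEV_ID_BASE) >> 8;
}

static uint32_t drc_dev_id_slot(uint32_t id)
{
    return (id & 0xffu) >> 3;
}
```

With this layout, PHB 1 / slot 2 encodes to 0x40000110, and OR-ing the fields
gives the same result as the additions used in the patch as long as the
fields do not overlap.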


> 
>> +
>> +typedef struct sPAPRConfigureConnectorState {
>> +    void *fdt;
>> +    int offset_start;
>> +    int offset;
>> +    int depth;
>> +    PCIDevice *dev;
>> +    enum {
>> +        CC_STATE_IDLE = 0,
>> +        CC_STATE_PENDING = 1,
>> +        CC_STATE_ACTIVE,
>> +    } state;
>> +} sPAPRConfigureConnectorState;
>> +
>> +typedef struct sPAPRDrcEntry sPAPRDrcEntry;
>> +
>> +struct sPAPRDrcEntry {
>> +    uint32_t drc_index;
>> +    uint64_t phb_buid;
>> +    void *fdt;
>> +    int fdt_offset;
>> +    uint32_t state;
>> +    sPAPRConfigureConnectorState cc_state;
>> +    sPAPRDrcEntry *child_entries;
>> +};
>> +
>>  typedef struct sPAPREnvironment {
>>      struct VIOsPAPRBus *vio_bus;
>>      QLIST_HEAD(, sPAPRPHBState) phbs;
>> @@ -39,6 +69,9 @@ typedef struct sPAPREnvironment {
>>      int htab_save_index;
>>      bool htab_first_pass;
>>      int htab_fd;
>> +
>> +    /* state for Dynamic Reconfiguration Connectors */
>> +    sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
>>  } sPAPREnvironment;
>>  
>>  #define H_SUCCESS         0
>> @@ -481,5 +514,7 @@ int spapr_dma_dt(void *fdt, int node_off, const char *propname,
>>                   uint32_t liobn, uint64_t window, uint32_t size);
>>  int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
>>                        sPAPRTCETable *tcet);
>> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
>> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
>>  
>>  #endif /* !defined (__HW_SPAPR_H__) */
>>
> 
> 


-- 
Alexey


* Re: [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs Michael Roth
@ 2014-08-26  8:32   ` Alexey Kardashevskiy
  2014-08-26 17:16     ` Michael Roth
  2014-08-26  9:09   ` Alexey Kardashevskiy
  2014-08-26 11:29   ` Alexander Graf
  2 siblings, 1 reply; 69+ messages in thread
From: Alexey Kardashevskiy @ 2014-08-26  8:32 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

On 08/19/2014 10:21 AM, Michael Roth wrote:
> Reserve 32 entries of type PCI in each PHB's initial FDT. This
> advertises to guests that each PHB is a DR-capable device with
> physical hotpluggable slots. This is necessary for allowing
> hotplugging of devices to it later via bus rescan or guest rpaphp
> hotplug module.
> 
> Each entry is assigned a name of "Slot <<bus_idx>*32 +1>",
> advertised as a hotpluggable PCI slot, and assigned to power domain
> -1 to indicate to the guest that power management is handled by the
> hardware.
> 
> This models a DR-capable PCI expansion device attached to a host/lpar
> via a single PHB with 32 physical hotpluggable slots (as opposed to a
> virtual bridge device with external management console). Hotplug will
> be handled by the guest via bus rescan or the rpaphp hotplug module.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c              |   3 +-
>  hw/ppc/spapr_pci.c          | 102 ++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/pci-host/spapr.h |   1 +
>  3 files changed, 105 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d5e46c3..90b25b3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -890,7 +890,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>      QLIST_FOREACH(phb, &spapr->phbs, list) {
>          drc_entry = spapr_phb_to_drc_entry(phb->buid);
>          g_assert(drc_entry);
> -        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> +        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, drc_entry->drc_index,
> +                                    fdt);
>      }
>  
>      if (ret < 0) {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index e85134f..924d488 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -851,8 +851,104 @@ static int spapr_phb_children_dt(Object *child, void *opaque)
>      return 1;
>  }
>  
> +static void spapr_create_drc_phb_dt_entries(void *fdt, int bus_off, int phb_index)
> +{
> +    char char_buf[1024];
> +    uint32_t int_buf[SPAPR_DRC_PHB_SLOT_MAX + 1];
> +    uint32_t *entries;
> +    int i, ret, offset;
> +
> +    /* ibm,drc-indexes */
> +    memset(int_buf, 0 , sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +        int_buf[i] = SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + ((i - 1) << 3);
> +    }
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-indexes", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr, "error adding 'ibm,drc-indexes' field for PHB FDT");
> +    }
> +
> +    /* ibm,drc-power-domains */
> +    memset(int_buf, 0, sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +        int_buf[i] = 0xffffffff;
> +    }
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-power-domains", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr,
> +                "error adding 'ibm,drc-power-domains' field for PHB FDT");

As before - return here and below.
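
A rough sketch of the "return on first error" shape being asked for. It
assumes the helper is changed from void to int so the caller can see the
failure, which the patch does not do; fake_setprop(), failing_prop and
setprop_calls are hypothetical stand-ins so that the sketch runs without
libfdt.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for fdt_setprop(): returns nonzero for one chosen
 * property name so that the error path can be exercised. */
static const char *failing_prop;
static int setprop_calls;

static int fake_setprop(const char *name)
{
    setprop_calls++;
    return (failing_prop && strcmp(name, failing_prop) == 0) ? -1 : 0;
}

/* Returns 0 on success, or the first nonzero error code. Properties after
 * the failing one are not attempted. */
static int create_drc_props(void)
{
    static const char *const props[] = {
        "ibm,drc-indexes",
        "ibm,drc-power-domains",
        "ibm,drc-names",
        "ibm,drc-types",
    };
    size_t i;
    int ret;

    for (i = 0; i < sizeof(props) / sizeof(props[0]); i++) {
        ret = fake_setprop(props[i]);
        if (ret) {
            fprintf(stderr, "error adding '%s' field for PHB FDT\n",
                    props[i]);
            return ret;
        }
    }
    return 0;
}
```

Returning the error also lets spapr_populate_pci_dt() propagate it instead of
continuing with a partially populated node.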

> +    }
> +
> +    /* ibm,drc-names */
> +    memset(char_buf, 0, sizeof(char_buf));
> +    entries = (uint32_t *)&char_buf[0];
> +    *entries = SPAPR_DRC_PHB_SLOT_MAX;
> +    offset = sizeof(*entries);
> +
> +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +        offset += sprintf(char_buf + offset, "Slot %d",
> +                          (phb_index * SPAPR_DRC_PHB_SLOT_MAX) + i - 1);

Mmmm. The loop runs from 1 to <=MAX and uses (i-1) inside, when it could be
the traditional 0 to <MAX with (i), as is done below :)


> +        char_buf[offset++] = '\0';


sprintf() writes the terminating zero itself, no? And while we are here,
shouldn't it be snprintf()?
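
For reference, sprintf() does write the terminating NUL, but its return value
does not count it, so the offset still has to be advanced by one; only the
explicit store is redundant. A minimal sketch of the names buffer with a
0-based loop and snprintf() follows. build_drc_names() is a hypothetical
helper, and the entry count is stored in host byte order simply because the
patch does the same.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SPAPR_DRC_PHB_SLOT_MAX 32

/* Packs a 32-bit entry count followed by NUL-terminated "Slot N" strings
 * into buf. Returns the number of bytes used, or -1 if buf is too small. */
static int build_drc_names(char *buf, size_t buf_size, int phb_index)
{
    uint32_t count = SPAPR_DRC_PHB_SLOT_MAX;
    size_t offset = sizeof(count);
    int i, len;

    if (buf_size < offset) {
        return -1;
    }
    memcpy(buf, &count, sizeof(count)); /* host byte order, as in the patch */

    for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
        len = snprintf(buf + offset, buf_size - offset, "Slot %d",
                       phb_index * SPAPR_DRC_PHB_SLOT_MAX + i);
        if (len < 0 || (size_t)len >= buf_size - offset) {
            return -1; /* the name would have been truncated */
        }
        offset += (size_t)len + 1; /* step over the NUL snprintf() wrote */
    }
    return (int)offset;
}
```

The returned length is what would be passed to fdt_setprop() as the property
size.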


> +    }
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-names", char_buf, offset);
> +    if (ret) {
> +        fprintf(stderr, "error adding 'ibm,drc-names' field for PHB FDT");
> +    }
> +
> +    /* ibm,drc-types */
> +    memset(char_buf, 0, sizeof(char_buf));
> +    entries = (uint32_t *)&char_buf[0];
> +    *entries = SPAPR_DRC_PHB_SLOT_MAX;
> +    offset = sizeof(*entries);
> +
> +    for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +        offset += sprintf(char_buf + offset, "28");
> +        char_buf[offset++] = '\0';
> +    }
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-types", char_buf, offset);
> +    if (ret) {
> +        fprintf(stderr, "error adding 'ibm,drc-types' field for PHB FDT");
> +    }
> +
> +    /* we want the initial indicator state to be 0 - "empty", when we
> +     * hot-plug an adaptor in the slot, we need to set the indicator
> +     * to 1 - "present."
> +     */
> +
> +    /* ibm,indicator-9003 */
> +    memset(int_buf, 0, sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,indicator-9003", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr, "error adding 'ibm,indicator-9003' field for PHB FDT");
> +    }
> +
> +    /* ibm,sensor-9003 */
> +    memset(int_buf, 0, sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,sensor-9003", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr, "error adding 'ibm,sensor-9003' field for PHB FDT");
> +    }
> +}
> +
>  int spapr_populate_pci_dt(sPAPRPHBState *phb,
>                            uint32_t xics_phandle,
> +                          uint32_t drc_index,
>                            void *fdt)
>  {
>      int bus_off, i, j;
> @@ -934,6 +1030,12 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb,
>      object_child_foreach(OBJECT(phb), spapr_phb_children_dt,
>                           &((sPAPRTCEDT){ .fdt = fdt, .node_off = bus_off }));
>  
> +    spapr_create_drc_phb_dt_entries(fdt, bus_off, phb->index);
> +    if (drc_index) {
> +        _FDT(fdt_setprop(fdt, bus_off, "ibm,my-drc-index", &drc_index,
> +                         sizeof(drc_index)));
> +    }
> +
>      return 0;
>  }
>  
> diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
> index 32f0aa7..8f0a42f 100644
> --- a/include/hw/pci-host/spapr.h
> +++ b/include/hw/pci-host/spapr.h
> @@ -116,6 +116,7 @@ PCIHostState *spapr_create_phb(sPAPREnvironment *spapr, int index);
>  
>  int spapr_populate_pci_dt(sPAPRPHBState *phb,
>                            uint32_t xics_phandle,
> +                          uint32_t drc_index,
>                            void *fdt);
>  
>  void spapr_pci_msi_init(sPAPREnvironment *spapr, hwaddr addr);
> 


-- 
Alexey


* Re: [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs Michael Roth
  2014-08-26  8:32   ` Alexey Kardashevskiy
@ 2014-08-26  9:09   ` Alexey Kardashevskiy
  2014-08-26 17:52     ` Michael Roth
  2014-08-26 11:29   ` Alexander Graf
  2 siblings, 1 reply; 69+ messages in thread
From: Alexey Kardashevskiy @ 2014-08-26  9:09 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

On 08/19/2014 10:21 AM, Michael Roth wrote:
> Reserve 32 entries of type PCI in each PHB's initial FDT. This
> advertises to guests that each PHB is a DR-capable device with
> physical hotpluggable slots. This is necessary for allowing
> hotplugging of devices to it later via bus rescan or guest rpaphp
> hotplug module.
> 
> Each entry is assigned a name of "Slot <<bus_idx>*32 +1>",
> advertised as a hotpluggable PCI slot, and assigned to power domain
> -1 to indicate to the guest that power management is handled by the
> hardware.
> 
> This models a DR-capable PCI expansion device attached to a host/lpar
> via a single PHB with 32 physical hotpluggable slots (as opposed to a
> virtual bridge device with external management console). Hotplug will
> be handled by the guest via bus rescan or the rpaphp hotplug module.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c              |   3 +-
>  hw/ppc/spapr_pci.c          | 102 ++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/pci-host/spapr.h |   1 +
>  3 files changed, 105 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d5e46c3..90b25b3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -890,7 +890,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>      QLIST_FOREACH(phb, &spapr->phbs, list) {
>          drc_entry = spapr_phb_to_drc_entry(phb->buid);
>          g_assert(drc_entry);
> -        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> +        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, drc_entry->drc_index,
> +                                    fdt);
>      }
>  
>      if (ret < 0) {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index e85134f..924d488 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -851,8 +851,104 @@ static int spapr_phb_children_dt(Object *child, void *opaque)
>      return 1;
>  }
>  
> +static void spapr_create_drc_phb_dt_entries(void *fdt, int bus_off, int phb_index)
> +{
> +    char char_buf[1024];
> +    uint32_t int_buf[SPAPR_DRC_PHB_SLOT_MAX + 1];
> +    uint32_t *entries;
> +    int i, ret, offset;
> +
> +    /* ibm,drc-indexes */
> +    memset(int_buf, 0 , sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +        int_buf[i] = SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + ((i - 1) << 3);
> +    }
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-indexes", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr, "error adding 'ibm,drc-indexes' field for PHB FDT");
> +    }
> +
> +    /* ibm,drc-power-domains */
> +    memset(int_buf, 0, sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +        int_buf[i] = 0xffffffff;
> +    }
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-power-domains", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr,
> +                "error adding 'ibm,drc-power-domains' field for PHB FDT");
> +    }
> +
> +    /* ibm,drc-names */
> +    memset(char_buf, 0, sizeof(char_buf));
> +    entries = (uint32_t *)&char_buf[0];
> +    *entries = SPAPR_DRC_PHB_SLOT_MAX;
> +    offset = sizeof(*entries);
> +
> +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +        offset += sprintf(char_buf + offset, "Slot %d",
> +                          (phb_index * SPAPR_DRC_PHB_SLOT_MAX) + i - 1);
> +        char_buf[offset++] = '\0';
> +    }
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-names", char_buf, offset);
> +    if (ret) {
> +        fprintf(stderr, "error adding 'ibm,drc-names' field for PHB FDT");
> +    }
> +
> +    /* ibm,drc-types */
> +    memset(char_buf, 0, sizeof(char_buf));
> +    entries = (uint32_t *)&char_buf[0];
> +    *entries = SPAPR_DRC_PHB_SLOT_MAX;
> +    offset = sizeof(*entries);
> +
> +    for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +        offset += sprintf(char_buf + offset, "28");


"28"? Is it for "PHB"?




-- 
Alexey


* Re: [Qemu-devel] [PATCH 07/12] spapr_pci: add ibm,configure-connector RTAS interface
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 07/12] spapr_pci: add ibm,configure-connector " Michael Roth
@ 2014-08-26  9:12   ` Alexey Kardashevskiy
  2014-09-05  3:03     ` Nathan Fontenot
  2014-08-26 11:39   ` Alexander Graf
  1 sibling, 1 reply; 69+ messages in thread
From: Alexey Kardashevskiy @ 2014-08-26  9:12 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

On 08/19/2014 10:21 AM, Michael Roth wrote:
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

I have totally no idea what this patch actually does :) When is this RTAS
call made? Once, after the guest has received the check-exception interrupt?
Is all it does fetching the copy of the device tree made by
spapr_create_drc_phb_dt_entries()? If so,
spapr_create_drc_phb_dt_entries() could compose the structure below in the
first place, without any additional conversions, no?


> ---
>  hw/ppc/spapr_pci.c | 111 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 111 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 8d1351d..96a57be 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -606,6 +606,115 @@ static void rtas_get_sensor_state(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>      rtas_st(rets, 1, decoded);
>  }
>  
> +/* configure-connector work area offsets, int32_t units */
> +#define CC_IDX_NODE_NAME_OFFSET 2
> +#define CC_IDX_PROP_NAME_OFFSET 2
> +#define CC_IDX_PROP_LEN 3
> +#define CC_IDX_PROP_DATA_OFFSET 4
> +
> +#define CC_VAL_DATA_OFFSET ((CC_IDX_PROP_DATA_OFFSET + 1) * 4)
> +#define CC_RET_NEXT_SIB 1
> +#define CC_RET_NEXT_CHILD 2
> +#define CC_RET_NEXT_PROPERTY 3
> +#define CC_RET_PREV_PARENT 4
> +#define CC_RET_ERROR RTAS_OUT_HW_ERROR
> +#define CC_RET_SUCCESS RTAS_OUT_SUCCESS


Why are these two here, and not in the same bucket as RTAS_OUT_HW_ERROR and others?


> +
> +static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
> +                                         sPAPREnvironment *spapr,
> +                                         uint32_t token, uint32_t nargs,
> +                                         target_ulong args, uint32_t nret,
> +                                         target_ulong rets)
> +{
> +    uint64_t wa_addr = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 0);
> +    sPAPRDrcEntry *drc_entry = NULL;
> +    sPAPRConfigureConnectorState *ccs;
> +    void *wa_buf;
> +    int32_t *wa_buf_int;
> +    hwaddr map_len = 0x1024;
> +    uint32_t drc_index;
> +    int rc = 0, next_offset, tag, prop_len, node_name_len;
> +    const struct fdt_property *prop;
> +    const char *node_name, *prop_name;
> +
> +    wa_buf = cpu_physical_memory_map(wa_addr, &map_len, 1);
> +    if (!wa_buf) {
> +        rc = CC_RET_ERROR;
> +        goto error_exit;
> +    }
> +    wa_buf_int = wa_buf;
> +
> +    drc_index = *(uint32_t *)wa_buf;
> +    drc_entry = spapr_find_drc_entry(drc_index);
> +    if (!drc_entry) {
> +        rc = -1;
> +        goto error_exit;
> +    }
> +
> +    ccs = &drc_entry->cc_state;
> +    if (ccs->state == CC_STATE_PENDING) {
> +        /* fdt should've been attached to drc_entry during
> +         * realize/hotplug
> +         */
> +        g_assert(ccs->fdt);
> +        ccs->depth = 0;
> +        ccs->offset = ccs->offset_start;
> +        ccs->state = CC_STATE_ACTIVE;
> +    }
> +
> +    if (ccs->state == CC_STATE_IDLE) {
> +        rc = -1;
> +        goto error_exit;
> +    }
> +
> +retry:
> +    tag = fdt_next_tag(ccs->fdt, ccs->offset, &next_offset);
> +
> +    switch (tag) {
> +    case FDT_BEGIN_NODE:
> +        ccs->depth++;
> +        node_name = fdt_get_name(ccs->fdt, ccs->offset, &node_name_len);
> +        wa_buf_int[CC_IDX_NODE_NAME_OFFSET] = CC_VAL_DATA_OFFSET;
> +        strcpy(wa_buf + wa_buf_int[CC_IDX_NODE_NAME_OFFSET], node_name);
> +        rc = CC_RET_NEXT_CHILD;
> +        break;
> +    case FDT_END_NODE:
> +        ccs->depth--;
> +        if (ccs->depth == 0) {
> +            /* reached the end of top-level node, declare success */
> +            ccs->state = CC_STATE_PENDING;
> +            rc = CC_RET_SUCCESS;
> +        } else {
> +            rc = CC_RET_PREV_PARENT;
> +        }
> +        break;
> +    case FDT_PROP:
> +        prop = fdt_get_property_by_offset(ccs->fdt, ccs->offset, &prop_len);
> +        prop_name = fdt_string(ccs->fdt, fdt32_to_cpu(prop->nameoff));
> +        wa_buf_int[CC_IDX_PROP_NAME_OFFSET] = CC_VAL_DATA_OFFSET;
> +        wa_buf_int[CC_IDX_PROP_LEN] = prop_len;
> +        wa_buf_int[CC_IDX_PROP_DATA_OFFSET] =
> +            CC_VAL_DATA_OFFSET + strlen(prop_name) + 1;
> +        strcpy(wa_buf + wa_buf_int[CC_IDX_PROP_NAME_OFFSET], prop_name);
> +        memcpy(wa_buf + wa_buf_int[CC_IDX_PROP_DATA_OFFSET],
> +               prop->data, prop_len);
> +        rc = CC_RET_NEXT_PROPERTY;
> +        break;
> +    case FDT_END:
> +        rc = CC_RET_ERROR;
> +        break;
> +    default:
> +        ccs->offset = next_offset;
> +        goto retry;

Could be a while(1) loop...
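
A sketch of the same control flow with a loop in place of the "goto retry".
To stay self-contained it walks a mock tag array rather than a real FDT, so
enum tag, struct cc_state and cc_step() are hypothetical. The CC_RET_* values
for the traversal codes are from the patch; 0 and -1 for success and error
assume the usual RTAS_OUT_SUCCESS / RTAS_OUT_HW_ERROR values.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the libfdt tags the handler switches on. */
enum tag { TAG_BEGIN_NODE, TAG_END_NODE, TAG_PROP, TAG_NOP, TAG_END };

#define CC_RET_NEXT_CHILD    2
#define CC_RET_NEXT_PROPERTY 3
#define CC_RET_PREV_PARENT   4
#define CC_RET_SUCCESS       0    /* assumed RTAS_OUT_SUCCESS */
#define CC_RET_ERROR         (-1) /* assumed RTAS_OUT_HW_ERROR */

struct cc_state {
    const enum tag *tags; /* mock tag stream instead of a real FDT */
    size_t offset;        /* index of the next tag to examine */
    int depth;
};

/* One configure-connector step: skip tags that carry nothing to report
 * (the "default:" case in the patch) and return a code for the first
 * tag that does. */
static int cc_step(struct cc_state *ccs)
{
    for (;;) {
        enum tag tag = ccs->tags[ccs->offset];

        if (tag == TAG_END) {
            return CC_RET_ERROR; /* ran off the end; do not advance */
        }
        ccs->offset++;

        switch (tag) {
        case TAG_BEGIN_NODE:
            ccs->depth++;
            return CC_RET_NEXT_CHILD;
        case TAG_END_NODE:
            ccs->depth--;
            return ccs->depth == 0 ? CC_RET_SUCCESS : CC_RET_PREV_PARENT;
        case TAG_PROP:
            return CC_RET_NEXT_PROPERTY;
        default:
            continue; /* e.g. a NOP tag: look at the next one */
        }
    }
}
```

Each call reports exactly one item, which matches how the guest is expected
to call ibm,configure-connector repeatedly until it gets a final status.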


> +    }
> +
> +    ccs->offset = next_offset;
> +
> +error_exit:
> +    cpu_physical_memory_unmap(wa_buf, 0x1024, 1, 0x1024);


0x1024 is a weird constant, are you sure about that?
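
For reference, 0x1024 is 4132 in decimal, which is neither 1024 (0x400) nor
4096 (0x1000). The thread does not say which size was intended, so the name
CC_WA_LEN and the 4096 value below are assumptions made purely for
illustration; the point is only that a single named constant avoids repeating
a suspicious literal at the map and unmap sites.

```c
#include <stdint.h>

/* Hypothetical name and assumed size; not taken from the patch. */
#define CC_WA_LEN 0x1000u

/* Small helper to show that the literal in the patch is not a power of
 * two, unlike the sizes it was most likely meant to be. */
static int is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}
```

Note also that cpu_physical_memory_map() takes the length by pointer and may
shorten it, so the unmap call would more safely use map_len than a repeated
literal.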


> +    rtas_st(rets, 0, rc);
> +}
> +
>  static int pci_spapr_swizzle(int slot, int pin)
>  {
>      return (slot + pin) % PCI_NUM_PINS;
> @@ -1276,6 +1385,8 @@ void spapr_pci_rtas_init(void)
>                          rtas_get_power_level);
>      spapr_rtas_register(RTAS_GET_SENSOR_STATE, "get-sensor-state",
>                          rtas_get_sensor_state);
> +    spapr_rtas_register(RTAS_IBM_CONFIGURE_CONNECTOR, "ibm,configure-connector",
> +                        rtas_ibm_configure_connector);
>  }
>  
>  static void spapr_pci_register_types(void)
> 


-- 
Alexey


* Re: [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions Michael Roth
@ 2014-08-26  9:14   ` Alexey Kardashevskiy
  2014-08-26 11:55     ` Peter Maydell
  2014-08-26 18:34     ` Michael Roth
  2014-08-26 11:41   ` Alexander Graf
  2014-08-27 13:47   ` Michael S. Tsirkin
  2 siblings, 2 replies; 69+ messages in thread
From: Alexey Kardashevskiy @ 2014-08-26  9:14 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

On 08/19/2014 10:21 AM, Michael Roth wrote:
> Some kernels program a 0 address for io regions. PCI 3.0 spec
> section 6.2.5.1 doesn't seem to disallow this.


I remember there was a discussion about it but I forgot :) Why does it have
to be a part of this patchset? Worth mentioning in the commit log I believe.


> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/pci/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 351d320..9578749 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1035,7 +1035,7 @@ static pcibus_t pci_bar_address(PCIDevice *d,
>          /* Check if 32 bit BAR wraps around explicitly.
>           * TODO: make priorities correct and remove this work around.
>           */
> -        if (last_addr <= new_addr || new_addr == 0 || last_addr >= UINT32_MAX) {
> +        if (last_addr <= new_addr || last_addr >= UINT32_MAX) {
>              return PCI_BAR_UNMAPPED;
>          }
>          return new_addr;
> 
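
A small sketch of what the one-line change above does to the range check. It
assumes last_addr is computed as new_addr + size - 1, which the hunk does not
show; bar_unmapped_old() and bar_unmapped_new() are hypothetical names for
the condition before and after the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Condition before the patch: a BAR programmed to address 0 is always
 * treated as unmapped. */
static bool bar_unmapped_old(uint64_t new_addr, uint64_t size)
{
    uint64_t last_addr = new_addr + size - 1; /* assumed from context */

    return last_addr <= new_addr || new_addr == 0 ||
           last_addr >= UINT32_MAX;
}

/* Condition after the patch: only wrap-around and the 32-bit limit make
 * the BAR unmapped, so address 0 becomes a valid mapping. */
static bool bar_unmapped_new(uint64_t new_addr, uint64_t size)
{
    uint64_t last_addr = new_addr + size - 1; /* assumed from context */

    return last_addr <= new_addr || last_addr >= UINT32_MAX;
}
```

Under these assumptions, a 0x100-byte IO BAR at address 0 is rejected by the
old check and accepted by the new one, while a BAR that reaches 0xffffffff is
still rejected by both.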


-- 
Alexey


* Re: [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug
  2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
                   ` (11 preceding siblings ...)
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 12/12] spapr_pci: emit hotplug add/remove events during hotplug Michael Roth
@ 2014-08-26  9:24 ` Alexey Kardashevskiy
  12 siblings, 0 replies; 69+ messages in thread
From: Alexey Kardashevskiy @ 2014-08-26  9:24 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

On 08/19/2014 10:21 AM, Michael Roth wrote:
> These patches are based on ppc-next, and can also be obtained from:
> 
> https://github.com/mdroth/qemu/commits/spapr-pci-hotplug-v3-ppc-next
> 
> v3:
>  * dropped emulation of firmware-managed BAR allocation. this will be
>    introduced via a follow-up series via a -machine flag and tied to
>    a separate hotplug event to avoid a race condition with guest vs.
>    "firmware"-managed BAR allocation, in conjunction with required
>    fixes to rpaphp hotplug kernel module to utilize this mode.
>  * moved drc_table into sPAPREnvironment (Alexey)
>  * moved INDICATOR_* constants and friends into spapr_pci.c (Alexey)
>  * use prefixes for global types (DrcEntry/ConfigureConnectorState) (Alexey)
>  * updated for new hotplug interface (Alexey)
>  * fixed get-power-level to report current power-level rather than
>    desired (Alexey)
>  * rebased to latest ppc-next
> 
> v2:
>   * re-ordered patches to fix build bisectability (Alexey)
>   * replaced g_warning with DPRINTF in RTAS calls for guest errors (Alexey)
>   * replaced g_warning with fprintf for qemu errors (Alexey)
>   * updated RTAS calls to use pre-existing error/success macros (Alexey)
>   * replaced DR_*/SENSOR_* macros with INDICATOR_* for set-indicator/
>     get-sensor-state (Alexey)
> 
> OVERVIEW
> 
> These patches add support for PCI hotplug for SPAPR guests. We advertise
> each PHB as DR-capable (as defined by PAPR 13.5/13.6) with 32 hotpluggable
> PCI slots per PHB, which models a standard PCI expansion device for Power
> machines where the DRC name/loc-code/index for each slot are generated
> based on bus/slot number.
> 
> This is compatible with existing guest kernels via the rpaphp hotplug
> module, and existing userspace tools such as drmgr/librtas/rtas_errd for
> managing devices, in theory...


It would help if you described roughly what happens step-by-step when
hotplugging: when HotplugHandlerClass is called, when RTAS calls are made,
what format is used in between (it is something like a device tree but not
exactly), when the check-exception interrupt is triggered...




> 
> NOTES / ADDITIONAL DEPENDENCIES
> 
> This series relies on v1.2.19 or later of powerpc-utils (drmgr, rtas_errd,
> ppc64-diag, and librtas components, specifically), which will automate
> guest-side hotplug setup in response to an EPOW event emitted by QEMU. For
> guests with older versions of powerpc-utils, a manual workaround must be
> used (documented below).
> 
> PATCH LAYOUT
> 
> Patches
>         1-3   advertise PHBs and associated slots as hotpluggable to guests
>         4-7   add RTAS interfaces required for device configuration
>         8     fix for ppc (and other) guests that allocate IO bars starting
>               at 0x0
>         9     enables device_add/device_del for spapr machines and
>               guest-driven hotplug
>         10-12 define hotplug event structure and emit them in response to
>               device_add/device_del
> 
> USAGE
> 
> For guests with powerpc-utils 1.2.19+:
>   hotplug:
>     qemu:
>       device_add e1000,id=slot0
>   unplug:
>     qemu:
>       device_del slot0
> 
> For guests with powerpc-utils prior to 1.2.19:
>   hotplug:
>     qemu:
>       device_add e1000,id=slot0
>     guest:
>       drmgr -c pci -s "Slot 0" -n -a
>       echo 1 >/sys/bus/pci/rescan
>   unplug:
>     guest:
>       drmgr -c pci -s "Slot 0" -n -r
>       echo 1 >/sys/bus/pci/devices/0000:00:00.0/remove
>     qemu:
>       device_del slot0
> 
>  hw/pci/pci.c                |   2 +-
>  hw/ppc/spapr.c              | 172 +++++++++++++++++++++++++++++++++-
>  hw/ppc/spapr_events.c       | 224 ++++++++++++++++++++++++++++++++++++--------
>  hw/ppc/spapr_pci.c          | 689 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  include/hw/pci-host/spapr.h |   1 +
>  include/hw/ppc/spapr.h      |  46 ++++++++-
>  6 files changed, 1083 insertions(+), 51 deletions(-)
> 


-- 
Alexey


* Re: [Qemu-devel] [PATCH 10/12] spapr_events: re-use EPOW event infrastructure for hotplug events
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 10/12] spapr_events: re-use EPOW event infrastructure for hotplug events Michael Roth
@ 2014-08-26  9:28   ` Alexey Kardashevskiy
  0 siblings, 0 replies; 69+ messages in thread
From: Alexey Kardashevskiy @ 2014-08-26  9:28 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

On 08/19/2014 10:21 AM, Michael Roth wrote:
> From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> 
> This extends the data structures currently used to report EPOW events to
> guests via the check-exception RTAS interfaces to also include event types
> for hotplug/unplug events.
> 
> This is currently undocumented and being finalized for inclusion in PAPR
> specification, but we implement this here as an extension for guest
> userspace tools to implement (existing guest kernels simply log these
> events via a sysfs interface that's read by rtas_errd).


This patch moves things around (should be a mechanical change, right?) AND
adds new (undocumented) stuff. It would be easier to review if it were 2
patches.


> 
> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c         |   2 +-
>  hw/ppc/spapr_events.c  | 215 ++++++++++++++++++++++++++++++++++++++++---------
>  include/hw/ppc/spapr.h |   4 +-
>  3 files changed, 180 insertions(+), 41 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 39cb0bb..825fd31 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1706,7 +1706,7 @@ static void ppc_spapr_init(MachineState *machine)
>      spapr->fdt_skel = spapr_create_fdt_skel(initrd_base, initrd_size,
>                                              kernel_size, kernel_le,
>                                              boot_device, kernel_cmdline,
> -                                            spapr->epow_irq);
> +                                            spapr->check_exception_irq);
>      assert(spapr->fdt_skel != NULL);
>  }
>  
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 1b6157d..c0be0e5 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -32,6 +32,8 @@
>  
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/spapr_vio.h"
> +#include "hw/pci/pci.h"
> +#include "hw/pci-host/spapr.h"
>  
>  #include <libfdt.h>
>  
> @@ -77,6 +79,7 @@ struct rtas_error_log {
>  #define   RTAS_LOG_TYPE_ECC_UNCORR              0x00000009
>  #define   RTAS_LOG_TYPE_ECC_CORR                0x0000000a
>  #define   RTAS_LOG_TYPE_EPOW                    0x00000040
> +#define   RTAS_LOG_TYPE_HOTPLUG                 0x000000e5
>      uint32_t extended_length;
>  } QEMU_PACKED;
>  
> @@ -166,6 +169,38 @@ struct epow_log_full {
>      struct rtas_event_log_v6_epow epow;
>  } QEMU_PACKED;
>  
> +struct rtas_event_log_v6_hp {
> +#define RTAS_LOG_V6_SECTION_ID_HOTPLUG              0x4850 /* HP */
> +    struct rtas_event_log_v6_section_header hdr;
> +    uint8_t hotplug_type;
> +#define RTAS_LOG_V6_HP_TYPE_CPU                          1
> +#define RTAS_LOG_V6_HP_TYPE_MEMORY                       2
> +#define RTAS_LOG_V6_HP_TYPE_SLOT                         3
> +#define RTAS_LOG_V6_HP_TYPE_PHB                          4
> +#define RTAS_LOG_V6_HP_TYPE_PCI                          5
> +    uint8_t hotplug_action;
> +#define RTAS_LOG_V6_HP_ACTION_ADD                        1
> +#define RTAS_LOG_V6_HP_ACTION_REMOVE                     2
> +    uint8_t hotplug_identifier;
> +#define RTAS_LOG_V6_HP_ID_DRC_NAME                       1
> +#define RTAS_LOG_V6_HP_ID_DRC_INDEX                      2
> +#define RTAS_LOG_V6_HP_ID_DRC_COUNT                      3
> +    uint8_t reserved;
> +    union {
> +        uint32_t index;
> +        uint32_t count;
> +        char name[1];
> +    } drc;
> +} QEMU_PACKED;
> +
> +struct hp_log_full {
> +    struct rtas_error_log hdr;
> +    struct rtas_event_log_v6 v6hdr;
> +    struct rtas_event_log_v6_maina maina;
> +    struct rtas_event_log_v6_mainb mainb;
> +    struct rtas_event_log_v6_hp hp;
> +} QEMU_PACKED;
> +
>  #define EVENT_MASK_INTERNAL_ERRORS           0x80000000
>  #define EVENT_MASK_EPOW                      0x40000000
>  #define EVENT_MASK_HOTPLUG                   0x10000000
> @@ -181,29 +216,61 @@ struct epow_log_full {
>          }                                                          \
>      } while (0)
>  
> -void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq)
> +void spapr_events_fdt_skel(void *fdt, uint32_t check_exception_irq)
>  {
> -    uint32_t epow_irq_ranges[] = {cpu_to_be32(epow_irq), cpu_to_be32(1)};
> -    uint32_t epow_interrupts[] = {cpu_to_be32(epow_irq), 0};
> +    uint32_t irq_ranges[] = {cpu_to_be32(check_exception_irq), cpu_to_be32(1)};
> +    uint32_t interrupts[] = {cpu_to_be32(check_exception_irq), 0};
>  
>      _FDT((fdt_begin_node(fdt, "event-sources")));
>  
>      _FDT((fdt_property(fdt, "interrupt-controller", NULL, 0)));
>      _FDT((fdt_property_cell(fdt, "#interrupt-cells", 2)));
>      _FDT((fdt_property(fdt, "interrupt-ranges",
> -                       epow_irq_ranges, sizeof(epow_irq_ranges))));
> +                       irq_ranges, sizeof(irq_ranges))));
>  
>      _FDT((fdt_begin_node(fdt, "epow-events")));
> -    _FDT((fdt_property(fdt, "interrupts",
> -                       epow_interrupts, sizeof(epow_interrupts))));
> +    _FDT((fdt_property(fdt, "interrupts", interrupts, sizeof(interrupts))));
>      _FDT((fdt_end_node(fdt)));
>  
>      _FDT((fdt_end_node(fdt)));
>  }
>  
>  static struct epow_log_full *pending_epow;
> +static struct hp_log_full *pending_hp;
>  static uint32_t next_plid;
>  
> +static void spapr_init_v6hdr(struct rtas_event_log_v6 *v6hdr)
> +{
> +    v6hdr->b0 = RTAS_LOG_V6_B0_VALID | RTAS_LOG_V6_B0_NEW_LOG
> +        | RTAS_LOG_V6_B0_BIGENDIAN;
> +    v6hdr->b2 = RTAS_LOG_V6_B2_POWERPC_FORMAT
> +        | RTAS_LOG_V6_B2_LOG_FORMAT_PLATFORM_EVENT;
> +    v6hdr->company = cpu_to_be32(RTAS_LOG_V6_COMPANY_IBM);
> +}
> +
> +static void spapr_init_maina(struct rtas_event_log_v6_maina *maina,
> +                             int section_count)
> +{
> +    struct tm tm;
> +    int year;
> +
> +    maina->hdr.section_id = cpu_to_be16(RTAS_LOG_V6_SECTION_ID_MAINA);
> +    maina->hdr.section_length = cpu_to_be16(sizeof(*maina));
> +    /* FIXME: section version, subtype and creator id? */
> +    qemu_get_timedate(&tm, spapr->rtc_offset);
> +    year = tm.tm_year + 1900;
> +    maina->creation_date = cpu_to_be32((to_bcd(year / 100) << 24)
> +                                       | (to_bcd(year % 100) << 16)
> +                                       | (to_bcd(tm.tm_mon + 1) << 8)
> +                                       | to_bcd(tm.tm_mday));
> +    maina->creation_time = cpu_to_be32((to_bcd(tm.tm_hour) << 24)
> +                                       | (to_bcd(tm.tm_min) << 16)
> +                                       | (to_bcd(tm.tm_sec) << 8));
> +    maina->creator_id = 'H'; /* Hypervisor */
> +    maina->section_count = section_count;
> +    maina->plid = next_plid++;
> +}
> +
>  static void spapr_powerdown_req(Notifier *n, void *opaque)
>  {
>      sPAPREnvironment *spapr = container_of(n, sPAPREnvironment, epow_notifier);
> @@ -212,8 +279,6 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
>      struct rtas_event_log_v6_maina *maina;
>      struct rtas_event_log_v6_mainb *mainb;
>      struct rtas_event_log_v6_epow *epow;
> -    struct tm tm;
> -    int year;
>  
>      if (pending_epow) {
>          /* For now, we just throw away earlier events if two come
> @@ -237,27 +302,8 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
>      hdr->extended_length = cpu_to_be32(sizeof(*pending_epow)
>                                         - sizeof(pending_epow->hdr));
>  
> -    v6hdr->b0 = RTAS_LOG_V6_B0_VALID | RTAS_LOG_V6_B0_NEW_LOG
> -        | RTAS_LOG_V6_B0_BIGENDIAN;
> -    v6hdr->b2 = RTAS_LOG_V6_B2_POWERPC_FORMAT
> -        | RTAS_LOG_V6_B2_LOG_FORMAT_PLATFORM_EVENT;
> -    v6hdr->company = cpu_to_be32(RTAS_LOG_V6_COMPANY_IBM);
> -
> -    maina->hdr.section_id = cpu_to_be16(RTAS_LOG_V6_SECTION_ID_MAINA);
> -    maina->hdr.section_length = cpu_to_be16(sizeof(*maina));
> -    /* FIXME: section version, subtype and creator id? */
> -    qemu_get_timedate(&tm, spapr->rtc_offset);
> -    year = tm.tm_year + 1900;
> -    maina->creation_date = cpu_to_be32((to_bcd(year / 100) << 24)
> -                                       | (to_bcd(year % 100) << 16)
> -                                       | (to_bcd(tm.tm_mon + 1) << 8)
> -                                       | to_bcd(tm.tm_mday));
> -    maina->creation_time = cpu_to_be32((to_bcd(tm.tm_hour) << 24)
> -                                       | (to_bcd(tm.tm_min) << 16)
> -                                       | (to_bcd(tm.tm_sec) << 8));
> -    maina->creator_id = 'H'; /* Hypervisor */
> -    maina->section_count = 3; /* Main-A, Main-B and EPOW */
> -    maina->plid = next_plid++;
> +    spapr_init_v6hdr(v6hdr);
> +    spapr_init_maina(maina, 3 /* Main-A, Main-B and EPOW */);
>  
>      mainb->hdr.section_id = cpu_to_be16(RTAS_LOG_V6_SECTION_ID_MAINB);
>      mainb->hdr.section_length = cpu_to_be16(sizeof(*mainb));
> @@ -274,7 +320,87 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
>      epow->event_modifier = RTAS_LOG_V6_EPOW_MODIFIER_NORMAL;
>      epow->extended_modifier = RTAS_LOG_V6_EPOW_XMODIFIER_PARTITION_SPECIFIC;
>  
> -    qemu_irq_pulse(xics_get_qirq(spapr->icp, spapr->epow_irq));
> +    qemu_irq_pulse(xics_get_qirq(spapr->icp, spapr->check_exception_irq));
> +}
> +
> +static void spapr_hotplug_req_event(uint8_t hp_type, uint8_t hp_action,
> +                                    sPAPRPHBState *phb, int slot)
> +{
> +    struct rtas_error_log *hdr;
> +    struct rtas_event_log_v6 *v6hdr;
> +    struct rtas_event_log_v6_maina *maina;
> +    struct rtas_event_log_v6_mainb *mainb;
> +    struct rtas_event_log_v6_hp *hp;
> +    sPAPRDrcEntry *drc_entry;
> +
> +    if (pending_hp) {
> +        /* Just toss any pending hotplug events for now, this will
> +         * need to be fixed later on.
> +         */
> +        g_free(pending_hp);
> +    }
> +
> +    pending_hp = g_malloc0(sizeof(*pending_hp));
> +    hdr = &pending_hp->hdr;
> +    v6hdr = &pending_hp->v6hdr;
> +    maina = &pending_hp->maina;
> +    mainb = &pending_hp->mainb;
> +    hp = &pending_hp->hp;
> +
> +    hdr->summary = cpu_to_be32(RTAS_LOG_VERSION_6
> +                               | RTAS_LOG_SEVERITY_EVENT
> +                               | RTAS_LOG_DISPOSITION_NOT_RECOVERED
> +                               | RTAS_LOG_OPTIONAL_PART_PRESENT
> +                               | RTAS_LOG_INITIATOR_HOTPLUG
> +                               | RTAS_LOG_TYPE_HOTPLUG);
> +    hdr->extended_length = cpu_to_be32(sizeof(*pending_hp)
> +                                       - sizeof(pending_hp->hdr));
> +
> +    spapr_init_v6hdr(v6hdr);
> +    spapr_init_maina(maina, 3 /* Main-A, Main-B, HP */);
> +
> +    mainb->hdr.section_id = cpu_to_be16(RTAS_LOG_V6_SECTION_ID_MAINB);
> +    mainb->hdr.section_length = cpu_to_be16(sizeof(*mainb));
> +    mainb->subsystem_id = 0x80; /* External environment */
> +    mainb->event_severity = 0x00; /* Informational / non-error */
> +    mainb->event_subtype = 0x00; /* Normal shutdown */
> +
> +    hp->hdr.section_id = cpu_to_be16(RTAS_LOG_V6_SECTION_ID_HOTPLUG);
> +    hp->hdr.section_length = cpu_to_be16(sizeof(*hp));
> +    hp->hdr.section_version = 1; /* includes extended modifier */
> +    hp->hotplug_action = hp_action;
> +
> +    hp->hotplug_type = hp_type;
> +
> +    drc_entry = spapr_phb_to_drc_entry(phb->buid);
> +    if (!drc_entry) {
> +        drc_entry = spapr_add_phb_to_drc_table(phb->buid, 2 /* Unusable */);
> +    }
> +
> +    switch (hp_type) {
> +    case RTAS_LOG_V6_HP_TYPE_PCI:
> +        hp->drc.index = drc_entry->child_entries[slot].drc_index;
> +        hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
> +        break;
> +    }
> +
> +    qemu_irq_pulse(xics_get_qirq(spapr->icp, spapr->check_exception_irq));
> +}
> +
> +void spapr_pci_hotplug_add_event(DeviceState *qdev, int slot)
> +{
> +    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
> +
> +    return spapr_hotplug_req_event(RTAS_LOG_V6_HP_TYPE_PCI,
> +                                   RTAS_LOG_V6_HP_ACTION_ADD, phb, slot);
> +}
> +
> +void spapr_pci_hotplug_remove_event(DeviceState *qdev, int slot)
> +{
> +    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
> +
> +    return spapr_hotplug_req_event(RTAS_LOG_V6_HP_TYPE_PCI,
> +                                   RTAS_LOG_V6_HP_ACTION_REMOVE, phb, slot);
>  }
>  
>  static void check_exception(PowerPCCPU *cpu, sPAPREnvironment *spapr,
> @@ -298,15 +424,26 @@ static void check_exception(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>          xinfo |= (uint64_t)rtas_ld(args, 6) << 32;
>      }
>  
> -    if ((mask & EVENT_MASK_EPOW) && pending_epow) {
> -        if (sizeof(*pending_epow) < len) {
> -            len = sizeof(*pending_epow);
> -        }
> +    if (mask & EVENT_MASK_EPOW) {
> +        if (pending_epow) {
> +            if (sizeof(*pending_epow) < len) {
> +                len = sizeof(*pending_epow);
> +            }
>  
> -        cpu_physical_memory_write(buf, pending_epow, len);
> -        g_free(pending_epow);
> -        pending_epow = NULL;
> -        rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +            cpu_physical_memory_write(buf, pending_epow, len);
> +            g_free(pending_epow);
> +            pending_epow = NULL;
> +            rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +        } else if (pending_hp) {
> +            if (sizeof(*pending_hp) < len) {
> +                len = sizeof(*pending_hp);
> +            }
> +
> +            cpu_physical_memory_write(buf, pending_hp, len);
> +            g_free(pending_hp);
> +            pending_hp = NULL;
> +            rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +        }
>      } else {
>          rtas_st(rets, 0, RTAS_OUT_NO_ERRORS_FOUND);
>      }
> @@ -314,7 +451,7 @@ static void check_exception(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>  
>  void spapr_events_init(sPAPREnvironment *spapr)
>  {
> -    spapr->epow_irq = xics_alloc(spapr->icp, 0, 0, false);
> +    spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false);
>      spapr->epow_notifier.notify = spapr_powerdown_req;
>      qemu_register_powerdown_notifier(&spapr->epow_notifier);
>      spapr_rtas_register(RTAS_CHECK_EXCEPTION, "check-exception",
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index e08dd2f..5382bf1 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -63,7 +63,7 @@ typedef struct sPAPREnvironment {
>      struct PPCTimebase tb;
>      bool has_graphics;
>  
> -    uint32_t epow_irq;
> +    uint32_t check_exception_irq;
>      Notifier epow_notifier;
>  
>      /* Migration state */
> @@ -521,5 +521,7 @@ int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
>  sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
>  sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
>  sPAPRDrcEntry *spapr_find_drc_entry(int drc_index);
> +void spapr_pci_hotplug_add_event(DeviceState *qdev, int slot);
> +void spapr_pci_hotplug_remove_event(DeviceState *qdev, int slot);
>  
>  #endif /* !defined (__HW_SPAPR_H__) */
> 


-- 
Alexey


* Re: [Qemu-devel] [PATCH 11/12] spapr_events: event-scan RTAS interface
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 11/12] spapr_events: event-scan RTAS interface Michael Roth
@ 2014-08-26  9:30   ` Alexey Kardashevskiy
  2014-08-29 18:43     ` Tyrel Datwyler
  0 siblings, 1 reply; 69+ messages in thread
From: Alexey Kardashevskiy @ 2014-08-26  9:30 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

On 08/19/2014 10:21 AM, Michael Roth wrote:
> From: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
> 
> We don't actually rely on this interface to surface hotplug events, and
> instead rely on the similar-but-interrupt-driven check-exception RTAS
> interface used for EPOW events. However, the existence of this interface
> is needed to ensure guest kernels initialize the event-reporting
> interfaces which will in turn be used by userspace tools to handle these
> events, so we implement this interface as a stub.
> 
> Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c         | 1 +
>  hw/ppc/spapr_events.c  | 9 +++++++++
>  include/hw/ppc/spapr.h | 2 ++
>  3 files changed, 12 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 825fd31..c65b13a 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -702,6 +702,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
>          refpoints, sizeof(refpoints))));
>  
>      _FDT((fdt_property_cell(fdt, "rtas-error-log-max", RTAS_ERROR_LOG_MAX)));
> +    _FDT((fdt_property_cell(fdt, "rtas-event-scan-rate", RTAS_EVENT_SCAN_RATE)));
>  
>      /*
>       * According to PAPR, rtas ibm,os-term, does not gaurantee a return
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index c0be0e5..bb80080 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -449,6 +449,14 @@ static void check_exception(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>      }
>  }
>  
> +static void event_scan(PowerPCCPU *cpu, sPAPREnvironment *spapr,
> +                            uint32_t token, uint32_t nargs,
> +                            target_ulong args,
> +                            uint32_t nret, target_ulong rets)
> +{
> +    rtas_st(rets, 0, 1); /* no error events found */


rtas_st(rets, 0, RTAS_OUT_NO_ERRORS_FOUND);
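
A minimal sketch of the stub written with a named status constant in place of
the bare literal. The constant values (0 for success, 1 for "no errors found")
and the demo_ helpers are assumptions made so the example compiles on its own;
in QEMU the store goes through rtas_st() and the constants live in
include/hw/ppc/spapr.h. The literal 1 in the quoted hunk lines up with the
"no errors found" status that check_exception already returns when nothing is
pending, not with 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, mirroring the RTAS status codes used by the patch. */
#define RTAS_OUT_SUCCESS          0
#define RTAS_OUT_NO_ERRORS_FOUND  1

/* Hypothetical stand-in for rtas_st(): store one 32-bit return word. */
static void demo_rtas_st(uint32_t *rets, int n, uint32_t val)
{
    assert(rets != NULL);
    rets[n] = val;
}

/* Stub event-scan: it never has an event to report, so it always
 * answers "no errors found". */
static void demo_event_scan(uint32_t *rets)
{
    demo_rtas_st(rets, 0, RTAS_OUT_NO_ERRORS_FOUND);
}
```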



> +}
> +
>  void spapr_events_init(sPAPREnvironment *spapr)
>  {
>      spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false);
> @@ -456,4 +464,5 @@ void spapr_events_init(sPAPREnvironment *spapr)
>      qemu_register_powerdown_notifier(&spapr->epow_notifier);
>      spapr_rtas_register(RTAS_CHECK_EXCEPTION, "check-exception",
>                          check_exception);
> +    spapr_rtas_register(RTAS_EVENT_SCAN, "event-scan", event_scan);
>  }
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 5382bf1..aab627f 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -484,6 +484,8 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
>  
>  #define RTAS_ERROR_LOG_MAX      2048
>  
> +#define RTAS_EVENT_SCAN_RATE    1

1 second? 1ms? 1 minute? :) Worth mentioning in the commit log.


> +
>  typedef struct sPAPRTCETable sPAPRTCETable;
>  
>  #define TYPE_SPAPR_TCE_TABLE "spapr-tce-table"
> 


-- 
Alexey


* Re: [Qemu-devel] [PATCH 12/12] spapr_pci: emit hotplug add/remove events during hotplug
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 12/12] spapr_pci: emit hotplug add/remove events during hotplug Michael Roth
@ 2014-08-26  9:35   ` Alexey Kardashevskiy
  2014-08-26 12:36   ` Alexander Graf
  1 sibling, 0 replies; 69+ messages in thread
From: Alexey Kardashevskiy @ 2014-08-26  9:35 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

On 08/19/2014 10:21 AM, Michael Roth wrote:
> From: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
> 
> This uses an extension of the existing EPOW interrupt/event mechanism
> to notify userspace tools like librtas/drmgr so they can handle
> in-guest configuration/cleanup operations in response to
> device_add/device_del.
> 
> Userspace tools that don't implement this extension will need
> to be run manually after device_add and before device_del,
> respectively.
> 
> Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr_pci.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 23864ab..17d37c0 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -944,6 +944,18 @@ static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev)
>      drc_entry_slot->state &= ~(uint32_t)INDICATOR_ENTITY_SENSE_MASK;
>      drc_entry_slot->state |= encoded; /* and the slot */
>  
> +    /* reliable unplug requires we wait for a transition from
> +     * UNISOLATED->ISOLATED prior to device removal/deletion.
> +     * However, slots populated by devices at boot-time will not
> +     * have ever been set by guest tools to an UNISOLATED/populated
> +     * state, so set this manually in the case of coldplug devices
> +     */
> +    if (!DEVICE(dev)->hotplugged) {
> +        drc_entry_slot->state |= ENCODE_DRC_STATE(1,
> +                                                  INDICATOR_ISOLATION_MASK,
> +                                                  INDICATOR_ISOLATION_SHIFT);
> +    }
> +
>      /* add OF node for pci device and required OF DT properties */
>      fdt_orig = g_malloc0(FDT_MAX_SIZE);
>      offset = fdt_create(fdt_orig, FDT_MAX_SIZE);
> @@ -1026,13 +1038,21 @@ static void spapr_device_hotplug_remove(DeviceState *qdev, PCIDevice *dev)
>  static void spapr_phb_hot_plug(HotplugHandler *plug_handler,
>                                 DeviceState *plugged_dev, Error **errp)
>  {
> +    int slot = PCI_SLOT(PCI_DEVICE(plugged_dev)->devfn);
> +
>      spapr_device_hotplug_add(DEVICE(plug_handler), PCI_DEVICE(plugged_dev));
> +    if (plugged_dev->hotplugged) {
> +        spapr_pci_hotplug_add_event(DEVICE(plug_handler), slot);
> +    }


Minor comment here :)
@slot is a temporary variable and it is used only once. OK, maybe it improves
readability, but then it makes sense to have local variables for
PCI_DEVICE(plugged_dev) and DEVICE(plug_handler) too, as they are both
computed twice. But I do not insist :)
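
A sketch of that variant, with each converted pointer held in a local and
reused. The struct types and the demo_ helpers are simplified stand-ins
invented so the example is self-contained; only the PCI_SLOT() bit layout
matches the real macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same bit layout as the PCI devfn encoding: slot number in bits 7:3. */
#define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)

/* Hypothetical stand-in for the plugged PCI device. */
typedef struct DemoDevice {
    bool hotplugged;
    int devfn;
} DemoDevice;

/* Hypothetical stand-in for the PHB; records what the helpers saw. */
typedef struct DemoPHB {
    int last_added_slot;   /* slot of the last device added, -1 if none */
    int last_event_slot;   /* slot of the last event raised, -1 if none */
} DemoPHB;

static void demo_hotplug_add(DemoPHB *phb, DemoDevice *dev)
{
    phb->last_added_slot = PCI_SLOT(dev->devfn);
}

static void demo_hotplug_add_event(DemoPHB *phb, int slot)
{
    phb->last_event_slot = slot;
}

static void demo_phb_hot_plug(void *plug_handler, void *plugged_dev)
{
    DemoPHB *phb = plug_handler;     /* stands in for DEVICE(plug_handler) */
    DemoDevice *pdev = plugged_dev;  /* stands in for PCI_DEVICE(plugged_dev) */
    int slot;

    assert(phb != NULL && pdev != NULL);
    slot = PCI_SLOT(pdev->devfn);

    demo_hotplug_add(phb, pdev);
    if (pdev->hotplugged) {
        /* only devices added at runtime raise an event */
        demo_hotplug_add_event(phb, slot);
    }
}
```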


>  }
>  
>  static void spapr_phb_hot_unplug(HotplugHandler *plug_handler,
>                                   DeviceState *plugged_dev, Error **errp)
>  {
> +    int slot = PCI_SLOT(PCI_DEVICE(plugged_dev)->devfn);
> +
>      spapr_device_hotplug_remove(DEVICE(plug_handler), PCI_DEVICE(plugged_dev));
> +    spapr_pci_hotplug_remove_event(DEVICE(plug_handler), slot);
>  }

Same here.

>  
>  static void spapr_phb_realize(DeviceState *dev, Error **errp)
> 


-- 
Alexey


* Re: [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations Michael Roth
@ 2014-08-26  9:40   ` Alexey Kardashevskiy
  2014-08-26 12:30   ` Alexander Graf
  2014-09-03 10:33   ` Bharata B Rao
  2 siblings, 0 replies; 69+ messages in thread
From: Alexey Kardashevskiy @ 2014-08-26  9:40 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

On 08/19/2014 10:21 AM, Michael Roth wrote:
> This enables hotplug for PHB bridges. Upon hotplug we generate the
> OF-nodes required by PAPR specification and IEEE 1275-1994
> "PCI Bus Binding to Open Firmware" for the device.
> 
> We associate the corresponding FDT for these nodes with the DrcEntry
> corresponding to the slot, which will be fetched via
> ibm,configure-connector RTAS calls by the guest as described by PAPR
> specification. The FDT is cleaned up in the case of unplug.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr_pci.c     | 235 +++++++++++++++++++++++++++++++++++++++++++++++--
>  include/hw/ppc/spapr.h |   1 +
>  2 files changed, 228 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 96a57be..23864ab 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -87,6 +87,17 @@
>  #define ENCODE_DRC_STATE(val, m, s) \
>      (((uint32_t)(val) << (s)) & (uint32_t)(m))
>  
> +#define FDT_MAX_SIZE            0x10000
> +#define _FDT(exp) \
> +    do { \
> +        int ret = (exp);                                           \
> +        if (ret < 0) {                                             \
> +            return ret;                                            \
> +        }                                                          \
> +    } while (0)
> +
> +static void spapr_drc_state_reset(sPAPRDrcEntry *drc_entry);
> +
>  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
>  {
>      sPAPRPHBState *sphb;
> @@ -476,6 +487,22 @@ static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>          /* encode the new value into the correct bit field */
>          shift = INDICATOR_ISOLATION_SHIFT;
>          mask = INDICATOR_ISOLATION_MASK;
> +        if (drc_entry) {
> +            /* transition from unisolated to isolated for a hotplug slot
> +             * entails completion of guest-side device unplug/cleanup, so
> +             * we can now safely remove the device if qemu is waiting for
> +             * it to be released
> +             */
> +            if (DECODE_DRC_STATE(*pind, mask, shift) != indicator_state) {
> +                if (indicator_state == 0 && drc_entry->awaiting_release) {
> +                    /* device_del has been called and host is waiting
> +                     * for guest to release/isolate device, go ahead
> +                     * and remove it now
> +                     */
> +                    spapr_drc_state_reset(drc_entry);
> +                }
> +            }
> +        }
>          break;
>      case 9002: /* DR */
>          shift = INDICATOR_DR_SHIFT;
> @@ -816,6 +843,198 @@ static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>      return &phb->iommu_as;
>  }
>  
> +static int spapr_populate_pci_child_dt(PCIDevice *dev, void *fdt, int offset,
> +                                       int phb_index)
> +{
> +    int slot = PCI_SLOT(dev->devfn);
> +    char slotname[16];
> +    bool is_bridge = 1;
> +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> +    uint32_t reg[RESOURCE_CELLS_TOTAL * 8] = { 0 };
> +    uint32_t assigned[RESOURCE_CELLS_TOTAL * 8] = { 0 };
> +    int reg_size, assigned_size;
> +
> +    drc_entry = spapr_phb_to_drc_entry(phb_index + SPAPR_PCI_BASE_BUID);
> +    g_assert(drc_entry);
> +    drc_entry_slot = &drc_entry->child_entries[slot];
> +
> +    if (pci_default_read_config(dev, PCI_HEADER_TYPE, 1) ==
> +        PCI_HEADER_TYPE_NORMAL) {
> +        is_bridge = 0;
> +    }
> +
> +    _FDT(fdt_setprop_cell(fdt, offset, "vendor-id",
> +                          pci_default_read_config(dev, PCI_VENDOR_ID, 2)));
> +    _FDT(fdt_setprop_cell(fdt, offset, "device-id",
> +                          pci_default_read_config(dev, PCI_DEVICE_ID, 2)));
> +    _FDT(fdt_setprop_cell(fdt, offset, "revision-id",
> +                          pci_default_read_config(dev, PCI_REVISION_ID, 1)));
> +    _FDT(fdt_setprop_cell(fdt, offset, "class-code",
> +                          pci_default_read_config(dev, PCI_CLASS_DEVICE, 2) << 8));
> +
> +    _FDT(fdt_setprop_cell(fdt, offset, "interrupts",
> +                          pci_default_read_config(dev, PCI_INTERRUPT_PIN, 1)));
> +
> +    /* if this device is NOT a bridge */
> +    if (!is_bridge) {
> +        _FDT(fdt_setprop_cell(fdt, offset, "min-grant",
> +            pci_default_read_config(dev, PCI_MIN_GNT, 1)));
> +        _FDT(fdt_setprop_cell(fdt, offset, "max-latency",
> +            pci_default_read_config(dev, PCI_MAX_LAT, 1)));
> +        _FDT(fdt_setprop_cell(fdt, offset, "subsystem-id",
> +            pci_default_read_config(dev, PCI_SUBSYSTEM_ID, 2)));
> +        _FDT(fdt_setprop_cell(fdt, offset, "subsystem-vendor-id",
> +            pci_default_read_config(dev, PCI_SUBSYSTEM_VENDOR_ID, 2)));
> +    }
> +
> +    _FDT(fdt_setprop_cell(fdt, offset, "cache-line-size",
> +        pci_default_read_config(dev, PCI_CACHE_LINE_SIZE, 1)));
> +
> +    /* the following fdt cells are masked off the pci status register */
> +    int pci_status = pci_default_read_config(dev, PCI_STATUS, 2);
> +    _FDT(fdt_setprop_cell(fdt, offset, "devsel-speed",
> +                          PCI_STATUS_DEVSEL_MASK & pci_status));
> +    _FDT(fdt_setprop_cell(fdt, offset, "fast-back-to-back",
> +                          PCI_STATUS_FAST_BACK & pci_status));
> +    _FDT(fdt_setprop_cell(fdt, offset, "66mhz-capable",
> +                          PCI_STATUS_66MHZ & pci_status));
> +    _FDT(fdt_setprop_cell(fdt, offset, "udf-supported",
> +                          PCI_STATUS_UDF & pci_status));
> +
> +    _FDT(fdt_setprop_string(fdt, offset, "name", "pci"));
> +    sprintf(slotname, "Slot %d", slot + phb_index * 32);
> +    _FDT(fdt_setprop(fdt, offset, "ibm,loc-code", slotname, strlen(slotname)));
> +    _FDT(fdt_setprop_cell(fdt, offset, "ibm,my-drc-index",
> +                          drc_entry_slot->drc_index));
> +
> +    _FDT(fdt_setprop_cell(fdt, offset, "#address-cells",
> +                          RESOURCE_CELLS_ADDRESS));
> +    _FDT(fdt_setprop_cell(fdt, offset, "#size-cells",
> +                          RESOURCE_CELLS_SIZE));
> +    _FDT(fdt_setprop_cell(fdt, offset, "ibm,req#msi-x",
> +                          RESOURCE_CELLS_SIZE));
> +    fill_resource_props(dev, phb_index, reg, &reg_size,
> +                        assigned, &assigned_size);
> +    _FDT(fdt_setprop(fdt, offset, "reg", reg, reg_size));
> +    _FDT(fdt_setprop(fdt, offset, "assigned-addresses",
> +                     assigned, assigned_size));
> +
> +    return 0;
> +}
> +
> +static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev)


Minor comment here too :)

Will this function ever support anything but sPAPRPHBState as a bus device?
It is not a callback and could receive what it actually needs - sPAPRPHBState.
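
To illustrate, a sketch in which only the generic callback handles the opaque
handler pointer and the helper takes the concrete type it works on.
DemoPHBState, DemoPCIDevice and both functions are hypothetical stand-ins,
not QEMU's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for sPAPRPHBState. */
typedef struct DemoPHBState {
    uint64_t buid;
    uint32_t present_slots;   /* bit n is set when slot n is populated */
} DemoPHBState;

/* Hypothetical stand-in for PCIDevice. */
typedef struct DemoPCIDevice {
    int devfn;
} DemoPCIDevice;

/* The helper receives the PHB state directly; no conversion inside. */
static int demo_device_hotplug_add(DemoPHBState *phb, DemoPCIDevice *dev)
{
    int slot = (dev->devfn >> 3) & 0x1f;

    phb->present_slots |= UINT32_C(1) << slot;
    return 0;
}

/* The callback converts once, then hands over the typed pointers. */
static void demo_hot_plug_cb(void *plug_handler, void *plugged_dev)
{
    DemoPHBState *phb = plug_handler;
    DemoPCIDevice *dev = plugged_dev;

    assert(phb != NULL && dev != NULL);
    demo_device_hotplug_add(phb, dev);
}
```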



> +{
> +    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
> +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> +    sPAPRConfigureConnectorState *ccs;
> +    int slot = PCI_SLOT(dev->devfn);
> +    int offset, ret;
> +    void *fdt_orig, *fdt;
> +    char nodename[512];
> +    uint32_t encoded = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_PRESENT,
> +                                        INDICATOR_ENTITY_SENSE_MASK,
> +                                        INDICATOR_ENTITY_SENSE_SHIFT);
> +
> +    drc_entry = spapr_phb_to_drc_entry(phb->buid);
> +    g_assert(drc_entry);
> +    drc_entry_slot = &drc_entry->child_entries[slot];
> +
> +    drc_entry->state &= ~(uint32_t)INDICATOR_ENTITY_SENSE_MASK;
> +    drc_entry->state |= encoded; /* DR entity present */
> +    drc_entry_slot->state &= ~(uint32_t)INDICATOR_ENTITY_SENSE_MASK;
> +    drc_entry_slot->state |= encoded; /* and the slot */
> +
> +    /* add OF node for pci device and required OF DT properties */
> +    fdt_orig = g_malloc0(FDT_MAX_SIZE);
> +    offset = fdt_create(fdt_orig, FDT_MAX_SIZE);
> +    fdt_begin_node(fdt_orig, "");
> +    fdt_end_node(fdt_orig);
> +    fdt_finish(fdt_orig);
> +
> +    fdt = g_malloc0(FDT_MAX_SIZE);
> +    fdt_open_into(fdt_orig, fdt, FDT_MAX_SIZE);
> +    sprintf(nodename, "pci@%d", slot);
> +    offset = fdt_add_subnode(fdt, 0, nodename);
> +    ret = spapr_populate_pci_child_dt(dev, fdt, offset, phb->index);
> +    g_assert(!ret);
> +    g_free(fdt_orig);
> +
> +    /* hold on to node, configure_connector will pass it to the guest later */
> +    ccs = &drc_entry_slot->cc_state;
> +    ccs->fdt = fdt;
> +    ccs->offset_start = offset;
> +    ccs->state = CC_STATE_PENDING;
> +    ccs->dev = dev;
> +
> +    return 0;
> +}
> +
> +/* check whether guest has released/isolated device */
> +static bool spapr_drc_state_is_releasable(sPAPRDrcEntry *drc_entry)
> +{
> +    return !DECODE_DRC_STATE(drc_entry->state,
> +                             INDICATOR_ISOLATION_MASK,
> +                             INDICATOR_ISOLATION_SHIFT);
> +}
> +
> +/* finalize device unplug/deletion */
> +static void spapr_drc_state_reset(sPAPRDrcEntry *drc_entry)
> +{
> +    sPAPRConfigureConnectorState *ccs = &drc_entry->cc_state;
> +    uint32_t sense_empty = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_EMPTY,
> +                                            INDICATOR_ENTITY_SENSE_MASK,
> +                                            INDICATOR_ENTITY_SENSE_SHIFT);
> +
> +    g_free(ccs->fdt);
> +    ccs->fdt = NULL;
> +    object_unparent(OBJECT(ccs->dev));
> +    ccs->dev = NULL;
> +    ccs->state = CC_STATE_IDLE;
> +    drc_entry->state &= ~INDICATOR_ENTITY_SENSE_MASK;
> +    drc_entry->state |= sense_empty;
> +    drc_entry->awaiting_release = false;
> +}
> +
> +static void spapr_device_hotplug_remove(DeviceState *qdev, PCIDevice *dev)
> +{
> +    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
> +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> +    sPAPRConfigureConnectorState *ccs;
> +    int slot = PCI_SLOT(dev->devfn);
> +
> +    drc_entry = spapr_phb_to_drc_entry(phb->buid);
> +    g_assert(drc_entry);
> +    drc_entry_slot = &drc_entry->child_entries[slot];
> +    ccs = &drc_entry_slot->cc_state;
> +    /* shouldn't be removing devices we haven't created an fdt for */
> +    g_assert(ccs->state != CC_STATE_IDLE);
> +    /* if the device has already been released/isolated by guest, go ahead
> +     * and remove it now. Otherwise, flag it as pending guest release so it
> +     * can be removed later
> +     */
> +    if (spapr_drc_state_is_releasable(drc_entry_slot)) {
> +        spapr_drc_state_reset(drc_entry_slot);
> +    } else {
> +        if (drc_entry_slot->awaiting_release) {
> +            fprintf(stderr, "waiting for guest to release the device");
> +        } else {
> +            drc_entry_slot->awaiting_release = true;
> +        }
> +    }
> +}
> +
> +static void spapr_phb_hot_plug(HotplugHandler *plug_handler,
> +                               DeviceState *plugged_dev, Error **errp)
> +{
> +    spapr_device_hotplug_add(DEVICE(plug_handler), PCI_DEVICE(plugged_dev));
> +}
> +
> +static void spapr_phb_hot_unplug(HotplugHandler *plug_handler,
> +                                 DeviceState *plugged_dev, Error **errp)
> +{
> +    spapr_device_hotplug_remove(DEVICE(plug_handler), PCI_DEVICE(plugged_dev));
> +}
> +
>  static void spapr_phb_realize(DeviceState *dev, Error **errp)
>  {
>      SysBusDevice *s = SYS_BUS_DEVICE(dev);
> @@ -903,6 +1122,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>                             &sphb->memspace, &sphb->iospace,
>                             PCI_DEVFN(0, 0), PCI_NUM_PINS, TYPE_PCI_BUS);
>      phb->bus = bus;
> +    qbus_set_hotplug_handler(BUS(phb->bus), DEVICE(sphb), NULL);
>  
>      /*
>       * Initialize PHB address space.
> @@ -1108,6 +1328,7 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data)
>      PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass);
>      DeviceClass *dc = DEVICE_CLASS(klass);
>      sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_CLASS(klass);
> +    HotplugHandlerClass *hp = HOTPLUG_HANDLER_CLASS(klass);
>  
>      hc->root_bus_path = spapr_phb_root_bus_path;
>      dc->realize = spapr_phb_realize;
> @@ -1117,6 +1338,8 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data)
>      set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
>      dc->cannot_instantiate_with_device_add_yet = false;
>      spc->finish_realize = spapr_phb_finish_realize;
> +    hp->plug = spapr_phb_hot_plug;
> +    hp->unplug = spapr_phb_hot_unplug;
>  }
>  
>  static const TypeInfo spapr_phb_info = {
> @@ -1125,6 +1348,10 @@ static const TypeInfo spapr_phb_info = {
>      .instance_size = sizeof(sPAPRPHBState),
>      .class_init    = spapr_phb_class_init,
>      .class_size    = sizeof(sPAPRPHBClass),
> +    .interfaces    = (InterfaceInfo[]) {
> +        { TYPE_HOTPLUG_HANDLER },
> +        { }
> +    }
>  };
>  
>  PCIHostState *spapr_create_phb(sPAPREnvironment *spapr, int index)
> @@ -1304,14 +1531,6 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb,
>          return bus_off;
>      }
>  
> -#define _FDT(exp) \
> -    do { \
> -        int ret = (exp);                                           \
> -        if (ret < 0) {                                             \
> -            return ret;                                            \
> -        }                                                          \
> -    } while (0)
> -
>      /* Write PHB properties */
>      _FDT(fdt_setprop_string(fdt, bus_off, "device_type", "pci"));
>      _FDT(fdt_setprop_string(fdt, bus_off, "compatible", "IBM,Logical_PHB"));
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index fac85f8..e08dd2f 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -36,6 +36,7 @@ struct sPAPRDrcEntry {
>      void *fdt;
>      int fdt_offset;
>      uint32_t state;
> +    bool awaiting_release;
>      sPAPRConfigureConnectorState cc_state;
>      sPAPRDrcEntry *child_entries;
>  };
> 


-- 
Alexey

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node Michael Roth
  2014-08-26  7:55   ` Alexey Kardashevskiy
@ 2014-08-26 11:11   ` Alexander Graf
  2014-08-26 16:47     ` Michael Roth
  2014-09-03  5:55   ` Bharata B Rao
  2014-09-05 22:00   ` Tyrel Datwyler
  3 siblings, 1 reply; 69+ messages in thread
From: Alexander Graf @ 2014-08-26 11:11 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: aik, ncmike, qemu-ppc, tyreld, nfont



On 19.08.14 02:21, Michael Roth wrote:
> From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> 
> This adds entries to the root OF node to advertise our PHBs as being
> DR-capable in accordance with the PAPR specification.
> 
> Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
> and associated with a power domain of -1 (indicating to guests that
> power management is handled automatically by hardware).
> 
> We currently allocate entries for up to 32 DR-capable PHBs, though
> this limit can be increased later.
> 
> DrcEntry objects to track the state of the DR-connector associated
> with each PHB are stored in a 32-entry array, and each DrcEntry
> in turn has a dynamically-sized number of child DR-connectors,
> which we will use later to track the state of DR-connectors
> associated with a PHB's physical slots.
> 
> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_pci.c     |   1 +
>  include/hw/ppc/spapr.h |  35 ++++++++++++
>  3 files changed, 179 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 5c92707..d5e46c3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
>      return ram_size;
>  }
>  
> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
> +{
> +    int i;
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        if (spapr->drc_table[i].phb_buid == buid) {
> +            return &spapr->drc_table[i];
> +        }
> +     }
> +
> +     return NULL;
> +}
> +
> +static void spapr_init_drc_table(void)
> +{
> +    int i;
> +
> +    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
> +
> +    /* For now we only care about PHB entries */
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        spapr->drc_table[i].drc_index = 0x2000001 + i;

magic number?

> +    }
> +}
> +
> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
> +{
> +    sPAPRDrcEntry *empty_drc = NULL;
> +    sPAPRDrcEntry *found_drc = NULL;
> +    int i, phb_index;
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        if (spapr->drc_table[i].phb_buid == 0) {
> +            empty_drc = &spapr->drc_table[i];
> +        }
> +
> +        if (spapr->drc_table[i].phb_buid == buid) {
> +            found_drc = &spapr->drc_table[i];
> +            break;
> +        }
> +    }
> +
> +    if (found_drc) {
> +        return found_drc;
> +    }
> +
> +    if (empty_drc) {
> +        empty_drc->phb_buid = buid;
> +        empty_drc->state = state;
> +        empty_drc->cc_state.fdt = NULL;
> +        empty_drc->cc_state.offset = 0;
> +        empty_drc->cc_state.depth = 0;
> +        empty_drc->cc_state.state = CC_STATE_IDLE;
> +        empty_drc->child_entries =
> +            g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
> +        phb_index = buid - SPAPR_PCI_BASE_BUID;
> +        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +            empty_drc->child_entries[i].drc_index =
> +                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
> +        }
> +        return empty_drc;
> +    }
> +
> +    return NULL;
> +}
> +
> +static void spapr_create_drc_dt_entries(void *fdt)
> +{
> +    char char_buf[1024];
> +    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
> +    uint32_t *entries;
> +    int offset, fdt_offset;
> +    int i, ret;
> +
> +    fdt_offset = fdt_path_offset(fdt, "/");
> +
> +    /* ibm,drc-indexes */
> +    memset(int_buf, 0, sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> +
> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> +        int_buf[i] = spapr->drc_table[i-1].drc_index;

Not endian safe.
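
For illustration only, a minimal sketch of an endian-safe fill, not the
actual fix for this patch. Device tree cells are big-endian, so each value
is converted before it is stored in the property buffer. The names
to_be32() and fill_drc_indexes() are hypothetical; inside QEMU,
cpu_to_be32() would take the place of the local helper.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for cpu_to_be32(): returns a value whose in-memory
 * byte layout is big-endian regardless of host endianness. */
static uint32_t to_be32(uint32_t v)
{
    uint8_t b[4] = {
        (uint8_t)(v >> 24), (uint8_t)(v >> 16), (uint8_t)(v >> 8), (uint8_t)v
    };
    uint32_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/* Cell 0 holds the entry count, cells 1..n hold the DRC indexes.
 * buf must have room for n + 1 cells. */
static void fill_drc_indexes(uint32_t *buf, const uint32_t *indexes,
                             uint32_t n)
{
    uint32_t i;

    buf[0] = to_be32(n);
    for (i = 0; i < n; i++) {
        buf[i + 1] = to_be32(indexes[i]);
    }
}
```

The same conversion would apply to the count cell written through the
char_buf pointer further down.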

> +    }
> +
> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
> +    }
> +
> +    /* ibm,drc-power-domains */
> +    memset(int_buf, 0, sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;

Not endian safe.

> +
> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> +        int_buf[i] = 0xffffffff;
> +    }

memset(-1) instead above?
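
A sketch of what that could look like, assuming the suggestion is to fill
the buffer with all-ones bytes up front. The function name is hypothetical.
A run of 0xff bytes reads as 0xffffffff in every cell on any host, so only
the count cell needs a byte-order conversion.

```c
#include <stdint.h>
#include <string.h>

/* Fill n power-domain cells with -1 and put the big-endian entry count
 * in cell 0. buf must have room for n + 1 cells. */
static void fill_power_domains(uint32_t *buf, uint32_t n)
{
    uint8_t count_be[4] = {
        (uint8_t)(n >> 24), (uint8_t)(n >> 16), (uint8_t)(n >> 8), (uint8_t)n
    };

    /* every byte becomes 0xff, so every cell reads as 0xffffffff */
    memset(buf, 0xff, ((size_t)n + 1) * sizeof(buf[0]));
    /* overwrite cell 0 with the count in big-endian byte order */
    memcpy(&buf[0], count_be, sizeof(count_be));
}
```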

> +
> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains property\n");
> +    }
> +
> +    /* ibm,drc-names */
> +    memset(char_buf, 0, sizeof(char_buf));
> +    entries = (uint32_t *)&char_buf[0];
> +    *entries = SPAPR_DRC_TABLE_SIZE;

Not endian safe. I guess you get the idea. I'll stop looking for endian
problems here :).

> +    offset = sizeof(*entries);
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
> +        char_buf[offset++] = '\0';
> +    }
> +
> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
> +    if (ret) {
> +        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
> +    }
> +
> +    /* ibm,drc-types */
> +    memset(char_buf, 0, sizeof(char_buf));
> +    entries = (uint32_t *)&char_buf[0];
> +    *entries = SPAPR_DRC_TABLE_SIZE;
> +    offset = sizeof(*entries);
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        offset += sprintf(char_buf + offset, "PHB");
> +        char_buf[offset++] = '\0';
> +    }
> +
> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, offset);
> +    if (ret) {
> +        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
> +    }
> +}
> +
>  #define _FDT(exp) \
>      do { \
>          int ret = (exp);                                           \
> @@ -731,6 +868,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>      char *bootlist;
>      void *fdt;
>      sPAPRPHBState *phb;
> +    sPAPRDrcEntry *drc_entry;
>  
>      fdt = g_malloc(FDT_MAX_SIZE);
>  
> @@ -750,6 +888,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>      }
>  
>      QLIST_FOREACH(phb, &spapr->phbs, list) {
> +        drc_entry = spapr_phb_to_drc_entry(phb->buid);
> +        g_assert(drc_entry);
>          ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
>      }
>  
> @@ -789,6 +929,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
>      }
>  
> +    spapr_create_drc_dt_entries(fdt);

I would really prefer that we always pass spapr as a function
parameter rather than use the global.
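
Roughly what I mean, as a self-contained sketch with simplified,
hypothetical types (Env and DrcEntry stand in for sPAPREnvironment and
sPAPRDrcEntry): the lookup takes the environment as an argument and never
touches a global.

```c
#include <stddef.h>
#include <stdint.h>

#define DRC_TABLE_SIZE 32   /* mirrors SPAPR_DRC_TABLE_SIZE in the patch */

typedef struct DrcEntry {
    uint32_t drc_index;
    uint64_t phb_buid;
} DrcEntry;

typedef struct Env {
    DrcEntry drc_table[DRC_TABLE_SIZE];
} Env;

/* Look up the DRC entry for a PHB in the table owned by env.
 * Returns NULL if no entry carries this buid. */
static DrcEntry *phb_to_drc_entry(Env *env, uint64_t buid)
{
    size_t i;

    for (i = 0; i < DRC_TABLE_SIZE; i++) {
        if (env->drc_table[i].phb_buid == buid) {
            return &env->drc_table[i];
        }
    }
    return NULL;
}
```

Callers such as spapr_finalize_fdt() already have the pointer at hand, so
they could pass it straight through.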

> +
>      _FDT((fdt_pack(fdt)));
>  
>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> @@ -1443,6 +1585,7 @@ static void ppc_spapr_init(MachineState *machine)
>      spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW);
>      spapr_pci_rtas_init();
>  
> +    spapr_init_drc_table();
>      phb = spapr_create_phb(spapr, 0);
>  
>      for (i = 0; i < nb_nics; i++) {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 9ed39a9..e85134f 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -531,6 +531,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>              + sphb->index * SPAPR_PCI_WINDOW_SPACING;
>          sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
>          sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
> +        spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
>      }
>  
>      if (sphb->buid == -1) {
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 36e8e51..c93794b 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -10,6 +10,36 @@ struct sPAPRNVRAM;
>  
>  #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
>  
> +/* For dlparable/hotpluggable slots */
> +#define SPAPR_DRC_TABLE_SIZE    32

Can we make this dynamic so that we can set it to 0 for pseries-2.0 (if
necessary) or have an easy tunable to extend the list later?
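
One possible shape for this, purely as a sketch with hypothetical names:
the table is allocated at init time from a size the machine type chooses,
and a size of 0 simply leaves it empty. The base index is passed in as a
named value instead of being hard-coded.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct DrcEntry {
    uint32_t drc_index;
    uint64_t phb_buid;
} DrcEntry;

typedef struct DrcTable {
    DrcEntry *entries;
    size_t size;
} DrcTable;

/* Allocate a table of `size` entries with consecutive indexes starting
 * at base_index. Returns 0 on success, -1 on allocation failure.
 * A size of 0 is valid and yields an empty table. */
static int drc_table_init(DrcTable *t, size_t size, uint32_t base_index)
{
    size_t i;

    t->entries = NULL;
    t->size = 0;
    if (size == 0) {
        return 0;
    }
    t->entries = calloc(size, sizeof(*t->entries));
    if (!t->entries) {
        return -1;
    }
    for (i = 0; i < size; i++) {
        t->entries[i].drc_index = base_index + (uint32_t)i;
    }
    t->size = size;
    return 0;
}
```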


Alex

> +#define SPAPR_DRC_PHB_SLOT_MAX  32
> +#define SPAPR_DRC_DEV_ID_BASE   0x40000000
> +
> +typedef struct sPAPRConfigureConnectorState {
> +    void *fdt;
> +    int offset_start;
> +    int offset;
> +    int depth;
> +    PCIDevice *dev;
> +    enum {
> +        CC_STATE_IDLE = 0,
> +        CC_STATE_PENDING = 1,
> +        CC_STATE_ACTIVE,
> +    } state;
> +} sPAPRConfigureConnectorState;
> +
> +typedef struct sPAPRDrcEntry sPAPRDrcEntry;
> +
> +struct sPAPRDrcEntry {
> +    uint32_t drc_index;
> +    uint64_t phb_buid;
> +    void *fdt;
> +    int fdt_offset;
> +    uint32_t state;
> +    sPAPRConfigureConnectorState cc_state;
> +    sPAPRDrcEntry *child_entries;
> +};
> +
>  typedef struct sPAPREnvironment {
>      struct VIOsPAPRBus *vio_bus;
>      QLIST_HEAD(, sPAPRPHBState) phbs;
> @@ -39,6 +69,9 @@ typedef struct sPAPREnvironment {
>      int htab_save_index;
>      bool htab_first_pass;
>      int htab_fd;
> +
> +    /* state for Dynamic Reconfiguration Connectors */
> +    sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
>  } sPAPREnvironment;
>  
>  #define H_SUCCESS         0
> @@ -481,5 +514,7 @@ int spapr_dma_dt(void *fdt, int node_off, const char *propname,
>                   uint32_t liobn, uint64_t window, uint32_t size);
>  int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
>                        sPAPRTCETable *tcet);
> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
>  
>  #endif /* !defined (__HW_SPAPR_H__) */
> 


* Re: [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs Michael Roth
  2014-08-26  8:32   ` Alexey Kardashevskiy
  2014-08-26  9:09   ` Alexey Kardashevskiy
@ 2014-08-26 11:29   ` Alexander Graf
  2014-08-26 18:30     ` Michael Roth
  2 siblings, 1 reply; 69+ messages in thread
From: Alexander Graf @ 2014-08-26 11:29 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: aik, ncmike, qemu-ppc, tyreld, nfont



On 19.08.14 02:21, Michael Roth wrote:
> Reserve 32 entries of type PCI in each PHB's initial FDT. This
> advertises to guests that each PHB is a DR-capable device with
> physical hotpluggable slots. This is necessary for allowing
> hotplugging of devices to it later via bus rescan or guest rpaphp
> hotplug module.
> 
> Each entry is assigned a name of "Slot <<bus_idx>*32 +1>",
> advertised as a hotpluggable PCI slot, and assigned to power domain
> -1 to indicate to the guest that power management is handled by the
> hardware.
> 
> This models a DR-capable PCI expansion device attached to a host/lpar
> via a single PHB with 32 physical hotpluggable slots (as opposed to a
> virtual bridge device with external management console). Hotplug will
> be handled by the guest via bus rescan or the rpaphp hotplug module.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c              |   3 +-
>  hw/ppc/spapr_pci.c          | 102 ++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/pci-host/spapr.h |   1 +
>  3 files changed, 105 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d5e46c3..90b25b3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -890,7 +890,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>      QLIST_FOREACH(phb, &spapr->phbs, list) {
>          drc_entry = spapr_phb_to_drc_entry(phb->buid);
>          g_assert(drc_entry);
> -        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> +        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, drc_entry->drc_index,
> +                                    fdt);
>      }
>  
>      if (ret < 0) {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index e85134f..924d488 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -851,8 +851,104 @@ static int spapr_phb_children_dt(Object *child, void *opaque)
>      return 1;
>  }
>  
> +static void spapr_create_drc_phb_dt_entries(void *fdt, int bus_off, int phb_index)
> +{
> +    char char_buf[1024];
> +    uint32_t int_buf[SPAPR_DRC_PHB_SLOT_MAX + 1];
> +    uint32_t *entries;
> +    int i, ret, offset;
> +
> +    /* ibm,drc-indexes */
> +    memset(int_buf, 0 , sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +        int_buf[i] = SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + ((i - 1) << 3);

Same endianness breakage.

Please verify that your patch set works with

  1) ppc64le host and KVM
  2) x86_64 host and TCG


Alex

> +    }
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-indexes", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr, "error adding 'ibm,drc-indexes' field for PHB FDT");
> +    }
> +
> +    /* ibm,drc-power-domains */
> +    memset(int_buf, 0, sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +        int_buf[i] = 0xffffffff;
> +    }
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-power-domains", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr,
> +                "error adding 'ibm,drc-power-domains' field for PHB FDT");
> +    }
> +
> +    /* ibm,drc-names */
> +    memset(char_buf, 0, sizeof(char_buf));
> +    entries = (uint32_t *)&char_buf[0];
> +    *entries = SPAPR_DRC_PHB_SLOT_MAX;
> +    offset = sizeof(*entries);
> +
> +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +        offset += sprintf(char_buf + offset, "Slot %d",
> +                          (phb_index * SPAPR_DRC_PHB_SLOT_MAX) + i - 1);
> +        char_buf[offset++] = '\0';
> +    }
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-names", char_buf, offset);
> +    if (ret) {
> +        fprintf(stderr, "error adding 'ibm,drc-names' field for PHB FDT");
> +    }
> +
> +    /* ibm,drc-types */
> +    memset(char_buf, 0, sizeof(char_buf));
> +    entries = (uint32_t *)&char_buf[0];
> +    *entries = SPAPR_DRC_PHB_SLOT_MAX;
> +    offset = sizeof(*entries);
> +
> +    for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +        offset += sprintf(char_buf + offset, "28");
> +        char_buf[offset++] = '\0';
> +    }
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-types", char_buf, offset);
> +    if (ret) {
> +        fprintf(stderr, "error adding 'ibm,drc-types' field for PHB FDT");
> +    }
> +
> +    /* we want the initial indicator state to be 0 - "empty", when we
> +     * hot-plug an adaptor in the slot, we need to set the indicator
> +     * to 1 - "present."
> +     */
> +
> +    /* ibm,indicator-9003 */
> +    memset(int_buf, 0, sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,indicator-9003", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr, "error adding 'ibm,indicator-9003' field for PHB FDT");
> +    }
> +
> +    /* ibm,sensor-9003 */
> +    memset(int_buf, 0, sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +    ret = fdt_setprop(fdt, bus_off, "ibm,sensor-9003", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr, "error adding 'ibm,sensor-9003' field for PHB FDT");
> +    }
> +}
> +
>  int spapr_populate_pci_dt(sPAPRPHBState *phb,
>                            uint32_t xics_phandle,
> +                          uint32_t drc_index,
>                            void *fdt)
>  {
>      int bus_off, i, j;
> @@ -934,6 +1030,12 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb,
>      object_child_foreach(OBJECT(phb), spapr_phb_children_dt,
>                           &((sPAPRTCEDT){ .fdt = fdt, .node_off = bus_off }));
>  
> +    spapr_create_drc_phb_dt_entries(fdt, bus_off, phb->index);
> +    if (drc_index) {
> +        _FDT(fdt_setprop(fdt, bus_off, "ibm,my-drc-index", &drc_index,
> +                         sizeof(drc_index)));
> +    }
> +
>      return 0;
>  }
>  
> diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
> index 32f0aa7..8f0a42f 100644
> --- a/include/hw/pci-host/spapr.h
> +++ b/include/hw/pci-host/spapr.h
> @@ -116,6 +116,7 @@ PCIHostState *spapr_create_phb(sPAPREnvironment *spapr, int index);
>  
>  int spapr_populate_pci_dt(sPAPRPHBState *phb,
>                            uint32_t xics_phandle,
> +                          uint32_t drc_index,
>                            void *fdt);
>  
>  void spapr_pci_msi_init(sPAPREnvironment *spapr, hwaddr addr);
> 


* Re: [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface Michael Roth
@ 2014-08-26 11:36   ` Alexander Graf
  2014-09-05  2:55     ` Nathan Fontenot
  2014-09-30 22:08     ` Michael Roth
  0 siblings, 2 replies; 69+ messages in thread
From: Alexander Graf @ 2014-08-26 11:36 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: aik, ncmike, qemu-ppc, tyreld, nfont



On 19.08.14 02:21, Michael Roth wrote:
> From: Mike Day <ncmike@ncultra.org>
> 
> Signed-off-by: Mike Day <ncmike@ncultra.org>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr_pci.c     | 119 +++++++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/ppc/spapr.h |   3 ++
>  2 files changed, 122 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 924d488..23a3477 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -36,6 +36,16 @@
>  
>  #include "hw/pci/pci_bus.h"
>  
> +/* #define DEBUG_SPAPR */
> +
> +#ifdef DEBUG_SPAPR
> +#define DPRINTF(fmt, ...) \
> +    do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
> +#else
> +#define DPRINTF(fmt, ...) \
> +    do { } while (0)
> +#endif
> +
>  /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
>  #define RTAS_QUERY_FN           0
>  #define RTAS_CHANGE_FN          1
> @@ -47,6 +57,31 @@
>  #define RTAS_TYPE_MSI           1
>  #define RTAS_TYPE_MSIX          2
>  
> +/* For set-indicator RTAS interface */
> +#define INDICATOR_ISOLATION_MASK            0x0001   /* 9001 one bit */
> +#define INDICATOR_GLOBAL_INTERRUPT_MASK     0x0002   /* 9005 one bit */
> +#define INDICATOR_ERROR_LOG_MASK            0x0004   /* 9006 one bit */
> +#define INDICATOR_IDENTIFY_MASK             0x0008   /* 9007 one bit */
> +#define INDICATOR_RESET_MASK                0x0010   /* 9009 one bit */
> +#define INDICATOR_DR_MASK                   0x00e0   /* 9002 three bits */
> +#define INDICATOR_ALLOCATION_MASK           0x0300   /* 9003 two bits */
> +#define INDICATOR_EPOW_MASK                 0x1c00   /* 9 three bits */
> +
> +#define INDICATOR_ISOLATION_SHIFT           0x00     /* bit 0 */
> +#define INDICATOR_GLOBAL_INTERRUPT_SHIFT    0x01     /* bit 1 */
> +#define INDICATOR_ERROR_LOG_SHIFT           0x02     /* bit 2 */
> +#define INDICATOR_IDENTIFY_SHIFT            0x03     /* bit 3 */
> +#define INDICATOR_RESET_SHIFT               0x04     /* bit 4 */
> +#define INDICATOR_DR_SHIFT                  0x05     /* bits 5-7 */
> +#define INDICATOR_ALLOCATION_SHIFT          0x08     /* bits 8-9 */
> +#define INDICATOR_EPOW_SHIFT                0x0a     /* bits 10-12 */
> +
> +#define DECODE_DRC_STATE(state, m, s)                  \
> +    ((((uint32_t)(state) & (uint32_t)(m))) >> (s))
> +
> +#define ENCODE_DRC_STATE(val, m, s) \
> +    (((uint32_t)(val) << (s)) & (uint32_t)(m))
> +
>  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
>  {
>      sPAPRPHBState *sphb;
> @@ -402,6 +437,80 @@ static void rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
>      rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */
>  }
>  
> +static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
> +                               uint32_t token, uint32_t nargs,
> +                               target_ulong args, uint32_t nret,
> +                               target_ulong rets)
> +{
> +    uint32_t indicator = rtas_ld(args, 0);
> +    uint32_t drc_index = rtas_ld(args, 1);
> +    uint32_t indicator_state = rtas_ld(args, 2);
> +    uint32_t encoded = 0, shift = 0, mask = 0;
> +    uint32_t *pind;
> +    sPAPRDrcEntry *drc_entry = NULL;

This rtas call does not have any idea what a PHB is. Why does it live in
spapr_pci.c?

> +
> +    if (drc_index == 0) { /* platform indicator */
> +        pind = &spapr->state;
> +    } else {
> +        drc_entry = spapr_find_drc_entry(drc_index);
> +        if (!drc_entry) {
> +            DPRINTF("rtas_set_indicator: unable to find drc_entry for %x",
> +                    drc_index);
> +            rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
> +            return;
> +        }
> +        pind = &drc_entry->state;
> +    }
> +
> +    switch (indicator) {
> +    case 9:  /* EPOW */
> +        shift = INDICATOR_EPOW_SHIFT;
> +        mask = INDICATOR_EPOW_MASK;
> +        break;
> +    case 9001: /* Isolation state */
> +        /* encode the new value into the correct bit field */
> +        shift = INDICATOR_ISOLATION_SHIFT;
> +        mask = INDICATOR_ISOLATION_MASK;
> +        break;
> +    case 9002: /* DR */
> +        shift = INDICATOR_DR_SHIFT;
> +        mask = INDICATOR_DR_MASK;
> +        break;
> +    case 9003: /* Allocation State */
> +        shift = INDICATOR_ALLOCATION_SHIFT;
> +        mask = INDICATOR_ALLOCATION_MASK;
> +        break;
> +    case 9005: /* global interrupt */
> +        shift = INDICATOR_GLOBAL_INTERRUPT_SHIFT;
> +        mask = INDICATOR_GLOBAL_INTERRUPT_MASK;
> +        break;
> +    case 9006: /* error log */
> +        shift = INDICATOR_ERROR_LOG_SHIFT;
> +        mask = INDICATOR_ERROR_LOG_MASK;
> +        break;
> +    case 9007: /* identify */
> +        shift = INDICATOR_IDENTIFY_SHIFT;
> +        mask = INDICATOR_IDENTIFY_MASK;
> +        break;
> +    case 9009: /* reset */
> +        shift = INDICATOR_RESET_SHIFT;
> +        mask = INDICATOR_RESET_MASK;
> +        break;
> +    default:
> +        DPRINTF("rtas_set_indicator: indicator not implemented: %d",
> +                indicator);
> +        rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
> +        return;
> +    }
> +
> +    encoded = ENCODE_DRC_STATE(indicator_state, mask, shift);
> +    /* clear the current indicator value */
> +    *pind &= ~mask;
> +    /* set the new value */
> +    *pind |= encoded;
> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +}
> +
>  static int pci_spapr_swizzle(int slot, int pin)
>  {
>      return (slot + pin) % PCI_NUM_PINS;
> @@ -624,6 +733,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>          sphb->lsi_table[i].irq = irq;
>      }
>  
> +    /* make sure the platform EPOW sensor is initialized - the
> +     * guest will probe it when there is a hotplug event.
> +     */
> +    spapr->state &= ~(uint32_t)INDICATOR_EPOW_MASK;
> +    spapr->state |= ENCODE_DRC_STATE(0,
> +                                     INDICATOR_EPOW_MASK,
> +                                     INDICATOR_EPOW_SHIFT);
> +
>      if (!info->finish_realize) {
>          error_setg(errp, "finish_realize not defined");
>          return;
> @@ -1056,6 +1173,8 @@ void spapr_pci_rtas_init(void)
>          spapr_rtas_register(RTAS_IBM_CHANGE_MSI, "ibm,change-msi",
>                              rtas_ibm_change_msi);
>      }
> +    spapr_rtas_register(RTAS_SET_INDICATOR, "set-indicator",
> +                        rtas_set_indicator);
>  }
>  
>  static void spapr_pci_register_types(void)
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 0ac1a19..fac85f8 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -72,6 +72,9 @@ typedef struct sPAPREnvironment {
>  
>      /* state for Dynamic Reconfiguration Connectors */
>      sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
> +
> +    /* Platform state - sensors and indicators */
> +    uint32_t state;

Do you think it'd be possible to create a special DRC device that
contains all of its tables and global state and also exposes sensors and
indicators? That device could then get linked via qom links to the PHBs
for their slots.


Alex


* Re: [Qemu-devel] [PATCH 07/12] spapr_pci: add ibm, configure-connector RTAS interface
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 07/12] spapr_pci: add ibm, configure-connector " Michael Roth
  2014-08-26  9:12   ` Alexey Kardashevskiy
@ 2014-08-26 11:39   ` Alexander Graf
  1 sibling, 0 replies; 69+ messages in thread
From: Alexander Graf @ 2014-08-26 11:39 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: aik, ncmike, qemu-ppc, tyreld, nfont



On 19.08.14 02:21, Michael Roth wrote:
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr_pci.c | 111 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 111 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 8d1351d..96a57be 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -606,6 +606,115 @@ static void rtas_get_sensor_state(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>      rtas_st(rets, 1, decoded);
>  }
>  
> +/* configure-connector work area offsets, int32_t units */
> +#define CC_IDX_NODE_NAME_OFFSET 2
> +#define CC_IDX_PROP_NAME_OFFSET 2
> +#define CC_IDX_PROP_LEN 3
> +#define CC_IDX_PROP_DATA_OFFSET 4
> +
> +#define CC_VAL_DATA_OFFSET ((CC_IDX_PROP_DATA_OFFSET + 1) * 4)
> +#define CC_RET_NEXT_SIB 1
> +#define CC_RET_NEXT_CHILD 2
> +#define CC_RET_NEXT_PROPERTY 3
> +#define CC_RET_PREV_PARENT 4
> +#define CC_RET_ERROR RTAS_OUT_HW_ERROR
> +#define CC_RET_SUCCESS RTAS_OUT_SUCCESS
> +
> +static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
> +                                         sPAPREnvironment *spapr,
> +                                         uint32_t token, uint32_t nargs,
> +                                         target_ulong args, uint32_t nret,
> +                                         target_ulong rets)
> +{
> +    uint64_t wa_addr = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 0);
> +    sPAPRDrcEntry *drc_entry = NULL;
> +    sPAPRConfigureConnectorState *ccs;
> +    void *wa_buf;
> +    int32_t *wa_buf_int;
> +    hwaddr map_len = 0x1024;
> +    uint32_t drc_index;
> +    int rc = 0, next_offset, tag, prop_len, node_name_len;
> +    const struct fdt_property *prop;
> +    const char *node_name, *prop_name;
> +
> +    wa_buf = cpu_physical_memory_map(wa_addr, &map_len, 1);

Please use the rtas helpers to access memory from an rtas function. They
take care of endian swapping and address masking.
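
To illustrate the byte-order half of the problem only, here is a
self-contained sketch with hypothetical helper names (these are not the
QEMU rtas helpers, and address masking is not shown). The guest work area
holds big-endian 32-bit cells, so each cell is assembled byte by byte
instead of being dereferenced through a host pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Read the big-endian 32-bit cell at index `cell` of a guest buffer. */
static uint32_t load_be32(const void *buf, size_t cell)
{
    const uint8_t *p = (const uint8_t *)buf + cell * 4;

    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write v as a big-endian 32-bit cell at index `cell` of a guest buffer. */
static void store_be32(void *buf, size_t cell, uint32_t v)
{
    uint8_t *p = (uint8_t *)buf + cell * 4;

    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}
```

With accessors along these lines, the drc_index read and the
wa_buf_int[] stores would behave the same on little-endian and big-endian
hosts.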


Alex

> +    if (!wa_buf) {
> +        rc = CC_RET_ERROR;
> +        goto error_exit;
> +    }
> +    wa_buf_int = wa_buf;
> +
> +    drc_index = *(uint32_t *)wa_buf;
> +    drc_entry = spapr_find_drc_entry(drc_index);
> +    if (!drc_entry) {
> +        rc = -1;
> +        goto error_exit;
> +    }
> +
> +    ccs = &drc_entry->cc_state;
> +    if (ccs->state == CC_STATE_PENDING) {
> +        /* fdt should've been attached to drc_entry during
> +         * realize/hotplug
> +         */
> +        g_assert(ccs->fdt);
> +        ccs->depth = 0;
> +        ccs->offset = ccs->offset_start;
> +        ccs->state = CC_STATE_ACTIVE;
> +    }
> +
> +    if (ccs->state == CC_STATE_IDLE) {
> +        rc = -1;
> +        goto error_exit;
> +    }
> +
> +retry:
> +    tag = fdt_next_tag(ccs->fdt, ccs->offset, &next_offset);
> +
> +    switch (tag) {
> +    case FDT_BEGIN_NODE:
> +        ccs->depth++;
> +        node_name = fdt_get_name(ccs->fdt, ccs->offset, &node_name_len);
> +        wa_buf_int[CC_IDX_NODE_NAME_OFFSET] = CC_VAL_DATA_OFFSET;
> +        strcpy(wa_buf + wa_buf_int[CC_IDX_NODE_NAME_OFFSET], node_name);
> +        rc = CC_RET_NEXT_CHILD;
> +        break;
> +    case FDT_END_NODE:
> +        ccs->depth--;
> +        if (ccs->depth == 0) {
> +            /* reached the end of top-level node, declare success */
> +            ccs->state = CC_STATE_PENDING;
> +            rc = CC_RET_SUCCESS;
> +        } else {
> +            rc = CC_RET_PREV_PARENT;
> +        }
> +        break;
> +    case FDT_PROP:
> +        prop = fdt_get_property_by_offset(ccs->fdt, ccs->offset, &prop_len);
> +        prop_name = fdt_string(ccs->fdt, fdt32_to_cpu(prop->nameoff));
> +        wa_buf_int[CC_IDX_PROP_NAME_OFFSET] = CC_VAL_DATA_OFFSET;
> +        wa_buf_int[CC_IDX_PROP_LEN] = prop_len;
> +        wa_buf_int[CC_IDX_PROP_DATA_OFFSET] =
> +            CC_VAL_DATA_OFFSET + strlen(prop_name) + 1;
> +        strcpy(wa_buf + wa_buf_int[CC_IDX_PROP_NAME_OFFSET], prop_name);
> +        memcpy(wa_buf + wa_buf_int[CC_IDX_PROP_DATA_OFFSET],
> +               prop->data, prop_len);
> +        rc = CC_RET_NEXT_PROPERTY;
> +        break;
> +    case FDT_END:
> +        rc = CC_RET_ERROR;
> +        break;
> +    default:
> +        ccs->offset = next_offset;
> +        goto retry;
> +    }
> +
> +    ccs->offset = next_offset;
> +
> +error_exit:
> +    cpu_physical_memory_unmap(wa_buf, 0x1024, 1, 0x1024);
> +    rtas_st(rets, 0, rc);
> +}
> +
>  static int pci_spapr_swizzle(int slot, int pin)
>  {
>      return (slot + pin) % PCI_NUM_PINS;
> @@ -1276,6 +1385,8 @@ void spapr_pci_rtas_init(void)
>                          rtas_get_power_level);
>      spapr_rtas_register(RTAS_GET_SENSOR_STATE, "get-sensor-state",
>                          rtas_get_sensor_state);
> +    spapr_rtas_register(RTAS_IBM_CONFIGURE_CONNECTOR, "ibm,configure-connector",
> +                        rtas_ibm_configure_connector);
>  }
>  
>  static void spapr_pci_register_types(void)
> 


* Re: [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions Michael Roth
  2014-08-26  9:14   ` Alexey Kardashevskiy
@ 2014-08-26 11:41   ` Alexander Graf
  2014-08-27 13:47   ` Michael S. Tsirkin
  2 siblings, 0 replies; 69+ messages in thread
From: Alexander Graf @ 2014-08-26 11:41 UTC (permalink / raw)
  To: Michael Roth, qemu-devel
  Cc: Michael S. Tsirkin, aik, ncmike, qemu-ppc, tyreld, nfont



On 19.08.14 02:21, Michael Roth wrote:
> Some kernels program a 0 address for io regions. PCI 3.0 spec
> section 6.2.5.1 doesn't seem to disallow this.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

This patch does not need to be inside of this patch set. It should also
go via Michael S. Tsirkin's tree.


Alex

> ---
>  hw/pci/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 351d320..9578749 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1035,7 +1035,7 @@ static pcibus_t pci_bar_address(PCIDevice *d,
>          /* Check if 32 bit BAR wraps around explicitly.
>           * TODO: make priorities correct and remove this work around.
>           */
> -        if (last_addr <= new_addr || new_addr == 0 || last_addr >= UINT32_MAX) {
> +        if (last_addr <= new_addr || last_addr >= UINT32_MAX) {
>              return PCI_BAR_UNMAPPED;
>          }
>          return new_addr;
> 


* Re: [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions
  2014-08-26  9:14   ` Alexey Kardashevskiy
@ 2014-08-26 11:55     ` Peter Maydell
  2014-08-26 18:34     ` Michael Roth
  1 sibling, 0 replies; 69+ messages in thread
From: Peter Maydell @ 2014-08-26 11:55 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: QEMU Developers, Alexander Graf, Michael Roth, Mike Day,
	qemu-ppc, tyreld, nfont

On 26 August 2014 10:14, Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> On 08/19/2014 10:21 AM, Michael Roth wrote:
>> Some kernels program a 0 address for io regions. PCI 3.0 spec
>> section 6.2.5.1 doesn't seem to disallow this.
>
>
> I remember there was discussion about it but I forgot :)

I think the conclusion we came to was that there may have been
a note in the PCI 2.1 spec that implied that 0 addresses meant
"disabled" but this seems to have gone from later versions,
suggesting it was erroneous.

Personally I'm happy for us to remove the "0 means disabled"
check, but I'd prefer it if we do it consistently for both IO and
MMIO regions -- this patch only changes the IO BAR code.
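
One way to keep the two paths consistent would be a single range check
used for both region types. The sketch below is only an illustration of
that idea: bar32_address and BAR_UNMAPPED are hypothetical stand-ins, and
the wrap/limit condition is copied from the patched IO check.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for QEMU's PCI_BAR_UNMAPPED, for illustration only */
#define BAR_UNMAPPED UINT64_MAX

/* Hypothetical helper shared by the IO and MMIO paths for 32-bit BARs:
 * address 0 is accepted, while ranges that wrap around or reach the
 * 32-bit limit are reported as unmapped. */
uint64_t bar32_address(uint64_t new_addr, uint64_t size)
{
    uint64_t last_addr;

    assert(size > 0); /* unimplemented BARs are assumed to be skipped */
    last_addr = new_addr + size - 1;

    if (last_addr <= new_addr || last_addr >= UINT32_MAX) {
        return BAR_UNMAPPED;
    }
    return new_addr;
}
```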

thanks
-- PMM


* Re: [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations Michael Roth
  2014-08-26  9:40   ` Alexey Kardashevskiy
@ 2014-08-26 12:30   ` Alexander Graf
  2014-09-03 10:33   ` Bharata B Rao
  2 siblings, 0 replies; 69+ messages in thread
From: Alexander Graf @ 2014-08-26 12:30 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: aik, ncmike, qemu-ppc, tyreld, nfont



On 19.08.14 02:21, Michael Roth wrote:
> This enables hotplug for PHB bridges. Upon hotplug we generate the
> OF-nodes required by PAPR specification and IEEE 1275-1994
> "PCI Bus Binding to Open Firmware" for the device.
> 
> We associate the corresponding FDT for these nodes with the DrcEntry
> corresponding to the slot, which will be fetched via
> ibm,configure-connector RTAS calls by the guest as described by PAPR
> specification. The FDT is cleaned up in the case of unplug.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr_pci.c     | 235 +++++++++++++++++++++++++++++++++++++++++++++++--
>  include/hw/ppc/spapr.h |   1 +
>  2 files changed, 228 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 96a57be..23864ab 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -87,6 +87,17 @@
>  #define ENCODE_DRC_STATE(val, m, s) \
>      (((uint32_t)(val) << (s)) & (uint32_t)(m))
>  
> +#define FDT_MAX_SIZE            0x10000
> +#define _FDT(exp) \
> +    do { \
> +        int ret = (exp);                                           \
> +        if (ret < 0) {                                             \
> +            return ret;                                            \
> +        }                                                          \
> +    } while (0)
> +
> +static void spapr_drc_state_reset(sPAPRDrcEntry *drc_entry);
> +
>  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
>  {
>      sPAPRPHBState *sphb;
> @@ -476,6 +487,22 @@ static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>          /* encode the new value into the correct bit field */
>          shift = INDICATOR_ISOLATION_SHIFT;
>          mask = INDICATOR_ISOLATION_MASK;
> +        if (drc_entry) {
> +            /* transition from unisolated to isolated for a hotplug slot
> +             * entails completion of guest-side device unplug/cleanup, so
> +             * we can now safely remove the device if qemu is waiting for
> +             * it to be released
> +             */
> +            if (DECODE_DRC_STATE(*pind, mask, shift) != indicator_state) {
> +                if (indicator_state == 0 && drc_entry->awaiting_release) {
> +                    /* device_del has been called and host is waiting
> +                     * for guest to release/isolate device, go ahead
> +                     * and remove it now
> +                     */
> +                    spapr_drc_state_reset(drc_entry);
> +                }
> +            }
> +        }
>          break;
>      case 9002: /* DR */
>          shift = INDICATOR_DR_SHIFT;
> @@ -816,6 +843,198 @@ static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>      return &phb->iommu_as;
>  }
>  
> +static int spapr_populate_pci_child_dt(PCIDevice *dev, void *fdt, int offset,
> +                                       int phb_index)
> +{
> +    int slot = PCI_SLOT(dev->devfn);
> +    char slotname[16];
> +    bool is_bridge = 1;
> +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> +    uint32_t reg[RESOURCE_CELLS_TOTAL * 8] = { 0 };
> +    uint32_t assigned[RESOURCE_CELLS_TOTAL * 8] = { 0 };
> +    int reg_size, assigned_size;
> +
> +    drc_entry = spapr_phb_to_drc_entry(phb_index + SPAPR_PCI_BASE_BUID);
> +    g_assert(drc_entry);
> +    drc_entry_slot = &drc_entry->child_entries[slot];
> +
> +    if (pci_default_read_config(dev, PCI_HEADER_TYPE, 1) ==
> +        PCI_HEADER_TYPE_NORMAL) {
> +        is_bridge = 0;
> +    }
> +
> +    _FDT(fdt_setprop_cell(fdt, offset, "vendor-id",
> +                          pci_default_read_config(dev, PCI_VENDOR_ID, 2)));
> +    _FDT(fdt_setprop_cell(fdt, offset, "device-id",
> +                          pci_default_read_config(dev, PCI_DEVICE_ID, 2)));
> +    _FDT(fdt_setprop_cell(fdt, offset, "revision-id",
> +                          pci_default_read_config(dev, PCI_REVISION_ID, 1)));
> +    _FDT(fdt_setprop_cell(fdt, offset, "class-code",
> +                          pci_default_read_config(dev, PCI_CLASS_DEVICE, 2) << 8));
> +
> +    _FDT(fdt_setprop_cell(fdt, offset, "interrupts",
> +                          pci_default_read_config(dev, PCI_INTERRUPT_PIN, 1)));
> +
> +    /* if this device is NOT a bridge */
> +    if (!is_bridge) {
> +        _FDT(fdt_setprop_cell(fdt, offset, "min-grant",
> +            pci_default_read_config(dev, PCI_MIN_GNT, 1)));
> +        _FDT(fdt_setprop_cell(fdt, offset, "max-latency",
> +            pci_default_read_config(dev, PCI_MAX_LAT, 1)));
> +        _FDT(fdt_setprop_cell(fdt, offset, "subsystem-id",
> +            pci_default_read_config(dev, PCI_SUBSYSTEM_ID, 2)));
> +        _FDT(fdt_setprop_cell(fdt, offset, "subsystem-vendor-id",
> +            pci_default_read_config(dev, PCI_SUBSYSTEM_VENDOR_ID, 2)));
> +    }
> +
> +    _FDT(fdt_setprop_cell(fdt, offset, "cache-line-size",
> +        pci_default_read_config(dev, PCI_CACHE_LINE_SIZE, 1)));
> +
> +    /* the following fdt cells are masked off the pci status register */
> +    int pci_status = pci_default_read_config(dev, PCI_STATUS, 2);
> +    _FDT(fdt_setprop_cell(fdt, offset, "devsel-speed",
> +                          PCI_STATUS_DEVSEL_MASK & pci_status));
> +    _FDT(fdt_setprop_cell(fdt, offset, "fast-back-to-back",
> +                          PCI_STATUS_FAST_BACK & pci_status));
> +    _FDT(fdt_setprop_cell(fdt, offset, "66mhz-capable",
> +                          PCI_STATUS_66MHZ & pci_status));
> +    _FDT(fdt_setprop_cell(fdt, offset, "udf-supported",
> +                          PCI_STATUS_UDF & pci_status));
> +
> +    _FDT(fdt_setprop_string(fdt, offset, "name", "pci"));
> +    sprintf(slotname, "Slot %d", slot + phb_index * 32);
> +    _FDT(fdt_setprop(fdt, offset, "ibm,loc-code", slotname, strlen(slotname)));
> +    _FDT(fdt_setprop_cell(fdt, offset, "ibm,my-drc-index",
> +                          drc_entry_slot->drc_index));
> +
> +    _FDT(fdt_setprop_cell(fdt, offset, "#address-cells",
> +                          RESOURCE_CELLS_ADDRESS));
> +    _FDT(fdt_setprop_cell(fdt, offset, "#size-cells",
> +                          RESOURCE_CELLS_SIZE));
> +    _FDT(fdt_setprop_cell(fdt, offset, "ibm,req#msi-x",
> +                          RESOURCE_CELLS_SIZE));
> +    fill_resource_props(dev, phb_index, reg, &reg_size,
> +                        assigned, &assigned_size);
> +    _FDT(fdt_setprop(fdt, offset, "reg", reg, reg_size));
> +    _FDT(fdt_setprop(fdt, offset, "assigned-addresses",
> +                     assigned, assigned_size));
> +
> +    return 0;
> +}
> +
> +static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev)
> +{
> +    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
> +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> +    sPAPRConfigureConnectorState *ccs;
> +    int slot = PCI_SLOT(dev->devfn);
> +    int offset, ret;
> +    void *fdt_orig, *fdt;
> +    char nodename[512];

I think we're better off using g_strdup_printf.
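
The point is that the name gets a buffer sized to the formatted result
instead of a fixed 512-byte array filled by sprintf. A rough standard-C
sketch of that behaviour is below; in QEMU the helper would simply be a
call to GLib's g_strdup_printf (freed with g_free), and
node_name_for_slot is a hypothetical name used only here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: build the "pci@<slot>" node name in a heap buffer
 * of exactly the required size. The caller frees the result. */
char *node_name_for_slot(int slot)
{
    int len;
    char *name;

    assert(slot >= 0);

    /* first pass only measures the formatted length */
    len = snprintf(NULL, 0, "pci@%d", slot);
    if (len < 0) {
        return NULL;
    }

    name = malloc((size_t)len + 1);
    if (!name) {
        return NULL;
    }

    /* second pass writes the string, including the terminating NUL */
    snprintf(name, (size_t)len + 1, "pci@%d", slot);
    return name;
}
```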

> +    uint32_t encoded = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_PRESENT,
> +                                        INDICATOR_ENTITY_SENSE_MASK,
> +                                        INDICATOR_ENTITY_SENSE_SHIFT);
> +
> +    drc_entry = spapr_phb_to_drc_entry(phb->buid);
> +    g_assert(drc_entry);
> +    drc_entry_slot = &drc_entry->child_entries[slot];
> +
> +    drc_entry->state &= ~(uint32_t)INDICATOR_ENTITY_SENSE_MASK;
> +    drc_entry->state |= encoded; /* DR entity present */
> +    drc_entry_slot->state &= ~(uint32_t)INDICATOR_ENTITY_SENSE_MASK;
> +    drc_entry_slot->state |= encoded; /* and the slot */
> +
> +    /* add OF node for pci device and required OF DT properties */
> +    fdt_orig = g_malloc0(FDT_MAX_SIZE);
> +    offset = fdt_create(fdt_orig, FDT_MAX_SIZE);
> +    fdt_begin_node(fdt_orig, "");
> +    fdt_end_node(fdt_orig);
> +    fdt_finish(fdt_orig);
> +
> +    fdt = g_malloc0(FDT_MAX_SIZE);
> +    fdt_open_into(fdt_orig, fdt, FDT_MAX_SIZE);
> +    sprintf(nodename, "pci@%d", slot);
> +    offset = fdt_add_subnode(fdt, 0, nodename);
> +    ret = spapr_populate_pci_child_dt(dev, fdt, offset, phb->index);
> +    g_assert(!ret);
> +    g_free(fdt_orig);
> +
> +    /* hold on to node, configure_connector will pass it to the guest later */
> +    ccs = &drc_entry_slot->cc_state;
> +    ccs->fdt = fdt;
> +    ccs->offset_start = offset;
> +    ccs->state = CC_STATE_PENDING;
> +    ccs->dev = dev;
> +
> +    return 0;
> +}
> +
> +/* check whether guest has released/isolated device */
> +static bool spapr_drc_state_is_releasable(sPAPRDrcEntry *drc_entry)
> +{
> +    return !DECODE_DRC_STATE(drc_entry->state,
> +                             INDICATOR_ISOLATION_MASK,
> +                             INDICATOR_ISOLATION_SHIFT);
> +}
> +
> +/* finalize device unplug/deletion */
> +static void spapr_drc_state_reset(sPAPRDrcEntry *drc_entry)
> +{
> +    sPAPRConfigureConnectorState *ccs = &drc_entry->cc_state;
> +    uint32_t sense_empty = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_EMPTY,
> +                                            INDICATOR_ENTITY_SENSE_MASK,
> +                                            INDICATOR_ENTITY_SENSE_SHIFT);
> +
> +    g_free(ccs->fdt);
> +    ccs->fdt = NULL;
> +    object_unparent(OBJECT(ccs->dev));
> +    ccs->dev = NULL;
> +    ccs->state = CC_STATE_IDLE;
> +    drc_entry->state &= ~INDICATOR_ENTITY_SENSE_MASK;
> +    drc_entry->state |= sense_empty;
> +    drc_entry->awaiting_release = false;
> +}
> +
> +static void spapr_device_hotplug_remove(DeviceState *qdev, PCIDevice *dev)
> +{
> +    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
> +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> +    sPAPRConfigureConnectorState *ccs;
> +    int slot = PCI_SLOT(dev->devfn);
> +
> +    drc_entry = spapr_phb_to_drc_entry(phb->buid);
> +    g_assert(drc_entry);
> +    drc_entry_slot = &drc_entry->child_entries[slot];
> +    ccs = &drc_entry_slot->cc_state;
> +    /* shouldn't be removing devices we haven't created an fdt for */
> +    g_assert(ccs->state != CC_STATE_IDLE);
> +    /* if the device has already been released/isolated by guest, go ahead
> +     * and remove it now. Otherwise, flag it as pending guest release so it
> +     * can be removed later
> +     */
> +    if (spapr_drc_state_is_releasable(drc_entry_slot)) {
> +        spapr_drc_state_reset(drc_entry_slot);
> +    } else {
> +        if (drc_entry_slot->awaiting_release) {
> +            fprintf(stderr, "waiting for guest to release the device");

Please add a timer in this case that force-ejects the device. We also
need to tell the caller that the device hasn't been ejected yet, no? How
does our ACPI implementation deal with uncooperative guests and
notifying upper stacks about them?


Alex


* Re: [Qemu-devel] [PATCH 12/12] spapr_pci: emit hotplug add/remove events during hotplug
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 12/12] spapr_pci: emit hotplug add/remove events during hotplug Michael Roth
  2014-08-26  9:35   ` Alexey Kardashevskiy
@ 2014-08-26 12:36   ` Alexander Graf
  1 sibling, 0 replies; 69+ messages in thread
From: Alexander Graf @ 2014-08-26 12:36 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: aik, ncmike, qemu-ppc, tyreld, nfont



On 19.08.14 02:21, Michael Roth wrote:
> From: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
> 
> This uses extension of existing EPOW interrupt/event mechanism
> to notify userspace tools like librtas/drmgr to handle
> in-guest configuration/cleanup operations in response to
> device_add/device_del.
> 
> Userspace tools that don't implement this extension will need
> to be run manually in response/advance of device_add/device_del,
> respectively.

Couldn't they use the pull event mechanism you implement in the previous
patch?

> 
> Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr_pci.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 23864ab..17d37c0 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -944,6 +944,18 @@ static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev)
>      drc_entry_slot->state &= ~(uint32_t)INDICATOR_ENTITY_SENSE_MASK;
>      drc_entry_slot->state |= encoded; /* and the slot */
>  
> +    /* reliable unplug requires we wait for a transition from
> +     * UNISOLATED->ISOLATED prior to device removal/deletion.
> +     * However, slots populated by devices at boot-time will not
> +     * have ever been set by guest tools to an UNISOLATED/populated
> +     * state, so set this manually in the case of coldplug devices
> +     */
> +    if (!DEVICE(dev)->hotplugged) {
> +        drc_entry_slot->state |= ENCODE_DRC_STATE(1,
> +                                                  INDICATOR_ISOLATION_MASK,
> +                                                  INDICATOR_ISOLATION_SHIFT);

I think as a general scheme we would like to have the PHB call DRC
(which it knows from a qom link) which raises a qemu_irq to notify the
EPOW device that an event happened. If the EPOW interface is too
complex, I guess we can live with a link and function call too.


Alex


* Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-26  7:55   ` Alexey Kardashevskiy
  2014-08-26  8:24     ` Alexey Kardashevskiy
@ 2014-08-26 14:56     ` Michael Roth
  2014-09-05  0:31     ` [Qemu-devel] [Qemu-ppc] " Tyrel Datwyler
  2 siblings, 0 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-26 14:56 UTC (permalink / raw)
  To: Alexey Kardashevskiy, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

Quoting Alexey Kardashevskiy (2014-08-26 02:55:05)
> On 08/19/2014 10:21 AM, Michael Roth wrote:
> > From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> > 
> > This add entries to the root OF node to advertise our PHBs as being
> > DR-capable in according with PAPR specification.
> > 
> > Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
> > and associated with a power domain of -1 (indicating to guests that
> > power management is handled automatically by hardware).
> > 
> > We currently allocate entries for up to 32 DR-capable PHBs, though
> > this limit can be increased later.
> > 
> > DrcEntry objects to track the state of the DR-connector associated
> > with each PHB are stored in a 32-entry array, and each DrcEntry has
> > in turn have a dynamically-sized number of child DR-connectors,
> > which we will use later to track the state of DR-connectors
> > associated with a PHB's physical slots.
> > 
> > Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  hw/ppc/spapr_pci.c     |   1 +
> >  include/hw/ppc/spapr.h |  35 ++++++++++++
> >  3 files changed, 179 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 5c92707..d5e46c3 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
> >      return ram_size;
> >  }
> >  
> > +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
> > +{
> > +    int i;
> > +
> > +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > +        if (spapr->drc_table[i].phb_buid == buid) {
> > +            return &spapr->drc_table[i];
> > +        }
> > +     }
> > +
> > +     return NULL;
> > +}
> > +
> > +static void spapr_init_drc_table(void)
> > +{
> > +    int i;
> > +
> > +    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
> > +
> > +    /* For now we only care about PHB entries */
> > +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > +        spapr->drc_table[i].drc_index = 0x2000001 + i;
> > +    }
> > +}
> > +
> > +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
> > +{
> > +    sPAPRDrcEntry *empty_drc = NULL;
> > +    sPAPRDrcEntry *found_drc = NULL;
> > +    int i, phb_index;
> > +
> > +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > +        if (spapr->drc_table[i].phb_buid == 0) {
> > +            empty_drc = &spapr->drc_table[i];
> > +        }
> > +
> > +        if (spapr->drc_table[i].phb_buid == buid) {
> > +            found_drc = &spapr->drc_table[i];
> 
> return &spapr->drc_table[i];
> ?

That makes sense. Looking at this again though I think maybe
I should drop the found_drc stuff completely, or maybe even
assert if we attempt to re-add a phb. Current callers already
do spapr_phb_to_drc_entry beforehand anyway, which should
cover the case where it's already there. So maybe something
like this?

sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
{
    sPAPRDrcEntry *empty_drc = NULL;
    int i, phb_index;

    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
        g_assert(spapr->drc_table[i].phb_buid != buid);
        if (spapr->drc_table[i].phb_buid == 0) {
            empty_drc = &spapr->drc_table[i];
            break;
        }
    }

    if (!empty_drc) {
        return NULL;
    }

    empty_drc->phb_buid = buid;
    empty_drc->state = state;
    empty_drc->cc_state.fdt = NULL;
    empty_drc->cc_state.offset = 0;
    empty_drc->cc_state.depth = 0;
    empty_drc->cc_state.state = CC_STATE_IDLE;
    empty_drc->child_entries =
        g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
    phb_index = buid - SPAPR_PCI_BASE_BUID;
    for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
        empty_drc->child_entries[i].drc_index =
            SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
    }

    return empty_drc;
}
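
For reference, the per-slot drc_index computed in the loop above can be
worked through in isolation. The constants are the ones from the patch;
slot_drc_index is a hypothetical helper name used only for this sketch.
With 32 slots the slot term (i << 3) tops out at 0xf8, so it never
reaches the phb_index term, which starts at bit 8.

```c
#include <assert.h>
#include <stdint.h>

#define SPAPR_DRC_PHB_SLOT_MAX  32
#define SPAPR_DRC_DEV_ID_BASE   0x40000000

/* Hypothetical helper: same arithmetic as the child_entries loop */
uint32_t slot_drc_index(uint32_t phb_index, uint32_t slot)
{
    assert(slot < SPAPR_DRC_PHB_SLOT_MAX);
    return SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (slot << 3);
}
```

For example, PHB 0 slot 31 gives 0x400000f8 and PHB 1 slot 0 gives
0x40000100.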

> 
> 
> > +            break;
> > +        }
> > +    }
> > +
> > +    if (found_drc) {
> > +        return found_drc;
> > +    }
> 
>    if (!empty_drc) {
>         return NULL;
>    }
> 
> ?
> 
> 
> > +
> > +    if (empty_drc) {
> 
> and no need in this :)
> 
> 
> > +        empty_drc->phb_buid = buid;
> > +        empty_drc->state = state;
> > +        empty_drc->cc_state.fdt = NULL;
> > +        empty_drc->cc_state.offset = 0;
> > +        empty_drc->cc_state.depth = 0;
> > +        empty_drc->cc_state.state = CC_STATE_IDLE;
> > +        empty_drc->child_entries =
> > +            g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
> > +        phb_index = buid - SPAPR_PCI_BASE_BUID;
> > +        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +            empty_drc->child_entries[i].drc_index =
> > +                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
> > +        }
> > +        return empty_drc;
> > +    }
> > +
> > +    return NULL;
> > +}
> > +
> > +static void spapr_create_drc_dt_entries(void *fdt)
> > +{
> > +    char char_buf[1024];
> > +    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
> > +    uint32_t *entries;
> > +    int offset, fdt_offset;
> > +    int i, ret;
> > +
> > +    fdt_offset = fdt_path_offset(fdt, "/");
> > +
> > +    /* ibm,drc-indexes */
> > +    memset(int_buf, 0, sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> > +
> > +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> > +        int_buf[i] = spapr->drc_table[i-1].drc_index;
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
> 
> return here and below in the same error cases?

Yah, that's clearly more sensible. I suppose if we introduce an error exit case
this should stop being a void function as well.

> 
> > +    }
> > +
> > +    /* ibm,drc-power-domains */
> > +    memset(int_buf, 0, sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> > +
> > +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> > +        int_buf[i] = 0xffffffff;
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains property\n");
> > +    }
> > +
> > +    /* ibm,drc-names */
> > +    memset(char_buf, 0, sizeof(char_buf));
> > +    entries = (uint32_t *)&char_buf[0];
> > +    *entries = SPAPR_DRC_TABLE_SIZE;
> > +    offset = sizeof(*entries);
> > +
> > +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > +        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
> > +        char_buf[offset++] = '\0';
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
> > +    if (ret) {
> > +        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
> > +    }
> > +
> > +    /* ibm,drc-types */
> > +    memset(char_buf, 0, sizeof(char_buf));
> > +    entries = (uint32_t *)&char_buf[0];
> > +    *entries = SPAPR_DRC_TABLE_SIZE;
> > +    offset = sizeof(*entries);
> > +
> > +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > +        offset += sprintf(char_buf + offset, "PHB");
> > +        char_buf[offset++] = '\0';
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, offset);
> > +    if (ret) {
> > +        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
> > +    }
> > +}
> > +
> >  #define _FDT(exp) \
> >      do { \
> >          int ret = (exp);                                           \
> > @@ -731,6 +868,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >      char *bootlist;
> >      void *fdt;
> >      sPAPRPHBState *phb;
> > +    sPAPRDrcEntry *drc_entry;
> >  
> >      fdt = g_malloc(FDT_MAX_SIZE);
> >  
> > @@ -750,6 +888,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >      }
> >  
> >      QLIST_FOREACH(phb, &spapr->phbs, list) {
> > +        drc_entry = spapr_phb_to_drc_entry(phb->buid);
> > +        g_assert(drc_entry);
> >          ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> >      }
> >  
> > @@ -789,6 +929,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
> >      }
> >  
> > +    spapr_create_drc_dt_entries(fdt);
> > +
> >      _FDT((fdt_pack(fdt)));
> >  
> >      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> > @@ -1443,6 +1585,7 @@ static void ppc_spapr_init(MachineState *machine)
> >      spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW);
> >      spapr_pci_rtas_init();
> >  
> > +    spapr_init_drc_table();
> >      phb = spapr_create_phb(spapr, 0);
> >  
> >      for (i = 0; i < nb_nics; i++) {
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 9ed39a9..e85134f 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -531,6 +531,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >              + sphb->index * SPAPR_PCI_WINDOW_SPACING;
> >          sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
> >          sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
> > +        spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
> 
> 
> What exactly does "unusable" mean here? Macro?

That's the 9003/"DR-entity-sense" sensor value for the DRC entry
corresponding to the PHB itself (as opposed to the sensors for each of its
slots). In the case of the slots, we advertise them as 'physical' DR
resources in the PHB's "ibm,drc-types" property. In the case of the PHBs we
advertise them as 'logical'/dlpar DR resources in the root DT node's
"ibm,drc-types" property. We do not actually support 'logical' DR operations
in this implementation, though we may in the future in order to support PHB
hotplug, so we've set this to 'unusable' for now.

But according to PAPR 2.7 13.5.3.3 this corresponds to:

"Returned for logical DR entities when the DR entity is not currently available
to the OS, but may possibly be made available to the OS by calling set-indicator
with the allocation-state indicator, setting that indicator to usable."

So maybe it makes more sense to just set it to present/(1)?

I don't think the PHB sensors will actually get read unless we attempt dlpar
operations in the guest (as opposed to pci hp), so it's probably mostly unused
now, but seems like a more sensible default. Will test and change it if it
doesn't break anything.

Macros make sense... we actually already have:

#define INDICATOR_ENTITY_SENSE_EMPTY    0
#define INDICATOR_ENTITY_SENSE_PRESENT  1

so not adding 'unusable' was an oversight. I'll add it either way for
completeness.
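
Roughly what that would look like, as a sketch only. The
INDICATOR_ENTITY_SENSE_UNUSABLE name is the proposed addition, the
SENSE_MASK/SENSE_SHIFT values are placeholders rather than the real field
layout, DECODE_DRC_STATE is assumed to be the inverse of the
ENCODE_DRC_STATE macro from spapr_pci.c, and set_entity_sense is a
hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

#define INDICATOR_ENTITY_SENSE_EMPTY    0
#define INDICATOR_ENTITY_SENSE_PRESENT  1
#define INDICATOR_ENTITY_SENSE_UNUSABLE 2  /* proposed addition */

/* Placeholder field layout, for illustration only */
#define SENSE_MASK  0x00000f00
#define SENSE_SHIFT 8

#define ENCODE_DRC_STATE(val, m, s) \
    (((uint32_t)(val) << (s)) & (uint32_t)(m))
/* Assumed inverse of ENCODE_DRC_STATE */
#define DECODE_DRC_STATE(val, m, s) \
    (((uint32_t)(val) & (uint32_t)(m)) >> (s))

/* Hypothetical helper: replace the entity-sense field of a DRC state
 * word, leaving all other bits untouched. */
uint32_t set_entity_sense(uint32_t state, uint32_t sense)
{
    assert(sense <= INDICATOR_ENTITY_SENSE_UNUSABLE);
    state &= ~(uint32_t)SENSE_MASK;
    state |= ENCODE_DRC_STATE(sense, SENSE_MASK, SENSE_SHIFT);
    return state;
}
```

The call in spapr_phb_realize could then pass the macro instead of the
bare 2.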

> 
> 
> 
> >      }
> >  
> >      if (sphb->buid == -1) {
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 36e8e51..c93794b 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -10,6 +10,36 @@ struct sPAPRNVRAM;
> >  
> >  #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
> >  
> > +/* For dlparable/hotpluggable slots */
> > +#define SPAPR_DRC_TABLE_SIZE    32
> > +#define SPAPR_DRC_PHB_SLOT_MAX  32
> > +#define SPAPR_DRC_DEV_ID_BASE   0x40000000
> 
> 
> Is this SPAPR_DRC_DEV_ID_BASE really necessary (can it be zero)? Is that
> global id or per PCI bus or per PHB?
> 
> 
> > +
> > +typedef struct sPAPRConfigureConnectorState {
> > +    void *fdt;
> > +    int offset_start;
> > +    int offset;
> > +    int depth;
> > +    PCIDevice *dev;
> > +    enum {
> > +        CC_STATE_IDLE = 0,
> > +        CC_STATE_PENDING = 1,
> > +        CC_STATE_ACTIVE,
> > +    } state;
> > +} sPAPRConfigureConnectorState;
> > +
> > +typedef struct sPAPRDrcEntry sPAPRDrcEntry;
> > +
> > +struct sPAPRDrcEntry {
> > +    uint32_t drc_index;
> > +    uint64_t phb_buid;
> > +    void *fdt;
> > +    int fdt_offset;
> > +    uint32_t state;
> > +    sPAPRConfigureConnectorState cc_state;
> > +    sPAPRDrcEntry *child_entries;
> > +};
> > +
> >  typedef struct sPAPREnvironment {
> >      struct VIOsPAPRBus *vio_bus;
> >      QLIST_HEAD(, sPAPRPHBState) phbs;
> > @@ -39,6 +69,9 @@ typedef struct sPAPREnvironment {
> >      int htab_save_index;
> >      bool htab_first_pass;
> >      int htab_fd;
> > +
> > +    /* state for Dynamic Reconfiguration Connectors */
> > +    sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
> >  } sPAPREnvironment;
> >  
> >  #define H_SUCCESS         0
> > @@ -481,5 +514,7 @@ int spapr_dma_dt(void *fdt, int node_off, const char *propname,
> >                   uint32_t liobn, uint64_t window, uint32_t size);
> >  int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
> >                        sPAPRTCETable *tcet);
> > +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
> > +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
> >  
> >  #endif /* !defined (__HW_SPAPR_H__) */
> > 
> 
> 
> -- 
> Alexey


* Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-26  8:24     ` Alexey Kardashevskiy
@ 2014-08-26 15:25       ` Michael Roth
  2014-08-26 15:41         ` Michael Roth
  2014-08-29 18:27         ` Tyrel Datwyler
  0 siblings, 2 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-26 15:25 UTC (permalink / raw)
  To: Alexey Kardashevskiy, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

Quoting Alexey Kardashevskiy (2014-08-26 03:24:08)
> On 08/26/2014 05:55 PM, Alexey Kardashevskiy wrote:
> > On 08/19/2014 10:21 AM, Michael Roth wrote:
> >> From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> >>
> >> This adds entries to the root OF node to advertise our PHBs as being
> >> DR-capable in accordance with the PAPR specification.
> >>
> >> Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
> >> and associated with a power domain of -1 (indicating to guests that
> >> power management is handled automatically by hardware).
> >>
> >> We currently allocate entries for up to 32 DR-capable PHBs, though
> >> this limit can be increased later.
> >>
> >> DrcEntry objects to track the state of the DR-connector associated
> >> with each PHB are stored in a 32-entry array, and each DrcEntry in
> >> turn has a dynamically-sized number of child DR-connectors,
> >> which we will use later to track the state of DR-connectors
> >> associated with a PHB's physical slots.
> >>
> >> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> >> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> >> ---
> >>  hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
> >>  hw/ppc/spapr_pci.c     |   1 +
> >>  include/hw/ppc/spapr.h |  35 ++++++++++++
> >>  3 files changed, 179 insertions(+)
> >>
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index 5c92707..d5e46c3 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
> >>      return ram_size;
> >>  }
> >>  
> >> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
> >> +{
> >> +    int i;
> >> +
> >> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> >> +        if (spapr->drc_table[i].phb_buid == buid) {
> >> +            return &spapr->drc_table[i];
> >> +        }
> >> +     }
> >> +
> >> +     return NULL;
> >> +}
> >> +
> >> +static void spapr_init_drc_table(void)
> >> +{
> >> +    int i;
> >> +
> >> +    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
> >> +
> >> +    /* For now we only care about PHB entries */
> >> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> >> +        spapr->drc_table[i].drc_index = 0x2000001 + i;
> >> +    }
> >> +}
> >> +
> >> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
> >> +{
> >> +    sPAPRDrcEntry *empty_drc = NULL;
> >> +    sPAPRDrcEntry *found_drc = NULL;
> >> +    int i, phb_index;
> >> +
> >> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> >> +        if (spapr->drc_table[i].phb_buid == 0) {
> >> +            empty_drc = &spapr->drc_table[i];
> >> +        }
> >> +
> >> +        if (spapr->drc_table[i].phb_buid == buid) {
> >> +            found_drc = &spapr->drc_table[i];
> > 
> > return &spapr->drc_table[i];
> > ?
> > 
> > 
> >> +            break;
> >> +        }
> >> +    }
> >> +
> >> +    if (found_drc) {
> >> +        return found_drc;
> >> +    }
> > 
> >    if (!empty_drc) {
> >         return NULL;
> >    }
> > 
> > ?
> > 
> > 
> >> +
> >> +    if (empty_drc) {
> > 
> > and no need in this :)
> > 
> > 
> >> +        empty_drc->phb_buid = buid;
> >> +        empty_drc->state = state;
> >> +        empty_drc->cc_state.fdt = NULL;
> >> +        empty_drc->cc_state.offset = 0;
> >> +        empty_drc->cc_state.depth = 0;
> >> +        empty_drc->cc_state.state = CC_STATE_IDLE;
> >> +        empty_drc->child_entries =
> >> +            g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
> >> +        phb_index = buid - SPAPR_PCI_BASE_BUID;
> >> +        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> >> +            empty_drc->child_entries[i].drc_index =
> >> +                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
> >> +        }
> >> +        return empty_drc;
> >> +    }
> >> +
> >> +    return NULL;
> >> +}
> >> +
> >> +static void spapr_create_drc_dt_entries(void *fdt)
> >> +{
> >> +    char char_buf[1024];
> >> +    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
> >> +    uint32_t *entries;
> >> +    int offset, fdt_offset;
> >> +    int i, ret;
> >> +
> >> +    fdt_offset = fdt_path_offset(fdt, "/");
> >> +
> >> +    /* ibm,drc-indexes */
> >> +    memset(int_buf, 0, sizeof(int_buf));
> >> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> >> +
> >> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> >> +        int_buf[i] = spapr->drc_table[i-1].drc_index;
> >> +    }
> >> +
> >> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
> >> +                      sizeof(int_buf));
> >> +    if (ret) {
> >> +        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
> > 
> > return here and below in the same error cases?
> > 
> >> +    }
> >> +
> >> +    /* ibm,drc-power-domains */
> >> +    memset(int_buf, 0, sizeof(int_buf));
> >> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> >> +
> >> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> >> +        int_buf[i] = 0xffffffff;
> >> +    }
> >> +
> >> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
> >> +                      sizeof(int_buf));
> >> +    if (ret) {
> >> +        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains property\n");
> >> +    }
> >> +
> >> +    /* ibm,drc-names */
> >> +    memset(char_buf, 0, sizeof(char_buf));
> >> +    entries = (uint32_t *)&char_buf[0];
> >> +    *entries = SPAPR_DRC_TABLE_SIZE;
> >> +    offset = sizeof(*entries);
> >> +
> >> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> >> +        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
> >> +        char_buf[offset++] = '\0';
> >> +    }
> >> +
> >> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
> >> +    if (ret) {
> >> +        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
> >> +    }
> >> +
> >> +    /* ibm,drc-types */
> >> +    memset(char_buf, 0, sizeof(char_buf));
> >> +    entries = (uint32_t *)&char_buf[0];
> >> +    *entries = SPAPR_DRC_TABLE_SIZE;
> >> +    offset = sizeof(*entries);
> >> +
> >> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> >> +        offset += sprintf(char_buf + offset, "PHB");
> >> +        char_buf[offset++] = '\0';
> >> +    }
> >> +
> >> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, offset);
> >> +    if (ret) {
> >> +        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
> >> +    }
> >> +}
> >> +
> >>  #define _FDT(exp) \
> >>      do { \
> >>          int ret = (exp);                                           \
> >> @@ -731,6 +868,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >>      char *bootlist;
> >>      void *fdt;
> >>      sPAPRPHBState *phb;
> >> +    sPAPRDrcEntry *drc_entry;
> >>  
> >>      fdt = g_malloc(FDT_MAX_SIZE);
> >>  
> >> @@ -750,6 +888,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >>      }
> >>  
> >>      QLIST_FOREACH(phb, &spapr->phbs, list) {
> >> +        drc_entry = spapr_phb_to_drc_entry(phb->buid);
> >> +        g_assert(drc_entry);
> >>          ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> >>      }
> >>  
> >> @@ -789,6 +929,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >>          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
> >>      }
> >>  
> >> +    spapr_create_drc_dt_entries(fdt);
> >> +
> >>      _FDT((fdt_pack(fdt)));
> >>  
> >>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> >> @@ -1443,6 +1585,7 @@ static void ppc_spapr_init(MachineState *machine)
> >>      spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW);
> >>      spapr_pci_rtas_init();
> >>  
> >> +    spapr_init_drc_table();
> >>      phb = spapr_create_phb(spapr, 0);
> >>  
> >>      for (i = 0; i < nb_nics; i++) {
> >> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> >> index 9ed39a9..e85134f 100644
> >> --- a/hw/ppc/spapr_pci.c
> >> +++ b/hw/ppc/spapr_pci.c
> >> @@ -531,6 +531,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >>              + sphb->index * SPAPR_PCI_WINDOW_SPACING;
> >>          sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
> >>          sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
> >> +        spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
> > 
> > 
> > What exactly does "unusable" mean here? Macro?
> > 
> > 
> > 
> >>      }
> >>  
> >>      if (sphb->buid == -1) {
> >> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >> index 36e8e51..c93794b 100644
> >> --- a/include/hw/ppc/spapr.h
> >> +++ b/include/hw/ppc/spapr.h
> >> @@ -10,6 +10,36 @@ struct sPAPRNVRAM;
> >>  
> >>  #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
> >>  
> >> +/* For dlparable/hotpluggable slots */
> >> +#define SPAPR_DRC_TABLE_SIZE    32
> >> +#define SPAPR_DRC_PHB_SLOT_MAX  32
> >> +#define SPAPR_DRC_DEV_ID_BASE   0x40000000
> > 
> > 
> > Is this SPAPR_DRC_DEV_ID_BASE really necessary (can it be zero)? Is that
> > global id or per PCI bus or per PHB?
> 
> 
> Ah. Got it. If it was like below, I would not even ask :)
> 
> #define SPAPR_DRC_DEV_ID(phb_index, slot) \
>         (0x40000000 | ((phb_index)<<8) | ((slot)<<3))
> 
> Still not clear why you need this 0x40000000 for. Is it kind of "PHB" DRC type?

Yes, it's somewhat ad hoc. The only requirement I see in PAPR is that this value be
globally unique across all DR resources. CPUs, memory, and the like might compute their
DRC indices differently (so a slot-based macro would need to be specific to PCI DR
entries). Honestly, I'm not sure where the 0x40000000 originated, and I'm not sure it
matters for QEMU, since we hold a monopoly on all DRC index assignments and don't have
to deal with hard-coded firmware values.

I will say that a base somewhat less common than 0 may prove useful from a debugging
standpoint, all other things being equal.

So I'm not sure what's best to do here. If we choose to leave it as is, I could at
least add a comment explaining this.
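
For illustration, here is a minimal sketch of the slot-based index computation
written as a function rather than a macro. Only the 0x40000000 base, the shift by 8
for the PHB index, the shift by 3 for the slot, and the limit of 32 slots per PHB
come from the patch; the names DRC_PCI_BASE, DRC_PHB_SLOT_MAX and
drc_pci_slot_index() are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values mirror SPAPR_DRC_DEV_ID_BASE and
 * SPAPR_DRC_PHB_SLOT_MAX from the patch under review. */
#define DRC_PCI_BASE      0x40000000u
#define DRC_PHB_SLOT_MAX  32u

/* Compute a DRC index for one PCI slot of one PHB.
 * With at most 32 slots, (slot << 3) stays below (1 << 8), so the slot
 * bits never overlap the PHB index bits and each (phb, slot) pair maps
 * to a distinct value. */
static uint32_t drc_pci_slot_index(uint32_t phb_index, uint32_t slot)
{
    assert(slot < DRC_PHB_SLOT_MAX);
    return DRC_PCI_BASE | (phb_index << 8) | (slot << 3);
}
```

Under these assumptions, PHB 1 slot 2 maps to 0x40000110, and the last slot of
PHB 0 (0x400000f8) stays below the first slot of PHB 1 (0x40000100).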

> 
> 
> > 
> >> +
> >> +typedef struct sPAPRConfigureConnectorState {
> >> +    void *fdt;
> >> +    int offset_start;
> >> +    int offset;
> >> +    int depth;
> >> +    PCIDevice *dev;
> >> +    enum {
> >> +        CC_STATE_IDLE = 0,
> >> +        CC_STATE_PENDING = 1,
> >> +        CC_STATE_ACTIVE,
> >> +    } state;
> >> +} sPAPRConfigureConnectorState;
> >> +
> >> +typedef struct sPAPRDrcEntry sPAPRDrcEntry;
> >> +
> >> +struct sPAPRDrcEntry {
> >> +    uint32_t drc_index;
> >> +    uint64_t phb_buid;
> >> +    void *fdt;
> >> +    int fdt_offset;
> >> +    uint32_t state;
> >> +    sPAPRConfigureConnectorState cc_state;
> >> +    sPAPRDrcEntry *child_entries;
> >> +};
> >> +
> >>  typedef struct sPAPREnvironment {
> >>      struct VIOsPAPRBus *vio_bus;
> >>      QLIST_HEAD(, sPAPRPHBState) phbs;
> >> @@ -39,6 +69,9 @@ typedef struct sPAPREnvironment {
> >>      int htab_save_index;
> >>      bool htab_first_pass;
> >>      int htab_fd;
> >> +
> >> +    /* state for Dynamic Reconfiguration Connectors */
> >> +    sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
> >>  } sPAPREnvironment;
> >>  
> >>  #define H_SUCCESS         0
> >> @@ -481,5 +514,7 @@ int spapr_dma_dt(void *fdt, int node_off, const char *propname,
> >>                   uint32_t liobn, uint64_t window, uint32_t size);
> >>  int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
> >>                        sPAPRTCETable *tcet);
> >> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
> >> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
> >>  
> >>  #endif /* !defined (__HW_SPAPR_H__) */
> >>
> > 
> > 
> 
> 
> -- 
> Alexey

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-26 15:25       ` Michael Roth
@ 2014-08-26 15:41         ` Michael Roth
  2014-08-29 18:27         ` Tyrel Datwyler
  1 sibling, 0 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-26 15:41 UTC (permalink / raw)
  To: Alexey Kardashevskiy, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

Quoting Michael Roth (2014-08-26 10:25:13)
> Quoting Alexey Kardashevskiy (2014-08-26 03:24:08)
> > On 08/26/2014 05:55 PM, Alexey Kardashevskiy wrote:
> > > On 08/19/2014 10:21 AM, Michael Roth wrote:
> > >> From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> > >>
> > >> This adds entries to the root OF node to advertise our PHBs as being
> > >> DR-capable in accordance with the PAPR specification.
> > >>
> > >> Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
> > >> and associated with a power domain of -1 (indicating to guests that
> > >> power management is handled automatically by hardware).
> > >>
> > >> We currently allocate entries for up to 32 DR-capable PHBs, though
> > >> this limit can be increased later.
> > >>
> > >> DrcEntry objects to track the state of the DR-connector associated
> > >> with each PHB are stored in a 32-entry array, and each DrcEntry in
> > >> turn has a dynamically-sized number of child DR-connectors,
> > >> which we will use later to track the state of DR-connectors
> > >> associated with a PHB's physical slots.
> > >>
> > >> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> > >> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > >> ---
> > >>  hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
> > >>  hw/ppc/spapr_pci.c     |   1 +
> > >>  include/hw/ppc/spapr.h |  35 ++++++++++++
> > >>  3 files changed, 179 insertions(+)
> > >>
> > >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > >> index 5c92707..d5e46c3 100644
> > >> --- a/hw/ppc/spapr.c
> > >> +++ b/hw/ppc/spapr.c
> > >> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
> > >>      return ram_size;
> > >>  }
> > >>  
> > >> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
> > >> +{
> > >> +    int i;
> > >> +
> > >> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > >> +        if (spapr->drc_table[i].phb_buid == buid) {
> > >> +            return &spapr->drc_table[i];
> > >> +        }
> > >> +     }
> > >> +
> > >> +     return NULL;
> > >> +}
> > >> +
> > >> +static void spapr_init_drc_table(void)
> > >> +{
> > >> +    int i;
> > >> +
> > >> +    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
> > >> +
> > >> +    /* For now we only care about PHB entries */
> > >> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > >> +        spapr->drc_table[i].drc_index = 0x2000001 + i;
> > >> +    }
> > >> +}
> > >> +
> > >> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
> > >> +{
> > >> +    sPAPRDrcEntry *empty_drc = NULL;
> > >> +    sPAPRDrcEntry *found_drc = NULL;
> > >> +    int i, phb_index;
> > >> +
> > >> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > >> +        if (spapr->drc_table[i].phb_buid == 0) {
> > >> +            empty_drc = &spapr->drc_table[i];
> > >> +        }
> > >> +
> > >> +        if (spapr->drc_table[i].phb_buid == buid) {
> > >> +            found_drc = &spapr->drc_table[i];
> > > 
> > > return &spapr->drc_table[i];
> > > ?
> > > 
> > > 
> > >> +            break;
> > >> +        }
> > >> +    }
> > >> +
> > >> +    if (found_drc) {
> > >> +        return found_drc;
> > >> +    }
> > > 
> > >    if (!empty_drc) {
> > >         return NULL;
> > >    }
> > > 
> > > ?
> > > 
> > > 
> > >> +
> > >> +    if (empty_drc) {
> > > 
> > > and no need in this :)
> > > 
> > > 
> > >> +        empty_drc->phb_buid = buid;
> > >> +        empty_drc->state = state;
> > >> +        empty_drc->cc_state.fdt = NULL;
> > >> +        empty_drc->cc_state.offset = 0;
> > >> +        empty_drc->cc_state.depth = 0;
> > >> +        empty_drc->cc_state.state = CC_STATE_IDLE;
> > >> +        empty_drc->child_entries =
> > >> +            g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
> > >> +        phb_index = buid - SPAPR_PCI_BASE_BUID;
> > >> +        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > >> +            empty_drc->child_entries[i].drc_index =
> > >> +                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
> > >> +        }
> > >> +        return empty_drc;
> > >> +    }
> > >> +
> > >> +    return NULL;
> > >> +}
> > >> +
> > >> +static void spapr_create_drc_dt_entries(void *fdt)
> > >> +{
> > >> +    char char_buf[1024];
> > >> +    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
> > >> +    uint32_t *entries;
> > >> +    int offset, fdt_offset;
> > >> +    int i, ret;
> > >> +
> > >> +    fdt_offset = fdt_path_offset(fdt, "/");
> > >> +
> > >> +    /* ibm,drc-indexes */
> > >> +    memset(int_buf, 0, sizeof(int_buf));
> > >> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> > >> +
> > >> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> > >> +        int_buf[i] = spapr->drc_table[i-1].drc_index;
> > >> +    }
> > >> +
> > >> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
> > >> +                      sizeof(int_buf));
> > >> +    if (ret) {
> > >> +        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
> > > 
> > > return here and below in the same error cases?
> > > 
> > >> +    }
> > >> +
> > >> +    /* ibm,drc-power-domains */
> > >> +    memset(int_buf, 0, sizeof(int_buf));
> > >> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> > >> +
> > >> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> > >> +        int_buf[i] = 0xffffffff;
> > >> +    }
> > >> +
> > >> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
> > >> +                      sizeof(int_buf));
> > >> +    if (ret) {
> > >> +        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains property\n");
> > >> +    }
> > >> +
> > >> +    /* ibm,drc-names */
> > >> +    memset(char_buf, 0, sizeof(char_buf));
> > >> +    entries = (uint32_t *)&char_buf[0];
> > >> +    *entries = SPAPR_DRC_TABLE_SIZE;
> > >> +    offset = sizeof(*entries);
> > >> +
> > >> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > >> +        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
> > >> +        char_buf[offset++] = '\0';
> > >> +    }
> > >> +
> > >> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
> > >> +    if (ret) {
> > >> +        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
> > >> +    }
> > >> +
> > >> +    /* ibm,drc-types */
> > >> +    memset(char_buf, 0, sizeof(char_buf));
> > >> +    entries = (uint32_t *)&char_buf[0];
> > >> +    *entries = SPAPR_DRC_TABLE_SIZE;
> > >> +    offset = sizeof(*entries);
> > >> +
> > >> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > >> +        offset += sprintf(char_buf + offset, "PHB");
> > >> +        char_buf[offset++] = '\0';
> > >> +    }
> > >> +
> > >> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, offset);
> > >> +    if (ret) {
> > >> +        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
> > >> +    }
> > >> +}
> > >> +
> > >>  #define _FDT(exp) \
> > >>      do { \
> > >>          int ret = (exp);                                           \
> > >> @@ -731,6 +868,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> > >>      char *bootlist;
> > >>      void *fdt;
> > >>      sPAPRPHBState *phb;
> > >> +    sPAPRDrcEntry *drc_entry;
> > >>  
> > >>      fdt = g_malloc(FDT_MAX_SIZE);
> > >>  
> > >> @@ -750,6 +888,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> > >>      }
> > >>  
> > >>      QLIST_FOREACH(phb, &spapr->phbs, list) {
> > >> +        drc_entry = spapr_phb_to_drc_entry(phb->buid);
> > >> +        g_assert(drc_entry);
> > >>          ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> > >>      }
> > >>  
> > >> @@ -789,6 +929,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> > >>          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
> > >>      }
> > >>  
> > >> +    spapr_create_drc_dt_entries(fdt);
> > >> +
> > >>      _FDT((fdt_pack(fdt)));
> > >>  
> > >>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> > >> @@ -1443,6 +1585,7 @@ static void ppc_spapr_init(MachineState *machine)
> > >>      spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW);
> > >>      spapr_pci_rtas_init();
> > >>  
> > >> +    spapr_init_drc_table();
> > >>      phb = spapr_create_phb(spapr, 0);
> > >>  
> > >>      for (i = 0; i < nb_nics; i++) {
> > >> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > >> index 9ed39a9..e85134f 100644
> > >> --- a/hw/ppc/spapr_pci.c
> > >> +++ b/hw/ppc/spapr_pci.c
> > >> @@ -531,6 +531,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> > >>              + sphb->index * SPAPR_PCI_WINDOW_SPACING;
> > >>          sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
> > >>          sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
> > >> +        spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
> > > 
> > > 
> > > What exactly does "unusable" mean here? Macro?
> > > 
> > > 
> > > 
> > >>      }
> > >>  
> > >>      if (sphb->buid == -1) {
> > >> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > >> index 36e8e51..c93794b 100644
> > >> --- a/include/hw/ppc/spapr.h
> > >> +++ b/include/hw/ppc/spapr.h
> > >> @@ -10,6 +10,36 @@ struct sPAPRNVRAM;
> > >>  
> > >>  #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
> > >>  
> > >> +/* For dlparable/hotpluggable slots */
> > >> +#define SPAPR_DRC_TABLE_SIZE    32
> > >> +#define SPAPR_DRC_PHB_SLOT_MAX  32
> > >> +#define SPAPR_DRC_DEV_ID_BASE   0x40000000
> > > 
> > > 
> > > Is this SPAPR_DRC_DEV_ID_BASE really necessary (can it be zero)? Is that
> > > global id or per PCI bus or per PHB?
> > 
> > 
> > Ah. Got it. If it was like below, I would not even ask :)
> > 
> > #define SPAPR_DRC_DEV_ID(phb_index, slot) \
> >         (0x40000000 | ((phb_index)<<8) | ((slot)<<3))
> > 
> > Still not clear why you need this 0x40000000 for. Is it kind of "PHB" DRC type?
> 
> Yes, it's somewhat ad-hoc...the only requirement I see in PAPR is that this value be
> globally unique across all DR resources. CPUs and memory and such might have different
> ways to compute their DRC indices (so a slot-based macro would need to be specific
> to PCI DR entries). I'm not sure where the 0x40000000 originated honestly. I'm not
> sure it matters for QEMU, since we hold a monopoly on all DRC index assignments and
> don't have to deal with hard-coded firmware values.
> 
> I will say that a base somewhat less common than 0 may prove useful from a debugging
> standpoint, all other things being equal.
> 
> So not sure what best to do here. If we choose to leave it as is, I could at least
> make sure to add a comment about this.

Hmm, all of this also applies to the 0x2000001 DRC base used for the parent PHB DRC
indices that Alex mentioned.

If we want to do something a little more structured, we could take the hotplug types
introduced later:

  #define RTAS_LOG_V6_HP_TYPE_CPU                          1
  #define RTAS_LOG_V6_HP_TYPE_MEMORY                       2
  #define RTAS_LOG_V6_HP_TYPE_SLOT                         3
  #define RTAS_LOG_V6_HP_TYPE_PHB                          4
  #define RTAS_LOG_V6_HP_TYPE_PCI                          5

and use them as the DRC index base. A macro representation would basically be:

#define SPAPR_DRC_DEV_ID_BASE(dr_type) ((dr_type) << 28)

I don't really like that 'dr_type' doesn't actually correspond to the ibm,drc-types
OF device tree property, though, whose values are unfortunately strings ("PHB" for
PHB, and "28" for PCI slot...). We could do a string->dr_type lookup in the macro,
but passing around "28" doesn't seem very readable either...
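
To make the idea concrete, here is a rough sketch of a type-based index base with a
string lookup, assuming the hotplug type values listed above. The helper names
drc_index_base() and drc_type_from_string() are hypothetical, and mapping "28" to
the PCI type (rather than the SLOT type) is an assumption made for this example,
not something settled in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hotplug type values as listed above. */
#define RTAS_LOG_V6_HP_TYPE_CPU     1
#define RTAS_LOG_V6_HP_TYPE_MEMORY  2
#define RTAS_LOG_V6_HP_TYPE_SLOT    3
#define RTAS_LOG_V6_HP_TYPE_PHB     4
#define RTAS_LOG_V6_HP_TYPE_PCI     5

/* Hypothetical helper: place the type in the top 4 bits of the index. */
static uint32_t drc_index_base(uint32_t dr_type)
{
    assert(dr_type < 16);   /* only 4 bits remain above bit 28 */
    return dr_type << 28;
}

/* Hypothetical lookup from an ibm,drc-types string to a type value.
 * Only "PHB" and "28" are mentioned in this thread; the "28" -> PCI
 * mapping is an assumption. Returns -1 for unknown strings. */
static int drc_type_from_string(const char *drc_type)
{
    if (strcmp(drc_type, "PHB") == 0) {
        return RTAS_LOG_V6_HP_TYPE_PHB;
    }
    if (strcmp(drc_type, "28") == 0) {
        return RTAS_LOG_V6_HP_TYPE_PCI;
    }
    return -1;
}
```

With these values the PHB type gives a base of 0x40000000 and the PCI type gives
0x50000000, so the two index ranges cannot collide as long as the per-type offset
stays below 1 << 28.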

> 
> > 
> > 
> > > 
> > >> +
> > >> +typedef struct sPAPRConfigureConnectorState {
> > >> +    void *fdt;
> > >> +    int offset_start;
> > >> +    int offset;
> > >> +    int depth;
> > >> +    PCIDevice *dev;
> > >> +    enum {
> > >> +        CC_STATE_IDLE = 0,
> > >> +        CC_STATE_PENDING = 1,
> > >> +        CC_STATE_ACTIVE,
> > >> +    } state;
> > >> +} sPAPRConfigureConnectorState;
> > >> +
> > >> +typedef struct sPAPRDrcEntry sPAPRDrcEntry;
> > >> +
> > >> +struct sPAPRDrcEntry {
> > >> +    uint32_t drc_index;
> > >> +    uint64_t phb_buid;
> > >> +    void *fdt;
> > >> +    int fdt_offset;
> > >> +    uint32_t state;
> > >> +    sPAPRConfigureConnectorState cc_state;
> > >> +    sPAPRDrcEntry *child_entries;
> > >> +};
> > >> +
> > >>  typedef struct sPAPREnvironment {
> > >>      struct VIOsPAPRBus *vio_bus;
> > >>      QLIST_HEAD(, sPAPRPHBState) phbs;
> > >> @@ -39,6 +69,9 @@ typedef struct sPAPREnvironment {
> > >>      int htab_save_index;
> > >>      bool htab_first_pass;
> > >>      int htab_fd;
> > >> +
> > >> +    /* state for Dynamic Reconfiguration Connectors */
> > >> +    sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
> > >>  } sPAPREnvironment;
> > >>  
> > >>  #define H_SUCCESS         0
> > >> @@ -481,5 +514,7 @@ int spapr_dma_dt(void *fdt, int node_off, const char *propname,
> > >>                   uint32_t liobn, uint64_t window, uint32_t size);
> > >>  int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
> > >>                        sPAPRTCETable *tcet);
> > >> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
> > >> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
> > >>  
> > >>  #endif /* !defined (__HW_SPAPR_H__) */
> > >>
> > > 
> > > 
> > 
> > 
> > -- 
> > Alexey

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-26 11:11   ` [Qemu-devel] " Alexander Graf
@ 2014-08-26 16:47     ` Michael Roth
  2014-08-26 17:16       ` Alexander Graf
  0 siblings, 1 reply; 69+ messages in thread
From: Michael Roth @ 2014-08-26 16:47 UTC (permalink / raw)
  To: Alexander Graf, qemu-devel; +Cc: aik, ncmike, qemu-ppc, tyreld, nfont

Quoting Alexander Graf (2014-08-26 06:11:24)
> On 19.08.14 02:21, Michael Roth wrote:
> > From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> > 
> > This adds entries to the root OF node to advertise our PHBs as being
> > DR-capable in accordance with the PAPR specification.
> > 
> > Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
> > and associated with a power domain of -1 (indicating to guests that
> > power management is handled automatically by hardware).
> > 
> > We currently allocate entries for up to 32 DR-capable PHBs, though
> > this limit can be increased later.
> > 
> > DrcEntry objects to track the state of the DR-connector associated
> > with each PHB are stored in a 32-entry array, and each DrcEntry in
> > turn has a dynamically-sized number of child DR-connectors,
> > which we will use later to track the state of DR-connectors
> > associated with a PHB's physical slots.
> > 
> > Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  hw/ppc/spapr_pci.c     |   1 +
> >  include/hw/ppc/spapr.h |  35 ++++++++++++
> >  3 files changed, 179 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 5c92707..d5e46c3 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
> >      return ram_size;
> >  }
> >  
> > +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
> > +{
> > +    int i;
> > +
> > +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > +        if (spapr->drc_table[i].phb_buid == buid) {
> > +            return &spapr->drc_table[i];
> > +        }
> > +     }
> > +
> > +     return NULL;
> > +}
> > +
> > +static void spapr_init_drc_table(void)
> > +{
> > +    int i;
> > +
> > +    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
> > +
> > +    /* For now we only care about PHB entries */
> > +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > +        spapr->drc_table[i].drc_index = 0x2000001 + i;
> 
> magic number?
> 
> > +    }
> > +}
> > +
> > +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
> > +{
> > +    sPAPRDrcEntry *empty_drc = NULL;
> > +    sPAPRDrcEntry *found_drc = NULL;
> > +    int i, phb_index;
> > +
> > +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > +        if (spapr->drc_table[i].phb_buid == 0) {
> > +            empty_drc = &spapr->drc_table[i];
> > +        }
> > +
> > +        if (spapr->drc_table[i].phb_buid == buid) {
> > +            found_drc = &spapr->drc_table[i];
> > +            break;
> > +        }
> > +    }
> > +
> > +    if (found_drc) {
> > +        return found_drc;
> > +    }
> > +
> > +    if (empty_drc) {
> > +        empty_drc->phb_buid = buid;
> > +        empty_drc->state = state;
> > +        empty_drc->cc_state.fdt = NULL;
> > +        empty_drc->cc_state.offset = 0;
> > +        empty_drc->cc_state.depth = 0;
> > +        empty_drc->cc_state.state = CC_STATE_IDLE;
> > +        empty_drc->child_entries =
> > +            g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
> > +        phb_index = buid - SPAPR_PCI_BASE_BUID;
> > +        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +            empty_drc->child_entries[i].drc_index =
> > +                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
> > +        }
> > +        return empty_drc;
> > +    }
> > +
> > +    return NULL;
> > +}
> > +
> > +static void spapr_create_drc_dt_entries(void *fdt)
> > +{
> > +    char char_buf[1024];
> > +    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
> > +    uint32_t *entries;
> > +    int offset, fdt_offset;
> > +    int i, ret;
> > +
> > +    fdt_offset = fdt_path_offset(fdt, "/");
> > +
> > +    /* ibm,drc-indexes */
> > +    memset(int_buf, 0, sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> > +
> > +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> > +        int_buf[i] = spapr->drc_table[i-1].drc_index;
> 
> Not endian safe.
> 
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
> > +    }
> > +
> > +    /* ibm,drc-power-domains */
> > +    memset(int_buf, 0, sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> 
> Not endian safe.
> 
> > +
> > +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> > +        int_buf[i] = 0xffffffff;
> > +    }
> 
> memset(-1) instead above?
> 
> > +
> > +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains property\n");
> > +    }
> > +
> > +    /* ibm,drc-names */
> > +    memset(char_buf, 0, sizeof(char_buf));
> > +    entries = (uint32_t *)&char_buf[0];
> > +    *entries = SPAPR_DRC_TABLE_SIZE;
> 
> Not endian safe. I guess you get the idea. I'll stop looking for endian
> problems here :).
> 
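
As a sketch of what an endian-safe version of the count-plus-indexes layout could
look like: device tree cells are big-endian, so each 32-bit value needs converting
before it goes into the property buffer. In QEMU that would normally be done with
cpu_to_be32(); the put_be32() and build_drc_indexes() helpers below are hypothetical
stand-ins so the example compiles without QEMU headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for cpu_to_be32(): store v at p in big-endian byte order,
 * regardless of host endianness. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Hypothetical helper: write the entry count followed by each index,
 * all as big-endian cells. buf must hold (count + 1) * 4 bytes.
 * Returns the number of bytes written, suitable as a property length. */
static size_t build_drc_indexes(uint8_t *buf, const uint32_t *indexes,
                                uint32_t count)
{
    uint32_t i;

    assert(buf != NULL && indexes != NULL);
    put_be32(buf, count);
    for (i = 0; i < count; i++) {
        put_be32(buf + 4 * ((size_t)i + 1), indexes[i]);
    }
    return 4 * ((size_t)count + 1);
}
```

The same conversion would apply to the leading count cell in the
ibm,drc-power-domains, ibm,drc-names and ibm,drc-types buffers.
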
> > +    offset = sizeof(*entries);
> > +
> > +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > +        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
> > +        char_buf[offset++] = '\0';
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
> > +    if (ret) {
> > +        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
> > +    }
> > +
> > +    /* ibm,drc-types */
> > +    memset(char_buf, 0, sizeof(char_buf));
> > +    entries = (uint32_t *)&char_buf[0];
> > +    *entries = SPAPR_DRC_TABLE_SIZE;
> > +    offset = sizeof(*entries);
> > +
> > +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> > +        offset += sprintf(char_buf + offset, "PHB");
> > +        char_buf[offset++] = '\0';
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, offset);
> > +    if (ret) {
> > +        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
> > +    }
> > +}
> > +
> >  #define _FDT(exp) \
> >      do { \
> >          int ret = (exp);                                           \
> > @@ -731,6 +868,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >      char *bootlist;
> >      void *fdt;
> >      sPAPRPHBState *phb;
> > +    sPAPRDrcEntry *drc_entry;
> >  
> >      fdt = g_malloc(FDT_MAX_SIZE);
> >  
> > @@ -750,6 +888,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >      }
> >  
> >      QLIST_FOREACH(phb, &spapr->phbs, list) {
> > +        drc_entry = spapr_phb_to_drc_entry(phb->buid);
> > +        g_assert(drc_entry);
> >          ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> >      }
> >  
> > @@ -789,6 +929,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
> >      }
> >  
> > +    spapr_create_drc_dt_entries(fdt);
> 
> I would really prefer if we can stick to always use the spapr as
> function parameter, not use the global.
> 
> > +
> >      _FDT((fdt_pack(fdt)));
> >  
> >      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> > @@ -1443,6 +1585,7 @@ static void ppc_spapr_init(MachineState *machine)
> >      spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW);
> >      spapr_pci_rtas_init();
> >  
> > +    spapr_init_drc_table();
> >      phb = spapr_create_phb(spapr, 0);
> >  
> >      for (i = 0; i < nb_nics; i++) {
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 9ed39a9..e85134f 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -531,6 +531,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >              + sphb->index * SPAPR_PCI_WINDOW_SPACING;
> >          sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
> >          sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
> > +        spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
> >      }
> >  
> >      if (sphb->buid == -1) {
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 36e8e51..c93794b 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -10,6 +10,36 @@ struct sPAPRNVRAM;
> >  
> >  #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
> >  
> > +/* For dlparable/hotpluggable slots */
> > +#define SPAPR_DRC_TABLE_SIZE    32
> 
> Can we make this dynamic so that we can set it to 0 for pseries-2.0 (if
> necessary) or have an easy tunable to extend the list later?

We could introduce something like -machine pseries,max-dr-connectors=x maybe,
and set the default based on current machine. Though it's worth noting future
stuff like cpu/mem DRC entries will get allocated via the same top-level
ibm,drc-indexes list property (before or after PHB entries), so
the meaning of that option would change unless we name it something specific
to PHB entries, like max-phb-dr-connectors.

> 
> 
> Alex
> 
> > +#define SPAPR_DRC_PHB_SLOT_MAX  32
> > +#define SPAPR_DRC_DEV_ID_BASE   0x40000000
> > +
> > +typedef struct sPAPRConfigureConnectorState {
> > +    void *fdt;
> > +    int offset_start;
> > +    int offset;
> > +    int depth;
> > +    PCIDevice *dev;
> > +    enum {
> > +        CC_STATE_IDLE = 0,
> > +        CC_STATE_PENDING = 1,
> > +        CC_STATE_ACTIVE,
> > +    } state;
> > +} sPAPRConfigureConnectorState;
> > +
> > +typedef struct sPAPRDrcEntry sPAPRDrcEntry;
> > +
> > +struct sPAPRDrcEntry {
> > +    uint32_t drc_index;
> > +    uint64_t phb_buid;
> > +    void *fdt;
> > +    int fdt_offset;
> > +    uint32_t state;
> > +    sPAPRConfigureConnectorState cc_state;
> > +    sPAPRDrcEntry *child_entries;
> > +};
> > +
> >  typedef struct sPAPREnvironment {
> >      struct VIOsPAPRBus *vio_bus;
> >      QLIST_HEAD(, sPAPRPHBState) phbs;
> > @@ -39,6 +69,9 @@ typedef struct sPAPREnvironment {
> >      int htab_save_index;
> >      bool htab_first_pass;
> >      int htab_fd;
> > +
> > +    /* state for Dynamic Reconfiguration Connectors */
> > +    sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
> >  } sPAPREnvironment;
> >  
> >  #define H_SUCCESS         0
> > @@ -481,5 +514,7 @@ int spapr_dma_dt(void *fdt, int node_off, const char *propname,
> >                   uint32_t liobn, uint64_t window, uint32_t size);
> >  int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
> >                        sPAPRTCETable *tcet);
> > +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
> > +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
> >  
> >  #endif /* !defined (__HW_SPAPR_H__) */
> >


* Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-26 16:47     ` Michael Roth
@ 2014-08-26 17:16       ` Alexander Graf
  0 siblings, 0 replies; 69+ messages in thread
From: Alexander Graf @ 2014-08-26 17:16 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: aik, ncmike, qemu-ppc, tyreld, nfont



On 26.08.14 18:47, Michael Roth wrote:
> Quoting Alexander Graf (2014-08-26 06:11:24)
>> On 19.08.14 02:21, Michael Roth wrote:
>>> From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>>
>>> This adds entries to the root OF node to advertise our PHBs as being
>>> DR-capable in accordance with the PAPR specification.
>>>
>>> Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
>>> and associated with a power domain of -1 (indicating to guests that
>>> power management is handled automatically by hardware).
>>>
>>> We currently allocate entries for up to 32 DR-capable PHBs, though
>>> this limit can be increased later.
>>>
>>> DrcEntry objects to track the state of the DR-connector associated
>>> with each PHB are stored in a 32-entry array, and each DrcEntry in
>>> turn has a dynamically-sized number of child DR-connectors,
>>> which we will use later to track the state of DR-connectors
>>> associated with a PHB's physical slots.
>>>
>>> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>>> ---
>>>  hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>  hw/ppc/spapr_pci.c     |   1 +
>>>  include/hw/ppc/spapr.h |  35 ++++++++++++
>>>  3 files changed, 179 insertions(+)
>>>
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index 5c92707..d5e46c3 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
>>>      return ram_size;
>>>  }
>>>  
>>> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
>>> +{
>>> +    int i;
>>> +
>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>> +        if (spapr->drc_table[i].phb_buid == buid) {
>>> +            return &spapr->drc_table[i];
>>> +        }
>>> +     }
>>> +
>>> +     return NULL;
>>> +}
>>> +
>>> +static void spapr_init_drc_table(void)
>>> +{
>>> +    int i;
>>> +
>>> +    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
>>> +
>>> +    /* For now we only care about PHB entries */
>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>> +        spapr->drc_table[i].drc_index = 0x2000001 + i;
>>
>> magic number?
>>
>>> +    }
>>> +}
>>> +
>>> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
>>> +{
>>> +    sPAPRDrcEntry *empty_drc = NULL;
>>> +    sPAPRDrcEntry *found_drc = NULL;
>>> +    int i, phb_index;
>>> +
>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>> +        if (spapr->drc_table[i].phb_buid == 0) {
>>> +            empty_drc = &spapr->drc_table[i];
>>> +        }
>>> +
>>> +        if (spapr->drc_table[i].phb_buid == buid) {
>>> +            found_drc = &spapr->drc_table[i];
>>> +            break;
>>> +        }
>>> +    }
>>> +
>>> +    if (found_drc) {
>>> +        return found_drc;
>>> +    }
>>> +
>>> +    if (empty_drc) {
>>> +        empty_drc->phb_buid = buid;
>>> +        empty_drc->state = state;
>>> +        empty_drc->cc_state.fdt = NULL;
>>> +        empty_drc->cc_state.offset = 0;
>>> +        empty_drc->cc_state.depth = 0;
>>> +        empty_drc->cc_state.state = CC_STATE_IDLE;
>>> +        empty_drc->child_entries =
>>> +            g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
>>> +        phb_index = buid - SPAPR_PCI_BASE_BUID;
>>> +        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
>>> +            empty_drc->child_entries[i].drc_index =
>>> +                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
>>> +        }
>>> +        return empty_drc;
>>> +    }
>>> +
>>> +    return NULL;
>>> +}
>>> +
>>> +static void spapr_create_drc_dt_entries(void *fdt)
>>> +{
>>> +    char char_buf[1024];
>>> +    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
>>> +    uint32_t *entries;
>>> +    int offset, fdt_offset;
>>> +    int i, ret;
>>> +
>>> +    fdt_offset = fdt_path_offset(fdt, "/");
>>> +
>>> +    /* ibm,drc-indexes */
>>> +    memset(int_buf, 0, sizeof(int_buf));
>>> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
>>> +
>>> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
>>> +        int_buf[i] = spapr->drc_table[i-1].drc_index;
>>
>> Not endian safe.
>>
>>> +    }
>>> +
>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
>>> +                      sizeof(int_buf));
>>> +    if (ret) {
>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
>>> +    }
>>> +
>>> +    /* ibm,drc-power-domains */
>>> +    memset(int_buf, 0, sizeof(int_buf));
>>> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
>>
>> Not endian safe.
>>
>>> +
>>> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
>>> +        int_buf[i] = 0xffffffff;
>>> +    }
>>
>> memset(-1) instead above?
>>
>>> +
>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
>>> +                      sizeof(int_buf));
>>> +    if (ret) {
>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains property\n");
>>> +    }
>>> +
>>> +    /* ibm,drc-names */
>>> +    memset(char_buf, 0, sizeof(char_buf));
>>> +    entries = (uint32_t *)&char_buf[0];
>>> +    *entries = SPAPR_DRC_TABLE_SIZE;
>>
>> Not endian safe. I guess you get the idea. I'll stop looking for endian
>> problems here :).
>>
>>> +    offset = sizeof(*entries);
>>> +
>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>> +        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
>>> +        char_buf[offset++] = '\0';
>>> +    }
>>> +
>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
>>> +    if (ret) {
>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
>>> +    }
>>> +
>>> +    /* ibm,drc-types */
>>> +    memset(char_buf, 0, sizeof(char_buf));
>>> +    entries = (uint32_t *)&char_buf[0];
>>> +    *entries = SPAPR_DRC_TABLE_SIZE;
>>> +    offset = sizeof(*entries);
>>> +
>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>> +        offset += sprintf(char_buf + offset, "PHB");
>>> +        char_buf[offset++] = '\0';
>>> +    }
>>> +
>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, offset);
>>> +    if (ret) {
>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
>>> +    }
>>> +}
>>> +
>>>  #define _FDT(exp) \
>>>      do { \
>>>          int ret = (exp);                                           \
>>> @@ -731,6 +868,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>>      char *bootlist;
>>>      void *fdt;
>>>      sPAPRPHBState *phb;
>>> +    sPAPRDrcEntry *drc_entry;
>>>  
>>>      fdt = g_malloc(FDT_MAX_SIZE);
>>>  
>>> @@ -750,6 +888,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>>      }
>>>  
>>>      QLIST_FOREACH(phb, &spapr->phbs, list) {
>>> +        drc_entry = spapr_phb_to_drc_entry(phb->buid);
>>> +        g_assert(drc_entry);
>>>          ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
>>>      }
>>>  
>>> @@ -789,6 +929,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>>          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
>>>      }
>>>  
>>> +    spapr_create_drc_dt_entries(fdt);
>>
>> I would really prefer if we can stick to always use the spapr as
>> function parameter, not use the global.
>>
>>> +
>>>      _FDT((fdt_pack(fdt)));
>>>  
>>>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
>>> @@ -1443,6 +1585,7 @@ static void ppc_spapr_init(MachineState *machine)
>>>      spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW);
>>>      spapr_pci_rtas_init();
>>>  
>>> +    spapr_init_drc_table();
>>>      phb = spapr_create_phb(spapr, 0);
>>>  
>>>      for (i = 0; i < nb_nics; i++) {
>>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>>> index 9ed39a9..e85134f 100644
>>> --- a/hw/ppc/spapr_pci.c
>>> +++ b/hw/ppc/spapr_pci.c
>>> @@ -531,6 +531,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>>              + sphb->index * SPAPR_PCI_WINDOW_SPACING;
>>>          sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
>>>          sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
>>> +        spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
>>>      }
>>>  
>>>      if (sphb->buid == -1) {
>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>>> index 36e8e51..c93794b 100644
>>> --- a/include/hw/ppc/spapr.h
>>> +++ b/include/hw/ppc/spapr.h
>>> @@ -10,6 +10,36 @@ struct sPAPRNVRAM;
>>>  
>>>  #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
>>>  
>>> +/* For dlparable/hotpluggable slots */
>>> +#define SPAPR_DRC_TABLE_SIZE    32
>>
>> Can we make this dynamic so that we can set it to 0 for pseries-2.0 (if
>> necessary) or have an easy tunable to extend the list later?
> 
> We could introduce something like -machine pseries,max-dr-connectors=x maybe,
> and set the default based on current machine. Though it's worth noting future
> stuff like cpu/mem DRC entries will get allocated via the same top-level
> ibm,drc-indexes list property (before or after PHB entries), so
> the meaning of that option would change unless we name it something specific
> to PHB entries, like max-phb-dr-connectors.

I don't think we'd have to expose this to the user at all, so the naming
doesn't have to stay consistent :).
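
A minimal sketch of what an internally sized table could look like, assuming
the entry count is picked per machine type at init time rather than through a
user-visible option. DrcTable, drc_table_init() and drc_table_free() are
hypothetical names, the entry struct is cut down to two fields, and calloc()
stands in for g_malloc0() so the fragment builds on its own. The 0x2000001
base is the value used in the patch.

```c
#include <stdint.h>
#include <stdlib.h>

#define DRC_PHB_INDEX_BASE 0x2000001u /* base value taken from the patch */

typedef struct DrcEntry {
    uint32_t drc_index;
    uint64_t phb_buid;
} DrcEntry;

typedef struct DrcTable {
    DrcEntry *entries; /* NULL when count is 0 */
    size_t count;
} DrcTable;

/* Allocates count entries. count would come from the machine type,
 * e.g. 0 for an older machine and 32 for a newer one.
 * Returns 0 on success, -1 on allocation failure. */
static int drc_table_init(DrcTable *t, size_t count)
{
    size_t i;

    t->entries = NULL;
    t->count = 0;
    if (count == 0) {
        return 0;
    }
    t->entries = calloc(count, sizeof(*t->entries));
    if (!t->entries) {
        return -1;
    }
    for (i = 0; i < count; i++) {
        t->entries[i].drc_index = DRC_PHB_INDEX_BASE + (uint32_t)i;
    }
    t->count = count;
    return 0;
}

static void drc_table_free(DrcTable *t)
{
    free(t->entries);
    t->entries = NULL;
    t->count = 0;
}
```

Loops that currently run to SPAPR_DRC_TABLE_SIZE would iterate to t->count
instead, so a count of 0 simply produces no PHB DRC entries.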


Alex


* Re: [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs
  2014-08-26  8:32   ` Alexey Kardashevskiy
@ 2014-08-26 17:16     ` Michael Roth
  0 siblings, 0 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-26 17:16 UTC (permalink / raw)
  To: Alexey Kardashevskiy, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

Quoting Alexey Kardashevskiy (2014-08-26 03:32:58)
> On 08/19/2014 10:21 AM, Michael Roth wrote:
> > Reserve 32 entries of type PCI in each PHB's initial FDT. This
> > advertises to guests that each PHB is a DR-capable device with
> > physical hotpluggable slots. This is necessary for allowing
> > hotplugging of devices to it later via bus rescan or guest rpaphp
> > hotplug module.
> > 
> > Each entry is assigned a name of "Slot <<bus_idx>*32 +1>",
> > advertised as a hotpluggable PCI slot, and assigned to power domain
> > -1 to indicate to the guest that power management is handled by the
> > hardware.
> > 
> > This models a DR-capable PCI expansion device attached to a host/lpar
> > via a single PHB with 32 physical hotpluggable slots (as opposed to a
> > virtual bridge device with external management console). Hotplug will
> > be handled by the guest via bus rescan or the rpaphp hotplug module.
> > 
> > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c              |   3 +-
> >  hw/ppc/spapr_pci.c          | 102 ++++++++++++++++++++++++++++++++++++++++++++
> >  include/hw/pci-host/spapr.h |   1 +
> >  3 files changed, 105 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index d5e46c3..90b25b3 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -890,7 +890,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >      QLIST_FOREACH(phb, &spapr->phbs, list) {
> >          drc_entry = spapr_phb_to_drc_entry(phb->buid);
> >          g_assert(drc_entry);
> > -        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> > +        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, drc_entry->drc_index,
> > +                                    fdt);
> >      }
> >  
> >      if (ret < 0) {
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index e85134f..924d488 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -851,8 +851,104 @@ static int spapr_phb_children_dt(Object *child, void *opaque)
> >      return 1;
> >  }
> >  
> > +static void spapr_create_drc_phb_dt_entries(void *fdt, int bus_off, int phb_index)
> > +{
> > +    char char_buf[1024];
> > +    uint32_t int_buf[SPAPR_DRC_PHB_SLOT_MAX + 1];
> > +    uint32_t *entries;
> > +    int i, ret, offset;
> > +
> > +    /* ibm,drc-indexes */
> > +    memset(int_buf, 0 , sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> > +
> > +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +        int_buf[i] = SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + ((i - 1) << 3);
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-indexes", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr, "error adding 'ibm,drc-indexes' field for PHB FDT");
> > +    }
> > +
> > +    /* ibm,drc-power-domains */
> > +    memset(int_buf, 0, sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> > +
> > +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +        int_buf[i] = 0xffffffff;
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-power-domains", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr,
> > +                "error adding 'ibm,drc-power-domains' field for PHB FDT");
> 
> As before - return here and below.
> 
> > +    }
> > +
> > +    /* ibm,drc-names */
> > +    memset(char_buf, 0, sizeof(char_buf));
> > +    entries = (uint32_t *)&char_buf[0];
> > +    *entries = SPAPR_DRC_PHB_SLOT_MAX;
> > +    offset = sizeof(*entries);
> > +
> > +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +        offset += sprintf(char_buf + offset, "Slot %d",
> > +                          (phb_index * SPAPR_DRC_PHB_SLOT_MAX) + i - 1);
> 
> Mmmm. From 1 to <=MAX and (i-1) inside the loop when it could be
> traditional  0 to <MAX and (i) as it is done below :)
> 
> 
> > +        char_buf[offset++] = '\0';
> 
> 
> sprintf() puts zero there itself, no? And as we are here, should not it be
> snprintf()?

You mean something like this?

    snprintf(char_buf + offset, sizeof(char_buf) - offset, "Slot %d", ...)

I'm not sure what happens if sizeof(char_buf) - offset goes negative though, so
maybe:

    for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX && offset < sizeof(char_buf); i++) {
        if (offset >= sizeof(char_buf)) {
            fprintf(stderr,
                    "error generating 'ibm,drc-names' field for PHB FDT");
            return;
        }
        offset += snprintf(char_buf + offset, sizeof(char_buf) - offset,
                           "Slot %d", (phb_index * SPAPR_DRC_PHB_SLOT_MAX) + i);
        offset++;
    }
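
For reference, a self-contained sketch of one way to bound these appends. Two
details matter here: sizeof yields a size_t, so once offset exceeds the buffer
size the subtraction wraps to a large unsigned value instead of going
negative, and snprintf()/vsnprintf() return the length the full string would
have had, so truncation shows up as a return value greater than or equal to
the space that was left. append_entry() and build_slot_names() are
hypothetical helpers, not existing QEMU functions, and the leading entry count
is left to the caller.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Appends one formatted, NUL-terminated string at buf + *offset.
 * Returns 0 on success, -1 if the string plus its terminator does not
 * fit. On success *offset is advanced past the terminator. */
static int append_entry(char *buf, size_t size, size_t *offset,
                        const char *fmt, ...)
{
    va_list ap;
    size_t remaining;
    int n;

    /* Check before subtracting so the unsigned difference cannot wrap. */
    if (*offset >= size) {
        return -1;
    }
    remaining = size - *offset;

    va_start(ap, fmt);
    n = vsnprintf(buf + *offset, remaining, fmt, ap);
    va_end(ap);

    /* vsnprintf returns the untruncated length, excluding the NUL. */
    if (n < 0 || (size_t)n >= remaining) {
        return -1;
    }
    *offset += (size_t)n + 1;
    return 0;
}

/* Writes "Slot N" names for one PHB, starting at start_offset (the
 * bytes before it are reserved for the entry count).
 * Returns the total number of bytes used, or 0 on overflow. */
static size_t build_slot_names(char *buf, size_t size, size_t start_offset,
                               int phb_index, int slot_max)
{
    size_t offset = start_offset;
    int i;

    for (i = 0; i < slot_max; i++) {
        if (append_entry(buf, size, &offset, "Slot %d",
                         phb_index * slot_max + i) < 0) {
            return 0;
        }
    }
    return offset;
}
```

Because the check runs on every append, including the last one, the length
later passed to fdt_setprop() can never exceed the buffer size.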

> 
> 
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-names", char_buf, offset);
> > +    if (ret) {
> > +        fprintf(stderr, "error adding 'ibm,drc-names' field for PHB FDT");
> > +    }
> > +
> > +    /* ibm,drc-types */
> > +    memset(char_buf, 0, sizeof(char_buf));
> > +    entries = (uint32_t *)&char_buf[0];
> > +    *entries = SPAPR_DRC_PHB_SLOT_MAX;
> > +    offset = sizeof(*entries);
> > +
> > +    for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +        offset += sprintf(char_buf + offset, "28");
> > +        char_buf[offset++] = '\0';
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-types", char_buf, offset);
> > +    if (ret) {
> > +        fprintf(stderr, "error adding 'ibm,drc-types' field for PHB FDT");
> > +    }
> > +
> > +    /* we want the initial indicator state to be 0 - "empty", when we
> > +     * hot-plug an adaptor in the slot, we need to set the indicator
> > +     * to 1 - "present."
> > +     */
> > +
> > +    /* ibm,indicator-9003 */
> > +    memset(int_buf, 0, sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,indicator-9003", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr, "error adding 'ibm,indicator-9003' field for PHB FDT");
> > +    }
> > +
> > +    /* ibm,sensor-9003 */
> > +    memset(int_buf, 0, sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,sensor-9003", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr, "error adding 'ibm,sensor-9003' field for PHB FDT");
> > +    }
> > +}
> > +
> >  int spapr_populate_pci_dt(sPAPRPHBState *phb,
> >                            uint32_t xics_phandle,
> > +                          uint32_t drc_index,
> >                            void *fdt)
> >  {
> >      int bus_off, i, j;
> > @@ -934,6 +1030,12 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb,
> >      object_child_foreach(OBJECT(phb), spapr_phb_children_dt,
> >                           &((sPAPRTCEDT){ .fdt = fdt, .node_off = bus_off }));
> >  
> > +    spapr_create_drc_phb_dt_entries(fdt, bus_off, phb->index);
> > +    if (drc_index) {
> > +        _FDT(fdt_setprop(fdt, bus_off, "ibm,my-drc-index", &drc_index,
> > +                         sizeof(drc_index)));
> > +    }
> > +
> >      return 0;
> >  }
> >  
> > diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
> > index 32f0aa7..8f0a42f 100644
> > --- a/include/hw/pci-host/spapr.h
> > +++ b/include/hw/pci-host/spapr.h
> > @@ -116,6 +116,7 @@ PCIHostState *spapr_create_phb(sPAPREnvironment *spapr, int index);
> >  
> >  int spapr_populate_pci_dt(sPAPRPHBState *phb,
> >                            uint32_t xics_phandle,
> > +                          uint32_t drc_index,
> >                            void *fdt);
> >  
> >  void spapr_pci_msi_init(sPAPREnvironment *spapr, hwaddr addr);
> > 
> 
> 
> -- 
> Alexey


* Re: [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs
  2014-08-26  9:09   ` Alexey Kardashevskiy
@ 2014-08-26 17:52     ` Michael Roth
  0 siblings, 0 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-26 17:52 UTC (permalink / raw)
  To: Alexey Kardashevskiy, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

Quoting Alexey Kardashevskiy (2014-08-26 04:09:36)
> On 08/19/2014 10:21 AM, Michael Roth wrote:
> > Reserve 32 entries of type PCI in each PHB's initial FDT. This
> > advertises to guests that each PHB is a DR-capable device with
> > physical hotpluggable slots. This is necessary for allowing
> > hotplugging of devices to it later via bus rescan or guest rpaphp
> > hotplug module.
> > 
> > Each entry is assigned a name of "Slot <<bus_idx>*32 +1>",
> > advertised as a hotpluggable PCI slot, and assigned to power domain
> > -1 to indicate to the guest that power management is handled by the
> > hardware.
> > 
> > This models a DR-capable PCI expansion device attached to a host/lpar
> > via a single PHB with 32 physical hotpluggable slots (as opposed to a
> > virtual bridge device with external management console). Hotplug will
> > be handled by the guest via bus rescan or the rpaphp hotplug module.
> > 
> > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c              |   3 +-
> >  hw/ppc/spapr_pci.c          | 102 ++++++++++++++++++++++++++++++++++++++++++++
> >  include/hw/pci-host/spapr.h |   1 +
> >  3 files changed, 105 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index d5e46c3..90b25b3 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -890,7 +890,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >      QLIST_FOREACH(phb, &spapr->phbs, list) {
> >          drc_entry = spapr_phb_to_drc_entry(phb->buid);
> >          g_assert(drc_entry);
> > -        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> > +        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, drc_entry->drc_index,
> > +                                    fdt);
> >      }
> >  
> >      if (ret < 0) {
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index e85134f..924d488 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -851,8 +851,104 @@ static int spapr_phb_children_dt(Object *child, void *opaque)
> >      return 1;
> >  }
> >  
> > +static void spapr_create_drc_phb_dt_entries(void *fdt, int bus_off, int phb_index)
> > +{
> > +    char char_buf[1024];
> > +    uint32_t int_buf[SPAPR_DRC_PHB_SLOT_MAX + 1];
> > +    uint32_t *entries;
> > +    int i, ret, offset;
> > +
> > +    /* ibm,drc-indexes */
> > +    memset(int_buf, 0 , sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> > +
> > +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +        int_buf[i] = SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + ((i - 1) << 3);
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-indexes", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr, "error adding 'ibm,drc-indexes' field for PHB FDT");
> > +    }
> > +
> > +    /* ibm,drc-power-domains */
> > +    memset(int_buf, 0, sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> > +
> > +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +        int_buf[i] = 0xffffffff;
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-power-domains", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr,
> > +                "error adding 'ibm,drc-power-domains' field for PHB FDT");
> > +    }
> > +
> > +    /* ibm,drc-names */
> > +    memset(char_buf, 0, sizeof(char_buf));
> > +    entries = (uint32_t *)&char_buf[0];
> > +    *entries = SPAPR_DRC_PHB_SLOT_MAX;
> > +    offset = sizeof(*entries);
> > +
> > +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +        offset += sprintf(char_buf + offset, "Slot %d",
> > +                          (phb_index * SPAPR_DRC_PHB_SLOT_MAX) + i - 1);
> > +        char_buf[offset++] = '\0';
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-names", char_buf, offset);
> > +    if (ret) {
> > +        fprintf(stderr, "error adding 'ibm,drc-names' field for PHB FDT");
> > +    }
> > +
> > +    /* ibm,drc-types */
> > +    memset(char_buf, 0, sizeof(char_buf));
> > +    entries = (uint32_t *)&char_buf[0];
> > +    *entries = SPAPR_DRC_PHB_SLOT_MAX;
> > +    offset = sizeof(*entries);
> > +
> > +    for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +        offset += sprintf(char_buf + offset, "28");
> 
> 
> "28"? Is it for "PHB"?

For each actual PCI slot. C.6.1 in PAPR 2.7 defines a stringified 28 as:

  "A PCI Express Rev 2 slot with 8x lanes."

...there may be more appropriate values to use though, such as "1":

  "A 32-bit, 5 Volt conventional PCI slot which accommodates cards that operate up to 33 MHz Only."

since we don't emulate PCIe with spapr-host-bridge. I seem to recall having issues with some
of these other values though, will test again and see.
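
One way to keep the bare "28" from reading like a magic string is to name the
type codes. The sketch below covers only the two values quoted above, with
descriptions paraphrased from those quotes. The macro names and
drc_type_describe() are hypothetical, and this is not a complete list of the
PAPR slot types.

```c
#include <stddef.h>
#include <string.h>

/* Only the two type codes mentioned in this thread. */
#define DRC_TYPE_PCI_32BIT_5V_33MHZ "1"
#define DRC_TYPE_PCIE_REV2_X8       "28"

/* Returns a short description for a known type code, or NULL if the
 * code is not covered by this sketch. */
static const char *drc_type_describe(const char *type)
{
    if (strcmp(type, DRC_TYPE_PCIE_REV2_X8) == 0) {
        return "PCI Express Rev 2 slot with 8x lanes";
    }
    if (strcmp(type, DRC_TYPE_PCI_32BIT_5V_33MHZ) == 0) {
        return "32-bit, 5 Volt conventional PCI slot, up to 33 MHz";
    }
    return NULL;
}
```

With named codes, switching the slot type after re-testing would be a
one-line change at the sprintf() call that emits ibm,drc-types.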

> 
> 
> 
> 
> -- 
> Alexey


* Re: [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs
  2014-08-26 11:29   ` Alexander Graf
@ 2014-08-26 18:30     ` Michael Roth
  0 siblings, 0 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-26 18:30 UTC (permalink / raw)
  To: Alexander Graf, qemu-devel; +Cc: aik, ncmike, qemu-ppc, tyreld, nfont

Quoting Alexander Graf (2014-08-26 06:29:42)
> On 19.08.14 02:21, Michael Roth wrote:
> > Reserve 32 entries of type PCI in each PHB's initial FDT. This
> > advertises to guests that each PHB is a DR-capable device with
> > physical hotpluggable slots. This is necessary for allowing
> > hotplugging of devices to it later via bus rescan or guest rpaphp
> > hotplug module.
> > 
> > Each entry is assigned a name of "Slot <<bus_idx>*32 +1>",
> > advertised as a hotpluggable PCI slot, and assigned to power domain
> > -1 to indicate to the guest that power management is handled by the
> > hardware.
> > 
> > This models a DR-capable PCI expansion device attached to a host/lpar
> > via a single PHB with 32 physical hotpluggable slots (as opposed to a
> > virtual bridge device with external management console). Hotplug will
> > be handled by the guest via bus rescan or the rpaphp hotplug module.
> > 
> > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c              |   3 +-
> >  hw/ppc/spapr_pci.c          | 102 ++++++++++++++++++++++++++++++++++++++++++++
> >  include/hw/pci-host/spapr.h |   1 +
> >  3 files changed, 105 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index d5e46c3..90b25b3 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -890,7 +890,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >      QLIST_FOREACH(phb, &spapr->phbs, list) {
> >          drc_entry = spapr_phb_to_drc_entry(phb->buid);
> >          g_assert(drc_entry);
> > -        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> > +        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, drc_entry->drc_index,
> > +                                    fdt);
> >      }
> >  
> >      if (ret < 0) {
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index e85134f..924d488 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -851,8 +851,104 @@ static int spapr_phb_children_dt(Object *child, void *opaque)
> >      return 1;
> >  }
> >  
> > +static void spapr_create_drc_phb_dt_entries(void *fdt, int bus_off, int phb_index)
> > +{
> > +    char char_buf[1024];
> > +    uint32_t int_buf[SPAPR_DRC_PHB_SLOT_MAX + 1];
> > +    uint32_t *entries;
> > +    int i, ret, offset;
> > +
> > +    /* ibm,drc-indexes */
> > +    memset(int_buf, 0 , sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> > +
> > +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +        int_buf[i] = SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + ((i - 1) << 3);
> 
> Same endianness breakage.
> 
> Please verify that your patch set works with
> 
>   1) ppc64le host and KVM

Working on getting hold of such a thing (most of our installs are reserved
for BE host testing atm), but worst case, would TCG on top of an LE guest be
a reasonable sniff test?
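
For context on the breakage itself: device tree cells are big-endian, so
storing host-order uint32_t values into int_buf only works on a BE host. In
QEMU the usual fix would be to wrap each stored value in cpu_to_be32(). The
sketch below uses a hypothetical stand-in, store_be32(), so that it compiles
outside the QEMU tree. fill_drc_indexes() is also a made-up name, and the two
constants are local copies of SPAPR_DRC_PHB_SLOT_MAX and
SPAPR_DRC_DEV_ID_BASE from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define DRC_SLOT_MAX    32          /* mirrors SPAPR_DRC_PHB_SLOT_MAX */
#define DRC_DEV_ID_BASE 0x40000000u /* mirrors SPAPR_DRC_DEV_ID_BASE */

/* Hypothetical stand-in for QEMU's cpu_to_be32(): writes v as four
 * big-endian bytes, independent of host byte order. */
static void store_be32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)v;
}

/* Fills buf with the entry count followed by one DRC index per slot,
 * all big-endian. buf must hold (DRC_SLOT_MAX + 1) * 4 bytes.
 * Returns the number of bytes written. */
static size_t fill_drc_indexes(uint8_t *buf, uint32_t phb_index)
{
    uint32_t i;

    store_be32(buf, DRC_SLOT_MAX);
    for (i = 0; i < DRC_SLOT_MAX; i++) {
        store_be32(buf + 4 * (i + 1),
                   DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3));
    }
    return 4 * (DRC_SLOT_MAX + 1);
}
```

The resulting bytes are identical on BE and LE hosts, which is the property
both of the requested test configurations would exercise.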

>   2) x86_64 host and TCG
> 
> 
> Alex
> 
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-indexes", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr, "error adding 'ibm,drc-indexes' field for PHB FDT");
> > +    }
> > +
> > +    /* ibm,drc-power-domains */
> > +    memset(int_buf, 0, sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> > +
> > +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +        int_buf[i] = 0xffffffff;
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-power-domains", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr,
> > +                "error adding 'ibm,drc-power-domains' field for PHB FDT");
> > +    }
> > +
> > +    /* ibm,drc-names */
> > +    memset(char_buf, 0, sizeof(char_buf));
> > +    entries = (uint32_t *)&char_buf[0];
> > +    *entries = SPAPR_DRC_PHB_SLOT_MAX;
> > +    offset = sizeof(*entries);
> > +
> > +    for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +        offset += sprintf(char_buf + offset, "Slot %d",
> > +                          (phb_index * SPAPR_DRC_PHB_SLOT_MAX) + i - 1);
> > +        char_buf[offset++] = '\0';
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-names", char_buf, offset);
> > +    if (ret) {
> > +        fprintf(stderr, "error adding 'ibm,drc-names' field for PHB FDT");
> > +    }
> > +
> > +    /* ibm,drc-types */
> > +    memset(char_buf, 0, sizeof(char_buf));
> > +    entries = (uint32_t *)&char_buf[0];
> > +    *entries = SPAPR_DRC_PHB_SLOT_MAX;
> > +    offset = sizeof(*entries);
> > +
> > +    for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> > +        offset += sprintf(char_buf + offset, "28");
> > +        char_buf[offset++] = '\0';
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,drc-types", char_buf, offset);
> > +    if (ret) {
> > +        fprintf(stderr, "error adding 'ibm,drc-types' field for PHB FDT");
> > +    }
> > +
> > +    /* we want the initial indicator state to be 0 - "empty", when we
> > +     * hot-plug an adaptor in the slot, we need to set the indicator
> > +     * to 1 - "present."
> > +     */
> > +
> > +    /* ibm,indicator-9003 */
> > +    memset(int_buf, 0, sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,indicator-9003", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr, "error adding 'ibm,indicator-9003' field for PHB FDT");
> > +    }
> > +
> > +    /* ibm,sensor-9003 */
> > +    memset(int_buf, 0, sizeof(int_buf));
> > +    int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> > +
> > +    ret = fdt_setprop(fdt, bus_off, "ibm,sensor-9003", int_buf,
> > +                      sizeof(int_buf));
> > +    if (ret) {
> > +        fprintf(stderr, "error adding 'ibm,sensor-9003' field for PHB FDT");
> > +    }
> > +}
> > +
> >  int spapr_populate_pci_dt(sPAPRPHBState *phb,
> >                            uint32_t xics_phandle,
> > +                          uint32_t drc_index,
> >                            void *fdt)
> >  {
> >      int bus_off, i, j;
> > @@ -934,6 +1030,12 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb,
> >      object_child_foreach(OBJECT(phb), spapr_phb_children_dt,
> >                           &((sPAPRTCEDT){ .fdt = fdt, .node_off = bus_off }));
> >  
> > +    spapr_create_drc_phb_dt_entries(fdt, bus_off, phb->index);
> > +    if (drc_index) {
> > +        _FDT(fdt_setprop(fdt, bus_off, "ibm,my-drc-index", &drc_index,
> > +                         sizeof(drc_index)));
> > +    }
> > +
> >      return 0;
> >  }
> >  
> > diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
> > index 32f0aa7..8f0a42f 100644
> > --- a/include/hw/pci-host/spapr.h
> > +++ b/include/hw/pci-host/spapr.h
> > @@ -116,6 +116,7 @@ PCIHostState *spapr_create_phb(sPAPREnvironment *spapr, int index);
> >  
> >  int spapr_populate_pci_dt(sPAPRPHBState *phb,
> >                            uint32_t xics_phandle,
> > +                          uint32_t drc_index,
> >                            void *fdt);
> >  
> >  void spapr_pci_msi_init(sPAPREnvironment *spapr, hwaddr addr);
> >


* Re: [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions
  2014-08-26  9:14   ` Alexey Kardashevskiy
  2014-08-26 11:55     ` Peter Maydell
@ 2014-08-26 18:34     ` Michael Roth
  1 sibling, 0 replies; 69+ messages in thread
From: Michael Roth @ 2014-08-26 18:34 UTC (permalink / raw)
  To: Alexey Kardashevskiy, qemu-devel; +Cc: ncmike, nfont, qemu-ppc, agraf, tyreld

Quoting Alexey Kardashevskiy (2014-08-26 04:14:27)
> On 08/19/2014 10:21 AM, Michael Roth wrote:
> > Some kernels program a 0 address for io regions. PCI 3.0 spec
> > section 6.2.5.1 doesn't seem to disallow this.
> 
> 
> I remember there was discussion about it but I forgot :) Why does it have
> to be a part of this patchset? Worth mentioning in the commit log I believe.

Unfortunately, with ppc guests the first BAR allocation tends to be the
0-address case, so to me it seemed necessary for a testable series. I can
simply document this in the series and re-send separately though.

> 
> 
> > 
> > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > ---
> >  hw/pci/pci.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > index 351d320..9578749 100644
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -1035,7 +1035,7 @@ static pcibus_t pci_bar_address(PCIDevice *d,
> >          /* Check if 32 bit BAR wraps around explicitly.
> >           * TODO: make priorities correct and remove this work around.
> >           */
> > -        if (last_addr <= new_addr || new_addr == 0 || last_addr >= UINT32_MAX) {
> > +        if (last_addr <= new_addr || last_addr >= UINT32_MAX) {
> >              return PCI_BAR_UNMAPPED;
> >          }
> >          return new_addr;
> > 
> 
> 
> -- 
> Alexey


* Re: [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions Michael Roth
  2014-08-26  9:14   ` Alexey Kardashevskiy
  2014-08-26 11:41   ` Alexander Graf
@ 2014-08-27 13:47   ` Michael S. Tsirkin
  2014-08-28 21:21     ` Michael Roth
  2 siblings, 1 reply; 69+ messages in thread
From: Michael S. Tsirkin @ 2014-08-27 13:47 UTC (permalink / raw)
  To: Michael Roth; +Cc: aik, qemu-devel, agraf, ncmike, qemu-ppc, tyreld, nfont

On Mon, Aug 18, 2014 at 07:21:54PM -0500, Michael Roth wrote:
> Some kernels program a 0 address for io regions. PCI 3.0 spec
> section 6.2.5.1 doesn't seem to disallow this.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

Yes the PCI spec does not care.

But unfortunately as documented in the comment, at
least for PC (maybe others) priorities aren't
currently set up correctly, so programming PCI BAR at
address zero (during sizing) conflicts with
whatever else is there.

To make address 0 work, you'll have to fix up the priorities for a
bunch of machine types :(

> ---
>  hw/pci/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 351d320..9578749 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1035,7 +1035,7 @@ static pcibus_t pci_bar_address(PCIDevice *d,
>          /* Check if 32 bit BAR wraps around explicitly.
>           * TODO: make priorities correct and remove this work around.
>           */
> -        if (last_addr <= new_addr || new_addr == 0 || last_addr >= UINT32_MAX) {
> +        if (last_addr <= new_addr || last_addr >= UINT32_MAX) {
>              return PCI_BAR_UNMAPPED;
>          }
>          return new_addr;
> -- 
> 1.9.1
> 


* Re: [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions
  2014-08-27 13:47   ` Michael S. Tsirkin
@ 2014-08-28 21:21     ` Michael Roth
  2014-08-28 21:33       ` Peter Maydell
  0 siblings, 1 reply; 69+ messages in thread
From: Michael Roth @ 2014-08-28 21:21 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: aik, qemu-devel, agraf, ncmike, qemu-ppc, tyreld, nfont

Quoting Michael S. Tsirkin (2014-08-27 08:47:51)
> On Mon, Aug 18, 2014 at 07:21:54PM -0500, Michael Roth wrote:
> > Some kernels program a 0 address for io regions. PCI 3.0 spec
> > section 6.2.5.1 doesn't seem to disallow this.
> > 
> > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> 
> Yes the PCI spec does not care.
> 
> But unfortunately as documented in the comment, at
> least for PC (maybe others) priorities aren't
> currently set up correctly, so programming PCI BAR at
> address zero (during sizing) conflicts with
> whatever else is there.

I'm not sure I understand: that note was included as part of the following
fixup to 9f1a029abf15751e32a4b1df99ed2b8315f9072c:

-        if (last_addr <= new_addr || new_addr == 0) {
+        /* Check if BAR is being sized explicitly.
+         * TODO: make priorities correct and remove this work around.
+         */
+        if (last_addr <= new_addr || new_addr == 0 || last_addr >= UINT32_MAX)


which forces the BAR to PCI_BAR_UNMAPPED and unmaps the io region if the
address range extends beyond UINT32_MAX (which would happen during sizing
when the guest writes -1...and I guess maybe last_addr <= new_addr covered
the same case back when we used uint32_t for pcibus_t?) ...

But the (new_addr == 0) seems to be something unrelated..., it means the
guest actually attempted to program a 0 address, or...

since pci_update_mappings unconditionally updates all IO regions for a
device whenever a particular BAR is written to, it would prevent us from
temporarily mapping all the IO regions to 0 (until guest re-assigns them)
...

You mentioned in the past this could lead to dispatch tables getting
permanently corrupted, so maybe that's what the check was for?

But I guess there's still a separate issue, where there's a high likelihood that
a 0 address would conflict with some hard-wired IO address? Wouldn't this be a
guest bug though? Well, I guess it would be a QEMU bug if the above scenario
is a real one...but if we fix or verify that's not the case, would this be
an acceptable change?
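
To make the two cases concrete, here is a simplified standalone sketch of
the check. It is not the actual pci_bar_address() code: bar_address(),
BAR_UNMAPPED and the reject_zero parameter are made up for illustration,
with reject_zero standing for the (new_addr == 0) term the patch drops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pcibus_t;

#define BAR_UNMAPPED (~(pcibus_t)0)   /* stand-in for PCI_BAR_UNMAPPED */

/*
 * Simplified 32-bit BAR check. reject_zero == true models the current
 * behaviour; reject_zero == false models the check after the patch.
 */
static pcibus_t bar_address(pcibus_t new_addr, pcibus_t size,
                            bool reject_zero)
{
    pcibus_t last_addr;

    assert(size != 0);
    last_addr = new_addr + size - 1;

    if (last_addr <= new_addr || last_addr >= UINT32_MAX ||
        (reject_zero && new_addr == 0)) {
        return BAR_UNMAPPED;
    }
    return new_addr;
}
```

A sizing write of all-ones to a 0x100-byte BAR decodes to 0xffffff00, so
last_addr lands on 0xffffffff and the BAR is treated as unmapped in both
variants. Only the new_addr == 0 term changes what happens when the guest
really programs address 0.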

> 
> To make address 0 work, you'll have to fix up the priorities for a
> bunch of machine types :(
> 
> > ---
> >  hw/pci/pci.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > index 351d320..9578749 100644
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -1035,7 +1035,7 @@ static pcibus_t pci_bar_address(PCIDevice *d,
> >          /* Check if 32 bit BAR wraps around explicitly.
> >           * TODO: make priorities correct and remove this work around.
> >           */
> > -        if (last_addr <= new_addr || new_addr == 0 || last_addr >= UINT32_MAX) {
> > +        if (last_addr <= new_addr || last_addr >= UINT32_MAX) {
> >              return PCI_BAR_UNMAPPED;
> >          }
> >          return new_addr;
> > -- 
> > 1.9.1
> >


* Re: [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions
  2014-08-28 21:21     ` Michael Roth
@ 2014-08-28 21:33       ` Peter Maydell
  2014-08-28 21:46         ` Michael S. Tsirkin
  0 siblings, 1 reply; 69+ messages in thread
From: Peter Maydell @ 2014-08-28 21:33 UTC (permalink / raw)
  To: Michael Roth
  Cc: Michael S. Tsirkin, Alexey Kardashevskiy, Alexander Graf,
	QEMU Developers, Mike Day, qemu-ppc, tyreld, nfont

On 28 August 2014 22:21, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> But I guess there's still a separate issue, where there's a high likelihood that
> a 0 address would conflict with some hard-wired IO address? Wouldn't this be a
> guest bug though?

Even if it's a guest bug, we should act like the hardware does
if the guest does this. If that differs between PCI controllers
then we need a flag so the host controller model can select
the required behaviour. (The versatile PB PCI controller we
model does have "address 0 is valid", and we'd need to have
this working if we implemented DMA accesses properly.)

-- PMM


* Re: [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions
  2014-08-28 21:33       ` Peter Maydell
@ 2014-08-28 21:46         ` Michael S. Tsirkin
  0 siblings, 0 replies; 69+ messages in thread
From: Michael S. Tsirkin @ 2014-08-28 21:46 UTC (permalink / raw)
  To: Peter Maydell
  Cc: QEMU Developers, Alexey Kardashevskiy, Alexander Graf,
	Michael Roth, Mike Day, qemu-ppc, tyreld, nfont

On Thu, Aug 28, 2014 at 10:33:02PM +0100, Peter Maydell wrote:
> On 28 August 2014 22:21, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> > But I guess there's still a separate issue, where there's a high likelihood that
> > a 0 address would conflict with some hard-wired IO address? Wouldn't this be a
> > guest bug though?

Real hardware behaves in a specific way.
The problem is we don't always emulate it properly, and PCI
has some workarounds for that.

> Even if it's a guest bug, we should act like the hardware does
> if the guest does this. If that differs between PCI controllers
> then we need a flag so the host controller model can select
> the required behaviour. (The versatile PB PCI controller we
> model does have "address 0 is valid", and we'd need to have
> this working if we implemented DMA accesses properly.)
> 
> -- PMM

Exactly.
Actually, I forgot that I might have already fixed it for PC:
83d08f2673504a299194dcac1657a13754b5932a
    pc: map PCI address space as catchall region for not mapped addresses

But need to go back and re-check other systems.


-- 
MST


* Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-26 15:25       ` Michael Roth
  2014-08-26 15:41         ` Michael Roth
@ 2014-08-29 18:27         ` Tyrel Datwyler
  2014-08-29 23:15           ` Alexander Graf
  1 sibling, 1 reply; 69+ messages in thread
From: Tyrel Datwyler @ 2014-08-29 18:27 UTC (permalink / raw)
  To: Michael Roth, Alexey Kardashevskiy, qemu-devel
  Cc: ncmike, nfont, qemu-ppc, agraf

On 08/26/2014 08:25 AM, Michael Roth wrote:
> Quoting Alexey Kardashevskiy (2014-08-26 03:24:08)
>> On 08/26/2014 05:55 PM, Alexey Kardashevskiy wrote:
>>> On 08/19/2014 10:21 AM, Michael Roth wrote:
>>>> From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>>>
>>>> This add entries to the root OF node to advertise our PHBs as being
>>>> DR-capable in according with PAPR specification.
>>>>
>>>> Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
>>>> and associated with a power domain of -1 (indicating to guests that
>>>> power management is handled automatically by hardware).
>>>>
>>>> We currently allocate entries for up to 32 DR-capable PHBs, though
>>>> this limit can be increased later.
>>>>
>>>> DrcEntry objects to track the state of the DR-connector associated
>>>> with each PHB are stored in a 32-entry array, and each DrcEntry has
>>>> in turn have a dynamically-sized number of child DR-connectors,
>>>> which we will use later to track the state of DR-connectors
>>>> associated with a PHB's physical slots.
>>>>
>>>> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>>>> ---
>>>>  hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  hw/ppc/spapr_pci.c     |   1 +
>>>>  include/hw/ppc/spapr.h |  35 ++++++++++++
>>>>  3 files changed, 179 insertions(+)
>>>>
>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>> index 5c92707..d5e46c3 100644
>>>> --- a/hw/ppc/spapr.c
>>>> +++ b/hw/ppc/spapr.c
>>>> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
>>>>      return ram_size;
>>>>  }
>>>>  
>>>> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
>>>> +{
>>>> +    int i;
>>>> +
>>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>>> +        if (spapr->drc_table[i].phb_buid == buid) {
>>>> +            return &spapr->drc_table[i];
>>>> +        }
>>>> +     }
>>>> +
>>>> +     return NULL;
>>>> +}
>>>> +
>>>> +static void spapr_init_drc_table(void)
>>>> +{
>>>> +    int i;
>>>> +
>>>> +    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
>>>> +
>>>> +    /* For now we only care about PHB entries */
>>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>>> +        spapr->drc_table[i].drc_index = 0x2000001 + i;
>>>> +    }
>>>> +}
>>>> +
>>>> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
>>>> +{
>>>> +    sPAPRDrcEntry *empty_drc = NULL;
>>>> +    sPAPRDrcEntry *found_drc = NULL;
>>>> +    int i, phb_index;
>>>> +
>>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>>> +        if (spapr->drc_table[i].phb_buid == 0) {
>>>> +            empty_drc = &spapr->drc_table[i];
>>>> +        }
>>>> +
>>>> +        if (spapr->drc_table[i].phb_buid == buid) {
>>>> +            found_drc = &spapr->drc_table[i];
>>>
>>> return &spapr->drc_table[i];
>>> ?
>>>
>>>
>>>> +            break;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    if (found_drc) {
>>>> +        return found_drc;
>>>> +    }
>>>
>>>    if (!empty_drc) {
>>>         return NULL;
>>>    }
>>>
>>> ?
>>>
>>>
>>>> +
>>>> +    if (empty_drc) {
>>>
>>> and no need in this :)
>>>
>>>
>>>> +        empty_drc->phb_buid = buid;
>>>> +        empty_drc->state = state;
>>>> +        empty_drc->cc_state.fdt = NULL;
>>>> +        empty_drc->cc_state.offset = 0;
>>>> +        empty_drc->cc_state.depth = 0;
>>>> +        empty_drc->cc_state.state = CC_STATE_IDLE;
>>>> +        empty_drc->child_entries =
>>>> +            g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
>>>> +        phb_index = buid - SPAPR_PCI_BASE_BUID;
>>>> +        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
>>>> +            empty_drc->child_entries[i].drc_index =
>>>> +                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
>>>> +        }
>>>> +        return empty_drc;
>>>> +    }
>>>> +
>>>> +    return NULL;
>>>> +}
>>>> +
>>>> +static void spapr_create_drc_dt_entries(void *fdt)
>>>> +{
>>>> +    char char_buf[1024];
>>>> +    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
>>>> +    uint32_t *entries;
>>>> +    int offset, fdt_offset;
>>>> +    int i, ret;
>>>> +
>>>> +    fdt_offset = fdt_path_offset(fdt, "/");
>>>> +
>>>> +    /* ibm,drc-indexes */
>>>> +    memset(int_buf, 0, sizeof(int_buf));
>>>> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
>>>> +
>>>> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
>>>> +        int_buf[i] = spapr->drc_table[i-1].drc_index;
>>>> +    }
>>>> +
>>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
>>>> +                      sizeof(int_buf));
>>>> +    if (ret) {
>>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
>>>
>>> return here and below in the same error cases?
>>>
>>>> +    }
>>>> +
>>>> +    /* ibm,drc-power-domains */
>>>> +    memset(int_buf, 0, sizeof(int_buf));
>>>> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
>>>> +
>>>> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
>>>> +        int_buf[i] = 0xffffffff;
>>>> +    }
>>>> +
>>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
>>>> +                      sizeof(int_buf));
>>>> +    if (ret) {
>>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains property\n");
>>>> +    }
>>>> +
>>>> +    /* ibm,drc-names */
>>>> +    memset(char_buf, 0, sizeof(char_buf));
>>>> +    entries = (uint32_t *)&char_buf[0];
>>>> +    *entries = SPAPR_DRC_TABLE_SIZE;
>>>> +    offset = sizeof(*entries);
>>>> +
>>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>>> +        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
>>>> +        char_buf[offset++] = '\0';
>>>> +    }
>>>> +
>>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
>>>> +    if (ret) {
>>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
>>>> +    }
>>>> +
>>>> +    /* ibm,drc-types */
>>>> +    memset(char_buf, 0, sizeof(char_buf));
>>>> +    entries = (uint32_t *)&char_buf[0];
>>>> +    *entries = SPAPR_DRC_TABLE_SIZE;
>>>> +    offset = sizeof(*entries);
>>>> +
>>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>>> +        offset += sprintf(char_buf + offset, "PHB");
>>>> +        char_buf[offset++] = '\0';
>>>> +    }
>>>> +
>>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, offset);
>>>> +    if (ret) {
>>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
>>>> +    }
>>>> +}
>>>> +
>>>>  #define _FDT(exp) \
>>>>      do { \
>>>>          int ret = (exp);                                           \
>>>> @@ -731,6 +868,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>>>      char *bootlist;
>>>>      void *fdt;
>>>>      sPAPRPHBState *phb;
>>>> +    sPAPRDrcEntry *drc_entry;
>>>>  
>>>>      fdt = g_malloc(FDT_MAX_SIZE);
>>>>  
>>>> @@ -750,6 +888,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>>>      }
>>>>  
>>>>      QLIST_FOREACH(phb, &spapr->phbs, list) {
>>>> +        drc_entry = spapr_phb_to_drc_entry(phb->buid);
>>>> +        g_assert(drc_entry);
>>>>          ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
>>>>      }
>>>>  
>>>> @@ -789,6 +929,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>>>          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
>>>>      }
>>>>  
>>>> +    spapr_create_drc_dt_entries(fdt);
>>>> +
>>>>      _FDT((fdt_pack(fdt)));
>>>>  
>>>>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
>>>> @@ -1443,6 +1585,7 @@ static void ppc_spapr_init(MachineState *machine)
>>>>      spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW);
>>>>      spapr_pci_rtas_init();
>>>>  
>>>> +    spapr_init_drc_table();
>>>>      phb = spapr_create_phb(spapr, 0);
>>>>  
>>>>      for (i = 0; i < nb_nics; i++) {
>>>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>>>> index 9ed39a9..e85134f 100644
>>>> --- a/hw/ppc/spapr_pci.c
>>>> +++ b/hw/ppc/spapr_pci.c
>>>> @@ -531,6 +531,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>>>              + sphb->index * SPAPR_PCI_WINDOW_SPACING;
>>>>          sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
>>>>          sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
>>>> +        spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
>>>
>>>
>>> What exactly does "unusable" mean here? Macro?
>>>
>>>
>>>
>>>>      }
>>>>  
>>>>      if (sphb->buid == -1) {
>>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>>>> index 36e8e51..c93794b 100644
>>>> --- a/include/hw/ppc/spapr.h
>>>> +++ b/include/hw/ppc/spapr.h
>>>> @@ -10,6 +10,36 @@ struct sPAPRNVRAM;
>>>>  
>>>>  #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
>>>>  
>>>> +/* For dlparable/hotpluggable slots */
>>>> +#define SPAPR_DRC_TABLE_SIZE    32
>>>> +#define SPAPR_DRC_PHB_SLOT_MAX  32
>>>> +#define SPAPR_DRC_DEV_ID_BASE   0x40000000
>>>
>>>
>>> Is this SPAPR_DRC_DEV_ID_BASE really necessary (can it be zero)? Is that
>>> global id or per PCI bus or per PHB?
>>
>>
>> Ah. Got it. If it was like below, I would not even ask :)
>>
>> #define SPAPR_DRC_DEV_ID(phb_index, slot) \
>>         (0x40000000 | ((phb_index)<<8) | ((slot)<<3))
>>
>> Still not clear why you need this 0x40000000 for. Is it kind of "PHB" DRC type?
> 
> Yes, it's somewhat ad-hoc...the only requirement I see in PAPR is that this value be
> globally unique across all DR resources. CPUs and memory and such might have different
> ways to compute their DRC indices (so a slot-based macro would need to be specific
> to PCI DR entries). I'm not sure where the 0x40000000 originated honestly. I'm not
> sure it matters for QEMU, since we hold a monopoly on all DRC index assignments and
> don't have to deal with hard-coded firmware values.
> 
> I will say that a base somewhat less common than 0 may prove useful from a debugging
> standpoint, all other things being equal.
> 
> So not sure what best to do here. If we choose to leave it as is, I could at least
> make sure to add a comment about this.

It is true that PAPR only requires that these values be unique, and I'm
not currently aware of guest tools that make assumptions about the DRC
values for different DRC connectors. However, seeing as we are emulating
a pseries guest I picked base DRC values that matched those used by PHYP.

CPU	0x10000000
VIO	0x30000000
LMB	0x80000000
PHB	0x20000000
PCI	0x40000000
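
Combining those bases with the macro Alexey suggested, a rough sketch
could look like the following. The DRC_BASE_* names, PHB_SLOT_MAX,
drc_pci_slot_index() and drc_type_base() are hypothetical and not existing
QEMU identifiers. Treating the top nibble as a type tag is only an
observation about the values listed above, not something PAPR requires.

```c
#include <assert.h>
#include <stdint.h>

/* Per-type DRC index bases, as listed above. */
#define DRC_BASE_CPU  0x10000000u
#define DRC_BASE_PHB  0x20000000u
#define DRC_BASE_VIO  0x30000000u
#define DRC_BASE_PCI  0x40000000u
#define DRC_BASE_LMB  0x80000000u

#define PHB_SLOT_MAX  32           /* mirrors SPAPR_DRC_PHB_SLOT_MAX */

/* PCI slot DRC index: base | PHB index from bit 8 up | slot in bits 3..7 */
static uint32_t drc_pci_slot_index(uint32_t phb_index, uint32_t slot)
{
    assert(slot < PHB_SLOT_MAX);
    return DRC_BASE_PCI | (phb_index << 8) | (slot << 3);
}

/* Recover the type base from an index built with the bases above. */
static uint32_t drc_type_base(uint32_t drc_index)
{
    return drc_index & 0xF0000000u;
}
```

With distinct bases per connector type, an index seen in a guest trace
can be classified at a glance, which is the debugging benefit mentioned
earlier in the thread.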

-Tyrel

> 
>>
>>
>>>
>>>> +
>>>> +typedef struct sPAPRConfigureConnectorState {
>>>> +    void *fdt;
>>>> +    int offset_start;
>>>> +    int offset;
>>>> +    int depth;
>>>> +    PCIDevice *dev;
>>>> +    enum {
>>>> +        CC_STATE_IDLE = 0,
>>>> +        CC_STATE_PENDING = 1,
>>>> +        CC_STATE_ACTIVE,
>>>> +    } state;
>>>> +} sPAPRConfigureConnectorState;
>>>> +
>>>> +typedef struct sPAPRDrcEntry sPAPRDrcEntry;
>>>> +
>>>> +struct sPAPRDrcEntry {
>>>> +    uint32_t drc_index;
>>>> +    uint64_t phb_buid;
>>>> +    void *fdt;
>>>> +    int fdt_offset;
>>>> +    uint32_t state;
>>>> +    sPAPRConfigureConnectorState cc_state;
>>>> +    sPAPRDrcEntry *child_entries;
>>>> +};
>>>> +
>>>>  typedef struct sPAPREnvironment {
>>>>      struct VIOsPAPRBus *vio_bus;
>>>>      QLIST_HEAD(, sPAPRPHBState) phbs;
>>>> @@ -39,6 +69,9 @@ typedef struct sPAPREnvironment {
>>>>      int htab_save_index;
>>>>      bool htab_first_pass;
>>>>      int htab_fd;
>>>> +
>>>> +    /* state for Dynamic Reconfiguration Connectors */
>>>> +    sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
>>>>  } sPAPREnvironment;
>>>>  
>>>>  #define H_SUCCESS         0
>>>> @@ -481,5 +514,7 @@ int spapr_dma_dt(void *fdt, int node_off, const char *propname,
>>>>                   uint32_t liobn, uint64_t window, uint32_t size);
>>>>  int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
>>>>                        sPAPRTCETable *tcet);
>>>> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
>>>> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
>>>>  
>>>>  #endif /* !defined (__HW_SPAPR_H__) */
>>>>
>>>
>>>
>>
>>
>> -- 
>> Alexey


* Re: [Qemu-devel] [PATCH 11/12] spapr_events: event-scan RTAS interface
  2014-08-26  9:30   ` Alexey Kardashevskiy
@ 2014-08-29 18:43     ` Tyrel Datwyler
  0 siblings, 0 replies; 69+ messages in thread
From: Tyrel Datwyler @ 2014-08-29 18:43 UTC (permalink / raw)
  To: Alexey Kardashevskiy, Michael Roth, qemu-devel
  Cc: ncmike, nfont, qemu-ppc, agraf

On 08/26/2014 02:30 AM, Alexey Kardashevskiy wrote:
> On 08/19/2014 10:21 AM, Michael Roth wrote:
>> From: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
>>
>> We don't actually rely on this interface to surface hotplug events, and
>> instead rely on the similar-but-interrupt-driven check-exception RTAS
>> interface used for EPOW events. However, the existence of this interface
>> is needed to ensure guest kernels initialize the event-reporting
>> interfaces which will in turn be used by userspace tools to handle these
>> events, so we implement this interface as a stub.
>>
>> Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>> ---
>>  hw/ppc/spapr.c         | 1 +
>>  hw/ppc/spapr_events.c  | 9 +++++++++
>>  include/hw/ppc/spapr.h | 2 ++
>>  3 files changed, 12 insertions(+)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 825fd31..c65b13a 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -702,6 +702,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
>>          refpoints, sizeof(refpoints))));
>>  
>>      _FDT((fdt_property_cell(fdt, "rtas-error-log-max", RTAS_ERROR_LOG_MAX)));
>> +    _FDT((fdt_property_cell(fdt, "rtas-event-scan-rate", RTAS_EVENT_SCAN_RATE)));
>>  
>>      /*
>>       * According to PAPR, rtas ibm,os-term, does not gaurantee a return
>> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
>> index c0be0e5..bb80080 100644
>> --- a/hw/ppc/spapr_events.c
>> +++ b/hw/ppc/spapr_events.c
>> @@ -449,6 +449,14 @@ static void check_exception(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>>      }
>>  }
>>  
>> +static void event_scan(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>> +                            uint32_t token, uint32_t nargs,
>> +                            target_ulong args,
>> +                            uint32_t nret, target_ulong rets)
>> +{
>> +    rtas_st(rets, 0, 1); /* no error events found */
> 
> 
> rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> 
> 
> 
>> +}
>> +
>>  void spapr_events_init(sPAPREnvironment *spapr)
>>  {
>>      spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false);
>> @@ -456,4 +464,5 @@ void spapr_events_init(sPAPREnvironment *spapr)
>>      qemu_register_powerdown_notifier(&spapr->epow_notifier);
>>      spapr_rtas_register(RTAS_CHECK_EXCEPTION, "check-exception",
>>                          check_exception);
>> +    spapr_rtas_register(RTAS_EVENT_SCAN, "event-scan", event_scan);
>>  }
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index 5382bf1..aab627f 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -484,6 +484,8 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
>>  
>>  #define RTAS_ERROR_LOG_MAX      2048
>>  
>> +#define RTAS_EVENT_SCAN_RATE    1
> 
> 1 second? 1ms? 1 minute? :) Worth mentioning in the commit log.

As per PAPR 7.3.3.1 the rate is per minute.
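
So with RTAS_EVENT_SCAN_RATE set to 1, the property asks for one
event-scan call per minute. As an illustration of the arithmetic only
(scan_interval_ms() is a hypothetical helper, and a real guest may well
schedule its polling differently):

```c
#include <assert.h>
#include <stdint.h>

/* Convert a calls-per-minute scan rate into a polling interval in ms. */
static uint32_t scan_interval_ms(uint32_t rate_per_minute)
{
    assert(rate_per_minute > 0);
    return 60000u / rate_per_minute;
}
```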

-Tyrel

> 
> 
>> +
>>  typedef struct sPAPRTCETable sPAPRTCETable;
>>  
>>  #define TYPE_SPAPR_TCE_TABLE "spapr-tce-table"
>>
> 
> 


* Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-29 18:27         ` Tyrel Datwyler
@ 2014-08-29 23:15           ` Alexander Graf
  0 siblings, 0 replies; 69+ messages in thread
From: Alexander Graf @ 2014-08-29 23:15 UTC (permalink / raw)
  To: Tyrel Datwyler, Michael Roth, Alexey Kardashevskiy, qemu-devel
  Cc: ncmike, nfont, qemu-ppc



On 29.08.14 20:27, Tyrel Datwyler wrote:
> On 08/26/2014 08:25 AM, Michael Roth wrote:
>> Quoting Alexey Kardashevskiy (2014-08-26 03:24:08)
>>> On 08/26/2014 05:55 PM, Alexey Kardashevskiy wrote:
>>>> On 08/19/2014 10:21 AM, Michael Roth wrote:
>>>>> From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>>>>
>>>>> This add entries to the root OF node to advertise our PHBs as being
>>>>> DR-capable in according with PAPR specification.
>>>>>
>>>>> Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
>>>>> and associated with a power domain of -1 (indicating to guests that
>>>>> power management is handled automatically by hardware).
>>>>>
>>>>> We currently allocate entries for up to 32 DR-capable PHBs, though
>>>>> this limit can be increased later.
>>>>>
>>>>> DrcEntry objects to track the state of the DR-connector associated
>>>>> with each PHB are stored in a 32-entry array, and each DrcEntry has
>>>>> in turn have a dynamically-sized number of child DR-connectors,
>>>>> which we will use later to track the state of DR-connectors
>>>>> associated with a PHB's physical slots.
>>>>>
>>>>> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>>>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>>>>> ---
>>>>>  hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>  hw/ppc/spapr_pci.c     |   1 +
>>>>>  include/hw/ppc/spapr.h |  35 ++++++++++++
>>>>>  3 files changed, 179 insertions(+)
>>>>>
>>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>>> index 5c92707..d5e46c3 100644
>>>>> --- a/hw/ppc/spapr.c
>>>>> +++ b/hw/ppc/spapr.c
>>>>> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
>>>>>      return ram_size;
>>>>>  }
>>>>>  
>>>>> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
>>>>> +{
>>>>> +    int i;
>>>>> +
>>>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>>>> +        if (spapr->drc_table[i].phb_buid == buid) {
>>>>> +            return &spapr->drc_table[i];
>>>>> +        }
>>>>> +     }
>>>>> +
>>>>> +     return NULL;
>>>>> +}
>>>>> +
>>>>> +static void spapr_init_drc_table(void)
>>>>> +{
>>>>> +    int i;
>>>>> +
>>>>> +    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
>>>>> +
>>>>> +    /* For now we only care about PHB entries */
>>>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>>>> +        spapr->drc_table[i].drc_index = 0x2000001 + i;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
>>>>> +{
>>>>> +    sPAPRDrcEntry *empty_drc = NULL;
>>>>> +    sPAPRDrcEntry *found_drc = NULL;
>>>>> +    int i, phb_index;
>>>>> +
>>>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>>>> +        if (spapr->drc_table[i].phb_buid == 0) {
>>>>> +            empty_drc = &spapr->drc_table[i];
>>>>> +        }
>>>>> +
>>>>> +        if (spapr->drc_table[i].phb_buid == buid) {
>>>>> +            found_drc = &spapr->drc_table[i];
>>>>
>>>> return &spapr->drc_table[i];
>>>> ?
>>>>
>>>>
>>>>> +            break;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    if (found_drc) {
>>>>> +        return found_drc;
>>>>> +    }
>>>>
>>>>    if (!empty_drc) {
>>>>         return NULL;
>>>>    }
>>>>
>>>> ?
>>>>
>>>>
>>>>> +
>>>>> +    if (empty_drc) {
>>>>
>>>> and no need in this :)
>>>>
>>>>
>>>>> +        empty_drc->phb_buid = buid;
>>>>> +        empty_drc->state = state;
>>>>> +        empty_drc->cc_state.fdt = NULL;
>>>>> +        empty_drc->cc_state.offset = 0;
>>>>> +        empty_drc->cc_state.depth = 0;
>>>>> +        empty_drc->cc_state.state = CC_STATE_IDLE;
>>>>> +        empty_drc->child_entries =
>>>>> +            g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
>>>>> +        phb_index = buid - SPAPR_PCI_BASE_BUID;
>>>>> +        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
>>>>> +            empty_drc->child_entries[i].drc_index =
>>>>> +                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
>>>>> +        }
>>>>> +        return empty_drc;
>>>>> +    }
>>>>> +
>>>>> +    return NULL;
>>>>> +}
>>>>> +
>>>>> +static void spapr_create_drc_dt_entries(void *fdt)
>>>>> +{
>>>>> +    char char_buf[1024];
>>>>> +    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
>>>>> +    uint32_t *entries;
>>>>> +    int offset, fdt_offset;
>>>>> +    int i, ret;
>>>>> +
>>>>> +    fdt_offset = fdt_path_offset(fdt, "/");
>>>>> +
>>>>> +    /* ibm,drc-indexes */
>>>>> +    memset(int_buf, 0, sizeof(int_buf));
>>>>> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
>>>>> +
>>>>> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
>>>>> +        int_buf[i] = spapr->drc_table[i-1].drc_index;
>>>>> +    }
>>>>> +
>>>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
>>>>> +                      sizeof(int_buf));
>>>>> +    if (ret) {
>>>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
>>>>
>>>> return here and below in the same error cases?
>>>>
>>>>> +    }
>>>>> +
>>>>> +    /* ibm,drc-power-domains */
>>>>> +    memset(int_buf, 0, sizeof(int_buf));
>>>>> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
>>>>> +
>>>>> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
>>>>> +        int_buf[i] = 0xffffffff;
>>>>> +    }
>>>>> +
>>>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
>>>>> +                      sizeof(int_buf));
>>>>> +    if (ret) {
>>>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains property\n");
>>>>> +    }
>>>>> +
>>>>> +    /* ibm,drc-names */
>>>>> +    memset(char_buf, 0, sizeof(char_buf));
>>>>> +    entries = (uint32_t *)&char_buf[0];
>>>>> +    *entries = SPAPR_DRC_TABLE_SIZE;
>>>>> +    offset = sizeof(*entries);
>>>>> +
>>>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>>>> +        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
>>>>> +        char_buf[offset++] = '\0';
>>>>> +    }
>>>>> +
>>>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
>>>>> +    if (ret) {
>>>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
>>>>> +    }
>>>>> +
>>>>> +    /* ibm,drc-types */
>>>>> +    memset(char_buf, 0, sizeof(char_buf));
>>>>> +    entries = (uint32_t *)&char_buf[0];
>>>>> +    *entries = SPAPR_DRC_TABLE_SIZE;
>>>>> +    offset = sizeof(*entries);
>>>>> +
>>>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>>>>> +        offset += sprintf(char_buf + offset, "PHB");
>>>>> +        char_buf[offset++] = '\0';
>>>>> +    }
>>>>> +
>>>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, offset);
>>>>> +    if (ret) {
>>>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
>>>>> +    }
>>>>> +}
>>>>> +
>>>>>  #define _FDT(exp) \
>>>>>      do { \
>>>>>          int ret = (exp);                                           \
>>>>> @@ -731,6 +868,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>>>>      char *bootlist;
>>>>>      void *fdt;
>>>>>      sPAPRPHBState *phb;
>>>>> +    sPAPRDrcEntry *drc_entry;
>>>>>  
>>>>>      fdt = g_malloc(FDT_MAX_SIZE);
>>>>>  
>>>>> @@ -750,6 +888,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>>>>      }
>>>>>  
>>>>>      QLIST_FOREACH(phb, &spapr->phbs, list) {
>>>>> +        drc_entry = spapr_phb_to_drc_entry(phb->buid);
>>>>> +        g_assert(drc_entry);
>>>>>          ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
>>>>>      }
>>>>>  
>>>>> @@ -789,6 +929,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>>>>          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
>>>>>      }
>>>>>  
>>>>> +    spapr_create_drc_dt_entries(fdt);
>>>>> +
>>>>>      _FDT((fdt_pack(fdt)));
>>>>>  
>>>>>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
>>>>> @@ -1443,6 +1585,7 @@ static void ppc_spapr_init(MachineState *machine)
>>>>>      spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW);
>>>>>      spapr_pci_rtas_init();
>>>>>  
>>>>> +    spapr_init_drc_table();
>>>>>      phb = spapr_create_phb(spapr, 0);
>>>>>  
>>>>>      for (i = 0; i < nb_nics; i++) {
>>>>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>>>>> index 9ed39a9..e85134f 100644
>>>>> --- a/hw/ppc/spapr_pci.c
>>>>> +++ b/hw/ppc/spapr_pci.c
>>>>> @@ -531,6 +531,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>>>>              + sphb->index * SPAPR_PCI_WINDOW_SPACING;
>>>>>          sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
>>>>>          sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
>>>>> +        spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
>>>>
>>>>
>>>> What exactly does "unusable" mean here? Macro?
>>>>
>>>>
>>>>
>>>>>      }
>>>>>  
>>>>>      if (sphb->buid == -1) {
>>>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>>>>> index 36e8e51..c93794b 100644
>>>>> --- a/include/hw/ppc/spapr.h
>>>>> +++ b/include/hw/ppc/spapr.h
>>>>> @@ -10,6 +10,36 @@ struct sPAPRNVRAM;
>>>>>  
>>>>>  #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
>>>>>  
>>>>> +/* For dlparable/hotpluggable slots */
>>>>> +#define SPAPR_DRC_TABLE_SIZE    32
>>>>> +#define SPAPR_DRC_PHB_SLOT_MAX  32
>>>>> +#define SPAPR_DRC_DEV_ID_BASE   0x40000000
>>>>
>>>>
>>>> Is this SPAPR_DRC_DEV_ID_BASE really necessary (can it be zero)? Is that
>>>> global id or per PCI bus or per PHB?
>>>
>>>
>>> Ah. Got it. If it was like below, I would not even ask :)
>>>
>>> #define SPAPR_DRC_DEV_ID(phb_index, slot) \
>>>         (0x40000000 | ((phb_index)<<8) | ((slot)<<3))
>>>
>>> Still not clear why you need this 0x40000000 for. Is it kind of "PHB" DRC type?
>>
>> Yes, it's somewhat ad-hoc...the only requirement I see in PAPR is that this value be
>> globally unique across all DR resources. CPUs and memory and such might have different
>> ways to compute their DRC indices (so a slot-based macro would need to be specific
>> to PCI DR entries). I'm not sure where the 0x40000000 originated honestly. I'm not
>> sure it matters for QEMU, since we hold a monopoly on all DRC index assignments and
>> don't have to deal with hard-coded firmware values.
>>
>> I will say that a base somewhat less common than 0 may prove useful from a debugging
>> standpoint, all other things being equal.
>>
>> So not sure what best to do here. If we choose to leave it as is, I could at least
>> make sure to add a comment about this.
> 
> It is true that PAPR only requires that these values be unique, and I'm
> not currently aware of guest tools that make assumptions about the DRC
> values for different DRC connectors. However, seeing as we are emulating
> a pseries guest I picked base DRC values that matched those used by PHYP.
> 
> CPU	0x10000000
> VIO	0x30000000
> LMB	0x80000000
> PHB	0x20000000
> PCI	0x40000000

Maybe we should keep these offsets (and boundary checks?) inside a
single spot, so that people can easily spot what the number space is
divided into.
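
A minimal sketch of what such a single spot could look like, using the
base values listed above. The type names, the helper names, and the
assumption that each type owns the range from its base up to the next
base are hypothetical and not part of the posted patch:

```c
#include <assert.h>
#include <stdint.h>

/* DRC index bases, as listed above (matching the values used by PHYP). */
#define DRC_BASE_CPU 0x10000000u
#define DRC_BASE_PHB 0x20000000u
#define DRC_BASE_VIO 0x30000000u
#define DRC_BASE_PCI 0x40000000u
#define DRC_BASE_LMB 0x80000000u

/* Hypothetical limit, mirroring SPAPR_DRC_PHB_SLOT_MAX from the patch. */
#define DRC_PCI_SLOTS_PER_PHB 32u

typedef enum {
    DRC_TYPE_UNKNOWN,
    DRC_TYPE_CPU,
    DRC_TYPE_PHB,
    DRC_TYPE_VIO,
    DRC_TYPE_PCI,
    DRC_TYPE_LMB,
} DrcType;

/* Assumption: a type's range runs from its base up to the next base. */
static inline DrcType drc_index_type(uint32_t index)
{
    if (index >= DRC_BASE_LMB) {
        return DRC_TYPE_LMB;
    }
    if (index >= DRC_BASE_PCI) {
        return DRC_TYPE_PCI;
    }
    if (index >= DRC_BASE_VIO) {
        return DRC_TYPE_VIO;
    }
    if (index >= DRC_BASE_PHB) {
        return DRC_TYPE_PHB;
    }
    if (index >= DRC_BASE_CPU) {
        return DRC_TYPE_CPU;
    }
    return DRC_TYPE_UNKNOWN;
}

/* Same layout as the macro proposed earlier in the thread, with the
 * boundary checks kept next to the number-space definition. */
static inline uint32_t drc_pci_slot_index(uint32_t phb_index, uint32_t slot)
{
    assert(slot < DRC_PCI_SLOTS_PER_PHB);
    /* keep (phb_index << 8) below the start of the LMB range */
    assert(phb_index < (1u << 22));
    return DRC_BASE_PCI | (phb_index << 8) | (slot << 3);
}
```

With this layout, drc_pci_slot_index(1, 2) yields 0x40000110, and any
index at or above 0x80000000 is classified as an LMB.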


Alex


* Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node Michael Roth
  2014-08-26  7:55   ` Alexey Kardashevskiy
  2014-08-26 11:11   ` [Qemu-devel] " Alexander Graf
@ 2014-09-03  5:55   ` Bharata B Rao
  2014-09-05 22:00   ` Tyrel Datwyler
  3 siblings, 0 replies; 69+ messages in thread
From: Bharata B Rao @ 2014-09-03  5:55 UTC (permalink / raw)
  To: Michael Roth
  Cc: aik, qemu-devel, Alexander Graf, ncmike, qemu-ppc, tyreld, nfont

On Tue, Aug 19, 2014 at 5:51 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>
> This adds entries to the root OF node to advertise our PHBs as being
> DR-capable in accordance with the PAPR specification.
>
> Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
> and associated with a power domain of -1 (indicating to guests that
> power management is handled automatically by hardware).
>
> We currently allocate entries for up to 32 DR-capable PHBs, though
> this limit can be increased later.
>
> DrcEntry objects to track the state of the DR-connector associated
> with each PHB are stored in a 32-entry array, and each DrcEntry in
> turn has a dynamically-sized number of child DR-connectors,
> which we will use later to track the state of DR-connectors
> associated with a PHB's physical slots.
>
> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_pci.c     |   1 +
>  include/hw/ppc/spapr.h |  35 ++++++++++++
>  3 files changed, 179 insertions(+)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 5c92707..d5e46c3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
>      return ram_size;
>  }
>
> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
> +{
> +    int i;
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        if (spapr->drc_table[i].phb_buid == buid) {
> +            return &spapr->drc_table[i];
> +        }
> +     }
> +
> +     return NULL;
> +}
> +
> +static void spapr_init_drc_table(void)
> +{
> +    int i;
> +
> +    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
> +
> +    /* For now we only care about PHB entries */
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        spapr->drc_table[i].drc_index = 0x2000001 + i;
> +    }
> +}
> +
> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
> +{
> +    sPAPRDrcEntry *empty_drc = NULL;
> +    sPAPRDrcEntry *found_drc = NULL;
> +    int i, phb_index;
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        if (spapr->drc_table[i].phb_buid == 0) {
> +            empty_drc = &spapr->drc_table[i];
> +        }
> +
> +        if (spapr->drc_table[i].phb_buid == buid) {
> +            found_drc = &spapr->drc_table[i];
> +            break;
> +        }
> +    }
> +
> +    if (found_drc) {
> +        return found_drc;
> +    }
> +
> +    if (empty_drc) {
> +        empty_drc->phb_buid = buid;
> +        empty_drc->state = state;

Shouldn't this be

empty_drc->state = state << INDICATOR_ENTITY_SENSE_SHIFT ?
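
To illustrate the concern, here is a small sketch. ENCODE_DRC_STATE is
copied from patch 09/12; the DECODE_DRC_STATE definition and the
EXAMPLE_* mask/shift values are assumptions made up for this example
and may differ from the real INDICATOR_ENTITY_SENSE_* constants:

```c
#include <stdint.h>

/* As defined in patch 09/12 of this series. */
#define ENCODE_DRC_STATE(val, m, s) \
    (((uint32_t)(val) << (s)) & (uint32_t)(m))

/* Assumed inverse of the above; the series' definition is not shown here. */
#define DECODE_DRC_STATE(val, m, s) \
    (((uint32_t)(val) & (uint32_t)(m)) >> (s))

/* Hypothetical field layout, for illustration only. */
#define EXAMPLE_ENTITY_SENSE_MASK  0x0000ff00u
#define EXAMPLE_ENTITY_SENSE_SHIFT 8

/* Replace only the entity-sense field, leaving other fields untouched. */
static inline uint32_t example_set_entity_sense(uint32_t state, uint32_t sense)
{
    state &= ~EXAMPLE_ENTITY_SENSE_MASK;
    return state | ENCODE_DRC_STATE(sense, EXAMPLE_ENTITY_SENSE_MASK,
                                    EXAMPLE_ENTITY_SENSE_SHIFT);
}

/* Read the entity-sense field back out of the packed state word. */
static inline uint32_t example_get_entity_sense(uint32_t state)
{
    return DECODE_DRC_STATE(state, EXAMPLE_ENTITY_SENSE_MASK,
                            EXAMPLE_ENTITY_SENSE_SHIFT);
}
```

Under these assumptions, storing the raw value 2 in state and reading
it back with example_get_entity_sense() gives 0 rather than 2, because
the value sits in the wrong bit field.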

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations Michael Roth
  2014-08-26  9:40   ` Alexey Kardashevskiy
  2014-08-26 12:30   ` Alexander Graf
@ 2014-09-03 10:33   ` Bharata B Rao
  2014-09-03 23:03     ` Michael Roth
  2 siblings, 1 reply; 69+ messages in thread
From: Bharata B Rao @ 2014-09-03 10:33 UTC (permalink / raw)
  To: Michael Roth
  Cc: aik, qemu-devel, Alexander Graf, ncmike, qemu-ppc, tyreld, nfont

On Tue, Aug 19, 2014 at 5:51 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> This enables hotplug for PHB bridges. Upon hotplug we generate the
> OF-nodes required by PAPR specification and IEEE 1275-1994
> "PCI Bus Binding to Open Firmware" for the device.
>
> We associate the corresponding FDT for these nodes with the DrcEntry
> corresponding to the slot, which will be fetched via
> ibm,configure-connector RTAS calls by the guest as described by PAPR
> specification. The FDT is cleaned up in the case of unplug.
>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr_pci.c     | 235 +++++++++++++++++++++++++++++++++++++++++++++++--
>  include/hw/ppc/spapr.h |   1 +
>  2 files changed, 228 insertions(+), 8 deletions(-)
>
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 96a57be..23864ab 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -87,6 +87,17 @@
>  #define ENCODE_DRC_STATE(val, m, s) \
>      (((uint32_t)(val) << (s)) & (uint32_t)(m))
>
> +#define FDT_MAX_SIZE            0x10000
> +#define _FDT(exp) \
> +    do { \
> +        int ret = (exp);                                           \
> +        if (ret < 0) {                                             \
> +            return ret;                                            \
> +        }                                                          \
> +    } while (0)
> +
> +static void spapr_drc_state_reset(sPAPRDrcEntry *drc_entry);
> +
>  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
>  {
>      sPAPRPHBState *sphb;
> @@ -476,6 +487,22 @@ static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>          /* encode the new value into the correct bit field */
>          shift = INDICATOR_ISOLATION_SHIFT;
>          mask = INDICATOR_ISOLATION_MASK;
> +        if (drc_entry) {
> +            /* transition from unisolated to isolated for a hotplug slot
> +             * entails completion of guest-side device unplug/cleanup, so
> +             * we can now safely remove the device if qemu is waiting for
> +             * it to be released
> +             */
> +            if (DECODE_DRC_STATE(*pind, mask, shift) != indicator_state) {
> +                if (indicator_state == 0 && drc_entry->awaiting_release) {
> +                    /* device_del has been called and host is waiting
> +                     * for guest to release/isolate device, go ahead
> +                     * and remove it now
> +                     */
> +                    spapr_drc_state_reset(drc_entry);
> +                }
> +            }
> +        }
>          break;
>      case 9002: /* DR */
>          shift = INDICATOR_DR_SHIFT;
> @@ -816,6 +843,198 @@ static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>      return &phb->iommu_as;
>  }
>
> +static int spapr_populate_pci_child_dt(PCIDevice *dev, void *fdt, int offset,
> +                                       int phb_index)
> +{
> +    int slot = PCI_SLOT(dev->devfn);
> +    char slotname[16];
> +    bool is_bridge = 1;
> +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> +    uint32_t reg[RESOURCE_CELLS_TOTAL * 8] = { 0 };
> +    uint32_t assigned[RESOURCE_CELLS_TOTAL * 8] = { 0 };
> +    int reg_size, assigned_size;
> +
> +    drc_entry = spapr_phb_to_drc_entry(phb_index + SPAPR_PCI_BASE_BUID);
> +    g_assert(drc_entry);
> +    drc_entry_slot = &drc_entry->child_entries[slot];
> +
> +    if (pci_default_read_config(dev, PCI_HEADER_TYPE, 1) ==
> +        PCI_HEADER_TYPE_NORMAL) {
> +        is_bridge = 0;
> +    }
> +
> +    _FDT(fdt_setprop_cell(fdt, offset, "vendor-id",
> +                          pci_default_read_config(dev, PCI_VENDOR_ID, 2)));
> +    _FDT(fdt_setprop_cell(fdt, offset, "device-id",
> +                          pci_default_read_config(dev, PCI_DEVICE_ID, 2)));
> +    _FDT(fdt_setprop_cell(fdt, offset, "revision-id",
> +                          pci_default_read_config(dev, PCI_REVISION_ID, 1)));
> +    _FDT(fdt_setprop_cell(fdt, offset, "class-code",
> +                          pci_default_read_config(dev, PCI_CLASS_DEVICE, 2) << 8));
> +
> +    _FDT(fdt_setprop_cell(fdt, offset, "interrupts",
> +                          pci_default_read_config(dev, PCI_INTERRUPT_PIN, 1)));
> +
> +    /* if this device is NOT a bridge */
> +    if (!is_bridge) {
> +        _FDT(fdt_setprop_cell(fdt, offset, "min-grant",
> +            pci_default_read_config(dev, PCI_MIN_GNT, 1)));
> +        _FDT(fdt_setprop_cell(fdt, offset, "max-latency",
> +            pci_default_read_config(dev, PCI_MAX_LAT, 1)));
> +        _FDT(fdt_setprop_cell(fdt, offset, "subsystem-id",
> +            pci_default_read_config(dev, PCI_SUBSYSTEM_ID, 2)));
> +        _FDT(fdt_setprop_cell(fdt, offset, "subsystem-vendor-id",
> +            pci_default_read_config(dev, PCI_SUBSYSTEM_VENDOR_ID, 2)));
> +    }
> +
> +    _FDT(fdt_setprop_cell(fdt, offset, "cache-line-size",
> +        pci_default_read_config(dev, PCI_CACHE_LINE_SIZE, 1)));
> +
> +    /* the following fdt cells are masked off the pci status register */
> +    int pci_status = pci_default_read_config(dev, PCI_STATUS, 2);
> +    _FDT(fdt_setprop_cell(fdt, offset, "devsel-speed",
> +                          PCI_STATUS_DEVSEL_MASK & pci_status));
> +    _FDT(fdt_setprop_cell(fdt, offset, "fast-back-to-back",
> +                          PCI_STATUS_FAST_BACK & pci_status));
> +    _FDT(fdt_setprop_cell(fdt, offset, "66mhz-capable",
> +                          PCI_STATUS_66MHZ & pci_status));
> +    _FDT(fdt_setprop_cell(fdt, offset, "udf-supported",
> +                          PCI_STATUS_UDF & pci_status));
> +
> +    _FDT(fdt_setprop_string(fdt, offset, "name", "pci"));
> +    sprintf(slotname, "Slot %d", slot + phb_index * 32);
> +    _FDT(fdt_setprop(fdt, offset, "ibm,loc-code", slotname, strlen(slotname)));
> +    _FDT(fdt_setprop_cell(fdt, offset, "ibm,my-drc-index",
> +                          drc_entry_slot->drc_index));
> +
> +    _FDT(fdt_setprop_cell(fdt, offset, "#address-cells",
> +                          RESOURCE_CELLS_ADDRESS));
> +    _FDT(fdt_setprop_cell(fdt, offset, "#size-cells",
> +                          RESOURCE_CELLS_SIZE));
> +    _FDT(fdt_setprop_cell(fdt, offset, "ibm,req#msi-x",
> +                          RESOURCE_CELLS_SIZE));
> +    fill_resource_props(dev, phb_index, reg, &reg_size,
> +                        assigned, &assigned_size);
> +    _FDT(fdt_setprop(fdt, offset, "reg", reg, reg_size));
> +    _FDT(fdt_setprop(fdt, offset, "assigned-addresses",
> +                     assigned, assigned_size));
> +
> +    return 0;
> +}
> +
> +static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev)
> +{
> +    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
> +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> +    sPAPRConfigureConnectorState *ccs;
> +    int slot = PCI_SLOT(dev->devfn);
> +    int offset, ret;
> +    void *fdt_orig, *fdt;
> +    char nodename[512];
> +    uint32_t encoded = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_PRESENT,
> +                                        INDICATOR_ENTITY_SENSE_MASK,
> +                                        INDICATOR_ENTITY_SENSE_SHIFT);
> +

I am building on this infrastructure of yours and adding CPU hotplug
support to sPAPR guests.

So we start with dr state of UNUSABLE and change it to PRESENT like
above when performing hotplug operation. But after this, in case of
CPU hotplug, the CPU hotplug code in the kernel
(arch/powerpc/platforms/pseries/dlpar.c:dlpar_acquire_drc()) does
get-sensor-state rtas call and expects the dr state to be UNUSABLE. Is
the guest kernel right in expecting dr state to be in UNUSABLE state
like this ? I have in fact disabled this check in the guest kernel to
get a CPU hotplugged successfully, but wanted to understand the state
changes and the expectations from the guest kernel correctly.

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations
  2014-09-03 10:33   ` Bharata B Rao
@ 2014-09-03 23:03     ` Michael Roth
  2014-09-04 15:08       ` Bharata B Rao
  0 siblings, 1 reply; 69+ messages in thread
From: Michael Roth @ 2014-09-03 23:03 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: aik, qemu-devel, Alexander Graf, ncmike, qemu-ppc, tyreld, nfont

Quoting Bharata B Rao (2014-09-03 05:33:56)
> On Tue, Aug 19, 2014 at 5:51 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> > This enables hotplug for PHB bridges. Upon hotplug we generate the
> > OF-nodes required by PAPR specification and IEEE 1275-1994
> > "PCI Bus Binding to Open Firmware" for the device.
> >
> > We associate the corresponding FDT for these nodes with the DrcEntry
> > corresponding to the slot, which will be fetched via
> > ibm,configure-connector RTAS calls by the guest as described by PAPR
> > specification. The FDT is cleaned up in the case of unplug.
> >
> > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr_pci.c     | 235 +++++++++++++++++++++++++++++++++++++++++++++++--
> >  include/hw/ppc/spapr.h |   1 +
> >  2 files changed, 228 insertions(+), 8 deletions(-)
> >
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 96a57be..23864ab 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -87,6 +87,17 @@
> >  #define ENCODE_DRC_STATE(val, m, s) \
> >      (((uint32_t)(val) << (s)) & (uint32_t)(m))
> >
> > +#define FDT_MAX_SIZE            0x10000
> > +#define _FDT(exp) \
> > +    do { \
> > +        int ret = (exp);                                           \
> > +        if (ret < 0) {                                             \
> > +            return ret;                                            \
> > +        }                                                          \
> > +    } while (0)
> > +
> > +static void spapr_drc_state_reset(sPAPRDrcEntry *drc_entry);
> > +
> >  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
> >  {
> >      sPAPRPHBState *sphb;
> > @@ -476,6 +487,22 @@ static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
> >          /* encode the new value into the correct bit field */
> >          shift = INDICATOR_ISOLATION_SHIFT;
> >          mask = INDICATOR_ISOLATION_MASK;
> > +        if (drc_entry) {
> > +            /* transition from unisolated to isolated for a hotplug slot
> > +             * entails completion of guest-side device unplug/cleanup, so
> > +             * we can now safely remove the device if qemu is waiting for
> > +             * it to be released
> > +             */
> > +            if (DECODE_DRC_STATE(*pind, mask, shift) != indicator_state) {
> > +                if (indicator_state == 0 && drc_entry->awaiting_release) {
> > +                    /* device_del has been called and host is waiting
> > +                     * for guest to release/isolate device, go ahead
> > +                     * and remove it now
> > +                     */
> > +                    spapr_drc_state_reset(drc_entry);
> > +                }
> > +            }
> > +        }
> >          break;
> >      case 9002: /* DR */
> >          shift = INDICATOR_DR_SHIFT;
> > @@ -816,6 +843,198 @@ static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
> >      return &phb->iommu_as;
> >  }
> >
> > +static int spapr_populate_pci_child_dt(PCIDevice *dev, void *fdt, int offset,
> > +                                       int phb_index)
> > +{
> > +    int slot = PCI_SLOT(dev->devfn);
> > +    char slotname[16];
> > +    bool is_bridge = 1;
> > +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> > +    uint32_t reg[RESOURCE_CELLS_TOTAL * 8] = { 0 };
> > +    uint32_t assigned[RESOURCE_CELLS_TOTAL * 8] = { 0 };
> > +    int reg_size, assigned_size;
> > +
> > +    drc_entry = spapr_phb_to_drc_entry(phb_index + SPAPR_PCI_BASE_BUID);
> > +    g_assert(drc_entry);
> > +    drc_entry_slot = &drc_entry->child_entries[slot];
> > +
> > +    if (pci_default_read_config(dev, PCI_HEADER_TYPE, 1) ==
> > +        PCI_HEADER_TYPE_NORMAL) {
> > +        is_bridge = 0;
> > +    }
> > +
> > +    _FDT(fdt_setprop_cell(fdt, offset, "vendor-id",
> > +                          pci_default_read_config(dev, PCI_VENDOR_ID, 2)));
> > +    _FDT(fdt_setprop_cell(fdt, offset, "device-id",
> > +                          pci_default_read_config(dev, PCI_DEVICE_ID, 2)));
> > +    _FDT(fdt_setprop_cell(fdt, offset, "revision-id",
> > +                          pci_default_read_config(dev, PCI_REVISION_ID, 1)));
> > +    _FDT(fdt_setprop_cell(fdt, offset, "class-code",
> > +                          pci_default_read_config(dev, PCI_CLASS_DEVICE, 2) << 8));
> > +
> > +    _FDT(fdt_setprop_cell(fdt, offset, "interrupts",
> > +                          pci_default_read_config(dev, PCI_INTERRUPT_PIN, 1)));
> > +
> > +    /* if this device is NOT a bridge */
> > +    if (!is_bridge) {
> > +        _FDT(fdt_setprop_cell(fdt, offset, "min-grant",
> > +            pci_default_read_config(dev, PCI_MIN_GNT, 1)));
> > +        _FDT(fdt_setprop_cell(fdt, offset, "max-latency",
> > +            pci_default_read_config(dev, PCI_MAX_LAT, 1)));
> > +        _FDT(fdt_setprop_cell(fdt, offset, "subsystem-id",
> > +            pci_default_read_config(dev, PCI_SUBSYSTEM_ID, 2)));
> > +        _FDT(fdt_setprop_cell(fdt, offset, "subsystem-vendor-id",
> > +            pci_default_read_config(dev, PCI_SUBSYSTEM_VENDOR_ID, 2)));
> > +    }
> > +
> > +    _FDT(fdt_setprop_cell(fdt, offset, "cache-line-size",
> > +        pci_default_read_config(dev, PCI_CACHE_LINE_SIZE, 1)));
> > +
> > +    /* the following fdt cells are masked off the pci status register */
> > +    int pci_status = pci_default_read_config(dev, PCI_STATUS, 2);
> > +    _FDT(fdt_setprop_cell(fdt, offset, "devsel-speed",
> > +                          PCI_STATUS_DEVSEL_MASK & pci_status));
> > +    _FDT(fdt_setprop_cell(fdt, offset, "fast-back-to-back",
> > +                          PCI_STATUS_FAST_BACK & pci_status));
> > +    _FDT(fdt_setprop_cell(fdt, offset, "66mhz-capable",
> > +                          PCI_STATUS_66MHZ & pci_status));
> > +    _FDT(fdt_setprop_cell(fdt, offset, "udf-supported",
> > +                          PCI_STATUS_UDF & pci_status));
> > +
> > +    _FDT(fdt_setprop_string(fdt, offset, "name", "pci"));
> > +    sprintf(slotname, "Slot %d", slot + phb_index * 32);
> > +    _FDT(fdt_setprop(fdt, offset, "ibm,loc-code", slotname, strlen(slotname)));
> > +    _FDT(fdt_setprop_cell(fdt, offset, "ibm,my-drc-index",
> > +                          drc_entry_slot->drc_index));
> > +
> > +    _FDT(fdt_setprop_cell(fdt, offset, "#address-cells",
> > +                          RESOURCE_CELLS_ADDRESS));
> > +    _FDT(fdt_setprop_cell(fdt, offset, "#size-cells",
> > +                          RESOURCE_CELLS_SIZE));
> > +    _FDT(fdt_setprop_cell(fdt, offset, "ibm,req#msi-x",
> > +                          RESOURCE_CELLS_SIZE));
> > +    fill_resource_props(dev, phb_index, reg, &reg_size,
> > +                        assigned, &assigned_size);
> > +    _FDT(fdt_setprop(fdt, offset, "reg", reg, reg_size));
> > +    _FDT(fdt_setprop(fdt, offset, "assigned-addresses",
> > +                     assigned, assigned_size));
> > +
> > +    return 0;
> > +}
> > +
> > +static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev)
> > +{
> > +    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
> > +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> > +    sPAPRConfigureConnectorState *ccs;
> > +    int slot = PCI_SLOT(dev->devfn);
> > +    int offset, ret;
> > +    void *fdt_orig, *fdt;
> > +    char nodename[512];
> > +    uint32_t encoded = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_PRESENT,
> > +                                        INDICATOR_ENTITY_SENSE_MASK,
> > +                                        INDICATOR_ENTITY_SENSE_SHIFT);
> > +
> 
> I am building on this infrastructure of yours and adding CPU hotplug
> support to sPAPR guests.
> 
> So we start with dr state of UNUSABLE and change it to PRESENT like
> above when performing hotplug operation. But after this, in case of
> CPU hotplug, the CPU hotplug code in the kernel
> (arch/powerpc/platforms/pseries/dlpar.c:dlpar_acquire_drc()) does
> get-sensor-state rtas call and expects the dr state to be UNUSABLE. Is
> the guest kernel right in expecting dr state to be in UNUSABLE state
> like this ? I have in fact disabled this check in the guest kernel to
> get a CPU hotplugged successfully, but wanted to understand the state
> changes and the expectations from the guest kernel correctly.

According to PAPR+ 2.7 13.5.3.3,

  PRESENT (1):
  
  Returned for logical and physical DR entities when the DR connector is
  allocated to the OS and the DR entity is present. For physical DR entities,
  this indicates that the DR connector actually has a DR entity plugged into
  it. For DR connectors of physical DR entities, the DR connector must be
  allocated to the OS to return this value, otherwise a status of -3, no such
  sensor implemented, will be returned from the get-sensor-state RTAS call. For
  DR connectors of logical DR entities, the DR connector must be allocated to
  the OS to return this value, otherwise a sensor value of 2 or 3 will be
  returned.
  
  UNUSABLE (2):
  
  Returned for logical DR entities when the DR entity is not currently
  available to the OS, but may possibly be made available to the OS by calling
  set-indicator with the allocation-state indicator, setting that indicator to
  usable.

So it seems 'PRESENT' is in fact the right value immediately after PCI
hotplug, but it doesn't seem clear from the documentation whether 'PRESENT'
or 'UNUSABLE' is more correct immediately after CPU hotplug. What does
seem clear is that UNUSABLE does not have any use in the case of PCI
devices: just PRESENT/EMPTY(0).

But we actually 0-initialize the sensor field for DRCEntrys corresponding
to PCI devices, which corresponds to 'EMPTY' (0), so the handling seems
correct for PCI devices...
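
As a rough model of the states described above (a sketch only: the
ExampleConnector type and the example_* helpers are hypothetical, and
the acquire step is my reading of the PAPR text plus the guest check
mentioned for dlpar_acquire_drc(), not actual QEMU or kernel code):

```c
#include <stdbool.h>

/* Entity-sense values as quoted from PAPR above. */
enum {
    ENTITY_SENSE_EMPTY    = 0,
    ENTITY_SENSE_PRESENT  = 1,
    ENTITY_SENSE_UNUSABLE = 2,
};

/* Hypothetical connector model; not the sPAPRDrcEntry from the patch. */
typedef struct {
    bool logical;      /* logical DR entity vs. physical one (PCI slot) */
    int entity_sense;
} ExampleConnector;

/* Physical slots start out EMPTY, logical connectors start out UNUSABLE. */
static inline int example_initial_sense(bool logical)
{
    return logical ? ENTITY_SENSE_UNUSABLE : ENTITY_SENSE_EMPTY;
}

/* Hotplug into a physical slot: the slot now has an entity plugged in. */
static inline void example_physical_plug(ExampleConnector *c)
{
    if (!c->logical) {
        c->entity_sense = ENTITY_SENSE_PRESENT;
    }
}

/* Guest acquiring a logical connector: it expects UNUSABLE, then sets
 * the allocation-state indicator to usable, after which the connector
 * is allocated to the OS and reports PRESENT. */
static inline bool example_guest_acquire(ExampleConnector *c)
{
    if (!c->logical || c->entity_sense != ENTITY_SENSE_UNUSABLE) {
        return false;
    }
    c->entity_sense = ENTITY_SENSE_PRESENT;
    return true;
}
```

In this model a physical slot only ever moves between EMPTY and
PRESENT, while a logical connector that is already PRESENT fails the
guest's acquire check.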

And since we already initialize PHB sensors to UNUSABLE in the top-level
DRC list, I'm not sure why adjacent CPU entries would be affected by what
we do later for PCI devices? It seems like you'd just need to do the
equivalent of what we do for PHBs during realize:

  spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);

So I'm not sure where the need for guest kernel changes is coming from for
CPU hotplug. Do you have a snippet of what the initialize/hot_add hooks
look like in your case?

> 
> Regards,
> Bharata.


* Re: [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations
  2014-09-03 23:03     ` Michael Roth
@ 2014-09-04 15:08       ` Bharata B Rao
  2014-09-04 16:12         ` Michael Roth
  0 siblings, 1 reply; 69+ messages in thread
From: Bharata B Rao @ 2014-09-04 15:08 UTC (permalink / raw)
  To: Michael Roth
  Cc: aik, qemu-devel, Alexander Graf, ncmike, qemu-ppc, tyreld, nfont

On Thu, Sep 4, 2014 at 4:33 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
>> > +static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev)
>> > +{
>> > +    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
>> > +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
>> > +    sPAPRConfigureConnectorState *ccs;
>> > +    int slot = PCI_SLOT(dev->devfn);
>> > +    int offset, ret;
>> > +    void *fdt_orig, *fdt;
>> > +    char nodename[512];
>> > +    uint32_t encoded = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_PRESENT,
>> > +                                        INDICATOR_ENTITY_SENSE_MASK,
>> > +                                        INDICATOR_ENTITY_SENSE_SHIFT);
>> > +
>>
>> I am building on this infrastructure of yours and adding CPU hotplug
>> support to sPAPR guests.
>>
>> So we start with dr state of UNUSABLE and change it to PRESENT like
>> above when performing hotplug operation. But after this, in case of
>> CPU hotplug, the CPU hotplug code in the kernel
>> (arch/powerpc/platforms/pseries/dlpar.c:dlpar_acquire_drc()) does
>> get-sensor-state rtas call and expects the dr state to be UNUSABLE. Is
>> the guest kernel right in expecting dr state to be in UNUSABLE state
>> like this ? I have in fact disabled this check in the guest kernel to
>> get a CPU hotplugged successfully, but wanted to understand the state
>> changes and the expectations from the guest kernel correctly.
>
> According to PAPR+ 2.7 13.5.3.3,
>
>   PRESENT (1):
>
>   Returned for logical and physical DR entities when the DR connector is
>   allocated to the OS and the DR entity is present. For physical DR entities,
>   this indicates that the DR connector actually has a DR entity plugged into
>   it. For DR connectors of physical DR entities, the DR connector must be
>   allocated to the OS to return this value, otherwise a status of -3, no such
>   sensor implemented, will be returned from the get-sensor-state RTAS call. For
>   DR connectors of logical DR entities, the DR connector must be allocated to
>   the OS to return this value, otherwise a sensor value of 2 or 3 will be
>   returned.
>
>   UNUSABLE (2):
>
>   Returned for logical DR entities when the DR entity is not currently
>   available to the OS, but may possibly be made available to the OS by calling
>   set-indicator with the allocation-state indicator, setting that indicator to
>   usable.
>
> So it seems 'PRESENT' is in fact the right value immediately after PCI
> hotplug, but it doesn't seem clear from the documentation whether 'PRESENT'
> or 'UNUSABLE' is more correct immediately after CPU hotplug. What does
> seem clear is that UNUSABLE does not have any use in the case of PCI
> devices: just PRESENT/EMPTY(0).
>
> But we actually 0-initialize the sensor field for DRCEntrys corresponding
> to PCI devices, which corresponds to 'EMPTY' (0), so the handling seems
> correct for PCI devices...

Thanks Michael for the above information on PRESENT and USABLE states.

>
> And since we already initialize PHB sensors to UNUSABLE in the top-level
> DRC list, I'm not sure why adjacent CPU entries would be affected by what
> we do later for PCI devices?

Sorry if I wasn't clear enough in my previous mail. CPU hotplug isn't
affected by what you do for PCI devices, but...

> It seems like you'd just need to do the
> equivalent of what we do for PHBs during realize:

when I try to do the same state changes for CPU hotplug, things don't
work as expected.

>
>   spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
>
> So I'm not sure where the need for guest kernel changes is coming from for
> CPU hotplug.

When the resource is hotplugged, you change the state from UNUSABLE to
PRESENT in QEMU before signalling the guest (via the check-exception irq).
But in the CPU hotplug case, that same state change does not match what
the guest kernel expects.

> Do you have a snippet of what the initialize/hot_add hooks
> look like in your case?

I am talking about this piece of code in the kernel in
arch/powerpc/platforms/pseries/dlpar.c

int dlpar_acquire_drc(u32 drc_index)
{
        int dr_status, rc;

        rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status,
                       DR_ENTITY_SENSE, drc_index);
        if (rc || dr_status != DR_ENTITY_UNUSABLE)
                return -1;
        ...
}

I have worked around this problem by not setting the state to PRESENT
in my current hotplug patch. See the spapr_cpu_hotplug_add() routine
in my patch 14/15
(https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg00757.html)
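
To make the mismatch concrete, here is a minimal standalone sketch. It is
not QEMU or kernel code: struct mock_drc and all mock_* functions are
made-up names for illustration, and only the sensor values (EMPTY 0,
PRESENT 1, UNUSABLE 2) come from the PAPR text quoted above. It models the
guest-side check from dlpar_acquire_drc() against the two hot-add
behaviours discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* dr-entity-sense values as quoted from PAPR+ 13.5.3.3 in this thread */
enum { ENTITY_EMPTY = 0, ENTITY_PRESENT = 1, ENTITY_UNUSABLE = 2 };

/* Hypothetical stand-in for the per-connector state QEMU tracks */
struct mock_drc {
    uint32_t entity_sense;
};

/* Models the guest check: acquisition proceeds only from UNUSABLE */
static int mock_guest_acquire_check(const struct mock_drc *drc)
{
    assert(drc != NULL);
    return drc->entity_sense == ENTITY_UNUSABLE ? 0 : -1;
}

/* PCI-style hot-add: mark the entity PRESENT before notifying the guest */
static void mock_hot_add_pci_style(struct mock_drc *drc)
{
    assert(drc != NULL);
    drc->entity_sense = ENTITY_PRESENT;
}

/* CPU-style hot-add (the workaround): leave the sensor at UNUSABLE */
static void mock_hot_add_cpu_style(struct mock_drc *drc)
{
    assert(drc != NULL);
    /* state intentionally left untouched */
}
```

With a connector that starts at ENTITY_UNUSABLE, the check returns -1 after
mock_hot_add_pci_style() and 0 after mock_hot_add_cpu_style(), which is the
behaviour I am seeing with the real guest.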

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations
  2014-09-04 15:08       ` Bharata B Rao
@ 2014-09-04 16:12         ` Michael Roth
  2014-09-04 16:34           ` Michael Roth
  0 siblings, 1 reply; 69+ messages in thread
From: Michael Roth @ 2014-09-04 16:12 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: aik, qemu-devel, Alexander Graf, ncmike, qemu-ppc, tyreld, nfont

Quoting Bharata B Rao (2014-09-04 10:08:20)
> On Thu, Sep 4, 2014 at 4:33 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> >> > +static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev)
> >> > +{
> >> > +    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
> >> > +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> >> > +    sPAPRConfigureConnectorState *ccs;
> >> > +    int slot = PCI_SLOT(dev->devfn);
> >> > +    int offset, ret;
> >> > +    void *fdt_orig, *fdt;
> >> > +    char nodename[512];
> >> > +    uint32_t encoded = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_PRESENT,
> >> > +                                        INDICATOR_ENTITY_SENSE_MASK,
> >> > +                                        INDICATOR_ENTITY_SENSE_SHIFT);
> >> > +
> >>
> >> I am building on this infrastructure of yours and adding CPU hotplug
> >> support to sPAPR guests.
> >>
> >> So we start with dr state of UNUSABLE and change it to PRESENT like
> >> above when performing hotplug operation. But after this, in case of
> >> CPU hotplug, the CPU hotplug code in the kernel
> >> (arch/powerpc/platforms/pseries/dlpar.c:dlpar_acquire_drc()) does
> >> get-sensor-state rtas call and expects the dr state to be UNUSABLE. Is
> >> the guest kernel right in expecting dr state to be in UNUSABLE state
> >> like this ? I have in fact disabled this check in the guest kernel to
> >> get a CPU hotplugged successfully, but wanted to understand the state
> >> changes and the expectations from the guest kernel correctly.
> >
> > According to PAPR+ 2.7 13.5.3.3,
> >
> >   PRESENT (1):
> >
> >   Returned for logical and physical DR entities when the DR connector is
> >   allocated to the OS and the DR entity is present. For physical DR entities,
> >   this indicates that the DR connector actually has a DR entity plugged into
> >   it. For DR connectors of physical DR entities, the DR connector must be
> >   allocated to the OS to return this value, otherwise a status of -3, no such
> >   sensor implemented, will be returned from the get-sensor-state RTAS call. For
> >   DR connectors of logical DR entities, the DR connector must be allocated to
> >   the OS to return this value, otherwise a sensor value of 2 or 3 will be
> >   returned.
> >
> >   UNUSABLE (2):
> >
> >   Returned for logical DR entities when the DR entity is not currently
> >   available to the OS, but may possibly be made available to the OS by calling
> >   set-indicator with the allocation-state indicator, setting that indicator to
> >   usable.
> >
> > So it seems 'PRESENT' is in fact the right value immediately after PCI
> > hotplug, but it doesn't seem clear from the documentation whether 'PRESENT'
> > or 'UNUSABLE' is more correct immediately after CPU hotplug. What does
> > seem clear is that UNUSABLE does not have any use in the case of PCI
> > devices: just PRESENT/EMPTY(0).
> >
> > But we actually 0-initialize the sensor field for DRCEntrys corresponding
> > to PCI devices, which corresponds to 'EMPTY' (0), so the handling seems
> > correct for PCI devices...
> 
> Thanks Michael for the above information on PRESENT and USABLE states.
> 
> >
> > And since we already initialize PHB sensors to UNUSABLE in the top-level
> > DRC list, I'm not sure why adjacent CPU entries would be affected by what
> > we do later for PCI devices?
> 
> Sorry if I wasn't clear enough in my previous mail. CPU hotplug isn't
> affected by what you do for PCI devices, but...
> 
> > It seems like you'd just need to do the
> > equivalent of what we do for PHBs during realize:
> 
> when I try to do the same state changes for CPU hotplug, things don't
> work as expected.
> 
> >
> >   spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
> >
> > So I'm not sure where the need for guest kernel changes is coming from for
> > CPU hotplug.
> 
> When the resource is hotplugged, you change the state from UNUSABLE to
> PRESENT in QEMU before signalling the guest (via check exception irq).
> But the same state change in CPU hotplug case isn't as per guest
> kernel's expectation.
> 
> > Do you have a snippet of what the initialize/hot_add hooks
> > look like in your case?
> 
> I am talking about this piece of code in the kernel in
> arch/powerpc/platforms/pseries/dlpar.c
> 
> int dlpar_acquire_drc(u32 drc_index)
> {
>         int dr_status, rc;
> 
>         rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status,
>                        DR_ENTITY_SENSE, drc_index);
>         if (rc || dr_status != DR_ENTITY_UNUSABLE)
>           return -1;
>        ...
> }
> 
> I have circumvented this problem by not setting the state to PRESENT
> in my current hotplug patch. You can refer to the same in
> spapr_cpu_hotplug_add() routine that's part of my patch 14/15
> (https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg00757.html)

Yah, that's what I was getting at: at least just to get things working
for testing, just avoid the PRESENT bits in your hot_add_cpu hook rather
than patching the guest. Unfortunately the documentation isn't particularly
clear about which of these approaches is more correct as far as CPUs go. But
looking at it again:

   UNUSABLE (2):

   Returned for logical DR entities when the DR entity is not currently
   available to the OS, but may possibly be made available to the OS by calling
   set-indicator with the allocation-state indicator, setting that indicator to
   usable.

That 'usable' indicator setting is documented for set-indicator as (1), which
happens to correspond to PRESENT (1). So my read would be that for 'physical'
hotplug (like PCI), the firmware changes the indicator state to PRESENT/USABLE
immediately after hotplug, whereas with 'logical' hotplug (like PHB/CPU), the
guest OS signals this transition to USABLE through set-indicator calls for the
9003 sensor/allocation state after hotplug (which also seems to match up with
the kernel code).

This seems to correspond with the dlpar_acquire_drc() function, but I'm a
little confused why that's not also called in the PHB path...I think maybe
in that case it's handled by drmgr in userspace... will take another look
to confirm.
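
As a rough model of that reading (a sketch only, not the QEMU
implementation): the sketch_* names, the drc_kind split, and the numeric
value used for the "unusable" allocation state are assumptions made for
illustration. The entity-sense values and usable == 1 are the ones
mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Entity-sense values from the PAPR text quoted in this thread */
enum entity_sense { SENSE_EMPTY = 0, SENSE_PRESENT = 1, SENSE_UNUSABLE = 2 };

/* set-indicator allocation-state values; usable == 1 is from the thread,
 * unusable == 0 is an assumption for this sketch */
enum alloc_state { ALLOC_STATE_UNUSABLE = 0, ALLOC_STATE_USABLE = 1 };

/* Hypothetical classification of a DR connector */
enum drc_kind { DRC_PHYSICAL, DRC_LOGICAL };

struct sketch_drc {
    enum drc_kind kind;
    enum entity_sense sense;
};

/* Host-side hotplug: only physical entities flip to PRESENT right away */
static void sketch_hotplug(struct sketch_drc *drc)
{
    assert(drc != NULL);
    if (drc->kind == DRC_PHYSICAL) {
        drc->sense = SENSE_PRESENT;
    }
    /* logical entities stay UNUSABLE until the guest allocates them */
}

/* Guest-side set-indicator on the allocation state (logical only) */
static int sketch_set_allocation_state(struct sketch_drc *drc,
                                       enum alloc_state value)
{
    assert(drc != NULL);
    if (drc->kind != DRC_LOGICAL) {
        return -1;
    }
    drc->sense = (value == ALLOC_STATE_USABLE) ? SENSE_PRESENT
                                               : SENSE_UNUSABLE;
    return 0;
}
```

In this model a PCI slot reports PRESENT as soon as sketch_hotplug() runs,
while a CPU or PHB connector only reports PRESENT after the guest calls
sketch_set_allocation_state() with ALLOC_STATE_USABLE.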

> 
> Regards,
> Bharata.


* Re: [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations
  2014-09-04 16:12         ` Michael Roth
@ 2014-09-04 16:34           ` Michael Roth
  2014-09-05  3:10             ` Nathan Fontenot
  0 siblings, 1 reply; 69+ messages in thread
From: Michael Roth @ 2014-09-04 16:34 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: aik, qemu-devel, Alexander Graf, ncmike, qemu-ppc, tyreld, nfont

Quoting Michael Roth (2014-09-04 11:12:15)
> Quoting Bharata B Rao (2014-09-04 10:08:20)
> > On Thu, Sep 4, 2014 at 4:33 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> > >> > +static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev)
> > >> > +{
> > >> > +    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
> > >> > +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> > >> > +    sPAPRConfigureConnectorState *ccs;
> > >> > +    int slot = PCI_SLOT(dev->devfn);
> > >> > +    int offset, ret;
> > >> > +    void *fdt_orig, *fdt;
> > >> > +    char nodename[512];
> > >> > +    uint32_t encoded = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_PRESENT,
> > >> > +                                        INDICATOR_ENTITY_SENSE_MASK,
> > >> > +                                        INDICATOR_ENTITY_SENSE_SHIFT);
> > >> > +
> > >>
> > >> I am building on this infrastructure of yours and adding CPU hotplug
> > >> support to sPAPR guests.
> > >>
> > >> So we start with dr state of UNUSABLE and change it to PRESENT like
> > >> above when performing hotplug operation. But after this, in case of
> > >> CPU hotplug, the CPU hotplug code in the kernel
> > >> (arch/powerpc/platforms/pseries/dlpar.c:dlpar_acquire_drc()) does
> > >> get-sensor-state rtas call and expects the dr state to be UNUSABLE. Is
> > >> the guest kernel right in expecting dr state to be in UNUSABLE state
> > >> like this ? I have in fact disabled this check in the guest kernel to
> > >> get a CPU hotplugged successfully, but wanted to understand the state
> > >> changes and the expectations from the guest kernel correctly.
> > >
> > > According to PAPR+ 2.7 13.5.3.3,
> > >
> > >   PRESENT (1):
> > >
> > >   Returned for logical and physical DR entities when the DR connector is
> > >   allocated to the OS and the DR entity is present. For physical DR entities,
> > >   this indicates that the DR connector actually has a DR entity plugged into
> > >   it. For DR connectors of physical DR entities, the DR connector must be
> > >   allocated to the OS to return this value, otherwise a status of -3, no such
> > >   sensor implemented, will be returned from the get-sensor-state RTAS call. For
> > >   DR connectors of logical DR entities, the DR connector must be allocated to
> > >   the OS to return this value, otherwise a sensor value of 2 or 3 will be
> > >   returned.
> > >
> > >   UNUSABLE (2):
> > >
> > >   Returned for logical DR entities when the DR entity is not currently
> > >   available to the OS, but may possibly be made available to the OS by calling
> > >   set-indicator with the allocation-state indicator, setting that indicator to
> > >   usable.
> > >
> > > So it seems 'PRESENT' is in fact the right value immediately after PCI
> > > hotplug, but it doesn't seem clear from the documentation whether 'PRESENT'
> > > or 'UNUSABLE' is more correct immediately after CPU hotplug. What does
> > > seem clear is that UNUSABLE does not have any use in the case of PCI
> > > devices: just PRESENT/EMPTY(0).
> > >
> > > But we actually 0-initialize the sensor field for DRCEntrys corresponding
> > > to PCI devices, which corresponds to 'EMPTY' (0), so the handling seems
> > > correct for PCI devices...
> > 
> > Thanks Michael for the above information on PRESENT and USABLE states.
> > 
> > >
> > > And since we already initialize PHB sensors to UNUSABLE in the top-level
> > > DRC list, I'm not sure why adjacent CPU entries would be affected by what
> > > we do later for PCI devices?
> > 
> > Sorry if I wasn't clear enough in my previous mail. CPU hotplug isn't
> > affected by what you do for PCI devices, but...
> > 
> > > It seems like you'd just need to do the
> > > equivalent of what we do for PHBs during realize:
> > 
> > when I try to do the same state changes for CPU hotplug, things don't
> > work as expected.
> > 
> > >
> > >   spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
> > >
> > > So I'm not sure where the need for guest kernel changes is coming from for
> > > CPU hotplug.
> > 
> > When the resource is hotplugged, you change the state from UNUSABLE to
> > PRESENT in QEMU before signalling the guest (via check exception irq).
> > But the same state change in CPU hotplug case isn't as per guest
> > kernel's expectation.
> > 
> > > Do you have a snippet of what the initialize/hot_add hooks
> > > look like in your case?
> > 
> > I am talking about this piece of code in the kernel in
> > arch/powerpc/platforms/pseries/dlpar.c
> > 
> > int dlpar_acquire_drc(u32 drc_index)
> > {
> >         int dr_status, rc;
> > 
> >         rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status,
> >                        DR_ENTITY_SENSE, drc_index);
> >         if (rc || dr_status != DR_ENTITY_UNUSABLE)
> >           return -1;
> >        ...
> > }
> > 
> > I have circumvented this problem by not setting the state to PRESENT
> > in my current hotplug patch. You can refer to the same in
> > spapr_cpu_hotplug_add() routine that's part of my patch 14/15
> > (https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg00757.html)
> 
> Yah, that's what I was getting at: at least just to get things working
> for testing, just avoid the PRESENT bits in your hot_add_cpu hook rather
> than patching the guest. Unfortunately the documentation isn't particularly
> clear about which of these approaches is more correct as far as CPUs go. But
> looking at it again:
> 
>    UNUSABLE (2):
> 
>    Returned for logical DR entities when the DR entity is not currently
>    available to the OS, but may possibly be made available to the OS by calling
>    set-indicator with the allocation-state indicator, setting that indicator to
>    usable.
> 
> That 'usable' indicator setting is documented for set-indicator as (1), which
> happens to correspond to PRESENT (1). So my read would be that for 'physical'
> hotplug (like PCI), the firmware changes the indicator state to PRESENT/USABLE
> immediately after hotplug, whereas with 'logical' hotplug (like PHB/CPU), the
> guest OS signals this transition to USABLE through set-indicator calls for the
> 9003 sensor/allocation state after hotplug (which also seems to match up with
> the kernel code).
> 
> This seems to correspond with the dlpar_acquire_drc() function, but I'm a
> little confused why that's not also called in the PHB path...I think maybe
> in that case it's handled by drmgr in userspace... will take another look
> to confirm.

Yah, here's the code from drmgr, same expectations:

int
acquire_drc(uint32_t drc_index)
{
    int rc;

    say(DEBUG, "Acquiring drc index 0x%x\n", drc_index);

    rc = dr_entity_sense(drc_index);
    if (rc != STATE_UNUSABLE) {
        say(ERROR, "Entity sense failed for drc %x with %d\n%s\n",
            drc_index, rc, entity_sense_error(rc));
        return -1;
    }

    say(DEBUG, "Setting allocation state to 'alloc usable'\n");
    rc = rtas_set_indicator(ALLOCATION_STATE, drc_index, ALLOC_USABLE);
    if (rc) {
        say(ERROR, "Allocation failed for drc %x with %d\n%s\n",
            drc_index, rc, set_indicator_error(rc));
        return -1;
    }

    say(DEBUG, "Setting indicator state to 'unisolate'\n");
    rc = rtas_set_indicator(ISOLATION_STATE, drc_index, UNISOLATE);
    if (rc) {
        int ret;
        rc = -1;

        say(ERROR, "Unisolate failed for drc %x with %d\n%s\n",
            drc_index, rc, set_indicator_error(rc));
        ret = rtas_set_indicator(ALLOCATION_STATE, drc_index,
                     ALLOC_UNUSABLE);
        if (ret) {
            say(ERROR, "Failed recovery to unusable state after "
                "unisolate failure for drc %x with %d\n%s\n",
                drc_index, ret, set_indicator_error(ret));
        }
    }

    return rc;
}
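
The ordering that acquire_drc() relies on can be reduced to the standalone
model below. This is only a sketch: struct demo_fw, demo_acquire() and the
fail_unisolate knob are invented for illustration and do not exist in
drmgr or QEMU. It shows the three steps (entity-sense must report
UNUSABLE, allocate, unisolate) and the rollback to UNUSABLE when the
unisolate step fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Entity-sense values from the PAPR text quoted in this thread */
enum { DEMO_SENSE_PRESENT = 1, DEMO_SENSE_UNUSABLE = 2 };

/* Hypothetical model of the firmware-side state for one connector */
struct demo_fw {
    int sense;            /* what entity-sense would report */
    bool isolated;        /* isolation state of the connector */
    bool fail_unisolate;  /* test knob: force the unisolate step to fail */
};

/* Models the acquire sequence: sense check, allocate, unisolate */
static int demo_acquire(struct demo_fw *fw)
{
    assert(fw != NULL);

    /* Step 1: the connector must currently report UNUSABLE */
    if (fw->sense != DEMO_SENSE_UNUSABLE) {
        return -1;
    }

    /* Step 2: allocation-state -> usable, entity now reports PRESENT */
    fw->sense = DEMO_SENSE_PRESENT;

    /* Step 3: unisolate; on failure, roll the allocation back */
    if (fw->fail_unisolate) {
        fw->sense = DEMO_SENSE_UNUSABLE;
        return -1;
    }
    fw->isolated = false;
    return 0;
}
```

A connector that already reports PRESENT fails at step 1, which is the
case Bharata hit when the hot-add hook set PRESENT up front.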

> 
> > 
> > Regards,
> > Bharata.


* Re: [Qemu-devel] [Qemu-ppc] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-26  7:55   ` Alexey Kardashevskiy
  2014-08-26  8:24     ` Alexey Kardashevskiy
  2014-08-26 14:56     ` Michael Roth
@ 2014-09-05  0:31     ` Tyrel Datwyler
  2 siblings, 0 replies; 69+ messages in thread
From: Tyrel Datwyler @ 2014-09-05  0:31 UTC (permalink / raw)
  To: Alexey Kardashevskiy, Michael Roth, qemu-devel
  Cc: ncmike, nfont, qemu-ppc, tyreld

On 08/26/2014 12:55 AM, Alexey Kardashevskiy wrote:
> On 08/19/2014 10:21 AM, Michael Roth wrote:
>> From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>
>> This adds entries to the root OF node to advertise our PHBs as being
>> DR-capable in accordance with the PAPR specification.
>>
>> Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
>> and associated with a power domain of -1 (indicating to guests that
>> power management is handled automatically by hardware).
>>
>> We currently allocate entries for up to 32 DR-capable PHBs, though
>> this limit can be increased later.
>>
>> DrcEntry objects to track the state of the DR-connector associated
>> with each PHB are stored in a 32-entry array, and each DrcEntry in
>> turn has a dynamically-sized number of child DR-connectors,
>> which we will use later to track the state of DR-connectors
>> associated with a PHB's physical slots.
>>
>> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>> ---
>>  hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
>>  hw/ppc/spapr_pci.c     |   1 +
>>  include/hw/ppc/spapr.h |  35 ++++++++++++
>>  3 files changed, 179 insertions(+)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 5c92707..d5e46c3 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
>>      return ram_size;
>>  }
>>  
>> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
>> +{
>> +    int i;
>> +
>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>> +        if (spapr->drc_table[i].phb_buid == buid) {
>> +            return &spapr->drc_table[i];
>> +        }
>> +     }
>> +
>> +     return NULL;
>> +}
>> +
>> +static void spapr_init_drc_table(void)
>> +{
>> +    int i;
>> +
>> +    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
>> +
>> +    /* For now we only care about PHB entries */
>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>> +        spapr->drc_table[i].drc_index = 0x2000001 + i;
>> +    }
>> +}
>> +
>> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
>> +{
>> +    sPAPRDrcEntry *empty_drc = NULL;
>> +    sPAPRDrcEntry *found_drc = NULL;
>> +    int i, phb_index;
>> +
>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>> +        if (spapr->drc_table[i].phb_buid == 0) {
>> +            empty_drc = &spapr->drc_table[i];
>> +        }
>> +
>> +        if (spapr->drc_table[i].phb_buid == buid) {
>> +            found_drc = &spapr->drc_table[i];
> 
> return &spapr->drc_table[i];
> ?
> 
> 
>> +            break;
>> +        }
>> +    }
>> +
>> +    if (found_drc) {
>> +        return found_drc;
>> +    }
> 
>    if (!empty_drc) {
>         return NULL;
>    }
> 
> ?
> 
> 
>> +
>> +    if (empty_drc) {
> 
> and no need in this :)
> 
> 
>> +        empty_drc->phb_buid = buid;
>> +        empty_drc->state = state;
>> +        empty_drc->cc_state.fdt = NULL;
>> +        empty_drc->cc_state.offset = 0;
>> +        empty_drc->cc_state.depth = 0;
>> +        empty_drc->cc_state.state = CC_STATE_IDLE;
>> +        empty_drc->child_entries =
>> +            g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
>> +        phb_index = buid - SPAPR_PCI_BASE_BUID;
>> +        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
>> +            empty_drc->child_entries[i].drc_index =
>> +                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
>> +        }
>> +        return empty_drc;
>> +    }
>> +
>> +    return NULL;
>> +}
>> +
>> +static void spapr_create_drc_dt_entries(void *fdt)
>> +{
>> +    char char_buf[1024];
>> +    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
>> +    uint32_t *entries;
>> +    int offset, fdt_offset;
>> +    int i, ret;
>> +
>> +    fdt_offset = fdt_path_offset(fdt, "/");
>> +
>> +    /* ibm,drc-indexes */
>> +    memset(int_buf, 0, sizeof(int_buf));
>> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
>> +
>> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
>> +        int_buf[i] = spapr->drc_table[i-1].drc_index;
>> +    }
>> +
>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
>> +                      sizeof(int_buf));
>> +    if (ret) {
>> +        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
> 
> return here and below in the same error cases?
> 
>> +    }
>> +
>> +    /* ibm,drc-power-domains */
>> +    memset(int_buf, 0, sizeof(int_buf));
>> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
>> +
>> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
>> +        int_buf[i] = 0xffffffff;
>> +    }
>> +
>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
>> +                      sizeof(int_buf));
>> +    if (ret) {
>> +        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains property\n");
>> +    }
>> +
>> +    /* ibm,drc-names */
>> +    memset(char_buf, 0, sizeof(char_buf));
>> +    entries = (uint32_t *)&char_buf[0];
>> +    *entries = SPAPR_DRC_TABLE_SIZE;
>> +    offset = sizeof(*entries);
>> +
>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>> +        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
>> +        char_buf[offset++] = '\0';
>> +    }
>> +
>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
>> +    if (ret) {
>> +        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
>> +    }
>> +
>> +    /* ibm,drc-types */
>> +    memset(char_buf, 0, sizeof(char_buf));
>> +    entries = (uint32_t *)&char_buf[0];
>> +    *entries = SPAPR_DRC_TABLE_SIZE;
>> +    offset = sizeof(*entries);
>> +
>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>> +        offset += sprintf(char_buf + offset, "PHB");
>> +        char_buf[offset++] = '\0';
>> +    }
>> +
>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, offset);
>> +    if (ret) {
>> +        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
>> +    }
>> +}
>> +
>>  #define _FDT(exp) \
>>      do { \
>>          int ret = (exp);                                           \
>> @@ -731,6 +868,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>      char *bootlist;
>>      void *fdt;
>>      sPAPRPHBState *phb;
>> +    sPAPRDrcEntry *drc_entry;
>>  
>>      fdt = g_malloc(FDT_MAX_SIZE);
>>  
>> @@ -750,6 +888,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>      }
>>  
>>      QLIST_FOREACH(phb, &spapr->phbs, list) {
>> +        drc_entry = spapr_phb_to_drc_entry(phb->buid);
>> +        g_assert(drc_entry);
>>          ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
>>      }
>>  
>> @@ -789,6 +929,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>>          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
>>      }
>>  
>> +    spapr_create_drc_dt_entries(fdt);
>> +
>>      _FDT((fdt_pack(fdt)));
>>  
>>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
>> @@ -1443,6 +1585,7 @@ static void ppc_spapr_init(MachineState *machine)
>>      spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW);
>>      spapr_pci_rtas_init();
>>  
>> +    spapr_init_drc_table();
>>      phb = spapr_create_phb(spapr, 0);
>>  
>>      for (i = 0; i < nb_nics; i++) {
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index 9ed39a9..e85134f 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -531,6 +531,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>              + sphb->index * SPAPR_PCI_WINDOW_SPACING;
>>          sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
>>          sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
>> +        spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
> 
> 
> What exactly does "unusable" mean here? Macro?
> 

It is the state of the dr-entity-sensor. Definitions of those states can
be found in PAPR 13.5.3.3, in the get-sensor-state RTAS call requirements
for dynamic reconfiguration. That RTAS call is introduced later in this
patchset, but oddly the unusable state is left undefined in the later
patch. The get-sensor-state implementation depends on the DRC patches, so
either we need to add the unusable definition to the get-sensor-state
patch and fix the magic number there as well, or the sensor value defines
could be split out into an earlier patch.

-Tyrel

> 
> 
>>      }
>>  
>>      if (sphb->buid == -1) {
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index 36e8e51..c93794b 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -10,6 +10,36 @@ struct sPAPRNVRAM;
>>  
>>  #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
>>  
>> +/* For dlparable/hotpluggable slots */
>> +#define SPAPR_DRC_TABLE_SIZE    32
>> +#define SPAPR_DRC_PHB_SLOT_MAX  32
>> +#define SPAPR_DRC_DEV_ID_BASE   0x40000000
> 
> 
> Is this SPAPR_DRC_DEV_ID_BASE really necessary (can it be zero)? Is that
> global id or per PCI bus or per PHB?
> 
> 
>> +
>> +typedef struct sPAPRConfigureConnectorState {
>> +    void *fdt;
>> +    int offset_start;
>> +    int offset;
>> +    int depth;
>> +    PCIDevice *dev;
>> +    enum {
>> +        CC_STATE_IDLE = 0,
>> +        CC_STATE_PENDING = 1,
>> +        CC_STATE_ACTIVE,
>> +    } state;
>> +} sPAPRConfigureConnectorState;
>> +
>> +typedef struct sPAPRDrcEntry sPAPRDrcEntry;
>> +
>> +struct sPAPRDrcEntry {
>> +    uint32_t drc_index;
>> +    uint64_t phb_buid;
>> +    void *fdt;
>> +    int fdt_offset;
>> +    uint32_t state;
>> +    sPAPRConfigureConnectorState cc_state;
>> +    sPAPRDrcEntry *child_entries;
>> +};
>> +
>>  typedef struct sPAPREnvironment {
>>      struct VIOsPAPRBus *vio_bus;
>>      QLIST_HEAD(, sPAPRPHBState) phbs;
>> @@ -39,6 +69,9 @@ typedef struct sPAPREnvironment {
>>      int htab_save_index;
>>      bool htab_first_pass;
>>      int htab_fd;
>> +
>> +    /* state for Dynamic Reconfiguration Connectors */
>> +    sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
>>  } sPAPREnvironment;
>>  
>>  #define H_SUCCESS         0
>> @@ -481,5 +514,7 @@ int spapr_dma_dt(void *fdt, int node_off, const char *propname,
>>                   uint32_t liobn, uint64_t window, uint32_t size);
>>  int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
>>                        sPAPRTCETable *tcet);
>> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
>> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
>>  
>>  #endif /* !defined (__HW_SPAPR_H__) */
>>
> 
> 


* Re: [Qemu-devel] [PATCH 06/12] spapr_pci: add get-sensor-state RTAS interface
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 06/12] spapr_pci: add get-sensor-state RTAS interface Michael Roth
@ 2014-09-05  0:34   ` Tyrel Datwyler
  0 siblings, 0 replies; 69+ messages in thread
From: Tyrel Datwyler @ 2014-09-05  0:34 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: aik, agraf, ncmike, qemu-ppc, tyreld, nfont

On 08/18/2014 05:21 PM, Michael Roth wrote:
> From: Mike Day <ncmike@ncultra.org>
> 
> Signed-off-by: Mike Day <ncmike@ncultra.org>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr_pci.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 76 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index f007dd6..8d1351d 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -66,6 +66,7 @@
>  #define INDICATOR_DR_MASK                   0x00e0   /* 9002 three bits */
>  #define INDICATOR_ALLOCATION_MASK           0x0300   /* 9003 two bits */
>  #define INDICATOR_EPOW_MASK                 0x1c00   /* 9 three bits */
> +#define INDICATOR_ENTITY_SENSE_MASK         0xe000   /* 9003 three bits */
>  
>  #define INDICATOR_ISOLATION_SHIFT           0x00     /* bit 0 */
>  #define INDICATOR_GLOBAL_INTERRUPT_SHIFT    0x01     /* bit 1 */
> @@ -75,6 +76,10 @@
>  #define INDICATOR_DR_SHIFT                  0x05     /* bits 5-7 */
>  #define INDICATOR_ALLOCATION_SHIFT          0x08     /* bits 8-9 */
>  #define INDICATOR_EPOW_SHIFT                0x0a     /* bits 10-12 */
> +#define INDICATOR_ENTITY_SENSE_SHIFT        0x0d     /* bits 13-15 */
> +
> +#define INDICATOR_ENTITY_SENSE_EMPTY    0
> +#define INDICATOR_ENTITY_SENSE_PRESENT  1

Need a define for the unusable state for the dr-entity-sensor per PAPR
13.5.3.3.

#define INDICATOR_ENTITY_SENSE_UNUSABLE  2
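
For reference, a small standalone sketch of how the value would round-trip
through the packed state word. The mask (0xe000) and shift (13) are the
ones from this patch, and the decode mirrors DECODE_DRC_STATE below. The
sense_encode() helper and the SENSE_* names are my own for illustration
and are not necessarily how ENCODE_DRC_STATE is defined in the series.

```c
#include <assert.h>
#include <stdint.h>

/* Mask and shift for the entity-sense field, as in this patch */
#define SENSE_MASK      0xe000u
#define SENSE_SHIFT     13

/* Sensor values per PAPR 13.5.3.3 */
#define SENSE_EMPTY     0u
#define SENSE_PRESENT   1u
#define SENSE_UNUSABLE  2u

/* Extract the entity-sense field from a packed state word */
static uint32_t sense_decode(uint32_t state)
{
    return (state & SENSE_MASK) >> SENSE_SHIFT;
}

/* Hypothetical helper: replace the entity-sense field, keep other bits */
static uint32_t sense_encode(uint32_t state, uint32_t value)
{
    return (state & ~SENSE_MASK) | ((value << SENSE_SHIFT) & SENSE_MASK);
}
```

With this layout, storing SENSE_UNUSABLE into an otherwise empty word
gives 0x4000, and bits outside the field (for example the allocation bits
at 0x0300) are left untouched.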


-Tyrel

>  
>  #define DECODE_DRC_STATE(state, m, s)                  \
>      ((((uint32_t)(state) & (uint32_t)(m))) >> (s))
> @@ -532,6 +537,75 @@ static void rtas_get_power_level(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>      rtas_st(rets, 1, 100);
>  }
>  
> +static void rtas_get_sensor_state(PowerPCCPU *cpu, sPAPREnvironment *spapr,
> +                                  uint32_t token, uint32_t nargs,
> +                                  target_ulong args, uint32_t nret,
> +                                  target_ulong rets)
> +{
> +    uint32_t sensor = rtas_ld(args, 0);
> +    uint32_t drc_index = rtas_ld(args, 1);
> +    uint32_t sensor_state = 0, decoded = 0;
> +    uint32_t shift = 0, mask = 0;
> +    sPAPRDrcEntry *drc_entry = NULL;
> +
> +    if (drc_index == 0) {  /* platform state sensor/indicator */
> +        sensor_state = spapr->state;
> +    } else { /* we should have a drc entry */
> +        drc_entry = spapr_find_drc_entry(drc_index);
> +        if (!drc_entry) {
> +            DPRINTF("unable to find DRC entry for index %x", drc_index);
> +            sensor_state = 0; /* empty */
> +            rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
> +            return;
> +        }
> +        sensor_state = drc_entry->state;
> +    }
> +    switch (sensor) {
> +    case 9:  /* EPOW */
> +        shift = INDICATOR_EPOW_SHIFT;
> +        mask = INDICATOR_EPOW_MASK;
> +        break;
> +    case 9001: /* Isolation state */
> +        /* encode the new value into the correct bit field */
> +        shift = INDICATOR_ISOLATION_SHIFT;
> +        mask = INDICATOR_ISOLATION_MASK;
> +        break;
> +    case 9002: /* DR */
> +        shift = INDICATOR_DR_SHIFT;
> +        mask = INDICATOR_DR_MASK;
> +        break;
> +    case 9003: /* entity sense */
> +        shift = INDICATOR_ENTITY_SENSE_SHIFT;
> +        mask = INDICATOR_ENTITY_SENSE_MASK;
> +        break;
> +    case 9005: /* global interrupt */
> +        shift = INDICATOR_GLOBAL_INTERRUPT_SHIFT;
> +        mask = INDICATOR_GLOBAL_INTERRUPT_MASK;
> +        break;
> +    case 9006: /* error log */
> +        shift = INDICATOR_ERROR_LOG_SHIFT;
> +        mask = INDICATOR_ERROR_LOG_MASK;
> +        break;
> +    case 9007: /* identify */
> +        shift = INDICATOR_IDENTIFY_SHIFT;
> +        mask = INDICATOR_IDENTIFY_MASK;
> +        break;
> +    case 9009: /* reset */
> +        shift = INDICATOR_RESET_SHIFT;
> +        mask = INDICATOR_RESET_MASK;
> +        break;
> +    default:
> +        DPRINTF("rtas_get_sensor_state: sensor not implemented: %d",
> +                sensor);
> +        rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
> +        return;
> +    }
> +
> +    decoded = DECODE_DRC_STATE(sensor_state, mask, shift);
> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +    rtas_st(rets, 1, decoded);
> +}
> +
>  static int pci_spapr_swizzle(int slot, int pin)
>  {
>      return (slot + pin) % PCI_NUM_PINS;
> @@ -1200,6 +1274,8 @@ void spapr_pci_rtas_init(void)
>                          rtas_set_power_level);
>      spapr_rtas_register(RTAS_GET_POWER_LEVEL, "get-power-level",
>                          rtas_get_power_level);
> +    spapr_rtas_register(RTAS_GET_SENSOR_STATE, "get-sensor-state",
> +                        rtas_get_sensor_state);
>  }
>  
>  static void spapr_pci_register_types(void)
> 


* Re: [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface
  2014-08-26 11:36   ` Alexander Graf
@ 2014-09-05  2:55     ` Nathan Fontenot
  2014-09-30 22:08     ` Michael Roth
  1 sibling, 0 replies; 69+ messages in thread
From: Nathan Fontenot @ 2014-09-05  2:55 UTC (permalink / raw)
  To: Alexander Graf, Michael Roth, qemu-devel; +Cc: aik, ncmike, qemu-ppc, tyreld

On 08/26/2014 06:36 AM, Alexander Graf wrote:
> 
> 
> On 19.08.14 02:21, Michael Roth wrote:
>> From: Mike Day <ncmike@ncultra.org>
>>
>> Signed-off-by: Mike Day <ncmike@ncultra.org>
>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>> ---
>>  hw/ppc/spapr_pci.c     | 119 +++++++++++++++++++++++++++++++++++++++++++++++++
>>  include/hw/ppc/spapr.h |   3 ++
>>  2 files changed, 122 insertions(+)
>>
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index 924d488..23a3477 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -36,6 +36,16 @@
>>  
>>  #include "hw/pci/pci_bus.h"
>>  
>> +/* #define DEBUG_SPAPR */
>> +
>> +#ifdef DEBUG_SPAPR
>> +#define DPRINTF(fmt, ...) \
>> +    do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
>> +#else
>> +#define DPRINTF(fmt, ...) \
>> +    do { } while (0)
>> +#endif
>> +
>>  /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
>>  #define RTAS_QUERY_FN           0
>>  #define RTAS_CHANGE_FN          1
>> @@ -47,6 +57,31 @@
>>  #define RTAS_TYPE_MSI           1
>>  #define RTAS_TYPE_MSIX          2
>>  
>> +/* For set-indicator RTAS interface */
>> +#define INDICATOR_ISOLATION_MASK            0x0001   /* 9001 one bit */
>> +#define INDICATOR_GLOBAL_INTERRUPT_MASK     0x0002   /* 9005 one bit */
>> +#define INDICATOR_ERROR_LOG_MASK            0x0004   /* 9006 one bit */
>> +#define INDICATOR_IDENTIFY_MASK             0x0008   /* 9007 one bit */
>> +#define INDICATOR_RESET_MASK                0x0010   /* 9009 one bit */
>> +#define INDICATOR_DR_MASK                   0x00e0   /* 9002 three bits */
>> +#define INDICATOR_ALLOCATION_MASK           0x0300   /* 9003 two bits */
>> +#define INDICATOR_EPOW_MASK                 0x1c00   /* 9 three bits */
>> +
>> +#define INDICATOR_ISOLATION_SHIFT           0x00     /* bit 0 */
>> +#define INDICATOR_GLOBAL_INTERRUPT_SHIFT    0x01     /* bit 1 */
>> +#define INDICATOR_ERROR_LOG_SHIFT           0x02     /* bit 2 */
>> +#define INDICATOR_IDENTIFY_SHIFT            0x03     /* bit 3 */
>> +#define INDICATOR_RESET_SHIFT               0x04     /* bit 4 */
>> +#define INDICATOR_DR_SHIFT                  0x05     /* bits 5-7 */
>> +#define INDICATOR_ALLOCATION_SHIFT          0x08     /* bits 8-9 */
>> +#define INDICATOR_EPOW_SHIFT                0x0a     /* bits 10-12 */
>> +
>> +#define DECODE_DRC_STATE(state, m, s)                  \
>> +    ((((uint32_t)(state) & (uint32_t)(m))) >> (s))
>> +
>> +#define ENCODE_DRC_STATE(val, m, s) \
>> +    (((uint32_t)(val) << (s)) & (uint32_t)(m))
>> +
>>  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
>>  {
>>      sPAPRPHBState *sphb;
>> @@ -402,6 +437,80 @@ static void rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
>>      rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */
>>  }
>>  
>> +static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>> +                               uint32_t token, uint32_t nargs,
>> +                               target_ulong args, uint32_t nret,
>> +                               target_ulong rets)
>> +{
>> +    uint32_t indicator = rtas_ld(args, 0);
>> +    uint32_t drc_index = rtas_ld(args, 1);
>> +    uint32_t indicator_state = rtas_ld(args, 2);
>> +    uint32_t encoded = 0, shift = 0, mask = 0;
>> +    uint32_t *pind;
>> +    sPAPRDrcEntry *drc_entry = NULL;
> 
> This rtas call does not have any idea what a PHB is. Why does it live in
> spapr_pci.c?
> 

It probably shouldn't be there; we will need this call for memory and CPU
hotplug later on.

-Nathan

>> +
>> +    if (drc_index == 0) { /* platform indicator */
>> +        pind = &spapr->state;
>> +    } else {
>> +        drc_entry = spapr_find_drc_entry(drc_index);
>> +        if (!drc_entry) {
>> +            DPRINTF("rtas_set_indicator: unable to find drc_entry for %x",
>> +                    drc_index);
>> +            rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
>> +            return;
>> +        }
>> +        pind = &drc_entry->state;
>> +    }
>> +
>> +    switch (indicator) {
>> +    case 9:  /* EPOW */
>> +        shift = INDICATOR_EPOW_SHIFT;
>> +        mask = INDICATOR_EPOW_MASK;
>> +        break;
>> +    case 9001: /* Isolation state */
>> +        /* encode the new value into the correct bit field */
>> +        shift = INDICATOR_ISOLATION_SHIFT;
>> +        mask = INDICATOR_ISOLATION_MASK;
>> +        break;
>> +    case 9002: /* DR */
>> +        shift = INDICATOR_DR_SHIFT;
>> +        mask = INDICATOR_DR_MASK;
>> +        break;
>> +    case 9003: /* Allocation State */
>> +        shift = INDICATOR_ALLOCATION_SHIFT;
>> +        mask = INDICATOR_ALLOCATION_MASK;
>> +        break;
>> +    case 9005: /* global interrupt */
>> +        shift = INDICATOR_GLOBAL_INTERRUPT_SHIFT;
>> +        mask = INDICATOR_GLOBAL_INTERRUPT_MASK;
>> +        break;
>> +    case 9006: /* error log */
>> +        shift = INDICATOR_ERROR_LOG_SHIFT;
>> +        mask = INDICATOR_ERROR_LOG_MASK;
>> +        break;
>> +    case 9007: /* identify */
>> +        shift = INDICATOR_IDENTIFY_SHIFT;
>> +        mask = INDICATOR_IDENTIFY_MASK;
>> +        break;
>> +    case 9009: /* reset */
>> +        shift = INDICATOR_RESET_SHIFT;
>> +        mask = INDICATOR_RESET_MASK;
>> +        break;
>> +    default:
>> +        DPRINTF("rtas_set_indicator: indicator not implemented: %d",
>> +                indicator);
>> +        rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
>> +        return;
>> +    }
>> +
>> +    encoded = ENCODE_DRC_STATE(indicator_state, mask, shift);
>> +    /* clear the current indicator value */
>> +    *pind &= ~mask;
>> +    /* set the new value */
>> +    *pind |= encoded;
>> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
>> +}
>> +
>>  static int pci_spapr_swizzle(int slot, int pin)
>>  {
>>      return (slot + pin) % PCI_NUM_PINS;
>> @@ -624,6 +733,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>          sphb->lsi_table[i].irq = irq;
>>      }
>>  
>> +    /* make sure the platform EPOW sensor is initialized - the
>> +     * guest will probe it when there is a hotplug event.
>> +     */
>> +    spapr->state &= ~(uint32_t)INDICATOR_EPOW_MASK;
>> +    spapr->state |= ENCODE_DRC_STATE(0,
>> +                                     INDICATOR_EPOW_MASK,
>> +                                     INDICATOR_EPOW_SHIFT);
>> +
>>      if (!info->finish_realize) {
>>          error_setg(errp, "finish_realize not defined");
>>          return;
>> @@ -1056,6 +1173,8 @@ void spapr_pci_rtas_init(void)
>>          spapr_rtas_register(RTAS_IBM_CHANGE_MSI, "ibm,change-msi",
>>                              rtas_ibm_change_msi);
>>      }
>> +    spapr_rtas_register(RTAS_SET_INDICATOR, "set-indicator",
>> +                        rtas_set_indicator);
>>  }
>>  
>>  static void spapr_pci_register_types(void)
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index 0ac1a19..fac85f8 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -72,6 +72,9 @@ typedef struct sPAPREnvironment {
>>  
>>      /* state for Dynamic Reconfiguration Connectors */
>>      sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
>> +
>> +    /* Platform state - sensors and indicators */
>> +    uint32_t state;
> 
> Do you think it'd be possible to create a special DRC device that
> contains all of its tables and global state and also exposes sensors and
> indicators? That device could then get linked via qom links to the PHBs
> for their slots.
> 
> 
> Alex
> 


* Re: [Qemu-devel] [PATCH 07/12] spapr_pci: add ibm, configure-connector RTAS interface
  2014-08-26  9:12   ` Alexey Kardashevskiy
@ 2014-09-05  3:03     ` Nathan Fontenot
  0 siblings, 0 replies; 69+ messages in thread
From: Nathan Fontenot @ 2014-09-05  3:03 UTC (permalink / raw)
  To: Alexey Kardashevskiy, Michael Roth, qemu-devel
  Cc: ncmike, qemu-ppc, agraf, tyreld

On 08/26/2014 04:12 AM, Alexey Kardashevskiy wrote:
> On 08/19/2014 10:21 AM, Michael Roth wrote:
>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> 
> I have totally no idea what this patch actually does :) When is this rtas
> call made? Once after the guest received the check exception interrupt? Is
> it all it does is fetching the copy of the device tree made by
> spapr_create_drc_phb_dt_entries()? If it is,
> spapr_create_drc_phb_dt_entries() could compose the structure below at the
> first place without any additional conversions, no?

This is one of those functions that I wish never existed, but unfortunately
it's one we have to have.

For pseries, the hotplug flow includes a step where the guest makes the rtas
configure-connector call to get the new pieces of the device tree that are
added to the guest's device tree as a result of adding a pci adapter (and later
for cpu and memory).

This happens after the check exception interrupt. In the guest we determine the
drc index for the pci device being added, then make this rtas call repeatedly
to get the device tree updates.
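
To make the flow a little more concrete, here is a minimal sketch of the
guest side of that exchange. It is not the kernel's implementation: the
cc_mock structure and cc_mock_call() are hypothetical stand-ins for the real
RTAS call and simply replay a fixed list of return codes. The CC_RET_* values
follow the quoted patch, assuming RTAS_OUT_SUCCESS is 0 and RTAS_OUT_HW_ERROR
is -1.

```c
#include <assert.h>
#include <stddef.h>

/* Return codes as used by the quoted patch. */
#define CC_RET_SUCCESS         0
#define CC_RET_NEXT_SIB        1
#define CC_RET_NEXT_CHILD      2
#define CC_RET_NEXT_PROPERTY   3
#define CC_RET_PREV_PARENT     4
#define CC_RET_ERROR         (-1)

/* Hypothetical mock of the firmware side: replays a script of codes. */
struct cc_mock {
    const int *script;
    size_t len;
    size_t pos;
};

static int cc_mock_call(struct cc_mock *fw)
{
    return fw->pos < fw->len ? fw->script[fw->pos++] : CC_RET_ERROR;
}

struct cc_result {
    int nodes;       /* device tree nodes received */
    int props;       /* properties received */
    int max_depth;   /* deepest nesting level seen */
};

/* Guest loop: keep calling until the firmware reports success or error.
 * A real guest would also copy names and data out of the work area. */
static int guest_configure_connector(struct cc_mock *fw, struct cc_result *res)
{
    int depth = 0;

    assert(fw && res);
    res->nodes = res->props = res->max_depth = 0;

    for (;;) {
        switch (cc_mock_call(fw)) {
        case CC_RET_NEXT_CHILD:      /* new node, one level down */
            depth++;
            res->nodes++;
            if (depth > res->max_depth) {
                res->max_depth = depth;
            }
            break;
        case CC_RET_NEXT_SIB:        /* new node at the same level */
            res->nodes++;
            break;
        case CC_RET_NEXT_PROPERTY:   /* property of the current node */
            res->props++;
            break;
        case CC_RET_PREV_PARENT:     /* current node is complete */
            depth--;
            break;
        case CC_RET_SUCCESS:         /* whole subtree delivered */
            return 0;
        default:
            return -1;
        }
    }
}

/* Example: a slot node with two properties and one child node that has
 * a single property, in the order the quoted QEMU code would emit them. */
static int cc_demo_walk(struct cc_result *res)
{
    static const int script[] = {
        CC_RET_NEXT_CHILD, CC_RET_NEXT_PROPERTY, CC_RET_NEXT_PROPERTY,
        CC_RET_NEXT_CHILD, CC_RET_NEXT_PROPERTY, CC_RET_PREV_PARENT,
        CC_RET_SUCCESS,
    };
    struct cc_mock fw = { script, sizeof(script) / sizeof(script[0]), 0 };

    return guest_configure_connector(&fw, res);
}
```

The point of the sketch is only that the call is stateful and iterative: each
invocation hands back one node or property, and the return code tells the
guest where it belongs in the tree.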

-Nathan

> 
> 
>> ---
>>  hw/ppc/spapr_pci.c | 111 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 111 insertions(+)
>>
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index 8d1351d..96a57be 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -606,6 +606,115 @@ static void rtas_get_sensor_state(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>>      rtas_st(rets, 1, decoded);
>>  }
>>  
>> +/* configure-connector work area offsets, int32_t units */
>> +#define CC_IDX_NODE_NAME_OFFSET 2
>> +#define CC_IDX_PROP_NAME_OFFSET 2
>> +#define CC_IDX_PROP_LEN 3
>> +#define CC_IDX_PROP_DATA_OFFSET 4
>> +
>> +#define CC_VAL_DATA_OFFSET ((CC_IDX_PROP_DATA_OFFSET + 1) * 4)
>> +#define CC_RET_NEXT_SIB 1
>> +#define CC_RET_NEXT_CHILD 2
>> +#define CC_RET_NEXT_PROPERTY 3
>> +#define CC_RET_PREV_PARENT 4
>> +#define CC_RET_ERROR RTAS_OUT_HW_ERROR
>> +#define CC_RET_SUCCESS RTAS_OUT_SUCCESS
> 
> 
> Why these two are here, not in the same bucket as RTAS_OUT_HW_ERROR and others?
> 
> 
>> +
>> +static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
>> +                                         sPAPREnvironment *spapr,
>> +                                         uint32_t token, uint32_t nargs,
>> +                                         target_ulong args, uint32_t nret,
>> +                                         target_ulong rets)
>> +{
>> +    uint64_t wa_addr = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 0);
>> +    sPAPRDrcEntry *drc_entry = NULL;
>> +    sPAPRConfigureConnectorState *ccs;
>> +    void *wa_buf;
>> +    int32_t *wa_buf_int;
>> +    hwaddr map_len = 0x1024;
>> +    uint32_t drc_index;
>> +    int rc = 0, next_offset, tag, prop_len, node_name_len;
>> +    const struct fdt_property *prop;
>> +    const char *node_name, *prop_name;
>> +
>> +    wa_buf = cpu_physical_memory_map(wa_addr, &map_len, 1);
>> +    if (!wa_buf) {
>> +        rc = CC_RET_ERROR;
>> +        goto error_exit;
>> +    }
>> +    wa_buf_int = wa_buf;
>> +
>> +    drc_index = *(uint32_t *)wa_buf;
>> +    drc_entry = spapr_find_drc_entry(drc_index);
>> +    if (!drc_entry) {
>> +        rc = -1;
>> +        goto error_exit;
>> +    }
>> +
>> +    ccs = &drc_entry->cc_state;
>> +    if (ccs->state == CC_STATE_PENDING) {
>> +        /* fdt should've been been attached to drc_entry during
>> +         * realize/hotplug
>> +         */
>> +        g_assert(ccs->fdt);
>> +        ccs->depth = 0;
>> +        ccs->offset = ccs->offset_start;
>> +        ccs->state = CC_STATE_ACTIVE;
>> +    }
>> +
>> +    if (ccs->state == CC_STATE_IDLE) {
>> +        rc = -1;
>> +        goto error_exit;
>> +    }
>> +
>> +retry:
>> +    tag = fdt_next_tag(ccs->fdt, ccs->offset, &next_offset);
>> +
>> +    switch (tag) {
>> +    case FDT_BEGIN_NODE:
>> +        ccs->depth++;
>> +        node_name = fdt_get_name(ccs->fdt, ccs->offset, &node_name_len);
>> +        wa_buf_int[CC_IDX_NODE_NAME_OFFSET] = CC_VAL_DATA_OFFSET;
>> +        strcpy(wa_buf + wa_buf_int[CC_IDX_NODE_NAME_OFFSET], node_name);
>> +        rc = CC_RET_NEXT_CHILD;
>> +        break;
>> +    case FDT_END_NODE:
>> +        ccs->depth--;
>> +        if (ccs->depth == 0) {
>> +            /* reached the end of top-level node, declare success */
>> +            ccs->state = CC_STATE_PENDING;
>> +            rc = CC_RET_SUCCESS;
>> +        } else {
>> +            rc = CC_RET_PREV_PARENT;
>> +        }
>> +        break;
>> +    case FDT_PROP:
>> +        prop = fdt_get_property_by_offset(ccs->fdt, ccs->offset, &prop_len);
>> +        prop_name = fdt_string(ccs->fdt, fdt32_to_cpu(prop->nameoff));
>> +        wa_buf_int[CC_IDX_PROP_NAME_OFFSET] = CC_VAL_DATA_OFFSET;
>> +        wa_buf_int[CC_IDX_PROP_LEN] = prop_len;
>> +        wa_buf_int[CC_IDX_PROP_DATA_OFFSET] =
>> +            CC_VAL_DATA_OFFSET + strlen(prop_name) + 1;
>> +        strcpy(wa_buf + wa_buf_int[CC_IDX_PROP_NAME_OFFSET], prop_name);
>> +        memcpy(wa_buf + wa_buf_int[CC_IDX_PROP_DATA_OFFSET],
>> +               prop->data, prop_len);
>> +        rc = CC_RET_NEXT_PROPERTY;
>> +        break;
>> +    case FDT_END:
>> +        rc = CC_RET_ERROR;
>> +        break;
>> +    default:
>> +        ccs->offset = next_offset;
>> +        goto retry;
> 
> Could be a while(1) loop...
> 
> 
>> +    }
>> +
>> +    ccs->offset = next_offset;
>> +
>> +error_exit:
>> +    cpu_physical_memory_unmap(wa_buf, 0x1024, 1, 0x1024);
> 
> 
> 0x1024 is weird constant, are you sure about that?
> 
> 
>> +    rtas_st(rets, 0, rc);
>> +}
>> +
>>  static int pci_spapr_swizzle(int slot, int pin)
>>  {
>>      return (slot + pin) % PCI_NUM_PINS;
>> @@ -1276,6 +1385,8 @@ void spapr_pci_rtas_init(void)
>>                          rtas_get_power_level);
>>      spapr_rtas_register(RTAS_GET_SENSOR_STATE, "get-sensor-state",
>>                          rtas_get_sensor_state);
>> +    spapr_rtas_register(RTAS_IBM_CONFIGURE_CONNECTOR, "ibm,configure-connector",
>> +                        rtas_ibm_configure_connector);
>>  }
>>  
>>  static void spapr_pci_register_types(void)
>>
> 
> 


* Re: [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations
  2014-09-04 16:34           ` Michael Roth
@ 2014-09-05  3:10             ` Nathan Fontenot
  2014-09-05 17:17               ` [Qemu-devel] [Qemu-ppc] " Tyrel Datwyler
  0 siblings, 1 reply; 69+ messages in thread
From: Nathan Fontenot @ 2014-09-05  3:10 UTC (permalink / raw)
  To: Michael Roth, Bharata B Rao
  Cc: aik, qemu-devel, Alexander Graf, ncmike, qemu-ppc, tyreld

On 09/04/2014 11:34 AM, Michael Roth wrote:
> Quoting Michael Roth (2014-09-04 11:12:15)
>> Quoting Bharata B Rao (2014-09-04 10:08:20)
>>> On Thu, Sep 4, 2014 at 4:33 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
>>>>>> +static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev)
>>>>>> +{
>>>>>> +    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
>>>>>> +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
>>>>>> +    sPAPRConfigureConnectorState *ccs;
>>>>>> +    int slot = PCI_SLOT(dev->devfn);
>>>>>> +    int offset, ret;
>>>>>> +    void *fdt_orig, *fdt;
>>>>>> +    char nodename[512];
>>>>>> +    uint32_t encoded = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_PRESENT,
>>>>>> +                                        INDICATOR_ENTITY_SENSE_MASK,
>>>>>> +                                        INDICATOR_ENTITY_SENSE_SHIFT);
>>>>>> +
>>>>>
>>>>> I am building on this infrastructure of yours and adding CPU hotplug
>>>>> support to sPAPR guests.
>>>>>
>>>>> So we start with dr state of UNUSABLE and change it to PRESENT like
>>>>> above when performing hotplug operation. But after this, in case of
>>>>> CPU hotplug, the CPU hotplug code in the kernel
>>>>> (arch/powerpc/platforms/pseries/dlpar.c:dlpar_acquire_drc()) does
>>>>> get-sensor-state rtas call and expects the dr state to be UNUSABLE. Is
>>>>> the guest kernel right in expecting dr state to be in UNUSABLE state
>>>>> like this ? I have in fact disabled this check in the guest kernel to
>>>>> get a CPU hotplugged successfully, but wanted to understand the state
>>>>> changes and the expectations from the guest kernel correctly.
>>>>
>>>> According to PAPR+ 2.7 13.5.3.3,
>>>>
>>>>   PRESENT (1):
>>>>
>>>>   Returned for logical and physical DR entities when the DR connector is
>>>>   allocated to the OS and the DR entity is present. For physical DR entities,
>>>>   this indicates that the DR connector actually has a DR entity plugged into
>>>>   it. For DR connectors of physical DR entities, the DR connector must be
>>>>   allocated to the OS to return this value, otherwise a status of -3, no such
>>>>   sensor implemented, will be returned from the get-sensor-state RTAS call. For
>>>>   DR connectors of logical DR entities, the DR connector must be allocated to
>>>>   the OS to return this value, otherwise a sensor value of 2 or 3 will be
>>>>   returned.
>>>>
>>>>   UNUSABLE (2):
>>>>
>>>>   Returned for logical DR entities when the DR entity is not currently
>>>>   available to the OS, but may possibly be made available to the OS by calling
>>>>   set-indicator with the allocation-state indicator, setting that indicator to
>>>>   usable.
>>>>
>>>> So it seems 'PRESENT' is in fact the right value immediately after PCI
>>>> hotplug, but it doesn't seem clear from the documentation whether 'PRESENT'
>>>> or 'UNUSABLE' is more correct immediately after CPU hotplug. What does
>>>> seem clear is that UNUSABLE does not have any use in the case of PCI
>>>> devices: just PRESENT/EMPTY(0).
>>>>
>>>> But we actually 0-initialize the sensor field for DRCEntrys corresponding
>>>> to PCI devices, which corresponds to 'EMPTY' (0), so the handling seems
>>>> correct for PCI devices...
>>>
>>> Thanks Michael for the above information on PRESENT and USABLE states.
>>>
>>>>
>>>> And since we already initialize PHB sensors to UNUSABLE in the top-level
>>>> DRC list, I'm not sure why adjacent CPU entries would be affected by what
>>>> we do later for PCI devices?
>>>
>>> Sorry if I wasn't clear enough in my previous mail. CPU hotplug isn't
>>> affected by what you do for PCI devices, but...
>>>
>>>> It seems like you'd just need to do the
>>>> equivalent of what we do for PHBs during realize:
>>>
>>> when I try to do the same state changes for CPU hotplug, things don't
>>> work as expected.
>>>
>>>>
>>>>   spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
>>>>
>>>> So I'm not sure where the need for guest kernel changes is coming from for
>>>> CPU hotplug.
>>>
>>> When the resource is hotplugged, you change the state from UNUSABLE to
>>> PRESENT in QEMU before signalling the guest (via check exception irq).
>>> But the same state change in CPU hotplug case isn't as per guest
>>> kernel's expectation.
>>>
>>>> Do you have a snippet of what the initialize/hot_add hooks
>>>> look like in your case?
>>>
>>> I am talking about this piece of code in the kernel in
>>> arch/powerpc/platforms/pseries/dlpar.c
>>>
>>> int dlpar_acquire_drc(u32 drc_index)
>>> {
>>>         int dr_status, rc;
>>>
>>>         rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status,
>>>                        DR_ENTITY_SENSE, drc_index);
>>>         if (rc || dr_status != DR_ENTITY_UNUSABLE)
>>>           return -1;
>>>        ...
>>> }
>>>
>>> I have circumvented this problem by not setting the state to PRESENT
>>> in my current hotplug patch. You can refer to the same in
>>> spapr_cpu_hotplug_add() routine that's part of my patch 14/15
>>> (https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg00757.html)
>>
>> Yah, that's what I was getting at: at least just to get things working
>> for testing, just avoid the PRESENT bits in your hot_add_cpu hook rather
>> than patching the guest. Unfortunately the documentation isn't particularly
>> clear about which of these approaches is more correct as far as CPUs go. But
>> looking at it again:
>>
>>    UNUSABLE (2):
>>
>>    Returned for logical DR entities when the DR entity is not currently
>>    available to the OS, but may possibly be made available to the OS by calling
>>    set-indicator with the allocation-state indicator, setting that indicator to
>>    usable.
>>
>> That 'usable' indicator setting is documented for set-indicator as (1), which
>> happens to correspond to PRESENT (1). So my read would be that for 'physical'
>> hotplug (like PCI), the firmware changes the indicator state to PRESENT/USABLE
>> immediately after hotplug, whereas with 'logical' hotplug (like PHB/CPU), the
>> guest OS signals this transition to USABLE through set-indicator calls for the
>> 9003 sensor/allocation state after hotplug (which also seems to match up with
>> the kernel code).
>>
>> This seems to correspond with the dlpar_acquire_drc() function, but I'm a
>> little confused why that's not also called in the PHB path...I think maybe
>> in that case it's handled by drmgr in userspace... will take another look
>> to confirm.
> 
> Yah, here's the code from drmgr, same expectations:

Yes, the guest expects the state to be UNUSABLE.

As mentioned above, I don't think we should be changing the state to PRESENT
before notifying the guest.

-Nathan

 
> 
> int     
> acquire_drc(uint32_t drc_index)
> {
>     int rc;
> 
>     say(DEBUG, "Acquiring drc index 0x%x\n", drc_index);
> 
>     rc = dr_entity_sense(drc_index);
>     if (rc != STATE_UNUSABLE) {
>         say(ERROR, "Entity sense failed for drc %x with %d\n%s\n",
>             drc_index, rc, entity_sense_error(rc));
>         return -1;
>     }
> 
>     say(DEBUG, "Setting allocation state to 'alloc usable'\n");
>     rc = rtas_set_indicator(ALLOCATION_STATE, drc_index, ALLOC_USABLE);
>     if (rc) {
>         say(ERROR, "Allocation failed for drc %x with %d\n%s\n",
>             drc_index, rc, set_indicator_error(rc));
>         return -1;
>     }
> 
>     say(DEBUG, "Setting indicator state to 'unisolate'\n");
>     rc = rtas_set_indicator(ISOLATION_STATE, drc_index, UNISOLATE);
>     if (rc) {
>         int ret;
>         rc = -1;
> 
>         say(ERROR, "Unisolate failed for drc %x with %d\n%s\n",
>             drc_index, rc, set_indicator_error(rc));
>         ret = rtas_set_indicator(ALLOCATION_STATE, drc_index,
>                      ALLOC_UNUSABLE);
>         if (ret) {
>             say(ERROR, "Failed recovery to unusable state after "
>                 "unisolate failure for drc %x with %d\n%s\n",
>                 drc_index, ret, set_indicator_error(ret));
>         }
>     }
> 
>     return rc;
> }
> 
>>
>>>
>>> Regards,
>>> Bharata.


* Re: [Qemu-devel] [Qemu-ppc] [PATCH 09/12] spapr_pci: enable basic hotplug operations
  2014-09-05  3:10             ` Nathan Fontenot
@ 2014-09-05 17:17               ` Tyrel Datwyler
  0 siblings, 0 replies; 69+ messages in thread
From: Tyrel Datwyler @ 2014-09-05 17:17 UTC (permalink / raw)
  To: Nathan Fontenot, Michael Roth, Bharata B Rao
  Cc: ncmike, qemu-ppc, qemu-devel, tyreld

On 09/04/2014 08:10 PM, Nathan Fontenot wrote:
> On 09/04/2014 11:34 AM, Michael Roth wrote:
>> Quoting Michael Roth (2014-09-04 11:12:15)
>>> Quoting Bharata B Rao (2014-09-04 10:08:20)
>>>> On Thu, Sep 4, 2014 at 4:33 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
>>>>>>> +static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev)
>>>>>>> +{
>>>>>>> +    sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev);
>>>>>>> +    sPAPRDrcEntry *drc_entry, *drc_entry_slot;
>>>>>>> +    sPAPRConfigureConnectorState *ccs;
>>>>>>> +    int slot = PCI_SLOT(dev->devfn);
>>>>>>> +    int offset, ret;
>>>>>>> +    void *fdt_orig, *fdt;
>>>>>>> +    char nodename[512];
>>>>>>> +    uint32_t encoded = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_PRESENT,
>>>>>>> +                                        INDICATOR_ENTITY_SENSE_MASK,
>>>>>>> +                                        INDICATOR_ENTITY_SENSE_SHIFT);
>>>>>>> +
>>>>>>
>>>>>> I am building on this infrastructure of yours and adding CPU hotplug
>>>>>> support to sPAPR guests.
>>>>>>
>>>>>> So we start with dr state of UNUSABLE and change it to PRESENT like
>>>>>> above when performing hotplug operation. But after this, in case of
>>>>>> CPU hotplug, the CPU hotplug code in the kernel
>>>>>> (arch/powerpc/platforms/pseries/dlpar.c:dlpar_acquire_drc()) does
>>>>>> get-sensor-state rtas call and expects the dr state to be UNUSABLE. Is
>>>>>> the guest kernel right in expecting dr state to be in UNUSABLE state
>>>>>> like this ? I have in fact disabled this check in the guest kernel to
>>>>>> get a CPU hotplugged successfully, but wanted to understand the state
>>>>>> changes and the expectations from the guest kernel correctly.
>>>>>
>>>>> According to PAPR+ 2.7 13.5.3.3,
>>>>>
>>>>>   PRESENT (1):
>>>>>
>>>>>   Returned for logical and physical DR entities when the DR connector is
>>>>>   allocated to the OS and the DR entity is present. For physical DR entities,
>>>>>   this indicates that the DR connector actually has a DR entity plugged into
>>>>>   it. For DR connectors of physical DR entities, the DR connector must be
>>>>>   allocated to the OS to return this value, otherwise a status of -3, no such
>>>>>   sensor implemented, will be returned from the get-sensor-state RTAS call. For
>>>>>   DR connectors of logical DR entities, the DR connector must be allocated to
>>>>>   the OS to return this value, otherwise a sensor value of 2 or 3 will be
>>>>>   returned.
>>>>>
>>>>>   UNUSABLE (2):
>>>>>
>>>>>   Returned for logical DR entities when the DR entity is not currently
>>>>>   available to the OS, but may possibly be made available to the OS by calling
>>>>>   set-indicator with the allocation-state indicator, setting that indicator to
>>>>>   usable.
>>>>>
>>>>> So it seems 'PRESENT' is in fact the right value immediately after PCI
>>>>> hotplug, but it doesn't seem clear from the documentation whether 'PRESENT'
>>>>> or 'UNUSABLE' is more correct immediately after CPU hotplug. What does
>>>>> seem clear is that UNUSABLE does not have any use in the case of PCI
>>>>> devices: just PRESENT/EMPTY(0).
>>>>>
>>>>> But we actually 0-initialize the sensor field for DRCEntrys corresponding
>>>>> to PCI devices, which corresponds to 'EMPTY' (0), so the handling seems
>>>>> correct for PCI devices...
>>>>
>>>> Thanks Michael for the above information on PRESENT and USABLE states.
>>>>
>>>>>
>>>>> And since we already initialize PHB sensors to UNUSABLE in the top-level
>>>>> DRC list, I'm not sure why adjacent CPU entries would be affected by what
>>>>> we do later for PCI devices?
>>>>
>>>> Sorry if I wasn't clear enough in my previous mail. CPU hotplug isn't
>>>> affected by what you do for PCI devices, but...
>>>>
>>>>> It seems like you'd just need to do the
>>>>> equivalent of what we do for PHBs during realize:
>>>>
>>>> when I try to do the same state changes for CPU hotplug, things don't
>>>> work as expected.
>>>>
>>>>>
>>>>>   spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
>>>>>
>>>>> So I'm not sure where the need for guest kernel changes is coming from for
>>>>> CPU hotplug.
>>>>
>>>> When the resource is hotplugged, you change the state from UNUSABLE to
>>>> PRESENT in QEMU before signalling the guest (via check exception irq).
>>>> But the same state change in CPU hotplug case isn't as per guest
>>>> kernel's expectation.
>>>>
>>>>> Do you have a snippet of what the initialize/hot_add hooks
>>>>> look like in your case?
>>>>
>>>> I am talking about this piece of code in the kernel in
>>>> arch/powerpc/platforms/pseries/dlpar.c
>>>>
>>>> int dlpar_acquire_drc(u32 drc_index)
>>>> {
>>>>         int dr_status, rc;
>>>>
>>>>         rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status,
>>>>                        DR_ENTITY_SENSE, drc_index);
>>>>         if (rc || dr_status != DR_ENTITY_UNUSABLE)
>>>>           return -1;
>>>>        ...
>>>> }
>>>>
>>>> I have circumvented this problem by not setting the state to PRESENT
>>>> in my current hotplug patch. You can refer to the same in
>>>> spapr_cpu_hotplug_add() routine that's part of my patch 14/15
>>>> (https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg00757.html)
>>>
>>> Yah, that's what I was getting at: at least just to get things working
>>> for testing, just avoid the PRESENT bits in your hot_add_cpu hook rather
>>> than patching the guest. Unfortunately the documentation isn't particularly
>>> clear about which of these approaches is more correct as far as CPUs go. But
>>> looking at it again:
>>>
>>>    UNUSABLE (2):
>>>
>>>    Returned for logical DR entities when the DR entity is not currently
>>>    available to the OS, but may possibly be made available to the OS by calling
>>>    set-indicator with the allocation-state indicator, setting that indicator to
>>>    usable.
>>>
>>> That 'usable' indicator setting is documented for set-indicator as (1), which
>>> happens to correspond to PRESENT (1). So my read would be that for 'physical'
>>> hotplug (like PCI), the firmware changes the indicator state to PRESENT/USABLE
>>> immediately after hotplug, whereas with 'logical' hotplug (like PHB/CPU), the
>>> guest OS signals this transition to USABLE through set-indicator calls for the
>>> 9003 sensor/allocation state after hotplug (which also seems to match up with
>>> the kernel code).
>>>
>>> This seems to correspond with the dlpar_acquire_drc() function, but I'm a
>>> little confused why that's not also called in the PHB path...I think maybe
>>> in that case it's handled by drmgr in userspace... will take another look
>>> to confirm.
>>
>> Yah, here's the code from drmgr, same expectations:
> 
> Yes, the guest expects the state to be UNUSABLE.
> 
> As mentioned above, I don't think we should be changing the state to PRESENT
> before notifying the guest.
> 
> -Nathan
> 
>  

There is a subtle difference in the state transitions for logical and
physical entities. In the PCI case, which is a physical entity, the state
needs to start as PRESENT to indicate a device is present in the slot.
Physical entities are always considered resources that are owned by the
OS. For CPUs, which are considered logical entities, the state starts in
UNUSABLE, waiting for a set-indicator call to change the allocation to
usable. Logical entities are owned by the platform and reserved until
they are handed over to the OS through an allocation attempt made by the
guest's set-indicator call. If changing the allocation state to usable
succeeds, the entity's sense state should now be PRESENT.

The state transition diagram in PAPR section 13.4 should make this a
little clearer.
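
To make the two starting points concrete, here is a minimal sketch of the
transitions described above. It is not taken from QEMU or the kernel: the
struct, enum and function names are made up for illustration, EMPTY (0) is
an assumed value, and only PRESENT (1), UNUSABLE (2) and the 'usable' (1)
allocation value come from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Entity-sense values. PRESENT (1) and UNUSABLE (2) are the values quoted
 * in this thread; EMPTY (0) is an assumption added for completeness. */
enum entity_sense {
    SENSE_EMPTY    = 0,
    SENSE_PRESENT  = 1,
    SENSE_UNUSABLE = 2,
};

/* Hypothetical classification of a DR connector. */
enum drc_kind {
    DRC_PHYSICAL,   /* e.g. a PCI slot */
    DRC_LOGICAL,    /* e.g. a CPU or PHB */
};

/* Hypothetical, stripped-down connector state. */
struct drc {
    enum drc_kind kind;
    enum entity_sense sense;
};

/* Platform side: a resource was hotplugged into the connector.
 * Physical entities start as PRESENT, logical ones as UNUSABLE. */
static void drc_hotplug(struct drc *d, enum drc_kind kind)
{
    assert(d != NULL);
    d->kind = kind;
    d->sense = (kind == DRC_PHYSICAL) ? SENSE_PRESENT : SENSE_UNUSABLE;
}

/* Guest side: set-indicator(allocation-state, usable).
 * Only a logical entity that is currently UNUSABLE can be acquired;
 * on success its sense state becomes PRESENT. Returns 0 on success,
 * -1 otherwise. */
static int drc_set_allocation_usable(struct drc *d)
{
    assert(d != NULL);
    if (d->kind != DRC_LOGICAL || d->sense != SENSE_UNUSABLE) {
        return -1;
    }
    d->sense = SENSE_PRESENT;
    return 0;
}
```

With this model a hotplugged CPU reads UNUSABLE until the guest allocates
it, which is the precondition both dlpar_acquire_drc() and drmgr's
acquire_drc() check for, while a hotplugged PCI device reads PRESENT right
away.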

-Tyrel

>>
>> int     
>> acquire_drc(uint32_t drc_index)
>> {
>>     int rc;
>>
>>     say(DEBUG, "Acquiring drc index 0x%x\n", drc_index);
>>
>>     rc = dr_entity_sense(drc_index);
>>     if (rc != STATE_UNUSABLE) {
>>         say(ERROR, "Entity sense failed for drc %x with %d\n%s\n",
>>             drc_index, rc, entity_sense_error(rc));
>>         return -1;
>>     }
>>
>>     say(DEBUG, "Setting allocation state to 'alloc usable'\n");
>>     rc = rtas_set_indicator(ALLOCATION_STATE, drc_index, ALLOC_USABLE);
>>     if (rc) {
>>         say(ERROR, "Allocation failed for drc %x with %d\n%s\n",
>>             drc_index, rc, set_indicator_error(rc));
>>         return -1;
>>     }
>>
>>     say(DEBUG, "Setting indicator state to 'unisolate'\n");
>>     rc = rtas_set_indicator(ISOLATION_STATE, drc_index, UNISOLATE);
>>     if (rc) {
>>         int ret;
>>         rc = -1;
>>
>>         say(ERROR, "Unisolate failed for drc %x with %d\n%s\n",
>>             drc_index, rc, set_indicator_error(rc));
>>         ret = rtas_set_indicator(ALLOCATION_STATE, drc_index,
>>                      ALLOC_UNUSABLE);
>>         if (ret) {
>>             say(ERROR, "Failed recovery to unusable state after "
>>                 "unisolate failure for drc %x with %d\n%s\n",
>>                 drc_index, ret, set_indicator_error(ret));
>>         }
>>     }
>>
>>     return rc;
>> }
>>
>>>
>>>>
>>>> Regards,
>>>> Bharata.
> 
> 


* Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node
  2014-08-19  0:21 ` [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node Michael Roth
                     ` (2 preceding siblings ...)
  2014-09-03  5:55   ` Bharata B Rao
@ 2014-09-05 22:00   ` Tyrel Datwyler
  3 siblings, 0 replies; 69+ messages in thread
From: Tyrel Datwyler @ 2014-09-05 22:00 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: aik, ncmike, qemu-ppc, agraf, nfont

On 08/18/2014 05:21 PM, Michael Roth wrote:
> From: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> 
> This adds entries to the root OF node to advertise our PHBs as being
> DR-capable in accordance with the PAPR specification.
> 
> Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
> and associated with a power domain of -1 (indicating to guests that
> power management is handled automatically by hardware).
> 
> We currently allocate entries for up to 32 DR-capable PHBs, though
> this limit can be increased later.
> 
> DrcEntry objects to track the state of the DR-connector associated
> with each PHB are stored in a 32-entry array, and each DrcEntry
> in turn has a dynamically-sized number of child DR-connectors,
> which we will use later to track the state of DR-connectors
> associated with a PHB's physical slots.
> 
> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c         | 143 +++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_pci.c     |   1 +
>  include/hw/ppc/spapr.h |  35 ++++++++++++
>  3 files changed, 179 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 5c92707..d5e46c3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
>      return ram_size;
>  }
> 
> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
> +{
> +    int i;
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        if (spapr->drc_table[i].phb_buid == buid) {
> +            return &spapr->drc_table[i];
> +        }
> +     }
> +
> +     return NULL;
> +}
> +
> +static void spapr_init_drc_table(void)
> +{
> +    int i;
> +
> +    memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
> +
> +    /* For now we only care about PHB entries */
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        spapr->drc_table[i].drc_index = 0x2000001 + i;
> +    }
> +}

Magic number can be changed to SPAPR_DRC_PHB_ID_BASE. See below.

> +
> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
> +{
> +    sPAPRDrcEntry *empty_drc = NULL;
> +    sPAPRDrcEntry *found_drc = NULL;
> +    int i, phb_index;
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        if (spapr->drc_table[i].phb_buid == 0) {
> +            empty_drc = &spapr->drc_table[i];
> +        }
> +
> +        if (spapr->drc_table[i].phb_buid == buid) {
> +            found_drc = &spapr->drc_table[i];
> +            break;
> +        }
> +    }
> +
> +    if (found_drc) {
> +        return found_drc;
> +    }
> +
> +    if (empty_drc) {
> +        empty_drc->phb_buid = buid;
> +        empty_drc->state = state;
> +        empty_drc->cc_state.fdt = NULL;
> +        empty_drc->cc_state.offset = 0;
> +        empty_drc->cc_state.depth = 0;
> +        empty_drc->cc_state.state = CC_STATE_IDLE;
> +        empty_drc->child_entries =
> +            g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
> +        phb_index = buid - SPAPR_PCI_BASE_BUID;
> +        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +            empty_drc->child_entries[i].drc_index =
> +                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
> +        }
> +        return empty_drc;
> +    }
> +
> +    return NULL;
> +}
> +
> +static void spapr_create_drc_dt_entries(void *fdt)
> +{
> +    char char_buf[1024];
> +    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
> +    uint32_t *entries;
> +    int offset, fdt_offset;
> +    int i, ret;
> +
> +    fdt_offset = fdt_path_offset(fdt, "/");
> +
> +    /* ibm,drc-indexes */
> +    memset(int_buf, 0, sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> +
> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> +        int_buf[i] = spapr->drc_table[i-1].drc_index;
> +    }
> +
> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
> +    }
> +
> +    /* ibm,drc-power-domains */
> +    memset(int_buf, 0, sizeof(int_buf));
> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> +
> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> +        int_buf[i] = 0xffffffff;
> +    }
> +
> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
> +                      sizeof(int_buf));
> +    if (ret) {
> +        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains property\n");
> +    }
> +
> +    /* ibm,drc-names */
> +    memset(char_buf, 0, sizeof(char_buf));
> +    entries = (uint32_t *)&char_buf[0];
> +    *entries = SPAPR_DRC_TABLE_SIZE;
> +    offset = sizeof(*entries);
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
> +        char_buf[offset++] = '\0';
> +    }
> +
> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
> +    if (ret) {
> +        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
> +    }
> +
> +    /* ibm,drc-types */
> +    memset(char_buf, 0, sizeof(char_buf));
> +    entries = (uint32_t *)&char_buf[0];
> +    *entries = SPAPR_DRC_TABLE_SIZE;
> +    offset = sizeof(*entries);
> +
> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +        offset += sprintf(char_buf + offset, "PHB");
> +        char_buf[offset++] = '\0';
> +    }
> +
> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, offset);
> +    if (ret) {
> +        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
> +    }
> +}
> +
>  #define _FDT(exp) \
>      do { \
>          int ret = (exp);                                           \
> @@ -731,6 +868,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>      char *bootlist;
>      void *fdt;
>      sPAPRPHBState *phb;
> +    sPAPRDrcEntry *drc_entry;
> 
>      fdt = g_malloc(FDT_MAX_SIZE);
> 
> @@ -750,6 +888,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>      }
> 
>      QLIST_FOREACH(phb, &spapr->phbs, list) {
> +        drc_entry = spapr_phb_to_drc_entry(phb->buid);
> +        g_assert(drc_entry);
>          ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
>      }
> 
> @@ -789,6 +929,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
>      }
> 
> +    spapr_create_drc_dt_entries(fdt);
> +
>      _FDT((fdt_pack(fdt)));
> 
>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> @@ -1443,6 +1585,7 @@ static void ppc_spapr_init(MachineState *machine)
>      spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW);
>      spapr_pci_rtas_init();
> 
> +    spapr_init_drc_table();
>      phb = spapr_create_phb(spapr, 0);
> 
>      for (i = 0; i < nb_nics; i++) {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 9ed39a9..e85134f 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -531,6 +531,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>              + sphb->index * SPAPR_PCI_WINDOW_SPACING;
>          sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
>          sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
> +        spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */);
>      }
> 
>      if (sphb->buid == -1) {
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 36e8e51..c93794b 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -10,6 +10,36 @@ struct sPAPRNVRAM;
> 
>  #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
> 
> +/* For dlparable/hotpluggable slots */
> +#define SPAPR_DRC_TABLE_SIZE    32
> +#define SPAPR_DRC_PHB_SLOT_MAX  32
> +#define SPAPR_DRC_DEV_ID_BASE   0x40000000

Change this to SPAPR_DRC_PCI_ID_BASE.

Add

#define SPAPR_DRC_PHB_ID_BASE  0x20000000
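
As a rough sketch of how the two bases could be used together (the helper
names are hypothetical, not from the patch): the PCI formula is the one
from spapr_add_phb_to_drc_table(), and the PHB helper assumes a plain
base + i. Note the patch currently uses the literal 0x2000001 + i, so
whether an offset should be kept on top of the base is left open here.

```c
#include <assert.h>
#include <stdint.h>

#define SPAPR_DRC_PHB_SLOT_MAX  32
#define SPAPR_DRC_PHB_ID_BASE   0x20000000u  /* proposed in this review */
#define SPAPR_DRC_PCI_ID_BASE   0x40000000u  /* proposed rename of
                                              * SPAPR_DRC_DEV_ID_BASE */

/* Hypothetical helper: DRC index for entry i of the PHB table.
 * Assumes base + i with no additional offset. */
static uint32_t drc_phb_index(uint32_t i)
{
    return SPAPR_DRC_PHB_ID_BASE + i;
}

/* Hypothetical helper: DRC index for a PCI slot below a PHB, using the
 * same arithmetic as the child_entries loop in the patch. */
static uint32_t drc_pci_index(uint32_t phb_index, uint32_t slot)
{
    assert(slot < SPAPR_DRC_PHB_SLOT_MAX);
    return SPAPR_DRC_PCI_ID_BASE + (phb_index << 8) + (slot << 3);
}
```

For phb_index 0 this gives PCI indexes 0x40000000, 0x40000008, ... up to
0x400000f8 for slot 31, and the second PHB starts at 0x40000100.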

-Tyrel

> +
> +typedef struct sPAPRConfigureConnectorState {
> +    void *fdt;
> +    int offset_start;
> +    int offset;
> +    int depth;
> +    PCIDevice *dev;
> +    enum {
> +        CC_STATE_IDLE = 0,
> +        CC_STATE_PENDING = 1,
> +        CC_STATE_ACTIVE,
> +    } state;
> +} sPAPRConfigureConnectorState;
> +
> +typedef struct sPAPRDrcEntry sPAPRDrcEntry;
> +
> +struct sPAPRDrcEntry {
> +    uint32_t drc_index;
> +    uint64_t phb_buid;
> +    void *fdt;
> +    int fdt_offset;
> +    uint32_t state;
> +    sPAPRConfigureConnectorState cc_state;
> +    sPAPRDrcEntry *child_entries;
> +};
> +
>  typedef struct sPAPREnvironment {
>      struct VIOsPAPRBus *vio_bus;
>      QLIST_HEAD(, sPAPRPHBState) phbs;
> @@ -39,6 +69,9 @@ typedef struct sPAPREnvironment {
>      int htab_save_index;
>      bool htab_first_pass;
>      int htab_fd;
> +
> +    /* state for Dynamic Reconfiguration Connectors */
> +    sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
>  } sPAPREnvironment;
> 
>  #define H_SUCCESS         0
> @@ -481,5 +514,7 @@ int spapr_dma_dt(void *fdt, int node_off, const char *propname,
>                   uint32_t liobn, uint64_t window, uint32_t size);
>  int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
>                        sPAPRTCETable *tcet);
> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid);
> 
>  #endif /* !defined (__HW_SPAPR_H__) */
> 


* Re: [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface
  2014-08-26 11:36   ` Alexander Graf
  2014-09-05  2:55     ` Nathan Fontenot
@ 2014-09-30 22:08     ` Michael Roth
  2014-10-01 14:30       ` Alexander Graf
  1 sibling, 1 reply; 69+ messages in thread
From: Michael Roth @ 2014-09-30 22:08 UTC (permalink / raw)
  To: Alexander Graf, qemu-devel; +Cc: aik, ncmike, qemu-ppc, tyreld, nfont

Quoting Alexander Graf (2014-08-26 06:36:57)
> On 19.08.14 02:21, Michael Roth wrote:
> > From: Mike Day <ncmike@ncultra.org>
> > 
> > Signed-off-by: Mike Day <ncmike@ncultra.org>
> > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr_pci.c     | 119 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  include/hw/ppc/spapr.h |   3 ++
> >  2 files changed, 122 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 924d488..23a3477 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -36,6 +36,16 @@
> >  
> >  #include "hw/pci/pci_bus.h"
> >  
> > +/* #define DEBUG_SPAPR */
> > +
> > +#ifdef DEBUG_SPAPR
> > +#define DPRINTF(fmt, ...) \
> > +    do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
> > +#else
> > +#define DPRINTF(fmt, ...) \
> > +    do { } while (0)
> > +#endif
> > +
> >  /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
> >  #define RTAS_QUERY_FN           0
> >  #define RTAS_CHANGE_FN          1
> > @@ -47,6 +57,31 @@
> >  #define RTAS_TYPE_MSI           1
> >  #define RTAS_TYPE_MSIX          2
> >  
> > +/* For set-indicator RTAS interface */
> > +#define INDICATOR_ISOLATION_MASK            0x0001   /* 9001 one bit */
> > +#define INDICATOR_GLOBAL_INTERRUPT_MASK     0x0002   /* 9005 one bit */
> > +#define INDICATOR_ERROR_LOG_MASK            0x0004   /* 9006 one bit */
> > +#define INDICATOR_IDENTIFY_MASK             0x0008   /* 9007 one bit */
> > +#define INDICATOR_RESET_MASK                0x0010   /* 9009 one bit */
> > +#define INDICATOR_DR_MASK                   0x00e0   /* 9002 three bits */
> > +#define INDICATOR_ALLOCATION_MASK           0x0300   /* 9003 two bits */
> > +#define INDICATOR_EPOW_MASK                 0x1c00   /* 9 three bits */
> > +
> > +#define INDICATOR_ISOLATION_SHIFT           0x00     /* bit 0 */
> > +#define INDICATOR_GLOBAL_INTERRUPT_SHIFT    0x01     /* bit 1 */
> > +#define INDICATOR_ERROR_LOG_SHIFT           0x02     /* bit 2 */
> > +#define INDICATOR_IDENTIFY_SHIFT            0x03     /* bit 3 */
> > +#define INDICATOR_RESET_SHIFT               0x04     /* bit 4 */
> > +#define INDICATOR_DR_SHIFT                  0x05     /* bits 5-7 */
> > +#define INDICATOR_ALLOCATION_SHIFT          0x08     /* bits 8-9 */
> > +#define INDICATOR_EPOW_SHIFT                0x0a     /* bits 10-12 */
> > +
> > +#define DECODE_DRC_STATE(state, m, s)                  \
> > +    ((((uint32_t)(state) & (uint32_t)(m))) >> (s))
> > +
> > +#define ENCODE_DRC_STATE(val, m, s) \
> > +    (((uint32_t)(val) << (s)) & (uint32_t)(m))
> > +
> >  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
> >  {
> >      sPAPRPHBState *sphb;
> > @@ -402,6 +437,80 @@ static void rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
> >      rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */
> >  }
> >  
> > +static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
> > +                               uint32_t token, uint32_t nargs,
> > +                               target_ulong args, uint32_t nret,
> > +                               target_ulong rets)
> > +{
> > +    uint32_t indicator = rtas_ld(args, 0);
> > +    uint32_t drc_index = rtas_ld(args, 1);
> > +    uint32_t indicator_state = rtas_ld(args, 2);
> > +    uint32_t encoded = 0, shift = 0, mask = 0;
> > +    uint32_t *pind;
> > +    sPAPRDrcEntry *drc_entry = NULL;
> 
> This rtas call does not have any idea what a PHB is. Why does it live in
> spapr_pci.c?

spapr_rtas.c does seem like a better home

> 
> > +
> > +    if (drc_index == 0) { /* platform indicator */
> > +        pind = &spapr->state;
> > +    } else {
> > +        drc_entry = spapr_find_drc_entry(drc_index);
> > +        if (!drc_entry) {
> > +            DPRINTF("rtas_set_indicator: unable to find drc_entry for %x",
> > +                    drc_index);
> > +            rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
> > +            return;
> > +        }
> > +        pind = &drc_entry->state;
> > +    }
> > +
> > +    switch (indicator) {
> > +    case 9:  /* EPOW */
> > +        shift = INDICATOR_EPOW_SHIFT;
> > +        mask = INDICATOR_EPOW_MASK;
> > +        break;
> > +    case 9001: /* Isolation state */
> > +        /* encode the new value into the correct bit field */
> > +        shift = INDICATOR_ISOLATION_SHIFT;
> > +        mask = INDICATOR_ISOLATION_MASK;
> > +        break;
> > +    case 9002: /* DR */
> > +        shift = INDICATOR_DR_SHIFT;
> > +        mask = INDICATOR_DR_MASK;
> > +        break;
> > +    case 9003: /* Allocation State */
> > +        shift = INDICATOR_ALLOCATION_SHIFT;
> > +        mask = INDICATOR_ALLOCATION_MASK;
> > +        break;
> > +    case 9005: /* global interrupt */
> > +        shift = INDICATOR_GLOBAL_INTERRUPT_SHIFT;
> > +        mask = INDICATOR_GLOBAL_INTERRUPT_MASK;
> > +        break;
> > +    case 9006: /* error log */
> > +        shift = INDICATOR_ERROR_LOG_SHIFT;
> > +        mask = INDICATOR_ERROR_LOG_MASK;
> > +        break;
> > +    case 9007: /* identify */
> > +        shift = INDICATOR_IDENTIFY_SHIFT;
> > +        mask = INDICATOR_IDENTIFY_MASK;
> > +        break;
> > +    case 9009: /* reset */
> > +        shift = INDICATOR_RESET_SHIFT;
> > +        mask = INDICATOR_RESET_MASK;
> > +        break;
> > +    default:
> > +        DPRINTF("rtas_set_indicator: indicator not implemented: %d",
> > +                indicator);
> > +        rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
> > +        return;
> > +    }
> > +
> > +    encoded = ENCODE_DRC_STATE(indicator_state, mask, shift);
> > +    /* clear the current indicator value */
> > +    *pind &= ~mask;
> > +    /* set the new value */
> > +    *pind |= encoded;
> > +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> > +}
> > +
> >  static int pci_spapr_swizzle(int slot, int pin)
> >  {
> >      return (slot + pin) % PCI_NUM_PINS;
> > @@ -624,6 +733,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >          sphb->lsi_table[i].irq = irq;
> >      }
> >  
> > +    /* make sure the platform EPOW sensor is initialized - the
> > +     * guest will probe it when there is a hotplug event.
> > +     */
> > +    spapr->state &= ~(uint32_t)INDICATOR_EPOW_MASK;
> > +    spapr->state |= ENCODE_DRC_STATE(0,
> > +                                     INDICATOR_EPOW_MASK,
> > +                                     INDICATOR_EPOW_SHIFT);
> > +
> >      if (!info->finish_realize) {
> >          error_setg(errp, "finish_realize not defined");
> >          return;
> > @@ -1056,6 +1173,8 @@ void spapr_pci_rtas_init(void)
> >          spapr_rtas_register(RTAS_IBM_CHANGE_MSI, "ibm,change-msi",
> >                              rtas_ibm_change_msi);
> >      }
> > +    spapr_rtas_register(RTAS_SET_INDICATOR, "set-indicator",
> > +                        rtas_set_indicator);
> >  }
> >  
> >  static void spapr_pci_register_types(void)
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 0ac1a19..fac85f8 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -72,6 +72,9 @@ typedef struct sPAPREnvironment {
> >  
> >      /* state for Dynamic Reconfiguration Connectors */
> >      sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
> > +
> > +    /* Platform state - sensors and indicators */
> > +    uint32_t state;
> 
> Do you think it'd be possible to create a special DRC device that
> contains all of its tables and global state and also exposes sensors and
> indicators? That device could then get linked via qom links to the PHBs
> for their slots.

Sorry for the delay, I've been going back through the code with this
suggestion in mind and there does seem to be a lot of state that
can be nicely encapsulated by modeling the DR Connectors as a QOM
"device" (though I haven't gone as far as to make them actual
DeviceState's since it's more of a firmware abstraction than real
hardware)

I'm not sure what the best way to plumb things together is. As a first
run, since each DRC must have an index (drc_index) as per spec, I've
put them under /machine/DRConnector as a flat list, where top-level
PHB/CPU/MEMORY DRCs would be allocated statically during sPAPR machine
init (since the corresponding DRC indexes/types/etc are hard-coded into
the top-level of the boot-time DT anyway, though I guess we could also
allocate these on the fly...seems messier though than just plugging new
resources into existing DRCs)

PHBs in turn will associate themselves with a DRC via an attach/detach
method as part of realize (and in the future, hotplug hooks, though
that's not part of the series). The PHBs in turn will create a DRC for each
hotpluggable PCI slot.

Creation is via:

sPAPRDRConnector *spapr_dr_connector_new(sPAPRDRConnectorType type,
                                         uint32_t id);

where the code computes the drc index based on <type> (one of phb, cpu, pci,
memory, etc) and <id>, and sticks them under /machine/dr-connector/<drc_index>

Any pci/phb/cpu hotplug hooks can then fetch the DRC via type/id,
and hotplug/unplug via attach()/detach() methods. attach() adds
the attached/hotplugged DeviceState as a link property of the
DRC object, and sets the initial sensor state.

rtas calls can fetch DRCs via drc_index, and set/get sensor state
via DRC sensor get/set methods.
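
A self-contained sketch of that flow, without QOM, in case it helps to
picture it. Everything here is hypothetical: the type codes are picked
only so the resulting indexes match the 2xxxxxxx/4xxxxxxx pattern in the
listing below, a static pool stands in for the QOM tree, a void pointer
stands in for the device link, and attach() setting both states to 1 is an
assumption that mirrors the values shown for an attached slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type codes; the value ends up in the top nibble of the
 * DRC index. */
typedef enum {
    DRC_TYPE_PHB = 2,
    DRC_TYPE_PCI = 4,
} DRCType;

typedef struct DRConnector {
    uint32_t drc_index;
    DRCType type;
    void *dev;                  /* stand-in for the link to the device */
    uint32_t allocation_state;
    uint32_t indicator_state;
} DRConnector;

#define DRC_MAX 64
static DRConnector drc_pool[DRC_MAX];  /* stand-in for the QOM container */
static size_t drc_count;

/* Combine type and id into a DRC index. */
static uint32_t dr_connector_index(DRCType type, uint32_t id)
{
    return ((uint32_t)type << 28) | (id & 0x0fffffffu);
}

/* Create a connector; returns NULL when the pool is full. */
static DRConnector *dr_connector_new(DRCType type, uint32_t id)
{
    DRConnector *drc;

    if (drc_count == DRC_MAX) {
        return NULL;
    }
    drc = &drc_pool[drc_count++];
    drc->drc_index = dr_connector_index(type, id);
    drc->type = type;
    drc->dev = NULL;
    drc->allocation_state = 0;
    drc->indicator_state = 0;
    return drc;
}

/* Lookup used by RTAS calls, which only know the DRC index. */
static DRConnector *dr_connector_by_index(uint32_t drc_index)
{
    size_t i;

    for (i = 0; i < drc_count; i++) {
        if (drc_pool[i].drc_index == drc_index) {
            return &drc_pool[i];
        }
    }
    return NULL;
}

/* Lookup used by hotplug hooks, which know type and id. */
static DRConnector *dr_connector_by_id(DRCType type, uint32_t id)
{
    return dr_connector_by_index(dr_connector_index(type, id));
}

/* Record the plugged device and set the initial sensor state. */
static void dr_connector_attach(DRConnector *drc, void *dev)
{
    assert(drc != NULL && dev != NULL);
    drc->dev = dev;
    drc->allocation_state = 1;
    drc->indicator_state = 1;
}

/* Drop the device reference and clear the sensor state. */
static void dr_connector_detach(DRConnector *drc)
{
    assert(drc != NULL);
    drc->dev = NULL;
    drc->allocation_state = 0;
    drc->indicator_state = 0;
}
```

A PCI hotplug hook would call dr_connector_by_id(DRC_TYPE_PCI, id) followed
by dr_connector_attach(), while set-indicator/get-sensor-state would go
through dr_connector_by_index() with the index the guest passed in.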

Hotplug event delivery still lives outside of the DRC implementation for
now. I thought of moving it into the DRC, but decisions like whether we
should emit events during coldplug/initial boot seemed to require pushing
a lot of general machine state into DRCs and making the encapsulation
seem superficial.

Things end up looking like this (2xxxxxxx are PHBs, 4xxxxxxx are PCI slots):

mdroth@loki:~/w/qom/machine/dr-connector$ ls
20000000  40000018  40000038  40000058  40000078  40000098  400000b8  400000d8  400000f8
40000000  40000020  40000040  40000060  40000080  400000a0  400000c0  400000e0  type
40000008  40000028  40000048  40000068  40000088  400000a8  400000c8  400000e8
40000010  40000030  40000050  40000070  40000090  400000b0  400000d0  400000f0
mdroth@loki:~/w/qom/machine/dr-connector$ cd 40000000/
mdroth@loki:~/w/qom/machine/dr-connector/40000000$ ls -l
total 0
-rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 allocation-state
lrwxr-xr-x 2 mdroth mdroth 4096 Dec 31  1969 device -> ../../../machine/peripheral/hp0
-rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 drc-index
-rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 entity-sense
-rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 indicator-state
-rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 isolation-state
-rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 type
mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat allocation-state 
1
mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat indicator-state 
1
mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat ../../../machine/peripheral/hp0/type 
virtio-net-pci
mdroth@loki:~/w/qom/machine/dr-connector/40000000$

Hopefully this is sort of the approach you were thinking of?

> 
> 
> Alex


* Re: [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface
  2014-09-30 22:08     ` Michael Roth
@ 2014-10-01 14:30       ` Alexander Graf
  2014-11-26  4:51         ` Bharata B Rao
  2014-11-26  4:54         ` Bharata B Rao
  0 siblings, 2 replies; 69+ messages in thread
From: Alexander Graf @ 2014-10-01 14:30 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: aik, ncmike, qemu-ppc, tyreld, nfont



On 01.10.14 00:08, Michael Roth wrote:
> Quoting Alexander Graf (2014-08-26 06:36:57)
>> On 19.08.14 02:21, Michael Roth wrote:
>>> From: Mike Day <ncmike@ncultra.org>
>>>
>>> Signed-off-by: Mike Day <ncmike@ncultra.org>
>>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>>> ---
>>>  hw/ppc/spapr_pci.c     | 119 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>  include/hw/ppc/spapr.h |   3 ++
>>>  2 files changed, 122 insertions(+)
>>>
>>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>>> index 924d488..23a3477 100644
>>> --- a/hw/ppc/spapr_pci.c
>>> +++ b/hw/ppc/spapr_pci.c
>>> @@ -36,6 +36,16 @@
>>>  
>>>  #include "hw/pci/pci_bus.h"
>>>  
>>> +/* #define DEBUG_SPAPR */
>>> +
>>> +#ifdef DEBUG_SPAPR
>>> +#define DPRINTF(fmt, ...) \
>>> +    do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
>>> +#else
>>> +#define DPRINTF(fmt, ...) \
>>> +    do { } while (0)
>>> +#endif
>>> +
>>>  /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
>>>  #define RTAS_QUERY_FN           0
>>>  #define RTAS_CHANGE_FN          1
>>> @@ -47,6 +57,31 @@
>>>  #define RTAS_TYPE_MSI           1
>>>  #define RTAS_TYPE_MSIX          2
>>>  
>>> +/* For set-indicator RTAS interface */
>>> +#define INDICATOR_ISOLATION_MASK            0x0001   /* 9001 one bit */
>>> +#define INDICATOR_GLOBAL_INTERRUPT_MASK     0x0002   /* 9005 one bit */
>>> +#define INDICATOR_ERROR_LOG_MASK            0x0004   /* 9006 one bit */
>>> +#define INDICATOR_IDENTIFY_MASK             0x0008   /* 9007 one bit */
>>> +#define INDICATOR_RESET_MASK                0x0010   /* 9009 one bit */
>>> +#define INDICATOR_DR_MASK                   0x00e0   /* 9002 three bits */
>>> +#define INDICATOR_ALLOCATION_MASK           0x0300   /* 9003 two bits */
>>> +#define INDICATOR_EPOW_MASK                 0x1c00   /* 9 three bits */
>>> +
>>> +#define INDICATOR_ISOLATION_SHIFT           0x00     /* bit 0 */
>>> +#define INDICATOR_GLOBAL_INTERRUPT_SHIFT    0x01     /* bit 1 */
>>> +#define INDICATOR_ERROR_LOG_SHIFT           0x02     /* bit 2 */
>>> +#define INDICATOR_IDENTIFY_SHIFT            0x03     /* bit 3 */
>>> +#define INDICATOR_RESET_SHIFT               0x04     /* bit 4 */
>>> +#define INDICATOR_DR_SHIFT                  0x05     /* bits 5-7 */
>>> +#define INDICATOR_ALLOCATION_SHIFT          0x08     /* bits 8-9 */
>>> +#define INDICATOR_EPOW_SHIFT                0x0a     /* bits 10-12 */
>>> +
>>> +#define DECODE_DRC_STATE(state, m, s)                  \
>>> +    ((((uint32_t)(state) & (uint32_t)(m))) >> (s))
>>> +
>>> +#define ENCODE_DRC_STATE(val, m, s) \
>>> +    (((uint32_t)(val) << (s)) & (uint32_t)(m))
>>> +
>>>  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
>>>  {
>>>      sPAPRPHBState *sphb;
>>> @@ -402,6 +437,80 @@ static void rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
>>>      rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */
>>>  }
>>>  
>>> +static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>>> +                               uint32_t token, uint32_t nargs,
>>> +                               target_ulong args, uint32_t nret,
>>> +                               target_ulong rets)
>>> +{
>>> +    uint32_t indicator = rtas_ld(args, 0);
>>> +    uint32_t drc_index = rtas_ld(args, 1);
>>> +    uint32_t indicator_state = rtas_ld(args, 2);
>>> +    uint32_t encoded = 0, shift = 0, mask = 0;
>>> +    uint32_t *pind;
>>> +    sPAPRDrcEntry *drc_entry = NULL;
>>
>> This rtas call does not have any idea what a PHB is. Why does it live in
>> spapr_pci.c?
> 
> spapr_rtas.c does seem like a better home
> 
>>
>>> +
>>> +    if (drc_index == 0) { /* platform indicator */
>>> +        pind = &spapr->state;
>>> +    } else {
>>> +        drc_entry = spapr_find_drc_entry(drc_index);
>>> +        if (!drc_entry) {
>>> +            DPRINTF("rtas_set_indicator: unable to find drc_entry for %x",
>>> +                    drc_index);
>>> +            rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
>>> +            return;
>>> +        }
>>> +        pind = &drc_entry->state;
>>> +    }
>>> +
>>> +    switch (indicator) {
>>> +    case 9:  /* EPOW */
>>> +        shift = INDICATOR_EPOW_SHIFT;
>>> +        mask = INDICATOR_EPOW_MASK;
>>> +        break;
>>> +    case 9001: /* Isolation state */
>>> +        /* encode the new value into the correct bit field */
>>> +        shift = INDICATOR_ISOLATION_SHIFT;
>>> +        mask = INDICATOR_ISOLATION_MASK;
>>> +        break;
>>> +    case 9002: /* DR */
>>> +        shift = INDICATOR_DR_SHIFT;
>>> +        mask = INDICATOR_DR_MASK;
>>> +        break;
>>> +    case 9003: /* Allocation State */
>>> +        shift = INDICATOR_ALLOCATION_SHIFT;
>>> +        mask = INDICATOR_ALLOCATION_MASK;
>>> +        break;
>>> +    case 9005: /* global interrupt */
>>> +        shift = INDICATOR_GLOBAL_INTERRUPT_SHIFT;
>>> +        mask = INDICATOR_GLOBAL_INTERRUPT_MASK;
>>> +        break;
>>> +    case 9006: /* error log */
>>> +        shift = INDICATOR_ERROR_LOG_SHIFT;
>>> +        mask = INDICATOR_ERROR_LOG_MASK;
>>> +        break;
>>> +    case 9007: /* identify */
>>> +        shift = INDICATOR_IDENTIFY_SHIFT;
>>> +        mask = INDICATOR_IDENTIFY_MASK;
>>> +        break;
>>> +    case 9009: /* reset */
>>> +        shift = INDICATOR_RESET_SHIFT;
>>> +        mask = INDICATOR_RESET_MASK;
>>> +        break;
>>> +    default:
>>> +        DPRINTF("rtas_set_indicator: indicator not implemented: %d",
>>> +                indicator);
>>> +        rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
>>> +        return;
>>> +    }
>>> +
>>> +    encoded = ENCODE_DRC_STATE(indicator_state, mask, shift);
>>> +    /* clear the current indicator value */
>>> +    *pind &= ~mask;
>>> +    /* set the new value */
>>> +    *pind |= encoded;
>>> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
>>> +}
>>> +
>>>  static int pci_spapr_swizzle(int slot, int pin)
>>>  {
>>>      return (slot + pin) % PCI_NUM_PINS;
>>> @@ -624,6 +733,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>>          sphb->lsi_table[i].irq = irq;
>>>      }
>>>  
>>> +    /* make sure the platform EPOW sensor is initialized - the
>>> +     * guest will probe it when there is a hotplug event.
>>> +     */
>>> +    spapr->state &= ~(uint32_t)INDICATOR_EPOW_MASK;
>>> +    spapr->state |= ENCODE_DRC_STATE(0,
>>> +                                     INDICATOR_EPOW_MASK,
>>> +                                     INDICATOR_EPOW_SHIFT);
>>> +
>>>      if (!info->finish_realize) {
>>>          error_setg(errp, "finish_realize not defined");
>>>          return;
>>> @@ -1056,6 +1173,8 @@ void spapr_pci_rtas_init(void)
>>>          spapr_rtas_register(RTAS_IBM_CHANGE_MSI, "ibm,change-msi",
>>>                              rtas_ibm_change_msi);
>>>      }
>>> +    spapr_rtas_register(RTAS_SET_INDICATOR, "set-indicator",
>>> +                        rtas_set_indicator);
>>>  }
>>>  
>>>  static void spapr_pci_register_types(void)
>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>>> index 0ac1a19..fac85f8 100644
>>> --- a/include/hw/ppc/spapr.h
>>> +++ b/include/hw/ppc/spapr.h
>>> @@ -72,6 +72,9 @@ typedef struct sPAPREnvironment {
>>>  
>>>      /* state for Dynamic Reconfiguration Connectors */
>>>      sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
>>> +
>>> +    /* Platform state - sensors and indicators */
>>> +    uint32_t state;
>>
>> Do you think it'd be possible to create a special DRC device that
>> contains all of its tables and global state and also exposes sensors and
>> indicators? That device could then get linked via qom links to the PHBs
>> for their slots.
> 
> Sorry for the delay, I've been going back through the code with this
> suggestion in mind and there does seem to be a lot of state that
> can be nicely encapsulated by modeling the DR Connectors as a QOM
> "device" (though I haven't gone as far as to make them actual
> DeviceState's since it's more of a firmware abstraction than real
> hardware)
> 
> I'm not sure what the best way to plumb things together is. As a first
> pass, since each DRC must have an index (drc_index) as per spec, I've
> put them under /machine/dr-connector as a flat list, where top-level
> PHB/CPU/MEMORY DRCs would be allocated statically during sPAPR machine
> init (since the corresponding DRC indexes/types/etc are hard-coded into
> the top-level of the boot-time DT anyway, though I guess we could also
> allocate these on the fly...seems messier though than just plugging new
> resources into existing DRCs)
> 
> PHBs in turn will associate themselves with a DRC via an attach/detach
> method as part of realize (and in the future, hotplug hooks, though
> that's not part of the series). Each PHB will then create a DRC for each
> hotpluggable PCI slot.
> 
> Creation is via:
> 
> sPAPRDRConnector *spapr_dr_connector_new(sPAPRDRConnectorType type,
>                                          uint32_t id);
> 
> where the code computes the drc index based on <type> (one of phb, cpu, pci,
> memory, etc) and <id>, and places the connector under
> /machine/dr-connector/<drc_index>
> 
> Any pci/phb/cpu hotplug hooks can then fetch the DRC via type/id,
> and hotplug/unplug via attach()/detach() methods. attach() adds
> the attached/hotplugged DeviceState as a link property of the
> DRC object, and sets the initial sensor state.
> 
> rtas calls can fetch DRCs via drc_index, and set/get sensor state
> via DRC sensor get/set methods.
> 
> Hotplug event delivery still lives outside of DRC implementation for now. I
> thought of moving them into DRC, but decisions like whether we should
> emit events during coldplug/initial boot seemed to require pushing
> a lot of general machine state into DRCs and making the encapsulation
> seem superficial.
> 
> Things end up looking like this (2xxxxxxx are PHBs, 4xxxxxxx are PCI slots):
> 
> mdroth@loki:~/w/qom/machine/dr-connector$ ls
> 20000000  40000018  40000038  40000058  40000078  40000098  400000b8  400000d8  400000f8
> 40000000  40000020  40000040  40000060  40000080  400000a0  400000c0  400000e0  type
> 40000008  40000028  40000048  40000068  40000088  400000a8  400000c8  400000e8
> 40000010  40000030  40000050  40000070  40000090  400000b0  400000d0  400000f0
> mdroth@loki:~/w/qom/machine/dr-connector$ cd 40000000/
> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ ls -l
> total 0
> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 allocation-state
> lrwxr-xr-x 2 mdroth mdroth 4096 Dec 31  1969 device -> ../../../machine/peripheral/hp0
> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 drc-index
> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 entity-sense
> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 indicator-state
> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 isolation-state
> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 type
> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat allocation-state 
> 1
> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat indicator-state 
> 1
> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat ../../../machine/peripheral/hp0/type 
> virtio-net-pci
> mdroth@loki:~/w/qom/machine/dr-connector/40000000$
> 
> Hopefully this is sort of the approach you were thinking of?

This looks quite neat so far, looking forward to the patches :).


Alex
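
For reference, the index layout visible in the listing quoted above
(2xxxxxxx for PHBs, 4xxxxxxx for PCI slots, with slot entries spaced 8
apart) is consistent with packing a connector type into the top bits of
the index and a per-type id into the rest. The following is only a sketch
under that assumption. The names drc_type, DRC_TYPE_SHIFT, drc_index_for()
and drc_index_for_pci_slot() are hypothetical and not taken from the
series, and both the 28-bit shift and the "slot << 3" id are inferred from
the listing rather than stated anywhere in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type codes, inferred from the 2xxxxxxx/4xxxxxxx prefixes. */
enum drc_type {
    DRC_TYPE_PHB = 2,
    DRC_TYPE_PCI = 4,
};

/* Assumption: the type occupies the top 4 bits of the 32-bit index. */
#define DRC_TYPE_SHIFT 28
#define DRC_ID_MASK    ((UINT32_C(1) << DRC_TYPE_SHIFT) - 1)

/* Combine a connector type and a per-type id into one drc_index. */
uint32_t drc_index_for(enum drc_type type, uint32_t id)
{
    assert((id & ~DRC_ID_MASK) == 0); /* id must fit below the type bits */
    return ((uint32_t)type << DRC_TYPE_SHIFT) | id;
}

/* Assumption: a PCI slot's id is its slot number shifted left by 3,
 * which matches the stride of 8 between entries in the listing. */
uint32_t drc_index_for_pci_slot(uint32_t slot)
{
    return drc_index_for(DRC_TYPE_PCI, slot << 3);
}
```

With this layout, 32 slots on one PHB map to 40000000 through 400000f8,
which is the range shown in the directory listing. How CPU and memory
connectors are numbered is not shown in the thread, so they are left out.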


* Re: [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface
  2014-10-01 14:30       ` Alexander Graf
@ 2014-11-26  4:51         ` Bharata B Rao
  2014-11-26  4:54         ` Bharata B Rao
  1 sibling, 0 replies; 69+ messages in thread
From: Bharata B Rao @ 2014-11-26  4:51 UTC (permalink / raw)
  To: Alexander Graf
  Cc: aik, Michael Roth, qemu-devel, ncmike, qemu-ppc, tyreld, Nathan Fontenot

On Wed, Oct 1, 2014 at 8:00 PM, Alexander Graf <agraf@suse.de> wrote:
>
>
> On 01.10.14 00:08, Michael Roth wrote:
>> Quoting Alexander Graf (2014-08-26 06:36:57)
>>> On 19.08.14 02:21, Michael Roth wrote:
>>>> From: Mike Day <ncmike@ncultra.org>
>>>>
>>>> Signed-off-by: Mike Day <ncmike@ncultra.org>
>>>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>>>> ---
>>>>  hw/ppc/spapr_pci.c     | 119 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  include/hw/ppc/spapr.h |   3 ++
>>>>  2 files changed, 122 insertions(+)
>>>>
>>>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>>>> index 924d488..23a3477 100644
>>>> --- a/hw/ppc/spapr_pci.c
>>>> +++ b/hw/ppc/spapr_pci.c
>>>> @@ -36,6 +36,16 @@
>>>>
>>>>  #include "hw/pci/pci_bus.h"
>>>>
>>>> +/* #define DEBUG_SPAPR */
>>>> +
>>>> +#ifdef DEBUG_SPAPR
>>>> +#define DPRINTF(fmt, ...) \
>>>> +    do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
>>>> +#else
>>>> +#define DPRINTF(fmt, ...) \
>>>> +    do { } while (0)
>>>> +#endif
>>>> +
>>>>  /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
>>>>  #define RTAS_QUERY_FN           0
>>>>  #define RTAS_CHANGE_FN          1
>>>> @@ -47,6 +57,31 @@
>>>>  #define RTAS_TYPE_MSI           1
>>>>  #define RTAS_TYPE_MSIX          2
>>>>
>>>> +/* For set-indicator RTAS interface */
>>>> +#define INDICATOR_ISOLATION_MASK            0x0001   /* 9001 one bit */
>>>> +#define INDICATOR_GLOBAL_INTERRUPT_MASK     0x0002   /* 9005 one bit */
>>>> +#define INDICATOR_ERROR_LOG_MASK            0x0004   /* 9006 one bit */
>>>> +#define INDICATOR_IDENTIFY_MASK             0x0008   /* 9007 one bit */
>>>> +#define INDICATOR_RESET_MASK                0x0010   /* 9009 one bit */
>>>> +#define INDICATOR_DR_MASK                   0x00e0   /* 9002 three bits */
>>>> +#define INDICATOR_ALLOCATION_MASK           0x0300   /* 9003 two bits */
>>>> +#define INDICATOR_EPOW_MASK                 0x1c00   /* 9 three bits */
>>>> +
>>>> +#define INDICATOR_ISOLATION_SHIFT           0x00     /* bit 0 */
>>>> +#define INDICATOR_GLOBAL_INTERRUPT_SHIFT    0x01     /* bit 1 */
>>>> +#define INDICATOR_ERROR_LOG_SHIFT           0x02     /* bit 2 */
>>>> +#define INDICATOR_IDENTIFY_SHIFT            0x03     /* bit 3 */
>>>> +#define INDICATOR_RESET_SHIFT               0x04     /* bit 4 */
>>>> +#define INDICATOR_DR_SHIFT                  0x05     /* bits 5-7 */
>>>> +#define INDICATOR_ALLOCATION_SHIFT          0x08     /* bits 8-9 */
>>>> +#define INDICATOR_EPOW_SHIFT                0x0a     /* bits 10-12 */
>>>> +
>>>> +#define DECODE_DRC_STATE(state, m, s)                  \
>>>> +    ((((uint32_t)(state) & (uint32_t)(m))) >> (s))
>>>> +
>>>> +#define ENCODE_DRC_STATE(val, m, s) \
>>>> +    (((uint32_t)(val) << (s)) & (uint32_t)(m))
>>>> +
>>>>  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
>>>>  {
>>>>      sPAPRPHBState *sphb;
>>>> @@ -402,6 +437,80 @@ static void rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
>>>>      rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */
>>>>  }
>>>>
>>>> +static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>>>> +                               uint32_t token, uint32_t nargs,
>>>> +                               target_ulong args, uint32_t nret,
>>>> +                               target_ulong rets)
>>>> +{
>>>> +    uint32_t indicator = rtas_ld(args, 0);
>>>> +    uint32_t drc_index = rtas_ld(args, 1);
>>>> +    uint32_t indicator_state = rtas_ld(args, 2);
>>>> +    uint32_t encoded = 0, shift = 0, mask = 0;
>>>> +    uint32_t *pind;
>>>> +    sPAPRDrcEntry *drc_entry = NULL;
>>>
>>> This rtas call does not have any idea what a PHB is. Why does it live in
>>> spapr_pci.c?
>>
>> spapr_rtas.c does seem like a better home
>>
>>>
>>>> +
>>>> +    if (drc_index == 0) { /* platform indicator */
>>>> +        pind = &spapr->state;
>>>> +    } else {
>>>> +        drc_entry = spapr_find_drc_entry(drc_index);
>>>> +        if (!drc_entry) {
>>>> +            DPRINTF("rtas_set_indicator: unable to find drc_entry for %x",
>>>> +                    drc_index);
>>>> +            rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
>>>> +            return;
>>>> +        }
>>>> +        pind = &drc_entry->state;
>>>> +    }
>>>> +
>>>> +    switch (indicator) {
>>>> +    case 9:  /* EPOW */
>>>> +        shift = INDICATOR_EPOW_SHIFT;
>>>> +        mask = INDICATOR_EPOW_MASK;
>>>> +        break;
>>>> +    case 9001: /* Isolation state */
>>>> +        /* encode the new value into the correct bit field */
>>>> +        shift = INDICATOR_ISOLATION_SHIFT;
>>>> +        mask = INDICATOR_ISOLATION_MASK;
>>>> +        break;
>>>> +    case 9002: /* DR */
>>>> +        shift = INDICATOR_DR_SHIFT;
>>>> +        mask = INDICATOR_DR_MASK;
>>>> +        break;
>>>> +    case 9003: /* Allocation State */
>>>> +        shift = INDICATOR_ALLOCATION_SHIFT;
>>>> +        mask = INDICATOR_ALLOCATION_MASK;
>>>> +        break;
>>>> +    case 9005: /* global interrupt */
>>>> +        shift = INDICATOR_GLOBAL_INTERRUPT_SHIFT;
>>>> +        mask = INDICATOR_GLOBAL_INTERRUPT_MASK;
>>>> +        break;
>>>> +    case 9006: /* error log */
>>>> +        shift = INDICATOR_ERROR_LOG_SHIFT;
>>>> +        mask = INDICATOR_ERROR_LOG_MASK;
>>>> +        break;
>>>> +    case 9007: /* identify */
>>>> +        shift = INDICATOR_IDENTIFY_SHIFT;
>>>> +        mask = INDICATOR_IDENTIFY_MASK;
>>>> +        break;
>>>> +    case 9009: /* reset */
>>>> +        shift = INDICATOR_RESET_SHIFT;
>>>> +        mask = INDICATOR_RESET_MASK;
>>>> +        break;
>>>> +    default:
>>>> +        DPRINTF("rtas_set_indicator: indicator not implemented: %d",
>>>> +                indicator);
>>>> +        rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    encoded = ENCODE_DRC_STATE(indicator_state, mask, shift);
>>>> +    /* clear the current indicator value */
>>>> +    *pind &= ~mask;
>>>> +    /* set the new value */
>>>> +    *pind |= encoded;
>>>> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
>>>> +}
>>>> +
>>>>  static int pci_spapr_swizzle(int slot, int pin)
>>>>  {
>>>>      return (slot + pin) % PCI_NUM_PINS;
>>>> @@ -624,6 +733,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>>>          sphb->lsi_table[i].irq = irq;
>>>>      }
>>>>
>>>> +    /* make sure the platform EPOW sensor is initialized - the
>>>> +     * guest will probe it when there is a hotplug event.
>>>> +     */
>>>> +    spapr->state &= ~(uint32_t)INDICATOR_EPOW_MASK;
>>>> +    spapr->state |= ENCODE_DRC_STATE(0,
>>>> +                                     INDICATOR_EPOW_MASK,
>>>> +                                     INDICATOR_EPOW_SHIFT);
>>>> +
>>>>      if (!info->finish_realize) {
>>>>          error_setg(errp, "finish_realize not defined");
>>>>          return;
>>>> @@ -1056,6 +1173,8 @@ void spapr_pci_rtas_init(void)
>>>>          spapr_rtas_register(RTAS_IBM_CHANGE_MSI, "ibm,change-msi",
>>>>                              rtas_ibm_change_msi);
>>>>      }
>>>> +    spapr_rtas_register(RTAS_SET_INDICATOR, "set-indicator",
>>>> +                        rtas_set_indicator);
>>>>  }
>>>>
>>>>  static void spapr_pci_register_types(void)
>>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>>>> index 0ac1a19..fac85f8 100644
>>>> --- a/include/hw/ppc/spapr.h
>>>> +++ b/include/hw/ppc/spapr.h
>>>> @@ -72,6 +72,9 @@ typedef struct sPAPREnvironment {
>>>>
>>>>      /* state for Dynamic Reconfiguration Connectors */
>>>>      sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
>>>> +
>>>> +    /* Platform state - sensors and indicators */
>>>> +    uint32_t state;
>>>
>>> Do you think it'd be possible to create a special DRC device that
>>> contains all of its tables and global state and also exposes sensors and
>>> indicators? That device could then get linked via qom links to the PHBs
>>> for their slots.
>>
>> Sorry for the delay, I've been going back through the code with this
>> suggestion in mind and there does seem to be a lot of state that
>> can be nicely encapsulated by modeling the DR Connectors as a QOM
>> "device" (though I haven't gone as far as to make them actual
>> DeviceState's since it's more of a firmware abstraction than real
>> hardware)
>>
>> I'm not sure what the best way to plumb things together is. As a first
>> pass, since each DRC must have an index (drc_index) as per spec, I've
>> put them under /machine/dr-connector as a flat list, where top-level
>> PHB/CPU/MEMORY DRCs would be allocated statically during sPAPR machine
>> init (since the corresponding DRC indexes/types/etc are hard-coded into
>> the top-level of the boot-time DT anyway, though I guess we could also
>> allocate these on the fly...seems messier though than just plugging new
>> resources into existing DRCs)
>>
>> PHBs in turn will associate themselves with a DRC via an attach/detach
>> method as part of realize (and in the future, hotplug hooks, though
>> that's not part of the series). Each PHB will then create a DRC for each
>> hotpluggable PCI slot.
>>
>> Creation is via:
>>
>> sPAPRDRConnector *spapr_dr_connector_new(sPAPRDRConnectorType type,
>>                                          uint32_t id);
>>
>> where the code computes the drc index based on <type> (one of phb, cpu, pci,
>> memory, etc) and <id>, and places the connector under
>> /machine/dr-connector/<drc_index>
>>
>> Any pci/phb/cpu hotplug hooks can then fetch the DRC via type/id,
>> and hotplug/unplug via attach()/detach() methods. attach() adds
>> the attached/hotplugged DeviceState as a link property of the
>> DRC object, and sets the initial sensor state.
>>
>> rtas calls can fetch DRCs via drc_index, and set/get sensor state
>> via DRC sensor get/set methods.
>>
>> Hotplug event delivery still lives outside of DRC implementation for now. I
>> thought of moving them into DRC, but decisions like whether we should
>> emit events during coldplug/initial boot seemed to require pushing
>> a lot of general machine state into DRCs and making the encapsulation
>> seem superficial.
>>
>> Things end up looking like this (2xxxxxxx are PHBs, 4xxxxxxx are PCI slots):
>>
>> mdroth@loki:~/w/qom/machine/dr-connector$ ls
>> 20000000  40000018  40000038  40000058  40000078  40000098  400000b8  400000d8  400000f8
>> 40000000  40000020  40000040  40000060  40000080  400000a0  400000c0  400000e0  type
>> 40000008  40000028  40000048  40000068  40000088  400000a8  400000c8  400000e8
>> 40000010  40000030  40000050  40000070  40000090  400000b0  400000d0  400000f0
>> mdroth@loki:~/w/qom/machine/dr-connector$ cd 40000000/
>> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ ls -l
>> total 0
>> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 allocation-state
>> lrwxr-xr-x 2 mdroth mdroth 4096 Dec 31  1969 device -> ../../../machine/peripheral/hp0
>> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 drc-index
>> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 entity-sense
>> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 indicator-state
>> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 isolation-state
>> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 type
>> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat allocation-state
>> 1
>> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat indicator-state
>> 1
>> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat ../../../machine/peripheral/hp0/type
>> virtio-net-pci
>> mdroth@loki:~/w/qom/machine/dr-connector/40000000$
>>
>> Hopefully this is sort of the approach you were thinking of?
>
> This looks quite neat so far, looking forward to the patches :).

Michael,

Do you have this code or these patches anywhere that I could use? I now
have initial working versions of both CPU and memory hotplug for sPAPR
guests, based on top of your old PCI hotplug patchset, and it would be
good to rebase them on top of your DR connector device work.

Regards,
Bharata.
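
As background for anyone rebasing on the quoted patch: it packs several
indicators into one 32-bit state word, and rtas_set_indicator updates a
field by clearing its mask and then OR-ing in the encoded value. Below is
a minimal standalone sketch of that clear-then-set step. The macros and
the mask/shift values are copied from the quoted patch; the helper names
set_field() and get_field() are hypothetical and are not part of the
series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mask/shift pairs copied from the quoted patch. */
#define INDICATOR_ISOLATION_MASK    0x0001   /* 9001, bit 0 */
#define INDICATOR_ISOLATION_SHIFT   0x00
#define INDICATOR_DR_MASK           0x00e0   /* 9002, bits 5-7 */
#define INDICATOR_DR_SHIFT          0x05

#define DECODE_DRC_STATE(state, m, s) \
    ((((uint32_t)(state) & (uint32_t)(m))) >> (s))

#define ENCODE_DRC_STATE(val, m, s) \
    (((uint32_t)(val) << (s)) & (uint32_t)(m))

/* Hypothetical helper: replace one field, leaving the others untouched. */
void set_field(uint32_t *state, uint32_t val, uint32_t mask, uint32_t shift)
{
    assert(state != NULL);
    *state &= ~mask;                               /* clear the current value */
    *state |= ENCODE_DRC_STATE(val, mask, shift);  /* set the new value */
}

/* Hypothetical helper: read one field back out of the state word. */
uint32_t get_field(uint32_t state, uint32_t mask, uint32_t shift)
{
    return DECODE_DRC_STATE(state, mask, shift);
}
```

Note that ENCODE_DRC_STATE masks after shifting, so a value wider than its
field is silently truncated (writing 2 to the one-bit isolation field
stores 0). A caller that wants to reject such values would need to check
the range before encoding.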


* Re: [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface
  2014-10-01 14:30       ` Alexander Graf
  2014-11-26  4:51         ` Bharata B Rao
@ 2014-11-26  4:54         ` Bharata B Rao
  2014-11-26  6:27           ` Michael Roth
  1 sibling, 1 reply; 69+ messages in thread
From: Bharata B Rao @ 2014-11-26  4:54 UTC (permalink / raw)
  To: Alexander Graf
  Cc: aik, Michael Roth, qemu-devel, ncmike, qemu-ppc, tyreld, Nathan Fontenot

On Wed, Oct 1, 2014 at 8:00 PM, Alexander Graf <agraf@suse.de> wrote:
>
>
> On 01.10.14 00:08, Michael Roth wrote:
>> Quoting Alexander Graf (2014-08-26 06:36:57)
>>> On 19.08.14 02:21, Michael Roth wrote:
>>>> From: Mike Day <ncmike@ncultra.org>
>>>>
>>>> Signed-off-by: Mike Day <ncmike@ncultra.org>
>>>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>>>> ---
>>>>  hw/ppc/spapr_pci.c     | 119 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  include/hw/ppc/spapr.h |   3 ++
>>>>  2 files changed, 122 insertions(+)
>>>>
>>>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>>>> index 924d488..23a3477 100644
>>>> --- a/hw/ppc/spapr_pci.c
>>>> +++ b/hw/ppc/spapr_pci.c
>>>> @@ -36,6 +36,16 @@
>>>>
>>>>  #include "hw/pci/pci_bus.h"
>>>>
>>>> +/* #define DEBUG_SPAPR */
>>>> +
>>>> +#ifdef DEBUG_SPAPR
>>>> +#define DPRINTF(fmt, ...) \
>>>> +    do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
>>>> +#else
>>>> +#define DPRINTF(fmt, ...) \
>>>> +    do { } while (0)
>>>> +#endif
>>>> +
>>>>  /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
>>>>  #define RTAS_QUERY_FN           0
>>>>  #define RTAS_CHANGE_FN          1
>>>> @@ -47,6 +57,31 @@
>>>>  #define RTAS_TYPE_MSI           1
>>>>  #define RTAS_TYPE_MSIX          2
>>>>
>>>> +/* For set-indicator RTAS interface */
>>>> +#define INDICATOR_ISOLATION_MASK            0x0001   /* 9001 one bit */
>>>> +#define INDICATOR_GLOBAL_INTERRUPT_MASK     0x0002   /* 9005 one bit */
>>>> +#define INDICATOR_ERROR_LOG_MASK            0x0004   /* 9006 one bit */
>>>> +#define INDICATOR_IDENTIFY_MASK             0x0008   /* 9007 one bit */
>>>> +#define INDICATOR_RESET_MASK                0x0010   /* 9009 one bit */
>>>> +#define INDICATOR_DR_MASK                   0x00e0   /* 9002 three bits */
>>>> +#define INDICATOR_ALLOCATION_MASK           0x0300   /* 9003 two bits */
>>>> +#define INDICATOR_EPOW_MASK                 0x1c00   /* 9 three bits */
>>>> +
>>>> +#define INDICATOR_ISOLATION_SHIFT           0x00     /* bit 0 */
>>>> +#define INDICATOR_GLOBAL_INTERRUPT_SHIFT    0x01     /* bit 1 */
>>>> +#define INDICATOR_ERROR_LOG_SHIFT           0x02     /* bit 2 */
>>>> +#define INDICATOR_IDENTIFY_SHIFT            0x03     /* bit 3 */
>>>> +#define INDICATOR_RESET_SHIFT               0x04     /* bit 4 */
>>>> +#define INDICATOR_DR_SHIFT                  0x05     /* bits 5-7 */
>>>> +#define INDICATOR_ALLOCATION_SHIFT          0x08     /* bits 8-9 */
>>>> +#define INDICATOR_EPOW_SHIFT                0x0a     /* bits 10-12 */
>>>> +
>>>> +#define DECODE_DRC_STATE(state, m, s)                  \
>>>> +    ((((uint32_t)(state) & (uint32_t)(m))) >> (s))
>>>> +
>>>> +#define ENCODE_DRC_STATE(val, m, s) \
>>>> +    (((uint32_t)(val) << (s)) & (uint32_t)(m))
>>>> +
>>>>  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
>>>>  {
>>>>      sPAPRPHBState *sphb;
>>>> @@ -402,6 +437,80 @@ static void rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
>>>>      rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */
>>>>  }
>>>>
>>>> +static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>>>> +                               uint32_t token, uint32_t nargs,
>>>> +                               target_ulong args, uint32_t nret,
>>>> +                               target_ulong rets)
>>>> +{
>>>> +    uint32_t indicator = rtas_ld(args, 0);
>>>> +    uint32_t drc_index = rtas_ld(args, 1);
>>>> +    uint32_t indicator_state = rtas_ld(args, 2);
>>>> +    uint32_t encoded = 0, shift = 0, mask = 0;
>>>> +    uint32_t *pind;
>>>> +    sPAPRDrcEntry *drc_entry = NULL;
>>>
>>> This rtas call does not have any idea what a PHB is. Why does it live in
>>> spapr_pci.c?
>>
>> spapr_rtas.c does seem like a better home
>>
>>>
>>>> +
>>>> +    if (drc_index == 0) { /* platform indicator */
>>>> +        pind = &spapr->state;
>>>> +    } else {
>>>> +        drc_entry = spapr_find_drc_entry(drc_index);
>>>> +        if (!drc_entry) {
>>>> +            DPRINTF("rtas_set_indicator: unable to find drc_entry for %x",
>>>> +                    drc_index);
>>>> +            rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
>>>> +            return;
>>>> +        }
>>>> +        pind = &drc_entry->state;
>>>> +    }
>>>> +
>>>> +    switch (indicator) {
>>>> +    case 9:  /* EPOW */
>>>> +        shift = INDICATOR_EPOW_SHIFT;
>>>> +        mask = INDICATOR_EPOW_MASK;
>>>> +        break;
>>>> +    case 9001: /* Isolation state */
>>>> +        /* encode the new value into the correct bit field */
>>>> +        shift = INDICATOR_ISOLATION_SHIFT;
>>>> +        mask = INDICATOR_ISOLATION_MASK;
>>>> +        break;
>>>> +    case 9002: /* DR */
>>>> +        shift = INDICATOR_DR_SHIFT;
>>>> +        mask = INDICATOR_DR_MASK;
>>>> +        break;
>>>> +    case 9003: /* Allocation State */
>>>> +        shift = INDICATOR_ALLOCATION_SHIFT;
>>>> +        mask = INDICATOR_ALLOCATION_MASK;
>>>> +        break;
>>>> +    case 9005: /* global interrupt */
>>>> +        shift = INDICATOR_GLOBAL_INTERRUPT_SHIFT;
>>>> +        mask = INDICATOR_GLOBAL_INTERRUPT_MASK;
>>>> +        break;
>>>> +    case 9006: /* error log */
>>>> +        shift = INDICATOR_ERROR_LOG_SHIFT;
>>>> +        mask = INDICATOR_ERROR_LOG_MASK;
>>>> +        break;
>>>> +    case 9007: /* identify */
>>>> +        shift = INDICATOR_IDENTIFY_SHIFT;
>>>> +        mask = INDICATOR_IDENTIFY_MASK;
>>>> +        break;
>>>> +    case 9009: /* reset */
>>>> +        shift = INDICATOR_RESET_SHIFT;
>>>> +        mask = INDICATOR_RESET_MASK;
>>>> +        break;
>>>> +    default:
>>>> +        DPRINTF("rtas_set_indicator: indicator not implemented: %d",
>>>> +                indicator);
>>>> +        rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    encoded = ENCODE_DRC_STATE(indicator_state, mask, shift);
>>>> +    /* clear the current indicator value */
>>>> +    *pind &= ~mask;
>>>> +    /* set the new value */
>>>> +    *pind |= encoded;
>>>> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
>>>> +}
>>>> +
>>>>  static int pci_spapr_swizzle(int slot, int pin)
>>>>  {
>>>>      return (slot + pin) % PCI_NUM_PINS;
>>>> @@ -624,6 +733,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>>>          sphb->lsi_table[i].irq = irq;
>>>>      }
>>>>
>>>> +    /* make sure the platform EPOW sensor is initialized - the
>>>> +     * guest will probe it when there is a hotplug event.
>>>> +     */
>>>> +    spapr->state &= ~(uint32_t)INDICATOR_EPOW_MASK;
>>>> +    spapr->state |= ENCODE_DRC_STATE(0,
>>>> +                                     INDICATOR_EPOW_MASK,
>>>> +                                     INDICATOR_EPOW_SHIFT);
>>>> +
>>>>      if (!info->finish_realize) {
>>>>          error_setg(errp, "finish_realize not defined");
>>>>          return;
>>>> @@ -1056,6 +1173,8 @@ void spapr_pci_rtas_init(void)
>>>>          spapr_rtas_register(RTAS_IBM_CHANGE_MSI, "ibm,change-msi",
>>>>                              rtas_ibm_change_msi);
>>>>      }
>>>> +    spapr_rtas_register(RTAS_SET_INDICATOR, "set-indicator",
>>>> +                        rtas_set_indicator);
>>>>  }
>>>>
>>>>  static void spapr_pci_register_types(void)
>>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>>>> index 0ac1a19..fac85f8 100644
>>>> --- a/include/hw/ppc/spapr.h
>>>> +++ b/include/hw/ppc/spapr.h
>>>> @@ -72,6 +72,9 @@ typedef struct sPAPREnvironment {
>>>>
>>>>      /* state for Dynamic Reconfiguration Connectors */
>>>>      sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
>>>> +
>>>> +    /* Platform state - sensors and indicators */
>>>> +    uint32_t state;
>>>
>>> Do you think it'd be possible to create a special DRC device that
>>> contains all of its tables and global state and also exposes sensors and
>>> indicators? That device could then get linked via qom links to the PHBs
>>> for their slots.
>>
>> Sorry for the delay, I've been going back through the code with this
>> suggestion in mind and there does seem to be a lot of state that
>> can be nicely encapsulated by modeling the DR Connectors as a QOM
>> "device" (though I haven't gone as far as to make them actual
>> DeviceState's since it's more of a firmware abstraction than real
>> hardware)
>>
>> I'm not sure what the best way to plumb things together is. As a first
>> pass, since each DRC must have an index (drc_index) as per spec, I've
>> put them under /machine/dr-connector as a flat list, where top-level
>> PHB/CPU/MEMORY DRCs would be allocated statically during sPAPR machine
>> init (since the corresponding DRC indexes/types/etc are hard-coded into
>> the top-level of the boot-time DT anyway, though I guess we could also
>> allocate these on the fly...seems messier though than just plugging new
>> resources into existing DRCs)
>>
>> PHBs in turn will associate themselves with a DRC via an attach/detach
>> method as part of realize (and in the future, hotplug hooks, though
>> that's not part of the series). Each PHB will then create a DRC for each
>> hotpluggable PCI slot.
>>
>> Creation is via:
>>
>> sPAPRDRConnector *spapr_dr_connector_new(sPAPRDRConnectorType type,
>>                                          uint32_t id);
>>
>> where the code computes the drc index based on <type> (one of phb, cpu, pci,
>> memory, etc) and <id>, and places the connector under
>> /machine/dr-connector/<drc_index>
>>
>> Any pci/phb/cpu hotplug hooks can then fetch the DRC via type/id,
>> and hotplug/unplug via attach()/detach() methods. attach() adds
>> the attached/hotplugged DeviceState as a link property of the
>> DRC object, and sets the initial sensor state.
>>
>> rtas calls can fetch DRCs via drc_index, and set/get sensor state
>> via DRC sensor get/set methods.
>>
>> Hotplug event delivery still lives outside of DRC implementation for now. I
>> thought of moving them into DRC, but decisions like whether we should
>> emit events during coldplug/initial boot seemed to require pushing
>> a lot of general machine state into DRCs and making the encapsulation
>> seem superficial.
>>
>> Things end up looking like this (2xxxxxxx are PHBs, 4xxxxxxx are PCI slots):
>>
>> mdroth@loki:~/w/qom/machine/dr-connector$ ls
>> 20000000  40000018  40000038  40000058  40000078  40000098  400000b8  400000d8  400000f8
>> 40000000  40000020  40000040  40000060  40000080  400000a0  400000c0  400000e0  type
>> 40000008  40000028  40000048  40000068  40000088  400000a8  400000c8  400000e8
>> 40000010  40000030  40000050  40000070  40000090  400000b0  400000d0  400000f0
>> mdroth@loki:~/w/qom/machine/dr-connector$ cd 40000000/
>> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ ls -l
>> total 0
>> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 allocation-state
>> lrwxr-xr-x 2 mdroth mdroth 4096 Dec 31  1969 device -> ../../../machine/peripheral/hp0
>> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 drc-index
>> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 entity-sense
>> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 indicator-state
>> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 isolation-state
>> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 type
>> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat allocation-state
>> 1
>> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat indicator-state
>> 1
>> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat ../../../machine/peripheral/hp0/type
>> virtio-net-pci
>> mdroth@loki:~/w/qom/machine/dr-connector/40000000$
>>
>> Hopefully this is sort of the approach you were thinking of?
>
> This looks quite neat so far, looking forward to the patches :).

Michael,

Do you have this code or these patches anywhere that I could use? I now
have initial working versions of both CPU and memory hotplug for sPAPR
guests, based on top of your old PCI hotplug patchset, and it would be
good to rebase them on top of your DR connector device work.

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface
  2014-11-26  4:54         ` Bharata B Rao
@ 2014-11-26  6:27           ` Michael Roth
  2014-12-01  4:57             ` Bharata B Rao
  0 siblings, 1 reply; 69+ messages in thread
From: Michael Roth @ 2014-11-26  6:27 UTC (permalink / raw)
  To: Bharata B Rao, Alexander Graf
  Cc: aik, qemu-devel, ncmike, qemu-ppc, tyreld, Nathan Fontenot

Quoting Bharata B Rao (2014-11-25 22:54:12)
> On Wed, Oct 1, 2014 at 8:00 PM, Alexander Graf <agraf@suse.de> wrote:
> >
> >
> > On 01.10.14 00:08, Michael Roth wrote:
> >> Quoting Alexander Graf (2014-08-26 06:36:57)
> >>> On 19.08.14 02:21, Michael Roth wrote:
> >>>> From: Mike Day <ncmike@ncultra.org>
> >>>>
> >>>> Signed-off-by: Mike Day <ncmike@ncultra.org>
> >>>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> >>>> ---
> >>>>  hw/ppc/spapr_pci.c     | 119 +++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>  include/hw/ppc/spapr.h |   3 ++
> >>>>  2 files changed, 122 insertions(+)
> >>>>
> >>>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> >>>> index 924d488..23a3477 100644
> >>>> --- a/hw/ppc/spapr_pci.c
> >>>> +++ b/hw/ppc/spapr_pci.c
> >>>> @@ -36,6 +36,16 @@
> >>>>
> >>>>  #include "hw/pci/pci_bus.h"
> >>>>
> >>>> +/* #define DEBUG_SPAPR */
> >>>> +
> >>>> +#ifdef DEBUG_SPAPR
> >>>> +#define DPRINTF(fmt, ...) \
> >>>> +    do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
> >>>> +#else
> >>>> +#define DPRINTF(fmt, ...) \
> >>>> +    do { } while (0)
> >>>> +#endif
> >>>> +
> >>>>  /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
> >>>>  #define RTAS_QUERY_FN           0
> >>>>  #define RTAS_CHANGE_FN          1
> >>>> @@ -47,6 +57,31 @@
> >>>>  #define RTAS_TYPE_MSI           1
> >>>>  #define RTAS_TYPE_MSIX          2
> >>>>
> >>>> +/* For set-indicator RTAS interface */
> >>>> +#define INDICATOR_ISOLATION_MASK            0x0001   /* 9001 one bit */
> >>>> +#define INDICATOR_GLOBAL_INTERRUPT_MASK     0x0002   /* 9005 one bit */
> >>>> +#define INDICATOR_ERROR_LOG_MASK            0x0004   /* 9006 one bit */
> >>>> +#define INDICATOR_IDENTIFY_MASK             0x0008   /* 9007 one bit */
> >>>> +#define INDICATOR_RESET_MASK                0x0010   /* 9009 one bit */
> >>>> +#define INDICATOR_DR_MASK                   0x00e0   /* 9002 three bits */
> >>>> +#define INDICATOR_ALLOCATION_MASK           0x0300   /* 9003 two bits */
> >>>> +#define INDICATOR_EPOW_MASK                 0x1c00   /* 9 three bits */
> >>>> +
> >>>> +#define INDICATOR_ISOLATION_SHIFT           0x00     /* bit 0 */
> >>>> +#define INDICATOR_GLOBAL_INTERRUPT_SHIFT    0x01     /* bit 1 */
> >>>> +#define INDICATOR_ERROR_LOG_SHIFT           0x02     /* bit 2 */
> >>>> +#define INDICATOR_IDENTIFY_SHIFT            0x03     /* bit 3 */
> >>>> +#define INDICATOR_RESET_SHIFT               0x04     /* bit 4 */
> >>>> +#define INDICATOR_DR_SHIFT                  0x05     /* bits 5-7 */
> >>>> +#define INDICATOR_ALLOCATION_SHIFT          0x08     /* bits 8-9 */
> >>>> +#define INDICATOR_EPOW_SHIFT                0x0a     /* bits 10-12 */
> >>>> +
> >>>> +#define DECODE_DRC_STATE(state, m, s)                  \
> >>>> +    ((((uint32_t)(state) & (uint32_t)(m))) >> (s))
> >>>> +
> >>>> +#define ENCODE_DRC_STATE(val, m, s) \
> >>>> +    (((uint32_t)(val) << (s)) & (uint32_t)(m))
> >>>> +
> >>>>  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
> >>>>  {
> >>>>      sPAPRPHBState *sphb;
> >>>> @@ -402,6 +437,80 @@ static void rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
> >>>>      rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */
> >>>>  }
> >>>>
> >>>> +static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
> >>>> +                               uint32_t token, uint32_t nargs,
> >>>> +                               target_ulong args, uint32_t nret,
> >>>> +                               target_ulong rets)
> >>>> +{
> >>>> +    uint32_t indicator = rtas_ld(args, 0);
> >>>> +    uint32_t drc_index = rtas_ld(args, 1);
> >>>> +    uint32_t indicator_state = rtas_ld(args, 2);
> >>>> +    uint32_t encoded = 0, shift = 0, mask = 0;
> >>>> +    uint32_t *pind;
> >>>> +    sPAPRDrcEntry *drc_entry = NULL;
> >>>
> >>> This rtas call does not have any idea what a PHB is. Why does it live in
> >>> spapr_pci.c?
> >>
> >> spapr_rtas.c does seem like a better home
> >>
> >>>
> >>>> +
> >>>> +    if (drc_index == 0) { /* platform indicator */
> >>>> +        pind = &spapr->state;
> >>>> +    } else {
> >>>> +        drc_entry = spapr_find_drc_entry(drc_index);
> >>>> +        if (!drc_entry) {
> >>>> +            DPRINTF("rtas_set_indicator: unable to find drc_entry for %x",
> >>>> +                    drc_index);
> >>>> +            rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
> >>>> +            return;
> >>>> +        }
> >>>> +        pind = &drc_entry->state;
> >>>> +    }
> >>>> +
> >>>> +    switch (indicator) {
> >>>> +    case 9:  /* EPOW */
> >>>> +        shift = INDICATOR_EPOW_SHIFT;
> >>>> +        mask = INDICATOR_EPOW_MASK;
> >>>> +        break;
> >>>> +    case 9001: /* Isolation state */
> >>>> +        /* encode the new value into the correct bit field */
> >>>> +        shift = INDICATOR_ISOLATION_SHIFT;
> >>>> +        mask = INDICATOR_ISOLATION_MASK;
> >>>> +        break;
> >>>> +    case 9002: /* DR */
> >>>> +        shift = INDICATOR_DR_SHIFT;
> >>>> +        mask = INDICATOR_DR_MASK;
> >>>> +        break;
> >>>> +    case 9003: /* Allocation State */
> >>>> +        shift = INDICATOR_ALLOCATION_SHIFT;
> >>>> +        mask = INDICATOR_ALLOCATION_MASK;
> >>>> +        break;
> >>>> +    case 9005: /* global interrupt */
> >>>> +        shift = INDICATOR_GLOBAL_INTERRUPT_SHIFT;
> >>>> +        mask = INDICATOR_GLOBAL_INTERRUPT_MASK;
> >>>> +        break;
> >>>> +    case 9006: /* error log */
> >>>> +        shift = INDICATOR_ERROR_LOG_SHIFT;
> >>>> +        mask = INDICATOR_ERROR_LOG_MASK;
> >>>> +        break;
> >>>> +    case 9007: /* identify */
> >>>> +        shift = INDICATOR_IDENTIFY_SHIFT;
> >>>> +        mask = INDICATOR_IDENTIFY_MASK;
> >>>> +        break;
> >>>> +    case 9009: /* reset */
> >>>> +        shift = INDICATOR_RESET_SHIFT;
> >>>> +        mask = INDICATOR_RESET_MASK;
> >>>> +        break;
> >>>> +    default:
> >>>> +        DPRINTF("rtas_set_indicator: indicator not implemented: %d",
> >>>> +                indicator);
> >>>> +        rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
> >>>> +        return;
> >>>> +    }
> >>>> +
> >>>> +    encoded = ENCODE_DRC_STATE(indicator_state, mask, shift);
> >>>> +    /* clear the current indicator value */
> >>>> +    *pind &= ~mask;
> >>>> +    /* set the new value */
> >>>> +    *pind |= encoded;
> >>>> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> >>>> +}
> >>>> +
> >>>>  static int pci_spapr_swizzle(int slot, int pin)
> >>>>  {
> >>>>      return (slot + pin) % PCI_NUM_PINS;
> >>>> @@ -624,6 +733,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >>>>          sphb->lsi_table[i].irq = irq;
> >>>>      }
> >>>>
> >>>> +    /* make sure the platform EPOW sensor is initialized - the
> >>>> +     * guest will probe it when there is a hotplug event.
> >>>> +     */
> >>>> +    spapr->state &= ~(uint32_t)INDICATOR_EPOW_MASK;
> >>>> +    spapr->state |= ENCODE_DRC_STATE(0,
> >>>> +                                     INDICATOR_EPOW_MASK,
> >>>> +                                     INDICATOR_EPOW_SHIFT);
> >>>> +
> >>>>      if (!info->finish_realize) {
> >>>>          error_setg(errp, "finish_realize not defined");
> >>>>          return;
> >>>> @@ -1056,6 +1173,8 @@ void spapr_pci_rtas_init(void)
> >>>>          spapr_rtas_register(RTAS_IBM_CHANGE_MSI, "ibm,change-msi",
> >>>>                              rtas_ibm_change_msi);
> >>>>      }
> >>>> +    spapr_rtas_register(RTAS_SET_INDICATOR, "set-indicator",
> >>>> +                        rtas_set_indicator);
> >>>>  }
> >>>>
> >>>>  static void spapr_pci_register_types(void)
> >>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >>>> index 0ac1a19..fac85f8 100644
> >>>> --- a/include/hw/ppc/spapr.h
> >>>> +++ b/include/hw/ppc/spapr.h
> >>>> @@ -72,6 +72,9 @@ typedef struct sPAPREnvironment {
> >>>>
> >>>>      /* state for Dynamic Reconfiguration Connectors */
> >>>>      sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
> >>>> +
> >>>> +    /* Platform state - sensors and indicators */
> >>>> +    uint32_t state;
> >>>
> >>> Do you think it'd be possible to create a special DRC device that
> >>> contains all of its tables and global state and also exposes sensors and
> >>> indicators? That device could then get linked via qom links to the PHBs
> >>> for their slots.
> >>
> >> Sorry for the delay, I've been going back through the code with this
> >> suggestion in mind and there does seem to be a lot of state that
> >> can be nicely encapsulated by modeling the DR Connectors as a QOM
> >> "device" (though I haven't gone as far as to make them actual
> >> DeviceState's since it's more of a firmware abstraction than real
> >> hardware)
> >>
> >> I'm not sure what the best way to plumb things together is. As a first
> >> run, since each DRC must have an index (drc_index) as per spec, I've
> >> put them under /machine/dr-connector as a flat list, where top-level
> >> PHB/CPU/MEMORY DRCs would be allocated statically during sPAPR machine
> >> init (since the corresponding DRC indexes/types/etc are hard-coded into
> >> the top-level of the boot-time DT anyway, though I guess we could also
> >> allocate these on the fly...seems messier though than just plugging new
> >> resources into existing DRCs)
> >>
> >> PHB's in turn will associate themselves with a DRC via an attach/detach
> >> method as part of realize (and in the future, hotplug hooks, though
> >> that's not part of the series). The PHBs in turn will create a DRC for each
> >> hotpluggable PCI slot.
> >>
> >> Creation is via:
> >>
> >> sPAPRDRConnector *spapr_dr_connector_new(sPAPRDRConnectorType type,
> >>                                          uint32_t id);
> >>
> >> where the code computes the drc index based on <type> (one of phb, cpu, pci,
> >> memory, etc.) and <id>, and sticks each one under /machine/dr-connector/<drc_index>
> >>
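
A minimal sketch of how such an index could be derived from <type> and <id>.
The layout is an assumption inferred only from the directory listing in this
message (PHBs appear as 2xxxxxxx, PCI slots as 4xxxxxxx, and slot entries are
spaced 8 apart); the names drc_index(), pci_slot_id(), and the DRC_* constants
are hypothetical and not taken from the patches:

```c
#include <stdint.h>

/* Hypothetical type codes; only PHB = 2 and PCI = 4 are inferred
 * from the 2xxxxxxx / 4xxxxxxx prefixes in the listing. */
enum drc_type {
    DRC_TYPE_PHB = 2,
    DRC_TYPE_PCI = 4,
};

/* Assumption: the type lives in the top nibble, the id in the rest. */
#define DRC_INDEX_TYPE_SHIFT 28
#define DRC_INDEX_ID_MASK    ((1u << DRC_INDEX_TYPE_SHIFT) - 1)

/* Combine a connector type and a per-type id into one 32-bit index. */
static uint32_t drc_index(enum drc_type type, uint32_t id)
{
    return ((uint32_t)type << DRC_INDEX_TYPE_SHIFT) | (id & DRC_INDEX_ID_MASK);
}

/* Assumption: a PCI slot id is the slot number shifted left by 3
 * (a devfn with function 0), which gives the spacing of 8. */
static uint32_t pci_slot_id(uint32_t slot)
{
    return slot << 3;
}
```

Under these assumptions, drc_index(DRC_TYPE_PCI, pci_slot_id(1)) gives
0x40000008 and slot 31 gives 0x400000f8, matching entries in the listing.
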
> >> Any pci/phb/cpu hotplug hooks can then fetch the DRC via type/id,
> >> and hotplug/unplug via attach()/detach() methods. attach() adds
> >> the attached/hotplugged DeviceState as a link property of the
> >> DRC object, and sets the initial sensor state.
> >>
> >> rtas calls can fetch DRCs via drc_index, and set/get sensor state
> >> via DRC sensor get/set methods.
> >>
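
For reference, the mask/shift packing used by the quoted set-indicator patch
can be exercised on its own. The macros and the three mask/shift pairs below
are copied from that patch; set_field() is a hypothetical helper that only
mirrors the clear-then-set sequence at the end of rtas_set_indicator():

```c
#include <stdint.h>

/* Copied from the quoted patch (subset of the indicator fields). */
#define INDICATOR_ISOLATION_MASK    0x0001   /* 9001, one bit */
#define INDICATOR_DR_MASK           0x00e0   /* 9002, three bits */
#define INDICATOR_ALLOCATION_MASK   0x0300   /* 9003, two bits */

#define INDICATOR_ISOLATION_SHIFT   0x00
#define INDICATOR_DR_SHIFT          0x05
#define INDICATOR_ALLOCATION_SHIFT  0x08

#define DECODE_DRC_STATE(state, m, s) \
    ((((uint32_t)(state) & (uint32_t)(m))) >> (s))

#define ENCODE_DRC_STATE(val, m, s) \
    (((uint32_t)(val) << (s)) & (uint32_t)(m))

/* Hypothetical helper: clear one field of *state, then store val in it. */
static void set_field(uint32_t *state, uint32_t val,
                      uint32_t mask, uint32_t shift)
{
    *state &= ~mask;                              /* clear current value */
    *state |= ENCODE_DRC_STATE(val, mask, shift); /* set the new value */
}
```

One property worth noting: ENCODE_DRC_STATE masks after shifting, so a value
too wide for its field is silently truncated rather than rejected. Writing 7
into the two-bit allocation field, for example, stores 3.
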
> >> Hotplug event delivery still lives outside of DRC implementation for now. I
> >> thought of moving them into DRC, but decisions like whether we should
> >> emit events during coldplug/initial boot seemed to require pushing
> >> a lot of general machine state into DRCs and making the encapsulation
> >> seem superficial.
> >>
> >> Things end up looking like this (2xxxxxxx are PHBs, 4xxxxxxx are PCI slots):
> >>
> >> mdroth@loki:~/w/qom/machine/dr-connector$ ls
> >> 20000000  40000018  40000038  40000058  40000078  40000098  400000b8  400000d8  400000f8
> >> 40000000  40000020  40000040  40000060  40000080  400000a0  400000c0  400000e0  type
> >> 40000008  40000028  40000048  40000068  40000088  400000a8  400000c8  400000e8
> >> 40000010  40000030  40000050  40000070  40000090  400000b0  400000d0  400000f0
> >> mdroth@loki:~/w/qom/machine/dr-connector$ cd 40000000/
> >> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ ls -l
> >> total 0
> >> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 allocation-state
> >> lrwxr-xr-x 2 mdroth mdroth 4096 Dec 31  1969 device -> ../../../machine/peripheral/hp0
> >> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 drc-index
> >> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 entity-sense
> >> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 indicator-state
> >> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 isolation-state
> >> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31  1969 type
> >> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat allocation-state
> >> 1
> >> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat indicator-state
> >> 1
> >> mdroth@loki:~/w/qom/machine/dr-connector/40000000$ cat ../../../machine/peripheral/hp0/type
> >> virtio-net-pci
> >> mdroth@loki:~/w/qom/machine/dr-connector/40000000$
> >>
> >> Hopefully this is sort of the approach you were thinking of?
> >
> > This looks quite neat so far, looking forward to the patches :).
> 
> Michael,
> 
> Do you have this code/patches anywhere that I could use ? I have got
> the initial working versions of both CPU and memory hotplug now for
> sPAPR guests based on top of your old PCI hotplug patchset and it
> would be good to rebase them on top of your DR connector device work.

Hi Bharata,

Here's the latest branch:

https://github.com/mdroth/qemu/commits/spapr-pci-hotplug-ppc-next-cleanup4.2

The sPAPRDREntry stuff is now modeled by the sPAPRDRConnector QOM object in
hw/ppc/spapr_drc.c, which manages the device's life-cycle based on
rtas-set-sensor-state calls from the guest. As part of qemu-side hotplug/unplug
you use the attach/detach methods of the DRC to associate DT bits and callbacks
for things like device cleanup or rtas calls to fetch a DT node from the device
associated with a particular DRC.
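
To make the attach/detach flow a bit more concrete, here is a rough sketch of
the idea. It is not the code in the branch: every name below (DemoDRC,
demo_attach(), and so on) is hypothetical, and deferring the cleanup callback
until the guest isolates the connector is an assumption about how "manages the
device's life-cycle based on calls from the guest" could work:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical isolation states driven by guest RTAS calls. */
enum demo_isolation { DEMO_ISOLATED, DEMO_UNISOLATED };

typedef void (*demo_detach_cb)(void *dev, void *opaque);

/* Hypothetical connector: holds the device, its DT bits, and cleanup hooks. */
typedef struct DemoDRC {
    void *dev;                 /* attached device, NULL when empty */
    const void *fdt;           /* DT fragment the guest can fetch */
    enum demo_isolation isolation;
    bool awaiting_release;     /* detach requested, guest not done yet */
    demo_detach_cb detach_cb;  /* device cleanup callback */
    void *detach_opaque;
} DemoDRC;

/* qemu-side hotplug: associate a device and its DT bits with the DRC. */
static int demo_attach(DemoDRC *drc, void *dev, const void *fdt)
{
    if (drc->dev) {
        return -1;             /* connector already in use */
    }
    drc->dev = dev;
    drc->fdt = fdt;
    drc->awaiting_release = false;
    return 0;
}

/* Run the cleanup callback and empty the connector. */
static void demo_release(DemoDRC *drc)
{
    if (drc->detach_cb) {
        drc->detach_cb(drc->dev, drc->detach_opaque);
    }
    drc->dev = NULL;
    drc->fdt = NULL;
    drc->detach_cb = NULL;
    drc->detach_opaque = NULL;
    drc->awaiting_release = false;
}

/* qemu-side unplug: clean up now if isolated, otherwise wait for the guest. */
static void demo_detach(DemoDRC *drc, demo_detach_cb cb, void *opaque)
{
    if (!drc->dev) {
        return;
    }
    drc->detach_cb = cb;
    drc->detach_opaque = opaque;
    if (drc->isolation == DEMO_ISOLATED) {
        demo_release(drc);
    } else {
        drc->awaiting_release = true;
    }
}

/* Guest-side state change: a pending detach completes on isolation. */
static void demo_set_isolation(DemoDRC *drc, enum demo_isolation state)
{
    drc->isolation = state;
    if (state == DEMO_ISOLATED && drc->awaiting_release) {
        demo_release(drc);
    }
}

/* What an RTAS call fetching the DT node for this DRC would read. */
static const void *demo_get_fdt(const DemoDRC *drc)
{
    return drc->dev ? drc->fdt : NULL;
}

/* Example cleanup callback: counts how many times cleanup ran. */
static void demo_count_cleanup(void *dev, void *opaque)
{
    (void)dev;
    ++*(int *)opaque;
}
```

The point of the split is that the hotplug code only talks to the connector,
while the guest decides when the device actually goes away.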

I still need to fix endian issues, and am realizing the dr connectors and DT
bits for PHBs are not actually a prereq for PCI hotplug, so I may be pulling
that out to a separate series specific to enabling PHB hotplug (namely for
VFIO hotplug). I realize your CPU/MEM sort of depend on the top-level PHB
device tree code so I'm not sure how best to deal with that. Worst case, we'd
roll the initial code into your series and base a follow-up series on top of
that instead.

Let me know if you have any questions.

> 
> Regards,
> Bharata.


* Re: [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface
  2014-11-26  6:27           ` Michael Roth
@ 2014-12-01  4:57             ` Bharata B Rao
  2014-12-23 15:12               ` Michael Roth
  0 siblings, 1 reply; 69+ messages in thread
From: Bharata B Rao @ 2014-12-01  4:57 UTC (permalink / raw)
  To: Michael Roth
  Cc: aik, Alexander Graf, qemu-devel, ncmike, qemu-ppc, tyreld,
	Nathan Fontenot

On Wed, Nov 26, 2014 at 11:57 AM, Michael Roth
<mdroth@linux.vnet.ibm.com> wrote:
> https://github.com/mdroth/qemu/commits/spapr-pci-hotplug-ppc-next-cleanup4.2
>
> The sPAPRDREntry stuff is now modeled by the sPAPRDRConnector QOM object in
> hw/ppc/spapr_drc.c, which manages the device's life-cycle based on
> rtas-set-sensor-state calls from the guest. As part of qemu-side hotplug/unplug
> you use the attach/detach methods of the DRC to associate DT bits and callbacks
> for things like device cleanup or rtas calls to fetch a DT node from the device
> associated with a particular DRC.
>
> I still need to fix endian issues, and am realizing the dr connectors and DT
> bits for PHBs are not actually a prereq for PCI hotplug, so I may be pulling
> that out to a separate series specific to enabling PHB hotplug (namely for
> VFIO hotplug). I realize your CPU/MEM sort of depend on the top-level PHB
> device tree code so I'm not sure how best to deal with that. Worst case, we'd
> roll the initial code into your series and base a follow-up series on top of
> that instead.

Thanks Michael for pointing me to your git tree.

I started rebasing my patchset on top of yours and realized that the
generic DT setup code from the below commits of your branch is needed
for CPU and memory hotplug too. They all apply in the order I have
listed below.

71b32999c4eb spapr_drc: initial implementation
255c50200848 spapr: populate DRC entries for root dt node (don't need
code that adds PHB DT entries)
408206fc627e3 spapr_rtas: add set-indicator RTAS interface
da7a232fa6a44 spapr_rtas: add get-sensor-state RTAS interface
1c575d5b29688 spapr_rtas: add ibm,configure-connector RTAS interface
0c5d72833666c spapr_events: re-use EPOW event infrastructure for hotplug events
82ee5a9c88155 spapr_events: event-scan RTAS interface

If you can make the above set an independent patchset, it will become
easy to maintain and post CPU and memory hotplug patchsets.

I am facing some endian issues in your patchset and I will send fixes
for those separately.

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface
  2014-12-01  4:57             ` Bharata B Rao
@ 2014-12-23 15:12               ` Michael Roth
  2015-01-01  6:35                 ` Bharata B Rao
  0 siblings, 1 reply; 69+ messages in thread
From: Michael Roth @ 2014-12-23 15:12 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: aik, Alexander Graf, qemu-devel, ncmike, qemu-ppc, tyreld,
	Nathan Fontenot

Quoting Bharata B Rao (2014-11-30 22:57:48)
> On Wed, Nov 26, 2014 at 11:57 AM, Michael Roth
> <mdroth@linux.vnet.ibm.com> wrote:
> > https://github.com/mdroth/qemu/commits/spapr-pci-hotplug-ppc-next-cleanup4.2
> >
> > The sPAPRDREntry stuff is now modeled by the sPAPRDRConnector QOM object in
> > hw/ppc/spapr_drc.c, which manages the device's life-cycle based on
> > rtas-set-sensor-state calls from the guest. As part of qemu-side hotplug/unplug
> > you use the attach/detach methods of the DRC to associate DT bits and callbacks
> > for things like device cleanup or rtas calls to fetch a DT node from the device
> > associated with a particular DRC.
> >
> > I still need to fix endian issues, and am realizing the dr connectors and DT
> > bits for PHBs are not actually a prereq for PCI hotplug, so I may be pulling
> > that out to a separate series specific to enabling PHB hotplug (namely for
> > VFIO hotplug). I realize your CPU/MEM sort of depend on the top-level PHB
> > device tree code so I'm not sure how best to deal with that. Worst case, we'd
> > roll the initial code into your series and base a follow-up series on top of
> > that instead.
> 
> Thanks Michael for pointing me to your git tree.
> 
> I started rebasing my patchset on top of yours and realized that the
> generic DT setup code from the below commits of your branch is needed
> for CPU and memory hotplug too. They all apply in the order I have
> listed below.
> 
> 71b32999c4eb spapr_drc: initial implementation
> 255c50200848 spapr: populate DRC entries for root dt node (don't need
> code that adds PHB DT entries)
> 408206fc627e3 spapr_rtas: add set-indicator RTAS interface
> da7a232fa6a44 spapr_rtas: add get-sensor-state RTAS interface
> 1c575d5b29688 spapr_rtas: add ibm,configure-connector RTAS interface
> 0c5d72833666c spapr_events: re-use EPOW event infrastructure for hotplug events
> 82ee5a9c88155 spapr_events: event-scan RTAS interface
> 
> If you can make the above set an independent patchset, it will become
> easy to maintain and post CPU and memory hotplug patchsets.

Hi Bharata,

I've submitted v4 of PCI hotplug. The development branch is here:

  https://github.com/mdroth/qemu/commits/spapr-hotplug-pci

and is based on top of a 'core' branch organized similarly to what you proposed:

  https://github.com/mdroth/qemu/commits/spapr-hotplug-core

I'll be rolling changes for core/pci code into the branches as we go.
The endian fixes you provided are included, and PCI hotplug has been
tested on ppc64le.

There's a pseries-2.3 machine type in the core patchset to enable/disable
dynamic-reconfiguration for individual resources on a per-machine basis to
maintain backward migration compatibility. There's a PHB hotplug patchset
based on core that might be a good reference for re-basing CPU/memory:

  https://github.com/mdroth/qemu/commits/spapr-hotplug-phb


> 
> I am facing some endian issues in your patchset and I will send fixes
> for those separately.
> 
> Regards,
> Bharata.


* Re: [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface
  2014-12-23 15:12               ` Michael Roth
@ 2015-01-01  6:35                 ` Bharata B Rao
  0 siblings, 0 replies; 69+ messages in thread
From: Bharata B Rao @ 2015-01-01  6:35 UTC (permalink / raw)
  To: Michael Roth
  Cc: aik, Alexander Graf, qemu-devel, ncmike, qemu-ppc, tyreld,
	Nathan Fontenot

On Tue, Dec 23, 2014 at 8:42 PM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
>
> Hi Bharata,
>
> I've submitted v4 of PCI hotplug. The development branch is here:
>
>   https://github.com/mdroth/qemu/commits/spapr-hotplug-pci
>
> and is based on top of a 'core' branch organized similarly to what you proposed:
>
>   https://github.com/mdroth/qemu/commits/spapr-hotplug-core
>
> I'll be rolling changes for core/pci code into the branches as we go.
> The endian fixes you provided are included, and PCI hotplug has been
> tested on ppc64le.
>
> There's a pseries-2.3 machine type in the core patchset to enable/disable
> dynamic-reconfiguration for individual resources on a per-machine basis to
> maintain backward migration compatibility. There's a PHB hotplug patchset
> based on core that might be a good reference for re-basing CPU/memory:
>
>   https://github.com/mdroth/qemu/commits/spapr-hotplug-phb

Thanks Michael for providing the 'core' branch. I shall be posting CPU
and memory hotplug patches based on your 'core' branch shortly.

Regards,
Bharata.


end of thread, other threads:[~2015-01-01  6:35 UTC | newest]

Thread overview: 69+ messages
2014-08-19  0:21 [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Michael Roth
2014-08-19  0:21 ` [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node Michael Roth
2014-08-26  7:55   ` Alexey Kardashevskiy
2014-08-26  8:24     ` Alexey Kardashevskiy
2014-08-26 15:25       ` Michael Roth
2014-08-26 15:41         ` Michael Roth
2014-08-29 18:27         ` Tyrel Datwyler
2014-08-29 23:15           ` Alexander Graf
2014-08-26 14:56     ` Michael Roth
2014-09-05  0:31     ` [Qemu-devel] [Qemu-ppc] " Tyrel Datwyler
2014-08-26 11:11   ` [Qemu-devel] " Alexander Graf
2014-08-26 16:47     ` Michael Roth
2014-08-26 17:16       ` Alexander Graf
2014-09-03  5:55   ` Bharata B Rao
2014-09-05 22:00   ` Tyrel Datwyler
2014-08-19  0:21 ` [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs Michael Roth
2014-08-26  8:32   ` Alexey Kardashevskiy
2014-08-26 17:16     ` Michael Roth
2014-08-26  9:09   ` Alexey Kardashevskiy
2014-08-26 17:52     ` Michael Roth
2014-08-26 11:29   ` Alexander Graf
2014-08-26 18:30     ` Michael Roth
2014-08-19  0:21 ` [Qemu-devel] [PATCH 03/12] spapr: add helper to retrieve a PHB/device DrcEntry Michael Roth
2014-08-19  0:21 ` [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface Michael Roth
2014-08-26 11:36   ` Alexander Graf
2014-09-05  2:55     ` Nathan Fontenot
2014-09-30 22:08     ` Michael Roth
2014-10-01 14:30       ` Alexander Graf
2014-11-26  4:51         ` Bharata B Rao
2014-11-26  4:54         ` Bharata B Rao
2014-11-26  6:27           ` Michael Roth
2014-12-01  4:57             ` Bharata B Rao
2014-12-23 15:12               ` Michael Roth
2015-01-01  6:35                 ` Bharata B Rao
2014-08-19  0:21 ` [Qemu-devel] [PATCH 05/12] spapr_pci: add get/set-power-level RTAS interfaces Michael Roth
2014-08-19  0:21 ` [Qemu-devel] [PATCH 06/12] spapr_pci: add get-sensor-state RTAS interface Michael Roth
2014-09-05  0:34   ` Tyrel Datwyler
2014-08-19  0:21 ` [Qemu-devel] [PATCH 07/12] spapr_pci: add ibm, configure-connector " Michael Roth
2014-08-26  9:12   ` Alexey Kardashevskiy
2014-09-05  3:03     ` Nathan Fontenot
2014-08-26 11:39   ` Alexander Graf
2014-08-19  0:21 ` [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions Michael Roth
2014-08-26  9:14   ` Alexey Kardashevskiy
2014-08-26 11:55     ` Peter Maydell
2014-08-26 18:34     ` Michael Roth
2014-08-26 11:41   ` Alexander Graf
2014-08-27 13:47   ` Michael S. Tsirkin
2014-08-28 21:21     ` Michael Roth
2014-08-28 21:33       ` Peter Maydell
2014-08-28 21:46         ` Michael S. Tsirkin
2014-08-19  0:21 ` [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations Michael Roth
2014-08-26  9:40   ` Alexey Kardashevskiy
2014-08-26 12:30   ` Alexander Graf
2014-09-03 10:33   ` Bharata B Rao
2014-09-03 23:03     ` Michael Roth
2014-09-04 15:08       ` Bharata B Rao
2014-09-04 16:12         ` Michael Roth
2014-09-04 16:34           ` Michael Roth
2014-09-05  3:10             ` Nathan Fontenot
2014-09-05 17:17               ` [Qemu-devel] [Qemu-ppc] " Tyrel Datwyler
2014-08-19  0:21 ` [Qemu-devel] [PATCH 10/12] spapr_events: re-use EPOW event infrastructure for hotplug events Michael Roth
2014-08-26  9:28   ` Alexey Kardashevskiy
2014-08-19  0:21 ` [Qemu-devel] [PATCH 11/12] spapr_events: event-scan RTAS interface Michael Roth
2014-08-26  9:30   ` Alexey Kardashevskiy
2014-08-29 18:43     ` Tyrel Datwyler
2014-08-19  0:21 ` [Qemu-devel] [PATCH 12/12] spapr_pci: emit hotplug add/remove events during hotplug Michael Roth
2014-08-26  9:35   ` Alexey Kardashevskiy
2014-08-26 12:36   ` Alexander Graf
2014-08-26  9:24 ` [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug Alexey Kardashevskiy
