All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
@ 2012-12-18 12:41 Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 01/30] [SeaBIOS] Add ACPI_EXTRACT_DEVICE* macros Vasilis Liaskovitis
                   ` (34 more replies)
  0 siblings, 35 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

This is v4 of the ACPI memory hotplug functionality. Only x86_64 target is
supported (both i440fx and q35). There are still several issues, but it's
been a while since v3 and I wanted to get some more feedback on the current
state of the patchseries.

Overview:

Dimm device layout is modeled with a normal qemu device:

"-device dimm,id=name,size=sz,node=pxm,populated=on|off,bus=membus.0"

The starting physical address for all dimms is calculated from top of memory,
during memory controller init, skipping the pci hole at [PCI_HOLE_START, 4G).
e.g.
"-device dimm,id=dimm0,size=512M,node=0,populated=off,bus=membus.0"
will define a 512M memory dimm belonging to numa node 0, on bus membus.0.

Because dimm layout needs to be configured on machine-boot, all dimm devices
need to be specified on startup command line (either with populated=on or with
populated=off). The dimm information is stored in dimm configuration structures.

After machine startup, dimms are hot-added or removed with normal device_add
and device_del operations e.g.:
Hot-add syntax: "device_add dimm,id=mydimm0,bus=membus.0"
Hot-remove syntax: "device_del dimm,id=mydimm0"

Changes v3->v4

- Dimms added with normal -device argument (extra -dimm arg dropped).
- multiple memory buses can be registered. Memory buses of the real hw/chipset
  or a paravirtual memory bus can be added.
- acpi implementation uses memory API instead of old ioports.
- Support for q35/ich9 added (still buggy, see patch 12/31).
- piix4/i440fx initialization code has been refactored to resemble q35. This
will allow memory map initialization at chipset qdev init time for both
machines, as well as more similar code.
- Hot-remove functionality has been moved to separate patches. Hot-remove no
longer frees memory but unmaps the dimm/qdev device from the guest's view.
Freeing the memory should happen when the last user unrefs/unmaps the memory,
see also (work in progress):
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg00728.html
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02697.html
- new qmp/hmp command for the state of each dimm (on/off)

Changes v2->v3

- qdev integration. Dimms are attached to a dimmbus. The dimmbus is a child
  of i440fx device in the pc machine. Hot-add and remove are done with normal
  device_add / device_del operations on the dimmbus. New commands "dimm_add" and
  "dimm_del" are obsolete.
- Add _PS3 method to allow OSPM-induced hot operations.
- pci-window calculation in Seabios takes dimms into account(for both 32-bit and
  64-bit windows)
- rename new qmp commands: query-memory-total and query-memory-hotplug
- balloon driver can see the hotplugged memory

Changes v1->v2

- memory map is automatically calculated for hotplug dimms. Dimms are added from
top-of-memory skipping the pci hole at [PCI_HOLE_START, 4G).
- Renamed from "-memslot" to "-dimm". Commands changed to "dimm_add", "dimm_del"
- Seabios ejection array reduced to a byte. Use extraction macros for dimm ssdt.
- additional SRAT paravirt info does not break previous SRAT fw_cfg layout.
- Documentation of new acpi_piix4 registers and paravirt data.
- add ACPI _OST support for _OST enabled guests. This allows qemu to receive
notification for success / failure of memory hot-add and hot-remove operations.
Guest needs to support _OST (https://lkml.org/lkml/2012/6/25/321)
- add monitor info command to report total guest memory (initial + hot-added)

Issues:

- hot-remove needs to only unmap the dimm device from guest's view. Freeing the
memory should happen when the last user of the device (e.g. virtio-blk) unrefs
the device. A testcase is needed for this.

- Live Migration: Ramblocks are migrated before qdev VMStates are migrated. So
the DimmDevice is handled diferrently than other devices. Should this be
reworked ?( DimmDevice structure currently does not define a VMStateDescription)
Live migration works as long as the dimm layout (command line args) are
identical at the source and destination qemu command line, and destination takes
into account hot-operations that have occured on source. (v3 patch 10/19
created the DimmDevice that corresponds to an unknown incoming ramblock, e.g.
for a dimm that was hot-added on source. but has been dropped for the moment). 

- A main blocker issue is windows guest functionality. The patchset does not
work for windows currently.  Testing on win2012 server RC or windows2008
consumer prerelease, when adding a DIMM, there is a BSOD with ACPI_BIOS_ERROR
message. After this, the VM keeps rebooting with ACPI_BIOS_ERROR. The windows
pnpmem driver obviosuly has a problem with the seabios dimm implementation
(or the seabios dimm implementation is not fully ACPI-compliant). If someone
can review the seabios patches or has any ideas to debug this, let me know.

- hot-operation notification lists need to be added to migration state.

series is based on:
- qemu master (commit a8a826a3) + patch:
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02699.html 
- seabios master (commit a810e4e7) 

Can also be found at:

http://github.com/vliaskov/qemu-kvm/commits/memhp-v4
http://github.com/vliaskov/seabios/commits/memhp-v4

Vasilis Liaskovitis (21):
  qapi: make visit_type_size fallback to type_int
  Add SIZE type to qdev properties
  qemu-option: export parse_option_number
  Implement dimm device abstraction
  vl: handle "-device dimm"
  acpi_piix4 : Implement memory device hotplug registers
  acpi_ich9 : Implement memory device hotplug registers
  piix_pci and pc_piix: refactor
  piix_pci: Add i440fx dram controller initialization
  q35: Add i440fx dram controller initialization
  pc: Add dimm paravirt SRAT info
  Introduce paravirt interface QEMU_CFG_PCI_WINDOW
  Implement "info memory-total" and "query-memory-total"
  balloon: update with hotplugged memory
  Implement dimm-info
  dimm: add hot-remove capability
  acpi_piix4: add hot-remove capability
  acpi_ich9: add hot-remove capability
  Implement qmp and hmp commands for notification lists
  Add _OST dimm support
  Implement _PS3 for dimm

 docs/specs/acpi_hotplug.txt |   54 ++++++
 docs/specs/fwcfg.txt        |   28 +++
 hmp-commands.hx             |    6 +
 hmp.c                       |   41 ++++
 hmp.h                       |    3 +
 hw/Makefile.objs            |    2 +-
 hw/acpi.h                   |    5 +
 hw/acpi_ich9.c              |  115 +++++++++++-
 hw/acpi_ich9.h              |   12 +-
 hw/acpi_piix4.c             |  126 ++++++++++++-
 hw/dimm.c                   |  444 +++++++++++++++++++++++++++++++++++++++++++
 hw/dimm.h                   |  102 ++++++++++
 hw/fw_cfg.h                 |    1 +
 hw/lpc_ich9.c               |    2 +-
 hw/pc.c                     |   28 +++-
 hw/pc.h                     |    1 +
 hw/pc_piix.c                |   74 ++++++--
 hw/pc_q35.c                 |   18 ++-
 hw/piix_pci.c               |  249 ++++++++-----------------
 hw/q35.c                    |   27 +++
 hw/q35.h                    |    5 +
 hw/qdev-properties.c        |   60 ++++++
 hw/qdev-properties.h        |    3 +
 hw/virtio-balloon.c         |   13 +-
 monitor.c                   |   21 ++
 qapi-schema.json            |   63 ++++++
 qapi/qapi-visit-core.c      |   11 +-
 qemu-option.c               |    4 +-
 qemu-option.h               |    4 +
 qmp-commands.hx             |   57 ++++++
 sysemu.h                    |    1 +
 vl.c                        |   60 ++++++
 32 files changed, 1432 insertions(+), 208 deletions(-)
 create mode 100644 docs/specs/acpi_hotplug.txt
 create mode 100644 docs/specs/fwcfg.txt
 create mode 100644 hw/dimm.c
 create mode 100644 hw/dimm.h


Vasilis Liaskovitis (9):
  Add ACPI_EXTRACT_DEVICE* macros
  Add SSDT memory device support
  acpi-dsdt: Implement functions for memory hotplug
  acpi: generate hotplug memory devices
  q35: Add memory hotplug handler
  pci: Use paravirt interface for pcimem_start and pcimem64_start
  acpi: add _EJ0 operation and eject port for memory devices
  Add _OST dimm method
  Implement _PS3 method for memory device

 Makefile                      |    2 +-
 src/acpi-dsdt-mem-hotplug.dsl |  136 +++++++++++++++++++++++++++++++++++
 src/acpi-dsdt.dsl             |    5 +-
 src/acpi.c                    |  158 +++++++++++++++++++++++++++++++++++++++--
 src/paravirt.c                |    6 ++
 src/paravirt.h                |    2 +
 src/pciinit.c                 |    9 +++
 src/q35-acpi-dsdt.dsl         |    6 +-
 src/ssdt-mem.dsl              |   73 +++++++++++++++++++
 tools/acpi_extract.py         |   28 +++++++
 10 files changed, 415 insertions(+), 10 deletions(-)
 create mode 100644 src/acpi-dsdt-mem-hotplug.dsl
 create mode 100644 src/ssdt-mem.dsl

-- 
1.7.9

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 01/30] [SeaBIOS] Add ACPI_EXTRACT_DEVICE* macros
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2013-03-20  3:28   ` li guang
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 02/30] [SeaBIOS] Add SSDT memory device support Vasilis Liaskovitis
                   ` (33 subsequent siblings)
  34 siblings, 1 reply; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

This allows to extract the beginning, end and name of a Device object.
---
 tools/acpi_extract.py |   28 ++++++++++++++++++++++++++++
 1 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/tools/acpi_extract.py b/tools/acpi_extract.py
index 3295678..3191f53 100755
--- a/tools/acpi_extract.py
+++ b/tools/acpi_extract.py
@@ -217,6 +217,28 @@ def aml_package_start(offset):
     offset += 1
     return offset + aml_pkglen_bytes(offset) + 1
 
+def aml_device_start(offset):
+    #0x5B 0x82 DeviceOp PkgLength NameString ProcID
+    if ((aml[offset] != 0x5B) or (aml[offset + 1] != 0x82)):
+        die( "Name offset 0x%x: expected 0x5B 0x83 actual 0x%x 0x%x" %
+             (offset, aml[offset], aml[offset + 1]));
+    return offset
+
+def aml_device_string(offset):
+    #0x5B 0x82 DeviceOp PkgLength NameString ProcID
+    start = aml_device_start(offset)
+    offset += 2
+    pkglenbytes = aml_pkglen_bytes(offset)
+    offset += pkglenbytes
+    return offset
+
+def aml_device_end(offset):
+    start = aml_device_start(offset)
+    offset += 2
+    pkglenbytes = aml_pkglen_bytes(offset)
+    pkglen = aml_pkglen(offset)
+    return offset + pkglen
+
 lineno = 0
 for line in fileinput.input():
     # Strip trailing newline
@@ -307,6 +329,12 @@ for i in range(len(asl)):
         offset = aml_processor_end(offset)
     elif (directive == "ACPI_EXTRACT_PKG_START"):
         offset = aml_package_start(offset)
+    elif (directive == "ACPI_EXTRACT_DEVICE_START"):
+        offset = aml_device_start(offset)
+    elif (directive == "ACPI_EXTRACT_DEVICE_STRING"):
+        offset = aml_device_string(offset)
+    elif (directive == "ACPI_EXTRACT_DEVICE_END"):
+        offset = aml_device_end(offset)
     else:
         die("Unsupported directive %s" % directive)
 
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 02/30] [SeaBIOS] Add SSDT memory device support
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 01/30] [SeaBIOS] Add ACPI_EXTRACT_DEVICE* macros Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 03/30] [SeaBIOS] acpi-dsdt: Implement functions for memory hotplug Vasilis Liaskovitis
                   ` (32 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Define SSDT hotplug-able memory devices in _SB namespace. The dynamically
generated SSDT includes per memory device hotplug methods. These methods
just call methods defined in the DSDT. Also dynamically generate a MTFY
method and a MEON array of the online/available memory devices.  ACPI
extraction macros are used to place the AML code in variables later used by
src/acpi. The design is taken from SSDT cpu generation.

v3->v4: EJ0 operation will be provided separately
---
 Makefile         |    2 +-
 src/ssdt-mem.dsl |   62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+), 1 deletions(-)
 create mode 100644 src/ssdt-mem.dsl

diff --git a/Makefile b/Makefile
index f28d86c..c8fcc57 100644
--- a/Makefile
+++ b/Makefile
@@ -220,7 +220,7 @@ $(OUT)%.hex: src/%.dsl ./tools/acpi_extract_preprocess.py ./tools/acpi_extract.p
 	$(Q)$(PYTHON) ./tools/acpi_extract.py $(OUT)$*.lst > $(OUT)$*.off
 	$(Q)cat $(OUT)$*.off > $@
 
-$(OUT)acpi.o: $(OUT)acpi-dsdt.hex $(OUT)ssdt-proc.hex $(OUT)ssdt-pcihp.hex $(OUT)ssdt-susp.hex $(OUT)q35-acpi-dsdt.hex
+$(OUT)acpi.o: $(OUT)acpi-dsdt.hex $(OUT)ssdt-proc.hex $(OUT)ssdt-pcihp.hex $(OUT)ssdt-susp.hex $(OUT)q35-acpi-dsdt.hex $(OUT)ssdt-mem.hex
 
 ################ Kconfig rules
 
diff --git a/src/ssdt-mem.dsl b/src/ssdt-mem.dsl
new file mode 100644
index 0000000..dbac33f
--- /dev/null
+++ b/src/ssdt-mem.dsl
@@ -0,0 +1,62 @@
+/* This file is the basis for the ssdt_mem[] variable in src/acpi.c.
+ * It is similar in design to the ssdt_proc variable.
+ * It defines the contents of the per-dimm QWordMemory() object.  At
+ * runtime, a dynamically generated SSDT will contain one copy of this
+ * AML snippet for every possible memory device in the system.  The
+ * objects will * be placed in the \_SB_ namespace.
+ *
+ * In addition to the aml code generated from this file, the
+ * src/acpi.c file creates a MTFY method with an entry for each memdevice:
+ *     Method(MTFY, 2) {
+ *         If (LEqual(Arg0, 0x00)) { Notify(MP00, Arg1) }
+ *         If (LEqual(Arg0, 0x01)) { Notify(MP01, Arg1) }
+ *         ...
+ *     }
+ * and a MEON array with the list of active and inactive memory devices:
+ *     Name(MEON, Package() { One, One, ..., Zero, Zero, ... })
+ */
+ACPI_EXTRACT_ALL_CODE ssdm_mem_aml
+
+DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", "CSSDT", 0x1)
+/*  v------------------ DO NOT EDIT ------------------v */
+{
+    ACPI_EXTRACT_DEVICE_START ssdt_mem_start
+    ACPI_EXTRACT_DEVICE_END ssdt_mem_end
+    ACPI_EXTRACT_DEVICE_STRING ssdt_mem_name
+    Device(MPAA) {
+        ACPI_EXTRACT_NAME_BYTE_CONST ssdt_mem_id
+        Name(ID, 0xAA)
+/*  ^------------------ DO NOT EDIT ------------------^
+ *
+ * The src/acpi.c code requires the above layout so that it can update
+ * MPAA and 0xAA with the appropriate MEMDEVICE id (see
+ * SD_OFFSET_MEMHEX/MEMID1/MEMID2).  Don't change the above without
+ * also updating the C code.
+ */
+        Name(_HID, EISAID("PNP0C80"))
+        Name(_PXM, 0xAA)
+
+        External(CMST, MethodObj)
+        External(MPEJ, MethodObj)
+
+        Name(_CRS, ResourceTemplate() {
+            QwordMemory(
+               ResourceConsumer,
+               ,
+               MinFixed,
+               MaxFixed,
+               Cacheable,
+               ReadWrite,
+               0x0,
+               0xDEADBEEF,
+               0xE6ADBEEE,
+               0x00000000,
+               0x08000000,
+               )
+        })
+        Method (_STA, 0) {
+            Return(CMST(ID))        
+        }    
+    }
+}    
+
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 03/30] [SeaBIOS] acpi-dsdt: Implement functions for memory hotplug
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 01/30] [SeaBIOS] Add ACPI_EXTRACT_DEVICE* macros Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 02/30] [SeaBIOS] Add SSDT memory device support Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 04/30] [SeaBIOS] acpi: generate hotplug memory devices Vasilis Liaskovitis
                   ` (31 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Extend the DSDT to include methods for handling memory hot-add and hot-remove
notifications and memory device status requests. These functions are called
from the memory device SSDT methods.
---
 src/acpi-dsdt-mem-hotplug.dsl |   57 +++++++++++++++++++++++++++++++++++++++++
 src/acpi-dsdt.dsl             |    5 +++-
 2 files changed, 61 insertions(+), 1 deletions(-)
 create mode 100644 src/acpi-dsdt-mem-hotplug.dsl

diff --git a/src/acpi-dsdt-mem-hotplug.dsl b/src/acpi-dsdt-mem-hotplug.dsl
new file mode 100644
index 0000000..0e7ced3
--- /dev/null
+++ b/src/acpi-dsdt-mem-hotplug.dsl
@@ -0,0 +1,57 @@
+/****************************************************************
+ * Memory hotplug
+ ****************************************************************/
+
+Scope(\_SB) {
+        /* Objects filled in by run-time generated SSDT */
+        External(MTFY, MethodObj)
+        External(MEON, PkgObj)
+
+        Method (CMST, 1, NotSerialized) {
+            // _STA method - return ON status of memdevice
+            // Local0 = MEON flag for this cpu
+            Store(DerefOf(Index(MEON, Arg0)), Local0)
+            If (Local0) { Return(0xF) } Else { Return(0x0) }
+        }
+
+        /* Memory hotplug notify array */
+        OperationRegion(MEST, SystemIO, 0xaf80, 32)
+        Field (MEST, ByteAcc, NoLock, Preserve)
+        {
+            MES, 256
+        }
+ 
+        Method(MESC, 0) {
+            // Local5 = active memdevice bitmap
+            Store (MES, Local5)
+            // Local2 = last read byte from bitmap
+            Store (Zero, Local2)
+            // Local0 = memory device iterator
+            Store (Zero, Local0)
+            While (LLess(Local0, SizeOf(MEON))) {
+                // Local1 = MEON flag for this memory device
+                Store(DerefOf(Index(MEON, Local0)), Local1)
+                If (And(Local0, 0x07)) {
+                    // Shift down previously read bitmap byte
+                    ShiftRight(Local2, 1, Local2)
+                } Else {
+                    // Read next byte from memdevice bitmap
+                    Store(DerefOf(Index(Local5, ShiftRight(Local0, 3))), Local2)
+                }
+                // Local3 = active state for this memory device
+                Store(And(Local2, 1), Local3)
+
+                If (LNotEqual(Local1, Local3)) {
+                    // State change - update MEON with new state
+                    Store(Local3, Index(MEON, Local0))
+                    // Do MEM notify
+                    If (LEqual(Local3, 1)) {
+                        MTFY(Local0, 1)
+                    }
+                }
+                Increment(Local0)
+            }
+            Return(One)
+        }
+
+}
diff --git a/src/acpi-dsdt.dsl b/src/acpi-dsdt.dsl
index 158f6b4..98c9413 100644
--- a/src/acpi-dsdt.dsl
+++ b/src/acpi-dsdt.dsl
@@ -294,6 +294,7 @@ DefinitionBlock (
     }
 
 #include "acpi-dsdt-cpu-hotplug.dsl"
+#include "acpi-dsdt-mem-hotplug.dsl"
 
 
 /****************************************************************
@@ -313,7 +314,9 @@ DefinitionBlock (
             // CPU hotplug event
             \_SB.PRSC()
         }
-        Method(_L03) {
+        Method(_E03) {
+            // Memory hotplug event
+            \_SB.MESC()
         }
         Method(_L04) {
         }
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 04/30] [SeaBIOS] acpi: generate hotplug memory devices
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (2 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 03/30] [SeaBIOS] acpi-dsdt: Implement functions for memory hotplug Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 05/30] [SeaBIOS] q35: Add memory hotplug handler Vasilis Liaskovitis
                   ` (30 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

The memory device generation is guided by qemu paravirt info. Seabios
first uses the info to setup SRAT entries for the hotplug-able memory slots.
Afterwards, build_memssdt uses the created SRAT entries to generate
appropriate memory device objects. One memory device (and corresponding SRAT
entry) is generated for each hotplug-able qemu memslot. Currently no SSDT
memory device is created for initial system memory.

We only support up to 255 DIMMs for now (PackageOp used for the MEON array can
only describe an array of at most 255 elements. VarPackageOp would be needed to
support more than 255 devices)

v1->v2:
Seabios reads mems_sts from qemu to build e820_map
SSDT size and some offsets are calculated with extraction macros.
---
 src/acpi.c |  158 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 152 insertions(+), 6 deletions(-)

diff --git a/src/acpi.c b/src/acpi.c
index 6267d7b..82231da 100644
--- a/src/acpi.c
+++ b/src/acpi.c
@@ -14,6 +14,7 @@
 #include "ioport.h" // inl
 #include "paravirt.h" // qemu_cfg_irq0_override
 #include "dev-q35.h" // qemu_cfg_irq0_override
+#include "memmap.h"
 
 /****************************************************/
 /* ACPI tables init */
@@ -446,11 +447,26 @@ encodeLen(u8 *ssdt_ptr, int length, int bytes)
 #define PCIHP_AML (ssdp_pcihp_aml + *ssdt_pcihp_start)
 #define PCI_SLOTS 32
 
+/* 0x5B 0x82 DeviceOp PkgLength NameString DimmID */
+#define MEM_BASE 0xaf80
+#define MEM_AML (ssdm_mem_aml + *ssdt_mem_start)
+#define MEM_SIZEOF (*ssdt_mem_end - *ssdt_mem_start)
+#define MEM_OFFSET_HEX (*ssdt_mem_name - *ssdt_mem_start + 2)
+#define MEM_OFFSET_ID (*ssdt_mem_id - *ssdt_mem_start)
+#define MEM_OFFSET_PXM 31
+#define MEM_OFFSET_START 55
+#define MEM_OFFSET_END   63
+#define MEM_OFFSET_SIZE  79
+
+u64 nb_hp_memslots = 0;
+struct srat_memory_affinity *mem;
+
 #define SSDT_SIGNATURE 0x54445353 // SSDT
 #define SSDT_HEADER_LENGTH 36
 
 #include "ssdt-susp.hex"
 #include "ssdt-pcihp.hex"
+#include "ssdt-mem.hex"
 
 #define PCI_RMV_BASE 0xae0c
 
@@ -502,6 +518,111 @@ static void patch_pcihp(int slot, u8 *ssdt_ptr, u32 eject)
     }
 }
 
+static void build_memdev(u8 *ssdt_ptr, int i, u64 mem_base, u64 mem_len, u8 node)
+{
+    memcpy(ssdt_ptr, MEM_AML, MEM_SIZEOF);
+    ssdt_ptr[MEM_OFFSET_HEX] = getHex(i >> 4);
+    ssdt_ptr[MEM_OFFSET_HEX+1] = getHex(i);
+    ssdt_ptr[MEM_OFFSET_ID] = i;
+    ssdt_ptr[MEM_OFFSET_PXM] = node;
+    *(u64*)(ssdt_ptr + MEM_OFFSET_START) = mem_base;
+    *(u64*)(ssdt_ptr + MEM_OFFSET_END) = mem_base + mem_len;
+    *(u64*)(ssdt_ptr + MEM_OFFSET_SIZE) = mem_len;
+}
+
+static void*
+build_memssdt(void)
+{
+    u64 mem_base;
+    u64 mem_len;
+    u8  node;
+    int i;
+    struct srat_memory_affinity *entry = mem;
+    u64 nb_memdevs = nb_hp_memslots;
+    u8  memslot_status, enabled;
+
+    int length = ((1+3+4)
+                  + (nb_memdevs * MEM_SIZEOF)
+                  + (1+2+5+(12*nb_memdevs))
+                  + (6+2+1+(1*nb_memdevs)));
+    u8 *ssdt = malloc_high(sizeof(struct acpi_table_header) + length);
+    if (! ssdt) {
+        warn_noalloc();
+        return NULL;
+    }
+    u8 *ssdt_ptr = ssdt + sizeof(struct acpi_table_header);
+
+    // build Scope(_SB_) header
+    *(ssdt_ptr++) = 0x10; // ScopeOp
+    ssdt_ptr = encodeLen(ssdt_ptr, length-1, 3);
+    *(ssdt_ptr++) = '_';
+    *(ssdt_ptr++) = 'S';
+    *(ssdt_ptr++) = 'B';
+    *(ssdt_ptr++) = '_';
+
+    for (i = 0; i < nb_memdevs; i++) {
+        mem_base = (((u64)(entry->base_addr_high) << 32 )| entry->base_addr_low);
+        mem_len = (((u64)(entry->length_high) << 32 )| entry->length_low);
+        node = entry->proximity[0];
+        build_memdev(ssdt_ptr, i, mem_base, mem_len, node);
+        ssdt_ptr += MEM_SIZEOF;
+        entry++;
+    }
+
+    // build "Method(MTFY, 2) {If (LEqual(Arg0, 0x00)) {Notify(CM00, Arg1)} ...}"
+    *(ssdt_ptr++) = 0x14; // MethodOp
+    ssdt_ptr = encodeLen(ssdt_ptr, 2+5+(12*nb_memdevs), 2);
+    *(ssdt_ptr++) = 'M';
+    *(ssdt_ptr++) = 'T';
+    *(ssdt_ptr++) = 'F';
+    *(ssdt_ptr++) = 'Y';
+    *(ssdt_ptr++) = 0x02;
+    for (i=0; i<nb_memdevs; i++) {
+        *(ssdt_ptr++) = 0xA0; // IfOp
+       ssdt_ptr = encodeLen(ssdt_ptr, 11, 1);
+        *(ssdt_ptr++) = 0x93; // LEqualOp
+        *(ssdt_ptr++) = 0x68; // Arg0Op
+        *(ssdt_ptr++) = 0x0A; // BytePrefix
+        *(ssdt_ptr++) = i;
+        *(ssdt_ptr++) = 0x86; // NotifyOp
+        *(ssdt_ptr++) = 'M';
+        *(ssdt_ptr++) = 'P';
+        *(ssdt_ptr++) = getHex(i >> 4);
+        *(ssdt_ptr++) = getHex(i);
+        *(ssdt_ptr++) = 0x69; // Arg1Op
+    }
+
+    // build "Name(MEON, Package() { One, One, ..., Zero, Zero, ... })"
+    *(ssdt_ptr++) = 0x08; // NameOp
+    *(ssdt_ptr++) = 'M';
+    *(ssdt_ptr++) = 'E';
+    *(ssdt_ptr++) = 'O';
+    *(ssdt_ptr++) = 'N';
+    *(ssdt_ptr++) = 0x12; // PackageOp
+    ssdt_ptr = encodeLen(ssdt_ptr, 2+1+(1*nb_memdevs), 2);
+    *(ssdt_ptr++) = nb_memdevs;
+
+    entry = mem;
+    memslot_status = 0;
+
+    for (i = 0; i < nb_memdevs; i++) {
+        enabled = 0;
+        if (i % 8 == 0)
+            memslot_status = inb(MEM_BASE + i/8);
+        enabled = memslot_status & 1;
+        mem_base = (((u64)(entry->base_addr_high) << 32 )| entry->base_addr_low);
+        mem_len = (((u64)(entry->length_high) << 32 )| entry->length_low);
+        *(ssdt_ptr++) = enabled ? 0x01 : 0x00;
+        if (enabled)
+            add_e820(mem_base, mem_len, E820_RAM);
+        memslot_status = memslot_status >> 1;
+        entry++;
+    }
+    build_header((void*)ssdt, SSDT_SIGNATURE, ssdt_ptr - ssdt, 1);
+
+    return ssdt;
+}
+
 static void*
 build_ssdt(void)
 {
@@ -674,9 +795,6 @@ build_srat(void)
 {
     int nb_numa_nodes = qemu_cfg_get_numa_nodes();
 
-    if (nb_numa_nodes == 0)
-        return NULL;
-
     u64 *numadata = malloc_tmphigh(sizeof(u64) * (MaxCountCPUs + nb_numa_nodes));
     if (!numadata) {
         warn_noalloc();
@@ -685,10 +803,11 @@ build_srat(void)
 
     qemu_cfg_get_numa_data(numadata, MaxCountCPUs + nb_numa_nodes);
 
+    qemu_cfg_get_numa_data(&nb_hp_memslots, 1);
     struct system_resource_affinity_table *srat;
     int srat_size = sizeof(*srat) +
         sizeof(struct srat_processor_affinity) * MaxCountCPUs +
-        sizeof(struct srat_memory_affinity) * (nb_numa_nodes + 2);
+        sizeof(struct srat_memory_affinity) * (nb_numa_nodes + nb_hp_memslots + 2);
 
     srat = malloc_high(srat_size);
     if (!srat) {
@@ -723,7 +842,7 @@ build_srat(void)
      * from 640k-1M and possibly another one from 3.5G-4G.
      */
     struct srat_memory_affinity *numamem = (void*)core;
-    int slots = 0;
+    int slots = 0, node;
     u64 mem_len, mem_base, next_base = 0;
 
     acpi_build_srat_memory(numamem, 0, 640*1024, 0, 1);
@@ -750,10 +869,36 @@ build_srat(void)
             next_base += (1ULL << 32) - RamSize;
         }
         acpi_build_srat_memory(numamem, mem_base, mem_len, i-1, 1);
+
         numamem++;
         slots++;
+
     }
-    for (; slots < nb_numa_nodes + 2; slots++) {
+    mem = (void*)numamem;
+
+    if (nb_hp_memslots) {
+        u64 *hpmemdata = malloc_tmphigh(sizeof(u64) * (3 * nb_hp_memslots));
+        if (!hpmemdata) {
+            warn_noalloc();
+            free(hpmemdata);
+            free(numadata);
+            return NULL;
+        }
+
+        qemu_cfg_get_numa_data(hpmemdata, 3 * nb_hp_memslots);
+
+        for (i = 1; i < nb_hp_memslots + 1; ++i) {
+            mem_base = *hpmemdata++;
+            mem_len = *hpmemdata++;
+            node = *hpmemdata++;
+            acpi_build_srat_memory(numamem, mem_base, mem_len, node, 1);
+            numamem++;
+            slots++;
+        }
+        free(hpmemdata);
+    }
+
+    for (; slots < nb_numa_nodes + nb_hp_memslots + 2; slots++) {
         acpi_build_srat_memory(numamem, 0, 0, 0, 0);
         numamem++;
     }
@@ -825,6 +970,7 @@ acpi_bios_init(void)
     ACPI_INIT_TABLE(build_madt());
     ACPI_INIT_TABLE(build_hpet());
     ACPI_INIT_TABLE(build_srat());
+    ACPI_INIT_TABLE(build_memssdt());
     if (pci->device == PCI_DEVICE_ID_INTEL_ICH9_LPC)
         ACPI_INIT_TABLE(build_mcfg_q35());
 
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 05/30] [SeaBIOS] q35: Add memory hotplug handler
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (3 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 04/30] [SeaBIOS] acpi: generate hotplug memory devices Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 06/30] qapi: make visit_type_size fallback to type_int Vasilis Liaskovitis
                   ` (29 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

---
 src/q35-acpi-dsdt.dsl |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/q35-acpi-dsdt.dsl b/src/q35-acpi-dsdt.dsl
index c031d83..5b28d72 100644
--- a/src/q35-acpi-dsdt.dsl
+++ b/src/q35-acpi-dsdt.dsl
@@ -403,7 +403,7 @@ DefinitionBlock (
     }
 
 #include "acpi-dsdt-cpu-hotplug.dsl"
-
+#include "acpi-dsdt-mem-hotplug.dsl"
 
 /****************************************************************
  * General purpose events
@@ -418,7 +418,9 @@ DefinitionBlock (
             // CPU hotplug event
             \_SB.PRSC()
         }
-        Method(_L02) {
+        Method(_E02) {
+            // Memory hotplug event
+            \_SB.MESC()
         }
         Method(_L03) {
         }
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 06/30] qapi: make visit_type_size fallback to type_int
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (4 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 05/30] [SeaBIOS] q35: Add memory hotplug handler Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2013-01-09  0:18   ` Andreas Färber
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 07/30] Add SIZE type to qdev properties Vasilis Liaskovitis
                   ` (28 subsequent siblings)
  34 siblings, 1 reply; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Currently visit_type_size checks if the visitor's type_size function pointer is
NULL. If not, it calls it, otherwise it calls v->type_uint64(). But neither of
these pointers are ever set. Fallback to calling v->type_int() in this third
(default) case.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 qapi/qapi-visit-core.c |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index 7a82b63..497e693 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -236,8 +236,17 @@ void visit_type_int64(Visitor *v, int64_t *obj, const char *name, Error **errp)
 
 void visit_type_size(Visitor *v, uint64_t *obj, const char *name, Error **errp)
 {
+    int64_t value;
     if (!error_is_set(errp)) {
-        (v->type_size ? v->type_size : v->type_uint64)(v, obj, name, errp);
+        if (v->type_size) {
+            v->type_size(v, obj, name, errp);
+        } else if (v->type_uint64) {
+            v->type_uint64(v, obj, name, errp);
+        } else {
+            value = *obj;
+            v->type_int(v, &value, name, errp);
+            *obj = value;
+        }
     }
 }
 
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 07/30] Add SIZE type to qdev properties
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (5 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 06/30] qapi: make visit_type_size fallback to type_int Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2013-03-20  6:06   ` li guang
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 08/30] qemu-option: export parse_option_number Vasilis Liaskovitis
                   ` (27 subsequent siblings)
  34 siblings, 1 reply; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

This patch adds a 'SIZE' type property to qdev.

It will make dimm description more convenient by allowing sizes to be specified
with K,M,G,T prefixes instead of number of bytes e.g.:
-device dimm,id=mem0,size=2G,bus=membus.0

Credits go to Ian Molton for original patch. See:
http://patchwork.ozlabs.org/patch/38835/

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 hw/qdev-properties.c |   60 ++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/qdev-properties.h |    3 ++
 qemu-option.c        |    2 +-
 qemu-option.h        |    2 +
 4 files changed, 66 insertions(+), 1 deletions(-)

diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index 81d901c..a77f760 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -1279,3 +1279,63 @@ void qemu_add_globals(void)
 {
     qemu_opts_foreach(qemu_find_opts("global"), qdev_add_one_global, NULL, 0);
 }
+
+/* --- 64bit unsigned int 'size' type --- */
+
+static void get_size(Object *obj, Visitor *v, void *opaque,
+                       const char *name, Error **errp)
+{
+    DeviceState *dev = DEVICE(obj);
+    Property *prop = opaque;
+    uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
+
+    visit_type_size(v, ptr, name, errp);
+}
+
+static void set_size(Object *obj, Visitor *v, void *opaque,
+                       const char *name, Error **errp)
+{
+    DeviceState *dev = DEVICE(obj);
+    Property *prop = opaque;
+    uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
+
+    if (dev->state != DEV_STATE_CREATED) {
+        error_set(errp, QERR_PERMISSION_DENIED);
+        return;
+    }
+
+    visit_type_size(v, ptr, name, errp);
+}
+
+static int parse_size(DeviceState *dev, Property *prop, const char *str)
+{
+    uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
+    Error *errp = NULL;
+
+    if (str != NULL) {
+        parse_option_size(prop->name, str, ptr, &errp);
+    }
+    assert_no_error(errp);
+    return 0;
+}
+
+static int print_size(DeviceState *dev, Property *prop, char *dest, size_t len)
+{
+    uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
+    char suffixes[] = {'T', 'G', 'M', 'K', 'B'};
+    int i = 0;
+    uint64_t div;
+
+    for (div = (long int)1 << 40; !(*ptr / div) ; div >>= 10) {
+        i++;
+    }
+    return snprintf(dest, len, "%0.03f%c", (double)*ptr/div, suffixes[i]);
+}
+
+PropertyInfo qdev_prop_size = {
+    .name  = "size",
+    .parse = parse_size,
+    .print = print_size,
+    .get = get_size,
+    .set = set_size,
+};
diff --git a/hw/qdev-properties.h b/hw/qdev-properties.h
index 5b046ab..0182bef 100644
--- a/hw/qdev-properties.h
+++ b/hw/qdev-properties.h
@@ -14,6 +14,7 @@ extern PropertyInfo qdev_prop_uint64;
 extern PropertyInfo qdev_prop_hex8;
 extern PropertyInfo qdev_prop_hex32;
 extern PropertyInfo qdev_prop_hex64;
+extern PropertyInfo qdev_prop_size;
 extern PropertyInfo qdev_prop_string;
 extern PropertyInfo qdev_prop_chr;
 extern PropertyInfo qdev_prop_ptr;
@@ -67,6 +68,8 @@ extern PropertyInfo qdev_prop_pci_host_devaddr;
     DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_hex32, uint32_t)
 #define DEFINE_PROP_HEX64(_n, _s, _f, _d)                       \
     DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_hex64, uint64_t)
+#define DEFINE_PROP_SIZE(_n, _s, _f, _d)                       \
+    DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_size, uint64_t)
 #define DEFINE_PROP_PCI_DEVFN(_n, _s, _f, _d)                   \
     DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_pci_devfn, int32_t)
 
diff --git a/qemu-option.c b/qemu-option.c
index 27891e7..38e0a11 100644
--- a/qemu-option.c
+++ b/qemu-option.c
@@ -203,7 +203,7 @@ static void parse_option_number(const char *name, const char *value,
     }
 }
 
-static void parse_option_size(const char *name, const char *value,
+void parse_option_size(const char *name, const char *value,
                               uint64_t *ret, Error **errp)
 {
     char *postfix;
diff --git a/qemu-option.h b/qemu-option.h
index ca72986..b8ee5b3 100644
--- a/qemu-option.h
+++ b/qemu-option.h
@@ -152,5 +152,7 @@ typedef int (*qemu_opts_loopfunc)(QemuOpts *opts, void *opaque);
 int qemu_opts_print(QemuOpts *opts, void *dummy);
 int qemu_opts_foreach(QemuOptsList *list, qemu_opts_loopfunc func, void *opaque,
                       int abort_on_failure);
+void parse_option_size(const char *name, const char *value,
+                              uint64_t *ret, Error **errp);
 
 #endif
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 08/30] qemu-option: export parse_option_number
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (6 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 07/30] Add SIZE type to qdev properties Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 09/30] Implement dimm device abstraction Vasilis Liaskovitis
                   ` (26 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 qemu-option.c |    2 +-
 qemu-option.h |    2 ++
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/qemu-option.c b/qemu-option.c
index 38e0a11..88fd370 100644
--- a/qemu-option.c
+++ b/qemu-option.c
@@ -185,7 +185,7 @@ static void parse_option_bool(const char *name, const char *value, bool *ret,
     }
 }
 
-static void parse_option_number(const char *name, const char *value,
+void parse_option_number(const char *name, const char *value,
                                 uint64_t *ret, Error **errp)
 {
     char *postfix;
diff --git a/qemu-option.h b/qemu-option.h
index b8ee5b3..8b7235f 100644
--- a/qemu-option.h
+++ b/qemu-option.h
@@ -154,5 +154,7 @@ int qemu_opts_foreach(QemuOptsList *list, qemu_opts_loopfunc func, void *opaque,
                       int abort_on_failure);
 void parse_option_size(const char *name, const char *value,
                               uint64_t *ret, Error **errp);
+void parse_option_number(const char *name, const char *value,
+                                uint64_t *ret, Error **errp);
 
 #endif
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 09/30] Implement dimm device abstraction
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (7 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 08/30] qemu-option: export parse_option_number Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2013-03-26  3:51   ` li guang
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 10/30] vl: handle "-device dimm" Vasilis Liaskovitis
                   ` (25 subsequent siblings)
  34 siblings, 1 reply; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Each hotplug-able memory slot is a DimmDevice. All DimmDevices are attached
to a new bus called DimmBus. This bus is introduced so that we no longer
depend on hotplug-capability of main system bus (the main bus does not allow
hotplugging). The DimmBus should be attached to a chipset Device (i440fx in case
of the pc)

A hot-add operation for a particular dimm:
- creates a new DimmDevice and attaches it to the DimmBus
- creates a new MemoryRegion of the given physical address offset, size and
node proximity, and attaches it to main system memory as a sub_region.

Hotplug operations are done through normal device_add commands.
Also add properties to DimmDevice.

v3->v4: Removed hot-remove functions. Will be offered in separate patches.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 hw/Makefile.objs |    2 +-
 hw/dimm.c        |  245 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/dimm.h        |   89 ++++++++++++++++++++
 3 files changed, 335 insertions(+), 1 deletions(-)
 create mode 100644 hw/dimm.c
 create mode 100644 hw/dimm.h

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index d581d8d..51494c9 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -29,7 +29,7 @@ common-obj-$(CONFIG_I8254) += i8254_common.o i8254.o
 common-obj-$(CONFIG_PCSPK) += pcspk.o
 common-obj-$(CONFIG_PCKBD) += pckbd.o
 common-obj-$(CONFIG_FDC) += fdc.o
-common-obj-$(CONFIG_ACPI) += acpi.o acpi_piix4.o acpi_ich9.o smbus_ich9.o
+common-obj-$(CONFIG_ACPI) += acpi.o acpi_piix4.o acpi_ich9.o smbus_ich9.o dimm.o
 common-obj-$(CONFIG_APM) += pm_smbus.o apm.o
 common-obj-$(CONFIG_DMA) += dma.o
 common-obj-$(CONFIG_I82374) += i82374.o
diff --git a/hw/dimm.c b/hw/dimm.c
new file mode 100644
index 0000000..e384952
--- /dev/null
+++ b/hw/dimm.c
@@ -0,0 +1,245 @@
+/*
+ * Dimm device for Memory Hotplug
+ *
+ * Copyright ProfitBricks GmbH 2012
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "trace.h"
+#include "qdev.h"
+#include "dimm.h"
+#include <time.h>
+#include "../exec-memory.h"
+#include "qmp-commands.h"
+
+/* the following list is used to hold dimm config info before machine
+ * is initialized. After machine init, the list is not used anymore.*/
+static DimmConfiglist dimmconfig_list =
+       QTAILQ_HEAD_INITIALIZER(dimmconfig_list);
+
+/* the list of memory buses */
+static QLIST_HEAD(, DimmBus) memory_buses;
+
+static void dimmbus_dev_print(Monitor *mon, DeviceState *dev, int indent);
+static char *dimmbus_get_fw_dev_path(DeviceState *dev);
+
+static Property dimm_properties[] = {
+    DEFINE_PROP_UINT64("start", DimmDevice, start, 0),
+    DEFINE_PROP_SIZE("size", DimmDevice, size, DEFAULT_DIMMSIZE),
+    DEFINE_PROP_UINT32("node", DimmDevice, node, 0),
+    DEFINE_PROP_BIT("populated", DimmDevice, populated, 0, false),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void dimmbus_dev_print(Monitor *mon, DeviceState *dev, int indent)
+{
+}
+
+static char *dimmbus_get_fw_dev_path(DeviceState *dev)
+{
+    char path[40];
+
+    snprintf(path, sizeof(path), "%s", qdev_fw_name(dev));
+    return strdup(path);
+}
+
+static void dimm_bus_class_init(ObjectClass *klass, void *data)
+{
+    BusClass *k = BUS_CLASS(klass);
+
+    k->print_dev = dimmbus_dev_print;
+    k->get_fw_dev_path = dimmbus_get_fw_dev_path;
+}
+
+static void dimm_bus_initfn(Object *obj)
+{
+    DimmBus *bus = DIMM_BUS(obj);
+    QTAILQ_INIT(&bus->dimmconfig_list);
+    QTAILQ_INIT(&bus->dimmlist);
+}
+
+static const TypeInfo dimm_bus_info = {
+    .name = TYPE_DIMM_BUS,
+    .parent = TYPE_BUS,
+    .instance_size = sizeof(DimmBus),
+    .instance_init = dimm_bus_initfn,
+    .class_init = dimm_bus_class_init,
+};
+
+DimmBus *dimm_bus_create(Object *parent, const char *name, uint32_t max_dimms,
+    dimm_calcoffset_fn pmc_set_offset)
+{
+    DimmBus *memory_bus;
+    DimmConfig *dimm_cfg, *next_cfg;
+    uint32_t num_dimms = 0;
+
+    memory_bus = g_malloc0(dimm_bus_info.instance_size);
+    memory_bus->qbus.name = name ? g_strdup(name) : "membus.0";
+    qbus_create_inplace(&memory_bus->qbus, TYPE_DIMM_BUS, DEVICE(parent),
+                         name);
+
+    QTAILQ_FOREACH_SAFE(dimm_cfg, &dimmconfig_list, nextdimmcfg, next_cfg) {
+        if (!strcmp(memory_bus->qbus.name, dimm_cfg->bus_name)) {
+            if (max_dimms && (num_dimms == max_dimms)) {
+                fprintf(stderr, "Bus %s can only accept %u number of DIMMs\n",
+                        name, max_dimms);
+            }
+            QTAILQ_REMOVE(&dimmconfig_list, dimm_cfg, nextdimmcfg);
+            QTAILQ_INSERT_TAIL(&memory_bus->dimmconfig_list, dimm_cfg,
+                    nextdimmcfg);
+
+            dimm_cfg->start = pmc_set_offset(DEVICE(parent), dimm_cfg->size);
+            num_dimms++;
+        }
+    }
+    QLIST_INSERT_HEAD(&memory_buses, memory_bus, next);
+    return memory_bus;
+}
+
+static void dimm_populate(DimmDevice *s)
+{
+    DeviceState *dev = (DeviceState *)s;
+    MemoryRegion *new = NULL;
+
+    new = g_malloc(sizeof(MemoryRegion));
+    memory_region_init_ram(new, dev->id, s->size);
+    vmstate_register_ram_global(new);
+    memory_region_add_subregion(get_system_memory(), s->start, new);
+    s->populated = true;
+    s->mr = new;
+}
+
+void dimm_config_create(char *id, uint64_t size, const char *bus, uint64_t node,
+        uint32_t dimm_idx, uint32_t populated)
+{
+    DimmConfig *dimm_cfg;
+    dimm_cfg = (DimmConfig *) g_malloc0(sizeof(DimmConfig));
+    dimm_cfg->name = strdup(id);
+    dimm_cfg->bus_name = strdup(bus);
+    dimm_cfg->idx = dimm_idx;
+    dimm_cfg->start = 0;
+    dimm_cfg->size = size;
+    dimm_cfg->node = node;
+    dimm_cfg->populated = populated;
+
+    QTAILQ_INSERT_TAIL(&dimmconfig_list, dimm_cfg, nextdimmcfg);
+}
+
+void dimm_bus_hotplug(dimm_hotplug_fn hotplug, DeviceState *qdev)
+{
+    DimmBus *bus;
+    QLIST_FOREACH(bus, &memory_buses, next) {
+        assert(bus);
+        bus->qbus.allow_hotplug = 1;
+        bus->dimm_hotplug_qdev = qdev;
+        bus->dimm_hotplug = hotplug;
+    }
+}
+
+static void dimm_plug_device(DimmDevice *slot)
+{
+    DimmBus *bus = DIMM_BUS(qdev_get_parent_bus(&slot->qdev));
+
+    dimm_populate(slot);
+    if (bus->dimm_hotplug) {
+        bus->dimm_hotplug(bus->dimm_hotplug_qdev, slot, 1);
+    }
+}
+
+static int dimm_unplug_device(DeviceState *qdev)
+{
+    return 1;
+}
+
+static DimmConfig *dimmcfg_find_from_name(DimmBus *bus, const char *name)
+{
+    DimmConfig *slot;
+
+    QTAILQ_FOREACH(slot, &bus->dimmconfig_list, nextdimmcfg) {
+        if (!strcmp(slot->name, name)) {
+            return slot;
+        }
+    }
+    return NULL;
+}
+
+void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots)
+{
+    DimmConfig *slot;
+    DimmBus *bus;
+
+    QLIST_FOREACH(bus, &memory_buses, next) {
+        QTAILQ_FOREACH(slot, &bus->dimmconfig_list, nextdimmcfg) {
+            assert(slot->start);
+            fw_cfg_slots[3 * slot->idx] = cpu_to_le64(slot->start);
+            fw_cfg_slots[3 * slot->idx + 1] = cpu_to_le64(slot->size);
+            fw_cfg_slots[3 * slot->idx + 2] = cpu_to_le64(slot->node);
+        }
+    }
+}
+
+static int dimm_init(DeviceState *s)
+{
+    DimmBus *bus = DIMM_BUS(qdev_get_parent_bus(s));
+    DimmDevice *slot;
+    DimmConfig *slotcfg;
+
+    slot = DIMM(s);
+    slot->mr = NULL;
+
+    slotcfg = dimmcfg_find_from_name(bus, s->id);
+
+    if (!slotcfg) {
+        fprintf(stderr, "%s no config for slot %s found\n",
+                __func__, s->id);
+        return 1;
+    }
+
+    slot->idx = slotcfg->idx;
+    assert(slotcfg->start);
+    slot->start = slotcfg->start;
+    slot->size = slotcfg->size;
+    slot->node = slotcfg->node;
+
+    QTAILQ_INSERT_TAIL(&bus->dimmlist, slot, nextdimm);
+    dimm_plug_device(slot);
+
+    return 0;
+}
+
+
+static void dimm_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->props = dimm_properties;
+    dc->unplug = dimm_unplug_device;
+    dc->init = dimm_init;
+    dc->bus_type = TYPE_DIMM_BUS;
+}
+
+static TypeInfo dimm_info = {
+    .name          = TYPE_DIMM,
+    .parent        = TYPE_DEVICE,
+    .instance_size = sizeof(DimmDevice),
+    .class_init    = dimm_class_init,
+};
+
+static void dimm_register_types(void)
+{
+    type_register_static(&dimm_bus_info);
+    type_register_static(&dimm_info);
+}
+
+type_init(dimm_register_types)
diff --git a/hw/dimm.h b/hw/dimm.h
new file mode 100644
index 0000000..75a6911
--- /dev/null
+++ b/hw/dimm.h
@@ -0,0 +1,89 @@
+#ifndef QEMU_DIMM_H
+#define QEMU_DIMM_H
+
+#include "qemu-common.h"
+#include "memory.h"
+#include "sysbus.h"
+#include "qapi-types.h"
+#include "qemu-queue.h"
+#include "cpus.h"
+#define MAX_DIMMS 255
+#define DIMM_BITMAP_BYTES ((MAX_DIMMS + 7) / 8)
+#define DEFAULT_DIMMSIZE (1024*1024*1024)
+
+typedef enum {
+    DIMM_REMOVE_SUCCESS = 0,
+    DIMM_REMOVE_FAIL = 1,
+    DIMM_ADD_SUCCESS = 2,
+    DIMM_ADD_FAIL = 3
+} dimm_hp_result_code;
+
+#define TYPE_DIMM "dimm"
+#define DIMM(obj) \
+    OBJECT_CHECK(DimmDevice, (obj), TYPE_DIMM)
+#define DIMM_CLASS(klass) \
+    OBJECT_CLASS_CHECK(DimmDeviceClass, (klass), TYPE_DIMM)
+#define DIMM_GET_CLASS(obj) \
+    OBJECT_GET_CLASS(DimmDeviceClass, (obj), TYPE_DIMM)
+
+typedef struct DimmDevice DimmDevice;
+typedef QTAILQ_HEAD(DimmConfiglist, DimmConfig) DimmConfiglist;
+
+typedef struct DimmDeviceClass {
+    DeviceClass parent_class;
+
+    int (*init)(DimmDevice *dev);
+} DimmDeviceClass;
+
+struct DimmDevice {
+    DeviceState qdev;
+    uint32_t idx; /* index in memory hotplug register/bitmap */
+    ram_addr_t start; /* starting physical address */
+    ram_addr_t size;
+    uint32_t node; /* numa node proximity */
+    uint32_t populated; /* 1 means device has been hotplugged. Default is 0. */
+    MemoryRegion *mr; /* MemoryRegion for this slot. !NULL only if populated */
+    QTAILQ_ENTRY(DimmDevice) nextdimm;
+};
+
+typedef struct DimmConfig {
+    const char *name;
+    uint32_t idx; /* index in linear memory hotplug bitmap */
+    const char *bus_name;
+    ram_addr_t start; /* starting physical address */
+    ram_addr_t size;
+    uint32_t node; /* numa node proximity */
+    uint32_t populated; /* 1 means device has been hotplugged. Default is 0. */
+    QTAILQ_ENTRY(DimmConfig) nextdimmcfg;
+} DimmConfig;
+
+typedef int (*dimm_hotplug_fn)(DeviceState *qdev, DimmDevice *dev, int add);
+typedef hwaddr(*dimm_calcoffset_fn)(DeviceState *dev, uint64_t size);
+
+#define TYPE_DIMM_BUS "dimmbus"
+#define DIMM_BUS(obj) OBJECT_CHECK(DimmBus, (obj), TYPE_DIMM_BUS)
+
+typedef struct DimmBus {
+    BusState qbus;
+    DeviceState *dimm_hotplug_qdev;
+    dimm_hotplug_fn dimm_hotplug;
+    DimmConfiglist dimmconfig_list;
+    QTAILQ_HEAD(Dimmlist, DimmDevice) dimmlist;
+    QLIST_ENTRY(DimmBus) next;
+} DimmBus;
+
+struct dimm_hp_result {
+    const char *dimmname;
+    dimm_hp_result_code ret;
+    QTAILQ_ENTRY(dimm_hp_result) next;
+};
+
+void dimm_bus_hotplug(dimm_hotplug_fn hotplug, DeviceState *qdev);
+void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots);
+int dimm_add(char *id);
+DimmBus *dimm_bus_create(Object *parent, const char *name, uint32_t max_dimms,
+    dimm_calcoffset_fn pmc_set_offset);
+void dimm_config_create(char *id, uint64_t size, const char *bus, uint64_t node,
+        uint32_t dimm_idx, uint32_t populated);
+
+#endif
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 10/30] vl: handle "-device dimm"
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (8 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 09/30] Implement dimm device abstraction Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 11/30] acpi_piix4 : Implement memory device hotplug registers Vasilis Liaskovitis
                   ` (24 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 vl.c |   51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/vl.c b/vl.c
index a3ab384..8406933 100644
--- a/vl.c
+++ b/vl.c
@@ -169,6 +169,7 @@ int main(int argc, char **argv)
 
 #include "ui/qemu-spice.h"
 #include "qapi/string-input-visitor.h"
+#include "hw/dimm.h"
 
 //#define DEBUG_NET
 //#define DEBUG_SLIRP
@@ -249,6 +250,7 @@ static QTAILQ_HEAD(, FWBootEntry) fw_boot_order =
 int nb_numa_nodes;
 uint64_t node_mem[MAX_NODES];
 unsigned long *node_cpumask[MAX_NODES];
+int nb_hp_dimms;
 
 uint8_t qemu_uuid[16];
 
@@ -2065,6 +2067,50 @@ static int chardev_init_func(QemuOpts *opts, void *opaque)
     return 0;
 }
 
+static int dimmcfg_init_func(QemuOpts *opts, void *opaque)
+{
+    const char *driver;
+    const char *id;
+    uint64_t node, size;
+    uint32_t populated;
+    const char *buf, *busbuf;
+
+    /* DimmDevice configuration needs to be known in order to initialize chipset
+     * with correct memory and pci ranges. But all devices are created after
+     * chipset / machine initialization. In * order to avoid this problem, we
+     * parse dimm information earlier into dimmcfg structs. */
+
+    driver = qemu_opt_get(opts, "driver");
+    if (!strcmp(driver, "dimm")) {
+
+        id = qemu_opts_id(opts);
+        buf = qemu_opt_get(opts, "size");
+        parse_option_size("size", buf, &size, NULL);
+        buf = qemu_opt_get(opts, "node");
+        parse_option_number("node", buf, &node, NULL);
+        busbuf = qemu_opt_get(opts, "bus");
+        buf = qemu_opt_get(opts, "populated");
+        if (!buf) {
+            populated = 0;
+        } else {
+            populated = strcmp(buf, "on") ? 0 : 1;
+        }
+
+        dimm_config_create((char *)id, size, busbuf ? busbuf : "membus.0",
+                node, nb_hp_dimms, populated);
+
+        /* if !populated, we just keep the config. The real device
+         * will be created in the future with a normal device_add
+         * command. */
+        if (!populated) {
+            qemu_opts_del(opts);
+        }
+        nb_hp_dimms++;
+    }
+
+    return 0;
+}
+
 #ifdef CONFIG_VIRTFS
 static int fsdev_init_func(QemuOpts *opts, void *opaque)
 {
@@ -3859,6 +3905,11 @@ int main(int argc, char **argv, char **envp)
     }
     qemu_add_globals();
 
+    /* init generic devices */
+    if (qemu_opts_foreach(qemu_find_opts("device"),
+           dimmcfg_init_func, NULL, 1) != 0) {
+        exit(1);
+    }
     qdev_machine_init();
 
     QEMUMachineInitArgs args = { .ram_size = ram_size,
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 11/30] acpi_piix4 : Implement memory device hotplug registers
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (9 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 10/30] vl: handle "-device dimm" Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 12/30] acpi_ich9 " Vasilis Liaskovitis
                   ` (23 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

A 32-byte register is used to present up to 256 hotplug-able memory devices
to BIOS and OSPM. Hot-add and hot-remove functions trigger an ACPI hotplug
event through these. Only reads are allowed from these registers.

An ACPI hot-remove event but needs to wait for OSPM to eject the device.
We use a single-byte register to know when OSPM has called the _EJ function
for a particular dimm. A write to this byte will depopulate the respective dimm.
Only writes are allowed to this byte.

v1->v2:
mems_sts address moved from 0xaf20 to 0xaf80 (to accomodate more space for
cpu-hotplugging in the future).
_EJ array is reduced to a single byte.
Add documentation in docs/specs/acpi_hotplug.txt

v3->v4: Removed hot-remove functions, will be added separately. Updated for
memory API.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 docs/specs/acpi_hotplug.txt |   14 +++++++++
 hw/acpi.h                   |    5 +++
 hw/acpi_piix4.c             |   65 +++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 82 insertions(+), 2 deletions(-)
 create mode 100644 docs/specs/acpi_hotplug.txt

diff --git a/docs/specs/acpi_hotplug.txt b/docs/specs/acpi_hotplug.txt
new file mode 100644
index 0000000..8391713
--- /dev/null
+++ b/docs/specs/acpi_hotplug.txt
@@ -0,0 +1,14 @@
+QEMU<->ACPI BIOS hotplug interface
+--------------------------------------
+This document describes the interface between QEMU and the ACPI BIOS for non-PCI
+space. For the PCI interface please look at docs/specs/acpi_pci_hotplug.txt
+
+QEMU<->ACPI BIOS memory hotplug interface
+--------------------------------------
+
+Memory Dimm status array (IO port 0xaf80-0xaf9f, 1-byte access):
+---------------------------------------------------------------
+Dimm hot-plug notification pending. One bit per slot.
+
+Read by ACPI BIOS GPE.3 handler to notify OS of memory hot-add or hot-remove
+events.  Read-only.
diff --git a/hw/acpi.h b/hw/acpi.h
index afda153..dc617d3 100644
--- a/hw/acpi.h
+++ b/hw/acpi.h
@@ -120,6 +120,11 @@ struct ACPIREGS {
     Notifier wakeup;
 };
 
+#include "dimm.h"
+struct gpe_regs {
+    uint8_t mems_sts[DIMM_BITMAP_BYTES];
+};
+
 /* PM_TMR */
 void acpi_pm_tmr_update(ACPIREGS *ar, bool enable);
 void acpi_pm_tmr_calc_overflow_time(ACPIREGS *ar);
diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index 0b5b0d3..879d8a0 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -29,6 +29,8 @@
 #include "ioport.h"
 #include "fw_cfg.h"
 #include "exec-memory.h"
+#include "sysbus.h"
+#include "dimm.h"
 
 //#define DEBUG
 
@@ -47,7 +49,9 @@
 #define PCI_DOWN_BASE 0xae04
 #define PCI_EJ_BASE 0xae08
 #define PCI_RMV_BASE 0xae0c
+#define MEM_BASE 0xaf80
 
+#define PIIX4_MEM_HOTPLUG_STATUS 8
 #define PIIX4_PCI_HOTPLUG_STATUS 2
 
 struct pci_status {
@@ -60,6 +64,7 @@ typedef struct PIIX4PMState {
     MemoryRegion io;
     MemoryRegion io_gpe;
     MemoryRegion io_pci;
+    MemoryRegion io_memhp;
     ACPIREGS ar;
 
     APMState apm;
@@ -74,6 +79,7 @@ typedef struct PIIX4PMState {
     Notifier powerdown_notifier;
 
     /* for pci hotplug */
+    struct gpe_regs gperegs;
     struct pci_status pci0_status;
     uint32_t pci0_hotplug_enable;
     uint32_t pci0_slot_device_present;
@@ -98,8 +104,8 @@ static void pm_update_sci(PIIX4PMState *s)
                    ACPI_BITMASK_POWER_BUTTON_ENABLE |
                    ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
                    ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
-        (((s->ar.gpe.sts[0] & s->ar.gpe.en[0])
-          & PIIX4_PCI_HOTPLUG_STATUS) != 0);
+        (((s->ar.gpe.sts[0] & s->ar.gpe.en[0]) &
+          (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_MEM_HOTPLUG_STATUS)) != 0);
 
     qemu_set_irq(s->irq, sci_level);
     /* schedule a timer interruption if needed */
@@ -526,6 +532,29 @@ static const MemoryRegionOps piix4_gpe_ops = {
     .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
+static uint32_t memhp_readb(void *opaque, uint32_t addr)
+{
+    PIIX4PMState *s = opaque;
+    uint32_t val = 0;
+    struct gpe_regs *g = &s->gperegs;
+    if (addr < DIMM_BITMAP_BYTES) {
+        val = (uint32_t) g->mems_sts[addr];
+    }
+    PIIX4_DPRINTF(stderr, "memhp read %x == %x\n", addr, val);
+    return val;
+}
+
+static const MemoryRegionOps piix4_memhp_ops = {
+    .old_portio = (MemoryRegionPortio[]) {
+        {
+            .offset = 0,   .len = DIMM_BITMAP_BYTES, .size = 1,
+            .read = memhp_readb,
+        },
+        PORTIO_END_OF_LIST()
+    },
+    .endianness = DEVICE_LITTLE_ENDIAN,
+};
+
 static uint32_t pci_up_read(void *opaque, uint32_t addr)
 {
     PIIX4PMState *s = opaque;
@@ -592,9 +621,11 @@ static const MemoryRegionOps piix4_pci_ops = {
 
 static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
                                 PCIHotplugState state);
+static int piix4_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int add);
 
 static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
 {
+    int i = 0;
     memory_region_init_io(&s->io_gpe, &piix4_gpe_ops, s, "apci-gpe0",
                           GPE_LEN);
     memory_region_add_subregion(get_system_io(), GPE_BASE, &s->io_gpe);
@@ -603,7 +634,16 @@ static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
                           PCI_HOTPLUG_SIZE);
     memory_region_add_subregion(get_system_io(), PCI_HOTPLUG_ADDR,
                                 &s->io_pci);
+    memory_region_init_io(&s->io_memhp, &piix4_memhp_ops, s, "apci-memhp0",
+                          DIMM_BITMAP_BYTES);
+    memory_region_add_subregion(get_system_io(), MEM_BASE, &s->io_memhp);
+
+    for (i = 0; i < DIMM_BITMAP_BYTES; i++) {
+        s->gperegs.mems_sts[i] = 0;
+    }
+
     pci_bus_hotplug(bus, piix4_device_hotplug, &s->dev.qdev);
+    dimm_bus_hotplug(piix4_dimm_hotplug, &s->dev.qdev);
 }
 
 static void enable_device(PIIX4PMState *s, int slot)
@@ -618,6 +658,27 @@ static void disable_device(PIIX4PMState *s, int slot)
     s->pci0_status.down |= (1U << slot);
 }
 
+static void enable_mem_device(PIIX4PMState *s, int memdevice)
+{
+    struct gpe_regs *g = &s->gperegs;
+    s->ar.gpe.sts[0] |= PIIX4_MEM_HOTPLUG_STATUS;
+    g->mems_sts[memdevice/8] |= (1 << (memdevice%8));
+}
+
+static int piix4_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
+        add)
+{
+    PCIDevice *pci_dev = DO_UPCAST(PCIDevice, qdev, qdev);
+    PIIX4PMState *s = DO_UPCAST(PIIX4PMState, dev, pci_dev);
+    DimmDevice *slot = DIMM(dev);
+
+    if (add) {
+        enable_mem_device(s, slot->idx);
+    }
+    pm_update_sci(s);
+    return 0;
+}
+
 static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
 				PCIHotplugState state)
 {
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 12/30] acpi_ich9 : Implement memory device hotplug registers
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (10 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 11/30] acpi_piix4 : Implement memory device hotplug registers Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 13/30] piix_pci and pc_piix: refactor Vasilis Liaskovitis
                   ` (22 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

This implements acpi dimm hot-add capability for q35 (ich9). The logic is the
same as for the pc machine (piix4).

TODO: Fix acpi irq delivery bug. Currently there is a flood of irqs when
delivering an acpi interrupt (should be just one). Guest complains as follows:
"irq 9: nobody cared
[...]
Disabling IRQ #9"
where #9 is the acpi irq

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 hw/acpi_ich9.c |   61 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 hw/acpi_ich9.h |    7 +++++-
 hw/lpc_ich9.c  |    2 +-
 3 files changed, 65 insertions(+), 5 deletions(-)

diff --git a/hw/acpi_ich9.c b/hw/acpi_ich9.c
index c5978d3..abafbb5 100644
--- a/hw/acpi_ich9.c
+++ b/hw/acpi_ich9.c
@@ -48,11 +48,14 @@ static void pm_update_sci(ICH9LPCPMRegs *pm)
 
     pm1a_sts = acpi_pm1_evt_get_sts(&pm->acpi_regs);
 
-    sci_level = (((pm1a_sts & pm->acpi_regs.pm1.evt.en) &
+    sci_level = ((((pm1a_sts & pm->acpi_regs.pm1.evt.en) &
                   (ACPI_BITMASK_RT_CLOCK_ENABLE |
                    ACPI_BITMASK_POWER_BUTTON_ENABLE |
                    ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
-                   ACPI_BITMASK_TIMER_ENABLE)) != 0);
+                   ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
+        (((pm->acpi_regs.gpe.sts[0] & pm->acpi_regs.gpe.en[0]) &
+          (ICH9_MEM_HOTPLUG_STATUS)) != 0));
+
     qemu_set_irq(pm->irq, sci_level);
 
     /* schedule a timer interruption if needed */
@@ -90,6 +93,29 @@ static const MemoryRegionOps ich9_gpe_ops = {
     .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
+static uint32_t memhp_readb(void *opaque, uint32_t addr)
+{
+    ICH9LPCPMRegs *s = opaque;
+    uint32_t val = 0;
+    struct gpe_regs *g = &s->gperegs;
+    if (addr < DIMM_BITMAP_BYTES) {
+        val = (uint32_t) g->mems_sts[addr];
+    }
+    ICH9_DEBUG("memhp read %x == %x\n", addr, val);
+    return val;
+}
+
+static const MemoryRegionOps ich9_memhp_ops = {
+    .old_portio = (MemoryRegionPortio[]) {
+        {
+            .offset = 0,   .len = DIMM_BITMAP_BYTES, .size = 1,
+            .read = memhp_readb,
+        },
+        PORTIO_END_OF_LIST()
+    },
+    .endianness = DEVICE_LITTLE_ENDIAN,
+};
+
 static uint64_t ich9_smi_readl(void *opaque, hwaddr addr, unsigned width)
 {
     ICH9LPCPMRegs *pm = opaque;
@@ -201,8 +227,31 @@ static void pm_powerdown_req(Notifier *n, void *opaque)
     acpi_pm1_evt_power_down(&pm->acpi_regs);
 }
 
-void ich9_pm_init(ICH9LPCPMRegs *pm, qemu_irq sci_irq, qemu_irq cmos_s3)
+static void enable_mem_device(ICH9LPCState *s, int memdevice)
 {
+    struct gpe_regs *g = &s->pm.gperegs;
+    s->pm.acpi_regs.gpe.sts[0] |= ICH9_MEM_HOTPLUG_STATUS;
+    g->mems_sts[memdevice/8] |= (1 << (memdevice%8));
+}
+
+static int ich9_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
+        add)
+{
+    PCIDevice *pci_dev = DO_UPCAST(PCIDevice, qdev, qdev);
+    ICH9LPCState *s = DO_UPCAST(ICH9LPCState, d, pci_dev);
+    DimmDevice *slot = DIMM(dev);
+
+    if (add) {
+        enable_mem_device(s, slot->idx);
+    }
+    pm_update_sci(&s->pm);
+    return 0;
+}
+
+void ich9_pm_init(void *device, qemu_irq sci_irq, qemu_irq cmos_s3)
+{
+    ICH9LPCState *lpc = (ICH9LPCState *)device;
+    ICH9LPCPMRegs *pm = &lpc->pm;
     memory_region_init(&pm->io, "ich9-pm", ICH9_PMIO_SIZE);
     memory_region_set_enabled(&pm->io, false);
     memory_region_add_subregion(get_system_io(), 0, &pm->io);
@@ -220,6 +269,12 @@ void ich9_pm_init(ICH9LPCPMRegs *pm, qemu_irq sci_irq, qemu_irq cmos_s3)
                           8);
     memory_region_add_subregion(&pm->io, ICH9_PMIO_SMI_EN, &pm->io_smi);
 
+    memory_region_init_io(&pm->io_memhp, &ich9_memhp_ops, pm, "apci-memhp0",
+                          DIMM_BITMAP_BYTES);
+    memory_region_add_subregion(get_system_io(), ICH9_MEM_BASE, &pm->io_memhp);
+
+    dimm_bus_hotplug(ich9_dimm_hotplug, &lpc->d.qdev);
+
     pm->irq = sci_irq;
     qemu_register_reset(pm_reset, pm);
     pm->powerdown_notifier.notify = pm_powerdown_req;
diff --git a/hw/acpi_ich9.h b/hw/acpi_ich9.h
index bc221d3..4419247 100644
--- a/hw/acpi_ich9.h
+++ b/hw/acpi_ich9.h
@@ -23,6 +23,9 @@
 
 #include "acpi.h"
 
+#define ICH9_MEM_BASE    0xaf80
+#define ICH9_MEM_HOTPLUG_STATUS 8
+
 typedef struct ICH9LPCPMRegs {
     /*
      * In ich9 spec says that pm1_cnt register is 32bit width and
@@ -33,16 +36,18 @@ typedef struct ICH9LPCPMRegs {
     MemoryRegion io;
     MemoryRegion io_gpe;
     MemoryRegion io_smi;
+    MemoryRegion io_memhp;
     uint32_t smi_en;
     uint32_t smi_sts;
 
     qemu_irq irq;      /* SCI */
 
+    struct gpe_regs gperegs;
     uint32_t pm_io_base;
     Notifier powerdown_notifier;
 } ICH9LPCPMRegs;
 
-void ich9_pm_init(ICH9LPCPMRegs *pm,
+void ich9_pm_init(void *lpc,
                   qemu_irq sci_irq, qemu_irq cmos_s3_resume);
 void ich9_pm_iospace_update(ICH9LPCPMRegs *pm, uint32_t pm_io_base);
 extern const VMStateDescription vmstate_ich9_pm;
diff --git a/hw/lpc_ich9.c b/hw/lpc_ich9.c
index 878a43e..0ef7af6 100644
--- a/hw/lpc_ich9.c
+++ b/hw/lpc_ich9.c
@@ -352,7 +352,7 @@ void ich9_lpc_pm_init(PCIDevice *lpc_pci, qemu_irq cmos_s3)
     qemu_irq *sci_irq;
 
     sci_irq = qemu_allocate_irqs(ich9_set_sci, lpc, 1);
-    ich9_pm_init(&lpc->pm, sci_irq[0], cmos_s3);
+    ich9_pm_init(lpc, sci_irq[0], cmos_s3);
 
     ich9_lpc_reset(&lpc->d.qdev);
 }
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 13/30] piix_pci and pc_piix: refactor
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (11 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 12/30] acpi_ich9 " Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2013-01-16  7:20   ` Hu Tao
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 14/30] piix_pci: Add i440fx dram controller initialization Vasilis Liaskovitis
                   ` (21 subsequent siblings)
  34 siblings, 1 reply; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Refactor code so that chipset initialization is similar to q35. This will
allow memory map initialization at chipset qdev init time for both
machines, as well as more similar code structure overall.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 hw/pc_piix.c  |   57 ++++++++++++---
 hw/piix_pci.c |  225 ++++++++++++++-------------------------------------------
 2 files changed, 100 insertions(+), 182 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 19e342a..6a9b508 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -47,6 +47,7 @@
 #ifdef CONFIG_XEN
 #  include <xen/hvm/hvm_info_table.h>
 #endif
+#include "piix_pci.h"
 
 #define MAX_IDE_BUS 2
 
@@ -85,6 +86,8 @@ static void pc_init1(MemoryRegion *system_memory,
     MemoryRegion *pci_memory;
     MemoryRegion *rom_memory;
     void *fw_cfg = NULL;
+    I440FXState *i440fx_host;
+    PIIX3State *piix3;
 
     pc_cpus_init(cpu_model);
 
@@ -127,21 +130,53 @@ static void pc_init1(MemoryRegion *system_memory,
     }
 
     if (pci_enabled) {
-        pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, &isa_bus, gsi,
-                              system_memory, system_io, ram_size,
-                              below_4g_mem_size,
-                              0x100000000ULL - below_4g_mem_size,
-                              0x100000000ULL + above_4g_mem_size,
-                              (sizeof(hwaddr) == 4
-                               ? 0
-                               : ((uint64_t)1 << 62)),
-                              pci_memory, ram_memory);
+        i440fx_host = I440FX_HOST_DEVICE(qdev_create(NULL,
+                    TYPE_I440FX_HOST_DEVICE));
+        i440fx_host->mch.ram_memory = ram_memory;
+        i440fx_host->mch.pci_address_space = pci_memory;
+        i440fx_host->mch.system_memory = get_system_memory();
+        i440fx_host->mch.address_space_io = get_system_io();;
+        i440fx_host->mch.below_4g_mem_size = below_4g_mem_size;
+        i440fx_host->mch.above_4g_mem_size = above_4g_mem_size;
+
+        qdev_init_nofail(DEVICE(i440fx_host));
+        i440fx_state = &i440fx_host->mch;
+        pci_bus = i440fx_host->parent_obj.bus;
+        /* Xen supports additional interrupt routes from the PCI devices to
+         * the IOAPIC: the four pins of each PCI device on the bus are also
+         * connected to the IOAPIC directly.
+         * These additional routes can be discovered through ACPI. */
+        if (xen_enabled()) {
+            piix3 = DO_UPCAST(PIIX3State, dev,
+                    pci_create_simple_multifunction(pci_bus, -1, true,
+                        "PIIX3-xen"));
+            pci_bus_irqs(pci_bus, xen_piix3_set_irq, xen_pci_slot_get_pirq,
+                    piix3, XEN_PIIX_NUM_PIRQS);
+        } else {
+            piix3 = DO_UPCAST(PIIX3State, dev,
+                    pci_create_simple_multifunction(pci_bus, -1, true,
+                        "PIIX3"));
+            pci_bus_irqs(pci_bus, piix3_set_irq, pci_slot_get_pirq, piix3,
+                    PIIX_NUM_PIRQS);
+            pci_bus_set_route_irq_fn(pci_bus, piix3_route_intx_pin_to_irq);
+        }
+        piix3->pic = gsi;
+        isa_bus = DO_UPCAST(ISABus, qbus,
+                qdev_get_child_bus(&piix3->dev.qdev, "isa.0"));
+
+        piix3_devfn = piix3->dev.devfn;
+
+        ram_size = ram_size / 8 / 1024 / 1024;
+        if (ram_size > 255) {
+            ram_size = 255;
+        }
+        i440fx_state->dev.config[0x57] = ram_size;
     } else {
         pci_bus = NULL;
-        i440fx_state = NULL;
         isa_bus = isa_bus_new(NULL, system_io);
         no_hpet = 1;
     }
+
     isa_bus_irqs(isa_bus, gsi);
 
     if (kvm_irqchip_in_kernel()) {
@@ -157,7 +192,7 @@ static void pc_init1(MemoryRegion *system_memory,
         gsi_state->i8259_irq[i] = i8259[i];
     }
     if (pci_enabled) {
-        ioapic_init_gsi(gsi_state, "i440fx");
+        ioapic_init_gsi(gsi_state, NULL);
     }
 
     pc_register_ferr_irq(gsi[13]);
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index ba1b3de..7ca3c73 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -31,70 +31,15 @@
 #include "range.h"
 #include "xen.h"
 #include "pam.h"
+#include "piix_pci.h"
 
-/*
- * I440FX chipset data sheet.
- * http://download.intel.com/design/chipsets/datashts/29054901.pdf
- */
-
-typedef struct I440FXState {
-    PCIHostState parent_obj;
-} I440FXState;
-
-#define PIIX_NUM_PIC_IRQS       16      /* i8259 * 2 */
-#define PIIX_NUM_PIRQS          4ULL    /* PIRQ[A-D] */
-#define XEN_PIIX_NUM_PIRQS      128ULL
-#define PIIX_PIRQC              0x60
-
-typedef struct PIIX3State {
-    PCIDevice dev;
-
-    /*
-     * bitmap to track pic levels.
-     * The pic level is the logical OR of all the PCI irqs mapped to it
-     * So one PIC level is tracked by PIIX_NUM_PIRQS bits.
-     *
-     * PIRQ is mapped to PIC pins, we track it by
-     * PIIX_NUM_PIRQS * PIIX_NUM_PIC_IRQS = 64 bits with
-     * pic_irq * PIIX_NUM_PIRQS + pirq
-     */
-#if PIIX_NUM_PIC_IRQS * PIIX_NUM_PIRQS > 64
-#error "unable to encode pic state in 64bit in pic_levels."
-#endif
-    uint64_t pic_levels;
-
-    qemu_irq *pic;
-
-    /* This member isn't used. Just for save/load compatibility */
-    int32_t pci_irq_levels_vmstate[PIIX_NUM_PIRQS];
-} PIIX3State;
-
-struct PCII440FXState {
-    PCIDevice dev;
-    MemoryRegion *system_memory;
-    MemoryRegion *pci_address_space;
-    MemoryRegion *ram_memory;
-    MemoryRegion pci_hole;
-    MemoryRegion pci_hole_64bit;
-    PAMMemoryRegion pam_regions[13];
-    MemoryRegion smram_region;
-    uint8_t smm_enabled;
-};
-
-
-#define I440FX_PAM      0x59
-#define I440FX_PAM_SIZE 7
-#define I440FX_SMRAM    0x72
-
-static void piix3_set_irq(void *opaque, int pirq, int level);
-static PCIINTxRoute piix3_route_intx_pin_to_irq(void *opaque, int pci_intx);
 static void piix3_write_config_xen(PCIDevice *dev,
                                uint32_t address, uint32_t val, int len);
 
 /* return the global irq number corresponding to a given device irq
    pin. We could also use the bus number to have a more precise
    mapping. */
-static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
+int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
 {
     int slot_addend;
     slot_addend = (pci_dev->devfn >> 3) - 1;
@@ -180,149 +125,86 @@ static const VMStateDescription vmstate_i440fx = {
     }
 };
 
-static int i440fx_pcihost_initfn(SysBusDevice *dev)
+static void i440fx_pcihost_initfn(Object *obj)
 {
-    PCIHostState *s = PCI_HOST_BRIDGE(dev);
+    I440FXState *s = I440FX_HOST_DEVICE(obj);
+    object_initialize(&s->mch, TYPE_I440FX_PCI_DEVICE);
+    object_property_add_child(OBJECT(s), "mch", OBJECT(&s->mch), NULL);
+}
 
-    memory_region_init_io(&s->conf_mem, &pci_host_conf_le_ops, s,
-                          "pci-conf-idx", 4);
-    sysbus_add_io(dev, 0xcf8, &s->conf_mem);
-    sysbus_init_ioports(&s->busdev, 0xcf8, 4);
+static int i440fx_pcihost_init(SysBusDevice *dev)
+{
+    PCIHostState *pci = FROM_SYSBUS(PCIHostState, dev);
+    I440FXState *s = I440FX_HOST_DEVICE(&dev->qdev);
+    PCIBus *b;
+
+    memory_region_init_io(&pci->conf_mem, &pci_host_conf_le_ops, pci,
+                           "pci-conf-idx", 4);
+    sysbus_add_io(dev, 0xcf8, &pci->conf_mem);
+    sysbus_init_ioports(&pci->busdev, 0xcf8, 4);
+    memory_region_init_io(&pci->data_mem, &pci_host_data_le_ops, pci,
+                           "pci-conf-data", 4);
 
-    memory_region_init_io(&s->data_mem, &pci_host_data_le_ops, s,
-                          "pci-conf-data", 4);
-    sysbus_add_io(dev, 0xcfc, &s->data_mem);
-    sysbus_init_ioports(&s->busdev, 0xcfc, 4);
+    sysbus_add_io(dev, 0xcfc, &pci->data_mem);
+    sysbus_init_ioports(&pci->busdev, 0xcfc, 4);
+
+    b = pci_bus_new(&s->parent_obj.busdev.qdev, NULL, s->mch.pci_address_space,
+                    s->mch.address_space_io, 0);
+    s->parent_obj.bus = b;
+    qdev_set_parent_bus(DEVICE(&s->mch), BUS(b));
+    qdev_init_nofail(DEVICE(&s->mch));
 
     return 0;
 }
 
 static int i440fx_initfn(PCIDevice *dev)
 {
-    PCII440FXState *d = DO_UPCAST(PCII440FXState, dev, dev);
+    int i;
+    PCII440FXState *f = DO_UPCAST(PCII440FXState, dev, dev);
+    hwaddr pci_hole64_size;
 
-    d->dev.config[I440FX_SMRAM] = 0x02;
+    f->dev.config[I440FX_SMRAM] = 0x02;
 
-    cpu_smm_register(&i440fx_set_smm, d);
-    return 0;
-}
+    cpu_smm_register(&i440fx_set_smm, f);
 
-static PCIBus *i440fx_common_init(const char *device_name,
-                                  PCII440FXState **pi440fx_state,
-                                  int *piix3_devfn,
-                                  ISABus **isa_bus, qemu_irq *pic,
-                                  MemoryRegion *address_space_mem,
-                                  MemoryRegion *address_space_io,
-                                  ram_addr_t ram_size,
-                                  hwaddr pci_hole_start,
-                                  hwaddr pci_hole_size,
-                                  hwaddr pci_hole64_start,
-                                  hwaddr pci_hole64_size,
-                                  MemoryRegion *pci_address_space,
-                                  MemoryRegion *ram_memory)
-{
-    DeviceState *dev;
-    PCIBus *b;
-    PCIDevice *d;
-    PCIHostState *s;
-    PIIX3State *piix3;
-    PCII440FXState *f;
-    unsigned i;
-
-    dev = qdev_create(NULL, "i440FX-pcihost");
-    s = PCI_HOST_BRIDGE(dev);
-    s->address_space = address_space_mem;
-    b = pci_bus_new(dev, NULL, pci_address_space,
-                    address_space_io, 0);
-    s->bus = b;
-    object_property_add_child(qdev_get_machine(), "i440fx", OBJECT(dev), NULL);
-    qdev_init_nofail(dev);
-
-    d = pci_create_simple(b, 0, device_name);
-    *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
-    f = *pi440fx_state;
-    f->system_memory = address_space_mem;
-    f->pci_address_space = pci_address_space;
-    f->ram_memory = ram_memory;
+    pci_hole64_size = (sizeof(hwaddr) == 4 ? 0 :
+                       ((uint64_t)1 << 62));
     memory_region_init_alias(&f->pci_hole, "pci-hole", f->pci_address_space,
-                             pci_hole_start, pci_hole_size);
-    memory_region_add_subregion(f->system_memory, pci_hole_start, &f->pci_hole);
+                             f->below_4g_mem_size,
+                             0x100000000LL - f->below_4g_mem_size);
+    memory_region_add_subregion(f->system_memory, f->below_4g_mem_size,
+            &f->pci_hole);
     memory_region_init_alias(&f->pci_hole_64bit, "pci-hole64",
                              f->pci_address_space,
-                             pci_hole64_start, pci_hole64_size);
+                             0x100000000LL + f->above_4g_mem_size,
+                             pci_hole64_size);
     if (pci_hole64_size) {
-        memory_region_add_subregion(f->system_memory, pci_hole64_start,
+        memory_region_add_subregion(f->system_memory,
+                                    0x100000000LL + f->above_4g_mem_size,
                                     &f->pci_hole_64bit);
     }
+
     memory_region_init_alias(&f->smram_region, "smram-region",
                              f->pci_address_space, 0xa0000, 0x20000);
     memory_region_add_subregion_overlap(f->system_memory, 0xa0000,
                                         &f->smram_region, 1);
     memory_region_set_enabled(&f->smram_region, false);
+
     init_pam(f->ram_memory, f->system_memory, f->pci_address_space,
              &f->pam_regions[0], PAM_BIOS_BASE, PAM_BIOS_SIZE);
     for (i = 0; i < 12; ++i) {
         init_pam(f->ram_memory, f->system_memory, f->pci_address_space,
-                 &f->pam_regions[i+1], PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE,
+                &f->pam_regions[i+1], PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE,
                  PAM_EXPAN_SIZE);
     }
-
-    /* Xen supports additional interrupt routes from the PCI devices to
-     * the IOAPIC: the four pins of each PCI device on the bus are also
-     * connected to the IOAPIC directly.
-     * These additional routes can be discovered through ACPI. */
-    if (xen_enabled()) {
-        piix3 = DO_UPCAST(PIIX3State, dev,
-                pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
-        pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
-                piix3, XEN_PIIX_NUM_PIRQS);
-    } else {
-        piix3 = DO_UPCAST(PIIX3State, dev,
-                pci_create_simple_multifunction(b, -1, true, "PIIX3"));
-        pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3,
-                PIIX_NUM_PIRQS);
-        pci_bus_set_route_irq_fn(b, piix3_route_intx_pin_to_irq);
-    }
-    piix3->pic = pic;
-    *isa_bus = DO_UPCAST(ISABus, qbus,
-                         qdev_get_child_bus(&piix3->dev.qdev, "isa.0"));
-
-    *piix3_devfn = piix3->dev.devfn;
-
-    ram_size = ram_size / 8 / 1024 / 1024;
-    if (ram_size > 255)
-        ram_size = 255;
-    (*pi440fx_state)->dev.config[0x57]=ram_size;
-
+    f->dev.config[0x57] = f->below_4g_mem_size;
     i440fx_update_memory_mappings(f);
 
-    return b;
-}
-
-PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn,
-                    ISABus **isa_bus, qemu_irq *pic,
-                    MemoryRegion *address_space_mem,
-                    MemoryRegion *address_space_io,
-                    ram_addr_t ram_size,
-                    hwaddr pci_hole_start,
-                    hwaddr pci_hole_size,
-                    hwaddr pci_hole64_start,
-                    hwaddr pci_hole64_size,
-                    MemoryRegion *pci_memory, MemoryRegion *ram_memory)
-
-{
-    PCIBus *b;
-
-    b = i440fx_common_init("i440FX", pi440fx_state, piix3_devfn, isa_bus, pic,
-                           address_space_mem, address_space_io, ram_size,
-                           pci_hole_start, pci_hole_size,
-                           pci_hole64_start, pci_hole64_size,
-                           pci_memory, ram_memory);
-    return b;
+    return 0;
 }
 
 /* PIIX3 PCI to ISA bridge */
-static void piix3_set_irq_pic(PIIX3State *piix3, int pic_irq)
+void piix3_set_irq_pic(PIIX3State *piix3, int pic_irq)
 {
     qemu_set_irq(piix3->pic[pic_irq],
                  !!(piix3->pic_levels &
@@ -347,13 +229,13 @@ static void piix3_set_irq_level(PIIX3State *piix3, int pirq, int level)
     piix3_set_irq_pic(piix3, pic_irq);
 }
 
-static void piix3_set_irq(void *opaque, int pirq, int level)
+void piix3_set_irq(void *opaque, int pirq, int level)
 {
     PIIX3State *piix3 = opaque;
     piix3_set_irq_level(piix3, pirq, level);
 }
 
-static PCIINTxRoute piix3_route_intx_pin_to_irq(void *opaque, int pin)
+PCIINTxRoute piix3_route_intx_pin_to_irq(void *opaque, int pin)
 {
     PIIX3State *piix3 = opaque;
     int irq = piix3->dev.config[PIIX_PIRQC + pin];
@@ -550,7 +432,7 @@ static void i440fx_class_init(ObjectClass *klass, void *data)
 }
 
 static const TypeInfo i440fx_info = {
-    .name          = "i440FX",
+    .name          = TYPE_I440FX_PCI_DEVICE,
     .parent        = TYPE_PCI_DEVICE,
     .instance_size = sizeof(PCII440FXState),
     .class_init    = i440fx_class_init,
@@ -561,15 +443,16 @@ static void i440fx_pcihost_class_init(ObjectClass *klass, void *data)
     DeviceClass *dc = DEVICE_CLASS(klass);
     SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
 
-    k->init = i440fx_pcihost_initfn;
+    k->init = i440fx_pcihost_init;
     dc->fw_name = "pci";
     dc->no_user = 1;
 }
 
 static const TypeInfo i440fx_pcihost_info = {
-    .name          = "i440FX-pcihost",
+    .name          = TYPE_I440FX_HOST_DEVICE,
     .parent        = TYPE_PCI_HOST_BRIDGE,
     .instance_size = sizeof(I440FXState),
+    .instance_init = i440fx_pcihost_initfn,
     .class_init    = i440fx_pcihost_class_init,
 };
 
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 14/30] piix_pci: Add i440fx dram controller initialization
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (12 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 13/30] piix_pci and pc_piix: refactor Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 15/30] q35: " Vasilis Liaskovitis
                   ` (20 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Also introduce function to adjust memory map for hotplug-able dimms.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 hw/pc_piix.c  |    6 +++---
 hw/piix_pci.c |   30 ++++++++++++++++++++++++++++--
 2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 6a9b508..fe995b9 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -95,9 +95,9 @@ static void pc_init1(MemoryRegion *system_memory,
         kvmclock_create();
     }
 
-    if (ram_size >= 0xe0000000 ) {
-        above_4g_mem_size = ram_size - 0xe0000000;
-        below_4g_mem_size = 0xe0000000;
+    if (ram_size >= I440FX_PCI_HOLE_START) {
+        above_4g_mem_size = ram_size - I440FX_PCI_HOLE_START;
+        below_4g_mem_size = I440FX_PCI_HOLE_START;
     } else {
         above_4g_mem_size = 0;
         below_4g_mem_size = ram_size;
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index 7ca3c73..9866b1d 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -125,6 +125,25 @@ static const VMStateDescription vmstate_i440fx = {
     }
 };
 
+hwaddr i440fx_pmc_dimm_offset(DeviceState *dev, uint64_t size)
+{
+    PCII440FXState *d = I440FX_PCI_DEVICE(dev);
+    hwaddr ret;
+
+    /* if dimm fits before pci hole, append it normally */
+    if (d->below_4g_mem_size + size <= I440FX_PCI_HOLE_START) {
+        ret = d->below_4g_mem_size;
+        d->below_4g_mem_size += size;
+    }
+    /* otherwise place it above 4GB */
+    else {
+        ret = 0x100000000LL + d->above_4g_mem_size;
+        d->above_4g_mem_size += size;
+    }
+
+    return ret;
+}
+
 static void i440fx_pcihost_initfn(Object *obj)
 {
     I440FXState *s = I440FX_HOST_DEVICE(obj);
@@ -148,8 +167,8 @@ static int i440fx_pcihost_init(SysBusDevice *dev)
     sysbus_add_io(dev, 0xcfc, &pci->data_mem);
     sysbus_init_ioports(&pci->busdev, 0xcfc, 4);
 
-    b = pci_bus_new(&s->parent_obj.busdev.qdev, NULL, s->mch.pci_address_space,
-                    s->mch.address_space_io, 0);
+    b = pci_bus_new(&s->parent_obj.busdev.qdev, "pci.0",
+            s->mch.pci_address_space, s->mch.address_space_io, 0);
     s->parent_obj.bus = b;
     qdev_set_parent_bus(DEVICE(&s->mch), BUS(b));
     qdev_init_nofail(DEVICE(&s->mch));
@@ -169,6 +188,13 @@ static int i440fx_initfn(PCIDevice *dev)
 
     pci_hole64_size = (sizeof(hwaddr) == 4 ? 0 :
                        ((uint64_t)1 << 62));
+
+    /* Initialize i440fx's DRAM channel, it can hold up to 8 DRAM ranks */
+    f->dram_channel0 = dimm_bus_create(OBJECT(f), "membus.0", 8,
+            i440fx_pmc_dimm_offset);
+    /* Initialize paravirtual memory bus */
+    f->pv_dram_channel = dimm_bus_create(OBJECT(f), "membus.pv", 0,
+            i440fx_pmc_dimm_offset);
     memory_region_init_alias(&f->pci_hole, "pci-hole", f->pci_address_space,
                              f->below_4g_mem_size,
                              0x100000000LL - f->below_4g_mem_size);
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 15/30] q35: Add i440fx dram controller initialization
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (13 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 14/30] piix_pci: Add i440fx dram controller initialization Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 16/30] pc: Add dimm paravirt SRAT info Vasilis Liaskovitis
                   ` (19 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Create memory buses and introduce function to adjust memory map for
hotplug-able dimms.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 hw/pc_q35.c |    1 +
 hw/q35.c    |   27 +++++++++++++++++++++++++++
 hw/q35.h    |    5 +++++
 3 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/hw/pc_q35.c b/hw/pc_q35.c
index 3429a9a..e6375bf 100644
--- a/hw/pc_q35.c
+++ b/hw/pc_q35.c
@@ -41,6 +41,7 @@
 #include "hw/ide/pci.h"
 #include "hw/ide/ahci.h"
 #include "hw/usb.h"
+#include "fw_cfg.h"
 
 /* ICH9 AHCI has 6 ports */
 #define MAX_SATA_PORTS     6
diff --git a/hw/q35.c b/hw/q35.c
index efebc27..cc27d72 100644
--- a/hw/q35.c
+++ b/hw/q35.c
@@ -236,12 +236,39 @@ static void mch_reset(DeviceState *qdev)
     mch_update(mch);
 }
 
+static hwaddr mch_dimm_offset(DeviceState *dev, uint64_t size)
+{
+    MCHPCIState *d = MCH_PCI_DEVICE(dev);
+    hwaddr ret;
+
+    /* if dimm fits before pci hole, append it normally */
+    if (d->below_4g_mem_size + size <= MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT) {
+        ret = d->below_4g_mem_size;
+        d->below_4g_mem_size += size;
+    }
+    /* otherwise place it above 4GB */
+    else {
+        ret = 0x100000000LL + d->above_4g_mem_size;
+        d->above_4g_mem_size += size;
+    }
+
+    return ret;
+}
+
 static int mch_init(PCIDevice *d)
 {
     int i;
     hwaddr pci_hole64_size;
     MCHPCIState *mch = MCH_PCI_DEVICE(d);
 
+    /* Initialize 2 GMC DRAM channels x 4 DRAM ranks each */
+    mch->dram_channel[0] = dimm_bus_create(OBJECT(d), "membus.0", 4,
+            mch_dimm_offset);
+    mch->dram_channel[1] = dimm_bus_create(OBJECT(d), "membus.1", 4,
+            mch_dimm_offset);
+    /* Initialize paravirtual memory bus */
+    mch->pv_dram_channel = dimm_bus_create(OBJECT(d), "membus.pv", 0,
+            mch_dimm_offset);
     /* setup pci memory regions */
     memory_region_init_alias(&mch->pci_hole, "pci-hole",
                              mch->pci_address_space,
diff --git a/hw/q35.h b/hw/q35.h
index e34f7c1..bf76dc8 100644
--- a/hw/q35.h
+++ b/hw/q35.h
@@ -34,6 +34,7 @@
 #include "acpi.h"
 #include "acpi_ich9.h"
 #include "pam.h"
+#include "dimm.h"
 
 #define TYPE_Q35_HOST_DEVICE "q35-pcihost"
 #define Q35_HOST_DEVICE(obj) \
@@ -56,6 +57,10 @@ typedef struct MCHPCIState {
     uint8_t smm_enabled;
     ram_addr_t below_4g_mem_size;
     ram_addr_t above_4g_mem_size;
+    /* GMCH allows for 2 DRAM channels x 4 DRAM ranks each */
+    DimmBus * dram_channel[2];
+    /* paravirtual memory bus */
+    DimmBus *pv_dram_channel;
 } MCHPCIState;
 
 typedef struct Q35PCIHost {
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 16/30] pc: Add dimm paravirt SRAT info
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (14 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 15/30] q35: " Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 17/30] [SeaBIOS] pci: Use paravirt interface for pcimem_start and pcimem64_start Vasilis Liaskovitis
                   ` (18 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

The numa_fw_cfg paravirt interface is extended to include SRAT information for
all hotplug-able dimms. There are 3 words for each hotplug-able memory slot,
denoting start address, size and node proximity. The new info is appended after
existing numa info, so that the fw_cfg layout does not break.  This information
is used by Seabios to build hotplug memory device objects at runtime.
nb_numa_nodes is set to 1 by default (not 0), so that we always pass srat info
to SeaBIOS.

v3->v4: numa_fw_cfg needs to be initalized after memory controller sets up dimm
ranges.  Make changes for pc_piix and pc_q35 to set numa_fw_cfg after i440fx
initialization.

v2->v3: setting nb_numa_nodes to 1 is not needed

v1->v2:
Dimm SRAT info (#dimms) is appended at end of existing numa fw_cfg in order not
to break existing layout
Documentation of the new fwcfg layout is included in docs/specs/fwcfg.txt

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 docs/specs/fwcfg.txt |   28 ++++++++++++++++++++++++++++
 hw/pc.c              |   28 +++++++++++++++++++++++-----
 hw/pc.h              |    1 +
 hw/pc_piix.c         |    1 +
 hw/pc_q35.c          |    8 +++++---
 sysemu.h             |    1 +
 6 files changed, 59 insertions(+), 8 deletions(-)
 create mode 100644 docs/specs/fwcfg.txt

diff --git a/docs/specs/fwcfg.txt b/docs/specs/fwcfg.txt
new file mode 100644
index 0000000..e6fcd8f
--- /dev/null
+++ b/docs/specs/fwcfg.txt
@@ -0,0 +1,28 @@
+QEMU<->BIOS Paravirt Documentation
+--------------------------------------
+
+This document describes paravirt data structures passed from QEMU to BIOS.
+
+fw_cfg SRAT paravirt info
+--------------------
+The SRAT info passed from QEMU to BIOS has the following layout:
+
+-----------------------------------------------------------------------------------------------
+#nodes | cpu0_pxm | cpu1_pxm | ... | cpulast_pxm | node0_mem | node1_mem | ... | nodelast_mem
+
+-----------------------------------------------------------------------------------------------
+#dimms | dimm0_start | dimm0_sz | dimm0_pxm | ... | dimmlast_start | dimmlast_sz | dimmlast_pxm
+
+Entry 0 contains the number of numa nodes (nb_numa_nodes).
+
+Entries 1..max_cpus: The next max_cpus entries describe node proximity for each
+one of the vCPUs in the system.
+
+Entries max_cpus+1..max_cpus+nb_numa_nodes+1:  The next nb_numa_nodes entries
+describe the memory size for each one of the NUMA nodes in the system.
+
+Entry max_cpus+nb_numa_nodes+1 contains the number of memory dimms (nb_hp_dimms)
+
+The last 3 * nb_hp_dimms entries are organized in triplets: Each triplet contains
+the physical address offset, size (in bytes), and node proximity for the
+respective dimm.
diff --git a/hw/pc.c b/hw/pc.c
index b11e7c4..025c356 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -51,6 +51,7 @@
 #include "exec-memory.h"
 #include "arch_init.h"
 #include "bitmap.h"
+#include "hw/dimm.h"
 
 /* debug PC/ISA interrupts */
 //#define DEBUG_IRQ
@@ -582,8 +583,6 @@ static void *bochs_bios_init(void)
     void *fw_cfg;
     uint8_t *smbios_table;
     size_t smbios_len;
-    uint64_t *numa_fw_cfg;
-    int i, j;
     PortioList *bochs_bios_port_list = g_new(PortioList, 1);
 
     portio_list_init(bochs_bios_port_list, bochs_bios_portio_list,
@@ -607,11 +606,24 @@ static void *bochs_bios_init(void)
 
     fw_cfg_add_bytes(fw_cfg, FW_CFG_HPET, (uint8_t *)&hpet_cfg,
                      sizeof(struct hpet_fw_config));
+
+    return fw_cfg;
+}
+
+void bochs_meminfo_bios_init(void *fw_cfg)
+{
+    uint64_t *numa_fw_cfg;
+    uint64_t *hp_dimms_fw_cfg;
+    int i, j;
+
     /* allocate memory for the NUMA channel: one (64bit) word for the number
      * of nodes, one word for each VCPU->node and one word for each node to
      * hold the amount of memory.
+     * Finally one word for the number of hotplug memory slots and three words
+     * for each hotplug memory slot (start address, size and node proximity).
      */
-    numa_fw_cfg = g_malloc0((1 + max_cpus + nb_numa_nodes) * 8);
+    numa_fw_cfg = g_malloc0((2 + max_cpus + nb_numa_nodes + 3 * nb_hp_dimms)
+            * 8);
     numa_fw_cfg[0] = cpu_to_le64(nb_numa_nodes);
     for (i = 0; i < max_cpus; i++) {
         for (j = 0; j < nb_numa_nodes; j++) {
@@ -624,10 +636,16 @@ static void *bochs_bios_init(void)
     for (i = 0; i < nb_numa_nodes; i++) {
         numa_fw_cfg[max_cpus + 1 + i] = cpu_to_le64(node_mem[i]);
     }
+
+    numa_fw_cfg[1 + max_cpus + nb_numa_nodes] = cpu_to_le64(nb_hp_dimms);
+
+    hp_dimms_fw_cfg = numa_fw_cfg + 2 + max_cpus + nb_numa_nodes;
+    if (nb_hp_dimms) {
+        dimm_setup_fwcfg_layout(hp_dimms_fw_cfg);
+    }
     fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA, (uint8_t *)numa_fw_cfg,
-                     (1 + max_cpus + nb_numa_nodes) * 8);
+                     (2 + max_cpus + nb_numa_nodes + 3 * nb_hp_dimms) * 8);
 
-    return fw_cfg;
 }
 
 static long get_file_size(FILE *f)
diff --git a/hw/pc.h b/hw/pc.h
index 2237e86..075514f 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -185,5 +185,6 @@ void pc_system_firmware_init(MemoryRegion *rom_memory);
 #define E820_UNUSABLE   5
 
 int e820_add_entry(uint64_t, uint64_t, uint32_t);
+void bochs_meminfo_bios_init(void *fw_cfg);
 
 #endif
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index fe995b9..1a99852 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -140,6 +140,7 @@ static void pc_init1(MemoryRegion *system_memory,
         i440fx_host->mch.above_4g_mem_size = above_4g_mem_size;
 
         qdev_init_nofail(DEVICE(i440fx_host));
+        bochs_meminfo_bios_init(fw_cfg);
         i440fx_state = &i440fx_host->mch;
         pci_bus = i440fx_host->parent_obj.bus;
         /* Xen supports additional interrupt routes from the PCI devices to
diff --git a/hw/pc_q35.c b/hw/pc_q35.c
index e6375bf..7ce0b53 100644
--- a/hw/pc_q35.c
+++ b/hw/pc_q35.c
@@ -86,6 +86,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
     ICH9LPCState *ich9_lpc;
     PCIDevice *ahci;
     qemu_irq *cmos_s3;
+    void *fw_cfg = NULL;
 
     pc_cpus_init(cpu_model);
 
@@ -111,9 +112,9 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
 
     /* allocate ram and load rom/bios */
     if (!xen_enabled()) {
-        pc_memory_init(get_system_memory(), kernel_filename, kernel_cmdline,
-                       initrd_filename, below_4g_mem_size, above_4g_mem_size,
-                       rom_memory, &ram_memory);
+        fw_cfg = pc_memory_init(get_system_memory(), kernel_filename,
+                kernel_cmdline, initrd_filename, below_4g_mem_size,
+                above_4g_mem_size, rom_memory, &ram_memory);
     }
 
     /* irq lines */
@@ -137,6 +138,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
     q35_host->mch.above_4g_mem_size = above_4g_mem_size;
     /* pci */
     qdev_init_nofail(DEVICE(q35_host));
+    bochs_meminfo_bios_init(fw_cfg);
     host_bus = q35_host->host.pci.bus;
     /* create ISA bus */
     lpc = pci_create_simple_multifunction(host_bus, PCI_DEVFN(ICH9_LPC_DEV,
diff --git a/sysemu.h b/sysemu.h
index 1b6add2..86f729e 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -130,6 +130,7 @@ extern QEMUClock *rtc_clock;
 extern int nb_numa_nodes;
 extern uint64_t node_mem[MAX_NODES];
 extern unsigned long *node_cpumask[MAX_NODES];
+extern int nb_hp_dimms;
 
 #define MAX_OPTION_ROMS 16
 typedef struct QEMUOptionRom {
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 17/30] [SeaBIOS] pci: Use paravirt interface for pcimem_start and pcimem64_start
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (15 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 16/30] pc: Add dimm paravirt SRAT info Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 18/30] Introduce paravirt interface QEMU_CFG_PCI_WINDOW Vasilis Liaskovitis
                   ` (17 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Initialize the 32-bit and 64-bit pci starting offsets from values passed in by
the qemu paravirt interface QEMU_CFG_PCI_WINDOW. Qemu calculates the starting
offsets based on initial memory and hotplug-able dimms.
It's possible to avoid the new paravirt interface, and calculate pci ranges from
srat entries. But the code changes are ugly, see:
http://lists.gnu.org/archive/html/qemu-devel/2012-09/msg03548.html
---
 src/paravirt.c |    6 ++++++
 src/paravirt.h |    2 ++
 src/pciinit.c  |    9 +++++++++
 3 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/src/paravirt.c b/src/paravirt.c
index 4b5c441..f7517b9 100644
--- a/src/paravirt.c
+++ b/src/paravirt.c
@@ -347,3 +347,9 @@ void qemu_cfg_romfile_setup(void)
         dprintf(3, "Found fw_cfg file: %s (size=%d)\n", file->name, file->size);
     }
 }
+
+void qemu_cfg_get_pci_offsets(u64 *pcimem_start, u64 *pcimem64_start)
+{
+    qemu_cfg_read_entry(pcimem_start, QEMU_CFG_PCI_WINDOW, sizeof(u64));
+    qemu_cfg_read((u8*)(pcimem64_start), sizeof(u64));
+}
diff --git a/src/paravirt.h b/src/paravirt.h
index a284c41..b53ff88 100644
--- a/src/paravirt.h
+++ b/src/paravirt.h
@@ -35,6 +35,7 @@ static inline int kvm_para_available(void)
 #define QEMU_CFG_BOOT_MENU              0x0e
 #define QEMU_CFG_MAX_CPUS               0x0f
 #define QEMU_CFG_FILE_DIR               0x19
+#define QEMU_CFG_PCI_WINDOW             0x1a
 #define QEMU_CFG_ARCH_LOCAL             0x8000
 #define QEMU_CFG_ACPI_TABLES            (QEMU_CFG_ARCH_LOCAL + 0)
 #define QEMU_CFG_SMBIOS_ENTRIES         (QEMU_CFG_ARCH_LOCAL + 1)
@@ -65,5 +66,6 @@ struct e820_reservation {
 u32 qemu_cfg_e820_entries(void);
 void* qemu_cfg_e820_load_next(void *addr);
 void qemu_cfg_romfile_setup(void);
+void qemu_cfg_get_pci_offsets(u64 *pcimem_start, u64 *pcimem64_start);
 
 #endif
diff --git a/src/pciinit.c b/src/pciinit.c
index a406bbd..4103d2d 100644
--- a/src/pciinit.c
+++ b/src/pciinit.c
@@ -734,6 +734,7 @@ static void pci_bios_map_devices(struct pci_bus *busses)
 void
 pci_setup(void)
 {
+    u64 pv_pcimem_start, pv_pcimem64_start;
     if (CONFIG_COREBOOT || usingXen()) {
         // PCI setup already done by coreboot or Xen - just do probe.
         pci_probe_devices();
@@ -769,5 +770,13 @@ pci_setup(void)
 
     pci_bios_init_devices();
 
+    /* if qemu gives us other pci window values, it means there are hotplug-able
+     * dimms. Adjust accordingly */
+    qemu_cfg_get_pci_offsets(&pv_pcimem_start, &pv_pcimem64_start);
+    if (pv_pcimem_start > pcimem_start)
+        pcimem_start = pv_pcimem_start;
+    if (pv_pcimem64_start > pcimem64_start)
+        pcimem64_start = pv_pcimem64_start;
+
     free(busses);
 }
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 18/30] Introduce paravirt interface QEMU_CFG_PCI_WINDOW
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (16 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 17/30] [SeaBIOS] pci: Use paravirt interface for pcimem_start and pcimem64_start Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 19/30] Implement "info memory-total" and "query-memory-total" Vasilis Liaskovitis
                   ` (16 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Qemu calculates the 32-bit and 64-bit PCI starting offsets based on
initial memory and hotplug-able dimms. This info needs to be passed to Seabios
for PCI initialization.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 hw/fw_cfg.h  |    1 +
 hw/pc_piix.c |   10 ++++++++++
 hw/pc_q35.c  |    9 +++++++++
 3 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
index 619a394..8b48493 100644
--- a/hw/fw_cfg.h
+++ b/hw/fw_cfg.h
@@ -27,6 +27,7 @@
 #define FW_CFG_SETUP_SIZE       0x17
 #define FW_CFG_SETUP_DATA       0x18
 #define FW_CFG_FILE_DIR         0x19
+#define FW_CFG_PCI_WINDOW       0x1a
 
 #define FW_CFG_FILE_FIRST       0x20
 #define FW_CFG_FILE_SLOTS       0x10
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 1a99852..b6633e8 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -48,6 +48,7 @@
 #  include <xen/hvm/hvm_info_table.h>
 #endif
 #include "piix_pci.h"
+#include "fw_cfg.h"
 
 #define MAX_IDE_BUS 2
 
@@ -86,6 +87,7 @@ static void pc_init1(MemoryRegion *system_memory,
     MemoryRegion *pci_memory;
     MemoryRegion *rom_memory;
     void *fw_cfg = NULL;
+    uint64_t *pci_window_fw_cfg;
     I440FXState *i440fx_host;
     PIIX3State *piix3;
 
@@ -141,6 +143,14 @@ static void pc_init1(MemoryRegion *system_memory,
 
         qdev_init_nofail(DEVICE(i440fx_host));
         bochs_meminfo_bios_init(fw_cfg);
+
+        pci_window_fw_cfg = g_malloc0(2 * 8);
+        pci_window_fw_cfg[0] = cpu_to_le64(i440fx_host->mch.below_4g_mem_size);
+        pci_window_fw_cfg[1] = cpu_to_le64(0x100000000ULL +
+                i440fx_host->mch.above_4g_mem_size);
+        fw_cfg_add_bytes(fw_cfg, FW_CFG_PCI_WINDOW,
+                            (uint8_t *)pci_window_fw_cfg, 2 * 8);
+
         i440fx_state = &i440fx_host->mch;
         pci_bus = i440fx_host->parent_obj.bus;
         /* Xen supports additional interrupt routes from the PCI devices to
diff --git a/hw/pc_q35.c b/hw/pc_q35.c
index 7ce0b53..e35814a 100644
--- a/hw/pc_q35.c
+++ b/hw/pc_q35.c
@@ -87,6 +87,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
     PCIDevice *ahci;
     qemu_irq *cmos_s3;
     void *fw_cfg = NULL;
+    uint64_t *pci_window_fw_cfg;
 
     pc_cpus_init(cpu_model);
 
@@ -139,6 +140,14 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
     /* pci */
     qdev_init_nofail(DEVICE(q35_host));
     bochs_meminfo_bios_init(fw_cfg);
+
+    pci_window_fw_cfg = g_malloc0(2 * 8);
+    pci_window_fw_cfg[0] = cpu_to_le64(MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT);
+    pci_window_fw_cfg[1] = cpu_to_le64(0x100000000ULL +
+            q35_host->mch.above_4g_mem_size);
+    fw_cfg_add_bytes(fw_cfg, FW_CFG_PCI_WINDOW,
+                        (uint8_t *)pci_window_fw_cfg, 2 * 8);
+
     host_bus = q35_host->host.pci.bus;
     /* create ISA bus */
     lpc = pci_create_simple_multifunction(host_bus, PCI_DEVFN(ICH9_LPC_DEV,
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 19/30] Implement "info memory-total" and "query-memory-total"
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (17 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 18/30] Introduce paravirt interface QEMU_CFG_PCI_WINDOW Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-19 19:47   ` Blue Swirl
  2013-01-04 16:21   ` Eric Blake
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 20/30] balloon: update with hotplugged memory Vasilis Liaskovitis
                   ` (15 subsequent siblings)
  34 siblings, 2 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Returns total physical memory available to guest in bytes, including hotplugged
memory. Note that the number reported here may be different from what the guest
sees e.g. if the guest has not logically onlined hotplugged memory.

This functionality is provided independently of a balloon device, since a
guest can be using ACPI memory hotplug without using a balloon device.

v3->v4: Moved qmp command implementation to vl.c. This prevents a circular
header dependency problem.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 hmp-commands.hx  |    2 ++
 hmp.c            |    7 +++++++
 hmp.h            |    1 +
 hw/dimm.c        |   14 ++++++++++++++
 hw/dimm.h        |    1 +
 monitor.c        |    7 +++++++
 qapi-schema.json |   11 +++++++++++
 qmp-commands.hx  |   20 ++++++++++++++++++++
 vl.c             |    9 +++++++++
 9 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 010b8c9..3fbd975 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1570,6 +1570,8 @@ show device tree
 show qdev device model list
 @item info roms
 show roms
+@item info memory-total
+show memory-total
 @end table
 ETEXI
 
diff --git a/hmp.c b/hmp.c
index 180ba2b..fb39b0d 100644
--- a/hmp.c
+++ b/hmp.c
@@ -628,6 +628,13 @@ void hmp_info_block_jobs(Monitor *mon)
     }
 }
 
+void hmp_info_memory_total(Monitor *mon)
+{
+    uint64_t ram_total;
+    ram_total = (uint64_t)qmp_query_memory_total(NULL);
+    monitor_printf(mon, "MemTotal: %lu\n", ram_total);
+}
+
 void hmp_quit(Monitor *mon, const QDict *qdict)
 {
     monitor_suspend(mon);
diff --git a/hmp.h b/hmp.h
index 0ab03be..25a3a70 100644
--- a/hmp.h
+++ b/hmp.h
@@ -36,6 +36,7 @@ void hmp_info_spice(Monitor *mon);
 void hmp_info_balloon(Monitor *mon);
 void hmp_info_pci(Monitor *mon);
 void hmp_info_block_jobs(Monitor *mon);
+void hmp_info_memory_total(Monitor *mon);
 void hmp_quit(Monitor *mon, const QDict *qdict);
 void hmp_stop(Monitor *mon, const QDict *qdict);
 void hmp_system_reset(Monitor *mon, const QDict *qdict);
diff --git a/hw/dimm.c b/hw/dimm.c
index e384952..f181e54 100644
--- a/hw/dimm.c
+++ b/hw/dimm.c
@@ -189,6 +189,20 @@ void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots)
     }
 }
 
+uint64_t get_hp_memory_total(void)
+{
+    DimmBus *bus;
+    DimmDevice *slot;
+    uint64_t info = 0;
+
+    QLIST_FOREACH(bus, &memory_buses, next) {
+        QTAILQ_FOREACH(slot, &bus->dimmlist, nextdimm) {
+            info += slot->size;
+        }
+    }
+    return info;
+}
+
 static int dimm_init(DeviceState *s)
 {
     DimmBus *bus = DIMM_BUS(qdev_get_parent_bus(s));
diff --git a/hw/dimm.h b/hw/dimm.h
index 75a6911..5130b2c 100644
--- a/hw/dimm.h
+++ b/hw/dimm.h
@@ -85,5 +85,6 @@ DimmBus *dimm_bus_create(Object *parent, const char *name, uint32_t max_dimms,
     dimm_calcoffset_fn pmc_set_offset);
 void dimm_config_create(char *id, uint64_t size, const char *bus, uint64_t node,
         uint32_t dimm_idx, uint32_t populated);
+uint64_t get_hp_memory_total(void);
 
 #endif
diff --git a/monitor.c b/monitor.c
index c0e32d6..6e87d0d 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2708,6 +2708,13 @@ static mon_cmd_t info_cmds[] = {
         .mhandler.info = hmp_info_balloon,
     },
     {
+        .name       = "memory-total",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show total memory size",
+        .mhandler.info = hmp_info_memory_total,
+    },
+    {
         .name       = "qtree",
         .args_type  = "",
         .params     = "",
diff --git a/qapi-schema.json b/qapi-schema.json
index 5dfa052..33f88d6 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2903,6 +2903,17 @@
 { 'command': 'query-target', 'returns': 'TargetInfo' }
 
 ##
+# @query-memory-total:
+#
+# Returns total memory in bytes, including hotplugged dimms
+#
+# Returns: int
+#
+# Since: 1.4
+##
+{ 'command': 'query-memory-total', 'returns': 'int' }
+
+##
 # @QKeyCode:
 #
 # An enumeration of key name.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 5c692d0..a99117a 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2654,3 +2654,23 @@ EQMP
         .args_type  = "",
         .mhandler.cmd_new = qmp_marshal_input_query_target,
     },
+
+    {
+        .name       = "query-memory-total",
+        .args_type  = "",
+        .mhandler.cmd_new = qmp_marshal_input_query_memory_total
+    },
+SQMP
+query-memory-total
+----------
+
+Return total memory in bytes, including hotplugged dimms
+
+Example:
+
+-> { "execute": "query-memory-total" }
+<- {
+      "return": 1073741824
+   }
+
+EQMP
diff --git a/vl.c b/vl.c
index 8406933..80803c5 100644
--- a/vl.c
+++ b/vl.c
@@ -126,6 +126,7 @@ int main(int argc, char **argv)
 #include "hw/xen.h"
 #include "hw/qdev.h"
 #include "hw/loader.h"
+#include "hw/dimm.h"
 #include "bt-host.h"
 #include "net.h"
 #include "net/slirp.h"
@@ -442,6 +443,14 @@ StatusInfo *qmp_query_status(Error **errp)
     return info;
 }
 
+int64_t qmp_query_memory_total(Error **errp)
+{
+    uint64_t info;
+    info = ram_size + get_hp_memory_total();
+
+    return (int64_t)info;
+}
+
 /***********************************************************/
 /* real time host monotonic timer */
 
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 20/30] balloon: update with hotplugged memory
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (18 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 19/30] Implement "info memory-total" and "query-memory-total" Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 21/30] Implement dimm-info Vasilis Liaskovitis
                   ` (14 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

query-balloon and "info balloon" should report total memory available to the
guest.

balloon inflate/ deflate can also use all memory available to the guest (initial
+ hotplugged memory)

Ballon driver has been minimaly tested with the patch, please review and test.

Caveat: if the guest does not online hotplugged-memory, it's easy for a balloon
inflate command to OOM a guest.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 hw/virtio-balloon.c |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/hw/virtio-balloon.c b/hw/virtio-balloon.c
index dd1a650..149e8ba 100644
--- a/hw/virtio-balloon.c
+++ b/hw/virtio-balloon.c
@@ -22,6 +22,7 @@
 #include "virtio-balloon.h"
 #include "kvm.h"
 #include "exec-memory.h"
+#include "dimm.h"
 
 #if defined(__linux__)
 #include <sys/mman.h>
@@ -147,10 +148,11 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
     VirtIOBalloon *dev = to_virtio_balloon(vdev);
     struct virtio_balloon_config config;
     uint32_t oldactual = dev->actual;
+    uint64_t hotplugged_ram_size = get_hp_memory_total();
     memcpy(&config, config_data, 8);
     dev->actual = le32_to_cpu(config.actual);
     if (dev->actual != oldactual) {
-        qemu_balloon_changed(ram_size -
+        qemu_balloon_changed(ram_size + hotplugged_ram_size -
                              (dev->actual << VIRTIO_BALLOON_PFN_SHIFT));
     }
 }
@@ -188,17 +190,20 @@ static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
 
     info->actual = ram_size - ((uint64_t) dev->actual <<
                                VIRTIO_BALLOON_PFN_SHIFT);
+    info->actual += get_hp_memory_total();
 }
 
 static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
 {
     VirtIOBalloon *dev = opaque;
+    uint64_t hotplugged_ram_size = get_hp_memory_total();
 
-    if (target > ram_size) {
-        target = ram_size;
+    if (target > ram_size + hotplugged_ram_size) {
+        target = ram_size + hotplugged_ram_size;
     }
     if (target) {
-        dev->num_pages = (ram_size - target) >> VIRTIO_BALLOON_PFN_SHIFT;
+        dev->num_pages = (ram_size + hotplugged_ram_size - target) >>
+                                 VIRTIO_BALLOON_PFN_SHIFT;
         virtio_notify_config(&dev->vdev);
     }
 }
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 21/30] Implement dimm-info
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (19 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 20/30] balloon: update with hotplugged memory Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2013-01-08 23:20   ` Eric Blake
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 22/30] [SeaBIOS] acpi: add _EJ0 operation and eject port for memory devices Vasilis Liaskovitis
                   ` (13 subsequent siblings)
  34 siblings, 1 reply; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

"query-dimm-info" and "info dimm" will give current state of all dimms in the
system e.g.

dimm0: on
dimm1: off
dimm2: off
dimm3: on
etc.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 hmp-commands.hx  |    2 ++
 hmp.c            |   17 +++++++++++++++++
 hmp.h            |    1 +
 hw/dimm.c        |   43 +++++++++++++++++++++++++++++++++++++++++++
 monitor.c        |    7 +++++++
 qapi-schema.json |   26 ++++++++++++++++++++++++++
 6 files changed, 96 insertions(+), 0 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 3fbd975..65d799e 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1572,6 +1572,8 @@ show qdev device model list
 show roms
 @item info memory-total
 show memory-total
+@item info dimm
+show dimm
 @end table
 ETEXI
 
diff --git a/hmp.c b/hmp.c
index fb39b0d..f8456fd 100644
--- a/hmp.c
+++ b/hmp.c
@@ -635,6 +635,23 @@ void hmp_info_memory_total(Monitor *mon)
     monitor_printf(mon, "MemTotal: %lu\n", ram_total);
 }
 
+void hmp_info_dimm(Monitor *mon)
+{
+    DimmInfoList *info;
+    DimmInfoList *item;
+    DimmInfo *dimm;
+
+    info = qmp_query_dimm_info(NULL);
+    for (item = info; item; item = item->next) {
+        dimm = item->value;
+        monitor_printf(mon, "dimm %s : %s\n", dimm->dimm,
+                dimm->state ? "on" : "off");
+        dimm->dimm = NULL;
+    }
+
+    qapi_free_DimmInfoList(info);
+}
+
 void hmp_quit(Monitor *mon, const QDict *qdict)
 {
     monitor_suspend(mon);
diff --git a/hmp.h b/hmp.h
index 25a3a70..74ac061 100644
--- a/hmp.h
+++ b/hmp.h
@@ -37,6 +37,7 @@ void hmp_info_balloon(Monitor *mon);
 void hmp_info_pci(Monitor *mon);
 void hmp_info_block_jobs(Monitor *mon);
 void hmp_info_memory_total(Monitor *mon);
+void hmp_info_dimm(Monitor *mon);
 void hmp_quit(Monitor *mon, const QDict *qdict);
 void hmp_stop(Monitor *mon, const QDict *qdict);
 void hmp_system_reset(Monitor *mon, const QDict *qdict);
diff --git a/hw/dimm.c b/hw/dimm.c
index f181e54..e79f23d 100644
--- a/hw/dimm.c
+++ b/hw/dimm.c
@@ -174,6 +174,18 @@ static DimmConfig *dimmcfg_find_from_name(DimmBus *bus, const char *name)
     return NULL;
 }
 
+static DimmDevice *dimm_find_from_name(DimmBus *bus, const char *name)
+{
+    DimmDevice *slot;
+
+    QTAILQ_FOREACH(slot, &bus->dimmlist, nextdimm) {
+        if (!strcmp(slot->qdev.id, name)) {
+            return slot;
+        }
+    }
+    return NULL;
+}
+
 void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots)
 {
     DimmConfig *slot;
@@ -203,6 +215,37 @@ uint64_t get_hp_memory_total(void)
     return info;
 }
 
+DimmInfoList *qmp_query_dimm_info(Error **errp)
+{
+    DimmBus *bus;
+    DimmConfig *slot;
+    DimmInfoList *head = NULL, *info, *cur_item = NULL;
+
+    QLIST_FOREACH(bus, &memory_buses, next) {
+        QTAILQ_FOREACH(slot, &bus->dimmconfig_list, nextdimmcfg) {
+
+            info = g_malloc0(sizeof(*info));
+            info->value = g_malloc0(sizeof(*info->value));
+            info->value->dimm = g_malloc0(sizeof(char) * 32);
+            strcpy(info->value->dimm, slot->name);
+            if (dimm_find_from_name(bus, slot->name)) {
+                info->value->state = 1;
+            } else {
+                info->value->state = 0;
+            }
+            /* XXX: waiting for the qapi to support GSList */
+            if (!cur_item) {
+                head = cur_item = info;
+            } else {
+                cur_item->next = info;
+                cur_item = info;
+            }
+        }
+    }
+
+    return head;
+}
+
 static int dimm_init(DeviceState *s)
 {
     DimmBus *bus = DIMM_BUS(qdev_get_parent_bus(s));
diff --git a/monitor.c b/monitor.c
index 6e87d0d..de1dcf1 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2743,6 +2743,13 @@ static mon_cmd_t info_cmds[] = {
         .mhandler.info = do_trace_print_events,
     },
     {
+        .name       = "dimm",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show active and non active dimms",
+        .mhandler.info = hmp_info_dimm,
+    },
+    {
         .name       = NULL,
     },
 };
diff --git a/qapi-schema.json b/qapi-schema.json
index 33f88d6..5a20577 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2914,6 +2914,32 @@
 { 'command': 'query-memory-total', 'returns': 'int' }
 
 ##
+# @DimmInfo:
+#
+# Information about status of a memory hotplug command
+#
+# @dimm: the Dimm associated with the result
+#
+# @result: the result of the hotplug command
+#
+# Since: 1.4
+#
+##
+{ 'type': 'DimmInfo',
+  'data': {'dimm': 'str', 'state': 'bool'} }
+
+##
+# @query-dimm-info:
+#
+# Returns total memory in bytes, including hotplugged dimms
+#
+# Returns: int
+#
+# Since: 1.4
+##
+{ 'command': 'query-dimm-info', 'returns': ['DimmInfo'] }
+
+##
 # @QKeyCode:
 #
 # An enumeration of key name.
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 22/30] [SeaBIOS] acpi: add _EJ0 operation and eject port for memory devices
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (20 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 21/30] Implement dimm-info Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 23/30] dimm: add hot-remove capability Vasilis Liaskovitis
                   ` (12 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

This will allow hot-remove signalling from/to qemu and acpi-enabled guest.
---
 src/acpi-dsdt-mem-hotplug.dsl |   15 +++++++++++++++
 src/ssdt-mem.dsl              |    3 +++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/src/acpi-dsdt-mem-hotplug.dsl b/src/acpi-dsdt-mem-hotplug.dsl
index 0e7ced3..fd73ea7 100644
--- a/src/acpi-dsdt-mem-hotplug.dsl
+++ b/src/acpi-dsdt-mem-hotplug.dsl
@@ -21,6 +21,13 @@ Scope(\_SB) {
             MES, 256
         }
  
+        /* Memory eject byte */
+        OperationRegion(MEMJ, SystemIO, 0xafa0, 1)
+        Field (MEMJ, ByteAcc, NoLock, Preserve)
+        {
+            MPE, 8
+        }
+        
         Method(MESC, 0) {
             // Local5 = active memdevice bitmap
             Store (MES, Local5)
@@ -47,6 +54,8 @@ Scope(\_SB) {
                     // Do MEM notify
                     If (LEqual(Local3, 1)) {
                         MTFY(Local0, 1)
+                    } Else {
+                        MTFY(Local0, 3)
                     }
                 }
                 Increment(Local0)
@@ -54,4 +63,10 @@ Scope(\_SB) {
             Return(One)
         }
 
+        Method (MPEJ, 2, NotSerialized) {
+            // _EJ0 method - eject callback
+            Store(Arg0, MPE)
+            Sleep(200)
+        }
+
 }
diff --git a/src/ssdt-mem.dsl b/src/ssdt-mem.dsl
index dbac33f..eef84b6 100644
--- a/src/ssdt-mem.dsl
+++ b/src/ssdt-mem.dsl
@@ -57,6 +57,9 @@ DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", "CSSDT", 0x1)
         Method (_STA, 0) {
             Return(CMST(ID))        
         }    
+        Method (_EJ0, 1, NotSerialized) {
+            MPEJ(ID, Arg0)
+        }
     }
 }    
 
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 23/30] dimm: add hot-remove capability
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (21 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 22/30] [SeaBIOS] acpi: add _EJ0 operation and eject port for memory devices Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 24/30] acpi_piix4: " Vasilis Liaskovitis
                   ` (11 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

On a succesfull _EJ0 operation unmap the device from the guest by using the new
qdev function qdev_unplug_complete, see:
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02699.html

The memory of the device should be freed when the last subsystem using it
unmaps it, see the following two series:
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg00728.html
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02697.html

Needs testing. Other subsystems (e.g. virtio-blk) may have to install new
memorylisteners to complete pending I/O before device memory can be freed.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 hw/dimm.c |   51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/dimm.h |    1 +
 2 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/hw/dimm.c b/hw/dimm.c
index e79f23d..0b4e22d 100644
--- a/hw/dimm.c
+++ b/hw/dimm.c
@@ -120,6 +120,18 @@ static void dimm_populate(DimmDevice *s)
     s->mr = new;
 }
 
+static int dimm_depopulate(DeviceState *dev)
+{
+    DimmDevice *s = DIMM(dev);
+    assert(s);
+    vmstate_unregister_ram(s->mr, NULL);
+    memory_region_del_subregion(get_system_memory(), s->mr);
+    memory_region_destroy(s->mr);
+    s->populated = false;
+    s->mr = NULL;
+    return 0;
+}
+
 void dimm_config_create(char *id, uint64_t size, const char *bus, uint64_t node,
         uint32_t dimm_idx, uint32_t populated)
 {
@@ -159,6 +171,11 @@ static void dimm_plug_device(DimmDevice *slot)
 
 static int dimm_unplug_device(DeviceState *qdev)
 {
+    DimmBus *bus = DIMM_BUS(qdev_get_parent_bus(qdev));
+
+    if (bus->dimm_hotplug) {
+        bus->dimm_hotplug(bus->dimm_hotplug_qdev, DIMM(qdev), 0);
+    }
     return 1;
 }
 
@@ -186,6 +203,21 @@ static DimmDevice *dimm_find_from_name(DimmBus *bus, const char *name)
     return NULL;
 }
 
+static DimmDevice *dimm_find_from_idx(uint32_t idx)
+{
+    DimmDevice *slot;
+    DimmBus *bus;
+
+    QLIST_FOREACH(bus, &memory_buses, next) {
+        QTAILQ_FOREACH(slot, &bus->dimmlist, nextdimm) {
+            if (slot->idx == idx) {
+                return slot;
+            }
+        }
+    }
+    return NULL;
+}
+
 void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots)
 {
     DimmConfig *slot;
@@ -275,6 +307,24 @@ static int dimm_init(DeviceState *s)
     return 0;
 }
 
+void dimm_notify(uint32_t idx, uint32_t event)
+{
+    DimmBus *bus;
+    DimmDevice *slot;
+
+    slot = dimm_find_from_idx(idx);
+    assert(slot != NULL);
+    bus = DIMM_BUS(qdev_get_parent_bus(&slot->qdev));
+
+    switch (event) {
+    case DIMM_REMOVE_SUCCESS:
+        qdev_unplug_complete((DeviceState *)slot, NULL);
+        QTAILQ_REMOVE(&bus->dimmlist, slot, nextdimm);
+        break;
+    default:
+        break;
+    }
+}
 
 static void dimm_class_init(ObjectClass *klass, void *data)
 {
@@ -283,6 +333,7 @@ static void dimm_class_init(ObjectClass *klass, void *data)
     dc->props = dimm_properties;
     dc->unplug = dimm_unplug_device;
     dc->init = dimm_init;
+    dc->exit = dimm_depopulate;
     dc->bus_type = TYPE_DIMM_BUS;
 }
 
diff --git a/hw/dimm.h b/hw/dimm.h
index 5130b2c..86c7cd5 100644
--- a/hw/dimm.h
+++ b/hw/dimm.h
@@ -86,5 +86,6 @@ DimmBus *dimm_bus_create(Object *parent, const char *name, uint32_t max_dimms,
 void dimm_config_create(char *id, uint64_t size, const char *bus, uint64_t node,
         uint32_t dimm_idx, uint32_t populated);
 uint64_t get_hp_memory_total(void);
+void dimm_notify(uint32_t idx, uint32_t event);
 
 #endif
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 24/30] acpi_piix4: add hot-remove capability
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (22 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 23/30] dimm: add hot-remove capability Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 25/30] acpi_ich9: " Vasilis Liaskovitis
                   ` (10 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

---
 docs/specs/acpi_hotplug.txt |    8 ++++++++
 hw/acpi_piix4.c             |   29 ++++++++++++++++++++++++++++-
 2 files changed, 36 insertions(+), 1 deletions(-)

diff --git a/docs/specs/acpi_hotplug.txt b/docs/specs/acpi_hotplug.txt
index 8391713..cf86242 100644
--- a/docs/specs/acpi_hotplug.txt
+++ b/docs/specs/acpi_hotplug.txt
@@ -12,3 +12,11 @@ Dimm hot-plug notification pending. One bit per slot.
 
 Read by ACPI BIOS GPE.3 handler to notify OS of memory hot-add or hot-remove
 events.  Read-only.
+
+Memory Dimm ejection success notification (IO port 0xafa0, 1-byte access):
+---------------------------------------------------------------
+Dimm hot-remove _EJ0 notification. Byte value indicates Dimm slot that was
+ejected.
+
+Written by ACPI memory device _EJ0 method to notify qemu of successfull
+hot-removal.  Write-only.
diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index 879d8a0..6e4718e 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -50,6 +50,7 @@
 #define PCI_EJ_BASE 0xae08
 #define PCI_RMV_BASE 0xae0c
 #define MEM_BASE 0xaf80
+#define MEM_EJ_BASE 0xafa0
 
 #define PIIX4_MEM_HOTPLUG_STATUS 8
 #define PIIX4_PCI_HOTPLUG_STATUS 2
@@ -544,12 +545,29 @@ static uint32_t memhp_readb(void *opaque, uint32_t addr)
     return val;
 }
 
+static void memhp_writeb(void *opaque, uint32_t addr, uint32_t val)
+{
+    switch (addr) {
+    case MEM_EJ_BASE - MEM_BASE:
+        dimm_notify(val, DIMM_REMOVE_SUCCESS);
+        break;
+    default:
+        PIIX4_DPRINTF("memhp write invalid %x <== %d\n", addr, val);
+    }
+    PIIX4_DPRINTF("memhp write %x <== %d\n", addr, val);
+}
+
 static const MemoryRegionOps piix4_memhp_ops = {
     .old_portio = (MemoryRegionPortio[]) {
         {
             .offset = 0,   .len = DIMM_BITMAP_BYTES, .size = 1,
             .read = memhp_readb,
         },
+        {
+            .offset = MEM_EJ_BASE - MEM_BASE, .len = 1,
+            .size = 1,
+            .write = memhp_writeb,
+        },
         PORTIO_END_OF_LIST()
     },
     .endianness = DEVICE_LITTLE_ENDIAN,
@@ -635,7 +653,7 @@ static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
     memory_region_add_subregion(get_system_io(), PCI_HOTPLUG_ADDR,
                                 &s->io_pci);
     memory_region_init_io(&s->io_memhp, &piix4_memhp_ops, s, "apci-memhp0",
-                          DIMM_BITMAP_BYTES);
+                          DIMM_BITMAP_BYTES + 1);
     memory_region_add_subregion(get_system_io(), MEM_BASE, &s->io_memhp);
 
     for (i = 0; i < DIMM_BITMAP_BYTES; i++) {
@@ -665,6 +683,13 @@ static void enable_mem_device(PIIX4PMState *s, int memdevice)
     g->mems_sts[memdevice/8] |= (1 << (memdevice%8));
 }
 
+static void disable_mem_device(PIIX4PMState *s, int memdevice)
+{
+    struct gpe_regs *g = &s->gperegs;
+    s->ar.gpe.sts[0] |= PIIX4_MEM_HOTPLUG_STATUS;
+    g->mems_sts[memdevice/8] &= ~(1 << (memdevice%8));
+}
+
 static int piix4_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
         add)
 {
@@ -674,6 +699,8 @@ static int piix4_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
 
     if (add) {
         enable_mem_device(s, slot->idx);
+    } else {
+        disable_mem_device(s, slot->idx);
     }
     pm_update_sci(s);
     return 0;
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 25/30] acpi_ich9: add hot-remove capability
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (23 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 24/30] acpi_piix4: " Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-19 19:48   ` Blue Swirl
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 26/30] Implement qmp and hmp commands for notification lists Vasilis Liaskovitis
                   ` (9 subsequent siblings)
  34 siblings, 1 reply; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

---
 hw/acpi_ich9.c |   28 +++++++++++++++++++++++++++-
 hw/acpi_ich9.h |    1 +
 2 files changed, 28 insertions(+), 1 deletions(-)

diff --git a/hw/acpi_ich9.c b/hw/acpi_ich9.c
index abafbb5..f5dc1c9 100644
--- a/hw/acpi_ich9.c
+++ b/hw/acpi_ich9.c
@@ -105,12 +105,29 @@ static uint32_t memhp_readb(void *opaque, uint32_t addr)
     return val;
 }
 
+static void memhp_writeb(void *opaque, uint32_t addr, uint32_t val)
+{
+    switch (addr) {
+    case ICH9_MEM_EJ_BASE - ICH9_MEM_BASE:
+        dimm_notify(val, DIMM_REMOVE_SUCCESS);
+        break;
+    default:
+        ICH9_DEBUG("memhp write invalid %x <== %d\n", addr, val);
+    }
+    ICH9_DEBUG("memhp write %x <== %d\n", addr, val);
+}
+
 static const MemoryRegionOps ich9_memhp_ops = {
     .old_portio = (MemoryRegionPortio[]) {
         {
             .offset = 0,   .len = DIMM_BITMAP_BYTES, .size = 1,
             .read = memhp_readb,
         },
+        {
+            .offset = ICH9_MEM_EJ_BASE - ICH9_MEM_BASE,
+            .len = 1, .size = 1,
+            .write = memhp_writeb,
+        },
         PORTIO_END_OF_LIST()
     },
     .endianness = DEVICE_LITTLE_ENDIAN,
@@ -234,6 +251,13 @@ static void enable_mem_device(ICH9LPCState *s, int memdevice)
     g->mems_sts[memdevice/8] |= (1 << (memdevice%8));
 }
 
+static void disable_mem_device(ICH9LPCState *s, int memdevice)
+{
+    struct gpe_regs *g = &s->pm.gperegs;
+    s->pm.acpi_regs.gpe.sts[0] |= ICH9_MEM_HOTPLUG_STATUS;
+    g->mems_sts[memdevice/8] &= ~(1 << (memdevice%8));
+}
+
 static int ich9_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
         add)
 {
@@ -243,6 +267,8 @@ static int ich9_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
 
     if (add) {
         enable_mem_device(s, slot->idx);
+    } else {
+        disable_mem_device(s, slot->idx);
     }
     pm_update_sci(&s->pm);
     return 0;
@@ -270,7 +296,7 @@ void ich9_pm_init(void *device, qemu_irq sci_irq, qemu_irq cmos_s3)
     memory_region_add_subregion(&pm->io, ICH9_PMIO_SMI_EN, &pm->io_smi);
 
     memory_region_init_io(&pm->io_memhp, &ich9_memhp_ops, pm, "apci-memhp0",
-                          DIMM_BITMAP_BYTES);
+                          DIMM_BITMAP_BYTES + 1);
     memory_region_add_subregion(get_system_io(), ICH9_MEM_BASE, &pm->io_memhp);
 
     dimm_bus_hotplug(ich9_dimm_hotplug, &lpc->d.qdev);
diff --git a/hw/acpi_ich9.h b/hw/acpi_ich9.h
index 4419247..af61a2d 100644
--- a/hw/acpi_ich9.h
+++ b/hw/acpi_ich9.h
@@ -24,6 +24,7 @@
 #include "acpi.h"
 
 #define ICH9_MEM_BASE    0xaf80
+#define ICH9_MEM_EJ_BASE    0xafa0
 #define ICH9_MEM_HOTPLUG_STATUS 8
 
 typedef struct ICH9LPCPMRegs {
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 26/30] Implement qmp and hmp commands for notification lists
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (24 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 25/30] acpi_ich9: " Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2013-01-09  0:23   ` Eric Blake
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 27/30] [SeaBIOS] Add _OST dimm method Vasilis Liaskovitis
                   ` (8 subsequent siblings)
  34 siblings, 1 reply; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Guest can respond to ACPI hotplug events e.g. with _EJ or _OST method.
This patch implements a tail queue to store guest notifications for memory
hot-add and hot-remove requests.

Guest responses for memory hotplug command on a per-dimm basis can be detected
with the new hmp command "info memory-hotplug" or the new qmp command
"query-memory-hotplug"

Examples:

(qemu) device_add dimm,id=ram0
(qemu) info memory-hotplug
dimm: ram0 hot-add success
or
dimm: ram0 hot-add failure

(qemu) device_del ram3
(qemu) info memory-hotplug
dimm: ram3 hot-remove success
or
dimm: ram3 hot-remove failure

Results are removed from the queue once read.

This patch only queues _EJ events that signal hot-remove success.
For  _OST event queuing, which cover the hot-remove failure and
hot-add success/failure cases, the _OST patches in this series are  are also
needed.

These notification items should probably be part of migration state (not yet
implemented).

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 hmp-commands.hx  |    2 +
 hmp.c            |   17 +++++++++++++++
 hmp.h            |    1 +
 hw/dimm.c        |   61 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/dimm.h        |    1 +
 monitor.c        |    7 ++++++
 qapi-schema.json |   26 +++++++++++++++++++++++
 qmp-commands.hx  |   37 ++++++++++++++++++++++++++++++++
 8 files changed, 152 insertions(+), 0 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 65d799e..b94b7a2 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1574,6 +1574,8 @@ show roms
 show memory-total
 @item info dimm
 show dimm
+@item info memory-hotplug
+show memory-hotplug
 @end table
 ETEXI
 
diff --git a/hmp.c b/hmp.c
index f8456fd..727ed80 100644
--- a/hmp.c
+++ b/hmp.c
@@ -652,6 +652,23 @@ void hmp_info_dimm(Monitor *mon)
     qapi_free_DimmInfoList(info);
 }
 
+void hmp_info_memory_hotplug(Monitor *mon)
+{
+    MemHpInfoList *info;
+    MemHpInfoList *item;
+    MemHpInfo *dimm;
+
+    info = qmp_query_memory_hotplug(NULL);
+    for (item = info; item; item = item->next) {
+        dimm = item->value;
+        monitor_printf(mon, "dimm: %s %s %s\n", dimm->dimm,
+                dimm->request, dimm->result);
+        dimm->dimm = NULL;
+    }
+
+    qapi_free_MemHpInfoList(info);
+}
+
 void hmp_quit(Monitor *mon, const QDict *qdict)
 {
     monitor_suspend(mon);
diff --git a/hmp.h b/hmp.h
index 74ac061..92095df 100644
--- a/hmp.h
+++ b/hmp.h
@@ -38,6 +38,7 @@ void hmp_info_pci(Monitor *mon);
 void hmp_info_block_jobs(Monitor *mon);
 void hmp_info_memory_total(Monitor *mon);
 void hmp_info_dimm(Monitor *mon);
+void hmp_info_memory_hotplug(Monitor *mon);
 void hmp_quit(Monitor *mon, const QDict *qdict);
 void hmp_stop(Monitor *mon, const QDict *qdict);
 void hmp_system_reset(Monitor *mon, const QDict *qdict);
diff --git a/hw/dimm.c b/hw/dimm.c
index 0b4e22d..4670ae6 100644
--- a/hw/dimm.c
+++ b/hw/dimm.c
@@ -67,6 +67,7 @@ static void dimm_bus_initfn(Object *obj)
     DimmBus *bus = DIMM_BUS(obj);
     QTAILQ_INIT(&bus->dimmconfig_list);
     QTAILQ_INIT(&bus->dimmlist);
+    QTAILQ_INIT(&bus->dimm_hp_result_queue);
 }
 
 static const TypeInfo dimm_bus_info = {
@@ -278,6 +279,58 @@ DimmInfoList *qmp_query_dimm_info(Error **errp)
     return head;
 }
 
+MemHpInfoList *qmp_query_memory_hotplug(Error **errp)
+{
+    DimmBus *bus;
+    MemHpInfoList *head = NULL, *cur_item = NULL, *info;
+    struct dimm_hp_result *item, *nextitem;
+
+    QLIST_FOREACH(bus, &memory_buses, next) {
+        QTAILQ_FOREACH_SAFE(item, &bus->dimm_hp_result_queue, next, nextitem) {
+
+            info = g_malloc0(sizeof(*info));
+            info->value = g_malloc0(sizeof(*info->value));
+            info->value->dimm = g_malloc0(sizeof(char) * 32);
+            info->value->request = g_malloc0(sizeof(char) * 16);
+            info->value->result = g_malloc0(sizeof(char) * 16);
+            switch (item->ret) {
+            case DIMM_REMOVE_SUCCESS:
+                strcpy(info->value->request, "hot-remove");
+                strcpy(info->value->result, "success");
+                break;
+            case DIMM_REMOVE_FAIL:
+                strcpy(info->value->request, "hot-remove");
+                strcpy(info->value->result, "failure");
+                break;
+            case DIMM_ADD_SUCCESS:
+                strcpy(info->value->request, "hot-add");
+                strcpy(info->value->result, "success");
+                break;
+            case DIMM_ADD_FAIL:
+                strcpy(info->value->request, "hot-add");
+                strcpy(info->value->result, "failure");
+                break;
+            default:
+                break;
+            }
+            strcpy(info->value->dimm, item->dimmname);
+            /* XXX: waiting for the qapi to support GSList */
+            if (!cur_item) {
+                head = cur_item = info;
+            } else {
+                cur_item->next = info;
+                cur_item = info;
+            }
+
+            /* hotplug notification copied to qmp list, delete original item */
+            QTAILQ_REMOVE(&bus->dimm_hp_result_queue, item, next);
+            g_free(item);
+        }
+    }
+
+    return head;
+}
+
 static int dimm_init(DeviceState *s)
 {
     DimmBus *bus = DIMM_BUS(qdev_get_parent_bus(s));
@@ -311,17 +364,25 @@ void dimm_notify(uint32_t idx, uint32_t event)
 {
     DimmBus *bus;
     DimmDevice *slot;
+    DimmConfig *slotcfg;
+    struct dimm_hp_result *result;
 
     slot = dimm_find_from_idx(idx);
     assert(slot != NULL);
     bus = DIMM_BUS(qdev_get_parent_bus(&slot->qdev));
 
+    result = g_malloc0(sizeof(*result));
+    slotcfg = dimmcfg_find_from_name(bus, slot->qdev.id);
+    result->dimmname = slotcfg->name;
+
     switch (event) {
     case DIMM_REMOVE_SUCCESS:
         qdev_unplug_complete((DeviceState *)slot, NULL);
         QTAILQ_REMOVE(&bus->dimmlist, slot, nextdimm);
+        QTAILQ_INSERT_TAIL(&bus->dimm_hp_result_queue, result, next);
         break;
     default:
+        g_free(result);
         break;
     }
 }
diff --git a/hw/dimm.h b/hw/dimm.h
index 86c7cd5..8f9546b 100644
--- a/hw/dimm.h
+++ b/hw/dimm.h
@@ -69,6 +69,7 @@ typedef struct DimmBus {
     dimm_hotplug_fn dimm_hotplug;
     DimmConfiglist dimmconfig_list;
     QTAILQ_HEAD(Dimmlist, DimmDevice) dimmlist;
+    QTAILQ_HEAD(dimm_hp_result_head, dimm_hp_result)  dimm_hp_result_queue;
     QLIST_ENTRY(DimmBus) next;
 } DimmBus;
 
diff --git a/monitor.c b/monitor.c
index de1dcf1..40a7b7e 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2715,6 +2715,13 @@ static mon_cmd_t info_cmds[] = {
         .mhandler.info = hmp_info_memory_total,
     },
     {
+        .name       = "memory-hotplug",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show memory hotplug status",
+        .mhandler.info = hmp_info_memory_hotplug,
+    },
+    {
         .name       = "qtree",
         .args_type  = "",
         .params     = "",
diff --git a/qapi-schema.json b/qapi-schema.json
index 5a20577..d2e9831 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2940,6 +2940,32 @@
 { 'command': 'query-dimm-info', 'returns': ['DimmInfo'] }
 
 ##
+# @MemHpInfo:
+#
+# Information about status of a memory hotplug command
+#
+# @dimm: the Dimm associated with the result
+#
+# @result: the result of the hotplug command
+#
+# Since: 1.4
+#
+##
+{ 'type': 'MemHpInfo',
+  'data': {'dimm': 'str', 'request': 'str', 'result': 'str'} }
+
+##
+# @query-memory-hotplug:
+#
+# Returns a list of information about pending hotplug commands
+#
+# Returns: a list of @MemhpInfo
+#
+# Since: 1.4
+##
+{ 'command': 'query-memory-hotplug', 'returns': ['MemHpInfo'] }
+
+##
 # @QKeyCode:
 #
 # An enumeration of key name.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index a99117a..ad0dac7 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2674,3 +2674,40 @@ Example:
    }
 
 EQMP
+    {
+        .name       = "query-memory-hotplug",
+        .args_type  = "",
+        .mhandler.cmd_new = qmp_marshal_input_query_memory_hotplug
+    },
+SQMP
+query-memory-hotplug
+----------
+
+Show memory hotplug command notifications.
+
+Return a json-array. Each DIMM that has a pending notification is represented
+by a json-object, which contains:
+
+- "dimm": Dimm name (json-str)
+- "request": type of hot request: hot-add or hot-remove  (json-str)
+- "result": result of the hotplug request for this Dimm success or failure (json-str)
+
+Example:
+
+-> { "execute": "query-memory-hotplug" }
+<- {
+      "return":[
+         {
+            "result": "failure",
+            "request": "hot-remove",
+            "dimm": "dimm10"
+        },
+         {
+            "result": "success",
+            "request": "hot-add",
+            "dimm": "dimm3"
+         }
+      ]
+   }
+
+EQMP
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 27/30] [SeaBIOS] Add _OST dimm method
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (25 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 26/30] Implement qmp and hmp commands for notification lists Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 28/30] Add _OST dimm support Vasilis Liaskovitis
                   ` (7 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

Add support for _OST method. _OST method will write into the correct I/O byte to
signal success / failure of hot-add or hot-remove to qemu.
---
 src/acpi-dsdt-mem-hotplug.dsl |   51 ++++++++++++++++++++++++++++++++++++++++-
 src/ssdt-mem.dsl              |    4 +++
 2 files changed, 54 insertions(+), 1 deletions(-)

diff --git a/src/acpi-dsdt-mem-hotplug.dsl b/src/acpi-dsdt-mem-hotplug.dsl
index fd73ea7..a648bee 100644
--- a/src/acpi-dsdt-mem-hotplug.dsl
+++ b/src/acpi-dsdt-mem-hotplug.dsl
@@ -27,7 +27,28 @@ Scope(\_SB) {
         {
             MPE, 8
         }
-        
+
+        /* Memory hot-remove notify failure byte */
+        OperationRegion(MEEF, SystemIO, 0xafa1, 1)
+        Field (MEEF, ByteAcc, NoLock, Preserve)
+        {
+            MEF, 8
+        }
+
+        /* Memory hot-add notify success byte */
+        OperationRegion(MPIS, SystemIO, 0xafa2, 1)
+        Field (MPIS, ByteAcc, NoLock, Preserve)
+        {
+            MIS, 8
+        }
+
+        /* Memory hot-add notify failure byte */
+        OperationRegion(MPIF, SystemIO, 0xafa3, 1)
+        Field (MPIF, ByteAcc, NoLock, Preserve)
+        {
+            MIF, 8
+        }
+
         Method(MESC, 0) {
             // Local5 = active memdevice bitmap
             Store (MES, Local5)
@@ -69,4 +90,32 @@ Scope(\_SB) {
             Sleep(200)
         }
 
+        Method (MOST, 3, Serialized) {
+            // _OST method - OS status indication
+            Switch (And(Arg0, 0xFF)) {
+                Case(0x3)
+                {
+                    Switch(And(Arg1, 0xFF)) {
+                        Case(0x1) {
+                            Store(Arg2, MEF)
+                            // Revert MEON flag for this memory device to one
+                            Store(One, Index(MEON, Arg2))
+                        }
+                    }
+                }
+                Case(0x1)
+                {
+                    Switch(And(Arg1, 0xFF)) {
+                        Case(0x0) {
+                            Store(Arg2, MIS)
+                        }
+                        Case(0x1) {
+                            Store(Arg2, MIF)
+                            // Revert MEON flag for this memory device to zero
+                            Store(Zero, Index(MEON, Arg2))
+                        }
+                    }
+                }
+            }
+        }
 }
diff --git a/src/ssdt-mem.dsl b/src/ssdt-mem.dsl
index eef84b6..47a3b4f 100644
--- a/src/ssdt-mem.dsl
+++ b/src/ssdt-mem.dsl
@@ -38,6 +38,7 @@ DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", "CSSDT", 0x1)
 
         External(CMST, MethodObj)
         External(MPEJ, MethodObj)
+        External(MOST, MethodObj)
 
         Name(_CRS, ResourceTemplate() {
             QwordMemory(
@@ -60,6 +61,9 @@ DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", "CSSDT", 0x1)
         Method (_EJ0, 1, NotSerialized) {
             MPEJ(ID, Arg0)
         }
+        Method (_OST, 3) {
+            MOST(Arg0, Arg1, ID)
+        }
     }
 }    
 
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 28/30] Add _OST dimm support
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (26 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 27/30] [SeaBIOS] Add _OST dimm method Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 29/30] [SeaBIOS] Implement _PS3 method for memory device Vasilis Liaskovitis
                   ` (6 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

This allows qemu to receive notifications from the guest OS on success or
failure of a memory hotplug request. The guest OS needs to implement the _OST
functionality for this to work (linux-next: http://lkml.org/lkml/2012/6/25/321)

This patch also updates dimm bitmap state and hot-remove pending flag
on hot-remove fail. This allows failed hot operations to be retried at
anytime (only works for guests that use _OST notification).
Also adds new _OST registers in  docs/specs/acpi_hotplug.txt
---
 docs/specs/acpi_hotplug.txt |   25 +++++++++++++++++++++++++
 hw/acpi_ich9.c              |   31 ++++++++++++++++++++++++++++---
 hw/acpi_ich9.h              |    3 +++
 hw/acpi_piix4.c             |   35 ++++++++++++++++++++++++++++++++---
 hw/dimm.c                   |   28 +++++++++++++++++++++++++++-
 hw/dimm.h                   |   11 ++++++++++-
 6 files changed, 125 insertions(+), 8 deletions(-)

diff --git a/docs/specs/acpi_hotplug.txt b/docs/specs/acpi_hotplug.txt
index cf86242..536da16 100644
--- a/docs/specs/acpi_hotplug.txt
+++ b/docs/specs/acpi_hotplug.txt
@@ -20,3 +20,28 @@ ejected.
 
 Written by ACPI memory device _EJ0 method to notify qemu of successfull
 hot-removal.  Write-only.
+
+Memory Dimm ejection failure notification (IO port 0xafa1, 1-byte access):
+---------------------------------------------------------------
+Dimm hot-remove _OST notification. Byte value indicates Dimm slot for which
+ejection failed.
+
+Written by ACPI memory device _OST method to notify qemu of failed
+hot-removal.  Write-only.
+
+Memory Dimm insertion success notification (IO port 0xafa2, 1-byte access):
+---------------------------------------------------------------
+Dimm hot-remove _OST notification. Byte value indicates Dimm slot for which
+insertion succeeded.
+
+Written by ACPI memory device _OST method to notify qemu of failed
+hot-add.  Write-only.
+
+Memory Dimm insertion failure notification (IO port 0xafa3, 1-byte access):
+---------------------------------------------------------------
+Dimm hot-remove _OST notification. Byte value indicates Dimm slot for which
+insertion failed.
+
+Written by ACPI memory device _OST method to notify qemu of failed
+hot-add.  Write-only.
+
diff --git a/hw/acpi_ich9.c b/hw/acpi_ich9.c
index f5dc1c9..2705230 100644
--- a/hw/acpi_ich9.c
+++ b/hw/acpi_ich9.c
@@ -111,6 +111,15 @@ static void memhp_writeb(void *opaque, uint32_t addr, uint32_t val)
     case ICH9_MEM_EJ_BASE - ICH9_MEM_BASE:
         dimm_notify(val, DIMM_REMOVE_SUCCESS);
         break;
+    case ICH9_MEM_OST_REMOVE_FAIL - ICH9_MEM_BASE:
+        dimm_notify(val, DIMM_REMOVE_FAIL);
+        break;
+    case ICH9_MEM_OST_ADD_SUCCESS - ICH9_MEM_BASE:
+        dimm_notify(val, DIMM_ADD_SUCCESS);
+        break;
+    case ICH9_MEM_OST_ADD_FAIL - ICH9_MEM_BASE:
+        dimm_notify(val, DIMM_ADD_FAIL);
+        break;
     default:
         ICH9_DEBUG("memhp write invalid %x <== %d\n", addr, val);
     }
@@ -125,7 +134,7 @@ static const MemoryRegionOps ich9_memhp_ops = {
         },
         {
             .offset = ICH9_MEM_EJ_BASE - ICH9_MEM_BASE,
-            .len = 1, .size = 1,
+            .len = 4, .size = 1,
             .write = memhp_writeb,
         },
         PORTIO_END_OF_LIST()
@@ -274,6 +283,22 @@ static int ich9_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
     return 0;
 }
 
+static int ich9_dimm_revert(DeviceState *qdev, DimmDevice *dev, int add)
+{
+    PCIDevice *pci_dev = DO_UPCAST(PCIDevice, qdev, qdev);
+    ICH9LPCState *s = DO_UPCAST(ICH9LPCState, d, pci_dev);
+    struct gpe_regs *g = &s->pm.gperegs;
+    DimmDevice *slot = DIMM(dev);
+    int idx = slot->idx;
+
+    if (add) {
+        g->mems_sts[idx/8] &= ~(1 << (idx%8));
+    } else {
+        g->mems_sts[idx/8] |= (1 << (idx%8));
+    }
+    return 0;
+}
+
 void ich9_pm_init(void *device, qemu_irq sci_irq, qemu_irq cmos_s3)
 {
     ICH9LPCState *lpc = (ICH9LPCState *)device;
@@ -296,10 +321,10 @@ void ich9_pm_init(void *device, qemu_irq sci_irq, qemu_irq cmos_s3)
     memory_region_add_subregion(&pm->io, ICH9_PMIO_SMI_EN, &pm->io_smi);
 
     memory_region_init_io(&pm->io_memhp, &ich9_memhp_ops, pm, "apci-memhp0",
-                          DIMM_BITMAP_BYTES + 1);
+                          DIMM_BITMAP_BYTES + 4);
     memory_region_add_subregion(get_system_io(), ICH9_MEM_BASE, &pm->io_memhp);
 
-    dimm_bus_hotplug(ich9_dimm_hotplug, &lpc->d.qdev);
+    dimm_bus_hotplug(ich9_dimm_hotplug, ich9_dimm_revert, &lpc->d.qdev);
 
     pm->irq = sci_irq;
     qemu_register_reset(pm_reset, pm);
diff --git a/hw/acpi_ich9.h b/hw/acpi_ich9.h
index af61a2d..8f57cd8 100644
--- a/hw/acpi_ich9.h
+++ b/hw/acpi_ich9.h
@@ -26,6 +26,9 @@
 #define ICH9_MEM_BASE    0xaf80
 #define ICH9_MEM_EJ_BASE    0xafa0
 #define ICH9_MEM_HOTPLUG_STATUS 8
+#define ICH9_MEM_OST_REMOVE_FAIL 0xafa1
+#define ICH9_MEM_OST_ADD_SUCCESS 0xafa2
+#define ICH9_MEM_OST_ADD_FAIL 0xafa3
 
 typedef struct ICH9LPCPMRegs {
     /*
diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index 6e4718e..70aa480 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -51,6 +51,9 @@
 #define PCI_RMV_BASE 0xae0c
 #define MEM_BASE 0xaf80
 #define MEM_EJ_BASE 0xafa0
+#define MEM_OST_REMOVE_FAIL 0xafa1
+#define MEM_OST_ADD_SUCCESS 0xafa2
+#define MEM_OST_ADD_FAIL 0xafa3
 
 #define PIIX4_MEM_HOTPLUG_STATUS 8
 #define PIIX4_PCI_HOTPLUG_STATUS 2
@@ -90,6 +93,7 @@ typedef struct PIIX4PMState {
     uint8_t s4_val;
 } PIIX4PMState;
 
+static int piix4_dimm_revert(DeviceState *qdev, DimmDevice *dev, int add);
 static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s);
 
 #define ACPI_ENABLE 0xf1
@@ -551,6 +555,15 @@ static void memhp_writeb(void *opaque, uint32_t addr, uint32_t val)
     case MEM_EJ_BASE - MEM_BASE:
         dimm_notify(val, DIMM_REMOVE_SUCCESS);
         break;
+    case MEM_OST_REMOVE_FAIL - MEM_BASE:
+        dimm_notify(val, DIMM_REMOVE_FAIL);
+        break;
+    case MEM_OST_ADD_SUCCESS - MEM_BASE:
+        dimm_notify(val, DIMM_ADD_SUCCESS);
+        break;
+    case MEM_OST_ADD_FAIL - MEM_BASE:
+        dimm_notify(val, DIMM_ADD_FAIL);
+        break;
     default:
         PIIX4_DPRINTF("memhp write invalid %x <== %d\n", addr, val);
     }
@@ -564,7 +577,7 @@ static const MemoryRegionOps piix4_memhp_ops = {
             .read = memhp_readb,
         },
         {
-            .offset = MEM_EJ_BASE - MEM_BASE, .len = 1,
+            .offset = MEM_EJ_BASE - MEM_BASE, .len = 4,
             .size = 1,
             .write = memhp_writeb,
         },
@@ -653,7 +666,7 @@ static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
     memory_region_add_subregion(get_system_io(), PCI_HOTPLUG_ADDR,
                                 &s->io_pci);
     memory_region_init_io(&s->io_memhp, &piix4_memhp_ops, s, "apci-memhp0",
-                          DIMM_BITMAP_BYTES + 1);
+                          DIMM_BITMAP_BYTES + 4);
     memory_region_add_subregion(get_system_io(), MEM_BASE, &s->io_memhp);
 
     for (i = 0; i < DIMM_BITMAP_BYTES; i++) {
@@ -661,7 +674,7 @@ static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
     }
 
     pci_bus_hotplug(bus, piix4_device_hotplug, &s->dev.qdev);
-    dimm_bus_hotplug(piix4_dimm_hotplug, &s->dev.qdev);
+    dimm_bus_hotplug(piix4_dimm_hotplug, piix4_dimm_revert, &s->dev.qdev);
 }
 
 static void enable_device(PIIX4PMState *s, int slot)
@@ -706,6 +719,22 @@ static int piix4_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
     return 0;
 }
 
+static int piix4_dimm_revert(DeviceState *qdev, DimmDevice *dev, int add)
+{
+    PCIDevice *pci_dev = DO_UPCAST(PCIDevice, qdev, qdev);
+    PIIX4PMState *s = DO_UPCAST(PIIX4PMState, dev, pci_dev);
+    struct gpe_regs *g = &s->gperegs;
+    DimmDevice *slot = DIMM(dev);
+    int idx = slot->idx;
+
+    if (add) {
+        g->mems_sts[idx/8] &= ~(1 << (idx%8));
+    } else {
+        g->mems_sts[idx/8] |= (1 << (idx%8));
+    }
+    return 0;
+}
+
 static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
 				PCIHotplugState state)
 {
diff --git a/hw/dimm.c b/hw/dimm.c
index 4670ae6..69b97b6 100644
--- a/hw/dimm.c
+++ b/hw/dimm.c
@@ -149,7 +149,8 @@ void dimm_config_create(char *id, uint64_t size, const char *bus, uint64_t node,
     QTAILQ_INSERT_TAIL(&dimmconfig_list, dimm_cfg, nextdimmcfg);
 }
 
-void dimm_bus_hotplug(dimm_hotplug_fn hotplug, DeviceState *qdev)
+void dimm_bus_hotplug(dimm_hotplug_fn hotplug, dimm_hotplug_fn revert,
+        DeviceState *qdev)
 {
     DimmBus *bus;
     QLIST_FOREACH(bus, &memory_buses, next) {
@@ -157,6 +158,7 @@ void dimm_bus_hotplug(dimm_hotplug_fn hotplug, DeviceState *qdev)
         bus->qbus.allow_hotplug = 1;
         bus->dimm_hotplug_qdev = qdev;
         bus->dimm_hotplug = hotplug;
+        bus->dimm_revert = revert;
     }
 }
 
@@ -168,6 +170,7 @@ static void dimm_plug_device(DimmDevice *slot)
     if (bus->dimm_hotplug) {
         bus->dimm_hotplug(bus->dimm_hotplug_qdev, slot, 1);
     }
+    slot->pending = DIMM_ADD_PENDING;
 }
 
 static int dimm_unplug_device(DeviceState *qdev)
@@ -177,6 +180,7 @@ static int dimm_unplug_device(DeviceState *qdev)
     if (bus->dimm_hotplug) {
         bus->dimm_hotplug(bus->dimm_hotplug_qdev, DIMM(qdev), 0);
     }
+    DIMM(qdev)->pending = DIMM_REMOVE_PENDING;
     return 1;
 }
 
@@ -353,6 +357,7 @@ static int dimm_init(DeviceState *s)
     slot->start = slotcfg->start;
     slot->size = slotcfg->size;
     slot->node = slotcfg->node;
+    slot->pending = DIMM_NO_PENDING;
 
     QTAILQ_INSERT_TAIL(&bus->dimmlist, slot, nextdimm);
     dimm_plug_device(slot);
@@ -374,13 +379,34 @@ void dimm_notify(uint32_t idx, uint32_t event)
     result = g_malloc0(sizeof(*result));
     slotcfg = dimmcfg_find_from_name(bus, slot->qdev.id);
     result->dimmname = slotcfg->name;
+    result->ret = event;
 
     switch (event) {
     case DIMM_REMOVE_SUCCESS:
+        slot->pending = DIMM_NO_PENDING;
         qdev_unplug_complete((DeviceState *)slot, NULL);
         QTAILQ_REMOVE(&bus->dimmlist, slot, nextdimm);
         QTAILQ_INSERT_TAIL(&bus->dimm_hp_result_queue, result, next);
         break;
+    case DIMM_REMOVE_FAIL:
+        slot->pending = DIMM_NO_PENDING;
+        if (bus->dimm_revert) {
+            bus->dimm_revert(bus->dimm_hotplug_qdev, slot, 0);
+        }
+        QTAILQ_INSERT_TAIL(&bus->dimm_hp_result_queue, result, next);
+        break;
+    case DIMM_ADD_SUCCESS:
+        slot->pending = DIMM_NO_PENDING;
+        QTAILQ_INSERT_TAIL(&bus->dimm_hp_result_queue, result, next);
+        break;
+    case DIMM_ADD_FAIL:
+        slot->pending = DIMM_NO_PENDING;
+        if (bus->dimm_revert) {
+            bus->dimm_revert(bus->dimm_hotplug_qdev, slot, 1);
+        }
+        qdev_unplug_complete((DeviceState *)slot, NULL);
+        QTAILQ_REMOVE(&bus->dimmlist, slot, nextdimm);
+        QTAILQ_INSERT_TAIL(&bus->dimm_hp_result_queue, result, next);
     default:
         g_free(result);
         break;
diff --git a/hw/dimm.h b/hw/dimm.h
index 8f9546b..f43f745 100644
--- a/hw/dimm.h
+++ b/hw/dimm.h
@@ -18,6 +18,12 @@ typedef enum {
     DIMM_ADD_FAIL = 3
 } dimm_hp_result_code;
 
+typedef enum {
+    DIMM_NO_PENDING = 0,
+    DIMM_ADD_PENDING = 1,
+    DIMM_REMOVE_PENDING = 2,
+} dimm_hp_pending_code;
+
 #define TYPE_DIMM "dimm"
 #define DIMM(obj) \
     OBJECT_CHECK(DimmDevice, (obj), TYPE_DIMM)
@@ -43,6 +49,7 @@ struct DimmDevice {
     uint32_t node; /* numa node proximity */
     uint32_t populated; /* 1 means device has been hotplugged. Default is 0. */
     MemoryRegion *mr; /* MemoryRegion for this slot. !NULL only if populated */
+    dimm_hp_pending_code pending; /* pending hot operation for this dimm */
     QTAILQ_ENTRY(DimmDevice) nextdimm;
 };
 
@@ -68,6 +75,7 @@ typedef struct DimmBus {
     DeviceState *dimm_hotplug_qdev;
     dimm_hotplug_fn dimm_hotplug;
     DimmConfiglist dimmconfig_list;
+    dimm_hotplug_fn dimm_revert;
     QTAILQ_HEAD(Dimmlist, DimmDevice) dimmlist;
     QTAILQ_HEAD(dimm_hp_result_head, dimm_hp_result)  dimm_hp_result_queue;
     QLIST_ENTRY(DimmBus) next;
@@ -79,7 +87,8 @@ struct dimm_hp_result {
     QTAILQ_ENTRY(dimm_hp_result) next;
 };
 
-void dimm_bus_hotplug(dimm_hotplug_fn hotplug, DeviceState *qdev);
+void dimm_bus_hotplug(dimm_hotplug_fn hotplug, dimm_hotplug_fn revert,
+    DeviceState *qdev);
 void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots);
 int dimm_add(char *id);
 DimmBus *dimm_bus_create(Object *parent, const char *name, uint32_t max_dimms,
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 29/30] [SeaBIOS] Implement _PS3 method for memory device
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (27 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 28/30] Add _OST dimm support Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 30/30] Implement _PS3 for dimm Vasilis Liaskovitis
                   ` (5 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

---
 src/acpi-dsdt-mem-hotplug.dsl |   15 +++++++++++++++
 src/ssdt-mem.dsl              |    4 ++++
 2 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/src/acpi-dsdt-mem-hotplug.dsl b/src/acpi-dsdt-mem-hotplug.dsl
index a648bee..7d7c078 100644
--- a/src/acpi-dsdt-mem-hotplug.dsl
+++ b/src/acpi-dsdt-mem-hotplug.dsl
@@ -49,6 +49,13 @@ Scope(\_SB) {
             MIF, 8
         }
 
+        /* Memory _PS3 byte */
+        OperationRegion(MPSB, SystemIO, 0xafa4, 1)
+        Field (MPSB, ByteAcc, NoLock, Preserve)
+        {
+            MPS, 8
+        }
+
         Method(MESC, 0) {
             // Local5 = active memdevice bitmap
             Store (MES, Local5)
@@ -90,6 +97,14 @@ Scope(\_SB) {
             Sleep(200)
         }
 
+
+        Method (MPS3, 1, NotSerialized) {
+            // _PS3 method - power-off method
+            Store(Arg0, MPS)
+            Store(Zero, Index(MEON, Arg0))
+            Sleep(200)
+        }
+
         Method (MOST, 3, Serialized) {
             // _OST method - OS status indication
             Switch (And(Arg0, 0xFF)) {
diff --git a/src/ssdt-mem.dsl b/src/ssdt-mem.dsl
index 47a3b4f..9827a58 100644
--- a/src/ssdt-mem.dsl
+++ b/src/ssdt-mem.dsl
@@ -39,6 +39,7 @@ DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", "CSSDT", 0x1)
         External(CMST, MethodObj)
         External(MPEJ, MethodObj)
         External(MOST, MethodObj)
+        External(MPS3, MethodObj)
 
         Name(_CRS, ResourceTemplate() {
             QwordMemory(
@@ -64,6 +65,9 @@ DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", "CSSDT", 0x1)
         Method (_OST, 3) {
             MOST(Arg0, Arg1, ID)
         }
+        Method (_PS3, 0) {
+            MPS3(ID)
+        }
     }
 }    
 
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [RFC PATCH v4 30/30] Implement _PS3 for dimm
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (28 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 29/30] [SeaBIOS] Implement _PS3 method for memory device Vasilis Liaskovitis
@ 2012-12-18 12:41 ` Vasilis Liaskovitis
  2012-12-18 16:45 ` [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Zhi Yong Wu
                   ` (4 subsequent siblings)
  34 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-18 12:41 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: Vasilis Liaskovitis, pingfank, gleb, stefanha, jbaron,
	blauwirbel, kevin, kraxel, anthony

This will allow us to update dimm state on OSPM-initiated eject operations e.g.
with "echo 1 > /sys/bus/acpi/devices/PNP0C80\:00/eject"

v3->v4: Add support for ich9
---
 docs/specs/acpi_hotplug.txt |    7 +++++++
 hw/acpi_ich9.c              |    7 +++++--
 hw/acpi_ich9.h              |    1 +
 hw/acpi_piix4.c             |    9 ++++++---
 hw/dimm.c                   |    4 ++++
 hw/dimm.h                   |    3 ++-
 6 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/docs/specs/acpi_hotplug.txt b/docs/specs/acpi_hotplug.txt
index 536da16..69868fe 100644
--- a/docs/specs/acpi_hotplug.txt
+++ b/docs/specs/acpi_hotplug.txt
@@ -45,3 +45,10 @@ insertion failed.
 Written by ACPI memory device _OST method to notify qemu of failed
 hot-add.  Write-only.
 
+Memory Dimm _PS3 power-off initiated by OSPM (IO port 0xafa4, 1-byte access):
+---------------------------------------------------------------
+Dimm hot-add _PS3 initiated by OSPM. Byte value indicates Dimm slot which
+entered D3 state.
+
+Written by ACPI memory device _PS3 method to notify qemu of power-off state for
+the dimm.  Write-only.
diff --git a/hw/acpi_ich9.c b/hw/acpi_ich9.c
index 2705230..5e7fca6 100644
--- a/hw/acpi_ich9.c
+++ b/hw/acpi_ich9.c
@@ -120,6 +120,9 @@ static void memhp_writeb(void *opaque, uint32_t addr, uint32_t val)
     case ICH9_MEM_OST_ADD_FAIL - ICH9_MEM_BASE:
         dimm_notify(val, DIMM_ADD_FAIL);
         break;
+    case ICH9_MEM_PS3 - ICH9_MEM_BASE:
+         dimm_notify(val, DIMM_OSPM_POWEROFF);
+         break;
     default:
         ICH9_DEBUG("memhp write invalid %x <== %d\n", addr, val);
     }
@@ -134,7 +137,7 @@ static const MemoryRegionOps ich9_memhp_ops = {
         },
         {
             .offset = ICH9_MEM_EJ_BASE - ICH9_MEM_BASE,
-            .len = 4, .size = 1,
+            .len = 5, .size = 1,
             .write = memhp_writeb,
         },
         PORTIO_END_OF_LIST()
@@ -321,7 +324,7 @@ void ich9_pm_init(void *device, qemu_irq sci_irq, qemu_irq cmos_s3)
     memory_region_add_subregion(&pm->io, ICH9_PMIO_SMI_EN, &pm->io_smi);
 
     memory_region_init_io(&pm->io_memhp, &ich9_memhp_ops, pm, "apci-memhp0",
-                          DIMM_BITMAP_BYTES + 4);
+                          DIMM_BITMAP_BYTES + 5);
     memory_region_add_subregion(get_system_io(), ICH9_MEM_BASE, &pm->io_memhp);
 
     dimm_bus_hotplug(ich9_dimm_hotplug, ich9_dimm_revert, &lpc->d.qdev);
diff --git a/hw/acpi_ich9.h b/hw/acpi_ich9.h
index 8f57cd8..816d453 100644
--- a/hw/acpi_ich9.h
+++ b/hw/acpi_ich9.h
@@ -29,6 +29,7 @@
 #define ICH9_MEM_OST_REMOVE_FAIL 0xafa1
 #define ICH9_MEM_OST_ADD_SUCCESS 0xafa2
 #define ICH9_MEM_OST_ADD_FAIL 0xafa3
+#define ICH9_MEM_PS3 0xafa4
 
 typedef struct ICH9LPCPMRegs {
     /*
diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index 70aa480..6c953c2 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -54,6 +54,7 @@
 #define MEM_OST_REMOVE_FAIL 0xafa1
 #define MEM_OST_ADD_SUCCESS 0xafa2
 #define MEM_OST_ADD_FAIL 0xafa3
+#define MEM_PS3 0xafa4
 
 #define PIIX4_MEM_HOTPLUG_STATUS 8
 #define PIIX4_PCI_HOTPLUG_STATUS 2
@@ -564,6 +565,9 @@ static void memhp_writeb(void *opaque, uint32_t addr, uint32_t val)
     case MEM_OST_ADD_FAIL - MEM_BASE:
         dimm_notify(val, DIMM_ADD_FAIL);
         break;
+    case MEM_PS3 - MEM_BASE:
+        dimm_notify(val, DIMM_OSPM_POWEROFF);
+        break;
     default:
         PIIX4_DPRINTF("memhp write invalid %x <== %d\n", addr, val);
     }
@@ -577,7 +581,7 @@ static const MemoryRegionOps piix4_memhp_ops = {
             .read = memhp_readb,
         },
         {
-            .offset = MEM_EJ_BASE - MEM_BASE, .len = 4,
+            .offset = MEM_EJ_BASE - MEM_BASE, .len = 5,
             .size = 1,
             .write = memhp_writeb,
         },
@@ -666,7 +670,7 @@ static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
     memory_region_add_subregion(get_system_io(), PCI_HOTPLUG_ADDR,
                                 &s->io_pci);
     memory_region_init_io(&s->io_memhp, &piix4_memhp_ops, s, "apci-memhp0",
-                          DIMM_BITMAP_BYTES + 4);
+                          DIMM_BITMAP_BYTES + 5);
     memory_region_add_subregion(get_system_io(), MEM_BASE, &s->io_memhp);
 
     for (i = 0; i < DIMM_BITMAP_BYTES; i++) {
@@ -726,7 +730,6 @@ static int piix4_dimm_revert(DeviceState *qdev, DimmDevice *dev, int add)
     struct gpe_regs *g = &s->gperegs;
     DimmDevice *slot = DIMM(dev);
     int idx = slot->idx;
-
     if (add) {
         g->mems_sts[idx/8] &= ~(1 << (idx%8));
     } else {
diff --git a/hw/dimm.c b/hw/dimm.c
index 69b97b6..2454e38 100644
--- a/hw/dimm.c
+++ b/hw/dimm.c
@@ -407,6 +407,10 @@ void dimm_notify(uint32_t idx, uint32_t event)
         qdev_unplug_complete((DeviceState *)slot, NULL);
         QTAILQ_REMOVE(&bus->dimmlist, slot, nextdimm);
         QTAILQ_INSERT_TAIL(&bus->dimm_hp_result_queue, result, next);
+    case DIMM_OSPM_POWEROFF:
+        if (bus->dimm_revert) {
+            bus->dimm_revert(bus->dimm_hotplug_qdev, slot, 1);
+        }
     default:
         g_free(result);
         break;
diff --git a/hw/dimm.h b/hw/dimm.h
index f43f745..081f2db 100644
--- a/hw/dimm.h
+++ b/hw/dimm.h
@@ -15,7 +15,8 @@ typedef enum {
     DIMM_REMOVE_SUCCESS = 0,
     DIMM_REMOVE_FAIL = 1,
     DIMM_ADD_SUCCESS = 2,
-    DIMM_ADD_FAIL = 3
+    DIMM_ADD_FAIL = 3,
+    DIMM_OSPM_POWEROFF = 4
 } dimm_hp_result_code;
 
 typedef enum {
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (29 preceding siblings ...)
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 30/30] Implement _PS3 for dimm Vasilis Liaskovitis
@ 2012-12-18 16:45 ` Zhi Yong Wu
  2012-12-19 11:40   ` Vasilis Liaskovitis
  2012-12-19  7:27 ` Gerd Hoffmann
                   ` (3 subsequent siblings)
  34 siblings, 1 reply; 72+ messages in thread
From: Zhi Yong Wu @ 2012-12-18 16:45 UTC (permalink / raw)
  To: Vasilis Liaskovitis; +Cc: QEMU Developers

HI,

One stupid question, 'dimm' presents one guest memory, then why it is
called as "dimm"? what is its full name?

On Tue, Dec 18, 2012 at 8:41 PM, Vasilis Liaskovitis
<vasilis.liaskovitis@profitbricks.com> wrote:
> This is v4 of the ACPI memory hotplug functionality. Only x86_64 target is
> supported (both i440fx and q35). There are still several issues, but it's
> been a while since v3 and I wanted to get some more feedback on the current
> state of the patchseries.
>
> Overview:
>
> Dimm device layout is modeled with a normal qemu device:
>
> "-device dimm,id=name,size=sz,node=pxm,populated=on|off,bus=membus.0"
>
> The starting physical address for all dimms is calculated from top of memory,
> during memory controller init, skipping the pci hole at [PCI_HOLE_START, 4G).
> e.g.
> "-device dimm,id=dimm0,size=512M,node=0,populated=off,bus=membus.0"
> will define a 512M memory dimm belonging to numa node 0, on bus membus.0.
>
> Because dimm layout needs to be configured on machine-boot, all dimm devices
> need to be specified on startup command line (either with populated=on or with
> populated=off). The dimm information is stored in dimm configuration structures.
>
> After machine startup, dimms are hot-added or removed with normal device_add
> and device_del operations e.g.:
> Hot-add syntax: "device_add dimm,id=mydimm0,bus=membus.0"
> Hot-remove syntax: "device_del dimm,id=mydimm0"
>
> Changes v3->v4
>
> - Dimms added with normal -device argument (extra -dimm arg dropped).
> - multiple memory buses can be registered. Memory buses of the real hw/chipset
>   or a paravirtual memory bus can be added.
> - acpi implementation uses memory API instead of old ioports.
> - Support for q35/ich9 added (still buggy, see patch 12/31).
> - piix4/i440fx initialization code has been refactored to resemble q35. This
> will allow memory map initialization at chipset qdev init time for both
> machines, as well as more similar code.
> - Hot-remove functionality has been moved to separate patches. Hot-remove no
> longer frees memory but unmaps the dimm/qdev device from the guest's view.
> Freeing the memory should happen when the last user unrefs/unmaps the memory,
> see also (work in progress):
> https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg00728.html
> https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02697.html
> - new qmp/hmp command for the state of each dimm (on/off)
>
> Changes v2->v3
>
> - qdev integration. Dimms are attached to a dimmbus. The dimmbus is a child
>   of i440fx device in the pc machine. Hot-add and remove are done with normal
>   device_add / device_del operations on the dimmbus. New commands "dimm_add" and
>   "dimm_del" are obsolete.
> - Add _PS3 method to allow OSPM-induced hot operations.
> - pci-window calculation in Seabios takes dimms into account(for both 32-bit and
>   64-bit windows)
> - rename new qmp commands: query-memory-total and query-memory-hotplug
> - balloon driver can see the hotplugged memory
>
> Changes v1->v2
>
> - memory map is automatically calculated for hotplug dimms. Dimms are added from
> top-of-memory skipping the pci hole at [PCI_HOLE_START, 4G).
> - Renamed from "-memslot" to "-dimm". Commands changed to "dimm_add", "dimm_del"
> - Seabios ejection array reduced to a byte. Use extraction macros for dimm ssdt.
> - additional SRAT paravirt info does not break previous SRAT fw_cfg layout.
> - Documentation of new acpi_piix4 registers and paravirt data.
> - add ACPI _OST support for _OST enabled guests. This allows qemu to receive
> notification for success / failure of memory hot-add and hot-remove operations.
> Guest needs to support _OST (https://lkml.org/lkml/2012/6/25/321)
> - add monitor info command to report total guest memory (initial + hot-added)
>
> Issues:
>
> - hot-remove needs to only unmap the dimm device from guest's view. Freeing the
> memory should happen when the last user of the device (e.g. virtio-blk) unrefs
> the device. A testcase is needed for this.
>
> - Live Migration: Ramblocks are migrated before qdev VMStates are migrated. So
> the DimmDevice is handled diferrently than other devices. Should this be
> reworked ?( DimmDevice structure currently does not define a VMStateDescription)
> Live migration works as long as the dimm layout (command line args) are
> identical at the source and destination qemu command line, and destination takes
> into account hot-operations that have occured on source. (v3 patch 10/19
> created the DimmDevice that corresponds to an unknown incoming ramblock, e.g.
> for a dimm that was hot-added on source. but has been dropped for the moment).
>
> - A main blocker issue is windows guest functionality. The patchset does not
> work for windows currently.  Testing on win2012 server RC or windows2008
> consumer prerelease, when adding a DIMM, there is a BSOD with ACPI_BIOS_ERROR
> message. After this, the VM keeps rebooting with ACPI_BIOS_ERROR. The windows
> pnpmem driver obviosuly has a problem with the seabios dimm implementation
> (or the seabios dimm implementation is not fully ACPI-compliant). If someone
> can review the seabios patches or has any ideas to debug this, let me know.
>
> - hot-operation notification lists need to be added to migration state.
>
> series is based on:
> - qemu master (commit a8a826a3) + patch:
> https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02699.html
> - seabios master (commit a810e4e7)
>
> Can also be found at:
>
> http://github.com/vliaskov/qemu-kvm/commits/memhp-v4
> http://github.com/vliaskov/seabios/commits/memhp-v4
>
> Vasilis Liaskovitis (21):
>   qapi: make visit_type_size fallback to type_int
>   Add SIZE type to qdev properties
>   qemu-option: export parse_option_number
>   Implement dimm device abstraction
>   vl: handle "-device dimm"
>   acpi_piix4 : Implement memory device hotplug registers
>   acpi_ich9 : Implement memory device hotplug registers
>   piix_pci and pc_piix: refactor
>   piix_pci: Add i440fx dram controller initialization
>   q35: Add i440fx dram controller initialization
>   pc: Add dimm paravirt SRAT info
>   Introduce paravirt interface QEMU_CFG_PCI_WINDOW
>   Implement "info memory-total" and "query-memory-total"
>   balloon: update with hotplugged memory
>   Implement dimm-info
>   dimm: add hot-remove capability
>   acpi_piix4: add hot-remove capability
>   acpi_ich9: add hot-remove capability
>   Implement qmp and hmp commands for notification lists
>   Add _OST dimm support
>   Implement _PS3 for dimm
>
>  docs/specs/acpi_hotplug.txt |   54 ++++++
>  docs/specs/fwcfg.txt        |   28 +++
>  hmp-commands.hx             |    6 +
>  hmp.c                       |   41 ++++
>  hmp.h                       |    3 +
>  hw/Makefile.objs            |    2 +-
>  hw/acpi.h                   |    5 +
>  hw/acpi_ich9.c              |  115 +++++++++++-
>  hw/acpi_ich9.h              |   12 +-
>  hw/acpi_piix4.c             |  126 ++++++++++++-
>  hw/dimm.c                   |  444 +++++++++++++++++++++++++++++++++++++++++++
>  hw/dimm.h                   |  102 ++++++++++
>  hw/fw_cfg.h                 |    1 +
>  hw/lpc_ich9.c               |    2 +-
>  hw/pc.c                     |   28 +++-
>  hw/pc.h                     |    1 +
>  hw/pc_piix.c                |   74 ++++++--
>  hw/pc_q35.c                 |   18 ++-
>  hw/piix_pci.c               |  249 ++++++++-----------------
>  hw/q35.c                    |   27 +++
>  hw/q35.h                    |    5 +
>  hw/qdev-properties.c        |   60 ++++++
>  hw/qdev-properties.h        |    3 +
>  hw/virtio-balloon.c         |   13 +-
>  monitor.c                   |   21 ++
>  qapi-schema.json            |   63 ++++++
>  qapi/qapi-visit-core.c      |   11 +-
>  qemu-option.c               |    4 +-
>  qemu-option.h               |    4 +
>  qmp-commands.hx             |   57 ++++++
>  sysemu.h                    |    1 +
>  vl.c                        |   60 ++++++
>  32 files changed, 1432 insertions(+), 208 deletions(-)
>  create mode 100644 docs/specs/acpi_hotplug.txt
>  create mode 100644 docs/specs/fwcfg.txt
>  create mode 100644 hw/dimm.c
>  create mode 100644 hw/dimm.h
>
>
> Vasilis Liaskovitis (9):
>   Add ACPI_EXTRACT_DEVICE* macros
>   Add SSDT memory device support
>   acpi-dsdt: Implement functions for memory hotplug
>   acpi: generate hotplug memory devices
>   q35: Add memory hotplug handler
>   pci: Use paravirt interface for pcimem_start and pcimem64_start
>   acpi: add _EJ0 operation and eject port for memory devices
>   Add _OST dimm method
>   Implement _PS3 method for memory device
>
>  Makefile                      |    2 +-
>  src/acpi-dsdt-mem-hotplug.dsl |  136 +++++++++++++++++++++++++++++++++++
>  src/acpi-dsdt.dsl             |    5 +-
>  src/acpi.c                    |  158 +++++++++++++++++++++++++++++++++++++++--
>  src/paravirt.c                |    6 ++
>  src/paravirt.h                |    2 +
>  src/pciinit.c                 |    9 +++
>  src/q35-acpi-dsdt.dsl         |    6 +-
>  src/ssdt-mem.dsl              |   73 +++++++++++++++++++
>  tools/acpi_extract.py         |   28 +++++++
>  10 files changed, 415 insertions(+), 10 deletions(-)
>  create mode 100644 src/acpi-dsdt-mem-hotplug.dsl
>  create mode 100644 src/ssdt-mem.dsl
>
> --
> 1.7.9
>
>



-- 
Regards,

Zhi Yong Wu

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (30 preceding siblings ...)
  2012-12-18 16:45 ` [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Zhi Yong Wu
@ 2012-12-19  7:27 ` Gerd Hoffmann
  2012-12-19 11:35   ` Vasilis Liaskovitis
  2013-01-09  0:08 ` Andreas Färber
                   ` (2 subsequent siblings)
  34 siblings, 1 reply; 72+ messages in thread
From: Gerd Hoffmann @ 2012-12-19  7:27 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, anthony

  Hi,

> - multiple memory buses can be registered. Memory buses of the real hw/chipset
>   or a paravirtual memory bus can be added.

IIRC q35 supports memory hotplug natively (picked up in some
discussion).  Is that correct?

What does the code emulate?  It doesn't look like it emulates q35 memory
hotplug ...

I think the paravirtual memory hotplug controller should be a PCI device
(which we then can add as function to the chipset).  Having some fixed
magic addresses is bad.

[ btw: same goes for ACPI PCI hotplug, that is hardly fixable without
       breaking compatibility though, for q35 we should be able to do
       better ].

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2012-12-19  7:27 ` Gerd Hoffmann
@ 2012-12-19 11:35   ` Vasilis Liaskovitis
  2012-12-19 13:56     ` Gerd Hoffmann
  2013-01-10 18:57     ` Vasilis Liaskovitis
  0 siblings, 2 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-19 11:35 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, anthony

Hi,

On Wed, Dec 19, 2012 at 08:27:36AM +0100, Gerd Hoffmann wrote:
>   Hi,
> 
> > - multiple memory buses can be registered. Memory buses of the real hw/chipset
> >   or a paravirtual memory bus can be added.
> 
> IIRC q35 supports memory hotplug natively (picked up in some
> discussion).  Is that correct?
> 
> What does the code emulate?  It doesn't look like it emulates q35 memory
> hotplug ...

correct, only the number of channels and ranks(dimms) per channel has been
emulated so far (2 channels of 4 dimms each). So it is still paravirtual memory
hotplug, not native. Native support still needs to be worked on.

>From previous discussion I also understand that q35 supports native hotplug. 
Sections 5.1 and 5.2 of the spec describe the MCH registers but the native
acpi-memory hotplug specifics are not yet clear to me. Any pointers from the
spec are welcome.

> 
> I think the paravirtual memory hotplug controller should be a PCI device
> (which we then can add as function to the chipset).  Having some fixed
> magic addresses is bad.

ok, so in your opinion a pci-based hotplug controller sounds better than adding
acpi ports to piix4 or ich9?

Magic acpi_ich9 ports can be avoided if q35 native support is implemented. For
i440fx/piix4 it was discussed and more or less decided we would only support
a paravirtual way of memory hotplug. 

In the description. I meant "paravirtual memory bus" to describe a memory bus
with unlimited number of dimm devices. But the "hotplug control" has always
been acpi-based so far and not a pci device.

thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2012-12-18 16:45 ` [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Zhi Yong Wu
@ 2012-12-19 11:40   ` Vasilis Liaskovitis
  0 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2012-12-19 11:40 UTC (permalink / raw)
  To: Zhi Yong Wu; +Cc: QEMU Developers

On Wed, Dec 19, 2012 at 12:45:46AM +0800, Zhi Yong Wu wrote:
> HI,
> 
> One stupid question, 'dimm' presents one guest memory, then why it is
> called as "dimm"? what is its full name?

it's a bad name coming from dram technology (dual in-line memory module).
Memory-slot or memory-module is probably a better name, since we are not really
modelling a specific memory technology.

thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2012-12-19 11:35   ` Vasilis Liaskovitis
@ 2012-12-19 13:56     ` Gerd Hoffmann
  2013-01-10 18:57     ` Vasilis Liaskovitis
  1 sibling, 0 replies; 72+ messages in thread
From: Gerd Hoffmann @ 2012-12-19 13:56 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, anthony

  Hi,

> correct, only the number of channels and ranks(dimms) per channel has been
> emulated so far (2 channels of 4 dimms each). So it is still paravirtual memory
> hotplug, not native. Native support still needs to be worked on.

Ok.

>> I think the paravirtual memory hotplug controller should be a PCI device
>> (which we then can add as function to the chipset).  Having some fixed
>> magic addresses is bad.
> 
> ok, so in your opinion a pci-based hotplug controller sounds better than adding
> acpi ports to piix4 or ich9?
> 
> Magic acpi_ich9 ports can be avoided if q35 native support is implemented.

Yes.  We should go that route for q35.

> For
> i440fx/piix4 it was discussed and more or less decided we would only support
> a paravirtual way of memory hotplug. 

Sure, there is no other way to do it.

It is probably a good idea to model piix4 paravirtual to work simliar to
q35 native.

> In the description. I meant "paravirtual memory bus" to describe a memory bus
> with unlimited number of dimm devices. But the "hotplug control" has always
> been acpi-based so far and not a pci device.

It still can (and should) be acpi-based.  It is just that:

  (a) Instead of using get_system_io() as parent memory region you use
      create a pci device and place the memory region in one of the PCI
      bars.
  (b) Instead of using OperationRegion($name, SystemIO, $magicaddress)
      you use OperationRegion($name, PciBarTarget, ...) to access the
      registers.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 19/30] Implement "info memory-total" and "query-memory-total"
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 19/30] Implement "info memory-total" and "query-memory-total" Vasilis Liaskovitis
@ 2012-12-19 19:47   ` Blue Swirl
  2013-01-04 16:21   ` Eric Blake
  1 sibling, 0 replies; 72+ messages in thread
From: Blue Swirl @ 2012-12-19 19:47 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel, kevin,
	kraxel, anthony

On Tue, Dec 18, 2012 at 12:41 PM, Vasilis Liaskovitis
<vasilis.liaskovitis@profitbricks.com> wrote:
> Returns total physical memory available to guest in bytes, including hotplugged
> memory. Note that the number reported here may be different from what the guest
> sees e.g. if the guest has not logically onlined hotplugged memory.
>
> This functionality is provided independently of a balloon device, since a
> guest can be using ACPI memory hotplug without using a balloon device.
>
> v3->v4: Moved qmp command implementation to vl.c. This prevents a circular
> header dependency problem.
>
> Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
> ---
>  hmp-commands.hx  |    2 ++
>  hmp.c            |    7 +++++++
>  hmp.h            |    1 +
>  hw/dimm.c        |   14 ++++++++++++++
>  hw/dimm.h        |    1 +
>  monitor.c        |    7 +++++++
>  qapi-schema.json |   11 +++++++++++
>  qmp-commands.hx  |   20 ++++++++++++++++++++
>  vl.c             |    9 +++++++++
>  9 files changed, 72 insertions(+), 0 deletions(-)
>
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index 010b8c9..3fbd975 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1570,6 +1570,8 @@ show device tree
>  show qdev device model list
>  @item info roms
>  show roms
> +@item info memory-total
> +show memory-total
>  @end table
>  ETEXI
>
> diff --git a/hmp.c b/hmp.c
> index 180ba2b..fb39b0d 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -628,6 +628,13 @@ void hmp_info_block_jobs(Monitor *mon)
>      }
>  }
>
> +void hmp_info_memory_total(Monitor *mon)
> +{
> +    uint64_t ram_total;
> +    ram_total = (uint64_t)qmp_query_memory_total(NULL);
> +    monitor_printf(mon, "MemTotal: %lu\n", ram_total);

Wrong format on 32 bit hosts, please use PRIu64.

> +}
> +
>  void hmp_quit(Monitor *mon, const QDict *qdict)
>  {
>      monitor_suspend(mon);
> diff --git a/hmp.h b/hmp.h
> index 0ab03be..25a3a70 100644
> --- a/hmp.h
> +++ b/hmp.h
> @@ -36,6 +36,7 @@ void hmp_info_spice(Monitor *mon);
>  void hmp_info_balloon(Monitor *mon);
>  void hmp_info_pci(Monitor *mon);
>  void hmp_info_block_jobs(Monitor *mon);
> +void hmp_info_memory_total(Monitor *mon);
>  void hmp_quit(Monitor *mon, const QDict *qdict);
>  void hmp_stop(Monitor *mon, const QDict *qdict);
>  void hmp_system_reset(Monitor *mon, const QDict *qdict);
> diff --git a/hw/dimm.c b/hw/dimm.c
> index e384952..f181e54 100644
> --- a/hw/dimm.c
> +++ b/hw/dimm.c
> @@ -189,6 +189,20 @@ void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots)
>      }
>  }
>
> +uint64_t get_hp_memory_total(void)
> +{
> +    DimmBus *bus;
> +    DimmDevice *slot;
> +    uint64_t info = 0;
> +
> +    QLIST_FOREACH(bus, &memory_buses, next) {
> +        QTAILQ_FOREACH(slot, &bus->dimmlist, nextdimm) {
> +            info += slot->size;
> +        }
> +    }
> +    return info;
> +}
> +
>  static int dimm_init(DeviceState *s)
>  {
>      DimmBus *bus = DIMM_BUS(qdev_get_parent_bus(s));
> diff --git a/hw/dimm.h b/hw/dimm.h
> index 75a6911..5130b2c 100644
> --- a/hw/dimm.h
> +++ b/hw/dimm.h
> @@ -85,5 +85,6 @@ DimmBus *dimm_bus_create(Object *parent, const char *name, uint32_t max_dimms,
>      dimm_calcoffset_fn pmc_set_offset);
>  void dimm_config_create(char *id, uint64_t size, const char *bus, uint64_t node,
>          uint32_t dimm_idx, uint32_t populated);
> +uint64_t get_hp_memory_total(void);
>
>  #endif
> diff --git a/monitor.c b/monitor.c
> index c0e32d6..6e87d0d 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -2708,6 +2708,13 @@ static mon_cmd_t info_cmds[] = {
>          .mhandler.info = hmp_info_balloon,
>      },
>      {
> +        .name       = "memory-total",
> +        .args_type  = "",
> +        .params     = "",
> +        .help       = "show total memory size",
> +        .mhandler.info = hmp_info_memory_total,
> +    },
> +    {
>          .name       = "qtree",
>          .args_type  = "",
>          .params     = "",
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 5dfa052..33f88d6 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -2903,6 +2903,17 @@
>  { 'command': 'query-target', 'returns': 'TargetInfo' }
>
>  ##
> +# @query-memory-total:
> +#
> +# Returns total memory in bytes, including hotplugged dimms
> +#
> +# Returns: int
> +#
> +# Since: 1.4
> +##
> +{ 'command': 'query-memory-total', 'returns': 'int' }
> +
> +##
>  # @QKeyCode:
>  #
>  # An enumeration of key name.
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index 5c692d0..a99117a 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -2654,3 +2654,23 @@ EQMP
>          .args_type  = "",
>          .mhandler.cmd_new = qmp_marshal_input_query_target,
>      },
> +
> +    {
> +        .name       = "query-memory-total",
> +        .args_type  = "",
> +        .mhandler.cmd_new = qmp_marshal_input_query_memory_total
> +    },
> +SQMP
> +query-memory-total
> +----------
> +
> +Return total memory in bytes, including hotplugged dimms
> +
> +Example:
> +
> +-> { "execute": "query-memory-total" }
> +<- {
> +      "return": 1073741824
> +   }
> +
> +EQMP
> diff --git a/vl.c b/vl.c
> index 8406933..80803c5 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -126,6 +126,7 @@ int main(int argc, char **argv)
>  #include "hw/xen.h"
>  #include "hw/qdev.h"
>  #include "hw/loader.h"
> +#include "hw/dimm.h"
>  #include "bt-host.h"
>  #include "net.h"
>  #include "net/slirp.h"
> @@ -442,6 +443,14 @@ StatusInfo *qmp_query_status(Error **errp)
>      return info;
>  }
>
> +int64_t qmp_query_memory_total(Error **errp)
> +{
> +    uint64_t info;
> +    info = ram_size + get_hp_memory_total();
> +
> +    return (int64_t)info;
> +}
> +
>  /***********************************************************/
>  /* real time host monotonic timer */
>
> --
> 1.7.9
>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 25/30] acpi_ich9: add hot-remove capability
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 25/30] acpi_ich9: " Vasilis Liaskovitis
@ 2012-12-19 19:48   ` Blue Swirl
  0 siblings, 0 replies; 72+ messages in thread
From: Blue Swirl @ 2012-12-19 19:48 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel, kevin,
	kraxel, anthony

On Tue, Dec 18, 2012 at 12:41 PM, Vasilis Liaskovitis
<vasilis.liaskovitis@profitbricks.com> wrote:
> ---
>  hw/acpi_ich9.c |   28 +++++++++++++++++++++++++++-
>  hw/acpi_ich9.h |    1 +
>  2 files changed, 28 insertions(+), 1 deletions(-)
>
> diff --git a/hw/acpi_ich9.c b/hw/acpi_ich9.c
> index abafbb5..f5dc1c9 100644
> --- a/hw/acpi_ich9.c
> +++ b/hw/acpi_ich9.c
> @@ -105,12 +105,29 @@ static uint32_t memhp_readb(void *opaque, uint32_t addr)
>      return val;
>  }
>
> +static void memhp_writeb(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    switch (addr) {
> +    case ICH9_MEM_EJ_BASE - ICH9_MEM_BASE:
> +        dimm_notify(val, DIMM_REMOVE_SUCCESS);
> +        break;
> +    default:
> +        ICH9_DEBUG("memhp write invalid %x <== %d\n", addr, val);
> +    }
> +    ICH9_DEBUG("memhp write %x <== %d\n", addr, val);
> +}
> +
>  static const MemoryRegionOps ich9_memhp_ops = {
>      .old_portio = (MemoryRegionPortio[]) {
>          {
>              .offset = 0,   .len = DIMM_BITMAP_BYTES, .size = 1,
>              .read = memhp_readb,
>          },
> +        {
> +            .offset = ICH9_MEM_EJ_BASE - ICH9_MEM_BASE,
> +            .len = 1, .size = 1,
> +            .write = memhp_writeb,
> +        },
>          PORTIO_END_OF_LIST()
>      },
>      .endianness = DEVICE_LITTLE_ENDIAN,
> @@ -234,6 +251,13 @@ static void enable_mem_device(ICH9LPCState *s, int memdevice)
>      g->mems_sts[memdevice/8] |= (1 << (memdevice%8));
>  }
>
> +static void disable_mem_device(ICH9LPCState *s, int memdevice)
> +{
> +    struct gpe_regs *g = &s->pm.gperegs;
> +    s->pm.acpi_regs.gpe.sts[0] |= ICH9_MEM_HOTPLUG_STATUS;
> +    g->mems_sts[memdevice/8] &= ~(1 << (memdevice%8));

Spaces around '/' and '%'.

> +}
> +
>  static int ich9_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
>          add)
>  {
> @@ -243,6 +267,8 @@ static int ich9_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
>
>      if (add) {
>          enable_mem_device(s, slot->idx);
> +    } else {
> +        disable_mem_device(s, slot->idx);
>      }
>      pm_update_sci(&s->pm);
>      return 0;
> @@ -270,7 +296,7 @@ void ich9_pm_init(void *device, qemu_irq sci_irq, qemu_irq cmos_s3)
>      memory_region_add_subregion(&pm->io, ICH9_PMIO_SMI_EN, &pm->io_smi);
>
>      memory_region_init_io(&pm->io_memhp, &ich9_memhp_ops, pm, "apci-memhp0",
> -                          DIMM_BITMAP_BYTES);
> +                          DIMM_BITMAP_BYTES + 1);
>      memory_region_add_subregion(get_system_io(), ICH9_MEM_BASE, &pm->io_memhp);
>
>      dimm_bus_hotplug(ich9_dimm_hotplug, &lpc->d.qdev);
> diff --git a/hw/acpi_ich9.h b/hw/acpi_ich9.h
> index 4419247..af61a2d 100644
> --- a/hw/acpi_ich9.h
> +++ b/hw/acpi_ich9.h
> @@ -24,6 +24,7 @@
>  #include "acpi.h"
>
>  #define ICH9_MEM_BASE    0xaf80
> +#define ICH9_MEM_EJ_BASE    0xafa0
>  #define ICH9_MEM_HOTPLUG_STATUS 8
>
>  typedef struct ICH9LPCPMRegs {
> --
> 1.7.9
>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 19/30] Implement "info memory-total" and "query-memory-total"
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 19/30] Implement "info memory-total" and "query-memory-total" Vasilis Liaskovitis
  2012-12-19 19:47   ` Blue Swirl
@ 2013-01-04 16:21   ` Eric Blake
  2013-01-10 17:42     ` Vasilis Liaskovitis
  1 sibling, 1 reply; 72+ messages in thread
From: Eric Blake @ 2013-01-04 16:21 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, kraxel, anthony

[-- Attachment #1: Type: text/plain, Size: 1689 bytes --]

On 12/18/2012 05:41 AM, Vasilis Liaskovitis wrote:
> Returns total physical memory available to guest in bytes, including hotplugged
> memory. Note that the number reported here may be different from what the guest
> sees e.g. if the guest has not logically onlined hotplugged memory.
> 
> This functionality is provided independently of a balloon device, since a
> guest can be using ACPI memory hotplug without using a balloon device.
> 
> v3->v4: Moved qmp command implementation to vl.c. This prevents a circular
> header dependency problem.

Generally, patch change history should occur...

> 
> Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
> ---

...here, after the --- divider.  It's useful in the email chain, but
does not need to be part of the final git history.

> +++ b/qapi-schema.json
> @@ -2903,6 +2903,17 @@
>  { 'command': 'query-target', 'returns': 'TargetInfo' }
>  
>  ##
> +# @query-memory-total:
> +#
> +# Returns total memory in bytes, including hotplugged dimms
> +#
> +# Returns: int
> +#
> +# Since: 1.4
> +##
> +{ 'command': 'query-memory-total', 'returns': 'int' }

Any reason you can't name this just 'query-memory', and return a JSON
dictionary instead of a single int, so that in the future you can add
other memory parameters into the same call?  For example, down the road
we may want to report some 'newstat' without adding a new QMP command:

{ 'type': 'MemoryInfo',
  'data': { 'total': 'int', 'newstat': 'int' } }
{ 'command': 'query-memory', 'returns': 'MemoryInfo' }

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 21/30] Implement dimm-info
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 21/30] Implement dimm-info Vasilis Liaskovitis
@ 2013-01-08 23:20   ` Eric Blake
  2013-01-10 17:45     ` Vasilis Liaskovitis
  0 siblings, 1 reply; 72+ messages in thread
From: Eric Blake @ 2013-01-08 23:20 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, kraxel, anthony

[-- Attachment #1: Type: text/plain, Size: 1538 bytes --]

On 12/18/2012 05:41 AM, Vasilis Liaskovitis wrote:
> "query-dimm-info" and "info dimm" will give current state of all dimms in the
> system e.g.
> 
> dimm0: on
> dimm1: off
> dimm2: off
> dimm3: on
> etc.
> 
> Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
> ---

> +++ b/qapi-schema.json
> @@ -2914,6 +2914,32 @@
>  { 'command': 'query-memory-total', 'returns': 'int' }
>  
>  ##
> +# @DimmInfo:
> +#
> +# Information about status of a memory hotplug command
> +#
> +# @dimm: the Dimm associated with the result
> +#
> +# @result: the result of the hotplug command

Here you call it 'result',

> +#
> +# Since: 1.4
> +#
> +##
> +{ 'type': 'DimmInfo',
> +  'data': {'dimm': 'str', 'state': 'bool'} }

but here you call it 'state'.  Which is it?  And does 'true' mean
plugged in, or that the last command succeeded (where the last command
may have been either a plug or an unplug)?  My preference is that 'true'
means plugged in, so more documentation would help.

> +
> +##
> +# @query-dimm-info:
> +#
> +# Returns total memory in bytes, including hotplugged dimms

Really?

> +#
> +# Returns: int

Copy-and-paste error?  This doesn't return an 'int', but an array of
'DimmInfo'.

> +#
> +# Since: 1.4
> +##
> +{ 'command': 'query-dimm-info', 'returns': ['DimmInfo'] }
> +
> +##
>  # @QKeyCode:
>  #
>  # An enumeration of key name.
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (31 preceding siblings ...)
  2012-12-19  7:27 ` Gerd Hoffmann
@ 2013-01-09  0:08 ` Andreas Färber
  2013-01-10 17:36   ` Vasilis Liaskovitis
  2013-03-20  6:18   ` li guang
       [not found] ` <CAF+CadtnTcOnUt7jp1bARJgioxR5KzLG0QSQuDbiqhiKxiCqFA@mail.gmail.com>
  2013-03-26 14:47 ` Luiz Capitulino
  34 siblings, 2 replies; 72+ messages in thread
From: Andreas Färber @ 2013-01-09  0:08 UTC (permalink / raw)
  To: Vasilis Liaskovitis, Anthony Liguori, Paolo Bonzini
  Cc: pingfank, Eduardo Habkost, gleb, stefanha, jbaron, seabios,
	qemu-devel, blauwirbel, kevin, kraxel, Igor Mammedov

Am 18.12.2012 13:41, schrieb Vasilis Liaskovitis:
> Because dimm layout needs to be configured on machine-boot, all dimm devices
> need to be specified on startup command line (either with populated=on or with
> populated=off). The dimm information is stored in dimm configuration structures.
> 
> After machine startup, dimms are hot-added or removed with normal device_add
> and device_del operations e.g.:
> Hot-add syntax: "device_add dimm,id=mydimm0,bus=membus.0"
> Hot-remove syntax: "device_del dimm,id=mydimm0"

This sounds contradictory: Either all devices need to be specified on
the command line, or they can be hot-added via monitor.

Assuming a fixed layout at startup, I wonder if there is another clever
way to model this... For CPU hotplug Anthony had suggested to have a
fixed set of link<Socket> properties that get set to a CPU socket as
needed. Might a similar strategy work for memory, i.e. a
startup-configured amount of link<DIMM>s on /machine/dimm[n] that point
to a QOM DIMM object or NULL if unpopulated? Hot(un)plug would then
simply work via QMP qom-set command. (CC'ing some people)

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 06/30] qapi: make visit_type_size fallback to type_int
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 06/30] qapi: make visit_type_size fallback to type_int Vasilis Liaskovitis
@ 2013-01-09  0:18   ` Andreas Färber
  2013-01-09 16:00     ` mdroth
  0 siblings, 1 reply; 72+ messages in thread
From: Andreas Färber @ 2013-01-09  0:18 UTC (permalink / raw)
  To: Vasilis Liaskovitis, Michael Roth
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, kraxel, Anthony Liguori

Am 18.12.2012 13:41, schrieb Vasilis Liaskovitis:
> Currently visit_type_size checks if the visitor's type_size function pointer is
> NULL. If not, it calls it, otherwise it calls v->type_uint64(). But neither of
> these pointers are ever set. Fallback to calling v->type_int() in this third
> (default) case.
> 
> Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>

Is this patch still needed? I thought size -> int fallback was fixed
differently for the deallocation visitor...

Anyway, someone (Anthony?) recently suggested to drop the size visitor
completely in favor of string type (cf. frequency visitor discussion).

Regards,
Andreas

> ---
>  qapi/qapi-visit-core.c |   11 ++++++++++-
>  1 files changed, 10 insertions(+), 1 deletions(-)
> 
> diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
> index 7a82b63..497e693 100644
> --- a/qapi/qapi-visit-core.c
> +++ b/qapi/qapi-visit-core.c
> @@ -236,8 +236,17 @@ void visit_type_int64(Visitor *v, int64_t *obj, const char *name, Error **errp)
>  
>  void visit_type_size(Visitor *v, uint64_t *obj, const char *name, Error **errp)
>  {
> +    int64_t value;
>      if (!error_is_set(errp)) {
> -        (v->type_size ? v->type_size : v->type_uint64)(v, obj, name, errp);
> +        if (v->type_size) {
> +            v->type_size(v, obj, name, errp);
> +        } else if (v->type_uint64) {
> +            v->type_uint64(v, obj, name, errp);
> +        } else {
> +            value = *obj;
> +            v->type_int(v, &value, name, errp);
> +            *obj = value;
> +        }
>      }
>  }
>  

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 26/30] Implement qmp and hmp commands for notification lists
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 26/30] Implement qmp and hmp commands for notification lists Vasilis Liaskovitis
@ 2013-01-09  0:23   ` Eric Blake
  0 siblings, 0 replies; 72+ messages in thread
From: Eric Blake @ 2013-01-09  0:23 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, kraxel, anthony

[-- Attachment #1: Type: text/plain, Size: 1655 bytes --]

On 12/18/2012 05:41 AM, Vasilis Liaskovitis wrote:
> Guest can respond to ACPI hotplug events e.g. with _EJ or _OST method.
> This patch implements a tail queue to store guest notifications for memory
> hot-add and hot-remove requests.
> 
> Guest responses for memory hotplug command on a per-dimm basis can be detected
> with the new hmp command "info memory-hotplug" or the new qmp command
> "query-memory-hotplug"
> 
> Examples:
> 
> (qemu) device_add dimm,id=ram0
> (qemu) info memory-hotplug
> dimm: ram0 hot-add success
> or
> dimm: ram0 hot-add failure
> 

> +++ b/qapi-schema.json
> @@ -2940,6 +2940,32 @@
>  { 'command': 'query-dimm-info', 'returns': ['DimmInfo'] }
>  
>  ##
> +# @MemHpInfo:
> +#
> +# Information about status of a memory hotplug command
> +#
> +# @dimm: the Dimm associated with the result

Didn't document @request.

> +#
> +# @result: the result of the hotplug command
> +#
> +# Since: 1.4
> +#
> +##
> +{ 'type': 'MemHpInfo',
> +  'data': {'dimm': 'str', 'request': 'str', 'result': 'str'} }

Instead of open-coding 'request' and 'result' as an arbitrary string, it
would be nicer to have them be an enum type with a finite subset of
expected values.

> +
> +##
> +# @query-memory-hotplug:
> +#
> +# Returns a list of information about pending hotplug commands
> +#
> +# Returns: a list of @MemhpInfo

s/MemhpInfo/MemHpInfo/

> +#
> +# Since: 1.4
> +##
> +{ 'command': 'query-memory-hotplug', 'returns': ['MemHpInfo'] }
> +
> +##
>  # @QKeyCode:
>  #

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 06/30] qapi: make visit_type_size fallback to type_int
  2013-01-09  0:18   ` Andreas Färber
@ 2013-01-09 16:00     ` mdroth
  0 siblings, 0 replies; 72+ messages in thread
From: mdroth @ 2013-01-09 16:00 UTC (permalink / raw)
  To: Andreas Färber
  Cc: blauwirbel, pingfank, gleb, stefanha, jbaron, seabios,
	qemu-devel, Vasilis Liaskovitis, kevin, kraxel, Anthony Liguori

On Wed, Jan 09, 2013 at 01:18:04AM +0100, Andreas Färber wrote:
> Am 18.12.2012 13:41, schrieb Vasilis Liaskovitis:
> > Currently visit_type_size checks if the visitor's type_size function pointer is
> > NULL. If not, it calls it, otherwise it calls v->type_uint64(). But neither of
> > these pointers are ever set. Fallback to calling v->type_int() in this third
> > (default) case.
> > 
> > Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
> 
> Is this patch still needed? I thought size -> int fallback was fixed
> differently for the deallocation visitor...

It was only fixed for the dealloc visitor since that was the only path
hit in 1.3. If we add a non-OptsVisitor user of visit_type_size() then
we'll trigger the bug this patch fixes.

I do have some comments on this patch though (below)

> 
> Anyway, someone (Anthony?) recently suggested to drop the size visitor
> completely in favor of string type (cf. frequency visitor discussion).

Hmm, in terms of generalizing size/freq visitors without adding code for
theoretical use cases (which I think was the deadlock) it seems to serve
that purpose. And it does make sense to handle special suffixes at the
end-points (qemu options and option users) rather than bake them into the
visitor interfaces (which get too numerous if we add them on a
case-by-case, and unwieldly if we try to generalize too much)

Though for qapi-generated visitors (as opposed to the open-coded ones like
the freq visitor) it does have the downside of requiring us to deserialize
into a string field before parsing/using it later, so we add some
unecessary fields. But I don't think that's a major downside.

At least, I would consider implementing the frequency visitor as a
string visitor should we decide to leave visit_type_size() as is.

> 
> Regards,
> Andreas
> 
> > ---
> >  qapi/qapi-visit-core.c |   11 ++++++++++-
> >  1 files changed, 10 insertions(+), 1 deletions(-)
> > 
> > diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
> > index 7a82b63..497e693 100644
> > --- a/qapi/qapi-visit-core.c
> > +++ b/qapi/qapi-visit-core.c
> > @@ -236,8 +236,17 @@ void visit_type_int64(Visitor *v, int64_t *obj, const char *name, Error **errp)
> >  
> >  void visit_type_size(Visitor *v, uint64_t *obj, const char *name, Error **errp)
> >  {
> > +    int64_t value;
> >      if (!error_is_set(errp)) {
> > -        (v->type_size ? v->type_size : v->type_uint64)(v, obj, name, errp);
> > +        if (v->type_size) {
> > +            v->type_size(v, obj, name, errp);
> > +        } else if (v->type_uint64) {
> > +            v->type_uint64(v, obj, name, errp);
> > +        } else {
> > +            value = *obj;
> > +            v->type_int(v, &value, name, errp);
> > +            *obj = value;
> > +        }

I'd recommend just doing:

  if (v->type_size) {
      v->type_size(v, obj, name, errp);
  } else {
      visit_type_uint64(v, obj, name, errp);
  }

visit_type_uint64() already handles the fallback to visit_type_int() so no
need to duplicate.

> >      }
> >  }
> >  
> 
> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-01-09  0:08 ` Andreas Färber
@ 2013-01-10 17:36   ` Vasilis Liaskovitis
  2013-01-10 17:55     ` Andreas Färber
  2013-03-20  6:18   ` li guang
  1 sibling, 1 reply; 72+ messages in thread
From: Vasilis Liaskovitis @ 2013-01-10 17:36 UTC (permalink / raw)
  To: Andreas Färber
  Cc: pingfank, Eduardo Habkost, gleb, stefanha, jbaron, seabios,
	qemu-devel, blauwirbel, kevin, kraxel, Anthony Liguori,
	Igor Mammedov, Paolo Bonzini

Hi,
On Wed, Jan 09, 2013 at 01:08:52AM +0100, Andreas Färber wrote:
> Am 18.12.2012 13:41, schrieb Vasilis Liaskovitis:
> > Because dimm layout needs to be configured on machine-boot, all dimm devices
> > need to be specified on startup command line (either with populated=on or with
> > populated=off). The dimm information is stored in dimm configuration structures.
> > 
> > After machine startup, dimms are hot-added or removed with normal device_add
> > and device_del operations e.g.:
> > Hot-add syntax: "device_add dimm,id=mydimm0,bus=membus.0"
> > Hot-remove syntax: "device_del dimm,id=mydimm0"
> 
> This sounds contradictory: Either all devices need to be specified on
> the command line, or they can be hot-added via monitor.

Due to the fixed layout requirement, all memory devices need to be specified at
the command line. This was done with a separate "-dimm" argument in previous
versions (see v3), but some reviewers didn't like the extra argument and
suggested handling everything with the normal "-device" arg.

So "-device dimm,..." saves the layout for *all* memory devices. However for
"populated=off" dimms, the device is actually *not* created at startup.

This is why the following combination is not contradictory:

Dimm descirption at startup:
-device dimm,id=mydimm0,bus=membus.0,size=1G,node=0,populated=off
Hot-add with monitor command: 
device_add dimm,id=mydimm0,bus=membus.0

If on the other hand we specify:
-device dimm,id=mydimm0,bus=membus.0,size=1G,node=0,populated=on
the dimm device is indeed created at startup.

granted it's confusing, but this is how v4 handles the fixed layout/device
creation without adding a new command line argument for the layout. Better
solutions are welcome.

> 
> Assuming a fixed layout at startup, I wonder if there is another clever
> way to model this... For CPU hotplug Anthony had suggested to have a
> fixed set of link<Socket> properties that get set to a CPU socket as
> needed. Might a similar strategy work for memory, i.e. a
> startup-configured amount of link<DIMM>s on /machine/dimm[n] that point
> to a QOM DIMM object or NULL if unpopulated? Hot(un)plug would then
> simply work via QMP qom-set command. (CC'ing some people)

This may work for a fixed number of PV dimms. On the other hand some other
reviewers like the idea of modelling the memory bus (DimmBus), either for
paravirtualized features (e.g.  i440fx) or for emulated memory controllers
in the future. I assume we either go with a bus or links<>, and not both.

Btw, is the CPU link<socket> feature already implemented in a qom-cpu branch? I
haven't tested qom-cpu for a long time, but I could take a look as a point of
reference.

thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 19/30] Implement "info memory-total" and "query-memory-total"
  2013-01-04 16:21   ` Eric Blake
@ 2013-01-10 17:42     ` Vasilis Liaskovitis
  0 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2013-01-10 17:42 UTC (permalink / raw)
  To: Eric Blake
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, kraxel, anthony

On Fri, Jan 04, 2013 at 09:21:08AM -0700, Eric Blake wrote:
> On 12/18/2012 05:41 AM, Vasilis Liaskovitis wrote:
> > Returns total physical memory available to guest in bytes, including hotplugged
> > memory. Note that the number reported here may be different from what the guest
> > sees e.g. if the guest has not logically onlined hotplugged memory.
> > 
> > This functionality is provided independently of a balloon device, since a
> > guest can be using ACPI memory hotplug without using a balloon device.
> > 
> > v3->v4: Moved qmp command implementation to vl.c. This prevents a circular
> > header dependency problem.
> 
> Generally, patch change history should occur...
> 
> > 
> > Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
> > ---
> 
> ...here, after the --- divider.  It's useful in the email chain, but
> does not need to be part of the final git history.
ok, thanks.

> 
> > +++ b/qapi-schema.json
> > @@ -2903,6 +2903,17 @@
> >  { 'command': 'query-target', 'returns': 'TargetInfo' }
> >  
> >  ##
> > +# @query-memory-total:
> > +#
> > +# Returns total memory in bytes, including hotplugged dimms
> > +#
> > +# Returns: int
> > +#
> > +# Since: 1.4
> > +##
> > +{ 'command': 'query-memory-total', 'returns': 'int' }
> 
> Any reason you can't name this just 'query-memory', and return a JSON
> dictionary instead of a single int, so that in the future you can add
> other memory parameters into the same call?  For example, down the road
> we may want to report some 'newstat' without adding a new QMP command:

I am fine with a dictionary, if we see a need for extending the command in the
future. Is it common practice to start off with dicts for simple commands? In
any case, I 'll update.

thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 21/30] Implement dimm-info
  2013-01-08 23:20   ` Eric Blake
@ 2013-01-10 17:45     ` Vasilis Liaskovitis
  0 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2013-01-10 17:45 UTC (permalink / raw)
  To: Eric Blake
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, kraxel, anthony

On Tue, Jan 08, 2013 at 04:20:26PM -0700, Eric Blake wrote:
> On 12/18/2012 05:41 AM, Vasilis Liaskovitis wrote:
> > "query-dimm-info" and "info dimm" will give current state of all dimms in the
> > system e.g.
> > 
> > dimm0: on
> > dimm1: off
> > dimm2: off
> > dimm3: on
> > etc.
> > 
> > Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
> > ---
> 
> > +++ b/qapi-schema.json
> > @@ -2914,6 +2914,32 @@
> >  { 'command': 'query-memory-total', 'returns': 'int' }
> >  
> >  ##
> > +# @DimmInfo:
> > +#
> > +# Information about status of a memory hotplug command
> > +#
> > +# @dimm: the Dimm associated with the result
> > +#
> > +# @result: the result of the hotplug command
> 
> Here you call it 'result',
> 
> > +#
> > +# Since: 1.4
> > +#
> > +##
> > +{ 'type': 'DimmInfo',
> > +  'data': {'dimm': 'str', 'state': 'bool'} }
> 
> but here you call it 'state'.  Which is it?  And does 'true' mean
> plugged in, or that the last command succeeded (where the last command
> may have been either a plug or an unplug)?  My preference is that 'true'
> means plugged in, so more documentation would help.

"True" does mean "plugged in" as you suggest, and the name should be "state". I
'll clarify the documentation.

> 
> > +
> > +##
> > +# @query-dimm-info:
> > +#
> > +# Returns total memory in bytes, including hotplugged dimms
> 
> Really?
> 
> > +#
> > +# Returns: int
> 
> Copy-and-paste error?  This doesn't return an 'int', but an array of
> 'DimmInfo'.

both copy-paste errors, will fix.

thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-01-10 17:36   ` Vasilis Liaskovitis
@ 2013-01-10 17:55     ` Andreas Färber
  0 siblings, 0 replies; 72+ messages in thread
From: Andreas Färber @ 2013-01-10 17:55 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, Eduardo Habkost, gleb, stefanha, jbaron, seabios,
	qemu-devel, blauwirbel, kevin, kraxel, Anthony Liguori,
	Igor Mammedov, Paolo Bonzini

Am 10.01.2013 18:36, schrieb Vasilis Liaskovitis:
> Btw, is the CPU link<socket> feature already implemented in a qom-cpu branch?

No, the latest topology series moves fields around as preparation.
There's still one CPUState per hyperthread, but as a DeviceState now.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2012-12-19 11:35   ` Vasilis Liaskovitis
  2012-12-19 13:56     ` Gerd Hoffmann
@ 2013-01-10 18:57     ` Vasilis Liaskovitis
  2013-03-19  6:30       ` li guang
  2013-04-02  9:15       ` liu ping fan
  1 sibling, 2 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2013-01-10 18:57 UTC (permalink / raw)
  To: qemu-devel, seabios
  Cc: gleb, stefanha, jbaron, blauwirbel, kevin, Gerd Hoffmann, anthony

> > 
> > IIRC q35 supports memory hotplug natively (picked up in some
> > discussion).  Is that correct?
> > 
> From previous discussion I also understand that q35 supports native hotplug. 
> Sections 5.1 and 5.2 of the spec describe the MCH registers but the native
> memory hotplug specifics are not yet clear to me. Any pointers from the
> spec are welcome.

Ping. Could anyone who's familiar with the q35 spec provide some pointers on
native memory hotplug details in the spec? I see pcie hotplug registers but can't
find memory hotplug interface details. If I am not mistaken, the spec is here:
http://www.intel.com/design/chipsets/datashts/316966.htm

Is the q35 memory hotplug support supposed to be an shpc-like interface geared
towards memory slots instead of pci slots?

thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 13/30] piix_pci and pc_piix: refactor
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 13/30] piix_pci and pc_piix: refactor Vasilis Liaskovitis
@ 2013-01-16  7:20   ` Hu Tao
  2013-01-16  9:36     ` Vasilis Liaskovitis
  0 siblings, 1 reply; 72+ messages in thread
From: Hu Tao @ 2013-01-16  7:20 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, kraxel, anthony

Hi Vasilis,

On Tue, Dec 18, 2012 at 01:41:41PM +0100, Vasilis Liaskovitis wrote:
> Refactor code so that chipset initialization is similar to q35. This will
> allow memory map initialization at chipset qdev init time for both
> machines, as well as more similar code structure overall.
> 
> Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
> ---
>  hw/pc_piix.c  |   57 ++++++++++++---
>  hw/piix_pci.c |  225 ++++++++++++++-------------------------------------------
>  2 files changed, 100 insertions(+), 182 deletions(-)
> 
> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> index 19e342a..6a9b508 100644
> --- a/hw/pc_piix.c
> +++ b/hw/pc_piix.c
> @@ -47,6 +47,7 @@
>  #ifdef CONFIG_XEN
>  #  include <xen/hvm/hvm_info_table.h>
>  #endif
> +#include "piix_pci.h"

Can't find this file. Did you forget to add this file to git?

-- 
Regards,
Hu Tao

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 13/30] piix_pci and pc_piix: refactor
  2013-01-16  7:20   ` Hu Tao
@ 2013-01-16  9:36     ` Vasilis Liaskovitis
  2013-01-16 11:17       ` Andreas Färber
  0 siblings, 1 reply; 72+ messages in thread
From: Vasilis Liaskovitis @ 2013-01-16  9:36 UTC (permalink / raw)
  To: Hu Tao
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, kraxel, anthony

Hi,

On Wed, Jan 16, 2013 at 03:20:40PM +0800, Hu Tao wrote:
> Hi Vasilis,
> 
> On Tue, Dec 18, 2012 at 01:41:41PM +0100, Vasilis Liaskovitis wrote:
> > Refactor code so that chipset initialization is similar to q35. This will
> > allow memory map initialization at chipset qdev init time for both
> > machines, as well as more similar code structure overall.
> > 
> > Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
> > ---
> >  hw/pc_piix.c  |   57 ++++++++++++---
> >  hw/piix_pci.c |  225 ++++++++++++++-------------------------------------------
> >  2 files changed, 100 insertions(+), 182 deletions(-)
> > 
> > diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> > index 19e342a..6a9b508 100644
> > --- a/hw/pc_piix.c
> > +++ b/hw/pc_piix.c
> > @@ -47,6 +47,7 @@
> >  #ifdef CONFIG_XEN
> >  #  include <xen/hvm/hvm_info_table.h>
> >  #endif
> > +#include "piix_pci.h"
> 
> Can't find this file. Did you forget to add this file to git?

sorry, you are right. Below is the corrected patch with the missing header

Refactor code so that chipset initialization is similar to q35. This will
allow memory map initialization at chipset qdev init time for both
machines, as well as more similar code structure overall.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
 hw/pc_piix.c  |   57 ++++++++++++---
 hw/piix_pci.c |  225 ++++++++++++++-------------------------------------------
 hw/piix_pci.h |  116 +++++++++++++++++++++++++++++
 3 files changed, 216 insertions(+), 182 deletions(-)
 create mode 100644 hw/piix_pci.h

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 19e342a..6a9b508 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -47,6 +47,7 @@
 #ifdef CONFIG_XEN
 #  include <xen/hvm/hvm_info_table.h>
 #endif
+#include "piix_pci.h"
 
 #define MAX_IDE_BUS 2
 
@@ -85,6 +86,8 @@ static void pc_init1(MemoryRegion *system_memory,
     MemoryRegion *pci_memory;
     MemoryRegion *rom_memory;
     void *fw_cfg = NULL;
+    I440FXState *i440fx_host;
+    PIIX3State *piix3;
 
     pc_cpus_init(cpu_model);
 
@@ -127,21 +130,53 @@ static void pc_init1(MemoryRegion *system_memory,
     }
 
     if (pci_enabled) {
-        pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, &isa_bus, gsi,
-                              system_memory, system_io, ram_size,
-                              below_4g_mem_size,
-                              0x100000000ULL - below_4g_mem_size,
-                              0x100000000ULL + above_4g_mem_size,
-                              (sizeof(hwaddr) == 4
-                               ? 0
-                               : ((uint64_t)1 << 62)),
-                              pci_memory, ram_memory);
+        i440fx_host = I440FX_HOST_DEVICE(qdev_create(NULL,
+                    TYPE_I440FX_HOST_DEVICE));
+        i440fx_host->mch.ram_memory = ram_memory;
+        i440fx_host->mch.pci_address_space = pci_memory;
+        i440fx_host->mch.system_memory = get_system_memory();
+        i440fx_host->mch.address_space_io = get_system_io();;
+        i440fx_host->mch.below_4g_mem_size = below_4g_mem_size;
+        i440fx_host->mch.above_4g_mem_size = above_4g_mem_size;
+
+        qdev_init_nofail(DEVICE(i440fx_host));
+        i440fx_state = &i440fx_host->mch;
+        pci_bus = i440fx_host->parent_obj.bus;
+        /* Xen supports additional interrupt routes from the PCI devices to
+         * the IOAPIC: the four pins of each PCI device on the bus are also
+         * connected to the IOAPIC directly.
+         * These additional routes can be discovered through ACPI. */
+        if (xen_enabled()) {
+            piix3 = DO_UPCAST(PIIX3State, dev,
+                    pci_create_simple_multifunction(pci_bus, -1, true,
+                        "PIIX3-xen"));
+            pci_bus_irqs(pci_bus, xen_piix3_set_irq, xen_pci_slot_get_pirq,
+                    piix3, XEN_PIIX_NUM_PIRQS);
+        } else {
+            piix3 = DO_UPCAST(PIIX3State, dev,
+                    pci_create_simple_multifunction(pci_bus, -1, true,
+                        "PIIX3"));
+            pci_bus_irqs(pci_bus, piix3_set_irq, pci_slot_get_pirq, piix3,
+                    PIIX_NUM_PIRQS);
+            pci_bus_set_route_irq_fn(pci_bus, piix3_route_intx_pin_to_irq);
+        }
+        piix3->pic = gsi;
+        isa_bus = DO_UPCAST(ISABus, qbus,
+                qdev_get_child_bus(&piix3->dev.qdev, "isa.0"));
+
+        piix3_devfn = piix3->dev.devfn;
+
+        ram_size = ram_size / 8 / 1024 / 1024;
+        if (ram_size > 255) {
+            ram_size = 255;
+        }
+        i440fx_state->dev.config[0x57] = ram_size;
     } else {
         pci_bus = NULL;
-        i440fx_state = NULL;
         isa_bus = isa_bus_new(NULL, system_io);
         no_hpet = 1;
     }
+
     isa_bus_irqs(isa_bus, gsi);
 
     if (kvm_irqchip_in_kernel()) {
@@ -157,7 +192,7 @@ static void pc_init1(MemoryRegion *system_memory,
         gsi_state->i8259_irq[i] = i8259[i];
     }
     if (pci_enabled) {
-        ioapic_init_gsi(gsi_state, "i440fx");
+        ioapic_init_gsi(gsi_state, NULL);
     }
 
     pc_register_ferr_irq(gsi[13]);
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index ba1b3de..7ca3c73 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -31,70 +31,15 @@
 #include "range.h"
 #include "xen.h"
 #include "pam.h"
+#include "piix_pci.h"
 
-/*
- * I440FX chipset data sheet.
- * http://download.intel.com/design/chipsets/datashts/29054901.pdf
- */
-
-typedef struct I440FXState {
-    PCIHostState parent_obj;
-} I440FXState;
-
-#define PIIX_NUM_PIC_IRQS       16      /* i8259 * 2 */
-#define PIIX_NUM_PIRQS          4ULL    /* PIRQ[A-D] */
-#define XEN_PIIX_NUM_PIRQS      128ULL
-#define PIIX_PIRQC              0x60
-
-typedef struct PIIX3State {
-    PCIDevice dev;
-
-    /*
-     * bitmap to track pic levels.
-     * The pic level is the logical OR of all the PCI irqs mapped to it
-     * So one PIC level is tracked by PIIX_NUM_PIRQS bits.
-     *
-     * PIRQ is mapped to PIC pins, we track it by
-     * PIIX_NUM_PIRQS * PIIX_NUM_PIC_IRQS = 64 bits with
-     * pic_irq * PIIX_NUM_PIRQS + pirq
-     */
-#if PIIX_NUM_PIC_IRQS * PIIX_NUM_PIRQS > 64
-#error "unable to encode pic state in 64bit in pic_levels."
-#endif
-    uint64_t pic_levels;
-
-    qemu_irq *pic;
-
-    /* This member isn't used. Just for save/load compatibility */
-    int32_t pci_irq_levels_vmstate[PIIX_NUM_PIRQS];
-} PIIX3State;
-
-struct PCII440FXState {
-    PCIDevice dev;
-    MemoryRegion *system_memory;
-    MemoryRegion *pci_address_space;
-    MemoryRegion *ram_memory;
-    MemoryRegion pci_hole;
-    MemoryRegion pci_hole_64bit;
-    PAMMemoryRegion pam_regions[13];
-    MemoryRegion smram_region;
-    uint8_t smm_enabled;
-};
-
-
-#define I440FX_PAM      0x59
-#define I440FX_PAM_SIZE 7
-#define I440FX_SMRAM    0x72
-
-static void piix3_set_irq(void *opaque, int pirq, int level);
-static PCIINTxRoute piix3_route_intx_pin_to_irq(void *opaque, int pci_intx);
 static void piix3_write_config_xen(PCIDevice *dev,
                                uint32_t address, uint32_t val, int len);
 
 /* return the global irq number corresponding to a given device irq
    pin. We could also use the bus number to have a more precise
    mapping. */
-static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
+int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
 {
     int slot_addend;
     slot_addend = (pci_dev->devfn >> 3) - 1;
@@ -180,149 +125,86 @@ static const VMStateDescription vmstate_i440fx = {
     }
 };
 
-static int i440fx_pcihost_initfn(SysBusDevice *dev)
+static void i440fx_pcihost_initfn(Object *obj)
 {
-    PCIHostState *s = PCI_HOST_BRIDGE(dev);
+    I440FXState *s = I440FX_HOST_DEVICE(obj);
+    object_initialize(&s->mch, TYPE_I440FX_PCI_DEVICE);
+    object_property_add_child(OBJECT(s), "mch", OBJECT(&s->mch), NULL);
+}
 
-    memory_region_init_io(&s->conf_mem, &pci_host_conf_le_ops, s,
-                          "pci-conf-idx", 4);
-    sysbus_add_io(dev, 0xcf8, &s->conf_mem);
-    sysbus_init_ioports(&s->busdev, 0xcf8, 4);
+static int i440fx_pcihost_init(SysBusDevice *dev)
+{
+    PCIHostState *pci = FROM_SYSBUS(PCIHostState, dev);
+    I440FXState *s = I440FX_HOST_DEVICE(&dev->qdev);
+    PCIBus *b;
+
+    memory_region_init_io(&pci->conf_mem, &pci_host_conf_le_ops, pci,
+                           "pci-conf-idx", 4);
+    sysbus_add_io(dev, 0xcf8, &pci->conf_mem);
+    sysbus_init_ioports(&pci->busdev, 0xcf8, 4);
+    memory_region_init_io(&pci->data_mem, &pci_host_data_le_ops, pci,
+                           "pci-conf-data", 4);
 
-    memory_region_init_io(&s->data_mem, &pci_host_data_le_ops, s,
-                          "pci-conf-data", 4);
-    sysbus_add_io(dev, 0xcfc, &s->data_mem);
-    sysbus_init_ioports(&s->busdev, 0xcfc, 4);
+    sysbus_add_io(dev, 0xcfc, &pci->data_mem);
+    sysbus_init_ioports(&pci->busdev, 0xcfc, 4);
+
+    b = pci_bus_new(&s->parent_obj.busdev.qdev, NULL, s->mch.pci_address_space,
+                    s->mch.address_space_io, 0);
+    s->parent_obj.bus = b;
+    qdev_set_parent_bus(DEVICE(&s->mch), BUS(b));
+    qdev_init_nofail(DEVICE(&s->mch));
 
     return 0;
 }
 
 static int i440fx_initfn(PCIDevice *dev)
 {
-    PCII440FXState *d = DO_UPCAST(PCII440FXState, dev, dev);
+    int i;
+    PCII440FXState *f = DO_UPCAST(PCII440FXState, dev, dev);
+    hwaddr pci_hole64_size;
 
-    d->dev.config[I440FX_SMRAM] = 0x02;
+    f->dev.config[I440FX_SMRAM] = 0x02;
 
-    cpu_smm_register(&i440fx_set_smm, d);
-    return 0;
-}
+    cpu_smm_register(&i440fx_set_smm, f);
 
-static PCIBus *i440fx_common_init(const char *device_name,
-                                  PCII440FXState **pi440fx_state,
-                                  int *piix3_devfn,
-                                  ISABus **isa_bus, qemu_irq *pic,
-                                  MemoryRegion *address_space_mem,
-                                  MemoryRegion *address_space_io,
-                                  ram_addr_t ram_size,
-                                  hwaddr pci_hole_start,
-                                  hwaddr pci_hole_size,
-                                  hwaddr pci_hole64_start,
-                                  hwaddr pci_hole64_size,
-                                  MemoryRegion *pci_address_space,
-                                  MemoryRegion *ram_memory)
-{
-    DeviceState *dev;
-    PCIBus *b;
-    PCIDevice *d;
-    PCIHostState *s;
-    PIIX3State *piix3;
-    PCII440FXState *f;
-    unsigned i;
-
-    dev = qdev_create(NULL, "i440FX-pcihost");
-    s = PCI_HOST_BRIDGE(dev);
-    s->address_space = address_space_mem;
-    b = pci_bus_new(dev, NULL, pci_address_space,
-                    address_space_io, 0);
-    s->bus = b;
-    object_property_add_child(qdev_get_machine(), "i440fx", OBJECT(dev), NULL);
-    qdev_init_nofail(dev);
-
-    d = pci_create_simple(b, 0, device_name);
-    *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
-    f = *pi440fx_state;
-    f->system_memory = address_space_mem;
-    f->pci_address_space = pci_address_space;
-    f->ram_memory = ram_memory;
+    pci_hole64_size = (sizeof(hwaddr) == 4 ? 0 :
+                       ((uint64_t)1 << 62));
     memory_region_init_alias(&f->pci_hole, "pci-hole", f->pci_address_space,
-                             pci_hole_start, pci_hole_size);
-    memory_region_add_subregion(f->system_memory, pci_hole_start, &f->pci_hole);
+                             f->below_4g_mem_size,
+                             0x100000000LL - f->below_4g_mem_size);
+    memory_region_add_subregion(f->system_memory, f->below_4g_mem_size,
+            &f->pci_hole);
     memory_region_init_alias(&f->pci_hole_64bit, "pci-hole64",
                              f->pci_address_space,
-                             pci_hole64_start, pci_hole64_size);
+                             0x100000000LL + f->above_4g_mem_size,
+                             pci_hole64_size);
     if (pci_hole64_size) {
-        memory_region_add_subregion(f->system_memory, pci_hole64_start,
+        memory_region_add_subregion(f->system_memory,
+                                    0x100000000LL + f->above_4g_mem_size,
                                     &f->pci_hole_64bit);
     }
+
     memory_region_init_alias(&f->smram_region, "smram-region",
                              f->pci_address_space, 0xa0000, 0x20000);
     memory_region_add_subregion_overlap(f->system_memory, 0xa0000,
                                         &f->smram_region, 1);
     memory_region_set_enabled(&f->smram_region, false);
+
     init_pam(f->ram_memory, f->system_memory, f->pci_address_space,
              &f->pam_regions[0], PAM_BIOS_BASE, PAM_BIOS_SIZE);
     for (i = 0; i < 12; ++i) {
         init_pam(f->ram_memory, f->system_memory, f->pci_address_space,
-                 &f->pam_regions[i+1], PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE,
+                &f->pam_regions[i+1], PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE,
                  PAM_EXPAN_SIZE);
     }
-
-    /* Xen supports additional interrupt routes from the PCI devices to
-     * the IOAPIC: the four pins of each PCI device on the bus are also
-     * connected to the IOAPIC directly.
-     * These additional routes can be discovered through ACPI. */
-    if (xen_enabled()) {
-        piix3 = DO_UPCAST(PIIX3State, dev,
-                pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
-        pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
-                piix3, XEN_PIIX_NUM_PIRQS);
-    } else {
-        piix3 = DO_UPCAST(PIIX3State, dev,
-                pci_create_simple_multifunction(b, -1, true, "PIIX3"));
-        pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3,
-                PIIX_NUM_PIRQS);
-        pci_bus_set_route_irq_fn(b, piix3_route_intx_pin_to_irq);
-    }
-    piix3->pic = pic;
-    *isa_bus = DO_UPCAST(ISABus, qbus,
-                         qdev_get_child_bus(&piix3->dev.qdev, "isa.0"));
-
-    *piix3_devfn = piix3->dev.devfn;
-
-    ram_size = ram_size / 8 / 1024 / 1024;
-    if (ram_size > 255)
-        ram_size = 255;
-    (*pi440fx_state)->dev.config[0x57]=ram_size;
-
+    f->dev.config[0x57] = f->below_4g_mem_size;
     i440fx_update_memory_mappings(f);
 
-    return b;
-}
-
-PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn,
-                    ISABus **isa_bus, qemu_irq *pic,
-                    MemoryRegion *address_space_mem,
-                    MemoryRegion *address_space_io,
-                    ram_addr_t ram_size,
-                    hwaddr pci_hole_start,
-                    hwaddr pci_hole_size,
-                    hwaddr pci_hole64_start,
-                    hwaddr pci_hole64_size,
-                    MemoryRegion *pci_memory, MemoryRegion *ram_memory)
-
-{
-    PCIBus *b;
-
-    b = i440fx_common_init("i440FX", pi440fx_state, piix3_devfn, isa_bus, pic,
-                           address_space_mem, address_space_io, ram_size,
-                           pci_hole_start, pci_hole_size,
-                           pci_hole64_start, pci_hole64_size,
-                           pci_memory, ram_memory);
-    return b;
+    return 0;
 }
 
 /* PIIX3 PCI to ISA bridge */
-static void piix3_set_irq_pic(PIIX3State *piix3, int pic_irq)
+void piix3_set_irq_pic(PIIX3State *piix3, int pic_irq)
 {
     qemu_set_irq(piix3->pic[pic_irq],
                  !!(piix3->pic_levels &
@@ -347,13 +229,13 @@ static void piix3_set_irq_level(PIIX3State *piix3, int pirq, int level)
     piix3_set_irq_pic(piix3, pic_irq);
 }
 
-static void piix3_set_irq(void *opaque, int pirq, int level)
+void piix3_set_irq(void *opaque, int pirq, int level)
 {
     PIIX3State *piix3 = opaque;
     piix3_set_irq_level(piix3, pirq, level);
 }
 
-static PCIINTxRoute piix3_route_intx_pin_to_irq(void *opaque, int pin)
+PCIINTxRoute piix3_route_intx_pin_to_irq(void *opaque, int pin)
 {
     PIIX3State *piix3 = opaque;
     int irq = piix3->dev.config[PIIX_PIRQC + pin];
@@ -550,7 +432,7 @@ static void i440fx_class_init(ObjectClass *klass, void *data)
 }
 
 static const TypeInfo i440fx_info = {
-    .name          = "i440FX",
+    .name          = TYPE_I440FX_PCI_DEVICE,
     .parent        = TYPE_PCI_DEVICE,
     .instance_size = sizeof(PCII440FXState),
     .class_init    = i440fx_class_init,
@@ -561,15 +443,16 @@ static void i440fx_pcihost_class_init(ObjectClass *klass, void *data)
     DeviceClass *dc = DEVICE_CLASS(klass);
     SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
 
-    k->init = i440fx_pcihost_initfn;
+    k->init = i440fx_pcihost_init;
     dc->fw_name = "pci";
     dc->no_user = 1;
 }
 
 static const TypeInfo i440fx_pcihost_info = {
-    .name          = "i440FX-pcihost",
+    .name          = TYPE_I440FX_HOST_DEVICE,
     .parent        = TYPE_PCI_HOST_BRIDGE,
     .instance_size = sizeof(I440FXState),
+    .instance_init = i440fx_pcihost_initfn,
     .class_init    = i440fx_pcihost_class_init,
 };
 
diff --git a/hw/piix_pci.h b/hw/piix_pci.h
new file mode 100644
index 0000000..26b3fa0
--- /dev/null
+++ b/hw/piix_pci.h
@@ -0,0 +1,116 @@
+/*
+ * piix_pci.h
+ *
+ * Copyright (c) 2009 Isaku Yamahata <yamahata at valinux co jp>
+ *                    VA Linux Systems Japan K.K.
+ * Copyright (C) 2012 Jason Baron <jbaron@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#ifndef HW_I440FX_H
+#define HW_I440FX_H
+
+#include "hw.h"
+#include "range.h"
+#include "isa.h"
+#include "sysbus.h"
+#include "pc.h"
+#include "apm.h"
+#include "apic.h"
+#include "pci.h"
+#include "pcie_host.h"
+#include "acpi.h"
+#include "pam.h"
+#include "dimm.h"
+
+/*
+ * I440FX chipset data sheet.
+ * http://download.intel.com/design/chipsets/datashts/29054901.pdf
+ */
+#define I440FX_PCI_HOLE_START 0xe0000000
+
+#define TYPE_I440FX_HOST_DEVICE "i440FX-pcihost"
+#define I440FX_HOST_DEVICE(obj) \
+     OBJECT_CHECK(I440FXState, (obj), TYPE_I440FX_HOST_DEVICE)
+
+#define TYPE_I440FX_PCI_DEVICE "i440FX"
+#define I440FX_PCI_DEVICE(obj) \
+     OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)
+
+typedef struct PCII440FXState {
+    PCIDevice dev;
+    MemoryRegion *system_memory;
+    MemoryRegion *pci_address_space;
+    MemoryRegion *ram_memory;
+    MemoryRegion *address_space_io;
+    PAMMemoryRegion pam_regions[13];
+    MemoryRegion pci_hole;
+    MemoryRegion pci_hole_64bit;
+    MemoryRegion smram_region;
+    uint8_t smm_enabled;
+    ram_addr_t below_4g_mem_size;
+    ram_addr_t above_4g_mem_size;
+    /* i440fx allows for 1 DRAM channels x 8 DRAM ranks */
+    DimmBus *dram_channel0;
+    /* paravirtual memory bus */
+    DimmBus *pv_dram_channel;
+    void *fw_cfg;
+} PCII440FXState;
+
+typedef struct I440FXState {
+    PCIHostState parent_obj;
+    PCII440FXState mch;
+} I440FXState;
+
+#define PIIX_NUM_PIC_IRQS       16      /* i8259 * 2 */
+#define PIIX_NUM_PIRQS          4ULL    /* PIRQ[A-D] */
+#define XEN_PIIX_NUM_PIRQS      128ULL
+#define PIIX_PIRQC              0x60
+
+typedef struct PIIX3State {
+    PCIDevice dev;
+
+    /*
+     * bitmap to track pic levels.
+     * The pic level is the logical OR of all the PCI irqs mapped to it
+     * So one PIC level is tracked by PIIX_NUM_PIRQS bits.
+     *
+     * PIRQ is mapped to PIC pins, we track it by
+     * PIIX_NUM_PIRQS * PIIX_NUM_PIC_IRQS = 64 bits with
+     * pic_irq * PIIX_NUM_PIRQS + pirq
+     */
+#if PIIX_NUM_PIC_IRQS * PIIX_NUM_PIRQS > 64
+#error "unable to encode pic state in 64bit in pic_levels."
+#endif
+    uint64_t pic_levels;
+
+    qemu_irq *pic;
+
+    /* This member isn't used. Just for save/load compatibility */
+    int32_t pci_irq_levels_vmstate[PIIX_NUM_PIRQS];
+} PIIX3State;
+
+
+int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx);
+void piix3_set_irq_pic(PIIX3State *piix3, int pic_irq);
+void piix3_set_irq(void *opaque, int pirq, int level);
+PCIINTxRoute piix3_route_intx_pin_to_irq(void *opaque, int pci_intx);
+hwaddr i440fx_pmc_dimm_offset(DeviceState *dev, uint64_t size);
+
+#define I440FX_PAM      0x59
+#define I440FX_PAM_SIZE 7
+#define I440FX_SMRAM    0x72
+
+#endif /* HW_I440FX_H */
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 13/30] piix_pci and pc_piix: refactor
  2013-01-16  9:36     ` Vasilis Liaskovitis
@ 2013-01-16 11:17       ` Andreas Färber
  2013-01-16 17:10         ` Vasilis Liaskovitis
  0 siblings, 1 reply; 72+ messages in thread
From: Andreas Färber @ 2013-01-16 11:17 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, Hu Tao, jbaron, seabios, qemu-devel, blauwirbel,
	kevin, kraxel, anthony, stefanha

Hi,

Am 16.01.2013 10:36, schrieb Vasilis Liaskovitis:
> On Wed, Jan 16, 2013 at 03:20:40PM +0800, Hu Tao wrote:
>> On Tue, Dec 18, 2012 at 01:41:41PM +0100, Vasilis Liaskovitis wrote:
>>> Refactor code so that chipset initialization is similar to q35. This will
>>> allow memory map initialization at chipset qdev init time for both
>>> machines, as well as more similar code structure overall.
>>>
>>> Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
>>> ---
>>>  hw/pc_piix.c  |   57 ++++++++++++---
>>>  hw/piix_pci.c |  225 ++++++++++++++-------------------------------------------
>>>  2 files changed, 100 insertions(+), 182 deletions(-)
>>>
>>> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
>>> index 19e342a..6a9b508 100644
>>> --- a/hw/pc_piix.c
>>> +++ b/hw/pc_piix.c
>>> @@ -47,6 +47,7 @@
>>>  #ifdef CONFIG_XEN
>>>  #  include <xen/hvm/hvm_info_table.h>
>>>  #endif
>>> +#include "piix_pci.h"
>>
>> Can't find this file. Did you forget to add this file to git?
> 
> sorry, you are right. Below is the corrected patch with the missing header

Please take review comments on other similar series into account. You
can also check if the QOM Vadis slides from KVM Forum are online somewhere.

You are aware that there were two people previously working on
QOM'ifying i440fx?

> Refactor code so that chipset initialization is similar to q35. This will
> allow memory map initialization at chipset qdev init time for both
> machines, as well as more similar code structure overall.
> 
> Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
> ---
>  hw/pc_piix.c  |   57 ++++++++++++---
>  hw/piix_pci.c |  225 ++++++++++++++-------------------------------------------
>  hw/piix_pci.h |  116 +++++++++++++++++++++++++++++
>  3 files changed, 216 insertions(+), 182 deletions(-)
>  create mode 100644 hw/piix_pci.h
> 
> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> index 19e342a..6a9b508 100644
> --- a/hw/pc_piix.c
> +++ b/hw/pc_piix.c
[...]
> @@ -127,21 +130,53 @@ static void pc_init1(MemoryRegion *system_memory,
>      }
>  
>      if (pci_enabled) {
> -        pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, &isa_bus, gsi,
> -                              system_memory, system_io, ram_size,
> -                              below_4g_mem_size,
> -                              0x100000000ULL - below_4g_mem_size,
> -                              0x100000000ULL + above_4g_mem_size,
> -                              (sizeof(hwaddr) == 4
> -                               ? 0
> -                               : ((uint64_t)1 << 62)),
> -                              pci_memory, ram_memory);
> +        i440fx_host = I440FX_HOST_DEVICE(qdev_create(NULL,
> +                    TYPE_I440FX_HOST_DEVICE));

Elsewhere it was requested to use _HOST_BRIDGE wording.

> +        i440fx_host->mch.ram_memory = ram_memory;
> +        i440fx_host->mch.pci_address_space = pci_memory;
> +        i440fx_host->mch.system_memory = get_system_memory();
> +        i440fx_host->mch.address_space_io = get_system_io();;
> +        i440fx_host->mch.below_4g_mem_size = below_4g_mem_size;
> +        i440fx_host->mch.above_4g_mem_size = above_4g_mem_size;
> +
> +        qdev_init_nofail(DEVICE(i440fx_host));
> +        i440fx_state = &i440fx_host->mch;
> +        pci_bus = i440fx_host->parent_obj.bus;

Please don't access the parent field, in particular not "parent_obj". It
was specifically renamed after checking that no more users exist.

PCIHostState *phb = PCI_HOST_BRIDGE(i440fx_host);
...
pci_bus = phb->bus;

> +        /* Xen supports additional interrupt routes from the PCI devices to
> +         * the IOAPIC: the four pins of each PCI device on the bus are also
> +         * connected to the IOAPIC directly.
> +         * These additional routes can be discovered through ACPI. */
> +        if (xen_enabled()) {
> +            piix3 = DO_UPCAST(PIIX3State, dev,
> +                    pci_create_simple_multifunction(pci_bus, -1, true,
> +                        "PIIX3-xen"));

Please don't introduce new usages of DO_UPCAST() with QOM types. Instead
add QOM cast macros where needed and use them.

> +            pci_bus_irqs(pci_bus, xen_piix3_set_irq, xen_pci_slot_get_pirq,
> +                    piix3, XEN_PIIX_NUM_PIRQS);
> +        } else {
> +            piix3 = DO_UPCAST(PIIX3State, dev,
> +                    pci_create_simple_multifunction(pci_bus, -1, true,
> +                        "PIIX3"));
> +            pci_bus_irqs(pci_bus, piix3_set_irq, pci_slot_get_pirq, piix3,
> +                    PIIX_NUM_PIRQS);
> +            pci_bus_set_route_irq_fn(pci_bus, piix3_route_intx_pin_to_irq);
> +        }
> +        piix3->pic = gsi;
> +        isa_bus = DO_UPCAST(ISABus, qbus,
> +                qdev_get_child_bus(&piix3->dev.qdev, "isa.0"));

isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(piix3), ...));

> +
> +        piix3_devfn = piix3->dev.devfn;
> +
> +        ram_size = ram_size / 8 / 1024 / 1024;
> +        if (ram_size > 255) {
> +            ram_size = 255;
> +        }
> +        i440fx_state->dev.config[0x57] = ram_size;
>      } else {
>          pci_bus = NULL;
> -        i440fx_state = NULL;
>          isa_bus = isa_bus_new(NULL, system_io);
>          no_hpet = 1;
>      }
> +
>      isa_bus_irqs(isa_bus, gsi);
>  
>      if (kvm_irqchip_in_kernel()) {
> @@ -157,7 +192,7 @@ static void pc_init1(MemoryRegion *system_memory,
>          gsi_state->i8259_irq[i] = i8259[i];
>      }
>      if (pci_enabled) {
> -        ioapic_init_gsi(gsi_state, "i440fx");
> +        ioapic_init_gsi(gsi_state, NULL);

Unrelated? Why?

>      }
>  
>      pc_register_ferr_irq(gsi[13]);
> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> index ba1b3de..7ca3c73 100644
> --- a/hw/piix_pci.c
> +++ b/hw/piix_pci.c
> @@ -31,70 +31,15 @@
>  #include "range.h"
>  #include "xen.h"
>  #include "pam.h"
> +#include "piix_pci.h"
>  
> -/*
> - * I440FX chipset data sheet.
> - * http://download.intel.com/design/chipsets/datashts/29054901.pdf
> - */
> -
> -typedef struct I440FXState {
> -    PCIHostState parent_obj;
> -} I440FXState;
> -
> -#define PIIX_NUM_PIC_IRQS       16      /* i8259 * 2 */
> -#define PIIX_NUM_PIRQS          4ULL    /* PIRQ[A-D] */
> -#define XEN_PIIX_NUM_PIRQS      128ULL
> -#define PIIX_PIRQC              0x60
> -
> -typedef struct PIIX3State {
> -    PCIDevice dev;
> -
> -    /*
> -     * bitmap to track pic levels.
> -     * The pic level is the logical OR of all the PCI irqs mapped to it
> -     * So one PIC level is tracked by PIIX_NUM_PIRQS bits.
> -     *
> -     * PIRQ is mapped to PIC pins, we track it by
> -     * PIIX_NUM_PIRQS * PIIX_NUM_PIC_IRQS = 64 bits with
> -     * pic_irq * PIIX_NUM_PIRQS + pirq
> -     */
> -#if PIIX_NUM_PIC_IRQS * PIIX_NUM_PIRQS > 64
> -#error "unable to encode pic state in 64bit in pic_levels."
> -#endif
> -    uint64_t pic_levels;
> -
> -    qemu_irq *pic;
> -
> -    /* This member isn't used. Just for save/load compatibility */
> -    int32_t pci_irq_levels_vmstate[PIIX_NUM_PIRQS];
> -} PIIX3State;
> -
> -struct PCII440FXState {
> -    PCIDevice dev;
> -    MemoryRegion *system_memory;
> -    MemoryRegion *pci_address_space;
> -    MemoryRegion *ram_memory;
> -    MemoryRegion pci_hole;
> -    MemoryRegion pci_hole_64bit;
> -    PAMMemoryRegion pam_regions[13];
> -    MemoryRegion smram_region;
> -    uint8_t smm_enabled;
> -};
> -
> -
> -#define I440FX_PAM      0x59
> -#define I440FX_PAM_SIZE 7
> -#define I440FX_SMRAM    0x72
> -
> -static void piix3_set_irq(void *opaque, int pirq, int level);
> -static PCIINTxRoute piix3_route_intx_pin_to_irq(void *opaque, int pci_intx);
>  static void piix3_write_config_xen(PCIDevice *dev,
>                                 uint32_t address, uint32_t val, int len);
>  
>  /* return the global irq number corresponding to a given device irq
>     pin. We could also use the bus number to have a more precise
>     mapping. */
> -static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
> +int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
>  {
>      int slot_addend;
>      slot_addend = (pci_dev->devfn >> 3) - 1;
> @@ -180,149 +125,86 @@ static const VMStateDescription vmstate_i440fx = {
>      }
>  };
>  
> -static int i440fx_pcihost_initfn(SysBusDevice *dev)
> +static void i440fx_pcihost_initfn(Object *obj)
>  {
> -    PCIHostState *s = PCI_HOST_BRIDGE(dev);
> +    I440FXState *s = I440FX_HOST_DEVICE(obj);
> +    object_initialize(&s->mch, TYPE_I440FX_PCI_DEVICE);
> +    object_property_add_child(OBJECT(s), "mch", OBJECT(&s->mch), NULL);

Is there maybe a more readable property name?

> +}
>  
> -    memory_region_init_io(&s->conf_mem, &pci_host_conf_le_ops, s,
> -                          "pci-conf-idx", 4);
> -    sysbus_add_io(dev, 0xcf8, &s->conf_mem);
> -    sysbus_init_ioports(&s->busdev, 0xcf8, 4);
> +static int i440fx_pcihost_init(SysBusDevice *dev)
> +{
> +    PCIHostState *pci = FROM_SYSBUS(PCIHostState, dev);

Don't use FROM_SYSBUS() either:

PCIHostState *phb = PCI_HOST_BRIDGE(dev);

> +    I440FXState *s = I440FX_HOST_DEVICE(&dev->qdev);

No need to access ->qdev, just use I440FX_...(dev);

> +    PCIBus *b;
> +
> +    memory_region_init_io(&pci->conf_mem, &pci_host_conf_le_ops, pci,
> +                           "pci-conf-idx", 4);
> +    sysbus_add_io(dev, 0xcf8, &pci->conf_mem);
> +    sysbus_init_ioports(&pci->busdev, 0xcf8, 4);
> +    memory_region_init_io(&pci->data_mem, &pci_host_data_le_ops, pci,
> +                           "pci-conf-data", 4);
>  
> -    memory_region_init_io(&s->data_mem, &pci_host_data_le_ops, s,
> -                          "pci-conf-data", 4);
> -    sysbus_add_io(dev, 0xcfc, &s->data_mem);
> -    sysbus_init_ioports(&s->busdev, 0xcfc, 4);
> +    sysbus_add_io(dev, 0xcfc, &pci->data_mem);
> +    sysbus_init_ioports(&pci->busdev, 0xcfc, 4);
> +
> +    b = pci_bus_new(&s->parent_obj.busdev.qdev, NULL, s->mch.pci_address_space,

DEVICE(dev)

> +                    s->mch.address_space_io, 0);

Initializing the bus in-place would be preferred.

> +    s->parent_obj.bus = b;
> +    qdev_set_parent_bus(DEVICE(&s->mch), BUS(b));
> +    qdev_init_nofail(DEVICE(&s->mch));

When casts other than OBJECT() are used multiple times, a variable is
preferred.

>  
>      return 0;
>  }
>  
>  static int i440fx_initfn(PCIDevice *dev)
>  {
> -    PCII440FXState *d = DO_UPCAST(PCII440FXState, dev, dev);
> +    int i;
> +    PCII440FXState *f = DO_UPCAST(PCII440FXState, dev, dev);
> +    hwaddr pci_hole64_size;
>  
> -    d->dev.config[I440FX_SMRAM] = 0x02;
> +    f->dev.config[I440FX_SMRAM] = 0x02;
>  
> -    cpu_smm_register(&i440fx_set_smm, d);
> -    return 0;
> -}
> +    cpu_smm_register(&i440fx_set_smm, f);

Is all this d -> f variable renaming really necessary? I can understand
the s -> pci (or for less ambiguity: phb) renaming above (I believe I
left it s to keep my patch small ;)), but here no new variable is
introduced so it just seems to enlarge the patch.

[...]
> @@ -550,7 +432,7 @@ static void i440fx_class_init(ObjectClass *klass, void *data)
>  }
>  
>  static const TypeInfo i440fx_info = {
> -    .name          = "i440FX",
> +    .name          = TYPE_I440FX_PCI_DEVICE,
>      .parent        = TYPE_PCI_DEVICE,

This matches the _PCI_DEVICE naming in earlier series including prep_pci

>      .instance_size = sizeof(PCII440FXState),
>      .class_init    = i440fx_class_init,
> @@ -561,15 +443,16 @@ static void i440fx_pcihost_class_init(ObjectClass *klass, void *data)
>      DeviceClass *dc = DEVICE_CLASS(klass);
>      SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
>  
> -    k->init = i440fx_pcihost_initfn;
> +    k->init = i440fx_pcihost_init;
>      dc->fw_name = "pci";
>      dc->no_user = 1;
>  }
>  
>  static const TypeInfo i440fx_pcihost_info = {
> -    .name          = "i440FX-pcihost",
> +    .name          = TYPE_I440FX_HOST_DEVICE,
>      .parent        = TYPE_PCI_HOST_BRIDGE,

whereas here you see the mentioned _HOST_DEVICE vs. _HOST_BRIDGE.

>      .instance_size = sizeof(I440FXState),
> +    .instance_init = i440fx_pcihost_initfn,
>      .class_init    = i440fx_pcihost_class_init,
>  };
>  
[snip]

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 13/30] piix_pci and pc_piix: refactor
  2013-01-16 11:17       ` Andreas Färber
@ 2013-01-16 17:10         ` Vasilis Liaskovitis
  0 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2013-01-16 17:10 UTC (permalink / raw)
  To: Andreas Färber
  Cc: pingfank, gleb, Hu Tao, seabios, qemu-devel, blauwirbel, kevin,
	kraxel, anthony, stefanha

Hi,

On Wed, Jan 16, 2013 at 12:17:05PM +0100, Andreas Färber wrote:
> Hi,
> 
> Am 16.01.2013 10:36, schrieb Vasilis Liaskovitis:
> > On Wed, Jan 16, 2013 at 03:20:40PM +0800, Hu Tao wrote:
> >> On Tue, Dec 18, 2012 at 01:41:41PM +0100, Vasilis Liaskovitis wrote:
> >>> Refactor code so that chipset initialization is similar to q35. This will
> >>> allow memory map initialization at chipset qdev init time for both
> >>> machines, as well as more similar code structure overall.
> >>>
> >>> Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
> >>> ---
> >>>  hw/pc_piix.c  |   57 ++++++++++++---
> >>>  hw/piix_pci.c |  225 ++++++++++++++-------------------------------------------
> >>>  2 files changed, 100 insertions(+), 182 deletions(-)
> >>>
> >>> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> >>> index 19e342a..6a9b508 100644
> >>> --- a/hw/pc_piix.c
> >>> +++ b/hw/pc_piix.c
> >>> @@ -47,6 +47,7 @@
> >>>  #ifdef CONFIG_XEN
> >>>  #  include <xen/hvm/hvm_info_table.h>
> >>>  #endif
> >>> +#include "piix_pci.h"
> >>
> >> Can't find this file. Did you forget to add this file to git?
> > 
> > sorry, you are right. Below is the corrected patch with the missing header
> 
> Please take review comments on other similar series into account. You
> can also check if the QOM Vadis slides from KVM Forum are online somewhere.

thanks, I will take a look.

> 
> You are aware that there were two people previously working on
> QOM'ifying i440fx?

I am aware of Anthony's i440fx-pmc patchset from about a year ago (a few months
ago I asked if it would be respinned, but got no response iirc, so I am not sure
what the status is)

What's the second effort you mention and its status? Are prep_pci patchsets
going to address this in the future? I don't mean to step on other people's
work-in-progress, so I don't mind dropping this patch if one of the other
efforts is still active.

> > ---
> >  hw/pc_piix.c  |   57 ++++++++++++---
> >  hw/piix_pci.c |  225 ++++++++++++++-------------------------------------------
> >  hw/piix_pci.h |  116 +++++++++++++++++++++++++++++
> >  3 files changed, 216 insertions(+), 182 deletions(-)
> >  create mode 100644 hw/piix_pci.h
> > 
> > diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> > index 19e342a..6a9b508 100644
> > --- a/hw/pc_piix.c
> > +++ b/hw/pc_piix.c
> [...]
> > @@ -127,21 +130,53 @@ static void pc_init1(MemoryRegion *system_memory,
> >      }
> >  
> >      if (pci_enabled) {
> > -        pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, &isa_bus, gsi,
> > -                              system_memory, system_io, ram_size,
> > -                              below_4g_mem_size,
> > -                              0x100000000ULL - below_4g_mem_size,
> > -                              0x100000000ULL + above_4g_mem_size,
> > -                              (sizeof(hwaddr) == 4
> > -                               ? 0
> > -                               : ((uint64_t)1 << 62)),
> > -                              pci_memory, ram_memory);
> > +        i440fx_host = I440FX_HOST_DEVICE(qdev_create(NULL,
> > +                    TYPE_I440FX_HOST_DEVICE));
> 
> Elsewhere it was requested to use _HOST_BRIDGE wording.

ok

> 
> > +        i440fx_host->mch.ram_memory = ram_memory;
> > +        i440fx_host->mch.pci_address_space = pci_memory;
> > +        i440fx_host->mch.system_memory = get_system_memory();
> > +        i440fx_host->mch.address_space_io = get_system_io();;
> > +        i440fx_host->mch.below_4g_mem_size = below_4g_mem_size;
> > +        i440fx_host->mch.above_4g_mem_size = above_4g_mem_size;
> > +
> > +        qdev_init_nofail(DEVICE(i440fx_host));
> > +        i440fx_state = &i440fx_host->mch;
> > +        pci_bus = i440fx_host->parent_obj.bus;
> 
> Please don't access the parent field, in particular not "parent_obj". It
> was specifically renamed after checking that no more users exist.

ok

> 
> PCIHostState *phb = PCI_HOST_BRIDGE(i440fx_host);
> ...
> pci_bus = phb->bus;
> 
> > +        /* Xen supports additional interrupt routes from the PCI devices to
> > +         * the IOAPIC: the four pins of each PCI device on the bus are also
> > +         * connected to the IOAPIC directly.
> > +         * These additional routes can be discovered through ACPI. */
> > +        if (xen_enabled()) {
> > +            piix3 = DO_UPCAST(PIIX3State, dev,
> > +                    pci_create_simple_multifunction(pci_bus, -1, true,
> > +                        "PIIX3-xen"));
> 
> Please don't introduce new usages of DO_UPCAST() with QOM types. Instead
> add QOM cast macros where needed and use them.

Any examples of this? I assume the new macros should't use DO_UPCAST themselves.

> 
> > +            pci_bus_irqs(pci_bus, xen_piix3_set_irq, xen_pci_slot_get_pirq,
> > +                    piix3, XEN_PIIX_NUM_PIRQS);
> > +        } else {
> > +            piix3 = DO_UPCAST(PIIX3State, dev,
> > +                    pci_create_simple_multifunction(pci_bus, -1, true,
> > +                        "PIIX3"));
> > +            pci_bus_irqs(pci_bus, piix3_set_irq, pci_slot_get_pirq, piix3,
> > +                    PIIX_NUM_PIRQS);
> > +            pci_bus_set_route_irq_fn(pci_bus, piix3_route_intx_pin_to_irq);
> > +        }
> > +        piix3->pic = gsi;
> > +        isa_bus = DO_UPCAST(ISABus, qbus,
> > +                qdev_get_child_bus(&piix3->dev.qdev, "isa.0"));
> 
> isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(piix3), ...));
> 
> > +
> > +        piix3_devfn = piix3->dev.devfn;
> > +
> > +        ram_size = ram_size / 8 / 1024 / 1024;
> > +        if (ram_size > 255) {
> > +            ram_size = 255;
> > +        }
> > +        i440fx_state->dev.config[0x57] = ram_size;
> >      } else {
> >          pci_bus = NULL;
> > -        i440fx_state = NULL;
> >          isa_bus = isa_bus_new(NULL, system_io);
> >          no_hpet = 1;
> >      }
> > +
> >      isa_bus_irqs(isa_bus, gsi);
> >  
> >      if (kvm_irqchip_in_kernel()) {
> > @@ -157,7 +192,7 @@ static void pc_init1(MemoryRegion *system_memory,
> >          gsi_state->i8259_irq[i] = i8259[i];
> >      }
> >      if (pci_enabled) {
> > -        ioapic_init_gsi(gsi_state, "i440fx");
> > +        ioapic_init_gsi(gsi_state, NULL);
> 
> Unrelated? Why?

I was getting a segfault in object_propert_add_child because the path to
"i440fx" was resolved to a NULL object. This was just a quick fix - I think the
real issue is that the name of the i440fx-host device is not yet set in the new
code. I 'll try to fix.

> 
> >      }
> >  
> >      pc_register_ferr_irq(gsi[13]);
> > diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> > index ba1b3de..7ca3c73 100644
> > --- a/hw/piix_pci.c
> > +++ b/hw/piix_pci.c
> > @@ -31,70 +31,15 @@
> >  #include "range.h"
> >  #include "xen.h"
> >  #include "pam.h"
> > +#include "piix_pci.h"
> >  
> > -/*
> > - * I440FX chipset data sheet.
> > - * http://download.intel.com/design/chipsets/datashts/29054901.pdf
> > - */
> > -
> > -typedef struct I440FXState {
> > -    PCIHostState parent_obj;
> > -} I440FXState;
> > -
> > -#define PIIX_NUM_PIC_IRQS       16      /* i8259 * 2 */
> > -#define PIIX_NUM_PIRQS          4ULL    /* PIRQ[A-D] */
> > -#define XEN_PIIX_NUM_PIRQS      128ULL
> > -#define PIIX_PIRQC              0x60
> > -
> > -typedef struct PIIX3State {
> > -    PCIDevice dev;
> > -
> > -    /*
> > -     * bitmap to track pic levels.
> > -     * The pic level is the logical OR of all the PCI irqs mapped to it
> > -     * So one PIC level is tracked by PIIX_NUM_PIRQS bits.
> > -     *
> > -     * PIRQ is mapped to PIC pins, we track it by
> > -     * PIIX_NUM_PIRQS * PIIX_NUM_PIC_IRQS = 64 bits with
> > -     * pic_irq * PIIX_NUM_PIRQS + pirq
> > -     */
> > -#if PIIX_NUM_PIC_IRQS * PIIX_NUM_PIRQS > 64
> > -#error "unable to encode pic state in 64bit in pic_levels."
> > -#endif
> > -    uint64_t pic_levels;
> > -
> > -    qemu_irq *pic;
> > -
> > -    /* This member isn't used. Just for save/load compatibility */
> > -    int32_t pci_irq_levels_vmstate[PIIX_NUM_PIRQS];
> > -} PIIX3State;
> > -
> > -struct PCII440FXState {
> > -    PCIDevice dev;
> > -    MemoryRegion *system_memory;
> > -    MemoryRegion *pci_address_space;
> > -    MemoryRegion *ram_memory;
> > -    MemoryRegion pci_hole;
> > -    MemoryRegion pci_hole_64bit;
> > -    PAMMemoryRegion pam_regions[13];
> > -    MemoryRegion smram_region;
> > -    uint8_t smm_enabled;
> > -};
> > -
> > -
> > -#define I440FX_PAM      0x59
> > -#define I440FX_PAM_SIZE 7
> > -#define I440FX_SMRAM    0x72
> > -
> > -static void piix3_set_irq(void *opaque, int pirq, int level);
> > -static PCIINTxRoute piix3_route_intx_pin_to_irq(void *opaque, int pci_intx);
> >  static void piix3_write_config_xen(PCIDevice *dev,
> >                                 uint32_t address, uint32_t val, int len);
> >  
> >  /* return the global irq number corresponding to a given device irq
> >     pin. We could also use the bus number to have a more precise
> >     mapping. */
> > -static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
> > +int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
> >  {
> >      int slot_addend;
> >      slot_addend = (pci_dev->devfn >> 3) - 1;
> > @@ -180,149 +125,86 @@ static const VMStateDescription vmstate_i440fx = {
> >      }
> >  };
> >  
> > -static int i440fx_pcihost_initfn(SysBusDevice *dev)
> > +static void i440fx_pcihost_initfn(Object *obj)
> >  {
> > -    PCIHostState *s = PCI_HOST_BRIDGE(dev);
> > +    I440FXState *s = I440FX_HOST_DEVICE(obj);
> > +    object_initialize(&s->mch, TYPE_I440FX_PCI_DEVICE);
> > +    object_property_add_child(OBJECT(s), "mch", OBJECT(&s->mch), NULL);
> 
> Is there maybe a more readable property name?

"mem-controller" maybe? although q35 also uses an "mch" property for its
memory controller.

> 
> > +}
> >  
> > -    memory_region_init_io(&s->conf_mem, &pci_host_conf_le_ops, s,
> > -                          "pci-conf-idx", 4);
> > -    sysbus_add_io(dev, 0xcf8, &s->conf_mem);
> > -    sysbus_init_ioports(&s->busdev, 0xcf8, 4);
> > +static int i440fx_pcihost_init(SysBusDevice *dev)
> > +{
> > +    PCIHostState *pci = FROM_SYSBUS(PCIHostState, dev);
> 
> Don't use FROM_SYSBUS() either:

Is a new macro required here as well?

> 
> PCIHostState *phb = PCI_HOST_BRIDGE(dev);
> 
> > +    I440FXState *s = I440FX_HOST_DEVICE(&dev->qdev);
> 
> No need to access ->qdev, just use I440FX_...(dev);

ok.

> 
> > +    PCIBus *b;
> > +
> > +    memory_region_init_io(&pci->conf_mem, &pci_host_conf_le_ops, pci,
> > +                           "pci-conf-idx", 4);
> > +    sysbus_add_io(dev, 0xcf8, &pci->conf_mem);
> > +    sysbus_init_ioports(&pci->busdev, 0xcf8, 4);
> > +    memory_region_init_io(&pci->data_mem, &pci_host_data_le_ops, pci,
> > +                           "pci-conf-data", 4);
> >  
> > -    memory_region_init_io(&s->data_mem, &pci_host_data_le_ops, s,
> > -                          "pci-conf-data", 4);
> > -    sysbus_add_io(dev, 0xcfc, &s->data_mem);
> > -    sysbus_init_ioports(&s->busdev, 0xcfc, 4);
> > +    sysbus_add_io(dev, 0xcfc, &pci->data_mem);
> > +    sysbus_init_ioports(&pci->busdev, 0xcfc, 4);
> > +
> > +    b = pci_bus_new(&s->parent_obj.busdev.qdev, NULL, s->mch.pci_address_space,
> 
> DEVICE(dev)
> 
> > +                    s->mch.address_space_io, 0);
> 
> Initializing the bus in-place would be preferred.

Do you mean pci_bus_new_inplace? Why is this preferred?

> 
> > +    s->parent_obj.bus = b;
> > +    qdev_set_parent_bus(DEVICE(&s->mch), BUS(b));
> > +    qdev_init_nofail(DEVICE(&s->mch));
> 
> When casts other than OBJECT() are used multiple times, a variable is
> preferred.

ok

> 
> >  
> >      return 0;
> >  }
> >  
> >  static int i440fx_initfn(PCIDevice *dev)
> >  {
> > -    PCII440FXState *d = DO_UPCAST(PCII440FXState, dev, dev);
> > +    int i;
> > +    PCII440FXState *f = DO_UPCAST(PCII440FXState, dev, dev);
> > +    hwaddr pci_hole64_size;
> >  
> > -    d->dev.config[I440FX_SMRAM] = 0x02;
> > +    f->dev.config[I440FX_SMRAM] = 0x02;
> >  
> > -    cpu_smm_register(&i440fx_set_smm, d);
> > -    return 0;
> > -}
> > +    cpu_smm_register(&i440fx_set_smm, f);
> 
> Is all this d -> f variable renaming really necessary? I can understand
> the s -> pci (or for less ambiguity: phb) renaming above (I believe I
> left it s to keep my patch small ;)), but here no new variable is
> introduced so it just seems to enlarge the patch.

yes, this renaming here isn't needed, I 'll omit it.

> 
> [...]
> > @@ -550,7 +432,7 @@ static void i440fx_class_init(ObjectClass *klass, void *data)
> >  }
> >  
> >  static const TypeInfo i440fx_info = {
> > -    .name          = "i440FX",
> > +    .name          = TYPE_I440FX_PCI_DEVICE,
> >      .parent        = TYPE_PCI_DEVICE,
> 
> This matches the _PCI_DEVICE naming in earlier series including prep_pci
> 
> >      .instance_size = sizeof(PCII440FXState),
> >      .class_init    = i440fx_class_init,
> > @@ -561,15 +443,16 @@ static void i440fx_pcihost_class_init(ObjectClass *klass, void *data)
> >      DeviceClass *dc = DEVICE_CLASS(klass);
> >      SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
> >  
> > -    k->init = i440fx_pcihost_initfn;
> > +    k->init = i440fx_pcihost_init;
> >      dc->fw_name = "pci";
> >      dc->no_user = 1;
> >  }
> >  
> >  static const TypeInfo i440fx_pcihost_info = {
> > -    .name          = "i440FX-pcihost",
> > +    .name          = TYPE_I440FX_HOST_DEVICE,
> >      .parent        = TYPE_PCI_HOST_BRIDGE,
> 
> whereas here you see the mentioned _HOST_DEVICE vs. _HOST_BRIDGE.
> 
so, TYPE_I440FX_HOST_BRIDGE for the name is preferred?

thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-01-10 18:57     ` Vasilis Liaskovitis
@ 2013-03-19  6:30       ` li guang
  2013-03-26 16:58         ` Vasilis Liaskovitis
  2013-04-02  9:15       ` liu ping fan
  1 sibling, 1 reply; 72+ messages in thread
From: li guang @ 2013-03-19  6:30 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: gleb, stefanha, jbaron, seabios, qemu-devel, blauwirbel, kevin,
	Gerd Hoffmann, anthony

在 2013-01-10四的 19:57 +0100,Vasilis Liaskovitis写道:
> > > 
> > > IIRC q35 supports memory hotplug natively (picked up in some
> > > discussion).  Is that correct?
> > > 
> > From previous discussion I also understand that q35 supports native hotplug. 
> > Sections 5.1 and 5.2 of the spec describe the MCH registers but the native
> > memory hotplug specifics are not yet clear to me. Any pointers from the
> > spec are welcome.
> 
> Ping. Could anyone who's familiar with the q35 spec provide some pointers on
> native memory hotplug details in the spec? I see pcie hotplug registers but can't
> find memory hotplug interface details. If I am not mistaken, the spec is here:
> http://www.intel.com/design/chipsets/datashts/316966.htm
> 
> Is the q35 memory hotplug support supposed to be an shpc-like interface geared
> towards memory slots instead of pci slots?
> 

seems there's no so-called q35-native support

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
       [not found]   ` <20130228101819.GA4370@dhcp-192-168-178-175.profitbricks.localdomain>
@ 2013-03-19  7:28     ` li guang
  2013-03-26 16:43       ` Vasilis Liaskovitis
  0 siblings, 1 reply; 72+ messages in thread
From: li guang @ 2013-03-19  7:28 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, Erlon Cruz, kraxel, anthony

在 2013-02-28四的 11:18 +0100,Vasilis Liaskovitis写道:
> Hi,
> 
> sorry for the delay.
> On Tue, Feb 19, 2013 at 07:39:40PM -0300, Erlon Cruz wrote:
> > On Tue, Dec 18, 2012 at 10:41 AM, Vasilis Liaskovitis <
> > vasilis.liaskovitis@profitbricks.com> wrote:
> > 
> > > This is v4 of the ACPI memory hotplug functionality. Only x86_64 target is
> > > supported (both i440fx and q35). There are still several issues, but it's
> > > been a while since v3 and I wanted to get some more feedback on the current
> > > state of the patchseries.
> > >
> > >
> > We are working in memory hotplug functionality on pSeries machine. I'm
> > wondering whether and how we can better integrate things. Do you think the
> > DIMM abstraction is generic enough to be used in other machine types?
> 
> I think the DimmDevice is generic enough but I am open to other suggestions. 
> 
> A related issue is that the patchseries uses a DimmBus to hot-add and hot-remove
> DimmDevice. Another approach that has been suggested is to use links<> between
> DimmDevices and the dram controller device (piix4 or mch for pc and q35-pc
> machines respectively). This would be more similar to the CPUState/qom
> patches - see Andreas Färber's earlier reply to this thread.
> 
> I think we should get some consensus from the community/maintainers before we
> continue to integrate. 
> 
> I haven't updated the series for a while, but I can rework if there is a more
> clear direction for the community.
> 
> Another open issue is reference counting of memoryregions in qemu memory
> model. In order to make memory hot-remove operations safe, we need to remove
> a memoryregion after all users (e.g. both guest and block layer) have stopped
> using it,

it seems it mostly up to the user who want to hot-(un)plug,
if user want to un-plug a memory which is kernel's main memory, kernel
will always run on it(never stop) unless power off.
and if guest stops, all DIMMs should be safe to hot-remove,
or else we should do something to let user can unlock all reference.

>  see discussion at
> http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg03986.html. There was a
> relevant ibm patchset
> https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02697.html
> but it was not merged.
> 
> > 
> > 
> > > Overview:
> > >
> > > Dimm device layout is modeled with a normal qemu device:
> > >
> > > "-device dimm,id=name,size=sz,node=pxm,populated=on|off,bus=membus.0"
> > >
> > >
> >  How does this will handle the no-hotplugable memory for example the memory
> > passed in '-m' parameter?
> 
> The non-hotpluggable initial memory (-m) is currently not modelled at all as a
> DimmDevice. We may want to model it though.
> 
> thanks,
> - Vasilis
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 01/30] [SeaBIOS] Add ACPI_EXTRACT_DEVICE* macros
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 01/30] [SeaBIOS] Add ACPI_EXTRACT_DEVICE* macros Vasilis Liaskovitis
@ 2013-03-20  3:28   ` li guang
  0 siblings, 0 replies; 72+ messages in thread
From: li guang @ 2013-03-20  3:28 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, kraxel, anthony

seems these changes is in seebios now.

在 2012-12-18二的 13:41 +0100,Vasilis Liaskovitis写道:
> This allows to extract the beginning, end and name of a Device object.
> ---
>  tools/acpi_extract.py |   28 ++++++++++++++++++++++++++++
>  1 files changed, 28 insertions(+), 0 deletions(-)
> 
> diff --git a/tools/acpi_extract.py b/tools/acpi_extract.py
> index 3295678..3191f53 100755
> --- a/tools/acpi_extract.py
> +++ b/tools/acpi_extract.py
> @@ -217,6 +217,28 @@ def aml_package_start(offset):
>      offset += 1
>      return offset + aml_pkglen_bytes(offset) + 1
>  
> +def aml_device_start(offset):
> +    #0x5B 0x82 DeviceOp PkgLength NameString ProcID
> +    if ((aml[offset] != 0x5B) or (aml[offset + 1] != 0x82)):
> +        die( "Name offset 0x%x: expected 0x5B 0x83 actual 0x%x 0x%x" %
> +             (offset, aml[offset], aml[offset + 1]));
> +    return offset
> +
> +def aml_device_string(offset):
> +    #0x5B 0x82 DeviceOp PkgLength NameString ProcID
> +    start = aml_device_start(offset)
> +    offset += 2
> +    pkglenbytes = aml_pkglen_bytes(offset)
> +    offset += pkglenbytes
> +    return offset
> +
> +def aml_device_end(offset):
> +    start = aml_device_start(offset)
> +    offset += 2
> +    pkglenbytes = aml_pkglen_bytes(offset)
> +    pkglen = aml_pkglen(offset)
> +    return offset + pkglen
> +
>  lineno = 0
>  for line in fileinput.input():
>      # Strip trailing newline
> @@ -307,6 +329,12 @@ for i in range(len(asl)):
>          offset = aml_processor_end(offset)
>      elif (directive == "ACPI_EXTRACT_PKG_START"):
>          offset = aml_package_start(offset)
> +    elif (directive == "ACPI_EXTRACT_DEVICE_START"):
> +        offset = aml_device_start(offset)
> +    elif (directive == "ACPI_EXTRACT_DEVICE_STRING"):
> +        offset = aml_device_string(offset)
> +    elif (directive == "ACPI_EXTRACT_DEVICE_END"):
> +        offset = aml_device_end(offset)
>      else:
>          die("Unsupported directive %s" % directive)
>  

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 07/30] Add SIZE type to qdev properties
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 07/30] Add SIZE type to qdev properties Vasilis Liaskovitis
@ 2013-03-20  6:06   ` li guang
  2013-03-20 14:24     ` Eric Blake
  0 siblings, 1 reply; 72+ messages in thread
From: li guang @ 2013-03-20  6:06 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, kraxel, anthony

在 2012-12-18二的 13:41 +0100,Vasilis Liaskovitis写道:
> This patch adds a 'SIZE' type property to qdev.
> 
> It will make dimm description more convenient by allowing sizes to be specified
> with K,M,G,T prefixes instead of number of bytes e.g.:
> -device dimm,id=mem0,size=2G,bus=membus.0
> 
> Credits go to Ian Molton for original patch. See:
> http://patchwork.ozlabs.org/patch/38835/
> 
> Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
> ---
>  hw/qdev-properties.c |   60 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/qdev-properties.h |    3 ++
>  qemu-option.c        |    2 +-
>  qemu-option.h        |    2 +
>  4 files changed, 66 insertions(+), 1 deletions(-)
> 
> diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
> index 81d901c..a77f760 100644
> --- a/hw/qdev-properties.c
> +++ b/hw/qdev-properties.c
> @@ -1279,3 +1279,63 @@ void qemu_add_globals(void)
>  {
>      qemu_opts_foreach(qemu_find_opts("global"), qdev_add_one_global, NULL, 0);
>  }
> +
> +/* --- 64bit unsigned int 'size' type --- */
> +
> +static void get_size(Object *obj, Visitor *v, void *opaque,
> +                       const char *name, Error **errp)
> +{
> +    DeviceState *dev = DEVICE(obj);
> +    Property *prop = opaque;
> +    uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
> +
> +    visit_type_size(v, ptr, name, errp);
> +}
> +
> +static void set_size(Object *obj, Visitor *v, void *opaque,
> +                       const char *name, Error **errp)
> +{
> +    DeviceState *dev = DEVICE(obj);
> +    Property *prop = opaque;
> +    uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
> +
> +    if (dev->state != DEV_STATE_CREATED) {
> +        error_set(errp, QERR_PERMISSION_DENIED);
> +        return;
> +    }
> +
> +    visit_type_size(v, ptr, name, errp);
> +}
> +
> +static int parse_size(DeviceState *dev, Property *prop, const char *str)
> +{
> +    uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
> +    Error *errp = NULL;
> +
> +    if (str != NULL) {
> +        parse_option_size(prop->name, str, ptr, &errp);
> +    }
> +    assert_no_error(errp);
> +    return 0;
> +}
> +
> +static int print_size(DeviceState *dev, Property *prop, char *dest, size_t len)
> +{
> +    uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
> +    char suffixes[] = {'T', 'G', 'M', 'K', 'B'};
> +    int i = 0;
> +    uint64_t div;
> +
> +    for (div = (long int)1 << 40; !(*ptr / div) ; div >>= 10) {
> +        i++;
> +    }
> +    return snprintf(dest, len, "%0.03f%c", (double)*ptr/div, suffixes[i]);
                                    ^^^^^^     ^^^^^^^^^^^^^^^  
> +}
> +

IMHO, you may need (double)(*ptr/div), for type cast is right
associated.

> +PropertyInfo qdev_prop_size = {
> +    .name  = "size",
> +    .parse = parse_size,
> +    .print = print_size,
> +    .get = get_size,
> +    .set = set_size,
> +};
> diff --git a/hw/qdev-properties.h b/hw/qdev-properties.h
> index 5b046ab..0182bef 100644
> --- a/hw/qdev-properties.h
> +++ b/hw/qdev-properties.h
> @@ -14,6 +14,7 @@ extern PropertyInfo qdev_prop_uint64;
>  extern PropertyInfo qdev_prop_hex8;
>  extern PropertyInfo qdev_prop_hex32;
>  extern PropertyInfo qdev_prop_hex64;
> +extern PropertyInfo qdev_prop_size;
>  extern PropertyInfo qdev_prop_string;
>  extern PropertyInfo qdev_prop_chr;
>  extern PropertyInfo qdev_prop_ptr;
> @@ -67,6 +68,8 @@ extern PropertyInfo qdev_prop_pci_host_devaddr;
>      DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_hex32, uint32_t)
>  #define DEFINE_PROP_HEX64(_n, _s, _f, _d)                       \
>      DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_hex64, uint64_t)
> +#define DEFINE_PROP_SIZE(_n, _s, _f, _d)                       \
> +    DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_size, uint64_t)
>  #define DEFINE_PROP_PCI_DEVFN(_n, _s, _f, _d)                   \
>      DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_pci_devfn, int32_t)
>  
> diff --git a/qemu-option.c b/qemu-option.c
> index 27891e7..38e0a11 100644
> --- a/qemu-option.c
> +++ b/qemu-option.c
> @@ -203,7 +203,7 @@ static void parse_option_number(const char *name, const char *value,
>      }
>  }
>  
> -static void parse_option_size(const char *name, const char *value,
> +void parse_option_size(const char *name, const char *value,
>                                uint64_t *ret, Error **errp)
>  {
>      char *postfix;
> diff --git a/qemu-option.h b/qemu-option.h
> index ca72986..b8ee5b3 100644
> --- a/qemu-option.h
> +++ b/qemu-option.h
> @@ -152,5 +152,7 @@ typedef int (*qemu_opts_loopfunc)(QemuOpts *opts, void *opaque);
>  int qemu_opts_print(QemuOpts *opts, void *dummy);
>  int qemu_opts_foreach(QemuOptsList *list, qemu_opts_loopfunc func, void *opaque,
>                        int abort_on_failure);
> +void parse_option_size(const char *name, const char *value,
> +                              uint64_t *ret, Error **errp);
>  
>  #endif

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-01-09  0:08 ` Andreas Färber
  2013-01-10 17:36   ` Vasilis Liaskovitis
@ 2013-03-20  6:18   ` li guang
  2013-03-26 14:20     ` Eduardo Habkost
  1 sibling, 1 reply; 72+ messages in thread
From: li guang @ 2013-03-20  6:18 UTC (permalink / raw)
  To: Andreas Färber
  Cc: blauwirbel, pingfank, Eduardo Habkost, gleb, stefanha, jbaron,
	seabios, qemu-devel, Vasilis Liaskovitis, kevin, kraxel,
	Anthony Liguori, Igor Mammedov, Paolo Bonzini

在 2013-01-09三的 01:08 +0100,Andreas Färber写道:
> Am 18.12.2012 13:41, schrieb Vasilis Liaskovitis:
> > Because dimm layout needs to be configured on machine-boot, all dimm devices
> > need to be specified on startup command line (either with populated=on or with
> > populated=off). The dimm information is stored in dimm configuration structures.
> > 
> > After machine startup, dimms are hot-added or removed with normal device_add
> > and device_del operations e.g.:
> > Hot-add syntax: "device_add dimm,id=mydimm0,bus=membus.0"
> > Hot-remove syntax: "device_del dimm,id=mydimm0"
> 
> This sounds contradictory: Either all devices need to be specified on
> the command line, or they can be hot-added via monitor.
> 
> Assuming a fixed layout at startup, I wonder if there is another clever
> way to model this... For CPU hotplug Anthony had suggested to have a
> fixed set of link<Socket> properties that get set to a CPU socket as
> needed. Might a similar strategy work for memory, i.e. a
> startup-configured amount of link<DIMM>s on /machine/dimm[n] that point
> to a QOM DIMM object or NULL if unpopulated? Hot(un)plug would then
> simply work via QMP qom-set command. (CC'ing some people)


Sorry, what's link<>, did it adopted by cpu-QOM?
can you give some hints?

Thanks!

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 07/30] Add SIZE type to qdev properties
  2013-03-20  6:06   ` li guang
@ 2013-03-20 14:24     ` Eric Blake
  2013-03-21  0:39       ` li guang
  0 siblings, 1 reply; 72+ messages in thread
From: Eric Blake @ 2013-03-20 14:24 UTC (permalink / raw)
  To: li guang
  Cc: blauwirbel, pingfank, gleb, stefanha, seabios, qemu-devel,
	Vasilis Liaskovitis, kevin, kraxel, anthony

[-- Attachment #1: Type: text/plain, Size: 708 bytes --]

On 03/20/2013 12:06 AM, li guang wrote:

>> +    return snprintf(dest, len, "%0.03f%c", (double)*ptr/div, suffixes[i]);
>                                     ^^^^^^     ^^^^^^^^^^^^^^^  
>> +}
>> +
> 
> IMHO, you may need (double)(*ptr/div), for type cast is right
> associated.

No, the code as written is correct, and your proposal would be wrong.
As written, it is evaluated as:

((double)*ptr) / div

which promotes the division to floating point.  Your proposal would do
the division (*ptr/div) as an integral (truncating) division, and only
then cast the result to double.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 07/30] Add SIZE type to qdev properties
  2013-03-20 14:24     ` Eric Blake
@ 2013-03-21  0:39       ` li guang
  0 siblings, 0 replies; 72+ messages in thread
From: li guang @ 2013-03-21  0:39 UTC (permalink / raw)
  To: Eric Blake
  Cc: blauwirbel, pingfank, gleb, stefanha, seabios, qemu-devel,
	Vasilis Liaskovitis, kevin, kraxel, anthony

在 2013-03-20三的 08:24 -0600,Eric Blake写道:
> On 03/20/2013 12:06 AM, li guang wrote:
> 
> >> +    return snprintf(dest, len, "%0.03f%c", (double)*ptr/div, suffixes[i]);
> >                                     ^^^^^^     ^^^^^^^^^^^^^^^  
> >> +}
> >> +
> > 
> > IMHO, you may need (double)(*ptr/div), for type cast is right
> > associated.
> 
> No, the code as written is correct, and your proposal would be wrong.
> As written, it is evaluated as:
> 
> ((double)*ptr) / div
> 
> which promotes the division to floating point.  Your proposal would do
> the division (*ptr/div) as an integral (truncating) division, and only
> then cast the result to double.
> 

Yes, you're right,
amazing c type cast!

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 09/30] Implement dimm device abstraction
  2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 09/30] Implement dimm device abstraction Vasilis Liaskovitis
@ 2013-03-26  3:51   ` li guang
  0 siblings, 0 replies; 72+ messages in thread
From: li guang @ 2013-03-26  3:51 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, stefanha, jbaron, seabios, qemu-devel,
	blauwirbel, kevin, kraxel, anthony

在 2012-12-18二的 13:41 +0100,Vasilis Liaskovitis写道:
> Each hotplug-able memory slot is a DimmDevice. All DimmDevices are attached
> to a new bus called DimmBus. This bus is introduced so that we no longer
> depend on hotplug-capability of main system bus (the main bus does not allow
> hotplugging). The DimmBus should be attached to a chipset Device (i440fx in case
> of the pc)
> 
> A hot-add operation for a particular dimm:
> - creates a new DimmDevice and attaches it to the DimmBus
> - creates a new MemoryRegion of the given physical address offset, size and
> node proximity, and attaches it to main system memory as a sub_region.
> 
> Hotplug operations are done through normal device_add commands.
> Also add properties to DimmDevice.
> 
> v3->v4: Removed hot-remove functions. Will be offered in separate patches.
> 
> Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
> ---
>  hw/Makefile.objs |    2 +-
>  hw/dimm.c        |  245 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/dimm.h        |   89 ++++++++++++++++++++
>  3 files changed, 335 insertions(+), 1 deletions(-)
>  create mode 100644 hw/dimm.c
>  create mode 100644 hw/dimm.h
> 
> diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> index d581d8d..51494c9 100644
> --- a/hw/Makefile.objs
> +++ b/hw/Makefile.objs
> @@ -29,7 +29,7 @@ common-obj-$(CONFIG_I8254) += i8254_common.o i8254.o
>  common-obj-$(CONFIG_PCSPK) += pcspk.o
>  common-obj-$(CONFIG_PCKBD) += pckbd.o
>  common-obj-$(CONFIG_FDC) += fdc.o
> -common-obj-$(CONFIG_ACPI) += acpi.o acpi_piix4.o acpi_ich9.o smbus_ich9.o
> +common-obj-$(CONFIG_ACPI) += acpi.o acpi_piix4.o acpi_ich9.o smbus_ich9.o dimm.o
>  common-obj-$(CONFIG_APM) += pm_smbus.o apm.o
>  common-obj-$(CONFIG_DMA) += dma.o
>  common-obj-$(CONFIG_I82374) += i82374.o
> diff --git a/hw/dimm.c b/hw/dimm.c
> new file mode 100644
> index 0000000..e384952
> --- /dev/null
> +++ b/hw/dimm.c
> @@ -0,0 +1,245 @@
> +/*
> + * Dimm device for Memory Hotplug
> + *
> + * Copyright ProfitBricks GmbH 2012
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see <http://www.gnu.org/licenses/>
> + */
> +
> +#include "trace.h"
> +#include "qdev.h"
> +#include "dimm.h"
> +#include <time.h>
> +#include "../exec-memory.h"
> +#include "qmp-commands.h"
> +
> +/* the following list is used to hold dimm config info before machine
> + * is initialized. After machine init, the list is not used anymore.*/
> +static DimmConfiglist dimmconfig_list =
> +       QTAILQ_HEAD_INITIALIZER(dimmconfig_list);
> +
> +/* the list of memory buses */
> +static QLIST_HEAD(, DimmBus) memory_buses;
> +
> +static void dimmbus_dev_print(Monitor *mon, DeviceState *dev, int indent);
> +static char *dimmbus_get_fw_dev_path(DeviceState *dev);
> +
> +static Property dimm_properties[] = {
> +    DEFINE_PROP_UINT64("start", DimmDevice, start, 0),
> +    DEFINE_PROP_SIZE("size", DimmDevice, size, DEFAULT_DIMMSIZE),
> +    DEFINE_PROP_UINT32("node", DimmDevice, node, 0),
> +    DEFINE_PROP_BIT("populated", DimmDevice, populated, 0, false),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void dimmbus_dev_print(Monitor *mon, DeviceState *dev, int indent)
> +{
> +}
> +
> +static char *dimmbus_get_fw_dev_path(DeviceState *dev)
> +{
> +    char path[40];
> +
> +    snprintf(path, sizeof(path), "%s", qdev_fw_name(dev));
> +    return strdup(path);
> +}
> +
> +static void dimm_bus_class_init(ObjectClass *klass, void *data)
> +{
> +    BusClass *k = BUS_CLASS(klass);
> +
> +    k->print_dev = dimmbus_dev_print;
> +    k->get_fw_dev_path = dimmbus_get_fw_dev_path;
> +}
> +
> +static void dimm_bus_initfn(Object *obj)
> +{
> +    DimmBus *bus = DIMM_BUS(obj);
> +    QTAILQ_INIT(&bus->dimmconfig_list);
> +    QTAILQ_INIT(&bus->dimmlist);
> +}
> +
> +static const TypeInfo dimm_bus_info = {
> +    .name = TYPE_DIMM_BUS,
> +    .parent = TYPE_BUS,
> +    .instance_size = sizeof(DimmBus),
> +    .instance_init = dimm_bus_initfn,
> +    .class_init = dimm_bus_class_init,
> +};
> +
> +DimmBus *dimm_bus_create(Object *parent, const char *name, uint32_t max_dimms,
> +    dimm_calcoffset_fn pmc_set_offset)
> +{
> +    DimmBus *memory_bus;
> +    DimmConfig *dimm_cfg, *next_cfg;
> +    uint32_t num_dimms = 0;
> +
> +    memory_bus = g_malloc0(dimm_bus_info.instance_size);
> +    memory_bus->qbus.name = name ? g_strdup(name) : "membus.0";
> +    qbus_create_inplace(&memory_bus->qbus, TYPE_DIMM_BUS, DEVICE(parent),
> +                         name);
> +
> +    QTAILQ_FOREACH_SAFE(dimm_cfg, &dimmconfig_list, nextdimmcfg, next_cfg) {
> +        if (!strcmp(memory_bus->qbus.name, dimm_cfg->bus_name)) {
> +            if (max_dimms && (num_dimms == max_dimms)) {
> +                fprintf(stderr, "Bus %s can only accept %u number of DIMMs\n",
> +                        name, max_dimms);
> +            }
> +            QTAILQ_REMOVE(&dimmconfig_list, dimm_cfg, nextdimmcfg);
> +            QTAILQ_INSERT_TAIL(&memory_bus->dimmconfig_list, dimm_cfg,
> +                    nextdimmcfg);
> +
> +            dimm_cfg->start = pmc_set_offset(DEVICE(parent), dimm_cfg->size);
> +            num_dimms++;
> +        }
> +    }
> +    QLIST_INSERT_HEAD(&memory_buses, memory_bus, next);
> +    return memory_bus;
> +}
> +
> +static void dimm_populate(DimmDevice *s)
> +{
> +    DeviceState *dev = (DeviceState *)s;
> +    MemoryRegion *new = NULL;
> +
> +    new = g_malloc(sizeof(MemoryRegion));
> +    memory_region_init_ram(new, dev->id, s->size);
> +    vmstate_register_ram_global(new);
> +    memory_region_add_subregion(get_system_memory(), s->start, new);
> +    s->populated = true;
> +    s->mr = new;

can't we simple just inc ref-count of mr when init memory_region, 
and dec ref-count of mr when destroy memory_region?
Sorry, it may be a bold idea :-)

> +}
> +
> +void dimm_config_create(char *id, uint64_t size, const char *bus, uint64_t node,
> +        uint32_t dimm_idx, uint32_t populated)
> +{
> +    DimmConfig *dimm_cfg;
> +    dimm_cfg = (DimmConfig *) g_malloc0(sizeof(DimmConfig));
> +    dimm_cfg->name = strdup(id);
> +    dimm_cfg->bus_name = strdup(bus);
> +    dimm_cfg->idx = dimm_idx;
> +    dimm_cfg->start = 0;
> +    dimm_cfg->size = size;
> +    dimm_cfg->node = node;
> +    dimm_cfg->populated = populated;
> +
> +    QTAILQ_INSERT_TAIL(&dimmconfig_list, dimm_cfg, nextdimmcfg);
> +}
> +
> +void dimm_bus_hotplug(dimm_hotplug_fn hotplug, DeviceState *qdev)
> +{
> +    DimmBus *bus;
> +    QLIST_FOREACH(bus, &memory_buses, next) {
> +        assert(bus);
> +        bus->qbus.allow_hotplug = 1;
> +        bus->dimm_hotplug_qdev = qdev;
> +        bus->dimm_hotplug = hotplug;
> +    }
> +}
> +
> +static void dimm_plug_device(DimmDevice *slot)
> +{
> +    DimmBus *bus = DIMM_BUS(qdev_get_parent_bus(&slot->qdev));
> +
> +    dimm_populate(slot);
> +    if (bus->dimm_hotplug) {
> +        bus->dimm_hotplug(bus->dimm_hotplug_qdev, slot, 1);
> +    }
> +}
> +
> +static int dimm_unplug_device(DeviceState *qdev)
> +{
> +    return 1;
> +}
> +
> +static DimmConfig *dimmcfg_find_from_name(DimmBus *bus, const char *name)
> +{
> +    DimmConfig *slot;
> +
> +    QTAILQ_FOREACH(slot, &bus->dimmconfig_list, nextdimmcfg) {
> +        if (!strcmp(slot->name, name)) {
> +            return slot;
> +        }
> +    }
> +    return NULL;
> +}
> +
> +void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots)
> +{
> +    DimmConfig *slot;
> +    DimmBus *bus;
> +
> +    QLIST_FOREACH(bus, &memory_buses, next) {
> +        QTAILQ_FOREACH(slot, &bus->dimmconfig_list, nextdimmcfg) {
> +            assert(slot->start);
> +            fw_cfg_slots[3 * slot->idx] = cpu_to_le64(slot->start);
> +            fw_cfg_slots[3 * slot->idx + 1] = cpu_to_le64(slot->size);
> +            fw_cfg_slots[3 * slot->idx + 2] = cpu_to_le64(slot->node);
> +        }
> +    }
> +}
> +
> +static int dimm_init(DeviceState *s)
> +{
> +    DimmBus *bus = DIMM_BUS(qdev_get_parent_bus(s));
> +    DimmDevice *slot;
> +    DimmConfig *slotcfg;
> +
> +    slot = DIMM(s);
> +    slot->mr = NULL;
> +
> +    slotcfg = dimmcfg_find_from_name(bus, s->id);
> +
> +    if (!slotcfg) {
> +        fprintf(stderr, "%s no config for slot %s found\n",
> +                __func__, s->id);
> +        return 1;
> +    }
> +
> +    slot->idx = slotcfg->idx;
> +    assert(slotcfg->start);
> +    slot->start = slotcfg->start;
> +    slot->size = slotcfg->size;
> +    slot->node = slotcfg->node;
> +
> +    QTAILQ_INSERT_TAIL(&bus->dimmlist, slot, nextdimm);
> +    dimm_plug_device(slot);
> +
> +    return 0;
> +}
> +
> +
> +static void dimm_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +
> +    dc->props = dimm_properties;
> +    dc->unplug = dimm_unplug_device;
> +    dc->init = dimm_init;
> +    dc->bus_type = TYPE_DIMM_BUS;
> +}
> +
> +static TypeInfo dimm_info = {
> +    .name          = TYPE_DIMM,
> +    .parent        = TYPE_DEVICE,
> +    .instance_size = sizeof(DimmDevice),
> +    .class_init    = dimm_class_init,
> +};
> +
> +static void dimm_register_types(void)
> +{
> +    type_register_static(&dimm_bus_info);
> +    type_register_static(&dimm_info);
> +}
> +
> +type_init(dimm_register_types)
> diff --git a/hw/dimm.h b/hw/dimm.h
> new file mode 100644
> index 0000000..75a6911
> --- /dev/null
> +++ b/hw/dimm.h
> @@ -0,0 +1,89 @@
> +#ifndef QEMU_DIMM_H
> +#define QEMU_DIMM_H
> +
> +#include "qemu-common.h"
> +#include "memory.h"
> +#include "sysbus.h"
> +#include "qapi-types.h"
> +#include "qemu-queue.h"
> +#include "cpus.h"
> +#define MAX_DIMMS 255
> +#define DIMM_BITMAP_BYTES ((MAX_DIMMS + 7) / 8)
> +#define DEFAULT_DIMMSIZE (1024*1024*1024)
> +
> +typedef enum {
> +    DIMM_REMOVE_SUCCESS = 0,
> +    DIMM_REMOVE_FAIL = 1,
> +    DIMM_ADD_SUCCESS = 2,
> +    DIMM_ADD_FAIL = 3
> +} dimm_hp_result_code;
> +
> +#define TYPE_DIMM "dimm"
> +#define DIMM(obj) \
> +    OBJECT_CHECK(DimmDevice, (obj), TYPE_DIMM)
> +#define DIMM_CLASS(klass) \
> +    OBJECT_CLASS_CHECK(DimmDeviceClass, (klass), TYPE_DIMM)
> +#define DIMM_GET_CLASS(obj) \
> +    OBJECT_GET_CLASS(DimmDeviceClass, (obj), TYPE_DIMM)
> +
> +typedef struct DimmDevice DimmDevice;
> +typedef QTAILQ_HEAD(DimmConfiglist, DimmConfig) DimmConfiglist;
> +
> +typedef struct DimmDeviceClass {
> +    DeviceClass parent_class;
> +
> +    int (*init)(DimmDevice *dev);
> +} DimmDeviceClass;
> +
> +struct DimmDevice {
> +    DeviceState qdev;
> +    uint32_t idx; /* index in memory hotplug register/bitmap */
> +    ram_addr_t start; /* starting physical address */
> +    ram_addr_t size;
> +    uint32_t node; /* numa node proximity */
> +    uint32_t populated; /* 1 means device has been hotplugged. Default is 0. */
> +    MemoryRegion *mr; /* MemoryRegion for this slot. !NULL only if populated */
> +    QTAILQ_ENTRY(DimmDevice) nextdimm;
> +};
> +
> +typedef struct DimmConfig {
> +    const char *name;
> +    uint32_t idx; /* index in linear memory hotplug bitmap */
> +    const char *bus_name;
> +    ram_addr_t start; /* starting physical address */
> +    ram_addr_t size;
> +    uint32_t node; /* numa node proximity */
> +    uint32_t populated; /* 1 means device has been hotplugged. Default is 0. */
> +    QTAILQ_ENTRY(DimmConfig) nextdimmcfg;
> +} DimmConfig;
> +
> +typedef int (*dimm_hotplug_fn)(DeviceState *qdev, DimmDevice *dev, int add);
> +typedef hwaddr(*dimm_calcoffset_fn)(DeviceState *dev, uint64_t size);
> +
> +#define TYPE_DIMM_BUS "dimmbus"
> +#define DIMM_BUS(obj) OBJECT_CHECK(DimmBus, (obj), TYPE_DIMM_BUS)
> +
> +typedef struct DimmBus {
> +    BusState qbus;
> +    DeviceState *dimm_hotplug_qdev;
> +    dimm_hotplug_fn dimm_hotplug;
> +    DimmConfiglist dimmconfig_list;
> +    QTAILQ_HEAD(Dimmlist, DimmDevice) dimmlist;
> +    QLIST_ENTRY(DimmBus) next;
> +} DimmBus;
> +
> +struct dimm_hp_result {
> +    const char *dimmname;
> +    dimm_hp_result_code ret;
> +    QTAILQ_ENTRY(dimm_hp_result) next;
> +};
> +
> +void dimm_bus_hotplug(dimm_hotplug_fn hotplug, DeviceState *qdev);
> +void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots);
> +int dimm_add(char *id);
> +DimmBus *dimm_bus_create(Object *parent, const char *name, uint32_t max_dimms,
> +    dimm_calcoffset_fn pmc_set_offset);
> +void dimm_config_create(char *id, uint64_t size, const char *bus, uint64_t node,
> +        uint32_t dimm_idx, uint32_t populated);
> +
> +#endif

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-03-20  6:18   ` li guang
@ 2013-03-26 14:20     ` Eduardo Habkost
  2013-03-27  7:39       ` li guang
  0 siblings, 1 reply; 72+ messages in thread
From: Eduardo Habkost @ 2013-03-26 14:20 UTC (permalink / raw)
  To: li guang
  Cc: blauwirbel, pingfank, gleb, stefanha, jbaron, seabios,
	qemu-devel, Vasilis Liaskovitis, kevin, kraxel, Anthony Liguori,
	Igor Mammedov, Paolo Bonzini, Andreas Färber

On Wed, Mar 20, 2013 at 02:18:00PM +0800, li guang wrote:
> 在 2013-01-09三的 01:08 +0100,Andreas Färber写道:
> > Am 18.12.2012 13:41, schrieb Vasilis Liaskovitis:
> > > Because dimm layout needs to be configured on machine-boot, all dimm devices
> > > need to be specified on startup command line (either with populated=on or with
> > > populated=off). The dimm information is stored in dimm configuration structures.
> > > 
> > > After machine startup, dimms are hot-added or removed with normal device_add
> > > and device_del operations e.g.:
> > > Hot-add syntax: "device_add dimm,id=mydimm0,bus=membus.0"
> > > Hot-remove syntax: "device_del dimm,id=mydimm0"
> > 
> > This sounds contradictory: Either all devices need to be specified on
> > the command line, or they can be hot-added via monitor.
> > 
> > Assuming a fixed layout at startup, I wonder if there is another clever
> > way to model this... For CPU hotplug Anthony had suggested to have a
> > fixed set of link<Socket> properties that get set to a CPU socket as
> > needed. Might a similar strategy work for memory, i.e. a
> > startup-configured amount of link<DIMM>s on /machine/dimm[n] that point
> > to a QOM DIMM object or NULL if unpopulated? Hot(un)plug would then
> > simply work via QMP qom-set command. (CC'ing some people)
> 
> 
> Sorry, what's link<>, did it adopted by cpu-QOM?

"link<...>" is a QOM construct, allowing properties that point to other
objects. We don't use it on the CPU objects yet.

> can you give some hints?

Look for mentions of "link" in the doc comments at include/qom/object.h.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
                   ` (33 preceding siblings ...)
       [not found] ` <CAF+CadtnTcOnUt7jp1bARJgioxR5KzLG0QSQuDbiqhiKxiCqFA@mail.gmail.com>
@ 2013-03-26 14:47 ` Luiz Capitulino
  2013-03-26 16:59   ` Vasilis Liaskovitis
  34 siblings, 1 reply; 72+ messages in thread
From: Luiz Capitulino @ 2013-03-26 14:47 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: pingfank, gleb, stefanha, seabios, qemu-devel, blauwirbel, kevin,
	kraxel, anthony

On Tue, 18 Dec 2012 13:41:28 +0100
Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> wrote:

> This is v4 of the ACPI memory hotplug functionality. Only x86_64 target is
> supported (both i440fx and q35). There are still several issues, but it's
> been a while since v3 and I wanted to get some more feedback on the current
> state of the patchseries.

It seems this series doesn't apply anymore, do you plan to respin it?

Also, some months ago I saw patches flying on linux-mm fixing some
issues related to memory hotplug, so should this work with latest Linux
kernel?

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-03-19  7:28     ` li guang
@ 2013-03-26 16:43       ` Vasilis Liaskovitis
  2013-03-27  2:54         ` li guang
  0 siblings, 1 reply; 72+ messages in thread
From: Vasilis Liaskovitis @ 2013-03-26 16:43 UTC (permalink / raw)
  To: li guang
  Cc: gleb, stefanha, seabios, qemulist, qemu-devel, blauwirbel, kevin,
	Erlon Cruz, kraxel, anthony

Hi,

On Tue, Mar 19, 2013 at 03:28:38PM +0800, li guang wrote:
[...]
> > > > This is v4 of the ACPI memory hotplug functionality. Only x86_64 target is
> > > > supported (both i440fx and q35). There are still several issues, but it's
> > > > been a while since v3 and I wanted to get some more feedback on the current
> > > > state of the patchseries.
> > > >
> > > >
> > > We are working in memory hotplug functionality on pSeries machine. I'm
> > > wondering whether and how we can better integrate things. Do you think the
> > > DIMM abstraction is generic enough to be used in other machine types?
> > 
> > I think the DimmDevice is generic enough but I am open to other suggestions. 
> > 
> > A related issue is that the patchseries uses a DimmBus to hot-add and hot-remove
> > DimmDevice. Another approach that has been suggested is to use links<> between
> > DimmDevices and the dram controller device (piix4 or mch for pc and q35-pc
> > machines respectively). This would be more similar to the CPUState/qom
> > patches - see Andreas Färber's earlier reply to this thread.
> > 
> > I think we should get some consensus from the community/maintainers before we
> > continue to integrate. 
> > 
> > I haven't updated the series for a while, but I can rework if there is a more
> > clear direction for the community.
> > 
> > Another open issue is reference counting of memoryregions in qemu memory
> > model. In order to make memory hot-remove operations safe, we need to remove
> > a memoryregion after all users (e.g. both guest and block layer) have stopped
> > using it,
> 
> it seems it mostly up to the user who want to hot-(un)plug,
> if user want to un-plug a memory which is kernel's main memory, kernel
> will always run on it(never stop) unless power off.
> and if guest stops, all DIMMs should be safe to hot-remove,
> or else we should do something to let user can unlock all reference.

it's not only the guest-side that needs to stop using it, we need to make sure
that the qemu block layer is also not using the memory region anymore. See the 2
links below for discussion:

> >  see discussion at
> > http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg03986.html. There was a
> > relevant ibm patchset
> > https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02697.html
> > but it was not merged.
> > 
thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-03-19  6:30       ` li guang
@ 2013-03-26 16:58         ` Vasilis Liaskovitis
  2013-03-27  2:42           ` li guang
  0 siblings, 1 reply; 72+ messages in thread
From: Vasilis Liaskovitis @ 2013-03-26 16:58 UTC (permalink / raw)
  To: li guang
  Cc: gleb, stefanha, seabios, qemu-devel, blauwirbel, kevin,
	Gerd Hoffmann, anthony

Hi,

On Tue, Mar 19, 2013 at 02:30:25PM +0800, li guang wrote:
> 在 2013-01-10四的 19:57 +0100,Vasilis Liaskovitis写道:
> > > > 
> > > > IIRC q35 supports memory hotplug natively (picked up in some
> > > > discussion).  Is that correct?
> > > > 
> > > From previous discussion I also understand that q35 supports native hotplug. 
> > > Sections 5.1 and 5.2 of the spec describe the MCH registers but the native
> > > memory hotplug specifics are not yet clear to me. Any pointers from the
> > > spec are welcome.
> > 
> > Ping. Could anyone who's familiar with the q35 spec provide some pointers on
> > native memory hotplug details in the spec? I see pcie hotplug registers but can't
> > find memory hotplug interface details. If I am not mistaken, the spec is here:
> > http://www.intel.com/design/chipsets/datashts/316966.htm
> > 
> > Is the q35 memory hotplug support supposed to be an shpc-like interface geared
> > towards memory slots instead of pci slots?
> > 
> 
> seems there's no so-called q35-native support

that was also my first impression when scanning the specification. Wasn't native
memory hotplug capabilities one of the reasons that q35 got picked as the next
pc chipset?

thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-03-26 14:47 ` Luiz Capitulino
@ 2013-03-26 16:59   ` Vasilis Liaskovitis
  0 siblings, 0 replies; 72+ messages in thread
From: Vasilis Liaskovitis @ 2013-03-26 16:59 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: gleb, stefanha, seabios, qemu-devel, qemulist, blauwirbel, kevin,
	kraxel, anthony

Hi,

On Tue, Mar 26, 2013 at 10:47:01AM -0400, Luiz Capitulino wrote:
> On Tue, 18 Dec 2012 13:41:28 +0100
> Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> wrote:
> 
> > This is v4 of the ACPI memory hotplug functionality. Only x86_64 target is
> > supported (both i440fx and q35). There are still several issues, but it's
> > been a while since v3 and I wanted to get some more feedback on the current
> > state of the patchseries.
> 
> It seems this series doesn't apply anymore, do you plan to respin it?
> 

I 'll respin sometime in April due to other work currently.

> Also, some months ago I saw patches flying on linux-mm fixing some
> issues related to memory hotplug, so should this work with latest Linux
> kernel?

hot-add is working. But hot-remove is broken in mainline and still in progress.
See discussion at:
https://lkml.org/lkml/2013/3/25/490

thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-03-26 16:58         ` Vasilis Liaskovitis
@ 2013-03-27  2:42           ` li guang
  0 siblings, 0 replies; 72+ messages in thread
From: li guang @ 2013-03-27  2:42 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: gleb, stefanha, seabios, qemu-devel, blauwirbel, kevin,
	Gerd Hoffmann, anthony

在 2013-03-26二的 17:58 +0100,Vasilis Liaskovitis写道:
> Hi,
> 
> On Tue, Mar 19, 2013 at 02:30:25PM +0800, li guang wrote:
> > 在 2013-01-10四的 19:57 +0100,Vasilis Liaskovitis写道:
> > > > > 
> > > > > IIRC q35 supports memory hotplug natively (picked up in some
> > > > > discussion).  Is that correct?
> > > > > 
> > > > From previous discussion I also understand that q35 supports native hotplug. 
> > > > Sections 5.1 and 5.2 of the spec describe the MCH registers but the native
> > > > memory hotplug specifics are not yet clear to me. Any pointers from the
> > > > spec are welcome.
> > > 
> > > Ping. Could anyone who's familiar with the q35 spec provide some pointers on
> > > native memory hotplug details in the spec? I see pcie hotplug registers but can't
> > > find memory hotplug interface details. If I am not mistaken, the spec is here:
> > > http://www.intel.com/design/chipsets/datashts/316966.htm
> > > 
> > > Is the q35 memory hotplug support supposed to be an shpc-like interface geared
> > > towards memory slots instead of pci slots?
> > > 
> > 
> > seems there's no so-called q35-native support
> 
> that was also my first impression when scanning the specification. Wasn't native
> memory hotplug capabilities one of the reasons that q35 got picked as the next
> pc chipset?

Um, I can't find the original statement of q35,
but I think if we can't find in intel's official
SPEC, then we have to say 'there's no q35-native support'.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-03-26 16:43       ` Vasilis Liaskovitis
@ 2013-03-27  2:54         ` li guang
  2013-03-28  9:29           ` Vasilis Liaskovitis
  0 siblings, 1 reply; 72+ messages in thread
From: li guang @ 2013-03-27  2:54 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: gleb, stefanha, seabios, qemulist, qemu-devel, blauwirbel, kevin,
	Erlon Cruz, kraxel, anthony

在 2013-03-26二的 17:43 +0100,Vasilis Liaskovitis写道:
> Hi,
> 
> On Tue, Mar 19, 2013 at 03:28:38PM +0800, li guang wrote:
> [...]
> > > > > This is v4 of the ACPI memory hotplug functionality. Only x86_64 target is
> > > > > supported (both i440fx and q35). There are still several issues, but it's
> > > > > been a while since v3 and I wanted to get some more feedback on the current
> > > > > state of the patchseries.
> > > > >
> > > > >
> > > > We are working in memory hotplug functionality on pSeries machine. I'm
> > > > wondering whether and how we can better integrate things. Do you think the
> > > > DIMM abstraction is generic enough to be used in other machine types?
> > > 
> > > I think the DimmDevice is generic enough but I am open to other suggestions. 
> > > 
> > > A related issue is that the patchseries uses a DimmBus to hot-add and hot-remove
> > > DimmDevice. Another approach that has been suggested is to use links<> between
> > > DimmDevices and the dram controller device (piix4 or mch for pc and q35-pc
> > > machines respectively). This would be more similar to the CPUState/qom
> > > patches - see Andreas Färber's earlier reply to this thread.
> > > 
> > > I think we should get some consensus from the community/maintainers before we
> > > continue to integrate. 
> > > 
> > > I haven't updated the series for a while, but I can rework if there is a more
> > > clear direction for the community.
> > > 
> > > Another open issue is reference counting of memoryregions in qemu memory
> > > model. In order to make memory hot-remove operations safe, we need to remove
> > > a memoryregion after all users (e.g. both guest and block layer) have stopped
> > > using it,
> > 
> > it seems it mostly up to the user who want to hot-(un)plug,
> > if user want to un-plug a memory which is kernel's main memory, kernel
> > will always run on it(never stop) unless power off.
> > and if guest stops, all DIMMs should be safe to hot-remove,
> > or else we should do something to let user can unlock all reference.
> 
> it's not only the guest-side that needs to stop using it, we need to make sure
> that the qemu block layer is also not using the memory region anymore. See the 2
> links below for discussion:
> 

can't we simply track this(MemoryRegion) usage by ref-count?
e.g.
every time mr used, inc ref-count, then dec it when unused
even for cpu_physical_memory_map and other potential users.

> > >  see discussion at
> > > http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg03986.html. There was a
> > > relevant ibm patchset
> > > https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02697.html
> > > but it was not merged.
> > > 
> thanks,
> 
> - Vasilis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-03-26 14:20     ` Eduardo Habkost
@ 2013-03-27  7:39       ` li guang
  0 siblings, 0 replies; 72+ messages in thread
From: li guang @ 2013-03-27  7:39 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: blauwirbel, pingfank, gleb, stefanha, jbaron, seabios,
	qemu-devel, Vasilis Liaskovitis, kevin, kraxel, Anthony Liguori,
	Igor Mammedov, Paolo Bonzini, Andreas Färber

在 2013-03-26二的 11:20 -0300,Eduardo Habkost写道:
> On Wed, Mar 20, 2013 at 02:18:00PM +0800, li guang wrote:
> > 在 2013-01-09三的 01:08 +0100,Andreas Färber写道:
> > > Am 18.12.2012 13:41, schrieb Vasilis Liaskovitis:
> > > > Because dimm layout needs to be configured on machine-boot, all dimm devices
> > > > need to be specified on startup command line (either with populated=on or with
> > > > populated=off). The dimm information is stored in dimm configuration structures.
> > > > 
> > > > After machine startup, dimms are hot-added or removed with normal device_add
> > > > and device_del operations e.g.:
> > > > Hot-add syntax: "device_add dimm,id=mydimm0,bus=membus.0"
> > > > Hot-remove syntax: "device_del dimm,id=mydimm0"
> > > 
> > > This sounds contradictory: Either all devices need to be specified on
> > > the command line, or they can be hot-added via monitor.
> > > 
> > > Assuming a fixed layout at startup, I wonder if there is another clever
> > > way to model this... For CPU hotplug Anthony had suggested to have a
> > > fixed set of link<Socket> properties that get set to a CPU socket as
> > > needed. Might a similar strategy work for memory, i.e. a
> > > startup-configured amount of link<DIMM>s on /machine/dimm[n] that point
> > > to a QOM DIMM object or NULL if unpopulated? Hot(un)plug would then
> > > simply work via QMP qom-set command. (CC'ing some people)
> > 
> > 
> > Sorry, what's link<>, did it adopted by cpu-QOM?
> 
> "link<...>" is a QOM construct, allowing properties that point to other
> objects. We don't use it on the CPU objects yet.
> 
> > can you give some hints?
> 
> Look for mentions of "link" in the doc comments at include/qom/object.h.
> 

OK, I see, Thanks!

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-03-27  2:54         ` li guang
@ 2013-03-28  9:29           ` Vasilis Liaskovitis
  2013-03-28  9:49             ` liu ping fan
  0 siblings, 1 reply; 72+ messages in thread
From: Vasilis Liaskovitis @ 2013-03-28  9:29 UTC (permalink / raw)
  To: li guang
  Cc: gleb, stefanha, seabios, qemulist, qemu-devel, blauwirbel, kevin,
	Erlon Cruz, kraxel, anthony

Hi,

[...]
>> > >
>> > > I haven't updated the series for a while, but I can rework if there is a more
>> > > clear direction for the community.
>> > >
>> > > Another open issue is reference counting of memoryregions in qemu memory
>> > > model. In order to make memory hot-remove operations safe, we need to remove
>> > > a memoryregion after all users (e.g. both guest and block layer) have stopped
>> > > using it,
>> >
>> > it seems it mostly up to the user who want to hot-(un)plug,
>> > if user want to un-plug a memory which is kernel's main memory, kernel
>> > will always run on it(never stop) unless power off.
>> > and if guest stops, all DIMMs should be safe to hot-remove,
>> > or else we should do something to let user can unlock all reference.
>>
>> it's not only the guest-side that needs to stop using it, we need to make sure
>> that the qemu block layer is also not using the memory region anymore. See the 2
>> links below for discussion:
>>
>
> can't we simply track this(MemoryRegion) usage by ref-count?
> e.g.
> every time mr used, inc ref-count, then dec it when unused
> even for cpu_physical_memory_map and other potential users.
>

yes, that's the idea the patchset below try to implement, but last
time I checked this was not upstreamed. I will take a closer look next
week.

>> > >  see discussion at
>> > > http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg03986.html. There was a
>> > > relevant ibm patchset
>> > > https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02697.html
>> > > but it was not merged.

- Vasilis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-03-28  9:29           ` Vasilis Liaskovitis
@ 2013-03-28  9:49             ` liu ping fan
  0 siblings, 0 replies; 72+ messages in thread
From: liu ping fan @ 2013-03-28  9:49 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: gleb, stefanha, seabios, qemu-devel, blauwirbel, kevin,
	Erlon Cruz, kraxel, anthony, li guang

On Thu, Mar 28, 2013 at 5:29 PM, Vasilis Liaskovitis
<vasilis.liaskovitis@profitbricks.com> wrote:
> Hi,
>
> [...]
>>> > >
>>> > > I haven't updated the series for a while, but I can rework if there is a more
>>> > > clear direction for the community.
>>> > >
>>> > > Another open issue is reference counting of memoryregions in qemu memory
>>> > > model. In order to make memory hot-remove operations safe, we need to remove
>>> > > a memoryregion after all users (e.g. both guest and block layer) have stopped
>>> > > using it,
>>> >
>>> > it seems it mostly up to the user who want to hot-(un)plug,
>>> > if user want to un-plug a memory which is kernel's main memory, kernel
>>> > will always run on it(never stop) unless power off.
>>> > and if guest stops, all DIMMs should be safe to hot-remove,
>>> > or else we should do something to let user can unlock all reference.
>>>
>>> it's not only the guest-side that needs to stop using it, we need to make sure
>>> that the qemu block layer is also not using the memory region anymore. See the 2
>>> links below for discussion:
>>>
>>
>> can't we simply track this(MemoryRegion) usage by ref-count?
>> e.g.
>> every time mr used, inc ref-count, then dec it when unused
>> even for cpu_physical_memory_map and other potential users.
>>
>
> yes, that's the idea the patchset below try to implement, but last
> time I checked this was not upstreamed. I will take a closer look next
> week.
>
The current  problem is that if not all the device convert to support
MemoryRegionOps ref/unref API, the code in address_space_rw will be
ugly.
But I think the virtio data plane will need thread-safe RAM, so can we
just try that for RAM device.
If it is needed, I will update my patches

Regards,
Pingfan
>>> > >  see discussion at
>>> > > http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg03986.html. There was a
>>> > > relevant ibm patchset
>>> > > https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02697.html
>>> > > but it was not merged.
>
> - Vasilis

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug
  2013-01-10 18:57     ` Vasilis Liaskovitis
  2013-03-19  6:30       ` li guang
@ 2013-04-02  9:15       ` liu ping fan
  1 sibling, 0 replies; 72+ messages in thread
From: liu ping fan @ 2013-04-02  9:15 UTC (permalink / raw)
  To: Vasilis Liaskovitis, Michael S. Tsirkin
  Cc: Gleb Natapov, Stefan Hajnoczi, jbaron, seabios, qemu-devel,
	Blue Swirl, Kevin OConnor, Gerd Hoffmann, Anthony Liguori

On Fri, Jan 11, 2013 at 2:57 AM, Vasilis Liaskovitis
<vasilis.liaskovitis@profitbricks.com> wrote:
>> >
>> > IIRC q35 supports memory hotplug natively (picked up in some
>> > discussion).  Is that correct?
>> >
>> From previous discussion I also understand that q35 supports native hotplug.
>> Sections 5.1 and 5.2 of the spec describe the MCH registers but the native
>> memory hotplug specifics are not yet clear to me. Any pointers from the
>> spec are welcome.
>
> Ping. Could anyone who's familiar with the q35 spec provide some pointers on
> native memory hotplug details in the spec? I see pcie hotplug registers but can't
> find memory hotplug interface details. If I am not mistaken, the spec is here:
> http://www.intel.com/design/chipsets/datashts/316966.htm
>
> Is the q35 memory hotplug support supposed to be an shpc-like interface geared
> towards memory slots instead of pci slots?
>
I think there is no dedicated pci express downstream port for memory
slots, so native plug is not available.


> thanks,
>
> - Vasilis
>

^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread, other threads:[~2013-04-02  9:16 UTC | newest]

Thread overview: 72+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-12-18 12:41 [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 01/30] [SeaBIOS] Add ACPI_EXTRACT_DEVICE* macros Vasilis Liaskovitis
2013-03-20  3:28   ` li guang
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 02/30] [SeaBIOS] Add SSDT memory device support Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 03/30] [SeaBIOS] acpi-dsdt: Implement functions for memory hotplug Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 04/30] [SeaBIOS] acpi: generate hotplug memory devices Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 05/30] [SeaBIOS] q35: Add memory hotplug handler Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 06/30] qapi: make visit_type_size fallback to type_int Vasilis Liaskovitis
2013-01-09  0:18   ` Andreas Färber
2013-01-09 16:00     ` mdroth
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 07/30] Add SIZE type to qdev properties Vasilis Liaskovitis
2013-03-20  6:06   ` li guang
2013-03-20 14:24     ` Eric Blake
2013-03-21  0:39       ` li guang
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 08/30] qemu-option: export parse_option_number Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 09/30] Implement dimm device abstraction Vasilis Liaskovitis
2013-03-26  3:51   ` li guang
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 10/30] vl: handle "-device dimm" Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 11/30] acpi_piix4 : Implement memory device hotplug registers Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 12/30] acpi_ich9 " Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 13/30] piix_pci and pc_piix: refactor Vasilis Liaskovitis
2013-01-16  7:20   ` Hu Tao
2013-01-16  9:36     ` Vasilis Liaskovitis
2013-01-16 11:17       ` Andreas Färber
2013-01-16 17:10         ` Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 14/30] piix_pci: Add i440fx dram controller initialization Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 15/30] q35: " Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 16/30] pc: Add dimm paravirt SRAT info Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 17/30] [SeaBIOS] pci: Use paravirt interface for pcimem_start and pcimem64_start Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 18/30] Introduce paravirt interface QEMU_CFG_PCI_WINDOW Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 19/30] Implement "info memory-total" and "query-memory-total" Vasilis Liaskovitis
2012-12-19 19:47   ` Blue Swirl
2013-01-04 16:21   ` Eric Blake
2013-01-10 17:42     ` Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 20/30] balloon: update with hotplugged memory Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 21/30] Implement dimm-info Vasilis Liaskovitis
2013-01-08 23:20   ` Eric Blake
2013-01-10 17:45     ` Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 22/30] [SeaBIOS] acpi: add _EJ0 operation and eject port for memory devices Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 23/30] dimm: add hot-remove capability Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 24/30] acpi_piix4: " Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 25/30] acpi_ich9: " Vasilis Liaskovitis
2012-12-19 19:48   ` Blue Swirl
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 26/30] Implement qmp and hmp commands for notification lists Vasilis Liaskovitis
2013-01-09  0:23   ` Eric Blake
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 27/30] [SeaBIOS] Add _OST dimm method Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 28/30] Add _OST dimm support Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 29/30] [SeaBIOS] Implement _PS3 method for memory device Vasilis Liaskovitis
2012-12-18 12:41 ` [Qemu-devel] [RFC PATCH v4 30/30] Implement _PS3 for dimm Vasilis Liaskovitis
2012-12-18 16:45 ` [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug Zhi Yong Wu
2012-12-19 11:40   ` Vasilis Liaskovitis
2012-12-19  7:27 ` Gerd Hoffmann
2012-12-19 11:35   ` Vasilis Liaskovitis
2012-12-19 13:56     ` Gerd Hoffmann
2013-01-10 18:57     ` Vasilis Liaskovitis
2013-03-19  6:30       ` li guang
2013-03-26 16:58         ` Vasilis Liaskovitis
2013-03-27  2:42           ` li guang
2013-04-02  9:15       ` liu ping fan
2013-01-09  0:08 ` Andreas Färber
2013-01-10 17:36   ` Vasilis Liaskovitis
2013-01-10 17:55     ` Andreas Färber
2013-03-20  6:18   ` li guang
2013-03-26 14:20     ` Eduardo Habkost
2013-03-27  7:39       ` li guang
     [not found] ` <CAF+CadtnTcOnUt7jp1bARJgioxR5KzLG0QSQuDbiqhiKxiCqFA@mail.gmail.com>
     [not found]   ` <20130228101819.GA4370@dhcp-192-168-178-175.profitbricks.localdomain>
2013-03-19  7:28     ` li guang
2013-03-26 16:43       ` Vasilis Liaskovitis
2013-03-27  2:54         ` li guang
2013-03-28  9:29           ` Vasilis Liaskovitis
2013-03-28  9:49             ` liu ping fan
2013-03-26 14:47 ` Luiz Capitulino
2013-03-26 16:59   ` Vasilis Liaskovitis

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.