* Re: [PATCH v3 00/32] implement vNVDIMM
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-10 21:17   ` Dan Williams
  -1 siblings, 0 replies; 200+ messages in thread
From: Dan Williams @ 2015-10-10 21:17 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Paolo Bonzini, imammedo, Gleb Natapov, mtosatti, stefanha,
	Michael S. Tsirkin, rth, ehabkost, kvm, qemu-devel

On Sat, Oct 10, 2015 at 8:52 PM, Xiao Guangrong
<guangrong.xiao@linux.intel.com> wrote:
[..]
> ====== Test ======
> In host:
> 1) create a memory-backed file, e.g. # dd if=/dev/zero of=/tmp/nvdimm bs=1G count=10
> 2) append "-object memory-backend-file,share,id=mem1,
>    mem-path=/tmp/nvdimm -device nvdimm,memdev=mem1,reserve-label-data,
>    id=nv1" to the QEMU command line
>
> In guest, download the latest upstream kernel (4.2 merge window) and enable
> ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM.
> 1) insmod drivers/nvdimm/libnvdimm.ko
> 2) insmod drivers/acpi/nfit.ko
> 3) insmod drivers/nvdimm/nd_btt.ko
> 4) insmod drivers/nvdimm/nd_pmem.ko
> You will see the whole nvdimm device used as a single namespace, and /dev/pmem0
> appears. You can do anything with /dev/pmem0, including DAX access.
>
> Currently the Linux NVDIMM driver does not support namespace operations on this
> kind of PMEM; apply the changes below to support dynamic namespaces:
>
> @@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *a
>                         continue;
>                 }
>
> -               if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> +               //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> +               if (nfit_mem->memdev_pmem)
>                         flags |= NDD_ALIASING;

This is just for testing purposes, right?  I expect guests can
sub-divide persistent memory capacity by partitioning the resulting
block device(s).

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [PATCH v3 00/32] implement vNVDIMM
@ 2015-10-11  3:52 ` Xiao Guangrong
  0 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Changelog in v3:
There are huge changes in this version. Thanks to Igor, Stefan, Paolo, Eduardo
and Michael for their valuable comments; the patchset finally gets into better shape.
- changes from Igor's comments:
  1) abstract a dimm device type from pc-dimm and create the nvdimm device
     based on dimm; it then uses a memory backend device as nvdimm's memory,
     so NUMA support is easily implemented.
  2) let the file-backend device support any kind of filesystem, not only
     hugetlbfs, and let it work on a file, not only a directory. This is
     achieved by extending 'mem-path': if it is a directory it works as
     before; otherwise, if it is a file, memory is allocated directly from it.
  3) we found an unused memory hole below 4G, 0xFF00000 ~ 0xFFF00000; this
     range is large enough for NVDIMM ACPI, since building 64-bit ACPI
     SSDT/DSDT tables would break Windows XP.
     BTW, setting only SSDT.rev = 2 does not work, since the integer width
     depends only on DSDT.rev, per 19.6.28 DefinitionBlock (Declare
     Definition Block) in the ACPI spec:
| Note: For compatibility with ACPI versions before ACPI 2.0, the bit
| width of Integer objects is dependent on the ComplianceRevision of the DSDT.
| If the ComplianceRevision is less than 2, all integers are restricted to 32
| bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets
| the global integer width for all integers, including integers in SSDTs.
  4) use the lowest ACPI spec version to document AML terms.
  5) use "nvdimm" as the nvdimm device name instead of "pc-nvdimm"

- changes from Stefan's comments:
  1) do not do the endian adjustment in place, since _DSM memory is visible
     to the guest
  2) use the target platform's page size instead of a fixed PAGE_SIZE
     definition
  3) lots of code style improvements and typo fixes.
  4) live migration fix
- changes from Paolo's comments:
  1) improve the name of the memory region

- other changes:
  1) return the exact buffer size for the _DSM method instead of the page size.
  2) introduce a mutex in NVDIMM ACPI, as the _DSM memory is shared by all
     nvdimm devices.
  3) NUMA support
  4) implement the _FIT method
  5) rename "configdata" to "reserve-label-data"
  6) simplify _DSM arg3 determination
  7) update the main changelog to reflect v3.

Changelog in v2:
- use little endian for the DSM method; thanks to Stefan for the suggestion

- introduce a new parameter, @configdata. If it is false, QEMU builds a
  static, read-only namespace in memory and uses it to serve DSM
  GET_CONFIG_SIZE/GET_CONFIG_DATA requests. In this case no reserved region
  is needed at the end of the @file, which is good for users who want to
  pass a whole nvdimm device through and make its data completely visible
  to the guest.

- divide the source code into separate files and add maintainer info

BTW, PCOMMIT virtualization on the KVM side is a work in progress and will
hopefully be posted next week.

====== Background ======
NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be supported
on Intel platforms. NVDIMMs are discovered via ACPI and configured by the
_DSM method of the NVDIMM device in ACPI. Supporting documents can be
found at:
ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
DSM Interface Example: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
Driver Writer's Guide: http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf

Currently the NVDIMM driver has been merged into the upstream Linux kernel,
and this patchset tries to enable it in the virtualization field.

====== Design ======
NVDIMM supports two access modes. One is PMEM, which maps the NVDIMM into
the CPU's address space so the CPU can access it directly as normal memory.
The other is BLK, which is used as a block device to reduce the consumption
of CPU address space.

BLK mode accesses the NVDIMM via a Command Register window and a Data
Register window. BLK virtualization has high overhead, since each sector
access causes at least two VM exits. So we currently only implement vPMEM
in this patchset.

--- vPMEM design ---
We introduce a new device named "nvdimm"; it uses a memory backend device
as NVDIMM memory. The file in the file-backend device can be a regular file
or a block device. We can use any file for testing or emulation; however,
in the real world, the files passed to the guest are:
- a regular file, created on an NVDIMM device on the host, in a filesystem
  with DAX enabled
- the raw PMEM device on the host, e.g. /dev/pmem0
Memory accesses through addresses created by mmap on these kinds of files
can directly reach the NVDIMM device on the host.

--- vConfigure data area design ---
Each NVDIMM device has a configuration data area which is used to store
label namespace data. To emulate this area, we divide the file into two
parts:
- the first part is (0, size - 128K], which is used as PMEM
- the 128K at the end of the file, which is used as the Label Data Area
That way the label namespace data persists across power loss or system
failure.

We also support passing the whole file to the guest without reserving any
region for the label data area; this is controlled by the
"reserve-label-data" parameter. If it is false, QEMU builds a static,
read-only namespace in memory which covers the whole file size. The
parameter is false by default.

--- _DSM method design ---
_DSM in ACPI is used to configure the NVDIMM. Currently we only allow
access to label namespace data, i.e. Get Namespace Label Size (Function
Index 4), Get Namespace Label Data (Function Index 5) and Set Namespace
Label Data (Function Index 6).

_DSM uses two pages to transfer data between ACPI and QEMU. The first page
is RAM-based and is used to pass the input of the _DSM method; QEMU reuses
it to store the output. The second page is MMIO-based; ACPI writes data to
this page to transfer control to QEMU.

====== Test ======
In host:
1) create a memory-backed file, e.g. # dd if=/dev/zero of=/tmp/nvdimm bs=1G count=10
2) append "-object memory-backend-file,share,id=mem1,
   mem-path=/tmp/nvdimm -device nvdimm,memdev=mem1,reserve-label-data,
   id=nv1" to the QEMU command line

In guest, download the latest upstream kernel (4.2 merge window) and enable
ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM.
1) insmod drivers/nvdimm/libnvdimm.ko
2) insmod drivers/acpi/nfit.ko
3) insmod drivers/nvdimm/nd_btt.ko
4) insmod drivers/nvdimm/nd_pmem.ko
You will see the whole nvdimm device used as a single namespace, and /dev/pmem0
appears. You can do anything with /dev/pmem0, including DAX access.

Currently the Linux NVDIMM driver does not support namespace operations on this
kind of PMEM; apply the changes below to support dynamic namespaces:

@@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *a
                        continue;
                }
 
-               if (nfit_mem->bdw && nfit_mem->memdev_pmem)
+               //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
+               if (nfit_mem->memdev_pmem)
                        flags |= NDD_ALIASING;

You can append another NVDIMM device in the guest and do:
# cd /sys/bus/nd/devices/
# cd namespace1.0/
# echo `uuidgen` > uuid
# echo `expr 1024 \* 1024 \* 128` > size
then reload nd_pmem.ko.

You will see /dev/pmem1 appear.

====== TODO ======
NVDIMM hotplug support

Xiao Guangrong (32):
  acpi: add aml_derefof
  acpi: add aml_sizeof
  acpi: add aml_create_field
  acpi: add aml_mutex, aml_acquire, aml_release
  acpi: add aml_concatenate
  acpi: add aml_object_type
  util: introduce qemu_file_get_page_size()
  exec: allow memory to be allocated from any kind of path
  exec: allow file_ram_alloc to work on file
  hostmem-file: clean up memory allocation
  hostmem-file: use whole file size if possible
  pc-dimm: remove DEFAULT_PC_DIMMSIZE
  pc-dimm: make pc_existing_dimms_capacity static and rename it
  pc-dimm: drop the prefix of pc-dimm
  stubs: rename qmp_pc_dimm_device_list.c
  pc-dimm: rename pc-dimm.c and pc-dimm.h
  dimm: abstract dimm device from pc-dimm
  dimm: get mapped memory region from DIMMDeviceClass->get_memory_region
  dimm: keep the state of the whole backend memory
  dimm: introduce realize callback
  nvdimm: implement NVDIMM device abstract
  nvdimm: init the address region used by NVDIMM ACPI
  nvdimm: build ACPI NFIT table
  nvdimm: init the address region used by DSM method
  nvdimm: build ACPI nvdimm devices
  nvdimm: save arg3 for NVDIMM device _DSM method
  nvdimm: support DSM_CMD_IMPLEMENTED function
  nvdimm: support DSM_CMD_NAMESPACE_LABEL_SIZE function
  nvdimm: support DSM_CMD_GET_NAMESPACE_LABEL_DATA
  nvdimm: support DSM_CMD_SET_NAMESPACE_LABEL_DATA
  nvdimm: allow using whole backend memory as pmem
  nvdimm: add maintain info

 MAINTAINERS                                        |   6 +
 backends/hostmem-file.c                            |  58 +-
 default-configs/i386-softmmu.mak                   |   2 +
 default-configs/x86_64-softmmu.mak                 |   2 +
 exec.c                                             | 113 ++-
 hmp.c                                              |   2 +-
 hw/Makefile.objs                                   |   2 +-
 hw/acpi/aml-build.c                                |  83 ++
 hw/acpi/ich9.c                                     |   8 +-
 hw/acpi/memory_hotplug.c                           |  26 +-
 hw/acpi/piix4.c                                    |   8 +-
 hw/i386/acpi-build.c                               |   4 +
 hw/i386/pc.c                                       |  37 +-
 hw/mem/Makefile.objs                               |   3 +
 hw/mem/{pc-dimm.c => dimm.c}                       | 240 ++---
 hw/mem/nvdimm/acpi.c                               | 961 +++++++++++++++++++++
 hw/mem/nvdimm/internal.h                           |  41 +
 hw/mem/nvdimm/namespace.c                          | 309 +++++++
 hw/mem/nvdimm/nvdimm.c                             | 136 +++
 hw/mem/pc-dimm.c                                   | 506 +----------
 hw/ppc/spapr.c                                     |  20 +-
 include/hw/acpi/aml-build.h                        |   8 +
 include/hw/i386/pc.h                               |   4 +-
 include/hw/mem/{pc-dimm.h => dimm.h}               |  68 +-
 include/hw/mem/nvdimm.h                            |  58 ++
 include/hw/mem/pc-dimm.h                           | 105 +--
 include/hw/ppc/spapr.h                             |   2 +-
 include/qemu/osdep.h                               |   1 +
 numa.c                                             |   4 +-
 qapi-schema.json                                   |   8 +-
 qmp.c                                              |   4 +-
 stubs/Makefile.objs                                |   2 +-
 ...c_dimm_device_list.c => qmp_dimm_device_list.c} |   4 +-
 target-ppc/kvm.c                                   |  21 +-
 trace-events                                       |   8 +-
 util/oslib-posix.c                                 |  16 +
 util/oslib-win32.c                                 |   5 +
 37 files changed, 2023 insertions(+), 862 deletions(-)
 rename hw/mem/{pc-dimm.c => dimm.c} (65%)
 create mode 100644 hw/mem/nvdimm/acpi.c
 create mode 100644 hw/mem/nvdimm/internal.h
 create mode 100644 hw/mem/nvdimm/namespace.c
 create mode 100644 hw/mem/nvdimm/nvdimm.c
 rewrite hw/mem/pc-dimm.c (91%)
 rename include/hw/mem/{pc-dimm.h => dimm.h} (50%)
 create mode 100644 include/hw/mem/nvdimm.h
 rewrite include/hw/mem/pc-dimm.h (97%)
 rename stubs/{qmp_pc_dimm_device_list.c => qmp_dimm_device_list.c} (56%)

-- 
1.8.3.1



* [PATCH v3 01/32] acpi: add aml_derefof
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Implement the DeRefOf term, which is used by the NVDIMM _DSM method in a later patch.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/acpi/aml-build.c         | 8 ++++++++
 include/hw/acpi/aml-build.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 0d4b324..cbd53f4 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1135,6 +1135,14 @@ Aml *aml_unicode(const char *str)
     return var;
 }
 
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefDerefOf */
+Aml *aml_derefof(Aml *arg)
+{
+    Aml *var = aml_opcode(0x83 /* DerefOfOp */);
+    aml_append(var, arg);
+    return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
              AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 1b632dc..5a03d33 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -274,6 +274,7 @@ Aml *aml_create_dword_field(Aml *srcbuf, Aml *index, const char *name);
 Aml *aml_varpackage(uint32_t num_elements);
 Aml *aml_touuid(const char *uuid);
 Aml *aml_unicode(const char *str);
+Aml *aml_derefof(Aml *arg);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
1.8.3.1



* [PATCH v3 02/32] acpi: add aml_sizeof
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Implement the SizeOf term, which is used by the NVDIMM _DSM method in a later patch.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/acpi/aml-build.c         | 8 ++++++++
 include/hw/acpi/aml-build.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index cbd53f4..a72214d 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1143,6 +1143,14 @@ Aml *aml_derefof(Aml *arg)
     return var;
 }
 
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefSizeOf */
+Aml *aml_sizeof(Aml *arg)
+{
+    Aml *var = aml_opcode(0x87 /* SizeOfOp */);
+    aml_append(var, arg);
+    return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
              AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 5a03d33..7296efb 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -275,6 +275,7 @@ Aml *aml_varpackage(uint32_t num_elements);
 Aml *aml_touuid(const char *uuid);
 Aml *aml_unicode(const char *str);
 Aml *aml_derefof(Aml *arg);
+Aml *aml_sizeof(Aml *arg);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread



* [PATCH v3 03/32] acpi: add aml_create_field
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Implement the CreateField term, which is used by the NVDIMM _DSM method in a later patch.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/acpi/aml-build.c         | 13 +++++++++++++
 include/hw/acpi/aml-build.h |  1 +
 2 files changed, 14 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index a72214d..9fe5e7b 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1151,6 +1151,19 @@ Aml *aml_sizeof(Aml *arg)
     return var;
 }
 
+/* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefCreateField */
+Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name)
+{
+    Aml *var = aml_alloc();
+    build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
+    build_append_byte(var->buf, 0x13); /* CreateFieldOp */
+    aml_append(var, srcbuf);
+    aml_append(var, index);
+    aml_append(var, len);
+    build_append_namestring(var->buf, "%s", name);
+    return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
              AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 7296efb..7e1c43b 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -276,6 +276,7 @@ Aml *aml_touuid(const char *uuid);
 Aml *aml_unicode(const char *str);
 Aml *aml_derefof(Aml *arg);
 Aml *aml_sizeof(Aml *arg);
+Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread


* [PATCH v3 04/32] acpi: add aml_mutex, aml_acquire, aml_release
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Implement the Mutex, Acquire and Release terms, which are used by the NVDIMM
_DSM method in a later patch.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/acpi/aml-build.c         | 32 ++++++++++++++++++++++++++++++++
 include/hw/acpi/aml-build.h |  3 +++
 2 files changed, 35 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 9fe5e7b..ab52692 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1164,6 +1164,38 @@ Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name)
     return var;
 }
 
+/* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefMutex */
+Aml *aml_mutex(const char *name, uint8_t flags)
+{
+    Aml *var = aml_alloc();
+    build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
+    build_append_byte(var->buf, 0x01); /* MutexOp */
+    build_append_namestring(var->buf, "%s", name);
+    build_append_byte(var->buf, flags);
+    return var;
+}
+
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefAcquire */
+Aml *aml_acquire(Aml *mutex, uint16_t timeout)
+{
+    Aml *var = aml_alloc();
+    build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
+    build_append_byte(var->buf, 0x23); /* AcquireOp */
+    aml_append(var, mutex);
+    build_append_int_noprefix(var->buf, timeout, sizeof(timeout));
+    return var;
+}
+
+/* ACPI 1.0b: 16.2.5.3 Type 1 Opcodes Encoding: DefRelease */
+Aml *aml_release(Aml *mutex)
+{
+    Aml *var = aml_alloc();
+    build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
+    build_append_byte(var->buf, 0x27); /* ReleaseOp */
+    aml_append(var, mutex);
+    return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
              AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 7e1c43b..d494c0c 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -277,6 +277,9 @@ Aml *aml_unicode(const char *str);
 Aml *aml_derefof(Aml *arg);
 Aml *aml_sizeof(Aml *arg);
 Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name);
+Aml *aml_mutex(const char *name, uint8_t flags);
+Aml *aml_acquire(Aml *mutex, uint16_t timeout);
+Aml *aml_release(Aml *mutex);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread


* [PATCH v3 05/32] acpi: add aml_concatenate
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Implement the Concatenate term, which is used by the NVDIMM _DSM method
in a later patch.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/acpi/aml-build.c         | 14 ++++++++++++++
 include/hw/acpi/aml-build.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index ab52692..d3b071f 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1196,6 +1196,20 @@ Aml *aml_release(Aml *mutex)
     return var;
 }
 
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefConcat */
+Aml *aml_concatenate(Aml *source1, Aml *source2, Aml *target)
+{
+    Aml *var = aml_opcode(0x73 /* ConcatOp */);
+    aml_append(var, source1);
+    aml_append(var, source2);
+
+    if (target) {
+        aml_append(var, target);
+    }
+
+    return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
              AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index d494c0c..d4b6d10 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -280,6 +280,7 @@ Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name);
 Aml *aml_mutex(const char *name, uint8_t flags);
 Aml *aml_acquire(Aml *mutex, uint16_t timeout);
 Aml *aml_release(Aml *mutex);
+Aml *aml_concatenate(Aml *source1, Aml *source2, Aml *target);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread


* [PATCH v3 06/32] acpi: add aml_object_type
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Implement the ObjectType term, which is used by the NVDIMM _DSM method in
a later patch.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/acpi/aml-build.c         | 8 ++++++++
 include/hw/acpi/aml-build.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index d3b071f..c5639b5 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1210,6 +1210,14 @@ Aml *aml_concatenate(Aml *source1, Aml *source2, Aml *target)
     return var;
 }
 
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefObjectType */
+Aml *aml_object_type(Aml *object)
+{
+    Aml *var = aml_opcode(0x8E /* ObjectTypeOp */);
+    aml_append(var, object);
+    return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
              AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index d4b6d10..77ff965 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -281,6 +281,7 @@ Aml *aml_mutex(const char *name, uint8_t flags);
 Aml *aml_acquire(Aml *mutex, uint16_t timeout);
 Aml *aml_release(Aml *mutex);
 Aml *aml_concatenate(Aml *source1, Aml *source2, Aml *target);
+Aml *aml_object_type(Aml *object);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread


* [PATCH v3 07/32] util: introduce qemu_file_get_page_size()
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

There are three places that use the same logic to get the page size from a
file path or a file fd.

This patch introduces qemu_file_get_page_size() to unify the code.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 include/qemu/osdep.h |  1 +
 target-ppc/kvm.c     | 21 +++------------------
 util/oslib-posix.c   | 16 ++++++++++++++++
 util/oslib-win32.c   |  5 +++++
 4 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index ef21efb..9c8c0c4 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -286,4 +286,5 @@ void os_mem_prealloc(int fd, char *area, size_t sz);
 
 int qemu_read_password(char *buf, int buf_size);
 
+size_t qemu_file_get_page_size(const char *mem_path);
 #endif
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index f8ea783..ed3424e 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -306,28 +306,13 @@ static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
 
 static long gethugepagesize(const char *mem_path)
 {
-    struct statfs fs;
-    int ret;
-
-    do {
-        ret = statfs(mem_path, &fs);
-    } while (ret != 0 && errno == EINTR);
+    long size = qemu_file_get_page_size(mem_path);
 
-    if (ret != 0) {
-        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
-                strerror(errno));
+    if (!size) {
         exit(1);
     }
 
-#define HUGETLBFS_MAGIC       0x958458f6
-
-    if (fs.f_type != HUGETLBFS_MAGIC) {
-        /* Explicit mempath, but it's ordinary pages */
-        return getpagesize();
-    }
-
-    /* It's hugepage, return the huge page size */
-    return fs.f_bsize;
+    return size;
 }
 
 static int find_max_supported_pagesize(Object *obj, void *opaque)
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index a0fcdc2..6b5c612 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -380,6 +380,22 @@ static size_t fd_getpagesize(int fd)
     return getpagesize();
 }
 
+size_t qemu_file_get_page_size(const char *path)
+{
+    size_t size = 0;
+    int fd = qemu_open(path, O_RDONLY);
+
+    if (fd < 0) {
+        fprintf(stderr, "Could not open %s.\n", path);
+        goto exit;
+    }
+
+    size = fd_getpagesize(fd);
+    qemu_close(fd);
+exit:
+    return size;
+}
+
 void os_mem_prealloc(int fd, char *area, size_t memory)
 {
     int ret;
diff --git a/util/oslib-win32.c b/util/oslib-win32.c
index 08f5a9c..1ff1fae 100644
--- a/util/oslib-win32.c
+++ b/util/oslib-win32.c
@@ -462,6 +462,11 @@ size_t getpagesize(void)
     return system_info.dwPageSize;
 }
 
+size_t qemu_file_get_page_size(const char *path)
+{
+    return getpagesize();
+}
+
 void os_mem_prealloc(int fd, char *area, size_t memory)
 {
     int i;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread


* [PATCH v3 08/32] exec: allow memory to be allocated from any kind of path
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Currently file_ram_alloc() is designed for hugetlbfs; however, the memory of
an nvdimm can come either from a raw pmem device, e.g. /dev/pmem, or from a
file located on a DAX-enabled filesystem.

So this patch lets it work on any kind of path.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 exec.c | 55 ++++++++++++++-----------------------------------------
 1 file changed, 14 insertions(+), 41 deletions(-)

diff --git a/exec.c b/exec.c
index 7d90a52..70cb0ef 100644
--- a/exec.c
+++ b/exec.c
@@ -1154,32 +1154,6 @@ void qemu_mutex_unlock_ramlist(void)
 }
 
 #ifdef __linux__
-
-#include <sys/vfs.h>
-
-#define HUGETLBFS_MAGIC       0x958458f6
-
-static long gethugepagesize(const char *path, Error **errp)
-{
-    struct statfs fs;
-    int ret;
-
-    do {
-        ret = statfs(path, &fs);
-    } while (ret != 0 && errno == EINTR);
-
-    if (ret != 0) {
-        error_setg_errno(errp, errno, "failed to get page size of file %s",
-                         path);
-        return 0;
-    }
-
-    if (fs.f_type != HUGETLBFS_MAGIC)
-        fprintf(stderr, "Warning: path not on HugeTLBFS: %s\n", path);
-
-    return fs.f_bsize;
-}
-
 static void *file_ram_alloc(RAMBlock *block,
                             ram_addr_t memory,
                             const char *path,
@@ -1191,22 +1165,21 @@ static void *file_ram_alloc(RAMBlock *block,
     void *ptr;
     void *area = NULL;
     int fd;
-    uint64_t hpagesize;
+    uint64_t pagesize;
     uint64_t total;
-    Error *local_err = NULL;
     size_t offset;
 
-    hpagesize = gethugepagesize(path, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
+    pagesize = qemu_file_get_page_size(path);
+    if (!pagesize) {
+        error_setg(errp, "can't get page size for %s", path);
         goto error;
     }
-    block->mr->align = hpagesize;
+    block->mr->align = pagesize;
 
-    if (memory < hpagesize) {
+    if (memory < pagesize) {
         error_setg(errp, "memory size 0x" RAM_ADDR_FMT " must be equal to "
-                   "or larger than huge page size 0x%" PRIx64,
-                   memory, hpagesize);
+                   "or larger than page size 0x%" PRIx64,
+                   memory, pagesize);
         goto error;
     }
 
@@ -1230,15 +1203,15 @@ static void *file_ram_alloc(RAMBlock *block,
     fd = mkstemp(filename);
     if (fd < 0) {
         error_setg_errno(errp, errno,
-                         "unable to create backing store for hugepages");
+                         "unable to create backing store for path %s", path);
         g_free(filename);
         goto error;
     }
     unlink(filename);
     g_free(filename);
 
-    memory = ROUND_UP(memory, hpagesize);
-    total = memory + hpagesize;
+    memory = ROUND_UP(memory, pagesize);
+    total = memory + pagesize;
 
     /*
      * ftruncate is not supported by hugetlbfs in older
@@ -1254,12 +1227,12 @@ static void *file_ram_alloc(RAMBlock *block,
                 -1, 0);
     if (ptr == MAP_FAILED) {
         error_setg_errno(errp, errno,
-                         "unable to allocate memory range for hugepages");
+                         "unable to allocate memory range for path %s", path);
         close(fd);
         goto error;
     }
 
-    offset = QEMU_ALIGN_UP((uintptr_t)ptr, hpagesize) - (uintptr_t)ptr;
+    offset = QEMU_ALIGN_UP((uintptr_t)ptr, pagesize) - (uintptr_t)ptr;
 
     area = mmap(ptr + offset, memory, PROT_READ | PROT_WRITE,
                 (block->flags & RAM_SHARED ? MAP_SHARED : MAP_PRIVATE) |
@@ -1267,7 +1240,7 @@ static void *file_ram_alloc(RAMBlock *block,
                 fd, 0);
     if (area == MAP_FAILED) {
         error_setg_errno(errp, errno,
-                         "unable to map backing store for hugepages");
+                         "unable to map backing store for path %s", path);
         munmap(ptr, total);
         close(fd);
         goto error;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread


* [PATCH v3 09/32] exec: allow file_ram_alloc to work on file
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Currently, file_ram_alloc() only works on a directory: it creates a file
under @path and mmaps it.

This patch allows it to work on a file directly: if @path is a directory
it works as before; otherwise it treats @path as the target file and
allocates memory from it directly.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 exec.c | 82 ++++++++++++++++++++++++++++++++++++++++++------------------------
 1 file changed, 52 insertions(+), 30 deletions(-)

diff --git a/exec.c b/exec.c
index 70cb0ef..c8c7e12 100644
--- a/exec.c
+++ b/exec.c
@@ -1154,14 +1154,60 @@ void qemu_mutex_unlock_ramlist(void)
 }
 
 #ifdef __linux__
+static bool path_is_dir(const char *path)
+{
+    struct stat fs;
+
+    return stat(path, &fs) == 0 && S_ISDIR(fs.st_mode);
+}
+
+static int open_file_path(RAMBlock *block, const char *path, size_t size)
+{
+    char *filename;
+    char *sanitized_name;
+    char *c;
+    int fd;
+
+    if (!path_is_dir(path)) {
+        int flags = (block->flags & RAM_SHARED) ? O_RDWR : O_RDONLY;
+
+        flags |= O_EXCL;
+        return open(path, flags);
+    }
+
+    /* Make name safe to use with mkstemp by replacing '/' with '_'. */
+    sanitized_name = g_strdup(memory_region_name(block->mr));
+    for (c = sanitized_name; *c != '\0'; c++) {
+        if (*c == '/') {
+            *c = '_';
+        }
+    }
+    filename = g_strdup_printf("%s/qemu_back_mem.%s.XXXXXX", path,
+                               sanitized_name);
+    g_free(sanitized_name);
+    fd = mkstemp(filename);
+    if (fd >= 0) {
+        unlink(filename);
+        /*
+         * ftruncate is not supported by hugetlbfs in older
+         * hosts, so don't bother bailing out on errors.
+         * If anything goes wrong with it under other filesystems,
+         * mmap will fail.
+         */
+        if (ftruncate(fd, size)) {
+            perror("ftruncate");
+        }
+    }
+    g_free(filename);
+
+    return fd;
+}
+
 static void *file_ram_alloc(RAMBlock *block,
                             ram_addr_t memory,
                             const char *path,
                             Error **errp)
 {
-    char *filename;
-    char *sanitized_name;
-    char *c;
     void *ptr;
     void *area = NULL;
     int fd;
@@ -1189,39 +1235,15 @@ static void *file_ram_alloc(RAMBlock *block,
         goto error;
     }
 
-    /* Make name safe to use with mkstemp by replacing '/' with '_'. */
-    sanitized_name = g_strdup(memory_region_name(block->mr));
-    for (c = sanitized_name; *c != '\0'; c++) {
-        if (*c == '/')
-            *c = '_';
-    }
-
-    filename = g_strdup_printf("%s/qemu_back_mem.%s.XXXXXX", path,
-                               sanitized_name);
-    g_free(sanitized_name);
+    memory = ROUND_UP(memory, pagesize);
+    total = memory + pagesize;
 
-    fd = mkstemp(filename);
+    fd = open_file_path(block, path, memory);
     if (fd < 0) {
         error_setg_errno(errp, errno,
                          "unable to create backing store for path %s", path);
-        g_free(filename);
         goto error;
     }
-    unlink(filename);
-    g_free(filename);
-
-    memory = ROUND_UP(memory, pagesize);
-    total = memory + pagesize;
-
-    /*
-     * ftruncate is not supported by hugetlbfs in older
-     * hosts, so don't bother bailing out on errors.
-     * If anything goes wrong with it under other filesystems,
-     * mmap will fail.
-     */
-    if (ftruncate(fd, memory)) {
-        perror("ftruncate");
-    }
 
     ptr = mmap(0, total, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS,
                 -1, 0);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH v3 10/32] hostmem-file: clean up memory allocation
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

- hostmem-file.c is compiled only if CONFIG_LINUX is enabled, so it is
  unnecessary to repeat the same check in the source file

- the interface, HostMemoryBackendClass->alloc(), is not called multiple
  times, so there is no need to check whether the memory region is
  already initialized

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 backends/hostmem-file.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index e9b6d21..9097a57 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -46,17 +46,12 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
         error_setg(errp, "mem-path property not set");
         return;
     }
-#ifndef CONFIG_LINUX
-    error_setg(errp, "-mem-path not supported on this host");
-#else
-    if (!memory_region_size(&backend->mr)) {
-        backend->force_prealloc = mem_prealloc;
-        memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
+
+    backend->force_prealloc = mem_prealloc;
+    memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
                                  object_get_canonical_path(OBJECT(backend)),
                                  backend->size, fb->share,
                                  fb->mem_path, errp);
-    }
-#endif
 }
 
 static void
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH v3 11/32] hostmem-file: use whole file size if possible
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Use the whole file size if @size is not specified, which is useful if we
want to pass a file directly to the guest.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 backends/hostmem-file.c | 47 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 43 insertions(+), 4 deletions(-)

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 9097a57..adf2835 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -9,6 +9,9 @@
  * This work is licensed under the terms of the GNU GPL, version 2 or later.
  * See the COPYING file in the top-level directory.
  */
+#include <sys/ioctl.h>
+#include <linux/fs.h>
+
 #include "qemu-common.h"
 #include "sysemu/hostmem.h"
 #include "sysemu/sysemu.h"
@@ -33,20 +36,56 @@ struct HostMemoryBackendFile {
     char *mem_path;
 };
 
+static uint64_t get_file_size(const char *file)
+{
+    struct stat stat_buf;
+    uint64_t size = 0;
+    int fd;
+
+    fd = open(file, O_RDONLY);
+    if (fd < 0) {
+        return 0;
+    }
+
+    if (stat(file, &stat_buf) < 0) {
+        goto exit;
+    }
+
+    if ((S_ISBLK(stat_buf.st_mode)) && !ioctl(fd, BLKGETSIZE64, &size)) {
+        goto exit;
+    }
+
+    size = lseek(fd, 0, SEEK_END);
+    if (size == -1) {
+        size = 0;
+    }
+exit:
+    close(fd);
+    return size;
+}
+
 static void
 file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
 {
     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(backend);
 
-    if (!backend->size) {
-        error_setg(errp, "can't create backend with size 0");
-        return;
-    }
     if (!fb->mem_path) {
         error_setg(errp, "mem-path property not set");
         return;
     }
 
+    if (!backend->size) {
+        /*
+         * use the whole file size if @size is not specified.
+         */
+        backend->size = get_file_size(fb->mem_path);
+    }
+
+    if (!backend->size) {
+        error_setg(errp, "can't create backend with size 0");
+        return;
+    }
+
     backend->force_prealloc = mem_prealloc;
     memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
                                  object_get_canonical_path(OBJECT(backend)),
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH v3 12/32] pc-dimm: remove DEFAULT_PC_DIMMSIZE
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

It is no longer used.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 include/hw/mem/pc-dimm.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
index c1ee7b0..15590f1 100644
--- a/include/hw/mem/pc-dimm.h
+++ b/include/hw/mem/pc-dimm.h
@@ -20,8 +20,6 @@
 #include "sysemu/hostmem.h"
 #include "hw/qdev.h"
 
-#define DEFAULT_PC_DIMMSIZE (1024*1024*1024)
-
 #define TYPE_PC_DIMM "pc-dimm"
 #define PC_DIMM(obj) \
     OBJECT_CHECK(PCDIMMDevice, (obj), TYPE_PC_DIMM)
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH v3 13/32] pc-dimm: make pc_existing_dimms_capacity static and rename it
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

pc_existing_dimms_capacity() can be made static since it is not used
outside of pc-dimm.c. Also drop the pc_ prefix to prepare for the work
that abstracts a dimm device type out of pc-dimm.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/pc-dimm.c         | 73 ++++++++++++++++++++++++------------------------
 include/hw/mem/pc-dimm.h |  1 -
 2 files changed, 36 insertions(+), 38 deletions(-)

diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index 506fe0d..a581622 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -31,6 +31,38 @@ typedef struct pc_dimms_capacity {
      Error    **errp;
 } pc_dimms_capacity;
 
+static int existing_dimms_capacity_internal(Object *obj, void *opaque)
+{
+    pc_dimms_capacity *cap = opaque;
+    uint64_t *size = &cap->size;
+
+    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
+        DeviceState *dev = DEVICE(obj);
+
+        if (dev->realized) {
+            (*size) += object_property_get_int(obj, PC_DIMM_SIZE_PROP,
+                cap->errp);
+        }
+
+        if (cap->errp && *cap->errp) {
+            return 1;
+        }
+    }
+    object_child_foreach(obj, existing_dimms_capacity_internal, opaque);
+    return 0;
+}
+
+static uint64_t existing_dimms_capacity(Error **errp)
+{
+    pc_dimms_capacity cap;
+
+    cap.size = 0;
+    cap.errp = errp;
+
+    existing_dimms_capacity_internal(qdev_get_machine(), &cap);
+    return cap.size;
+}
+
 void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
                          MemoryRegion *mr, uint64_t align, bool gap,
                          Error **errp)
@@ -39,7 +71,7 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
     MachineState *machine = MACHINE(qdev_get_machine());
     PCDIMMDevice *dimm = PC_DIMM(dev);
     Error *local_err = NULL;
-    uint64_t existing_dimms_capacity = 0;
+    uint64_t dimms_capacity = 0;
     uint64_t addr;
 
     addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, &local_err);
@@ -55,17 +87,16 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
         goto out;
     }
 
-    existing_dimms_capacity = pc_existing_dimms_capacity(&local_err);
+    dimms_capacity = existing_dimms_capacity(&local_err);
     if (local_err) {
         goto out;
     }
 
-    if (existing_dimms_capacity + memory_region_size(mr) >
+    if (dimms_capacity + memory_region_size(mr) >
         machine->maxram_size - machine->ram_size) {
         error_setg(&local_err, "not enough space, currently 0x%" PRIx64
                    " in use of total hot pluggable 0x" RAM_ADDR_FMT,
-                   existing_dimms_capacity,
-                   machine->maxram_size - machine->ram_size);
+                   dimms_capacity, machine->maxram_size - machine->ram_size);
         goto out;
     }
 
@@ -114,38 +145,6 @@ void pc_dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms,
     vmstate_unregister_ram(mr, dev);
 }
 
-static int pc_existing_dimms_capacity_internal(Object *obj, void *opaque)
-{
-    pc_dimms_capacity *cap = opaque;
-    uint64_t *size = &cap->size;
-
-    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
-        DeviceState *dev = DEVICE(obj);
-
-        if (dev->realized) {
-            (*size) += object_property_get_int(obj, PC_DIMM_SIZE_PROP,
-                cap->errp);
-        }
-
-        if (cap->errp && *cap->errp) {
-            return 1;
-        }
-    }
-    object_child_foreach(obj, pc_existing_dimms_capacity_internal, opaque);
-    return 0;
-}
-
-uint64_t pc_existing_dimms_capacity(Error **errp)
-{
-    pc_dimms_capacity cap;
-
-    cap.size = 0;
-    cap.errp = errp;
-
-    pc_existing_dimms_capacity_internal(qdev_get_machine(), &cap);
-    return cap.size;
-}
-
 int qmp_pc_dimm_device_list(Object *obj, void *opaque)
 {
     MemoryDeviceInfoList ***prev = opaque;
diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
index 15590f1..c1e5774 100644
--- a/include/hw/mem/pc-dimm.h
+++ b/include/hw/mem/pc-dimm.h
@@ -87,7 +87,6 @@ uint64_t pc_dimm_get_free_addr(uint64_t address_space_start,
 int pc_dimm_get_free_slot(const int *hint, int max_slots, Error **errp);
 
 int qmp_pc_dimm_device_list(Object *obj, void *opaque);
-uint64_t pc_existing_dimms_capacity(Error **errp);
 void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
                          MemoryRegion *mr, uint64_t align, bool gap,
                          Error **errp);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH v3 14/32] pc-dimm: drop the prefix of pc-dimm
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

This patch was generated by this script:

find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
| xargs sed -i "s/PC_DIMM/DIMM/g"

find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
| xargs sed -i "s/PCDIMM/DIMM/g"

find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
| xargs sed -i "s/pc_dimm/dimm/g"

find ./ -name "trace-events" -type f | xargs sed -i "s/pc-dimm/dimm/g"

It prepares for the work that abstracts a common dimm device type for both
pc-dimm and nvdimm.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hmp.c                           |   2 +-
 hw/acpi/ich9.c                  |   6 +-
 hw/acpi/memory_hotplug.c        |  16 ++---
 hw/acpi/piix4.c                 |   6 +-
 hw/i386/pc.c                    |  32 ++++-----
 hw/mem/pc-dimm.c                | 148 ++++++++++++++++++++--------------------
 hw/ppc/spapr.c                  |  18 ++---
 include/hw/mem/pc-dimm.h        |  62 ++++++++---------
 numa.c                          |   2 +-
 qapi-schema.json                |   8 +--
 qmp.c                           |   2 +-
 stubs/qmp_pc_dimm_device_list.c |   2 +-
 trace-events                    |   8 +--
 13 files changed, 156 insertions(+), 156 deletions(-)

diff --git a/hmp.c b/hmp.c
index 5048eee..5c617d2 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1952,7 +1952,7 @@ void hmp_info_memory_devices(Monitor *mon, const QDict *qdict)
     MemoryDeviceInfoList *info_list = qmp_query_memory_devices(&err);
     MemoryDeviceInfoList *info;
     MemoryDeviceInfo *value;
-    PCDIMMDeviceInfo *di;
+    DIMMDeviceInfo *di;
 
     for (info = info_list; info; info = info->next) {
         value = info->value;
diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 1c7fcfa..b0d6a67 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -440,7 +440,7 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm, Error **errp)
 void ich9_pm_device_plug_cb(ICH9LPCPMRegs *pm, DeviceState *dev, Error **errp)
 {
     if (pm->acpi_memory_hotplug.is_enabled &&
-        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         acpi_memory_plug_cb(&pm->acpi_regs, pm->irq, &pm->acpi_memory_hotplug,
                             dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
@@ -455,7 +455,7 @@ void ich9_pm_device_unplug_request_cb(ICH9LPCPMRegs *pm, DeviceState *dev,
                                       Error **errp)
 {
     if (pm->acpi_memory_hotplug.is_enabled &&
-        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         acpi_memory_unplug_request_cb(&pm->acpi_regs, pm->irq,
                                       &pm->acpi_memory_hotplug, dev, errp);
     } else {
@@ -468,7 +468,7 @@ void ich9_pm_device_unplug_cb(ICH9LPCPMRegs *pm, DeviceState *dev,
                               Error **errp)
 {
     if (pm->acpi_memory_hotplug.is_enabled &&
-        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         acpi_memory_unplug_cb(&pm->acpi_memory_hotplug, dev, errp);
     } else {
         error_setg(errp, "acpi: device unplug for not supported device"
diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
index 2ff0d5c..1f6cccc 100644
--- a/hw/acpi/memory_hotplug.c
+++ b/hw/acpi/memory_hotplug.c
@@ -54,23 +54,23 @@ static uint64_t acpi_memory_hotplug_read(void *opaque, hwaddr addr,
     o = OBJECT(mdev->dimm);
     switch (addr) {
     case 0x0: /* Lo part of phys address where DIMM is mapped */
-        val = o ? object_property_get_int(o, PC_DIMM_ADDR_PROP, NULL) : 0;
+        val = o ? object_property_get_int(o, DIMM_ADDR_PROP, NULL) : 0;
         trace_mhp_acpi_read_addr_lo(mem_st->selector, val);
         break;
     case 0x4: /* Hi part of phys address where DIMM is mapped */
-        val = o ? object_property_get_int(o, PC_DIMM_ADDR_PROP, NULL) >> 32 : 0;
+        val = o ? object_property_get_int(o, DIMM_ADDR_PROP, NULL) >> 32 : 0;
         trace_mhp_acpi_read_addr_hi(mem_st->selector, val);
         break;
     case 0x8: /* Lo part of DIMM size */
-        val = o ? object_property_get_int(o, PC_DIMM_SIZE_PROP, NULL) : 0;
+        val = o ? object_property_get_int(o, DIMM_SIZE_PROP, NULL) : 0;
         trace_mhp_acpi_read_size_lo(mem_st->selector, val);
         break;
     case 0xc: /* Hi part of DIMM size */
-        val = o ? object_property_get_int(o, PC_DIMM_SIZE_PROP, NULL) >> 32 : 0;
+        val = o ? object_property_get_int(o, DIMM_SIZE_PROP, NULL) >> 32 : 0;
         trace_mhp_acpi_read_size_hi(mem_st->selector, val);
         break;
     case 0x10: /* node proximity for _PXM method */
-        val = o ? object_property_get_int(o, PC_DIMM_NODE_PROP, NULL) : 0;
+        val = o ? object_property_get_int(o, DIMM_NODE_PROP, NULL) : 0;
         trace_mhp_acpi_read_pxm(mem_st->selector, val);
         break;
     case 0x14: /* pack and return is_* fields */
@@ -151,13 +151,13 @@ static void acpi_memory_hotplug_write(void *opaque, hwaddr addr, uint64_t data,
             /* call pc-dimm unplug cb */
             hotplug_handler_unplug(hotplug_ctrl, dev, &local_err);
             if (local_err) {
-                trace_mhp_acpi_pc_dimm_delete_failed(mem_st->selector);
+                trace_mhp_acpi_dimm_delete_failed(mem_st->selector);
                 qapi_event_send_mem_unplug_error(dev->id,
                                                  error_get_pretty(local_err),
                                                  &error_abort);
                 break;
             }
-            trace_mhp_acpi_pc_dimm_deleted(mem_st->selector);
+            trace_mhp_acpi_dimm_deleted(mem_st->selector);
         }
         break;
     default:
@@ -206,7 +206,7 @@ acpi_memory_slot_status(MemHotplugState *mem_st,
                         DeviceState *dev, Error **errp)
 {
     Error *local_err = NULL;
-    int slot = object_property_get_int(OBJECT(dev), PC_DIMM_SLOT_PROP,
+    int slot = object_property_get_int(OBJECT(dev), DIMM_SLOT_PROP,
                                        &local_err);
 
     if (local_err) {
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 2cd2fee..0b2cb6e 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -344,7 +344,7 @@ static void piix4_device_plug_cb(HotplugHandler *hotplug_dev,
     PIIX4PMState *s = PIIX4_PM(hotplug_dev);
 
     if (s->acpi_memory_hotplug.is_enabled &&
-        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         acpi_memory_plug_cb(&s->ar, s->irq, &s->acpi_memory_hotplug, dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
         acpi_pcihp_device_plug_cb(&s->ar, s->irq, &s->acpi_pci_hotplug, dev,
@@ -363,7 +363,7 @@ static void piix4_device_unplug_request_cb(HotplugHandler *hotplug_dev,
     PIIX4PMState *s = PIIX4_PM(hotplug_dev);
 
     if (s->acpi_memory_hotplug.is_enabled &&
-        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         acpi_memory_unplug_request_cb(&s->ar, s->irq, &s->acpi_memory_hotplug,
                                       dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
@@ -381,7 +381,7 @@ static void piix4_device_unplug_cb(HotplugHandler *hotplug_dev,
     PIIX4PMState *s = PIIX4_PM(hotplug_dev);
 
     if (s->acpi_memory_hotplug.is_enabled &&
-        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         acpi_memory_unplug_cb(&s->acpi_memory_hotplug, dev, errp);
     } else {
         error_setg(errp, "acpi: device unplug for not supported device"
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 682867a..d6b9fa7 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1609,15 +1609,15 @@ void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
     }
 }
 
-static void pc_dimm_plug(HotplugHandler *hotplug_dev,
+static void dimm_plug(HotplugHandler *hotplug_dev,
                          DeviceState *dev, Error **errp)
 {
     HotplugHandlerClass *hhc;
     Error *local_err = NULL;
     PCMachineState *pcms = PC_MACHINE(hotplug_dev);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
-    PCDIMMDevice *dimm = PC_DIMM(dev);
-    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
+    DIMMDevice *dimm = DIMM(dev);
+    DIMMDeviceClass *ddc = DIMM_GET_CLASS(dimm);
     MemoryRegion *mr = ddc->get_memory_region(dimm);
     uint64_t align = TARGET_PAGE_SIZE;
 
@@ -1631,7 +1631,7 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev,
         goto out;
     }
 
-    pc_dimm_memory_plug(dev, &pcms->hotplug_memory, mr, align,
+    dimm_memory_plug(dev, &pcms->hotplug_memory, mr, align,
                         pcmc->inter_dimm_gap, &local_err);
     if (local_err) {
         goto out;
@@ -1643,7 +1643,7 @@ out:
     error_propagate(errp, local_err);
 }
 
-static void pc_dimm_unplug_request(HotplugHandler *hotplug_dev,
+static void dimm_unplug_request(HotplugHandler *hotplug_dev,
                                    DeviceState *dev, Error **errp)
 {
     HotplugHandlerClass *hhc;
@@ -1663,12 +1663,12 @@ out:
     error_propagate(errp, local_err);
 }
 
-static void pc_dimm_unplug(HotplugHandler *hotplug_dev,
+static void dimm_unplug(HotplugHandler *hotplug_dev,
                            DeviceState *dev, Error **errp)
 {
     PCMachineState *pcms = PC_MACHINE(hotplug_dev);
-    PCDIMMDevice *dimm = PC_DIMM(dev);
-    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
+    DIMMDevice *dimm = DIMM(dev);
+    DIMMDeviceClass *ddc = DIMM_GET_CLASS(dimm);
     MemoryRegion *mr = ddc->get_memory_region(dimm);
     HotplugHandlerClass *hhc;
     Error *local_err = NULL;
@@ -1680,7 +1680,7 @@ static void pc_dimm_unplug(HotplugHandler *hotplug_dev,
         goto out;
     }
 
-    pc_dimm_memory_unplug(dev, &pcms->hotplug_memory, mr);
+    dimm_memory_unplug(dev, &pcms->hotplug_memory, mr);
     object_unparent(OBJECT(dev));
 
  out:
@@ -1719,8 +1719,8 @@ out:
 static void pc_machine_device_plug_cb(HotplugHandler *hotplug_dev,
                                       DeviceState *dev, Error **errp)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
-        pc_dimm_plug(hotplug_dev, dev, errp);
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
+        dimm_plug(hotplug_dev, dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
         pc_cpu_plug(hotplug_dev, dev, errp);
     }
@@ -1729,8 +1729,8 @@ static void pc_machine_device_plug_cb(HotplugHandler *hotplug_dev,
 static void pc_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
                                                 DeviceState *dev, Error **errp)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
-        pc_dimm_unplug_request(hotplug_dev, dev, errp);
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
+        dimm_unplug_request(hotplug_dev, dev, errp);
     } else {
         error_setg(errp, "acpi: device unplug request for not supported device"
                    " type: %s", object_get_typename(OBJECT(dev)));
@@ -1740,8 +1740,8 @@ static void pc_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
 static void pc_machine_device_unplug_cb(HotplugHandler *hotplug_dev,
                                         DeviceState *dev, Error **errp)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
-        pc_dimm_unplug(hotplug_dev, dev, errp);
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
+        dimm_unplug(hotplug_dev, dev, errp);
     } else {
         error_setg(errp, "acpi: device unplug for not supported device"
                    " type: %s", object_get_typename(OBJECT(dev)));
@@ -1753,7 +1753,7 @@ static HotplugHandler *pc_get_hotpug_handler(MachineState *machine,
 {
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine);
 
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM) ||
         object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
         return HOTPLUG_HANDLER(machine);
     }
diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index a581622..9e26bf7 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -26,21 +26,21 @@
 #include "sysemu/kvm.h"
 #include "trace.h"
 
-typedef struct pc_dimms_capacity {
+typedef struct dimms_capacity {
      uint64_t size;
      Error    **errp;
-} pc_dimms_capacity;
+} dimms_capacity;
 
 static int existing_dimms_capacity_internal(Object *obj, void *opaque)
 {
-    pc_dimms_capacity *cap = opaque;
+    dimms_capacity *cap = opaque;
     uint64_t *size = &cap->size;
 
-    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(obj, TYPE_DIMM)) {
         DeviceState *dev = DEVICE(obj);
 
         if (dev->realized) {
-            (*size) += object_property_get_int(obj, PC_DIMM_SIZE_PROP,
+            (*size) += object_property_get_int(obj, DIMM_SIZE_PROP,
                 cap->errp);
         }
 
@@ -54,7 +54,7 @@ static int existing_dimms_capacity_internal(Object *obj, void *opaque)
 
 static uint64_t existing_dimms_capacity(Error **errp)
 {
-    pc_dimms_capacity cap;
+    dimms_capacity cap;
 
     cap.size = 0;
     cap.errp = errp;
@@ -63,23 +63,23 @@ static uint64_t existing_dimms_capacity(Error **errp)
     return cap.size;
 }
 
-void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
+void dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
                          MemoryRegion *mr, uint64_t align, bool gap,
                          Error **errp)
 {
     int slot;
     MachineState *machine = MACHINE(qdev_get_machine());
-    PCDIMMDevice *dimm = PC_DIMM(dev);
+    DIMMDevice *dimm = DIMM(dev);
     Error *local_err = NULL;
     uint64_t dimms_capacity = 0;
     uint64_t addr;
 
-    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, &local_err);
+    addr = object_property_get_int(OBJECT(dimm), DIMM_ADDR_PROP, &local_err);
     if (local_err) {
         goto out;
     }
 
-    addr = pc_dimm_get_free_addr(hpms->base,
+    addr = dimm_get_free_addr(hpms->base,
                                  memory_region_size(&hpms->mr),
                                  !addr ? NULL : &addr, align, gap,
                                  memory_region_size(mr), &local_err);
@@ -100,27 +100,27 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
         goto out;
     }
 
-    object_property_set_int(OBJECT(dev), addr, PC_DIMM_ADDR_PROP, &local_err);
+    object_property_set_int(OBJECT(dev), addr, DIMM_ADDR_PROP, &local_err);
     if (local_err) {
         goto out;
     }
-    trace_mhp_pc_dimm_assigned_address(addr);
+    trace_mhp_dimm_assigned_address(addr);
 
-    slot = object_property_get_int(OBJECT(dev), PC_DIMM_SLOT_PROP, &local_err);
+    slot = object_property_get_int(OBJECT(dev), DIMM_SLOT_PROP, &local_err);
     if (local_err) {
         goto out;
     }
 
-    slot = pc_dimm_get_free_slot(slot == PC_DIMM_UNASSIGNED_SLOT ? NULL : &slot,
+    slot = dimm_get_free_slot(slot == DIMM_UNASSIGNED_SLOT ? NULL : &slot,
                                  machine->ram_slots, &local_err);
     if (local_err) {
         goto out;
     }
-    object_property_set_int(OBJECT(dev), slot, PC_DIMM_SLOT_PROP, &local_err);
+    object_property_set_int(OBJECT(dev), slot, DIMM_SLOT_PROP, &local_err);
     if (local_err) {
         goto out;
     }
-    trace_mhp_pc_dimm_assigned_slot(slot);
+    trace_mhp_dimm_assigned_slot(slot);
 
     if (kvm_enabled() && !kvm_has_free_slot(machine)) {
         error_setg(&local_err, "hypervisor has no free memory slots left");
@@ -135,29 +135,29 @@ out:
     error_propagate(errp, local_err);
 }
 
-void pc_dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms,
+void dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms,
                            MemoryRegion *mr)
 {
-    PCDIMMDevice *dimm = PC_DIMM(dev);
+    DIMMDevice *dimm = DIMM(dev);
 
     numa_unset_mem_node_id(dimm->addr, memory_region_size(mr), dimm->node);
     memory_region_del_subregion(&hpms->mr, mr);
     vmstate_unregister_ram(mr, dev);
 }
 
-int qmp_pc_dimm_device_list(Object *obj, void *opaque)
+int qmp_dimm_device_list(Object *obj, void *opaque)
 {
     MemoryDeviceInfoList ***prev = opaque;
 
-    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(obj, TYPE_DIMM)) {
         DeviceState *dev = DEVICE(obj);
 
         if (dev->realized) {
             MemoryDeviceInfoList *elem = g_new0(MemoryDeviceInfoList, 1);
             MemoryDeviceInfo *info = g_new0(MemoryDeviceInfo, 1);
-            PCDIMMDeviceInfo *di = g_new0(PCDIMMDeviceInfo, 1);
+            DIMMDeviceInfo *di = g_new0(DIMMDeviceInfo, 1);
             DeviceClass *dc = DEVICE_GET_CLASS(obj);
-            PCDIMMDevice *dimm = PC_DIMM(obj);
+            DIMMDevice *dimm = DIMM(obj);
 
             if (dev->id) {
                 di->has_id = true;
@@ -168,7 +168,7 @@ int qmp_pc_dimm_device_list(Object *obj, void *opaque)
             di->addr = dimm->addr;
             di->slot = dimm->slot;
             di->node = dimm->node;
-            di->size = object_property_get_int(OBJECT(dimm), PC_DIMM_SIZE_PROP,
+            di->size = object_property_get_int(OBJECT(dimm), DIMM_SIZE_PROP,
                                                NULL);
             di->memdev = object_get_canonical_path(OBJECT(dimm->hostmem));
 
@@ -180,7 +180,7 @@ int qmp_pc_dimm_device_list(Object *obj, void *opaque)
         }
     }
 
-    object_child_foreach(obj, qmp_pc_dimm_device_list, opaque);
+    object_child_foreach(obj, qmp_dimm_device_list, opaque);
     return 0;
 }
 
@@ -191,7 +191,7 @@ ram_addr_t get_current_ram_size(void)
     MemoryDeviceInfoList *info;
     ram_addr_t size = ram_size;
 
-    qmp_pc_dimm_device_list(qdev_get_machine(), &prev);
+    qmp_dimm_device_list(qdev_get_machine(), &prev);
     for (info = info_list; info; info = info->next) {
         MemoryDeviceInfo *value = info->value;
 
@@ -210,28 +210,28 @@ ram_addr_t get_current_ram_size(void)
     return size;
 }
 
-static int pc_dimm_slot2bitmap(Object *obj, void *opaque)
+static int dimm_slot2bitmap(Object *obj, void *opaque)
 {
     unsigned long *bitmap = opaque;
 
-    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(obj, TYPE_DIMM)) {
         DeviceState *dev = DEVICE(obj);
         if (dev->realized) { /* count only realized DIMMs */
-            PCDIMMDevice *d = PC_DIMM(obj);
+            DIMMDevice *d = DIMM(obj);
             set_bit(d->slot, bitmap);
         }
     }
 
-    object_child_foreach(obj, pc_dimm_slot2bitmap, opaque);
+    object_child_foreach(obj, dimm_slot2bitmap, opaque);
     return 0;
 }
 
-int pc_dimm_get_free_slot(const int *hint, int max_slots, Error **errp)
+int dimm_get_free_slot(const int *hint, int max_slots, Error **errp)
 {
     unsigned long *bitmap = bitmap_new(max_slots);
     int slot = 0;
 
-    object_child_foreach(qdev_get_machine(), pc_dimm_slot2bitmap, bitmap);
+    object_child_foreach(qdev_get_machine(), dimm_slot2bitmap, bitmap);
 
     /* check if requested slot is not occupied */
     if (hint) {
@@ -256,10 +256,10 @@ out:
     return slot;
 }
 
-static gint pc_dimm_addr_sort(gconstpointer a, gconstpointer b)
+static gint dimm_addr_sort(gconstpointer a, gconstpointer b)
 {
-    PCDIMMDevice *x = PC_DIMM(a);
-    PCDIMMDevice *y = PC_DIMM(b);
+    DIMMDevice *x = DIMM(a);
+    DIMMDevice *y = DIMM(b);
     Int128 diff = int128_sub(int128_make64(x->addr), int128_make64(y->addr));
 
     if (int128_lt(diff, int128_zero())) {
@@ -270,22 +270,22 @@ static gint pc_dimm_addr_sort(gconstpointer a, gconstpointer b)
     return 0;
 }
 
-static int pc_dimm_built_list(Object *obj, void *opaque)
+static int dimm_built_list(Object *obj, void *opaque)
 {
     GSList **list = opaque;
 
-    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(obj, TYPE_DIMM)) {
         DeviceState *dev = DEVICE(obj);
         if (dev->realized) { /* only realized DIMMs matter */
-            *list = g_slist_insert_sorted(*list, dev, pc_dimm_addr_sort);
+            *list = g_slist_insert_sorted(*list, dev, dimm_addr_sort);
         }
     }
 
-    object_child_foreach(obj, pc_dimm_built_list, opaque);
+    object_child_foreach(obj, dimm_built_list, opaque);
     return 0;
 }
 
-uint64_t pc_dimm_get_free_addr(uint64_t address_space_start,
+uint64_t dimm_get_free_addr(uint64_t address_space_start,
                                uint64_t address_space_size,
                                uint64_t *hint, uint64_t align, bool gap,
                                uint64_t size, Error **errp)
@@ -315,7 +315,7 @@ uint64_t pc_dimm_get_free_addr(uint64_t address_space_start,
     }
 
     assert(address_space_end > address_space_start);
-    object_child_foreach(qdev_get_machine(), pc_dimm_built_list, &list);
+    object_child_foreach(qdev_get_machine(), dimm_built_list, &list);
 
     if (hint) {
         new_addr = *hint;
@@ -325,9 +325,9 @@ uint64_t pc_dimm_get_free_addr(uint64_t address_space_start,
 
     /* find address range that will fit new DIMM */
     for (item = list; item; item = g_slist_next(item)) {
-        PCDIMMDevice *dimm = item->data;
+        DIMMDevice *dimm = item->data;
         uint64_t dimm_size = object_property_get_int(OBJECT(dimm),
-                                                     PC_DIMM_SIZE_PROP,
+                                                     DIMM_SIZE_PROP,
                                                      errp);
         if (errp && *errp) {
             goto out;
@@ -359,20 +359,20 @@ out:
     return ret;
 }
 
-static Property pc_dimm_properties[] = {
-    DEFINE_PROP_UINT64(PC_DIMM_ADDR_PROP, PCDIMMDevice, addr, 0),
-    DEFINE_PROP_UINT32(PC_DIMM_NODE_PROP, PCDIMMDevice, node, 0),
-    DEFINE_PROP_INT32(PC_DIMM_SLOT_PROP, PCDIMMDevice, slot,
-                      PC_DIMM_UNASSIGNED_SLOT),
+static Property dimm_properties[] = {
+    DEFINE_PROP_UINT64(DIMM_ADDR_PROP, DIMMDevice, addr, 0),
+    DEFINE_PROP_UINT32(DIMM_NODE_PROP, DIMMDevice, node, 0),
+    DEFINE_PROP_INT32(DIMM_SLOT_PROP, DIMMDevice, slot,
+                      DIMM_UNASSIGNED_SLOT),
     DEFINE_PROP_END_OF_LIST(),
 };
 
-static void pc_dimm_get_size(Object *obj, Visitor *v, void *opaque,
+static void dimm_get_size(Object *obj, Visitor *v, void *opaque,
                           const char *name, Error **errp)
 {
     int64_t value;
     MemoryRegion *mr;
-    PCDIMMDevice *dimm = PC_DIMM(obj);
+    DIMMDevice *dimm = DIMM(obj);
 
     mr = host_memory_backend_get_memory(dimm->hostmem, errp);
     value = memory_region_size(mr);
@@ -380,7 +380,7 @@ static void pc_dimm_get_size(Object *obj, Visitor *v, void *opaque,
     visit_type_int(v, &value, name, errp);
 }
 
-static void pc_dimm_check_memdev_is_busy(Object *obj, const char *name,
+static void dimm_check_memdev_is_busy(Object *obj, const char *name,
                                       Object *val, Error **errp)
 {
     MemoryRegion *mr;
@@ -395,65 +395,65 @@ static void pc_dimm_check_memdev_is_busy(Object *obj, const char *name,
     }
 }
 
-static void pc_dimm_init(Object *obj)
+static void dimm_init(Object *obj)
 {
-    PCDIMMDevice *dimm = PC_DIMM(obj);
+    DIMMDevice *dimm = DIMM(obj);
 
-    object_property_add(obj, PC_DIMM_SIZE_PROP, "int", pc_dimm_get_size,
+    object_property_add(obj, DIMM_SIZE_PROP, "int", dimm_get_size,
                         NULL, NULL, NULL, &error_abort);
-    object_property_add_link(obj, PC_DIMM_MEMDEV_PROP, TYPE_MEMORY_BACKEND,
+    object_property_add_link(obj, DIMM_MEMDEV_PROP, TYPE_MEMORY_BACKEND,
                              (Object **)&dimm->hostmem,
-                             pc_dimm_check_memdev_is_busy,
+                             dimm_check_memdev_is_busy,
                              OBJ_PROP_LINK_UNREF_ON_RELEASE,
                              &error_abort);
 }
 
-static void pc_dimm_realize(DeviceState *dev, Error **errp)
+static void dimm_realize(DeviceState *dev, Error **errp)
 {
-    PCDIMMDevice *dimm = PC_DIMM(dev);
+    DIMMDevice *dimm = DIMM(dev);
 
     if (!dimm->hostmem) {
-        error_setg(errp, "'" PC_DIMM_MEMDEV_PROP "' property is not set");
+        error_setg(errp, "'" DIMM_MEMDEV_PROP "' property is not set");
         return;
     }
     if (((nb_numa_nodes > 0) && (dimm->node >= nb_numa_nodes)) ||
         (!nb_numa_nodes && dimm->node)) {
-        error_setg(errp, "'DIMM property " PC_DIMM_NODE_PROP " has value %"
+        error_setg(errp, "'DIMM property " DIMM_NODE_PROP " has value %"
                    PRIu32 "' which exceeds the number of numa nodes: %d",
                    dimm->node, nb_numa_nodes ? nb_numa_nodes : 1);
         return;
     }
 }
 
-static MemoryRegion *pc_dimm_get_memory_region(PCDIMMDevice *dimm)
+static MemoryRegion *dimm_get_memory_region(DIMMDevice *dimm)
 {
     return host_memory_backend_get_memory(dimm->hostmem, &error_abort);
 }
 
-static void pc_dimm_class_init(ObjectClass *oc, void *data)
+static void dimm_class_init(ObjectClass *oc, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(oc);
-    PCDIMMDeviceClass *ddc = PC_DIMM_CLASS(oc);
+    DIMMDeviceClass *ddc = DIMM_CLASS(oc);
 
-    dc->realize = pc_dimm_realize;
-    dc->props = pc_dimm_properties;
+    dc->realize = dimm_realize;
+    dc->props = dimm_properties;
     dc->desc = "DIMM memory module";
 
-    ddc->get_memory_region = pc_dimm_get_memory_region;
+    ddc->get_memory_region = dimm_get_memory_region;
 }
 
-static TypeInfo pc_dimm_info = {
-    .name          = TYPE_PC_DIMM,
+static TypeInfo dimm_info = {
+    .name          = TYPE_DIMM,
     .parent        = TYPE_DEVICE,
-    .instance_size = sizeof(PCDIMMDevice),
-    .instance_init = pc_dimm_init,
-    .class_init    = pc_dimm_class_init,
-    .class_size    = sizeof(PCDIMMDeviceClass),
+    .instance_size = sizeof(DIMMDevice),
+    .instance_init = dimm_init,
+    .class_init    = dimm_class_init,
+    .class_size    = sizeof(DIMMDeviceClass),
 };
 
-static void pc_dimm_register_types(void)
+static void dimm_register_types(void)
 {
-    type_register_static(&pc_dimm_info);
+    type_register_static(&dimm_info);
 }
 
-type_init(pc_dimm_register_types)
+type_init(dimm_register_types)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index d1b0e53..4fb91a5 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2083,8 +2083,8 @@ static void spapr_memory_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
 {
     Error *local_err = NULL;
     sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
-    PCDIMMDevice *dimm = PC_DIMM(dev);
-    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
+    DIMMDevice *dimm = DIMM(dev);
+    DIMMDeviceClass *ddc = DIMM_GET_CLASS(dimm);
     MemoryRegion *mr = ddc->get_memory_region(dimm);
     uint64_t align = memory_region_get_alignment(mr);
     uint64_t size = memory_region_size(mr);
@@ -2096,14 +2096,14 @@ static void spapr_memory_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         goto out;
     }
 
-    pc_dimm_memory_plug(dev, &ms->hotplug_memory, mr, align, false, &local_err);
+    dimm_memory_plug(dev, &ms->hotplug_memory, mr, align, false, &local_err);
     if (local_err) {
         goto out;
     }
 
-    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, &local_err);
+    addr = object_property_get_int(OBJECT(dimm), DIMM_ADDR_PROP, &local_err);
     if (local_err) {
-        pc_dimm_memory_unplug(dev, &ms->hotplug_memory, mr);
+        dimm_memory_unplug(dev, &ms->hotplug_memory, mr);
         goto out;
     }
 
@@ -2118,14 +2118,14 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
 {
     sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
 
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         int node;
 
         if (!smc->dr_lmb_enabled) {
             error_setg(errp, "Memory hotplug not supported for this machine");
             return;
         }
-        node = object_property_get_int(OBJECT(dev), PC_DIMM_NODE_PROP, errp);
+        node = object_property_get_int(OBJECT(dev), DIMM_NODE_PROP, errp);
         if (*errp) {
             return;
         }
@@ -2159,7 +2159,7 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
 static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
                                       DeviceState *dev, Error **errp)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         error_setg(errp, "Memory hot unplug not supported by sPAPR");
     }
 }
@@ -2167,7 +2167,7 @@ static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
 static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
                                              DeviceState *dev)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         return HOTPLUG_HANDLER(machine);
     }
     return NULL;
diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
index c1e5774..5ddbf08 100644
--- a/include/hw/mem/pc-dimm.h
+++ b/include/hw/mem/pc-dimm.h
@@ -13,39 +13,39 @@
  *
  */
 
-#ifndef QEMU_PC_DIMM_H
-#define QEMU_PC_DIMM_H
+#ifndef QEMU_DIMM_H
+#define QEMU_DIMM_H
 
 #include "exec/memory.h"
 #include "sysemu/hostmem.h"
 #include "hw/qdev.h"
 
-#define TYPE_PC_DIMM "pc-dimm"
-#define PC_DIMM(obj) \
-    OBJECT_CHECK(PCDIMMDevice, (obj), TYPE_PC_DIMM)
-#define PC_DIMM_CLASS(oc) \
-    OBJECT_CLASS_CHECK(PCDIMMDeviceClass, (oc), TYPE_PC_DIMM)
-#define PC_DIMM_GET_CLASS(obj) \
-    OBJECT_GET_CLASS(PCDIMMDeviceClass, (obj), TYPE_PC_DIMM)
+#define TYPE_DIMM "pc-dimm"
+#define DIMM(obj) \
+    OBJECT_CHECK(DIMMDevice, (obj), TYPE_DIMM)
+#define DIMM_CLASS(oc) \
+    OBJECT_CLASS_CHECK(DIMMDeviceClass, (oc), TYPE_DIMM)
+#define DIMM_GET_CLASS(obj) \
+    OBJECT_GET_CLASS(DIMMDeviceClass, (obj), TYPE_DIMM)
 
-#define PC_DIMM_ADDR_PROP "addr"
-#define PC_DIMM_SLOT_PROP "slot"
-#define PC_DIMM_NODE_PROP "node"
-#define PC_DIMM_SIZE_PROP "size"
-#define PC_DIMM_MEMDEV_PROP "memdev"
+#define DIMM_ADDR_PROP "addr"
+#define DIMM_SLOT_PROP "slot"
+#define DIMM_NODE_PROP "node"
+#define DIMM_SIZE_PROP "size"
+#define DIMM_MEMDEV_PROP "memdev"
 
-#define PC_DIMM_UNASSIGNED_SLOT -1
+#define DIMM_UNASSIGNED_SLOT -1
 
 /**
- * PCDIMMDevice:
- * @addr: starting guest physical address, where @PCDIMMDevice is mapped.
+ * DIMMDevice:
+ * @addr: starting guest physical address, where @DIMMDevice is mapped.
  *         Default value: 0, means that address is auto-allocated.
- * @node: numa node to which @PCDIMMDevice is attached.
- * @slot: slot number into which @PCDIMMDevice is plugged in.
+ * @node: numa node to which @DIMMDevice is attached.
+ * @slot: slot number into which @DIMMDevice is plugged in.
  *        Default value: -1, means that slot is auto-allocated.
- * @hostmem: host memory backend providing memory for @PCDIMMDevice
+ * @hostmem: host memory backend providing memory for @DIMMDevice
  */
-typedef struct PCDIMMDevice {
+typedef struct DIMMDevice {
     /* private */
     DeviceState parent_obj;
 
@@ -54,19 +54,19 @@ typedef struct PCDIMMDevice {
     uint32_t node;
     int32_t slot;
     HostMemoryBackend *hostmem;
-} PCDIMMDevice;
+} DIMMDevice;
 
 /**
- * PCDIMMDeviceClass:
+ * DIMMDeviceClass:
  * @get_memory_region: returns #MemoryRegion associated with @dimm
  */
-typedef struct PCDIMMDeviceClass {
+typedef struct DIMMDeviceClass {
     /* private */
     DeviceClass parent_class;
 
     /* public */
-    MemoryRegion *(*get_memory_region)(PCDIMMDevice *dimm);
-} PCDIMMDeviceClass;
+    MemoryRegion *(*get_memory_region)(DIMMDevice *dimm);
+} DIMMDeviceClass;
 
 /**
  * MemoryHotplugState:
@@ -79,17 +79,17 @@ typedef struct MemoryHotplugState {
     MemoryRegion mr;
 } MemoryHotplugState;
 
-uint64_t pc_dimm_get_free_addr(uint64_t address_space_start,
+uint64_t dimm_get_free_addr(uint64_t address_space_start,
                                uint64_t address_space_size,
                                uint64_t *hint, uint64_t align, bool gap,
                                uint64_t size, Error **errp);
 
-int pc_dimm_get_free_slot(const int *hint, int max_slots, Error **errp);
+int dimm_get_free_slot(const int *hint, int max_slots, Error **errp);
 
-int qmp_pc_dimm_device_list(Object *obj, void *opaque);
-void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
+int qmp_dimm_device_list(Object *obj, void *opaque);
+void dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
                          MemoryRegion *mr, uint64_t align, bool gap,
                          Error **errp);
-void pc_dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms,
+void dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms,
                            MemoryRegion *mr);
 #endif
diff --git a/numa.c b/numa.c
index e9b18f5..cb69965 100644
--- a/numa.c
+++ b/numa.c
@@ -482,7 +482,7 @@ static void numa_stat_memory_devices(uint64_t node_mem[])
     MemoryDeviceInfoList **prev = &info_list;
     MemoryDeviceInfoList *info;
 
-    qmp_pc_dimm_device_list(qdev_get_machine(), &prev);
+    qmp_dimm_device_list(qdev_get_machine(), &prev);
     for (info = info_list; info; info = info->next) {
         MemoryDeviceInfo *value = info->value;
 
diff --git a/qapi-schema.json b/qapi-schema.json
index 8b0520c..d8dd8b9 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3684,9 +3684,9 @@
 { 'command': 'query-memdev', 'returns': ['Memdev'] }
 
 ##
-# @PCDIMMDeviceInfo:
+# @DIMMDeviceInfo:
 #
-# PCDIMMDevice state information
+# DIMMDevice state information
 #
 # @id: #optional device's ID
 #
@@ -3706,7 +3706,7 @@
 #
 # Since: 2.1
 ##
-{ 'struct': 'PCDIMMDeviceInfo',
+{ 'struct': 'DIMMDeviceInfo',
   'data': { '*id': 'str',
             'addr': 'int',
             'size': 'int',
@@ -3725,7 +3725,7 @@
 #
 # Since: 2.1
 ##
-{ 'union': 'MemoryDeviceInfo', 'data': {'dimm': 'PCDIMMDeviceInfo'} }
+{ 'union': 'MemoryDeviceInfo', 'data': {'dimm': 'DIMMDeviceInfo'} }
 
 ##
 # @query-memory-devices
diff --git a/qmp.c b/qmp.c
index 057a7cb..6709aea 100644
--- a/qmp.c
+++ b/qmp.c
@@ -699,7 +699,7 @@ MemoryDeviceInfoList *qmp_query_memory_devices(Error **errp)
     MemoryDeviceInfoList *head = NULL;
     MemoryDeviceInfoList **prev = &head;
 
-    qmp_pc_dimm_device_list(qdev_get_machine(), &prev);
+    qmp_dimm_device_list(qdev_get_machine(), &prev);
 
     return head;
 }
diff --git a/stubs/qmp_pc_dimm_device_list.c b/stubs/qmp_pc_dimm_device_list.c
index b584bd8..b2704c6 100644
--- a/stubs/qmp_pc_dimm_device_list.c
+++ b/stubs/qmp_pc_dimm_device_list.c
@@ -1,7 +1,7 @@
 #include "qom/object.h"
 #include "hw/mem/pc-dimm.h"
 
-int qmp_pc_dimm_device_list(Object *obj, void *opaque)
+int qmp_dimm_device_list(Object *obj, void *opaque)
 {
    return 0;
 }
diff --git a/trace-events b/trace-events
index b813ae4..06c76f2 100644
--- a/trace-events
+++ b/trace-events
@@ -1639,12 +1639,12 @@ mhp_acpi_write_ost_ev(uint32_t slot, uint32_t ev) "slot[0x%"PRIx32"] OST EVENT:
 mhp_acpi_write_ost_status(uint32_t slot, uint32_t st) "slot[0x%"PRIx32"] OST STATUS: 0x%"PRIx32
 mhp_acpi_clear_insert_evt(uint32_t slot) "slot[0x%"PRIx32"] clear insert event"
 mhp_acpi_clear_remove_evt(uint32_t slot) "slot[0x%"PRIx32"] clear remove event"
-mhp_acpi_pc_dimm_deleted(uint32_t slot) "slot[0x%"PRIx32"] pc-dimm deleted"
-mhp_acpi_pc_dimm_delete_failed(uint32_t slot) "slot[0x%"PRIx32"] pc-dimm delete failed"
+mhp_acpi_dimm_deleted(uint32_t slot) "slot[0x%"PRIx32"] dimm deleted"
+mhp_acpi_dimm_delete_failed(uint32_t slot) "slot[0x%"PRIx32"] dimm delete failed"
 
 # hw/i386/pc.c
-mhp_pc_dimm_assigned_slot(int slot) "0x%d"
-mhp_pc_dimm_assigned_address(uint64_t addr) "0x%"PRIx64
+mhp_dimm_assigned_slot(int slot) "0x%d"
+mhp_dimm_assigned_address(uint64_t addr) "0x%"PRIx64
 
 # target-s390x/kvm.c
 kvm_enable_cmma(int rc) "CMMA: enabling with result code %d"
-- 
1.8.3.1



* [Qemu-devel] [PATCH v3 14/32] pc-dimm: drop the prefix of pc-dimm
@ 2015-10-11  3:52   ` Xiao Guangrong
  0 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: Xiao Guangrong, ehabkost, kvm, mst, gleb, mtosatti, qemu-devel,
	stefanha, dan.j.williams, rth

This patch was generated by the following script:

find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
| xargs sed -i "s/PC_DIMM/DIMM/g"

find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
| xargs sed -i "s/PCDIMM/DIMM/g"

find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
| xargs sed -i "s/pc_dimm/dimm/g"

find ./ -name "trace-events" -type f | xargs sed -i "s/pc-dimm/dimm/g"
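An aside on the script above: because find(1) gives `-a` (implicit AND) higher precedence than `-o`, the trailing `-type f` binds only to the last `-o` branch of each command, not to all three `-name` patterns. A small sketch (scratch directory and file names are hypothetical, not part of the patch) showing the explicit `\( ... \)` grouping that applies `-type f` to every pattern:

```shell
# Create a scratch tree with the three kinds of files the rename targets,
# plus one file that should be left alone.
tmp=$(mktemp -d)
mkdir -p "$tmp/hw"
touch "$tmp/hw/pc-dimm.c" "$tmp/qapi-schema.json" "$tmp/trace-events" "$tmp/README"

# With explicit grouping, -type f restricts all three -name patterns;
# README is excluded because it matches none of them.
find "$tmp" -type f \( -name "*.[ch]" -o -name "*.json" -o -name "trace-events" \) \
    | sort

rm -rf "$tmp"
```

In practice the ungrouped form still works here, since the matched paths are regular files anyway; the grouped form just makes the intent explicit.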

This prepares for abstracting a common DIMM device type that will be
shared by pc-dimm and nvdimm.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hmp.c                           |   2 +-
 hw/acpi/ich9.c                  |   6 +-
 hw/acpi/memory_hotplug.c        |  16 ++---
 hw/acpi/piix4.c                 |   6 +-
 hw/i386/pc.c                    |  32 ++++-----
 hw/mem/pc-dimm.c                | 148 ++++++++++++++++++++--------------------
 hw/ppc/spapr.c                  |  18 ++---
 include/hw/mem/pc-dimm.h        |  62 ++++++++---------
 numa.c                          |   2 +-
 qapi-schema.json                |   8 +--
 qmp.c                           |   2 +-
 stubs/qmp_pc_dimm_device_list.c |   2 +-
 trace-events                    |   8 +--
 13 files changed, 156 insertions(+), 156 deletions(-)

diff --git a/hmp.c b/hmp.c
index 5048eee..5c617d2 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1952,7 +1952,7 @@ void hmp_info_memory_devices(Monitor *mon, const QDict *qdict)
     MemoryDeviceInfoList *info_list = qmp_query_memory_devices(&err);
     MemoryDeviceInfoList *info;
     MemoryDeviceInfo *value;
-    PCDIMMDeviceInfo *di;
+    DIMMDeviceInfo *di;
 
     for (info = info_list; info; info = info->next) {
         value = info->value;
diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 1c7fcfa..b0d6a67 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -440,7 +440,7 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm, Error **errp)
 void ich9_pm_device_plug_cb(ICH9LPCPMRegs *pm, DeviceState *dev, Error **errp)
 {
     if (pm->acpi_memory_hotplug.is_enabled &&
-        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         acpi_memory_plug_cb(&pm->acpi_regs, pm->irq, &pm->acpi_memory_hotplug,
                             dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
@@ -455,7 +455,7 @@ void ich9_pm_device_unplug_request_cb(ICH9LPCPMRegs *pm, DeviceState *dev,
                                       Error **errp)
 {
     if (pm->acpi_memory_hotplug.is_enabled &&
-        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         acpi_memory_unplug_request_cb(&pm->acpi_regs, pm->irq,
                                       &pm->acpi_memory_hotplug, dev, errp);
     } else {
@@ -468,7 +468,7 @@ void ich9_pm_device_unplug_cb(ICH9LPCPMRegs *pm, DeviceState *dev,
                               Error **errp)
 {
     if (pm->acpi_memory_hotplug.is_enabled &&
-        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         acpi_memory_unplug_cb(&pm->acpi_memory_hotplug, dev, errp);
     } else {
         error_setg(errp, "acpi: device unplug for not supported device"
diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
index 2ff0d5c..1f6cccc 100644
--- a/hw/acpi/memory_hotplug.c
+++ b/hw/acpi/memory_hotplug.c
@@ -54,23 +54,23 @@ static uint64_t acpi_memory_hotplug_read(void *opaque, hwaddr addr,
     o = OBJECT(mdev->dimm);
     switch (addr) {
     case 0x0: /* Lo part of phys address where DIMM is mapped */
-        val = o ? object_property_get_int(o, PC_DIMM_ADDR_PROP, NULL) : 0;
+        val = o ? object_property_get_int(o, DIMM_ADDR_PROP, NULL) : 0;
         trace_mhp_acpi_read_addr_lo(mem_st->selector, val);
         break;
     case 0x4: /* Hi part of phys address where DIMM is mapped */
-        val = o ? object_property_get_int(o, PC_DIMM_ADDR_PROP, NULL) >> 32 : 0;
+        val = o ? object_property_get_int(o, DIMM_ADDR_PROP, NULL) >> 32 : 0;
         trace_mhp_acpi_read_addr_hi(mem_st->selector, val);
         break;
     case 0x8: /* Lo part of DIMM size */
-        val = o ? object_property_get_int(o, PC_DIMM_SIZE_PROP, NULL) : 0;
+        val = o ? object_property_get_int(o, DIMM_SIZE_PROP, NULL) : 0;
         trace_mhp_acpi_read_size_lo(mem_st->selector, val);
         break;
     case 0xc: /* Hi part of DIMM size */
-        val = o ? object_property_get_int(o, PC_DIMM_SIZE_PROP, NULL) >> 32 : 0;
+        val = o ? object_property_get_int(o, DIMM_SIZE_PROP, NULL) >> 32 : 0;
         trace_mhp_acpi_read_size_hi(mem_st->selector, val);
         break;
     case 0x10: /* node proximity for _PXM method */
-        val = o ? object_property_get_int(o, PC_DIMM_NODE_PROP, NULL) : 0;
+        val = o ? object_property_get_int(o, DIMM_NODE_PROP, NULL) : 0;
         trace_mhp_acpi_read_pxm(mem_st->selector, val);
         break;
     case 0x14: /* pack and return is_* fields */
@@ -151,13 +151,13 @@ static void acpi_memory_hotplug_write(void *opaque, hwaddr addr, uint64_t data,
             /* call pc-dimm unplug cb */
             hotplug_handler_unplug(hotplug_ctrl, dev, &local_err);
             if (local_err) {
-                trace_mhp_acpi_pc_dimm_delete_failed(mem_st->selector);
+                trace_mhp_acpi_dimm_delete_failed(mem_st->selector);
                 qapi_event_send_mem_unplug_error(dev->id,
                                                  error_get_pretty(local_err),
                                                  &error_abort);
                 break;
             }
-            trace_mhp_acpi_pc_dimm_deleted(mem_st->selector);
+            trace_mhp_acpi_dimm_deleted(mem_st->selector);
         }
         break;
     default:
@@ -206,7 +206,7 @@ acpi_memory_slot_status(MemHotplugState *mem_st,
                         DeviceState *dev, Error **errp)
 {
     Error *local_err = NULL;
-    int slot = object_property_get_int(OBJECT(dev), PC_DIMM_SLOT_PROP,
+    int slot = object_property_get_int(OBJECT(dev), DIMM_SLOT_PROP,
                                        &local_err);
 
     if (local_err) {
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 2cd2fee..0b2cb6e 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -344,7 +344,7 @@ static void piix4_device_plug_cb(HotplugHandler *hotplug_dev,
     PIIX4PMState *s = PIIX4_PM(hotplug_dev);
 
     if (s->acpi_memory_hotplug.is_enabled &&
-        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         acpi_memory_plug_cb(&s->ar, s->irq, &s->acpi_memory_hotplug, dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
         acpi_pcihp_device_plug_cb(&s->ar, s->irq, &s->acpi_pci_hotplug, dev,
@@ -363,7 +363,7 @@ static void piix4_device_unplug_request_cb(HotplugHandler *hotplug_dev,
     PIIX4PMState *s = PIIX4_PM(hotplug_dev);
 
     if (s->acpi_memory_hotplug.is_enabled &&
-        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         acpi_memory_unplug_request_cb(&s->ar, s->irq, &s->acpi_memory_hotplug,
                                       dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
@@ -381,7 +381,7 @@ static void piix4_device_unplug_cb(HotplugHandler *hotplug_dev,
     PIIX4PMState *s = PIIX4_PM(hotplug_dev);
 
     if (s->acpi_memory_hotplug.is_enabled &&
-        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         acpi_memory_unplug_cb(&s->acpi_memory_hotplug, dev, errp);
     } else {
         error_setg(errp, "acpi: device unplug for not supported device"
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 682867a..d6b9fa7 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1609,15 +1609,15 @@ void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
     }
 }
 
-static void pc_dimm_plug(HotplugHandler *hotplug_dev,
+static void dimm_plug(HotplugHandler *hotplug_dev,
                          DeviceState *dev, Error **errp)
 {
     HotplugHandlerClass *hhc;
     Error *local_err = NULL;
     PCMachineState *pcms = PC_MACHINE(hotplug_dev);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
-    PCDIMMDevice *dimm = PC_DIMM(dev);
-    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
+    DIMMDevice *dimm = DIMM(dev);
+    DIMMDeviceClass *ddc = DIMM_GET_CLASS(dimm);
     MemoryRegion *mr = ddc->get_memory_region(dimm);
     uint64_t align = TARGET_PAGE_SIZE;
 
@@ -1631,7 +1631,7 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev,
         goto out;
     }
 
-    pc_dimm_memory_plug(dev, &pcms->hotplug_memory, mr, align,
+    dimm_memory_plug(dev, &pcms->hotplug_memory, mr, align,
                         pcmc->inter_dimm_gap, &local_err);
     if (local_err) {
         goto out;
@@ -1643,7 +1643,7 @@ out:
     error_propagate(errp, local_err);
 }
 
-static void pc_dimm_unplug_request(HotplugHandler *hotplug_dev,
+static void dimm_unplug_request(HotplugHandler *hotplug_dev,
                                    DeviceState *dev, Error **errp)
 {
     HotplugHandlerClass *hhc;
@@ -1663,12 +1663,12 @@ out:
     error_propagate(errp, local_err);
 }
 
-static void pc_dimm_unplug(HotplugHandler *hotplug_dev,
+static void dimm_unplug(HotplugHandler *hotplug_dev,
                            DeviceState *dev, Error **errp)
 {
     PCMachineState *pcms = PC_MACHINE(hotplug_dev);
-    PCDIMMDevice *dimm = PC_DIMM(dev);
-    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
+    DIMMDevice *dimm = DIMM(dev);
+    DIMMDeviceClass *ddc = DIMM_GET_CLASS(dimm);
     MemoryRegion *mr = ddc->get_memory_region(dimm);
     HotplugHandlerClass *hhc;
     Error *local_err = NULL;
@@ -1680,7 +1680,7 @@ static void pc_dimm_unplug(HotplugHandler *hotplug_dev,
         goto out;
     }
 
-    pc_dimm_memory_unplug(dev, &pcms->hotplug_memory, mr);
+    dimm_memory_unplug(dev, &pcms->hotplug_memory, mr);
     object_unparent(OBJECT(dev));
 
  out:
@@ -1719,8 +1719,8 @@ out:
 static void pc_machine_device_plug_cb(HotplugHandler *hotplug_dev,
                                       DeviceState *dev, Error **errp)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
-        pc_dimm_plug(hotplug_dev, dev, errp);
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
+        dimm_plug(hotplug_dev, dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
         pc_cpu_plug(hotplug_dev, dev, errp);
     }
@@ -1729,8 +1729,8 @@ static void pc_machine_device_plug_cb(HotplugHandler *hotplug_dev,
 static void pc_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
                                                 DeviceState *dev, Error **errp)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
-        pc_dimm_unplug_request(hotplug_dev, dev, errp);
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
+        dimm_unplug_request(hotplug_dev, dev, errp);
     } else {
         error_setg(errp, "acpi: device unplug request for not supported device"
                    " type: %s", object_get_typename(OBJECT(dev)));
@@ -1740,8 +1740,8 @@ static void pc_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
 static void pc_machine_device_unplug_cb(HotplugHandler *hotplug_dev,
                                         DeviceState *dev, Error **errp)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
-        pc_dimm_unplug(hotplug_dev, dev, errp);
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
+        dimm_unplug(hotplug_dev, dev, errp);
     } else {
         error_setg(errp, "acpi: device unplug for not supported device"
                    " type: %s", object_get_typename(OBJECT(dev)));
@@ -1753,7 +1753,7 @@ static HotplugHandler *pc_get_hotpug_handler(MachineState *machine,
 {
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine);
 
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM) ||
         object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
         return HOTPLUG_HANDLER(machine);
     }
diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index a581622..9e26bf7 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -26,21 +26,21 @@
 #include "sysemu/kvm.h"
 #include "trace.h"
 
-typedef struct pc_dimms_capacity {
+typedef struct dimms_capacity {
      uint64_t size;
      Error    **errp;
-} pc_dimms_capacity;
+} dimms_capacity;
 
 static int existing_dimms_capacity_internal(Object *obj, void *opaque)
 {
-    pc_dimms_capacity *cap = opaque;
+    dimms_capacity *cap = opaque;
     uint64_t *size = &cap->size;
 
-    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(obj, TYPE_DIMM)) {
         DeviceState *dev = DEVICE(obj);
 
         if (dev->realized) {
-            (*size) += object_property_get_int(obj, PC_DIMM_SIZE_PROP,
+            (*size) += object_property_get_int(obj, DIMM_SIZE_PROP,
                 cap->errp);
         }
 
@@ -54,7 +54,7 @@ static int existing_dimms_capacity_internal(Object *obj, void *opaque)
 
 static uint64_t existing_dimms_capacity(Error **errp)
 {
-    pc_dimms_capacity cap;
+    dimms_capacity cap;
 
     cap.size = 0;
     cap.errp = errp;
@@ -63,23 +63,23 @@ static uint64_t existing_dimms_capacity(Error **errp)
     return cap.size;
 }
 
-void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
+void dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
                          MemoryRegion *mr, uint64_t align, bool gap,
                          Error **errp)
 {
     int slot;
     MachineState *machine = MACHINE(qdev_get_machine());
-    PCDIMMDevice *dimm = PC_DIMM(dev);
+    DIMMDevice *dimm = DIMM(dev);
     Error *local_err = NULL;
     uint64_t dimms_capacity = 0;
     uint64_t addr;
 
-    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, &local_err);
+    addr = object_property_get_int(OBJECT(dimm), DIMM_ADDR_PROP, &local_err);
     if (local_err) {
         goto out;
     }
 
-    addr = pc_dimm_get_free_addr(hpms->base,
+    addr = dimm_get_free_addr(hpms->base,
                                  memory_region_size(&hpms->mr),
                                  !addr ? NULL : &addr, align, gap,
                                  memory_region_size(mr), &local_err);
@@ -100,27 +100,27 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
         goto out;
     }
 
-    object_property_set_int(OBJECT(dev), addr, PC_DIMM_ADDR_PROP, &local_err);
+    object_property_set_int(OBJECT(dev), addr, DIMM_ADDR_PROP, &local_err);
     if (local_err) {
         goto out;
     }
-    trace_mhp_pc_dimm_assigned_address(addr);
+    trace_mhp_dimm_assigned_address(addr);
 
-    slot = object_property_get_int(OBJECT(dev), PC_DIMM_SLOT_PROP, &local_err);
+    slot = object_property_get_int(OBJECT(dev), DIMM_SLOT_PROP, &local_err);
     if (local_err) {
         goto out;
     }
 
-    slot = pc_dimm_get_free_slot(slot == PC_DIMM_UNASSIGNED_SLOT ? NULL : &slot,
+    slot = dimm_get_free_slot(slot == DIMM_UNASSIGNED_SLOT ? NULL : &slot,
                                  machine->ram_slots, &local_err);
     if (local_err) {
         goto out;
     }
-    object_property_set_int(OBJECT(dev), slot, PC_DIMM_SLOT_PROP, &local_err);
+    object_property_set_int(OBJECT(dev), slot, DIMM_SLOT_PROP, &local_err);
     if (local_err) {
         goto out;
     }
-    trace_mhp_pc_dimm_assigned_slot(slot);
+    trace_mhp_dimm_assigned_slot(slot);
 
     if (kvm_enabled() && !kvm_has_free_slot(machine)) {
         error_setg(&local_err, "hypervisor has no free memory slots left");
@@ -135,29 +135,29 @@ out:
     error_propagate(errp, local_err);
 }
 
-void pc_dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms,
+void dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms,
                            MemoryRegion *mr)
 {
-    PCDIMMDevice *dimm = PC_DIMM(dev);
+    DIMMDevice *dimm = DIMM(dev);
 
     numa_unset_mem_node_id(dimm->addr, memory_region_size(mr), dimm->node);
     memory_region_del_subregion(&hpms->mr, mr);
     vmstate_unregister_ram(mr, dev);
 }
 
-int qmp_pc_dimm_device_list(Object *obj, void *opaque)
+int qmp_dimm_device_list(Object *obj, void *opaque)
 {
     MemoryDeviceInfoList ***prev = opaque;
 
-    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(obj, TYPE_DIMM)) {
         DeviceState *dev = DEVICE(obj);
 
         if (dev->realized) {
             MemoryDeviceInfoList *elem = g_new0(MemoryDeviceInfoList, 1);
             MemoryDeviceInfo *info = g_new0(MemoryDeviceInfo, 1);
-            PCDIMMDeviceInfo *di = g_new0(PCDIMMDeviceInfo, 1);
+            DIMMDeviceInfo *di = g_new0(DIMMDeviceInfo, 1);
             DeviceClass *dc = DEVICE_GET_CLASS(obj);
-            PCDIMMDevice *dimm = PC_DIMM(obj);
+            DIMMDevice *dimm = DIMM(obj);
 
             if (dev->id) {
                 di->has_id = true;
@@ -168,7 +168,7 @@ int qmp_pc_dimm_device_list(Object *obj, void *opaque)
             di->addr = dimm->addr;
             di->slot = dimm->slot;
             di->node = dimm->node;
-            di->size = object_property_get_int(OBJECT(dimm), PC_DIMM_SIZE_PROP,
+            di->size = object_property_get_int(OBJECT(dimm), DIMM_SIZE_PROP,
                                                NULL);
             di->memdev = object_get_canonical_path(OBJECT(dimm->hostmem));
 
@@ -180,7 +180,7 @@ int qmp_pc_dimm_device_list(Object *obj, void *opaque)
         }
     }
 
-    object_child_foreach(obj, qmp_pc_dimm_device_list, opaque);
+    object_child_foreach(obj, qmp_dimm_device_list, opaque);
     return 0;
 }
 
@@ -191,7 +191,7 @@ ram_addr_t get_current_ram_size(void)
     MemoryDeviceInfoList *info;
     ram_addr_t size = ram_size;
 
-    qmp_pc_dimm_device_list(qdev_get_machine(), &prev);
+    qmp_dimm_device_list(qdev_get_machine(), &prev);
     for (info = info_list; info; info = info->next) {
         MemoryDeviceInfo *value = info->value;
 
@@ -210,28 +210,28 @@ ram_addr_t get_current_ram_size(void)
     return size;
 }
 
-static int pc_dimm_slot2bitmap(Object *obj, void *opaque)
+static int dimm_slot2bitmap(Object *obj, void *opaque)
 {
     unsigned long *bitmap = opaque;
 
-    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(obj, TYPE_DIMM)) {
         DeviceState *dev = DEVICE(obj);
         if (dev->realized) { /* count only realized DIMMs */
-            PCDIMMDevice *d = PC_DIMM(obj);
+            DIMMDevice *d = DIMM(obj);
             set_bit(d->slot, bitmap);
         }
     }
 
-    object_child_foreach(obj, pc_dimm_slot2bitmap, opaque);
+    object_child_foreach(obj, dimm_slot2bitmap, opaque);
     return 0;
 }
 
-int pc_dimm_get_free_slot(const int *hint, int max_slots, Error **errp)
+int dimm_get_free_slot(const int *hint, int max_slots, Error **errp)
 {
     unsigned long *bitmap = bitmap_new(max_slots);
     int slot = 0;
 
-    object_child_foreach(qdev_get_machine(), pc_dimm_slot2bitmap, bitmap);
+    object_child_foreach(qdev_get_machine(), dimm_slot2bitmap, bitmap);
 
     /* check if requested slot is not occupied */
     if (hint) {
@@ -256,10 +256,10 @@ out:
     return slot;
 }
 
-static gint pc_dimm_addr_sort(gconstpointer a, gconstpointer b)
+static gint dimm_addr_sort(gconstpointer a, gconstpointer b)
 {
-    PCDIMMDevice *x = PC_DIMM(a);
-    PCDIMMDevice *y = PC_DIMM(b);
+    DIMMDevice *x = DIMM(a);
+    DIMMDevice *y = DIMM(b);
     Int128 diff = int128_sub(int128_make64(x->addr), int128_make64(y->addr));
 
     if (int128_lt(diff, int128_zero())) {
@@ -270,22 +270,22 @@ static gint pc_dimm_addr_sort(gconstpointer a, gconstpointer b)
     return 0;
 }
 
-static int pc_dimm_built_list(Object *obj, void *opaque)
+static int dimm_built_list(Object *obj, void *opaque)
 {
     GSList **list = opaque;
 
-    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(obj, TYPE_DIMM)) {
         DeviceState *dev = DEVICE(obj);
         if (dev->realized) { /* only realized DIMMs matter */
-            *list = g_slist_insert_sorted(*list, dev, pc_dimm_addr_sort);
+            *list = g_slist_insert_sorted(*list, dev, dimm_addr_sort);
         }
     }
 
-    object_child_foreach(obj, pc_dimm_built_list, opaque);
+    object_child_foreach(obj, dimm_built_list, opaque);
     return 0;
 }
 
-uint64_t pc_dimm_get_free_addr(uint64_t address_space_start,
+uint64_t dimm_get_free_addr(uint64_t address_space_start,
                                uint64_t address_space_size,
                                uint64_t *hint, uint64_t align, bool gap,
                                uint64_t size, Error **errp)
@@ -315,7 +315,7 @@ uint64_t pc_dimm_get_free_addr(uint64_t address_space_start,
     }
 
     assert(address_space_end > address_space_start);
-    object_child_foreach(qdev_get_machine(), pc_dimm_built_list, &list);
+    object_child_foreach(qdev_get_machine(), dimm_built_list, &list);
 
     if (hint) {
         new_addr = *hint;
@@ -325,9 +325,9 @@ uint64_t pc_dimm_get_free_addr(uint64_t address_space_start,
 
     /* find address range that will fit new DIMM */
     for (item = list; item; item = g_slist_next(item)) {
-        PCDIMMDevice *dimm = item->data;
+        DIMMDevice *dimm = item->data;
         uint64_t dimm_size = object_property_get_int(OBJECT(dimm),
-                                                     PC_DIMM_SIZE_PROP,
+                                                     DIMM_SIZE_PROP,
                                                      errp);
         if (errp && *errp) {
             goto out;
@@ -359,20 +359,20 @@ out:
     return ret;
 }
 
-static Property pc_dimm_properties[] = {
-    DEFINE_PROP_UINT64(PC_DIMM_ADDR_PROP, PCDIMMDevice, addr, 0),
-    DEFINE_PROP_UINT32(PC_DIMM_NODE_PROP, PCDIMMDevice, node, 0),
-    DEFINE_PROP_INT32(PC_DIMM_SLOT_PROP, PCDIMMDevice, slot,
-                      PC_DIMM_UNASSIGNED_SLOT),
+static Property dimm_properties[] = {
+    DEFINE_PROP_UINT64(DIMM_ADDR_PROP, DIMMDevice, addr, 0),
+    DEFINE_PROP_UINT32(DIMM_NODE_PROP, DIMMDevice, node, 0),
+    DEFINE_PROP_INT32(DIMM_SLOT_PROP, DIMMDevice, slot,
+                      DIMM_UNASSIGNED_SLOT),
     DEFINE_PROP_END_OF_LIST(),
 };
 
-static void pc_dimm_get_size(Object *obj, Visitor *v, void *opaque,
+static void dimm_get_size(Object *obj, Visitor *v, void *opaque,
                           const char *name, Error **errp)
 {
     int64_t value;
     MemoryRegion *mr;
-    PCDIMMDevice *dimm = PC_DIMM(obj);
+    DIMMDevice *dimm = DIMM(obj);
 
     mr = host_memory_backend_get_memory(dimm->hostmem, errp);
     value = memory_region_size(mr);
@@ -380,7 +380,7 @@ static void pc_dimm_get_size(Object *obj, Visitor *v, void *opaque,
     visit_type_int(v, &value, name, errp);
 }
 
-static void pc_dimm_check_memdev_is_busy(Object *obj, const char *name,
+static void dimm_check_memdev_is_busy(Object *obj, const char *name,
                                       Object *val, Error **errp)
 {
     MemoryRegion *mr;
@@ -395,65 +395,65 @@ static void pc_dimm_check_memdev_is_busy(Object *obj, const char *name,
     }
 }
 
-static void pc_dimm_init(Object *obj)
+static void dimm_init(Object *obj)
 {
-    PCDIMMDevice *dimm = PC_DIMM(obj);
+    DIMMDevice *dimm = DIMM(obj);
 
-    object_property_add(obj, PC_DIMM_SIZE_PROP, "int", pc_dimm_get_size,
+    object_property_add(obj, DIMM_SIZE_PROP, "int", dimm_get_size,
                         NULL, NULL, NULL, &error_abort);
-    object_property_add_link(obj, PC_DIMM_MEMDEV_PROP, TYPE_MEMORY_BACKEND,
+    object_property_add_link(obj, DIMM_MEMDEV_PROP, TYPE_MEMORY_BACKEND,
                              (Object **)&dimm->hostmem,
-                             pc_dimm_check_memdev_is_busy,
+                             dimm_check_memdev_is_busy,
                              OBJ_PROP_LINK_UNREF_ON_RELEASE,
                              &error_abort);
 }
 
-static void pc_dimm_realize(DeviceState *dev, Error **errp)
+static void dimm_realize(DeviceState *dev, Error **errp)
 {
-    PCDIMMDevice *dimm = PC_DIMM(dev);
+    DIMMDevice *dimm = DIMM(dev);
 
     if (!dimm->hostmem) {
-        error_setg(errp, "'" PC_DIMM_MEMDEV_PROP "' property is not set");
+        error_setg(errp, "'" DIMM_MEMDEV_PROP "' property is not set");
         return;
     }
     if (((nb_numa_nodes > 0) && (dimm->node >= nb_numa_nodes)) ||
         (!nb_numa_nodes && dimm->node)) {
-        error_setg(errp, "'DIMM property " PC_DIMM_NODE_PROP " has value %"
+        error_setg(errp, "'DIMM property " DIMM_NODE_PROP " has value %"
                    PRIu32 "' which exceeds the number of numa nodes: %d",
                    dimm->node, nb_numa_nodes ? nb_numa_nodes : 1);
         return;
     }
 }
 
-static MemoryRegion *pc_dimm_get_memory_region(PCDIMMDevice *dimm)
+static MemoryRegion *dimm_get_memory_region(DIMMDevice *dimm)
 {
     return host_memory_backend_get_memory(dimm->hostmem, &error_abort);
 }
 
-static void pc_dimm_class_init(ObjectClass *oc, void *data)
+static void dimm_class_init(ObjectClass *oc, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(oc);
-    PCDIMMDeviceClass *ddc = PC_DIMM_CLASS(oc);
+    DIMMDeviceClass *ddc = DIMM_CLASS(oc);
 
-    dc->realize = pc_dimm_realize;
-    dc->props = pc_dimm_properties;
+    dc->realize = dimm_realize;
+    dc->props = dimm_properties;
     dc->desc = "DIMM memory module";
 
-    ddc->get_memory_region = pc_dimm_get_memory_region;
+    ddc->get_memory_region = dimm_get_memory_region;
 }
 
-static TypeInfo pc_dimm_info = {
-    .name          = TYPE_PC_DIMM,
+static TypeInfo dimm_info = {
+    .name          = TYPE_DIMM,
     .parent        = TYPE_DEVICE,
-    .instance_size = sizeof(PCDIMMDevice),
-    .instance_init = pc_dimm_init,
-    .class_init    = pc_dimm_class_init,
-    .class_size    = sizeof(PCDIMMDeviceClass),
+    .instance_size = sizeof(DIMMDevice),
+    .instance_init = dimm_init,
+    .class_init    = dimm_class_init,
+    .class_size    = sizeof(DIMMDeviceClass),
 };
 
-static void pc_dimm_register_types(void)
+static void dimm_register_types(void)
 {
-    type_register_static(&pc_dimm_info);
+    type_register_static(&dimm_info);
 }
 
-type_init(pc_dimm_register_types)
+type_init(dimm_register_types)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index d1b0e53..4fb91a5 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2083,8 +2083,8 @@ static void spapr_memory_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
 {
     Error *local_err = NULL;
     sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
-    PCDIMMDevice *dimm = PC_DIMM(dev);
-    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
+    DIMMDevice *dimm = DIMM(dev);
+    DIMMDeviceClass *ddc = DIMM_GET_CLASS(dimm);
     MemoryRegion *mr = ddc->get_memory_region(dimm);
     uint64_t align = memory_region_get_alignment(mr);
     uint64_t size = memory_region_size(mr);
@@ -2096,14 +2096,14 @@ static void spapr_memory_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         goto out;
     }
 
-    pc_dimm_memory_plug(dev, &ms->hotplug_memory, mr, align, false, &local_err);
+    dimm_memory_plug(dev, &ms->hotplug_memory, mr, align, false, &local_err);
     if (local_err) {
         goto out;
     }
 
-    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, &local_err);
+    addr = object_property_get_int(OBJECT(dimm), DIMM_ADDR_PROP, &local_err);
     if (local_err) {
-        pc_dimm_memory_unplug(dev, &ms->hotplug_memory, mr);
+        dimm_memory_unplug(dev, &ms->hotplug_memory, mr);
         goto out;
     }
 
@@ -2118,14 +2118,14 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
 {
     sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
 
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         int node;
 
         if (!smc->dr_lmb_enabled) {
             error_setg(errp, "Memory hotplug not supported for this machine");
             return;
         }
-        node = object_property_get_int(OBJECT(dev), PC_DIMM_NODE_PROP, errp);
+        node = object_property_get_int(OBJECT(dev), DIMM_NODE_PROP, errp);
         if (*errp) {
             return;
         }
@@ -2159,7 +2159,7 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
 static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
                                       DeviceState *dev, Error **errp)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         error_setg(errp, "Memory hot unplug not supported by sPAPR");
     }
 }
@@ -2167,7 +2167,7 @@ static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
 static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
                                              DeviceState *dev)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+    if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
         return HOTPLUG_HANDLER(machine);
     }
     return NULL;
diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
index c1e5774..5ddbf08 100644
--- a/include/hw/mem/pc-dimm.h
+++ b/include/hw/mem/pc-dimm.h
@@ -13,39 +13,39 @@
  *
  */
 
-#ifndef QEMU_PC_DIMM_H
-#define QEMU_PC_DIMM_H
+#ifndef QEMU_DIMM_H
+#define QEMU_DIMM_H
 
 #include "exec/memory.h"
 #include "sysemu/hostmem.h"
 #include "hw/qdev.h"
 
-#define TYPE_PC_DIMM "pc-dimm"
-#define PC_DIMM(obj) \
-    OBJECT_CHECK(PCDIMMDevice, (obj), TYPE_PC_DIMM)
-#define PC_DIMM_CLASS(oc) \
-    OBJECT_CLASS_CHECK(PCDIMMDeviceClass, (oc), TYPE_PC_DIMM)
-#define PC_DIMM_GET_CLASS(obj) \
-    OBJECT_GET_CLASS(PCDIMMDeviceClass, (obj), TYPE_PC_DIMM)
+#define TYPE_DIMM "pc-dimm"
+#define DIMM(obj) \
+    OBJECT_CHECK(DIMMDevice, (obj), TYPE_DIMM)
+#define DIMM_CLASS(oc) \
+    OBJECT_CLASS_CHECK(DIMMDeviceClass, (oc), TYPE_DIMM)
+#define DIMM_GET_CLASS(obj) \
+    OBJECT_GET_CLASS(DIMMDeviceClass, (obj), TYPE_DIMM)
 
-#define PC_DIMM_ADDR_PROP "addr"
-#define PC_DIMM_SLOT_PROP "slot"
-#define PC_DIMM_NODE_PROP "node"
-#define PC_DIMM_SIZE_PROP "size"
-#define PC_DIMM_MEMDEV_PROP "memdev"
+#define DIMM_ADDR_PROP "addr"
+#define DIMM_SLOT_PROP "slot"
+#define DIMM_NODE_PROP "node"
+#define DIMM_SIZE_PROP "size"
+#define DIMM_MEMDEV_PROP "memdev"
 
-#define PC_DIMM_UNASSIGNED_SLOT -1
+#define DIMM_UNASSIGNED_SLOT -1
 
 /**
- * PCDIMMDevice:
- * @addr: starting guest physical address, where @PCDIMMDevice is mapped.
+ * DIMMDevice:
+ * @addr: starting guest physical address, where @DIMMDevice is mapped.
  *         Default value: 0, means that address is auto-allocated.
- * @node: numa node to which @PCDIMMDevice is attached.
- * @slot: slot number into which @PCDIMMDevice is plugged in.
+ * @node: numa node to which @DIMMDevice is attached.
+ * @slot: slot number into which @DIMMDevice is plugged in.
  *        Default value: -1, means that slot is auto-allocated.
- * @hostmem: host memory backend providing memory for @PCDIMMDevice
+ * @hostmem: host memory backend providing memory for @DIMMDevice
  */
-typedef struct PCDIMMDevice {
+typedef struct DIMMDevice {
     /* private */
     DeviceState parent_obj;
 
@@ -54,19 +54,19 @@ typedef struct PCDIMMDevice {
     uint32_t node;
     int32_t slot;
     HostMemoryBackend *hostmem;
-} PCDIMMDevice;
+} DIMMDevice;
 
 /**
- * PCDIMMDeviceClass:
+ * DIMMDeviceClass:
  * @get_memory_region: returns #MemoryRegion associated with @dimm
  */
-typedef struct PCDIMMDeviceClass {
+typedef struct DIMMDeviceClass {
     /* private */
     DeviceClass parent_class;
 
     /* public */
-    MemoryRegion *(*get_memory_region)(PCDIMMDevice *dimm);
-} PCDIMMDeviceClass;
+    MemoryRegion *(*get_memory_region)(DIMMDevice *dimm);
+} DIMMDeviceClass;
 
 /**
  * MemoryHotplugState:
@@ -79,17 +79,17 @@ typedef struct MemoryHotplugState {
     MemoryRegion mr;
 } MemoryHotplugState;
 
-uint64_t pc_dimm_get_free_addr(uint64_t address_space_start,
+uint64_t dimm_get_free_addr(uint64_t address_space_start,
                                uint64_t address_space_size,
                                uint64_t *hint, uint64_t align, bool gap,
                                uint64_t size, Error **errp);
 
-int pc_dimm_get_free_slot(const int *hint, int max_slots, Error **errp);
+int dimm_get_free_slot(const int *hint, int max_slots, Error **errp);
 
-int qmp_pc_dimm_device_list(Object *obj, void *opaque);
-void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
+int qmp_dimm_device_list(Object *obj, void *opaque);
+void dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
                          MemoryRegion *mr, uint64_t align, bool gap,
                          Error **errp);
-void pc_dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms,
+void dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms,
                            MemoryRegion *mr);
 #endif
diff --git a/numa.c b/numa.c
index e9b18f5..cb69965 100644
--- a/numa.c
+++ b/numa.c
@@ -482,7 +482,7 @@ static void numa_stat_memory_devices(uint64_t node_mem[])
     MemoryDeviceInfoList **prev = &info_list;
     MemoryDeviceInfoList *info;
 
-    qmp_pc_dimm_device_list(qdev_get_machine(), &prev);
+    qmp_dimm_device_list(qdev_get_machine(), &prev);
     for (info = info_list; info; info = info->next) {
         MemoryDeviceInfo *value = info->value;
 
diff --git a/qapi-schema.json b/qapi-schema.json
index 8b0520c..d8dd8b9 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3684,9 +3684,9 @@
 { 'command': 'query-memdev', 'returns': ['Memdev'] }
 
 ##
-# @PCDIMMDeviceInfo:
+# @DIMMDeviceInfo:
 #
-# PCDIMMDevice state information
+# DIMMDevice state information
 #
 # @id: #optional device's ID
 #
@@ -3706,7 +3706,7 @@
 #
 # Since: 2.1
 ##
-{ 'struct': 'PCDIMMDeviceInfo',
+{ 'struct': 'DIMMDeviceInfo',
   'data': { '*id': 'str',
             'addr': 'int',
             'size': 'int',
@@ -3725,7 +3725,7 @@
 #
 # Since: 2.1
 ##
-{ 'union': 'MemoryDeviceInfo', 'data': {'dimm': 'PCDIMMDeviceInfo'} }
+{ 'union': 'MemoryDeviceInfo', 'data': {'dimm': 'DIMMDeviceInfo'} }
 
 ##
 # @query-memory-devices
diff --git a/qmp.c b/qmp.c
index 057a7cb..6709aea 100644
--- a/qmp.c
+++ b/qmp.c
@@ -699,7 +699,7 @@ MemoryDeviceInfoList *qmp_query_memory_devices(Error **errp)
     MemoryDeviceInfoList *head = NULL;
     MemoryDeviceInfoList **prev = &head;
 
-    qmp_pc_dimm_device_list(qdev_get_machine(), &prev);
+    qmp_dimm_device_list(qdev_get_machine(), &prev);
 
     return head;
 }
diff --git a/stubs/qmp_pc_dimm_device_list.c b/stubs/qmp_pc_dimm_device_list.c
index b584bd8..b2704c6 100644
--- a/stubs/qmp_pc_dimm_device_list.c
+++ b/stubs/qmp_pc_dimm_device_list.c
@@ -1,7 +1,7 @@
 #include "qom/object.h"
 #include "hw/mem/pc-dimm.h"
 
-int qmp_pc_dimm_device_list(Object *obj, void *opaque)
+int qmp_dimm_device_list(Object *obj, void *opaque)
 {
    return 0;
 }
diff --git a/trace-events b/trace-events
index b813ae4..06c76f2 100644
--- a/trace-events
+++ b/trace-events
@@ -1639,12 +1639,12 @@ mhp_acpi_write_ost_ev(uint32_t slot, uint32_t ev) "slot[0x%"PRIx32"] OST EVENT:
 mhp_acpi_write_ost_status(uint32_t slot, uint32_t st) "slot[0x%"PRIx32"] OST STATUS: 0x%"PRIx32
 mhp_acpi_clear_insert_evt(uint32_t slot) "slot[0x%"PRIx32"] clear insert event"
 mhp_acpi_clear_remove_evt(uint32_t slot) "slot[0x%"PRIx32"] clear remove event"
-mhp_acpi_pc_dimm_deleted(uint32_t slot) "slot[0x%"PRIx32"] pc-dimm deleted"
-mhp_acpi_pc_dimm_delete_failed(uint32_t slot) "slot[0x%"PRIx32"] pc-dimm delete failed"
+mhp_acpi_dimm_deleted(uint32_t slot) "slot[0x%"PRIx32"] dimm deleted"
+mhp_acpi_dimm_delete_failed(uint32_t slot) "slot[0x%"PRIx32"] dimm delete failed"
 
 # hw/i386/pc.c
-mhp_pc_dimm_assigned_slot(int slot) "0x%d"
-mhp_pc_dimm_assigned_address(uint64_t addr) "0x%"PRIx64
+mhp_dimm_assigned_slot(int slot) "0x%d"
+mhp_dimm_assigned_address(uint64_t addr) "0x%"PRIx64
 
 # target-s390x/kvm.c
 kvm_enable_cmma(int rc) "CMMA: enabling with result code %d"
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH v3 15/32] stubs: rename qmp_pc_dimm_device_list.c
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Rename qmp_pc_dimm_device_list.c to qmp_dimm_device_list.c

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 stubs/Makefile.objs                                         | 2 +-
 stubs/{qmp_pc_dimm_device_list.c => qmp_dimm_device_list.c} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename stubs/{qmp_pc_dimm_device_list.c => qmp_dimm_device_list.c} (100%)

diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index 85e4e81..c2fdcbb 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -37,5 +37,5 @@ stub-obj-y += vmstate.o
 stub-obj-$(CONFIG_WIN32) += fd-register.o
 stub-obj-y += cpus.o
 stub-obj-y += kvm.o
-stub-obj-y += qmp_pc_dimm_device_list.o
+stub-obj-y += qmp_dimm_device_list.o
 stub-obj-y += target-monitor-defs.o
diff --git a/stubs/qmp_pc_dimm_device_list.c b/stubs/qmp_dimm_device_list.c
similarity index 100%
rename from stubs/qmp_pc_dimm_device_list.c
rename to stubs/qmp_dimm_device_list.c
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH v3 16/32] pc-dimm: rename pc-dimm.c and pc-dimm.h
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Rename:
   pc-dimm.c => dimm.c
   pc-dimm.h => dimm.h

It prepares for the later patches that abstract a generic dimm device type
shared by pc-dimm and nvdimm

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/Makefile.objs                     | 2 +-
 hw/acpi/ich9.c                       | 2 +-
 hw/acpi/memory_hotplug.c             | 4 ++--
 hw/acpi/piix4.c                      | 2 +-
 hw/i386/pc.c                         | 2 +-
 hw/mem/Makefile.objs                 | 2 +-
 hw/mem/{pc-dimm.c => dimm.c}         | 2 +-
 hw/ppc/spapr.c                       | 2 +-
 include/hw/i386/pc.h                 | 2 +-
 include/hw/mem/{pc-dimm.h => dimm.h} | 0
 include/hw/ppc/spapr.h               | 2 +-
 numa.c                               | 2 +-
 qmp.c                                | 2 +-
 stubs/qmp_dimm_device_list.c         | 2 +-
 14 files changed, 14 insertions(+), 14 deletions(-)
 rename hw/mem/{pc-dimm.c => dimm.c} (99%)
 rename include/hw/mem/{pc-dimm.h => dimm.h} (100%)

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 7e7c241..12ecda9 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -30,8 +30,8 @@ devices-dirs-$(CONFIG_SOFTMMU) += vfio/
 devices-dirs-$(CONFIG_VIRTIO) += virtio/
 devices-dirs-$(CONFIG_SOFTMMU) += watchdog/
 devices-dirs-$(CONFIG_SOFTMMU) += xen/
-devices-dirs-$(CONFIG_MEM_HOTPLUG) += mem/
 devices-dirs-$(CONFIG_SMBIOS) += smbios/
+devices-dirs-y += mem/
 devices-dirs-y += core/
 common-obj-y += $(devices-dirs-y)
 obj-y += $(devices-dirs-y)
diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index b0d6a67..1e9ae20 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -35,7 +35,7 @@
 #include "exec/address-spaces.h"
 
 #include "hw/i386/ich9.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 
 //#define DEBUG
 
diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
index 1f6cccc..e232641 100644
--- a/hw/acpi/memory_hotplug.c
+++ b/hw/acpi/memory_hotplug.c
@@ -1,6 +1,6 @@
 #include "hw/acpi/memory_hotplug.h"
 #include "hw/acpi/pc-hotplug.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 #include "hw/boards.h"
 #include "hw/qdev-core.h"
 #include "trace.h"
@@ -148,7 +148,7 @@ static void acpi_memory_hotplug_write(void *opaque, hwaddr addr, uint64_t data,
 
             dev = DEVICE(mdev->dimm);
             hotplug_ctrl = qdev_get_hotplug_handler(dev);
-            /* call pc-dimm unplug cb */
+            /* call dimm unplug cb */
             hotplug_handler_unplug(hotplug_ctrl, dev, &local_err);
             if (local_err) {
                 trace_mhp_acpi_dimm_delete_failed(mem_st->selector);
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 0b2cb6e..b2f5b2c 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -33,7 +33,7 @@
 #include "hw/acpi/pcihp.h"
 #include "hw/acpi/cpu_hotplug.h"
 #include "hw/hotplug.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 #include "hw/acpi/memory_hotplug.h"
 #include "hw/acpi/acpi_dev_interface.h"
 #include "hw/xen/xen.h"
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index d6b9fa7..6694b18 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -62,7 +62,7 @@
 #include "hw/boards.h"
 #include "hw/pci/pci_host.h"
 #include "acpi-build.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 #include "qapi/visitor.h"
 #include "qapi-visit.h"
 
diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs
index b000fb4..7563ef5 100644
--- a/hw/mem/Makefile.objs
+++ b/hw/mem/Makefile.objs
@@ -1 +1 @@
-common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o
+common-obj-$(CONFIG_MEM_HOTPLUG) += dimm.o
diff --git a/hw/mem/pc-dimm.c b/hw/mem/dimm.c
similarity index 99%
rename from hw/mem/pc-dimm.c
rename to hw/mem/dimm.c
index 9e26bf7..e007271 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/dimm.c
@@ -18,7 +18,7 @@
  * License along with this library; if not, see <http://www.gnu.org/licenses/>
  */
 
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 #include "qemu/config-file.h"
 #include "qapi/visitor.h"
 #include "qemu/range.h"
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 4fb91a5..171fa77 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2138,7 +2138,7 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
          *
          * - Memory gets hotplugged to a different node than what the user
          *   specified.
-         * - Since pc-dimm subsystem in QEMU still thinks that memory belongs
+         * - Since dimm subsystem in QEMU still thinks that memory belongs
          *   to memory-less node, a reboot will set things accordingly
          *   and the previously hotplugged memory now ends in the right node.
          *   This appears as if some memory moved from one node to another.
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 0503485..693b6c5 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -16,7 +16,7 @@
 #include "hw/pci/pci.h"
 #include "hw/boards.h"
 #include "hw/compat.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 
 #define HPET_INTCAP "hpet-intcap"
 
diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/dimm.h
similarity index 100%
rename from include/hw/mem/pc-dimm.h
rename to include/hw/mem/dimm.h
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 56c5b0b..ec46ce0 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -5,7 +5,7 @@
 #include "hw/boards.h"
 #include "hw/ppc/xics.h"
 #include "hw/ppc/spapr_drc.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 
 struct VIOsPAPRBus;
 struct sPAPRPHBState;
diff --git a/numa.c b/numa.c
index cb69965..34c6d6b 100644
--- a/numa.c
+++ b/numa.c
@@ -34,7 +34,7 @@
 #include "hw/boards.h"
 #include "sysemu/hostmem.h"
 #include "qmp-commands.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 #include "qemu/option.h"
 #include "qemu/config-file.h"
 
diff --git a/qmp.c b/qmp.c
index 6709aea..54c9c37 100644
--- a/qmp.c
+++ b/qmp.c
@@ -30,7 +30,7 @@
 #include "qapi/qmp-input-visitor.h"
 #include "hw/boards.h"
 #include "qom/object_interfaces.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 #include "hw/acpi/acpi_dev_interface.h"
 
 NameInfo *qmp_query_name(Error **errp)
diff --git a/stubs/qmp_dimm_device_list.c b/stubs/qmp_dimm_device_list.c
index b2704c6..fb66400 100644
--- a/stubs/qmp_dimm_device_list.c
+++ b/stubs/qmp_dimm_device_list.c
@@ -1,5 +1,5 @@
 #include "qom/object.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 
 int qmp_dimm_device_list(Object *obj, void *opaque)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH v3 17/32] dimm: abstract dimm device from pc-dimm
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

A base device, dimm, is abstracted from pc-dimm so that the nvdimm device
can be built on top of dimm in a later patch

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 default-configs/i386-softmmu.mak   |  1 +
 default-configs/x86_64-softmmu.mak |  1 +
 hw/mem/Makefile.objs               |  3 ++-
 hw/mem/dimm.c                      | 11 ++-------
 hw/mem/pc-dimm.c                   | 46 ++++++++++++++++++++++++++++++++++++++
 include/hw/mem/dimm.h              |  4 ++--
 include/hw/mem/pc-dimm.h           |  7 ++++++
 7 files changed, 61 insertions(+), 12 deletions(-)
 create mode 100644 hw/mem/pc-dimm.c
 create mode 100644 include/hw/mem/pc-dimm.h

diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 43c96d1..3ece8bb 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -18,6 +18,7 @@ CONFIG_FDC=y
 CONFIG_ACPI=y
 CONFIG_ACPI_X86=y
 CONFIG_ACPI_X86_ICH=y
+CONFIG_DIMM=y
 CONFIG_ACPI_MEMORY_HOTPLUG=y
 CONFIG_ACPI_CPU_HOTPLUG=y
 CONFIG_APM=y
diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
index dfb8095..92ea7c1 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -18,6 +18,7 @@ CONFIG_FDC=y
 CONFIG_ACPI=y
 CONFIG_ACPI_X86=y
 CONFIG_ACPI_X86_ICH=y
+CONFIG_DIMM=y
 CONFIG_ACPI_MEMORY_HOTPLUG=y
 CONFIG_ACPI_CPU_HOTPLUG=y
 CONFIG_APM=y
diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs
index 7563ef5..cebb4b1 100644
--- a/hw/mem/Makefile.objs
+++ b/hw/mem/Makefile.objs
@@ -1 +1,2 @@
-common-obj-$(CONFIG_MEM_HOTPLUG) += dimm.o
+common-obj-$(CONFIG_DIMM) += dimm.o
+common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o
diff --git a/hw/mem/dimm.c b/hw/mem/dimm.c
index e007271..2e35764 100644
--- a/hw/mem/dimm.c
+++ b/hw/mem/dimm.c
@@ -1,5 +1,5 @@
 /*
- * Dimm device for Memory Hotplug
+ * Dimm device abstraction
  *
  * Copyright ProfitBricks GmbH 2012
  * Copyright (C) 2014 Red Hat Inc
@@ -425,21 +425,13 @@ static void dimm_realize(DeviceState *dev, Error **errp)
     }
 }
 
-static MemoryRegion *dimm_get_memory_region(DIMMDevice *dimm)
-{
-    return host_memory_backend_get_memory(dimm->hostmem, &error_abort);
-}
-
 static void dimm_class_init(ObjectClass *oc, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(oc);
-    DIMMDeviceClass *ddc = DIMM_CLASS(oc);
 
     dc->realize = dimm_realize;
     dc->props = dimm_properties;
     dc->desc = "DIMM memory module";
-
-    ddc->get_memory_region = dimm_get_memory_region;
 }
 
 static TypeInfo dimm_info = {
@@ -449,6 +441,7 @@ static TypeInfo dimm_info = {
     .instance_init = dimm_init,
     .class_init    = dimm_class_init,
     .class_size    = sizeof(DIMMDeviceClass),
+    .abstract      = true,
 };
 
 static void dimm_register_types(void)
diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
new file mode 100644
index 0000000..38323e9
--- /dev/null
+++ b/hw/mem/pc-dimm.c
@@ -0,0 +1,46 @@
+/*
+ * Dimm device for Memory Hotplug
+ *
+ * Copyright ProfitBricks GmbH 2012
+ * Copyright (C) 2014 Red Hat Inc
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "hw/mem/pc-dimm.h"
+
+static MemoryRegion *pc_dimm_get_memory_region(DIMMDevice *dimm)
+{
+    return host_memory_backend_get_memory(dimm->hostmem, &error_abort);
+}
+
+static void pc_dimm_class_init(ObjectClass *oc, void *data)
+{
+    DIMMDeviceClass *ddc = DIMM_CLASS(oc);
+
+    ddc->get_memory_region = pc_dimm_get_memory_region;
+}
+
+static TypeInfo pc_dimm_info = {
+    .name          = TYPE_PC_DIMM,
+    .parent        = TYPE_DIMM,
+    .class_init    = pc_dimm_class_init,
+};
+
+static void pc_dimm_register_types(void)
+{
+    type_register_static(&pc_dimm_info);
+}
+
+type_init(pc_dimm_register_types)
diff --git a/include/hw/mem/dimm.h b/include/hw/mem/dimm.h
index 5ddbf08..84a62ed 100644
--- a/include/hw/mem/dimm.h
+++ b/include/hw/mem/dimm.h
@@ -1,5 +1,5 @@
 /*
- * PC DIMM device
+ * Dimm device abstraction
  *
  * Copyright ProfitBricks GmbH 2012
  * Copyright (C) 2013-2014 Red Hat Inc
@@ -20,7 +20,7 @@
 #include "sysemu/hostmem.h"
 #include "hw/qdev.h"
 
-#define TYPE_DIMM "pc-dimm"
+#define TYPE_DIMM "dimm"
 #define DIMM(obj) \
     OBJECT_CHECK(DIMMDevice, (obj), TYPE_DIMM)
 #define DIMM_CLASS(oc) \
diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
new file mode 100644
index 0000000..50818c2
--- /dev/null
+++ b/include/hw/mem/pc-dimm.h
@@ -0,0 +1,7 @@
+#ifndef QEMU_PC_DIMM_H
+#define QEMU_PC_DIMM_H
+
+#include "hw/mem/dimm.h"
+
+#define TYPE_PC_DIMM "pc-dimm"
+#endif
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread

+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "hw/mem/pc-dimm.h"
+
+static MemoryRegion *pc_dimm_get_memory_region(DIMMDevice *dimm)
+{
+    return host_memory_backend_get_memory(dimm->hostmem, &error_abort);
+}
+
+static void pc_dimm_class_init(ObjectClass *oc, void *data)
+{
+    DIMMDeviceClass *ddc = DIMM_CLASS(oc);
+
+    ddc->get_memory_region = pc_dimm_get_memory_region;
+}
+
+static TypeInfo pc_dimm_info = {
+    .name          = TYPE_PC_DIMM,
+    .parent        = TYPE_DIMM,
+    .class_init    = pc_dimm_class_init,
+};
+
+static void pc_dimm_register_types(void)
+{
+    type_register_static(&pc_dimm_info);
+}
+
+type_init(pc_dimm_register_types)
diff --git a/include/hw/mem/dimm.h b/include/hw/mem/dimm.h
index 5ddbf08..84a62ed 100644
--- a/include/hw/mem/dimm.h
+++ b/include/hw/mem/dimm.h
@@ -1,5 +1,5 @@
 /*
- * PC DIMM device
+ * Dimm device abstraction
  *
  * Copyright ProfitBricks GmbH 2012
  * Copyright (C) 2013-2014 Red Hat Inc
@@ -20,7 +20,7 @@
 #include "sysemu/hostmem.h"
 #include "hw/qdev.h"
 
-#define TYPE_DIMM "pc-dimm"
+#define TYPE_DIMM "dimm"
 #define DIMM(obj) \
     OBJECT_CHECK(DIMMDevice, (obj), TYPE_DIMM)
 #define DIMM_CLASS(oc) \
diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
new file mode 100644
index 0000000..50818c2
--- /dev/null
+++ b/include/hw/mem/pc-dimm.h
@@ -0,0 +1,7 @@
+#ifndef QEMU_PC_DIMM_H
+#define QEMU_PC_DIMM_H
+
+#include "hw/mem/dimm.h"
+
+#define TYPE_PC_DIMM "pc-dimm"
+#endif
-- 
1.8.3.1


* [PATCH v3 18/32] dimm: get mapped memory region from DIMMDeviceClass->get_memory_region
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Currently, the memory region of the backing memory is directly mapped
into the guest's address space; however, this is not true for the
nvdimm device.

This patch makes the dimm device aware of this fact and uses the
DIMMDeviceClass->get_memory_region method to obtain the mapped memory
region.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/dimm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/mem/dimm.c b/hw/mem/dimm.c
index 2e35764..b307511 100644
--- a/hw/mem/dimm.c
+++ b/hw/mem/dimm.c
@@ -373,8 +373,9 @@ static void dimm_get_size(Object *obj, Visitor *v, void *opaque,
     int64_t value;
     MemoryRegion *mr;
     DIMMDevice *dimm = DIMM(obj);
+    DIMMDeviceClass *ddc = DIMM_GET_CLASS(obj);
 
-    mr = host_memory_backend_get_memory(dimm->hostmem, errp);
+    mr = ddc->get_memory_region(dimm);
     value = memory_region_size(mr);
 
     visit_type_int(v, &value, name, errp);
-- 
1.8.3.1



* [Qemu-devel] [PATCH v3 18/32] dimm: get mapped memory region from DIMMDeviceClass->get_memory_region
@ 2015-10-11  3:52   ` Xiao Guangrong
  0 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: Xiao Guangrong, ehabkost, kvm, mst, gleb, mtosatti, qemu-devel,
	stefanha, dan.j.williams, rth

Currently, the memory region of the backing memory is directly mapped
into the guest's address space; however, this is not true for the
nvdimm device.

This patch makes the dimm device aware of this fact and uses the
DIMMDeviceClass->get_memory_region method to obtain the mapped memory
region.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/dimm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/mem/dimm.c b/hw/mem/dimm.c
index 2e35764..b307511 100644
--- a/hw/mem/dimm.c
+++ b/hw/mem/dimm.c
@@ -373,8 +373,9 @@ static void dimm_get_size(Object *obj, Visitor *v, void *opaque,
     int64_t value;
     MemoryRegion *mr;
     DIMMDevice *dimm = DIMM(obj);
+    DIMMDeviceClass *ddc = DIMM_GET_CLASS(obj);
 
-    mr = host_memory_backend_get_memory(dimm->hostmem, errp);
+    mr = ddc->get_memory_region(dimm);
     value = memory_region_size(mr);
 
     visit_type_int(v, &value, name, errp);
-- 
1.8.3.1


* [PATCH v3 19/32] dimm: keep the state of the whole backend memory
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

QEMU preserves the state of a dimm device's memory during live
migration; however, this is not enough for the nvdimm device, as its
mapped memory does not contain its label data, so we should protect
the whole backend memory instead.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/dimm.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/hw/mem/dimm.c b/hw/mem/dimm.c
index b307511..efe964a 100644
--- a/hw/mem/dimm.c
+++ b/hw/mem/dimm.c
@@ -128,9 +128,16 @@ void dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
     }
 
     memory_region_add_subregion(&hpms->mr, addr - hpms->base, mr);
-    vmstate_register_ram(mr, dev);
     numa_set_mem_node_id(addr, memory_region_size(mr), dimm->node);
 
+    /*
+     * save the state only for @mr is not enough as it does not contain
+     * the label data of NVDIMM device, so that we keep the state of
+     * whole hostmem instead.
+     */
+    vmstate_register_ram(host_memory_backend_get_memory(dimm->hostmem, errp),
+                         dev);
+
 out:
     error_propagate(errp, local_err);
 }
@@ -139,10 +146,13 @@ void dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms,
                            MemoryRegion *mr)
 {
     DIMMDevice *dimm = DIMM(dev);
+    MemoryRegion *backend_mr;
+
+    backend_mr = host_memory_backend_get_memory(dimm->hostmem, &error_abort);
 
     numa_unset_mem_node_id(dimm->addr, memory_region_size(mr), dimm->node);
     memory_region_del_subregion(&hpms->mr, mr);
-    vmstate_unregister_ram(mr, dev);
+    vmstate_unregister_ram(backend_mr, dev);
 }
 
 int qmp_dimm_device_list(Object *obj, void *opaque)
-- 
1.8.3.1



* [Qemu-devel] [PATCH v3 19/32] dimm: keep the state of the whole backend memory
@ 2015-10-11  3:52   ` Xiao Guangrong
  0 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: Xiao Guangrong, ehabkost, kvm, mst, gleb, mtosatti, qemu-devel,
	stefanha, dan.j.williams, rth

QEMU preserves the state of a dimm device's memory during live
migration; however, this is not enough for the nvdimm device, as its
mapped memory does not contain its label data, so we should protect
the whole backend memory instead.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/dimm.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/hw/mem/dimm.c b/hw/mem/dimm.c
index b307511..efe964a 100644
--- a/hw/mem/dimm.c
+++ b/hw/mem/dimm.c
@@ -128,9 +128,16 @@ void dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
     }
 
     memory_region_add_subregion(&hpms->mr, addr - hpms->base, mr);
-    vmstate_register_ram(mr, dev);
     numa_set_mem_node_id(addr, memory_region_size(mr), dimm->node);
 
+    /*
+     * save the state only for @mr is not enough as it does not contain
+     * the label data of NVDIMM device, so that we keep the state of
+     * whole hostmem instead.
+     */
+    vmstate_register_ram(host_memory_backend_get_memory(dimm->hostmem, errp),
+                         dev);
+
 out:
     error_propagate(errp, local_err);
 }
@@ -139,10 +146,13 @@ void dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms,
                            MemoryRegion *mr)
 {
     DIMMDevice *dimm = DIMM(dev);
+    MemoryRegion *backend_mr;
+
+    backend_mr = host_memory_backend_get_memory(dimm->hostmem, &error_abort);
 
     numa_unset_mem_node_id(dimm->addr, memory_region_size(mr), dimm->node);
     memory_region_del_subregion(&hpms->mr, mr);
-    vmstate_unregister_ram(mr, dev);
+    vmstate_unregister_ram(backend_mr, dev);
 }
 
 int qmp_dimm_device_list(Object *obj, void *opaque)
-- 
1.8.3.1


* [PATCH v3 20/32] dimm: introduce realize callback
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

nvdimm needs to check whether the backend memory is large enough to
contain the label data and to initialize its memory region when the
device is realized, so introduce a realize callback which is called
after the common dimm realize has completed.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/dimm.c         | 5 +++++
 include/hw/mem/dimm.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/mem/dimm.c b/hw/mem/dimm.c
index efe964a..7a87761 100644
--- a/hw/mem/dimm.c
+++ b/hw/mem/dimm.c
@@ -422,6 +422,7 @@ static void dimm_init(Object *obj)
 static void dimm_realize(DeviceState *dev, Error **errp)
 {
     DIMMDevice *dimm = DIMM(dev);
+    DIMMDeviceClass *ddc = DIMM_GET_CLASS(dimm);
 
     if (!dimm->hostmem) {
         error_setg(errp, "'" DIMM_MEMDEV_PROP "' property is not set");
@@ -434,6 +435,10 @@ static void dimm_realize(DeviceState *dev, Error **errp)
                    dimm->node, nb_numa_nodes ? nb_numa_nodes : 1);
         return;
     }
+
+    if (ddc->realize) {
+        ddc->realize(dimm, errp);
+    }
 }
 
 static void dimm_class_init(ObjectClass *oc, void *data)
diff --git a/include/hw/mem/dimm.h b/include/hw/mem/dimm.h
index 84a62ed..663288d 100644
--- a/include/hw/mem/dimm.h
+++ b/include/hw/mem/dimm.h
@@ -65,6 +65,7 @@ typedef struct DIMMDeviceClass {
     DeviceClass parent_class;
 
     /* public */
+    void (*realize)(DIMMDevice *dimm, Error **errp);
     MemoryRegion *(*get_memory_region)(DIMMDevice *dimm);
 } DIMMDeviceClass;
 
-- 
1.8.3.1



* [Qemu-devel] [PATCH v3 20/32] dimm: introduce realize callback
@ 2015-10-11  3:52   ` Xiao Guangrong
  0 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: Xiao Guangrong, ehabkost, kvm, mst, gleb, mtosatti, qemu-devel,
	stefanha, dan.j.williams, rth

nvdimm needs to check whether the backend memory is large enough to
contain the label data and to initialize its memory region when the
device is realized, so introduce a realize callback which is called
after the common dimm realize has completed.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/dimm.c         | 5 +++++
 include/hw/mem/dimm.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/mem/dimm.c b/hw/mem/dimm.c
index efe964a..7a87761 100644
--- a/hw/mem/dimm.c
+++ b/hw/mem/dimm.c
@@ -422,6 +422,7 @@ static void dimm_init(Object *obj)
 static void dimm_realize(DeviceState *dev, Error **errp)
 {
     DIMMDevice *dimm = DIMM(dev);
+    DIMMDeviceClass *ddc = DIMM_GET_CLASS(dimm);
 
     if (!dimm->hostmem) {
         error_setg(errp, "'" DIMM_MEMDEV_PROP "' property is not set");
@@ -434,6 +435,10 @@ static void dimm_realize(DeviceState *dev, Error **errp)
                    dimm->node, nb_numa_nodes ? nb_numa_nodes : 1);
         return;
     }
+
+    if (ddc->realize) {
+        ddc->realize(dimm, errp);
+    }
 }
 
 static void dimm_class_init(ObjectClass *oc, void *data)
diff --git a/include/hw/mem/dimm.h b/include/hw/mem/dimm.h
index 84a62ed..663288d 100644
--- a/include/hw/mem/dimm.h
+++ b/include/hw/mem/dimm.h
@@ -65,6 +65,7 @@ typedef struct DIMMDeviceClass {
     DeviceClass parent_class;
 
     /* public */
+    void (*realize)(DIMMDevice *dimm, Error **errp);
     MemoryRegion *(*get_memory_region)(DIMMDevice *dimm);
 } DIMMDeviceClass;
 
-- 
1.8.3.1


* [PATCH v3 21/32] nvdimm: implement NVDIMM device abstract
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Introduce the "nvdimm" device, which is based on the dimm device type.

A 128K memory region, which is the minimum namespace label size
required by the NVDIMM Namespace Specification, is reserved at the
end of the backend memory device for label data.

We can use "-m 1G,maxmem=100G,slots=10 -object memory-backend-file,
id=mem1,size=1G,mem-path=/dev/pmem0 -device nvdimm,memdev=mem1" to
create an NVDIMM device for the guest.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 default-configs/i386-softmmu.mak   |  1 +
 default-configs/x86_64-softmmu.mak |  1 +
 hw/acpi/memory_hotplug.c           |  6 +++
 hw/mem/Makefile.objs               |  1 +
 hw/mem/nvdimm/internal.h           | 17 ++++++++
 hw/mem/nvdimm/nvdimm.c             | 85 ++++++++++++++++++++++++++++++++++++++
 include/hw/mem/nvdimm.h            | 33 +++++++++++++++
 7 files changed, 144 insertions(+)
 create mode 100644 hw/mem/nvdimm/internal.h
 create mode 100644 hw/mem/nvdimm/nvdimm.c
 create mode 100644 include/hw/mem/nvdimm.h

diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 3ece8bb..a1b24e5 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -47,6 +47,7 @@ CONFIG_APIC=y
 CONFIG_IOAPIC=y
 CONFIG_PVPANIC=y
 CONFIG_MEM_HOTPLUG=y
+CONFIG_NVDIMM = y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
 CONFIG_I82801B11=y
diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
index 92ea7c1..e3f5a0b 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -47,6 +47,7 @@ CONFIG_APIC=y
 CONFIG_IOAPIC=y
 CONFIG_PVPANIC=y
 CONFIG_MEM_HOTPLUG=y
+CONFIG_NVDIMM = y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
 CONFIG_I82801B11=y
diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
index e232641..92cd973 100644
--- a/hw/acpi/memory_hotplug.c
+++ b/hw/acpi/memory_hotplug.c
@@ -1,6 +1,7 @@
 #include "hw/acpi/memory_hotplug.h"
 #include "hw/acpi/pc-hotplug.h"
 #include "hw/mem/dimm.h"
+#include "hw/mem/nvdimm.h"
 #include "hw/boards.h"
 #include "hw/qdev-core.h"
 #include "trace.h"
@@ -231,6 +232,11 @@ void acpi_memory_plug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st,
 {
     MemStatus *mdev;
 
+    /* Currently, NVDIMM hotplug has not been supported yet. */
+    if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {
+        return;
+    }
+
     mdev = acpi_memory_slot_status(mem_st, dev, errp);
     if (!mdev) {
         return;
diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs
index cebb4b1..e0ff328 100644
--- a/hw/mem/Makefile.objs
+++ b/hw/mem/Makefile.objs
@@ -1,2 +1,3 @@
 common-obj-$(CONFIG_DIMM) += dimm.o
 common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o
+common-obj-$(CONFIG_NVDIMM) += nvdimm/nvdimm.o
diff --git a/hw/mem/nvdimm/internal.h b/hw/mem/nvdimm/internal.h
new file mode 100644
index 0000000..c4ba750
--- /dev/null
+++ b/hw/mem/nvdimm/internal.h
@@ -0,0 +1,17 @@
+/*
+ * Non-Volatile Dual In-line Memory Module Virtualization Implementation
+ *
+ * Copyright(C) 2015 Intel Corporation.
+ *
+ * Author:
+ *  Xiao Guangrong <guangrong.xiao@linux.intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef NVDIMM_INTERNAL_H
+#define NVDIMM_INTERNAL_H
+
+#define MIN_NAMESPACE_LABEL_SIZE    (128UL << 10)
+#endif
diff --git a/hw/mem/nvdimm/nvdimm.c b/hw/mem/nvdimm/nvdimm.c
new file mode 100644
index 0000000..0850e82
--- /dev/null
+++ b/hw/mem/nvdimm/nvdimm.c
@@ -0,0 +1,85 @@
+/*
+ * Non-Volatile Dual In-line Memory Module Virtualization Implementation
+ *
+ * Copyright(C) 2015 Intel Corporation.
+ *
+ * Author:
+ *  Xiao Guangrong <guangrong.xiao@linux.intel.com>
+ *
+ * Currently, it only supports PMEM Virtualization.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "qapi/visitor.h"
+#include "hw/mem/nvdimm.h"
+#include "internal.h"
+
+static MemoryRegion *nvdimm_get_memory_region(DIMMDevice *dimm)
+{
+    NVDIMMDevice *nvdimm = NVDIMM(dimm);
+
+    return memory_region_size(&nvdimm->nvdimm_mr) ? &nvdimm->nvdimm_mr : NULL;
+}
+
+static void nvdimm_realize(DIMMDevice *dimm, Error **errp)
+{
+    MemoryRegion *mr;
+    NVDIMMDevice *nvdimm = NVDIMM(dimm);
+    uint64_t size;
+
+    nvdimm->label_size = MIN_NAMESPACE_LABEL_SIZE;
+
+    mr = host_memory_backend_get_memory(dimm->hostmem, errp);
+    size = memory_region_size(mr);
+
+    if (size <= nvdimm->label_size) {
+        char *path = object_get_canonical_path_component(OBJECT(dimm->hostmem));
+        error_setg(errp, "the size of memdev %s (0x%" PRIx64 ") is too small"
+                   " to contain nvdimm namespace label (0x%" PRIx64 ")", path,
+                   memory_region_size(mr), nvdimm->label_size);
+        return;
+    }
+
+    memory_region_init_alias(&nvdimm->nvdimm_mr, OBJECT(dimm), "nvdimm-memory",
+                             mr, 0, size - nvdimm->label_size);
+    nvdimm->label_data = memory_region_get_ram_ptr(mr) +
+                         memory_region_size(&nvdimm->nvdimm_mr);
+}
+
+static void nvdimm_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    DIMMDeviceClass *ddc = DIMM_CLASS(oc);
+
+    /* nvdimm hotplug has not been supported yet. */
+    dc->hotpluggable = false;
+
+    ddc->realize = nvdimm_realize;
+    ddc->get_memory_region = nvdimm_get_memory_region;
+}
+
+static TypeInfo nvdimm_info = {
+    .name          = TYPE_NVDIMM,
+    .parent        = TYPE_DIMM,
+    .instance_size = sizeof(NVDIMMDevice),
+    .class_init    = nvdimm_class_init,
+};
+
+static void nvdimm_register_types(void)
+{
+    type_register_static(&nvdimm_info);
+}
+
+type_init(nvdimm_register_types)
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
new file mode 100644
index 0000000..f6bd2c4
--- /dev/null
+++ b/include/hw/mem/nvdimm.h
@@ -0,0 +1,33 @@
+/*
+ * Non-Volatile Dual In-line Memory Module Virtualization Implementation
+ *
+ * Copyright(C) 2015 Intel Corporation.
+ *
+ * Author:
+ *  Xiao Guangrong <guangrong.xiao@linux.intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_NVDIMM_H
+#define QEMU_NVDIMM_H
+
+#include "hw/mem/dimm.h"
+
+#define TYPE_NVDIMM "nvdimm"
+#define NVDIMM(obj) \
+    OBJECT_CHECK(NVDIMMDevice, (obj), TYPE_NVDIMM)
+
+struct NVDIMMDevice {
+    /* private */
+    DIMMDevice parent_obj;
+
+    /* public */
+    uint64_t label_size;
+    void *label_data;
+    MemoryRegion nvdimm_mr;
+};
+typedef struct NVDIMMDevice NVDIMMDevice;
+
+#endif
-- 
1.8.3.1



* [Qemu-devel] [PATCH v3 21/32] nvdimm: implement NVDIMM device abstract
@ 2015-10-11  3:52   ` Xiao Guangrong
  0 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: Xiao Guangrong, ehabkost, kvm, mst, gleb, mtosatti, qemu-devel,
	stefanha, dan.j.williams, rth

Introduce the "nvdimm" device, which is based on the dimm device type.

A 128K memory region, which is the minimum namespace label size
required by the NVDIMM Namespace Specification, is reserved at the
end of the backend memory device for label data.

We can use "-m 1G,maxmem=100G,slots=10 -object memory-backend-file,
id=mem1,size=1G,mem-path=/dev/pmem0 -device nvdimm,memdev=mem1" to
create an NVDIMM device for the guest.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 default-configs/i386-softmmu.mak   |  1 +
 default-configs/x86_64-softmmu.mak |  1 +
 hw/acpi/memory_hotplug.c           |  6 +++
 hw/mem/Makefile.objs               |  1 +
 hw/mem/nvdimm/internal.h           | 17 ++++++++
 hw/mem/nvdimm/nvdimm.c             | 85 ++++++++++++++++++++++++++++++++++++++
 include/hw/mem/nvdimm.h            | 33 +++++++++++++++
 7 files changed, 144 insertions(+)
 create mode 100644 hw/mem/nvdimm/internal.h
 create mode 100644 hw/mem/nvdimm/nvdimm.c
 create mode 100644 include/hw/mem/nvdimm.h

diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 3ece8bb..a1b24e5 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -47,6 +47,7 @@ CONFIG_APIC=y
 CONFIG_IOAPIC=y
 CONFIG_PVPANIC=y
 CONFIG_MEM_HOTPLUG=y
+CONFIG_NVDIMM = y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
 CONFIG_I82801B11=y
diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
index 92ea7c1..e3f5a0b 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -47,6 +47,7 @@ CONFIG_APIC=y
 CONFIG_IOAPIC=y
 CONFIG_PVPANIC=y
 CONFIG_MEM_HOTPLUG=y
+CONFIG_NVDIMM = y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
 CONFIG_I82801B11=y
diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
index e232641..92cd973 100644
--- a/hw/acpi/memory_hotplug.c
+++ b/hw/acpi/memory_hotplug.c
@@ -1,6 +1,7 @@
 #include "hw/acpi/memory_hotplug.h"
 #include "hw/acpi/pc-hotplug.h"
 #include "hw/mem/dimm.h"
+#include "hw/mem/nvdimm.h"
 #include "hw/boards.h"
 #include "hw/qdev-core.h"
 #include "trace.h"
@@ -231,6 +232,11 @@ void acpi_memory_plug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st,
 {
     MemStatus *mdev;
 
+    /* Currently, NVDIMM hotplug has not been supported yet. */
+    if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {
+        return;
+    }
+
     mdev = acpi_memory_slot_status(mem_st, dev, errp);
     if (!mdev) {
         return;
diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs
index cebb4b1..e0ff328 100644
--- a/hw/mem/Makefile.objs
+++ b/hw/mem/Makefile.objs
@@ -1,2 +1,3 @@
 common-obj-$(CONFIG_DIMM) += dimm.o
 common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o
+common-obj-$(CONFIG_NVDIMM) += nvdimm/nvdimm.o
diff --git a/hw/mem/nvdimm/internal.h b/hw/mem/nvdimm/internal.h
new file mode 100644
index 0000000..c4ba750
--- /dev/null
+++ b/hw/mem/nvdimm/internal.h
@@ -0,0 +1,17 @@
+/*
+ * Non-Volatile Dual In-line Memory Module Virtualization Implementation
+ *
+ * Copyright(C) 2015 Intel Corporation.
+ *
+ * Author:
+ *  Xiao Guangrong <guangrong.xiao@linux.intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef NVDIMM_INTERNAL_H
+#define NVDIMM_INTERNAL_H
+
+#define MIN_NAMESPACE_LABEL_SIZE    (128UL << 10)
+#endif
diff --git a/hw/mem/nvdimm/nvdimm.c b/hw/mem/nvdimm/nvdimm.c
new file mode 100644
index 0000000..0850e82
--- /dev/null
+++ b/hw/mem/nvdimm/nvdimm.c
@@ -0,0 +1,85 @@
+/*
+ * Non-Volatile Dual In-line Memory Module Virtualization Implementation
+ *
+ * Copyright(C) 2015 Intel Corporation.
+ *
+ * Author:
+ *  Xiao Guangrong <guangrong.xiao@linux.intel.com>
+ *
+ * Currently, it only supports PMEM Virtualization.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "qapi/visitor.h"
+#include "hw/mem/nvdimm.h"
+#include "internal.h"
+
+static MemoryRegion *nvdimm_get_memory_region(DIMMDevice *dimm)
+{
+    NVDIMMDevice *nvdimm = NVDIMM(dimm);
+
+    return memory_region_size(&nvdimm->nvdimm_mr) ? &nvdimm->nvdimm_mr : NULL;
+}
+
+static void nvdimm_realize(DIMMDevice *dimm, Error **errp)
+{
+    MemoryRegion *mr;
+    NVDIMMDevice *nvdimm = NVDIMM(dimm);
+    uint64_t size;
+
+    nvdimm->label_size = MIN_NAMESPACE_LABEL_SIZE;
+
+    mr = host_memory_backend_get_memory(dimm->hostmem, errp);
+    size = memory_region_size(mr);
+
+    if (size <= nvdimm->label_size) {
+        char *path = object_get_canonical_path_component(OBJECT(dimm->hostmem));
+        error_setg(errp, "the size of memdev %s (0x%" PRIx64 ") is too small"
+                   " to contain nvdimm namespace label (0x%" PRIx64 ")", path,
+                   memory_region_size(mr), nvdimm->label_size);
+        return;
+    }
+
+    memory_region_init_alias(&nvdimm->nvdimm_mr, OBJECT(dimm), "nvdimm-memory",
+                             mr, 0, size - nvdimm->label_size);
+    nvdimm->label_data = memory_region_get_ram_ptr(mr) +
+                         memory_region_size(&nvdimm->nvdimm_mr);
+}
+
+static void nvdimm_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    DIMMDeviceClass *ddc = DIMM_CLASS(oc);
+
+    /* nvdimm hotplug has not been supported yet. */
+    dc->hotpluggable = false;
+
+    ddc->realize = nvdimm_realize;
+    ddc->get_memory_region = nvdimm_get_memory_region;
+}
+
+static TypeInfo nvdimm_info = {
+    .name          = TYPE_NVDIMM,
+    .parent        = TYPE_DIMM,
+    .instance_size = sizeof(NVDIMMDevice),
+    .class_init    = nvdimm_class_init,
+};
+
+static void nvdimm_register_types(void)
+{
+    type_register_static(&nvdimm_info);
+}
+
+type_init(nvdimm_register_types)
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
new file mode 100644
index 0000000..f6bd2c4
--- /dev/null
+++ b/include/hw/mem/nvdimm.h
@@ -0,0 +1,33 @@
+/*
+ * Non-Volatile Dual In-line Memory Module Virtualization Implementation
+ *
+ * Copyright(C) 2015 Intel Corporation.
+ *
+ * Author:
+ *  Xiao Guangrong <guangrong.xiao@linux.intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_NVDIMM_H
+#define QEMU_NVDIMM_H
+
+#include "hw/mem/dimm.h"
+
+#define TYPE_NVDIMM "nvdimm"
+#define NVDIMM(obj) \
+    OBJECT_CHECK(NVDIMMDevice, (obj), TYPE_NVDIMM)
+
+struct NVDIMMDevice {
+    /* private */
+    DIMMDevice parent_obj;
+
+    /* public */
+    uint64_t label_size;
+    void *label_data;
+    MemoryRegion nvdimm_mr;
+};
+typedef struct NVDIMMDevice NVDIMMDevice;
+
+#endif
-- 
1.8.3.1


* [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

We reserve the memory region 0xFF00000 ~ 0xFFF00000 for NVDIMM ACPI,
which is used as follows:
- the first page is mapped as MMIO; ACPI writes data to this page to
  transfer control to QEMU

- the second page is RAM-based and is used to save the input info of
  the _DSM method; QEMU reuses it to store the output info

- the rest is mapped as RAM; it is the buffer returned by the _FIT
  method, which is needed for NVDIMM hotplug

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/i386/pc.c            |   3 ++
 hw/mem/Makefile.objs    |   2 +-
 hw/mem/nvdimm/acpi.c    | 120 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/hw/i386/pc.h    |   2 +
 include/hw/mem/nvdimm.h |  19 ++++++++
 5 files changed, 145 insertions(+), 1 deletion(-)
 create mode 100644 hw/mem/nvdimm/acpi.c

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 6694b18..8fea4c3 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1360,6 +1360,9 @@ FWCfgState *pc_memory_init(PCMachineState *pcms,
             exit(EXIT_FAILURE);
         }
 
+        nvdimm_init_memory_state(&pcms->nvdimm_memory, system_memory, machine,
+                                 TARGET_PAGE_SIZE);
+
         pcms->hotplug_memory.base =
             ROUND_UP(0x100000000ULL + pcms->above_4g_mem_size, 1ULL << 30);
 
diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs
index e0ff328..7310bac 100644
--- a/hw/mem/Makefile.objs
+++ b/hw/mem/Makefile.objs
@@ -1,3 +1,3 @@
 common-obj-$(CONFIG_DIMM) += dimm.o
 common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o
-common-obj-$(CONFIG_NVDIMM) += nvdimm/nvdimm.o
+common-obj-$(CONFIG_NVDIMM) += nvdimm/nvdimm.o nvdimm/acpi.o
diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
new file mode 100644
index 0000000..b640874
--- /dev/null
+++ b/hw/mem/nvdimm/acpi.c
@@ -0,0 +1,120 @@
+/*
+ * NVDIMM ACPI Implementation
+ *
+ * Copyright(C) 2015 Intel Corporation.
+ *
+ * Author:
+ *  Xiao Guangrong <guangrong.xiao@linux.intel.com>
+ *
+ * NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT)
+ * and the DSM specification can be found at:
+ *       http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
+ *
+ * Currently, it only supports PMEM Virtualization.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "qemu-common.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/aml-build.h"
+#include "hw/mem/nvdimm.h"
+#include "internal.h"
+
+/* System Physical Address Range Structure */
+struct nfit_spa {
+    uint16_t type;
+    uint16_t length;
+    uint16_t spa_index;
+    uint16_t flags;
+    uint32_t reserved;
+    uint32_t proximity_domain;
+    uint8_t type_guid[16];
+    uint64_t spa_base;
+    uint64_t spa_length;
+    uint64_t mem_attr;
+} QEMU_PACKED;
+typedef struct nfit_spa nfit_spa;
+
+/* Memory Device to System Physical Address Range Mapping Structure */
+struct nfit_memdev {
+    uint16_t type;
+    uint16_t length;
+    uint32_t nfit_handle;
+    uint16_t phys_id;
+    uint16_t region_id;
+    uint16_t spa_index;
+    uint16_t dcr_index;
+    uint64_t region_len;
+    uint64_t region_offset;
+    uint64_t region_dpa;
+    uint16_t interleave_index;
+    uint16_t interleave_ways;
+    uint16_t flags;
+    uint16_t reserved;
+} QEMU_PACKED;
+typedef struct nfit_memdev nfit_memdev;
+
+/* NVDIMM Control Region Structure */
+struct nfit_dcr {
+    uint16_t type;
+    uint16_t length;
+    uint16_t dcr_index;
+    uint16_t vendor_id;
+    uint16_t device_id;
+    uint16_t revision_id;
+    uint16_t sub_vendor_id;
+    uint16_t sub_device_id;
+    uint16_t sub_revision_id;
+    uint8_t reserved[6];
+    uint32_t serial_number;
+    uint16_t fic;
+    uint16_t num_bcw;
+    uint64_t bcw_size;
+    uint64_t cmd_offset;
+    uint64_t cmd_size;
+    uint64_t status_offset;
+    uint64_t status_size;
+    uint16_t flags;
+    uint8_t reserved2[6];
+} QEMU_PACKED;
+typedef struct nfit_dcr nfit_dcr;
+
+static uint64_t nvdimm_device_structure_size(uint64_t slots)
+{
+    /* each nvdimm has three structures. */
+    return slots * (sizeof(nfit_spa) + sizeof(nfit_memdev) + sizeof(nfit_dcr));
+}
+
+static uint64_t nvdimm_acpi_memory_size(uint64_t slots, uint64_t page_size)
+{
+    uint64_t size = nvdimm_device_structure_size(slots);
+
+    /* two pages for nvdimm _DSM method. */
+    return size + page_size * 2;
+}
+
+void nvdimm_init_memory_state(NVDIMMState *state, MemoryRegion *system_memory,
+                              MachineState *machine, uint64_t page_size)
+{
+    QEMU_BUILD_BUG_ON(nvdimm_acpi_memory_size(ACPI_MAX_RAM_SLOTS,
+                                   page_size) >= NVDIMM_ACPI_MEM_SIZE);
+
+    state->base = NVDIMM_ACPI_MEM_BASE;
+    state->page_size = page_size;
+
+    memory_region_init(&state->mr, OBJECT(machine), "nvdimm-acpi",
+                       NVDIMM_ACPI_MEM_SIZE);
+    memory_region_add_subregion(system_memory, state->base, &state->mr);
+}
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 693b6c5..fd65c27 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -17,6 +17,7 @@
 #include "hw/boards.h"
 #include "hw/compat.h"
 #include "hw/mem/dimm.h"
+#include "hw/mem/nvdimm.h"
 
 #define HPET_INTCAP "hpet-intcap"
 
@@ -32,6 +33,7 @@ struct PCMachineState {
 
     /* <public> */
     MemoryHotplugState hotplug_memory;
+    NVDIMMState nvdimm_memory;
 
     HotplugHandler *acpi_dev;
     ISADevice *rtc;
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index f6bd2c4..aa95961 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -15,6 +15,10 @@
 
 #include "hw/mem/dimm.h"
 
+/* Memory region 0xFF000000 ~ 0xFFF00000 is reserved for NVDIMM ACPI. */
+#define NVDIMM_ACPI_MEM_BASE   0xFF000000ULL
+#define NVDIMM_ACPI_MEM_SIZE   0xF00000ULL
+
 #define TYPE_NVDIMM "nvdimm"
 #define NVDIMM(obj) \
     OBJECT_CHECK(NVDIMMDevice, (obj), TYPE_NVDIMM)
@@ -30,4 +34,19 @@ struct NVDIMMDevice {
 };
 typedef struct NVDIMMDevice NVDIMMDevice;
 
+/*
+ * NVDIMMState:
+ * @base: address in guest address space where NVDIMM ACPI memory begins.
+ * @page_size: the page size of target platform.
+ * @mr: NVDIMM ACPI memory address space container.
+ */
+struct NVDIMMState {
+    ram_addr_t base;
+    uint64_t page_size;
+    MemoryRegion mr;
+};
+typedef struct NVDIMMState NVDIMMState;
+
+void nvdimm_init_memory_state(NVDIMMState *state, MemoryRegion *system_memory,
+                              MachineState *machine, uint64_t page_size);
 #endif
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread


* [PATCH v3 23/32] nvdimm: build ACPI NFIT table
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT)

Currently, we only support PMEM mode. Each device has 3 structures:
- SPA structure, which defines the PMEM region info

- MEM DEV structure, which has the @handle used to associate the ACPI
  NVDIMM device we will introduce in a later patch.
  Also, we can happily ignore the memory device's interleave, since
  real nvdimm hardware access is hidden behind the host

- DCR structure, which defines the vendor ID used to associate the
  vendor-specific nvdimm driver. Since we only implement PMEM mode
  this time, the Command window and Data window are not needed

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/i386/acpi-build.c     |   4 +
 hw/mem/nvdimm/acpi.c     | 209 ++++++++++++++++++++++++++++++++++++++++++++++-
 hw/mem/nvdimm/internal.h |  13 +++
 hw/mem/nvdimm/nvdimm.c   |  25 ++++++
 include/hw/mem/nvdimm.h  |   2 +
 5 files changed, 252 insertions(+), 1 deletion(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 95e0c65..c637dc8 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1661,6 +1661,7 @@ static bool acpi_has_iommu(void)
 static
 void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
 {
+    PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
     GArray *table_offsets;
     unsigned facs, ssdt, dsdt, rsdt;
     AcpiCpuInfo cpu;
@@ -1742,6 +1743,9 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
         build_dmar_q35(tables_blob, tables->linker);
     }
 
+    nvdimm_build_acpi_table(&pcms->nvdimm_memory, table_offsets, tables_blob,
+                            tables->linker);
+
     /* Add tables supplied by user (if any) */
     for (u = acpi_table_first(); u; u = acpi_table_next(u)) {
         unsigned len = acpi_table_len(u);
diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
index b640874..62b1e02 100644
--- a/hw/mem/nvdimm/acpi.c
+++ b/hw/mem/nvdimm/acpi.c
@@ -32,6 +32,46 @@
 #include "hw/mem/nvdimm.h"
 #include "internal.h"
 
+static void nfit_spa_uuid_pm(uuid_le *uuid)
+{
+    uuid_le uuid_pm = UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d,
+                              0x33, 0x18, 0xb7, 0x8c, 0xdb);
+    memcpy(uuid, &uuid_pm, sizeof(uuid_pm));
+}
+
+enum {
+    NFIT_STRUCTURE_SPA = 0,
+    NFIT_STRUCTURE_MEMDEV = 1,
+    NFIT_STRUCTURE_IDT = 2,
+    NFIT_STRUCTURE_SMBIOS = 3,
+    NFIT_STRUCTURE_DCR = 4,
+    NFIT_STRUCTURE_BDW = 5,
+    NFIT_STRUCTURE_FLUSH = 6,
+};
+
+enum {
+    EFI_MEMORY_UC = 0x1ULL,
+    EFI_MEMORY_WC = 0x2ULL,
+    EFI_MEMORY_WT = 0x4ULL,
+    EFI_MEMORY_WB = 0x8ULL,
+    EFI_MEMORY_UCE = 0x10ULL,
+    EFI_MEMORY_WP = 0x1000ULL,
+    EFI_MEMORY_RP = 0x2000ULL,
+    EFI_MEMORY_XP = 0x4000ULL,
+    EFI_MEMORY_NV = 0x8000ULL,
+    EFI_MEMORY_MORE_RELIABLE = 0x10000ULL,
+};
+
+/*
+ * NVDIMM Firmware Interface Table
+ * @signature: "NFIT"
+ */
+struct nfit {
+    ACPI_TABLE_HEADER_DEF
+    uint32_t reserved;
+} QEMU_PACKED;
+typedef struct nfit nfit;
+
 /* System Physical Address Range Structure */
 struct nfit_spa {
     uint16_t type;
@@ -40,13 +80,21 @@ struct nfit_spa {
     uint16_t flags;
     uint32_t reserved;
     uint32_t proximity_domain;
-    uint8_t type_guid[16];
+    uuid_le type_guid;
     uint64_t spa_base;
     uint64_t spa_length;
     uint64_t mem_attr;
 } QEMU_PACKED;
 typedef struct nfit_spa nfit_spa;
 
+/*
+ * Control region is strictly for management during hot add/online
+ * operation.
+ */
+#define SPA_FLAGS_ADD_ONLINE_ONLY     (1)
+/* Data in Proximity Domain field is valid. */
+#define SPA_FLAGS_PROXIMITY_VALID     (1 << 1)
+
 /* Memory Device to System Physical Address Range Mapping Structure */
 struct nfit_memdev {
     uint16_t type;
@@ -91,12 +139,20 @@ struct nfit_dcr {
 } QEMU_PACKED;
 typedef struct nfit_dcr nfit_dcr;
 
+#define REVSISON_ID    1
+#define NFIT_FIC1      0x201
+
 static uint64_t nvdimm_device_structure_size(uint64_t slots)
 {
     /* each nvdimm has three structures. */
     return slots * (sizeof(nfit_spa) + sizeof(nfit_memdev) + sizeof(nfit_dcr));
 }
 
+static uint64_t get_nfit_total_size(uint64_t slots)
+{
+    return sizeof(struct nfit) + nvdimm_device_structure_size(slots);
+}
+
 static uint64_t nvdimm_acpi_memory_size(uint64_t slots, uint64_t page_size)
 {
     uint64_t size = nvdimm_device_structure_size(slots);
@@ -118,3 +174,154 @@ void nvdimm_init_memory_state(NVDIMMState *state, MemoryRegion*system_memory,
                        NVDIMM_ACPI_MEM_SIZE);
     memory_region_add_subregion(system_memory, state->base, &state->mr);
 }
+
+static uint32_t nvdimm_slot_to_sn(int slot)
+{
+    return 0x123456 + slot;
+}
+
+static uint32_t nvdimm_slot_to_handle(int slot)
+{
+    return slot + 1;
+}
+
+static uint16_t nvdimm_slot_to_spa_index(int slot)
+{
+    return (slot + 1) << 1;
+}
+
+static uint32_t nvdimm_slot_to_dcr_index(int slot)
+{
+    return nvdimm_slot_to_spa_index(slot) + 1;
+}
+
+static int build_structure_spa(void *buf, NVDIMMDevice *nvdimm)
+{
+    nfit_spa *nfit_spa;
+    uint64_t addr = object_property_get_int(OBJECT(nvdimm), DIMM_ADDR_PROP,
+                                            NULL);
+    uint64_t size = object_property_get_int(OBJECT(nvdimm), DIMM_SIZE_PROP,
+                                            NULL);
+    uint32_t node = object_property_get_int(OBJECT(nvdimm), DIMM_NODE_PROP,
+                                            NULL);
+    int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
+                                            NULL);
+
+    nfit_spa = buf;
+
+    nfit_spa->type = cpu_to_le16(NFIT_STRUCTURE_SPA);
+    nfit_spa->length = cpu_to_le16(sizeof(*nfit_spa));
+    nfit_spa->spa_index = cpu_to_le16(nvdimm_slot_to_spa_index(slot));
+    nfit_spa->flags = cpu_to_le16(SPA_FLAGS_PROXIMITY_VALID);
+    nfit_spa->proximity_domain = cpu_to_le32(node);
+    nfit_spa_uuid_pm(&nfit_spa->type_guid);
+    nfit_spa->spa_base = cpu_to_le64(addr);
+    nfit_spa->spa_length = cpu_to_le64(size);
+    nfit_spa->mem_attr = cpu_to_le64(EFI_MEMORY_WB | EFI_MEMORY_NV);
+
+    return sizeof(*nfit_spa);
+}
+
+static int build_structure_memdev(void *buf, NVDIMMDevice *nvdimm)
+{
+    nfit_memdev *nfit_memdev;
+    uint64_t addr = object_property_get_int(OBJECT(nvdimm), DIMM_ADDR_PROP,
+                                            NULL);
+    uint64_t size = object_property_get_int(OBJECT(nvdimm), DIMM_SIZE_PROP,
+                                            NULL);
+    int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
+                                            NULL);
+    uint32_t handle = nvdimm_slot_to_handle(slot);
+
+    nfit_memdev = buf;
+    nfit_memdev->type = cpu_to_le16(NFIT_STRUCTURE_MEMDEV);
+    nfit_memdev->length = cpu_to_le16(sizeof(*nfit_memdev));
+    nfit_memdev->nfit_handle = cpu_to_le32(handle);
+    /* point to nfit_spa. */
+    nfit_memdev->spa_index = cpu_to_le16(nvdimm_slot_to_spa_index(slot));
+    /* point to nfit_dcr. */
+    nfit_memdev->dcr_index = cpu_to_le16(nvdimm_slot_to_dcr_index(slot));
+    nfit_memdev->region_len = cpu_to_le64(size);
+    nfit_memdev->region_dpa = cpu_to_le64(addr);
+    /* Only one interleave for pmem. */
+    nfit_memdev->interleave_ways = cpu_to_le16(1);
+
+    return sizeof(*nfit_memdev);
+}
+
+static int build_structure_dcr(void *buf, NVDIMMDevice *nvdimm)
+{
+    nfit_dcr *nfit_dcr;
+    int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
+                                       NULL);
+    uint32_t sn = nvdimm_slot_to_sn(slot);
+
+    nfit_dcr = buf;
+    nfit_dcr->type = cpu_to_le16(NFIT_STRUCTURE_DCR);
+    nfit_dcr->length = cpu_to_le16(sizeof(*nfit_dcr));
+    nfit_dcr->dcr_index = cpu_to_le16(nvdimm_slot_to_dcr_index(slot));
+    nfit_dcr->vendor_id = cpu_to_le16(0x8086);
+    nfit_dcr->device_id = cpu_to_le16(1);
+    nfit_dcr->revision_id = cpu_to_le16(REVSISON_ID);
+    nfit_dcr->serial_number = cpu_to_le32(sn);
+    nfit_dcr->fic = cpu_to_le16(NFIT_FIC1);
+
+    return sizeof(*nfit_dcr);
+}
+
+static void build_device_structure(GSList *device_list, char *buf)
+{
+    buf += sizeof(nfit);
+
+    for (; device_list; device_list = device_list->next) {
+        NVDIMMDevice *nvdimm = device_list->data;
+
+        /* build System Physical Address Range Description Table. */
+        buf += build_structure_spa(buf, nvdimm);
+
+        /*
+         * build Memory Device to System Physical Address Range Mapping
+         * Table.
+         */
+        buf += build_structure_memdev(buf, nvdimm);
+
+        /* build Control Region Descriptor Table. */
+        buf += build_structure_dcr(buf, nvdimm);
+    }
+}
+
+static void build_nfit(GSList *device_list, GArray *table_offsets,
+                       GArray *table_data, GArray *linker)
+{
+    size_t total;
+    char *buf;
+    int nfit_start, nr;
+
+    nr = g_slist_length(device_list);
+    total = get_nfit_total_size(nr);
+
+    nfit_start = table_data->len;
+    acpi_add_table(table_offsets, table_data);
+
+    buf = acpi_data_push(table_data, total);
+    build_device_structure(device_list, buf);
+
+    build_header(linker, table_data, (void *)(table_data->data + nfit_start),
+                 "NFIT", table_data->len - nfit_start, 1);
+}
+
+void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
+                             GArray *table_data, GArray *linker)
+{
+    GSList *device_list = nvdimm_get_built_list();
+
+    if (!memory_region_size(&state->mr)) {
+        assert(!device_list);
+        return;
+    }
+
+    if (device_list) {
+        build_nfit(device_list, table_offsets, table_data, linker);
+        g_slist_free(device_list);
+    }
+}
diff --git a/hw/mem/nvdimm/internal.h b/hw/mem/nvdimm/internal.h
index c4ba750..5551448 100644
--- a/hw/mem/nvdimm/internal.h
+++ b/hw/mem/nvdimm/internal.h
@@ -14,4 +14,17 @@
 #define NVDIMM_INTERNAL_H
 
 #define MIN_NAMESPACE_LABEL_SIZE    (128UL << 10)
+
+struct uuid_le {
+    uint8_t b[16];
+};
+typedef struct uuid_le uuid_le;
+
+#define UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)                   \
+((uuid_le)                                                                 \
+{ { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
+    (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff,          \
+    (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) } })
+
+GSList *nvdimm_get_built_list(void);
 #endif
diff --git a/hw/mem/nvdimm/nvdimm.c b/hw/mem/nvdimm/nvdimm.c
index 0850e82..bc8c577 100644
--- a/hw/mem/nvdimm/nvdimm.c
+++ b/hw/mem/nvdimm/nvdimm.c
@@ -26,6 +26,31 @@
 #include "hw/mem/nvdimm.h"
 #include "internal.h"
 
+static int nvdimm_built_list(Object *obj, void *opaque)
+{
+    GSList **list = opaque;
+
+    if (object_dynamic_cast(obj, TYPE_NVDIMM)) {
+        DeviceState *dev = DEVICE(obj);
+
+        /* only realized NVDIMMs matter */
+        if (dev->realized) {
+            *list = g_slist_append(*list, dev);
+        }
+    }
+
+    object_child_foreach(obj, nvdimm_built_list, opaque);
+    return 0;
+}
+
+GSList *nvdimm_get_built_list(void)
+{
+    GSList *list = NULL;
+
+    object_child_foreach(qdev_get_machine(), nvdimm_built_list, &list);
+    return list;
+}
+
 static MemoryRegion *nvdimm_get_memory_region(DIMMDevice *dimm)
 {
     NVDIMMDevice *nvdimm = NVDIMM(dimm);
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index aa95961..0a6bda4 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -49,4 +49,6 @@ typedef struct NVDIMMState NVDIMMState;
 
 void nvdimm_init_memory_state(NVDIMMState *state, MemoryRegion *system_memory,
                               MachineState *machine, uint64_t page_size);
+void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
+                             GArray *table_data, GArray *linker);
 #endif
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread

+         * Table.
+         */
+        buf += build_structure_memdev(buf, nvdimm);
+
+        /* build Control Region Descriptor Table. */
+        buf += build_structure_dcr(buf, nvdimm);
+    }
+}
+
+static void build_nfit(GSList *device_list, GArray *table_offsets,
+                       GArray *table_data, GArray *linker)
+{
+    size_t total;
+    char *buf;
+    int nfit_start, nr;
+
+    nr = g_slist_length(device_list);
+    total = get_nfit_total_size(nr);
+
+    nfit_start = table_data->len;
+    acpi_add_table(table_offsets, table_data);
+
+    buf = acpi_data_push(table_data, total);
+    build_device_structure(device_list, buf);
+
+    build_header(linker, table_data, (void *)(table_data->data + nfit_start),
+                 "NFIT", table_data->len - nfit_start, 1);
+}
+
+void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
+                             GArray *table_data, GArray *linker)
+{
+    GSList *device_list = nvdimm_get_built_list();
+
+    if (!memory_region_size(&state->mr)) {
+        assert(!device_list);
+        return;
+    }
+
+    if (device_list) {
+        build_nfit(device_list, table_offsets, table_data, linker);
+        g_slist_free(device_list);
+    }
+}
diff --git a/hw/mem/nvdimm/internal.h b/hw/mem/nvdimm/internal.h
index c4ba750..5551448 100644
--- a/hw/mem/nvdimm/internal.h
+++ b/hw/mem/nvdimm/internal.h
@@ -14,4 +14,17 @@
 #define NVDIMM_INTERNAL_H
 
 #define MIN_NAMESPACE_LABEL_SIZE    (128UL << 10)
+
+struct uuid_le {
+    uint8_t b[16];
+};
+typedef struct uuid_le uuid_le;
+
+#define UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)                   \
+((uuid_le)                                                                 \
+{ { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
+    (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff,          \
+    (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) } })
+
+GSList *nvdimm_get_built_list(void);
 #endif
diff --git a/hw/mem/nvdimm/nvdimm.c b/hw/mem/nvdimm/nvdimm.c
index 0850e82..bc8c577 100644
--- a/hw/mem/nvdimm/nvdimm.c
+++ b/hw/mem/nvdimm/nvdimm.c
@@ -26,6 +26,31 @@
 #include "hw/mem/nvdimm.h"
 #include "internal.h"
 
+static int nvdimm_built_list(Object *obj, void *opaque)
+{
+    GSList **list = opaque;
+
+    if (object_dynamic_cast(obj, TYPE_NVDIMM)) {
+        DeviceState *dev = DEVICE(obj);
+
+        /* only realized NVDIMMs matter */
+        if (dev->realized) {
+            *list = g_slist_append(*list, dev);
+        }
+    }
+
+    object_child_foreach(obj, nvdimm_built_list, opaque);
+    return 0;
+}
+
+GSList *nvdimm_get_built_list(void)
+{
+    GSList *list = NULL;
+
+    object_child_foreach(qdev_get_machine(), nvdimm_built_list, &list);
+    return list;
+}
+
 static MemoryRegion *nvdimm_get_memory_region(DIMMDevice *dimm)
 {
     NVDIMMDevice *nvdimm = NVDIMM(dimm);
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index aa95961..0a6bda4 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -49,4 +49,6 @@ typedef struct NVDIMMState NVDIMMState;
 
 void nvdimm_init_memory_state(NVDIMMState *state, MemoryRegion*system_memory,
                               MachineState *machine , uint64_t page_size);
+void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
+                             GArray *table_data, GArray *linker);
 #endif
-- 
1.8.3.1


* [PATCH v3 24/32] nvdimm: init the address region used by DSM method
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Map the NVDIMM ACPI memory region to guest address space

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/nvdimm/acpi.c     | 75 ++++++++++++++++++++++++++++++++++++++++++++----
 hw/mem/nvdimm/internal.h |  8 ++++++
 2 files changed, 78 insertions(+), 5 deletions(-)

diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
index 62b1e02..1450a6a 100644
--- a/hw/mem/nvdimm/acpi.c
+++ b/hw/mem/nvdimm/acpi.c
@@ -271,8 +271,6 @@ static int build_structure_dcr(void *buf, NVDIMMDevice *nvdimm)
 
 static void build_device_structure(GSList *device_list, char *buf)
 {
-    buf += sizeof(nfit);
-
     for (; device_list; device_list = device_list->next) {
         NVDIMMDevice *nvdimm = device_list->data;
 
@@ -290,7 +288,7 @@ static void build_device_structure(GSList *device_list, char *buf)
     }
 }
 
-static void build_nfit(GSList *device_list, GArray *table_offsets,
+static void build_nfit(void *fit, GSList *device_list, GArray *table_offsets,
                        GArray *table_data, GArray *linker)
 {
     size_t total;
@@ -304,12 +302,76 @@ static void build_nfit(GSList *device_list, GArray *table_offsets,
     acpi_add_table(table_offsets, table_data);
 
     buf = acpi_data_push(table_data, total);
-    build_device_structure(device_list, buf);
+    memcpy(buf + sizeof(nfit), fit, total - sizeof(nfit));
 
     build_header(linker, table_data, (void *)(table_data->data + nfit_start),
                  "NFIT", table_data->len - nfit_start, 1);
 }
 
+static uint64_t dsm_read(void *opaque, hwaddr addr,
+                         unsigned size)
+{
+    return 0;
+}
+
+static void dsm_write(void *opaque, hwaddr addr,
+                      uint64_t val, unsigned size)
+{
+}
+
+static const MemoryRegionOps dsm_ops = {
+    .read = dsm_read,
+    .write = dsm_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+};
+
+static MemoryRegion *build_dsm_memory(NVDIMMState *state)
+{
+    MemoryRegion *dsm_ram_mr, *dsm_mmio_mr, *dsm_fit_mr;
+    uint64_t fit_size = memory_region_size(&state->mr) - state->page_size * 2;
+
+    /* DSM memory has already been built. */
+    dsm_fit_mr = memory_region_find(&state->mr, state->page_size * 2,
+                                    fit_size).mr;
+    if (dsm_fit_mr) {
+        nvdebug("DSM FIT has already been built by %s.\n", dsm_fit_mr->name);
+        memory_region_unref(dsm_fit_mr);
+        return dsm_fit_mr;
+    }
+
+    /*
+     * the first page is MMIO-based and is used to transfer control
+     * from guest ACPI to QEMU.
+     */
+    dsm_mmio_mr = g_new(MemoryRegion, 1);
+    memory_region_init_io(dsm_mmio_mr, NULL, &dsm_ops, state,
+                          "nvdimm.dsm_mmio", state->page_size);
+
+    /*
+     * the second page is RAM-based and is used to transfer data
+     * between guest ACPI and QEMU.
+     */
+    dsm_ram_mr = g_new(MemoryRegion, 1);
+    memory_region_init_ram(dsm_ram_mr, NULL, "nvdimm.dsm_ram",
+                           state->page_size, &error_abort);
+    vmstate_register_ram_global(dsm_ram_mr);
+
+    /*
+     * the rest is RAM-based and holds the _FIT buffer returned by
+     * the _FIT method.
+     */
+    dsm_fit_mr = g_new(MemoryRegion, 1);
+    memory_region_init_ram(dsm_fit_mr, NULL, "nvdimm.fit", fit_size,
+                           &error_abort);
+    vmstate_register_ram_global(dsm_fit_mr);
+
+    memory_region_add_subregion(&state->mr, 0, dsm_mmio_mr);
+    memory_region_add_subregion(&state->mr, state->page_size, dsm_ram_mr);
+    memory_region_add_subregion(&state->mr, state->page_size * 2, dsm_fit_mr);
+
+    return dsm_fit_mr;
+}
+
 void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
                              GArray *table_data, GArray *linker)
 {
@@ -321,7 +383,10 @@ void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
     }
 
     if (device_list) {
-        build_nfit(device_list, table_offsets, table_data, linker);
+        void *fit = memory_region_get_ram_ptr(build_dsm_memory(state));
+
+        build_device_structure(device_list, fit);
+        build_nfit(fit, device_list, table_offsets, table_data, linker);
         g_slist_free(device_list);
     }
 }
diff --git a/hw/mem/nvdimm/internal.h b/hw/mem/nvdimm/internal.h
index 5551448..1e95363 100644
--- a/hw/mem/nvdimm/internal.h
+++ b/hw/mem/nvdimm/internal.h
@@ -13,6 +13,14 @@
 #ifndef NVDIMM_INTERNAL_H
 #define NVDIMM_INTERNAL_H
 
+#define NVDIMM_DEBUG 0
+#define nvdebug(fmt, ...)                                     \
+    do {                                                      \
+        if (NVDIMM_DEBUG) {                                   \
+            fprintf(stderr, "nvdimm: " fmt, ## __VA_ARGS__);  \
+        }                                                     \
+    } while (0)
+
 #define MIN_NAMESPACE_LABEL_SIZE    (128UL << 10)
 
 struct uuid_le {
-- 
1.8.3.1




* [PATCH v3 25/32] nvdimm: build ACPI nvdimm devices
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

NVDIMM devices are defined in ACPI 6.0, section 9.20 (NVDIMM Devices)

A root device is placed under \_SB and the NVDIMM devices sit under that
root device. Each NVDIMM device has an _ADR method which returns the handle
used to associate it with its MEMDEV structure in the NFIT

Handle 0 is reserved for the root device. In this patch we save the handle,
arg0, arg1 and arg2; arg3 is conditionally saved in a later patch

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/nvdimm/acpi.c | 203 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 203 insertions(+)

diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
index 1450a6a..d9fa0fd 100644
--- a/hw/mem/nvdimm/acpi.c
+++ b/hw/mem/nvdimm/acpi.c
@@ -308,15 +308,38 @@ static void build_nfit(void *fit, GSList *device_list, GArray *table_offsets,
                  "NFIT", table_data->len - nfit_start, 1);
 }
 
+#define NOTIFY_VALUE      0x99
+
+struct dsm_in {
+    uint32_t handle;
+    uint8_t arg0[16];
+    uint32_t arg1;
+    uint32_t arg2;
+    /* the remaining size in the page is used by arg3. */
+    uint8_t arg3[0];
+} QEMU_PACKED;
+typedef struct dsm_in dsm_in;
+
+struct dsm_out {
+    /* the size of the buffer filled by QEMU. */
+    uint16_t len;
+    uint8_t data[0];
+} QEMU_PACKED;
+typedef struct dsm_out dsm_out;
+
 static uint64_t dsm_read(void *opaque, hwaddr addr,
                          unsigned size)
 {
+    fprintf(stderr, "BUG: we never read DSM notification MMIO.\n");
     return 0;
 }
 
 static void dsm_write(void *opaque, hwaddr addr,
                       uint64_t val, unsigned size)
 {
+    if (val != NOTIFY_VALUE) {
+        fprintf(stderr, "BUG: unexpected notify value 0x%" PRIx64 "\n", val);
+    }
 }
 
 static const MemoryRegionOps dsm_ops = {
@@ -372,6 +395,183 @@ static MemoryRegion *build_dsm_memory(NVDIMMState *state)
     return dsm_fit_mr;
 }
 
+#define BUILD_STA_METHOD(_dev_, _method_)                                  \
+    do {                                                                   \
+        _method_ = aml_method("_STA", 0);                                  \
+        aml_append(_method_, aml_return(aml_int(0x0f)));                   \
+        aml_append(_dev_, _method_);                                       \
+    } while (0)
+
+#define SAVE_ARG012_HANDLE_LOCK(_method_, _handle_)                        \
+    do {                                                                   \
+        aml_append(_method_, aml_acquire(aml_name("NLCK"), 0xFFFF));       \
+        aml_append(_method_, aml_store(_handle_, aml_name("HDLE")));       \
+        aml_append(_method_, aml_store(aml_arg(0), aml_name("ARG0")));     \
+        aml_append(_method_, aml_store(aml_arg(1), aml_name("ARG1")));     \
+        aml_append(_method_, aml_store(aml_arg(2), aml_name("ARG2")));     \
+    } while (0)
+
+#define NOTIFY_AND_RETURN_UNLOCK(_method_)                                 \
+    do {                                                                   \
+        aml_append(_method_, aml_store(aml_int(NOTIFY_VALUE),              \
+                   aml_name("NOTI")));                                     \
+        aml_append(_method_, aml_store(aml_name("RLEN"), aml_local(6)));   \
+        aml_append(_method_, aml_store(aml_shiftleft(aml_local(6),         \
+                      aml_int(3)), aml_local(6)));                         \
+        aml_append(_method_, aml_create_field(aml_name("ODAT"), aml_int(0),\
+                                              aml_local(6), "OBUF"));      \
+        aml_append(_method_, aml_name_decl("ZBUF", aml_buffer(0, NULL)));  \
+        aml_append(_method_, aml_concatenate(aml_name("ZBUF"),             \
+                                          aml_name("OBUF"), aml_arg(6)));  \
+        aml_append(_method_, aml_release(aml_name("NLCK")));               \
+        aml_append(_method_, aml_return(aml_arg(6)));                      \
+    } while (0)
+
+#define BUILD_FIELD_UNIT_STRUCT(_field_, _s_, _f_, _name_)                 \
+    aml_append(_field_, aml_named_field(_name_,                            \
+               sizeof(typeof_field(_s_, _f_)) * BITS_PER_BYTE))
+
+#define BUILD_FIELD_UNIT_SIZE(_field_, _byte_, _name_)                     \
+    aml_append(_field_, aml_named_field(_name_, (_byte_) * BITS_PER_BYTE))
+
+static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
+                                 Aml *root_dev)
+{
+    for (; device_list; device_list = device_list->next) {
+        NVDIMMDevice *nvdimm = device_list->data;
+        int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
+                                           NULL);
+        uint32_t handle = nvdimm_slot_to_handle(slot);
+        Aml *dev, *method;
+
+        dev = aml_device("NV%02X", slot);
+        aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
+
+        BUILD_STA_METHOD(dev, method);
+
+        method = aml_method("_DSM", 4);
+        {
+            SAVE_ARG012_HANDLE_LOCK(method, aml_int(handle));
+            NOTIFY_AND_RETURN_UNLOCK(method);
+        }
+        aml_append(dev, method);
+
+        aml_append(root_dev, dev);
+    }
+}
+
+static void nvdimm_build_acpi_devices(NVDIMMState *state, GSList *device_list,
+                                      Aml *sb_scope)
+{
+    Aml *dev, *method, *field;
+    int fit_size = nvdimm_device_structure_size(g_slist_length(device_list));
+
+    dev = aml_device("NVDR");
+    aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0012")));
+
+    /* map DSM memory into ACPI namespace. */
+    aml_append(dev, aml_operation_region("NMIO", AML_SYSTEM_MEMORY,
+               state->base, state->page_size));
+    aml_append(dev, aml_operation_region("NRAM", AML_SYSTEM_MEMORY,
+               state->base + state->page_size, state->page_size));
+    aml_append(dev, aml_operation_region("NFIT", AML_SYSTEM_MEMORY,
+               state->base + state->page_size * 2,
+               memory_region_size(&state->mr) - state->page_size * 2));
+
+    /*
+     * DSM notifier:
+     * @NOTI: writing a value to it notifies QEMU that a _DSM method is
+     *        being called and that its parameters can be found in dsm_in.
+     *
+     * It is an MMIO mapping on the host, so a write causes a VM-exit
+     * and QEMU gains control.
+     */
+    field = aml_field("NMIO", AML_DWORD_ACC, AML_PRESERVE);
+    BUILD_FIELD_UNIT_SIZE(field, sizeof(uint32_t), "NOTI");
+    aml_append(dev, field);
+
+    /*
+     * DSM input:
+     * @HDLE: stores the device's handle; it is zero if the _DSM call
+     *        happens on the root device.
+     * @ARG0 ~ @ARG3: store the parameters of the _DSM call.
+     *
+     * They are RAM mappings on the host, so these accesses never cause
+     * a VM-exit.
+     */
+    field = aml_field("NRAM", AML_DWORD_ACC, AML_PRESERVE);
+    BUILD_FIELD_UNIT_STRUCT(field, dsm_in, handle, "HDLE");
+    BUILD_FIELD_UNIT_STRUCT(field, dsm_in, arg0, "ARG0");
+    BUILD_FIELD_UNIT_STRUCT(field, dsm_in, arg1, "ARG1");
+    BUILD_FIELD_UNIT_STRUCT(field, dsm_in, arg2, "ARG2");
+    BUILD_FIELD_UNIT_SIZE(field, state->page_size - offsetof(dsm_in, arg3),
+                          "ARG3");
+    aml_append(dev, field);
+
+    /*
+     * DSM output:
+     * @RLEN: the size of buffer filled by QEMU
+     * @ODAT: the buffer QEMU uses to store the result
+     *
+     * Since the page is reused for both input and output, the input data
+     * is lost once the new result is stored into @RLEN and @ODAT.
+     */
+    field = aml_field("NRAM", AML_DWORD_ACC, AML_PRESERVE);
+    BUILD_FIELD_UNIT_STRUCT(field, dsm_out, len, "RLEN");
+    BUILD_FIELD_UNIT_SIZE(field, state->page_size - offsetof(dsm_out, data),
+                          "ODAT");
+    aml_append(dev, field);
+
+    /* @RFIT, returned by _FIT method. */
+    field = aml_field("NFIT", AML_DWORD_ACC, AML_PRESERVE);
+    BUILD_FIELD_UNIT_SIZE(field, fit_size, "RFIT");
+    aml_append(dev, field);
+
+    aml_append(dev, aml_mutex("NLCK", 0));
+
+    BUILD_STA_METHOD(dev, method);
+
+    method = aml_method("_DSM", 4);
+    {
+        SAVE_ARG012_HANDLE_LOCK(method, aml_int(0));
+        NOTIFY_AND_RETURN_UNLOCK(method);
+    }
+    aml_append(dev, method);
+
+    method = aml_method("_FIT", 0);
+    {
+        aml_append(method, aml_return(aml_name("RFIT")));
+    }
+    aml_append(dev, method);
+
+    build_nvdimm_devices(state, device_list, dev);
+
+    aml_append(sb_scope, dev);
+}
+
+static void nvdimm_build_ssdt(NVDIMMState *state, GSList *device_list,
+                              GArray *table_offsets, GArray *table_data,
+                              GArray *linker)
+{
+    Aml *ssdt, *sb_scope;
+
+    acpi_add_table(table_offsets, table_data);
+
+    ssdt = init_aml_allocator();
+    acpi_data_push(ssdt->buf, sizeof(AcpiTableHeader));
+
+    sb_scope = aml_scope("\\_SB");
+    nvdimm_build_acpi_devices(state, device_list, sb_scope);
+
+    aml_append(ssdt, sb_scope);
+    /* copy AML table into ACPI tables blob and patch header there */
+    g_array_append_vals(table_data, ssdt->buf->data, ssdt->buf->len);
+    build_header(linker, table_data,
+        (void *)(table_data->data + table_data->len - ssdt->buf->len),
+        "SSDT", ssdt->buf->len, 1);
+    free_aml_allocator();
+}
+
 void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
                              GArray *table_data, GArray *linker)
 {
@@ -387,6 +587,9 @@ void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
 
         build_device_structure(device_list, fit);
         build_nfit(fit, device_list, table_offsets, table_data, linker);
+
+        nvdimm_build_ssdt(state, device_list, table_offsets, table_data,
+                          linker);
         g_slist_free(device_list);
     }
 }
-- 
1.8.3.1



* [Qemu-devel] [PATCH v3 25/32] nvdimm: build ACPI nvdimm devices
@ 2015-10-11  3:52   ` Xiao Guangrong
  0 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: Xiao Guangrong, ehabkost, kvm, mst, gleb, mtosatti, qemu-devel,
	stefanha, dan.j.williams, rth

NVDIMM devices is defined in ACPI 6.0 9.20 NVDIMM Devices

There is a root device under \_SB and specified NVDIMM devices are under the
root device. Each NVDIMM device has _ADR which returns its handle used to
associate MEMDEV structure in NFIT

We reserve handle 0 for root device. In this patch, we save handle, arg0,
arg1 and arg2. Arg3 is conditionally saved in later patch

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/nvdimm/acpi.c | 203 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 203 insertions(+)

diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
index 1450a6a..d9fa0fd 100644
--- a/hw/mem/nvdimm/acpi.c
+++ b/hw/mem/nvdimm/acpi.c
@@ -308,15 +308,38 @@ static void build_nfit(void *fit, GSList *device_list, GArray *table_offsets,
                  "NFIT", table_data->len - nfit_start, 1);
 }
 
+#define NOTIFY_VALUE      0x99
+
+struct dsm_in {
+    uint32_t handle;
+    uint8_t arg0[16];
+    uint32_t arg1;
+    uint32_t arg2;
+   /* the remaining size in the page is used by arg3. */
+    uint8_t arg3[0];
+} QEMU_PACKED;
+typedef struct dsm_in dsm_in;
+
+struct dsm_out {
+    /* the size of buffer filled by QEMU. */
+    uint16_t len;
+    uint8_t data[0];
+} QEMU_PACKED;
+typedef struct dsm_out dsm_out;
+
 static uint64_t dsm_read(void *opaque, hwaddr addr,
                          unsigned size)
 {
+    fprintf(stderr, "BUG: we never read DSM notification MMIO.\n");
     return 0;
 }
 
 static void dsm_write(void *opaque, hwaddr addr,
                       uint64_t val, unsigned size)
 {
+    if (val != NOTIFY_VALUE) {
+        fprintf(stderr, "BUG: unexepected notify value 0x%" PRIx64, val);
+    }
 }
 
 static const MemoryRegionOps dsm_ops = {
@@ -372,6 +395,183 @@ static MemoryRegion *build_dsm_memory(NVDIMMState *state)
     return dsm_fit_mr;
 }
 
+#define BUILD_STA_METHOD(_dev_, _method_)                                  \
+    do {                                                                   \
+        _method_ = aml_method("_STA", 0);                                  \
+        aml_append(_method_, aml_return(aml_int(0x0f)));                   \
+        aml_append(_dev_, _method_);                                       \
+    } while (0)
+
+#define SAVE_ARG012_HANDLE_LOCK(_method_, _handle_)                        \
+    do {                                                                   \
+        aml_append(_method_, aml_acquire(aml_name("NLCK"), 0xFFFF));       \
+        aml_append(_method_, aml_store(_handle_, aml_name("HDLE")));       \
+        aml_append(_method_, aml_store(aml_arg(0), aml_name("ARG0")));     \
+        aml_append(_method_, aml_store(aml_arg(1), aml_name("ARG1")));     \
+        aml_append(_method_, aml_store(aml_arg(2), aml_name("ARG2")));     \
+    } while (0)
+
+#define NOTIFY_AND_RETURN_UNLOCK(_method_)                           \
+    do {                                                                   \
+        aml_append(_method_, aml_store(aml_int(NOTIFY_VALUE),              \
+                   aml_name("NOTI")));                                     \
+        aml_append(_method_, aml_store(aml_name("RLEN"), aml_local(6)));   \
+        aml_append(_method_, aml_store(aml_shiftleft(aml_local(6),         \
+                      aml_int(3)), aml_local(6)));                         \
+        aml_append(_method_, aml_create_field(aml_name("ODAT"), aml_int(0),\
+                                              aml_local(6) , "OBUF"));     \
+        aml_append(_method_, aml_name_decl("ZBUF", aml_buffer(0, NULL)));  \
+        aml_append(_method_, aml_concatenate(aml_name("ZBUF"),             \
+                                          aml_name("OBUF"), aml_arg(6)));  \
+        aml_append(_method_, aml_release(aml_name("NLCK")));               \
+        aml_append(_method_, aml_return(aml_arg(6)));                      \
+    } while (0)
+
+#define BUILD_FIELD_UNIT_STRUCT(_field_, _s_, _f_, _name_)                 \
+    aml_append(_field_, aml_named_field(_name_,                            \
+               sizeof(typeof_field(_s_, _f_)) * BITS_PER_BYTE))
+
+#define BUILD_FIELD_UNIT_SIZE(_field_, _byte_, _name_)                     \
+    aml_append(_field_, aml_named_field(_name_, (_byte_) * BITS_PER_BYTE))
+
+static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
+                                 Aml *root_dev)
+{
+    for (; device_list; device_list = device_list->next) {
+        NVDIMMDevice *nvdimm = device_list->data;
+        int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
+                                           NULL);
+        uint32_t handle = nvdimm_slot_to_handle(slot);
+        Aml *dev, *method;
+
+        dev = aml_device("NV%02X", slot);
+        aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
+
+        BUILD_STA_METHOD(dev, method);
+
+        method = aml_method("_DSM", 4);
+        {
+            SAVE_ARG012_HANDLE_LOCK(method, aml_int(handle));
+            NOTIFY_AND_RETURN_UNLOCK(method);
+        }
+        aml_append(dev, method);
+
+        aml_append(root_dev, dev);
+    }
+}
+
+static void nvdimm_build_acpi_devices(NVDIMMState *state, GSList *device_list,
+                                      Aml *sb_scope)
+{
+    Aml *dev, *method, *field;
+    int fit_size = nvdimm_device_structure_size(g_slist_length(device_list));
+
+    dev = aml_device("NVDR");
+    aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0012")));
+
+    /* map DSM memory into ACPI namespace. */
+    aml_append(dev, aml_operation_region("NMIO", AML_SYSTEM_MEMORY,
+               state->base, state->page_size));
+    aml_append(dev, aml_operation_region("NRAM", AML_SYSTEM_MEMORY,
+               state->base + state->page_size, state->page_size));
+    aml_append(dev, aml_operation_region("NFIT", AML_SYSTEM_MEMORY,
+               state->base + state->page_size * 2,
+               memory_region_size(&state->mr) - state->page_size * 2));
+
+    /*
+     * DSM notifier:
+     * @NOTI: writing a value to it notifies QEMU that the _DSM method is
+     *        being called and that the parameters can be found in dsm_in.
+     *
+     * It is an MMIO mapping on the host, so a write causes a VM-exit and
+     * QEMU gets control.
+     */
+    field = aml_field("NMIO", AML_DWORD_ACC, AML_PRESERVE);
+    BUILD_FIELD_UNIT_SIZE(field, sizeof(uint32_t), "NOTI");
+    aml_append(dev, field);
+
+    /*
+     * DSM input:
+     * @HDLE: store device's handle, it's zero if the _DSM call happens
+     *        on ROOT.
+     * @ARG0 ~ @ARG3: store the parameters of _DSM call.
+     *
+     * They are a RAM mapping on the host, so these accesses never cause a
+     * VM-exit.
+     */
+    field = aml_field("NRAM", AML_DWORD_ACC, AML_PRESERVE);
+    BUILD_FIELD_UNIT_STRUCT(field, dsm_in, handle, "HDLE");
+    BUILD_FIELD_UNIT_STRUCT(field, dsm_in, arg0, "ARG0");
+    BUILD_FIELD_UNIT_STRUCT(field, dsm_in, arg1, "ARG1");
+    BUILD_FIELD_UNIT_STRUCT(field, dsm_in, arg2, "ARG2");
+    BUILD_FIELD_UNIT_SIZE(field, state->page_size - offsetof(dsm_in, arg3),
+                          "ARG3");
+    aml_append(dev, field);
+
+    /*
+     * DSM output:
+     * @RLEN: the size of buffer filled by QEMU
+     * @ODAT: the buffer QEMU uses to store the result
+     *
+     * Since the page is reused by both input and output, the input data
+     * will be lost after storing the new result into @RLEN and @ODAT.
+     */
+    field = aml_field("NRAM", AML_DWORD_ACC, AML_PRESERVE);
+    BUILD_FIELD_UNIT_STRUCT(field, dsm_out, len, "RLEN");
+    BUILD_FIELD_UNIT_SIZE(field, state->page_size - offsetof(dsm_out, data),
+                          "ODAT");
+    aml_append(dev, field);
+
+    /* @RFIT, returned by _FIT method. */
+    field = aml_field("NFIT", AML_DWORD_ACC, AML_PRESERVE);
+    BUILD_FIELD_UNIT_SIZE(field, fit_size, "RFIT");
+    aml_append(dev, field);
+
+    aml_append(dev, aml_mutex("NLCK", 0));
+
+    BUILD_STA_METHOD(dev, method);
+
+    method = aml_method("_DSM", 4);
+    {
+        SAVE_ARG012_HANDLE_LOCK(method, aml_int(0));
+        NOTIFY_AND_RETURN_UNLOCK(method);
+    }
+    aml_append(dev, method);
+
+    method = aml_method("_FIT", 0);
+    {
+        aml_append(method, aml_return(aml_name("RFIT")));
+    }
+    aml_append(dev, method);
+
+    build_nvdimm_devices(state, device_list, dev);
+
+    aml_append(sb_scope, dev);
+}
+
+static void nvdimm_build_ssdt(NVDIMMState *state, GSList *device_list,
+                              GArray *table_offsets, GArray *table_data,
+                              GArray *linker)
+{
+    Aml *ssdt, *sb_scope;
+
+    acpi_add_table(table_offsets, table_data);
+
+    ssdt = init_aml_allocator();
+    acpi_data_push(ssdt->buf, sizeof(AcpiTableHeader));
+
+    sb_scope = aml_scope("\\_SB");
+    nvdimm_build_acpi_devices(state, device_list, sb_scope);
+
+    aml_append(ssdt, sb_scope);
+    /* copy AML table into ACPI tables blob and patch header there */
+    g_array_append_vals(table_data, ssdt->buf->data, ssdt->buf->len);
+    build_header(linker, table_data,
+        (void *)(table_data->data + table_data->len - ssdt->buf->len),
+        "SSDT", ssdt->buf->len, 1);
+    free_aml_allocator();
+}
+
 void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
                              GArray *table_data, GArray *linker)
 {
@@ -387,6 +587,9 @@ void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
 
         build_device_structure(device_list, fit);
         build_nfit(fit, device_list, table_offsets, table_data, linker);
+
+        nvdimm_build_ssdt(state, device_list, table_offsets, table_data,
+                          linker);
         g_slist_free(device_list);
     }
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH v3 26/32] nvdimm: save arg3 for NVDIMM device _DSM method
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Check if the input Arg3 is valid, then store it into dsm_in if needed.

We only do this save on the NVDIMM device, since we are not going to support
any function that takes Arg3 on the root device.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/nvdimm/acpi.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
index d9fa0fd..3b9399c 100644
--- a/hw/mem/nvdimm/acpi.c
+++ b/hw/mem/nvdimm/acpi.c
@@ -442,7 +442,7 @@ static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
         int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
                                            NULL);
         uint32_t handle = nvdimm_slot_to_handle(slot);
-        Aml *dev, *method;
+        Aml *dev, *method, *ifctx;
 
         dev = aml_device("NV%02X", slot);
         aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
@@ -452,6 +452,24 @@ static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
         method = aml_method("_DSM", 4);
         {
             SAVE_ARG012_HANDLE_LOCK(method, aml_int(handle));
+
+            /* Arg3 is passed as Package and it has one element? */
+            ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
+                                             aml_int(4)),
+                                   aml_equal(aml_sizeof(aml_arg(3)),
+                                             aml_int(1))));
+            {
+                /* Local0 = Index(Arg3, 0) */
+                aml_append(ifctx, aml_store(aml_index(aml_arg(3), aml_int(0)),
+                                            aml_local(0)));
+                /* Local3 = DeRefOf(Local0) */
+                aml_append(ifctx, aml_store(aml_derefof(aml_local(0)),
+                                            aml_local(3)));
+                /* ARG3 = Local3 */
+                aml_append(ifctx, aml_store(aml_local(3), aml_name("ARG3")));
+            }
+            aml_append(method, ifctx);
+
             NOTIFY_AND_RETURN_UNLOCK(method);
         }
         aml_append(dev, method);
@@ -534,6 +552,7 @@ static void nvdimm_build_acpi_devices(NVDIMMState *state, GSList *device_list,
     method = aml_method("_DSM", 4);
     {
         SAVE_ARG012_HANDLE_LOCK(method, aml_int(0));
+        /* no command we support on ROOT device has Arg3. */
         NOTIFY_AND_RETURN_UNLOCK(method);
     }
     aml_append(dev, method);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH v3 27/32] nvdimm: support DSM_CMD_IMPLEMENTED function
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:52   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:52 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

_DSM is defined in ACPI 6.0, section 9.14.1 _DSM (Device Specific Method).

Function 0 is a query function. We do not support any function on the root
device, and only 3 functions are supported for the NVDIMM device:
DSM_CMD_NAMESPACE_LABEL_SIZE, DSM_CMD_GET_NAMESPACE_LABEL_DATA and
DSM_CMD_SET_NAMESPACE_LABEL_DATA. That means we currently only allow access
to the device's namespace label.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/nvdimm/acpi.c | 178 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 177 insertions(+), 1 deletion(-)

diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
index 3b9399c..cb6a428 100644
--- a/hw/mem/nvdimm/acpi.c
+++ b/hw/mem/nvdimm/acpi.c
@@ -39,6 +39,22 @@ static void nfit_spa_uuid_pm(uuid_le *uuid)
     memcpy(uuid, &uuid_pm, sizeof(uuid_pm));
 }
 
+static bool dsm_is_root_uuid(uint8_t *uuid)
+{
+    uuid_le uuid_root = UUID_LE(0x2f10e7a4, 0x9e91, 0x11e4, 0x89,
+                                0xd3, 0x12, 0x3b, 0x93, 0xf7, 0x5c, 0xba);
+
+    return !memcmp(uuid, &uuid_root, sizeof(uuid_root));
+}
+
+static bool dsm_is_dimm_uuid(uint8_t *uuid)
+{
+    uuid_le uuid_dimm = UUID_LE(0x4309ac30, 0x0d11, 0x11e4, 0x91,
+                                0x91, 0x08, 0x00, 0x20, 0x0c, 0x9a, 0x66);
+
+    return !memcmp(uuid, &uuid_dimm, sizeof(uuid_dimm));
+}
+
 enum {
     NFIT_STRUCTURE_SPA = 0,
     NFIT_STRUCTURE_MEMDEV = 1,
@@ -195,6 +211,22 @@ static uint32_t nvdimm_slot_to_dcr_index(int slot)
     return nvdimm_slot_to_spa_index(slot) + 1;
 }
 
+static NVDIMMDevice
+*get_nvdimm_device_by_handle(GSList *list, uint32_t handle)
+{
+    for (; list; list = list->next) {
+        NVDIMMDevice *nvdimm = list->data;
+        int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
+                                           NULL);
+
+        if (nvdimm_slot_to_handle(slot) == handle) {
+            return nvdimm;
+        }
+    }
+
+    return NULL;
+}
+
 static int build_structure_spa(void *buf, NVDIMMDevice *nvdimm)
 {
     nfit_spa *nfit_spa;
@@ -310,6 +342,43 @@ static void build_nfit(void *fit, GSList *device_list, GArray *table_offsets,
 
 #define NOTIFY_VALUE      0x99
 
+enum {
+    DSM_CMD_IMPLEMENTED = 0,
+
+    /* root device commands */
+    DSM_CMD_ARS_CAP = 1,
+    DSM_CMD_ARS_START = 2,
+    DSM_CMD_ARS_QUERY = 3,
+
+    /* per-nvdimm device commands */
+    DSM_CMD_SMART = 1,
+    DSM_CMD_SMART_THRESHOLD = 2,
+    DSM_CMD_BLOCK_NVDIMM_FLAGS = 3,
+    DSM_CMD_NAMESPACE_LABEL_SIZE = 4,
+    DSM_CMD_GET_NAMESPACE_LABEL_DATA = 5,
+    DSM_CMD_SET_NAMESPACE_LABEL_DATA = 6,
+    DSM_CMD_VENDOR_EFFECT_LOG_SIZE = 7,
+    DSM_CMD_GET_VENDOR_EFFECT_LOG = 8,
+    DSM_CMD_VENDOR_SPECIFIC = 9,
+};
+
+enum {
+    DSM_STATUS_SUCCESS = 0,
+    DSM_STATUS_NOT_SUPPORTED = 1,
+    DSM_STATUS_NON_EXISTING_MEM_DEV = 2,
+    DSM_STATUS_INVALID_PARAS = 3,
+    DSM_STATUS_VENDOR_SPECIFIC_ERROR = 4,
+};
+
+#define DSM_REVISION        (1)
+
+/* do not support any command except DSM_CMD_IMPLEMENTED on root. */
+#define ROOT_SUPPORT_CMD    (1 << DSM_CMD_IMPLEMENTED)
+#define DIMM_SUPPORT_CMD    ((1 << DSM_CMD_IMPLEMENTED)               \
+                           | (1 << DSM_CMD_NAMESPACE_LABEL_SIZE)      \
+                           | (1 << DSM_CMD_GET_NAMESPACE_LABEL_DATA)  \
+                           | (1 << DSM_CMD_SET_NAMESPACE_LABEL_DATA))
+
 struct dsm_in {
     uint32_t handle;
     uint8_t arg0[16];
@@ -320,10 +389,19 @@ struct dsm_in {
 } QEMU_PACKED;
 typedef struct dsm_in dsm_in;
 
+struct cmd_out_implemented {
+    uint64_t cmd_list;
+};
+typedef struct cmd_out_implemented cmd_out_implemented;
+
 struct dsm_out {
     /* the size of buffer filled by QEMU. */
     uint16_t len;
-    uint8_t data[0];
+    union {
+        uint8_t data[0];
+        uint32_t status;
+        cmd_out_implemented cmd_implemented;
+    };
 } QEMU_PACKED;
 typedef struct dsm_out dsm_out;
 
@@ -334,12 +412,110 @@ static uint64_t dsm_read(void *opaque, hwaddr addr,
     return 0;
 }
 
+static void dsm_write_root(uint32_t function, dsm_in *in, dsm_out *out)
+{
+    if (function == DSM_CMD_IMPLEMENTED) {
+        out->len = sizeof(out->cmd_implemented);
+        out->cmd_implemented.cmd_list = cpu_to_le64(ROOT_SUPPORT_CMD);
+        return;
+    }
+
+    out->len = sizeof(out->status);
+    out->status = cpu_to_le32(DSM_STATUS_NOT_SUPPORTED);
+    nvdebug("Return status %#x.\n", out->status);
+}
+
+static void dsm_write_nvdimm(uint32_t handle, uint32_t function, dsm_in *in,
+                             dsm_out *out)
+{
+    GSList *list = nvdimm_get_built_list();
+    NVDIMMDevice *nvdimm = get_nvdimm_device_by_handle(list, handle);
+    uint32_t status = DSM_STATUS_NON_EXISTING_MEM_DEV;
+    uint64_t cmd_list;
+
+    if (!nvdimm) {
+        out->len = sizeof(out->status);
+        goto set_status_free;
+    }
+
+    switch (function) {
+    case DSM_CMD_IMPLEMENTED:
+        cmd_list = DIMM_SUPPORT_CMD;
+        out->len = sizeof(out->cmd_implemented);
+        out->cmd_implemented.cmd_list = cpu_to_le64(cmd_list);
+        goto free;
+    default:
+        out->len = sizeof(out->status);
+        status = DSM_STATUS_NOT_SUPPORTED;
+    };
+
+    nvdebug("Return status %#x.\n", status);
+
+set_status_free:
+    out->status = cpu_to_le32(status);
+free:
+    g_slist_free(list);
+}
+
 static void dsm_write(void *opaque, hwaddr addr,
                       uint64_t val, unsigned size)
 {
+    NVDIMMState *state = opaque;
+    MemoryRegion *dsm_ram_mr;
+    dsm_in *in;
+    dsm_out *out;
+    uint32_t revision, function, handle;
+
     if (val != NOTIFY_VALUE) {
         fprintf(stderr, "BUG: unexpected notify value 0x%" PRIx64, val);
     }
+
+    dsm_ram_mr = memory_region_find(&state->mr, state->page_size,
+                                    state->page_size).mr;
+    memory_region_unref(dsm_ram_mr);
+    in = memory_region_get_ram_ptr(dsm_ram_mr);
+    out = (dsm_out *)in;
+
+    revision = in->arg1;
+    function = in->arg2;
+    handle = in->handle;
+    le32_to_cpus(&revision);
+    le32_to_cpus(&function);
+    le32_to_cpus(&handle);
+
+    nvdebug("UUID " UUID_FMT ".\n", in->arg0[0], in->arg0[1], in->arg0[2],
+            in->arg0[3], in->arg0[4], in->arg0[5], in->arg0[6],
+            in->arg0[7], in->arg0[8], in->arg0[9], in->arg0[10],
+            in->arg0[11], in->arg0[12], in->arg0[13], in->arg0[14],
+            in->arg0[15]);
+    nvdebug("Revision %#x Function %#x Handle %#x.\n", revision, function,
+            handle);
+
+    if (revision != DSM_REVISION) {
+        nvdebug("Revision %#x is not supported, expect %#x.\n",
+                revision, DSM_REVISION);
+        goto exit;
+    }
+
+    if (!handle) {
+        if (!dsm_is_root_uuid(in->arg0)) {
+            nvdebug("Root UUID does not match.\n");
+            goto exit;
+        }
+
+        return dsm_write_root(function, in, out);
+    }
+
+    if (!dsm_is_dimm_uuid(in->arg0)) {
+        nvdebug("DIMM UUID does not match.\n");
+        goto exit;
+    }
+
+    return dsm_write_nvdimm(handle, function, in, out);
+
+exit:
+    out->len = sizeof(out->status);
+    out->status = cpu_to_le32(DSM_STATUS_NOT_SUPPORTED);
 }
 
 static const MemoryRegionOps dsm_ops = {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH v3 28/32] nvdimm: support DSM_CMD_NAMESPACE_LABEL_SIZE function
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:53   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:53 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Function 4 is used to get the namespace label size.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/nvdimm/acpi.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 86 insertions(+), 4 deletions(-)

diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
index cb6a428..b420e2f 100644
--- a/hw/mem/nvdimm/acpi.c
+++ b/hw/mem/nvdimm/acpi.c
@@ -379,13 +379,29 @@ enum {
                            | (1 << DSM_CMD_GET_NAMESPACE_LABEL_DATA)  \
                            | (1 << DSM_CMD_SET_NAMESPACE_LABEL_DATA))
 
+struct cmd_in_get_label_data {
+    uint32_t offset;
+    uint32_t length;
+} QEMU_PACKED;
+typedef struct cmd_in_get_label_data cmd_in_get_label_data;
+
+struct cmd_in_set_label_data {
+    uint32_t offset;
+    uint32_t length;
+    uint8_t in_buf[0];
+} QEMU_PACKED;
+typedef struct cmd_in_set_label_data cmd_in_set_label_data;
+
 struct dsm_in {
     uint32_t handle;
     uint8_t arg0[16];
     uint32_t arg1;
     uint32_t arg2;
    /* the remaining size in the page is used by arg3. */
-    uint8_t arg3[0];
+    union {
+        uint8_t arg3[0];
+        cmd_in_set_label_data cmd_set_label_data;
+    };
 } QEMU_PACKED;
 typedef struct dsm_in dsm_in;
 
@@ -394,6 +410,19 @@ struct cmd_out_implemented {
 };
 typedef struct cmd_out_implemented cmd_out_implemented;
 
+struct cmd_out_label_size {
+    uint32_t status;
+    uint32_t label_size;
+    uint32_t max_xfer;
+} QEMU_PACKED;
+typedef struct cmd_out_label_size cmd_out_label_size;
+
+struct cmd_out_get_label_data {
+    uint32_t status;
+    uint8_t out_buf[0];
+} QEMU_PACKED;
+typedef struct cmd_out_get_label_data cmd_out_get_label_data;
+
 struct dsm_out {
     /* the size of buffer filled by QEMU. */
     uint16_t len;
@@ -401,6 +430,8 @@ struct dsm_out {
         uint8_t data[0];
         uint32_t status;
         cmd_out_implemented cmd_implemented;
+        cmd_out_label_size cmd_label_size;
+        cmd_out_get_label_data cmd_get_label_data;
     };
 } QEMU_PACKED;
 typedef struct dsm_out dsm_out;
@@ -425,8 +456,56 @@ static void dsm_write_root(uint32_t function, dsm_in *in, dsm_out *out)
     nvdebug("Return status %#x.\n", out->status);
 }
 
-static void dsm_write_nvdimm(uint32_t handle, uint32_t function, dsm_in *in,
-                             dsm_out *out)
+/*
+ * the max transfer size is the max size transfered by both a
+ * DSM_CMD_GET_NAMESPACE_LABEL_DATA and a DSM_CMD_SET_NAMESPACE_LABEL_DATA
+ * command.
+ */
+static uint32_t max_xfer_label_size(MemoryRegion *dsm_ram_mr)
+{
+    dsm_in *in;
+    dsm_out *out;
+    uint32_t mr_size, max_get_size, max_set_size;
+
+    mr_size = memory_region_size(dsm_ram_mr);
+
+    /*
+     * the max data ACPI can read at one time, which is transferred by
+     * the response of DSM_CMD_GET_NAMESPACE_LABEL_DATA.
+     */
+    max_get_size = mr_size - offsetof(dsm_out, data) -
+                   sizeof(out->cmd_get_label_data);
+
+    /*
+     * the max data ACPI can write at one time, which is transferred by
+     * DSM_CMD_SET_NAMESPACE_LABEL_DATA
+     */
+    max_set_size = mr_size - offsetof(dsm_in, arg3) -
+                   sizeof(in->cmd_set_label_data);
+
+    return MIN(max_get_size, max_set_size);
+}
+
+static uint32_t
+dsm_cmd_label_size(MemoryRegion *dsm_ram_mr, NVDIMMDevice *nvdimm,
+                    dsm_out *out)
+{
+    uint32_t label_size, mxfer;
+
+    label_size = nvdimm->label_size;
+    mxfer = max_xfer_label_size(dsm_ram_mr);
+
+    out->cmd_label_size.label_size = cpu_to_le32(label_size);
+    out->cmd_label_size.max_xfer = cpu_to_le32(mxfer);
+    out->len = sizeof(out->cmd_label_size);
+
+    nvdebug("%s label_size %#x, max_xfer %#x.\n", __func__, label_size, mxfer);
+
+    return DSM_STATUS_SUCCESS;
+}
+
+static void dsm_write_nvdimm(MemoryRegion *dsm_ram_mr, uint32_t handle,
+                             uint32_t function, dsm_in *in, dsm_out *out)
 {
     GSList *list = nvdimm_get_built_list();
     NVDIMMDevice *nvdimm = get_nvdimm_device_by_handle(list, handle);
@@ -444,6 +523,9 @@ static void dsm_write_nvdimm(uint32_t handle, uint32_t function, dsm_in *in,
         out->len = sizeof(out->cmd_implemented);
         out->cmd_implemented.cmd_list = cpu_to_le64(cmd_list);
         goto free;
+    case DSM_CMD_NAMESPACE_LABEL_SIZE:
+        status = dsm_cmd_label_size(dsm_ram_mr, nvdimm, out);
+        break;
     default:
         out->len = sizeof(out->status);
         status = DSM_STATUS_NOT_SUPPORTED;
@@ -511,7 +593,7 @@ static void dsm_write(void *opaque, hwaddr addr,
         goto exit;
     }
 
-    return dsm_write_nvdimm(handle, function, in, out);
+    return dsm_write_nvdimm(dsm_ram_mr, handle, function, in, out);
 
 exit:
     out->len = sizeof(out->status);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [Qemu-devel] [PATCH v3 28/32] nvdimm: support DSM_CMD_NAMESPACE_LABEL_SIZE function
@ 2015-10-11  3:53   ` Xiao Guangrong
  0 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:53 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: Xiao Guangrong, ehabkost, kvm, mst, gleb, mtosatti, qemu-devel,
	stefanha, dan.j.williams, rth

Function 4 is used to get Namespace label size

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/nvdimm/acpi.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 86 insertions(+), 4 deletions(-)

diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
index cb6a428..b420e2f 100644
--- a/hw/mem/nvdimm/acpi.c
+++ b/hw/mem/nvdimm/acpi.c
@@ -379,13 +379,29 @@ enum {
                            | (1 << DSM_CMD_GET_NAMESPACE_LABEL_DATA)  \
                            | (1 << DSM_CMD_SET_NAMESPACE_LABEL_DATA))
 
+struct cmd_in_get_label_data {
+    uint32_t offset;
+    uint32_t length;
+} QEMU_PACKED;
+typedef struct cmd_in_get_label_data cmd_in_get_label_data;
+
+struct cmd_in_set_label_data {
+    uint32_t offset;
+    uint32_t length;
+    uint8_t in_buf[0];
+} QEMU_PACKED;
+typedef struct cmd_in_set_label_data cmd_in_set_label_data;
+
 struct dsm_in {
     uint32_t handle;
     uint8_t arg0[16];
     uint32_t arg1;
     uint32_t arg2;
     /* the remaining size in the page is used by arg3. */
-    uint8_t arg3[0];
+    union {
+        uint8_t arg3[0];
+        cmd_in_set_label_data cmd_set_label_data;
+    };
 } QEMU_PACKED;
 typedef struct dsm_in dsm_in;
 
@@ -394,6 +410,19 @@ struct cmd_out_implemented {
 };
 typedef struct cmd_out_implemented cmd_out_implemented;
 
+struct cmd_out_label_size {
+    uint32_t status;
+    uint32_t label_size;
+    uint32_t max_xfer;
+} QEMU_PACKED;
+typedef struct cmd_out_label_size cmd_out_label_size;
+
+struct cmd_out_get_label_data {
+    uint32_t status;
+    uint8_t out_buf[0];
+} QEMU_PACKED;
+typedef struct cmd_out_get_label_data cmd_out_get_label_data;
+
 struct dsm_out {
     /* the size of buffer filled by QEMU. */
     uint16_t len;
@@ -401,6 +430,8 @@ struct dsm_out {
         uint8_t data[0];
         uint32_t status;
         cmd_out_implemented cmd_implemented;
+        cmd_out_label_size cmd_label_size;
+        cmd_out_get_label_data cmd_get_label_data;
     };
 } QEMU_PACKED;
 typedef struct dsm_out dsm_out;
@@ -425,8 +456,56 @@ static void dsm_write_root(uint32_t function, dsm_in *in, dsm_out *out)
     nvdebug("Return status %#x.\n", out->status);
 }
 
-static void dsm_write_nvdimm(uint32_t handle, uint32_t function, dsm_in *in,
-                             dsm_out *out)
+/*
+ * the max transfer size is the max size transferred by both a
+ * DSM_CMD_GET_NAMESPACE_LABEL_DATA and a DSM_CMD_SET_NAMESPACE_LABEL_DATA
+ * command.
+ */
+static uint32_t max_xfer_label_size(MemoryRegion *dsm_ram_mr)
+{
+    dsm_in *in;
+    dsm_out *out;
+    uint32_t mr_size, max_get_size, max_set_size;
+
+    mr_size = memory_region_size(dsm_ram_mr);
+
+    /*
+     * the max data ACPI can read at one time, which is transferred in
+     * the response of DSM_CMD_GET_NAMESPACE_LABEL_DATA.
+     */
+    max_get_size = mr_size - offsetof(dsm_out, data) -
+                   sizeof(out->cmd_get_label_data);
+
+    /*
+     * the max data ACPI can write at one time, which is transferred by
+     * DSM_CMD_SET_NAMESPACE_LABEL_DATA.
+     */
+    max_set_size = mr_size - offsetof(dsm_in, arg3) -
+                   sizeof(in->cmd_set_label_data);
+
+    return MIN(max_get_size, max_set_size);
+}
+
+static uint32_t
+dsm_cmd_label_size(MemoryRegion *dsm_ram_mr, NVDIMMDevice *nvdimm,
+                    dsm_out *out)
+{
+    uint32_t label_size, mxfer;
+
+    label_size = nvdimm->label_size;
+    mxfer = max_xfer_label_size(dsm_ram_mr);
+
+    out->cmd_label_size.label_size = cpu_to_le32(label_size);
+    out->cmd_label_size.max_xfer = cpu_to_le32(mxfer);
+    out->len = sizeof(out->cmd_label_size);
+
+    nvdebug("%s label_size %#x, max_xfer %#x.\n", __func__, label_size, mxfer);
+
+    return DSM_STATUS_SUCCESS;
+}
+
+static void dsm_write_nvdimm(MemoryRegion *dsm_ram_mr, uint32_t handle,
+                             uint32_t function, dsm_in *in, dsm_out *out)
 {
     GSList *list = nvdimm_get_built_list();
     NVDIMMDevice *nvdimm = get_nvdimm_device_by_handle(list, handle);
@@ -444,6 +523,9 @@ static void dsm_write_nvdimm(uint32_t handle, uint32_t function, dsm_in *in,
         out->len = sizeof(out->cmd_implemented);
         out->cmd_implemented.cmd_list = cpu_to_le64(cmd_list);
         goto free;
+    case DSM_CMD_NAMESPACE_LABEL_SIZE:
+        status = dsm_cmd_label_size(dsm_ram_mr, nvdimm, out);
+        break;
     default:
         out->len = sizeof(out->status);
         status = DSM_STATUS_NOT_SUPPORTED;
@@ -511,7 +593,7 @@ static void dsm_write(void *opaque, hwaddr addr,
         goto exit;
     }
 
-    return dsm_write_nvdimm(handle, function, in, out);
+    return dsm_write_nvdimm(dsm_ram_mr, handle, function, in, out);
 
 exit:
     out->len = sizeof(out->status);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH v3 29/32] nvdimm: support DSM_CMD_GET_NAMESPACE_LABEL_DATA
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:53   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:53 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Function 5 is used to get the Namespace Label Data

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/nvdimm/acpi.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
index b420e2f..e0a37cb 100644
--- a/hw/mem/nvdimm/acpi.c
+++ b/hw/mem/nvdimm/acpi.c
@@ -401,6 +401,7 @@ struct dsm_in {
     union {
         uint8_t arg3[0];
         cmd_in_set_label_data cmd_set_label_data;
+        cmd_in_get_label_data cmd_get_label_data;
     };
 } QEMU_PACKED;
 typedef struct dsm_in dsm_in;
@@ -504,6 +505,35 @@ dsm_cmd_label_size(MemoryRegion *dsm_ram_mr, NVDIMMDevice *nvdimm,
     return DSM_STATUS_SUCCESS;
 }
 
+static uint32_t dsm_cmd_get_label_data(NVDIMMDevice *nvdimm, dsm_in *in,
+                                       dsm_out *out)
+{
+    cmd_in_get_label_data *cmd_in = &in->cmd_get_label_data;
+    uint32_t length, offset, status;
+
+    length = cmd_in->length;
+    offset = cmd_in->offset;
+    le32_to_cpus(&length);
+    le32_to_cpus(&offset);
+
+    nvdebug("Read Label Data: offset %#x length %#x.\n", offset, length);
+
+    if (nvdimm->label_size < length + offset) {
+        nvdebug("position %#x is beyond label data (len = %#lx).\n",
+                length + offset, nvdimm->label_size);
+        out->len = sizeof(out->status);
+        status = DSM_STATUS_INVALID_PARAS;
+        goto exit;
+    }
+
+    status = DSM_STATUS_SUCCESS;
+    memcpy(out->cmd_get_label_data.out_buf, nvdimm->label_data +
+           offset, length);
+    out->len = sizeof(out->cmd_get_label_data) + length;
+exit:
+    return status;
+}
+
 static void dsm_write_nvdimm(MemoryRegion *dsm_ram_mr, uint32_t handle,
                              uint32_t function, dsm_in *in, dsm_out *out)
 {
@@ -526,6 +556,9 @@ static void dsm_write_nvdimm(MemoryRegion *dsm_ram_mr, uint32_t handle,
     case DSM_CMD_NAMESPACE_LABEL_SIZE:
         status = dsm_cmd_label_size(dsm_ram_mr, nvdimm, out);
         break;
+    case DSM_CMD_GET_NAMESPACE_LABEL_DATA:
+        status = dsm_cmd_get_label_data(nvdimm, in, out);
+        break;
     default:
         out->len = sizeof(out->status);
         status = DSM_STATUS_NOT_SUPPORTED;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread


* [PATCH v3 30/32] nvdimm: support DSM_CMD_SET_NAMESPACE_LABEL_DATA
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:53   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:53 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Function 6 is used to set the Namespace Label Data

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/nvdimm/acpi.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
index e0a37cb..6f05b37 100644
--- a/hw/mem/nvdimm/acpi.c
+++ b/hw/mem/nvdimm/acpi.c
@@ -424,6 +424,11 @@ struct cmd_out_get_label_data {
 } QEMU_PACKED;
 typedef struct cmd_out_get_label_data cmd_out_get_label_data;
 
+struct cmd_out_set_label_data {
+    uint32_t status;
+};
+typedef struct cmd_out_set_label_data cmd_out_set_label_data;
+
 struct dsm_out {
     /* the size of buffer filled by QEMU. */
     uint16_t len;
@@ -433,6 +438,7 @@ struct dsm_out {
         cmd_out_implemented cmd_implemented;
         cmd_out_label_size cmd_label_size;
         cmd_out_get_label_data cmd_get_label_data;
+        cmd_out_set_label_data cmd_set_label_data;
     };
 } QEMU_PACKED;
 typedef struct dsm_out dsm_out;
@@ -534,6 +540,33 @@ exit:
     return status;
 }
 
+static uint32_t
+dsm_cmd_set_label_data(NVDIMMDevice *nvdimm, dsm_in *in, dsm_out *out)
+{
+    cmd_in_set_label_data *cmd_in = &in->cmd_set_label_data;
+    uint32_t length, offset, status;
+
+    length = cmd_in->length;
+    offset = cmd_in->offset;
+    le32_to_cpus(&length);
+    le32_to_cpus(&offset);
+
+    nvdebug("Write Label Data: offset %#x length %#x.\n", offset, length);
+    if (nvdimm->label_size < length + offset) {
+        nvdebug("position %#x is beyond config data (len = %#lx).\n",
+                length + offset, nvdimm->label_size);
+        out->len = sizeof(out->status);
+        status = DSM_STATUS_INVALID_PARAS;
+        goto exit;
+    }
+
+    status = DSM_STATUS_SUCCESS;
+    memcpy(nvdimm->label_data + offset, cmd_in->in_buf, length);
+    out->len = sizeof(status);
+exit:
+    return status;
+}
+
 static void dsm_write_nvdimm(MemoryRegion *dsm_ram_mr, uint32_t handle,
                              uint32_t function, dsm_in *in, dsm_out *out)
 {
@@ -559,6 +592,9 @@ static void dsm_write_nvdimm(MemoryRegion *dsm_ram_mr, uint32_t handle,
     case DSM_CMD_GET_NAMESPACE_LABEL_DATA:
         status = dsm_cmd_get_label_data(nvdimm, in, out);
         break;
+    case DSM_CMD_SET_NAMESPACE_LABEL_DATA:
+        status = dsm_cmd_set_label_data(nvdimm, in, out);
+        break;
     default:
         out->len = sizeof(out->status);
         status = DSM_STATUS_NOT_SUPPORTED;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread


* [PATCH v3 31/32] nvdimm: allow using whole backend memory as pmem
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:53   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:53 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Introduce a parameter named "reserve-label-data". When it is not set,
QEMU does not reserve any region on the backend memory for label data;
instead, it builds read-only label data in memory containing an active
namespace that spans the whole backend memory.

This is useful for users who want to pass a whole NVDIMM device through
and make its data completely visible to the guest.

The parameter defaults to false.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/mem/Makefile.objs      |   3 +-
 hw/mem/nvdimm/acpi.c      |  20 +++
 hw/mem/nvdimm/internal.h  |   3 +
 hw/mem/nvdimm/namespace.c | 309 ++++++++++++++++++++++++++++++++++++++++++++++
 hw/mem/nvdimm/nvdimm.c    |  36 +++++-
 include/hw/mem/nvdimm.h   |   4 +
 6 files changed, 369 insertions(+), 6 deletions(-)
 create mode 100644 hw/mem/nvdimm/namespace.c

diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs
index 7310bac..fc76ca5 100644
--- a/hw/mem/Makefile.objs
+++ b/hw/mem/Makefile.objs
@@ -1,3 +1,4 @@
 common-obj-$(CONFIG_DIMM) += dimm.o
 common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o
-common-obj-$(CONFIG_NVDIMM) += nvdimm/nvdimm.o nvdimm/acpi.o
+common-obj-$(CONFIG_NVDIMM) += nvdimm/nvdimm.o nvdimm/acpi.o   \
+                               nvdimm/namespace.o
diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
index 6f05b37..e6694bc 100644
--- a/hw/mem/nvdimm/acpi.c
+++ b/hw/mem/nvdimm/acpi.c
@@ -305,6 +305,8 @@ static void build_device_structure(GSList *device_list, char *buf)
 {
     for (; device_list; device_list = device_list->next) {
         NVDIMMDevice *nvdimm = device_list->data;
+        nfit_memdev *memdev;
+        nfit_dcr *dcr;
 
         /* build System Physical Address Range Description Table. */
         buf += build_structure_spa(buf, nvdimm);
@@ -313,10 +315,15 @@ static void build_device_structure(GSList *device_list, char *buf)
          * build Memory Device to System Physical Address Range Mapping
          * Table.
          */
+        memdev = (nfit_memdev *)buf;
         buf += build_structure_memdev(buf, nvdimm);
 
         /* build Control Region Descriptor Table. */
+        dcr = (struct nfit_dcr *)buf;
         buf += build_structure_dcr(buf, nvdimm);
+
+        calculate_nvdimm_isetcookie(nvdimm, memdev->region_offset,
+                                    dcr->serial_number);
     }
 }
 
@@ -560,6 +567,12 @@ dsm_cmd_set_label_data(NVDIMMDevice *nvdimm, dsm_in *in, dsm_out *out)
         goto exit;
     }
 
+    if (!nvdimm->reserve_label_data) {
+        out->len = sizeof(out->status);
+        status = DSM_STATUS_NOT_SUPPORTED;
+        goto exit;
+    }
+
     status = DSM_STATUS_SUCCESS;
     memcpy(nvdimm->label_data + offset, cmd_in->in_buf, length);
     out->len = sizeof(status);
@@ -583,6 +596,10 @@ static void dsm_write_nvdimm(MemoryRegion *dsm_ram_mr, uint32_t handle,
     switch (function) {
     case DSM_CMD_IMPLEMENTED:
         cmd_list = DIMM_SUPPORT_CMD;
+        if (!nvdimm->reserve_label_data) {
+            cmd_list &= ~(1 << DSM_CMD_SET_NAMESPACE_LABEL_DATA);
+        }
+
         out->len = sizeof(out->cmd_implemented);
         out->cmd_implemented.cmd_list = cpu_to_le64(cmd_list);
         goto free;
@@ -936,6 +953,9 @@ void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
 
         nvdimm_build_ssdt(state, device_list, table_offsets, table_data,
                           linker);
+
+        build_nvdimm_label_data(device_list);
+
         g_slist_free(device_list);
     }
 }
diff --git a/hw/mem/nvdimm/internal.h b/hw/mem/nvdimm/internal.h
index 1e95363..f523175 100644
--- a/hw/mem/nvdimm/internal.h
+++ b/hw/mem/nvdimm/internal.h
@@ -35,4 +35,7 @@ typedef struct uuid_le uuid_le;
     (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) } })
 
 GSList *nvdimm_get_built_list(void);
+void calculate_nvdimm_isetcookie(NVDIMMDevice *nvdimm, uint64_t spa_offset,
+                                 uint32_t sn);
+void build_nvdimm_label_data(GSList *device_list);
 #endif
diff --git a/hw/mem/nvdimm/namespace.c b/hw/mem/nvdimm/namespace.c
new file mode 100644
index 0000000..fe58f9a
--- /dev/null
+++ b/hw/mem/nvdimm/namespace.c
@@ -0,0 +1,309 @@
+/*
+ * NVDIMM  Namespace Support
+ *
+ * Copyright(C) 2015 Intel Corporation.
+ *
+ * Author:
+ *  Xiao Guangrong <guangrong.xiao@linux.intel.com>
+ *
+ * NVDIMM namespace specification can be found at:
+ *      http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "hw/mem/nvdimm.h"
+#include "internal.h"
+
+static uint64_t fletcher64(void *addr, size_t len)
+{
+    uint32_t *buf = addr;
+    uint32_t lo32 = 0;
+    uint64_t hi32 = 0;
+    int i;
+
+    for (i = 0; i < len / sizeof(uint32_t); i++) {
+        lo32 += cpu_to_le32(buf[i]);
+        hi32 += lo32;
+    }
+
+    return hi32 << 32 | lo32;
+}
+
+struct interleave_set_info {
+    struct interleave_set_info_map {
+        uint64_t region_spa_offset;
+        uint32_t serial_number;
+        uint32_t zero;
+    } mapping[1];
+};
+typedef struct interleave_set_info interleave_set_info;
+
+void calculate_nvdimm_isetcookie(NVDIMMDevice *nvdimm, uint64_t spa_offset,
+                                 uint32_t sn)
+{
+    interleave_set_info info;
+
+    info.mapping[0].region_spa_offset = spa_offset;
+    info.mapping[0].serial_number = sn;
+    info.mapping[0].zero = 0;
+
+    nvdimm->isetcookie = fletcher64(&info, sizeof(info));
+}
+
+#define NSINDEX_SIGNATURE      "NAMESPACE_INDEX\0"
+
+enum {
+    NSINDEX_SIG_LEN = 16,
+    NSINDEX_ALIGN = 256,
+    NSINDEX_SEQ_MASK = 0x3,
+    NSINDEX_MAJOR = 0x1,
+    NSINDEX_MINOR = 0x1,
+
+    NSLABEL_UUID_LEN = 16,
+    NSLABEL_NAME_LEN = 64,
+    NSLABEL_FLAG_ROLABEL = 0x1,  /* read-only label */
+    NSLABEL_FLAG_LOCAL = 0x2,    /* DIMM-local namespace */
+    NSLABEL_FLAG_BTT = 0x4,      /* namespace contains a BTT */
+    NSLABEL_FLAG_UPDATING = 0x8, /* label being updated */
+};
+
+/*
+ * struct nd_namespace_index - label set superblock
+ * @sig: NAMESPACE_INDEX\0
+ * @flags: placeholder
+ * @seq: sequence number for this index
+ * @myoff: offset of this index in label area
+ * @mysize: size of this index struct
+ * @otheroff: offset of other index
+ * @labeloff: offset of first label slot
+ * @nslot: total number of label slots
+ * @major: label area major version
+ * @minor: label area minor version
+ * @checksum: fletcher64 of all fields
+ * @free[0]: bitmap, nlabel bits
+ *
+ * The size of free[] is rounded up so the total struct size is a
+ * multiple of NSINDEX_ALIGN bytes.  Any bits this allocates beyond
+ * nlabel bits must be zero.
+ */
+struct namespace_label_index_block {
+    uint8_t sig[NSINDEX_SIG_LEN];
+    uint32_t flags;
+    uint32_t seq;
+    uint64_t myoff;
+    uint64_t mysize;
+    uint64_t otheroff;
+    uint64_t labeloff;
+    uint32_t nlabel;
+    uint16_t major;
+    uint16_t minor;
+    uint64_t checksum;
+    uint8_t free[0];
+} QEMU_PACKED;
+typedef struct namespace_label_index_block namespace_label_index_block;
+
+/*
+ * struct nd_namespace_label - namespace superblock
+ * @uuid: UUID per RFC 4122
+ * @name: optional name (NULL-terminated)
+ * @flags: see NSLABEL_FLAG_*
+ * @nlabel: num labels to describe this ns
+ * @position: labels position in set
+ * @isetcookie: interleave set cookie
+ * @lbasize: LBA size in bytes or 0 for pmem
+ * @dpa: DPA of NVM range on this DIMM
+ * @rawsize: size of namespace
+ * @slot: slot of this label in label area
+ * @unused: must be zero
+ */
+struct namespace_label {
+    uint8_t uuid[NSLABEL_UUID_LEN];
+    uint8_t name[NSLABEL_NAME_LEN];
+    uint32_t flags;
+    uint16_t nlabel;
+    uint16_t position;
+    uint64_t isetcookie;
+    uint64_t lbasize;
+    uint64_t dpa;
+    uint64_t rawsize;
+    uint32_t slot;
+    uint32_t unused;
+} QEMU_PACKED;
+typedef struct namespace_label namespace_label;
+
+/* calculate the number of labels the whole label data area can contain. */
+static int label_data_max_label_nr(NVDIMMDevice *nvdimm, size_t block_size)
+{
+    /* in total there are 2 namespace label index blocks. */
+    if (block_size * 2 >= nvdimm->label_size) {
+        return 0;
+    }
+
+    return (nvdimm->label_size - block_size * 2) / sizeof(namespace_label);
+}
+
+/* calculate the number of labels one index block can track. */
+static int label_index_block_max_label_nr(size_t block_size)
+{
+    int free_size;
+
+    free_size = block_size - sizeof(namespace_label_index_block);
+
+    return free_size * BITS_PER_BYTE;
+}
+
+static int calculate_max_label_nr(NVDIMMDevice *nvdimm, size_t block_size)
+{
+    return MIN(label_index_block_max_label_nr(block_size),
+        label_data_max_label_nr(nvdimm, block_size));
+}
+
+/*
+ * check if we can increase the size of namespace_label_index_block to
+ * contain more labels.
+ */
+static bool can_increase_index_block(NVDIMMDevice *nvdimm,
+                                     size_t block_size, int label_nr)
+{
+    size_t remaining;
+
+    remaining = nvdimm->label_size - block_size * 2 -
+                label_nr * sizeof(namespace_label);
+
+    assert((int64_t)remaining >= 0);
+
+    /* can contain at least 1 label. */
+    return remaining >= NSINDEX_ALIGN * 2 + sizeof(namespace_label);
+}
+
+static void count_label_nr(NVDIMMDevice *nvdimm, size_t *label_block_size,
+                           int *label_nr)
+{
+    *label_block_size = 0;
+
+    do {
+        /*
+         * The minimum size of an index block is 256 bytes and the size must
+         * be a multiple of 256 bytes.
+         */
+        *label_block_size += NSINDEX_ALIGN;
+
+        *label_nr = calculate_max_label_nr(nvdimm, *label_block_size);
+    } while (can_increase_index_block(nvdimm, *label_block_size, *label_nr));
+}
+
+static void namespace_label_uuid(NVDIMMDevice *nvdimm, void *uuid)
+{
+    /* magic UUID. */
+    uuid_le label_uuid_init = UUID_LE(0x137e67a9, 0x7dcb, 0x4c66, 0xb2,
+                                      0xe6, 0x05, 0x06, 0x5b, 0xeb,
+                                      0x6a, 0x00);
+    int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP, NULL);
+
+    label_uuid_init.b[0] += slot;
+    memcpy(uuid, &label_uuid_init, sizeof(label_uuid_init));
+}
+
+static void nvdimm_device_init_namespace(NVDIMMDevice *nvdimm)
+{
+    namespace_label_index_block *index1, *index2;
+    namespace_label *label;
+    uint64_t addr = object_property_get_int(OBJECT(nvdimm), DIMM_ADDR_PROP,
+                                            NULL);
+    uint64_t size = object_property_get_int(OBJECT(nvdimm), DIMM_SIZE_PROP,
+                                            NULL);
+    int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP, NULL);
+    int i, label_nr;
+    size_t label_block_size;
+
+    nvdimm->label_data = g_malloc(nvdimm->label_size);
+
+    count_label_nr(nvdimm, &label_block_size, &label_nr);
+    nvdebug("nvdimm%d: label_block_size 0x%lx label_nr %d.\n",
+            slot, label_block_size, label_nr);
+
+    index1 = nvdimm->label_data;
+
+    /*
+     * init the first namespace label index block, except @otheroff
+     * and @checksum; those are filled in later.
+     */
+    memcpy(index1->sig, NSINDEX_SIGNATURE, sizeof(NSINDEX_SIGNATURE));
+    index1->flags = cpu_to_le32(0);
+    index1->seq = cpu_to_le32(0x1);
+    index1->myoff = cpu_to_le64(0);
+    index1->mysize = cpu_to_le64(label_block_size);
+    index1->labeloff = cpu_to_le64(label_block_size * 2);
+    index1->nlabel = cpu_to_le32(label_nr);
+    index1->major = cpu_to_le16(NSINDEX_MAJOR);
+    index1->minor = cpu_to_le16(NSINDEX_MINOR);
+    index1->checksum = cpu_to_le64(0);
+    memset(index1->free, 0,
+           label_block_size - sizeof(namespace_label_index_block));
+
+    /*
+     * the label slot with the lowest offset in the label storage area is
+     * tracked by the least significant bit of the first byte of the free
+     * array.
+     *
+     * the first label is used.
+     */
+    for (i = 1; i < index1->nlabel; i++) {
+        set_bit(i, (unsigned long *)index1->free);
+    }
+
+    /* init the second namespace label index block. */
+    index2 = (void *)index1 + label_block_size;
+    memcpy(index2, index1, label_block_size);
+    index2->seq = cpu_to_le32(0x2);
+    index2->myoff = cpu_to_le64(label_block_size);
+
+    /* init @otheroff and @checksum. */
+    index1->otheroff = cpu_to_le64(index2->myoff);
+    index2->otheroff = cpu_to_le64(index1->myoff);
+    index1->checksum = cpu_to_le64(fletcher64(index1, label_block_size));
+    index2->checksum = cpu_to_le64(fletcher64(index2, label_block_size));
+
+    /* only one label is used which is the first label and is readonly. */
+    label = nvdimm->label_data + label_block_size * 2;
+    namespace_label_uuid(nvdimm, label->uuid);
+    sprintf((char *)label->name, "QEMU NS%d", slot);
+    label->flags = cpu_to_le32(NSLABEL_FLAG_ROLABEL);
+    label->nlabel = cpu_to_le16(1);
+    label->position = cpu_to_le16(0);
+    label->isetcookie = cpu_to_le64(nvdimm->isetcookie);
+    label->lbasize = cpu_to_le64(0);
+    label->dpa = cpu_to_le64(addr);
+    label->rawsize = cpu_to_le64(size);
+    label->slot = cpu_to_le32(0);
+    label->unused = cpu_to_le32(0);
+
+    nvdebug("nvdimm%d, checksum1 0x%lx checksum2 0x%lx isetcookie 0x%lx.\n",
+            slot, index1->checksum, index2->checksum,
+            label->isetcookie);
+}
+
+void build_nvdimm_label_data(GSList *device_list)
+{
+    for (; device_list; device_list = device_list->next) {
+        NVDIMMDevice *nvdimm = device_list->data;
+
+        if (nvdimm->label_data) {
+            continue;
+        }
+
+        nvdimm_device_init_namespace(nvdimm);
+    }
+}
diff --git a/hw/mem/nvdimm/nvdimm.c b/hw/mem/nvdimm/nvdimm.c
index bc8c577..9688533 100644
--- a/hw/mem/nvdimm/nvdimm.c
+++ b/hw/mem/nvdimm/nvdimm.c
@@ -62,14 +62,15 @@ static void nvdimm_realize(DIMMDevice *dimm, Error **errp)
 {
     MemoryRegion *mr;
     NVDIMMDevice *nvdimm = NVDIMM(dimm);
-    uint64_t size;
+    uint64_t reserved_label_size, size;
 
     nvdimm->label_size = MIN_NAMESPACE_LABEL_SIZE;
+    reserved_label_size = nvdimm->reserve_label_data ? nvdimm->label_size : 0;
 
     mr = host_memory_backend_get_memory(dimm->hostmem, errp);
     size = memory_region_size(mr);
 
-    if (size <= nvdimm->label_size) {
+    if (size <= reserved_label_size) {
         char *path = object_get_canonical_path_component(OBJECT(dimm->hostmem));
         error_setg(errp, "the size of memdev %s (0x%" PRIx64 ") is too small"
                    " to contain nvdimm namespace label (0x%" PRIx64 ")", path,
@@ -78,9 +79,12 @@ static void nvdimm_realize(DIMMDevice *dimm, Error **errp)
     }
 
     memory_region_init_alias(&nvdimm->nvdimm_mr, OBJECT(dimm), "nvdimm-memory",
-                             mr, 0, size - nvdimm->label_size);
-    nvdimm->label_data = memory_region_get_ram_ptr(mr) +
-                         memory_region_size(&nvdimm->nvdimm_mr);
+                             mr, 0, size - reserved_label_size);
+
+    if (reserved_label_size) {
+        nvdimm->label_data = memory_region_get_ram_ptr(mr) +
+                             memory_region_size(&nvdimm->nvdimm_mr);
+    }
 }
 
 static void nvdimm_class_init(ObjectClass *oc, void *data)
@@ -95,10 +99,32 @@ static void nvdimm_class_init(ObjectClass *oc, void *data)
     ddc->get_memory_region = nvdimm_get_memory_region;
 }
 
+static bool nvdimm_get_reserve_label_data(Object *obj, Error **errp)
+{
+    NVDIMMDevice *nvdimm = NVDIMM(obj);
+
+    return nvdimm->reserve_label_data;
+}
+
+static void nvdimm_set_reserve_label_data(Object *obj, bool value, Error **errp)
+{
+    NVDIMMDevice *nvdimm = NVDIMM(obj);
+
+    nvdimm->reserve_label_data = value;
+}
+
+static void nvdimm_init(Object *obj)
+{
+    object_property_add_bool(obj, "reserve-label-data",
+                             nvdimm_get_reserve_label_data,
+                             nvdimm_set_reserve_label_data, NULL);
+}
+
 static TypeInfo nvdimm_info = {
     .name          = TYPE_NVDIMM,
     .parent        = TYPE_DIMM,
     .instance_size = sizeof(NVDIMMDevice),
+    .instance_init = nvdimm_init,
     .class_init    = nvdimm_class_init,
 };
 
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index 0a6bda4..a8eef65 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -28,8 +28,12 @@ struct NVDIMMDevice {
     DIMMDevice parent_obj;
 
     /* public */
+    bool reserve_label_data;
     uint64_t label_size;
     void *label_data;
+
+    uint64_t isetcookie;
+
     MemoryRegion nvdimm_mr;
 };
 typedef struct NVDIMMDevice NVDIMMDevice;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread
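
The realize hunk above splits the backend file into a guest-visible pmem alias and an optional label-data tail. A rough Python sketch of that size arithmetic follows; the function and constant names are illustrative, not QEMU API, and the 128K value matches the label area size described in this series:

```python
# Sketch of the size split done in nvdimm_realize above: when label
# data is reserved, the pmem alias covers [0, size - 128K) and the
# label data occupies the last 128K of the backend file; otherwise
# the whole backend is exposed as pmem.
LABEL_SIZE = 128 * 1024  # assumed MIN_NAMESPACE_LABEL_SIZE

def nvdimm_layout(backend_size, reserve_label_data):
    reserved = LABEL_SIZE if reserve_label_data else 0
    if backend_size <= reserved:
        raise ValueError("backend too small to contain the namespace label")
    pmem_size = backend_size - reserved
    # the label data pointer aliases the tail, right after the pmem region
    label_offset = pmem_size if reserved else None
    return pmem_size, label_offset

print(nvdimm_layout(1 << 30, True))   # (1073610752, 1073610752)
print(nvdimm_layout(1 << 30, False))  # (1073741824, None)
```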


* [PATCH v3 32/32] nvdimm: add maintain info
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-11  3:53   ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-11  3:53 UTC (permalink / raw)
  To: pbonzini, imammedo
  Cc: gleb, mtosatti, stefanha, mst, rth, ehabkost, dan.j.williams,
	kvm, qemu-devel, Xiao Guangrong

Add NVDIMM maintainer

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 MAINTAINERS | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9bde832..204d82a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -868,6 +868,12 @@ M: Jiri Pirko <jiri@resnulli.us>
 S: Maintained
 F: hw/net/rocker/
 
+NVDIMM
+M: Xiao Guangrong <guangrong.xiao@linux.intel.com>
+S: Maintained
+F: hw/mem/nvdimm/*
+F: include/hw/mem/nvdimm.h
+
 Subsystems
 ----------
 Audio
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 200+ messages in thread


* Re: [PATCH v3 00/32] implement vNVDIMM
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-12  2:59   ` Bharata B Rao
  -1 siblings, 0 replies; 200+ messages in thread
From: Bharata B Rao @ 2015-10-12  2:59 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Paolo Bonzini, Igor Mammedov, gleb, mtosatti, Stefan Hajnoczi,
	mst, Richard Henderson, Eduardo Habkost, dan.j.williams, kvm,
	qemu-devel

Xiao,

Are these patches present in any git tree so that they can be easily tried out?

Regards,
Bharata.

On Sun, Oct 11, 2015 at 9:22 AM, Xiao Guangrong
<guangrong.xiao@linux.intel.com> wrote:
> Changelog in v3:
> There is a huge change in this version; thanks to Igor, Stefan, Paolo, Eduardo
> and Michael for their valuable comments, the patchset is finally in better shape.
> - changes from Igor's comments:
>   1) abstract a dimm device type from pc-dimm and create the nvdimm device
>      based on dimm; it then uses a memory backend device as nvdimm's memory,
>      so NUMA is easily implemented.
>   2) let the file-backend device support any kind of filesystem, not only
>      hugetlbfs, and let it work on a file, not only a directory; this is
>      achieved by extending 'mem-path' - if it's a directory then it keeps the
>      current behavior, otherwise if it's a file then memory is allocated
>      directly from it.
>   3) we use an unused memory hole below 4G, that is 0xFF00000 ~
>      0xFFF00000; this range is large enough for NVDIMM ACPI, as building a
>      64-bit ACPI SSDT/DSDT table would break Windows XP.
>      BTW, setting only SSDT.rev = 2 cannot work since the integer width
>      depends only on DSDT.rev, based on 19.6.28 DefinitionBlock (Declare
>      Definition Block) in the ACPI spec:
> | Note: For compatibility with ACPI versions before ACPI 2.0, the bit
> | width of Integer objects is dependent on the ComplianceRevision of the DSDT.
> | If the ComplianceRevision is less than 2, all integers are restricted to 32
> | bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets
> | the global integer width for all integers, including integers in SSDTs.
>   4) use the lowest ACPI spec version to document AML terms.
>   5) use "nvdimm" as nvdimm device name instead of "pc-nvdimm"
>
> - changes from Stefan's comments:
>   1) do not do the endian adjustment in place since _DSM memory is visible
>      to the guest
>   2) use the target platform's page size instead of a fixed PAGE_SIZE
>      definition
>   3) lots of code style improvements and typo fixes.
>   4) live migration fix
> - changes from Paolo's comments:
>   1) improve the name of the memory region
>
> - other changes:
>   1) return the exact buffer size for the _DSM method instead of the page size.
>   2) introduce a mutex in NVDIMM ACPI as the _DSM memory is shared by all nvdimm
>      devices.
>   3) NUMA support
>   4) implement _FIT method
>   5) rename "configdata" to "reserve-label-data"
>   6) simplify _DSM arg3 determination
>   7) main changelog update to let it reflect v3.
>
> Changelog in v2:
> - Use little endian for the DSM method; thanks to Stefan for the suggestion
>
> - introduce a new parameter, @configdata; if it's false, QEMU will
>   build a static and readonly namespace in memory and use it to serve
>   DSM GET_CONFIG_SIZE/GET_CONFIG_DATA requests. In this case, no
>   reserved region is needed at the end of the @file, which is good for
>   users who want to pass the whole nvdimm device and make its data
>   completely visible to the guest
>
> - divide the source code into separated files and add maintain info
>
> BTW, PCOMMIT virtualization on the KVM side is a work in progress and will
> hopefully be posted next week
>
> ====== Background ======
> NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be supported
> on Intel's platform. These devices are discovered via ACPI and configured
> by the _DSM method of the NVDIMM device in ACPI. Some supporting documents
> can be found at:
> ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
> NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
> DSM Interface Example: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
> Driver Writer's Guide: http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
>
> Currently, the NVDIMM driver has been merged into the upstream Linux kernel and
> this patchset tries to enable it in the virtualization field
>
> ====== Design ======
> NVDIMM supports two access modes: one is PMEM, which maps the NVDIMM into the
> CPU's address space so the CPU can directly access it as normal memory; the
> other is BLK, which is used as a block device to reduce the consumption of CPU
> address space
>
> BLK mode accesses the NVDIMM via a Command Register window and a Data Register
> window. BLK virtualization has high overhead since each sector access causes
> at least two VM-EXITs, so we currently only implement vPMEM in this patchset
>
> --- vPMEM design ---
> We introduce a new device named "nvdimm"; it uses a memory backend device as
> NVDIMM memory. The file in the file-backend device can be a regular file or a
> block device. We can use any file for testing or emulation; however,
> in the real world, the files passed to the guest are:
> - a regular file created on a DAX-enabled filesystem on an NVDIMM device
>   on the host
> - a raw PMEM device on the host, e.g. /dev/pmem0
> Memory accesses to addresses created by mmap on these kinds of files can
> directly reach the NVDIMM device on the host.
>
> --- vConfigure data area design ---
> Each NVDIMM device has a configuration data area which is used to store label
> namespace data. In order to emulate this area, we divide the file into two
> parts:
> - the first part is (0, size - 128K], which is used as PMEM
> - the last 128K of the file, which is used as the Label Data Area
> so that the label namespace data can persist across power loss or system
> failure.
>
> We also support passing the whole file to the guest without reserving any
> region for the label data area, which is achieved by the "reserve-label-data"
> parameter - if it's false then QEMU will build a static and readonly
> namespace in memory and that namespace covers the whole file size. The
> parameter is false by default.
>
> --- _DSM method design ---
> _DSM in ACPI is used to configure the NVDIMM; currently we only allow access
> to label namespace data, i.e., Get Namespace Label Size (Function Index 4),
> Get Namespace Label Data (Function Index 5) and Set Namespace Label Data
> (Function Index 6)
>
> _DSM uses two pages to transfer data between ACPI and QEMU. The first page
> is RAM-based and is used to save the input of the _DSM method; QEMU reuses it
> to store the output. The other page is MMIO-based; ACPI writes data to this
> page to transfer control to QEMU
>
> ====== Test ======
> In the host:
> 1) create a memory backed file, e.g. # dd if=/dev/zero of=/tmp/nvdimm bs=1G count=10
> 2) append "-object memory-backend-file,share,id=mem1,
>    mem-path=/tmp/nvdimm -device nvdimm,memdev=mem1,reserve-label-data,
>    id=nv1" in QEMU command line
>
> In the guest, download the latest upstream kernel (4.2 merge window) and enable
> ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM.
> 1) insmod drivers/nvdimm/libnvdimm.ko
> 2) insmod drivers/acpi/nfit.ko
> 3) insmod drivers/nvdimm/nd_btt.ko
> 4) insmod drivers/nvdimm/nd_pmem.ko
> You can see the whole nvdimm device used as a single namespace, and /dev/pmem0
> appears. You can do whatever you want on /dev/pmem0, including DAX access.
>
> Currently the Linux NVDIMM driver does not support namespace operations on
> this kind of PMEM; apply the changes below to support dynamic namespaces:
>
> @@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *a
>                         continue;
>                 }
>
> -               if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> +               //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> +               if (nfit_mem->memdev_pmem)
>                         flags |= NDD_ALIASING;
>
> You can append another NVDIMM device in the guest and do:
> # cd /sys/bus/nd/devices/
> # cd namespace1.0/
> # echo `uuidgen` > uuid
> # echo `expr 1024 \* 1024 \* 128` > size
> then reload nd_pmem.ko
>
> You can see /dev/pmem1 appears
>
> ====== TODO ======
> NVDIMM hotplug support
>
> Xiao Guangrong (32):
>   acpi: add aml_derefof
>   acpi: add aml_sizeof
>   acpi: add aml_create_field
>   acpi: add aml_mutex, aml_acquire, aml_release
>   acpi: add aml_concatenate
>   acpi: add aml_object_type
>   util: introduce qemu_file_get_page_size()
>   exec: allow memory to be allocated from any kind of path
>   exec: allow file_ram_alloc to work on file
>   hostmem-file: clean up memory allocation
>   hostmem-file: use whole file size if possible
>   pc-dimm: remove DEFAULT_PC_DIMMSIZE
>   pc-dimm: make pc_existing_dimms_capacity static and rename it
>   pc-dimm: drop the prefix of pc-dimm
>   stubs: rename qmp_pc_dimm_device_list.c
>   pc-dimm: rename pc-dimm.c and pc-dimm.h
>   dimm: abstract dimm device from pc-dimm
>   dimm: get mapped memory region from DIMMDeviceClass->get_memory_region
>   dimm: keep the state of the whole backend memory
>   dimm: introduce realize callback
>   nvdimm: implement NVDIMM device abstract
>   nvdimm: init the address region used by NVDIMM ACPI
>   nvdimm: build ACPI NFIT table
>   nvdimm: init the address region used by DSM method
>   nvdimm: build ACPI nvdimm devices
>   nvdimm: save arg3 for NVDIMM device _DSM method
>   nvdimm: support DSM_CMD_IMPLEMENTED function
>   nvdimm: support DSM_CMD_NAMESPACE_LABEL_SIZE function
>   nvdimm: support DSM_CMD_GET_NAMESPACE_LABEL_DATA
>   nvdimm: support DSM_CMD_SET_NAMESPACE_LABEL_DATA
>   nvdimm: allow using whole backend memory as pmem
>   nvdimm: add maintain info
>
>  MAINTAINERS                                        |   6 +
>  backends/hostmem-file.c                            |  58 +-
>  default-configs/i386-softmmu.mak                   |   2 +
>  default-configs/x86_64-softmmu.mak                 |   2 +
>  exec.c                                             | 113 ++-
>  hmp.c                                              |   2 +-
>  hw/Makefile.objs                                   |   2 +-
>  hw/acpi/aml-build.c                                |  83 ++
>  hw/acpi/ich9.c                                     |   8 +-
>  hw/acpi/memory_hotplug.c                           |  26 +-
>  hw/acpi/piix4.c                                    |   8 +-
>  hw/i386/acpi-build.c                               |   4 +
>  hw/i386/pc.c                                       |  37 +-
>  hw/mem/Makefile.objs                               |   3 +
>  hw/mem/{pc-dimm.c => dimm.c}                       | 240 ++---
>  hw/mem/nvdimm/acpi.c                               | 961 +++++++++++++++++++++
>  hw/mem/nvdimm/internal.h                           |  41 +
>  hw/mem/nvdimm/namespace.c                          | 309 +++++++
>  hw/mem/nvdimm/nvdimm.c                             | 136 +++
>  hw/mem/pc-dimm.c                                   | 506 +----------
>  hw/ppc/spapr.c                                     |  20 +-
>  include/hw/acpi/aml-build.h                        |   8 +
>  include/hw/i386/pc.h                               |   4 +-
>  include/hw/mem/{pc-dimm.h => dimm.h}               |  68 +-
>  include/hw/mem/nvdimm.h                            |  58 ++
>  include/hw/mem/pc-dimm.h                           | 105 +--
>  include/hw/ppc/spapr.h                             |   2 +-
>  include/qemu/osdep.h                               |   1 +
>  numa.c                                             |   4 +-
>  qapi-schema.json                                   |   8 +-
>  qmp.c                                              |   4 +-
>  stubs/Makefile.objs                                |   2 +-
>  ...c_dimm_device_list.c => qmp_dimm_device_list.c} |   4 +-
>  target-ppc/kvm.c                                   |  21 +-
>  trace-events                                       |   8 +-
>  util/oslib-posix.c                                 |  16 +
>  util/oslib-win32.c                                 |   5 +
>  37 files changed, 2023 insertions(+), 862 deletions(-)
>  rename hw/mem/{pc-dimm.c => dimm.c} (65%)
>  create mode 100644 hw/mem/nvdimm/acpi.c
>  create mode 100644 hw/mem/nvdimm/internal.h
>  create mode 100644 hw/mem/nvdimm/namespace.c
>  create mode 100644 hw/mem/nvdimm/nvdimm.c
>  rewrite hw/mem/pc-dimm.c (91%)
>  rename include/hw/mem/{pc-dimm.h => dimm.h} (50%)
>  create mode 100644 include/hw/mem/nvdimm.h
>  rewrite include/hw/mem/pc-dimm.h (97%)
>  rename stubs/{qmp_pc_dimm_device_list.c => qmp_dimm_device_list.c} (56%)
>
> --
> 1.8.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
http://raobharata.wordpress.com/

^ permalink raw reply	[flat|nested] 200+ messages in thread


* Re: [PATCH v3 00/32] implement vNVDIMM
  2015-10-12  2:59   ` [Qemu-devel] " Bharata B Rao
@ 2015-10-12  3:06     ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-12  3:06 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: Paolo Bonzini, Igor Mammedov, gleb, mtosatti, Stefan Hajnoczi,
	mst, Richard Henderson, Eduardo Habkost, dan.j.williams, kvm,
	qemu-devel



On 10/12/2015 10:59 AM, Bharata B Rao wrote:
> Xiao,
>
> Are these patches present in any git tree so that they can be easily tried out?
>

Sorry, currently no git tree outside my workspace is available :(

BTW, this patchset is based on top of commit b37686f7e on the qemu tree:
commit b37686f7e84b22cfaf7fd01ac5133f2617cc3027
Merge: 8be6e62 98cf48f
Author: Peter Maydell <peter.maydell@linaro.org>
Date:   Fri Oct 9 12:18:13 2015 +0100

     Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

Thanks.

> Regards,
> Bharata.
>
> On Sun, Oct 11, 2015 at 9:22 AM, Xiao Guangrong
> <guangrong.xiao@linux.intel.com> wrote:
>> Changelog in v3:
>> There are huge changes in this version; thanks to Igor, Stefan, Paolo, Eduardo,
>> and Michael for their valuable comments, the patchset is finally in better shape.
>> - changes from Igor's comments:
>>    1) abstract a dimm device type from pc-dimm and create the nvdimm device
>>       based on dimm; it then uses a memory backend device as the nvdimm's
>>       memory, and NUMA support was easily implemented on top of that.
>>    2) let the file-backend device support any kind of filesystem, not only
>>       hugetlbfs, and let it work on a file, not only a directory; this is
>>       achieved by extending 'mem-path' - if it's a directory then it works as
>>       before, otherwise if it's a file then memory is allocated directly
>>       from it.
>>    3) we found an unused memory hole below 4G, namely 0xFF00000 ~
>>       0xFFF00000; this range is large enough for NVDIMM ACPI, as building a
>>       64-bit ACPI SSDT/DSDT table would break Windows XP.
>>       BTW, only setting SSDT.rev = 2 does not work, since the integer width
>>       depends only on DSDT.rev, per 19.6.28 DefinitionBlock (Declare
>>       Definition Block) in the ACPI spec:
>> | Note: For compatibility with ACPI versions before ACPI 2.0, the bit
>> | width of Integer objects is dependent on the ComplianceRevision of the DSDT.
>> | If the ComplianceRevision is less than 2, all integers are restricted to 32
>> | bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets
>> | the global integer width for all integers, including integers in SSDTs.
>>    4) use the lowest ACPI spec version to document AML terms.
>>    5) use "nvdimm" as nvdimm device name instead of "pc-nvdimm"
>>
>> - changes from Stefan's comments:
>>    1) do not do endian adjustment in-place since _DSM memory is visible to guest
>>    2) use target platform's target page size instead of fixed PAGE_SIZE
>>       definition
>>    3) lots of code style improvement and typo fixes.
>>    4) live migration fix
>> - changes from Paolo's comments:
>>    1) improve the name of memory region
>>
>> - other changes:
>>    1) return exact buffer size for _DSM method instead of the page size.
>>    2) introduce mutex in NVDIMM ACPI as the _DSM memory is shared by all nvdimm
>>       devices.
>>    3) NUMA support
>>    4) implement _FIT method
>>    5) rename "configdata" to "reserve-label-data"
>>    6) simplify _DSM arg3 determination
>>    7) update the main changelog to reflect v3.
>>
>> Changelog in v2:
>> - Use little endian for the DSM method, thanks to Stefan's suggestion
>>
>> - introduce a new parameter, @configdata; if it's false, QEMU will
>>    build a static and readonly namespace in memory and use it to serve
>>    DSM GET_CONFIG_SIZE/GET_CONFIG_DATA requests. In this case, no
>>    reserved region is needed at the end of the @file, which is good for
>>    users who want to pass the whole nvdimm device through and make its data
>>    completely visible to the guest
>>
>> - divide the source code into separate files and add maintainer info
>>
>> BTW, PCOMMIT virtualization on the KVM side is a work in progress; hopefully
>> it will be posted next week
>>
>> ====== Background ======
>> NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be supported
>> on Intel's platform. NVDIMMs are discovered via ACPI and configured by the
>> _DSM method of the NVDIMM device in ACPI. There are some supporting documents
>> which can be found at:
>> ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
>> NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
>> DSM Interface Example: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
>> Driver Writer's Guide: http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
>>
>> Currently, the NVDIMM driver has been merged into the upstream Linux kernel,
>> and this patchset tries to enable it in the virtualization field
>>
>> ====== Design ======
>> NVDIMM supports two access modes. One is PMEM, which maps the NVDIMM into the
>> CPU's address space so the CPU can directly access it as normal memory; the
>> other is BLK, which is used as a block device to reduce the consumption of CPU
>> address space
>>
>> BLK mode accesses the NVDIMM via a Command Register window and a Data Register
>> window. BLK virtualization has high overhead since each sector access causes
>> at least two VM-EXITs. So we currently only implement vPMEM in this patchset
>>
>> --- vPMEM design ---
>> We introduce a new device named "nvdimm"; it uses a memory backend device as
>> NVDIMM memory. The file in the file-backend device can be a regular file or a
>> block device. We can use any file when we do tests or emulation; however,
>> in the real world, the files passed to the guest are:
>> - a regular file created on a DAX-enabled filesystem on an NVDIMM device
>>    on the host
>> - a raw PMEM device on the host, e.g. /dev/pmem0
>> Memory accesses to addresses created by mmap on these kinds of files can
>> directly reach the NVDIMM device on the host.
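As an aside, the "directly reach" property above is just ordinary mmap from the application's point of view; the sketch below illustrates it with a plain temporary file standing in for the memory-backed file (the path is generated, not the real /tmp/nvdimm, and on a DAX filesystem the same stores would reach the NVDIMM without the page cache):

```python
import mmap
import os
import tempfile

# Create a small file standing in for the memory-backed file.
fd, path = tempfile.mkstemp()
os.ftruncate(fd, 4096)

# mmap gives the application a direct load/store window into the file;
# with DAX on a real NVDIMM, these stores reach the device itself.
with mmap.mmap(fd, 4096) as m:
    m[0:5] = b"hello"
    assert m[0:5] == b"hello"

os.close(fd)
os.unlink(path)
```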
>>
>> --- vConfigure data area design ---
>> Each NVDIMM device has a configuration data area which is used to store label
>> namespace data. In order to emulate this area, we divide the file into two
>> parts:
>> - the first part is (0, size - 128K], which is used as PMEM
>> - the 128K at the end of the file, which is used as the Label Data Area
>> so that the label namespace data persists across power loss or system
>> failure.
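In other words, the split is pure arithmetic on the backing file size; the sketch below just spells it out for a hypothetical 10G file (the 128K label-area size is the one described above):

```python
LABEL_AREA_SIZE = 128 * 1024            # 128K reserved at the end of the file
file_size = 10 * 1024 ** 3              # e.g. a 10G backing file

pmem_size = file_size - LABEL_AREA_SIZE     # region (0, size - 128K] exposed as PMEM
label_offset = file_size - LABEL_AREA_SIZE  # Label Data Area starts here

# The two regions exactly tile the file.
assert pmem_size + LABEL_AREA_SIZE == file_size
print(pmem_size, label_offset)
```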
>>
>> We also support passing the whole file to the guest without reserving any
>> region for the label data area; this is controlled by the "reserve-label-data"
>> parameter - if it's false then QEMU builds a static and readonly namespace in
>> memory, and that namespace covers the whole file size. The parameter is false
>> by default.
>>
>> --- _DSM method design ---
>> _DSM in ACPI is used to configure the NVDIMM; currently we only allow access
>> to the label namespace data, i.e. Get Namespace Label Size (Function Index 4),
>> Get Namespace Label Data (Function Index 5) and Set Namespace Label Data
>> (Function Index 6)
>>
>> _DSM uses two pages to transfer data between ACPI and QEMU. The first page
>> is RAM-based and is used to save the input info of the _DSM method; QEMU
>> reuses it to store the output info. The other page is MMIO-based; ACPI writes
>> data to this page to transfer control to QEMU
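A rough model of that two-page handshake (every name and the input/output layout here are hypothetical, purely to illustrate the flow; the real interface is defined by the AML and QEMU code in the patches):

```python
# RAM-based page: the guest writes _DSM input here, QEMU overwrites it
# with the output.
ram_page = bytearray(4096)

def qemu_mmio_write(value):
    """Stands in for the MMIO page: a guest write traps out to QEMU,
    which parses the input in the RAM page and writes the output back."""
    func_index = ram_page[0]
    if func_index == 4:  # Get Namespace Label Size (hypothetical encoding)
        ram_page[0:4] = (128 * 1024).to_bytes(4, "little")

# Guest side: place the input in the RAM page, then touch the MMIO page
# to hand control to QEMU.
ram_page[0] = 4
qemu_mmio_write(0)
assert int.from_bytes(ram_page[0:4], "little") == 128 * 1024
```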
>>
>> ====== Test ======
>> On the host:
>> 1) create a memory-backed file, e.g. # dd if=/dev/zero of=/tmp/nvdimm bs=1G count=10
>> 2) append "-object memory-backend-file,share,id=mem1,
>>     mem-path=/tmp/nvdimm -device nvdimm,memdev=mem1,reserve-label-data,
>>     id=nv1" in QEMU command line
>>
>> In the guest, download the latest upstream kernel (4.2 merge window) and
>> enable ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM.
>> 1) insmod drivers/nvdimm/libnvdimm.ko
>> 2) insmod drivers/acpi/nfit.ko
>> 3) insmod drivers/nvdimm/nd_btt.ko
>> 4) insmod drivers/nvdimm/nd_pmem.ko
>> You can see the whole nvdimm device used as a single namespace, and /dev/pmem0
>> appears. You can do anything you want on /dev/pmem0, including DAX access.
>>
>> Currently the Linux NVDIMM driver does not support namespace operations on
>> this kind of PMEM; apply the changes below to support dynamic namespaces:
>>
>> @@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *a
>>                          continue;
>>                  }
>>
>> -               if (nfit_mem->bdw && nfit_mem->memdev_pmem)
>> +               //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
>> +               if (nfit_mem->memdev_pmem)
>>                          flags |= NDD_ALIASING;
>>
>> You can append another NVDIMM device in the guest and do:
>> # cd /sys/bus/nd/devices/
>> # cd namespace1.0/
>> # echo `uuidgen` > uuid
>> # echo `expr 1024 \* 1024 \* 128` > size
>> then reload nd_pmem.ko
>>
>> You can see /dev/pmem1 appears
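For the record, the `expr 1024 \* 1024 \* 128` written to the size attribute above is just 128M expressed in bytes:

```python
# sysfs "size" expects a byte count; 128M in bytes:
namespace_size = 1024 * 1024 * 128
assert namespace_size == 134217728
print(namespace_size)
```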
>>
>> ====== TODO ======
>> NVDIMM hotplug support
>>
>> Xiao Guangrong (32):
>>    acpi: add aml_derefof
>>    acpi: add aml_sizeof
>>    acpi: add aml_create_field
>>    acpi: add aml_mutex, aml_acquire, aml_release
>>    acpi: add aml_concatenate
>>    acpi: add aml_object_type
>>    util: introduce qemu_file_get_page_size()
>>    exec: allow memory to be allocated from any kind of path
>>    exec: allow file_ram_alloc to work on file
>>    hostmem-file: clean up memory allocation
>>    hostmem-file: use whole file size if possible
>>    pc-dimm: remove DEFAULT_PC_DIMMSIZE
>>    pc-dimm: make pc_existing_dimms_capacity static and rename it
>>    pc-dimm: drop the prefix of pc-dimm
>>    stubs: rename qmp_pc_dimm_device_list.c
>>    pc-dimm: rename pc-dimm.c and pc-dimm.h
>>    dimm: abstract dimm device from pc-dimm
>>    dimm: get mapped memory region from DIMMDeviceClass->get_memory_region
>>    dimm: keep the state of the whole backend memory
>>    dimm: introduce realize callback
>>    nvdimm: implement NVDIMM device abstract
>>    nvdimm: init the address region used by NVDIMM ACPI
>>    nvdimm: build ACPI NFIT table
>>    nvdimm: init the address region used by DSM method
>>    nvdimm: build ACPI nvdimm devices
>>    nvdimm: save arg3 for NVDIMM device _DSM method
>>    nvdimm: support DSM_CMD_IMPLEMENTED function
>>    nvdimm: support DSM_CMD_NAMESPACE_LABEL_SIZE function
>>    nvdimm: support DSM_CMD_GET_NAMESPACE_LABEL_DATA
>>    nvdimm: support DSM_CMD_SET_NAMESPACE_LABEL_DATA
>>    nvdimm: allow using whole backend memory as pmem
>>    nvdimm: add maintain info
>>
>>   MAINTAINERS                                        |   6 +
>>   backends/hostmem-file.c                            |  58 +-
>>   default-configs/i386-softmmu.mak                   |   2 +
>>   default-configs/x86_64-softmmu.mak                 |   2 +
>>   exec.c                                             | 113 ++-
>>   hmp.c                                              |   2 +-
>>   hw/Makefile.objs                                   |   2 +-
>>   hw/acpi/aml-build.c                                |  83 ++
>>   hw/acpi/ich9.c                                     |   8 +-
>>   hw/acpi/memory_hotplug.c                           |  26 +-
>>   hw/acpi/piix4.c                                    |   8 +-
>>   hw/i386/acpi-build.c                               |   4 +
>>   hw/i386/pc.c                                       |  37 +-
>>   hw/mem/Makefile.objs                               |   3 +
>>   hw/mem/{pc-dimm.c => dimm.c}                       | 240 ++---
>>   hw/mem/nvdimm/acpi.c                               | 961 +++++++++++++++++++++
>>   hw/mem/nvdimm/internal.h                           |  41 +
>>   hw/mem/nvdimm/namespace.c                          | 309 +++++++
>>   hw/mem/nvdimm/nvdimm.c                             | 136 +++
>>   hw/mem/pc-dimm.c                                   | 506 +----------
>>   hw/ppc/spapr.c                                     |  20 +-
>>   include/hw/acpi/aml-build.h                        |   8 +
>>   include/hw/i386/pc.h                               |   4 +-
>>   include/hw/mem/{pc-dimm.h => dimm.h}               |  68 +-
>>   include/hw/mem/nvdimm.h                            |  58 ++
>>   include/hw/mem/pc-dimm.h                           | 105 +--
>>   include/hw/ppc/spapr.h                             |   2 +-
>>   include/qemu/osdep.h                               |   1 +
>>   numa.c                                             |   4 +-
>>   qapi-schema.json                                   |   8 +-
>>   qmp.c                                              |   4 +-
>>   stubs/Makefile.objs                                |   2 +-
>>   ...c_dimm_device_list.c => qmp_dimm_device_list.c} |   4 +-
>>   target-ppc/kvm.c                                   |  21 +-
>>   trace-events                                       |   8 +-
>>   util/oslib-posix.c                                 |  16 +
>>   util/oslib-win32.c                                 |   5 +
>>   37 files changed, 2023 insertions(+), 862 deletions(-)
>>   rename hw/mem/{pc-dimm.c => dimm.c} (65%)
>>   create mode 100644 hw/mem/nvdimm/acpi.c
>>   create mode 100644 hw/mem/nvdimm/internal.h
>>   create mode 100644 hw/mem/nvdimm/namespace.c
>>   create mode 100644 hw/mem/nvdimm/nvdimm.c
>>   rewrite hw/mem/pc-dimm.c (91%)
>>   rename include/hw/mem/{pc-dimm.h => dimm.h} (50%)
>>   create mode 100644 include/hw/mem/nvdimm.h
>>   rewrite include/hw/mem/pc-dimm.h (97%)
>>   rename stubs/{qmp_pc_dimm_device_list.c => qmp_dimm_device_list.c} (56%)
>>
>> --
>> 1.8.3.1
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 00/32] implement vNVDIMM
@ 2015-10-12  3:06     ` Xiao Guangrong
  0 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-12  3:06 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: Eduardo Habkost, kvm, mst, gleb, mtosatti, qemu-devel,
	Stefan Hajnoczi, Igor Mammedov, Paolo Bonzini, dan.j.williams,
	Richard Henderson



On 10/12/2015 10:59 AM, Bharata B Rao wrote:
> Xiao,
>
> Are these patches present in any git tree so that they can be easily tried out.
>

Sorry, currently no git tree out of my workspace is available :(

BTW, this patchset is based on top of the commit b37686f7e on qemu tree:
commit b37686f7e84b22cfaf7fd01ac5133f2617cc3027
Merge: 8be6e62 98cf48f
Author: Peter Maydell <peter.maydell@linaro.org>
Date:   Fri Oct 9 12:18:13 2015 +0100

     Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

Thanks.

> Regards,
> Bharata.
>
> On Sun, Oct 11, 2015 at 9:22 AM, Xiao Guangrong
> <guangrong.xiao@linux.intel.com> wrote:
>> Changelog in v3:
>> There is huge change in this version, thank Igor, Stefan, Paolo, Eduardo,
>> Michael for their valuable comments, the patchset finally gets better shape.
>> - changes from Igor's comments:
>>    1) abstract dimm device type from pc-dimm and create nvdimm device based on
>>       dimm, then it uses memory backend device as nvdimm's memory and NUMA has
>>       easily been implemented.
>>    2) let file-backend device support any kind of filesystem not only for
>>       hugetlbfs and let it work on file not only for directory which is
>>       achieved by extending 'mem-path' - if it's a directory then it works as
>>       current behavior, otherwise if it's file then directly allocates memory
>>       from it.
>>    3) we figure out a unused memory hole below 4G that is 0xFF00000 ~
>>       0xFFF00000, this range is large enough for NVDIMM ACPI as build 64-bit
>>       ACPI SSDT/DSDT table will break windows XP.
>>       BTW, only make SSDT.rev = 2 can not work since the width is only depended
>>       on DSDT.rev based on 19.6.28 DefinitionBlock (Declare Definition Block)
>>       in ACPI spec:
>> | Note: For compatibility with ACPI versions before ACPI 2.0, the bit
>> | width of Integer objects is dependent on the ComplianceRevision of the DSDT.
>> | If the ComplianceRevision is less than 2, all integers are restricted to 32
>> | bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets
>> | the global integer width for all integers, including integers in SSDTs.
>>    4) use the lowest ACPI spec version to document AML terms.
>>    5) use "nvdimm" as nvdimm device name instead of "pc-nvdimm"
>>
>> - changes from Stefan's comments:
>>    1) do not do endian adjustment in-place since _DSM memory is visible to guest
>>    2) use target platform's target page size instead of fixed PAGE_SIZE
>>       definition
>>    3) lots of code style improvement and typo fixes.
>>    4) live migration fix
>> - changes from Paolo's comments:
>>    1) improve the name of memory region
>>
>> - other changes:
>>    1) return exact buffer size for _DSM method instead of the page size.
>>    2) introduce mutex in NVDIMM ACPI as the _DSM memory is shared by all nvdimm
>>       devices.
>>    3) NUMA support
>>    4) implement _FIT method
>>    5) rename "configdata" to "reserve-label-data"
>>    6) simplify _DSM arg3 determination
>>    7) main changelog update to let it reflect v3.
>>
>> Changlog in v2:
>> - Use litten endian for DSM method, thanks for Stefan's suggestion
>>
>> - introduce a new parameter, @configdata, if it's false, Qemu will
>>    build a static and readonly namespace in memory and use it serveing
>>    for DSM GET_CONFIG_SIZE/GET_CONFIG_DATA requests. In this case, no
>>    reserved region is needed at the end of the @file, it is good for
>>    the user who want to pass whole nvdimm device and make its data
>>    completely be visible to guest
>>
>> - divide the source code into separated files and add maintain info
>>
>> BTW, PCOMMIT virtualization on the KVM side is a work in progress and will
>> hopefully be posted next week
>>
>> ====== Background ======
>> NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be supported
>> on Intel's platforms. NVDIMMs are discovered via ACPI and configured by the
>> _DSM method of the NVDIMM device in ACPI. Several supporting documents
>> can be found at:
>> ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
>> NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
>> DSM Interface Example: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
>> Driver Writer's Guide: http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
>>
>> Currently, the NVDIMM driver has been merged into the upstream Linux kernel,
>> and this patchset tries to enable it in the virtualization field
>>
>> ====== Design ======
>> NVDIMM supports two access modes: PMEM, which maps the NVDIMM into the CPU's
>> address space so the CPU can directly access it as normal memory, and BLK,
>> which exposes the NVDIMM as a block device to reduce the consumption of CPU
>> address space
>>
>> BLK mode accesses the NVDIMM via a Command Register window and a Data
>> Register window. BLK virtualization has high overhead since each sector
>> access causes at least two VM-exits, so we currently only implement vPMEM
>> in this patchset
>>
>> --- vPMEM design ---
>> We introduce a new device named "nvdimm"; it uses a memory backend device as
>> NVDIMM memory. The file in the file-backend device can be a regular file or a
>> block device. We can use any file for testing or emulation; however,
>> in the real world, the files passed to the guest are:
>> - a regular file created on an NVDIMM device on the host, in a filesystem
>>    with DAX enabled
>> - the raw PMEM device on the host, e.g. /dev/pmem0
>> Memory accesses to addresses created by mmap on these kinds of files can
>> directly reach the NVDIMM device on the host.
>>
>> --- vConfigure data area design ---
>> Each NVDIMM device has a configuration data area which is used to store label
>> namespace data. In order to emulate this area, we divide the file into two
>> parts:
>> - the first part is (0, size - 128K], which is used as PMEM
>> - the 128K at the end of the file, which is used as the Label Data Area
>> This way the label namespace data persists across power loss or system
>> failure.
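As an illustration of the split described above (just a sketch; the 10G total matches the test file used later in this letter, and the 128K label area size is as stated):

```shell
# Sketch of how the backing file is divided when the label data area is
# reserved. Sizes follow the cover letter: a 10G file with a 128K label area.
file_size=$((10 * 1024 * 1024 * 1024))   # total size of the backing file
label_size=$((128 * 1024))               # Label Data Area reserved at the end
pmem_size=$((file_size - label_size))    # part exposed to the guest as PMEM

echo "PMEM region:     bytes [0, $pmem_size)"
echo "Label Data Area: bytes [$pmem_size, $file_size)"
```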
>>
>> We also support passing the whole file to the guest without reserving any
>> region for the label data area; this is achieved by the "reserve-label-data"
>> parameter - if it's false then QEMU will build a static, read-only namespace
>> in memory, and that namespace covers the whole file size. The parameter is
>> false by default.
>>
>> --- _DSM method design ---
>> _DSM in ACPI is used to configure the NVDIMM; currently we only allow access
>> to label namespace data, i.e., Get Namespace Label Size (Function Index 4),
>> Get Namespace Label Data (Function Index 5) and Set Namespace Label Data
>> (Function Index 6)
>>
>> _DSM uses two pages to transfer data between ACPI and QEMU: the first page
>> is RAM-based, used to save the input info of the _DSM method, and QEMU
>> reuses it to store the output info; the other page is MMIO-based, and ACPI
>> writes data to this page to transfer control to QEMU
>>
>> ====== Test ======
>> On the host:
>> 1) create a memory-backed file, e.g. # dd if=/dev/zero of=/tmp/nvdimm bs=1G count=10
>> 2) append "-object memory-backend-file,share,id=mem1,
>>     mem-path=/tmp/nvdimm -device nvdimm,memdev=mem1,reserve-label-data,
>>     id=nv1" to the QEMU command line
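Put together, a full invocation might look like the following sketch; only the -object/-device fragments come from the steps above, while the machine memory settings (-m/slots/maxmem) are illustrative assumptions:

```shell
# Assemble the QEMU command line from steps 1) and 2). Everything besides the
# -object/-device options is a hypothetical minimal setup, not from the
# original instructions.
QEMU_CMD="qemu-system-x86_64 \
 -m 2G,slots=2,maxmem=16G \
 -object memory-backend-file,share,id=mem1,mem-path=/tmp/nvdimm \
 -device nvdimm,memdev=mem1,reserve-label-data,id=nv1"
echo "$QEMU_CMD"
```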
>>
>> In the guest, download the latest upstream kernel (4.2 merge window) and
>> enable ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM.
>> 1) insmod drivers/nvdimm/libnvdimm.ko
>> 2) insmod drivers/acpi/nfit.ko
>> 3) insmod drivers/nvdimm/nd_btt.ko
>> 4) insmod drivers/nvdimm/nd_pmem.ko
>> You can see the whole NVDIMM device used as a single namespace, and /dev/pmem0
>> appears. You can do anything with /dev/pmem0, including DAX access.
>>
>> Currently the Linux NVDIMM driver does not support namespace operations on
>> this kind of PMEM; apply the changes below to support dynamic namespaces:
>>
>> @@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *a
>>                          continue;
>>                  }
>>
>> -               if (nfit_mem->bdw && nfit_mem->memdev_pmem)
>> +               //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
>> +               if (nfit_mem->memdev_pmem)
>>                          flags |= NDD_ALIASING;
>>
>> You can append another NVDIMM device in guest and do:
>> # cd /sys/bus/nd/devices/
>> # cd namespace1.0/
>> # echo `uuidgen` > uuid
>> # echo `expr 1024 \* 1024 \* 128` > size
>> then reload nd_pmem.ko
>>
>> You can see /dev/pmem1 appears
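For reference, the size written into the namespace's sysfs attribute above works out to 128MB:

```shell
# The value produced by `expr 1024 \* 1024 \* 128` in the step above:
ns_size=$(expr 1024 \* 1024 \* 128)
echo "$ns_size bytes = $((ns_size / 1024 / 1024))MB"
```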
>>
>> ====== TODO ======
>> NVDIMM hotplug support
>>
>> Xiao Guangrong (32):
>>    acpi: add aml_derefof
>>    acpi: add aml_sizeof
>>    acpi: add aml_create_field
>>    acpi: add aml_mutex, aml_acquire, aml_release
>>    acpi: add aml_concatenate
>>    acpi: add aml_object_type
>>    util: introduce qemu_file_get_page_size()
>>    exec: allow memory to be allocated from any kind of path
>>    exec: allow file_ram_alloc to work on file
>>    hostmem-file: clean up memory allocation
>>    hostmem-file: use whole file size if possible
>>    pc-dimm: remove DEFAULT_PC_DIMMSIZE
>>    pc-dimm: make pc_existing_dimms_capacity static and rename it
>>    pc-dimm: drop the prefix of pc-dimm
>>    stubs: rename qmp_pc_dimm_device_list.c
>>    pc-dimm: rename pc-dimm.c and pc-dimm.h
>>    dimm: abstract dimm device from pc-dimm
>>    dimm: get mapped memory region from DIMMDeviceClass->get_memory_region
>>    dimm: keep the state of the whole backend memory
>>    dimm: introduce realize callback
>>    nvdimm: implement NVDIMM device abstract
>>    nvdimm: init the address region used by NVDIMM ACPI
>>    nvdimm: build ACPI NFIT table
>>    nvdimm: init the address region used by DSM method
>>    nvdimm: build ACPI nvdimm devices
>>    nvdimm: save arg3 for NVDIMM device _DSM method
>>    nvdimm: support DSM_CMD_IMPLEMENTED function
>>    nvdimm: support DSM_CMD_NAMESPACE_LABEL_SIZE function
>>    nvdimm: support DSM_CMD_GET_NAMESPACE_LABEL_DATA
>>    nvdimm: support DSM_CMD_SET_NAMESPACE_LABEL_DATA
>>    nvdimm: allow using whole backend memory as pmem
>>    nvdimm: add maintain info
>>
>>   MAINTAINERS                                        |   6 +
>>   backends/hostmem-file.c                            |  58 +-
>>   default-configs/i386-softmmu.mak                   |   2 +
>>   default-configs/x86_64-softmmu.mak                 |   2 +
>>   exec.c                                             | 113 ++-
>>   hmp.c                                              |   2 +-
>>   hw/Makefile.objs                                   |   2 +-
>>   hw/acpi/aml-build.c                                |  83 ++
>>   hw/acpi/ich9.c                                     |   8 +-
>>   hw/acpi/memory_hotplug.c                           |  26 +-
>>   hw/acpi/piix4.c                                    |   8 +-
>>   hw/i386/acpi-build.c                               |   4 +
>>   hw/i386/pc.c                                       |  37 +-
>>   hw/mem/Makefile.objs                               |   3 +
>>   hw/mem/{pc-dimm.c => dimm.c}                       | 240 ++---
>>   hw/mem/nvdimm/acpi.c                               | 961 +++++++++++++++++++++
>>   hw/mem/nvdimm/internal.h                           |  41 +
>>   hw/mem/nvdimm/namespace.c                          | 309 +++++++
>>   hw/mem/nvdimm/nvdimm.c                             | 136 +++
>>   hw/mem/pc-dimm.c                                   | 506 +----------
>>   hw/ppc/spapr.c                                     |  20 +-
>>   include/hw/acpi/aml-build.h                        |   8 +
>>   include/hw/i386/pc.h                               |   4 +-
>>   include/hw/mem/{pc-dimm.h => dimm.h}               |  68 +-
>>   include/hw/mem/nvdimm.h                            |  58 ++
>>   include/hw/mem/pc-dimm.h                           | 105 +--
>>   include/hw/ppc/spapr.h                             |   2 +-
>>   include/qemu/osdep.h                               |   1 +
>>   numa.c                                             |   4 +-
>>   qapi-schema.json                                   |   8 +-
>>   qmp.c                                              |   4 +-
>>   stubs/Makefile.objs                                |   2 +-
>>   ...c_dimm_device_list.c => qmp_dimm_device_list.c} |   4 +-
>>   target-ppc/kvm.c                                   |  21 +-
>>   trace-events                                       |   8 +-
>>   util/oslib-posix.c                                 |  16 +
>>   util/oslib-win32.c                                 |   5 +
>>   37 files changed, 2023 insertions(+), 862 deletions(-)
>>   rename hw/mem/{pc-dimm.c => dimm.c} (65%)
>>   create mode 100644 hw/mem/nvdimm/acpi.c
>>   create mode 100644 hw/mem/nvdimm/internal.h
>>   create mode 100644 hw/mem/nvdimm/namespace.c
>>   create mode 100644 hw/mem/nvdimm/nvdimm.c
>>   rewrite hw/mem/pc-dimm.c (91%)
>>   rename include/hw/mem/{pc-dimm.h => dimm.h} (50%)
>>   create mode 100644 include/hw/mem/nvdimm.h
>>   rewrite include/hw/mem/pc-dimm.h (97%)
>>   rename stubs/{qmp_pc_dimm_device_list.c => qmp_dimm_device_list.c} (56%)
>>
>> --
>> 1.8.3.1
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 00/32] implement vNVDIMM
  2015-10-10 21:17   ` [Qemu-devel] " Dan Williams
@ 2015-10-12  4:33     ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-12  4:33 UTC (permalink / raw)
  To: Dan Williams
  Cc: Paolo Bonzini, imammedo, Gleb Natapov, mtosatti, stefanha,
	Michael S. Tsirkin, rth, ehabkost, kvm, qemu-devel



On 10/11/2015 05:17 AM, Dan Williams wrote:
> On Sat, Oct 10, 2015 at 8:52 PM, Xiao Guangrong
> <guangrong.xiao@linux.intel.com> wrote:
> [..]
>> ====== Test ======
>> On the host:
>> 1) create a memory-backed file, e.g. # dd if=/dev/zero of=/tmp/nvdimm bs=1G count=10
>> 2) append "-object memory-backend-file,share,id=mem1,
>>     mem-path=/tmp/nvdimm -device nvdimm,memdev=mem1,reserve-label-data,
>>     id=nv1" in QEMU command line
>>
>> In the guest, download the latest upstream kernel (4.2 merge window) and
>> enable ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM.
>> 1) insmod drivers/nvdimm/libnvdimm.ko
>> 2) insmod drivers/acpi/nfit.ko
>> 3) insmod drivers/nvdimm/nd_btt.ko
>> 4) insmod drivers/nvdimm/nd_pmem.ko
>> You can see the whole NVDIMM device used as a single namespace, and /dev/pmem0
>> appears. You can do anything with /dev/pmem0, including DAX access.
>>
>> Currently the Linux NVDIMM driver does not support namespace operations on
>> this kind of PMEM; apply the changes below to support dynamic namespaces:
>>
>> @@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *a
>>                          continue;
>>                  }
>>
>> -               if (nfit_mem->bdw && nfit_mem->memdev_pmem)
>> +               //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
>> +               if (nfit_mem->memdev_pmem)
>>                          flags |= NDD_ALIASING;
>
> This is just for testing purposes, right?  I expect guests can

It's used to validate the NVDIMM _DSM method and static namespaces, following
the NVDIMM specs...

> sub-divide persistent memory capacity by partitioning the resulting
> block device(s).

I understand that it's a Linux design... Hmm, can the same expectation
apply to PBLK?

^ permalink raw reply	[flat|nested] 200+ messages in thread


* Re: [PATCH v3 00/32] implement vNVDIMM
  2015-10-12  3:06     ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-12  8:20       ` Igor Mammedov
  -1 siblings, 0 replies; 200+ messages in thread
From: Igor Mammedov @ 2015-10-12  8:20 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Bharata B Rao, Paolo Bonzini, gleb, mtosatti, Stefan Hajnoczi,
	mst, Richard Henderson, Eduardo Habkost, dan.j.williams, kvm,
	qemu-devel

On Mon, 12 Oct 2015 11:06:20 +0800
Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:

> 
> 
> On 10/12/2015 10:59 AM, Bharata B Rao wrote:
> > Xiao,
> >
> > Are these patches present in any git tree so that they can be easily tried out.
> >
> 
> Sorry, currently no git tree out of my workspace is available :(
Is it possible for you to put the working tree on GitHub?

> 
> BTW, this patchset is based on top of the commit b37686f7e on qemu tree:
> commit b37686f7e84b22cfaf7fd01ac5133f2617cc3027
> Merge: 8be6e62 98cf48f
> Author: Peter Maydell <peter.maydell@linaro.org>
> Date:   Fri Oct 9 12:18:13 2015 +0100
> 
>      Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
> 
> Thanks.
> 
> > Regards,
> > Bharata.
> >
> > [...]


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 00/32] implement vNVDIMM
@ 2015-10-12  8:20       ` Igor Mammedov
  0 siblings, 0 replies; 200+ messages in thread
From: Igor Mammedov @ 2015-10-12  8:20 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Eduardo Habkost, kvm, mst, gleb, mtosatti, qemu-devel,
	Stefan Hajnoczi, Bharata B Rao, Paolo Bonzini, dan.j.williams,
	Richard Henderson

On Mon, 12 Oct 2015 11:06:20 +0800
Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:

> 
> 
> On 10/12/2015 10:59 AM, Bharata B Rao wrote:
> > Xiao,
> >
> > Are these patches present in any git tree so that they can be easily tried out.
> >
> 
> Sorry, currently no git tree out of my workspace is available :(
Is it possible for you to put working tree on github?

> 
> BTW, this patchset is based on top of the commit b37686f7e on qemu tree:
> commit b37686f7e84b22cfaf7fd01ac5133f2617cc3027
> Merge: 8be6e62 98cf48f
> Author: Peter Maydell <peter.maydell@linaro.org>
> Date:   Fri Oct 9 12:18:13 2015 +0100
> 
>      Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
> 
> Thanks.
> 
> > Regards,
> > Bharata.
> >
> > On Sun, Oct 11, 2015 at 9:22 AM, Xiao Guangrong
> > <guangrong.xiao@linux.intel.com> wrote:
> >> Changelog in v3:
> >> There is huge change in this version, thank Igor, Stefan, Paolo, Eduardo,
> >> Michael for their valuable comments, the patchset finally gets better shape.
> >> - changes from Igor's comments:
> >>    1) abstract dimm device type from pc-dimm and create nvdimm device based on
> >>       dimm, then it uses memory backend device as nvdimm's memory and NUMA has
> >>       easily been implemented.
> >>    2) let file-backend device support any kind of filesystem not only for
> >>       hugetlbfs and let it work on file not only for directory which is
> >>       achieved by extending 'mem-path' - if it's a directory then it works as
> >>       current behavior, otherwise if it's file then directly allocates memory
> >>       from it.
> >>    3) we figure out a unused memory hole below 4G that is 0xFF00000 ~
> >>       0xFFF00000, this range is large enough for NVDIMM ACPI as build 64-bit
> >>       ACPI SSDT/DSDT table will break windows XP.
> >>       BTW, only make SSDT.rev = 2 can not work since the width is only depended
> >>       on DSDT.rev based on 19.6.28 DefinitionBlock (Declare Definition Block)
> >>       in ACPI spec:
> >> | Note: For compatibility with ACPI versions before ACPI 2.0, the bit
> >> | width of Integer objects is dependent on the ComplianceRevision of the DSDT.
> >> | If the ComplianceRevision is less than 2, all integers are restricted to 32
> >> | bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets
> >> | the global integer width for all integers, including integers in SSDTs.
> >>    4) use the lowest ACPI spec version to document AML terms.
> >>    5) use "nvdimm" as nvdimm device name instead of "pc-nvdimm"
> >>
> >> - changes from Stefan's comments:
> >>    1) do not do endian adjustment in-place since _DSM memory is visible to guest
> >>    2) use target platform's target page size instead of fixed PAGE_SIZE
> >>       definition
> >>    3) lots of code style improvement and typo fixes.
> >>    4) live migration fix
> >> - changes from Paolo's comments:
> >>    1) improve the name of memory region
> >>
> >> - other changes:
> >>    1) return exact buffer size for _DSM method instead of the page size.
> >>    2) introduce mutex in NVDIMM ACPI as the _DSM memory is shared by all nvdimm
> >>       devices.
> >>    3) NUMA support
> >>    4) implement _FIT method
> >>    5) rename "configdata" to "reserve-label-data"
> >>    6) simplify _DSM arg3 determination
> >>    7) main changelog update to let it reflect v3.
> >>
> >> Changlog in v2:
> >> - Use litten endian for DSM method, thanks for Stefan's suggestion
> >>
> >> - introduce a new parameter, @configdata; if it's false, QEMU will
> >>    build a static and read-only namespace in memory and use it to serve
> >>    DSM GET_CONFIG_SIZE/GET_CONFIG_DATA requests. In this case, no
> >>    reserved region is needed at the end of the @file, which is good for
> >>    users who want to pass the whole nvdimm device through and make its data
> >>    completely visible to the guest
> >>
> >> - divide the source code into separated files and add maintain info
> >>
> >> BTW, PCOMMIT virtualization on the KVM side is a work in progress; hopefully it
> >> will be posted next week
> >>
> >> ====== Background ======
> >> NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be supported
> >> on Intel's platform. NVDIMMs are discovered via ACPI and configured by the
> >> _DSM method of the NVDIMM device in ACPI. Supporting documents
> >> can be found at:
> >> ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
> >> NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
> >> DSM Interface Example: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
> >> Driver Writer's Guide: http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
> >>
> >> Currently, the NVDIMM driver has been merged into the upstream Linux kernel,
> >> and this patchset tries to enable it in the virtualization field.
> >>
> >> ====== Design ======
> >> NVDIMM supports two access modes: one is PMEM, which maps the NVDIMM into the
> >> CPU's address space so the CPU can directly access it as normal memory; the
> >> other is BLK, which is exposed as a block device to reduce the consumption of
> >> CPU address space.
> >>
> >> BLK mode accesses the NVDIMM via a Command Register window and a Data Register
> >> window. BLK virtualization has high overhead since each sector access causes at
> >> least two VM-EXITs, so we currently only implement vPMEM in this patchset.
> >>
> >> --- vPMEM design ---
> >> We introduce a new device named "nvdimm" which uses a memory backend device as
> >> NVDIMM memory. The file in the file-backed device can be a regular file or a
> >> block device. We can use any file for testing or emulation; however,
> >> in the real world, the files passed to the guest are:
> >> - a regular file created on a DAX-enabled filesystem on an NVDIMM device
> >>    on the host
> >> - the raw PMEM device on the host, e.g. /dev/pmem0
> >> Memory accesses to addresses created by mmap on these kinds of files
> >> directly reach the NVDIMM device on the host.
> >>
> >> --- vConfigure data area design ---
> >> Each NVDIMM device has a configuration data area which is used to store label
> >> namespace data. In order to emulate this area, we divide the file into two
> >> parts:
> >> - the first part is (0, size - 128K], which is used as PMEM
> >> - the 128K at the end of the file, which is used as the Label Data Area
> >> so that the label namespace data persists across power loss or system
> >> failure.
> >>
> >> We also support passing the whole file to the guest without reserving any
> >> region for the label data area, which is controlled by the "reserve-label-data"
> >> parameter - if it's false then QEMU builds a static and read-only namespace in
> >> memory covering the whole file size. The parameter is false by default.
> >>
> >> --- _DSM method design ---
> >> _DSM in ACPI is used to configure the NVDIMM; currently we only allow access
> >> to label namespace data, i.e. Get Namespace Label Size (Function Index 4),
> >> Get Namespace Label Data (Function Index 5) and Set Namespace Label Data
> >> (Function Index 6)
> >>
> >> _DSM uses two pages to transfer data between ACPI and QEMU: the first page
> >> is RAM-based and saves the input of the _DSM method (QEMU reuses it to
> >> store the output), and the other page is MMIO-based; ACPI writes data to this
> >> page to transfer control to QEMU
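The two-page handshake can be modeled as a toy in C. This is a sketch of the flow only, not QEMU's implementation; the buffer layout and handler names are assumptions, and the function index 4 follows the DSM Interface Example document quoted earlier:

```c
#include <stdint.h>
#include <string.h>

/* Toy model of the two-page _DSM handshake: one RAM page is shared between
 * guest ACPI and QEMU for input (and is reused for output), and a write to
 * a second, MMIO-based page traps to QEMU. Hypothetical layout for
 * illustration; not the actual QEMU code. */

#define DSM_PAGE_SIZE 4096

static uint8_t dsm_ram_page[DSM_PAGE_SIZE];  /* input, then reused for output */

/* QEMU-side handler invoked on the MMIO trap (stand-in for the real one). */
static void qemu_dsm_mmio_write(void)
{
    uint32_t func;
    memcpy(&func, dsm_ram_page, sizeof(func));
    /* Function 4 (Get Namespace Label Size): reply with the 128K label size;
     * other functions are ignored in this toy. */
    uint32_t label_size = (func == 4) ? 128 * 1024 : 0;
    memcpy(dsm_ram_page, &label_size, sizeof(label_size));
}

/* Guest side: store the function index in the RAM page, then "write" the
 * MMIO page, which models the VM-EXIT that hands control to QEMU. */
static uint32_t guest_call_dsm(uint32_t func)
{
    memcpy(dsm_ram_page, &func, sizeof(func));
    qemu_dsm_mmio_write();
    uint32_t out;
    memcpy(&out, dsm_ram_page, sizeof(out));
    return out;
}
```

The key design point is that only the MMIO write is trapped; filling in the input page is plain RAM access and costs no VM-EXIT.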
> >>
> >> ====== Test ======
> >> On the host:
> >> 1) create a memory backing file, e.g. # dd if=/dev/zero of=/tmp/nvdimm bs=1G count=10
> >> 2) append "-object memory-backend-file,share,id=mem1,
> >>     mem-path=/tmp/nvdimm -device nvdimm,memdev=mem1,reserve-label-data,
> >>     id=nv1" in QEMU command line
> >>
> >> In the guest, download the latest upstream kernel (4.2 merge window) and enable
> >> ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM.
> >> 1) insmod drivers/nvdimm/libnvdimm.ko
> >> 2) insmod drivers/acpi/nfit.ko
> >> 3) insmod drivers/nvdimm/nd_btt.ko
> >> 4) insmod drivers/nvdimm/nd_pmem.ko
> >> You can see the whole nvdimm device used as a single namespace, and /dev/pmem0
> >> appears. You can do anything with /dev/pmem0, including DAX access.
> >>
> >> Currently the Linux NVDIMM driver does not support namespace operations on this
> >> kind of PMEM; apply the changes below to support dynamic namespaces:
> >>
> >> @@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *a
> >>                          continue;
> >>                  }
> >>
> >> -               if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> >> +               //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> >> +               if (nfit_mem->memdev_pmem)
> >>                          flags |= NDD_ALIASING;
> >>
> >> You can append another NVDIMM device in the guest and do:
> >> # cd /sys/bus/nd/devices/
> >> # cd namespace1.0/
> >> # echo `uuidgen` > uuid
> >> # echo `expr 1024 \* 1024 \* 128` > size
> >> then reload nd_pmem.ko
> >>
> >> You can see /dev/pmem1 appears
> >>
> >> ====== TODO ======
> >> NVDIMM hotplug support
> >>
> >> Xiao Guangrong (32):
> >>    acpi: add aml_derefof
> >>    acpi: add aml_sizeof
> >>    acpi: add aml_create_field
> >>    acpi: add aml_mutex, aml_acquire, aml_release
> >>    acpi: add aml_concatenate
> >>    acpi: add aml_object_type
> >>    util: introduce qemu_file_get_page_size()
> >>    exec: allow memory to be allocated from any kind of path
> >>    exec: allow file_ram_alloc to work on file
> >>    hostmem-file: clean up memory allocation
> >>    hostmem-file: use whole file size if possible
> >>    pc-dimm: remove DEFAULT_PC_DIMMSIZE
> >>    pc-dimm: make pc_existing_dimms_capacity static and rename it
> >>    pc-dimm: drop the prefix of pc-dimm
> >>    stubs: rename qmp_pc_dimm_device_list.c
> >>    pc-dimm: rename pc-dimm.c and pc-dimm.h
> >>    dimm: abstract dimm device from pc-dimm
> >>    dimm: get mapped memory region from DIMMDeviceClass->get_memory_region
> >>    dimm: keep the state of the whole backend memory
> >>    dimm: introduce realize callback
> >>    nvdimm: implement NVDIMM device abstract
> >>    nvdimm: init the address region used by NVDIMM ACPI
> >>    nvdimm: build ACPI NFIT table
> >>    nvdimm: init the address region used by DSM method
> >>    nvdimm: build ACPI nvdimm devices
> >>    nvdimm: save arg3 for NVDIMM device _DSM method
> >>    nvdimm: support DSM_CMD_IMPLEMENTED function
> >>    nvdimm: support DSM_CMD_NAMESPACE_LABEL_SIZE function
> >>    nvdimm: support DSM_CMD_GET_NAMESPACE_LABEL_DATA
> >>    nvdimm: support DSM_CMD_SET_NAMESPACE_LABEL_DATA
> >>    nvdimm: allow using whole backend memory as pmem
> >>    nvdimm: add maintain info
> >>
> >>   MAINTAINERS                                        |   6 +
> >>   backends/hostmem-file.c                            |  58 +-
> >>   default-configs/i386-softmmu.mak                   |   2 +
> >>   default-configs/x86_64-softmmu.mak                 |   2 +
> >>   exec.c                                             | 113 ++-
> >>   hmp.c                                              |   2 +-
> >>   hw/Makefile.objs                                   |   2 +-
> >>   hw/acpi/aml-build.c                                |  83 ++
> >>   hw/acpi/ich9.c                                     |   8 +-
> >>   hw/acpi/memory_hotplug.c                           |  26 +-
> >>   hw/acpi/piix4.c                                    |   8 +-
> >>   hw/i386/acpi-build.c                               |   4 +
> >>   hw/i386/pc.c                                       |  37 +-
> >>   hw/mem/Makefile.objs                               |   3 +
> >>   hw/mem/{pc-dimm.c => dimm.c}                       | 240 ++---
> >>   hw/mem/nvdimm/acpi.c                               | 961 +++++++++++++++++++++
> >>   hw/mem/nvdimm/internal.h                           |  41 +
> >>   hw/mem/nvdimm/namespace.c                          | 309 +++++++
> >>   hw/mem/nvdimm/nvdimm.c                             | 136 +++
> >>   hw/mem/pc-dimm.c                                   | 506 +----------
> >>   hw/ppc/spapr.c                                     |  20 +-
> >>   include/hw/acpi/aml-build.h                        |   8 +
> >>   include/hw/i386/pc.h                               |   4 +-
> >>   include/hw/mem/{pc-dimm.h => dimm.h}               |  68 +-
> >>   include/hw/mem/nvdimm.h                            |  58 ++
> >>   include/hw/mem/pc-dimm.h                           | 105 +--
> >>   include/hw/ppc/spapr.h                             |   2 +-
> >>   include/qemu/osdep.h                               |   1 +
> >>   numa.c                                             |   4 +-
> >>   qapi-schema.json                                   |   8 +-
> >>   qmp.c                                              |   4 +-
> >>   stubs/Makefile.objs                                |   2 +-
> >>   ...c_dimm_device_list.c => qmp_dimm_device_list.c} |   4 +-
> >>   target-ppc/kvm.c                                   |  21 +-
> >>   trace-events                                       |   8 +-
> >>   util/oslib-posix.c                                 |  16 +
> >>   util/oslib-win32.c                                 |   5 +
> >>   37 files changed, 2023 insertions(+), 862 deletions(-)
> >>   rename hw/mem/{pc-dimm.c => dimm.c} (65%)
> >>   create mode 100644 hw/mem/nvdimm/acpi.c
> >>   create mode 100644 hw/mem/nvdimm/internal.h
> >>   create mode 100644 hw/mem/nvdimm/namespace.c
> >>   create mode 100644 hw/mem/nvdimm/nvdimm.c
> >>   rewrite hw/mem/pc-dimm.c (91%)
> >>   rename include/hw/mem/{pc-dimm.h => dimm.h} (50%)
> >>   create mode 100644 include/hw/mem/nvdimm.h
> >>   rewrite include/hw/mem/pc-dimm.h (97%)
> >>   rename stubs/{qmp_pc_dimm_device_list.c => qmp_dimm_device_list.c} (56%)
> >>
> >> --
> >> 1.8.3.1
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe kvm" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> >
> >

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 00/32] implement vNVDIMM
  2015-10-12  8:20       ` [Qemu-devel] " Igor Mammedov
@ 2015-10-12  8:21         ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-12  8:21 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Bharata B Rao, Paolo Bonzini, gleb, mtosatti, Stefan Hajnoczi,
	mst, Richard Henderson, Eduardo Habkost, dan.j.williams, kvm,
	qemu-devel



On 10/12/2015 04:20 PM, Igor Mammedov wrote:
> On Mon, 12 Oct 2015 11:06:20 +0800
> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>
>>
>>
>> On 10/12/2015 10:59 AM, Bharata B Rao wrote:
>>> Xiao,
>>>
> >>> Are these patches present in any git tree so that they can be easily tried out?
>>>
>>
>> Sorry, currently no git tree out of my workspace is available :(
> Is it possible for you to put working tree on github?

Yes, I am planning to do this. Hopefully, I will publish the git repo along with
the new version of the patchset.


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 08/32] exec: allow memory to be allocated from any kind of path
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-12 10:08     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-12 10:08 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Sun, Oct 11, 2015 at 11:52:40AM +0800, Xiao Guangrong wrote:
> Currently file_ram_alloc() is designed for hugetlbfs; however, the memory
> of an nvdimm can come from either a raw pmem device, e.g. /dev/pmem, or a file
> located on a DAX-enabled filesystem.
> 
> So this patch lets it work on any kind of path.
> 
> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>

This conflicts with map alloc rework.
Please rebase this on top of my tree.


> ---
>  exec.c | 55 ++++++++++++++-----------------------------------------
>  1 file changed, 14 insertions(+), 41 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 7d90a52..70cb0ef 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1154,32 +1154,6 @@ void qemu_mutex_unlock_ramlist(void)
>  }
>  
>  #ifdef __linux__
> -
> -#include <sys/vfs.h>
> -
> -#define HUGETLBFS_MAGIC       0x958458f6
> -
> -static long gethugepagesize(const char *path, Error **errp)
> -{
> -    struct statfs fs;
> -    int ret;
> -
> -    do {
> -        ret = statfs(path, &fs);
> -    } while (ret != 0 && errno == EINTR);
> -
> -    if (ret != 0) {
> -        error_setg_errno(errp, errno, "failed to get page size of file %s",
> -                         path);
> -        return 0;
> -    }
> -
> -    if (fs.f_type != HUGETLBFS_MAGIC)
> -        fprintf(stderr, "Warning: path not on HugeTLBFS: %s\n", path);
> -
> -    return fs.f_bsize;

What this *actually* is trying to warn against is that
mapping a regular file (as opposed to hugetlbfs)
means transparent huge pages don't work.

So I don't think we should drop this warning completely.
Either let's add the nvdimm magic, or simply check the
page size.
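One way to keep a warning along the lines suggested here is to compare the backing file's page size with the system page size instead of matching the hugetlbfs magic. A minimal sketch of that check, assuming the `qemu_file_get_page_size()` helper introduced by this series supplies the file's page size; the predicate name is hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Warn when the backing file's page size is no larger than the base system
 * page size: that indicates a regular file rather than hugetlbfs, so
 * transparent huge pages are not guaranteed for the mapping.  Sketch only;
 * not the final QEMU code. */
static bool should_warn_small_pages(uint64_t file_pagesize,
                                    uint64_t sys_pagesize)
{
    return file_pagesize <= sys_pagesize;
}
```

The caller would pass the value returned by `qemu_file_get_page_size(path)` together with `getpagesize()` and print the "path not on HugeTLBFS" style warning when the predicate is true.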


> -}
> -
>  static void *file_ram_alloc(RAMBlock *block,
>                              ram_addr_t memory,
>                              const char *path,
> @@ -1191,22 +1165,21 @@ static void *file_ram_alloc(RAMBlock *block,
>      void *ptr;
>      void *area = NULL;
>      int fd;
> -    uint64_t hpagesize;
> +    uint64_t pagesize;
>      uint64_t total;
> -    Error *local_err = NULL;
>      size_t offset;
>  
> -    hpagesize = gethugepagesize(path, &local_err);
> -    if (local_err) {
> -        error_propagate(errp, local_err);
> +    pagesize = qemu_file_get_page_size(path);
> +    if (!pagesize) {
> +        error_setg(errp, "can't get page size for %s", path);
>          goto error;
>      }
> -    block->mr->align = hpagesize;
> +    block->mr->align = pagesize;
>  
> -    if (memory < hpagesize) {
> +    if (memory < pagesize) {
>          error_setg(errp, "memory size 0x" RAM_ADDR_FMT " must be equal to "
> -                   "or larger than huge page size 0x%" PRIx64,
> -                   memory, hpagesize);
> +                   "or larger than page size 0x%" PRIx64,
> +                   memory, pagesize);
>          goto error;
>      }
>  
> @@ -1230,15 +1203,15 @@ static void *file_ram_alloc(RAMBlock *block,
>      fd = mkstemp(filename);
>      if (fd < 0) {
>          error_setg_errno(errp, errno,
> -                         "unable to create backing store for hugepages");
> +                         "unable to create backing store for path %s", path);
>          g_free(filename);
>          goto error;
>      }
>      unlink(filename);
>      g_free(filename);
>  
> -    memory = ROUND_UP(memory, hpagesize);
> -    total = memory + hpagesize;
> +    memory = ROUND_UP(memory, pagesize);
> +    total = memory + pagesize;
>  
>      /*
>       * ftruncate is not supported by hugetlbfs in older
> @@ -1254,12 +1227,12 @@ static void *file_ram_alloc(RAMBlock *block,
>                  -1, 0);
>      if (ptr == MAP_FAILED) {
>          error_setg_errno(errp, errno,
> -                         "unable to allocate memory range for hugepages");
> +                         "unable to allocate memory range for path %s", path);
>          close(fd);
>          goto error;
>      }
>  
> -    offset = QEMU_ALIGN_UP((uintptr_t)ptr, hpagesize) - (uintptr_t)ptr;
> +    offset = QEMU_ALIGN_UP((uintptr_t)ptr, pagesize) - (uintptr_t)ptr;
>  
>      area = mmap(ptr + offset, memory, PROT_READ | PROT_WRITE,
>                  (block->flags & RAM_SHARED ? MAP_SHARED : MAP_PRIVATE) |
> @@ -1267,7 +1240,7 @@ static void *file_ram_alloc(RAMBlock *block,
>                  fd, 0);
>      if (area == MAP_FAILED) {
>          error_setg_errno(errp, errno,
> -                         "unable to map backing store for hugepages");
> +                         "unable to map backing store for path %s", path);
>          munmap(ptr, total);
>          close(fd);
>          goto error;
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 23/32] nvdimm: build ACPI NFIT table
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-12 11:27     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-12 11:27 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Sun, Oct 11, 2015 at 11:52:55AM +0800, Xiao Guangrong wrote:
> NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT)
> 
> Currently, we only support PMEM mode. Each device has 3 structures:
> - SPA structure, which defines the PMEM region info
> 
> - MEM DEV structure, which has the @handle used to associate the specific
>   ACPI NVDIMM device we will introduce in a later patch.
>   Also, we can happily ignore the memory device's interleave; the real
>   nvdimm hardware access is hidden behind the host
> 
> - DCR structure, which defines the vendor ID used to associate the specific
>   vendor's nvdimm driver. Since we only implement PMEM mode this time, the
>   Command window and Data window are not needed
> 
> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> ---
>  hw/i386/acpi-build.c     |   4 +
>  hw/mem/nvdimm/acpi.c     | 209 ++++++++++++++++++++++++++++++++++++++++++++++-
>  hw/mem/nvdimm/internal.h |  13 +++
>  hw/mem/nvdimm/nvdimm.c   |  25 ++++++
>  include/hw/mem/nvdimm.h  |   2 +
>  5 files changed, 252 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 95e0c65..c637dc8 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1661,6 +1661,7 @@ static bool acpi_has_iommu(void)
>  static
>  void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
>  {
> +    PCMachineState *pcms = PC_MACHINE(qdev_get_machine());

I don't like more code poking at the machine directly.
I know srat does it, and I don't like it. Any chance you can add
acpi_get_nvdimm_info to get all you need from nvdimm state?

>      GArray *table_offsets;
>      unsigned facs, ssdt, dsdt, rsdt;
>      AcpiCpuInfo cpu;
> @@ -1742,6 +1743,9 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
>          build_dmar_q35(tables_blob, tables->linker);
>      }
>  
> +    nvdimm_build_acpi_table(&pcms->nvdimm_memory, table_offsets, tables_blob,
> +                            tables->linker);
> +
>      /* Add tables supplied by user (if any) */
>      for (u = acpi_table_first(); u; u = acpi_table_next(u)) {
>          unsigned len = acpi_table_len(u);
> diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
> index b640874..62b1e02 100644
> --- a/hw/mem/nvdimm/acpi.c
> +++ b/hw/mem/nvdimm/acpi.c
> @@ -32,6 +32,46 @@
>  #include "hw/mem/nvdimm.h"
>  #include "internal.h"
>  
> +static void nfit_spa_uuid_pm(uuid_le *uuid)
> +{
> +    uuid_le uuid_pm = UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d,
> +                              0x33, 0x18, 0xb7, 0x8c, 0xdb);
> +    memcpy(uuid, &uuid_pm, sizeof(uuid_pm));
> +}
> +

Just add a static constant:
    const uint8_t nfit_spa_uuid[] = {0x79, 0xd3, ..... }
then memcpy instead of a wrapper.

> +enum {
> +    NFIT_STRUCTURE_SPA = 0,
> +    NFIT_STRUCTURE_MEMDEV = 1,
> +    NFIT_STRUCTURE_IDT = 2,
> +    NFIT_STRUCTURE_SMBIOS = 3,
> +    NFIT_STRUCTURE_DCR = 4,
> +    NFIT_STRUCTURE_BDW = 5,
> +    NFIT_STRUCTURE_FLUSH = 6,
> +};
> +
> +enum {
> +    EFI_MEMORY_UC = 0x1ULL,
> +    EFI_MEMORY_WC = 0x2ULL,
> +    EFI_MEMORY_WT = 0x4ULL,
> +    EFI_MEMORY_WB = 0x8ULL,
> +    EFI_MEMORY_UCE = 0x10ULL,
> +    EFI_MEMORY_WP = 0x1000ULL,
> +    EFI_MEMORY_RP = 0x2000ULL,
> +    EFI_MEMORY_XP = 0x4000ULL,
> +    EFI_MEMORY_NV = 0x8000ULL,
> +    EFI_MEMORY_MORE_RELIABLE = 0x10000ULL,
> +};
> +
> +/*
> + * NVDIMM Firmware Interface Table
> + * @signature: "NFIT"
> + */
> +struct nfit {
> +    ACPI_TABLE_HEADER_DEF
> +    uint32_t reserved;
> +} QEMU_PACKED;
> +typedef struct nfit nfit;
> +
>  /* System Physical Address Range Structure */
>  struct nfit_spa {
>      uint16_t type;
> @@ -40,13 +80,21 @@ struct nfit_spa {
>      uint16_t flags;
>      uint32_t reserved;
>      uint32_t proximity_domain;
> -    uint8_t type_guid[16];
> +    uuid_le type_guid;
>      uint64_t spa_base;
>      uint64_t spa_length;
>      uint64_t mem_attr;
>  } QEMU_PACKED;
>  typedef struct nfit_spa nfit_spa;
>  
> +/*
> + * Control region is strictly for management during hot add/online
> + * operation.
> + */
> +#define SPA_FLAGS_ADD_ONLINE_ONLY     (1)

unused

> +/* Data in Proximity Domain field is valid. */
> +#define SPA_FLAGS_PROXIMITY_VALID     (1 << 1)
> +
>  /* Memory Device to System Physical Address Range Mapping Structure */
>  struct nfit_memdev {
>      uint16_t type;
> @@ -91,12 +139,20 @@ struct nfit_dcr {
>  } QEMU_PACKED;
>  typedef struct nfit_dcr nfit_dcr;
>  
> +#define REVSISON_ID    1
> +#define NFIT_FIC1      0x201
> +
>  static uint64_t nvdimm_device_structure_size(uint64_t slots)
>  {
>      /* each nvdimm has three structures. */
>      return slots * (sizeof(nfit_spa) + sizeof(nfit_memdev) + sizeof(nfit_dcr));
>  }
>  
> +static uint64_t get_nfit_total_size(uint64_t slots)
> +{
> +    return sizeof(struct nfit) + nvdimm_device_structure_size(slots);
> +}
> +
>  static uint64_t nvdimm_acpi_memory_size(uint64_t slots, uint64_t page_size)
>  {
>      uint64_t size = nvdimm_device_structure_size(slots);
> @@ -118,3 +174,154 @@ void nvdimm_init_memory_state(NVDIMMState *state, MemoryRegion*system_memory,
>                         NVDIMM_ACPI_MEM_SIZE);
>      memory_region_add_subregion(system_memory, state->base, &state->mr);
>  }
> +
> +static uint32_t nvdimm_slot_to_sn(int slot)
> +{
> +    return 0x123456 + slot;
> +}
> +
> +static uint32_t nvdimm_slot_to_handle(int slot)
> +{
> +    return slot + 1;
> +}
> +
> +static uint16_t nvdimm_slot_to_spa_index(int slot)
> +{
> +    return (slot + 1) << 1;
> +}
> +
> +static uint32_t nvdimm_slot_to_dcr_index(int slot)
> +{
> +    return nvdimm_slot_to_spa_index(slot) + 1;
> +}
> +

There are lots of magic numbers here with no comments.
Pls explain the logic in code comments.
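For reference, here is the slot-to-index mapping from the patch above restated with the kind of comments being asked for. The arithmetic is identical to the quoted code; only the explanatory comments are added, and the rationale given in them is the editor's reading of the scheme, not the author's stated intent:

```c
#include <stdint.h>

/* NFIT structure indices must be nonzero (index 0 is reserved), so every
 * mapping below is offset by one from the slot number:
 *  - handle(slot)    = slot + 1         -> nonzero NFIT device handle
 *  - spa_index(slot) = (slot + 1) << 1  -> even values starting at 2
 *  - dcr_index(slot) = spa_index + 1    -> odd values, so SPA and DCR
 *                                          indices never collide */
static uint32_t nvdimm_slot_to_handle(int slot)
{
    return slot + 1;
}

static uint16_t nvdimm_slot_to_spa_index(int slot)
{
    return (slot + 1) << 1;
}

static uint32_t nvdimm_slot_to_dcr_index(int slot)
{
    return nvdimm_slot_to_spa_index(slot) + 1;
}
```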

> +static int build_structure_spa(void *buf, NVDIMMDevice *nvdimm)

Pls document the specific chapter that this implements.

same everywhere else.
> +{
> +    nfit_spa *nfit_spa;
> +    uint64_t addr = object_property_get_int(OBJECT(nvdimm), DIMM_ADDR_PROP,
> +                                            NULL);
> +    uint64_t size = object_property_get_int(OBJECT(nvdimm), DIMM_SIZE_PROP,
> +                                            NULL);
> +    uint32_t node = object_property_get_int(OBJECT(nvdimm), DIMM_NODE_PROP,
> +                                            NULL);
> +    int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
> +                                            NULL);
> +
> +    nfit_spa = buf;
> +
> +    nfit_spa->type = cpu_to_le16(NFIT_STRUCTURE_SPA);

Don't do these 1-time enums. They are hard to match against spec.

       nfit_spa->type = cpu_to_le16(0 /* System Physical Address Range Structure */);

same everywhere else.

> +    nfit_spa->length = cpu_to_le16(sizeof(*nfit_spa));
> +    nfit_spa->spa_index = cpu_to_le16(nvdimm_slot_to_spa_index(slot));
> +    nfit_spa->flags = cpu_to_le16(SPA_FLAGS_PROXIMITY_VALID);
> +    nfit_spa->proximity_domain = cpu_to_le32(node);
> +    nfit_spa_uuid_pm(&nfit_spa->type_guid);
> +    nfit_spa->spa_base = cpu_to_le64(addr);
> +    nfit_spa->spa_length = cpu_to_le64(size);
> +    nfit_spa->mem_attr = cpu_to_le64(EFI_MEMORY_WB | EFI_MEMORY_NV);
> +
> +    return sizeof(*nfit_spa);
> +}
> +
> +static int build_structure_memdev(void *buf, NVDIMMDevice *nvdimm)
> +{
> +    nfit_memdev *nfit_memdev;
> +    uint64_t addr = object_property_get_int(OBJECT(nvdimm), DIMM_ADDR_PROP,
> +                                            NULL);
> +    uint64_t size = object_property_get_int(OBJECT(nvdimm), DIMM_SIZE_PROP,
> +                                            NULL);
> +    int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
> +                                            NULL);
> +    uint32_t handle = nvdimm_slot_to_handle(slot);
> +
> +    nfit_memdev = buf;
> +    nfit_memdev->type = cpu_to_le16(NFIT_STRUCTURE_MEMDEV);
> +    nfit_memdev->length = cpu_to_le16(sizeof(*nfit_memdev));
> +    nfit_memdev->nfit_handle = cpu_to_le32(handle);
> +    /* point to nfit_spa. */
> +    nfit_memdev->spa_index = cpu_to_le16(nvdimm_slot_to_spa_index(slot));
> +    /* point to nfit_dcr. */
> +    nfit_memdev->dcr_index = cpu_to_le16(nvdimm_slot_to_dcr_index(slot));
> +    nfit_memdev->region_len = cpu_to_le64(size);
> +    nfit_memdev->region_dpa = cpu_to_le64(addr);
> +    /* Only one interleave for pmem. */
> +    nfit_memdev->interleave_ways = cpu_to_le16(1);
> +
> +    return sizeof(*nfit_memdev);
> +}
> +
> +static int build_structure_dcr(void *buf, NVDIMMDevice *nvdimm)
> +{
> +    nfit_dcr *nfit_dcr;
> +    int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
> +                                       NULL);
> +    uint32_t sn = nvdimm_slot_to_sn(slot);
> +
> +    nfit_dcr = buf;
> +    nfit_dcr->type = cpu_to_le16(NFIT_STRUCTURE_DCR);
> +    nfit_dcr->length = cpu_to_le16(sizeof(*nfit_dcr));
> +    nfit_dcr->dcr_index = cpu_to_le16(nvdimm_slot_to_dcr_index(slot));
> +    nfit_dcr->vendor_id = cpu_to_le16(0x8086);
> +    nfit_dcr->device_id = cpu_to_le16(1);
> +    nfit_dcr->revision_id = cpu_to_le16(REVSISON_ID);
> +    nfit_dcr->serial_number = cpu_to_le32(sn);
> +    nfit_dcr->fic = cpu_to_le16(NFIT_FIC1);
> +
> +    return sizeof(*nfit_dcr);
> +}
> +
> +static void build_device_structure(GSList *device_list, char *buf)
> +{
> +    buf += sizeof(nfit);
> +
> +    for (; device_list; device_list = device_list->next) {
> +        NVDIMMDevice *nvdimm = device_list->data;
> +
> +        /* build System Physical Address Range Description Table. */
> +        buf += build_structure_spa(buf, nvdimm);
> +
> +        /*
> +         * build Memory Device to System Physical Address Range Mapping
> +         * Table.
> +         */
> +        buf += build_structure_memdev(buf, nvdimm);
> +
> +        /* build Control Region Descriptor Table. */
> +        buf += build_structure_dcr(buf, nvdimm);
> +    }
> +}
> +
> +static void build_nfit(GSList *device_list, GArray *table_offsets,
> +                       GArray *table_data, GArray *linker)
> +{
> +    size_t total;
> +    char *buf;
> +    int nfit_start, nr;
> +
> +    nr = g_slist_length(device_list);
> +    total = get_nfit_total_size(nr);
> +
> +    nfit_start = table_data->len;
> +    acpi_add_table(table_offsets, table_data);
> +
> +    buf = acpi_data_push(table_data, total);
> +    build_device_structure(device_list, buf);

This seems fragile. Should build_device_structure overflow
the buffer, we'll corrupt memory.
Current code does use acpi_data_push, but only for trivial
things like fixed-size headers.
Can't you use glib to dynamically append things to the table
as they are generated?
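For what it's worth, the append-as-you-go pattern could look roughly like the sketch below; `Table` and `table_push` are invented stand-ins for the GArray/acpi_data_push machinery, not QEMU APIs:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Minimal growable byte buffer standing in for GArray. */
typedef struct {
    uint8_t *data;
    size_t len, cap;
} Table;

/* Reserve 'size' bytes at the end of the table and return a pointer to
 * them; the buffer grows as needed, so a too-small up-front estimate
 * cannot overflow into neighboring memory. */
static void *table_push(Table *t, size_t size)
{
    if (t->len + size > t->cap) {
        t->cap = (t->len + size) * 2;
        t->data = realloc(t->data, t->cap);
    }
    void *p = t->data + t->len;
    memset(p, 0, size);
    t->len += size;
    return p;
}

/* Toy stand-in for one NFIT sub-structure. */
typedef struct {
    uint16_t type;
    uint16_t length;
} ToySpa;

static void build_toy_spa(Table *t)
{
    ToySpa *spa = table_push(t, sizeof(*spa));
    spa->type = 0;                 /* SPA structure */
    spa->length = sizeof(*spa);
}
```

Each builder then pushes exactly what it writes, and the table length is whatever was actually emitted rather than a precomputed total.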


> +
> +    build_header(linker, table_data, (void *)(table_data->data + nfit_start),
> +                 "NFIT", table_data->len - nfit_start, 1);
> +}
> +
> +void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
> +                             GArray *table_data, GArray *linker)
> +{
> +    GSList *device_list = nvdimm_get_built_list();
> +
> +    if (!memory_region_size(&state->mr)) {
> +        assert(!device_list);
> +        return;
> +    }
> +
> +    if (device_list) {
> +        build_nfit(device_list, table_offsets, table_data, linker);
> +        g_slist_free(device_list);
> +    }
> +}
> diff --git a/hw/mem/nvdimm/internal.h b/hw/mem/nvdimm/internal.h
> index c4ba750..5551448 100644
> --- a/hw/mem/nvdimm/internal.h
> +++ b/hw/mem/nvdimm/internal.h
> @@ -14,4 +14,17 @@
>  #define NVDIMM_INTERNAL_H
>  
>  #define MIN_NAMESPACE_LABEL_SIZE    (128UL << 10)
> +
> +struct uuid_le {
> +    uint8_t b[16];
> +};
> +typedef struct uuid_le uuid_le;
> +
> +#define UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)                   \
> +((uuid_le)                                                                 \
> +{ { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
> +    (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff,          \
> +    (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) } })
> +

Please avoid polluting the global namespace.
Prefix everything with NVDIMM.

> +GSList *nvdimm_get_built_list(void);

You are adding an extern function with no comment
about its purpose anywhere. Pls fix this.
The name isn't pretty. What does "built" mean?
List of what? Is this a device list?

>  #endif

This header is too small to be worth it.
nvdimm_get_built_list seems to be the only interface -
just stick it in the header you have under include.


> diff --git a/hw/mem/nvdimm/nvdimm.c b/hw/mem/nvdimm/nvdimm.c
> index 0850e82..bc8c577 100644
> --- a/hw/mem/nvdimm/nvdimm.c
> +++ b/hw/mem/nvdimm/nvdimm.c
> @@ -26,6 +26,31 @@
>  #include "hw/mem/nvdimm.h"
>  #include "internal.h"
>  
> +static int nvdimm_built_list(Object *obj, void *opaque)
> +{
> +    GSList **list = opaque;
> +
> +    if (object_dynamic_cast(obj, TYPE_NVDIMM)) {
> +        DeviceState *dev = DEVICE(obj);
> +
> +        /* only realized NVDIMMs matter */
> +        if (dev->realized) {
> +            *list = g_slist_append(*list, dev);
> +        }
> +    }
> +
> +    object_child_foreach(obj, nvdimm_built_list, opaque);
> +    return 0;
> +}
> +
> +GSList *nvdimm_get_built_list(void)
> +{
> +    GSList *list = NULL;
> +
> +    object_child_foreach(qdev_get_machine(), nvdimm_built_list, &list);
> +    return list;
> +}
> +
>  static MemoryRegion *nvdimm_get_memory_region(DIMMDevice *dimm)
>  {
>      NVDIMMDevice *nvdimm = NVDIMM(dimm);
> diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
> index aa95961..0a6bda4 100644
> --- a/include/hw/mem/nvdimm.h
> +++ b/include/hw/mem/nvdimm.h
> @@ -49,4 +49,6 @@ typedef struct NVDIMMState NVDIMMState;
>  
>  void nvdimm_init_memory_state(NVDIMMState *state, MemoryRegion*system_memory,
>                                MachineState *machine , uint64_t page_size);
> +void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
> +                             GArray *table_data, GArray *linker);
>  #endif
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 200+ messages in thread


* Re: [PATCH v3 00/32] implement vNVDIMM
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-12 11:55   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-12 11:55 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Sun, Oct 11, 2015 at 11:52:32AM +0800, Xiao Guangrong wrote:
> Changelog in v3:
> There are huge changes in this version; thanks to Igor, Stefan, Paolo, Eduardo
> and Michael for their valuable comments, the patchset finally gets into better
> shape.

Thanks!
This needs some changes in coding style, and more comments, to
make it easier to maintain going forward.

High level comments - I didn't point out all instances,
please go over code and locate them yourself.
I focused on acpi code in this review.

    - fix coding style violations, prefix everything with nvdimm_ etc
    - in acpi code, avoid manual memory management/complex pointer math
    - comments are needed to document APIs & explain what's going on
    - constants need comments too, refer to text that
      can be looked up in the acpi spec verbatim


> - changes from Igor's comments:
>   1) abstract a dimm device type from pc-dimm and create the nvdimm device
>      based on dimm; it then uses a memory backend device as nvdimm's memory,
>      and NUMA has been easily implemented.
>   2) let the file-backend device support any kind of filesystem, not only
>      hugetlbfs, and let it work on a file, not only a directory, which is
>      achieved by extending 'mem-path' - if it's a directory then it works as
>      before, otherwise if it's a file then memory is directly allocated
>      from it.
>   3) we figure out an unused memory hole below 4G, that is 0xFF00000 ~
>      0xFFF00000; this range is large enough for NVDIMM ACPI, as building a
>      64-bit ACPI SSDT/DSDT table would break Windows XP.
>      BTW, making only SSDT.rev = 2 cannot work, since the integer width only
>      depends on DSDT.rev, based on 19.6.28 DefinitionBlock (Declare
>      Definition Block) in the ACPI spec:
> | Note: For compatibility with ACPI versions before ACPI 2.0, the bit 
> | width of Integer objects is dependent on the ComplianceRevision of the DSDT.
> | If the ComplianceRevision is less than 2, all integers are restricted to 32 
> | bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets 
> | the global integer width for all integers, including integers in SSDTs.
>   4) use the lowest ACPI spec version to document AML terms.
>   5) use "nvdimm" as nvdimm device name instead of "pc-nvdimm"
> 
> - changes from Stefan's comments:
>   1) do not do endian adjustment in-place since _DSM memory is visible to guest
>   2) use the target platform's page size instead of a fixed PAGE_SIZE
>      definition
>   3) lots of code style improvements and typo fixes.
>   4) live migration fix
> - changes from Paolo's comments:
>   1) improve the name of memory region
>   
> - other changes:
>   1) return exact buffer size for _DSM method instead of the page size.
>   2) introduce a mutex in NVDIMM ACPI, as the _DSM memory is shared by all
>      nvdimm devices.
>   3) NUMA support
>   4) implement _FIT method
>   5) rename "configdata" to "reserve-label-data"
>   6) simplify _DSM arg3 determination
>   7) main changelog update to let it reflect v3.
> 
> Changlog in v2:
> - Use little endian for the DSM method, thanks to Stefan for the suggestion
> 
> - introduce a new parameter, @configdata; if it's false, QEMU will
>   build a static and read-only namespace in memory and use it to serve
>   DSM GET_CONFIG_SIZE/GET_CONFIG_DATA requests. In this case, no
>   reserved region is needed at the end of the @file, which is good for
>   users who want to pass the whole nvdimm device and make its data
>   completely visible to the guest
> 
> - divide the source code into separate files and add maintainer info
> 
> BTW, PCOMMIT virtualization on the KVM side is work in progress; hopefully it
> will be posted next week
> 
> ====== Background ======
> NVDIMM (A Non-Volatile Dual In-line Memory Module) is going to be supported
> on Intel's platform. They are discovered via ACPI and configured by the _DSM
> method of the NVDIMM device in ACPI. There are some supporting documents,
> which can be found at:
> ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
> NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
> DSM Interface Example: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
> Driver Writer's Guide: http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
> 
> Currently, the NVDIMM driver has been merged into the upstream Linux kernel,
> and this patchset tries to enable it in the virtualization field
> 
> ====== Design ======
> NVDIMM supports two access modes: one is PMEM, which maps the NVDIMM into the
> CPU's address space so the CPU can directly access it as normal memory; the
> other is BLK, which is used as a block device to reduce the consumption of
> CPU address space
> 
> BLK mode accesses the NVDIMM via a Command Register window and a Data
> Register window. BLK virtualization has high overhead since each sector
> access causes at least two VM-EXITs, so we currently only implement vPMEM in
> this patchset
> 
> --- vPMEM design ---
> We introduce a new device named "nvdimm"; it uses a memory backend device as
> NVDIMM memory. The file in the file-backend device can be a regular file or a
> block device. We can use any file for test or emulation; however, in the real
> world, the files passed to the guest are:
> - a regular file created on a host NVDIMM device, in a filesystem with DAX
>   enabled
> - the raw PMEM device on the host, e.g. /dev/pmem0
> Memory accesses at addresses created by mmap on these kinds of files can
> directly reach the NVDIMM device on the host.
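The host-side access path described above is a plain mmap of the backing file; a minimal sketch (the helper name and path are examples, not from the patch):

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map 'len' bytes of the backing file at 'path' with MAP_SHARED, so
 * stores through the returned pointer reach the file directly (and,
 * for a DAX file or /dev/pmem0, the NVDIMM media itself). */
static void *map_backing_file(const char *path, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) {
        return MAP_FAILED;
    }
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return MAP_FAILED;
    }
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after close */
    return p;
}
```

QEMU's memory-backend-file does essentially this on the guest's behalf, so guest loads and stores land on the host file without extra copies.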
> 
> --- vConfigure data area design ---
> Each NVDIMM device has a config data area which is used to store label
> namespace data. In order to emulate this area, we divide the file into two
> parts:
> - the first part is [0, size - 128K), which is used as PMEM
> - the last 128K of the file, which is used as the Label Data Area
> so that the label namespace data can persist across power loss or system
> failure.
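The split is a one-line computation; in this sketch the 128K matches the MIN_NAMESPACE_LABEL_SIZE constant in the patch, and the helper name is illustrative only:

```c
#include <assert.h>
#include <stdint.h>

/* 128K reserved at the end of the backing file for label data. */
#define LABEL_DATA_SIZE (128ULL << 10)

/* The file is divided as:
 *   [0, file_size - 128K)          -> PMEM, exposed to the guest
 *   [file_size - 128K, file_size)  -> Label Data Area
 * This helper computes the split point. */
static uint64_t label_area_offset(uint64_t file_size)
{
    assert(file_size > LABEL_DATA_SIZE);
    return file_size - LABEL_DATA_SIZE;
}
```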
> 
> We also support passing the whole file to the guest without reserving any
> region for the label data area, which is controlled by the
> "reserve-label-data" parameter - if it's false then QEMU will build a static
> and read-only namespace in memory, and that namespace covers the whole file
> size. The parameter is false by default.
> 
> --- _DSM method design ---
> _DSM in ACPI is used to configure the NVDIMM; currently we only allow access
> to label namespace data, i.e., Get Namespace Label Size (Function Index 4),
> Get Namespace Label Data (Function Index 5) and Set Namespace Label Data
> (Function Index 6)
> 
> _DSM uses two pages to transfer data between ACPI and QEMU: the first page
> is RAM-based and is used to save the input info of the _DSM method, and QEMU
> reuses it to store the output info; the other page is MMIO-based, and ACPI
> writes data to this page to transfer control to QEMU
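A rough sketch of how such a two-page handshake might be laid out; the struct and field names below are assumptions for illustration, not the patch's actual ABI:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout of the RAM-based transfer page: ACPI fills in
 * the _DSM input here, and QEMU overwrites the same page with the
 * output before returning to the guest. */
struct dsm_ram_page {
    uint32_t function;          /* _DSM function index (little endian) */
    uint32_t in_length;         /* valid bytes in arg_data on input    */
    uint8_t  arg_data[4088];    /* function arguments / returned data  */
} __attribute__((packed));

/* The second, MMIO-based page acts as a doorbell: a guest write to it
 * traps to QEMU, which reads the RAM page, handles the request, and
 * writes the result back into the same RAM page. */
```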
> 
> ====== Test ======
> In host
> 1) create a memory backed file, e.g. # dd if=/dev/zero of=/tmp/nvdimm bs=1G count=10
> 2) append "-object memory-backend-file,share,id=mem1,
>    mem-path=/tmp/nvdimm -device nvdimm,memdev=mem1,reserve-label-data,
>    id=nv1" in QEMU command line
> 
> In the guest, download the latest upstream kernel (4.2 merge window) and enable
> ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM.
> 1) insmod drivers/nvdimm/libnvdimm.ko
> 2) insmod drivers/acpi/nfit.ko
> 3) insmod drivers/nvdimm/nd_btt.ko
> 4) insmod drivers/nvdimm/nd_pmem.ko
> You can see the whole nvdimm device used as a single namespace and /dev/pmem0
> appears. You can do whatever on /dev/pmem0 including DAX access.
> 
> Currently the Linux NVDIMM driver does not support namespace operations on
> this kind of PMEM; apply the changes below to support dynamic namespaces:
> 
> @@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *a
>                         continue;
>                 }
>  
> -               if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> +               //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> +               if (nfit_mem->memdev_pmem)
>                         flags |= NDD_ALIASING;
> 
> You can append another NVDIMM device in the guest and do:
> # cd /sys/bus/nd/devices/
> # cd namespace1.0/
> # echo `uuidgen` > uuid
> # echo `expr 1024 \* 1024 \* 128` > size
> then reload nd_pmem.ko
> 
> You can see /dev/pmem1 appears
> 
> ====== TODO ======
> NVDIMM hotplug support
> 
> Xiao Guangrong (32):
>   acpi: add aml_derefof
>   acpi: add aml_sizeof
>   acpi: add aml_create_field
>   acpi: add aml_mutex, aml_acquire, aml_release
>   acpi: add aml_concatenate
>   acpi: add aml_object_type
>   util: introduce qemu_file_get_page_size()
>   exec: allow memory to be allocated from any kind of path
>   exec: allow file_ram_alloc to work on file
>   hostmem-file: clean up memory allocation
>   hostmem-file: use whole file size if possible
>   pc-dimm: remove DEFAULT_PC_DIMMSIZE
>   pc-dimm: make pc_existing_dimms_capacity static and rename it
>   pc-dimm: drop the prefix of pc-dimm
>   stubs: rename qmp_pc_dimm_device_list.c
>   pc-dimm: rename pc-dimm.c and pc-dimm.h
>   dimm: abstract dimm device from pc-dimm
>   dimm: get mapped memory region from DIMMDeviceClass->get_memory_region
>   dimm: keep the state of the whole backend memory
>   dimm: introduce realize callback
>   nvdimm: implement NVDIMM device abstract
>   nvdimm: init the address region used by NVDIMM ACPI
>   nvdimm: build ACPI NFIT table
>   nvdimm: init the address region used by DSM method
>   nvdimm: build ACPI nvdimm devices
>   nvdimm: save arg3 for NVDIMM device _DSM method
>   nvdimm: support DSM_CMD_IMPLEMENTED function
>   nvdimm: support DSM_CMD_NAMESPACE_LABEL_SIZE function
>   nvdimm: support DSM_CMD_GET_NAMESPACE_LABEL_DATA
>   nvdimm: support DSM_CMD_SET_NAMESPACE_LABEL_DATA
>   nvdimm: allow using whole backend memory as pmem
>   nvdimm: add maintain info
> 
>  MAINTAINERS                                        |   6 +
>  backends/hostmem-file.c                            |  58 +-
>  default-configs/i386-softmmu.mak                   |   2 +
>  default-configs/x86_64-softmmu.mak                 |   2 +
>  exec.c                                             | 113 ++-
>  hmp.c                                              |   2 +-
>  hw/Makefile.objs                                   |   2 +-
>  hw/acpi/aml-build.c                                |  83 ++
>  hw/acpi/ich9.c                                     |   8 +-
>  hw/acpi/memory_hotplug.c                           |  26 +-
>  hw/acpi/piix4.c                                    |   8 +-
>  hw/i386/acpi-build.c                               |   4 +
>  hw/i386/pc.c                                       |  37 +-
>  hw/mem/Makefile.objs                               |   3 +
>  hw/mem/{pc-dimm.c => dimm.c}                       | 240 ++---
>  hw/mem/nvdimm/acpi.c                               | 961 +++++++++++++++++++++
>  hw/mem/nvdimm/internal.h                           |  41 +
>  hw/mem/nvdimm/namespace.c                          | 309 +++++++
>  hw/mem/nvdimm/nvdimm.c                             | 136 +++
>  hw/mem/pc-dimm.c                                   | 506 +----------
>  hw/ppc/spapr.c                                     |  20 +-
>  include/hw/acpi/aml-build.h                        |   8 +
>  include/hw/i386/pc.h                               |   4 +-
>  include/hw/mem/{pc-dimm.h => dimm.h}               |  68 +-
>  include/hw/mem/nvdimm.h                            |  58 ++
>  include/hw/mem/pc-dimm.h                           | 105 +--
>  include/hw/ppc/spapr.h                             |   2 +-
>  include/qemu/osdep.h                               |   1 +
>  numa.c                                             |   4 +-
>  qapi-schema.json                                   |   8 +-
>  qmp.c                                              |   4 +-
>  stubs/Makefile.objs                                |   2 +-
>  ...c_dimm_device_list.c => qmp_dimm_device_list.c} |   4 +-
>  target-ppc/kvm.c                                   |  21 +-
>  trace-events                                       |   8 +-
>  util/oslib-posix.c                                 |  16 +
>  util/oslib-win32.c                                 |   5 +
>  37 files changed, 2023 insertions(+), 862 deletions(-)
>  rename hw/mem/{pc-dimm.c => dimm.c} (65%)
>  create mode 100644 hw/mem/nvdimm/acpi.c
>  create mode 100644 hw/mem/nvdimm/internal.h
>  create mode 100644 hw/mem/nvdimm/namespace.c
>  create mode 100644 hw/mem/nvdimm/nvdimm.c
>  rewrite hw/mem/pc-dimm.c (91%)
>  rename include/hw/mem/{pc-dimm.h => dimm.h} (50%)
>  create mode 100644 include/hw/mem/nvdimm.h
>  rewrite include/hw/mem/pc-dimm.h (97%)
>  rename stubs/{qmp_pc_dimm_device_list.c => qmp_dimm_device_list.c} (56%)
> 
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 00/32] implement vNVDIMM
@ 2015-10-12 11:55   ` Michael S. Tsirkin
  0 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-12 11:55 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: ehabkost, kvm, gleb, mtosatti, qemu-devel, stefanha, imammedo,
	pbonzini, dan.j.williams, rth

On Sun, Oct 11, 2015 at 11:52:32AM +0800, Xiao Guangrong wrote:
> Changelog in v3:
> There are huge changes in this version; thanks to Igor, Stefan, Paolo, Eduardo,
> and Michael for their valuable comments, the patchset finally gets into better shape.

Thanks!
This needs some changes in coding style, and more comments, to
make it easier to maintain going forward.

High level comments - I didn't point out all instances,
please go over code and locate them yourself.
I focused on the ACPI code in this review.

    - fix coding style violations, prefix everything with nvdimm_ etc
    - in ACPI code, avoid manual memory management/complex pointer math
    - comments are needed to document APIs & explain what's going on
    - constants need comments too, referring to text that
      can be looked up in the ACPI spec verbatim


> - changes from Igor's comments:
>   1) abstract a dimm device type from pc-dimm and create the nvdimm device
>      based on dimm; it uses a memory backend device as the nvdimm's memory,
>      so NUMA support is easily implemented.
>   2) let the file-backend device support any kind of filesystem, not only
>      hugetlbfs, and let it work on a file, not only a directory. This is
>      achieved by extending 'mem-path': if it's a directory it works as
>      before; otherwise, if it's a file, memory is allocated directly from it.
>   3) we figured out an unused memory hole below 4G, 0xFF00000 ~
>      0xFFF00000; this range is large enough for NVDIMM ACPI, as building a
>      64-bit ACPI SSDT/DSDT table would break Windows XP.
>      BTW, only making SSDT.rev = 2 cannot work, since the integer width
>      depends only on DSDT.rev, per 19.6.28 DefinitionBlock (Declare
>      Definition Block) in the ACPI spec:
> | Note: For compatibility with ACPI versions before ACPI 2.0, the bit 
> | width of Integer objects is dependent on the ComplianceRevision of the DSDT.
> | If the ComplianceRevision is less than 2, all integers are restricted to 32 
> | bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets 
> | the global integer width for all integers, including integers in SSDTs.
>   4) use the lowest ACPI spec version to document AML terms.
>   5) use "nvdimm" as nvdimm device name instead of "pc-nvdimm"
> 
> - changes from Stefan's comments:
>   1) do not do the endian adjustment in place, since the _DSM memory is
>      visible to the guest
>   2) use the target platform's page size instead of a fixed PAGE_SIZE
>      definition
>   3) lots of code style improvements and typo fixes.
>   4) live migration fix
> - changes from Paolo's comments:
>   1) improve the name of memory region
>   
> - other changes:
>   1) return the exact buffer size for the _DSM method instead of the page size.
>   2) introduce a mutex in NVDIMM ACPI, as the _DSM memory is shared by all
>      nvdimm devices.
>   3) NUMA support
>   4) implement the _FIT method
>   5) rename "configdata" to "reserve-label-data"
>   6) simplify _DSM arg3 determination
>   7) update the main changelog to reflect v3.
> 
> Changelog in v2:
> - Use little endian for the DSM method, thanks to Stefan for the suggestion
> 
> - introduce a new parameter, @configdata; if it's false, QEMU will
>   build a static and read-only namespace in memory and use it to serve
>   DSM GET_CONFIG_SIZE/GET_CONFIG_DATA requests. In this case, no
>   reserved region is needed at the end of the @file, which is good for
>   users who want to pass the whole nvdimm device and make its data
>   completely visible to the guest
> 
> - divide the source code into separate files and add maintainer info
> 
> BTW, PCOMMIT virtualization on the KVM side is a work in progress; hopefully
> it will be posted next week
> 
> ====== Background ======
> NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be supported
> on Intel's platform. NVDIMMs are discovered via ACPI and configured by the
> _DSM method of the NVDIMM device in ACPI. Some supporting documents
> can be found at:
> ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
> NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
> DSM Interface Example: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
> Driver Writer's Guide: http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
> 
> Currently, the NVDIMM driver has been merged into the upstream Linux kernel and
> this patchset tries to enable it in the virtualization field
> 
> ====== Design ======
> NVDIMM supports two access modes. One is PMEM, which maps the NVDIMM into the
> CPU's address space so the CPU can directly access it as normal memory; the
> other is BLK, which is used as a block device to reduce the consumption of
> CPU address space
> 
> BLK mode accesses the NVDIMM via a Command Register window and a Data Register
> window. BLK virtualization has high overhead since each sector access causes
> at least two VM-exits, so we currently only implement vPMEM in this patchset
> 
> --- vPMEM design ---
> We introduce a new device named "nvdimm"; it uses a memory backend device as
> NVDIMM memory. The file in the file-backend device can be a regular file or a
> block device. We can use any file for test or emulation; however,
> in the real world, the files passed to the guest are:
> - a regular file in a DAX-enabled filesystem created on an NVDIMM device on
>   the host
> - the raw PMEM device on the host, e.g. /dev/pmem0
> Memory accesses to addresses mmap()ed from these kinds of files directly
> reach the NVDIMM device on the host.
> 
> --- vConfigure data area design ---
> Each NVDIMM device has a configuration data area which is used to store label
> namespace data. In order to emulate this area, we divide the file into two
> parts:
> - the first part, [0, size - 128K), is used as PMEM
> - the 128K at the end of the file is used as the Label Data Area
> This way the label namespace data persists across power loss or system
> failure.
> 
> We also support passing the whole file to the guest without reserving any
> region for the label data area; this is achieved by the "reserve-label-data"
> parameter - if it's false then QEMU builds a static, read-only namespace in
> memory, and that namespace covers the whole file size. The parameter is
> false by default.
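The split described above is simple arithmetic; the sketch below just makes it concrete (the 1 GiB backing-file size is an example, and the 128K label area is the one fixed by the layout in this cover letter):

```shell
# Illustrative arithmetic for the PMEM / label-data split described above.
# The backing-file size is an example; the 128K label area is per the design.
label_size=$((128 * 1024))              # label data area at the end of the file
file_size=$((1024 * 1024 * 1024))       # example: 1 GiB backing file
pmem_size=$((file_size - label_size))   # the rest of the file is exposed as PMEM
echo "pmem bytes: $pmem_size"
```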
> 
> --- _DSM method design ---
> _DSM in ACPI is used to configure the NVDIMM; currently we only allow access
> to label namespace data, i.e., Get Namespace Label Size (Function Index 4),
> Get Namespace Label Data (Function Index 5) and Set Namespace Label Data
> (Function Index 6)
> 
> _DSM uses two pages to transfer data between ACPI and QEMU: the first page
> is RAM-based and is used to pass the input of the _DSM method (QEMU reuses
> it to store the output); the other page is MMIO-based, and ACPI writes to
> this page to transfer control to QEMU
> 
> ====== Test ======
> In host
> 1) create a memory backed file, e.g. # dd if=/dev/zero of=/tmp/nvdimm bs=1G count=10
> 2) append "-object memory-backend-file,share,id=mem1,
>    mem-path=/tmp/nvdimm -device nvdimm,memdev=mem1,reserve-label-data,
>    id=nv1" in QEMU command line
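The host-side steps above, as a runnable sketch (a small 16M file is used instead of the 10G in the example so it is quick, and the QEMU options are only echoed, since the QEMU binary path and machine type are setup-specific):

```shell
# Host-side setup sketch: create a backing file for the vNVDIMM.
# 16M instead of 10G so this runs quickly; size it to taste.
backing=/tmp/nvdimm-demo
dd if=/dev/zero of="$backing" bs=1M count=16 2>/dev/null
wc -c < "$backing"
# The matching QEMU options (echoed only; invoke QEMU with them yourself):
echo "-object memory-backend-file,share,id=mem1,mem-path=$backing" \
     "-device nvdimm,memdev=mem1,reserve-label-data,id=nv1"
```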
> 
> In the guest, download the latest upstream kernel (4.2 merge window) and enable
> ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM.
> 1) insmod drivers/nvdimm/libnvdimm.ko
> 2) insmod drivers/acpi/nfit.ko
> 3) insmod drivers/nvdimm/nd_btt.ko
> 4) insmod drivers/nvdimm/nd_pmem.ko
> You can see the whole nvdimm device used as a single namespace, and /dev/pmem0
> appears. You can do whatever you like on /dev/pmem0, including DAX access.
> 
> Currently the Linux NVDIMM driver does not support namespace operations on this
> kind of PMEM; apply the changes below to support dynamic namespaces:
> 
> @@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *a
>                         continue;
>                 }
>  
> -               if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> +               //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> +               if (nfit_mem->memdev_pmem)
>                         flags |= NDD_ALIASING;
> 
> You can append another NVDIMM device in the guest and do:
> # cd /sys/bus/nd/devices/
> # cd namespace1.0/
> # echo `uuidgen` > uuid
> # echo `expr 1024 \* 1024 \* 128` > size
> then reload nd_pmem.ko
> 
> You can see /dev/pmem1 appears
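The guest-side namespace steps above can be collected into one small script. This is only a sketch: the namespace name (namespace1.0), the module paths, and the 128M size are the examples from this cover letter and will differ on other setups.

```shell
#!/bin/sh
# Sketch of the guest-side namespace creation steps quoted above.
# namespace1.0 and the 128M size are examples from the cover letter.
ns=/sys/bus/nd/devices/namespace1.0
size_bytes=$((128 * 1024 * 1024))   # same value as `expr 1024 \* 1024 \* 128`
if [ -d "$ns" ]; then
    uuidgen > "$ns/uuid"            # assign a fresh namespace UUID
    echo "$size_bytes" > "$ns/size"
    # reload the pmem driver so /dev/pmem1 appears
    rmmod nd_pmem && insmod drivers/nvdimm/nd_pmem.ko
else
    echo "no namespace1.0 on this system (size would be $size_bytes)" >&2
fi
```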
> 
> ====== TODO ======
> NVDIMM hotplug support
> 
> Xiao Guangrong (32):
>   acpi: add aml_derefof
>   acpi: add aml_sizeof
>   acpi: add aml_create_field
>   acpi: add aml_mutex, aml_acquire, aml_release
>   acpi: add aml_concatenate
>   acpi: add aml_object_type
>   util: introduce qemu_file_get_page_size()
>   exec: allow memory to be allocated from any kind of path
>   exec: allow file_ram_alloc to work on file
>   hostmem-file: clean up memory allocation
>   hostmem-file: use whole file size if possible
>   pc-dimm: remove DEFAULT_PC_DIMMSIZE
>   pc-dimm: make pc_existing_dimms_capacity static and rename it
>   pc-dimm: drop the prefix of pc-dimm
>   stubs: rename qmp_pc_dimm_device_list.c
>   pc-dimm: rename pc-dimm.c and pc-dimm.h
>   dimm: abstract dimm device from pc-dimm
>   dimm: get mapped memory region from DIMMDeviceClass->get_memory_region
>   dimm: keep the state of the whole backend memory
>   dimm: introduce realize callback
>   nvdimm: implement NVDIMM device abstract
>   nvdimm: init the address region used by NVDIMM ACPI
>   nvdimm: build ACPI NFIT table
>   nvdimm: init the address region used by DSM method
>   nvdimm: build ACPI nvdimm devices
>   nvdimm: save arg3 for NVDIMM device _DSM method
>   nvdimm: support DSM_CMD_IMPLEMENTED function
>   nvdimm: support DSM_CMD_NAMESPACE_LABEL_SIZE function
>   nvdimm: support DSM_CMD_GET_NAMESPACE_LABEL_DATA
>   nvdimm: support DSM_CMD_SET_NAMESPACE_LABEL_DATA
>   nvdimm: allow using whole backend memory as pmem
>   nvdimm: add maintain info
> 
>  MAINTAINERS                                        |   6 +
>  backends/hostmem-file.c                            |  58 +-
>  default-configs/i386-softmmu.mak                   |   2 +
>  default-configs/x86_64-softmmu.mak                 |   2 +
>  exec.c                                             | 113 ++-
>  hmp.c                                              |   2 +-
>  hw/Makefile.objs                                   |   2 +-
>  hw/acpi/aml-build.c                                |  83 ++
>  hw/acpi/ich9.c                                     |   8 +-
>  hw/acpi/memory_hotplug.c                           |  26 +-
>  hw/acpi/piix4.c                                    |   8 +-
>  hw/i386/acpi-build.c                               |   4 +
>  hw/i386/pc.c                                       |  37 +-
>  hw/mem/Makefile.objs                               |   3 +
>  hw/mem/{pc-dimm.c => dimm.c}                       | 240 ++---
>  hw/mem/nvdimm/acpi.c                               | 961 +++++++++++++++++++++
>  hw/mem/nvdimm/internal.h                           |  41 +
>  hw/mem/nvdimm/namespace.c                          | 309 +++++++
>  hw/mem/nvdimm/nvdimm.c                             | 136 +++
>  hw/mem/pc-dimm.c                                   | 506 +----------
>  hw/ppc/spapr.c                                     |  20 +-
>  include/hw/acpi/aml-build.h                        |   8 +
>  include/hw/i386/pc.h                               |   4 +-
>  include/hw/mem/{pc-dimm.h => dimm.h}               |  68 +-
>  include/hw/mem/nvdimm.h                            |  58 ++
>  include/hw/mem/pc-dimm.h                           | 105 +--
>  include/hw/ppc/spapr.h                             |   2 +-
>  include/qemu/osdep.h                               |   1 +
>  numa.c                                             |   4 +-
>  qapi-schema.json                                   |   8 +-
>  qmp.c                                              |   4 +-
>  stubs/Makefile.objs                                |   2 +-
>  ...c_dimm_device_list.c => qmp_dimm_device_list.c} |   4 +-
>  target-ppc/kvm.c                                   |  21 +-
>  trace-events                                       |   8 +-
>  util/oslib-posix.c                                 |  16 +
>  util/oslib-win32.c                                 |   5 +
>  37 files changed, 2023 insertions(+), 862 deletions(-)
>  rename hw/mem/{pc-dimm.c => dimm.c} (65%)
>  create mode 100644 hw/mem/nvdimm/acpi.c
>  create mode 100644 hw/mem/nvdimm/internal.h
>  create mode 100644 hw/mem/nvdimm/namespace.c
>  create mode 100644 hw/mem/nvdimm/nvdimm.c
>  rewrite hw/mem/pc-dimm.c (91%)
>  rename include/hw/mem/{pc-dimm.h => dimm.h} (50%)
>  create mode 100644 include/hw/mem/nvdimm.h
>  rewrite include/hw/mem/pc-dimm.h (97%)
>  rename stubs/{qmp_pc_dimm_device_list.c => qmp_dimm_device_list.c} (56%)
> 
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 00/32] implement vNVDIMM
  2015-10-12  4:33     ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-12 16:36       ` Dan Williams
  -1 siblings, 0 replies; 200+ messages in thread
From: Dan Williams @ 2015-10-12 16:36 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Paolo Bonzini, imammedo, Gleb Natapov, mtosatti, stefanha,
	Michael S. Tsirkin, rth, ehabkost, KVM list, qemu-devel

On Sun, Oct 11, 2015 at 9:33 PM, Xiao Guangrong
<guangrong.xiao@linux.intel.com> wrote:
>
>
> On 10/11/2015 05:17 AM, Dan Williams wrote:
>>
>> On Sat, Oct 10, 2015 at 8:52 PM, Xiao Guangrong
>> <guangrong.xiao@linux.intel.com> wrote:
>> [..]
>>>
>>> ====== Test ======
>>> In host
>>> 1) create a memory backed file, e.g. # dd if=/dev/zero of=/tmp/nvdimm bs=1G
>>> count=10
>>> 2) append "-object memory-backend-file,share,id=mem1,
>>>     mem-path=/tmp/nvdimm -device nvdimm,memdev=mem1,reserve-label-data,
>>>     id=nv1" in QEMU command line
>>>
>>> In the guest, download the latest upstream kernel (4.2 merge window) and
>>> enable
>>> ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM.
>>> 1) insmod drivers/nvdimm/libnvdimm.ko
>>> 2) insmod drivers/acpi/nfit.ko
>>> 3) insmod drivers/nvdimm/nd_btt.ko
>>> 4) insmod drivers/nvdimm/nd_pmem.ko
>>> You can see the whole nvdimm device used as a single namespace and
>>> /dev/pmem0
>>> appears. You can do whatever on /dev/pmem0 including DAX access.
>>>
>>> Currently the Linux NVDIMM driver does not support namespace operations on
>>> this
>>> kind of PMEM; apply the changes below to support dynamic namespaces:
>>>
>>> @@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct
>>> acpi_nfit_desc *a
>>>                          continue;
>>>                  }
>>>
>>> -               if (nfit_mem->bdw && nfit_mem->memdev_pmem)
>>> +               //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
>>> +               if (nfit_mem->memdev_pmem)
>>>                          flags |= NDD_ALIASING;
>>
>>
>> This is just for testing purposes, right?  I expect guests can
>
>
> It's used to validate the NVDIMM _DSM method and static namespaces following
> the NVDIMM specs...

Static namespaces can be emitted without a label.  Linux needs this to
support existing "label-less" bare metal NVDIMMs.

>> sub-divide persistent memory capacity by partitioning the resulting
>> block device(s).
>
>
> I understand that it's a Linux design... Hmm, can the same expectation
> apply to PBLK?

BLK-mode is a bit different, as those namespaces have both a configurable
sector size and an optional BTT.  It is possible to expect multiple
BLK namespaces per given region with different settings.
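(Purely to illustrate the point about per-namespace settings: the helper below is a hypothetical sketch modeled after the Linux libnvdimm sysfs layout, not something from this patchset, and the sysfs file names it writes are assumptions. It is demoed against a scratch directory rather than real hardware.)

```shell
# Hypothetical helper: apply per-namespace BLK settings (uuid, sector size,
# size) to a namespace directory. File names are assumptions modeled on the
# Linux libnvdimm sysfs layout; real setups may differ.
set_blk_namespace() {
    dir=$1 size=$2 sector=$3
    echo "8f4e8c4d-0000-4000-8000-000000000001" > "$dir/uuid"  # example UUID
    echo "$sector" > "$dir/sector_size"
    echo "$size"   > "$dir/size"
}

demo=$(mktemp -d)   # stand-in for /sys/bus/nd/devices/namespaceX.Y
set_blk_namespace "$demo" $((128 * 1024 * 1024)) 4096
cat "$demo/sector_size" "$demo/size"
```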

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 23/32] nvdimm: build ACPI NFIT table
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-12 16:40     ` Dan Williams
  -1 siblings, 0 replies; 200+ messages in thread
From: Dan Williams @ 2015-10-12 16:40 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Paolo Bonzini, imammedo, Gleb Natapov, mtosatti, stefanha,
	Michael S. Tsirkin, rth, ehabkost, KVM list, qemu-devel

On Sat, Oct 10, 2015 at 8:52 PM, Xiao Guangrong
<guangrong.xiao@linux.intel.com> wrote:
> NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT)
>
> Currently, we only support PMEM mode. Each device has 3 structures:
> - SPA structure, defines the PMEM region info
>
> - MEM DEV structure, it has the @handle which is used to associate the
>   specified ACPI NVDIMM device we will introduce in a later patch.
>   Also we can happily ignore the memory device's interleave; the real
>   nvdimm hardware access is hidden behind the host
>
> - DCR structure, it defines the vendor ID used to associate the specified
>   vendor nvdimm driver. Since we only implement PMEM mode this time, the
>   Command window and Data window are not needed
>
> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> ---
>  hw/i386/acpi-build.c     |   4 +
>  hw/mem/nvdimm/acpi.c     | 209 ++++++++++++++++++++++++++++++++++++++++++++++-
>  hw/mem/nvdimm/internal.h |  13 +++
>  hw/mem/nvdimm/nvdimm.c   |  25 ++++++
>  include/hw/mem/nvdimm.h  |   2 +
>  5 files changed, 252 insertions(+), 1 deletion(-)
>
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 95e0c65..c637dc8 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1661,6 +1661,7 @@ static bool acpi_has_iommu(void)
>  static
>  void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
>  {
> +    PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
>      GArray *table_offsets;
>      unsigned facs, ssdt, dsdt, rsdt;
>      AcpiCpuInfo cpu;
> @@ -1742,6 +1743,9 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
>          build_dmar_q35(tables_blob, tables->linker);
>      }
>
> +    nvdimm_build_acpi_table(&pcms->nvdimm_memory, table_offsets, tables_blob,
> +                            tables->linker);
> +
>      /* Add tables supplied by user (if any) */
>      for (u = acpi_table_first(); u; u = acpi_table_next(u)) {
>          unsigned len = acpi_table_len(u);
> diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
> index b640874..62b1e02 100644
> --- a/hw/mem/nvdimm/acpi.c
> +++ b/hw/mem/nvdimm/acpi.c
> @@ -32,6 +32,46 @@
>  #include "hw/mem/nvdimm.h"
>  #include "internal.h"
>
> +static void nfit_spa_uuid_pm(uuid_le *uuid)
> +{
> +    uuid_le uuid_pm = UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d,
> +                              0x33, 0x18, 0xb7, 0x8c, 0xdb);
> +    memcpy(uuid, &uuid_pm, sizeof(uuid_pm));
> +}
> +
> +enum {
> +    NFIT_STRUCTURE_SPA = 0,
> +    NFIT_STRUCTURE_MEMDEV = 1,
> +    NFIT_STRUCTURE_IDT = 2,
> +    NFIT_STRUCTURE_SMBIOS = 3,
> +    NFIT_STRUCTURE_DCR = 4,
> +    NFIT_STRUCTURE_BDW = 5,
> +    NFIT_STRUCTURE_FLUSH = 6,
> +};
> +
> +enum {
> +    EFI_MEMORY_UC = 0x1ULL,
> +    EFI_MEMORY_WC = 0x2ULL,
> +    EFI_MEMORY_WT = 0x4ULL,
> +    EFI_MEMORY_WB = 0x8ULL,
> +    EFI_MEMORY_UCE = 0x10ULL,
> +    EFI_MEMORY_WP = 0x1000ULL,
> +    EFI_MEMORY_RP = 0x2000ULL,
> +    EFI_MEMORY_XP = 0x4000ULL,
> +    EFI_MEMORY_NV = 0x8000ULL,
> +    EFI_MEMORY_MORE_RELIABLE = 0x10000ULL,
> +};

Would it be worth including / copying the ACPICA header files directly?

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/acpi/actbl1.h
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/acpi/acuuid.h

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 14/32] pc-dimm: drop the prefix of pc-dimm
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
  (?)
@ 2015-10-12 16:43   ` Eric Blake
  2015-10-13  3:32     ` Xiao Guangrong
  -1 siblings, 1 reply; 200+ messages in thread
From: Eric Blake @ 2015-10-12 16:43 UTC (permalink / raw)
  To: Xiao Guangrong, pbonzini, imammedo
  Cc: ehabkost, kvm, mst, gleb, mtosatti, qemu-devel, stefanha,
	dan.j.williams, rth

On 10/10/2015 09:52 PM, Xiao Guangrong wrote:
> This patch is generated by this script:
> 
> find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
> | xargs sed -i "s/PC_DIMM/DIMM/g"
> 
> find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
> | xargs sed -i "s/PCDIMM/DIMM/g"
> 
> find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
> | xargs sed -i "s/pc_dimm/dimm/g"
> 
> find ./ -name "trace-events" -type f | xargs sed -i "s/pc-dimm/dimm/g"
> 
> It prepares for abstracting a dimm device type shared by both pc-dimm and
> nvdimm
> 
> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> ---
>  hmp.c                           |   2 +-

> +++ b/qapi-schema.json
> @@ -3684,9 +3684,9 @@
>  { 'command': 'query-memdev', 'returns': ['Memdev'] }
>  
>  ##
> -# @PCDIMMDeviceInfo:
> +# @DIMMDeviceInfo:
>  #
> -# PCDIMMDevice state information
> +# DIMMDevice state information
>  #
>  # @id: #optional device's ID
>  #
> @@ -3706,7 +3706,7 @@
>  #
>  # Since: 2.1
>  ##
> -{ 'struct': 'PCDIMMDeviceInfo',
> +{ 'struct': 'DIMMDeviceInfo',
>    'data': { '*id': 'str',
>              'addr': 'int',
>              'size': 'int',
> @@ -3725,7 +3725,7 @@
>  #
>  # Since: 2.1
>  ##
> -{ 'union': 'MemoryDeviceInfo', 'data': {'dimm': 'PCDIMMDeviceInfo'} }
> +{ 'union': 'MemoryDeviceInfo', 'data': {'dimm': 'DIMMDeviceInfo'} }

Struct names are not ABI, so this change is safe.

I have not reviewed the rest of the patch, but I don't see any problems
from the qapi perspective.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 00/32] implement vNVDIMM
  2015-10-12 16:36       ` [Qemu-devel] " Dan Williams
@ 2015-10-13  3:14         ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-13  3:14 UTC (permalink / raw)
  To: Dan Williams
  Cc: ehabkost, KVM list, Michael S. Tsirkin, Gleb Natapov, mtosatti,
	qemu-devel, stefanha, imammedo, Paolo Bonzini, rth



On 10/13/2015 12:36 AM, Dan Williams wrote:
> On Sun, Oct 11, 2015 at 9:33 PM, Xiao Guangrong
> <guangrong.xiao@linux.intel.com> wrote:
>>
>>
>> On 10/11/2015 05:17 AM, Dan Williams wrote:
>>>
>>> On Sat, Oct 10, 2015 at 8:52 PM, Xiao Guangrong
>>> <guangrong.xiao@linux.intel.com> wrote:
>>> [..]
>>>>
>>>> ====== Test ======
>>>> In host
>>>> 1) create a memory backed file, e.g. # dd if=/dev/zero of=/tmp/nvdimm bs=1G
>>>> count=10
>>>> 2) append "-object memory-backend-file,share,id=mem1,
>>>>      mem-path=/tmp/nvdimm -device nvdimm,memdev=mem1,reserve-label-data,
>>>>      id=nv1" in QEMU command line
>>>>
>>>> In the guest, download the latest upstream kernel (4.2 merge window) and
>>>> enable
>>>> ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM.
>>>> 1) insmod drivers/nvdimm/libnvdimm.ko
>>>> 2) insmod drivers/acpi/nfit.ko
>>>> 3) insmod drivers/nvdimm/nd_btt.ko
>>>> 4) insmod drivers/nvdimm/nd_pmem.ko
>>>> You can see the whole nvdimm device used as a single namespace and
>>>> /dev/pmem0
>>>> appears. You can do whatever on /dev/pmem0 including DAX access.
>>>>
>>>> Currently the Linux NVDIMM driver does not support namespace operations on
>>>> this
>>>> kind of PMEM; apply the changes below to support dynamic namespaces:
>>>>
>>>> @@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct
>>>> acpi_nfit_desc *a
>>>>                           continue;
>>>>                   }
>>>>
>>>> -               if (nfit_mem->bdw && nfit_mem->memdev_pmem)
>>>> +               //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
>>>> +               if (nfit_mem->memdev_pmem)
>>>>                           flags |= NDD_ALIASING;
>>>
>>>
>>> This is just for testing purposes, right?  I expect guests can
>>
>>
>> It's used to validate the NVDIMM _DSM method and static namespaces following
>> the NVDIMM specs...
>
> Static namespaces can be emitted without a label.  Linux needs this to
> support existing "label-less" bare metal NVDIMMs.

Is this Linux-specific? I did not see it documented in the
spec...

>
>>> sub-divide persistent memory capacity by partitioning the resulting
>>> block device(s).
>>
>>
>> I understand that it's a Linux design... Hmm, can the same expectation
>> apply to PBLK?
>
> BLK-mode is a bit different as those namespaces have both configurable
> sector-size and an optional BTT.  It is possible to expect multiple
> BLK namespaces per a given region with different settings.

Okay, thanks for your nice explanation, Dan!


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 08/32] exec: allow memory to be allocated from any kind of path
  2015-10-12 10:08     ` [Qemu-devel] " Michael S. Tsirkin
@ 2015-10-13  3:31       ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-13  3:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/12/2015 06:08 PM, Michael S. Tsirkin wrote:
> On Sun, Oct 11, 2015 at 11:52:40AM +0800, Xiao Guangrong wrote:
>> Currently file_ram_alloc() is designed for hugetlbfs; however, the memory
>> of an nvdimm can come from either a raw pmem device, e.g. /dev/pmem, or a
>> file located on a DAX-enabled filesystem.
>>
>> So this patch lets it work on any kind of path.
>>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>
> This conflicts with map alloc rework.
> Please rebase this on top of my tree.
>

Okay, thanks for the reminder. I based it on the upstream QEMU tree;
I will redo it on top of the pci branch of your tree instead.

>
>> ---
>>   exec.c | 55 ++++++++++++++-----------------------------------------
>>   1 file changed, 14 insertions(+), 41 deletions(-)
>>
>> diff --git a/exec.c b/exec.c
>> index 7d90a52..70cb0ef 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -1154,32 +1154,6 @@ void qemu_mutex_unlock_ramlist(void)
>>   }
>>
>>   #ifdef __linux__
>> -
>> -#include <sys/vfs.h>
>> -
>> -#define HUGETLBFS_MAGIC       0x958458f6
>> -
>> -static long gethugepagesize(const char *path, Error **errp)
>> -{
>> -    struct statfs fs;
>> -    int ret;
>> -
>> -    do {
>> -        ret = statfs(path, &fs);
>> -    } while (ret != 0 && errno == EINTR);
>> -
>> -    if (ret != 0) {
>> -        error_setg_errno(errp, errno, "failed to get page size of file %s",
>> -                         path);
>> -        return 0;
>> -    }
>> -
>> -    if (fs.f_type != HUGETLBFS_MAGIC)
>> -        fprintf(stderr, "Warning: path not on HugeTLBFS: %s\n", path);
>> -
>> -    return fs.f_bsize;
>
> What this *actually* is trying to warn against is that
> mapping a regular file (as opposed to hugetlbfs)
> means transparent huge pages don't work.
>
> So I don't think we should drop this warning completely.
> Either let's add the nvdimm magic, or simply check the
> page size.

Checking the page size sounds good; I will check:
if (pagesize != getpagesize()) {
	...print something...
}

I agree with you that showing the info is needed; however,
'Warning' might scare some users. How about dropping this word and
just showing “Memory is not allocated from HugeTlbfs”?


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 14/32] pc-dimm: drop the prefix of pc-dimm
  2015-10-12 16:43   ` Eric Blake
@ 2015-10-13  3:32     ` Xiao Guangrong
  0 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-13  3:32 UTC (permalink / raw)
  To: Eric Blake, pbonzini, imammedo
  Cc: ehabkost, kvm, mst, gleb, mtosatti, qemu-devel, stefanha,
	dan.j.williams, rth



On 10/13/2015 12:43 AM, Eric Blake wrote:
> On 10/10/2015 09:52 PM, Xiao Guangrong wrote:
>> This patch is generated by this script:
>>
>> find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
>> | xargs sed -i "s/PC_DIMM/DIMM/g"
>>
>> find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
>> | xargs sed -i "s/PCDIMM/DIMM/g"
>>
>> find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
>> | xargs sed -i "s/pc_dimm/dimm/g"
>>
>> find ./ -name "trace-events" -type f | xargs sed -i "s/pc-dimm/dimm/g"
>>
>> It prepares for abstracting a common dimm device type shared by pc-dimm
>> and nvdimm.
>>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> ---
>>   hmp.c                           |   2 +-
>
>> +++ b/qapi-schema.json
>> @@ -3684,9 +3684,9 @@
>>   { 'command': 'query-memdev', 'returns': ['Memdev'] }
>>
>>   ##
>> -# @PCDIMMDeviceInfo:
>> +# @DIMMDeviceInfo:
>>   #
>> -# PCDIMMDevice state information
>> +# DIMMDevice state information
>>   #
>>   # @id: #optional device's ID
>>   #
>> @@ -3706,7 +3706,7 @@
>>   #
>>   # Since: 2.1
>>   ##
>> -{ 'struct': 'PCDIMMDeviceInfo',
>> +{ 'struct': 'DIMMDeviceInfo',
>>     'data': { '*id': 'str',
>>               'addr': 'int',
>>               'size': 'int',
>> @@ -3725,7 +3725,7 @@
>>   #
>>   # Since: 2.1
>>   ##
>> -{ 'union': 'MemoryDeviceInfo', 'data': {'dimm': 'PCDIMMDeviceInfo'} }
>> +{ 'union': 'MemoryDeviceInfo', 'data': {'dimm': 'DIMMDeviceInfo'} }
>
> Struct names are not ABI, so this change is safe.
>
> I have not reviewed the rest of the patch, but I don't see any problems
> from the qapi perspective.

Thanks for your review, Eric! :)

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 00/32] implement vNVDIMM
  2015-10-13  3:14         ` Xiao Guangrong
@ 2015-10-13  3:38           ` Dan Williams
  -1 siblings, 0 replies; 200+ messages in thread
From: Dan Williams @ 2015-10-13  3:38 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: ehabkost, KVM list, Michael S. Tsirkin, Gleb Natapov, mtosatti,
	qemu-devel, stefanha, Paolo Bonzini, imammedo, rth

On Mon, Oct 12, 2015 at 8:14 PM, Xiao Guangrong
<guangrong.xiao@linux.intel.com> wrote:
> On 10/13/2015 12:36 AM, Dan Williams wrote:
>> Static namespaces can be emitted without a label.  Linux needs this to
>> support existing "label-less" bare metal NVDIMMs.
>
>
> Is this Linux specific? I did not see it documented in the
> spec...

I expect most NVDIMMs, especially existing ones available today, do
not have a label area.  This is not Linux specific and ACPI 6 does not
specify a label area, only the Intel DSM Interface Example.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 23/32] nvdimm: build ACPI NFIT table
  2015-10-12 11:27     ` [Qemu-devel] " Michael S. Tsirkin
@ 2015-10-13  5:13       ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-13  5:13 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/12/2015 07:27 PM, Michael S. Tsirkin wrote:
> On Sun, Oct 11, 2015 at 11:52:55AM +0800, Xiao Guangrong wrote:
>> NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT)
>>
>> Currently, we only support PMEM mode. Each device has 3 structures:
>> - SPA structure, defines the PMEM region info
>>
>> - MEM DEV structure, it has the @handle which is used to associate the
>>    specified ACPI NVDIMM device we will introduce in a later patch.
>>    Also we can happily ignore the memory device's interleave; the real
>>    nvdimm hardware access is hidden behind the host
>>
>> - DCR structure, it defines vendor ID used to associate specified vendor
>>    nvdimm driver. Since we only implement PMEM mode this time, Command
>>    window and Data window are not needed
>>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> ---
>>   hw/i386/acpi-build.c     |   4 +
>>   hw/mem/nvdimm/acpi.c     | 209 ++++++++++++++++++++++++++++++++++++++++++++++-
>>   hw/mem/nvdimm/internal.h |  13 +++
>>   hw/mem/nvdimm/nvdimm.c   |  25 ++++++
>>   include/hw/mem/nvdimm.h  |   2 +
>>   5 files changed, 252 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
>> index 95e0c65..c637dc8 100644
>> --- a/hw/i386/acpi-build.c
>> +++ b/hw/i386/acpi-build.c
>> @@ -1661,6 +1661,7 @@ static bool acpi_has_iommu(void)
>>   static
>>   void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
>>   {
>> +    PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
>
> I don't like more code poking at machine directly.
> I know srat does it, and I don't like it. Any chance you can add
> acpi_get_nvdumm_info to get all you need from nvdimm state?

Do you mean introducing a wrapper to do this, like:
struct nvdimm_state *acpi_get_nvdimm_info(void)
{
	return &PC_MACHINE(qdev_get_machine())->nvdimm_memory;
}

Or should we maintain the nvdimm state in another place (for example,
a global variable in nvdimm.c)?

>
>>       GArray *table_offsets;
>>       unsigned facs, ssdt, dsdt, rsdt;
>>       AcpiCpuInfo cpu;
>> @@ -1742,6 +1743,9 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
>>           build_dmar_q35(tables_blob, tables->linker);
>>       }
>>
>> +    nvdimm_build_acpi_table(&pcms->nvdimm_memory, table_offsets, tables_blob,
>> +                            tables->linker);
>> +
>>       /* Add tables supplied by user (if any) */
>>       for (u = acpi_table_first(); u; u = acpi_table_next(u)) {
>>           unsigned len = acpi_table_len(u);
>> diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
>> index b640874..62b1e02 100644
>> --- a/hw/mem/nvdimm/acpi.c
>> +++ b/hw/mem/nvdimm/acpi.c
>> @@ -32,6 +32,46 @@
>>   #include "hw/mem/nvdimm.h"
>>   #include "internal.h"
>>
>> +static void nfit_spa_uuid_pm(uuid_le *uuid)
>> +{
>> +    uuid_le uuid_pm = UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d,
>> +                              0x33, 0x18, 0xb7, 0x8c, 0xdb);
>> +    memcpy(uuid, &uuid_pm, sizeof(uuid_pm));
>> +}
>> +
>
> Just add a static constant:
>      const uint8_t nfit_spa_uuid[] = {0x79, 0xd3, ..... }
> then memcpy instead of a wrapper.

Okay, good to me.

>
>> +enum {
>> +    NFIT_STRUCTURE_SPA = 0,
>> +    NFIT_STRUCTURE_MEMDEV = 1,
>> +    NFIT_STRUCTURE_IDT = 2,
>> +    NFIT_STRUCTURE_SMBIOS = 3,
>> +    NFIT_STRUCTURE_DCR = 4,
>> +    NFIT_STRUCTURE_BDW = 5,
>> +    NFIT_STRUCTURE_FLUSH = 6,
>> +};
>> +
>> +enum {
>> +    EFI_MEMORY_UC = 0x1ULL,
>> +    EFI_MEMORY_WC = 0x2ULL,
>> +    EFI_MEMORY_WT = 0x4ULL,
>> +    EFI_MEMORY_WB = 0x8ULL,
>> +    EFI_MEMORY_UCE = 0x10ULL,
>> +    EFI_MEMORY_WP = 0x1000ULL,
>> +    EFI_MEMORY_RP = 0x2000ULL,
>> +    EFI_MEMORY_XP = 0x4000ULL,
>> +    EFI_MEMORY_NV = 0x8000ULL,
>> +    EFI_MEMORY_MORE_RELIABLE = 0x10000ULL,
>> +};
>> +
>> +/*
>> + * NVDIMM Firmware Interface Table
>> + * @signature: "NFIT"
>> + */
>> +struct nfit {
>> +    ACPI_TABLE_HEADER_DEF
>> +    uint32_t reserved;
>> +} QEMU_PACKED;
>> +typedef struct nfit nfit;
>> +
>>   /* System Physical Address Range Structure */
>>   struct nfit_spa {
>>       uint16_t type;
>> @@ -40,13 +80,21 @@ struct nfit_spa {
>>       uint16_t flags;
>>       uint32_t reserved;
>>       uint32_t proximity_domain;
>> -    uint8_t type_guid[16];
>> +    uuid_le type_guid;
>>       uint64_t spa_base;
>>       uint64_t spa_length;
>>       uint64_t mem_attr;
>>   } QEMU_PACKED;
>>   typedef struct nfit_spa nfit_spa;
>>
>> +/*
>> + * Control region is strictly for management during hot add/online
>> + * operation.
>> + */
>> +#define SPA_FLAGS_ADD_ONLINE_ONLY     (1)
>
> unused

Indeed, vNVDIMM does not currently use this flag; it just introduces
the definition following the spec.

I do not see the harm in these macros. Is it really unacceptable,
or is it just the programming style in QEMU?

>
>> +/* Data in Proximity Domain field is valid. */
>> +#define SPA_FLAGS_PROXIMITY_VALID     (1 << 1)
>> +
>>   /* Memory Device to System Physical Address Range Mapping Structure */
>>   struct nfit_memdev {
>>       uint16_t type;
>> @@ -91,12 +139,20 @@ struct nfit_dcr {
>>   } QEMU_PACKED;
>>   typedef struct nfit_dcr nfit_dcr;
>>
>> +#define REVSISON_ID    1
>> +#define NFIT_FIC1      0x201
>> +
>>   static uint64_t nvdimm_device_structure_size(uint64_t slots)
>>   {
>>       /* each nvdimm has three structures. */
>>       return slots * (sizeof(nfit_spa) + sizeof(nfit_memdev) + sizeof(nfit_dcr));
>>   }
>>
>> +static uint64_t get_nfit_total_size(uint64_t slots)
>> +{
>> +    return sizeof(struct nfit) + nvdimm_device_structure_size(slots);
>> +}
>> +
>>   static uint64_t nvdimm_acpi_memory_size(uint64_t slots, uint64_t page_size)
>>   {
>>       uint64_t size = nvdimm_device_structure_size(slots);
>> @@ -118,3 +174,154 @@ void nvdimm_init_memory_state(NVDIMMState *state, MemoryRegion*system_memory,
>>                          NVDIMM_ACPI_MEM_SIZE);
>>       memory_region_add_subregion(system_memory, state->base, &state->mr);
>>   }
>> +
>> +static uint32_t nvdimm_slot_to_sn(int slot)
>> +{
>> +    return 0x123456 + slot;
>> +}
>> +
>> +static uint32_t nvdimm_slot_to_handle(int slot)
>> +{
>> +    return slot + 1;
>> +}
>> +
>> +static uint16_t nvdimm_slot_to_spa_index(int slot)
>> +{
>> +    return (slot + 1) << 1;
>> +}
>> +
>> +static uint32_t nvdimm_slot_to_dcr_index(int slot)
>> +{
>> +    return nvdimm_slot_to_spa_index(slot) + 1;
>> +}
>> +
>
> There are lots of magic numbers here with no comments.
> Pls explain the logic in code comments.

Okay, will comment these carefully in the next version.

>
>> +static int build_structure_spa(void *buf, NVDIMMDevice *nvdimm)
>
> Pls document the specific chapter that this implements.
>
> same everywhere else.

Good style indeed, will do.

>> +{
>> +    nfit_spa *nfit_spa;
>> +    uint64_t addr = object_property_get_int(OBJECT(nvdimm), DIMM_ADDR_PROP,
>> +                                            NULL);
>> +    uint64_t size = object_property_get_int(OBJECT(nvdimm), DIMM_SIZE_PROP,
>> +                                            NULL);
>> +    uint32_t node = object_property_get_int(OBJECT(nvdimm), DIMM_NODE_PROP,
>> +                                            NULL);
>> +    int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
>> +                                            NULL);
>> +
>> +    nfit_spa = buf;
>> +
>> +    nfit_spa->type = cpu_to_le16(NFIT_STRUCTURE_SPA);
>
> Don't do these 1-time enums. They are hard to match against spec.
>
>         nfit_spa->type = cpu_to_le16(0 /* System Physical Address Range Structure */);
>
> same everywhere else.

Yeah, lesson learned.

>
>> +    nfit_spa->length = cpu_to_le16(sizeof(*nfit_spa));
>> +    nfit_spa->spa_index = cpu_to_le16(nvdimm_slot_to_spa_index(slot));
>> +    nfit_spa->flags = cpu_to_le16(SPA_FLAGS_PROXIMITY_VALID);
>> +    nfit_spa->proximity_domain = cpu_to_le32(node);
>> +    nfit_spa_uuid_pm(&nfit_spa->type_guid);
>> +    nfit_spa->spa_base = cpu_to_le64(addr);
>> +    nfit_spa->spa_length = cpu_to_le64(size);
>> +    nfit_spa->mem_attr = cpu_to_le64(EFI_MEMORY_WB | EFI_MEMORY_NV);
>> +
>> +    return sizeof(*nfit_spa);
>> +}
>> +
>> +static int build_structure_memdev(void *buf, NVDIMMDevice *nvdimm)
>> +{
>> +    nfit_memdev *nfit_memdev;
>> +    uint64_t addr = object_property_get_int(OBJECT(nvdimm), DIMM_ADDR_PROP,
>> +                                            NULL);
>> +    uint64_t size = object_property_get_int(OBJECT(nvdimm), DIMM_SIZE_PROP,
>> +                                            NULL);
>> +    int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
>> +                                            NULL);
>> +    uint32_t handle = nvdimm_slot_to_handle(slot);
>> +
>> +    nfit_memdev = buf;
>> +    nfit_memdev->type = cpu_to_le16(NFIT_STRUCTURE_MEMDEV);
>> +    nfit_memdev->length = cpu_to_le16(sizeof(*nfit_memdev));
>> +    nfit_memdev->nfit_handle = cpu_to_le32(handle);
>> +    /* point to nfit_spa. */
>> +    nfit_memdev->spa_index = cpu_to_le16(nvdimm_slot_to_spa_index(slot));
>> +    /* point to nfit_dcr. */
>> +    nfit_memdev->dcr_index = cpu_to_le16(nvdimm_slot_to_dcr_index(slot));
>> +    nfit_memdev->region_len = cpu_to_le64(size);
>> +    nfit_memdev->region_dpa = cpu_to_le64(addr);
>> +    /* Only one interleave for pmem. */
>> +    nfit_memdev->interleave_ways = cpu_to_le16(1);
>> +
>> +    return sizeof(*nfit_memdev);
>> +}
>> +
>> +static int build_structure_dcr(void *buf, NVDIMMDevice *nvdimm)
>> +{
>> +    nfit_dcr *nfit_dcr;
>> +    int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
>> +                                       NULL);
>> +    uint32_t sn = nvdimm_slot_to_sn(slot);
>> +
>> +    nfit_dcr = buf;
>> +    nfit_dcr->type = cpu_to_le16(NFIT_STRUCTURE_DCR);
>> +    nfit_dcr->length = cpu_to_le16(sizeof(*nfit_dcr));
>> +    nfit_dcr->dcr_index = cpu_to_le16(nvdimm_slot_to_dcr_index(slot));
>> +    nfit_dcr->vendor_id = cpu_to_le16(0x8086);
>> +    nfit_dcr->device_id = cpu_to_le16(1);
>> +    nfit_dcr->revision_id = cpu_to_le16(REVSISON_ID);
>> +    nfit_dcr->serial_number = cpu_to_le32(sn);
>> +    nfit_dcr->fic = cpu_to_le16(NFIT_FIC1);
>> +
>> +    return sizeof(*nfit_dcr);
>> +}
>> +
>> +static void build_device_structure(GSList *device_list, char *buf)
>> +{
>> +    buf += sizeof(nfit);
>> +
>> +    for (; device_list; device_list = device_list->next) {
>> +        NVDIMMDevice *nvdimm = device_list->data;
>> +
>> +        /* build System Physical Address Range Description Table. */
>> +        buf += build_structure_spa(buf, nvdimm);
>> +
>> +        /*
>> +         * build Memory Device to System Physical Address Range Mapping
>> +         * Table.
>> +         */
>> +        buf += build_structure_memdev(buf, nvdimm);
>> +
>> +        /* build Control Region Descriptor Table. */
>> +        buf += build_structure_dcr(buf, nvdimm);
>> +    }
>> +}
>> +
>> +static void build_nfit(GSList *device_list, GArray *table_offsets,
>> +                       GArray *table_data, GArray *linker)
>> +{
>> +    size_t total;
>> +    char *buf;
>> +    int nfit_start, nr;
>> +
>> +    nr = g_slist_length(device_list);
>> +    total = get_nfit_total_size(nr);
>> +
>> +    nfit_start = table_data->len;
>> +    acpi_add_table(table_offsets, table_data);
>> +
>> +    buf = acpi_data_push(table_data, total);
>> +    build_device_structure(device_list, buf);
>
> This seems fragile. Should build_device_structure overflow
> a buffer we'll corrupt memory.
> Current code does use acpi_data_push but only for trivial
> things like fixed size headers.
> Can't you use glib to dynamically append things to table
> as they are generated?
>

Okay, good point, will adjust it in the next version.

>
>> +
>> +    build_header(linker, table_data, (void *)(table_data->data + nfit_start),
>> +                 "NFIT", table_data->len - nfit_start, 1);
>> +}
>> +
>> +void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
>> +                             GArray *table_data, GArray *linker)
>> +{
>> +    GSList *device_list = nvdimm_get_built_list();
>> +
>> +    if (!memory_region_size(&state->mr)) {
>> +        assert(!device_list);
>> +        return;
>> +    }
>> +
>> +    if (device_list) {
>> +        build_nfit(device_list, table_offsets, table_data, linker);
>> +        g_slist_free(device_list);
>> +    }
>> +}
>> diff --git a/hw/mem/nvdimm/internal.h b/hw/mem/nvdimm/internal.h
>> index c4ba750..5551448 100644
>> --- a/hw/mem/nvdimm/internal.h
>> +++ b/hw/mem/nvdimm/internal.h
>> @@ -14,4 +14,17 @@
>>   #define NVDIMM_INTERNAL_H
>>
>>   #define MIN_NAMESPACE_LABEL_SIZE    (128UL << 10)
>> +
>> +struct uuid_le {
>> +    uint8_t b[16];
>> +};
>> +typedef struct uuid_le uuid_le;
>> +
>> +#define UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)                   \
>> +((uuid_le)                                                                 \
>> +{ { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
>> +    (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff,          \
>> +    (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) } })
>> +
>
> Please avoid polluting the global namespace.
> Prefix everything with NVDIMM.

Hmm... this include file, "internal.h", is located at hw/mem/nvdimm/ and
is only used internally by NVDIMM. But your point is well taken; I will
carefully name the things defined in an include file.

>
>> +GSList *nvdimm_get_built_list(void);
>
> You are adding an extern function with no comment
> about it's purpose anywhere. Pls fix this.
> The name isn't pretty. What does "built" mean?
> List of what? Is this a device list?

I followed the style of pc_dimm_built_list() in pc-dimm.c; I will
rename it to nvdimm_device_list() to better match what it does.

>
>>   #endif
>
> This header is too small to be worth it.
> nvdimm_get_built_list seems to be the only interface -
> just stick it in the header you have under include.
>

Other functions are introduced and added to it in later patches; it holds
the internal things shared between the nvdimm device, nvdimm ACPI, and
nvdimm namespace code.

Furthermore, since this is an internal include file, I think it is not bad.


^ permalink raw reply	[flat|nested] 200+ messages in thread

>> +
>> +    nfit_spa = buf;
>> +
>> +    nfit_spa->type = cpu_to_le16(NFIT_STRUCTURE_SPA);
>
> Don't do these 1-time enums. They are hard to match against spec.
>
>         nfit_spa->type = cpu_to_le16(0 /* System Physical Address Range Structure */);
>
> same everywhere else.

Yeah, lesson learned.

>
>> +    nfit_spa->length = cpu_to_le16(sizeof(*nfit_spa));
>> +    nfit_spa->spa_index = cpu_to_le16(nvdimm_slot_to_spa_index(slot));
>> +    nfit_spa->flags = cpu_to_le16(SPA_FLAGS_PROXIMITY_VALID);
>> +    nfit_spa->proximity_domain = cpu_to_le32(node);
>> +    nfit_spa_uuid_pm(&nfit_spa->type_guid);
>> +    nfit_spa->spa_base = cpu_to_le64(addr);
>> +    nfit_spa->spa_length = cpu_to_le64(size);
>> +    nfit_spa->mem_attr = cpu_to_le64(EFI_MEMORY_WB | EFI_MEMORY_NV);
>> +
>> +    return sizeof(*nfit_spa);
>> +}
>> +
>> +static int build_structure_memdev(void *buf, NVDIMMDevice *nvdimm)
>> +{
>> +    nfit_memdev *nfit_memdev;
>> +    uint64_t addr = object_property_get_int(OBJECT(nvdimm), DIMM_ADDR_PROP,
>> +                                            NULL);
>> +    uint64_t size = object_property_get_int(OBJECT(nvdimm), DIMM_SIZE_PROP,
>> +                                            NULL);
>> +    int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
>> +                                            NULL);
>> +    uint32_t handle = nvdimm_slot_to_handle(slot);
>> +
>> +    nfit_memdev = buf;
>> +    nfit_memdev->type = cpu_to_le16(NFIT_STRUCTURE_MEMDEV);
>> +    nfit_memdev->length = cpu_to_le16(sizeof(*nfit_memdev));
>> +    nfit_memdev->nfit_handle = cpu_to_le32(handle);
>> +    /* point to nfit_spa. */
>> +    nfit_memdev->spa_index = cpu_to_le16(nvdimm_slot_to_spa_index(slot));
>> +    /* point to nfit_dcr. */
>> +    nfit_memdev->dcr_index = cpu_to_le16(nvdimm_slot_to_dcr_index(slot));
>> +    nfit_memdev->region_len = cpu_to_le64(size);
>> +    nfit_memdev->region_dpa = cpu_to_le64(addr);
>> +    /* Only one interleave for pmem. */
>> +    nfit_memdev->interleave_ways = cpu_to_le16(1);
>> +
>> +    return sizeof(*nfit_memdev);
>> +}
>> +
>> +static int build_structure_dcr(void *buf, NVDIMMDevice *nvdimm)
>> +{
>> +    nfit_dcr *nfit_dcr;
>> +    int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
>> +                                       NULL);
>> +    uint32_t sn = nvdimm_slot_to_sn(slot);
>> +
>> +    nfit_dcr = buf;
>> +    nfit_dcr->type = cpu_to_le16(NFIT_STRUCTURE_DCR);
>> +    nfit_dcr->length = cpu_to_le16(sizeof(*nfit_dcr));
>> +    nfit_dcr->dcr_index = cpu_to_le16(nvdimm_slot_to_dcr_index(slot));
>> +    nfit_dcr->vendor_id = cpu_to_le16(0x8086);
>> +    nfit_dcr->device_id = cpu_to_le16(1);
>> +    nfit_dcr->revision_id = cpu_to_le16(REVISION_ID);
>> +    nfit_dcr->serial_number = cpu_to_le32(sn);
>> +    nfit_dcr->fic = cpu_to_le16(NFIT_FIC1);
>> +
>> +    return sizeof(*nfit_dcr);
>> +}
>> +
>> +static void build_device_structure(GSList *device_list, char *buf)
>> +{
>> +    buf += sizeof(nfit);
>> +
>> +    for (; device_list; device_list = device_list->next) {
>> +        NVDIMMDevice *nvdimm = device_list->data;
>> +
>> +        /* build System Physical Address Range Description Table. */
>> +        buf += build_structure_spa(buf, nvdimm);
>> +
>> +        /*
>> +         * build Memory Device to System Physical Address Range Mapping
>> +         * Table.
>> +         */
>> +        buf += build_structure_memdev(buf, nvdimm);
>> +
>> +        /* build Control Region Descriptor Table. */
>> +        buf += build_structure_dcr(buf, nvdimm);
>> +    }
>> +}
>> +
>> +static void build_nfit(GSList *device_list, GArray *table_offsets,
>> +                       GArray *table_data, GArray *linker)
>> +{
>> +    size_t total;
>> +    char *buf;
>> +    int nfit_start, nr;
>> +
>> +    nr = g_slist_length(device_list);
>> +    total = get_nfit_total_size(nr);
>> +
>> +    nfit_start = table_data->len;
>> +    acpi_add_table(table_offsets, table_data);
>> +
>> +    buf = acpi_data_push(table_data, total);
>> +    build_device_structure(device_list, buf);
>
> This seems fragile. Should build_device_structure overflow
> a buffer we'll corrupt memory.
> Current code does use acpi_data_push but only for trivial
> things like fixed size headers.
> Can't you use glib to dynamically append things to table
> as they are generated?
>

Okay, good point, will adjust it in the next version.

>
>> +
>> +    build_header(linker, table_data, (void *)(table_data->data + nfit_start),
>> +                 "NFIT", table_data->len - nfit_start, 1);
>> +}
>> +
>> +void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
>> +                             GArray *table_data, GArray *linker)
>> +{
>> +    GSList *device_list = nvdimm_get_built_list();
>> +
>> +    if (!memory_region_size(&state->mr)) {
>> +        assert(!device_list);
>> +        return;
>> +    }
>> +
>> +    if (device_list) {
>> +        build_nfit(device_list, table_offsets, table_data, linker);
>> +        g_slist_free(device_list);
>> +    }
>> +}
>> diff --git a/hw/mem/nvdimm/internal.h b/hw/mem/nvdimm/internal.h
>> index c4ba750..5551448 100644
>> --- a/hw/mem/nvdimm/internal.h
>> +++ b/hw/mem/nvdimm/internal.h
>> @@ -14,4 +14,17 @@
>>   #define NVDIMM_INTERNAL_H
>>
>>   #define MIN_NAMESPACE_LABEL_SIZE    (128UL << 10)
>> +
>> +struct uuid_le {
>> +    uint8_t b[16];
>> +};
>> +typedef struct uuid_le uuid_le;
>> +
>> +#define UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)                   \
>> +((uuid_le)                                                                 \
>> +{ { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
>> +    (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff,          \
>> +    (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) } })
>> +
>
> Please avoid polluting the global namespace.
> Prefix everything with NVDIMM.

Hmm... this include file, "internal.h", lives at hw/mem/nvdimm/ and is
only used internally by NVDIMM. But your point is well taken; I will carefully
name the stuff defined in an include file.

>
>> +GSList *nvdimm_get_built_list(void);
>
> You are adding an extern function with no comment
> about its purpose anywhere. Pls fix this.
> The name isn't pretty. What does "built" mean?
> List of what? Is this a device list?

I used the style of pc_dimm_built_list() in pc-dimm.c; I will
rename it to nvdimm_device_list() to better match what it does.

>
>>   #endif
>
> This header is too small to be worth it.
> nvdimm_get_built_list seems to be the only interface -
> just stick it in the header you have under include.
>

Other functions are introduced and included in it in later patches;
it contains the internal things shared between the nvdimm device, nvdimm ACPI,
and nvdimm namespace code.

Furthermore, this is an internal include file, so I think it is not bad.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 23/32] nvdimm: build ACPI NFIT table
  2015-10-12 16:40     ` [Qemu-devel] " Dan Williams
@ 2015-10-13  5:17       ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-13  5:17 UTC (permalink / raw)
  To: Dan Williams
  Cc: Paolo Bonzini, imammedo, Gleb Natapov, mtosatti, stefanha,
	Michael S. Tsirkin, rth, ehabkost, KVM list, qemu-devel



On 10/13/2015 12:40 AM, Dan Williams wrote:
> On Sat, Oct 10, 2015 at 8:52 PM, Xiao Guangrong
> <guangrong.xiao@linux.intel.com> wrote:
>> NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT)
>>
>> Currently, we only support PMEM mode. Each device has 3 structures:
>> - SPA structure, defines the PMEM region info
>>
>> - MEM DEV structure, it has the @handle which is used to associate specified
>>    ACPI NVDIMM  device we will introduce in later patch.
>>    Also we can happily ignored the memory device's interleave, the real
>>    nvdimm hardware access is hidden behind host
>>
>> - DCR structure, it defines vendor ID used to associate specified vendor
>>    nvdimm driver. Since we only implement PMEM mode this time, Command
>>    window and Data window are not needed
>>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> ---
>>   hw/i386/acpi-build.c     |   4 +
>>   hw/mem/nvdimm/acpi.c     | 209 ++++++++++++++++++++++++++++++++++++++++++++++-
>>   hw/mem/nvdimm/internal.h |  13 +++
>>   hw/mem/nvdimm/nvdimm.c   |  25 ++++++
>>   include/hw/mem/nvdimm.h  |   2 +
>>   5 files changed, 252 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
>> index 95e0c65..c637dc8 100644
>> --- a/hw/i386/acpi-build.c
>> +++ b/hw/i386/acpi-build.c
>> @@ -1661,6 +1661,7 @@ static bool acpi_has_iommu(void)
>>   static
>>   void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
>>   {
>> +    PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
>>       GArray *table_offsets;
>>       unsigned facs, ssdt, dsdt, rsdt;
>>       AcpiCpuInfo cpu;
>> @@ -1742,6 +1743,9 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
>>           build_dmar_q35(tables_blob, tables->linker);
>>       }
>>
>> +    nvdimm_build_acpi_table(&pcms->nvdimm_memory, table_offsets, tables_blob,
>> +                            tables->linker);
>> +
>>       /* Add tables supplied by user (if any) */
>>       for (u = acpi_table_first(); u; u = acpi_table_next(u)) {
>>           unsigned len = acpi_table_len(u);
>> diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
>> index b640874..62b1e02 100644
>> --- a/hw/mem/nvdimm/acpi.c
>> +++ b/hw/mem/nvdimm/acpi.c
>> @@ -32,6 +32,46 @@
>>   #include "hw/mem/nvdimm.h"
>>   #include "internal.h"
>>
>> +static void nfit_spa_uuid_pm(uuid_le *uuid)
>> +{
>> +    uuid_le uuid_pm = UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d,
>> +                              0x33, 0x18, 0xb7, 0x8c, 0xdb);
>> +    memcpy(uuid, &uuid_pm, sizeof(uuid_pm));
>> +}
>> +
>> +enum {
>> +    NFIT_STRUCTURE_SPA = 0,
>> +    NFIT_STRUCTURE_MEMDEV = 1,
>> +    NFIT_STRUCTURE_IDT = 2,
>> +    NFIT_STRUCTURE_SMBIOS = 3,
>> +    NFIT_STRUCTURE_DCR = 4,
>> +    NFIT_STRUCTURE_BDW = 5,
>> +    NFIT_STRUCTURE_FLUSH = 6,
>> +};
>> +
>> +enum {
>> +    EFI_MEMORY_UC = 0x1ULL,
>> +    EFI_MEMORY_WC = 0x2ULL,
>> +    EFI_MEMORY_WT = 0x4ULL,
>> +    EFI_MEMORY_WB = 0x8ULL,
>> +    EFI_MEMORY_UCE = 0x10ULL,
>> +    EFI_MEMORY_WP = 0x1000ULL,
>> +    EFI_MEMORY_RP = 0x2000ULL,
>> +    EFI_MEMORY_XP = 0x4000ULL,
>> +    EFI_MEMORY_NV = 0x8000ULL,
>> +    EFI_MEMORY_MORE_RELIABLE = 0x10000ULL,
>> +};
>
> Would it worth including / copying the ACPICA header files directly?
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/acpi/actbl1.h
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/acpi/acuuid.h

Good point, Dan.

These files are not exported under uapi/, so it is not good to include them
directly; I will study the definitions and adapt them to QEMU's code style
in the next version.

Thanks!



^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 00/32] implement vNVDIMM
  2015-10-12 11:55   ` [Qemu-devel] " Michael S. Tsirkin
@ 2015-10-13  5:29     ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-13  5:29 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/12/2015 07:55 PM, Michael S. Tsirkin wrote:
> On Sun, Oct 11, 2015 at 11:52:32AM +0800, Xiao Guangrong wrote:
>> Changelog in v3:
>> There is huge change in this version, thank Igor, Stefan, Paolo, Eduardo,
>> Michael for their valuable comments, the patchset finally gets better shape.
>
> Thanks!
> This needs some changes in coding style, and more comments, to
> make it easier to maintain going forward.

Thanks for your review, Michael. I have learned lots of things from
your comments.

>
> High level comments - I didn't point out all instances,
> please go over code and locate them yourself.
> I focused on acpi code in this review.

Okay, will do.

>
>      - fix coding style violations, prefix everything with nvdimm_ etc

Actually, I did not pay attention to naming the stuff which is only used
internally. Thank you for pointing it out; I will fix it in the next version.

>      - in acpi code, avoid manual memory management/complex pointer math

I am not very good at ACPI ASL/AML, could you please give more detail?

>      - comments are needed to document apis & explain what's going on
>      - constants need comments too, refer to text that
>        can be looked up in acpi spec verbatim

Indeed, will document carefully.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 23/32] nvdimm: build ACPI NFIT table
  2015-10-13  5:13       ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-13  5:42         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-13  5:42 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Tue, Oct 13, 2015 at 01:13:18PM +0800, Xiao Guangrong wrote:
> >
> >>  #endif
> >
> >This header is too small to be worth it.
> >nvdimm_get_built_list seems to be the only interface -
> >just stick it in the header you have under include.
> >
> 
> Other functions are introduced and included in it in later patches;
> it contains the internal things shared between the nvdimm device, nvdimm ACPI,
> and nvdimm namespace code.
> 
> Furthermore, this is an internal include file, so I think it is not bad.

Each time we do this, this seems to invite abuse where
people add APIs without documenting them.

I guess I could buy this if you add nvdimm_defs.h
with just internal things such as layout of the buffer
used for communication between ACPI and hardware.

-- 
MST

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 00/32] implement vNVDIMM
  2015-10-13  3:38           ` Dan Williams
@ 2015-10-13  5:49             ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-13  5:49 UTC (permalink / raw)
  To: Dan Williams
  Cc: ehabkost, KVM list, Michael S. Tsirkin, Gleb Natapov, mtosatti,
	qemu-devel, stefanha, Paolo Bonzini, imammedo, rth



On 10/13/2015 11:38 AM, Dan Williams wrote:
> On Mon, Oct 12, 2015 at 8:14 PM, Xiao Guangrong
> <guangrong.xiao@linux.intel.com> wrote:
>> On 10/13/2015 12:36 AM, Dan Williams wrote:
>>> Static namespaces can be emitted without a label.  Linux needs this to
>>> support existing "label-less" bare metal NVDIMMs.
>>
>>
>> This is Linux specific? As i did not see it has been documented in the
>> spec...
>
> I expect most NVDIMMs, especially existing ones available today, do
> not have a label area.  This is not Linux specific and ACPI 6 does not
> specify a label area, only the Intel DSM Interface Example.
>

Yup, label data is accessed via the DSM interface; the spec I mentioned
is the Intel DSM Interface Example.

However, IIRC the Linux NVDIMM driver refuses to use the device if there is
no DSM GET_LABEL support. Are you going to update that?

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 00/32] implement vNVDIMM
  2015-10-13  5:57       ` [Qemu-devel] " Michael S. Tsirkin
@ 2015-10-13  5:52         ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-13  5:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/13/2015 01:57 PM, Michael S. Tsirkin wrote:
> On Tue, Oct 13, 2015 at 01:29:48PM +0800, Xiao Guangrong wrote:
>>
>>
>> On 10/12/2015 07:55 PM, Michael S. Tsirkin wrote:
>>> On Sun, Oct 11, 2015 at 11:52:32AM +0800, Xiao Guangrong wrote:
>>>> Changelog in v3:
>>>> There is huge change in this version, thank Igor, Stefan, Paolo, Eduardo,
>>>> Michael for their valuable comments, the patchset finally gets better shape.
>>>
>>> Thanks!
>>> This needs some changes in coding style, and more comments, to
>>> make it easier to maintain going forward.
>>
>> Thanks for your review, Michael. I have learned lots of things from
>> your comments.
>>
>>>
>>> High level comments - I didn't point out all instances,
>>> please go over code and locate them yourself.
>>> I focused on acpi code in this review.
>>
>> Okay, will do.
>>
>>>
>>>      - fix coding style violations, prefix everything with nvdimm_ etc
>>
>> Actually, I did not pay attention to naming the stuff which is only used
>> internally. Thank you for pointing it out; I will fix it in the next version.
>>
>>>      - in acpi code, avoid manual memory management/complex pointer math
>>
>> I am not very good at ACPI ASL/AML, could you please give more detail?
>
> It's about C.
>
> For example:
> 	Foo *foo = acpi_data_push(table, sizeof *foo);
> 	Bar *foo = acpi_data_push(table, sizeof *bar);
> is pretty obviously safe, and it doesn't require you to do any
> calculations.
> 	char *buf = acpi_data_push(table, sizeof *foo + sizeof *bar);
> is worse, now you need:
> 	Bar *bar = (Bar *)(buf + sizeof *foo);
> which will corrupt memory if you get the size wrong in push.

Ah, got it. :)

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 00/32] implement vNVDIMM
  2015-10-13  5:29     ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-13  5:57       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-13  5:57 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Tue, Oct 13, 2015 at 01:29:48PM +0800, Xiao Guangrong wrote:
> 
> 
> On 10/12/2015 07:55 PM, Michael S. Tsirkin wrote:
> >On Sun, Oct 11, 2015 at 11:52:32AM +0800, Xiao Guangrong wrote:
> >>Changelog in v3:
> >>There is huge change in this version, thank Igor, Stefan, Paolo, Eduardo,
> >>Michael for their valuable comments, the patchset finally gets better shape.
> >
> >Thanks!
> >This needs some changes in coding style, and more comments, to
> >make it easier to maintain going forward.
> 
> Thanks for your review, Michael. I have learned lots of things from
> your comments.
> 
> >
> >High level comments - I didn't point out all instances,
> >please go over code and locate them yourself.
> >I focused on acpi code in this review.
> 
> Okay, will do.
> 
> >
> >     - fix coding style violations, prefix everything with nvdimm_ etc
> 
> Actually, I did not pay attention to naming the stuff which is only used
> internally. Thank you for pointing it out; I will fix it in the next version.
> 
> >     - in acpi code, avoid manual memory management/complex pointer math
> 
> I am not very good at ACPI ASL/AML, could you please give more detail?

It's about C.

For example:
	Foo *foo = acpi_data_push(table, sizeof *foo);
	Bar *foo = acpi_data_push(table, sizeof *bar);
is pretty obviously safe, and it doesn't require you to do any
calculations.
	char *buf = acpi_data_push(table, sizeof *foo + sizeof *bar);
is worse, now you need:
	Bar *bar = (Bar *)(buf + sizeof *foo);
which will corrupt memory if you get the size wrong in push.

> >     - comments are needed to document apis & explain what's going on
> >     - constants need comments too, refer to text that
> >       can be looked up in acpi spec verbatim
> 
> Indeed, will document carefully.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 00/32] implement vNVDIMM
@ 2015-10-13  5:57       ` Michael S. Tsirkin
  0 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-13  5:57 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: ehabkost, kvm, gleb, mtosatti, qemu-devel, stefanha, imammedo,
	pbonzini, dan.j.williams, rth

On Tue, Oct 13, 2015 at 01:29:48PM +0800, Xiao Guangrong wrote:
> 
> 
> On 10/12/2015 07:55 PM, Michael S. Tsirkin wrote:
> >On Sun, Oct 11, 2015 at 11:52:32AM +0800, Xiao Guangrong wrote:
> >>Changelog in v3:
> >>There is huge change in this version, thank Igor, Stefan, Paolo, Eduardo,
> >>Michael for their valuable comments, the patchset finally gets better shape.
> >
> >Thanks!
> >This needs some changes in coding style, and more comments, to
> >make it easier to maintain going forward.
> 
> Thanks for your review, Michael. I have learned lots of thing from
> your comments.
> 
> >
> >High level comments - I didn't point out all instances,
> >please go over code and locate them yourself.
> >I focused on acpi code in this review.
> 
> Okay, will do.
> 
> >
> >     - fix coding style violations, prefix everything with nvdimm_ etc
> 
> Actually I did not pay attention to naming things that are only used
> internally. Thank you for pointing it out; I will fix it in the next version.
> 
> >     - in acpi code, avoid manual memory management/complex pointer math
> 
> I am not very good at ACPI ASL/AML, could you please give more detail?

It's about C.

For example:
	Foo *foo = acpi_data_push(table, sizeof *foo);
	Bar *bar = acpi_data_push(table, sizeof *bar);
is pretty obviously safe, and it doesn't require you to do any
calculations.
	char *buf = acpi_data_push(table, sizeof *foo + sizeof *bar);
is worse, now you need:
	Bar *bar = (Bar *)(buf + sizeof *foo);
which will corrupt memory if you get the size wrong in push.
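The difference can be seen in a minimal standalone sketch. Here `data_push()` and `TableBuf` are illustrative stand-ins for QEMU's GArray-backed `acpi_data_push()`, not the real API:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for acpi_data_push(): grow the buffer by @size
 * zeroed bytes and return a pointer to the newly added tail region. */
typedef struct {
    uint8_t *data;
    size_t len;
} TableBuf;

static void *data_push(TableBuf *table, size_t size)
{
    table->data = realloc(table->data, table->len + size);
    memset(table->data + table->len, 0, size);
    void *p = table->data + table->len;
    table->len += size;
    return p;
}

typedef struct { uint32_t a; } Foo;
typedef struct { uint16_t b; } Bar;

static size_t build_foo_bar(TableBuf *table)
{
    /* One typed push per structure: no manual offset arithmetic, so a
     * wrong size cannot silently land Bar on top of Foo. */
    Foo *foo = data_push(table, sizeof *foo);
    foo->a = 1;   /* fill before the next push: realloc may move the buffer */

    Bar *bar = data_push(table, sizeof *bar);
    bar->b = 2;

    return table->len;
}
```

Note that each structure is filled before the next push: growing the buffer may relocate it, which is one more reason to avoid keeping several pointers derived from a single oversized push.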

> >     - comments are needed to document apis & explain what's going on
> >     - constants need comments too, refer to text that
> >       can be looked up in acpi spec verbatim
> 
> Indeed, will document carefully.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 23/32] nvdimm: build ACPI NFIT table
  2015-10-13  5:42         ` [Qemu-devel] " Michael S. Tsirkin
@ 2015-10-13  6:06           ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-13  6:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/13/2015 01:42 PM, Michael S. Tsirkin wrote:
> On Tue, Oct 13, 2015 at 01:13:18PM +0800, Xiao Guangrong wrote:
>>>
>>>>   #endif
>>>
>>> This header is too small to be worth it.
>>> nvdimm_get_built_list seems to be the only interface -
>>> just stick it in the header you have under include.
>>>
>>
>> Other functions are introduced and added to it in later patches; it
>> contains the internal things shared between the nvdimm device, nvdimm ACPI
>> and nvdimm namespace code.
>>
>> Furthermore, this is an internal include file, so I think it is not bad.
>
> Each time we do this, this seems to invite abuse where
> people add APIs without documenting them.
>

Understood.

> I guess I could buy this if you add nvdimm_defs.h
> with just internal things such as layout of the buffer
> used for communication between ACPI and hardware.
>

Okay, I will rename internal.h to nvdimm_defs.h and carefully document
everything (definitions, function prototypes, etc.) in this file. :)


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 23/32] nvdimm: build ACPI NFIT table
  2015-10-13  5:17       ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-13  6:07         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-13  6:07 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Dan Williams, Paolo Bonzini, imammedo, Gleb Natapov, mtosatti,
	stefanha, rth, ehabkost, KVM list, qemu-devel

On Tue, Oct 13, 2015 at 01:17:20PM +0800, Xiao Guangrong wrote:
> >Would it worth including / copying the ACPICA header files directly?
> >
> >https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/acpi/actbl1.h
> >https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/acpi/acuuid.h
> 
> Good point, Dan.
> 
> These files are not exported under uapi/, so it is not good to include
> them directly; I will study the definitions and adapt them to QEMU's code
> style in the next version.
> 
> Thanks!
> 

You can talk to the ACPICA guys about moving acuuid.h to uapi
if you like. But there's not a lot there that we need, so
I'm not sure it's worth it.

-- 
MST

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 00/32] implement vNVDIMM
  2015-10-13  5:49             ` Xiao Guangrong
@ 2015-10-13  6:36               ` Dan Williams
  -1 siblings, 0 replies; 200+ messages in thread
From: Dan Williams @ 2015-10-13  6:36 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: ehabkost, KVM list, Michael S. Tsirkin, Gleb Natapov, mtosatti,
	qemu-devel, stefanha, Paolo Bonzini, imammedo, rth

On Mon, Oct 12, 2015 at 10:49 PM, Xiao Guangrong
<guangrong.xiao@linux.intel.com> wrote:
>
>
> On 10/13/2015 11:38 AM, Dan Williams wrote:
>>
>> On Mon, Oct 12, 2015 at 8:14 PM, Xiao Guangrong
>> <guangrong.xiao@linux.intel.com> wrote:
>>>
>>> On 10/13/2015 12:36 AM, Dan Williams wrote:
>>>>
>>>> Static namespaces can be emitted without a label.  Linux needs this to
>>>> support existing "label-less" bare metal NVDIMMs.
>>>
>>>
>>>
>>> Is this Linux specific? I did not see it documented in the
>>> spec...
>>
>>
>> I expect most NVDIMMs, especially existing ones available today, do
>> not have a label area.  This is not Linux specific and ACPI 6 does not
>> specify a label area, only the Intel DSM Interface Example.
>>
>
> Yup, label data is accessed via the DSM interface; the spec I mentioned
> is the Intel DSM Interface Example.
>
> However, IIRC the Linux NVDIMM driver refuses to use the device if there is
> no DSM GET_LABEL support; are you going to update it?

Label-less DIMMs are tested as part of the unit test [1] and the
"memmap=nn!ss" kernel parameter that registers a persistent-memory
address range without a DIMM.  What error do you see when label
support is disabled?

[1]: https://github.com/pmem/ndctl/blob/master/README.md

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 11/32] hostmem-file: use whole file size if possible
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-13 11:50     ` Vladimir Sementsov-Ogievskiy
  -1 siblings, 0 replies; 200+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-10-13 11:50 UTC (permalink / raw)
  To: Xiao Guangrong, pbonzini, imammedo
  Cc: ehabkost, kvm, mst, gleb, mtosatti, qemu-devel, stefanha,
	dan.j.williams, rth

On 11.10.2015 06:52, Xiao Guangrong wrote:
> Use the whole file size if @size is not specified which is useful
> if we want to directly pass a file to guest
>
> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> ---
>   backends/hostmem-file.c | 47 +++++++++++++++++++++++++++++++++++++++++++----
>   1 file changed, 43 insertions(+), 4 deletions(-)
>
> diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
> index 9097a57..adf2835 100644
> --- a/backends/hostmem-file.c
> +++ b/backends/hostmem-file.c
> @@ -9,6 +9,9 @@
>    * This work is licensed under the terms of the GNU GPL, version 2 or later.
>    * See the COPYING file in the top-level directory.
>    */
> +#include <sys/ioctl.h>
> +#include <linux/fs.h>
> +
>   #include "qemu-common.h"
>   #include "sysemu/hostmem.h"
>   #include "sysemu/sysemu.h"
> @@ -33,20 +36,56 @@ struct HostMemoryBackendFile {
>       char *mem_path;
>   };
>   
> +static uint64_t get_file_size(const char *file)
> +{
> +    struct stat stat_buf;
> +    uint64_t size = 0;
> +    int fd;
> +
> +    fd = open(file, O_RDONLY);
> +    if (fd < 0) {
> +        return 0;
> +    }
> +
> +    if (stat(file, &stat_buf) < 0) {
> +        goto exit;
> +    }
> +
> +    if ((S_ISBLK(stat_buf.st_mode)) && !ioctl(fd, BLKGETSIZE64, &size)) {
> +        goto exit;
> +    }
> +
> +    size = lseek(fd, 0, SEEK_END);
> +    if (size == -1) {
> +        size = 0;
> +    }
> +exit:
> +    close(fd);
> +    return size;
> +}
> +
>   static void
>   file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>   {
>       HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(backend);
>   
> -    if (!backend->size) {
> -        error_setg(errp, "can't create backend with size 0");
> -        return;
> -    }
>       if (!fb->mem_path) {
>           error_setg(errp, "mem-path property not set");
>           return;
>       }
>   
> +    if (!backend->size) {
> +        /*
> +         * use the whole file size if @size is not specified.
> +         */
> +        backend->size = get_file_size(fb->mem_path);
> +    }
> +
> +    if (!backend->size) {
> +        error_setg(errp, "can't create backend with size 0");
> +        return;
> +    }

In case of any error in get_file_size() (open, stat, lseek) it will report
"can't create backend with size 0", which may not be appropriate.
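One way to make the failure visible to the caller is sketched below. The name try_get_file_size() and its interface are illustrative, not the patch's actual code: it returns the size on success and -errno on failure, so file_backend_memory_alloc() could report the real error instead of the generic size-0 message. The block-device BLKGETSIZE64 branch is omitted for brevity.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return the file size on success, or -errno on failure, so that a
 * genuinely empty file can be told apart from a failed open/stat/lseek. */
static int64_t try_get_file_size(const char *file)
{
    struct stat st;
    int64_t size;
    int fd = open(file, O_RDONLY);

    if (fd < 0) {
        return -errno;
    }
    if (fstat(fd, &st) < 0) {
        size = -errno;
        goto out;
    }
    size = lseek(fd, 0, SEEK_END);
    if (size < 0) {
        size = -errno;
    }
out:
    close(fd);
    return size;
}
```

The caller could then emit e.g. error_setg(errp, "can't get size of %s: %s", file, strerror(-size)) for negative values, and keep the size-0 message only for truly empty backends.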

> +
>       backend->force_prealloc = mem_prealloc;
>       memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
>                                    object_get_canonical_path(OBJECT(backend)),


-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 01/32] acpi: add aml_derefof
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-13 12:30     ` Igor Mammedov
  -1 siblings, 0 replies; 200+ messages in thread
From: Igor Mammedov @ 2015-10-13 12:30 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: ehabkost, kvm, mst, gleb, mtosatti, qemu-devel, stefanha,
	pbonzini, dan.j.williams, rth

On Sun, 11 Oct 2015 11:52:33 +0800
Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:

> Implement DeRefOf term which is used by NVDIMM _DSM method in later patch
> 
> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  hw/acpi/aml-build.c         | 8 ++++++++
>  include/hw/acpi/aml-build.h | 1 +
>  2 files changed, 9 insertions(+)
> 
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index 0d4b324..cbd53f4 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -1135,6 +1135,14 @@ Aml *aml_unicode(const char *str)
>      return var;
>  }
>  
> +/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefDerefOf */
> +Aml *aml_derefof(Aml *arg)
> +{
> +    Aml *var = aml_opcode(0x83 /* DerefOfOp */);
> +    aml_append(var, arg);
> +    return var;
> +}
> +
>  void
>  build_header(GArray *linker, GArray *table_data,
>               AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> index 1b632dc..5a03d33 100644
> --- a/include/hw/acpi/aml-build.h
> +++ b/include/hw/acpi/aml-build.h
> @@ -274,6 +274,7 @@ Aml *aml_create_dword_field(Aml *srcbuf, Aml *index, const char *name);
>  Aml *aml_varpackage(uint32_t num_elements);
>  Aml *aml_touuid(const char *uuid);
>  Aml *aml_unicode(const char *str);
> +Aml *aml_derefof(Aml *arg);
>  
>  void
>  build_header(GArray *linker, GArray *table_data,

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 02/32] acpi: add aml_sizeof
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-13 12:33     ` Igor Mammedov
  -1 siblings, 0 replies; 200+ messages in thread
From: Igor Mammedov @ 2015-10-13 12:33 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, gleb, mtosatti, stefanha, mst, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Sun, 11 Oct 2015 11:52:34 +0800
Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:

> Implement SizeOf term which is used by NVDIMM _DSM method in later patch
> 
> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  hw/acpi/aml-build.c         | 8 ++++++++
>  include/hw/acpi/aml-build.h | 1 +
>  2 files changed, 9 insertions(+)
> 
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index cbd53f4..a72214d 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -1143,6 +1143,14 @@ Aml *aml_derefof(Aml *arg)
>      return var;
>  }
>  
> +/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefSizeOf */
> +Aml *aml_sizeof(Aml *arg)
> +{
> +    Aml *var = aml_opcode(0x87 /* SizeOfOp */);
> +    aml_append(var, arg);
> +    return var;
> +}
> +
>  void
>  build_header(GArray *linker, GArray *table_data,
>               AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> index 5a03d33..7296efb 100644
> --- a/include/hw/acpi/aml-build.h
> +++ b/include/hw/acpi/aml-build.h
> @@ -275,6 +275,7 @@ Aml *aml_varpackage(uint32_t num_elements);
>  Aml *aml_touuid(const char *uuid);
>  Aml *aml_unicode(const char *str);
>  Aml *aml_derefof(Aml *arg);
> +Aml *aml_sizeof(Aml *arg);
>  
>  void
>  build_header(GArray *linker, GArray *table_data,


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 03/32] acpi: add aml_create_field
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-13 12:38     ` Igor Mammedov
  -1 siblings, 0 replies; 200+ messages in thread
From: Igor Mammedov @ 2015-10-13 12:38 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, gleb, mtosatti, stefanha, mst, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Sun, 11 Oct 2015 11:52:35 +0800
Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:

> Implement CreateField term which is used by NVDIMM _DSM method in later patch
> 
> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> ---
>  hw/acpi/aml-build.c         | 13 +++++++++++++
>  include/hw/acpi/aml-build.h |  1 +
>  2 files changed, 14 insertions(+)
> 
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index a72214d..9fe5e7b 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -1151,6 +1151,19 @@ Aml *aml_sizeof(Aml *arg)
>      return var;
>  }
>  
> +/* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefCreateField */
> +Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name)
you haven't addressed v2 comment wrt index, len
 https://lists.gnu.org/archive/html/qemu-devel/2015-09/msg00435.html

> +{
> +    Aml *var = aml_alloc();
> +    build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
> +    build_append_byte(var->buf, 0x13); /* CreateFieldOp */
> +    aml_append(var, srcbuf);
> +    aml_append(var, index);
> +    aml_append(var, len);
> +    build_append_namestring(var->buf, "%s", name);
> +    return var;
> +}
> +
>  void
>  build_header(GArray *linker, GArray *table_data,
>               AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> index 7296efb..7e1c43b 100644
> --- a/include/hw/acpi/aml-build.h
> +++ b/include/hw/acpi/aml-build.h
> @@ -276,6 +276,7 @@ Aml *aml_touuid(const char *uuid);
>  Aml *aml_unicode(const char *str);
>  Aml *aml_derefof(Aml *arg);
>  Aml *aml_sizeof(Aml *arg);
> +Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name);
>  
>  void
>  build_header(GArray *linker, GArray *table_data,


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 04/32] acpi: add aml_mutex, aml_acquire, aml_release
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-13 13:34     ` Igor Mammedov
  -1 siblings, 0 replies; 200+ messages in thread
From: Igor Mammedov @ 2015-10-13 13:34 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, gleb, mtosatti, stefanha, mst, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Sun, 11 Oct 2015 11:52:36 +0800
Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:

> Implement Mutex, Acquire and Release terms which are used by NVDIMM _DSM method
> in later patch
> 
> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> ---
>  hw/acpi/aml-build.c         | 32 ++++++++++++++++++++++++++++++++
>  include/hw/acpi/aml-build.h |  3 +++
>  2 files changed, 35 insertions(+)
> 
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index 9fe5e7b..ab52692 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -1164,6 +1164,38 @@ Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name)
>      return var;
>  }
>  
> +/* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefMutex */
> +Aml *aml_mutex(const char *name, uint8_t flags)
s/flags/sync_level/

> +{
> +    Aml *var = aml_alloc();
> +    build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
> +    build_append_byte(var->buf, 0x01); /* MutexOp */
> +    build_append_namestring(var->buf, "%s", name);

add assert here to check that reserved bits are 0
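i.e. something along these lines. This is a sketch of the check only, with an illustrative helper name; in aml_mutex() itself it would be an inline assert on sync_level just before build_append_byte(). Per ACPI 1.0b, the DefMutex SyncFlags byte carries SyncLevel in bits 0-3, and bits 4-7 are reserved.

```c
#include <stdint.h>

/* ACPI 1.0b DefMutex SyncFlags: SyncLevel lives in bits 0-3;
 * bits 4-7 are reserved and must be zero. */
static int mutex_sync_level_is_valid(uint8_t sync_level)
{
    return (sync_level & 0xF0) == 0;
}
```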
> +    build_append_byte(var->buf, flags);
> +    return var;
> +}
> +
> +/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefAcquire */
> +Aml *aml_acquire(Aml *mutex, uint16_t timeout)
> +{
> +    Aml *var = aml_alloc();
> +    build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
> +    build_append_byte(var->buf, 0x23); /* AcquireOp */
> +    aml_append(var, mutex);
> +    build_append_int_noprefix(var->buf, timeout, sizeof(timeout));
> +    return var;
> +}
> +
> +/* ACPI 1.0b: 16.2.5.3 Type 1 Opcodes Encoding: DefRelease */
> +Aml *aml_release(Aml *mutex)
> +{
> +    Aml *var = aml_alloc();
> +    build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
> +    build_append_byte(var->buf, 0x27); /* ReleaseOp */
> +    aml_append(var, mutex);
> +    return var;
> +}
> +
>  void
>  build_header(GArray *linker, GArray *table_data,
>               AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> index 7e1c43b..d494c0c 100644
> --- a/include/hw/acpi/aml-build.h
> +++ b/include/hw/acpi/aml-build.h
> @@ -277,6 +277,9 @@ Aml *aml_unicode(const char *str);
>  Aml *aml_derefof(Aml *arg);
>  Aml *aml_sizeof(Aml *arg);
>  Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name);
> +Aml *aml_mutex(const char *name, uint8_t flags);
> +Aml *aml_acquire(Aml *mutex, uint16_t timeout);
> +Aml *aml_release(Aml *mutex);
>  
>  void
>  build_header(GArray *linker, GArray *table_data,


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 25/32] nvdimm: build ACPI nvdimm devices
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-13 14:39     ` Igor Mammedov
  -1 siblings, 0 replies; 200+ messages in thread
From: Igor Mammedov @ 2015-10-13 14:39 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, ehabkost, kvm, mst, gleb, mtosatti, qemu-devel,
	stefanha, dan.j.williams, rth

On Sun, 11 Oct 2015 11:52:57 +0800
Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:

> NVDIMM devices are defined in ACPI 6.0 9.20 NVDIMM Devices
> 
> There is a root device under \_SB and specified NVDIMM devices are under the
> root device. Each NVDIMM device has _ADR which returns its handle used to
> associate MEMDEV structure in NFIT
> 
> We reserve handle 0 for root device. In this patch, we save handle, arg0,
> arg1 and arg2. Arg3 is conditionally saved in later patch
> 
> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> ---
>  hw/mem/nvdimm/acpi.c | 203 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 203 insertions(+)
> 
> diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
I'd suggest putting the ACPI parts into a hw/acpi/nvdimm.c file so that ACPI
maintainers won't miss changes to this file.


> index 1450a6a..d9fa0fd 100644
> --- a/hw/mem/nvdimm/acpi.c
> +++ b/hw/mem/nvdimm/acpi.c
> @@ -308,15 +308,38 @@ static void build_nfit(void *fit, GSList *device_list, GArray *table_offsets,
>                   "NFIT", table_data->len - nfit_start, 1);
>  }
>  
> +#define NOTIFY_VALUE      0x99
> +
> +struct dsm_in {
> +    uint32_t handle;
> +    uint8_t arg0[16];
> +    uint32_t arg1;
> +    uint32_t arg2;
> +   /* the remaining size in the page is used by arg3. */
> +    uint8_t arg3[0];
> +} QEMU_PACKED;
> +typedef struct dsm_in dsm_in;
> +
> +struct dsm_out {
> +    /* the size of buffer filled by QEMU. */
> +    uint16_t len;
> +    uint8_t data[0];
> +} QEMU_PACKED;
> +typedef struct dsm_out dsm_out;
> +
>  static uint64_t dsm_read(void *opaque, hwaddr addr,
>                           unsigned size)
>  {
> +    fprintf(stderr, "BUG: we never read DSM notification MMIO.\n");
>      return 0;
>  }
>  
>  static void dsm_write(void *opaque, hwaddr addr,
>                        uint64_t val, unsigned size)
>  {
> +    if (val != NOTIFY_VALUE) {
> +        fprintf(stderr, "BUG: unexepected notify value 0x%" PRIx64, val);
> +    }
>  }
>  
>  static const MemoryRegionOps dsm_ops = {
> @@ -372,6 +395,183 @@ static MemoryRegion *build_dsm_memory(NVDIMMState *state)
>      return dsm_fit_mr;
>  }
>  
> +#define BUILD_STA_METHOD(_dev_, _method_)                                  \
> +    do {                                                                   \
> +        _method_ = aml_method("_STA", 0);                                  \
> +        aml_append(_method_, aml_return(aml_int(0x0f)));                   \
> +        aml_append(_dev_, _method_);                                       \
> +    } while (0)
> +
> +#define SAVE_ARG012_HANDLE_LOCK(_method_, _handle_)                        \
> +    do {                                                                   \
> +        aml_append(_method_, aml_acquire(aml_name("NLCK"), 0xFFFF));       \
how about making the method serialized? Then you could drop the explicit
lock/unlock logic. For that you'd need to extend the existing aml_method()
to something like this:

  aml_method("FOO", 3/*count*/, AML_SERIALIZED, 0 /* sync_level */)

> +        aml_append(_method_, aml_store(_handle_, aml_name("HDLE")));       \
> +        aml_append(_method_, aml_store(aml_arg(0), aml_name("ARG0")));     \
Could you describe the QEMU<->ASL interface in a separate spec
file (for example like: docs/specs/acpi_mem_hotplug.txt)?
It would help with the review process as there would be something to compare
the patches with.
Once that is finalized/agreed upon, it should be easy to review and probably
to write the corresponding patches.

Also I'd try to minimize the QEMU<->ASL interface and implement as much as
possible of the logic in AML instead of pushing it into hardware (QEMU).
For example there isn't really any need to tell QEMU ARG0 (UUID), _DSM method
could just compare UUIDs itself and execute a corresponding branch.
Probably something else could be optimized as well but that we can find out
during discussion over QEMU<->ASL interface spec.
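To compare UUIDs inside AML, the _DSM only needs the 16-byte buffer a ToUUID()-style helper would embed as a build-time constant and match against Arg0, with no VM exit. A sketch of that mixed-endian layout (the ACPI ToUUID macro stores the first three fields little-endian and the last two in text order); the parsing helper below is illustrative, not QEMU code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Convert "aabbccdd-eeff-gghh-iijj-kkllmmnnoopp" into the 16-byte buffer
 * a ToUUID()-style AML helper would embed.  Returns 0 on success. */
int uuid_to_aml_buffer(const char *str, uint8_t out[16])
{
    /* AML buffer byte order: fields 1-3 little-endian, rest as written */
    static const int order[16] = { 3, 2, 1, 0, 5, 4, 7, 6,
                                   8, 9, 10, 11, 12, 13, 14, 15 };
    uint8_t raw[16];
    int i = 0;

    for (const char *p = str; *p && i < 16; p++) {
        if (*p == '-') {
            continue;            /* skip group separators */
        }
        unsigned hi, lo;
        if (sscanf(p, "%1x%1x", &hi, &lo) != 2) {
            return -1;           /* malformed hex pair */
        }
        raw[i++] = (uint8_t)(hi << 4 | lo);
        p++;                     /* consumed two hex digits */
    }
    if (i != 16) {
        return -1;
    }
    for (i = 0; i < 16; i++) {
        out[i] = raw[order[i]];
    }
    return 0;
}
```

With the constant embedded, the _DSM can branch on If (LEqual(Arg0, <uuid buffer>)) entirely in AML and only notify QEMU for the functions that need host-side work.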

> +        aml_append(_method_, aml_store(aml_arg(1), aml_name("ARG1")));     \
> +        aml_append(_method_, aml_store(aml_arg(2), aml_name("ARG2")));     \
> +    } while (0)
> +
> +#define NOTIFY_AND_RETURN_UNLOCK(_method_)                           \
> +    do {                                                                   \
> +        aml_append(_method_, aml_store(aml_int(NOTIFY_VALUE),              \
> +                   aml_name("NOTI")));                                     \
> +        aml_append(_method_, aml_store(aml_name("RLEN"), aml_local(6)));   \
> +        aml_append(_method_, aml_store(aml_shiftleft(aml_local(6),         \
> +                      aml_int(3)), aml_local(6)));                         \
> +        aml_append(_method_, aml_create_field(aml_name("ODAT"), aml_int(0),\
> +                                              aml_local(6) , "OBUF"));     \
> +        aml_append(_method_, aml_name_decl("ZBUF", aml_buffer(0, NULL)));  \
> +        aml_append(_method_, aml_concatenate(aml_name("ZBUF"),             \
> +                                          aml_name("OBUF"), aml_arg(6)));  \
> +        aml_append(_method_, aml_release(aml_name("NLCK")));               \
> +        aml_append(_method_, aml_return(aml_arg(6)));                      \
> +    } while (0)
> +
> +#define BUILD_FIELD_UNIT_STRUCT(_field_, _s_, _f_, _name_)                 \
> +    aml_append(_field_, aml_named_field(_name_,                            \
> +               sizeof(typeof_field(_s_, _f_)) * BITS_PER_BYTE))
> +
> +#define BUILD_FIELD_UNIT_SIZE(_field_, _byte_, _name_)                     \
> +    aml_append(_field_, aml_named_field(_name_, (_byte_) * BITS_PER_BYTE))
> +
> +static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
> +                                 Aml *root_dev)
> +{
> +    for (; device_list; device_list = device_list->next) {
> +        NVDIMMDevice *nvdimm = device_list->data;
> +        int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
> +                                           NULL);
> +        uint32_t handle = nvdimm_slot_to_handle(slot);
> +        Aml *dev, *method;
> +
> +        dev = aml_device("NV%02X", slot);
> +        aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
> +
> +        BUILD_STA_METHOD(dev, method);
> +
> +        method = aml_method("_DSM", 4);
That will create an identical method for each device, which increases the
ACPI table size unnecessarily.
I'd suggest making the per-NVDIMM-device method a wrapper over a common
NVDR._DSM method and letting the latter handle all the logic.
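A C analogue of that structure: every per-device _DSM shrinks to a wrapper that forwards its handle to one shared body, so the table carries the logic once. In AML the wrapper would simply Return() a call into the common NVDR method; the aml_call*-style helper needed to emit such a call, and all names below, are assumptions made for the sketch:

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t handle;    /* NFIT handle; 0 is reserved for the root device */
} NvdimmDev;

/* the single shared _DSM body: all devices funnel through here */
uint64_t common_dsm(uint32_t handle, uint32_t func)
{
    (void)handle;
    /* illustrative behaviour: function 0 is the standard "which functions
     * are supported" query; everything else is unimplemented here */
    return func == 0 ? 1 : 0;
}

/* per-device wrapper: the only per-device state is the handle */
uint64_t device_dsm(const NvdimmDev *dev, uint32_t func)
{
    return common_dsm(dev->handle, func);
}
```

The AML equivalent would keep one full method body under NVDR and emit only a few bytes of call-and-return per NVnn device.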

> +        {
> +            SAVE_ARG012_HANDLE_LOCK(method, aml_int(handle));
> +            NOTIFY_AND_RETURN_UNLOCK(method);
> +        }
> +        aml_append(dev, method);
> +
> +        aml_append(root_dev, dev);
> +    }
> +}
> +
> +static void nvdimm_build_acpi_devices(NVDIMMState *state, GSList *device_list,
> +                                      Aml *sb_scope)
> +{
> +    Aml *dev, *method, *field;
> +    int fit_size = nvdimm_device_structure_size(g_slist_length(device_list));
> +
> +    dev = aml_device("NVDR");
> +    aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0012")));
> +
> +    /* map DSM memory into ACPI namespace. */
> +    aml_append(dev, aml_operation_region("NMIO", AML_SYSTEM_MEMORY,
> +               state->base, state->page_size));
> +    aml_append(dev, aml_operation_region("NRAM", AML_SYSTEM_MEMORY,
> +               state->base + state->page_size, state->page_size));
> +    aml_append(dev, aml_operation_region("NFIT", AML_SYSTEM_MEMORY,
> +               state->base + state->page_size * 2,
> +               memory_region_size(&state->mr) - state->page_size * 2));
> +
> +    /*
> +     * DSM notifier:
> +     * @NOTI: write value to it will notify QEMU that _DSM method is being
> +     *        called and the parameters can be found in dsm_in.
> +     *
> +     * It is MMIO mapping on host so that it will cause VM-exit then QEMU
> +     * gets control.
> +     */
> +    field = aml_field("NMIO", AML_DWORD_ACC, AML_PRESERVE);
> +    BUILD_FIELD_UNIT_SIZE(field, sizeof(uint32_t), "NOTI");
> +    aml_append(dev, field);
> +
> +    /*
> +     * DSM input:
> +     * @HDLE: store device's handle, it's zero if the _DSM call happens
> +     *        on ROOT.
> +     * @ARG0 ~ @ARG3: store the parameters of _DSM call.
> +     *
> +     * They are ram mapping on host so that these accesses never cause
> +     * VM-EXIT.
> +     */
> +    field = aml_field("NRAM", AML_DWORD_ACC, AML_PRESERVE);
> +    BUILD_FIELD_UNIT_STRUCT(field, dsm_in, handle, "HDLE");
> +    BUILD_FIELD_UNIT_STRUCT(field, dsm_in, arg0, "ARG0");
> +    BUILD_FIELD_UNIT_STRUCT(field, dsm_in, arg1, "ARG1");
> +    BUILD_FIELD_UNIT_STRUCT(field, dsm_in, arg2, "ARG2");
> +    BUILD_FIELD_UNIT_SIZE(field, state->page_size - offsetof(dsm_in, arg3),
> +                          "ARG3");
> +    aml_append(dev, field);
> +
> +    /*
> +     * DSM output:
> +     * @RLEN: the size of buffer filled by QEMU
> +     * @ODAT: the buffer QEMU uses to store the result
> +     *
> +     * Since the page is reused by both input and output, the input data
> +     * will be lost after storing new result into @RLEN and @ODAT
> +    */
> +    field = aml_field("NRAM", AML_DWORD_ACC, AML_PRESERVE);
> +    BUILD_FIELD_UNIT_STRUCT(field, dsm_out, len, "RLEN");
> +    BUILD_FIELD_UNIT_SIZE(field, state->page_size - offsetof(dsm_out, data),
> +                          "ODAT");
> +    aml_append(dev, field);
> +
> +    /* @RFIT, returned by _FIT method. */
> +    field = aml_field("NFIT", AML_DWORD_ACC, AML_PRESERVE);
> +    BUILD_FIELD_UNIT_SIZE(field, fit_size, "RFIT");
> +    aml_append(dev, field);
> +
> +    aml_append(dev, aml_mutex("NLCK", 0));
> +
> +    BUILD_STA_METHOD(dev, method);
> +
> +    method = aml_method("_DSM", 4);
> +    {
> +        SAVE_ARG012_HANDLE_LOCK(method, aml_int(0));
> +        NOTIFY_AND_RETURN_UNLOCK(method);
> +    }
> +    aml_append(dev, method);
> +
> +    method = aml_method("_FIT", 0);
> +    {
> +        aml_append(method, aml_return(aml_name("RFIT")));
> +    }
> +    aml_append(dev, method);
> +
> +    build_nvdimm_devices(state, device_list, dev);
> +
> +    aml_append(sb_scope, dev);
> +}
> +
> +static void nvdimm_build_ssdt(NVDIMMState *state, GSList *device_list,
> +                              GArray *table_offsets, GArray *table_data,
> +                              GArray *linker)
> +{
> +    Aml *ssdt, *sb_scope;
> +
> +    acpi_add_table(table_offsets, table_data);
> +
> +    ssdt = init_aml_allocator();
> +    acpi_data_push(ssdt->buf, sizeof(AcpiTableHeader));
> +
> +    sb_scope = aml_scope("\\_SB");
> +    nvdimm_build_acpi_devices(state, device_list, sb_scope);
> +
> +    aml_append(ssdt, sb_scope);
> +    /* copy AML table into ACPI tables blob and patch header there */
> +    g_array_append_vals(table_data, ssdt->buf->data, ssdt->buf->len);
> +    build_header(linker, table_data,
> +        (void *)(table_data->data + table_data->len - ssdt->buf->len),
> +        "SSDT", ssdt->buf->len, 1);
> +    free_aml_allocator();
> +}
> +
>  void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
>                               GArray *table_data, GArray *linker)
>  {
> @@ -387,6 +587,9 @@ void nvdimm_build_acpi_table(NVDIMMState *state, GArray *table_offsets,
>  
>          build_device_structure(device_list, fit);
>          build_nfit(fit, device_list, table_offsets, table_data, linker);
> +
> +        nvdimm_build_ssdt(state, device_list, table_offsets, table_data,
> +                          linker);
>          g_slist_free(device_list);
>      }
>  }


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 03/32] acpi: add aml_create_field
  2015-10-13 12:38     ` [Qemu-devel] " Igor Mammedov
@ 2015-10-13 16:36       ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-13 16:36 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: pbonzini, gleb, mtosatti, stefanha, mst, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/13/2015 08:38 PM, Igor Mammedov wrote:
> On Sun, 11 Oct 2015 11:52:35 +0800
> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>
>> Implement CreateField term which is used by NVDIMM _DSM method in later patch
>>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> ---
>>   hw/acpi/aml-build.c         | 13 +++++++++++++
>>   include/hw/acpi/aml-build.h |  1 +
>>   2 files changed, 14 insertions(+)
>>
>> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
>> index a72214d..9fe5e7b 100644
>> --- a/hw/acpi/aml-build.c
>> +++ b/hw/acpi/aml-build.c
>> @@ -1151,6 +1151,19 @@ Aml *aml_sizeof(Aml *arg)
>>       return var;
>>   }
>>
>> +/* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefCreateField */
>> +Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name)
> you haven't addressed v2 comment wrt index, len
>   https://lists.gnu.org/archive/html/qemu-devel/2015-09/msg00435.html

Ah, I forgot to mention that the index/len can be determined at runtime:

aml_append(_method_, aml_create_field(aml_name("ODAT"), aml_int(0),\
                                       aml_local(6) , "OBUF"));
That's why I kept these as "Aml *"; sorry I failed to note it
in patch 0.
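The same point in C terms: RLEN is only known when _DSM runs, so the OBUF field carved out of ODAT needs a runtime (TermArg) length, which is why the index/len parameters stay Aml * rather than plain integers. A sketch of the shared-page read-back under the quoted dsm_out layout (illustrative model, not QEMU code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* dsm_out layout from the patch: uint16_t len, then len bytes of data.
 * The AML does Store(ShiftLeft(RLEN, 3), Local6) and then
 * CreateField(ODAT, 0, Local6, "OBUF"); here the bytes-to-bits shift is
 * implicit in the byte-wise copy. */
size_t read_dsm_out(const uint8_t *page, uint8_t *dst, size_t dst_size)
{
    size_t len = (size_t)page[0] | (size_t)page[1] << 8; /* RLEN, LE */

    if (len > dst_size) {
        len = dst_size;          /* clamp to the caller's buffer */
    }
    memcpy(dst, page + 2, len);  /* first RLEN * 8 bits of ODAT */
    return len;
}
```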

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 04/32] acpi: add aml_mutex, aml_acquire, aml_release
  2015-10-13 13:34     ` [Qemu-devel] " Igor Mammedov
@ 2015-10-13 16:44       ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-13 16:44 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: pbonzini, gleb, mtosatti, stefanha, mst, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/13/2015 09:34 PM, Igor Mammedov wrote:
> On Sun, 11 Oct 2015 11:52:36 +0800
> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>
>> Implement Mutex, Acquire and Release terms which are used by NVDIMM _DSM method
>> in later patch
>>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> ---
>>   hw/acpi/aml-build.c         | 32 ++++++++++++++++++++++++++++++++
>>   include/hw/acpi/aml-build.h |  3 +++
>>   2 files changed, 35 insertions(+)
>>
>> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
>> index 9fe5e7b..ab52692 100644
>> --- a/hw/acpi/aml-build.c
>> +++ b/hw/acpi/aml-build.c
>> @@ -1164,6 +1164,38 @@ Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name)
>>       return var;
>>   }
>>
>> +/* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefMutex */
>> +Aml *aml_mutex(const char *name, uint8_t flags)
> s/flags/sync_level/

Oops, will fix.

>
>> +{
>> +    Aml *var = aml_alloc();
>> +    build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
>> +    build_append_byte(var->buf, 0x01); /* MutexOp */
>> +    build_append_namestring(var->buf, "%s", name);
>
> add assert here to check that reserved bits are 0

Good idea, will do.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 11/32] hostmem-file: use whole file size if possible
  2015-10-13 11:50     ` Vladimir Sementsov-Ogievskiy
  (?)
@ 2015-10-13 16:53     ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-13 16:53 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, pbonzini, imammedo
  Cc: ehabkost, kvm, mst, gleb, mtosatti, qemu-devel, stefanha,
	dan.j.williams, rth



On 10/13/2015 07:50 PM, Vladimir Sementsov-Ogievskiy wrote:
> On 11.10.2015 06:52, Xiao Guangrong wrote:
>> Use the whole file size if @size is not specified which is useful
>> if we want to directly pass a file to guest
>>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> ---
>>   backends/hostmem-file.c | 47 +++++++++++++++++++++++++++++++++++++++++++----
>>   1 file changed, 43 insertions(+), 4 deletions(-)
>>
>> diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
>> index 9097a57..adf2835 100644
>> --- a/backends/hostmem-file.c
>> +++ b/backends/hostmem-file.c
>> @@ -9,6 +9,9 @@
>>    * This work is licensed under the terms of the GNU GPL, version 2 or later.
>>    * See the COPYING file in the top-level directory.
>>    */
>> +#include <sys/ioctl.h>
>> +#include <linux/fs.h>
>> +
>>   #include "qemu-common.h"
>>   #include "sysemu/hostmem.h"
>>   #include "sysemu/sysemu.h"
>> @@ -33,20 +36,56 @@ struct HostMemoryBackendFile {
>>       char *mem_path;
>>   };
>> +static uint64_t get_file_size(const char *file)
>> +{
>> +    struct stat stat_buf;
>> +    uint64_t size = 0;
>> +    int fd;
>> +
>> +    fd = open(file, O_RDONLY);
>> +    if (fd < 0) {
>> +        return 0;
>> +    }
>> +
>> +    if (stat(file, &stat_buf) < 0) {
>> +        goto exit;
>> +    }
>> +
>> +    if ((S_ISBLK(stat_buf.st_mode)) && !ioctl(fd, BLKGETSIZE64, &size)) {
>> +        goto exit;
>> +    }
>> +
>> +    size = lseek(fd, 0, SEEK_END);
>> +    if (size == -1) {
>> +        size = 0;
>> +    }
>> +exit:
>> +    close(fd);
>> +    return size;
>> +}
>> +
>>   static void
>>   file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>>   {
>>       HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(backend);
>> -    if (!backend->size) {
>> -        error_setg(errp, "can't create backend with size 0");
>> -        return;
>> -    }
>>       if (!fb->mem_path) {
>>           error_setg(errp, "mem-path property not set");
>>           return;
>>       }
>> +    if (!backend->size) {
>> +        /*
>> +         * use the whole file size if @size is not specified.
>> +         */
>> +        backend->size = get_file_size(fb->mem_path);
>> +    }
>> +
>> +    if (!backend->size) {
>> +        error_setg(errp, "can't create backend with size 0");
>> +        return;
>> +    }
>
> in case of any error in get_file_size (open, stat, lseek) it will write about "backend with size 0"
> which may be not appropriate..

Okay, I will change it to:
("failed to get file size for %s, can't create backend on it", mem_path);

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 25/32] nvdimm: build ACPI nvdimm devices
  2015-10-13 14:39     ` Igor Mammedov
@ 2015-10-13 17:24       ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-13 17:24 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: pbonzini, ehabkost, kvm, mst, gleb, mtosatti, qemu-devel,
	stefanha, dan.j.williams, rth



On 10/13/2015 10:39 PM, Igor Mammedov wrote:
> On Sun, 11 Oct 2015 11:52:57 +0800
> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>
>> NVDIMM devices is defined in ACPI 6.0 9.20 NVDIMM Devices
>>
>> There is a root device under \_SB and specified NVDIMM devices are under the
>> root device. Each NVDIMM device has _ADR which returns its handle used to
>> associate MEMDEV structure in NFIT
>>
>> We reserve handle 0 for root device. In this patch, we save handle, arg0,
>> arg1 and arg2. Arg3 is conditionally saved in later patch
>>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> ---
>>   hw/mem/nvdimm/acpi.c | 203 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 203 insertions(+)
>>
>> diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
> I'd suggest to put ACPI parts to hw/acpi/nvdimm.c file so that ACPI
> maintainers won't miss changes to this files.
>

Sounds reasonable to me.

>
>> index 1450a6a..d9fa0fd 100644
>> --- a/hw/mem/nvdimm/acpi.c
>> +++ b/hw/mem/nvdimm/acpi.c
>> @@ -308,15 +308,38 @@ static void build_nfit(void *fit, GSList *device_list, GArray *table_offsets,
>>                    "NFIT", table_data->len - nfit_start, 1);
>>   }
>>
>> +#define NOTIFY_VALUE      0x99
>> +
>> +struct dsm_in {
>> +    uint32_t handle;
>> +    uint8_t arg0[16];
>> +    uint32_t arg1;
>> +    uint32_t arg2;
>> +   /* the remaining size in the page is used by arg3. */
>> +    uint8_t arg3[0];
>> +} QEMU_PACKED;
>> +typedef struct dsm_in dsm_in;
>> +
>> +struct dsm_out {
>> +    /* the size of buffer filled by QEMU. */
>> +    uint16_t len;
>> +    uint8_t data[0];
>> +} QEMU_PACKED;
>> +typedef struct dsm_out dsm_out;
>> +
>>   static uint64_t dsm_read(void *opaque, hwaddr addr,
>>                            unsigned size)
>>   {
>> +    fprintf(stderr, "BUG: we never read DSM notification MMIO.\n");
>>       return 0;
>>   }
>>
>>   static void dsm_write(void *opaque, hwaddr addr,
>>                         uint64_t val, unsigned size)
>>   {
>> +    if (val != NOTIFY_VALUE) {
>> +        fprintf(stderr, "BUG: unexepected notify value 0x%" PRIx64, val);
>> +    }
>>   }
>>
>>   static const MemoryRegionOps dsm_ops = {
>> @@ -372,6 +395,183 @@ static MemoryRegion *build_dsm_memory(NVDIMMState *state)
>>       return dsm_fit_mr;
>>   }
>>
>> +#define BUILD_STA_METHOD(_dev_, _method_)                                  \
>> +    do {                                                                   \
>> +        _method_ = aml_method("_STA", 0);                                  \
>> +        aml_append(_method_, aml_return(aml_int(0x0f)));                   \
>> +        aml_append(_dev_, _method_);                                       \
>> +    } while (0)
>> +
>> +#define SAVE_ARG012_HANDLE_LOCK(_method_, _handle_)                        \
>> +    do {                                                                   \
>> +        aml_append(_method_, aml_acquire(aml_name("NLCK"), 0xFFFF));       \
> how about making method serialized, then you could drop explicit lock/unlock logic
> for that you'd need to extend existing aml_method() to something like this:
>
>    aml_method("FOO", 3/*count*/, AML_SERIALIZED, 0 /* sync_level */)

I am not sure whether methods under different namespace objects can be
serialized against each other, for example:
Device("__D0") {
	Method("FOO", 3, AML_SERIALIZED, 0) {
		BUF = Arg0
	}
}

Device("__D1") {
	Method("FOO", 3, AML_SERIALIZED, 0) {
		BUF = Arg0
	}
}

Can __D0.FOO and __D1.FOO be serialized against each other?

Your suggestion is definitely valuable to me; I will abstract the
shared-memory access into one method as per your comment below.

>
>> +        aml_append(_method_, aml_store(_handle_, aml_name("HDLE")));       \
>> +        aml_append(_method_, aml_store(aml_arg(0), aml_name("ARG0")));     \
> Could you describe QEMU<->ASL interface in a separate spec
> file (for example like: docs/specs/acpi_mem_hotplug.txt),
> it will help to with review process as there will be something to compare
> patches with.
> Once that is finalized/agreed upon, it should be easy to review and probably
> to write corresponding patches.

Sure, I considered it too and was planning to write this kind of spec after this
patchset is merged... I will document the interface in the next version.

>
> Also I'd try to minimize QEMU<->ASL interface and implement as much as possible
> of ASL logic in AML instead of pushing it in hardware (QEMU).

Okay, i agree.

Since ACPI ASL/AML is new to me, I did it the opposite way - moving as much
control as possible to the QEMU side ... :)

> For example there isn't really any need to tell QEMU ARG0 (UUID), _DSM method
> could just compare UUIDs itself and execute a corresponding branch.
> Probably something else could be optimized as well but that we can find out
> during discussion over QEMU<->ASL interface spec.

Okay.

>
>> +        aml_append(_method_, aml_store(aml_arg(1), aml_name("ARG1")));     \
>> +        aml_append(_method_, aml_store(aml_arg(2), aml_name("ARG2")));     \
>> +    } while (0)
>> +
>> +#define NOTIFY_AND_RETURN_UNLOCK(_method_)                           \
>> +    do {                                                                   \
>> +        aml_append(_method_, aml_store(aml_int(NOTIFY_VALUE),              \
>> +                   aml_name("NOTI")));                                     \
>> +        aml_append(_method_, aml_store(aml_name("RLEN"), aml_local(6)));   \
>> +        aml_append(_method_, aml_store(aml_shiftleft(aml_local(6),         \
>> +                      aml_int(3)), aml_local(6)));                         \
>> +        aml_append(_method_, aml_create_field(aml_name("ODAT"), aml_int(0),\
>> +                                              aml_local(6) , "OBUF"));     \
>> +        aml_append(_method_, aml_name_decl("ZBUF", aml_buffer(0, NULL)));  \
>> +        aml_append(_method_, aml_concatenate(aml_name("ZBUF"),             \
>> +                                          aml_name("OBUF"), aml_arg(6)));  \
>> +        aml_append(_method_, aml_release(aml_name("NLCK")));               \
>> +        aml_append(_method_, aml_return(aml_arg(6)));                      \
>> +    } while (0)
>> +
>> +#define BUILD_FIELD_UNIT_STRUCT(_field_, _s_, _f_, _name_)                 \
>> +    aml_append(_field_, aml_named_field(_name_,                            \
>> +               sizeof(typeof_field(_s_, _f_)) * BITS_PER_BYTE))
>> +
>> +#define BUILD_FIELD_UNIT_SIZE(_field_, _byte_, _name_)                     \
>> +    aml_append(_field_, aml_named_field(_name_, (_byte_) * BITS_PER_BYTE))
>> +
>> +static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
>> +                                 Aml *root_dev)
>> +{
>> +    for (; device_list; device_list = device_list->next) {
>> +        NVDIMMDevice *nvdimm = device_list->data;
>> +        int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
>> +                                           NULL);
>> +        uint32_t handle = nvdimm_slot_to_handle(slot);
>> +        Aml *dev, *method;
>> +
>> +        dev = aml_device("NV%02X", slot);
>> +        aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
>> +
>> +        BUILD_STA_METHOD(dev, method);
>> +
>> +        method = aml_method("_DSM", 4);
> That will create the same method per each device which increases
> ACPI table size unnecessarily.
> I'd suggest to make per nvdimm device method a wrapper over common
> NVDR._DSM method and make the later handle all the logic.

Good to me.

I really appreciate your review, Igor!

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 00/32] implement vNVDIMM
  2015-10-13  6:36               ` Dan Williams
@ 2015-10-14  4:03                 ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-14  4:03 UTC (permalink / raw)
  To: Dan Williams
  Cc: ehabkost, KVM list, Michael S. Tsirkin, Gleb Natapov, mtosatti,
	qemu-devel, stefanha, Paolo Bonzini, imammedo, rth



On 10/13/2015 02:36 PM, Dan Williams wrote:
> On Mon, Oct 12, 2015 at 10:49 PM, Xiao Guangrong
> <guangrong.xiao@linux.intel.com> wrote:
>>
>>
>> On 10/13/2015 11:38 AM, Dan Williams wrote:
>>>
>>> On Mon, Oct 12, 2015 at 8:14 PM, Xiao Guangrong
>>> <guangrong.xiao@linux.intel.com> wrote:
>>>>
>>>> On 10/13/2015 12:36 AM, Dan Williams wrote:
>>>>>
>>>>> Static namespaces can be emitted without a label.  Linux needs this to
>>>>> support existing "label-less" bare metal NVDIMMs.
>>>>
>>>>
>>>>
>>>> This is Linux specific? As i did not see it has been documented in the
>>>> spec...
>>>
>>>
>>> I expect most NVDIMMs, especially existing ones available today, do
>>> not have a label area.  This is not Linux specific and ACPI 6 does not
>>> specify a label area, only the Intel DSM Interface Example.
>>>
>>
>> Yup, label data is accessed via DSM interface, the spec I mentioned
>> is Intel DSM Interface Example.
>>
>> However, IIRC Linux NVDIMM driver refused to use the device if no
>> DSM GET_LABEL support, are you going to update it?
>
> Label-less DIMMs are tested as part of the unit test [1] and the
> "memmap=nn!ss" kernel parameter that registers a persistent-memory
> address range without a DIMM.  What error do you see when label
> support is disabled?
>
> [1]: https://github.com/pmem/ndctl/blob/master/README.md
>

After reverting my commits on the NVDIMM driver, yeah, it works.

Okay, I will drop the namespace part and make it label-less
instead.

Thank you, Dan!


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 27/32] nvdimm: support DSM_CMD_IMPLEMENTED function
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-14  9:40     ` Stefan Hajnoczi
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefan Hajnoczi @ 2015-10-14  9:40 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, mst, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Sun, Oct 11, 2015 at 11:52:59AM +0800, Xiao Guangrong wrote:
>  static void dsm_write(void *opaque, hwaddr addr,
>                        uint64_t val, unsigned size)
>  {
> +    NVDIMMState *state = opaque;
> +    MemoryRegion *dsm_ram_mr;
> +    dsm_in *in;
> +    dsm_out *out;
> +    uint32_t revision, function, handle;
> +
>      if (val != NOTIFY_VALUE) {
>          fprintf(stderr, "BUG: unexepected notify value 0x%" PRIx64, val);
>      }
> +
> +    dsm_ram_mr = memory_region_find(&state->mr, state->page_size,
> +                                    state->page_size).mr;
> +    memory_region_unref(dsm_ram_mr);
> +    in = memory_region_get_ram_ptr(dsm_ram_mr);

This looks suspicious.  Shouldn't the memory_region_unref(dsm_ram_mr)
happen after we're done using it?

> +    out = (dsm_out *)in;
> +
> +    revision = in->arg1;
> +    function = in->arg2;
> +    handle = in->handle;
> +    le32_to_cpus(&revision);
> +    le32_to_cpus(&function);
> +    le32_to_cpus(&handle);
> +
> +    nvdebug("UUID " UUID_FMT ".\n", in->arg0[0], in->arg0[1], in->arg0[2],
> +            in->arg0[3], in->arg0[4], in->arg0[5], in->arg0[6],
> +            in->arg0[7], in->arg0[8], in->arg0[9], in->arg0[10],
> +            in->arg0[11], in->arg0[12], in->arg0[13], in->arg0[14],
> +            in->arg0[15]);
> +    nvdebug("Revision %#x Function %#x Handler %#x.\n", revision, function,
> +            handle);
> +
> +    if (revision != DSM_REVISION) {
> +        nvdebug("Revision %#x is not supported, expect %#x.\n",
> +                revision, DSM_REVISION);
> +        goto exit;
> +    }
> +
> +    if (!handle) {
> +        if (!dsm_is_root_uuid(in->arg0)) {

Please don't dereference 'in' or pass it to other functions.  Avoid race
conditions with guest vcpus by copying in the entire dsm_in struct.

This is like a system call - the kernel cannot trust userspace memory
and must copy in before accessing data.  The same rules apply.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 27/32] nvdimm: support DSM_CMD_IMPLEMENTED function
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-14  9:41     ` Stefan Hajnoczi
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefan Hajnoczi @ 2015-10-14  9:41 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: ehabkost, kvm, mst, gleb, mtosatti, qemu-devel, stefanha,
	imammedo, pbonzini, dan.j.williams, rth

On Sun, Oct 11, 2015 at 11:52:59AM +0800, Xiao Guangrong wrote:
> +    out->len = sizeof(out->status);

out->len is uint16_t, it needs cpu_to_le16().  There may be other
instances in this patch series.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 27/32] nvdimm: support DSM_CMD_IMPLEMENTED function
  2015-10-14  9:40     ` [Qemu-devel] " Stefan Hajnoczi
@ 2015-10-14 14:50       ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-14 14:50 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, mst, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/14/2015 05:40 PM, Stefan Hajnoczi wrote:
> On Sun, Oct 11, 2015 at 11:52:59AM +0800, Xiao Guangrong wrote:
>>   static void dsm_write(void *opaque, hwaddr addr,
>>                         uint64_t val, unsigned size)
>>   {
>> +    NVDIMMState *state = opaque;
>> +    MemoryRegion *dsm_ram_mr;
>> +    dsm_in *in;
>> +    dsm_out *out;
>> +    uint32_t revision, function, handle;
>> +
>>       if (val != NOTIFY_VALUE) {
>>           fprintf(stderr, "BUG: unexepected notify value 0x%" PRIx64, val);
>>       }
>> +
>> +    dsm_ram_mr = memory_region_find(&state->mr, state->page_size,
>> +                                    state->page_size).mr;
>> +    memory_region_unref(dsm_ram_mr);
>> +    in = memory_region_get_ram_ptr(dsm_ram_mr);
>
> This looks suspicious.  Shouldn't the memory_region_unref(dsm_ram_mr)
> happen after we're done using it?

This region is kept alive for the whole lifetime of QEMU, so it is okay. The
same style is used in other code, for example
line 208 in hw/s390x/sclp.c.

>
>> +    out = (dsm_out *)in;
>> +
>> +    revision = in->arg1;
>> +    function = in->arg2;
>> +    handle = in->handle;
>> +    le32_to_cpus(&revision);
>> +    le32_to_cpus(&function);
>> +    le32_to_cpus(&handle);
>> +
>> +    nvdebug("UUID " UUID_FMT ".\n", in->arg0[0], in->arg0[1], in->arg0[2],
>> +            in->arg0[3], in->arg0[4], in->arg0[5], in->arg0[6],
>> +            in->arg0[7], in->arg0[8], in->arg0[9], in->arg0[10],
>> +            in->arg0[11], in->arg0[12], in->arg0[13], in->arg0[14],
>> +            in->arg0[15]);
>> +    nvdebug("Revision %#x Function %#x Handler %#x.\n", revision, function,
>> +            handle);
>> +
>> +    if (revision != DSM_REVISION) {
>> +        nvdebug("Revision %#x is not supported, expect %#x.\n",
>> +                revision, DSM_REVISION);
>> +        goto exit;
>> +    }
>> +
>> +    if (!handle) {
>> +        if (!dsm_is_root_uuid(in->arg0)) {
>
> Please don't dereference 'in' or pass it to other functions.  Avoid race
> conditions with guest vcpus by copying in the entire dsm_in struct.
>
> This is like a system call - the kernel cannot trust userspace memory
> and must copy in before accessing data.  The same rules apply.
>

It's a little different for QEMU:
- the memory address is always valid to QEMU; that is not always true for the
   kernel due to context switches

- we have checked the header before using its data; for example, when we get
   data from GET_NAMESPACE_DATA, we first read @offset and @length from the
   memory and only then copy memory based on those values, so the guest has
   no chance to cause a buffer overflow by increasing these values at runtime.

   The scenario for our case is simple, but it is difficult for the kernel to
   check_all_before_use as many paths may be involved.

- the guest changing some data is okay; the worst case is that the label data
   is corrupted, and that is caused by the guest itself. The kernel also
   supports this kind of behaviour, e.g. network TX zero copy: the userspace
   page is being transferred while userspace can still access it.

- it's 4K in size on x86; a full copy wastes too much CPU time.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 27/32] nvdimm: support DSM_CMD_IMPLEMENTED function
  2015-10-14  9:41     ` [Qemu-devel] " Stefan Hajnoczi
@ 2015-10-14 14:52       ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-14 14:52 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, mst, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/14/2015 05:41 PM, Stefan Hajnoczi wrote:
> On Sun, Oct 11, 2015 at 11:52:59AM +0800, Xiao Guangrong wrote:
>> +    out->len = sizeof(out->status);
>
> out->len is uint16_t, it needs cpu_to_le16().  There may be other
> instances in this patch series.
>

out->len is only used internally and is invisible to the guest OS, i.e.
we write this value and read it back ourselves. I think it is okay.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 27/32] nvdimm: support DSM_CMD_IMPLEMENTED function
  2015-10-14 14:50       ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-14 17:06         ` Eduardo Habkost
  -1 siblings, 0 replies; 200+ messages in thread
From: Eduardo Habkost @ 2015-10-14 17:06 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Stefan Hajnoczi, pbonzini, imammedo, gleb, mtosatti, stefanha,
	mst, rth, dan.j.williams, kvm, qemu-devel

On Wed, Oct 14, 2015 at 10:50:40PM +0800, Xiao Guangrong wrote:
> On 10/14/2015 05:40 PM, Stefan Hajnoczi wrote:
> >On Sun, Oct 11, 2015 at 11:52:59AM +0800, Xiao Guangrong wrote:
> >>  static void dsm_write(void *opaque, hwaddr addr,
> >>                        uint64_t val, unsigned size)
> >>  {
> >>+    NVDIMMState *state = opaque;
> >>+    MemoryRegion *dsm_ram_mr;
> >>+    dsm_in *in;
> >>+    dsm_out *out;
> >>+    uint32_t revision, function, handle;
> >>+
> >>      if (val != NOTIFY_VALUE) {
> >>          fprintf(stderr, "BUG: unexepected notify value 0x%" PRIx64, val);
> >>      }
> >>+
> >>+    dsm_ram_mr = memory_region_find(&state->mr, state->page_size,
> >>+                                    state->page_size).mr;
> >>+    memory_region_unref(dsm_ram_mr);
> >>+    in = memory_region_get_ram_ptr(dsm_ram_mr);
> >
> >This looks suspicious.  Shouldn't the memory_region_unref(dsm_ram_mr)
> >happen after we're done using it?
> 
> This region is keep-alive during QEMU's running, it is okay.  The same
> style is applied to other codes, for example: line 208 in
> hw/s390x/sclp.c.

In sclp.c (assign_storage()), the memory region is never used after
memory_region_unref() is called. In unassign_storage(), sclp.c owns an
additional reference, grabbed by assign_storage().

-- 
Eduardo

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 00/32] implement vNVDIMM
  2015-10-14  4:03                 ` Xiao Guangrong
@ 2015-10-14 19:20                   ` Dan Williams
  -1 siblings, 0 replies; 200+ messages in thread
From: Dan Williams @ 2015-10-14 19:20 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: ehabkost, KVM list, Michael S. Tsirkin, Gleb Natapov, mtosatti,
	qemu-devel, stefanha, Paolo Bonzini, imammedo, rth

On Tue, Oct 13, 2015 at 9:03 PM, Xiao Guangrong
<guangrong.xiao@linux.intel.com> wrote:
>> Label-less DIMMs are tested as part of the unit test [1] and the
>> "memmap=nn!ss" kernel parameter that registers a persistent-memory
>> address range without a DIMM.  What error do you see when label
>> support is disabled?
>>
>> [1]: https://github.com/pmem/ndctl/blob/master/README.md
>>
>
> After revert my commits on NVDIMM driver, yeah, it works.
>
> Okay, i will drop the namespace part and make it as label-less
> instead.
>
> Thank you, Dan!
>

Good to hear.  There are still cases where a guest would likely want
to submit a _DSM, like retrieving address range scrub results from the
ACPI0012 root device, so the ASL work is still needed.  However, I
think the bulk of the storage functionality can be had without
storing/retrieving labels in the guest.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 27/32] nvdimm: support DSM_CMD_IMPLEMENTED function
  2015-10-14 17:06         ` [Qemu-devel] " Eduardo Habkost
@ 2015-10-15  1:43           ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-15  1:43 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Stefan Hajnoczi, pbonzini, imammedo, gleb, mtosatti, stefanha,
	mst, rth, dan.j.williams, kvm, qemu-devel



On 10/15/2015 01:06 AM, Eduardo Habkost wrote:
> On Wed, Oct 14, 2015 at 10:50:40PM +0800, Xiao Guangrong wrote:
>> On 10/14/2015 05:40 PM, Stefan Hajnoczi wrote:
>>> On Sun, Oct 11, 2015 at 11:52:59AM +0800, Xiao Guangrong wrote:
>>>>   static void dsm_write(void *opaque, hwaddr addr,
>>>>                         uint64_t val, unsigned size)
>>>>   {
>>>> +    NVDIMMState *state = opaque;
>>>> +    MemoryRegion *dsm_ram_mr;
>>>> +    dsm_in *in;
>>>> +    dsm_out *out;
>>>> +    uint32_t revision, function, handle;
>>>> +
>>>>       if (val != NOTIFY_VALUE) {
>>>>           fprintf(stderr, "BUG: unexepected notify value 0x%" PRIx64, val);
>>>>       }
>>>> +
>>>> +    dsm_ram_mr = memory_region_find(&state->mr, state->page_size,
>>>> +                                    state->page_size).mr;
>>>> +    memory_region_unref(dsm_ram_mr);
>>>> +    in = memory_region_get_ram_ptr(dsm_ram_mr);
>>>
>>> This looks suspicious.  Shouldn't the memory_region_unref(dsm_ram_mr)
>>> happen after we're done using it?
>>
>> This region is keep-alive during QEMU's running, it is okay.  The same
>> style is applied to other codes, for example: line 208 in
>> hw/s390x/sclp.c.
>
> In sclp.c (assign_storage()), the memory region is never used after
> memory_region_unref() is called. In unassign_storage(), sclp.c owns an
> additional reference, grabbed by assign_storage().
>

Ah... I got it, thank you for pointing it out.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 27/32] nvdimm: support DSM_CMD_IMPLEMENTED function
  2015-10-14 14:52       ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-15 15:01         ` Stefan Hajnoczi
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefan Hajnoczi @ 2015-10-15 15:01 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Stefan Hajnoczi, pbonzini, imammedo, gleb, mtosatti, mst, rth,
	ehabkost, dan.j.williams, kvm, qemu-devel

On Wed, Oct 14, 2015 at 10:52:15PM +0800, Xiao Guangrong wrote:
> On 10/14/2015 05:41 PM, Stefan Hajnoczi wrote:
> >On Sun, Oct 11, 2015 at 11:52:59AM +0800, Xiao Guangrong wrote:
> >>+    out->len = sizeof(out->status);
> >
> >out->len is uint16_t, it needs cpu_to_le16().  There may be other
> >instances in this patch series.
> >
> 
> out->len is internally used only which is invisible to guest OS, i,e,
> we write this value and read this value by ourself. I think it is
> okay.

'out' points to guest memory.  Guest memory is untrusted so QEMU cannot
stash values there - an evil guest could modify them.

Please put the len variable on the QEMU stack or heap where the guest
cannot access it.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 27/32] nvdimm: support DSM_CMD_IMPLEMENTED function
  2015-10-14 14:50       ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-15 15:07         ` Stefan Hajnoczi
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefan Hajnoczi @ 2015-10-15 15:07 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Stefan Hajnoczi, pbonzini, imammedo, gleb, mtosatti, mst, rth,
	ehabkost, dan.j.williams, kvm, qemu-devel

On Wed, Oct 14, 2015 at 10:50:40PM +0800, Xiao Guangrong wrote:
> On 10/14/2015 05:40 PM, Stefan Hajnoczi wrote:
> >On Sun, Oct 11, 2015 at 11:52:59AM +0800, Xiao Guangrong wrote:
> >>+    out = (dsm_out *)in;
> >>+
> >>+    revision = in->arg1;
> >>+    function = in->arg2;
> >>+    handle = in->handle;
> >>+    le32_to_cpus(&revision);
> >>+    le32_to_cpus(&function);
> >>+    le32_to_cpus(&handle);
> >>+
> >>+    nvdebug("UUID " UUID_FMT ".\n", in->arg0[0], in->arg0[1], in->arg0[2],
> >>+            in->arg0[3], in->arg0[4], in->arg0[5], in->arg0[6],
> >>+            in->arg0[7], in->arg0[8], in->arg0[9], in->arg0[10],
> >>+            in->arg0[11], in->arg0[12], in->arg0[13], in->arg0[14],
> >>+            in->arg0[15]);
> >>+    nvdebug("Revision %#x Function %#x Handler %#x.\n", revision, function,
> >>+            handle);
> >>+
> >>+    if (revision != DSM_REVISION) {
> >>+        nvdebug("Revision %#x is not supported, expect %#x.\n",
> >>+                revision, DSM_REVISION);
> >>+        goto exit;
> >>+    }
> >>+
> >>+    if (!handle) {
> >>+        if (!dsm_is_root_uuid(in->arg0)) {
> >
> >Please don't dereference 'in' or pass it to other functions.  Avoid race
> >conditions with guest vcpus by copying in the entire dsm_in struct.
> >
> >This is like a system call - the kernel cannot trust userspace memory
> >and must copy in before accessing data.  The same rules apply.
> >
> 
> It's little different for QEMU:
> - the memory address is always valid to QEMU, it's not always true for Kernel
>   due to context-switch
> 
> - we have checked the header before use it's data, for example, when we get
>   data from GET_NAMESPACE_DATA, we have got the @offset and @length from the
>   memory, then copy memory based on these values, that means the userspace
>   has no chance to cause buffer overflow by increasing these values at runtime.
> 
>   The scenario for our case is simple but Kernel is difficult to do
>   check_all_before_use as many paths may be involved.
> 
> - guest changes some data is okay, the worst case is that the label data is
>   corrupted. This is caused by guest itself. Kernel also supports this kind
>   of behaviour, e,g. network TX zero copy, the userspace page is being
>   transferred while userspace can still access it.
> 
> - it's 4K size on x86, full copy wastes CPU time too much.

This isn't performance-critical code and I don't want to review it
keeping the race conditions in mind the whole time.  Also, if the code
is modified in the future, the chance of introducing a race is high.

I see this as premature optimization, please just copy in data.

Stefan

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 27/32] nvdimm: support DSM_CMD_IMPLEMENTED function
  2015-10-15 15:07         ` [Qemu-devel] " Stefan Hajnoczi
@ 2015-10-16  2:30           ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-16  2:30 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Stefan Hajnoczi, pbonzini, imammedo, gleb, mtosatti, mst, rth,
	ehabkost, dan.j.williams, kvm, qemu-devel



On 10/15/2015 11:07 PM, Stefan Hajnoczi wrote:
> On Wed, Oct 14, 2015 at 10:50:40PM +0800, Xiao Guangrong wrote:
>> On 10/14/2015 05:40 PM, Stefan Hajnoczi wrote:
>>> On Sun, Oct 11, 2015 at 11:52:59AM +0800, Xiao Guangrong wrote:
>>>> +    out = (dsm_out *)in;
>>>> +
>>>> +    revision = in->arg1;
>>>> +    function = in->arg2;
>>>> +    handle = in->handle;
>>>> +    le32_to_cpus(&revision);
>>>> +    le32_to_cpus(&function);
>>>> +    le32_to_cpus(&handle);
>>>> +
>>>> +    nvdebug("UUID " UUID_FMT ".\n", in->arg0[0], in->arg0[1], in->arg0[2],
>>>> +            in->arg0[3], in->arg0[4], in->arg0[5], in->arg0[6],
>>>> +            in->arg0[7], in->arg0[8], in->arg0[9], in->arg0[10],
>>>> +            in->arg0[11], in->arg0[12], in->arg0[13], in->arg0[14],
>>>> +            in->arg0[15]);
>>>> +    nvdebug("Revision %#x Function %#x Handler %#x.\n", revision, function,
>>>> +            handle);
>>>> +
>>>> +    if (revision != DSM_REVISION) {
>>>> +        nvdebug("Revision %#x is not supported, expect %#x.\n",
>>>> +                revision, DSM_REVISION);
>>>> +        goto exit;
>>>> +    }
>>>> +
>>>> +    if (!handle) {
>>>> +        if (!dsm_is_root_uuid(in->arg0)) {
>>>
>>> Please don't dereference 'in' or pass it to other functions.  Avoid race
>>> conditions with guest vcpus by copying in the entire dsm_in struct.
>>>
>>> This is like a system call - the kernel cannot trust userspace memory
>>> and must copy in before accessing data.  The same rules apply.
>>>
>>
>> It's little different for QEMU:
>> - the memory address is always valid to QEMU, it's not always true for Kernel
>>    due to context-switch
>>
>> - we have checked the header before use it's data, for example, when we get
>>    data from GET_NAMESPACE_DATA, we have got the @offset and @length from the
>>    memory, then copy memory based on these values, that means the userspace
>>    has no chance to cause buffer overflow by increasing these values at runtime.
>>
>>    The scenario for our case is simple but Kernel is difficult to do
>>    check_all_before_use as many paths may be involved.
>>
>> - guest changes some data is okay, the worst case is that the label data is
>>    corrupted. This is caused by guest itself. Kernel also supports this kind
>>    of behaviour, e,g. network TX zero copy, the userspace page is being
>>    transferred while userspace can still access it.
>>
>> - it's 4K size on x86, full copy wastes CPU time too much.
>
> This isn't performance-critical code and I don't want to review it
> keeping the race conditions in mind the whole time.  Also, if the code
> is modified in the future, the chance of introducing a race is high.
>
> I see this as premature optimization, please just copy in data.


No strong opinion on it... I will do it as you suggest.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 27/32] nvdimm: support DSM_CMD_IMPLEMENTED function
  2015-10-15 15:01         ` [Qemu-devel] " Stefan Hajnoczi
@ 2015-10-16  2:32           ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-16  2:32 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Stefan Hajnoczi, pbonzini, imammedo, gleb, mtosatti, mst, rth,
	ehabkost, dan.j.williams, kvm, qemu-devel



On 10/15/2015 11:01 PM, Stefan Hajnoczi wrote:
> On Wed, Oct 14, 2015 at 10:52:15PM +0800, Xiao Guangrong wrote:
>> On 10/14/2015 05:41 PM, Stefan Hajnoczi wrote:
>>> On Sun, Oct 11, 2015 at 11:52:59AM +0800, Xiao Guangrong wrote:
>>>> +    out->len = sizeof(out->status);
>>>
>>> out->len is uint16_t, it needs cpu_to_le16().  There may be other
>>> instances in this patch series.
>>>
>>
>> out->len is only used internally and is invisible to the guest OS, i.e.
>> we write this value and read it back ourselves. I think it is
>> okay.
>
> 'out' points to guest memory.  Guest memory is untrusted so QEMU cannot
> stash values there - an evil guest could modify them.
>
> Please put the len variable on the QEMU stack or heap where the guest
> cannot access it.

okay okay.
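
The fix Stefan describes (length on the QEMU stack, little-endian conversion only when writing out) can be sketched as follows. This is a hedged sketch: `to_le16`/`to_le32` stand in for QEMU's `cpu_to_le16()`/`cpu_to_le32()`, and `dsm_write_status` is an invented name.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for QEMU's cpu_to_le16()/cpu_to_le32(). */
static uint16_t to_le16(uint16_t v)
{
    uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
    uint16_t r;
    memcpy(&r, b, sizeof(r));
    return r;
}

static uint32_t to_le32(uint32_t v)
{
    uint8_t b[4] = { (uint8_t)v, (uint8_t)(v >> 8),
                     (uint8_t)(v >> 16), (uint8_t)(v >> 24) };
    uint32_t r;
    memcpy(&r, b, sizeof(r));
    return r;
}

/*
 * The length lives in a QEMU-local variable, never in guest memory;
 * only the final, little-endian values are copied out to the guest.
 */
static void dsm_write_status(uint8_t *guest_out, uint32_t status)
{
    uint16_t len = sizeof(status);      /* on QEMU's stack, not guest RAM */
    uint16_t le_len = to_le16(len);
    uint32_t le_status = to_le32(status);

    memcpy(guest_out, &le_len, sizeof(le_len));
    memcpy(guest_out + sizeof(le_len), &le_status, sizeof(le_status));
}
```

Because the conversion happens on a private copy, an "evil guest" scribbling on the output page can corrupt only its own view, never QEMU's control flow.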

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 26/32] nvdimm: save arg3 for NVDIMM device _DSM method
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-19  6:50     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-19  6:50 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Sun, Oct 11, 2015 at 11:52:58AM +0800, Xiao Guangrong wrote:
> Check whether the input Arg3 is valid, then store it into dsm_in if needed
> 
> We only do the save on the NVDIMM device since no function we support on the
> root device takes Arg3
> 
> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> ---
>  hw/mem/nvdimm/acpi.c | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
> index d9fa0fd..3b9399c 100644
> --- a/hw/mem/nvdimm/acpi.c
> +++ b/hw/mem/nvdimm/acpi.c
> @@ -442,7 +442,7 @@ static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
>          int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
>                                             NULL);
>          uint32_t handle = nvdimm_slot_to_handle(slot);
> -        Aml *dev, *method;
> +        Aml *dev, *method, *ifctx;
>  
>          dev = aml_device("NV%02X", slot);
>          aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
> @@ -452,6 +452,24 @@ static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
>          method = aml_method("_DSM", 4);
>          {
>              SAVE_ARG012_HANDLE_LOCK(method, aml_int(handle));
> +
> +            /* Arg3 is passed as Package and it has one element? */
> +            ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
> +                                             aml_int(4)),
> +                                   aml_equal(aml_sizeof(aml_arg(3)),

aml_arg(3) is used many times below.
Pls give it a name that makes sense (not arg3! what is it for?)

> +                                             aml_int(1))));

Pls document AML constants used.
Like this:

            ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
                                             aml_int(4 /* 4 - Package */) ),
                                   aml_equal(aml_sizeof(aml_arg(3)),
                                             aml_int(1))));
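
Extending that style, the magic numbers could be given self-documenting names once. A hedged sketch; these macro names are invented for illustration, not existing QEMU identifiers:

```c
/* ObjectType() result for a Package object, per the ACPI spec's
 * ObjectType operator (0 Uninitialized, 1 Integer, 2 String,
 * 3 Buffer, 4 Package, ...). */
#define NVDIMM_ACPI_TYPE_PACKAGE   4

/* The DSM interface passes Arg3 as a Package holding a single Buffer. */
#define NVDIMM_DSM_ARG3_NUM_ELEMS  1
```

The check would then read `aml_equal(aml_object_type(aml_arg(3)), aml_int(NVDIMM_ACPI_TYPE_PACKAGE))`, which needs no extra comment at the call site.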

> +            {
> +                /* Local0 = Index(Arg3, 0) */
> +                aml_append(ifctx, aml_store(aml_index(aml_arg(3), aml_int(0)),
> +                                            aml_local(0)));
> +                /* Local3 = DeRefOf(Local0) */
> +                aml_append(ifctx, aml_store(aml_derefof(aml_local(0)),
> +                                            aml_local(3)));
> +                /* ARG3 = Local3 */
> +                aml_append(ifctx, aml_store(aml_local(3), aml_name("ARG3")));

This isn't a good way to comment things: you are
just adding ASL before the equivalent C.
Pls document what's going on.




> +            }
> +            aml_append(method, ifctx);
> +
>              NOTIFY_AND_RETURN_UNLOCK(method);
>          }
>          aml_append(dev, method);
> @@ -534,6 +552,7 @@ static void nvdimm_build_acpi_devices(NVDIMMState *state, GSList *device_list,
>      method = aml_method("_DSM", 4);
>      {
>          SAVE_ARG012_HANDLE_LOCK(method, aml_int(0));
> +        /* no command we support on ROOT device has Arg3. */
>          NOTIFY_AND_RETURN_UNLOCK(method);
>      }
>      aml_append(dev, method);
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-19  6:56     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-19  6:56 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Sun, Oct 11, 2015 at 11:52:54AM +0800, Xiao Guangrong wrote:
> We reserve the memory region 0xFF000000 ~ 0xFFF00000 for NVDIMM ACPI,
> which is used as follows:
> - the first page is mapped as MMIO; ACPI writes data to this page to
>   transfer control to QEMU
> 
> - the second page is RAM-based and is used to save the input info of the
>   _DSM method; QEMU reuses it to store output info
> 
> - the rest is mapped as RAM; it's the buffer returned by the _FIT method,
>   which is needed for NVDIMM hotplug
> 

Isn't there some way to document this in code, e.g. with
macros?

Adding text under docs/specs would also be a good idea.
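
One way to document the layout in code, as suggested, would be macros along these lines. A hedged sketch: only the base/size values come from the patch; the offset macro names are invented here.

```c
#include <stdint.h>

/* Guest-physical window reserved for NVDIMM ACPI: 0xFF000000 - 0xFFF00000. */
#define NVDIMM_ACPI_MEM_BASE   0xFF000000ULL
#define NVDIMM_ACPI_MEM_SIZE   0x00F00000ULL

/* Page 0: MMIO; an ACPI write here transfers control to QEMU. */
#define NVDIMM_DSM_MMIO_OFFSET(page_size)  ((uint64_t)0)
/* Page 1: RAM; holds _DSM input, reused by QEMU for the output. */
#define NVDIMM_DSM_DATA_OFFSET(page_size)  ((uint64_t)(page_size))
/* Remaining pages: RAM buffer returned by the _FIT method (for hotplug). */
#define NVDIMM_FIT_OFFSET(page_size)       ((uint64_t)2 * (page_size))
```

With names like these, the commit-message description of the three sub-regions is visible at every use site.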


> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> ---
>  hw/i386/pc.c            |   3 ++
>  hw/mem/Makefile.objs    |   2 +-
>  hw/mem/nvdimm/acpi.c    | 120 ++++++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/i386/pc.h    |   2 +
>  include/hw/mem/nvdimm.h |  19 ++++++++
>  5 files changed, 145 insertions(+), 1 deletion(-)
>  create mode 100644 hw/mem/nvdimm/acpi.c
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 6694b18..8fea4c3 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1360,6 +1360,9 @@ FWCfgState *pc_memory_init(PCMachineState *pcms,
>              exit(EXIT_FAILURE);
>          }
>  
> +        nvdimm_init_memory_state(&pcms->nvdimm_memory, system_memory, machine,
> +                                 TARGET_PAGE_SIZE);
> +

Shouldn't this be conditional on presence of the nvdimm device?


>          pcms->hotplug_memory.base =
>              ROUND_UP(0x100000000ULL + pcms->above_4g_mem_size, 1ULL << 30);
>  
> diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs
> index e0ff328..7310bac 100644
> --- a/hw/mem/Makefile.objs
> +++ b/hw/mem/Makefile.objs
> @@ -1,3 +1,3 @@
>  common-obj-$(CONFIG_DIMM) += dimm.o
>  common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o
> -common-obj-$(CONFIG_NVDIMM) += nvdimm/nvdimm.o
> +common-obj-$(CONFIG_NVDIMM) += nvdimm/nvdimm.o nvdimm/acpi.o
> diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
> new file mode 100644
> index 0000000..b640874
> --- /dev/null
> +++ b/hw/mem/nvdimm/acpi.c
> @@ -0,0 +1,120 @@
> +/*
> + * NVDIMM ACPI Implementation
> + *
> + * Copyright(C) 2015 Intel Corporation.
> + *
> + * Author:
> + *  Xiao Guangrong <guangrong.xiao@linux.intel.com>
> + *
> + * NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT)
> + * and the DSM specification can be found at:
> + *       http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
> + *
> + * Currently, it only supports PMEM Virtualization.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see <http://www.gnu.org/licenses/>
> + */
> +
> +#include "qemu-common.h"
> +#include "hw/acpi/acpi.h"
> +#include "hw/acpi/aml-build.h"
> +#include "hw/mem/nvdimm.h"
> +#include "internal.h"
> +
> +/* System Physical Address Range Structure */
> +struct nfit_spa {
> +    uint16_t type;
> +    uint16_t length;
> +    uint16_t spa_index;
> +    uint16_t flags;
> +    uint32_t reserved;
> +    uint32_t proximity_domain;
> +    uint8_t type_guid[16];
> +    uint64_t spa_base;
> +    uint64_t spa_length;
> +    uint64_t mem_attr;
> +} QEMU_PACKED;
> +typedef struct nfit_spa nfit_spa;
> +
> +/* Memory Device to System Physical Address Range Mapping Structure */
> +struct nfit_memdev {
> +    uint16_t type;
> +    uint16_t length;
> +    uint32_t nfit_handle;
> +    uint16_t phys_id;
> +    uint16_t region_id;
> +    uint16_t spa_index;
> +    uint16_t dcr_index;
> +    uint64_t region_len;
> +    uint64_t region_offset;
> +    uint64_t region_dpa;
> +    uint16_t interleave_index;
> +    uint16_t interleave_ways;
> +    uint16_t flags;
> +    uint16_t reserved;
> +} QEMU_PACKED;
> +typedef struct nfit_memdev nfit_memdev;
> +
> +/* NVDIMM Control Region Structure */
> +struct nfit_dcr {
> +    uint16_t type;
> +    uint16_t length;
> +    uint16_t dcr_index;
> +    uint16_t vendor_id;
> +    uint16_t device_id;
> +    uint16_t revision_id;
> +    uint16_t sub_vendor_id;
> +    uint16_t sub_device_id;
> +    uint16_t sub_revision_id;
> +    uint8_t reserved[6];
> +    uint32_t serial_number;
> +    uint16_t fic;
> +    uint16_t num_bcw;
> +    uint64_t bcw_size;
> +    uint64_t cmd_offset;
> +    uint64_t cmd_size;
> +    uint64_t status_offset;
> +    uint64_t status_size;
> +    uint16_t flags;
> +    uint8_t reserved2[6];
> +} QEMU_PACKED;
> +typedef struct nfit_dcr nfit_dcr;

Struct naming violates the QEMU coding style.
Pls fix it.

> +
> +static uint64_t nvdimm_device_structure_size(uint64_t slots)
> +{
> +    /* each nvdimm has three structures. */
> +    return slots * (sizeof(nfit_spa) + sizeof(nfit_memdev) + sizeof(nfit_dcr));
> +}
> +
> +static uint64_t nvdimm_acpi_memory_size(uint64_t slots, uint64_t page_size)
> +{
> +    uint64_t size = nvdimm_device_structure_size(slots);
> +
> +    /* two pages for nvdimm _DSM method. */
> +    return size + page_size * 2;
> +}
> +
> +void nvdimm_init_memory_state(NVDIMMState *state, MemoryRegion*system_memory,
> +                              MachineState *machine , uint64_t page_size)
> +{
> +    QEMU_BUILD_BUG_ON(nvdimm_acpi_memory_size(ACPI_MAX_RAM_SLOTS,
> +                                   page_size) >= NVDIMM_ACPI_MEM_SIZE);
> +
> +    state->base = NVDIMM_ACPI_MEM_BASE;
> +    state->page_size = page_size;
> +
> +    memory_region_init(&state->mr, OBJECT(machine), "nvdimm-acpi",
> +                       NVDIMM_ACPI_MEM_SIZE);
> +    memory_region_add_subregion(system_memory, state->base, &state->mr);
> +}
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 693b6c5..fd65c27 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -17,6 +17,7 @@
>  #include "hw/boards.h"
>  #include "hw/compat.h"
>  #include "hw/mem/dimm.h"
> +#include "hw/mem/nvdimm.h"
>  
>  #define HPET_INTCAP "hpet-intcap"
>  
> @@ -32,6 +33,7 @@ struct PCMachineState {
>  
>      /* <public> */
>      MemoryHotplugState hotplug_memory;
> +    NVDIMMState nvdimm_memory;
>  
>      HotplugHandler *acpi_dev;
>      ISADevice *rtc;
> diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
> index f6bd2c4..aa95961 100644
> --- a/include/hw/mem/nvdimm.h
> +++ b/include/hw/mem/nvdimm.h
> @@ -15,6 +15,10 @@
>  
>  #include "hw/mem/dimm.h"
>  
> +/* Memory region 0xFF000000 ~ 0xFFF00000 is reserved for NVDIMM ACPI. */
> +#define NVDIMM_ACPI_MEM_BASE   0xFF000000ULL
> +#define NVDIMM_ACPI_MEM_SIZE   0xF00000ULL
> +
>  #define TYPE_NVDIMM "nvdimm"
>  #define NVDIMM(obj) \
>      OBJECT_CHECK(NVDIMMDevice, (obj), TYPE_NVDIMM)
> @@ -30,4 +34,19 @@ struct NVDIMMDevice {
>  };
>  typedef struct NVDIMMDevice NVDIMMDevice;
>  
> +/*
> + * NVDIMMState:
> + * @base: address in guest address space where NVDIMM ACPI memory begins.
> + * @page_size: the page size of target platform.
> + * @mr: NVDIMM ACPI memory address space container.
> + */
> +struct NVDIMMState {
> +    ram_addr_t base;
> +    uint64_t page_size;
> +    MemoryRegion mr;
> +};
> +typedef struct NVDIMMState NVDIMMState;
> +
> +void nvdimm_init_memory_state(NVDIMMState *state, MemoryRegion*system_memory,
> +                              MachineState *machine , uint64_t page_size);
>  #endif
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 00/32] implement vNVDIMM
  2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-19  6:56   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-19  6:56 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Sun, Oct 11, 2015 at 11:52:32AM +0800, Xiao Guangrong wrote:
> Changelog in v3:
> There are huge changes in this version; thanks to Igor, Stefan, Paolo, Eduardo
> and Michael for their valuable comments. The patchset is finally in better shape.

Thanks!
This needs some changes in coding style, and more comments, to
make it easier to maintain going forward.

High level comments - I didn't point out all instances,
please go over code and locate them yourself.
I focused on acpi code in this review.

    - fix coding style violations, prefix everything with nvdimm_ etc
    - in acpi code, avoid manual memory management/complex pointer math
    - comments are needed to document APIs & explain what's going on
    - constants need comments too; refer to text that
      can be looked up in the ACPI spec verbatim


> - changes from Igor's comments:
>   1) abstract a dimm device type from pc-dimm and create the nvdimm device
>      based on dimm; it then uses a memory backend device as the nvdimm's
>      memory, and NUMA support follows easily.
>   2) let the file-backend device support any kind of filesystem, not only
>      hugetlbfs, and let it work on a file, not only a directory. This is
>      achieved by extending 'mem-path': if it's a directory it works as
>      before; if it's a file, memory is allocated directly from it.
>   3) we figured out an unused memory hole below 4G, 0xFF000000 ~
>      0xFFF00000; this range is large enough for NVDIMM ACPI, as building a
>      64-bit ACPI SSDT/DSDT table would break Windows XP.
>      BTW, only making SSDT.rev = 2 does not work, since the integer width
>      depends only on DSDT.rev, per 19.6.28 DefinitionBlock (Declare
>      Definition Block) in the ACPI spec:
> | Note: For compatibility with ACPI versions before ACPI 2.0, the bit 
> | width of Integer objects is dependent on the ComplianceRevision of the DSDT.
> | If the ComplianceRevision is less than 2, all integers are restricted to 32 
> | bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets 
> | the global integer width for all integers, including integers in SSDTs.
>   4) use the lowest ACPI spec version to document AML terms.
>   5) use "nvdimm" as nvdimm device name instead of "pc-nvdimm"
> 
> - changes from Stefan's comments:
>   1) do not do endian adjustment in place since _DSM memory is visible to
>      the guest
>   2) use the target platform's page size instead of a fixed PAGE_SIZE
>      definition
>   3) lots of code style improvements and typo fixes.
>   4) live migration fix
> - changes from Paolo's comments:
>   1) improve the name of memory region
>   
> - other changes:
>   1) return the exact buffer size for the _DSM method instead of the page size.
>   2) introduce a mutex in NVDIMM ACPI since the _DSM memory is shared by all
>      nvdimm devices.
>   3) NUMA support
>   4) implement the _FIT method
>   5) rename "configdata" to "reserve-label-data"
>   6) simplify _DSM arg3 determination
>   7) update the main changelog to reflect v3.
> 
> Changelog in v2:
> - Use little endian for the DSM method, thanks to Stefan for the suggestion
> 
> - introduce a new parameter, @configdata; if it's false, QEMU will
>   build a static, readonly namespace in memory and use it to serve
>   DSM GET_CONFIG_SIZE/GET_CONFIG_DATA requests. In this case, no
>   reserved region is needed at the end of the @file, which is good for
>   users who want to pass the whole nvdimm device through and make its
>   data completely visible to the guest
> 
> - divide the source code into separate files and add maintainer info
> 
> BTW, PCOMMIT virtualization on the KVM side is work in progress; hopefully it
> will be posted next week
> 
> ====== Background ======
> NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be supported
> on Intel's platform. NVDIMMs are discovered via ACPI and configured by the
> _DSM method of the NVDIMM device in ACPI. Some supporting documents can
> be found at:
> ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
> NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
> DSM Interface Example: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
> Driver Writer's Guide: http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
> 
> Currently, the NVDIMM driver has been merged into the upstream Linux kernel,
> and this patchset tries to enable it in the virtualization field
> 
> ====== Design ======
> NVDIMM supports two access modes: one is PMEM, which maps the NVDIMM into the
> CPU's address space so the CPU can directly access it as normal memory; the
> other is BLK, which is used as a block device to reduce the consumption of
> CPU address space
> 
> BLK mode accesses the NVDIMM via a Command Register window and a Data Register
> window. BLK virtualization has high overhead since each sector access causes
> at least two VM-exits, so we currently only implement vPMEM in this patchset
> 
> --- vPMEM design ---
> We introduce a new device named "nvdimm"; it uses a memory backend device as
> NVDIMM memory. The file in the file-backend device can be a regular file or a
> block device. We can use any file for testing or emulation; however,
> in the real world, the files passed to the guest are:
> - a regular file created on a DAX-enabled filesystem on an NVDIMM device
>   on the host
> - the raw PMEM device on the host, e.g. /dev/pmem0
> Memory accesses at addresses created by mmap on these kinds of files
> directly reach the NVDIMM device on the host.
> 
> --- vConfigure data area design ---
> Each NVDIMM device has a configuration data area which is used to store label
> namespace data. In order to emulate this area, we divide the file into two
> parts:
> - the first part is (0, size - 128K], which is used as PMEM
> - the 128K at the end of the file, which is used as the Label Data Area
> This way the label namespace data persists across power loss or system
> failure.
> 
> We also support passing the whole file to the guest without reserving any
> region for the label data area, controlled by the "reserve-label-data"
> parameter - if it is false then QEMU builds a static, read-only namespace in
> memory that covers the whole file size. The parameter is false by default.
> 
> --- _DSM method design ---
> _DSM in ACPI is used to configure the NVDIMM; currently we only allow access
> to label namespace data, i.e. Get Namespace Label Size (Function Index 4),
> Get Namespace Label Data (Function Index 5) and Set Namespace Label Data
> (Function Index 6)
> 
> _DSM uses two pages to transfer data between ACPI and QEMU: the first page is
> RAM-based and is used to save the input of the _DSM method, and QEMU reuses it
> to store the output; the other page is MMIO-based, and ACPI writes to this
> page to transfer control to QEMU
> 
> ====== Test ======
> In host
> 1) create a memory-backed file, e.g. # dd if=/dev/zero of=/tmp/nvdimm bs=1G count=10
> 2) append "-object memory-backend-file,share,id=mem1,
>    mem-path=/tmp/nvdimm -device nvdimm,memdev=mem1,reserve-label-data,
>    id=nv1" in QEMU command line
> 
> In the guest, download the latest upstream kernel (4.2 merge window) and enable
> ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM.
> 1) insmod drivers/nvdimm/libnvdimm.ko
> 2) insmod drivers/acpi/nfit.ko
> 3) insmod drivers/nvdimm/nd_btt.ko
> 4) insmod drivers/nvdimm/nd_pmem.ko
> You will see the whole nvdimm device used as a single namespace, and /dev/pmem0
> appears. You can do anything with /dev/pmem0, including DAX access.
> 
> Currently the Linux NVDIMM driver does not support namespace operations on this
> kind of PMEM; apply the changes below to support dynamic namespaces:
> 
> @@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *a
>                         continue;
>                 }
>  
> -               if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> +               //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> +               if (nfit_mem->memdev_pmem)
>                         flags |= NDD_ALIASING;
> 
> You can append another NVDIMM device in the guest and do:
> # cd /sys/bus/nd/devices/
> # cd namespace1.0/
> # echo `uuidgen` > uuid
> # echo `expr 1024 \* 1024 \* 128` > size
> then reload nd_pmem.ko
> 
> You can see /dev/pmem1 appears
> 
> ====== TODO ======
> NVDIMM hotplug support
> 
> Xiao Guangrong (32):
>   acpi: add aml_derefof
>   acpi: add aml_sizeof
>   acpi: add aml_create_field
>   acpi: add aml_mutex, aml_acquire, aml_release
>   acpi: add aml_concatenate
>   acpi: add aml_object_type
>   util: introduce qemu_file_get_page_size()
>   exec: allow memory to be allocated from any kind of path
>   exec: allow file_ram_alloc to work on file
>   hostmem-file: clean up memory allocation
>   hostmem-file: use whole file size if possible
>   pc-dimm: remove DEFAULT_PC_DIMMSIZE
>   pc-dimm: make pc_existing_dimms_capacity static and rename it
>   pc-dimm: drop the prefix of pc-dimm
>   stubs: rename qmp_pc_dimm_device_list.c
>   pc-dimm: rename pc-dimm.c and pc-dimm.h
>   dimm: abstract dimm device from pc-dimm
>   dimm: get mapped memory region from DIMMDeviceClass->get_memory_region
>   dimm: keep the state of the whole backend memory
>   dimm: introduce realize callback
>   nvdimm: implement NVDIMM device abstract
>   nvdimm: init the address region used by NVDIMM ACPI
>   nvdimm: build ACPI NFIT table
>   nvdimm: init the address region used by DSM method
>   nvdimm: build ACPI nvdimm devices
>   nvdimm: save arg3 for NVDIMM device _DSM method
>   nvdimm: support DSM_CMD_IMPLEMENTED function
>   nvdimm: support DSM_CMD_NAMESPACE_LABEL_SIZE function
>   nvdimm: support DSM_CMD_GET_NAMESPACE_LABEL_DATA
>   nvdimm: support DSM_CMD_SET_NAMESPACE_LABEL_DATA
>   nvdimm: allow using whole backend memory as pmem
>   nvdimm: add maintain info
> 
>  MAINTAINERS                                        |   6 +
>  backends/hostmem-file.c                            |  58 +-
>  default-configs/i386-softmmu.mak                   |   2 +
>  default-configs/x86_64-softmmu.mak                 |   2 +
>  exec.c                                             | 113 ++-
>  hmp.c                                              |   2 +-
>  hw/Makefile.objs                                   |   2 +-
>  hw/acpi/aml-build.c                                |  83 ++
>  hw/acpi/ich9.c                                     |   8 +-
>  hw/acpi/memory_hotplug.c                           |  26 +-
>  hw/acpi/piix4.c                                    |   8 +-
>  hw/i386/acpi-build.c                               |   4 +
>  hw/i386/pc.c                                       |  37 +-
>  hw/mem/Makefile.objs                               |   3 +
>  hw/mem/{pc-dimm.c => dimm.c}                       | 240 ++---
>  hw/mem/nvdimm/acpi.c                               | 961 +++++++++++++++++++++
>  hw/mem/nvdimm/internal.h                           |  41 +
>  hw/mem/nvdimm/namespace.c                          | 309 +++++++
>  hw/mem/nvdimm/nvdimm.c                             | 136 +++
>  hw/mem/pc-dimm.c                                   | 506 +----------
>  hw/ppc/spapr.c                                     |  20 +-
>  include/hw/acpi/aml-build.h                        |   8 +
>  include/hw/i386/pc.h                               |   4 +-
>  include/hw/mem/{pc-dimm.h => dimm.h}               |  68 +-
>  include/hw/mem/nvdimm.h                            |  58 ++
>  include/hw/mem/pc-dimm.h                           | 105 +--
>  include/hw/ppc/spapr.h                             |   2 +-
>  include/qemu/osdep.h                               |   1 +
>  numa.c                                             |   4 +-
>  qapi-schema.json                                   |   8 +-
>  qmp.c                                              |   4 +-
>  stubs/Makefile.objs                                |   2 +-
>  ...c_dimm_device_list.c => qmp_dimm_device_list.c} |   4 +-
>  target-ppc/kvm.c                                   |  21 +-
>  trace-events                                       |   8 +-
>  util/oslib-posix.c                                 |  16 +
>  util/oslib-win32.c                                 |   5 +
>  37 files changed, 2023 insertions(+), 862 deletions(-)
>  rename hw/mem/{pc-dimm.c => dimm.c} (65%)
>  create mode 100644 hw/mem/nvdimm/acpi.c
>  create mode 100644 hw/mem/nvdimm/internal.h
>  create mode 100644 hw/mem/nvdimm/namespace.c
>  create mode 100644 hw/mem/nvdimm/nvdimm.c
>  rewrite hw/mem/pc-dimm.c (91%)
>  rename include/hw/mem/{pc-dimm.h => dimm.h} (50%)
>  create mode 100644 include/hw/mem/nvdimm.h
>  rewrite include/hw/mem/pc-dimm.h (97%)
>  rename stubs/{qmp_pc_dimm_device_list.c => qmp_dimm_device_list.c} (56%)
> 
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 00/32] implement vNVDIMM
@ 2015-10-19  6:56   ` Michael S. Tsirkin
  0 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-19  6:56 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: ehabkost, kvm, gleb, mtosatti, qemu-devel, stefanha, imammedo,
	pbonzini, dan.j.williams, rth

On Sun, Oct 11, 2015 at 11:52:32AM +0800, Xiao Guangrong wrote:
> Changelog in v3:
> There is huge change in this version, thank Igor, Stefan, Paolo, Eduardo,
> Michael for their valuable comments, the patchset finally gets better shape.

Thanks!
This needs some changes in coding style, and more comments, to
make it easier to maintain going forward.

High level comments - I didn't point out all instances,
please go over code and locate them yourself.
I focused on acpi code in this review.

    - fix coding style violations, prefix everything with nvdimm_ etc
    - in acpi code, avoid manual memory management/complex pointer math
    - comments are needed to document APIs & explain what's going on
    - constants need comments too; refer to text that
      can be looked up in the ACPI spec verbatim


> - changes from Igor's comments:
>   1) abstract dimm device type from pc-dimm and create nvdimm device based on
>      dimm, then it uses memory backend device as nvdimm's memory and NUMA has
>      easily been implemented.
>   2) let file-backend device support any kind of filesystem not only for
>      hugetlbfs and let it work on file not only for directory which is
>      achieved by extending 'mem-path' - if it's a directory then it works as
>      current behavior, otherwise if it's file then directly allocates memory
>      from it.
>   3) we found an unused memory hole below 4G, 0xFF00000 ~ 0xFFF00000;
>      this range is large enough for NVDIMM ACPI, as building a 64-bit
>      ACPI SSDT/DSDT table would break Windows XP.
>      BTW, making only SSDT.rev = 2 does not work, since the integer width
>      depends only on DSDT.rev, per 19.6.28 DefinitionBlock (Declare
>      Definition Block) in the ACPI spec:
> | Note: For compatibility with ACPI versions before ACPI 2.0, the bit 
> | width of Integer objects is dependent on the ComplianceRevision of the DSDT.
> | If the ComplianceRevision is less than 2, all integers are restricted to 32 
> | bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets 
> | the global integer width for all integers, including integers in SSDTs.
>   4) use the lowest ACPI spec version to document AML terms.
>   5) use "nvdimm" as nvdimm device name instead of "pc-nvdimm"
> 
> - changes from Stefan's comments:
>   1) do not do endian adjustment in-place since _DSM memory is visible to guest
>   2) use target platform's target page size instead of fixed PAGE_SIZE
>      definition
>   3) lots of code style improvement and typo fixes.
>   4) live migration fix
> - changes from Paolo's comments:
>   1) improve the name of memory region
>   
> - other changes:
>   1) return exact buffer size for _DSM method instead of the page size.
>   2) introduce mutex in NVDIMM ACPI as the _DSM memory is shared by all nvdimm
>      devices.
>   3) NUMA support
>   4) implement _FIT method
>   5) rename "configdata" to "reserve-label-data"
>   6) simplify _DSM arg3 determination
>   7) main changelog update to let it reflect v3.
> 
> Changlog in v2:
> - Use little endian for the DSM method, thanks to Stefan's suggestion
> 
> - introduce a new parameter, @configdata; if it's false, QEMU will
>   build a static and read-only namespace in memory and use it to serve
>   DSM GET_CONFIG_SIZE/GET_CONFIG_DATA requests. In this case, no
>   reserved region is needed at the end of the @file, which is good for
>   the user who wants to pass the whole nvdimm device and make its data
>   completely visible to the guest
> 
> - divide the source code into separate files and add maintainer info
> 
> [..]


* Re: [PATCH v3 00/32] implement vNVDIMM
  2015-10-13  5:29     ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-19  6:57       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-19  6:57 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Tue, Oct 13, 2015 at 01:29:48PM +0800, Xiao Guangrong wrote:
> 
> 
> On 10/12/2015 07:55 PM, Michael S. Tsirkin wrote:
> >On Sun, Oct 11, 2015 at 11:52:32AM +0800, Xiao Guangrong wrote:
> >>Changelog in v3:
> >>There is huge change in this version, thank Igor, Stefan, Paolo, Eduardo,
> >>Michael for their valuable comments, the patchset finally gets better shape.
> >
> >Thanks!
> >This needs some changes in coding style, and more comments, to
> >make it easier to maintain going forward.
> 
> Thanks for your review, Michael. I have learned a lot of things from
> your comments.
> 
> >
> >High level comments - I didn't point out all instances,
> >please go over code and locate them yourself.
> >I focused on acpi code in this review.
> 
> Okay, will do.
> 
> >
> >     - fix coding style violations, prefix everything with nvdimm_ etc
> 
> Actually I did not pay attention to naming things that are only used
> internally. Thank you for pointing it out; I will fix it in the next version.
> 
> >     - in acpi code, avoid manual memory management/complex pointer math
> 
> I am not very good at ACPI ASL/AML, could you please give more detail?

It's about C.

For example:
	Foo *foo = acpi_data_push(table, sizeof *foo);
	Bar *bar = acpi_data_push(table, sizeof *bar);
is pretty obviously safe, and it doesn't require you to do any
calculations.
	char *buf = acpi_data_push(table, sizeof *foo + sizeof *bar);
is worse, now you need:
	Bar *bar = (Bar *)(buf + sizeof *foo);
which will corrupt memory if you get the size wrong in push.

> >     - comments are needed to document apis & explain what's going on
> >     - constants need comments too, refer to text that
> >       can be looked up in acpi spec verbatim
> 
> Indeed, will document carefully.



* Re: [PATCH v3 26/32] nvdimm: save arg3 for NVDIMM device _DSM method
  2015-10-19  6:50     ` [Qemu-devel] " Michael S. Tsirkin
@ 2015-10-19  7:14       ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-19  7:14 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/19/2015 02:50 PM, Michael S. Tsirkin wrote:
> On Sun, Oct 11, 2015 at 11:52:58AM +0800, Xiao Guangrong wrote:
>> Check if the input Arg3 is valid then store it into dsm_in if needed
>>
>> We only do the save on NVDIMM device since we are not going to support any
>> function on root device
>>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> ---
>>   hw/mem/nvdimm/acpi.c | 21 ++++++++++++++++++++-
>>   1 file changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
>> index d9fa0fd..3b9399c 100644
>> --- a/hw/mem/nvdimm/acpi.c
>> +++ b/hw/mem/nvdimm/acpi.c
>> @@ -442,7 +442,7 @@ static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
>>           int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
>>                                              NULL);
>>           uint32_t handle = nvdimm_slot_to_handle(slot);
>> -        Aml *dev, *method;
>> +        Aml *dev, *method, *ifctx;
>>
>>           dev = aml_device("NV%02X", slot);
>>           aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
>> @@ -452,6 +452,24 @@ static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
>>           method = aml_method("_DSM", 4);
>>           {
>>               SAVE_ARG012_HANDLE_LOCK(method, aml_int(handle));
>> +
>> +            /* Arg3 is passed as Package and it has one element? */
>> +            ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
>> +                                             aml_int(4)),
>> +                                   aml_equal(aml_sizeof(aml_arg(3)),
>
> aml_arg(3) is used many times below.
> Pls give it a name that makes sense (not arg3! what is it for?)
>

Er, aml_arg(3) is just the fourth parameter of the _DSM method. Will add some
comments:

/*
  * The fourth parameter (Arg3) of _DSM is a package which contains a buffer, the
  * layout of the buffer is specified by UUID (Arg0), Revision ID (Arg1) and
  * Function Index (Arg2) which are documented in the DSM specification.
  */

>> +                                             aml_int(1))));
>
> Pls document AML constants used.
> Like this:
>
>              ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
>                                               aml_int(4 /* 4 - Package */) ),
>                                     aml_equal(aml_sizeof(aml_arg(3)),
>                                               aml_int(1))));
>
>> +            {
>> +                /* Local0 = Index(Arg3, 0) */
>> +                aml_append(ifctx, aml_store(aml_index(aml_arg(3), aml_int(0)),
>> +                                            aml_local(0)));
>> +                /* Local3 = DeRefOf(Local0) */
>> +                aml_append(ifctx, aml_store(aml_derefof(aml_local(0)),
>> +                                            aml_local(3)));
>> +                /* ARG3 = Local3 */
>> +                aml_append(ifctx, aml_store(aml_local(3), aml_name("ARG3")));
>
> This isn't a good way to comment things: you are
> just adding ASL before the equivalent C.
> Pls document what's going on.
>

Okay... I just thought C is a little more readable than AML. Will change the
comment to:

/* fetch buffer from the package (Arg3) and store it to DSM memory. */

Thanks.


* Re: [Qemu-devel] [PATCH v3 26/32] nvdimm: save arg3 for NVDIMM device _DSM method
@ 2015-10-19  7:14       ` Xiao Guangrong
  0 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-19  7:14 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: ehabkost, kvm, gleb, mtosatti, qemu-devel, stefanha, imammedo,
	pbonzini, dan.j.williams, rth



On 10/19/2015 02:50 PM, Michael S. Tsirkin wrote:
> On Sun, Oct 11, 2015 at 11:52:58AM +0800, Xiao Guangrong wrote:
>> Check if the input Arg3 is valid then store it into dsm_in if needed
>>
>> We only do the save on NVDIMM device since we are not going to support any
>> function on root device
>>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> ---
>>   hw/mem/nvdimm/acpi.c | 21 ++++++++++++++++++++-
>>   1 file changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
>> index d9fa0fd..3b9399c 100644
>> --- a/hw/mem/nvdimm/acpi.c
>> +++ b/hw/mem/nvdimm/acpi.c
>> @@ -442,7 +442,7 @@ static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
>>           int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
>>                                              NULL);
>>           uint32_t handle = nvdimm_slot_to_handle(slot);
>> -        Aml *dev, *method;
>> +        Aml *dev, *method, *ifctx;
>>
>>           dev = aml_device("NV%02X", slot);
>>           aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
>> @@ -452,6 +452,24 @@ static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
>>           method = aml_method("_DSM", 4);
>>           {
>>               SAVE_ARG012_HANDLE_LOCK(method, aml_int(handle));
>> +
>> +            /* Arg3 is passed as Package and it has one element? */
>> +            ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
>> +                                             aml_int(4)),
>> +                                   aml_equal(aml_sizeof(aml_arg(3)),
>
> aml_arg(3) is used many times below.
> Pls give it a name that makes sense (not arg3! what is it for?)
>

Er. aml_arg(3) is just the fourth parameter of _DSM method. Will add some
comments:

/*
  * The fourth parameter (Arg3) of _DMS is a package which contains a buffer, the
  * layout of the buffer is specified by UUID (Arg0), Revision ID (Arg1) and
  * Function Index (Arg2) which are documented in the DSM specification.
  */

>> +                                             aml_int(1))));
>
> Pls document AML constants used.
> Like this:
>
>              ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
>                                               aml_int(4 /* 4 - Package */) ),
>                                     aml_equal(aml_sizeof(aml_arg(3)),
>                                               aml_int(1))));
>
>> +            {
>> +                /* Local0 = Index(Arg3, 0) */
>> +                aml_append(ifctx, aml_store(aml_index(aml_arg(3), aml_int(0)),
>> +                                            aml_local(0)));
>> +                /* Local3 = DeRefOf(Local0) */
>> +                aml_append(ifctx, aml_store(aml_derefof(aml_local(0)),
>> +                                            aml_local(3)));
>> +                /* ARG3 = Local3 */
>> +                aml_append(ifctx, aml_store(aml_local(3), aml_name("ARG3")));
>
> This isn't a good way to comment things: you are
> just adding ASL before the equivalent C.
> Pls document what's going on.
>

Okay... I just thought the C was a little more readable than the AML. Will change
the comment to:

/* fetch the buffer from the package (Arg3) and store it into the DSM memory. */

Thanks.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19  6:56     ` [Qemu-devel] " Michael S. Tsirkin
@ 2015-10-19  7:27       ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-19  7:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/19/2015 02:56 PM, Michael S. Tsirkin wrote:
> On Sun, Oct 11, 2015 at 11:52:54AM +0800, Xiao Guangrong wrote:
>> We reserve the memory region 0xFF00000 ~ 0xFFF00000 for NVDIMM ACPI
>> which is used as:
>> - the first page is mapped as MMIO, ACPI write data to this page to
>>    transfer the control to QEMU
>>
>> - the second page is RAM-based which used to save the input info of
>>    _DSM method and QEMU reuse it store output info
>>
>> - the left is mapped as RAM, it's the buffer returned by _FIT method,
>>    this is needed by NVDIMM hotplug
>>
>
> Isn't there some way to document this in code, e.g. with
> macros?
>

Yes. It's also documented when DSM memory is created, see
nvdimm_build_dsm_memory() introduced in
[PATCH v4 25/33] nvdimm acpi: init the address region used by DSM

> Adding text under docs/specs would also be a good idea.
>

Yes. A doc has been added in v4.

>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> ---
>>   hw/i386/pc.c            |   3 ++
>>   hw/mem/Makefile.objs    |   2 +-
>>   hw/mem/nvdimm/acpi.c    | 120 ++++++++++++++++++++++++++++++++++++++++++++++++
>>   include/hw/i386/pc.h    |   2 +
>>   include/hw/mem/nvdimm.h |  19 ++++++++
>>   5 files changed, 145 insertions(+), 1 deletion(-)
>>   create mode 100644 hw/mem/nvdimm/acpi.c
>>
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index 6694b18..8fea4c3 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -1360,6 +1360,9 @@ FWCfgState *pc_memory_init(PCMachineState *pcms,
>>               exit(EXIT_FAILURE);
>>           }
>>
>> +        nvdimm_init_memory_state(&pcms->nvdimm_memory, system_memory, machine,
>> +                                 TARGET_PAGE_SIZE);
>> +
>
> Shouldn't this be conditional on presence of the nvdimm device?
>

We will enable hotplug on nvdimm devices in the near future once the Linux driver
is ready. I'd like to keep it here for future development.
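Agreed on documenting it, though; if the reservation stays, the three-part layout from the commit message could be captured with macros along these lines (a sketch only: every name below is hypothetical and not from the patch, and 4 KiB stands in for the page size that pc_memory_init() actually passes as TARGET_PAGE_SIZE):

```c
#include <stdint.h>

/*
 * Hypothetical macros describing the reserved NVDIMM ACPI region
 * (0xFF00000 ~ 0xFFF00000 per the commit message).  None of these
 * names exist in the patch; they only illustrate the layout.
 */
#define NVDIMM_ACPI_PAGE_SIZE  0x1000ULL   /* stand-in for TARGET_PAGE_SIZE */
#define NVDIMM_ACPI_MEM_BASE   0xFF00000ULL
#define NVDIMM_ACPI_MEM_END    0xFFF00000ULL

/* page 0: MMIO; AML writes here to transfer control to QEMU */
#define NVDIMM_ACPI_IO_OFFSET  0x0ULL
/* page 1: RAM holding the _DSM input, reused for the output */
#define NVDIMM_ACPI_DSM_OFFSET NVDIMM_ACPI_PAGE_SIZE
/* the rest: RAM buffer returned by _FIT, needed for hotplug */
#define NVDIMM_ACPI_FIT_OFFSET (2 * NVDIMM_ACPI_PAGE_SIZE)

static inline uint64_t nvdimm_acpi_fit_space(void)
{
    return NVDIMM_ACPI_MEM_END - NVDIMM_ACPI_MEM_BASE
           - NVDIMM_ACPI_FIT_OFFSET;
}
```

Named offsets like these would also answer the "document this in code" comment above without relying on the commit message.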

>
>>           pcms->hotplug_memory.base =
>>               ROUND_UP(0x100000000ULL + pcms->above_4g_mem_size, 1ULL << 30);
>>
>> diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs
>> index e0ff328..7310bac 100644
>> --- a/hw/mem/Makefile.objs
>> +++ b/hw/mem/Makefile.objs
>> @@ -1,3 +1,3 @@
>>   common-obj-$(CONFIG_DIMM) += dimm.o
>>   common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o
>> -common-obj-$(CONFIG_NVDIMM) += nvdimm/nvdimm.o
>> +common-obj-$(CONFIG_NVDIMM) += nvdimm/nvdimm.o nvdimm/acpi.o
>> diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
>> new file mode 100644
>> index 0000000..b640874
>> --- /dev/null
>> +++ b/hw/mem/nvdimm/acpi.c
>> @@ -0,0 +1,120 @@
>> +/*
>> + * NVDIMM ACPI Implementation
>> + *
>> + * Copyright(C) 2015 Intel Corporation.
>> + *
>> + * Author:
>> + *  Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> + *
>> + * NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT)
>> + * and the DSM specfication can be found at:
>> + *       http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
>> + *
>> + * Currently, it only supports PMEM Virtualization.
>> + *
>> + * This library is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License as published by the Free Software Foundation; either
>> + * version 2 of the License, or (at your option) any later version.
>> + *
>> + * This library is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * Lesser General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU Lesser General Public
>> + * License along with this library; if not, see <http://www.gnu.org/licenses/>
>> + */
>> +
>> +#include "qemu-common.h"
>> +#include "hw/acpi/acpi.h"
>> +#include "hw/acpi/aml-build.h"
>> +#include "hw/mem/nvdimm.h"
>> +#include "internal.h"
>> +
>> +/* System Physical Address Range Structure */
>> +struct nfit_spa {
>> +    uint16_t type;
>> +    uint16_t length;
>> +    uint16_t spa_index;
>> +    uint16_t flags;
>> +    uint32_t reserved;
>> +    uint32_t proximity_domain;
>> +    uint8_t type_guid[16];
>> +    uint64_t spa_base;
>> +    uint64_t spa_length;
>> +    uint64_t mem_attr;
>> +} QEMU_PACKED;
>> +typedef struct nfit_spa nfit_spa;
>> +
>> +/* Memory Device to System Physical Address Range Mapping Structure */
>> +struct nfit_memdev {
>> +    uint16_t type;
>> +    uint16_t length;
>> +    uint32_t nfit_handle;
>> +    uint16_t phys_id;
>> +    uint16_t region_id;
>> +    uint16_t spa_index;
>> +    uint16_t dcr_index;
>> +    uint64_t region_len;
>> +    uint64_t region_offset;
>> +    uint64_t region_dpa;
>> +    uint16_t interleave_index;
>> +    uint16_t interleave_ways;
>> +    uint16_t flags;
>> +    uint16_t reserved;
>> +} QEMU_PACKED;
>> +typedef struct nfit_memdev nfit_memdev;
>> +
>> +/* NVDIMM Control Region Structure */
>> +struct nfit_dcr {
>> +    uint16_t type;
>> +    uint16_t length;
>> +    uint16_t dcr_index;
>> +    uint16_t vendor_id;
>> +    uint16_t device_id;
>> +    uint16_t revision_id;
>> +    uint16_t sub_vendor_id;
>> +    uint16_t sub_device_id;
>> +    uint16_t sub_revision_id;
>> +    uint8_t reserved[6];
>> +    uint32_t serial_number;
>> +    uint16_t fic;
>> +    uint16_t num_bcw;
>> +    uint64_t bcw_size;
>> +    uint64_t cmd_offset;
>> +    uint64_t cmd_size;
>> +    uint64_t status_offset;
>> +    uint64_t status_size;
>> +    uint16_t flags;
>> +    uint8_t reserved2[6];
>> +} QEMU_PACKED;
>> +typedef struct nfit_dcr nfit_dcr;
>
> Struct naming violates the QEMU coding style.
> Pls fix it.

I got it. Will add an nvdimm_ prefix.
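A standalone sanity check of the packed layouts above may also be worth keeping around: ACPI 6.0, section 5.2.25 fixes the sizes of these three structures at 56, 48 and 80 bytes. A minimal sketch, with __attribute__((packed)) standing in for QEMU_PACKED so it compiles outside QEMU:

```c
#include <stdint.h>

/* Field layout copied from the patch; packed so no padding is inserted. */
struct nfit_spa {
    uint16_t type;
    uint16_t length;
    uint16_t spa_index;
    uint16_t flags;
    uint32_t reserved;
    uint32_t proximity_domain;
    uint8_t type_guid[16];
    uint64_t spa_base;
    uint64_t spa_length;
    uint64_t mem_attr;
} __attribute__((packed));

struct nfit_memdev {
    uint16_t type;
    uint16_t length;
    uint32_t nfit_handle;
    uint16_t phys_id;
    uint16_t region_id;
    uint16_t spa_index;
    uint16_t dcr_index;
    uint64_t region_len;
    uint64_t region_offset;
    uint64_t region_dpa;
    uint16_t interleave_index;
    uint16_t interleave_ways;
    uint16_t flags;
    uint16_t reserved;
} __attribute__((packed));

struct nfit_dcr {
    uint16_t type;
    uint16_t length;
    uint16_t dcr_index;
    uint16_t vendor_id;
    uint16_t device_id;
    uint16_t revision_id;
    uint16_t sub_vendor_id;
    uint16_t sub_device_id;
    uint16_t sub_revision_id;
    uint8_t reserved[6];
    uint32_t serial_number;
    uint16_t fic;
    uint16_t num_bcw;
    uint64_t bcw_size;
    uint64_t cmd_offset;
    uint64_t cmd_size;
    uint64_t status_offset;
    uint64_t status_size;
    uint16_t flags;
    uint8_t reserved2[6];
} __attribute__((packed));

/* Structure sizes mandated by the NFIT definitions in ACPI 6.0. */
_Static_assert(sizeof(struct nfit_spa) == 56, "SPA structure is 56 bytes");
_Static_assert(sizeof(struct nfit_memdev) == 48, "memdev mapping is 48 bytes");
_Static_assert(sizeof(struct nfit_dcr) == 80, "control region is 80 bytes");
```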

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19  7:27       ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-19  7:39         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-19  7:39 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Mon, Oct 19, 2015 at 03:27:21PM +0800, Xiao Guangrong wrote:
> >>+        nvdimm_init_memory_state(&pcms->nvdimm_memory, system_memory, machine,
> >>+                                 TARGET_PAGE_SIZE);
> >>+
> >
> >Shouldn't this be conditional on presence of the nvdimm device?
> >
> 
> We will enable hotplug on nvdimm devices in the near future once Linux driver is
> ready. I'd keep it here for future development.

No, I don't think we should add stuff unconditionally. If not nvdimm,
some other flag should indicate user intends to hotplug things.

-- 
MST

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19  7:39         ` [Qemu-devel] " Michael S. Tsirkin
@ 2015-10-19  7:44           ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-19  7:44 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/19/2015 03:39 PM, Michael S. Tsirkin wrote:
> On Mon, Oct 19, 2015 at 03:27:21PM +0800, Xiao Guangrong wrote:
>>>> +        nvdimm_init_memory_state(&pcms->nvdimm_memory, system_memory, machine,
>>>> +                                 TARGET_PAGE_SIZE);
>>>> +
>>>
>>> Shouldn't this be conditional on presence of the nvdimm device?
>>>
>>
>> We will enable hotplug on nvdimm devices in the near future once Linux driver is
>> ready. I'd keep it here for future development.
>
> No, I don't think we should add stuff unconditionally. If not nvdimm,
> some other flag should indicate user intends to hotplug things.
>

Actually, it is not unconditional: it is only called if the parameter "-m aaa, maxmem=bbb"
(aaa < bbb) is used. It is on the same path as the memory-hotplug initialization.



^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 26/32] nvdimm: save arg3 for NVDIMM device _DSM method
  2015-10-19  7:14       ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-19  7:47         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-19  7:47 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Mon, Oct 19, 2015 at 03:14:28PM +0800, Xiao Guangrong wrote:
> 
> 
> On 10/19/2015 02:50 PM, Michael S. Tsirkin wrote:
> >On Sun, Oct 11, 2015 at 11:52:58AM +0800, Xiao Guangrong wrote:
> >>Check if the input Arg3 is valid then store it into dsm_in if needed
> >>
> >>We only do the save on NVDIMM device since we are not going to support any
> >>function on root device
> >>
> >>Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> >>---
> >>  hw/mem/nvdimm/acpi.c | 21 ++++++++++++++++++++-
> >>  1 file changed, 20 insertions(+), 1 deletion(-)
> >>
> >>diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
> >>index d9fa0fd..3b9399c 100644
> >>--- a/hw/mem/nvdimm/acpi.c
> >>+++ b/hw/mem/nvdimm/acpi.c
> >>@@ -442,7 +442,7 @@ static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
> >>          int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
> >>                                             NULL);
> >>          uint32_t handle = nvdimm_slot_to_handle(slot);
> >>-        Aml *dev, *method;
> >>+        Aml *dev, *method, *ifctx;
> >>
> >>          dev = aml_device("NV%02X", slot);
> >>          aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
> >>@@ -452,6 +452,24 @@ static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
> >>          method = aml_method("_DSM", 4);
> >>          {
> >>              SAVE_ARG012_HANDLE_LOCK(method, aml_int(handle));
> >>+
> >>+            /* Arg3 is passed as Package and it has one element? */
> >>+            ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
> >>+                                             aml_int(4)),
> >>+                                   aml_equal(aml_sizeof(aml_arg(3)),
> >
> >aml_arg(3) is used many times below.
> >Pls give it a name that makes sense (not arg3! what is it for?)
> >
> 
> Er. aml_arg(3) is just the fourth parameter of _DSM method. Will add some
> comments:
> 
> /*
>  * The fourth parameter (Arg3) of _DSM is a package which contains a buffer, the
>  * layout of the buffer is specified by UUID (Arg0), Revision ID (Arg1) and
>  * Function Index (Arg2) which are documented in the DSM specification.
>  */
> 
> >>+                                             aml_int(1))));
> >
> >Pls document AML constants used.
> >Like this:
> >
> >             ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
> >                                              aml_int(4 /* 4 - Package */) ),
> >                                    aml_equal(aml_sizeof(aml_arg(3)),
> >                                              aml_int(1))));
> >
> >>+            {
> >>+                /* Local0 = Index(Arg3, 0) */
> >>+                aml_append(ifctx, aml_store(aml_index(aml_arg(3), aml_int(0)),
> >>+                                            aml_local(0)));
> >>+                /* Local3 = DeRefOf(Local0) */
> >>+                aml_append(ifctx, aml_store(aml_derefof(aml_local(0)),
> >>+                                            aml_local(3)));
> >>+                /* ARG3 = Local3 */
> >>+                aml_append(ifctx, aml_store(aml_local(3), aml_name("ARG3")));
> >
> >This isn't a good way to comment things: you are
> >just adding ASL before the equivalent C.
> >Pls document what's going on.
> >
> 
> Okay... i just thought C is little readable than AML. Will change the comment
> to:
> 
> /* fetch buffer from the package (Arg3) and store it to DSM memory. */
> 
> Thanks.

You can use variables to make the logic clear. E.g.:

	Aml *pckg = aml_arg(3);
	Aml *pckg_idx = aml_local(0);
	Aml *pckg_buf = aml_local(3);

	aml_append(ifctx, aml_store(aml_index(pckg, aml_int(0)), pckg_idx));
	aml_append(ifctx, aml_store(aml_derefof(pckg_idx), pckg_buf));


This is also better than repeating aml_arg(3) many times.

-- 
MST

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 26/32] nvdimm: save arg3 for NVDIMM device _DSM method
  2015-10-19  7:47         ` [Qemu-devel] " Michael S. Tsirkin
@ 2015-10-19  7:51           ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-19  7:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/19/2015 03:47 PM, Michael S. Tsirkin wrote:

>>> aml_arg(3) is used many times below.
>>> Pls give it a name that makes sense (not arg3! what is it for?)
>>>
>>
>> Er. aml_arg(3) is just the fourth parameter of _DSM method. Will add some
>> comments:
>>
>> /*
>>   * The fourth parameter (Arg3) of _DSM is a package which contains a buffer, the
>>   * layout of the buffer is specified by UUID (Arg0), Revision ID (Arg1) and
>>   * Function Index (Arg2) which are documented in the DSM specification.
>>   */
>>
>>>> +                                             aml_int(1))));
>>>
>>> Pls document AML constants used.
>>> Like this:
>>>
>>>              ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
>>>                                               aml_int(4 /* 4 - Package */) ),
>>>                                     aml_equal(aml_sizeof(aml_arg(3)),
>>>                                               aml_int(1))));
>>>
>>>> +            {
>>>> +                /* Local0 = Index(Arg3, 0) */
>>>> +                aml_append(ifctx, aml_store(aml_index(aml_arg(3), aml_int(0)),
>>>> +                                            aml_local(0)));
>>>> +                /* Local3 = DeRefOf(Local0) */
>>>> +                aml_append(ifctx, aml_store(aml_derefof(aml_local(0)),
>>>> +                                            aml_local(3)));
>>>> +                /* ARG3 = Local3 */
>>>> +                aml_append(ifctx, aml_store(aml_local(3), aml_name("ARG3")));
>>>
>>> This isn't a good way to comment things: you are
>>> just adding ASL before the equivalent C.
>>> Pls document what's going on.
>>>
>>
>> Okay... i just thought C is little readable than AML. Will change the comment
>> to:
>>
>> /* fetch buffer from the package (Arg3) and store it to DSM memory. */
>>
>> Thanks.
>
> You can use variables to make the logic clear. E.g.:
>
> 	Aml *pckg = aml_arg(3);
> 	Aml *pckg_idx = aml_local(0);
> 	Aml *pckg_buf = aml_local(3);
>
> 	aml_append(ifctx, aml_store(aml_index(pckg, aml_int(0)), pckg_idx));
> 	aml_append(ifctx, aml_store(aml_derefof(pckg_idx), pckg_buf));
>
>
> This is also better than repeating aml_arg(3) many times.
>

Indeed, it's much clearer now.

Thanks for your review, and I really appreciate your patience, Michael!



^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19  7:44           ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-19  9:17             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-19  9:17 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: pbonzini, imammedo, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Mon, Oct 19, 2015 at 03:44:13PM +0800, Xiao Guangrong wrote:
> 
> 
> On 10/19/2015 03:39 PM, Michael S. Tsirkin wrote:
> >On Mon, Oct 19, 2015 at 03:27:21PM +0800, Xiao Guangrong wrote:
> >>>>+        nvdimm_init_memory_state(&pcms->nvdimm_memory, system_memory, machine,
> >>>>+                                 TARGET_PAGE_SIZE);
> >>>>+
> >>>
> >>>Shouldn't this be conditional on presence of the nvdimm device?
> >>>
> >>
> >>We will enable hotplug on nvdimm devices in the near future once Linux driver is
> >>ready. I'd keep it here for future development.
> >
> >No, I don't think we should add stuff unconditionally. If not nvdimm,
> >some other flag should indicate user intends to hotplug things.
> >
> 
> Actually, it is not unconditional; it is only called if the parameter "-m aaa, maxmem=bbb"
> (aaa < bbb) is used. It is on the same path as the memory-hotplug initialization.
> 

Right, but that's not the same as nvdimm.


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19  6:56     ` [Qemu-devel] " Michael S. Tsirkin
@ 2015-10-19  9:18       ` Igor Mammedov
  -1 siblings, 0 replies; 200+ messages in thread
From: Igor Mammedov @ 2015-10-19  9:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Xiao Guangrong, pbonzini, gleb, mtosatti, stefanha, rth,
	ehabkost, dan.j.williams, kvm, qemu-devel

On Mon, 19 Oct 2015 09:56:12 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Sun, Oct 11, 2015 at 11:52:54AM +0800, Xiao Guangrong wrote:
[...]
> > diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
> > index f6bd2c4..aa95961 100644
> > --- a/include/hw/mem/nvdimm.h
> > +++ b/include/hw/mem/nvdimm.h
> > @@ -15,6 +15,10 @@
> >  
> >  #include "hw/mem/dimm.h"
> >  
> > +/* Memory region 0xFF00000 ~ 0xFFF00000 is reserved for NVDIMM
> > ACPI. */ +#define NVDIMM_ACPI_MEM_BASE   0xFF000000ULL
Michael,

If it's ok to map control RAM region directly from QEMU at arbitrary
location let's do the same for VMGENID too (i.e. use v16
implementation which does exactly the same thing as this series).

> > +#define NVDIMM_ACPI_MEM_SIZE   0xF00000ULL
[...]


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19  9:17             ` [Qemu-devel] " Michael S. Tsirkin
@ 2015-10-19  9:46               ` Igor Mammedov
  -1 siblings, 0 replies; 200+ messages in thread
From: Igor Mammedov @ 2015-10-19  9:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Xiao Guangrong, pbonzini, gleb, mtosatti, stefanha, rth,
	ehabkost, dan.j.williams, kvm, qemu-devel

On Mon, 19 Oct 2015 12:17:22 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Mon, Oct 19, 2015 at 03:44:13PM +0800, Xiao Guangrong wrote:
> > 
> > 
> > On 10/19/2015 03:39 PM, Michael S. Tsirkin wrote:
> > >On Mon, Oct 19, 2015 at 03:27:21PM +0800, Xiao Guangrong wrote:
> > >>>>+        nvdimm_init_memory_state(&pcms->nvdimm_memory,
> > >>>>system_memory, machine,
> > >>>>+                                 TARGET_PAGE_SIZE);
> > >>>>+
> > >>>
> > >>>Shouldn't this be conditional on presence of the nvdimm device?
> > >>>
> > >>
> > >>We will enable hotplug on nvdimm devices in the near future once
> > >>Linux driver is ready. I'd keep it here for future development.
> > >
> > >No, I don't think we should add stuff unconditionally. If not
> > >nvdimm, some other flag should indicate user intends to hotplug
> > >things.
> > >
> > 
> > Actually, it is not unconditional; it is only called if the parameter
> > "-m aaa, maxmem=bbb" (aaa < bbb) is used. It is on the same path as
> > the memory-hotplug initialization.
> > 
> 
> Right, but that's not the same as nvdimm.
> 

it could be a pc-machine property; then it could be turned on like this:
 -machine nvdimm_support=on

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19  9:46               ` [Qemu-devel] " Igor Mammedov
  (?)
@ 2015-10-19 10:01               ` Xiao Guangrong
  2015-10-19 10:34                   ` Michael S. Tsirkin
  2015-10-19 10:42                   ` Igor Mammedov
  -1 siblings, 2 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-19 10:01 UTC (permalink / raw)
  To: Igor Mammedov, Michael S. Tsirkin
  Cc: ehabkost, kvm, gleb, mtosatti, qemu-devel, stefanha, pbonzini,
	dan.j.williams, rth



On 10/19/2015 05:46 PM, Igor Mammedov wrote:
> On Mon, 19 Oct 2015 12:17:22 +0300
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>
>> On Mon, Oct 19, 2015 at 03:44:13PM +0800, Xiao Guangrong wrote:
>>>
>>>
>>> On 10/19/2015 03:39 PM, Michael S. Tsirkin wrote:
>>>> On Mon, Oct 19, 2015 at 03:27:21PM +0800, Xiao Guangrong wrote:
>>>>>>> +        nvdimm_init_memory_state(&pcms->nvdimm_memory,
>>>>>>> system_memory, machine,
>>>>>>> +                                 TARGET_PAGE_SIZE);
>>>>>>> +
>>>>>>
>>>>>> Shouldn't this be conditional on presence of the nvdimm device?
>>>>>>
>>>>>
>>>>> We will enable hotplug on nvdimm devices in the near future once
>>>>> Linux driver is ready. I'd keep it here for future development.
>>>>
>>>> No, I don't think we should add stuff unconditionally. If not
>>>> nvdimm, some other flag should indicate user intends to hotplug
>>>> things.
>>>>
>>>
>>> Actually, it is not unconditional; it is only called if the parameter
>>> "-m aaa, maxmem=bbb" (aaa < bbb) is used. It is on the same path as
>>> the memory-hotplug initialization.
>>>
>>
>> Right, but that's not the same as nvdimm.
>>
>
> it could be pc-machine property, then it could be turned on like this:
>   -machine nvdimm_support=on

Er, I do not understand why this separate switch is needed and why nvdimm
and pc-dimm are different. :(

NVDIMM reuses the memory-hotplug framework: maxmem, slots, the dimm device,
and even some of the ACPI hotplug logic. Both nvdimm and pc-dimm are built
on the same infrastructure.






^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19  9:18       ` [Qemu-devel] " Igor Mammedov
@ 2015-10-19 10:25         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-19 10:25 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Xiao Guangrong, pbonzini, gleb, mtosatti, stefanha, rth,
	ehabkost, dan.j.williams, kvm, qemu-devel

On Mon, Oct 19, 2015 at 11:18:36AM +0200, Igor Mammedov wrote:
> On Mon, 19 Oct 2015 09:56:12 +0300
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Sun, Oct 11, 2015 at 11:52:54AM +0800, Xiao Guangrong wrote:
> [...]
> > > diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
> > > index f6bd2c4..aa95961 100644
> > > --- a/include/hw/mem/nvdimm.h
> > > +++ b/include/hw/mem/nvdimm.h
> > > @@ -15,6 +15,10 @@
> > >  
> > >  #include "hw/mem/dimm.h"
> > >  
> > > +/* Memory region 0xFF00000 ~ 0xFFF00000 is reserved for NVDIMM
> > > ACPI. */ +#define NVDIMM_ACPI_MEM_BASE   0xFF000000ULL
> Michael,
> 
> If it's ok to map control RAM region directly from QEMU at arbitrary
> location

It's a fair question. Where is it reserved? In which spec?

-- 
MST

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19 10:01               ` Xiao Guangrong
@ 2015-10-19 10:34                   ` Michael S. Tsirkin
  2015-10-19 10:42                   ` Igor Mammedov
  1 sibling, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-19 10:34 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Igor Mammedov, ehabkost, kvm, gleb, mtosatti, qemu-devel,
	stefanha, pbonzini, dan.j.williams, rth

On Mon, Oct 19, 2015 at 06:01:17PM +0800, Xiao Guangrong wrote:
> 
> 
> On 10/19/2015 05:46 PM, Igor Mammedov wrote:
> >On Mon, 19 Oct 2015 12:17:22 +0300
> >"Michael S. Tsirkin" <mst@redhat.com> wrote:
> >
> >>On Mon, Oct 19, 2015 at 03:44:13PM +0800, Xiao Guangrong wrote:
> >>>
> >>>
> >>>On 10/19/2015 03:39 PM, Michael S. Tsirkin wrote:
> >>>>On Mon, Oct 19, 2015 at 03:27:21PM +0800, Xiao Guangrong wrote:
> >>>>>>>+        nvdimm_init_memory_state(&pcms->nvdimm_memory,
> >>>>>>>system_memory, machine,
> >>>>>>>+                                 TARGET_PAGE_SIZE);
> >>>>>>>+
> >>>>>>
> >>>>>>Shouldn't this be conditional on presence of the nvdimm device?
> >>>>>>
> >>>>>
> >>>>>We will enable hotplug on nvdimm devices in the near future once
> >>>>>Linux driver is ready. I'd keep it here for future development.
> >>>>
> >>>>No, I don't think we should add stuff unconditionally. If not
> >>>>nvdimm, some other flag should indicate user intends to hotplug
> >>>>things.
> >>>>
> >>>
> >>>Actually, it is not unconditional; it is only called if the parameter
> >>>"-m aaa, maxmem=bbb" (aaa < bbb) is used. It is on the same path as
> >>>the memory-hotplug initialization.
> >>>
> >>
> >>Right, but that's not the same as nvdimm.
> >>
> >
> >it could be pc-machine property, then it could be turned on like this:
> >  -machine nvdimm_support=on
> 
> Er, I do not understand why this separate switch is needed and why nvdimm
> and pc-dimm is different. :(
> 
> NVDIMM reuses memory-hotplug's framework, such as maxmem, slot, and dimm
> device, even some of ACPI logic to do hotplug things, etc. Both nvdimm
> and pc-dimm are built on the same infrastructure.
> 
> 
> 
> 

It does seem to add a bunch of devices in ACPI and memory regions in
memory space.
Whatever your device is, it's generally safe to assume that most
people don't need it.

-- 
MST


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19 10:01               ` Xiao Guangrong
@ 2015-10-19 10:42                   ` Igor Mammedov
  2015-10-19 10:42                   ` Igor Mammedov
  1 sibling, 0 replies; 200+ messages in thread
From: Igor Mammedov @ 2015-10-19 10:42 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Michael S. Tsirkin, ehabkost, kvm, gleb, mtosatti, qemu-devel,
	stefanha, pbonzini, dan.j.williams, rth

On Mon, 19 Oct 2015 18:01:17 +0800
Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:

> 
> 
> On 10/19/2015 05:46 PM, Igor Mammedov wrote:
> > On Mon, 19 Oct 2015 12:17:22 +0300
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >
> >> On Mon, Oct 19, 2015 at 03:44:13PM +0800, Xiao Guangrong wrote:
> >>>
> >>>
> >>> On 10/19/2015 03:39 PM, Michael S. Tsirkin wrote:
> >>>> On Mon, Oct 19, 2015 at 03:27:21PM +0800, Xiao Guangrong wrote:
> >>>>>>> +        nvdimm_init_memory_state(&pcms->nvdimm_memory,
> >>>>>>> system_memory, machine,
> >>>>>>> +                                 TARGET_PAGE_SIZE);
> >>>>>>> +
> >>>>>>
> >>>>>> Shouldn't this be conditional on presence of the nvdimm device?
> >>>>>>
> >>>>>
> >>>>> We will enable hotplug on nvdimm devices in the near future once
> >>>>> Linux driver is ready. I'd keep it here for future development.
> >>>>
> >>>> No, I don't think we should add stuff unconditionally. If not
> >>>> nvdimm, some other flag should indicate user intends to hotplug
> >>>> things.
> >>>>
> >>>
> >>> Actually, it is not unconditional; it is only called if the parameter
> >>> "-m aaa, maxmem=bbb" (aaa < bbb) is used. It is on the same path
> >>> as the memory-hotplug initialization.
> >>>
> >>
> >> Right, but that's not the same as nvdimm.
> >>
> >
> > it could be pc-machine property, then it could be turned on like
> > this: -machine nvdimm_support=on
> 
> Er, I do not understand why this separate switch is needed and why
> nvdimm and pc-dimm is different. :(
> 
> NVDIMM reuses memory-hotplug's framework, such as maxmem, slot, and
> dimm device, even some of ACPI logic to do hotplug things, etc. Both
> nvdimm and pc-dimm are built on the same infrastructure.
NVDIMM support consumes precious low RAM and MMIO resources, and not a
small amount at that. So turning it on unconditionally with memory
hotplug, even if NVDIMM would never be used, isn't nice.

However, that concern could be dropped if, instead of allocating its
own control MMIO/RAM regions, NVDIMM reused memory hotplug's MMIO
region and replaced the RAM region with serializing/marshaling label
data over the same MMIO interface (yes, it's slower, but it's not a
performance-critical path).
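[Editor's sketch] Igor's proposal (marshaling label data over the existing memory-hotplug MMIO interface instead of mapping a dedicated RAM region) can be sketched roughly as follows. This is a hypothetical illustration in Python, not QEMU code; the window size and the `window_read` accessor are invented for the sketch:

```python
# Hypothetical sketch: fetch NVDIMM label data in fixed-size chunks
# through a small control window, instead of mapping the whole label
# area as guest RAM. WINDOW and window_read are invented names.
WINDOW = 4096  # assumed size of the control window, in bytes

def read_labels(window_read, total_len):
    """Serialize label data out of the device via chunked window reads."""
    data = bytearray()
    for off in range(0, total_len, WINDOW):
        data += window_read(off, min(WINDOW, total_len - off))
    return bytes(data)

# Stand-in for the device side: back the "window" with an in-memory buffer.
labels = bytes(range(256)) * 40  # fake label area
assert read_labels(lambda off, n: labels[off:off + n], len(labels)) == labels
```

The trade-off Igor notes applies: each chunk costs an access round trip, so this is slower than direct mapping, but label reads are not on a performance-critical path.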



^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19 10:25         ` [Qemu-devel] " Michael S. Tsirkin
@ 2015-10-19 17:54           ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-19 17:54 UTC (permalink / raw)
  To: Michael S. Tsirkin, Igor Mammedov
  Cc: pbonzini, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel



On 10/19/2015 06:25 PM, Michael S. Tsirkin wrote:
> On Mon, Oct 19, 2015 at 11:18:36AM +0200, Igor Mammedov wrote:
>> On Mon, 19 Oct 2015 09:56:12 +0300
>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>
>>> On Sun, Oct 11, 2015 at 11:52:54AM +0800, Xiao Guangrong wrote:
>> [...]
>>>> diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
>>>> index f6bd2c4..aa95961 100644
>>>> --- a/include/hw/mem/nvdimm.h
>>>> +++ b/include/hw/mem/nvdimm.h
>>>> @@ -15,6 +15,10 @@
>>>>
>>>>   #include "hw/mem/dimm.h"
>>>>
>>>> +/* Memory region 0xFF00000 ~ 0xFFF00000 is reserved for NVDIMM
>>>> ACPI. */ +#define NVDIMM_ACPI_MEM_BASE   0xFF000000ULL
>> Michael,
>>
>> If it's ok to map control RAM region directly from QEMU at arbitrary
>> location
>
> It's a fair question. Where is it reserved? In which spec?
>

The region 0xFF000000 ~ 0xFFF00000 is just reserved for the vDSM
implementation; it is not required by the spec.
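[Editor's note] As a quick cross-check of the constants quoted in this thread (plain arithmetic, not QEMU code): the base plus the size gives an upper bound of 0xFFF00000, so the reserved window is 0xFF000000 ~ 0xFFF00000, 15 MiB just below the 4 GiB boundary; the lower bound 0xFF00000 written in the patch comment appears to be missing a zero.

```python
# Constants as defined in the patch under discussion.
NVDIMM_ACPI_MEM_BASE = 0xFF000000
NVDIMM_ACPI_MEM_SIZE = 0x00F00000

end = NVDIMM_ACPI_MEM_BASE + NVDIMM_ACPI_MEM_SIZE
assert end == 0xFFF00000                       # upper bound of the window
assert NVDIMM_ACPI_MEM_SIZE == 15 * (1 << 20)  # 15 MiB reserved
assert end <= 1 << 32                          # sits below the 4 GiB boundary
```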


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19 10:42                   ` Igor Mammedov
@ 2015-10-19 17:56                     ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-19 17:56 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Michael S. Tsirkin, ehabkost, kvm, gleb, mtosatti, qemu-devel,
	stefanha, pbonzini, dan.j.williams, rth



On 10/19/2015 06:42 PM, Igor Mammedov wrote:
> On Mon, 19 Oct 2015 18:01:17 +0800
> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>
>>
>>
>> On 10/19/2015 05:46 PM, Igor Mammedov wrote:
>>> On Mon, 19 Oct 2015 12:17:22 +0300
>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>
>>>> On Mon, Oct 19, 2015 at 03:44:13PM +0800, Xiao Guangrong wrote:
>>>>>
>>>>>
>>>>> On 10/19/2015 03:39 PM, Michael S. Tsirkin wrote:
>>>>>> On Mon, Oct 19, 2015 at 03:27:21PM +0800, Xiao Guangrong wrote:
>>>>>>>>> +        nvdimm_init_memory_state(&pcms->nvdimm_memory,
>>>>>>>>> system_memory, machine,
>>>>>>>>> +                                 TARGET_PAGE_SIZE);
>>>>>>>>> +
>>>>>>>>
>>>>>>>> Shouldn't this be conditional on presence of the nvdimm device?
>>>>>>>>
>>>>>>>
>>>>>>> We will enable hotplug on nvdimm devices in the near future once
>>>>>>> Linux driver is ready. I'd keep it here for future development.
>>>>>>
>>>>>> No, I don't think we should add stuff unconditionally. If not
>>>>>> nvdimm, some other flag should indicate user intends to hotplug
>>>>>> things.
>>>>>>
>>>>>
>>>>> Actually, it is not unconditional; it is only called if the parameter
>>>>> "-m aaa, maxmem=bbb" (aaa < bbb) is used. It is on the same path
>>>>> as the memory-hotplug initialization.
>>>>>
>>>>
>>>> Right, but that's not the same as nvdimm.
>>>>
>>>
>>> it could be pc-machine property, then it could be turned on like
>>> this: -machine nvdimm_support=on
>>
>> Er, I do not understand why this separate switch is needed and why
>> nvdimm and pc-dimm is different. :(
>>
>> NVDIMM reuses memory-hotplug's framework, such as maxmem, slot, and
>> dimm device, even some of ACPI logic to do hotplug things, etc. Both
>> nvdimm and pc-dimm are built on the same infrastructure.
> NVDIMM support consumes precious low RAM  and MMIO resources and
> not small amount at that. So turning it on unconditionally with
> memory hotplug even if NVDIMM wouldn't ever be used isn't nice.

Okay, understood... will introduce nvdimm_support as you suggested.
Thank you, Igor and Michael!


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19 17:54           ` [Qemu-devel] " Xiao Guangrong
@ 2015-10-19 21:20             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2015-10-19 21:20 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Igor Mammedov, pbonzini, gleb, mtosatti, stefanha, rth, ehabkost,
	dan.j.williams, kvm, qemu-devel

On Tue, Oct 20, 2015 at 01:54:12AM +0800, Xiao Guangrong wrote:
> 
> 
> On 10/19/2015 06:25 PM, Michael S. Tsirkin wrote:
> >On Mon, Oct 19, 2015 at 11:18:36AM +0200, Igor Mammedov wrote:
> >>On Mon, 19 Oct 2015 09:56:12 +0300
> >>"Michael S. Tsirkin" <mst@redhat.com> wrote:
> >>
> >>>On Sun, Oct 11, 2015 at 11:52:54AM +0800, Xiao Guangrong wrote:
> >>[...]
> >>>>diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
> >>>>index f6bd2c4..aa95961 100644
> >>>>--- a/include/hw/mem/nvdimm.h
> >>>>+++ b/include/hw/mem/nvdimm.h
> >>>>@@ -15,6 +15,10 @@
> >>>>
> >>>>  #include "hw/mem/dimm.h"
> >>>>
> >>>>+/* Memory region 0xFF00000 ~ 0xFFF00000 is reserved for NVDIMM
> >>>>ACPI. */ +#define NVDIMM_ACPI_MEM_BASE   0xFF000000ULL
> >>Michael,
> >>
> >>If it's ok to map control RAM region directly from QEMU at arbitrary
> >>location
> >
> >It's a fair question. Where is it reserved? In which spec?
> >
> 
> The region 0xFF00000 ~ 0xFFF00000 is just reserved for vDSM implementation,
> it is not required by spec.

See Igor's comment then. You can't just steal it like that.

-- 
MST

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Qemu-devel] [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
  2015-10-19 10:42                   ` Igor Mammedov
@ 2015-10-20  2:27                     ` Xiao Guangrong
  -1 siblings, 0 replies; 200+ messages in thread
From: Xiao Guangrong @ 2015-10-20  2:27 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Michael S. Tsirkin, ehabkost, kvm, gleb, mtosatti, qemu-devel,
	stefanha, pbonzini, dan.j.williams, rth



On 10/19/2015 06:42 PM, Igor Mammedov wrote:
> On Mon, 19 Oct 2015 18:01:17 +0800
> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>
>>
>>
>> On 10/19/2015 05:46 PM, Igor Mammedov wrote:
>>> On Mon, 19 Oct 2015 12:17:22 +0300
>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>
>>>> On Mon, Oct 19, 2015 at 03:44:13PM +0800, Xiao Guangrong wrote:
>>>>>
>>>>>
>>>>> On 10/19/2015 03:39 PM, Michael S. Tsirkin wrote:
>>>>>> On Mon, Oct 19, 2015 at 03:27:21PM +0800, Xiao Guangrong wrote:
>>>>>>>>> +        nvdimm_init_memory_state(&pcms->nvdimm_memory,
>>>>>>>>> system_memory, machine,
>>>>>>>>> +                                 TARGET_PAGE_SIZE);
>>>>>>>>> +
>>>>>>>>
>>>>>>>> Shouldn't this be conditional on presence of the nvdimm device?
>>>>>>>>
>>>>>>>
>>>>>>> We will enable hotplug on nvdimm devices in the near future once
>>>>>>> Linux driver is ready. I'd keep it here for future development.
>>>>>>
>>>>>> No, I don't think we should add stuff unconditionally. If not
>>>>>> nvdimm, some other flag should indicate user intends to hotplug
>>>>>> things.
>>>>>>
>>>>>
>>>>> Actually, it is not unconditional - it is only called if the parameter
>>>>> "-m aaa,maxmem=bbb" (aaa < bbb) is used. It is on the same path
>>>>> as memory-hotplug initiation.
>>>>>
>>>>
>>>> Right, but that's not the same as nvdimm.
>>>>
>>>
>>> it could be pc-machine property, then it could be turned on like
>>> this: -machine nvdimm_support=on
>>
>> Er, I do not understand why this separate switch is needed and why
>> nvdimm and pc-dimm is different. :(
>>
>> NVDIMM reuses memory-hotplug's framework, such as maxmem, slot, and
>> dimm device, even some of ACPI logic to do hotplug things, etc. Both
>> nvdimm and pc-dimm are built on the same infrastructure.
> NVDIMM support consumes precious low RAM and MMIO resources, and not a
> small amount at that. So turning it on unconditionally with memory
> hotplug, even if NVDIMM wouldn't ever be used, isn't nice.
>
> However that concern could be dropped if, instead of allocating its
> own control MMIO/RAM regions, NVDIMM would reuse memory hotplug's MMIO
> region and replace the RAM region with serializing/marshaling label data
> over the same MMIO interface (yes, it's slower, but it's not a
> performance-critical path).

I really do not want to reuse all of memory-hotplug's resources; NVDIMM and
memory-hotplug do not have the same ACPI logic, and sharing them makes the
AML code really complex.

Another point is that Microsoft uses the label data area in its own way - the
label data area will not be used as a namespace area at all, and such slow
_DSM access is not acceptable for vNVDIMM usage.

The most important point is that we do not want to slow down system boot with
NVDIMM attached (imagine accessing 128K of data with single 8-byte MMIO
accesses - crazily slow). NVDIMM will be used as a boot device and for
lightweight virtualization, such as Clear Containers and Hyper, which require
booting the system up as fast as possible.
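A back-of-the-envelope sketch of the cost being objected to here; the 1 µs
per trapped MMIO access is an assumed illustrative figure, not a measurement:

```python
# Rough cost of marshaling the 128K label area through an 8-byte-wide
# MMIO window, one trapped guest access at a time.
LABEL_AREA = 128 * 1024        # label data size discussed above (bytes)
ACCESS_WIDTH = 8               # bytes transferred per MMIO access
VMEXIT_COST_US = 1.0           # assumed cost of one trapped access (microseconds)

accesses = LABEL_AREA // ACCESS_WIDTH
total_ms = accesses * VMEXIT_COST_US / 1000.0

print(f"{accesses} MMIO accesses, roughly {total_ms:.0f} ms per full pass")
```

Even with that optimistic per-access cost, each full pass over the label area
takes thousands of exits, which is why a directly mapped RAM region is
preferred on the boot path.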

I understand your concern that reserving a big resource is not so acceptable -
okay, then how about just reserving a 4-byte IO port and 1 RAM page?


^ permalink raw reply	[flat|nested] 200+ messages in thread

end of thread, other threads:[~2015-10-20  2:33 UTC | newest]

Thread overview: 200+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-11  3:52 [PATCH v3 00/32] implement vNVDIMM Xiao Guangrong
2015-10-11  3:52 ` [Qemu-devel] " Xiao Guangrong
2015-10-10 21:17 ` Dan Williams
2015-10-10 21:17   ` [Qemu-devel] " Dan Williams
2015-10-12  4:33   ` Xiao Guangrong
2015-10-12  4:33     ` [Qemu-devel] " Xiao Guangrong
2015-10-12 16:36     ` Dan Williams
2015-10-12 16:36       ` [Qemu-devel] " Dan Williams
2015-10-13  3:14       ` Xiao Guangrong
2015-10-13  3:14         ` Xiao Guangrong
2015-10-13  3:38         ` Dan Williams
2015-10-13  3:38           ` Dan Williams
2015-10-13  5:49           ` Xiao Guangrong
2015-10-13  5:49             ` Xiao Guangrong
2015-10-13  6:36             ` Dan Williams
2015-10-13  6:36               ` Dan Williams
2015-10-14  4:03               ` Xiao Guangrong
2015-10-14  4:03                 ` Xiao Guangrong
2015-10-14 19:20                 ` Dan Williams
2015-10-14 19:20                   ` Dan Williams
2015-10-11  3:52 ` [PATCH v3 01/32] acpi: add aml_derefof Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-13 12:30   ` Igor Mammedov
2015-10-13 12:30     ` [Qemu-devel] " Igor Mammedov
2015-10-11  3:52 ` [PATCH v3 02/32] acpi: add aml_sizeof Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-13 12:33   ` Igor Mammedov
2015-10-13 12:33     ` [Qemu-devel] " Igor Mammedov
2015-10-11  3:52 ` [PATCH v3 03/32] acpi: add aml_create_field Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-13 12:38   ` Igor Mammedov
2015-10-13 12:38     ` [Qemu-devel] " Igor Mammedov
2015-10-13 16:36     ` Xiao Guangrong
2015-10-13 16:36       ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 04/32] acpi: add aml_mutex, aml_acquire, aml_release Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-13 13:34   ` Igor Mammedov
2015-10-13 13:34     ` [Qemu-devel] " Igor Mammedov
2015-10-13 16:44     ` Xiao Guangrong
2015-10-13 16:44       ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 05/32] acpi: add aml_concatenate Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 06/32] acpi: add aml_object_type Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 07/32] util: introduce qemu_file_get_page_size() Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 08/32] exec: allow memory to be allocated from any kind of path Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-12 10:08   ` Michael S. Tsirkin
2015-10-12 10:08     ` [Qemu-devel] " Michael S. Tsirkin
2015-10-13  3:31     ` Xiao Guangrong
2015-10-13  3:31       ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 09/32] exec: allow file_ram_alloc to work on file Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 10/32] hostmem-file: clean up memory allocation Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 11/32] hostmem-file: use whole file size if possible Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-13 11:50   ` Vladimir Sementsov-Ogievskiy
2015-10-13 11:50     ` Vladimir Sementsov-Ogievskiy
2015-10-13 16:53     ` Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 12/32] pc-dimm: remove DEFAULT_PC_DIMMSIZE Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 13/32] pc-dimm: make pc_existing_dimms_capacity static and rename it Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 14/32] pc-dimm: drop the prefix of pc-dimm Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-12 16:43   ` Eric Blake
2015-10-13  3:32     ` Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 15/32] stubs: rename qmp_pc_dimm_device_list.c Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 16/32] pc-dimm: rename pc-dimm.c and pc-dimm.h Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 17/32] dimm: abstract dimm device from pc-dimm Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 18/32] dimm: get mapped memory region from DIMMDeviceClass->get_memory_region Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 19/32] dimm: keep the state of the whole backend memory Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 20/32] dimm: introduce realize callback Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 21/32] nvdimm: implement NVDIMM device abstract Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-19  6:56   ` Michael S. Tsirkin
2015-10-19  6:56     ` [Qemu-devel] " Michael S. Tsirkin
2015-10-19  7:27     ` Xiao Guangrong
2015-10-19  7:27       ` [Qemu-devel] " Xiao Guangrong
2015-10-19  7:39       ` Michael S. Tsirkin
2015-10-19  7:39         ` [Qemu-devel] " Michael S. Tsirkin
2015-10-19  7:44         ` Xiao Guangrong
2015-10-19  7:44           ` [Qemu-devel] " Xiao Guangrong
2015-10-19  9:17           ` Michael S. Tsirkin
2015-10-19  9:17             ` [Qemu-devel] " Michael S. Tsirkin
2015-10-19  9:46             ` Igor Mammedov
2015-10-19  9:46               ` [Qemu-devel] " Igor Mammedov
2015-10-19 10:01               ` Xiao Guangrong
2015-10-19 10:34                 ` Michael S. Tsirkin
2015-10-19 10:34                   ` Michael S. Tsirkin
2015-10-19 10:42                 ` Igor Mammedov
2015-10-19 10:42                   ` Igor Mammedov
2015-10-19 17:56                   ` Xiao Guangrong
2015-10-19 17:56                     ` Xiao Guangrong
2015-10-20  2:27                   ` Xiao Guangrong
2015-10-20  2:27                     ` Xiao Guangrong
2015-10-19  9:18     ` Igor Mammedov
2015-10-19  9:18       ` [Qemu-devel] " Igor Mammedov
2015-10-19 10:25       ` Michael S. Tsirkin
2015-10-19 10:25         ` [Qemu-devel] " Michael S. Tsirkin
2015-10-19 17:54         ` Xiao Guangrong
2015-10-19 17:54           ` [Qemu-devel] " Xiao Guangrong
2015-10-19 21:20           ` Michael S. Tsirkin
2015-10-19 21:20             ` [Qemu-devel] " Michael S. Tsirkin
2015-10-11  3:52 ` [PATCH v3 23/32] nvdimm: build ACPI NFIT table Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-12 11:27   ` Michael S. Tsirkin
2015-10-12 11:27     ` [Qemu-devel] " Michael S. Tsirkin
2015-10-13  5:13     ` Xiao Guangrong
2015-10-13  5:13       ` [Qemu-devel] " Xiao Guangrong
2015-10-13  5:42       ` Michael S. Tsirkin
2015-10-13  5:42         ` [Qemu-devel] " Michael S. Tsirkin
2015-10-13  6:06         ` Xiao Guangrong
2015-10-13  6:06           ` [Qemu-devel] " Xiao Guangrong
2015-10-12 16:40   ` Dan Williams
2015-10-12 16:40     ` [Qemu-devel] " Dan Williams
2015-10-13  5:17     ` Xiao Guangrong
2015-10-13  5:17       ` [Qemu-devel] " Xiao Guangrong
2015-10-13  6:07       ` Michael S. Tsirkin
2015-10-13  6:07         ` [Qemu-devel] " Michael S. Tsirkin
2015-10-11  3:52 ` [PATCH v3 24/32] nvdimm: init the address region used by DSM method Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 25/32] nvdimm: build ACPI nvdimm devices Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-13 14:39   ` Igor Mammedov
2015-10-13 14:39     ` Igor Mammedov
2015-10-13 17:24     ` Xiao Guangrong
2015-10-13 17:24       ` Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 26/32] nvdimm: save arg3 for NVDIMM device _DSM method Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-19  6:50   ` Michael S. Tsirkin
2015-10-19  6:50     ` [Qemu-devel] " Michael S. Tsirkin
2015-10-19  7:14     ` Xiao Guangrong
2015-10-19  7:14       ` [Qemu-devel] " Xiao Guangrong
2015-10-19  7:47       ` Michael S. Tsirkin
2015-10-19  7:47         ` [Qemu-devel] " Michael S. Tsirkin
2015-10-19  7:51         ` Xiao Guangrong
2015-10-19  7:51           ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:52 ` [PATCH v3 27/32] nvdimm: support DSM_CMD_IMPLEMENTED function Xiao Guangrong
2015-10-11  3:52   ` [Qemu-devel] " Xiao Guangrong
2015-10-14  9:40   ` Stefan Hajnoczi
2015-10-14  9:40     ` [Qemu-devel] " Stefan Hajnoczi
2015-10-14 14:50     ` Xiao Guangrong
2015-10-14 14:50       ` [Qemu-devel] " Xiao Guangrong
2015-10-14 17:06       ` Eduardo Habkost
2015-10-14 17:06         ` [Qemu-devel] " Eduardo Habkost
2015-10-15  1:43         ` Xiao Guangrong
2015-10-15  1:43           ` [Qemu-devel] " Xiao Guangrong
2015-10-15 15:07       ` Stefan Hajnoczi
2015-10-15 15:07         ` [Qemu-devel] " Stefan Hajnoczi
2015-10-16  2:30         ` Xiao Guangrong
2015-10-16  2:30           ` [Qemu-devel] " Xiao Guangrong
2015-10-14  9:41   ` Stefan Hajnoczi
2015-10-14  9:41     ` [Qemu-devel] " Stefan Hajnoczi
2015-10-14 14:52     ` Xiao Guangrong
2015-10-14 14:52       ` [Qemu-devel] " Xiao Guangrong
2015-10-15 15:01       ` Stefan Hajnoczi
2015-10-15 15:01         ` [Qemu-devel] " Stefan Hajnoczi
2015-10-16  2:32         ` Xiao Guangrong
2015-10-16  2:32           ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:53 ` [PATCH v3 28/32] nvdimm: support DSM_CMD_NAMESPACE_LABEL_SIZE function Xiao Guangrong
2015-10-11  3:53   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:53 ` [PATCH v3 29/32] nvdimm: support DSM_CMD_GET_NAMESPACE_LABEL_DATA Xiao Guangrong
2015-10-11  3:53   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:53 ` [PATCH v3 30/32] nvdimm: support DSM_CMD_SET_NAMESPACE_LABEL_DATA Xiao Guangrong
2015-10-11  3:53   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:53 ` [PATCH v3 31/32] nvdimm: allow using whole backend memory as pmem Xiao Guangrong
2015-10-11  3:53   ` [Qemu-devel] " Xiao Guangrong
2015-10-11  3:53 ` [PATCH v3 32/32] nvdimm: add maintain info Xiao Guangrong
2015-10-11  3:53   ` [Qemu-devel] " Xiao Guangrong
2015-10-12  2:59 ` [PATCH v3 00/32] implement vNVDIMM Bharata B Rao
2015-10-12  2:59   ` [Qemu-devel] " Bharata B Rao
2015-10-12  3:06   ` Xiao Guangrong
2015-10-12  3:06     ` [Qemu-devel] " Xiao Guangrong
2015-10-12  8:20     ` Igor Mammedov
2015-10-12  8:20       ` [Qemu-devel] " Igor Mammedov
2015-10-12  8:21       ` Xiao Guangrong
2015-10-12  8:21         ` [Qemu-devel] " Xiao Guangrong
2015-10-12 11:55 ` Michael S. Tsirkin
2015-10-12 11:55   ` [Qemu-devel] " Michael S. Tsirkin
2015-10-13  5:29   ` Xiao Guangrong
2015-10-13  5:29     ` [Qemu-devel] " Xiao Guangrong
2015-10-13  5:57     ` Michael S. Tsirkin
2015-10-13  5:57       ` [Qemu-devel] " Michael S. Tsirkin
2015-10-13  5:52       ` Xiao Guangrong
2015-10-13  5:52         ` [Qemu-devel] " Xiao Guangrong
2015-10-19  6:57     ` Michael S. Tsirkin
2015-10-19  6:57       ` [Qemu-devel] " Michael S. Tsirkin
2015-10-19  6:56 ` Michael S. Tsirkin
2015-10-19  6:56   ` [Qemu-devel] " Michael S. Tsirkin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.