From: Shivaprasad G Bhat <sbhat@linux.ibm.com>
To: david@gibson.dropbear.id.au, groug@kaod.org, qemu-ppc@nongnu.org
Cc: qemu-devel@nongnu.org, aneesh.kumar@linux.ibm.com,
nvdimm@lists.linux.dev, kvm-ppc@vger.kernel.org,
bharata@linux.vnet.ibm.com
Subject: [PATCH REBASED v5 0/2] spapr: nvdimm: Introduce spapr-nvdimm device
Date: Wed, 07 Jul 2021 21:57:03 -0500
Message-ID: <162571302321.1030381.15196355582642786915.stgit@lep8c.aus.stglabs.ibm.com>

If the nvdimm device's backend is not persistent memory, explicit IO
flushes are needed to ensure persistence.

On SPAPR, the issue is addressed by adding a new hcall through which
the guest requests an explicit flush when the backend is not pmem.
The approach here is to convey, via a device tree property, that the
hcall-based flush is required. Once the guest knows the device needs
explicit flushes, it makes the hcall as and when required.

It was suggested to create a new device type to address the explicit
flush for such backends on PPC, instead of extending the generic nvdimm
device with a new property. So, this series introduces the spapr-nvdimm
device. The new device inherits from the nvdimm device, with the added
behaviour that the device tree property is set when the backend has
pmem=no.

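As an illustration (a sketch only, not the exact patch code), the
device-tree population for such a device could look roughly like the
fragment below; the property name "ibm,hcall-flush-required" and the
helper name are assumptions here, and the fragment is meant to live in
hw/ppc/spapr_nvdimm.c:

#include "qemu/osdep.h"
#include "qom/object.h"
#include "sysemu/hostmem.h"
#include "hw/mem/pc-dimm.h"
#include "hw/ppc/fdt.h"
#include <libfdt.h>

/*
 * Sketch: mark the NVDIMM's device tree node so the guest knows it
 * must request explicit flushes via the new hcall.
 */
static void spapr_nvdimm_flag_flush_required(void *fdt, int child_offset,
                                             PCDIMMDevice *dimm)
{
    HostMemoryBackend *backend = dimm->hostmem;

    /*
     * "pmem" is a property of memory-backend-file; when it is off (or
     * absent), guest stores only reach the host page cache, so the
     * guest must be told to flush explicitly.
     */
    if (!object_property_get_bool(OBJECT(backend), "pmem", NULL)) {
        _FDT(fdt_setprop(fdt, child_offset,
                         "ibm,hcall-flush-required", NULL, 0));
    }
}
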
The demonstration below shows the map_sync behavior for non-pmem
backends.
(https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/ndctl.py.data/map_sync.c)

Here pmem0 is from a spapr-nvdimm with backend pmem=yes, and pmem1 is
from a spapr-nvdimm with pmem=no, mounted as
/dev/pmem0 on /mnt1 type xfs (rw,relatime,attr2,dax=always,inode64,logbufs=8,logbsize=32k,noquota)
/dev/pmem1 on /mnt2 type xfs (rw,relatime,attr2,dax=always,inode64,logbufs=8,logbsize=32k,noquota)
[root@atest-guest ~]# ./mapsync /mnt1/newfile ----> when pmem=yes
[root@atest-guest ~]# ./mapsync /mnt2/newfile ----> when pmem=no
Failed to mmap with Operation not supported
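
A minimal, self-contained approximation of the referenced map_sync.c
(file size and error reporting here are illustrative, not the exact
test source): it attempts an mmap() with MAP_SYNC | MAP_SHARED_VALIDATE,
which the kernel rejects with EOPNOTSUPP when synchronous page faults
cannot be guaranteed, as in the pmem=no case above.

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_SYNC
#define MAP_SYNC            0x80000
#endif
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif

int main(int argc, char **argv)
{
    size_t len = 2 * 1024 * 1024;
    void *addr;
    int fd;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <file-on-dax-mount>\n", argv[0]);
        return 1;
    }

    fd = open(argv[1], O_RDWR | O_CREAT, 0644);
    if (fd < 0 || ftruncate(fd, len) < 0) {
        perror(argv[1]);
        return 1;
    }

    /* MAP_SYNC asks the kernel to guarantee that page faults make the
     * mapping's metadata durable; DAX mounts without real pmem (or
     * without the flush guarantee) refuse it. */
    addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    if (addr == MAP_FAILED) {
        printf("Failed to mmap with %s\n", strerror(errno));
        return 1;
    }

    munmap(addr, len);
    close(fd);
    return 0;
}
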
The first patch implements the hcall and adds the necessary vmstate
properties to the spapr machine structure for carrying the hcall status
across save-restore. Since the hcall is asynchronous in nature, the
patch uses the aio utilities to offload the flush. The second patch
introduces the spapr-nvdimm device and adds the device tree property
for the guest when spapr-nvdimm is used with pmem="no" on the backend.
The kernel changes to exploit this hcall are at
https://github.com/linuxppc/linux/commit/75b7c05ebf9026.patch
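
A rough sketch of that offload (names, struct layout and error handling
are assumptions, not the exact patch code; the fragment assumes it
lives in hw/ppc/spapr_nvdimm.c): the flush is pushed to QEMU's thread
pool so the vcpu is not blocked, and the result is recorded for the
guest to collect on a later H_SCM_FLUSH poll with its continue token.

#include "qemu/osdep.h"
#include "block/aio.h"
#include "block/thread-pool.h"
#include "hw/ppc/spapr.h"

typedef struct SpaprNVDIMMFlushState {
    uint64_t continue_token;   /* handed back to the guest for polling */
    int64_t hcall_ret;         /* H_SUCCESS/H_HARDWARE once complete */
    int backend_fd;            /* fd of the file backing the nvdimm */
} SpaprNVDIMMFlushState;

/* Runs in a worker thread: flush the backing file to stable storage. */
static int flush_worker_cb(void *opaque)
{
    SpaprNVDIMMFlushState *state = opaque;

    return qemu_fdatasync(state->backend_fd) < 0 ? H_HARDWARE : H_SUCCESS;
}

/* Completion callback: runs back in the main loop with the BQL held. */
static void spapr_nvdimm_flush_completion(void *opaque, int ret)
{
    SpaprNVDIMMFlushState *state = opaque;

    /* Picked up by the guest's next H_SCM_FLUSH call carrying
     * state->continue_token. */
    state->hcall_ret = ret;
}

static void spapr_nvdimm_submit_flush(SpaprNVDIMMFlushState *state)
{
    ThreadPool *pool = aio_get_thread_pool(qemu_get_aio_context());

    thread_pool_submit_aio(pool, flush_worker_cb, state,
                           spapr_nvdimm_flush_completion, state);
}
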
---
v4 - https://lists.gnu.org/archive/html/qemu-devel/2021-04/msg05982.html
Changes from v4:
- Introduce spapr-nvdimm device with nvdimm device as the parent.
- The new spapr-nvdimm device has no new properties. As this is a new
device and there are no migration-related dependencies to be
taken care of, the device behaviour is made to set the device tree
property and enable the hcall when the device type spapr-nvdimm is
used with pmem="no".
- Fixed commit messages
- Added checks to ensure the backend is actually a file and not memory
- Addressed things pointed out by Eric
v3 - https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg07916.html
Changes from v3:
- Fixed the forward declaration coding guideline violations in 1st patch.
- Removed the code waiting for the flushes to complete during migration;
instead, the flush worker is restarted on the destination qemu in post_load.
- Got rid of the randomization of the flush tokens, using a simple
counter instead.
- Got rid of the redundant flush state lock, relying on the BQL now.
- Handling the memory-backend-ram usage
- Changed the sync-dax semantics from on/off to 'unsafe', 'writeback' and 'direct'.
Added code to prevent the use of 'writeback' on arm and x86_64.
- Fixed all the miscellaneous comments.
v2 - https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg07031.html
Changes from v2:
- Using the thread pool based approach as suggested
- Moved the async hcall handling code to spapr_nvdimm.c along
with some simplifications
- Added vmstate to preserve the hcall status during save-restore
along with pre_save handler code to complete all ongoing flushes.
- Added hw_compat magic for sync-dax 'on' on previous machines.
- Miscellaneous minor fixes.
v1 - https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg06330.html
Changes from v1:
- Fixed a missed-out unlock
- Using QLIST_FOREACH instead of QLIST_FOREACH_SAFE while generating the token
Shivaprasad G Bhat (2):
spapr: nvdimm: Implement H_SCM_FLUSH hcall
spapr: nvdimm: Introduce spapr-nvdimm device
hw/ppc/spapr.c | 6 +
hw/ppc/spapr_nvdimm.c | 286 +++++++++++++++++++++++++++++++++++++++++
include/hw/ppc/spapr.h | 11 +-
include/hw/ppc/spapr_nvdimm.h | 17 ++
4 files changed, 319 insertions(+), 1 deletion(-)