From: Pavel Tatashin <pasha.tatashin@soleen.com>
To: pasha.tatashin@soleen.com, jmorris@namei.org, sashal@kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-nvdimm@lists.01.org, akpm@linux-foundation.org,
mhocko@suse.com, dave.hansen@linux.intel.com,
dan.j.williams@intel.com, keith.busch@intel.com,
vishal.l.verma@intel.com, dave.jiang@intel.com,
zwisler@kernel.org, thomas.lendacky@amd.com,
ying.huang@intel.com, fengguang.wu@intel.com, bp@suse.de,
bhelgaas@google.com, baiyaowei@cmss.chinamobile.com,
tiwai@suse.de, jglisse@redhat.com
Subject: [v2 0/2] "Hotremove" persistent memory
Date: Sat, 20 Apr 2019 21:44:27 -0400 [thread overview]
Message-ID: <20190421014429.31206-1-pasha.tatashin@soleen.com> (raw)
Changelog:
v2
- Dan Williams mentioned that drv->remove() return is ignored
by unbind. Unbind always succeeds. Because we cannot guarantee
that memory can be offlined from the driver, don't even
attempt to do so. Simply check that every section is offlined
beforehand and only then proceed with removing dax memory.
---
Recently, adding a persistent memory to be used like a regular RAM was
added to Linux. This work extends this functionality to also allow hot
removing persistent memory.
We (Microsoft) have an important use case for this functionality.
The requirement is for physical machines with small amount of RAM (~8G)
to be able to reboot in a very short period of time (<1s). Yet, there is
a userland state that is expensive to recreate (~2G).
The solution is to boot machines with 2G preserved for persistent
memory.
Copy the state, and hotadd the persistent memory so machine still has
all 8G available for runtime. Before reboot, offline and hotremove
device-dax 2G, copy the memory that is needed to be preserved to pmem0
device, and reboot.
The series of operations look like this:
1. After boot restore /dev/pmem0 to ramdisk to be consumed by apps.
and free ramdisk.
2. Convert raw pmem0 to devdax
ndctl create-namespace --mode devdax --map mem -e namespace0.0 -f
3. Hotadd to System RAM
echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
echo online_movable > /sys/devices/system/memoryXXX/state
4. Before reboot hotremove device-dax memory from System RAM
echo offline > /sys/devices/system/memoryXXX/state
echo dax0.0 > /sys/bus/dax/drivers/kmem/unbind
5. Create raw pmem0 device
ndctl create-namespace --mode raw -e namespace0.0 -f
6. Copy the state that was stored by apps to ramdisk to pmem device
7. Do kexec reboot or reboot through firmware if firmware does not
zero memory in pmem0 region (These machines have only regular
volatile memory). So to have pmem0 device either memmap kernel
parameter is used, or devices nodes in dtb are specified.
Pavel Tatashin (2):
device-dax: fix memory and resource leak if hotplug fails
device-dax: "Hotremove" persistent memory that is used like normal RAM
drivers/dax/dax-private.h | 2 +
drivers/dax/kmem.c | 96 +++++++++++++++++++++++++++++++++++++--
2 files changed, 93 insertions(+), 5 deletions(-)
--
2.21.0
next reply other threads:[~2019-04-21 1:44 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-21 1:44 Pavel Tatashin [this message]
2019-04-21 1:44 ` [v2 1/2] device-dax: fix memory and resource leak if hotplug fails Pavel Tatashin
2019-04-21 1:44 ` [v2 2/2] device-dax: "Hotremove" persistent memory that is used like normal RAM Pavel Tatashin
2019-04-24 20:55 ` David Hildenbrand
2019-04-24 21:02 ` Dan Williams
2019-04-24 21:34 ` Pavel Tatashin
2019-04-25 7:41 ` David Hildenbrand
2019-04-25 12:30 ` Pavel Tatashin
2019-04-25 12:38 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190421014429.31206-1-pasha.tatashin@soleen.com \
--to=pasha.tatashin@soleen.com \
--cc=akpm@linux-foundation.org \
--cc=baiyaowei@cmss.chinamobile.com \
--cc=bhelgaas@google.com \
--cc=bp@suse.de \
--cc=dan.j.williams@intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=dave.jiang@intel.com \
--cc=fengguang.wu@intel.com \
--cc=jglisse@redhat.com \
--cc=jmorris@namei.org \
--cc=keith.busch@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-nvdimm@lists.01.org \
--cc=mhocko@suse.com \
--cc=sashal@kernel.org \
--cc=thomas.lendacky@amd.com \
--cc=tiwai@suse.de \
--cc=vishal.l.verma@intel.com \
--cc=ying.huang@intel.com \
--cc=zwisler@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).