linux-kernel.vger.kernel.org archive mirror
From: Dan Williams <dan.j.williams@intel.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: linux-nvdimm <linux-nvdimm@ml01.01.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/8] device-dax: sub-division support
Date: Mon, 12 Dec 2016 10:46:45 -0800
Message-ID: <CAPcyv4hq=U8YkYqK4OE__FQyeGq2dKH+=14NttQu0M84yXZ7BQ@mail.gmail.com>
In-Reply-To: <x497f75hxwp.fsf@segfault.boston.devel.redhat.com>

On Mon, Dec 12, 2016 at 9:15 AM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Hi, Dan,
>
> Dan Williams <dan.j.williams@intel.com> writes:
>
>> From [PATCH 6/8] dax: sub-division support:
>>
>> Device-DAX is a mechanism to establish mappings of performance / feature
>> differentiated memory with strict fault behavior guarantees.  With
>> sub-division support a platform owner can provision sub-allocations of a
>> dax-region into separate devices. The provisioning mechanism follows the
>> same scheme as the libnvdimm sub-system in that a 'seed' device is
>> created at initialization time that can be resized from zero to become
>> enabled.
>>
>> Unlike the nvdimm sub-system there is no on media labelling scheme
>> associated with this partitioning. Provisioning decisions are ephemeral
>> / not automatically restored after reboot. While the initial use case of
>> device-dax is persistent memory, other use cases may be volatile, so the
>> device-dax core is unable to assume the underlying memory is pmem.  The
>> task of recalling a partitioning scheme or permissions on the device(s)
>> is left to userspace.
>
> Can you explain this reasoning in a bit more detail, please?  If you
> have specific use cases in mind, that would be helpful.

A few use cases are top of mind:

* userspace persistence support: filesystem-DAX as implemented in XFS
and EXT4 requires filesystem coordination for persistence; device-dax
does not. An application may not need a full namespace worth of
persistent memory, or may want to dynamically resize the amount of
persistent memory it is consuming. This enabling allows online resize
of a device-dax file/instance.

* allocation + access mechanism for performance differentiated memory:
Persistent memory is one example of a reserved memory pool with
different performance characteristics than typical DRAM in a system,
and there are examples of other performance differentiated memory
pools (high bandwidth or low latency) showing up on commonly available
platforms. This mechanism gives purpose-built applications (high
performance computing, databases, etc.) a way to establish mappings
with predictable fault granularities and performance, and also allows
for different permissions per allocation.

* carving up a PCI-E device memory BAR for managing peer-to-peer
transactions: In the thread about enabling P2P DMA one of the
concerns that was raised was security separation of different users of
a device: http://marc.info/?l=linux-kernel&m=148106083913173&w=2
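
To make the provisioning flow concrete, here is a rough userspace
model of the scheme described in the cover letter: a region tracks its
available size, a zero-sized seed device exists from initialization,
and resizing the seed from zero enables it. This is only a sketch and
not the kernel implementation. The class name, the "dax0.N" naming,
and the step that registers a fresh seed once the old one is enabled
are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

MIB = 1 << 20


@dataclass
class DaxRegion:
    """Toy stand-in for a dax-region (hypothetical, not a kernel structure)."""
    size: int    # total capacity of the region in bytes
    align: int   # allocation granularity in bytes
    devices: dict = field(default_factory=dict)  # device name -> size in bytes

    def __post_init__(self):
        # A zero-sized seed device exists from initialization time.
        self.devices["dax0.0"] = 0

    @property
    def available_size(self):
        # Capacity not yet handed out to any device.
        return self.size - sum(self.devices.values())

    def resize(self, name, new_size):
        if new_size % self.align:
            raise ValueError("size must be a multiple of the region alignment")
        old_size = self.devices[name]
        if new_size - old_size > self.available_size:
            raise ValueError("not enough free capacity in the region")
        self.devices[name] = new_size
        # Assumption: enabling a seed (0 -> non-zero) registers a new seed.
        if old_size == 0 and new_size > 0:
            self.devices[f"dax0.{len(self.devices)}"] = 0
```

Nothing in this model is written back to the media, which mirrors the
point that provisioning decisions are ephemeral and that recalling
them after reboot is a userspace job.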

>> For persistent allocations, naming, and permissions automatically
>> recalled by the kernel, use filesystem-DAX. For a userspace helper
>
> I'd agree with that guidance if it wasn't for the fact that device dax
> was born out of the need to be able to flush dirty data in a safe manner
> from userspace.  At best, we're giving mixed guidance to application
> developers.

Yes, but at the same time device-DAX is sufficiently painful (no
read(2)/write(2) support, no built-in metadata support) that it may
spur application developers to lobby for a filesystem that offers
userspace dirty-data flushing. Until then we have this vehicle to test
the difference and dax-support for memory types beyond persistent
memory.
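
Since there is no read(2)/write(2) path, an application reaches the
memory through mmap(2). A minimal sketch of what that might look like
from userspace follows. The helper names are hypothetical, and it
assumes the mapping length has to be a multiple of the device
alignment (for example 2 MiB); the caller supplies the device path and
the alignment.

```python
import mmap
import os


def aligned_length(requested, align):
    """Round `requested` down to a whole number of `align`-sized units."""
    length = (requested // align) * align
    if length == 0:
        raise ValueError("request is smaller than one alignment unit")
    return length


def map_device_dax(path, requested, align):
    """Map a device-dax character device shared and read/write.

    `path` and `align` come from the caller; nothing here discovers them.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        return mmap.mmap(
            fd,
            aligned_length(requested, align),
            flags=mmap.MAP_SHARED,
            prot=mmap.PROT_READ | mmap.PROT_WRITE,
        )
    finally:
        # The mmap object holds its own duplicate of the descriptor.
        os.close(fd)
```

Flushing dirty data from userspace relies on CPU cache-flush
instructions, which plain Python cannot issue, so the sketch stops at
establishing the mapping.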

