From: Dan Williams <dan.j.williams@intel.com>
To: Linda Knippers <linda.knippers@hpe.com>
Cc: Jan Kara <jack@suse.cz>, Matthew Wilcox <mawilcox@microsoft.com>,
	X86 ML <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v4 12/16] libnvdimm, nfit: enable support for volatile ranges
Date: Thu, 29 Jun 2017 14:50:12 -0700
Message-ID: <CAPcyv4g6ijkWncZAbP-Bmbf+J7a2hR-JFLSX-ji6v-+oG0PDJQ@mail.gmail.com>
In-Reply-To: <59556E37.80808@hpe.com>

On Thu, Jun 29, 2017 at 2:16 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
> On 06/29/2017 04:42 PM, Dan Williams wrote:
>> On Thu, Jun 29, 2017 at 12:20 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>> On 06/29/2017 01:54 PM, Dan Williams wrote:
>>>> Allow volatile nfit ranges to participate in all the same infrastructure
>>>> provided for persistent memory regions.
>>>
>>> This seems to be a bit more than "other rework".
>>
>> It's part of the rationale for having a "write_cache" control
>> attribute. There's only so much I can squeeze into the subject line,
>> but it is mentioned in the cover letter.
>>
>>>> A resulting namespace
>>>> device will still be called "pmem", but the parent region type will be
>>>> "nd_volatile".
>>>
>>> What does this look like to a user or admin?  How does someone know that
>>> /dev/pmemX is persistent memory and /dev/pmemY isn't?  Someone shouldn't
>>> have to weed through /sys or ndctl or some other interface to figure that out
>>> in the future if they don't have to do that today.  We have different
>>> names for BTT namespaces.  Is there a different name for volatile ranges?
>>
>> No, the block device name is still /dev/pmem. It's already the case
>> that you need to look beyond just the name of the device to figure
>> out if something is actually volatile or not (see memmap=ss!nn
>> configurations),
>
> I don't have any experience with using memmap but if it's primarily used
> by developers without NVDIMMs, they'd know it's not persistent.  Or is it
> primarily used by administrators using non-NFIT NVDIMMs, in which case it
> is persistent?
>
> In any case, how exactly does one determine whether the device is volatile
> or not?  I'm dumb so tell me the command line or API.

Especially with memmap= or e820-defined memory it's unknowable from
the kernel. We don't know if the user is using it to cover for a
platform where there is no BIOS support for advertising persistent
memory, or if they have a BIOS that does not produce an NFIT as is the
case here [1], or if it is some developer just testing with no
expectation of persistence.

[1]: https://github.com/pmem/ndctl/issues/21

>> so I would not be in favor of changing the device
>> name if we think the memory might not be persistent. Moreover, I think
>> it was a mistake that we change the device name based on whether btt
>> is in use, and
>> I'm glad Matthew talked me out of making the same mistake with
>> memory-mode vs raw-mode pmem namespaces. So, the block device name
>> just reflects the driver of the block device, not the properties of
>> the device, just like all other block device instances.
>
> I agree that creating a new device name for BTT was perhaps a mistake,
> although it would be good to know how to query a device property for
> sector atomicity.  The difference between BTT vs. non-BTT seems less
> critical to me than knowing in an obvious way whether the device is
> actually persistent.

We don't have a good way to answer "actually persistent" in the
general case. I'm thinking of cases where the energy source on the
DIMM has died, or we trigger one of the conditions that leads to the
"unable to guarantee persistence of writes" message. The /dev/pmem
device name just tells you that your block device is hosted by a
driver that knows how to handle persistent memory constraints, but any
other details about the nature of the address range need to come from
other sources of information, and potentially information sources that
the kernel does not know about.
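
For readers wondering what "check the parent region type" could look
like in practice, here is a minimal sketch, not an official tool. It
assumes libnvdimm region devices expose a "devtype" sysfs attribute and
that the block device's "device" link resolves to a namespace directory
somewhere below a "regionN" directory; the helper name region_devtype()
and the sysfs_root parameter are hypothetical and exist only for this
illustration. The "nd_volatile" value is the region type named in the
patch description above.

```python
import os


def region_devtype(block_name, sysfs_root="/sys"):
    """Return the devtype string of the region behind a pmem block device.

    Returns None when no region directory with a devtype file is found.
    Assumption: the sysfs layout is .../regionN/<namespace>/... and the
    region directory carries a readable 'devtype' attribute.
    """
    # Resolve the 'device' link of the block device to its real sysfs path.
    dev = os.path.realpath(
        os.path.join(sysfs_root, "block", block_name, "device")
    )
    # Walk up the directory tree until a 'region*' ancestor is found.
    while dev != os.path.dirname(dev):
        if os.path.basename(dev).startswith("region"):
            devtype_path = os.path.join(dev, "devtype")
            if os.path.isfile(devtype_path):
                with open(devtype_path) as f:
                    return f.read().strip()
        dev = os.path.dirname(dev)
    return None


if __name__ == "__main__":
    # 'pmem0' is only an example device name.
    print(region_devtype("pmem0"))
```

Even if this reports a persistent region type, it only describes how
the range was advertised to the kernel. As noted above, it cannot
answer whether the memory is "actually persistent" (a dead energy
source, or a memmap= range used purely for testing).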


