From: Joao Martins <joao.m.martins@oracle.com>
To: Barret Rhoden <brho@google.com>, Dan Williams <dan.j.williams@intel.com>
Cc: linux-nvdimm <linux-nvdimm@lists.01.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	Cornelia Huck <cohuck@redhat.com>, KVM list <kvm@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H . Peter Anvin" <hpa@zytor.com>, X86 ML <x86@kernel.org>,
	Liran Alon <liran.alon@oracle.com>,
	Nikita Leshenko <nikita.leshchenko@oracle.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Matthew Wilcox <willy@infradead.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: [PATCH RFC 10/10] nvdimm/e820: add multiple namespaces support
Date: Tue, 4 Feb 2020 19:24:39 +0000
Message-ID: <6fc0d900-3df7-d09a-9b6a-dc6a82823d94@oracle.com>
In-Reply-To: <ae788015-616f-96e6-3a0e-39c1911c4b01@google.com>

On 2/4/20 6:20 PM, Barret Rhoden wrote:
> Hi -
> 
> On 2/4/20 11:44 AM, Dan Williams wrote:
>> On Tue, Feb 4, 2020 at 7:30 AM Barret Rhoden <brho@google.com> wrote:
>>>
>>> Hi -
>>>
>>> On 1/10/20 2:03 PM, Joao Martins wrote:
>>>> User can define regions with 'memmap=size!offset' which in turn
>>>> creates PMEM legacy devices. But because it is a label-less
>>>> NVDIMM device we only have one namespace for the whole device.
>>>>
>>>> Add support for multiple namespaces by adding ndctl control
>>>> support, and exposing a minimal set of features:
>>>> (ND_CMD_GET_CONFIG_SIZE, ND_CMD_GET_CONFIG_DATA,
>>>> ND_CMD_SET_CONFIG_DATA) alongside NDD_ALIASING because we can
>>>> store labels.
>>>
>>> FWIW, I like this a lot.  If we move away from using memmap in favor of
>>> efi_fake_mem, ideally we'd have the same support for full-fledged
>>> pmem/dax regions and namespaces that this patch brings.
>>
>> No, efi_fake_mem only supports creating dax-regions. What's the use
>> case that can't be satisfied by just specifying multiple memmap=
>> ranges?
> 
> I'd like to be able to create and destroy dax regions on the fly.  In 
> particular, I want to run guest VMs using the dax files for guest 
> memory, but I don't know at boot time how many VMs I'll have, or what 
> their sizes are.  Ideally, I'd have separate files for each VM, instead 
> of a single /dev/dax.
> 
> I currently do this with fs-dax with one big memmap region (ext4 on 
> /dev/pmem0), and I use the file system to handle the 
> creation/destruction/resizing and metadata management.  But since fs-dax 
> won't work with device pass-through, I started looking at dev-dax, with 
> the expectation that I'd need some software to manage the memory (i.e. 
> allocation).  That led me to ndctl, which seems to need namespace labels 
> to have the level of control I was looking for.
> 

Indeed this is the intent of the patch.

As Barret mentioned, memmap= is limited to a single namespace covering the entire
region, and this patch would fix that (regardless of namespace mode). Otherwise we
have to know in advance the number of guests and their exact sizes, which would be
somewhat inflexible.
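
To illustrate the boot-time constraint, here is a rough sketch (not taken from
the patch) of how a set of 'memmap=size!offset' options pins down the pmem
layout before any guest exists. The helper names parse_size() and pmem_ranges()
are made up for this example, and the size parsing is a simplified stand-in for
the kernel's memparse(): it assumes decimal or 0x-prefixed numbers with an
optional K/M/G/T suffix.

```python
import re

# Binary suffixes handled by this sketch (an assumption; memparse() accepts more).
_SUFFIX = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Parse a size such as '4G' or '0x1000' into a byte count."""
    m = re.fullmatch(r"(0[xX][0-9a-fA-F]+|[0-9]+)([KMGTkmgt]?)", text)
    if m is None:
        raise ValueError(f"bad size: {text!r}")
    number, suffix = m.group(1), m.group(2).upper()
    # Hex when 0x-prefixed, decimal otherwise (octal is not handled here).
    base = 16 if number[:2].lower() == "0x" else 10
    return int(number, base) * _SUFFIX[suffix]


def pmem_ranges(cmdline):
    """Return a (start, end) byte range for each 'memmap=size!offset' option."""
    ranges = []
    for opt in cmdline.split():
        if not opt.startswith("memmap="):
            continue
        value = opt[len("memmap="):]
        if "!" not in value:
            continue  # other memmap forms (@, #, $) do not describe pmem
        size, offset = value.split("!", 1)
        start = parse_size(offset)
        ranges.append((start, start + parse_size(size)))
    return ranges


# Hypothetical command line: two fixed regions, chosen before any VM is started.
cmdline = "ro quiet memmap=4G!12G memmap=8G!16G"
for start, end in pmem_ranges(cmdline):
    print(f"pmem range {start:#x}-{end - 1:#x} ({(end - start) >> 30} GiB)")
```

Each range above would become one region with exactly one namespace, so
resizing or adding a guest means editing the command line and rebooting.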

But given that it's 'pmem emulation' I thought it was OK to twist the label-less
aspect of nd_e820 (unless there's hardware out there which does this?).

If Dan agrees, I can continue with the patch.

