From: Dan Williams <dan.j.williams@intel.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	Xiao Guangrong <guangrong.xiao@linux.intel.com>,
	Arnd Bergmann <arnd@arndb.de>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	linux-api@vger.kernel.org,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [RFC PATCH 1/2] mm, mincore2(): retrieve dax and tlb-size attributes of an address range
Date: Mon, 12 Sep 2016 10:15:35 -0700
Message-ID: <CAPcyv4i0j2d9NqqG4JJFDykP400xT+JcO9wA+d9MiRJTBHTfbA@mail.gmail.com>
In-Reply-To: <20160912100910.GC23346@node.shutemov.name>

On Mon, Sep 12, 2016 at 3:09 AM, Kirill A. Shutemov
<kirill@shutemov.name> wrote:
> On Sun, Sep 11, 2016 at 10:31:35AM -0700, Dan Williams wrote:
>> As evidenced by this bug report [1], userspace libraries are interested
>> in whether a mapping is DAX mapped, i.e. no intervening page cache.
>> Rather than using the ambiguous VM_MIXEDMAP flag in smaps, provide an
>> explicit "is dax" indication as a new flag in the page vector populated
>> by mincore.
>>
>> There are also cases, particularly when testing and validating a
>> configuration, where it is useful to know the hardware mapping
>> geometry of the pages in a given process address range.  Consider
>> filesystem-dax, where a configuration needs to take care to align
>> partitions and block allocations before huge page mappings might be
>> used, or anonymous-transparent-huge-pages, where a process is
>> opportunistically assigned large pages.  mincore2() allows these
>> configurations to be surveyed and validated.
>>
>> The implementation takes advantage of the unused bits in the per-page
>> byte returned for each PAGE_SIZE extent of a given address range.  The
>> new format of each vector byte is:
>>
>> (TLB_SHIFT - PAGE_SHIFT) << 2 | vma_is_dax() << 1 | page_present
>>
>> [1]: https://lkml.org/lkml/2016/9/7/61
>>
>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Dave Hansen <dave.hansen@linux.intel.com>
>> Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>  include/linux/syscalls.h               |    2 +
>>  include/uapi/asm-generic/mman-common.h |    3 +
>>  kernel/sys_ni.c                        |    1
>>  mm/mincore.c                           |  126 +++++++++++++++++++++++++-------
>>  4 files changed, 104 insertions(+), 28 deletions(-)
>>
>> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
>> index d02239022bd0..4aa2ee7e359a 100644
>> --- a/include/linux/syscalls.h
>> +++ b/include/linux/syscalls.h
>> @@ -467,6 +467,8 @@ asmlinkage long sys_munlockall(void);
>>  asmlinkage long sys_madvise(unsigned long start, size_t len, int behavior);
>>  asmlinkage long sys_mincore(unsigned long start, size_t len,
>>                               unsigned char __user * vec);
>> +asmlinkage long sys_mincore2(unsigned long start, size_t len,
>> +                             unsigned char __user * vec, int flags);
>
> We have had a few attempts to extend the mincore(2) interface/functionality
> before. None of them ended up upstream.
>
> How does this attempt compare to the previous ones?

Not sure, I'm wading into this cold trying to get my pet problem
solved, hence the RFC.
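
For reference, the per-page vector byte format quoted above
((TLB_SHIFT - PAGE_SHIFT) << 2 | vma_is_dax() << 1 | page_present) could be
unpacked in userspace roughly as follows. This is only a minimal sketch:
struct vec_info, decode_vec_byte() and mapping_size() are hypothetical names
that are not part of the patch, and the bit layout is assumed to be exactly
the one in the patch description.

```c
#include <stdbool.h>

/* Hypothetical container for the fields packed into one vector byte. */
struct vec_info {
	bool present;		/* bit 0: page_present */
	bool dax;		/* bit 1: vma_is_dax() */
	unsigned int order;	/* bits 2-7: TLB_SHIFT - PAGE_SHIFT */
};

/* Split one byte of the vector into its three fields. */
static struct vec_info decode_vec_byte(unsigned char b)
{
	struct vec_info info = {
		.present = b & 0x1,
		.dax = (b >> 1) & 0x1,
		.order = b >> 2,
	};

	return info;
}

/* Size in bytes of the hardware mapping, given the base page size. */
static unsigned long long mapping_size(struct vec_info info,
				       unsigned long page_size)
{
	return (unsigned long long)page_size << info.order;
}
```

Under these assumptions, a byte of 0x27 would decode as present, DAX, with
order 9, i.e. a 2 MiB mapping when the base page size is 4096 bytes.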
