linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm: add ZONE_DEVICE statistics to smaps
@ 2016-11-10 22:11 Dan Williams
  2016-11-11  4:12 ` Anshuman Khandual
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Dan Williams @ 2016-11-10 22:11 UTC (permalink / raw)
  To: akpm; +Cc: Dave Hansen, linux-nvdimm, Christoph Hellwig, linux-kernel, linux-mm

ZONE_DEVICE pages are mapped into a process via the filesystem-dax and
device-dax mechanisms.  There are also proposals to use ZONE_DEVICE
pages for other usages outside of dax.  Add statistics to smaps so
applications can debug that they are obtaining the mappings they expect,
or otherwise accounting them.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/proc/task_mmu.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 35b92d81692f..6765cafcf057 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -445,6 +445,8 @@ struct mem_size_stats {
 	unsigned long swap;
 	unsigned long shared_hugetlb;
 	unsigned long private_hugetlb;
+	unsigned long device;
+	unsigned long device_huge;
 	u64 pss;
 	u64 swap_pss;
 	bool check_shmem_swap;
@@ -458,6 +460,8 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
 
 	if (PageAnon(page))
 		mss->anonymous += size;
+	else if (is_zone_device_page(page))
+		mss->device += size;
 
 	mss->resident += size;
 	/* Accumulate the size in pages that have been accessed. */
@@ -575,7 +579,7 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
 	else if (PageSwapBacked(page))
 		mss->shmem_thp += HPAGE_PMD_SIZE;
 	else if (is_zone_device_page(page))
-		/* pass */;
+		mss->device_huge += HPAGE_PMD_SIZE;
 	else
 		VM_BUG_ON_PAGE(1, page);
 	smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd));
@@ -774,6 +778,8 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
 		   "ShmemPmdMapped: %8lu kB\n"
 		   "Shared_Hugetlb: %8lu kB\n"
 		   "Private_Hugetlb: %7lu kB\n"
+		   "Device:         %8lu kB\n"
+		   "DeviceHugePages: %7lu kB\n"
 		   "Swap:           %8lu kB\n"
 		   "SwapPss:        %8lu kB\n"
 		   "KernelPageSize: %8lu kB\n"
@@ -792,6 +798,8 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
 		   mss.shmem_thp >> 10,
 		   mss.shared_hugetlb >> 10,
 		   mss.private_hugetlb >> 10,
+		   mss.device >> 10,
+		   mss.device_huge >> 10,
 		   mss.swap >> 10,
 		   (unsigned long)(mss.swap_pss >> (10 + PSS_SHIFT)),
 		   vma_kernel_pagesize(vma) >> 10,

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: add ZONE_DEVICE statistics to smaps
  2016-11-10 22:11 [PATCH] mm: add ZONE_DEVICE statistics to smaps Dan Williams
@ 2016-11-11  4:12 ` Anshuman Khandual
  2016-11-15  3:14 ` Dan Williams
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Anshuman Khandual @ 2016-11-11  4:12 UTC (permalink / raw)
  To: Dan Williams, akpm
  Cc: Dave Hansen, linux-nvdimm, Christoph Hellwig, linux-kernel, linux-mm

On 11/11/2016 03:41 AM, Dan Williams wrote:
> ZONE_DEVICE pages are mapped into a process via the filesystem-dax and
> device-dax mechanisms.  There are also proposals to use ZONE_DEVICE
> pages for other usages outside of dax.  Add statistics to smaps so
> applications can debug that they are obtaining the mappings they expect,
> or otherwise accounting them.

This might also help when we will have ZONE_DEVICE based solution for
HMM based device memory.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: add ZONE_DEVICE statistics to smaps
  2016-11-10 22:11 [PATCH] mm: add ZONE_DEVICE statistics to smaps Dan Williams
  2016-11-11  4:12 ` Anshuman Khandual
@ 2016-11-15  3:14 ` Dan Williams
  2016-11-15 18:50   ` Christoph Hellwig
  2016-11-16  0:04 ` Andrew Morton
  2016-11-16  0:15 ` Dave Hansen
  3 siblings, 1 reply; 7+ messages in thread
From: Dan Williams @ 2016-11-15  3:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dave Hansen, linux-nvdimm@lists.01.org, linux-kernel, Linux MM,
	Andrew Morton

On Thu, Nov 10, 2016 at 2:11 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> ZONE_DEVICE pages are mapped into a process via the filesystem-dax and
> device-dax mechanisms.  There are also proposals to use ZONE_DEVICE
> pages for other usages outside of dax.  Add statistics to smaps so
> applications can debug that they are obtaining the mappings they expect,
> or otherwise accounting them.
>
> Cc: Christoph Hellwig <hch@lst.de>

Christoph,

Wanted to get your opinion on this given your earlier concerns about
the VM_DAX flag.

This instead lets an application know how much of a vma is backed by
ZONE_DEVICE pages, but does not make any indications about the vma
having DAX semantics or not.  I.e. it is possible that 'device' and
'device_huge' are non-zero *and* vma_is_dax() is false.  So, it is
purely accounting the composition of the present pages in the vma.

Another option is to have something like 'shared_thp' just to account
for file backed huge pages that dax can map.  However if ZONE_DEVICE
is leaking into other use cases I think it makes sense to have it be a
first class-citizen with respect to accounting alongside
'anonymous_thp'.

> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  fs/proc/task_mmu.c |   10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 35b92d81692f..6765cafcf057 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -445,6 +445,8 @@ struct mem_size_stats {
>         unsigned long swap;
>         unsigned long shared_hugetlb;
>         unsigned long private_hugetlb;
> +       unsigned long device;
> +       unsigned long device_huge;
>         u64 pss;
>         u64 swap_pss;
>         bool check_shmem_swap;
> @@ -458,6 +460,8 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
>
>         if (PageAnon(page))
>                 mss->anonymous += size;
> +       else if (is_zone_device_page(page))
> +               mss->device += size;
>
>         mss->resident += size;
>         /* Accumulate the size in pages that have been accessed. */
> @@ -575,7 +579,7 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
>         else if (PageSwapBacked(page))
>                 mss->shmem_thp += HPAGE_PMD_SIZE;
>         else if (is_zone_device_page(page))
> -               /* pass */;
> +               mss->device_huge += HPAGE_PMD_SIZE;
>         else
>                 VM_BUG_ON_PAGE(1, page);
>         smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd));
> @@ -774,6 +778,8 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
>                    "ShmemPmdMapped: %8lu kB\n"
>                    "Shared_Hugetlb: %8lu kB\n"
>                    "Private_Hugetlb: %7lu kB\n"
> +                  "Device:         %8lu kB\n"
> +                  "DeviceHugePages: %7lu kB\n"
>                    "Swap:           %8lu kB\n"
>                    "SwapPss:        %8lu kB\n"
>                    "KernelPageSize: %8lu kB\n"
> @@ -792,6 +798,8 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
>                    mss.shmem_thp >> 10,
>                    mss.shared_hugetlb >> 10,
>                    mss.private_hugetlb >> 10,
> +                  mss.device >> 10,
> +                  mss.device_huge >> 10,
>                    mss.swap >> 10,
>                    (unsigned long)(mss.swap_pss >> (10 + PSS_SHIFT)),
>                    vma_kernel_pagesize(vma) >> 10,
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: add ZONE_DEVICE statistics to smaps
  2016-11-15  3:14 ` Dan Williams
@ 2016-11-15 18:50   ` Christoph Hellwig
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2016-11-15 18:50 UTC (permalink / raw)
  To: Dan Williams
  Cc: Christoph Hellwig, Dave Hansen, linux-nvdimm@lists.01.org,
	linux-kernel, Linux MM, Andrew Morton

Hi Dan,

On Mon, Nov 14, 2016 at 07:14:22PM -0800, Dan Williams wrote:
> Wanted to get your opinion on this given your earlier concerns about
> the VM_DAX flag.
> 
> This instead lets an application know how much of a vma is backed by
> ZONE_DEVICE pages, but does not make any indications about the vma
> having DAX semantics or not.  I.e. it is possible that 'device' and
> 'device_huge' are non-zero *and* vma_is_dax() is false.  So, it is
> purely accounting the composition of the present pages in the vma.
> 
> Another option is to have something like 'shared_thp' just to account
> for file backed huge pages that dax can map.  However if ZONE_DEVICE
> is leaking into other use cases I think it makes sense to have it be a
> first class-citizen with respect to accounting alongside
> 'anonymous_thp'.

This counter sounds fine to me, it's a debug tool and not an obvious
abuse candidate like VM_DAX.  But I'll defer to the VM folks for a real
review.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: add ZONE_DEVICE statistics to smaps
  2016-11-10 22:11 [PATCH] mm: add ZONE_DEVICE statistics to smaps Dan Williams
  2016-11-11  4:12 ` Anshuman Khandual
  2016-11-15  3:14 ` Dan Williams
@ 2016-11-16  0:04 ` Andrew Morton
  2016-11-16  0:15 ` Dave Hansen
  3 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2016-11-16  0:04 UTC (permalink / raw)
  To: Dan Williams
  Cc: Dave Hansen, linux-nvdimm, Christoph Hellwig, linux-kernel, linux-mm

On Thu, 10 Nov 2016 14:11:57 -0800 Dan Williams <dan.j.williams@intel.com> wrote:

> ZONE_DEVICE pages are mapped into a process via the filesystem-dax and
> device-dax mechanisms.  There are also proposals to use ZONE_DEVICE
> pages for other usages outside of dax.  Add statistics to smaps so
> applications can debug that they are obtaining the mappings they expect,
> or otherwise accounting them.
> 
> ...
>
>  fs/proc/task_mmu.c |   10 +++++++++-

Please update Documentation/filesystems/proc.txt.

(While there, please check to see if anything else is missed?)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: add ZONE_DEVICE statistics to smaps
  2016-11-10 22:11 [PATCH] mm: add ZONE_DEVICE statistics to smaps Dan Williams
                   ` (2 preceding siblings ...)
  2016-11-16  0:04 ` Andrew Morton
@ 2016-11-16  0:15 ` Dave Hansen
  2016-11-16  0:48   ` Dan Williams
  3 siblings, 1 reply; 7+ messages in thread
From: Dave Hansen @ 2016-11-16  0:15 UTC (permalink / raw)
  To: Dan Williams, akpm
  Cc: linux-nvdimm, Christoph Hellwig, linux-kernel, linux-mm

On 11/10/2016 02:11 PM, Dan Williams wrote:
> @@ -774,6 +778,8 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
>  		   "ShmemPmdMapped: %8lu kB\n"
>  		   "Shared_Hugetlb: %8lu kB\n"
>  		   "Private_Hugetlb: %7lu kB\n"
> +		   "Device:         %8lu kB\n"
> +		   "DeviceHugePages: %7lu kB\n"
>  		   "Swap:           %8lu kB\n"
>  		   "SwapPss:        %8lu kB\n"
>  		   "KernelPageSize: %8lu kB\n"

So, a couple of nits...

smaps is getting a bit big, and the fields that get added in this patch
are going to be pretty infrequently used.  Is it OK if smaps grows
forever, even if most of them items are "0 kB"?

IOW, Could we make it output Device* only for DAX VMAs?  All the parsers
have to handle that field being there or not (for old kernels).

The other thing missing for DAX is the page size.  DAX mappings support
mixed page sizes, so MMUPageSize in this context is pretty worthless.
What will we do in here for 1GB DAX pages?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: add ZONE_DEVICE statistics to smaps
  2016-11-16  0:15 ` Dave Hansen
@ 2016-11-16  0:48   ` Dan Williams
  0 siblings, 0 replies; 7+ messages in thread
From: Dan Williams @ 2016-11-16  0:48 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Andrew Morton, linux-nvdimm@lists.01.org, Christoph Hellwig,
	linux-kernel, Linux MM

On Tue, Nov 15, 2016 at 4:15 PM, Dave Hansen <dave.hansen@intel.com> wrote:
> On 11/10/2016 02:11 PM, Dan Williams wrote:
>> @@ -774,6 +778,8 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
>>                  "ShmemPmdMapped: %8lu kB\n"
>>                  "Shared_Hugetlb: %8lu kB\n"
>>                  "Private_Hugetlb: %7lu kB\n"
>> +                "Device:         %8lu kB\n"
>> +                "DeviceHugePages: %7lu kB\n"
>>                  "Swap:           %8lu kB\n"
>>                  "SwapPss:        %8lu kB\n"
>>                  "KernelPageSize: %8lu kB\n"
>
> So, a couple of nits...
>
> smaps is getting a bit big, and the fields that get added in this patch
> are going to be pretty infrequently used.  Is it OK if smaps grows
> forever, even if most of them items are "0 kB"?
>
> IOW, Could we make it output Device* only for DAX VMAs?  All the parsers
> have to handle that field being there or not (for old kernels).

How about just hiding the field if it is zero?  That way it's not an
backdoor way to leak vma_is_dax() which was Christoph's concern.

> The other thing missing for DAX is the page size.  DAX mappings support
> mixed page sizes, so MMUPageSize in this context is pretty worthless.
> What will we do in here for 1GB DAX pages?

I was thinking that would be yet another field "DeviceGiganticPages?"
when we eventually add 1GB support (not there today).

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-11-16  0:49 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-10 22:11 [PATCH] mm: add ZONE_DEVICE statistics to smaps Dan Williams
2016-11-11  4:12 ` Anshuman Khandual
2016-11-15  3:14 ` Dan Williams
2016-11-15 18:50   ` Christoph Hellwig
2016-11-16  0:04 ` Andrew Morton
2016-11-16  0:15 ` Dave Hansen
2016-11-16  0:48   ` Dan Williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).