linux-arm-kernel.lists.infradead.org archive mirror
* [RFC] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags
@ 2014-12-05  8:57 Wang, Yalin
  2014-12-05  9:20 ` Konstantin Khlebnikov
       [not found] ` <35FD53F367049845BC99AC72306C23D103E688B313F1@CNBJMBX05.corpusers.net>
  0 siblings, 2 replies; 20+ messages in thread
From: Wang, Yalin @ 2014-12-05  8:57 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds a KPF_ZERO_PAGE flag for the zero page,
so that userspace processes can identify the zero page from
/proc/kpageflags and do memory analysis more accurately.

Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
---
 fs/proc/page.c                         | 3 +++
 include/uapi/linux/kernel-page-flags.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/fs/proc/page.c b/fs/proc/page.c
index 1e3187d..120dbf7 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -136,6 +136,9 @@ u64 stable_page_flags(struct page *page)
 	if (PageBalloon(page))
 		u |= 1 << KPF_BALLOON;
 
+	if (is_zero_pfn(page_to_pfn(page)))
+		u |= 1 << KPF_ZERO_PAGE;
+
 	u |= kpf_copy_bit(k, KPF_LOCKED,	PG_locked);
 
 	u |= kpf_copy_bit(k, KPF_SLAB,		PG_slab);
diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
index 2f96d23..a6c4962 100644
--- a/include/uapi/linux/kernel-page-flags.h
+++ b/include/uapi/linux/kernel-page-flags.h
@@ -32,6 +32,7 @@
 #define KPF_KSM			21
 #define KPF_THP			22
 #define KPF_BALLOON		23
+#define KPF_ZERO_PAGE		24
 
 
 #endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags
  2014-12-05  8:57 [RFC] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags Wang, Yalin
@ 2014-12-05  9:20 ` Konstantin Khlebnikov
  2014-12-05 10:22   ` Wang, Yalin
       [not found] ` <35FD53F367049845BC99AC72306C23D103E688B313F1@CNBJMBX05.corpusers.net>
  1 sibling, 1 reply; 20+ messages in thread
From: Konstantin Khlebnikov @ 2014-12-05  9:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 5, 2014 at 11:57 AM, Wang, Yalin <Yalin.Wang@sonymobile.com> wrote:
> This patch adds a KPF_ZERO_PAGE flag for the zero page,
> so that userspace processes can identify the zero page from
> /proc/kpageflags and do memory analysis more accurately.

It would be nice to also mark huge_zero_page. See the (completely
untested) patch in the attachment.

>
> Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
> ---
>  fs/proc/page.c                         | 3 +++
>  include/uapi/linux/kernel-page-flags.h | 1 +
>  2 files changed, 4 insertions(+)
>
> diff --git a/fs/proc/page.c b/fs/proc/page.c
> index 1e3187d..120dbf7 100644
> --- a/fs/proc/page.c
> +++ b/fs/proc/page.c
> @@ -136,6 +136,9 @@ u64 stable_page_flags(struct page *page)
>         if (PageBalloon(page))
>                 u |= 1 << KPF_BALLOON;
>
> +       if (is_zero_pfn(page_to_pfn(page)))
> +               u |= 1 << KPF_ZERO_PAGE;
> +
>         u |= kpf_copy_bit(k, KPF_LOCKED,        PG_locked);
>
>         u |= kpf_copy_bit(k, KPF_SLAB,          PG_slab);
> diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
> index 2f96d23..a6c4962 100644
> --- a/include/uapi/linux/kernel-page-flags.h
> +++ b/include/uapi/linux/kernel-page-flags.h
> @@ -32,6 +32,7 @@
>  #define KPF_KSM                        21
>  #define KPF_THP                        22
>  #define KPF_BALLOON            23
> +#define KPF_ZERO_PAGE          24
>
>
>  #endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
> --
> 2.1.3
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kpageflags-zero-huge-page
Type: application/octet-stream
Size: 2626 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20141205/6e1ec2ab/attachment.obj>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags
  2014-12-05  9:20 ` Konstantin Khlebnikov
@ 2014-12-05 10:22   ` Wang, Yalin
  2014-12-05 22:31     ` Andrew Morton
  0 siblings, 1 reply; 20+ messages in thread
From: Wang, Yalin @ 2014-12-05 10:22 UTC (permalink / raw)
  To: linux-arm-kernel

> -----Original Message-----
> From: Konstantin Khlebnikov [mailto:koct9i at gmail.com]
> Sent: Friday, December 05, 2014 5:21 PM
> To: Wang, Yalin
> Cc: linux-kernel at vger.kernel.org; linux-mm at kvack.org;
> linux-arm-kernel at lists.infradead.org; akpm at linux-foundation.org;
> n-horiguchi at ah.jp.nec.com
> Subject: Re: [RFC] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags
> 
> On Fri, Dec 5, 2014 at 11:57 AM, Wang, Yalin <Yalin.Wang@sonymobile.com>
> wrote:
> > This patch adds a KPF_ZERO_PAGE flag for the zero page, so that userspace
> > processes can identify the zero page from /proc/kpageflags and do memory
> > analysis more accurately.
> 
> It would be nice to also mark huge_zero_page. See the (completely
> untested) patch in the attachment.
> 
Got it,
Thanks for your patch.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC V2] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags
       [not found] ` <35FD53F367049845BC99AC72306C23D103E688B313F1@CNBJMBX05.corpusers.net>
@ 2014-12-05 11:05   ` Kirill A. Shutemov
  2014-12-08  7:47   ` [PATCH] mm:add VM_BUG_ON() for page_mapcount() Wang, Yalin
  1 sibling, 0 replies; 20+ messages in thread
From: Kirill A. Shutemov @ 2014-12-05 11:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 05, 2014 at 06:21:17PM +0800, Wang, Yalin wrote:
> This patch adds a KPF_ZERO_PAGE flag for the zero page,
> so that userspace processes can identify the zero page from
> /proc/kpageflags and do memory analysis more accurately.
> 
> Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
> ---
>  fs/proc/page.c                         | 14 +++++++++++---
>  include/linux/huge_mm.h                | 12 ++++++++++++
>  include/uapi/linux/kernel-page-flags.h |  1 +
>  mm/huge_memory.c                       |  7 +------
>  4 files changed, 25 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/proc/page.c b/fs/proc/page.c
> index 1e3187d..dbe5630 100644
> --- a/fs/proc/page.c
> +++ b/fs/proc/page.c
> @@ -5,6 +5,7 @@
>  #include <linux/ksm.h>
>  #include <linux/mm.h>
>  #include <linux/mmzone.h>
> +#include <linux/huge_mm.h>
>  #include <linux/proc_fs.h>
>  #include <linux/seq_file.h>
>  #include <linux/hugetlb.h>
> @@ -121,9 +122,16 @@ u64 stable_page_flags(struct page *page)
>  	 * just checks PG_head/PG_tail, so we need to check PageLRU/PageAnon
>  	 * to make sure a given page is a thp, not a non-huge compound page.
>  	 */
> -	else if (PageTransCompound(page) && (PageLRU(compound_head(page)) ||
> -					     PageAnon(compound_head(page))))
> -		u |= 1 << KPF_THP;
> +	else if (PageTransCompound(page)) {
> +		struct page *head = compound_head(page);
> +
> +		if (PageLRU(head) || PageAnon(head))
> +			u |= 1 << KPF_THP;
> +		else if (is_huge_zero_page(head))
> +			u |= 1 << KPF_ZERO_PAGE;

IIUC, the KPF_THP bit should be set for the huge zero page too.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags
  2014-12-05 10:22   ` Wang, Yalin
@ 2014-12-05 22:31     ` Andrew Morton
       [not found]       ` <35FD53F367049845BC99AC72306C23D103E688B313F4@CNBJMBX05.corpusers.net>
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2014-12-05 22:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 5 Dec 2014 18:22:33 +0800 "Wang, Yalin" <Yalin.Wang@sonymobile.com> wrote:

> > -----Original Message-----
> > From: Konstantin Khlebnikov [mailto:koct9i at gmail.com]
> > Sent: Friday, December 05, 2014 5:21 PM
> > To: Wang, Yalin
> > Cc: linux-kernel at vger.kernel.org; linux-mm at kvack.org;
> > linux-arm-kernel at lists.infradead.org; akpm at linux-foundation.org;
> > n-horiguchi at ah.jp.nec.com
> > Subject: Re: [RFC] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags
> > 
> > On Fri, Dec 5, 2014 at 11:57 AM, Wang, Yalin <Yalin.Wang@sonymobile.com>
> > wrote:
> > > This patch adds a KPF_ZERO_PAGE flag for the zero page, so that userspace
> > > processes can identify the zero page from /proc/kpageflags and do memory
> > > analysis more accurately.
> > 
> > It would be nice to also mark huge_zero_page. See the (completely
> > untested) patch in the attachment.
> > 
> Got it,
> Thanks for your patch.

Documentation/vm/pagemap.txt will need updating, please.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH] mm:add VM_BUG_ON() for page_mapcount()
       [not found] ` <35FD53F367049845BC99AC72306C23D103E688B313F1@CNBJMBX05.corpusers.net>
  2014-12-05 11:05   ` [RFC V2] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags Kirill A. Shutemov
@ 2014-12-08  7:47   ` Wang, Yalin
  2014-12-08 11:50     ` Kirill A. Shutemov
  1 sibling, 1 reply; 20+ messages in thread
From: Wang, Yalin @ 2014-12-08  7:47 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds a VM_BUG_ON() for slab pages.
Because _mapcount shares a union with the slab fields in struct page,
page_mapcount() must not read _mapcount when the page is a slab page.
Also remove the unneeded parentheses.

Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
---
 include/linux/mm.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 11b65cf..34124c4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -373,7 +373,8 @@ static inline void reset_page_mapcount(struct page *page)
 
 static inline int page_mapcount(struct page *page)
 {
-	return atomic_read(&(page)->_mapcount) + 1;
+	VM_BUG_ON(PageSlab(page));
+	return atomic_read(&page->_mapcount) + 1;
 }
 
 static inline int page_count(struct page *page)
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC V4] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags
       [not found]       ` <35FD53F367049845BC99AC72306C23D103E688B313F4@CNBJMBX05.corpusers.net>
@ 2014-12-08 11:46         ` Kirill A. Shutemov
       [not found]           ` <35FD53F367049845BC99AC72306C23D103E688B313FB@CNBJMBX05.corpusers.net>
  0 siblings, 1 reply; 20+ messages in thread
From: Kirill A. Shutemov @ 2014-12-08 11:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 08, 2014 at 10:00:50AM +0800, Wang, Yalin wrote:
> This patch adds a KPF_ZERO_PAGE flag for the zero page,
> so that userspace processes can identify the zero page from
> /proc/kpageflags and do memory analysis more accurately.
> 
> Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
> ---
>  Documentation/vm/pagemap.txt           |  5 +++++
>  fs/proc/page.c                         | 16 +++++++++++++---
>  include/linux/huge_mm.h                | 12 ++++++++++++
>  include/uapi/linux/kernel-page-flags.h |  1 +
>  mm/huge_memory.c                       |  7 +------
>  5 files changed, 32 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/vm/pagemap.txt b/Documentation/vm/pagemap.txt
> index 5948e45..fdeb06e 100644
> --- a/Documentation/vm/pagemap.txt
> +++ b/Documentation/vm/pagemap.txt
> @@ -62,6 +62,8 @@ There are three components to pagemap:
>      20. NOPAGE
>      21. KSM
>      22. THP
> +    23. BALLOON
> +    24. ZERO_PAGE
>  
>  Short descriptions to the page flags:
>  
> @@ -102,6 +104,9 @@ Short descriptions to the page flags:
>  22. THP
>      contiguous pages which construct transparent hugepages
>  
> +24. ZERO_PAGE
> +    zero page for pfn_zero or huge_zero page
> +
>      [IO related page flags]
>   1. ERROR     IO error occurred
>   3. UPTODATE  page has up-to-date data

Would be nice to document BALLOON while you're there.
Otherwise looks good to me.

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH] mm:add VM_BUG_ON() for page_mapcount()
  2014-12-08  7:47   ` [PATCH] mm:add VM_BUG_ON() for page_mapcount() Wang, Yalin
@ 2014-12-08 11:50     ` Kirill A. Shutemov
  0 siblings, 0 replies; 20+ messages in thread
From: Kirill A. Shutemov @ 2014-12-08 11:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 08, 2014 at 03:47:47PM +0800, Wang, Yalin wrote:
> This patch adds a VM_BUG_ON() for slab pages.
> Because _mapcount shares a union with the slab fields in struct page,
> page_mapcount() must not read _mapcount when the page is a slab page.
> Also remove the unneeded parentheses.
> 
> Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
> ---
>  include/linux/mm.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 11b65cf..34124c4 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -373,7 +373,8 @@ static inline void reset_page_mapcount(struct page *page)
>  
>  static inline int page_mapcount(struct page *page)
>  {
> -	return atomic_read(&(page)->_mapcount) + 1;
> +	VM_BUG_ON(PageSlab(page));

VM_BUG_ON_PAGE(), please.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 20+ messages in thread
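
The reasoning behind the page_mapcount() check discussed above can be sketched in userspace. This is only a model under stated assumptions: struct model_page and model_page_mapcount() are hypothetical stand-ins, a plain int replaces the kernel's atomic_t, and assert() stands in for VM_BUG_ON_PAGE(). It shows two things from the thread: the stored count is biased by -1 (hence the "+ 1" in the patch), and the same storage is reused by slab bookkeeping, so reading it for a slab page is meaningless.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the relevant part of struct page. */
struct model_page {
	bool slab;                       /* models PageSlab(page) */
	union {
		int mapcount;            /* biased by -1: -1 means "not mapped" */
		unsigned int slab_inuse; /* slab bookkeeping sharing the storage */
	} u;
};

/* Models page_mapcount(): refuse slab pages, then undo the -1 bias. */
static inline int model_page_mapcount(const struct model_page *page)
{
	/* Stands in for VM_BUG_ON_PAGE(PageSlab(page), page). */
	assert(!page->slab);
	return page->u.mapcount + 1;
}
```

A freshly reset page (stored value -1) reports a mapcount of 0, while a page with stored value 1 reports 2.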

* [RFC] mm:fix zero_page huge_zero_page rss/pss statistic
       [not found]             ` <35FD53F367049845BC99AC72306C23D103E688B31403@CNBJMBX05.corpusers.net>
@ 2014-12-10 11:05               ` Kirill A. Shutemov
  2014-12-12  1:59                 ` Wang, Yalin
  0 siblings, 1 reply; 20+ messages in thread
From: Kirill A. Shutemov @ 2014-12-10 11:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 10, 2014 at 03:22:21PM +0800, Wang, Yalin wrote:
> smaps_pte_entry() doesn't ignore huge_zero_page,
> but it does ignore zero_page, because vm_normal_page() skips
> it. We remove the vm_normal_page() call: walk_page_range()
> already skips VM_PFNMAP vmas, so it's safe to just use pfn_valid(),
> and then zero_page is also counted as a valid page.

We fixed huge zero page accounting in smaps recently. See the mm tree.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC V5] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags
       [not found]           ` <35FD53F367049845BC99AC72306C23D103E688B313FB@CNBJMBX05.corpusers.net>
       [not found]             ` <35FD53F367049845BC99AC72306C23D103E688B31403@CNBJMBX05.corpusers.net>
@ 2014-12-10 17:06             ` Konstantin Khlebnikov
       [not found]               ` <35FD53F367049845BC99AC72306C23D103E688B31408@CNBJMBX05.corpusers.net>
  1 sibling, 1 reply; 20+ messages in thread
From: Konstantin Khlebnikov @ 2014-12-10 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 9, 2014 at 6:24 AM, Wang, Yalin <Yalin.Wang@sonymobile.com> wrote:
> This patch adds a KPF_ZERO_PAGE flag for the zero page,
> so that userspace processes can identify the zero page from
> /proc/kpageflags and do memory analysis more accurately.
>
> Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>

Ack. Looks good.

> ---
>  Documentation/vm/pagemap.txt           |  8 ++++++++
>  fs/proc/page.c                         | 16 +++++++++++++---
>  include/linux/huge_mm.h                | 12 ++++++++++++
>  include/uapi/linux/kernel-page-flags.h |  1 +
>  mm/huge_memory.c                       |  7 +------
>  tools/vm/page-types.c                  |  1 +
>  6 files changed, 36 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/vm/pagemap.txt b/Documentation/vm/pagemap.txt
> index 5948e45..6fbd55e 100644
> --- a/Documentation/vm/pagemap.txt
> +++ b/Documentation/vm/pagemap.txt
> @@ -62,6 +62,8 @@ There are three components to pagemap:
>      20. NOPAGE
>      21. KSM
>      22. THP
> +    23. BALLOON
> +    24. ZERO_PAGE
>
>  Short descriptions to the page flags:
>
> @@ -102,6 +104,12 @@ Short descriptions to the page flags:
>  22. THP
>      contiguous pages which construct transparent hugepages
>
> +23. BALLOON
> +    balloon compaction page
> +
> +24. ZERO_PAGE
> +    zero page for pfn_zero or huge_zero page
> +
>      [IO related page flags]
>   1. ERROR     IO error occurred
>   3. UPTODATE  page has up-to-date data
> diff --git a/fs/proc/page.c b/fs/proc/page.c
> index 1e3187d..7eee2d8 100644
> --- a/fs/proc/page.c
> +++ b/fs/proc/page.c
> @@ -5,6 +5,7 @@
>  #include <linux/ksm.h>
>  #include <linux/mm.h>
>  #include <linux/mmzone.h>
> +#include <linux/huge_mm.h>
>  #include <linux/proc_fs.h>
>  #include <linux/seq_file.h>
>  #include <linux/hugetlb.h>
> @@ -121,9 +122,18 @@ u64 stable_page_flags(struct page *page)
>          * just checks PG_head/PG_tail, so we need to check PageLRU/PageAnon
>          * to make sure a given page is a thp, not a non-huge compound page.
>          */
> -       else if (PageTransCompound(page) && (PageLRU(compound_head(page)) ||
> -                                            PageAnon(compound_head(page))))
> -               u |= 1 << KPF_THP;
> +       else if (PageTransCompound(page)) {
> +               struct page *head = compound_head(page);
> +
> +               if (PageLRU(head) || PageAnon(head))
> +                       u |= 1 << KPF_THP;
> +               else if (is_huge_zero_page(head)) {
> +                       u |= 1 << KPF_ZERO_PAGE;
> +                       u |= 1 << KPF_THP;
> +               }
> +       } else if (is_zero_pfn(page_to_pfn(page)))
> +               u |= 1 << KPF_ZERO_PAGE;
> +
>
>         /*
>          * Caveats on high order pages: page->_count will only be set
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index ad9051b..f10b20f 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -157,6 +157,13 @@ static inline int hpage_nr_pages(struct page *page)
>  extern int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
>                                 unsigned long addr, pmd_t pmd, pmd_t *pmdp);
>
> +extern struct page *huge_zero_page;
> +
> +static inline bool is_huge_zero_page(struct page *page)
> +{
> +       return ACCESS_ONCE(huge_zero_page) == page;
> +}
> +
>  #else /* CONFIG_TRANSPARENT_HUGEPAGE */
>  #define HPAGE_PMD_SHIFT ({ BUILD_BUG(); 0; })
>  #define HPAGE_PMD_MASK ({ BUILD_BUG(); 0; })
> @@ -206,6 +213,11 @@ static inline int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_str
>         return 0;
>  }
>
> +static inline bool is_huge_zero_page(struct page *page)
> +{
> +       return false;
> +}
> +
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>
>  #endif /* _LINUX_HUGE_MM_H */
> diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
> index 2f96d23..a6c4962 100644
> --- a/include/uapi/linux/kernel-page-flags.h
> +++ b/include/uapi/linux/kernel-page-flags.h
> @@ -32,6 +32,7 @@
>  #define KPF_KSM                        21
>  #define KPF_THP                        22
>  #define KPF_BALLOON            23
> +#define KPF_ZERO_PAGE          24
>
>
>  #endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index de98415..d7bc7a5 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -171,12 +171,7 @@ static int start_khugepaged(void)
>  }
>
>  static atomic_t huge_zero_refcount;
> -static struct page *huge_zero_page __read_mostly;
> -
> -static inline bool is_huge_zero_page(struct page *page)
> -{
> -       return ACCESS_ONCE(huge_zero_page) == page;
> -}
> +struct page *huge_zero_page __read_mostly;
>
>  static inline bool is_huge_zero_pmd(pmd_t pmd)
>  {
> diff --git a/tools/vm/page-types.c b/tools/vm/page-types.c
> index 264fbc2..8bdf16b 100644
> --- a/tools/vm/page-types.c
> +++ b/tools/vm/page-types.c
> @@ -133,6 +133,7 @@ static const char * const page_flag_names[] = {
>         [KPF_KSM]               = "x:ksm",
>         [KPF_THP]               = "t:thp",
>         [KPF_BALLOON]           = "o:balloon",
> +       [KPF_ZERO_PAGE]         = "z:zero_page",
>
>         [KPF_RESERVED]          = "r:reserved",
>         [KPF_MLOCKED]           = "m:mlocked",
> --
> 2.1.3

^ permalink raw reply	[flat|nested] 20+ messages in thread
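
The flag selection in the V5 hunk of stable_page_flags() quoted above can be modelled as a small pure function, which makes the three outcomes easy to check. This is a hedged sketch, not kernel code: struct page_facts and zero_thp_bits() are hypothetical names, the booleans stand in for the kernel predicates named in the comments, and the earlier branch of the if/else chain that is not visible in the hunk is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

#define KPF_THP        22
#define KPF_ZERO_PAGE  24

/* Hypothetical snapshot of the predicates the kernel code evaluates. */
struct page_facts {
	bool trans_compound;   /* PageTransCompound(page) */
	bool head_lru_or_anon; /* PageLRU(head) || PageAnon(head) */
	bool huge_zero;        /* is_huge_zero_page(head) */
	bool zero_pfn;         /* is_zero_pfn(page_to_pfn(page)) */
};

/* Mirror the branch structure of the V5 patch. */
static inline uint64_t zero_thp_bits(const struct page_facts *f)
{
	uint64_t u = 0;

	if (f->trans_compound) {
		if (f->head_lru_or_anon)
			u |= UINT64_C(1) << KPF_THP;
		else if (f->huge_zero)
			/* The huge zero page is reported as both THP and zero. */
			u |= (UINT64_C(1) << KPF_ZERO_PAGE) |
			     (UINT64_C(1) << KPF_THP);
	} else if (f->zero_pfn) {
		u |= UINT64_C(1) << KPF_ZERO_PAGE;
	}
	return u;
}
```

With this model, the huge zero page yields both bits (as requested in the review of V2), the small zero page yields only KPF_ZERO_PAGE, and an ordinary THP yields only KPF_THP.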

* [RFC] mm:fix zero_page huge_zero_page rss/pss statistic
  2014-12-10 11:05               ` [RFC] mm:fix zero_page huge_zero_page rss/pss statistic Kirill A. Shutemov
@ 2014-12-12  1:59                 ` Wang, Yalin
  2014-12-12 11:10                   ` Kirill A. Shutemov
  0 siblings, 1 reply; 20+ messages in thread
From: Wang, Yalin @ 2014-12-12  1:59 UTC (permalink / raw)
  To: linux-arm-kernel

> -----Original Message-----
> From: Kirill A. Shutemov [mailto:kirill at shutemov.name]
> Sent: Wednesday, December 10, 2014 7:06 PM
> To: Wang, Yalin
> Cc: 'Andrew Morton'; 'Konstantin Khlebnikov';
> 'linux-kernel at vger.kernel.org'; 'linux-mm at kvack.org';
> 'linux-arm-kernel at lists.infradead.org'; 'n-horiguchi at ah.jp.nec.com';
> 'oleg at redhat.com'; 'gorcunov at openvz.org'; 'pfeiner at google.com'
> Subject: Re: [RFC] mm:fix zero_page huge_zero_page rss/pss statistic
> 
> On Wed, Dec 10, 2014 at 03:22:21PM +0800, Wang, Yalin wrote:
> > smaps_pte_entry() doesn't ignore huge_zero_page, but it does ignore
> > zero_page, because vm_normal_page() skips it. We remove the
> > vm_normal_page() call: walk_page_range() already skips VM_PFNMAP
> > vmas, so it's safe to just use pfn_valid(), and then zero_page is
> > also counted as a valid page.
> 
> We fixed huge zero page accounting in smaps recently. See the mm tree.
> 
Hi,
I can't find the git tree; could you send me a link?
Thank you!

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC] mm:fix zero_page huge_zero_page rss/pss statistic
  2014-12-12  1:59                 ` Wang, Yalin
@ 2014-12-12 11:10                   ` Kirill A. Shutemov
  0 siblings, 0 replies; 20+ messages in thread
From: Kirill A. Shutemov @ 2014-12-12 11:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 12, 2014 at 09:59:15AM +0800, Wang, Yalin wrote:
> > -----Original Message-----
> > From: Kirill A. Shutemov [mailto:kirill at shutemov.name]
> > Sent: Wednesday, December 10, 2014 7:06 PM
> > To: Wang, Yalin
> > Cc: 'Andrew Morton'; 'Konstantin Khlebnikov';
> > 'linux-kernel at vger.kernel.org'; 'linux-mm at kvack.org';
> > 'linux-arm-kernel at lists.infradead.org'; 'n-horiguchi at ah.jp.nec.com';
> > 'oleg at redhat.com'; 'gorcunov at openvz.org'; 'pfeiner at google.com'
> > Subject: Re: [RFC] mm:fix zero_page huge_zero_page rss/pss statistic
> > 
> > On Wed, Dec 10, 2014 at 03:22:21PM +0800, Wang, Yalin wrote:
> > > smaps_pte_entry() doesn't ignore huge_zero_page, but it does ignore
> > > zero_page, because vm_normal_page() skips it. We remove the
> > > vm_normal_page() call: walk_page_range() already skips VM_PFNMAP
> > > vmas, so it's safe to just use pfn_valid(), and then zero_page is
> > > also counted as a valid page.
> > 
> > We fixed huge zero page accounting in smaps recently. See the mm tree.
> > 
> Hi,
> I can't find the git tree; could you send me a link?

http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/

or just take linux-next.

The fix is already in Linus' tree.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC] MADV_FREE doesn't work when doesn't have swap partition
       [not found]                 ` <35FD53F367049845BC99AC72306C23D103EDAF89E14C@CNBJMBX05.corpusers.net>
@ 2014-12-19  1:04                   ` Minchan Kim
  2014-12-19  6:54                     ` Wang, Yalin
  2014-12-22 10:30                     ` Konstantin Khlebnikov
       [not found]                   ` <35FD53F367049845BC99AC72306C23D103EDAF89E160@CNBJMBX05.corpusers.net>
  1 sibling, 2 replies; 20+ messages in thread
From: Minchan Kim @ 2014-12-19  1:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Dec 18, 2014 at 11:50:01AM +0800, Wang, Yalin wrote:
> I noticed this commit:
> mm: support madvise(MADV_FREE)
> 
> It can free clean anonymous pages directly,
> without paging them out to a swap partition,
> 
> but I found it doesn't work on my platform,
> which doesn't enable any swap partitions.

In the current implementation, if there is no empty slot in swap, it does
an instant free instead of a delayed free. Look at madvise_vma.

> 
> I made the change below
> just to explain my issue clearly.
> Do we need some other checks so that we still scan anonymous pages when
> there is no swap partition but there are clean anonymous pages?

There are a few places we should consider if you want to scan anonymous pages
without swap. Refer to 69c854817566 and 74e3f3c3391d.

However, it's not simple at the moment. If we re-enable anonymous scanning
without swap, it would cause a significant reclaim regression. So my direction
is to move normal anonymous pages onto the unevictable LRU list, because they
are truly unevictable without swap, and to put delayed-free pages onto the
anon LRU list and age them.

> ---
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 5e8772b..8258f3a 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1941,7 +1941,7 @@ static void get_scan_count(struct lruvec *lruvec, int swappiness,
>                 force_scan = true;
> 
>         /* If we have no swap space, do not bother scanning anon pages. */
> -       if (!sc->may_swap || (get_nr_swap_pages() <= 0)) {
> +       if (!sc->may_swap) {
>                 scan_balance = SCAN_FILE;
>                 goto out;
>         }

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 20+ messages in thread
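
For readers unfamiliar with the interface under discussion, this is roughly how an application would use MADV_FREE from userspace. It is a minimal sketch under stated assumptions: it only builds the MADV_FREE path when the C library headers define the constant, and the helpers release_scratch() and release_scratch_demo() are hypothetical names. The fallback to MADV_DONTNEED frees immediately rather than lazily, which loosely mirrors the "instant free" behaviour described above but is an application-level choice, not what the kernel patch does.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: tell the kernel that the contents of
 * [addr, addr + len) are no longer needed. Prefer lazy freeing and
 * fall back to MADV_DONTNEED if MADV_FREE is missing at build time
 * or rejected with EINVAL at run time. */
static inline int release_scratch(void *addr, size_t len)
{
#ifdef MADV_FREE
	if (madvise(addr, len, MADV_FREE) == 0)
		return 0;
	if (errno != EINVAL)
		return -1;
#endif
	return madvise(addr, len, MADV_DONTNEED);
}

/* Usage: dirty a private anonymous mapping, then release its pages. */
static inline int release_scratch_demo(void)
{
	size_t len = 4 * 4096;
	void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	int ret;

	if (buf == MAP_FAILED)
		return -1;
	memset(buf, 0xaa, len);          /* make the pages dirty */
	ret = release_scratch(buf, len); /* contents may now be discarded */
	munmap(buf, len);
	return ret;
}
```

After a successful call the mapping stays valid, but the application must treat the old contents as lost.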

* [RFC] MADV_FREE doesn't work when doesn't have swap partition
  2014-12-19  1:04                   ` [RFC] MADV_FREE doesn't work when doesn't have swap partition Minchan Kim
@ 2014-12-19  6:54                     ` Wang, Yalin
  2014-12-22 10:30                     ` Konstantin Khlebnikov
  1 sibling, 0 replies; 20+ messages in thread
From: Wang, Yalin @ 2014-12-19  6:54 UTC (permalink / raw)
  To: linux-arm-kernel

> -----Original Message-----
> From: Minchan Kim [mailto:minchan at kernel.org]
> Sent: Friday, December 19, 2014 9:05 AM
> To: Wang, Yalin
> Cc: 'Konstantin Khlebnikov'; 'Kirill A. Shutemov'; 'Andrew Morton';
> 'linux-kernel at vger.kernel.org'; 'linux-mm at kvack.org';
> 'linux-arm-kernel at lists.infradead.org'; 'n-horiguchi at ah.jp.nec.com'
> Subject: Re: [RFC] MADV_FREE doesn't work when doesn't have swap partition
> 
> On Thu, Dec 18, 2014 at 11:50:01AM +0800, Wang, Yalin wrote:
> > I noticed this commit:
> > mm: support madvise(MADV_FREE)
> >
> > It can free clean anonymous pages directly, without paging them out to
> > a swap partition,
> >
> > but I found it doesn't work on my platform, which doesn't enable any
> > swap partitions.
> 
> In the current implementation, if there is no empty slot in swap, it does
> an instant free instead of a delayed free. Look at madvise_vma.
> 
> >
> > I made the change below just to explain my issue clearly.
> > Do we need some other checks so that we still scan anonymous pages when
> > there is no swap partition but there are clean anonymous pages?
> 
> There are a few places we should consider if you want to scan anonymous
> pages without swap. Refer to 69c854817566 and 74e3f3c3391d.
> 
> However, it's not simple at the moment. If we re-enable anonymous scanning
> without swap, it would cause a significant reclaim regression. So my
> direction is to move normal anonymous pages onto the unevictable LRU list,
> because they are truly unevictable without swap, and to put delayed-free
> pages onto the anon LRU list and age them.
> 
I understand your solution; it sounds like a great idea!
When will this design be merged into mainline?

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC] MADV_FREE doesn't work when doesn't have swap partition
  2014-12-19  1:04                   ` [RFC] MADV_FREE doesn't work when doesn't have swap partition Minchan Kim
  2014-12-19  6:54                     ` Wang, Yalin
@ 2014-12-22 10:30                     ` Konstantin Khlebnikov
  1 sibling, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2014-12-22 10:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 19, 2014 at 4:04 AM, Minchan Kim <minchan@kernel.org> wrote:
> On Thu, Dec 18, 2014 at 11:50:01AM +0800, Wang, Yalin wrote:
>> I noticed this commit:
>> mm: support madvise(MADV_FREE)
>>
>> It can free clean anonymous pages directly,
>> without paging them out to a swap partition,
>>
>> but I found it doesn't work on my platform,
>> which doesn't enable any swap partitions.
>
> In the current implementation, if there is no empty slot in swap, it does
> an instant free instead of a delayed free. Look at madvise_vma.
>
>>
>> I made the change below
>> just to explain my issue clearly.
>> Do we need some other checks so that we still scan anonymous pages when
>> there is no swap partition but there are clean anonymous pages?
>
> There are a few places we should consider if you want to scan anonymous pages
> without swap. Refer to 69c854817566 and 74e3f3c3391d.
>
> However, it's not simple at the moment. If we re-enable anonymous scanning
> without swap, it would cause a significant reclaim regression. So my direction
> is to move normal anonymous pages onto the unevictable LRU list, because they
> are truly unevictable without swap, and to put delayed-free pages onto the
> anon LRU list and age them.

This sounds reasonable. In this case swapon must either scan the
unevictable pages and make some of them evictable again, or just move
all unevictable pages onto the active list and postpone this job until
the reclaimer is invoked.

>
>> ---
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 5e8772b..8258f3a 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1941,7 +1941,7 @@ static void get_scan_count(struct lruvec *lruvec, int swappiness,
>>                 force_scan = true;
>>
>>         /* If we have no swap space, do not bother scanning anon pages. */
>> -       if (!sc->may_swap || (get_nr_swap_pages() <= 0)) {
>> +       if (!sc->may_swap) {
>>                 scan_balance = SCAN_FILE;
>>                 goto out;
>>         }
>
> --
> Kind regards,
> Minchan Kim
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC] mm:change meminfo cached calculation
       [not found]                   ` <35FD53F367049845BC99AC72306C23D103EDAF89E160@CNBJMBX05.corpusers.net>
@ 2015-01-07  0:43                     ` Andrew Morton
  2015-01-07  1:04                       ` Hugh Dickins
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2015-01-07  0:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 26 Dec 2014 19:56:49 +0800 "Wang, Yalin" <Yalin.Wang@sonymobile.com> wrote:

> This patch subtracts sharedram from cached.
> Shared memory pages can only be swapped out to swap partitions,
> so they should be treated as swap pages, not as cached pages.
> 
> ...
>
> --- a/fs/proc/meminfo.c
> +++ b/fs/proc/meminfo.c
> @@ -45,7 +45,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
>  	committed = percpu_counter_read_positive(&vm_committed_as);
>  
>  	cached = global_page_state(NR_FILE_PAGES) -
> -			total_swapcache_pages() - i.bufferram;
> +			total_swapcache_pages() - i.bufferram - i.sharedram;
>  	if (cached < 0)
>  		cached = 0;

Documentation/filesystems/proc.txt says

:      Cached: in-memory cache for files read from the disk (the
:              pagecache).  Doesn't include SwapCached

So yes, I guess it should not include shmem.

And why not do this as well?


--- a/Documentation/filesystems/proc.txt~mm-change-meminfo-cached-calculation-fix
+++ a/Documentation/filesystems/proc.txt
@@ -811,7 +811,7 @@ MemAvailable: An estimate of how much me
      Buffers: Relatively temporary storage for raw disk blocks
               shouldn't get tremendously large (20MB or so)
       Cached: in-memory cache for files read from the disk (the
-              pagecache).  Doesn't include SwapCached
+              pagecache).  Doesn't include SwapCached or Shmem.
   SwapCached: Memory that once was swapped out, is swapped back in but
               still also is in the swapfile (if memory is needed it
               doesn't need to be swapped out AGAIN because it is already
_

^ permalink raw reply	[flat|nested] 20+ messages in thread
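
The arithmetic proposed for the Cached field above can be written out as a small standalone function. This is a sketch of the calculation only, assuming all inputs are page counts: cached_pages() is a hypothetical name, and its parameters stand in for global_page_state(NR_FILE_PAGES), total_swapcache_pages(), i.bufferram and i.sharedram from the quoted hunk.

```c
/* Hypothetical model of the "Cached" computation in meminfo_proc_show(),
 * including the proposed shmem subtraction. All values are page counts. */
static inline long cached_pages(long file_pages, long swapcache,
				long bufferram, long sharedram)
{
	long cached = file_pages - swapcache - bufferram - sharedram;

	/* The counters are sampled separately, so clamp at zero as the
	 * kernel code does. */
	return cached < 0 ? 0 : cached;
}
```

For example, with 1000 file pages, 100 swap cache pages, 200 buffer pages and 300 shmem pages, the proposed formula reports 400 cached pages, where the old formula would have reported 700.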

* [RFC] mm:change meminfo cached calculation
  2015-01-07  0:43                     ` [RFC] mm:change meminfo cached calculation Andrew Morton
@ 2015-01-07  1:04                       ` Hugh Dickins
  2015-01-07  1:25                         ` Andrew Morton
  0 siblings, 1 reply; 20+ messages in thread
From: Hugh Dickins @ 2015-01-07  1:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 6 Jan 2015, Andrew Morton wrote:
> On Fri, 26 Dec 2014 19:56:49 +0800 "Wang, Yalin" <Yalin.Wang@sonymobile.com> wrote:
> 
> > This patch subtracts sharedram from cached.
> > sharedram can only be swapped out to swap partitions,
> > so it should be treated as swap pages, not as cached pages.
> > 
> > ...
> >
> > --- a/fs/proc/meminfo.c
> > +++ b/fs/proc/meminfo.c
> > @@ -45,7 +45,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
> >  	committed = percpu_counter_read_positive(&vm_committed_as);
> >  
> >  	cached = global_page_state(NR_FILE_PAGES) -
> > -			total_swapcache_pages() - i.bufferram;
> > +			total_swapcache_pages() - i.bufferram - i.sharedram;
> >  	if (cached < 0)
> >  		cached = 0;
> 
> Documentation/filesystems/proc.txt says
> 
> :      Cached: in-memory cache for files read from the disk (the
> :              pagecache).  Doesn't include SwapCached
> 
> So yes, I guess it should not include shmem.
> 
> And why not do this as well?
> 
> 
> --- a/Documentation/filesystems/proc.txt~mm-change-meminfo-cached-calculation-fix
> +++ a/Documentation/filesystems/proc.txt
> @@ -811,7 +811,7 @@ MemAvailable: An estimate of how much me
>       Buffers: Relatively temporary storage for raw disk blocks
>                shouldn't get tremendously large (20MB or so)
>        Cached: in-memory cache for files read from the disk (the
> -              pagecache).  Doesn't include SwapCached
> +              pagecache).  Doesn't include SwapCached or Shmem.
>    SwapCached: Memory that once was swapped out, is swapped back in but
>                still also is in the swapfile (if memory is needed it
>                doesn't need to be swapped out AGAIN because it is already

Whoa.  Changes of this kind would have made good sense about 14 years ago.
And there's plenty more which would benefit from having anon/shmem/file
properly distinguished.  But how can we make such a change now,
breaking everything that has made its own sense of these counts?

Hugh


* [RFC] mm:change meminfo cached calculation
  2015-01-07  1:04                       ` Hugh Dickins
@ 2015-01-07  1:25                         ` Andrew Morton
  2015-01-07  2:03                           ` Hugh Dickins
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2015-01-07  1:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 6 Jan 2015 17:04:33 -0800 (PST) Hugh Dickins <hughd@google.com> wrote:

> On Tue, 6 Jan 2015, Andrew Morton wrote:
> > On Fri, 26 Dec 2014 19:56:49 +0800 "Wang, Yalin" <Yalin.Wang@sonymobile.com> wrote:
> > 
> > > This patch subtracts sharedram from cached.
> > > sharedram can only be swapped out to swap partitions,
> > > so it should be treated as swap pages, not as cached pages.
> > > 
> > > ...
> > >
> > > --- a/fs/proc/meminfo.c
> > > +++ b/fs/proc/meminfo.c
> > > @@ -45,7 +45,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
> > >  	committed = percpu_counter_read_positive(&vm_committed_as);
> > >  
> > >  	cached = global_page_state(NR_FILE_PAGES) -
> > > -			total_swapcache_pages() - i.bufferram;
> > > +			total_swapcache_pages() - i.bufferram - i.sharedram;
> > >  	if (cached < 0)
> > >  		cached = 0;
> > 
> > Documentation/filesystems/proc.txt says
> > 
> > :      Cached: in-memory cache for files read from the disk (the
> > :              pagecache).  Doesn't include SwapCached
> > 
> > So yes, I guess it should not include shmem.
> > 
> > And why not do this as well?
> > 
> > 
> > --- a/Documentation/filesystems/proc.txt~mm-change-meminfo-cached-calculation-fix
> > +++ a/Documentation/filesystems/proc.txt
> > @@ -811,7 +811,7 @@ MemAvailable: An estimate of how much me
> >       Buffers: Relatively temporary storage for raw disk blocks
> >                shouldn't get tremendously large (20MB or so)
> >        Cached: in-memory cache for files read from the disk (the
> > -              pagecache).  Doesn't include SwapCached
> > +              pagecache).  Doesn't include SwapCached or Shmem.
> >    SwapCached: Memory that once was swapped out, is swapped back in but
> >                still also is in the swapfile (if memory is needed it
> >                doesn't need to be swapped out AGAIN because it is already
> 
> Whoa.  Changes of this kind would have made good sense about 14 years ago.
> And there's plenty more which would benefit from having anon/shmem/file
> properly distinguished.  But how can we make such a change now,
> breaking everything that has made its own sense of these counts?

That's what I was wondering, but I was having some trouble picking a
situation where it mattered much.  What's the problematic scenario
here?  Userspace that is taking Cached, saying "that was silly" and
subtracting Shmem from it by hand?

I suppose that as nobody knows we should err on the side of caution and
leave this alone.  But the situation is pretty sad - it would be nice
to make the code agree with the documentation at least.
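
As a rough sketch of the userspace scenario asked about above (taking Cached
and subtracting Shmem by hand), something like the following could be what
such a tool does. The helpers meminfo_kb() and file_cached_kb() are
hypothetical names, and the code assumes the usual "Name:   value kB" layout
of /proc/meminfo; it works on text already read into a buffer.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the value (in kB) of field `name` in /proc/meminfo-style text,
 * or -1 if the field is not present.
 */
static long meminfo_kb(const char *text, const char *name)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		long val;

		/* Match only at the start of a line, so that "Cached"
		 * does not pick up the "SwapCached" line. */
		if (strncmp(line, name, len) == 0 && line[len] == ':' &&
		    sscanf(line + len + 1, "%ld", &val) == 1)
			return val;

		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Cached with Shmem subtracted by hand, clamped at zero; -1 on error. */
static long file_cached_kb(const char *text)
{
	long cached = meminfo_kb(text, "Cached");
	long shmem = meminfo_kb(text, "Shmem");

	if (cached < 0)
		return -1;
	if (shmem < 0)		/* no Shmem line: nothing to subtract */
		shmem = 0;

	return cached > shmem ? cached - shmem : 0;
}
```

A tool written this way would subtract Shmem twice if the kernel started
excluding it from Cached, which is the kind of breakage being discussed.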


* [RFC] mm:change meminfo cached calculation
  2015-01-07  1:25                         ` Andrew Morton
@ 2015-01-07  2:03                           ` Hugh Dickins
  2015-01-11  8:23                             ` Konstantin Khlebnikov
  0 siblings, 1 reply; 20+ messages in thread
From: Hugh Dickins @ 2015-01-07  2:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 6 Jan 2015, Andrew Morton wrote:
> On Tue, 6 Jan 2015 17:04:33 -0800 (PST) Hugh Dickins <hughd@google.com> wrote:
> > On Tue, 6 Jan 2015, Andrew Morton wrote:
> > > On Fri, 26 Dec 2014 19:56:49 +0800 "Wang, Yalin" <Yalin.Wang@sonymobile.com> wrote:
> > > 
> > > > This patch subtracts sharedram from cached.
> > > > sharedram can only be swapped out to swap partitions,
> > > > so it should be treated as swap pages, not as cached pages.
> > > > 
> > > > ...
> > > >
> > > > --- a/fs/proc/meminfo.c
> > > > +++ b/fs/proc/meminfo.c
> > > > @@ -45,7 +45,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
> > > >  	committed = percpu_counter_read_positive(&vm_committed_as);
> > > >  
> > > >  	cached = global_page_state(NR_FILE_PAGES) -
> > > > -			total_swapcache_pages() - i.bufferram;
> > > > +			total_swapcache_pages() - i.bufferram - i.sharedram;
> > > >  	if (cached < 0)
> > > >  		cached = 0;
> > > 
> > > Documentation/filesystems/proc.txt says
> > > 
> > > :      Cached: in-memory cache for files read from the disk (the
> > > :              pagecache).  Doesn't include SwapCached
> > > 
> > > So yes, I guess it should not include shmem.
> > > 
> > > And why not do this as well?
> > > 
> > > 
> > > --- a/Documentation/filesystems/proc.txt~mm-change-meminfo-cached-calculation-fix
> > > +++ a/Documentation/filesystems/proc.txt
> > > @@ -811,7 +811,7 @@ MemAvailable: An estimate of how much me
> > >       Buffers: Relatively temporary storage for raw disk blocks
> > >                shouldn't get tremendously large (20MB or so)
> > >        Cached: in-memory cache for files read from the disk (the
> > > -              pagecache).  Doesn't include SwapCached
> > > +              pagecache).  Doesn't include SwapCached or Shmem.
> > >    SwapCached: Memory that once was swapped out, is swapped back in but
> > >                still also is in the swapfile (if memory is needed it
> > >                doesn't need to be swapped out AGAIN because it is already
> > 
> > Whoa.  Changes of this kind would have made good sense about 14 years ago.
> > And there's plenty more which would benefit from having anon/shmem/file
> > properly distinguished.  But how can we make such a change now,
> > breaking everything that has made its own sense of these counts?
> 
> That's what I was wondering, but I was having some trouble picking a
> situation where it mattered much.

If it doesn't matter, then we don't need to change it.

> What's the problematic scenario
> here?  Userspace that is taking Cached, saying "that was silly" and
> subtracting Shmem from it by hand?

Someone a long time ago saw "that was silly", worked out it was because
of Shmem, adjusted their scripts or whatever accordingly, and has run
happily ever since.

> 
> I suppose that as nobody knows we should err on the side of caution and
> leave this alone.  But the situation is pretty sad - it would be nice
> to make the code agree with the documentation at least.

By all means fix the documentation.  And work on a /proc/meminfo.2015
which has sensibly differentiated counts (and probably omits that
wonderful Linux 2.2-compatible "Buffers").

But there's more to do than I can think of.  Cc'ing Jerome who has a
particular interest in this (no, I haven't forgotten his patches,
but nor have I had a moment to reconsider them).

Hugh


* [RFC] mm:change meminfo cached calculation
  2015-01-07  2:03                           ` Hugh Dickins
@ 2015-01-11  8:23                             ` Konstantin Khlebnikov
  0 siblings, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2015-01-11  8:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 7, 2015 at 5:03 AM, Hugh Dickins <hughd@google.com> wrote:
> On Tue, 6 Jan 2015, Andrew Morton wrote:
>> On Tue, 6 Jan 2015 17:04:33 -0800 (PST) Hugh Dickins <hughd@google.com> wrote:
>> > On Tue, 6 Jan 2015, Andrew Morton wrote:
>> > > On Fri, 26 Dec 2014 19:56:49 +0800 "Wang, Yalin" <Yalin.Wang@sonymobile.com> wrote:
>> > >
>> > > > This patch subtracts sharedram from cached.
>> > > > sharedram can only be swapped out to swap partitions,
>> > > > so it should be treated as swap pages, not as cached pages.
>> > > >
>> > > > ...
>> > > >
>> > > > --- a/fs/proc/meminfo.c
>> > > > +++ b/fs/proc/meminfo.c
>> > > > @@ -45,7 +45,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
>> > > >         committed = percpu_counter_read_positive(&vm_committed_as);
>> > > >
>> > > >         cached = global_page_state(NR_FILE_PAGES) -
>> > > > -                       total_swapcache_pages() - i.bufferram;
>> > > > +                       total_swapcache_pages() - i.bufferram - i.sharedram;
>> > > >         if (cached < 0)
>> > > >                 cached = 0;
>> > >
>> > > Documentation/filesystems/proc.txt says
>> > >
>> > > :      Cached: in-memory cache for files read from the disk (the
>> > > :              pagecache).  Doesn't include SwapCached
>> > >
>> > > So yes, I guess it should not include shmem.
>> > >
>> > > And why not do this as well?
>> > >
>> > >
>> > > --- a/Documentation/filesystems/proc.txt~mm-change-meminfo-cached-calculation-fix
>> > > +++ a/Documentation/filesystems/proc.txt
>> > > @@ -811,7 +811,7 @@ MemAvailable: An estimate of how much me
>> > >       Buffers: Relatively temporary storage for raw disk blocks
>> > >                shouldn't get tremendously large (20MB or so)
>> > >        Cached: in-memory cache for files read from the disk (the
>> > > -              pagecache).  Doesn't include SwapCached
>> > > +              pagecache).  Doesn't include SwapCached or Shmem.
>> > >    SwapCached: Memory that once was swapped out, is swapped back in but
>> > >                still also is in the swapfile (if memory is needed it
>> > >                doesn't need to be swapped out AGAIN because it is already
>> >
>> > Whoa.  Changes of this kind would have made good sense about 14 years ago.
>> > And there's plenty more which would benefit from having anon/shmem/file
>> > properly distinguished.  But how can we make such a change now,
>> > breaking everything that has made its own sense of these counts?
>>
>> That's what I was wondering, but I was having some trouble picking a
>> situation where it mattered much.
>
> If it doesn't matter, then we don't need to change it.
>
>> What's the problematic scenario
>> here?  Userspace that is taking Cached, saying "that was silly" and
>> subtracting Shmem from it by hand?
>
> Someone a long time ago saw "that was silly", worked out it was because
> of Shmem, adjusted their scripts or whatever accordingly, and has run
> happily ever since.

Totally agree. I know some of these guys.
But it hasn't been there for that long: 'Shmem' only appeared in 2.6.32.

>
>>
>> I suppose that as nobody knows we should err on the side of caution and
>> leave this alone.  But the situation is pretty sad - it would be nice
>> to make the code agree with the documentation at least.
>
> By all means fix the documentation.  And work on a /proc/meminfo.2015
> which has sensibly differentiated counts (and probably omits that
> wonderful Linux 2.2-compatible "Buffers").

'Buffers' is actually very useful. Ext4 keeps almost all of its metadata in
the bdev page cache.

Meminfo has a bigger and much more confusing problem: there is no subset of
fields which sums to all RAM. Some page allocations are not shown anywhere.

Probably it would be good to show that 'Untracked' memory as well,
calculated as:
Total - Free - Cached - Buffers - Slab - PageTables - KernelStack - AnonPages.
(correct me if I'm wrong =)
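
A minimal userspace sketch of that 'Untracked' sum, for illustration only.
It assumes that "Total" and "Free" above mean the MemTotal and MemFree lines
of /proc/meminfo; meminfo_kb() and untracked_kb() are hypothetical helper
names, and the text is expected to be already read into a buffer.

```c
#include <stdio.h>
#include <string.h>

/* Value (in kB) of field `name` in /proc/meminfo-style text, -1 if absent. */
static long meminfo_kb(const char *text, const char *name)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		long val;

		/* Match only at line start, followed directly by ':' */
		if (strncmp(line, name, len) == 0 && line[len] == ':' &&
		    sscanf(line + len + 1, "%ld", &val) == 1)
			return val;

		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Untracked = MemTotal - MemFree - Cached - Buffers - Slab
 *             - PageTables - KernelStack - AnonPages
 * Stores the result (in kB) in *out and returns 0, or returns -1 if any
 * of the fields is missing.
 */
static int untracked_kb(const char *text, long *out)
{
	static const char *const accounted[] = {
		"MemFree", "Cached", "Buffers", "Slab",
		"PageTables", "KernelStack", "AnonPages",
	};
	long left = meminfo_kb(text, "MemTotal");
	size_t i;

	if (left < 0)
		return -1;

	for (i = 0; i < sizeof(accounted) / sizeof(accounted[0]); i++) {
		long val = meminfo_kb(text, accounted[i]);

		if (val < 0)
			return -1;
		left -= val;
	}

	*out = left;
	return 0;
}
```

The result can come out negative if some pages are counted in more than one
of these fields, so the sketch reports errors through the return value
instead of through the number itself.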

>
> But there's more to do than I can think of.  Cc'ing Jerome who has a
> particular interest in this (no, I haven't forgotten his patches,
> but nor have I had a moment to reconsider them).
>
> Hugh


Thread overview: 20+ messages
2014-12-05  8:57 [RFC] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags Wang, Yalin
2014-12-05  9:20 ` Konstantin Khlebnikov
2014-12-05 10:22   ` Wang, Yalin
2014-12-05 22:31     ` Andrew Morton
     [not found]       ` <35FD53F367049845BC99AC72306C23D103E688B313F4@CNBJMBX05.corpusers.net>
2014-12-08 11:46         ` [RFC V4] " Kirill A. Shutemov
     [not found]           ` <35FD53F367049845BC99AC72306C23D103E688B313FB@CNBJMBX05.corpusers.net>
     [not found]             ` <35FD53F367049845BC99AC72306C23D103E688B31403@CNBJMBX05.corpusers.net>
2014-12-10 11:05               ` [RFC] mm:fix zero_page huge_zero_page rss/pss statistic Kirill A. Shutemov
2014-12-12  1:59                 ` Wang, Yalin
2014-12-12 11:10                   ` Kirill A. Shutemov
2014-12-10 17:06             ` [RFC V5] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags Konstantin Khlebnikov
     [not found]               ` <35FD53F367049845BC99AC72306C23D103E688B31408@CNBJMBX05.corpusers.net>
     [not found]                 ` <35FD53F367049845BC99AC72306C23D103EDAF89E14C@CNBJMBX05.corpusers.net>
2014-12-19  1:04                   ` [RFC] MADV_FREE doesn't work when doesn't have swap partition Minchan Kim
2014-12-19  6:54                     ` Wang, Yalin
2014-12-22 10:30                     ` Konstantin Khlebnikov
     [not found]                   ` <35FD53F367049845BC99AC72306C23D103EDAF89E160@CNBJMBX05.corpusers.net>
2015-01-07  0:43                     ` [RFC] mm:change meminfo cached calculation Andrew Morton
2015-01-07  1:04                       ` Hugh Dickins
2015-01-07  1:25                         ` Andrew Morton
2015-01-07  2:03                           ` Hugh Dickins
2015-01-11  8:23                             ` Konstantin Khlebnikov
     [not found] ` <35FD53F367049845BC99AC72306C23D103E688B313F1@CNBJMBX05.corpusers.net>
2014-12-05 11:05   ` [RFC V2] mm:add KPF_ZERO_PAGE flag for /proc/kpageflags Kirill A. Shutemov
2014-12-08  7:47   ` [PATCH] mm:add VM_BUG_ON() for page_mapcount() Wang, Yalin
2014-12-08 11:50     ` Kirill A. Shutemov
