From: Ingo Molnar <mingo@elte.hu>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: "Steven Rostedt" <rostedt@goodmis.org>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Larry Woodman" <lwoodman@redhat.com>,
	"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
	"Pekka Enberg" <penberg@cs.helsinki.fi>,
	"Eduard - Gabriel Munteanu" <eduard.munteanu@linux360.ro>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"KOSAKI Motohiro" <kosaki.motohiro@jp.fujitsu.com>,
	"Andi Kleen" <andi@firstfloor.org>,
	"Matt Mackall" <mpm@selenic.com>,
	"Alexey Dobriyan" <adobriyan@gmail.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH 5/5] proc: export more page flags in /proc/kpageflags
Date: Tue, 28 Apr 2009 11:24:54 +0200
Message-ID: <20090428092454.GB21085@elte.hu>
In-Reply-To: <20090428083320.GB17038@localhost>


* Wu Fengguang <fengguang.wu@intel.com> wrote:

> On Tue, Apr 28, 2009 at 08:55:07AM +0200, Ingo Molnar wrote:
> > 
> > * Wu Fengguang <fengguang.wu@intel.com> wrote:
> > 
> > > Export 9 page flags in /proc/kpageflags, and 8 more for kernel developers.
> > > 
> > > 1) for kernel hackers (on CONFIG_DEBUG_KERNEL)
> > >    - all available page flags are exported, and
> > >    - exported as is
> > > 2) for admins and end users
> > >    - only the more `well known' flags are exported:
> > > 	11. KPF_MMAP		(pseudo flag) memory mapped page
> > > 	12. KPF_ANON		(pseudo flag) memory mapped page (anonymous)
> > > 	13. KPF_SWAPCACHE	page is in swap cache
> > > 	14. KPF_SWAPBACKED	page is swap/RAM backed
> > > 	15. KPF_COMPOUND_HEAD	(*)
> > > 	16. KPF_COMPOUND_TAIL	(*)
> > > 	17. KPF_UNEVICTABLE	page is in the unevictable LRU list
> > > 	18. KPF_HWPOISON	hardware detected corruption
> > > 	19. KPF_NOPAGE		(pseudo flag) no page frame at the address
> > > 
> > > 	(*) For compound pages, exporting _both_ head/tail info enables
> > > 	    users to tell where a compound page starts/ends, and its order.
> > > 
> > >    - limit flags to their typical usage scenario, as indicated by KOSAKI:
> > > 	- LRU pages: only export relevant flags
> > > 		- PG_lru
> > > 		- PG_unevictable
> > > 		- PG_active
> > > 		- PG_referenced
> > > 		- page_mapped()
> > > 		- PageAnon()
> > > 		- PG_swapcache
> > > 		- PG_swapbacked
> > > 		- PG_reclaim
> > > 	- no-IO pages: mask out irrelevant flags
> > > 		- PG_dirty
> > > 		- PG_uptodate
> > > 		- PG_writeback
> > > 	- SLAB pages: mask out overloaded flags:
> > > 		- PG_error
> > > 		- PG_active
> > > 		- PG_private
> > > 	- PG_reclaim: mask out the overloaded PG_readahead
> > > 	- compound flags: only export huge/gigantic pages
> > > 
> > > Here are the admin/linus views of all page flags on a newly booted nfs-root system:
> > > 
> > > # ./page-types # for admin
> > >          flags  page-count       MB  symbolic-flags                     long-symbolic-flags
> > > 0x000000000000      491174     1918  ____________________________                
> > > 0x000000000020           1        0  _____l______________________       lru      
> > > 0x000000000028        2543        9  ___U_l______________________       uptodate,lru
> > > 0x00000000002c        5288       20  __RU_l______________________       referenced,uptodate,lru
> > > 0x000000004060           1        0  _____lA_______b_____________       lru,active,swapbacked
> > 
> > I think I have to NAK this kind of ad-hoc instrumentation of kernel 
> > internals and statistics until we clear up why such instrumentation 
> > measures are being accepted into the MM while other, more dynamic 
> > and more flexible MM instrumentation are being resisted by Andrew.
> 
> An unexpected NAK - to throw away an orange because we are to have an apple? ;-)
> 
> Anyway, here are the missing rationales.
> 
> 1) FAST
> 
> It takes merely 0.2s to scan 4GB of pages:
> 
>         ./page-types  0.02s user 0.20s system 99% cpu 0.216 total
> 
> 2) SIMPLE
> 
> /proc/kpageflags will be a *long standing* hack we have to live 
> with - it was originally introduced by Matt to do shared memory 
> accounting and a facility to analyze applications' memory 
> consumptions, with the hope it will also help kernel developers 
> someday.
> 
> So why not extend and embrace it, in a straightforward way?
> 
> 3) USE CASES
> 
> I have taken, or will take, advantage of the above page-types command
> in a number of ways:
> - to help track down memory leaks (the recent trace/ring_buffer.c case)
> - to estimate the system-wide readahead miss ratio
> - Andi wants to examine the major page types in different workloads
>   (for the hwpoison work)
> - Me too, for the fun of learning: read/write/lock/whatever a lot of pages
>   and examine their flags, to get an idea of some random kernel behaviors.
>   (the dynamic tracing tools can be more helpful, as a different view)
> 
> 4) COMPLEMENTARITY
> 
> In some cases the dynamic tracing tool is not enough (or too complex)
> to rebuild the current status view.
> 
> I myself have a dynamic readahead tracing tool (very useful!). At 
> the same time I also use readahead accounting numbers, the 
> /proc/filecache tool (frequently!), and the above page-types tool. 
> I simply need them all - they are handy for different cases.

Well, the main counter-argument here is that statistics are _derived_ 
from events. In their simplest form the 'counts' are the integral of 
events over time.

So if we capture all interesting events, and do that with low 
overhead (and in fact can even collect and integrate them in-kernel, 
today), we _don't have_ to maintain various overlapping counters all 
around the kernel. This is really a general instrumentation design 
observation.
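
As a rough illustration of "counts are the integral of events over 
time", here is a minimal sketch. The event names, the tuple layout and 
the integrate() helper are all hypothetical and are not taken from any 
kernel tracing interface; the only point is that a counter at time T 
can be recovered by summing the deltas of all events up to T.

```python
# Hypothetical event stream, sorted by timestamp: (timestamp, name, delta).
# A +1 delta stands for a page entering some state, -1 for leaving it.
EVENTS = [
    (1, "lru_add", +1),
    (2, "lru_add", +1),
    (3, "lru_del", -1),
    (4, "lru_add", +1),
]


def integrate(events, until=None):
    """Recover a counter by summing event deltas up to and including `until`.

    Assumes `events` is sorted by timestamp. With until=None, all events
    are summed, which gives the current value of the counter.
    """
    total = 0
    for timestamp, _name, delta in events:
        if until is not None and timestamp > until:
            break
        total += delta
    return total


if __name__ == "__main__":
    # Counter value after each timestamp in the stream.
    for t in range(1, 5):
        print(t, integrate(EVENTS, until=t))
```

The same event stream can feed any number of such derived counters, 
which is the sense in which separate, overlapping counters become 
unnecessary.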

Every time we add yet another /proc hack we splinter Linux 
instrumentation, in a hard to reverse way.

So your single-purpose /proc hack could be made multi-purpose and 
could help a much broader range of people, with just a little bit of 
effort, I believe. Pekka already wrote the page tracking patch, for 
example; that would be a good starting point.

Does it mean more work to do? You bet ;-)

	Ingo
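
For readers following the quoted page-types output, the flag words can 
be decoded by bit position. The sketch below is only an illustration: 
bits 11-19 follow the numbering in the quoted patch description, bits 
0-10 are assumed to follow the existing /proc/kpageflags layout from 
the kernel's pagemap documentation, and the lowercase names and the 
decode() helper are hypothetical, chosen to resemble the 
long-symbolic-flags column.

```python
# Bit number -> name. Bits 0-10: assumed from the pagemap documentation.
# Bits 11-19: as numbered in the quoted patch description.
KPF_NAMES = {
    0: "locked",
    1: "error",
    2: "referenced",
    3: "uptodate",
    4: "dirty",
    5: "lru",
    6: "active",
    7: "slab",
    8: "writeback",
    9: "reclaim",
    10: "buddy",
    11: "mmap",
    12: "anon",
    13: "swapcache",
    14: "swapbacked",
    15: "compound_head",
    16: "compound_tail",
    17: "unevictable",
    18: "hwpoison",
    19: "nopage",
}


def decode(flags):
    """Return the names of all set bits in a flag word, lowest bit first."""
    return [name for bit, name in sorted(KPF_NAMES.items())
            if flags & (1 << bit)]


if __name__ == "__main__":
    # Flag words copied from the quoted page-types output.
    for word in (0x20, 0x28, 0x2C, 0x4060):
        print(f"0x{word:012x}", ",".join(decode(word)))
```

Under these assumptions the decoded names agree with the quoted 
listing, for example 0x00000000002c maps to referenced,uptodate,lru.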
